My Top 3 Content Conversion Tips

On the ASAE Technology List, someone asked for the top three tips you would give to someone who is about to embark on a content conversion project (defined as moving content from an old site to a new site). Great question!

Here are my top three, based on one completed migration and another that is in the planning stages:

1. Inventory current content and delete as much of it as possible. Remove anything that is out of date, incorrect, etc. If you are migrating to a new site structure, the inventory can be used to map existing content to its home in the new structure. This page has some good tips and a spreadsheet tool for conducting a content inventory.

2. Budget for HTML temps to assist in migration and clean-up. This is essential if you do not have tools to automate portions of the conversion. The temps can focus on brute-force cut-and-paste (if necessary) and on cleaning up content to use the new style sheets. Staff can then focus on overall content organization, template design, etc., which is a better use of their time. BTW, you need to finish the content inventory/mapping first so the temps can do the brute-force work effectively. No budget for temps? Then you need to allow extra time (lots of it) for staff to do this themselves.

3. Force yourself to assign metadata during the conversion. If you don’t do it now, chances are you will never have time to go back and do it later.
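To make the inventory-and-mapping idea in tip 1 concrete, here is a minimal sketch in Python. It assumes the inventory lives in a spreadsheet exported as CSV; the column names (`old_url`, `action`, `new_section`) and the sample pages are made up for illustration and are not taken from the spreadsheet tool mentioned above.

```python
import csv
import io

# Hypothetical inventory export: one row per existing page, with a
# keep/delete decision and the page's home in the new site structure.
INVENTORY_CSV = """old_url,action,new_section
/about/history.html,keep,/about/
/events/2001-conference.html,delete,
/pubs/newsletter-archive.html,keep,/publications/
/pubs/old-pricing.html,delete,
"""

def load_inventory(text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def migration_plan(rows):
    """Split the inventory into pages to migrate, pages to drop,
    and kept pages that still have no home in the new structure."""
    keep = {r["old_url"]: r["new_section"] for r in rows if r["action"] == "keep"}
    drop = [r["old_url"] for r in rows if r["action"] == "delete"]
    unmapped = [url for url, section in keep.items() if not section]
    return keep, drop, unmapped

rows = load_inventory(INVENTORY_CSV)
keep, drop, unmapped = migration_plan(rows)
print(f"{len(keep)} pages to migrate, {len(drop)} to delete, {len(unmapped)} unmapped")
```

A list like `unmapped` is the part worth watching: anything in it is content you decided to keep but have not yet placed, which is exactly what stalls the cut-and-paste work in tip 2.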

A Pain in the Metadata

Seb points to an interesting presentation on metadata written by Stefano Mazzocchi.

The presentation dances around an issue we ran into like a brick wall: quality metadata is needed to provide quality search and retrieval on large collections of material. However, the amount of human effort needed to create the metadata is directly proportional to the size of the collection.

This does not scale well and has to be done in a distributed way unless you can afford a room of librarians on staff. The problem with distributed metadata creation is one of training. Expecting our usual web content contributors to be experts in applying our full thesaurus is not realistic. Hell, I’m not an expert in applying it either.

So what to do? I’m open to suggestions!

We are experimenting with targeting specific subcollections with high knowledge value for in-depth metadata tagging by a central expert. A ‘shallow’ representation of the full thesaurus would be used for indexing normal content on the web site by distributed content contributors.

The idea is that the high-value resources, typically used in academic research, allow for the most finely tuned searching while less valuable content is tagged in much less detail. All of it in combination should be supportable by existing staff resources.
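One way to picture the ‘shallow’ thesaurus is as the full hierarchy rolled up to its top levels. The sketch below is only an illustration under that assumption: the terms, the `BROADER` table, and the `max_depth` cutoff are all hypothetical, and our actual thesaurus may not reduce this neatly.

```python
# Hypothetical thesaurus fragment: each term points to its broader term
# (None marks a top-level term).
BROADER = {
    "Governance": None,
    "Boards": "Governance",
    "Board Orientation": "Boards",
    "Membership": None,
    "Recruitment": "Membership",
    "Student Recruitment": "Recruitment",
}

def path_to_root(term):
    """Return the chain of terms from the top-level term down to `term`."""
    path = [term]
    while BROADER[path[-1]] is not None:
        path.append(BROADER[path[-1]])
    return list(reversed(path))

def shallow_term(term, max_depth=2):
    """Roll a detailed term up so it sits no deeper than `max_depth` levels."""
    path = path_to_root(term)
    return path[min(len(path), max_depth) - 1]

# The central expert tags with the full term; contributors only ever
# see (and choose from) the rolled-up version.
print(shallow_term("Board Orientation"))       # Boards
print(shallow_term("Student Recruitment", 1))  # Membership
```

Because the shallow terms are ancestors of the detailed ones, a search on a shallow term can still pick up the expert-tagged material beneath it.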

I also want to explore allowing our users to rate the value of individual pages/items and see if that provides better rankings than we can do internally.
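I have not settled on how ratings would feed into rankings. As one possibility, here is a sketch that ranks pages by a damped average, so that a single enthusiastic vote does not outrank a page with many good ratings. The 1-to-5 scale, the page paths, and the `prior_mean` and `prior_weight` values are assumptions for illustration only.

```python
def damped_mean(ratings, prior_mean=3.0, prior_weight=5):
    """Average the ratings as if `prior_weight` extra votes of
    `prior_mean` had also been cast, pulling sparse data toward the middle."""
    return (prior_mean * prior_weight + sum(ratings)) / (prior_weight + len(ratings))

# Hypothetical user ratings on a 1-5 scale, keyed by page.
ratings = {
    "/pubs/salary-survey": [5, 5, 4, 5, 4, 5],
    "/news/office-closed": [5],
    "/about/bylaws": [3, 4, 3],
}

# Highest damped score first.
ranked = sorted(ratings, key=lambda page: damped_mean(ratings[page]), reverse=True)
for page in ranked:
    print(f"{damped_mean(ratings[page]):.2f}  {page}")
```

Comparing an ordering like this against our internally assigned rankings would be one way to test whether user ratings actually do better.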

Enterprise Information Architecture

Lou Rosenfeld will be doing a road show about IA design for large organizations.

I’ll be tackling the frustrating challenge of getting a large, multi-departmental and hugely political web environment to behave like a single, unified, user-centric web site.

I wish I could have taken this course about six months ago! We are in the process of finalizing the IA for a redesign of our main web site. Crafting the information architecture for a non-profit membership association web site is as much a political process as it is a design exercise.

Staff Directories

Column Two has posted a “list of what you might consider including in your staff directory.” A few extras we are considering for our staff directory, in addition to those on James’ list, include:

  • regular work hours
  • telecommuting days with contact info
  • teams you are a member of (we are a team-based staff)
  • teams you are interested in
  • self-selected subject areas of expertise (drawn from our thesaurus)
  • self-selected subject areas of interest (also drawn from our thesaurus)
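
As a rough sketch of how these extra fields might sit in a directory record, here is a small Python example. The class name, field names, and sample staff are all hypothetical; the point is only that thesaurus-based expertise makes a "who knows about X?" lookup trivial.

```python
from dataclasses import dataclass, field

@dataclass
class StaffEntry:
    """One staff directory record carrying the extra fields listed above."""
    name: str
    regular_hours: str = ""
    telecommute_days: list[str] = field(default_factory=list)
    telecommute_contact: str = ""
    teams: list[str] = field(default_factory=list)
    teams_of_interest: list[str] = field(default_factory=list)
    expertise: list[str] = field(default_factory=list)  # thesaurus terms
    interests: list[str] = field(default_factory=list)  # thesaurus terms

def find_experts(directory, subject):
    """Return the names of staff who list `subject` as an area of expertise."""
    return [entry.name for entry in directory if subject in entry.expertise]

# Made-up sample data.
directory = [
    StaffEntry("Pat Example", regular_hours="8:00-4:00",
               telecommute_days=["Friday"], teams=["Web"],
               expertise=["Metadata", "Information Architecture"]),
    StaffEntry("Sam Sample", teams=["Membership"],
               teams_of_interest=["Web"], expertise=["Recruitment"],
               interests=["Metadata"]),
]

print(find_experts(directory, "Metadata"))
```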