XFML

A new XML standard has been proposed for publishing faceted metadata: XFML.

eXchangeable Faceted Metadata Language. XFML is an open XML format to publish and share faceted metadata for websites. It allows for easy creation of advanced, automatically generated navigation for your website. You can even automatically generate links to related topics on other websites. It also allows for merging of metadata between different websites.

This looks promising for publishing metadata that can be used by other web sites and/or user client software. After reviewing the information on the site, it does not appear that they borrowed anything from the Zthes DTD for XML representation of a thesaurus. It seems to me that creating linkages between the two could make both standards stronger.

Article on Best Practices in Content Management Application Development

This Webreference Newsletter has a good piece on some best practices for deploying a content management system. The author makes a very good point about content conversion: while some of it can be automated, there is always going to be a fair chunk of content that will have to be moved into the new system by hand. This is usually because the existing content does not fit perfectly into the new structure, or because it was poorly deployed in the first place and now has to be redesigned.

RSS Job Feeds

Here’s a thought: why not create RSS files for the most recently posted jobs in a career center?  Users could then subscribe to the job feed and follow links back to the ones they are interested in. If people started blogging jobs they found interesting it could get quite a bit more exposure for the listing than it would have had before.  It also adds quite a bit of value to the job posting service for those sites that charge to list jobs (such as association-sponsored career centers).

Would an RSS job feed be more useful than e-mail reminders based on keyword hits?  Guess it depends on how a person likes to get their information. A job feed would provide another avenue for keeping up with job listings without a whole lot of extra effort on the part of the publisher.
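Here is a rough sketch of how a career center might generate such a feed, using only the Python standard library. The job fields (title, url, summary, posted), the sample postings and the example.org URLs are all made up for illustration; a real site would pull the postings from its own database.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Made-up sample postings (assumption): stand-ins for rows from a job database.
SAMPLE_JOBS = [
    {"title": "Content Editor", "url": "https://example.org/jobs/1",
     "summary": "Edit and publish association content.",
     "posted": datetime(2002, 11, 1, tzinfo=timezone.utc)},
    {"title": "Web Administrator", "url": "https://example.org/jobs/3",
     "summary": "Maintain the association web site.",
     "posted": datetime(2002, 11, 20, tzinfo=timezone.utc)},
    {"title": "Database Developer", "url": "https://example.org/jobs/2",
     "summary": "Build reports for the membership database.",
     "posted": datetime(2002, 11, 10, tzinfo=timezone.utc)},
]


def build_job_feed(jobs, title, link, description, limit=10):
    """Return an RSS 2.0 document (as a string) for the most recent jobs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    # Newest postings first, capped at `limit` items.
    newest_first = sorted(jobs, key=lambda job: job["posted"], reverse=True)
    for job in newest_first[:limit]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = job["title"]
        ET.SubElement(item, "link").text = job["url"]
        ET.SubElement(item, "description").text = job["summary"]
        # RSS 2.0 expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(job["posted"])

    return ET.tostring(rss, encoding="unicode")


feed_xml = build_job_feed(
    SAMPLE_JOBS,
    title="Career Center: Recent Jobs",
    link="https://example.org/jobs",
    description="The most recently posted jobs",
    limit=2,
)
```

The publisher would regenerate this file whenever a job is posted, which is the "not a whole lot of extra effort" part: the feed is just another view of data the career center already has.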

Speaking of jobs: ASHA is looking for a Web Administrator.

A Google search shows that someone thought of job feeds two years ago. Ha!

Thesauri and Web Logs

A common tool used in knowledge management is the thesaurus. There are a variety of definitions out there but I’ll use this one for our purpose here:

Thesaurus: The vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts (for example as “broader” and “narrower”) are made explicit. (ISO 2788, 1986:2)

A thesaurus is not only a list of keywords (or terms) and their synonyms: it also embodies an overall hierarchy of related terms. These relationships can be compared to Yahoo!’s branching subject index.  An XML DTD already exists to document these relationships between terms in a thesaurus.
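To make the structure concrete, here is a minimal sketch of a thesaurus as a data structure: terms with synonyms plus broader, narrower and related links. The class names, method names and sample terms are my own inventions for illustration and do not follow the Zthes DTD or any other standard.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def add_term(self, name, synonyms=()):
        """Create the term if needed and record any synonyms for it."""
        term = self.terms.setdefault(name, Term(name))
        term.synonyms.update(synonyms)
        return term

    def add_broader(self, narrow, broad):
        """Record both directions of a broader/narrower relationship."""
        self.add_term(narrow).broader.add(broad)
        self.add_term(broad).narrower.add(narrow)

    def add_related(self, a, b):
        """Related-term links are symmetric."""
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def ancestors(self, name):
        """Every broader term reachable by walking up the hierarchy."""
        seen = set()
        stack = list(self.terms[name].broader)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.terms[current].broader)
        return seen

    def preferred(self, word):
        """Resolve a word or synonym to its preferred term, or None."""
        wanted = word.lower()
        for term in self.terms.values():
            names = {term.name.lower()} | {s.lower() for s in term.synonyms}
            if wanted in names:
                return term.name
        return None


# Invented sample terms, purely for illustration.
thesaurus = Thesaurus()
thesaurus.add_term("Weblogs", synonyms=["Blogs", "Web logs"])
thesaurus.add_broader("Weblogs", "Web publishing")
thesaurus.add_broader("Web publishing", "Communication")
thesaurus.add_related("Weblogs", "RSS")
```

With this in place, thesaurus.preferred("blogs") resolves the synonym to "Weblogs", and thesaurus.ancestors("Weblogs") walks up through "Web publishing" to "Communication".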

The importance of a thesaurus to knowledge management is that it gives a common language to users who are keywording content for an index. If everyone agrees to use the same terms for the same meaning then metadata indexes become much more effective. Consistent relationships can then be inferred among documents and other content.

Thesauri have to be living documents if they are to remain effective. New terms must be added as the language of a particular field changes. Existing terms may need to be refined or even retired if they fall out of use. This requires a human to manage the thesaurus based on feedback from the users of that thesaurus.

So how could a thesaurus be used with a blog network?  Here are some ideas:

  • Intranet bloggers use thesaurus terms to create categories for their web log. Readers on an intranet, for example, could then see blog posts made by anyone on the network for a particular thesaurus term.  Links to related, broader and narrower categories could be created automatically.  Essentially a meta-blog of content based on commonly used thesaurus terms.
  • The preceding idea could also be done by assigning thesaurus terms to individual blog entries and then indexing that metadata.
  • A hierarchical subject index of blogs could be created based on the categories that are used by individual blog writers. Blogs are added to more categories as their authors write content in those areas.
  • A Yahoo-like directory/index of an intranet could be created based on the thesaurus which then indexes a blogged set of content. The Google-bombing effect of blogs then raises more relevant content to the top of the search results list.
  • Blogs indexed by a structured thesaurus make it much easier to find other blogs that talk about similar topics without having to rely on the bloggers themselves to create the association via direct links. This could be a supplemental tool to the referrals that currently drive traffic between blogs.
  • A thesaurus manager could monitor related weblogs for new language being used that should be entered into the thesaurus as a formal term.
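The first two ideas in the list can be sketched in a few lines of Python: index posts by the thesaurus terms assigned to them, then build a "meta-blog" page per term with automatic links to broader and narrower terms. The thesaurus fragment, blog names and post titles below are invented, and a real system would also need related-term links and a proper thesaurus store.

```python
from collections import defaultdict

# Hypothetical thesaurus fragment: term -> its broader term (None for a top term).
BROADER = {
    "Knowledge management": None,
    "Thesauri": "Knowledge management",
    "Faceted metadata": "Knowledge management",
}

# Invented intranet blog posts, each tagged with thesaurus terms.
POSTS = [
    {"blog": "alice", "title": "Building our first thesaurus", "terms": ["Thesauri"]},
    {"blog": "bob", "title": "Facets for site navigation", "terms": ["Faceted metadata"]},
    {"blog": "carol", "title": "Retiring old index terms", "terms": ["Thesauri", "Knowledge management"]},
]


def build_index(posts):
    """Map each thesaurus term to the posts tagged with it."""
    index = defaultdict(list)
    for post in posts:
        for term in post["terms"]:
            if term not in BROADER:
                # Only controlled terms are allowed; anything else is a candidate
                # for the thesaurus manager to review.
                raise ValueError(f"{term!r} is not a thesaurus term")
            index[term].append(post)
    return index


def narrower_terms(term):
    """Terms whose broader term is `term`."""
    return sorted(t for t, broad in BROADER.items() if broad == term)


def term_page(term, index):
    """Everything needed to render one meta-blog page for a term."""
    return {
        "term": term,
        "broader": BROADER[term],
        "narrower": narrower_terms(term),
        "posts": [(post["blog"], post["title"]) for post in index.get(term, [])],
    }


index = build_index(POSTS)
page = term_page("Thesauri", index)
```

Rejecting unknown terms outright is only one possible design. Collecting them in a list for the thesaurus manager, rather than raising an error, would fit the last idea in the list better.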

Those are only a few ideas and I am sure there are many more creative applications out there. The biggest challenge I see is learning how to merge a more formal document such as a thesaurus with the very informal and hierarchy-busting dynamic of a weblog. However, a structured thesaurus could be a potentially powerful supplemental tool for bloggers to use.

Washingtonpost.com

The Post recently redesigned their home page. Here is a blurb from the editor about the changes.

I think it is a nice refinement. It’s much easier to see what the top stories are in each of the major sections while the major stories are still highlighted at the top. Even better, the annoying banner ad in the middle of the page is gone gone gone!

FacetMap

Here is another project looking at how to use the structure of a facet thesaurus for web site navigation: FacetMap. They even allow you to build your own navigation system (hosted on their site) by entering keywords into a hierarchical relationship.

Flamenco Search Project: Triangulation with Facets

The Flamenco Search System project is exploring how to best create web-based search interfaces based on faceted thesauri. The demo interface they have built for an architecture image collection is excellent. It allows the user to triangulate a set of results by selecting terms from multiple facets. This triangulation allows a user to quickly narrow down to a small set of specific records even within a large overall set of records.
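The triangulation idea itself is simple enough to sketch. This is not how Flamenco is implemented (I have not seen their code); it is just a minimal illustration with invented records and facet names, where each selected facet value narrows the result set further.

```python
from collections import Counter

# Invented records for an imaginary architecture image collection.
RECORDS = [
    {"id": 1, "style": "Gothic", "material": "Stone", "location": "France"},
    {"id": 2, "style": "Gothic", "material": "Brick", "location": "Germany"},
    {"id": 3, "style": "Modern", "material": "Concrete", "location": "France"},
    {"id": 4, "style": "Gothic", "material": "Stone", "location": "England"},
]


def triangulate(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        record for record in records
        if all(record.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(records, facet):
    """Count how many of the given records fall under each value of a facet."""
    return Counter(record[facet] for record in records)


# Select one term from each of two facets, then see what is left to refine by.
matches = triangulate(RECORDS, {"style": "Gothic", "material": "Stone"})
remaining_locations = facet_counts(matches, "location")
```

Showing the counts for the remaining facets after each selection is what lets the user keep narrowing without ever hitting an empty result set.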

The site also has several articles that give the background on how they created their design.

Content Filtering with Facets

An article about employing content filters has spawned a very interesting discussion over at boxesandarrows.com. A few people have pointed out work that is being done on building navigation and filtering systems based on faceted metadata. This is very timely information for me since my office is working towards integrating our faceted thesaurus on our web sites as a search tool and an alternate organizational structure.

What isn’t mentioned so far in the discussion over at b&a is that the faceted approach requires a very structured set of metadata terms if the filtering is to be effective. A sloppy set of facets that contain redundant terms or leave significant gaps will not be effective. However, creating an effective faceted thesaurus is NOT easy: starting from scratch, it took several months to create, review and finalize the first version of our faceted thesaurus.
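Some of the sloppiness can at least be detected automatically. Here is a minimal sketch, with invented facet names, terms and content items, that flags terms appearing in more than one facet and lists content that no term in a given facet covers. It is only a sanity check and no substitute for human review of the thesaurus.

```python
from collections import defaultdict


def redundant_terms(facets):
    """Return terms (compared case-insensitively) that appear in more than one facet."""
    seen = defaultdict(set)
    for facet, terms in facets.items():
        for term in terms:
            seen[term.strip().lower()].add(facet)
    return {term: sorted(names) for term, names in seen.items() if len(names) > 1}


def coverage_gaps(facets, items):
    """Return, per facet, the ids of items that carry no term from that facet."""
    gaps = {}
    for facet, terms in facets.items():
        allowed = set(terms)
        missing = [item["id"] for item in items if not allowed & set(item["terms"])]
        if missing:
            gaps[facet] = missing
    return gaps


# Invented facets and content items, purely for illustration.
FACETS = {
    "Topic": ["Hearing", "Speech", "Billing"],
    "Audience": ["Students", "Members"],
    "Service": ["billing", "Advocacy"],
}
ITEMS = [
    {"id": "a", "terms": ["Hearing", "Members"]},
    {"id": "b", "terms": ["Students"]},
]

duplicates = redundant_terms(FACETS)
gaps = coverage_gaps(FACETS, ITEMS)
```

In this made-up data, "billing" shows up under both Topic and Service, and item "b" has no Topic term at all. Those are the two kinds of problem described above.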