Semantic Blogging Research

HP has been conducting research into fusing structured metadata into weblogs. They also have a demonstration blog set up to see their ideas in action. (Found via Open Access News.)

Semantic blogging exploits this same personal publishing, syndication, aggregation and subscription model but applies it to structured items with richer metadata. The metadata would include classification of the items into one or more topic ontologies, semantic links between items (“supports”, “refutes”, “extends” etc.) as well as less formal annotations and ratings. There are several ways this more structured data could extend the power of blogging:

Discovery. At present it is not easy to discover either a channel of interest (e.g. “I would like to find blog channels about the semantic web”) or a collection of specific items of interest (e.g. “Are there any more blog entries describing this application idea?”).

Cross-linking. Current blogs support a single link between the channel record and the blogged item. By extending this mechanism to support linking between items (using a property hierarchy) we can create a network of topic interconnections that supports more flexible navigation. These links can themselves form part of the disseminated content – for example to represent the structure of scholarly discourse.

Flexible aggregation and selection. The current blog subscription mechanisms are in some ways both too fine (being bounded by the individual blogger’s channel of posts) and too coarse (e.g. I might like Ian’s technology channel but am only interested in the semantic web bits). The richer categorization and structure of semantic blog channels would make it easier for users to create virtual blog channels which aggregate across multiple bloggers but select from that aggregate according to other criteria such as topic (or community rating).

Integration with other sources and applications. The structured nature of semantic blog channels makes it possible to develop automated blog robots that can process and enhance the blogged items. For example, in the bibliography domain, transducers would enable import and export via existing bibliography schemas like BibTeX and automatic linking to large repositories such as CiteSeer.
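To make the "virtual blog channel" and cross-linking ideas a little more concrete, here is a minimal sketch in Python. It is not HP's implementation: the Item fields, the topic labels, the 0-5 rating scale, the second blogger "Dave" and the function names are all assumptions made up for illustration. It only shows the general shape of aggregating across several bloggers, selecting by topic and rating, and following a typed link such as "refutes".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One blogged item carrying the kinds of metadata the excerpt describes.

    All field names here are hypothetical, chosen only for this sketch.
    """
    item_id: str
    author: str
    title: str
    topics: frozenset = frozenset()  # topic labels (made-up vocabulary)
    rating: float = 0.0              # community rating (assumed 0-5 scale)
    links: tuple = ()                # (relation, target_item_id) pairs


def virtual_channel(channels, topic, min_rating=0.0):
    """Aggregate items across several channels, then select by topic and rating."""
    merged = [item for channel in channels for item in channel]
    return [i for i in merged if topic in i.topics and i.rating >= min_rating]


def linked_items(items, target_id, relation):
    """Return the items that point at target_id with the given semantic relation."""
    return [i for i in items if (relation, target_id) in i.links]


# Two bloggers' channels; "Dave" and all entries are invented sample data.
ian = [
    Item("ian-1", "Ian", "RDF in practice", frozenset({"semantic-web"}), 4.5),
    Item("ian-2", "Ian", "New laptop", frozenset({"hardware"}), 3.0),
]
dave = [
    Item("dave-1", "Dave", "Why RDF is overkill", frozenset({"semantic-web"}), 3.5,
         (("refutes", "ian-1"),)),
]

# A virtual channel: only the semantic web items, from both bloggers.
channel = virtual_channel([ian, dave], "semantic-web", min_rating=3.0)
print([i.title for i in channel])

# Cross-linking: which items refute Ian's first post?
print([i.title for i in linked_items(ian + dave, "ian-1", "refutes")])
```

A real system would presumably draw topics from shared ontologies rather than free-text labels, which is where most of the hard work lies.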

There are lots of differing opinions as to whether the semantic web can actually be achieved. It’s good to see some actual research being done to shed some more light on this issue. As I’ve said before, I think metadata can still be useful in discrete communities and/or collections where there is some control and incentive to code accurately. I don’t think it will work for the web as a whole, given the web community’s tendency to game systems. A semantic blog network could be quite interesting for a community of researchers.

Batch vs. Dynamic Publishing

James Robertson has published a concise review of the pros and cons of dynamic vs. batch publishing in web content management systems. He also covers the hybrid of the two, which is what we now have in place with our systems at work.

The major con of the dynamic system that I have experienced directly is server load. Dynamic systems cannot scale up for additional traffic as efficiently as batch or hybrid systems. Authoring activities are very processor- and database-intensive. Mix a few active editors with heavy end-user traffic and your server may quickly succumb.

About 18 months ago, we were in the unenviable situation of limiting editing activities to a small number of staff during low-traffic time periods. Without those restrictions our site constantly crashed and/or timed out under the unmanageable load of end users and editors hitting a single server. Not a popular measure with staff, to say the least. The dynamic CMS we were using at the time eventually came out with an update that allowed us to move to the hybrid approach and lift our editing restrictions. Moving authoring to a separate server dramatically improved the stability of our production web site.
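For anyone who hasn't worked with a batch or hybrid setup, the core idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how our CMS or any particular product works: the page store, the render and publish functions and the sample page are all made up. Editors work against an authoring store, and a publish step writes finished HTML files that the production server can hand out without touching the authoring database.

```python
import html
import tempfile
from pathlib import Path


def render(page):
    """Turn one stored page record into a complete HTML document."""
    title = html.escape(page["title"])
    body = html.escape(page["body"])
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><p>{body}</p></body></html>"
    )


def publish(pages, out_dir):
    """Batch step: write every page to a static file and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, page in pages.items():
        path = out_dir / f"{slug}.html"
        path.write_text(render(page), encoding="utf-8")
        written.append(path)
    return written


# Authoring side: a stand-in for the CMS database (hypothetical sample data).
authoring_store = {
    "about": {"title": "About Us", "body": "Members & staff"},
}

# Production side: a directory of static files (a temp directory in this sketch).
site_root = Path(tempfile.mkdtemp())
written = publish(authoring_store, site_root)
print([p.name for p in written])
```

The expensive rendering work happens once per publish rather than once per visitor, which is why editors and end users stop competing for the same server.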

A Few New Blogs on My Reading List

I added four new-to-me blogs to my reading list this week.

Open Access News
News from the debate over making scholarly journals open to everyone without subscription fees (with lots of variations on the theme as well). The blog is on a slow server but it will deliver a page eventually.

Asterix*
A blog by D. Keith Robinson focusing on web design. His recent entry on moving a healthcare organization to a standards-based design caught my interest (thanks for the tip, Glen!).

View from the Corner Office
An anonymously authored blog by an association CEO. Pretty frank stuff so far.

Association Innovation
Jeff de Cagna is back at it with his blog about associations and their need to innovate.

And Yet More on Personalization

Does Personalization Do Anything Useful?

I interviewed user interface expert Jared Spool a couple of years ago for a now defunct Web site. I really like Jared’s ideas on all things technical, so it was a pleasure to discuss personalization with him.

Another take on personalization from a couple years ago. My favorite bit of the interview:

For Spool, the way to tackle personalization isn’t to start with the question, “What can we personalize?” The right questions to ask are, “What does the user need to see right now? What information does the user need?”