Metadata for Everyone

Metadata, literally "data about data," has long been considered a holy grail for improving the quality of web search. If web sites were cataloged the way library resources are, metadata proponents claim, search engines could mine this information and use it to zero in on the best possible document to match a query.

The folks at the W3C, the closest thing the web has to a standards committee, have released Annotea, an open source "annotation" capability that lets anyone create metadata about web pages and store it on separate "annotation servers."

Here's what the W3C says about Annotea:

"Annotea is a LEAD (Live Early Adoption and Demonstration) project enhancing the W3C collaboration environment with shared annotations. By annotations we mean comments, notes, explanations, or other types of external remarks that can be attached to any Web document or a selected part of the document without actually needing to touch the document. When the user gets the document he or she can also load the annotations attached to it from a selected annotation server or several servers and see what his peer group thinks."

This may sound familiar -- a now-defunct browser add-on called ThirdVoice allowed third parties to add comments to web pages in a similar way. ThirdVoice was widely criticized as a tool for creating "web graffiti" and demonstrated the pitfalls of allowing just anyone to annotate a web page.

Nonetheless, with the W3C behind the Annotea project, we may actually see the rise of trusted sources of metadata that search engines (or anyone with a browser, for that matter) can use to figure out what a document is all about. In essence, Annotea is one of the first steps toward realizing what web creator Tim Berners-Lee calls "The Semantic Web."
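For readers who want to see the basic idea in concrete terms, here is a minimal Python sketch of annotations kept apart from the pages they describe. It is an illustration only, not Annotea's actual protocol or data format: the class names, fields, and the load_annotations helper are all hypothetical, and the "servers" are simple in-memory objects rather than real network services.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """A remark about a document, stored apart from the document itself."""
    target_url: str      # the page being annotated (never modified)
    author: str
    body: str
    selection: str = ""  # optional: the part of the page the remark refers to


@dataclass
class AnnotationServer:
    """Hypothetical in-memory stand-in for an annotation server."""
    name: str
    _by_url: dict = field(default_factory=dict)

    def post(self, annotation):
        # File the annotation under the URL of the page it describes.
        self._by_url.setdefault(annotation.target_url, []).append(annotation)

    def query(self, url):
        # Return a copy so callers cannot alter the server's own list.
        return list(self._by_url.get(url, []))


def load_annotations(url, servers):
    """Gather annotations for one page from every server the reader selected."""
    found = []
    for server in servers:
        found.extend(server.query(url))
    return found


if __name__ == "__main__":
    # Two illustrative servers: one for a peer group, one open to everyone.
    team = AnnotationServer("team")
    public = AnnotationServer("public")
    page = "http://example.org/article"

    team.post(Annotation(page, "ann", "Good overview.", "paragraph 2"))
    public.post(Annotation(page, "bob", "Needs a newer source."))

    for note in load_annotations(page, [team, public]):
        where = note.selection or "whole page"
        print(f"{note.author} on {where}: {note.body}")
```

The point of the sketch is the separation: the page itself is never touched, and what a reader sees depends entirely on which servers he or she chooses to consult. That choice is what distinguishes a trusted source of metadata from "web graffiti."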

Annotea Project
The home page for information on Annotea, including instructions on how to download and use it.

Annotations in Amaya
Amaya is the W3C's browser/editor, and currently the only browser that supports Annotea. Here's a detailed look, including screen shots, of how annotations or metadata are created using Annotea and Amaya.

Semantic Web Activity
News, information, and links regarding the W3C's efforts to implement the Semantic Web.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was About.com's Web Search Guide.