Metadata or Metagarbage?

Metadata is the holy grail for improving search, according to its advocates. Garbage! replies one critic, detailing seven reasons why even the most promising metadata schemes will fail.

If you've ever tried to improve your web site's search engine rankings, you've probably fiddled with your keyword and description meta tags. This metadata, or data about the information on the page, ostensibly tells the search engine what's important on the page.

The problem is, spammers love meta tags, and search engines don't really pay all that much attention to them any more.
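For readers who have never looked under the hood, here is a minimal sketch of how a crawler might read those two tags, using only Python's standard library. The sample page, the `MetaTagParser` class and the `extract_meta` helper are made up for illustration; no real search engine is claimed to work exactly this way.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html):
    """Return a dict of meta tag names to their content strings."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# Hypothetical page, invented for this example.
SAMPLE_PAGE = """
<html><head>
<title>Acme Widgets</title>
<meta name="keywords" content="widgets, gadgets, acme">
<meta name="description" content="Acme sells widgets and gadgets.">
</head><body><p>Welcome.</p></body></html>
"""

meta = extract_meta(SAMPLE_PAGE)
# Split the comma-separated keyword list into clean terms.
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
print(keywords)
print(meta.get("description"))
```

Note that nothing in this code checks whether the keywords actually describe the page. The tags say whatever the page author wants them to say, which is exactly why spammers like them.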

Several groups have proposed advanced metadata standards that could ameliorate the problems with simple meta tags and, in theory, improve search. These standards include the Dublin Core, RDF, and XML-based formats such as SMIL. Tim Berners-Lee, the creator of the web, even goes so far as to say that these standards will enable the Semantic Web, where machines will be able to communicate with each other without human intervention.

Bah! Humbug! says Cory Doctorow. Cory has published an interesting paper spelling out with wickedly pointed humor why metadata won't work -- at least in a public arena like the web. Lest you think this is mere contrarian muttering, consider that Cory is the founder of Open Cola, a company building collaborative search software that holds serious promise for changing the way we search.

Many thanks to Monnie Nilsson Grosjean -- the smartest person I know -- for bringing this to my attention.

Putting the Torch to Seven Straw-Men of the Meta-Utopia
~doctorow/metacrap.htm
Cory Doctorow's treatise on why metadata won't work. Warning: Contains opinionated and graphic language -- and is not necessarily the opinion of SearchDay's editor, Chris Sherman.

Open Cola
More information about the "Folders" P2P search product developed by Open Cola founder Cory Doctorow.

Search Headlines

NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.

Top internet stories
Over time, Internet use shifts from quantity to quality...
Mar 4 2002 6:42AM GMT
Online search engines news
Google Programmer Creates Buzz...
New York Times Mar 4 2002 6:15AM GMT
Online access news
Good (or Unwitting) Neighbors Make for Good Internet Access...
New York Times Mar 4 2002 4:46AM GMT
XML and metadata news
Quick quiz: What is XML?...
CNN Mar 3 2002 6:16PM GMT
Online portals news
Internet Explorer users urged to patch browser...
CNN Mar 3 2002 5:11PM GMT
Online search engines news
Using Google's AdWords Select for Keyword Info and Branding...
Rank Write Mar 2 2002 2:22AM GMT
Domain name news
ICANN's ccTLD Quandary...
Internet News Mar 1 2002 11:46PM GMT
Online portals news
Lycos Paid Inclusion Woos Webmasters...
Traffick Mar 1 2002 10:02PM GMT
Online search engines news
Which Search Engine is Really #1? Metrics Agencies Close in on Reality...
Traffick Mar 1 2002 10:02PM GMT
Google and Jeeves lead web growth figures...
Mar 1 2002 4:26PM GMT
Tech latest
Copyright buzz: They just don't get it...
Interactive Week Mar 1 2002 1:58PM GMT
Online portals news
Portal Technology and Content Management Side by Side...
Content-Wire Mar 1 2002 12:10PM GMT
Domain name news
European Parliament gives support to .eu domain...
Mar 1 2002 10:27AM GMT
Online portals news
Microsoft's MSN starts new 'switch from AOL' pitch...
CNET Mar 1 2002 6:41AM GMT

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.