Search Engines 201

Want to dive deep -- really deep -- into the technical literature about search engines? Here's a road map to some of the best web information retrieval resources available online.

Longtime SearchDay readers know I'm an avowed technical literature junkie. I'm talking about the heavy-duty, industrial-strength stuff that's typically buried in PDF or PostScript files on remote servers at academic institutions or in technical journals.

You can find a treasure trove of links to this type of information in the modestly titled Web IR & IE, a directory of web information retrieval and information extraction resources compiled by information scientist Einat Amitay.

The site is organized into sections. The conferences and online proceedings section has links to the major events where information scientists gather and share information. These gatherings are quite different from Search Engine Strategies shows.

Other sections of the site provide links to influential people working in the field, working groups attempting to hammer out standards, and resources--tools to help you build your own search engine.

Want to communicate with the gurus? Links to mailing lists and newsgroups show you where to find the online watering holes.

My favorite sections of the site are selected publications, PhD/MSc related work, and books. Here you can find Jon Kleinberg's groundbreaking "Authoritative Sources in a Hyperlinked Environment" paper that influenced Google and Teoma. Larry Page and Sergey Brin's early description of Google, "The Anatomy of a Large-Scale Hypertextual Web Search Engine," is also represented.

And the seminal work that influenced all three of the above authors is also here: Eugene Garfield's 1955 (!!!) "Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas," a must-read if you want to understand the genesis of what we today call link analysis on the World Wide Web.

And if you still haven't had your fill, the related sites section points you elsewhere for more. I love Einat's description of Search Engine Watch: "A commercial web site about search engine development and related news. It might be a bit 'non academic' but this is the real world..."

If you're relatively new to the technical literature about web information retrieval, start with the overview and resources I wrote about in Search Engines 101. Another good resource is covered in How Search Engines Make Sense of the Web.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was About.com's Web Search Guide.