The Best of the International World Wide Web Conferences

Each year, the International World Wide Web Conference provides a showcase for innovative web technologies. Here's a chronological list of significant papers from the past decade that focus on searching and search engines.

Perusing the proceedings of the International World Wide Web Conferences is deeply satisfying for both the inner geek and the inner historian. Hundreds of papers have been presented at these conferences over the past decade or so, and together they provide fascinating snapshots of the development of the web.

The papers below are my subjective list of favorites from the archives. In addition to papers, there are also many slide and poster presentations online. Use the link at the bottom of this list to access the full "table of contents" for each year's conference, except for the first -- which has apparently been lost.

Finding What People Want: Experiences with the WebCrawler (1994)

WebCrawler is widely regarded as one of the first, if not the first, crawler-based search engines. Creator Brian Pinkerton discusses how he built the engine, laying the foundation for all modern-day search engines.

The Distributed Link Service: A Tool for Publishers, Authors and Readers (1995)

The authors describe "link servers" -- foreshadowing both the link analysis technologies used by contemporary search engines and the "link farms" used to spam them.

Measuring the Web (1996)

XML co-creator Tim Bray takes on "questions without answer," including: How big is the web? What is the "average page" like? How richly connected is it? What are the biggest and most visible sites? What data formats are being used? What does the web look like?

WebQuery: Searching and Visualizing the Web through Connectivity (1997)

This paper hints at what's to come with Google, Teoma and others: "We do this by examining links among the nodes returned in a keyword-based query. We then rank the nodes, giving the highest rank to the most highly connected nodes."
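
The idea in that quotation can be sketched in a few lines of Python. This is only a minimal illustration of connectivity-based ranking, not the WebQuery authors' implementation: the function name `rank_by_connectivity`, the page labels and the link data are all hypothetical, and the sketch assumes "connectivity" means the number of links (in either direction) that a result page shares with other pages in the same result set.

```python
from collections import Counter


def rank_by_connectivity(results, links):
    """Rank result pages by how many links connect them to other results.

    results: iterable of page identifiers returned by a keyword query
    links:   iterable of (source, target) pairs describing hyperlinks
    (Both inputs are hypothetical; the original paper's data model may differ.)
    """
    result_set = set(results)
    degree = Counter({page: 0 for page in result_set})
    for source, target in set(links):  # set() ignores duplicate links
        # Count only links whose two endpoints are both in the result set.
        if source in result_set and target in result_set and source != target:
            degree[source] += 1
            degree[target] += 1
    # Highest connectivity first; ties are broken alphabetically.
    return sorted(result_set, key=lambda page: (-degree[page], page))


# Hypothetical query results and links among them.
pages = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "b"), ("b", "d"), ("a", "x")]
ranking = rank_by_connectivity(pages, edges)
```

Here page "b" is linked with three other results and so ranks first, while the link to "x" is ignored because "x" was not among the query results.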

The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998)

The classic paper by then-students Larry Page and Sergey Brin, describing their "prototype of a large-scale search engine" with the goofy name Google.

Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery (1999)

The goal of a focused crawler is to selectively seek out pages that are relevant to a pre-defined set of topics. The topics are specified not using keywords, but using exemplary "training" documents.

Trawling the Web for Emerging Cyber-Communities (1999)

An overview of how "social networks" can be used to identify hubs, authorities, and other high-quality web pages that form "communities" of information.

Graph Structure in the Web (2000)

The web is shown to have a structure like a "bow tie," with surprising and instructive implications for search engines and searchers alike.

Scaling Question Answering to the Web (2001)

Investigates the challenges of creating an "answer engine" and discusses MULDER, "the first general-purpose, fully-automated question-answering system available on the web."

All of these papers, as well as hundreds of others, can be found using this "table of contents" for the International World Wide Web Conferences:

International World Wide Web Conferences

This year's conference is coming up. For a preview of what will be presented, as well as access to presentations once they are posted to the web, use this link:

The Eleventh International World Wide Web Conference
Sheraton Waikiki Hotel, Honolulu, Hawaii, USA, 7-11 May 2002
