Happy Birthday, Aliweb!

Aliweb, a pioneering web search engine whose techniques were well ahead of their time, made its debut nine years ago, on November 30, 1993.

Aliweb was one of the original "web wanderers," programs that retrieved documents by following links discovered on web pages, spreading "like a virus," according to early lore.

Of course, today we call web wanderers "crawlers," and they're the technology used by all of the major search engines to build their web indexes.

Aliweb wasn't a full-text search engine. Rather, it read special "index" files created by webmasters that described the contents of their sites. These index files followed a format suggested by the Internet Anonymous FTP Archives Working Group.
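To make the idea concrete, here is a minimal sketch of how such an index file might be read. It assumes a simple layout of "Field: value" lines grouped into records separated by blank lines, loosely modeled on the working group's templates. The field names, the sample entries, and the parse_index function are illustrative assumptions, not Aliweb's actual code or format.

```python
def parse_index(text):
    """Parse blank-line-separated records made of 'Field: value' lines.

    Assumed layout for illustration only; field names are lowercased
    so that later lookups do not depend on the webmaster's capitalization.
    """
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so values may contain colons.
        field, sep, value = line.partition(":")
        if sep:
            current[field.strip().lower()] = value.strip()
    if current:
        records.append(current)
    return records


# Hypothetical index file a webmaster might have written by hand.
SAMPLE_INDEX = """\
Template-Type: DOCUMENT
Title: Widget Catalogue
URI: /widgets/index.html
Description: Price list for all widgets
Keywords: widgets, prices

Template-Type: DOCUMENT
Title: Contact Us
URI: /contact.html
Description: How to reach the sales team
Keywords: contact, sales
"""

records = parse_index(SAMPLE_INDEX)
for record in records:
    print(record["title"], "->", record["uri"])
```

Each record is only a handful of descriptive fields, which is why a site's whole index could be small enough to maintain by hand.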

At the time, FTP searching was one of the primary ways you found things on the Internet. And one of the most popular FTP "search" programs was called Archie. Aliweb is actually an acronym for "Archie-Like Indexing in the Web."

Martijn Koster, Aliweb's creator, announced the system to the comp.infosystems.www Usenet newsgroup on November 30, 1993. He wrote:

"ALIWEB is an experiment in automatic 'distributed'
indexing: a WWW server advertises its contents in a
local file, which is automatically retrieved and
processed proactively by a single site. The combined
database of these indices can of course be searched
from the Web."

The index files that "advertised" the contents of a web server were, in effect, metadata describing the entire contents of the web site. This allowed Aliweb users to run advanced searches, limiting results to document subjects or titles, all without a full-text index of the web pages themselves.
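A field-limited search of this kind can be sketched in a few lines. The example below assumes the index records from several sites have already been merged into one list of dictionaries. The host names, field names, and search function are hypothetical and are not taken from Aliweb itself.

```python
def search(database, term, fields=("title", "description", "keywords")):
    """Return records whose chosen metadata fields contain the term.

    Matching is a case-insensitive substring test against the metadata
    only; no page text is ever consulted.
    """
    term = term.lower()
    return [
        record
        for record in database
        if any(term in record.get(field, "").lower() for field in fields)
    ]


# Hypothetical combined database built from two sites' index files.
DATABASE = [
    {
        "site": "a.example",
        "title": "Widget Catalogue",
        "description": "Price list for all widgets",
        "keywords": "widgets, prices",
    },
    {
        "site": "b.example",
        "title": "Company History",
        "description": "How we started making widgets",
        "keywords": "history",
    },
]

all_hits = search(DATABASE, "widget")                      # any field
title_hits = search(DATABASE, "widget", fields=("title",))  # titles only

print(len(all_hits), "records match in any field")
print(len(title_hits), "records match in the title")
```

Restricting the fields narrows the results from both records to the single one with "widget" in its title, which is the sort of precision a description-based index can offer without storing page text.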

The problem was that Aliweb required webmasters to create and maintain their own indexes. According to the History of Aliweb page, "... a chicken-and-egg problem became apparent: Because not many people provided information, the resulting database was rather empty. Because the database was empty not many people used it to look for things. Because not many people used it there was no incentive for people to provide information."

This lack of interest, combined with the emergence of automated crawlers like WebCrawler and Lycos in 1994, led to the demise of Aliweb. In 1995, responsibility for Aliweb was handed over to EMNET, a UK-based Internet consultancy, which forecast a bright future for the service. From the History of Aliweb page (link below):

"A new and exciting period for Aliweb has recently arrived, since responsibility for it has been handed over to EMNET and the whole site and engine is being rewritten to bring it up to date with the latest HTTP and HTML standards, which will carry through well into the new millennium."

So much for rosy predictions. Although Aliweb no longer functions as a search service, the methods it pioneered, including distributed indexing and the use of rich metadata, are now widely accepted.

Aliweb's Koster is also well known for his work on the Robots Exclusion Protocol. He created The Web Robots Pages, a site that's still considered to be one of the definitive resources for information about crawlers and their operation.

Web pioneer that he is, Koster is just 32 years old. He currently works as a Software Architect for Danger, Inc., building server-side applications that support the Hiptop mobile device.

Announcing ALIWEB
Aliweb's creator, Martijn Koster, announces his new web search service to the comp.infosystems.www Usenet newsgroup.

ALIWEB - Archie-Like Indexing in the Web
Aliweb creator Martijn Koster presented this paper to the First International Conference on the World-Wide Web in Geneva, 1994.

History of ALIWEB
A history of Aliweb maintained by the company that hosted and then took over the service in 1994, when Koster no longer had time to maintain the system.

The Web Robots Pages
Web Robots are programs that traverse the web automatically. Some people call them web wanderers, crawlers, or spiders. These pages have further information about these web robots.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was About.com's Web Search Guide.