The Wayback Machine: A Web Archives Search Engine

Remember what Yahoo looked like in 1996? Or Google, when it graduated from Stanford and went live under its own URL in 1999? What about long-lost Infoseek, which has vanished entirely from the search engine scene?

If you can't recall the specifics (I can't!), a new service from the Internet Archive is available to help. The Wayback Machine is a search engine for an archive of more than 100 terabytes of data, covering 10 billion web pages collected from 1996 to the present. It's an absolutely phenomenal gift to the web community.

The archive doesn't contain every web page ever published. Rather, it's a collection of "snapshots" taken over time. Simply enter a URL, and your results appear as a table of links, one for each date on which a snapshot was taken and stored in the archive. Clicking a link brings up the page exactly as it looked on that date.

The Wayback Machine also has an advanced search form designed specifically to help the adventurous web archaeologist. You can limit your search to a particular date range, or even a particular date. Advanced search also offers some subtle options that reveal a very thoughtful approach to the design of the search interface.

For example, you can match a URL exactly to see a specific page, or you can request every page associated with a URL to see all archived pages from a site. You can also control whether aliases are shown -- for example, http://www.searchenginewatch.com, http://searchenginewatch.com, and http://www.searchenginewatch.com/index.html are all aliases that point to the same page.
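To make the idea of aliases concrete, here is a minimal sketch in Python of how the three addresses above might be collapsed to a single comparison key. The function name and the two rules it applies (dropping a leading "www." and a trailing default index document) are assumptions for illustration only; the Internet Archive's actual alias handling isn't documented here and may well differ.

```python
from urllib.parse import urlsplit


def canonical_alias(url: str) -> str:
    """Reduce a URL to a comparison key (hypothetical rules, for illustration)."""
    parts = urlsplit(url.strip())

    # Host names are case-insensitive; treat "www.example.com" like "example.com".
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Treat a default index document as the directory that contains it.
    path = parts.path
    for index_name in ("index.html", "index.htm"):
        if path.endswith("/" + index_name):
            path = path[: -len(index_name)]
            break

    # An empty path and "/" refer to the same home page.
    if path == "":
        path = "/"

    return host + path


urls = [
    "http://www.searchenginewatch.com",
    "http://searchenginewatch.com",
    "http://www.searchenginewatch.com/index.html",
]
keys = {canonical_alias(u) for u in urls}
print(keys)  # {'searchenginewatch.com/'}
```

All three addresses reduce to the same key, which is why a search that shows aliases returns more entries for a page than one that hides them.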

Advanced search also provides controls for displaying redirected pages, file types, and duplicates. A helpful list of hints and tips for advanced search shows how to refine your query for a number of common types of searches.

The Internet Archive provides a number of "special collections" that are absolutely fascinating glimpses of historical web sites. These include snapshots from September 11th, the year 2000 election, U.S. government sites, and one that'll really get your nostalgic juices flowing, Web Pioneers.

The Wayback Machine was unveiled just last week, but has already been overwhelmed by users. The service is "intermittent" for the time being, meaning you won't always see a complete list of results for a particular URL. The Internet Archive is working to add servers, but expects the process to take "weeks."

In the meantime, there's still plenty available for viewing from the 100 terabyte archive of the web -- and it's well worth the time spent journeying through the web that was.

The Internet Archive Wayback Machine
http://web.archive.org/

Wayback Machine Advanced Search
http://web.archive.org/collections/web/advanced.html

How Big Is 100 Terabytes?
http://www.archive.org/xterabytes.html
Here's how the size of the Internet Archive's collections — containing material dating from 1996 to the present — compares to some familiar data banks.

Internet Archive Special Collections:

September 11
http://web.archive.org/collections/sep11.html
The tragic events of September 11, 2001, prompted web creators around the world to respond. This special collection of archived web sites preserves this unique moment in our history.

Election 2000
http://web.archive.org/collections/e2k.html
The United States Elections of 2000 were perhaps the most controversial elections in our nation's history. Use this collection to revisit the historic elections of 2000.

United States Government
http://web.archive.org/collections/government.html
This collection contains thousands of United States government web sites.

Web Pioneers
http://web.archive.org/collections/pioneers.html
This collection highlights a handful of sites that played a role in the early internet.

Search Engine Gallery
http://www.searchenginewatch.com/subscribers/gallery/index.html
Many search services are constantly changing their looks, both on the home page and on the results page, in order to better please users. The Search Engine Gallery tracks some of these changes over time. Available to Search Engine Watch Members.

Search Headlines

NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.

With links to songs, videos and pictures, search engines advance...
San Francisco Chronicle Oct 29 2001 1:51PM GMT
After an Online Ruckus, Microsoft Opens MSN Site to All...
New York Times Oct 29 2001 8:33AM GMT
AOL Gains Cable Rights in China by Omitting News, Sex and Violence...
New York Times Oct 29 2001 7:57AM GMT
Copy Sells, Flash Doesn't: Implications for Search Engine Optimization...
Rank Write Oct 28 2001 4:46AM GMT
Inktomi to Offer 12.5 Million Shares...
ISPWorld.com Oct 27 2001 11:41AM GMT
Web Hosting News - Decision in Domain Name Dispute Upholds Freedom of Speech...
Web Host Directory Oct 27 2001 6:58AM GMT
Google to charge for premium searches...
Silicon.com Oct 26 2001 12:36PM GMT
MSN locks out non-Microsoft browsers...
Silicon.com Oct 26 2001 12:35PM GMT
NetRatings to acquire Jupiter Media Metrix...
SiliconValley.com Oct 26 2001 10:58AM GMT
Hits soar for Arab news website...
Media Guardian Oct 26 2001 9:50AM GMT
LookSmart Reports Third Quarter 2001 Results...
Yahoo Oct 26 2001 6:13AM GMT
Search Engine Overture Sees Profit...
New York Times Oct 26 2001 4:47AM GMT
powered by moreover.com