FAST Sprints Past Google in Search Engine Size Wars

FAST announced today that it has expanded its index to 2.1 billion web pages, taking the lead from Google in the search engine size wars.

The 2.1 billion documents are searchable at FAST's search engine, as well as through Lycos, which is powered by FAST.

While the 2.1 billion number appears to just slightly exceed Google's claim of 2,073,418,204 web pages, the actual number of documents indexed by FAST is much larger than Google's, according to FAST spokesperson Jami Axelrod. This is because Google's claim includes pages whose URLs it knows but that it has not fully indexed.

FAST's number represents "pure" web pages that are fully indexed, says Axelrod. The number does not include multimedia or FTP files that are also searchable at

To build this massive index, FAST engineers estimated that their crawler retrieved between 6 and 8 billion documents. Duplicate pages and spam were culled before the index was built.

Google is unlikely to stand by and allow FAST to claim the largest index title for long. Given past rounds in the search engine size wars, you can expect the count of pages searched that is displayed on Google's home page to increase any day now.


Google's Multifaceted Database
Google includes some results (URLs) that it has not actually indexed. This chart shows the breakdown of indexed vs. unindexed pages in Google's database.

Search Engine Sizes
The charts on this page show the size of each search engine's index, and recount the history of the search engine "size wars."

Google Upgrades Search Appliance

Google announced today that it has made enhancements to its Google Search Appliance, the standalone search tool it markets for site and intranet search customers.

The enhancements include:

- Doubling the capacity of the GB 1001. It can now search up to 300,000 documents, up from the former limit of 150,000 pages.

- Simplified operation of the appliance. In particular, the company has simplified the tools for designing output pages. The interface makes it easy to customize the appearance of the result, cache, and advanced search pages, and automatically generates the HTML for a site's search box.

"It's something like a wizard with a point and click interface," said John Piscitello, Product Development Guru for Google. "You can get something up and running with your own look and feel very quickly."

Separately, Google announced a list of customers who have purchased the Google appliance. "A number of companies from a wide variety of industries have selected the Google search appliance to search their corporate intranets," said Google spokesperson Nate Tyler. The list of customers includes Boeing, Cisco, FindLaw, National Semiconductor and the University of Florida.

Google Search Appliance

Inside the Google Search Appliance
SearchDay, April 30, 2002
Want Google search on your own internal network? Try the Google Appliance, a self-contained version of the popular search engine that's stuffed into a pizza-sized box.

Search Headlines

NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.

Internet: international news
Beijing's web users angry at shutdown...
BBC Jun 17 2002 12:23PM GMT
Online search engines news
Cult search engine 'bigger than Google'... Jun 17 2002 11:38AM GMT
Online content news
Mugabe versus the internet...
Media Guardian Jun 17 2002 8:54AM GMT
Online search engines news
L'Oreal Breaks Search Engine Effort to Drive Traffic... Jun 17 2002 6:49AM GMT
AskJeeves to debut search product for corporations...
CNET Jun 17 2002 4:16AM GMT
Online portals news
Releases eSubscription Membership and Content Management Portal...
EContent Jun 17 2002 1:34AM GMT
Domain name news
Questions Surround Domain Names...
New York Times Jun 17 2002 1:28AM GMT
Domain body faces crossroad...
CNN Jun 16 2002 1:54PM GMT
Online marketing news
Spam: Taking Action...
AtNewYork Jun 16 2002 6:03AM GMT
Online portals news
Build Plug-and-Play Web Portals...
XML Mag Jun 16 2002 3:53AM GMT
Online search engines news
Five years ago: Web site offers search engine gen...
ZDNet Jun 15 2002 6:24AM GMT
Search party: Search engines evolve... Jun 15 2002 6:12AM GMT
Top internet stories
The Internet's Alphabet Soup Is Getting Messy...
Fortune Jun 15 2002 5:48AM GMT

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.