FAST announced today that it has expanded its index to 2.1 billion web pages, taking the lead from Google in the search engine size wars.
The 2.1 billion documents are searchable at FAST's AllTheWeb.com search engine, as well as through Lycos, which is powered by FAST.
While the 2.1 billion figure appears to only slightly exceed Google's claim of 2,073,418,204 web pages, the actual number of documents indexed by FAST is much larger than Google's, according to FAST spokesperson Jami Axelrod. This is because Google's claim includes pages whose URLs it knows but that it has not fully indexed.
FAST's number represents "pure" web pages that are fully indexed, says Axelrod. The number does not include multimedia or FTP files that are also searchable at AllTheWeb.com.
To build this massive index, FAST engineers estimated that their crawler retrieved between 6 and 8 billion documents. Duplicate pages and spam were culled before the index was built.
Google is unlikely to stand by and let FAST hold the largest-index title for long. Given past rounds in the search engine size wars, you can expect the page count displayed on Google's home page to increase any day now.
Google's Multifaceted Database
Google includes some results (URLs) that it has not actually indexed. This chart shows the breakdown of indexed vs. unindexed pages in Google's database.
Search Engine Sizes
The charts on this page show the size of each search engine's index, and recount the history of the search engine "size wars."
Google Upgrades Search Appliance
Google announced today that it has made enhancements to its Google Search Appliance, the standalone search tool it markets for site and intranet search customers.
The enhancements include:
- Doubling the capacity of the GB 1001. It can now search up to 300,000 documents, up from the former limit of 150,000 documents.
- Simplified operation of the appliance. In particular, the company has simplified the tools for designing output pages. The interface makes it easy to customize the appearance of the results, cache, and advanced search pages, and automatically generates the HTML for a site's search box.
"It's something like a wizard with a point and click interface," said John Piscitello, Product Development Guru for Google. "You can get something up and running with your own look and feel very quickly."
Separately, Google announced a list of customers who have purchased the Google appliance. "A number of companies from a wide variety of industries have selected the Google search appliance to search their corporate intranets," said Google spokesperson Nate Tyler. The list of customers includes Boeing, Cisco, FindLaw, National Semiconductor and the University of Florida.
Google Search Appliance
Inside the Google Search Appliance
SearchDay, April 30, 2002
Want Google search on your own internal network? Try the Google Appliance, a self-contained version of the popular search engine that's stuffed into a pizza-sized box.