Google Announces Largest Index

A longer and more detailed version of this article is available to Search Engine Watch "site subscribers." This is just one of the many benefits that site subscribers receive. Click here to learn more about becoming a site subscriber.

Another milestone in the search engine size wars was hit when Google went live with a full-text index of 560 million URLs in June, making it the largest search engine on the web. In addition, because of how Google makes use of link data, its reach extends to a further 500 million URLs that it has never actually visited, the company says. That means searches at Google potentially encompass more than 1 billion pages, which is the size the entire web was estimated to be earlier this year.

So does this mean Google is the first search engine to give 100 percent coverage of the web? No. For one thing, that 1 billion page estimate is several months old, and the web has almost certainly increased in size since then. Nor does that estimate include the millions of pages that search engines typically don't crawl, such as those behind password protected areas or served up by identifiable dynamic delivery systems. How big the web is now is anyone's guess.

Also, Google has actually visited and recorded the contents of 560 million pages, not 1 billion. Google, unlike any other major search engine, does make clever use of its technology to leverage its reach beyond this core set of pages (as the articles below explain further). Using the 1 billion figure isn't just marketing hype, but those extra pages are more of a bonus that you can't always depend on than the assurance that comes from having indexed each and every page.
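To make the distinction between visited pages and link-only pages concrete, here is a minimal sketch in Python. It is not Google's method, which this article does not describe; the toy crawl data, the example URLs and the function name `split_reach` are all assumptions made for illustration. It only shows how a crawler can know about URLs it has seen in links but never fetched.

```python
# Toy crawl data (hypothetical): each visited page maps to the URLs it links to.
crawled = {
    "a.example/": ["b.example/", "c.example/"],
    "b.example/": ["a.example/", "d.example/"],
}


def split_reach(crawled_pages):
    """Return (visited, link_only) URL sets for a crawl.

    visited   -- pages whose contents were actually fetched and recorded
    link_only -- pages known only because a visited page links to them
    """
    visited = set(crawled_pages)
    linked = {target for links in crawled_pages.values() for target in links}
    link_only = linked - visited
    return visited, link_only


visited, link_only = split_reach(crawled)
print(len(visited), "visited;", len(link_only), "known only through links")
```

In this toy data, two pages are fully indexed and two more are reachable only through links, which mirrors on a tiny scale the 560 million and 500 million split described above.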

None of this takes away from Google's accomplishment or the value of using its service. It is now a clear choice for those seeking both highly relevant results and comprehensive searching across the web. Searchers should also have even more choice in the coming months. WebTop just announced its own half-billion page index, and some of Inktomi's partners should go live with Inktomi's new half-billion page index in the very near future.

While Google is running searches against the full index at its own site, its partners may not tap into the entire amount. "We support searches for different partners, so they won't all necessarily be getting the largest index," said Google president Sergey Brin. Google's customers can choose to search against indexes of different sizes, with the smaller indexes containing a higher proportion of the web's most popular pages, as determined by Google's link analysis system. The benefit for customers in using a smaller index is savings: it costs them more to query against the biggest collection of documents, Brin says.
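The idea of smaller indexes that favor popular pages can be sketched roughly as follows. This is only an illustration: the article does not say how Google's link analysis system scores pages, so a plain inbound-link count is used here as a stand-in, and the function name `build_tiers`, the tier names and the sample graph are all hypothetical.

```python
from collections import Counter


def build_tiers(link_graph, tier_sizes):
    """Rank pages by inbound link count and cut the ranking into index tiers.

    link_graph -- dict mapping each page to the list of pages it links to
    tier_sizes -- dict mapping a tier name to how many pages it should hold
    Inbound link count is an assumed stand-in for a real link analysis score.
    """
    inbound = Counter()
    for source, targets in link_graph.items():
        inbound[source] += 0  # make sure pages with no inbound links are ranked too
        for target in set(targets):  # count each linking page at most once
            if target != source:  # ignore self-links
                inbound[target] += 1
    # Most-linked pages first; ties are broken alphabetically for stable output.
    ranked = sorted(inbound, key=lambda url: (-inbound[url], url))
    return {name: ranked[:size] for name, size in tier_sizes.items()}


# Hypothetical four-page web.
graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "a"],
}
tiers = build_tiers(graph, {"small": 2, "full": 4})
print(tiers["small"])
print(tiers["full"])
```

A partner querying the "small" tier in this sketch would search only the two most-linked pages, while the "full" tier holds everything, at a correspondingly higher query cost in the scenario Brin describes.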

Offering different sized indexes isn't new. Inktomi has also offered its partners this option, and it is one reason why Yahoo's Inktomi-powered results have never matched those of some other Inktomi-powered services. Yahoo has never chosen to hit Inktomi's index as deeply as possible. Whether this will change when Google takes over remains to be seen. Google says Yahoo has the option to do so, and I'll revisit the issue after Yahoo goes live with the Google results.

Google has also begun to float articles from major news wires at the top of its results, in response to current news stories. You can't specifically search for news, but the system should recognize topical queries such as "elian gonzalez" and respond with links that begin with the word "News."


Google Snapshots

A new page offering a behind-the-scenes look at Googlites at work and play.

Search Engine Size Test

I took a look at how the new Google index performs against other size leaders. Did it live up to its claims? Pretty much, yes. Also see how FAST, Excite, AltaVista and some Inktomi-powered services perform. NOTE: The page isn't ready yet, but results will be posted by July 7.

Search Engine Sizes

The current sizes of major crawler-based search engines, historical numbers and plenty of articles that document the size wars over the past years.

Google Announces Largest Index
The Search Engine Report, July 5, 2000

Google's index has gotten much larger, but not all partners may tap into it.

Yahoo Partners With Google
The Search Engine Report, July 5, 2000

Yahoo has selected Google to take over from Inktomi in powering Yahoo's secondary results. These are the listings that appear in the "Web Pages" area of Yahoo's results, after any hits from Yahoo's own human-compiled listings.

Numbers, Numbers -- But What Do They Mean?
The Search Engine Report, March 3, 2000

Explains how Google leverages its link database to expand its coverage, and it also puts other "dual numbers" you may hear into perspective.

The Half Billion Crew: Google, Inktomi GEN3, & WebTop
Search Engine Showdown, June 29, 2000

Greg Notess compares and contrasts the leaders in the index size game and runs a current comparison.

Google's Cool Billion
Web Search Guide, June 26, 2000

Another look at the Google size increase from search writer Chris Sherman.


I mentioned WebTop in my June newsletter, and now the company has just announced a half-billion page index. Expect a closer look in the future.

...'s computers used to revamp search engine
Reuters, June 28, 2000

Computers from the failed online retailer have been put to new use powering
