Yahoo Announces Total Size Count

Looks like we might have a search engine total size war beginning.

As I’ve said in the past, total size numbers are primarily used for marketing purposes, bragging rights if you like. Michael Liedtke from the AP reports that Yahoo is announcing a total size count. The number Yahoo is announcing is more than 20 billion “web objects.” The number is a combination of total web pages and total images.

Yahoo said its index, boosted by a recent upgrade, covers 20.5 billion online “objects,” comprised of about 19 billion documents and 1.5 billion images. By comparison, Google said it tracks 11.3 billion objects.

Tim Mayer points out on the Yahoo Search Blog that the “total objects” number also includes more than 50 million audio files.

This is the first time Yahoo has publicly announced a total web count. However, it has announced total image and audio file counts in the past.

Interesting numbers, but don’t get carried away with them. Yahoo will have “the largest” bragging rights until (I would bet) Google announces a larger number. Then it will be Yahoo’s turn again. Will MSN join in the fun? What about Ask Jeeves? And so it goes. What really matters is relevance and other metrics. Hat tip to Tim Mayer for not forgetting this important point and for mentioning it in his Yahoo blog post.

I want to make clear that while Yahoo’s total size number is just a number (a claim, really, since size numbers from just about all web engines are difficult to verify), Yahoo Search does deserve lots of credit for building some first-rate products over the past couple of years: not only web search but also several excellent specialty indexes, including image search, audio search (here’s my overview), and a great news search engine. Yes, competition means good things for the searcher. (-:

Don’t forget that very often smaller, focused web databases are also very capable of providing excellent results. Finally, since many searchers only look at the first few results, just because a page is listed somewhere in a results set doesn’t mean it will be seen. Again, this is why relevance is so important. Maybe the Invisible or Deep Web in 2005 is everything beyond the first 10 results?

Postscript: As web engines grow larger, searchers would be doing themselves a favor by learning to take advantage of some of the many advanced features web engines offer, which could do wonders in providing more precise queries and more relevant results. A little learning can go a long way.

