I've written before about Google giving strange results counts and why maybe it's time for them to go. Yesterday, I came across the oddest ones ever, when doing some typical searches to gauge the size of the index.
Here's an example. Search for xxkjdiuenmnmd8i, which when I just did it came back with no results. Now search for -xxkjdiuenmnmd8i. In theory, that should show the size of the Google index, all the pages it has.
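To make the theory concrete, here is a toy sketch in Python. It is only an illustration under simple assumptions: a three-document in-memory index with made-up contents and hypothetical helper names (`build_index`, `count_matches`), not a description of how Google actually counts results. It shows why excluding a word that appears nowhere should, in principle, match every document in the index.

```python
# Toy model only: a tiny in-memory index, not how Google actually works.
# The documents and helper names below are invented for illustration.
DOCS = {
    1: "the cars in the movies",
    2: "cars and trucks",
    3: "movies about the sea",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index

def count_matches(index, all_ids, query):
    """Count documents for a one-word query; a leading '-' excludes the word."""
    if query.startswith("-"):
        # Everything in the index minus the documents containing the word.
        return len(all_ids - index.get(query[1:], set()))
    return len(index.get(query, set()))

INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)

print(count_matches(INDEX, ALL_IDS, "xxkjdiuenmnmd8i"))   # 0: nonsense word matches nothing
print(count_matches(INDEX, ALL_IDS, "-xxkjdiuenmnmd8i"))  # 3: excluding it matches every document
print(count_matches(INDEX, ALL_IDS, "-the"))              # 1: only document 2 lacks "the"
```

In this toy model, the count for the negated nonsense word equals the size of the whole index, which is the figure the technique is meant to surface.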
In reality, that type of search hasn't often worked. It was only last September that this type of index estimation technique gave any results at all. Even then, I didn't trust that the numbers were accurate. Still, they seemed better than what's coming up now. Look at the screenshot below:
Ten results? Only ten results, for a search technique that last month would have come up with more than 25 billion? Something funky is going on.
Finding it odd, I tried a search for the, often useful as a fast way to get a sense of how big Google might be, at least for the number of English language pages it has. The query came back with 23 billion matches. So how about -the, I tried, just out of curiosity. Ten matches:
Ten? Ten?!!! And more strangeness. Searches for -and, -cars and -movies all did the same thing. The results were different in various ways, but the count was always only 10 matches, when it should have been much more.
Note that the results all have additional information that makes them appear to come out of Google Base. It all suggests that Google has disabled counting for queries involving a single negative word, but that somehow, Google Base integration is still happening to throw things off. It might be that Google is still making a call to Google Base, asking for the top 10 results that it has, in order to integrate those results into a regular web search listing. But because it has also disabled the display of regular web search results for a single negative word query, only the Google Base results show.
Going back to my post from last month, Google, Kill The Web Search Counts!, I explained how Google had stated that the counts reported for a spam site that was removed were much inflated by a counting glitch. I talked with Google about this and some other issues last week just before leaving for my trip to SES Latino in Miami, where I am now.
Some of what I talked about with Google's Matt Cutts and other engineers at Google has already been addressed in a recent blog post. The issue of counts came up, and I'll do a longer post on what Google said after I get back from this trip and clear what I can discuss. The short answer is that they are aware of the issues and are looking to correct things. These strange results counts might be part of that.
More later when I'm back from my current trip, or watch Matt's blog, in case he posts before me.