No, it's not April Fool's Day. Google has indeed cloaked pages on its own search engine and now banned those pages from its index.
Earlier I posted about Google cloaking pages as spotted on Threadwatch (and updated here). Turns out, Google says it's an accident that happened due to it trying to optimize its internal search engine used by AdWords support people. Nevertheless, the company's now banned its own pages from its own search engine for cloaking.
The move is somewhat odd given that Google does allow other people to cloak on its search engine, as my Google & Approved Cloaking and Cloaking By NPR OK At Google stories explain. Nevertheless, it's a PR move the company probably felt it had to make, lest it be accused of not following the guidelines it tells others to follow.
Google's GoogleGuy forum rep provided the explanation early today in this WebmasterWorld thread: Cloaked Pages Targeted at Search Box To Be Removed. Specifically, he said:
Those pages were primarily intended for the Google Search Appliances that do site search on individual help center pages. For example, http://adwords.google.com/support has a search box, and that search is powered by a Google Search Appliance. In order to help the Google Search Appliance find answers to questions, the user support system checked for the user agent of "Googlebot" (the Google Search Appliance uses "Googlebot" as a user agent), and if it found it, it added additional information from the user support database into the title.
The issue is that in addition to being accessed via the internal site-search at each help center, these pages can be accessed by static links via the web. When the web-crawl Googlebot visits, the user support system thinks that it's the Google Search Appliance (the code only checks for "Googlebot") and adds these additional keywords.
That's the background, so let me talk about what we're doing. To be consistent with our guidelines, we're removing these pages from our index. I think the pages are already gone from most of our data centers--a search like [site:google.com/support] didn't return any of these pages when I checked. Once the pages are fully changed, people will have to follow the same procedure that anyone else would (email webmaster at google.com with the subject "Reinclusion request" to explain the situation).
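The behavior GoogleGuy describes boils down to a substring check on the user agent. Here is a minimal Python sketch of how such a check could misfire. It is an illustration only, not Google's actual code: the function name, the keyword table, the page ID and the appliance user agent string are all assumptions made up for this example, while the title text comes from the AdWords support page at issue in this story.

```python
# Hypothetical stand-in for the user support database (not Google's real data store).
EXTRA_KEYWORDS = {
    "traffic-estimates": "traffic estimator, traffic estimates, traffic tool, estimate traffic",
}

BASE_TITLE = (
    "Google AdWords Support: Why do traffic estimates for my Ad Group "
    "differ from those given by the standalone tool?"
)


def page_title(base_title: str, page_id: str, user_agent: str) -> str:
    """Return the title to serve for a page.

    Assumed logic: if the user agent contains "Googlebot", prepend extra
    keywords from the support database. The check cannot tell the site-search
    appliance apart from the web crawler, because both identify as "Googlebot".
    """
    if "Googlebot" in user_agent:
        keywords = EXTRA_KEYWORDS.get(page_id)
        if keywords:
            return f"{keywords} {base_title}"
    return base_title


# Assumed user agent strings, for illustration only.
appliance_ua = "Googlebot (site-search appliance)"  # made-up appliance string
crawler_ua = "Googlebot/2.1 (+http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0"

for ua in (appliance_ua, crawler_ua, browser_ua):
    print(ua, "->", page_title(BASE_TITLE, "traffic-estimates", ua))
```

In this sketch the appliance and the web crawler receive the same keyword-stuffed title, while an ordinary browser gets the plain one. Telling the two apart would take a stricter signal than the word "Googlebot" alone, such as a distinct user agent for the appliance.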
I did follow up with Google on Monday, immediately after I posted my original story on the cloaking. I got a preliminary "we're checking and we'll get back to you" message. I'm still waiting on that official response. If it finally comes, I'll let you know.
That preliminary message I received, however, conflicts with what was later posted. On Monday, I was told directly by Google that a quick check of the page in question from a Google IP address and with either a Google user agent or a Googlebot user agent didn't show any cloaking.
In other words, the title of the page displayed on Monday to the person at Google, who was pretending to be Google's web indexing agent, was:
Google AdWords Support: Why do traffic estimates for my Ad Group differ from those given by the standalone tool?
Nevertheless, the title actually recorded by Google in its index was:
traffic estimator, traffic estimates, traffic tool, estimate traffic Google AdWords Support: Why do traffic estimates for my Ad Group differ from those given by the standalone tool?
If Google's web indexer, Googlebot, really was accidentally served these pages, then that preliminary check should have revealed it. (UPDATE: Why it didn't is uncertain, Google says. One likely explanation is that the page content itself had already been changed by another Google department by the time the check was done, as I speculated in the paragraph below.)
It could be that the cloaking had stopped by the time the check was done. I do know that the last time I looked at that page as recorded in Google's cache, Google had recorded the cloaked content as of March 7 at 4:54am GMT. That's the time stamp for when Googlebot last indexed the page. The quick check by a Google employee was done at 5:15pm later that day. In the roughly 12 hours between the spider's last visit to the page and the check, someone at Google may have shut off the cloaking.
By the way, GoogleGuy is indeed a real Google employee whom you can trust as speaking for Google, even though, as I've written before, his comments have sometimes been described as unofficial in nature.
Confusing? Yep, it is. I've also written before that it's time for the lid to come off GoogleGuy's identity. That's especially so if Google's going to continue releasing official information about controversial topics such as cloaking or nofollow via forums, blog entries and so on in this way. The company needs to finally identify the person behind the nickname, so that the general public doesn't have to wonder if it's really Google talking. I've had reporters ask me in the past how they can know the person is real; John Battelle on his blog wondered the same earlier this year after getting a GoogleGuy comment:
As I understand it from the Google Guy post (and I am not sure this really is a "Google Guy" - when will Google just stop being coy and let actual real people make comments?)
Hopefully, we'll see Google finally identify GoogleGuy so there's no confusion that he does speak for the company. If not, and if we have to keep getting "official" information in this "non-official" way, I'll simply out him myself.
Want to comment or discuss? Please visit our forum thread, Google Caught Cloaking and Keyword Stuffing.