Google & The Approved Cloaking Problem

The launch of Google Scholar shows once again that Google has no problem with cloaking, as long as Google itself has approved it.

As you may recall, I wrote back in May about how Google was allowing NPR to cloak content that was in both its Google News and Google Web Search indexes: Cloaking By NPR OK At Google.

There's no problem on the searcher side with allowing this. Google wants to have more good content in its index. Allowing the cloaking of NPR's content, and now some scholarly content, helps the searcher. However, it flies in the face of Google's own stated policy on cloaking:

The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings.

In both situations, Google is being shown content that is different from what regular users see. My past article documents this in the NPR case. With Google Scholar, Google is able to spider the full text of documents, while many regular searchers without password access to this material will see only abstracts.
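To make the mechanics concrete, here is a minimal sketch of how a server could end up showing a crawler the full text while password-less visitors get only an abstract. It is an illustration under stated assumptions, not a description of how NPR or any scholarly publisher actually does it: the function name, the `CRAWLER_TOKENS` list and the user-agent check are all hypothetical, and real arrangements may identify the crawler some other way.

```python
# Hypothetical sketch only: the names and the detection rule are assumptions
# made for illustration, not any publisher's real setup.
CRAWLER_TOKENS = ("googlebot",)  # assumed substring found in a crawler's User-Agent


def select_content(user_agent: str, has_password_access: bool,
                   abstract: str, full_text: str) -> str:
    """Return the version of a document that a given visitor would receive."""
    is_crawler = any(token in user_agent.lower() for token in CRAWLER_TOKENS)
    # The crawler and logged-in subscribers get the full text;
    # everyone else sees only the abstract.
    if is_crawler or has_password_access:
        return full_text
    return abstract
```

The point of the sketch is that the same URL yields different content depending on who asks, which is exactly what the stated policy describes, regardless of whether the intent is good or bad.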

In addition, the presence of such material naturally distorts search rankings. Once the material is in the index, it suddenly has a chance to rank well and push aside other material. This is a GOOD distortion. Users may want to find this formerly invisible material. But it's a distortion nonetheless, and so it falls within Google's current definition of cloaking.

Rewrite The Definition

I've suggested to Google that it make some changes to its policies, to help reconcile what it says with what it does. Here are two rewordings I've passed along:

The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings. In some limited cases, Google does have arrangements with publishers where we may crawl material different from what a regular user sees. In these cases, the arrangements are done because we feel they benefit the quality of our searches, not harm them.

Or:

The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking without our permission, if we feel it is harmful to our search rankings.

Perhaps this all seems like wordplay to Google, but it's this type of oversight that causes some people to lose faith in the company. Our forums recently had a thread looking at other cases where what Google says differs from how it acts: Google quirks summary (all lies...).

More important, when Google allows cloaking despite saying it is not allowed, web site owners who can't get special arrangements with Google are left feeling like anything goes. Clarifying the policy is important. Otherwise, as I've heard from people since I wrote about the NPR story, some will feel they may as well cloak whenever they think the circumstances warrant it.

This leads to two final points. First, cloaking is not inherently bad. As a technique, it is often tied to bad intent, but not always.

Intent, Not Technique

If the intent is to manipulate the search engine in ways many might find harmful to users, then the ban is really on that manipulation rather than on the fact that cloaking is involved. If the intent is deemed good, as is the case with NPR and Google Scholar, then Google clearly sees cloaking as OK.

I hope Google will make the wording changes and finally acknowledge that in some cases, it does approve cloaking. My past article from 2003, Ending The Debate Over Cloaking, looks at how this might help us get back on track to think about intent rather than technique.

Better Support For All

The second point is that Google needs to rapidly develop some system to extend to all publishers the special arrangements it now gives only to some. There is plenty of good, non-scholarly material locked behind password systems. That material, much of it perhaps even more important to the general public, remains inaccessible.

Google takes feeds from merchants. It works with book publishers. Academic publishers now get to have relationships. But general web publishers, upon which Google has built its business? They remain in the cold.

I've written and written and written in the past about the need for Google to provide some type of webmaster services to such publishers. It's time for the standard response of "we're always thinking" or "maybe in the future" to end. Get on with it now.

Failure to do so is going to cause web site owners to lose further faith in Google or, as mentioned, simply decide they might as well do whatever makes sense.

A recent forum thread illustrates this: Locked content on Google. There, a web site owner is trying to get his password-protected content indexed. The idea of doing abstracts is suggested, but so is cloaking -- and that's something the site owner considers. How much better would it be for Google to simply establish formal relationships with sites, so such end-runs and other games could end? Much better!

FYI, I've had several off-the-record conversations about all this with Google over the past few months, and I sent another follow-up message to them yesterday. I still have no on-the-record comment I can report on the cloaking policy. If that changes, I'll let you know.

