My favorite part was this:
The second issue was to ensure that the crawler got the full text so they could work their on the full content rather than just the titles and abstracts. A bit of sleight-of-hand at our end ensured that the crawler got what it needed but with the URLs in the Google index being a suitable entry point for an end user.
That sleight-of-hand is almost certainly cloaking: showing an end user something different from what the crawler saw. Cloaking, of course, is against Google's published policies for webmasters.
As I've covered before in the cases of cloaking that Google allows for NPR and Google Scholar, this type of cloaking is helpful to searchers. It's good cloaking. I have to stress that Dodds and the other Google Scholar participants are doing nothing wrong. They are working directly with Google, with Google's full approval, in a way that Google rightly feels will help searchers.
Nevertheless, Google's failure to update its policy continues to make it sound hypocritical. Telling general web publishers not to cloak, then having Google Scholar participants talk about "sleight-of-hand," sends a mixed message.
As I blogged earlier, an update to Google's policy on cloaking is long overdue, to eliminate this mixed message. Simple changes like those shown in bold below would be enough:
The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they'll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking without our permission, if we feel it is harmful to our search rankings.
If you're a Search Engine Watch member, Google & The Approved Cloaking Problem takes an even longer look at the issue, covering not just the needed definition change but also the fact that general web publishers are long overdue for some of the special assistance Google gives to merchants and to book and scholarly publishers.