Google Book Search “Hack” Just Normal Operation

Steve Rubel thinks he’s "hacked" Google Book Search, as he covers in his
Read Most of O’Reilly’s Hacks Books for Free Using Google post. In reality, I
think he’s just finding that Google Book Search operates exactly the way it is
supposed to operate: it shows you the percentage of a book that the publisher
itself has allowed you to view online.

Steve describes reading books in O’Reilly’s "Hack" series, such as Podcasting
Hacks. He’ll go to the table of contents, pick a hack he wants to read about,
and read the entire chapter covering it, which is possible because the chapters
are fairly short. If I understand right, he then goes back to the table of
contents, finds another chapter and reads that.

Scary sounding stuff, reading the entire book online like that! Actually, it
turns out he can’t read the entire book. The percentage he can read isn’t so
scary when you understand that a publisher is allowing it.

Once Again —
The Difference Between Google Print & Google Library covered once before how
publishers work with Google Book Search. Nevertheless, I’ll do a short version
and apply it to what Steve found.

Google Book Search takes in content in two different ways. There’s the Google
Library program, where they scan books. If the book is out of copyright, the
entire content may be displayed. If it’s in copyright, nothing is displayed
other than small snippets.

Then there’s the entirely separate Books Partner program. Publishers in that
program, like O’Reilly, voluntarily submit their books. When they do this, they
can also indicate how much of their books they want to have displayed, from 20
to 100 percent. If they don’t want any of it viewable, then only snippets and no
actual pages are shown.
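
For readers who like to see rules written down, here is a minimal sketch of the display logic just described. It is only an illustration of the two programs as summarized above, not anything Google has published; the names `Source` and `viewable_percent` are made up for this example, and treating "snippets only" as zero percent is my own simplification.

```python
from enum import Enum
from typing import Optional


class Source(Enum):
    # Hypothetical labels for the two intake routes described above.
    LIBRARY = "library"   # scanned by Google through the library program
    PARTNER = "partner"   # submitted voluntarily by the publisher


def viewable_percent(source: Source, in_copyright: bool = True,
                     publisher_percent: Optional[int] = None) -> int:
    """Return the share of full pages shown; 0 stands for snippets only."""
    if source is Source.LIBRARY:
        # Out-of-copyright scans may be displayed in full; in-copyright
        # scans show nothing other than small snippets.
        return 0 if in_copyright else 100
    # Partner program: the publisher chooses 20 to 100 percent,
    # or chooses to have no pages viewable at all (None here).
    if publisher_percent is None:
        return 0
    if not 20 <= publisher_percent <= 100:
        raise ValueError("partner setting must be between 20 and 100 percent")
    return publisher_percent
```

Under these assumptions, a partner book whose publisher picked 70 percent returns 70, while an in-copyright library scan returns 0.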

In Steve’s case, O’Reilly is in the partner program. You’re told that at the
top of the pages you view, where it says:

Provided by O’Reilly through the Google Books Partner Program.

Now remember that 70 percent figure Steve was talking about, that he could
read about 70 percent of the hacks in any particular book? Sounds to me like
O’Reilly’s gone with a 70 percent viewable figure for its books.

You can see another mistaken assumption (or perhaps intentional twisting) of
how Google Book Search works over at Google Watch. Scroll to the bottom of this
page, which argues against the library scanning program.

You’ll see a graphic with the faces of Google cofounders Larry Page and
Sergey Brin saying, "Hey boys and girls, write all your term papers using
Google’s snippets. No need to visit the library to find that copyrighted book."

The example used below the smiling faces of Larry and Sergey is a search on
Steve Badrich, with this snippet shown:

Campus Wars: The Peace Movement at American State Universities in the Vietnam
by Kenneth J Heineman – History – 1994 – 160 pages
Page 134 – Steve Badrich, decided in March 1966 to enlist in the marines rather
than spend two more anxious years at the university while his draft board made
More results from this book

Below that is a screenshot of an actual page from the book, such as you’ll
find at Google Book Search.

Conclusion? I think many will read that as an example of how Google Book
Search is taking copyrighted books out of libraries and putting them online in a
viewable format. But go up to the top of the page, and you’ll see this:

Campus Wars: The Peace Movement at American State Universities in the
Vietnam Era
by Kenneth J Heineman – Provided by NYU Press through the Google Books Partner
Program
In reality, this book wasn’t scanned through the library program. It was
put into Google Book Search by the publisher itself, NYU Press. And the
reason those college "boys and girls" can view the page online is down to the
publisher itself allowing this.

Gary looked earlier in Can Full Book Preview Prevention Be Hacked? at another
mistaken assumption this year that someone had found a Google Book Search hole
when in reality, it was the publisher allowing viewing.

Gary’s post also covers the only report (source material no longer online)
I’ve ever seen of someone saying they found a way around the protections
entirely. This was before Google had the required log-in system and used a
wholly cookie-based one. Since that time, no honest-to-goodness hacking has come
to light that I’ve seen.

I’m not saying it’s impossible. It wouldn’t surprise me if it happens. But
that’s not what Steve’s done here. There’s no "hole" that he’s "hacked," as far
as I can tell.

Postscript from Gary: I agree with all of Danny’s comments. Two quick points.

First, a recent post about Firefox add-on CustomizedGoogle says that they have a method that allows the printing of CustomizedGoogle pages. This would also make for some issues if the tool grew in popularity and people started printing thousands of pages.

Second, if you’re interested in reading and searching all of the O’Reilly books as well as tech books from many other major publishers including MS Press, Sams, Prentice-Hall and many more, I suggest taking a look at a service named Safari Tech Books, which just happens to be co-owned by O’Reilly. Yes, O’Reilly content everywhere. This service allows full text searching, fielded searching, printing, e-mailing, and more. As I’ve said many times in 2005, many libraries like the San Francisco Public Library offer access to this service for FREE! That’s right, free!!! No hacking needed. (-: If your library doesn’t offer it, then you can subscribe to Safari. Prices vary but access to up to 10 books (full text, no limit) is about $20.00. This page has more info. Interested in a free trial of the full service? Go for it! It’s completely free for two weeks. Register here.
Btw, Safari offers other tools. For example, notification of new titles via RSS. A feed generator makes all sorts of feeds (titles by publisher and category) possible.

So, Safari doesn’t have the titles you want? Then take a look at Books24x7. Again, this service is usually licensed to companies, libraries, etc. but individual subs are also available. Again, full text, searchable access (no limits on how much you can view or print). This collection offers more than technology books. Here’s a list of a few recently added titles. You can also request a free trial.

Access to more than 20,000 NEW books (no limit on how much you can view online) is available from ebrary. Online access is free, just pay to print or copy pages. Usually about $.25/page. Great stuff. More in this SearchDay article.
