Once Again -- The Difference Between Google Print & Google Library

After reading What's A Week On The Web Without Controversy? over at MediaPost, I'm literally shaking my head in disbelief at the confusion in the article and what it may breed among those who read it. So once again, I'm going to dive into what Google Print is, what it does and the difference between that and what I'm going to call Google Library. Perhaps some history will be helpful given all the debate in recent weeks.

Google Print was launched in December 2003 with the full cooperation of participating publishers, as our Google Introduces Book Searches article from that time explains. You couldn't actually search on a Google Print site at that time, however. Instead, matches from Google Print would show up in regular search results, and you could click through to very limited excerpts. Interestingly, Random House was one of the participating publishers back then, whereas today, it's critical of Google Print because of the Google Library project I'll discuss below.

In October 2004, Google greatly expanded the ways publishers could participate in Google Print and made it possible to see the full text of books in varying amounts, according to what PUBLISHERS chose to display, not what Google decided would be best. Our Google Print Opens Widely To Publishers article from that time explains more about this.

The MediaPost article I mentioned above talks about Google Print having a "library project" and a "publisher project," with the latter being most controversial:

Google is said to be working in two capacities: The "library" project and the "publisher" project. The publisher project is the most controversial, as Google aims to work with publishers to make copyrighted books searchable. The Authors Guild and five major publishers are suing to prevent Google from scanning books without explicit permission.

The opposite is true. It is the Google Library project that is controversial. The Google Print Program for Publishers project isn't part of Google Library. It's the preexisting program that allows publishers who wish (and plenty do) to make their content available through Google Print and viewable to the degree they want to show. There's nothing controversial about that program in terms of copyright issues, unless you find some authors who may have concerns that their publishers might benefit more than they do. Publishers who want to participate can and do. Publishers who don't want to participate stay out of the program.

Google Library is what I'll use as a shorthand description of the Google Print Library Project, Google's library digitization project. It probably would help matters greatly if Google gave that program a name that is distinct from Google Print, as I'll explain further below.

Google Library launched in December 2004, with the goal of taking books (both in and out of copyright) in public libraries and scanning them to make them searchable. Our Google Partners with Oxford, Harvard & Others to Digitize Libraries article from that time explains more about the program.

One of the chief goals of Google Library was to feed new content into Google Print. But unlike with Google Print's publisher program, Google Library gathered content up without publisher permission.

It didn't take long for publishers to object to the activity. Copyright Questions On Google Digitization Project is a post from us in March 2005 about objections. Some Publishers Not Happy With Google's Library Digitization Program followed in May. Publishers' Group Asks Google To Halt Scanning For 6 Months from June covers more pressure. Eventually, we got to a lawsuit in September (Google's Library Scanning Project Heads to Court) and a further one last month (Association of American Publishers Sues Google over Library Digitization Plan).

What's lost in all these objections is that Google Library is NOT reprinting books online. Back to that confusing MediaPost article:

Cynics speculate all books will be made available via search. The company has not said how it will address copyright laws.


So, dear readers, how do you feel about this? As a writer and a consumer, I am torn. When I've got my writing hat on, I'd say this is wrong. There must be protection in regard to copyrighted materials.

Google has said how it will address copyright law: it will not and does not reprint books that are in copyright without explicit publisher permission via the Google Print publisher program.

Google Library simply makes the content of a book searchable. You can go to the Google Print site, maybe find a matching book scanned through Google Library, but you won't see anything from that book unless the book publisher has given explicit permission for this. The only exception to this is if the book is out of copyright.

My recent Indexing Versus Caching & How Google Print Doesn't Reprint post explains this in more depth. Google Library is the scanning process for SOME of the content in Google Print, but that scanning is NOT the same as printing material. Google Library is effectively making a card catalog of books.

Gary hates me using the card catalog analogy as too archaic, but too bad -- I think that still resonates with many people. Card catalog, "online public access catalog," whatever you want to call it -- it's whatever you use to find a book in a library.

Now think about the last time you went into a library and sat down at a search terminal to find a book. When you got a match, did you then click and read the book on the computer screen? No, in all likelihood you did not. Instead, you were given the location of the book in the stacks, and you walked over to pull it off the shelf.

Google Library is helping Google create that type of searchable index of books, which feeds into Google Print -- but Google Print does not let you then pull the book off the virtual shelf and read it online unless a publisher has explicitly given permission.

Whether the scanning itself, done only to build a search index, is still a copyright infringement remains to be seen. If it is, the same question hangs over web indexing. My Why Don't Book Publishers Object To Web Indexing?, Forget Google Print Copyright Infringement; Search Engines Already Infringe and Legal Experts Say Google Library Digitization Project Likely OK; Will It Revolve Around Snippets? posts explain how scanning of web pages has gone on for over a decade without legal repercussions, and how publisher groups involved in the Google Print lawsuit themselves sing a different tune when it comes to web indexing, though the principle at stake is the same.

Back to Google Print, the most recent news has been that it is making public domain works gained through Google Library available online via Google Print. Unlike what the MediaPost article suggests, however, these are not the only works you can get. As I've explained, works that are still in copyright may also be read online, but only with publisher permission.

And finally, back to that Google Print versus Google Library confusion. It is difficult for anyone to understand the differences between the publisher program, the library scanning program and what both allow and do. It would help if Google gave the library program -- which at the moment seems to be called the Google Print Library Project -- a better name.

For example, take the Why we believe in Google Print post over at Google from last month, where Google writes:

We've been asked recently why we're so determined to pursue Google Print, even though it has drawn industry opposition in the form of two lawsuits, the most recent coming today from several members of the American Association of Publishers

Google's not being sued over Google Print. It's being sued over Google Library. But the failure to distinguish the two things is making ALL of Google Print seem like it's under fire. Google Print has much content that publishers are voluntarily providing. It's the Google Library that's the problem right now for Google, so give that a name separate from Google Print and perhaps some of the confusion between the two will go away and benefit discussion about the real issues, rather than what often seem to be mistaken assumptions.

Want to know more? If you're a Search Engine Watch member, our Google: Print & Library section of Search Engine Watch has a rundown on many more past posts with history. Plus, you help support Search Engine Watch and the tired fingers of me, Gary and Chris here at the site.

Want to comment or discuss? Visit our Google Sued Over Google Print Library Scanning thread in the Search Engine Watch Forums or create a new thread over there.

About the author

Danny Sullivan was the founder and editor of Search Engine Watch from June 1997 until November 2006.
