President of University of Michigan Talks Google Book Search

Via Searchblog, I read excerpts from a speech by the President of the University of Michigan, Mary Sue Coleman, about Google Book Search or, to be more specific, Google's Library Program. The full text of the speech is here (thanks, John). I have no intention of arguing the intellectual property/copyright/fair use issues that the program brings to the forefront. I'll save that for the lawyers and judges.

However, a couple of quick comments on other issues.

First, Google's efforts to digitize materials both public domain and in-copyright are noble and should be applauded.

However, as we've pointed out on SEW Blog several times, the searcher who finds in-copyright material (that's been digitized via a library collection) will only be able to read snippets of these books online. In other words, it's not a full text book search tool. As Google describes it:

"...Snippet View which, like a card catalog, shows you information about the book plus a few snippets – a few sentences of your search term in context."

Quick note to Google: card catalogs are very rare these days. They are, for the most part, now called Online Public Access Catalogs. They are web accessible and often contain lots of value-added data like reviews, tables of contents, indexes, author bios, etc. I posted on this topic last year with several examples of the "modern" card catalog.

If you're interested, Google is using the year 1922 as a baseline to determine copyright.

For users in the U.S., Google Book Search currently treats all books published after 1922 as protected by copyright, except for books to which no copyright was ever attached, such as books authored by the U.S. government. For users outside the U.S., we make determinations based on appropriate local law.

New, in-copyright books will also be available (as part of the Google Book Search "Partner Program"). No legal issues here. Users view as much content as the publisher deems fit.

This is no different from what Amazon.com is doing with its Search Inside the Book (SITB) program, which currently also offers lots of cool stats for SITB titles.

As far as "pure" public domain material goes, the full text will be viewable online via Google Book Search. However, as Ms. Coleman points out, libraries and others have been digitizing books in one form or another for many years. For example, Project Gutenberg began in 1971. A great directory of public domain books (from a variety of sources including Google, Project Gutenberg, and many others) is available here.

Of course, let's not forget that late last year we learned that Yahoo, Microsoft, RLG, and others are supporting another large digitization (books and other materials) effort called The Open Content Alliance that's being led by Brewster Kahle.

As a librarian, I'm thrilled that the potential for people to make better use of their libraries could be made possible by Google's Library Program. However,

+ Will people make the extra effort to get the book they need from the library or via interlibrary loan if what they find online is only a snippet? This could be great news for the library community, but would it be good news for Google?

+ I'm asking this question because of a comment from Google's Chief Counsel, David Drummond, at a November 2005 Google Print "debate" at the NY Public Library. Here's how the NY Times reported it:

Mr. [Allan] Adler [a vice president for legal and governmental affairs at the Association of American Publishers] said Google's contention that its search program might somehow increase sales of books was speculation at best.

"When people make inquiries using Google's search engine and they come up with references to books, they are just as likely to come to this fine institution to look up those references as they are to buy them," he said, referring to the Public Library.

To which Google's Mr. Drummond [Google's general counsel] replied, "Horrors."

I also once again want to point out that many other companies are out there offering full text access to a variety of book materials. In some cases, these databases are available at little or no cost, allow you to access the full text, print content, annotate, etc.

Here are a few posts that look at some of these services:

+ NetLibrary: Over 100,000 Digitized and Searchable Books Available Online

+ Search and Read Full Text Books Online via ebrary

+ More Sources For Ebooks & Electronic Text

+ Don't forget that many libraries (including public libraries) already offer full text databases (articles, books, audio books) that you can use for free from home or office. All you need is a library card. This article has more along with several examples.

Random Thoughts
Kudos to Ms. Coleman for saying:

We were digitizing books long before Google knocked on our door* and we will continue our preservation efforts long after our contract with Google ends. As one of our librarians says, "We believed in this forever." Google Book Search complements our work. It amplifies our efforts, and reduces our costs. It does not replace books, but instead expands their presence in the marketplace.

* The Humanities Text Initiative is one example of a digitization program that's been going on at the U of M for many years.

Coleman then mentions that a copy of each digitized book will be given to the University of Michigan. What does this mean for the researcher who does not have access to the University of Michigan Library?

+ Will the public be able to access these digitized full text copies that the U of M is provided with through an interlibrary loan or perhaps an interlibrary "download" program? For example, will a researcher in Germany be able to quickly access the full text of a 1972 book about tourism via Google and the University of Michigan that they need for their research?

+ What usability will the digitized content (the book or, perhaps better said, the snippet) have? Can you copy it? Annotate it? Share it with a colleague? Copy and paste a passage into an email? Print a page?

+ Will Google decide to sell full text access to some/all of this material? They've already hinted at opening an online e-bookstore. Mr. Drummond's response (listed above) also leads me to believe that one day Google will be selling access to some of these full text books. What does this mean for the library in the long run?

+ Assuming that all of the material (billions of words) is fully indexed, will people be able to find what they're looking for quickly? What happens when a typical searcher enters two, three, or four keywords and is then given thousands of hits? Will specialized collections be developed? Will subject searching be available? Is Google going to use Library of Congress Subject Headings or develop its own controlled vocabulary and cross-reference structure? Will dynamic clustering be available? Does Google have any plans to train users to become better searchers?

+ Right now, Google is very clear that no advertising will appear on pages of books that have been digitized from a library. Could this change in the future? Who would profit from these ads? Would profits be shared with the publisher, author, etc.?

+ I have many more questions but we'll save them for later.