Why Don’t Book Publishers Object To Web Indexing?

Gary posted earlier about the latest publisher’s group to object to Google’s digital library program. As
I’ve posted earlier, I’ve found some of the arguments odd given that search engines have long indexed copyrighted material from across the web without permission and without
complaint by publishing groups. This time, I followed up with Sally Morris, chief executive of the Association Of Learned And Professional
Society Publishers, about why her group seems to me to view print copyright as something deserving greater indexing protection than web copyright.

The ALPSP put out a statement (PDF format) last week with this key highlight that caught my eye:

Google Print for Libraries is a very different matter. We firmly believe that, in cases where the works digitised are still in copyright, the law does not permit making a
complete digital copy for such purposes.

I asked Morris:

Is the view of the ALPSP that this applies only to books or any copyrighted work? I ask because all works, to my knowledge, enjoy automatic copyright protection in many
countries. That’s certainly the case in the US. This includes the billions of pages that Google and other search engines index on the web.

Google, for example, has indexed nearly 1,000 pages from the ALPSP web site. My assumption is that the ALPSP
never overtly asked for these pages, all of which are copyrighted, to be digitized and included in Google. Despite this, I’ve never heard your organization complain about such

In fact, when I look here, you seem not only happy to have Google index your pages but also happy to have other people
search the entire web’s copyrighted works via Google.

In short, why is opt-out OK when it comes to web content but not OK when it comes to published works?

Morris replied:

In my view the laws of copyright are not different for books and for other copyright works. So you’re right, in principle Google should seek opt-in permission before
indexing freely available web pages, too (as, indeed, the British Library’s web archiving project has very properly done – and very hard work it is too). However, I think the
issue is much more acute where the content is not made freely available by its copyright owner – which is, of course, the case for all the in-copyright content Google are
planning to digitise from libraries.

I wasn’t convinced on the “freely available” front and sent this follow-up:

Why is publishing a book not making content freely available?

If I go into a library, I’ve got plenty of content for free. That’s exactly why Google has gone into the libraries. The information is made accessible to any patron.

I don’t know of any library being sued for allowing people to borrow books, which arguably goes directly to the potential earnings a publisher could make. You’d know far
better than I if this has actually happened, of course. But the books are there, and they are free to anyone able to gain a library card or lending rights. In contrast, Google
is not making the full text of books available as a library does. If anything, libraries are far greater infringers than Google and have been so longer. Why aren’t libraries
being targeted?

Now if you mean freely available in terms of ease of copying (i.e., web pages published to the web are easier for anyone to access), I can understand your point of view
more. I’d still disagree with it, however. Just because I distribute on the web doesn’t mean I consider my copyright to be any less important than if I publish in print.

As for the British Library project, my understanding was that they wanted to have the law changed to
essentially give them opt-out abilities:

One of the problems faced by the consortium is that, due to UK copyright law, permission is needed before a site can be archived. The British Library is working with the
government to extend the law to allow them blanket access to all Web sites because “there are 4 million sites that we would like to capture — we cannot ask everyone for
permission,” said Boulderstone.

They’re also not quite the same as with search engines. While search engines generally do offer cached copies of pages, archiving is more substantial, making actual lasting
full-text copies of pages.

Morris replied:

A published book is sold – to the individual or to the library. Lending it out does not contravene copyright. To my mind, making a digital copy of the whole thing does.

We are not saying that increasing visibility via Google Print is a bad thing – I think those of our members who participate in the Google Print for Publishers program (or
who otherwise allow Google to index their closed content) are generally pleased with the increased hits, though I’m less clear whether they are in fact seeing increased sales.
All we’re saying is that the method of achieving it seems to us clearly to break copyright laws – and we’d like to work with Google to find an acceptable way of getting
publishers’ opt-in.

And I guess all I’m saying is that those publishers, if they try to push this angle with Google via a lawsuit, had better be prepared to explain why they’ve never
complained about having their web sites indexed by Google for years without permission.

Moreover, woe to the publisher or member of a publishing group that is ever found during legal disclosure to have complained about not being indexed better on Google. You
can’t enjoy years of free traffic from a source, then suddenly decide that copyright law is now different just because the words appear in print, rather than on the web.

One interesting solution will be to see if Google simply goes out and buys a copy of every book it wants to offer in its virtual library. If libraries are OK lending books,
Google might argue that it’s creating a card catalog of books in its collection. Heck, you could even make it so that only one person at a time could “check out” viewing some
of the pages that Google Print offers for reading online.

I do have sympathy for publisher rights. I publish material myself. I just find it bizarre to see the print industry suddenly acting like it can ignore 10 years of web
indexing. For more from me on this topic, see:

See also our forum thread, SEW should support the AAUP’s position on Google.

Postscript from Gary:
I wanted to add two points to Danny’s post.

First, Danny writes, “Heck, you could even make it so that only one person at a time could ‘check out’ viewing some of the pages that Google Print offers for reading online.” Actually, this concept is already being used by many libraries. Libraries purchase access to digital copies of both new and old full-text books via services like ebrary and NetLibrary. Patrons can then “virtually” check out these books for a certain period of time. Services like Books24x7 and Safari Tech Books also provide searchable full-text books online and, unlike Google Print/Library, there is NO limit on how much you read online. As I’ve pointed out before, many of you have free access (from home) to one or more of these services via a local, corporate, or university library. More about that here. Btw, many libraries have started to allow card holders to virtually check out and download audio books for free. (-:

Second, from a searcher’s perspective, in-copyright material that Google scans from a library will be full-text searchable but not full-text viewable online. Google puts it this way, “you will only be able to view the bibliographic information and a few short sentences of text around your search term.” You will also be unable to print this material (yes, you could do screen caps). Here’s a screen cap of what will be visible.
