I've had a long talk with the group that so far has successfully sued Google in Belgium over indexing, a talk that leaves me thinking they don't fully understand how search engines work and why their arguments over copyright infringement will ultimately fail. Then again, the case is really about trying to convince Google it should pay to carry their news content. A closer look at all this in the story below, as well as an update on the situation in general, including an appeal for Google that's been granted.
Let's go back to the beginning. In March, Copiepresse tells me it started legal proceedings against Google over its inclusion of Belgian news sources without explicit permission. The organization represents a number of publishers that were concerned over being indexed.
Information about the case, including a summons, was all sent to Google in the United States, according to Copiepresse. A hearing was held in Belgium on September 5th, then the ruling came out last Friday, September 15. Google didn't take part in the hearings, for reasons it says it is still investigating.
The ruling required that Google do two main things within 10 days of receipt:
- Remove French and German-language content from the publishers from Google Belgium's web sites or pay a fine of €1 million per day
- Publish the ruling on Google Belgium and Google News Belgium or pay a fine of €500,000 per day
Over this past weekend, Google says it complied with the first part. It removed links to at least these news sources, Google told me:
It's been noted that Google did more than remove these sites from Google News Belgium. They were removed from Google Belgium entirely. Here are a couple of searches that demonstrate this:
Some have thought this is an example of Google getting revenge, robbing these publishers of regular traffic they probably assumed was safe in a fight over Google News indexing. For its part, Google said its reading of the ruling meant that the sites had to be dropped entirely from Google Belgium:
Order the defendant to withdraw the articles, photographs and graphic representations of Belgian publishers of the French- and German-speaking daily press, represented by the plaintiff, from all their sites (Google News and "cache" Google or any other name) within 10 days of the notification of the intervening order, under penalty of a daily fine of 1,000,000.- € per day of delay;
I've bolded the key part. Google says it interpreted "all their sites" as being all sites that it views the court having jurisdiction over, anything using the Google.be domain. In addition, Google has removed the sites from Google News worldwide, saying it is treating the ruling as it would any request to be removed from Google News. In those cases, you're dropped entirely, not on a country-by-country basis.
The sites do still appear in searches via Google.com or other Google editions not aimed at Belgium. While these sites can still be reached from Belgium, Google considers them outside Belgian jurisdiction.
That view is sort of laughable, though I understand the reasoning well. It's unlikely that Google Belgium is actually being served up out of Belgium, so artificially pretending that Google.com and other Google sites are somehow "outside" Belgian jurisdiction makes no sense. However, this type of pretending isn't that unusual. It's a nice way for search engines to act like they are following the ruling of a particular country by making changes on "that country's Google." It's also a convenient way for particular courts to feel they've exerted jurisdiction over sites that they might really not be able to control.
Overall, Google has complied with the first part of the ruling. As for the second, it hasn't posted the required notices and says it will wait for a ruling due out Friday specifically about that issue. It argued yesterday in a hearing for appeal that posting the notice on the home pages wasn't necessary given all the publicity the case has now received.
An appeal for the case overall was granted. It will be heard on November 24, and the entire matter is largely in limbo until then. I hesitate to consider the case a victory for Copiepresse given that the first hearing -- for whatever reason -- had no defense from Google at all.
This leads me to Copiepresse's complaint with Google. In the group's view, Google has illegally copied material without permission. It feels that in some way, Google should get permission before indexing.
Indexing, of course, is not copying. Search engines do read pages in to make them searchable, as my Indexing Versus Caching & How Google Print Doesn't Reprint article explains in more detail. But indexing isn't reprinting pages, in the way some arguments try to make it. Google does show cached copies, something raised in the case. But cached copies aren't shown within Google News search, which was the main focus of this case (as an aside, one US court has ruled cached copies aren't an infringement, something I disagree with but something also easily rectified through no caching mechanisms).
I had a very long conversation about the permissions issue with Margaret Boribon, secretary general of Copiepresse, to try and better understand how they wanted Google to operate. Why not use commonly understood and effective mechanisms such as robots.txt files or meta robots tags to prevent indexing?
"If you do so, you admit that Google does what they want, and if you don't agree, you have to contact them. This is not the legal framework of copyright," Boribon said.
This is an age-old issue in the search engine world. By default, search engines assume that permission is granted to index a document, in order to make it searchable. Technically, shouldn't they get explicit permission? Legally, that might make things safer. Logistically, it would never work. Many sites don't have clear contact details. Some domains themselves contain multiple sites. Moreover, there are millions of sites across the web. Contacting them all beforehand simply wouldn't work well.
I asked Boribon about this, how her group would propose search engines undertake such a task.
"I'm sure they can find a very easy system to send an email or a document to alert the site and ask for permission or maybe a system of opt-in or opt-out," she said.
Would it be OK for such a system to work automatically, I asked? Yes, that would be fine. A machine-to-machine connection would be OK, she said. So then, I asked, why not use the existing robots.txt or meta robots systems?
Both mechanisms are easy, automatic ways for publishers to declare if they grant indexing permission or not. In fact, I'd argue that both are a way for search engines to ask beforehand for the very permission that Copiepresse wants them to seek. Major search engines -- not just Google -- all request or check these blocking mechanisms.
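To make the "machine-to-machine" point concrete, here's a rough sketch of the robots.txt check a well-behaved crawler performs before fetching a page. It uses the robots.txt parser in Python's standard library. The file contents, the example.be URLs and the may_crawl helper are all made up for illustration; this is not the code any search engine actually runs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a publisher might serve at the root of its site.
# The rules below are illustrative assumptions, not taken from any real publisher.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /archives/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.be/news/story.html"      # hypothetical URL
    archive = "https://example.be/archives/2006.html"  # hypothetical URL
    print("Googlebot, story:", may_crawl(ROBOTS_TXT, "Googlebot", story))
    print("OtherBot, story:", may_crawl(ROBOTS_TXT, "OtherBot", story))
    print("OtherBot, archive:", may_crawl(ROBOTS_TXT, "OtherBot", archive))
```

In this sketch the publisher shuts one named crawler out of the whole site and keeps every other crawler out of a single directory. The crawler asks the question automatically on every visit, and the publisher answers it once, in a text file.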
Boribon rejected the existing solutions. One issue she had was that they weren't legally endorsed. That's true, but that's also something I think will change over time. In the US, we've had one case recently where opt-out solutions like tags have been accepted.
Outside the US, there have been some scattered cases, such as this one from 1997 in the UK involving news indexing. But none of these cases have seemed to stop the search engines.
The Belgium case could be different. What happens in one country isn't applicable to others. It may be that Copiepresse will prove its point that permission should be sought in advance. Alternatively, a court could endorse existing blocking mechanisms as having legal force.
That's what I think should happen. These systems pose an easy way for anyone who doesn't want to be in a search engine to stay out. If the issue with Copiepresse was really about not being indexed, all of the publications it represents could easily stay out through those solutions. Google -- like other major search engines -- doesn't index sites against their wills.
There's more at work here, of course. The publications DO want to be in Google. The action is simply an effort to force Google to the bargaining table and get paid for inclusion, from what I can see.
"Our purpose is not to be excluded. Of course, we want to be in the system, but on a legal basis," said Boribon. "We want to be remunerated."
Her group's view -- as is the view of the World Association Of Newspapers that she also referenced several times -- is that Google is exploiting sites. It is making money off these sites and giving them little or nothing in return.
Most search marketers hearing this have to stifle laughter or disbelief. That's because most search marketers want all the search traffic they can get. It's free, easy and converts well. They understand that search engines give them plenty of value and complain most when something happens to take that traffic away, as was the case with the Google Florida Update of 2003.
I'm not going to spin out the argument that search engines generate far more benefits from the indexing they do than harm. For one thing, I think this is self-evident given the sheer amount of concern about getting into search engines, rather than out of them. If you must have more argument, see my past post, Search Engines As Leeches, The Difference Between Paid & Free Listings & Keyword Price Rises.
The difference between most publishers on the web and those of Boribon -- or book publishers also suing over Google's scanning program -- is that they think they are special, in my opinion. They think they have content that is more important than other content on the web, content that is either entitled to more protection or that warrants payment for being included.
Several times, Boribon stressed that those who spent a lot of time and money on their works deserved to be compensated by Google. My response was that I don't care if content is worth €1 or €1,000,000. It is entitled to the same protections. To be fair, Boribon agreed when I made that point. Yet our talk still continued to be riddled with her references to the high value of some content or the concept that only some content had protected status.
I've been through this before. Why Don't Book Publishers Object To Web Indexing? covers how one book group, while admitting that copyright law should apply the same regardless of whether works are in digital or book form, still suggested that online works were somehow different:
I think the issue is much more acute where the content is not made freely available by its copyright owner - which is, of course, the case for all the in-copyright content Google are planning to digitise from libraries.
Skipping past copyright law, let's focus on payment for inclusion. Boribon said that Google had made special arrangements with Le Monde to include it in Google News, explaining that was one of many examples of Google targeting the most important sources for special treatment.
My response was Google has special arrangements with lots of publishers that have content that can't easily be indexed. If Le Monde required user registrations, Google couldn't spider the site without contacting them and being allowed in. Indeed, it's the same thing Google has done for the New York Times, as we've covered. It's something Google (and other search engines) does for even non-news sites, if they have important content that it thinks should be gathered.
Google is not paying Le Monde or the New York Times for these arrangements, however -- something that Boribon seemed to believe was the case, and no doubt other publications do as well. Google confirmed with me it has no payment system like this with Le Monde. But such a belief highlights the huge education challenge Google faces, trying to help these publications that have mistaken notions of how it -- and all search engines -- operate.
Of course, Google does have one paid relationship with a news source that came to attention recently, the Associated Press. Google still hasn't explained exactly whether this was a relationship it entered into to prevent an AP lawsuit over being in Google News or a separate agreement to pick up some of AP's content for reuse.
Fair to say, AP's content is important enough and helpful enough to Google that it did decide to enter into an agreement to make use of it in some way. Boribon's group feels their content is important enough that it should obtain some type of agreement as well.
This is also an old story, in some ways. Tom Mohr in Editor & Publisher earlier this month was only the latest of those within the newspaper industry sounding a call for newspapers to band together to deny content in hopes of getting paid:
But what if 2/3 or more of the U.S. newspaper industry sits on one platform, managed by Switzerland Inc.? What if Switzerland Inc. decides to deny Yahoo! and perhaps Google access to newspaper industry content for three months, followed by a negotiation for better terms? That's the power of a network.
The World Association Of Newspapers had a similar call earlier this year:
Web search engines, such as Google and Yahoo, collect headlines and photos for their users without compensating the publishers a cent, according to the World Association of Newspapers (WAN), which announced Tuesday that it intends to "challenge the exploitation of content" by the Googles and MSNs of the Web.
The Belgian lawsuit is simply another step forward in pushing for that payment, exactly what Google CEO Eric Schmidt described as "negotiation being done in a courtroom" when I spoke with him last month:
Because of our scale and because of the amounts of money that we have, Google has to be more careful with respect to launching products that may violate other people's notion of their rights. But also, frankly, we find ourselves in litigation and the litigation was expensive, and diverts the management team, etcetera, from our mission. In the cases that you describe, most of the litigation in my judgment was really a business negotiation being done in a courtroom. And I hate to say that, but that is my personal opinion. And in most cases a change in our policy or a financial change would in fact address many of the issues.
In the end, I want honesty. If Copiepresse or the AFP (also suing Google) feel Google doesn't have permission to index their content, then just use the easily implemented mechanisms to get out and stay out. Don't file unnecessary court cases, nor just single out Google as the whipping boy when Yahoo and Microsoft, to name only two search engines, operate the same way.
Is it about getting paid? Is it that these publishers think they are so important they should get money for being included, since links alone to their web sites make search engines more comprehensive? That's fine, but you don't need a court case for that either. Just opt out. If you're worth it, Google and the others will come running to the negotiating table. If you're not, well, no one's going to miss you -- but you'll miss the search engine traffic, as the Belgian publications almost certainly are discovering to their horror now.
I don't want lawsuits that seriously threaten web search itself. Boribon's ruling potentially applies to all content, not just news content, in Belgium. Anyone could sue Google and other search engines saying that robots.txt blocking isn't explicit enough. If that happens, Boribon's organization is going to find that searching the web from Belgium is difficult, since there won't be any content in Google, Yahoo or other services at all.
That would be ironic, given that Boribon says she's a regular Google user. She's routinely using a service where virtually none of the content listed is there because of some explicit approval process. That's hypocritical, given her group's lawsuit. If they don't believe opt-out mechanisms are sufficient, then none of these member publications should be using Google or any search engine as part of their daily routines.
Postscript: V7N points at WAN to combat 'search engine spiders', which has the World Association Of Newspapers suggesting incorrectly that search engines have no technological solution to spider only some content. They absolutely do. Content can be flagged on a page-by-page basis, if that's what a content owner wants to do.
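For the curious, here's a rough sketch of how page-by-page flagging works with the meta robots tag. It reads a page's meta robots tag using Python's standard HTML parser and reports whether the page asks not to be indexed (noindex) or not to be shown as a cached copy (noarchive). The sample page and the helper names are my own inventions for illustration, and the sketch ignores other directives such as "none"; real crawlers handle more cases than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; a value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def page_directives(html: str) -> set:
    """Return the set of meta robots directives declared on one page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def may_index(html: str) -> bool:
    return "noindex" not in page_directives(html)


def may_show_cached_copy(html: str) -> bool:
    return "noarchive" not in page_directives(html)


# Hypothetical article page that opts out of cached copies but stays searchable.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noarchive">
<title>Sample story</title>
</head><body>Story text</body></html>"""

if __name__ == "__main__":
    print("may index:", may_index(SAMPLE_PAGE))
    print("may show cached copy:", may_show_cached_copy(SAMPLE_PAGE))
```

Because the tag lives in each page, a publisher can leave its front page searchable while flagging individual articles however it likes.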