I’ve had a long talk with the group that so far has
Google in Belgium over indexing, a talk that leaves me thinking they don’t fully
understand how search engines work and why their arguments over copyright
infringement will ultimately fail. Then again, the case is really about trying
to convince Google it should pay to carry their news content. A closer look at
all this in the story below, as well as an update on the situation in general,
including an appeal for Google that’s been granted.
Let’s go back to the beginning. In March,
Copiepresse tells me it started
legal proceedings against Google over its inclusion of Belgian news sources
without explicit permission. The organization represents a number of publishers
that were concerned over being indexed.
Information about the case, including a summons, was all sent to Google in the
United States, according to Copiepresse. A hearing was held in Belgium on
September 5th, then the ruling came out last Friday, September 15. Google didn’t
take part in the hearings, for reasons it says it is still investigating.
The ruling required that Google do two main things within 10 days of receipt:
- Remove French and German-language content from the publishers from Google
  Belgium’s web sites or pay a fine of €1 million per day
- Publish the ruling on Google Belgium and Google News Belgium or pay a fine
  of €500,000 per day
Over this past weekend, Google says it complied with the first part. It
removed links to at least these news sources, Google told me:
noted that Google did more than remove these sites from Google News Belgium.
They were removed from Google Belgium entirely. Here are a couple of searches
that demonstrate this:
Some have thought this is an example of Google getting revenge, robbing these
publishers of regular traffic they probably assumed was safe in a fight over
Google News indexing. For its part, Google said its reading of the ruling meant
that the sites had to be dropped entirely from Google Belgium:
Order the defendant to withdraw the articles, photographs and graphic
representations of Belgian publishers of the French – and German-speaking
daily press, represented by the plaintiff, from all their sites (Google
News and "cache" Google or any other name) within 10 days of the notification
of the intervening order, under penalty of a daily fine of 1,000,000.- € per
day of delay;
I’ve bolded the key part. Google says it interpreted "all their sites" as
being all sites that it views the court having jurisdiction over, anything using
the Google.be domain. In addition, Google has removed the sites from Google News
worldwide, saying it is treating the ruling as it would any request to be
removed from Google News. In those cases, you’re dropped entirely, not on a
The sites do still appear in searches via Google.com or other Google
editions not aimed at Belgium. While these sites can still be reached from
Belgium, Google considers them outside Belgian jurisdiction.
That view is sort of laughable, though I understand the reasoning well. It’s
unlikely that Google Belgium is actually being served up out of Belgium, so
artificially pretending that Google.com and other Google sites are somehow
"outside" Belgian jurisdiction makes no sense. However, this type of pretending
isn’t that unusual. It’s a nice way for search engines to act like they are
following the ruling of a particular country by making changes on "that
country’s Google." It’s also a convenient way for particular courts to feel
they’ve exerted jurisdiction over sites that they might really not be able
Overall, Google has complied with the first part of the ruling. As for the
second, it hasn’t posted the required notices and says it will wait for a ruling
due out Friday specifically about that issue. It argued yesterday in a hearing
for appeal that posting the notice on the home pages wasn’t necessary given all
the publicity the case has now received.
An appeal for the case overall was granted. It will be heard on November 24,
and the entire matter is largely in limbo until then. I hesitate to consider the
case a victory for Copiepresse given that the first hearing — for whatever
reason — had no defense from Google at all.
This leads me to Copiepresse’s complaint with Google. In the group’s view,
Google has illegally copied material without permission. It feels that in some
way, Google should get permission before indexing.
Indexing, of course, is not copying. Search engines do read pages in to make
them searchable, as my
Caching & How Google Print Doesn’t Reprint article explains in more detail.
But indexing isn’t reprinting pages, in the way some arguments try to make it.
Google does show cached copies, something raised in the case. But cached copies
aren’t shown within Google News search, which was the main focus of this case
(as an aside, one US court has ruled cached copies aren’t an infringement,
disagree with but something also easily rectified through no caching
I had a very long conversation about the permissions issue with Margaret
Boribon, secretary general of Copiepresse, to try and better understand how they
wanted Google to operate. Why not use commonly understood and effective
mechanisms such as robots.txt files or meta robots tags to prevent indexing?
"If you do so, you admit that Google does what they want, and if you don’t
agree, you have to contact them. This is not the legal framework of copyright,"
Boribon said.
This is an age-old issue in the search engine world. By default, search
engines assume that permission is granted to index a document, in order to make
it searchable. Technically, shouldn’t they get explicit permission? Legally,
that might make things safer. Logistically, it would never work. Many sites
don’t have clear contact details. Some domains themselves contain multiple
sites. Moreover, there are millions of sites across the web. Contacting them all
beforehand simply wouldn’t work well.
I asked Boribon about this, how her group would propose search engines
undertake such a task.
"I’m sure they can find a very easy system to send an email or a document to
alert the site and ask for permission or maybe a system of opt-in or opt-out,"
Boribon said.
Would it be OK for such a system to work automatically, I asked? Yes, that
would be fine. A machine-to-machine connection would be OK, she said. So then, I
asked, why not use the existing robots.txt or meta robots systems?
Both mechanisms are easy, automatic ways for publishers to declare if they
grant indexing permission or not. In fact, I’d argue that both are a way for
search engines to ask beforehand for the very permission that Copiepresse wants
them to seek. Major search engines — not just Google — all request or check
these blocking mechanisms.
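To show how automatic that check can be, here is a minimal Python sketch using the standard library's urllib.robotparser module. The robots.txt rules, the crawler name ExampleBot and the example.be addresses are all made up for illustration. They are not taken from any publisher's actual file, and real search engines may apply extra rules of their own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve. Invented for this sketch:
# ExampleBot may crawl everything except /archives/, all other bots are
# shut out of the whole site.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /archives/

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleBot", "http://www.example.be/news/today.html"),
        ("ExampleBot", "http://www.example.be/archives/2006.html"),
        ("OtherBot", "http://www.example.be/news/today.html"),
    ]:
        print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

The point of the sketch is that the publisher writes the file once and every well-behaved crawler asks the question on its own before fetching a page, with no email or phone call involved.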
Boribon rejected the existing solutions. One issue she had was that they
weren’t legally endorsed. That’s true, but that’s also something I think will
change over time. In the US, we’ve had one case
where opt-out solutions like tags have been accepted.
Outside the US, there have been some scattered cases, such as one
from 1997 in the UK involving news indexing. But none of these cases have seemed
to stop the search engines.
The Belgium case could be different. What happens in one country isn’t
applicable to others. It may be that Copiepresse will prove its point that
permission should be sought in advance. Alternatively, a court could endorse
existing blocking mechanisms as having legal force.
That’s what I think should happen. These systems pose an easy way for anyone
who doesn’t want to be in a search engine to stay out. If the issue with
Copiepresse was really about not being indexed, all of the publications it
represents could easily stay out through those solutions. Google — like other
major search engines — doesn’t index sites against their wills.
There’s more at work here, of course. The publications DO want to be in
Google. The action is simply an effort to force Google to the bargaining table
and get paid for inclusion, from what I can see.
"Our purpose is not to be excluded. Of course, we want to be in the system,
but on a legal basis," said Boribon. "We want to be remunerated."
Her group’s view — as is the view of the World Association Of Newspapers
that she also referenced several times — is that Google is exploiting sites. It
is making money off these sites and giving them little or nothing in return.
Most search marketers hearing this have to stifle laughter or disbelief.
That’s because most search marketers want all the search traffic they can get.
It’s free, easy and converts well. They understand that search engines give them
plenty of value and complain most when something happens to take that traffic
away, as was the case with the
Update of 2003.
I’m not going to spin out the argument that search engines generate far more
benefits from the indexing they do than harm. For one thing, I think this is
self-evident given the sheer amount of concern of getting into search engines,
rather than out of them. If you must have more argument, see my past post,
Search Engines As
Leeches, The Difference Between Paid & Free Listings & Keyword Price Rises.
The difference between most publishers on the web and those of Boribon — or
book publishers also suing over Google’s scanning program — is that they think
they are special, in my opinion. They think they have content that is more
important than other content on the web, content that is either entitled to more
protection or that warrants payment for being included.
Several times, Boribon stressed that those who spent a lot of time and money
on their works deserved to be compensated by Google. My response was that I
don’t care if content is worth €1 or €1,000,000. It is entitled to the same
protections. To be fair, Boribon agreed when I made that point. Yet our talk
still continued to be riddled with her references to the high value of some
content or the concept that only some content had protected status.
I’ve been through this before.
Why Don’t Book
Publishers Object To Web Indexing? covers how one book group, while
admitting that copyright law should apply the same regardless of whether works
are in digital or book form, still suggested that online works were somehow
I think the issue is much more acute where the content is not made freely
available by its copyright owner – which is, of course, the case for all the
in-copyright content Google are planning to digitise from libraries.
Skipping past copyright law, let’s focus on payment for inclusion. Boribon
said that Google had made special arrangements with
Le Monde to include it in Google News,
explaining that was one of many examples of Google targeting the most important
sources for special treatment.
My response was Google has special arrangements with lots of publishers that
have content that can’t easily be indexed. If Le Monde required user
registrations, Google couldn’t spider the site without contacting them and being
allowed in. Indeed, it’s the same thing Google has done for the New York Times,
covered. It’s something Google (and other search engines) does for even
non-news sites, if they have important content that it thinks should be
Google is not paying Le Monde or the New York Times for these arrangements,
though Boribon seemed to believe that was the case, and no doubt other
publications do as well. Google confirmed with me it has no payment system like
this with Le Monde. But such a belief highlights the huge education challenge
Google faces, trying to help these publications that have mistaken notions of
how it — and all search engines — operate.
Of course, Google does have one
with a news source that came to attention recently, the Associated Press. Google
still hasn’t explained exactly whether this was a relationship it formed to prevent
an AP lawsuit over being in Google News or a separate agreement to pick up some
of AP’s content for reuse.
Fair to say, AP’s content is important enough and helpful enough to Google
that it did decide to enter into an agreement to make use of it in some way.
Boribon’s group feels their content is important enough that it should obtain
some type of agreement as well.
This is also an old story, in some ways. Tom Mohr in Editor & Publisher
earlier this month was only the latest of those within the newspaper industry
sounding a call for newspapers to band together to deny content in hopes of
But what if 2/3 or more of the U.S. newspaper industry sits on one
platform, managed by Switzerland Inc.? What if Switzerland Inc. decides to
deny Yahoo! and perhaps Google access to newspaper industry content for three
months, followed by a negotiation for better terms? That’s the power of a
The World Association Of Newspapers had a
similar call earlier
Web search engines, such as Google and Yahoo, collect headlines and photos
for their users without compensating the publishers a cent, according to the
World Association of Newspapers (WAN), which announced Tuesday that it intends
to "challenge the exploitation of content" by the Googles and MSNs of the Web.
The Belgian lawsuit is simply another step forward in pushing for that
payment, exactly what Google CEO Eric Schmidt described as "negotiation being
done in a courtroom" when I
spoke with him
Because of our scale and because of the amounts of money that we have,
Google has to be more careful with respect to launching products that may
violate other people’s notion of their rights. But also, frankly, we find
ourselves in litigation and the litigation was expensive, and diverts the
management team, etcetera, from our mission. In the cases that you describe,
most of the litigation in my judgment was really a business negotiation being
done in a courtroom. And I hate to say that, but that is my personal opinion.
And in most cases a change in our policy or a financial change would in fact
address many of the issues.
In the end, I want honesty. If the Copiepresse or the AFP (also
feel Google doesn’t have permission to index their content, then just use the
easily implemented mechanisms to get out and stay out. Don’t file unnecessary
court cases, nor just single out Google as the whipping boy when Yahoo and
Microsoft, to name only two search engines, operate the same way.
Is it about getting paid? Is it that these publishers think they are so
important they should get money for being included, since links alone to their
web sites make search engines more comprehensive? That’s fine, but you don’t
need a court case for that either. Just opt out. If you’re worth it, Google and
the others will come running to the negotiating table. If you’re not, well, no
one’s going to miss you — but you’ll miss the search engine traffic, as the
Belgian publications almost certainly are discovering to their horror now.
I don’t want lawsuits that seriously threaten web search itself. Boribon’s
ruling potentially applies to all content, not just news content, in Belgium.
Anyone could sue Google and other search engines saying that robots.txt blocking
isn’t explicit enough. If that happens, Boribon’s organization is going to find
searching the web from Belgium is difficult, since there won’t be any content in
Google, Yahoo or other services at all.
That would be ironic, given that Boribon says she’s a regular Google user.
She’s routinely using a service where virtually none of the content listed is
there because of some explicit approval process. That’s hypocritical, given her
group’s lawsuit. If they don’t believe opt-out mechanisms are sufficient, then
none of these member publications should be using Google or any search engine as
part of their daily routines.
Postscript: V7N points at WAN to combat ‘search engine spiders’, which has the World Association Of Newspapers suggesting incorrectly that search engines have no technological solution to spider only some content. They absolutely do. Content can be flagged on a page-by-page basis, if that’s what a content owner wants to do.
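To make the page-by-page point concrete, here is a minimal Python sketch of how a crawler might read the meta robots tag on a single page. The three sample pages and the helper names (page_directives, may_index, may_cache) are hypothetical, and the handling of "noindex", "noarchive" and "none" is a simplified reading of the commonly documented directives, not a description of how any particular search engine implements them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def page_directives(html: str) -> set:
    """Return the set of meta robots directives found in one page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def may_index(html: str) -> bool:
    """Assumption: "noindex" (or "none") means keep this page out of the index."""
    return not ({"noindex", "none"} & page_directives(html))


def may_cache(html: str) -> bool:
    """Assumption: "noarchive" means do not show a cached copy of this page."""
    return "noarchive" not in page_directives(html)


# Hypothetical pages: one blocked, one indexable but not cacheable, one open.
BLOCKED_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
NO_CACHE_PAGE = '<html><head><meta name="ROBOTS" content="NOARCHIVE"></head></html>'
OPEN_PAGE = "<html><head><title>Front page</title></head></html>"

if __name__ == "__main__":
    for label, page in [("blocked", BLOCKED_PAGE), ("no cache", NO_CACHE_PAGE), ("open", OPEN_PAGE)]:
        print(label, may_index(page), may_cache(page))
```

Because the decision is made per page, a publisher could leave its front page open, keep paid archives out entirely and allow indexing without cached copies elsewhere.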