Google may face legal action over Ashley Cole searches, over at Pink News, covers how English footballer Ashley Cole might be upset with Google because its clustering technology is highlighting content about "ashley cole gay" in its search results. It's another example of Google's user interface experiments confusing people.
This screenshot shows what's at issue. Midway down, you'll see a section that says:
See results for: ashley cole gay
Independent Online Edition > Legal
England footballer Ashley Cole is suing The Sun and the News of the World over
claims that two Premiership players indulged in a "gay sex orgy". ...
Ashley Cole sues after gay rumours | Headlines | News | Gay.com UK
Gay.com UK is the country's leading gay and lesbian lifestyle portal, providing
an unrivalled combination of chat and news.
Ashley Cole files lawsuit over gay orgy story- from Pink News- all ...
Ashley Cole files lawsuit over gay orgy story from PinkNews - all the latest gay
news from the UK and beyond to the gay community.
This is an example of the middle-of-the-page query refinement that Google's been testing over the past several months, as we wrote about back in August.
In particular, what seems to be happening is that Google is performing "clustering," a long-standing technique of grouping pages on a similar topic together. In other words, it sees there are lots of pages about "ashley cole" along with a subgroup of those on the topic of "ashley cole gay."
That there might be a subgroup like this isn't surprising. Cole is currently suing newspapers The Sun and The News Of The World over allegations they printed that he is gay. Those allegations have fueled discussion on the web, leading to a subgroup of pages on this topic.
Clusty provides a similar example of this. A search for ashley cole over there shows clustered topics along the left-hand side of the page including:
Cole's solicitor is reported by Pink News as wanting to know whether the suggestion was chosen editorially by Google or generated from search volume. Google gave no comment.
From where I sit, it almost certainly was NOT editorially done. Instead, it was probably based on a combination of search volume and actual pages on the web.
In other words, Google's probably seen a spike in queries for "ashley cole gay." It also can probably see there's a good chunk of pages out there on this topic.
For example, a search for the exact phrase "ashley cole" brings back 551,000 matching web pages. If I further refine that to "ashley cole" gay, I find there are 48,800 pages that use his name along with the word "gay" on them -- about 9 percent of all the exact phrase "Ashley Cole" pages out there.
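To make that arithmetic explicit, here's a minimal sketch of the calculation. The function name is my own for illustration, and the counts are simply the figures quoted above, not anything pulled from Google programmatically.

```python
def share_of_pages(subset_count: int, total_count: int) -> float:
    """Return subset_count as a percentage of total_count."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * subset_count / total_count


# Page counts quoted in this post (as reported by Google at the time).
exact_phrase_pages = 551_000   # "ashley cole"
refined_pages = 48_800         # "ashley cole" gay

pct = share_of_pages(refined_pages, exact_phrase_pages)
print(f"{pct:.1f}% of the exact-phrase pages")  # roughly 9 percent
```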
It's important to remember that search counts can be very misleading. A large number of pages with his name and the word gay doesn't mean he is gay, only that many pages might be discussing the topic. It could also be that his name is showing up on pages that use the word gay in reference to other people.
Our Fox News & Danger Of Citing Search Counts discussion at the Search Engine Watch Forums covers more about why you can't depend on counts to "prove" particular facts. But the large number of pages could cause Google -- just like Clusty -- to automatically decide that there's a "cluster" or "topic" related to those words.
Why bring up this particular topic when something like "ashley cole" cars comes up with more matches (60,100 of them)? That brings me back to search volume. If Google's noticing that there are a lot of queries on a particular subtopic (ashley cole gay) related to the main topic (ashley cole) plus a significant number of pages on that topic, that might cause this refinement to kick in.
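As a rough illustration of that guess, the sketch below triggers a refinement only when both the page share and the query share of a subtopic are high enough. To be clear, this is my own hypothetical rule, not Google's method: the function name, the thresholds and the query-share numbers are all made up for the example. Only the page counts come from the searches described above.

```python
def should_suggest(page_share: float, query_share: float,
                   min_page_share: float = 0.05,
                   min_query_share: float = 0.05) -> bool:
    """Hypothetical rule: suggest a refinement only if the subtopic is
    common both on the web (pages) and in what people search for (queries)."""
    return page_share >= min_page_share and query_share >= min_query_share


TOTAL_PAGES = 551_000  # exact phrase "ashley cole", from this post

# Page counts are from this post; query shares are invented for illustration.
candidates = {
    "ashley cole cars": {"pages": 60_100, "query_share": 0.01},
    "ashley cole gay": {"pages": 48_800, "query_share": 0.20},
}

for term, data in candidates.items():
    page_share = data["pages"] / TOTAL_PAGES
    decision = should_suggest(page_share, data["query_share"])
    print(f"{term}: page share {page_share:.1%}, suggest={decision}")
```

Under those assumed numbers, "cars" has more pages but too few queries to trigger anything, while "gay" clears both bars, which is the pattern I'm speculating about.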
- ashley cole
- ashley cole and cheryl tweedy
- cheryl tweedy ashley cole
- ashley cole pictures
- ashley cole girlfriend
So where's "ashley cole gay" on the list? My guess is that the search data Google is showing is old, so a term that may only now be rising in popularity isn't appearing yet.
The Google Zeitgeist is another place to check if this query might be gaining. However, Google hasn't updated the non-US versions since last November. Even if it does update them, the lists there are subject to human review. Google might very well remove something if it's deemed not family friendly, just as it already removes many sexually-related queries.
In the end, I doubt Cole would have much success in suing Google over the listing, if indeed he decided he wanted to. There are definitely pages on the topic and almost certainly people looking for information about it.
Still, it would sure be nice, as we wrote in our Google Losing Consistency As It Continues To Experiment With Results article back in August, if Google made it clearer how and why certain things show up in its search results. It has a search results explanation page here, but that page doesn't cover the experimental displays that Google continues to run and that keep confusing people.
Postscript: Hitwise has stats showing the growth of the "gay" queries here, and Schmidt's Google Queried By Soccer Star's Lawyers from Forbes has none other than Google CEO Eric Schmidt putting out a statement saying the suggestion was automatically created based on query behavior. Cole's lawyer Graham Shear is satisfied with that explanation but wants to know more about the data behind it.
Graham, see the previous Hitwise link. The data's simple. Your client is in the news over the allegations. Lots of people interested in the case are almost certainly typing in his name to find out more, getting a lot of stuff not necessarily related to the allegations, so they are adding the word "gay" to narrow down the search results.