Catching up on some important news from last week, the judge in the case of the US Department Of Justice versus Google has ruled that Google does NOT have to provide the DOJ with query logs. Google calls it a victory, and I agree.
- Last year, the DOJ demanded that Google hand over two months' worth of query data, from June 1 through July 31, 2005. That would have been billions of queries in total. Just put them in an "electronic file," Google was told. Then find a terabyte USB key big enough to hold this monstrous text file, I guess, so that the DOJ could open it up in WordPad on the computer used to process Bill Gates's taxes. Maybe that has enough memory to load the file :)
- The DOJ backed off the original request, saying it wanted only one week's worth of data. "Only a week" still would have put the number of queries in the billion-plus range.
- In court last week, the DOJ declared that it now needed only 5,000 random queries in total. Got it? Originally it needed billions of queries and went to court to force Google's hand, then it decided only 5,000 were necessary.
The judge decided against giving the DOJ any search data at all. Why? From my reading of the ruling (PDF format), the judge found that the possible concerns over privacy outweighed the DOJ's need for Google's data, given the data it had already obtained from other search engines or could obtain through other options.
The judge noted that Google itself warns users that government actions might require it to hand over private data. Still, the judge wrote:
The expectation of privacy by some Google users may not be reasonable, but may nonetheless have an appreciable impact on the way in which Google is perceived, and consequently the frequency in which users use Google. Such an expectation does not rise to the level of an absolute privilege, but does indicate that there is a potential burden as to Google's loss of goodwill if Google is forced to disclose search queries to the Government.
But the government didn't want private data, right? They only wanted queries, not the other log information that might link the queries personally with anyone.
That's not entirely correct. "Private Searches Versus Personally Identifiable Searches" is my past article that explains how all searches are private, at least in the minds of many searchers. Moreover, some of these private searches might contain information that could link them back to an individual, at least loosely. True, the searches can't be absolutely, positively identified back to an individual. However, they still remain "private" in nature.
The judge clearly was concerned about this, enough so to ultimately rule against the query log handover:
Thus, while a user's search query reading "[user name] stanford glee club" may not raise serious privacy concerns, a user's search for "[user name] third trimester abortion san jose," may raise certain privacy issues as of yet unaddressed by the parties' papers. This concern, combined with the prevalence of Internet searches for sexually explicit material (Supp. Stark Decl. ¶4) -- generally not information that anyone wishes to reveal publicly -- gives this Court pause as to whether the search queries themselves may constitute potentially sensitive information.
Google does have to hand over 50,000 random URLs from its index, which doesn't impact the privacy of anyone. Interestingly, the ruling does put Google in an odd position. Now it has a court backing up the idea that query logs are private. So should it still be publishing query log data through tools like those provided to AdWords advertisers? I showed earlier how I could use that tool to find queries containing social security numbers. Perhaps the company might find itself the target of a different suit down the line, by someone claiming their privacy was violated through exposure like this.
I think that's unlikely, but it's worth noting. Overall, I still am glad to have these types of tools (other search engines offer them as well). I think there's a difference between the government overreaching to ask for billions of queries versus advertisers doing more focused research on searching patterns. And heck, the government could have just used the advertiser tools themselves.
For background on the case, see these past articles from us:
- Administration Demands Search Data; Google Says No; AOL, MSN & Yahoo Said Yes
- Documents & Summary Of United States Versus Google Over Search Data
- The Day After: Points In The Search Trust Sweepstakes
- Groups, Government Officials Comment on Privacy and Web Search
- Larry Page Comments on Privacy Matters
- MSN Search Blogs On DOJ Request
- A Brief Look at Danny's Appearance on Nightline
- Full Text Reports from the Congressional Research Service on Internet Privacy, Net Technology, and Protecting Children from "Unsuitable Material"
- Your Search Privacy: A Flowchart To Tracks You Leave Behind
- Private Searches Versus Personally Identifiable Searches
- Google Not Installing Third Party Cookies -- It's Firefox Prefetching
- New Poll Finds Web Users Want Google to Keep Data Private; Full Text Access to Report
- Search Champs Talk to MSN VP about Data Turned Over to Feds
- Search Data Request Has Searchers Pondering Their Next Query
- Senator Patrick Leahy Asks Attorney General for More Info On Web Search
- Google Has Right to Log the Text of Messages Sent Using their Send to SMS Feature
- How The US Department Of Justice May Analyze Search Data & Freedom Of Information Act Request For Disclosure
- Search Engines Log IP Addresses & Cookies -- And Why Care?
- Privacy Bill Introduced, Not Well Thought Out
- Suspect Entwistle's Desktop Search Query to be Used in Court Case
- 60% Oppose Search Engines Storing Search Behaviors
- of Justice Rejects Google's Claims of Privacy Threat
- Filings Against DOJ Request -- Including Declaration From Matt Cutts
- Judge: Google Must Give Up Some Data To Department Of Justice