Google Doesn't Have To Hand Over Search Logs To Justice Department

Catching up on some important news from last week, the judge in the case of the US Department Of Justice versus Google has ruled that Google does NOT have to provide the DOJ with query logs. Google calls it a victory, and I agree.

Let's recap:

  • Last year, the DOJ demanded that Google hand over two months' worth of query data, from June 1 through July 31, 2005. That would have been billions of queries in total. Just put them in an "electronic file," Google was told. Then find a terabyte USB key big enough to hold this monstrous text file, so that I guess the DOJ could open it up in WordPad on the special computer used to process Bill Gates's taxes. Maybe that has enough memory to load the file :)
  • The DOJ backed off the original request, saying it wanted only one week's worth of data. "Only a week" still would have put the number of queries in the billion-plus range.
  • In court last week, the DOJ declared that it now needed only 5,000 random queries in total. Got it? Originally it needed billions of queries and went to court to force Google's hand, then it decided only 5,000 were necessary.

The judge decided against giving the DOJ any search data at all. Why? From my reading of the ruling (PDF format), the judge found that the potential privacy concerns outweighed the DOJ's need for Google's data, given the data it had already obtained from other search engines or could obtain through other options.

The judge noted that Google itself warns users that government actions might require it to hand over private data. Still, the judge wrote:

The expectation of privacy by some Google users may not be reasonable, but may nonetheless have an appreciable impact on the way in which Google is perceived, and consequently the frequency in which users use Google. Such an expectation does not rise to the level of an absolute privilege, but does indicate that there is a potential burden as to Google's loss of goodwill if Google is forced to disclose search queries to the Government.

But the government didn't want private data, right? They only wanted queries, not the other log information that might link the queries personally with anyone.

That's not entirely correct. "Private Searches Versus Personally Identifiable Searches" is my past article that explains how all searches are private, at least in the minds of many searchers. Moreover, some of these private searches might contain information that could partially link them back to an individual. True, the searches can't be absolutely, positively traced back to an individual. However, they still remain "private" in nature.

The judge clearly was concerned about this, enough so to ultimately rule against the query log handover:

Thus, while a user's search query reading "[user name] stanford glee club" may not raise serious privacy concerns, a user's search for "[user name] third trimester abortion san jose," may raise certain privacy issues as of yet unaddressed by the parties' papers. This concern, combined with the prevalence of Internet searches for sexually explicit material (Supp. Stark Decl. 4) -- generally not information that anyone wishes to reveal publicly -- gives this Court pause as to whether the search queries themselves may constitute potentially sensitive information.

Google does have to hand over 50,000 random URLs from its index, which doesn't impact the privacy of anyone. Interestingly, the ruling does put Google in an odd position. Now it has a court backing up the idea that query logs are private. So should it still be publishing query log data through tools like those provided to AdWords advertisers? I showed earlier how I could use that tool to find things like queries containing social security numbers. Perhaps the company might find itself the target of a different suit down the line, by someone claiming their privacy was violated through exposure like this.

I think that's unlikely, but it's worth noting. Overall, I still am glad to have these types of tools (other search engines offer them as well). I think there's a difference between the government overreaching to ask for billions of queries versus advertisers doing more focused research on search patterns. And heck, the government could have just used the advertiser tools itself.

For background on the case, see these past articles from us: