Earlier we reported in "Bush Administration Demands Search Data; Google Says No, Yahoo & MSN Said Yes" that the US Government seeks to force Google to hand over search data. That story explains more about the situation, and a number of postscripts have been added since it was first written. Along with that, we've been able to obtain copies of the three court documents filed in the case. Below you'll find links to each document, along with a summary of what's in each of them.
Alberto Gonzales, as Attorney General of the United States vs. Google
Notice of Motion to Compel Compliance (PDF File)
- The motion requests that Google comply with a subpoena filed by the Attorney General and "produce" for inspection and copying the materials the Government is asking for.
- After the lead government attorney conferred with Google, Google chose not to comply with the subpoena.
- The Government is asking the court to make Google comply.
- The filing then gives background on the Child Online Protection Act (COPA) and how the Government is developing its defense of COPA's constitutionality. The Government believes that COPA is "more effective than filtering software in protecting from harmful exposure to harmful material on the Internet."
- In preparation for the case, subpoenas were issued to Google and "other entities" that operate search engines to produce two sets of materials.
- First, the subpoena asks Google to produce an electronic file containing "[a]ll URL's that are available to be located on your company's search engine as of July 31, 2005."
- However, after "lengthy negotiation" the Government "narrowed" its request and asked for a "multi-stage random sample of one million URLs from Google's database, i.e., a random selection of the various databases in which those URL's are stored, and a random sample of the URL's held in those selected databases."
- Second, Google was asked to "produce an electronic file containing [a]ll queries entered into the Google engine between July 1 and July 31 inclusive."
- Again, after lengthy negotiations the Government changed its request and asked for an electronic file "containing the text of any search string entered into Google's search engine for a one week period (absent any personal information identifying the person who entered the query)."
- Google has still refused to comply with these requests in any way.
- The Government says that access to this information would be of "significant significance" in the preparation of its case. Why, specifically?
- "The production set of queries entered into Google's search engine would assist the Government in its efforts to understand the behavior of current web users, to estimate how often web users encounter harmful-to-minors material in the course of their searches, and to measure the effectiveness of filtering in screening that material."
- This information would also help the Government understand what "web sites people find through the use of search engines, to determine the character of those sites, to estimate the prevalence of harmful-to-minors material on those sites, and to measure the effectiveness of filtering software on that harmful to minors material."
- The document continues into a discussion with plenty of legalese and citations, again points out that Google has failed to comply, and lists some of the reasons Google objects.
- Google first objects to this on the grounds of relevancy.
- Google also objects on the grounds that providing what the Government asks for would require it to produce information identifying the users of its search engine.
- The Government claims that this concern is "illusory," since it has specifically asked for a random sample with no personally identifying information attached to any search string.
- The Government said that other search entities have already complied by providing files containing no personally identifying information.
- Google also contends that the information they're being asked to produce is "redundant" since the Government has asked other engines to produce similar files. The Government argues that this "misunderstands" what's being requested. "The production set of queries from Google's database, in combination with similar productions from other search engine operators will assist the Government in developing a sample of the overall universe of search engines queries, while accounting for the potential of any variations in the type of queries that are entered into different search engines."
- The Government says that since Google is the market leader, its response "would be of value" in developing the Government's overall sample of queries.
- Google says that complying would also force it to share trade secrets, because the total number of queries it receives in a day is a trade secret. The Government responds that even if this were the case, a district court has said that these numbers would not be disclosed.
- Finally, according to the filing, Google says that it will be subject to an "undue burden" in complying. The Government claims that this is not the case whatsoever. The Government adds that it would be "willing to work" with Google to specify a multistage sample. It is also willing to compensate Google for its work in complying with the subpoena.
- The filing ends with the Government saying that "This court should require Google to comply with the subpoena on the same terms its competitors have."
Declaration Of Joel McElvain (PDF File)
The second filing is a declaration by Government attorney Joel McElvain, who I believe is the lead attorney for the U.S. Department of Justice in this matter. It also helps establish a timeline of events to this point.
- A copy of the original subpoena, originally signed on August 25, 2005.
- Detailed info and definitions about what Google was to submit to the Government.
- A several-page letter, dated October 25, 2005, that Ashok Ramani, Commercial Litigation Counsel at Google, sent to Joel McElvain with his objections to the subpoena. THIS IS A MUST READ!!!
Key Quotes and Passages from the Letter
- "It is against Google's competitive interest to be viewed as reflecting the whole world wide web."
- Worth noting that Google says that the Government tried to use Archive.org/Wayback Machine and found the results unsatisfactory. From the letter, "...given the www.archive.org's stated purpose, one would expect them -- with an appropriate consulting relationship to create the results the DEFENDANT wanted."
- The Government's request is seen as redundant because it already has URLs from at least one other engine.
- From the letter, "Though the search engines doubtlessly have some differences in the URLs they store, what distinguishes Google from its competitors is the sophistication of Google's search engine in locating and ordering relevant results."
- On the burden to Google: "Google would have to spend a disproportionate amount of engineering time and resources to (i) number (even in rough terms) in real time the URLs contained in its search database and (ii) extract based on that initial numbering the URLs selected by Professor Stark."
- Google also objects because complying could "endanger" its "crown-jewel trade secrets." Specifically, it would have to disclose the approximate number of URLs in its database and "some" details on how it crawls URLs, "such as the number of servers, server distribution, and how often Google crawls the World Wide Web."
- More objections. "Google objects to the Defendant's view of Google's highly proprietary queries database as a free resource that Defendant can use, some levels removed, to formulate its own defense."
- "Moreover, Google's acceding to the Request would suggest that it is willing to reveal information about those who use its services. This is not a perception Google is willing to accept. And one can envision scenarios where queries alone could reveal identifying information about a specific Google user, which is another outcome we cannot accept."
Next, we find another letter. This time it's from DOJ's McElvain to Google's Ramani. This letter is dated December 23, 2005.
- The letter discusses how the Government is willing to narrow what's asked for in the subpoena. This is summarized above in the Notice of Motion to Compel Compliance section of this post.
- McElvain discusses how Google asked for and was granted two extensions, until October 10, 2005, to serve its objections to the subpoena. He then writes, "In our several discussions prior to the service of those objections we had offered to limit the scope of the requests for production, and you had indicated Google's willingness to consider compliance with the subpoena along with the narrowed terms that we had suggested. Your written objection also reiterated your hope to reach a resolution regarding Google's compliance with the subpoena. However, shortly after the service of your objections, you telephoned me to inform me that Google would decline to comply with the subpoena."
- More conversations between the Government and Google took place on December 12th and December 21st to discuss the technical aspects of the request. Finally, on December 21st, McElvain was informed that Google would not comply with the subpoena.
- The final document is a protective order in the ACLU v. U.S. case.
Declaration Of Philip B Stark (PDF File)
This document is a declaration by Philip Stark, Ph.D., who is the person working on the project. Dr. Stark is a Professor of Statistics at the University of California, Berkeley.
- Stark explains how he has had conversations with the USDOJ, Google and other search providers "to develop practical approaches to sampling their databases of URLs and search queries."
- He adds that he has started to analyze the samples produced by search providers other than Google.
- He writes, "Reviewing user queries to search engines will help us understand the search behavior of current web users, to estimate how often web users encounter HTM materials through searches, and to measure the effectiveness of filters in screening those materials." (HTM is shorthand for harmful-to-minors.)
Stark goes on to describe his approach in more detail and to explain why including Google's results is directly relevant.
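For readers wondering what the "multi-stage random sample" described in the motion might look like in practice, here is a minimal Python sketch. It is not taken from the filings and is not Stark's or Google's actual method. The function name `multistage_sample`, the shard names, and the example.com URLs are all hypothetical; the sketch only illustrates the two stages the motion describes: first randomly select some of the databases, then randomly sample URLs within each selected database.

```python
import random


def multistage_sample(databases, n_databases, urls_per_database, seed=None):
    """Two-stage random sample (illustrative only).

    databases: dict mapping a database name to a list of URLs (hypothetical layout).
    Stage 1 picks databases at random; stage 2 picks URLs within each one.
    """
    rng = random.Random(seed)

    # Stage 1: randomly select databases, without replacement.
    names = rng.sample(sorted(databases), min(n_databases, len(databases)))

    # Stage 2: randomly sample URLs inside each selected database.
    sample = []
    for name in names:
        urls = databases[name]
        k = min(urls_per_database, len(urls))
        sample.extend(rng.sample(urls, k))
    return sample


if __name__ == "__main__":
    # Hypothetical data: 10 databases holding 100 made-up URLs each.
    shards = {
        f"shard-{i}": [f"http://example.com/{i}/{j}" for j in range(100)]
        for i in range(10)
    }
    picked = multistage_sample(shards, n_databases=3, urls_per_database=5, seed=0)
    print(len(picked), "URLs sampled")
```

Note that taking the same number of URLs from every selected database only gives each URL an equal chance of selection when the databases are the same size. A real design would have to account for differing database sizes, which is the kind of detail a statistician would need to work out with each search provider.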