When the nomination was announced, some people took it as a sign of Google's wrong-doing in the privacy arena. That was a mistake, because there was no vetting of the nominations accepted. Anyone could contribute. Google could just as easily have turned the tables and nominated Google Watch for the awards.
Moreover, Google was not selected as one of the Big Brother finalists, which certainly indicates that Privacy International itself did not see the company as among the largest threats to privacy.
Nevertheless, the nomination has caused some to wonder about the privacy of their search requests at Google. In addition, some allegations made in the nomination have been transformed by others into proof of privacy violations, without being closely examined.
In this article, I'll explore each of the major allegations that Google Watch made against Google as evidence of it being a threat to privacy and of "Big Brother" behavior. At the end of each allegation, I'll provide my own verdict about how seriously a typical person may wish to consider each claim.
Be forewarned. This is a long article. If you are interested in a particular accusation, use the links below to jump to the beginning of where each accusation is explored. You can also jump right to the verdict for each accusation.
- Accusation 1: Google's immortal cookie & Verdict
- Accusation 2: Google records everything they can & Verdict
- Accusation 3: Google retains all data indefinitely & Verdict
- Accusation 4: Google won't say why they need this data & Verdict
- Accusation 5: Google hires spooks & Verdict
- Accusation 6: Google's toolbar is spyware & Verdict
- Accusation 7: Google's cache copy is illegal & Verdict
- Accusation 8: Google is not your friend & Verdict
- Accusation 9: Google is a privacy time bomb & Verdict
Google Watch (no connection with Search Engine Watch) is a web site backed by non-profit group Public Information Research. It was launched in the middle of last year by PIR's president Daniel Brandt.
At that time, one of Brandt's chief complaints about Google was its use of a "persistent" or long-lasting cookie, one that doesn't expire until 2038. It's a complaint that was reiterated as Brandt's first point in nominating Google for the Big Brother award.
Cookies help Google remember any personal preferences you might have set, such as that you like to see more than 100 results at a time or that you like to search for pages in English, rather than in all languages.
Given this, my personal reaction to the fact that the cookie doesn't expire for years has been, "So what?" If I've set preferences, I want Google to remember those preferences as long as possible. This is also Google's explanation for why its cookie lasts so long.
"If you were set a temporary cookie, then it would expire. We'd no longer know the languages you chose or any number of things like that," said Google cofounder and president of technology Sergey Brin.
In case you're wondering, that odd 2038 date has to do with the vagaries of the Unix operating system. It's the latest date that Google can set due to a millennium-like bug with Unix.
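For readers curious about the arithmetic, here is a minimal Python sketch. It assumes the limit in question is the familiar one in which Unix time is kept as a signed 32-bit count of seconds since January 1, 1970; the article does not spell out the mechanism, so treat that as an assumption.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold.
MAX_INT32 = 2**31 - 1

# Read as seconds since the Unix epoch (January 1, 1970, UTC), this is
# the last moment such a counter can represent before it overflows.
latest = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)
print(latest.isoformat())  # 2038-01-19T03:14:07+00:00
```

Under that assumption, a cookie set to expire in January 2038 is simply asking for the furthest date the counter allows.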
Length Of Time Is Non-Issue
If Google could, it would set the expiration date even further in the future, and there would be nothing wrong with that. I've seen statements suggesting that Google is doing something unusual or nefarious in having such a long expiration date. One article even states that "most such cookies expire within a relatively short time." But I've seen no proof of this, such as a survey showing the average length of time before cookies expire.
One excellent, recent survey of search engines and cookie use actually shows that Google competitors AltaVista and AllTheWeb have 10-year cookies and notes wisely that given how people upgrade computers, that's just as "bad" as a 35-year one. That's because when you get a new computer, your cookies are generally tossed out with the old one -- and few use the same computer for more than a few years.
If others have long-lasting cookies, Brandt puts the blame for this on Google.
"Google was the first search engine to use a cookie that expires in 2038," he alleges in his nomination. "Google set the standard because no one bothered to challenge them."
Maybe Google was the first search engine to use a cookie that expires in 2038, but Brandt provided no proof of this when asked. As for setting the standard, Google competitors have had long-lasting cookies since at least early 2001, as another excellent survey of search engines and privacy issues shows.
This was at a time when Google was far from the popular powerhouse that it is today, so an argument that other sites are simply mimicking what the "leader" Google is doing doesn't hold up. It wasn't the leader back then. It's also interesting to note that the survey saw Google as one of the best search engines at the time, from a privacy point of view, despite the long-lasting cookie.
"The search engines to be preferred by paranoid people are probably www.lycos.com and www.google.com," wrote survey author Marc Roessler.
What's a privacy expert's view on Google's persistent cookies? For help, I spoke with Parry Aftab. She's a cyberspace lawyer with expertise in security and privacy issues. Aftab sits on the board of TRUSTe, one of the web's oldest and most prominent privacy groups, and is executive director of WiredSafety.
"The fact that they expire late, that's just for ease. It means nothing. If you own a computer that you're going to be using in 2038, that's an interesting prospect," said Aftab. "Most sites have cookies that expire in 50 years or whatever."
In addition, Aftab feels the focus on how long a cookie lasts is misdirected, pulling attention away from real privacy issues, such as how personally identifiable information is used, for instance through a site registration scheme that collects names and addresses.
"Instead of attacking sites like Google and some of the other ones that are not trying to abuse privacy, deal with the ones that really are," Aftab said.
Interestingly, even Google Watch's Brandt admits the time a cookie lasts isn't an issue. Asked how long a cookie should last, Brandt explained that it would be fine for it to last forever, as long as it keeps getting "renewed" by regular visits to the web site.
"In terms of duration, I'd recommend a 30-day cookie that rewrites the 30-day expiration date with every access. That way, if someone doesn't use Google for 30 days, the cookie will get deleted by the browser."
In other words, every time you visited Google, your cookie would keep getting extended by 30 days. In short, it would stay alive as long as you stayed alive through regular visits to the site.
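The renewal scheme Brandt describes can be sketched in a few lines of Python. This is only an illustration of the idea, not anything Google or Brandt has published; the function names and the visit dates are made up for the example.

```python
from datetime import datetime, timedelta, timezone

RENEWAL_WINDOW = timedelta(days=30)  # the 30-day figure from Brandt's suggestion

def renewed_expiry(visit_time):
    """Return the expiration date rewritten on each visit."""
    return visit_time + RENEWAL_WINDOW

def cookie_alive(expiry, now):
    """A browser discards the cookie once its expiration date has passed."""
    return now < expiry

# Hypothetical visit dates, chosen only for illustration.
first_visit = datetime(2003, 4, 1, tzinfo=timezone.utc)
expiry = renewed_expiry(first_visit)

# A return visit 20 days later finds the cookie and pushes the date out again.
second_visit = first_visit + timedelta(days=20)
assert cookie_alive(expiry, second_visit)
expiry = renewed_expiry(second_visit)

# A 45-day absence after that lets the cookie lapse.
late_visit = second_visit + timedelta(days=45)
print(cookie_alive(expiry, late_visit))  # False
```

A regular visitor never sees the cookie expire, which is why the scheme can amount to a permanent cookie in practice.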
Unique ID Issue
Google does more than store your preferences with its cookie. The cookie also contains a unique ID that's assigned to your browser. Anytime you come to Google, the cookie helps it know that it has seen your particular browser before.
It's this unique ID that Brandt is really more worried about, rather than the cookie's long expiration date.
"How long it lasts is less important than the fact that it has a unique ID in it. You don't need a unique ID for [user preference] configuration. Something else is going on here, and Google won't say what it is."
In other words, yes, Google does like to track what an individual user does on the web site. The user is anonymous -- literally just a number rather than a named person -- but Google can then see how that unique numbered person interacted with its search results (for more about why a user ID is anonymous, see my Search Privacy At Google & Other Search Engines article).
"There's a lot of quality work we do with unique cookies. If we present a spelling correction to people, we often see how much they use this," said Brin, citing one example of monitoring user behavior for quality purposes.
"If we wanted to know how often when a user does a four word query do they end up refining it, a cookie can help us with that," Brin added, offering another example.
Looking forward, Brin opened other possibilities that having a persistent cookie makes possible.
"In the future, we might want to tailor results based on previous queries," Brin said.
As for privacy expert Aftab, she's not worried about the use of a unique ID, since in and of itself, it doesn't provide any real data about who a person really is.
"The unique id isn't tied to personally identifiable information," she said. "Frankly the unique identifier more than anything else teaches them how to improve their product."
Verdict 1: Cookie Expiration Doesn't Matter
In conclusion, don't be worried that Google's cookie won't expire for 35 years. Even Brandt agrees that's not the issue. He just doesn't like the unique ID portion of the cookie.
"Getting rid of the unique ID is the most important thing. The expiration date is a second indicator of how sensitive they are to privacy issues, even without the unique ID. But the expiration date issue is close to trivial once the unique ID is gone," Brandt said.
Verdict 2: Don't Like Unique ID? Don't Accept The Cookie
Should you be worried about the unique ID? Here, it's three against one. I, Google and privacy expert Aftab don't think that's a concern, and my Search Privacy At Google & Other Search Engines article explains why in more depth.
Still not convinced? Then you've got an easy solution. Just don't accept Google's cookie, if you are frightened. You won't have a unique ID and your privacy issues, if you subscribe to Brandt's views, will be gone. Practically everything you want to do on Google, in particular searching the web, works fine without it. For help in removing and blocking cookies, see the excellent Unofficial Cookie FAQ page.
Verdict 3: Google Changes Might Reduce Fear Of Cookies
For its part, Google might consider Brandt's idea of a renewal-based cookie, if only to ease concerns that some might have, however unfounded those concerns might be. User preferences can also be stored in a cookie that does not have a user ID, something the aforementioned survey found that Teoma does. Perhaps Google might allow users to accept this as an alternative to having to simply reject a cookie outright.
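As a rough sketch of what a preferences-only cookie could look like, the Python below builds Set-Cookie headers that carry settings but no identifier. The cookie names `lang` and `num` and the 30-day lifetime are assumptions made for the example; they are not taken from Google's or Teoma's actual cookies.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 30 * 24 * 60 * 60  # lifetime in seconds

def preference_cookies(language, results_per_page):
    """Build Set-Cookie headers holding settings only, with no unique ID."""
    cookie = SimpleCookie()
    # Hypothetical cookie names, for illustration only.
    cookie["lang"] = language
    cookie["num"] = str(results_per_page)
    for name in ("lang", "num"):
        cookie[name]["max-age"] = THIRTY_DAYS
        cookie[name]["path"] = "/"
    return cookie.output()

print(preference_cookies("en", 100))
```

Because every user who picks the same settings gets the same cookie values, nothing in the headers distinguishes one browser from another.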
Brandt's second point in his Big Brother nomination of Google makes the company sound greedy in the information it gathers when you visit:
"For all searches they record the cookie ID, your Internet IP address, the time and date, your search terms, and your browser configuration," he writes.
Some perspective is in order. Most web servers are set to record all this same information in their log files, so Google is not doing anything unusual. Indeed, even Brandt's organization's own web sites record this type of information other than search terms, he confirmed.
So why is Google's behavior, which is pretty normal, held up as something apparently wrong by Brandt?
"The 150 million searches a day mean that Google sets the standard for what other server administrators feel is okay. If Google does it and gets away with it, then Yahoo knows that they can do it, and so does MSN, and Overture too. They're all guilty, but you have to start somewhere. If Google were to start doing things right tomorrow, I'd start a Yahoo-Watch site," he said.
To me, this is a giant stretch. Logging web server data, such as IP address, time of visit, cookie ID and even search terms, was standard practice long, long before Google appeared in 1998. Google did not exist when the common log format (July 1995) and extended log format (March 1996) were established (and they were even in use before these formal dates).
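To make the point concrete, here is a minimal Python sketch that reads one entry in the common log format, whose fields are host, ident, authuser, date, request line, status and bytes. The sample entry is made up, and the basic format shown here has no cookie field; sites that record cookie IDs do so through extended or custom formats.

```python
import re

# Common log format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Made-up sample entry; 192.0.2.10 is a reserved documentation address.
sample = ('192.0.2.10 - - [10/Apr/2003:13:55:36 -0700] '
          '"GET /search?q=privacy HTTP/1.0" 200 2326')

entry = CLF_PATTERN.match(sample).groupdict()
print(entry["host"], entry["time"], entry["request"])
```

Even this bare format captures the internet address, the time and, for a search page, the search terms carried in the request line.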
As for Google, Brin said that the standard information that Google logs with each request is useful for the aforementioned research into improving quality, as well as being able to show advertisers audits on the frequency ads appeared, based on aggregate information (the limited individual information Google has is NEVER shared with advertisers, the company stresses).
As sort of a subpoint to complaining that Google records "everything," Brandt also raises a concern about what he calls Google's customization of results based on a person's geographical location.
This isn't correct. For example, I'm based in the UK, and if I search at Google.com, I'm not given UK-specific results. I get the same results that anyone in the US would see.
Occasionally, very slight differences may appear due to the fact that Google has "mirrors" of its index in multiple locations. Being in the UK, I hit the closest mirror or copy of the Google index to me, one in Europe. Someone on the West Coast of the United States hits a California-based mirror. The mirrors are not always in perfect sync, hence the chance of very slight differences. These are not differences done intentionally.
It's a different situation when it comes to ads. Paid listings are definitely targeted by location. Those in the UK only see ads targeted to them, while those in the US see something else.
Instead, what Brandt is really referring to is the fact that in some countries, Google may redirect you to a "local" edition if you try to reach Google.com. For example, if I were in Ireland and trying to reach Google.com for the very first time on a new computer, I would automatically be redirected to Google Ireland.
Google does this redirection to raise awareness that there's a customized edition for someone in a particular country. There are also good reasons to use that custom edition. They give you access to everything you get at Google.com, but you also have the ability to easily narrow your search to your particular country or see information in your country's language.
Nevertheless, such redirection can be an annoying tactic, for some who prefer for whatever reason to visit Google.com. Fortunately, there's an easy solution. Assuming you accept cookies, all you need to do is click on the "Google.com" link at the bottom of any non-US Google edition. Doing that will send you to Google.com, and the cookie will remember that you don't want to be redirected, in the future.
It's also worth noting that Google is not the pioneer in doing this type of redirection. Search engines long before Google did redirection in this manner, with Lycos standing out as the most noticeable, in my mind.
And The Privacy Implication?
It's all very interesting that Google does geographic redirection, but what's the privacy implication in this? What makes Google Big Brother-like for doing so, especially since it's easy to switch to your preference?
"Unfortunately, the same geolocation referencing makes the IP number coming in from a user much more of a threat to privacy. The data mining that US officials are talking about would consider such geolocation data, based on IP number, to be very essential," Brandt says.
In other words, Brandt is concerned that the US government might somehow make use of the fact that Google can identify users outside the United States in order to mine the company's data. However, georedirection has nothing to do with this. The US government could see just from a person's internet address alone where someone is located geographically.
In short, raising georedirection simply clouds the issue about what Google records and isn't worth being concerned with, in my view, from a privacy perspective.
Verdict: Google Not Excessive In Logging Information
As seen, Google simply records the same information that any typical web server can record. Saying Google records "everything" would be more fairly written as, "Google records the same things everyone else records." On a practical level, I don't think it's worth it for the vast majority of people to worry about this.
Not convinced? Then to keep Google or anyone else from recording the limited and impersonal information that a web server routinely gathers, consider an anonymizing tool or proxy. They'll keep Google and others from seeing your information and are also a useful way for those who want to see the US-targeted ads on Google, even when they are outside the country. Of course, the tool itself will know what you visit, so you'll have to trust that things are in place to destroy this data on a regular basis.
Finally, it's worth noting that even Brandt, who says Google records "everything," couldn't say what minimum amount of information he would deem acceptable.
"I don't know enough about their operation to know what they consider important to record. They don't need a unique ID in the cookie. Beyond that, a limit on data retention is the most important thing," Brandt said.
Let's recap what we've covered so far. Brandt really doesn't mind that Google uses a long-lasting cookie nor that it records standard log data. Instead, his main concern, as we've explored in the prior two points, comes back to his objection that Google uses a unique ID to know if it has seen a particular browser before.
With that, we move to Brandt's third point, that he wants Google to regularly purge data.
"Google has no data retention policies. There is evidence that they are able to easily access all the user information they collect and save," writes Brandt, in his Big Brother nomination of Google.
Again, it's another scary sounding statement, suggesting that there's some data retention policy that Google should be following. So what's the standard? Privacy expert Aftab says there isn't one:
"Nobody has data retention policies. That is where everyone falls off the map, even the good guys," Aftab said.
Nor is Aftab overly concerned about this, in Google's case. This is because for the average web surfer, the company doesn't have any personally identifiable information.
Google: Safeguard, Not Destroy
As for Google, Brin says it does have a data retention policy, this being that Google doesn't destroy its data but instead safeguards who can access it.
"We do have an existing policy and are in the process of creating a new one, which will be quite strict. There are more important issues. Where do you keep it? Who has access to it? There are many safeguards we have in place already and many more we are adding," he said.
What sort of safeguards? First, it's important to understand that Google actually is missing some data, largely a few gaps in logging from its early days. As for the data it does have, it remains in raw format and isn't easily accessible, for the moment.
"We have in fact much more limited access than we'd like in that we can't run some of these aggregates easily," Brin said. "In the future, we are setting it up where we can answer very easily some questions without having to hit raw logs but rather summary data or an in-between computer."
Providing only limited access to summary data, rather than to raw log data, is one type of safeguard. It makes it harder for a determined person to track an actual individual user, though even then, it's important to remember that the individual is still not personally known to Google.
Brandt: What You Destroy, The Government Can't Get
As for Brandt, safeguarding isn't enough. He wants Google to keep data no longer than 30 to 60 days, in order to prevent it from being mined by the US government.
"Professional librarians in the US are now required to hand over any borrowing records to the FBI, and it is a crime for such a librarian to even mention that the FBI requested such records. The American Library Association and similar organizations are recommending that all records be destroyed as soon as a book is returned. If you don't have the data, it isn't a crime to not provide the data. But if you do have the data, and you tell the FBI that you don't have it, then you have committed two felonies -- lying to the FBI and obstruction of evidence. The obvious solution is to keep only the data that is absolutely essential to your operation," Brandt said.
Fear Of US Government
Indeed, in talking with Brandt, it turns out that this is the key issue he has with Google. He really doesn't seem worried that Google itself will somehow abuse the limited data that it collects. Instead, he's more concerned that the US government has powers that will let it mine the data -- and importantly, have the ability to get internet addresses that are recorded along with search queries linked to actual ISP user accounts (as explained more in my Search Privacy At Google & Other Search Engines article). That would then let the US government more easily mine Google's logs to tie search requests with an individual.
Sound far fetched? Rather than dismiss Brandt as paranoid, it's important to note he's simply reciting what the widely respected Electronic Frontier Foundation has to say in its analysis of the USA Patriot Act:
"Be careful what you put in that Google search. The government may now spy on web surfing of innocent Americans, including terms entered into search engines, by merely telling a judge anywhere in the US that the spying could lead to information that is 'relevant' to an ongoing criminal investigation. The person spied on does not have to be the target of the investigation. This application must be granted and the government is not obligated to report to the court or tell the person spied upon what it has done."
Should YOU Fear?
Pretty scary stuff. But realistically, privacy expert Aftab says most people shouldn't be worried.
"Unless someone is really, really under surveillance as an Iraqi spy or something, they just aren't going to watch what someone is doing on Google," Aftab said.
Aftab added that when she's helped agencies track criminals involved in cyberspace, search data isn't something that's targeted.
"We don't spend a lot of time there," she said. "When law enforcement relies on anything on the internet, they will stake out the house. They will not just show that it was downloaded on the computer but that it is clear that no one else but the suspect was doing the downloading," she said.
EFF Senior Staff Attorney Lee Tien is not so quick as Aftab to dismiss possible government monitoring of search traffic.
"I think [Brandt] is correct that Google would be a popular target. That sort of just makes sense. The world's most popular search engine, or AOL, Yahoo or anyone who handles a lot of traffic would be of interest," Tien said.
However, Tien doesn't see the unique ID as a problem, nor is the EFF's write up about the Patriot Act a reference to data that's logged. Instead, the EFF's concern is that the Patriot Act makes it easier for US government authorities to perform realtime "pen register" or "trap and trace" surveillance of internet activities.
This type of surveillance was originally meant to capture telephone numbers, rather than the actual "content" of a telephone call -- what was talked about, as with a wire tap. However, the ability to capture the URLs passed between a browser and a web site to some extent inadvertently captures a conversation that's happening. In particular, with a search engine, it's common that what you searched for is sent back embedded with the URL.
For example, look for "osama bin laden" on Google, and the page that loads with your results will have a URL that contains your search terms in it.
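As a rough illustration of how search terms end up inside a URL, the Python sketch below builds a query string and then reads the terms back out of it. The host name is hypothetical and `q` is assumed as the name of the query parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical host and path; "q" is assumed as the query parameter name.
base = "http://search.example.com/search"
url = base + "?" + urlencode({"q": "osama bin laden"})
print(url)  # http://search.example.com/search?q=osama+bin+laden

# Anyone who captures the URL can read the terms straight back out of it.
terms = parse_qs(urlsplit(url).query)["q"][0]
print(terms)  # osama bin laden
```

This is why a record of URLs alone, with no page content, can still reveal what someone searched for.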
The Center For Democracy & Technology has a good write-up that explains in more depth how the pen register statute may capture meaningful content. The EFF is worried that the statute has been made even weaker by the Patriot Act.
Tien believes that government agencies would love to have a system that in realtime monitors everything, such as all Google search requests, as well as search requests on other major search engines and web surfing traffic in general. Realistically, however, he thinks there'd be real problems in mining all that data.
"We know they are interested in precisely this large scale dragnet approach, but I agree it's not likely to be effective. It's likely to have large noise issues," Tien said. "The big thing in their way is the difficulty of sifting it out."
Because of this, he thinks that if a government agency were hoping to mine data, they'd want historical information.
"A subpoena for historical records is the preferred tool," Tien said.
Given this, like Brandt, Tien recommends regular destruction of data.
"If you really believe in your customers' privacy, then don't keep anything," Tien said.
Interestingly, however, Tien is most concerned with the internet address that's logged, not the unique ID. If the internet address portion is deleted, then historic information can be saved without giving a government agency the ability to link search requests to an actual individual through the use of ISP records. The unique ID, because it is anonymous, isn't a concern.
So do authorities subpoena Google for information?
"We have gotten that question before, and I've taken the policy of not to respond publicly about government requests," Brin said.
While Brin will neither confirm nor deny if Google gets government subpoenas, he does say staying quiet is designed not to encourage them.
"Don't jump to the conclusion that we do [get them] or don't," he said. "I don't necessarily want to have press saying, 'Well, Google never gets subpoenaed,' because that might get government agencies saying, 'Why not do that?'"
Brin also further downplays the idea that Google is somehow a potential magnet for government data mining. The information it has really isn't that useful, he says.
"I've done queries about nuclear weapons, I've done a lot of adult queries, because we're supposed to see how that goes," Brin said. "There are a lot of reasons why people do searches. There's not that much you can necessarily infer from them. Furthermore, the information is not tied to a person. Instead, if some government agency were to go to an ISP, they could get the same information from their logs. That information would be more valuable than getting it from Google."
Brin also stresses that ISP logs provide a much easier path for authorities to get information than following a complicated reconstruction effort with Google.
"Here's the issue when you start to worry about these things. You have to assume a long string of improbable things. First, that the US acts like Big Brother, which I suppose is more probable than it was before. Second, it decides to do widespread electronic monitoring. After that, they'd have to decide instead of going after the very obvious and identifiable data such as ISPs, web access logs and emails, that they'd somehow go to these ambiguous search queries. Then you'd have to assume that Google would turn these over and that then somehow that there would be some conclusions drawn from this data. That's a chain of reasoning of unlikely things to happen," Brin said.
Having said this, it is Google's policy that it will act to fulfill legal requests, and we've seen the company already do this over DMCA actions, filtering sites as demanded by the French and German governments and dropping a sabotage web site as a result of legal action by the German train operator, Deutsche Bahn. Moreover, if some US authority hasn't subpoenaed Google yet, it's inevitable that it will happen at some point.
Overall, Google's in a balancing act. It does have to comply with laws, yet it also has to be concerned with the privacy of its users. Ultimately, if Google felt the privacy of users was really under threat, without justification, then it might consider destroying even the limited data it keeps.
"If there were a situation where there was a lot of widespread government data gathering, I can't imagine that search would be in the top 20 sources, or that Google would continue to retain that data under such circumstances," Brin said.
Verdict 1: Perhaps Delete Internet Addresses
Should Google regularly destroy the limited data it keeps, to prevent it from falling into government hands?
Even if Google kept no records at all, your ISP might still have records of your activity. Getting Google to destroy the limited data it maintains offers no guarantee of protection. In fact, going to an ISP for data might be even easier than going to Google.
Nevertheless, removing internet addresses from logs might be a way for Google to still provide some greater privacy for its users, in the case it gets subpoenaed, while perhaps largely maintaining the usefulness of the data for its own research and historical purposes.
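One way such a scrubbing step might look is sketched below in Python. It assumes log lines in the common log format, where the client's internet address is the first field; Google's actual log layout is not public, and the function name is invented for the example.

```python
import re

# Assumes common-log-format-style lines, where the client address
# is the first whitespace-delimited field.
FIRST_FIELD = re.compile(r"^\S+")

def strip_address(log_line, placeholder="-"):
    """Replace the leading client address, leaving every other field intact."""
    return FIRST_FIELD.sub(placeholder, log_line, count=1)

# Made-up sample entry; 192.0.2.10 is a reserved documentation address.
sample = ('192.0.2.10 - - [10/Apr/2003:13:55:36 -0700] '
          '"GET /search?q=privacy HTTP/1.0" 200 2326')
print(strip_address(sample))
```

The time and the search terms survive, so the entry stays useful for aggregate research, but there is no longer an address that could be matched against an ISP's records.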
Interestingly, it has to be noted that such a move by Google might please some users and privacy advocates, but at the same time, it might also displease those who in the current climate may see the destruction of data as impeding law enforcement efforts.
"Where is the balance?" Tien says. He leans toward the greater privacy side but admits it's a choice each company has to make.
Verdict 2: Perhaps Move To POST System For Query Processing
Even if Google does destroy historic data or remove internet addresses, the potential for realtime monitoring of search queries remains. However, it (and other search engines) could consider moving to a "POST" system of accepting search queries, as recommended in this past survey of search engine privacy. Such a move should prevent keywords from being embedded in the URLs that are sent back to users.
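The difference between the two approaches can be sketched with Python's standard library. The endpoint is hypothetical and nothing is sent over the network; the requests are only constructed so their URLs can be compared.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; no request is actually sent in this sketch.
endpoint = "http://search.example.com/search"
query = urlencode({"q": "search engine privacy"})

# GET: the search terms ride in the URL itself.
get_request = Request(endpoint + "?" + query)

# POST: the search terms ride in the request body instead.
post_request = Request(endpoint, data=query.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
```

With POST the URL stays the same for every search. The terms still travel in the body of the request, so this keeps them out of URLs and referrer data rather than hiding them from someone who can read the whole connection.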
From a web marketing perspective, it's a change I'd hate to see Google make. Search term data is an incredibly valuable way for site owners to learn how visitors come to their web sites. Of course, Google and other search engines could fill this absence through a service where web site owners could monitor the traffic sent to them. The downside is such services would no doubt require a fee.
"Inquiries to Google about their privacy policies are ignored," Brandt writes in his fourth nomination point. Proof of this? He says that a letter he wrote Google last year has never been answered.
So what else does Brandt want spelled out? He didn't have an exact answer to this but rather came back to the central issue of wanting data purged on a regular basis:
"The only thing I've seen is that they say they use it to improve their engine. You don't need more than 30 days of data to do this -- not at a rate of 150 million searches per day," he said.
Verdict: Google Has Provided Reasons
Brandt points out in his nomination that one of Google's engineers used to work for the US National Security Agency, a sign to him that the company is somehow tied into the US federal government.
Does Brandt believe anyone with a US security clearance, current or former, is a spy?
"I believe that anyone who is sensitive to the dangers of surveillance by powerful government agencies would not hire someone with a security clearance," he said.
OK, so should search engine companies never hire people who have had security clearances?
"Yes. This is the least that world citizens should expect from a search engine with Google's global reach. If you were asking this question during the Cold War, and the security clearance that the job applicant had was from the KGB instead of from the U.S. government, and the person to be hired would have access to all that user information at Google, would you hire such a person? Of course not. It's common sense, really," Brandt says.
I found it ironic that Brandt's site, which champions privacy, named the actual engineer who formerly worked for the NSA. Did Brandt see any privacy issues in doing that? No.
"Do you know of others at Google with security clearances? If so, send me their names and I'll be sure to mention them as well," Brandt said, noting that the engineer's resume had been on the web for years. "Agents of powerful, secret organizations have no right to privacy, in my opinion. I've been in favor of naming CIA officers for 30 years now. The NSA is no different," he said.
In his nomination of Google, Brandt also says, "Google wants to hire more people with security clearances, so that they can peddle their corporate assets to the spooks in Washington."
This has come from the fact that Google has one -- exactly one -- job opening that I've seen where a security clearance was required. Nevertheless, why is this a requirement? Google didn't provide an answer, when asked.
Does Brandt see any reason why Google might need to have people with current clearances for products?
"Yes. I believe that they want to secretly peddle their information, via a back-door, real-time feed, to the John Poindexter types in Washington. To do this they need sales people with security clearances."
Verdict: Security Clearance Does Not Equal Spying
I don't subscribe to the idea that having people with current or former security clearances working at Google makes the company a privacy threat. Indeed, Brandt himself worries that the US government already has all the power it needs to access data without doing an "inside" job. Nor does the EFF's Tien think having someone with a security clearance is indicative of anything.
"It's not fair to infer the worse about them," he said.
In his nomination, Brandt writes:
"With the advanced features enabled, Google's free toolbar for Explorer phones home with every page you surf."
From this, he dubs the Google Toolbar spyware. In my view, this is completely unfair. You absolutely cannot install the software with its advanced features without getting a very in-your-face notice that to work properly, it needs to send data back to Google. The notice begins:
PLEASE READ THIS CAREFULLY
IT'S NOT THE USUAL YADA YADA
By using the Advanced Features version of the Google Toolbar, you may be sending information about the sites you visit to Google.
The coloring and italics are exactly as used in the actual notice. Google is not hiding in fine print the fact that it will see the sites you visit. It's in the very first sentence that you will see in the screen where you confirm if you want to install the toolbar. Furthermore, Google offers an option not to use the advanced features, if you are concerned about possible privacy issues.
Users do have some responsibility when they install software. In this case, Google is being extremely proactive in helping users make a considered choice.
For his part, Brandt does acknowledge the notice but suggests it's something Google does only because it must:
It's odd logic to try and fault Google for doing something voluntarily that another company was sued over. In addition, the Alexa case involved the mixing of cookie data and personally identifiable information, something Google doesn't do. Moreover, the Google Toolbar and its notice existed before the Alexa case was settled.
Brandt dismisses all of this:
"Yes, but the suit was many months in the courts before the toolbar was launched. Google anticipated that there would be a problem unless they took precautions," he said.
"No, I honestly cannot. If that were the case, they wouldn't have started out with a cookie that expired in 2038. No, this warning is just a cover-your-ass legal thing. Also, perhaps, a public relations ploy," he says.
Fixing The So-Called Spyware
So what could be done to the toolbar to make it not "spyware" in Brandt's book?
"Dump the phoning home for the PageRank, or for any of the other advanced features. No phoning home, period, unless search terms need to be transmitted," he says.
In other words, get rid of the PageRank meter, which is the only advanced feature that needs to make a call back to Google. Honestly, it might be a smart public relations move by Google, given how some webmasters obsess over the estimates that meter gives and how others have even tried to sell links based on PageRank value.
However, other people might indeed like the advice that Google gives about the potential value of a page. Brandt acknowledges this with a proposed solution:
"How about this -- change the PR meter so that it doesn't phone home or display until you click it. That makes so much sense, that Google will never do it. And the reason they won't is because the phone-home is a spying function that they want to keep," he said.
It's a good idea for those who install without the advanced features/PageRank meter option. Even so, others might like the automatic updating (I certainly do). If they agree to accept advanced features, with the warning that information will be sent back to Google, they should have the right to that decision.
Automatic Updates Worrying
A much more valid concern that Brandt raises is that the toolbar will automatically update itself without asking. That is unusual. Other software on my system will often notify me that updates are available and ask whether I want them. I've got no dispute with Brandt wanting to see the same with the Google Toolbar.
Brandt also says he's personally seen the toolbar used to discover new sites for crawling. I've seen at least one similar report in a post at WebmasterWorld.com in the past. Google has disputed this in the past and reassures again that the toolbar is not used to find new pages to crawl.
Verdict: Not Spyware, But Do Prompt When Upgrading
Google's done a great job of highlighting the fact that the "advanced features" version of its toolbar sends information back to Google, and the toolbar doesn't deserve to be called spyware for doing so. However, it would be good if Google also did more to highlight the fact that its toolbar will automatically update with new versions, and if it prompted users to accept these changes when they happen.
When Google visits a web page, it makes a copy of exactly what it saw and makes that copy available to searchers through its cached links. Brandt named this practice as part of his Big Brother nomination. But rather than treating it as a privacy issue, he really sees it more as unfair competition and itemized why. In his own words, the points are:
- The cache copy makes it possible to highlight the search terms, whether or not you have the toolbar installed.
- The download time for the cache copy from Google's servers is always faster than from the original website.
- You never get a 404 "not found" or a DNS lookup failure for the cache copy.
- The link to the page recommended by Google for bookmarking at the top of the cache copy is a link to Google's copy, not to the original page.
- How about all that Google branding on the top of the cache copy? Priceless.
"How can any webmaster compete with Google's cache copy? How can any search engine compete with it? Why don't we all just let Google take over the Internet, and have all webmasters put up their pages directly on Google's servers," he said.
Of course, if you hold to this argument, Google's acquisition of Blogger and its use of contextual links really are stronger examples of where it has potentially made much of the web part of its own "content." That weakens some concerns about the cache, but it doesn't negate the fact that there may be some real fears as Google matures, as covered more in my article last month and a similar one from News.com.
To these concerns, Google reiterated what it said to me in my earlier article:
"Our relationship with content sites does not in any way influence Google's search results. As you know, Google's search results are completely separate from content targeting and our advertising programs in general."
Verdict: Likely Google Will Continue To Do The Fair Thing
Potential concerns about Google's cached copies being unfair competition pale in my view when compared to concerns about Google's relationship with Blogger.com content and any site that runs its contextual links. However, the company has an excellent track record of trying to be fair. I'd expect this will continue, but we'll all be watching.
Brandt's explanation of this point is that since Google controls so much of the web's search stream, webmasters have to live in fear of being cut off from the Google referral gravy train.
Brandt notes that Google has a "75 percent monopoly for all external referrals to most websites," but he's misstating a figure reported in a Feb. 26, 2003 article in the Wall Street Journal, in a way that gives Google too much importance.
That article, "Google Becomes Web's Gatekeeper," said that 75 percent of all WEB SEARCHES that sent traffic to other web sites came from Google or sites powered by Google, according to private stats given to the publication by StatMarket. This is much different than Google controlling 75 percent of ALL web traffic, not just searches.
Indeed, web sites get traffic from many other ways than search engines. In fact, a recent StatMarket survey found that search engines generate only 13 percent of a site's overall traffic. So, Google influences 75 percent of that 13 percent, or about 10 percent of a site's traffic. So much for that giant monopoly of all traffic, on average.
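The arithmetic behind that estimate can be checked with a short sketch. The function and variable names here are purely illustrative, and the two inputs are simply the StatMarket figures cited above, treated as averages across sites rather than as numbers for any particular site.

```python
def google_share_of_traffic(search_share: float, google_share_of_search: float) -> float:
    """Fraction of a site's total traffic that arrives via Google-powered search."""
    return search_share * google_share_of_search


# Figures cited above: search engines generate 13 percent of a site's traffic,
# and Google or Google-powered sites account for 75 percent of that.
share = google_share_of_traffic(0.13, 0.75)
print(f"{share:.2%}")  # 9.75%
```

The product is 9.75 percent, which is where the "about 10 percent" figure comes from.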
Of course, it is important to remember that search engines are an important way that people first discover a web site, so they have an influence that extends beyond the 13 percent of traffic they directly generate. My article Avoiding The Search Gap explains this more.
Brandt complains that Google has no published standards, nor an appeals process for penalized sites. "Google is completely unaccountable," he says.
In reality, Google does publish some standards, but as with every other search engine, it faces the age-old difficulty of not being able to reveal too much in order to avoid weakening its defenses against the very real threat of search engine spam.
Similarly, the service has long said it doesn't want to establish a formal or automated process to let people check on banned web sites (explored more in this past article for Search Engine Watch members) since it fears that might inadvertently help spammers. Nevertheless, it would be nice to see it at least experiment and test such a system, since many webmasters worry unnecessarily that they've been dropped for spamming reasons. The vast majority have not.
Brandt also makes a factual-sounding statement, "Most of the time they don't even answer email from webmasters." His proof?
"Most responses to webmasters are automated. It looks like they slip in the name at Dear [insert name here], and it looks like they have some low-level employee select from a variety of boiler-plate responses, but I don't consider this to be an answer. It's a public relations gambit. Google can afford to hire the staff necessary to not only answer email, but also to set up an appeals process for penalized sites," Brandt said.
It's not fair to completely dismiss automated responses. I have email templates that I use myself to answer standard questions that come from readers. They can be very effective, assuming the human sending them has carefully read the initial question.
But as for hiring staff, sure. It would be good for Google to be more responsive, based on anecdotal evidence. And there is some good news, in that the company says it has been doing this.
"We've beefed up resources recently on being able to handle user support," said Matt Cutts, who deals with webmaster issues for Google.
Cutts reiterates that most people who think they've done something to be banned by Google haven't. However, for those who may have made fairly innocent mistakes, Google wants to help there, too.
"We're trying to make it easier for webmasters who think they might be penalized to get an answer and to make sure they get a second chance, if they have done something wrong," Cutts said. "We're making sure that if they write in to email@example.com and ask, we'll take a good look. Google wants to be responsive."
Ultimately, it remains true that Google is a private enterprise and will run itself in the way it sees fit. Brandt in the past has said he thinks it is so important that it may need to be regulated like a public utility. Similar things were said about Yahoo in the past, and the web survived without its regulation (for more, see toward the end of this past article). That will likely be the case with Google.
However, it is ironic that Brandt has such fear of the US federal government yet simultaneously wants that same government to regulate Google. How does he reconcile these two views?
"I wouldn't want the Pentagon regulating it, nor the FBI, nor the CIA," Brandt said. "There are government agencies that regulate public resources when it is perceived that such regulation contributes to the health, safety, and proper functioning of society. Not all government agencies are in the spook business. Many are motivated by a commitment to the general welfare," he explained.
Verdict: Google's Pretty Friendly
Webmasters may have issues with Google, but so too do they have them with all search engines. And when it comes to Google, the service has been named the Most Webmaster Friendly Search Engine by Search Engine Watch readers three years in a row, always by a wide margin. They certainly see it as friendly.
This accusation really restates what's been contained in the ones above, that Google has so much search data coming through it that Brandt fears ultimately it will be a target for US government data mining. How to defuse it?
"The most important step would be to announce a 30 to 60 day data retention policy for all user information," he said.
He also says most searches come from outside the US, based on a radio interview he heard with Google's cofounders around June 2001. And Google does reconfirm to me that more than half its searches continue to come from outside the US. But why is that relevant to privacy?
"It's relevant because the bar for legal surveillance by US authorities is set lower for non-US citizens than it is for US citizens. This is due to historical reasons. Prior to 9/11, the NSA was prohibited from any surveillance of US citizens. With the Patriot Act, this issue has become clouded, as there is much sharing between various US agencies now. The data mining that the feds want to do makes Google particularly attractive to them, given that there is this huge non-US component to Google's operations, and such surveillance is completely unrestricted by law," he said.
The China Issue
Not mentioned in the nomination but a point Brandt raised in discussion about Google being secretive was the suggestion that the company made a behind-the-scenes deal to restore access when it was banned in China last year.
"Google cut some sort of secret deal with China that got the blocking lifted. AltaVista wouldn't play ball, and from what I heard, they may still be blocked. What was the deal that Google cut with China? They won't say; it's none of our business. Don't you see a pattern here?," Brandt said.
Rather than seeing a pattern, my assumption had been that China relented to some degree on Google but not AltaVista because no one was complaining about AltaVista.
Indeed, I wrote at the time that both services were apparently blocked at the same moment but that the outcry was all about Google, since relatively few searched with AltaVista. It took AltaVista a week of watching Google get all the attention, and feeling left out, before the company issued a press release saying that it also had been blocked.
Today, people in China can search on Google again. However, if they search for something China dislikes, the Chinese government will ban that particular query or mess up what's delivered. It can do this because it can monitor and control all the internet activity within the country.
Google has reconfirmed the situation I've described above. The company says it never established contact with the Chinese government nor made any technical changes on its end.
Verdict: Government Time Bomb? Maybe. Corporate Time Bomb? Not.
Sure, Google presents a target. There are a few things that Google could do to minimize government use of its data, as outlined in the verdict to accusation 3. However, it bears repeating that if Google is a privacy time bomb, because of possible government access, it is far from the only major bomb on the web. As for a time bomb because Google itself might abuse data, this isn't a worry. It has no personally identifiable data recorded that would let it identify actual users.