Wired's "How to Foil Search Engine Snoops" is a nice guide to protecting your search privacy, but it doesn't really go far enough. In particular, anyone who assumes they've protected themselves by using an anonymizing tool has probably not dealt with the important ISP aspect. Meanwhile, laws being considered to force search companies to destroy data must consider the role of ISPs to fully provide the intended protection.
In this piece, I'll take you step by step through how your search privacy data gets exposed, all the way from your desktop to the sites you visit. Let me make some caveats before I begin.
Normally with stuff like this, I like to do a "Big Story With Answers To All The Questions" type of piece. That's what I tried to do back in 2003, the last time search privacy really came up as an issue. Much of what I wrote then is still applicable to the issues today, and I'll be drawing on those pieces. You may wish to read them as well:
I definitely don't have all the answers to all the privacy questions in this piece, especially as privacy issues have gotten more complex. But I wanted to make a start, perhaps the beginning of a living document or future article that will provide all the answers. I'd especially invite those with additional tips, observations and so on to contribute to a Search Engine Watch Forum discussion on these topics -- the link will be at the end of the article.
Onward to the search privacy flowchart. It's not an illustrated one in the traditional sense, but it should give you an idea of all the traces you leave behind when searching for something.
In November, we wrote of a man convicted of killing his wife in part because authorities found he'd searched for "neck," "snap," "break" and "hold" on Google. But that information was not handed over by Google itself. Instead, it was found in traces left behind on the man's own computer.
Anything you do on the internet gets recorded on your own computer in various ways. Pages you've visited are stored in your computer's cache, and a history of the URLs you've seen and things you've searched for may also get stored in your browser.
Clearing Your Search History From Google And Other Search Engines from me in 2003 covers some of the ways to delete what you've looked for in Internet Explorer 5, much of which is applicable to Internet Explorer 6.
How do I delete the drop-down list of my past searches? over at Google looks to be a very comprehensive guide on clearing out any search history that appears in the search box on the Google home page.
That information is NOT saved at Google. Instead, it's recorded within your own browser. The Google page gives instructions for cleaning out IE, Firefox, Safari and other browsers. Also, these same instructions should work to clear out your search history at all search engines in one go, not just at Google.
Unfortunately, there are so many search toolbars out there that they might keep their own histories independently of your browser. Google's does, and the page above from Google has instructions on clearing that out. MSN has instructions on clearing its toolbar history here. Instructions for Yahoo are here. For other tools, a first stop is to check the help pages for them.
Now that you've cleared out saved searches, you've still got URL histories and saved pages you might need to clear. How to clear your browser's cache and cover your tracks on the Web looks to be a pretty good article to guide you on how to delete this type of material. It also points to a number of software tools to make life easier. There are also more tools here, here and here from Download.com.
Software may be the way people need to go, as search gets more and more embedded into everything. Running any desktop search tools? They may be storing information you want to delete. For example, Google's desktop search tool also stores all the pages you view on the web. When I last looked, deleting your browser cache did not destroy the data Google Desktop itself keeps.
Managed to wipe everything out either manually or with software? Now go wipe out your hard drive. That's because even if you delete files, people with the right tools and knowledge might still be able to bring back the data. Some of the tools mentioned above may be able to make this easier so that something you've deleted really stays deleted. But the most surefire way to do so would be to physically destroy your computer's hard drive, literally prying out the metal platter where the info is recorded and ideally breaking it up into multiple parts that would be disposed of in various places.
Back to reality, most people aren't going to do that. But I'm trying to underscore how difficult it is to absolutely protect your privacy from prying eyes right on your own computer.
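To give a rough idea of what file-wiping tools do differently from a plain delete, here's a minimal Python sketch that overwrites a file's contents with random bytes before removing it. The function name and the single-pass default are my own assumptions for illustration, not how any particular product works. It's also no guarantee: on SSDs, journaling file systems or anything with backups and snapshots, old copies of the data can survive an overwrite.

```python
import os


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file with random bytes, then delete it.

    Hypothetical helper for illustration only. It does not touch
    backups, file system journals or copies held by other programs.
    """
    size = os.path.getsize(path)
    chunk_size = 1024 * 1024  # write in 1 MB pieces to limit memory use

    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            # Ask the operating system to push the new bytes to disk.
            f.flush()
            os.fsync(f.fileno())

    # Only now remove the directory entry, as a normal delete would.
    os.remove(path)
```

Calling `overwrite_and_delete("old-history.dat")` on a file you no longer need (the file name is made up) would replace its contents once and then remove it.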
For those worried that tips like cleaning search history from your desktop are helping potential wrongdoers, keep in mind that there are plenty of innocent reasons for wanting to clear search information. For example, a neighbor's older son had looked up porn on their computer. My neighbor could not figure out how to get rid of the pornographic search terms that kept appearing in the search drop down box that his younger daughter was seeing.
The weakest link in protecting your search privacy is your ISP. Everything you do is going to flow out of your computer and through your ISP to a search engine. Your ISP will see the pages you are requesting and in all likelihood have some type of records of what you've done for a set period of time. Whatever deletions you do on your own computer -- plus whatever things you do to be anonymous with search engines -- these have no impact on your ISP. It sees all.
EarthLink has security measures in place to protect the loss, misuse, and alteration of the information under our control. While we make every effort to ensure the integrity and security of our network and systems, we cannot guarantee that our security measures will prevent third-party "hackers" from illegally obtaining this information. We will never sell your information to a third party.
How long are records of what you've visited kept? Do these records exist at all? How might they be shared with others? Answers aren't provided.
Back in June, I wrote of a Reuters article (no longer at Reuters, but there's a copy here) that cited one analyst saying that most ISPs don't keep data for longer than a month. In Europe, governments themselves apparently mandate a one- to three-year retention of data, according to a News.com article from last year. Ironically, while the current US government request for search data has at least one lawmaker considering whether search engines should destroy data, that News.com article says the US government seeks to force ISPs to keep data longer.
By the way, even if your ISP deletes data, you'd better make sure they are forcing companies that mine their data to do the same. Better Search Privacy Needs Addressing Overall from me covers how third party companies such as Hitwise take in ISP data as a way to track what people are doing on the internet.
Visit a major search engine, and it keeps track of every request you make. It will also assign you a cookie, unless you reject these. That's easy enough to do, and the Wired article gives you some tips on that.
Rejecting cookies still leaves behind your internet address. My Search Privacy At Google & Other Search Engines article and the other one I've just posted, Private Searches Versus Personally Identifiable Searches, explain this a bit more. Basically, your internet address links your request back to your ISP and thus still back to you, if someone has access to your ISP.
The Wired article suggests using an anonymizing tool to avoid this. Anonymizer is a long-standing one. However, most anonymizing tools only prevent sites you visit from seeing your real internet address. They don't prevent your ISP from seeing where you are going.
I learned of the Tor anonymizing service through the Wired article. It's not clear to me whether that prevents the ISP tracing, as well. According to Dave Naylor, a search marketer who also runs his own ISP, your activity would be hidden from your ISP only if Tor keeps all information you send encrypted between your computer and the Tor servers you tap into.
Ethan Zuckerman (author of A technical guide to anonymous blogging - a very early draft) has a nice post about using Tor over here, but it doesn't seem to address the ISP question.
Let's flip things around and say you are NOT worried about visiting your favorite search engine and staying anonymous. In fact, you've decided to embrace the search history features they offer, which frankly can be really useful. Google's, for example, I find does a good job of improving my results based on pages I've visited.
All the major search engines embed the search terms you used into the URL that appears in the address field of your browser. When you click on a listing, that URL is sent as "referrer" information to the web site you go to. That means what you searched on is sent to the web site you ultimately visit from a search engine. They're able to know the search terms you used plus your IP address.
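To make that concrete, here's a minimal Python sketch of how a web site could pull search terms out of the referrer it receives. The example URL is made up, and I'm assuming the query is carried in a `q` parameter. The parameter name varies from one search engine to another, so treat it as an assumption rather than a rule.

```python
from urllib.parse import parse_qs, urlparse


def search_terms_from_referrer(referrer: str, param: str = "q") -> list[str]:
    """Return the search terms found in a referrer URL's query string.

    `param` is the name of the query parameter holding the search;
    "q" is an assumption and differs between search engines.
    """
    query_string = urlparse(referrer).query
    values = parse_qs(query_string).get(param, [])
    # parse_qs decodes "+" and %-escapes, so the terms come back readable.
    return values[0].split() if values else []


# Hypothetical referrer, as a site might see it after a click on a listing.
referrer = "http://www.google.com/search?q=search+privacy+tips"
print(search_terms_from_referrer(referrer))  # ['search', 'privacy', 'tips']
```

Pair that with the visitor's IP address, which every web server sees anyway, and the site has both pieces described above.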
Referrer information is precious data to web sites. It allows them to know exactly how people found them. As a search marketer, I'd hate to see this information go away. But it is a privacy issue to be aware of.
Many web sites make use of third party analytic services, such as ClickTracks, WebSideStory, WebTrends or Google Analytics. That means these services are almost like clearinghouses of search data. They see what many people are searching for -- and clicking on -- from all over the web through the data from thousands of clients using them. Potentially, they are just as rich a target for any government agency to mine as the search engines themselves.
To protect yourself, you want to ensure your browser doesn't pass along referral information. In Internet Explorer, I see no native way to do this. You'll have to turn to products like Norton Security or the tool I use and much prefer, ZoneAlarm. There are certainly other third party tools out there. For Firefox, there's at least one extension you can try.
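In broad terms, tools that block referrers remove one header from each request your browser sends. HTTP spells that header `Referer` (one "r" short, a long-standing misspelling in the standard). Here's a minimal Python sketch of the idea. The function name and the sample headers are hypothetical, and this isn't a description of how Norton, ZoneAlarm or any Firefox extension actually implements it.

```python
def strip_referrer(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the request headers without the Referer header.

    Header names are compared case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "referer"
    }


# Hypothetical headers for a request made after clicking a search listing.
outgoing = {
    "Host": "www.example.com",
    "User-Agent": "ExampleBrowser/1.0",
    "Referer": "http://www.google.com/search?q=search+privacy+tips",
}
print(strip_referrer(outgoing))
# {'Host': 'www.example.com', 'User-Agent': 'ExampleBrowser/1.0'}
```

The site you land on still sees your IP address, but it no longer learns which search brought you there.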
As you can see, ensuring your search privacy is tricky. The information you send is leaving traces in multiple places. The solution to ensuring privacy isn't going to be as easy as passing a law that targets Google, Yahoo and the others. Ideally, the entire lifecycle of a search beyond the computer desktop needs to be considered from ISP through to tracking services. Searchers themselves also need to consider what they do on their own computer desktops.
There's also an issue of what should be private. I wrote earlier today that most people probably think of the conversations they have with search engines as being private. But to date, we don't have any protected searcher-search engine relationship as we do with attorney-client privilege or between clergy and worshipper. Perhaps that needs to be enshrined in some way. But then again, others may feel that going out on to the public web and using publicly accessible search engines entitles no one to an expectation of privacy, or perhaps a more limited one.
Certainly, we need to have a good debate and discussion. That's probably the good that's coming out of the Department Of Justice action. After years of worrying about privacy issues, the DOJ action is turning that worry into action about better protections that may need to be put into place.
Let me add that while I hate the sloppy manner in which the DOJ has acted in this particular case, I have no more interest in criminals using the internet for bad purposes than most people would. In specific circumstances, with the right legal oversight, I hope search or internet browsing data might be evidence that helps catch a criminal, just as I hope they'd be caught through legally approved wiretapping or other types of law enforcement monitoring.
What I don't want is a Big Brother state to be mining everything with the assumption we're all criminals, any more than I want all telephone calls to be monitored. Moreover, it's very, very easy to mistakenly assume from a search request that something wrong is happening, when it is not. Jon Swift takes a light-hearted look at this in his post today, but it's true. A search for "bombing the white house" doesn't mean someone's planning to do that. It may simply be that you're trying to find out about someone who may have attempted this.
Aside from the government issue, there's the concern that the search companies themselves might misuse data. That needs to be considered and improved guidelines or laws developed. Even better would be to see such moves as part of improved protection of consumer information of all types. The amount of data about what people personally are interested in and do seems easier to obtain from consumer research organizations right now than what search engines possibly might provide in the future. How about considering these both together, rather than separately? That idea came up in a Newsfactor article on Google and consumer data in general last year.
For more on the current issue of the Department Of Justice request for search data, please see these articles from us and others:
- Bush Administration Demands Search Data; Google Says No, Yahoo & MSN Said Yes
- Court Documents & Summary Of United States Versus Google Over Search Data
- The Day After: Points In The Search Trust Sweepstakes
- Privacy Groups, Government Officials Comment on Privacy and Web Search from Reuters
- FAQ: What does the Google subpoena mean? from News.com
- Private Searches Versus Personally Identifiable Searches
Want to comment on things discussed in this article? We have three Search Engine Watch Forum threads where everyone is welcome:
- Administration Demands Search Records - For general comments about the Department Of Justice action.
- Search Privacy Bill Of Rights - This is the place to comment on what types of changes you'd like to see search engines put into place, but you can also propose laws, as well.
- Tips On Protecting Your Search Privacy - Have I missed some great tool or technique above on protecting search privacy? This is the place to contribute.
Postscript: Anonymizer tells me that if you are using only the IP hiding function in Anonymizer, then your ISP will see what you are doing. However, if you use the SSL encrypted "Surfing Security," then your ISP cannot see what you are doing. They're using a better metaphor for this now, calling it a "virtual tunnel" between you and the Anonymizer servers. Ah, but what records does Anonymizer itself keep? None, the company tells me:
The way that the technology is architected, it does not retain any information about users' requests so even if subpoenaed, no information can be supplied because -- simply -- they do not keep any of it. For example, they would not be able to share with anyone where a user is by IP address, or what sites they visited, or anything else, because even Anonymizer does not know. Additionally, the company provides software for use in instances where a privacy breech might have severe consequences -- even death in some cases (where the company protects freedom of speech in foreign countries, Anonymous tips, etc.). Anonymizer has never had a single breech since it began selling products and services in '97, due to its level of security. Trust is a key difference.