No, Google Hasn't Sold Out

Recent changes at Google have provoked cries of alarm and protest from both the media and webmasters. Pay no attention to the hoohaw: Google hasn't sold out.

The furor has focused mostly on Google's new AdWords Select program, which allows advertisers to purchase paid placement links on search result pages. Google's program differs from similar programs offered by other pay-per-click services such as Overture and FindWhat, and these differences have caused confusion about Google's intentions.

One of the most common misconceptions is that Google has compromised the integrity of its search results. The outcry was apparently strong enough that it prompted Google to publish a statement called "Why We Sell Advertising, Not Search Results," which was linked prominently from its home page.

While Google's new advertising program has some confusing aspects and rough edges that will undoubtedly get smoothed out over time, rest assured that nothing has changed in the way that Google computes and delivers non-paid search results. See Danny Sullivan's article "Up Close With Google AdWords" (link below) for a more detailed analysis of the new program.

Google also stirred up controversy among the webmaster community when it started crawling and indexing secure "https" documents. These are documents that are transmitted from servers to browsers using the Secure Sockets Layer (SSL) protocol.

This protocol is commonly used to encrypt and securely transmit sensitive information like credit card numbers and other personal details. But there is no rule that says ordinary pages can't be served from web servers using this protocol.

Why would Google suddenly start indexing these "secure" pages, when it previously said it didn't crawl them?

According to Craig Silverstein, Google's Director of Technology, there's a lot of high-quality content stored on SSL servers that's neither password-protected nor excluded by the robots.txt protocol. This isn't sensitive information, but rather content that's just sharing space on a server that uses the SSL protocol. In other words, fair game for Google to fetch and index.

"There are many different types of useful data available via https pages," said Google spokesperson Nate Tyler. "A good example of this is, a site that offers remote PC access services and has published its entire homepage in https."

Other examples of SSL sites Google has indexed include the homepage for Trading Partner Connect, and, a German Linux community site that has published its content in https.

The misconception that Google is going where it shouldn't comes partly from the somewhat vague definition of "secure." The SSL protocol is simply a transmission protocol. It has nothing to do with whether an individual page should be considered "secure" or not.

Admittedly, this may surprise some webmasters who assumed that URLs starting with "https" were automatically off-limits to search engines. The situation is similar to what happened when Google began indexing Microsoft Office documents last year, much to the chagrin of webmasters who thought they didn't have to worry about non-HTML documents being indexed.

It's the responsibility of the webmaster to place sensitive documents off-limits. Indeed, with Google and other search engines continually adding new file formats to their databases, it's a good idea to explicitly exclude everything on your server that shouldn't be crawled and indexed by search engines.
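
As a rough illustration of what explicit exclusion looks like, the sketch below checks a robots.txt file against a few URLs using Python's standard urllib.robotparser module. The directory names, the example.com host, and the is_allowed helper are hypothetical, invented for this example; they are not taken from any site mentioned here, and real crawlers may interpret rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site. The paths are
# assumptions made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /reports/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/index.html",
        "https://example.com/private/salaries.xls",
        "https://example.com/reports/q1.doc",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked"
        print(f"{verdict}: {url}")
```

Note that the rules match on the URL path, not on file type or on whether the page is delivered over https, which is why new formats and protocols can catch webmasters off guard.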

That said, Google's implementation of indexing SSL pages wasn't flawless. Henk van Ess, operator of the Dutch search engine news service Voelspriet, noted that Google was indexing some content it should not have, since robots.txt files on the affected sites apparently marked that content as off-limits.

Google has acknowledged the bug. "In response, Google fixed the bug and removed all HTTPS pages from the Google index and our cache," said Google's Tyler. "The improved version of Google's web crawler will recognize all robots.txt files associated with HTTPS web pages and will be deployed in the next 30 days."
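
Google has not said what caused the bug, so the following is only a sketch of one relevant detail: by convention, a robots.txt file is looked up separately for each scheme and host, so the file "associated with" an https page lives on the https side of the site. The robots_url_for helper and the secure.example.com host are hypothetical names used for illustration, not a description of how Google's crawler works.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Build the robots.txt URL that governs page_url.

    The scheme and host (including any port) are kept as-is, so an
    https page maps to an https robots.txt, not the http one.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url_for("https://secure.example.com/docs/report.html"))
    print(robots_url_for("http://secure.example.com/docs/report.html"))
```

A crawler that consulted only the http copy of the file could miss exclusions that a site publishes for its https pages, which is one plausible way content marked off-limits could end up indexed.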

The net result was that for a few days, some SSL pages that shouldn't have been indexed were available via Google. Was security compromised? Was sensitive data revealed to leeches? Given the way Google's relevance ranking algorithms work, probably not. And Google moved quickly to fix the problem once it was discovered.

So, despite all of the critical noise about the new advertising program and indexing of SSL pages, Google hasn't really done anything that should concern searchers, or most webmasters, for that matter.

The new advertising program has a few rough edges, and there's still some confusion about how advertisers can influence (or not) the position of their ads in search results, but the way Google calculates non-paid search results has not changed. Nor has Google abandoned its ethical principles and started crawling "secure content" that it has no business indexing.

Bottom line: Google hasn't sold out. And don't expect it to any time soon.

Up Close With Google AdWords
Forget the hype about what the new cost-per-click AdWords program means for Google in terms of competing with Overture. What's in the program for advertisers, and how does the cost-per-click pricing fit in with the existing cost-per-impression ads? Search Engine Watch editor Danny Sullivan takes a close look at the new AdWords program.

Why We Sell Advertising, Not Search Results
Google's spirited defense of its new AdWords program, assuring users that its non-advertising results are absolutely not for sale.

About Secure Sockets Layer (https)
A definition of the SSL protocol.

Google Launches Microsoft Search

Google is launching a new search specialized for information related to Microsoft. This is similar to its BSD, Linux, Apple and other specialized search services.

Google Microsoft Search

Google Topic Specific Search Interfaces
All of Google's topic-specific search interfaces are available via the advanced search page; scroll to the bottom of the page for direct links.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.