Can whois information be used by a search engine to rank web pages? Is Google using whois information in their ranking of web pages? Some research on a recent trilogy of Go Daddy patent applications raised those questions in my mind.
The patent filings involve adding additional reputation information to published whois data, and letting others use the information for a number of reasons, including letting search engines incorporate that reputation information into their ranking mechanisms.
This seemed in line with something that Google discussed doing last year in Information retrieval based on historical data. But is it something that either company can do? Is it a use consistent with the way that whois information is supposed to be used? There's the rub.
A recent task force vote from the Generic Names Supporting Organization (GNSO) recommended a limited use of whois information and a definition of the purpose of Whois. The purpose that they came up with doesn't seem to go well with a commercial search engine using the information as part of its ranking algorithm. Their definition was agreed to by the GNSO in a vote by teleconference on April 12th. Here's the definition of the purpose of whois that they decided upon:
"The purpose of the gTLD Whois service is to provide information sufficient to contact a responsible party for a particular gTLD domain name who can resolve, or reliably pass on data to a party who can resolve, issues related to the configuration of the records associated with the domain name within a DNS nameserver."
What this means is that less whois information, rather than more, will be published and available to the public.
The Go Daddy patent applications were originally filed on October 29, 2004:
- Presenting search engine results based on domain name related reputation (US Patent Application 20060095404)
- Publishing domain name related reputation in whois records (US Patent Application 20060095459)
- Tracking domain name related reputation (US Patent Application 20060095586)
The Google patent application was published on March 31, 2005. Here are some of the ways it suggests Google could use domain name information:
- Domain registration could be used as a way to determine the "document inception date," or an age associated with a page.
- The expiration date of a domain could indicate the "legitimacy" of a document, with short term registrations indicating more questionable pages.
- Changes, and the frequency of changes, in registration information, including contact information, hosting companies, and more, could also raise warning flags.
- Information about name servers, and other sites on those name servers could also play a role in a ranking score:
A "good" name server may have a mix of different domains from different registrars and have a history of hosting those domains, while a "bad" name server might host mainly pornography or doorway domains, domains with commercial words (a common indicator of spam), or primarily bulk domains from a single registrar, or might be brand new.
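To make those signals a little more concrete, here is a minimal Python sketch of how registration data might be turned into warning flags. It is only an illustration of the ideas listed above, not anything taken from the patent application or from Google: the `DomainRecord` fields, the function names, and the thresholds (one year to expiration, three changes in a year) are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DomainRecord:
    """Hypothetical summary of a domain's registration data."""
    registered_on: date
    expires_on: date
    registration_changes_last_year: int  # contact, host, or registrar changes
    nameserver_is_new: bool


def registration_signals(record: DomainRecord, today: date) -> dict:
    """Derive simple signals from registration data.

    The thresholds are illustrative assumptions, not published values.
    """
    age_days = (today - record.registered_on).days
    days_to_expiry = (record.expires_on - today).days
    return {
        # Registration date as a stand-in for a "document inception date".
        "age_days": age_days,
        # A short remaining registration term as a possible legitimacy concern.
        "short_term_registration": days_to_expiry < 365,
        # Frequent changes to registration details as a possible warning flag.
        "frequent_changes": record.registration_changes_last_year >= 3,
        # A brand new name server as a possible warning flag.
        "new_nameserver": record.nameserver_is_new,
    }


def warning_flag_count(signals: dict) -> int:
    """Count how many of the boolean warning flags are set."""
    flags = ("short_term_registration", "frequent_changes", "new_nameserver")
    return sum(1 for name in flags if signals[name])
```

A domain registered years ago, paid up for years to come, and sitting on an established name server would raise no flags in this sketch, while a freshly registered one-year domain with several recent contact changes would raise all three. How, or whether, a search engine would weigh such flags in an actual ranking score is not something the filing spells out.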
Does Google use this type of information? Some signs point to that, as noted in this Search Engine Watch Forums thread: Does New Google Patent Validate Sandbox Theory?. A Search Engine Roundtable post also describes an interest in using that information: Google Admits to Improve Search Quality with Registrar Data. Both hint at reasons why Google became a domain name registrar beyond registering domain names.
If they are using whois information, will this vote from ICANN's Generic Names Supporting Organization force their use to change? Tough question to answer.
There's an interesting piece of information hidden away in the real-time captioning of the minutes of the ICANN Meetings in Wellington, New Zealand on March 29th. The captioning covers the reasons for this change and some of its implications, such as the removal of a domain owner's name and contact information from the whois information that will be available to the public.
Near the end of the teleconference, there's a discussion, and an unconfirmed report, that Jordyn Buchanan, who has been the chair of the WHOIS task force, would be leaving his present employer to work with another former chair from ICANN, Vint Cerf.
Vint Cerf is presently the Chief Internet Evangelist at Google.