Do patents, white papers, and other publications authored by search engine employees provide clear guidance on how to optimize Web pages? A panel of experts debated the issue at a recent Search Engine Strategies conference.
A special report from the Search Engine Strategies conference, August 7-10, 2006, San Jose, CA.
Software engineers and other staff at the commercial web search engines publish academic papers and file patent applications, which may or may not reveal how search engines find and rank web pages. Debating whether an understanding of patents helps search optimizers were panelists Jon Glick, Senior Director of Product Search and Comparison Shopping at Become.com; Rand Fishkin, CEO of SEOmoz.org; and Bill Slawski, President of SEO by the Sea, Inc.
Search engine patents: proof or no proof
Many search engine optimizers regularly monitor search engine patent applications and cite this documentation as proof that their methodologies help web pages rank. However, patent applications often offer limited, and even misleading, information.
“What search engines put into patents is often more like brainstorming,” said Glick. “It’s every approach that they can think of versus what they are actually doing, or even have a technology to do.”
Search engine staff file patents with the idea that they might use certain features in the future yet prevent their competitors from utilizing the same features. “Search engine staff know that their patent applications will be read by competitors and SEOs,” he continued. “You don’t actually have to use the features in the patent to be granted a patent, nor does anyone have to disclose all features in a patent application.”
For example, personal data is not likely to be used as part of a search engine algorithm. Many people might use the same computer (such as computers in libraries, universities, and Internet cafes); therefore, the personal data is often inaccurate and does little to enhance the search experience. Nonetheless, this information can be a part of the patent application.
“People should realize that looking at patents and white papers might describe things that never happen,” echoed Slawski.
However, some items in a patent application can be useful, such as the frequency with which links change, or the evaluation of out-links. Web site owners do not have control over how other sites link to their site, but they do have complete control over their content and the sites they choose to link to. “Traditionally, search engines have ranked a web site based on who links to the site, not whom the site links to,” said Glick. “Both Google and Yahoo! use out-links for spam evaluation, and next-generation algorithms are using them.”
According to Fishkin, search engines recognize manipulative link-building techniques by looking at links and link flow.
“Manipulative links are built for search engines, not (human) users,” said Fishkin. “They are built automatically rather than by hand. They are not an editorial vote for the quality of a page and are influenced by financial or less ‘legitimate’ incentives.”
Search engines use algorithmic techniques for identifying and combating manipulative links. Some of these techniques might include:
- Spotting link networks
- Similarity identification
- Trends and search data evaluation
- Web analytics and user surfing data
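The first of these techniques, spotting link networks, can be illustrated with a toy example. The sketch below is purely hypothetical: the link graph, the reciprocal-density heuristic, and the 0.9 threshold are illustrative assumptions, not anything the panelists attributed to an actual engine. It flags sites whose out-links are almost all reciprocated, one crude signal of a cluster built for search engines rather than by independent editorial choice.

```python
# Hypothetical link graph: site -> set of sites it links out to.
# The domain names and the threshold below are illustrative only.
LINK_GRAPH = {
    "a.example": {"b.example", "c.example", "d.example"},
    "b.example": {"a.example", "c.example", "d.example"},
    "c.example": {"a.example", "b.example", "d.example"},
    "d.example": {"a.example", "b.example", "c.example"},
    "news.example": {"a.example"},
}

def reciprocal_density(graph):
    """Return, per site, the fraction of its out-links that are reciprocated.

    A densely interlinked cluster where nearly every link is returned
    looks less like an editorial vote and more like a link network.
    """
    scores = {}
    for site, outlinks in graph.items():
        if not outlinks:
            continue
        reciprocated = sum(1 for target in outlinks
                           if site in graph.get(target, set()))
        scores[site] = reciprocated / len(outlinks)
    return scores

scores = reciprocal_density(LINK_GRAPH)
suspects = sorted(s for s, d in scores.items() if d >= 0.9)
```

Here the four fully interlinked `*.example` sites are flagged, while `news.example`, whose single out-link is not reciprocated, is not. Real detection would combine many more signals, as the list above suggests.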
“If you’re concerned about privacy,” added Fishkin, “you have to question where the data from Google Analytics goes.”
Even expert SEOs can be confused by patent information. For example, some SEOs believe that having an RSS feed will automatically give a site a boost in rankings. In reality, a site with an RSS feed might simply be crawled more frequently because it is likely to have fresh content. “The rate of change in content mostly impacts crawl frequency, not ranking,” said Glick.
Evolution of search engine algorithms
Search engine algorithms are constantly evolving. “Search algorithms are getting better and better at understanding what the content on pages actually means,” Glick stated. “A few years ago they were just blindly indexing the words on a web page, but now they are beginning to understand what some of those words mean and what the page represents (a store, a news article, etc.). For example, (650) 555-1212 is a phone number.”
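Glick’s phone-number example can be sketched as a trivial bit of entity recognition. The snippet below is a minimal illustration under stated assumptions: a single regular expression for US-style numbers like (650) 555-1212, nothing like the richer models a search engine would actually use.

```python
import re

# Matches US-style numbers of the form (NNN) NNN-NNNN, as in the
# panel's (650) 555-1212 example. Illustrative pattern only; real
# entity recognition handles many more formats and contexts.
PHONE_RE = re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}")

def find_phone_numbers(page_text):
    """Return all US-style phone numbers found in a page's text."""
    return PHONE_RE.findall(page_text)

hits = find_phone_numbers("Call us at (650) 555-1212 for store hours.")
```

Recognizing such entities is a small step from “blindly indexing words” toward understanding that a page represents, say, a store with contact information.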
Slawski sees search engine algorithms evolving in stages. Stage 1 was a “one size fits all approach,” which, as Glick mentioned, was not very effective.
Stage 2 algorithms developed through understanding users. “Search engines are looking more into search query data, which involves analyzing search queries, collecting searcher information, and matching searcher intentions,” Slawski said. “With Stage 3, search engines are taking a step forward, not only looking at interactions but at people themselves.”
Should SEOs regularly monitor patent applications, white papers, and other publications that are authored by search engine software engineers and scientists? Absolutely. Search engines constantly try to improve the search experience, and information provided in these documents can help web site owners improve the search experience on their own sites. However, realize that patent information might not always offer the solid “proof” of an algorithm that one might believe.
NOTE: Article links often change. In case of a bad link, use the publication’s search facility, which most have, and search for the headline.
From The SEW Blog…
- Google’s Belgium Fight: Show Me The Money, Not The Opt-Out, Say Publishers
- See Google Results As If You Are In Another Country
- The Unchanging Search Interface
- Again, The Need For Search Ad Revenue To Stand Alone
- Google Webmaster Central’s Vanessa Fox & Amanda Camp Interviewed
- Microsoft To Enter Chinese Market With China Telecom
- Google Base Drops Search Box As Part Of Usability Improvements
- Yahoo Teams Up With Gore’s Current TV
- Zillow Adds ‘Owner-Generated’ Content
- Citysearch Launches in San Diego
- Video Search Usage for August 2006
- Yahoo CEO Says Ad Growth Slowing Down; Ask.com To Increase Market Share
Headlines & News From Elsewhere
- Bush loses ‘failure’ and ‘miserable failure’ in Google, Threadwatch
- Ask.com Earthquake Information, Google Blogoscoped
- Advertising Execs Speak Up on Google’s Manhattan Move, Google Watch
- YouTube headed for Good Morning America, TechCrunch
- NYTimes.com Integrates Answers.com Content, Search Engine Journal
- Six Wall Street Reactions to Yahoo’s Warning, SeekingAlpha
- Podcast: Is The Search Bubble Popping? Where’s The Search Box Gone On Google Base? Will Search Ever Change? And More!, Daily SearchCast
- A method for removing spam from the index, Search-Science
- Google regains ground in U.S. search market, News.com
- Congratulations, Luis von Ahn, Official Google Blog
- Writely To Open To All Google Accounts, InsideGoogle
- How to get a job at Google, InfoWorld
- Google’s Quiet Acquisition of Transformic, Inc, SEO by the SEA
- IAC Appoints Richard Stalzer President of IAC Advertising Solutions, Yahoo! Finance
- Microsoft Offers Paid Display Ad Upgrade to Free Classifieds, ClickZ
- Google’s Internal Subdomains, Google Blogoscoped
- What Are Microformats And Why They Make Your Information Easier To Find, Robin Good
- Torpark is out, offering “anonymous, portable web browsing”, Boing Boing
- Podcast: Live.com, AdWords Test, Intuit & Google, Yahoo! Ads, Supplementals, Belgian Cache, OneBox Spam, Adam Lasnik on Links, Search Pulse
- Debugging blocked URLs, Official Google Webmaster Central Blog
- Topless Nude Sunbathers! In Google Earth…, InsideGoogle
- Gonzales Pushes to Retain ISP Records, Associated Press
- Diller: Ask.com To Continue Outsourcing Paid Search, MediaPost