Search Engine Algorithms & Research

As a searcher or search engine optimization specialist, do you really need to understand the algorithms and technologies that power search engines? Absolutely, said a panel of experts at a recent Search Engine Strategies conference.

A special report from the Search Engine Strategies conference, February 28-March 3, 2005, New York, NY.

A longer version of this story for Search Engine Watch members goes into much more detail about specific search engine algorithms, including PageRank, HITS, Google's local rank and hilltop algorithms, and others. The article also describes theoretical work that can be used to examine how easy or difficult it would be to rank for a given sequence of keywords in a search engine. Click here to learn more about becoming a member.

The Search Engine Algorithm and Research panel featured Rahul Lahiri, Vice President of Product Management and Search Technology at Ask Jeeves, Mike Grehan, CEO of Smart Interactive (recently acquired by webSourced), and Dr. Edel Garcia from Mi Islita.com.

What's the fuss all about?

"Do we really need to know all this scientific stuff about search engines?" asked Grehan. "Yes!" he answered unequivocally, and proceeded to explain the practical competitive edge you gain when you understand how search algorithms function.

"If you know what ranks one document higher than another, you can strategically optimize and better serve your clients. And if your client asks, 'Why is my competitor always in the top 20 and I'm not? How do search engines work?' If you say, 'I don't know, they just do,' how long do you think you're going to keep this account?"

Grehan illustrated his point by quoting Brian Pinkerton, who developed the first full-text retrieval search engine back in 1994. "Picture this," he explained. "A customer walks into a huge travel outfitters store, with every type of item, for vacations anywhere in the world, looks at the guy who works there, and blurts out, 'Travel.' Now where's that sales clerk supposed to begin?"

Search engine users want to achieve their goals with minimum cognitive load and maximum enjoyment. They don't think carefully when entering queries; they use imprecise three-word searches and haven't learned proper query formulation. This makes the search engine's job more difficult.

Heuristics, abundance problems & the evolution of algorithms

Grehan went on to explain the important role that heuristics play in ranking documents. "A fascinating combination of things come together to produce a rank. We need to understand as much as we possibly can, so at least when we're talking about what ranks one document higher than another, we have some indication about what is actually happening."

Grehan described the progression of search algorithms over time. In early search engines, the text on the page was extremely important. But then search researcher Jon Kleinberg identified what he termed "the abundance problem." The abundance problem occurs when a query returns millions of pages, all containing the appropriate text. For example, a search on the term "digital cameras" returns millions of pages. How do you know which are the most important or authoritative pages? How does a search engine decide which listing comes out on top? Search engine algorithms had to evolve in complexity to handle the problem of over-abundance.

Insights from Ask Jeeves

Ask Jeeves is the seventh-ranked property on the web and the number 4 search engine, according to Rahul Lahiri from Ask Jeeves. Lahiri described a number of components that are key to Ask Jeeves search algorithms, including index size, freshness of content and data structure. Ask Jeeves' focus on the structure of data is unique and differentiates its approach from other engines, he said.

There are two key drivers in web search: content analysis and linkage analysis. Lahiri confirmed that Ask Jeeves looks at the web as a graph and examines the link relationships between pages, attempting to map clusters of related information.

By breaking down the web into different communities of information, Ask Jeeves can rely on the "knowledge" from authorities in each community to better understand a query and present more on-topic results to the searcher. If you have a smaller site, but one that is very relevant within your community, your site may rank higher than some larger sites that provide relevant information but are not part of the community.
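To make the idea of authorities within a community concrete, here is a minimal Python sketch of hub and authority scoring in the style of Kleinberg's HITS algorithm. It is not Ask Jeeves' actual algorithm, which the panel did not describe at this level of detail. The page names, the link graph, and the function name hub_authority_scores are all hypothetical, chosen only for illustration.

```python
from math import sqrt


def hub_authority_scores(links, iterations=50):
    """Compute HITS-style hub and authority scores.

    links maps each page to the list of pages it links to.
    Returns two dicts: (hubs, authorities).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for target in targets:
                new_auths[target] += hubs[src]

        # A page's hub score is the sum of the authority scores it links to.
        new_hubs = {
            p: sum(new_auths[t] for t in links.get(p, [])) for p in pages
        }

        # Normalize so the scores stay bounded across iterations.
        a_norm = sqrt(sum(v * v for v in new_auths.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        auths = {p: v / a_norm for p, v in new_auths.items()}
        hubs = {p: v / h_norm for p, v in new_hubs.items()}

    return hubs, auths


# Hypothetical link graph: three pages in a camera community all point to a
# small specialist site, while a large portal is linked from one unrelated page.
links = {
    "camera-directory": ["small-camera-site", "lens-guide"],
    "photo-blog": ["small-camera-site", "lens-guide"],
    "review-roundup": ["small-camera-site"],
    "news-site": ["big-portal"],
}

hubs, auths = hub_authority_scores(links)
for page in sorted(auths, key=auths.get, reverse=True):
    print(f"{page}: {auths[page]:.3f}")
```

In this toy graph the small specialist site ends up with the highest authority score because the community's hub pages agree on it, while the portal, which sits outside the community, scores close to zero.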

Why co-occurrence is important

Dr. Edel Garcia was delayed and not able to be physically present at the panel, but had prepared a PowerPoint presentation with audio narration. Moderator Chris Sherman told everyone to pretend Dr. Garcia was "channeling" through him and presented in his stead.

Dr. Garcia is a scientist with a special interest in Artificial Intelligence and Information Retrieval. He explained that terms that co-occur more frequently tend to be related or "connected." Furthermore, semantic associations affect the way we think of a term. When we see the term "aloha" we think of "Hawaii" because of the semantic associations between the terms. Co-occurrence theory, according to Garcia, can be used to understand semantic associations between terms, brands, products, services, etc.
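One simple way to put a number on co-occurrence is to count how many documents contain each term and how many contain both. The Python sketch below uses a Jaccard-style ratio for this. The measure, the function name cooccurrence_ratio, and the sample documents are assumptions made for illustration, not necessarily the formulation Dr. Garcia presented.

```python
def cooccurrence_ratio(docs, term_a, term_b):
    """Return the share of documents containing either term that contain both.

    Uses a Jaccard-style ratio: both / (a + b - both). A value near 1 means
    the terms almost always appear together; 0 means they never do.
    """
    count_a = count_b = count_both = 0
    for doc in docs:
        words = set(doc.lower().split())
        has_a = term_a in words
        has_b = term_b in words
        count_a += has_a
        count_b += has_b
        count_both += has_a and has_b

    either = count_a + count_b - count_both
    return count_both / either if either else 0.0


# Hypothetical mini-corpus standing in for a set of indexed pages.
docs = [
    "aloha from sunny hawaii",
    "hawaii beaches say aloha to visitors",
    "aloha shirts made in hawaii",
    "hawaii volcano tours",
    "digital cameras on sale",
    "aloha and welcome",
]

print(cooccurrence_ratio(docs, "aloha", "hawaii"))   # 0.6
print(cooccurrence_ratio(docs, "aloha", "cameras"))  # 0.0
```

The same counting could be applied to a keyword and a brand name to estimate how strongly the two are associated in a given set of pages.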

Dr. Garcia then posed a question. Why should we care about term associations in a search engine? His answer: Think about keyword-brand associations. This has powerful implications for search marketing.

For more information on Dr. Garcia's theories, check out the Search Engine Watch forum thread Keywords Co-occurrence and Semantic Connectivity.

The panel ended with a lively Q&A session. Where is the evolution of the search algorithm going? Grehan had a ready answer: he expects the introduction of probabilistic latent semantic indexing and probabilistic hypertext induced topic search. What do those mouthfuls of jargon mean? You'll have to attend the next SES to find out.

Christine Churchill is President of KeyRelevance.com, a full service search engine marketing firm offering organic search engine optimization, strategic link building, usability testing, and pay per click management.
