Meet The Crawlers

Representatives of Yahoo, Google, Ask Jeeves and LookSmart offer an inside glimpse of recent developments at the major search engines.

A special report from the Search Engine Strategies 2004 Conference, March 1-4, New York City.

A longer version of this story for Search Engine Watch members goes into more detail about Yahoo’s free and paid inclusion programs, Google’s recommended tips for webmasters, an update on Google’s advertising programs, questionable optimization tactics all of the engines consider borderline spam, and much more.

Always a favorite, the NYC Search Engine Strategies session of “Meet the Crawlers” was packed, as usual. This session lived up to its reputation of providing information straight from the source and granted the audience direct access to representatives of the big engines.

Yahoo Technology Changes

Tim Mayer, Director of Product Management at Yahoo Search, provided a brief overview of recent technology activities at Yahoo. Here are a few highlights from his presentation.

With the release of its new search engine, Yahoo now powers over half of US web searches, a dramatic shift in market share within the industry. Tim stated that Yahoo now has 260 million users worldwide and 100 million registered users.

A concerted effort has been made to increase the size of the Yahoo index. Tim stated that Yahoo “grew the index over 50% from what it was on before.” Tim also mentioned that more than 99% of the Yahoo index is populated through the free crawl process.

Tim described more than a dozen daily tests aimed at improving search quality and the user interface. The Yahoo team is extremely focused on improving the user experience and providing relevant results. For users, this means that on any given day the results or the user interface may change. To ensure fresh content, Tim mentioned that Yahoo has added a daily crawl for updating documents that it knows change frequently.

Toolbars provide search engines with valuable feedback, so to go along with its new search engine, Yahoo is now offering its own toolbar, called Yahoo Companion.

Yahoo sees personalized search as its future focus. The goal of personalization is to better understand user intent. Currently, people have to type extra words into their query to be specific enough to get the results they want. With personalized search, the search engine delivers relevant results from fewer words. For example, if people want a haircut and Yahoo knows that they live in midtown New York, Yahoo would be able to automatically supply haircutters in that area.
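The haircut example can be pictured as a query being quietly extended with something the engine already knows about the user. The sketch below is only an illustration of that idea, not a description of Yahoo’s system: the function name, the profile dictionary and its "location" key are all hypothetical.

```python
def personalize_query(query, profile):
    """Append a known user location to a query that does not mention it.

    `profile` is a hypothetical dict of facts the engine holds about an
    opted-in user; only the "location" key is used in this sketch.
    """
    location = profile.get("location")
    # Only add the location if we have one and the user did not type it.
    if location and location.lower() not in query.lower():
        return f"{query} {location}"
    return query


print(personalize_query("haircut", {"location": "midtown New York"}))
print(personalize_query("haircut", {}))
```

A user with no stored profile gets the query back unchanged, which is also how an opted-out user would be treated in this sketch.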

Google Time

The second speaker was Craig Nevill-Manning, Senior Research Scientist at Google. Nevill-Manning provided an overview of Google’s ranking process and elaborated on a few points to help webmasters.

Nevill-Manning stated that the PageRank of a page depends on the aggregate importance of all the pages pointing to that page. He said that this is one significant factor in the ranking “so that for the same query, the different pages with essentially the same content we chose the one that has the best reputation in terms of the best reputation of others linking to it.”
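The idea that a page’s importance comes from the importance of the pages linking to it can be sketched with the textbook power-iteration form of PageRank. This is a minimal illustration and not Google’s production code: the damping value of 0.85, the handling of pages with no outgoing links, and the tiny example graph are assumptions that were not part of the presentation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative defaults (assumptions).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links back to a.
example = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page c ends up with the highest score because three pages point to it, and page a comes second because its single inbound link comes from the highly ranked c.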

Nevill-Manning went on to explain that on the other side of the ranking function is the text analysis. “Google looks at the words on the page, the links, the text of the links pointing to that page, and various other items on the page like proximity of adjacent words and so on.” Nevill-Manning stated that about 100 additional factors are considered, and that those factors are constantly being tweaked to make the rankings more relevant.

Like Yahoo, Google updates its index frequently. Google looks for content that has changed recently or that changes regularly over time. For news updates, Google has developed a separate news crawl that can update on a minute-by-minute basis.
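One way to picture crawling that favors frequently changing content is a schedule that revisits a page more often the more often it has been seen to change. The sketch below is an assumption-laden illustration, not either engine’s actual scheduler: the function name, the 1-day and 30-day bounds, and the linear formula are all hypothetical.

```python
def recrawl_interval_days(changes_seen, visits, min_days=1, max_days=30):
    """Pick how many days to wait before the next crawl of a page.

    `changes_seen` is how many past visits found the page changed and
    `visits` is the total number of past visits. The bounds and the
    linear mapping are illustrative assumptions.
    """
    if visits <= 0:
        # No history yet: check again soon.
        return min_days
    change_rate = changes_seen / visits
    # A page that always changes maps to min_days, one that never
    # changes maps to max_days.
    interval = round(max_days - change_rate * (max_days - min_days))
    return max(min_days, min(max_days, interval))


print(recrawl_interval_days(10, 10))  # changed on every visit
print(recrawl_interval_days(0, 10))   # never changed
```

Under these assumptions a news page that changed on every visit would be recrawled daily, while a page that never changed would be revisited about once a month.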

Ask Jeeves: What’s New

Michael Palka, Director of Search, said that Ask Jeeves was now the number two pure search engine and the number five overall search engine player.

Palka described “subject specific popularity” as the feature that makes Ask Jeeves unique. This feature allows Ask “to analyze the entire web link graph and then break it down into subject specific communities.” Once the communities are identified, Ask can further classify those that are on the same topic, which allows it to identify the authorities. The final step in validating authorities is actual editor review.
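As a rough illustration of popularity measured within a subject community, the sketch below counts only the links a page receives from other pages in the same community. It is a deliberate simplification and not Ask Jeeves’ algorithm: it assumes the communities have already been identified (the hypothetical `community_of` mapping), and the page names are invented.

```python
from collections import defaultdict


def subject_authorities(links, community_of):
    """Rank pages inside each community by in-community inbound links.

    `links` maps a page to the pages it links to; `community_of` maps a
    page to its subject community (assumed to be known in advance).
    """
    scores = defaultdict(lambda: defaultdict(int))
    for source, targets in links.items():
        for target in targets:
            topic = community_of.get(source)
            # Count the link only if both ends share a community.
            if topic is not None and topic == community_of.get(target):
                scores[topic][target] += 1
    # Within each community, order pages by link count, then by name.
    return {
        topic: sorted(counts, key=lambda page: (-counts[page], page))
        for topic, counts in scores.items()
    }


links = {
    "knit1": ["yarnshop"],
    "knit2": ["yarnshop", "newsportal"],
    "news1": ["newsportal"],
    "news2": ["newsportal", "yarnshop"],
}
community_of = {
    "knit1": "knitting", "knit2": "knitting", "yarnshop": "knitting",
    "news1": "news", "news2": "news", "newsportal": "news",
}
print(subject_authorities(links, community_of))
```

Links that cross from one community into another are ignored here, so a page only earns authority from sites on its own subject.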

Palka identified Ask Jeeves’ “smart search” as taking search beyond temporal links. Smart search features provide fast access to weather forecasts, stock quotes, news headlines and other related areas that might be helpful to the user.

LookSmart’s Recent Focus

The final speaker in this session was Kevin Berk, VP of Advertiser Solutions at LookSmart/WiseNut. Berk stated that he wanted to answer the question he’d been asked most often at this conference: “What is going on at LookSmart?”

According to Berk, LookSmart is alive and kicking and involved in many activities. One area with a considerable focus is improving the user experience and increasing relevancy in search results.

Berk mentioned WiseNut was the search engine for and that Zeal is the place to go within the LookSmart family to submit your site.

Berk showed a short demonstration of the distributed crawling capability of Grub. This new technology “lets individuals, businesses and organizations donate their computers’ otherwise unused processing power to run software programs that continually crawl the Net, indexing websites and other documents. This data is gathered into the first comprehensive, daily-updated registry of websites, which will be used to provide accurate, up to the minute results for search engines.”

Question and Answer Time

The presentations were followed by a short question and answer session in which audience members put questions to the search engine representatives.

The first question was what would happen to the Yahoo properties AltaVista and AllTheWeb.

Tim Mayer assured the audience that AltaVista and AllTheWeb would remain as search destinations. However, both properties will be migrating to the Yahoo search technology platform. (This occurred shortly after the conference.)

Another audience member had concerns about Yahoo’s push toward personalization and asked whether there was a way to turn it off.

Yahoo’s Mayer stated that because of all the privacy issues, personalization at Yahoo will always be opt-in. He assured the audience that even for users who have actively opted in, there will be a way to turn personalization off.


The major search engines continue to focus on the user experience. Whether it is Google’s algorithm tweaks to improve relevancy or Yahoo’s and AllTheWeb’s changes to the actual user interface of the engines to explore improvements in customer interaction, the search engines continue to strive to maximize the usability, relevancy, and accuracy of the search experience.

Christine Churchill is President of, a full service search engine marketing firm. She is also on the Board of Directors of the Search Engine Marketing Professional Organization (SEMPO) and serves as co-chair of the SEMPO Technical Committee.


