Meet the Search Engines!

by Robin Nobles, Guest Writer

A special report from the Search Engine Strategies 2001 Conference, August 16-17, San Francisco CA.

A longer, more detailed version of this article is
available to Search Engine Watch members.
Click here to learn more about becoming a member

One of the most popular sessions at each of the Search Engine Strategies Conferences is Meet the Search Engines, and this year was no exception.

After all, where else can you hear representatives from four different search engines give information about their engines as well as answer questions from the audience?

In this particular session, the engines were represented by:

  • Rob Rubin, Executive Vice President of Internet Services with FAST
  • Tim Mayer, Web Search Product Manager with Inktomi
  • Chris Kermoian, Director of Internet Search Services and Web Marketing with AltaVista
  • Daniel Dulitz, Software Engineer with Google

Now, let's move to announcements by each of the major engines.


Rob Rubin with FAST began by outlining some facts about their engine:

  • In May of 1999, FAST launched the world's largest search engine, All the Web, with 50 million documents.
  • In November 1999, they launched multimedia search.
  • In December 1999, they launched mobile search.
  • FAST has a 9-12 day cycle for crawling the Web, but they don't crawl the entire Web during each cycle.
  • FAST uses a proprietary relevancy algorithm coupled with link and query analysis. With FAST, link analysis is very important when determining relevancy, as is placement of text on the page and the use of keywords in headers, etc.
  • FAST currently has 625 million Web pages in its index and is getting as much coverage of the Web as it can, including English sites and documents in 46 other languages.

Interesting Note: Rubin mentioned FAST's testing domain, where they were the first to integrate HTML, multimedia, MP3, and FTP search results. Advanced algorithms are set up at the site to detect spam, deep links, and offensive content, and to provide automated search tips.


Moving on to Inktomi, Tim Mayer discussed the engine's top priority: relevance. Other priorities include high quality content, better breadth of content (results for every query), and making their portal partners successful. Inktomi reaches 82 percent of U.S. Internet users.

Mayer outlined Inktomi's separate databases:

  • Best of the Web: full refresh every 9 days; contains over 110 million documents.
  • EuroCluster: full refresh every 21 days; contains 100 million documents.
  • APAC: full refresh every 21 days; contains 65 million documents.
  • Gigadoc: full refresh every 30 days; contains more than 500 million documents.
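For readers who want to plan around these refresh cycles, here is a minimal sketch in Python. It assumes only the intervals Mayer quoted; the helper names `next_full_refresh` and `refreshes_per_year` are hypothetical and are not part of any Inktomi tool.

```python
from datetime import date, timedelta

# Full-refresh intervals in days, as quoted in the session.
REFRESH_DAYS = {
    "Best of the Web": 9,
    "EuroCluster": 21,
    "APAC": 21,
    "Gigadoc": 30,
}


def next_full_refresh(database: str, last_refresh: date) -> date:
    """Date by which the next full refresh would be due (hypothetical helper)."""
    return last_refresh + timedelta(days=REFRESH_DAYS[database])


def refreshes_per_year(database: str) -> float:
    """Approximate number of full refreshes in a 365-day year."""
    return 365 / REFRESH_DAYS[database]


if __name__ == "__main__":
    start = date(2001, 8, 16)  # first day of the conference, used as an example date
    for name in REFRESH_DAYS:
        print(name, next_full_refresh(name, start), round(refreshes_per_year(name), 1))
```

The point is simply that a page change picked up by Best of the Web could surface within about nine days, while the same change may take up to a month to show in Gigadoc.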

How does Inktomi define spam? "Trying to trick the search engine into offering inappropriate search results for particular queries," explained Mayer. Examples include:

  • Cloaking - if used to feed Inktomi crawlers content that is not relevant to the actual pages
  • Link farms

Spammers can be reported to a new email address:


AltaVista's rep, Chris Kermoian, relayed the engine's view on ranking: "If you have the most useful and most popular site on a particular topic, AltaVista will do its best to make sure your site is #1." AltaVista receives 200 million unique queries every month.

Key elements of ranking in AltaVista are:

  • Content that is useful and unique.
  • Placement of content on the page. Use "newspaper" style placement. Text at the top of the page is generally assigned greater weight than text at the bottom of the page.
  • Title tags. The page title is critical.
  • Keyword and description META tags.
  • Link popularity, with links coming from valuable sites. Artificial links are considered spam. Anchor text should accurately describe the page's content.
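To see why "newspaper" style placement matters, consider a toy scoring function in Python. This is not AltaVista's algorithm, which was not disclosed in the session; it is a sketch that assumes a simple linear falloff, where a keyword at the very top of the page counts fully and one near the bottom counts for little. The function name `position_weighted_score` is made up for this illustration.

```python
def position_weighted_score(text: str, keyword: str) -> float:
    """Toy score: each keyword match counts more the earlier it appears.

    Assumption for illustration only: weight falls linearly from 1.0 for
    the first word toward 0.0 at the end of the text.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    keyword = keyword.lower()
    total = 0.0
    for index, word in enumerate(words):
        # Strip common punctuation so "widgets," still matches "widgets".
        if word.strip(".,;:!?\"'()") == keyword:
            total += 1.0 - index / len(words)
    return total


if __name__ == "__main__":
    top_heavy = "Widgets for sale. We ship anywhere in the country."
    bottom_heavy = "We ship anywhere in the country. For sale: widgets"
    print(position_weighted_score(top_heavy, "widgets"))
    print(position_weighted_score(bottom_heavy, "widgets"))
```

Under this assumed weighting, the page that leads with its keyword scores higher than the page that buries the same word in its last line, which is the behavior Kermoian described in general terms.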

Does AltaVista assign more relevancy to pages that have been in their index for a while? According to Kermoian, no. He also stated that participating in their inclusion programs doesn't have a special effect on rankings either.


Daniel Dulitz with Google said that Google's mission is to organize all the information they can find and make it useful to everyone.

Facts about Google:

  • Google boasts 1.3 billion Web pages now, making it the world's largest search engine.
  • Google offers 60 interface languages.
  • Google crawls the Web every 28 days, but it takes a little additional time for pages to appear in the index. It crawls some sites more often.
  • Google offers PDF searches, an image search in beta, and searches by date.

Interesting Note: Google crawls pages on the Web in order of importance.
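
Dulitz did not describe how that ordering is implemented. One common way to model "most important first" is a priority queue, sketched here in Python with the standard `heapq` module. The `CrawlFrontier` class and the importance scores are hypothetical and only illustrate the idea.

```python
import heapq


class CrawlFrontier:
    """Toy crawl queue that hands out the most important URL first."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: equal scores come out in insertion order

    def add(self, url: str, importance: float) -> None:
        # heapq is a min-heap, so store the negated score to pop the largest first.
        heapq.heappush(self._heap, (-importance, self._counter, url))
        self._counter += 1

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    frontier = CrawlFrontier()
    frontier.add("http://example.com/archive/1998.html", 0.2)  # made-up scores
    frontier.add("http://example.com/", 0.9)
    frontier.add("http://example.com/products.html", 0.5)
    while frontier:
        print(frontier.pop())
```

In a model like this, low-importance pages are reached last, which would fit with some sites being crawled more often than others.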

Why are sites dropped from Google?

  • If a page is unreachable when Google's spider tries to crawl it, it will get dropped.
  • Also, sites can get banned because of "serious" offenses, either detected manually or automatically.

Dynamic Content

One of the participants asked which of the engines index dynamic content.

  • Google will index dynamic content and will crawl pages with question marks or cgi references.
  • Inktomi will index dynamic content but only a few pages from each site. You can form a partnership with Inktomi and they will index more pages.
  • FAST doesn't crawl dynamic pages with question marks or large databases behind them.
  • AltaVista can index dynamic content and can handle question marks. You can submit through their Basic Submit for free or through Express Inclusion. However, if your content changes frequently, they don't want to index those pages because by the time the pages make it into the index, the content has changed.
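
If you want to check which of your own URLs might fall into this "dynamic" category, a rough heuristic can be sketched in Python. It assumes that "dynamic" means what the panel described, a question mark in the URL or a cgi reference in the path. The function name `looks_dynamic` is hypothetical, and none of the engines published the actual rules their crawlers use.

```python
from urllib.parse import urlsplit


def looks_dynamic(url: str) -> bool:
    """Heuristic guess at whether a crawler might treat a URL as dynamic.

    Assumption for illustration: a URL is "dynamic" if it contains a
    question mark, lives under /cgi-bin/, or ends in .cgi.
    """
    path = urlsplit(url).path.lower()
    has_question_mark = "?" in url
    has_cgi_reference = "/cgi-bin/" in path or path.endswith(".cgi")
    return has_question_mark or has_cgi_reference


if __name__ == "__main__":
    for candidate in (
        "http://example.com/catalog?id=42",
        "http://example.com/cgi-bin/search",
        "http://example.com/about.html",
    ):
        print(candidate, looks_dynamic(candidate))
```

Pages flagged by a check like this are the ones to watch, since FAST said it skips them and Inktomi indexes only a few per site.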

Robin Nobles is Director of Training for the Academy of Web Specialists. Robin has taught well over a thousand students in her online and onsite search engine positioning courses during the past several years. Her latest books, Web Site Analysis and Reporting and Streetwise Maximize Web Site Traffic, can be ordered through Amazon. Visit the Academy's training site to learn more about their search engine ranking courses and software solution.



About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.