Meet the Search Engines!

A special report from the Search Engine Strategies 2001 Conference, August 16-17, San Francisco, CA.

One of the most popular sessions at each of the Search Engine Strategies Conferences is Meet the Search Engines, and this year was no exception.

After all, where else can you hear representatives from four different search engines give information about their engines as well as answer questions from the audience?

In this particular session, the engines were represented by:

  • Rob Rubin, Executive Vice President of Internet Services with FAST

  • Tim Mayer, Web Search Product Manager with Inktomi
  • Chris Kermoian, Director of Internet Search Services and Web Marketing with AltaVista
  • Daniel Dulitz, Software Engineer with Google

Now, let's move to announcements by each of the major engines.


Rob Rubin with FAST began by outlining some facts about their engine:

  • In May of 1999, FAST launched the world's largest search engine, All the Web, with 50 million documents.

  • In November 1999, they launched multimedia search.
  • In December 1999, they launched mobile search.
  • FAST has a 9-12 day cycle for crawling the Web, but they don't crawl the entire Web during each cycle.
  • FAST uses a proprietary relevancy algorithm coupled with link and query analysis. With FAST, link analysis is very important when determining relevancy, as is placement of text on the page and the use of keywords in headers, etc.
  • FAST currently has 625 million Web pages in its index and is covering as much of the Web as it can, including English-language sites and documents in 46 other languages.

Rubin discussed FAST's PartnerSite service, which is a pay-for-inclusion service that's currently being offered through a beta program. Highlights of the PartnerSite service include:

  • Guaranteed inclusion in FAST's index;

  • Dedicated index of the site's pages hosted in FAST's data center;
  • Guaranteed 24-hour site reindexing;
  • Customized site search results pages;
  • Controlled site crawling;
  • Advanced logging, reporting, and keyword tracking tools;
  • Language support and detection (46 languages).

For more information about FAST's PartnerSite, visit:

Interesting Note: Rubin mentioned that is FAST's testing domain. At this domain, they were the first to integrate HTML, multimedia, MP3, and FTP search results. Advanced algorithms are set up at the site to detect spam, deep links, and offensive content as well as automated search tips.


Moving on to Inktomi, Tim Mayer discussed the engine's top priority: relevance. Other priorities include high-quality content, better breadth of content (results for every query), and making their portal partners successful. Inktomi reaches 82 percent of U.S. Internet users.

Mayer outlined Inktomi's separate databases:

  • Best of the Web: full refresh every 9 days; contains over 110 million documents.

  • EuroCluster: full refresh every 21 days; contains 100 million documents.
  • APAC: full refresh every 21 days; contains 65 million documents.
  • Gigadoc: full refresh every 30 days; contains more than 500 million documents.

Inktomi now offers two pay inclusion programs. Search/Submit is ideal for sites with less than 1000 pages. It offers a 48-hour refresh and is available through Position Technology, VeriSign,, and Outrider. For more information, visit:

Index Connect is for large content providers, and it offers a 48-hour refresh as well, based on customer needs. This service is available through e-Luminator, Position Technology, e-centives, Inceptor, and Traffic Leader. For more information, visit:

How does Inktomi define spam? "Trying to trick the search engine into offering inappropriate search results for particular queries," explained Mayer. Examples include:

  • Cloaking - if used to feed Inktomi crawlers content that is not relevant to the actual pages

  • Link farms

Spammers can be reported to a new email address:


AltaVista's rep, Chris Kermoian, relayed the engine's view on ranking: "If you have the most useful and most popular site on a particular topic, AltaVista will do its best to make sure your site is #1." AltaVista receives 200 million unique queries every month.

Key elements of ranking in AltaVista are:

  • Content that is useful and unique.

  • Placement of content on the page. Use "newspaper" style placement. Text at the top of the page is generally assigned greater weight than text at the bottom of the page.
  • Title tags. The page title is critical.
  • Keyword and description META tags.
  • Link popularity, with links coming from valuable sites. Artificial links are considered spam. Anchor text should accurately describe the page's content.
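Several of these elements (the page title and the keyword and description META tags) can be checked on your own pages before you submit them. The following is a minimal sketch using only Python's standard library; the class name `HeadTagAudit`, the function `audit_head`, and the sample page are hypothetical illustrations, and the sketch says nothing about how AltaVista actually weighs these tags.

```python
from html.parser import HTMLParser


class HeadTagAudit(HTMLParser):
    """Collects the <title> text and the keywords/description META tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # META names are matched case-insensitively.
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def audit_head(html):
    """Return the title plus any keywords/description META content found."""
    parser = HeadTagAudit()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), **parser.meta}


# Hypothetical sample page, for illustration only.
sample = (
    "<html><head><title> Blue Widgets </title>"
    '<meta name="Keywords" content="widgets, blue widgets">'
    '<meta name="description" content="Hand-made blue widgets.">'
    "</head><body><p>Welcome.</p></body></html>"
)
print(audit_head(sample))
```

A missing `keywords` or `description` key in the result, or an empty title, is a quick signal that a page needs attention.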

How can you get into AV's index?

  • Regular spider run by AV's crawler.

  • Basic free submit takes 4-6 weeks. About 90 percent of these submissions are considered spam. Tip: In AV's submission "puzzle," using the letter "O" or the zero "0" will both work, if you can't tell which they're asking for. Same thing with the number "1" and the lower-case L "l."
  • Express inclusion is ideal for small and mid-sized sites with under 500 pages. Weekly updates for six months are included. Page rankings are updated each week, so changes made to pages are quickly seen in the search results. For more information, visit:
  • Trusted Feed is a new program especially for large sites where Web pages are provided to AV in XML feed format. This service offers detailed online reporting capabilities. All pages submitted through Trusted Feed are monitored for spam. This program solves many issues such as dynamic content and framed pages.

For more information, visit:

Does AltaVista assign more relevancy to pages that have been in their index for a while? According to Kermoian, no. He also stated that participating in their inclusion programs doesn't have a special effect on rankings either.


Daniel Dulitz with Google said that Google's mission is to organize all the information they can find and make it useful to everyone.

Facts about Google:

  • Google boasts 1.3 billion Web pages now, making it the world's largest search engine.

  • Google offers 60 interface languages.
  • Google crawls the Web every 28 days, but it takes a little additional time for pages to appear in the index. It crawls some sites more often.
  • Google offers PDF searches, an image search in beta, and searches by date.

Interesting Note: Google crawls pages on the Web in order of importance.

Dulitz said that Google has no pay inclusion program and doesn't plan on offering one, which was met with a round of applause from attendees.

What's important to Google?

  • The most important thing to Google is great content.

  • Use user-friendly navigation.
  • Make sure that the site shows up in all browsers.
  • Work hard on building your link popularity. Get links from well-respected sites, which is the "peer review" aspect of the Web. Links should be visible and described accurately in an effort to get people to click on them.

If you want to appear at the top of the results in Google, Dulitz recommends advertising with AdWords.

Why are sites dropped from Google?

  • If a page is unreachable when Google's spider tries to crawl it, it will get dropped.

  • Also, sites can get banned because of "serious" offenses, either detected manually or automatically.

Dynamic Content

One of the participants asked which of the engines index dynamic content.

  • Google will index dynamic content and will crawl pages with question marks or cgi references.

  • Inktomi will index dynamic content but only a few pages from each site. You can form a partnership with Inktomi and they will index more pages.
  • FAST doesn't crawl dynamic pages with question marks or large databases behind them.
  • AltaVista can index dynamic content and can handle question marks. You can submit through their Basic Submit for free or through Express Inclusion. However, if your content changes frequently, they don't want to index those pages because by the time the pages make it into the index, the content has changed.
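To see which of your own URLs fall into the "dynamic" category discussed above, a rough check is to look for a query string (the part after a question mark) or a cgi reference in the path. This is a minimal sketch under that assumption; the function name `looks_dynamic` and the example URLs are hypothetical, and each engine's real crawler rules are not public.

```python
from urllib.parse import urlsplit


def looks_dynamic(url):
    """Heuristic: True if the URL has a query string or a cgi reference."""
    parts = urlsplit(url)
    path = parts.path.lower()
    has_query = bool(parts.query)  # text after the "?" in the URL
    has_cgi = "/cgi-bin/" in path or path.endswith(".cgi")
    return has_query or has_cgi


# Hypothetical URLs, for illustration only.
for url in (
    "http://www.example.com/catalog.asp?id=42",
    "http://www.example.com/cgi-bin/search",
    "http://www.example.com/products/widgets.html",
):
    print(url, looks_dynamic(url))
```

Pages flagged by a check like this are the ones most likely to need a paid inclusion program or a static alternative to be indexed by every engine.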

Automated Queries

How do the engines feel about automated queries?

Google's Terms of Service states, "You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that 'automated queries' includes using any software which sends queries to Google to determine how a website 'ranks' on Google for various queries."
AltaVista doesn't encourage the use of automated queries for checking positions and has stopped large-scale infractions. On a small scale, however, it hasn't taken an active position to stop them.

Robin Nobles is Director of Training for the Academy of Web Specialists. Robin has taught well over a thousand students in her online and onsite search engine positioning courses during the past several years. Her latest books, Web Site Analysis and Reporting and Streetwise Maximize Web Site Traffic, can be ordered through Amazon. Visit the Academy's training site to learn more about their search engine ranking courses and software solution.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.