How Inktomi Works

In-depth explanation of how the Inktomi search engine works, for Search Engine Watch members.

Recent Articles

The articles below appeared in the Search Engine Update newsletter and have important information not yet added to this page. Please review them to find out about any new developments. Further below, you will also find a list of other articles about this search engine that may be of interest.

Overview

This page explains how to get listed with Inktomi, as well as what generally helps pages rank well within Inktomi’s results. Inktomi is a crawler-based search engine owned by Yahoo.

Yahoo intends to use Inktomi’s results within its own site, for the crawler-based listings that currently come from Google. Some testing of Inktomi results has been done, but full implementation of this has yet to happen.

Yahoo also provides Inktomi’s results to other search engines through partnerships. MSN Search is Inktomi’s most important partnership, though MSN hopes to eventually replace Inktomi with its own crawler-based results, as explained more in the Microsoft’s MSN Search To Build Crawler-Based Search Engine article.

Getting Listed: Link Crawling

The best way for pages from a web site to be listed with Inktomi is for the crawler to find the pages naturally, as it moves around the web following links. If your brand new site has even one link pointing at it, Inktomi’s crawler may eventually find it and index pages within it. The more links pointing at you, the more likely the crawler is to come across your pages and add them for free.

Think of it like asking people for recommendations. If you ask 10 different people for a good doctor and several of them suggest the same person, you are more likely to contact that doctor for help. In the same way, having many links pointing at your pages (or a few links from important web sites) makes it more likely the Inktomi crawler will visit you.

To gain links, you should offer good content. If you do, people are more likely to link to it. Similarly, you should build links to your pages from good quality sites related to you in content (see the Link Analysis & Link Building page for more tips on this).

Inktomi’s link crawler won’t add everything that it finds, and there are some pages such as those delivered dynamically (see Dynamic Pages & Search Engines) that it may ignore entirely.

How many pages Inktomi will include may vary. Inktomi is more likely to index individual pages within your site that have links pointing at them from external sources. To help it with this, the company maintains a “WebMap” of links from across the web. By looking at this giant network of the web, Inktomi can determine which pages are most commonly linked to. This helps it decide what to include.
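To make the idea of a link map concrete, here is a minimal sketch that counts how many links from other sites point at each page. It is only an illustration of the general principle. The function name, the sample URLs and the host-comparison rule are all assumptions, not a description of how Inktomi’s WebMap is actually built.

```python
from collections import Counter
from urllib.parse import urlparse


def inbound_external_links(links):
    """Count, per target page, the links that come from a different host.

    `links` is an iterable of (source_url, target_url) pairs.
    """
    counts = Counter()
    for source, target in links:
        # Only links from another host count as "external" in this toy model.
        if urlparse(source).netloc != urlparse(target).netloc:
            counts[target] += 1
    return counts


# Hypothetical sample data: two external links and one internal link
# point at widgets.html, one external link points at about.html.
links = [
    ("http://a.example/index.html", "http://shop.example/widgets.html"),
    ("http://b.example/links.html", "http://shop.example/widgets.html"),
    ("http://shop.example/index.html", "http://shop.example/widgets.html"),
    ("http://a.example/index.html", "http://shop.example/about.html"),
]
counts = inbound_external_links(links)
for page, total in counts.most_common():
    print(page, total)
```

In this toy data, widgets.html has the most external links, so a crawler working along these lines would be more likely to include it than about.html.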

As with most crawlers, Inktomi is also more likely to index documents that are “high” or “shallow” in your site, rather than being buried deeply in a complicated directory structure (see the Submitting & Encouraging Crawlers page).
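If you want a rough way to see how “deep” your own pages sit, the sketch below counts the directories between the site root and a document. The function name and the example URLs are made up for illustration, and nothing here reflects a threshold that Inktomi has published.

```python
from urllib.parse import urlparse


def directory_depth(url):
    """Return the number of directories between the site root and the page."""
    path = urlparse(url).path
    # Drop the leading empty segment and the final segment (the file name,
    # or an empty string when the URL ends with a slash).
    directories = path.split("/")[1:-1]
    return len([d for d in directories if d])


for url in (
    "http://www.example.com/index.html",
    "http://www.example.com/products/widgets.html",
    "http://www.example.com/a/b/c/page.html",
):
    print(directory_depth(url), url)
```

Pages with a lower number are the “high” or “shallow” ones described above.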

Getting Listed: Paid Inclusion

If your page is not added by the link crawler, the next best option for being listed is to use Inktomi’s paid inclusion programs. There is no free add URL page; this was discontinued in September 2002.

Inktomi’s paid inclusion programs allow you to pay to ensure that your pages are listed. The downside to these programs is that you must pay for them and that they do NOT guarantee that your pages will rank well for particular terms. The upside is that you know absolutely that your pages will be included in the index and thus may appear in response to searches.

Inktomi offers two major paid inclusion programs that allow site owners to ensure that the pages they want are listed and visited often. These are Search Submit and Index Connect, as described below.

Search Submit: Flat Fee Paid Inclusion

Inktomi’s Search Submit program is designed for those with smaller web sites. It uses a self-serve model, charging a flat fee per page. In other words, you pick the pages that you want indexed, then use a form to submit them along with your credit card details, paying a set amount per page. Once the pages are accepted, they’ll appear within the Inktomi paid inclusion index within 3 days and will be revisited every 2 days for up to a year. Inktomi sells Search Submit listings through partners, which you’ll find listed on the Search Submit page at the Inktomi web site. Each partner may set its own price, so shop around for the best combination of services and discounts. A rough range of “list” pricing can be found on the Crawler Submission Chart.

Index Connect: Cost Per Click Paid Inclusion

Inktomi’s Index Connect program is designed for those with large web sites who wish to list 1,000 or more pages. Instead of a flat fee per page, as with the Search Submit program, Index Connect uses cost per click (CPC) pricing. This means that you only pay for pages that actually generate clicks to your web site. Index Connect may be a more economical system for those with thousands of pages that they wish to include with Inktomi. Index Connect is sold directly by Inktomi or through a variety of partners, as listed on the Index Connect page at the Inktomi web site.
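To compare the two pricing models for your own site, you can run the arithmetic yourself. The sketch below does this with placeholder numbers. The per-page fee and the per-click price are invented for illustration only; actual prices vary by partner, so substitute the quotes you receive.

```python
def flat_fee_cost(pages, fee_per_page):
    """Total cost when every submitted page is charged a set amount."""
    return pages * fee_per_page


def cpc_cost(clicks, cost_per_click):
    """Total cost when you pay only for clicks your listings generate."""
    return clicks * cost_per_click


def break_even_clicks_per_page(fee_per_page, cost_per_click):
    """Clicks per page at which the two models cost the same."""
    return fee_per_page / cost_per_click


# Placeholder figures, NOT real Inktomi or partner prices.
pages = 1000
fee_per_page = 25.0      # hypothetical flat fee per page
cost_per_click = 0.25    # hypothetical price per click
clicks = pages * 20      # assume each page draws 20 clicks

print(flat_fee_cost(pages, fee_per_page))
print(cpc_cost(clicks, cost_per_click))
print(break_even_clicks_per_page(fee_per_page, cost_per_click))
```

With these made-up figures, cost per click is cheaper as long as each page averages fewer than 100 clicks. Above that point, the flat fee would cost less.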

Content Indexed & Refresh

Inktomi indexes the full-text of documents but ignores certain common words during phrase searches. For example, “top web provider” is seen by Inktomi as “top XXX provider,” where any word between top and provider will return a result. Inktomi has some stop words, with no way to override their exclusion.
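The effect on phrase searches can be sketched in a few lines. The example below treats a stop word inside a phrase as a wildcard for any single word, which is the behavior described above. The stop word list is a hypothetical stand-in (only “web” comes from the example above), and the regular expression approach is an assumption made for illustration, not Inktomi’s implementation.

```python
import re

# Hypothetical stop word list; only "web" is taken from the example above.
STOP_WORDS = {"web", "the", "a", "of"}


def phrase_pattern(phrase):
    """Build a pattern where each stop word matches any single word."""
    parts = []
    for word in phrase.lower().split():
        if word in STOP_WORDS:
            parts.append(r"\w+")           # any one word fills the slot
        else:
            parts.append(re.escape(word))  # other words must match exactly
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


pattern = phrase_pattern("top web provider")
print(bool(pattern.search("The top email provider in town")))
print(bool(pattern.search("A top provider of hosting")))
```

The first text matches because “email” fills the slot left by the stop word. The second does not, because no word sits between “top” and “provider”.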

Inktomi revisits pages on a regular basis, to check for changes. Historically, this has been on a roughly basis, for pages not in its paid inclusion program.

In September 2002, Inktomi announced that it plans to revisit every non-paid inclusion page in its index at least once during a 2 to 3 week period — with the average being around every 10 days. Some pages may be revisited far more frequently than this. Pages may be tagged for fast refresh if they are seen as popular or if they’ve been observed to change often over time.

Revisiting pages may also sometimes cause pages to be dropped. If the Inktomi crawler tries to reach a page repeatedly but has trouble accessing it, then the page may be dropped from the listings. There are various reasons why a page may not be found, such as server problems and network delays.

In these instances, the page may naturally reappear in the near future, when the link crawler tries to reach it again. Of course, submitting through the paid inclusion programs will immediately restore a page.

Ranking & Listing Issues

As with most major crawler-based search engines, Inktomi will rank pages better in response to particular queries if they contain the search terms in their HTML body copy with some degree of frequency.

The HTML title tag is also very important to Inktomi. The particular terms you want a page to be found for should appear in the title tag of that page.

A ranking boost is also given to pages that contain the search terms in their meta tags, either the meta keywords tag, the meta description tag, or both. The Meta Tags Revisited article looks more closely at how Inktomi processes the meta keywords tag and generates descriptions.
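The on-page factors described so far (body copy, title tag and meta tags) can be pictured with a toy scoring function. This is purely illustrative. The weights, the function name and the way the signals are combined are assumptions, since Inktomi does not publish its formula.

```python
# Hypothetical weights chosen only to make the example readable.
TITLE_WEIGHT = 1.0
META_WEIGHT = 0.5


def toy_score(query, title, body, meta=""):
    """Score a page for a query using body frequency, title and meta tags."""
    body_words = body.lower().split()
    title_words = title.lower().split()
    meta_words = meta.lower().split()
    score = 0.0
    for term in query.lower().split():
        if body_words:
            # Share of body words that are the search term.
            score += body_words.count(term) / len(body_words)
        if term in title_words:
            score += TITLE_WEIGHT
        if term in meta_words:
            score += META_WEIGHT
    return score


body = "blue widgets for sale"
with_title = toy_score("widgets", "Blue Widgets", body)
without_title = toy_score("widgets", "Welcome", body)
with_meta = toy_score("widgets", "Welcome", body, meta="widgets gadgets")
print(with_title, without_title, with_meta)
```

In this toy model, the same body copy scores higher when the search term also appears in the title or the meta tags, which mirrors the advice above.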

Link analysis also plays a role in how Inktomi ranks web pages. It analyzes links from across the web to determine both the importance of a page and the terms it might be relevant for. The Link Analysis & Link Building page explains this concept in more depth, plus it has tips on gaining important links to build your reputation in link analysis systems.

Clickthrough measurements are also used to determine if a page should be given a boost in rankings for particular terms. More about measuring clickthrough can be found on the How Direct Hit Works page. Inktomi doesn’t use Direct Hit’s system, but the general principles are the same.

Inktomi also has an internal editorial staff that constantly runs massive numbers of searches and then selects documents that they consider relevant. Inktomi then uses this human preference data to tweak its various relevancy controls to try to automatically match the human selections. In this way, the company hopes to model the human qualities of what’s relevant into its ranking software.

Finally, Inktomi is experimenting with new “anti-proximity” relevancy, as explained in the Inktomi Increases Size, Introduces Anti-Proximity article from Sept. 2002.

For some additional ranking improvement advice, you might also see the recent “Optimising for Inktomi And how it can help on Other SEs!” article by Barry Lloyd, listed below.

As for spamming, these are tactics Inktomi has traditionally banned or downgraded pages for:

  • Keyword stuffing in titles, meta tags and in the body copy.
  • Use of invisible text or text too small to read.
  • Submission of identical pages in an effort to dominate the top results.
  • Use of irrelevant keywords in meta tags.

In Aug. 2001, Inktomi debuted some additional spam guidelines and the “Spam Crusader” program, as described at the end of the Inktomi Expands Inclusion Partners article.

Be aware, too, that Inktomi may be more liberal in the content it will allow when that content comes in through paid inclusion programs, as explained in the Doorways Not Always Bad, At Inktomi article.

Past Articles

Non-Search Engine Watch Articles

Optimising for Inktomi And how it can help on Other SEs!
SearchEngineBlog.com, July 2003
http://www.searchengineblog.com/columns/optimising_for_inktomi.htm

I’ve never been a fan of the concept that you should do some type of specific actions to please one particular search engine. In general, there are key things that you should do that work across the board. Nevertheless, if you want some Inktomi-specific tips, then Barry Lloyd has some suggestions in this article. Plus, he finds the ground work for Inktomi can translate into later Google success.
