Doorways Not Always Bad, At Inktomi
In my Ending The Debate Over Cloaking article, I discussed the fact that XML feeds provide a way for search engine marketers to essentially cloak doorway page-style content with approval. In this article, we take a closer look at exactly how this is so, at Inktomi.
Last week, when I was arguing to Alan Perkins via email that XML feeds are a form of cloaking, I wanted to send him an example of this in action. I looked around on my desk for an electronic product, since I know that many electronic retailers make use of paid inclusion programs. Spotting the Toshiba E330 Pocket PC I bought two months ago, I then checked what happened when I searched for it in Inktomi's results.
Inktomi is the biggest of the paid inclusion providers, so it was a good choice for such a test. The company has in excess of 10 million paid URLs. I was confident that somehow one of these 10 million or more would make it into the top results for my query. However, even I was surprised to discover how prominent the paid inclusion content was, for that particular search.
You can see this for yourself by looking for toshiba e330 at MSN Search, which uses Inktomi results for some of its listings. When you hover over some of the URLs listed, you'll notice that they begin "http://redirect-west.inktomi.com..." rather than the URL that's listed.
This is a sign that these are paid inclusion URLs, which Inktomi is tracking in most cases so they can bill on a cost-per-click basis. For that search, I counted 8 of the 15 editorial results appearing to come out of paid inclusion -- 53 percent. Companies with paid inclusion content listed included Amazon, NexTag, DealTime and BizRate.
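The hover check described above can be sketched in a few lines of Python. This is only an illustration: the one detail taken from the article is the "redirect-west.inktomi.com" prefix, while the function names and the sample link targets are made up, and nothing here reflects how Inktomi or MSN actually build their links.

```python
from urllib.parse import urlparse

# Assumption: paid inclusion links are recognizable by this tracking host,
# taken from the "http://redirect-west.inktomi.com..." prefix seen at MSN Search.
TRACKING_HOST = "redirect-west.inktomi.com"


def is_tracked(url: str) -> bool:
    """Return True if the link target goes through the tracking redirect host."""
    return urlparse(url).hostname == TRACKING_HOST


def paid_share(urls: list[str]) -> float:
    """Percentage of links in a result list that use the tracking redirect."""
    if not urls:
        return 0.0
    tracked = sum(1 for u in urls if is_tracked(u))
    return 100.0 * tracked / len(urls)


# Hypothetical link targets, invented for illustration only.
sample = [
    "http://redirect-west.inktomi.com/click?u=example-a",
    "http://www.example.com/pocket-pc",
    "http://redirect-west.inktomi.com/click?u=example-b",
    "http://www.example.org/review",
]
print(f"{paid_share(sample):.0f} percent")
```

The same division gives the figure quoted above: 8 tracked links out of 15 results rounds to 53 percent.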
Wanting to test things further today, I visited the BizRate site to see what generic product types it offers. I wanted to see how Inktomi did when looking for a category of product, rather than a specific model. I found that BizRate had an area about kitchenware, including pages on tea kettles and crock pots. Both sounded like good, generic examples to test.
So the results for tea kettle? I found 4 of the 11 Inktomi-provided "Web Pages" results came from paid inclusion, or 36 percent. Interestingly, though BizRate got top billing in that search, it wasn't a paid inclusion URL that was listed. Instead, paid inclusion URLs appeared for Amazon (twice), DealTime and eBay.
As for crock pot, no Inktomi results made it into the top MSN listings. These were well buried by data from LookSmart. However, HotBot provides access to "pure" Inktomi results. There, you can see that paid inclusion URLs made up 3 out of the 10 pages listed, or 30 percent. Amazon, Cooking.com and eBay were the companies listed.
XML Feed or Doorway Template Feed?
While the percentage of paid inclusion content for "crock pot" was fairly low, the eBay listing drew my attention, because it was so similar to what I'd just seen for "tea kettle." Compare:
Tea Kettle on eBay
eBay offers great deals and a wide selection on Tea Kettle and related items!
Crock Pot on eBay
eBay offers great deals and a wide selection on Crock Pot and related items!..
On the face of it, the similarity suggests that the eBay paid inclusion content isn't an XML feed that is assembled from the actual content on the eBay web site but instead is a case of having a template that is used for whatever words eBay would like to be found for. Just plug in the words like this:
KEYWORDS on eBay
eBay offers great deals and a wide selection on KEYWORDS and related items!
Next, submit the content via an XML feed and then hope you get some hits. Some further evidence of template use came from when I tried to do a domain lookup of all pages listed in Inktomi from the search-desc.ebay.com domain that appears to be used for paid inclusion content. That lookup found at least 1,000 pages that are targeted at no particular products at all, such as this:
1 NM on eBay
eBay offers great deals and a wide selection on 1 NM and related items.
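The plug-in-the-words pattern behind these listings might look something like the sketch below. It is a guess at the mechanics, not eBay's actual process: the template text is copied from the listings quoted above, while the function name and the dictionary layout are hypothetical. The sketch also stops short of the XML feed itself, since the feed format isn't shown here.

```python
# Template text copied from the listings quoted above; KEYWORDS is the slot.
TITLE_TEMPLATE = "KEYWORDS on eBay"
DESCRIPTION_TEMPLATE = (
    "eBay offers great deals and a wide selection on KEYWORDS and related items!"
)


def fill_template(keywords: str) -> dict[str, str]:
    """Build a title and description by substituting the keywords into the template."""
    return {
        "title": TITLE_TEMPLATE.replace("KEYWORDS", keywords),
        "description": DESCRIPTION_TEMPLATE.replace("KEYWORDS", keywords),
    }


# The template has no way of knowing whether a phrase names a real product.
for phrase in ["Tea Kettle", "Crock Pot", "1 NM"]:
    entry = fill_template(phrase)
    print(entry["title"])
    print(entry["description"])
```

Nothing in the substitution step checks the phrase against real products, which is how a term like "1 NM" can end up with a listing of its own.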
Looking at these pages, my first thought is that they represent an attempt to reverse engineer what pleases Inktomi's ranking algorithm, in situations where off-the-page criteria such as link analysis aren't so much a factor. You might submit thousands of different pages, for instance, to see which ones seem to perform best. You'd then use that particular template for your "real" products.
Another reason this could happen is that eBay might be using software to analyze what content it has, by reading various auction listings. If it sees certain "words," the software might think it needs to make a page for them. This could be an example of that:
1 EA on eBay
eBay offers great deals and a wide selection on 1 EA and related items!
The "1 EA" page might have come out of the fact that many products on the site get listed with these words near them, to indicate "1 each." Software to automatically generate an XML feed may have misinterpreted some of these abbreviations and created doorway templates for non-existent "products."
In either case, the system of submitting information according to a template is a classic doorway page tactic. Before paid inclusion, such content would have been considered spam by most, if not all, the major crawlers. Today, this can be acceptable at Inktomi.
Paid Inclusion Customers Not Doing Wrong
It's important to stress that I'm not attaching any wrong-doing to eBay or any company that may be running a paid inclusion campaign on their behalf. The same is true for the other companies that I've named, where they rank well with paid inclusion content. Paid inclusion is a process that Inktomi closely monitors. What they've done has been done with approval.
"With paid content, everything that comes into the index is subject to editorial input, and for paid content, there's more stringent requirements than for unpaid content," said Ken Norton, Inktomi's director of product strategy.
As for the eBay content, it is deemed acceptable. Inktomi feels eBay has no other way for some of its content to be included, since it is dynamic. Receiving doorway style content may not be the ideal solution, but since the feed pages are relevant to what the user ultimately sees, Inktomi currently allows it.
"What we have with eBay is a reasonable solution. We take you there if it's appropriate to be there," said Dennis Buchheim, Inktomi's director of search marketing solutions. "Is that the end all be all with eBay? Probably not. We absolutely are always talking with eBay on ways to get better and better content. That said, those [paid inclusion] search pages are still valuable."
In short, while Inktomi admits the situation of how it receives eBay's content (and perhaps that from others) isn't ideal, it meets the "no better way" criteria, as Norton describes it. It's allowed, because Inktomi feels it has no better way to get content it deems important to users into the index.
"Not showing them eBay content when they are the number one auction site on the web, that's worse," Norton said.
It's also allowed because eBay's a paying customer, of course. That's not said cynically. It's Inktomi's search engine, and it has the right to set its own rules regarding content acceptability. If it wants to provide this type of service to paying customers, then that's fine.
What The Guidelines Say: Cloaking
Inktomi maintains Content Guidelines that apply to everything in its index. Among the things forbidden by the guidelines are "Pages that give the search engine a different page than the public sees (cloaking)."
As I've already explained, the Index Connect program's XML feeds do allow the search engine to see something different than what the public sees, so to me, the feeds do go against the policy forbidding cloaking. Inktomi agrees.
"We definitely agree that as the definition of cloaking is written today, XML feeds violate that. By the letter of the definition, that's true. However, really cloaking for us is something done deceptively, a page that is given to a search engine that's meant to deceive the search engine about the content behind it," Norton said.
Given this, Inktomi now says it will be revising its guidelines about cloaking to add the concept of deception. Unfortunately, I couldn't get a good answer about what "deceptive" means.
For instance, is it "deceptive" to show Inktomi a doorway-style page outside of its paid inclusion programs, if those pages ultimately lead to a relevant page? In other words, say I make a doorway page about "tea kettles." I cloak that page, so that Inktomi has a version I hope will rank well but users see a version that looks good for them. They aren't "deceived" because the page they see really is about tea kettles. However, is Inktomi "deceived" because it didn't knowingly approve of my cloaking?
I honestly don't know. Inktomi and I went around the issue several times, and I kept getting different answers. For instance, Buchheim said, "We do everything we can to remove unpaid and paid cloaked content," but then added, "We are constantly working with content providers that claim that they can only give this [cloaked content]."
The most consistent answer seemed to follow the "no other way" rule.
"If it's the only way to get to the content, that's something the editorial team has been allowing," said Buchheim.
My advice would be that until Inktomi clarifies things, expect that cloaking unpaid content is generally unacceptable, while cloaking paid HTML content may be acceptable if you can prove a need to do it. XML feeds inherently cloak content, and these are acceptable.
What The Guidelines Say: Doorway Pages
The content guidelines provide this advice about what's not acceptable, in terms of doorway pages:
- Pages in great quantity, automatically generated or of little value
- Pages built primarily for the search engines
In the eBay templates shown above, it's hard to say that these are not pages built in great quantity. They have little value in terms of content themselves -- it is the pages they redirect to that have the content. Finally, they've clearly been built primarily for the search engine.
Inktomi further rules against doorway pages from its Content Guidelines FAQ page:
"Q: I want to be found under a wide variety of keywords. Shouldn't I create a doorway for every keyword?"
"A: Not at all. Inktomi's Web search is not keyword or keyphrase based, like some pay-for-keyword "search engines". It uses a sophisticated algorithm to match search terms to relevant pages. One well designed page can do more good than any number of doorways. This is part of why Inktomi does not want doorways."
Inktomi didn't say whether these doorway page guidelines were going to be changed or expanded, as is the case with the cloaking guidelines. However, I think it's abundantly clear that if the situation is right, the rules on doorway pages don't apply in the paid inclusion program. If you aren't in that program, then I'd expect close scrutiny and perhaps banning of doorway content.
Impact On Ranking
The hallmark of paid inclusion with all the search engines offering it is that it is not supposed to influence rankings. Inktomi itself makes this clear on the FAQ page mentioned earlier:
"None of Inktomi's paid inclusion products have anything to do with ranking. They simply provide inclusion; ranking occurs as it would if the pages were not paid."
At the end of last year, I found that similar words offered by AltaVista weren't holding up. It was fairly easy for me to demonstrate that some paid inclusion content from major customers was getting a boost, something AltaVista admitted and explained as being due to poor index blending.
At Inktomi, I found such apparent boosting to be most consistent for Amazon and eBay. On many searches I tried, I often found their paid inclusion URLs in the top ten listings via HotBot. Some of these include:
- mammals (Amazon)
- perfume (Amazon)
- powershot s200 (Amazon)
- dvd players (eBay)
- fishing rods (eBay)
- guitar straps (eBay)
- pentium iii (eBay)
- bubble gum (Both)
- six feet under (Both)
- oliver stone (Both)
Unlike my review of AltaVista, it seemed more often the case at Inktomi that only one or two paid inclusion URLs were present, rather than several. That was the situation with all the queries shown above.
In addition, it's important to note that the presence of these paid inclusion URLs does not mean that the results overall were somehow irrelevant to the query. Plenty of relevant content was present, and in all cases, the paid inclusion content was somehow also relevant to the query.
Paid Inclusion Prominence Said To Be Rare
Inktomi said it calculates that, across all the queries it serves, a single paid inclusion URL will appear only half the time. In other words, if you search and get a page with one paid inclusion URL out of 10 results listed, chances are that the next search served by Inktomi won't have any at all. Two searches, 20 results in all, only one paid URL -- 5 percent of the results listed.
Did you get a page with two paid inclusion URLs on it? Then the next three searches served by Inktomi won't have any at all. Four searches -- 40 results in all and only two paid URLs -- again, 5 percent of the results listed.
That's at least how I understood the situation works, on average. If not, I'll clarify it here with an update. The main point is that pages aren't typically loaded heavily with paid inclusion URLs, especially when you look across the entire query spectrum, Inktomi said.
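The arithmetic in those two examples can be checked with a short sketch. It assumes 10 results per page, as in the examples above, and the function name is my own. It only restates the averaging as I understood it and says nothing about how Inktomi itself measures this.

```python
RESULTS_PER_PAGE = 10  # assumption: each search returns 10 listed results


def paid_percentage(paid_per_search: list[int],
                    results_per_page: int = RESULTS_PER_PAGE) -> float:
    """Share of paid inclusion URLs across a run of searches, as a percentage."""
    if not paid_per_search:
        raise ValueError("need at least one search")
    total_results = results_per_page * len(paid_per_search)
    return 100.0 * sum(paid_per_search) / total_results


# One paid URL on the first page, none on the next: 1 of 20 results.
print(paid_percentage([1, 0]))
# Two paid URLs on the first page, none on the next three: 2 of 40 results.
print(paid_percentage([2, 0, 0, 0]))
```

Both runs come out at 5 percent, matching the figures above.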
Link Flux Helps Amazon & eBay
Inktomi reiterated that it does not boost paid inclusion content, but then how could Amazon and eBay do so incredibly well? Inktomi says it's because the content is associated with popular web sites. The popularity of the host domain -- really the site's home page -- essentially gets transferred to some degree to the site's paid inclusion XML content.
Let's put this in perspective using Google's PageRank system. Inktomi doesn't use Google PageRank, but I think it will be the easiest way to explain things.
eBay is a popular site, with its home page getting a PageRank 8 score as measured by the Google Toolbar. Lots of people link to the eBay home page, so that helps its popularity. But go into the site, say to the Dolls & Bears category, and the PR score drops to 6. Fewer people link to this inside page, so it's not as popular.
Now imagine you try to find a page that doesn't even exist at eBay, such as http://www.ebay.com/afddafafsd. That gets you an error, yet Google gives this page -- which doesn't exist -- a PR score of 7! What's happening is that Google is trying to provide some guidance. It sees this is a page "inside" the eBay site but at the top level, so it subtracts 1 point from the eBay.com home page PR score of 8. If the page were buried one level in the site, another point would come off and so on. This non-existent page, http://www.ebay.com/level1/level2/level3/level4/level5, is five levels down -- so five off of PR8 gets its score: PR3.
Google makes the above estimates only for its toolbar users, not for how it actually ranks web pages. Indeed, if Google hasn't indexed a page directly or seen a link to a page, it won't list that page at all.
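The toolbar guesswork described above amounts to simple subtraction, which can be sketched as follows. This is my reading of the observed behavior, not documented Google logic: the function name is invented, and flooring the result at zero is an added assumption.

```python
from urllib.parse import urlparse


def estimated_toolbar_score(url: str, home_score: int) -> int:
    """Home page score minus one point per path level (floor of zero is an assumption)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return max(home_score - len(segments), 0)


# One level below the home page: 8 - 1
print(estimated_toolbar_score("http://www.ebay.com/afddafafsd", 8))
# Five levels below the home page: 8 - 5
print(estimated_toolbar_score(
    "http://www.ebay.com/level1/level2/level3/level4/level5", 8))
```

The two calls print 7 and 3, the same scores the toolbar showed for the non-existent pages.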
Inktomi's in a different situation. Amazon and eBay, like other XML partners, are sending it content for pages that may not exist on their sites until you click on a link. For instance, your click might generate a search that creates a page, on the fly.
Since these pages don't normally exist, Inktomi wants to somehow give them some type of link popularity credit, so they can better compete with "real" pages. It does this by giving the pages a proportion of the host site's overall importance. Since Amazon and eBay are among the most popular sites on the web, this helps their inside paid inclusion pages immensely in ranking.
Given this, XML feeds offer a potential boost to anyone with inside pages that aren't currently ranking well. By making use of the feed, it may be that your inside pages will "inherit" more importance from your domain than might otherwise happen normally.
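To see why inheriting "a proportion" favors the biggest sites, consider this toy calculation. Everything in it is hypothetical: Inktomi didn't disclose a formula, so the importance scores, the 0.5 proportion and the function name are invented purely to illustrate the idea.

```python
def inherited_importance(host_importance: float, proportion: float) -> float:
    """Credit a feed page with a fixed fraction of its host site's importance."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return host_importance * proportion


# Hypothetical scores: a very popular host versus a modest one.
PROPORTION = 0.5  # invented value, not an Inktomi figure
for name, score in [("popular host", 8.0), ("modest host", 4.0)]:
    print(name, inherited_importance(score, PROPORTION))
```

With the same proportion applied to both, the feed page on the popular host starts with twice the credit of the one on the modest host.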
What's the situation with other search engines offering paid inclusion? It's difficult to know, since you can't always identify what's a paid inclusion URL. I've already done a close look at AltaVista in the past. Revisiting it, using the queries itemized above, I found that the presence of paid inclusion does seem to be less than when I last tested. However, it was interesting to see how well BizRate did. Here's a rundown on URLs that seem to be in the AltaVista paid inclusion program:
- powershot s200 (eBay, maybe also BizRate and NexTag)
- dvd players (BizRate)
- guitar straps (BizRate)
- pentium iii (BizRate)
- bubble gum (Maybe NexTag)
- six feet under (Maybe NexTag)
- other queries: no paid inclusion URLs identified
At AllTheWeb, it was exceedingly difficult to tell whether a URL was in a paid inclusion program or not. As best I could tell, paid inclusion URLs really only seemed to come up for "powershot s200" for DealTime and probably also for NexTag. At Teoma, no apparent paid inclusion URLs were spotted at all.
It should be noted that both AllTheWeb and Teoma carry much less paid inclusion content than Inktomi and AltaVista. If their content increases, it may be that the presence of paid inclusion URLs will rise.
As for Google, it doesn't offer paid inclusion. Nevertheless, you will find commercial web sites listed in its editorial results. For instance, a search for "powershot s200" will bring up a page in Google that leads to content at Amazon that's similar to the Amazon page listed for the same search with Inktomi. The difference is that Amazon seems to come up with far less consistency at Google, and there's of course no business partnership between Google and Amazon that leaves you wondering if behind-the-scenes boosting is going on.
This also brings me back to an inconsistency from Inktomi. The Amazon page it lists comes via XML and uses content that is different from the page the user sees. You can tell this because the title and description don't match. Inktomi says this is necessary under the "no other way" criteria, because it can't gather that page otherwise. Yet Google manages to get essentially the same information through normal crawling.
Overall, Inktomi's paid inclusion programs do provide some key benefits to those with certain types of content and give those in them greater flexibility in dealing with Inktomi's content guidelines. You'll still have to be relevant for the queries you wish to target, but you appear to be able to target those queries with content in a way that wouldn't be allowed through the normal "free" crawling.
As said, this is Inktomi's right to do. It's also hard to show user harm. So eBay ranked well for "tea kettles." When you clicked on the link, you did indeed arrive at a page about tea kettles at eBay. There's no question that the page was relevant for those terms.
Indeed, it's very important to note that I made no close look to measure the overall relevancy of the pages I viewed from Inktomi as compared to pages from its competitors. It may be that on some queries, even with a high amount of paid inclusion URLs, the results listed might be better than when compared to pure "free" URLs. The opposite could also be true.