THE SEARCH ENGINE UPDATE
November 3, 2000 - Number 88
By Danny Sullivan
Editor, Search Engine Watch
Copyright (c) 2000 internet.com corporation
About The Update
The Search Engine Update is a twice-monthly update of search engine news. It is available only to Search Engine Watch "site subscribers." Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ Site News
+ Conference News
+ Paid Inclusion At Search Engines Gains Ground
+ Inktomi Debuts Self-Serve Paid Inclusion
+ eLuminator Brings Content To Light In Search Engines
+ MSN Search Releases New Version
+ Google & FAST Move Up In Size
+ New Go Site Goes Up
+ GoTo Live At AOL, Enters The UK, To Appear At Lycos & HotBot
+ Lookin' For Liv In All The Wrong Places
+ Yahoo Publishes Top Searches
+ New Search Engine Sites & Resources
+ Interesting Search Engine Articles
+ List Info (Subscribing/Unsubscribing)
This issue is heavy on stories about paid listings and paid inclusion in search engines. I apologize for that, but it's a sign of how rapidly the search engine landscape is changing. Whether you operate a web site or simply want to search the web better, you'll need to be aware of the revolutionary business moves that are taking place. I hope these articles will help keep you informed.
Along these lines, the Pay For Placement page in the site has received a major update. It's always had articles pertaining to the issue of selling search engine results, but now there's a chart (how I love charts!) that summarizes major non-banner advertising programs that are in place at various services. It can be found via the What's New Page, below.
I've also given the ever-popular Search Engines and Dynamic Pages article a slight update, including adding links to products that can help those with ASP pages or Lotus Domino pages that search engines ignore. In addition, I've updated and expanded the Search Engine Display Chart, which shows how various search engines form the information they display about pages in their results. Both can be found via the Subscribers-Only What's New page, below.
Subscribers-Only What's New
The Search Engine Strategies conference is next week! It will be in Dallas, on November 9. I'll be presenting and moderating sessions at the conference, which features experts on search engine marketing issues and panelists from the various major search engines themselves. Services participating in the panels include About.com, AltaVista, Go, Google, Inktomi, LookSmart, NBCi, Netscape/The Open Directory and RealNames. In addition to the main track about search engine marketing issues, new concurrent sessions will cover making your own site searchable for visitors, creating a vertical search engine and coping with spider activity. There's also a series of roundtable discussions that will cover advanced search engine marketing issues. An agenda and details about the conference, for attendees or potential sponsors and exhibitors, can be found via the URL below.
Search Engine Strategies 2000 - Dallas
Paid Inclusion At Search Engines Gains Ground
Previously, I've written generally about the many new ways in which search engines are trying to earn money from their search results. This month, we'll take a closer look at one particular method, "paid inclusion," which is now in place at Ask Jeeves, LookSmart and Inktomi.
In pay for inclusion, site owners pay money to guarantee that they will be included in a search engine's listings in greater depth than might ordinarily occur. Paid inclusion does not guarantee that pages will be ranked well for particular search terms. However, sites enrolling in paid inclusion programs are likely to receive greater traffic than those that don't.
To understand this, let's liken search engines to a lottery. When someone searches, it's almost as if a search engine spins a big barrel full of millions of listings, to determine which listings will come up first in its results. In a lottery, the more tickets you have, the more likely it is you'll win something. Similarly, with search engines, the more listings you have, the more likely you'll rank well for various searches.
I can't stress enough that paid inclusion is not equivalent to paid placement programs that guarantee positions. For example, if you want to be number one for a particular search term at GoTo.com, which is a paid placement search engine, you simply agree to pay more money than any other advertiser for the term. In contrast, advertisers are not guaranteed a particular position in paid inclusion systems. They are simply sold more tickets in the search engine lottery, so to speak. That means they may win more often in the ranking game than might ordinarily happen, but they would still need to satisfy all the normal editorial criteria to do so.
Paid Inclusion At Ask Jeeves
Ask Jeeves has experimented with paid inclusion over the past several months, and now it's a standard part of its advertising offerings called "Answer Link." It works two ways. The editorial staff, after creating listings in its usual fashion, may then suggest to the advertising staff that a particular site might be a good prospect for a paid inclusion deal. The advertising staff would then follow up to see if a deal could be signed. Alternatively, an advertiser might approach Ask Jeeves about participating in the program. The editorial staff would then review the site, suggest some types of questions it might be useful in answering, and the advertising side would then complete the deal.
In either case, pages from the partner's site eventually appear as answers to the Ask Jeeves questions that appear at the top of its search results page, in the "I have found answers to the following questions" section. However, the partner can't control exactly what questions it will appear for, Ask Jeeves says. Instead, paid inclusion links will only appear if the search engine's normal ranking systems deem them relevant. In return, Ask Jeeves is paid based on the amount of traffic it sends to the partner.
Ask Jeeves stresses that only sites that provide quality information will appear in its results, whether there is a paid partnership or not. It also points out that only about 5 percent of its knowledgebase is made up of paid inclusion answers, and that it does not plan to greatly increase this percentage. But doesn't favoring some sites in response to money penalize other good ones? Yes, Ask Jeeves responds -- but that doesn't mean the users themselves are penalized.
"For example, say we point to a site that lists stock quotes. A stock quote is a basic piece of information, so if one site has more in depth stock information and quality financial content than another, then we are not opposed to approaching that site to work with us in a paid partnership," said Jonathan Silverman, product manager of Ask.com.
To see paid inclusion in action, try a search for "what should my blood pressure be?" The top link that appears leads to a page from OnHealth, an Ask Jeeves advertiser and one of about 20 "basic knowledge" providers that also include companies such as Ticketmaster-CitySearch, Verizon, GE Financial, ImproveNet, and AllBusiness. Ask Jeeves also has several hundred advertisers for ecommerce topics, such as Sears, Best Buy, Lands' End and Garden.com.
The listing has no disclosure that an advertiser is benefiting from it. Ask Jeeves says during recent design testing, flagging paid inclusion links wasn't seen as useful. The company also feels that since such paid inclusion listings will only appear if they meet editorial standards, there's no need to call them out from non-paid listings. However, Ask Jeeves has added a link to its home page called "Editorial Guidelines for Answers" that explains how some answers come from paid partners and that these also meet editorial and ecommerce guidelines.
This is good, but I still think it would help if there was some type of symbol integrated alongside the listings themselves, just to better advise those users who are sensitive about paid programs. Nor should this be exclusive to Ask Jeeves. The entire search engine industry ought to be considering some standard way of labeling such material, so consumers of information have a clear idea of what relationships may be involved.
Considering signing up? If so, Ask Jeeves will review your site and give you an estimate of the traffic it would expect to send to you over six months. You'll then pay anywhere from five cents to as high as US $1 per click, depending on the perceived value of the search queries you may be relevant for and the overall amount of traffic expected to be generated. Obviously, you'll need to have quality content in order to participate. It is also important not to use frames, so that Ask Jeeves can deliver people directly to the most relevant portion of your site.
What happens if you are already listed in the Ask Jeeves knowledgebase for free and get approached about a paid inclusion deal? Will you suddenly be replaced with someone else? Obviously, this could happen, but Ask Jeeves says that good sites will continue to be listed even if they choose not to pay.
"Understandably, some of those companies [approached] have come back to us and said, 'We have always received traffic from Ask.com for free simply because we are a quality site, and we suspect Ask Jeeves will continue to direct users our way, whether we have an advertising relationship or not,'" Silverman said.
Another thing to consider is that in the future, Ask Jeeves may integrate answers from its site search customers into its web-wide results. For example, you can go to some sites such as Datek and search just those sites using their own Ask Jeeves-powered search engine. Ask Jeeves may then take those answers, and for a fee, merge them into its web search service.
In this way, someone searching for something specific to that company might be directed to a page within the company's site, since Ask Jeeves has already cataloged that answer as part of its site search work. This gives Ask Jeeves an obvious plus to those considering site search solutions -- you might be able to generate traffic to your site as part of the package.
Paid Inclusion At LookSmart
Over at LookSmart, much attention has been focused recently on the relatively new change where all commercial web sites must pay a fee in order to be considered for listing in the directory. However, LookSmart's "Subsite" paid inclusion program goes far beyond this.
Typically, most web sites might find their home page listed in one or two categories at LookSmart. Large web sites might have further classification, with a few key inside sections listed in appropriate categories. Under Subsites, LookSmart editors do a deep review of a web site, categorizing individual pages they find with suitable content throughout the directory. Ultimately, a site could end up with over a hundred different listings, if not more. The result is that the site will appear in response to a far greater range of queries than if only its home page was listed.
Paid inclusion at LookSmart has similarities to the system at Ask Jeeves. At the request of LookSmart's advertising department, LookSmart editors will review a site and determine appropriate places they feel the site's content could be listed. Only content meeting regular editorial standards is said to be included, and listings aren't guaranteed to appear in response to a particular search. This is especially true for LookSmart, given that it can't control how its many partners rank the information it provides. In return for this work, LookSmart receives a per click fee for each visitor it sends to sites in the program.
While there are concerns that users might miss out finding sites that don't pay, LookSmart notes that its editors spend a significant amount of time searching the web for new sites to add to the directory, independent of its paid inclusion and submission systems. Additionally, the company says its recent acquisition of the Zeal.com community directory is also expected to help it ensure broad representation of valuable not-for-profit and community sites.
LookSmart doesn't restrict the Subsite program to one company in a particular business. Anyone who wishes to pay can be deeply indexed, regardless of whether their competitor is already in the program. "We're wide open, so long as the links meet our editorial standards," said Scott Stanford, LookSmart's vice president of listing services and ecommerce. "We have not signed any deals or exclusives."
The Subsite program was publicly launched in September, with mySimon named as the first advertiser. However, the program has been in beta testing since April of this year, and now approximately 20 large sites are represented, including eBay, through a deal just signed this week.
Try a search for "downloadable software" at LookSmart, and you'll see an example of a paid inclusion link from mySimon that appears at the top of the "Reviewed Web Sites" section. You'll also see that, as with Ask Jeeves, there's no disclosure -- nor do any of LookSmart's partners such as MSN Search, iWon, Excite and AltaVista make any type of disclosure next to the LookSmart Subsite listings they display.
If you are thinking about trying the Subsite program, as with Ask Jeeves, you'll want a site with lots of solid content. Pages of affiliate links aren't going to fly, and sites built out of frames can't participate, since there's no way to link directly to the most relevant content.
There's a setup fee for each listing you receive (discounted from the regular $199 submission fee, based on volume of listings), and you'll also be charged by the click for each visitor sent to your site. The per click fee covers making changes to your links, such as if you move a URL or change the products or services a link describes. This per click fee starts at 35 cents and goes up, depending on the type of site you have, the number of links that are approved, and the perceived value of the traffic you are likely to receive.
LookSmart will provide a rough traffic estimate, so that you can budget for your Subsite fees. Should you withdraw from the program, you'll lose your deep links and instead likely fall back to having only your home page categorized and maybe up to four other pages from your site, the maximum LookSmart now allows through its standard submission program.
Paid Inclusion At Inktomi
Inktomi's paid inclusion program has only recently gone live through a partnership announced in September with MediaDNA and another announced just this week with Position Technologies, so expect to see it evolve as the program matures and as new partners are announced. FYI, Inktomi's first partnership, announced in July with Network Solutions, isn't expected to go live until January.
At its core, paid inclusion with Inktomi means that site owners pay to be guaranteed that the web pages they select are included in its crawler-based listings and that these pages will be reindexed every 48 hours. See the "Inktomi Debuts Self-Serve Paid Inclusion" article below for more specifics about the service.
That's the end of the guarantees. As with the other paid inclusion programs described, there is no assurance that pages will appear highly ranked for any particular search.
Given that Inktomi crawls the web, its pay for inclusion model is potentially more worrisome to searchers than the ones run by Ask Jeeves and LookSmart. Human-powered directories, by their very nature, have never been inclusive of everything on the web. That's why major search sites using directory information typically back this up with crawler-based results. If the human editors haven't categorized something, then the crawler provides a fall-through.
Because of this, a crawler is automatically expected to be inclusive of everything. Indeed, the reason stories about search engine sizes have continued to attract so much press is that the general public may naturally assume that a crawler will find everything on the web. That's never been, nor, dare I say, ever will be the case. Nevertheless, until the Inktomi announcement, we've also never had a major crawler say that some sites might be more deeply crawled for reasons other than feeling there was essential content that should be listed.
So what's going to happen as the Inktomi program progresses? Will there be a general degradation of freshness and crawl depth, in order to make the paid inclusion model more attractive to site owners? Definitely not, Inktomi says. Paid inclusion is mainly a way it sees for site owners to share the cost of getting people to their content, plus it makes it possible for Inktomi to list new content from hard-to-crawl sites that it's never carried before. Rather than a replacement for its regular crawling, paid inclusion is seen as an additional, supplementary system.
"We're going to continue crawling the web much as we have, using the same kind of popularity analysis to build the bulk of our index," said Troy Toman, general manager of Inktomi's search solutions division. "We're not on a path where we'll say we're going to remove every site in our index unless they pay. It's really to go more after sites that would wish to be better represented in our index or people who want more timely information from their site made available."
Among the other crawlers, Go says it is readying a paid inclusion system with its own spidered results that may be unveiled in December. Additionally, AltaVista is still determining what services it intends to market to webmasters. Paid inclusion could be one of these, though AltaVista is sending out signals that its first product may allow site owners to enhance their listings with highlighting or even pictures. Notably, Google says it has no such plans for a paid inclusion system, at the moment.
"I think there are some significant philosophical issues," said Google president Sergey Brin. "If someone searches for cancer, and there's a really good cancer site out there, what if you don't have the answer they are looking for because that particular site didn't pay to be in there."
There are certainly some pluses that paid inclusion can offer. It's not unreasonable to expect that extremely large sites might help pick up the costs of making their content available, especially if that in turn helps the searching public in general. Such programs can also make content that's currently unreachable to spiders, such as content locked in databases or behind firewalls, easily accessible for the first time. Certainly such programs offer the opportunity for companies to interact with crawlers on a more formal basis, rather than being kept unproductively at arm's length.
Nevertheless, while paid inclusion programs may impact the editorial quality of search results far less than paid placement, paid inclusion still raises concerns -- whether run through crawler-based search engines like Inktomi or human-powered services such as Ask Jeeves and LookSmart. The test will be whether these pioneers can prove over time that their programs don't hurt the search experience -- or better, actually do improve it.
Ask Jeeves Editorial Guidelines
This page, linked from the Ask Jeeves home page, explains to users how paid partnerships may have a role in the answers Ask Jeeves provides.
Ask Jeeves Advertising Programs
Describes the Answer Link program more via a downloadable PDF file. Use the contact form on this page to reach the sales department, if you wish to participate in the program.
LookSmart Advertising Contact Form
Use the contact form on this page to reach the LookSmart ad sales team if you wish to enroll in the Subsite program.
LookSmart Earnings Looking Strong
smallcapcenter.com, Nov. 2, 2000
LookSmart's paid submission and paid inclusion programs are a big reason behind its better than expected earnings, which were recently announced. Listing revenues have gone from $600,000 in the first quarter, to $1.6 million in the second, then to $3.3 million in the third quarter just announced. Last year, they were non-existent.
Inktomi Debuts Self-Serve Paid Inclusion
The Search Engine Update, Nov. 3, 2000
More about Inktomi's paid inclusion program and partners can be found here.
Monetizing The Search
The Search Engine Report, Sept. 4, 2000
Goes into a range of other programs search engines are experimenting with to earn money and touches on issues relating to pay for inclusion.
Pay For Placement?
Paid inclusion and even paid placement can solve some problems that have plagued search engines. Articles on this page go into more depth about the issues, pro and con. A new chart has also been added as a guide to where significant non-banner advertising components are showing up at different search engines.
Inktomi Debuts Self-Serve Paid Inclusion
In a landmark move, Inktomi announced a new partnership with Position Technologies this week that allows site owners to pay for guaranteed inclusion in the Inktomi index. While paid submission (or "pay for consideration") systems have operated at human-powered Yahoo and LookSmart for some time, we've never had a major crawler-based service offer something similar. In addition, paid inclusion means that pages will absolutely be guaranteed to be listed, while in paid submission systems, no such assurances are offered.
"The biggest challenge web site owners face is getting found on the web," said Troy Toman, general manager of Inktomi's search solutions division. "Inktomi Search/Submit is the first time we're allowing Web site owners to publish directly into the Inktomi index, helping them get found faster and stay up-to-date in the database."
The program promises to add any URL submitted to the Inktomi index within 2 days and keep revisiting the URL at that frequency for up to a year. In return, site owners pay on a sliding scale: US $20 for the first URL, $10 for the next 2 to 100 URLs and $6 for any URLs over 100. For a 150 page web site, the costs work out like this:
Page 1: $20
Pages 2-100: $990 (99 pages @ $10 each)
Pages 101-150: $300 (50 pages @ $6 each)
Total: $1,310 per year
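For those who want to price out a different site size, the sliding scale above can be sketched in a few lines of Python. This is only an illustration of the arithmetic quoted in this article; the function name is my own, and Inktomi says the pricing itself may change.

```python
def inktomi_annual_cost(num_urls: int) -> int:
    """Annual cost in US dollars under the sliding scale quoted above.

    Hypothetical helper for illustration only: $20 for the first URL,
    $10 each for URLs 2 through 100, and $6 each for URLs over 100.
    """
    if num_urls < 0:
        raise ValueError("num_urls must not be negative")
    first = min(num_urls, 1)                 # URL 1
    middle = min(max(num_urls - 1, 0), 99)   # URLs 2-100
    rest = max(num_urls - 100, 0)            # URLs 101 and up
    return first * 20 + middle * 10 + rest * 6


if __name__ == "__main__":
    for pages in (1, 50, 150):
        print(f"{pages} pages: ${inktomi_annual_cost(pages):,} per year")
```

By the same arithmetic, a 50 page site works out to $510 per year.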
The pricing is more expensive than I would have expected. A few dollars per URL, sliding down into less than a dollar for large orders, feels fairer. It's one thing to ask web site owners to help share in the cost of indexing the web, but this is a huge jump for even small sites of 50 pages. In addition, the pricing feels high in contrast to fees charged by Yahoo and LookSmart. At those services, $199 essentially purchases you a listing (if approved) for life, or at least several years. Inktomi charges on an annual basis. True, it has to keep revisiting the pages, but as a crawler-based service, that's something it was always supposed to do. However, the pricing could also change.
"The paid submission program was designed for Web site owners to submit the most relevant pages for entry to their site, not all of their content," said Toman. "This pricing structure is an introductory offer with the new service and we will be adjusting it over the coming months to determine the most equitable price point."
Inktomi says that the new paid inclusion system won't alter the normal crawling that it does. Pages from across the web will continue to be listed for free, especially those that Inktomi deems to be popular or which seem to satisfy queries.
Free Add URL pages that feed into Inktomi, such as the one offered by HotBot, are eventually to be phased out. Inktomi finds that the vast majority of submissions sent via these pages are spam. Other crawler-based services also report the same problem. A small number of people submit an incredibly large number of poor-quality URLs through these pages. By eliminating them and moving to a paid inclusion model, Inktomi (and others) could arguably do a better job of indexing the web.
Such a shutdown won't occur until Inktomi can establish a means for non-commercial sites to submit to the search engine. "We want to make sure we have adequate non-fee submission mechanisms for them," said Troy Toman, general manager of Inktomi's search solutions division. "Our goal is not to charge anybody and everybody. We want submissions from non-profits and non-commercial sites on the web, and we'll wait until those are covered."
Finally, while pages are guaranteed to be listed, there's no guarantee that they will rank well for particular searches. Inktomi will continue to apply its regular relevancy algorithms to paid inclusion pages, just as it would to those it lists for free.
To help you further understand the new program, I thought some Q&A style answers would be helpful. After them, you'll find links to more information about the Inktomi program.
Q. Why would I pay Inktomi to include my pages?
People will have various reasons. Some realize that the more pages they have listed, the more likely they'll generate traffic. If you can earn money off that traffic, then paying for inclusion may be worthwhile.
Others may wish to ensure that Inktomi has the freshest information from their web sites. While Inktomi does try to spider rapidly changing sites more frequently than others, the company still gets complaints from webmasters about stale listings. Inktomi says paid inclusion is the only way it will be able to meet the demand for greater freshness.
"It's one thing to have the technology that knows how to do that [detect changes] and it's another thing to build that out for the entire web," said Troy Toman, general manager of Inktomi's search solutions division. "This mechanism allows us to be able to afford to get fresher content."
Q. Why do some of my pages disappear from Inktomi currently? Is this just to get me to sign up for paid inclusion?
Inktomi has operated a system of "trialing" new submissions for over a year. Basically, that means a new page that's picked up may stay in the index for a short time, such as three weeks. By using clickthrough measurements, Inktomi can tell if that page is satisfying queries. If no clicks are detected for a particular listing, then the listing is never being selected by users. In that case, it may be dropped to make room for other pages.
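The trialing idea can be pictured with a rough Python sketch. To be clear, this is not Inktomi's code. The names, the exact three-week window and the rule that paid pages are never dropped are my assumptions, based only on the description above and on the inclusion guarantee.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed trial length; the article only says "a short time, such as three weeks".
TRIAL_PERIOD = timedelta(weeks=3)


@dataclass
class Listing:
    url: str
    added: date
    clicks: int = 0     # clickthroughs recorded from search results
    paid: bool = False  # hypothetical flag for paid inclusion pages


def should_drop(listing: Listing, today: date) -> bool:
    """Return True if a trial listing has drawn no clicks by the end of its trial."""
    if listing.paid:
        # Assumption: paid inclusion pages are guaranteed to stay listed.
        return False
    trial_over = today - listing.added >= TRIAL_PERIOD
    return trial_over and listing.clicks == 0
```

Under these assumptions, a free page added on October 1 with no clicks would become eligible to be dropped on October 22, while the same page with even one click would stay.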
In addition to this, sometimes a spider may have problems accessing a web site. Perhaps there is network traffic. Perhaps the site is down temporarily. Perhaps the spider simply has an internal problem. In any case, it can cause a page to be dropped until the spider can return to visit it. That can take up to a month. With paid inclusion, you are paying to avoid fallout from such problems:
"There are many factors that can exclude a site from being picked up by the spider during a crawl," said Toman. "The paid inclusion program is another tool that web site owners can use to insulate their content against missing a crawl. Web site owners that subscribe to the new service can be confident that their pages will be refreshed every 48 hours."
Q. Inktomi already lists some of my pages. How can I arrange to pay them to add only the ones they don't already list?
You have complete control to submit any URLs you want. The only problem is that there's no way to know which ones Inktomi already feels should naturally be included in its index, even if you don't pay. Given this, you may indeed find you are paying for pages Inktomi might normally list for free.
If you have pages you know generate traffic, then you may want to use the system to guarantee they aren't dropped from the index. Even though you think Inktomi might keep them, it might be worth spending the money to have the reassurance. Position checking tools or log analysis software can help you determine which pages you may wish to protect in this way.
Conversely, you might choose to submit only pages that you've never seen appear in the Inktomi index, trusting that your other pages already listed may be retained.
Q. I have no money and must depend on the free Add URL pages that feed into Inktomi. What's my best strategy?
Inktomi says the best thing to do is to submit to HotBot. You can submit up to 50 pages per day there, and pages MAY appear within three weeks, once submitted. However, the free Add URL situation is very fluid -- it's likely to change in the near future, as Inktomi begins to make changes.
The long term plan is to eliminate free Add URL, except perhaps for non-profit and non-commercial sites. That doesn't mean commercial sites wouldn't get listed if they didn't submit, since Inktomi also crawls the web independently of Add URL. However, it does mean you will probably save yourself time and frustration by recognizing that paid inclusion is the future with Inktomi.
"If you have a commercial reason to need to be in the Inktomi index, it doesn't seem unreasonable that you should have to pay for it," said Toman.
Q. My web site has thousands of URLs. Isn't there a better price I can obtain?
Probably. Inktomi says the pricing it offers through Position Technologies is aimed at those with sites of 1,000 pages or less. For those with larger sites, Inktomi suggests contacting it directly about arranging better bulk pricing.
Q. Will Inktomi really guarantee to list anything I submit?
Inktomi says that it will reject pages it considers spam, but what is spam isn't defined on the paid inclusion signup pages. Typically, the concern is mostly about pages that attempt to mislead users about their content. The signup pages do warn that "illegal" URLs or those infringing copyright will not be accepted and itemize a few other reasons for page removal. Inktomi will also remove pages for "fraudulent use of the service." What's that? Whatever Inktomi decides it wants it to be -- the company has sole discretion.
Q. Are cloaked pages OK?
As usual, this depends on how you explain cloaking. "We still consider cloaking to be spam," was Toman's initial response to this question, but then he soon qualified this. "Cloaking in and of itself is not necessarily spamming. If someone's putting it up because it helps us find the content, we have a tendency to let those lie. However, if you are doing something to the page that's designed to deceive the search engine, that's not OK."
Q. If pages are guaranteed to be listed, won't this make it easier to reverse engineer Inktomi, as was the case a few years ago with Go (Infoseek), when it had an Instant Add URL mechanism?
Perhaps. Inktomi's ranking system also makes use of off-the-page criteria that webmasters cannot control. Nevertheless, the ability to get listed, make changes and then see if the changes let you rise higher certainly makes it easier for those trying to reverse engineer Inktomi's algorithms. Should this prove to be a problem, Inktomi says it may make further changes. "This is sort of the process of discovery when you go in to a new area," Toman said. One idea Inktomi is considering is to allow site owners to also provide meta data about their site, which presumably would be reviewed for accuracy.
Inktomi Paid Inclusion Program
Describes pricing and leads you to an order form
Inktomi Paid Inclusion FAQ
More Q&A directly from Inktomi.
Operated by Position Technologies, Inktomi's new paid inclusion partner, Position Pro has tools especially suited to help those with large web sites optimize their pages and submit to crawler-based search engines. However, even small site owners may want to take a look. An updated review is planned for the next newsletter.
Another Inktomi paid inclusion partner, MediaDNA's eLuminator spiders web sites and creates optimized pages for search engines. The eLuminator product is especially designed for sites that have content locked behind password protected areas, as it can make that content visible to search engines without giving it away for free to unregistered users.
Pay For Inclusion Advances
The Search Engine Update, Nov. 3, 2000
Inktomi isn't the only major search service with a paid inclusion program. This article looks at programs run by Ask Jeeves and LookSmart, plus touches on some issues that paid inclusion raises.
Inktomi To Offer Paid Submit Option
The Search Engine Update, Aug. 2, 2000
Network Solutions was the first paid inclusion partner that Inktomi named, and this article provides more details. The Network Solutions offering isn't expected to go live until January.
eLuminator Brings Content To Light In Search Engines
Call it the "invisible web," "shallow web" or "deep web," the various names refer to the same thing -- content that crawler-based search engines ordinarily cannot access. There are several roadblocks that can stop spiders in their tracks, and one of the worst is making content password-protected. Place your pages in an area that only registered users can access, and search engines will never find them -- nor will those searching at search engines.
It's a shame, because there's plenty of good content that may be offered in registered areas that the general public might like to discover. Enter eLuminator, a product from MediaDNA. eLuminator will duplicate protected content in a way that makes it accessible to search engines without dropping any restrictions required for human visitors before they are allowed to view it.
The system makes use of doorway pages and cloaking -- highly-charged words that raise the hackles of many, and terms that MediaDNA will probably dislike seeing applied to its service. However, the automated system uses them in a way that makes it stand out as a shining example of why such techniques cannot automatically be considered search engine spam. Far from being spam, eLuminator really does help both search engines and their users, and the system is so positive that MediaDNA was selected by Inktomi in September as a partner for its new paid inclusion system.
Let's say you have a stock tips web site with 1,000 pages hidden behind a registration system. eLuminator will read those pages and automatically make a search engine friendly version of each one of them. This is the doorway page concept -- you create a page designed to please the search engines, rather than humans. However, that's as far as that comparison can be made. Doorways are often highly manufactured pages, designed to rank well for a particular term and maybe even a particular search engine. Instead, eLuminator works with existing pages to distill the essence of what they are about, then makes a version that the search engines can read.
For example, eLuminator might read your stock analysis of IBM. It might then extract the headline to be the page title, use the opening paragraph for the page's meta description tag, and make a list of the unique words that appear on the page for the meta keywords tag. The opening paragraph and the keywords would also appear on the page itself. The result is an abstract of what the original page was about, but it's not "human readable," as MediaDNA characterizes it. In other words, there's no sentence structure on the page, no context that makes the page valuable to humans, though spiders have enough information to understand what it's about.
The pages are then placed on a "ghost site," which usually has a domain similar to the site owner's, for branding purposes. For example, if our original site was called "greatstocktips.com," then perhaps our eLuminator pages would go onto "greatstocktips-eLum.com." The pages would then be submitted to the major crawler-based search engines.
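To make the abstract idea concrete, here is a minimal sketch of how such a page might be assembled. It is not MediaDNA's code. The function name, the rule for finding the opening paragraph (the first block of text before a blank line) and the way words are split are all assumptions made for illustration.

```python
import html
import re


def make_abstract(headline, body):
    """Build a spider-oriented abstract page from one article (illustrative only)."""
    # Assumption: paragraphs are separated by blank lines, and the first
    # non-empty one is the "opening paragraph".
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    opening = paragraphs[0] if paragraphs else ""

    # Collect each distinct word once, in order of first appearance.
    keywords = []
    seen = set()
    for word in re.findall(r"[a-z0-9']+", body.lower()):
        if word not in seen:
            seen.add(word)
            keywords.append(word)
    keyword_list = ", ".join(keywords)

    # Headline becomes the title, opening paragraph the meta description,
    # unique words the meta keywords; both also appear in the page body.
    return (
        "<html><head>\n"
        f"<title>{html.escape(headline)}</title>\n"
        f'<meta name="description" content="{html.escape(opening)}">\n'
        f'<meta name="keywords" content="{html.escape(keyword_list)}">\n'
        "</head><body>\n"
        f"<p>{html.escape(opening)}</p>\n"
        f"<p>{html.escape(keyword_list)}</p>\n"
        "</body></html>\n"
    )
```

Feed it a headline and the article text, and you get back one HTML page per article. The keyword list deliberately has no sentence structure, which matches the "not human readable" description above.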
Should one of the pages appear in response to a search, cloaking is used so that the human visitor sees a page designed for them. They'd see a summary of what the page is about and how to access it -- whether that be purchasing the article or simply registering with the site.
eLuminator is working no special magic on the pages to make them rank well. These are not intensively optimized pages, and they might even perform better if the sentence structure was left intact. Nevertheless, the fact that sites might go from having little or no representation in search engines to perhaps thousands or hundreds of thousands of pages listed means that they are more likely to naturally have some pages rank well and see a traffic increase -- perhaps a substantial one. MediaDNA cited one client's experience where 4,500 documents were listed and in turn generated 365,000 clickthroughs from just Inktomi-powered services during a month.
"Because eLuminator allows full-text searching of valuable content, we're finding that the traffic we're driving to clients' sites is better targeted than traditional doorway page traffic," said Larry Vernec, vice president of marketing at MediaDNA.
To date, Inktomi is eLuminator's only official search engine partner. That means you can guarantee that your pages will be listed in the Inktomi index. eLuminator also submits your pages to other crawler-based search engines, but the lack of partnerships means there's no guarantee they'll be included. Nevertheless, you should expect that some will get picked up. MediaDNA is working to establish partnerships with other crawlers.
MediaDNA charges a US $5,000 set-up fee for the eLuminator service, and then you'll pay anywhere between 5 cents and 40 cents for each person that clicks through to your pages. You purchase blocks of clickthroughs in advance and are given a lower per click rate if you buy a large block at a time. To date, eLuminator has about a dozen clients, including ZDNet, Hoover's Media Technologies, Penton Media, McGraw Hill, and channel partner Qpass.
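For a rough sense of the budget involved, the arithmetic can be sketched as below. Only the set-up fee and the 5 to 40 cent range come from MediaDNA. The block size in the test is hypothetical, and the article does not say which block sizes earn which rates, so the rate is simply passed in.

```python
SETUP_FEE_CENTS = 500_000   # US $5,000 set-up fee, from the article
MIN_RATE_CENTS = 5          # lowest quoted per-click rate
MAX_RATE_CENTS = 40         # highest quoted per-click rate


def total_cost_dollars(clickthroughs, rate_cents):
    """Set-up fee plus one prepaid block of clickthroughs, in US dollars."""
    if not MIN_RATE_CENTS <= rate_cents <= MAX_RATE_CENTS:
        raise ValueError("rate is outside the 5 to 40 cent range quoted")
    if clickthroughs < 0:
        raise ValueError("clickthroughs cannot be negative")
    # Work in whole cents to avoid floating-point rounding surprises.
    return (SETUP_FEE_CENTS + clickthroughs * rate_cents) / 100
```

For a hypothetical block of 10,000 clickthroughs, this works out to $5,500 at the lowest rate and $9,000 at the highest.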
Who should use this? If you have the budget, obviously anyone with password-protected content that they would like made accessible to the search engines. eLuminator is also useful to anyone with a site that poses problems to spiders, such as those that use dynamic URLs or frames. It's also useful for anyone with content in non-HTML or text format, such as PDF files. However, having good, text-based content is essential. eLuminator will only be successful if it has such content to work with.
Invisible Web Gets Deeper
The Search Engine Report, Aug. 2, 2000
Invisible Web? Deep Web? Shallow Web? This article explains the concepts in more depth.
MSN Search Releases New Version
MSN Search went live with a new version of its site about a week ago. There have been a variety of minor tweaks and design changes, as well as some substantial changes.
Most important are the new "Popular Search Topics" links that now appear below the search box, after you perform a search. These are suggestions designed to help you easily narrow your request to a particular topic, if your original search was ambiguous. For example, in a search for "saturn," you'll see these options:
+ Saturn Corporation (auto manufacturer)
+ Saturn (planet)
+ Sega Saturn cheats (game hints)
Select a topic, and the search engine will rerun your request focused around that particular topic. However, the real beauty to these is that you're not simply giving the search engine new words to search for, such as "planet saturn," if you were to choose the planet-oriented topic. While those words will appear in the search box, behind the scenes they are mapped to other words that editors at MSN Search believe will bring up the best sites for that topic. Moreover, the editors may have preselected what they believe to be the best sites for that particular query.
That's just one example of the hard work going on at MSN Search to improve the quality of their results. A team of editors closely monitors search logs and provides human intervention where needed to improve the listings. Misspellings are a good example. Consider:
The top spelling is correct, but the two other spellings have been programmed to also bring up results similar to the correct spelling, thus saving thousands of teenage pop fans from the heartache of missing web sites about singer Britney Spears (and who knew there was a Society of Future Husbands of Britney Spears!).
For webmasters, this human intervention is also why you may not understand why certain sites appear in response to particular searches, if the search words don't appear in the description. MSN editors may have manually placed a site into the top results. And, even though the main results come from LookSmart, MSN editors have the ability to rewrite a site's description for when it appears in response to a particular search.
Returning to the Popular Search Topics feature, if there are more than four topics for a particular search, you'll also see a "Show All" link next to the words "Popular Search Topics." Select this link, and you'll be shown all the different topics related to your original search.
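One way to picture the Popular Search Topics feature is as an editor-maintained lookup table. The sketch below is purely illustrative and is not how MSN Search is known to be built. The hidden terms, the example.com URLs and the function names are invented, and cutting the display at four topics is an assumption based on when the "Show All" link appears.

```python
# Hypothetical editor-maintained table. Every hidden term and URL below
# is invented for illustration.
TOPIC_TABLE = {
    "saturn": {
        "Saturn Corporation (auto manufacturer)": {
            "hidden_terms": ["saturn", "cars", "dealers"],
            "editor_picks": ["http://cars.example.com/"],
        },
        "Saturn (planet)": {
            "hidden_terms": ["saturn", "planet", "rings", "astronomy"],
            "editor_picks": ["http://planets.example.com/saturn"],
        },
    },
}

MAX_TOPICS_SHOWN = 4  # assumption: more than four topics adds a "Show All" link


def topics_for(query):
    """Return the topics to display and whether a "Show All" link is needed."""
    topics = list(TOPIC_TABLE.get(query.lower(), {}))
    return topics[:MAX_TOPICS_SHOWN], len(topics) > MAX_TOPICS_SHOWN


def rerun(query, topic):
    """Return the behind-the-scenes terms and any preselected sites for a topic."""
    entry = TOPIC_TABLE[query.lower()][topic]
    return entry["hidden_terms"], entry["editor_picks"]
```

The point of the design is that the words shown in the search box and the words actually searched for are stored separately, so editors can tune one without changing the other.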
After doing a search, you'll see that the results screen has gained some new tabs at the top of the page. The "News Search" tab runs your query against content from MSNBC. "eShop Search" runs a search against listings at the MSN eShop shopping site. "Yellow Pages" brings back geographical business listings from the MSN Yellow Pages site.
The feature that allowed those using Internet Explorer to save results has now been removed. I always thought this was pretty cool, but users generally didn't take advantage of it. Given this, it was dropped to make the results page less cluttered. Similarly, MSN Search has added more white space to make the results visually appealing.
"We put white space to let page breathe," said Philip Carmichael, group program manager for MSN Search. "The results were definitely running together."
Numbers have also been added to each of the major links on the page, a small change, but one MSN Search feels is important.
"We think the numbers are helping because if you go to page two, then come back, it's pretty easy to remember a site's position on the page," Carmichael said.
Finally, should you try a search that MSN Search feels may bring up porn or adult content, you'll see a screen that offers you the option to try your search at NightSurf.com, an adult search engine. Otherwise, you can ask MSN Search to show you what results it has, and these will come from the Inktomi "Web Pages" database, since the LookSmart-powered "Web Directory Sites" listings do not contain adult content.
Google & FAST Move Up In Size
Google announced last week that its size has increased to 602 million pages that have been fully-indexed and to 1.25 billion pages when partially indexed material is included. The announcement came just about a week after FAST Search had gained the title of biggest search engine by increasing its index to 575 million pages. Now Google regains the size crown, but more maneuvering is likely in the coming months. In third place, Inktomi remains at the 500 million mark. Partners accessing full Inktomi information include iWon, HotBot and NBCi, Inktomi says. WebTop.com also claims 500 million pages. All numbers are self-reported.
Search Engine Sizes
Links to past articles about search engine size issues and announcements. I haven't updated the charts yet, but that should happen by Monday, if not sooner (promise!).
New Go Site Goes Up
Having given up its goal of being a portal to please all people, the new Go site was formally unveiled last month. While you may have heard talk about the site being transformed into a starting point for those seeking "freetime" information such as entertainment, leisure and lifestyle content, so far it instead retains features that may appeal to general searchers.
Specialized freetime information may come in the future, but for now, even Go admits that the big push is to emphasize search. "This is a navigation site," said Rajiv Samant, executive vice president and general manager of Go. "We've really tried to make the search site dominant."
If you haven't been to Go since its relaunch, the big change is that Go's human-compiled directory information is now being given top billing over its crawler-based results. That makes sense, given that the Go crawler has not remained competitive with other major crawlers in terms of size and freshness.
Go says it plans to refresh and expand its crawler results soon, but even with an increase from the current 50 million pages to 100 million, you won't be turning to Go for comprehensive web coverage. Nevertheless, Go's directory picks might point you in the right direction for general queries. A team of about 43,000 volunteer "Go Guides" compiles the listings, which are now said to number over 430,000.
When you perform a search, you'll see any matching categories from this directory appearing at the top of the results page, under the heading of "Go Directory." Category links are not always present, but when they are, you can select one and go right into an area where sites on the category topic are listed in order of star quality, with three star sites coming first, two star sites next, then one star sites.
Go will also extract the top sites from the directory that it thinks match your query and present these right on the search results page. These appear in the "Proven Picks" area, just below the category links "module." Or, if there are no category links, then Proven Picks will top the page. After the Proven Picks come "Web Search Results," which are any matches that the Go crawler has found from across the web.
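The ordering rules described above are simple enough to sketch. This is only an illustration of the layout as I've described it, not Go's actual code. The function names and the (name, stars) data shape are assumptions.

```python
def order_category(sites):
    """Sort (name, stars) pairs so three-star sites come first.

    Python's sort is stable, so sites with the same star rating keep
    their original relative order.
    """
    return sorted(sites, key=lambda site: site[1], reverse=True)


def layout(category_links, proven_picks, web_results):
    """Return the results-page sections in the order they are displayed."""
    sections = []
    # Category links are not always present; when absent, Proven Picks
    # tops the page.
    if category_links:
        sections.append(("Go Directory", category_links))
    sections.append(("Proven Picks", proven_picks))
    sections.append(("Web Search Results", web_results))
    return sections
```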
As I mentioned, the crawler results are about to be refreshed. "We are in the final preparation of releasing the new web index," said Andy Bensky, Go's chief scientist. "We haven't released a new one since mid-summer. We're now respidering all the pages and going deeper into some sites."
One of the key factors of whether a site will be deeply spidered will continue to be whether it is listed within the Go Guides directory. "As always, getting your pages listed in Go Guides is the best way to get enhanced coverage and placement in our various indexes and results," Bensky said.
Getting listed is easy to do, plus you'll then have a chance to appear in the main results that Go presents on its search results page, as well as perhaps getting greater representation in the crawler results and a ranking boost there, as well. The article below covers more about submitting to Go Guides.
Of course, there's also the Go Add URL system. Bensky says that URLs are still being accepted by this, but that pages are added only as resources allow. If accepted, they should appear within six weeks. You can submit multiple URLs from the same site, and there's no set daily limit. However, Bensky warned site owners to be sensible with their submissions.
"If they start going overboard, there's more chance they may show up on our radar screen," Bensky said. "Fifty pages a day is a reasonable range, but if they do 50 pages a day every day for a month to submit thousands of pages, that's something we might notice and cause us to look at the differentiation of the pages."
On the horizon is a paid inclusion system that Go expects to release by December. For a fee, still to be determined, a URL would be guaranteed to be added within 2 days and then revisited every 2 days after that. Go is also considering adding an option so that this also submits a site to Go Guides. Instead of having to join Go Guides and submit your site personally, you could use the system to place your site before the volunteer editors, as well as getting it included in the crawler results. These ideas are all tentative, so expect that there may be changes from when the final system is released. I'll keep you updated on developments.
Finally, another new feature you may come across is a "Focus Your Search" page that may show up if you do an ambiguous search. For example, look for "tiger," and Focus Your Search will ask if you mean topics such as:
+ Music artists & bands
+ Detroit Tigers
+ Woods, Tiger
+ Tigers Computer
When you choose a topic, Go reruns your search to bring back results relevant to that area. It will go beyond the actual search words that are displayed, also. For example, select the Tiger Woods topic, and it will bring up directory picks and web pages relevant to Tiger Woods, even though the search query will remain "tiger." Behind the scenes, Go will look at the directory listings relevant to Tiger Woods and use these, as well as some editor adjustments, to bring back relevant results.
Sign-up to be a guide here.
Go Beta Tests User-Assisted Directory
Older article that still covers the basics of submitting to Go Guides.
Go Add URL page
GoTo Live At AOL, Enters The UK, To Appear At Lycos & HotBot
It's been a busy month for GoTo, with its AOL partnership now live, new in-roads to bring its listings to the United Kingdom, and a deal to provide paid links to Lycos and HotBot just approved.
Paid links from GoTo.com are now appearing at AOL Search. The top three bids for any particular term at GoTo will also be displayed for a search for that term at AOL Search. They'll be shown at the top of the page, in a clearly marked and well disclaimed "Sponsored Links" area.
Given this, if you wanted to be in the top AOL Search results for a particular search term, you would open a GoTo account, bid on that term and ensure that you were one of the top three bidders. Note that AOL Search is not displaying descriptions with the GoTo links, so make your GoTo page titles attractive and enticing (but not misleading), in order to encourage clickthrough.
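The selection rule is easy to picture in code. The sketch below is a guess at the logic, not GoTo's or AOL's implementation. The advertiser data and field names are made up, and how ties between equal bids are broken is not something the article covers.

```python
def sponsored_links(bids, slots=3):
    """Pick the titles of the highest bidders for one search term.

    `bids` is a list of dicts with hypothetical "title" and "bid" keys.
    Only titles are returned, since AOL Search shows no descriptions.
    """
    ranked = sorted(bids, key=lambda entry: entry["bid"], reverse=True)
    return [entry["title"] for entry in ranked[:slots]]
```

Bid fourth or lower, and your link simply doesn't make the cut, which is why the title is the only thing you have to win the click with.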
GoTo has also just announced today that its paid links are soon to appear at Lycos and HotBot, through a new three-year deal. They'll most likely be shown in a similar manner as with AOL Search. No ETA on when they'll go live -- expect more news on this, as it comes.
GoTo's UK site has also gone live, but more importantly, GoTo is also pegged to provide results to UK Internet service provider Freeserve, bumping Inktomi out of that partnership (though Inktomi will still provide any unpaid results). It's a real coup for GoTo, because Freeserve is the UK's largest ISP, with 2 million users. The two-year deal is significant in that it gives GoTo's UK advertisers immediate access to the huge audience that performs searches from the Freeserve home page.
It's also significant that paid listings are being dumped on such a large audience without there being any mention of it that I've seen in the UK press. Let's put this in perspective. Even with the changing climate in the US toward paid listings, if America's largest ISP -- AOL -- were suddenly to change so that the bulk of its search results were paid ads, there would certainly be some criticism. The same thing is set to happen in the UK, and not a word is being said.
Meanwhile, GoTo competitor Kanoodle is extending its reach to Europe through a new partnership with Espotting, a UK-based paid placement search engine. Advertisers at either service will be able to run their ads across both North America and Europe, should they desire.
I-Search GoTo UK Special Edition
I-Search, Oct 24, 2000
The decision by GoTo to set minimum bids at 5p (7.5 US cents) and require a minimum opening balance of 25 pounds (US $37.50) has drawn some criticism, as summarized in this special edition of the I-Search mailing list.
GoTo to Change Name, Rebrand
InternetNews.com, Oct. 26, 2000
To underscore how much GoTo sees its future as a distributor of paid ads, rather than a standalone search service, the company is seeking a new name. Ironically, it fought and won a lawsuit earlier this year to protect its logo from confusion with Go.com's former logo.
FindWhat.com Cooks Up Santa Claus Giveaway
@NY, Oct. 4, 2000
Are you a failing dot com? If so, paid placement search engine FindWhat.com wants to give you some free ads in hopes you'll want more, if you survive the year.
GoTo Redevelops, Reaches Out
The Search Engine Update, Sept. 4, 2000
Covers how GoTo sees its future not as a destination site but rather in powering paid listings for other search engines.
Lookin' For Liv In All The Wrong Places
One of the great things about Google is that it has really stood apart from other crawler-based search engines in fending off spam. Its heavy reliance on link analysis, among other factors, has made life difficult for those seeking to manipulate Google's results. However, a recent story about how a porn site achieved some top rankings has shown that even Google is vulnerable to spam, though the black eye it has received is more than it deserves.
The porn site created doorway pages for different celebrities such as Phoebe Cates and Liv Tyler, which were discovered to appear top ranked for corresponding searches such as "Phoebe Cates nude" and "Liv Tyler nude." The success of these pages was described in an article from GeekPress, then commented upon within Slashdot, which led to a follow up article in GeekPress. The issue was whether or not Google was "fooled," as GeekPress puts it, into returning these pages as relevant.
Yeah, Google was fooled. While a message forwarded to Slashdot from Google suggested that there were no relevant results for something like "Liv Tyler nude," I think it's obvious that Google has set its standards high enough that giving top billing to fake discussion pages isn't what it considers optimal performance. Moreover, GeekPress makes an excellent argument that there are indeed relevant sites for Liv Tyler nude out on the web (sorry, Liv).
Is this a big deal? Nah. I'd be much more concerned if porn sites were coming up for searches on "Liv Tyler" alone, rather than her name plus "nude."
Google's not perfect, and I'm sure we'll continue to see little things like this crop up. However, such flaws seem much worse than they are primarily because Google has set the bar so high in the relevancy game. For the most part, it continues to dazzle people with its amazing high jumps, and a slight stumble now and then shouldn't be confused with a broken leg. Whew -- I think I winded myself pushing that metaphor too far :)
GeekPress, Oct. 30, 2000
The original story about the high-ranking porn pages at Google.
Reports Of Google's Demise Exaggerated
Slashdot, Oct. 31, 2000
Discussion thread contains comments from Google and lots of defense of the search engine.
Still Scamming Google
GeekPress, Oct. 31, 2000
Written in response to criticism at Slashdot and to disprove comments from Google, this article goes into detail that there are indeed relevant sites for "Liv Tyler nude," including even some non-porn sites.
Did Smut Spammers Scam Google?
Wired News, Nov. 1, 2000
Wired nicely wraps up the entire issue.
More Evil Than Dr. Evil?
The Search Engine Report, Nov. 1, 1999
Perhaps Google-blips are an annual thing, because a year ago, the now-notorious "more evil than satan" query eclipsed sex as a top search on Google. Why? The top result returned was Microsoft, and everyone had to try it for themselves. Want to bet "Liv Tyler nude" is rising in the rankings right now at Google?
Yahoo Publishes Top Searches
It used to be that every so often, a Yahoo top searches list would be leaked and circulated around. Now Yahoo's made the wise move of realizing what we search for is great content. The Yahoo Buzz Index provides a breakdown of top search topics by category and overall.
"Leaders" shows you what's hot -- for example, Halloween and the Singapore Airlines crash currently top the overall list. "Movers" shows you what topics are gaining interest from the previous day. Actress Tara Reid is currently second on the movers list, probably due to her engagement, which came to light this week. You'll find Buzz archives stretching back to Sept. 26 of this year.
Meanwhile, a similar feature that was launched earlier this year by AltaVista, the A-List, has been discontinued as part of September's budget cutting over there. However, the oldest of the search analysis services, the Lycos 50, continues on strong.
Yahoo Buzz Index
You can still read the last edition here.
What People Search For
Other places where you can see how people are searching for things online.
Search Engine Resources
Designed to direct you to academic papers and information across the web.
New health portal, with a special web-wide search engine powered by Google that provides results specifically from health and medical-related web sites.
Enter a search term, select a search engine, then watch as Vivisimo automatically organizes pages from the results into categories. Slick and easy to use. Fans of Northern Light's Custom Search Folders will love this ability to give other major search engines a similar feature.
FindSame finds sentences, paragraphs, or documents that have been duplicated on the web. Just feed it a URL or a block of text, and it will scan against its index of 200 million URLs to look for matches. It's a great way to see if someone is stealing your content, or just to find documents that may be similar to one you like.
Now out in beta, this is designed to ease life for both network administrators and end users tapping into the Gnutella network.
Search Engine Articles
Search Us, Says Google
MIT Technology Review, Nov/Dec. 2000
Good Q&A with the founders of Google.
Online Policy Group Web Site Listed in Major Search Engines
Online Policy Group, Nov. 2, 2000
Looks at getting a new site listed for free. Major conclusions? Expect a month or two delay, and getting into the Open Directory can help you in a wide-variety of other places.
Choose Your Words With Care
ClickZ, Nov. 1, 2000
Tips on selecting search terms for search engine optimization efforts.
China Net Users Seek Better Search Engines
China Online, Oct. 16, 2000
Covers search satisfaction, or dissatisfaction, in China -- as well as top Chinese search engines.
Deja News Search Engine for Sale: News and Irresponsible Speculation
ResearchBuzz, Oct. 17, 2000
ResearchBuzz's Tara Calishain takes an interesting stab at where she thinks the Deja newsgroup archives might fit (see next article, below).
Deja Puts Sale Up For Discussion
ZDNet, Oct. 15, 2000
In the latest chapter of Deja's newsgroup search saga, the archives and the associated online reading service are to be sold.
Building an Effective Linking Strategy
ClickZ, Oct. 12, 2000
Eric Ward is a widely acknowledged master of generating traffic to a web site through online PR and link building. In this article, he introduces you to link building concepts. In the one below, he provides further tips.
Link Popularity Is Not Your Only Linking Goal
ClickZ, Oct 26, 2000
Are portals passe
MSNBC, Oct. 3, 2000
Nice piece on the rise and fall of being a portal.
Stanford Launches Better Search Engine Project
siliconvalley.internet.com, Oct. 2, 2000
Yahoo, Excite and Google all came out of Stanford University. Will Global InfoBase be next?
Dogpile, The Speaker of the House, and A Little Test
Search Engine Guide, Sept. 18, 2000
Turns out that an aide to the Speaker of the US House of Representatives is a fan of the Dogpile metasearch engine.
Next-Generation Web Search
IEEE Data Engineering Bulletin, Sept. 2000
In this special edition, six technical papers that deal with web searching are presented. They include topics such as link analysis, computing page reputations, creating topic specific search engines and the role a search engine's interface plays in the success of a searcher. Papers are presented in PostScript. To read them, try the GhostScript viewer, http://www.cs.wisc.edu/~ghost/gsview/get34.html.
How do I unsubscribe?
+ Follow the instructions at the very end of this email.
How do I subscribe?
+ The Search Engine Update is only available to paid subscribers of the Search Engine Watch web site. If you are not a subscriber and somehow are receiving a copy of the newsletter, learn how to subscribe at: http://searchenginewatch.com/about/subscribe.html
How do I see past issues?
+ Follow the links at:
Is there an HTML version?
+ Yes, but not via email. View it online at:
How do I change my address?
+ Send a message to firstname.lastname@example.org
I need human help with my subscription!
+ Send a message to email@example.com. DO NOT send messages regarding list management or site subscription issues to Danny Sullivan. He does not deal with these directly.
I have feedback about an article!
+ I'd love to hear it. Use the form at