THE SEARCH ENGINE REPORT
November 1, 1999 - Number 36
About The Report
The Search Engine Report is a monthly newsletter that covers developments with search engines and changes to the Search Engine Watch web site, http://searchenginewatch.com/.
The report has 88,000 subscribers. You may pass this newsletter on to others, as long as either part is sent in its entirety.
If you enjoy this newsletter, consider showing your support by becoming a subscriber of the Search Engine Watch web site. It provides you with some extra benefits and access to some exclusive materials and articles. Details can be found at:
Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ General Notes
+ The "New" AltaVista
+ Beyond The Hype: Dissecting AltaVista's Claims
+ Who's The Biggest Of Them All?
+ AOL Search Big Improvement For Members
+ Direct Hit Expands Site
+ Northern Light Introduces New Ranking System
+ Monitoring Firm Acquired
+ More Evil Than Dr. Evil?
+ Search Resources
+ Search Engine Articles
+ Subscribing/Unsubscribing Info
General Notes
Lots of news this past month -- so much so that I couldn't do a longer look at changes to Direct Hit and Northern Light in this issue. Instead, there are some brief mentions that I may follow up on in the future. I also expect to update the search engine rating pages within the site soon with new information that I've received.
There's still time to book a spot at the Search Engine Strategies '99 conference later this month. It's an all-day event that I've organized, covering aspects of search engine promotion and marketing. It gives you a chance to hear editors from the major directories talk about submissions, or to pose questions to representatives from some of the major crawler-based services. Experts will also help you understand issues about meta tags, doorway pages and how page design can impact your rankings. More information on attending can be found via the link below.
Search Engine Strategies '99
Search Engine News
The "New" AltaVista
With great fanfare, the "new" AltaVista was launched last week. Now owned by CMGI, the revamped service is being firmly targeted as a challenger against more established portals such as Yahoo, Excite and Lycos.
There's a lot to like at the refreshed service, but AltaVista's failure to live up to some of its marketing claims spoils the party. Those claims are examined in a separate article, below. In this article, we tour AltaVista's new look and features.
There's a lot happening on the AltaVista home page -- options, menu items and links to content all scream for attention. Fortunately, everything is already set for those who just want to do a normal search across the web. Simply enter your terms into the search box, and away you go. The results you receive may come from up to four different sources, as has been the case with AltaVista for over a year.
For many popular searches or those phrased as questions, the first source continues to be Ask Jeeves (though occasionally, it will be the third source displayed). Information from the answer service appears under the heading, "AltaVista knows the answer to this question." RealNames also continues as the second source, feeding the link which appears just above the numbered listings. A previous review of AltaVista below describes the pluses to both these data sources in more detail.
Those numbered listings, which dominate the results page, continue to come from AltaVista's web crawler. What's different is that the company has made a number of changes that it hopes will improve the quality of these listings.
In my opinion, the most important change has been the introduction of results clustering. This means that only one page per web site appears in the top results. It's a real plus for searchers, because it means that you get more variety in your results. A related article below explains results clustering in more depth.
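As a rough illustration of the idea, here is a minimal Python sketch of results clustering. It assumes that a "web site" can be identified by the host name in each URL, which is a simplification and not necessarily how AltaVista defines a site. The function name and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def cluster_by_site(ranked_urls):
    """Keep only the highest-ranked page from each host, preserving rank order.

    Assumption: one host name equals one web site.
    """
    seen_hosts = set()
    clustered = []
    for url in ranked_urls:
        host = urlparse(url).netloc.lower()
        if host not in seen_hosts:
            seen_hosts.add(host)
            clustered.append(url)
    return clustered


# Made-up ranked results; three of the five come from the same site.
ranked = [
    "http://example.com/golf/clubs.html",
    "http://example.com/golf/balls.html",
    "http://example.org/courses.html",
    "http://example.com/golf/bags.html",
    "http://example.net/lessons.html",
]

for url in cluster_by_site(ranked):
    print(url)
```

With clustering applied, the three example.com pages collapse to a single listing, leaving room for pages from the other two sites.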
AltaVista also says it has expanded its web index from 150 million to 250 million web pages, which should give it more comprehensive coverage. Indeed, some testing I did comparing it against size leaders Northern Light and FAST Search showed AltaVista holding its own or exceeding them.
That said, I've had an unprecedented number of complaints from webmasters who have seen many or all of their pages dropped from the AltaVista index. AltaVista admits that some sites have been lost in the transition to its new index, which occurred last week. The company says it is now moving forward to process submissions made via its Add URL page and to add new pages found by its crawler.
AltaVista has also introduced a new page ranking system that makes more use of off-the-page criteria. AltaVista's staying closed-mouth about some of these criteria, but it is highlighting the fact that link data is being more heavily leveraged. Links are analyzed to determine both the overall "popularity" and the content of web pages.
If you look below each numbered listing, you may see up to three links. "Translate" lets you translate the page into another language. "More pages from this site" lets you uncluster results for that particular web site. "Company factsheet" takes you to detailed information about the company which owns the web site.
Those fact sheets are being produced by iAtlas, a company that AltaVista quietly acquired before its relaunch. You may also find factsheet links within AltaVista's version of the Open Directory, discussed below.
After the numbered crawler listings come any relevant categories of web sites as classified within the Open Directory. Select a category, and you'll be shown sites from the Open Directory on that topic.
Previously, LookSmart had been AltaVista's primary directory provider. Now the company is being retained as a "premium partner." Among other things, this means that LookSmart listings are to be incorporated within AltaVista's version of the Open Directory. When this happens, LookSmart-specific information is supposed to be clearly identified, LookSmart says.
That's basic web searching. Other types of searches are available, and let's explore them by returning to the home page. On it, you'll see three tabs above the search box: "Search" (which we've already covered), "Advanced Search," and "Images, Audio & Video."
Selecting "Advanced Search" takes you to AltaVista's advanced search page. Do you want to use this page? In most cases, no -- and I'm surprised that it's given such prominence over other search options.
Use the Advanced tab only if you absolutely insist on using Boolean commands, instead of the nearly equivalent Search Engine Math commands that can be used in the normal search box. Also use Advanced if you want to narrow your search to a particular date range, though many people who do date restricted searches will probably be better served by using the new News search feature.
By the way, those familiar with the Advanced page will also discover that the exact count option has been removed. And here's a last tip -- MSN Search offers you access to a much more useful "advanced" search page for AltaVista. See the URL, below.
The "Images, Audio & Video" tab is much more useful, giving you access to AltaVista's exceptional multimedia search. In particular, it displays thumbnail images from the Corbis and Getty picture collections, along with images and multimedia found by crawling the web.
AltaVista also offers three more specialty searches, but they are confusingly made available via options below the search box, rather than through additional tabs above it.
"News" is new -- it lets you tap into top stories from major news sources from between 6 hours to 14 days old. "Discussions" lets you scan newsgroups. Previously, AltaVista had crawled Usenet itself to offer this feature. Now the information is provided through a partnership with RemarQ.
Finally, "Shopping" lets you look for products from across the web, peruse reviews, and more. This is not the same as searching within only AltaVista's online store. Instead, an AltaVista shopping search is meant to provide matches from a wide range of web merchants. AltaVista promises the service is unbiased, and in a search for "palm iiix," I was indeed presented with hits from multiple retailers. AltaVista's Shopping.com store was one of the merchants listed, but it was not the only one nor even at the top of the list.
Once you've performed any type of search, you can quickly perform another via tab-like images on the results page. For instance, assume you searched for "year 2000" as a web search, then decided you wanted to instead look for news stories. Just click on the "NEWS" tab at the top or bottom of your web results page -- that will rerun the search. Similarly, you could choose "DISCUSSION" to see what people are talking about on the topic in Usenet groups.
Back to the home page, there are yet other changes and features to mention. AltaVista Live replaces My AltaVista for those seeking to personalize the service with news headlines and other features. It's also possible to browse the directory, which is given much more prominence on the home page than in the past. Those seeking to eliminate porn sites and other possibly objectionable content from their search results will find the Family Filter link below and to the right of the search box.
MSN/AltaVista Advanced Search
AltaVista Debuts Search Features
The Search Engine Report, Nov. 4, 1998
This older article about AltaVista still provides useful background about using Ask Jeeves and RealNames information at the service.
Search Assistance Features
Explains how results clustering works, along with other features used by many major search engines.
Search Engine Math
You don't need to know Boolean syntax to perform better searches. Learn the three simple "math" commands that work on nearly all search engines and which can improve your results.
Showdown News AltaVista Special Issue
Search Engine Showdown, Oct. 31, 1999
Another look at the new AltaVista from the perspective of Search Engine Showdown webmaster Greg Notess.
Altavista spins chic image over geek
Red Herring, Oct. 26, 1999
A favorite comment from analysts is that it's "too late" for AltaVista to compete with the likes of Yahoo, Excite, Lycos and other established portals, as this article details. Wrong. AltaVista enjoys widespread grassroots support -- it has consistently been a top ranked web site for years, despite the fact it has done virtually no advertising. With a $120 million ad spend now underway, there's every reason to suspect that AltaVista will increase its visitors and successfully make the jump to be counted among the most major of web portals.
Beyond The Hype: Dissecting AltaVista's Claims
In conjunction with its relaunch, AltaVista made several claims about its new search capabilities that don't currently hold up. Here's a look at where the service isn't yet matching promises made in a recent press release:
"Four years since introducing the world's first search engine, AltaVista once again breaks new ground with a powerful way for users to zero-in on precise results by type of information."
The world's first search engine? Not by a long shot. AltaVista hit the stage in December 1995. Both Lycos and WebCrawler had been indexing pages since early 1994. Even Excite and Infoseek were live before AltaVista.
"AltaVista Search debuted the world's largest index today, spanning an unprecedented 90 percent of Web sites, 250 million unique pages and 25 million multimedia objects."
Wow -- AltaVista spans 90 percent of sites on the web! That sounds great, but the number is essentially meaningless, as far as I can tell.
The NEC Research Institute estimated that as of February 1999, there were 2.8 million publicly accessible web servers. AltaVista tells me it compared this number with the number of web servers it has visited to come up with its 90 percent figure.
Now here's the problem. The NEC estimate was for web servers, not web sites. A single web server can host multiple web sites -- even hundreds of web sites. That means there's no way of knowing exactly how many web sites exist based on the NEC data. Similarly, there's no way for AltaVista to know exactly how many web sites it has spanned.
Let's take another look at this. Having an index of 250 million web pages is indeed an achievement -- it means AltaVista shares the top spot for largest index with FAST Search (at least based on self-reported numbers).
But that same NEC study estimated that there were 800 million pages on the web, as of February 1999. That means AltaVista's index covers 31% of these estimated web pages. So even if AltaVista really does span 90 percent of web sites, it clearly doesn't have every page from those sites, which again devalues the 90 percent number.
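The arithmetic behind that 31% figure can be checked with a short Python sketch. The two page counts come from the figures quoted above; the constant and function names are illustrative only.

```python
# Figures quoted in the article.
ALTAVISTA_INDEXED_PAGES = 250_000_000  # AltaVista's self-reported index size
NEC_ESTIMATED_PAGES = 800_000_000      # NEC estimate of web pages, February 1999


def coverage_percent(indexed, estimated_total):
    """Return the share of the estimated web that an index covers, in percent."""
    return 100.0 * indexed / estimated_total


share = coverage_percent(ALTAVISTA_INDEXED_PAGES, NEC_ESTIMATED_PAGES)
print(f"{share:.0f}%")  # prints 31%
```

Note that the NEC estimate dates from February 1999, so the real share at relaunch would be lower still if the web has grown since then.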
"AltaVista's new 'Living Index' crawls the Web 'intelligently,' recording how often pages are updated, to ensure AltaVista Search always has the Web's freshest information."
Perhaps it does. Perhaps it will. But the index certainly doesn't have the freshest information at the moment. For instance, go to the advanced search page, then look for pages containing "golf" between Oct. 1 and Oct. 31. Only three pages are found -- three pages! (and one of those is already a dead link). Is it possible that only three pages containing "golf" were either added to the web or updated during all of October? Not likely -- do the same date restricted search at Northern Light, and you get over 48,000 hits.
Here's another look at the problem. Do a search for "the," and you'll find these numbers:
Aug. 1999: 5,3377,558 pages
Sept. 1999: 7,993 pages
Oct. 1999: 208 pages
If anything, AltaVista appears to have a relatively dated index, at the moment -- certainly not the web's "freshest" information, as claimed.
For its part, AltaVista says that the current index is based on a crawl done at the beginning of October, and it is just now beginning to go forward with adding new information and updating older pages.
"I think in terms of where we are going to be, we're going to have the freshest information there is. It's perhaps not as fresh as we would like it to be, at the moment, but what we are doing will completely rectify the situation," said Tracy Roberts, AltaVista's marketing director.
"AltaVista search is able to make its Freshness Guarantee: no search site will have fresher results than AltaVista."
AltaVista unveiled its first "Freshness Guarantee" back when it relaunched in June, promising that its entire index would be refreshed at least once per month. That guarantee was almost immediately broken, as even AltaVista President Rod Schrock admitted when we talked recently. "We turned our attention to this new system," Schrock said.
OK, fair enough -- they wanted to build something even better. But this new guarantee has already been broken, as described above. If claims like these are going to be made, then they should actually be met. And not to meet them in the midst of a huge media blitz is an incredible blunder.
"Called 'AltaVista Page Knowledge,' this technology analyzes the content of the page, its meta-tags, the page's connectivity, any referring anchor text and other pertinent information. No other Web search engine takes all these variables into account when measuring relevancy."
The implication is that AltaVista is going well beyond what other search engines consider when determining relevancy. In reality, both Inktomi and Go analyze all of the factors specifically named above, according to past interviews with them. Of course, the exact algorithm used by each service is unique -- but AltaVista certainly has equals as to the factors it is specifically naming.
AltaVista Search Press Release
Search Engine Coverage Study Published
The Search Engine Report, August 2, 1999
More information about that NEC study.
Who's The Biggest Of Them All?
AltaVista is now claiming to have the largest index of the web, at 250 million pages. But is its claim true? Does the service deserve the title of biggest? Probably -- or certainly it and FAST Search share the title. But it's not possible to say so definitively.
Before exploring the issue, let me make my usual caution that being biggest does not mean having the best results. I definitely see advantages to having index sizes grow -- it does mean that we are more likely to find unusual or obscure information. But fixating just on numbers can be misleading as to the actual quality of a service.
Now how do we prove the size of a search engine's index? One technique is to search for a word you know does not exist on any page in the index. For instance, a search for "dffjkdjkf" at Northern Light shows no matches. That means if I do a Boolean "NOT dffjkdjkf" search, I should be shown a count of ALL the pages in Northern Light's index, since all of them fit that search criteria.
This works well -- Northern Light tells me it has 189,060,458 pages indexed. But this technique doesn't work at those services seriously competing against Northern Light in the size game.
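To see why the trick works, here is a minimal Python sketch using a toy index. The index contents are made up, and the `search_not` function is a hypothetical stand-in for a Boolean NOT query, not Northern Light's actual implementation.

```python
# Toy index: page id -> set of words found on that page (made-up data).
index = {
    1: {"golf", "clubs"},
    2: {"travel", "golf"},
    3: {"whale", "watching"},
}


def search_not(index, word):
    """Return the ids of every page that does NOT contain the word."""
    return [page_id for page_id, words in index.items() if word not in words]


nonsense = "dffjkdjkf"

# Step 1: confirm the nonsense word appears on no page at all.
assert not any(nonsense in words for words in index.values())

# Step 2: every page therefore matches "NOT dffjkdjkf",
# so the hit count equals the size of the whole index.
print(len(search_not(index, nonsense)))  # prints 3
```

The count is only as trustworthy as the reported hit count itself, which is why the approach fails at services that report approximate numbers.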
So what else can be done? You can search for various topics, then compare the total hit count. After all, if one search engine reports 7 million hits for "travel" and a competing one that's supposed to be the same size only finds 3 million matches, you have reason to believe maybe the second service has a smaller index than it claims.
There are two problems here. First, some search engines have results clustering that cannot be turned off. That means there's no way to get an accurate count of ALL the pages actually found for a query. Another problem is that not all search engines report a hit count, such as Lycos, or the count is only approximate, as with AltaVista.
That leaves only one real solution -- to run queries for extremely obscure topics, so that you can easily verify the exact count. An example of this can be found via the URL below, which is the method I used to try and verify AltaVista's size claim. That test told me that AltaVista and FAST Search seemed about equal, which isn't surprising given that they both claim indexes in the 250 million page range, which would make them the largest search engines on the web.
Now let's complicate things. AltaVista has a terrible habit of "timing out." This means that during busy hours, it will search for a short period of time, then return whatever it has found -- even if there's more information lurking in the index.
So even though it might be biggest, or tied for biggest, you might not be querying everything it has available. Nor might AltaVista be alone in this -- other services have suggested that their competitors don't search as completely against their indexes as they could.
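The effect of a time-out on completeness can be shown with a small Python sketch. This is an assumption-laden illustration, not a description of AltaVista's internals: it models a search as a simple scan with a time budget, and every name in it is hypothetical. A fake clock is used so the example behaves the same on every run.

```python
import time


def search_with_deadline(pages, term, budget_seconds, clock=time.monotonic):
    """Scan pages until the time budget runs out.

    Returns (matches, complete). complete is False when the scan was cut
    short, meaning more matches may remain unexamined in the index.
    """
    deadline = clock() + budget_seconds
    matches = []
    for page_id, text in pages:
        if clock() >= deadline:
            return matches, False
        if term in text:
            matches.append(page_id)
    return matches, True


def fake_clock():
    """Return a clock that advances one 'second' each time it is read."""
    ticks = iter(range(1000))
    return lambda: next(ticks)


# Made-up pages; all four contain the search term.
pages = [(1, "golf"), (2, "golf travel"), (3, "golf clubs"), (4, "golf balls")]

print(search_with_deadline(pages, "golf", 3, clock=fake_clock()))    # ([1, 2], False)
print(search_with_deadline(pages, "golf", 100, clock=fake_clock()))  # ([1, 2, 3, 4], True)
```

With the short budget, only half of the matching pages are returned even though all four sit in the index, which is the gap between index size and what a searcher actually gets to query.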
This brings us back to the value issue. Unless a query is really obscure, having more isn't helpful. Is anyone really going to look through more than a thousand matching web pages for any topic? No -- but they are certainly going to appreciate having the very best 10 or 25 or 50 of those pages for that topic.
Of course, size will continue to be a selling point for crawler-based search engines, and those services will want their claims to be verified -- as do their users. Fine, then give reviewers the tools they need. Allow for results clustering to be turned off. Provide accurate counts, not approximate ones that can change depending on the time of day. With these two features, anyone can run comparative tests.
I'd also like to see all crawler-based search engines add a feature that lets you see the number of pages indexed from any particular web site. This also goes to the issue of verifying size claims. It would allow you to compare how deep search engines crawl various web sites, which can also be brought into the mix to verify size claims.
Search Engine Size Test
The Search Engine Report, Nov. 1, 1999
Shows the tests I did to check AltaVista's coverage against its claims.
Northern Light Claims Largest Index
The Search Engine Report, Feb. 2, 1999
More thoughts on the difficulty in auditing sizes.
AOL Search Big Improvement For Members
The old AOL NetFind service never really offered any particular advantages to AOL members, who were its main users. In contrast, the new AOL Search service should be a top choice for AOL's millions of members. That's because for the first time, you can search for information within AOL and across the web at the same time.
AOL Search was officially launched early last month, in conjunction with the release of the AOL 5.0 software. That software makes search and navigation a central part of its interface.
In particular, a new navigation box has been added. It appears near the top of the AOL screen, just below the icons, with text inside that says "Type Search words, Keywords or Web Addresses here." To the right of this box are three buttons, "Go," "Search" and "Keyword."
You can use this box to navigate within AOL using its keyword system. For instance, enter "shopping," select the Go button, and you'll be taken to the AOL Shopping channel. Enter "movies," select Go, and you're taken to the AOL movie area.
What happens if you enter a word that's not an AOL keyword? Then AOL sends your words to the AOL Search service. So if you looked for "whale watching," you'd be taken straight to AOL Search, as there is no matching AOL keyword for those terms. We'll discuss what you might find there in a moment.
Sometimes you'll enter a phrase that's somewhat related to an existing AOL keyword. For instance, enter "new york hotels," and AOL will suggest its "new york homes" area. If that doesn't suit your needs, choose the Find option. That will take you to AOL Search -- though annoyingly, you'll need to reenter your query.
In contrast to the Go button, pushing the Search button tells AOL to use AOL Search instead of first checking to see if there are any matching AOL keywords. You can use this button anytime you actually want to search for something, whether it is within the AOL service or out on the web.
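The difference between the two buttons can be summarized in a short Python sketch. It covers only the behavior described above: the keyword table is a made-up sample, the function name is hypothetical, and the near-match suggestions (such as "new york homes") and the Keyword button are deliberately left out.

```python
# Sample keyword table; the real AOL keyword list is far larger.
AOL_KEYWORDS = {
    "shopping": "AOL Shopping channel",
    "movies": "AOL movie area",
}


def route(query, button):
    """Decide where a navigation-box entry goes.

    "Go" checks for an exact AOL keyword first and falls back to AOL Search.
    "Search" always goes straight to AOL Search.
    """
    normalized = query.strip().lower()
    if button == "Go" and normalized in AOL_KEYWORDS:
        return ("keyword", AOL_KEYWORDS[normalized])
    return ("search", normalized)


print(route("movies", "Go"))          # ('keyword', 'AOL movie area')
print(route("whale watching", "Go"))  # ('search', 'whale watching')
print(route("movies", "Search"))      # ('search', 'movies')
```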
Don't be afraid to use it -- you'll get both the best of AOL and the web combined into one results list. For instance, let's go back to "movies." If you enter that word and hit the Search button, you'll see a link called "Movies" at the top of the results, in the "Recommended Sites" section. Clicking on that takes you right into the AOL movies area.
It's not just AOL content that's featured in Recommended Sites. Do a search for "mp3," and you'll see a link to the WinAmp site, which provides a player for listening to the audio format. All Recommended Sites are picked by editors at AOL and may be on the web or within the AOL service.
After Recommended Sites, you'll also see a section called Matching Categories. These list sites from across the web, which have been organized into topics by the volunteer editors of the Open Directory.
AOL doesn't want to overwhelm its users with matching categories, so only the top five are displayed. If you wish to see more, select the little "next" link in the Matching Categories section.
Next, AOL Search displays actual web sites or relevant sections of the AOL service in the "Matching Sites" section. If it's out on the web, you'll know because you'll see a web address beginning with http:// below the site description. Otherwise, the site is internal to AOL.
At the bottom of the page, you can choose to see more web sites by selecting the "next" link. Or, you can choose to view "AOL Articles" or "Web Articles." AOL Articles are any type of content that might be related to your search terms from across the whole of AOL. Similarly, Web Articles are any individual web pages that are found related to your search from across the entire web. In some cases, if you search for something obscure, you'll automatically be shown only Web Article results -- not Matching Categories or Matching Sites.
AOL Search also has some features designed to help you search better. After all the search results, you'll see a sentence that says "People who searched for movies also searched for," followed by some links. These are popular searches that are related to your core search term. So if you looked for "movies," you'd discover that "horror movies" is a popular related search. Then if you clicked on the "horror movies" link, your search would automatically be done again, using those words.
Non-AOL members can also use AOL Search, as can AOL members who aren't signed into the service, for some reason. The only difference with this "external" AOL Search is that it does not list any content that's within the AOL online service.
AOL Search (Internal)
Get to it using the search button described above or via the red "Search" button in the upper left-hand corner of the Welcome Screen.
AOL Search (External Version)
Direct Hit Expands Site
Been to Direct Hit recently? The company has given its site a new look, added Open Directory listings and made its shopping search feature more accessible. Expect a longer look, in a future newsletter. Direct Hit also just announced a deal to power the AT&T World Net portal.
AT&T World Net
Northern Light Introduces New Ranking System
Add Northern Light to the list of services making use of link data to improve its search results. And the change really does seem to have helped for some popular queries I checked. Other factors beyond links are also being taken into account -- more details in a coming newsletter.
Monitoring Firm Acquired
Aeneid acquired InGenius Technologies at the end of October. InGenius makes some pretty cool monitoring tools such as Daily Diffs, which lets you see exact changes that have happened at particular web sites. Aeneid said it will add new monitoring abilities using InGenius technology to its EoCenter industry search site.
(NOTE: This next article is meant to be humorous. If you're of deep religious convictions, please skip it -- I wouldn't want to cause offense, and certainly none is intended).
More Evil Than Dr. Evil?
So here's the deal -- a post on Memepool back in October noted that if you searched for "more evil than satan himself" on Google, the Microsoft home page was listed first. Oooooooooooh! Tasty Bits from the Technology Front then picked it up, and I seem to recall it getting a mention in some other places. Certainly people must have been talking about it, because now TBTF reports that for a short time, that search was more popular than "sex" at Google.
Of course, the search really doesn't mean anything -- the Microsoft home page also comes up tops for plain old "more evil," so given the way Google works, this just tells us that there's a significant number of people on the web who may use the word "evil" near links to Microsoft on their pages. Or perhaps Google's just plain screwy for this type of search.
But this got me to thinking. If Microsoft is "more evil than satan," then who would Google say is more evil than the arch enemy of Austin Powers, Dr. Evil? It turns out that the most unusual top pick for "more evil than dr. evil" is....Dilbert!
Let's not stop there. Who's "more evil than microsoft," if they are so bad? Turns out, no one -- Microsoft still comes up tops. Congrats, Bill! But look out -- lurking just below is....Netscape? Yep, Netscape (must be that AOL influence!).
With all this evil on the web, perhaps God could use some help fighting it online. In fact, I wondered if Google knew of anything "more powerful than god." The top pick -- Yahoo! Well, anyone who has ever submitted a site to Yahoo could have told you that :)
Seriously, it sounds more like Google's heavy reliance on link popularity causes it to float the web's most popular sites up to the top for these relatively nondescript searches.
And by the way...there is a "right" answer to "more evil than satan" that isn't Microsoft. It's apparently part of a riddle: "What is greater than God, more evil than Satan, poor people have it, rich people need it, if you eat it you will die." Want the answer? Then search for "more evil than satan" at....AltaVista!
Tasty Bits from the Technology Front: Second Mention
Here's the report of sex being eclipsed on Google.
Tasty Bits from the Technology Front: First Mention
Memepool Original Post
Search Resources
New portal backed by CBS and featuring Inktomi's directory search engine, which automatically classifies web sites into categories. Oh, and they're giving away money through a sweepstakes to visitors.
Haven't had a chance to try this yet, but the idea is that by collecting people's bookmarks, the service can guide you to helpful web sites.
Hope to do a longer write-up in the future, but meanwhile, try this site if you are looking for business and company information.
This is a relatively new paid placement search engine that follows the GoTo.com model.
Apparently an Alexa-like navigational tool you might find interesting. I haven't tried it, but Chris Sherman of the About Web Search Guide loved it. See his review, below.
Worth A Look: Flyswat
About Web Search Guide, Oct. 29, 1999
Imagine Yahoo or any directory, where each category is established at its own web site. That's essentially 4anything.com. It's long been on my radar screen, and I hope to do a longer write-up in the future.
New service from Intelliseek, launched in late October, that allows you to find databases with information that search engines can't crawl.
Need the ability for people to search your site? That's what SearchButton offers. No software to install, and no cost if you don't mind having ads run in your results. Ad-free versions are also available, for a fee -- as are versions for large enterprises.
Search service that is designed to help you better find what you are looking for by presenting concepts at the top of the search results list. I find it a bit clunky -- maybe you'll love it. Results come from the Open Directory and AltaVista. And no, the name has nothing to do with the band Boingo, formerly Oingo Boingo. It's an acronym, but the site owners aren't saying what it stands for.
A new meta search service.
A metasearch tool with features designed to help you refine your query based on the content of pages it finds.
This site gives you the unofficial forecast of company earnings as determined by what people are saying on the web. See "The Information Laundromat" article below for more about how search technology is used in part to gather the information.
The Information Laundromat
Salon, Oct. 26, 1999
Search Engine Articles
Licensing deal validates Ask Jeeves's approach
News.com, Oct. 29, 1999
I've always thought the little Ask Jeeves guy was pretty hokey, but now he's being backed by one of Hollywood's biggest agents. Look out, Pokemon!
Building Traffic With The Engines
ClickZ, Oct. 27, 1999
Nice stats that say search engine marketing is the top ranked promotional method used by business-to-business sites.
Yahoo pulls away from portal pack
Forbes Digital Tool, Oct. 19, 1999
Interesting stats about Yahoo's reach across the web.
Subscribing/Unsubscribing Info
How do I unsubscribe?
+ Use the form at http://searchenginewatch.com/sereport/unsubscribe.html or follow the instructions at the very end of this email.
How do I see past issues?
+ Follow the links at http://searchenginewatch.com/sereport/
Is there an HTML version?
+ Yes, but not via email. View it online at
I didn't get Part 1 or 2. Can you resend it?
+ No, but you can view the entire issue online, via the link above.
How do I change my address?
+ Unsubscribe your old one, then subscribe the new one, using the links above.
I need human help with a list issue!
+ Write to firstname.lastname@example.org. DO NOT send messages regarding list management issues to Danny Sullivan. He does not deal with these.
I have feedback about an article!
+ I'd love to hear it. Use the form at http://searchenginewatch.com/about/contact.html.
This newsletter is Copyright (c) internet.com Corp, 1999