About The Update
The Search Engine Update is a twice-monthly update of search engine news. It is available only to those people who have subscribed to Search Engine Watch, http://searchenginewatch.com/.
Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ General Notes
+ AltaVista To Go Public, Partner With Microsoft
+ Excite Purchased By @Home
+ Yahoo To Buy GeoCities
+ Direct Hit Debuts Personalized Search
+ RealNames Ramps Up
+ Netscape Integrates Directory
+ Northern Light Claims Largest Index
+ WebTrends Upgrades Search Analysis
+ Submission Weather Report
+ Search Engine Notes
+ Search Engine Articles
+ Subscribing/Unsubscribing Info
I've been very busy this month, not just covering all the latest news, but also revising information within the site about searching better. By the time you receive this, a number of new and updated pages will be online. Here's a rundown:
Search Engine Math
I've been doing a series of search engine workshops in association with the British Library, and "Search Engine Math" is one of the concepts I've been putting across to both new and experienced searchers. Forget all that stuff about Boolean searching. Instead, search engine math teaches you how to add, subtract and multiply your terms using the simple commands that almost all the major search engines support. For many users, this is all the "power searching" you'll ever need. While math was never my favorite subject, I hope that at least the concept of search engine math will be easier for many people to understand.
Power Searching For Anyone
While most people only need to know basic math, some users will benefit from learning more about how to control their searches. To help, I've completely revised this page, with a particular emphasis on restricting searches by site. While I love producing comparative tables, I've instead chosen a narrative approach to cover important power search commands, and which search engines support them.
Search Assistance Features
Several search engines offer special search assistance features that many users overlook. This page explains the ones I find particularly useful. It also provides a rundown on which search engines support each feature and how to use it. I hope to greatly improve this page in the near future, but I think many will find even the basic information provided to be helpful.
This is for the professionals. You're used to Boolean operators, you like Boolean operators and you want to use Boolean when using web-wide search engines. Here's a rundown on the services that support Boolean, and the slight differences between them. Things are more consistent than you might think.
Search Engine Tutorials
There is plenty of help for searchers out on the web. This page has been updated to include some new resources.
One of the things that came out of all the work above, plus work I did for an article on searching to appear later this year in Online magazine, is that there are plenty of areas where it would make sense for search engines to agree upon some standards.
For example, Infoseek allows you to control which sites appear in your results via the site: command. AltaVista provides the same ability, but it uses the host: command. Likewise, at Inktomi, it's the domain: command that is used. Obviously, it would be much better for users if they all agreed to use the same syntax. Moreover, it would be nice if all the major search services provided this functionality, which is extremely helpful in some searches.
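To make the inconsistency concrete, here is a minimal Python sketch that maps each service to the command named above. The helper name and the lookup table are hypothetical, and putting the command in front of the search terms is an assumption rather than documented syntax for any of these services. Only the three command names come from the services themselves.

```python
# Site-restriction command used by each service, as described above.
SITE_COMMANDS = {
    "Infoseek": "site:",
    "AltaVista": "host:",
    "Inktomi": "domain:",
}


def restrict_to_site(service, terms, site):
    """Build a query limited to one site for the given service.

    Hypothetical helper: placing the command before the search terms is an
    assumption, not documented syntax for any of these services.
    """
    if service not in SITE_COMMANDS:
        raise ValueError(f"no known site-restriction command for {service}")
    return f"{SITE_COMMANDS[service]}{site} {terms}"


# The same request has to be spelled three different ways.
for name in SITE_COMMANDS:
    print(name, "->", restrict_to_site(name, "search tips", "searchenginewatch.com"))
```

With a shared standard, the lookup table would shrink to a single entry and searchers would have one command to remember.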
When I asked the services about these inconsistencies, it turned out that no one has really been providing a mechanism to help bring them together on issues that benefit users in general. Furthermore, many of them thought such an effort would be a good idea.
With this in mind, I'm launching something I call the Search Engine Standards Project. The idea is to help bring about standards that make sense, especially where simple changes can eliminate inconsistencies.
For the moment, I'm starting out by suggesting some items where I already feel standards could be quickly established. You can read about these on the page below.
You'll also find a link from that page to a forum area where you can post ideas for the types of standards you would like to see emerge. Please do -- I look forward to seeing your comments.
Over the coming months, I hope to help establish a framework that leads to consensus on areas that make sense. I'll let you know how things are going, and how you can participate, as details become established.
Search Engine Standards Project
This new page has also been added:
This documents articles and resources relating to artists who are upset with services that index their images.
Search Engine News
Compaq acquired the AltaVista search service when it purchased Digital Equipment last year, and AltaVista's future has been in limbo since then. That came to an end last week, with the announcement that AltaVista has been spun off into the AltaVista Company, a wholly-owned subsidiary of Compaq.
A primary reason behind the move is to tap into the value that AltaVista can command as a search-and-navigation portal. The multibillion dollar deals recently announced for Netscape and Excite only underscore the value Compaq hopes to gain by establishing AltaVista as an Internet media company independent of Compaq's hardware operations.
"By creating a separate, publicly traded company, we will unlock AltaVista's tremendous value for Compaq's shareholders," said Eckhard Pfeiffer, Compaq's President and CEO, during last week's press conference.
There are no intentions for AltaVista to abandon search. In fact, many of the search improvements the service has introduced recently were made in anticipation of the spin-off, it was said. But AltaVista also plans to integrate online shopping functionality heavily into the service, leveraging Compaq's recent purchase of Shopping.com.
"We intend to transform AltaVista into the leading destination site for search and e-commerce on the Internet," Pfeiffer said.
More portal features will be coming, thanks to a new partnership with Microsoft. AltaVista will transition its free email service from being powered by iName to using Microsoft's Hotmail. AltaVista will also tap into Microsoft's instant messaging service, when that product is ready. The two companies also have pledged to work together on unspecified community offerings in the future.
In return for the AltaVista partnership, Microsoft has agreed to dump Inktomi as the service powering its MSN Search service. It was probably the most stunning announcement emerging from the press conference, in that Microsoft was abandoning a company it touted as having the best search technology in order to promote its business interests.
"We believe that Inktomi has developed the most advanced search technology available and it makes sense for Microsoft to provide these capabilities to consumers. Incorporating the Inktomi technology as a core service of Microsoft's online properties helps us give users the ability to easily and quickly find just what they're looking for on the Internet - part of our overall goal to make MSN the place to get the most out of each Internet experience," said MSN Vice President Laura Jennings, when the original Inktomi announcement was made in October 1997.
So if Inktomi was the best, why dump them only a few months after going live with the Inktomi-powered MSN Search service? Microsoft was at pains to explain that Inktomi had done nothing wrong, and it did not even suggest that Inktomi's search product had diminished in any way. Instead, AltaVista simply offered Microsoft a better business opportunity.
"The change to AltaVista is a better business deal for MSN. The change is no reflection on our satisfaction with Inktomi. We had an opportunity to form a deeper strategic relationship with AltaVista and Compaq. That is the primary motivation for the change," said Marty Taucher, MSN's director of network communications.
In Microsoft's defense, it could hardly have found a better service to swap for Inktomi than AltaVista. They both offer large indexes of the web, with AltaVista even claiming to be slightly bigger than Inktomi. They also both offer a wide range of advanced searching capabilities. So it's not like Microsoft is ripping out a V-8 engine from under the hood of MSN Search and replacing it with a lawnmower motor. These are indeed comparable services, and most users probably won't notice the changeover.
"You need to look at how we incorporated the Inktomi searches into our MSN Web Search product. We use these search engines as a back end for our best of web extensive search feature," Taucher said. "The user today doesn't really know that this is an Inktomi search."
Indeed, most users probably don't. And that's the most disturbing aspect of the announcement. There has been plenty of talk over the past two years about search as a commodity. The idea is that all services are more or less equal, so those looking for a search product can go for whatever presents the best deal. Microsoft's dropping of Inktomi is the most dramatic illustration yet that this concept is true, at least from a site owner's perspective. It makes it more likely that a trend will emerge where what benefits the site will win out over what benefits the user.
Again, to defend Microsoft, AltaVista is a good swap for Inktomi. You can fault Microsoft for putting a business opportunity first, but it's harder to argue that its users are losing because of this. But that may not be the case in the future with Microsoft, or with other search deals.
As for Microsoft, it gains in the deal by advancing its Hotmail and other MSN portal applications, with a heavy stress on the word applications. Don't think of the deal as Microsoft repackaging MSN content for AltaVista, for that's incorrect. Instead, Microsoft is licensing web applications -- web software -- to AltaVista.
The distinction is crucial, because Microsoft knows software much better than it knows media. We've gotten used to Microsoft dominating software categories such as word processors and spreadsheets. But Microsoft does not enjoy the ubiquity on the web that it has on personal computers. Pick any topical site that Microsoft produces, and you'll find it faces strong competition -- and there are plenty of areas where Microsoft doesn't have properties at all.
But the rollout of Hotmail to AltaVista is perhaps a harbinger of Microsoft applying its strength, software, to winning on the Internet. After all, Hotmail is simply web-based software. Instant messaging is the same. Microsoft looks to be building a suite of online software applications that it can license out to thousands of users at a time, through web portals. Expect more deals like AltaVista's to come.
"We are talking to a wide range of OEMs, ISPs and web sites about working with us to license these platforms. We are doing this today with our travel platform. American Express and a couple of airlines are using essentially the same back end that we have with Expedia. You'll see us do more deals like this in the future," said Microsoft's Taucher.
Now for some specifics on the change at MSN. There is no timetable on when the MSN Search will begin using AltaVista's results, but Inktomi will probably continue to power the service for the next five or six months, said Bill Bliss, general manager of MSN Search.
MSN Search will only be taking web results from AltaVista, not AltaVista directory results that come from LookSmart, the RealNames links or the Ask AltaVista information provided by Ask Jeeves. MSN does plan to enhance the AltaVista web results in the same manner it had intended to reshape Inktomi results.
"We've got some fairly firm plans that we are not prepared to talk about for MSN search," Bliss said. "This arrangement doesn't affect that."
The deal with AltaVista only affects the MSN Search service, not the entire MSN site. That means those entering through the MSN front page will continue to be offered a choice of several search services, with the top lineup remaining AltaVista, Infoseek, Lycos, Snap and MSN Search itself.
The deal does not prevent Inktomi from being a search partner elsewhere within the MSN network. Inktomi could turn up as providing specialty search features, such as the custom crawling it does for GeoCities. An incentive for this exists in that there remain contracts between Inktomi and Microsoft. Neither company will release more details about this, but both expressed an interest in continuing to work together.
Inktomi might also provide some of its caching and web shopping technologies to Microsoft, which are its other core products besides search. Having multiple products is why Inktomi stresses that while losing the MSN search deal was a blow, it wasn't a knock-out punch.
"Is it a loss that was a hard one for us and that we were upset about? Yes. Was it a material part of our operation? No. We are pretty well diversified," said Inktomi marketing director Kevin Brown.
Back at AltaVista, there are no immediate timetables as to when Hotmail will take over from iName, or when personalized and community portal options will begin appearing. A personalized My AltaVista service already exists, having been offered to those using Compaq Presarios since January 9, as an option linked to their keyboards. But the general public has to wait longer, in part because AltaVista wants the service to be perfected before a general release.
Do expect to see AltaVista launching a US $60 million brand-building campaign shortly. It will be interesting to see how much this may increase traffic. Since its launch, AltaVista has enjoyed significant grassroots traffic. Only in the past year has it spent significantly on advertising itself, in particular at the Netscape Net Search page. That boosted traffic even more, and now it has a war chest from Compaq to use in stepping up its efforts.
AltaVista president and CEO Rod Schrock, formerly senior vice president of Compaq's Consumer Product Group, said the company has plenty of funds to carry it through this year, and an IPO is expected before the year's end. He added that AltaVista was managed as breakeven until the third quarter of last year, and that he expects to run at a loss for the next two years as the company develops.
"We will be significantly investing in growth and brand development," Schrock said. "I expect a negative position for the next two years, then we'll emerge as one of the highest revenue companies on the Internet in a profitable position."
One notable moment during the press conference came when Schrock demonstrated the still relatively new AltaVista Photo Finder service.
"Where can I find a picture of a giraffe?" Schrock asked as an example. He then showed how, using Photo Finder, "You can quickly get to that web site, download that image and include it in the book report."
It sounds great, and it is for web searchers. But I continue to hear from artists who are upset with the Photo Finder service, feeling AltaVista is using their images without permission and profiting from them. Schrock's presentation highlights the type of concerns they have. He made no mention of checking to see if the image was protected by copyright restrictions. Instead, Photo Finder was presented as if it were a source of free photos to use as desired.
AltaVista has changed Photo Finder so that it presents Corbis images first, which can be used freely for non-commercial, personal purposes. In fact, in Schrock's example, it was images from the Corbis catalog that were displayed -- although he clearly assumed that these were drawn from across the web.
Despite the change, non-Corbis images continue to be listed. They either appear after relevant Corbis images have been displayed, or they appear if there are no Corbis images that match the search query.
Artist Les Kelly is one of those who complained to AltaVista in writing about the indexing of images from his site, and he still plans to pursue damages. He has sent a similar complaint to Lycos and, most recently, to the ArribaVista image search service. ArribaVista has since removed images taken directly from Kelly's site. AltaVista has also removed Kelly's images, Kelly has said. But Kelly refuses to implement a robots.txt restriction until reaching a settlement with AltaVista, so those images could reappear.
AltaVista didn't get back to me with a response on whether it was concerned about complaints by Kelly and others. Apparently, it is not. When I had talked about the issue with Greg Memo, AltaVista's new vice president in charge of business and technology strategy, he said only that he wasn't personally aware of Kelly's complaint. Kelly has reported to me previously that AltaVista tells him it considers the issue to be dead.
Compaq Press Conference
View the event for yourself online.
AltaVista aims to be among top 3 on Web in revenues
Reuters, Jan. 26, 1999
Just a few more details on AltaVista's IPO plans and its upcoming brand development campaign.
Compaq Goes Shopping.com
PC World Today, Jan. 12, 1999
Details on the Shopping.com purchase, a stock swap valued at $220 million.
Microsoft wants piece of portal pie
News.com, Jan. 26, 1999
Details on MSN's first steps towards licensing branded versions of portal applications to those who provide access through hardware or dial-up.
Inktomi sees no impact in Microsoft exit
Reuters, Jan. 26, 1999
The MSN deal represented only 5 percent of Inktomi's projected 1999 revenues, according to one analyst. Article details why the company isn't worried, from a fiscal standpoint.
Multimedia Search Complaints
A round-up of articles and resources relating to complaints about image indexing.
AltaVista Debuts Search Features
The Search Engine Report, Nov. 4, 1998
In case you missed it, here's a rundown on the new search features that were recently added to AltaVista.
Excite was the latest of the search services to pick up a deep-pockets partner, with the announcement on January 19 that it would merge with @Home Network in a $6.7 billion stock swap.
@Home offers high-speed Internet access through cable television systems. It has affiliate partnerships with 18 cable companies worldwide, and AT&T is to become a major shareholder in the company when it completes its acquisition of Tele-Communications Inc.
With last year's partnership between Disney and Infoseek, and AOL having purchased Netscape, analysts have been expecting leading portals Excite, Lycos and Yahoo to seek out partners to assure their competitiveness and survival.
Now Excite has made its partnership, but ironically, perhaps it will assure @Home's survival more than vice-versa. Excite already reaches far more people than does @Home.
In fact, Excite's value is currently so high that AOL expects to net a half billion dollars from selling off most of the Excite shares it gained in a stock swap two years ago for the WebCrawler service.
@Home buys Excite in $6.7 billion deal
News.com, Jan. 19, 1999
AT&T nabs content for broadband bid
News.com, Jan. 19, 1999
How AT&T fits into the @Home/Excite deal.
Acquisition of Excite Has Rivals Thinking Broadband
Internet World, Feb. 1, 1999
Does Excite's partnership with a high-speed access provider mean that high-bandwidth multimedia portals are coming? It's being thought of.
Is Broadband Being Oversold?
Internet World, Feb. 1, 1999
Cable Internet access has potential, but there have been bumps along the way.
@Home suffering cracks in the foundation?
News.com, Jan. 26, 1999
Similar to the article above.
America Online Cash Pile $2 Billion After Excite Sale
Reuters, Jan. 28, 1999
Yahoo is to purchase web community provider GeoCities in a stock swap valued at $3.6 billion, it was announced on January 28.
Yahoo has had a partnership with GeoCities since January 1998, when it made a $5 million stock investment in the company. The acquisition will give Yahoo members the option to establish free home pages, a portal service that competitors Infoseek and Lycos have already offered for several months. Yahoo also plans to offer its portal features such as Yahoo Clubs to GeoCities members.
With a Bang, Yahoo Adds Home-Page Hosting
Internet World, Feb. 1, 1999
Yahoo Finally Gets the Network News
Industry Standard, Jan. 29, 1999
Expect more acquisitions by Yahoo in the future, sources say.
Yahoo to Acquire GeoCities
InternetNews.com, Jan. 28, 1999
Will broadband determine Yahoo's future?
News.com, Jan. 21, 1999
Forget the parts about whether Yahoo needs a broadband partner. The interesting bits are toward the end, where Yahoo's CEO talks about how it considered an acquisition of Excite and how it doesn't feel it needs to be acquired itself.
Toward the end of last year, I wrote about the concept of personalized search. This is where search results are custom tailored to your personal profile. Now Direct Hit is making such a system available to its partners.
Direct Hit already provides results that are ranked by user popularity through a variety of venues, with HotBot being best known. There, you can do a search, then choose to view sites ranked in order of user popularity. That popularity is determined by measuring which sites users actually select from the search results.
Personalized search goes a step further. Services like Yahoo and Excite already have millions of registered users who have provided basic demographic information such as age, sex and geographical location. Direct Hit's system can marry this type of information to the sites that users choose from search results.
For example, Direct Hit's personalized system can distinguish between the sites that men consider popular in response to the query "flowers," as opposed to what women like. Its limited test system so far shows that results for men are dominated by sites allowing online purchase of flowers, while results for women tend to have more sites providing pictures of flowers or online ordering of seeds.
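The idea can be sketched in a few lines of Python. This is only an illustration of ranking by selections within a demographic group, not Direct Hit's actual system, whose internals are not described here. The function names, the group labels and the sample sites are all made up.

```python
from collections import Counter, defaultdict

# clicks[(query, group)] counts how often each site was selected from results.
clicks = defaultdict(Counter)


def record_click(query, group, site):
    """Log one anonymous selection: only the demographic group is kept."""
    clicks[(query, group)][site] += 1


def popular_sites(query, group, limit=10):
    """Return sites for a query, most-selected first, within one group."""
    return [site for site, _ in clicks[(query, group)].most_common(limit)]


# Made-up sample data, for illustration only.
record_click("flowers", "men", "order-flowers.example")
record_click("flowers", "men", "order-flowers.example")
record_click("flowers", "men", "flower-photos.example")
record_click("flowers", "women", "flower-photos.example")
record_click("flowers", "women", "flower-photos.example")
record_click("flowers", "women", "seed-catalog.example")

print(popular_sites("flowers", "men"))
print(popular_sites("flowers", "women"))
```

The same query produces a different ordering for each group, because the counts are kept separately per group rather than pooled.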
It's a very powerful concept, and I'm looking forward to seeing how it actually works on a large scale. That may happen if one of Direct Hit's existing partners, or a new partner, picks up the option. The company is optimistic that this may happen in a matter of weeks.
"Right down the list of the top portals, they are all interested," said Direct Hit Chairman Gary Culliss.
The idea of linking search selections to personal profiles may raise privacy issues with some people, but Culliss says any implementation would be anonymous.
"Just as with our popularity engine, the Direct Hit data is totally anonymous. We never know who performs a particular search. We only know that someone with certain demographics performed a particular search," he said.
Direct Hit has also announced a new partnership with LookSmart. The directory is to use Direct Hit's technology to present search results as ranked by user popularity, similar to the system at HotBot. The system should go live there next month, Culliss said.
The popularity results themselves have gained a refinement in the form of displaying queries that are related to the main search topic, similar to the system that AltaVista debuted in December.
For example, in a search for "music" at HotBot, the Direct Hit option will display top music sites as ranked by popularity. But above this list, Direct Hit will also show related searches such as "sheet music" and "music videos." The idea is to help the user focus in on exactly what they are looking for.
Here's an update on where Direct Hit technology is in action:
At HotBot, do a search, and if Direct Hit data is available, you'll see text just above the results that says "Top Ten Most Visited Sites." Click on this text, and you'll be presented with the Direct Hit data. Related queries will also be shown, if they are available.
Direct Hit also has a partnership with AOL's ICQ. By default, those running the ICQ99a client can submit a query to the Inktomi-powered ICQ search service. To see Direct Hit results, ICQ users should choose the ICQ IT Most Visited Results option, which appears when clicking on the arrow to the right of the search box.
Mac users running Mac OS 8.5 can access Direct Hit results through the Sherlock search client. The Direct Hit plug-ins can be found within the Apple web site.
Don't have ICQ? You can see ICQ Inktomi-powered results via this site. Direct Hit data, if available, can be found by selecting the link which says "Get the top 10 results visited by ICQ members."
Counting Clicks and Looking At Links
The Search Engine Report, Aug. 4, 1998
More about how the Direct Hit system works.
RealNames announced a partnership with Inktomi in January, which extends the alternative web address system's presence even further into the search arena.
RealNames already has existing partnerships with search services AltaVista and LookSmart. Now Inktomi will be able to offer its search partners the ability to have RealNames links integrated into its search results.
The service has also been made much more useful as a search tool, thanks to a new high-speed address resolver that reports whether a name actually exists in its registry.
You can see the difference this makes by doing a search at AltaVista. Look for "nike," and you'll see the RealNames link listed with this text:
The RealNames link takes you directly to Nike.
In contrast, search for "shoes," and the text reads,
shoes - List of near matches related to shoes provided by RealNames.
What's happening is that when someone searches at AltaVista, the query is simultaneously sent to RealNames. It checks to see if the query matches a name that has actually been registered and reports this status to AltaVista. It all happens so fast that users shouldn't notice a decrease in speed.
"We are capable of answering the question in less than one millisecond," said RealNames CEO Keith Teare. "It adds no overhead to AltaVista users."
When a RealName exists, anyone clicking on the link in AltaVista is taken directly to a web site. If it doesn't exist, then users would instead be taken to the RealNames search engine, where several names containing the terms would be displayed. This is where the new system is clever. By telling AltaVista the status of a name, the search service can then present appropriate text that helps its users decide whether to select the link.
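Here is a rough Python sketch of that status check, using the two link texts quoted above. The registry contents and function names are hypothetical, and the real resolver is certainly more involved than a dictionary lookup.

```python
# Hypothetical registry mapping a registered name to its display form.
REGISTRY = {"nike": "Nike"}


def resolve(query):
    """Return the registered display name, or None if the name is not registered."""
    return REGISTRY.get(query.strip().lower())


def link_text(query):
    """Choose the text shown beside the RealNames link for this query."""
    registered = resolve(query)
    if registered is not None:
        # Exact match: clicking goes straight to the site.
        return f"The RealNames link takes you directly to {registered}."
    # No exact match: clicking leads to a list of near matches instead.
    return f"{query} - List of near matches related to {query} provided by RealNames."


print(link_text("nike"))
print(link_text("shoes"))
```

The point is that the search service learns the status before it builds the results page, so the link text can tell users what a click will do.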
To build brand awareness, RealNames has also begun using the RN superscript in the way TM is used to indicate trademarks, Teare said.
A surprising area of new traffic has been users of the Infoseek Express client, which is RealNames enabled. Teare said that 12.5 percent of its 1.6 million RealNames resolutions per day come from those running the client. But AltaVista generates the bulk of resolutions, responsible for 81 percent of the system's traffic.
RealNames also got good news on the legal front. A patent infringement suit filed by competitor Netword was tossed out on January 8. The U.S. District Court for the Eastern District of Virginia ruled in a summary judgment that Centraal's RealNames system did not infringe upon Netword's patent.
RealNames Expands Listings
The Search Engine Report, Sept. 2, 1998
More about the RealNames system.
Netscape has created a branded version of the former NewHoo directory that it acquired last November and begun integrating it into the Netscape site.
Visitors will encounter the Netscape Open Directory primarily if they choose the "Find Web Sites" option from the Netscape home page. Over time, they will also find references to it when browsing the site's channels, as Netscape begins to use its directory categories in place of the branded offerings it has been pulling from Excite. The directory can also be reached directly via the URL below.
The name Netscape Open Directory may sound familiar, because that's originally what Netscape relabeled NewHoo, after it acquired the volunteer listing service. It has been renamed once again to the Open Directory, to distinguish it from the Netscape branded effort.
So to be clear, there is the Open Directory, which was NewHoo, and the Netscape Open Directory, which is a copy of the Open Directory promoted within the Netscape site and with Netscape's branding. It would be less confusing if the services had more distinct names. To help, I'll refer to the branded version as the "Netscape Directory" in this article.
The Open Directory remains a volunteer effort, with the overall goal of organizing web sites and publishing the information for anyone to use, including Netscape or its competitors. Thus, the Netscape Directory is powered by the public-domain Open Directory. Likewise, a new free use license would allow even Yahoo to make use of the Open Directory's information, assuming they provided appropriate credit. But this is seen as unlikely.
"Our feeling is a major competitor wouldn't do that, because it would be like ceding control to us," said Chris Tolles, the Open Directory's product manager.
Instead, Netscape expects that people may use portions of the guide to provide specialty offerings. It is making the complete directory available every two weeks or so as an RDF dump, to assist those interested in development. Netscape doesn't know of anyone using this method, however. Rather, most people seem to make copies of individual pages of interest and place their own branding and banners on them.
"We've had a lot of editors utilize their categories within their own sites," Tolles said.
Netscape says it is also taking advantage of the directory to build out regional portals, such as in Canada, Germany, the United Kingdom and Australia. Visit Netscape Canada, for example, and you'll see that the directory categories are immediately available from the home page.
There have also been a number of changes meant to improve and expand the Open Directory's content. A dead link crawler has reduced dead links from an estimated five percent to less than one percent, Netscape says.
Bulk content has also been added. In particular, the directory includes story archives from News.com, Wired, Time and Newsweek, along with content from several online encyclopedias.
For webmasters, there's no need to submit to both the Netscape Directory and the Open Directory. Submitting to the Open Directory will get you listed in the Netscape Directory, since the Open Directory is what powers the Netscape version.
Of course, you can enter the submission process through the Netscape version. If you do this, be sure to select the "Add URL" option at the bottom of the category you are submitting to, rather than the "Add Site" option that appears on the next line down. The Add URL link puts you into the Open Directory submission process, while the Add Site option routes you to the Register It service. The free version of this is not configured for submissions to the Open Directory.
Netscape Open Directory
Open Directory Free-Use License
NewHoo Becomes Netscape Open Directory
The Search Engine Report, Dec. 3, 1998
More about how the volunteer directory works.
The search engine size wars may be about to heat up again, with Northern Light now claiming it has the largest index of the web.
Northern Light is staking its claim on the latest survey conducted by search expert Greg Notess, author of the Search Engine Showdown.
Notess has conducted test queries on the major search engines since October 1996. In his latest round, completed on January 5, he found Northern Light provided the most matches, followed by AltaVista and then Inktomi-powered HotBot.
In contrast, the reported sizes of the search services put AltaVista first, at 150 million web pages, followed by Northern Light at 122 million web pages, and then Inktomi with 110 million pages.
Northern Light is keen to claim the title as biggest, as the service feels it will help increase its share of the search audience. CEO David Seuss says he believes both AltaVista and HotBot have gained significant traffic because they were seen as the biggest. If Northern Light gains the title, he expects it will also gain in popularity.
"AltaVista was perceived to be the largest, and it became quite popular as a result," Seuss said. "An incorrect study that was widely published found HotBot had 150 million pages, and it picked up traffic," he said, referring to the study by the NEC Research Institute that appeared in Science magazine last year.
Seuss disagrees with some of the findings of that NEC study, which is why he refers to it as incorrect. Likewise, although he is happy to be ranked as largest in the survey from Notess, he doesn't feel estimates are a good enough solution.
"Any estimate is open to criticism no matter how valid the methodology. So, why not simply ask each company and have the company give accurate information?" Seuss asks.
Companies already provide self-reported sizes, but Seuss wants these audited for accuracy -- or at least to have an audit of one search service's size, which could then be coupled with test queries to arrive at sizes for the others.
For example, imagine that you run a series of searches at the major search engines. You might discover that on average, AltaVista has 21 percent fewer matches than Northern Light (this is just an example -- not a fact!). If you then knew that Northern Light had exactly 122 million web pages, you could subtract 21 percent from that number to estimate the size of AltaVista's index, which would be about 96 million pages. You could also compare this to the self-reported number, to see if the figures were inflated.
There hasn't been a need for audited figures until now because self-reported sizes have been good enough. AltaVista and Inktomi have stood well above their competitors with indexes in excess of 100 million web pages for over a year. They've swapped the title of biggest a couple of times, but neither of them has felt that the other inflated its size enough to raise a serious complaint.
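The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this sketch, the 21 percent figure is the hypothetical shortfall from the example above, and the 122 million figure is treated as if it had been audited, which it has not.

```python
def estimate_index_size(audited_size, percent_fewer_matches):
    """Scale one audited index size by the average shortfall seen in test queries.

    audited_size: page count of the one service whose size was audited.
    percent_fewer_matches: how many percent fewer matches the other
        service returned, on average, across the same test queries.
    """
    return audited_size * (1 - percent_fewer_matches / 100)


# Hypothetical inputs taken from the example in the text, not measured facts.
northern_light_pages = 122_000_000   # assumed to be an audited figure
altavista_shortfall = 21             # percent fewer matches (made-up example)

altavista_estimate = estimate_index_size(northern_light_pages, altavista_shortfall)

# Report the estimate in millions of pages, rounded to the nearest million.
print(round(altavista_estimate / 1_000_000), "million pages (estimated)")
```

The same function could be run once per search service, each with its own average shortfall, and the results set beside the self-reported sizes to see how far apart they are.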
As for the other services, aside from Northern Light, their sizes have remained at 50 million web pages or fewer. Since they are clearly smaller than AltaVista or Inktomi, they've had no incentive to raise doubts about AltaVista's or Inktomi's numbers. Nor have they been inclined to squabble among themselves about who's biggest within their range. That's because they've all firmly been in the "bigger isn't necessarily better" camp. Users want quality before quantity, goes the refrain, and so they say they've focused their efforts on improving relevancy.
Northern Light's desire to be biggest puts new pressure on the reported numbers. It could indeed be the largest index but has no easy way to prove its claim. Meanwhile, database sizes are about to increase among some players, such as Infoseek.
"You always hear AltaVista and HotBot say they are the biggest, and we definitely want to go after that," said Jennifer Mullin, Infoseek's director of search. "We've just developed some different technology in house, and we've been able to invest in the scaling issues." Infoseek is currently at the 45 million page mark.
Google also aims to be a size leader. It is currently at 60 million pages, and cofounder Larry Page wants to go much higher. Page won't say exactly how high, but he gives every indication that he'd like to raise the benchmark well above the 100 million page level that currently separates the large search engines from the smaller ones.
"We want to have the most comprehensive, highest quality search that is available," Page said.
The historic size leaders, AltaVista and Inktomi, also plan some increases to stay competitive, though growth isn't their top priority.
"Will we see a 200 million web page index in 1999? Probably," said Louis Monier, AltaVista's chief technical officer. "But don't expect a big jump beyond that. My goal is really to have a more useful index."
Similar sentiments come from Inktomi:
"We're not just interested in being able to advertise ourselves as the biggest. We want to be the best," said Troy Toman, Inktomi's director of search services.
Toman's words are echoed by all the search services. Big is nice, but relevancy remains the chief goal. I agree. I think index growth is important, and some services are overdue to increase their sizes. But while size does make a difference, it is not the only reason to choose one search service over another.
I'll be exploring the size issue more in coming months, but I think even the reported sizes give a user looking for comprehensiveness a good idea of which services to use. At the moment, AltaVista, an Inktomi-powered service like HotBot or MSN Search, or Northern Light are all excellent choices for searching across a large portion of the web -- as are meta search services.
Surveys like those conducted by Notess, NEC, Melee or the Search Engine EKGs that I produce (and expect to update soon) are also good ways to gain a better idea of comprehensiveness. To assist in these types of surveys, there may be some standards that could be established to help outside observers measure better. In turn, these standards may lead to the type of audited results Northern Light is looking for, or at least more confidence that the self-reported numbers are indeed accurate.
Search Engine Sizes
See current reported sizes and index growth over time, based on reported numbers. You will also find links to the NEC and other size studies.
Search Engine Standards Project
Read more about some standards that, if established, might help users determine index size.
Search Engine Showdown
This site from Greg Notess provides searching tips and surveys of search engine sizes, dead links and other data.
WebTrends Upgrades Search Analysis
WebTrends is a popular log analysis program that had a terrible flaw when it came to search term reporting. The program would break multiple word phrases up into individual keywords, so that it was impossible to know exactly how people found your site via a search engine.
I'm happy to report that this flaw has been corrected in the program's latest release, WebTrends Log Analyzer 4.5. It should be a must upgrade for anyone running it. Likewise, if you have a web hosting provider using an older version of WebTrends, urge them to upgrade. And if you are not using a log analysis program at all, this provides all the more reason to consider WebTrends as an option.
WebTrends will show you the top terms generated by each search engine, in its "Top Search Engines with Phrases Detail" table. This is something also available from Hit List Pro, the entry-level package from competitor Marketwave. But WebTrends goes further and offers two other search reports that are only available in Marketwave's higher-end packages.
The first shows a list of top search phrases without subdividing them by search engine. This is the "Top Search Phrases" table. It makes it very easy to spot the top phrases that are generating traffic, across the board.
The second, which I absolutely love, lists top phrases and details how much each search engine contributed to a phrase's total count. This is the "Top Search Phrases with Engines Detail" table. It's an exceptional way to see exactly where you stand for each search term, on each of the major search engines.
The old "Search Keyword" reports remain, but do yourself a favor and disable all of these. They provide no useful information. It's the "Search Phrases" reports that you want.
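To see why phrase reports tell you more than keyword reports, here is a minimal Python sketch of the three kinds of tallies described above. It is not how WebTrends works internally. The function names and the sample visits are invented for illustration, and the sketch assumes the engine name and search phrase have already been pulled out of each referrer.

```python
from collections import Counter, defaultdict

# Hypothetical (engine, search phrase) pairs, one per visit.
visits = [
    ("AltaVista", "search engine tips"),
    ("AltaVista", "search engine tips"),
    ("Infoseek", "search engine tips"),
    ("HotBot", "engine repair"),
    ("Infoseek", "engine repair"),
]


def keyword_counts(visits):
    """Old-style report: phrases are split into single words and tallied."""
    counts = Counter()
    for _engine, phrase in visits:
        counts.update(phrase.split())
    return counts


def phrase_counts(visits):
    """Top phrases across all engines, with each phrase kept whole."""
    return Counter(phrase for _engine, phrase in visits)


def phrases_with_engines(visits):
    """Top phrases, with each engine's contribution to the phrase total."""
    detail = defaultdict(Counter)
    for engine, phrase in visits:
        detail[phrase][engine] += 1
    return detail


# The keyword tally lumps "engine" from two unrelated phrases together.
print(keyword_counts(visits).most_common())
print(phrase_counts(visits).most_common())
for phrase, engines in phrases_with_engines(visits).items():
    print(phrase, dict(engines))
```

In the keyword tally, the word "engine" comes out on top even though nobody searched for it alone, which is exactly the kind of misleading result the phrase-based tallies avoid.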
WebTrends has even provided an option to add search engines to its reporting list, or the ability to change search engine parameters as needed. I may cover this more in the future. In the meantime, information can be found by running Help, then entering "Modifying Search Engines" into the search box on the Index tab. The help file listed will explain things in more detail.
You probably won't need to make changes or add services for some time. WebTrends is already preconfigured for a wide variety of services, including GoTo.com.
Submission Weather Report
I thought it would be nice to run an update on the current submission trends that I've been seeing at the major search services:
In general, AltaVista and Infoseek still have about a one-to-two day turnaround time for listing any page submitted to them directly. Only in the past few days has Infoseek seemed to be taking longer, but I doubt this is a trend. I'm also getting new reports that the problem with submitting .uk sites to Infoseek seems to have returned. I'll be checking on this.
Excite and WebCrawler still show about a one week turnaround in indexing home pages, with inside pages taking up to three weeks to appear at Excite. Remember, Excite may not list all your inside pages, as discussed in the last newsletter. Also, there no longer appears to be any reason to submit to both Excite and WebCrawler, as the WebCrawler Add URL page is now feeding into the Excite submission system. I'll get confirmation on this for the next issue.
Inktomi-powered HotBot confirms that it is currently taking longer than the two weeks promised on its Add URL page to list new submissions in its index. It estimates turnaround time to be three-to-four weeks, and I wouldn't be surprised if this is stretching to five weeks for some people.
Meanwhile, Inktomi-powered MSN Search is showing a turnaround time of one-to-two days for pages submitted via its Add URL page. Technically, you are supposed to only submit the home page of your web site and then wait for the system to crawl your other pages. In fact, if you try to submit more than one page from the same site on the same day, the system will reject the excess attempts. However, there appear to be no problems associated with submitting a single page per day from a web site. This is a useful way to quickly add additional key pages to the MSN index.
Moreover, Inktomi says that additions to MSN should eventually be reflected in the HotBot index. So submitting to MSN Search may actually turn out to be a faster way of getting listed with HotBot, for the time being. That's ironic, since HotBot was serving as the de facto submission service for MSN Search until the service debuted its own Add URL page about two months ago.
Over at Lycos, the index is supposed to reflect new additions and crawled information as of mid-December 1998. However, I still see test pages submitted months ago that haven't appeared. Some people report they have gotten in, and Lycos says it hopes to return to refreshing its index every three weeks, in the near future. Until then, expect new submissions to take up to five weeks to appear at the service.
Search Engine Notes
Is Lycos Next?
In the wake of recent portal mergers and acquisitions, eyes are turning toward Lycos as a possible target. Lycos says it will go it alone, for the time being -- but it has also given impressions that it wants a partner. Below is a collection of articles on the subject.
The urge to merge on the Net
US News & World Report, February 1999
Net merger mania continues
News.com, Jan. 28, 1999
Lycos: 'We'll Stay Independent'
Reuters, Jan. 25, 1999
Excite Sued Over Banner Ad
Excite is reportedly being sued by Estee Lauder for selling banner ads linked to Estee Lauder trademarks to a competing fragrance company.
Lawsuit Filed Over Keyword Search Ad
InternetNews.com, Jan. 29, 1999
Mascara Mogul Sues Excite
Wired, Jan. 28, 1999
Search Book Offers Easy Learning
I immediately liked Search Engines for the World Wide Web when the first edition appeared. It was a small book, nicely organized, well written and full of illustrations that made learning easier. Now the second edition from authors Alfred and Emily Glossbrenner is out. It's an excellent resource for anyone who wants to learn more about search engines and searching better. Some parts of it are already dated -- such is the nature of trying to keep up with search engines that continually reshape themselves. But much of the book remains evergreen, especially sections on searching commands and tips. This book offers an excellent alternative to those tired of reading about search engines online.
Search Engines for the World Wide Web (2nd Edition)
Peachpit Press, ISBN 0-201-69642-8
Information Please Launches Kids Site
Reference provider Information Please has launched a new site aimed at children. Infoplease Kids' Almanac provides facts and information oriented around the needs of children.
Infoplease Kids' Almanac
Search Engine Articles
Portals rethink retail strategies, shopping agents
Ad Age, Feb. 1, 1999
Shopping bots and shopping centers integrated into portals make sense as a user benefit, but they can also upset online retailers advertising on those same portals in hopes of capturing customers. A look at how services are walking the tightrope.
Cheap, Cheaper, Cheapest
Industry Standard, Dec. 31, 1999
Similar to the article above, this details how some online retailers aren't pleased with visits from comparison shopping robots, while others are happy to see them. It also looks at players in the shopbot marketplace.
Search Engine Portals Tick Me Off
ClickZ, January 1999
Tom Hespos reports that several search engines have adopted a policy of requiring advertisers to purchase untargeted run-of-site ads along with the keyword-linked banner ads that they really want -- and he doesn't like having this excess inventory forced upon him.
Are portals the Web's Holy Grail?
MSNBC, Jan. 24, 1999
Portals aim to capture users, but users turn out to be a fickle bunch. A look at how some surfers interact and hop around between providers.
Portals: the new desktop?
News.com, Jan. 20, 1999
Are portals becoming the new portable computers? Why not? They make your address book, calendar, email and other applications normally tied to the desktop available anywhere in the world.
Snap Launches High Speed Portal, Adds Content Partners
InternetNews.com, Jan. 19, 1999
Snap has released a version of its service designed for those with high-speed access. "Snap Cyclone" will debut officially through a variety of access provider partners that have been announced and will be NBC television programming.
Below are sponsor messages that ran in this month's issue of the Search Engine Report, which may be of interest to Search Engine Update readers.
When a prospect does a search for a keyword related to your products or services, do you appear in the top 10 or does your competition? Submitting alone does nothing to ensure good visibility. WebPosition is the first software product to monitor & to help improve your search positions in the top search engines. Rated 5 out of 5 stars by ZD Net. Try out WebPosition yourself FREE:
WebReference.com-The Webmaster's Reference Library FREE NEWSLETTER
and more info at <http://www.webreference.com/> Your one-stop Web
dev resource...9-time Winner PC Mag's Top 100...Daily how-tos on
You need an easy way to build self-service Web apps. - apps. that give users the information they need, as needed. Web Self-Service Solutions from IBM can help. WebSphere Application Server offers built-in connectors to databases and a servlet-based runtime environment. WebSphere Studio provides visual page editing, servlet creation wizards and server-side logic. Visual Age for Java provides an Integrated Development Environment for developing e-business enterprise apps.
Test drive IBM Web self-service with free trial code: http://redir.internet.com/sew/990201/www.software.ibm.com/wss/news6
INTERNET WORLD WEEKLY--"The Voice of E-Business and Internet Technology"
Attention Internet professionals! Don't miss a beat in the fast-paced and ever-changing business Internet market. Internet World weekly puts all of the weekly news "into perspective," saving you valuable time and giving you an edge on breaking trends and technologies. Sections on E-commerce, Infrastructure, Web Development, and Industry, plus columns from leading reporters and journalists make Internet World the one newspaper you won't want to miss!
Act now to apply for your FREE subscription by visiting our Web site at:
How do I unsubscribe?
+ Send a message to [email protected] with the following as the first line of the body:
How do I subscribe?
+ The Search Engine Update is only available to paid subscribers of the Search Engine Watch web site. If you are not a subscriber and somehow are receiving a copy of the newsletter, learn how to subscribe at: http://searchenginewatch.com/about/subscribe.html
How do I see past issues?
+ Follow the links at:
Is there an HTML version?
+ Yes, but not via email. View it online at:
How do I change my address?
+ Send a message to [email protected]
I need human help!
+ Send a message to [email protected]. DO NOT send messages regarding list management or site subscription issues to Danny Sullivan. He does not deal with these directly.
I have feedback about an article!
+ I'd love to hear it. Use the form at http://searchenginewatch.com/about/contact.html.
This newsletter is Copyright (c) Internet.com LLC, 1999