THE SEARCH ENGINE REPORT
May 3, 2000 - Number 42
About The Report
The Search Engine Report is a monthly newsletter that covers developments with search engines and changes to the Search Engine Watch web site, http://searchenginewatch.com/.
The report has 130,000 subscribers. You may pass this newsletter on to others, as long as either part is sent in its entirety.
Did you know that there's a longer, more in-depth version of this newsletter? The twice-monthly "Search Engine Update" newsletter is just one of the many benefits available to Search Engine Watch "site subscribers." Learn more about the advantages to becoming a site subscriber at this page:
Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ About The Search Engine Watch site
+ Search Engine Strategies Conference
+ AltaVista Launches New Search Site
+ Yahoo Changes Listings
+ GeoSearch Comes To Northern Light
+ Goodbye Domain Names, Hello RealNames?
+ Google Speaks Languages, WAP, Adds Other Features
+ Movement In Meta Search
+ Auction Search Case Awaits Ruling
+ Northern Light Wins Domain Suit
+ New Services Target Mobile Web Users
Search Engine Articles
+ Interesting articles relating to search engines.
+ List Info (Subscribing/Unsubscribing)
About The Search Engine Watch site
I have done a huge amount of updating to the Search Engine Resources area of the web site. I've also reorganized the Search Engines and Legal Issues page to group articles into topics such as domain disputes and meta search complaints.
Search Engine Resources
Search Engines and Legal Issues
Search Engine Strategies Conference
With the London Search Engine Strategies conference now a happy memory, I'm beginning planning for its return to San Francisco on August 14. I'll be presenting and moderating sessions that feature experts on search engine marketing issues and panelists from various search engines themselves. In addition, there will be a special session on shopping search, which should be of interest to any online retailers. Details about the conference, for attendees or potential sponsors and exhibitors, can be found via the URL below.
Search Engine Strategies 2000 - San Francisco
AltaVista Launches New Search Site
AltaVista has launched a search-only site today which follows on improvements the company made to its core database of web pages about a week ago. Called Raging Search, the new site delivers fast and uncluttered listings.
"This is for the search enthusiasts, for the tech and web savvy people looking for pure web results," said Rajiv Parikh, AltaVista Search's marketing director.
The move to launch a search-centric site is significant, because it goes completely against the trend search engines have followed to date. Usually, when a search engine has become popular, it begins adding on "portal" features such as news headlines, horoscopes or free email in hopes of keeping users within the site.
Among the major search engines, the notable exceptions to this have been Google, GoTo and Northern Light. Google, in particular, has won praise from searchers I talk with, who are impressed with the quality of its results and its concentration on search. GoTo has also attracted a significant audience and is able to eschew portal features because its pay-for-placement system actually means the search engine makes money when users leave its web site.
For the moment, Raging Search has no banner ads. Instead, AltaVista plans to make money with ecommerce and affiliate links that appear at the bottom of its search results page. Text ads, such as those Google uses, may also be tried in the future, AltaVista says.
Raging Search allows users to customize results in several ways, such as to see up to 50 listings at a time, to display a more compact format, to filter out adult content and to set language preferences, among other options. By default, only one page per web site is displayed in the results, but more can be seen by using the "Results from this site only" option.
Raging Search uses the same web page index as does AltaVista itself. However, expect that there could be variations when running the same searches in both places, especially for multiple word queries. Raging Search is designed to be a sort of test bed for improvements that may migrate over to the main AltaVista site. Consequently, a different ranking algorithm may be in use, or Raging Search may process the query in a different manner.
As for the web page index, AltaVista has expanded it to 350 million pages, which the company says are the best on the web. AltaVista has begun using a new "connectivity" graph that analyzes links from over 1 billion pages. From this, the top 600 million are crawled, then the index is reduced to 350 million pages when duplicates, dead links and spam are removed.
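The funnel described above (score pages by links, crawl the top slice, then discard duplicates, dead links and spam) can be sketched in a few lines of Python. This is only an illustration: AltaVista has not said how its connectivity graph scores pages, so the plain inbound-link count used here is an assumption, and the names select_index, fetch and is_spam are hypothetical.

```python
from collections import Counter


def select_index(links, fetch, top_n, is_spam):
    """Pick pages for an index from a link graph.

    links   -- dict mapping a page URL to the list of URLs it links to
    fetch   -- callable returning the page text, or None for a dead link
    top_n   -- how many of the best-linked pages to crawl
    is_spam -- callable returning True if the page text looks like spam
    """
    # Assumption: "connectivity" is approximated by counting distinct
    # inbound links, ignoring links a page makes to itself.
    inbound = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                inbound[target] += 1

    # Crawl only the best-linked pages (ties broken by URL for stability).
    candidates = sorted(inbound, key=lambda url: (-inbound[url], url))[:top_n]

    index = {}
    seen_bodies = set()
    for url in candidates:
        body = fetch(url)
        if body is None:          # dead link
            continue
        if is_spam(body):         # spam page
            continue
        if body in seen_bodies:   # exact duplicate of a page already kept
            continue
        seen_bodies.add(body)
        index[url] = body
    return index
```

The point of the sketch is the order of operations: far more pages are analyzed than are crawled, and more are crawled than are finally kept.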
Having this better collection of documents should mean an overall improvement in the results users see, AltaVista says. To back its claim, the company says it was ranked the overall winner in a new relevancy survey carried out by ZD Labs. The survey itself has not yet been released, so I can't comment on it further. I'll follow up on it in a future newsletter.
Overall, I'm pleased to see the new standalone search service. There's definitely a demand for pure search, and even though AltaVista estimates the demand is relatively small, it's nice to see them cater to this audience. It will be interesting to see if other services mimic the move.
Search Engine Sizes
This page compares reported sizes of major crawler-based search engines. It hasn't yet got the new AltaVista numbers posted, but it will shortly. It has been updated to reflect the new 500 million page index Inktomi says will go live later this month. Also see the "Numbers, Numbers -- But What Do They Mean?" article for a further explanation of how extra pages are spidered in order to increase relevancy. AltaVista is now taking the approach described for Inktomi.
Google Adds Directory
The Search Engine Report, April 4, 2000
Describes how Google is using link analysis to relevancy rank listings at the Open Directory
Yahoo Changes Listings
Significant changes to the look and functionality of Yahoo are few and far between, which can be a relief given how some other search engines can seem to be constantly altering themselves. Nevertheless, even Yahoo needs to refresh itself now and then. The new changes it introduced last month are designed to help users more easily locate what they are looking for.
The most significant change has been to reorganize how information is listed within Yahoo's category pages. The "Online Horoscopes" category, listed below, provides a good model to understand what's new.
You'll see that any of Yahoo's own relevant content now appears in the "Inside Yahoo" area. After that, any subcategories or categories related to the topic you are viewing are shown, in the "Categories" area. Next comes the "Most Popular Sites" area, where Yahoo is using an automatic system to list the sites it believes are most popular for that category. Finally, the "Complete List of Sites" section has the familiar comprehensive list of all sites that Yahoo editors have reviewed and approved, listed in alphabetical order.
You won't find the Most Popular Sites section in all categories yet, but it should become more commonplace as Yahoo grows more comfortable with the new format. It will certainly be welcomed in any place where there are more sites than a user can easily choose from. Guidance in these situations is needed, as Yahoo itself acknowledges.
"We introduced this feature in some of our larger categories because we realize that a long alphabetical list can be unwieldy to navigate through, and while alphabetical order is extremely functional and democratic, it is also an arbitrary ordering as far as content goes," said Srinija Srinivasan, Yahoo's editor in chief. "For the user who has time to sift through the multiple sites we list, we of course present them all, as each was only included in the directory after thoughtful, manual review by our team. But for the user who would like to see some added information to supplement alphabetical order, the 'Most Popular Sites' section gives them a way of focusing in on those sites that stand out in terms of their popularity online."
The appearance of Yahoo's search results page has also been slightly modified. Instead of making the entire category path clickable, now only the "end" category itself is clickable and set apart from the path with a bullet point.
GeoSearch Comes To Northern Light
Northern Light debuted a new geographical search capability in April. It allows users to filter results so that only matches relating to a particular "real world" address will appear. For instance, imagine you wanted to find all the pizza places near your home. Using Northern Light's "GeoSearch," you can enter the word "pizza" and your zip code, then any pages that contain the word pizza and which also seem to be related to where you live will be shown.
GeoSearch is a nice alternative to entering geographical keywords, especially in that those keywords can be too limiting. For example, say I wanted to find pizza places near Newport Beach, California. I could do a "normal" search such as "newport beach pizza," and that might miss out on pizza places that are located in the neighboring city of Costa Mesa. In contrast, GeoSearch might find them, because when you give it a zip code or postal code, it knows all the geographical keywords relevant to that code, for the radius you choose.
To use GeoSearch, just select the "Geo Search" link that appears below the search box on the Northern Light home page. A special form will appear, where you can enter your search word plus additional location information. You must provide at least a zip code or a telephone area code, and you can specify a geographical search radius up to 100 miles. Searches are currently limited to US and Canadian locations, but more worldwide support will be coming later this year. After doing a search, you can use the small "Edit this search" option to the right of the results page search box to refine your query.
I did a few head-to-head tests of GeoSearch against "normal" searches at Northern Light using geographical keywords, such as the pizza query above. I found that often, the normal search was just as good as, if not better than, the GeoSearch results. Vicinity, the company behind GeoSearch technology at Northern Light, said one reason may be that not all the relevant pages within Northern Light's index have yet been geocoded. When that happens, you would expect the GeoSearch results to improve.
You may recognize Vicinity as the company behind the popular MapBlast mapping and directions web site, though that's just part of what Vicinity does. For GeoSearch, the company has developed "address recognition" software that teaches spiders how to recognize common address formats that they may encounter on web pages. If an address is found, the page is then "geocoded" when placed in the search engine's index. That means the search engine will store an appropriate real-world latitude and longitude for the page, as well as the usual cyberspace address information. If there are several addresses, then the page is assigned to multiple geographical locations.
Overall, GeoSearch seems a good tool to have for those times when you want to find information that could be situated in one of several locations. Similarly, some people may prefer using it rather than trying to think of appropriate geographical keywords. Another plus to using GeoSearch is that any addresses found on a web page will appear below that page's listing, along with a link to a map for those addresses.
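To make the idea of a geocoded index concrete, here is a minimal Python sketch of a keyword search filtered by radius. It is not Vicinity's or Northern Light's code. The page records, the function names and the use of the standard haversine great-circle formula are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against tiny floating point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def geo_search(pages, keyword, center, radius_miles):
    """Return URLs of pages that mention the keyword and are geocoded
    to at least one location within radius_miles of center.

    pages  -- list of dicts with 'url', 'text' and 'locations', where
              'locations' is a list of (latitude, longitude) pairs
    center -- (latitude, longitude) of the searcher, e.g. from a zip code
    """
    hits = []
    for page in pages:
        if keyword.lower() not in page["text"].lower():
            continue
        # A page with several addresses matches if any one is in range.
        if any(distance_miles(*center, *loc) <= radius_miles
               for loc in page["locations"]):
            hits.append(page["url"])
    return hits
```

Because the match is made on stored coordinates rather than on city names, a page that never mentions "Newport Beach" can still turn up for a search centered there.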
Northern Light GeoSearch Page
Whereonearth.com Signs Agreement With Yahoo
Whereonearth.com, April 10, 2000
Yahoo's classified and yellow pages sections are to receive their own form of GeoSearch, through a new partnership with Whereonearth.com.
Goodbye Domain Names, Hello RealNames?
While the domain name system continues to devolve into a joke, the RealNames web addressing system is growing stronger. The company recently cemented a tighter relationship with Microsoft, continues to expand its reach and has pulled back on its main issue of controversy, the assignment of "generic" terms. The moves further position RealNames as a viable alternative, if not future successor, to the current domain name system.
Devaluation of top level domains is one of the biggest problems our current system faces. Top level domains, or TLDs, are the "endings" you choose when registering a domain name. For instance, McDonald's has registered a domain name of mcdonalds.com. As you can see, the name ends in .com. McDonald's also owns a domain name with a different ending, mcdonalds.org.
Why have domains with different endings? There was a time when the endings used to mean something. In fact, they could be incredibly helpful to understand the origin and backing of a web site. For instance, the .org TLD used to be reserved for non-profit organizations. Only non-profits could register a name ending in .org, and users going to sites with this ending could be fairly well assured that they were non-commercial in nature.
Today, this is no longer the case. Anyone may register a .org, regardless of their non-profit status. In fact, domain registrars like Network Solutions and others encourage companies to register .org addresses, as well as the .net TLDs, which used to be reserved for companies involved with Internet network operations. The pitch is to "protect" yourself against other companies getting your name with these alternative endings.
Similarly, .com has become the TLD that everyone wants, regardless of whether it is appropriate. That's because with the growth of commercial sites on the web, many surfers assume that all sites must end in .com. Looking for Ford? Just slap a .com on the end of their name, and you'll find them. Looking for the Sierra Club? Even though they are a non-profit organization, you can still reach them at sierraclub.com in addition to sierraclub.org. The environmental group no doubt realized users might look for it by adding a .com to its name, and so registered a domain with that ending to ensure it could be found. Likewise, the US White House surely must regret that it never registered whitehouse.com in addition to the proper whitehouse.gov address that it currently uses. That failure meant a porn site was able to get whitehouse.com, which comes as a surprise to many who arrive there.
Back in 1996, I used to call adding .com to the end of something ".comifying," such as when my wife -- sick of hearing me talk about the Internet -- would say "getalife.com." It was a way to make anything Internet-related. Today, it's commonplace for people to talk about "dotcoms" when they are referring to web sites. It's the same principle -- the .com ending is equated to being in cyberspace.
This brings us to the issue of domain ownership. There can only be one mcdonalds.com, even if there are several companies that hold trademarks that seemingly would entitle them to the address. Who ultimately gets ownership of a .com address has led to plenty of disputes and lawsuits, and there are no signs that these are diminishing. This is especially so because no one tries to resolve problems before domains are registered. Instead, it's a free-for-all. Anyone can register a name, and complaints are dealt with after the fact.
Rather than exert preemptive control, the current thinking is to simply introduce new TLDs, such as .firm for businesses or .rec for recreationally-oriented organizations. It's an absurd solution. We've already seen how existing TLDs are perverted and how companies are encouraged to register every ending available. The introduction of new TLDs will only cause companies to spend more money on names they do not need, while users will be poorly served because the classifications will inevitably be ignored.
In even more craziness, even country-specific TLDs can be twisted away from their original purpose. In early April, the Pacific Island country of Tuvalu sold the rights to its country-specific domain of .tv to a private company, in a deal worth at least $50 million over the next ten years. Can you imagine if a country was able to sell its seat at the United Nations to a private company? To me, that's the equivalent of what's happened here. Country-specific domain names were supposed to be assigned so that we'd understand what country a domain was based in, not to serve as cash cows for sale to the highest bidder.
In short, the domain name system is a mess, and I don't see hope on the horizon -- except in the form of RealNames. In RealNames, we have an alternative system for reaching web sites that can potentially avoid the problems that the domain name system has suffered.
For one, names are subject to review before being approved. That allows potential conflicts with trademarks to be spotted before the fact, not afterwards. Moreover, if several companies seem to have an equal claim to a particular name, then usually no single company can own it. For instance, the RealNames keyword "alpine" will list several sites relevant for that keyword, rather than trying to take you to just one.
A real benefit is that names can be regionally specific. If I enter "Ford" into AltaVista, then click on the RealNames link that appears, I'll be taken to Ford's US web site. That's because AltaVista uses the US RealNames database. In contrast, if I enter "Ford" into UKMax.com, a UK-specific search engine, the RealNames link takes me to Ford's UK web site. It's the same word, "Ford," but because my location is known, I'm directed to the correct regional location.
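The behavior described in the last two paragraphs can be pictured as a keyword lookup against a per-region database, where a contested keyword maps to several sites instead of one. The Python sketch below is purely illustrative: the REGISTRY table, the resolve function and every URL in it are hypothetical placeholders, not RealNames data or its actual interface.

```python
# Hypothetical per-region keyword databases. A keyword maps to a list so
# that a shared name (several equal claimants) can list more than one site.
REGISTRY = {
    "us": {
        "ford": ["http://us.ford.example/"],
        "alpine": ["http://alpine-audio.example/", "http://alpine-ski.example/"],
    },
    "uk": {
        "ford": ["http://uk.ford.example/"],
    },
}


def resolve(keyword, region):
    """Return the sites registered for a keyword in one region's database.

    One entry means the user can be sent straight to the site, several
    entries mean a list should be shown, and an empty list means the
    keyword is not registered in that region.
    """
    database = REGISTRY.get(region, {})
    return list(database.get(keyword.strip().lower(), []))
```

A search partner would pass its own region, which is how the same word can lead to different sites at AltaVista and at UKMax.com.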
Another plus is that unlike the domain name system, RealNames do not have to be in Latin characters. In fact, RealNames has just expanded into Japan, offering navigational keywords in Kanji, Hiragana and Katakana characters.
Overall, RealNames is a much more intelligent, equitable and better managed system than the current domain name system. I don't see it as an immediate threat to the primacy of .com, but it is well-positioned to take over in the future. That's especially so given RealNames' success with Microsoft. The company managed to get Microsoft to provide native support of some types of RealNames keywords within the Internet Explorer browser toward the end of last year, and that support was expanded in March, at the same time Microsoft gained a 20 percent stake in RealNames.
To see this in action, enter "ABC" into IE5's address box, and your browser will split into two windows, if you haven't changed the default settings. On the right, the ABC television network's web site will automatically load. On the left, you'll be shown a list of web sites. The list will be topped by the "MSN Top Pick," which is a RealNames link to the ABC web site, but other sites such as ABCstereo.com are also shown, in case you might have wanted these instead.
Similarly, try entering the names of other companies or web sites that you are trying to reach into IE5's address box. You'll probably be surprised to discover how successfully the RealNames system will get you to the right location. Certainly if you enter "white house," you'll arrive at the US White House site, not a porn site.
RealNames isn't perfect. I feel the company lost goodwill when it allowed some companies to register "generic" terms, as the article listed below explains in more depth. Last week, RealNames decided formally that it would no longer issue generics, or "categorical" keywords, as it is now calling them.
"We have decided to maintain our current policy of not selling categorical terms, terms which are synonymous with an entire category of goods and services," said RealNames CEO Keith Teare. "Right now, it remains the case -- and we think it will remain the case for some time -- that most users do not distinguish between navigation and search."
In other words, sometimes users want to "navigate" to a particular web site, such as MP3.com. Other times, they want to search for several possible web sites, such as places that offer MP3 files. These are completely different goals, yet they may involve the same keyword, "mp3." If RealNames were to allow only a single web site to be found for a generic / categorical term like this, then those with search expectations might be confused.
Existing contracts will still be honored, which is why the RealNames keyword "mp3" currently does resolve to MP3.com. Also, companies with strong brand names that are also categorical may be granted these terms. Amazon and Apple are both examples where their names are generic in nature and which might also be used by other companies. It's hard to argue that most Internet users would not expect to reach Amazon.com or Apple.com.
Oversight of names is another issue. I certainly have no lack of complaints from readers unhappy with some decisions RealNames has made, and I get more every time I write an article about the system. While RealNames does have a policy review board, that board is not an independent body that can force RealNames to change a decision the company has made. Teare agrees this is a weakness.
"I definitely want to change that where someone else can tell us we've made a mistake and force us to change," he said. "I think the question is who wants to do that," he continued, adding, "I would absolutely love it if ICANN wanted to," referring to the authority that oversees the domain name system.
At the moment, it seems as if RealNames partners are helping provide a balance. Since most of them are search services, there's pressure that the RealNames system resolves addresses in a way consistent with what their users expect. In fact, the added support Microsoft has given the system was conditional on RealNames adding more review and discretion over how names are assigned.
"They wanted an unequivocal endorsement from us. We wanted to make sure if we were endorsing them that we knew that the user experience was going to be superior," said Bill Bliss, general manager of Microsoft's MSN Search. He and others at Microsoft were involved in a process to help RealNames better define how names would be issued and resolved in the RealNames system.
You'll also hear RealNames make mention of its system being based on "open standards." The reference here is to how the RealNames system and other alternative web addressing systems, or "namespaces," work technically. The goal is to make them compatible with each other. The standards being developed have absolutely nothing to do with how names are assigned.
Of course, RealNames isn't the only player in the namespace competition, but it is the strongest. AOL runs a long-standing keyword navigation system, but that operates only within AOL's proprietary online service. AOL-owned Netscape also has an Internet Keyword system, but it isn't yet supported beyond the Netscape browser. Netword is another namespace company, but it has nothing like the reach that RealNames has established through search engines and other partnerships.
In conclusion, the domain name system isn't going away immediately, but like the dinosaurs, I think it will slowly become extinct. In its place will be namespaces, which we'll use to navigate the web. Indeed, as what I call "GenNet" comes online -- youths who have never known a world without Internet access -- they'll probably view "old-timers" trying to reach web sites using .com and other DNS addresses in the same way we might laugh to hear someone trying to dial a phone number using old style telephone exchange prefixes, such as the one from the Glenn Miller song, PEnnsylvania 6-5000.
Sound unbelievable? Before there were domain names, we used IP addresses to reach information on the Internet. In fact, IP addresses still underlie everything on the web. The ABC television web site is really at http://188.8.131.52. The domain name system evolved to save us from having to remember these numbers. We simply enter ABC.com into our browsers, then they communicate with a DNS server that routes us to the right IP address. In the same way, the namespace systems are evolving to save us from the confusing DNS system. Going forward, we'll enter "ABC" into our browser, then the namespace system will resolve it behind the scenes to a domain name, delivering us to our destination.
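The layering described above, with a keyword resolving to a domain name and the domain name resolving to an IP address, can be sketched as two chained lookups. Everything in this Python example is a stand-in: the KEYWORDS and DNS_RECORDS tables, the abc.example domain and the 192.0.2.10 address (from a range reserved for documentation) are assumptions, and a real browser would query a namespace service and a DNS server instead of local tables.

```python
# Hypothetical namespace layer: plain keyword -> domain name.
KEYWORDS = {"abc": "abc.example"}

# Hypothetical DNS layer: domain name -> IP address.
DNS_RECORDS = {"abc.example": "192.0.2.10"}


def resolve_domain(domain):
    """Stand-in for a DNS lookup. Raises KeyError for unknown domains."""
    return DNS_RECORDS[domain.lower()]


def navigate(entry):
    """Resolve what the user typed into (domain, ip_address).

    If the entry is a known keyword it is first mapped to a domain name.
    Otherwise the entry is treated as a domain name already.
    """
    text = entry.strip().lower()
    domain = KEYWORDS.get(text, text)
    return domain, resolve_domain(domain)
```

In a real program, Python's socket.gethostbyname performs the second step against actual DNS servers. The first step is the part the namespace companies are competing to provide.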
Using RealNames Links
More about how the system works.
RealNames Temporarily Suspends Registration Of Generics
The Search Engine Report, Jan. 4, 2000
More information on how RealNames was assigning generic or categorical terms and why the practice was halted.
The organization in charge of the domain naming system. Find information here about new TLD proposals and more.
A rival to RealNames in the namespace field.
The FAQ page has more information about the new TLDs, plus there's a wealth of other information about domain name registration and issues.
Brief definition of the domain name system, with many great links that provide more information about how the system works.
Dot-coms: Masters Of New Domains
Forbes, April 26, 2000
How country-specific domain names are being used for new purposes. But despite these moves, .com remains king.
AOL, Microsoft going to war over browsers
Washington Post, April 21, 2000
Another look at the coming of namespaces.
ICANN Moves Closer To Adding Web Domains
Newsbytes, April 20, 2000
New top level domains come closer to reality due to a recent action by one of ICANN's supporting organizations.
Island nation cashes in on ".tv" country code
News.com, April 8, 2000
Why Dot Com is King
Domain Notes, April 2000
Argues that despite the possible introduction of new TLDs, .com will remain the top choice for businesses.
The ICANN Dispute Resolution Policy
Domain Notes, April 2000
A look at how the new domain name dispute system seems to be working against cybersquatters.
Microsoft to Back a Browser Keyword System
New York Times, March 14, 2000
Details on RealNames charges to corporate clients, plus a comment from Esther Dyson, who heads ICANN but who wasn't speaking for the group as a whole.
Internet Board Agrees to Overhaul Election Plan
New York Times, March 10, 2000
Very nice summary of recent decisions made by ICANN.
Telephone Exchange Name Project
Everything you wanted to know about the old-style telephone exchange name system.
Google Speaks Languages, WAP, Adds Other Features
Google has unveiled beta versions of its site for non-English speakers and those accessing the web using wireless devices. The search engine has also added new features to its search results and launched a new "university" search engine.
Users can now search for pages in 10 languages in addition to English. They are Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish. Additional language support is planned for later this year.
To use the feature, just choose the language you desire using the drop-down box next to the search box. For instance, select "French," and only pages written in French should appear. Google's messages and instructions will also change to match the language you searched in. Additionally, Google will remember your language preferences. Use the "Language options" link on the home page to control your settings.
Google has also become the latest service to cater to the growing wireless market. Nor is the company just playing catch-up. Google goes beyond the other WAP search offerings that I've seen, which either index or catalog only pages specifically designed for wireless users. In contrast, Google's WAP service is supposed to translate any document it lists into a format viewable on mobile or handheld devices.
For instance, FAST Search operates a wireless search engine, which lists pages written in wireless markup language, or WML. It excludes ordinary HTML documents from its listings, since those pages probably won't appear properly on a WAP phone. In comparison, Google presents its normal results, but if a user chooses an HTML document from its listings, the page will automatically be translated into a wireless format. That means Google gives mobile users much greater reach, though only to documents actually listed in its results.
A new feature added to Google's search results is the ability to email results to yourself or someone else. Perform a search, then click on the "Email These Results" link that appears at the top of the results page. Enter up to four email addresses, and the results will be sent to those people.
Another feature of Google's results has been renamed and enhanced. The "Cached" link below each page listed used to bring up an exact copy of that page, as it looked when Google spidered it. The Cached link has become a popular method for users to find copies of pages that no longer exist, and Google still supports it, just under the new name of "Show matches." The feature has also been enhanced with hit highlighting. When you view the cached copy of the page, the search terms you looked for will be highlighted in yellow on the page.
Finally, you can now perform a search to find matching web pages from within any one of over 40 prominent US universities, such as Harvard, Stanford or even my alma mater, the University of California, Irvine. Go to Google's University Search, then pick the college you are interested in. I only wish Google also made it possible for you to do a search across all of these universities at once. Perhaps in the future....
Wireless users need not go to a special address to use Google's special wireless version, which is in beta testing. Google will detect your WAP phone and automatically load a version formatted for it.
Google: Palm Edition
Those with Palms should use this edition of Google, which displays in normal HTML but in a format designed for Palm and other small screen devices.
Google Language Options
Set your language preferences with this page.
Google University Search
Don't have a WAP phone? Use Gelon's Wapalizer to see the web as a wireless user does.
Wild About WAP
The Search Engine Report, March 3, 2000
A bit more about wireless search engines and WAP.
Movement In Meta Search
Meta search engine ProFusion was acquired by search utility maker Intelliseek in April, making it the third major meta search engine to be gobbled up in recent months.
The trend started last August, when Go2Net acquired meta search site Dogpile in a US $55 million deal. The move was especially notable because Go2Net already had absorbed the web's most popular meta search service, MetaCrawler, in November 1998. Acquiring Dogpile gave Go2Net a second popular meta search service. Why buy what Go2Net already had? To lock up the market.
"We had the number one search service, and this was number two, and we wanted to own the meta search category," said Dr. Oren Etzioni, Go2Net's chief technology officer and creator of MetaCrawler, in an interview last year.
It's not a bad category for Go2Net to own, especially given that some search engines actually pay to be carried in the meta search results. It's a way for smaller, standalone search engines to extend their reach to new users and be seen alongside the more established players. For instance, pay-for-placement search engine GoTo.com recently cut new deals with both Go2Net and meta search engine Mamma.com. The deals ensure GoTo's results, and thus its advertiser-supported links, will be placed before more eyeballs.
Meta search is also a good category because it may attract search users dissatisfied with standalone search engines. This is especially so when articles appear that discuss how "little" of the web each search engine covers or how results can be different from engine to engine. The natural solution to these concerns for a user is to turn to a meta search engine, which provides the top results from several search engines all at once.
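For readers curious how a meta search engine might combine listings, here is a minimal Python sketch. It assumes each underlying engine has already returned a ranked list of URLs, and it merges them round-robin while dropping repeats. The meta_search name and the round-robin rule are my own illustration. The services named here do not publish their merging methods, and real ones may weight engines differently.

```python
from itertools import zip_longest


def meta_search(result_lists, limit=10):
    """Merge ranked URL lists from several engines into one list.

    Takes every engine's first result, then every engine's second result,
    and so on, skipping any URL that has already been listed.
    """
    if limit <= 0:
        return []
    merged = []
    seen = set()
    # zip_longest pads shorter lists with None so no engine cuts others off.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is None or url in seen:
                continue
            seen.add(url)
            merged.append(url)
            if len(merged) == limit:
                return merged
    return merged
```

The round-robin order means a page ranked first by any one engine appears near the top of the merged list, which matches the "top results from several search engines" promise.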
Nevertheless, only Go2Net seems to have leveraged meta search into attracting big traffic. The site now regularly lands in Media Metrix's top rankings, though much of that traffic may be driven by Go2Net's other content, such as the Silicon Investor web site.
Cnet is another player that's entered the meta search competition. In October, it acquired the popular SavvySearch service in a $22 million deal. SavvySearch continues to run as an independent web site, but its technology was integrated last month into Cnet's long-standing Search.com site. Previously, Search.com had been powered by Infoseek.
Until now, Cnet hasn't seemed to invest much time in Search.com. The inattention began when Cnet launched Snap.com back in September 1997, aiming the site to take on the likes of Yahoo, Lycos, Excite and other portals. With Snap now owned primarily by US television network NBC, Cnet seems to have thoughts about targeting search again.
Like MetaCrawler and SavvySearch, ProFusion originally began at a university. It went private, then Intelliseek purchased it last month. Intelliseek makes the well-regarded BullsEye meta search software. The acquisition of ProFusion now gives it a low-traffic but established search site that it can expand and enhance.
From a user perspective, these deals mean that the meta search services named are likely to grow and be developed. Of course, that doesn't mean that "independent" meta search sites such as Mamma.com or C4.com won't also develop. In fact, the recent acquisitions will probably help other services attract investment, since value is clearly being attached to meta search.
On the other hand, huge success by meta search engines could ultimately cause the "main" search engines that they depend on to cut off licensing agreements, which most of the major meta search sites establish in order to avoid legal problems. However, for the moment, the major search engines generally say that meta search sites pose little burden and even provide them with some exposure. Consequently, it's a win-win for both parties.
Search Links: Metacrawlers
You want meta search? Have no idea what it is? Links and answers are here.
Relaunched in January, the service added music and auction meta search capabilities and the ability for users to customize searches to return results from more than 25 specifically requested countries. It also included Google as one of the services queried.
You'll find SavvySearch's meta search technology here, for both general purpose searching and to power topical meta search offerings.
CNET Investor Message Board Area
Provides meta search across major investment discussion areas.
Courting Retailers: Metasearchers Increasingly Cozy up to E-Commerce Sites
Internet World, Jan. 15, 2000
More details on how and why some search and shopping sites are paying to be carried on meta search engines.
Auction Search Case Awaits Ruling
Is it legal to spider someone's web site without permission? We've never had a court ruling on this before, at least in terms of textual information, but that's expected to change shortly.
There's been a long-running dispute between auction site eBay and auction search engine Bidder's Edge. eBay claims the right to restrict who can crawl its listings, in part arguing that spiders can slow its service and that the information on its site is intellectual property entitled to protection. Bidder's Edge argues that the information belongs to those auctioning their goods and services, not to eBay, and that it needs no permission from eBay to index this information.
The case is currently before the US District Court in San Jose, California. The judge in the case has said that he's inclined to grant a preliminary injunction against Bidder's Edge, but how exactly he might restrict the service remains to be seen, as does whether he actually will grant an injunction at all.
Should he do so, it would be a blow for Bidder's Edge but by no means an end to the issue. The case would still head to trial, where Bidder's Edge could win. It might also be successful during the legal maneuverings that occur before a trial.
Bidder's Edge, and those on its side, express concern that a victory for eBay could mean an end to search engines. After all, no major crawler-based search engine expressly seeks permission to index content. Instead, permission is assumed.
This brings up the issue of the robots.txt file, a long-standing convention to explicitly tell spiders to stay out of a web site. There's a strong argument that the greater good on the web is served by allowing search engines to operate in "opt-out" mode. That means that if you don't want your content indexed, you use a robots.txt file to "opt-out" of the process by telling spiders to go away.
By all means, eBay should certainly have a robots.txt file in place, if it wishes to keep spiders out. It doesn't, and has never had one in place any time I've checked since this dispute began last October. The lack of such a file shows little concern about being crawled. It's a fundamental mechanism the company should have in place. Moreover, I know that eBay is not only listed on major search engines but has also employed several search engine optimization firms to promote itself on search engines. This demonstrates no real concern that visitors should only come into the site via its home page, nor that its internal content should be protected from spiders. Instead, eBay simply seems to want to play favorites. Spiders that benefit it by sending the site traffic are apparently OK, but spiders it feels may be a threat to its business interests should be forbidden.
So this isn't just a case about spidering. It also involves whether some spiders can be selectively discriminated against. The robots.txt convention certainly allows this. You can exclude just particular spiders, and this was especially designed to stop "misbehaving" spiders that site owners felt were putting a burden on their servers. But the robots.txt file isn't a legal convention, which is why court cases like this one will be watched so closely.
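To make the convention concrete, here is a minimal sketch in Python of how a well-behaved spider might check a robots.txt file before fetching a page. The spider names, the /private/ path and the example.com URLs are all hypothetical, chosen only for illustration. RobotFileParser is part of Python's standard library, and the file contents below show one spider excluded entirely while everyone else is kept out of a single directory.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: BadSpider is shut out completely,
# all other spiders are only kept out of /private/.
ROBOTS_TXT = """\
User-agent: BadSpider
Disallow: /

User-agent: *
Disallow: /private/
"""

def may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

LISTING = "http://www.example.com/listings/123"
PRIVATE = "http://www.example.com/private/report.html"

bad_on_listing = may_fetch(ROBOTS_TXT, "BadSpider", LISTING)
good_on_listing = may_fetch(ROBOTS_TXT, "GoodSpider", LISTING)
good_on_private = may_fetch(ROBOTS_TXT, "GoodSpider", PRIVATE)
```

Note that nothing here forces a spider to run such a check. The file is a request, and it only works when spider operators choose to honor it.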
For the record, even if eBay had put up a robots.txt file, Bidder's Edge says they would not have observed it. I find that disturbing, because I feel those who operate spiders should obey the robots.txt convention. It's one of the few solid rules we have involving search engine spiders, and it has helped make the entire opt-out indexing situation possible. In turn, that has benefited web users as a whole.
However, Bidder's Edge does argue that it would ignore such a file at eBay because Bidder's Edge feels the content belongs to those placing the auctions, not eBay itself. Thus, eBay should have no right to limit the ability of its participants to be found. That's a powerful argument. It's akin to saying that those at GeoCities couldn't have their home pages found because GeoCities-owner Yahoo decided to block search engine spiders. I can imagine the outcry that would bring.
This brings us to the meta robots tag. It allows spiders to be blocked on a page-by-page basis, rather than the site-wide system the robots.txt file is designed for. Potentially, eBay could give all of its users the choice at the time they place an auction of whether they want spiders blocked from indexing their content. It's even possible that eBay might charge an extra fee to those who choose to allow spiders in, which could be used to cover any real burden that the spiders might place on its servers.
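As a rough illustration of the page-by-page approach, here is a sketch in Python of how a spider might look for a meta robots tag before indexing a page. The two sample pages are hypothetical, and the parsing is deliberately simple. It treats "noindex" and "none" in the tag's content as a request not to index, and treats a page with no tag as open to indexing.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from any <meta name="robots"> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when written without "=value".
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())

def page_allows_indexing(html: str) -> bool:
    """Return False if the page asks spiders not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives and "none" not in parser.directives

# Hypothetical auction pages: one seller opts out, one does not.
BLOCKED = '<html><head><meta name="robots" content="noindex,nofollow"></head><body>Lot 1</body></html>'
OPEN = "<html><head><title>Lot 2</title></head><body>Lot 2</body></html>"

blocked_ok = page_allows_indexing(BLOCKED)
open_ok = page_allows_indexing(OPEN)
```

Under a scheme like the one suggested above, the site would simply add or omit the tag on each auction page depending on what the seller chose.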
Search Engines And Legal Issues
You'll find links to other cases involving spidering and linking here.
Search Engine Features For Webmasters
Information on blocking spiders with a robots.txt file or a robots meta tag can be found here.
eBay, Bidder's Edge face off in court
News.com, April 14, 2000
Short summary of the current situation, with a link to a story about the Justice Department asking eBay about its actions.
Auction Dispute Centers on Question of Control Over Data
New York Times, April 14, 2000
Longer summary of the case, with quotes from both sides and third parties.
eBay vs. Auction Aggregators: A Freedom Fight?
InternetNews.com, Feb. 11, 2000
An older article exploring the issues involved.
Economist, Oct. 16, 1999
Another older article but still useful for explaining the concerns in the case.
Legality of 'Deep Linking' Remains Deeply Complicated
New York Times, April 14, 2000
Article about a different case between Ticketmaster and Tickets.com which touches on similar issues to the eBay dispute.
Ticketmaster Corp., et al. v. Tickets.Com, Inc
GigaLaw, March 27, 2000
Detailed information from the Ticketmaster case.
Copyright Decision Threatens Freedom to Link
New York Times, Dec. 10, 1999
Describes a case where a court rules against the ability for a site to link to pirated content.
Court Ruling Denies Copyright Protection For Images On The Net
7am, Dec. 21, 1999
Details on a case involving the spidering of images, where the court upheld the right to index.
Kelly vs. Arriba Soft Corporation
More information about the image indexing case above, with a current update on its status, from the plaintiff, photographer Les Kelly.
Northern Light Wins Domain Suit
Northern Light also obtained a restraining order forcing the NorthernLights.com web site (notice the S) to be removed. In its place is a text link to the NorthernLight.com search engine. The US District Court in Massachusetts determined that Northern Light would succeed at trial with its cybersquatting claim and issued the order in mid-April.
Northern Light Press Release
New Services Target Mobile Web Users
+ Excite has launched a version of its site for mobile phone and Palm users that gives you access to any personalized information it provides, such as stocks, horoscopes or email. Directions, news stories, sports scores and other information are also available. You cannot search the web, however.
+ SearchPalm spiders pages from web sites about the Palm on a regular basis. Try a search there to find games, FAQs, tips and other information relating to your handheld device.
+ Pinpoint has launched a wireless search engine that's said to contain over 1.5 million pages designed for wireless users. It's not available for searching, however. It's a service being made available to sites that want a wireless search engine for their users.
Search Engine Articles
Lycos, Yahoo step back from ambitious broadband plans
News.com, May 1, 2000
Some portals see broadband as the future, but for Lycos and Yahoo, it's not an immediate priority.
Yahoo Casts Wide Net To Protect Domain Name
Newsbytes, April 27, 2000
Yahoo seeks control over 37 domain names it considers through the new ICANN domain resolution policy -- setting a new record for number of disputes recorded at once.
Portals Start to Feel the Heat
April 21, 2000
Multimillion dollar ecommerce deals signed by some sites with portals now may feel more like multimillion dollar anchors, dragging them down. But not everyone is unhappy.
Northern Light is not search lite
Ad Age, April 17, 2000
Northern Light has launched its second television ad campaign, in hopes of raising awareness of the service. The first campaign last fall failed to bring the service anywhere close to the traffic levels of its competitors, as this article details.
5th Annual Search Engine Meeting - Special Report
About WebSearch Guide, April 17, 2000
Great coverage of the Infonortics search engine conference held in Boston earlier this month can be found here. There's a summary of my presentation on search trends, along with write-ups of talks by some of the major search engine players.
Yahoo Chief Addresses Hot Topics
PC World, April 11, 2000
Chief Yahoo Jerry Yang speaks on privacy and ecommerce issues.
My Reading List
Thanks this month to items spotted in....
Tasty Bits from the Technology Front
Online Community Report
How do I unsubscribe?
+ Use the form at http://searchenginewatch.com/sereport/unsubscribe.html or follow the instructions at the very end of this email.
How do I see past issues?
+ Follow the links at http://searchenginewatch.com/sereport/
Is there an HTML version?
+ Yes, but not via email. View it online at
I didn't get Part 1 or 2. Can you resend it?
+ No, but you can view the entire issue online, via the link above.
How do I change my address?
+ Unsubscribe your old one, then subscribe the new one, using the links above.
I need human help with a list issue!
+ Write to [email protected]. DO NOT send messages regarding list management issues to Danny Sullivan. He does not deal with these.
I have feedback about an article!
+ I'd love to hear it. Use the form at http://searchenginewatch.com/about/contact.html.
How do I advertise?
+ To advertise in this newsletter or any of Internet.com's other 100 newsletters, contact Frank Fazio, Director of Inside Sales, at (203) 662-2997 or via email at mailto:[email protected]
This newsletter is Copyright (c) internet.com Corp, 2000