THE SEARCH ENGINE UPDATE
May 3, 2000 - Number 76
About The Update
The Search Engine Update is a twice-monthly update of search engine news. It is available only to those people who have subscribed to Search Engine Watch, http://searchenginewatch.com/. Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ About The Search Engine Watch site
+ Search Engine Strategies Conference
+ AltaVista Launches New Search Site
+ Yahoo Changes Listings
+ GeoSearch Comes To Northern Light
+ Goodbye Domain Names, Hello RealNames?
+ Google Speaks Languages, WAP, Adds Other Features
+ LookSmart: Commercial Charge Just A Test
+ Movement In Meta Search
+ Auction Search Case Awaits Ruling
+ Northern Light Wins Domain Suit
+ New Services Target Mobile Web Users
+ New Submission Tool Launched
Search Engine Articles
+ Interesting articles relating to search engines.
+ List Info (Subscribing/Unsubscribing)
By next Monday, I'll have a new page that covers issues about how search engines deal with multilingual pages, as well as issues to consider when dealing with regional search engines. It's quite detailed, and I think it will answer many of the questions I've gotten on this topic in the past. See the Subscribers-Only Area What's New page, and you'll know when it is up.
Subscribers-Only What's New
With the London Search Engine Strategies conference now a happy memory, I'm beginning planning for its return to San Francisco on August 14. I'll be presenting and moderating sessions that feature experts on search engine marketing issues and panelists from various search engines themselves. In addition, there will be a special session on shopping search, which should be of interest to any online retailers. Details about the conference, for attendees or potential sponsors and exhibitors, can be found via the URL below.
Search Engine Strategies 2000 - San Francisco
AltaVista Launches New Search Site
AltaVista has launched a search-only site today which follows on improvements the company made to its core database of web pages about a week ago. Called Raging Search, the new site delivers fast and uncluttered listings.
"This is for the search enthusiasts, for the tech and web savvy people looking for pure web results," said Rajiv Parikh, AltaVista Search's marketing director.
The move to launch a search-centric site is significant, because it goes completely against the trend search engines have followed to date. Usually, when a search engine has become popular, it begins adding on "portal" features such as news headlines, horoscopes or free email in hopes of keeping users within the site.
Among the major search engines, the notable exceptions to this have been Google, GoTo and Northern Light. Google, in particular, has won praise from searchers I talk with who are impressed with the quality of its results and the concentration on search. GoTo has also attracted a significant audience and is able to eschew portal features because its pay-for-placement system actually means the search engine makes money when users leave its web site.
For the moment, Raging Search has no banner ads. Instead, AltaVista plans to make money with ecommerce and affiliate links that appear at the bottom of its search results page. Text ads, such as those Google uses, may also be tried in the future, AltaVista says.
Raging Search allows users to customize results in several ways, such as to see up to 50 listings at a time, to display a more compact format, to filter out adult content and to set language preferences, among other options. By default, only one page per web site is displayed in the results, but more can be seen by using the "Results from this site only" option.
Raging Search uses the same web page index as does AltaVista itself, which means there is no need to submit to it separately. However, expect that there could be variations when running the same searches in both places, especially for multiple word queries. Raging Search is designed to be a sort of test bed for improvements that may migrate over to the main AltaVista site. Consequently, a different ranking algorithm may be in use, or Raging Search may process the query in a different manner.
As for the web page index, AltaVista has expanded it to 350 million pages, which the company says are the best on the web. AltaVista has begun using a new "connectivity" graph that analyzes links from over 1 billion pages. From this, the top 600 million are crawled, then the index is reduced to 350 million pages when duplicates, dead links and spam are removed.
Having this better collection of documents should mean an overall improvement in the results users see, AltaVista says. To back its claim, the company says it was ranked the overall winner in a new relevancy survey carried out by ZD Labs. The survey itself has not yet been released, so I can't comment on it further. I'll follow up on it in a future newsletter.
AltaVista has also made small tweaks to its ranking system, primarily to refine and improve its usage of link analysis. The company has also been cracking down severely on doorway pages and says that they simply are no longer allowed in its index.
"We do consider doorway pages to be spam," said Tracy Roberts, marketing director for the AltaVista network. In the past, search engines sometimes have reversed themselves on their anti-doorway page statements, depending on how a doorway page is defined. Roberts specifically cited doorways as pages with little or no real content to them. As I reported in the last newsletter, "machine-generated" pages produced by doorway page programs such as WebPosition have also been targeted.
As always, I'd advise concentrating on content. If you want to be found for a particular topic, then create a page rich in information for that topic -- not just a page that's rich in keywords but which has no inherent value. Think of it this way -- a doorway page is not a destination for users. When users come to one, they are usually moved quickly to another page within the web site. So if you are in doubt, ask yourself whether you'd spend 30 seconds or so reading the page you've created. That's actually a pretty long time, and if you would, then your page probably has enough real information on it to escape being considered a doorway.
In another change, AltaVista has now enhanced its directory to rank sites based on popularity, using links as the means to measure this. It's similar to the system I described at Google last month -- see the article below. Additionally, the directory now includes listings from both LookSmart and the Open Directory. Any sites out of LookSmart will specifically say "From: LookSmart" after the site description.
Overall, I'm pleased to see the new standalone search service. There's definitely a demand for pure search, and even though AltaVista estimates the demand is relatively small, it's nice to see them cater to this audience. It will be interesting to see if other services mimic the move.
Search Engine Sizes
This page compares reported sizes of major crawler-based search engines. It hasn't yet got the new AltaVista numbers posted, but it will shortly. It has been updated to reflect the new 500 million page index Inktomi says will go live later this month. Also see the "Numbers, Numbers -- But What Do They Mean?" article for a further explanation of how extra pages are spidered in order to increase relevancy. AltaVista is now taking the approach described for Inktomi.
Missing Pages At AltaVista
The Search Engine Update, March 3, 2000
Have you gotten the dreaded "Too Many URLs submitted" message? This article explains a bit more about what to do. AltaVista still says email contact, as described, is the best way to resolve a problem.
Google Adds Directory
The Search Engine Update, April 4, 2000
Describes how Google is using link analysis to relevancy rank listings at the Open Directory
WebPosition, April 2000
This issue of WebPosition's newsletter has specific advice for its users relating to AltaVista and doorway pages generated by the program.
Yahoo Changes Listings
Significant changes to the look and functionality of Yahoo are few and far between, which can be a relief given how some other search engines can seem to be constantly altering themselves. Nevertheless, even Yahoo needs to refresh itself now and then. The new changes it introduced last month are designed to help users more easily locate what they are looking for.
The most significant change has been to reorganize how information is listed within Yahoo's category pages. The "Online Horoscopes" category, listed below, provides a good model to understand what's new.
You'll see that any of Yahoo's own relevant content now appears in the "Inside Yahoo" area. After that, any subcategories or categories related to the topic you are viewing are shown, in the "Categories" area. Next comes the "Most Popular Sites" area, where Yahoo is using an automatic system to list the sites it believes are most popular for that category. Finally, the "Complete List of Sites" section has the familiar comprehensive list of all sites that Yahoo editors have reviewed and approved, listed in alphabetical order.
You won't find the Most Popular Sites section in all categories yet, but it should become more commonplace as Yahoo grows more comfortable with the new format. It will certainly be welcomed in any place where there are more sites than a user can easily choose from. Guidance in these situations is needed, as Yahoo itself acknowledges.
"We introduced this feature in some of our larger categories because we realize that a long alphabetical list can be unwieldy to navigate through, and while alphabetical order is extremely functional and democratic, it is also an arbitrary ordering as far as content goes," said Srinija Srinivasan, Yahoo's editor in chief. "For the user who has time to sift through the multiple sites we list, we of course present them all, as each was only included in the directory after thoughtful, manual review by our team. But for the user who would like to see some added information to supplement alphabetical order, the 'Most Popular Sites' section gives them a way of focusing in on those sites that stand out in terms of their popularity online."
Srinivasan said that the sites in the Most Popular area are selected using an automatic system rather than being Yahoo editor choices, as is the case with Yahoo cool sites, which are identified by the sunglasses icon. "'Most Popular Sites' reflects what's happening online and is not influenced by our editorial decisions. This [popularity] algorithm is being continually tweaked, refined and updated to remain current with what's happening online," Srinivasan said.
I haven't gotten more information yet on what exactly is being used to determine popularity, but I anticipate having answers from Yahoo for a follow up newsletter. It seems extremely likely that some type of link analysis is being used, similar to what Google did with the Open Directory last month (see article below). It's also unclear where the editorially selected top picks will appear, so I'll be looking into that, too.
The appearance of Yahoo's search results page has also been slightly modified. Instead of making the entire category path clickable, now only the "end" category itself is clickable and set apart from the path with a bullet point.
For example, previously a search for "shoes" would bring up a category list like this:
Business and Economy > Shopping and Services > Apparel > Footwear
Business and Economy > Companies > Apparel > Footwear > Athletic Shoes
Business and Economy > Companies > Sports > Golf > Apparel > Shoes
Each category was a single, clickable link. Now the presentation is similar to this:
Business and Economy > Shopping and Services > Apparel
+ Footwear
Business and Economy > Companies > Apparel > Footwear
+ Athletic Shoes
Business and Economy > Companies > Sports > Golf > Apparel
+ Shoes
Only the bulleted word below the category path is clickable.
"This is a straightforward change in presentation as a result of various factors, including results from usability studies, to make our search results more readable," Srinivasan said.
Google Adds Directory
The Search Engine Update, April 4, 2000
Describes how Google is using link analysis to relevancy rank listings at the Open Directory
GeoSearch Comes To Northern Light
Northern Light debuted a new geographical search capability in April. It allows users to filter results so that only matches relating to a particular "real world" address will appear. For instance, imagine you wanted to find all the pizza places near your home. Using Northern Light's "GeoSearch," you can enter the word "pizza" and your zip code, then any pages that contain the word pizza and which also seem to be related to where you live will be shown.
GeoSearch is a nice alternative to entering geographical keywords, especially in that those keywords can be too limiting. For example, say I wanted to find pizza places near Newport Beach, California. I could do a "normal" search such as "newport beach pizza," and that might miss out on pizza places that are located in the neighboring city of Costa Mesa. In contrast, GeoSearch might find them, because when you give it a zip code or postal code, it knows all the geographical keywords relevant to that code, for the radius you choose.
To use GeoSearch, just select the "Geo Search" link that appears below the search box on the Northern Light home page. A special form will appear, where you can enter your search word plus additional location information. You must provide at least a zip code or a telephone area code, and you can specify a geographical search radius up to 100 miles. Searches are currently limited to US and Canadian locations, but more worldwide support will be coming later this year. After doing a search, you can use the small "Edit this search" option to the right of the results page search box to refine your query.
I did a few head-to-head tests of GeoSearch against "normal" searches at Northern Light using geographical keywords, such as the pizza query above. I found that often, the normal search was just as good as, if not better than, the GeoSearch results. Vicinity, the company behind GeoSearch technology at Northern Light, said one reason may be that not all the relevant pages within Northern Light's index have yet been geocoded. When that happens, you would expect the GeoSearch results to improve.
You may recognize Vicinity as the company behind the popular MapBlast mapping and directions web site, though that's just part of what Vicinity does. For GeoSearch, the company has developed "address recognition" software that teaches spiders how to recognize common address formats that they may encounter on web pages. If an address is found, the page is then "geocoded" when placed in the search engine's index. That means the search engine will store an appropriate real-world latitude and longitude for the page, as well as the usual cyberspace address information. If there are several addresses, then the page is assigned to multiple geographical locations.
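To make the idea of a geocoded index more concrete, here is a minimal sketch of how a radius search over such an index might work. It is not Vicinity's or Northern Light's actual implementation, which has not been published. The page list, the example.com URLs, the approximate coordinates and the function names are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical geocoded index. Each page keeps its text plus one or more
# (latitude, longitude) pairs, since a page may list several addresses.
# Coordinates are rough values chosen only for this example.
PAGES = [
    {"url": "http://example.com/newport-pizza",
     "text": "wood-fired pizza and salads",
     "locations": [(33.62, -117.93)]},
    {"url": "http://example.com/costa-mesa-pizza",
     "text": "pizza by the slice",
     "locations": [(33.64, -117.92)]},
    {"url": "http://example.com/boston-pizza",
     "text": "pizza in the north end",
     "locations": [(42.36, -71.06)]},
    {"url": "http://example.com/newport-shoes",
     "text": "running shoes",
     "locations": [(33.62, -117.93)]},
]


def geo_search(keyword, center, radius_miles, pages=PAGES):
    """Return URLs of pages that mention the keyword and have at least
    one geocoded location within radius_miles of center."""
    hits = []
    for page in pages:
        if keyword.lower() not in page["text"].lower():
            continue
        if any(haversine_miles(*center, *loc) <= radius_miles
               for loc in page["locations"]):
            hits.append(page["url"])
    return hits
```

Calling geo_search("pizza", (33.62, -117.93), 10) on this sample data returns the two nearby pizza pages and skips both the distant one and the nearby shoe store. Note that a page must match on both the keyword and the location, which is why the address and the description need to sit on the same page.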
Vicinity is also talking with other search engines, so it seems likely that GeoSearch may become a more common option. Given this, there are a few things you may wish to consider to ensure your pages are accurately geocoded.
Most important is to include geographical addresses on any relevant pages. For instance, if you ran a local restaurant, having your address on the same page that describes your restaurant might help you rank better if someone did a GeoSearch for restaurants in your area. Similarly, if you ran a chain of electronic stores, having a list of addresses along with a general description of your store could help ensure that you come up well for relevant GeoSearches.
Remember, success will come out of having both your geographical address and a description that includes the terms people might be searching for. You need them both. Imagine that restaurant situation. There might be a page about the restaurant, with a link on it to another page with the actual address. This would be bad, because the separate address page might not actually have the word "restaurant" on it. In that case, it wouldn't be found if someone geosearched for "restaurant."
As a rule of thumb, you might consider listing your geographical or postal address across the bottom of all your pages. That will ensure some geographical information is available to work with the varied content on all your web pages, or certainly at least on your home page. Also, don't forget to include your web address. Significant numbers of people go to search engines and enter domain names to locate web sites -- MSN Search recently said about 15 percent of their queries are this way. By including your web address, you increase the odds of ranking well when people search for you this way.
Combined together, that means your pages might have a footer like this:
123 Plaza, Newport Beach, CA, 92663, USA
By the way, the actual format of your address isn't important. You don't need to use commas to separate the street address from the city, nor must you spell out a state or province name rather than using abbreviations. The key is to use your zip code or postal code, then keep your address information relatively near this. Vicinity's technology looks for these codes first, and they are what give it the primary information to know where you are located, the company says. The address recognition software will also quickly scan near the zip code or post code for the other address information, so keeping it nearby is helpful. In short, use any standard address format you prefer, but just make sure that you include your zip or post code.
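The "code first, then look nearby" approach described above can be sketched in a few lines. This is only an illustration of the general idea, not Vicinity's software. The regular expression, the 60-character window and the function name are assumptions, and the sketch handles only US zip codes, not Canadian postal codes.

```python
import re

# US zip code: five digits, optionally followed by a four-digit extension.
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def find_address_candidates(text, window=60):
    """Find zip codes first, then collect the text just before each one.

    Returns a list of (zip_code, nearby_text) pairs. The window size is an
    arbitrary choice for this sketch.
    """
    candidates = []
    for match in ZIP_RE.finditer(text):
        start = max(0, match.start() - window)
        nearby = text[start:match.start()].strip(" ,\n")
        candidates.append((match.group(), nearby))
    return candidates


footer = "Visit us at 123 Plaza, Newport Beach, CA, 92663, USA"
print(find_address_candidates(footer))
```

The same function finds the zip code whether or not the address uses commas, which matches the point that the exact format matters less than having the code present with the rest of the address close by. A real recognizer would need to do far more, since any five-digit number would be treated as a zip code here.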
How about one more stat? Vicinity currently estimates about 15 percent of the web's pages contain some type of geographical information on them. It is these pages, and only these pages, that then become available when doing a GeoSearch.
Overall, GeoSearch seems a good tool to have for those times when you want to find information that could be situated in one of several locations. Similarly, some people may prefer using it rather than trying to think of appropriate geographical keywords. Another plus to using GeoSearch is that any addresses found on a web page will appear below that page's listing, along with a link to a map for those addresses.
Northern Light GeoSearch Page
Whereonearth.com Signs Agreement With Yahoo
Whereonearth.com, April 10, 2000
Yahoo's classified and yellow pages sections are to receive their own form of GeoSearch, through a new partnership with Whereonearth.com.
Goodbye Domain Names, Hello RealNames?
While the domain name system continues to devolve into a joke, the RealNames web addressing system is growing stronger. The company recently cemented a tighter relationship with Microsoft, continues to expand its reach and has pulled back on its main issue of controversy, the assignment of "generic" terms. The moves further position RealNames as a viable alternative, if not future successor, to the current domain name system.
Devaluation of top level domains is one of the biggest problems our current system faces. Top level domains, or TLDs, are the "endings" you choose when registering a domain name. For instance, McDonald's has registered a domain name of mcdonalds.com. As you can see, the name ends in .com. McDonald's also owns a domain name with a different ending, mcdonalds.org.
Why have domains with different endings? There was a time when the endings used to mean something. In fact, they could be incredibly helpful to understand the origin and backing of a web site. For instance, the .org TLD used to be reserved for non-profit organizations. Only non-profits could register a name ending in .org, and users going to sites with this ending could be fairly well assured that they were non-commercial in nature.
Today, this is no longer the case. Anyone may register a .org, regardless of their non-profit status. In fact, domain registrars like Network Solutions and others encourage companies to register .org addresses, as well as the .net TLDs, which used to be reserved for companies involved with Internet network operations. The pitch is to "protect" yourself against other companies getting your name with these alternative endings.
Similarly, .com has become the TLD that everyone wants, regardless of whether it is appropriate. That's because with the growth of commercial sites on the web, many surfers assume that all sites must end in .com. Looking for Ford? Just slap a .com on the end of their name, and you'll find them. Looking for the Sierra Club? Even though they are a non-profit organization, you can still reach them at sierraclub.com in addition to sierraclub.org. The environmental group no doubt realized users might look for it by adding a .com to its name, and so registered a domain with that ending to ensure it could be found. Likewise, the US White House surely must regret that it never registered whitehouse.com in addition to the proper whitehouse.gov address that it currently uses. That failure meant a porn site was able to get whitehouse.com, which comes as a surprise to many who arrive there.
Back in 1996, I used to call adding .com to the end of something ".comifying," such as when my wife -- sick of hearing me talk about the Internet -- would say "getalife.com." It was a way to make anything Internet-related. Today, it's commonplace for people to talk about "dotcoms" when they are referring to web sites. It's the same principle -- the .com ending is equated to being in cyberspace.
This brings us to the issue of domain ownership. There can only be one mcdonalds.com, even if there are several companies that hold trademarks that seemingly would entitle them to the address. Who ultimately gets ownership of a .com address has led to plenty of disputes and lawsuits, and there are no signs that these are diminishing. This is especially so because no one tries to resolve problems before domains are registered. Instead, it's a free-for-all. Anyone can register a name, and complaints are dealt with after the fact.
Rather than exert preemptive control, the current thinking is to simply introduce new TLDs, such as .firm for businesses or .rec for recreationally-oriented organizations. It's an absurd solution. We've already seen how existing TLDs are perverted and how companies are encouraged to register every ending available. The introduction of new TLDs will only cause companies to spend more money on names they do not need, while users will be poorly served because the classifications will inevitably be ignored.
In even more craziness, even country-specific TLDs can be twisted away from their original purpose. In early April, the Pacific Island country of Tuvalu sold the rights to its country-specific domain of .tv to a private company, in a deal worth at least $50 million over the next ten years. Can you imagine if a country was able to sell its seat at the United Nations to a private company? To me, that's the equivalent of what's happened here. Country-specific domain names were supposed to be assigned so that we'd understand what country a domain was based in, not to serve as cash cows for sale to the highest bidder.
In short, the domain name system is a mess, and I don't see hope on the horizon -- except in the form of RealNames. In RealNames, we have an alternative system for reaching web sites that can potentially avoid the problems that the domain name system has suffered.
For one, names are subject to review before being approved. That allows potential conflicts with trademarks to be spotted before the fact, not afterwards. Moreover, if several companies seem to have an equal claim to a particular name, then usually no single company can own it. For instance, the RealNames keyword "alpine" will list several sites relevant for that keyword, rather than trying to take you to just one.
A real benefit is that names can be regionally specific. If I enter "Ford" into AltaVista, then click on the RealNames link that appears, I'll be taken to Ford's US web site. That's because AltaVista uses the US RealNames database. In contrast, if I enter "Ford" into UKMax.com, a UK-specific search engine, the RealNames link takes me to Ford's UK web site. It's the same word, "Ford," but because my location is known, I'm directed to the correct regional location.
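A small sketch may help show how one keyword can lead to different places depending on the regional database consulted, and how a shared keyword can return a list rather than a single site. The tables, the region codes and the .example URLs below are invented for illustration and say nothing about how RealNames actually stores its data.

```python
# Hypothetical keyword tables, one per regional database.
# All URLs are placeholders in the reserved .example domain.
KEYWORDS = {
    "us": {
        "ford": ["http://ford-us.example/"],
        # Several parties with an equal claim: list them all.
        "alpine": ["http://alpine-audio.example/",
                   "http://alpine-skiing.example/"],
    },
    "uk": {
        "ford": ["http://ford-uk.example/"],
    },
}


def resolve_keyword(keyword, region):
    """Look up a keyword in the database for the given region.

    One result means the user can be sent straight to the site; several
    results mean a list of choices should be shown; none means the
    keyword is not registered in that region.
    """
    table = KEYWORDS.get(region, {})
    return list(table.get(keyword.strip().lower(), []))
```

With this data, resolve_keyword("Ford", "us") and resolve_keyword("Ford", "uk") return different sites for the same word, while resolve_keyword("alpine", "us") returns two candidates instead of picking a winner.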
Another plus is that unlike the domain name system, RealNames do not have to be in Latin-characters. In fact, RealNames has just expanded into Japan, offering navigational keywords in Kanji, Hiragana and Katakana characters.
Overall, RealNames is a much more intelligent, equitable and better managed system than the current domain name system. I don't see it as an immediate threat to the primacy of .com, but it is well-positioned to take over in the future. That's especially so given RealNames' success with Microsoft. The company managed to get Microsoft to provide native support of some types of RealNames keywords within the Internet Explorer browser toward the end of last year, and that support was expanded in March, at the same time Microsoft gained a 20 percent stake in RealNames.
To see this in action, enter "ABC" into IE5's address box, and your browser will split into two windows, if you haven't changed the default settings. On the right, the ABC television network's web site will automatically load. On the left, you'll be shown a list of web sites. The list will be topped by the "MSN Top Pick," which is a RealNames link to the ABC web site, but other sites such as ABCstereo.com are also shown, in case you might have wanted these instead.
Similarly, try entering the names of other companies or web sites that you are trying to reach into IE5's address box. You'll probably be surprised to discover how successfully the RealNames system will get you to the right location. Certainly if you enter "white house," you'll arrive at the US White House site, not a porn site.
RealNames isn't perfect. I feel the company lost goodwill when it allowed some companies to register "generic" terms, as the article listed below explains in more depth. Last week, RealNames decided formally that it would no longer issue generics, or "categorical" keywords, as it is now calling them.
"We have decided to maintain our current policy of not selling categorical terms, terms which are synonymous with an entire category of goods and services," said RealNames CEO Keith Teare. "Right now, it remains the case -- and we think it will remain the case for some time -- that most users do not distinguish between navigation and search."
In other words, sometimes users want to "navigate" to a particular web site, such as MP3.com. Other times, they want to search for several possible web sites, such as places that offer MP3 files. These are completely different goals, yet they may involve the same keyword, "mp3." If RealNames were to allow only a single web site to be found for a generic / categorical term like this, then those with search expectations might be confused.
Existing contracts will still be honored, which is why the RealNames keyword "mp3" currently does resolve to MP3.com. Also, companies with strong brand names that are also categorical may be granted these terms. Amazon and Apple are both examples where their names are generic in nature and which might also be used by other companies. It's hard to argue that most Internet users would not expect to reach Amazon.com or Apple.com.
Another concern is that the RealNames system is much more expensive than the domain name system. Once you register a domain name, there are no charges for the behind the scenes translation that brings people to your site. In contrast, RealNames will charge its small clients $100 per name and reserve the right to impose excess charges if a site gets more than 2,500 visitors per month through the RealNames system (though most small clients will never hit this). Big clients are forced into corporate pricing plans, where the resolution fees can be worth hundreds of thousands of dollars annually.
Teare makes no apologies for this. It's the cost of doing business, he explains. "There are lots of commercial web sites that will pay for the traffic," he said, alluding to popular affiliate programs that companies operate. However, he does express concern that RealNames does not want to make being found unaffordable for the masses. Only about 200 customers out of 60,000 are in the corporate pricing system, he said. The rest get by on the flat rate, which is the goal.
"I do not believe most individuals in the world should ever pay more than the flat rate Internet keyword fee," Teare said. In fact, he sees the large corporate fees as a means to pay for a system that benefits smaller sites. "It's almost like we are trying to be Robin Hood. We're trying to get the wealthy commercial Internet to subsidize the vast majority."
Oversight of names is another issue. I certainly have no lack of complaints from readers unhappy with some decisions RealNames has made, and I get more every time I write an article about the system. While RealNames does have a policy review board, that board is not an independent body that can force RealNames to change a decision the company has made. Teare agrees this is a weakness.
"I definitely want to change that where someone else can tell us we've made a mistake and force us to change," he said. "I think the question is who wants to do that," he continued, adding, "I would absolutely love it if ICANN wanted to," referring to the authority that oversees the domain name system.
At the moment, it seems as if RealNames' partners are helping provide a balance. Since most of them are search services, there's pressure for the RealNames system to resolve addresses in a way consistent with what their users expect. In fact, the added support Microsoft has given the system was conditional on RealNames adding more review and discretion over how names are assigned.
"They wanted an unequivocal endorsement from us. We wanted to make sure if we were endorsing them that we knew that the user experience was going to be superior," said Bill Bliss, general manager of Microsoft's MSN Search. He and others at Microsoft were involved in a process to help RealNames better define how names would be issued and resolved in the RealNames system.
You'll also hear RealNames make mention of its system being based on "open standards." The reference here is to how the RealNames system and other alternative web addressing systems, or "namespaces," work technically. The goal is to make them compatible with each other. The standards being developed have absolutely nothing to do with how names are assigned.
Of course, RealNames isn't the only player in the namespace competition, but it is the strongest. AOL runs a long-standing keyword navigation system, but that operates only within AOL's proprietary online service. AOL-owned Netscape also has an Internet Keyword system, but it isn't yet supported beyond the Netscape browser. Netword is another namespace company, but it has nothing like the reach that RealNames has established through search engines and other partnerships.
In conclusion, the domain name system isn't going away immediately, but like the dinosaurs, I think it will slowly become extinct. In its place will be namespaces, which we'll use to navigate the web. Indeed, as what I call "GenNet" comes online -- youths who have never known a world without Internet access -- they'll probably view "old-timers" trying to reach web sites using .com and other DNS addresses in the same way we might laugh to hear someone trying to dial a phone number using old style telephone exchange prefixes, such as the one from the Glenn Miller song, PEnnsylvania 6-5000.
Sound unbelievable? Before there were domain names, we used IP addresses to reach information on the Internet. In fact, IP addresses still underlie everything on the web. The ABC television web site is really at http://18.104.22.168. The domain name system evolved to save us from having to remember these numbers. We simply enter ABC.com into our browsers, then they communicate with a DNS server that routes us to the right IP address. In the same way, the namespace systems are evolving to save us from the confusing DNS system. Going forward, we'll enter "ABC" into our browser, then the namespace system will resolve it behind the scenes to a domain name, delivering us to our destination.
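To make the layering concrete, here is a minimal sketch of the two lookups described above: a keyword resolves to a domain name, and the domain name resolves to an IP address. Both lookup tables are invented for illustration (the address comes from a range reserved for documentation), and the function names are hypothetical. It is not how RealNames or any DNS server is actually implemented.

```python
# Toy two-step lookup: keyword -> domain name -> IP address.
# Both tables are made-up illustrations, not real RealNames or DNS data.
KEYWORDS = {"abc": "abc.com"}            # hypothetical namespace registry
DNS_RECORDS = {"abc.com": "192.0.2.10"}  # 192.0.2.0/24 is reserved for documentation


def resolve_keyword(keyword):
    """Return the domain registered for a keyword, or None if unregistered."""
    return KEYWORDS.get(keyword.strip().lower())


def resolve_domain(domain):
    """Return the IP address for a domain name, or None if unknown."""
    return DNS_RECORDS.get(domain.strip().lower())


def navigate(user_input):
    """Resolve what the user typed into (domain, ip), trying keywords first."""
    domain = resolve_keyword(user_input) or user_input.strip().lower()
    return domain, resolve_domain(domain)


if __name__ == "__main__":
    print(navigate("ABC"))      # keyword path
    print(navigate("ABC.com"))  # plain domain name path
```

In this sketch, entering either the keyword or the full domain name ends at the same address, which is the behavior the namespace systems are aiming for.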
For webmasters, I've always suggested registering RealNames as a means to capture traffic from AltaVista. As the system has grown, I now also think it's essential that if you own a brand name, you also consider registering RealNames keywords that match. It should help drive brand-related traffic, as well as ensure that you are well positioned as people begin to experiment with using namespaces. Finally, be sure to also register a RealNames keyword that matches your domain name. For instance, if you were Nike, you'd want to have both the RealNames of Nike and Nike.com. The latter is important because it will ensure that anyone who enters a domain name into a search engine -- and many people do -- will find a link to your site displayed prominently.
Using RealNames Links
More about how the system works.
How RealNames Works
RealNames Temporarily Suspends Registration Of Generics
The Search Engine Report Jan. 4, 2000
More information on how RealNames was assigning generic or categorical terms and why the practice was halted.
The organization in charge of the domain naming system. Find information here about new TLD proposals and more.
A rival to RealNames in the namespace field.
The FAQ page has more information about the new TLDs, plus there's a wealth of other information about domain name registration and issues.
Brief definition of the domain name system, with many great links that provide more information about how the system works.
Common Name Resolution Protocol
Information about the standards being developed for namespaces to use.
Dot-coms: Masters Of New Domains
Forbes, April 26, 2000
How country-specific domain names are being used for new purposes. But despite these moves, .com remains king.
AOL, Microsoft going to war over browsers
Washington Post, April 21, 2000
Another look at the coming of namespaces.
ICANN Moves Closer To Adding Web Domains
Newsbytes, April 20, 2000
New top level domains come closer to reality due to a recent action by one of ICANN's supporting organizations.
Island nation cashes in on ".tv" country code
News.com, April 8, 2000
Why Dot Com is King
Domain Notes, April 2000
Argues that despite the possible introduction of new TLDs, .com will remain the top choice for businesses.
The ICANN Dispute Resolution Policy
Domain Notes, April 2000
A look at how the new domain name dispute system seems to be working against cybersquatters.
Microsoft to Back a Browser Keyword System
New York Times, March 14, 2000
Details on RealNames charges to corporate clients, plus a comment from Esther Dyson, who heads ICANN but who wasn't speaking for the group as a whole.
Internet Board Agrees to Overhaul Election Plan
New York Times, March 10, 2000
Very nice summary of recent decisions made by ICANN.
Telephone Exchange Name Project
Everything you wanted to know about the old-style telephone exchange name system.
Google Speaks Languages, WAP, Adds Other Features
Google has unveiled beta versions of its site for non-English speakers and those accessing the web using wireless devices. The search engine has also added new features to its search results and launched a new "university" search engine.
Users can now search for pages in 10 languages in addition to English. They are Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish. Additional language support is planned for later this year.
To use the feature, just choose the language you desire using the drop-down box next to the search box. For instance, select "French," and only pages written in French should appear. Google's messages and instructions will also change to match the language you searched in. Additionally, Google will remember your language preferences. Use the "Language options" link on the home page to control your settings.
Google has also become the latest service to cater to the growing wireless market. Nor is the company just playing catch-up. Google's service goes beyond the other WAP search offerings that I've seen, which either index or catalog only pages specifically designed for wireless users. In contrast, Google's WAP service is supposed to translate any document it lists into a format viewable on mobile or handheld devices.
For instance, FAST Search operates a wireless search engine, which lists pages written in wireless markup language, or WML. It excludes ordinary HTML documents from its listings, since those pages probably won't appear properly on a WAP phone. In comparison, Google presents its normal results, but if a user chooses an HTML document from its listings, the page will automatically be translated into a wireless format. That means Google gives mobile users much greater reach, though only to documents actually listed in its results.
A new feature added to Google's search results is the ability to email results to yourself or someone else. Perform a search, then click on the "Email These Results" link that appears at the top of the results page. Enter up to four email addresses, and the results will be sent to those people.
Another feature of Google's results has been renamed and enhanced. The "Cached" link below each page listed used to bring up an exact copy of that page, as it looked when Google spidered it. The Cached link has become a popular method for users to find copies of pages that no longer exist, and Google still supports it, just under the new name of "Show matches." The feature has also been enhanced with hit highlighting. When you view the cached copy of the page, the search terms you looked for will be highlighted in yellow on the page.
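For readers curious how hit highlighting might work in principle, here is a minimal sketch. It marks whole-word, case-insensitive matches of the search terms in a piece of plain text. The function name and the bracket markers are my own assumptions for illustration; Google has not described its implementation, and a real version would wrap matches in HTML to color them.

```python
import re


def highlight(text, terms, open_mark="[[", close_mark="]]"):
    """Wrap each whole-word, case-insensitive match of a search term in markers."""
    words = [re.escape(term) for term in terms if term]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, text)


if __name__ == "__main__":
    page = "Search engines index the web. Web search is fun."
    print(highlight(page, ["web"]))
```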
Finally, you can now perform a search to find matching web pages from within any one of over 40 prominent US universities, such as Harvard, Stanford or even my alma mater, the University of California, Irvine. Go to Google's University Search, then pick the college you are interested in. I only wish Google also made it possible for you to do a search across all of these universities at once. Perhaps in the future....
Wireless users need not go to a special address to use Google's special wireless version, which is in beta testing. Google will detect your WAP phone and automatically load a version formatted for it.
Google: Palm Edition
Those with Palms should use this edition of Google, which displays in normal HTML but in a format designed for Palm and other small screen devices.
Google Language Options
Set your language preferences with this page.
Google University Search
Don't have a WAP phone? Use Gelon's Wapalizer to see the web as a wireless user does.
Wild About WAP
The Search Engine Report, March 3, 2000
A bit more about wireless search engines and WAP.
LookSmart: Commercial Charge Just A Test
For two days in April, LookSmart ran an experiment where any commercial web site was asked to pay a fee for submission. The usual $199 option was available, which promised an answer about being listed within 48 hours. A new $79 option promised a 4 to 6 week turnaround time. Non-commercial sites were also given the option to submit for free. Now only the normal two options remain: express submission for $199 or free submission, which is open to anyone.
LookSmart wanted to gauge the reaction to a less expensive option, to see whether it might be a solution for coping with all the additional sites that want to be listed but which find the $199 fee too expensive.
"It is clear to us that the vast majority of web sites who submit through us and our partners are ecommerce providers and other commercially oriented sites. We do, however, understand that not all of these sites are able to pay huge amounts of money to have their web sites processed. We are trying to conduct a variety of tests to establish a range of price points that would still enable them to do this," said Kate Wingerson, LookSmart's editor in chief.
If more commercial sites did take up a paid listings option, it would relieve growing pressure on LookSmart's resources, Wingerson said.
"We estimate that within two months we will be receiving over 20,000 submissions a day. In order to process this number of submissions in a timely manner would require a huge staff of hundreds and would completely divert resources away from what we consider to be our main editorial job, proactively selecting, describing and categorizing web sites in response to user missions," she explained.
Wingerson said that LookSmart is in the middle of its submission process review and doesn't yet have enough information to say what the shape of any final program will be. As more details are known, I'll follow up accordingly.
LookSmart Launches Express Submission Service
The Search Engine Update, Feb. 3, 2000
More details about LookSmart's express submission option.
Movement In Meta Search
Meta search engine ProFusion was acquired by search utility maker Intelliseek in April, making it the third major meta search engine to be gobbled up in recent months.
The trend started last August, when Go2Net acquired meta search site Dogpile in a US $55 million deal. The move was especially notable because Go2Net already had absorbed the web's most popular meta search service, MetaCrawler, in November 1998. Acquiring Dogpile gave Go2Net a second popular meta search service. Why buy what Go2Net already had? To lock up the market.
"We had the number one search service, and this was number two, and we wanted to own the meta search category," said Dr. Oren Etzioni, Go2Net's chief technology officer and creator of MetaCrawler, in an interview last year.
It's not a bad category for Go2Net to own, especially given that some search engines actually pay to be carried in the meta search results. It's a way for smaller, standalone search engines to extend their reach to new users and be seen alongside the more established players. For instance, pay-for-placement search engine GoTo.com recently cut new deals with both Go2Net and meta search engine Mamma.com. The deal ensures GoTo's results, and thus its advertiser-supported links, will be placed before more eyeballs.
Meta search is also a good category because it may attract search users dissatisfied with standalone search engines. This is especially so when articles appear that discuss how "little" of the web each search engine covers or how results can be different from engine to engine. The natural solution to these concerns for a user is to turn to a meta search engine, which provides the top results from several search engines all at once.
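As a rough illustration of the idea, here is a minimal sketch of how a meta search service might combine top results. It interleaves each engine's ranked list and drops duplicate URLs. The engine names, the URLs and the round-robin merge are all assumptions made for the example; the actual services use their own, unpublished ranking methods.

```python
from itertools import zip_longest


def merge_results(results_by_engine, limit=10):
    """Interleave each engine's ranked URLs rank by rank, dropping duplicates."""
    merged, seen = [], set()
    # Take every engine's #1 result, then every #2 result, and so on.
    for rank_row in zip_longest(*results_by_engine.values()):
        for url in rank_row:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged[:limit]


if __name__ == "__main__":
    # Hypothetical result lists from two made-up engines.
    sample = {
        "engine_a": ["http://one.example/", "http://two.example/", "http://three.example/"],
        "engine_b": ["http://two.example/", "http://four.example/"],
    }
    for url in merge_results(sample):
        print(url)
```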
Nevertheless, only Go2Net seems to have leveraged meta search into attracting big traffic. The site now regularly lands in Media Metrix's top rankings, though much of that traffic may be driven by Go2Net's other content, such as the Silicon Investor web site.
Cnet is another player that's entered the meta search competition. In October, it acquired the popular SavvySearch service in a $22 million deal. SavvySearch continues to run as an independent web site, but its technology was integrated last month into Cnet's long-standing Search.com site. Previously, Search.com had been powered by Infoseek.
Until now, Cnet hasn't seemed to invest much time in Search.com. The inattention began when Cnet launched Snap.com back in September 1997, aiming the site to take on the likes of Yahoo, Lycos, Excite and other portals. With Snap now owned primarily by US television network NBC, Cnet seems to have thoughts about targeting search again.
Like MetaCrawler and SavvySearch, ProFusion originally began at a university. It went private, then Intelliseek purchased it last month. Intelliseek makes the well-regarded BullsEye meta search software. The acquisition of ProFusion now gives it a low-traffic but established search site that it can expand and enhance.
From a user perspective, these deals mean that the meta search services named are likely to grow and be developed. Of course, that doesn't mean that "independent" meta search sites such as Mamma.com or C4.com won't also develop. In fact, the recent acquisitions will probably help other services attract investment, since value is clearly being attached to meta search.
On the other hand, huge success by meta search engines could ultimately cause the "main" search engines that they depend on to cut off licensing agreements, which most of the major meta search sites establish in order to avoid legal problems. However, for the moment, the major search engines generally say that meta search sites pose little burden and even provide them with some exposure. Consequently, it's a win-win for both parties.
Search Links: Metacrawlers
You want meta search? Have no idea what it is? Links and answers are here.
Relaunched in January, the service added music and auction meta search capabilities, gave users the ability to customize searches to return results from more than 25 specifically requested countries, and included Google as one of the services queried.
You'll find SavvySearch's meta search technology here, for both general purpose searching and to power topical meta search offerings.
CNET Investor Message Board Area
Provides meta search across major investment discussion areas.
Courting Retailers: Metasearchers Increasingly Cozy up to E-Commerce Sites
Internet World, Jan. 15, 2000
More details on how and why some search and shopping sites are paying to be carried on meta search engines.
Auction Search Case Awaits Ruling
Is it legal to spider someone's web site without permission? We've never had a court ruling on this before, at least in terms of textual information, but that's expected to change shortly.
There's been a long-running dispute between auction site eBay and auction search engine Bidder's Edge. eBay claims the right to restrict who can crawl its listings, in part arguing that spiders can slow its service and that the information on its site is intellectual property entitled to protection. Bidder's Edge argues that the information belongs to those auctioning their goods and services, not to eBay, and that it needs no permission from eBay to index this information.
The case is currently before the US District Court in San Jose, California. The judge in the case has said that he's inclined to grant a preliminary injunction against Bidder's Edge, but how exactly he might restrict the service remains to be seen -- as does whether he will actually grant an injunction at all.
Should he do so, it would be a blow for Bidder's Edge but by no means an end to the issue. The case would still head to trial, where Bidder's Edge could win. It might also be successful during the legal maneuverings that occur before a trial.
Bidder's Edge, and those on its side, express concern that a victory for eBay could mean an end to search engines. After all, no major crawler-based search engine expressly seeks permission to index content. Instead, permission is assumed.
This brings up the issue of the robots.txt file, a long-standing convention to explicitly tell spiders to stay out of a web site. There's a strong argument that the greater good on the web is served by allowing search engines to operate in "opt-out" mode. That means that if you don't want your content indexed, you use a robots.txt file to "opt-out" of the process by telling spiders to go away.
By all means, eBay should certainly have a robots.txt file in place, if it wishes to keep spiders out. It doesn't, and has never had one in place any time I've checked since this dispute began last October. The lack of such a file shows little concern about being crawled. It's a fundamental mechanism the company should have in place. Moreover, I know that eBay is not only listed on major search engines but has also employed several search engine optimization firms to promote itself on search engines. This demonstrates no real concern that visitors should only come into the site via its home page, nor that its internal content should be protected from spiders. Instead, eBay simply seems to want to play favorites. Spiders that benefit it by sending the site traffic are apparently OK, but spiders it feels may be a threat to its business interests should be forbidden.
So this isn't just a case about spidering. It also involves whether some spiders can be selectively discriminated against. The robots.txt convention certainly allows this. You can exclude just particular spiders, and this was especially designed to stop "misbehaving" spiders that site owners felt were putting a burden on their servers. But the robots.txt file isn't a legal convention, which is why court cases like this one will be watched so closely.
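Here is a minimal sketch of what such selective exclusion looks like, and how a well-behaved spider would check it. The robots.txt content and the spider names "AuctionBot" and "SearchBot" are hypothetical; eBay has no such file, as noted above. The check uses the robots.txt parser from Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named spider entirely, and keep
# everyone else out of /private/ only. Both spider names are made up.
ROBOTS_TXT = """\
User-agent: AuctionBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the robots.txt text lets this spider fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/listings/1"
    print("AuctionBot:", allowed("AuctionBot", page))
    print("SearchBot:", allowed("SearchBot", page))
```

Note that nothing here enforces the rules. The file only states the site owner's wishes, and it is up to each spider to run a check like this before fetching a page.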
For the record, even if eBay had put up a robots.txt file, Bidder's Edge says they would not have observed it. I find that disturbing, because I feel those who operate spiders should obey the robots.txt convention. It's one of the few solid rules we have involving search engine spiders, and it has helped make the entire opt-out indexing situation possible. In turn, that has benefited web users as a whole.
However, Bidder's Edge does argue that it would ignore such a file at eBay because it feels the content belongs to those placing the auctions, not eBay itself. Thus, eBay should have no right to limit the ability of its participants to be found. That's a powerful argument. It's akin to saying that those at GeoCities couldn't have their home pages found because GeoCities-owner Yahoo decided to block search engine spiders. I can imagine the outcry that would bring.
This brings us to the meta robots tag. It allows spiders to be blocked on a page-by-page basis, rather than the site-wide system the robots.txt file is designed for. Potentially, eBay could give all of its users the choice at the time they place an auction of whether they want spiders blocked from indexing their content. It's even possible that eBay might charge an extra fee to those who choose to allow spiders in, which could be used to cover any real burden that the spiders might place on its servers.
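To show the page-by-page mechanism, here is a minimal sketch of how a spider might honor the meta robots tag. It reads a page's HTML and declines to index it if the tag contains "noindex". The sample pages and the names RobotsMetaParser and may_index are made up for the example; real spiders have their own parsing rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def may_index(html):
    """Return False if the page asks spiders not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives


if __name__ == "__main__":
    # Hypothetical auction pages: one opts out of indexing, one does not.
    blocked = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    open_page = "<html><head><title>Auction item</title></head></html>"
    print(may_index(blocked), may_index(open_page))
```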
Search Engines And Legal Issues
You'll find links to other cases involving spidering and linking here.
Search Engine Features For Webmasters
Information on blocking spiders with a robots.txt file or a robots meta tag can be found here.
eBay, Bidder's Edge face off in court
News.com, April 14, 2000
Short summary of the current situation, with a link to a story about the Justice Department asking eBay about its actions.
Auction Dispute Centers on Question of Control Over Data
New York Times, April 14, 2000
Longer summary of the case, with quotes from both sides and third parties.
eBay vs. Auction Aggregators: A Freedom Fight?
InternetNews.com, Feb. 11, 2000
An older article exploring the issues involved.
Economist, Oct. 16, 1999
Another older article but still useful for explaining the concerns in the case.
Legality of 'Deep Linking' Remains Deeply Complicated
New York Times, April 14, 2000
Article about a different case between Ticketmaster and Tickets.com which touches on similar issues to the eBay dispute.
Ticketmaster Corp., et al. v. Tickets.Com, Inc
GigaLaw, March 27, 2000
Detailed information from the Ticketmaster case.
Copyright Decision Threatens Freedom