THE SEARCH ENGINE UPDATE
July 5, 2000 - Number 80
By Danny Sullivan
Editor, Search Engine Watch
Copyright (c) 2000 internet.com corporation
About The Update
The Search Engine Update is a twice-monthly update of search engine news. It is available only to those people who have subscribed to Search Engine Watch, http://searchenginewatch.com/. Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ About The Search Engine Watch site
+ Conference News
+ Yahoo Partners With Google
+ Good For Google Does Not Equal Bad For Inktomi (online only)
+ Google Announces Largest Index
+ Inktomi To Acquire Ultraseek From Go.com (online only)
+ Excite Changes Look, Results
+ Lycos Partners More Closely With FAST
+ Subtle But Helpful Changes At HotBot
+ Community Search Blossoms - Part 1
+ New Sites In Search Links
+ LookSmart Ends Pricing Tests
+ Fact Search Engines Available
+ Vortal Services Now Offered
+ On The Radar Screen
Search Engine Articles
+ Interesting articles relating to search engines.
+ List Info (Subscribing/Unsubscribing)
Happy 4th of July to all my US readers, and Happy Canada Day to all the Canadians! There is lots of news to share this issue, covering changes to major sites such as Yahoo, Google, Excite, Lycos and HotBot. Plus, there are an increasing number of new tools to cover. So let's dive in.
But first, for some time I've been meaning to mention and thank Jimco's Auto-Save tool for Microsoft's FrontPage HTML editor. I use (and despite its faults, like) FrontPage for managing Search Engine Watch. However, I find it unbelievable that after four releases, Microsoft still hasn't added an auto save feature to the program. Until I found Jimco's tool, I either had to remember to hit the save button every five minutes or find myself losing work after some inevitable crash. If you are a FrontPage user, get this tool. It has rescued me many times.
In addition, I was extremely thankful to have a copy of Norton Utilities this week. I managed to accidentally delete an entire article while editing the newsletter, representing several hours' work. I realized my mistake the next day, well after I'd overwritten both the file and its backup version several times. Very fortunately, the Norton Utilities UnErase program was able to locate all the previously deleted backup versions that Jimco had been making, which ordinarily cannot be retrieved from the Windows Recycle Bin. I got the article back, much to my relief.
Jimco's Auto-Save Tool
(and more cool FrontPage enhancements)
In case you forgot your password, the finder will help you access the Search Engine Update articles listed below.
The next Search Engine Strategies conference will be held in San Francisco on August 14. I'll be presenting and moderating sessions that feature experts on search engine marketing issues and panelists from the various major search engines themselves, including About.com, AltaVista, Ask Jeeves, Excite, Google, Inktomi, LookSmart, Netscape/The Open Directory and Snap. In addition, there will be a special session on shopping search, which should be of interest to any online retailers. Details about the conference, for attendees or potential sponsors and exhibitors, can be found via the URL below.
Search Engine Strategies 2000 - San Francisco
Yahoo Partners With Google
Yahoo has selected Google to take over from Inktomi in powering Yahoo's secondary results. These are the listings that appear in the "Web Pages" area of Yahoo's results, after any hits from Yahoo's own human-compiled listings. In addition, Yahoo plans to move forward with changes to improve the relevancy of its own primary listings over the coming months.
The switch to Google should occur sometime this month (in fact, it appears to have already happened). Inktomi is also being retained as a Yahoo partner. The company will provide search services for Yahoo's new Yahoo Corporate product that was announced in June.
Google becomes Yahoo's fourth crawler-based partner. Open Text was the first, then AltaVista took over in mid-1996, then Inktomi picked up the contract in mid-1998. At that time, AltaVista lost out because it was seen by Yahoo as a competitor in the portal space. In contrast, Inktomi's "behind-the-scenes" business model of powering but never competing with portals gave it the edge.
Winning the Yahoo contract didn't "make" Inktomi. At the time, the company had already been picked as a provider by both MSN Search and Snap, in addition to serving HotBot and smaller players like GoTo. But the contract certainly raised Inktomi's profile as a leader in the portal powering market. Similarly, the Yahoo partnership is a huge boost for Google. It's an extraordinary achievement for Google, less than two years old, to be selected by the web's most popular search site as its partner.
It's also well deserved. "Our goal is to produce no holds barred, the best search available," said Google's CEO Larry Page, in an article I wrote just after the company received major investment in 1999. Since then, Google has delivered. The most recent NPD Search and Portal Study found 97 percent of Google users say they find what they are looking for every or most of the time, placing it highest among the 13 major search sites surveyed. In addition, reviewers continue to rave about Google, as does the general public. When I speak about search engines to groups and mention Google, something unusual happens to some members of the audience. They smile and nod, in the way you do when you feel like you've found a secret little getaway that no one else knows about. And each time I speak, I see more and more people smiling and nodding this way, pleased to have discovered Google.
One thing that will be missing from Google's results at Yahoo is links to the Google version of the Open Directory, a main competitor with Yahoo in categorizing the web. But what Google has done to apply link analysis to the Open Directory could also be applied to Yahoo's categories. "We have chatted with them about what can be done to improve their directory," said Google president Sergey Brin.
Bringing Google on board to improve Yahoo's secondary results is great, but Yahoo is also planning upgrades in the coming months in how listings from its own database are presented and ranked in response to a search request. "You should certainly be looking at further improvements to our own search," said Srinija Srinivasan, Yahoo's editor in chief.
The process already began when "Most Popular Sites" sections were added to some categories in April. Since then, popular sections have been added to additional categories, Srinivasan said. Srinivasan says you'll tend to see these more in non-commercial areas of the directory and only in categories where 20 or more sites are listed. Examples where Most Popular Sites can be found include the Virtual Cards category, Children's Health and Genealogy.
Yahoo's keeping quiet on how the popularity rankings are determined. I've not seen Yahoo do any clickthrough measurements, nor has anyone reported this to me, which leaves me assuming the company is spidering and doing some type of link analysis.
Yahoo has also come through a reorganization of its business listings, which was completed around the end of April. "This was truly a wholescale renovation in how we deal with commercial listings," Srinivasan said.
Previously, commercial listings had been broadly divided between the "Companies" and "Products and Services" categories. However, Yahoo realized there was less distinction between companies and products and much more between companies with products aimed at consumers and those catering to other businesses. Consequently, if you enter the top level "Business and Economy" category, you'll now see two major categories below it, "Business to Business" and "Shopping and Services," aimed at consumers.
Given the reorganization, you may wish to check your listings at Yahoo. Some companies offer both business to business (B2B) and business to consumer (B2C) services. If you are one of these, you should be classified both ways. If you are not, Srinivasan said to use the Yahoo change form to let editors know why you should be in both major sections. Make it easy for them. Locate the most appropriate B2B or Shopping and Services category for your site, and send that along with your message.
For new sites not listed in Yahoo, Srinivasan said to pick the best category for your site, then submit the other as your second pick, then add a note explaining why you should be in both places. For instance, let's say your site serves both B2B and consumer markets. You need to start the submission process from somewhere, so you decide to find the best category within the B2B area. On page 2 of the submission form, you would enter the Shopping and Services category that's appropriate as an optional suggestion. Then, at the end of the form, use the Final Comments section to explain why you belong in both places.
Follow this procedure even if you use the Business Express service, Srinivasan said. Don't do two separate Business Express submissions, one to each category. You'll only lose the fee for the second submission.
Finally, another change that hit Yahoo about two weeks ago was an increase in the fee for adult sites to use the Business Express system. They must now pay $600, rather than the normal $199. Yahoo says this is to make up for the extra time they've found adult sites take to process. "A small proportion of the submissions take a disproportionally large amount of our time," Srinivasan said. In particular, there is a high incidence of adult sites trying to obtain multiple listings through the use of disguised mirror sites, she said.
Yahoo Changes Listings
The Search Engine Update, May 3, 2000
More information about the addition of "Most Popular Sites" to Yahoo's categories.
NPD Search and Portal Site Study
More details from the latest search satisfaction survey will be posted by the end of this week.
Yahoo Business Express Expanded
The Search Engine Update, Nov. 23, 1999
Past article and links to previous articles about the Yahoo express service.
Yahoo Business Express Help
More information about Yahoo's express service, from Yahoo.
Yahoo Change Form
Good For Google Does Not Equal Bad For Inktomi
Inktomi's stock plunged after last week's announcement that the company had lost the Yahoo web search contract to Google, as investors wondered if the Yahoo deal was a harbinger of future Inktomi defections. Maybe, but the loss of a big portal doesn't necessarily mean Inktomi is a big loser, just because Google is a big winner. See the article below for more details.
Good For Google Does Not Equal Bad For Inktomi
The Search Engine Report, July 5, 2000
Google Announces Largest Index
Another milestone in the search engine size wars was hit when Google went live with a full-text index of 560 million URLs in June, making it the largest search engine on the web. In addition, because of how Google makes use of link data, its reach extends to a further 500 million URLs that it has never actually visited, the company says. That means searches at Google potentially encompass more than 1 billion pages, which is the size the entire web was estimated to be earlier this year.
So does this mean Google is the first search engine to give 100 percent coverage of the web? No. For one thing, that 1 billion page estimate is several months old, and the web has almost certainly increased in size since then. Nor does that estimate include the millions of pages that search engines typically don't crawl, such as those behind password protected areas or served up by identifiable dynamic delivery systems. How big the web is now is anyone's guess.
Also, Google has actually visited and recorded the contents of 560 million pages, not 1 billion. Google, unlike any other major search engine, does make clever use of its technology to leverage its reach beyond this core set of pages (as the articles below explain further). It isn't just marketing hype for it to use the 1 billion figure, but those extra pages are more like a bonus that you can't always depend on, rather than the assurance you get from having indexed each and every page.
None of this takes away from Google's accomplishment nor the value of using its service. It is now a clear choice for those seeking both highly relevant results and comprehensive searching across the web. Searchers should also have even more choice in the coming months. WebTop just announced its own half-billion page index, and some of Inktomi's partners should go live with Inktomi's new half-billion page index in the very near future.
While Google is running searches against the full index at its own site, its partners may not tap into the entire amount. "We support searches for different partners, so they won't all necessarily be getting the largest index," said Google president Sergey Brin. Google's customers can choose to search against indexes of different sizes, with the smaller indexes containing a higher proportion of the web's most popular pages, as determined by Google's link analysis system. The benefit for customers in using a smaller index is savings. It costs them more to query against the biggest collection of documents, Brin says.
Offering different sized indexes isn't new. Inktomi has also offered its partners this option, and it is one reason why Yahoo's Inktomi-powered results have never matched those of some other Inktomi-powered services. Yahoo has never chosen to hit Inktomi's index as deeply as possible. Whether this will change when Google takes over remains to be seen. Google says Yahoo has the option to do so, and I'll revisit the issue after Yahoo goes live with the Google results.
For webmasters, the index variation is important. As Google begins to add more and more partners, as with Inktomi, you'll need to expect that results will be different than at Google.com. For searchers looking for comprehensiveness, the key issue will be to tell which partners hit the full index. Unfortunately, that probably won't be readily apparent.
In other changes to the index, Google can now also update parts of the collection more often. "We're aiming to keep everything within a month," Brin said, explaining that in the worst case, a document might not be rechecked for a month. However, the index itself is aimed to be updated at least weekly, and more volatile documents may be refreshed on a daily schedule.
Google also now serves queries from three data centers, two on the West Coast of the US and one on the East Coast. Brin says that the different data centers should be kept in sync with each other, so lag time between mirrors should not be significant.
Google has also begun to float articles from major news wires at the top of its results, in response to current news stories. You can't specifically search for news, but the system should recognize topical queries such as "elian gonzalez" and respond with links that appear beginning with the word "News."
A new page offering a behind-the-scenes look at Googlites at work and play.
Search Engine Size Test
I took a look at how the new Google index performs against other size leaders. Did it live up to its claims? Pretty much, yes. Also see how FAST, Excite, AltaVista and some Inktomi-powered services perform. NOTE: The page isn't ready yet, but results will be posted by July 7.
Search Engine Sizes
The current sizes of major crawler-based search engines, historical numbers and plenty of articles that document the size wars over the past years.
Numbers, Numbers -- But What Do They Mean?
The Search Engine Update, March 3, 2000
Explains how Google leverages its link database to expand its coverage, and it also puts other "dual numbers" you may hear into perspective.
The Half Billion Crew: Google, Inktomi GEN3, & WebTop
Search Engine Showdown, June 29, 2000
Greg Notess compares and contrasts leaders in the index size game and runs a current comparison.
Google's Cool Billion
About.com Web Search Guide, June 26, 2000
Another look at the Google size increase from search writer Chris Sherman.
I mentioned WebTop in my June newsletter, and now the company has just announced a half-billion page index. Expect a closer look in the future.
Boo.com's computers used to revamp search engine
Reuters, June 28, 2000
Computers from the failed Boo.com online retailer have been put to new use powering WebTop.com
Inktomi To Acquire Ultraseek From Go.com
Inktomi has announced that it is to acquire Ultraseek Corporation, the subsidiary of Go.com that produces search software used by web sites and companies for site-specific and Intranet searching. However, the move will have no impact on Go.com continuing to serve a general search market. See the article below for more details.
Inktomi To Acquire Ultraseek From Go.com
SearchEngineWatch.com, June 8, 2000
Excite Changes Look, Results
Excite has released a new look to its search results that also coincides with a new ranking system. In response to a search, Excite's new "Excite Precision Search" generally displays only matches from its crawler-built database. The simple design mimics the "less is more" move that AltaVista kicked off in May with its Raging Search. Behind the scenes, Excite is making better use of link analysis than it has in the past to improve its search results.
The search format changes should be welcomed by Excite users. In particular, it was becoming increasingly difficult to understand how Excite would construct its search results page. Sometimes web page results would appear with full summaries, then other times only titles would appear. Directory listings, news headlines or information from other data sources would pop up almost whimsically. But far from whimsical, Excite was purposely varying its results format depending on the context of the search in an attempt to better serve users. Being predictable is now the watchword.
"We're aiming for consistency," said Abbot Chambers, Excite's Senior Director of Search and Directory Products. "Some users really liked the different results, but it did have the unintended effect of unsettling users."
Simplicity is also a key aim. Search for something on Excite, such as "fireworks," and you'll get back 10 results that come from Excite's crawler-based index. Other information, such as human-compiled directory listings, news articles and multimedia results will only appear if you specifically request them using links at the top of the results page. Doing so will rerun your queries against those search offerings, described further below.
The only exception to this is if you search for something popular or which falls into a category where Excite editors have produced some programmed results. In those cases, a "Quick Results" box appears to the left of the search results. It has links to related information within the Excite portal. For examples, try "vw bug," "britney spears" or use another example from the Targeted Results article, below.
Excite also says it has upgraded its use of link analysis to improve the quality of its results. Previously, Excite gave pages a boost depending on how many links were pointing at them, with no attempt to measure the quality of those links. Now, sites that have some degree of authority, as measured by the number of links pointing at them, are able to transmit this authority to other sites. In other words, a few links from high quality sites may factor in more highly than many links from ordinary web sites.
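To make the difference concrete, here is a minimal sketch of the two approaches. Excite has not published its formula, so everything here is an assumption for illustration: the tiny link graph, the site names and the rule of "one point per link, plus one point for every link the linking site itself has earned" are all hypothetical.

```python
from collections import defaultdict

def inbound_counts(links):
    """Old-style measure: count how many distinct sites link to each site."""
    counts = defaultdict(int)
    for source, targets in links.items():
        for target in set(targets):
            counts[target] += 1
    return counts

def authority_weighted(links):
    """Hypothetical new-style measure: each link is worth 1 point
    plus the number of inbound links the linking site has itself."""
    counts = inbound_counts(links)
    scores = defaultdict(int)
    for source, targets in links.items():
        for target in set(targets):
            scores[target] += 1 + counts.get(source, 0)
    return scores

# Made-up link graph: site_a has three links from obscure pages,
# site_b has a single link from a well-linked portal.
links = {
    "small1": ["site_a"],
    "small2": ["site_a"],
    "small3": ["site_a"],
    "fan1": ["portal"],
    "fan2": ["portal"],
    "fan3": ["portal"],
    "fan4": ["portal"],
    "portal": ["site_b"],
}

raw = inbound_counts(links)
weighted = authority_weighted(links)
print(raw["site_a"], raw["site_b"])            # raw link counts
print(weighted["site_a"], weighted["site_b"])  # authority-weighted scores
```

By raw count, site_a leads with three links to one. Once the linking site's own popularity is counted, the single link from the portal lifts site_b ahead.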
"Formerly, if 5,000 sites pointed at you and only 2,000 pointed at another site, barring all the other factors, we would have given you an edge," Chambers said. "Now, we are stepping back another level. We aren't just taking into account other pages pointing at you. We are also looking at both the number and quality of the sites that are pointing at you."
In addition, Chambers said that Excite is also making more use of link text, something that's been a key component of how Google works. The idea in that regard is that you examine words in or near the hyperlink to determine the relevance of the page being linked to. For instance, if a link says "Great Place For Books" and points to Amazon, then usage of link text would understand that Amazon is relevant for the word "book."
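As a rough illustration of the link text idea, the sketch below credits each word in a link's text to the page being linked to. It is not how Excite or Google actually do it; the function name and sample links are hypothetical, and a real system would presumably also reduce "books" to "book" and look at words near the link, which this sketch does not attempt.

```python
import re
from collections import defaultdict

def index_link_text(links):
    """Map each word found in a link's text to the set of URLs that text points at."""
    index = defaultdict(set)
    for text, target in links:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(target)
    return index

# Hypothetical (link text, target URL) pairs gathered from crawled pages.
links = [
    ("Great Place For Books", "http://www.amazon.com/"),
    ("search engine news", "http://searchenginewatch.com/"),
]

anchor_index = index_link_text(links)
print(anchor_index["books"])
```

Note that the target page is credited for "books" even if that word never appears on the page itself.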
Once again, link building becomes more important to the success of your site on search engines. I'm working up a long feature on how to go about link building that I hope to have ready in the near future, to guide you more. But until then, here are the key tips that you should follow.
+ Context is important. You don't just want links from any place. Ideally, you want links from sites that are relevant to the terms you wish to be found for.
+ To find these sites, use the search engines! Search for the top keywords that you want to be found for. Review the top sites that appear for these terms. Find those which are non-competitive with you. Then go to the sites and ask them for links. Remember, these are the sites that the search engines consider most relevant, and so getting links may help you pick up some of their importance. In addition, if these sites are listed well with search engines, then they are receiving traffic. Getting links from them means that you can tap into some of that traffic.
+ Make it easy for people to link to you. When you send a message, include the title of your web site, your URL and a suggested description. You'll be surprised at how many people will list you exactly as you suggest.
+ Guilt is your friend! Make a reciprocal links page for your site, and add any site that you wish to get a link from to that page. Then send your link request, noting that you've already added them to your link list, and provide the URL, so they can see it.
Talking with Excite also raised some questions about whether searches hit its entire index or just some of it. Chambers suggested that a smaller index of popular pages is hit first, then a query might fall through to a larger index of web pages, if necessary to satisfy the query. How exactly this might work was unclear, and Excite's engineering team refused to clarify the situation further. Presumably, they fear giving away some secret either to webmasters or their competitors. That's a valid concern, but it's also increasingly important to understand the key details when multiple indexes are being used. Without this information, it becomes difficult to evaluate search engines against each other. Moreover, the situation that I assume is happening with Excite is hardly secret. Inktomi has been doing the same with its partners, and Google plans the same.
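Since Excite won't say how the fall-through works, the following is only my guess at the general shape of such a system, not a description of Excite's. The function name, the sample indexes and the rule of "fall through only when the small index returns too few hits" are all assumptions.

```python
def tiered_search(query, popular_index, full_index, wanted=10):
    """Query the small index first; consult the large index only if short on hits."""
    hits = list(popular_index.get(query, []))[:wanted]
    if len(hits) < wanted:
        for url in full_index.get(query, []):
            if url not in hits:
                hits.append(url)
            if len(hits) == wanted:
                break
    return hits

# Hypothetical indexes mapping a query to matching URLs, best match first.
popular_index = {"fireworks": ["a.example/fireworks"]}
full_index = {
    "fireworks": ["a.example/fireworks", "b.example/july4", "c.example/safety"],
}

results = tiered_search("fireworks", popular_index, full_index, wanted=2)
print(results)
```

Under this scheme, a page that lives only in the larger index is never seen for queries the small index can satisfy on its own, which is exactly why the details matter when comparing search engines.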
As for index freshness, Excite says that no pages should be more than two weeks old, under its new system. Some pages may be updated more frequently, even on a daily basis. Also, while you may note that Excite has been more consistently monitoring clickthrough, this is not being used as a relevancy factor, Chambers said.
Aside from web searching, Excite offers four other major search options. Let's take them in turn by first stopping at the new Excite Precision Search home page. To reach it, click on the "More" link that appears below the search box, on the Excite home page. By default, the page will be set to do web searching. Click on "Category Search," and you can search for matching categories or web sites that come out of Excite's directory, which is primarily powered by LookSmart. Some listings remain from the days when Excite assembled the directory itself. Other than that, except for adult listings, Excite's directory is essentially LookSmart. Unfortunately, no option to browse the directory is presented. Instead, you'll need to go back to the Excite home page itself, then follow links in the "Explore Excite" box. Directory categories will appear on the right side of the screen.
"News Search" allows you to search for "Web News" articles from over 350 news sites, which are checked several times per day. The "Newswires" option lets you check against a much smaller group of wire services, such as the Associated Press and Reuters, while "News Photos" brings back news images from those two wire services. Unfortunately, Excite fails to notify you about its helpful NewsTracker service, from the News Search home page. That's a shame. The NewsTracker clipping service is one of Excite's biggest strengths, yet people I talk with never seem to have found it. You should definitely try it (see link below), and I only wish Excite would add an option to have finds emailed to you and fix the "Learn What I Like" option, which doesn't seem to learn anything, anymore.
"Photo Search" runs a query against pictures from Excite's "Webshots" web site, in particular against the WebShots Community collection, which allows people to contribute photos to other Webshot members. It sounds great -- Excite touts this as giving searchers access to over 750,000 "free" photos. But how "free" are these photos? According to the Webshots Terms and Conditions, members give Excite the right to distribute their photos to other members, when they contribute them. As for what rights Excite then grants you, they seem to be restricted to sending photos as email greeting cards or to download for use in the Webshots screen saver.
As for "Audio/Video Search," it checks for matching multimedia files that Excite has found across the web, such as AVI and MP3 files. Excite says it has between 500,000 and 1 million files indexed.
Other searches and assistance can also be found on the Precision Search home page. There are 25 different search tips that rotate on the page. Popular searches spotted by Excite editors are spotlighted along the left hand side of the screen, while specialty searches such as Maps and TV Shows are also itemized on this page.
Finally, Excite's Add URL page has undergone a change. Click on it, and you'll now see options to submit to the Excite Directory or the Excite Search Index. A submission to the directory simply routes your request to LookSmart and into its $199 Express Submit system. You can also go directly to LookSmart itself and submit, and I'd recommend doing this, in order to select the right category for your site. See the article below about why this is helpful.
Choosing to submit to the Excite Search Index will bring up the old Excite Add URL page, which sends your page to Excite's crawling system. It's best to submit your home page plus a few "inside" pages, mainly to help ensure that Excite has a way inside your site, in case there's a problem with the home page. Excite is very good about adding home pages (those that are the root URL of a web server), and it seems to be providing better coverage of inside pages than in the past. However, there's still no strong correlation between what you submit and what Excite chooses to index. So, although Excite will allow you to submit 25 URLs per day (up from 25 per week, when I last wrote about Excite submits), there's no great advantage to doing so. Ultimately, Excite's systems will make their own decisions about what to index, independently of add URL submissions.
Excite Add URL Page
Excite Search Index Add URL Page
AltaVista Launches New Search Site
The Search Engine Update, May 3, 2000
More about AltaVista going for the pure search market with its Raging Search site.
Counting Clicks and Looking at Links
The Search Engine Update, Aug. 4, 1998
An older article that nonetheless still covers the basics behind link analysis, including the notion that some sites have more authority than others.
Longer Domain Names Arrive
The Search Engine Update, Jan. 4, 2000
Have multiple domain names? This touches on why using the same one is best when doing link building.
Targeted Categories At Excite
The Search Engine Report, April 5, 1999
An older article that still provides current examples of how to see Quick Results appear at Excite.
Excite's news clipping service, a long-time favorite of mine, can be found here.
Moreover: News Lover's Delight
The Search Engine Report, June 2, 2000
If you are searching for news, you'll also want to try Moreover. More about the service.
Lycos Partners More Closely With FAST
Lycos has finally committed to using FAST Search's results to power the "Web Sites" section of its search results. Over the past two months, Lycos experimented with using results from its own crawler, from Inktomi and from FAST Search in this section. Now, FAST Search, in which Lycos has a significant investment, has been given the nod. The change does not impact Inktomi's deal to power results at Lycos-owned HotBot.
"We'll continue to use both Inktomi and FAST," said Mark Stoever, Vice President of Emerging Destinations at Lycos. In particular, Stoever explained that using different vendors helps distinguish the Lycos.com and HotBot.com sites from each other, thus broadening the Lycos Network's appeal to different audiences.
Significantly, the move makes Lycos the first major search engine to switch from using its own in house web search technology to outsourcing. Lycos started life at Carnegie Mellon University in May 1994 as a crawler-based search engine, using its own spidering system to retrieve pages from across the web. As the Lycos site developed, additional sources of search data were added. However, results from the Lycos spider always were maintained as the dominant information presented.
Then, in April 1999, a landmark move came when the main results began coming from the Lycos version of the Open Directory. The Lycos crawler was retained primarily to serve secondary results, for when Open Directory information was not available. Now, this role has been outsourced to FAST, pretty much taking Lycos out of the do-it-yourself search business, at least on its US site.
I can hear some readers asking about other major players that outsource, such as iWon, AOL Search or even HotBot. How is the Lycos action different from what these players do? It isn't, except that those players never had their own internal web search technology to begin with. Lycos did, and its move to outsourcing could mean that we see other players with their own technologies do the same. If it did happen, my guess would be Go (Infoseek) would be next, though Go's president recently said this would not happen. I suspect Excite might also be an outsourcing candidate, though its recent search improvements do show a continuing commitment to doing search internally.
The Lycos spider isn't dead. Lycos is retaining it to build specialty collections of content, which will appear in the Lycos results as explained further below. However, the bulk of the burden for providing general web search results now falls to the Open Directory and FAST and to some degree, Direct Hit.
Do a search at Lycos, and the results will start out with a "Popular" section. This is either content from within the Lycos network that editors have selected as relevant to a particular search, or Direct Hit kicks in to provide answers when there is no editor data. You'll know Direct Hit information is being used if the Direct Hit logo appears at the bottom of the search results page. For some very popular queries, Lycos also pops up links in this section to other types of searching you can do with the Lycos network, such as for MP3 files, pictures or homepages. Try a search for "britney spears" to see an example of this.
In the "Web Sites" section, information will come from the Lycos version of the Open Directory, FAST or from the Lycos crawler. From the Open Directory, you might get category links (for example, search for "cars," and "Recreation > Models > Scale > Cars" comes up). Click on the category, and you'll see a list of sites for that topic. Additionally, individual sites from the Open Directory might appear in the Web Sites area. These are easy to identify, because they have a link to their "home" category under their listings.
The freshness of the Open Directory information should improve over the next few weeks, as Lycos brings on board a new system. HotBot should be completely in sync with the Open Directory this week, then be updated each week, going forward. The same should happen at Lycos around the end of this month.
If you don't see a category link, then the information is probably coming from FAST Search. You'll certainly see a FAST logo at the bottom of the results page, if any of its information has been used. However, the Lycos crawler may also return some results to this section. This is usually (but not always) information from within the Lycos network. So, look at the URL for the pages listed. If you see "lycos.com" in them, then that's probably the Lycos crawler, still at work.
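The rule of thumb above for telling where a "Web Sites" listing came from can be written down as a short Python sketch. This is only an illustration of the heuristic as described; the `guess_source` function and the `category_link` and `url` keys are names assumed for this example, not part of any Lycos interface.

```python
# Sketch of the rule of thumb described above. The dict keys are assumptions
# made for this example, not fields from any real Lycos interface.

def guess_source(result):
    """Guess which provider likely supplied a Lycos "Web Sites" result."""
    # Open Directory listings carry a link to their "home" category.
    if result.get("category_link"):
        return "Open Directory"
    # Pages from within the Lycos network usually come from the Lycos crawler.
    if "lycos.com" in result.get("url", "").lower():
        return "Lycos crawler (probably)"
    # Everything else is most likely from FAST.
    return "FAST (probably)"
```

As the text notes, the Lycos crawler results are usually, but not always, from within the Lycos network, so the last two answers are guesses rather than certainties.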
The Lycos crawler is also being used to return some results that appear in the "News Articles" section of the results page, with newsfeeds also being employed.
Outside the US, Lycos is still depending on its own crawler to power search results, such as at Lycos Germany and Lycos UK. Also, several of the European Lycos sites have just been upgraded to make use of new directories, which look to be built by volunteers but which are not based on the Open Directory. I hope to take a longer look at this in the near future.
Lycos says it is tapping into the full FAST index, and FAST says that no page that's listed should be more than about four weeks old, with the goal to cut that to only a week, in the future. Additionally, some pages are already revisited and updated more frequently than others, if they change often -- a few even several times per day, FAST says.
Also, don't expect results at Lycos to match those exactly at FAST. Lycos is using its own porn filtering system, and like any FAST customer, it is also able to tune how FAST returns results, FAST says. Some major options that FAST customers can tweak include the weight to place on link connectivity or popularity, the importance of keywords in the title tag, the importance of the frequency of keywords in the HTML page and keywords in the URL field. Why change settings like these? Imagine someone using FAST on an intranet. Link analysis might be less important in determining relevance, while title text might be more trustworthy. In contrast, the opposite might be more useful for web results, where link analysis can help balance out webmasters trying to improve placement by creating keyword-rich titles.
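To make the idea of tunable weights concrete, here is a minimal Python sketch. It is not FAST's actual ranking code, which has not been published; the signal names, the weight values, the two sample pages and the `score` and `rank` functions are all hypothetical, chosen only to show how the same pages can rank differently under a web profile and an intranet profile.

```python
# Hypothetical illustration only: FAST has not published its ranking formula.
# Signal names, weights and sample pages are assumptions made for this sketch.

def score(signals, weights):
    """Combine per-page signals (each in 0..1) into one weighted relevance score."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

def rank(pages, weights):
    """Return page names ordered from highest to lowest score."""
    return sorted(pages, key=lambda name: score(pages[name], weights), reverse=True)

# Assumed profiles: a web customer leans on links, an intranet leans on titles.
WEB_WEIGHTS = {"link_popularity": 0.5, "title_keywords": 0.1,
               "keyword_frequency": 0.2, "url_keywords": 0.2}
INTRANET_WEIGHTS = {"link_popularity": 0.1, "title_keywords": 0.5,
                    "keyword_frequency": 0.3, "url_keywords": 0.1}

# Two made-up pages: one with a keyword-rich title but few links, one well linked.
PAGES = {
    "keyword_stuffed": {"link_popularity": 0.1, "title_keywords": 1.0,
                        "keyword_frequency": 0.8, "url_keywords": 0.5},
    "well_linked": {"link_popularity": 0.9, "title_keywords": 0.4,
                    "keyword_frequency": 0.4, "url_keywords": 0.2},
}

print(rank(PAGES, WEB_WEIGHTS))       # well_linked comes first
print(rank(PAGES, INTRANET_WEIGHTS))  # keyword_stuffed comes first
```

Under the assumed web weights, the well-linked page outranks the keyword-rich one; under the assumed intranet weights, the order flips. That is the trade-off described above.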
Despite the change to FAST, Lycos is maintaining its own Add URL page and recommends that webmasters make use of it. Lycos said it may make use of the Add URL database for its own spidering needs. However, URLs are also supposed to be sent on to FAST Search.
To be on the safe side, I'd recommend submitting your home page and one or two key inside pages to both Lycos and FAST Search, just in case the home page is inaccessible. While technically a double submission, it shouldn't cause you any spam problems, especially as many people who don't know the relationship between Lycos and FAST will be doing this.
Should you deep submit? Both Lycos and FAST say there is no reason, and I would agree. FAST is a very good crawler. Give it a URL to your site, and it should manage to retrieve many other pages. A deep submit really shouldn't be necessary.
"The idea is you submit the top level page you want us to start crawling from," said Knut Magne Risvik, FAST's Research and Development Director of Search Technology. "You can do in depth submission, but we will prioritize the top page of your submit." While FAST has no particular per day URL submission limit, Risvik said that a high rate of submissions, say 10,000 sent quickly for the same site, will cause all of those to be ignored.
Of course, this assumes good linkage between your pages. Any pages that have no external links pointing to them would need to be submitted individually, which is true of any crawler.
By the way, while FAST does ask for a category when you submit using its Add URL page, that information isn't currently being used. As they may make use of it in the future, it makes sense to take a moment and pick the most appropriate category.
FAST has been a major player in pushing up search engine sizes, being the first to break the 200 million page and then the 300 million page mark. The company aims to full-text index 1 billion unique pages later this year.
FAST Search Add URL Page
FAST Natural Language Search
FAST has invested in a natural language interpretation company called Albert, and this beta site lets you have your queries examined for meaning and applied against the FAST index.
FAST Gets Bigger, Partners With Lycos
The Search Engine Update, Feb. 3, 2000
More about FAST Search and how it has previously been powering advanced searches at Lycos.
Lycos to hand off Net-search business
Boston Globe, June 19, 2000
Another take on the Lycos-FAST deal. I have a comment on this being the first big test for FAST, and the company says they've actually been handling a significant load from Lycos for over a month without trouble.
Boston Globe, June 26, 2000
Nice background on the pending Terra-Lycos merger.
Subtle But Helpful Changes At HotBot
Unlike at Lycos.com, Lycos-owned HotBot has no dramatic search provider announcements. HotBot says that it is pleased with its current major providers, Direct Hit and Inktomi, plus says that having a different mix helps the site appeal to a different audience than those using Lycos.com.
In response to popular searches, matches from the Direct Hit database appear under the heading, "WEB RESULTS Top 10 Matches."
"We're pretty happy with Direct Hit," said Kevin Cooke, Director of Engineering at HotBot. "Top ten matches are what [our users] want."
Inktomi results appear when Direct Hit fails to report good matches, or when you go to the second page of HotBot results. To that degree, Inktomi provides HotBot with comprehensive coverage, and Cooke says that HotBot intends to use the larger 500 million page index of the web that Inktomi is now offering its customers.
There have been a number of small changes at HotBot that users may appreciate:
+ You can now uncluster results. Use the "Disable 'Best Page Only' Filter" option on the advanced search page to do this, in order to see more than one page per web site in the top results.
+ Due to user demand, HotBot recently introduced more precise count numbers that appear in the non-Direct Hit portions of its results.
+ Users can now search within categories of the HotBot version of the Open Directory.
Also, HotBot has been operating a beta site for several months. The key difference there is that a company called e-Cyc is providing alternative meanings to searches, in the "Refine Your Search" area that appears at the top of HotBot's search results.
For example, search for "chips," and HotBot will provide these suggestions: CHiPs (TV show), french fries, computer chips, chips (snack food). Click on one, and your search will be rerun with words meant to get you pages aimed at that meaning.
On the normal site, Refine Your Search is replaced by the "People Who Did This Search Also Searched For" area at the top of the results, which shows you the top queries related to what you may be looking for. A related searches feature appears at Lycos. The results come from in house work that Lycos does to monitor what people click on. This can help it understand if similar sites are selected for different searches, thus helping it determine that a query like "travel" is related to things like "airlines," "hotels," "vacation" and "cruises."
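As a rough illustration of how click monitoring could surface related searches, here is a small Python sketch. Lycos has not described its method in detail, so this is only one plausible approach: it treats two queries as related when the sets of sites people click for them overlap. The click log, the example site names, the `jaccard` and `related_queries` functions and the 0.2 threshold are all assumptions made for the example.

```python
# Hypothetical sketch: Lycos has not described its method in detail.
# The click log, site names and threshold below are made up for illustration.

def jaccard(a, b):
    """Overlap between two sets of clicked sites, from 0.0 (none) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def related_queries(query, clicks, threshold=0.2):
    """Return other queries whose clicked sites overlap enough with `query`'s."""
    base = clicks.get(query, set())
    scored = [(jaccard(base, sites), other)
              for other, sites in clicks.items() if other != query]
    # Highest overlap first; drop anything below the assumed threshold.
    return [other for sim, other in sorted(scored, reverse=True) if sim >= threshold]

# Made-up click log: query -> set of sites that searchers clicked on.
CLICKS = {
    "travel":   {"airline-a.example", "hotel-b.example", "cruise-c.example"},
    "airlines": {"airline-a.example", "airline-d.example"},
    "hotels":   {"hotel-b.example", "cruise-c.example"},
    "chips":    {"snack-e.example"},
}

print(related_queries("travel", CLICKS))
```

With this sample data, "hotels" and "airlines" come back as related to "travel," while "chips" does not, because no one who searched for it clicked on the same sites.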
One goal at the beta site is to work out how best to incorporate the e-Cyc information alongside the existing related searches.
Community Search Blossoms - Part 1
The popularity of the Open Directory has not been lost on new search companies. Several are trying to tap into human power as a means to locate, reorganize and rate information. Here's a summary of new players looking to leverage communities into better search.
Allows anyone to create "guides" about different topics. A guide is simply a list of links, but Clip2 makes it easy to create link lists or web logs without knowing any HTML. Visitors to the site can search or browse for topics of interest, and relevant guides will be displayed. Visitors can also choose to "subscribe" to guides by adding them to their own Clip2 account. There's even a nice mechanism that allows the guide's owner to send or receive email from their subscribers. Guide builders will find an option to import links off of any page you find, which is somewhat of a concern because it could also make it easy for others to take links off pages you've created. On the other hand, all they can do is import the links and titles, not any descriptions you may have written. An "Add It" tool for your browser also makes it easy to import the link and title of any page you visit. Ideally, the tool would also import the meta description tag from the page, if one was present. Bookmarks or favorites can also be imported. Guides you create can be public, open to anyone, or "private," accessible only to those you allow. To help searchers, public guides are also assigned ratings. All guides begin with a rating of 10, and ratings go up if people discover and make use of information within the guide. Overall, Clip2 is a great place to visit if you've always wanted to build and share a link resource around a particular topic but have lacked the skills to create web pages. It's also a place where searchers may come across great nuggets of information. For instance, be sure to use the Top 100 Guides link from the site's home page. Mixed among too many guides of affiliate links are fun sources of info, such as the Computer Gaming for Girlz guide, a concise Harry Potter guide and the gruesome yet fascinating celebrity Death List guide.
Like Clip2, Octopus allows anyone to create collections of information around different topics that are called "views." Views are extremely powerful. In addition to links to web pages, a view can contain parts of web pages, images, "informational elements" that pull data such as sports scores into a view, and more. Even entire web pages can be added to a view, which is convenient, but which also raises some issues of legality, somewhat similar to the issue of pages being framed. All this power comes with a price. Octopus is a harder system to get used to, at least initially. However, the price may well be worth paying, especially if you have recurring informational needs. For instance, in just a few minutes, I was able to make a view for several major search companies that contained their names, addresses, company descriptions from Hoover's, one month stock prices, annual financial information and even their logos. All of this was done by dragging and dropping elements. Moreover, I could easily change the view and even export information into a spreadsheet. It sure beats copying and pasting information from HTML into a spreadsheet for analysis. Searchers coming to Octopus can perform keyword searches or browse the Octopus directory (click on the Directory button at the top of the page, to do this). As the site has just emerged from beta, the content still hasn't quite developed. For that reason, I think browsing may be the better way to go, rather than searching, which may not come up with good hits. When entering a category, you'll discover both "Octopus Views," created by the Octopus staff, and "User Views" that are created and shared by Octopus users. User views are ranked in order of how popular they are (meaning how many unique visitors at Octopus view them each day). For a good introduction to the power of Octopus, visit the site's home page and try one of the Top 5 User Views listed at the bottom left-hand corner of the page.
Overall, if you are an information professional, Octopus is a must visit, especially if you deal with company information. Do note that the painless installation of a small, 15K Java applet is required.
Cherry-Picking The Web
The Standard, April 17, 2000
More background on Octopus and other "metabrowse" sites.
There are several bookmark-based search engines now, such as HotLinks.com. Quiver also uses bookmarks to power its directory product, but the company isn't aiming its service at consumers. Instead, Quiver hopes that vertical portals will choose its technology to create targeted directories. For instance, a tennis web site might get bookmarks from its visitors and thus have a bookmark-based tennis directory that might be especially relevant to its users. As a demonstration of its technology, you can also search at the Quiver site (from the home page, click on the Quiver Technology Preview image). Unfortunately, there's no web-based system that allows you to contribute. Instead, you have to download the 1 MB Qbar applet. The applet also keeps track of browsing behavior (anonymously, the company says), in order to refine Quiver's results.
In Part 2, next month, I'll take a longer look at the sites below. In the meantime, here's a quick overview of them:
Adding and rating sites is a snap -- and I've had some reports from webmasters that it can be a traffic booster, as well.
Money is a prime motivator here. Become a member and get paid for participating, as you also do if your site or people you know send traffic to HotRate. Upon my quick initial review, I found the listings surprisingly good, editors that appear very dedicated and lots of help information. Non-categorized results come from Google.
In beta for the past two months, Wherewithall also offers money to its editors. Unlike at HotRate, earnings are tied to maintaining a good category, rather than to any actions throughout the site. Data comes initially from the Open Directory Project, but Wherewithall editors can build out categories independently of the ODP. Wherewithall has also made provisions so that existing ODP editors can continue to work in both systems at the same time. If you are an editor, you might get over there now and claim your category, if this site sounds of interest to you. For the time being, ODP editors have the priority.
In preview mode, the community submits and rates sites. There's also an altruistic monetary incentive. Your activities can help raise money for charity. Non-categorized results come from Google.
Life After the Open Directory Project
Traffick, June 1, 2000
Former ODP editor David Prenatt found one day that he could no longer log in to perform his editing work. He soon realized that he had been expelled from the project. Far from a rant, Prenatt eloquently chronicles his ouster and illuminates aspects of the Open Directory with detachment. Prenatt describes a rather fearsome world where speaking up equals being attacked and highlights a backdoor allowing some large content providers like Rolling Stone and AOL easy edit control over their listings. This is obviously one side of the story, and any community has its problems. Nevertheless, it highlights problems volunteer directories may encounter.
New Sites In Search Links
Somehow, I managed to work my way through the nearly 300 submissions that had come into the Search Links area of Search Engine Watch. After killing about half that were in no way search engines, and viewing way too many metacrawlers to count, I added about 100 new search engines. Here are a few notable ones you may wish to check out:
Inktomi-powered search engine that attempts to filter out adult content, web sites promoting hate, and possibly offensive material. Also accepts paid listings at the top of its results. Directory listings are created automatically using Inktomi's classification technology.
Web sites are organized into categories, then ranked in each category by how popular they are. Popularity is determined by online ratings service PC Data Online.
Spanish metacrawler that searches on the more popular search engines. It also has a database of questions and answers that aid the user when searching, similar to Ask Jeeves. Motor de busqueda que busca tu consulta en los buscadores mas populares en espanol.
Great service that lets you search through the audio files of popular US radio programs. Enter a subject, and you'll be shown matching programs and even be taken to the exact points in the shows where the subject is being discussed.
Directory of search engines, written in French.
Tell Somewherenear what type of business you want, such as a pub or cinema, then tell it your location, and the search engine will return matching businesses. If you have a WAP phone, there's a special version that's perfect for locating businesses while on the move. Locations are limited to Great Britain, for the moment.
R U Sure
Software-based comparison shopping search utility that has garnered good reviews.
Enter a question, and within a few minutes, a human being will respond with an answer via this web site.
Directory of free "web apps" or programs that run within your browser, such as calculators or games. Also lists Application Service Providers (ASPs) that produce fee-based web apps.
Finance search engine that indexes over 1.4 million financial Web pages. In addition to spidered results, there is a human-compiled directory.
Search or browse for more search engines and sites about search engines.
LookSmart Ends Pricing Tests
LookSmart did a last round of price testing for its express submission service last week, including raising the price from $199 to $299. However, anyone who paid the higher amount was in reality only supposed to be charged the "normal" price, the company says. LookSmart said there would be no more price tests after last Friday, and final submission pricing should be set this month. Currently, the express service is back to the regular $199 fee, but the free submit option still bars non-commercial sites from using it. I don't think this has been made a formal change, but I'll let you know in the next newsletter.
Free Listings Gone At LookSmart
The Search Engine Update, June 2, 2000
Explains how LookSmart has been experimenting with different submission systems.
Fact Search Engines Available
Rather than crawling the web or depending on humans, xrefer is powered by encyclopedias, dictionaries, books of quotations and other reference works. It's designed to help you locate facts backed by trusted publishers such as Penguin and Oxford University Press.
Like xrefer, Fact City's goal is to deliver authoritative facts, rather than links to web pages. Unlike xrefer, the company's model is to provide services to portals, rather than run its own search site. Current live partners are sports sites FOXSports.com and ESPN.com. Use the "Try It" link from the FactCity home page to try the service in action at these sites.
Vortal Services Now Offered
In April, I wrote about how a number of new players were entering the vertical portal market, offering the ability for site owners to create specialty search engines that encompass sites related to a particular topic. Now two of those players have released their products, which were previously pending. The players, then a link to my earlier article, for those who may have missed it:
Sandy Bay Networks
"BetterGetter" has been renamed the Results Engine. You can have a specialty search engine of up to 20,000 pages for free, while Sandy Bay plans to recoup its costs by selling advertising within the results that it delivers. The tools for webmasters to control and manage listings look impressive. There's currently no signup form for the service. If you are interested, use the small "Contact Us" link at the bottom right-hand corner of the home page to send email to sales or to call the company.
Searchbutton Community Search
Searchbutton's vertical portal product is now available to customers. Get in contact using this page. Unlike Sandy Bay, you pay for the service, with pricing beginning at $100 per month. However, there are