New AllTheWeb Goes Live

FAST Search's AllTheWeb site got both a new look and new functionality in July, improvements that have turned the site into a top-notch resource for searchers.

AllTheWeb has always been an important site, because it has consistently offered one of the largest collections of web pages available. Consequently, it was a great resource when you needed to search for information on unusual or obscure topics.

The site's weakness was that it lacked many features that other search engines offered, such as automatically clustering listings, so that one web site didn't dominate the results. It also had generally poor spam filtering and fairly average relevancy for more popular queries. As a result, it wasn't a resource I'd typically recommend as a first stop for average web searchers.

All that's changed now. There's every reason to consider AllTheWeb as one of your top choices when searching. Its new features and relevancy improvements have made the site far more appealing.

One of the most notable things about the new AllTheWeb is what it calls "universal search," where the search engine automatically brings back information from the different collections that it maintains. For instance, in addition to a web page catalog, AllTheWeb also has databases of pictures, video clips, MP3 files and FTP files from across the web. When you do a search, results from several of these different sources may be presented in response.

For example, take a search for "britney spears." By default, you are shown matching pages from across the web, leading off with Britney's own official web site. However, on the right-hand side of the screen is a picture of Britney, along with links to bring up more pictures or video clips that seem to be about her.

Generally, the suggestion of pictures or video clips in what AllTheWeb calls its "side bar results" is most likely what you'll see. However, FTP or MP3 suggestions might also appear. Of course, you can also specify exactly which database you'd like to search against by using the links that appear under the search box, on the AllTheWeb home page.

AllTheWeb has also added new search tips that may appear to the right of search results, in a "Search tips" box. Look back at the "britney spears" search, and you'll see that the tip suggests using quotation marks to perform a phrase search. The tips do more than educate users: you can select the link in the box to run the suggested search.

Clustering, which AllTheWeb calls "site collapsing," is another welcome new feature. It wasn't uncommon to do a search at AllTheWeb and find that all the top results came from the same web site. This problem has now been greatly reduced, and you shouldn't see more than two pages in the top results from any particular site.

If you do want to see more pages from a particular site listed in the results, simply select the "more hits from" link that appears below a listing. You can also override site collapsing for all your searches by using the options on AllTheWeb's customize page.
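To make the idea of site collapsing concrete, here is a minimal sketch in Python. It is not Fast's implementation, which hasn't been published. The function name `collapse_by_site`, the `max_per_site` parameter and the use of the URL's host name as the "site" are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def collapse_by_site(ranked_urls, max_per_site=2):
    """Keep at most max_per_site results per host, preserving rank order.

    Returns (shown, hidden): the collapsed result list, plus the pages
    held back for each host (what a "more hits from" link would reveal).
    The limit of two per site follows the article; everything else here
    is a hypothetical illustration.
    """
    shown = []
    hidden = {}
    counts = {}
    for url in ranked_urls:
        host = urlparse(url).netloc.lower()
        counts[host] = counts.get(host, 0) + 1
        if counts[host] <= max_per_site:
            shown.append(url)
        else:
            hidden.setdefault(host, []).append(url)
    return shown, hidden


results = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
]
shown, hidden = collapse_by_site(results)
```

Here the third page from a.example is held back rather than discarded, which is roughly what the "more hits from" link implies. Turning collapsing off on the customize page would correspond to skipping this step entirely.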

That new customization page also offers a variety of other features. You can control the porn filter, disable search tips or side bar results, stop term highlighting and more. In addition, there are a variety of new search commands that let you search within URL text or link text. These are summarized on the "Basic Help" page, and all the help files are an easy read and provide a well-done summary of how to use the service.

AllTheWeb also used to have noticeable problems with spam, but you should now find that this has been greatly reduced, due to new filtering that the service is using.

In particular, AllTheWeb is making use of several new methods to reduce the spam problem. Most important, Fast says, is watching for unusual linkages, sites that appear to be linking together for purposes of making themselves more popular.
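Fast hasn't said how it spots these linkages, so the following is only a rough sketch of one way such groups could be found: look for sets of sites that all link back and forth to one another. The function name `mutual_link_groups`, the `min_size` cutoff and the input format are hypothetical, and a real system would need far more signals than this to avoid flagging innocent sites.

```python
def mutual_link_groups(links, min_size=3):
    """Find groups of sites connected by reciprocal links.

    links maps each site to the set of sites it links to. Two sites are
    treated as connected only when each links to the other. Groups with
    fewer than min_size members are ignored. Both rules are assumptions
    for this sketch, not Fast's published method.
    """
    # Build an undirected graph containing only the reciprocal links.
    mutual = {site: set() for site in links}
    for site, targets in links.items():
        for target in targets:
            if target != site and site in links.get(target, ()):
                mutual[site].add(target)

    seen = set()
    groups = []
    for site in sorted(mutual):
        if site in seen or not mutual[site]:
            continue
        # Walk outward from this site to collect its whole group.
        stack = [site]
        group = set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(mutual[current] - group)
        seen |= group
        if len(group) >= min_size:
            groups.append(sorted(group))
    return groups


links = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}
groups = mutual_link_groups(links)
```

In this toy example, sites a, b and c all link to each other and come back as one group, while d, which links to a without being linked in return, is left alone.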

"We discover all the rings, and we can effectively exclude them," said Knut Risvik, Fast's director of engineering.

AllTheWeb is also examining the frequency of terms on pages and removing those that seem excessively abnormal.

"We analyze the page and look at distribution of words," Risvik said. "If the distribution is significantly different from any distribution that should be normal for a language, we will remove them."

That can sound scary -- what if you accidentally create an abnormal page? This isn't likely to happen, Fast says. Fast is checking to see if terms appear excessively in different locations of a document and in high frequency, which can be indicative of those creating "doorway" style pages that they hope will be highly targeted toward a particular term.

In other words, let's say that you wanted to be found for "movies," so you place that word in your title tag ten times, within an H1 header, within link text, within ALT text and use it repeatedly throughout your body copy. That would seem abnormal when compared to the more common collection of documents in the same language about movies, where the word might appear once or twice in the title tag and within the body tag, but not in an extremely high proportion.
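The "movies" example can be sketched in a few lines of Python. Since Fast won't say where it draws the line, the 20 percent cutoff below is invented purely for illustration, as are the function names `term_share` and `stuffed_fields` and the idea of checking each page field separately.

```python
import re
from collections import Counter


def term_share(text, term):
    """Return the fraction of words in text that are the given term."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[term.lower()] / len(words)


def stuffed_fields(fields, term, max_share=0.2):
    """List the page fields where term makes up more than max_share of the words.

    fields maps a field name (title, body, ALT text and so on) to its text.
    max_share is a made-up threshold; Fast has not published its own.
    """
    return [
        name for name, text in fields.items()
        if term_share(text, term) > max_share
    ]


normal_page = {
    "title": "Classic movies of the 1940s reviewed",
    "body": "We review classic films and a few recent movies "
            "from independent studios every single week",
}
doorway_page = {
    "title": "movies movies movies movies movies",
    "body": "movies cheap movies free movies best movies online movies",
}
normal_flags = stuffed_fields(normal_page, "movies")
doorway_flags = stuffed_fields(doorway_page, "movies")
```

The ordinary page passes, while the doorway-style page is flagged in both its title and its body. Risvik describes comparing a page against what is normal for the whole language, which is a much richer test than a single fixed cutoff like this one.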

Fast won't release any hard and fast rules about where and how much is too much. Naturally, that would defeat the spam analysis they are doing. The main advice to take away is not to over-engineer your pages. Make use of the terms you want to be found for in your body copy and in your title tags, but don't go overboard.

AllTheWeb is also watching for "gibberish" pages, those where the text may make no sense to a human reader, despite having a sentence structure intended to make it appear normal and relevant to crawlers.

In another fairly recent change, AllTheWeb is now generating descriptions in one of three ways. First, it will use your meta description tag if you provide one and if the tag seems to reflect the content of your web page. Next, in lieu of a meta description tag, it will use your page's description from the Open Directory, if it is listed there. Finally, it will default to using the first 215 characters or so that appear in your visible HTML body copy. By the way, meta keyword tags are still not indexed by Fast.
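The three-step fallback reads naturally as code. The sketch below is only an illustration: `choose_description` and `shares_a_word` are hypothetical names, the word-overlap check is a crude stand-in for however Fast decides that a tag "seems to reflect" the page, and it assumes that a meta description failing that check falls through to the Open Directory text.

```python
def shares_a_word(description, body_text):
    """Crude stand-in relevance check: does any longer word in the
    description also appear in the body copy? (An assumption, not
    Fast's actual test.)"""
    body_words = set(body_text.lower().split())
    return any(
        word in body_words
        for word in description.lower().split()
        if len(word) > 3
    )


def choose_description(meta_description, directory_description, body_text,
                       reflects_content=shares_a_word, limit=215):
    """Pick a listing description using the order given in the article."""
    # 1. The meta description tag, if present and it seems to match the page.
    if meta_description and reflects_content(meta_description, body_text):
        return meta_description.strip()
    # 2. Otherwise the Open Directory description, if the page is listed.
    if directory_description:
        return directory_description.strip()
    # 3. Otherwise roughly the first 215 characters of visible body copy.
    return " ".join(body_text.split())[:limit]


from_meta = choose_description(
    "Reviews of classic movies", "Directory text",
    "Classic movies reviewed weekly")
from_directory = choose_description(
    "", "Directory text", "Classic movies reviewed weekly")
from_body = choose_description(None, None, "word " * 100)
```

The practical upshot for site owners is unchanged: write a meta description that honestly matches the page, and it is the first thing AllTheWeb will try to use.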

Along with better spam detection, AllTheWeb also now has relevancy improvements that Fast collectively refers to as "FirstPage." At the core of this is better link analysis. The level of a page also has an impact: pages in "upper" levels are more likely to be ranked better, Fast says.

The service has also announced that it will refresh its index every nine to 12 days. Most search engines update on a roughly monthly basis. The last search engine to make such an explicit freshness promise was AltaVista, back in June 1999. It almost immediately failed to keep that promise. It will be interesting to see if AllTheWeb does better.

Index size is 625 million pages, where it has been since around March. That's just below the 700 million or so pages that Google has in its full-text index. FAST also has 70 million listings in its multimedia picture and video catalogs, 2 million MP3 listings and 150 million FTP listings.

All these changes have been made to turn AllTheWeb into a more attractive destination for users. This is the complete opposite of what happened when the site launched back in April 1999. Back then, it was meant primarily to demonstrate Fast's technology to prospective portal partners, so special features for searchers weren't added.

Does the change now mean that Fast wants to compete against the portals that it also wants to power? It's more an attempt to coexist, in the way that Google operates alongside its partners such as Yahoo and Netscape, and to turn AllTheWeb's successes into benefits for FAST's customers.

"The focus of our site is not to abandon all the portal customers and become the number one search destination ourselves. Our focus is toward monetizing AllTheWeb and using that to build monetization technology for potential customers," said Stephen Baker, Fast's director of business development and marketing.

The key is continuing to develop AllTheWeb's ability to dynamically and intelligently pull information from different databases. Fast believes this can help e-commerce sites and others better monetize their services.

Of course, Fast will continue providing web search services to those who want them. It expanded its agreement with Terra Lycos in July, so that Fast information can be used by all Terra Lycos sites worldwide. New additions at Lycos using Fast data include Lycos properties for Argentina, Brazil, Mexico and Venezuela. Lycos now also has the ability to sell advertising to appear on the AllTheWeb site.


It's Fresher at FAST
SearchDay, July 25, 2001

Guest writer and search expert Gary Price puts Fast's claim to be the freshest search engine to the test. It easily beat AltaVista but fared less well against Google.