On Monday, FAST announced that its alltheweb.com search engine, with an index of more than 625 million web pages, is completely refreshed every nine to twelve days. The search engine also catalogs more than 70 million integrated multimedia files.
I decided to run a small test to check the latest dates for several web pages in FAST, AltaVista and Google. The test took place on 7/23/01 at approximately 2 p.m. EDT.
I conducted the test by identifying pages that include the date in the page title. Once a page has been crawled, that date appears in the title shown in a set of search results, revealing when the engine last fetched the page. I learned this technique from search engine expert Greg Notess (http://www.searchengineshowdown.com).
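The date-in-title technique can be sketched in a few lines of Python. This is only an illustration: the function names and the two date formats it recognizes ("July 11, 2001" and "7/11/01") are assumptions, and the sample titles in the usage note are made up rather than taken from the test.

```python
import re
from datetime import date, datetime

# Assumed title date formats; real pages may stamp their titles differently.
_PATTERNS = [
    (re.compile(r"\b([A-Z][a-z]+ \d{1,2}, \d{4})\b"), "%B %d, %Y"),  # July 11, 2001
    (re.compile(r"\b(\d{1,2}/\d{1,2}/\d{2})\b"), "%m/%d/%y"),        # 7/11/01
]


def date_in_title(title):
    """Return the date stamp found in a result title, or None if there is none."""
    for pattern, fmt in _PATTERNS:
        match = pattern.search(title)
        if match:
            try:
                return datetime.strptime(match.group(1), fmt).date()
            except ValueError:
                # Looked like a date but was not one (e.g. "Route 66, 2001").
                continue
    return None


def crawl_age_days(title, checked_on):
    """Days between the date stamped in the title and the day of the check."""
    stamped = date_in_title(title)
    if stamped is None:
        return None
    return (checked_on - stamped).days
```

For example, `crawl_age_days("Example Daily News - July 11, 2001", date(2001, 7, 23))` gives the number of days by which that engine's copy of the page trails the day of the check. A title with no recognizable date yields `None`, which corresponds to the "n.a." case in the results.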
In a sample of five pages with date stamps in their titles, FAST appeared to have last crawled the pages on dates ranging from May 28th through July 11th. That is not quite the nine to twelve days claimed, but it is still impressive compared with the crawler lag times of other search engines.
It looks as if Google refreshed its index on July 23, perhaps in response to the announcement by FAST. Pages included in the Google refresh had dates as late as 7/6/01. Prior to this update, the Google database had last been updated on 6/5/01, making it about 47 days old.
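The lag figures above are simple date subtraction. As a rough check, the following sketch uses only the dates stated in the text; the variable names and labels are illustrative.

```python
from datetime import date

TEST_DATE = date(2001, 7, 23)  # day the test was run

# Dates quoted in the text; the labels are illustrative.
observed = {
    "FAST, oldest page": date(2001, 5, 28),
    "FAST, newest page": date(2001, 7, 11),
    "Google, previous update": date(2001, 6, 5),
    "Google, newest page after refresh": date(2001, 7, 6),
}

# Age in days of each observation on the day of the test.
ages = {label: (TEST_DATE - seen).days for label, seen in observed.items()}

for label, days in ages.items():
    print(f"{label}: {days} days old")
```

On these dates the FAST sample spans 12 to 56 days, so only the freshest page falls inside the claimed nine-to-twelve-day window. Straight calendar subtraction puts the previous Google update at 48 days before the test, in line with the "about 47 days" quoted above.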
The test illustrates that using more than one general-purpose web engine is essential for the most current and comprehensive results.
Here are the full test results:
|Tahoe Daily Tribune|
|The Free Press|
|Newsweek via MSNBC|
|n.a. = Exact page with date stamp not available.|
FAST Launches World's Freshest Internet Search Engine
The official press release from FAST describing the new indexing and recrawl program at alltheweb.com.
Google Image Search Grows
On Tuesday, Google released a new, 66 percent larger index for Google Image Search. Google Image Search now enables users to search and browse 250 million digital images, 100 million more than the first index, which was released just a month ago.
Google Image Search
NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.