It's Fresher at FAST

On Monday, FAST announced that its search engine, with an index of more than 625 million web pages, is completely refreshed every nine to twelve days. The search engine also catalogs more than 70 million integrated multimedia files.

I decided to run a small test to check the latest crawl dates for several web pages in FAST, AltaVista and Google. This test took place on 7/23/01 at approximately 2 p.m. EDT.

I conducted the test by identifying pages where the date is included in the title of the page. After a page is crawled, that date is displayed in the title shown in a set of search results, which reveals when the engine last fetched the page. I learned this technique from search engine expert Greg Notess.

Using a sample of five pages with date stamps in their titles, it appeared that FAST had last crawled these pages on dates ranging from May 28th through July 11th, or 12 to 56 days before the test. Not quite the nine to twelve days claimed, but still impressive compared to other search engine crawler lag times.

It looks as if Google refreshed its index on July 23, perhaps in response to the announcement by FAST. Pages included in the Google refresh had dates as late as 7/6/01. Prior to this update, the Google database was about 47 days old, having last been updated on 6/5/01.

The test illustrates that using more than one general-purpose web engine is essential for the most current and comprehensive results.

Here are the full test results:

                       Crawl Date
Source                 AllTheWeb   AltaVista   Google
Tahoe Daily Tribune    5/28/01     6/8/01      7/6/01
The Free Press         7/11/01     n.a.        7/6/01
Newsweek via MSNBC     7/10/01     4/12/01     7/6/01
SPEED Magazine         6/27/01     4/6/01      7/6/01
PDABuzz.Com            6/22/01     n.a.        7/6/01

n.a. = Exact page with date stamp not available.
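
To make the lag times easier to compare, the crawl dates above can be converted into ages in days as of the test date. The following is only a rough sketch: the dates are copied from the table, while the names `TEST_DATE`, `RESULTS`, `age_in_days` and `age_range` are made up for this illustration and are not part of any search engine's tools.

```python
from datetime import date

# Day the test was run (7/23/01, per the article).
TEST_DATE = date(2001, 7, 23)

# Crawl dates copied from the results table; None stands for "n.a.".
# The structure and names here are hypothetical, chosen for illustration.
RESULTS = {
    "Tahoe Daily Tribune": {
        "AllTheWeb": date(2001, 5, 28),
        "AltaVista": date(2001, 6, 8),
        "Google": date(2001, 7, 6),
    },
    "The Free Press": {
        "AllTheWeb": date(2001, 7, 11),
        "AltaVista": None,
        "Google": date(2001, 7, 6),
    },
    "Newsweek via MSNBC": {
        "AllTheWeb": date(2001, 7, 10),
        "AltaVista": date(2001, 4, 12),
        "Google": date(2001, 7, 6),
    },
    "SPEED Magazine": {
        "AllTheWeb": date(2001, 6, 27),
        "AltaVista": date(2001, 4, 6),
        "Google": date(2001, 7, 6),
    },
    "PDABuzz.Com": {
        "AllTheWeb": date(2001, 6, 22),
        "AltaVista": None,
        "Google": date(2001, 7, 6),
    },
}


def age_in_days(crawl_date, as_of=TEST_DATE):
    """Return how many days old a crawled copy is, or None for n.a."""
    if crawl_date is None:
        return None
    return (as_of - crawl_date).days


def age_range(engine):
    """Return (freshest, stalest) age in days for one engine, skipping n.a."""
    ages = [
        age_in_days(row[engine])
        for row in RESULTS.values()
        if row[engine] is not None
    ]
    return min(ages), max(ages)


if __name__ == "__main__":
    for engine in ("AllTheWeb", "AltaVista", "Google"):
        freshest, stalest = age_range(engine)
        print(f"{engine}: {freshest} to {stalest} days old")
```

With only five sample pages, these ranges are an indication of freshness rather than a measurement of any engine's full index.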


FAST Launches World's Freshest Internet Search Engine
The official press release from FAST describing the new indexing and recrawl program.

Google Image Search Grows

On Tuesday, Google released a new, 66 percent larger index for Google Image Search. Google Image Search now enables users to search and browse 250 million digital images, 100 million more than the first index, which was released just a month ago.
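
The growth figure can be checked with a line or two of arithmetic. This is a minimal sketch using only the two numbers quoted above; the function name `percent_increase` and the variable names are invented for the example.

```python
def percent_increase(old, new):
    """Return the percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100


# Figures quoted in the announcement.
new_index = 250_000_000   # images in the new index
added = 100_000_000       # images added since the first index

# The size of the first index follows from the two figures above.
first_index = new_index - added
growth = percent_increase(first_index, new_index)

print(f"{first_index:,} -> {new_index:,} images: {growth:.1f}% larger")
```

The first index therefore held 150 million images, and the increase works out to roughly two-thirds, in line with the 66 percent figure.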

Google Image Search

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was About.com's Web Search Guide.