Site Search Standard

This standard covers the syntax, or "command," that search engine users can use either to restrict a search to pages from a particular web site or to exclude pages from that site.

Overview

Some search engines currently offer the ability to perform a site search, which is an extremely useful feature for search engine users. For instance, this search on Go (Infoseek):

tony blair +site:gov.uk

would find web pages about UK Prime Minister Tony Blair that come only from official UK government web sites. Without a site search command, there would be no way to narrow a search to government sites in this way.
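As a rough illustration of how a query like the one above might be taken apart, here is a minimal Python sketch. The function name parse_query is hypothetical, and the handling of + and - is an assumption based on the examples in this document, not a description of how any particular search engine parses queries. Quoted phrases are not treated specially here.

```python
def parse_query(query):
    """Split a query into plain terms, required sites, and excluded sites.

    Hypothetical helper: assumes a leading '-' on a site: token means
    "exclude" and that a leading '+' (or no sign) means "restrict to".
    """
    terms, include, exclude = [], [], []
    for token in query.split():
        # Peel off an optional leading + or - marker.
        sign = token[0] if token[0] in "+-" else ""
        body = token[len(sign):]
        if body.lower().startswith("site:") and len(body) > len("site:"):
            domain = body[len("site:"):].lower()
            (exclude if sign == "-" else include).append(domain)
        else:
            # Anything that is not a site: command is kept as a search term.
            terms.append(token)
    return terms, include, exclude


terms, include, exclude = parse_query("tony blair +site:gov.uk")
print(terms, include, exclude)
```

With the example query, the search terms are tony and blair, and gov.uk is the site restriction.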

A site search command also benefits webmasters. They can immediately discover exactly which of their pages have been indexed by search engines, which could eliminate needless resubmission attempts and feedback messages to search engines about the status of their listings.

Search engine reviewers can also use a site search command to compare the coverage of the web that crawler-based services provide, assuming that the command is also coupled with accurate hit count reporting.

Current Status

It was decided in November 1999 that the syntax of site: would be used as the standard. Search engines that support this standard will either replace their old site search command or allow both the old and new commands to work.
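One way a search engine could let both old and new commands work is to rewrite the older prefix into site: before the query is processed. The sketch below is only an illustration of that idea. The function name normalize_site_command is hypothetical, and the choice of host: and domain: as the older prefixes is an assumption based on the commands this page lists for AltaVista and Inktomi.

```python
# Older command prefixes to accept alongside site: (assumed for illustration).
LEGACY_PREFIXES = ("host:", "domain:")


def normalize_site_command(token):
    """Rewrite a legacy site-restriction token to the standard site: form."""
    # Keep an optional leading + or - marker in place.
    sign = token[:1] if token[:1] in ("+", "-") else ""
    body = token[len(sign):]
    for prefix in LEGACY_PREFIXES:
        if body.lower().startswith(prefix):
            return sign + "site:" + body[len(prefix):]
    # Tokens that already use site:, or are ordinary terms, pass through.
    return token


print(normalize_site_command("-host:nasa.gov"))
print(normalize_site_command("site:gov.uk"))
```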

Here's the current status among major crawler-based search engines, or those that operate a crawler for some of their results:

Search Engine   Current Command    Status on Standard
AltaVista       host:              Unknown
Excite          none               Plans to add this ability and support standard; not yet implemented
Fireball        site:              Standard supported as of Sept. 99
Go              site:              Now supported (this has always been the command)
Google          site:              Standard supported as of June 00
Inktomi         domain:            Plans to support standard; not yet implemented
Lycos           via advanced form  No plans to support standard
Northern Light  via advanced form  Plans to support standard; not yet implemented

Other Notes

This standard doesn't address consistency of results. For instance, assume that you ran this identical search at AltaVista and Go:

"mars landings" -site:nasa.gov

At AltaVista, pages from any nasa.gov site would be excluded (i.e., *.nasa.gov). But at Go (Infoseek), only pages from the main nasa.gov site would be excluded. Pages from www.ksc.nasa.gov and www.jpl.nasa.gov would still appear.
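The difference between these two behaviors can be sketched as two host-matching rules. This is a minimal Python illustration, not either engine's actual implementation. Both function names are hypothetical, and treating the "main" site as the bare domain or its www. host is an assumption.

```python
def matches_any_subdomain(host, site):
    """Wildcard-style rule: the site itself or any host under it (*.site)."""
    host, site = host.lower(), site.lower()
    return host == site or host.endswith("." + site)


def matches_main_site_only(host, site):
    """Narrow rule: only the bare domain or its www. host (assumed)."""
    host, site = host.lower(), site.lower()
    return host in (site, "www." + site)


for host in ("www.nasa.gov", "www.ksc.nasa.gov", "www.jpl.nasa.gov"):
    print(host,
          matches_any_subdomain(host, "nasa.gov"),
          matches_main_site_only(host, "nasa.gov"))
```

Under the first rule, -site:nasa.gov removes all three hosts from the results. Under the second, only www.nasa.gov is removed, and the ksc and jpl pages remain.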

Here's a copy of the original proposal for this standard.
