This standard covers the syntax or "command" that search engine users can use either to restrict a search to pages from a particular web site or to exclude pages from that site.
Some search engines currently offer the ability to perform a site search, which is an extremely useful feature for search engine users. For instance, this search on Go (Infoseek):
tony blair +site:gov.uk
would find web pages about UK Prime Minister Tony Blair that come only from official UK government web sites. Without such a site search command, there would be no way to narrow a search to the government sites in this way.
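To make the shape of such a query concrete, here is a minimal sketch of how a query string could be split into ordinary search terms and site filters. It is not how any particular search engine parses queries. The names `parse_query` and `ParsedQuery` are hypothetical, and treating a `site:` token with no sign the same as `+site:` is an assumption.

```python
import re
from typing import List, NamedTuple


class ParsedQuery(NamedTuple):
    terms: List[str]          # ordinary words and quoted phrases
    include_sites: List[str]  # from site: or +site:
    exclude_sites: List[str]  # from -site:


# A token is either a quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def parse_query(query: str) -> ParsedQuery:
    terms, include, exclude = [], [], []
    for token in TOKEN.findall(query):
        # Peel off an optional leading + or - sign.
        sign = token[0] if token[0] in "+-" else ""
        body = token[len(sign):]
        if body.lower().startswith("site:") and len(body) > len("site:"):
            host = body[len("site:"):].lower()
            # Assumption: an unsigned site: restricts, just like +site:.
            (exclude if sign == "-" else include).append(host)
        else:
            terms.append(token)
    return ParsedQuery(terms, include, exclude)


print(parse_query("tony blair +site:gov.uk"))
```

Under these assumptions, the example query yields the terms `tony` and `blair` plus a single site restriction, `gov.uk`.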
A site search command also benefits webmasters. They can immediately discover exactly which of their pages have been indexed by search engines, which could eliminate needless resubmission attempts and feedback messages to search engines about the status of their listings.
Search engine reviewers can also use a site search command to compare the coverage of the web that crawler-based services provide, assuming that the command is also coupled with accurate hit count reporting.
It was decided in November 1999 that the syntax of site: would be used as the standard. Search engines that support this standard will either replace their old site search command or allow both the old and new commands to work.
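As an illustration of the "allow both the old and new commands to work" option, the sketch below rewrites an older command prefix to the standard `site:` form before the query is processed. It is an assumption about how a service might do this, not a description of any engine's implementation. The function name `normalize_site_command` and the `LEGACY_PREFIXES` tuple are hypothetical, and `domain:` is used as the example only because the status table lists it as Inktomi's current command.

```python
# Assumption: older command prefixes that should behave like site:.
LEGACY_PREFIXES = ("domain:",)


def normalize_site_command(token: str) -> str:
    """Rewrite a legacy site-search token to the standard site: form."""
    # Keep an optional leading + or - so include/exclude meaning is preserved.
    sign = token[0] if token[:1] in ("+", "-") else ""
    body = token[len(sign):]
    lowered = body.lower()
    for prefix in LEGACY_PREFIXES:
        if lowered.startswith(prefix):
            return f"{sign}site:{body[len(prefix):]}"
    # Already standard, or not a site command at all: leave untouched.
    return token


print(normalize_site_command("+domain:gov.uk"))
```

Because tokens that already use `site:` pass through unchanged, a service taking this approach could accept both forms during a transition.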
Here's the current status among major crawler-based search engines, or those that operate a crawler for some of their results:
|Search Engine||Current Command||Status |
|Excite||none||Plans to add this ability and support standard; not yet implemented|
|Fireball||site:||Standard supported as of Sept. 1999|
|Go||site:||Now supported (this has always been the command)|
|site:||Standard supported as of June 2000|
|Inktomi||domain:||Plans to support standard; not yet implemented|
|Lycos||via advanced form||No plans to support standard|
|Northern Light||via advanced form||Plans to support standard; not yet implemented|
This standard doesn't address consistency of results. For instance, assume that you ran this identical search at AltaVista and Go:
"mars landings" -site:nasa.gov
At AltaVista, pages from any nasa.gov site would be excluded (i.e., *.nasa.gov). But at Go, only pages from the main nasa.gov site would be excluded. Pages from www.ksc.nasa.gov and www.jpl.nasa.gov would still appear.
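The difference between the two behaviors can be sketched as two host-matching rules. This is only an illustration of the distinction described above, not either engine's actual logic. Both function names are hypothetical, `example.com` is a made-up host, and treating `www.nasa.gov` as part of the "main" site is an assumption.

```python
def matches_any_subdomain(host: str, site: str) -> bool:
    """Wildcard-style rule: site:nasa.gov covers nasa.gov and *.nasa.gov."""
    host, site = host.lower().rstrip("."), site.lower().rstrip(".")
    return host == site or host.endswith("." + site)


def matches_main_site_only(host: str, site: str) -> bool:
    """Narrow rule: site:nasa.gov covers only the main site.

    Assumption: the main site means the bare name or its www. form.
    """
    host, site = host.lower().rstrip("."), site.lower().rstrip(".")
    return host in (site, "www." + site)


hosts = ["www.nasa.gov", "www.ksc.nasa.gov", "www.jpl.nasa.gov", "example.com"]

# Hosts that survive -site:nasa.gov under each rule.
print([h for h in hosts if not matches_any_subdomain(h, "nasa.gov")])
print([h for h in hosts if not matches_main_site_only(h, "nasa.gov")])
```

Under the wildcard-style rule only `example.com` survives the exclusion, while under the narrow rule the `ksc` and `jpl` hosts remain in the results as well.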
Here's a copy of the original proposal for this standard.