SpiderSpotting Chart: Robot Agent and Host Names

The Search Engine Watch SpiderSpotting Chart

On this chart, * is a wildcard that stands for any text, so *.infoseek.com
matches any host name ending in .infoseek.com, such as anything.infoseek.com.
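
The wildcard convention described above can be checked mechanically. The following is a minimal sketch in Python, assuming the standard library's fnmatch module is an acceptable stand-in for the chart's * notation. The helper name host_matches is hypothetical and not part of the chart. Note that fnmatch also treats ? and [ ] as special characters, which should not matter here because they do not occur in host names.

```python
from fnmatch import fnmatchcase


def host_matches(pattern: str, host: str) -> bool:
    """Return True if host fits a chart-style pattern.

    Assumption: * may stand for any run of characters, dots included,
    so *.pgh.lycos.com also covers spider3.srv.pgh.lycos.com.
    """
    # Host names are case-insensitive, so compare both sides in lower case.
    return fnmatchcase(host.lower(), pattern.lower())


if __name__ == "__main__":
    print(host_matches("*.infoseek.com", "anything.infoseek.com"))
    print(host_matches("scooter*.av.pa-x.dec.com", "scooter3.av.pa-x.dec.com"))
    print(host_matches("*.infoseek.com", "infoseek.com.example.org"))
```

The first two calls should report a match and the third should not, because the pattern is anchored to the whole host name rather than to any part of it.
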

Updated: October 14, 1998
Please see the bottom of the page for other resources with updated information.

AltaVista (normal spider)
  Agent names: Scooter/2.0 G.R.A.B. X2.0
               Scooter/1.0 scooter@pa.dec.com
  Host names:  scooter.pa-x.dec.com
               scooter*.av.pa-x.dec.com
               such as: scooter3.av.pa-x.dec.com

AltaVista (instant spider)
  Agent names: Scooter/1.0
  Host names:  add-url.altavista.digital.com
               ww2.altavista.digital.com

Euroseek
  Agent names: Arachnoidea (arachnoidea@euroseek.com)
  Host names:  *.euroseek.net
               such as: infra.euroseek.net

Excite (mega spider)
  Agent names: ArchitextSpider
  Host names:  crawl*.atext.com
               such as: crawl2.atext.com

Excite (fresh spider)
  Agent names: ArchitextSpider
  Host names:  crimpshrine.atext.com

Fireball (German search engine)
  Agent names: KIT-Fireball/2.0
  Host names:  heavymetal.fireball.de

Google (experimental search engine)
  Agent names: BackRub/2.1 backrub@google.stanford.edu http://google.stanford.edu/
  Host names:  *.stanford.edu
               such as: hake.stanford.edu

Inktomi (powers HotBot, others)
  Agent names: Slurp/2.0 (slurp@inktomi.com; http://www.inktomi.com/slurp.html)
  Host names:  *.inktomi.com
               such as: j2001.inktomi.com or j10.inktomi.com

Infoseek (normal spider)
  Agent names: InfoSeek Sidewinder/0.9
  Host names:  *.infoseek.com
               such as: wilbur-bbn.infoseek.com
               or IP number
               such as: 204.162.98.90

Infoseek (instant spider)
  Agent names: Mozilla/3.01 (Win95; I)
  Host names:  as above

Lycos (regular spider)
  Agent names: Lycos_Spider_(T-Rex)
  Host names:  lycosidae.lycos.com
               or *.pgh.lycos.com
               such as: spider3.srv.pgh.lycos.com

Lycos (Add URL spider)
  Agent names: Lycos_Spider_(T-Rex)
  Host names:  *.sjc.lycos.com
               such as: sjc-fe4-1.sjc.lycos.com

Northern Light
  Agent names: Gulliver/1.2
  Host names:  taz.northernlight.com

WebCrawler
  Agent names: Served by Excite spiders
  Host names:  Served by Excite spiders

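
Because some spiders share an agent name (both Excite spiders report ArchitextSpider) and the Infoseek instant spider reports a browser-style agent, the agent name and host name are best read together. The following is a rough sketch in Python of such a lookup over a few rows copied from the chart. The names CHART_ROWS and identify_spider are hypothetical, and matching the agent by prefix is an assumption rather than anything the chart specifies. The Infoseek IP number alternative is not handled.

```python
from fnmatch import fnmatchcase
from typing import Optional

# A few rows copied from the chart: (engine, agent names, host patterns).
CHART_ROWS = [
    ("AltaVista (normal spider)",
     ["Scooter/2.0 G.R.A.B. X2.0", "Scooter/1.0 scooter@pa.dec.com"],
     ["scooter.pa-x.dec.com", "scooter*.av.pa-x.dec.com"]),
    ("AltaVista (instant spider)",
     ["Scooter/1.0"],
     ["add-url.altavista.digital.com", "ww2.altavista.digital.com"]),
    ("Excite (mega spider)", ["ArchitextSpider"], ["crawl*.atext.com"]),
    ("Excite (fresh spider)", ["ArchitextSpider"], ["crimpshrine.atext.com"]),
    ("Infoseek (normal spider)", ["InfoSeek Sidewinder/0.9"], ["*.infoseek.com"]),
    ("Infoseek (instant spider)", ["Mozilla/3.01 (Win95; I)"], ["*.infoseek.com"]),
]


def identify_spider(agent: str, host: str) -> Optional[str]:
    """Return the chart entry whose agent and host both fit, else None."""
    host = host.lower()  # host names are case-insensitive
    for engine, agent_names, host_patterns in CHART_ROWS:
        # Assumption: a logged agent string begins with the charted name.
        if not any(agent.startswith(name) for name in agent_names):
            continue
        # The host must also fit one of the row's patterns (* is a wildcard).
        if any(fnmatchcase(host, pattern) for pattern in host_patterns):
            return engine
    return None


if __name__ == "__main__":
    print(identify_spider("ArchitextSpider", "crawl2.atext.com"))
    print(identify_spider("Mozilla/3.01 (Win95; I)", "wilbur-bbn.infoseek.com"))
    print(identify_spider("Mozilla/3.01 (Win95; I)", "dialup.example.net"))
```

Requiring both fields to fit keeps an ordinary visitor with the same browser-style agent from being counted as the Infoseek instant spider.
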
More Resources

SpiderSpotting
A page within Search Engine Watch that
explains how to track down robot visitors.

SpiderHunter.com
http://www.spiderhunter.com/

A huge collection of resources devoted to tracking spiders. You
can pretend to be a spider, view a collection of spiders by name
and IP addresses, read tutorial information and more.

Search Engine Spider IP Addresses
Comprehensive list of agent names and IP addresses.

The Web Robots Database
http://info.webcrawler.com/mak/projects/robots/active.html
Some of the entries are outdated,
but it remains a useful resource.

BotWatch Configuration File
http://www.tardis.ed.ac.uk/~sxw/robots/index.html
Lists a wide range of robots,
with agent and host names.