
SpiderSpotting Chart: Robot Agent and Host Names


The Search Engine Watch SpiderSpotting Chart

On this chart, * is a wildcard, so *.infoseek.com
matches any host name ending in .infoseek.com, such as anything.infoseek.com.
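
The wildcard convention can be checked mechanically against host names from a server log. The following is a minimal sketch, assuming the chart's patterns use only * as a special character; the helper name host_matches is hypothetical, and Python's standard fnmatch module is used here only as a convenient stand-in for the chart's notation.

```python
from fnmatch import fnmatchcase

def host_matches(host, pattern):
    """Return True if host fits a chart pattern in which * is a wildcard."""
    # Host names are case-insensitive, so compare lowercase forms.
    return fnmatchcase(host.lower(), pattern.lower())

print(host_matches("anything.infoseek.com", "*.infoseek.com"))              # True
print(host_matches("scooter3.av.pa-x.dec.com", "scooter*.av.pa-x.dec.com"))  # True
print(host_matches("www.example.com", "*.infoseek.com"))                    # False
```

Note that * in fnmatch also matches dots, so a pattern such as *.infoseek.com would accept hosts with more than one label before .infoseek.com.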

Updated: October 14, 1998
Please see bottom of the page for other resources with updated information.

| Search Engine | Agent Names | Host Names |
|---|---|---|
| AltaVista (normal spider) | Scooter/2.0 G.R.A.B. X2.0; Scooter/1.0 [email protected] | scooter.pa-x.dec.com; scooter*.av.pa-x.dec.com, such as: scooter3.av.pa-x.dec.com |
| AltaVista (instant spider) | Scooter/1.0 | add-url.altavista.digital.com; ww2.altavista.digital.com |
| Euroseek | Arachnoidea ([email protected]) | *.euroseek.net, such as: infra.euroseek.net |
| Excite (mega spider) | ArchitextSpider | crawl*.atext.com, such as: crawl2.atext.com |
| Excite (fresh spider) | ArchitextSpider | crimpshrine.atext.com |
| Fireball (German search engine) | KIT-Fireball/2.0 | heavymetal.fireball.de |
| Google (experimental search engine) | BackRub/2.1 [email protected] http://google.stanford.edu/ | *.stanford.edu, such as: hake.stanford.edu |
| Inktomi (powers HotBot, others) | Slurp/2.0 ([email protected]; http://www.inktomi.com/slurp.html) | *.inktomi.com, such as: j2001.inktomi.com or j10.inktomi.com |
| Infoseek (normal spider) | InfoSeek Sidewinder/0.9 | *.infoseek.com, such as: wilbur-bbn.infoseek.com, or IP number, such as: 204.162.98.90 |
| Infoseek (instant spider) | Mozilla/3.01 (Win95; I) | as above |
| Lycos (regular spider) | Lycos_Spider_(T-Rex) | lycosidae.lycos.com or *.pgh.lycos.com, such as: spider3.srv.pgh.lycos.com |
| Lycos (Add URL spider) | Lycos_Spider_(T-Rex) | *.sjc.lycos.com, such as: sjc-fe4-1.sjc.lycos.com |
| Northern Light | Gulliver/1.2 | taz.northernlight.com |
| WebCrawler | Served by Excite spiders | Served by Excite spiders |

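
A chart like this can be turned into a small lookup that labels a visitor from its agent name and host name. The sketch below is illustrative only: it copies a handful of rows from the chart, and the names CHART and identify_spider are hypothetical. It assumes the agent string in a log begins with the agent name listed on the chart, and that both the agent and the host must match before a visitor is labeled.

```python
from fnmatch import fnmatchcase

# A few rows copied from the chart: (engine, agent-name prefix, host patterns).
CHART = [
    ("AltaVista (normal spider)", "Scooter/",
     ["scooter.pa-x.dec.com", "scooter*.av.pa-x.dec.com"]),
    ("AltaVista (instant spider)", "Scooter/1.0",
     ["add-url.altavista.digital.com", "ww2.altavista.digital.com"]),
    ("Excite (mega spider)", "ArchitextSpider", ["crawl*.atext.com"]),
    ("Excite (fresh spider)", "ArchitextSpider", ["crimpshrine.atext.com"]),
    ("Inktomi", "Slurp/2.0", ["*.inktomi.com"]),
    ("Northern Light", "Gulliver/1.2", ["taz.northernlight.com"]),
]

def identify_spider(agent, host):
    """Return the chart label when both agent and host match, else None."""
    host = host.lower()  # host names are case-insensitive
    for engine, agent_prefix, host_patterns in CHART:
        agent_ok = agent.startswith(agent_prefix)
        host_ok = any(fnmatchcase(host, p) for p in host_patterns)
        if agent_ok and host_ok:
            return engine
    return None

print(identify_spider("ArchitextSpider", "crawl2.atext.com"))
print(identify_spider("Scooter/1.0", "ww2.altavista.digital.com"))
```

Requiring both fields to match is a design choice for this sketch: agent names alone can be sent by any client, and several rows on the chart share an agent name and differ only by host.
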
More Resources

SpiderSpotting
A page within Search Engine Watch that
explains how to track down robot visitors.

SpiderHunter.com
http://www.spiderhunter.com/

A huge collection of resources devoted to tracking spiders. You
can pretend to be a spider, view a collection of spiders by name
and IP addresses, read tutorial information and more.

Search Engine Spider IP Addresses
Comprehensive list of agent names and IP addresses.

The Web Robots Database
http://info.webcrawler.com/mak/projects/robots/active.html
Some of the entries are outdated,
but it remains a useful resource.

BotWatch Configuration File
http://www.tardis.ed.ac.uk/~sxw/robots/index.html
Lists a wide range of robots,
with agent and host names.

