SpiderSpotting Chart: Robot Agent and Host Names

The Search Engine Watch SpiderSpotting Chart

On this chart, * is a wildcard, so *.infoseek.com
matches any host name ending in .infoseek.com.
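
The wildcard convention can be checked mechanically. The following is a minimal sketch, assuming Python's standard fnmatch module is an acceptable stand-in for the chart's * notation. The helper name host_matches is hypothetical and not part of the chart.

```python
from fnmatch import fnmatchcase


def host_matches(pattern: str, host: str) -> bool:
    """Return True if host fits a chart-style pattern, where * is a wildcard.

    Host names are case-insensitive, so both sides are lowercased first.
    Assumption: * may match any run of characters, including dots.
    """
    return fnmatchcase(host.lower(), pattern.lower())


print(host_matches("*.infoseek.com", "anything.infoseek.com"))                # True
print(host_matches("scooter*.av.pa-x.dec.com", "scooter3.av.pa-x.dec.com"))  # True
print(host_matches("*.infoseek.com", "crawl2.atext.com"))                    # False
```

Note that under this reading *.infoseek.com does not match the bare name infoseek.com, because the pattern requires something before the first dot.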

Updated: October 14, 1998
Please see the bottom of the page for other resources with updated information.

Each entry below lists the search engine, its agent names and its host names.

Search Engine: AltaVista (normal spider)
Agent Names: Scooter/2.0 G.R.A.B. X2.0
    Scooter/1.0 [email protected]
Host Names: scooter.pa-x.dec.com
    scooter*.av.pa-x.dec.com, such as: scooter3.av.pa-x.dec.com

Search Engine: AltaVista (instant spider)
Agent Names: Scooter/1.0
Host Names: add-url.altavista.digital.com
    ww2.altavista.digital.com

Search Engine: Euroseek
Agent Names: Arachnoidea ([email protected])
Host Names: *.euroseek.net, such as: infra.euroseek.net

Search Engine: Excite (mega spider)
Agent Names: ArchitextSpider
Host Names: crawl*.atext.com, such as: crawl2.atext.com

Search Engine: Excite (fresh spider)
Agent Names: ArchitextSpider
Host Names: crimpshrine.atext.com

Search Engine: Fireball (German search engine)
Agent Names: KIT-Fireball/2.0
Host Names: heavymetal.fireball.de

Search Engine: Google (experimental search engine)
Agent Names: BackRub/2.1 [email protected] http://google.stanford.edu/
Host Names: *.stanford.edu, such as: hake.stanford.edu

Search Engine: Inktomi (powers HotBot, others)
Agent Names: Slurp/2.0 ([email protected]; http://www.inktomi.com/slurp.html)
Host Names: *.inktomi.com, such as: j2001.inktomi.com or j10.inktomi.com

Search Engine: Infoseek (normal spider)
Agent Names: InfoSeek Sidewinder/0.9
Host Names: *.infoseek.com, such as: wilbur-bbn.infoseek.com
    or IP number, such as: 204.162.98.90

Search Engine: Infoseek (instant spider)
Agent Names: Mozilla/3.01 (Win95; I)
Host Names: as above

Search Engine: Lycos (regular spider)
Agent Names: Lycos_Spider_(T-Rex)
Host Names: lycosidae.lycos.com
    or *.pgh.lycos.com, such as: spider3.srv.pgh.lycos.com

Search Engine: Lycos (Add URL spider)
Agent Names: Lycos_Spider_(T-Rex)
Host Names: *.sjc.lycos.com, such as: sjc-fe4-1.sjc.lycos.com

Search Engine: Northern Light
Agent Names: Gulliver/1.2
Host Names: taz.northernlight.com

Search Engine: WebCrawler
Agent Names: Served by Excite spiders
Host Names: Served by Excite spiders

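
As a rough illustration of how the chart might be applied to a log entry, the sketch below looks up a visitor by its agent name and host name. It is only a sketch under stated assumptions: the CHART list copies a handful of rows from above, and its tuple layout, the prefix-based agent comparison, and the function name identify_spider are all hypothetical choices made for this example.

```python
from fnmatch import fnmatchcase

# A few rows copied from the chart: (engine, agent name prefix, host patterns).
# Only a subset is shown; the tuple layout is an assumption for illustration.
CHART = [
    ("AltaVista (normal spider)", "Scooter/",
     ["scooter.pa-x.dec.com", "scooter*.av.pa-x.dec.com"]),
    ("Excite (mega spider)", "ArchitextSpider", ["crawl*.atext.com"]),
    ("Excite (fresh spider)", "ArchitextSpider", ["crimpshrine.atext.com"]),
    ("Inktomi", "Slurp/2.0", ["*.inktomi.com"]),
    ("Lycos (regular spider)", "Lycos_Spider_(T-Rex)",
     ["lycosidae.lycos.com", "*.pgh.lycos.com"]),
    ("Northern Light", "Gulliver/1.2", ["taz.northernlight.com"]),
]


def identify_spider(agent: str, host: str):
    """Return the first engine whose agent prefix and host pattern both match.

    Returns None when no listed row matches both fields.
    """
    for engine, agent_prefix, host_patterns in CHART:
        if not agent.startswith(agent_prefix):
            continue
        # Host names are case-insensitive; the patterns above are lowercase.
        if any(fnmatchcase(host.lower(), p) for p in host_patterns):
            return engine
    return None


print(identify_spider("ArchitextSpider", "crawl2.atext.com"))  # Excite (mega spider)
print(identify_spider("Slurp/2.0", "j2001.inktomi.com"))       # Inktomi
# dialup.example.net is a made-up host that is not on the chart.
print(identify_spider("Slurp/2.0", "dialup.example.net"))      # None
```

The sketch requires both fields to match because an agent name alone is weak evidence: the chart itself shows the Infoseek instant spider reporting a browser-style agent, Mozilla/3.01 (Win95; I), so the host name is what distinguishes it.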
More Resources

SpiderSpotting
A page within Search Engine Watch that
explains how to track down robot visitors.

SpiderHunter.com
http://www.spiderhunter.com/

A huge collection of resources devoted to tracking spiders. You
can pretend to be a spider, view a collection of spiders by name
and IP addresses, read tutorial information and more.

Search Engine Spider IP Addresses
Comprehensive list of agent names and IP addresses.

The Web Robots Database
http://info.webcrawler.com/mak/projects/robots/active.html
Some of the entries are outdated,
but it remains a useful resource.

BotWatch Configuration File
http://www.tardis.ed.ac.uk/~sxw/robots/index.html
Lists a wide range of robots,
with agent and host names.

