Very nice. Wondering how a search engine will process your robots.txt file? Google now provides a way to check on that through the Google Sitemaps program. The post "More stats and analysis of robots.txt files" on the official Inside Google Sitemaps blog explains more.
For Search Engine Watch members, the longer version of this article gives a real-life example of the checker in action.
Overall, I'm thrilled with the new tool. I'd like to see the other search engines add similar ones. Even better, I'd like to see them all come together to create an enhanced, more consistent robots.txt standard. Consider:
- Google allows wildcards, but others don't.
- Ask, MSN & Yahoo allow crawl delays (but don't define minimum or maximum values). Google does not.
- Ask & Google have ALLOW commands that no others support.
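To make those differences concrete, here is a minimal sketch of how one parser reads a file that uses all three extensions. It uses Python's standard-library `urllib.robotparser`, which is only a stand-in: it is not Google's checker and says nothing certain about how any search engine behaves. The `ExampleBot` user agent, the `www.example.com` URLs and the rules themselves are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using the three non-standard extensions
# discussed above: Crawl-delay, Allow, and a wildcard pattern.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Allow: /private/public-page.html
Disallow: /private/
Disallow: /*.pdf$
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# This parser applies the first rule that matches, so the Allow line
# wins for this one page because it is listed before the Disallow line.
allowed_page = parser.can_fetch(
    "ExampleBot", "http://www.example.com/private/public-page.html"
)

# Everything else under /private/ falls through to the Disallow rule.
blocked_page = parser.can_fetch(
    "ExampleBot", "http://www.example.com/private/secret.html"
)

# Crawl-delay value for this user agent (None if the file sets none).
delay = parser.crawl_delay("ExampleBot")

# Assumption: this parser matches paths by plain prefix, so it is not
# expected to treat "/*.pdf$" as a wildcard the way Google does.
pdf_result = parser.can_fetch("ExampleBot", "http://www.example.com/report.pdf")

print(allowed_page, blocked_page, delay, pdf_result)
```

Run the same file past a crawler that does understand wildcards and the answer for the PDF may well differ, which is exactly the kind of inconsistency a shared standard would remove.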
Postscript: Matt Cutts from Google has some good comments over here. He points out that Google also has an allow command (I've updated my list above). Further down, in the comments to his post, he explains why Google doesn't support crawl-delay yet: the concern is that some webmasters might set it too low by mistake.
