Yahoo Adds Support for Page-Level Exclusion Tags for Non-HTML Docs

Yahoo is giving webmasters more control over how its Slurp crawler treats non-HTML files, via page-level directives.

The X-Robots-Tag is a page-level exclusion directive that tells a search engine spider how to treat a page. Much like a robots.txt file or a robots meta tag, the X-Robots-Tag can carry a NOINDEX, NOARCHIVE, NOSNIPPET, or NOFOLLOW directive, telling spiders, respectively, not to index a page, not to display a cached version of the page in search results, not to display a summary of the page in search results, or not to follow the links on a page.

The difference is that the X-Robots-Tag directive is delivered in the HTTP response header rather than in the page markup, so it can be applied to non-HTML files such as PDFs, Word documents, PowerPoint presentations, video, and other file types. It can still be used on HTML pages as well.
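Because the directive travels in the HTTP header, a server can choose it per file type before sending a response. The sketch below is purely illustrative (the function name and the extension-to-directive mapping are hypothetical, not Yahoo's or any server's actual implementation); it shows how a site might decide which X-Robots-Tag value to attach to a non-HTML file.

```python
# Hypothetical sketch: pick an X-Robots-Tag header value per file type,
# so crawlers like Yahoo's Slurp see page-level directives even on
# documents that cannot carry an HTML robots meta tag.

def robots_header_for(path):
    """Return an X-Robots-Tag value for file types a site wants kept
    out of the index, or None to send the response unchanged.

    NOINDEX   - do not index the document
    NOARCHIVE - do not show a cached copy in search results
    NOSNIPPET - do not show a summary in search results
    NOFOLLOW  - do not follow links found in the document
    """
    # Example policy (an assumption for illustration only).
    excluded = {
        ".pdf": "NOINDEX, NOARCHIVE",
        ".doc": "NOINDEX",
        ".ppt": "NOINDEX, NOSNIPPET",
    }
    lowered = path.lower()
    for ext, directive in excluded.items():
        if lowered.endswith(ext):
            return directive
    return None  # HTML and other types: no header added
```

A server would then emit `X-Robots-Tag: NOINDEX, NOARCHIVE` in the response headers for a matching file, which is the header-level equivalent of placing those directives in a page's robots meta tag.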

Yahoo also gave a weather report: "Along with this change, we'll be rolling out additional changes to our crawling, indexing and ranking algorithms over the next few days. We expect the update will be completed early next week, but you may see some changes in ranking as well as some shuffling of the pages in the index during this process."

About the author

Kevin Newcomb joined ClickZ in August 2004, covering search marketing and other online marketing topics. He has been reporting on web-based businesses since 2000.

Before the bubble burst, Kevin was a marketing manager for an online computer reseller, handling copywriting, e-mail marketing, search marketing and running the affiliate program.

With a combination of real-world marketing experience and years of business journalism, Kevin brings to ClickZ a unique ability to deliver news and training materials that help online marketers do their jobs better.