A paper presented near the end of March at the 10th International Conference on Extending Database Technology in Munich, Indexing Shared Content in Information Retrieval Systems (pdf), jointly authored by employees of Yahoo, Google, and IBM, discusses how to limit the index sizes of search engines by reducing the amount of duplicate content in their indexes.
After reading it, I started listing some of the problems a site may have that could keep search engines from indexing its pages or from displaying them in search results. My list is in a post at SEO by the Sea, titled Duplicate Content Issues and Search Engines.