Google Webmaster Central Blog Addresses Duplicate Content Issues

Over at the Google Webmaster Central blog, Search Quality Team member Sven Naumann is tackling the issue of duplicate content. Naumann says there are two primary types of duplicate content, within a domain and cross-domain, and offers tips on how to deal with each.

Within a Domain

This type of duplicate content occurs when the content from one page appears on other pages within your site. In this case, most webmasters or site owners have a preference as to which page they want to rank. Naumann offers the following tip: "Include the preferred version of your URLs in your Sitemap file. When encountering different pages with the same content, this may help raise the likelihood of us serving the version you prefer."
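As a rough illustration of that tip, here is a minimal Python sketch that writes a Sitemap containing only the preferred version of each URL. The `build_sitemap` helper and the example.com URLs are assumptions made up for this example, not anything from Naumann's post; the only fixed detail is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(preferred_urls):
    """Return Sitemap XML listing each preferred URL exactly once.

    Hypothetical helper: the caller decides which URL is the
    preferred version of each page and passes only those in.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in preferred_urls:
        if url in seen:  # skip accidental repeats
            continue
        seen.add(url)
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs: list the version you want to rank, and leave out
# the other pages that carry the same content.
sitemap_xml = build_sitemap([
    "https://example.com/widgets",
    "https://example.com/about",
    "https://example.com/widgets",  # repeat, ignored
])
print(sitemap_xml)
```

Leaving the non-preferred pages out of the file is the whole point here: per the quote above, the Sitemap is a hint about which version you prefer, not a guarantee.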

Cross-Domain

Cross-domain duplicate content occurs when content from your site appears on other sites, usually through syndication or blogs that scrape content. When it comes to syndication, asking your partners to link back to your page is a good way to help Google know that your site is the original source. As for scraped content, Naumann insists that Google is good at telling scraped copies from the original: "You shouldn't be very concerned about seeing negative effects on your site's presence on Google if you notice someone scraping your content."

Still, once in a while, scraped content may rank higher than your page. In such an instance, Naumann suggests the following:

  • Check if your content is still accessible to our crawlers. You might unintentionally have blocked access to parts of your content in your robots.txt file.
  • You can look in your Sitemap file to see if you made changes for the particular content which has been scraped.
  • Check if your site is in line with our webmaster guidelines.
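The first check in that list can be roughed out with Python's standard-library `urllib.robotparser`. This is only a sketch under stated assumptions: the `blocked_urls` helper, the sample robots.txt rules, and the example.com URLs are all hypothetical, and the standard-library parser may not interpret every rule exactly the way Google's crawler does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows.

    Hypothetical helper: it parses robots.txt content passed in as a
    string, so no network access is needed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Sample rules that block a whole section, perhaps unintentionally.
robots_txt = """\
User-agent: *
Disallow: /articles/
"""

urls = [
    "https://example.com/articles/duplicate-content",
    "https://example.com/about",
]

blocked = blocked_urls(robots_txt, urls)
for url in blocked:
    print("Blocked for crawlers:", url)
```

If a page that is being outranked by a scraped copy shows up in the blocked list, the robots.txt rule covering it is the first thing to revisit.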

Wrapping up, Naumann assured webmasters and SEOs that, "In the majority of cases, having duplicate content does not have negative effects on your site's presence in the Google index. It simply gets filtered out."

What do you think about this duplicate content post on the Google Webmaster Central blog? Does it line up with your experience in dealing with duplicate content? Share your thoughts in the comments.
