SES: Duplicate Content and Multiple Sites

Date published: 25 March 2010
Author: Adam Audette

Categories: Events, SEO

This panel focused on tactics for controlling duplicate content for SEO purposes. It’s always a great topic because, while the general guidelines are clear, solving duplicate content efficiently depends on the specific situation, and there is a lot of nuance involved.

Moderator: Adam Audette

Shari Thurow
Maile Ohye
Anthony Long

Shari is up first, and she describes the 6 primary methods websites can use to give the search engines directives and/or hints as to preferred URL canonicalization. (Canonicalization in this context is the consistent representation of 1 primary URL for every HTML document.) These are:

robots.txt directives
meta robots tags
XML sitemaps
nofollow link attribute
rel=canonical link element
internal linking

Shari hammers home the importance of consistency in everything a website says and does. For example, if a web page is robots excluded, don’t include it in an XML sitemap, and don’t put a rel=canonical tag on the page. When the engines get a consistent ‘scent’ or signal, canonicalization is much easier. Don’t give the search engines conflicting signals.
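The consistency rule above can be checked mechanically. The following is a minimal sketch, not a tool presented at the panel: it uses Python's standard robots.txt parser to flag URLs that are disallowed for crawling yet still appear in an XML sitemap or carry a rel=canonical tag. The function name, the example.com URLs, and the sample robots.txt rules are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_conflicts(robots_txt, sitemap_urls, pages_with_canonical, user_agent="*"):
    """Return (url, reason) pairs for URLs that send conflicting signals."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    conflicts = []
    for url in sitemap_urls:
        # A sitemap entry says "please crawl this"; robots.txt says "do not".
        if not parser.can_fetch(user_agent, url):
            conflicts.append((url, "in XML sitemap but disallowed by robots.txt"))
    for url in pages_with_canonical:
        # A crawler that is blocked from the page never sees its canonical tag.
        if not parser.can_fetch(user_agent, url):
            conflicts.append((url, "has rel=canonical but disallowed by robots.txt"))
    return conflicts


# Hypothetical inputs for illustration only.
ROBOTS_TXT = "User-agent: *\nDisallow: /print/\n"
SITEMAP_URLS = [
    "https://www.example.com/widgets",
    "https://www.example.com/print/widgets",
]
PAGES_WITH_CANONICAL = ["https://www.example.com/print/widgets"]

conflicts = find_conflicts(ROBOTS_TXT, SITEMAP_URLS, PAGES_WITH_CANONICAL)
for url, reason in conflicts:
    print(f"{url}: {reason}")
```

In this sample the print-friendly URL is reported twice, once for each conflicting signal, while the primary URL passes cleanly.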

Another important point from Shari is that duplicate content filtering happens in 3 different phases: there are crawl time filters, index time filters, and query time filters.

Next is Maile, who gives a great overview of the Google perspective on duplicate content. She assures the audience that Google does not treat duplicate content as a penalty; it is in fact a filter. Maile talks about crawl overhead and how curing duplicate content can greatly improve the crawl experience. In a great example of how sites struggle with duplicate content, Maile points to the Google Books site and shows a number of duplicate URLs for the primary site. Yep, duplicate content is such a common problem that even Google has issues with it.

Maile encourages the use of rel=canonical, which is fully supported by the engines. Google is the only one that also supports cross-domain use of rel=canonical.
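To make the rel=canonical hint concrete, here is a minimal sketch of how a script might read it from a page and tell whether it points to a different host (the cross-domain case). It uses only Python's standard library; the class and function names and the example.com and example.net URLs are hypothetical, and this says nothing about how any search engine actually processes the hint.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def canonical_url(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_cross_domain(page_url, canonical):
    """True when the canonical URL lives on a different host than the page."""
    return urlsplit(page_url).hostname != urlsplit(canonical).hostname


# Hypothetical page markup for illustration only.
PAGE = (
    '<html><head><title>Widgets</title>'
    '<link rel="canonical" href="https://www.example.com/widgets">'
    '</head><body></body></html>'
)

print(canonical_url(PAGE))
print(is_cross_domain("https://shop.example.net/widgets?sort=price", canonical_url(PAGE)))
```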

Maile notes that over 60,000 sites are using the parameter removal option in Google Webmaster Tools to help with duplicate content. She explains how this can be a useful step for helping Googlebot understand which URL parameters aren’t necessary and can be safely ignored during the crawl.
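The idea behind ignoring unnecessary parameters can be sketched in a few lines. The example below is an illustration only, not the Webmaster Tools feature itself: it collapses URL variants by dropping parameters that do not change the content. The parameter names in IGNORABLE_PARAMS and the example.com URLs are assumptions; a real site would have to decide for itself which parameters are safe to ignore.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change what the page displays.
IGNORABLE_PARAMS = {"sessionid", "ref", "utm_source"}


def strip_ignorable_params(url, ignorable=IGNORABLE_PARAMS):
    """Return the URL with ignorable query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignorable
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


variants = [
    "https://www.example.com/shoes?color=red&sessionid=abc123",
    "https://www.example.com/shoes?color=red&ref=home",
    "https://www.example.com/shoes?color=red",
]

# All three variants reduce to a single URL once the extras are dropped.
unique_urls = {strip_ignorable_params(url) for url in variants}
print(unique_urls)
```

Parameters that do change the content, such as color in this sample, are kept, so genuinely different pages remain distinct.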

One topic that wasn’t covered was Google’s preference for webmasters to not exclude duplicate content with robots.txt or a meta robots tag.

Finally it’s Anthony’s turn, and he gives an interesting case study of duplicate content struggles experienced on AOL properties. Popeater.com is his primary example: its blog articles are featured in Google News, but in traditional listings only the syndication partners appear, not the original article on Popeater.com. He asks why this is happening.

Anthony explains that they share a set of Guidelines, and also a set of Preferences, with their syndication partners. Each case is handled individually, but the goal is ultimately to give the syndication partners a limited version of the original content, and to feature prominent links back to the source content.

During Q&A, topics include everything from handling tag pages, to dealing with hundreds of ‘salesperson’ sites with the same content, to how to handle a domain just purchased that is in a bad neighborhood with dirty links.
