Seven ways to avoid content cannibalization

There’s currently a lot of discussion about how to improve your domain’s SEO.

Many issues are labeled as dramatically influential and people are always sharing tips on how to deal with them. However, one topic is seldom discussed within SEO circles: internal content cannibalization.

Why should we care?

SEO experts and digital marketers have long been contemplating whether or not there’s such a thing as a Google penalty for duplicate content. As it turns out, it’s a myth.

However, we cannot simply sit back and relax. In fact, there’s one thing worse than “duplicate content” – “too similar” content, which can be a stone around your website’s neck.

Pieces of content that are too similar are the real cannibals here, since exact duplicates are simply filtered out. Having this kind of content on your site can ruin its potential, because your own URLs end up competing against each other.

We know that Google doesn’t like to rank multiple pages from the same domain, so eventually one or the other gets picked. One day Google shows one page from your domain, another day it shows a different one, and you end up losing potential traffic.

We call it the “seesaw effect.” When you have different pages that contain too-similar content, basically, you confuse the Google bot as to which page from your site it should rank for key terms.

So, to get some insight into the matter of internal site cannibalization, SEMrush recently hosted an in-depth discussion about the subject with an industry expert – Dawn Anderson, lecturer in digital and search marketing, researcher and international SEO consultant.

Based on our talk, we have outlined seven ways to avoid internal site cannibalization.


1) Internal linking

Cause: “Skewed ‘popularity’ via internal linking, which will cause a particular page to be displayed instead of another regardless of relevancy.”

To clarify this point, let’s say you have a website for selling shoes online, and it also includes a blog. You want your homepage to be as popular as possible in search results pages; however, you end up earning more traffic with your blog.

When reviewing your internal linking, you may find that many internal links point to your blog, which is ranked higher. Why is that? Because the more links you provide to a page, the more important Google considers it to be!

Solution: Review your internal link structure, and place what’s most important higher within the hierarchy of your pages. A simple site audit can automatically check your domain for duplicate links and tags, and even provide some suggestions for improvement.
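As a rough illustration of that review, the sketch below counts how many internal links point at each page. The crawl data, the `shoes.example` host and the function name are all hypothetical; a real audit would collect the links with a crawler.

```python
from collections import Counter
from urllib.parse import urljoin, urlparse

def inbound_internal_links(pages, site_host):
    """Count how many internal links point at each URL.

    `pages` maps a page URL to the list of href values found on that page.
    """
    counts = Counter()
    for source_url, hrefs in pages.items():
        for href in hrefs:
            target = urljoin(source_url, href)  # resolve relative links
            # Keep only links that stay on the site and are not self-links.
            if urlparse(target).netloc == site_host and target != source_url:
                counts[target] += 1
    return counts

# Hypothetical crawl data for the shoe-shop example.
pages = {
    "https://shoes.example/": ["/blog/", "/buy-shoes/"],
    "https://shoes.example/blog/": ["/blog/post-1/", "/"],
    "https://shoes.example/blog/post-1/": ["/blog/", "/blog/"],
    "https://shoes.example/buy-shoes/": ["/blog/"],
}
link_counts = inbound_internal_links(pages, "shoes.example")
for url, n in link_counts.most_common():
    print(n, url)
```

In this made-up data the blog collects far more internal links than the homepage, which is the kind of skew described above.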

2) Anchor text inconsistency

Cause: Overuse of keywords with anchors that refer to the wrong page, which creates spam within your own site.

Let’s consider our shoes website example once again. Your anchors should always be concise and get right to the point! If your anchor says “buy shoes,” your visitor should be able to do exactly that right after clicking it, without being sent elsewhere first.

Don’t, for example, create a transitional page on “shoe sizes” or something else. Precision lends clarity, and clarity leads to higher rankings in search engines.

Solution: Make sure your keywords are natural, useful and mapped to the right page – that way, you are sending all the right hints to Google from within your own website to your URLs.
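One way to spot inconsistent anchors, sketched here under the assumption that you already have a list of (anchor text, target URL) pairs from a crawl, is to look for anchor texts that lead to more than one page. The data and the function name are illustrative only.

```python
from collections import defaultdict

def inconsistent_anchors(links):
    """Return anchor texts that point to more than one target URL.

    `links` is an iterable of (anchor_text, target_url) pairs.
    """
    targets = defaultdict(set)
    for anchor, url in links:
        # Normalize case and whitespace so "Buy shoes" matches "buy shoes".
        targets[anchor.strip().lower()].add(url)
    return {a: sorted(u) for a, u in targets.items() if len(u) > 1}

# Hypothetical internal links from the shoe-shop example.
links = [
    ("Buy shoes", "/buy-shoes/"),
    ("buy shoes", "/shoe-sizes/"),
    ("Shoe sizes", "/shoe-sizes/"),
]
conflicts = inconsistent_anchors(links)
print(conflicts)
```

Any anchor that shows up in the result is sending mixed hints about which page it belongs to.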

3) Canonicalization (incorrect or missing canonicalization)

A canonical tag is a link element in a page’s head that tells search engines which URL is the preferred version when several pages contain exactly duplicated or too-similar content.

Cause: Duplicate or too-similar content that is not canonicalized, or canonicalization that is applied too liberally. When you purposefully use duplicate or too-similar content within your site, make sure you canonicalize it. However, if you canonicalize too much, it doesn’t do you any good either – Google may consider it to be messy and may ignore the page completely.

Solution: Pay attention to the way you canonicalize:

  • URLs with and without a trailing slash are treated as different, so canonicalize to one form consistently
  • Don’t forget to switch your canonicals when migrating content to HTTPS
  • Self-referencing canonicalization can help if your content gets scraped
  • Avoid canonicalization of URLs if the content is too different (if your content actually differs quite substantially, canonicalization will only be detrimental)
  • Use a rel="canonical" link element
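Some of the checks in the list above can be automated. The following is a minimal sketch using only Python’s standard library; the URLs, the sample HTML and the names `CanonicalFinder` and `canonical_issues` are assumptions made for illustration, and it covers only a missing tag, a canonical left on HTTP, and a trailing-slash mismatch.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")

def canonical_issues(page_url, html):
    """Return a list of simple canonical problems found for one page."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return ["missing canonical"]
    page, canon = urlparse(page_url), urlparse(finder.canonical)
    issues = []
    if page.scheme == "https" and canon.scheme == "http":
        issues.append("canonical still points to HTTP")
    # Same path apart from the trailing slash counts as a mismatch.
    if page.path.rstrip("/") == canon.path.rstrip("/") and page.path != canon.path:
        issues.append("trailing-slash mismatch")
    return issues

# Hypothetical page that was migrated to HTTPS but kept its old canonical.
html = '<html><head><link rel="canonical" href="http://shoes.example/summer-shoes"></head></html>'
issues = canonical_issues("https://shoes.example/summer-shoes/", html)
print(issues)
```

Whether a canonical that points at a genuinely different page is appropriate still needs a human judgment about how similar the content is.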


4) Page title and other meta differentiation

Cause: Different URLs ranking for the same term in your industry’s seasonal periods, or optimizing several pages for the same terms (even unwittingly).

For instance, in our previous example, our primary term is “shoes.” Thus, we optimize our main page to appear in response to the keyword “shoes.” However, in the summer you decide to target an audience looking for “summer shoes,” so you create a separate page with that name.

Eventually, you end up with two pages on which “shoes” appears simultaneously, and your “summer shoes” page is killing the potential of your primary target page.

Solution: Avoid over-optimizing non-targeted pages for your primary terms, and avoid under-optimizing secondary target pages for their own terms.
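A quick way to look for unintended overlap is to list the words that appear in more than one page title. This is only a sketch with made-up titles and a hypothetical function name; in practice you would also filter out stop words and your brand name.

```python
import re
from collections import defaultdict

def title_term_overlap(titles, min_pages=2):
    """Map each title word to the pages using it, keeping shared words only.

    `titles` maps a page URL to its <title> text.
    """
    pages_by_term = defaultdict(list)
    for url, title in titles.items():
        # Use a set so a word repeated in one title is counted once per page.
        for term in set(re.findall(r"[a-z]+", title.lower())):
            pages_by_term[term].append(url)
    return {t: sorted(p) for t, p in pages_by_term.items() if len(p) >= min_pages}

# Hypothetical titles from the shoe-shop example.
titles = {
    "/": "Shoes for every occasion",
    "/summer-shoes/": "Summer shoes",
    "/blog/": "Our blog",
}
overlap = title_term_overlap(titles)
print(overlap)
```

A shared word is not a problem in itself; it is a prompt to check that each page leads with its own target term.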

5) Merging content for concentration

Cause: Stealing power from your own target via variants, stemming, synonyms, or associated keywords.

For instance, you have four separate pages dedicated to “Shoe Care,” one each for winter, summer, spring and autumn. These pages could actually be tied into one big category, “How to Look After Shoes.”

Solution: Think at the theme, section and page levels, and avoid silos. How? First, do a content audit. Afterwards, concentrate rather than dilute: try to make one page instead of two.

6) Intent to content mapping

Cause: Lack of logic within your site structure.

For example, you have a blog within your site. If you do not provide some logic and order to the navigation within your blog, Google will also consider it a source of cannibalization.

Thus, you end up being cannibalized by illogical blog categories (rather than five random topics, make a category and divide it into topics afterwards). Don’t create competition for your primary target page (see point 5).

Solution: Surround your content with a sectional environment of contextual relevance.

7) If you can’t improve (for now), noindex (temporarily) and then improve.

Just don’t forget to remove the noindex and let the page be indexed again later!

If you cannot identify the problem, here are some possible symptoms:

  • Different pages ranking for the same term
  • Hovering on page 2
  • Never quite achieving those top spots
  • Different URLs ranking for the same terms during your industry’s seasonal periods
  • Not getting the CTR (click-through rate) despite ranking reasonably well
  • Shared impressions in Google Search Console for different pages for the same terms
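The last symptom in the list can be checked with a short script. The sketch below assumes you have exported rows of (query, page, impressions) from Google Search Console into a list; the sample rows, the 20% threshold and the function name are assumptions for illustration, not values from the discussion.

```python
from collections import defaultdict

def shared_queries(rows, min_share=0.2):
    """Find queries whose impressions are split across several pages.

    `rows` is an iterable of (query, page, impressions) tuples. A page counts
    as a competitor when it holds at least `min_share` of the query's impressions.
    """
    by_query = defaultdict(lambda: defaultdict(int))
    for query, page, impressions in rows:
        by_query[query][page] += impressions
    flagged = {}
    for query, pages in by_query.items():
        total = sum(pages.values())
        rivals = sorted(p for p, n in pages.items() if total and n / total >= min_share)
        if len(rivals) > 1:
            flagged[query] = rivals
    return flagged

# Hypothetical export for the shoe-shop example.
rows = [
    ("shoes", "/", 600),
    ("shoes", "/summer-shoes/", 400),
    ("shoe care", "/blog/shoe-care/", 900),
    ("shoe care", "/blog/", 50),
]
flagged = shared_queries(rows)
print(flagged)
```

Here “shoes” is flagged because two pages split its impressions, while “shoe care” is not, since one page clearly dominates.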

You can always rely on specially designed tools to help you with your website audit; they can point out weaknesses and help you decrease the possibility of internal cannibalization.

Meri Chobanyan is a Content Writer at SEMrush. You can connect to Meri on LinkedIn.
