Simplifying the Task of Pruning Links

Once you have determined that bad links are pointing at your site, you need to start working to address them.

If you have 10,000 or more links to your site, this can seem like an insurmountable task – particularly if the people who acquired the bad links are no longer around to ask about what they did.

Here are some things you can do to simplify the cleanup process.

Categorize Your Links

Start by pulling the link data. At a minimum, pull it from Google Webmaster Tools, because that is what Google is reporting.

If you can, it is also great to get link data from Open Site Explorer and Majestic SEO. Integrate all this data into one master list.

Combining your data (and de-duping the results) gives you the largest possible list of links, as individual link data sources can only provide a sampling of the links to your site.
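As a rough illustration of the combine-and-de-dupe step, here is a minimal Python sketch. It assumes you have already exported the linking URLs from each tool into plain lists; the function names (`normalize_url`, `build_master_list`) and the sample URLs are made up for this example, and the normalization rules are just one reasonable choice.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Normalize a linking URL so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    path = parts.path or "/"
    # Drop the fragment: it does not change which page is doing the linking.
    return urlunsplit((scheme, host, path, parts.query, ""))


def build_master_list(*sources):
    """Merge several iterables of linking URLs into one de-duplicated, sorted list."""
    seen = set()
    for source in sources:
        for url in source:
            if url and url.strip():
                seen.add(normalize_url(url))
    return sorted(seen)


# Hypothetical exports from two different link data sources.
webmaster_tools = ["http://Example.com/page", "http://blog.example.org/post#comments"]
other_tool = ["http://example.com/page", "http://other.example.net"]

master_list = build_master_list(webmaster_tools, other_tool)
```

Lowercasing the host and dropping the fragment catches the most common duplicates. Whether to also treat `http` and `https`, or trailing slashes, as the same page is a judgment call this sketch leaves alone.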

Once you have your master list, it’s time to start simplifying the task a bit.

1. Sort Your Links by the Linking URL

Sorting by linking URL puts all the pages from one domain next to each other, so you can group the links by domain. If a domain links to you from 834 pages, just check 1-3 at most – chances are that if the links are bad on any of the pages, they are bad on all of them.

This by itself is a huge simplification of the task. Consider the link counts for one example blog.

The blog has 201,048 links from 778 domains. So instead of checking more than 200,000 pages, you only need to check 778 domains. The workload seems a lot lighter already, doesn’t it?
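The grouping step can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the URLs are invented, and it groups by hostname (with a leading "www." stripped) rather than by true registered domain, which would need a public-suffix list.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_domain(urls):
    """Group linking URLs by hostname, ignoring a leading 'www.'."""
    groups = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        groups[host].append(url)
    return dict(groups)


def pages_to_check(groups, per_domain=3):
    """Keep at most `per_domain` pages from each domain for manual review."""
    return {domain: pages[:per_domain] for domain, pages in groups.items()}


# Hypothetical master list of linking pages.
links = [
    "http://www.a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://a.example/4",
    "http://b.example/x",
]

groups = group_by_domain(links)
review_queue = pages_to_check(groups)
```

With real data, `len(groups)` is the number of domains you actually have to look at, and `review_queue` is the short list of pages to hand to a reviewer.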

2. Separate Links From Blogs

This is hard to do without doing a little programming, but the required programming is easy.

Look for URLs that have “blog” as part of the URL, or load the pages and see if you can find a string such as “WordPress”, “Movable Type”, or the name of another blog platform on the page. This won’t give you a complete list of blogs, but it will identify a lot of them for you.
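The two checks described above might look something like this in Python. Treat it as a sketch under assumptions: the marker list is deliberately short, the function names are invented for this example, and a plain substring match will produce some false positives (for instance, a news article that merely mentions WordPress).

```python
import http.client
import urllib.request

# Platform names to look for in the page source; extend as needed.
BLOG_MARKERS = ("wordpress", "movable type")


def url_looks_like_blog(url):
    """True if the URL itself contains the string 'blog'."""
    return "blog" in url.lower()


def html_looks_like_blog(html):
    """True if the page source mentions a known blog platform."""
    text = html.lower()
    return any(marker in text for marker in BLOG_MARKERS)


def fetch_html(url, timeout=10):
    """Download a page, returning an empty string if anything goes wrong."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        return ""


def is_probable_blog(url, fetch=fetch_html):
    """Cheap URL check first; only load the page if that is inconclusive."""
    if url_looks_like_blog(url):
        return True
    return html_looks_like_blog(fetch(url))
```

Passing `fetch` as a parameter makes it easy to swap in a cached or rate-limited downloader when you have thousands of pages to load.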

You will need to look at these posts, but you can give the person doing the work simplified guidelines, such as telling them to look for:

  • Low quality posts.
  • In-context links with rich anchor text.
  • Multiple links per post.

Also, if 1,100 blogs link to your site, and you look at 100 of them and see significant problems, you know you need to check them all (unfortunately!).

But, if you look at 200 or more and they are all clean, you might start to think you don’t need to look at the rest. To be conservative, if you look at half of them, and they are all clean, chances are there was no bad blog campaign underway, and you can skip the rest and focus on looking at other problems.
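If you want to make that sampling rule explicit for whoever is doing the review, it could be written down roughly as follows. This is a simplification and an assumption on my part: it treats any bad blog found in the sample as a "significant problem", and it uses the conservative "half of them, all clean" threshold from above. The function names are hypothetical.

```python
import random


def pick_sample(blog_urls, size, seed=None):
    """Pick a random sample of blogs to review (all of them if the list is short)."""
    urls = list(blog_urls)
    rng = random.Random(seed)
    return rng.sample(urls, min(size, len(urls)))


def next_step(total_blogs, reviewed, bad_found):
    """Decide what to do after reviewing part of the blog list."""
    if bad_found > 0:
        # Problems in the sample suggest a bad blog campaign: review everything.
        return "check all"
    if reviewed * 2 >= total_blogs:
        # Half the blogs reviewed and all clean: the rest can probably be skipped.
        return "skip the rest"
    return "keep sampling"
```

A random sample matters here: reviewing the first 100 blogs in an alphabetized list could easily miss a campaign concentrated on a handful of networks.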

3. Look for Multiple Links Per Page

This is another hint of a possible problem. Not definitive, of course.

For example, a guest post someone did on Forbes had six links to the author’s site, including rich anchor text links in the body of the post. It just looked “off.”

People who buy links tend to be a bit greedy. To them, one rich anchor text link is good, but several links are great – they want to get their money’s worth.

The upside of this greed is it can make it easier for you to recognize potential bad links.

Tracking down pages with multiple links takes a little bit of programming, but it isn’t too hard.
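Here is one way that bit of programming could look, using only Python's standard library. It is a sketch, not a polished crawler: it assumes you already have the page's HTML as a string, it only looks at `<a href>` tags, and the class and function names are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs for links pointing at one domain."""

    def __init__(self, target_domain):
        super().__init__()
        self.target_domain = target_domain.lower()
        self.links = []
        self._href = None
        self._text = []

    def _points_at_target(self, href):
        try:
            host = urlsplit(href).hostname or ""
        except ValueError:
            return False  # Malformed URL: ignore it.
        return host == self.target_domain or host.endswith("." + self.target_domain)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self._points_at_target(href):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = " ".join("".join(self._text).split())
            self.links.append((self._href, anchor))
            self._href = None


def links_to_site(html, target_domain):
    """Return every link on the page that points at `target_domain`."""
    parser = LinkCollector(target_domain)
    parser.feed(html)
    parser.close()
    return parser.links


def has_multiple_links(html, target_domain, threshold=2):
    """Flag pages that link to the site `threshold` times or more."""
    return len(links_to_site(html, target_domain)) >= threshold
```

The anchor text is collected along the way because it is useful for the next check as well. A flagged page is still only a hint and needs a human look.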

4. Look for Pages That Link to You With Rich Anchor Text

This is again not a definitive flag, but it can help you focus where you look for trouble.

Consider the inverse rule too – if the only links on the page to you use your URL or business name (assuming that these aren’t keyword packed), then chances are that the page in question isn’t a problem.
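The inverse rule is easy to express in code once you have the anchor text for each link. The sketch below assumes you supply your own list of brand names and your domain; the function names are hypothetical. Note one limitation: it accepts any anchor containing your domain, so a keyword-packed URL such as `example.com/cheap-blue-widgets` would still pass and may deserve a manual look.

```python
def is_safe_anchor(anchor_text, brand_names, site_domain):
    """True if the anchor is just the site's URL or an exact brand name."""
    text = " ".join(anchor_text.lower().split())
    if not text:
        # Empty anchor text (for example, an image link) is not keyword-rich.
        return True
    if site_domain.lower() in text:
        return True
    return text in {name.lower() for name in brand_names}


def page_looks_clean(anchors, brand_names, site_domain):
    """True if every link to the site on the page uses a safe anchor."""
    return all(is_safe_anchor(a, brand_names, site_domain) for a in anchors)


# Hypothetical anchors collected from two linking pages.
brands = ["Example Inc"]
page_one = ["Example Inc", "www.example.com"]
page_two = ["Example Inc", "cheap blue widgets"]

page_one_clean = page_looks_clean(page_one, brands, "example.com")
page_two_clean = page_looks_clean(page_two, brands, "example.com")
```

Pages that come back clean can go to the bottom of the review pile; the rest are where the rich anchor text lives.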

Focus your energy on pages that smell like trouble.


No matter how you slice it, a full-on analysis looking for bad links is a ton of work. Doing some things to simplify it can make the task seem a lot less daunting.

Using guidelines like these, and perhaps others that you come up with on your own, can make the work a lot easier to take on and get done.
