The worst time to conduct a link audit is when you need to. Don’t wait for a Penguin update to wipe out your rankings in the SERPs or an “unnatural links” warning to hit your inbox. The best time to perform a link audit is now.
A comprehensive link audit is the first step in developing any digital marketing plan. Link building is still the most effective way to boost traffic, via rankings. Google wouldn’t be wasting their time creating the Penguin Algorithm and imposing Manual Link Spam penalties if backlinks didn’t still make a big difference.
The end game will determine just how the data is analyzed.
- If the primary objective is to recover from a penalty, one needs to focus on the Google Webmaster Tools (GWT) data and those areas that fall outside of the webmaster guidelines, aka “link schemes.”
- If developing a program for a new website, the data will be used to find opportunities.
- If there is a competitor outranking us, we’ll use the data to neutralize their competitive advantages and to find opportunities.
How do you know which links to remove when you get an "unnatural links" message?
For competitive purposes, having access to Google Webmaster Tools (GWT) and Google Analytics (GA) is nice, but lacking it isn't necessarily a deal breaker. If you're looking to recover from a penalty, then access to GWT and GA is mandatory. Clients have the option to give limited administrative access to a third party.
Collecting the Link Data
The interwebs went nuts after comments by a previously unknown Googler named Uli Lutz. When asked “What is the definitive source for backlinks to look at?” Lutz is claimed to have said, “I would concentrate on the links reported in the Webmaster Tools on Google.” To put this in context, Lutz was purportedly addressing a question about the disavow links tool, so we’ll begin at Google Webmaster Tools:
If you don’t already have a GWT account, sign up for one, add the code to your site and validate the website. Then:
Select your site
Select Traffic > Links to Your Site
Select Traffic > Links to Your Site > More
Select “Download Latest Links”
Export to .CSV or Google Docs
The backlinks that Google discloses are a “sampling.” It looks like Google may have recently reduced that sampling size. On February 6, 2013, I noticed that over half of the links that once appeared in a client’s GWT backlink sample had “disappeared.” Shaun Andersen tweeted that same day, “Less backlinks available in Google Webmaster Tools by the looks of it - I am seeing about a third of the links once shown.”
To recap: gathering the GWT data is most important when you are looking to recover from a penalty. The most likely “culprits” are the links acquired around the date of the penalty. That said, EVERY link that falls outside of the webmaster guidelines must be cleaned up via link removal, or as a last resort, the disavow tool.
To get a complete backlink profile, you will need a paid subscription to a backlink checker. Everyone seems to have a “favorite,” but any of the “Big 4” (SEOmoz, Majestic SEO, Link Research Tools or Ahrefs) will do the job. We’ll be focusing on the following link characteristics:
- The URL of the page linking to you
- The URL on your site that is being linked to
- The IP of the URL linking to you
- The anchor text used
- The Percentage (Mix) of Anchor text
- The follow/nofollow status of the link
- A measure (rank) of the link’s trust & authority
To begin, enter the URL to audit into the backlink tool. In this case, Ahrefs was used. You will get an output like this:
Next, export the data into a CSV file. Sort in ascending order (low to high) by whichever rank your tool provides: domain rank, trust score, Moz rank, Cemper rank and so on. In theory this will provide you with a list of links in the order of weakest to strongest.
I say “in theory” because some of the weakest links may be harmless, and some powerful paid links may be killing you. There is no pure algorithmic solution. Doing a link audit correctly requires a manual review.
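As a rough sketch of that sort step, here is how it might look in Python. The column names (`source_url`, `target_url`, `anchor`, `domain_rank`) are made up for illustration; every tool labels its export differently, so swap in whatever your CSV actually uses. Rows with a blank rank are treated as zero so they land at the top of the review pile.

```python
import csv
import io

def sort_links_by_rank(csv_text, rank_column="domain_rank"):
    """Return export rows sorted weakest-first by the given rank column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    def rank_of(row):
        # Blank or non-numeric ranks count as 0 so they surface first for review.
        try:
            return float(row[rank_column])
        except (KeyError, ValueError, TypeError):
            return 0.0

    return sorted(rows, key=rank_of)

# Hypothetical export; real tools use their own headers and many more columns.
sample_export = """source_url,target_url,anchor,domain_rank
http://strong.example/post,http://yoursite.example/,Your Brand,71
http://weak.example/links,http://yoursite.example/page,cheap widgets,3
http://mid.example/blog,http://yoursite.example/,click here,
"""

for row in sort_links_by_rank(sample_export):
    print(row["domain_rank"] or "n/a", row["source_url"])
```

The sorted list is only a starting order for the manual review, not a verdict on any individual link.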
Analyzing the Link Data
Links that need to be reviewed and considered for removal are the following:
Links that appear on a domain that isn't indexed in Google.
This usually signals a quality problem. A quick way to test for this is to run a “site” command:
If the website is indexed, you will see a result like this:
A website that isn’t indexed returns a result like this:
Then again, sometimes a perfectly good site isn’t indexed, because of a bad robots.txt, like:
This usually happens when a website leaves the development stage, but the robots.txt isn’t changed to allow the search engines to crawl the site. That’s why a manual review is important.
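If you want to check a robots.txt without eyeballing it, Python's standard library can parse one. The sketch below assumes you have already fetched the file's text yourself; the two rule sets are illustrative examples (a typical leftover development block and an open file), not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

def blocks_googlebot(robots_txt, url="http://example.com/"):
    """Return True if the given robots.txt text stops Googlebot from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", url)

# A typical leftover from a development site: every crawler is shut out.
leftover_dev_rules = """User-agent: *
Disallow: /
"""

# An empty Disallow line allows everything.
open_rules = """User-agent: *
Disallow:
"""

print(blocks_googlebot(leftover_dev_rules))
print(blocks_googlebot(open_rules))
```

A site that fails the index check and also blocks crawlers this way is more likely misconfigured than penalized, which is worth knowing before you ask for the link to be removed.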
Links that appear on a website with a malware or virus warning.
This is pretty self-explanatory.
Links that appear on the same page as spammy, unrelated links.
Commonly links pertaining to Porn, Pills, Casinos, PayDay Loans, etc:
The following links appear at http://iacapap.org/links
But wait… what’s wrong with this page – these links look beautiful. Yes Ma’am – Until you look at the source code, seen below:
If you want to play at home, run the Google Search Command: inurl:links sex,viagra,payday loans and you can find unlimited hacked pages, too.
Links that appear on a page with Google PageRank that is gray bar or zero.
This usually signals poor quality or low trust, but it could also indicate a new page that hasn’t been updated in the PR bar. Gray PR is not the same as PR 0 (zero). The gray bar is sometimes a quality indicator, but doesn’t necessarily mean that the site is penalized or de-indexed. Many low-quality, made-for-SEO directories have a gray bar or PR 0.
On a side note: if the owner of the monkey blog (my 14 yr old son) is reading this; you can do better than that☺
Links coming from link networks.
Link networks are groups of websites with common registrars, common IPs, common C-blocks, common DNS, common analytics and/or common affiliate code. Chances are, if a group of websites shares a common IP, you will also find some of the other characteristics of a link network, so that’s where I look first. If using Ahrefs, you would navigate to Domain reports>yourwebsite.com>IPs and get a report like this:
Then drill down to Domain reports>yourwebsite.com>referring domains to discover a crappy network.
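If your tool exports the IP of each referring domain, the same grouping can be roughed out in a few lines. This is a minimal sketch that assumes IPv4 addresses and treats the first three octets as the C-block; the domains and addresses are placeholders.

```python
from collections import defaultdict

def group_by_c_block(domain_ips):
    """Map each C-block (first three IPv4 octets) to the domains found in it.

    Only blocks holding more than one domain are returned, since a lone
    domain in a block tells you nothing about a network.
    """
    blocks = defaultdict(list)
    for domain, ip in domain_ips.items():
        c_block = ".".join(ip.split(".")[:3])
        blocks[c_block].append(domain)
    return {
        block: sorted(domains)
        for block, domains in blocks.items()
        if len(domains) > 1
    }

# Placeholder data using documentation-reserved IP ranges.
referring_domains = {
    "blog-a.example": "192.0.2.10",
    "blog-b.example": "192.0.2.11",
    "blog-c.example": "192.0.2.10",
    "news.example": "198.51.100.7",
}

for block, domains in group_by_c_block(referring_domains).items():
    print(block, domains)
```

A shared C-block is a lead, not proof. Plenty of unrelated sites sit on the same shared host, so check the clusters by hand for the other network traits before acting.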
Sitewide Links – especially blogroll and footer links.
Most are unnatural and none pass the juice that they once did.
Watch for exceptions to the rule: After a manual review, I am able to determine that in this case, the first sitewide link found in the tool is natural and there is no need to remove it. Just one more example of why human intervention is necessary to get a link audit right.
If you are attempting to recover from a manual penalty, every paid link must be removed. No exception. The Google spam team spends all day every day rooting out paid links. After a while, spotting a paid link becomes second nature. That juicy link that you are certain that you can slip by Google will stick out like a sore thumb to the trained eye and will only prolong the agony of a manual penalty.
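One rough way to surface sitewide candidates from an export is to count how many distinct pages on each referring domain link to you. The sketch below does that; the `min_pages` cutoff of 50 is an arbitrary assumption for illustration, not a known threshold, and the URLs are placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

def likely_sitewide(source_urls, min_pages=50):
    """Return domains that link from at least min_pages distinct pages."""
    # De-duplicate URLs first so a page listed twice is only counted once.
    pages_per_domain = Counter(
        urlparse(url).netloc.lower() for url in set(source_urls)
    )
    return {
        domain: count
        for domain, count in pages_per_domain.items()
        if count >= min_pages
    }

# Placeholder data: one domain linking from 60 pages, one from a single page.
urls = [f"http://blogroll.example/page-{i}" for i in range(60)]
urls.append("http://editorial.example/review")

print(likely_sitewide(urls))
```

Anything this flags still needs a look by hand, for exactly the reason above: some sitewide links are perfectly natural.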
Beyond specific link types which could be considered “suspicious,” there are new link rules that need to be reviewed and adhered to in the post-Penguin era.
Post-Penguin Link Audit Considerations
Keep in mind that Penguin is just the latest anti link spam algorithm rolled out by Google. They are hammering websites built on link schemes and rewarding sites with a natural backlink profile. A natural profile contains an assortment of link types, pointing to a website. Your audit should turn up a good mix of:
- Brand links: Variations include: Your Domain, YourDomain.com, www.YourDomain.com, YourDomain.
- Exact-match anchor text keyword links: These anchor text links should point to the most appropriate page on the website (the one you are optimizing).
- Partial-match keyword links: It’s important not to over-optimize with exact match keywords, otherwise you could trip a phrase based filter.
- Generic Links: Like “Read More” or “Click Here.” Keep in mind that good content should fill this need with little if any work required on your part.
- Page title links: Some of your links should be the same as your page title.
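To see what your own mix looks like, you can bucket the anchor text column of your export into the categories above. This is only a sketch: the matching rules, the `GENERIC` phrase list, and the sample brand, keyword and title are all assumptions for illustration, and the order of the checks matters when a brand name is also a keyword.

```python
from collections import Counter

# Assumed list of generic phrases; extend it to suit your own data.
GENERIC = {"click here", "read more", "here", "website", "visit site"}

def classify_anchor(anchor, brand_terms, keyword, page_title):
    """Assign one anchor text to a single link-type bucket."""
    text = anchor.strip().lower()
    if text == page_title.lower():
        return "title"
    if text == keyword.lower():
        return "exact"
    if any(term.lower() in text for term in brand_terms):
        return "brand"
    if text in GENERIC:
        return "generic"
    # Partial match: the anchor shares at least one word with the keyword.
    if set(keyword.lower().split()) & set(text.split()):
        return "partial"
    return "other"

def anchor_mix(anchors, brand_terms, keyword, page_title):
    """Return the percentage of anchors that fall into each bucket."""
    counts = Counter(
        classify_anchor(a, brand_terms, keyword, page_title) for a in anchors
    )
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}

title = "Blue Widgets Buying Guide | Your Domain"
anchors = [
    "YourDomain.com", "www.yourdomain.com", "Your Domain",
    "blue widgets", "blue widgets", "buy widgets online",
    "click here", "Read More", title, "some unrelated text",
]

mix = anchor_mix(anchors, ["yourdomain", "your domain"], "blue widgets", title)
print(mix)
```

Anything that lands in "other" is worth reading by hand, since that is where the spammy, unrelated anchors tend to hide.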
There has been a lot of speculation regarding what the “right” link mix should be. Chris Cemper is frequently analyzing “cheap flights”, so let’s take a look at the #1 SERP in Google.com for that phrase:
With the domain CheapFlights.com, it’s not unreasonable to expect a lot of people to link naturally, using the exact match “Cheap Flights.” That said, when I do a link audit and I see a mix like this, I get more than a little nervous. I would prefer to see more diversity.
Cemper’s Penguin analysis shows that winners have a 46 percent mix of trophy keyword phrases in their backlink profile as compared to 30 percent for the losers. Brand links account for 41 percent of the mix for winners and 48 percent for losers.
This was corroborated by testing which indicated that if: “under 50 percent of your anchor text for incoming links were “money keywords” it’s all but guaranteed you weren’t affected by (the Penguin) update” and “every single site we looked at which got negatively hit by the Penguin Update had a “money keyword” as its anchor text for over 60% of its incoming links.”
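Those two figures are easy to check against your own profile. The sketch below computes the share of anchors that exactly match a money keyword and compares it with the 50 and 60 percent levels from the quoted testing. Keep in mind these are observations from one set of tests, not thresholds published by Google, and the sample anchors are invented.

```python
def money_keyword_share(anchors, money_keywords):
    """Percentage of anchors that exactly match one of the money keywords."""
    if not anchors:
        return 0.0
    targets = {keyword.strip().lower() for keyword in money_keywords}
    hits = sum(1 for anchor in anchors if anchor.strip().lower() in targets)
    return 100.0 * hits / len(anchors)

def describe_share(share):
    """Compare a share with the levels reported in the quoted testing."""
    if share > 60:
        return "above 60%: matches the sites reported as hit"
    if share < 50:
        return "under 50%: matches the sites reported as unaffected"
    return "50-60%: the quoted testing says nothing definite"

# Invented sample: 7 of 10 anchors use the money keyword.
sample_anchors = ["cheap widgets"] * 7 + ["YourDomain.com"] * 2 + ["click here"]

share = money_keyword_share(sample_anchors, ["cheap widgets"])
print(share, describe_share(share))
```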
There are some good tools on the market like Link Detox and Remove’em to help you with link audits and even link removals. The key takeaway is that no matter what tool you are using, a human review is going to be necessary to “get it right.” Leaving it to metrics alone is a formula for failure.