How Google fights webspam and what you need to learn from this

Google has this week revealed its annual report on how it has policed the internet over the last 12 months. Or at least how it policed the vast chunk of the internet it allows on its results pages.

Although it’s self-congratulatory stuff, and as much as you can rightfully argue with some of Google’s recent penalties, you do need to understand what Google is punishing in terms of ‘bad quality’ internet experiences so you can avoid the same mistakes.

It’s important to remember that Google for some people IS the internet, or at least the ‘front door’ to it (sorry Reddit), but it’s equally important to remember that Google is still a product; one that needs to make money to survive and (theoretically) provide the best possible experience for its users, or else it is off to DuckDuckGo they… uh… go.

So Google has to ensure the results it serves on its SERPs (search engine results pages) are of the highest quality possible. It builds algorithms, and has actual human beings carry out manual reviews, to ensure crappy websites with stolen/thin/manipulative/harmful content stay hidden.

Here’s how Google is currently kicking ass and taking names… and how you can avoid ending up in its crosshairs.


How Google fought webspam

According to Google, an algorithmic update helped reduce the amount of webspam in search results, impacting 5% of queries.

The remaining spam was tackled manually. Google sent more than 4.3 million messages to webmasters notifying them of manual actions it had imposed on sites affected by spam.

Following this, Google saw a 33% increase in the number of sites that went through a spam clean-up “towards a successful reconsideration process.” It’s unclear whether the remaining sites are still in the process of appealing, or have been booted off the face of the internet.

Who watches the watchmen?

More than 400,000 spam reports were manually submitted by Google users around the world. Google acted on 65% of them, and considered 80% of those acted upon to be spam.
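To put those percentages into absolute terms, here is a rough back-of-the-envelope calculation. It assumes exactly 400,000 reports, whereas Google only says "more than" that, so treat the results as lower bounds. The function name is made up for illustration.

```python
def spam_report_funnel(total_reports, acted_pct, spam_pct):
    """Return (reports acted on, reports confirmed as spam).

    Percentages are whole numbers; integer arithmetic avoids float rounding.
    """
    acted_on = total_reports * acted_pct // 100
    confirmed_spam = acted_on * spam_pct // 100
    return acted_on, confirmed_spam


# Assumption: 400,000 is used as a floor for "more than 400,000" reports.
acted_on, confirmed_spam = spam_report_funnel(400_000, 65, 80)
print(acted_on, confirmed_spam)  # 260000 208000
```

In other words, roughly 208,000 of the reports, or a little over half of everything submitted, ended up being treated as spam.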

Hacking

There was a huge 180% increase in websites being hacked in 2015, compared to the previous year. Hacking can take on a number of guises, whether it’s website spam or malware, but the result will be the same. You’ll be placed ‘in quarantine’ and your site will be flagged or removed.

Google has a number of official guidelines on how to help avoid being hacked. These include:

  • Strengthen your account security with lengthy passwords that are difficult to guess or crack, and don’t reuse those passwords across platforms.
  • Keep your site’s software updated, including its CMS and various plug-ins.
  • Research how your hosting provider handles security issues and check its policy when it comes to cleaning up hacked sites. Will it offer live support if your site is compromised?
  • Use tools to stay informed of potential hacked content on your site. Signing up to Search Console is a must, as it’s Google’s way of communicating any site issues to you.


Thin, low quality content

Google saw an increase in the number of sites with thin, low quality content, a substantial share of which is likely to come from scraper sites.

Unfortunately there is very little you can do if your site is being scraped, as Google has discontinued its reporting tool and believes this problem to be your own fault. You just have to be confident that your own site’s authority, architecture and remaining content are enough to ensure it ranks higher than a scraper site.

If you have been served a manual penalty for ‘thin content with little or no added value’ there are things you can do to rectify it, which can mostly be boiled down to ‘stop making crappy content, duh’.

1) Start by checking your site for the following:

  • Auto-generated content: automatically generated content that reads like it was written by a piece of software because it probably was.
  • Thin content pages with affiliate links: affiliate links in quality articles are fine, but affiliate pages whose descriptions or reviews are copied directly from the original retailer, with no original content added, are bad. As a rule, affiliate content should form only a small part of your site.
  • Scraped content: if you’re a site that automatically scrapes and republishes entire articles from other websites without permission then you should just flick the off-switch right away.
  • Doorway pages: these are pages which can appear multiple times in the search results for a particular query but ultimately lead users to the same destination. The purpose of doorway pages is purely to manipulate rankings.

2) Chuck them all in the bin.

3) If after all that you’re 100% sure your site somehow offers value, then you can resubmit to Google for reconsideration.
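If your site is too big to check by eye, part of step 1 can be roughed out in a script. The sketch below is not Google's method, just a crude first pass: it flags pages that fall under a word-count floor and pages whose text is almost identical to another page on the same site (a hint of copied, auto-generated or doorway-style content). The function names, the 150-word minimum and the 0.8 similarity cut-off are all assumptions picked for illustration, and it expects you to have already extracted each page's text into a dictionary keyed by URL.

```python
def shingles(text, size=3):
    """Split text into a set of overlapping word sequences ("shingles")."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def audit_pages(pages, min_words=150, max_similarity=0.8):
    """Return {url: [reasons]} for pages that look thin or near-duplicated.

    pages maps each URL to its extracted body text. Both thresholds are
    arbitrary starting points, not values published by Google.
    """
    urls = sorted(pages)
    sets = {url: shingles(pages[url]) for url in urls}
    flagged = {}
    for url in urls:
        reasons = []
        if len(pages[url].split()) < min_words:
            reasons.append("thin")
        for other in urls:
            if other != url and jaccard(sets[url], sets[other]) >= max_similarity:
                reasons.append(f"near-duplicate of {other}")
        if reasons:
            flagged[url] = reasons
    return flagged
```

Anything this flags still needs a human to look at it. A short page is not automatically a bad one, and a long page can still be worthless, so treat the output as a to-do list rather than a verdict.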

For more information on Google’s fight against webspam, read its official blog post.

And finally, I’ll leave you with this terrifying vision of things to come…

robots and people
