Bing Waxes Lyrical on Spam Detection and Filtering

In the latest post on the Bing blog, Bing dives into the inner workings of Web spam, what it looks like, and what Bing is doing about it.

In the post, Igor Rondel, principal development manager for Bing index quality, discusses the motivation behind spamming, and why it’s not always easy to determine intent:

There is typically a fine line between a legitimate use of an SEO technique and its abuse … even if SEO techniques are severely abused, it’s often not clear whether it’s intentional or accidental … even the most egregious spam pages may have user value and it is important to recognize that to decide on the proper course of action.

To figure out how to detect spam, Bing looks at the primary motivation of most spammers, which Rondel says is money.

“There are exceptions who are in it for other causes, e.g., politics and general mayhem, but the vast majority of the spammers are driven by their ability to monetize their efforts,” he writes.

The primary way spammers make money is through ads, Rondel says. Understanding this motivation “helps us in that a spammer’s motivation is often reflected by the Web page itself and while there is no absolute sure way to pinpoint this, there are often clues that one can learn to read.”

Rondel says one of the primary facets of Bing’s work is Web spam detection and filtering; however, talking about the details of how the search engine does this is tricky without giving away information that could fuel the spammer’s goal. He writes:

Communication around spam detection is a sensitive matter, however, because unlike most other facets of search engine algorithms, we are dealing with an adversary who stands to benefit from a) detailed understanding of search algorithms and b) detailed understanding of anti-spam efforts. Therefore I hope the reader will forgive me for steering clear of specifics and instead focusing on the main themes of our detection and filtering workflow.

But there are conditions Bing uses to evaluate Web pages, including:

  • The quality of the content
  • The volume and type of ads and how they render
  • The layout of the information and position of ads to text
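
As a rough illustration of how the ad-related conditions above could be turned into measurable signals, here is a minimal Python sketch. Rondel deliberately avoids specifics, so everything in it is an assumption made for illustration: the `PageSignals` fields, the `ad_pressure` and `ads_lead_layout` helpers, and the idea of counting ads per 100 words are hypothetical and do not describe Bing's actual method.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    """Hypothetical per-page measurements (not Bing's real features)."""
    content_words: int       # words in the main content text
    ad_count: int            # total ad units on the page
    ads_above_content: int   # ad units rendered before the main text


def ad_pressure(signals: PageSignals) -> float:
    """Ad units per 100 words of content; higher suggests an ad-heavy page."""
    words = max(signals.content_words, 1)  # avoid division by zero
    return 100.0 * signals.ad_count / words


def ads_lead_layout(signals: PageSignals) -> bool:
    """True when more than half of the ad units sit above the main text."""
    return signals.ad_count > 0 and signals.ads_above_content * 2 > signals.ad_count


page = PageSignals(content_words=200, ad_count=4, ads_above_content=3)
print(ad_pressure(page), ads_lead_layout(page))
```

Signals like these would at most be clues of the kind Rondel mentions, not proof of intent.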

When a spam page is detected, Rondel says, the penalty matches the offense:

The specific mechanism that achieves this [spam filtering] is less important and could take the form of demoting the page, neutralizing the effect of specific spam techniques or removing the page/site out of the index [altogether]. The decision is made based on considerations like a) the extent/egregiousness of the spam techniques involved and b) the potential value the page presents to the users.
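
The trade-off Rondel describes, weighing how egregious the spam is against how much value the page offers, can be sketched as a small decision function. This is only a toy model: the `Action` names mirror the three outcomes in the quote, but the 0-to-1 scores, the thresholds, and the `choose_action` function are assumptions for illustration, not anything Bing has published.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    NEUTRALIZE = "neutralize the spam technique"
    DEMOTE = "demote the page"
    REMOVE = "remove from the index"


def choose_action(egregiousness: float, user_value: float) -> Action:
    """Pick a penalty from two hypothetical scores in the range [0, 1].

    The thresholds below are illustrative assumptions only.
    """
    if not (0.0 <= egregiousness <= 1.0 and 0.0 <= user_value <= 1.0):
        raise ValueError("scores must be between 0 and 1")
    if egregiousness < 0.2:
        return Action.NONE          # little or no abuse detected
    if egregiousness >= 0.8 and user_value < 0.2:
        return Action.REMOVE        # severe abuse, little value to users
    if user_value >= 0.5:
        return Action.NEUTRALIZE    # keep the page, cancel the technique's effect
    return Action.DEMOTE            # moderate abuse, limited value


print(choose_action(0.9, 0.1))
print(choose_action(0.9, 0.7))
```

Under these assumed thresholds, a heavily spammed page that users still find useful keeps its place in the index but loses the benefit of the technique, which matches the spirit of the quote.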

Rondel says spam is everywhere – in places you’d expect it, like free ringtone download pages, and places you perhaps wouldn’t expect it, like LinkedIn.

Rondel says the best thing marketers can do is focus on providing quality content:

Ultimately, search engines rank pages based on whether or not we think [the] content will provide value to the searcher and the best way to ensure that your pages rank well is to provide content that users actually want to see, rather than focusing on the specifics of the page structure or its link graph. Aside from the fundamentals of ensuring that your pages are well formed and that the content is easily discoverable by the search engine, the best SEO you can do as a webmaster is providing quality content.

Recently, Bing discussed its site safety page, which aims to inform searchers whether or not a page is safe to visit. The page helps searchers understand:

  • The reason the page is considered malicious
  • How often the URL has been scanned, the date the infection was first detected, and the date the infection was most recently detected

The next post on the Bing blog in the series on Web spam, says Rondel, will look at “one specific update we recently rolled out that focused on URL keyword stuffing and how it impacted our users and the SEO community.”

