In a paper entitled "Spam Double-Funnel: Connecting Web Spammers with Advertisers," researchers at Microsoft and the University of California, Davis show the path by which the ads of legitimate web site owners come to be shown on spam pages. The paper, reported on Monday in a New York Times story, "Researchers Track Down a Plague of Fake Web Pages," is to be delivered in May at the 16th International World Wide Web Conference in Banff, Alberta, Canada. The paper's methodology, findings and conclusions are of interest to search marketers.
For this paper the researchers focused on redirection spam (for examples of redirection spam), where Web pages redirect browsers to known spam-controlled domains. Many of these redirection spam pages use pay-per-click advertising and frequently display ads from reputable advertisers. Many research papers on search spam are essentially descriptive, seeking to categorize the various forms of search spam. This paper goes further: it provides a means of identifying not just how these redirection schemes work but also who is involved in them.
To unravel these redirection schemes and identify the sources, the researchers simply “followed the money,” analyzing the end-to-end redirection paths (for more on the methodology and how you can use similar tactics, see Strider Search Ranger). In the paper they outline the methodology they used to analyze the tens of thousands of spam links found in the course of the research. To describe their findings they created a five-layer double-funnel model that includes:
- Doorway pages
- Redirection domains
- Aggregators
- Syndicators
- Advertisers
Spammers control the doorway pages and redirection domains; aggregators buy traffic from the spammers and sell it to the syndicators, who in turn are paid by the advertisers to display their ads. The system works both ways.
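To make the funnel idea concrete, here is a minimal Python sketch of how recorded redirection chains could be tallied hop by hop to see where traffic converges. It is not the researchers' tooling: the chains, the hostnames (all reserved `.example` names), and the helper functions `hosts_by_hop` and `bottleneck_hop` are assumptions made purely for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical redirect chains, each ordered doorway -> redirection -> aggregator.
# Every hostname here is invented for illustration only.
chains = [
    ["http://blog-a.doorway.example/p1", "http://redir-1.example/go", "http://agg.example/c"],
    ["http://blog-b.doorway.example/p2", "http://redir-1.example/go", "http://agg.example/c"],
    ["http://blog-c.doorway.example/p3", "http://redir-2.example/go", "http://agg.example/c"],
]


def host(url):
    """Extract the hostname from a URL."""
    return urlparse(url).hostname


def hosts_by_hop(chains):
    """Count, for each hop position, how many chains pass through each host."""
    counts = {}
    for chain in chains:
        for hop, url in enumerate(chain):
            counts.setdefault(hop, Counter())[host(url)] += 1
    return counts


def bottleneck_hop(chains):
    """Return the hop position with the fewest distinct hosts."""
    counts = hosts_by_hop(chains)
    return min(counts, key=lambda hop: len(counts[hop]))


if __name__ == "__main__":
    for hop, counter in sorted(hosts_by_hop(chains).items()):
        print(hop, dict(counter))
    print("narrowest hop:", bottleneck_hop(chains))
```

In this toy data, many doorway hosts converge on fewer redirection hosts and then on a single aggregator host, which is the narrowing half of the funnel. A real analysis would need chains captured from actual browser sessions and would likely group hosts by IP block as well.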
For their study, the researchers used 1,000 keywords spread across ten spammer-targeted categories, in two sets: spammer-targeted keywords in one set and the most-bid advertiser keywords in a second. Predictably, the categories included:
The results of the analysis and the conclusions are of particular interest:
For Layer #1 – the doorway domains. The free blog hosting site blogspot.com was responsible for one in every four spam appearances in the top search results. At least three in every four unique blogspot URLs that appeared in the top 50 results were spam. (Aside – this is not news to most search marketers, but it is nice to see real hard data on this.)
For Layer #2 – the redirection domains. The spammer domain topsearch10.com figured prominently, and the 220.127.116.11~18.104.22.168 IP block where it resided hosted multiple domains responsible for 22-25% of all spam appearances.
For Layer #3 – the aggregators. The authors believe the aggregators are a bottleneck and present the best target for attacking search spam. Two IP blocks 22.214.171.124~126.96.36.199 are responsible for the 100,000 spam ads in the sample. (Aside – talk about a bad neighborhood.)
For Layer #4 – the syndicators. This layer includes just a handful of ad syndicators, who serve as middlemen for the majority of the spammers.
For Layer #5 – the advertisers. This layer includes many well-known, reputable advertisers whose ads garner traffic funneled through the system. It is advertiser money that fuels the entire system.
The authors hope that their paper will help search engines strengthen their ranking algorithms and will provide impetus for advertisers to carefully scrutinize their involvement with syndicators and traffic affiliates.