Last month, we wrote about how Google’s human search quality raters assign URL ratings, based on their leaked 120+ page training manual. PotPieGirl, the blogger who found the original manual, did some more digging to learn more about who these people are and how they are hired by outside companies to work on contract.
Whether their ratings are actually used in Google’s algorithm seems to be up for debate; some believe their only purpose is to help Google improve the algorithm, while others consider human ratings a ranking factor. The effect on a site’s ranking may be direct or indirect, but it’s there. Is it possible to optimize for human raters?
Much of what raters evaluate mirrors what a real searcher wants when they land on a page, and most of it is common sense. Surprisingly, raters report completing between 30 and 60 of these evaluations per hour, so it does seem that a first impression may be all you get.
The Search Rater Hiring Process
Human raters are hired by outside companies, such as Leapforce or Lionbridge. The two companies appear to be looking for different skills and educational backgrounds. For example, a recent Search Engine Evaluator job posting from Leapforce asks that applicants have a university degree. Over at Lionbridge, which contracts with over 4,500 raters called Internet Assessors, the job posting has no educational requirements and says, “You can be a college student, teacher, retired person, online shopper, stay-at-home parent, or professional from any field.”
Those selected for testing must pass a two-part open-book exam, consisting of 24 theory questions and 150 simulated evaluation tasks. Applicants refer to a 125-page study guide, which sounds very much like the manual leaked last month; they appear to be one and the same. Both Leapforce and Lionbridge limit rater jobs to one per household or IP address, though as with other work-from-home jobs, there is really no way to tell whether the person who took the test is the one actually performing the ratings.
There does seem to be an internal rater ranking process: according to the training manual, raters are evaluated on their own ability to rate URLs and must maintain a certain level of quality to keep the position. Raters can also comment on the work of others.
How to Optimize for Human Search Quality Raters - And Do You Even Want To?
Raters are given one of two types of tasks: either a query and a corresponding URL, where they evaluate the page’s relevance to the searcher, or a query and two sets of SERPs, where they compare the two and determine which produces the better results for searchers.
Anything you could possibly do to influence a human rater could be considered a best practice in SEO, anyway. Don’t kill yourself over this.
Keep in mind that even if human ratings are an organic ranking factor, Google must take into account the possibility of human error and the inability to supervise work-from-home raters. With that said, here are a few of the factors human raters are looking for when evaluating a site’s relevance to a query:
- The first thing they should determine is user intent - Do, Know, or Go - based on the query. Does the searcher want to accomplish an action (Do), find information (Know), or navigate to a specific site (Go)?
- Is the page vital, useful, relevant, slightly relevant, or off-topic/useless to the query?
Here’s what they’re looking at when they compare two sets of search results, which may be completed at a rate of 20 tasks per hour, according to forum posters who claim to have done the job:
- User intent based on the query.
- Page titles and snippets - that’s right, at roughly three minutes per task they’re probably not clicking through on many results. This does mirror actual search behavior. When was the last time you clicked through on all top 10 results to see which one you liked best?
Bill Slawski found a patent granted last week that explores a few different ways Google could use human raters to evaluate which algorithmic changes produce the best results. As he points out, how it might actually be used is up in the air. So while all of this information about human raters is intriguing, it’s not necessarily worthy of head-desk-banging.
This is all subjective, may be decided in one to two minutes, and may not even be performed by the person who took the test. Yes, it’s a crapshoot.
Is it likely that Google gives these ratings a lot of weight? Probably not, though I’d love to be proven wrong in this case. If you have any thoughts or insight into how human ratings affect organic results, share in the comments!