Since search involves people from all over the world speaking a variety of languages, Google takes language translation very seriously. Shankar Kumar and Wolfgang Macherey recently took to the Official Google Research blog to explain more about Google's translation methods.
Specifically, Kumar and Macherey talked about the Minimum Bayes Risk (MBR) criterion, which they use to decide which translation to return to a user. It's best explained in their own words:
Essentially, we look at a sample of the best candidate translations (the so called n-best list) and choose the safest one, the one most likely to do the least amount of damage (where 'damage' is defined by our measurement of translation quality). You might want to view this as choosing a translation that is a lot like the other good translations instead of choosing that strange one that had the good model score.
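To make the idea concrete, here is a minimal sketch of MBR selection over an n-best list. It is an illustration only, not Google's implementation: the names `overlap_gain` and `mbr_select` are hypothetical, the sample sentences and scores are made up, and a toy unigram-overlap F1 score stands in for whatever measurement of translation quality a real system would use.

```python
from collections import Counter


def overlap_gain(hyp: str, ref: str) -> float:
    """Toy quality measure (an assumption): unigram F1 between two sentences."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    matches = sum((h & r).values())  # words shared by both, with multiplicity
    if matches == 0:
        return 0.0
    precision = matches / sum(h.values())
    recall = matches / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(nbest):
    """Pick the candidate with the highest expected gain (lowest expected 'damage').

    nbest is a list of (translation, model_score) pairs; scores are
    normalized here so they can be treated as probabilities.
    """
    total = sum(score for _, score in nbest)

    def expected_gain(candidate: str) -> float:
        # Average similarity to every candidate, weighted by model score.
        return sum(
            (score / total) * overlap_gain(candidate, other)
            for other, score in nbest
        )

    return max((text for text, _ in nbest), key=expected_gain)


# Made-up n-best list: the oddball has the best model score.
nbest = [
    ("the rug held one feline", 0.35),
    ("the cat sat on the mat", 0.25),
    ("the cat sat on a mat", 0.20),
    ("the cat is on the mat", 0.20),
]

print(max(nbest, key=lambda pair: pair[1])[0])  # top model score
print(mbr_select(nbest))                        # MBR choice
```

In this toy list the highest-scoring candidate is the odd one out, while the MBR choice is the candidate that most resembles the other good translations, which is the behavior the quote describes.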
Kumar and Macherey went on to say that they improve the diversity of the MBR search by adding candidate translations. They build lattices (a mathematical set, not a fence, though the fence is a decent visual) of translations, which can hold far more candidates than an n-best list, and MBR then searches the lattice for the safest translation. The more candidate translations added to the lattice, the more diversified the search is.