Since search involves people from all over the world speaking a variety of languages, Google takes language translation very seriously. Shankar Kumar and Wolfgang Macherey recently took to the Official Google Research blog to explain more about Google's translation methods.
Specifically, Kumar and Macherey talked about the Minimum Bayes Risk (MBR) criterion, which they use to decide which translation to return to a user. It's best explained in their own words:
Essentially, we look at a sample of the best candidate translations (the so called n-best list) and choose the safest one, the one most likely to do the least amount of damage (where 'damage' is defined by our measurement of translation quality). You might want to view this as choosing a translation that is a lot like the other good translations instead of choosing that strange one that had the good model score.
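To make the idea concrete, here is a minimal Python sketch of MBR selection over an n-best list. It is an illustration only, not Google's implementation: the function names (`similarity`, `mbr_select`), the unigram-overlap quality measure, the example sentences, and the model scores are all assumptions made up for this example. A real system would use a proper translation quality metric as its measure of "damage".

```python
import math
from collections import Counter


def similarity(hyp, ref):
    """Toy quality measure (assumption): unigram F1 overlap between two sentences."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(nbest):
    """Pick the candidate with the highest expected similarity to the others.

    nbest is a list of (translation, log_score) pairs. Scores are turned
    into a probability distribution, and each candidate is rated by its
    probability-weighted similarity to every candidate in the list.
    """
    top = max(score for _, score in nbest)
    weights = [math.exp(score - top) for _, score in nbest]
    total = sum(weights)
    probs = [w / total for w in weights]

    best, best_gain = None, float("-inf")
    for hyp, _ in nbest:
        gain = sum(p * similarity(hyp, other)
                   for (other, _), p in zip(nbest, probs))
        if gain > best_gain:
            best, best_gain = hyp, gain
    return best


# Hypothetical n-best list: the last entry has the best model score
# but looks nothing like the other candidates.
nbest = [
    ("the cat sat on mat", -1.0),
    ("the cat sat on the mat", -1.2),
    ("the cat is sitting on the mat", -1.3),
    ("a feline rested upon rug", -0.9),
]

print(mbr_select(nbest))
```

In this toy list the outlier has the highest model score, yet MBR passes over it in favor of a candidate that agrees with the other good translations, which is the "safest one" behavior described in the quote.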
Kumar and Macherey went on to say that they improve MBR by giving it a more diverse set of candidate translations. Instead of relying only on a short n-best list, they build lattices (a mathematical structure, not a fence, though the fence is a decent visual) that compactly encode many translations, and MBR searches over those. The more candidate translations the lattice holds, the more diversified the search is.
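A rough way to see why a lattice diversifies the candidate set: a handful of edges can stand for far more complete translations than a list of the same size. The Python sketch below is a toy illustration under stated assumptions. The `lattice` dictionary, its words, and the helpers `count_paths` and `expand` are hypothetical, and a real translation lattice would also carry scores on its edges.

```python
from functools import lru_cache

# Hypothetical word lattice: each node maps to (word, next_node) edges.
# Node 4 is the final node. Eight edges in total.
lattice = {
    0: [("the", 1), ("a", 1)],
    1: [("cat", 2), ("feline", 2)],
    2: [("sat", 3), ("rested", 3)],
    3: [("on", 4), ("upon", 4)],
    4: [],
}
FINAL = 4


@lru_cache(maxsize=None)
def count_paths(node):
    """Count complete paths from node to FINAL without listing them."""
    if node == FINAL:
        return 1
    return sum(count_paths(nxt) for _, nxt in lattice[node])


def expand(node, prefix=()):
    """Yield every word sequence encoded by the lattice from node onward."""
    if node == FINAL:
        yield " ".join(prefix)
        return
    for word, nxt in lattice[node]:
        yield from expand(nxt, prefix + (word,))


candidates = list(expand(0))
print(count_paths(0), "candidates from 8 edges")
```

Each extra alternative added at one position multiplies the number of encoded candidates, so the pool MBR compares against grows much faster than the lattice itself.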