Imagine your company’s engineering team has a virtually unlimited budget to translate any language in the world. Your engineers rank among the best in the world at machine translation, a subset of the widely-known (and widely misunderstood) field: AI, or Artificial Intelligence.
Your challenge: Hire specialists in any field to do machine translation better. You’re working on the Next Big Thing. Spend on people, processes, technology. Your choice. Your mission: solve the oldest problem in human history. You’ve got five years.
No celebrity talk-show hosts. You can’t choose Stephen Colbert, Letterman, or Leno to crack jokes or ask you to appear on their shows. Billionaire boys club? You can’t pick at the Brin-Buffett billionaire’s buffet to solve the problem. You’ve got the big bucks. Famous filmmakers? No way. Nix on Spielberg pix only because “AI” boffo boxoffice blockbuster.
No One Laptop Per Child and $100 for the thirsty, the poor, the hungry.
Poets? Yes. Idio-mots? Sure, they’re welcome to contribute a bon mot or two. Slanguists (linguists who decipher slang)? No problem.
You’re invited to fly your team first class to “Machine Translation Summit XI” in Copenhagen, Denmark, at CST (Center for Sprogteknologi, CST, er et center under Københavns Universitet, Det Humanistiske Fakultet), coming up on September 14-17 …2007.
Ruled out this year. Maybe next. Remember, everything is rules-based.
More Rules-based Rules
Anyone who invests in 3.5 billion “It’s A Small World After All” or “I’d Like To Teach The World To Sing” iTunes suffers immediate disqualification. Nor is pirating or P2P distribution of the aforementioned Splenda songs allowed. Due to non-proliferation treaties banning the use of Nutrasweet in music: Band Aid, Barry Manilow, Live Aid, and Live8 concerts? Banned.
No roomful of monkeys typing away at a typewriter for five years. No “let’s all hold hands and group-hug it out.”
Solving the translation problem in an hour on Oprah, Anderson Cooper, or Dr. Phil is cheating. Berlitz is not banned – but you only have five years – and there are only so many headphones to go around.
Instead you might choose to download a few presentations from “Machine Translation Summit XI” in Copenhagen, Denmark. Choose from the best of the best machine translation research, judged by a jury of engineer peers. You may, as I did, enjoy “Faster Beam-Search Decoding for Phrasal Statistical Machine Translation” by Robert Moore and the aptly-named Chris Quirk.
My favorite paper (o, irony) is from the easy-to-translate-in-any-language session entitled “Evaluation,” chaired by Andrei Popescu-Belis. If only for fun, read “Automatic Evaluation of Machine Translation based on Recursive Acquisition of an Intuitive Common Parts Continuum” by Hiroshi Echizen-ya and Kenji Araki. I haven’t read it in Japanese, but I hear the equations and algorithms lose nothing in translation. Download here.
It’s written in the Esperanto of Universal Search (anglais), and of course you can get it translated, but not by a machine.
Who You Gonna … Benennen?
Who would you ask to join the team: linguists, professional translators? Would you invest in more translation technology, or in people to foster sharing of language research and data?
How would you turn your research into a product? (If you want to use the word “productize” you’ll have to translate it into English first.)
What language would you choose to translate first?
How many linguists would you hire and why?
Would you invest in NLP? And would that be Natural Language Processing? Or more specifically neuro-linguistic programming? Do words like NLP make you feel you’re already translating a foreign language?
I know some of the above are the right questions. I hope you’ll help answer them. On the other hand, you may be searching for something online, and the keywords you used may have led you to stumbleupon this column. That’s okay. It’s written for humans, and the search engines may or may not get you exactly where you want to go.
So are search engines. They’re driving your behavior in ways you may not have thought about. Search engines have changed the ways we talk to each other and in what language. Search marketing isn’t warming up. The red-hot search industry has gone way beyond global warming. What’s fueling the industry at every level? Artificial Intelligence. Last year on a conference panel, the moderator asked panelists to define the single most important trend in search. My answer last year: AI.
My answer this year? AI. Guilty of fueling international growth in search, but guilty with an explanation.
Next time out, I’ll explain why. I’ll also share answers I received to some of the (serious) questions posed in this column. I’m not sure I asked all the right questions so I’m sure you’ll answer my questions with questions of your own.
I’m positive Philipp Lenssen of Google Blogoscoped fame gave us the right answer to the wrong question in his excellent analysis: How Well Does Google Translator Cope With Idioms?
In Lenssen’s widely-discussed, well-read, and well-written blog post, he takes Google to task for boasting about their award-winning in-house machine translation efforts while outsourcing Google’s German language translation to a third party.
Next time out: SchadenFreudian slip: Lenssen, no idiot, right on idioms, wrong on ID’ing Google id. Answers live from the Googleplex.