Machines In Translation: Do MT Engineers Dream of Selectric Sheep? Part 2

Part 1 of this series, on machine translation (MT), captured the imagination of a number of marketers and agencies. We had some fun discussing a translation “dream team” on one of the world’s largest social networking sites. This week, let’s bring the MT issues home to search engine marketing and focus on some problems business owners face every day.

SEW Expert Mark Jackson has allowed me to share some details of a confidential inquiry. A reader asked Mark about duplicate content issues in foreign languages. Translation may become the alchemy of online content, turning pedestrian words into “foreign gold” for global marketers; MT promises to solve the problem of duplicate content through translation. One day.

In the meantime, Mark shared some international SEO strategies that have worked for his clients. Foreign language “duplicate” content hasn’t been an issue for them. Yet he’s aware algorithms change, as do the search engine “rules.” Smart SEOs never say never. When Mark believes there may be even the most remote chance that foreign language duplicate content problems may arise, he recommends subdirectories over subdomains.

Since his firm practices white hat SEO, he doesn’t mind sharing key tactics. Mark notes that having a country-specific domain helps search engines identify the target audience of a site. The country extension .fr, for example, tells Google that a site is intended for a French audience in France. Thus search engines tend to place more weight on a client’s .fr site for searches made by that audience. Another tactical advantage of owning country-specific domains: home pages are always more authoritative than interior pages.

On a tactical level, global content and translation are top-of-mind for search marketers. From their clients’ point-of-view, the key question, then, is how to staff up or outsource for site translation.

The LinkedIn Conversation

So I asked a bigger question (on LinkedIn) to help business owners learn more about some of the human versus technology issues related to machine translation. In short, imagine your company must solve global language translation problems. Only machine translation engineers are on your team. Who’d you hire next? Linguists? Translators? How many? Would you spend your budget on more technology, or on more people to foster sharing of language research?

In the past couple of weeks, the conversation took place (in English) in the Answers section on LinkedIn. In the next few weeks, the conversation may move to another location: a moveable feast of brain food. We’ll also continue the conversation, with contributions from LinkedIn members, on our Search Engine Watch blog.

On LinkedIn, Dima (Dan) Itkis, an NPD (new product development) project manager focusing on SaaS (Software as a Service) projects, shared his recommendation for assembling a dream team. His approach reflects his expertise and brings a new and valuable point-of-view to the conversation.

“I’d build a quality control system next, then do a root-cause analysis (RCA) and depending on the findings would fill in the gaps with the appropriate skill sets,” he wrote. Some tools used in RCA might include the “5 Whys,” Pareto analysis, Bayesian inference, barrier analysis, and Ishikawa (fishbone) diagrams.

Some folks weren’t afraid to take on the challenge, freely admitting they’re not experts. Tracy Larimer, a Great Falls, Montana-based enterprise account manager at RightNow Technologies, suggested casting a wider collaboration net.

“My view (which will surely be corrected by folks in the industry). Technology is not there yet,” Tracy wrote. “Machine translation tools can be used to assist human translators, but they cannot yet replace them.”

I think Tracy may be surprised at how many industry experts agree, not only with her take on machine translation, but also with her view of the value of forums. She noted that forum communities encourage a wide range of people to proactively share their expertise. The net gain will be more than a company could otherwise achieve by hiring more people.

While Tracy only intended to answer my posted question, she spoke to us all in other ways, based on our immediate and pressing needs. (Note to self: “SEO/SEM Staffing Crisis” — can forums help solve the industry’s biggest problem?)

Let’s get back to “food for the search marketer’s brain.” I borrowed the phrase “brain food” from Matt Spiegel, the SEW Expert who gave our readers an inside ticket to Google Zeitgeist, the exclusive partner summit. Again, no secrets, but he revealed perhaps one of the funniest one-liners by a CEO I’ve ever heard.

The CEO spoke in English, but he’s French. The Zeitgeist video wasn’t yet posted on The Google Channel on YouTube. Knowing the existence of The Google Channel does not a media company Google make, I turned instead to Google Translate (BETA) for a translation:

“Un (CENSORED) sur la route est un (CENSORED) sur la route.”

The above quote, though appearing (censored), is clearly uncensored. Doesn’t it say, roughly translated, “Uncensored is as uncensored does?” Surely it must have been translated by Forrest Gump, who was, in turn, translated into film and more recently bowdlerized (but in a good way?) in novel form at Google Book Search.

The actual idiom, provided in Google Translate: “Un abruti sur la route est un abruti sur la route.”

Odd. Does it seem something may have been lost in translation? An “abruti” may be somewhat “brutish,” but isn’t “cul” or “kewl” in any vernacular: French, English, Franglais. “Abruti” doesn’t need to be censored by a family newspaper in any country.

SEW Matt explains (no cuts) some of what’s lost in translation at Google Zeitgeist and why. The clue: It has, perhaps, something to do with the rise and fall of man. Put another way:

“Don’t look down, you’re losing altitude.”

How else can we help you translate in context? Let’s see. In French, “Sur la route” means “On The Road,” and not in a hey Jack Kerouac way.

Language, or rather the precise translation of language, will always be a puzzle. Even when we’re all speaking the same language. How we solve problems like “language translation” depends on our background and skills.

So What Do We Mean?

Philip K. Dick asked big questions. He answered them in his fiction. One allusion can be found in the title of this column.

Do engineers dream? Yes. So do androids.

Do they dream of Selectric sheep?

That depends on whether engineers recall the IBM Selectric. Engineers can’t forget the 1982 neo-noir sci-fi flick “Blade Runner,” written by David Peoples and Hampton Fancher, and directed by Ridley Scott (final, final cut in theaters now). Released 25 years ago, when Selectrics were the writing machines used by rolling stoners P.J. O’Rourke, Hunter S. Thompson, and David Sedaris, “Blade Runner” was translated to film from the 1968 novel “Do Androids Dream of Electric Sheep?”

To find the right answers, we sometimes have no choice but to search.

I was asked another question about the headline: What are MT Engineers? Machine translation engineers understand there’s no malice aforethought in the title. Fortunately, no offense was taken by search engine engineers. I think. None have called or written, but that’s OK. I don’t speak their language.

Google does.

The Lost Language of Google

Google has a policy not many people know about. This whispered policy has circled the blogosphere several times, I’m sure. The false version you’ve probably heard has even been uttered in the halls of hotels hosting search engine conferences.

Here’s the version that smacks of provincialism and, in my mind, xenophobia:

The Lie: Google doesn’t require engineers to speak English.

The Truth: Google doesn’t require its employees who aren’t engineers to speak in algorithms.

I’ve asked Google to clarify their official HR policy. When I hear back, I’ll let you know.

Is MT starting to sound more like a subset of search? The database of intentions is at play. People search is usually the fulcrum. Social search helps provide answers. Technology works in the service of people when MT is enhanced by humans.

Next: Nabokov Blogoscoped; Brin’s In; Bookmark this Page.
