Machine translation is incredibly difficult. And to prove that, I will now read this introduction again, after it's been sent through Google's translator -- currently one of the best in the world -- and then translated back into English.

Machine translation is very difficult. Back then translated into English - is one of the best in the world right now - it is to prove that, after being sent through Google's translator, I'll read this again introduced.

Okay, I chose a difficult language, but each one I tried introduced subtle errors in different ways. Via Chinese, it had been translated by “Google hair”. Via French, the introduction became a “he”, not an “it”. And those sentences were incredibly simple.

Folks who only speak one language -- and I am embarrassed to say that's a group that includes me, I'm sorry -- folks who only speak one language often assume that you can open a translation dictionary, pick an appropriate word, faff around with the grammar a bit, and have a functional sentence in another language. For simple sentences, yes, that's true: but very few sentences in the real world are actually that simple.

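To make that dictionary approach concrete, here is a minimal Python sketch. The glossary and the function name `word_for_word` are invented for this example and are not taken from any real translation system. It handles a simple sentence, then falls over on one where the grammar of the two languages does not line up.

```python
# Hypothetical English-to-French glossary, invented for illustration only.
GLOSSARY = {
    "the": "le",
    "cat": "chat",
    "sleeps": "dort",
    "i": "je",
    "miss": "manque",
    "you": "tu",
}


def word_for_word(sentence):
    """Swap each word for its glossary entry, keeping the English word order.

    Words missing from the glossary are kept in square brackets.
    """
    words = sentence.lower().split()
    return " ".join(GLOSSARY.get(word, f"[{word}]") for word in words)


print(word_for_word("The cat sleeps"))  # le chat dort
print(word_for_word("I miss you"))      # je manque tu
```

The first result is fine. The second is not French: the idiomatic sentence is "tu me manques", where the person being missed is the subject, and no amount of one-word lookups will produce that.
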
Google recently released a paper about how they'd reduced machine translation to a problem in vector space mathematics, representations of concepts in an abstract language space. Which is great for mapping concepts to words, and it'll even deal well with homographs, identical words that mean completely different things. You can deal with those through context: the days of “hydraulic ram” being translated as “water sheep” are pretty much in the past.

[OFF SCREEN LAUGHTER]

Spot the engineer.

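As a rough picture of how context can settle a homograph like “ram”, here is a small Python sketch. It is not the method from the Google paper, which isn't named here. It assumes each word and each sense has a vector, and picks the sense whose vector points most nearly the same way as the average of the surrounding words. The vectors, the sense labels, and the function names are all made up for this example.

```python
import math

# Hypothetical 3-dimensional vectors, hand-picked for illustration.
# The axes are roughly (machinery, animal, liquid). Real systems learn
# hundreds of dimensions from text; these numbers come from no real model.
VECTORS = {
    "hydraulic": (0.9, 0.0, 0.6),
    "pump": (1.0, 0.0, 0.4),
    "farmer": (0.1, 0.8, 0.0),
    "wool": (0.0, 1.0, 0.0),
}

# Two senses of the homograph "ram", each with its own made-up vector.
SENSES = {
    "ram": {
        "animal": (0.0, 1.0, 0.0),
        "machine": (1.0, 0.0, 0.3),
    }
}


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_sense(word, context):
    """Return the sense of `word` closest to the average context vector."""
    known = [VECTORS[w] for w in context if w in VECTORS]
    if not known:
        return None  # no usable context, so no basis for choosing
    centre = tuple(sum(axis) / len(known) for axis in zip(*known))
    senses = SENSES[word]
    return max(senses, key=lambda sense: cosine(senses[sense], centre))


print(pick_sense("ram", ["hydraulic", "pump"]))  # machine
print(pick_sense("ram", ["farmer", "wool"]))     # animal
```
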
For formal, technical documents, it might even start to work well. But for more casual communication, it's not so easy. Heck, translating between British English and American English isn't always easy. Not because your car's “hood” is our “bonnet”, but because “that's a brave idea” isn't a compliment in British English, it means you're a prat and your idea is impossible.

There are concepts which don't quite match between languages. “Bonne nuit” might literally mean the same as “buenas noches” -- I'm sorry about my pronunciation there -- but one is meant for saying goodnight at bedtime and the other's for saying hello or goodbye at any point after dark.

Then you have the concepts that don't translate between languages at all. In French, “you” translates as “vous” if it's someone you should be respectful towards, and “tu” if it's a more casual conversation. Or if you're talking to God. No, really. God is “tu”. A computer will crush both of those to “you” when translating to other languages, and it won't have any idea which of them to use when translating into French.

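That loss of information can be shown with a tiny Python sketch. The two-entry lexicon and the helper names `invert` and `round_trip` are hypothetical and only stand in for a real system's vocabulary. Going from French to English is a many-to-one mapping, so the trip back has more than one candidate and nothing in the English word says which to pick.

```python
# Toy French-to-English lexicon, invented for this sketch.
FR_TO_EN = {"tu": "you", "vous": "you"}


def invert(lexicon):
    """Build the reverse lexicon, keeping every candidate for each word."""
    reverse = {}
    for source, target in lexicon.items():
        reverse.setdefault(target, []).append(source)
    return reverse


EN_TO_FR = invert(FR_TO_EN)


def round_trip(word):
    """French -> English -> French, returning all French candidates."""
    english = FR_TO_EN[word]
    return english, sorted(EN_TO_FR[english])


print(round_trip("tu"))    # ('you', ['tu', 'vous'])
print(round_trip("vous"))  # ('you', ['tu', 'vous'])
```

Both inputs come back with the same two options. Choosing between them needs knowledge about the speaker and the listener that was never in the English word.
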
And that is just a simple “honorifics” system. Korean has a much more complicated set of pronouns for all sorts of situations. Remember this? That repeated line: oppan Gangnam style. The English translation of “oppa” is usually “a woman's older brother”: but in everyday speech, “oppa” is used to refer to someone based on a series of complicated and fuzzy rules that make instinctive sense to native speakers. To make it worse, PSY is referring to himself in the third person there, which sounds really weird when translated out of Korean. There is no way to translate all of the meaning in those words into one English sentence.

Then you have the problem of shared expectations. English-speaking cultures tend to be monochronic: if you make an appointment to meet someone at 11am, you are expected to be there at about 11am. I mean, groups of friends can often get around this -- “the party starts at 6” often means people will turn up anywhere from 6:30 to 9. But imagine if that lack of punctuality, and that acceptance of a lack of punctuality, expanded to all aspects of everyday life. Welcome to the rest of the world. Massive parts of this planet run on what is called polychronic time. Two appointments at the same time? That's fine, they'll understand. And they will understand.

Needless to say, there is often quite significant culture clash when monochronic and polychronic people meet. But a machine translation isn't going to see an English sentence like “I'll meet you at 7pm” and add a note for someone in a polychronic culture that, no, they really do mean 7pm, and they're going to be annoyed if you're late.

Ultimately, to accurately translate something, you don't just need to know how words map to concepts: you need to understand social structures, subtext, nuance, innuendo. You need at least a basic theory of mind: the idea that the speaker and the listener both have beliefs and desires expressed by the particular words they've chosen. Translators need to be able to ask questions of the original author, so you can check that the subtleties that you have to add to their work reflect their intention.

The problem isn't that language is messy -- computers can cope with messy, heck, they can pretty much solve CAPTCHAs better than humans these days. The problem is that language relies on intent, on shared secrets, on group identity, and on hidden knowledge. Machine translation is a useful tool, don't get me wrong, but trying to get a machine to translate better than a human is… a brave idea.