How computers translate human language - Ioannis Papachimonas

  • 0:07 - 0:11
    How is it that so many
    intergalactic species in movies and TV
  • 0:11 - 0:14
    just happen to speak perfect English?
  • 0:14 - 0:18
    The short answer is that no one
    wants to watch a starship crew
  • 0:18 - 0:22
    spend years compiling an alien dictionary.
  • 0:22 - 0:23
    But to keep things consistent,
  • 0:23 - 0:27
    the creators of Star Trek
    and other science-fiction worlds
  • 0:27 - 0:31
    have introduced the concept
    of a universal translator,
  • 0:31 - 0:35
    a portable device that can instantly
    translate between any languages.
  • 0:35 - 0:39
    So is a universal translator
    possible in real life?
  • 0:39 - 0:42
    We already have many programs
    that claim to do just that,
  • 0:42 - 0:46
    taking a word, sentence,
    or entire book in one language
  • 0:46 - 0:49
    and translating it into almost any other,
  • 0:49 - 0:52
    whether it's modern English
    or ancient Sanskrit.
  • 0:52 - 0:56
    And if translation were just a matter
    of looking up words in a dictionary,
  • 0:56 - 1:00
    these programs would run circles
    around humans.
  • 1:00 - 1:03
    The reality, however,
    is a bit more complicated.
  • 1:03 - 1:07
    A rule-based translation program
    uses a lexical database,
  • 1:07 - 1:10
    which includes all the words
    you'd find in a dictionary
  • 1:10 - 1:13
    and all grammatical forms they can take,
  • 1:13 - 1:19
    and a set of rules to recognize the basic
    linguistic elements in the input language.
  • 1:19 - 1:22
    For a seemingly simple sentence like,
    "The children eat the muffins,"
  • 1:22 - 1:27
    the program first parses its syntax,
    or grammatical structure,
  • 1:27 - 1:30
    by identifying "the children"
    as the subject,
  • 1:30 - 1:32
    and the rest of the sentence
    as the predicate
  • 1:32 - 1:34
    consisting of a verb "eat,"
  • 1:34 - 1:37
    and a direct object "the muffins."
  • 1:37 - 1:40
    It then needs to recognize
    English morphology,
  • 1:40 - 1:45
    or how the language can be broken down
    into its smallest meaningful units,
  • 1:45 - 1:46
    such as the word "muffin"
  • 1:46 - 1:50
    and the suffix "s,"
    used to indicate plural.
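
The morphology step just described can be sketched in a few lines of Python. This is a minimal illustration, not how any particular translation program works: the tiny `LEXICON`, the `IRREGULAR_PLURALS` table, and the `split_morphemes` function are all hypothetical names made up for this example.

```python
# Hypothetical toy lexicon; a real lexical database would be far larger.
LEXICON = {"muffin", "child", "eat"}

# Irregular forms have to be listed explicitly, since no suffix rule covers them.
IRREGULAR_PLURALS = {"children": "child"}


def split_morphemes(word):
    """Return (stem, marker) for a toy subset of English nouns."""
    word = word.lower()
    if word in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[word], "plural"
    # Regular case: a known stem followed by the plural suffix "s".
    if word.endswith("s") and word[:-1] in LEXICON:
        return word[:-1], "plural"
    return word, None


print(split_morphemes("muffins"))
print(split_morphemes("children"))
```

Even this toy version needs a separate table for "children," which hints at why exceptions are hard for rule-based systems.
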
  • 1:50 - 1:52
    Finally, it needs to understand
    the semantics,
  • 1:52 - 1:56
    what the different parts of the sentence
    actually mean.
  • 1:56 - 1:58
    To translate this sentence properly,
  • 1:58 - 2:02
    the program would refer to a different set
    of vocabulary and rules
  • 2:02 - 2:05
    for each element of the target language.
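
As a rough sketch of the whole rule-based pipeline, the Python below parses the example sentence, analyzes its nouns, and looks each element up in a target-language table. Italian is used as the target only because it appears later in the talk. Everything here is an assumption for illustration: the `LEXICON` and `ARTICLE` tables, the function names, and the fixed "the N V the N" sentence pattern are hypothetical and cover this one sentence shape only.

```python
# Toy English-to-Italian lexicon (hypothetical; covers only this example).
LEXICON = {
    ("child", "plural"): "bambini",
    ("muffin", "plural"): "muffin",  # loanword, unchanged in the Italian plural
    ("eat", "3rd-plural"): "mangiano",
}
ARTICLE = {"plural-masculine": "i"}


def parse(sentence):
    """Syntax: split a 'the N V the N' sentence into subject, verb, object."""
    words = sentence.lower().rstrip(".").split()
    if len(words) != 5 or words[0] != "the" or words[3] != "the":
        raise ValueError("toy parser only handles 'the N V the N'")
    return {"subject": words[1], "verb": words[2], "object": words[4]}


def analyze_noun(word):
    """Morphology: reduce a noun to (stem, number)."""
    if word == "children":
        return ("child", "plural")
    if word.endswith("s"):
        return (word[:-1], "plural")
    return (word, "singular")


def translate(sentence):
    """Look up each parsed element in the target-language tables."""
    parts = parse(sentence)
    subject = LEXICON[analyze_noun(parts["subject"])]
    obj = LEXICON[analyze_noun(parts["object"])]
    # Simplification: the verb form is hardcoded for a plural subject.
    verb = LEXICON[(parts["verb"], "3rd-plural")]
    the = ARTICLE["plural-masculine"]
    return f"{the} {subject} {verb} {the} {obj}".capitalize()


print(translate("The children eat the muffins."))
```

Any sentence outside the hardcoded pattern or vocabulary raises an error, which is the point: every new construction needs another rule or another table entry.
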
  • 2:05 - 2:07
    But this is where it gets tricky.
  • 2:07 - 2:12
    The syntax of some languages
    allows words to be arranged in any order,
  • 2:12 - 2:17
    while in others, doing so could make
    the muffin eat the child.
  • 2:17 - 2:20
    Morphology can also pose a problem.
  • 2:20 - 2:23
    Slovene distinguishes between
    two children and three or more
  • 2:23 - 2:27
    using a dual suffix absent
    in many other languages,
  • 2:27 - 2:31
    while Russian's lack of definite articles
    might leave you wondering
  • 2:31 - 2:34
    whether the children are eating
    some particular muffins,
  • 2:34 - 2:37
    or just eat muffins in general.
  • 2:37 - 2:40
    Finally, even when the semantics
    are technically correct,
  • 2:40 - 2:43
    the program might miss their finer points,
  • 2:43 - 2:46
    such as whether the children
    "mangiano" the muffins,
  • 2:46 - 2:48
    or "divorano" them.
  • 2:48 - 2:52
    Another method is
    statistical machine translation,
  • 2:52 - 2:56
    which analyzes a database
    of books, articles, and documents
  • 2:56 - 2:59
    that have already
    been translated by humans.
  • 2:59 - 3:03
    By finding matches between source
    and translated text
  • 3:03 - 3:05
    that are unlikely to occur by chance,
  • 3:05 - 3:09
    the program can identify corresponding
    phrases and patterns,
  • 3:09 - 3:12
    and use them for future translations.
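
The idea of finding matches that are unlikely to occur by chance can be illustrated with a simple co-occurrence score. The sketch below uses the Dice coefficient over a four-sentence parallel corpus; this is a stand-in chosen for brevity, not the method any real system is confirmed to use (production statistical systems rely on much richer alignment models). The `PARALLEL` corpus and the `best_matches` function are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical human-translated sentence pairs (English, Italian).
PARALLEL = [
    ("the children eat", "i bambini mangiano"),
    ("the children sleep", "i bambini dormono"),
    ("the dogs eat", "i cani mangiano"),
    ("the dogs sleep", "i cani dormono"),
]


def best_matches(pairs):
    """Map each source word to the target word it co-occurs with most reliably."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        # Count every source/target word pair seen in the same sentence pair.
        joint.update(product(s_words, t_words))

    best = {}
    for (s, t), n in joint.items():
        # Dice coefficient: high only when the two words almost always appear together.
        score = 2 * n / (src_count[s] + tgt_count[t])
        if score > best.get(s, ("", 0.0))[1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}


matches = best_matches(PARALLEL)
print(matches["children"], matches["eat"], matches["the"])
```

With so few sentences the pairings are clean, but a word that appears only once would pair equally well with every word in its translated sentence, which is why corpus size matters so much.
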
  • 3:12 - 3:15
    However, the quality
    of this type of translation
  • 3:15 - 3:18
    depends on the size
    of the initial database
  • 3:18 - 3:21
    and the availability of samples
    for certain languages
  • 3:21 - 3:23
    or styles of writing.
  • 3:23 - 3:27
    The difficulty that computers have
    with the exceptions, irregularities
  • 3:27 - 3:31
    and shades of meaning
    that seem to come instinctively to humans
  • 3:31 - 3:35
    has led some researchers to believe
    that our understanding of language
  • 3:35 - 3:39
    is a unique product
    of our biological brain structure.
  • 3:39 - 3:43
    In fact, one of the most famous
    fictional universal translators,
  • 3:43 - 3:46
    the Babel fish from
    "The Hitchhiker's Guide to the Galaxy,"
  • 3:46 - 3:50
    is not a machine at all
    but a small creature
  • 3:50 - 3:54
    that translates the brain waves
    and nerve signals of sentient species
  • 3:54 - 3:57
    through a form of telepathy.
  • 3:57 - 4:00
    For now, learning a language
    the old-fashioned way
  • 4:00 - 4:05
    will still give you better results than
    any currently available computer program.
  • 4:05 - 4:07
    But this is no easy task,
  • 4:07 - 4:09
    and the sheer number
    of languages in the world,
  • 4:09 - 4:13
    as well as the increasing interaction
    between the people who speak them,
  • 4:13 - 4:18
    will only continue to spur greater
    advances in automatic translation.
  • 4:18 - 4:21
    Perhaps by the time we encounter
    intergalactic life forms,
  • 4:21 - 4:25
    we'll be able to communicate with them
    through a tiny gizmo,
  • 4:25 - 4:29
    or we might have to start compiling
    that dictionary, after all.
Title:
How computers translate human language - Ioannis Papachimonas
Speaker:
Ioannis Papachimonas
Description:

View full lesson: http://ed.ted.com/lessons/how-computers-translate-human-language-ioannis-papachimonas

Is a universal translator possible in real life? We already have many programs that claim to be able to take a word, sentence, or entire book in one language and translate it into almost any other. The reality, however, is a bit more complicated. Ioannis Papachimonas shows how these machine translators work, and explains why they often get a bit mixed up.

Lesson by Ioannis Papachimonas, animation by NOWAY Video Club.

Video Language:
English
Team:
closed TED
Project:
TED-Ed
Duration:
04:45

English subtitles
