
Tatoeba Project - Open, collaborative, multilingual dictionary of sentences

  • 0:00 - 0:05
    Tatoeba: A bridge between languages.
  • 0:06 - 0:11
    What is Tatoeba?
  • 0:11 - 0:14
    Tatoeba is a language dictionary.
  • 0:14 - 0:16
    You can search words
  • 0:16 - 0:18
    and get translations.
  • 0:19 - 0:23
    But it's not exactly a typical dictionary.
  • 0:23 - 0:25
    It's all about sentences,
  • 0:25 - 0:27
    not words.
  • 0:27 - 0:30
    You can search sentences containing a certain word
  • 0:30 - 0:34
    and get translations for these sentences.
  • 0:34 - 0:37
    "Why sentences?" you may ask.
  • 0:37 - 0:41
    Well, because sentences are more interesting.
  • 0:41 - 0:43
    Sentences bring context to the words.
  • 0:43 - 0:46
    Sentences have personalities.
  • 0:46 - 0:49
    They can be funny, smart, silly,
  • 0:49 - 0:50
    insightful, touching,
  • 0:50 - 0:52
    hurtful.
  • 0:52 - 0:54
    Sentences can teach us a lot,
  • 0:54 - 0:57
    and a lot more than just words.
  • 0:57 - 1:00
    So we love sentences.
  • 1:00 - 1:04
    But, even more, we love languages.
  • 1:04 - 1:07
    And what we really want is to have many sentences
  • 1:07 - 1:10
    in many—and any—languages.
  • 1:11 - 1:14
    This is why Tatoeba is multilingual.
  • 1:15 - 1:18
    But not that kind of multilingual—
  • 1:18 - 1:20
    not the kind where languages
  • 1:20 - 1:22
    are being simply paired up together,
  • 1:22 - 1:25
    and where some pairs are left behind.
  • 1:25 - 1:28
    Tatoeba is really multilingual.
  • 1:28 - 1:32
    All the languages are interconnected.
  • 1:32 - 1:37
    If an Icelandic sentence has a translation in English,
  • 1:37 - 1:41
    and the English sentence has a translation in Swahili,
  • 1:41 - 1:45
    then indirectly, this will provide a Swahili translation
  • 1:45 - 1:47
    for the Icelandic sentence.
  • 1:48 - 1:53
    Languages that would have never found themselves together in a traditional system
  • 1:53 - 1:56
    can be connected in Tatoeba.
  • 1:56 - 1:58
    Awesome, right?
  • 1:59 - 2:02
    But where do we get the sentences?
  • 2:02 - 2:04
    And how do we translate them?
  • 2:04 - 2:08
    Obviously, this cannot be the work of one person.
  • 2:09 - 2:12
    This is why Tatoeba is collaborative.
  • 2:13 - 2:15
    Everyone is free to contribute.
  • 2:15 - 2:19
    And everyone has the ability to contribute.
  • 2:19 - 2:22
    It doesn't require you to be a polyglot.
  • 2:22 - 2:24
    Everyone speaks a language.
  • 2:24 - 2:26
    Everyone can feed the database
  • 2:26 - 2:29
    to illustrate new vocabulary.
  • 2:29 - 2:33
    Everyone can help ensure that sentences sound correct,
  • 2:33 - 2:35
    and are correctly spelled.
  • 2:35 - 2:40
    And actually, this project needs everyone.
  • 2:40 - 2:43
    Languages are not carved in stone.
  • 2:43 - 2:46
    Languages live through all of us.
  • 2:46 - 2:50
    We want to capture all the uniqueness of each language.
  • 2:50 - 2:54
    And we want to capture their evolution through time.
  • 2:54 - 2:56
    But you know, it would be sad
  • 2:56 - 3:01
    to collect all these sentences and keep them for ourselves.
  • 3:01 - 3:04
    Because there's so much you can do with them.
  • 3:04 - 3:08
    Which is why Tatoeba is open.
  • 3:08 - 3:09
    Our source code is open,
  • 3:09 - 3:12
    our data is open.
  • 3:12 - 3:14
    We're releasing all the sentences we collect
  • 3:14 - 3:18
    under the Creative Commons Attribution license.
  • 3:18 - 3:22
    This means you can reuse them freely for a textbook,
  • 3:22 - 3:24
    for an application,
  • 3:24 - 3:26
    for a research project,
  • 3:26 - 3:29
    for anything!
  • 3:29 - 3:32
    So that's Tatoeba,
  • 3:32 - 3:35
    but that's not the whole picture.
  • 3:35 - 3:39
    Tatoeba is not just an open, collaborative,
  • 3:39 - 3:42
    multilingual dictionary of sentences.
  • 3:43 - 3:46
    It's part of an ecosystem that we want to build.
  • 3:46 - 3:50
    We want to bring language tools to the next level.
  • 3:50 - 3:54
    We want to see innovation in the language learning landscape.
  • 3:54 - 3:59
    And this cannot happen without open language resources,
  • 3:59 - 4:02
    which cannot be built without a community,
  • 4:02 - 4:06
    which cannot contribute without efficient platforms.
  • 4:07 - 4:10
    So ultimately, with Tatoeba,
  • 4:10 - 4:13
    we are only building the foundations
  • 4:13 - 4:14
    to make the Web
  • 4:14 - 4:23
    a better place for language learning.
Title:
Tatoeba Project - Open, collaborative, multilingual dictionary of sentences
Description:

Video presenting the key ideas behind the Tatoeba Project (http://tatoeba.org/).

This is a first version, and hopefully not the last, because it can be hugely improved.

If anyone out there has more extensive experience in video editing than I do, a better microphone, a better voice, better graphic skills... please contact me =]

---------------------------------

Link to the Prezi presentation this video was based on:
http://prezi.com/i-f9vmxoxkym/tatoeba/

The Prezi presentation has been translated...
- into German (thank you jakov): http://prezi.com/jkptitff3d8i/tatoeba-einfuhrungsvideo-deu/
- into Turkish (thank you boracasli): http://prezi.com/teo-vffiex8x/tatoeba-turkce-prezisi/

---------------------------------

Video Language:
English
Duration:
04:17
