0:00:00.000,0:00:04.894 Tatoeba: A bridge between languages.
0:00:05.961,0:00:11.279 What is Tatoeba?
0:00:11.387,0:00:14.317 Tatoeba is a language dictionary.
0:00:14.434,0:00:16.010 You can search words
0:00:16.010,0:00:17.926 and get translations.
0:00:18.541,0:00:22.570 But it's not exactly a typical dictionary.
0:00:23.277,0:00:25.415 It's all about sentences,
0:00:25.415,0:00:26.717 not words.
0:00:26.717,0:00:30.191 You can search sentences containing a certain word
0:00:30.191,0:00:33.696 and get translations for these sentences.
0:00:34.327,0:00:37.077 "Why sentences?" you may ask.
0:00:37.077,0:00:40.642 Well, because sentences are more interesting.
0:00:40.688,0:00:43.345 Sentences bring context to the words.
0:00:43.345,0:00:45.797 Sentences have personalities.
0:00:45.797,0:00:48.538 They can be funny, smart, silly,
0:00:48.538,0:00:50.378 insightful, touching,
0:00:50.378,0:00:51.763 hurtful.
0:00:51.886,0:00:54.338 Sentences can teach us a lot,
0:00:54.338,0:00:56.745 and a lot more than just words.
0:00:57.160,0:00:59.628 So we love sentences.
0:01:00.074,0:01:03.677 But, even more, we love languages.
0:01:03.677,0:01:07.265 And what we really want is to have many sentences
0:01:07.265,0:01:10.320 in many, and any, languages.
0:01:10.751,0:01:14.218 This is why Tatoeba is multilingual.
0:01:14.880,0:01:17.588 But not that kind of multilingual,
0:01:17.588,0:01:19.618 not the kind where languages
0:01:19.618,0:01:22.111 are being simply paired up together,
0:01:22.111,0:01:24.637 and where some pairs are left behind.
0:01:25.067,0:01:28.286 Tatoeba is really multilingual.
0:01:28.286,0:01:31.726 All the languages are interconnected.
0:01:32.188,0:01:36.788 If an Icelandic sentence has a translation in English,
0:01:36.788,0:01:40.708 and the English sentence has a translation in Swahili,
0:01:40.708,0:01:45.114 then indirectly, this will provide a Swahili translation
0:01:45.114,0:01:47.452 for the Icelandic sentence.
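[Editor's sketch] The indirect translation described here can be pictured as a small graph: sentences are nodes, direct translations are links, and an indirect translation is a sentence two links away. The Python below is only an illustration under stated assumptions. The sentence ids, the sample sentences, and the function names `build_graph` and `translations` are all hypothetical, and nothing here is claimed to reflect how Tatoeba actually stores or queries its data.

```python
from collections import defaultdict, deque

# Hypothetical sample data (not real Tatoeba records): id -> (language, text).
SENTENCES = {
    1: ("Icelandic", "Ég elska tungumál."),
    2: ("English", "I love languages."),
    3: ("Swahili", "Ninapenda lugha."),
}

# Direct translation links between sentence ids, assumed to work in both directions.
LINKS = [(1, 2), (2, 3)]


def build_graph(links):
    """Build an undirected adjacency map from pairs of linked sentence ids."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def translations(graph, start, max_hops=2):
    """Return {sentence_id: hops} for sentences reachable within max_hops links.

    hops == 1 is a direct translation; hops == 2 is an indirect one.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[current] + 1
                queue.append(neighbor)
    del seen[start]  # the starting sentence is not a translation of itself
    return seen


if __name__ == "__main__":
    graph = build_graph(LINKS)
    for sentence_id, hops in translations(graph, 1).items():
        language, text = SENTENCES[sentence_id]
        kind = "direct" if hops == 1 else "indirect"
        print(f"{language} ({kind}): {text}")
```

In this toy data, the Swahili sentence is reached from the Icelandic one only through the English sentence, which is the situation the narration describes. The `max_hops` limit is a design choice of the sketch: longer chains are possible in a graph like this, but each extra hop is a weaker guarantee that the meaning is preserved.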
0:01:47.883,0:01:52.959 Languages that would have never found themselves together in a traditional system
0:01:52.959,0:01:56.003 can be connected in Tatoeba.
0:01:56.003,0:01:58.052 Awesome, right?
0:01:58.652,0:02:01.717 But where do we get the sentences?
0:02:01.717,0:02:04.129 And how do we translate them?
0:02:04.129,0:02:08.188 Obviously, this cannot be the work of one person.
0:02:08.726,0:02:12.452 This is why Tatoeba is collaborative.
0:02:12.575,0:02:15.240 Everyone is free to contribute.
0:02:15.240,0:02:19.243 And everyone has the ability to contribute.
0:02:19.243,0:02:22.148 It doesn't require you to be a polyglot.
0:02:22.148,0:02:24.262 Everyone speaks a language.
0:02:24.262,0:02:26.037 Everyone can feed the database
0:02:26.037,0:02:28.704 to illustrate new vocabulary.
0:02:28.704,0:02:32.748 Everyone can help ensure that sentences sound correct
0:02:32.748,0:02:35.082 and are correctly spelled.
0:02:35.082,0:02:39.760 And actually, this project needs everyone.
0:02:39.760,0:02:42.728 Languages are not carved in stone.
0:02:42.728,0:02:45.766 Languages live through all of us.
0:02:45.766,0:02:50.004 We want to capture all the uniqueness of each language.
0:02:50.004,0:02:54.122 And we want to capture their evolution through time.
0:02:54.122,0:02:56.044 But you know, it would be sad
0:02:56.044,0:03:00.520 to collect all these sentences and keep them for ourselves.
0:03:00.520,0:03:04.360 Because there's so much you can do with them.
0:03:04.360,0:03:07.571 Which is why Tatoeba is open.
0:03:07.571,0:03:09.160 Our source code is open,
0:03:09.160,0:03:11.983 our data is open.
0:03:11.983,0:03:13.972 We're releasing all the sentences we collect
0:03:13.972,0:03:17.775 under the Creative Commons Attribution license.
0:03:18.006,0:03:22.281 This means you can reuse them freely for a textbook,
0:03:22.281,0:03:23.994 for an application,
0:03:23.994,0:03:26.252 for a research project,
0:03:26.252,0:03:29.083 for anything!
0:03:29.452,0:03:31.917 So that's Tatoeba,
0:03:31.917,0:03:35.019 but that's not the whole picture.
0:03:35.342,0:03:38.923 Tatoeba is not just an open, collaborative,
0:03:38.923,0:03:42.373 multilingual dictionary of sentences.
0:03:42.819,0:03:46.382 It's part of an ecosystem that we want to build.
0:03:46.382,0:03:49.951 We want to bring language tools to the next level.
0:03:49.951,0:03:54.153 We want to see innovation in the language learning landscape.
0:03:54.153,0:03:58.671 And this cannot happen without open language resources,
0:03:58.671,0:04:02.138 which cannot be built without a community,
0:04:02.138,0:04:06.231 which cannot contribute without efficient platforms.
0:04:06.877,0:04:09.841 So ultimately, with Tatoeba,
0:04:09.841,0:04:12.960 we are only building the foundations
0:04:12.960,0:04:14.444 to make the Web
0:04:14.444,0:04:23.298 a better place for language learning.