Hello. Well, let me start by asking you a question: How many of you have had to fill out some sort of web form where you've been asked to read a distorted sequence of characters? How many of you found it really annoying? Okay, outstanding. So I invented that. (Laughter) (Applause) That thing is called a CAPTCHA. And it is there to make sure the entity filling out the form is actually a human and not some sort of computer program that was written to submit the form millions and millions of times. The reason it works is that humans have no trouble reading these squiggly characters, whereas computer programs simply can't do it as well yet. For example, when you're buying tickets online to attend a concert, the reason you have to type these distorted characters is to prevent scalpers from writing a program that can buy millions of tickets, two at a time. CAPTCHAs are used all over the Internet. And since they're used so often, a lot of times the precise sequence of random characters shown to the user is not so fortunate. So this is an example from Yahoo. The random characters that happened to be shown to the user were W, A, I, T, which spells a word. But the best part is the message that the Yahoo help desk got about 20 minutes later. ["Help! I've been waiting for over 20 minutes, and nothing happens."] (Laughter) This, of course, is not as bad as this poor person. [REBOOT] (Laughter) I could tell funny stories about CAPTCHAs for hours, but since I cannot do that, let me tell you about a project that we did afterwards, which is sort of the next evolution of CAPTCHA. We call it reCAPTCHA, which is something that we started at the university, and then we turned it into a startup company. And then Google acquired this company. So everything I'm going to say for the next five minutes is owned by Google. So, please, do not spread the word. So let me tell you how this project started. It turns out that about 200 million CAPTCHAs are typed every day.
When I first heard this, I was quite proud of myself. I thought, "Look at the impact that my research has had." But then I started feeling bad. Not only are CAPTCHAs obnoxious, but each time you type one you essentially waste 10 seconds of your time. And if you multiply that by 200 million, you get that humanity as a whole is wasting about 500,000 hours every day typing these annoying CAPTCHAs. So then I started feeling bad. And then I started thinking, is there any way we can use this effort for something that is good for humanity? While you're typing a CAPTCHA, during those 10 seconds, your brain is doing something amazing. Your brain is doing something that computers cannot yet do. So can we get you to do some useful work for mankind? Putting it differently, is there some humongous problem that we cannot yet get computers to solve, but that we can split into tiny chunks such that each time somebody solves a CAPTCHA they solve a little bit of this problem? The answer to that is "yes," and this is what we're doing now. So what you may not know is that nowadays, while you're typing a CAPTCHA, not only are you authenticating yourself as a human, but in addition you're actually helping us to digitize books. So let me explain how this works. There are a lot of projects out there trying to digitize the existing books. Google is digitizing books. Amazon, with the Kindle, is digitizing books. The way this works is you start with an old book. You've seen those things, right? Like a book? (Laughter) So you start with a book, and then you scan it. Now, scanning a book is like taking a digital photograph of every page. The next step in the process is that the computer needs to be able to decipher all of the words in this image. Now, the problem is that for older books that were written many years ago, the computer cannot recognize a lot of the words, because the ink has faded and the pages have turned yellow.
Thus the words look a bit different and the computer cannot recognize them. So, for books that were written more than 50 years ago, the computer cannot recognize about 30 percent of the words. So what we're doing now is we're taking all of the words that the computer cannot recognize and we're getting people to read them for us while they're typing a CAPTCHA on the Internet. So, the next time you type a CAPTCHA (Applause), these words that you're typing are actually words that are coming from books that are being digitized and that the computer could not recognize. Now, the reason we have two words nowadays instead of one is that we need to verify that the answer is correct. One of the words is a word for which the system already knows the answer, and the other is a word that the system just got out of a book and could not recognize. Both are presented to you. We're going to ask you to type both words. And we won't tell you which one's which. And if you type the correct word for the one for which the system already knows the answer, it assumes you are human, and it also gets some confidence that you typed the other word correctly. And if we repeat this process with about 10 different people and all of them agree on what the new word is, we are very confident that this new word was accurately digitized. So this is how the system works. The good thing is that it has been very successful. We're digitizing about 100 million words a day, which is the equivalent of about two million books a year. And this is all being done one word at a time, just by people typing CAPTCHAs on the Internet. Now, since we're doing so many words per day, funny things can happen. And this is especially true because now we're giving people two randomly chosen English words next to each other. So funny things can happen. For example, we presented this word. It's the word "Christians"; there's nothing wrong with it.
But if you present it along with another randomly chosen word, bad things can happen. So we get this. [bad Christians] (Laughter) It's quite funny. But it's even worse, because the particular website where we showed this actually happened to be called The Embassy of the Kingdom of God. (Laughter) Oops! Here's another really bad one. An American politician's website, JohnEdwards.com. [Damn liberal] (Laughter) So we keep on insulting people every day. Now, we're not just insulting people. Quite often, interesting things can happen. So this actually has given rise to an Internet meme that millions of people have participated in, which is called CAPTCHA art. Here's how it works. Imagine you're using the Internet and you see a CAPTCHA that you think is somewhat peculiar, like this CAPTCHA. Then what you're supposed to do is take a screenshot of it. Then, of course, please fill out the CAPTCHA, because that helps us digitize a book. But first you take a screenshot, and then you draw something that is related to it, like this. [invisible toaster] (Laughter) It's just an example of CAPTCHA art. There are tens of thousands of these. Some of them are interesting. Some of them are very cute. [clenched it!] Some of them are funnier. [stoned founders] (Laughter) This is my favorite number of this whole project: 900 million. This is the number of distinct people who have helped us digitize at least one word out of a book through reCAPTCHA. A little over 10 percent of the world's population has helped digitize human knowledge. And it is numbers like these that motivate my research agenda. So the question that motivates my research is the following: If you look at humanity's large-scale achievements, these really big things that humanity has gotten together to do, like building the pyramids of Egypt or the Panama Canal or putting a man on the Moon, there is a curious fact about them, and it is that they were all done with about the same number of people.
They were all done with about 100,000 people. We can ask ourselves why it is that all of them used about the same number of people. The reason is that, before the Internet, coordinating more than 100,000 people was impossible. But now with the Internet, I've just shown you a project where we've coordinated 900 million people. So the question that motivates my research is, if we can put a man on the Moon with 100,000 people, what can we do with 100 million people? Based on this question, we've been working on a lot of projects. I will not tell you about everything we have done. But let me tell you about one that we are working on now. We've been working on this for about two years now. And we're going to launch it in about 30 days. It's called Duolingo. This project started by asking the following question: How can we get 100 million people translating the Web into every major language for free? So there are a lot of things to say about this question. First of all, translating the Web. Right now the Web is partitioned into multiple languages. A large fraction of it is in English. If you don't know any English, you can't access it. But large fractions are in other languages, and if you don't know those languages, you can't access them. I would like to translate all of the Web into every major language. Now some of you may say, why can't we use computers to translate? Machine translation nowadays is starting to translate some sentences. Well, the problem with that is that it's not yet good enough, and it probably won't be for the next 20 to 30 years. So let me show you an example of something that was translated by a machine. Actually, it was from a forum about programming questions. It was a programming question translated from Japanese into English, and from there into Spanish, though my translation is good. The other one is bad. You'll see. So I'll just let you read. This person starts by apologizing for the machine translation.
Indeed, this was done with the best translation program from Japanese into English. Remember, it's a question about computer programming. So here you have the preamble to the question. [At often, the goat-time install a error is vomit.] (Laughter) Then comes the first part of the question. [How many times like the wind, a pole, and the dragon?] (Laughter) Then comes my favorite part of the question. [This insult to father's stones?] (Laughter) And then comes my favorite part of the whole thing. [Please apologize for your stupidity. There are a many thank you.] (Laughter) Okay, so computer translation isn't yet good enough. We need people to translate. So what I want is to get 100 million people translating the Web into every major language for free. I couldn't afford to pay 100 million people for the job, so I want them to do it for free. Now, if this is what you want to do, you pretty quickly realize you're going to run into two pretty big hurdles. The first one is a lack of bilinguals. So I don't even know if there exist 100 million people out there using the Web who are bilingual enough to help us translate. That's a big problem. The other problem is a lack of motivation. How are we going to motivate people to actually translate the Web for free? After thinking about this for months, we realized there's actually a way to solve both these problems with the same solution. We realized that there's a way to kill two birds with one stone. And that is to transform language translation into something that millions of people want to do, and that also helps with the problem of lack of bilinguals, and that is language education. It turns out there are millions of people wanting to learn other languages. Today there are over 1.2 billion people learning a foreign language. And it's not just because they're being forced to do so in school.
For example, in the United States alone, there are over 5 million people who have paid over $500 for software to learn a new language. Many people want to learn a new language. So what we've been working on for the last two years is a new website called Duolingo, where the basic idea is that people learn a new language for free while simultaneously translating the Web. And so they're learning by doing. So this is how it works. Whenever you're just a beginner, we give you very, very simple sentences from the Web. You are asked to "translate this sentence," and if you don't know a word, we'll tell you what each word means. And it turns out that it really works. Even though people know nothing of the language, if we explain what each word means, they'll be able to translate it. As you see how other people translate the same sentence, you start learning the language. And as you get more and more advanced, we give you more and more complex sentences to translate. This is how you are going to help us translate. This is how the site works. We're mostly done building it, and now we're testing it. When we started working on this, I didn't think it could work, really. But it turns out that it works, indeed. It's amazing. First, people really can learn a language with it. In this case we are testing it with people who know English and want to learn Spanish, and vice versa. So people really do learn a language. And they learn it about as well as with the leading language learning software, which is very good. But perhaps more surprisingly, the translations that we get from people using the site are very good. They are as accurate as those of professional language translators. Now of course, we play a trick here, and it is that we combine the translations of multiple beginners, several students, and choose the best. But it turns out that that best translation is as good as those of professional language translators.
Now, even though we're combining multiple translations, another good thing about Duolingo is that the site actually can translate pretty fast. So let me show you an estimate of how fast we could translate. Suppose we wanted to translate Wikipedia from English into Spanish. Of course, Wikipedia exists in Spanish, but it is much smaller than its English counterpart, about 20 percent of its size. Using Duolingo, we could do it in five weeks with 100,000 active users learning English with Duolingo. And we could do it in about 80 hours with a million active users. Since all the projects that my group has worked on so far have gotten millions of users, we're hopeful that we'll be able to translate the Web for free. (Applause) The last thing I'd like to leave you with is this: we haven't yet launched Duolingo, but we plan to do so in about 30 days. If you visit Duolingo.com, you can sign up to be part of our private beta, which starts in about 30 days. Help us. Thank you. (Applause)
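
For readers who want to see the two-word check from the reCAPTCHA part of the talk in concrete form, here is a minimal sketch in Python. It is not reCAPTCHA's actual implementation: the class name `TwoWordCaptcha`, the method `submit`, the word identifiers, and the case-insensitive matching are all hypothetical choices made for illustration. The default threshold of 10 comes from the "about 10 different people" mentioned in the talk, and the sketch simplifies "all of them agree" to "one answer has collected that many matching votes."

```python
from collections import Counter, defaultdict

AGREEMENT_THRESHOLD = 10  # the talk mentions roughly 10 people agreeing


class TwoWordCaptcha:
    """Toy model of the known-word / unknown-word check (hypothetical design)."""

    def __init__(self, threshold=AGREEMENT_THRESHOLD):
        self.threshold = threshold
        # unknown word id -> tally of answers from users who passed the known word
        self.votes = defaultdict(Counter)
        # unknown word id -> transcription accepted as digitized
        self.digitized = {}

    def submit(self, known_answer, typed_known, unknown_id, typed_unknown):
        """Return True if the user is treated as human, False otherwise."""
        if typed_known.strip().lower() != known_answer.lower():
            # Failed the word the system already knows: ignore the other answer.
            return False
        guess = typed_unknown.strip().lower()
        if unknown_id not in self.digitized:
            self.votes[unknown_id][guess] += 1
            # Accept the transcription once enough users have typed the same thing.
            if self.votes[unknown_id][guess] >= self.threshold:
                self.digitized[unknown_id] = guess
        return True


if __name__ == "__main__":
    captcha = TwoWordCaptcha(threshold=3)  # small threshold just for the demo
    for _ in range(3):
        captcha.submit("morning", "morning", "scan-1", "overlooks")
    print(captcha.digitized)
```

The user never learns which word was the known one, so the only way to pass reliably is to type both words carefully, which is what lets the unknown word's answer be trusted.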