Our emotions influence every aspect of our lives, from our health and how we learn to how we do business and make decisions, big ones and small. Our emotions also influence how we connect with one another. We've evolved to live in a world like this, but instead, we're living more and more of our lives like this -- this is the text message from my daughter last night -- in a world that's devoid of emotion. So I'm on a mission to change that. I want to bring emotions back into our digital experiences.

I started on this path 15 years ago. I was a computer scientist in Egypt, and I had just gotten accepted to a Ph.D. program at Cambridge University. So I did something quite unusual for a young newlywed Muslim Egyptian wife: with the support of my husband, who had to stay in Egypt, I packed my bags and I moved to England. At Cambridge, thousands of miles away from home, I realized I was spending more hours with my laptop than I did with any other human. Yet despite this intimacy, my laptop had absolutely no idea how I was feeling. It had no idea if I was happy, having a bad day, or stressed, confused, and so that got frustrating.
Even worse, as I communicated online with my family back home, I felt that all my emotions disappeared in cyberspace. I was homesick, I was lonely, and on some days I was actually crying, but all I had to communicate these emotions was this. (Laughter) Today's technology has lots of IQ, but no EQ; lots of cognitive intelligence, but no emotional intelligence. So that got me thinking: what if our technology could sense our emotions? What if our devices could sense how we felt and react accordingly, just the way an emotionally intelligent friend would?

Those questions led me and my team to create technologies that can read and respond to our emotions, and our starting point was the human face. The human face happens to be one of the most powerful channels that we all use to communicate social and emotional states, everything from enjoyment and surprise to empathy and curiosity. In emotion science, we call each facial muscle movement an action unit. So, for example, action unit 12: it's not a Hollywood blockbuster, it is actually a lip corner pull, which is the main component of a smile. Try it, everybody. Let's get some smiles going on. Another example is action unit 4.
It's the brow furrow. It's when you draw your eyebrows together and you create all these textures and wrinkles. We don't like them, but it's a strong indicator of a negative emotion. We have about 45 of these action units, and they combine to express hundreds of emotions.

Teaching a computer to read these facial emotions is hard, because these action units can be fast, they're subtle, and they combine in many different ways. Take, for example, the smile and the smirk. They look somewhat similar, but they mean very different things. (Laughter) So the smile is positive, a smirk is often negative. Sometimes a smirk can make you become famous. But seriously, it's important for a computer to be able to tell the difference between the two expressions. So how do we do that? We give our algorithms tens of thousands of examples of people we know to be smiling, from different ethnicities, ages, and genders, and we do the same for smirks.
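The idea that a small inventory of action units combines into many expressions can be sketched as a simple lookup. The AU names below follow the talk (12 = lip corner pull, 4 = brow furrow), but the combination "recipes" are illustrative assumptions, not the speaker's actual system:

```python
# Hypothetical sketch: action units (AUs) combining into expressions.
# AU 12 and AU 4 are named in the talk; the rest, and the recipes, are
# invented for illustration.
ACTION_UNITS = {
    4: "brow furrow",
    6: "cheek raise",
    12: "lip corner pull",
    14: "dimpler",
}

# Each "recipe" is a set of AUs that co-occur in one expression.
EXPRESSIONS = {
    frozenset({6, 12}): "smile",          # cheek raise + lip corner pull
    frozenset({12, 14}): "smirk",         # lip corner pull + dimpler
    frozenset({4}): "negative/confused",  # brow furrow alone
}

def classify(active_aus):
    """Return the expression whose recipe matches the active action units."""
    return EXPRESSIONS.get(frozenset(active_aus), "unknown")

print(classify([6, 12]))   # smile
print(classify([12, 14]))  # smirk
```

With ~45 units, even pairs and triples of AUs yield hundreds of distinct combinations, which is why a handful of muscles can express so many emotions.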
And then, using deep learning, the algorithm looks for all these textures and wrinkles and shape changes on our face, and basically learns that all smiles have common characteristics, while smirks have subtly different characteristics. And the next time it sees a new face, it essentially recognizes that this face has the same characteristics as a smile, and it says, "Aha, I recognize this. This is a smile."

So the best way to demonstrate how this technology works is to try a live demo, so I need a volunteer, preferably somebody with a face. (Laughter) Chloe's going to be our volunteer today.

Over the past five years, we've moved from being a research project at MIT to a company, where my team has worked really hard to make this technology work, as we like to say, in the wild. And we've also shrunk it so that the core emotion engine works on any mobile device with a camera, like this iPad. So let's give this a try.
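The training idea described above, showing the algorithm many labeled examples and letting it learn what each class has in common, can be sketched in miniature. A nearest-centroid classifier on made-up 2-D "face feature" vectors stands in for the deep network here; the data and features are invented for illustration:

```python
# Toy sketch of learning from labeled examples (NOT the speaker's deep
# network): a nearest-centroid classifier over invented feature vectors.

def centroid(points):
    """Average of a list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(examples):
    """examples: {label: [feature_vector, ...]} -> {label: centroid}."""
    return {label: centroid(vecs) for label, vecs in examples.items()}

def predict(model, x):
    """Assign x to the label whose centroid is closest (squared distance)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], x))

# Invented training data: "smiles" cluster high on both features,
# "smirks" low on the second one.
model = train({
    "smile": [(0.9, 0.8), (0.8, 0.9), (1.0, 0.7)],
    "smirk": [(0.9, 0.1), (0.8, 0.2), (1.0, 0.0)],
})
print(predict(model, (0.85, 0.75)))  # smile
```

A real system learns its features (textures, wrinkles, shape changes) from pixels rather than taking them as given, but the generalize-from-examples loop is the same.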
As you can see, the algorithm has essentially found Chloe's face (it's this white bounding box), and it's tracking the main feature points on her face: her eyebrows, her eyes, her mouth, and her nose. The question is, can it recognize her expression? So we're going to test the machine. First of all, give me your poker face. Yep, awesome. (Laughter) And then as she smiles (this is a genuine smile, it's great), you can see the green bar go up. Now that was a big smile. Can you try, like, a subtle smile, to see if the computer can recognize it? It does recognize subtle smiles as well. We've worked really hard to make that happen. And then eyebrows raised, an indicator of surprise. Brow furrow, which is an indicator of confusion. Frown. Yes, perfect. So these are all the different action units; there are many more of them. This is just a slimmed-down demo. We call each reading an emotion data point, and then they can fire together to portray different emotions. So on the right side of the demo: look like you're happy. So that's joy. Joy fires up. And then give me a disgust face. Try to remember what it was like when Zayn left One Direction.
(Laughter) Yeah, wrinkle your nose. Awesome. And the valence is actually quite negative, so you must have been a big fan. Valence is how positive or negative an experience is, and engagement is how expressive she is as well. So imagine if Chloe had access to this real-time emotion stream, and she could share it with anybody she wanted to. Thank you. (Applause)

So, so far, we have amassed 12 billion of these emotion data points. It's the largest emotion database in the world. We've collected it from 2.9 million face videos, people who have agreed to share their emotions with us, from 75 countries around the world, and it's growing every day. It blows my mind that we can now quantify something as personal as our emotions, and that we can do it at this scale.

So what have we learned to date? Gender: our data confirms something that you might suspect. Women are more expressive than men. Not only do they smile more, their smiles last longer, and we can now really quantify what it is that men and women respond to differently.
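The two summary metrics from the demo can be sketched from the per-expression readings. The talk defines them only loosely (valence is how positive or negative an experience is, engagement is how expressive someone is), so the formulas and expression names below are illustrative assumptions, not the product's actual math:

```python
# Hedged sketch of valence and engagement from expression intensities.
# The expression sets and the max-based formulas are assumptions made
# for illustration only.
POSITIVE = {"smile", "joy"}
NEGATIVE = {"brow_furrow", "nose_wrinkle", "frown"}

def valence(scores):
    """scores: {expression: intensity in [0, 1]} -> value in [-1, 1]."""
    pos = max((scores.get(e, 0.0) for e in POSITIVE), default=0.0)
    neg = max((scores.get(e, 0.0) for e in NEGATIVE), default=0.0)
    return pos - neg

def engagement(scores):
    """Overall expressiveness: the strongest expression of any kind."""
    return max(scores.values(), default=0.0)

# A "disgust face": strong nose wrinkle, no smile.
disgust_face = {"nose_wrinkle": 0.9, "smile": 0.0}
print(valence(disgust_face))     # negative valence
print(engagement(disgust_face))  # high expressiveness
```

This matches the demo's reading: a disgust face scores negative on valence yet high on engagement, because the two axes measure sign and intensity independently.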
Let's do culture: in the United States, women are 40 percent more expressive than men, but curiously, we don't see any difference between men and women in the U.K. (Laughter) Age: people who are 50 years and older are 25 percent more emotive than younger people. Women in their 20s smile a lot more than men the same age, perhaps a necessity for dating. But perhaps what surprised us the most about this data is that we happen to be expressive all the time, even when we are sitting in front of our devices alone, and it's not just when we're watching cat videos on Facebook. We are expressive when we're emailing, texting, shopping online, or even doing our taxes.

Where is this data used today? In understanding how we engage with media, so understanding virality and voting behavior, and also in empowering, or emotion-enabling, technology. I want to share some examples that are especially close to my heart. Emotion-enabled wearable glasses can help individuals who are visually impaired read the faces of others, and they can help individuals on the autism spectrum interpret emotion, something that they really struggle with.
In education, imagine if your learning apps sensed that you're confused and slowed down, or that you're bored and sped up, just like a great teacher would in a classroom. What if your wristwatch tracked your mood, or your car sensed that you're tired, or perhaps your fridge knew that you're stressed, so it auto-locked to prevent you from binge eating. (Laughter) I would like that, yeah. What if, when I was in Cambridge, I had had access to my real-time emotion stream, and I could have shared that with my family back home in a very natural way, just like I would if we were all in the same room together? I think five years down the line, all our devices are going to have an emotion chip, and we won't remember what it was like when we couldn't just frown at our device and our device would say, "Hmm, you didn't like that, did you?"

Our biggest challenge is that there are so many applications of this technology that my team and I realize we can't build them all ourselves, so we've made this technology available so that other developers can get building and get creative.
We recognize that there are potential risks and potential for abuse, but personally, having spent many years doing this, I believe that the benefits to humanity from having emotionally intelligent technology far outweigh the potential for misuse. And I invite you all to be part of the conversation. The more people who know about this technology, the more we can all have a voice in how it's being used.

As more and more of our lives become digital, we are fighting a losing battle trying to curb our use of devices in order to reclaim our emotions. So what I'm trying to do instead is to bring emotions into our technology and make our technologies more responsive. I want the devices that have separated us to bring us back together, and by humanizing technology, we have this golden opportunity to re-imagine how we connect with machines, and therefore, how we, as human beings, connect with one another. Thank you. (Applause)