1
00:00:08,334 --> 00:00:09,417
Hi.

2
00:00:10,562 --> 00:00:13,951
We are living in an exciting era,

3
00:00:13,952 --> 00:00:19,281
where innovation and technology have the potential to do the unimaginable,

4
00:00:19,282 --> 00:00:22,559
and it becomes even more unimaginable

5
00:00:22,560 --> 00:00:26,560
when it breaks down the gaps between disability and ability.

6
00:00:28,345 --> 00:00:31,325
15% of the world population

7
00:00:32,564 --> 00:00:35,324
- 1 billion people around the world -

8
00:00:35,325 --> 00:00:37,184
live with disabilities,

9
00:00:37,185 --> 00:00:41,668
which makes people with disabilities the largest minority in the world.

10
00:00:42,605 --> 00:00:45,264
And they are not living on a different planet.

11
00:00:45,265 --> 00:00:50,145
They may be part of our families, friends, or colleagues.

12
00:00:51,426 --> 00:00:55,985
Today, I'm going to tell you how people with speech disabilities

13
00:00:55,986 --> 00:00:59,366
will have a way to better communicate.

14
00:00:59,375 --> 00:01:03,233
I was 7 years old when my sister Amal was born.

15
00:01:03,234 --> 00:01:05,893
I was too young to see the challenges

16
00:01:05,894 --> 00:01:09,463
that my family was facing on a daily basis,

17
00:01:09,464 --> 00:01:13,813
but I could see that Amal couldn't crawl, or eat, or talk

18
00:01:13,814 --> 00:01:16,913
like any other baby her age.

19
00:01:16,914 --> 00:01:22,063
But with time, we adjusted to raising a baby with cerebral palsy,

20
00:01:22,064 --> 00:01:26,492
while understanding her special communication patterns and needs.

21
00:01:28,406 --> 00:01:29,845
Nine years later,

22
00:01:29,846 --> 00:01:33,469
my family was blessed to have another baby, Ahmad.
23
00:01:34,469 --> 00:01:38,288
Ahmad decided to grow up exactly like his sister Amal,

24
00:01:38,289 --> 00:01:42,838
being so smart, so sharp, curious about everything around him,

25
00:01:42,839 --> 00:01:47,208
but he also decided to invent his own special communication patterns

26
00:01:47,209 --> 00:01:48,809
to communicate with us,

27
00:01:49,782 --> 00:01:53,081
and for the other people who couldn't understand him,

28
00:01:53,082 --> 00:01:55,208
we had to translate.

29
00:01:55,209 --> 00:01:59,626
Amal and Ahmad say "num" when they are hungry,

30
00:01:59,659 --> 00:02:04,528
and they say "ahh" to call Nora, my sister.

31
00:02:04,542 --> 00:02:08,833
And when they want to call my name, they say "abeya".

32
00:02:08,834 --> 00:02:12,585
When they want to go to the bathroom, they say "kkhh".

33
00:02:13,366 --> 00:02:16,945
We understand most of their special communication patterns,

34
00:02:16,946 --> 00:02:20,546
but it's only us, the close circle.

35
00:02:20,551 --> 00:02:25,131
And this is the case for most people who have an unclear voice.

36
00:02:26,292 --> 00:02:29,471
One of those people is Urit.

37
00:02:29,472 --> 00:02:33,691
Urit is a 34-year-old woman with cerebral palsy.

38
00:02:33,692 --> 00:02:35,946
She is living an independent life.

39
00:02:35,947 --> 00:02:41,003
She can drive her car, go to the gym, and do a lot of other things.

40
00:02:42,917 --> 00:02:47,656
However, when it comes to communicating using her voice,

41
00:02:47,657 --> 00:02:50,912
sometimes, it can become harder than going to the gym,

42
00:02:50,913 --> 00:02:53,122
and more frustrating,

43
00:02:53,123 --> 00:02:58,542
because she finds herself repeating the same words again and again

44
00:02:58,543 --> 00:03:01,067
in order to be understood.

45
00:03:01,068 --> 00:03:04,738
We asked Urit to say a few words in English.
46
00:03:06,370 --> 00:03:08,199
Let's listen to her together

47
00:03:08,200 --> 00:03:10,790
and see if you can understand what she's trying to say.

48
00:03:11,856 --> 00:03:13,946
(unclear speech)

49
00:03:17,481 --> 00:03:21,861
I don't know how many of you could understand her the first time,

50
00:03:21,862 --> 00:03:23,471
but let's listen to her again,

51
00:03:23,472 --> 00:03:27,521
and really focus and try to understand what she's trying to say.

52
00:03:27,522 --> 00:03:29,488
(unclear speech)

53
00:03:33,251 --> 00:03:37,491
Try to memorize what she has just said; we'll get to that later.

54
00:03:38,664 --> 00:03:41,883
With my siblings, and Urit, and other people I have gotten to know,

55
00:03:41,884 --> 00:03:46,443
I had the chance to see a world full of challenges,

56
00:03:46,444 --> 00:03:49,454
- a world of people with special needs.

57
00:03:50,353 --> 00:03:53,772
And this allowed me to examine the existing technology

58
00:03:53,773 --> 00:03:57,635
in search of an answer to what my siblings were seeking.

59
00:03:58,542 --> 00:04:02,334
Unfortunately, the current state-of-the-art assistive technology,

60
00:04:02,335 --> 00:04:07,258
including speech recognition applications, could not provide an answer.

61
00:04:08,485 --> 00:04:13,534
So far, all assistive technology has completely bypassed the voice,

62
00:04:13,535 --> 00:04:17,411
opting to use other modes of communication,

63
00:04:18,362 --> 00:04:22,361
[by] replacing the voice with symbols and images,

64
00:04:22,362 --> 00:04:26,222
or movements of the head or the eyes.

65
00:04:27,356 --> 00:04:31,806
This brings me to the other lightweight alternative that does use the voice,

66
00:04:32,695 --> 00:04:35,844
which is speech recognition applications.

67
00:04:35,845 --> 00:04:39,395
This technology takes two approaches.

68
00:04:40,281 --> 00:04:44,401
The first approach attempts to discover which word has been said.
69
00:04:46,013 --> 00:04:49,302
The second approach relies on phonemes.

70
00:04:49,303 --> 00:04:54,443
Phonemes are all the sounds we produce using our mouth and nose.

71
00:04:55,618 --> 00:04:59,806
Both approaches rely on statistical models

72
00:04:59,807 --> 00:05:03,136
built from a large database of standard speech.

73
00:05:03,137 --> 00:05:05,959
But once the speech is not standard

74
00:05:05,960 --> 00:05:09,659
- when I say not standard, I mean it's enough to have an accent,

75
00:05:09,660 --> 00:05:11,739
like most of us here -

76
00:05:11,740 --> 00:05:13,590
this will not work.

77
00:05:14,444 --> 00:05:19,593
My colleagues and I developed a new approach to assistive technology

78
00:05:19,594 --> 00:05:22,355
that does use the person's own voice

79
00:05:22,356 --> 00:05:26,175
and can understand non-standard speech patterns,

80
00:05:26,176 --> 00:05:31,506
with the mission to give people with a speech disability their voice back.

81
00:05:32,858 --> 00:05:36,407
So, whose life is this going to change?

82
00:05:36,408 --> 00:05:39,166
People with cerebral palsy,

83
00:05:39,167 --> 00:05:41,959
people with Parkinson's and myasthenia gravis,

84
00:05:41,972 --> 00:05:44,347
so many [other] neurological disorders,

85
00:05:44,348 --> 00:05:46,637
people who are born with hearing disabilities,

86
00:05:46,638 --> 00:05:51,717
or people who suddenly have a stroke and their whole life is changed,

87
00:05:51,718 --> 00:05:54,569
but not only theirs.

88
00:05:54,570 --> 00:05:58,803
Not only the people who have difficulty expressing themselves,

89
00:05:58,804 --> 00:06:03,473
but everyone who interacts with them on a daily basis.

90
00:06:03,474 --> 00:06:08,547
This will make it easier for them to be socially included

91
00:06:08,548 --> 00:06:13,195
- because every one of us wants to be socially included.

92
00:06:13,196 --> 00:06:17,508
And now, you may be asking yourself, "How does it work?"
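[Editor's note: the limitation described earlier — statistical models fitted to a database of standard speech — can be sketched with a toy example. All numbers, phoneme symbols, and model values below are invented for illustration; this is not a real recognizer.]

```python
# Toy sketch: a conventional recognizer scores acoustic frames against
# per-phoneme Gaussian models estimated from *standard* speech. Speech
# that deviates from those distributions scores poorly, so the word is
# misrecognized or rejected. (All values are hypothetical.)
import math

def gaussian_logpdf(x, mean, std):
    """Log-likelihood of one acoustic frame under a phoneme model."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

# Hypothetical phoneme models (mean, std) fitted to standard speech.
standard_model = {"t": (1.0, 0.2), "uw": (3.0, 0.2)}

def score(frames, phonemes, model):
    """Average log-likelihood of the frames under a phoneme sequence."""
    return sum(gaussian_logpdf(f, *model[p])
               for f, p in zip(frames, phonemes)) / len(frames)

standard_two     = [1.1, 2.9]   # pronunciation close to the model
non_standard_two = [1.8, 2.1]   # same word, atypical pronunciation

print(score(standard_two, ["t", "uw"], standard_model))      # relatively high
print(score(non_standard_two, ["t", "uw"], standard_model))  # much lower
```

The atypical pronunciation is not wrong, it is simply outside the distribution the model was trained on — which is why even an accent can be enough to break recognition.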
93
00:06:17,509 --> 00:06:22,078
"How come the current speech recognition technology couldn't do the same?"

94
00:06:24,978 --> 00:06:27,598
Because our technology works in a different way.

95
00:06:28,808 --> 00:06:32,217
So, each person has to go through two phases.

96
00:06:32,218 --> 00:06:35,357
The first phase is called the calibration phase,

97
00:06:35,358 --> 00:06:41,047
where the person has to teach the device and the application his own patterns

98
00:06:41,048 --> 00:06:44,227
by entering the patterns and building his own dictionary.

99
00:06:44,228 --> 00:06:45,920
This phase usually happens

100
00:06:45,921 --> 00:06:48,920
with the person who understands him the most.

101
00:06:48,921 --> 00:06:51,090
Together they will build the dictionary.

102
00:06:51,091 --> 00:06:55,340
This generally takes only one to three hours,

103
00:06:55,341 --> 00:06:58,280
and it depends on the speaking capability of the speaker.

104
00:06:58,281 --> 00:07:00,022
After building the dictionary,

105
00:07:00,023 --> 00:07:03,642
we move to the second phase, which is the recognition phase.

106
00:07:03,643 --> 00:07:07,628
The application will be able to recognize unintelligible speech patterns

107
00:07:07,629 --> 00:07:10,828
from the dictionary that is already built

108
00:07:10,829 --> 00:07:14,369
and translate them into a clear voice in real time.

109
00:07:15,660 --> 00:07:19,819
Our approach is user-dependent and language-independent,

110
00:07:19,820 --> 00:07:23,470
which means it can work in any language in the world,

111
00:07:24,347 --> 00:07:26,476
even the invented ones.

112
00:07:26,477 --> 00:07:29,726
And the key word here is 'pattern-matching'.

113
00:07:29,727 --> 00:07:35,016
Once the person builds his own dictionary and says a word that already exists there,

114
00:07:35,017 --> 00:07:36,652
there will be a pattern-matching

115
00:07:36,653 --> 00:07:39,832
between what he says and what already exists.
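[Editor's note: the two phases described above — calibration, where personal templates are recorded into a dictionary, and recognition, where new utterances are matched against it — can be sketched with a dynamic-time-warping (DTW) template matcher. This is an illustrative reconstruction, not Talkitt's actual algorithm; the feature sequences and dictionary entries are invented.]

```python
# Sketch of user-dependent pattern matching: each dictionary entry holds
# a feature sequence recorded during calibration; recognition returns
# the entry whose template is closest under dynamic time warping (DTW),
# which tolerates stretching or compressing the utterance in time.

def dtw_distance(a, b):
    """DTW distance between two 1-D feature sequences."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch template
                                 cost[i][j - 1],      # stretch utterance
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

def recognize(utterance, dictionary):
    """Return the word whose calibration template is nearest."""
    return min(dictionary, key=lambda w: dtw_distance(utterance, dictionary[w]))

# Calibration phase: templates entered with a close family member
# (hypothetical feature values standing in for real acoustic features).
dictionary = {
    "hungry":   [0.1, 0.9, 0.8, 0.2],   # the pattern "num"
    "bathroom": [0.7, 0.7, 0.1, 0.1],   # the pattern "kkhh"
}

# Recognition phase: a new, slightly stretched utterance of "num".
print(recognize([0.1, 0.8, 0.9, 0.85, 0.2], dictionary))  # prints "hungry"
```

Because matching is against the speaker's own templates rather than a population-level model, the same mechanism works in any language, including invented patterns like "num" or "kkhh".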
116
00:07:39,833 --> 00:07:41,852
But here we found a problem.

117
00:07:41,853 --> 00:07:44,921
We found that people with a speech disability

118
00:07:44,922 --> 00:07:48,012
pronounce different words in similar ways.

119
00:07:49,652 --> 00:07:53,601
And the challenge was to differentiate between them.

120
00:07:53,602 --> 00:07:57,314
So we created a technology called Adaptive Framing.

121
00:07:58,255 --> 00:08:03,825
Adaptive Framing adapts the frame to the width of each event in the pattern.

122
00:08:03,834 --> 00:08:09,543
In the existing technology, you can see the L and the A in the same frame.

123
00:08:10,402 --> 00:08:15,011
But in our new technology, you can see that the L and the A are in different frames,

124
00:08:15,012 --> 00:08:18,042
which increases the accuracy of the pattern-matching.

125
00:08:18,844 --> 00:08:22,454
And this makes our pattern-matching so much better.

126
00:08:23,463 --> 00:08:26,352
I suppose you still remember Urit, right?

127
00:08:26,353 --> 00:08:30,523
Let's listen to her again now, but this time using Talkitt:

128
00:08:33,520 --> 00:08:34,568
(unclear speech)

129
00:08:34,570 --> 00:08:36,042
Now I can ...

130
00:08:36,043 --> 00:08:37,373
(unclear speech)

131
00:08:37,374 --> 00:08:38,374
... start

132
00:08:38,375 --> 00:08:39,881
(unclear speech)

133
00:08:39,881 --> 00:08:41,342
... speaking freely.

134
00:08:42,982 --> 00:08:44,542
(Applause)

135
00:08:55,552 --> 00:08:57,906
Talkitt is only one step

136
00:08:57,907 --> 00:09:02,026
towards bridging the gap between disability and ability

137
00:09:02,027 --> 00:09:04,946
by letting people express their potential.

138
00:09:04,947 --> 00:09:07,085
The more we challenge our minds,

139
00:09:07,086 --> 00:09:11,512
the more gaps will collapse to let us all have a normal life.

140
00:09:11,513 --> 00:09:12,622
Thank you.

141
00:09:12,623 --> 00:09:13,653
(Applause)