1 00:00:00,449 --> 00:00:01,482 My passions 2 00:00:01,482 --> 00:00:05,171 are music, technology and making things. 3 00:00:05,171 --> 00:00:08,354 And it's the combination of these things 4 00:00:08,354 --> 00:00:11,050 that has led me to the hobby of sound visualization, 5 00:00:11,050 --> 00:00:15,128 and, on occasion, has led me to play with fire. 6 00:00:15,128 --> 00:00:17,505 This is a Rubens' tube. It's one of many I've made over the years, 7 00:00:17,505 --> 00:00:19,251 and I have one here tonight. 8 00:00:19,251 --> 00:00:20,776 It's about an 8-foot-long tube of metal, 9 00:00:20,776 --> 00:00:22,111 it's got a hundred or so holes on top, 10 00:00:22,111 --> 00:00:23,864 on that side is the speaker, and here 11 00:00:23,864 --> 00:00:26,040 is some lab tubing, and it's connected to this tank 12 00:00:26,040 --> 00:00:27,641 of propane. 13 00:00:29,087 --> 00:00:32,456 So, let's fire it up and see what it does. 14 00:00:37,887 --> 00:00:39,778 So let's play a 550-hertz frequency 15 00:00:39,778 --> 00:00:41,293 and watch what happens. 16 00:00:41,342 --> 00:00:49,825 (Frequency) 17 00:00:49,825 --> 00:00:52,538 Thank you. (Applause) 18 00:00:52,538 --> 00:00:54,526 It's okay to applaud the laws of physics, 19 00:00:54,526 --> 00:00:55,982 but essentially what's happening here 20 00:00:55,982 --> 00:00:57,742 -- (Laughter) -- 21 00:00:57,742 --> 00:01:01,750 is the energy from the sound via the air and gas molecules 22 00:01:01,750 --> 00:01:04,254 is influencing the combustion properties of propane, 23 00:01:04,254 --> 00:01:06,158 creating a visible waveform, 24 00:01:06,158 --> 00:01:08,478 and we can see the alternating regions of compression 25 00:01:08,478 --> 00:01:10,518 and rarefaction that we call frequency, 26 00:01:10,518 --> 00:01:12,262 and the height is showing us amplitude. 27 00:01:12,262 --> 00:01:14,590 So let's change the frequency of the sound, 28 00:01:14,590 --> 00:01:16,015 and watch what happens to the fire. 
29 00:01:16,015 --> 00:01:26,145 (Higher frequency) 30 00:01:26,145 --> 00:01:29,495 So every time we hit a resonant frequency we get a standing wave 31 00:01:29,495 --> 00:01:31,195 and that emergent sine curve of fire. 32 00:01:31,195 --> 00:01:32,773 So let's turn that off. We're indoors. 33 00:01:32,773 --> 00:01:38,364 Thank you. (Applause) 34 00:01:38,364 --> 00:01:40,711 I also have with me a flame table. 35 00:01:40,711 --> 00:01:42,276 It's very similar to a Rubens' tube, and it's also used 36 00:01:42,276 --> 00:01:44,397 for visualizing the physical properties of sound, 37 00:01:44,397 --> 00:01:46,365 such as eigenmodes, so let's fire it up 38 00:01:46,365 --> 00:01:48,615 and see what it does. 39 00:01:52,292 --> 00:01:56,760 Ooh. (Laughter) 40 00:01:56,760 --> 00:01:59,734 Okay. Now, while the table comes up to pressure, 41 00:01:59,734 --> 00:02:01,406 let me note here that the sound is not traveling 42 00:02:01,406 --> 00:02:04,152 in perfect lines. It's actually traveling in all directions, 43 00:02:04,152 --> 00:02:07,261 and the Rubens' tube's a little like bisecting those waves 44 00:02:07,261 --> 00:02:09,257 with a line, and the flame table's a little like 45 00:02:09,257 --> 00:02:11,111 bisecting those waves with a plane, 46 00:02:11,111 --> 00:02:15,111 and it can show a little more subtle complexity, which is why 47 00:02:15,111 --> 00:02:17,464 I like to use it to watch Geoff Farina play guitar. 48 00:02:17,464 --> 00:02:59,745 (Music) 49 00:02:59,745 --> 00:03:01,510 All right, so it's a delicate dance. 50 00:03:01,510 --> 00:03:04,055 If you watch closely — (Applause) 51 00:03:04,055 --> 00:03:06,822 If you watch closely, you may have seen 52 00:03:06,822 --> 00:03:09,294 some of the eigenmodes, but also you may have seen 53 00:03:09,294 --> 00:03:13,911 that jazz music is better with fire. 
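The resonant frequencies and standing waves mentioned in the Rubens' tube demonstration can be illustrated with a minimal sketch. It treats the tube as an idealized pipe closed at both ends, so resonances fall at f_n = n * v / (2 * L). The 8-foot length comes from the talk; the speed of sound in propane (258 m/s) and the closed-closed model are assumptions made for illustration, and a real tube will deviate from them.

```python
import math

FEET_TO_METERS = 0.3048


def resonant_frequencies(length_m, speed_m_s, count):
    """First `count` resonances of an idealized tube closed at both ends."""
    return [n * speed_m_s / (2.0 * length_m) for n in range(1, count + 1)]


def pressure_profile(mode, samples):
    """Relative pressure amplitude |cos(n*pi*x/L)| at evenly spaced points."""
    return [abs(math.cos(mode * math.pi * i / (samples - 1)))
            for i in range(samples)]


tube_length = 8 * FEET_TO_METERS   # "about an 8-foot-long tube" from the talk
speed_in_propane = 258.0           # assumed value in m/s, not from the talk

modes = resonant_frequencies(tube_length, speed_in_propane, 12)
nearest = min(modes, key=lambda f: abs(f - 550.0))
print(f"resonance nearest 550 Hz: {nearest:.1f} Hz")

# Pressure amplitude along the tube for the second mode, sampled at 5 points.
profile = pressure_profile(2, 5)
print([round(p, 2) for p in profile])
```

In this simplified model, the alternating highs and lows of the pressure profile correspond to the "sine curve of fire" seen along the row of holes.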
54 00:03:13,911 --> 00:03:15,921 Actually, a lot of things are better with fire in my world, 55 00:03:15,921 --> 00:03:18,353 but the fire's just a foundation. 56 00:03:18,353 --> 00:03:19,501 It shows very well that eyes can hear, 57 00:03:19,501 --> 00:03:20,896 and this is interesting to me because 58 00:03:20,896 --> 00:03:23,750 technology allows us to present sound to the eyes 59 00:03:23,750 --> 00:03:26,613 in ways that accentuate the strength of the eyes 60 00:03:26,613 --> 00:03:29,310 for seeing sound, such as the removal of time. 61 00:03:29,310 --> 00:03:32,694 So here, I'm using a rendering algorithm to paint 62 00:03:32,694 --> 00:03:35,157 the frequencies of the song "Smells Like Teen Spirit" 63 00:03:35,157 --> 00:03:37,197 in a way that the eyes can take them in 64 00:03:37,197 --> 00:03:39,441 as a single visual impression, and the technique 65 00:03:39,441 --> 00:03:41,414 will also show the strengths of the visual cortex 66 00:03:41,414 --> 00:03:43,030 for pattern recognition. 67 00:03:43,030 --> 00:03:44,909 So if I show you another song off this album, 68 00:03:44,909 --> 00:03:48,390 and another, your eyes will easily pick out 69 00:03:48,390 --> 00:03:51,318 the use of repetition by the band Nirvana, 70 00:03:51,318 --> 00:03:53,175 and in the frequency distribution, the colors, 71 00:03:53,175 --> 00:03:56,223 you can see the clean-dirty-clean sound 72 00:03:56,223 --> 00:03:57,430 that they are famous for, 73 00:03:57,430 --> 00:04:01,430 and here is the entire album as a single visual impression, 74 00:04:01,430 --> 00:04:03,310 and I think this impression is pretty powerful. 
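The talk does not describe how its rendering algorithm works, so the following is only a stand-in sketch of the general idea of painting a song's frequencies as one image. It cuts a signal into frames, finds the strongest frequency in each frame with a naive discrete Fourier transform, and maps that frequency to a color. The function names, the frame size, and the hue mapping are all hypothetical choices.

```python
import cmath
import colorsys
import math


def dominant_frequency(frame, sample_rate):
    """Return the strongest frequency in a frame using a naive DFT."""
    n = len(frame)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):          # skip the DC bin, stop before Nyquist
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(total) > best_mag:
            best_bin, best_mag = k, abs(total)
    return best_bin * sample_rate / n


def frequency_to_rgb(freq, max_freq):
    """Map low frequencies to red and high frequencies toward violet."""
    hue = 0.8 * min(freq / max_freq, 1.0)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


def paint(samples, sample_rate, frame_size=64):
    """One color per frame: the whole signal becomes a row of stripes."""
    stripes = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        freq = dominant_frequency(frame, sample_rate)
        stripes.append(frequency_to_rgb(freq, sample_rate / 2))
    return stripes


# Synthetic test signal: a low tone followed by a high tone.
rate = 8000
low = [math.sin(2 * math.pi * 500 * t / rate) for t in range(256)]
high = [math.sin(2 * math.pi * 2000 * t / rate) for t in range(256)]
stripes = paint(low + high, rate)
print(stripes[0], stripes[-1])
```

Because every frame collapses to a single stripe, time is removed from the listening experience and the eye can take in repetition and changes in frequency content at a glance. A real renderer would likely use a fast Fourier transform and the full spectrum rather than only the dominant bin.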
75 00:04:03,310 --> 00:04:05,024 At least, it's powerful enough that 76 00:04:05,024 --> 00:04:06,366 if I show you these four songs, 77 00:04:06,366 --> 00:04:08,807 and I remind you that this is "Smells Like Teen Spirit," 78 00:04:08,807 --> 00:04:11,126 you can probably correctly guess, without listening 79 00:04:11,126 --> 00:04:12,560 to any music at all, that the song 80 00:04:12,560 --> 00:04:14,854 a die-hard Nirvana fan would enjoy is this song, 81 00:04:14,854 --> 00:04:17,110 "I'll Stick Around" by the Foo Fighters, 82 00:04:17,110 --> 00:04:19,110 whose lead singer is Dave Grohl, 83 00:04:19,110 --> 00:04:22,888 who was the drummer in Nirvana. 84 00:04:22,888 --> 00:04:24,188 The songs are a little similar, but mostly 85 00:04:24,188 --> 00:04:25,814 I'm just interested in the idea that someday maybe 86 00:04:25,814 --> 00:04:30,226 we'll buy a song because we like the way it looks. 87 00:04:30,226 --> 00:04:31,326 All right, now for some more sound data. 88 00:04:31,326 --> 00:04:33,978 This is data from a skate park, 89 00:04:33,978 --> 00:04:36,010 and this is Mabel Davis skate park 90 00:04:36,010 --> 00:04:38,152 in Austin, Texas. (Skateboard sounds) 91 00:04:38,152 --> 00:04:39,526 And the sounds you're hearing came from eight 92 00:04:39,526 --> 00:04:41,742 microphones attached to obstacles around the park, 93 00:04:41,742 --> 00:04:43,926 and it sounds like chaos, but actually 94 00:04:43,926 --> 00:04:47,273 all the tricks start with a very distinct slap, 95 00:04:47,273 --> 00:04:48,877 but successful tricks end with a pop, 96 00:04:48,877 --> 00:04:50,670 whereas unsuccessful tricks 97 00:04:50,670 --> 00:04:52,526 more of a scratch and a tumble, 98 00:04:52,526 --> 00:04:56,536 and tricks on the rail will ring out like a gong, and 99 00:04:56,536 --> 00:04:59,326 voices occupy very unique frequencies in the skate park. 
100 00:04:59,326 --> 00:05:01,264 So if we were to render these sounds visually, 101 00:05:01,264 --> 00:05:02,671 we might end up with something like this. 102 00:05:02,671 --> 00:05:05,127 This is all 40 minutes of the recording, 103 00:05:05,127 --> 00:05:07,287 and right away the algorithm tells us 104 00:05:07,287 --> 00:05:09,360 a lot more tricks are missed than are made, 105 00:05:09,360 --> 00:05:11,695 and also a trick on the rails is a lot more likely 106 00:05:11,695 --> 00:05:14,567 to produce a cheer, and if you look really closely, 107 00:05:14,567 --> 00:05:16,300 we can tease out traffic patterns. 108 00:05:16,300 --> 00:05:22,387 You see the skaters often trick in this direction. The obstacles are easier. 109 00:05:22,387 --> 00:05:24,092 And in the middle of the recording, the mics pick this up, 110 00:05:24,092 --> 00:05:26,896 but later in the recording, this kid shows up, 111 00:05:26,896 --> 00:05:29,824 and he starts using a line at the top of the park 112 00:05:29,824 --> 00:05:31,626 to do some very advanced tricks on something 113 00:05:31,626 --> 00:05:32,729 called the tall rail. 114 00:05:32,729 --> 00:05:34,601 And it's fascinating. At this moment in time, 115 00:05:34,601 --> 00:05:38,104 all the rest of the skaters turn their lines 90 degrees 116 00:05:38,104 --> 00:05:39,882 to stay out of his way. 117 00:05:39,882 --> 00:05:42,425 You see, there's a subtle etiquette in the skate park, 118 00:05:42,425 --> 00:05:44,025 and it's led by key influencers, 119 00:05:44,025 --> 00:05:47,273 and they tend to be the kids who can do the best tricks, 120 00:05:47,273 --> 00:05:49,703 or wear red pants, and on this day the mics picked that up. 121 00:05:49,703 --> 00:05:53,604 All right, from skate physics to theoretical physics. 
122 00:05:53,604 --> 00:05:55,220 I'm a big fan of Stephen Hawking, 123 00:05:55,220 --> 00:05:56,556 and I wanted to use all eight hours 124 00:05:56,556 --> 00:05:59,143 of his Cambridge lecture series to create an homage. 125 00:05:59,143 --> 00:06:02,207 Now, in this series he's speaking with the aid of a computer, 126 00:06:02,207 --> 00:06:05,311 which actually makes identifying the ends of sentences 127 00:06:05,311 --> 00:06:08,737 fairly easy. So I wrote a steering algorithm. 128 00:06:08,737 --> 00:06:10,697 It listens to the lecture, and then it uses 129 00:06:10,697 --> 00:06:13,384 the amplitude of each word to move a point on the x-axis, 130 00:06:13,384 --> 00:06:16,017 and it uses the inflection of sentences 131 00:06:16,017 --> 00:06:18,171 to move the same point up and down on the y-axis. 132 00:06:18,171 --> 00:06:20,945 And these trend lines, you can see, there's more questions 133 00:06:20,945 --> 00:06:22,815 than answers in the laws of physics, 134 00:06:22,815 --> 00:06:24,848 and when we reach the end of a sentence, 135 00:06:24,848 --> 00:06:27,183 we place a star at that location. 136 00:06:27,183 --> 00:06:29,983 So there's a lot of sentences, so a lot of stars, 137 00:06:29,983 --> 00:06:32,339 and after rendering all of the audio, this is what we get. 138 00:06:32,339 --> 00:06:35,268 This is Stephen Hawking's universe. 139 00:06:35,268 --> 00:06:42,205 (Applause) 140 00:06:42,205 --> 00:06:44,692 It's all eight hours of the Cambridge lecture series 141 00:06:44,692 --> 00:06:46,540 taken in as a single visual impression, 142 00:06:46,540 --> 00:06:48,297 and I really like this image, 143 00:06:48,297 --> 00:06:50,105 but a lot of people think it's fake. 
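The steering algorithm is described only in outline: word amplitude moves a point along the x-axis, sentence inflection moves it along the y-axis, and a star is placed where each sentence ends. The sketch below is one possible reading of that outline, not the speaker's implementation. The `Sentence` type, the 0.0 to 1.0 amplitude scale, the `baseline` parameter, and the signed inflection value are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    word_amplitudes: list   # one loudness value per word, 0.0 to 1.0 (assumed scale)
    inflection: float       # > 0 rising (question-like), < 0 falling (assumed measure)


def steer(sentences, baseline=0.5):
    """Walk a point through the plane and drop a star at each sentence end."""
    x, y = 0.0, 0.0
    stars = []
    for sentence in sentences:
        for amplitude in sentence.word_amplitudes:
            x += amplitude - baseline      # louder than baseline steps right
        y += sentence.inflection           # rising inflection steps up
        stars.append((x, y))
    return stars


# Two made-up sentences: a falling statement, then a rising question.
lecture = [
    Sentence([0.75, 0.5, 1.0], inflection=-0.5),
    Sentence([0.25, 0.25], inflection=1.0),
]
stars = steer(lecture)
print(stars)
```

Under these assumptions, a run of rising, question-like sentences drifts the point upward, which matches the remark that the trend lines show more questions than answers.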
144 00:06:50,105 --> 00:06:52,449 So I made a more interactive version, 145 00:06:52,449 --> 00:06:57,905 and the way I did that is I used their position in time 146 00:06:57,905 --> 00:07:00,272 in the lecture to place these stars into 3D space, 147 00:07:00,272 --> 00:07:02,673 and with some custom software and a Kinect, 148 00:07:02,673 --> 00:07:05,336 I can walk right into the lecture. 149 00:07:05,336 --> 00:07:07,136 I'm going to wave through the Kinect here 150 00:07:07,136 --> 00:07:08,968 and take control, and now I'm going to reach out 151 00:07:08,968 --> 00:07:12,207 and I'm going to touch a star, and when I do, 152 00:07:12,207 --> 00:07:14,151 it will play the sentence 153 00:07:14,151 --> 00:07:15,616 that generated that star. 154 00:07:15,616 --> 00:07:19,432 Stephen Hawking: There is one, and only one, arrangement 155 00:07:19,432 --> 00:07:22,231 in which the pieces make a complete picture. 156 00:07:22,231 --> 00:07:26,379 Jared Ficklin: Thank you. (Applause) 157 00:07:26,379 --> 00:07:29,552 There are 1,400 stars. 158 00:07:29,552 --> 00:07:31,216 It's a really fun way to explore the lecture, 159 00:07:31,216 --> 00:07:32,683 and, I hope, a fitting homage. 160 00:07:32,683 --> 00:07:38,155 All right. Let me close with a work in progress. 161 00:07:38,155 --> 00:07:41,138 I think, after 30 years, the opportunity exists 162 00:07:41,138 --> 00:07:43,242 to create an enhanced version of closed captioning. 163 00:07:43,242 --> 00:07:45,361 Now, we've all seen a lot of TEDTalks online, 164 00:07:45,361 --> 00:07:48,288 so let's watch one now with the sound turned off 165 00:07:48,288 --> 00:07:52,210 and the closed captioning turned on. 166 00:07:52,210 --> 00:07:54,356 There's no closed captioning for the TED theme song, 167 00:07:54,356 --> 00:07:56,469 and we're missing it, but if you've watched enough of these, 168 00:07:56,469 --> 00:07:57,838 you hear it in your mind's ear, 169 00:07:57,838 --> 00:08:00,821 and then applause starts. 
170 00:08:00,821 --> 00:08:03,011 It usually begins here, and it grows and then it falls. 171 00:08:03,011 --> 00:08:04,988 Sometimes you get a little star applause, 172 00:08:04,988 --> 00:08:07,474 and then I think even Bill Gates takes a nervous breath, 173 00:08:07,474 --> 00:08:09,164 and the talk begins. 174 00:08:09,164 --> 00:08:14,862 All right, so let's watch this clip again. 175 00:08:14,862 --> 00:08:16,118 This time, I'm not going to talk at all. 176 00:08:16,118 --> 00:08:17,485 There's still going to be no audio, 177 00:08:17,485 --> 00:08:19,389 but what I am going to do is I'm going to render the sound 178 00:08:19,389 --> 00:08:23,705 visually in real time at the bottom of the screen. 179 00:08:23,705 --> 00:08:26,496 So watch closely and see what your eyes can hear. 180 00:08:47,880 --> 00:08:49,760 This is fairly amazing to me. 181 00:08:49,760 --> 00:08:53,093 Even on the first view, your eyes will successfully 182 00:08:53,093 --> 00:08:56,181 pick out patterns, but on repeated views, 183 00:08:56,181 --> 00:08:57,870 your brain actually gets better 184 00:08:57,870 --> 00:08:59,526 at turning these patterns into information. 185 00:08:59,526 --> 00:09:01,117 You can get the tone and the timbre 186 00:09:01,117 --> 00:09:02,340 and the pace of the speech, 187 00:09:02,340 --> 00:09:04,404 things that you can't get out of closed captioning. 188 00:09:04,404 --> 00:09:06,588 That famous scene in horror movies 189 00:09:06,588 --> 00:09:09,236 where someone is walking up from behind 190 00:09:09,236 --> 00:09:11,196 is something you can see, 191 00:09:11,196 --> 00:09:13,900 and I believe this information would be something 192 00:09:13,900 --> 00:09:16,703 that is useful at times when the audio is turned off 193 00:09:16,703 --> 00:09:19,644 or not heard at all, and I speculate that deaf audiences 194 00:09:19,644 --> 00:09:20,821 might actually even be better 195 00:09:20,821 --> 00:09:22,602 at seeing sound than hearing audiences. 
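How the sound is rendered at the bottom of the screen is not specified in the talk. As a rough sketch of one ingredient, the code below computes the loudness of the audio behind each video frame and draws it as a text bar. The function names, the 30 frames per second rate, and the use of root mean square as the loudness measure are assumptions; conveying tone and timbre would need frequency information as well.

```python
import math


def rms_levels(samples, sample_rate, frames_per_second=30):
    """Loudness (root mean square) of the audio behind each video frame."""
    hop = sample_rate // frames_per_second
    levels = []
    for start in range(0, len(samples) - hop + 1, hop):
        window = samples[start:start + hop]
        levels.append(math.sqrt(sum(s * s for s in window) / hop))
    return levels


def bar(level, width=40):
    """Draw one level (0.0 to 1.0) as a text bar."""
    return "#" * round(min(level, 1.0) * width)


# Synthetic audio: one second of silence, then one second of a 440 Hz tone.
rate = 3000
audio = [0.0] * rate + [math.sin(2 * math.pi * 440 * t / rate)
                        for t in range(rate)]
levels = rms_levels(audio, rate)

# Show the frames around the moment the tone starts.
for level in levels[28:32]:
    print(f"[{bar(level):<40}]")
```

Even this crude view shows when sound starts, stops, and swells, which is the kind of pacing information that plain closed captions leave out.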
196 00:09:22,602 --> 00:09:24,100 I don't know. It's a theory right now. 197 00:09:24,100 --> 00:09:25,621 Actually, it's all just an idea. 198 00:09:25,621 --> 00:09:29,812 And let me end by saying that sound moves in all directions, 199 00:09:29,812 --> 00:09:31,789 and so do ideas. 200 00:09:31,789 --> 00:09:34,901 Thank you. (Applause)