WEBVTT 00:00:00.449 --> 00:00:01.482 My passions 00:00:01.482 --> 00:00:05.171 are music, technology and making things. 00:00:05.171 --> 00:00:08.354 And it's the combination of these things 00:00:08.354 --> 00:00:11.050 that has led me to the hobby of sound visualization, 00:00:11.050 --> 00:00:15.128 and, on occasion, has led me to play with fire. NOTE Paragraph 00:00:15.128 --> 00:00:17.505 This is a Rubens' tube. It's one of many I've made over the years, 00:00:17.505 --> 00:00:19.251 and I have one here tonight. 00:00:19.251 --> 00:00:20.776 It's about an 8-foot-long tube of metal, 00:00:20.776 --> 00:00:22.111 it's got a hundred or so holes on top, 00:00:22.111 --> 00:00:23.864 on that side is the speaker, and here 00:00:23.864 --> 00:00:26.040 is some lab tubing, and it's connected to this tank 00:00:26.040 --> 00:00:27.641 of propane. 00:00:29.087 --> 00:00:32.456 So, let's fire it up and see what it does. 00:00:37.887 --> 00:00:39.778 So let's play a 550-hertz frequency 00:00:39.778 --> 00:00:41.293 and watch what happens. NOTE Paragraph 00:00:41.342 --> 00:00:49.825 (Frequency) NOTE Paragraph 00:00:49.825 --> 00:00:52.538 Thank you. (Applause) 00:00:52.538 --> 00:00:54.526 It's okay to applaud the laws of physics, 00:00:54.526 --> 00:00:55.982 but essentially what's happening here 00:00:55.982 --> 00:00:57.742 -- (Laughter) -- 00:00:57.742 --> 00:01:01.750 is the energy from the sound via the air and gas molecules 00:01:01.750 --> 00:01:04.254 is influencing the combustion properties of propane, 00:01:04.254 --> 00:01:06.158 creating a visible waveform, 00:01:06.158 --> 00:01:08.478 and we can see the alternating regions of compression 00:01:08.478 --> 00:01:10.518 and rarefaction that we call frequency, 00:01:10.518 --> 00:01:12.262 and the height is showing us amplitude. 00:01:12.262 --> 00:01:14.590 So let's change the frequency of the sound, 00:01:14.590 --> 00:01:16.015 and watch what happens to the fire.
NOTE Paragraph 00:01:16.015 --> 00:01:26.145 (Higher frequency) NOTE Paragraph 00:01:26.145 --> 00:01:29.495 So every time we hit a resonant frequency we get a standing wave 00:01:29.495 --> 00:01:31.195 and that emergent sine curve of fire. 00:01:31.195 --> 00:01:32.773 So let's turn that off. We're indoors. 00:01:32.773 --> 00:01:38.364 Thank you. (Applause) NOTE Paragraph 00:01:38.364 --> 00:01:40.711 I also have with me a flame table. 00:01:40.711 --> 00:01:42.276 It's very similar to a Rubens' tube, and it's also used 00:01:42.276 --> 00:01:44.397 for visualizing the physical properties of sound, 00:01:44.397 --> 00:01:46.365 such as eigenmodes, so let's fire it up 00:01:46.365 --> 00:01:48.615 and see what it does. NOTE Paragraph 00:01:52.292 --> 00:01:56.760 Ooh. (Laughter) 00:01:56.760 --> 00:01:59.734 Okay. Now, while the table comes up to pressure, 00:01:59.734 --> 00:02:01.406 let me note here that the sound is not traveling 00:02:01.406 --> 00:02:04.152 in perfect lines. It's actually traveling in all directions, 00:02:04.152 --> 00:02:07.261 and the Rubens' tube's a little like bisecting those waves 00:02:07.261 --> 00:02:09.257 with a line, and the flame table's a little like 00:02:09.257 --> 00:02:11.111 bisecting those waves with a plane, 00:02:11.111 --> 00:02:15.111 and it can show a little more subtle complexity, which is why 00:02:15.111 --> 00:02:17.464 I like to use it to watch Geoff Farina play guitar. NOTE Paragraph 00:02:17.464 --> 00:02:59.745 (Music) NOTE Paragraph 00:02:59.745 --> 00:03:01.510 All right, so it's a delicate dance. 00:03:01.510 --> 00:03:04.055 If you watch closely — (Applause) 00:03:04.055 --> 00:03:06.822 If you watch closely, you may have seen 00:03:06.822 --> 00:03:09.294 some of the eigenmodes, but also you may have seen 00:03:09.294 --> 00:03:13.911 that jazz music is better with fire. 
00:03:13.911 --> 00:03:15.921 Actually, a lot of things are better with fire in my world, 00:03:15.921 --> 00:03:18.353 but the fire's just a foundation. 00:03:18.353 --> 00:03:19.501 It shows very well that eyes can hear, 00:03:19.501 --> 00:03:20.896 and this is interesting to me because 00:03:20.896 --> 00:03:23.750 technology allows us to present sound to the eyes 00:03:23.750 --> 00:03:26.613 in ways that accentuate the strength of the eyes 00:03:26.613 --> 00:03:29.310 for seeing sound, such as the removal of time. NOTE Paragraph 00:03:29.310 --> 00:03:32.694 So here, I'm using a rendering algorithm to paint 00:03:32.694 --> 00:03:35.157 the frequencies of the song "Smells Like Teen Spirit" 00:03:35.157 --> 00:03:37.197 in a way that the eyes can take them in 00:03:37.197 --> 00:03:39.441 as a single visual impression, and the technique 00:03:39.441 --> 00:03:41.414 will also show the strengths of the visual cortex 00:03:41.414 --> 00:03:43.030 for pattern recognition. 00:03:43.030 --> 00:03:44.909 So if I show you another song off this album, 00:03:44.909 --> 00:03:48.390 and another, your eyes will easily pick out 00:03:48.390 --> 00:03:51.318 the use of repetition by the band Nirvana, 00:03:51.318 --> 00:03:53.175 and in the frequency distribution, the colors, 00:03:53.175 --> 00:03:56.223 you can see the clean-dirty-clean sound 00:03:56.223 --> 00:03:57.430 that they are famous for, 00:03:57.430 --> 00:04:01.430 and here is the entire album as a single visual impression, 00:04:01.430 --> 00:04:03.310 and I think this impression is pretty powerful. 
NOTE Paragraph 00:04:03.310 --> 00:04:05.024 At least, it's powerful enough that 00:04:05.024 --> 00:04:06.366 if I show you these four songs, 00:04:06.366 --> 00:04:08.807 and I remind you that this is "Smells Like Teen Spirit," 00:04:08.807 --> 00:04:11.126 you can probably correctly guess, without listening 00:04:11.126 --> 00:04:12.560 to any music at all, that the song 00:04:12.560 --> 00:04:14.854 a die-hard Nirvana fan would enjoy is this song, 00:04:14.854 --> 00:04:17.110 "I'll Stick Around" by the Foo Fighters, 00:04:17.110 --> 00:04:19.110 whose lead singer is Dave Grohl, 00:04:19.110 --> 00:04:22.888 who was the drummer in Nirvana. 00:04:22.888 --> 00:04:24.188 The songs are a little similar, but mostly 00:04:24.188 --> 00:04:25.814 I'm just interested in the idea that someday maybe 00:04:25.814 --> 00:04:30.226 we'll buy a song because we like the way it looks. NOTE Paragraph 00:04:30.226 --> 00:04:31.326 All right, now for some more sound data. 00:04:31.326 --> 00:04:33.978 This is data from a skate park, 00:04:33.978 --> 00:04:36.010 and this is Mabel Davis skate park 00:04:36.010 --> 00:04:38.152 in Austin, Texas. (Skateboard sounds) 00:04:38.152 --> 00:04:39.526 And the sounds you're hearing came from eight 00:04:39.526 --> 00:04:41.742 microphones attached to obstacles around the park, 00:04:41.742 --> 00:04:43.926 and it sounds like chaos, but actually 00:04:43.926 --> 00:04:47.273 all the tricks start with a very distinct slap, 00:04:47.273 --> 00:04:48.877 but successful tricks end with a pop, 00:04:48.877 --> 00:04:50.670 whereas unsuccessful tricks 00:04:50.670 --> 00:04:52.526 more of a scratch and a tumble, 00:04:52.526 --> 00:04:56.536 and tricks on the rail will ring out like a gong, and 00:04:56.536 --> 00:04:59.326 voices occupy very unique frequencies in the skate park. NOTE Paragraph 00:04:59.326 --> 00:05:01.264 So if we were to render these sounds visually, 00:05:01.264 --> 00:05:02.671 we might end up with something like this.
00:05:02.671 --> 00:05:05.127 This is all 40 minutes of the recording, 00:05:05.127 --> 00:05:07.287 and right away the algorithm tells us 00:05:07.287 --> 00:05:09.360 a lot more tricks are missed than are made, 00:05:09.360 --> 00:05:11.695 and also a trick on the rails is a lot more likely 00:05:11.695 --> 00:05:14.567 to produce a cheer, and if you look really closely, 00:05:14.567 --> 00:05:16.300 we can tease out traffic patterns. 00:05:16.300 --> 00:05:22.387 You see the skaters often trick in this direction. The obstacles are easier. NOTE Paragraph 00:05:22.387 --> 00:05:24.092 And in the middle of the recording, the mics pick this up, 00:05:24.092 --> 00:05:26.896 but later in the recording, this kid shows up, 00:05:26.896 --> 00:05:29.824 and he starts using a line at the top of the park 00:05:29.824 --> 00:05:31.626 to do some very advanced tricks on something 00:05:31.626 --> 00:05:32.729 called the tall rail. 00:05:32.729 --> 00:05:34.601 And it's fascinating. At this moment in time, 00:05:34.601 --> 00:05:38.104 all the rest of the skaters turn their lines 90 degrees 00:05:38.104 --> 00:05:39.882 to stay out of his way. 00:05:39.882 --> 00:05:42.425 You see, there's a subtle etiquette in the skate park, 00:05:42.425 --> 00:05:44.025 and it's led by key influencers, 00:05:44.025 --> 00:05:47.273 and they tend to be the kids who can do the best tricks, 00:05:47.273 --> 00:05:49.703 or wear red pants, and on this day the mics picked that up. NOTE Paragraph 00:05:49.703 --> 00:05:53.604 All right, from skate physics to theoretical physics. 00:05:53.604 --> 00:05:55.220 I'm a big fan of Stephen Hawking, 00:05:55.220 --> 00:05:56.556 and I wanted to use all eight hours 00:05:56.556 --> 00:05:59.143 of his Cambridge lecture series to create an homage. 
00:05:59.143 --> 00:06:02.207 Now, in this series he's speaking with the aid of a computer, 00:06:02.207 --> 00:06:05.311 which actually makes identifying the ends of sentences 00:06:05.311 --> 00:06:08.737 fairly easy. So I wrote a steering algorithm. 00:06:08.737 --> 00:06:10.697 It listens to the lecture, and then it uses 00:06:10.697 --> 00:06:13.384 the amplitude of each word to move a point on the x-axis, 00:06:13.384 --> 00:06:16.017 and it uses the inflection of sentences 00:06:16.017 --> 00:06:18.171 to move the same point up and down on the y-axis. NOTE Paragraph 00:06:18.171 --> 00:06:20.945 And these trend lines, you can see, there's more questions 00:06:20.945 --> 00:06:22.815 than answers in the laws of physics, 00:06:22.815 --> 00:06:24.848 and when we reach the end of a sentence, 00:06:24.848 --> 00:06:27.183 we place a star at that location. 00:06:27.183 --> 00:06:29.983 So there's a lot of sentences, so a lot of stars, 00:06:29.983 --> 00:06:32.339 and after rendering all of the audio, this is what we get. 00:06:32.339 --> 00:06:35.268 This is Stephen Hawking's universe. NOTE Paragraph 00:06:35.268 --> 00:06:42.205 (Applause) NOTE Paragraph 00:06:42.205 --> 00:06:44.692 It's all eight hours of the Cambridge lecture series 00:06:44.692 --> 00:06:46.540 taken in as a single visual impression, 00:06:46.540 --> 00:06:48.297 and I really like this image, 00:06:48.297 --> 00:06:50.105 but a lot of people think it's fake. 00:06:50.105 --> 00:06:52.449 So I made a more interactive version, 00:06:52.449 --> 00:06:57.905 and the way I did that is I used their position in time 00:06:57.905 --> 00:07:00.272 in the lecture to place these stars into 3D space, 00:07:00.272 --> 00:07:02.673 and with some custom software and a Kinect, 00:07:02.673 --> 00:07:05.336 I can walk right into the lecture.
00:07:05.336 --> 00:07:07.136 I'm going to wave through the Kinect here 00:07:07.136 --> 00:07:08.968 and take control, and now I'm going to reach out 00:07:08.968 --> 00:07:12.207 and I'm going to touch a star, and when I do, 00:07:12.207 --> 00:07:14.151 it will play the sentence 00:07:14.151 --> 00:07:15.616 that generated that star. NOTE Paragraph 00:07:15.616 --> 00:07:19.432 Stephen Hawking: There is one, and only one, arrangement 00:07:19.432 --> 00:07:22.231 in which the pieces make a complete picture. NOTE Paragraph 00:07:22.231 --> 00:07:26.379 Jared Ficklin: Thank you. (Applause) 00:07:26.379 --> 00:07:29.552 There are 1,400 stars. 00:07:29.552 --> 00:07:31.216 It's a really fun way to explore the lecture, 00:07:31.216 --> 00:07:32.683 and, I hope, a fitting homage. NOTE Paragraph 00:07:32.683 --> 00:07:38.155 All right. Let me close with a work in progress. 00:07:38.155 --> 00:07:41.138 I think, after 30 years, the opportunity exists 00:07:41.138 --> 00:07:43.242 to create an enhanced version of closed captioning. 00:07:43.242 --> 00:07:45.361 Now, we've all seen a lot of TEDTalks online, 00:07:45.361 --> 00:07:48.288 so let's watch one now with the sound turned off 00:07:48.288 --> 00:07:52.210 and the closed captioning turned on. NOTE Paragraph 00:07:52.210 --> 00:07:54.356 There's no closed captioning for the TED theme song, 00:07:54.356 --> 00:07:56.469 and we're missing it, but if you've watched enough of these, 00:07:56.469 --> 00:07:57.838 you hear it in your mind's ear, 00:07:57.838 --> 00:08:00.821 and then applause starts. 00:08:00.821 --> 00:08:03.011 It usually begins here, and it grows and then it falls. 00:08:03.011 --> 00:08:04.988 Sometimes you get a little star applause, 00:08:04.988 --> 00:08:07.474 and then I think even Bill Gates takes a nervous breath, 00:08:07.474 --> 00:08:09.164 and the talk begins. NOTE Paragraph 00:08:09.164 --> 00:08:14.862 All right, so let's watch this clip again. 
00:08:14.862 --> 00:08:16.118 This time, I'm not going to talk at all. 00:08:16.118 --> 00:08:17.485 There's still going to be no audio, 00:08:17.485 --> 00:08:19.389 but what I am going to do is I'm going to render the sound 00:08:19.389 --> 00:08:23.705 visually in real time at the bottom of the screen. 00:08:23.705 --> 00:08:26.496 So watch closely and see what your eyes can hear. NOTE Paragraph 00:08:47.880 --> 00:08:49.760 This is fairly amazing to me. 00:08:49.760 --> 00:08:53.093 Even on the first view, your eyes will successfully 00:08:53.093 --> 00:08:56.181 pick out patterns, but on repeated views, 00:08:56.181 --> 00:08:57.870 your brain actually gets better 00:08:57.870 --> 00:08:59.526 at turning these patterns into information. 00:08:59.526 --> 00:09:01.117 You can get the tone and the timbre 00:09:01.117 --> 00:09:02.340 and the pace of the speech, 00:09:02.340 --> 00:09:04.404 things that you can't get out of closed captioning. 00:09:04.404 --> 00:09:06.588 That famous scene in horror movies 00:09:06.588 --> 00:09:09.236 where someone is walking up from behind 00:09:09.236 --> 00:09:11.196 is something you can see, 00:09:11.196 --> 00:09:13.900 and I believe this information would be something 00:09:13.900 --> 00:09:16.703 that is useful at times when the audio is turned off 00:09:16.703 --> 00:09:19.644 or not heard at all, and I speculate that deaf audiences 00:09:19.644 --> 00:09:20.821 might actually even be better 00:09:20.821 --> 00:09:22.602 at seeing sound than hearing audiences. 00:09:22.602 --> 00:09:24.100 I don't know. It's a theory right now. 00:09:24.100 --> 00:09:25.621 Actually, it's all just an idea. NOTE Paragraph 00:09:25.621 --> 00:09:29.812 And let me end by saying that sound moves in all directions, 00:09:29.812 --> 00:09:31.789 and so do ideas. 00:09:31.789 --> 00:09:34.901 Thank you. (Applause)