WEBVTT 00:00:00.992 --> 00:00:04.410 It's actually delightful to be here 00:00:04.410 --> 00:00:07.126 So thanks to for inviting me. 00:00:07.126 --> 00:00:10.822 I am a mathematician. And you may not 00:00:10.822 --> 00:00:12.860 associate mathematicians with people that 00:00:12.860 --> 00:00:13.953 love art. 00:00:13.953 --> 00:00:15.787 But I happen to think that mathematics is 00:00:15.787 --> 00:00:17.164 the most beautiful thing on earth. 00:00:17.164 --> 00:00:19.247 And I want to convince you today that 00:00:19.247 --> 00:00:20.745 that's the case. 00:00:20.745 --> 00:00:21.797 Now we're going to use a few tricks. 00:00:21.797 --> 00:00:23.624 So we're not just going to stare at 00:00:23.624 --> 00:00:26.422 equations. I get really a wonderful 00:00:26.422 --> 00:00:29.441 feeling inside of me when I stare at a 00:00:29.441 --> 00:00:32.050 system of equations. But I'm not expecting 00:00:32.050 --> 00:00:33.448 you to have that same sort of feeling. 00:00:33.448 --> 00:00:35.497 So what I'd like to do is provide some 00:00:35.497 --> 00:00:37.265 color and some illustrations of the 00:00:37.265 --> 00:00:42.545 systems of equations. 00:00:42.545 --> 00:00:45.017 Now, matrices what are matrices, you know? 00:00:45.017 --> 00:00:47.873 Why, as a mathematician, do I love them so 00:00:47.873 --> 00:00:49.821 much? Well, let me just give you a little 00:00:49.821 --> 00:00:52.357 bit of a feel of the sort of problems I 00:00:52.357 --> 00:00:53.855 work on. I'm a computational mathematician 00:00:53.855 --> 00:00:58.075 And I really like things to do with fluid 00:00:58.075 --> 00:01:00.642 flow. So you give me a fluid, I like to 00:01:00.642 --> 00:01:03.646 simulate it. 
So I've done work on 00:01:03.646 --> 00:01:04.751 coastal ocean flows. 00:01:04.751 --> 00:01:06.485 I've done work on oil and gas flows. 00:01:06.485 --> 00:01:10.796 I've designed sails for the America's Cup, 00:01:10.796 --> 00:01:13.126 for those of you who know about the America's 00:01:13.126 --> 00:01:14.982 Cup right now. Not for this particular 00:01:14.982 --> 00:01:17.429 race, but in 2000 and 2003. 00:01:17.429 --> 00:01:19.472 But in all of these things there is a 00:01:19.472 --> 00:01:21.776 flow, and I'd like to simulate it. 00:01:21.776 --> 00:01:23.386 And to simulate this, 00:01:23.386 --> 00:01:26.641 we set up a really large system of 00:01:26.641 --> 00:01:29.039 equations relating pressures at all 00:01:29.039 --> 00:01:31.167 sorts of different points in the flow 00:01:31.167 --> 00:01:32.720 field, and velocities, 00:01:32.720 --> 00:01:35.059 and we need to solve it. 00:01:35.059 --> 00:01:35.933 And ultimately the system of equations 00:01:35.933 --> 00:01:37.701 leads to something called a matrix. 00:01:37.701 --> 00:01:39.039 I'll show you an example. 00:01:39.039 --> 00:01:41.582 But also, because I like matrices so much, 00:01:41.582 --> 00:01:42.975 and these matrix equations, 00:01:42.975 --> 00:01:44.457 I can use that same 00:01:44.457 --> 00:01:47.061 knowledge to help with, for example, 00:01:47.061 --> 00:01:49.221 the design of a search engine. 00:01:49.221 --> 00:01:51.343 Or a recommender system. 00:01:51.343 --> 00:01:53.234 And that may sound really funny to you, 00:01:53.234 --> 00:01:54.141 but it's exactly the same math 00:01:54.141 --> 00:01:56.036 behind it. 00:01:56.036 --> 00:01:58.263 And the math behind it 00:01:58.263 --> 00:01:59.123 is this matrix stuff. 00:01:59.123 --> 00:02:01.258 Now for some of you, 00:02:01.258 --> 00:02:02.804 matrices may actually have to do 00:02:02.804 --> 00:02:04.481 with something like this.
00:02:04.481 --> 00:02:06.864 Do you recognize that? It's from 00:02:06.864 --> 00:02:07.816 the movie The Matrix, 00:02:07.816 --> 00:02:10.234 which I happen to love. 00:02:10.234 --> 00:02:12.342 But it has nothing to do with my field. 00:02:12.342 --> 00:02:14.388 But if you type in matrices, 00:02:14.388 --> 00:02:15.966 or you type in matrices for 00:02:15.966 --> 00:02:17.456 engineering applications 00:02:17.456 --> 00:02:19.396 in Google, you get millions and millions 00:02:19.396 --> 00:02:20.021 and millions of hits. 00:02:20.021 --> 00:02:21.814 And you can see lots of different 00:02:21.814 --> 00:02:23.631 applications, which somehow use this. 00:02:23.631 --> 00:02:25.521 This is just a snapshot 00:02:25.521 --> 00:02:26.923 of the images, 00:02:26.923 --> 00:02:28.438 the first page of images you see 00:02:28.438 --> 00:02:29.991 when you type in matrices. 00:02:29.991 --> 00:02:32.941 00:02:32.941 --> 00:02:34.945 And when you go look at this, 00:02:34.945 --> 00:02:35.323 you see matrices come up in 00:02:35.323 --> 00:02:36.070 fluid mechanics, 00:02:36.070 --> 00:02:39.314 but also in structural mechanics. 00:02:39.314 --> 00:02:41.981 You see them come up in social networks, 00:02:41.981 --> 00:02:42.487 you see them come up in neurology, 00:02:42.487 --> 00:02:44.110 biology. So many different application areas 00:02:45.890 --> 00:02:46.368 use these matrices. 00:02:46.368 --> 00:02:47.109 And so, visualization of them is nice, 00:02:47.109 --> 00:02:52.801 also because sometimes, when we stare 00:02:52.801 --> 00:02:55.691 at a big matrix, 00:02:55.691 --> 00:02:57.716 which is just a table with numbers, 00:02:57.716 --> 00:02:59.946 it's very hard to discern patterns. 00:02:59.946 --> 00:03:02.124 But sometimes when you start visualizing 00:03:02.124 --> 00:03:02.837 them, 00:03:02.837 --> 00:03:04.919 you can get a deeper understanding, 00:03:04.919 --> 00:03:07.084 also of the underlying math 00:03:07.084 --> 00:03:08.119 and the underlying physics.
00:03:08.119 --> 00:03:12.681 So, let's start with a little bit of 00:03:12.681 --> 00:03:13.664 algebra. 00:03:13.664 --> 00:03:17.630 So say, I have four unknowns, that I want 00:03:17.630 --> 00:03:19.670 to compute. So this brings 00:03:19.670 --> 00:03:21.071 you back to, maybe, high school algebra 00:03:21.071 --> 00:03:23.990 If I have four unknowns I want to 00:03:23.990 --> 00:03:24.962 compute, 00:03:24.962 --> 00:03:25.949 in this case they're called 00:03:25.949 --> 00:03:27.378 W, X, Y and Z 00:03:27.378 --> 00:03:29.462 We always use these sort of variables 00:03:29.462 --> 00:03:30.553 because mathematicians aren't 00:03:30.553 --> 00:03:32.325 that creative 00:03:32.325 --> 00:03:32.985 alright, so we use things like 00:03:32.985 --> 00:03:33.235 X1, X2, X3 00:03:33.235 --> 00:03:35.338 and so on, right? 00:03:35.338 --> 00:03:37.640 And so if I want to solve for 00:03:37.640 --> 00:03:38.588 four unknowns, 00:03:38.588 --> 00:03:40.263 I need four constraints. 00:03:40.263 --> 00:03:41.573 Four equations 00:03:41.573 --> 00:03:43.156 that tell me how they relate to 00:03:43.156 --> 00:03:43.843 each other 00:03:43.843 --> 00:03:45.369 and here I just made up 00:03:45.369 --> 00:03:46.804 four of those 00:03:46.804 --> 00:03:47.545 K? 00:03:47.545 --> 00:03:49.211 Now one of the things that you see, 00:03:49.211 --> 00:03:50.526 is that there is 00:03:50.526 --> 00:03:52.374 some empty spots here. 00:03:52.374 --> 00:03:54.576 And you also see that there is 00:03:54.576 --> 00:03:56.923 this pattern; W, X, Y Z 00:03:56.923 --> 00:03:58.324 They are always written in the 00:03:58.324 --> 00:03:59.121 same pattern. 00:03:59.121 --> 00:04:00.970 Sometime there is a one in front 00:04:00.970 --> 00:04:01.847 of it. 00:04:01.847 --> 00:04:03.152 1 times W, 1 times X 00:04:03.152 --> 00:04:04.425 and sometimes there is nothing. 00:04:04.425 --> 00:04:06.121 Which really means there was a zero 00:04:06.121 --> 00:04:07.627 in front of it, right? 
00:04:07.627 --> 00:04:09.205 But as a mathematician, 00:04:09.205 --> 00:04:11.208 I look at these, I'm a very organized 00:04:11.208 --> 00:04:11.914 person 00:04:11.914 --> 00:04:13.501 So I write all of these things in 00:04:13.501 --> 00:04:15.357 this particular order. 00:04:15.357 --> 00:04:17.219 And I see there are four equations, and 00:04:17.219 --> 00:04:19.945 each equation has a bunch of 00:04:19.945 --> 00:04:21.487 coefficient corresponding to W. 00:04:21.487 --> 00:04:23.581 Bunch corresponding to X, 00:04:23.581 --> 00:04:25.120 to Y, and to Z. 00:04:25.120 --> 00:04:27.235 And all I really need to remember 00:04:27.235 --> 00:04:30.376 for the system, is these coefficients. 00:04:30.376 --> 00:04:32.280 Right? As soon as I fix the order, 00:04:32.280 --> 00:04:34.587 all I need to remember are these 00:04:34.587 --> 00:04:35.771 coefficients. 00:04:35.771 --> 00:04:37.630 So what I'm going to do, 00:04:37.630 --> 00:04:38.858 is I'm going to re-write this 00:04:38.858 --> 00:04:39.574 a little bit. 00:04:39.574 --> 00:04:40.543 And I'm going to create, 00:04:40.543 --> 00:04:43.116 this table here, which has these 00:04:43.116 --> 00:04:44.501 coefficients, now I left out 00:04:44.501 --> 00:04:46.500 the zeros. 00:04:46.500 --> 00:04:47.743 Now we are also very lazy, 00:04:47.743 --> 00:04:48.901 mathematicians. 00:04:48.901 --> 00:04:51.211 Zeros, we never write down. 00:04:51.211 --> 00:04:52.289 Also, because with these 00:04:52.289 --> 00:04:54.235 large systems of equations 00:04:54.235 --> 00:04:54.908 that we have to deal with, 00:04:54.908 --> 00:04:56.074 say for recommender system 00:04:56.074 --> 00:04:58.303 or page rank, or search algorithm 00:04:58.303 --> 00:04:59.524 there are many, many many zeros. 00:04:59.524 --> 00:05:00.282 So we leave them out. 00:05:00.282 --> 00:05:01.074 K? 
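NOTE The coefficient table described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: apart from the first row (w + y = 1, which the speaker reads out later), the coefficients are made up, and the names A, to_table and matvec are hypothetical. The point it shows is the "zeros we never write down" idea: only the non-zero coefficients are stored.
```python
# Hypothetical 4x4 system in the unknowns w, x, y, z (every coefficient is 1).
# Only the first equation, w + y = 1, comes from the talk; the other rows are invented.
UNKNOWNS = ["w", "x", "y", "z"]
# Sparse "dictionary of keys" storage: (row, column) -> coefficient, zeros left out.
A = {
    (0, 0): 1, (0, 2): 1,             # w + y
    (1, 1): 1, (1, 2): 1, (1, 3): 1,  # x + y + z
    (2, 2): 1, (2, 3): 1,             # y + z
    (3, 1): 1, (3, 3): 1,             # x + z
}
#
def to_table(entries, n):
    """Render the matrix as rows of text, leaving a blank where a zero would be."""
    return [" ".join(str(entries.get((i, j), " ")) for j in range(n)) for i in range(n)]
#
def matvec(entries, v):
    """Multiply the sparse matrix by a vector, touching only the stored non-zeros."""
    out = [0] * len(v)
    for (i, j), a in entries.items():
        out[i] += a * v[j]
    return out
#
print("\n".join(to_table(A, len(UNKNOWNS))))
print(matvec(A, [1, 1, 1, 1]))
```
With 9 stored entries instead of 16 the saving is small here, but for the very large, mostly-zero systems the speaker mentions the same layout keeps memory proportional to the number of non-zeros.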
00:05:01.074 --> 00:05:02.896 But, arguably, all that's really 00:05:02.896 --> 00:05:05.866 important to me, for this particular 00:05:05.866 --> 00:05:07.148 system, whatever that represents; 00:05:07.148 --> 00:05:10.077 maybe it's pressure-velocity 00:05:10.077 --> 00:05:11.701 relationships in fluid flow, 00:05:11.701 --> 00:05:13.559 maybe it's a friend networking 00:05:13.559 --> 00:05:14.593 algorithm. I don't really care. 00:05:14.593 --> 00:05:16.936 All that's really important 00:05:16.936 --> 00:05:19.281 is this table with the numbers, 00:05:19.281 --> 00:05:20.667 this table that we call a matrix. 00:05:20.667 --> 00:05:24.002 OK? So that's the matrix. 00:05:24.002 --> 00:05:25.910 Now 00:05:25.910 --> 00:05:28.435 sometimes we use very simple 00:05:28.435 --> 00:05:29.977 visualizations. 00:05:29.977 --> 00:05:31.747 One is called a spy plot. 00:05:31.747 --> 00:05:33.794 And all I do 00:05:33.794 --> 00:05:36.458 is, you know, I had all these ones, 00:05:36.458 --> 00:05:38.783 and everywhere a one appears 00:05:38.783 --> 00:05:41.012 I simply put a little dot. 00:05:41.012 --> 00:05:41.742 OK? 00:05:41.742 --> 00:05:43.582 So in this case I'm not so interested in 00:05:43.582 --> 00:05:47.134 what size these coefficients are, 00:05:47.134 --> 00:05:48.732 whether they're one or two or ten 00:05:48.732 --> 00:05:50.373 or minus two hundred. 00:05:50.373 --> 00:05:50.923 I don't really care. 00:05:50.923 --> 00:05:52.714 I just want to know where there are 00:05:52.714 --> 00:05:54.858 non-zeros, 00:05:54.858 --> 00:05:56.596 because these non-zeros 00:05:56.596 --> 00:05:59.471 give me exactly how these things relate. 00:05:59.471 --> 00:06:02.944 If I have a non-zero right here, and a non- 00:06:02.944 --> 00:06:07.033 zero there, I know that W and Y are in an 00:06:07.033 --> 00:06:07.970 equation together. 00:06:07.970 --> 00:06:10.909 And so, the patterns of these non-zeros 00:06:10.909 --> 00:06:12.240 give me information about the system.
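NOTE The spy plot just described can be sketched as text output. This is a minimal illustration, not the speaker's software: the function name spy and the matrix M are hypothetical, and the values in M are invented so that the sizes differ while the pattern stays simple.
```python
def spy(rows):
    """Return one text line per matrix row: '*' where an entry is non-zero, '.' elsewhere."""
    return ["".join("*" if value != 0 else "." for value in row) for row in rows]
#
# Hypothetical coefficients of very different sizes; the spy plot ignores the sizes.
M = [
    [1, 0, 10, 0],
    [0, 2, -200, 1],
    [0, 0, 1, 5],
    [0, 3, 0, 1],
]
pattern = spy(M)
print("\n".join(pattern))
```
The first printed row has marks in columns one and three, which is the "W and Y are in an equation together" reading. For large matrices, plotting libraries draw the same picture graphically, for example matplotlib's pyplot.spy.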
00:06:12.240 --> 00:06:14.453 Does that make sense? 00:06:14.453 --> 00:06:16.862 And so the simplest way, 00:06:16.862 --> 00:06:18.846 because again, you know, the start 00:06:18.846 --> 00:06:20.428 is very simple, and not super creative: 00:06:20.428 --> 00:06:22.137 in the beginning, we just put a little 00:06:22.137 --> 00:06:24.923 dot. And then for very large systems 00:06:24.923 --> 00:06:27.291 of equations we may get things like this. 00:06:27.291 --> 00:06:30.102 So this is called a "spy plot." 00:06:30.102 --> 00:06:33.425 So this is just a matrix, with dots 00:06:33.425 --> 00:06:35.242 wherever there is a non-zero, 00:06:35.242 --> 00:06:37.531 and you see there are patterns 00:06:37.531 --> 00:06:38.975 now that I can see. 00:06:38.975 --> 00:06:41.247 And actually these come from a network; 00:06:41.247 --> 00:06:44.432 a matrix can also represent a network, 00:06:44.432 --> 00:06:45.905 which we'll see in a little bit. 00:06:45.905 --> 00:06:47.977 They're Stanford networks. 00:06:47.977 --> 00:06:50.407 This is, I think, Stanford as a whole. 00:06:50.407 --> 00:06:52.940 And this was the Internet. 00:06:52.940 --> 00:06:55.510 And all the pages in Aero Astro 00:06:55.510 --> 00:06:58.265 that we crawled in 2004 to create those 00:06:58.265 --> 00:06:59.063 "spy plots." 00:06:59.063 --> 00:07:01.800 And looking at this, I can see, 00:07:01.800 --> 00:07:03.480 because I know a little bit about how 00:07:03.480 --> 00:07:04.186 these were created, 00:07:04.186 --> 00:07:07.013 that here are clusters of websites 00:07:07.013 --> 00:07:09.514 that are correlated very strongly, 00:07:09.514 --> 00:07:11.586 so they keep on referring to each other 00:07:11.586 --> 00:07:14.263 internally. And they're not very 00:07:14.263 --> 00:07:15.147 much connected to this. 00:07:15.147 --> 00:07:17.222 So that's central administration. 00:07:17.222 --> 00:07:18.102 They only talk to each other. 00:07:18.102 --> 00:07:19.532 Not so much to others.
00:07:19.532 --> 00:07:21.197 And they can look at this, and they 00:07:21.197 --> 00:07:22.046 can discern 00:07:22.046 --> 00:07:23.082 organizational structures. 00:07:23.082 --> 00:07:24.868 And it's amazing how you can 00:07:24.868 --> 00:07:25.964 use these things. 00:07:25.964 --> 00:07:26.439 Right? 00:07:26.439 --> 00:07:28.502 And of course, some other equations 00:07:28.502 --> 00:07:30.394 lead to more interesting patterns. 00:07:30.394 --> 00:07:32.378 This is a "spy plot" 00:07:32.378 --> 00:07:33.535 of a particular 00:07:33.535 --> 00:07:35.533 equation, or system of equations, 00:07:35.533 --> 00:07:37.433 that looked at oil reservoir modeling. 00:07:37.433 --> 00:07:40.317 And when I look at these particular 00:07:40.317 --> 00:07:41.214 patterns, I can say something about 00:07:41.214 --> 00:07:45.426 the behavior, not that much. 00:07:45.426 --> 00:07:47.281 But it's still, you know, reasonably 00:07:47.281 --> 00:07:47.944 pretty. 00:07:47.944 --> 00:07:49.119 And then what I could do, 00:07:49.119 --> 00:07:52.383 if the actual non-zeros change in size, 00:07:52.383 --> 00:07:54.302 so if some are bigger than others, 00:07:54.302 --> 00:07:56.013 I could give them a color, 00:07:56.013 --> 00:07:56.766 depending on the size. 00:07:56.766 --> 00:07:58.406 Now we're getting very artsy, 00:07:58.406 --> 00:07:59.405 for a mathematician. 00:07:59.405 --> 00:08:00.221 Which is pretty amazing. 00:08:00.221 --> 00:08:03.695 And I get things like this. 00:08:03.695 --> 00:08:06.603 Now there are many more interesting ways 00:08:06.603 --> 00:08:07.883 to do this though, and this is really 00:08:07.883 --> 00:08:09.946 what I'd like to show you today, 00:08:09.946 --> 00:08:11.809 which is to use graphs. 00:08:11.809 --> 00:08:13.542 So we're going to go back, 00:08:13.542 --> 00:08:16.106 I hope. 00:08:16.106 --> 00:08:16.620 Yep. 00:08:16.620 --> 00:08:18.377 To this matrix. 00:08:18.377 --> 00:08:19.004 OK?
00:08:19.004 --> 00:08:25.126 Now I want you to focus on this 00:08:25.126 --> 00:08:28.828 equation here. This equation had W, 00:08:28.828 --> 00:08:30.933 plus zero times X, 00:08:30.933 --> 00:08:32.789 X doesn't play a role, 00:08:32.789 --> 00:08:33.894 plus Y, plus nothing times Z, 00:08:33.894 --> 00:08:36.365 is equal to one. 00:08:36.365 --> 00:08:38.712 That's really where that came from. 00:08:38.712 --> 00:08:41.114 And from this, I know that there is 00:08:41.114 --> 00:08:44.794 a connection between W and X. 00:08:44.794 --> 00:08:50.459 They're in the same equation. 00:08:50.459 --> 00:08:51.017 Right? 00:08:51.017 --> 00:08:51.869 W and Y, sorry. W and Y. 00:08:51.869 --> 00:08:54.101 I thought you said "Why," so I tried to 00:08:54.101 --> 00:08:57.251 explain it again. (Laughter). 00:08:57.251 --> 00:08:58.401 Because there is an equation connecting them. 00:08:58.401 --> 00:09:01.215 But W and Y. 00:09:01.215 --> 00:09:03.984 So all I'm going to remember now is 00:09:03.984 --> 00:09:05.294 that there is that connection. 00:09:05.294 --> 00:09:10.665 Now W is here in this location and Y 00:09:10.665 --> 00:09:13.594 is in this location. And this non-zero, 00:09:13.594 --> 00:09:16.442 I can see, is really connecting 00:09:16.442 --> 00:09:17.254 those two together. 00:09:17.254 --> 00:09:21.904 So if I say I have a non-zero in the 00:09:21.904 --> 00:09:23.782 first row and the third column, 00:09:23.782 --> 00:09:27.726 I know that W and Y are connected. 00:09:27.726 --> 00:09:29.665 If there is a non-zero, say, in the 00:09:29.665 --> 00:09:31.155 second row and the fourth column, 00:09:31.155 --> 00:09:34.436 I know that X and Z are connected. 00:09:34.436 --> 00:09:35.995 And that's what I am going to do.
00:09:35.995 --> 00:09:41.067 So I am just looking at these non-zeros, 00:09:41.067 --> 00:09:43.687 you know, these ones off of the diagonal. 00:09:43.687 --> 00:09:46.874 Not the one that's saying W is connected 00:09:46.874 --> 00:09:47.554 to itself, but this one 00:09:47.554 --> 00:09:50.235 signifies a connection between W and Y, 00:09:50.235 --> 00:09:52.285 this one between X and Y. 00:09:52.285 --> 00:09:56.410 And so when I look at those non-zeros, 00:09:56.410 --> 00:09:58.520 I could also write it like this, like a 00:09:58.520 --> 00:09:59.832 little graph. 00:09:59.832 --> 00:10:03.397 Now I see one and three are connected. 00:10:03.397 --> 00:10:05.149 The first element and the third element, 00:10:05.149 --> 00:10:07.129 that is the W and the Y. 00:10:07.129 --> 00:10:10.744 Now I number W, X, Y and Z: 00:10:10.744 --> 00:10:11.824 one, two, three, four. 00:10:11.824 --> 00:10:13.896 And so one and three are connected, 00:10:13.896 --> 00:10:15.673 there is another equation that connects 00:10:15.673 --> 00:10:17.854 two with three, and there is an equation 00:10:17.854 --> 00:10:19.573 that connects two and four. 00:10:19.573 --> 00:10:20.941 And there is an equation that connects 00:10:20.941 --> 00:10:21.789 three and four. 00:10:21.789 --> 00:10:23.578 These are the connections that I have 00:10:23.578 --> 00:10:24.641 between these unknowns. 00:10:24.641 --> 00:10:27.097 Now I made it very abstract, right? 00:10:27.097 --> 00:10:29.210 Because I had this system of equations 00:10:29.210 --> 00:10:31.390 that told me exactly what these equations 00:10:31.390 --> 00:10:32.955 were. And I'm just removing that 00:10:32.955 --> 00:10:34.941 and saying, I'm just interested now in 00:10:34.941 --> 00:10:36.593 connections. 00:10:36.593 --> 00:10:37.942 What is influenced by what? 00:10:37.942 --> 00:10:40.768 If W and Y are in an equation together, 00:10:40.768 --> 00:10:42.848 then the size of one influences 00:10:42.848 --> 00:10:43.742 the size of the other.
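NOTE The step from matrix to graph described above can be sketched as follows. This is an illustration under stated assumptions: only the first row of M (w + y) is taken from the talk, the other rows are invented so that the off-diagonal non-zeros reproduce the connections the speaker lists, and the name edges_from_matrix is hypothetical.
```python
def edges_from_matrix(rows):
    """Turn every off-diagonal non-zero into an undirected edge, numbering unknowns from 1."""
    edges = set()
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if i != j and value != 0:
                # Store each pair with the smaller number first so (3, 1) and (1, 3) coincide.
                edges.add((min(i, j) + 1, max(i, j) + 1))
    return sorted(edges)
#
# Hypothetical coefficient matrix in the unknowns w, x, y, z (numbered 1 to 4).
M = [
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
edges = edges_from_matrix(M)
print(edges)
```
The diagonal entries are skipped on purpose: they only say that an unknown is connected to itself, which adds nothing to the graph.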
00:10:43.742 --> 00:10:46.175 That's all I am interested in now. 00:10:46.175 --> 00:10:49.434 Now vice versa, if I had a network like 00:10:49.434 --> 00:10:52.389 this, a connected graph, maybe 00:10:52.389 --> 00:10:55.234 friends. Friend one is connected to 00:10:55.234 --> 00:10:56.939 friend three and three is friends with 00:10:56.939 --> 00:10:58.329 four but four is not friends with one, 00:10:58.329 --> 00:11:01.641 then I could replace that network, 00:11:01.641 --> 00:11:04.012 this graph, with a matrix. 00:11:04.012 --> 00:11:06.035 Right? I could go from one to the 00:11:06.035 --> 00:11:08.983 other. But now I'm going to take these 00:11:08.983 --> 00:11:10.452 matrices, maybe they come from 00:11:10.452 --> 00:11:12.723 fluid mechanics. And I have 10 million 00:11:12.723 --> 00:11:17.013 columns this way, and I have 10 million 00:11:17.013 --> 00:11:19.414 rows this way. This is what we call a small 00:11:19.414 --> 00:11:20.206 simulation. 00:11:20.206 --> 00:11:22.760 So I have a lot, and I'm going to create 00:11:22.760 --> 00:11:25.023 a graph out of that. Right? 00:11:25.023 --> 00:11:27.652 Now when I look at that graph I can see 00:11:27.652 --> 00:11:29.174 these connections, but of course 00:11:29.174 --> 00:11:30.906 you immediately say, well, I can draw this 00:11:30.906 --> 00:11:32.230 in all sorts of different ways. 00:11:32.230 --> 00:11:35.832 You see the same graph, just drawn 00:11:35.832 --> 00:11:38.026 a little differently. 00:11:38.026 --> 00:11:39.173 And then the question is, well, 00:11:39.173 --> 00:11:41.143 which drawing do you prefer? 00:11:41.143 --> 00:11:42.654 Which makes it clearest what the 00:11:42.654 --> 00:11:44.893 connections are? 00:11:44.893 --> 00:11:45.840 To you. 00:11:45.840 --> 00:11:47.277 Just by looking at it, what do you think? 00:11:47.277 --> 00:11:49.621 This one? 00:11:49.621 --> 00:11:50.836 (Audience) "That one" 00:11:50.836 --> 00:11:53.397 Yeah, I like this one a lot too.
00:11:53.397 --> 00:11:55.521 So then, of course, this is just 00:11:55.521 --> 00:11:57.437 with four, right? I just had four: 00:11:57.437 --> 00:12:00.154 one, two, three and four, and their connections. 00:12:00.154 --> 00:12:02.272 Now suppose I have 10 million, 00:12:02.272 --> 00:12:05.472 with maybe 50 million connections in total, 00:12:05.472 --> 00:12:09.400 and I ask my student, make me 00:12:09.400 --> 00:12:11.773 a nice looking graph. 00:12:11.773 --> 00:12:14.006 So they can look at it and maybe 00:12:14.006 --> 00:12:15.822 discern a little bit of information about 00:12:15.822 --> 00:12:16.565 fluid flow. 00:12:16.565 --> 00:12:17.252 Right? 00:12:17.252 --> 00:12:19.147 So that would be very hard to do by hand. 00:12:19.147 --> 00:12:22.240 So if I had something like this, 00:12:22.240 --> 00:12:24.240 and I say, now give me something 00:12:24.240 --> 00:12:26.555 that looks very good, because now there 00:12:26.555 --> 00:12:28.369 are all these overlapping connections. 00:12:28.369 --> 00:12:30.744 So then, it's an interesting thing: 00:12:30.744 --> 00:12:34.555 how do I pull apart this complicated 00:12:34.555 --> 00:12:36.902 looking graph and make something 00:12:36.902 --> 00:12:38.679 where structures and connections 00:12:38.679 --> 00:12:41.137 are much easier to see? 00:12:41.137 --> 00:12:43.381 And that's what I want to show you, 00:12:43.381 --> 00:12:45.194 because we can do this. Now you may say 00:12:45.194 --> 00:12:46.837 this is a made-up example. 00:12:46.837 --> 00:12:48.273 What has a messy network like this? 00:12:48.273 --> 00:12:49.001 Well, let me just show you. 00:12:49.001 --> 00:12:49.864 Just one example 00:12:49.864 --> 00:12:51.366 that I just got off the web. 00:12:51.366 --> 00:12:54.535 I just looked at work by Allera Hall. 00:12:54.535 --> 00:12:56.778 And I looked at sagas from Iceland, 00:12:56.778 --> 00:12:58.726 and he published this.
00:12:58.726 --> 00:13:00.651 So these are the connections between 00:13:00.651 --> 00:13:02.630 various sagas. 00:13:02.630 --> 00:13:04.330 Obviously this is very messy. 00:13:04.330 --> 00:13:05.701 And I think he should be using 00:13:05.701 --> 00:13:06.429 our software. 00:13:06.429 --> 00:13:08.980 (Audience Laughter) 00:13:08.980 --> 00:13:10.127 Maybe I should send it to him. 00:13:10.127 --> 00:13:11.527 So these things happen. 00:13:11.527 --> 00:13:13.090 So now the question is, 00:13:13.090 --> 00:13:16.171 if I have a bunch of nodes 00:13:16.171 --> 00:13:17.803 and I would just place the nodes, 00:13:17.803 --> 00:13:18.997 one, two, three, four, all the way 00:13:18.997 --> 00:13:20.273 to 10 million 00:13:20.273 --> 00:13:22.553 and I have connections between them, 00:13:22.553 --> 00:13:24.560 how do I figure out how to pull them 00:13:24.560 --> 00:13:27.476 apart and put them on a two 00:13:27.476 --> 00:13:28.566 dimensional piece of paper. 00:13:28.566 --> 00:13:29.873 So that I have a nice view. 00:13:29.873 --> 00:13:34.411 Okay? So how would you do it? 00:13:34.411 --> 00:13:37.120 (Audience "grab one and pull?") 00:13:37.120 --> 00:13:40.067 Somehow you need - no first of all you 00:13:40.067 --> 00:13:41.371 need two nodes that are really 00:13:41.371 --> 00:13:44.807 strongly connected to come close 00:13:44.807 --> 00:13:46.616 Right? So if node one and two are strongly 00:13:46.616 --> 00:13:47.840 connected, maybe because there was 00:13:47.840 --> 00:13:50.535 a big non-zero in this matrix then 00:13:50.535 --> 00:13:51.508 I want them to be close. 00:13:51.508 --> 00:13:55.503 Right? And if one is connected to two 00:13:55.503 --> 00:13:57.882 and two to four and four to 17 and 00:13:57.882 --> 00:14:00.680 17 to 300, I don't want 1 and 300 00:14:00.680 --> 00:14:02.218 to be too close because there is 00:14:02.218 --> 00:14:04.941 four degrees of separation. 00:14:04.941 --> 00:14:06.367 Right? 
So the question is, how do I do 00:14:06.367 --> 00:14:07.051 that? 00:14:07.051 --> 00:14:14.551 So we have these nodes, and 00:14:14.551 --> 00:14:16.613 we have these lines connecting them. 00:14:16.613 --> 00:14:17.737 OK? Now what we're going to do is 00:14:17.737 --> 00:14:20.982 two things. Each of these lines, we'll 00:14:20.982 --> 00:14:22.357 imagine it's a spring. 00:14:22.357 --> 00:14:26.998 So when we pull things apart, 00:14:26.998 --> 00:14:28.597 they pull back. 00:14:28.597 --> 00:14:30.979 Right? And the size of the spring - 00:14:30.979 --> 00:14:32.499 the strength of the spring, guess what? 00:14:32.499 --> 00:14:35.022 That's determined by what? 00:14:35.022 --> 00:14:38.983 By the strength of that non-zero. 00:14:38.983 --> 00:14:41.918 Right? OK, that's nice. But what would 00:14:41.918 --> 00:14:43.193 happen if I did this? 00:14:43.193 --> 00:14:44.677 If I had all of these nodes and I 00:14:44.677 --> 00:14:47.023 put springs on them, and let them go. 00:14:47.023 --> 00:14:51.097 What would happen if I didn't do anything 00:14:51.097 --> 00:14:51.944 else? 00:14:51.944 --> 00:14:53.753 Would they (shooo) all get together? 00:14:53.753 --> 00:14:56.019 I don't want that either. 00:14:56.019 --> 00:14:56.735 I don't want them all to cluster, 00:14:56.735 --> 00:14:58.732 so they're not allowed to get too close. 00:14:58.732 --> 00:15:01.875 So how can I make things - I need some 00:15:01.875 --> 00:15:03.984 kind of repelling force. So when they get 00:15:03.984 --> 00:15:06.003 too close, they're not allowed to. 00:15:06.003 --> 00:15:06.933 So what do I do? 00:15:06.933 --> 00:15:09.856 I give every node an electric charge, 00:15:09.856 --> 00:15:15.208 so that they repel each other. 00:15:15.208 --> 00:15:17.382 OK? So now I have a whole network of 00:15:17.382 --> 00:15:19.241 balls attached to springs, 00:15:19.241 --> 00:15:21.358 the springs have stiffness, the balls 00:15:21.358 --> 00:15:22.865 have an electric charge.
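NOTE The balls-and-springs picture just described can be sketched as a small simulation. This is a rough illustration, not the speaker's software, and it rests on assumptions that are not in the talk: springs with zero rest length whose pull grows linearly with distance, inverse-square repulsion between equal charges, a start on the unit circle, and plain fixed-step gradient descent. The names force_layout, springs, charge, step and iterations are hypothetical.
```python
import math
#
def force_layout(n, springs, charge=1.0, step=0.05, iterations=500):
    """Relax n charged nodes joined by springs and return their (x, y) positions."""
    # Deterministic start: spread the nodes evenly around the unit circle.
    pos = [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)] for i in range(n)]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(n)]
        # Repulsion: every pair of nodes pushes apart like equal electric charges.
        for i in range(n):
            for j in range(i + 1, n):
                dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = charge * charge / dist ** 3
                force[i][0] += push * dx
                force[i][1] += push * dy
                force[j][0] -= push * dx
                force[j][1] -= push * dy
        # Attraction: each spring pulls its two end nodes together, harder when stiffer.
        for (i, j), stiffness in springs.items():
            dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
            force[i][0] += stiffness * dx
            force[i][1] += stiffness * dy
            force[j][0] -= stiffness * dx
            force[j][1] -= stiffness * dy
        # Move every node a small step along its net force.
        for i in range(n):
            pos[i][0] += step * force[i][0]
            pos[i][1] += step * force[i][1]
    return [tuple(p) for p in pos]
#
# The four-node graph from the talk, nodes renumbered from 0, all springs equally stiff.
springs = {(0, 2): 1.0, (1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0}
layout = force_layout(4, springs)
for node, (x, y) in enumerate(layout, start=1):
    print(f"node {node}: ({x:.2f}, {y:.2f})")
```
The stiffness values stand in for the sizes of the matrix non-zeros, and charge sets how strongly nodes keep their distance; changing either one changes the resulting picture.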
00:15:22.865 --> 00:15:24.533 And I let the whole thing drop on the 00:15:24.533 --> 00:15:25.559 floor. 00:15:25.559 --> 00:15:31.909 Because I want it two-dimensional, right? 00:15:31.909 --> 00:15:35.260 And I let this thing organize itself. 00:15:35.260 --> 00:15:37.563 So it comes to an equilibrium shape. 00:15:37.563 --> 00:15:39.519 That's always minimizing some sort of 00:15:39.519 --> 00:15:40.265 energy. 00:15:40.265 --> 00:15:40.869 Right? 00:15:40.869 --> 00:15:43.000 It's beautiful, these systems do this 00:15:43.000 --> 00:15:44.193 automatically. 00:15:44.193 --> 00:15:46.029 And I just let them organize themselves. 00:15:46.029 --> 00:15:47.640 So it would look something like this. 00:15:47.640 --> 00:15:50.380 We start with this, 00:15:50.380 --> 00:15:51.874 we let it drop, 00:15:51.874 --> 00:15:53.783 and now it becomes this. 00:15:53.783 --> 00:15:56.782 So it's exactly that same configuration. 00:15:56.782 --> 00:15:59.997 Now here we've cheated a little: 00:15:59.997 --> 00:16:04.067 how big you make that electric charge 00:16:04.067 --> 00:16:07.103 and how big you make the strength of 00:16:07.103 --> 00:16:08.698 the springs, 00:16:08.698 --> 00:16:11.139 that determines what stuff you get out. 00:16:11.139 --> 00:16:13.710 So afterward, we need a lot of changing 00:16:13.710 --> 00:16:15.699 to make it look beautiful. 00:16:15.699 --> 00:16:16.994 OK, so this looks really easy. 00:16:16.994 --> 00:16:18.954 But some of the pictures I'm going to 00:16:18.954 --> 00:16:20.794 show you took a long, long time to create. 00:16:20.794 --> 00:16:22.843 Because there was a lot of 00:16:22.843 --> 00:16:23.305 tweaking. Put a little 00:16:23.305 --> 00:16:25.345 bit more spring strength here. 00:16:25.345 --> 00:16:26.227 And more repulsive charge there. 00:16:26.227 --> 00:16:29.685 But when you look at this it's a beautiful 00:16:29.685 --> 00:16:32.676 structure.
And it shows you very naturally 00:16:32.676 --> 00:16:36.192 these clusters, and when you stare at 00:16:36.192 --> 00:16:36.898 these structures 00:16:36.898 --> 00:16:38.150 you can really get some information 00:16:38.150 --> 00:16:39.176 about the underlying system. 00:16:39.176 --> 00:16:41.389 No matter where this comes from. 00:16:41.389 --> 00:16:43.988 Now let me show you some 00:16:43.988 --> 00:16:45.993 really beautiful examples, 00:16:45.993 --> 00:16:50.438 in much larger systems than this. 00:16:50.438 --> 00:16:51.942 This is a financial portfolio 00:16:51.942 --> 00:16:53.323 optimization. 00:16:53.323 --> 00:16:56.276 So this is one of the matrices that 00:16:56.276 --> 00:16:58.707 would come up in one of the 00:16:58.707 --> 00:16:59.994 simulations or computer programs you 00:16:59.994 --> 00:17:01.810 would have in financial portfolio 00:17:01.810 --> 00:17:02.753 optimizations. This looks much better 00:17:02.753 --> 00:17:06.285 than what you would imagine from the 00:17:06.285 --> 00:17:08.010 2008 problems. 00:17:08.010 --> 00:17:08.812 Right? 00:17:08.812 --> 00:17:12.905 How did we know that this was behind it? 00:17:12.905 --> 00:17:16.533 This one is another type of program 00:17:16.533 --> 00:17:18.880 that we often have in optimization, 00:17:18.880 --> 00:17:20.142 called quadratic programming. 00:17:20.142 --> 00:17:22.480 It's the matrix from one of those 00:17:22.480 --> 00:17:24.074 simulations; here is the close-up. 00:17:24.074 --> 00:17:25.620 So they're very intricate things. 00:17:25.620 --> 00:17:27.345 These are just two-dimensional 00:17:27.345 --> 00:17:29.522 patterns. But of course, it looks a little 00:17:29.522 --> 00:17:31.587 bit three-dimensional. 00:17:31.587 --> 00:17:33.231 You can also do these things in 3D, 00:17:33.231 --> 00:17:35.311 but it is much harder to visualize. 00:17:35.311 --> 00:17:39.469 This one is from electrical engineering. 00:17:39.469 --> 00:17:40.630 It's a circuit simulation.
00:17:40.630 --> 00:17:46.131 So we also call this the porcupine. 00:17:46.131 --> 00:17:48.212 And this is a close up 00:17:48.212 --> 00:17:49.876 It's not a super high resolution image, 00:17:49.876 --> 00:17:52.020 but it gives you an idea. 00:17:52.020 --> 00:17:54.507 This is another linear programming problem 00:17:54.507 --> 00:17:56.755 That comes from some sort of optimization 00:17:56.755 --> 00:18:00.170 problem. And I forgot which one this is. 00:18:00.170 --> 00:18:03.118 And this color is often the strength of 00:18:03.118 --> 00:18:06.549 connection. You can also use it in also 00:18:06.549 --> 00:18:07.899 different ways. 00:18:07.899 --> 00:18:09.897 And my friend, Tim, who created this, 00:18:09.897 --> 00:18:12.602 uses colors so the pictures look really 00:18:12.602 --> 00:18:16.827 nice. (audience laughter). 00:18:16.827 --> 00:18:17.997 So we can play with these colors because 00:18:17.997 --> 00:18:19.102 there is lots of different ways to color 00:18:19.102 --> 00:18:19.864 this. 00:18:19.864 --> 00:18:21.414 I have to admit, this is pretty nice. 00:18:21.414 --> 00:18:22.280 I'll show you some others. 00:18:22.280 --> 00:18:29.249 This one, is part of my field, it's 00:18:29.249 --> 00:18:38.188 a matrix of an ocean, of shallow water. 00:18:38.188 --> 00:18:40.358 So where the depth of the water is much 00:18:40.358 --> 00:18:42.372 less than the width of the area. 00:18:42.372 --> 00:18:44.569 Now, this doesn't nearly look as nice. 00:18:44.569 --> 00:18:45.757 But the matrix that comes out 00:18:45.757 --> 00:18:49.568 is very unstructured, but still, 00:18:49.568 --> 00:18:52.738 it almost has the same feel to it as 00:18:52.738 --> 00:18:57.383 water. That's of course why we made it 00:18:57.383 --> 00:18:58.970 green and blue. 00:18:58.970 --> 00:19:02.002 Now this is a close-up. 00:19:02.002 --> 00:19:05.028 This one is another linear programming 00:19:05.028 --> 00:19:07.582 problem. It's my favorite. 
It has quite 00:19:07.582 --> 00:19:09.783 beautiful structures. 00:19:09.783 --> 00:19:12.905 And this one is actually a social network, 00:19:12.905 --> 00:19:16.812 which we have labeled the poppy. 00:19:16.812 --> 00:19:24.761 And here, where you see poppies 00:19:24.761 --> 00:19:27.788 of flowers, they're really clusters 00:19:27.788 --> 00:19:29.447 of friends. 00:19:29.447 --> 00:19:31.756 They are strongly connected friend 00:19:31.756 --> 00:19:34.147 networks within this large social 00:19:34.147 --> 00:19:37.331 network. So you can have a lot of fun 00:19:37.331 --> 00:19:42.755 with this. Right? 00:19:42.755 --> 00:19:46.271 We played with something else as well. 00:19:46.271 --> 00:19:49.730 And I brought a poster. 00:19:49.730 --> 00:19:58.086 Because Allison was going to show her 00:19:58.086 --> 00:19:59.250 beautiful work, I wanted to show 00:19:59.250 --> 00:20:01.595 that we also do nice things. 00:20:01.595 --> 00:20:05.561 Here is my artwork. 00:20:05.561 --> 00:20:11.768 Sometimes I ask people what they are 00:20:11.768 --> 00:20:14.500 looking at, and they say, I'm looking at 00:20:14.500 --> 00:20:16.896 some sort of network or graph. But 00:20:16.896 --> 00:20:20.337 this is the LCSH, 00:20:20.337 --> 00:20:24.525 the Library of Congress Subject Headings. 00:20:24.525 --> 00:20:27.274 So this is the library system that we use 00:20:27.274 --> 00:20:29.073 in almost all libraries of the world. 00:20:29.073 --> 00:20:31.773 And what you are looking at are 00:20:31.773 --> 00:20:36.503 library main categories, and their 00:20:36.503 --> 00:20:38.293 subcategories, and how everything 00:20:38.293 --> 00:20:40.525 is linked together. 00:20:40.525 --> 00:20:45.786 Now the LCSH asked us in 2005, me 00:20:45.786 --> 00:20:47.555 and a group of my students, to help 00:20:47.555 --> 00:20:50.234 them understand the structure.
00:20:50.234 --> 00:20:53.527 So here are catalogers who have 00:20:53.527 --> 00:20:57.173 worked on this categorizing 00:20:57.173 --> 00:21:00.753 for decades. But they have never seen it; 00:21:00.753 --> 00:21:11.819 they just put in the data. So we 00:21:11.819 --> 00:21:14.938 used exactly the same kind of program 00:21:14.938 --> 00:21:19.805 that you just saw. We call it the galaxy. 00:21:19.805 --> 00:21:26.321 I will make sure they get the link 00:21:26.321 --> 00:21:27.736 so they can play with it. You 00:21:27.736 --> 00:21:29.652 can go in and zoom in and click on 00:21:29.652 --> 00:21:32.058 the node and it will jump up 00:21:32.058 --> 00:21:34.436 and show connected nodes, and this 00:21:34.436 --> 00:21:36.689 is a way to browse. What is also really 00:21:36.689 --> 00:21:40.412 funny here is you see this whole mess 00:21:40.412 --> 00:21:43.104 in the middle, because it is very strongly 00:21:43.104 --> 00:21:50.718 interconnected stuff. We put in words 00:21:50.718 --> 00:21:57.917 where the connections were strongest. 00:21:57.917 --> 00:22:02.871 But the library also used it to find 00:22:02.871 --> 00:22:07.100 lazy catalogers. Because look at this, 00:22:07.100 --> 00:22:09.543 we call this a Supernova. This is 00:22:09.543 --> 00:22:14.301 Japanese Antiquities. And it has one main 00:22:14.301 --> 00:22:17.360 category, and a whole bunch of 00:22:17.360 --> 00:22:19.813 subcategories. But the cataloger did 00:22:19.813 --> 00:22:21.360 not interconnect the subcategories. 00:22:21.360 --> 00:22:23.491 So we have Japanese Antiquities, 00:22:23.491 --> 00:22:25.540 and 140 things attached to it, 00:22:25.540 --> 00:22:28.753 but how they are related, he or she did 00:22:28.753 --> 00:22:30.130 not say. So the library can look 00:22:30.130 --> 00:22:34.051 at it and say, we really should do 00:22:34.051 --> 00:22:36.444 something about this.
Because the more 00:22:36.444 --> 00:22:38.592 messy it looks, the better it is for 00:22:38.592 --> 00:22:40.609 browsing purposes. But we just 00:22:45.218 --> 00:22:47.703 thought it was a beautiful picture. 00:22:47.703 --> 00:22:49.539 This took a long, long time to create; 00:22:49.539 --> 00:22:52.272 we had to think about the colors, 00:22:52.272 --> 00:22:58.962 and how the nodes. 00:22:58.962 --> 00:23:00.337 And now we are looking at ways 00:23:00.337 --> 00:23:02.078 to put this in 3D, so you can 00:23:02.078 --> 00:23:03.270 truly fly through the galaxy. 00:23:03.270 --> 00:23:04.746 And you can do the same with 00:23:04.746 --> 00:23:07.854 Wikipedia, and other networks that you 00:23:07.854 --> 00:23:12.767 have. You can say, let's go on a flight 00:23:12.767 --> 00:23:15.790 through my social network. 00:23:15.790 --> 00:23:17.871 And you can see who is more strongly connected 00:23:17.871 --> 00:23:20.889 than you are. Anyway, it is a lot of fun 00:23:20.889 --> 00:23:22.555 to play with. So I'll make sure you have 00:23:22.555 --> 00:23:24.247 this link, and I'll take any questions you 00:23:24.247 --> 00:23:27.376 may have.