WEBVTT 00:00:00.613 --> 00:00:04.318 I feel very honored to be invited here. 00:00:04.318 --> 00:00:05.918 Thank you very much. 00:00:06.586 --> 00:00:10.069 I like to, I think I've seen one 00:00:10.069 --> 00:00:12.140 maybe two other people with gray hair here 00:00:12.140 --> 00:00:15.140 [audience laughter] 00:00:15.140 --> 00:00:17.160 The last talk I gave a few weeks ago 00:00:17.160 --> 00:00:19.472 was to a meeting of ophthalmologists 00:00:19.472 --> 00:00:21.492 and that was a bunch of 00:00:22.860 --> 00:00:27.309 much older people, okay, and 00:00:27.309 --> 00:00:31.269 a homage here to the Fifth Elephant 00:00:32.303 --> 00:00:35.022 This is the novel, the cover of the novel 00:00:35.022 --> 00:00:37.812 from which it was taken and, well actually 00:00:37.818 --> 00:00:41.121 I'm using this as a connection for a 00:00:41.121 --> 00:00:42.905 little bit of boasting because 00:00:42.905 --> 00:00:45.036 Terry Pratchett wrote the book 00:00:45.036 --> 00:00:50.275 I am a co-author of a co-author of Terry Pratchett 00:00:52.611 --> 00:00:55.400 and I actually signed a publisher contract 00:00:55.400 --> 00:00:56.890 on my 70th birthday 00:00:56.890 --> 00:01:00.771 a few weeks ago to publish a 00:01:00.771 --> 00:01:05.121 science-fiction novel with Ian Stewart 00:01:06.739 --> 00:01:11.229 and I mention that not just as boasting but, 00:01:11.881 --> 00:01:16.433 ok, this is a Data Geeks meeting rather 00:01:16.771 --> 00:01:19.301 than the Graphics Geeks meeting, but if 00:01:19.668 --> 00:01:23.202 anybody has graphics enthusiasm, there is 00:01:23.772 --> 00:01:26.538 all kinds of stuff that would be fun 00:01:26.923 --> 00:01:29.407 to build for the website we are putting 00:01:29.407 --> 00:01:34.123 together for that novel. 
strange things 00:01:34.645 --> 00:01:38.825 happening on that planet, so do make 00:01:39.151 --> 00:01:43.301 contact if you're interested in drawing 00:01:43.459 --> 00:01:45.142 strange and beautiful things because 00:01:45.239 --> 00:01:48.009 I have some strange and beautiful things 00:01:48.291 --> 00:01:50.751 to draw and some to interact with. 00:01:51.182 --> 00:01:53.082 What I don't have is a budget 00:01:53.124 --> 00:01:54.744 you have to just like it 00:01:55.353 --> 00:01:58.003 ok, that is pure digression 00:01:58.359 --> 00:02:03.229 I was originally a mathematician and 00:02:03.405 --> 00:02:08.195 that was my PhD back before almost anybody 00:02:08.493 --> 00:02:12.763 here was born, and I've kinda wandered 00:02:12.937 --> 00:02:15.577 around the world and the sciences and 00:02:15.727 --> 00:02:18.337 I've turned into some sort of an engineer 00:02:18.559 --> 00:02:21.361 But what I'm going to talk about here is 00:02:21.411 --> 00:02:25.701 the power of particular mathematical 00:02:25.922 --> 00:02:29.722 point of view, which is that numbers are 00:02:29.953 --> 00:02:31.353 not just numbers 00:02:31.589 --> 00:02:35.079 They belong together in shapes, so 00:02:40.427 --> 00:02:42.057 What are data? 00:02:43.244 --> 00:02:44.828 Mostly, they're numbers 00:02:44.828 --> 00:02:49.018 I know there are fields and things 99:59:59.999 --> 99:59:59.999 We've been hearing about that, but 99:59:59.999 --> 99:59:59.999 then you keep counting 99:59:59.999 --> 99:59:59.999 Lots and lots of it is numbers 99:59:59.999 --> 99:59:59.999 But are numbers only numbers? 
99:59:59.999 --> 99:59:59.999 Well, no they gather together in things 99:59:59.999 --> 99:59:59.999 They come in patterns 99:59:59.999 --> 99:59:59.999 and really big data is all about 99:59:59.999 --> 99:59:59.999 the arrangements those things make 99:59:59.999 --> 99:59:59.999 just knowing the numbers, you don't know anything 99:59:59.999 --> 99:59:59.999 You got to know how they fit together 99:59:59.999 --> 99:59:59.999 Patterns are shapes 99:59:59.999 --> 99:59:59.999 So, studying shapes, data shapes, any kind of shapes 99:59:59.999 --> 99:59:59.999 Space-time shapes. That's Geometry 99:59:59.999 --> 99:59:59.999 But not the kind that I was doing when I was 13 or 14 years old 99:59:59.999 --> 99:59:59.999 Mind you, I had some taste for it and it was quite fun 99:59:59.999 --> 99:59:59.999 but it was all flat in the sand, just like that 99:59:59.999 --> 99:59:59.999 and here is Euclid 99:59:59.999 --> 99:59:59.999 Stuff we would write in little triangles 99:59:59.999 --> 99:59:59.999 and fun things 99:59:59.999 --> 99:59:59.999 This I remember as a remarkable theorem, 99:59:59.999 --> 99:59:59.999 but I have never ever, ever, ever seen a use for 99:59:59.999 --> 99:59:59.999 [audience laughs] 99:59:59.999 --> 99:59:59.999 It's weird, it's very much something 99:59:59.999 --> 99:59:59.999 about the plane, it's strange 99:59:59.999 --> 99:59:59.999 and I have never encountered it or 99:59:59.999 --> 99:59:59.999 referred to it in anything useful since I left school 99:59:59.999 --> 99:59:59.999 It's a bizarre theorem, which is occasionally useful 99:59:59.999 --> 99:59:59.999 Everything is so much in the plane 99:59:59.999 --> 99:59:59.999 Data shapes don't live mostly in the plane 99:59:59.999 --> 99:59:59.999 Geometry doesn't mean that you replace 99:59:59.999 --> 99:59:59.999 now this, by the way, is highly superior pointer technology 99:59:59.999 --> 99:59:59.999 Much better than those twinkling little 99:59:59.999 --> 99:59:59.999 red things that you lose 
track of where it's pointing to 99:59:59.999 --> 99:59:59.999 and 10% of your audience can't see red 99:59:59.999 --> 99:59:59.999 Now, here is something serious 99:59:59.999 --> 99:59:59.999 Children think in 3D 99:59:59.999 --> 99:59:59.999 They think brilliantly in 3D 99:59:59.999 --> 99:59:59.999 They naturally work in 3D 99:59:59.999 --> 99:59:59.999 They are connecting how their vision 99:59:59.999 --> 99:59:59.999 is working with their hands 99:59:59.999 --> 99:59:59.999 they can reach out and grab your nose 99:59:59.999 --> 99:59:59.999 If you watch a small child, it's doing a lot of practice at building 99:59:59.999 --> 99:59:59.999 a 3D model of the world, and then, 99:59:59.999 --> 99:59:59.999 and these days that continues into primary school 99:59:59.999 --> 99:59:59.999 a hundred years ago, ugh.. 99:59:59.999 --> 99:59:59.999 but now primary school's good 99:59:59.999 --> 99:59:59.999 but their secondary schools suck rocks 99:59:59.999 --> 99:59:59.999 It's still, if you get any geometry 99:59:59.999 --> 99:59:59.999 it's flat, flat stuff. 99:59:59.999 --> 99:59:59.999 It can get more and more complicated 99:59:59.999 --> 99:59:59.999 yeah.. but, 99:59:59.999 --> 99:59:59.999 I just grabbed that off the web 99:59:59.999 --> 99:59:59.999 as one particular complicated 2D diagram 99:59:59.999 --> 99:59:59.999 but fix your mind in 2D, you get to the point where you can't think 99:59:59.999 --> 99:59:59.999 Are the x- and y- axes this way or this way? 99:59:59.999 --> 99:59:59.999 I found my UCLA students 99:59:59.999 --> 99:59:59.999 if I switched drawing on the blackboard 99:59:59.999 --> 99:59:59.999 from this way to that way 99:59:59.999 --> 99:59:59.999 because something could be seen better that way 99:59:59.999 --> 99:59:59.999 they couldn't turn it in their heads. 99:59:59.999 --> 99:59:59.999 Well data doesn't live in the plane. 99:59:59.999 --> 99:59:59.999 It's not flat. 
99:59:59.999 --> 99:59:59.999 If we have three variables, we have three dimensions. 99:59:59.999 --> 99:59:59.999 That might be how far this way, 99:59:59.999 --> 99:59:59.999 how far this way, and how far up. 99:59:59.999 --> 99:59:59.999 And if you're doing graphics, it's three very directly spatial dimensions. 99:59:59.999 --> 99:59:59.999 But if it's, if you've just got numbers about people 99:59:59.999 --> 99:59:59.999 I look at everybody here, I know their 99:59:59.999 --> 99:59:59.999 height, well I don't know their height but 99:59:59.999 --> 99:59:59.999 you guys do because you're big data guys. 99:59:59.999 --> 99:59:59.999 Know the height, know the weight, know the age. 99:59:59.999 --> 99:59:59.999 Three numbers, that's a three dimensional set. 99:59:59.999 --> 99:59:59.999 And the pattern that you make, that's 99:59:59.999 --> 99:59:59.999 three dimensional geometry. 99:59:59.999 --> 99:59:59.999 But of course, you typically have 99:59:59.999 --> 99:59:59.999 a lot more. So you've got 'n' dimensions 99:59:59.999 --> 99:59:59.999 and 'n' can be quite big. 99:59:59.999 --> 99:59:59.999 So you need to think about 'n' dimensions. 99:59:59.999 --> 99:59:59.999 And there's two ways to do it. 99:59:59.999 --> 99:59:59.999 One is to turn it all into algebra, which 99:59:59.999 --> 99:59:59.999 is what people spend a lot of their time doing. 99:59:59.999 --> 99:59:59.999 And in this talk I'm only going to talk linear algebra 99:59:59.999 --> 99:59:59.999 which doesn't mean it's the only kind there is, 99:59:59.999 --> 99:59:59.999 but I've only got a few minutes. 99:59:59.999 --> 99:59:59.999 Or you can practice thinking in 3D and 99:59:59.999 --> 99:59:59.999 build up insights that help you very 99:59:59.999 --> 99:59:59.999 seriously in 'n'-dimensional thinking. 
99:59:59.999 --> 99:59:59.999 I took up carving things when I was a grad 99:59:59.999 --> 99:59:59.999 student because I realized that my mind 99:59:59.999 --> 99:59:59.999 had been flattened by my high school 99:59:59.999 --> 99:59:59.999 and my undergraduate. I needed to loosen up 99:59:59.999 --> 99:59:59.999 my mind and think in 3D, so I started 99:59:59.999 --> 99:59:59.999 using my hands. You got, this is the 99:59:59.999 --> 99:59:59.999 visual part of the brain, this is the 99:59:59.999 --> 99:59:59.999 motor part of the brain, and the motor cortex 99:59:59.999 --> 99:59:59.999 just has to be 3D because you've got to 99:59:59.999 --> 99:59:59.999 pick things up, you got to twist them, 99:59:59.999 --> 99:59:59.999 connect it all up. So seriously, for 3D 99:59:59.999 --> 99:59:59.999 thinking, take up sculpture. 99:59:59.999 --> 99:59:59.999 So practice thinking in 3D, and the more 99:59:59.999 --> 99:59:59.999 3D you can think, the readier you are to 99:59:59.999 --> 99:59:59.999 think in other dimensions, general dimensions. 99:59:59.999 --> 99:59:59.999 But 2D? Nah, it's not enough. 99:59:59.999 --> 99:59:59.999 So, question for you guys. 99:59:59.999 --> 99:59:59.999 Most people here have done things - are 99:59:59.999 --> 99:59:59.999 doing things with matrices sometimes, right? 99:59:59.999 --> 99:59:59.999 What does a matrix even mean? 99:59:59.999 --> 99:59:59.999 Whats it represent? 99:59:59.999 --> 99:59:59.999 Blah, you have told me the data structure. 99:59:59.999 --> 99:59:59.999 It's, yeyah, it's an array. 99:59:59.999 --> 99:59:59.999 This is the data structure. 99:59:59.999 --> 99:59:59.999 It's an array this way, and this way. 99:59:59.999 --> 99:59:59.999 But, at the level of algebra, and geometry. 99:59:59.999 --> 99:59:59.999 It's something a bit more. 99:59:59.999 --> 99:59:59.999 It's something that operates on vectors. 
99:59:59.999 --> 99:59:59.999 Transforms vectors, and in particular 99:59:59.999 --> 99:59:59.999 there was a rule that they taught me 99:59:59.999 --> 99:59:59.999 yea back at first or second year of graduate 99:59:59.999 --> 99:59:59.999 of which way you multiply the matrix and the vector. 99:59:59.999 --> 99:59:59.999 And, I swear to you it took me a year to remember 99:59:59.999 --> 99:59:59.999 When I'm multiplying 2 matrices do I go along this way 99:59:59.999 --> 99:59:59.999 or along that way. Because it was a damn silly rule. 99:59:59.999 --> 99:59:59.999 That came from the Algebra book. 99:59:59.999 --> 99:59:59.999 But, trying to avoid spending too much time on this. 99:59:59.999 --> 99:59:59.999 You do know the rule most of you. So, if you have this 3x3 matrix. 99:59:59.999 --> 99:59:59.999 And, apply it to this vector. [column of '1,0,0'] 99:59:59.999 --> 99:59:59.999 You get this. Which is this column. [points to 'a,c,f' columns] 99:59:59.999 --> 99:59:59.999 And, if you apply to this vector. [column of '0,1,0'] 99:59:59.999 --> 99:59:59.999 You get this column. [column of 'b,d,g'] 99:59:59.999 --> 99:59:59.999 And, if you apply to this vector. [ column of '0,0,1'] 99:59:59.999 --> 99:59:59.999 You get the last column. 99:59:59.999 --> 99:59:59.999 Now, '1,0,0' means lets suppose this is the x direction. 99:59:59.999 --> 99:59:59.999 X this way. It says anything that is purely in the 99:59:59.999 --> 99:59:59.999 X direction goes to 'a,c,f'. 99:59:59.999 --> 99:59:59.999 Wherever that is. Which is a 3 dimensional vector sum. 99:59:59.999 --> 99:59:59.999 Anything that is in the Y direction. Like, '0,1,0' goes to something else. 99:59:59.999 --> 99:59:59.999 Specifically, it goes to 'b,d,g'. 99:59:59.999 --> 99:59:59.999 And, anything that starts vertical goes to 'c,e,h'. 99:59:59.999 --> 99:59:59.999 So the matrix is actually a list of vectors. 
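The "matrix is a list of vectors" reading above can be checked numerically. This is a minimal sketch, assuming NumPy; the matrix entries are arbitrary placeholder numbers, not the letters on the speaker's slide, and the names `M`, `e_x`, `e_y`, `e_z` and `v` are illustrative only.

```python
import numpy as np

# Placeholder 3x3 matrix (arbitrary numbers, not taken from the talk's slide).
M = np.array([[2.0, 0.5, 1.0],
              [0.0, 3.0, 4.0],
              [1.0, 1.0, 5.0]])

# The three unit vectors along the x, y and z directions.
e_x = np.array([1.0, 0.0, 0.0])
e_y = np.array([0.0, 1.0, 0.0])
e_z = np.array([0.0, 0.0, 1.0])

# Applying the matrix to each unit vector picks out one column:
# the matrix lists where each axis direction is sent.
image_x = M @ e_x
image_y = M @ e_y
image_z = M @ e_z

# Any other vector goes to the same weighted sum of those three images.
v = np.array([2.0, -1.0, 3.0])
combo = v[0] * image_x + v[1] * image_y + v[2] * image_z

print(image_x, image_y, image_z)
print(M @ v, combo)
```

Reading the columns as "where x goes, where y goes, where z goes" is the same convention a column-vector graphics pipeline relies on when it composes transforms.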
99:59:59.999 --> 99:59:59.999 It's saying, where does the first one go 99:59:59.999 --> 99:59:59.999 where does the second one go, and where does the third one go. 99:59:59.999 --> 99:59:59.999 And believe me, if you're doing 3D computer graphics, 99:59:59.999 --> 99:59:59.999 understanding that point will make it much easier 99:59:59.999 --> 99:59:59.999 than anything I've ever seen in an OpenGL manual 99:59:59.999 --> 99:59:59.999 of what you ought to be doing. 99:59:59.999 --> 99:59:59.999 They don't explain matrices very well. 99:59:59.999 --> 99:59:59.999 It's just a list of where these 3 things go. 99:59:59.999 --> 99:59:59.999 And in the case of a rotation, that's particularly tidy. 99:59:59.999 --> 99:59:59.999 Right angles, things at right angles go to things at right angles. 99:59:59.999 --> 99:59:59.999 And so on. But not every matrix is doing something 99:59:59.999 --> 99:59:59.999 as simple as a rotation. Unless you're as 99:59:59.999 --> 99:59:59.999 simple-minded as an IQ theorist. 99:59:59.999 --> 99:59:59.999 And they rotate things they've no justification doing. 99:59:59.999 --> 99:59:59.999 If you remember that, you can always clarify, see more definitely 99:59:59.999 --> 99:59:59.999 what the algebra is doing, and if you know what the algebra is doing, 99:59:59.999 --> 99:59:59.999 you can make it drive better code. 99:59:59.999 --> 99:59:59.999 What should the code be doing? 99:59:59.999 --> 99:59:59.999 So, I'm just going to illustrate this point of view 99:59:59.999 --> 99:59:59.999 with a very top-down glimpse at some of the things people do 99:59:59.999 --> 99:59:59.999 when they've got a lot of data. One of them 99:59:59.999 --> 99:59:59.999 is Principal Component Analysis. 99:59:59.999 --> 99:59:59.999 Now this is very sketchy, very, very 2D. 99:59:59.999 --> 99:59:59.999 I didn't have time to do wonderful 3D animations. I'm sorry. 99:59:59.999 --> 99:59:59.999 But I've got this variable, this variable.
99:59:59.999 --> 99:59:59.999 Put them together, I've got data points, which are pairs of variables. 99:59:59.999 --> 99:59:59.999 And, roughly speaking, I can say, just looking at this, 99:59:59.999 --> 99:59:59.999 that as this increases, that increases. 99:59:59.999 --> 99:59:59.999 But if I want to compress my data, then I rotate my axes. 99:59:59.999 --> 99:59:59.999 I put an axis along this way, and another axis along this way. 99:59:59.999 --> 99:59:59.999 So, it's going to be a matrix that says this guy goes to there 99:59:59.999 --> 99:59:59.999 and this guy goes to there. 99:59:59.999 --> 99:59:59.999 Finding that matrix; you had this chance of algebra 99:59:59.999 --> 99:59:59.999 not fitting into this time. But that's the idea of Principal Component Analysis. 99:59:59.999 --> 99:59:59.999 And if you're applying it to a machine that does wobbles here, and here, 99:59:59.999 --> 99:59:59.999 and makes squeaks there, and all sorts of things, 99:59:59.999 --> 99:59:59.999 you've got a lot of numbers, and you've got to do the algebra 99:59:59.999 --> 99:59:59.999 a bit more complicatedly than that 2D picture represents. 99:59:59.999 --> 99:59:59.999 But that is really what's going on. 99:59:59.999 --> 99:59:59.999 You're finding the way to move your axes. 99:59:59.999 --> 99:59:59.999 And now, if you know where you are on this axis, 99:59:59.999 --> 99:59:59.999 you know most of what you want to know about a data point. 99:59:59.999 --> 99:59:59.999 Is it here or is it here? One number: how far along it is. 99:59:59.999 --> 99:59:59.999 And then you say, well, you can expect some errors in that direction. 99:59:59.999 --> 99:59:59.999 Where in the previous picture you had to know 99:59:59.999 --> 99:59:59.999 2 directions, 2 numbers.
So Principal Component Analysis 99:59:59.999 --> 99:59:59.999 is a beautiful technique for information compression, 99:59:59.999 --> 99:59:59.999 reducing the amount of arbitrariness, handling all sorts of things 99:59:59.999 --> 99:59:59.999 excellently, as long as things are reasonably linear. 99:59:59.999 --> 99:59:59.999 Which is a very, very big if. 99:59:59.999 --> 99:59:59.999 Small variations are more often linear. That's the whole point of 99:59:59.999 --> 99:59:59.999 the calculus, the calculus is a linear approximation of small changes. 99:59:59.999 --> 99:59:59.999 But, um, larger systems with more variation, 99:59:59.999 --> 99:59:59.999 be wary, they're generally not linear. 99:59:59.999 --> 99:59:59.999 There's an expression, "non-linear mathematics," 99:59:59.999 --> 99:59:59.999 which is a bit like "non-elephant biology." 99:59:59.999 --> 99:59:59.999 You shouldn't be defining everything else by what it isn't 99:59:59.999 --> 99:59:59.999 when what it isn't is so special. 99:59:59.999 --> 99:59:59.999 Mm, yeah, you look to me like an unusual elephant 99:59:59.999 --> 99:59:59.999 with some missing teeth, and going round on two back legs 99:59:59.999 --> 99:59:59.999 for some reason. [audience laughs] 99:59:59.999 --> 99:59:59.999 That's not really a good starting point for saying what you do. 99:59:59.999 --> 99:59:59.999 So there's other mathematics than linear 99:59:59.999 --> 99:59:59.999 but linear is very powerful, particularly when variation 99:59:59.999 --> 99:59:59.999 is reasonably small. 99:59:59.999 --> 99:59:59.999 And that is the whole idea 99:59:59.999 --> 99:59:59.999 of Principal Component Analysis. 99:59:59.999 --> 99:59:59.999 Managing the machinery of it, I know people who spent 99:59:59.999 --> 99:59:59.999 their entire lives doing nothing but crunching those matrices. 99:59:59.999 --> 99:59:59.999 But there's technique. But you need the idea 99:59:59.999 --> 99:59:59.999 of how it's all working. Okay?
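The axis-rotation picture of Principal Component Analysis described above can be sketched in a few lines. This is one common way to do it (eigenvectors of the covariance matrix), not necessarily the algebra the speaker had in mind; it assumes NumPy, and the synthetic two-variable data and all variable names are made up for illustration.

```python
import numpy as np

# Synthetic 2D data: the second variable rises with the first, plus small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
data = np.column_stack([t, 0.8 * t + 0.1 * rng.normal(size=500)])

# Centre the cloud so that the new axes pass through its middle.
centered = data - data.mean(axis=0)

# Eigenvectors of the covariance matrix give the rotated axes;
# eigenvalues give the variance measured along each of them.
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # largest variance first
eigvals = eigvals[order]
axes = eigvecs[:, order]                 # columns: where the new axes point

# Coordinates of every data point measured along the new axes.
scores = centered @ axes

# Share of the total variance captured by each new axis.
explained = eigvals / eigvals.sum()

# Compression: keep only the first coordinate, treat the rest as error.
approx = np.outer(scores[:, 0], axes[:, 0])
residual = centered - approx

print(explained)
print(np.abs(residual).max())
```

The matrix `axes` is itself a list of vectors in the sense used earlier in the talk: each column says where one of the new axes points. The one-number summary only works this well because the synthetic data is close to linear.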
99:59:59.999 --> 99:59:59.999 So, let's try another thing. 99:59:59.999 --> 99:59:59.999 How many of you have done linear programming? 99:59:59.999 --> 99:59:59.999 Okay... what is the geometry in linear programming? 99:59:59.999 --> 99:59:59.999 Actually the first course I ever gave back when I was a graduate student. 99:59:59.999 --> 99:59:59.999 I was told, "Teach these economists from this book." 99:59:59.999 --> 99:59:59.999 And one of the things they were supposed to learn was linear programming. 99:59:59.999 --> 99:59:59.999 It was all matrices, and you pivot this, and you shquiggle that. 99:59:59.999 --> 99:59:59.999 Is that the kind of linear programming you had? 99:59:59.999 --> 99:59:59.999 Right? What the matrices do. 99:59:59.999 --> 99:59:59.999 Yeaa, but what's really going on is what the geometry is doing. 99:59:59.999 --> 99:59:59.999 So, first of all what does this mean? 99:59:59.999 --> 99:59:59.999 Suppose this was just 'X, Y, and Zed'. 99:59:59.999 --> 99:59:59.999 That if I put equal to 0; that's a plane. 99:59:59.999 --> 99:59:59.999 I'm positive on one side, and negative on the other. 99:59:59.999 --> 99:59:59.999 I say I got to be positive. 99:59:59.999 --> 99:59:59.999 Oh, how would I define a cube? 99:59:59.999 --> 99:59:59.999 I'd say I'd got something positive on this side of the cube, 99:59:59.999 --> 99:59:59.999 something positive on that side of the cube, 99:59:59.999 --> 99:59:59.999 something positive as I go down from the top, 99:59:59.999 --> 99:59:59.999 something that is positive as I go up from the bottom. 99:59:59.999 --> 99:59:59.999 With 6 inequalities I've got a cube. 99:59:59.999 --> 99:59:59.999 Same idea in N-dimensions. 99:59:59.999 --> 99:59:59.999 But 3 is plenty for thinking about this one. 99:59:59.999 --> 99:59:59.999 2d figures don't give the magic at all 99:59:59.999 --> 99:59:59.999 of what you need to do. 99:59:59.999 --> 99:59:59.999 But, the 3d problem does. 
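The cube built from six "stay positive" conditions can be written down directly. This is a minimal sketch, assuming NumPy and a unit cube with corners at (0,0,0) and (1,1,1); the names `A`, `c` and `inside` are hypothetical, chosen only for this illustration.

```python
import numpy as np

# Each row of A, with the matching entry of c, is one linear expression
# A[i] . p + c[i] that is zero on a plane and positive on one side of it.
A = np.array([[ 1.0,  0.0,  0.0],   # x >= 0
              [-1.0,  0.0,  0.0],   # 1 - x >= 0
              [ 0.0,  1.0,  0.0],   # y >= 0
              [ 0.0, -1.0,  0.0],   # 1 - y >= 0
              [ 0.0,  0.0,  1.0],   # z >= 0
              [ 0.0,  0.0, -1.0]])  # 1 - z >= 0
c = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])

def inside(point):
    """Return True when the point satisfies all six inequalities."""
    p = np.asarray(point, dtype=float)
    return bool(np.all(A @ p + c >= 0))

print(inside([0.5, 0.5, 0.5]))   # centre of the cube
print(inside([1.5, 0.5, 0.5]))   # crosses the plane x = 1
```

The same test works unchanged in n dimensions: add more columns to `A` and more rows for extra constraints, and the feasible region becomes a general polyhedron instead of a cube.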
99:59:59.999 --> 99:59:59.999 You see here is where one of those limiting planes is. 99:59:59.999 --> 99:59:59.999 Here's where another one meets. 99:59:59.999 --> 99:59:59.999 So they're doing all kinds of things like this. 99:59:59.999 --> 99:59:59.999 Inside this polyhedron you're satisfying all those constraints. 99:59:59.999 --> 99:59:59.999 Go outside, cross any one of those planes; you're not. 99:59:59.999 --> 99:59:59.999 And, the algebra problem is that the planes meet 99:59:59.999 --> 99:59:59.999 in a whole lot of places. 99:59:59.999 --> 99:59:59.999 It looks like a sort of weird, 3-dimensional, hedgehoggy thing. 99:59:59.999 --> 99:59:59.999 This plane, and this plane, they don't meet on the surface; 99:59:59.999 --> 99:59:59.999 but they meet somewhere here. 99:59:59.999 --> 99:59:59.999 So you want to make sure you're staying inside this region. 99:59:59.999 --> 99:59:59.999 So about 1950 comes the 'Simplex Method' 99:59:59.999 --> 99:59:59.999 That confused, because simplex is a word 99:59:59.999 --> 99:59:59.999 that mathematicians use differently.