All right, thanks Chris, and thanks for having me here. Can everybody hear me OK? Today I'm going to talk about a type of data we're all very familiar with, and that I think most of us like intrinsically: geographic data, and particularly imagery of the Earth. We've already seen some examples of that today.

I'm going to start with a little show and tell. I had to bring a prop; I couldn't resist. My old Macintosh PowerBook 145 from 1992. This is the first computer I had with a hard drive. It came with 4... was it 4? Actually, about 6 megabytes of memory. That was a big deal, and I was just blown away. I couldn't believe I had that much memory. I'm just going to put this back here. I couldn't believe I had that much memory at my fingertips. Today, I can go down to Staples and buy a computer that has a quarter million times more memory than that for about $400 or $600. Times have changed, and that's 20 years ago.

As a result of all this increased computing power, we are drowning in data. We're just absolutely drowning in data. And one of the types of data we are most drowning in is remote sensing, or imagery, data: satellite data, aerial data, things like that. We've all played around with this, I'm sure. We all love Google Earth; it's free and it's fun, and it's just teeming with imagery. So what do we do with all this stuff? How do we make use of it?

Here's an image of Baltimore, urban Baltimore. It's got all these great objects. I can look in there and see, it's hard with this projector, but I can see trees and buildings, things like that. Let's say I wanted to actually do some kind of quantitative study with that, something that required knowing where the trees really were. I can see where the trees are, but the computer doesn't know; it has no clue what a tree is. These are actually the locations of crimes; say I wanted to know if the density of trees affects crime. There's no way I can do that with imagery the way we have it now, in a computing environment.

Part of the reason for this is that computers aren't really good at recognizing things the way that we can. We are excellent at recognizing things with very slight differences. I can tell you within two seconds that that's George Carlin and that's Sigmund Freud, that that is the Big Lebowski and that is Eddie Vedder. For me to train a computer to recognize the difference between the Big Lebowski, a.k.a. the Dude, and Eddie Vedder would take unbelievable amounts of time. Yet I can do it instantly, so that's the issue right there.

So let's cut to the chase. On the left I have raw data: color-infrared remote sensing imagery. On the right I have a classified GIS layer. That is usable information. The computer knows what's grass, what's buildings, and what's trees. How do I get from one to the other? This is a major, major conundrum in today's world of high-resolution data.

Here's an image of a typical suburban area. I look at it and I see all sorts of features, and I see that it is at a very fine resolution. If I had been working with remote sensing data 15 years ago, I'd have had coarse-resolution imagery. This is the exact same location using 30-meter pixels. Back then, classifying this stuff was a qualitatively different thing, because all I really needed to do was get in the general ballpark. These pixels here are sort of generally urbanized; those pixels there are generally forested. I didn't really have to know the specific identity of objects.
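As a rough illustration of what that old per-pixel world amounted to, here is a minimal sketch of a pixel-by-pixel classifier. This is not the speaker's actual workflow: it assumes a 4-band (blue, green, red, near-infrared) image held in a NumPy array, and the band order and thresholds are invented for demonstration.

```python
import numpy as np

def classify_per_pixel(img):
    """Toy per-pixel classifier for a 4-band (blue, green, red, NIR) image.

    img: float array of shape (rows, cols, 4), reflectance values in [0, 1].
    Returns a class map: 0 = other, 1 = vegetation, 2 = water.
    All thresholds are illustrative, not calibrated.
    """
    red, nir = img[..., 2], img[..., 3]

    # NDVI: healthy vegetation reflects strongly in the near-infrared band.
    ndvi = (nir - red) / (nir + red + 1e-9)

    classes = np.zeros(img.shape[:2], dtype=np.uint8)
    classes[ndvi > 0.3] = 1                   # likely vegetation
    classes[(ndvi < 0.0) & (nir < 0.1)] = 2   # likely water

    # Every pixel is labeled in isolation: no neighbors, no objects.
    return classes
```

That isolation is the point: each pixel is judged on its spectral values alone, which works in the "general ballpark" at 30 meters but, as we'll see, falls apart at sub-meter resolution.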
Now fast-forward to today, and I've got imagery that I can zoom in on and see a million different types of objects: from cars in a parking lot, to shipping containers, to the cranes at the container facility, to the mechanicals on top of a building, to, I love this one, the Sphinx at the Luxor Hotel in Las Vegas. Try telling a computer what that is. Here's another Vegas one. I love Google Earth in Vegas; it's the best, it's so much fun. This is a tropical-fish-shaped pool. Again, not so easy to tell a computer what that is. And this is the best of all: I believe this is high-resolution satellite imagery of camels in the middle of Africa, part of the Google and National Geographic Africa mega-flyover project. The number of possible things I have to prepare computers for, the types of objects out on the surface of the Earth, is staggering. And it's a project we'll never see finished, the project of giving computers the artificial intelligence they need to recognize all of this variation on the Earth. Here's another, even better one. This is actually a real thing: this is the Colonel. Someone actually did mega art of the Colonel in the middle of Nevada. It's interesting how many of these come from Nevada. (Laughter)

So let me explain why it's difficult to use the methods we've always used in the past for this generation of high-resolution imagery. Here's a high-resolution image of Burlington. If I try to classify each pixel, pixel by pixel, I get this awful, pixelated, meaningless gobbledygook. If I look at a particular object like a house, that house is made up, it's hard to see, but it's made up of dozens of different pixel values that don't really mean anything. A single tree, again, is made up of a gobbledygook of pixels. If I zoom in on that tree, I see that it's made up of lots of different tones, different colors, and if this is the direct representation in classified pixels, it's meaningless, right? This is not finding objects.

So I need to teach a computer to see objects and to think like me. Teaching a computer to think like a human means working based on shape, size, tone, pattern, texture, site, and association; a lot of this is spatial. We have to stop thinking pixel by pixel and start thinking spatially. That means taking an image and segmenting it, turning it into objects. The process of segmenting is very difficult; you have to train a computer to segment imagery correctly. If I look, here is a house: there's one side of the roof and another side of the roof, there is the driveway. They're segmented as different objects, and I can then re-aggregate those objects into something that is just a house and another that is just a driveway. At the end of the day, I'm going to end up with something like this. I will be able to tell you the difference: even though the spectral signature of this roof and this road is the same, I know that their compactness is different, and because of those shape metrics I can tell you which one's a roof and which one's a road, and I can start classifying things that way. To do this requires huge rule sets.

The rule sets can run from dozens of pages to over a hundred pages of classification rules, and I won't bore you with the details. I also make use of a lot of ancillary data. There's all sorts of great GIS data that helps me classify things now. Most cities are collecting building footprints; we know where parcels are, where sewer lines and roads are, things like that. We can use this to help us, but the most important form of ancillary data out there today is called LIDAR: Light Detection and Ranging. LIDAR has been used in engineering for a while, and it allows us to essentially create models of the surface. This is Columbus Circle in Central Park in New York City, and this is a surface model of the elevation of the trees. The LIDAR tells me where the canopy of the trees is, where the tops of the buildings are, and where the ground surface is too. I can create these incredibly detailed models of the world, so now I'm not just working with spectral information, reflectance information; I'm also working with height information. I know the heights of things, so I can see two objects that are both green and woody and tell that one of them is a shrub and one of them is a tree. And this is just zooming into that stuff there.
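To give a flavor of what one small piece of such a rule set might look like, here is a minimal sketch; it is not the Spatial Analysis Lab's actual rules. It assumes the segmentation step has already produced an integer label array of image objects, along with co-registered NDVI (greenness) and LIDAR-derived height-above-ground (nDSM) rasters, and every threshold is invented for illustration.

```python
import numpy as np

def segment_stats(segments, ndvi, ndsm):
    """Per-segment mean NDVI, mean height, and a crude compactness score.

    segments: integer label array, one ID per image object (assumes the
              segmentation step has already been run)
    ndvi:     per-pixel greenness, same shape as segments
    ndsm:     per-pixel LIDAR height above ground, same shape as segments
    """
    stats = {}
    for sid in np.unique(segments):
        mask = segments == sid
        area = mask.sum()
        # Boundary pixels: object pixels with at least one 4-neighbor outside.
        padded = np.pad(mask, 1, constant_values=False)
        interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                    padded[1:-1, :-2] & padded[1:-1, 2:])
        perimeter = (mask & ~interior).sum()
        # Isoperimetric quotient: ~1 for a disk, near 0 for long thin shapes.
        compactness = 4.0 * np.pi * area / max(perimeter, 1) ** 2
        stats[sid] = {"ndvi": ndvi[mask].mean(),
                      "height": ndsm[mask].mean(),
                      "compactness": compactness}
    return stats

def classify_object(s):
    """Toy rule set in the spirit of the talk; thresholds are invented."""
    if s["ndvi"] > 0.3:                       # green and woody...
        return "tree" if s["height"] > 2.0 else "shrub or grass"
    # A roof and a road can share a spectral signature; shape and height
    # separate them: a roof is compact and elevated, a road low and stringy.
    if s["height"] > 2.5 and s["compactness"] > 0.3:
        return "roof"
    if s["compactness"] < 0.3:
        return "road"
    return "other impervious"

# Usage: given a label image `seg` and co-registered ndvi/ndsm rasters,
# labels = {sid: classify_object(s)
#           for sid, s in segment_stats(seg, ndvi, ndsm).items()}
```

The point of the toy rule is the same one made above: spectrally identical objects can still be told apart by their shape metrics and their height.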
Now, the problem is that this is incredibly data-intensive, and nobody had figured out how to scale it until recently, and I mean like maybe two years ago. People were doing this on a tile-by-tile basis, working on one little tile of data at a time; just that red outline you see right there might be half a gigabyte or something like that. So we've worked on turning this into an enterprise environment, that's what we had to do, so that we can start looking at thousands of tiles of data at a time, and we've successfully done that. My lab, the Spatial Analysis Lab, which is the lab I run, has been doing this work for a number of years. Through 64 projects we've mapped 837 communities covering 28 million people: almost 9,000 square miles of data mapped, 250 billion pixels of land cover products generated, and 110 terabytes of data. So this is a major undertaking, but it's only the beginning.

Going back to the crime data and those trees I was telling you about: here's Baltimore again. Using this method, we turn data into information. We get trees; we now know where the trees are. I overlay that with the crime data and I end up with information; I can now do a study, and we just submitted this for publication. We found that, in fact, there's a strong negative correlation between trees and crime, even when we adjust for about fifty other things. We couldn't have done that without this sort of information.

So with that, I will say thanks to the people from the Spatial Analysis Lab, and particularly Jarlath O'Neil-Dunne, who helped me put this together and has been doing this research for a long time. And thanks to you. Thank you. (Applause)