WEBVTT

00:00:14.641 --> 00:00:16.809
All right, thanks Chris and thanks for having me here.

00:00:16.809 --> 00:00:19.441
Can everybody hear me OK?

00:00:21.665 --> 00:00:24.825
So, today I'm going to talk

00:00:24.825 --> 00:00:27.094
about a type of data we're all very familiar with

00:00:27.094 --> 00:00:30.861
and I think most of us like, intrinsically.

00:00:30.861 --> 00:00:34.878
And that is geographic data, and particularly imagery of the Earth.

00:00:34.878 --> 00:00:38.131
And we've already seen some examples of that today.

00:00:38.131 --> 00:00:41.839
I'm going to start with a little show and tell here.

00:00:41.845 --> 00:00:44.305
I had to bring a prop, I couldn't resist:

00:00:44.305 --> 00:00:51.144
my old Macintosh PowerBook 145 from 1992.

00:00:51.144 --> 00:00:56.052
This is the first computer I had with a hard drive.

00:00:56.052 --> 00:00:58.747
It came with 4, was it 4?

00:00:58.747 --> 00:01:03.534
Actually, with about 6 megabytes of memory.

00:01:03.534 --> 00:01:07.434
That was a big deal and I was just blown away.

00:01:07.434 --> 00:01:09.376
I couldn't believe I had that much memory.

00:01:09.376 --> 00:01:11.240
I'm just going to put this back here.

00:01:11.240 --> 00:01:14.610
I couldn't believe I had that much memory at my fingertips.

00:01:14.610 --> 00:01:16.953
Today, we all have computers.

00:01:16.979 --> 00:01:19.178
I can go down to Staples and buy a computer

00:01:19.178 --> 00:01:22.143
that has a quarter million times more memory than that

00:01:22.143 --> 00:01:25.149
for about $400 or $600 or something like that.

00:01:25.149 --> 00:01:27.930
Times have changed, and that was 20 years ago.

00:01:27.930 --> 00:01:31.726
As a result, with all this increased computing power,

00:01:31.726 --> 00:01:34.393
we are drowning in data.

00:01:34.393 --> 00:01:36.236
We're just absolutely drowning in data.

00:01:36.236 --> 00:01:38.775
And one of the types of data we are most drowning in

00:01:38.775 --> 00:01:41.931
is remote sensing or imagery data, satellite data,

00:01:41.931 --> 00:01:44.127
aerial data, things like that.

00:01:44.127 --> 00:01:46.345
And we've all played around with this, I'm sure.

00:01:46.345 --> 00:01:49.015
We all love Google Earth, it's free and it's fun.

00:01:49.015 --> 00:01:51.728
And it's just teeming with imagery.

00:01:51.728 --> 00:01:54.276
So, what do we do with all this stuff?

00:01:54.276 --> 00:01:56.065
How do we make use of this?

00:01:56.065 --> 00:01:59.014
Here's an image of Baltimore. This is urban Baltimore.

00:01:59.061 --> 00:02:00.856
It's got all these great objects.

00:02:00.856 --> 00:02:02.675
I can look in there and see,

00:02:02.675 --> 00:02:04.140
it's hard with this projector,

00:02:04.163 --> 00:02:06.525
but I can see trees and buildings, things like that.

00:02:06.549 --> 00:02:07.890
And let's just say

00:02:07.916 --> 00:02:12.144
I wanted to actually do some kind of a quantitative study with that.

00:02:12.144 --> 00:02:14.262
Say I had to do something that required knowing

00:02:14.262 --> 00:02:15.580
where the trees really were.

00:02:15.580 --> 00:02:17.200
I can see where the trees are,

00:02:17.200 --> 00:02:20.392
but the computer doesn't know; it has no clue what a tree is.

00:02:20.392 --> 00:02:23.021
Let's just say I wanted to do something like,

00:02:23.021 --> 00:02:25.114
these are actually the locations of crimes,

00:02:25.114 --> 00:02:29.114
say I wanted to know if the density of trees affects crime.

00:02:29.114 --> 00:02:32.291
There's no way I can do that with imagery the way we have it now,

00:02:32.291 --> 00:02:35.894
in a computing environment.

00:02:35.894 --> 00:02:38.645
Part of the reason for this

00:02:38.645 --> 00:02:41.395
is that computers aren't really good at recognizing things

00:02:41.395 --> 00:02:43.348
the way that we can recognize things.

00:02:43.348 --> 00:02:47.911
We are excellent at recognizing things with very slight differences.

00:02:47.911 --> 00:02:50.572
I can tell you within two seconds

00:02:50.572 --> 00:02:53.359
that that's George Carlin and that's Sigmund Freud.

00:02:53.359 --> 00:02:56.148
That that's the Big Lebowski and that's Eddie Vedder.

00:02:56.148 --> 00:02:59.896
For me to train a computer to recognize the difference

00:02:59.896 --> 00:03:03.644
between the Big Lebowski, a.k.a. the Dude, and Eddie Vedder,

00:03:03.644 --> 00:03:07.394
would take me unbelievable amounts of time to do.

00:03:07.394 --> 00:03:11.648
Yet I can do that instantly, so that's an issue right here.

00:03:11.648 --> 00:03:14.866
So, let's cut to the chase here.

00:03:14.866 --> 00:03:17.798
On the left I have raw data,

00:03:17.824 --> 00:03:20.510
color-infrared remotely sensed imagery.

00:03:20.510 --> 00:03:26.557
On the right I have a classified GIS layer.

00:03:26.573 --> 00:03:29.644
That is usable information.

00:03:29.644 --> 00:03:33.944
The computer knows what's grass, what's buildings, and what's trees.

00:03:33.944 --> 00:03:36.976
How do I get from one to the other?

00:03:36.976 --> 00:03:41.223
This is a major, major conundrum in today's world of high-resolution data.

00:03:41.223 --> 00:03:44.192
Here's an image of just a typical suburban area.

00:03:44.192 --> 00:03:46.144
I look at it and I see all sorts of features

00:03:46.160 --> 00:03:48.325
and I see that it is at a very fine resolution.

00:03:48.325 --> 00:03:52.063
If I had been working with remote sensing data 15 years ago,

00:03:52.063 --> 00:03:54.785
I'd have had coarse-resolution imagery.

00:03:54.811 --> 00:03:57.224
This is the exact same location using 30-meter pixels.

00:03:57.224 --> 00:04:01.980
Back then, classifying this stuff was a qualitatively different thing,

00:04:01.980 --> 00:04:05.509
because all I really needed to do was get in the general ballpark.

00:04:05.509 --> 00:04:08.315
These pixels here are sort of generally urbanized,

00:04:08.315 --> 00:04:11.121
these pixels here are generally forested.

00:04:11.121 --> 00:04:14.707
I didn't really have to know about the specific identity of objects.

00:04:14.707 --> 00:04:18.587
Now fast-forward to today, and I've got imagery that I can zoom in on

00:04:18.587 --> 00:04:23.472
and I can see a million different types of objects:

00:04:23.472 --> 00:04:29.917
from cars in a parking lot to shipping containers,

00:04:29.917 --> 00:04:33.818
to the cranes at the shipping container facility,

00:04:33.858 --> 00:04:37.556
to the mechanicals on top of a building,

00:04:37.937 --> 00:04:39.270
to, I love this one,

00:04:39.270 --> 00:04:42.908
this is the Sphinx at the Luxor Hotel in Las Vegas.

00:04:42.908 --> 00:04:45.088
Try telling a computer what that is.

00:04:45.088 --> 00:04:48.504
Here's another Vegas one, I love Google Earth in Vegas,

00:04:48.504 --> 00:04:50.024
it's the best, it's so much fun.

00:04:50.024 --> 00:04:52.746
This is a tropical fish-shaped pool.

00:04:52.793 --> 00:04:56.218
Again, not so easy to tell a computer what that is.

00:04:56.250 --> 00:04:57.710
And this is the best of all.
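
NOTE
To make the "raw data to classified layer" step concrete, here is a minimal
per-pixel classification sketch in Python. The color-infrared band order
(NIR, red, green), the NDVI thresholds, and the function name are
illustrative assumptions, not the speaker's actual rules; it only shows the
kind of pixel-by-pixel labeling that was workable at 30-meter resolution.
import numpy as np
def classify_per_pixel(cir: np.ndarray) -> np.ndarray:
    """cir: float array, shape (rows, cols, 3), bands ordered NIR, red, green."""
    nir, red = cir[..., 0], cir[..., 1]
    ndvi = (nir - red) / (nir + red + 1e-9)         # vegetation index per pixel
    classes = np.zeros(ndvi.shape, dtype=np.uint8)  # 0 = other / impervious
    classes[ndvi > 0.2] = 1                         # 1 = generally vegetated
    classes[ndvi > 0.5] = 2                         # 2 = densely vegetated / forested
    return classes
# At 30 m per pixel, ballpark labels like "generally forested" were enough;
# at sub-meter resolution this same approach produces the pixelated
# gobbledygook described later in the talk.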

00:04:57.757 --> 00:05:03.103
I believe this is high-resolution satellite imagery

00:05:03.103 --> 00:05:07.496
of camels in the middle of Africa, and it's part of this Google,

00:05:07.496 --> 00:05:11.889
National Geographic Africa Megaflyover project.

00:05:11.889 --> 00:05:14.056
The number of possible things

00:05:14.082 --> 00:05:16.282
I have to prepare computers to be ready for,

00:05:16.282 --> 00:05:18.687
the types of objects out on the surface of the Earth,

00:05:18.713 --> 00:05:20.345
is staggering.

00:05:20.745 --> 00:05:25.208
And it's a project that we'll never see finished,

00:05:25.208 --> 00:05:29.381
that project of teaching computers,

00:05:29.381 --> 00:05:31.663
giving them the artificial intelligence they need

00:05:31.687 --> 00:05:33.885
to recognize all of this variation on the Earth.

00:05:33.909 --> 00:05:35.353
Here's another, even better one.

00:05:35.369 --> 00:05:38.114
This is actually a real thing, this is the Colonel.

00:05:38.114 --> 00:05:42.733
Someone actually did mega art of the Colonel in the middle of Nevada.

00:05:43.233 --> 00:05:46.371
It's interesting how many of these come from Nevada.

00:05:46.400 --> 00:05:48.828
(Laughter)

00:05:49.471 --> 00:05:51.618
So, let's explain why it's difficult

00:05:51.618 --> 00:05:55.125
to use the methods we've always used in the past

00:05:55.125 --> 00:05:58.302
for this generation of high-resolution imagery.

00:05:58.302 --> 00:06:01.337
Here's a high-resolution image of Burlington.

00:06:01.337 --> 00:06:06.232
If I try to classify each pixel, pixel by pixel,

00:06:06.232 --> 00:06:10.710
I get this awful, pixelated, meaningless gobbledygook.

00:06:10.710 --> 00:06:14.370
If I look at a particular object like a house, that house is made up,

00:06:14.370 --> 00:06:17.410
it's hard to see, but of dozens of different pixel values

00:06:17.410 --> 00:06:19.005
that don't really mean anything.

00:06:19.029 --> 00:06:23.784
If I take a single tree, again, it's made up of a gobbledygook of pixels.

00:06:23.784 --> 00:06:26.802
Now, if I zoom in on that tree, for instance,

00:06:26.802 --> 00:06:33.312
I will see that it's made up of pixels, lots of different tones, different colors,

00:06:33.312 --> 00:06:37.837
and if this is the direct representation in classified pixels,

00:06:37.837 --> 00:06:39.332
it's meaningless, right?

00:06:39.332 --> 00:06:40.949
This is not finding objects.

00:06:40.949 --> 00:06:45.969
So, I need to teach a computer to see objects and to think like me.

00:06:46.017 --> 00:06:48.795
So this means teaching a computer to think like a human,

00:06:48.795 --> 00:06:52.148
which means working based on shape, size, tone, pattern,

00:06:52.148 --> 00:06:55.310
texture, site, and association; a lot of this is spatial.

00:06:55.310 --> 00:07:00.450
We have to stop thinking pixel by pixel and start thinking of things spatially.

00:07:00.450 --> 00:07:04.265
That means taking an image and, what's called, segmenting it,

00:07:04.265 --> 00:07:05.757
turning it into objects.

00:07:05.788 --> 00:07:08.694
And the process of segmenting it is very difficult.

00:07:08.694 --> 00:07:12.763
You have to train a computer to segment imagery correctly.
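
NOTE
The segmentation step the speaker describes can be sketched with an
off-the-shelf algorithm. scikit-image's SLIC superpixels stand in here for
the much more elaborate segmentation software such labs actually use, and
the file name is hypothetical.
from skimage import io
from skimage.measure import regionprops
from skimage.segmentation import slic
image = io.imread("burlington_tile.tif")            # hypothetical image tile
segments = slic(image, n_segments=2000, compactness=10, start_label=1)
# Each segment is now a candidate "object"; summarize it by its mean tone
# instead of treating every pixel independently.
mean_tone = {r.label: r.mean_intensity
             for r in regionprops(segments, intensity_image=image[..., 0])}
# Downstream rules then reason about object shape, size, texture, and context.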

00:07:12.763 --> 00:07:15.496
And if I look, here is a house, there's one side of the roof

00:07:15.496 --> 00:07:17.911
and another side of the roof, there is the driveway;

00:07:17.939 --> 00:07:19.727
they're segmented as different objects,

00:07:19.753 --> 00:07:21.702
and I can then re-aggregate those objects

00:07:21.702 --> 00:07:25.181
into something that is just a house and another one that is just a driveway.

00:07:25.181 --> 00:07:27.761
And at the end of the day, what I'm going to end up with

00:07:27.787 --> 00:07:29.107
is something like this.

00:07:29.107 --> 00:07:31.829
I will be able to tell you the difference.

00:07:31.845 --> 00:07:37.503
Even though the spectral signature of this roof and this road is the same,

00:07:37.598 --> 00:07:40.845
I know that their compactness factor is different,

00:07:40.845 --> 00:07:43.568
and because of that, because of their shape metrics,

00:07:43.568 --> 00:07:46.291
I can tell you which one's a roof and which one's a road,

00:07:46.291 --> 00:07:49.015
and I can start classifying things in that way.

00:07:52.845 --> 00:07:55.646
To do this requires huge rule sets.

00:07:55.646 --> 00:08:01.125
The rule sets can be dozens and dozens, up to over a hundred pages long,

00:08:01.125 --> 00:08:06.186
of all these classification rules, and I won't bore you with the details.

00:08:06.186 --> 00:08:09.687
I will also make use of a lot of ancillary data.

00:08:09.687 --> 00:08:13.136
There's all sorts of great GIS data that helps me classify things now.

00:08:13.136 --> 00:08:16.174
Most cities are collecting things like building footprints;

00:08:16.174 --> 00:08:18.721
we know where parcels are, we know where sewer lines are

00:08:18.737 --> 00:08:20.371
and roads are and things like that.

00:08:20.398 --> 00:08:21.806
We can use this to help us,

00:08:21.806 --> 00:08:23.823
but the most important form of ancillary data

00:08:23.849 --> 00:08:27.767
that's out there today is called LIDAR: Light Detection and Ranging.

00:08:27.767 --> 00:08:30.044
And LIDAR has been used in engineering for a while,

00:08:30.060 --> 00:08:32.808
and it allows us to essentially create models of the surface.

00:08:32.808 --> 00:08:35.566
This is Columbus Circle in Central Park in New York City,

00:08:35.566 --> 00:08:38.787
and this is a surface elevation model of the trees.

00:08:38.787 --> 00:08:42.107
The LIDAR tells me where the canopy of the trees is,

00:08:42.107 --> 00:08:47.580
where the tops of the buildings are, where the ground surface is, too.

00:08:47.580 --> 00:08:50.702
And I can create these incredibly detailed models of the world,

00:08:50.702 --> 00:08:52.195
so now I'm not just working

00:08:52.195 --> 00:08:55.539
with spatial and spectral information, reflectance information;

00:08:55.539 --> 00:08:59.011
I'm also working with height information. I know the heights of things,

00:08:59.011 --> 00:09:02.349
so I can see two objects that are green and woody,

00:09:02.349 --> 00:09:05.687
but I can tell that one of them is a shrub and one of them is a tree.

00:09:05.687 --> 00:09:09.027
And this is just zooming in on that stuff there.
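
NOTE
Two object-level rules of the kind just described, sketched in Python with
illustrative thresholds (not the lab's actual rule set). Here `region` is a
scikit-image regionprops object for one segmented object, and `ndsm` is a
LIDAR-derived normalized surface model (surface elevation minus ground
elevation, i.e. height above ground).
import math
def roof_or_road(region) -> str:
    # Same spectral signature, different shape: roofs are compact blobs,
    # roads are long, thin features, so a compactness ratio separates them.
    compactness = 4.0 * math.pi * region.area / (region.perimeter ** 2 + 1e-9)
    return "roof" if compactness > 0.3 else "road"
def tree_or_shrub(region, ndsm) -> str:
    # Both are green and woody; LIDAR height above ground tells them apart.
    rows, cols = region.coords[:, 0], region.coords[:, 1]
    return "tree" if float(ndsm[rows, cols].mean()) > 2.0 else "shrub"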

00:09:09.027 --> 00:09:12.042
Now the problem is, this is incredibly data-intensive,

00:09:12.052 --> 00:09:17.757
and nobody had figured this out until recently, and I mean like maybe two years ago:

00:09:17.757 --> 00:09:20.862
people were doing this on a tile-by-tile basis,

00:09:20.862 --> 00:09:22.840
working on one little tile of data at a time

00:09:22.866 --> 00:09:24.672
that might be, you know,

00:09:24.672 --> 00:09:28.552
just that red outline that you see right there,

00:09:28.576 --> 00:09:31.250
and that might be half a gigabyte or something like that.

00:09:31.250 --> 00:09:35.148
So we've worked on turning this into an enterprise environment;

00:09:35.148 --> 00:09:38.397
that's what we have to do, make an enterprise environment out of this

00:09:38.397 --> 00:09:41.322
so we can start looking at thousands of tiles of data at a time,

00:09:41.322 --> 00:09:42.860
and we've successfully done that.

00:09:44.316 --> 00:09:48.088
My lab is the Spatial Analysis Lab.

00:09:48.114 --> 00:09:51.860
It's the lab I run,

00:09:51.860 --> 00:09:54.921
and they've been doing this stuff for a number of years,

00:09:54.921 --> 00:09:59.142
and they've collected, through 64 projects, 837 communities,

00:09:59.142 --> 00:10:03.895
covering 28 million people, almost 9,000 square miles of data mapped,

00:10:03.895 --> 00:10:06.977
250 billion pixels of land cover products generated,

00:10:06.977 --> 00:10:09.024
and 110 terabytes of data.

00:10:09.064 --> 00:10:12.346
So this is a major undertaking, but it's only the beginning.

00:10:12.386 --> 00:10:15.578
Going back to the crime data and those trees I was telling you about,

00:10:15.578 --> 00:10:17.141
here's Baltimore again.

00:10:17.181 --> 00:10:20.395
Using this method, we turn data into information.

00:10:20.434 --> 00:10:22.640
We get trees; we now know where trees are.

00:10:22.695 --> 00:10:24.274
I overlay it with the crime data.

00:10:24.456 --> 00:10:28.479
I end up with information, I can now do a study,

00:10:28.532 --> 00:10:30.683
and we just submitted this for publication.

00:10:30.707 --> 00:10:34.008
We just found out that, in fact, there's a strong negative correlation

00:10:34.048 --> 00:10:35.437
between trees and crime,

00:10:35.460 --> 00:10:37.618
even when we adjust for about fifty other things.

00:10:37.658 --> 00:10:42.844
We couldn't have done that without this sort of information.

00:10:42.852 --> 00:10:45.352
So with that, I will say thanks to the people

00:10:45.391 --> 00:10:46.986
from the Spatial Analysis Lab,

00:10:47.031 --> 00:10:50.221
and particularly Jarlath O'Neil-Dunne, who helped me put this together

00:10:50.222 --> 00:10:53.245
and has been doing this research for a long time, and thanks to you.

00:10:53.250 --> 00:10:56.312
Thank you. (Applause)
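
NOTE
A sketch of the tree/crime overlay and adjusted correlation described above.
File names, column names, and covariates are hypothetical, and geopandas
and statsmodels simply stand in for whatever tools the submitted study
used; it assumes all layers share a projected CRS so areas are meaningful.
import geopandas as gpd
import statsmodels.formula.api as smf
blocks = gpd.read_file("baltimore_blocks.shp")   # analysis units with covariates
canopy = gpd.read_file("tree_canopy.shp")        # land-cover product from the imagery
crimes = gpd.read_file("crime_points.shp")       # crime incident locations
# Percent tree canopy per block: intersect canopy with blocks, sum the area.
pieces = gpd.overlay(blocks[["block_id", "geometry"]], canopy, how="intersection")
canopy_area = pieces.dissolve(by="block_id").area
blocks["canopy_pct"] = blocks["block_id"].map(canopy_area).fillna(0) / blocks.area * 100
# Crime count per block: spatial join of incident points to the blocks.
joined = gpd.sjoin(crimes, blocks, predicate="within")
blocks["crime_count"] = blocks["block_id"].map(joined.groupby("block_id").size()).fillna(0)
# Regress crime on canopy while adjusting for other factors
# (the actual study adjusted for about fifty of them).
model = smf.ols("crime_count ~ canopy_pct + population + median_income", data=blocks).fit()
print(model.params["canopy_pct"])                # negative, per the talk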