How I built an information time machine

0:00 - 0:03

This is an image of the planet Earth.
0:03 - 0:06

It looks very much like the Apollo pictures
0:06 - 0:08

that are very well-known.
0:08 - 0:10

There is something different;
0:10 - 0:11

you can click on it,
0:11 - 0:13

and if you click on it,
0:13 - 0:16

you can zoom in on almost any place on the Earth.
0:16 - 0:18

For instance, this is a bird's-eye view
0:18 - 0:20

of the EPFL campus.
0:20 - 0:22

In many cases, you can also see
0:22 - 0:26

how a building looks from a nearby street.
0:26 - 0:28

This is pretty amazing.
0:28 - 0:31

But there's something missing in this wonderful tour:
0:31 - 0:33

It's time.
0:33 - 0:36

I'm not really sure when this picture was taken.
0:36 - 0:38

I'm not even sure it was taken
0:38 - 0:44

at the same moment as the bird's-eye view.
0:44 - 0:46

In my lab, we develop tools
0:46 - 0:48

to travel not only in space
0:48 - 0:50

but also through time.
0:50 - 0:53

The kind of question we're asking is:
0:53 - 0:53

Is it possible to build something
0:53 - 0:56

like Google Maps of the past?
0:56 - 0:59

Can I add a slider on top of Google Maps
0:59 - 1:01

and just change the year,
1:01 - 1:03

seeing how it was 100 years before,
1:03 - 1:04

a thousand years before?
1:04 - 1:06

Is that possible?
1:06 - 1:09

Can we reconstruct social networks of the past?
1:09 - 1:12

Can I make a Facebook of the Middle Ages?
1:12 - 1:16

So, can I build time machines?
1:16 - 1:18

You can just say, "No, it's not possible."
1:18 - 1:22

Or, maybe, we can think of it from an information point of view.
1:22 - 1:25

This is what I call the Information Mushroom.
1:25 - 1:27

Vertically, you have the time,
1:27 - 1:29

and horizontally, the amount of digital information available.
1:29 - 1:33

Obviously, in the last ten years, we have a lot of information.
1:33 - 1:37

And obviously, the further we go into the past, the less information we have.
1:37 - 1:39

If we want to build something like the Google Maps of the past,
1:39 - 1:40

or Facebook of the past,
1:40 - 1:42

we need to enlarge this space,
1:42 - 1:44

make it like a rectangle.
1:44 - 1:45

How do we do that?
1:45 - 1:47

One way is digitization.
1:47 - 1:49

There's a lot of material available.
1:49 - 1:55

Newspapers, printed books, thousands of printed books.
1:55 - 1:57

I can digitize all these.
1:57 - 2:00

I can extract information from these.
2:00 - 2:04

Of course, the further you go into the past,
the less information you will have.
2:04 - 2:06

So, it might not be enough.
2:06 - 2:09

So, I can do what historians do.
2:09 - 2:10

I can extrapolate.
2:10 - 2:15

This is what we call, in computer science, simulation.
2:15 - 2:16

If I take a log book,
2:16 - 2:18

I can consider, it's not just a log book
2:18 - 2:22

of a Venetian captain going on a particular journey.
2:22 - 2:23

I can consider it is actually a log book
2:23 - 2:26

which is representative of
many journeys of that period.
2:26 - 2:28

I'm extrapolating.
2:28 - 2:30

If I have a painting of a facade,
2:30 - 2:33

I can consider it's not just that particular building,
2:33 - 2:37

but probably it also shares the same grammar
2:37 - 2:40

as buildings for which we have lost all information.
2:42 - 2:44

So if we want to construct a time machine,
2:44 - 2:45

we need two things.
2:45 - 2:47

We need very large archives,
2:47 - 2:51

and we need excellent specialists.
2:51 - 2:52

The Venice Time Machine,
2:52 - 2:54

the project I'm going to talk to you about,
2:54 - 2:57

is a joint project between the EPFL
2:57 - 3:00

and the University of Venice Ca' Foscari.
3:00 - 3:02

Something very peculiar about Venice
3:02 - 3:05

is that its administration has been
3:05 - 3:07

very, very bureaucratic.
3:07 - 3:09

They've been keeping track of everything,
3:09 - 3:12

almost like Google today.
3:12 - 3:13

At the Archivio di Stato,
3:13 - 3:15

you have 80 kilometers of archives
3:15 - 3:17

documenting every aspect
3:17 - 3:19

of the life of Venice over
more than a thousand years.
3:19 - 3:21

You have every boat that goes out,
3:21 - 3:22

every boat that comes in.
3:22 - 3:25

You have every change that was made in the city.
3:25 - 3:29

This is all there.
3:29 - 3:32

We are setting up a 10-year digitization program
3:32 - 3:34

which has the objective of transforming
3:34 - 3:35

this immense archive
3:35 - 3:38

into a giant information system.
3:38 - 3:40

The type of objective we want to reach
3:40 - 3:45

is 450 books a day that can be digitized.
3:45 - 3:47

Of course, when you digitize, that's not enough,
3:47 - 3:48

because these documents,
3:48 - 3:51

most of them are in Latin, in Tuscan,
3:51 - 3:52

in Venetian dialect,
3:52 - 3:54

so you need to transcribe them,
3:54 - 3:56

to translate them in some cases,
3:56 - 3:57

to index them,
3:57 - 3:59

and this is obviously not easy.
3:59 - 4:03

In particular, traditional optical
character recognition methods
4:03 - 4:04

that can be used for printed documents
4:04 - 4:08

do not work well on handwritten documents.
4:08 - 4:10

So the solution is actually to take inspiration
4:10 - 4:13

from another domain: speech recognition.
4:13 - 4:15

This is a domain where something that seems impossible
4:15 - 4:18

could actually be done,
4:18 - 4:20

simply by putting additional constraints.
4:20 - 4:22

If you have a very good model
4:22 - 4:23

of a language which is used,
4:23 - 4:25

if you have a very good model of the documents,
4:25 - 4:27

of how they are structured.
4:27 - 4:28

And these are administrative documents.
4:28 - 4:30

They are well-structured in many cases.
4:30 - 4:33

If you divide this archive into smaller subsets
4:33 - 4:36

where each smaller subset
actually shares similar features,
4:36 - 4:40

then there's a chance of success.
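
Editor's illustration: the idea of adding a language constraint can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used by the project: the `rescore` function, the candidate readings, and every number below are hypothetical. It only shows how a prior over plausible words can overrule a recognizer's raw guess.

```python
def rescore(candidates, lexicon_prior, floor=1e-6):
    """Return the reading with the highest (visual score x language prior).

    candidates:    reading -> score from a handwriting recognizer (hypothetical)
    lexicon_prior: reading -> prior probability under a language model (hypothetical)
    floor:         small prior given to readings the language model has never seen
    """
    return max(
        candidates,
        key=lambda word: candidates[word] * lexicon_prior.get(word, floor),
    )


# Invented recognizer output for one handwritten word: the visually
# best-scoring reading is not a real word.
candidates = {"dncato": 0.40, "ducato": 0.35, "ducata": 0.25}

# Invented language prior: "ducato" is common in this kind of document.
lexicon_prior = {"ducato": 0.02, "ducata": 0.001}

best = rescore(candidates, lexicon_prior)
print(best)
```

With these invented numbers the language prior pushes the choice away from the recognizer's top guess and toward a word that actually exists in the lexicon.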
4:43 - 4:45

If we reach that stage, then there's something else:
4:45 - 4:49

we can extract events from these documents.
4:49 - 4:51

Actually, probably 10 billion events
4:51 - 4:53

can be extracted from this archive.
4:53 - 4:55

And this giant information system
4:55 - 4:56

can be searched in many ways.
4:56 - 4:58

You can ask questions like,
4:58 - 5:01

"Who lived in this palazzo in 1323?"
5:01 - 5:03

"How much did a sea bream cost at the Rialto market
5:03 - 5:05

in 1434?"
5:05 - 5:06

"What was the salary
5:06 - 5:08

of a glass maker in Murano
5:08 - 5:09

maybe over a decade?"
5:09 - 5:11

You can ask even bigger questions
5:11 - 5:14

because it will be semantically coded.
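
Editor's illustration: questions like these become simple filters once events are stored as structured records. The sketch below is a minimal Python example under stated assumptions. The `Event` fields, the `find` helper, and all the records are invented placeholders, not data or a schema from the archive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One extracted event (hypothetical schema)."""
    kind: str                    # e.g. "residence", "price", "salary"
    subject: str                 # who or what the event is about
    place: str
    year: int
    value: Optional[str] = None  # free-form payload, e.g. an amount


def find(events, kind, place, year_from, year_to=None):
    """Return events of one kind at one place within an inclusive year range."""
    year_to = year_from if year_to is None else year_to
    return [
        e for e in events
        if e.kind == kind and e.place == place and year_from <= e.year <= year_to
    ]


# Placeholder records, invented for illustration only.
events = [
    Event("residence", "Family A", "Palazzo X", 1323),
    Event("residence", "Family B", "Palazzo X", 1390),
    Event("price", "sea bream", "Rialto market", 1434, "placeholder price"),
]

# "Who lived in this palazzo in 1323?"
residents = [e.subject for e in find(events, "residence", "Palazzo X", 1323)]
print(residents)
```

The same helper covers a question over a period, such as a salary over a decade, by passing both `year_from` and `year_to`.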
5:14 - 5:16

And then what you can do is put that in space,
5:16 - 5:18

because much of this information is spatial.
5:18 - 5:20

And from that, you can do things like
5:20 - 5:22

reconstructing this extraordinary journey
5:22 - 5:25

of that city which managed to
have a sustainable development
5:25 - 5:27

over a thousand years,
5:27 - 5:29

managing at all times to maintain
5:29 - 5:32

a form of equilibrium with its environment.
5:32 - 5:33

You can reconstruct that journey,
5:33 - 5:36

visualize it in many different ways.
5:36 - 5:38

But of course, you cannot understand Venice
5:38 - 5:39

if you just look at the city.
5:39 - 5:41

You have to put it in a larger European context.
5:41 - 5:44

So the idea is also to document all the things
5:44 - 5:47

that worked at the European level.
5:47 - 5:48

We can reconstruct also the journey
5:48 - 5:50

of the Venetian maritime empire,
5:50 - 5:54

how it progressively controlled the Adriatic Sea,
5:54 - 5:57

how it became the most powerful medieval empire
5:57 - 5:59

of its time,
5:59 - 6:01

controlling most of the sea routes
6:01 - 6:04

from the east to the south.
6:05 - 6:08

But you can even do other things,
6:08 - 6:10

because in these maritime routes,
6:10 - 6:12

there are regular patterns.
6:12 - 6:14

You can go one step beyond
6:14 - 6:17

and actually create a simulation system,
6:17 - 6:19

create a Mediterranean simulator
6:19 - 6:22

which is capable actually of reconstructing
6:22 - 6:24

even the information we are missing,
6:24 - 6:27

which would enable you to ask questions
6:27 - 6:30

as if you were using a route planner.
6:30 - 6:33

"If I am in Corfu in June 1323
6:33 - 6:36

and want to go to Constantinople,
6:36 - 6:38

where can I take a boat?"
6:38 - 6:39

Probably we can answer this question
6:39 - 6:44

with one or two or three days' precision.
6:44 - 6:45

"How much will it cost?"
6:45 - 6:49

"What are the chances of encountering pirates?"
6:49 - 6:51

Of course, you understand,
6:51 - 6:53

the central scientific challenge
of a project like this one
6:53 - 6:57

is qualifying, quantifying, and representing
6:57 - 7:00

uncertainty and inconsistency
at each step of this process.
7:00 - 7:03

There are errors everywhere,
7:03 - 7:06

errors in the document, it's
the wrong name of the captain,
7:06 - 7:09

some of the boats never actually took to sea.
7:09 - 7:10

There are errors in translation,
7:10 - 7:14

interpretative biases,
7:14 - 7:17

and on top of that, if you add algorithmic processes,
7:17 - 7:20

you're going to have errors in recognition,
7:20 - 7:22

errors in extraction,
7:22 - 7:26

so you have very, very uncertain data.
7:26 - 7:30

So how can we detect and
correct these inconsistencies?
7:30 - 7:34

How can we represent that form of uncertainty?
7:34 - 7:36

It's difficult. One thing you can do
7:36 - 7:38

is document each step of the process,
7:38 - 7:41

not only coding the historical information
7:41 - 7:43

but what we call the meta-historical information,
7:43 - 7:46

how is historical knowledge constructed,
7:46 - 7:48

documenting each step.
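
Editor's illustration: one way to picture "documenting each step" is a claim that carries its own processing history. The Python sketch below is an assumption-laden toy, not the project's data model. The `Claim` and `Step` classes, the step names, and the confidence values are hypothetical, and multiplying confidences assumes the steps fail independently, which real pipelines may not satisfy.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Step:
    """One processing step applied to a source (hypothetical)."""
    name: str
    confidence: float  # estimated probability that this step introduced no error


@dataclass
class Claim:
    """A historical statement together with how it was derived."""
    statement: str
    source: str
    steps: list = field(default_factory=list)

    def record(self, name, confidence):
        """Append a step to the claim's history and return the claim."""
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.steps.append(Step(name, confidence))
        return self

    def overall_confidence(self):
        """Product of step confidences (assumes independent errors)."""
        return math.prod(step.confidence for step in self.steps)


# Invented example: a placeholder statement traced through two steps.
claim = Claim("placeholder statement", source="placeholder register")
claim.record("transcription", 0.9).record("event extraction", 0.8)
print(round(claim.overall_confidence(), 2))
```

Keeping the steps alongside the statement means two conflicting claims can coexist in the system, each with its own documented derivation.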
7:48 - 7:50

That will not guarantee that we actually converge
7:50 - 7:52

toward a single story of Venice,
7:52 - 7:54

but probably we can actually reconstruct
7:54 - 7:57

a fully documented potential story of Venice.
7:57 - 7:59

Maybe there's not a single map.
7:59 - 8:01

Maybe there are several maps.
8:01 - 8:03

The system should allow for that,
8:03 - 8:06

because we have to deal with
a new form of uncertainty,
8:06 - 8:10

which is really new for this type of giant database.
8:12 - 8:13

And how should we communicate
8:13 - 8:17

this new research to a large audience?
8:17 - 8:19

Again, Venice is extraordinary for that.
8:19 - 8:22

With the millions of visitors that come every year,
8:22 - 8:23

it's actually one of the best places
8:23 - 8:26

to try to invent the museum of the future.
8:26 - 8:30

Imagine, horizontally you see the reconstructed map
8:30 - 8:31

of a given year,
8:31 - 8:34

and vertically, you see the document
8:34 - 8:35

that served for the reconstruction,
8:35 - 8:39

paintings, for instance.
8:39 - 8:41

Imagine an immersive system that lets you
8:41 - 8:45

to go and dive and reconstruct
the Venice of a given year.
8:45 - 8:48

An experience you could share within a group.
8:48 - 8:50

Conversely, imagine actually that you start
8:50 - 8:52

from a document, a Venetian manuscript,
8:52 - 8:55

and you show, actually, what
you can construct out of it,
8:55 - 8:57

how it is decoded,
8:57 - 9:00

how the context of that document can be recreated.
9:00 - 9:01

This is an image from an exhibit
9:01 - 9:03

which is currently being held in Geneva
9:03 - 9:06

with that type of system.
9:06 - 9:08

So to conclude, we can say that
9:08 - 9:11

research in the humanities is about to undergo
9:11 - 9:13

an evolution which is maybe similar
9:13 - 9:17

to what happened to the life sciences 30 years ago.
9:17 - 9:20

It's really
9:20 - 9:22

a question of scale.
9:22 - 9:25

We see projects which are
9:25 - 9:29

far beyond what any single research team can do,
9:29 - 9:32

and this is really new for the humanities,
9:32 - 9:35

which have very often been in the habit of working
9:35 - 9:40

in small groups or only with a couple of researchers.
9:40 - 9:41

When you visit the Archivio di Stato,
9:41 - 9:44

you feel this is beyond what any single team can do,
9:44 - 9:45

and that it should be a joint
9:45 - 9:48

and common effort.
9:48 - 9:51

So what we must do for this paradigm shift
9:51 - 9:53

is actually foster a new generation
9:53 - 9:54

of "digital humanists"
9:54 - 9:57

that are going to be ready for this shift.
9:57 - 9:59

I thank you very much.
9:59 - 10:03

(Applause)

Title:: How I built an information time machine
Speaker:: Frederic Kaplan
Video Language:: English
Team:: closed TED
Project:: TEDTalks
Duration:: 10:20
