A Brief History of Logical Time - John Daily - Midwest.io 2015

0:11 - 0:12

Thank you all for coming
0:12 - 0:14

and before I forget
0:14 - 0:15

thank you very much to Midwest
0:15 - 0:16

for having me back
0:16 - 0:18

I was privileged enough to speak
0:18 - 0:20

at their first iteration of this last year
0:20 - 0:22

and I had a great time
0:22 - 0:24

and was excited to submit again
0:24 - 0:25

although I told them not to pick me
0:25 - 0:27

because I had been here last year
0:27 - 0:28

and I figured they should branch
0:28 - 0:30

out a bit but here I am nonetheless
0:31 - 0:34

So we're here to talk about time today
0:34 - 0:37

and, specifically, we're not talking about
0:37 - 0:40

time zones and leap years and leap seconds
0:40 - 0:43

and September 1752 and Easter
0:43 - 0:45

and the other Easter
0:45 - 0:47

That's a different talk
0:47 - 0:49

In fact that's probably a whole
series of talks
0:49 - 0:52

We are here to talk today
about logical time.
0:54 - 0:55

I am John Daily
0:55 - 0:58

and I work for a company
named Basho Technologies
0:58 - 1:00

We write a distributed
database called Riak
1:00 - 1:02

and distributed databases are a big place
1:02 - 1:05

where time and causality come into play
1:05 - 1:06

thus I've been exposed to a lot of
this theory in practice
1:06 - 1:10

over the last few years
1:10 - 1:14

Quick note: lunch is at the end of my talk
1:14 - 1:17

I will try not to get between you and lunch
1:18 - 1:20

So if I seem to be going a little fast
1:20 - 1:22

it's because I don't want to be
between you and lunch
1:22 - 1:23

but please feel free to interrupt
1:23 - 1:25

especially if I jump the rails
1:25 - 1:27

I'm very good at tangents
1:27 - 1:29

It's very easy to get me off track
1:29 - 1:30

Pop up, ask questions
1:30 - 1:33

Otherwise, what I plan on doing is
1:33 - 1:35

if you have general questions that
1:35 - 1:37

don't involve 'Hey John, I have no idea
1:37 - 1:39

what you're talking about'
1:39 - 1:40

try to save them for the end
1:40 - 1:42

and we can talk after while other people
1:42 - 1:43

are free to go get their lunch
1:43 - 1:44

and we can talk about your questions.
1:46 - 1:47

So, the question is
1:47 - 1:49

Why are we here at all?
1:49 - 1:51

I appreciate everyone who bothered to show up
1:52 - 1:56

So, causality and co-ordination are
1:56 - 2:01

such trivial concepts on single
machine applications
2:02 - 2:06

Something you'd never even think about really
2:06 - 2:08

However, as soon as you introduce
2:09 - 2:11

a second machine, and a third machine
2:12 - 2:14

causality and co-ordination become much more complicated
2:14 - 2:16

So complicated in fact that
2:16 - 2:18

we've been at this for 40 years or so
2:18 - 2:22

trying to address these problems
or even longer
2:22 - 2:24

and we're still finding new areas
2:24 - 2:27

of research and new ways to improve
2:27 - 2:28

our handling of this issue.
2:29 - 2:32

If I can use an astrophysics analogy here,
2:32 - 2:34

this is much like the two-body vs. three-body
2:34 - 2:37

problem in physics.
2:37 - 2:40

Two-body is trivial. Three-body's a whole mess.
2:40 - 2:42

In computers, one server is fine.
2:42 - 2:44

Two servers are more of a challenge.
2:45 - 2:48

Let's start by talking about the
2:48 - 2:52

simplest, non-trivial problem I can come up with.
2:52 - 2:54

Two clients, two servers
2:54 - 2:55

Key value store
2:55 - 2:58

and two concurrent attempts
2:58 - 2:59

to set the same value.
2:59 - 3:04

Or where they set the same key with different values
3:04 - 3:07

The question here is, who would win?
3:07 - 3:12

Now, you may say 'John, this is obvious'
3:13 - 3:16

'The last one to arrive would win.'
3:16 - 3:18

And that's a simple enough heuristic.
3:18 - 3:19

We can certainly apply that.
3:19 - 3:22

In this case, the 2:42pm
write would arrive
3:22 - 3:25

after the 2:37 write arrived, and we'd say
3:25 - 3:27

'Well the balance must be $300.'
3:29 - 3:31

So then the question becomes
3:31 - 3:33

'Given that heuristic, what happens
3:33 - 3:35

when both arrive at the same time?'
3:35 - 3:38

RFC677,
3:38 - 3:40

one of those internet RFCs
3:40 - 3:42

you've never read,
3:42 - 3:44

is written by BBN, my former
3:44 - 3:46

employer way back in the day,
3:46 - 3:49

1975? Or '73? Somewhere in that time frame.
3:49 - 3:51

They came up with a
3:52 - 3:54

set of rules for keeping a distributed
3:54 - 3:56

key value store in sync
3:56 - 3:59

and one of those rules is that
3:59 - 4:01

you implement a tie-breaker here.
4:01 - 4:04

Each object, each operation in your system,
4:04 - 4:06

is tagged with a logical time stamp.
4:06 - 4:09

It's tagged with not only the physical time stamp
4:09 - 4:10

but also the name of the server
4:10 - 4:13

and you pick an ordering on your servers,
4:13 - 4:14

and that one wins.
4:14 - 4:16

So, in this case, B is greater than A.
4:16 - 4:18

We pick the write that arrived on
server B
4:18 - 4:20

We call that the winner.
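
The tie-breaker rule can be sketched in a few lines of Python. This is only an illustration of the idea as described in the talk, not code from RFC 677 or from any real database: the `Write` class, the `winner` function, the minutes-since-midnight timestamps and the $200 value for the earlier write are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """One attempted write, tagged with a time stamp and a server name."""
    value: str
    timestamp: int  # minutes since midnight (a format assumed for this sketch)
    server: str     # name of the server that accepted the write


def winner(a: Write, b: Write) -> Write:
    """Pick the later write; break time stamp ties by server name."""
    # Tuples compare element by element, so the server name only
    # matters when the two time stamps are equal.
    return max(a, b, key=lambda w: (w.timestamp, w.server))


# Different arrival times: the 2:42pm write beats the 2:37pm write.
early = Write("$200", 14 * 60 + 37, "A")
late = Write("$300", 14 * 60 + 42, "B")
print(winner(early, late).value)

# Same arrival time: B is greater than A, so server B's write wins.
tie_a = Write("$200", 14 * 60 + 42, "A")
tie_b = Write("$300", 14 * 60 + 42, "B")
print(winner(tie_a, tie_b).server)
```

Note that the rule always produces a winner, which is a separate question from whether that winner is the right answer for the business.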
4:22 - 4:24

One of the questions that will arise
4:24 - 4:25

out of this is
4:25 - 4:28

when it comes to business logic
4:28 - 4:30

'Is just blindly picking the
4:30 - 4:32

last write to arrive the right choice?'
4:32 - 4:34

We'll talk more about that.
4:36 - 4:38

So there are any number of lists on the internet
4:38 - 4:40

of things that programmers believe that
4:40 - 4:41

are not true.
4:41 - 4:44

This is a subset of things they believe
4:44 - 4:46

about time that are not true.
4:46 - 4:49

I have personally been bitten by
4:49 - 4:53

several of these but my number one
4:53 - 4:54

was number 16
4:54 - 4:56

'Time on the server clock and time
4:56 - 4:58

on the client clock would never be
4:58 - 5:00

different by a matter of decades'
5:00 - 5:00

YES
5:00 - 5:01

(laughter)
5:02 - 5:03

Astonishingly enough.
5:03 - 5:06

My very first IT job, I was
5:06 - 5:09

among other things supporting
Windows 3.1
5:09 - 5:13

clients connecting up through dial-up.
5:13 - 5:17

And we had a customer in Northern Indiana,
5:17 - 5:20

and some of their desktops refused
5:20 - 5:23

to connect to our system.
5:23 - 5:26

And so I was setting up in Chicago anyway
5:26 - 5:28

so I dropped in and tried to figure out
5:28 - 5:29

what in the world's going on.
5:29 - 5:31

It turned out that some of their desktops
5:31 - 5:33

had somehow been set at least
ten years
5:33 - 5:34

in the future.
5:34 - 5:35

I don't remember the exact date
5:35 - 5:38

but it was far enough advanced that
5:38 - 5:41

DOS could not accurately represent
the date.
5:41 - 5:43

It was 2000-and-something
5:43 - 5:45

and nobody was quite sure what it was.
5:45 - 5:47

I figured it was at least ten years.
5:47 - 5:49

We reset the clocks back.
5:49 - 5:51

Everything worked fine.
5:51 - 5:54

So yes, server clocks, client clocks are rarely in sync.
5:54 - 5:57

Server clocks themselves are rarely in sync.
5:57 - 6:00

And this is a vital problem we're trying to solve.
6:00 - 6:03

You may say, 'We've got NTP,
6:03 - 6:05

why are we talking about this at all?'
6:05 - 6:07

Fact of the matter is,
6:07 - 6:09

NTP is a great tool
6:09 - 6:11

and like all tools, it fails.
6:12 - 6:15

Firewall rules are set up wrong.
6:15 - 6:18

Daemons are not configured to start up at boot.
6:18 - 6:22

Daemons give up if synchronisation
6:22 - 6:25

has drifted too far.
6:25 - 6:27

NTP can also roll your clock backwards
6:27 - 6:28

which is a terrible, terrible thing.
6:28 - 6:30

We'll talk more about that.
6:33 - 6:35

Yes, I have heard of Spanner.
6:35 - 6:37

For anyone who hasn't,
6:37 - 6:39

Google published a paper on their
6:39 - 6:41

time synchronisation strategies
6:41 - 6:44

which involved bajillions of dollars,
6:44 - 6:48

specialised hardware, atomic clocks,
GPS clocks,
6:48 - 6:51

highly reliable networks,
6:51 - 6:53

their own server hardware.
6:53 - 6:54

All this for the goal of
6:54 - 6:56

trying to synchronise clocks across
6:56 - 6:58

their entire environment.
6:58 - 7:01

Even their TrueTime API,
which they supply,
7:01 - 7:03

still gives you an error range
7:03 - 7:05

any time you ask for the current time.
7:05 - 7:08

Even Google, throwing all of this at the problem,
7:08 - 7:10

still doesn't know what time it is.
7:10 - 7:11

Time is a hard problem.
7:11 - 7:13

This is not something that's
trivially solved.
7:13 - 7:16

Even with all the money and resources
7:16 - 7:17

in the world.
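
To see what "the current time with an error range" means for ordering, here is a small Python sketch. It is not Google's API: `TimeInterval`, `now_with_error` and `definitely_before` are hypothetical names, and the readings and error bound are made-up numbers. The only idea taken from the talk is that a clock reading comes back as a range rather than a single instant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    """A clock reading expressed as a range that contains the true time."""
    earliest: float
    latest: float


def now_with_error(clock_reading: float, max_error: float) -> TimeInterval:
    """Widen a raw clock reading by the largest error we believe is possible."""
    return TimeInterval(clock_reading - max_error, clock_reading + max_error)


def definitely_before(a: TimeInterval, b: TimeInterval) -> bool:
    """True only when the two ranges do not overlap and a comes first."""
    return a.latest < b.earliest


first = now_with_error(100.0, 2.0)    # somewhere in 98..102
second = now_with_error(103.0, 2.0)   # somewhere in 101..105
third = now_with_error(110.0, 2.0)    # somewhere in 108..112

print(definitely_before(first, second))  # ranges overlap, so we cannot say
print(definitely_before(first, third))   # no overlap, so the order is known
```

When two ranges overlap, the honest answer is that the order is unknown, which is exactly the situation logical clocks are designed to handle.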
7:17 - 7:19

Leslie Lamport
7:19 - 7:22

Who here has heard of Leslie Lamport?
7:22 - 7:24

Okay, good percentage.
7:24 - 7:25

I'm embarrassed to admit
7:25 - 7:28

the only reason I knew about Leslie
7:28 - 7:29

Lamport before this job was LaTeX.
7:29 - 7:31

He is the LA in LaTeX
7:31 - 7:32

which is the macro
7:32 - 7:33

programming language
7:33 - 7:35

for TeX that a lot of people use.
7:35 - 7:36

Leslie Lamport is a lot more
7:36 - 7:38

than the author of LaTeX.
7:38 - 7:40

He's a Turing Award winner
7:40 - 7:43

and he is a distributed
systems pioneer
7:43 - 7:45

In the mid-seventies he
wrote this paper
7:45 - 7:47

which has been cited more than
7:47 - 7:50

all but one other computer science paper
7:50 - 7:51

in the history of computer science.
7:51 - 7:53

Time, Clocks and the
Ordering of Events in
7:53 - 7:54

a Distributed System.
7:54 - 7:57

And Leslie recognized that hardware
7:57 - 7:58

clocks were bad.
7:58 - 8:00

Of course, in the 70s I would imagine
8:00 - 8:02

hardware clocks were even worse
8:02 - 8:03

than they are today.
8:03 - 8:06

There was no NTP yet
and he was looking at
8:06 - 8:08

the general problem of
8:08 - 8:10

"How do you distinguish time in an
8:10 - 8:12

intelligible way under
8:12 - 8:14

the distributed system?"
8:14 - 8:16

Obviously enough.
8:17 - 8:21

And he had a key realisation.
8:22 - 8:23

In this case we have three
8:23 - 8:25

discrete processes.
8:25 - 8:27

Those processes are
characterized by
8:27 - 8:29

internal events.
Those internal events
8:29 - 8:31

are all running on the
same hardware so
8:31 - 8:33

within each process you can
8:33 - 8:35

trivially order the
sequence of events.
8:35 - 8:36

On one computer we know
8:36 - 8:38

what happened first.
8:38 - 8:39

That's easy enough.
8:39 - 8:40

Between processes
8:40 - 8:42

the communication is messages.
8:42 - 8:44

And Leslie realized that these messages
8:44 - 8:47

provide us an event horizon
8:47 - 8:49

or event boundary.
8:49 - 8:51

A point in time where we can say
8:52 - 8:56

P1 happened before Q2 because
8:56 - 8:58

P1 sent a message and it arrived
8:58 - 9:00

at Q2.
See, you cannot send
9:00 - 9:02

a message after it arrives, right?
9:02 - 9:04

We have some basic laws of physics
9:04 - 9:06

even though physics itself struggles
9:06 - 9:07

to tell what time it is exactly.
9:07 - 9:09

We do know that messages arrive
9:09 - 9:10

after they're sent.
9:10 - 9:11

That provides a strict ordering.
9:11 - 9:15

We know that P1 and Q1 are concurrent
9:15 - 9:17

not necessarily parallel.
9:17 - 9:19

Parallel implies things are happening
9:19 - 9:20

at the same time.
9:20 - 9:22

Concurrent just says they happened and
9:22 - 9:24

we don't know which happened first.
9:24 - 9:27

That's one primitive definition of
9:27 - 9:28

the two terms.
9:28 - 9:30

P1 and Q1 are concurrent.
9:30 - 9:31

We can't really say anything about their
9:31 - 9:33

ordering other than they happened
9:33 - 9:34

maybe roughly the same time.
9:34 - 9:36

But regardless, Q1 happened before Q2,
9:36 - 9:38

P1 happened before Q2.
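
The relationships just described can be written down as a tiny graph in Python. This is a sketch under assumptions: it uses only the three events named here (P1, Q1, Q2), and `edges`, `happened_before` and `concurrent` are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Direct "happened before" edges. Events on one process are ordered by
# that process; sending a message is ordered before receiving it.
edges = defaultdict(set)
edges["Q1"].add("Q2")   # same process: Q1 precedes Q2
edges["P1"].add("Q2")   # message: sent at P1, received at Q2


def happened_before(a, b):
    """True if a chain of edges leads from event a to event b."""
    stack, seen = [a], set()
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def concurrent(a, b):
    """Two distinct events are concurrent when neither precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


print(happened_before("P1", "Q2"))
print(happened_before("Q1", "Q2"))
print(concurrent("P1", "Q1"))
```

Nothing in this graph mentions a wall clock; the ordering comes entirely from process order and messages.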
9:39 - 9:43

So, Leslie builds up this system in order
9:43 - 9:45

to share a global clock.
9:45 - 9:46

That global clock is no longer
9:46 - 9:48

a physical time stamp.
9:48 - 9:49

It's a simple counter.
9:49 - 9:51

Each time you send a message you send
9:51 - 9:53

your counter along with it and
9:53 - 9:55

the process that receives that message,
9:55 - 9:57

if its counter is behind yours because
9:57 - 9:59

it hasn't committed enough events
9:59 - 10:01

in its timeline,
10:01 - 10:02

it will advance that counter.
10:02 - 10:05

So in this case A is on 7,
B is on event 4.
10:05 - 10:08

When A sends its message,
B picks it up,
10:08 - 10:10

says "Well, A is at event 7, so I need to
10:10 - 10:11

move mine to at least 8."
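
The counter rule above fits in a short Python sketch. The class and method names are hypothetical, and starting A at 6 so that its send becomes event 7 is an assumption made to line up with the numbers in the example.

```python
class LamportClock:
    """A logical clock: a single counter, no physical time stamp."""

    def __init__(self, counter=0):
        self.counter = counter

    def tick(self):
        """A local event: advance the counter by one."""
        self.counter += 1
        return self.counter

    def send(self):
        """Sending is an event; the new counter travels with the message."""
        return self.tick()

    def receive(self, sender_counter):
        """Catch up to the sender if we are behind, then count the receipt."""
        self.counter = max(self.counter, sender_counter) + 1
        return self.counter


a = LamportClock(6)
b = LamportClock(4)

stamp = a.send()     # A is now on event 7
b.receive(stamp)     # B jumps from 4 to 8
print(a.counter, b.counter)
```

A receiver that is already ahead of the sender simply counts the receipt as one more event of its own.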
10:15 - 10:18

There is a vital vital word that
10:18 - 10:20

does not appear in Leslie's document but
10:20 - 10:22

shows up in a lot of other papers
10:22 - 10:25

around clocks and time and logical time.
10:25 - 10:27

Monotonicity in this context means -
10:27 - 10:30

Time always goes forward.
10:30 - 10:32

Time should never go backwards.
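
One way to picture monotonicity in code is a stamper that refuses to hand out a value lower than the last one, even when the clock underneath it jumps backwards. This is a sketch only: `MonotonicStamper` is a hypothetical name, and the clock readings are made-up integers standing in for milliseconds.

```python
class MonotonicStamper:
    """Issue time stamps that never go backwards."""

    def __init__(self, read_clock):
        self.read_clock = read_clock  # any function returning an integer
        self.last = None

    def next(self):
        reading = self.read_clock()
        # If the underlying clock stalled or was set back, step just
        # past the previous stamp instead of repeating or rewinding.
        if self.last is not None and reading <= self.last:
            reading = self.last + 1
        self.last = reading
        return reading


# A clock that gets rolled back after its second reading.
readings = iter([100, 105, 90, 91])
stamper = MonotonicStamper(lambda: next(readings))
stamps = [stamper.next() for _ in range(4)]
print(stamps)
```

For measuring elapsed time inside a single process, Python's standard library also provides `time.monotonic()`, which is guaranteed not to go backwards, unlike `time.time()`.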
10:33 - 10:35

Obviously, with those 3.1 clients
10:35 - 10:36

I had to set clocks back in time.
10:36 - 10:37

You don't want to do that
for a database server.
10:38 - 10:39

If you have a relational database
10:39 - 10:41

that's actively processing transactions,
10:42 - 10:43

if you rewind that clock
10:44 - 10:46

bad things happen.
10:46 - 10:47

Never set a clock backwards.
10:47 - 10:49

Has anyone actually done that and
10:49 - 10:51

then been burned by it? Out of curiosity.
10:51 - 10:53

Hey, we have one lucky here.
10:53 - 10:54

I've never been burned, lucky enough
10:54 - 10:56

but I have to admit in my misspent youth
10:56 - 10:58

I would in fact set a clock backwards
10:58 - 11:01

'cause I didn't know any better.
It's a bad idea.
11:03 - 11:05

All right, 1988.
11:05 - 11:07

Lamport's ideas have been
around for a bit.
11:07 - 11:09

We'll actually rewind to an earlier paper
11:09 - 11:10

here in a few minutes.
11:10 - 11:12

Colin Fidge in Australia takes
11:12 - 11:14

Lamport's ideas further
11:14 - 11:16

and he defines the concept
of vector clocks.
11:18 - 11:20

So, each process in your system
11:20 - 11:23

tracks a global state
11:23 - 11:25

not in terms of a single counter
11:25 - 11:28

but it's the last time stamp it knows
11:28 - 11:31

for each of the other processes
in the system.
11:31 - 11:33

And every time it sends a message,
11:33 - 11:36

it will share that global state
11:36 - 11:38

with whatever process is on the other end.
11:39 - 11:40

And the remote process can then
11:40 - 11:42

take a look at that, compare it to
11:42 - 11:44

its notion of what the timestamps are
11:44 - 11:45

and push everything forward.
11:45 - 11:47

So, in this case,
11:48 - 11:49

B did not know that A had
11:49 - 11:51

advanced to event 6
11:51 - 11:53

so it pushes 6 forward
11:53 - 11:55

B's own event counter gets incremented
11:55 - 11:58

because the receipt of a message
is another event
11:58 - 11:59

in its lifecycle.
11:59 - 12:02

and it pushes C's
12:03 - 12:05

clock forward to 13
12:05 - 12:07

because C sent the message
12:08 - 12:10

and that 'sending a message' constitutes
12:10 - 12:11

another event.
12:11 - 12:13

So, B can go ahead and advance
up to 13.
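
Here is a minimal Python sketch of that exchange. The class and method names are hypothetical, and the starting counters are assumptions picked so that the result matches the numbers mentioned: B learns that A reached 6, counts the receipt as its own event, and moves C forward to 13.

```python
class VectorClock:
    """Each process keeps the last counter it knows for every process."""

    def __init__(self, owner, known=None):
        self.owner = owner
        self.known = dict(known or {})
        self.known.setdefault(owner, 0)

    def local_event(self):
        """Any event on this process advances only its own entry."""
        self.known[self.owner] += 1

    def send(self):
        """Sending is an event; a copy of the whole vector goes along."""
        self.local_event()
        return dict(self.known)

    def receive(self, incoming):
        """Push every entry forward to the larger value, then count the receipt."""
        for process, count in incoming.items():
            self.known[process] = max(self.known.get(process, 0), count)
        self.local_event()


# Assumed starting state: C has heard from A more recently than B has.
b = VectorClock("B", {"A": 4, "B": 3, "C": 10})
c = VectorClock("C", {"A": 6, "B": 2, "C": 12})

message = c.send()    # C's own entry becomes 13
b.receive(message)    # B merges, then increments its own entry
print(b.known)
```

Entries only ever move forward, so the merge keeps the same monotonic property as the single counter.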
12:13 - 12:14

So now, at the...
12:14 - 12:17

Say we're troubleshooting or we're trying
12:17 - 12:19

to understand how the system
arrived in this state.
12:19 - 12:22

We have a much more comprehensive history
12:22 - 12:25

and we can do a better job of analyzing
12:25 - 12:27

what the order of events across
the system was
12:27 - 12:30

even if the physical time stamps
12:30 - 12:32

themselves did not align.
12:32 - 12:35

Now, you may be asking yourself,
12:35 - 12:37

"How would I troubleshoot log files
that have
12:37 - 12:40

these monotonically
incrementing counters, right?"
12:42 - 12:42

"What does 12 mean?"
12:44 - 12:46

"What time did that happen?"
12:46 - 12:49

"Was there a solar flare yesterday
at 1:00?"
12:49 - 12:50

"Maybe I should be looking for something there?"
12:50 - 12:51

So, Strange Loop...
12:51 - 12:52

Anyone been to Strange Loop?
12:52 - 12:53

I hope some of you have...
12:53 - 12:53

Excellent.
12:53 - 12:55

So Strange Loop in St. Louis every year,
12:55 - 12:57

one of the best technical conferences
12:57 - 12:58

you will ever attend,
12:58 - 13:00

Jon Moore from Comcast gave a
13:00 - 13:01

great talk this year about clocks.
13:01 - 13:05

Specifically about hybrid clocks,
13:05 - 13:08

trying to marry logical clocks
13:08 - 13:10

and physical time stamps and
13:10 - 13:11

use that marriage in order to
13:11 - 13:14

A) keep physical clocks better in sync
13:14 - 13:16

across your cluster, and
13:16 - 13:18

B) have something that operators can
13:18 - 13:20

actually look at in a log file and
figure out
13:20 - 13:22

what in the world is going on.
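
As one illustration of marrying the two kinds of clock, here is a Python sketch of a hybrid logical clock in the style published by Kulkarni and colleagues. That design is swapped in for illustration; it is not necessarily the scheme presented in the Strange Loop talk. The class name, method names and clock readings are assumptions.

```python
class HybridClock:
    """A time stamp is (l, c): physical-looking time plus a logical counter."""

    def __init__(self, read_physical):
        self.read_physical = read_physical  # function returning an integer
        self.l = 0   # largest physical time seen so far
        self.c = 0   # counter that orders events sharing the same l

    def now(self):
        """Stamp a local event or an outgoing message."""
        previous = self.l
        self.l = max(previous, self.read_physical())
        self.c = self.c + 1 if self.l == previous else 0
        return (self.l, self.c)

    def update(self, msg_l, msg_c):
        """Stamp the receipt of a message carrying (msg_l, msg_c)."""
        previous = self.l
        self.l = max(previous, msg_l, self.read_physical())
        if self.l == previous and self.l == msg_l:
            self.c = max(self.c, msg_c) + 1
        elif self.l == previous:
            self.c += 1
        elif self.l == msg_l:
            self.c = msg_c + 1
        else:
            self.c = 0
        return (self.l, self.c)


# A node whose physical clock lags behind the sender's.
readings = iter([100, 100, 110])
clock = HybridClock(lambda: next(readings))

first = clock.update(105, 0)   # message stamped ahead of our clock
second = clock.now()           # our clock still reads 100
third = clock.now()            # our clock has caught up to 110
print(first, second, third)
```

The stamps still sort in causal order when compared as tuples, and the first element stays close enough to wall-clock time for an operator reading a log file.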
13:22 - 13:23

So, physical timestamps are
tremendously useful.
13:23 - 13:28

But, they're not the focus of today's talk.
13:28 - 13:29

So, you may have noticed that
13:29 - 13:30

I pulled a fast one on you.
13:30 - 13:32

I started by talking about the problem of
13:32 - 13:35

data synchronization, of choosing a winner
13:35 - 13:36

in conflicting writes.
13:36 - 13:39

And I jumped to global-state order.
13:39 - 13:42

It turns out that these are very
similar problems,
13:42 - 13:43

they have very similar solutions, but they
13:43 - 13:45

are, in fact, separate streams of research
13:45 - 13:47

for the most part, although people
13:47 - 13:49

are trying to unify the two now.
13:49 - 13:55

So, let's go back in time a little
bit to 1983.
13:55 - 13:58

A team from UCLA was writing a network
13:58 - 14:00

operating system called LOCUS.
14:00 - 14:02

And LOCUS's two big ideas were this,
14:02 - 14:05

you take all the computers in the network,
14:05 - 14:09

and you expose a global process state
14:09 - 14:11

and a global file system state.
14:11 - 14:13

So, you can manipulate processes
14:13 - 14:15

on any machine in your network
14:15 - 14:17

and you can edit files that are stored on
14:17 - 14:20

any machine in your network.
14:20 - 14:22

Incidentally, I tried to work on this
14:22 - 14:24

problem many years ago and we got nowhere
14:24 - 14:26

near as far as they did.
14:26 - 14:28

There are some very hard problems
to solve here
14:28 - 14:32

So, this paper
14:32 - 14:35

was specifically focused on the problem
14:35 - 14:41

of network partitions.
14:41 - 14:43

The paper itself introduces
version vectors,
14:43 - 14:45

not to be confused with vector clocks,
14:45 - 14:47

although many people do.
14:47 - 14:48

Including Riak. Our database
14:48 - 14:49

got that name wrong. We use version
14:49 - 14:53

vectors, but we call them vector clocks.
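
A version vector is attached to a piece of data rather than to a process, and its main job is to tell whether one copy supersedes another or whether the two conflict. The Python sketch below shows that comparison under assumptions: the function names and the sample vectors are hypothetical, and this is not Riak's or LOCUS's actual code.

```python
def descends(a, b):
    """True if the copy with vector a has seen every update that b has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two copies of the same object by their version vectors."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"


original = {"A": 2, "B": 1}
updated_on_b = {"A": 2, "B": 2}   # one more update accepted by server B
updated_on_a = {"A": 3, "B": 1}   # a different update accepted by server A

print(compare(original, updated_on_b))
print(compare(updated_on_a, updated_on_b))
```

A "conflict" result means the two copies were updated independently, for example on opposite sides of a partition, and something other than the vectors has to decide how to merge them.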
14:53 - 14:56

As importantly, it talks about what
14:56 - 14:58

partitions actually mean for a network.
14:58 - 15:01

Now, one of those myths that programmers
15:01 - 15:03

believe about networks is that
15:03 - 15:06

network partitions are rare.
15:06 - 15:08

That could not be further from the truth.
15:08 - 15:10

Network partitions are evil, they are
15:10 - 15:14

pernicious, they are ubiquitous.
15:14 - 15:17

And they're extremely difficult
to troubleshoot.
15:17 - 15:20

Network partitions don't have
to be bi-directional.
15:20 - 15:22

You can have Server A able to talk to B
15:22 - 15:24

and B can't talk to Server A.
15:24 - 15:25

Good luck troubleshooting that one.
15:25 - 15:28

And good luck writing the software that
15:28 - 15:29

can account for that, right?

Video Language:: English
Duration:: 32:58