So, the title of this talk is "Jepsen IV: Hope Springs Eternal."

There is in fact no Jepsen III. Don't go looking for it. It doesn't exist. That should tell you a little bit about what we're going to discuss in the talk today.

My name is Kyle Kingsbury. You might know me as "aphyr." I am scared and confused and excited about computers all the time. Today, I work at a new company called Stripe. We are a credit card company: we help you sell goods and services via some API, which is phenomenal, right? Developers say, "Oh, I wanna take credit cards. How do I do it? Well, I could build all this PCI-compliant infrastructure, or I could just make an HTTP request to some magical rainbow in the sky, and everything will work out just great."
We've all built APIs, right? They're not real things. They're abstractions. They're imaginary constructs. So that's this rainbow up top. That's our API, right? But it's actually supported on this steel girder framework, which is our code. That's written in Ruby. And then that code rests on top of all these libraries and the Ruby VM, represented here by this big pile of sticks. And those things in turn rest on a pile of tires, which symbolize our databases and queues and other associated systems. Now, does anybody who's used a database... ah, how many of you have, actually? Oh, very good. Less than I expected.

(Audience laughs as slide is changed)
Anybody who's used a database knows everything is on fire all the time. And our job is to continue building the tower higher and higher to try and get away from the flames.

(Audience laughs again)

This is called the hierarchy of abstraction. As good information hiders, we're trying to isolate the terrible things happening down below from the good things which our customers think are happening. And ideally it works. Ideally, customers continue to come to the site, they make requests, and data flows properly, even though nodes internally are failing and bad things are happening to the network. That's the goal.
What kind of bad things am I talking about? Well, if you've ever used a queue or a database or any consistency system whatsoever, you know they can get into split-brain scenarios, where different nodes believe different truths. Or there might be foreign key problems, where you go to look up a record and it's gone. Or maybe you see read anomalies: you do a write, then a read, and it's not there. Then you do another read, and it comes back. You do another, and it's gone. Right? Interesting flip-flops in state can occur.

So, in the face of these problems, we want to know: how likely are they to actually affect our systems? Is our use of the database correct? Is our software likely to show these problems to users?
And the only way to do that is to measure them. So Jepsen is a project I'm working on to understand distributed systems better. It's code, it's articles, it's talks, trying to put pressure on vendors, trying to educate users, and to figure out for myself what consistency actually means.
In Jepsen, we take a systems-based approach. What is a system? Well, anything you can put a box around. Which means pretty much everything is a system. Inside of the system we're gonna have some interacting components, and they're gonna have complicated and nuanced things happen between them. But our goal is that if we draw a box around the system as a whole, we should be able to characterize what goes on inside the box just based on the interactions it has with its environment. And hopefully those interactions are simple. So at the edge, when we're pushing things into a system and taking things out, we should have simple rules, like: maybe matter is conserved, or maybe [format] is conserved, or maybe if I put things into a cube they come out again. Maybe if I write data into a database, it should still be there in a week. Those are simple, high-level rules. And then internally, the database can do whatever kind of locks and transformations it wants to make those things happen.
Thinking about invariants for a minute: what sort of rules, what sort of high-level properties could we want? We might think about a simple program, right? Like: allocate a variable x, set initially to A; read the variable by printing it, which prints A; then say x equals B, and print x. We all know what this does, right? We've written a program like this before. What we think it does is this: we start off with a state A and we read it. Then we change the state to B by performing a write. And then we read B. So we'll print A and then B. Straightforward enough.
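Written out as a little sketch (Python here, purely for illustration):

```python
outputs = []

x = "A"            # allocate a variable x, set initially to A
outputs.append(x)  # read the variable: we see A
x = "B"            # a write: change the state to B
outputs.append(x)  # read again: we see B

print(outputs)     # ['A', 'B']
```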
But this is not the only possible interpretation of this program. We could read B. We could read A. We could do anything. The program doesn't have to obey any laws unless we tell it to. So the invariants in our head map the program's structure (the sort of lexical structure of the process) to the behaviors which it's allowed to undergo. Invariants are constraints that apply to the history of the system: what possible orderings of states could we see? What possible ways could the operations come together?
We might think about not just a single-threaded program, but a concurrent one, where multiple processes interact with the state at the same time. So here the top process reads A and then B. That's crazy! Why did it change? Well, it changed because a process on the bottom performed a write with the value B. So individual processes may not have the whole story. But if we unify the whole history from every process, we should get something that follows the same rules.
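That two-process history can be sketched with Python threads. The sleeps are only there to force the particular interleaving described above; real schedulers are not this polite:

```python
import threading
import time

x = "A"   # shared state, initially A
log = []  # the top process's observations

def top_process():
    log.append(x)    # first read: sees A
    time.sleep(0.3)  # meanwhile, the bottom process runs...
    log.append(x)    # second read: sees B. Why did it change?

def bottom_process():
    global x
    time.sleep(0.1)
    x = "B"          # a write from another process changes the state

t = threading.Thread(target=top_process)
b = threading.Thread(target=bottom_process)
t.start(); b.start()
t.join(); b.join()

print(log)  # ['A', 'B'] with this timing
```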
But it's a little more complicated than that, because our systems aren't just concurrent. They're also distributed. Now, distributed systems have some characteristic distance between nodes. It takes time for messages to propagate. If I want to write a value to a register in DRAM, I have to go off of the CPU, into the memory controller, through a bunch of caches, back onto some wire, like eleven whole centimeters over to the DRAM, into the actual bit, and it flips. And that takes, like, what? A nanosecond? So then I have to come all the way back, and that takes more time. So whenever we're manipulating state somewhere, it's gonna take a little bit of time to go back and forth. And that time means that things might be ambiguous.
We could see different orders. For example, if I do a write of B, and I concurrently begin a read, and then the write completes, and the read completes, one possible ordering, depending on how those messages propagated, is that I see the value B, because the read arrives after the write. But I could also see A, if the read arrives first. A or B. All that changes is the ordering of the messages. In an asynchronous network, like the ones we're usually working with in IP, we don't get to control what those orderings are. So both histories are valid. We have to consider all possible interleavings of the concurrent operations. But it's not infinitely bad, because we're not allowed to break the laws of physics. We can't go back in time. If we're actually talking to a single point of truth somewhere in the system, then the earliest possible state I could interact with is the one just after I send my request, and the latest possible state is the one just before I receive a response. So you get some window between invocation and response, and that's the range of states we can interact with. This property is called linearizability.
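One way to make that window concrete: treat each operation as an (invocation, response) interval and ask whether some order that respects real time explains every read. This is roughly the kind of check Jepsen performs; the brute-force sketch below is my own illustration, not Jepsen's actual code:

```python
from itertools import permutations

def linearizable(history, initial):
    """Check a history of register ops, each a dict with op
    ('read'/'write'), value, invoke time, and complete time.
    An order is admissible if, whenever one op completes before
    another is invoked, it comes first; it is legal if every
    read sees the latest preceding write."""
    for order in permutations(history):
        # Real-time constraint: an op placed later in the order
        # must not have completed before an earlier op was invoked.
        if any(order[j]["complete"] < order[i]["invoke"]
               for i in range(len(order))
               for j in range(i + 1, len(order))):
            continue
        # Replay the register through this order.
        state, legal = initial, True
        for op in order:
            if op["op"] == "write":
                state = op["value"]
            elif op["value"] != state:
                legal = False
                break
        if legal:
            return True
    return False

# A write of B overlaps a concurrent read: the read may see A or B.
concurrent = [
    {"op": "write", "value": "B", "invoke": 0, "complete": 4},
    {"op": "read",  "value": "A", "invoke": 1, "complete": 3},
]
print(linearizable(concurrent, initial="A"))  # True: A is a valid result

# But once the write has completed, a later read must see B, not A.
after = [
    {"op": "write", "value": "B", "invoke": 0, "complete": 4},
    {"op": "read",  "value": "A", "invoke": 5, "complete": 6},
]
print(linearizable(after, initial="A"))  # False: a stale read
```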
And it's kind of the gold standard for both distributed and concurrent systems. You want things like atomic registers, like mutexes, that give you real-time guarantees on the states you'll interact with. With a mutex, when you lock it, nobody else gets to claim it until you release it. That's a really nice, strong property.

So linearizability means that any operations that are not concurrent are ordered relative to each other. It's not a total order, right? We can still have those windows of concurrency where you could get A or B. But that only happens when two operations overlap in time. And that means that if I do a write, and the write completes, and then I begin a read, because those two aren't concurrent, the read is guaranteed to see the write, or some later state. So everybody's gonna agree on the same order of events. We all see A, B, C, D. Everybody's going to agree on when an operation is visible. Once you've completed an operation, you know everybody else is gonna see it. And this means we get nice invariants like no stale reads, like mutual exclusion. Lots of systems rely on these properties.

But we don't have to be that strong. This is a really rigid thing to do. We could relax the time constraints. We could let you go backwards and forwards in time.
You could read something from the past and write something to the future. Well, how do you write to the future? You put your message in a bottle and throw it in the ocean, and somebody else picks it up. If your ocean is well-ordered, as mine is, then this works just fine. Everything arrives in order at your recipient, maybe on a different coast, and they apply your operations, and they see the same history of messages. They just see them a week later. So all participants are going to agree on the same order of events. They'll all see A, B, C. But they might not agree on what time those things happened.

We can relax that invariant still further. If you're thinking about a transactional system with lots of different cells interacting, maybe like an SQL database, we get a property like serializability, where we say that every transaction has to fit somewhere into a single linear history. But we don't have to agree on when it happened, or even in what order, just as long as you can squeeze the transaction in somewhere.
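A minimal sketch of that idea (my own toy model, not any particular database): serializability just asks whether some permutation of the transactions, replayed one at a time, reproduces what each transaction observed. Note there is no real-time constraint here, unlike linearizability:

```python
from itertools import permutations

def serializable(txns, initial):
    """Each transaction takes the current state and returns
    (new_state, observations_consistent?). The history is
    serializable if SOME serial order replays legally."""
    for order in permutations(txns):
        state, ok = initial, True
        for txn in order:
            state, consistent = txn(state)
            if not consistent:
                ok = False
                break
        if ok:
            return True
    return False

# t1 claims it read 0, then wrote 1. t2 claims it read 1.
def t1(state):
    return 1, state == 0

def t2(state):
    return state, state == 1

print(serializable([t1, t2], initial=0))  # True: the order t1, t2 fits

# If t2 had claimed to read 2, no serial order could explain it.
def t2_bad(state):
    return state, state == 2

print(serializable([t1, t2_bad], initial=0))  # False
```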
So, in a serializable system, operations happen in some order; we just don't know what it is. It's perfectly legal, for example, in a register, to put all of your writes at time infinity and all of your reads at time zero, and just have /dev/null be your database. That doesn't seem very useful, right? But it's a sort of trivial solution. If you have a more complex system, like an SQL database, where you're saying something like "set this row, this particular column, to be three if and only if this thing plus that thing is equal to this other thing," those constraints are actually a lot more difficult to preserve. And so you get a stronger invariant out of it.

Finally, we don't have to have just a single line of history. We could actually have a forking, diverging history. In eventually consistent systems, the constraint is that things have to come together somehow.
So if we have a counter whose value is zero, that counter could diverge. We could have two distinct copies of it, and each one of them could increment separately and return one. Reads would show one for a while. And then, when the counters have exchanged information, the value will converge on two, if we do everything right. So, eventually consistent systems have to converge, and that convergence doesn't just mean they have to agree on some value. Ideally, they should converge on the correct value. The correct value might be the most recently written value in a register. The correct value might be the number of increments that have occurred. It depends on the semantics of the data type that you're working with.

Finally, we could imagine invariants that just order two operations, like a write and a write, or a write and a read. And we get this family of session guarantees: monotonic writes, monotonic reads, writes follow reads, read your writes.
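The diverging counter can be sketched as a grow-only counter CRDT (a G-counter, one common way to build such a counter; this is an illustrative sketch, not a system's actual implementation). Each replica increments its own slot, and merging takes the element-wise max, so the copies converge on the correct total regardless of how merge messages are ordered:

```python
class GCounter:
    """A grow-only replicated counter."""

    def __init__(self, n_replicas, replica_id):
        self.counts = [0] * n_replicas  # one slot per replica
        self.id = replica_id

    def increment(self):
        # Each replica only ever touches its own slot.
        self.counts[self.id] += 1

    def value(self):
        # The counter's value is the sum over all replicas.
        return sum(self.counts)

    def merge(self, other):
        # Element-wise max: commutative, associative, idempotent,
        # so replicas converge no matter the message order.
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

# Two copies of a counter at zero diverge: each increments separately...
a, b = GCounter(2, 0), GCounter(2, 1)
a.increment()
b.increment()
print(a.value(), b.value())  # 1 1 -- reads show one for a while
# ...then they exchange information and converge on the correct value:
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # 2 2
```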