-
So now that we understand what block
ciphers are, let's look at a classic
-
example called the Data Encryption
Standard. So, just a quick reminder. Block
-
ciphers basically map N bits of input to N
bits of output. And we talked about two
-
canonical examples, triple DES and AES. In
this segment, we're gonna talk about DES,
-
and we'll talk about triple DES, actually,
in the next segment. And then I also
-
mentioned before that block ciphers are
often built by iteration. In particular,
-
we're gonna look at block ciphers that are
built by a form of iteration where a key K
-
is first expanded into a bunch of round
keys. And then a round function is applied
-
to the input message, again and again and
again. And essentially, after all these
-
round functions are applied, we obtain the
resulting cipher text, okay? And again,
-
what we're gonna look at, how DES, the
data encryption standard, uses this
-
format. I just wanna be clear that, in
fact, to specify a block cipher of this
-
type, one needs to specify the key
expansion mechanism, and one needs to
-
specify the round function. In the segment
here, I'm gonna focus on the round
-
function, and I'm not gonna talk much
about key expansion. But I just wanted to
-
mention that, in fact, key expansion is
also a big part of describing how block
-
cipher works. Okay, so let's talk about
the history of DES. Essentially, in the
-
early 1970s, IBM realized that their
customers are demanding some form of
-
encryption. And so they formed a crypto
group, and the head of that group, was
-
Horst Feistel, who, in the early 70s,
designed a cipher called Lucifer. Now,
-
it's interesting. In fact Lucifer had a
number of variations but one of the later
-
variations in fact had a key length that
was 128 bits and a block length that's
-
also 128 bits. Okay, in 1973 the governor
realized that it's buying many commercial
-
off-the-shelf computers and so it wanted
its suppliers to actually have a good grip
-
to algorithm that they could use in
products sold to the government. So in
-
1973 the National Bureau of Standards as
it was called at the time put out a
-
request for proposals for a block cipher
that is going to become a federal
-
standard. And in fact IBM submitted a
variant of Lucifer. That variant actually
-
went through some modification during the
standardization process and then finally
-
in 1976, the national bureau standard
adopted DES as a federal standard. And, in
-
fact, for DES it's interesting that the
key length was far reduced from Lucifer.
-
It's only 56 bits. And the block length
was also reduced to 64 bits. And in
-
fact, these decisions, especially the
decision to reduce the key length, is
-
going to be a key length yield of DES and
was a source of many complaints over its
-
life. In particular, already back in 1997,
DES was broken by exhaustive search
-
meaning that a machine was able to search
through all two to the 56 possible keys to
-
recover a particular challenge key. And in
fact we're going to talk about exhaustive
-
search quite a bit. It's quite an
interesting question, and there are
-
various ways to defend against exhaustive
search. And basically this 1997 experiment
-
kinda spelled the doom of DES. It meant
that DES is itself, is no longer secure.
-
And as a result, the National Institute of
Standards, as it became known, issued a
-
request for proposals. For our next
generation's block cipher standard and in
-
2000 it standardized on a cipher called Rijndael.
Which became the advanced encryption
-
standard, AES. And we'll talk about AES
later on. But in this segment, I wanna
-
describe how DES works. Now, DES is a
cipher, it's an amazingly successful
-
cipher. It's been used in the banking
industry. In fact, there's a classic
-
network called the Electronic
Clearinghouse, which banks use to clear
-
checks with one another. And DES is used
for integrity in those transactions. It's
-
also used in commerce. In fact, it was
very popular up until recently, as the
-
main encryption mechanism for the web. Of
course, now, that's been replaced with AES
-
and other ciphers. Overall, it's a
very successful cipher in terms of
-
deployment. DES also has a very rich
history of attacks, which we'll talk about
-
in the next segment. Okay, so now, let's
talking about the construction of DES. So,
-
the core idea behind DES is what's called
a Feistel network, due to Horst Feistel.
-
And basically, it's a very clever idea for
building the block cipher out of arbitrary
-
functions, F1 to FD. Okay so imagine we
have these functions F1 to FD
-
that happens to match n bits to n bits.
Now these are arbitrary functions,
-
they don't have to be invertible or
anything. What we want to do is build an
-
invertible function out of those D
functions and the way we'll do it is by
-
building a new function we'll call capital
F that maps 2n bits to 2n bits.
-
And the construction is described right
here. So here we have our inputs. You
-
notice, there are two blocks of n bits.
In other words, the input is actually
-
2n bits. The R and L stand for right and
left. Typically, people describe a
-
Feistel network from top to bottom. In
which case, these n bits really would be
-
right and left. But here it is more
convenient for me to describe it
-
horizontally. So if we follow the R
inputs. You realize it basically gets
-
copied into the L output, without any
change at all. Okay? However, the L
-
inputs, is changed somewhat. What happens
is, basically, the R inputs is fit into
-
the function F1 and the result is then
XORed with L0 and that becomes the new R1
-
input. Okay, so this is called one round
of a Feistel network, and is done using
-
the function F1. Now we do this again with
another round of the Feistel network
-
with the function F2, and we do it again
and again and again, 'till we get to the
-
last round, and we do it again with the
function FD. And finally, the output is
-
R_d L_d. Okay. So, if you like, we can write
this in symbols. That basically, L_i is
-
simply equal to R_i-1. And R_i,
let's see, that's the more complicated
-
one. R_i is equal, well, let's just follow
the lines here. R_i is just equal to F_i
-
applied to, R_i-1 XOR L_i. Okay?
So this, and this is basically, i goes
-
from, 1 to d. So this is the, equation
that defines a Feistel network, mapping a
-
2n bit input to 2n bit outputs. So
here we have the, again, I just copied the
-
picture of the Feistel network. And the
amazing claim is that, in fact, it doesn't
-
matter which functions you give me. For
any functions, F1 to FD that you give me,
-
the resulting Feistel network function is,
in fact, invertible. And the way we're
-
going to prove that is basically we're
going to construct an inverse, because not
-
only is it invertible, it's efficiently
invertible. So let's see. So let's look at
-
one round of a Feistel network, so
here, this is the inputs, R_i L_i, and this
-
is the output R_i+1, L_i+1. And now what I'm
going to ask you is to invert this.
-
So let's see. So suppose now the input that
we're given is R_i+1, L_i+1 and we want to
-
compute R_i L_i. So we want to compute the
round in the reverse direction. So let's
-
see if we can do it. Well, let's look at
R_i. So, R_i is very easy. Basically, R_i is
-
just equal to L_i+1. So L_i+1 just becomes
R_i. But now, let me ask you, to write the
-
expression for L_i in terms of R_i+1, and L_i+1.
-
So I hope everybody sees that, basically, L_i+1
is fed into the function F_i+1.
-
The result is XORed with R_i+1,
and that gives the L_i input.
-
So this the inverse of one round
of a Feistel network.
-
And if we draw this as a diagram, let's just
write the picture for the inverse.
-
So here you notice the input is R_i+1, L_i+1
and the output is R_i L_i. Right?
-
So we're computing, we're inverting, the
rounds. So you notice that the inverse of
-
a Feistel round looks pretty much the
same as the Feistel round in the
-
forward direction. It's literally, you
know, just for a technical reason, it's
-
kinda the mirror image of one another. But
it's basically, the same construct. And
-
when we put these inverted rounds back
together. We essentially get the inverse
-
of the entire Feistel network. So you
notice we start off with the round number D
-
with the inverse of round number D,
then we do the inverse of round number D-1
-
and so on and so forth until we
get to the inverse of the first round. And
-
we get our final outputs which is R_0, L_0,
like this is the inputs and we manage to
-
invert basically R_d, L_d and get R_0, L_0
and the interesting thing is that
-
basically these inversion circuits look
pretty much the same as the encryption
-
circuits and the only difference is that
the functions are applied in reverse order.
-
Right we started with F_d and ended with
F_1. Whereas when we were encrypting, we
-
started with F_1 and ended with F_d. So, for
hardware designers, this is very
-
attractive, because, basically, if you
wanna save hardware, you realize that your
-
encryption hardware is identical to your
decryption hardware. So you only have to
-
implement one algorithm, and you get both
algorithms the same way. The only
-
difference is that the functions are
applied in reverse order. Okay. So this
-
Feistel mechanism is a general method
for building invertible functions from
-
arbitrary functions F_1 to F_d and in fact,
it's used in many different block ciphers.
-
Although, interestingly, it's not actually
used in AES. So, there are many other
-
block ciphers that use a Feistel
network. Or, of course, they differ from
-
DES in the functions F_1 to F_d. But AES
actually uses a completely different type
-
of structure that's actually not a
Feistel network. We'll see how AES
-
works in a couple of segments. So now that
we know what Feistel networks are, let
-
me mention an important theorem about the
theory of Feistel networks that shows
-
why they're a good idea. This theorem is
due to Luby and Rackoff back in 1985, and it
-
says the following. Suppose I have a
function that is a secure pseudorandom
-
function, okay. So it's indistinguishable
from random and happens to act on N bits.
-
So it maps N bits to N bits and uses a
key K. Then, it turns out, that if you use
-
this function in three rounds of a Feistel
network. What you end up with is a secure
-
pseudo random permutation. In other words,
what you end up with is an invertible
-
function that is indistinguishable from a
truly random invertible function. And I
-
hope you remember that the true definition
of a secure block cipher is that it needs
-
to be a secure pseudo random permutation.
So what this theorem says, is that if you
-
start with a secure pseudo random
function, you end up with a secure block
-
cipher. Basically, that's what this is.
And let me explain in a little bit more
-
detail what's actually going on here. So,
essentially, the PRF is used in every
-
round of the Feistel network. So, in
other words, here, what's actually
-
computed is the PRF using one secret key,
K0. Here, what's computed is the PRF
-
using a different secret key, of course
applied to R1. And here we have yet
-
another secret key, K1 applied, K2 applied
to R2. And you notice, this is why,
-
basically this Feistel construction,
uses keys in K cubed. In other words, it
-
uses three independent keys. So it's very
important that the keys are actually
-
independent. So really, we need three,
independent keys. And then we end up with
-
a secure pseudorandom permutation. Okay,
so that's the theory behind Feistel
-
networks. And now that we understand that,
we can actually look at the specifics of DES.
-
So DES is basically a sixteen round
Feistel network, okay. So there are
-
functions F1 to F16 that map 32 bits to
32 bits, and as a result, the DES itself
-
acts on 64 bit blocks, 2x32. Now the
sixteen round functions in DES are
-
actually all derived from a single
function F. Just used with different keys.
-
So in fact, these are the different round
keys. So K_i is, a round key. And it's
-
basically derived from the key K, derived
from the 56 bit DES key K. Okay and I'll
-
describe what this function F is in just a
minute. But basically that, you see that
-
by using sixteen different round keys, we
get sixteen different round functions. And
-
that gives us the Feistel network. So just
on a high level how the DES works
-
basically you have a 64 bit input. The
first thing it does is, this initial
-
permutation that just permutes the 64 bits
around. Namely it maps bit number one to
-
bit number six. Bit number two to bit
number seventeen, and so on. This is not
-
for security reasons, this is just
specified in the standard. Then we go into
-
the sixteen round Feistel network. That
actually, you now know how it works.
-
Basically uses the function F1 to F16, as
specified before. And then, basically we
-
have another permutation, it's called the
final permutation. That's just the inverse
-
of the initial permutation. Again,
it just permutes bits around, this is not
-
necessary for security reasons. And then
we finally get, the final outputs. Okay.
-
Now, as we said, there's a key expansion
step, which I'm not gonna describe. But
-
basically, this 56-bit DES key is expanded
into these rounds keys. Where each round
-
key, is 48 bits. Okay, so we have sixteen
48 bit round keys, and they're basically
-
used in the sixteen rounds of DES. And
then when you want to invert the cipher,
-
all you do is you use these, round keys,
these sixteen round keys, in reverse
-
order. Okay, so now that we understand the
DES structure, the only thing that's left
-
to do is specify the function, capital F.
So let me explain how this function works.
-
So basically, it takes, as inputs, its, 32
bit value, let's call it X. But in
-
reality, you remember, this is R_0,
R_1, R-2, R_3, so on and so
-
forth. These are 32 bit values. And then
it takes, also, a 48 bit round key. So
-
here we have our key K_i, which happens to
be 48 bits. The first thing it does, is it
-
goes through an expansion box. And this
expansion box basically take thirty two
-
bits, and maps them into 48 bits. Now, all
the expansion box does is just replicates
-
some bits, and move other bits around. So,
for example, bit #1 of X is replicated
-
into positions 2 and 48 in the output.
Bit #2 of X is positioned in as bit
-
#3 of the output. And so on and so
forth, just by replicating some of the
-
bits of X, we expand the input into 48
bits. The next thing we do is we compute
-
an XOR with the round key.
Sometimes people say that cryptographers
-
only compute XORs. This is an example of
that, where, well, we just do XORs in this
-
function. And then comes the magic of DES,
where, actually, these 48 bits are broken
-
into eight groups of six bits. Six, seven,
eight. And so let me draw, and then what
-
happens is, so yes. Each one of these,
each one of these wires is, six bits. And
-
then they, they go into what, what are
called S boxes. And I'll explain S boxes
-
in just a minute. The S boxes are kind of
the smarts of, DES. And then, the S
-
box is basically a map, six bits to four
bits. So, the outputs of the S boxes are
-
these four bits. They're collected. This
gives up 32 bits, right? Eight groups of
-
four bits, gives us 32 bits and then
finally this is fed into yet another
-
permutation which just maps the bits
around. So, for example bit number one
-
will go to bit number nine, bit number two
will go to bit number fifteen and so on.
-
So it just permutes the 32 bits around and
that's the final 32 bit output of this F function.
-
Okay? So by using different
round keys, essentially, we get different
-
round functions, and that's how we form
the sixteen round functions of DES. Now,
-
the only thing that, left to specify are
these S boxes. So the S boxes, literally,
-
are just functions from six bits to four
bits. They are just implemented as a look
-
up table. Right? So describing a function
from six bits to four bits basically
-
amounts to writing the output of the
function on all two to the six possible inputs.
-
Two to the six is 64. So we just
have a table that literally contains 64 values.
-
Where each value is four bits. So
here is an example, this happens to be the
-
fifth S box, and you see that this is a
table that contains 64 values right?
-
It's four by sixteen. So, 64 values. For
example, if you wanna look at, the output,
-
that corresponds to 0-1-1-0-1-1. Okay, then
you look at these two bits. This is 01,
-
and these four bits are 1101, and you see
that the output is 1001. The four bits
-
output 1001. Okay, so the S boxes are just
implemented as these tables.
-
Now the question is, how are these S boxes chosen.
How are these tables actually chosen by
-
the designers of this. So to give you some
intuitions for that, lets start with a
-
very bad choice for S boxes. So imagine
the S boxes were linear. What do I mean by
-
that? I mean that imagine that these six
bit inputs literally were just
-
XORed with one another in different
ways to produce the four bit outputs.
-
Okay, another way of writing that is that
we can write the S box as a matrix vector product.
-
So here you have the matrix Ai.
And the vector, the six bit vector X.
-
And you can see that, if we write this matrix
vector product, basically, we take the
-
inner product of this vector with the
input vector. Remember, these are all bits.
-
So the six bits vector inner
product. Another six bit vector, and we do
-
that modulo two, you realize, basically,
what we're computing is X2 XOR X3.
-
Right? Because only position two and
position three have 1's in it.
-
And similarly the next, inner product will
produce X1 XOR X4 XOR X5 and so on and
-
so forth. Okay. So you can literally see
that if the S boxes are implemented this
-
way. Then, all they do, is just apply the
matrix A to the input vector X. Which is
-
why we say, that in this case the S boxes
are completely linear. Now, I claimed that
-
in fact that if the S boxes were linear, then DES
would be totally insecure. The reason is,
-
if the S boxes are linear, then all that
DES does is just compute XOR of various
-
things and permute and shuffle bits
around. So it's just XORs and bit
-
permutations, which means that as a
result, all of DES is just a linear
-
function. In other words, there will be a
Matrix B. Of these dimensions. Basically,
-
it's a matrix B that has width 832.
Basically what I will do is I will write
-
the 64 bit message plus the sixteen round
keys as one long vector. Alright, so the
-
message is 64 bits and there are sixteen
round keys. Each one is 48 and that, if
-
you do the math, it's basically 832. Okay?
So I write these values, the keys and the
-
message, as one long vector and then there
will be this matrix that essentially when
-
you compute these matrix vector products.
Essentially you get the different bits of
-
the ciphertext. So there's 64 of these
rows and as a result, you get 64 bits of
-
ciphertext. Okay, so this is what it
means for DES to be linear. So if you
-
think a little bit about this, you realize
that the S boxes are the only nonlinear
-
part of DES. So if the S boxes were
linear, then the entire circuit is linear
-
and therefore can be expressed as this
matrix. Now if that's the case then DES
-
would be terrible as a secure pseudorandom
permutation. And let me give you a very
-
simple example. Basically if you did the
XOR of three outputs of DES, well
-
let's think what that means. Basically we
would be looking at B times, the matrix B
-
that defines DES, times, one vector
XOR B times another vector,
-
XOR B times a third vector. We
could take the B out of the parentheses so
-
we'd be basically doing B times this
vector over here. But of course K XOR K XOR K,
-
this is just K. And so if you
think about what that means, basically we
-
just got back DES of K at the point M1 XOR M2 XOR M3. But this means that now DES
-
has this horrible relation. That can be
tested. Right? So, basically, if you
-
XOR the output of three values,
M1, M2, M3, you'll get the value of
-
DES, at the point M1 XOR M2 XOR M3. Now this
is not a relation that is going to hold
-
for a random function. A random function
is not going to satisfy this equality.
-
And so you get a very easy test to tell you
that DES is not a random function.
-
In fact, maybe you can take that as a small
exercise. It's not even difficult to see,
-
that given enough input output pairs, you
can literally recover the entire secret key.
-
Yeah. You just need 832 input output
pairs, and you'll be able recover the
-
entire secret key. And so if the S boxes
were linear, DES would be completely
-
insecure. It turns out, actually, even if
the S boxes were close to being linear. In
-
other words, the S boxes were linear most
of the time. So maybe for 60 out of the 64
-
inputs, the S boxes were linear. It turns
out that would also be enough to break
-
DES, and we're gonna see why later on. In
particular, if you choose the S boxes at
-
random, it turns out they'll tend to be
somewhat close to linear functions. As a
-
result, you'll be able to totally break
DES. You'll just be able to recover the
-
key, in basically, very little time. And
so, the designers of DES actually
-
specified a number of rules they use for
choosing the S boxes. And it's not
-
surprising, the first rule is that these
functions are far away from being linear.
-
Okay. So, in other words, there is no
function that agrees with a large fraction
-
of the outputs of the S box. And then
there are all these other rules, for
-
example, there are exactly four to one
maps, right? So, every output has exactly
-
four pre-images and so on and so forth. So
we understand now why they chose the S
-
boxes they way they did and in fact its
all done to defeat certain attacks on DES.
-
Okay. So that's the end of the
description of DES, and in the next few
-
segments we are going to look at the
security of DES.