Attacking non-atomic decryption (10 min)

  • 0:00 - 0:04
    In the last segment, we looked at a
    padding oracle attack that completely
  • 0:04 - 0:08
    breaks an authenticated encryption system.
    I hope this attack convinces you that you
  • 0:08 - 0:11
    shouldn't implement authenticated
    encryption on your own, because you might end
  • 0:11 - 0:16
    up exposing yourself to a padding oracle
    attack or a timing attack or any other
  • 0:16 - 0:20
    such attack. Instead you should be using
    standards like GCM or any other of the
  • 0:20 - 0:24
    standardized authenticated encryption
    modes as implemented in many crypto
  • 0:24 - 0:28
    libraries. In this segment, I'm gonna show
    you another very clever attack on an
  • 0:28 - 0:32
    authenticated encryption system. And I
    hope after you see this attack, you'll be
  • 0:32 - 0:35
    completely convinced not to invent and
    implement your own authenticated
  • 0:35 - 0:40
    encryption systems. But instead, always
    use one of the standard schemes, like GCM
  • 0:40 - 0:44
    or others. So this particular attack that
    I want to show you is an attack on the SSH
  • 0:44 - 0:49
    binary packet protocol. So SSH is a
    standard secure remote shell application
  • 0:49 - 0:54
    that uses a protocol between a client
    and the server. It has a key exchange
  • 0:54 - 0:59
    mechanism, and once the two sides exchange keys,
    SSH uses what's called the binary packet
  • 0:59 - 1:04
    protocol to send messages back and forth
    between the client and the server. Now
  • 1:04 - 1:10
    here is how SSH works, so recall that SSH
    uses what we called encrypt-and-MAC. Okay
  • 1:10 - 1:14
    so technically what happens is every SSH
    packet begins with a sequence number, and
  • 1:14 - 1:18
    then the packet contains the packet
    length, the length of the CBC pad, the
  • 1:18 - 1:24
    actual payload follows, then the CBC pad
    follows. Now this whole red block here is
  • 1:24 - 1:30
    CBC encrypted, also with a chained IV, so
    this is also vulnerable to the CPA attacks
  • 1:30 - 1:34
    that we discussed before. But
    nevertheless, this whole red packet is
  • 1:34 - 1:39
    encrypted using CBC encryption. And then
    the entire clear text packet is MAC-ed.
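As a rough illustration of the layout just described, here is a minimal Python sketch that assembles a plaintext packet and computes the MAC over the sequence number and the plaintext. The helper names `build_plain_packet` and `mac_over_plaintext`, the 16-byte block size, the all-zero padding, and the choice of HMAC-SHA256 are assumptions made for this sketch, and the CBC encryption step is left out entirely.

```python
import hashlib
import hmac
import struct

BLOCK = 16  # assumed cipher block size (one AES block)

def build_plain_packet(payload: bytes) -> bytes:
    """Hypothetical helper: packet length || pad length || payload || pad."""
    # 4 bytes of packet length and 1 byte of pad length precede the payload.
    pad_len = BLOCK - (5 + len(payload)) % BLOCK
    if pad_len < 4:  # keep at least four pad bytes
        pad_len += BLOCK
    packet_len = 1 + len(payload) + pad_len  # excludes the length field itself
    # Zero bytes stand in for the pad here; a real sender would use random bytes.
    return struct.pack(">IB", packet_len, pad_len) + payload + bytes(pad_len)

def mac_over_plaintext(mac_key: bytes, seq: int, plain: bytes) -> bytes:
    """Encrypt-and-MAC: the tag covers the sequence number and the plaintext."""
    msg = struct.pack(">I", seq) + plain
    return hmac.new(mac_key, msg, hashlib.sha256).digest()

packet = build_plain_packet(b"ls -la")
tag = mac_over_plaintext(b"demo-mac-key", 0, packet)
```

In this sketch the tag depends only on the plaintext, which is exactly why sending it in the clear can expose information about the plaintext.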
  • 1:39 - 1:43
    And the MAC is sent in the clear, along
    with the packet. So I want you to remember
  • 1:43 - 1:49
    that the MAC is computed over plain text
    packets, and then the MAC is sent in the
  • 1:49 - 1:54
    clear. This is what we call encrypt-and-MAC.
    And we said that this is not a good
  • 1:54 - 1:58
    way to do things, because MACs have no
    confidentiality requirements. And by sending
  • 1:58 - 2:03
    the MAC of the clear text in the clear,
    you might be exposing information about
  • 2:03 - 2:06
    the clear text. But this is not the
    mistake that I want to show you here. I
  • 2:06 - 2:10
    want to show you a much more clever attack.
    So first, let's look at how decryption
  • 2:10 - 2:16
    works in SSH. So what happens is, first of
    all, the server decrypts the encrypted
  • 2:16 - 2:22
    packet length field only. So it only
    decrypts these particular first few bytes.
  • 2:22 - 2:26
    Then it will go ahead and read from the
    network, as many bytes as specified in the
  • 2:26 - 2:31
    packet length field. It's gonna decrypt the
    remaining cipher text blocks using CBC
  • 2:31 - 2:36
    decryption. Then, once it's recovered the
    entire SSH packet, it will go ahead and
  • 2:36 - 2:41
    check the MAC of the plain text, and
    report an error if the MAC happens to be
  • 2:41 - 2:46
    invalid. Now the problem here is that the
    packet length field is decrypted and then
  • 2:46 - 2:50
    used directly to determine the length of
    the packet before any authentication has
  • 2:50 - 2:55
    taken place. In fact, it's not possible to
    verify the MAC of the packet length field
  • 2:55 - 2:59
    because we haven't recovered the entire
    packet yet and as a result we cannot check
  • 2:59 - 3:04
    the MAC. But nevertheless the protocol uses
    the packet length before verifying that the MAC
  • 3:04 - 3:10
    is valid. So it turns out this introduces a
    very, very cute attack. And I'm only
  • 3:10 - 3:13
    gonna describe a very simplified version
    of this attack, just to get the idea
  • 3:13 - 3:17
    across. So here's the idea. Suppose the
    attacker intercepted a particular cipher
  • 3:17 - 3:23
    text block, namely the direct AES
    encryption of the message block M. And now
  • 3:23 - 3:27
    he wants to recover this M. And I
    emphasize that this intercepted
  • 3:27 - 3:31
    cipher text is only one block long.
    It's one AES block. So here's what the
  • 3:31 - 3:37
    attacker is gonna do. Well, he's gonna
    send a packet to the server that starts
  • 3:37 - 3:41
    off as normal. It basically starts off
    with a sequence number and then he's going
  • 3:41 - 3:46
    to inject his captured cipher text as the
    first cipher text block that's sent to the
  • 3:46 - 3:51
    server. Now, what is the server going to
    do? The server is gonna decrypt the first
  • 3:51 - 3:57
    few bytes of this first AES block and he's
    going to interpret the first few bytes as
  • 3:57 - 4:02
    the length field of the packet. The next
    thing that's gonna happen is, the server
  • 4:02 - 4:07
    is gonna expect this many bytes, before it
    checks that the MAC is valid. And so what
  • 4:07 - 4:12
    the attacker is gonna do, is, he's gonna
    feed the server one byte at a time. So the
  • 4:12 - 4:15
    server will read one byte, and then
    another byte, and then another byte.
  • 4:15 - 4:20
    Eventually, the server will read as many
    bytes as the length field specifies, at
  • 4:20 - 4:25
    which point, it will check that the MAC is
    valid. And of course, the attacker was
  • 4:25 - 4:29
    just feeding the server junk bytes. And as
    a result, the MAC is not gonna verify, and
  • 4:29 - 4:34
    the server will send a MAC error. But
    notice what happened here: the
  • 4:34 - 4:38
    attacker was counting how many bytes it
    sent to the server. And so it knows
  • 4:38 - 4:43
    exactly how many bytes were sent at the
    time that it receives the MAC error from
  • 4:43 - 4:48
    the server. So that tells it that the
    first 32 bits of the decryption of its
  • 4:48 - 4:54
    cipher text C are exactly equal to the
    number of bytes that were sent before it
  • 4:54 - 4:57
    saw the MAC error. So this is a very
    clever attack. So let me say it one more
  • 4:57 - 5:03
    time to make sure this is clear. So again,
    the attacker has a one block cipher text C
  • 5:03 - 5:07
    that it wants to decrypt and let's pretend
    that when C is decrypted the 32 most
  • 5:07 - 5:12
    significant bits of the plain text happen
    to be the number five. In this case, what
  • 5:12 - 5:17
    the attacker will see, is the following
    behavior. The server is gonna decrypt the
  • 5:17 - 5:22
    challenge block C and he's gonna obtain
    the number five as the length field. So,
  • 5:22 - 5:27
    now, the attacker is gonna feed the server
    one byte at a time and after the attacker
  • 5:27 - 5:32
    feeds the server five bytes the server
    says, hey, I've just recovered the entire
  • 5:32 - 5:36
    packet, let me check the MAC. The MAC is
    almost certainly invalid, and then it will
  • 5:36 - 5:41
    send a bad MAC error. So after five bytes
    are read off the network the attacker is
  • 5:41 - 5:45
    gonna see a bad MAC error and then the
    attacker learns that the most significant
  • 5:45 - 5:52
    32 bits of the decrypted block is equal to
    the number five. So there you go: you just
  • 5:52 - 5:57
    learned the 32 most significant bits of the
    decryption of C. So this is a very significant attack,
  • 5:57 - 6:02
    because the attacker just learned 32 bits
    of the decrypted cipher text block. And
  • 6:02 - 6:06
    since he can apply this attack to any
    cipher text block he wants, he can
  • 6:06 - 6:11
    basically learn the first 32 bits of the decryption of
    every cipher text block in a very long message.
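The simplified attack above can be simulated in a few lines of Python. This is only a sketch under stated assumptions: `toy_encrypt_block` and `toy_decrypt_block` are a toy Feistel construction standing in for AES, `ToyServer` and `recover_length` are hypothetical names, and the MAC check is modeled as a comparison against a placeholder tag that junk input will not match.

```python
import hashlib
import hmac
import struct

def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT AES, illustration only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

class ToyServer:
    """Hypothetical server that trusts the length field before checking the MAC."""

    def __init__(self, enc_key: bytes, mac_key: bytes):
        self.enc_key, self.mac_key = enc_key, mac_key
        self.expected = 0
        self.body = bytearray()

    def start_packet(self, first_block: bytes) -> None:
        plain = toy_decrypt_block(self.enc_key, first_block)
        # The flaw: the decrypted length is used with no authentication.
        self.expected = struct.unpack(">I", plain[:4])[0]
        self.body = bytearray()

    def feed(self, byte: int):
        self.body.append(byte)
        if len(self.body) < self.expected:
            return None  # still waiting for more bytes
        tag = hmac.new(self.mac_key, bytes(self.body), hashlib.sha256).digest()
        # Simplification: the attacker never supplies a valid tag, so the
        # computed tag is compared against an all-zero placeholder.
        ok = hmac.compare_digest(tag, bytes(32))
        return "ok" if ok else "MAC error"

def recover_length(server: ToyServer, captured: bytes, limit: int = 1 << 20) -> int:
    """Attacker: count junk bytes sent until the server reports a MAC error."""
    server.start_packet(captured)
    for sent in range(1, limit + 1):
        if server.feed(0x00) == "MAC error":
            return sent
    raise RuntimeError("no MAC error within the byte limit")

enc_key, mac_key = b"toy-enc-key", b"toy-mac-key"
secret_block = struct.pack(">I", 5) + b"twelve bytes"
captured = toy_encrypt_block(enc_key, secret_block)
leaked = recover_length(ToyServer(enc_key, mac_key), captured)
```

Here `leaked` comes out as 5, the first 32 bits of the decrypted block, even though the attacker never sees the key. With CBC, the plaintext block is this decrypted value XORed with the previous cipher text block, which the attacker already knows.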
  • 6:11 - 6:16
    So what just happened here? Well, there
    are actually two things that were wrong in
  • 6:16 - 6:19
    this design. The first one is that the
    decryption operation is non-atomic. In
  • 6:19 - 6:25
    other words, the decryption algorithm
    doesn't take a whole packet as input, and
  • 6:25 - 6:30
    respond with a whole plain text as output,
    or with the word reject. Instead, the
  • 6:30 - 6:34
    decryption algorithm partially decrypts
    the cipher text, namely to obtain the
  • 6:34 - 6:39
    length field, and then it waits to recover
    as many bytes as needed and then it
  • 6:39 - 6:44
    completes the decryption process. So these
    non-atomic decryption operations are fairly
  • 6:44 - 6:48
    dangerous, and generally, they should be
    avoided. In this example, this non-atomic
  • 6:48 - 6:53
    decryption happens to break authenticated
    encryption. The other problem that
  • 6:53 - 6:57
    happened is that the length field was used
    before it was properly authenticated. And
  • 6:57 - 7:01
    this is another thing that should never be
    done. So a decrypted field should never
  • 7:01 - 7:05
    be used before the field is actually
    authenticated. So let me ask you, if you
  • 7:05 - 7:09
    had the option of redesigning SSH what is
    the minimal change that you would need to
  • 7:09 - 7:14
    make SSH resist this attack? And let me
    tell you that multiple answers might be
  • 7:14 - 7:18
    correct. One option is to send the length
    field in the clear, just as in the case of
  • 7:18 - 7:23
    TLS. In this case, there's no opportunity
    for an attacker to mount a chosen cipher
  • 7:23 - 7:27
    text attack, because, well, the length
    field is never decrypted. And so there's
  • 7:27 - 7:32
    no decryption taking place that the attacker
    can abuse. Replacing encrypt-and-MAC
  • 7:32 - 7:36
    by encrypt-then-MAC doesn't have any
    impact because this attack would apply
  • 7:36 - 7:40
    either way. The problem is that the length
    field is used before it's authenticated
  • 7:40 - 7:44
    and that would have to happen either way.
    So a better mode of encryption doesn't
  • 7:44 - 7:48
    actually help. Another option is to MAC
    the length field separately so that now
  • 7:48 - 7:53
    the server can read the length field,
    check that the MAC for just the length
  • 7:53 - 7:57
    field is valid, and then it would know how
    many subsequent bytes to read before
  • 7:57 - 8:01
    checking the MAC on the entire packet.
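The option of MACing the length field separately might look roughly like the following Python sketch. The names `tag_length_field`, `send_header`, and `read_header`, the `b"len"` domain-separation prefix, and the use of HMAC-SHA256 are assumptions for illustration. The 4-byte field is treated as opaque, so the sketch does not decide whether the length itself is encrypted.

```python
import hashlib
import hmac
import struct

def tag_length_field(mac_key: bytes, seq: int, field: bytes) -> bytes:
    """Hypothetical helper: a tag covering only the sequence number and length field."""
    msg = b"len" + struct.pack(">I", seq) + field
    return hmac.new(mac_key, msg, hashlib.sha256).digest()

def send_header(mac_key: bytes, seq: int, field: bytes) -> bytes:
    # The header is the 4-byte length field followed by its own tag.
    return field + tag_length_field(mac_key, seq, field)

def read_header(mac_key: bytes, seq: int, header: bytes) -> bytes:
    """Release the length field only after its own tag verifies; otherwise reject."""
    field, tag = header[:4], header[4:]
    if not hmac.compare_digest(tag, tag_length_field(mac_key, seq, field)):
        raise ValueError("bad length MAC")
    # Only now is it safe to act on the field: read that many bytes, then
    # check the MAC on the entire packet.
    return field

key = b"demo-mac-key"
header = send_header(key, 7, struct.pack(">I", 48))
length = struct.unpack(">I", read_header(key, 7, header))[0]
```

A forged or replayed first block then fails at `read_header`, before the server commits to waiting for any attacker-chosen number of bytes.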
    The last option is actually one that
  • 8:01 - 8:05
    works, but is terribly inefficient, and it
    would expose the server to a denial of
  • 8:05 - 8:09
    service attack, so I'm not going to mark
    it as a valid answer. So the main lesson
  • 8:09 - 8:14
    to remember is, don't implement or design
    your own authenticated encryption system.
  • 8:14 - 8:19
    Just use the standards like GCM. But if
    for some reason, you can't use the
  • 8:19 - 8:23
    standards, and you have to implement your
    own authenticated encryption system, then
  • 8:23 - 8:28
    use encrypt-then-MAC. And make sure not
    to repeat the mistakes of the last two
  • 8:28 - 8:33
    segments, namely don't use a length field
    before the length field is authenticated.
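For illustration only, here is a minimal Python sketch of encrypt-then-MAC with atomic decryption: the receiver either returns the whole plaintext or rejects, and nothing is decrypted before the tag verifies. The names `seal` and `open_packet`, the fixed-length nonce, and the SHA-256 counter keystream standing in for a real cipher are all assumptions, and the sketch presumes the full packet has already been framed and received.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size

def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy SHA-256 counter-mode keystream, a stand-in for a real cipher.
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], pad))
    return bytes(out)

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then MAC the nonce and cipher text."""
    ct = _keystream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return ct + tag

def open_packet(enc_key: bytes, mac_key: bytes, nonce: bytes, packet: bytes) -> bytes:
    """Atomic decryption: return the whole plaintext or raise, nothing in between."""
    if len(packet) < TAG_LEN:
        raise ValueError("reject")
    ct, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("reject")  # no part of ct is decrypted or used
    return _keystream_xor(enc_key, nonce, ct)

nonce = bytes(12)  # assumed: fixed 12-byte nonce, unique per packet
packet = seal(b"enc-key", b"mac-key", nonce, b"hello, server")
plaintext = open_packet(b"enc-key", b"mac-key", nonce, packet)
```

Because the tag is checked over the cipher text first, a modified packet is rejected without the receiver ever acting on a decrypted field.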
  • 8:33 - 8:37
    And more generally, don't use any
    decrypted field before that field is
  • 8:37 - 8:41
    authenticated. Okay so this is the end of
    our discussion of authenticated
  • 8:41 - 8:45
    encryption. I wanted to point out a couple
    of papers on authenticated encryption that
  • 8:45 - 8:50
    you could use as further reading. The
    first one is a very cute one on the order
  • 8:50 - 8:54
    of encryption and authentication that talks
    about whether one should do encrypt-then-MAC
  • 8:54 - 8:57
    or MAC-then-encrypt and it
    shows that one is correct and one is
  • 8:57 - 9:00
    incorrect. It's a good read and there's a
    lot of information in that paper. The
  • 9:00 - 9:04
    second discusses OCB mode, which is a very
    efficient way of building authenticated
  • 9:04 - 9:09
    encryption. In particular, it looks at a
    variant of OCB with associated data as we
  • 9:09 - 9:14
    discussed when we described OCB. The last
    three papers are attack papers. The first
  • 9:14 - 9:18
    one describes the padding oracle attack
    that we discussed in the last segment.
  • 9:18 - 9:23
    This one here describes the length attack
    that we just described in this segment.
  • 9:23 - 9:29
    And the last one describes a number of
    attacks on encryption schemes that just provide CPA
  • 9:29 - 9:34
    security without adding integrity. So this
    last paper actually provides a number of
  • 9:34 - 9:39
    good examples for why CPA security by
    itself should never, ever, ever be used
  • 9:39 - 9:43
    for encryption. Remember the only thing
    you're allowed to use is authenticated
  • 9:43 - 9:47
    encryption for confidentiality. Or if all
    you need is integrity with no
  • 9:47 - 9:50
    confidentiality then you just use a MAC.