Prisoners' Dilemma and Nash Equilibrium

  • 0:01 - 0:06
    On the same day, police have made two arrests that at first seem unrelated.
  • 0:06 - 0:09
    They arrest a gentleman named Alan.
  • 0:09 - 0:11
    They caught him red-handed selling drugs.
  • 0:11 - 0:13
    So it's an open-and-shut case.
  • 0:13 - 0:17
    And on the same day they catch a gentleman named Bill,
  • 0:17 - 0:20
    and he is also caught red-handed dealing drugs.
  • 0:20 - 0:23
    And they bring them separately to the police station
  • 0:23 - 0:26
    and they tell them, "look, this is an open-and-shut case
  • 0:26 - 0:28
    you're going to get convicted for drug dealing
  • 0:28 - 0:29
    and you're going to get two years."
  • 0:29 - 0:31
    And they tell this to each of them individually.
  • 0:31 - 0:34
    They were selling the same type of drugs; it just happened to be that way.
  • 0:34 - 0:35
    But they were doing it completely independently.
  • 0:35 - 0:40
    Two years for drugs is what's going to happen,
  • 0:40 - 0:42
    assuming nothing else.
  • 0:42 - 0:44
    But then the District Attorney has the chance
  • 0:44 - 0:46
    to chat with each of these gentlemen separately.
  • 0:46 - 0:49
    And while he's chatting with them, he reinforces the idea that
  • 0:49 - 0:51
    this is an open-and-shut case for the drug dealing.
  • 0:51 - 0:54
    They're each going to get 2 years if nothing else happens.
  • 0:54 - 0:56
    But then he starts to realize that
  • 0:56 - 0:59
    these 2 characters look like...
  • 0:59 - 1:01
    He starts to have a suspicion for whatever reason
  • 1:01 - 1:03
    that these were the 2 characters that actually committed
  • 1:03 - 1:06
    a much more serious offence, that they had committed
  • 1:06 - 1:09
    a major armed robbery a few weeks ago.
  • 1:09 - 1:13
    And all the District Attorney has to go on
  • 1:13 - 1:18
    is his hunch, his suspicion. He has no hard evidence.
  • 1:18 - 1:20
    So what he wants to do is try to get a deal
  • 1:20 - 1:23
    with each of these guys, so that they have an incentive
  • 1:23 - 1:25
    to, essentially, snitch on each other.
  • 1:25 - 1:27
    So what he tells each of them is
  • 1:27 - 1:29
    "look, you're gonna get two years for drug dealing,
  • 1:29 - 1:33
    that's kind of guaranteed". But he says
  • 1:33 - 1:45
    "look, if you confess, and the other doesn't,
  • 1:45 - 1:50
    then you will get 1 year
  • 1:50 - 1:56
    and the other guy will get 10 years".
  • 1:56 - 2:01
    So he's telling Al, "look, we caught Bill too just randomly today,
  • 2:01 - 2:05
    if you confess that it was you and Bill who performed that armed robbery,
  • 2:05 - 2:08
    your term is actually going down from 2 years to 1 year.
  • 2:08 - 2:11
    But Bill is obviously going to have to spend a lot more time in jail,
  • 2:11 - 2:14
    especially because he's not cooperating with us,
  • 2:14 - 2:16
    he's not confessing".
  • 2:16 - 2:19
    But then the other statement is also true:
  • 2:19 - 2:28
    If you deny and the other confesses
  • 2:28 - 2:30
    now it switches around.
  • 2:30 - 2:33
    You will get 10 years because you're not cooperating,
  • 2:33 - 2:38
    and the other, your co-conspirator, will get a reduced sentence,
  • 2:38 - 2:41
    will get the 1 year. So this is like telling Al
  • 2:41 - 2:43
    "look, if you deny that you were the armed robber
  • 2:43 - 2:45
    and Bill snitches you out,
  • 2:45 - 2:48
    then you're gonna get 10 years in prison
  • 2:48 - 2:50
    and Bill is only going to get 1 year in prison".
  • 2:50 - 2:58
    And if both of you essentially confess, both confess,
  • 2:58 - 3:03
    you will both get 3 years.
  • 3:03 - 3:06
    So this scenario is called "The Prisoner's Dilemma".
  • 3:06 - 3:08
    Because we'll see in a second
  • 3:08 - 3:10
    there is a globally optimal scenario for them
  • 3:10 - 3:15
    where they both deny, and they both get 2 years.
  • 3:15 - 3:17
    But we'll see, based on their incentives,
  • 3:17 - 3:20
    assuming they don't have any unusual loyalty to each other,
  • 3:20 - 3:22
    and these are, you know, these are hardened criminals here.
  • 3:22 - 3:24
    They're not brothers or related to each other in any way.
  • 3:24 - 3:26
    They don't have any kind of loyalty pact.
  • 3:26 - 3:30
    We'll see that they will rationally pick a non,
  • 3:30 - 3:33
    or they might rationally pick a non-optimal scenario.
  • 3:33 - 3:35
    And to understand that I'm going to draw something
  • 3:35 - 3:39
    called the "pay-off matrix", a pay-off matrix.
  • 3:39 - 3:42
    So let me do it right here for Bill.
  • 3:42 - 3:50
    So Bill has two options, he can confess to the armed robbery
  • 3:50 - 3:52
    or he can deny that he had anything,
  • 3:52 - 3:55
    that he knows anything about the armed robbery.
  • 3:55 - 3:57
    And Al has the same two options.
  • 3:57 - 4:04
    Al can confess and Al can deny.
  • 4:04 - 4:06
    And since it's called the pay-off matrix,
  • 4:06 - 4:11
    let me draw some grids here.
  • 4:11 - 4:13
    And let's think about all of the different scenarios
  • 4:13 - 4:15
    and what the pay-offs would be.
  • 4:15 - 4:19
    If Al confesses and Bill confesses then they're in scenario 4,
  • 4:19 - 4:26
    they both get 3 years in jail, they both would get
  • 4:26 - 4:30
    3 for Al, and 3 for Bill.
  • 4:30 - 4:36
    Now, if Al confesses and Bill denies
  • 4:36 - 4:39
    then we are in scenario 2, from Al's point of view,
  • 4:39 - 4:43
    Al is only going to get 1 year,
  • 4:43 - 4:48
    but Bill is going to get 10 years.
  • 4:48 - 4:49
    Now if the opposite thing happens,
  • 4:49 - 4:51
    that Bill confesses and Al denies
  • 4:51 - 4:53
    then it goes the other way around.
  • 4:53 - 4:55
    Al is going to get 10 years for not cooperating and
  • 4:55 - 4:59
    Bill is going to have a reduced sentence of 1 year for cooperating.
  • 4:59 - 5:06
    And if they both deny, they're in scenario 1, where
  • 5:06 - 5:09
    they're both just going to get their time for the drug dealing.
  • 5:09 - 5:16
    So Al would get 2 years and Bill would get 2 years.
  • 5:16 - 5:18
    Now I alluded to this earlier in the video:
  • 5:18 - 5:22
    what is the globally optimal scenario for them?
  • 5:22 - 5:23
    Well, it's this scenario, where
  • 5:23 - 5:26
    they both deny having anything to do with the armed robbery,
  • 5:26 - 5:29
    then they both get 2 years.
  • 5:29 - 5:31
    But what we'll see is that it is actually somewhat rational,
  • 5:31 - 5:34
    assuming that they don't have any strong loyalties to each other,
  • 5:34 - 5:36
    a strong level of trust with the other party,
  • 5:36 - 5:40
    to not go there, it's actually rational for both of them to confess.
  • 5:40 - 5:43
    And both of them confessing is actually a "Nash equilibrium".
  • 5:43 - 5:45
    And we'll talk more about this.
  • 5:45 - 5:49
    But a Nash equilibrium is where each party has picked a choice
  • 5:49 - 5:52
    given the choices of the other party.
  • 5:52 - 5:56
    So when we think of, or each party's picked the optimal choice
  • 5:56 - 6:01
    given the choices of, or given whatever choice the other party picks.
  • 6:01 - 6:03
    And so from Al's point of view he says, well look,
  • 6:03 - 6:07
    I don't know whether Bill is confessing or denying,
  • 6:07 - 6:10
    so let me, let's say he confesses, what's better for me to do?
  • 6:10 - 6:13
    If he confesses and I confess, then I get 3 years.
  • 6:13 - 6:16
    If he confesses and I deny, I get 10 years.
  • 6:16 - 6:19
    So if he confesses it's better for me to confess as well.
  • 6:19 - 6:23
    So this is a preferable scenario to this one down here.
  • 6:23 - 6:26
    Now I don't know that Bill confessed, he might deny.
  • 6:26 - 6:30
    If I assume Bill denied, is it better for me to confess
  • 6:30 - 6:33
    and get 1 year, or deny and get 2 years?
  • 6:33 - 6:36
    Well once again, it's better for me to confess.
  • 6:36 - 6:39
    And so, regardless of whether Bill confesses or denies,
  • 6:39 - 6:43
    the optimal choice for Al to pick,
  • 6:43 - 6:46
    taking into account Bill's choices, is to confess.
  • 6:46 - 6:49
    If Bill confesses, Al's better off confessing.
  • 6:49 - 6:51
    If Bill denies, Al's better off confessing.
  • 6:51 - 6:53
    Now we look at it from Bill's point of view,
  • 6:53 - 6:54
    and it's completely symmetric.
  • 6:54 - 6:59
    Bill says, well, I don't know if Al's confessing or denying.
  • 6:59 - 7:02
    If Al confesses, I can confess and get 3 years,
  • 7:02 - 7:04
    or I can deny and get 10 years.
  • 7:04 - 7:06
    Well, 3 years in prison is better than 10,
  • 7:06 - 7:09
    so I would go for the 3 years,
  • 7:09 - 7:11
    if I know Al is confessing.
  • 7:11 - 7:14
    But I don't know that Al's definitely confessing, he might deny.
  • 7:14 - 7:18
    If Al's denying, I could confess and get 1 year,
  • 7:18 - 7:20
    or I could deny and get 2 years.
  • 7:20 - 7:24
    Well, once again, I would want to confess and get the 1 year.
  • 7:24 - 7:28
    So for Bill, taking into account each of the choices that Al might make,
  • 7:28 - 7:33
    it's always better for him to confess.
  • 7:33 - 7:35
    And so this is interesting.
  • 7:35 - 7:39
    They're rationally deducing that they should get to this scenario,
  • 7:39 - 7:41
    this Nash equilibrium state,
  • 7:41 - 7:44
    as opposed to this globally optimal state.
  • 7:44 - 7:47
    They're both getting 3 years by both confessing
  • 7:47 - 7:49
    as opposed to both of them getting 2 years by both denying.
  • 7:49 - 7:54
    The problem with this one, where both deny, is that it is an unstable state.
  • 7:54 - 7:58
    If one of them assumes that the other one has,
  • 7:58 - 7:59
    if one of them assumes that
  • 7:59 - 8:01
    they're somehow in that state temporarily.
  • 8:01 - 8:05
    They say "well, I can always improve my scenario
  • 8:05 - 8:08
    by changing what I wanna do".
  • 8:08 - 8:10
    If Al thought that Bill was definitely denying
  • 8:10 - 8:14
    Al can improve his circumstance by moving out of that state
  • 8:14 - 8:16
    and confessing and only getting 1 year.
  • 8:16 - 8:20
    Likewise, if Bill had thought that maybe Al is likely to deny
  • 8:20 - 8:24
    he realizes that he can optimize by moving in this direction
  • 8:24 - 8:26
    instead of denying and getting 2 and 2
  • 8:26 - 8:28
    he could move in that direction right over there.
  • 8:28 - 8:31
    So this is an unstable optimal scenario,
  • 8:31 - 8:34
    but this Nash equilibrium, this state right over here
  • 8:34 - 8:37
    is actually very, very, very stable.
  • 8:37 - 8:41
    If they assume... this is, it's better for each of them to confess
  • 8:41 - 8:43
    regardless of what the other one does,
  • 8:43 - 8:47
    and assuming all of the other actors have chosen their strategy,
  • 8:47 - 8:50
    there's no incentive for Bill.
  • 8:50 - 8:53
    So... assuming everyone else has chosen their strategy,
  • 8:53 - 8:58
    you can only move in that direction, if you're Bill you can either...
  • 8:58 - 9:01
    you can go from the Nash equilibrium of confessing to denying,
  • 9:01 - 9:04
    but you're worse off, so you won't wanna do that.
  • 9:04 - 9:06
    Or you could move in this direction,
  • 9:06 - 9:08
    where it would be Al changing his decision.
  • 9:08 - 9:11
    But once again that gets a worse outcome for Al,
  • 9:11 - 9:13
    you're going from 3 years to 10 years.
  • 9:13 - 9:16
    So this is the equilibrium state, the stable state,
  • 9:16 - 9:18
    where both people would pick something
  • 9:18 - 9:20
    that is not optimal globally.
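
The reasoning in the video can be checked mechanically. What follows is a minimal sketch, not part of the original video: it writes down the pay-off matrix using the sentences quoted above (3 and 3, 1 and 10, 10 and 1, 2 and 2), then looks for states where neither Al nor Bill can do better by changing only his own choice. The names `YEARS`, `best_response`, `nash_equilibria` and `lowest_total` are made up for this illustration, and "better" is assumed to mean simply "fewer years in prison".

```python
from itertools import product

ACTIONS = ("confess", "deny")

# Years in prison, keyed by (Al's action, Bill's action).
# Each value is (Al's years, Bill's years), as stated in the transcript.
YEARS = {
    ("confess", "confess"): (3, 3),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("deny", "deny"): (2, 2),
}


def best_response(player, other_action):
    """Action that minimizes `player`'s years, given the other's action.

    player 0 is Al, player 1 is Bill (an indexing choice for this sketch).
    """
    def years(action):
        if player == 0:
            profile = (action, other_action)
        else:
            profile = (other_action, action)
        return YEARS[profile][player]

    return min(ACTIONS, key=years)


def nash_equilibria():
    """States where each choice is a best response to the other's choice."""
    return [
        (al, bill)
        for al, bill in product(ACTIONS, repeat=2)
        if best_response(0, bill) == al and best_response(1, al) == bill
    ]


def lowest_total():
    """State with the fewest total years (the 'globally optimal' one)."""
    return min(YEARS, key=lambda profile: sum(YEARS[profile]))


if __name__ == "__main__":
    for other in ACTIONS:
        print(f"If Bill chooses {other}, Al's best response is",
              best_response(0, other))
    print("Nash equilibria:", nash_equilibria())
    print("Lowest total years:", lowest_total())
```

Under these assumptions the only state that survives the check is both confessing, while the state with the fewest total years is both denying, which is the gap between the equilibrium and the globally optimal outcome that the video describes.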
Title:
Prisoners' Dilemma and Nash Equilibrium
Description:

The classical exposition of the Prisoner's Dilemma, as a way to introduce the concept of Nash equilibrium

Video Language:
English
Duration:
09:21
