Trompler Foundation Archives
 
 

The Prisoner's Dilemma

General Model of the Two-Player Game

 The Prisoner's Dilemma model, as presented by Robert Axelrod,
 Douglas Hofstadter, and others (see References at the end),
 goes as follows:
 
 
 Two prisoners, let's call them Joe and Sam, are being held for
 trial.  They are being held in separate cells with no means of
 communication.  The prosecutor offers each of them a deal, and
 he discloses to each that the same deal has been offered to
 the other.  The deal is this:
 
 
 
 a) If you will confess that the two of you committed the crime
 and the other guy denies it, we will let you go free and send
 him up for five years.
 
 b) If you both deny the crime, we have enough circumstantial
 evidence to put both of you away for two years.
 
 c) If both of you confess to the crime, then you'll both get
 four-year sentences.
 
 
 
 Put yourself in Joe's position.  If Sam stays mum, you get
 zero years by singing but two years by staying mum yourself.
 If Sam confesses, you get four years by confessing but five
 years by keeping quiet.  Whatever Sam does, it is to your
 advantage to admit your wrongdoing.  Of course, Sam is also a
 rational person and he will, therefore, come to the same
 conclusion.  So you both end up confessing, which nets a total
 of 8 man-years in the pokey.  The paradox is that if you had
 both denied the crime, a total of only 4 man-years would be
 spent behind bars.
 
 Wait a minute!  Can it really be that rationality leads to an
 inferior result?  Let's look at this one more time, using a
 payoff matrix, a common tool of game theorists.
 
 The payoff matrix is usually presented in the following form
 (payoffs are years in prison, written as negative numbers):
 
      ACTION                        PAYOFF
  Joe       Sam               Joe        Sam
 Cooperate  Cooperate         -2 (R)     -2 (R)
 Cooperate  Defect            -5 (S)      0 (T)
 Defect     Cooperate          0 (T)     -5 (S)
 Defect     Defect            -4 (P)     -4 (P)
 
 (The letters are the standard terminology for each payoff:

     R       Reward for mutual cooperation
     S       Sucker's payoff
     T       Temptation to defect
     P       Punishment for mutual defection)
 
 The general form of the Prisoner's Dilemma model requires that
 the preference ranking of the four payoffs be, from best to
 worst, T, R, P, S (that is, T > R > P > S) and that R be
 greater than the average of T and S (2R > T + S).  Any
 situation that meets these conditions is a "Prisoner's
 Dilemma".  The matrix above qualifies: T = 0, R = -2, P = -4,
 S = -5, so T > R > P > S, and R = -2 is greater than the
 average of T and S, which is -2.5.
 
 In summary, the Prisoner's Dilemma model postulates a
 condition in which the rational action of each individual is
 to not cooperate (that is, to defect), yet, if both parties
 act rationally, each party's reward is less than it would have
 been if both had acted irrationally and cooperated.
 
 The model can be applied to many real world situations, from
 genetics to business transactions to international politics.
 

Iterated "Prisoner's Dilemma" with multiple participants
 
 
 If the game is played only once, there is no incentive for
 either player to do anything but defect, as discussed above.
 In fact, if the game is to be played a known, fixed number of
 rounds, there is still no better choice than to defect.  (Why?
 Because you both know you will defect on the last move.  That
 puts you in the same situation on the next-to-last move, and
 so on all the way back to the first.)  But if the game is to
 be played an indefinite number of times, then under certain
 conditions cooperation will evolve as the best policy.
 
 Another addition that makes the game more realistic is to
 assume that each player interacts with a multitude of other
 players.  It is further assumed that each player remembers the
 history of his interactions with each of the other players,
 and that this history is the only information he has.
 
 The Iterated "Prisoner's Dilemma" has been the subject of much
 study and computer simulation (see References).  An interesting
 and possibly useful result of these studies is that a player's
 best strategy in this "game" is "Tit for Tat", with the
 additional proviso that the player be initially cooperative.
 That is, "I'll start off being nice, but from that point on,
 whatever you do to me, I will do to you on the next
 interaction."  In Axelrod's computer tournaments this strategy
 proved clearly more productive than "The Golden Rule"!
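 A minimal simulation sketch (in Python, with illustrative
 strategy names that are not from the references) shows how Tit
 for Tat fares against an unconditional defector and against
 "The Golden Rule" (unconditional cooperation):

     # Iterated Prisoner's Dilemma, using the T, R, P, S payoffs
     # from the matrix above ("C" = cooperate, "D" = defect).
     PAYOFF = {("C", "C"): (-2, -2), ("C", "D"): (-5,  0),
               ("D", "C"): ( 0, -5), ("D", "D"): (-4, -4)}

     def tit_for_tat(mine, theirs):
         # Cooperate first; then copy the opponent's last move.
         return "C" if not theirs else theirs[-1]

     def always_defect(mine, theirs):
         return "D"

     def golden_rule(mine, theirs):
         return "C"

     def play(a, b, rounds=100):
         hist_a, hist_b, score_a, score_b = [], [], 0, 0
         for _ in range(rounds):
             move_a = a(hist_a, hist_b)
             move_b = b(hist_b, hist_a)
             pay_a, pay_b = PAYOFF[(move_a, move_b)]
             score_a, score_b = score_a + pay_a, score_b + pay_b
             hist_a.append(move_a)
             hist_b.append(move_b)
         return score_a, score_b

     for rival in (always_defect, golden_rule, tit_for_tat):
         print("tit_for_tat vs", rival.__name__, "->",
               play(tit_for_tat, rival))

 Against an unconditional defector, Tit for Tat is exploited
 only on the first round; against a cooperator, or another Tit
 for Tat, it cooperates throughout.  That combination is what
 makes it so productive when it meets a mixture of strategies.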
 
 Note that we have been discussing multiple participants whose
 interactions take place between pairs of "actors".  There is
 yet another, more complex situation in which an individual
 interacts with ALL of the other participants at once.  This
 situation, which is more common in the real world, is called
 the "Many-Person Dilemma" or "Voter's Paradox".  See the
 companion essay, "Voter's Paradox", at this and other sites.
 
 
 
 Author: Leon Felkins
 
 Email: leonf@perspicuity.net
 
 Written: 10/13/95
 



References:
 
 1. Axelrod, Robert. 1984. The Evolution of Cooperation. New
    York: Basic Books.
 2. Hofstadter, Douglas R. 1983. "Metamagical Themas: Computer
    Tournaments of the Prisoner's Dilemma Suggest How
    Cooperation Evolves."  Scientific American 248 (no. 5): 16-26.
 3. On the Internet: http://pespmc1.vub.ac.be/PRISDIL.html.
    Author: F. Heylighen.  Date: Apr 13, 1995 (modified).
 4. Several other essays are available on the Internet; just do
    a search on "Prisoner's Dilemma".