Why Prisoners' Dilemma is Not a Newcomb Problem

SORITES ISSN 1135-1349
http://www.sorites.org
Issue #17 -- October 2006. Pp. 81-84
Copyright © by P. A. Woodward and SORITES
Why Prisoners' Dilemma is Not a Newcomb Problem
(It's Not Even Two Newcomb Problems Side by Side.)
P. A. Woodward

In a brief discussion, David Lewis argues that «[c]onsidered as puzzles about rationality, or disagreements between two concepts thereof...» Newcomb's Problem is «...one and the same problem...» (Lewis, p. 235) as that posed by the Prisoners' Dilemma. [Footnote 1] «Prisoners' Dilemma,» he claims, «is a Newcomb Problem -- or rather, two Newcomb Problems side by side, one per prisoner» (Lewis, p. 235). Lewis concludes his discussion with the following paragraph:

: Some have fended off the lessons of Newcomb's Problem by saying: «Let us not have, or let us not rely on, any intuitions about what is rational in goofball cases so unlike the decision problems of real life.» But Prisoners' Dilemmas are deplorably common in real life. They are the most down-to-earth versions of Newcomb's Problem now available (Lewis, p. 240).

This paragraph makes it clear that Lewis thinks that we can learn what is rational in «common» Prisoners' Dilemma decision situations by studying Newcomb's Problem.

To show that the Prisoners' Dilemma (PD) is a (pair of) Newcomb's Problems (NPs), Lewis lists the three elements which, «in a nutshell,» capture the «decision problem» faced by each prisoner in the PD (Lewis, p. 236). Two of the three elements are, he claims, the same for the decision problem faced by the agent in the NP; only one of the elements differs, and he attempts to show that it differs only in its «inessential trappings» (Lewis, p. 235). Only slight modifications to the unshared element, all of them confined to those inessential trappings, are required to show that the two problems are the same problem. It is assumed, in Lewis's discussion and in most discussions of PD and NP, that each prisoner in the PD and the player in the NP is an egoist. Thus, what counts as «rational» is what maximizes the prisoner's or the player's own self-interest.

Let's consider Lewis's modifications to the decision element not shared by the puzzles. The three decision elements in the PD are as follows:

(1) I [one of the prisoners] am offered a thousand [dollars] -- take it or leave it. [Taking it amounts to confessing in usual presentations of PDs.]

(2) Perhaps also I will be given a million [dollars]; but whether I will or not is causally independent of what I do now. Nothing I can do now will have any effect on whether or not I get my million. [In typical PDs, a reduced sentence is analogous to getting a million dollars.]

(3) I will get my million if and only if you [the other prisoner] do not take your thousand [dollars] (Lewis, p. 236).

Lewis thinks that (1) and (2) are shared by the agent in a Newcomb Problem, but that (3) is replaced by

(3') I [the agent] will get my million [dollars] if and only if it is predicted that I do not take my thousand [dollars] (Lewis, p. 236).

Lewis points out that it is inessential whether the prisoners (in PD) choose simultaneously or one after the other (assuming that no information regarding the choices is shared until after the choices have been made). It is likewise inessential whether the prediction (in NP) is made before, during, or after the one faced with the problem (the agent) decides to take his thousand dollars or forego it, again assuming that the prediction about the agent plays no role in the agent's decision (see Lewis, pp. 236-237).

Lewis also points out that it is inessential to the NP that «...any prediction...should actually take place. It is enough that some potentially predictive process should go on, and that whether I get my million is somehow made to depend on the outcome of that process» (Lewis, p. 237). Thus, he claims, NP is characterized by (1), (2), and

(3'') I will get my million if and only if a certain potentially predictive process (which may go on before, during, or after my choice) yields the outcome which could warrant a prediction that I do not take my thousand (Lewis, p. 237).

But, claims Lewis, that «potentially predictive process» could be a replica of the agent (i.e., Lewis himself); thus (3'') is correctly replaced by

(3''') I will get my million if and only if my replica does not take his thousand (Lewis, p. 238).

The Predictor in NPs is usually taken to be nearly perfect at making this type of prediction; thus a replica of the agent that matches the agent perfectly in respects relevant to the decision «...will have more predictive power than a less perfect replica...» (Lewis, p. 238). But, as Lewis points out, a nearly perfect predictor is not necessary to the problem. «The disagreement between conceptions of rationality that gives the problem its interest arises when the reliability of the predictor, as estimated by the agent, is quite poor -- indeed even when the agent judges that the predictive process will do little better than chance» (Lewis, p. 238).

The «replica» may be simply another person placed in a situation similar to that of the agent facing the NP. Thus, (3''') can be replaced by

(3) I will get my million if and only if you [i.e., the other person in a similar situation] do not take your thousand (Lewis, p. 239).

Thus, argues Lewis, «[i]nessential trappings aside, Prisoners' dilemma is a version of Newcomb's Problem, quod erat demonstrandum» (Lewis, p. 239).

Some who discuss Newcomb's Problem, claims Lewis, think it rational to decline the thousand if the predictive process is reliable enough -- and some who discuss Prisoners' Dilemma think it rational not to take the thousand they are offered, if the two partners are enough alike. These are, according to Lewis, two statements of the same view, a view often labeled «expected utility.» In following this strategy the agent maximizes his expected utility. Lewis himself thinks it rational to take the thousand in both problems because, no matter what the predictor does, and no matter what his partner does, he would be better off than if he didn't take it -- better off by a thousand dollars. This is the «dominant strategy.» Following a dominant strategy, if there is one, the agent performs that action which will maximize his utility whatever the other agent does (or whichever possible situation turns out to be actual).
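The contrast between the two rules can be made concrete with a small sketch. It is an illustration only, not something found in Lewis's text: it assumes that utility is linear in dollars and that a single number, here called reliability, gives the probability that the predictive process (or the partner) matches the agent's own choice. All function and parameter names are hypothetical.

```python
THOUSAND = 1_000
MILLION = 1_000_000


def payoff(take_thousand: bool, get_million: bool) -> int:
    """Dollar outcome under elements (1)-(3): an optional thousand plus an optional million."""
    return (THOUSAND if take_thousand else 0) + (MILLION if get_million else 0)


def taking_dominates() -> bool:
    """Dominance reasoning: compare the two acts with the million held fixed."""
    return all(payoff(True, m) > payoff(False, m) for m in (True, False))


def expected_utility(take_thousand: bool, reliability: float) -> float:
    """Expected-utility reasoning, conditional on the act chosen.

    Assumption (not from the source): `reliability` is the probability that
    the process or partner matches the agent's choice, whichever choice it is.
    The million is paid iff the process indicates "does not take".
    """
    p_million = (1 - reliability) if take_thousand else reliability
    return (p_million * payoff(take_thousand, True)
            + (1 - p_million) * payoff(take_thousand, False))
```

On these assumptions, with reliability 0.9 declining has an expectation of $900,000 against roughly $101,000 for taking, even though taking is better by exactly a thousand dollars in each column of the payoff table. At reliability 0.5 the expectations are $500,000 and $501,000, so the two rules agree.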

Since, thinks Lewis, in NPs the dominant strategy seems to be (more) clearly what is rational for an egoist, and since the two problems are (according to Lewis) the same problem, we ought to accept the dominant strategy as being more rational in the all-too-common PDs we run into in real life; this is the claim that Lewis's last paragraph (quoted above) amounts to.

But notice, in typical presentations of NPs the predictor is said to be very good, and evidence of that success is drawn from the fact that he (or it) has been nearly flawless in the past when he (it) has made similar predictions. As Nozick (in an early discussion of NPs) describes the predictor

: you know that this being has often correctly predicted your choices in the past (and has never, so far as you know, made an incorrect prediction about your choices), and furthermore you know that this being has often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation to be described below [i.e., a familiar Newcomb Problem situation]. [Footnote 2]

Given Nozick's telling of the story, which is consistent with Lewis's version, the NP is really the last of a series of NPs. The PD, as usually described, and as Lewis describes it, is taken to be a single playing of the game. The partners in the PD are usually given a one-time opportunity to make a decision, the results of which will involve more or less jail time. Described as the last of a series of NPs, the game has a unique character in which the dominant strategy has a certain strong appeal; and that appeal may transfer over to a single play of the PD.

Lewis has noted that PDs are «deplorably common in real life» (Lewis, p. 240, quoted above). Robert Axelrod has claimed that the

: Prisoner's Dilemma is simply an abstract formulation of some very common and very interesting situations in which what is best for each person individually leads to mutual defection, whereas everyone would have been better off with mutual cooperation. [Footnote 3]

Such situations include international trade situations in which two nations must decide whether to erect trade barriers; situations in which legislators must decide whether to support other legislators' bills; situations in which corporate executives must decide whether to cooperate with other executives; and situations in which manufacturing companies (with but one competitor) must decide on a price for their products (see Axelrod, Part I). Gregory Kavka has noticed that situations in which two nations must decide whether to maintain an expensive and dangerous arsenal of nuclear weapons, or to disarm, can be described as a PD. [Footnote 4] Virtually any situation in which an individual must decide how much to exploit, for personal gain, a commonly held resource (such as pasture land, the air, rivers, etc.) can be described as a PD.

What is common to these «real life» PD situations is that they are not single play games. They are situations in which the PD is faced numerous times, or an indefinite number of times, by each player, often with the same «partner.» They are «iterated PDs.» [Footnote 5]

It's clear that in an iterated NP, in which it is not known how many times the agent will face the decision, if the predictor is a little better than chance at making the prediction, the agent is better off foregoing the thousand each time he plays than he would be by taking it each time. Moreover, it is clear that the better the predictor is, the better off the agent is by foregoing the thousand (on each play) compared with how well off he would be by taking it (on each play). The agent would do better by taking his thousand only if he could correctly predict when the predictor will be wrong. [Footnote 6] Thus, following the dominant strategy, which was so attractive to Lewis, is not in the player's best interest in iterated PDs. Hence, the common PDs are not (one play) NPs -- not even two such NPs side by side. Furthermore, if the NP were «iterated» the situation would not be similar to the situation faced by someone playing iterated PD because, although in PD both players have a rational egoistic rank ordering of possible outcomes and can gain mutual benefit by cooperating with each other, the predictor in NP does not have such a rank ordering of possible outcomes, and it's not clear what counts as benefiting the predictor -- he seems not to be an egoist. The best that we can do as far as figuring out the predictor's preference ordering is to conclude that he prefers that the agent get something rather than nothing, but he prefers that the agent not get everything. Whether the agent gets $1 million or $1 thousand does not seem to matter to him, as long as the agent does not get $1,001,000. We should not, therefore, be too quick to follow the dominant strategy in «common» iterated PD situations.
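The arithmetic behind the iterated-NP claim can be sketched as follows. The sketch rests on assumptions the text does not spell out: the predictor's accuracy is the same on every play and for either choice, plays are independent, and the agent adopts one policy for all plays. The function names are hypothetical.

```python
THOUSAND = 1_000
MILLION = 1_000_000


def per_play_expectation(take_thousand: bool, accuracy: float) -> float:
    """Expected dollars on one play, given the predictor's assumed accuracy."""
    # The million is paid iff the predictor foresaw "does not take".
    p_million = (1 - accuracy) if take_thousand else accuracy
    return p_million * MILLION + (THOUSAND if take_thousand else 0)


def break_even_accuracy() -> float:
    """Accuracy at which both policies have the same expectation.

    Solves accuracy * M = (1 - accuracy) * M + T for accuracy,
    giving 1/2 + T / (2 * M).
    """
    return 0.5 + THOUSAND / (2 * MILLION)


def advantage_of_foregoing(accuracy: float, plays: int) -> float:
    """Expected gain over `plays` rounds from always foregoing versus always taking."""
    return plays * (per_play_expectation(False, accuracy)
                    - per_play_expectation(True, accuracy))
```

Under these assumptions the break-even accuracy is 0.5005, so «a little better than chance» has to mean at least slightly above that figure. At an accuracy of 0.51 over 100 plays, always foregoing comes out ahead by about $1.9 million in expectation, and the gap widens as accuracy rises.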

P. A. Woodward
East Carolina University
<woodwardp [at] ecu.edu>