No Way Out: Negotiation And The Prisoner’s Dilemma

April 2, 2007

The prisoner’s dilemma is often expressed as a game played on a computer, but we see its ramifications in all aspects of living in society. The essential question asked by the prisoner’s dilemma is this: Can people be naturally cooperative, or do our individual genes require a selfish response to life situations? This question is of interest to mediators, as variations of it are played out in all mediations.

The crux of the argument that individuals are essentially motivated by self-interest is expressed in the famous statement of Thomas Hobbes – life is “a war of all against all.” Hobbes had a bleak view of human existence, because he saw the war of all against all as the natural state of mankind, with no other natural outcome but, as he put it: “…the life of man, solitary, poor, nasty, brutish and short.” The same idea is expressed in the statement attributed to Heraclitus 2,500 years ago: “War is the father and ruler of all things.” The acceptance of this condition has for centuries justified our systems of laws and hierarchies of authority, which counteract the natural tendency of human beings to revert to the anarchic state described by Hobbes. The view of man as essentially self-interested also finds support in the great work of Adam Smith, Wealth of Nations, published the same year as the Declaration of Independence. Smith takes the view that each person is motivated by self-interest, yet postulates that in a society based on “natural liberty,” even though each individual pursues his own rational self-interest, the results will be beneficial for the commonwealth as a whole, as though the vast workings of countless individuals pursuing self-interest were regulated by an invisible hand.

Further support for the view that human beings are essentially motivated by self-interest is found in modern studies into genetics. The modern view is that human beings are motivated by a desire to propagate their own individual genes, and that there is nothing in the genetic make-up of human beings that points to any kind of evolution of a cooperative gene. [Dawkins: The Selfish Gene, but see Ridley: The Origins of Virtue, for a different perspective.]

Gene or no gene, many people are uncomfortable with the idea that we are fundamentally selfish. When Adam Smith was writing in the second half of the 18th century, the idea of an invisible hand may have seemed attractive, but today, in our more rigorously scientific civilization, we are not willing to accept the idea of such a hand mysteriously regulating our affairs to ensure that, in spite of our individual selfishness, everything comes out all right in the end. Yet on the other hand, our daily experience in living is one of cooperation as well as competition, and the theory of the selfish gene does not appear able to account for the observed and experienced reality that we are also cooperative beings.

Some social scientists have sought to use the power of computers to find a way out of the prisoner’s dilemma – to find a formula pursuant to which it is always better to cooperate – because of discomfort with the idea that we are prisoners of the imperatives of our own genes.

The prisoner’s dilemma can be expressed positively and negatively. It is expressed negatively for prisoners caught by the police. The two prisoners are regarded as having a choice: to cooperate with each other, by not talking to the police, or to defect from each other and start cooperating with the police. It should be borne in mind that the first one to talk routinely gets a better deal from prosecuting authorities than suspects who stay silent. So this is not just a computer game; it is something that police suspects are forced to experience every time they get caught with fellow suspects.

In the version which is usually taught, and also played on computers, the two prisoners are offered the following choices. If they both cooperate with each other, i.e. stay silent, each will be sentenced on a lesser charge to one year in jail; this is called the reward, the reward for cooperation with each other. If they both choose to defect, that is to say they both talk to the police, then they each get three years in jail; this is the punishment. So at this point it is easy to see that it is better to cooperate than to defect. But they are in different rooms, with no means of communicating. The game, which for many is a reality, becomes more interesting when one cooperates and the other defects. In this instance, the first to defect, i.e. talk to the police, gets off free by testifying against the co-conspirator. This is called the temptation. The second prisoner, who refuses to talk to the police, i.e. stays silent, gets five years, which is called the sucker’s payoff.

In real life, often the temptation to get off free is only available to the first person to defect, and so both suspects are faced with the temptation and pressure to be the first to defect. You have the choice of being a rat or a sucker, but if a rat, you need to be first.

This game can also be played for positive points, in which case the numbers are simply reversed. Two players who cooperate with each other get three points each, which is their reward for cooperating. Two defectors get one point each, which is their punishment for defecting. If one cooperates and the other defects, the cooperator gets the sucker’s payoff, zero points, and the defector gets five points, which is the temptation.

Expressed in this mathematical way, whether positively or negatively, it is obvious that in every case it is better to defect than to cooperate, even though it may seem that cooperation is the better course. The reason it is better to defect is clear simply by looking at the numbers. If the other person cooperates, you get five points by defecting but only three by cooperating; if the other person defects, you get one point by defecting but nothing by cooperating. Adding up the two cases, defection is worth six points – one point if both defect, five points if only you defect – whereas cooperation is worth at most three. So unless you know in advance what the other person is going to do, the only safe course of action is always to choose defection. This is the prisoner’s dilemma. If you alter the numbers, e.g. ten points for cooperation, then of course everyone would always cooperate, but the experience of life is that cooperation produces slower but more certain long-term gain, while defection works better in the short run.
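The arithmetic in the positive-points version can be checked in a few lines. What follows is only a minimal sketch: the point values are the ones given above, while the names PAYOFF, best_reply and totals are invented here for illustration.

```python
# Points for "me", indexed by (my_move, other_move).
# "C" = cooperate, "D" = defect; values are the positive-points version in the text.
PAYOFF = {
    ("C", "C"): 3,  # reward for mutual cooperation
    ("D", "D"): 1,  # punishment for mutual defection
    ("C", "D"): 0,  # sucker's payoff
    ("D", "C"): 5,  # temptation
}


def best_reply(other_move):
    """Return the move that earns the most points against a known move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, other_move)])


def totals():
    """Add up each move's points over both things the other player might do."""
    return {mine: sum(PAYOFF[(mine, other)] for other in "CD") for mine in "CD"}


for other in "CD":
    print(f"if the other player plays {other}, my best reply is {best_reply(other)}")
print("points added over both cases:", totals())
```

Defection is the best reply to either move, which is what makes it the safe choice, even though two cooperators (three points each) end up better off than two defectors (one point each).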

The prisoner’s dilemma arises because of lack of communication. If the parties to the game or problem are perfectly synchronized with each other, they will see that if playing the game over and over again, cooperation will result in a better outcome over the long run, thus the mafia law of silence. But the temptation to obtain a short-term advantage is always present, and the difficulty of obtaining the level of trust necessary to achieve cooperation is always present as well.

Where two people are involved, say man and wife, or two partners, a high level of cooperation may be achieved. What if there are fifteen people? The matter becomes exponentially more problematic. What if 500 people are involved, or a small town, or a large country? How does one achieve cooperation on that kind of scale? It is a basic problem of civilization. When we get into very large numbers, involving populations, one sees how immensely difficult it is to achieve the benefits of cooperation versus the short-term benefits of pursuing one’s own narrow interest.

That is why the saying goes that all politics is local. This is another way of saying that, when it comes right down to it, people will vote their own immediate interests, people will vote their pocketbooks, people will vote what is good for them personally in their particular small sphere, rather than looking at the rest of the country, or the world, as a whole.

Obviously, this has immediate practical ramifications for mediators, because the challenge is always to get disputing parties to reach a level of cooperation so that they can achieve settlement.

There are lots of versions of the prisoner’s dilemma. One is called the “wolf’s dilemma,” first suggested by Hofstadter, in which a number of people are separated, each with their finger on a button, and the game is this: The first person to push the button gets $100 and everyone else gets nothing, but if each person sits without pushing the button for a given period of time, say 20 minutes, they each get $500. The dilemma here is again one of trust. The temptation to defect is made less appealing because the temptation to cooperate is five times as attractive, provided one can be certain that no one else is going to get the certain, though lesser, money by pushing the button first.
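How much trust the wolf’s dilemma demands can be put into rough numbers. The sketch below rests on assumptions that are not part of the game as described: every other player is taken to hold back independently with the same probability, called trust here, and pushing is treated as a certain $100. The function names are invented for illustration.

```python
PUSH_PRIZE = 100  # from the text: the first person to push gets $100
HOLD_PRIZE = 500  # from the text: everyone gets $500 if nobody pushes


def expected_hold(players, trust):
    """Expected winnings from holding back.

    Assumption of this sketch: each of the other players independently
    holds back with probability `trust`, so the $500 is paid only if all
    of them do.
    """
    return HOLD_PRIZE * trust ** (players - 1)


def should_hold(players, trust):
    """Compare holding back with a (simplified) certain $100 for pushing."""
    return expected_hold(players, trust) > PUSH_PRIZE


for players, trust in [(2, 0.5), (20, 0.95), (20, 0.90), (500, 0.99)]:
    verdict = "hold" if should_hold(players, trust) else "push"
    print(f"{players} players, trust {trust}: {verdict}")
```

On these assumptions two players need only modest trust in each other, while twenty players need each to be trusted almost completely, and with 500 players even 99 percent trust in each individual is not enough.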

Social scientists have been playing with the prisoner’s dilemma for many years, and their efforts have intensified since the wide introduction of personal computers. One of the original realizations was that prisoners’ dilemmas in one form or another are experienced in daily life. Any situation in which you can cut an immediate deal for yourself, even though everyone would be better off if all could be counted on to cooperate, is a prisoner’s dilemma. Another way of experiencing it is to realize that if everyone acts for herself, the results will be very much worse than if people cooperated, and yet there is still the temptation to be one of the (few) winners rather than one of the (many) suckers.

An example sometimes given is that if everyone could be trusted not to steal cars or rob department stores, then insurance rates would be cheaper for everyone, costs would go down, the general benefit for all would be increased. And yet it does not take too many defectors to create a situation in which everyone has to lock their cars, in which department stores have to have security guards with the corresponding increase in the cost of goods.

The prisoner’s dilemma is experienced on a continuous basis with respect to our environment. Over-fishing is an enormous problem, as is the extinction of species, or the overgrowing of crops, or cutting down the rain forest – all of these produce immediate gain for the persons doing it, but an overall loss for everyone else. If anyone could be sure that other people would not be selfish, then one might forgo the individualism for which we collectively pay a heavy price. Yet we live in a society in which individualism is prized as an ultimate value. This has been called the ‘tragedy of the commons’ – the ‘commons’ only succeeds if no one abuses it, and it is nearly always abused. The effort of some political philosophies is to develop systems in which self-interest can somehow be made to produce a favorable result for the common interest.

On a macro level, one can see that communism was a utopian attempt to create a society in which, as Marx expressed it in his Critique of the Gotha Programme, each gives according to his ability, and each receives according to his need. That utopian vision elevates cooperation as a supreme value, but in practice, as we all know, it failed utterly. Why? Because ‘some are more equal than others’; the bosses stole everything. Capitalism can be seen as an attempt to acknowledge that each individual is motivated by self-interest, and yet to create a system in which the common good can also be preserved. Yet experience has shown that unrestrained capitalism can lead to monopoly and oppression. There is no substitute for balance, and the balance is between cooperation and competition.

One of the chief variables in any system is time. Often there is tension or conflict between short-term and long-term interest. It is very well to live in a society devoted to consumption, but if in doing so one sacrifices one’s children’s education, which requires large capital investment and long-term vision, then the outcome two to three generations hence will be poor.

Scientists and mathematicians have sought to use computers and computer games to find a computerized, mathematical solution out of the bleak conclusion of the prisoner’s dilemma that selfishness is the only rational option for a human being.

The prisoner’s dilemma may be regarded as a branch of game theory. Game theory seeks to discover strategies for success in situations in which the best option depends upon what other people do, and in which it may not be easy to know in advance what anyone else is going to do. The goal of game theory is to find a formula, a strategy that will work in all circumstances, even though the variable is what other people in the situation will do. The mathematician John Nash won a Nobel Prize for developing the Nash Equilibrium, which is a mathematical way of expressing the optimal response to what other people are doing or going to do, even though that optimal response may not be entirely satisfactory. In other words, the Nash Equilibrium often requires one to make the best of poor circumstances, and this very often occurs in the negotiation situation where one of the parties is in a weak position.

Take for example a game invented by Hammerstein and Selten, in which A and B must share a given sum of money with each other. A gets to play first, and the decision he needs to make is whether the money will be shared equally, or whether he will take the larger share. B plays second and he must decide on the total amount of money that will be shared. Thus, A chooses the split, and B chooses the total amount. The second rule is that if A chooses a 50:50 split, he gets half and B gets half, but incredibly unfairly, if A does not share the money equally, he gets his split, say 90:10, multiplied by ten. Unfairness pays big for A, provided B behaves rationally and chooses not to punish A for being unfair. If A decides to take a larger share, he is then at the mercy of B, who decides the amount of the total pot to be shared. There is nothing B can do about the split once A has chosen to take the larger share. What is the rational thing for B to do? On the one hand, he may want to stick it to A for being unfair, but if he does, he punishes himself as well as A. B’s rational choice is to choose the largest possible sum of money, because thereby he gets more, even though it grates on him that A has behaved in this way. This is the Nash Equilibrium, i.e. for A to play unfair and for B to play high. This is not the perfect outcome for B, because he must put up with the emotional anguish of seeing A succeed, but rationally it is the best of a bad job. However, as we all know, in practice people in numerous instances will not choose the rational option, but will prefer a revenge option based on emotion.
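The reasoning in this game can be traced by working backwards from B’s choice. The following is a sketch under stated assumptions, not the game as Hammerstein and Selten specified it: the two pot sizes (10 and 100) are hypothetical, only the 50:50 and 90:10 splits come from the text, and all names are invented for illustration.

```python
# Hypothetical pot sizes that B can choose between (not given in the text).
POTS = {"low": 10, "high": 100}
# Percentage shares for (A, B); the 50:50 and 90:10 splits are from the text.
SPLITS = {"fair": (50, 50), "unfair": (90, 10)}


def payoffs(split, pot):
    """Return (A's money, B's money) for a given split and pot size."""
    a_percent, b_percent = SPLITS[split]
    return POTS[pot] * a_percent // 100, POTS[pot] * b_percent // 100


def b_best_reply(split):
    """B moves second and picks the pot that gives B the most money."""
    return max(POTS, key=lambda pot: payoffs(split, pot)[1])


def a_best_move():
    """A moves first, anticipating B's rational reply to each split."""
    return max(SPLITS, key=lambda split: payoffs(split, b_best_reply(split))[0])


def equilibrium():
    """Backward induction: A's best split, then B's best reply to it."""
    split = a_best_move()
    return split, b_best_reply(split)


print("equilibrium:", equilibrium())
print("B plays high after an unfair split:", payoffs("unfair", "high"))
print("B takes revenge with a low pot:   ", payoffs("unfair", "low"))
```

With these assumed numbers, revenge costs A most of his gain but also cuts B’s own money from 10 to 1, which is why the rational outcome is for A to play unfair and for B to play high.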

When the prisoner’s dilemma is played between two people a great many times on a computer, it is found that they prove to be very willing to cooperate. Even though they see the advantage in making a quick killing at the other’s expense, when they know that they are going to play again and again, the value of cooperation outweighs the value of a quick advantage. Cooperation pays in the long term, which is how societies hold together over time. Yet in many circumstances in modern life, we are not playing the same game over and over again with the same people. Particularly in mediation, we are playing with someone whom we are never going to see again, and the temptation not to cooperate is very great; this is a bias that in almost every situation the mediator needs to overcome. The challenge is always how to get this pair of people or group of people to cooperate long enough to get this particular situation resolved.
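The difference between playing once and playing repeatedly can be seen with simple arithmetic on the points used earlier (three for mutual cooperation, five for the temptation, one for mutual defection). The sketch assumes, as a simplification, that a partner who has been cheated once defects for the rest of the game; the function names are invented for illustration.

```python
REWARD, PUNISHMENT, TEMPTATION = 3, 1, 5  # points per round, from the text


def steady_cooperation(rounds):
    """Total points when both players cooperate in every round."""
    return REWARD * rounds


def quick_killing(rounds):
    """Total points from defecting on a cooperator in round one.

    Assumption of this sketch: the partner retaliates by defecting in
    every later round, so the rest of the game is mutual defection.
    """
    return TEMPTATION + PUNISHMENT * (rounds - 1)


for rounds in (1, 2, 3, 10):
    print(rounds, "rounds:", steady_cooperation(rounds), "vs", quick_killing(rounds))
```

In a single encounter the quick killing wins, five points to three. By the third round cooperation is ahead, and over ten rounds it earns more than twice as much.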

The search of the computer scientists to find a stable strategy, in which the advantages of playing that strategy always produce a viable result whether or not anyone else in the game is playing a different strategy, proved in practice to be exceedingly difficult, leading eventually to the conclusion that there is no foolproof formula.

A biologist called John Maynard Smith wanted to find out why, in the wild, animals generally do not fight to the death. They tend to choose other strategies, such as to submit or to leave the scene. Maynard Smith invented a game that he called Hawk and Dove, in which the hawk corresponds to the defector in the prisoner’s dilemma, and the dove corresponds to the cooperator. If hawk meets dove, hawk wins, but if hawk meets another hawk, hawk can be badly wounded. This is important in the wild, because animals that are badly wounded have a poor survival prognosis. Although hawk always beats dove, if dove meets dove then the outcome is positive, but more interestingly, if dove meets hawk over and over again, then the qualities of dove start to improve dove’s chances, particularly if dove can learn to change from dove to hawk when the occasion demands – such a dove is called a ‘retaliator,’ and in other contexts a ‘shape shifter.’
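The Hawk and Dove game is usually written with two numbers: the value of the contested resource and the cost of a serious wound. The sketch below uses that standard textbook form; the particular values (2 and 10) and the function names are hypothetical and do not come from the account above.

```python
from fractions import Fraction


def hawk_dove_payoffs(value, cost):
    """Payoff to the first animal, in the standard textbook form."""
    v, c = Fraction(value), Fraction(cost)
    return {
        ("hawk", "hawk"): (v - c) / 2,  # half the time win, half the time wounded
        ("hawk", "dove"): v,            # hawk takes everything
        ("dove", "hawk"): Fraction(0),  # dove retreats unhurt with nothing
        ("dove", "dove"): v / 2,        # doves share without a fight
    }


def average_payoff(strategy, hawk_share, payoffs):
    """Average payoff when a fraction `hawk_share` of opponents are hawks."""
    p = Fraction(hawk_share)
    return p * payoffs[(strategy, "hawk")] + (1 - p) * payoffs[(strategy, "dove")]


def stable_hawk_share(value, cost):
    """Hawk fraction at which hawks and doves do equally well (value/cost)."""
    return min(Fraction(1), Fraction(value, cost))


payoffs = hawk_dove_payoffs(value=2, cost=10)  # hypothetical numbers
share = stable_hawk_share(2, 10)
print("stable share of hawks:", share)
print("hawk average:", average_payoff("hawk", share, payoffs))
print("dove average:", average_payoff("dove", share, payoffs))
```

When wounds cost more than the prize is worth, a population of pure hawks does badly against itself, and the two strategies balance at a mixture, here one hawk in five.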

When social scientists started playing the prisoner’s dilemma on computers, they discovered something surprising. Computers, which are entirely rational, started cooperating in circumstances in which it seemed irrational. In 1979, Robert Axelrod, a political scientist, asked for submissions of a number of programs in circumstances in which each program would play another program numerous times, and he set it up so that it would be possible to determine which program produced the winning strategy. The astonishing thing was that the cooperative programs tended to do well, and the winner was the simplest and most cooperative of all.

That winning program was submitted by a Canadian political scientist called Anatol Rapoport, and it was called Tit-for-Tat. Tit-for-Tat starts cooperatively, and then simply follows what the other person did last time. In other words, if cooperation is met by defection or aggression, then Rapoport’s program will retaliate. But if cooperation is met by cooperation, then the Rapoport program continues to cooperate. So whereas the habitual cooperator is likely to get thwarted, the cooperator who is willing to turn into a retaliator, that is, a dove that is willing to turn into a hawk and then back into a dove, wins the game.
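The rule is short enough to write out. The sketch below is not Rapoport’s actual program; it is a minimal illustration of the rule as described, using the positive-points payoffs given earlier, with function names invented here.

```python
# (move_a, move_b) -> (points for a, points for b); "C" cooperate, "D" defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy whatever the other player did last time."""
    return their_history[-1] if their_history else "C"


def always_defect(my_history, their_history):
    return "D"


def always_cooperate(my_history, their_history):
    return "C"


def play(strategy_a, strategy_b, rounds):
    """Play a repeated game; return both move sequences and both totals."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        history_a.append(move_a)
        history_b.append(move_b)
    return "".join(history_a), "".join(history_b), score_a, score_b


print(play(tit_for_tat, always_defect, 5))
print(play(tit_for_tat, always_cooperate, 5))
```

Against a habitual defector Tit-for-Tat is caught out once and then retaliates in every later round; against a cooperator it never defects at all.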

The first time Axelrod attempted this tournament fourteen programs were submitted, with Tit-for-Tat the winner. The second time Axelrod set up the tournament, sixty-two programs were submitted, and once again Tit-for-Tat proved to be the winner. Axelrod wrote a book on the subject, in which he identified Tit-for-Tat’s four essential attributes resulting in its overall success. (1) It was cooperative, but (2) willing to retaliate. (3) If after retaliation the opponent started to cooperate, then Tit-for-Tat was forgiving. (4) Finally, it communicated by its actions a clear and consistent message. In this way, Tit-for-Tat has the best chance of eliciting not only long-term cooperation, but short-term cooperation as well – its strategy is cooperative, retaliatory, forgiving, clear and consistent.
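A tournament of this kind can be imitated on a very small scale. The sketch below is not a reconstruction of Axelrod’s tournaments: the four entrants are hypothetical stand-ins chosen for illustration, each plays every other entrant and a copy of itself for 200 rounds, and the points are the ones given earlier.

```python
from itertools import combinations_with_replacement

# (move_a, move_b) -> (points for a, points for b); "C" cooperate, "D" defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def grudger(mine, theirs):
    """Cooperate until the other player defects once, then never forgive."""
    return "D" if "D" in theirs else "C"


def always_cooperate(mine, theirs):
    return "C"


def always_defect(mine, theirs):
    return "D"


def match(strategy_a, strategy_b, rounds):
    """Return the two total scores for one repeated game."""
    history_a, history_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + points_a, score_b + points_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


def round_robin(entrants, rounds=200):
    """Every entrant meets every other entrant and a copy of itself."""
    totals = {name: 0 for name in entrants}
    for name_a, name_b in combinations_with_replacement(entrants, 2):
        score_a, score_b = match(entrants[name_a], entrants[name_b], rounds)
        totals[name_a] += score_a
        if name_a != name_b:  # a self-match is counted once
            totals[name_b] += score_b
    return totals


ENTRANTS = {"tit_for_tat": tit_for_tat, "grudger": grudger,
            "always_cooperate": always_cooperate, "always_defect": always_defect}

for name, total in sorted(round_robin(ENTRANTS).items(), key=lambda item: -item[1]):
    print(f"{name:17} {total}")
```

In this tiny field Tit-for-Tat finishes level at the top with the unforgiving grudger, and the habitual defector comes last. The ranking depends heavily on which strategies are entered, so this shows the mechanics rather than proving the result.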

The reason that Tit-for-Tat’s strategy has a better chance of success, although not foolproof, is that hawks tend to kill each other off. So although at first the hawks tend to kill the doves, hawks that never learn to be nice eventually always come up against a faster gun and get shot down. This can happen to anyone, but the advantage of flexibility in the long run, and even in the short run, is overwhelming. It doesn’t pay to hope that life is different than what we actually experience. We cannot always be peacemakers, nor can we always win by being warriors. We have to learn to be flexible.

However, Tit-for-Tat cannot be a universal panacea. If there were a universal panacea, we would have discovered or evolved into it by now. Everyone knows the expression “tit-for-tat” killings. In other words, if a cooperative move is met by a hostile move, then tit-for-tat demands that the cooperator’s next move be one of retaliation. But then if the player on the other side retaliates to the retaliation, the game spirals downwards out of control into what amounts to a blood feud. The danger of tit-for-tat is that it can lead to mutual recrimination or retaliation from which there is no escape; we observe this in current events as well as historically, and we observe it in mediation.
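The feud is easy to reproduce. In the sketch below two players both follow the tit-for-tat rule, and one of them makes a single hostile move; the idea of a one-off “slip” and the names used are assumptions made for illustration.

```python
def tit_for_tat(their_history):
    """Cooperate first, then copy the other player's previous move."""
    return their_history[-1] if their_history else "C"


def feud(rounds, slip_round):
    """Two tit-for-tat players; A defects once, in round `slip_round` (1-based).

    Returns the two move sequences as strings of "C" and "D".
    """
    history_a, history_b = [], []
    for round_number in range(1, rounds + 1):
        move_a = tit_for_tat(history_b)
        move_b = tit_for_tat(history_a)
        if round_number == slip_round:
            move_a = "D"  # the single hostile move that starts the feud
        history_a.append(move_a)
        history_b.append(move_b)
    return "".join(history_a), "".join(history_b)


moves_a, moves_b = feud(rounds=8, slip_round=3)
print("A:", moves_a)
print("B:", moves_b)
```

After the one defection the players never again cooperate in the same round: each retaliation provokes the next, and neither rule contains a way to stop.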

What the social scientists with their computer games have established is that the dilemma all must experience in their own lives, between the survival imperatives of the individual and the survival interests of society, is not subject to an invariably winning formula. It is a form of tension or strife that is part of our condition. We are genetically programmed for individual survival, but we know from experience, personal and historical, that we need each other.

It is lucky for mediators that no formula exists. If there were a formula, mediators would be redundant. Negotiating parties seek their own self-interest. They are obliged to cooperate because cooperation, in the particular circumstances, is in their self-interest, but they wish to cooperate to the minimal extent required by the circumstances. Hence the ‘dance’ for advantage and that is why mediators can be useful. The parties cannot escape the dilemma; all they can do is try to maximize whatever advantages they possess vis-à-vis the others in the negotiation. It is not patty-cake; it is war waged with kisses.

Charles B. Parselle

Admitted to practice law in California and England, Charles Parselle is a founding partner of Centers for Excellence in Dispute Resolution - CEDRS.COM - and a sought-after ADR professional. An experienced litigator, he enjoys the confidence of both plaintiff and defense bars as a gifted facilitator of dispute resolution.