Iterated prisoner’s dilemma

Pay-off table	A cooperates	A defects
B cooperates	A gets 1 year; B gets 1 year	A goes free; B gets 3 years
B defects	A gets 3 years; B goes free	A gets 2 years; B gets 2 years


A game of prisoner’s dilemma with a twist: you get to play the game repeatedly with the same player (crucially, for an indeterminate number of rounds, so neither of you knows at any time that there won’t be “another time”), and you can remember how the other player treated you last time. In game theory terms, this dramatically changes the pay-off calculation. Whereas in a single round the rational best move is to defect, in an iterated game the best move is to cooperate until the other player defects. If she defects, defect in the next round: tit for tat.
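The single-round half of that claim can be checked against the pay-off table at the top of the page. The following is a minimal Python sketch, assuming the table's entries are years in prison (so fewer is better). The labels "C" and "D" and the names `YEARS` and `best_reply_for_a` are illustrative and not part of the original.

```python
# Years in prison for (A's move, B's move), copied from the pay-off table.
# "C" = cooperate, "D" = defect. Fewer years is better.
YEARS = {
    ("C", "C"): (1, 1),
    ("D", "C"): (0, 3),
    ("C", "D"): (3, 0),
    ("D", "D"): (2, 2),
}


def best_reply_for_a(b_move):
    """Return the move that minimises A's own sentence, given B's move."""
    return min("CD", key=lambda a_move: YEARS[(a_move, b_move)][0])


if __name__ == "__main__":
    for b_move in "CD":
        print("If B plays", b_move, "A does best with", best_reply_for_a(b_move))
```

Whatever B does, A serves less time by defecting, which is why defection is the "rational" move when there is no next round to worry about.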

If two prisoners start out cooperating and play this tit-for-tat game, they will always cooperate. Nice, huh?
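A short simulation makes the point concrete. This is only a sketch: the function names (`tit_for_tat`, `always_defect`, `play`) and the "C"/"D" labels are assumptions made for illustration, and the fixed round count stands in for the indeterminate number of rounds, since neither strategy looks at it.

```python
def tit_for_tat(opponent_moves):
    """Cooperate ("C") first, then repeat the opponent's previous move."""
    return opponent_moves[-1] if opponent_moves else "C"


def always_defect(opponent_moves):
    """Hypothetical opponent that defects ("D") every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Return one (A's move, B's move) pair per round."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        # Each player sees only the other's past moves, not the current one.
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        moves_a.append(move_a)
        moves_b.append(move_b)
    return list(zip(moves_a, moves_b))


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 5))
    print(play(tit_for_tat, always_defect, 4))
```

Two tit-for-tat players never leave mutual cooperation. Against a habitual defector, tit for tat is caught out once and then defects in every later round.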

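Game theorists have found, most famously in Robert Axelrod’s computer tournaments, that tit for tat earns one of the best long-term payoffs against a wide field of rival strategies, though no single strategy is best against every possible opponent.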
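A toy round-robin gives a feel for how such a result arises. Everything here is an assumption for illustration: the four-strategy pool, the 200-round games (the strategies never inspect the round count), and the scoring in years taken from the pay-off table, where a lower total is better. It is a sketch, not a reconstruction of any published tournament.

```python
from itertools import combinations_with_replacement

# Years in prison for (A's move, B's move), from the pay-off table.
YEARS = {("C", "C"): (1, 1), ("D", "C"): (0, 3),
         ("C", "D"): (3, 0), ("D", "D"): (2, 2)}


def tit_for_tat(opponent_moves):
    return opponent_moves[-1] if opponent_moves else "C"


def always_defect(opponent_moves):
    return "D"


def always_cooperate(opponent_moves):
    return "C"


def grudger(opponent_moves):
    """Cooperate until the opponent defects once, then defect for good."""
    return "D" if "D" in opponent_moves else "C"


def play(strategy_a, strategy_b, rounds):
    """Return the total years served by A and by B over the whole game."""
    moves_a, moves_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(moves_b), strategy_b(moves_a)
        served_a, served_b = YEARS[(move_a, move_b)]
        years_a += served_a
        years_b += served_b
        moves_a.append(move_a)
        moves_b.append(move_b)
    return years_a, years_b


def round_robin(strategies, rounds=200):
    """Every strategy meets every strategy, itself included, exactly once."""
    totals = {s.__name__: 0 for s in strategies}
    for a, b in combinations_with_replacement(strategies, 2):
        years_a, years_b = play(a, b, rounds)
        totals[a.__name__] += years_a
        if a is not b:  # count a self-match once, not twice
            totals[b.__name__] += years_b
    return totals


if __name__ == "__main__":
    pool = [tit_for_tat, always_defect, always_cooperate, grudger]
    print(round_robin(pool))
```

In this particular pool, tit for tat and the grudger tie for the lowest total, ahead of both the habitual defector and the unconditional cooperator. Tit for tat never beats the opponent in front of it; it does well by losing little and cooperating often. Drop the grudger from the pool and the habitual defector comes out ahead, so the ranking depends on who else is playing.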

Why is this interesting? Because in commerce (represented in the JC by the eBayer’s dilemma) there almost always is another time, and even when there isn’t, the participants are unlikely to know that.

We do remember, which is why the single-round prisoner’s dilemma strategy of defecting is a bad one.

This helps explain why cooperation works as an evolutionary strategy.

Convexity and the prisoner’s dilemma

The simple imperative of the iterated prisoner’s dilemma needs qualifying, however, in an environment with convex payoffs. A reputation for trustworthiness acquired across a large range of small transactions would be inordinately costly to forfeit for the sake of the defector’s reward on a small transaction, but not necessarily on a disproportionately large one. To take an example from eBay: I may have acquired a five-star reputation over 500 transactions having an average value of £10. The outright cost of acquiring that reputation is large compared with the defection value of a single such transaction, but in absolute terms may still be minimal: only the bid-ask spread and the cost of eBay commissions (bear in mind I receive value for the transactions I settled). If, in my five-hundred-and-first transaction, I offer a car for £15,000, the defection reward easily justifies burning my existing reputation, especially since it is trivial to create a new eBay account and start again.[1]
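The arithmetic can be sketched roughly as follows. The 500 trades, the £10 average and the £15,000 car come from the example above. The 10% figure for spread plus commission is purely hypothetical, as is the simplifying assumption that the only thing a defector loses is the cost of earning the same reputation again on a fresh account.

```python
def rebuild_cost(trades, average_value, cost_percent):
    """Pounds spent re-earning a reputation over `trades` honest sales.

    cost_percent is a HYPOTHETICAL all-in cost (spread plus commission)
    expressed as a whole-number percentage of each sale.
    """
    return trades * average_value * cost_percent // 100


def net_gain_from_defecting(sale_price, trades, average_value, cost_percent):
    """Defector's reward minus the assumed cost of rebuilding the reputation."""
    return sale_price - rebuild_cost(trades, average_value, cost_percent)


if __name__ == "__main__":
    # Defecting on a typical £10 sale versus on the £15,000 car.
    print(net_gain_from_defecting(10, 500, 10, 10))
    print(net_gain_from_defecting(15_000, 500, 10, 10))
```

On these assumptions, cheating on an ordinary £10 sale leaves the seller £490 worse off, while cheating on the car leaves the seller £14,500 better off. The reputation deters small defections but not a sufficiently large one.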

References

[1] Of course, in real life there are other pressures and incentives at play to prevent me defecting: the criminal law, for one, and any prudent buyer’s circumspection for another. A buyer is unlikely to pay away any money without full legal title to a mechanically sound vehicle that meets the sale description. But, for the purposes of the prisoner’s dilemma metaphor, these “real-world” circumstances are not relevant.
