Iterated prisoner’s dilemma
Latest revision as of 16:34, 23 January 2021
A game of prisoner’s dilemma with a twist: you get to play the game repeatedly with the same player (crucially, for an indeterminate number of rounds, so neither of you knows at any time that there won’t be “another time”), and you can remember how the other player treated you last time. In game theory terms, this dramatically changes the expected payoffs. Whereas in a single round the rational best move is to defect, in an iterated game the best move is to cooperate until the other player defects. If she defects, defect in the next round: tit for tat.
If two prisoners start out cooperating and play this tit-for-tat game they will always cooperate. Nice, huh?
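Tit for tat famously won the computer tournaments that the political scientist Robert Axelrod ran around 1980, earning the best overall long-term payoff, although no single strategy is best against every possible opponent.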
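To make the tit-for-tat rule concrete, here is a minimal sketch in Python. The payoff numbers are an assumption (the conventional 5, 3, 1, 0 scheme), not figures taken from this article, and the names `tit_for_tat`, `always_defect` and `play` are invented for illustration. The simulation runs a fixed number of rounds only for convenience; neither strategy looks at the round count, so each behaves as if there may always be another time.

```python
# Illustrative payoffs (an assumption, not from the article):
# both cooperate -> 3 each; both defect -> 1 each;
# lone defector -> 5, the betrayed cooperator -> 0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """The single-round 'rational' move, repeated every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the two strategies against each other; return scores and moves."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each player sees only the other's past moves.
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b, moves_a, moves_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 10)[:2])
    print(play(tit_for_tat, always_defect, 10)[:2])
```

Under these assumed payoffs, two tit-for-tat players cooperate in every round and score 30 each over ten rounds. A perpetual defector facing tit for tat gets one good round and then mutual defection, finishing on 14 against 9: it beats its opponent, but earns less than half of what cooperation would have paid.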
Why is this interesting? Because in commerce (represented in the JC by the eBayer’s dilemma) there almost always is another time, and even when there isn’t, the participants are unlikely to know that.
We do remember, which is why the single-round prisoner’s dilemma strategy of defecting is a bad one.
This explains why cooperation works as an evolutionary strategy.
Convexity and the prisoner’s dilemma
The simple imperative of the iterated prisoner’s dilemma is qualified in an environment with convex payoffs, however. A reputation for trustworthiness acquired across a large number of small transactions would be inordinately costly to forfeit for the sake of the defector’s reward on a small transaction, but not necessarily on a disproportionately large one. To take an example from eBay: I may have acquired a five-star reputation over 500 transactions having an average value of £10. The outright cost of acquiring that reputation is large compared with the defection value of a single such transaction, but in absolute terms may still be minimal: only the bid-ask spread and the cost of eBay commissions (bear in mind I received value for the transactions I settled). If, in my five-hundred-and-first transaction, I offer a car for £15,000, the defection reward easily justifies burning my existing reputation, especially since it is trivial to create a new eBay account and start again.[1]
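The eBay arithmetic can be sketched roughly as follows. The 500 transactions, the £10 average and the £15,000 car come from the example above; the 10% margin, and the idea that a reputation is worth about the profit on another 500 similar sales, are purely hypothetical assumptions made for illustration, as are the function names.

```python
def reputation_value(future_transactions, average_value, margin):
    """Hypothetical worth of a good reputation: profit on the honest trades it enables."""
    return future_transactions * average_value * margin


def defection_pays(item_price, reputation_worth):
    """Defecting 'pays' if the one-off gain exceeds the reputation it burns."""
    return item_price > reputation_worth


# Assumed for illustration: the reputation is good for another 500 trades
# averaging 10 pounds each, at a 10% margin.
worth = reputation_value(future_transactions=500, average_value=10, margin=0.10)

print(defection_pays(10, worth))      # a typical small sale
print(defection_pays(15_000, worth))  # the one-off car sale
```

On these assumptions the reputation is worth roughly £500, so cheating on a £10 sale makes no sense while cheating on the £15,000 car does. The cheaper it is to start again with a fresh account, the lower the reputation’s effective worth and the sooner defection pays.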
See also
- Prisoner’s dilemma
- Agency problem
References
[1] Of course, in real life there are other pressures and incentives at play to prevent me defecting: the criminal law, for one, and any prudent buyer’s circumspection for another. A buyer is unlikely to pay away any money without full legal title to a mechanically sound vehicle that meets the sale description. But, for the purposes of the prisoner’s dilemma metaphor, these “real-world” circumstances are not relevant.