1 of 29

Infinitely Repeated Games and Markov Perfect Equilibrium

Roman Sheremeta, Ph.D.

Professor, Weatherhead School of Management

Case Western Reserve University


2 of 29

Outline

  • Review
  • Repeated games
  • Infinitely repeated games


3 of 29

Review: Repeated Game

  • Stage game:
    • G = {Player 1, Player 2, S1, S2; u1, u2} denotes a static stage game in which player 1 chooses an action s1 from the action space S1 and player 2 chooses an action s2 from the action space S2, and payoffs are u1 and u2, respectively

  • Finitely repeated game:
    • G(T) denotes the finitely repeated game in which a stage game G is played T times, with the outcomes of all preceding plays observed before the next play begins. The payoffs for G(T) are simply the sum of the payoffs from the T stage games
    • In other words, a repeated game is a dynamic game of complete information in which a simultaneous-move game is played at least twice


4 of 29

Review: Two-Stage Prisoners’ Dilemma

  • Rules of the game:
    • Two players play the following simultaneous-move prisoners’ dilemma game twice

  • Question: what is the SPNE?
    • The unique SPNE is (D1D1D1D1D1, D2D2D2D2D2): each player defects in stage 1 and at every stage-2 information set (a best-response check for the stage game is sketched after the payoff matrix below)


                      Player 2
                   D2          C2
Player 1   D1    1 , 1       5 , 0
           C1    0 , 5       4 , 4
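
As a small illustration (not part of the original slides), the following sketch enumerates best responses in the stage game above and confirms that (D1, D2) is its unique Nash equilibrium, which is why defecting at every information set is the unique SPNE of the two-stage game.

```python
# Stage-game payoffs: payoffs[(s1, s2)] = (u1, u2), matching the matrix above.
payoffs = {
    ("D1", "D2"): (1, 1), ("D1", "C2"): (5, 0),
    ("C1", "D2"): (0, 5), ("C1", "C2"): (4, 4),
}
S1, S2 = ["D1", "C1"], ["D2", "C2"]

def is_nash(s1, s2):
    """(s1, s2) is a Nash equilibrium if no unilateral deviation raises a player's payoff."""
    u1, u2 = payoffs[(s1, s2)]
    return (all(payoffs[(a, s2)][0] <= u1 for a in S1)
            and all(payoffs[(s1, b)][1] <= u2 for b in S2))

print([(s1, s2) for s1 in S1 for s2 in S2 if is_nash(s1, s2)])
# -> [('D1', 'D2')]  (the unique Nash equilibrium of the stage game)
```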

5 of 29

Review: Theorems 5 and 6

  • THEOREM 5: If the stage game G has a unique Nash equilibrium, then the repeated game G(T) has a unique SPNE, in which the Nash equilibrium of G is played in every stage t = 1, …, T

  • THEOREM 6: If the stage game G has multiple Nash equilibria, then the repeated game G(T) will have multiple SPNE, some of which involve play in early stages that is not a Nash equilibrium of the original stage game


6 of 29

Infinitely Repeated Game

  • Stage game:
    • G = {Player 1, Player 2, S1, S2; u1, u2} denotes a static stage game in which player 1 chooses an action s1 from the action space S1 and player 2 chooses an action s2 from the action space S2, and payoffs are u1 and u2, respectively

  • Infinitely repeated game:
    • Given a stage game G, let G(∞) denote the infinitely repeated game in which G is played an infinite number of times, with the outcomes of all preceding plays observed before the next play begins
    • In other words, an infinitely repeated game is a dynamic game of complete information in which a simultaneous-move game is played infinitely many times


7 of 29

Infinitely Repeated Game

  • Dr Strange - Bargaining with Dormammu


8 of 29

Infinitely Repeated Game

  • Rules of the game:
    • The simultaneous-move game is played at stages 1, 2, 3, …, t-1, t, t+1, …
    • The outcomes of all previous t-1 stages are observed before the play at the t-th stage

  • How do we compare payoffs of 1 + 1 + 1 + … = ∞ with 4 + 4 + 4 + … = ∞?
    • We consider the present value of discounted payoffs π1, π2, π3, …
    • Each player discounts her payoff by a factor δ, where 0 < δ < 1
    • Present value payoff = π1 + δπ2 + δ²π3 + δ³π4 + …


9 of 29

Infinitely Repeated Game

  • Example 1: Suppose a player obtains 1 in each stage for an infinite number of stages
    • Present value payoff = 1 + 1δ + 1δ² + 1δ³ + … = 1/(1 − δ)

  • Example 2: Suppose a player obtains a in each stage for an infinite number of stages
    • Present value payoff = a + aδ + aδ² + aδ³ + … = a(1 + δ + δ² + δ³ + …) = a/(1 − δ)

  • Example 3: Suppose a player obtains payoffs 4, 1, 4, 1, 4, 1, … (4 in every odd stage, 1 in every even stage) for an infinite number of stages
    • Present value payoff = 4 + 1δ + 4δ² + 1δ³ + … = 4/(1 − δ²) + δ/(1 − δ²) (all three closed forms are checked numerically in the sketch below)
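
A quick numerical sanity check of these closed forms (not part of the original slides); the discount factor δ = 0.9, the constant payoff a = 3, and the truncation horizon are illustrative choices, since a long finite sum approximates the infinite one when 0 < δ < 1.

```python
# Truncated discounted sums vs. the closed-form expressions in Examples 1-3.
def present_value(stream, delta):
    """pi_1 + delta*pi_2 + delta^2*pi_3 + ... for a finite payoff stream."""
    return sum(pi * delta**t for t, pi in enumerate(stream))

delta, T = 0.9, 10_000   # illustrative discount factor and truncation horizon
a = 3                    # illustrative constant payoff for Example 2

print(present_value([1] * T, delta), 1 / (1 - delta))        # Example 1: ~10 vs 10
print(present_value([a] * T, delta), a / (1 - delta))        # Example 2: ~30 vs 30
print(present_value([4, 1] * (T // 2), delta),
      4 / (1 - delta**2) + delta / (1 - delta**2))           # Example 3: ~25.79 vs 25.79
```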


10 of 29

Infinitely Repeated Game: Prisoners’ Dilemma

  • Informal game tree of the infinitely repeated prisoners’ dilemma

[Figure: at each stage P1 chooses D1 or C1 and P2 chooses D2 or C2, giving stage payoffs (1, 1), (5, 0), (0, 5), or (4, 4); the tree continues to infinity]

11 of 29

Infinitely Repeated Game: Prisoners’ Dilemma

  • The strategies in infinitely repeated games can be exceedingly complex, since a strategy for a player is a complete plan of action that may depend on the history of play
  • We will consider just a few simple types of strategies in repeated games (sketched in code below):
    • Always defect: play defect (D) at every stage
    • Always cooperate: play cooperate (C) at every stage
    • Grim trigger: start by cooperating (C) and continue to do so as long as the other player cooperates, but defect (D) forever following a defection
    • Tit-for-tat: cooperate (C) first, then replicate the opponent’s previous action
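
A minimal sketch of these four strategies (not from the slides): each is written as a function that maps the opponent's history of past actions, labeled 'C' and 'D' here, to the action played in the current stage.

```python
# Each strategy maps the opponent's past actions (a list of 'C'/'D') to this stage's action.
def always_defect(opp_history):
    return "D"

def always_cooperate(opp_history):
    return "C"

def grim_trigger(opp_history):
    # Cooperate until the opponent has ever defected, then defect forever.
    return "D" if "D" in opp_history else "C"

def tit_for_tat(opp_history):
    # Cooperate first, then copy the opponent's previous action.
    return opp_history[-1] if opp_history else "C"
```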


12 of 29

Infinitely Repeated Game: Prisoners’ Dilemma

  • Because there are infinitely many subgames, it is impossible to check all subgames in the infinitely repeated game

  • Therefore, we will use a new equilibrium concept called Markov Perfect Equilibrium
    • To check whether a strategy profile is a Markov perfect equilibrium, we check whether the combination of strategies is a Nash equilibrium of the infinitely repeated game


13 of 29

Infinitely Repeated Game: Markov Perfect Equilibrium

  • Always defect strategy:
    • Play defect (D) at every stage

  • Check whether the combination of strategies is a Nash equilibrium of the infinitely repeated game
    • Player 1’s best response to D2 is D1 in every stage
    • Player 2’s best response to D1 is D2 in every stage
    • Hence, playing (D1D1D1D1…, D2D2D2D2…) is a Nash equilibrium of the infinitely repeated game (see the comparison below)
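
To spell out the "no profitable deviation" comparison (a routine check, not written out on the slide): against an opponent who always defects, switching to C in some single stage t only lowers that stage's payoff from 1 to 0 and changes nothing else, so

```latex
\underbrace{1 + \delta + \delta^{2} + \cdots}_{\text{always defect}}
\;=\; \frac{1}{1-\delta}
\;>\; \frac{1}{1-\delta} - \delta^{\,t-1}
\;=\; \text{payoff from playing C at stage } t \text{ against always defect}
```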


14 of 29

THEOREM 7: Infinitely Repeated Game

  • THEOREM 7: If the stage game G has a unique Nash equilibrium, then the repeated game G(∞) has a Markov perfect equilibrium in which the Nash equilibrium of G is played in every stage of the infinitely repeated game. However, this equilibrium is not necessarily unique.

  • In other words, if the simultaneous-move stage game G has a unique Nash equilibrium, then the infinitely repeated game G(∞) also has an equilibrium, but this equilibrium is not necessarily unique


15 of 29

Grim Trigger Strategy

  • Grim trigger strategy:
    • Start by cooperating (C) and continue to do so as long as the other player cooperates
    • Defect (D) forever following a defection

  • Next, we need to check whether the combination of grim trigger strategies is a Nash equilibrium of the infinitely repeated game


16 of 29

Grim Trigger Strategy

  • Consider the following case:
    • Suppose that player 1 plays the grim trigger strategy
    • Suppose that player 2 plays the grim trigger strategy up to stage t-1
    • Can player 2 be better off if she deviates from the grim trigger strategy at stage t?


Stage        1           2           …    t-1
P2: grim     (C1, C2)    (C1, C2)    …    (C1, C2)

17 of 29

Grim Trigger Strategy

  • Case 1: Play the trigger strategy
    • If player 2 continues to play the trigger strategy at stage t and after, then she will get a sequence of payoffs 4, 4, 4, ... (from stage t to stage +∞).
    • Discounting these payoffs to stage t gives us:
    • 4 + 4δ + 4δ² + … = 4/(1 − δ)


Stage        1           2           …    t-1         t           t+1         t+2         …
P2: grim     (C1, C2)    (C1, C2)    …    (C1, C2)    (C1, C2)    (C1, C2)    (C1, C2)    …

18 of 29

Grim Trigger Strategy

  • Case 2: Deviate from the trigger strategy
    • If player 2 deviates from the trigger strategy at stage t, then player 1 will play D1 forever after stage t, and in response player 2 should also play D2 forever
    • So, player 2 will get a sequence of payoffs 5, 1, 1, 1, … (from stage t to stage +∞)
    • Discounting these payoffs to stage t gives us (see the short derivation below):
    • 5 + 1δ + 1δ² + … = 5 + δ/(1 − δ)
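
Writing out the geometric tail of the deviation payoff (a routine step, not shown on the slide):

```latex
5 + 1\cdot\delta + 1\cdot\delta^{2} + 1\cdot\delta^{3} + \cdots
\;=\; 5 + \delta\,(1 + \delta + \delta^{2} + \cdots)
\;=\; 5 + \frac{\delta}{1-\delta}
```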


Stage          1           2           …    t-1         t           t+1         t+2         …
P2: grim       (C1, C2)    (C1, C2)    …    (C1, C2)    (C1, C2)    (C1, C2)    (C1, C2)    …
P2: deviate    (C1, C2)    (C1, C2)    …    (C1, C2)    (C1, D2)    (D1, D2)    (D1, D2)    …

19 of 29

Grim Trigger Strategy

  • Comparing the two cases:
    • Case 1: If player 2 plays the trigger strategy, her payoff is: 4 + 4δ + 4δ² + 4δ³ + … = 4/(1 − δ)
    • Case 2: If player 2 deviates from the trigger strategy, her payoff is: 5 + 1δ + 1δ² + 1δ³ + … = 5 + δ/(1 − δ)
    • Hence, if 4/(1 − δ) ≥ 5 + δ/(1 − δ), player 2 cannot be better off if she deviates from the trigger strategy (see the derivation below)
    • Thus, if player 1 plays the trigger strategy, then player 2’s best response is the trigger strategy for δ ≥ 1/4
    • By symmetry, if player 2 plays the trigger strategy, then player 1’s best response is the trigger strategy for δ ≥ 1/4
  • Hence, there is a Markov perfect equilibrium in which both players play the grim trigger strategy for δ ≥ 1/4
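
Spelling out the algebra behind the δ ≥ 1/4 threshold (a routine derivation, not shown on the slide):

```latex
\frac{4}{1-\delta} \;\ge\; 5 + \frac{\delta}{1-\delta}
\quad\Longleftrightarrow\quad 4 \;\ge\; 5(1-\delta) + \delta
\quad\Longleftrightarrow\quad 4\delta \;\ge\; 1
\quad\Longleftrightarrow\quad \delta \;\ge\; \tfrac{1}{4}
```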


20 of 29

Experiment #8: Repeated Game

  • What is the equilibrium in the infinitely repeated game?
    • One equilibrium is (D1D1D1D1…, D2D2D2D2…)
    • Another equilibrium is for both players to play the grim trigger strategy if δ ≥ 1/4 (in the experiment δ = 9/10; see the check below)
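
As a quick arithmetic check (not on the original slide), the condition holds comfortably at the experimental discount factor δ = 9/10:

```latex
\frac{4}{1-0.9} = 40 \;>\; 5 + \frac{0.9}{1-0.9} = 14,
\qquad\text{and indeed } 0.9 \ge \tfrac{1}{4}
```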


                      Player 2
                   D2          C2
Player 1   D1    1 , 1       5 , 0
           C1    0 , 5       4 , 4

21 of 29

Experiment #8: Results (2019 CWRU)


22 of 29

Applications

  • Although it is difficult to sustain cooperation in the finitely repeated Prisoner’s Dilemma game, it is possible to do so in the infinitely repeated game with appropriate discounting

  • In oligopolistic competition, firms can sustain cooperation (collusion) by producing lower quantities and earning higher payoffs

  • In contests, contestants can sustain cooperation by making lower bids and earning higher payoffs


23 of 29

Infinitely Repeated Game: Prisoners’ Dilemma

  • Dal Bó and Fréchette (2011) find the following use of strategies:
    • Always defect: 50-60%
    • Always cooperate: 5-10%
    • Grim trigger: 5-10%
    • Tit-for-tat: 30-40%


24 of 29

Axelrod’s Tournament

  • Robert Axelrod invited experts to submit programs for a computer Prisoner’s Dilemma tournament
    • Each participant had to specify the strategy to be used in the tournament
    • The strategies could be conditional on the history of play
    • He received 14 entries and ran a round-robin tournament (a simplified simulation is sketched below)
    • Tit-for-tat was the winner (start cooperating and then replicate the opponent’s move)

  • Axelrod then circulated the results and again asked for submissions
    • The winner was? …
    • Tit-for-tat!
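
A simplified round-robin sketch in the spirit of Axelrod's tournament (the payoff matrix is the prisoners' dilemma from these slides; the strategy set, the 200-round match length, and the pairing scheme are illustrative assumptions, not the actual 1980 tournament). In this tiny field, tit-for-tat and grim trigger tie for the top score, which already illustrates how a strategy can come out on top overall without beating any single opponent head-to-head.

```python
# Round-robin repeated prisoners' dilemma among four simple strategies.
PAYOFFS = {("C", "C"): (4, 4), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(opp):    return "D"
def always_cooperate(opp): return "C"
def grim_trigger(opp):     return "D" if "D" in opp else "C"
def tit_for_tat(opp):      return opp[-1] if opp else "C"

STRATEGIES = {"always defect": always_defect, "always cooperate": always_cooperate,
              "grim trigger": grim_trigger, "tit-for-tat": tit_for_tat}

def play_match(strat1, strat2, rounds=200):
    """Play a repeated prisoners' dilemma; return the two players' total payoffs."""
    hist1, hist2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strat1(hist2), strat2(hist1)   # each strategy sees the opponent's history
        u1, u2 = PAYOFFS[(a1, a2)]
        total1, total2 = total1 + u1, total2 + u2
        hist1.append(a1)
        hist2.append(a2)
    return total1, total2

scores = {name: 0 for name in STRATEGIES}
names = list(STRATEGIES)
for i, n1 in enumerate(names):                  # every pair of strategies meets once
    for n2 in names[i + 1:]:
        u1, u2 = play_match(STRATEGIES[n1], STRATEGIES[n2])
        scores[n1] += u1
        scores[n2] += u2

print(sorted(scores.items(), key=lambda kv: -kv[1]))
```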


25 of 29

Why Tit-for-Tat

  • Why was tit-for-tat the winner?
    • Tit-for-tat never beats its direct opponents!
    • However, it induces cooperation from its opponents and as a result it wins overall

  • Why tit-for-tat is effective (Axelrod, 1984):
    • Not envious (it does not aim to beat its opponent)
    • Nice (it begins with cooperation and it is never first to defect)
    • Tough (it sends a message that it cannot be exploited)
    • Forgiving (it reciprocates cooperation even after a history of defection)
    • Simple


26 of 29

Evolution of Cooperation

  • Evolution of cooperation:
    • http://ncase.me/trust/


27 of 29

Collusion in Markets

  • Next Time!

28 of 29

Thank you!

Roman Sheremeta, Ph.D.

Professor, Weatherhead School of Management

Case Western Reserve University


29 of 29

References

  • Watson, J. (2013). Strategy: An Introduction to Game Theory (3rd ed.). W. W. Norton & Company. (Chapter 22)
  • Dal Bó, P., & Fréchette, G.R. (2011). The evolution of cooperation in infinitely repeated games: Experimental evidence. American Economic Review, 101, 411-429.
