Journal of Economic Theory 135 (2007) 382 – 413 www.elsevier.com/locate/jet
Efficiency results in N player games with imperfect private monitoring
Yuichi Yamamoto
Graduate School of Economics, The University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Received 15 June 2005; final version received 11 May 2006
Available online 22 June 2006

Abstract
We demonstrate that efficiency is achievable in a certain class of N player repeated games with private, almost perfect monitoring. Our equilibrium requires only one period memory and can be implemented by two state automata. Furthermore, we show that this efficiency result holds with any degree of accuracy of monitoring if private signals are hemiindependent. Whereas most existing research focuses on two player cases or only a special example of N player games, our results are applicable to a wide range of N player games of economic relevance, such as trading goods games and price-setting oligopolies.
© 2006 Elsevier Inc. All rights reserved.
JEL classification: C72; C73; D82
Keywords: Repeated game; Private monitoring; Belief-free equilibrium; Review strategy; Efficiency
1. Introduction

One of the most interesting characteristics of repeated games is that efficiency can be achieved through long-term relationships even when such an outcome cannot be attained in one-shot interaction. Fudenberg and Maskin [6] demonstrate that any feasible and individually rational payoff vector can be sustained in infinitely repeated games with perfect monitoring, where players can observe their opponents’ actions directly (the folk theorem). Fudenberg et al. [5] also show the folk theorem when monitoring is not perfect but public. However, the case of private monitoring in a repeated game, where players observe only private signals about their opponents’ actions, is not fully understood, and it has been the subject of active research in the past few years.¹
¹ Kandori [8] explains the reason why private monitoring is theoretically difficult.
0022-0531/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.jet.2006.05.003
Y. Yamamoto / Journal of Economic Theory 135 (2007) 382 – 413
In fact, many economic situations should be analyzed as repeated games with private monitoring. A leading example is the repeated price-setting oligopoly proposed by Stigler [18], where firms cannot observe their opponents’ prices and sales. In this “secret price-cutting” game, a firm obtains some information about the opponents’ prices only through its own sales, where the firm’s sales level tends to be high when the opponents have high prices and vice versa. However, the sales level also depends on unobservable shocks due to the business cycle. This implies that, even if it gets a high sales level, a firm cannot know whether this is caused by the opponents’ high prices or by an upswing of the business cycle. Thus, monitoring is imperfect and private in this example.

While most of the literature on private monitoring focuses on two player games, the purpose of this paper is to demonstrate that a large class of repeated N player games can achieve efficiency with almost perfect monitoring and that such a result can be generalized to the case where monitoring is not accurate. For example, Ely and Välimäki [4] (hereafter EV) is one of the few papers that deal with N player games, but their analysis is limited to a degenerate game and they assume almost perfect monitoring.² In contrast, our result is applicable to non-degenerate N player games.

Our results are applicable to several familiar games, such as trading goods games and secret price-cutting games. In a trading goods game, player i produces N units of goods in each period. Then, one by one, she exchanges her own products with other producers and consumes what she obtains. The quality of goods is randomly determined by the producer’s unobservable effort level, and the quality can be observed when the good is consumed. Since each player infers her opponents’ effort levels only through the consumption of goods, this game should be analyzed as a game with private monitoring.
We show that efficiency, which is defined as the outcome where each player puts in a high level of effort, is achievable. In a secret price-cutting game, our results imply that a cartel is sustained if several mild conditions are satisfied. This result is contrary to Stigler’s argument; he conjectures that secret price cuts prevent long-term relationships from sustaining the cartel agreement.

There are mainly two kinds of approaches to the problem of repeated games with private monitoring. The first is the belief-based strategy, as outlined by Sekiguchi [16] and Bhaskar and Obara [1]. The second is the belief-free strategy, as defined by Piccione [15], EV, Ely et al. [3], and Kandori and Obara [10]. Our approach belongs to the latter category.

First of all, we consider almost perfect monitoring cases where the monitoring technology is sufficiently accurate. We construct belief-free equilibria that bring efficiency in a large class of N player games. Here, a sequential equilibrium is belief-free if, after every history, each player’s continuation strategy is optimal independently of her opponents’ private histories. Then, a player does not have to compute the belief about the history of her opponents because her best reply does not depend on the belief. This property provides us with a recursive structure and allows us to use dynamic programming techniques.

Our equilibrium is a generalization of EV’s Markov strategy. In the EV construction, each player has two states, rewarding state R and punishing state P, and one’s payoff depends on which state the opponent is in. Hence, when there are only two players, each player’s value function is defined with just two states of the opponent. The resulting dynamic programming equations are relatively easy to analyze.
² Mailath and Morris [11] show a folk theorem with almost public monitoring. Bhaskar and Obara [1] also derive efficiency results in some class of N player games when monitoring is almost perfect on the assumption that the probability of observing a particular signal profile only depends on the number of errors contained within. Compte [2], Kandori and Matsushima [9] and Obara [14] establish the folk theorem in general games but they allow communication.

In contrast, when there are three or more players, each player has a large number of relevant states (combinations of opponents’ states), so that there are a number
of “intermediate states” between the “best” (all opponents are in state R) and “worst” states (all opponents are in P) for each player. One of the main difficulties associated with the N player case is to determine those intermediate values so as to satisfy the system of dynamic programming equations. EV present a rather specific N player stage game (with two actions for each player) for which the equilibrium value function can be “linear” in the number of opponents who are in state R. The present paper shows that, in contrast, the system of dynamic programming equations has a solution for a large class of general N player stage games by a judicious choice of the intermediate values.

Despite the potential complexity of the N player case, our equilibrium strategy has a rather simple form: on the equilibrium path, each player switches between two actions. In the secret price-cutting game under almost perfect monitoring, for example, approximate efficiency is achieved when each firm charges a “cartel price” or a “punishment price” in each period.

Recently, Hörner and Olszewski [7] independently prove the folk theorem in all games with almost perfect monitoring by constructing a block strategy. A block strategy treats every consecutive T periods as a block game, and players take complex strategies in every block. Since our result is limited to some class of N player games, their result is more general. However, our equilibria possess several advantages. First, our equilibrium is much simpler than theirs; their block strategy assigns positive probabilities to all actions in almost all periods, whereas each player transits between two actions in our strategy. Second, our equilibrium requires only one period memory, but the block strategy needs T periods memory. Finally, our result is generalizable to the case where monitoring is far from perfect via review strategies, while their construction does not admit such an extension.
To analyze games in which monitoring is not accurate, we apply a review strategy in the manner of Matsushima [13], which achieves efficiency in a class of two player games. The present paper generalizes Matsushima’s result to the N player case and demonstrates that an efficient outcome is achievable in some class of N player games even when monitoring is far from perfect. A review strategy regards every consecutive T periods as a review phase. In every review phase, players play constant actions and collect information about what actions the opponents take. Then, for a sufficiently large T, players can gather enough information to guess the opponents’ actions almost perfectly. In this way, review strategies can recover the accuracy of monitoring. The main difficulty in making a review strategy work well in N player games is that a player’s action in the current review phase affects the actions of several players in the next phase. This fact implies that a player’s action in the current phase affects her continuation payoff in a complicated way, since the continuation payoff depends on all opponents’ actions in the next phase.

The rest of this paper is organized as follows. Section 2 describes the model. In Section 3, we construct belief-free equilibria and derive an efficiency result when monitoring is accurate. Section 4 introduces a review strategy and demonstrates that we are able to get an efficiency result even when monitoring is far from perfect. Finally, Section 5 concludes the paper.

2. The Model

The stage game is $G = \{I, (A_i, \Omega_i, g_i)_{i \in I}, m\}$; $I = \{1, 2, \ldots, N\}$ is the set of players, $A_i$ is the finite set of player i’s pure actions, $\Omega_i$ is the finite set of player i’s private signals, $g_i : A_i \times \Omega_i \to \mathbb{R}$ is player i’s profit function, and m is the probability distribution of the signals. Let $A = \times_{i \in I} A_i$ and $\Omega = \times_{i \in I} \Omega_i$. In every stage game, players choose an action profile $a = (a_1, \ldots, a_N) \in A$ simultaneously, and then a signal profile $\omega = (\omega_1, \ldots, \omega_N) \in \Omega$ is realized according to the conditional distribution function $m(\cdot|a)$. That is, $m(\omega|a)$ stands for the probability that player i observes a private signal $\omega_i$ for each $i \in I$ when players took an action profile a. After that, player i receives the profit $g_i(a_i, \omega_i)$ from an action $a_i$ and a signal $\omega_i$. Thus, her expected stage payoff $\pi_i : A \to \mathbb{R}$ is given by $\pi_i(a) = \sum_{\omega \in \Omega} m(\omega|a) g_i(a_i, \omega_i)$. Let $\pi(a) = (\pi_i(a))_{i \in I}$. We assume full support in that $m(\omega|a) > 0$ for all $a \in A$ and $\omega \in \Omega$.

Consider the infinitely repeated game of G where the discount factor is $\delta \in (0, 1)$. Denoting by $(a_i^t, \omega_i^t)$ player i’s action and signal in period t, her private history until period T is $h_i^T = (a_i^t, \omega_i^t)_{t=1}^{T}$ for any $T \ge 1$. Let $h_i^0 = \emptyset$, and for all $T \ge 0$, let $H_i^T$ be the set of all $h_i^T$. Then, player i’s strategy is a mapping $s_i : \bigcup_{T=0}^{\infty} H_i^T \to \Delta A_i$. Let $S_i$ be the set of i’s strategies, and let $S = \times_{i \in I} S_i$. We denote by $w_i(s)$ player i’s expected average payoff for a profile $s \in S$. That is, $w_i(s) = (1-\delta) E\left[\sum_{t=1}^{\infty} \delta^{t-1} \pi_i(a^t) \,\middle|\, s\right]$.

Pick two actions of player i from $A_i$ arbitrarily, and call them $C_i$ and $D_i$, respectively. In a large part of this paper, we impose no restrictions on the payoff function, and so $C_i$ and $D_i$ are only labels. However, in Proposition 2 and Corollary 1, we focus on prisoner’s dilemma type games (see Condition 2 for details). In such cases, actions $C_i$ and $D_i$ have particular meanings; the former is regarded as cooperation and the latter as defection.

We conclude this section by introducing some notation. Let $C = (C_i)_{i \in I}$ and $D = (D_i)_{i \in I}$. Define $A'_i = \{C_i, D_i\}$, and let $A' = \times_{i \in I} A'_i$. For each $k \in \{0, \ldots, N\}$, let $A^k$ be the set of all $a \in A'$ such that k components of a are cooperation and $N-k$ components are defection. Similarly, we define $A^k_{-i}$ as the set of all $a_{-i} \in A'_{-i}$ such that k components of $a_{-i}$ are cooperation.

3. Almost perfect monitoring

This section treats the almost perfect monitoring case. Assume $\Omega_i = A_{-i}$. Let $\epsilon(a) = (m(\omega|a))_{\omega \ne (a_{-1}, \ldots, a_{-N})}$, and let $\epsilon = (\epsilon(a))_{a \in A}$.
In words, $\epsilon$ is the vector whose elements are the probabilities that someone acquires wrong information. We say that monitoring is almost perfect if $\epsilon$ approximates 0.

3.1. Efficiency result

Define $\Sigma_i^*$ as the set of player i’s strategies that satisfy the following properties. In the first period, player i plays $C_i$ or $D_i$. In period $t > 1$, she mixes $C_i$ and $D_i$ depending on her private history in the last period. For each $\omega_i \in \Omega_i$, if the history in period $t-1$ was $(C_i, \omega_i)$, then she plays $C_i$ and $D_i$ with probabilities $1-\alpha_i(\omega_i)$ and $\alpha_i(\omega_i)$, respectively. If the history was $(D_i, \omega_i)$, she plays $C_i$ and $D_i$ with probabilities $\beta_i(\omega_i)$ and $1-\beta_i(\omega_i)$. If she took $a_i \notin A'_i$ in period $t-1$, she plays $C_i$. See Fig. 1.

Given a strategy $\sigma_i \in \Sigma_i^*$, let $\sigma_i(a_i)$ be the strategy for player i that plays the action $a_i$ in the first period and then follows the transition rule of $\sigma_i$. From the definition, $\sigma_i(C_i)$ and $\sigma_i(D_i)$ also belong to $\Sigma_i^*$. To simplify the notation, let $\sigma(a)$ and $\sigma_{-i}(a_{-i})$ represent $(\sigma_i(a_i))_{i \in I}$ and $(\sigma_j(a_j))_{j \ne i}$, respectively.

For all $i \in I$ and $a_{-i} \in A'_{-i}$, take a real number $V_i(a_{-i})$. Then, let V be a vector that contains all these numbers. We focus on the equilibria satisfying the following property:

Definition 1. Given a vector V, a strategy profile $\sigma \in \Sigma^*$ sustains V if, for all $i \in I$, $a_{-i} \in A'_{-i}$ and $a_i \in A_i$,

$$V_i(a_{-i}) = w_i(\sigma(C_i, a_{-i})) = w_i(\sigma(D_i, a_{-i})) \ge w_i(\sigma(a)). \tag{1}$$
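Fig. 1. A transition rule of $\sigma_i \in \Sigma_i^*$ against a private signal $\omega_i$. [Diagram: from state “play $C_i$”, stay with probability $1-\alpha_i(\omega_i)$ and move to “play $D_i$” with probability $\alpha_i(\omega_i)$; from state “play $D_i$”, stay with probability $1-\beta_i(\omega_i)$ and move to “play $C_i$” with probability $\beta_i(\omega_i)$; after other actions, move to “play $C_i$” with probability 1.]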
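The one-period-memory transition rule defined above and summarized in Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `next_action`, the string labels `"C"` and `"D"`, and the dictionaries `alpha` and `beta` (mapping each private signal to a switching probability) are hypothetical names introduced here, not notation from the paper.

```python
import random


def next_action(prev_action, signal, alpha, beta, rng=random):
    """Draw this period's action from last period's (action, signal) pair.

    alpha[signal]: probability of switching from C to D (hypothetical dict).
    beta[signal]:  probability of switching from D to C (hypothetical dict).
    Any previous action other than "C" or "D" leads back to "C".
    """
    if prev_action == "C":
        # Stay at C with probability 1 - alpha, move to D with probability alpha.
        return "D" if rng.random() < alpha[signal] else "C"
    if prev_action == "D":
        # Stay at D with probability 1 - beta, move to C with probability beta.
        return "C" if rng.random() < beta[signal] else "D"
    # The player took some action outside {C, D} last period.
    return "C"
```

Because the rule reads only last period's own action and signal, the whole strategy is a two-state automaton whose transition probabilities are the free parameters to be solved for.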
Eq. (1) implies that, in any period where the opponents choose $a_{-i} \in A'_{-i}$, both $C_i$ and $D_i$ are optimal for player i and bring $V_i(a_{-i})$ as continuation payoffs. Therefore, given any V, a strategy profile $\sigma \in \Sigma^*$ that sustains V is a sequential equilibrium. Notice that this equilibrium is belief-free in the sense of Ely et al. [3]; namely, both $C_i$ and $D_i$ are optimal for player i in every period independently of the opponents’ history. Moreover, in two person prisoner’s dilemma games, this equilibrium coincides with the one analyzed by EV.

From the theory of dynamic programming, given any V, we can say that $\sigma \in \Sigma^*$ sustains V if

$$V_i(a_{-i}) \ge (1-\delta)\pi_i(a) + \delta E[V_i \mid a, \sigma, \epsilon] \tag{2}$$

for all $i \in I$ and $a \in A_i \times A'_{-i}$, with equality if $a \in A'$. Here, $E[V_i \mid a, \sigma, \epsilon]$ is player i’s continuation payoff from the second period for $\sigma(a)$. Precisely, $E[V_i \mid a, \sigma, \epsilon]$ is defined as $\sum_{\tilde a_{-i} \in A'_{-i}} \mu(\tilde a_{-i} \mid a, \sigma, \epsilon) V_i(\tilde a_{-i})$, where $\mu(\tilde a_{-i} \mid a, \sigma, \epsilon)$ is the probability that $\tilde a_{-i}$ is chosen in the second period for $\sigma(a)$. For example, $\mu(D_{-i} \mid C, \sigma, 0)$ is equal to $\prod_{j \ne i} \alpha_j(C_{-j})$ since $\epsilon = 0$ means perfect monitoring.

To simplify the discussion, we restrict attention to the case where $A = A'$ and $\epsilon = 0$ for a while. Let $\alpha$ be a vector with the elements $\alpha_i(a_{-i})$ for all $i \in I$ and $a_{-i} \in A'_{-i}$. Similarly, let $\beta$ be a vector with the elements $\beta_i(a_{-i})$ for all $i \in I$ and $a_{-i} \in A'_{-i}$. From $A = A'$, a vector V is sustained if there exist $\alpha$ and $\beta$ satisfying

$$V_i(a_{-i}) = (1-\delta)\pi_i(a) + \delta E[V_i \mid a, \sigma, 0] \tag{3}$$

for all $i \in I$ and $a \in A'$.

Suppose that there are only two players. Then, each player’s continuation payoff depends only on a single opponent’s randomization. For example, $E[V_1 \mid C, \sigma, 0]$ is equal to $\alpha_2(C_1) \cdot V_1(D_2) + (1-\alpha_2(C_1)) \cdot V_1(C_2)$. Hence, Eq. (3) is linear with respect to $\alpha$ and $\beta$, and we can calculate the solution of (3) by simple linear algebra. However, in games with three or more players, (3) is no longer linear since the continuation payoff depends on several opponents’ randomization. Hence, it is not obvious whether (3) has a solution.

Nevertheless, EV give an example of N player games where (3) is solved by applying the implicit function theorem. Their idea is as follows. Notice that, given any V, $E[V_i \mid a, \sigma, 0]$ is equal to $V_i(a_{-i})$ when $\alpha = \beta = 0$. Thus, (3) has a trivial solution $\alpha = \beta = 0$ at $\delta = 1$. So, the implicit function theorem assures that, if the full rank condition is satisfied (the Jacobian has full rank), then there exist implicit functions $\alpha$ and $\beta$ of $\delta$ such that they solve (3) in the neighborhood of $\delta = 1$ and such that $\alpha = \beta = 0$ at $\delta = 1$. Moreover, if the derivatives of $\alpha$ and $\beta$ with respect to $\delta$ are negative at $\delta = 1$, then $\alpha$ and $\beta$ increase from zero as $\delta$ decreases from one, and hence they are indeed probabilities for a $\delta$ close to one. Overall, the existence of the equilibria sustaining V is guaranteed if the full rank condition is satisfied and if the derivatives of the implicit functions are negative.
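For the two player case with perfect monitoring, the linearity of (3) can be made concrete with a short Python sketch. Everything numerical here is a hypothetical illustration: the prisoner's dilemma payoffs, the target values `V_C` and `V_D` (standing for $V_1(C_2)$ and $V_1(D_2)$), the discount factor, and the function names are assumptions chosen for the example. Solving (3) for player 2's switching probabilities gives $\alpha_2(a_1) = \frac{(1-\delta)(\pi_1(a_1, C_2) - V_1(C_2))}{\delta (V_1(C_2) - V_1(D_2))}$ and $\beta_2(a_1) = \frac{(1-\delta)(V_1(D_2) - \pi_1(a_1, D_2))}{\delta (V_1(C_2) - V_1(D_2))}$.

```python
def two_player_transitions(pi1, V_C, V_D, delta):
    """Solve Eq. (3) for player 2's switching probabilities (two players, epsilon = 0).

    pi1[(a1, a2)] is player 1's stage payoff; V_C and V_D are the target
    values V_1(C_2) and V_1(D_2). Under perfect monitoring player 2's
    signal equals player 1's action a1, so the probabilities are indexed by a1.
    """
    scale = (1 - delta) / (delta * (V_C - V_D))
    alpha2 = {a1: scale * (pi1[(a1, "C")] - V_C) for a1 in ("C", "D")}
    beta2 = {a1: scale * (V_D - pi1[(a1, "D")]) for a1 in ("C", "D")}
    return alpha2, beta2


def rhs_of_eq3(pi1, V_C, V_D, delta, alpha2, beta2, a1, a2):
    """Right-hand side of Eq. (3) for player 1 at the action profile (a1, a2)."""
    if a2 == "C":
        continuation = (1 - alpha2[a1]) * V_C + alpha2[a1] * V_D
    else:
        continuation = beta2[a1] * V_C + (1 - beta2[a1]) * V_D
    return (1 - delta) * pi1[(a1, a2)] + delta * continuation


# Hypothetical prisoner's dilemma payoffs for player 1 and hypothetical targets.
PI1 = {("C", "C"): 1.0, ("D", "C"): 1.5, ("C", "D"): -0.5, ("D", "D"): 0.0}
ALPHA2, BETA2 = two_player_transitions(PI1, V_C=0.9, V_D=0.1, delta=0.9)
```

With these numbers, player 1 is indifferent between $C_1$ and $D_1$ whatever player 2 is currently playing, which is the belief-free property in miniature. The probabilities are valid only when the target values lie between the relevant stage payoffs, so other choices of `V_C`, `V_D`, or `delta` may fall outside $[0, 1]$.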
This paper demonstrates that their method is applicable to general N player games, although the construction in EV’s paper appears non-generic. We develop a way to examine whether the Jacobian has full rank and whether the derivatives of the implicit functions are negative for general games. Indeed, Proposition 1 says that it suffices to analyze the system of $N 2^N$ linear equations

$$\sum_{j \ne i} \Delta^j V_i(a_{-i}) \cdot \eta_j(a) = \pi_i(a) - V_i(a_{-i}), \quad \forall i \in I,\ \forall a \in A'. \tag{4}$$

Here, for all $i \in I$, $j \ne i$, and $a_{-i} \in A'_{-i}$, $\Delta^j V_i(a_{-i})$ is defined as $V_i(D_j, a_{-(i,j)}) - V_i(a_{-i})$ if $a_j = C_j$, and as $V_i(C_j, a_{-(i,j)}) - V_i(a_{-i})$ if $a_j = D_j$.

Let us interpret (4). Suppose that, given a vector V, simultaneous equations (4) have a negative solution $\eta$. Then, given a $\delta$, construct a strategy profile $\sigma \in \Sigma^*$ such that

$$\alpha_i(a_{-i}) = -\frac{1-\delta}{\delta}\,\eta_i(C_i, a_{-i}) \quad \text{and} \quad \beta_i(a_{-i}) = -\frac{1-\delta}{\delta}\,\eta_i(D_i, a_{-i}).$$

Since $\eta_i(a)$ is negative, $\alpha_i$ and $\beta_i$ belong to the interval (0, 1) for a $\delta$ close to one, and so the strategy profile $\sigma$ is well-defined. Observe that, by rearranging (3), we obtain

$$\frac{\delta}{1-\delta}\bigl(V_i(a_{-i}) - E[V_i \mid a, \sigma, 0]\bigr) = \pi_i(a) - V_i(a_{-i}) \tag{5}$$

for all $i \in I$ and $a \in A'$. Moreover, notice that $\frac{\delta}{1-\delta}(V_i(a_{-i}) - E[V_i \mid a, \sigma, 0])$ converges to the left-hand side of (4) as $\delta$ goes to one.³ Hence, it turns out that (5) is almost satisfied by $\sigma$ for a sufficiently large $\delta$. Then, from continuity, there exists $\sigma' \in \Sigma^*$ satisfying (5) exactly. Since (5) is equivalent to (3), V is sustained by such a strategy profile $\sigma'$.

The implicit function theorem justifies the above informal discussion, and the result is summarized in the following proposition. Observe that the assumptions of the proposition are still satisfied after perturbing $\pi$ and V. Hence, our result is not limited to degenerate games. The proof is given in Appendix B.

Proposition 1. Fix a vector V that satisfies the following properties:
1. $\sup_{a_i^* \in A_i} \pi_i(a_i^*, D_{-i}) < V_i(D_{-i}) < V_i(a_{-i})$ for all $i \in I$ and $a_{-i} \ne D_{-i}$.
2. Simultaneous equations (4) have a unique solution $\eta = (\eta_i(a))_{i \in I, a \in A'} \ll 0$.
Then, $\exists \underline{\delta} \in (0, 1)$, $\forall \delta \in (\underline{\delta}, 1)$, $\exists \bar{\epsilon} \gg 0$, $\forall \epsilon \ll \bar{\epsilon}$, $\exists \sigma \in \Sigma^*$ sustaining V.

Next, we focus on the games that satisfy Conditions 1 and 2; Condition 1 requires weak symmetry, and Condition 2 is satisfied in prisoner’s dilemma type games. For example, the N person prisoner’s dilemma analyzed by EV is a game that satisfies these conditions; EV consider the symmetric game such that $A_i = \{C_i, D_i\}$ and $\pi_i(D_i, a_{-i}) = \pi_i(C_i, a_{-i}) + 1 = k$ for each $i \in I$, $k \in \{0, \ldots, N-1\}$, and $a_{-i} \in A^k_{-i}$. The next proposition demonstrates that the payoff from mutual cooperation is achievable. Let $U_\varepsilon(v)$ be the $\varepsilon$-neighborhood of v in Euclidean space.

³ In calculating $E[V_i \mid a, \sigma, 0]$, we can ignore the terms of the second or higher powers of the transition probabilities, such as $\alpha_i(a_{-i})\beta_j(a_{-j})$. In fact, $\frac{\delta}{1-\delta}\alpha_i(a_{-i})\beta_j(a_{-j})$ is equal to $\frac{1-\delta}{\delta}\eta_i(C_i, a_{-i})\eta_j(D_j, a_{-j})$, and it goes to zero as $\delta \to 1$.
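To make the linear system (4) concrete, the following Python sketch builds and solves it for a hypothetical three player game and then converts the solution into transition probabilities through the formulas for $\alpha_i$ and $\beta_i$ given above. The stage payoffs, the target values in `V_BY_COOPERATORS`, the discount factor, and all function names are assumptions made for this illustration; they are not taken from the paper's proofs. A small Gauss-Jordan routine is included so that the example needs no external library.

```python
from itertools import product

N = 3  # hypothetical number of players

# Hypothetical target values V_i(a_{-i}); here they depend only on the
# number of cooperating opponents.
V_BY_COOPERATORS = {0: 0.2, 1: 0.5, 2: 0.8}


def stage_payoff(i, a):
    """Hypothetical payoff: a defector earns k and a cooperator k - 1,
    where k is the number of cooperating opponents."""
    k = sum(1 for j in range(N) if j != i and a[j] == "C")
    return k - 1 if a[i] == "C" else k


def value(a, i, flip=None):
    """V_i(a_{-i}), optionally with opponent `flip`'s action switched."""
    k = 0
    for j in range(N):
        if j == i:
            continue
        cooperates = a[j] == "C"
        if j == flip:
            cooperates = not cooperates
        k += cooperates
    return V_BY_COOPERATORS[k]


def solve_linear(M, b):
    """Gauss-Jordan elimination with partial pivoting (assumes M is nonsingular)."""
    n = len(b)
    aug = [row[:] + [b[r]] for r, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [aug[r][n] / aug[r][r] for r in range(n)]


def solve_eta(a):
    """Solve system (4) at profile a for (eta_1(a), ..., eta_N(a))."""
    M = [[0.0] * N for _ in range(N)]
    rhs = [0.0] * N
    for i in range(N):
        base = value(a, i)
        for j in range(N):
            if j != i:
                M[i][j] = value(a, i, flip=j) - base  # Delta^j V_i(a_{-i})
        rhs[i] = stage_payoff(i, a) - base  # pi_i(a) - V_i(a_{-i})
    return solve_linear(M, rhs)


def switching_probabilities(eta_a, delta):
    """-(1 - delta) / delta * eta_j(a): alpha_j if a_j = C, beta_j if a_j = D."""
    return [-(1 - delta) / delta * x for x in eta_a]


ETA = {a: solve_eta(a) for a in product("CD", repeat=N)}
```

In this example every component of the solution is negative, so for a discount factor close to one (for instance 0.99) the implied switching probabilities lie strictly between zero and one. Other payoff or value choices need not produce a negative or even unique solution, which is exactly what the second assumption of Proposition 1 rules out.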
Condition 1. $\pi_i(a) = \pi_j(a)$ for all $i, j \in I$ and $a \in A'$ satisfying $(a_i, a_j) = (C_i, C_j)$ or $(a_i, a_j) = (D_i, D_j)$.

Condition 2. The payoff function satisfies the following properties:
1. For all $i \in I$, $\sup_{a_i \in A_i} \pi_i(a_i, D_{-i}) < \pi_i(C) \le \pi_i(D_i, C_{-i})$.
2. For all $i \in I$ and $a_{-i} \in A'_{-i} \setminus \{C_{-i}\}$, $\pi_i(C_i, a_{-i}) < \pi_i(C)$.
3. If $|I| \ge 3$, then $\pi_i(a) > \pi_j(a)$ for all $i, j \in I$ and $a \in A'_{-(i,j)} \times \{D_i\} \times \{C_j\}$.

Definition 2. A payoff vector $v = (v_i)_{i \in I}$ can be achieved approximately with almost perfect monitoring if $\forall \varepsilon > 0$, $\exists v' \in U_\varepsilon(v)$, $\exists \underline{\delta} \in (0, 1)$, $\forall \delta \in (\underline{\delta}, 1)$, $\exists \bar{\epsilon} \gg 0$, $\forall \epsilon \ll \bar{\epsilon}$, there exists a sequential equilibrium s satisfying $(w_i(s))_{i \in I} = v'$.

Proposition 2. If Conditions 1 and 2 hold, then $\pi(C)$ can be achieved approximately with almost perfect monitoring.

The proof is given in Appendix C. One might think that Condition 1 implies degeneracy, but the further result presented in Appendix A illustrates that our result is not limited to degenerate games. In Section 3.3, we will see that efficiency is achievable in many examples by using Proposition 2.

3.2. Discussion

Our main contribution is to show that efficiency is achievable in a large class of games including the examples analyzed by EV. In particular, our result is not limited to degenerate games, while EV treated only degenerate games. We want to clarify the cause of the difference between our result and EV’s. To simplify the discussion, we assume $A = A'$ and $\epsilon = 0$.

Recall that, given any $\pi$ and V, it suffices for the existence of the equilibrium sustaining V to show the following two properties: the Jacobian of system (3) has full rank, and the derivatives of the transition probabilities with respect to $\delta$ are negative. EV check these properties by obtaining closed-form solutions of the Jacobian and of the derivatives of the transition probabilities. In general, the derivation of such solutions is complex for games with three or more players. EV avoid this difficulty by concentrating on a highly symmetric and degenerate game. Symmetry reduces the rank of system (3) and makes it easy to derive the closed-form solutions of the Jacobian and of the derivatives of the transition probabilities. Thus, EV succeed in obtaining the closed-form solutions and in checking the rank of the Jacobian and the signs of the derivatives for the degenerate (symmetric) games. However, their derivation of the closed-form solutions fails if symmetry is violated. Hence, it has not been clear whether their method is applicable in non-degenerate games.

In contrast, the present paper starts by deriving the necessary and sufficient condition for both the full rank of the Jacobian and the negative derivatives of the transition probabilities; the second assumption of Proposition 1 is indeed necessary and sufficient. It is straightforward that this assumption is still satisfied after perturbing $\pi$ and V. This fact immediately implies that the existence of the equilibria is not limited to degenerate games.

In addition, we have to mention Hörner and Olszewski [7], who prove the folk theorem in all games under almost perfect monitoring by constructing a block strategy. A block strategy treats every consecutive T periods as a block game, and players take complex strategies in every block game. Since our result is limited to some class of N player games, their result is more general
than ours. Moreover, their result says that $\exists \underline{\delta}$, $\exists \bar{\epsilon}$, $\forall \delta \in (\underline{\delta}, 1)$, $\forall \epsilon \ll \bar{\epsilon}$, the equilibrium exists, while our result states only that $\exists \underline{\delta}$, $\forall \delta \in (\underline{\delta}, 1)$, $\exists \bar{\epsilon}$, $\forall \epsilon \ll \bar{\epsilon}$, the equilibrium exists. Clearly, the former statement is stronger than the latter.

However, there are several advantages in our result. First, our equilibrium is simpler than a block strategy; the former takes only $C_i$ or $D_i$ in every period, while the latter assigns positive probabilities to all actions in almost all periods. Moreover, our equilibrium needs only one period memory, but a block strategy requires T periods (one block game) memory. Furthermore, our equilibrium is belief-free, that is, a player’s best response is not affected by the opponents’ history. Section 4 generalizes the result of this section to the case without accurate monitoring via review strategies, which are compatible with the belief-free property. However, a block strategy is not belief-free, and so it is difficult to generalize their result to the case of far from perfect monitoring. This point is explained more precisely by Hörner and Olszewski [7].

Finally, we discuss the applicability of our method to repeated games with finitely many periods. One might expect that it is possible to construct the equilibria in those games by analogy with infinitely repeated games. However, such an extension is difficult. To see this, suppose that we try to construct equilibria of the finitely repeated game in which players choose actions other than the static Nash equilibria. Then, a player’s best reply in the final period has to depend on her beliefs about the opponents’ histories. (If not, players have to play a Nash equilibrium in that period since there are no continuation payoffs and no dynamic incentives.) This is contrary to the characteristics of our equilibria; we consider the equilibria where each player’s best reply is independent of her beliefs. Thus, there is no obvious way to extend our result to finitely repeated games. In fact, Sekiguchi [17] succeeds in building non-trivial equilibria in some class of finitely repeated games, but his equilibria are completely different from ours.

3.3. Examples

3.3.1. Trading goods

There are N players. In each period, player i produces N units of good i. Then, one by one, she exchanges her own products with the others’ and ends up with all kinds of products. At the end of each period, player i consumes all the goods she holds.

Suppose that there are two types of each good: high quality and low quality. In each period, player i determines an unobservable effort level in production. When she makes an effort, each of her products has high quality and low quality with probabilities $1-\epsilon$ and $\epsilon$, respectively. When she does not, these probabilities are $\epsilon$ and $1-\epsilon$. Quality can be observed when the good is consumed. So, each player can acquire imperfect and private information about the opponents’ unobservable effort levels from the quality of the goods which she consumed. We assume that the quality of the goods produced by i is independent of the quality of the goods produced by others. However, the quality of the good for player j produced by i may depend on the quality of the good for player l produced by i.

Let $A_i = \{C_i, D_i\}$, where $C_i$ means making an effort and $D_i$ means not doing so. We denote by $\Omega_i = \{L, H\}^N$ the set of signals for player i, where L means low quality and H means high quality. Let $e > 0$ be the cost of effort, and let $u(k)$ be the utility of a player when she consumes k high quality goods and $N-k$ low quality goods. Then, we can write $g_i(C_i, \omega_i) = u(k) - e$ and $g_i(D_i, \omega_i) = u(k)$ when $\omega_i$ includes k high quality goods.

Since the production by i is independent of that by others, Condition 1 is satisfied in this game. Moreover, as $\epsilon$ converges to zero, $\pi_i(C_i, a_{-i})$ and $\pi_i(D_i, a_{-i})$ approximate $u(k+1) - e$ and $u(k)$, respectively, for all
$a_{-i} \in A^k_{-i}$. Therefore, if $u(0) < u(N) - e < u(N-1)$ and $u(k) < u(k+1)$ for each k, then Condition 2 holds for a sufficiently small $\epsilon$. So, together with Proposition 2, the payoffs in mutual cooperation can be achieved approximately for a sufficiently large $\delta$.

We point out that the trading game has the same structure as the following partnership game with subjective evaluation. Suppose that N players work jointly and players choose effort levels as in the trading game. In our model, player i makes a subjective evaluation of the opponents’ efforts, $\omega_i \in \Omega_i = A_{-i}$, which does not affect players’ profits. The joint work succeeds with probability $u(k)$ for an action profile $a \in A^k$, and the success of the work provides one unit of profit to each player, while failure brings nothing. Then, for all $i \in I$ and $a_{-i} \in A^k_{-i}$, we have $\pi_i(C_i, a_{-i}) = u(k+1) - e$ and $\pi_i(D_i, a_{-i}) = u(k)$. Thus, the payoffs in mutual cooperation can be achieved approximately if $u(k)$ satisfies the same conditions as in the trading game, $\delta$ is close to one, and monitoring is almost perfect.

3.3.2. Price-setting oligopoly

There are N sellers, and in each period each seller i sets the price of her own commodity, chosen from $A_i = \{0, \frac{1}{a}, \frac{2}{a}, \ldots, 1\}$, where a is a large integer. Sellers cannot observe the opponents’ prices but observe their own sales $\omega_i \in \Omega_i$ at the end of the period. Suppose that $\omega_i$ is private information. When seller i sets $a_i$ and sells $\omega_i$, her current gain is $g_i(a_i, \omega_i) = a_i \omega_i - e_i(\omega_i)$, where $e_i(\omega_i)$ stands for the cost of her production. Define $\pi_i(a) = \sum_{\omega \in \Omega} m(\omega|a) g_i(a_i, \omega_i)$. Let $r_i : A_{-i} \to A_i$ be i’s best response.

Given an action profile $a \in A$, construct a new action profile by exchanging i’s action for j’s action, and call it $a(i, j)$. Similarly, given any $\omega \in \Omega$, define $\omega(i, j)$. Then, assume that $\pi_i(a) = \pi_j(a(i, j))$ and $m(\omega|a) = m(\omega(i, j)|a(i, j))$ for all $i \in I$, $j \ne i$, $a \in A$, and $\omega \in \Omega$. In words, we assume that the stage game is symmetric.

Suppose also that $\pi_i(a)$ is strictly increasing with respect to $a_j$, $r_i(a_{-i})$ is weakly increasing with respect to $a_j$, and $\pi_i(a)$ is single-peaked with respect to $a_i$, that is, $\pi_i(a) < \pi_i(a_i + \frac{1}{a}, a_{-i})$ for all $0 \le a_i < r_i(a_{-i})$, and $\pi_i(a) > \pi_i(a_i + \frac{1}{a}, a_{-i})$ when $r_i(a_{-i}) \le a_i < 1$. Suppose that there is a Nash equilibrium $a^* \in A$ satisfying $a_i^* = a_j^*$ and $p \ge r_i(p, \ldots, p)$ for all $i, j \in I$ and $p \ge a_i^*$. Suppose also that there is a cartel price $C_i \in A_i$, defined as the price c such that $c > a_i^*$ and $\pi_i(c, \ldots, c) > \pi_i(p, \ldots, p)$ for all $p \ne c$. Assume that the cartel $(C_1, \ldots, C_N)$ is not a Nash equilibrium. Choose the punishment price $D_i$ as the maximal price d satisfying $\pi_i(r_i(d, \ldots, d), d, \ldots, d) < \pi_i(C)$. From the above assumptions, we have $r_i(D_{-i}) \le D_i < C_i$.

We will show that Conditions 1 and 2 are satisfied in this game. From symmetry, Condition 1 holds. Conditions 2.1 and 2.2 also hold from the definition of D and the monotonicity of $\pi_i$ with respect to $a_j$. Since a is a large integer, it is natural to assume that $\pi_i(r_i(D_{-i}), D_{-i})$ approximates $\pi_i(C)$. Then, since $\pi_i$ is increasing with respect to $a_j$, $\pi_i(r_i(D_{-i}), a_{-i})$ is greater than $\pi_i(C)$ for all $a_{-i} \in A'_{-i} \setminus \{D_{-i}\}$. Then, Condition 2.3 holds since $\pi_i(C_i, a_{-i}) < \pi_i(C)$, $r_i(D_{-i}) \le D_i < C_i$, and $\pi_i$ is single-peaked.

Suppose that, for each $i \in I$, $a_i \in \{C_i, D_i\}$, and $k \in \{0, \ldots, N-1\}$, there exists $\omega_i(a_i, k) \in \Omega_i$ such that player i observes $\omega_i(a_i, k)$ with a probability greater than $1-\epsilon$ when she took $a_i$ and others chose $a_{-i} \in A^k_{-i}$. Assume also that, for each $i \in I$, $a_i \in \{C_i, D_i\}$, and $k \in \{0, \ldots, N-1\}$, player i observes $\omega_i(a_i, k)$ with a probability less than $\epsilon$ when $a_{-i} \notin A^k_{-i}$ was taken. Consider the case where $\epsilon$ is sufficiently small, that is, every player can know almost perfectly how many players took the cartel price as long as $a \in A'$ is chosen. Then, from symmetry and Proposition 2, the cartel profits can be achieved approximately if $\delta$ is close to one. (Symmetry assures that player i does not have to distinguish between $a_{-i} \in A^k_{-i}$ and $a'_{-i} \in A^k_{-i}$ in the equilibrium.) In our equilibrium, each player chooses her action between two prices, the cartel price and the
punishment price, in every period. Notice that the punishment price is generally higher than the equilibrium price in the one-shot stage game.

4. Review strategy

This section extends the results of the last section to the cases where monitoring is not accurate. Matsushima [13] attains efficiency in some two player games, even if the monitoring technology is far from accurate, by using a review strategy. We demonstrate that a review strategy is generalizable to N player games.

4.1. Result

This section assumes that a signal structure $m$ can be decomposed into two functions denoted by $\tilde m$ and $f$ as follows. In every period, after the choice of the action $a \in A$, an unobservable macro shock $\theta$ is randomly drawn according to the conditional probability function $f(\cdot|a) : \Theta \to [0, 1]$, where $\Theta$ is the finite set of possible macro shocks. Then, a private signal $\omega$ is drawn according to the conditional probability function $\tilde m(\cdot|a, \theta) : \Omega \to [0, 1]$. Hence, we can write $m(\omega|a) = \sum_{\theta \in \Theta} \tilde m(\omega|a, \theta) f(\theta|a)$.

Definition 3. Private signals are hemiindependent if $m$ can be decomposed into $\tilde m$ and $f$, and private signals are independent conditional on $a$ and $\theta$ in the sense that $\tilde m(\omega|a, \theta) = \prod_{i \in I} \tilde m_i(\omega_i|a, \theta)$ for all $\omega \in \Omega$.

Matsushima [13] calls hemiindependence correlation only through a macro shock. Private signals are conditionally independent if signals are hemiindependent and $|\Theta| = 1$. We denote by $M(\Omega, \Theta, \tilde m, f)$ the above information structure. Moreover, given any $\Omega$ and $\Theta$, let $M^*(\Omega, \Theta)$ be the set of all $M(\Omega, \Theta, \tilde m, f)$. This section no longer assumes $\Omega_i = A_{-i}$, but assumes, for all $i \in I$,

$$|\Omega_i| \ge |A_{-i}| \times |\Theta|. \qquad (6)$$
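As an illustration of a signal structure that is correlated only through a macro shock, the following sketch builds the joint signal distribution for one fixed action profile from a shock distribution and player-wise conditional distributions, and checks that the signals are correlated unconditionally although they are independent given the shock. All numbers and names (`f`, `m_tilde_i`, `joint`) are hypothetical.

```python
from itertools import product

# Hypothetical two-player example for one fixed action profile a.
# f[theta]: probability of macro shock theta given a.
f = {"boom": 0.5, "bust": 0.5}
# m_tilde_i[i][theta][w]: probability that player i sees signal w given (a, theta).
m_tilde_i = [
    {"boom": {"g": 0.8, "b": 0.2}, "bust": {"g": 0.3, "b": 0.7}},
    {"boom": {"g": 0.7, "b": 0.3}, "bust": {"g": 0.2, "b": 0.8}},
]
signals = ["g", "b"]

def joint(omega):
    """Joint probability of the signal profile omega: mix over shocks of the
    product of the players' conditional signal probabilities."""
    total = 0.0
    for theta, p_theta in f.items():
        p = p_theta
        for i, w in enumerate(omega):
            p *= m_tilde_i[i][theta][w]
        total += p
    return total

m = {omega: joint(omega) for omega in product(signals, repeat=2)}
total_mass = sum(m.values())

# Marginal probability that each player observes "g".
p1_g = m[("g", "g")] + m[("g", "b")]
p2_g = m[("g", "g")] + m[("b", "g")]
# The joint probability differs from the product of marginals:
# the signals are correlated, but only through the shock.
correlated = abs(m[("g", "g")] - p1_g * p2_g) > 1e-9
print(total_mass, m[("g", "g")], p1_g * p2_g, correlated)
```

With a single possible shock the same construction would give a product distribution, which corresponds to the conditionally independent case.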
The following corollary is the efficiency result without the accuracy of monitoring.

Definition 4. A payoff vector $v$ can be achieved approximately if $\forall \varepsilon > 0$, $\exists v' \in U_\varepsilon(v)$, $\exists \underline{\delta} \in (0, 1)$, $\forall \delta \in (\underline{\delta}, 1)$, there exists a sequential equilibrium $s$ satisfying $(w_i(s))_{i \in I} = v'$.

Corollary 1. If private signals are hemiindependent, Conditions 1 and 2 hold, and (6) holds for all $i \in I$, then $\pi(C)$ can be achieved approximately a.e. in $M^*(\Omega, \Theta)$.

This corollary is obtained from Theorem 1, which is discussed later, by using the same $V$ as in the proof of Proposition 2. Corollary 1 enables us to attain efficiency in the previous examples without the accuracy of monitoring if $\Omega_i$ is large enough and signals are hemiindependent.

We begin by providing a brief sketch of review strategies. In a review strategy, players regard every $T$ consecutive periods as a review phase, and they take constant actions in every review phase. Namely, if player $i$ chose an action $a_i$ in the initial period of some review phase, then she continues to take $a_i$ until the last period of that phase.

Fig. 2. Transition rule of $\sigma_i^T \in \tilde\Sigma_i^T$ against a history $h_i^T$: after playing $C_i$ for $T$ periods, player $i$ plays $D_i$ in the next phase with probability $\alpha_i^T(h_i^T)$ and $C_i$ with probability $1 - \alpha_i^T(h_i^T)$; after playing $D_i$ for $T$ periods, she plays $C_i$ with probability $\beta_i^T(h_i^T)$ and $D_i$ with probability $1 - \beta_i^T(h_i^T)$; after any other actions, she plays $C_i$ with probability 1.

A review strategy recovers the accuracy of monitoring in the following sense. Suppose that player $i$ tries to know what action player $j$ took, $C_j$ or $D_j$. Assume also that, in every period,
player $i$ observes $\omega_i$ more frequently for $D_j$ than for $C_j$. In such a case, player $i$ can guess that the opponent took $D_j$ in the review phase if $\omega_i$ occurred many times during the phase, and she can guess that the opponent took $C_j$ otherwise. The law of large numbers assures that her guess is almost correct when $T$ is large enough. In this way, a review strategy allows players to know the opponents' actions almost perfectly.

Given a natural number $T$, define $\tilde\Sigma_i^T$ as the set of all of player $i$'s strategies satisfying the following properties. In the first review phase (from period 1 to period $T$), she chooses $C_i$ or $D_i$ constantly. In the $n$th review phase (from period $(n-1)T + 1$ to period $nT$), she mixes constant actions $C_i$ and $D_i$ depending on the history during the $(n-1)$th review phase. If she played $C_i$ in all periods of the $(n-1)$th phase, and the history during that phase is given by $h_i^T \in H_i^T$, then she chooses constant actions $C_i$ and $D_i$ with probabilities $1 - \alpha_i^T(h_i^T)$ and $\alpha_i^T(h_i^T)$, respectively. If she played $D_i$ in all periods of the $(n-1)$th phase, and the history during that phase is $h_i^T$, then she chooses $C_i$ and $D_i$ with probabilities $\beta_i^T(h_i^T)$ and $1 - \beta_i^T(h_i^T)$. Otherwise, she plays $C_i$ constantly. See Fig. 2.

Let $s_i^T : \bigcup_{t=0}^{T-1} H_i^t \to A_i$ be player $i$'s strategy in a $T$ period repeated game, and let $S_i^T$ be the set of all $s_i^T$. For any review strategy $\sigma_i^T \in \tilde\Sigma_i^T$ and any strategy $s_i^T \in S_i^T$, we denote by $\sigma_i^T(s_i^T)$ the strategy that plays $s_i^T$ in the first $T$ periods and then follows $\sigma_i^T$ from the second review phase. In particular, when $s_i^T$ is equivalent to a constant action $a_i$, we write $\sigma_i^T(a_i)$ instead of $\sigma_i^T(s_i^T)$ for simplicity. Then, given any $V$, we say that $\sigma^T \in \tilde\Sigma^T$ sustains $V$ if

$$V_i(a_{-i}) = w_i(\sigma^T(C_i, a_{-i})) = w_i(\sigma^T(D_i, a_{-i})) \ge w_i(\sigma_i^T(s_i^T), \sigma_{-i}^T(a_{-i})) \qquad (7)$$

for all $i \in I$, $a \in \prod_{j \in I} \{C_j, D_j\}$, and $s_i^T \in S_i^T$. Eq. (7) means that both constant actions $C_i$ and $D_i$ are optimal for player $i$ in every review phase irrespective of her opponents' history. Therefore, given any $V$, a review strategy profile $\sigma^T$ sustaining $V$ is a Nash equilibrium.

The main result of this section is the following theorem.

Theorem 1. Suppose that private signals are hemiindependent and (6) holds for all $i \in I$. Fix a vector $V$ that satisfies the following properties:
1. $\sup_{a_i^* \in A_i} \pi_i(a_i^*, D_{-i}) < V_i(D_{-i}) < V_i(a_{-i})$ for all $i \in I$ and $a_{-i} \ne D_{-i}$.
2. Simultaneous equations (4) have a unique solution $\lambda < 0$.
3. $V_i(C_j, a_{-(i,j)}) > V_i(D_j, a_{-(i,j)})$ for all $i, j \in I$ and $a_{-(i,j)} \in A_{-(i,j)}$.
Then, a.e. in $M^*(\Omega, \Theta)$, $\exists \bar T$, $\forall T > \bar T$, $\exists \underline{\delta} \in (0, 1)$, $\forall \delta \in (\underline{\delta}, 1)$, there exists a strategy profile $\sigma^T \in \tilde\Sigma^T$ that sustains $V$.
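The statistical idea behind a review phase can be illustrated numerically. The sketch below assumes, purely for illustration, that some event is observed in each period with probability 0.3 under one constant action of the opponent and with probability 0.6 under the other, and that a player guesses by comparing the count over $T$ periods with a threshold at 45% of $T$. It computes the exact probability of a wrong guess from the binomial distribution. The probabilities, the threshold rule, and the function names are assumptions, not the construction used in the proof.

```python
from math import comb

def binom_cdf(z, T, q):
    """P(Binomial(T, q) <= z)."""
    return sum(comb(T, r) * q**r * (1 - q)**(T - r) for r in range(z + 1))

def error_probabilities(T, q_low=0.3, q_high=0.6, threshold_pct=45):
    """Error rates of the rule 'guess the low-frequency action iff count <= z'."""
    z = threshold_pct * T // 100               # hypothetical integer threshold
    miss_low = 1 - binom_cdf(z, T, q_low)      # count too high under q_low
    miss_high = binom_cdf(z, T, q_high)        # count too low under q_high
    return miss_low, miss_high

errors = {T: error_probabilities(T) for T in (10, 50, 200)}
for T, (e_low, e_high) in errors.items():
    print(T, round(e_low, 6), round(e_high, 6))
```

Both error probabilities shrink as the phase length grows, which is the sense in which a long review phase behaves like a single period with almost perfect monitoring.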
The basic idea behind the theorem is quite simple. Recall that, in review strategies, players can know the opponents' constant actions almost perfectly. Therefore, we can regard every review phase as one period with almost perfect monitoring when players take only constant actions. Then, on the analogy of Proposition 1, we can construct the review strategy where players are indifferent between constant actions $C_i$ and $D_i$ in every review phase. Thus, we can finish the proof by showing that non-constant actions are not profitable compared to constant actions $C_i$ and $D_i$.

4.2. The proof of Theorem 1

This subsection is devoted to the proof of the theorem. Fix $V$ satisfying the assumptions of the theorem. Let $\lambda$ be a solution of (4). Then, take two sequences $(a_{-i}^{C_i,1}, \dots, a_{-i}^{C_i,2^{N-1}})$ and $(a_{-i}^{D_i,1}, \dots, a_{-i}^{D_i,2^{N-1}})$ by ordering all the elements in $A_{-i}$ subject to

$$\lambda_i(C_i, a_{-i}^{C_i,k+1}) \ge \lambda_i(C_i, a_{-i}^{C_i,k}) \quad \text{and} \quad \lambda_i(D_i, a_{-i}^{D_i,k+1}) \le \lambda_i(D_i, a_{-i}^{D_i,k}) \qquad (8)$$

for all $k$. Since we do not use (8) for a while, readers need only remember that each of them is a sequence of all the elements in $A_{-i}$.

In this proof, we establish a review strategy where players detect the opponents' action through random events. Here, a random event $\psi_i$ is defined as a function from $A_i \times \Omega_i$ to $[0, 1]$, and it is interpreted as follows. Suppose that, in each period, player $i$ chooses a real number $x_i$ from $[0, 1]$ according to the uniform distribution. Then, player $i$'s history in period $t$ is written as $(a_i^t, \omega_i^t, x_i^t)$. We say that a random event $\psi_i$ occurs in period $t$ if $x_i^t \le \psi_i(a_i^t, \omega_i^t)$. Since $x_i^t$ follows the uniform distribution, $\psi_i$ occurs in period $t$ with probability $\psi_i(a_i^t, \omega_i^t)$. Define $q(\psi_i|a)$ as $\sum_{\omega \in \Omega} m(\omega|a) \psi_i(a_i, \omega_i)$. In words, $q(\psi_i|a)$ is the probability that a random event $\psi_i$ occurs when players chose an action profile $a$. Similarly, define $q(\psi_i|a, \omega_{-i})$ as $\sum_{\omega_i \in \Omega_i} m(\omega_i|a, \omega_{-i}) \psi_i(a_i, \omega_i)$, where $m(\omega_i|a, \omega_{-i})$ is the conditional distribution of $\omega_i$ derived from $m(\cdot|a)$; this is the probability that a random event $\psi_i$ occurs given a private signal $\omega_{-i}$ and an action profile $a$.

We allow players to access multiple random events in each period. For example, two kinds of random events $\psi_i$ and $\psi_i'$ occur simultaneously in period $t$ if $x_i^t \le \psi_i(a_i^t, \omega_i^t)$ and $x_i^t \le \psi_i'(a_i^t, \omega_i^t)$.

The following lemma defines the particular random events $\psi_i^k$ for all $k \in \{1, \dots, 2^{N-1}\}$ and $\psi_i^{a_{-i}}$ for all $a_{-i} \in A_{-i}$, and we will construct the review strategy where players use these random events in each period. Let $\Psi_i$ be the set of these random events, that is, for all $i \in I$, $\Psi_i$ is defined as the union of $\{\psi_i^k \mid 1 \le k \le 2^{N-1}\}$ and $\{\psi_i^{a_{-i}} \mid a_{-i} \in A_{-i}\}$. The proof of the lemma was dramatically improved by the referees' comments and is stated in Appendix D.

Lemma 1. Suppose that private signals are hemiindependent and (6) holds for all $i \in I$. Then, for all $i \in I$, $k \in \{1, \dots, 2^{N-1}\}$, and $a_{-i} \in A_{-i}$, there exist random events $\psi_i^k$ and $\psi_i^{a_{-i}}$ satisfying Condition 3 a.e. in $M^*(\Omega, \Theta)$.
Condition 3. Random events $\psi_i^k$ and $\psi_i^{a_{-i}}$ satisfy the following properties:

1. For all $i \in I$, $a_i \in \{C_i, D_i\}$, $\tilde a \notin \prod_{j \in I} \{C_j, D_j\}$, $k \in \{1, \dots, 2^{N-1}\}$, and $l \in \{1, \dots, 2^{N-1}\}$,

$$q(\psi_i^k|\tilde a) = \overline{q} \quad \text{and} \quad q(\psi_i^k|a_i, a_{-i}^{a_i,l}) = \begin{cases} \overline{q} & \text{if } k > l, \\ \underline{q} & \text{if } k \le l. \end{cases}$$

For all $i \in I$, $a_{-i} \in A_{-i}$, and $\tilde a \in A$,

$$q(\psi_i^{a_{-i}}|\tilde a) = \begin{cases} \underline{q} & \text{if } \tilde a_{-i} = a_{-i}, \\ \overline{q} & \text{if } \tilde a_{-i} \ne a_{-i}. \end{cases}$$

Here, $\underline{q}$ and $\overline{q}$ are probabilities that satisfy $0 < \underline{q} < \overline{q} < 1$.
2. For all $i \in I$, $a \in A$, $\omega_{-i} \in \Omega_{-i}$, and $\psi_i \in \Psi_i$, $q(\psi_i|a) = q(\psi_i|a, \omega_{-i})$.
3. Fix $\psi_i \in \Psi_i$ for all $i \in I$. Then, $\psi_1, \dots, \psi_N$ are independent.

Let us interpret Condition 3 for random events $\psi_i^k$. Condition 3.2 implies that player $j$'s private signal $\omega_j$ has no information about whether her opponent's random event $\psi_i$ occurs or not. Condition 3.3 says that $i$'s random event and $j$'s random event are independent. These conditions are used later.

Condition 3.1 means that, if players take an action profile $(a_i, a_{-i}^{a_i,l})$, then a random event $\psi_i^k$ occurs with a high probability $\overline{q}$ for each $k \in \{l+1, \dots, 2^{N-1}\}$, and with a low probability $\underline{q}$ for each $k \in \{1, \dots, l\}$. This property implies that player $i$ can detect the opponents' action through the combination of the random events $\psi_i^1, \psi_i^2, \dots, \psi_i^{2^{N-1}}$. Namely, player $i$ can guess that the opponents took $a_{-i}^{a_i,l} \in A_{-i}$ in the review phase if she took $a_i$ and if $\psi_i^k$ occurred many times for each $k \in \{l+1, \dots, 2^{N-1}\}$ but not for each $k \in \{1, \dots, l\}$. Indeed, if players took $(a_i, a_{-i}^{a_i,l})$ constantly in the review phase, then $\psi_i^k$ would occur about $\overline{q} T$ times during the phase for each $k > l$ and about $\underline{q} T$ times for each $k \le l$.

We state how players guess the opponents' actions more precisely. Fix a threshold $Z_T$ as an integer satisfying $\underline{q} T < Z_T < \overline{q} T$, and denote by $X_i^k(Z_T)$ (or simply $X_i^k$ if there is no confusion) the event that a random event $\psi_i^k$ occurs less than or equal to $Z_T$ times during some review phase. Then, player $i$ guesses that the opponents took $a_{-i}^{a_i,l} \in A_{-i}$ in the review phase if she took $a_i$ and if $X_i^k$ occurred for every $k \le l$ but not for any $k > l$. From the law of large numbers, the guess becomes almost perfect as $T$ goes to infinity. That is, when players took $(a_i, a_{-i}^{a_i,l})$ constantly, the probability that $X_i^k$ occurs approximates one for every $k \le l$ and zero for every $k > l$.

Similarly to $X_i^k$, we denote by $X_i^{a_{-i}}(Z_T)$ (or simply $X_i^{a_{-i}}$) the event that a random event $\psi_i^{a_{-i}}$ occurs less than or equal to $Z_T$ times during some review phase. Then, define $\Sigma_i^T$ as the set of all review strategies $\sigma_i^T \in \tilde\Sigma_i^T$ such that there exist a threshold $Z_T$ and real numbers $\alpha_i^k$, $\alpha_i^{a_{-i}}$, $\beta_i^k$, and $\beta_i^{a_{-i}}$ for each $k \in \{0, \dots, 2^{N-1}\}$ and $a_{-i} \in A_{-i}$ such that $\alpha_i^T$ and $\beta_i^T$ are written as
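$$\alpha_i^T(h_i^T) = \alpha_i^0 - \sum_{k=1}^{2^{N-1}} 1\{X_i^k(Z_T)|h_i^T\}\, \alpha_i^k - \sum_{a_{-i} \in A_{-i}} 1\{X_i^{a_{-i}}(Z_T)|h_i^T\}\, \alpha_i^{a_{-i}}$$

and

$$\beta_i^T(h_i^T) = \beta_i^0 + \sum_{k=1}^{2^{N-1}} 1\{X_i^k(Z_T)|h_i^T\}\, \beta_i^k + \sum_{a_{-i} \in A_{-i}} 1\{X_i^{a_{-i}}(Z_T)|h_i^T\}\, \beta_i^{a_{-i}}$$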
for each $h_i^T \in H_i^T$. Here, $1\{X|h_i^T\}$ is an indicator function that takes one if the event $X$ occurs in the review phase whose history is given by $h_i^T = (a_i^t, \omega_i^t, x_i^t)_{t=1}^T$ (and takes zero otherwise). For example, suppose that player $i$ who follows a review strategy $\sigma_i^T \in \Sigma_i^T$ took a constant action $C_i$ and only the events $X_i^1$ and $X_i^2$ occurred in the $(n-1)$th phase. Then, in the $n$th phase, she chooses constant actions $C_i$ and $D_i$ with probabilities $1 - \alpha_i^T$ and $\alpha_i^T$, where $\alpha_i^T = \alpha_i^0 - \alpha_i^1 - \alpha_i^2$.

In the following argument, we prove the theorem by constructing a review strategy $\sigma^T \in \Sigma^T$ that sustains $V$. In particular, when (8) holds strictly for all $i \in I$ and $k$, we focus on $\sigma_i^T \in \Sigma_i^T$ satisfying $\alpha_i^{a_{-i}} = \beta_i^{a_{-i}} = 0$ for all $a_{-i} \in A_{-i}$, that is, we consider the review strategy where the transition probabilities $\alpha_i^T$ and $\beta_i^T$ are independent of the random events $\psi_i^{a_{-i}}$. In other words, in constructing the equilibrium, we use $\psi_i^{a_{-i}}$ only in the degenerate case where (8) holds with equality for some $i \in I$ and $k$.

One of the advantages of considering $\sigma^T \in \Sigma^T$ is that players have no incentive to take history dependent strategies in every review phase. Recall that, from Condition 3.2, each player's history has no information about the occurrence of the opponents' random events. This implies that each player's history has no information about the opponents' continuation strategy from the next phase, since the transition probabilities $\alpha_i^T$ and $\beta_i^T$ depend only on the occurrence of the random events. Thus, history dependent strategies cannot earn more profits than history independent strategies.

This property enables us to simplify the proof of (7). Let $\{a_i^t\}$ be a sequence of $i$'s actions $(a_i^1, \dots, a_i^T) \in A_i^T$, and let $\sigma_i^T(\{a_i^t\})$ be the strategy that plays the action $a_i^t$ in period $t$ for all $1 \le t \le T$ and then follows $\sigma_i^T$ from period $T + 1$. From the above argument, in showing (7), we can replace a history dependent strategy $s_i^T$ with a history independent strategy $\{a_i^t\}$. Thus, from the theory of dynamic programming, we can say that $\sigma^T \in \Sigma^T$ sustains $V$ if

$$V_i(a_{-i}) \ge (1 - \delta) \sum_{t=1}^{T} \delta^{t-1} \pi_i(a_i^t, a_{-i}) + \delta^T E[V_i|\{a_i^t\}, a_{-i}, \sigma^T] \qquad (9)$$
for all $i \in I$, $a_{-i} \in A_{-i}$, and $\{a_i^t\} \in A_i^T$, with equality if $\{a_i^t\} = (C_i, \dots, C_i)$ or $\{a_i^t\} = (D_i, \dots, D_i)$. Here, $E[V_i|\{a_i^t\}, a_{-i}, \sigma^T]$ means the continuation payoff from the second phase, and it is defined as $\sum_{\tilde a_{-i} \in A_{-i}} \rho^T(\tilde a_{-i}|\{a_i^t\}, a_{-i}, \sigma^T) V_i(\tilde a_{-i})$, where $\rho^T(\tilde a_{-i}|\{a_i^t\}, a_{-i}, \sigma^T)$ is the probability that $\tilde a_{-i}$ is chosen in the second phase for $(\sigma_i^T(\{a_i^t\}), \sigma_{-i}^T(a_{-i}))$.

The second advantage of considering $\sigma^T \in \Sigma^T$ is that we can represent $\rho^T$ in the following simple form. Let $P(X|\{a_i^t\}, a_{-i})$ be the probability that the event $X$ occurs in the first phase when player $i$ performed $\{a_i^t\}$ and the opponents took $a_{-i}$ constantly. In particular, we rewrite it as $P(X|a)$ when $\{a_i^t\}$ means a constant action $a_i$. Let $\rho_j^T(\tilde a_j|\{a_i^t\}, a_{-i}, \sigma^T)$ be the probability that player $j$ chooses $\tilde a_j$ in the second phase for a strategy $(\sigma_i^T(\{a_i^t\}), \sigma_{-i}^T(a_{-i}))$. Note that $\rho_j^T$ is evaluated as the expectation of $\alpha_j^T$ or $\beta_j^T$. Then, in the non-degenerate cases (so we assume $\alpha_i^{a_{-i}} = \beta_i^{a_{-i}} = 0$), we obtain

$$\rho_j^T(D_j|\{a_i^t\}, a_{-i}, \sigma^T) = \alpha_j^0 - \sum_{k=1}^{2^{N-1}} \alpha_j^k \cdot P(X_j^k|\{a_i^t\}, a_{-i}), \qquad (10.1)$$

$$\rho_j^T(C_j|\{a_i^t\}, a_{-i}, \sigma^T) = 1 - \rho_j^T(D_j|\{a_i^t\}, a_{-i}, \sigma^T), \qquad (10.2)$$

$$\rho_j^T(C_j|\{a_i^t\}, a'_{-i}, \sigma^T) = \beta_j^0 + \sum_{k=1}^{2^{N-1}} \beta_j^k \cdot P(X_j^k|\{a_i^t\}, a'_{-i}), \qquad (10.3)$$

$$\rho_j^T(D_j|\{a_i^t\}, a'_{-i}, \sigma^T) = 1 - \rho_j^T(C_j|\{a_i^t\}, a'_{-i}, \sigma^T) \qquad (10.4)$$
for all $i \in I$, $j \ne i$, $a_{-i} \in A_{-i}$ satisfying $a_j = C_j$, and $a'_{-i} \in A_{-i}$ satisfying $a'_j = D_j$. Moreover, we have

$$\rho^T(\tilde a_{-i}|\{a_i^t\}, a_{-i}, \sigma^T) = \prod_{j \ne i} \rho_j^T(\tilde a_j|\{a_i^t\}, a_{-i}, \sigma^T) \qquad (10.5)$$

since Condition 3.3 implies that the occurrence of the random events is independent across players. These equations enable us to calculate $\rho^T$ by using Condition 3.1. For example, $P(X_j^k(Z_T)|a)$ is the cumulative distribution function of the binomial distribution with parameters $T$ and $q(\psi_j^k|a)$, evaluated at $Z_T$.

From Lemma 1 and the above discussion, to prove Theorem 1 it suffices to show the following lemma. The formal proof is given in Appendix E.

Lemma 2. Suppose that there exist random events satisfying Condition 3. Then, $\exists \bar T$, $\forall T > \bar T$, $\exists \underline{\delta} \in (0, 1)$, $\forall \delta \in (\underline{\delta}, 1)$, $\exists \sigma^T \in \Sigma^T$ satisfying (9) for all $i \in I$, $a_{-i} \in A_{-i}$, and $\{a_i^t\} \in A_i^T$, with equality if $\{a_i^t\} \in \{(C_i, \dots, C_i), (D_i, \dots, D_i)\}$.

The proof is divided into two steps. First, we show that there exists $\sigma^T \in \Sigma^T$ satisfying (9) with equality for all $i \in I$, $a_{-i} \in A_{-i}$, and $\{a_i^t\} \in \{(C_i, \dots, C_i), (D_i, \dots, D_i)\}$. That is, we prove that there exists $\sigma^T \in \Sigma^T$ satisfying

$$V_i(a_{-i}) = (1 - \delta^T) \pi_i(a) + \delta^T E[V_i|a, \sigma^T] \qquad (11)$$

for all $i \in I$ and $a \in \prod_{j \in I} \{C_j, D_j\}$. Here, $E[V_i|a, \sigma^T]$ is defined as $E[V_i|\{a_i^t\}, a_{-i}, \sigma^T]$ where $\{a_i^t\}$ represents a constant action $a_i$. Second, we prove that such a review strategy $\sigma^T$ satisfies (9) for all $i \in I$, $a_{-i} \in A_{-i}$, and $\{a_i^t\} \notin \{(C_i, \dots, C_i), (D_i, \dots, D_i)\}$. In words, we first show the existence of the review strategy where both constant actions $C_i$ and $D_i$ are indifferent in every phase, and then prove that other behavior is not optimal.

In the rest of this subsection, we will see that (4) is still a key system in the first step of the proof, as in the almost perfect monitoring case. Namely, we informally explain that, if simultaneous equations (4) have a negative solution $\lambda$, then there exists $\sigma^T \in \Sigma^T$ satisfying (11) for all $i \in I$ and $a \in \prod_{j \in I} \{C_j, D_j\}$. Fix an integer $T$ sufficiently large. Observe that (11) is equivalent to

$$\frac{\delta^T}{1 - \delta^T} \left( V_i(a_{-i}) - E[V_i|a, \sigma^T] \right) = \pi_i(a) - V_i(a_{-i}) \qquad (12)$$
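for all $i \in I$ and $a \in \prod_{j \in I} \{C_j, D_j\}$. Fix a positive number $\lambda_i(C_i, 0)$ sufficiently large and a negative number $\lambda_i(D_i, 0)$ close to zero. Consider a review strategy $\sigma_i^T \in \Sigma_i^T$ satisfying $\alpha_i^{a_{-i}} = \beta_i^{a_{-i}} = 0$ for all $a_{-i} \in A_{-i}$,

$$\alpha_i^k = \begin{cases} \dfrac{1 - \delta^T}{\delta^T} \left( \lambda_i(C_i, a_{-i}^{C_i,k}) - \lambda_i(C_i, a_{-i}^{C_i,k-1}) \right) & \text{if } k > 1, \\[2mm] \dfrac{1 - \delta^T}{\delta^T} \left( \lambda_i(C_i, a_{-i}^{C_i,1}) + \lambda_i(C_i, 0) \right) & \text{if } k = 1, \\[2mm] \dfrac{1 - \delta^T}{\delta^T} \lambda_i(C_i, 0) & \text{if } k = 0 \end{cases}$$

and

$$\beta_i^k = \begin{cases} -\dfrac{1 - \delta^T}{\delta^T} \left( \lambda_i(D_i, a_{-i}^{D_i,k}) - \lambda_i(D_i, a_{-i}^{D_i,k-1}) \right) & \text{if } k > 1, \\[2mm] -\dfrac{1 - \delta^T}{\delta^T} \left( \lambda_i(D_i, a_{-i}^{D_i,1}) - \lambda_i(D_i, 0) \right) & \text{if } k = 1, \\[2mm] -\dfrac{1 - \delta^T}{\delta^T} \lambda_i(D_i, 0) & \text{if } k = 0. \end{cases}$$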
From (8) and $\lambda < 0$, irrespective of the value of the indicator functions, both $\alpha_i^T$ and $\beta_i^T$ using these $\alpha_i^k$ and $\beta_i^k$ are probabilities for a sufficiently large $\delta^T$.

Recall that, for a sufficiently large $T$ and a proper threshold $Z_T$, monitoring becomes almost perfect, i.e. the probability that $X_i^k$ occurs for a constant action $(a_i, a_{-i}^{a_i,l})$ approximates one for every $k \le l$ and zero for every $k > l$. Then, it follows from (10) that

$$\rho_j^T(D_j|(C_j, a_{-j}^{C_j,l}), \sigma^T) \approx \alpha_j^0 - \sum_{k=1}^{l} \alpha_j^k = -\frac{1 - \delta^T}{\delta^T} \lambda_j(C_j, a_{-j}^{C_j,l}).$$

In other words, $\rho_j^T(D_j|(C_j, a_{-j}), \sigma^T)$ approximates $-\frac{1 - \delta^T}{\delta^T} \lambda_j(C_j, a_{-j})$. Similarly, $\rho_j^T(C_j|(D_j, a_{-j}), \sigma^T)$ approximates $-\frac{1 - \delta^T}{\delta^T} \lambda_j(D_j, a_{-j})$. Therefore, on the analogy of the discussion before Proposition 1, $\frac{\delta^T}{1 - \delta^T}(V_i(a_{-i}) - E[V_i|a, \sigma^T])$ approximates the left-hand side of (4) as $\delta$ tends to 1. That is, (12), which is equivalent to (11), is almost satisfied by $\sigma^T$ for a $\delta$ close to one. Then, from continuity, there exists $\sigma^{*T} \in \Sigma^T$ that satisfies (11) exactly. Indeed, the existence of such a $\sigma^{*T}$ is guaranteed by the implicit function theorem, as we see in the next subsection.

4.3. Comparison with two player cases

The present paper overcomes two difficulties in extending the review strategy to N player games compared to Matsushima [13], who constructs the review strategy in two player games. Both difficulties are caused by the fact that a player's action in the current phase affects the action of several players in the next phase. In the following argument, we consider the non-degenerate case, and so we focus on $\sigma^T$ satisfying $\alpha_i^{a_{-i}} = \beta_i^{a_{-i}} = 0$. Recall that we require two steps for proving Lemma 2; we first prove the existence of the review strategy $\sigma^T$ satisfying (11), and then we show that the established $\sigma^T$ satisfies inequality (9).

The first difficulty is that we cannot follow the same methodology as in Matsushima to solve (11). In two player games, this equation is linear with respect to the transition probabilities, and Matsushima can solve it by simple linear algebra. In contrast, the equation is not linear in three or more player games, since the continuation payoffs depend on the results of several opponents' randomization. Thus, we have to invent a new method to solve the equation in three or more player games.

This difficulty is overcome by applying the implicit function theorem, as in the almost perfect monitoring case. Notice that (11) has an obvious solution $\alpha_i^k = \beta_i^k = 0$ at $\delta^T = 1$. So, if the full rank condition is satisfied, then there exist implicit functions $\alpha_i^k$ and $\beta_i^k$ of $\delta^T$ such that $\alpha_i^k = \beta_i^k = 0$ at $\delta^T = 1$ and such that they solve (11) in the neighborhood of $\delta^T = 1$.
Moreover, if the derivatives of $\alpha_i^T$ and $\beta_i^T$ using such $\alpha_i^k$ and $\beta_i^k$ with respect to $\delta^T$ are negative at the point $\delta^T = 1$, then $\alpha_i^T$ and $\beta_i^T$ increase from zero as $\delta$ slightly decreases from one, and hence they are indeed probabilities for a sufficiently large $\delta$. As a result, to establish the existence of the solution of (11), it suffices to show that the full rank condition is satisfied and that the derivatives of $\alpha_i^T$ and $\beta_i^T$ with respect to $\delta^T$ are negative at $\delta^T = 1$.

The second difficulty arises in showing inequality (9), which means that non-constant actions $\{a_i^t\}$ are not profitable. Notice that the implicit function theorem assures the existence of the solution of (11) but does not yield the closed-form solution of $\alpha_i^k$ and $\beta_i^k$. Then, we cannot calculate the continuation payoff brought by non-constant actions, $E[V_i|\{a_i^t\}, a_{-i}, \sigma^T]$, since it depends on $\alpha_i^k$ and $\beta_i^k$. So, it seems impossible to show inequality (9).

This problem is solved in the following way. Given any $\delta^T$, let $\sigma^{*T}(\delta^T)$ be the solution of (11), the existence of which is guaranteed by the implicit function theorem. Recall that $\sigma^{*T}(\delta^T)$ satisfies $\alpha_i^k = \beta_i^k = 0$ at $\delta^T = 1$. Hence, $E[V_i|\{a_i^t\}, a_{-i}, \sigma^{*T}(\delta^T)]$ is equal to $V_i(a_{-i})$ at $\delta^T = 1$. This implies that (9) holds with equality with $\sigma^T = \sigma^{*T}(\delta^T)$ and $\delta^T = 1$. Hence, it suffices to show that the derivatives of the right-hand side of (9) with respect to $\delta^T$ are positive at $\delta^T = 1$. Indeed, if so, the right-hand side decreases and the inequality holds strictly when $\delta^T$ slightly decreases from one. Note that we can calculate the derivative of $E[V_i|\{a_i^t\}, a_{-i}, \sigma^{*T}(\delta^T)]$ by the chain rule, since the implicit function theorem enables us to obtain closed-form solutions of the derivatives of $\alpha_i^k$ and $\beta_i^k$.

Remark. Even in two player games, our result is more general than that of Matsushima. Corollary 1 requires only $\max_{a_i \in A_i} \pi_i(a_i, D_j) < \pi_i(C) \le \pi_i(D_i, C_j)$ in two player games to approximate $(\pi_1(C), \pi_2(C))$, while Matsushima assumes both $\max_{a_i \in A_i} \pi_i(a_i, D_j) < \pi_i(C) < \pi_i(D_i, C_j)$ and $\pi_i(C_i, D_j) < \pi_i(D)$.

5. Conclusion

This paper has established the efficiency results in N player repeated games with private monitoring. First, we have constructed belief-free equilibria that bring an efficient outcome in many games with almost perfect monitoring. Then, we have extended this result to the case without accuracy of monitoring technology by applying a review strategy. Our results are applicable to several familiar games.

However, we have not shown the folk theorem; our strategy can attain only a part of the feasible and individually rational payoff set. Furthermore, our results are available only if simultaneous equations (4) have a negative solution. Although we might be able to achieve efficiency by considering more complex strategies, doing so is beyond the scope of this paper. Repeated games with private monitoring still contain many unsolved problems, and progress in future research is expected.

Acknowledgments

I am grateful to Michihiro Kandori, Hitoshi Matsushima, Drew Fudenberg, Akihiko Matsui, Kazuya Kamiya, two anonymous referees, and an associate editor for their helpful comments, and to Youta Ishii, Dan Sasaki and Roger Smith for their careful reading. I also thank seminar participants at Kyoto University and the University of Tokyo. All errors are mine.
Appendix A. Extension of efficiency results Although Proposition 2 and Corollary 1 assume weak symmetry corresponding to Condition 1, we can extend them to the case where symmetry is violated. Let (a) = N1 i∈I i (a). Normalize the payoff function as i (C) = j (C) and i (D) = j (D) for all i, j ∈ I . Condition 4. The payoff function satisfies the following properties: 1. For all i ∈ I , supai ∈Ai i (ai , D−i ) < i (C). 2. For all i ∈ I and a−i ∈ A−i \ {C−i }, max{i (Ci , a−i ), (Ci , a−i )} < i (C). 3. For all i, j ∈ I and a−(i,j ) ∈ A−(i,j ) , i (Di , Cj , a−(i,j ) ) > (Di , Cj , a−(i,j ) ) > j (Di , Cj , a−(i,j ) ). 4. For all i, j ∈ I ,
⎛
i (Di , C−i ) − i (C) 2 ⎝(N − 1)j (Di , C−i ) −
⎞ l (Di , C−i )⎠ .
l=i
Proposition 3. If Condition 4 holds, then (C) can be achieved approximately with almost perfect monitoring. Proof. Fix V like the proof of Proposition 2. Then, simultaneous equations (4) have a unique solution defined by (18)–(20). Moreover, from Condition 4, is negative for sufficiently small ε and ε . Thus, similar to the proof of Proposition 2, we finish the proof. Corollary 2. If private signals are hemiindependent, Condition 4 holds, and (6) holds for all i ∈ I , then (C) can be achieved approximately a.e. in M ∗ (, ). Appendix B. The proof of Proposition 1 For all i ∈ I and a−i ∈ A−i , we denote i (Ci , a−i ) = i (a−i ) and i (Di , a−i ) = i (a−i ). Then, let be a row vector that contains i (a) for all i ∈ I and a ∈ A . Moreover, construct row vectors ˜ and ˜ by ordering i (a−i ) for all i ∈ I and a−i ∈ A−i \ A−i , and by ordering i (a−i ) for all i ∈ I and a−i ∈ A−i \ A−i . Notice that, given any a ∈ A, a strategy profile (a) is completely ˜ ). Hence, we can rewrite E[V |a, , ] as defined by determining transition probabilities (˜, , ˜ E[V |a, (˜, , ), ]. In particular, we will construct a strategy satisfying ˜ = (1, . . . , 1) and ˜ we write E[V |a, , ] instead of E[V |a, (˜, , ˜ ), ]. ˜ = (0, . . . , 0). Thus, given such ˜ and , Fix V satisfying the assumptions of the proposition. All we have to do is to show that, ∃ ∈ (0, 1), ∀ ∈ (, 1), ∃ 0, ∀ , there exists a vector such that every element of is a probability and Vi (a−i )(1 − )i (a) + E[Vi |a, , ]
(13)
for all i ∈ I and a ∈ Ai × A−i with equality if a ∈ A . Since ˜ = (1, . . . , 1) and ˜ = (0, . . . , 0), we have E[Vi |a, , 0] = Vi (D−i ) for all i ∈ I , a ∈ (Ai \ Ai ) × A−i , and . Then, from the first assumption of the proposition, (13) holds strictly for all i ∈ I , a ∈ (Ai \ Ai ) × A−i , and when is close to one and is small enough. We
denote by Wi (a, , , ) the right-hand side of (13), and let W (, , ) be a column vector whose components are given by Wi (a, , , )−Vi (a−i ) for all i ∈ I and a ∈ A . Then, it suffices to show that, ∃ ∈ (0, 1), ∀ ∈ (, 1), ∃ 0, ∀ , there exists a vector such that every element of is a probability and W (, , ) = 0.
(14)
In fact, if (14) is satisfied, then (13) holds with equality for all i ∈ I and a ∈ A . Lemma 3. The rank of the Jacobian of W (, , ) is N 2N at (, , ) = (0, 1, 0). Proof. Given any j ∈ I and a ∈ A , let JW (j, a ) be the derivative of W (, , ) by j (a ) at (, , ) = (0, 1, 0). Note that every element of JW (j, a ) is calculated by j Vi (a−i ) if j = i and a = a , *Wi (a, 0, 1, 0) (15) = *j (a ) 0 otherwise. Construct a N2N th order square matrix JW by ordering N 2N -component column vectors JW (j, a ) for all j ∈ I and a ∈ A . Then, it suffices to show that rank JW = N 2N since JW is a sub-matrix of the Jacobian. Let b be a column vector with the components i (a)−Vi (a−i ) for all i ∈ I and a ∈ A . Then, system (4) is equivalent to JW t = b. Since (4) has a unique solution, the rank of JW is equal to N2N . Observe that, when (, ) = (0, 0), we obtain E[Vi |a, , ] = Vi (a−i ) for all i ∈ I and a ∈ A . Hence, (14) is satisfied at (, , ) = (0, 1, 0). In addition, Lemma 3 implies that the full rank condition is satisfied at this point. Then, from the implicit function theorem, there exists the function ∗ (, ) such that ∗ (1, 0) = 0, ∗ (, ) solves (14) in the neighborhood of ∗ (, ) = (1, 0), and solves dW ( (1,0),1,0) = 0. Here, = ((i (a))a∈A )i∈I is the derivative of d ∗ (, ) by at (, ) = (1, 0). We finish the proof by showing that ∃ ∈ (0, 1), ∀ ∈ (, 1), ∃ 0, ∀ , every element ∗ of ∗ (, ) is a probability. From the chain rule, dW ( (1,0),1,0) = 0 is equivalent to d *Wi (a, 0, 1, 0) *Wi (a, 0, 1, 0) + · j (a ) = 0 * *j (a ) j ∈I a ∈A
for all i ∈ I and a ∈ A . Then, from (15), we obtain j Vi (a−i ) − i (a) + Vi (a−i ) · j (a) = 0 j =i
for all i ∈ I and a ∈ A . Observe that this system is equivalent to (4) by replacing with . Then, from the second assumption of the proposition, the derivatives of ∗ by are negative at (, ) = (1, 0). This implies that ∗ (, 0) increases as slightly decreases from = 1. Therefore, together with ∗ (1, 0) = 0, every element of ∗ (, 0) is a probability for a close to one. Moreover, it follows from continuity that all the elements of ∗ (, ) are probabilities if is small enough.
Appendix C. The proof of Proposition 2 Take ε > 0 and ε > 0. Then, fix V (ε, ε ) such that Vi (a−i ) = Vk for all i ∈ I , k ∈ ε for all k ∈ {0, 1, . . . , N−1}, and a−i ∈ Ak−i where VN−1 = i (C)−ε and Vk − Vk−1 = N−k {1, . . . , N−1}. Then, it suffices to show that, ∃ε, ∃ε , ∀ε ∈ (0, ε), ∀ε ∈ (0, ε ), V (ε, ε ) satisfies the assumptions of Proposition 1. Indeed, if so, there is the equilibrium sustaining V (ε, ε ), and the payoff of this equilibrium approximates (C) for sufficiently small ε and ε . From Condition 2.1, the first assumption of the proposition is satisfied. So, we will show that the second assumption is satisfied. First, we treat four or more player games. Fix a ∗ ∈ Ak satisfying 2 k N −2. Define IC = ∗ ). Moreover, {i ∈ I | ai∗ = Ci } and ID = {i ∈ I | ai∗ = Di }. Let bi (a ∗ ) = i (a ∗ )−Vi (a−i C ∗ ∗ D ∗ ∗ let b (a ) be the sum of bi (a ) over i ∈ IC , and let b (a ) be the sum of bi (a ) over i ∈ ID . Finally, let b(a ∗ ) denote the sum of bi (a ∗ ) over all i ∈ I . Then, from the definitions of V and a ∗ , each equation of (4) for an action profile a ∗ is rewritten as j ∈IC \{i}
ε −ε
j (a ∗ ) +
j (a ∗ ) = bi (a ∗ ) N −k+1 N −k
(16)
j ∈ID
for all i ∈ IC , and −ε
j (a ∗ ) + N −k
j ∈IC
j ∈ID \{i}
ε
j (a ∗ ) = bi (a ∗ ) N −k−1
(17)
for all i ∈ ID . Summing (16) over IC and (17) over ID , dividing both sides by N, and subtracting (16) for a particular i ∈ IC from it, we obtain b(a ∗ ) − N bi (a ∗ ) −ε
i (a ∗ ) = , N −k+1 N
(18)
where i ∈ IC . Next, summing (17) over ID , dividing both sides by N −k−1, subtracting (17) for a particular i ∈ ID from it, and substituting (18) to it, we get b(a ∗ ) − N bi (a ∗ ) (N − k)b(a ∗ ) + N bC (a ∗ ) − kb(a ∗ ) ε
i (a ∗ ) = + , N −k−1 N N (N − k)(N − k − 1)
(19)
where i ∈ ID . From (18), (19) and Condition 1, we have (N − k)(N − k + 1)(i (a ∗ ) − j (a ∗ ) + Vk − Vk−1 ) , ε N (k − 1)(N − k)(i (a ∗ ) − j (a ∗ ) + Vk − Vk−1 ) + N (i (a ∗ ) − Vk−1 )
j (a ∗ ) = , ε N
i (a ∗ ) =
where i ∈ IC and j ∈ ID . Then, from Condition 2, i (a ∗ ) and j (a ∗ ) are negative. Similarly, we solve (4) for other i (a). Since i (C) = j (C), and i (D) = j (D), we obtain ε (N − 1) i (C) = −bi (C),
(20.1)
ε i (D) = bi (D),
(20.2)
ε 2(N − 1) i (Di , C−i ) = 2bC (Di , C−i ) − (N − 2)bi (Di , C−i ),
(20.3)
ε i (Ci , D−i ) = N bi (Ci , D−i ) − b(Ci , D−i ),
(20.4)
ε (N − 1) j (Di , C−i ) = 2(N − 1)bj (Di , C−i ) − bi (Di , C−i ) − 2bC (Di , C−i ),
(20.5)
and ε (N − 1) j (Ci , D−i ) = bi (Ci , D−i ) + (N − 2)(bj (Ci , D−i ) + b(Ci , D−i ) − N bj (Ci , D−i )).
(20.6)
Then, from Condition 1, we have ε (N − 1) i (C) = VN−1 − i (C), ε i (D) = i (D) − V0 , ε 2(N − 1) i (Di , C−i ) = j (Di , C−i ) − VN−2 − (N − 2)(i (Di , C−i ) − VN−1 ), ε i (Ci , D−i ) = (N − 1)(V1 − j (Ci , D−i ) + i (Ci , D−i ) − V0 ), ε (N − 1) j (Di , C−i ) = VN−1 − i (Di , C−i ), ε j (Ci , D−i ) = i (Ci , D−i ) − V0 . Therefore, we obtain 0 from Condition 2. If |I | = 3, then is given by (20), and they are negative. If |I | = 2, then is given by (20.1) and (20.2), and they are negative. Appendix D. The proof of Lemma 1 All we have to do is to show that, for all i ∈ I , k ∈ {1, . . . , 2N−1 } and a−i ∈ A−i , there exist a and i −i satisfying
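$$\sum_{\omega_i \in \Omega_i} \tilde m_i(\omega_i|\tilde a, \theta)\, \psi_i^k(\tilde a_i, \omega_i) = q(\psi_i^k|\tilde a)$$

and

$$\sum_{\omega_i \in \Omega_i} \tilde m_i(\omega_i|\tilde a, \theta)\, \psi_i^{a_{-i}}(\tilde a_i, \omega_i) = q(\psi_i^{a_{-i}}|\tilde a)$$

for all $\tilde a \in A$ and $\theta \in \Theta$, where $q(\psi_i^k|\tilde a)$ and $q(\psi_i^{a_{-i}}|\tilde a)$ satisfy Condition 3.1, and $\underline{q}$ and $\overline{q}$ are probabilities satisfying $0 < \underline{q} < \overline{q} < 1$. Indeed, if these equations hold, then Condition 3 holds in the same way as in Matsushima [13].

Fix $\hat q \in (0, 1)$ and take $\underline{q} = \overline{q} = \hat q$. Then, the above equations have a trivial solution $\psi_i^k(\tilde a_i, \omega_i) = \psi_i^{a_{-i}}(\tilde a_i, \omega_i) = \hat q$ for all $\tilde a_i \in A_i$ and $\omega_i \in \Omega_i$. Since (6) holds, the solution set is continuous for generic monitoring technologies. Hence, after slightly perturbing the targets to $\underline{q} < \hat q < \overline{q}$, a solution still exists and, being close to $\hat q \in (0, 1)$, remains a probability.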
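The perturbation argument can be illustrated with a small linear system. For a fixed own action, a random event assigns an occurrence probability to each private signal, and each situation that must be matched (an opponents' profile together with a macro shock) contributes one linear equation. The sketch below uses a hypothetical example with two signals, two opponent actions, and a single shock, and solves the system by Cramer's rule; the signal probabilities, the target values, and the function name are assumptions.

```python
def solve_2x2(M, q):
    """Solve M @ psi = q for a 2 x 2 matrix M by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("signal distributions are linearly dependent")
    return [(q[0] * d - b * q[1]) / det, (a * q[1] - q[0] * c) / det]

# Rows: distribution of player i's signal given the opponent's action (hypothetical).
M = [[0.7, 0.3],   # opponent plays C
     [0.4, 0.6]]   # opponent plays D

q_hat = 0.5
trivial = solve_2x2(M, [q_hat, q_hat])     # equal targets: the event ignores the signal
q_low, q_high = 0.45, 0.55
psi = solve_2x2(M, [q_low, q_high])        # slightly perturbed targets

# The perturbed solution stays inside [0, 1], so it is a valid random event.
is_probability = all(0.0 <= p <= 1.0 for p in psi)
print(trivial, psi, is_probability)
```

With equal targets the solution is the constant event, and a small perturbation of the targets moves the solution only slightly, so it remains inside the unit interval. A rank-deficient matrix, which the sketch rejects, corresponds to a non-generic monitoring technology.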
Appendix E. The proof of Lemma 2

We illustrate the road map of the proof. The first step displays the preliminary lemmas. The second step shows that there exists a review strategy $\sigma^T \in \Sigma^T$ satisfying (11) on the assumption that (8) holds strictly for all $i \in I$ and $k$. Then, the third step shows that the $\sigma^T$ defined in the second step satisfies inequality (9). Finally, the fourth step treats the case where (8) holds with at least one equality for some $i \in I$ and $k$.

Step 1: Show the preliminary lemmas.

We denote by $F(\tau, T, r)$ the probability that $\psi_i^k$ occurs $r$ times during the first phase when $(C_i, a_{-i}^{C_i,k-1})$ is played until period $\tau$ and $(C_i, a_{-i}^{C_i,k})$ is played after period $\tau$. We say $F(\tau, T, r)$ is single-peaked with respect to $r$ if there exists a non-negative integer $r^*$ satisfying $F(\tau, T, r) \le F(\tau, T, r+1)$ when $r < r^*$ and $F(\tau, T, r) \ge F(\tau, T, r+1)$ when $r \ge r^*$. Similarly, $F(\tau, T, r)$ is single-peaked with respect to $\tau$ if there exists a non-negative integer $\tau^*$ satisfying $F(\tau, T, r) \le F(\tau+1, T, r)$ when $\tau < \tau^*$ and $F(\tau, T, r) \ge F(\tau+1, T, r)$ when $\tau \ge \tau^*$. The following lemmas are the preliminary results. Lemmas 4 and 5 are proved by Matsushima [13,12], respectively.

Lemma 4. There exists a sequence of integers $\{Z_T\}_{T=1}^{\infty}$ satisfying

$$\underline{q}\,T < Z_T \le T, \quad \forall T = 1, 2, \ldots, \tag{21}$$

$$\lim_{T\to\infty} \sum_{r=0}^{Z_T} F(0, T, r) = 1, \tag{22}$$

$$\lim_{T\to\infty} \frac{Z_T}{T} = \underline{q} \tag{23}$$

and

$$\lim_{T\to\infty} T\,F(0, T-1, Z_T) = \infty. \tag{24}$$
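To make the object $F(\tau, T, r)$ concrete, the following Python sketch computes it exactly and checks single-peakedness in both $r$ and $\tau$ on a small instance. It assumes that the event occurs independently across periods, with probability $\bar q$ in the first $\tau$ periods and $\underline q$ afterwards. The function names and the numerical values of $\bar q$, $\underline q$ and $T$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import comb

def count_distribution(tau, T, q_hi, q_lo):
    """Exact distribution of the number of occurrences in T independent periods,
    where the event has probability q_hi in the first tau periods and q_lo in
    the remaining T - tau periods. Returns a list indexed by the count r."""
    dist = [Fraction(1)]
    for t in range(T):
        p = q_hi if t < tau else q_lo
        nxt = [Fraction(0)] * (len(dist) + 1)
        for r, mass in enumerate(dist):
            nxt[r] += mass * (1 - p)      # event does not occur in period t
            nxt[r + 1] += mass * p        # event occurs in period t
        dist = nxt
    return dist

def is_single_peaked(values):
    """True if the sequence weakly rises and then weakly falls."""
    falling = False
    for prev, cur in zip(values, values[1:]):
        if cur > prev and falling:
            return False
        if cur < prev:
            falling = True
    return True

# Illustrative parameters (assumptions): q_lo < q_hi, short review phase.
q_hi, q_lo, T = Fraction(3, 5), Fraction(3, 10), 10
table = {tau: count_distribution(tau, T, q_hi, q_lo) for tau in range(T + 1)}

peaked_in_r = all(is_single_peaked(table[tau]) for tau in range(T + 1))
peaked_in_tau = all(
    is_single_peaked([table[tau][r] for tau in range(T + 1)]) for r in range(T + 1)
)
```

With $\tau = 0$ the distribution reduces to a binomial with parameters $T$ and $\underline q$, which is the case used in Lemma 4. Exact rational arithmetic is used so that the comparisons are not affected by rounding.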
Lemma 5. $F(\tau, T, r)$ is single-peaked with respect to $r$ and $\tau$.

Lemma 6. For all $q \in (0, 1)$ satisfying $q \ne \underline{q}$, we have

$$\lim_{T\to\infty} T^2 F(0, T-1, [qT]) = 0. \tag{25}$$
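The rate claimed in Lemma 6 can be illustrated numerically. The sketch below assumes, as in the proof that follows, that $F(0, T-1, r)$ is the binomial probability of $r$ occurrences in $T-1$ trials with success probability $\underline q$, and it takes $[qT]$ to be the integer part of $qT$. It also evaluates the limiting ratio $g(a)$ used in the proof for $q = b/a$. The function names and parameter values are illustrative assumptions.

```python
import math
from fractions import Fraction

def log_binom_pmf(n, r, p):
    """Natural log of C(n, r) * p**r * (1 - p)**(n - r)."""
    return (math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
            + r * math.log(p) + (n - r) * math.log(1 - p))

def scaled_probability(T, q, q_low):
    """T**2 * F(0, T-1, [qT]), with F a Binomial(T-1, q_low) probability."""
    r = math.floor(q * T)   # q is a Fraction, so the integer part is exact
    return math.exp(2 * math.log(T) + log_binom_pmf(T - 1, r, float(q_low)))

def g(a, b, q_low):
    """Limit of f(T + a) / f(T) from the proof of Lemma 6 when q = b / a."""
    return (q_low ** b) * ((1 - q_low) ** (a - b)) \
        * (Fraction(a, b) ** b) * (Fraction(a, a - b) ** (a - b))

q_low = Fraction(3, 10)   # illustrative value of the event probability
q = Fraction(1, 2)        # a rational q different from q_low

at_200 = scaled_probability(200, q, q_low)
at_1000 = scaled_probability(1000, q, q_low)
ratio_off = g(2, 1, q_low)    # q = 1/2 differs from q_low, so g < 1
ratio_on = g(10, 3, q_low)    # q = 3/10 equals q_low, so g = 1
```

When $q = \underline q$ the same quantity grows with $T$ instead, which is why the lemma excludes that case.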
Here, $[\cdot]$ means the Gauss sign, that is, $[x]$ is the largest integer not exceeding $x$.

Proof. First, we prove (25) for a rational number $q$. We can write $q = \frac{b}{a}$ where $a$ and $b$ are integers satisfying $a > b > 0$. Define ${}_xC_y$ as $\frac{x!}{y!(x-y)!}$. Then, we obtain

$$T^2 F(0, T-1, [qT]) = T^2\, {}_{T-1}C_{[qT]}\, \underline{q}^{[qT]} (1-\underline{q})^{T-1-[qT]} < T^2\, {}_{T-1}C_{[qT]}\, \underline{q}^{qT-1} (1-\underline{q})^{T-2-qT}. \tag{26}$$

The inequality is derived from $[qT] > qT - 1$ and $T-1-[qT] > T-2-qT$. Let $f(T)$ be the formula after the strict inequality of (26). Then, we get

$$\frac{f(T+1)}{f(T)} = Q_T\, \underline{q}^{q} (1-\underline{q})^{1-q} \left(\frac{T+1}{T}\right)^2, \tag{27}$$

where

$$Q_T \equiv \frac{{}_{T}C_{[q(T+1)]}}{{}_{T-1}C_{[qT]}} = \begin{cases} \dfrac{T}{T-[q(T+1)]} & \text{if } [qT] = [q(T+1)], \\[2mm] \dfrac{T}{[q(T+1)]} & \text{if } [qT] = [q(T+1)] - 1. \end{cases} \tag{28}$$

$\frac{T}{T-[q(T+1)]}$ and $\frac{T}{[q(T+1)]}$ converge to $\frac{a}{a-b}$ and $\frac{a}{b}$, respectively, as $T$ goes to infinity. Moreover, given any $T$, the number of $T' \in \{T+1, \ldots, T+a\}$ satisfying $[qT'] = [q(T'+1)]$ is equal to $a-b$, and the number of $T' \in \{T+1, \ldots, T+a\}$ satisfying $[qT'] = [q(T'+1)] - 1$ is equal to $b$. Then, from (28), we obtain

$$\lim_{T\to\infty} \prod_{l=1}^{a} Q_{T+l} = \left(\frac{a}{b}\right)^{b} \left(\frac{a}{a-b}\right)^{a-b}. \tag{29}$$

Therefore, we have

$$\lim_{T\to\infty} \frac{f(T+a)}{f(T)} = \lim_{T\to\infty} \prod_{l=0}^{a-1} \frac{f(T+l+1)}{f(T+l)} = \underline{q}^{b} (1-\underline{q})^{a-b} \prod_{l=0}^{a-1} \lim_{T\to\infty} Q_{T+l} \left(\frac{T+l+1}{T+l}\right)^2 = \underline{q}^{b} (1-\underline{q})^{a-b} \left(\frac{a}{b}\right)^{b} \left(\frac{a}{a-b}\right)^{a-b}.$$

Here, the second equality is derived from (27) and $q = \frac{b}{a}$, and the third equality from (29). Denoting by $g(a)$ the formula after the third equality, we have

$$\frac{\partial \log(g(a))}{\partial a} = 0 \;\Leftrightarrow\; \log \frac{a(1-\underline{q})}{a-b} = 0 \;\Leftrightarrow\; a = \frac{b}{\underline{q}}.$$

Therefore, $g(a)$ is maximized by $a^* = \frac{b}{\underline{q}}$. Since we assume that $\underline{q} \ne \frac{b}{a} \Leftrightarrow a \ne a^*$, we obtain

$$\lim_{T\to\infty} \frac{f(T+a)}{f(T)} < g(a^*) = 1.$$

This implies that $f(T)$ converges to zero, and from (26), we obtain (25).

Next, we show (25) for an irrational number $q$. Let $r_T^*$ be the integer $r$ maximizing $F(0, T-1, r)$. By calculation, we obtain $\frac{F(0,T-1,r)}{F(0,T-1,r-1)} \ge 1 \Leftrightarrow r \le \underline{q}T$. Hence, we have $r_T^* = [\underline{q}T]$. Fix an irrational number $q \ne \underline{q}$ and take rational numbers $q_1$ and $q_2$ such that $0 < q_1 < q < q_2 < 1$ and the interval $[q_1, q_2]$ does not contain $\underline{q}$. Then, we have either $r_T^* \le [q_1T] \le [qT] \le [q_2T]$ or $[q_1T] \le [qT] \le [q_2T] \le r_T^*$. Thus, together with Lemma 5, we obtain

$$T^2 F(0, T-1, [qT]) \le \max\{T^2 F(0, T-1, [q_1T]),\; T^2 F(0, T-1, [q_2T])\}.$$
Observe that the right-hand side converges to zero since (25) is satisfied for a rational number $q$. Then, we have (25) for an irrational number $q$.

Lemma 7. Take a sequence $\{Z_T\}$ satisfying (21)–(24). Then, for all $q \in (0, 1]$,

$$\lim_{T\to\infty} T\,F([qT], T-1, Z_T) = 0. \tag{30}$$
Proof. First, we prove (30) for $q \in (0, \underline{q}]$. So, fix $q \in (0, \underline{q}]$ arbitrarily. Let $F_1(r) = F([qT], [qT], r)$ and $F_2(r) = F(0, T-1-[qT], r)$. From (21), we have $[qT] < Z_T$. Then, we obtain

$$T\,F([qT], T-1, Z_T) = \sum_{r=0}^{[qT]} T\,F_1(r)\,F_2(Z_T - r). \tag{31}$$

For each $k = 1, 2$, let $r_T^{*k}$ be the maximizer of $F_k(r)$. Using the same argument as in the proof of Lemma 6, we have $r_T^{*1} = [\bar{q}([qT]+1)]$ and $r_T^{*2} = [\underline{q}(T-[qT])]$. Fix $q_1 \in (\underline{q}, \bar{q})$ arbitrarily. Let $r_1 = [q_1([qT]+1)]$. Then, we have $r_1 \le r_T^{*1}$. Therefore, from Lemma 5, we obtain

$$F_1(r) \le F_1(r_1), \quad \forall r \le r_1. \tag{32}$$

Fix $q_2 \in \left(\frac{\underline{q} - q_1 q}{1-q}, \underline{q}\right)$ arbitrarily. Let $r_2 = [q_2(T-[qT])]$. From (23), $\frac{Z_T - r_1}{T}$ and $\frac{r_2}{T}$ converge to $\underline{q} - q_1 q$ and $q_2(1-q)$, respectively. Moreover, from the definition of $q_2$, we have $\underline{q} - q_1 q < q_2(1-q)$. Hence, there exists a natural number $T_1$ such that, for all $T > T_1$, $Z_T - r_1 < r_2$. Take such a $T_1$. Then, from Lemma 5 and $r_2 \le r_T^{*2}$, we obtain

$$F_2(Z_T - r) \le F_2(r_2), \quad \forall T > T_1,\ \forall r \ge r_1. \tag{33}$$

Then, for all $T > T_1$, we obtain

$$T\,F([qT], T-1, Z_T) < \sum_{r=0}^{r_1} T\,F_1(r) + \sum_{r=r_1}^{[qT]} T\,F_2(Z_T - r) \le \sum_{r=0}^{r_1} T\,F_1(r_1) + \sum_{r=r_1}^{[qT]} T\,F_2(r_2) \le \frac{([qT]+1)^2 F_1(r_1)}{q^2} + \frac{(T-[qT])^2 F_2(r_2)}{(1-q)^2}.$$

Here, the first inequality is derived from (31) and the fact that $F_1(r)$ and $F_2(r)$ are less than one, the second inequality from (32) and (33), and the third inequality from $r_1 T < T^2 < \left(\frac{[qT]+1}{q}\right)^2$ and $([qT]-r_1)T < T^2 \le \left(\frac{T-[qT]}{1-q}\right)^2$. Observe that, from Lemma 6, $(T-[qT])^2 F_2(r_2)$ converges to zero as $T$ goes to infinity. Moreover, from symmetry between $\underline{q}$ and $\bar{q}$, $T^2 F(T-1, T-1, [q_1T])$ converges to zero, and hence $([qT]+1)^2 F_1(r_1)$ converges to zero. Therefore, from the above inequality, we have (30) for $q \in (0, \underline{q}]$.

Next, we prove (30) for $q \in (\underline{q}, 1]$. So, fix $q \in (\underline{q}, 1]$ arbitrarily. Let $F(\tau)$ represent $F(\tau, T-1, Z_T)$. Since (24) holds and $T\,F([\underline{q}T]) \to 0$ as $T$ goes to infinity, there exists a natural number $T_2$ such that, for all $T > T_2$, $F(0) > F([\underline{q}T])$. Take such a $T_2$. Then, from
Lemma 5 and q < q, we have F ([qT ]) F ([qT ]) for all T > T2 . Since F ([qT ]) → 0, we obtain F ([qT ]) → 0. Step 2: Show the existence of T ∈ T satisfying (11). Suppose that (8) holds strictly for all i ∈ I and k. This step shows that, ∃T , ∀T > T , ∃ ∈ (0, 1), ∀ ∈ (, 1), there exists T ∈ T satisfying (11) for all i ∈ I and a ∈ A . For all i ∈ I , fix real numbers i (Ci , 0) and i (Di , 0) satisfying j |Vi (a−i )| · j (aj , 0) > i (a) − Vi (a−i ), (34) j =i
i (Ci , 0) > − i (Ci , a−i ) > 0 > i (Di , 0) > i (Di , a−i )
(35)
for all i ∈ I and a ∈ Ai × A−i . Here, i (a) is defined as the solution of (4). The existence of these j
numbers is guaranteed since i (a) is negative, |Vi (a−i )| is positive, and Vi (D−i ) > i (ai , D−i ). Intuitively, these inequalities are satisfied when i (Ci , 0) is sufficiently large, and i (Di , 0) is less than but close to zero. Let WiT (a, T , T ) be the right-hand side of (11). For all i ∈ I and 1 k 2N−1 , let (Ci , k) and (Di , k) be ki and ki , respectively. Then, let be a row vector with the components (ai , k) for all i ∈ I , ai ∈ Ai and k ∈ {1, . . . , 2N−1 }. Moreover, construct row vectors ˜ and ˜ by a a ordering 0i and i −i for all i ∈ I and a−i ∈ A−i , and by ordering 0i and i −i for all i ∈ I ˜ and a−i ∈ A−i , respectively. Since T (a) ∈ T is completely defined by determining , ˜ , , T T T T T ˜ and ZT , we can rewrite Wi (a, , ) as Wi (a, (, ˜ , , ZT ), ). Particularly, we construct ˜ and ZT are defined as (i) given any T and , a−i = a−i = 0, the review strategy where ˜ , , i
i
0i = (1−T ) i (Ci , 0) and 0i = −(1−T ) i (Di , 0) for all i ∈ I and a−i ∈ A−i , and (ii) a ˜ and ZT , let us write W T (a, , T ) ˜ , , sequence {ZT }∞ i T =1 satisfies (21)–(24). Thus, given such ˜ ZT ), T ). Similarly, we denote by W T ({a t }, a−i , , T ) the right-hand instead of WiT (a, (, ˜ , , i i side of (9). Let W T (, T ) be a column vector with the elements WiT (a, , T )−Vi (a−i ) for all i ∈ I and a ∈ A . Then, it suffices to show that, ∃T , ∀T > T , ∃ ∈ (0, 1), ∀ ∈ (, 1), there exists such that W T (, T ) = 0
(36)
holds and Ti and Ti using are indeed probabilities after every history. Indeed, if (36) is satisfied, then (11) holds for all i ∈ I and a ∈ A . Lemma 8. The rank of the Jacobian of W T (, T ) at (, T ) = (0, 1) is equal to N 2N for a sufficiently large T. Proof. Let J T (aj , k) be the derivative of W T (, T ) by (aj , k) at (, T ) = (0, 1). Note that W T is a N2N -component column vector, and so is J T (aj , k). Construct a N 2N th order square matrix J T by ordering J T (aj , k) for all j ∈ I , aj ∈ Aj and 1 k 2N−1 . Let J ∗ = limT →∞ J T and J ∗ (aj , k) = limT →∞ J T (aj , k). It suffices to show that t (J ∗ ) and t (JW ) are row equivalent where JW is defined in the proof of Proposition 1. Indeed, if so, rank J ∗ = rank JW = N2N . Then, from continuity, rank J T = N 2N
for a sufficiently large T. This fact immediately proves the full rank condition since J T is a sub-matrix of the Jacobian. From (10), we obtain *WiT (a, 0, 1) = *(aj , k)
|Vi (a−i )| · (Xjk |a) if j = i and aj = aj , j
0
(37)
otherwise
for all i ∈ I , a ∈ A , aj ∈ Aj and 1k 2N−1 . Moreover, from (22), (23) and the law of large numbers, we have lim
T →∞
(Xjk
|
a ,l (aj , a−jj ))
=
1
if k l,
0
otherwise.
(38)
Then, we apply (38) to (37) and get
a ,l
lim
T →∞
*WiT ((aj , a−jj ), 0, 1) *(aj , k)
=
|Vi (a−i )|
if j = i, aj = aj , and k l,
0
otherwise.
j
(39)
For all j ∈ I , aj ∈ Aj , and k ∈ {1, . . . , 2N−1 }, we denote by JW (aj , k) the derivative of a ,k
W (, , ) by j (aj , a−jj ) at (, , ) = (0, 1, 0). Note that JW (aj , k) is one of the columns of JW . Observe that every element of JW (aj , k) and J ∗ (aj , k) is calculated by (15) and (39), respectively. Then, we obtain JW (Cj , 2N−1 ) = −J ∗ (Cj , 2N−1 ), JW (Dj , 2N−1 ) = J ∗ (Dj , 2N−1 ), JW (Cj , k) = J ∗ (Cj , k + 1) − J ∗ (Cj , k), and JW (Dj , k) = J ∗ (Dj , k) − J ∗ (Dj , k + 1) for all j ∈ I and 1k < 2N−1 . Overall, all rows of t (JW ) are constructed by applying row operations to some rows of t (J ∗ ). Therefore, t (JW ) and t (J ∗ ) are row equivalent. Recall that, in our equilibrium, both ˜ and ˜ are equal to zero at T = 1. Then, from (10), we obtain WiT (a, , T ) = Vi (a−i ) for each i ∈ I and a ∈ A at the point (, T ) = (0, 1). Hence, (36) is satisfied at this point. Then, from the implicit function theorem and Lemma 8, there exists a natural number T such that, for all T > T , there exists the function T (T ) such that, T (1) = 0, dW T (T (T ),T ) = 0 at T = 1. d T 1. We denote by T (ai , k) the element of T .
T (T ) satisfies (36) in the neighborhood of T = 1, and T solves
Here, T is the derivative of T (T ) by T at T = Since we have shown the existence of the solution of (36), we can finish this step by proving that, ∃T , ∀T > T , ∃, ∀ ∈ (, 1), Ti and Ti using T (T ) are indeed probabilities T T T T T after every history. Let T i and i denote the derivatives of i and i by at = 1.
Precisely, we can write T T i (hi )
= − i (Ci , 0) −
N−1 2
T (Ci , k)1{Xik |hTi },
(40.1)
k=1 T T i (hi )
= i (Di , 0) +
N−1 2
T (Di , k)1{Xik |hTi }.
(40.2)
k=1
Observe that, when T = 1, we have T (T ) = 0 and hence Ti = Ti = 0 after every history. So, T it suffices to show that, if T is large enough, then T i and i are negative after every history. Indeed, if so, Ti and Ti increase from zero as decreases from one, and hence they are probabilities for a close to one. To simplify the notation, let T (Ci , 0) and T (Di , 0) be 0i and 0i , respectively, and let T (ai , 0) be the derivative of T (ai , 0) by T . From the chain rule, *WiT (a, 0, 1) *T
dW T (T (1),1) d T
= 0 is equivalent to
N−1
2 *W T (a, 0, 1) i + , k) · T (aj , k) = 0 * (a T j j ∈I aj ∈Aj k=0
for all i ∈ I and a ∈ A . Notice that T (Cj , 0) = − j (Ci , 0) and T (Dj , 0) = j (Di , 0). Then, from (37), we obtain ⎛ ⎞ N−1 2 j Vi (a−i ) − i (a) + |Vi (a−i )| ⎝ j (aj , 0) + (Xjk |a)T (aj , k)⎠ = 0 (41) j =i
k=1
for all i ∈ I and a ∈ A . Given any j = i and a ∈ A , define l(j, a) as an integer satisfying a ,l(j,a) a = (aj , a−jj ). Then, we apply (38) to (41) and obtain ⎛ ⎞ l(j,a) j |Vi (a−i )| ⎝ j (aj , 0) + lim T (aj , k)⎠ = i (a) − Vi (a−i ) j =i
k=1
T →∞
for all i ∈ I and a ∈ A . Together with (4), the term in the parenthesis on the left-hand side is equal to − j (a) if aj = Cj , and is equal to j (a) if aj = Dj . Hence, we have Ci ,k−1 Ci ,k
i (Ci , a−i ) − i (Ci , a−i ) if k > 1, lim T (Ci , k) = (42.1) Ci ,1 T →∞ ) if k = 1, − i (Ci , 0) − i (Ci , a−i Di ,k Di ,k−1
i (Di , a−i ) − i (Di , a−i ) if k > 1, (42.2) lim T (Di , k) = D ,1 i T →∞ if k = 1.
i (Di , a−i ) − i (Di , 0) T From (8), (35), (40), and (42), the limits of T i and i are negative irrespective of the values of T the indicator functions. Therefore, from continuity, the T i and i are negative for a sufficiently large T. Step 3: Show the sequential rationality of T .
This step proves that T defined in the second step satisfies inequality (9). Formally, we show that, ∃T , ∀T > T , ∃, ∀ ∈ (, 1), Vi (a−i )WiT ({ait }, a−i , T (T ), T )
(43)
for all i ∈ I , a−i ∈ A−i and {ait } ∈ ATi \ {(Ci , . . . , Ci ), (Di , . . . , Di )}. Since T (1) = 0, we have WiT ({ait }, a−i , T (1), 1) = Vi (a−i ). Namely, when T = 1, (43) holds with equality for all i ∈ I , a−i ∈ A−i and {ait } ∈ ATi . Thus, it suffices to show that there exists a natural number T such that, for all T > T , dW T ({ait }, a−i , T (T ), T ) dT
>0
(44)
=1 T
for all i ∈ I , a−i ∈ A−i , and {ait } ∈ ATi \ {(Ci , . . . , Ci ), (Di , . . . , Di )}. Indeed, if (44) holds, then the right-hand side of (43) decreases as slightly decreases from one. So, (43) holds strictly for a close to one. Similarly to (41), for all i ∈ I , a−i ∈ A−i and {ait } ∈ ATi , the left-hand side of (44) is equal to Vi (a−i ) −
+
T i (ait , a−i ) T t=1 ⎛
|Vi (a−i )| ⎝ j (aj , 0) + j
j =i
N−1 2
⎞ (Xjk | {ait }, a−i )T (aj , k)⎠ .
(45)
k=1
We finish this step by proving the following three lemmas: Lemma 9 implies that any constant action ai ∈ / Ai is not optimal for player i, Lemma 10 shows that mixing Ci and Di in some review phase is also not optimal, and Lemma 12 shows that other non-constant actions are not optimal. Lemma 9. Fix i ∈ I , and suppose Ai = Ai . If T is large enough, then (44) holds for all a−i ∈ A−i and {ait } = (ai , . . . , ai ) satisfying ai ∈ / Ai . Proof. Fix i ∈ I , a−i ∈ A−i and {ait } = (ai , . . . , ai ) satisfying ai ∈ / Ai . From the law of large k t numbers, (Xj | {ai }, a−i ) approximates zero for all j = i and k as T goes to infinity. Then, from (45), the left-hand side of (44) goes to Vi (a−i ) − i (a) +
j
|Vi (a−i )| · j (aj , 0)
j =i
as T goes to infinity. From (34), this value is positive. Hence, from continuity, the left-hand side of (44) is positive when T is sufficiently large. Lemma 10. If T is large enough, then (44) holds for all i ∈ I , a−i ∈ A−i and {ait } satisfying ait ∈ Ai for all t and ait = ait for some t = t .
Proof. Fix i ∈ I and a−i ∈ A−i . Let ˜ i ≡ i (Ci , a−i )−i (Di , a−i ). For any ∈ {0, . . . , T }, define {ait ()} = (Di , . . . , Di , Ci , . . . , Ci ).
T −
Let WiT () represent the left-hand side of (44) when {ait } = {ait ()}. Similarly, define (Xjk |) as (Xjk | {ait ()}, a−i ). Then, it suffices to show that there exists a natural number T such that, given any T > T , WiT () is positive for all ∈ {1, . . . , T −1}. Notice that WiT (0) = WiT (T ) = 0 from the definition of T . ˜ Let W˜ iT () ≡ WiT ()−WiT ( − 1) and (X|) ≡ (X|)−(X| − 1). From (45), we obtain N−1
2 j ˜ i ˜ k |) (aj , k). W˜ iT () = (X |Vi (a−i )| + T j T j =i
(46)
k=1
a ,l
For all j = i and ai ∈ Ai , let l(j, ai ) represent the integer l satisfying (ai , a−i ) = (aj , a−jj ). In the same way as in the proof of Lemma 7, let F () be F (, T − 1, ZT ). Then, similar to Matsushima [13], we have ⎧ (q − q)F ( − 1) if l(j, Di ) < k l(j, Ci ), ⎪ ⎨ k ˜ (47) (X j |) = ⎪ (q − q)F (T − ) if l(j, Ci ) < k l(j, Di ), ⎩ 0 otherwise. Let ICi represent the set of all j = i satisfying l(j, Ci ) > l(j, Di ). Similarly, define IDi as the set of all j = i satisfying l(j, Ci ) < l(j, Di ). Moreover, for every j = i, let ⎧
l(j,Ci ) ⎪ ⎨ |Vij (a−i )| q − q k=l(j,Di )+1 T (aj , k) if j ∈ ICi , T
Kj ≡ ⎪ |V j (a )| q − q l(j,Di ) ⎩ −i i k=l(j,Ci )+1 T (aj , k) if j ∈ IDi . Then, it follows from (46) and (47) that T W˜ iT () = ˜ i + KjT T F ( − 1) − KjT T F (T − ). j ∈ICi
(48)
j ∈IDi
From (8), (35) and (42), $\lim_{T\to\infty} K_j^T$ is positive for all $j \in I$. Moreover, from (24) and Lemma 7, we have $T\,F([\frac{T}{2}]) \to 0$ and $T\,F(0) \to \infty$. Then, there exists a natural number $T_1$ such that, for all $T > T_1$,

$$K_j^T > 0, \quad \forall j \in I, \tag{49}$$

$$F(0) > F([\tfrac{T}{2}]), \tag{50}$$

$$K_j^T\, T\,F(0) > |\tilde{\pi}_i| + \sum_{j' \ne j} K_{j'}^T\, T\,F([\tfrac{T}{2}]), \quad \forall j \in I. \tag{51}$$

Take such a $T_1$. Furthermore, if $\tilde{\pi}_i$ is positive, then there exists a natural number $T_2$ such that, for all $T > T_2$,

$$\tilde{\pi}_i > \sum_{j \in I} K_j^T\, T\,F([\tfrac{T}{2}]). \tag{52}$$

So, take such a $T_2$ when $\tilde{\pi}_i$ is positive, and fix $T_2 = 1$ when $\tilde{\pi}_i$ is non-positive. Then, define $T_3 = \max\{T_1, T_2\}$. Let $\tau_T^*$ be the maximizer of $F(\tau)$. From (50) and Lemma 5, if $T$ is greater than $T_3$, then we have $0 \le \tau_T^* < [\frac{T}{2}]$.

Lemma 11. If $T$ is greater than $T_3$, then (i) $\tilde{W}_i^T(\tau) > 0$ for all $\tau \in \{1, \ldots, \tau_T^*+1\}$, and (ii) $\tilde{W}_i^T(\tau) < 0$ for all $\tau \in \{T-\tau_T^*, \ldots, T\}$.

Proof. From the symmetry between $C_i$ and $D_i$, it suffices to prove (i). Fix $T > T_3$. From $0 \le \tau_T^* < [\frac{T}{2}]$, we obtain $0 \le \tau - 1 \le \tau_T^* \le [\frac{T}{2}] \le T - \tau$ for all $\tau \in \{1, \ldots, \tau_T^*+1\}$. Then, from Lemma 5, we get $F(0) \le F(\tau-1)$ and $F([\frac{T}{2}]) \ge F(T-\tau)$ for all $\tau \in \{1, \ldots, \tau_T^*+1\}$. From these inequalities, (48) and (49), we have

$$T\,\tilde{W}_i^T(\tau) \ge \tilde{\pi}_i + \sum_{j \in I_{C_i}} K_j^T\, T\,F(0) - \sum_{j \in I_{D_i}} K_j^T\, T\,F([\tfrac{T}{2}]) \tag{53}$$

for all $\tau \in \{1, \ldots, \tau_T^*+1\}$.

Consider the case where $\tilde{\pi}_i$ is positive. Then, from (49) and (52), the right-hand side of (53) is positive. Hence, $\tilde{W}_i^T(\tau)$ is positive for all $\tau \in \{1, \ldots, \tau_T^*+1\}$.

Consider the case where $\tilde{\pi}_i$ is non-positive. We claim that $I_{C_i}$ is not empty. To prove this, suppose $I_{C_i}$ is empty. Then, from (48) and (49), $\tilde{W}_i^T(\tau)$ is negative for all $\tau \in \{1, \ldots, T\}$. This contradicts the fact that $W_i^T(0) = W_i^T(T) = 0$. Therefore, we have $I_{C_i} \ne \emptyset$. Then, from (49) and (51), the right-hand side of (53) is positive. Thus, $\tilde{W}_i^T(\tau)$ is positive for all $\tau \in \{1, \ldots, \tau_T^*+1\}$.

We can finish the proof of Lemma 10 by showing that, if $T > T_3$, then $W_i^T(\tau) > 0$ for all $\tau \in \{1, \ldots, T-1\}$. So, fix $T > T_3$ arbitrarily. From Lemma 5, we have $F(\tau-1) \ge F((\tau+1)-1)$ for all $\tau \in \{\tau_T^*+1, \ldots, T\}$, and $F(T-\tau) \le F(T-(\tau+1))$ for all $\tau \in \{0, \ldots, T-\tau_T^*-1\}$. Then, from (48) and (49), we obtain $\tilde{W}_i^T(\tau) \ge \tilde{W}_i^T(\tau+1)$ for all $\tau \in \{\tau_T^*+1, \ldots, T-\tau_T^*-1\}$. Together with Lemma 11, there exists a natural number $\tau' \in \{\tau_T^*+1, \ldots, T-\tau_T^*-1\}$ such that

$$W_i^T(0) < \cdots < W_i^T(\tau_T^*+1) \le \cdots \le W_i^T(\tau') \ge \cdots \ge W_i^T(T-\tau_T^*-1) > \cdots > W_i^T(T). \tag{54}$$

From $W_i^T(0) = W_i^T(T) = 0$, we have $W_i^T(\tau) > 0$ for all $\tau \in \{1, \ldots, T-1\}$.
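The last step of this argument is purely about the shape of the path in (54): the increments are strictly positive at the start, weakly decreasing in the middle, and strictly negative at the end, while the path starts and ends at zero. A minimal Python sketch of this shape argument is given below. The increments are hypothetical numbers chosen only to have this pattern and are not derived from the model.

```python
def interior_is_positive(increments):
    """Build the path W(0), ..., W(T) from W(0) = 0 and the given increments,
    and report whether W(t) > 0 for every interior t in {1, ..., T-1}."""
    path = [0]
    for d in increments:
        path.append(path[-1] + d)
    return path, all(w > 0 for w in path[1:-1])

# Hypothetical increments: positive first, weakly decreasing throughout,
# negative last, and summing to zero so that the path returns to W(T) = 0.
increments = [3, 2, 1, 0, -1, -2, -3]
path, ok = interior_is_positive(increments)
```

Since the increments change sign only once, the path rises to a single plateau and then falls, so it cannot touch zero strictly between its two endpoints.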
Lemma 12. Fix i ∈ I , and suppose Ai = Ai . If T is large enough, then (44) holds for all / Ai for some t, and ait = ait for some t = t. a−i ∈ A−i and {ait } satisfying ait ∈
Proof. Fix i ∈ I and a−i ∈ A−i . Let ai∗ ∈ arg maxai ∈A / i i (a). For all ∈ {0, . . . , T } and ∈ {0, . . . , }, define {ait (, )} = (Di , . . . , Di , Ci , . . . , Ci , ai∗ , . . . , ai∗ ).
−
T −
When = we denote the left-hand side of (44) by WiT (, ). Then, from (45) and Lemma 9, it suffices to show that there exists a natural number T such that, given any T > T , WiT (, ) is positive for all ∈ {1, . . . , T −1} and ∈ {0, . . . , }. Note that WiT (T , 0) = WiT (T , T ) = 0 from the same argument as in the proof of Lemma 10. Moreover, from Lemma 9, there exists a natural number T1 such that, for all T > T1 , WiT (0, 0) > 0. Take such a T1 . Using the same argument as the one deriving (54), there exists a natural number T2 such that, given any T > T2 , WiT (, 0) is strictly greater than WiT (0, 0) or WiT (T , 0) for all ∈ {1, . . . , T −1}. Fix such a T2 , and define T3 = max{T1 , T2 }. Then, given any T > T3 , WiT (, 0) is positive for all ∈ {1, . . . , T −1}. Similarly, there exists a natural number T4 such that, given any T > T4 , WiT (, ) is positive for all ∈ {1, . . . , T −1}. In the same way, there exists a T5 such that, given any T > T5 , WiT (, ) is equal to or greater than WiT (, 0) or WiT (, ) for all ∈ {2, . . . , T −1} and ∈ {1, . . . , −1}. Take such a T5 . Then, given any T > max{T3 , T4 , T5 }, WiT (, ) is positive for all ∈ {1, . . . , T −1} and ∈ {0, . . . , }. {ait }
{ait (, )},
Step 4: Degenerate cases. This step explains why the second and third steps should be corrected when (8) holds with at least one equality for some i ∈ I and k, and then demonstrates that Lemma 2 holds even in these degenerate cases. In the degenerate cases, the proof of Lemma 10 becomes wrong. Now, from (42), the limit of T (aj , k) may be equal to zero. Then, the limit of KjT may be equal to zero, and hence KjT may be negative for all T. This contradicts to (49), and so the proof of Lemma 10 becomes incorrect. To avoid this problem, we have to reconstruct T ∈ T . Take i (Ci , 0) and i (Di , 0) satisfying a a (34) and (35). After that, fix a real number less than but close to zero. Then, define i −i = i −i = −(1−T ) for all i ∈ I and a−i ∈ A−i , and define 0i , 0i , and ZT in the same way as in the a a second step. We focus on T ∈ T using such 0i , i −i , 0i , i −i , and ZT . Define W T (, T ) in the same way as in the second step. If T is sufficiently large, then T W (, T ) has full rank, and is equal to zero at (, T ) = (0, 1). So, from the implicit function theorem, there exists a natural number T such that, for all T > T , there exists T (T ) that solves (36) in the neighborhood of T = 1. We obtain Ci ,k−1 Ci ,k
i (Ci , a−i ) − i (Ci , a−i ) if k > 1, lim T (Ci , k) = C ,1 i T →∞ − i (Ci , a−i ) − i (Ci , 0) − if k = 1, and
lim
T →∞
T (Di , k)
=
Di ,k Di ,k−1
i (Di , a−i ) − i (Di , a−i ) if k > 1, Di ,1 ) − i (Di , 0) −
i (Di , a−i
if k = 1.
T Then, T i and i are negative after every history when T is sufficiently large. This implies that T T i and i are indeed probabilities.
We finish this step by proving Lemma 10. Indeed, Lemmas 9 and 12 are shown in the same way as in the third step. To prove Lemma 10, we calculate T W˜ iT () = ˜ i + LTj T F ( − 1) − LTj T F (T − ), j ∈ICi
where
j ∈IDi
⎧
l(j,Ci ) (a , k) ⎪ ⎨ |Vij (a−i )| q − q + k=l(j,D ifj ∈ ICi , j T )+1 i
LTj ≡ l(j,Di ) ⎪ ⎩ |Vij (a−i )| q − q + k=l(j,C (aj , k) if j ∈ IDi . i )+1 T
Observe that the limit of LTj is positive since the limit of T (aj , k) is non-positive and is negative. Then, in the same way as in the third step, we can show Lemma 10. References [1] V. Bhaskar, I. Obara, Belief-based equilibria in the repeated prisoner’s dilemma with private monitoring, J. Econ. Theory 102 (2002) 40–69. [2] O. Compte, Communication in repeated games with imperfect private monitoring, Econometrica 66 (1998) 597–626. [3] J.C. Ely, J. Hörner, W. Olszewski, Belief-free equilibria in repeated games, Econometrica 73 (2005) 377–415. [4] J.C. Ely, J. Välimäki, A robust folk theorem for the prisoner’s dilemma, J. Econ. Theory 102 (2002) 84–105. [5] D. Fudenberg, D.K. Levine, E. Maskin, The folk theorem with imperfect public information, Econometrica 62 (1994) 997–1040. [6] D. Fudenberg, E. Maskin, The folk theorem in repeated games with discounting and with incomplete information, Econometrica 54 (1986) 533–554. [7] J. Hörner, W. Olszewski, The folk theorem for games with private almost-perfect monitoring, Econometrica, forthcoming. (Web page: http://www.econometricsociety.org/manus.asp). [8] M. Kandori, Introduction to repeated games with private monitoring, J. Econ. Theory 102 (2002) 1–15. [9] M. Kandori, H. Matsushima, Private observation, communication and collusion, Econometrica 66 (1998) 627–652. [10] M. Kandori, I. Obara, Efficiency in repeated games revisited: the role of private strategies, Econometrica 72 (2006) 499–519. [11] G.J. Mailath, S. Morris, Repeated games with almost-public monitoring, J. Econ. Theory 102 (2002) 189–228. [12] H. Matsushima, Multimarket contact, imperfect monitoring, and implicit collusion, J. Econ. Theory 98 (2001) 158–178. [13] H. Matsushima, Repeated games with private monitoring: two players, Econometrica 72 (2004) 823–852. [14] I. Obara, Folk theorem with communication, Mimeo, 2006. [15] M. Piccione, The repeated prisoner’s dilemma with imperfect private monitoring, J. Econ. Theory 102 (2002) 70–83. [16] T. 
Sekiguchi, Efficiency in repeated prisoner’s dilemma with private monitoring, J. Econ. Theory 76 (1997) 345–361. [17] T. Sekiguchi, Existence of nontrivial equilibria in repeated games with imperfect private monitoring, Games Econ. Behav. 40 (2002) 299–321. [18] G.J. Stigler, A theory of oligopoly, J. Polit. Economy 72 (1964) 44–61.