On the Rate of Convergence of Continuous-Time Fictitious Play

GAMES AND ECONOMIC BEHAVIOR 22, 238-259 (1998)
ARTICLE NO. GA970582

Christopher Harris*

King's College, Cambridge CB2 1ST, United Kingdom

Received October 17, 1994

This paper shows, first, that continuous-time fictitious play converges (in both payoff and strategy terms) uniformly at rate $t^{-1}$ in any finite two-person zero-sum game. The proof is, in essence, a simple Lyapunov-function argument. The convergence of discrete-time fictitious play is a straightforward corollary of this result. The paper also shows that continuous-time fictitious play converges in all finite weighted-potential games. In this case, the convergence is not uniform. It is conjectured, however, that any given continuous-time fictitious play of a finite weighted-potential game converges (in both payoff and strategy terms) at rate $t^{-1}$. Journal of Economic Literature Classification Numbers: C6, C7. © 1998 Academic Press

1. INTRODUCTION

An appealing approach to boundedly rational learning in games is to assume, first, that each player ignores strategic considerations and, second, that she regards the average of the past play of the other players as the best guide to their current play. Indeed, if a player ignores strategic considerations, then she will not take into account the effect that her current action will have on the future actions of other players and will therefore choose her action myopically; and if she thinks that average past play is the best guide to current play, then her myopically chosen action will be a best response to average past play. Play will therefore evolve according to a simple iterative scheme. This scheme is known as fictitious play.

*I would particularly like to thank Ehud Kalai, Vijay Krishna, Dov Monderer, two anonymous referees, and an anonymous associate editor for their help with this paper.

0899-8256/98 $25.00 Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved.

The idea of fictitious play appears to have originated in the work of Brown (1949, 1951a, 1951b). Brown considered both discrete- and continuous-time versions of fictitious play.$^1$ His analysis was, to varying degrees, heuristic. Robinson (1951) gave the first rigorous proof that discrete-time fictitious play converges (in both payoff and strategy terms) in any finite two-person zero-sum game, but did not identify the rate of convergence. Shapiro (1958) showed that the rate of convergence (in payoff terms) was at least $t^{-1/(|S_1| \cdots}$

$^1$ Brown also considered a continuous-time system in which the two players of a finite two-person zero-sum game slowly adjust the probability with which they play any given strategy in relation to its payoff. A similar system was considered independently by von Neumann. The definitive version of this system appears in Brown and von Neumann (1950).

of continuous-time fictitious play converged at rate $t^{-1}$.$^2$ We believe that he would have had no difficulty in making his argument precise had he had access to the modern methods of convex analysis and differential inclusions.$^3$ We should also like to point out that, in independent work, Hofbauer (1994) has established a version of this result.$^4$

Second, we establish a simple result in convex analysis. This result says, roughly speaking, that the distance between a point of Euclidean space and a lower contour set of a polyhedrally convex function can be bounded in terms of the difference between the value of the function at the point in question and the value defining the lower contour set. It allows us to deduce that the rate of convergence (in strategy terms) of continuous-time fictitious play is at least $t^{-1}$. This same result allows us to complete Shapiro's (1958) analysis by showing that the rate of convergence (in strategy terms) of discrete-time fictitious play in finite two-person zero-sum games is at least $t^{-1/(|S_1| \cdots}$

$^2$ In Brown's (1949) version of continuous-time fictitious play, the total number of plays of a strategy accumulates at rate 1 if that strategy is a best response and at rate 0 if it is not. In particular, the plays of more than one strategy may accumulate at any given time. For some starting points, this system does not have a solution.

$^3$ Brown does not appear to have pursued his argument about continuous-time fictitious play, perhaps because his motivation for formulating fictitious play in the first place was computational. This was, as we shall see, a pity: the convergence of discrete-time fictitious play is a corollary of the convergence of continuous-time fictitious play.

$^4$ Hofbauer's (1994) results are in some respects more and in some respects less general than ours. They are more general in that they apply to a wider class of games than just finite zero-sum games. (This class of games does not include weighted potential games.) They are less general in that they apply only to a particular version of continuous-time fictitious play. In Hofbauer's version of continuous-time fictitious play, players employ a mixed-strategy profile that is a Nash equilibrium of the subgame of the original game obtained by eliminating all pure strategies that are not best responses to the average of past play. This construction has the feature that, starting at any given time, players' choices remain constant for a nonzero interval of time. It has two advantages. First, it leads to a proof of the existence of continuous-time fictitious play (which cannot be taken for granted) that is elementary in the sense that it does not involve an appeal to advanced methods from the theory of differential inclusions (but it does involve an appeal to the existence of Nash equilibrium). Second, the fact that the continuous-time fictitious plays considered are differentiable from the right at all points of time simplifies the analysis significantly. It also has two disadvantages. First, part of the conceptual appeal of fictitious play is that it is nonstrategic. Requiring that players play a Nash equilibrium of a certain subgame does not respect the nonstrategic nature of fictitious play. Second, when deducing the convergence of discrete-time fictitious play from that of continuous-time fictitious play (as we do in the present paper), it is essential to consider all possible continuous-time fictitious plays, and not just a convenient subset of such plays.

to know whether a refinement of the approach we use here could be used to prove Shapiro's (1958) results, or even Karlin's (1959) conjecture, on the rate of convergence of discrete-time fictitious play.

Fourth, we prove the continuous-time analogue of the result of Monderer and Shapley (1996a) for weighted-potential games. The proof is even simpler than that of the original result. An example shows that continuous-time fictitious play does not converge uniformly at rate $t^{-1}$ in such games. We do, however, formulate a conjecture according to which any given continuous-time fictitious play converges at rate $t^{-1}$.

Finally, we would like to mention work by Gilboa and Matsui (1991), Matsui (1992), and Krishna and Sjöström (1995). Gilboa and Matsui (1991) and Matsui (1992) both worked with particular solutions of what is, a simple time change aside, continuous-time fictitious play. Krishna and Sjöström (1995) analyzed the convergence of continuous-time fictitious play in finite two-player nonzero-sum games.

2. DISCRETE-TIME FICTITIOUS PLAY

Suppose that a finite normal-form game $G$ is given. Let $I$ denote the number of players in $G$, let $S_i$ denote the set of pure strategies of player $i$, let $S = \times_{i=1}^{I} S_i$ denote the set of pure-strategy profiles, let $u_i: S \to \mathbb{R}$ denote the payoff function of player $i$, let $P_i = \Delta(S_i)$ denote the set of mixed strategies of player $i$, let $P = \times_{i=1}^{I} P_i$ denote the set of mixed-strategy profiles, let $B_i: P \to P_i$ denote the best-response correspondence of player $i$, let $B: P \to P$ be defined by $B(p) = \times_{i=1}^{I} B_i(p)$ for all $p \in P$, and let $NE(G)$ denote the set of Nash equilibria of $G$. Then discrete-time fictitious play may be defined as follows.

DEFINITION 1. A discrete-time fictitious play of $G$ (or DTFP for short) is a sequence $b: \mathbb{N} \to P$ such that $b(T) \in B\bigl((1/T)\sum_{t=0}^{T-1} b(t)\bigr)$ for all $T \ge 1$.

In other words, in period 0 each player plays an arbitrary mixed strategy, and in each subsequent period she plays a best response to the average play of the preceding periods. Notice that, in the scheme adopted here, players are implicitly assumed to be able to observe the past mixed strategies of the other players. This assumption does not matter if one is interested only in pure-strategy DTFP, since the convergence of DTFP in the present sense implies the convergence of pure-strategy DTFP. It is made in order to be as consistent as possible with the continuous-time case.
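As an illustration of Definition 1, the following is a minimal sketch of a pure-strategy DTFP in a two-person zero-sum game. It is not taken from the paper: the function names, the payoff matrix `U` (row player's payoffs, row player maximizes), and the tie-breaking rule (lowest index) are all assumptions made for the example.

```python
def row_payoffs(U, q):
    """Expected payoff of each pure row strategy against the column mix q."""
    return [sum(U[i][j] * q[j] for j in range(len(q))) for i in range(len(U))]


def col_payoffs(U, p):
    """Row player's expected payoff for each pure column strategy against the row mix p."""
    return [sum(p[i] * U[i][j] for i in range(len(p))) for j in range(len(U[0]))]


def payoff_gap(U, p, q):
    """max_i (U q)_i - min_j (p^T U)_j; it is zero exactly at a Nash equilibrium."""
    return max(row_payoffs(U, q)) - min(col_payoffs(U, p))


def discrete_fictitious_play(U, p0, q0, periods):
    """Return the averages (A(b))(periods) of a pure-best-response DTFP.

    Period 0 uses the arbitrary mixed strategies (p0, q0); ties between
    best responses are broken toward the lowest index (an assumption).
    """
    total_p, total_q = list(p0), list(q0)
    for T in range(1, periods):
        avg_p = [x / T for x in total_p]
        avg_q = [x / T for x in total_q]
        # Period T: each player best-responds to the average of periods 0..T-1.
        rv = row_payoffs(U, avg_q)
        cv = col_payoffs(U, avg_p)
        i_star = max(range(len(rv)), key=rv.__getitem__)  # row player maximizes
        j_star = min(range(len(cv)), key=cv.__getitem__)  # column player minimizes
        total_p[i_star] += 1.0
        total_q[j_star] += 1.0
    return [x / periods for x in total_p], [x / periods for x in total_q]
```

For matching pennies, `discrete_fictitious_play([[1.0, -1.0], [-1.0, 1.0]], [1.0, 0.0], [1.0, 0.0], 5000)` returns averages close to the unique equilibrium, in line with Robinson's theorem.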

For all DTFP $b$, let $A(b): \{1, 2, \ldots\} \to P$ be defined by the formula

$$(A(b))(T) = \frac{1}{T}\sum_{t=0}^{T-1} b(t)$$

for all $T \ge 1$ and let $L_\omega(b)$ denote the set

$$\bigcap_{T \ge 1} \operatorname{closure}\bigl(\{(A(b))(t) \mid t \ge T\}\bigr).$$

In other words, let $A(b)$ denote the sequence of short-run averages of $b$, and let $L_\omega(b)$ be the set of all limit points of $A(b)$.

DEFINITION 2. Let $b$ be a DTFP. Then $b$ converges iff $L_\omega(b) \subset NE(G)$.

3. CONTINUOUS-TIME FICTITIOUS PLAY

The analogue of Definition 1 for continuous-time fictitious play is as follows.

DEFINITION 3. A continuous-time fictitious play of $G$ of the first kind (or CTFP1 for short) is a Lebesgue measurable function $b: [0, \infty) \to P$ such that $b(T) \in B\bigl((1/T)\int_0^T b(t)\,dt\bigr)$ for almost all $T \ge 1$.

In other words, in each period prior to period 1 each player plays an arbitrary mixed strategy, and in each subsequent period she plays a best response to the average play of the preceding periods. Notice that the Lebesgue measurability of $b$ is needed to ensure that the short-run averages $(1/T)\int_0^T b(t)\,dt$ are well defined. Notice too that, unlike the discrete-time case, the use of mixed strategies is essential in the continuous-time case. For example, in the case of matching pennies, if players play the unique (mixed-strategy) equilibrium for all $t \in [0, 1)$, then they must play this equilibrium for almost all $t \ge 1$ as well.

For all CTFP1 $b$, let $A(b): [1, \infty) \to P$ be defined by the formula

$$(A(b))(T) = \frac{1}{T}\int_0^T b(t)\,dt$$

for all $T \ge 1$ and let $L_\omega(b)$ denote the set

$$\bigcap_{T \ge 1} \operatorname{closure}\bigl(\{(A(b))(t) \mid t \ge T\}\bigr).$$

In other words, let $A(b)$ denote the path of short-run averages of $b$ and let $L_\omega(b)$ be the set of all limit points of $A(b)$.

DEFINITION 4. Let $b$ be a CTFP1. Then $b$ converges iff $L_\omega(b) \subset NE(G)$.

Now suppose that $b$ is a CTFP1 and let $a = A(b)$. Then it is easy to show that

$$\dot a(T) = -\frac{1}{T^2}\int_0^T b(t)\,dt + \frac{1}{T}\,b(T) = \frac{1}{T}\bigl(b(T) - a(T)\bigr)$$

for almost all $T \in [1, \infty)$. In particular, $a$ satisfies the differential inclusion $\dot a \in (1/t)(B(a) - a)$ with initial condition $a(1) = \int_0^1 b(t)\,dt$. Conversely, if $a$ satisfies the differential inclusion $\dot a \in (1/t)(B(a) - a)$ with initial condition $a(1) = p$ and if

$$b = \begin{cases} p & \text{on } [0, 1), \\ t\dot a + a & \text{on } [1, \infty), \end{cases}$$

then $b$ is a CTFP1. More formally, we have Definition 5 and Propositions 6 and 7.

DEFINITION 5. A function $a: [1, \infty) \to P$ satisfies the differential inclusion $\dot a \in (1/t)(B(a) - a)$ with initial condition $a(1) = p$ iff:

1. $a$ is locally Lipschitz continuous;
2. $a(1) = p$;
3. $\dot a(t) \in (1/t)(B(a(t)) - a(t))$ for almost all $t \in [1, \infty)$.

Notice that, by definition, a function $a: [1, \infty) \to P$ is locally Lipschitz continuous iff, for all $T \in [1, \infty)$, there exists $k(T)$ such that $|a(t_1) - a(t_2)| \le k(T)\,|t_1 - t_2|$ for all $t_1, t_2 \in [1, T]$. Notice further that $a$ is locally Lipschitz continuous iff it is differentiable almost everywhere and its derivative $\dot a$ is locally bounded in the sense that, for all $T \in [1, \infty)$, there exists $k(T)$ such that $|\dot a| \le k(T)$ almost everywhere on $[1, T]$.

PROPOSITION 6. For all $p \in P$, the differential inclusion $\dot a \in (1/t)(B(a) - a)$ with initial condition $a(1) = p$ possesses a solution. Moreover the set of all solutions with initial condition $p$ is compact in the topology of uniform convergence on compact subsets of $[1, \infty)$.

These properties follow from the fact that the best-response correspondence $B$ is nonempty, compact and convex valued, bounded, and upper semicontinuous. See Theorem 1 of Section 2.2 of Aubin and Cellina (1984), for example. Notice that, as the example of matching pennies shows, the convexity of the values of $B$ is essential for the existence of a solution to the differential inclusion.

PROPOSITION 7. The function $b: [0, \infty) \to P$ is a CTFP1 iff $a = A(b)$ is a solution of the differential inclusion $\dot a \in (1/t)(B(a) - a)$ with initial condition $a(1) = \int_0^1 b(t)\,dt$.

Combining Propositions 6 and 7, we see, in particular, that CTFP1 exist.

Now suppose that $b$ is a CTFP1 and let $\tilde b: [-\infty, \infty) \to P$ be defined by the formula $\tilde b(\tilde T) = b(e^{\tilde T})$ for all $\tilde T \in [-\infty, \infty)$ (with the understanding that $e^{-\infty} = 0$). Then it is easy to check that

$$\tilde b(\tilde T) \in B\left(\int_{-\infty}^{\tilde T} e^{-(\tilde T - \tilde t)}\,\tilde b(\tilde t)\,d\tilde t\right)$$

for all $\tilde T \in [0, \infty)$. This motivates the following definition.

DEFINITION 8. A continuous-time fictitious play of $G$ of the second kind (or CTFP2 for short) is a Lebesgue measurable function $\tilde b: [-\infty, \infty) \to P$ such that $\tilde b(\tilde T) \in B\bigl(\int_{-\infty}^{\tilde T} e^{-(\tilde T - \tilde t)}\,\tilde b(\tilde t)\,d\tilde t\bigr)$ for almost all $\tilde T \ge 0$.

In other words, at all times prior to time 0, each player plays an arbitrary mixed strategy, and at all times after time 0, each player plays a best response to an exponentially weighted average of past play.

Now suppose further that we define $\tilde A(\tilde b): [0, \infty) \to P$ by the formula

$$\tilde A(\tilde b)(\tilde T) = \int_{-\infty}^{\tilde T} e^{-(\tilde T - \tilde t)}\,\tilde b(\tilde t)\,d\tilde t$$

for all $\tilde T \in [0, \infty)$. Then it is easy to check that $\tilde a = \tilde A(\tilde b)$ solves the differential inclusion $\dot{\tilde a} \in B(\tilde a) - \tilde a$ with initial condition $\tilde a(0) = \int_{-\infty}^{0} e^{-(0 - \tilde t)}\,\tilde b(\tilde t)\,d\tilde t$. Conversely, if $\tilde a$ satisfies the differential inclusion $\dot{\tilde a} \in B(\tilde a) - \tilde a$ with initial condition $\tilde a(0) = \tilde p$, and if

$$\tilde b = \begin{cases} \tilde p & \text{on } [-\infty, 0), \\ \dot{\tilde a} + \tilde a & \text{on } [0, \infty), \end{cases}$$

then $\tilde b$ is a CTFP2.

Since CTFP2 differs from CTFP1 only by a change of time variable and since the differential inclusion associated with CTFP2 is stationary, we prefer to work with CTFP2. Notice that the differential inclusion associated with CTFP2 was studied earlier by Gilboa and Matsui (1991) and Matsui (1992). In particular, Matsui (1992) called this differential inclusion the best-response dynamics. Given that the present paper focuses on fictitious play and given that CTFP2 differs from CTFP1 only by a change of time variable, we prefer to avoid this terminology in the present paper. Notice too that both Gilboa and Matsui (1991) and Matsui (1992) considered only those solutions of the differential inclusion that are regular in an appropriate sense. For example, Matsui (1992) considered only those solutions such that $\tilde b = \dot{\tilde a} + \tilde a$ is constant from the right. For our purposes, however, it is important to consider all meaningful solutions of the differential inclusion. It would not otherwise be possible to deduce the convergence of DTFP as a corollary of our results on CTFP.
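The change of time variable linking the two differential inclusions can be spelled out as follows. This is only a sketch: it assumes that $a$ solves the CTFP1 inclusion in the sense of Definition 5 and writes $\tilde a(\tilde t) = a(e^{\tilde t})$, which agrees with $\tilde A(\tilde b)$ after the substitution $t = e^{\tilde t}$ in the integral.

```latex
\begin{align*}
\tilde a(\tilde t) &= a\bigl(e^{\tilde t}\bigr), \\
\frac{d\tilde a}{d\tilde t}(\tilde t)
  &= e^{\tilde t}\,\dot a\bigl(e^{\tilde t}\bigr)
   \in e^{\tilde t}\cdot\frac{1}{e^{\tilde t}}
      \Bigl(B\bigl(a(e^{\tilde t})\bigr) - a\bigl(e^{\tilde t}\bigr)\Bigr)
   = B\bigl(\tilde a(\tilde t)\bigr) - \tilde a(\tilde t)
\end{align*}
```

The inclusion holds for almost all $\tilde t \ge 0$, because the map $\tilde t \mapsto e^{\tilde t}$ sends null sets to null sets. The factor $1/t$ cancels, which is why the CTFP2 inclusion is stationary.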

4. HEURISTIC PROOF OF THE MAIN RESULT

In the present section we give a heuristic proof of the main theorem of the paper. According to this result, for any finite zero-sum game $G$:

1. The rate of convergence in payoff terms of CTFP1 is $1/t$. More precisely, suppose that $v$ is the value of the game. Then there exists $K_P(G) < \infty$ such that, for all CTFP1 $b$,

$$v - \min_{p_2 \in P_2} u(a_1(t), p_2) \le \frac{K_P(G)}{t}$$

and

$$\max_{p_1 \in P_1} u(p_1, a_2(t)) - v \le \frac{K_P(G)}{t}$$

for all $t \ge 1$. In other words, $a_1(t)$ is a $(K_P(G)/t)$-optimal strategy for player 1, and $a_2(t)$ is a $(K_P(G)/t)$-optimal strategy for player 2.

2. The rate of convergence in strategy terms of CTFP1 is $1/t$. More precisely, let $C_i$ denote the set of optimal strategies of player $i$. Then there exists $K_S(G) < \infty$ such that, for all CTFP1 $b$,

$$d(a_i(t), C_i) \le \frac{K_S(G)}{t}$$

for all $i \in \{1, 2\}$ and all $t \ge 1$. Here $a = A(b)$, and $d(a_i(t), C_i)$ denotes the Euclidean distance between $a_i(t)$ and $C_i$.

Of course, we shall actually prove that the rate of convergence in both payoff and strategy terms of CTFP2 is $e^{-\tilde t}$. The corresponding results for CTFP1 then follow at once from the time change $t = e^{\tilde t}$.

It should also be noted that we give explicit choices for the constants $K_P(G)$ and $K_S(G)$. For these choices, $K_P(G)$ depends continuously on $G$, but $K_S(G)$ does not.

The basic idea of our proof is extremely simple. Suppose that we identify $P_1$ and $P_2$ with the unit simplices in $\mathbb{R}^{S_1}$ and $\mathbb{R}^{S_2}$, and let $U$ be the matrix in $\mathbb{R}^{S_1} \times \mathbb{R}^{S_2}$ for which

$$u_1(p_1, p_2) = p_1^* U p_2 = -u_2(p_1, p_2).$$

(Here $p_1^*$ denotes the transpose of $p_1$.) Let

$$H(p_2) = \max_{p_1 \in P_1} p_1^* U p_2, \qquad L(p_1) = \min_{p_2 \in P_2} p_1^* U p_2,$$

and

$$W(p_1, p_2) = H(p_2) - L(p_1).$$

Then $W(p_1, p_2) \ge 0$, with equality iff $(p_1, p_2) \in NE(G)$. So $W$ can be used as a Lyapunov function for fictitious play. Indeed, suppose that $b$ is a CTFP2, let $a = \tilde A(b)$, and let $w(t) = W(a_1(t), a_2(t))$. (We have dropped the tilde associated with CTFP2 for notational ease.) Then

$$\begin{aligned}
\dot w &= H'(a_2; \dot a_2) - L'(a_1; \dot a_1) \\
&\qquad \text{(where $f'(a_i; \dot a_i)$ denotes the one-sided directional derivative of $f$ in the direction $\dot a_i$)} \\
&= b_1^* U \dot a_2 - \dot a_1^* U b_2 \\
&\qquad \text{(by the envelope theorem and the fact that $b_i$ is a best response to $a_{3-i}$)} \\
&= b_1^* U (b_2 - a_2) - (b_1 - a_1)^* U b_2 \\
&\qquad \text{(since $\dot a_i = b_i - a_i$)} \\
&= -b_1^* U a_2 + a_1^* U b_2 \\
&\qquad \text{(on simplifying)} \\
&= -H(a_2) + L(a_1) \\
&\qquad \text{(by definition of $H$ and $L$)} \\
&= -w.
\end{aligned}$$

In other words, $W$ decreases exponentially along trajectories of the differential inclusion $\dot a \in B(a) - a$. In particular, $W$ is always nonincreasing, and $W$ is strictly decreasing as long as it is strictly positive.
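The exponential decay of $W$ can be checked numerically. The sketch below is an assumption-laden illustration rather than the paper's method: it replaces the differential inclusion $\dot a \in B(a) - a$ by an explicit Euler scheme with step `dt`, always selects a pure best response (lowest index on ties), and compares the resulting value of $W$ with $e^{-t} W(a(0))$. All function names are hypothetical.

```python
import math


def payoff_bounds(U, p, q):
    """Return (H(q), L(p), row best-response index, column best-response index)."""
    row_vals = [sum(U[i][j] * q[j] for j in range(len(q))) for i in range(len(p))]
    col_vals = [sum(p[i] * U[i][j] for i in range(len(p))) for j in range(len(q))]
    i_star = max(range(len(p)), key=row_vals.__getitem__)  # player 1 maximizes
    j_star = min(range(len(q)), key=col_vals.__getitem__)  # player 2 minimizes
    return row_vals[i_star], col_vals[j_star], i_star, j_star


def lyapunov(U, p, q):
    """W(p, q) = H(q) - L(p)."""
    H, L, _, _ = payoff_bounds(U, p, q)
    return H - L


def euler_best_response_dynamics(U, p, q, horizon, dt):
    """Euler steps a <- a + dt * (b - a), with b a pure best response to a."""
    p, q = list(p), list(q)
    for _ in range(int(round(horizon / dt))):
        _, _, i_star, j_star = payoff_bounds(U, p, q)
        p = [(1 - dt) * x + (dt if i == i_star else 0.0) for i, x in enumerate(p)]
        q = [(1 - dt) * x + (dt if j == j_star else 0.0) for j, x in enumerate(q)]
    return p, q


def predicted_gap(w0, t):
    """Value of W at time t suggested by the heuristic argument: exp(-t) * W(0)."""
    return math.exp(-t) * w0
```

For matching pennies started at a pure-strategy profile with $W = 2$, running the scheme to `horizon=2.0` with `dt=1e-4` gives a value of $W$ close to $2e^{-2}$. The Euler path slightly overshoots whenever a best response switches, so exact agreement should not be expected.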

Now let $v$ denote the value of $G$. Then $C_2$ is the intersection of $P_2$ and the half-spaces $\{x_2 \mid (U^* s_1) \cdot x_2 \le v\}$. Moreover,

$$(U^* s_1) \cdot a_2 - v \le H(a_2) - v \le (H(a_2) - v) + (v - L(a_1)) = H(a_2) - L(a_1) = W(a_1, a_2).$$

Hence the distance between $a_2$ and the half-space $\{x_2 \mid (U^* s_1) \cdot x_2 \le v\}$ converges exponentially to zero. Hence $a_2$ converges exponentially to $C_2$. Similarly, $a_1$ converges exponentially to $C_1$.

There are, unfortunately, two difficulties with this argument. First, there is no immediate reason why the best responses that arise in our application of the envelope theorem need be the same as the best responses that define the rate of change of the $a_i$. Second, we need to give a precise proof of the intuitively obvious claim that the exponential convergence of $a_i$ to each of the half-spaces of which $C_i$ is the intersection implies the exponential convergence of $a_i$ to $C_i$ itself. (It is easy to see that $a_i$ converges to $C_i$. The difficulty consists in showing that convergence takes place at the same rate as the convergence of $w$.) The next two sections are devoted to showing that these difficulties can be overcome.

5. CONVERGENCE IN PAYOFF TERMS

Suppose that $b$ is a CTFP2, let $a = \tilde A(b)$, and let $H$, $L$, and $W$ be as in Section 4. Then the purpose of the present section is to establish the following theorem.

THEOREM 9. $W(a_1(t), a_2(t)) = e^{-t}\,W(a_1(0), a_2(0))$.

It follows at once from this theorem that the rate of convergence in payoff terms of CTFP2 is $e^{-t}$ and that the constant $K_P(G)$ may be taken to be $\max_{p \in P} W(p)$.

For the purposes of the proof of Theorem 9, it is convenient to fix a version $\dot a$ of the derivative of $a$ such that: (i) $\dot a$ is Borel measurable; (ii) $\dot a \in B(a) - a$ everywhere. Furthermore, since the statement of Theorem 9 involves only $a$, we may change $b$ on a set of measure zero. We put $b = \dot a + a$.

Let $h(t) = H(a_2(t))$. Then we have the following lemma.

LEMMA 10. For almost all $t \ge 0$, $\dot h(t)$ exists and is equal to $H'(a_2(t); b_2(t) - a_2(t))$.

Here $H'(a_2(t); b_2(t) - a_2(t))$ denotes the directional derivative of $H$ at $a_2(t)$ in the direction $b_2(t) - a_2(t)$.

Proof. It follows at once from the definition of $a_2$ that $\dot a_2(t)$ exists and is equal to $b_2(t) - a_2(t)$ for almost all $t$. For any such $t$ we have

$$\begin{aligned}
H(a_2(T)) - H(a_2(t)) &= H\bigl(a_2(t) + (T - t)(b_2(t) - a_2(t)) + o(T - t)\bigr) - H(a_2(t)) \\
&\qquad \text{(by choice of $t$)} \\
&= H\bigl(a_2(t) + (T - t)(b_2(t) - a_2(t))\bigr) - H(a_2(t)) + o(T - t) \\
&\qquad \text{(by the Lipschitz continuity of $H$).}
\end{aligned}$$

However $H$ is convex. The directional derivative $H'(a_2(t); b_2(t) - a_2(t))$ of $H$ at $a_2(t)$ in the direction $b_2(t) - a_2(t)$ therefore exists and we have

$$\frac{H(a_2(T)) - H(a_2(t))}{T - t} \to H'(a_2(t); b_2(t) - a_2(t))$$

as $T \to t+$. ∎

Lemma 10 is actually more than we need. It would be sufficient to note that $h$ is Lipschitz continuous and that $\dot h(t)$ therefore exists for almost all $t \ge 0$.

LEMMA 11.

For almost all $t \ge 0$,

$$b_1(T)^* U (b_2(t) - a_2(t)) \to H'(a_2(t); b_2(t) - a_2(t))$$

as $T \to t+$.

Proof. As noted above, $\dot a_2(t)$ exists and is equal to $b_2(t) - a_2(t)$ for almost all $t$. For any such $t$ we have

$$\begin{aligned}
H(a_2(T)) - H(a_2(t)) &= b_1(T)^* U a_2(T) - b_1(t)^* U a_2(t) \\
&\qquad \text{(because $b_1$ is always a best response to $a_2$)} \\
&= b_1(T)^* U \bigl(a_2(t) + (T - t)(b_2(t) - a_2(t)) + o(T - t)\bigr) - b_1(t)^* U a_2(t) \\
&\qquad \text{(by choice of $t$)} \\
&= \bigl(b_1(T)^* U a_2(t) - b_1(t)^* U a_2(t)\bigr) + (T - t)\,b_1(T)^* U (b_2(t) - a_2(t)) + o(T - t) \\
&\qquad \text{(on rearranging).}
\end{aligned}$$

However, $B_1(p_2) \subset B_1(a_2(t))$ for all $p_2$ in a neighborhood of $a_2(t)$, since $G$ is a finite game. Hence $b_1(T) \in B_1(a_2(t))$, and therefore $b_1(T)^* U a_2(t) = b_1(t)^* U a_2(t) = H(a_2(t))$, for all $T$ sufficiently close to $t$. It follows that

$$H(a_2(T)) - H(a_2(t)) = (T - t)\,b_1(T)^* U (b_2(t) - a_2(t)) + o(T - t).$$

The lemma now follows on dividing through by $T - t$ and letting $T \to t+$. ∎

LEMMA 12.

For almost all $t \ge 0$,

$$\frac{\int_t^T e^{\tau - t}\,b_1(\tau)\,d\tau}{\int_t^T e^{\tau - t}\,d\tau} \to b_1(t)$$

as $T \to t+$.

Proof. It follows at once from the definition of $a_1$ that $\dot a_1(t)$ exists and is equal to $b_1(t) - a_1(t)$ for almost all $t$. For any such $t$, we have

$$\frac{e^{T - t}\,a_1(T) - a_1(t)}{e^{T - t} - 1} \to b_1(t)$$

as $T \to t+$. However,

$$\frac{e^{T - t}\,a_1(T) - a_1(t)}{e^{T - t} - 1} = \frac{\int_t^T e^{\tau - t}\,b_1(\tau)\,d\tau}{\int_t^T e^{\tau - t}\,d\tau}.$$

This completes the proof of the lemma. ∎

Combining Lemmas 10, 11, and 12, we have:

LEMMA 13. For almost all $t \ge 0$, $H'(a_2(t); b_2(t) - a_2(t)) = b_1(t)^* U (b_2(t) - a_2(t))$.

Proof. For almost all $t \ge 0$:

$$\begin{aligned}
b_1(t)^* U (b_2(t) - a_2(t)) &= \left(\lim_{T \to t+} \frac{\int_t^T e^{\tau - t}\,b_1(\tau)\,d\tau}{\int_t^T e^{\tau - t}\,d\tau}\right)^{\!*} U (b_2(t) - a_2(t)) \\
&\qquad \text{(by Lemma 12)} \\
&= \lim_{T \to t+} \left(\frac{\int_t^T e^{\tau - t}\,b_1(\tau)^* U (b_2(t) - a_2(t))\,d\tau}{\int_t^T e^{\tau - t}\,d\tau}\right) \\
&= H'(a_2(t); b_2(t) - a_2(t)) \\
&\qquad \text{(by Lemma 11).} \quad ∎
\end{aligned}$$

Similarly, we have:

LEMMA 14. For almost all $t \ge 0$, $L'(a_1(t); b_1(t) - a_1(t)) = (b_1(t) - a_1(t))^* U b_2(t)$.

Now let $w(t) = W(a_1(t), a_2(t))$ for all $t \ge 0$. Combining Lemmas 13 and 14, we find that $\dot w(t) = -w(t)$ for almost all $t \ge 0$, as the heuristic argument of Section 4 suggested. Hence $w(t) = e^{-t} w(0)$. We have therefore established Theorem 9.
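The step from $\dot w = -w$ almost everywhere to $w(t) = e^{-t} w(0)$ can be written out with an integrating factor. The sketch below assumes that $w$ is Lipschitz continuous, hence absolutely continuous. This should hold because $H$ and $L$ are Lipschitz and $\dot a = b - a$ is bounded, but it is stated here as an assumption rather than quoted from the text.

```latex
\begin{align*}
\frac{d}{dt}\bigl(e^{t} w(t)\bigr) &= e^{t}\bigl(w(t) + \dot w(t)\bigr) = 0
  && \text{for almost all } t \ge 0, \\
e^{t} w(t) - w(0) &= \int_0^t \frac{d}{ds}\bigl(e^{s} w(s)\bigr)\,ds = 0
  && \text{(absolute continuity of } e^{t} w(t)\text{)}, \\
w(t) &= e^{-t} w(0) && \text{for all } t \ge 0.
\end{align*}
```

Absolute continuity is what rules out a function that satisfies $\dot w = -w$ almost everywhere yet fails to equal $e^{-t} w(0)$.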

6. CONVERGENCE IN STRATEGY TERMS

Suppose that $\mathcal{M}$ is a finite set of affine functions $M: \mathbb{R}^q \to \mathbb{R}$ and that $\mathcal{N}$ is a finite set of closed half-planes $N \subset \mathbb{R}^q$. Let $\delta_N: \mathbb{R}^q \to \mathbb{R} \cup \{\infty\}$ be defined by the formula $\delta_N(y) = 0$ if $y \in N$ and $\delta_N(y) = \infty$ if $y \in \mathbb{R}^q \setminus N$, let $F = \max_{M \in \mathcal{M}} M + \sum_{N \in \mathcal{N}} \delta_N$, let $F_0$ be a point in $\operatorname{range}(F) \setminus \{\infty\}$, let $X_0 = \{y \mid y \in \mathbb{R}^q,\ F(y) \le F_0\}$, let $\xi_0$ be a point in $\mathbb{R}^q \setminus X_0$, and let $d(\xi_0, X_0)$ denote the Euclidean distance of $\xi_0$ from $X_0$. Then we have the following result.

THEOREM 15. There exists $K(\mathcal{M}, \mathcal{N}) < \infty$ such that

$$d(\xi_0, X_0) \le K(\mathcal{M}, \mathcal{N})\,(F(\xi_0) - F_0)$$

for all $\xi_0 \in \mathbb{R}^q \setminus X_0$.

An important special case of this theorem occurs when $\mathcal{N} = \emptyset$. In this case $F$ is polyhedrally convex. (A function is polyhedrally convex iff it is the maximum of a finite number of affine functions.) In the general case, $F$ is the restriction of a polyhedrally convex function to a polyhedrally convex set. (A set is polyhedrally convex iff it is the intersection of a finite number of closed half-planes.)

Combining Theorem 15 with Theorem 9, we see that the rate of convergence in strategy terms of CTFP2 is $e^{-t}$. Indeed, let $q = S_1$, let $\mathcal{M}$ consist of the affine functions $M(y) = v - y \cdot U s_2$ obtained as $s_2$ varies over $S_2$, let $\mathcal{N}$ consist of the closed half-planes $\{y \mid (-s_1) \cdot y \le 0\}$ obtained as $s_1$ varies over $S_1$, the closed half-plane $\{y \mid (\sum_{s_1 \in S_1} s_1) \cdot y \le 1\}$ and the closed half-plane $\{y \mid (-\sum_{s_1 \in S_1} s_1) \cdot y \le -1\}$, and let $F_0 = 0$. Then $X_0 = C_1$ and $F = v - L$ on $P_1$. Hence there exists $K_1(G)$ such that

$$d(a_1(t), C_1) \le e^{-t} K_1(G)\,(v - L(a_1(t))).$$

Similarly, there exists $K_2(G)$ such that

$$d(a_2(t), C_2) \le e^{-t} K_2(G)\,(H(a_2(t)) - v).$$

In particular, we may take $K_S(G) = K_P(G) \max\{K_1(G), K_2(G)\}$. By the same token, it follows from Shapiro's (1958) result (namely, that the rate of convergence in payoff terms of DTFP in finite two-person zero-sum games is at least $t^{-1/(|S_1| \cdots}$

Let $\tilde\Gamma$ be the set of all subsets of $\mathbb{R}^q$ of the form

$$\operatorname{conv}\{\nabla M \mid M \in \tilde{\mathcal{M}}\} + \operatorname{cone}\{Z(N) \mid N \in \tilde{\mathcal{N}}\}$$

obtained as $\tilde{\mathcal{M}}$ varies over the set of all subsets of $\mathcal{M}$ and $\tilde{\mathcal{N}}$ varies over the set of all subsets of $\mathcal{N}$. (Here $\operatorname{conv}\{\nabla M \mid M \in \tilde{\mathcal{M}}\}$ denotes the set of convex combinations of elements of $\{\nabla M \mid M \in \tilde{\mathcal{M}}\}$, and $\operatorname{cone}\{Z(N) \mid N \in \tilde{\mathcal{N}}\}$ denotes the set of nonnegative linear combinations of elements of $\{Z(N) \mid N \in \tilde{\mathcal{N}}\}$.) Let $\Gamma$ denote the set of all sets in $\tilde\Gamma$ that do not contain the origin and let

$$K(\mathcal{M}, \mathcal{N}) = \frac{1}{\min\{d(0, G) \mid G \in \Gamma\}}.$$

Then $K(\mathcal{M}, \mathcal{N}) \ge 0$ by construction, with equality iff $\Gamma$ is empty, and $K(\mathcal{M}, \mathcal{N}) < \infty$ since $\Gamma$ is finite and any $G \in \Gamma$ is closed. (If $\Gamma$ is empty, then $\min\{d(0, G) \mid G \in \Gamma\} = \infty$ by the usual convention, and then $K(\mathcal{M}, \mathcal{N}) = 0$.)

Now fix $\xi_0 \in \mathbb{R}^q \setminus X_0$. Let $\varepsilon \in (0, F(\xi_0) - F_0)$, let $X_\varepsilon = \{y \mid y \in \mathbb{R}^q,\ F(y) \le F_0 + \varepsilon\}$, and let $x_\varepsilon$ be the point of $X_\varepsilon$ closest to $\xi_0$. Finally, let $N(x_\varepsilon, X_\varepsilon)$ denote the normal cone to $X_\varepsilon$ at $x_\varepsilon$. (A vector $n \in \mathbb{R}^q$ is normal to $X_\varepsilon$ at $x_\varepsilon$ iff $n \cdot (y_\varepsilon - x_\varepsilon) \le 0$ for all $y_\varepsilon \in X_\varepsilon$. In other words, $n$ makes an obtuse angle with any line segment in $X_\varepsilon$ with $x_\varepsilon$ as one endpoint. The normal cone to $X_\varepsilon$ at $x_\varepsilon$ is the set of all $n$ that are normal to $X_\varepsilon$ at $x_\varepsilon$.) Then we have the following lemma.

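LEMMA 16. $\xi_0 - x_\varepsilon \in N(x_\varepsilon, X_\varepsilon)$.

In other words, $\xi_0 - x_\varepsilon$ is perpendicular to $X_\varepsilon$ at $x_\varepsilon$.

Proof. Suppose that $y \in X_\varepsilon$. Let $\eta \in (0, 1)$ be arbitrary and put $y_\eta = (1 - \eta)\,x_\varepsilon + \eta\,y$. Then

$$\begin{aligned}
|\xi_0 - x_\varepsilon|^2 &\le |\xi_0 - y_\eta|^2 \\
&\qquad \text{(by choice of $x_\varepsilon$)} \\
&= |\xi_0 - x_\varepsilon|^2 + 2(\xi_0 - x_\varepsilon) \cdot (x_\varepsilon - y_\eta) + |x_\varepsilon - y_\eta|^2 \\
&= |\xi_0 - x_\varepsilon|^2 - 2\eta\,(\xi_0 - x_\varepsilon) \cdot (y - x_\varepsilon) + \eta^2\,|x_\varepsilon - y|^2.
\end{aligned}$$

Hence

$$(\xi_0 - x_\varepsilon) \cdot (y - x_\varepsilon) \le \tfrac{1}{2}\,\eta\,|x_\varepsilon - y|^2.$$

Hence, letting $\eta \to 0$, we obtain the required inequality. ∎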

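Next, let $\partial F(x_\varepsilon)$ denote the subdifferential of $F$ at the point $x_\varepsilon$. (The subdifferential of $F$ at $x_\varepsilon$ is the set of all $y_\varepsilon \in \mathbb{R}^q$ such that $F(y) - F(x_\varepsilon) \ge y_\varepsilon \cdot (y - x_\varepsilon)$ for all $y \in \mathbb{R}^q$.) We wish to establish that $N(x_\varepsilon, X_\varepsilon)$ can be characterized in terms of $\partial F(x_\varepsilon)$. For this purpose, we need to show that $F(x_\varepsilon) = F_0 + \varepsilon$, so that $X_\varepsilon$ can be written in the form $\{y \mid y \in \mathbb{R}^q,\ F(y) \le F(x_\varepsilon)\}$, and that $\partial F(x_\varepsilon)$ has a particularly simple structure. We show first that $F(x_\varepsilon) = F_0 + \varepsilon$.

LEMMA 17. $F(x_\varepsilon) = F_0 + \varepsilon$.

Proof. Let $X_\varepsilon^- = \{y \mid y \in \mathbb{R}^q,\ F(y) < F_0 + \varepsilon\}$. Then $X_\varepsilon^-$ is open by continuity of $F$ and $X_\varepsilon^- \subset X_\varepsilon$. On the other hand, by construction, $x_\varepsilon$ lies on the boundary of $X_\varepsilon$. Hence $x_\varepsilon \in X_\varepsilon \setminus X_\varepsilon^-$ and $F(x_\varepsilon) = F_0 + \varepsilon$. ∎

We show now that $\partial F(x_\varepsilon) \in \Gamma$.

LEMMA 18. $\partial F(x_\varepsilon) \in \Gamma$.

In particular, $\partial F(x_\varepsilon)$ is finitely generated. (A convex set is said to be finitely generated iff it is the convex hull of a finite number of points and directions.)

Proof. Let $f: \mathbb{R}^{\mathcal{M}} \to \mathbb{R}$ be defined by the formula

$$f(y) = \max\{M(0) + y_M \mid M \in \mathcal{M}\}$$

and let $\varphi: \mathbb{R}^q \to \mathbb{R}^{\mathcal{M}}$ be defined by the formula $(\varphi(y))_M = M(y) - M(0)$. Then $\partial F = \partial(f \circ \varphi) + \sum_{N \in \mathcal{N}} \partial\delta_N$ by Theorem 23.8 of Rockafellar (1970), $\partial(f \circ \varphi) = \varphi^* \circ \partial f \circ \varphi$ by Theorem 23.9 of Rockafellar (1970), $\partial f(y) = \operatorname{conv}\{e_M \mid M(0) + y_M = f(y)\}$, where $e_M$ denotes the unit vector in the direction $M$, and $\partial\delta_N(y) = \{0\}$ if $Z(N) \cdot y < z(N)$ and $\partial\delta_N(y) = \operatorname{cone}\{Z(N)\}$ if $Z(N) \cdot y = z(N)$. Hence

$$\partial F(x_\varepsilon) = \operatorname{conv}\{\nabla M \mid M \in \tilde{\mathcal{M}}\} + \operatorname{cone}\{Z(N) \mid N \in \tilde{\mathcal{N}}\},$$

where $\tilde{\mathcal{M}} = \{M \mid M(0) + (\varphi(x_\varepsilon))_M = f(\varphi(x_\varepsilon))\}$ and $\tilde{\mathcal{N}} = \{N \mid Z(N) \cdot x_\varepsilon = z(N)\}$. In particular, $\partial F(x_\varepsilon) \in \tilde\Gamma$. Finally, $x_\varepsilon$ cannot minimize $F$ since $F(x_\varepsilon) > F_0$. Hence $0 \notin \partial F(x_\varepsilon)$ and $\partial F(x_\varepsilon) \in \Gamma$. ∎

The characterization of $N(x_\varepsilon, X_\varepsilon)$ is then as follows.

LEMMA 19. $N(x_\varepsilon, X_\varepsilon) = \operatorname{ray} \partial F(x_\varepsilon)$. (Here $\operatorname{ray} \partial F(x_\varepsilon) = \{\lambda y_\varepsilon \mid \lambda \in \mathbb{R}_+,\ y_\varepsilon \in \partial F(x_\varepsilon)\}$.)

In other words, a vector $n$ is normal at $x_\varepsilon$ to the level set of $F$ defined by the point $x_\varepsilon$ iff $n$ is a scalar multiple of a gradient of $F$ at $x_\varepsilon$.

Proof. Lemma 17 implies that $X_\varepsilon = \{y \mid y \in \mathbb{R}^q,\ F(y) \le F(x_\varepsilon)\}$ and that $F(x_\varepsilon)$ is not the minimum of $F$. Theorem 23.7 of Rockafellar (1970) therefore implies that $N(x_\varepsilon, X_\varepsilon) = \operatorname{cl}(\operatorname{cone} \partial F(x_\varepsilon))$. (Here $\operatorname{cl}(\operatorname{cone} \partial F(x_\varepsilon))$ denotes the closure of $\operatorname{cone} \partial F(x_\varepsilon)$.) However, Lemma 18 implies that $\partial F(x_\varepsilon)$ is finitely generated. From this it follows at once that $\operatorname{cone} \partial F(x_\varepsilon) = \operatorname{ray} \partial F(x_\varepsilon)$ and that $\operatorname{cone} \partial F(x_\varepsilon)$ is closed. ∎

It follows from Lemma 19 that $\xi_0 - x_\varepsilon \in \operatorname{ray} \partial F(x_\varepsilon)$. This allows us to establish a bound for $d(\xi_0, X_\varepsilon)$ in terms of $F(\xi_0) - F(x_\varepsilon)$.

LEMMA 20. $d(\xi_0, X_\varepsilon) \le (F(\xi_0) - F(x_\varepsilon)) / d(0, \partial F(x_\varepsilon))$.

Proof. By definition of $\operatorname{ray} \partial F(x_\varepsilon)$, there exist $\lambda \ge 0$ and $y_\varepsilon \in \partial F(x_\varepsilon)$ such that $\xi_0 - x_\varepsilon = \lambda y_\varepsilon$. Now

$$\begin{aligned}
F(\xi_0) - F(x_\varepsilon) &\ge y_\varepsilon \cdot (\xi_0 - x_\varepsilon) \\
&\qquad \text{(because $y_\varepsilon \in \partial F(x_\varepsilon)$)} \\
&= |y_\varepsilon|\,|\xi_0 - x_\varepsilon| \\
&\qquad \text{(because $\xi_0 - x_\varepsilon = \lambda y_\varepsilon$).}
\end{aligned}$$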

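Hence

$$d(\xi_0, X_\varepsilon) = |\xi_0 - x_\varepsilon| \le \frac{F(\xi_0) - F(x_\varepsilon)}{|y_\varepsilon|} \le \frac{F(\xi_0) - F(x_\varepsilon)}{d(0, \partial F(x_\varepsilon))}. \quad ∎$$

Theorem 15 follows at once from Lemma 20 on noting that $\partial F(x_\varepsilon) \in \Gamma$, so that $1/d(0, \partial F(x_\varepsilon)) \le K(\mathcal{M}, \mathcal{N})$, and on letting $\varepsilon \to 0$.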
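To make the bound of Theorem 15 concrete, here is a minimal sketch of the one-dimensional case with no half-plane constraints ($q = 1$, $\mathcal{N} = \emptyset$). Under these assumptions each affine function is `slope * y + intercept`, and the sets in $\Gamma$ are intervals spanned by slopes of a single sign. The constant therefore reduces to one over the smallest nonzero absolute slope. The function names and the example data are hypothetical.

```python
import math


def F(pieces, y):
    """Polyhedrally convex function: the maximum of the pieces slope * y + intercept."""
    return max(s * y + c for s, c in pieces)


def lower_contour_interval(pieces, level):
    """Interval [lo, hi] = {y : F(y) <= level}; assumes this set is nonempty."""
    lo, hi = -math.inf, math.inf
    for s, c in pieces:
        if s > 0:
            hi = min(hi, (level - c) / s)
        elif s < 0:
            lo = max(lo, (level - c) / s)
    return lo, hi


def distance_to_interval(y, lo, hi):
    """Euclidean distance from the point y to the interval [lo, hi]."""
    return max(lo - y, 0.0, y - hi)


def bound_constant(pieces):
    """K = 1 / min |slope| over nonzero slopes, and K = 0 if there are none."""
    nonzero = [abs(s) for s, _ in pieces if s != 0]
    return 0.0 if not nonzero else 1.0 / min(nonzero)
```

With `pieces = [(2.0, -1.0), (-0.5, -1.0), (0.0, -2.0)]` and level $F_0 = 0$, the lower contour set is $[-2, 0.5]$ and $K = 2$. The inequality $d(\xi_0, X_0) \le K\,(F(\xi_0) - F_0)$ holds with equality to the left of $-2$, where the flattest piece is active.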

7. CONVERGENCE OF DISCRETE-TIME FICTITIOUS PLAY

In this section we give a new proof of Robinson's (1951) theorem: any DTFP of a finite zero-sum game converges. The idea of the proof is to note that any discrete-time fictitious play may be embedded in continuous time. The embedded play is not a continuous-time fictitious play of the second kind, since neither player need play best responses to the average past play of the other player at all points of continuous time. (In this sense, the requirement that a play of the game be a CTFP is a stronger requirement than the requirement that it be a DTFP.) However, there exists a constant $K^*(G)$ such that, from time $T \in [0, \infty)$ onward, the embedded play is confined to convex combinations of pure strategies that are $e^{-T} K^*(G)$-optimal. The required result therefore follows from standard results on the upper semicontinuous dependence of the solutions of a differential inclusion on a parameter.

We begin by stating such a result in a form that is convenient for our purposes. For all $p \in P$, let $B_i^\varepsilon(p)$ denote the convex hull of the set of pure strategies of player $i$ that are at or within $\varepsilon$ of being a best response to $p$; let $B^\varepsilon : P \to P$ be defined by $B^\varepsilon(p) = \times_{i=1}^{I} B_i^\varepsilon(p)$ for all $p \in P$; let $C([0, \infty); P)$ denote the space of continuous functions from $[0, \infty)$ to $P$, endowed with the topology of uniform convergence on compact subsets of $[0, \infty)$; and let $\mathcal{T}(\varepsilon, p) \subset C([0, \infty); P)$ denote the set of solutions of the differential inclusion $\dot a \in B^\varepsilon(a) - a$ with initial condition $a(0) = p$. We already know from Proposition 6 that $\mathcal{T}(\varepsilon, p)$ is nonempty and compact. The following proposition describes the dependence of $\mathcal{T}(\varepsilon, p)$ on $(\varepsilon, p)$.

PROPOSITION 21. $\mathcal{T}(\varepsilon, p)$ varies upper semicontinuously with $(\varepsilon, p)$.

This property follows, as before, from the fact that the best-response correspondence $B^\varepsilon$ is nonempty, compact and convex valued, bounded, and upper semicontinuous. See Corollary 4 of Section 2.2 of Aubin and Cellina (1984), for example.


THEOREM 22. Any DTFP of a finite zero-sum game converges in both payoff and strategy terms. Moreover, this convergence is uniform in the initial conditions.

Proof. It suffices to prove uniform convergence in strategy terms. In other words, we need to show that, for all $\gamma > 0$, there exists $\tau(\gamma) \ge 0$ such that, for all DTFP $\hat b$, $d(\hat a(t), C_1 \times C_2) < \gamma$ for all $t \ge \tau(\gamma)$, where $\hat a = A(\hat b)$. Now suppose that $\hat b$ is a DTFP, let $b : [-\infty, \infty) \to P$ be given by the formula $b(t) = \hat b(\log(n))$ for all $n \in \mathbb{N}$ and all $t \in [\log(n), \log(n+1))$, let $a = \tilde A(b)$, and let

\[
K^*(G) = 2 \max_{s \in S} |u(s)|.
\]

Then we have

\[
a(\log(n)) = a(t) - \frac{e^{t} - n}{e^{t}} \bigl( b(\log(n)) - a(\log(n)) \bigr).
\]

Moreover, $b(t)$ is a best response to $a(\log(n))$, and $(e^{t} - n)/e^{t} \le [(n+1) - n]/e^{t} = e^{-t}$. Hence $b(t)$ is within $e^{-t} K^*(G)$ of a best response to $a(t)$. In particular, $a : [T, \infty) \to P$ is a solution of the differential inclusion

\[
\dot a \in B^{e^{-T} K^*(G)}(a) - a
\]

with initial condition $a(T) = \int_{-\infty}^{T} e^{-(T-t)} b(t)\, dt$.

Finally, for all $\gamma > 0$, there exists $\tau_1(\gamma) \ge 0$ such that, for all $t \ge \tau_1(\gamma)$ and all $a \in \mathcal{T}(0, P)$, $d(a(t), C_1 \times C_2) < \gamma$; and for all $\gamma > 0$ and all $\tau_1 \ge 0$, there exists $\delta(\gamma, \tau_1) > 0$ such that, for all $a \in \mathcal{T}(\delta(\gamma, \tau_1), P)$, the restriction of $a$ to $[0, \tau_1]$ lies within $\gamma$ of the restriction of $\mathcal{T}(0, P)$ to $[0, \tau_1]$. It follows that we may put $\tau(\gamma) = \tau_1(\gamma) + \tau_2(\gamma)$, where

\[
\tau_2(\gamma) = \log\left( \max\left\{ 1, \frac{K^*(G)}{\delta(\gamma, \tau_1(\gamma))} \right\} \right).
\]

Indeed, if $\hat b$ is a DTFP, if $b$ is the associated CTFP2, and if $a = \tilde A(b)$, then $\dot a(t) \in B^{\gamma}(a(t)) - a(t)$ for all $t \ge \tau_2(\gamma)$. Hence $d(a(t), C_1 \times C_2) < \gamma$ for all $t \ge \tau_1(\gamma) + \tau_2(\gamma)$. ∎

Notice that, while the proof does use the fact that CTFP converges in strategy terms uniformly in the initial conditions, it does not use any information about the rate of convergence. The proof therefore depends on the results of Section 5 but not on those of Section 6.
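As a numerical illustration of Theorem 22 in payoff terms, the following minimal sketch runs a discrete-time fictitious play of matching pennies and records the payoff gap (the best-response payoff of each player against the opponent's empirical average, summed over the two players). It is only a sketch under stated assumptions: the function names `best_response` and `dtfp_gap`, the initial play, and the lowest-index tie-breaking rule are illustrative choices and are not taken from the paper, whose formal definition of DTFP is given in an earlier section.

```python
def best_response(payoffs):
    """Index of the first maximiser of a payoff vector (ties go to the lowest index)."""
    return max(range(len(payoffs)), key=lambda k: payoffs[k])


def dtfp_gap(U, steps):
    """Discrete-time fictitious play of the zero-sum game in which the row
    player receives U[i][j] and the column player receives -U[i][j].
    Returns the payoff gap of the empirical averages after each period."""
    m, n = len(U), len(U[0])
    counts1, counts2 = [0] * m, [0] * n
    counts1[0] += 1  # assumed (arbitrary) initial play of the row player
    counts2[0] += 1  # assumed (arbitrary) initial play of the column player
    gaps = []
    for t in range(1, steps + 1):
        a1 = [c / t for c in counts1]  # empirical average of the row player
        a2 = [c / t for c in counts2]  # empirical average of the column player
        # Payoff of each pure strategy against the opponent's average play.
        row_pay = [sum(U[i][j] * a2[j] for j in range(n)) for i in range(m)]
        col_pay = [-sum(U[i][j] * a1[i] for i in range(m)) for j in range(n)]
        # Sum of best-response payoffs; zero exactly at optimal strategy pairs.
        gaps.append(max(row_pay) + max(col_pay))
        counts1[best_response(row_pay)] += 1
        counts2[best_response(col_pay)] += 1
    return gaps


if __name__ == "__main__":
    matching_pennies = [[1, -1], [-1, 1]]
    gaps = dtfp_gap(matching_pennies, 20000)
    for t in (1, 10, 100, 1000, 10000, 20000):
        print(t, gaps[t - 1])
```

The gap is nonnegative along the whole path and shrinks as the number of periods grows, which is the payoff-terms convergence asserted by the theorem. The simulation says nothing about uniformity in the initial conditions.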


8. THE RATE OF CONVERGENCE OF CONTINUOUS-TIME FICTITIOUS PLAY IN NONZERO-SUM GAMES

In the present section we show that CTFP converges in weighted-potential games. The proof is remarkable for its simplicity.

DEFINITION 23. $G$ is a weighted-potential game iff there exists a potential function $P : S \to \mathbb{R}$ and weights $\gamma_i \in (0, \infty)$ such that

\[
u_i(s_{-i}, s_i^2) - u_i(s_{-i}, s_i^1) = \gamma_i \bigl( P(s_{-i}, s_i^2) - P(s_{-i}, s_i^1) \bigr)
\]

for all $1 \le i \le I$, all $s_{-i} \in S_{-i}$, and all $s_i^1, s_i^2 \in S_i$.

As we shall see in a moment, in weighted-potential games, the potential function serves directly as a Lyapunov function for CTFP.

THEOREM 24. Weighted-potential games have the continuous-time fictitious-play property.

Proof. Indeed, suppose that we put

\[
H_i(p_{-i}) = \max_{p_i \in P_i} u_i(p_{-i}, p_i), \qquad
W(p) = \sum_{i=1}^{I} \frac{1}{\gamma_i} \bigl( H_i(p_{-i}) - u_i(p) \bigr),
\]

$p(t) = P(a(t))$ and $w(t) = W(a(t))$. Then

\[
\begin{aligned}
\dot p &= \sum_{i=1}^{I} P'(a; \dot a_i) = \sum_{i=1}^{I} \frac{1}{\gamma_i} u_i'(a; \dot a_i) = \sum_{i=1}^{I} \frac{1}{\gamma_i} u_i(a_{-i}, \dot a_i) \\
&\quad \text{(by the chain rule for differentiation, because $G$ is a potential game, and by the multilinearity of $u_i$, respectively)} \\
&= \sum_{i=1}^{I} \frac{1}{\gamma_i} u_i(a_{-i}, b_i - a_i) = \sum_{i=1}^{I} \frac{1}{\gamma_i} \bigl( u_i(a_{-i}, b_i) - u_i(a_{-i}, a_i) \bigr) \\
&\quad \text{(by definition of $\dot a_i$ and by the multilinearity of $u_i$ once again)} \\
&= \sum_{i=1}^{I} \frac{1}{\gamma_i} \bigl( H_i(a_{-i}) - u_i(a) \bigr) = w \\
&\quad \text{(by definition of $H_i$ and $w$).}
\end{aligned}
\]


Since $w$ is nonnegative and $p$ is bounded above, $p(t)$ converges monotonically to a limit $p(\infty-)$ as $t \to \infty$. Since $w$ is Lipschitz continuous, there exists $K \in [0, \infty)$ such that $w(T) \ge w(t) - K(T - t)$ for all $T \ge t$. Hence

\[
p(\infty-) - p(t) \ge p\!\left( t + \frac{w(t)}{K} \right) - p(t) = \int_{t}^{t + w(t)/K} \dot p(\tau)\, d\tau \ge \frac{w(t)^2}{2K}.
\]

Hence $w(t)$ converges to zero as $t \to \infty$, but $c \in NE(G)$ iff $W(c) = 0$. This completes the proof. ∎

Notice that the function $W$ used in the proof, which provides a measure, in payoff terms, of the degree to which a strategy profile fails to be an equilibrium, is a generalization of the function $W$ used in the previous section. Indeed, if we set $I = 2$, $\gamma_1 = \gamma_2 = 1$, and $u_2 = -u_1$, then the current $W$ reduces to the previous $W$. The earliest use of this function of which we are aware is that by Nikaido and Isoda (1955).

Notice, too, that the proof does not yield any clue as to the rate of convergence of fictitious play. This is not surprising, as the following example shows. Consider the common interest game $G$ in which $S_1 = S_2 = \{H, T\}$ and $u_1(s) = u_2(s) = 1$ if $s_1 = s_2$ and $u_1(s) = u_2(s) = 0$ if $s_1 \ne s_2$. This game has three Nash equilibria: two pure-strategy equilibria $(HH)$ and $(TT)$, and a mixed-strategy equilibrium in which each player assigns probability $\tfrac{1}{2}$ to each of her two strategies. It is easy to characterize the fictitious plays of this game which originate from the mixed-strategy equilibrium. Indeed, they are characterized by two parameters: $a \in [0, \infty]$ and $b \in \{H, T\}$. An $(a, b)$ path remains at the mixed-strategy equilibrium over the interval $[0, a]$ and then converges exponentially to the pure-strategy equilibrium $(bb)$.

It follows from this example that we cannot hope to establish a rate of convergence for a weighted-potential game that applies uniformly to all fictitious plays of that game. We may, however, hope to establish the following result.

Conjecture 25. Suppose that $G$ is a weighted-potential game and that $b$ is a continuous-time fictitious play of $G$. Then there exists $K(G, b)$ such that

\[
d(a(t), NE(G)) \le K(G, b)\, e^{-t}
\]


for all $t \ge 0$. (Actually it is possible to be more precise about the structure of the constant $K(G, b)$. We would expect the resulting bound to have the form $K_1(G)\, e^{-(t - K_2(G, b))}$.) In other words, convergence of continuous-time fictitious play is asymptotically exponential. We have not attempted to prove this conjecture.
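The Lyapunov argument of Theorem 24 and the exponential behaviour described in the example can be checked numerically on the 2 × 2 common-interest game. The sketch below applies an explicit Euler scheme to $\dot a = b - a$, with each player's state summarized by the probability she assigns to $H$. It is an illustration only: the function name `simulate_ctfp`, the step size, the horizon, and the rule used when a player is indifferent (stay put, which selects the path that never leaves the mixed equilibrium) are all assumptions of the sketch rather than anything specified in the text.

```python
import math


def simulate_ctfp(x0, y0, horizon=10.0, h=1e-3):
    """Euler scheme for da/dt = b - a in the 2x2 common-interest game.

    x and y are the probabilities that players 1 and 2 assign to H.
    Returns a list of tuples (t, x, y, potential, w)."""
    x, y = x0, y0
    path = []
    for k in range(int(round(horizon / h)) + 1):
        # Common payoff, which is also the potential (weights equal to 1).
        u = x * y + (1 - x) * (1 - y)
        # W(a): sum over players of best-response payoff minus actual payoff.
        w = (max(y, 1 - y) - u) + (max(x, 1 - x) - u)
        path.append((k * h, x, y, u, w))
        # Pure best responses; when indifferent, stay put (assumed selection).
        bx = 1.0 if y > 0.5 else (0.0 if y < 0.5 else x)
        by = 1.0 if x > 0.5 else (0.0 if x < 0.5 else y)
        x, y = x + h * (bx - x), y + h * (by - y)
    return path


if __name__ == "__main__":
    for t, x, y, u, w in simulate_ctfp(0.6, 0.7)[::2000]:
        # Distance to (HH) in the max norm, rescaled by e^t.
        print(round(t, 3), u, w, max(1 - x, 1 - y) * math.exp(t))
```

Along the simulated path the potential is nondecreasing, $w$ stays nonnegative and tends to zero, and the distance to $(HH)$ multiplied by $e^{t}$ stays bounded, which is consistent with the rate $e^{-t}$ in Conjecture 25 for this particular path. Starting from $(0.5, 0.5)$ the scheme stays at the mixed-strategy equilibrium, corresponding to the case $a = \infty$ of the example.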

9. CONCLUSION

The analysis of the present paper suggests that, from a purely conceptual point of view, continuous-time fictitious play should be regarded as "ideal" fictitious play, and discrete-time fictitious play should be regarded as an approximation to continuous-time fictitious play. Having said this, it must be stressed that, from a computational point of view, discrete-time fictitious play is more relevant than continuous-time fictitious play. Indeed, the difficulties that arise in computing the evolution of continuous-time fictitious play are closely related to the difficulties that arise in analyzing discrete-time fictitious play in its own right.

REFERENCES

Aubin, J-P., and Cellina, A. (1984). Differential Inclusions. New York: Springer-Verlag.

Brown, G. W. (1949). "Some Notes on Computation of Games Solutions." Report P-78, The Rand Corporation.

Brown, G. W. (1951a). "Iterative Solutions of Games by Fictitious Play," in Activity Analysis of Production and Allocation (T. C. Koopmans, Ed.), pp. 374–376. New York: Wiley.

Brown, G. W. (1951b). "Notes on the Solution of Linear Systems Involving Inequalities," in "Proceedings of a Second Symposium on Large Scale Digital Calculating Machinery," The Annals of the Computation Laboratory of Harvard University, 26, pp. 137–140. Cambridge, Massachusetts: Harvard University Press.

Brown, G. W., and von Neumann, J. (1950). "Solutions of Games by Differential Equations," in Contributions to the Theory of Games (H. W. Kuhn and A. W. Tucker, Eds.), Annals of Mathematics Studies No. 24, pp. 73–79. Princeton: Princeton University Press.

Gilboa, I., and Matsui, A. (1991). "Social Stability and Equilibrium," Econometrica 59, 859–867.

Hofbauer, J. (1994). "Stability for the Best Response Dynamics." Mimeo, Institut für Mathematik, Universität Wien.

Karlin, S. (1959). Mathematical Methods and Theory in Games, Vols. 1 and 2. Reading, MA: Addison-Wesley.

Krishna, V., and Sjöström, T. (1995). "On the Convergence of Fictitious Play." Mimeo, Penn State University and Harvard University.

Matsui, A. (1992). "Best-Response Dynamics and Socially Stable Strategies," J. Econ. Theory 57, 343–362.

Miyasawa, K. (1961). "On the Convergence of the Learning Process in a 2 × 2 Nonzero-Sum Two-Person Game." Research Memorandum No. 33, Economic Research Program, Princeton University.

Monderer, D., and Shapley, L. S. (1996a). "Fictitious-Play Property for Games with Identical Interests," J. Econ. Theory 68, 258–265.

Monderer, D., and Shapley, L. S. (1996b). "Potential Games," Games Econ. Behav. 14, 124–143.

Monderer, D., and Sela, A. (1996). "A 2 × 2 Game without the Fictitious Play Property," Games Econ. Behav. 14, 144–148.

Nikaido, H., and Isoda, K. (1955). "Note on Noncooperative Convex Games," Pacific J. Math. 4, 65–72.

Robinson, J. (1951). "An Iterative Method of Solving a Game," Ann. Math. 54, 296–301.

Rockafellar, R. T. (1970). Convex Analysis. Princeton: Princeton University Press.

Shapiro, H. (1958). "Note on a Computation Model in the Theory of Games," Comm. Pure Appl. Math. 11, 587–593.

Shapley, L. S. (1964). "Some Topics in Two-Person Games," in Advances in Game Theory (M. Dresher, L. S. Shapley, and A. W. Tucker, Eds.), Annals of Mathematics Studies No. 52, pp. 1–28. Princeton: Princeton University Press.