Journal of Economic Theory 79, 207-223 (1998), Article No. ET972382
Evolution with Changing Mutation Rates*

Jack Robles

Department of Economics, University of Colorado, Campus Box 256, Boulder, Colorado 80309
E-mail: roblesj@spot.colorado.edu

Received March 14, 1996; revised November 9, 1997
This article considers the robustness of long run equilibria when mutation rates are not assumed to be constant over time. Particular attention is paid to the case where mutation rates decline to zero in the limit. It is found that if behavior is ergodic, then it corresponds to the long run equilibrium for the game. However, conditions for ergodicity become increasingly restrictive as population size increases. Journal of Economic Literature Classification Number: 72. 1998 Academic Press
1. INTRODUCTION

This paper studies how the predictions of the evolutionary model of Kandori, Mailath, and Rob [14] and Young [28] (henceforth KMRY) change when mutation rates are not assumed constant over time. KMRY predict the long run behavior of an evolutionary dynamic which consists of an adaptive adjustment process, modeling population adjustment to payoff improving strategies, and a fully random component (mutation) wherein agents change strategy unpredictably. Mutations are intended to capture experimentation, mistakes, and the replacement of old population members by agents who are not aware of all the particulars of the situation into which they are entering. The addition of mutations results in unique, history-independent equilibrium predictions. However, while adding mutations yields quite striking results, the source of these mutations is not formally modeled. As a consequence, it is important to understand the implications of plausible changes in the mutation generating process.1 Declining mutation rates seem a plausible possibility.2

* This paper is based upon Chapters 3 and 4 of my Ph.D. dissertation at UCSD. I am indebted to Fabrizio Germano, Joel Sobel, Ruth Williams, three referees, an associate editor, and especially Vincent P. Crawford for helpful comments and advice. Financial support from the National Science Foundation is gratefully acknowledged.
1 [1, 2, 3, 4] all consider different assumptions regarding mutations.
2 Blume [4] also studies this question, but with a slightly different evolutionary dynamic and only for two by two games.
Experimental evidence suggests that the random element of strategy choice declines with repetition.3 As agents repeat a game, they should become more certain of their environment and of their predictions of opponents' play, and hence less likely to experiment or make mistakes.4 Of course, in order to fully capture such phenomena, mutation rates must be determined endogenously. A model with exogenously changing mutation rates is a first step towards such a model. By definition, history-dependent mutation rates are not constant, and, if endogenous mutation rates can be characterized, then results on exogenously changing mutation rates may be applied to the characterization. In this paper, I generalize KMRY's model by allowing mutation rates to change over time. I demonstrate that, with changing mutation rates, if limiting behavior is ergodic, then it is the same as limiting ergodic behavior with small, but constant, mutation rates. However, if mutation rates decline to zero too quickly, then an ergodic distribution may fail to exist. Sufficient conditions for existence, which relate the speed at which mutation rates go to zero to the population size, are provided. Because necessary and sufficient conditions for ergodicity are very difficult to formulate, I turn next to the question of lock out. Lock in refers to a population entering and then never departing from an equilibrium. Lock out is the opposite, and occurs if there is a time after which an equilibrium is never played. If there is a positive probability of lock out from KMRY's predicted equilibrium, then this equilibrium cannot be a history-independent prediction. Necessary and sufficient conditions for zero probability of lock out, which again relate the population size and the rate at which mutations go to zero, are provided. The rest of the paper is organized as follows. In Section 2, I provide an overview of the KMRY model.
Section 3 demonstrates the equivalence (under reasonable conditions) of ergodic behavior with changing mutation rates and ergodic behavior with small but constant mutation rates. Section 4 relates the speed at which mutation rates decline to the existence of an ergodic distribution and lock out. In Section 5, I extend results to state-dependent mutation rates. Section 6 discusses related literature.
2. THE KMRY MODEL

A single population is repeatedly randomly paired to play a symmetric normal form game.5 A state is a vector, the elements of which identify how many agents are playing each strategy. Let S be the state space. Let N

3 See Crawford [7].
4 Marimon and McGrattan [17] and Fudenberg and Kreps [12] use a similar assumption in learning models.
5 The results in this paper are easily extended to multiple populations, in which, for example, the row and column players are selected from separate populations.
denote the cardinality of S. Exposition is eased by denoting the states as k = 1, 2, ..., N. Let Δ be the set of probability distributions on S, and let ρ ∈ Δ be a (1 × N)-vector whose kth element is the probability that state k is realized. KMRY begin by considering a deterministic adjustment process, in which the frequency of strategies with higher payoffs increases.6 The adjustment process may also be probabilistic, as in Samuelson [23] and Kandori and Rob [15], and put positive probability on a transition from a state i to j if and only if j can result from some of the population adjusting to a best response to state i. Given an adjustment process one may define p_ij(0) = prob[state j | last state was i] under adjustment without mutation. The transition matrix for the Markov chain defined by the adjustment process, P(0), is the matrix whose entry in the ith row and jth column is p_ij(0). A communication class is a minimal collection of states that is closed under the adjustment process without mutation. That is, once the population enters a state in a communication class, the adjustment process without mutation cannot escape the communication class, and will eventually visit every state in it. Let Γ be the set of communication classes. KMRY's model is completed by adding, for each agent, a probability ε > 0 of mutating. A mutating agent has a positive probability of choosing every strategy. Let p_ij(ε) = prob[state j | last state was i] when there is a mutation rate ε. The transition matrix P(ε) (with entries p_ij(ε)) is strictly positive, and hence irreducible and aperiodic. This implies the existence of a unique stationary distribution μ(ε) (i.e., μ(ε)P(ε) = μ(ε)). μ(ε) is an ergodic distribution for the Markov chain defined by P(ε). KMRY show that lim_{ε→0} μ(ε) = μ*, where μ* is a stationary distribution of P(0). Hence mutations select one of the stationary distributions of P(0).
The support of μ* is the long run equilibrium (LRE), the long run prediction when ε is close to zero. The LRE are the states that require the fewest mutations to be reached from every other state, and can be characterized by counting the mutations needed for transitions. Let d(i, i′) be the number of agents playing different strategies in states i and i′. The transition cost for i₁ → i₂ is c(i₁, i₂) = min{d(i, i₂) | p_{i₁ i}(0) > 0}. Analysis is simplified by focusing on communication classes and the difficulty of (possibly multi-period) transitions between them. A path between two states i and i′ is an ordered set, with first element i and last element i′. The cost of a path q = [i = i₁, i₂, ..., i_n = i′] is c(q) = Σ_{j=1}^{n−1} c(i_j, i_{j+1}). Let Q(δ₁, δ₂) be the set of paths starting from i ∈ δ₁ and ending at i′ ∈ δ₂, for δ₁, δ₂ ∈ Γ. The resistance from δ₁ to δ₂ is r(δ₁, δ₂) = min_{q ∈ Q(δ₁, δ₂)} c(q). The radius of a set γ ⊂ Γ is R(γ) = min_{δ ∈ γ, δ′ ∈ Γ∖γ} r(δ, δ′). The coradius of γ ⊂ Γ

6 The following description is closer in nature to [14] than to [28].
is CR(γ) = min_{δ ∈ γ} max_{δ′ ∈ Γ∖γ} r(δ′, δ). The modified cost of a path q is c*(q) = c(q) − Σ_{δ : q ∩ δ ≠ ∅} R(δ). The modified resistance between two communication classes is r*(δ, δ′) = min_{q ∈ Q(δ, δ′)} c*(q). The modified coradius of γ ⊂ Γ is CR*(γ) = min_{δ ∈ γ} max_{δ′ ∈ Γ∖γ} r*(δ′, δ).7 The antiradius of γ ⊂ Γ is AR(γ) = max_{γ′ ⊂ Γ∖γ} R(γ′). When the argument of the radius, coradius, modified coradius, or antiradius is a δ ∈ Γ, it is to be understood that {δ} ⊂ Γ is operated upon. For q ∈ Q(δ₁, δ_n), let δ_i represent the ith element of Γ that q passes through. The associated Γ-path, q̂ = (δ₁, ..., δ_n), is an ordered list of the communication classes that q passes through.

Proposition 2.1. The antiradius is (weakly) less than the modified coradius: AR(γ) ≤ CR*(γ) for all γ ⊂ Γ.

Proof. Fix γ ⊂ Γ and γ* ∈ arg max_{γ′ ⊂ Γ∖γ} R(γ′). Then R(γ*) = AR(γ). Fix x ∈ arg max_{z ∈ γ*} R(z). Then there exists a path q = [i₁ ∈ x, ..., i_n ∈ γ] with CR*(γ) ≥ r*(x, δ(i_n)) = c*(q). Let q̂ = [z₁ = x, z₂, ..., z_n] be the associated Γ-path. Let k = inf{l | z_l ∉ γ*}. Then

CR*(γ) ≥ c*(q) ≥ r(x, z₂) + Σ_{l=2}^{k−1} (r(z_l, z_{l+1}) − R(z_l)) ≥ R(x) + r(z_{k−1}, z_k) − R(z_{k−1}) ≥ R(x) + R(γ*) − R(x) = R(γ*) = AR(γ). ∎
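The quantities in Proposition 2.1 can be checked by brute force on a toy example. The sketch below is an added illustration with a hypothetical three-class direct cost matrix; following the definitions based on Ellison [11], the modified cost subtracts the radii of intermediate classes only.

```python
from itertools import permutations, chain, combinations

# Hypothetical direct transition costs between three communication classes.
c = [[0, 2, 3],
     [1, 0, 2],
     [3, 1, 0]]
G = range(3)

def paths(a, b):
    """All simple class-paths from a to b."""
    others = [z for z in G if z not in (a, b)]
    for k in range(len(others) + 1):
        for mid in permutations(others, k):
            yield (a, *mid, b)

def r(a, b):
    """Resistance: minimum summed cost over paths from a to b."""
    return min(sum(c[p[i]][p[i+1]] for i in range(len(p)-1)) for p in paths(a, b))

def subsets(S):
    return chain.from_iterable(combinations(S, k) for k in range(1, len(S) + 1))

def R(g):
    """Radius of a set of classes: cheapest escape to the outside."""
    return min(r(a, b) for a in g for b in G if b not in g)

def AR(g):
    """Antiradius: largest radius among sets of classes disjoint from g."""
    return max(R(h) for h in subsets([z for z in G if z not in g]))

def rstar(a, b):
    """Modified resistance: subtract radii of intermediate classes."""
    return min(sum(c[p[i]][p[i+1]] for i in range(len(p)-1))
               - sum(R((z,)) for z in p[1:-1]) for p in paths(a, b))

def CRstar(g):
    """Modified coradius."""
    return min(max(rstar(b, a) for b in G if b not in g) for a in g)

print(AR((0,)), CRstar((0,)))  # Proposition 2.1: AR <= CR*
```

For this cost matrix, AR((0,)) = CRstar((0,)) = 1, consistent with the weak inequality of the proposition.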
3. COMPARISONS TO LRE

In this section and the next I use the mathematics of nonstationary Markov chains. I use two notions of history independence: weak ergodicity and strong ergodicity. A chain is weakly ergodic if the predictive power of the initial conditions goes to zero as time passes. If, in addition, the chain converges to a unique distribution, then the chain is strongly ergodic. Weak ergodicity is related to the ergodic coefficient, which measures the similarity of the rows of the transition matrix P_t and is denoted α(P_t). Intuitively, the more similar the rows of a transition matrix, the less the current state matters in predicting the next period's state, and the more this matrix contributes to history independence. The precise details of this technical apparatus are contained in the appendix. Also included in the appendix are all results with section number A. I first concern myself with the predictive robustness of LRE when mutation rates are not constant, but there is long run history independence (i.e., there is an ergodic distribution). Let ε_t be the period t mutation rate. We are interested in the Markov chain with {P(ε_t)} the sequence of transition matrices. If this chain is strongly ergodic, then denote its ergodic distribution as φ, and call φ a modified long run equilibrium (MLRE).8
7 The definitions of R, CR, c*, and CR* are based on Ellison [11].
8 Note that the MLRE is a distribution, not the support of a distribution.
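For concreteness (an added sketch, not from the appendix), the ergodic coefficient described above can be computed as α(P) = min over row pairs (i, k) of Σ_j min{p_ij, p_kj}: it equals 1 when all rows agree (history is forgotten in one step) and 0 when two rows have disjoint support.

```python
import numpy as np

def alpha(P):
    """Ergodic coefficient: min over row pairs i, k of
    sum_j min(P[i, j], P[k, j])."""
    n = len(P)
    return min(np.minimum(P[i], P[k]).sum()
               for i in range(n) for k in range(i + 1, n))

identical = np.full((3, 3), 1 / 3)          # all rows equal: alpha = 1
disjoint = np.array([[1., 0.], [0., 1.]])   # rows share no support: alpha = 0
print(alpha(identical), alpha(disjoint))
```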
Following KMRY, let μ(ε) be the stationary distribution for P(ε) and let μ* = lim_{ε→0} μ(ε). My first result states that neither the predictions nor the predictive power of LRE is sensitive to nonconstant mutation rates, so long as mutation rates have a positive limit. Let ε-MLRE denote the ergodic distribution that results if ε_t → ε > 0.

Proposition 3.1. If ε_t → ε > 0, then the ε-MLRE, φ(ε), exists and φ(ε) = μ(ε).

Proof. ε > 0 implies that P(ε) is strictly positive, and hence weakly ergodic. The continuity of P(·) and the fact that ε_t → ε imply that P(ε_t) → P(ε). The proposition now follows from Theorem A.3. ∎

If the mutation rate goes to zero, then to show that μ* is the MLRE, one must show that Σ_{t=1}^∞ ‖μ(ε_t) − μ(ε_{t+1})‖ < ∞ (Theorem A.2).9 Fortunately, given the structure of KMRY models identified in Lemma A.2, it is possible to show (Lemma A.3) that Σ_{t=1}^∞ ‖μ(ε_t) − μ(ε_{t+1})‖ < ∞ if ε_t decreases monotonically. Let ε_t ↓ 0 denote the case where ε_t is nonincreasing, is strictly positive for all finite t, and converges to zero in the limit.

Proposition 3.2. If there exists a MLRE, φ, for [P(·), {ε_t}] and ε_t ↓ 0, then φ = μ*.

Proof. A MLRE is assumed to exist, and this implies weak ergodicity. The assumption that ε_t ↓ 0 and Lemma A.3 imply that Σ_{t=1}^∞ ‖μ(ε_t) − μ(ε_{t+1})‖ < ∞. Application of Theorem A.2 then implies that the MLRE is μ*. ∎
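The summability condition used above can be illustrated numerically. The following added sketch assumes a hypothetical two-class chain in which the transition A → B requires two mutations and B → A one, so that μ_A(ε) = ε/(ε² + ε) = 1/(1 + ε); monotone ε_t makes the variation sum telescope.

```python
# Hypothetical two-class chain: p_AB = eps^2, p_BA = eps, so the stationary
# distribution satisfies mu_A(eps) = 1/(1 + eps) -> mu* = (1, 0) as eps -> 0.
def mu_A(eps):
    return 1.0 / (1.0 + eps)

eps = [1.0 / t for t in range(2, 100001)]  # monotone eps_t decreasing to 0
variation = sum(2 * abs(mu_A(eps[t + 1]) - mu_A(eps[t]))  # L1 distance per step
                for t in range(len(eps) - 1))
bound = 2 * (1.0 - mu_A(eps[0]))  # telescoping bound from monotonicity
print(variation, bound)
```

Because μ_A(ε) is monotone in ε, the partial sums of ‖μ(ε_t) − μ(ε_{t+1})‖ stay below the telescoping bound, in line with Lemma A.3.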
4. CONDITIONS FOR EXISTENCE OF A MLRE

Conditions for the existence of a MLRE depend upon the number of mutations required for transitions between states. Hence I consider transition paths long enough to minimize the mutations required. To each state i assign a communication class δ(i) such that there is a positive probability, no-mutation path from i to δ(i). Let t₁ be large enough so that for every i a mutationless transition from i to δ(i) is possible in t₁ periods. Let t₂ be large enough so that a minimum cost path can be constructed between any two communication classes in no more than t₂ periods. Let t₃ be large enough so that, starting from any state within a communication class, any other state within that communication class can be reached without mutation

9 Isaacson and Madsen [14, Example V.4.1] demonstrate that this condition is not superfluous.
in t₃ periods. Let t̂ = t₁ + t₂ + t₃ and P′_t = P(ε_{t̂t}) P(ε_{t̂t+1}) ⋯ P(ε_{t̂t+t̂−1}). It is possible for minimum mutation transitions between any communication classes to occur in t̂ periods, and P′_t is the transition matrix for transitions starting in period t̂t and ending in period (t+1)t̂. Let κ̂ = max_{δ′, δ″} min_δ max{r(δ′, δ), r(δ″, δ)}. κ̂ is the order of magnitude of α(P′_t), and so can be used to determine if there is history independence in terms with which KMRY modellers are familiar: resistances.10 History independence is shown if Σ_t α(P′_t) = ∞. Hence, increasing κ̂ decreases the rate at which ε_t can go to zero. The ergodic coefficient is a measure of the similarity of rows in a transition matrix, and min_δ max{r(δ′, δ), r(δ″, δ)} is the order of magnitude of the difference between the δ″ row and the δ′ row. Hence, δ′ and δ″ are chosen to have the most dissimilarity, and κ̂ is the order of magnitude of this difference.11 Let IS({ε_t}) = {κ ∈ R₊₊ | Σ_{t=1}^∞ ε_t^κ = ∞}. A number is an element of IS({ε_t}) if it is sufficiently small.

Proposition 4.1. Let ε_t ↓ 0. A sufficient condition for [P(0), {ε_t}] to have a MLRE is κ̂ ∈ IS({ε_t}). This condition is satisfied if min_{δ ∈ Γ} CR(δ) ∈ IS({ε_t}).

Proof. Σ_t ‖μ(ε_t) − μ(ε_{t+1})‖ < ∞ by Lemma A.3 since ε_t ↓ 0. By Theorem A.2 all that remains is to show that the chain is weakly ergodic. Let P′_t ≡ P(ε_{t̂t}) P(ε_{t̂t+1}) ⋯ P(ε_{t̂(t+1)−1}). Weak ergodicity is demonstrated by showing that Σ_t α(P′_t) = ∞ and applying Theorem A.1. To show this I provide a lower bound for α(P′_t). Fix t sufficiently large, let p′_ij denote the (i, j)th component of P′_t, and let τ = t̂(t+1) − 1. Then α(P′_t) = min_{i,k} Σ_j min{p′_ij, p′_kj} ≥ min_{i,k} Σ_{j ∈ S̄} min{p′_ij, p′_kj}, where S̄ = {j ∈ S | j ∈ δ for some δ ∈ Γ}. p′_ij is greater than the probability that, starting from i, j is reached via a least mutation path and then not departed from. The probability of such a path is the probability that the adjustment process follows this path, the right mutations occur, and the wrong mutations do not occur. This is a polynomial in ε_{t̂t}, ..., ε_τ, the leading terms of which are of the form x_ij ∏_{s=t̂t}^{τ} ε_s^{ℓ_s} ≥ x_ij ε_τ^{κ_ij} with κ_ij = Σ_s ℓ_s. Now t̂ was chosen so that there would be a t̂ period path i → j which used r(δ(i), δ(j)) mutations; hence there is such a term with κ_ij = r(δ(i), δ(j)). Since ε_t ↓ 0, for t sufficiently large there exists x_ij > 0 such that p′_ij ≥ x_ij ε_τ^{r(δ(i), δ(j))}. Let x = min_{ij} x_ij. Then α(P′_t) ≥ x min_{i,k} Σ_{j ∈ S̄} min{ε_τ^{r(δ(i), δ(j))}, ε_τ^{r(δ(k), δ(j))}} ≥ x ε_τ^{κ̂}, where κ̂ = max_{i,k} min_{j ∈ S̄} max{r(δ(i), δ(j)), r(δ(k), δ(j))}. The expression for κ̂ follows from the fact that, since ε_τ < 1, the size of r and the size of ε_τ^r are inversely related (so

10 The order of magnitude is in terms of ε_{t̂(t+1)−1}. Why it is this particular mutation rate is made clear in the proof of Proposition 4.1.
11 Note that if there are two or more communication classes, then δ′ and δ″ are distinct. However, there is no reason, a priori, to think that δ will be distinct from these two communication classes.
that min changes to max and vice versa) and the fact that, in terms of establishing a lower bound on α(P′_t), the sum acts like a maximization. By Lemma A.1, Σ_{t=0}^∞ ε_t^κ̂ = ∞ is equivalent to Σ_{t=0}^∞ ε_{t̂(t+1)−1}^κ̂ = ∞, which clearly implies that Σ_t α(P′_t) = ∞. κ̂ is less than the minimum coradius, since the minimization in κ̂ can always do at least as well as picking j ∈ δ where δ has the minimum coradius. By definition of the coradius, r(δ(i), δ(j)) ≤ CR(δ(j)) for all i. ∎

Since resistances, and hence κ̂, are (ignoring discreteness) linear in the population size, the sufficient conditions for ergodicity are increasingly restrictive for large populations. In the next section, necessary conditions for history independence (i.e., conditions for zero probability of lock out) are derived which also have this property.

4.1. Lock Out

Typically one speaks of lock in, the possibility that the population begins playing an equilibrium which is forever repeated. The opposite possibility is that after some time the population never again visits some state. I refer to this as lock out. Clearly, if the population locks in to some state (or set of states), then it has locked out of all other states. If there is a positive probability that the population might lock out of the set of LRE, then the LRE cannot be a history-independent prediction.12 The following proposition deals with two closely related issues: (i) Will the process eventually return to some state in the set of LRE? (ii) Will a particular element of the long run equilibria arise infinitely often? Let Θ be the set of LRE.

Proposition 4.2. There is a positive probability of lock out from Θ if and only if AR(Θ) ∉ IS({ε_t}). θ ∈ Θ will be visited infinitely often with probability one if and only if AR(θ) ∈ IS({ε_t}).

Proof. The proofs of the two statements in the proposition are nearly identical, so I will only prove the second. Assume AR(θ) ∈ IS({ε_t}). Let B₀ = {θ} and let B_l = {z ∈ Γ | ∃ z′ ∈ B_{l−1} such that r(z, z′) ≤ AR(θ)} for l > 0. Since Γ is a finite set there must be an n such that B_n = Γ. Also, since Γ is finite there must be z₁ ∈ B_n = Γ such that z₁ is visited infinitely often. By the definition of B_n there exists z₂ ∈ B_{n−1} such that r(z₁, z₂) ≤ AR(θ). Hence, since mutations are independent of everything else, the first Borel-Cantelli lemma (Durrett [9, p. 40]) implies that the transition z₁ → z₂ must occur infinitely often (i.o.) with probability one. Hence z₂ is visited i.o. with probability one. Induction and the finiteness of Γ imply that θ is visited

12 This is true even though there may be many points in time before lock out when the state is in the LRE.
i.o. with probability one. Now assume that AR(θ) ∉ IS({ε_t}). Even though the transition z₁ → z₂ may involve mutations over several periods, it still has probability of order ε_t^{r(z₁, z₂)}.13 Hence if Σ_t ε_t^{r(z₁, z₂)} < ∞, then by the second Borel-Cantelli lemma z₁ → z₂ occurs i.o. with probability zero. Fix γ ⊂ Γ∖θ with R(γ) = AR(θ). Then for all z₁, z₂ such that z₁ ∈ γ and z₂ ∈ Γ∖γ, the transition z₁ → z₂ occurs i.o. with probability zero. Hence transitions out of γ occur i.o. with probability zero. Let t̄ denote the random time such that for all t ≥ t̄ such transitions do not occur. It is clearly possible that at time t̄ the state is in γ. Hence there is a positive probability of lock in on γ and lock out of θ. ∎

Propositions 4.1 and 4.2 can be used to derive a corollary about two by two coordination games. Consider Game 1 with a > c and d > b. There are L players in the population, and the state is simply the number of agents playing s₁. Aside from the two pure strategy equilibria, there is also a mixed strategy equilibrium which puts probability σ = (d − b)/(a − c + d − b) on strategy one. Let (s₁, s₁) be the risk dominant equilibrium and let κ = [σ(L − 1)]⁺, the least integer greater than or equal to σ(L − 1). The long run equilibrium Θ consists of the state where all play the risk dominant equilibrium, and AR(Θ) = CR(Θ) = κ.14
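For Game 1, the criterion of Propositions 4.1 and 4.2 reduces to checking whether Σ_t ε_t^κ diverges, with κ = [σ(L − 1)]⁺. The sketch below is an added illustration: it reads [x]⁺ as the least integer ≥ x and applies the integral test for series of the form Σ_t t^(−a)(log t)^(−b), which diverges iff a < 1, or a = 1 and b ≤ 1.

```python
from math import ceil

def kappa(sigma, L):
    """Mutations needed to escape the risk dominant basin:
    the least integer greater than or equal to sigma*(L - 1)."""
    return ceil(sigma * (L - 1))

def diverges(a, b):
    """Integral test: does sum_t t**(-a) * (log t)**(-b) diverge?"""
    return a < 1 or (a == 1 and b <= 1)

sigma = 5 / 11
for L in (11, 44):
    k = kappa(sigma, L)
    # eps_t = K*(t log t)**(-1/k)            => eps_t**k ~ t**-1 (log t)**-1
    # eps_t = K*t**(-1/k) (log t)**(-1/(k-1)) => eps_t**k ~ t**-1 (log t)**(-k/(k-1))
    print(L, k, diverges(1, 1), diverges(1, k / (k - 1)))
```

The first schedule always lands exactly on the borderline divergent series, while the second, only marginally faster, fails the test; this is the comparison behind the numerical example discussed below.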
Proposition 4.3. Let P(0) be derived from a 2 × 2 coordination game. Assume that ε_t ↓ 0. Then there exists a MLRE if and only if Σ_{t=0}^∞ ε_t^κ = ∞.

The no lock out condition in Proposition 4.2 is quite similar to the condition for history independence in Proposition 4.1. A natural conjecture is that if there is no lock out, then there is history independence. This seems unlikely. What separates θ ∈ Θ from other states is that there is a θ-tree with lower stochastic potential than any s-tree for any s ∉ Θ.15 However, what if the minimum potential θ-tree includes a transition that requires κ > AR(θ) mutations and κ ∉ IS({ε_t})? Then with probability one there is a point in time beyond which this transition does not occur. In this case, there is little difference between this tree and one in which the transition in question is replaced by a transition which requires infinitely many mutations. Hence I would conjecture that history independence requires a θ-tree made
13 See, e.g., the appendix from [28].
14 Necessary and sufficient conditions for history independence may be found whenever AR(Θ) = CR(Θ). Two cases where this occurs are games with at most two communication classes, and summary statistic games [22].
15 An s-tree is a directed graph over S such that every s₁ ∈ S∖s has a single arrow from it to some s₂, and starting from any s₁ there is a sequence of arrows leading to s. Summing c(s₁, s₂) for every s₁ → s₂ in the tree and taking the minimum of this sum over every s-tree yields the stochastic potential for s. s ∈ Θ if and only if there is no state with lower stochastic potential.
Figure 1
of transitions which all occur infinitely often and with lower stochastic potential than any z-tree for any z ∈ Γ∖Θ. The conditions for Propositions 4.1, 4.2, and 4.3 require mutation rates to decline more slowly for larger populations. That is, the sensitivity of KMRY's predictive power to changes in the mutation process increases with population size. As an example consider Game 1, and let σ = 5/11. If the population size is L = 11, then κ = 5 and Σ_{t=2}^∞ ε_t^κ = ∞ if ε_t = K(t log t)^{−1/5} but not if ε_t = Kt^{−1/5}(log t)^{−1/4}. If the population size is L = 44, then κ = 20 and there is history independence if ε_t = K(t log t)^{−1/20} and history dependence if ε_t = Kt^{−1/20}(log t)^{−1/19}. Let HI denote the sequence resulting in history independence and HD the sequence leading to history dependence. Some numerical values are provided below. History independence requires slow convergence even for a population of 11. Convergence must be much slower with a larger population. The issue is intuitively related to the objection that it seems strained to make predictions based on the relative probabilities of transitions all of which require a large number of very improbable events. While there is no arguing with KMRY's result that constant mutation rates, no matter how small, imply an ergodic distribution, the results in this section demonstrate
Figure 2
the sensitivity of these mathematics to a relaxation of the constant mutation rate assumption. Furthermore, this sensitivity increases with the population size. Since one cannot expect modeling assumptions to be completely accurate, the results in this section provide formal support for this intuitive criticism. However, the linearity (ignoring discreteness) of the coradius and the antiradius in the population size results from modeling a population randomly paired to play a normal form game. There are other strategic environments. For example, Ellison [10, 11] models local interaction on both a circle and a torus. Noldeke and Samuelson [20, 21] and Kim and Sobel [16] model populations randomly paired to play dynamic games. For the strategic environments in these papers, the antiradius does not depend upon the population size. Hence the no lock out conditions (necessary conditions for history independence) are independent of population size. In addition, if a population is interacting locally on a circle, then the coradius (and hence even the sufficient conditions) is independent of population size.

5. STATE DEPENDENT MUTATION RATES

Bergin and Lipman [2] suggest that mutation rates may be different in different states. In particular, if mutations reflect experimentation, and a game has a Pareto dominant equilibrium, one might expect experimentation to occur at a much lower rate in the state where all play according to that equilibrium than in other states. Accordingly, they set ε = (ε₁, ..., ε_N), where ε_i is the mutation rate in state i. The idea that mutation rates should be different is captured by allowing the various ε_i to go to zero at different rates. That is, lim_{ε_i, ε_j → 0} ε_i/ε_j = 1 is not assumed, as it is in KMRY. Note that Bergin and Lipman consider time invariant mutation rates, as do KMRY; it is simply that predictions which are a function of the relative mutation rates are considered in the limit as the time invariant mutation rates go to zero.
Bergin and Lipman demonstrate that for any f such that fP(0) = f, one can choose state dependent mutations such that f is the limit distribution. Combining state and time dependence is quite natural. One would expect both in a model where agents' experimentation goes to zero, but at (state dependent) rates which are determined by their experience. Incorporating state dependence requires some modifications. Let ε_t = (ε_{1,t}, ..., ε_{N,t}) and let P(ε_t) be the transition matrix that results from perturbing P(0) with state dependent mutations ε_t. As before, let μ(ε_t) be the probability distribution such that μ(ε_t)P(ε_t) = μ(ε_t). Let μ*_BL = lim μ(ε_t) be the Bergin-Lipman long run equilibrium (BL-LRE) and let φ_BL be the ergodic distribution for [P(·), {ε_t}] (the BL-MLRE) if it exists. The previous two sections made extensive use of Lemma A.3. This lemma followed from the monotonicity of μ(ε_t) implied by Lemma A.2. In general,
time and state dependent mutation rates do not yield the same monotonicity, so some restrictions must be placed on ε_t. Let {ξ_t} be some sequence such that ξ_t < 1 and ξ_t ↓ 0. Define 𝒵 to be the set of polynomials Z such that Z: (0, 1) → (0, 1) and Z(0) = 0. I require that ε_{i,t} = Z_i(ξ_t) with Z_i ∈ 𝒵 for all i.

Proposition 5.1. Let ξ_t ↓ 0 with ξ₁ < 1. Assume that ε_{i,t} = Z_i(ξ_t) for some Z_i ∈ 𝒵. Assume that there exists a BL-MLRE, φ_BL, for [P(·), {ε_t}]. Then φ_BL = μ*_BL.

Proof (sketch). Since each ε_{i,t} is a polynomial in ξ_t, and μ_j(ε_t) (the subscript j refers to the jth component) is a ratio of polynomials in the ε_{i,t}, it follows that μ(ε_t) is a ratio of polynomials in ξ_t. One can then apply the arguments in the proofs of Lemma A.3 and Proposition 3.2 to obtain the result. ∎

It is possible to choose Z_i ∈ 𝒵 such that the ε_{i,t} go to zero at any rate desired, both in absolute terms and relative to the rate at which the other elements of ε_t converge to zero. Since it is the relative rates at which the various mutation probabilities converge to zero that determine which invariant distribution of P(0) is selected, the Z_i may be chosen so that μ(ε_t) converges to any desired distribution, and any long run behavior described by Bergin and Lipman is possible. Propositions 4.1 and 4.2 can also be generalized. As in Section 4, let κ̂ = max_{δ′, δ″} min_δ max{r(δ′, δ), r(δ″, δ)}.

Proposition 5.2. Let the conditions of Proposition 5.1 hold. A sufficient condition for [P(·), {ε_t}] to have a BL-MLRE is κ̂ ∈ ∩_i IS({ε_{i,t}}). This condition is satisfied if min_{δ ∈ Γ} CR(δ) ∈ ∩_i IS({ε_{i,t}}).

Proof (sketch). As in the proof of Proposition 5.1, Lemma A.3 still stands. To generalize the rest of the proof of Proposition 4.1, first note that since ε_{i,t} is a polynomial in ξ_t and ξ_t ↓ 0, for t sufficiently large m = arg min_i ε_{i,t} is constant. As a consequence, IS({ε_{m,t}}) = ∩_i IS({ε_{i,t}}). Let P′_t be as in the proof of Proposition 4.1. Again α(P′_t) ≥ min_{i,k} Σ_{j ∈ S̄} min{p′_ij, p′_kj}. The RHS of the above inequality is a polynomial in ε_{1,t̂t}, ..., ε_{1,τ}, ε_{2,t̂t}, ..., ε_{N,τ}. For t sufficiently large the smallest of these is ε_{m,τ}. By exactly the same arguments as in Proposition 4.1, α(P′_t) ≥ x ε_{m,τ}^κ̂. The rest of the argument is identical. ∎

Proposition 5.3. There is a positive probability of lock out from Θ if AR(Θ) ∉ ∪_i IS({ε_{i,t}}). There is zero probability of lock out from Θ if AR(Θ) ∈ ∩_i IS({ε_{i,t}}). θ ∈ Θ will be visited infinitely often with probability one if AR(θ) ∈ ∩_i IS({ε_{i,t}}). θ ∈ Θ will be visited infinitely often with probability strictly less than one if AR(θ) ∉ ∪_i IS({ε_{i,t}}).
Proof. The proofs of the first and third statements and of the second and fourth statements are essentially identical, so I prove only the third and fourth statements. Since ε_{i,t} is a polynomial in ξ_t, it follows that for t sufficiently large both m = arg min_i ε_{i,t} and M = arg max_i ε_{i,t} are constant. Since for t sufficiently large mutations in any state have at least probability ε_{m,t} of occurring, if AR(θ) ∈ IS({ε_{m,t}}) = ∩_i IS({ε_{i,t}}), then by the same logic as in the proof of Proposition 4.2, θ is visited infinitely often with probability one. Since for t sufficiently large mutations in any state have probability of at most ε_{M,t} of occurring, if AR(θ) ∉ IS({ε_{M,t}}) = ∪_i IS({ε_{i,t}}), then there exists γ ⊂ Γ∖θ with R(γ) = AR(θ) such that transitions out of γ happen infinitely often with probability zero. Hence there is a positive probability of lock in to γ and lock out of θ. ∎

Propositions 5.2 and 5.3 could be strengthened if one had specific knowledge about the game and the state dependence of the mutations. For example, consider again Game 1. Assume (s₁, s₁) is both risk and Pareto dominant, and the mutation rate goes to zero slowest for the state where the population plays the Pareto dominated equilibrium. In this case, ergodicity and lock in are determined by a single condition. Let state N be the state where all play the Pareto dominated equilibrium, and recall that σ is the probability that the fully mixed equilibrium puts on s₁ and L is the population size. There is a MLRE if and only if [σ(L − 1)]⁺ ∈ IS({ε_{N,t}}), and conversely there is a positive probability of lock out if and only if [σ(L − 1)]⁺ ∉ IS({ε_{N,t}}). In the context of Propositions 5.2 and 5.3, [σ(L − 1)]⁺ ∈ IS({ε_{N,t}}) is equivalent to AR(θ) ∈ ∩_i IS({ε_{i,t}}).
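The role of the intersection and union of the IS sets can be made concrete with an added sketch. It assumes, purely for illustration, that ξ_t = 1/t and represents each polynomial Z_i by the degree d_i of its lowest-order term, so that ε_{i,t}^κ behaves like t^{−κ d_i} and Σ_t ε_{i,t}^κ diverges iff κ·d_i ≤ 1; the degrees are hypothetical.

```python
# Each state-dependent rate eps_{i,t} = Z_i(xi_t) behaves like xi_t**d_i for
# small xi_t, where d_i is the lowest-order degree of the polynomial Z_i.
# With xi_t = 1/t (an illustrative choice), IS({eps_{i,t}}) = (0, 1/d_i].
degrees = [1, 2, 3]  # hypothetical lowest-order degrees d_i

intersection_sup = 1 / max(degrees)  # binding rate: the fastest to vanish
union_sup = 1 / min(degrees)         # loosest rate: the slowest to vanish

def in_IS(kappa, d):
    """Does sum_t t**(-kappa*d) diverge, i.e., is kappa in IS for degree d?"""
    return kappa * d <= 1

kappa = 0.4
print(all(in_IS(kappa, d) for d in degrees),  # kappa in the intersection?
      any(in_IS(kappa, d) for d in degrees))  # kappa in the union?
```

With these degrees, κ = 0.4 lies in the union but not the intersection, matching the asymmetry between the sufficient conditions (intersection) and the lock out conditions (union) in Propositions 5.2 and 5.3.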
6. RELATED LITERATURE

Other papers have modified KMRY's model. Anderlini and Ianni [1] make mutations dependent upon the adjustment process: an agent cannot mutate unless she is already changing her strategy under the mutation-free adjustment process. Since this process cannot escape whatever equilibrium it first enters, their model is history dependent. Binmore and Samuelson [3] include both independent mutations and noise in the adaptive process. Equilibrium selection then depends not only on the sizes of basins of attraction (as in KMRY), but also on the strength of adjustment flows within basins. Hence direct examination of a game's payoff matrix is not sufficient; instead one must examine the fitness game, which incorporates information about both the payoffs and the adjustment process. Sandholm and Pauzner [24] modify KMRY's model by including population growth. This is closely related to declining mutation rates, since in either case the probability of a transition between equilibria goes to zero.
In addition to demonstrating that even moderate population growth results in history dependence, they demonstrate that very small mutation rates imply a high probability of lock-in on whatever equilibrium first arises. Crawford [7] (see also [5, 6, 8]) also proposes a model which allows agents' strategy choices to become less random as time passes. He finds strong empirical support for declining variation in individual behavior.^{16} However, Crawford models adjustment paths and short-run convergence, not long run behavior. In a slightly different framework from mine, Blume [4] provides necessary and sufficient conditions for history independence in two by two games. In addition, he discusses state dependent mutations and explains how his results could be generalized to the class of potential games (Monderer and Shapley [19]). Blume [4] uses results from simulated annealing, a method for choosing a global optimum from amongst many local optima. Simulated annealing starts with a system that moves in the direction of local improvement, and then heats it up (e.g., adds mutations) so that the system can escape from local optima. The system is then cooled off (e.g., the mutation rate goes to zero) at a rate such that it converges with probability one to the global optimum. There are general necessary and sufficient conditions on how quickly the system can be cooled off if convergence to the global optimum is to hold. However, these conditions are not in general applicable to my model, because they require a potential function with two properties: it must achieve its global maximum at the global optimum, and if a transition from $\delta_i$ to $\delta_j$ has higher probability than a transition from $\delta_j$ to $\delta_i$, then the potential must be higher at $\delta_j$ than at $\delta_i$. The stochastic potential is the natural choice of potential function, and it does indeed satisfy the first property. However, the second property is essentially a rewording of risk dominance. Hence any potential function satisfying the second property selects risk-dominant equilibria, and it has been clear since Young [28] that risk dominance does not characterize long run equilibria.
A. APPENDIX

A nonstationary Markov chain is described by a starting vector $f \in \Delta$^{17} and a sequence of transition matrices $\{P_t\}_{t=1}^{\infty}$; $P_t$ is the transition matrix in period $t$.

16. [5–8] use the experimental data of Van Huyck, Battalio, and Beil [25–27]. McKelvey and Palfrey [18] also find support for declining variation in individual behavior. However, because their model is not truly dynamic, their results are not as relevant.

17. In this section $\Delta$ is the set of probability distributions over any given finite state space, not necessarily the space of strategy profiles described above.
Multiple steps in the chain are denoted by $P^{(m,t)} \equiv P_{m+1} P_{m+2} \cdots P_t$ and $f^{(m,t)} = f P^{(m,t)}$. For any $f \in \Delta$, $\|f\| = \sum_i |f_i|$.

Definition A.1. A nonstationary Markov chain is weakly ergodic if for all $m$

$$\lim_{t \to \infty}\, \sup_{f,g}\, \| f^{(m,t)} - g^{(m,t)} \| = 0. \tag{A.1}$$

Definition A.2. A nonstationary Markov chain is strongly ergodic if $\exists\, \psi \in \Delta$ such that $\forall\, m$

$$\lim_{t \to \infty}\, \sup_{f \in \Delta}\, \| f^{(m,t)} - \psi \| = 0. \tag{A.2}$$

Weak ergodicity captures the idea that initial conditions are forgotten in the long run. Strong ergodicity adds convergence to some $\psi \in \Delta$.

Definition A.3.
The ergodic coefficient of a transition matrix P is
$$\alpha(P) \equiv 1 - \max_{i,k} \sum_j [\,p_{ij} - p_{kj}\,]^+ = \min_{i,k} \sum_j \min[\,p_{ij},\, p_{kj}\,], \tag{A.3}$$
where $[\,p_{ij} - p_{kj}\,]^+ = \max(0,\, p_{ij} - p_{kj})$. The ergodic coefficient measures how close a matrix is to having identical rows. No matter what vector premultiplies a matrix with identical rows, the same vector results. For history independence, $P^{(m,t)}$ should converge to a matrix with identical rows (i.e., $\lim_{t\to\infty} \alpha(P^{(m,t)}) = 1$). Intuitively, the ergodic coefficient measures how much a matrix contributes to history independence: the larger $\alpha(P)$, the less difference it makes what one premultiplies $P$ by. It can be shown that $\|fP - gP\| \le (1 - \alpha(P))\, \|f - g\|$.

Theorem A.1 (Isaacson and Madsen [13, Theorem V.3.2]). Let $\{P_t\}_{t=1}^{\infty}$ be the transition matrices for a nonstationary Markov chain. This chain is weakly ergodic if and only if there exists a subdivision of $P_1 P_2 P_3 \cdots$ into blocks of matrices $P^{(0,n_1)} P^{(n_1,n_2)} \cdots P^{(n_j,n_{j+1})} \cdots$ such that
$$\sum_{j=0}^{\infty} \alpha(P^{(n_j,\, n_{j+1})}) = \infty. \tag{A.4}$$
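Definition A.3 and the contraction inequality above can be checked directly. The sketch below is illustrative only, with a made-up $2 \times 2$ matrix and hypothetical helper names; it computes $\alpha(P)$ from the min form and verifies $\|fP - gP\| \le (1 - \alpha(P))\|f - g\|$:

```python
def ergodic_coefficient(P):
    """alpha(P) = min over pairs of rows (i, k) of sum_j min(p_ij, p_kj)."""
    n = len(P)
    return min(
        sum(min(P[i][j], P[k][j]) for j in range(n))
        for i in range(n) for k in range(n) if i != k
    )

def premultiply(f, P):
    """Row vector times transition matrix: (fP)_j = sum_i f_i p_ij."""
    return [sum(f[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))]

def norm(f, g):
    """The norm used in this appendix: ||f - g|| = sum_i |f_i - g_i|."""
    return sum(abs(a - b) for a, b in zip(f, g))

P = [[0.9, 0.1], [0.2, 0.8]]   # made-up example matrix
alpha = ergodic_coefficient(P)  # min(0.9, 0.2) + min(0.1, 0.8) = 0.3
f, g = [1.0, 0.0], [0.0, 1.0]
# Contraction: premultiplying by P shrinks the distance by at least 1 - alpha.
assert norm(premultiply(f, P), premultiply(g, P)) <= (1 - alpha) * norm(f, g) + 1e-12
```

A matrix with identical rows has $\alpha(P) = 1$ and maps every starting vector to the same row, which is exactly the history-independence benchmark.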
If $\psi_t P_t = \psi_t$ and $\psi_t \to \psi$, then $\psi$ is a likely candidate for the strongly ergodic distribution. For this to be true another condition must be satisfied.^{18}

Theorem A.2 (Isaacson and Madsen [13, Theorem V.4.3]). Let $\{P_t\}$ be a sequence of transition matrices corresponding to a nonstationary weakly ergodic Markov chain. If there exist $\psi_t$ such that $\psi_t P_t = \psi_t$ and
$$\sum_{j=1}^{\infty} \| \psi_j - \psi_{j+1} \| < \infty, \tag{A.5}$$
then the chain is strongly ergodic. (The strongly ergodic distribution is $\psi = \lim_{t\to\infty} \psi_t$.) A matrix $P$ is weakly ergodic if it has a unique stationary distribution.

Theorem A.3 (Isaacson and Madsen [13, Theorem V.4.5]). Let $\{P_t\}$ be a sequence of transition matrices corresponding to a nonstationary Markov chain. If $\|P_t - P\| \to 0$ as $t \to \infty$, where $P$ is weakly ergodic, then the chain is strongly ergodic. (The strongly ergodic distribution is the $\psi \in \Delta$ such that $\psi P = \psi$.)

Lemma A.1. Assume that $\varepsilon_t \downarrow 0$. Then $\sum_{t=0}^{\infty} \varepsilon_t = \infty$ if and only if $\sum_{t=0}^{\infty} \varepsilon_{\hat t + t t_1} = \infty$ for all finite $\hat t$ and all $t_1$.
Proof. Since $\varepsilon_t$ is decreasing, $\sum_{t=0}^{\infty} \varepsilon_{\hat t + t t_1} \le \sum_{s=\hat t}^{\infty} \varepsilon_s \le t_1 \sum_{t=0}^{\infty} \varepsilon_{\hat t + t t_1}$, so the two sums diverge together; and since dropping the finitely many terms before $\hat t$ does not affect divergence, $\sum_{s=\hat t}^{\infty} \varepsilon_s = \infty$ if and only if $\sum_{t=0}^{\infty} \varepsilon_t = \infty$. ∎
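The convergence asserted in Theorem A.3 can be illustrated numerically. In the sketch below (a made-up example, with entirely hypothetical matrices and names) the transition matrices $P_t$ converge to a fixed $2 \times 2$ matrix $P$ with a unique stationary distribution, and the distribution of the chain converges to that stationary distribution regardless of the starting vector:

```python
def premultiply(f, P):
    """Row vector times transition matrix: (fP)_j = sum_i f_i p_ij."""
    return [sum(f[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))]

def P_at(t):
    """Perturbed matrices P_t converging to the limit with off-diagonal
    entries (0.3, 0.2); the 1/t perturbations vanish as t grows."""
    a = 0.3 + 0.5 / t
    b = 0.2 + 0.3 / t
    return [[1 - a, a], [b, 1 - b]]

f = [1.0, 0.0]                 # start with all mass on state 1
for t in range(1, 2001):
    f = premultiply(f, P_at(t))

psi = [0.2 / 0.5, 0.3 / 0.5]   # stationary distribution of the limit matrix
gap = sum(abs(x - y) for x, y in zip(f, psi))  # small: strong ergodicity
```

Starting instead from $[0, 1]$ produces essentially the same terminal vector, which is the history-independence content of strong ergodicity.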
Lemma A.2 (Kandori et al. [15, Lemma 1]). Let $\mu_i(\varepsilon)$ be the probability that $\mu(\varepsilon)$ places on state $i$. Then $\exists\, \eta_i(\varepsilon)$, $i = 1, \ldots, N$, such that $\eta_i(\varepsilon)$ is a polynomial in $\varepsilon$ and $\mu_i(\varepsilon) = \eta_i(\varepsilon) \big/ \sum_i \eta_i(\varepsilon)$.

Lemma A.3.
If $\varepsilon_t \downarrow 0$ then $\sum_{t=1}^{\infty} \| \mu(\varepsilon_t) - \mu(\varepsilon_{t+1}) \|$ is finite.
Proof. For this proof let primes denote the derivative with respect to $\varepsilon$. Let $L$ be the number of players in the population. Let $\eta(\varepsilon) = \sum_i \eta_i(\varepsilon)$, and let $x_i(\cdot)$ be such that $\sum_{v=0}^{2L-1} x_i(v)\, \varepsilon^v = \eta_i' \eta - \eta' \eta_i$.

18. Isaacson and Madsen [13] did not include the parenthetical statements in Theorems A.2 and A.3. However those statements follow from the proofs, and they are central to my analysis.
Then $\mu_i'(\varepsilon) = \sum_{v=0}^{2L-1} x_i(v)\, \varepsilon^v \big/ (\eta(\varepsilon))^2$. Let $v_i = \min\{v \mid x_i(v) \neq 0\}$, let $\varphi_i = |x_i(v_i)| \big/ \sum_{v=0}^{2L-1} |x_i(v)|$, and let $\varphi = \min_i \{\varphi_i\}$. Then for $\varepsilon \in (0, \varphi)$, $\mathrm{sign}(\mu_i'(\varepsilon)) = \mathrm{sign}(x_i(v_i))$. Let $T = \min\{t \mid \varepsilon_t < \varphi\}$. Then
$$\sum_{t=0}^{\infty} \| \mu(\varepsilon_t) - \mu(\varepsilon_{t+1}) \| = \sum_{i \in S} \sum_{t=0}^{\infty} | \mu_i(\varepsilon_t) - \mu_i(\varepsilon_{t+1}) |$$
$$= M + \sum_{i \in S} \sum_{t=T}^{\infty} | \mu_i(\varepsilon_t) - \mu_i(\varepsilon_{t+1}) |$$
$$= M + \sum_{i \in S} | \mu_i(\varepsilon_T) - \mu_i^* | < \infty.$$
The first equality is due to Fubini's theorem. The second equality simply results from collecting the sum up to $t = T$ into $M$; $M$ is a finite sum of finite terms and is therefore finite. The next equality results from the fact that $\mu_i(\cdot)$ is monotonic for each $i$ on the relevant range, so the sum telescopes (here $\mu_i^* = \lim_{t \to \infty} \mu_i(\varepsilon_t)$). The final inequality follows because $\sum_{i \in S} |\mu_i(\varepsilon_T) - \mu_i^*|$ is a finite sum of finite terms. ∎
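The telescoping bound in this proof can be seen numerically. The sketch below uses hypothetical polynomials $\eta = (\varepsilon^2, \varepsilon, 1)$ and the rate schedule $\varepsilon_t = 1/t$, chosen purely for illustration; each $\mu_i$ is monotone on $(0,1)$, so the partial sums of $\sum_t \|\mu(\varepsilon_t) - \mu(\varepsilon_{t+1})\|$ stay bounded:

```python
def mu(eps):
    """mu_i = eta_i / sum(eta), with hypothetical polynomials
    eta = (eps^2, eps, 1) standing in for Lemma A.2's eta_i."""
    eta = (eps * eps, eps, 1.0)
    s = sum(eta)
    return [e / s for e in eta]

def total_variation(T):
    """Partial sum of ||mu(eps_t) - mu(eps_{t+1})|| with eps_t = 1/t."""
    return sum(
        sum(abs(x - y) for x, y in zip(mu(1.0 / t), mu(1.0 / (t + 1))))
        for t in range(1, T + 1)
    )

# Each mu_i is monotone on (0, 1), so the sum telescopes and is bounded by
# sum_i |mu_i(1) - mu_i(0)| = 1/3 + 1/3 + 2/3 = 4/3, however large T is.
```

The partial sums increase toward, but never exceed, $4/3$, which is exactly the bounded-variation property the lemma supplies to Theorem A.2.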
REFERENCES

1. Luca Anderlini and Antonella Ianni, Path dependence and learning from neighbors, Games Econ. Behav. 13 (1996), 141–177.
2. James Bergin and Barton L. Lipman, Evolution with state-dependent mutations, Econometrica 64 (1996), 943–956.
3. Ken Binmore and Larry Samuelson, Muddling through: Noisy equilibrium selection, J. Econ. Theory 74 (1997), 235–265.
4. Lawrence Blume, How noise matters, mimeo, Cornell University, 1995.
5. Bruno Broseta, ``Estimation of a Game-Theoretic Model of Learning: An Autoregressive Conditional Heteroskedasticity Approach,'' Discussion Paper 93-35, UCSD, 1993.
6. Bruno Broseta, ``Strategic Uncertainty and Learning in Coordination Games,'' Discussion Paper 93-34, UCSD, 1993.
7. Vincent P. Crawford, Adaptive dynamics in coordination games, Econometrica 63 (1995), 103–144.
8. Vincent P. Crawford and Bruno Broseta, What price coordination? Auctioning the right to play as a form of preplay communication, mimeo, UCSD, 1994.
9. Richard Durrett, ``Probability: Theory and Examples,'' Wadsworth and Brooks/Cole, Pacific Grove, 1991.
10. Glenn Ellison, Learning, local interaction, and coordination, Econometrica 61 (1993), 1047–1072.
11. Glenn Ellison, Basins of attraction and long run equilibria, mimeo, Harvard University, 1994.
12. Drew Fudenberg and David Kreps, Learning mixed equilibria, Games Econ. Behav. 5 (1993), 320–367.
13. Dean L. Isaacson and Richard W. Madsen, ``Markov Chains, Theory and Applications,'' Wiley, New York, 1976.
14. Michihiro Kandori, George Mailath, and Rafael Rob, Learning, mutation, and long run equilibria in games, Econometrica 61 (1993), 29–56.
15. Michihiro Kandori and Rafael Rob, Evolution of equilibria in the long run: A general theory and applications, J. Econ. Theory 65 (1995), 383–415.
16. Yong-Gwan Kim and Joel Sobel, An evolutionary approach to pre-play communication, Econometrica 63 (1995), 1181–1194.
17. Ramon Marimon and Ellen McGrattan, Adaptive learning in games, in ``Learning and Rationality in Economics'' (A. Kirman and M. Salmon, Eds.), Blackwell, Oxford/New York, 1995.
18. Richard D. McKelvey and Thomas R. Palfrey, Quantal response equilibria for normal form games, Games Econ. Behav. 10 (1995), 6–38.
19. Dov Monderer and Lloyd S. Shapley, Potential games, Games Econ. Behav. 14 (1996), 124–143.
20. Georg Nöldeke and Larry Samuelson, The evolutionary foundations of backwards and forwards induction, Games Econ. Behav. 5 (1993), 425–454.
21. Georg Nöldeke and Larry Samuelson, A dynamic model of equilibrium selection in signalling markets, J. Econ. Theory 73 (1997), 118–156.
22. Jack Robles, Evolution and long run equilibria in coordination games with summary statistic payoff technologies, J. Econ. Theory 75 (1997), 180–193.
23. Larry Samuelson, Stochastic stability in games with alternative best replies, J. Econ. Theory 64 (1994), 35–65.
24. William H. Sandholm and Ady Pauzner, Noisy evolution with population growth yields history dependence, mimeo, Northwestern University, 1996.
25. J. Van Huyck, R. Battalio, and R. Beil, Tacit coordination games, strategic uncertainty and coordination failures, Amer. Econ. Rev. 80 (1990), 234–248.
26. J. Van Huyck, R. Battalio, and R. Beil, Strategic uncertainty, equilibrium selection principles and coordination failure in average opinion games, Quart. J. Econ. 106 (1991), 885–910.
27. J. Van Huyck, R. Battalio, and R. Beil, Asset markets as an equilibrium selection mechanism: Coordination failure, game form auctions, and tacit communications, Games Econ. Behav. 5 (1993), 485–504.
28. Peyton Young, The evolution of conventions, Econometrica 61 (1993), 57–84.