Asymptotic properties of equilibrium forecasts in Bayesian learning models

Sudhir A. Shah

Department of Economics, 1200 Crosley Tower, University of Cincinnati, Cincinnati, OH 45221-0371, USA

Journal of Mathematical Economics 25 (1996) 335-344

Submitted 1994; accepted June 1995

Abstract

In an earlier paper [Shah, Journal of Mathematical Economics, 1995, 24(5), 461-495], we studied Bayesian learning in an intertemporal, stochastic setting. The results stated in the present paper are significant generalizations and simplifications of the earlier results. We study a sequence of games in which the stochastic kernel that links the state variable is unknown. Successive stage games are played by successive generations of identical players. Each generation makes a rational forecast of the state, given their belief about the true kernel and the equilibrium implemented by the players. Beliefs are updated by Bayes' rule after observing the actual state. Our present results concern the dynamic behavior of the forecasts process. We show there is partial learning in the sense of being able to (asymptotically) predict the future as well as an omniscient modeler who knows the true transition structure and understands the stochastic evolution of the state and players' actions. Moreover, we characterize the limiting behavior of the forecasts process in terms of the set of full information rational expectations forecasts.

JEL classification: C73; D82; D83

Keywords: Bayesian learning; Rational expectations; Stochastic games

0304-4068/96/$15.00 © 1996 Elsevier Science S.A. All rights reserved. SSDI 0304-4068(95)00729-6

1. Introduction

In Shah (1995), we studied dynamic learning models in a stochastic environment. For analytical reasons, including the need to prove various existence results, the general model was factored into three classes of models with different specifications regarding the primitives of the general model. A variety of approximation techniques were used to characterize the limit and asymptotic behavior of the three classes of learning models. While it was clearly desirable to study the model in its fully general unified form, our constructions of the fixed point arguments did not allow this.

In this paper, we employ very elementary constructions to significantly sharpen and generalize our earlier results. In particular, we do away with the complicated approximation arguments and replace correlated equilibrium with Nash equilibrium as our solution concept. This paper is devoted to stating these improved results. The reader should consult Shah (1995) for precise statements and proofs of the earlier results.

This paper is organized as follows. In the remainder of this section, we describe the conceptual model that underlies our analysis. In subsequent sections, we specify the formal model studied in this paper and proceed to state generalized and sharpened versions of the results stated in the earlier paper. Section 2 specifies the formal model. Section 3 establishes the existence of rational expectations equilibria. This section represents a substantive departure from Shah (1995) and contains the main contributions of the present paper. Subsequent sections merely confirm that all the results proved in Shah (1995) continue to hold (indeed, are improved substantially) in the context of the results reported in Section 3. In Section 4, we construct the probability space used for our analysis. Section 5 contains the analysis of the asymptotic properties of the forecasts process. A characterization of the limiting behavior of this process is given in Section 6. Proofs of results are collected in the appendix. Details of omitted proofs are available in Shah (1995), or may be obtained from the author.

In our model, $N$ is the set of players, $X$ is the state space, $M$ is the set of probability measures on $X$, $A_i$ is the action space for player $i$ (with $A = \prod_{i \in N} A_i$), $\mathcal{P}$ is a set of kernels $P: X \times A \to M$, and $M(\mathcal{P})$ is the set of probability measures on $\mathcal{P}$. The mapping $\Phi_i: X \Rightarrow A_i$ describes the set of feasible actions for player $i$. The players play a stage game in a sequence of periods. The exogenous uncertainty in our model stems from the fact that the kernel $P \in \mathcal{P}$ that drives the state variable is unknown.

The stage game played in period $t$ is as follows. All players know the past realizations of the state variable. Given this information, they enter period $t$ with a homogeneous belief $\lambda_t \in M(\mathcal{P})$ regarding the true kernel. The state $x_{t-1}$ determines the set of feasible actions in period $t$ as $\Phi_i(x_{t-1})$. The one-period payoff function of player $i$ is $u^i: X \times M \times A \to \mathbb{R}$; the ex ante payoff in period $t$, which guides the action chosen by player $i$, is $u^i(x_{t-1}, m_t, a_t)$, where $a_t$ is the profile of actions chosen in period $t$ and $m_t \in M$ is the forecast of $x_t$; note that $x_t$ is unknown when actions are chosen. Given the belief $\lambda_t$ and action profile $a_t$, the expected distribution of $x_t$ is $\int_{\mathcal{P}} \lambda_t(dP)\, P(x_{t-1}, a_t)$. The actions $a_t$ will depend on the distribution of $x_t$, which, in turn, is a function of $a_t$. This circularity is resolved by forecasting the distribution $m_t$.


For the forecast $m \in M$, we have the stage game $\{N, (\Phi_i(x_{t-1}))_{i \in N}, (u^i(x_{t-1}, m, \cdot))_{i \in N}\}$. Given $m$, the players implement a Nash equilibrium profile of actions $a_t(m)$. Consequently, the expected distribution of $x_t$ is $\int_{\mathcal{P}} \lambda_t(dP)\, P(x_{t-1}, a_t(m))$. It is natural to require that the forecast matches the expected distribution of $x_t$, i.e. $m_t = \int_{\mathcal{P}} \lambda_t(dP)\, P(x_{t-1}, a_t(m_t))$. Given such a forecast $m_t$, the players implement an equilibrium $a_t(m_t)$. The true state $x_t$ is now observed. Given the initial belief $\lambda_t$ and observation $x_t$, $\lambda_t$ is updated in a Bayesian fashion to $\lambda_{t+1}$, which is the belief for period $t + 1$.

Imagine a modeler who knows the true kernel $P$ and understands how players, without this knowledge, make rational forecasts and choose equilibria in response to these forecasts. If the initial state in period $t$ is $x_{t-1} \in X$ and the chosen equilibrium action profile is $a_t \in A$, then the modeler would forecast $P(x_{t-1}, a_t) \in M$ as the distribution of $x_t$. Our first result (Theorem 5.4) establishes that, as time passes, the forecast $m_t$ of the state in period $t$, made by players who do not know the true $P$, asymptotically approaches the forecast made by our omniscient modeler. In other words, given the probability space that models the ignorance of the players, our model generates learning in the sense that players' forecasting abilities become asymptotically perfect.

An alternative notion of learning is to seek a connection between the rational forecasts made by ignorant players, as in our model, and those made by players who know the true $P$. More precisely: as time passes, do ignorant players 'learn' to make rational forecasts that are close to those they would have made had they known the true $P$? Our second principal result (Theorem 6.1) shows that the rational forecasts process generated by ignorant players converges to the set of rational forecasts that could be made by fully informed players.
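As an illustration only, the forecasting and updating steps of the conceptual model can be sketched for a finite special case. The sketch below assumes two states, a finite list of candidate kernels, and no dependence of the kernels on actions, none of which comes from the paper; the names `forecast` and `bayes_update` are hypothetical.

```python
from typing import List

# kernel[x][y] = probability of next state y given current state x
# (simplifying assumption: kernels do not depend on the action profile)
Kernel = List[List[float]]


def forecast(belief: List[float], kernels: List[Kernel], x: int) -> List[float]:
    """Belief-weighted mixture of the candidate kernels' predictions at state x."""
    n_states = len(kernels[0][x])
    return [sum(w * k[x][y] for w, k in zip(belief, kernels)) for y in range(n_states)]


def bayes_update(belief: List[float], kernels: List[Kernel],
                 x_prev: int, x_next: int) -> List[float]:
    """Posterior over candidate kernels after observing the transition x_prev -> x_next."""
    weights = [w * k[x_prev][x_next] for w, k in zip(belief, kernels)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observed transition has zero probability under the belief")
    return [w / total for w in weights]


# Two hypothetical candidate kernels and a uniform prior belief.
kernels = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.5, 0.5]],
]
belief = [0.5, 0.5]

m = forecast(belief, kernels, x=0)                      # forecast of the next state
posterior = bayes_update(belief, kernels, x_prev=0, x_next=0)
```

Here `m` plays the role of $m_t$ and `posterior` the role of $\lambda_{t+1}$; the dependence of the kernel on the equilibrium action profile, which is central to the paper, is deliberately left out of this sketch.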

2. Conventions and formal statement of model

While the proofs we report are adequate in providing the ideas and constructions underlying our results, we omit many of the mathematical details in the interest of brevity. The following conventions and references should enable the reader to easily fill in the gaps.

$\mathbb{N}$ is the set of natural numbers. Given topological spaces $S$ and $T$, the following conventions will be used below. $f: S \Rightarrow T$ denotes a mapping with domain $S$ and values in $2^T$, $\operatorname{Gr} f$ is its graph, and u.s.c. refers to upper semi-continuity of such a mapping (Berge, 1963, VI.1). $\mathcal{B}_S$ is the Borel $\sigma$-algebra on $S$. $M(S)$ is the set of probability measures on $S$ with the weak* topology; see Chapter II in Parthasarathy (1967) for properties. $T^S$ is the set of continuous functions from $S$ to $T$ with the compact-open topology; if $S$ is compact and $T$ metric, then $T^S$ is metrized by the uniform convergence metric. Given $f \in T^S$, we define the evaluation mapping $\varphi: T^S \times S \to T$ by $\varphi(f, s) = f(s)$. The reader is referred to Kuratowski (1966, 1968) for the properties of $T^S$ and $\varphi$. Product spaces will be equipped with product topologies and product $\sigma$-algebras. Given $P \in \mathcal{P}$, we define $\phi: \mathcal{P} \times X \times A \to M$ by $\phi(P, x, a) = P(x, a)$.

The following assumptions state the mathematical scope of our model.

Axiom 2.1. (a) $N$ is a non-empty finite set; (b) $X$ is a non-empty compact subset of a metric space; (c) $A_i$ is a non-empty, compact and convex subset of a metrizable locally convex linear topological space, for all $i \in N$; (d) for every $i \in N$, $u^i$ is continuous, and $u^i(x, m, a_{-i}, \cdot): A_i \to \mathbb{R}$ is quasi-concave for all $x \in X$, $m \in M$ and $a_{-i} \in A_{-i}$; (e) for every $i \in N$, $\Phi_i: X \Rightarrow A_i$ is continuous, with non-empty, compact and convex values; and (f) $\mathcal{P}$ is a compact subset of $M^{X \times A}$.

3. Existence of rational expectations forecasts

Consider a period with initial state $x \in X$ and initial belief $\lambda \in M(\mathcal{P})$. Let $V^i: X \times M \times A \to \mathbb{R}$ be defined by $V^i(x, m, a) = \max_{b \in \Phi_i(x)} u^i(x, m, a_{-i}, b)$. The best reply mapping $B_i: X \times M \times A \Rightarrow A_i$ is defined by

$$B_i(x, m, a) = \{ b \in \Phi_i(x) \mid u^i(x, m, a_{-i}, b) = V^i(x, m, a) \}.$$

Given Axiom 2.1, a routine application of theorems VI.3.1 and VI.3.2 in Berge (1963) confirms that the best reply mappings are u.s.c., with non-empty compact values. The mapping $B: X \times M \times A \Rightarrow A$, defined by $B(x, m, a) = \prod_{i \in N} B_i(x, m, a)$, inherits these properties. It is easily confirmed that all the $B_i$, and therefore also $B$, have convex values.

Definition 3.1. A pair of measurable functions $h: M(\mathcal{P}) \times X \to M$ and $c: M(\mathcal{P}) \times X \to A$ is called a partial information equilibrium (PIE) if for all $\lambda \in M(\mathcal{P})$ and all $x \in X$, they satisfy

$$h(\lambda, x) = \int_{\mathcal{P}} \lambda(dP)\, \phi(P, x, c(\lambda, x)) \quad \text{and} \quad c(\lambda, x) \in B(x, h(\lambda, x), c(\lambda, x)).$$

If $h$ and $c$ satisfy the above conditions, then $h(\lambda, x)$ is called the consistent forecast given $\lambda$ and $x$, and $c(\lambda, x)$ is called the equilibrium action profile. Given the belief $\lambda$ about the true transition kernel, the initial state $x$, and the forecast $h(\lambda, x)$, the profile of actions $c(\lambda, x)$ is a Nash equilibrium. Moreover, given the equilibrium actions $c(\lambda, x)$, the forecast $h(\lambda, x)$ is consistent with the initial conditions $(\lambda, x)$.

In order to establish the existence of a PIE, we define the map $\Gamma: M(\mathcal{P}) \times X \times M \times A \Rightarrow M \times A$ by

$$\Gamma(\lambda, x, m, a) = \left\{ \int_{\mathcal{P}} \lambda(dP)\, \phi(P, x, a) \right\} \times B(x, m, a).$$

$M$ and $M(\mathcal{P})$ are compact metric (Parthasarathy, 1967, theorem II.6.4). Also, $\phi$ is continuous (Kuratowski, 1968, theorem 44.II.1). Using these facts, it can be shown that $(\lambda, x, a) \mapsto \int_{\mathcal{P}} \lambda(dP)\, \phi(P, x, a)$ is continuous. Therefore, $\Gamma$ is u.s.c., with non-empty, compact and convex values. It follows that, for every $(\lambda, x) \in M(\mathcal{P}) \times X$, the map $\Gamma(\lambda, x): M \times A \Rightarrow M \times A$ has the same properties. Noting that the involved spaces are locally convex, there exists $(m, a) \in M \times A$ such that $(m, a) \in \Gamma(\lambda, x, m, a)$ (Browder, 1968, theorem 4). It follows that the map $\Xi: M(\mathcal{P}) \times X \Rightarrow M \times A$, defined by

$$\Xi(\lambda, x) = \{ (m, a) \in M \times A \mid (m, a) \in \Gamma(\lambda, x, m, a) \},$$

has non-empty values.

Note that $\Xi(\lambda, x) = \{ (m, a) \in M \times A \mid (0, 0) \in \hat{\Gamma}(\lambda, x, m, a) \}$, where $\hat{\Gamma}: (\lambda, x, m, a) \mapsto \Gamma(\lambda, x, m, a) - (m, a)$. It is easy to establish that $\operatorname{Gr} \Xi = \pi^{-1}(\operatorname{Gr} \hat{\Gamma})$, where $\pi: (\lambda, x, m, a) \mapsto (\lambda, x, m, a, 0, 0)$. Noting that the spaces involved are linear topological spaces, and $\Gamma$ is u.s.c. with compact values, it follows that $\operatorname{Gr} \hat{\Gamma}$ is closed. Therefore, $\operatorname{Gr} \Xi$ is closed. As $M \times A$ is compact, $\Xi$ is u.s.c. with compact values.

Suppose $\Xi$ has a measurable selection $\xi$. Then, $\xi_1: M(\mathcal{P}) \times X \to M$ and $\xi_2: M(\mathcal{P}) \times X \to A$ are measurable functions that satisfy the condition:

$$(\xi_1(\lambda, x), \xi_2(\lambda, x)) \in \Gamma(\lambda, x, \xi_1(\lambda, x), \xi_2(\lambda, x)),$$

i.e. $\xi_1(\lambda, x) = \int_{\mathcal{P}} \lambda(dP)\, \phi(P, x, \xi_2(\lambda, x))$ and $\xi_2(\lambda, x) \in B(x, \xi_1(\lambda, x), \xi_2(\lambda, x))$. Setting $h = \xi_1$ and $c = \xi_2$, we have the desired PIE.

Theorem 3.2. Given Axiom 2.1, there exists a PIE.
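To make the consistency requirement concrete, the following is a minimal sketch of a consistent forecast in a toy special case. Everything specific here is an assumption made for illustration: two states, a single player whose best reply to a forecast is hypothetically `a = q` (with `q` the forecast probability of state 1), and candidate kernels that are affine in the action. The paper obtains existence from Browder's fixed point theorem and a measurable selection argument; the plain iteration below is a swapped-in technique that works only because this toy map is a contraction.

```python
def best_reply(q: float) -> float:
    """Hypothetical best reply: the player matches the forecast probability of state 1."""
    return q


def expected_prob_state1(belief, bases, slopes, x, a):
    """Belief-weighted probability of state 1 when kernel k gives bases[k][x] + slopes[k] * a."""
    return sum(w * (b[x] + s * a) for w, b, s in zip(belief, bases, slopes))


def consistent_forecast(belief, bases, slopes, x, tol=1e-12, max_iter=10_000):
    """Iterate q -> E_belief[P(x, a(q))] until the forecast reproduces itself."""
    q = 0.5
    for _ in range(max_iter):
        q_new = expected_prob_state1(belief, bases, slopes, x, best_reply(q))
        if abs(q_new - q) < tol:
            return q_new, best_reply(q_new)
        q = q_new
    raise RuntimeError("iteration did not converge; the toy map may not be a contraction")


# Assumed toy data: two candidate kernels, uniform belief, initial state 0.
belief = [0.5, 0.5]
bases = [[0.1, 0.3], [0.2, 0.4]]   # bases[k][x]
slopes = [0.5, 0.3]                # slopes[k]

q_star, a_star = consistent_forecast(belief, bases, slopes, x=0)
```

In this toy case the fixed point can also be solved by hand: the belief-averaged base is 0.15 and the belief-averaged slope is 0.4, so the consistent forecast satisfies $q = 0.15 + 0.4q$, i.e. $q = 0.25$.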

Some other constructions that will be used below are as follows. The map $H: M(\mathcal{P}) \times X \Rightarrow M$ is defined by $H(\lambda, x) = \pi_1 \circ \Xi(\lambda, x)$. The equilibrium mapping is $\Theta: X \times M \Rightarrow A$ defined by $\Theta(x, m) = \{ a \in A \mid a \in B(x, m, a) \}$. Using $\Theta$, we can define $G: M(\mathcal{P}) \times X \times M \Rightarrow M$ by $G(\lambda, x, m) = \Gamma_1(\lambda, x, m, \Theta(x, m))$. It is straightforward to check that $H$, $\Theta$ and $G$ are u.s.c. maps with compact values. Furthermore, it is easy to establish the following facts: for all $(\lambda, x) \in M(\mathcal{P}) \times X$, (a) $c(\lambda, x) \in \Theta(x, h(\lambda, x))$, (b) $h(\lambda, x) \in G(\lambda, x, h(\lambda, x))$, (c) $H(\lambda, x) = \{ m \in M \mid m \in G(\lambda, x, m) \}$, and (d) $h(\lambda, x) \in H(\lambda, x)$.

4. Construction of probability space

We proceed to construct the probability space that will serve as the background for all subsequent probabilistic statements. Let $(X^t, \mathcal{B}_X^t) = (\prod_{i=1}^{t} X_i, \sigma(\prod_{i=1}^{t} \mathcal{B}_{X_i}))$, where $(X_i, \mathcal{B}_{X_i})$ is a copy of $(X, \mathcal{B}_X)$. Let $(\Omega_t, \mathcal{F}_t) = (\mathcal{P} \times X^t, \sigma(\mathcal{B}_{\mathcal{P}} \times \mathcal{B}_X^t))$ and $(\Omega, \mathcal{F}) = (\mathcal{P} \times X^\infty, \sigma(\mathcal{B}_{\mathcal{P}} \times \mathcal{B}_X^\infty))$.

$x_0$ will be the initial state and $\lambda_1 \in M(\mathcal{P})$ the initial belief in the first period. Let $h$ and $c$ constitute a PIE. Define the transition kernel $K_1: \mathcal{P} \to M$ by $K_1(P) = P(x_0, c(\lambda_1, x_0))$ for $P \in \mathcal{P}$. $Q_1 \in M(\Omega_1)$ is generated by $Q_1(E_1 \times E_2) = \int_{E_1} \lambda_1(dP)\, K_1(P, E_2)$, with $E_1 \times E_2 \in \mathcal{B}_{\mathcal{P}} \times \mathcal{B}_X$. $K_1(P)$ is the probability induced on $X$ in the first period if the initial state is $x_0$, $P$ is the transition kernel, and equilibrium $c(\lambda_1, x_0)$ is implemented. $Q_1$ is the probability measure induced on $\mathcal{P} \times X$ by the belief $\lambda_1$ and equilibrium $c(\lambda_1, x_0)$. Let $P: \Omega \to \mathcal{P}$ and $x^t: \Omega \to X^t$ be the canonical projection mappings. Theorem V.8.1 in Parthasarathy (1967) implies:

Lemma 4.1. Given Axiom 2.1 and $(\Omega_1, \mathcal{F}_1, Q_1)$, there exists a regular conditional distribution $\lambda_1: X \to M(\mathcal{P})$, of $P$ given $x_1$.

Suppose we are given: (a) a kernel $K_t: \mathcal{P} \to M(X^t)$; (b) a measure $Q_t \in M(\mathcal{P} \times X^t)$; and (c) for the probability space $(\Omega_t, \mathcal{F}_t, Q_t)$, the kernel $\lambda_t: X^t \to M(\mathcal{P})$, which is a regular conditional distribution of $P$, given $x^t$. The kernel $K_{t+1}$ is defined by

$$K_{t+1}(P, E \times F) = \int_E K_t(P, dx^t)\, P\big(x_t(x^t), c(\lambda_t(x^t), x_t(x^t))\big)(F), \quad E \times F \in \mathcal{B}_X^t \times \mathcal{B}_X. \tag{4.2}$$

The measure $Q_{t+1} \in M(\Omega_{t+1})$ is given by

$$Q_{t+1}(E \times F) = \int_E \lambda_1(dP)\, K_{t+1}(P, F), \quad E \times F \in \mathcal{B}_{\mathcal{P}} \times \mathcal{B}_X^{t+1}. \tag{4.3}$$

Given $(\Omega_{t+1}, \mathcal{F}_{t+1}, Q_{t+1})$, the kernel $\lambda_{t+1}: X^{t+1} \to M(\mathcal{P})$ is a regular conditional distribution of $P$, given $x^{t+1}$. Such a function exists by an argument analogous to the proof of Lemma 4.1. Applying theorem V.3.2 in Parthasarathy (1967), we have

Theorem 4.4. Given Axiom 2.1, there exist unique probability measures $Q \in M(\Omega)$, and $K(P) \in M(X^\infty)$ for $P \in \mathcal{P}$, such that $Q$ restricted to $(\Omega_t, \mathcal{F}_t)$ coincides with $Q_t$, and $K(P)$ restricted to $(X^t, \mathcal{B}_X^t)$ coincides with $K_t(P)$.

5. Asymptotic behavior of forecasts process

The analysis of this section leads to Theorem 5.4. This result asserts that the difference between the forecast distribution of the state in period $t + 1$ and the true distribution of the state in period $t + 1$ vanishes over time.

Given $x^\infty \in X^\infty$, let $x^t = (x_1, \ldots, x_t)$ and $x^{\infty \setminus t} = (x_{t+1}, x_{t+2}, \ldots)$. The beliefs process $(\lambda_t)$ is defined by the relation $\lambda_{t+1} = \lambda_t(x^t)$; $\lambda_{t+1}$ is the belief regarding the true $P$, conditional on observing $x^t$. We now define the random sequence of forecasts and the associated sequence of Nash equilibria by $m_{t+1} = h(\lambda_{t+1}, x_t)$ and $a_{t+1} = c(\lambda_{t+1}, x_t)$, respectively. $m_{t+1}$ is the forecast announced in period $t + 1$, given the updated belief $\lambda_{t+1}$ and initial state $x_t$; $a_{t+1}$ is the implemented equilibrium that makes $m_{t+1}$ the rational expectations forecast.

Given $(\Omega, \mathcal{F}, Q)$, let the kernel $Q^t: X^t \to M(\Omega)$ be a regular conditional probability on $(\Omega, \mathcal{F})$ given $x^t$. It follows from regularity that $Q^t(x^t; \mathcal{P} \times \{x^t\} \times X^\infty) = 1$, $Q$-a.s. Define the kernel $\hat{Q}^t: X^t \to M(X^\infty)$ by $\hat{Q}^t(x^t, E) = Q^t(x^t, \mathcal{P} \times X^t \times E)$, with $E \in \mathcal{B}_X^\infty$, and the kernel $\bar{Q}^t: X^t \to M$ by $\bar{Q}^t(x^t, E) = \hat{Q}^t(x^t, E \times X^\infty)$, with $E \in \mathcal{B}_X$. It follows from the definitions and the regularity of $Q^t$ that $\hat{Q}^t$ and $\bar{Q}^t$ are regular conditional distributions of $x^{\infty \setminus t}$ and $x_{t+1}$, respectively, given $x^t$.

If $R$ is a probability measure on $(X^\infty, \mathcal{B}_X^\infty)$, then $\hat{R}^t$ and $\bar{R}^t$ are defined analogously to $\hat{Q}^t$ and $\bar{Q}^t$, respectively. $X^\infty$ is compact metric as $X$ is compact metric, and therefore $M(X^\infty)$ is compact metric in the $\mathcal{C}(X^\infty)$ topology (Parthasarathy, 1967, theorem II.6.4). Let $\rho$ be the distance function that metrizes this topology. We first note a useful corollary of the theorem in Blackwell and Dubins (1962).

Lemma 5.1. If $R \in M(X^\infty)$ is such that $R \ll \hat{Q}$, then there exist regular versions of $\hat{R}^t$, $t \in \mathbb{N}$, such that $\lim_{t \uparrow \infty} \rho(\hat{Q}^t(x^t), \hat{R}^t(x^t)) = 0$, $R$-a.s.

$\mathcal{P}$ is separable metric, and therefore second-countable; i.e. we can write its base as a collection $\{ \mathcal{O}_i \mid i \in \mathbb{N} \}$. Let $\mathfrak{B}_k = \{ B_k^1, B_k^2, \ldots, B_k^{J_k} \}$ be the partition generated by $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_k$. Define $Z_k: \Omega \to \mathbb{R}$ by $Z_k(\omega) = \sum_{j=1}^{J_k} 1_{B_k^j} \circ P(\omega)\, \operatorname{diam} B_k^j$ and $\sigma_\delta: \Omega \to \mathbb{N}$ by $\sigma_\delta(\omega) = \min \{ k \in \mathbb{N} \mid Z_k(\omega) < \delta \}$. We note that $\lim_{k \uparrow \infty} Z_k(\omega) = 0$ for all $\omega \in \Omega$.

Information regarding the partition $\mathfrak{B}_k$ can be represented by the map $j_k: \Omega \to \{1, \ldots, J_k\}$ defined by $j_k(\omega) = \sum_{j=1}^{J_k} j\, 1_{B_k^j} \circ P(\omega)$. Let the kernel $L_k^t: \{1, \ldots, J_k\} \times X^t \to M(\Omega)$ be a regular conditional probability on $(\Omega, \mathcal{F})$, given $j_k$ and $x^t$. It follows that $L_k^t(j_k, x^t; B_k^{j_k} \times \{x^t\} \times X^\infty) = 1$, $Q$-a.s.

Define the kernel $\hat{L}_k^t: \{1, \ldots, J_k\} \times X^t \to M(X^\infty)$ by $\hat{L}_k^t(j_k, x^t; E) = L_k^t(j_k, x^t; \mathcal{P} \times X^t \times E)$, with $E \in \mathcal{B}_X^\infty$, and the kernel $\bar{L}_k^t: \{1, \ldots, J_k\} \times X^t \to M$ by $\bar{L}_k^t(j_k, x^t; E) = \hat{L}_k^t(j_k, x^t; E \times X^\infty)$, with $E \in \mathcal{B}_X$. $\hat{L}_k^t$ and $\bar{L}_k^t$ are regular conditional distributions of $x^{\infty \setminus t}$ and $x_{t+1}$, respectively, given $j_k$ and $x^t$.

Define the kernel $\hat{\lambda}_k^t: \{1, \ldots, J_k\} \times X^t \to M(\Omega_{t+1})$ by $\hat{\lambda}_k^t(j_k, x^t; E) = L_k^t(j_k, x^t; E \times X^\infty)$, with $E \in \mathcal{F}_{t+1}$, and the kernel $\lambda_k^t: \{1, \ldots, J_k\} \times X^t \to M(\mathcal{P})$ by $\lambda_k^t(j_k, x^t; E) = \hat{\lambda}_k^t(j_k, x^t; E \times X^{t+1})$, with $E \in \mathcal{B}_{\mathcal{P}}$. $\hat{\lambda}_k^t$ and $\lambda_k^t$ are regular conditional distributions of $(P, x^{t+1})$ and $P$, respectively, given $j_k$ and $x^t$. Using the constructions of Section 4, the following facts follow quite easily.

Lemma 5.2. Given $(\Omega, \mathcal{F}, Q)$, $Q$-a.s., (a) $\bar{L}_k^t(j_k, x^t) = \int_{\mathcal{P}} \lambda_k^t(j_k, x^t)(dP)\, \phi(P, x_t, a_{t+1})$, (b) $L_k^0(j_k, x^0) \ll Q$ and $\hat{L}_k^0(j_k, x^0) \ll \hat{Q}$, and (c) $\bar{Q}^t = h(\lambda_t(x^t), x_t)$.

Employing Lemmas 5.1 and 5.2, it follows that

Lemma 5.3. (a) Given $k \in \mathbb{N}$, $\lim_{t \uparrow \infty} \rho(\hat{Q}^t(x^t), \hat{L}_k^t(j_k, x^t)) = 0$, $Q$-a.s. (b) Given a random variable $\tau: \Omega \to \mathbb{N}$, $\lim_{t \uparrow \infty} \rho(\bar{L}_\tau^t(j_\tau, x^t), m_{t+1}) = 0$, $Q$-a.s.

If nature picks $\omega$, $x_{t-1}(\omega)$ is the initial state in period $t$, equilibrium $a_t(\omega)$ is implemented, and $P(\omega)$ is known, then $P(\omega)(x_{t-1}(\omega), a_t(\omega))$ is the true distribution of $x_t$. Given this interpretation, the following result asserts that the forecast distribution of $x_t$ converges almost surely to the true distribution.

Theorem 5.4. $\lim_{t \uparrow \infty} \rho(m_{t+1}, P(x_t, a_{t+1})) = 0$, $Q$-a.s.
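The convergence asserted in Theorem 5.4 can be illustrated numerically in a finite toy setting. The sketch below is not the paper's construction: it assumes two states, two hypothetical candidate kernels that do not depend on actions, a uniform prior, and it uses the total variation distance as a stand-in for the metric $\rho$. The function name `forecast_gaps` is hypothetical.

```python
import random


def forecast_gaps(true_kernel, kernels, belief, x0, steps, seed=0):
    """Simulate the state under true_kernel and record, period by period,
    the total variation gap between the Bayesian forecast and the true distribution."""
    rng = random.Random(seed)
    x = x0
    gaps = []
    for _ in range(steps):
        # Forecast: belief-weighted mixture of the candidate kernels at the current state.
        m = [sum(w * k[x][y] for w, k in zip(belief, kernels)) for y in range(2)]
        truth = true_kernel[x]
        gaps.append(0.5 * sum(abs(p - q) for p, q in zip(m, truth)))
        # Nature draws the next state from the true kernel.
        x_next = 0 if rng.random() < truth[0] else 1
        # Bayes' rule: reweight each candidate kernel by the likelihood of the transition.
        weights = [w * k[x][x_next] for w, k in zip(belief, kernels)]
        total = sum(weights)
        belief = [w / total for w in weights]
        x = x_next
    return gaps


# Assumed toy data; the first candidate is the true kernel.
kernels = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.5, 0.5]],
]
gaps = forecast_gaps(kernels[0], kernels, [0.5, 0.5], x0=0, steps=500)
```

With these well-separated candidates, the recorded gap starts at 0.2 and should shrink towards zero as observations accumulate, which is the finite-state analogue of the forecast $m_{t+1}$ approaching $P(x_t, a_{t+1})$.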

6. Limit behavior of forecasts process

It remains to see whether the sequence $(m_t)$ of forecasts converges to some identifiable subset of $M$. We show that it converges to the set of stationary forecasts consistent with the correct belief $\delta_P$.

Let $\mathcal{K}$ be the class of non-empty, closed subsets of $M$. Since $M$ is compact, every element of $\mathcal{K}$ is compact. Define the function $\hat{\rho}: M \times \mathcal{K} \to \mathbb{R}$ by the formula $\hat{\rho}(m, E) = \min_{x \in E} \rho(m, x)$. Since $E$ is compact and $\rho$ is continuous, this function is well defined.

Given $\omega \in \Omega$, consider the sequence $x^\infty(\omega)$. Let $X(\omega)$ denote the set of cluster points of this sequence. As $X$ is compact, $X(\omega)$ is non-empty. Furthermore, it is closed, and since $X$ is compact, $X(\omega)$ is compact. Given that $H$ is u.s.c., this means $H(\delta_P, X(\omega))$ is compact. We are now in a position to characterize the asymptotic behavior of $(m_t)$.

Theorem 6.1. $\lim_{t \uparrow \infty} \hat{\rho}\big(m_{t+1}(\omega), H(\delta_{P(\omega)}, X(\omega))\big) = 0$ for $Q$-a.e. $\omega \in \Omega$.
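For intuition about the point-to-set distance used in Theorem 6.1, here is a small sketch for distributions on a finite state space. It assumes a finite target set and uses total variation distance in place of $\rho$; both are simplifications for illustration, and the names `tv_distance` and `dist_to_set` are hypothetical.

```python
def tv_distance(p, q):
    """Total variation distance between two distributions on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def dist_to_set(m, candidates):
    """Smallest distance from m to any element of a non-empty finite set of distributions."""
    if not candidates:
        raise ValueError("the target set must be non-empty")
    return min(tv_distance(m, e) for e in candidates)


# Assumed toy data: a forecast and a two-element target set.
target_set = [[1.0, 0.0], [0.6, 0.4]]
d = dist_to_set([0.5, 0.5], target_set)
d_member = dist_to_set([0.6, 0.4], target_set)
```

The minimum is attained because the set is finite here; in the paper, attainment follows instead from compactness of $E$ and continuity of $\rho$.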

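Appendix

Proof of Theorem 3.2. We have to show the existence of $\xi$. As $M(\mathcal{P}) \times X$ is metric, $M \times A$ is compact metric, and $\Xi$ is u.s.c., it follows that $\Xi^{-}(E) = \{ (\lambda, x) \in M(\mathcal{P}) \times X \mid \Xi(\lambda, x) \cap E \neq \emptyset \}$ is measurable for every closed $E \subset M \times A$. By theorem 3.1 in Himmelberg (1975), $\Xi$ is weakly measurable. As $M \times A$ is separable metric and $\Xi$ has complete values, the result follows from theorem 5.1 in Himmelberg (1975). □

Proof of Lemma 5.2. (a) It follows from the constructions and definitions in Sections 4 and 5 that $Q_{t+1} = \int_{\Omega_{t+1}} Q_{t+1}(d\omega)\, \mu(j_k(\omega), x^t(\omega))$, where the kernel $\mu: \{1, \ldots, J_k\} \times X^t \to M(\Omega_{t+1})$ is defined by:

$$\mu(j_k, x^t; E \times F \times G) = \int_{E \times F \times X} \hat{\lambda}_k^t(j_k, x^t)(d\omega')\, \phi\big(P(\omega'), x_t(\omega'), a_{t+1}(\omega')\big)(G),$$

for $E \times F \times G \in \mathcal{B}_{\mathcal{P}} \times \mathcal{B}_X^t \times \mathcal{B}_X$. Theorem V.8.1 in Parthasarathy (1967) implies that $\mu(j_k, x^t) = \hat{\lambda}_k^t(j_k, x^t)$, $Q$-a.s. As $\bar{L}_k^t(j_k, x^t; \cdot) = \hat{\lambda}_k^t(j_k, x^t; \Omega_t \times \cdot)$, and $\hat{\lambda}_k^t$ is regular,

$$\bar{L}_k^t(j_k, x^t; \cdot) = \int_{\mathcal{P} \times \{x^t\} \times X} \hat{\lambda}_k^t(j_k, x^t)(d\omega')\, \phi\big(P(\omega'), x_t(\omega'), a_{t+1}(\omega')\big)(\cdot),$$

$Q$-a.s. Noting the set over which the integration is performed, we have $x^t(\omega') = x^t(\omega)$. Therefore, $x_t(\omega') = x_t(\omega)$. As $a_{t+1}$ depends on $\omega'$ only via $x^t(\omega')$, we have $a_{t+1}(\omega') = a_{t+1}(\omega)$. Therefore,

$$\bar{L}_k^t(j_k, x^t; \cdot) = \int_{\mathcal{P} \times \{x^t\} \times X} \hat{\lambda}_k^t(j_k, x^t)(d\omega')\, \phi\big(P(\omega'), x_t, a_{t+1}\big)(\cdot) = \int_{\mathcal{P}} \lambda_k^t(j_k, x^t)(dP)\, \phi(P, x_t, a_{t+1})(\cdot),$$

$Q$-a.s., as required. □

Proof of Lemma 5.3. (a) Let $B_k = \{ \omega \in \Omega \mid \limsup_t \rho(\hat{L}_k^t(j_k(\omega), x^t(\omega)), \hat{Q}^t(x^t(\omega))) = 0 \}$. Pick $\omega \in \Omega$. By Lemma 5.1, there exist regular versions $\hat{L}_k^t(j_k(\omega)): X^t \to M(X^\infty)$ such that $\hat{L}_k^0(j_k(\omega), x^0)(H_k(\omega)) = 1$, where $H_k(\omega) = \{ x^\infty \in X^\infty \mid \limsup_t \rho(\hat{L}_k^t(j_k(\omega), x^t(x^\infty)), \hat{Q}^t(x^t(x^\infty))) = 0 \}$. As $B_k^{j_k(\omega)} \times H_k(\omega) \subset B_k$, we have $L_k^0(j_k(\omega), x^0)(B_k) = 1$. Therefore,

$$Q(B_k) = \int_\Omega Q(d\omega)\, L_k^0(j_k(\omega), x^0)(B_k).$$

Using the above facts, this equals $\int_\Omega Q(d\omega) = 1$.

(b) Let $B_\tau = \{ \omega \in \Omega \mid \limsup_t \rho(\bar{L}_{\tau(\omega)}^t(j_{\tau(\omega)}(\omega), x^t(\omega)), \bar{Q}^t(x^t(\omega))) = 0 \}$. Clearly, $\bigcap_{k=1}^{\infty} B_k \subset B_\tau$. Therefore, $1 \geq Q(B_\tau) \geq Q(\bigcap_{k=1}^{\infty} B_k) = 1$. The claim now follows from Lemma 5.2(c), which identifies $\bar{Q}^t$ with $m_{t+1}$. □

Proof of Theorem 5.4. Fix $\epsilon > 0$ and consider the open ball $B_\rho(P(x_t, a_{t+1}), \epsilon) \subset M$. As the topology on $M$ is locally convex, there exists a random open convex set $U \subset B_\rho(P(x_t, a_{t+1}), \epsilon)$ such that $P(x_t, a_{t+1}) \in U$. As $U$ is open, there exists a positive random variable $\delta: \Omega \to \mathbb{R}$ such that $B_\rho(P(x_t, a_{t+1}), \delta) \subset U$. Define the random variable $T: \Omega \to \mathbb{N}$ by $T = \sigma_\delta$. By definition, $Z_T < \delta$. This means $\operatorname{diam} B_T^{j_T} < \delta$. Therefore, if $P' \in B_T^{j_T}$, then $\operatorname{dist}(P', P) < \delta$, which implies $\rho(P'(x_t, a_{t+1}), P(x_t, a_{t+1})) < \delta$. It follows from Lemma 5.2 that $\bar{L}_T^t(j_T, x^t) \in U \subset B_\rho(P(x_t, a_{t+1}), \epsilon)$. By the triangle inequality,

$$\rho(m_{t+1}, P(x_t, a_{t+1})) \leq \rho(m_{t+1}, \bar{L}_T^t(j_T, x^t)) + \rho(\bar{L}_T^t(j_T, x^t), P(x_t, a_{t+1})).$$

By the definition of $T$, $\rho(m_{t+1}, P(x_t, a_{t+1})) \leq \rho(m_{t+1}, \bar{L}_T^t(j_T, x^t)) + \epsilon$. By Lemma 5.3(b), the first term on the right vanishes $Q$-a.s. as $t \uparrow \infty$. As $\epsilon > 0$ is arbitrary, we have the desired result. □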

References

Berge, C., 1963, Topological Spaces, including a treatment of Multi-valued Functions, Vector Spaces, and Convexity (Macmillan, New York).
Blackwell, D. and L. Dubins, 1962, Merging of opinions with increasing information, Annals of Mathematical Statistics 33, 882-886.
Browder, F.E., 1968, The fixed point theory of multi-valued mappings in topological vector spaces, Mathematische Annalen 177, 283-301.
Himmelberg, C.J., 1975, Measurable relations, Fundamenta Mathematicae 87, 53-72.
Kuratowski, K., 1966, Topology, Vol. I (Academic Press, New York).
Kuratowski, K., 1968, Topology, Vol. II (Academic Press, New York).
Parthasarathy, K., 1967, Probability Measures on Metric Spaces (Academic Press, New York).
Shah, S.A., 1995, Bayesian learning behavior and the stability of equilibrium forecasts, Journal of Mathematical Economics 24(5), 461-495.