Journal of Mathematical Economics 16 (1987) 297-313. North-Holland
BAYESIAN LEARNING AND CONVERGENCE TO RATIONAL EXPECTATIONS*
Mark FELDMAN
University of Illinois, Champaign, IL 61820, USA
Submitted November 1985, accepted June 1987
In this paper the probability distribution of equilibrium outcomes is assumed to be a continuous but unknown function of agents’ forecasts (which are probability measures). Agents start with a prior distribution on the set of mappings from forecasts into probabilities on outcomes. This induces an initial forecast. After observing the equilibrium outcome a posterior distribution is computed which induces a new forecast. The main result is that with probability one the forecasts converge to the set of fixed points of the unknown mapping. This can be interpreted as convergence to rational expectations.
1. Introduction and summary
The cornerstone of a popular version of the rational expectations hypothesis (R.E.H.) is that the beliefs of economic agents should be identical to the probability obtained by conditioning with respect to the ‘objectively correct’ parameter specification of the economy. It is often suggested that imposing rational expectations is merely the logical extension to the uncertainty case of the principle of individual optimization. The accompanying assertion is that in a stationary environment rational agents will not engage in systematic errors as would be required by an alternative stationary equilibrium concept.
While superficially convincing, the above argument is not without gaps. A fundamental matter left unresolved by the R.E.H. is the mechanism via which agents acquire the information they are credited with possessing. Appeals to the asymptotic consistency of statistical estimation in conventional settings are inadequate. The source of the difficulty is that if agents are in the midst of learning about the economy, the learning procedure itself is part of the structure of the economy. Thus, the resultant stochastic process will not be stationary, even if the exogenous random variables (tastes, technology, the weather, etc., but not prices or beliefs) are stationary. Consequently, the standard theorems on consistency of estimators cannot be invoked and there is no a priori contradiction between intertemporal von Neumann-Morgenstern utility maximization and failure to attain a rational expectations equilibrium.
This paper is an attempt to investigate the circumstances under which the limit of a learning process is rational expectations. This topic, the stability of rational expectations in the context of ongoing learning, has received much recent attention. Blume, Bray and Easley (1982) is an introduction to the existing literature.
One approach has been to assume that agents’ learning entails adopting ‘reasonable’ ad hoc procedures (such as misspecified OLS). The papers of Blume and Easley (1982), Bray (1982), and DeCanio (1979) are examples of this method. Their results suggest that convergence to rational expectations is not automatic and may depend upon the precise specification of the learning procedure and numerical values of parameters in the model.
A difficulty with the ad hoc procedures is deciding what is ‘reasonable’. Estimation procedures that appear plausible over a finite horizon may be foolish over an infinite time span. The non-convergence results in the above papers can be attributed to agents adopting procedures which are ‘unreasonable’ over an infinite horizon.
Other authors including Arrow and Green (1973), Bray and Kreps (1981), Blume and Easley (1984), Cyert and DeGroot (1974), Lewis (1981), and Townsend (1978) have investigated Bayesian learning rules.
*This paper is a revision and extension of Chapter 1 of the author’s Ph.D. dissertation at the University of California, Davis. I would like to thank my advisor, Ross Starr, for his guidance and encouragement. I am indebted to R. Driskill, D. Easley, and S. Sheffrin for helpful conversations. C. Giues, J. Robertson and the late J. Blum were helpful in resolving technical details. The comments of M. Ali Khan, A. McLennan, J. Sonstelie and an anonymous referee have improved the exposition. Of course, only I am responsible for any remaining errors.
0304-4068/87/$3.50 © 1987, Elsevier Science Publishers B.V. (North-Holland)
M. Feldman, Bayesian learning, rational expectations
A rigorous Bayesian analysis with asymmetric information among market participants requires each agent to explicitly consider the beliefs of others (including beliefs about beliefs, etc.) as in Harsanyi (1967-1968). Because of the infinite regress in beliefs it is difficult to make analytical progress at this level of generality [for partial results in this context see Blume and Easley (1984), Bray and Kreps (1981), Feldman (1986), and Townsend (1978)]. Accordingly, an assumption essentially identical to homogeneity of agents’ beliefs (and that this is common knowledge) is made in this paper. The mathematically equivalent story that is told is that in each period all agents rely upon the prediction (a probability measure) of a Bayesian forecaster. The main results are that with probability one: (1) predictions are asymptotically correct, i.e., arbitrarily close to the forecast of an omniscient observer who knows the ‘true’ parameter, and (2) any prediction that is a cluster point (the limit of a subsequence of predictions) is a rational expectations equilibrium. Thus, there is convergence to the set of rational expectations equilibria.
2. The model
2.1. Description of the model and notation
$\tilde{x}_t$ will denote the economic variables of interest observable in period $t$, $t \in T$ where $T = \{1, 2, 3, \ldots\}$. $\tilde{x}_t$ is a random element defined on the underlying probability space $(\Omega, \mathcal{B}, P)$ taking values in a compact metric space $(X, d)$. This section will consist of an informal description of the probability law generating $\tilde{x}_t$ along with some definitions. All considerations of measurability of functions and existence of a probability space with the requisite properties are deferred until section 3.
With $\mathcal{X}$ the Borel subsets of $X$, $\mathcal{M}(X)$ will denote the space of probability measures on $(X, \mathcal{X})$ endowed with the topology of weak convergence. $\mathcal{M}(X)$ is compact [Parthasarathy (1981, Theorem II.6.4)] and with $\rho_M$ the Prohorov distance,¹ $(\mathcal{M}(X), \rho_M)$ is a metric space.
The space of economies is assumed to be homeomorphic to $F \equiv C(\mathcal{M}(X), \mathcal{M}(X))$, the family of continuous functions from $\mathcal{M}(X)$ into $\mathcal{M}(X)$ endowed with the topology of uniform convergence. The interpretation is that the ‘true’ (but unknown) specification of the economy induces a mapping $f_0 \in F$ such that if $m_t$ represents the beliefs of agents prior to period $t$ regarding the realization of $\tilde{x}_t$, the ‘true’ (but unknown) distribution of $\tilde{x}_t$ is $f_0(m_t)$. It should be noted that for the results of this paper it is sufficient for $X$ to be a complete separable space as long as attention is restricted to a tight family of probability measures on $(X, \mathcal{X})$.
The beliefs of agents are assumed to be homogeneous. For convenience it will also be assumed that there is a forecaster or econometrician whose predictions are accepted by all agents.² The prior probability of the econometrician on $(F, \mathcal{F})$ is $\lambda_0 \in \mathcal{M}(F)$ where $\mathcal{F}$ is the Borel $\sigma$-field and $\mathcal{M}(F)$ is the space of probability measures on $(F, \mathcal{F})$ endowed with the topology of weak convergence. Based upon $\lambda_0$ a forecast $m_1 \in \mathcal{M}(X)$ is chosen. After observing the realization $x_1$ of $\tilde{x}_1$, the forecaster revises his beliefs in a Bayesian fashion resulting in a posterior measure $\lambda_1 \in \mathcal{M}(F)$ and a corresponding forecast $m_2 \in \mathcal{M}(X)$, etc.
A rational expectations equilibrium is a forecast $m$ that is a fixed point of $f_0$. The main result of the paper is that with prior probability one the sequence $\{m_t\}$ converges to the set of rational expectations equilibria.
It would be natural to conjecture that $\{\lambda_t\}$ converges to $\delta_{f_0}$, a unit mass point at $f_0$. Reference to fig. 1, though, indicates that without further restrictions the conjecture is false. Interpret the horizontal axis in fig. 1 as the space of forecasts, the vertical axis as the space of induced probabilities on outcomes, and the 45° line as the diagonal in $\mathcal{M}(X) \times \mathcal{M}(X)$. Suppose $f_1 \in F$ is identical to $f_0$ in a neighborhood $N$ of $\bar{m}$, where $\bar{m}$ is a fixed point of $f_0$. If the tail of $\{m_t\}$ is in $N$, then the asymptotic behavior of the process will not distinguish between $f_1$ and $f_0$. Thus, without unacceptably strong restrictions on $\lambda_0$, identification of $f_0$ cannot be expected. This is a variant of the identification problem in econometrics. While $f_0$ is not identifiable, one or more of the fixed points of $f_0$ will be identified [but which one(s) may be random].
Fig. 1
¹ Let $(S, u)$ be a separable metric space with $\mathcal{S}$ the Borel $\sigma$-field. Let $\mathcal{M}(S)$ denote the space of probability measures endowed with the topology of weak convergence. For $P, Q \in \mathcal{M}(S)$ the Prohorov metric $\rho$ is defined by $\rho(P, Q) = \inf\{\varepsilon > 0 : Q(A) \le P(A^{\varepsilon}) + \varepsilon,\ P(A) \le Q(A^{\varepsilon}) + \varepsilon\ \ \forall A \in \mathcal{S}\}$ where $A^{\varepsilon} = \{x \in S : u(x, A) < \varepsilon\}$. For further discussion see Billingsley (1968, p. 237).
² This can be relaxed somewhat. What is crucial is that the reactions of agents to a forecast are time invariant.
2.2. Interpretations of the model
Since the model is not embedded in a conventional framework, we shall suggest two possible interpretations.
2.2.1. Partial equilibrium
In this version agents take actions prior to period t that are influenced by their expectations concerning market outcomes. For example, farmers’ decisions to plant crops reflect their beliefs concerning the joint distribution of
crop prices and harvest yields. Viewed in this perspective, this paper can be interpreted as a generalization of the work of Cyert and DeGroot (1974) to an abstract setting without restrictive parametric assumptions regarding distributions of random variables.
2.2.2. General equilibrium
In the appendix it is demonstrated that the mathematical structure in the body of the paper can be derived from a temporary equilibrium model of an exchange economy. Here I will provide an informal description of the temporary equilibrium model developed in the appendix and indicate the nature of the equivalence between that model and the reduced form mathematical structure described above.
The agents in the economy consist of: (i) a sequence of non-overlapping generations indexed by $t$, $t = 1, 2, \ldots$, and (ii) an economic forecaster. Generation $t$ is comprised of consumers $(1, t), \ldots, (I, t)$, each of whom is alive only in period $t$. Period $t$ consists of two subperiods, subperiod $(t, 1)$ and subperiod $(t, 2)$. Consumer $(i, t)$ receives a random endowment $Z_{itk}$ in subperiod $(t, k)$. The vector of endowments for all members of generation $t$ is a random element $Z_t$. The family of possible distributions for $Z_t$ is indexed by a parameter space $\Theta$ with generic element $\theta$. Conditional upon the realization of a random element $\tilde{\theta}$ taking values in $\Theta$, the sequence $\{Z_t\}$ is i.i.d. with distribution $\mu_{\tilde{\theta}(\omega)}$. The marginal distribution of $Z_{it2}$ conditional upon $Z_{it1}$ and $\tilde{\theta}$ is denoted as $\pi_{i2}(\mu_{\theta})(\cdot \,\|\, Z_{it1})$.
By assumption there is no intertemporal exchange. Consumers trade in subperiod $(t, k)$ at the (random) price vector $p_{tk}$ taking values in $\Delta$. At the commencement of period $t$ the forecaster announces a forecast $m_t$, where $m_t \in \mathcal{M}(\Delta \times \Delta)$. The forecasted marginal distribution of $p_{t2}$ is denoted by $\pi_2(m_t)$. Consumers, in contrast to the forecaster, are assumed to know $\pi_i(\mu_{\tilde{\theta}})$, the ‘true’ distribution of their endowment.
Because consumer $(i, t)$ lacks forecasting expertise, her beliefs regarding the joint distribution of $p_{t2}$ and $Z_{it2}$ given the forecast $m_t$ and observation of the realization of $Z_{it1}$ are represented by the product probability measure $\pi_2(m_t) \times \pi_{i2}(\mu_{\theta})(\cdot \,\|\, Z_{it1})$. With the additional assumption that preferences are strictly concave and monotonic, we can construct the correspondence mapping the forecast and first subperiod endowments into first subperiod equilibrium prices. A selection, assumed to satisfy a regularity condition, is chosen from the equilibrium correspondence. Conditional upon $\tilde{\theta}(\omega) = \theta$ and a forecast $m_t$, the induced measure on prices is $\delta_{\theta}(m_t)$. The function $\delta_{\theta}$ is continuous and has a fixed point.
The forecaster starts with a correctly specified model of the economy and a prior probability $\lambda_0 = P\tilde{\theta}^{-1}$ on $\Theta$. The forecaster observes the price realizations, but not the realization of $\tilde{\theta}$ or the sequence $\{Z_t\}$. It is shown in the
appendix that there is a compact set $\mathcal{M}^* \subset \mathcal{M}(\Delta^2)$ such that for all $\theta \in \Theta$, $\delta_{\theta}(\mathcal{M}^*) \subset \mathcal{M}^*$. The forecast $m_t \in \mathcal{M}^*$ is chosen in a manner consistent with posterior beliefs $\lambda_{t-1}$. Upon observing the realization of $p_t$ beliefs are revised generating the posterior $\lambda_t$, etc. The results in the remainder of the paper can be interpreted by identifying $f$, $F$, and $\mathcal{M}(X)$, respectively with $\delta_{\theta}$, $\{\delta_{\theta} : \theta \in \Theta\}$, and $\mathcal{M}^*$.
3. Evolution of the process
3.1. Selecting a forecast
A natural requirement to impose on announced forecasts is that the forecasts are truthful or consistent with the econometrician’s beliefs.
Definition 3.1. The evaluation function $g : F \times \mathcal{M}(X) \times \mathcal{X} \to [0, 1]$ satisfies $g(f, m, A) = f(m)(A)$ for all $f \in F$, $m \in \mathcal{M}(X)$, and $A \in \mathcal{X}$.
So $g(f, m, A)$ is the probability of event $A$ occurring conditional on $f$ being the economy and $m$ being the forecast.
Definition 3.2. For a given $\lambda \in \mathcal{M}(F)$ a forecast $m$ will be termed consistent (for $\lambda$) if for all $A \in \mathcal{X}$, $m(A) = \int_F g(\cdot, m, A)\, d\lambda$.
The interpretation is that if $m(A)$ is the announced probability of $A$, then $m(A)$ is also the econometrician’s subjective probability of $A$ given that $\lambda$ is his probability measure on $(F, \mathcal{F})$. To verify the existence of consistent forecasts which are a measurable function of $\lambda$, it is necessary to establish certain facts.
Proposition 3.3. $(F, \tau)$ is a separable metric space where $\tau(f_1, f_2) = \sup_{m \in \mathcal{M}(X)} \rho_M(f_1(m), f_2(m))$.
Proof: Since $\mathcal{M}(X)$ is a compact metric space the topology of uniform convergence (which is generated by $\tau$) on $F$ is identical to the compact-open topology [Kuratowski (1966, p. 89)]. Since $\mathcal{M}(X)$ is separable $F$ is separable with the compact-open topology [Kuratowski (1966, p. 94)] and therefore $(F, \tau)$ is separable. Q.E.D.
Lemma 3.4. For every $m \in \mathcal{M}(X)$ and $A \in \mathcal{X}$ the function $g(\cdot, m, A)$ is $\mathcal{F}$ measurable.
The proof, a routine but lengthy exercise in measure theory, is omitted. The details of the proof are in Feldman (1985).
Lemma 3.5. $\mathcal{M}(X)$ is a fixed point space.
The proof is a standard application of the Schauder fixed point theorem. For details see either Anderson and Sonnenschein (1982, Lemma 4) or Feldman (1985).
Definition 3.6. The function $l : \mathcal{M}(X) \times \mathcal{X} \times \mathcal{M}(F) \to [0, 1]$ is defined to satisfy the functional equation $l(m, A, \lambda) = \int_F g(\cdot, m, A)\, d\lambda$.
Lemma 3.7. $l(m, \cdot, \lambda)$ is a probability measure on $(X, \mathcal{X})$.
Proof: This is exercise 19.19 of Billingsley (1979).
A forecast $m$ and beliefs $\lambda$ induce the subjective probability measure $l(m, \cdot, \lambda)$ on $(X, \mathcal{X})$. $l(m, A, \lambda)$ is the econometrician’s subjective probability of $A$ given beliefs $\lambda$ and forecast $m$.
Definition 3.8. $\psi : \mathcal{M}(X) \times \mathcal{M}(F) \to \mathcal{M}(X)$ is defined by $\psi(m, \lambda)(A) = l(m, A, \lambda)$ for every $A \in \mathcal{X}$.
A forecast $m$ is consistent for $\lambda$ if and only if $\psi(m, \lambda) = m$, or in other words, $m$ is a fixed point of $\psi(\cdot, \lambda)$.
Proposition 3.9. $\psi$ is continuous.
Proof: Let $A \in \mathcal{X}$ be a closed set. Using a characterization of weak convergence [Billingsley (1968, Theorem 2.1)] it follows that $g(\cdot, \cdot, A)$ is an upper semicontinuous (u.s.c.) function. Therefore the function $(m, \lambda) \to \int g(f, m, A)\, \lambda(df)$ is u.s.c. [Schal (1975, Lemma 3.4)]. Now let $(m^i, \lambda^i) \to (m, \lambda)$. The u.s.c. property implies $\limsup \int g(f, m^i, A)\, \lambda^i(df) \le \int g(f, m, A)\, \lambda(df)$ or, equivalently, that $\limsup \psi(m^i, \lambda^i)(A) \le \psi(m, \lambda)(A)$. But this suffices to prove that $\psi$ is continuous [Billingsley (1968, Theorem 2.1)]. Q.E.D.
To assure that probabilities of future events are well-defined $m_t$ must be a measurable function of $\lambda_{t-1}$.
Definition 3.10. The consistent forecast correspondence $H : \mathcal{M}(F) \to \mathcal{M}(X)$ is defined by $H(\lambda) = \{m \in \mathcal{M}(X) : \psi(m, \lambda) = m\}$. With $\delta_f$ a unit mass at $f$, $H(\delta_f)$ will sometimes be denoted by $H_f$.
Proposition 3.11. $H$ is a non-empty upper hemicontinuous (u.h.c.) correspondence.
Proof: $\psi$ continuous implies $H$ has closed graph and hence, because $\mathcal{M}(X)$ is
compact, $H$ is u.h.c. [Hildenbrand (1974, p. 23)]. $\psi$ continuous and Lemma 3.5 imply $H$ is non-empty for all $\lambda \in \mathcal{M}(F)$. Q.E.D.
Recall that a class 1 function has the property that the inverse image of open sets are $F_{\sigma}$-sets.
Proposition 3.12. There exists a class 1 (and hence measurable) function $h : \mathcal{M}(F) \to \mathcal{M}(X)$ such that $h(\lambda) \in H(\lambda)$ for all $\lambda \in \mathcal{M}(F)$.
Proof: With $\rho_F$ the Prohorov metric, $(\mathcal{M}(F), \rho_F)$ is a metric space. But $H$ u.h.c., $\mathcal{M}(F)$ metric, and $\mathcal{M}(X)$ complete are sufficient conditions [Kuratowski (1968, p. 74)] to imply existence of a class 1 selection. Q.E.D.
Assumption 3.13. If the econometrician’s measure on $(F, \mathcal{F})$ at the end of period $t$ is $\lambda_t$, the forecast $m_{t+1} = h(\lambda_t)$.
Definition 3.14. The economy is the realization of the random element $\tilde{F} : \Omega \to F$. For given $\omega \in \Omega$, the ‘true’ economy $\tilde{F}(\omega)$ is denoted by $f_0$.
Definition 3.15. A rational expectations equilibrium is a forecast $m$ such that $m \in H_{f_0}$, i.e., $\psi(m, \delta_{f_0}) = m$.
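As a minimal illustrative sketch (not taken from the paper), the definitions above can be made concrete in a toy case assumed here: a two-point outcome space X = {0, 1}, so that a forecast reduces to a number m in [0, 1] (the probability of outcome 1), and beliefs with finite support on continuous maps of [0, 1] into itself. The maps `f_a` and `f_b` and the helper `consistent_forecast` are hypothetical names introduced only for illustration. The helper locates one fixed point of $\psi(\cdot, \lambda)$ by bisection; it is one possible selection from $H(\lambda)$ and is not claimed to be the class 1 selection $h$ of Proposition 3.12.

```python
from typing import Callable, Sequence

# A candidate economy in the toy case: forecast m -> true P(outcome = 1).
Map = Callable[[float], float]


def f_a(m: float) -> float:
    """Hypothetical economy with unique fixed point m = 0.4."""
    return 0.2 + 0.5 * m


def f_b(m: float) -> float:
    """Hypothetical economy with unique fixed point m = 7/12."""
    return 0.7 - 0.2 * m


def consistent_forecast(maps: Sequence[Map], weights: Sequence[float],
                        tol: float = 1e-12) -> float:
    """Return one m with m = sum_j weights[j] * maps[j](m), i.e. psi(m, lambda) = m."""
    def gap(m: float) -> float:
        return sum(w * f(m) for f, w in zip(maps, weights)) - m

    # gap(0) >= 0 and gap(1) <= 0 because every map takes values in [0, 1],
    # so continuity guarantees a root and bisection brackets it.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Beliefs concentrated on f_a: the consistent forecast is a rational
# expectations equilibrium for f_a in the sense of Definition 3.15.
ree_a = consistent_forecast([f_a, f_b], [1.0, 0.0])
# Beliefs split evenly: the consistent forecast solves m = 0.45 + 0.15 m.
mixed = consistent_forecast([f_a, f_b], [0.5, 0.5])
```

In this sketch a unit mass on `f_a` reproduces the fixed point of `f_a`, while mixed beliefs give a forecast that is a fixed point of the averaged map rather than of either economy.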
We can now provide a precise definition of convergence to rational expectations. Let $\rho_M(m, H_{f_0}) \equiv \inf_{m' \in H_{f_0}} \rho_M(m, m')$.
Definition 3.16. There is convergence to rational expectations iff $\lim_{t \to \infty} \rho_M(m_t, H_{f_0}) = 0$.
3.2. Revision of beliefs and construction of the probability space
Let $(X^t, \mathcal{X}^t) = (\prod_{i=1}^{t} X_i, \prod_{i=1}^{t} \mathcal{X}_i)$ where $X_i$ and $\mathcal{X}_i$ are respectively the $i$th copy of $X$ and $\mathcal{X}$. The generic element of $X^t$ is denoted by $x^t$. Let $(\Omega_t, \mathcal{B}_t) = (F \times X^t, \mathcal{F} \times \mathcal{X}^t)$ and $(\Omega, \mathcal{B}) = (F \times X^{\infty}, \mathcal{F} \times \mathcal{X}^{\infty})$. Building upward from the prior $\lambda_0$ on $(F, \mathcal{F})$ and the forecasting rule $h$, the sequence of probability spaces $\{(\Omega_t, \mathcal{B}_t, P_t)\}$ will now be constructed. The sequence $\{P_t\}$ of finite horizon probabilities uniquely determines the probability $P$ on the infinite horizon space $(\Omega, \mathcal{B})$. $(\Omega, \mathcal{B}, P)$ is taken to be the probability space on which all random elements are defined. $\tilde{F} : \Omega \to F$ is the projection of $\Omega$ onto $F$ defined by $\tilde{F}(f, x^{\infty}) = f$.
Definition 3.17. $P_{1,f}$ is the probability measure on $(X, \mathcal{X})$ such that $P_{1,f}(B) = f(h(\lambda_0))(B)$ for all $B \in \mathcal{X}$.
$P_{1,f}(B)$ should be thought of as the conditional probability of $\tilde{x}_1 \in B$ given that $\tilde{F} = f$.
Definition 3.18. Let $P_1$ be the (unique) probability measure on $(\Omega_1, \mathcal{B}_1)$ such that for $\mathcal{A} \in \mathcal{F}$ and $B \in \mathcal{X}$, $P_1(\mathcal{A} \times B) = \int_{\mathcal{A}} g(f, h(\lambda_0), B)\, \lambda_0(df)$.
Lemma 3.19. There exists a measurable function $\lambda_1 : X \to \mathcal{M}(F)$ such that $\lambda_1$ is a regular conditional probability distribution given $\tilde{x}_1$. That is, $\lambda_1(x_1) \in \mathcal{M}(F)$ for $x_1 \in X$ and the mapping $x_1 \to \lambda_1(x_1)(\mathcal{A})$ is a version of $P_1(\mathcal{A} \times X \,\|\, \tilde{x}_1 = x_1)$.
Proof: This is the Disintegration Theorem [Parthasarathy (1967, Theorem V.8.1)].
We now proceed to an induction argument that allows us to define $P_{t,f}$, $P_t$, and $\lambda_t$ for all $t \in T$.
Proposition 3.20. Suppose for $t \ge 2$ that: (i) $P_{t-1,f}$ is a probability on $(X^{t-1}, \mathcal{X}^{t-1})$ with the map $f \to P_{t-1,f}$ $\mathcal{F}/\mathcal{B}(\mathcal{M}(X^{t-1}))$ measurable, (ii) $P_{t-1}$ is a probability on $(\Omega_{t-1}, \mathcal{B}_{t-1})$, and (iii) for the probability space $(\Omega_{t-1}, \mathcal{B}_{t-1}, P_{t-1})$ the function $\lambda_{t-1} : X^{t-1} \to \mathcal{M}(F)$ is a regular conditional probability given $(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_{t-1})$. Then (i) the map $f \to P_{t,f}$ defined by $P_{t,f}(A \times B) = \int_{A} f(h(\lambda_{t-1}(x^{t-1})))(B)\, P_{t-1,f}(dx^{t-1})$ for $A \in \mathcal{X}^{t-1}$ and $B \in \mathcal{X}$ is $\mathcal{F}/\mathcal{B}(\mathcal{M}(X^t))$ measurable, (ii) there is a unique probability $P_t$ on $(\Omega_t, \mathcal{B}_t)$ such that $P_t(\mathcal{A} \times D) = \int_{\mathcal{A}} P_{t,f}(D)\, \lambda_0(df)$ for $\mathcal{A} \in \mathcal{F}$ and $D \in \mathcal{X}^t$, and (iii) there is a measurable function $\lambda_t : X^t \to \mathcal{M}(F)$ such that for the probability space $(\Omega_t, \mathcal{B}_t, P_t)$, $\lambda_t$ is a regular conditional probability given $(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_t)$.
Proposition 3.21. There are unique probability measures $P$ on $(\Omega, \mathcal{B})$ and $P_f$ on $(X^{\infty}, \mathcal{X}^{\infty})$ such that $P$ restricted to $(\Omega_t, \mathcal{B}_t)$ is equal to $P_t$ and $P_f$ restricted to $(X^t, \mathcal{X}^t)$ is $P_{t,f}$.
Proof: This is a direct application of a version of Kolmogorov’s Extension Theorem [Ash (1972, Theorem 2.7.2)].
4. Asymptotic behavior of the system
The proofs of the convergence theorems in this section rely upon a mathematical result of Blackwell and Dubins (1962, Main Theorem)³ which will be restated in the context of our model. Let $\bar{P}$ be the restriction of $P$ to $(X^{\infty}, \mathcal{X}^{\infty})$ and let $\bar{Q}$ be a probability measure on $(X^{\infty}, \mathcal{X}^{\infty})$ with $\bar{Q} \ll \bar{P}$.
proof relying on Theorem 1 of Dubins and Freedman (1965) is in Feldman
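and $x^{\infty,t} = (\tilde{x}_{t+1}, \tilde{x}_{t+2}, \tilde{x}_{t+3}, \ldots)$ and define $x^t$ by $x^t = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_t)$. For the probability space $(X^{\infty}, \mathcal{X}^{\infty}, \bar{P})$ let the map $x^t \to \bar{P}^t(x^t)$ be a regular conditional probability on $(X^{\infty,t}, \mathcal{X}^{\infty,t})$ given $x^t$. Then the Blackwell-Dubins theorem is that there exist regular conditional probabilities $\bar{Q}^t$ given $x^t$ such that $\rho(\bar{P}^t, \bar{Q}^t) \to 0$ as $t \to \infty$ with $\bar{Q}$ probability 1 where $\rho$ is the sup metric.⁴ Let $\bar{P}^t$ and $\bar{Q}^t$ be defined as above and let $\hat{P}^t$ and $\hat{Q}^t$ be the respective restrictions of $\bar{P}^t$ and $\bar{Q}^t$ to $(X_{t+1}, \mathcal{X}_{t+1})$.
Proposition 4.1. If $\bar{Q}$ is a probability on $(X^{\infty}, \mathcal{X}^{\infty})$ with $\bar{Q} \ll \bar{P}$ then $\rho_M(\hat{Q}^t, \hat{P}^t) \to 0$ with $\bar{Q}$ probability one.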
Proof. The conclusion follows directly from the Blackwell-Dubins Theorem and the fact that norm convergence is stronger than weak convergence. Q.E.D.
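The merging of one-step-ahead forecasts can be illustrated with a small sketch under assumptions that are not in the paper: outcomes are binary and i.i.d. given an unknown bias, the two observers hold Beta(1, 1) and Beta(8, 2) priors on that bias (so the induced measures on sequences are mutually absolutely continuous), and the data path has `n // 2` successes among the first `n` outcomes. The function names `predictive` and `predictive_gap` are hypothetical. For binary outcomes the absolute difference between the two predictive probabilities is the total variation distance between the one-step forecasts, which bounds the Prohorov distance from above.

```python
from fractions import Fraction


def predictive(a: int, b: int, successes: int, n: int) -> Fraction:
    """P(next outcome = 1 | data) under a Beta(a, b) prior on the bias."""
    return Fraction(a + successes, a + b + n)


def predictive_gap(n: int, prior_p=(1, 1), prior_q=(8, 2)) -> Fraction:
    """Distance between the two one-step forecasts after n observations."""
    successes = n // 2  # assumed data path: half of the outcomes equal 1
    p = predictive(prior_p[0], prior_p[1], successes, n)
    q = predictive(prior_q[0], prior_q[1], successes, n)
    return abs(p - q)


# The gap shrinks as the shared history grows.
gaps = [predictive_gap(n) for n in (10, 100, 1000)]
```

Along this path the two forecasters disagree by 3/20 after 10 observations and by 3/1010 after 1000, so the influence of the differing priors on the next-period forecast washes out.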
One interpretation is that if an observer has prior $\bar{Q}$, the forecaster has prior $\bar{P}$, and $\bar{Q}$ is absolutely continuous with respect to $\bar{P}$, then the conditional forecasts of the future given the past will become close in a strong sense (the sup topology) with observer probability one.
If $P_f$ was absolutely continuous with respect to $\bar{P}$ for almost all $f$ $[\lambda_0]$ then Proposition 4.1 would be directly applicable with $P_f$ substituted for $\bar{Q}$. Instead, we construct an increasing sequence of partitions of $F$ and an associated sequence $\{Q_k\}_{k=1}^{\infty}$ of probabilities on $(\Omega, \mathcal{B})$ such that $Q_k \ll P$.
Definition 4.2. Let $\{O_1, O_2, \ldots\}$ be a base for the $(F, \tau)$ metric topology. Let $R_k = \{R_{k,1}, \ldots, R_{k,2^k}\}$ be the partition of $F$ generated by $\{O_1, \ldots, O_k\}$. Let $\mathcal{F}_k$ be the $\sigma$-field generated by $R_k$.
Lemma 4.3. $\mathcal{F}_k \uparrow \mathcal{F}$, i.e., $\mathcal{F}_k \subset \mathcal{F}_{k+1}$ and $\sigma(\bigcup_{k=1}^{\infty} \mathcal{F}_k) = \mathcal{F}$.
Definition 4.4. For the probability space $(\Omega, \mathcal{B}, P)$, $Q_k$ is a proper regular conditional probability on $\mathcal{B}$ given $\mathcal{F}_k$. $\bar{Q}_k$ is the restriction of $Q_k$ to $(X^{\infty}, \mathcal{X}^{\infty})$. $Q_k^t$ is a regular version of $Q_k(\cdot \,\|\, \sigma(x^t))$ restricted to $(F \times X^{\infty,t}, \mathcal{F} \times \mathcal{X}^{\infty,t})$. $\bar{Q}_k^t$ is a regular version of $\bar{Q}_k(\cdot \,\|\, \sigma(x^t))$ restricted to $(X^{\infty,t}, \mathcal{X}^{\infty,t})$.
The requirement for $Q_k$ to be proper is that $Q_k(R_{k,j} \times X^{\infty})(\cdot) = I_{\{\tilde{F} \in R_{k,j}\}}(\cdot)$ for $j = 1, \ldots, 2^k$ where $I(\cdot)$ is the indicator function. So even if $\lambda_0(R_{k,j}) = 0$, $Q_k(R_{k,j} \times X^{\infty}) = 1$ whenever $\tilde{F} \in R_{k,j}$. A general reference on the existence and non-existence of proper, regular conditional probabilities is Blackwell and Dubins (1975). $Q_k^t$ can be interpreted as the probability on $(F \times X^{\infty,t}, \mathcal{F} \times \mathcal{X}^{\infty,t})$ which obtains from prior beliefs $P$, observing $x^t$ and knowing which element of $R_k$ contains $f_0$. $\bar{Q}_k^t$ is the analogous probability on $(X^{\infty,t}, \mathcal{X}^{\infty,t})$.
⁴ $\rho(P^t, Q^t) := \sup_{D \in \mathcal{X}^{\infty,t}} |P^t(D) - Q^t(D)|$.
Lemma 4.5.
Except on a set of $P$ probability 0, $Q_k \ll P$ and $\bar{Q}_k \ll \bar{P}$.
Proposition 4.6. $P$ a.s. $\lim_{t \to \infty} \rho_M(\hat{Q}_k^t, \hat{P}^t) = 0$.
Proof: Let $E_k = \{\omega \in \Omega : \rho_M(\hat{Q}_k^t, \hat{P}^t) \to 0\}$ and let $A_k = \{\omega \in \Omega : \bar{Q}_k(\omega) \ll \bar{P}\}$. From Proposition 4.1 $Q_k(E_k)(\omega) = 1$ for $\omega \in A_k$ and by Lemma 4.5 $P(A_k) = 1$ so $\int_{A_k} Q_k(E_k)(\omega)\, P(d\omega) = 1$ implying $P(E_k) = 1$. Q.E.D.
The convergence of $\rho_M(f_0(m_t), m_t)$ to zero might now appear upon first glance to be a direct consequence of the triangle inequality $\rho_M(f_0(m_t), m_t) \le \rho_M(f_0(m_t), \hat{Q}_k^{t-1}) + \rho_M(\hat{Q}_k^{t-1}, m_t)$. But since the value of $k$ that guarantees $\rho_M(f_0(m_t), \hat{Q}_k^{t-1})$ is sufficiently small is random, we need to introduce the machinery of stopping times.
Definition 4.7. Let $Z_k : \Omega \to R$ be the random variable defined by $Z_k(\omega) = \sum_{j} I_{R_{k,j}}(\tilde{F}(\omega)) \cdot \operatorname{diam} R_{k,j}$.
$Z_k$ is an upper bound on the discrepancy between $\hat{Q}_k^{t-1}$ and $f_0(m_t)$.
Lemma 4.8. $Z_k \downarrow 0$ for all $\omega \in \Omega$.
The proof, which is routine, is omitted. Details are in Feldman (1985).
Definition 4.9. For $n \in T$ let $\alpha_n$ be a stopping time relative to $\{\mathcal{F}_k\}$ with $\alpha_n(\omega) = \min\{k \in T : Z_k(\omega) < 1/n\}$.
Lemma 4.10. Let $\alpha : \Omega \to R$ be an integer valued random variable. Then $\lim_{t \to \infty} \rho_M(\hat{Q}_{\alpha}^{t-1}, m_t) = 0$ $P$ a.s.
The proof is routine. Details are in Feldman (1985).
Theorem 4.11. With $P$ probability one $\rho_M(f_0(m_t), m_t) \to 0$ or equivalently $\rho_M(\tilde{F}(\omega)(m_t(\omega)), m_t(\omega)) \to 0$ except on a null set.
Proof: By the triangle inequality $\rho_M(f_0(m_t), m_t) \le \rho_M(f_0(m_t), \hat{Q}_{\alpha_n}^{t-1}) + \rho_M(\hat{Q}_{\alpha_n}^{t-1}, m_t)$. Lemma 4.8 and Definition 4.9 imply that $\sup_t \rho_M(f_0(m_t), \hat{Q}_{\alpha_n}^{t-1}) < 1/n$. From Lemma 4.10 we conclude that $P$ a.s. there exists $\tau$ such that for $t \ge \tau$, $\rho_M(\hat{Q}_{\alpha_n}^{t-1}, m_t) < 1/n - \rho_M(f_0(m_t), \hat{Q}_{\alpha_n}^{t-1})$. Therefore $P$ a.s. $\limsup_{t} \rho_M(f_0(m_t), m_t) \le 1/n$ and since $n$ is arbitrary the proof is complete. Q.E.D.
From Theorem 4.11 we can conclude that asymptotically forecasts are accurate in the sense that an observer who knew the realization of $\tilde{F}$ and $m_t$ would have no substantial disagreement with the forecast $m_t$. Convergence to rational expectations embodies the stronger conclusion that forecasts converge to the set of fixed points of $f_0$.
Theorem 4.12. $P$ a.s. there is convergence to rational expectations.
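The statement can be illustrated numerically with a rough simulation sketch under assumptions introduced here and not found in the paper: outcomes are binary, so a forecast is a number m in [0, 1]; the prior puts weight 1/2 on each of two hypothetical maps `f_a` and `f_b`; the realized economy is `f_a`; and the consistent forecast is selected by bisection. The names `simulate`, `posterior`, and `consistent_forecast` are illustrative only. Each period the forecaster announces a consistent forecast, an outcome is drawn from the true map evaluated at that forecast, and beliefs are revised by Bayes' rule.

```python
import math
import random


def f_a(m):
    """Hypothetical true economy; its unique fixed point is m = 0.4."""
    return 0.2 + 0.5 * m


def f_b(m):
    """Hypothetical rival economy; its unique fixed point is m = 7/12."""
    return 0.7 - 0.2 * m


def consistent_forecast(maps, weights, tol=1e-12):
    """Bisection for a root of sum_j w_j f_j(m) - m on [0, 1]."""
    def gap(m):
        return sum(w * f(m) for f, w in zip(maps, weights)) - m

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior(log_w):
    """Normalize log-weights into probabilities without underflow."""
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def simulate(periods=5000, seed=0):
    """Return |f_0(m_t) - m_t| for each period, with f_0 = f_a."""
    rng = random.Random(seed)
    maps = [f_a, f_b]
    true_map = f_a
    log_w = [math.log(0.5), math.log(0.5)]  # prior lambda_0
    gaps = []
    for _ in range(periods):
        m = consistent_forecast(maps, posterior(log_w))
        gaps.append(abs(true_map(m) - m))
        # The outcome is drawn from the true economy evaluated at the forecast.
        x = 1 if rng.random() < true_map(m) else 0
        # Bayesian revision: add the log-likelihood of x under each map.
        for j, f in enumerate(maps):
            p = f(m)
            log_w[j] += math.log(p if x == 1 else 1.0 - p)
    return gaps


gaps = simulate()
```

In this toy setting the two maps disagree at every forecast the rule can announce, so the data keep discriminating between them and the forecast error $|f_0(m_t) - m_t|$ becomes negligible. The identification caveat of section 2.1 concerns maps that coincide near the limiting forecast, which this sketch deliberately avoids.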
Proof. According to Definition 3.16 it is necessary to prove that $\lim_{t \to \infty} \rho_M(m_t, H_{f_0}) = 0$ $P$ a.s. Let $M(\omega)$ be the set of cluster points of $\{m_t\}$. Since $\mathcal{M}(X)$ is compact it suffices to prove that $M(\omega) \subset H_{f_0}$ $P$ a.s. Let $E = \{\omega \in \Omega : \rho_M(f_0(m_t), m_t) \not\to 0\}$. $P(E) = 0$ so it suffices to show that $M(\omega) \subset H_{f_0}$ for $\omega \in E^c$. Let $\omega \in E^c$ and let $m^* \in M(\omega)$ with $\{m_{t_j}\} \to m^*$. By the triangle inequality $\rho_M(f_0(m^*), m^*) \le \rho_M(f_0(m^*), f_0(m_{t_j})) + \rho_M(f_0(m_{t_j}), m_{t_j}) + \rho_M(m_{t_j}, m^*)$. The first term on the right-hand side of the inequality converges to zero as $j \to \infty$ because $f_0$ is continuous. The second term converges to zero because $\omega \in E^c$. The third term converges to zero by the choice of the subsequence. Therefore $\rho_M(f_0(m^*), m^*) = 0$ so $m^* \in H_{f_0}$ or $M(\omega) \subset H_{f_0}$.
So if there is a unique fixed point $m^*$ of $f_0$ then $m_t \Rightarrow m^*$. It is natural to suspect that even if $f_0$ has multiple fixed points the sequence $\{m_t\}$ will have a limit. In Proposition 4.13 we prove that there is a random element $\lambda_{\infty}$ such that $\lambda_t \Rightarrow \lambda_{\infty}$. If $h$ is continuous at $\lambda_{\infty}(\omega)$ then $m_t \Rightarrow h(\lambda_{\infty}(\omega))$. Unfortunately, no formal probabilistic results seem attainable regarding the continuity of $h$ at $\lambda_{\infty}(\omega)$. However, the fact that the set of discontinuities of $h$ is of first category does suggest that continuity of $h$ at $\lambda_{\infty}(\omega)$ is ‘typical’.
Proposition 4.13. There is a random element $\lambda_{\infty} : \Omega \to \mathcal{M}(F)$ such that $P$ a.s. $\lambda_t \Rightarrow \lambda_{\infty}$.
Proof: Let the family of open sets $\{O_1, O_2, \ldots\}$ be the base for $F$ defined in Definition 4.2. Let $\mathcal{U}$ be the family of sets consisting of $\{O_1, O_2, \ldots\}$ and all finite intersections. By construction $\mathcal{U}$ is closed under the formation of finite intersections and every open set in $F$ is a finite or countable union of elements of $\mathcal{U}$. Define $\lambda_{\infty}(\omega)(A)$ by $\lambda_{\infty}(A) = P(A \times X^{\infty} \,\|\, \sigma(\tilde{x}_1, \tilde{x}_2, \ldots))$. Invoking a version of the Martingale convergence theorem [Billingsley (1979, Theorem 35.5)],
for any $A \in \mathcal{F}$, $\lambda_t(A) \to \lambda_{\infty}(A)$ $P$ a.s. Since $\mathcal{U}$ is countable $P$ a.s. $\lambda_t(B) \to \lambda_{\infty}(B)$ simultaneously for all $B \in \mathcal{U}$. The conclusion of the proposition now follows directly from a theorem of Billingsley (1968, Theorem 2.2).
5. Possible extensions
It is possible to construct a specialized temporary equilibrium model which gives rise to the stochastic structure in this paper. But the extent to which versions of the above theorems are valid in intertemporal general equilibrium contexts without restrictive assumptions is an open question.
It would be desirable to extend the results of this paper by relaxing the assumptions that $f_0$ is continuous and that beliefs are common knowledge. But in part because of the paucity of results concerning the existence of nonrevealing rational expectations equilibria, modeling learning with heterogeneous beliefs appears to be a difficult task.
An obvious question is whether convergence results are attainable with $P_f$ probability one for all $f \in F$. The answer is negative even if $\lambda_0$ assigns positive probability to every open set in $F$.
Another direction in which to proceed is to investigate the rate of convergence to rational expectations. Without restricting the prior beliefs, it appears (to me) doubtful that much can be said. But Martingale central limit theorems and the law of the iterated logarithm may be of some use. However, if one is willing to assume, say, a normally distributed prior in a linear econometric model the prior and thus the rate of convergence can be estimated. This also provides a method of testing the R.E.H. as a special case of rational learning. A preliminary attempt at pursuing these issues is in Feldman (1982). A recent paper by Marcet and Sargent (1985) is also relevant.
Appendix
Consider an exchange economy with a sequence of non-overlapping generations and a market structure that prohibits intertemporal exchange.
Each generation consists of $I < \infty$ consumers each of whom lives one period, where each period is comprised of two subperiods. So consumer $(i, t)$ is alive in subperiods $(t, 1)$ and $(t, 2)$ where $i \in \{1, 2, \ldots, I\}$ and $t \in \{1, 2, \ldots\}$. There are $l$ (non-storable) commodities available for consumption in each subperiod. The consumption set for each consumer is $C_i \times C_i$ where $C_i = R^l_+$. The total consumption set is $C \times C$ where $C = C_1 \times \cdots \times C_I$. The Borel subsets of $C$ are denoted by $\mathcal{C}$, and the product $\sigma$-field by $\mathcal{C} \times \mathcal{C}$.
The preferences of consumer $(i, t)$ can be represented by a von Neumann-Morgenstern utility function $u_i : C_i \times C_i \to R$ which is bounded, strictly concave and strictly monotone. The endowment of consumer $(i, t)$ is a random vector
$Z_{it} = (Z_{it1}, Z_{it2})$ where $Z_{itk}$ takes values in $\operatorname{int} C_i$. The vector of endowments for generation $t$ is $Z_t = (Z_{t1}, Z_{t2})$, taking values in $C^{\circ} \times C^{\circ}$ where $Z_{tk} = (Z_{1tk}, Z_{2tk}, \ldots, Z_{Itk})$ and $C^{\circ}$ denotes the interior of $C$.
The family of a priori possible distributions for $Z_t$ is $\{\mu_{\theta} : \theta \in \Theta\}$ where $\Theta$ is an abstract parameter space and $\{\mu_{\theta} : \theta \in \Theta\}$ is a family of Borel probability measures on $C^{\circ} \times C^{\circ}$ dominated by Lebesgue measure. Identifying $\theta$ with $\mu_{\theta}$, endow $\Theta$ with the topology generated by the variational metric $d$ defined by $d(\theta_1, \theta_2) = \sup_{A \in \mathcal{C} \times \mathcal{C}} |\mu_{\theta_1}(A) - \mu_{\theta_2}(A)|$. So $(\Theta, d)$ is a separable metric space. The Borel $\sigma$-field of $\Theta$ is denoted by $\mathcal{H}$. There is a random element $\tilde{\theta} : \Omega \to \Theta$ such that conditional upon $\tilde{\theta}(\omega) = \theta$, the sequence $\{Z_t\}$ is i.i.d. with distribution $\mu_{\theta}$. The probability $\lambda_0$ on $(\Theta, \mathcal{H})$, defined by $\lambda_0 = P\tilde{\theta}^{-1}$, is the forecaster’s prior distribution. The prior is assumed to satisfy the restriction that $\operatorname{supp} \lambda_0$ is a tight family of probability measures on $C^{\circ} \times C^{\circ}$.
For $\theta \in \Theta$, let $\pi_i(\mu_{\theta})$ denote the conditional (upon $\theta$) marginal distribution of $Z_{it}$. There is an assumed informational asymmetry which is that consumer $(i, t)$ knows the realized marginal distribution $\pi_i(\mu_{\theta})$, while in contrast the forecaster must attempt to infer $\tilde{\theta}(\omega)$ from the prior distribution and the sequence $\{p_t\}$ of price realizations. The rationale for this is that individuals can be presumed to have more detailed knowledge of their personal characteristics than a forecaster.
We now proceed to describe the dynamical structure of the economy. The set of possible prices in each subperiod is $\Delta = \{p \in R^l_{++} : \sum_i p_i = 1\}$. Period $t$ commences when the forecaster announces a forecast $m_t \in \mathcal{M}(\Delta^2)$. The marginal distribution of the forecast of second period prices is the projection $\operatorname{proj}_2(m_t)$ defined by $\operatorname{proj}_2(m_t)(A) = m_t(\Delta \times A)$. The forecast $m_t$ will affect outcomes only through $\operatorname{proj}_2(m_t)$.
Consumer (i,t), informed of the forecast m, and privately observing the realization of Zifl, calculates, using a fixed regular version of conditional probability rr&)(. 11Zitr), a conditional distribution for Zit2. Lacking forecasting expertise, consumer (i, t) has beliefs on (int Ci) x A given by the product measure ni(~e)(. 11Zi,,) x proj, (m,). The subperiod (t, 1) demand functions can be obtained by solving a routine dynamic programming problem. Define hi(Citr,Zit2,~~2) by hi(citr, Zif2,pt2) = argmaxci12ui(ci*13 cit2)3 s.t. Pt2 ’ (Zit2 - Cit2) = 0. The value function ui: Ci x int Ci x A’( A2) + R of subperiod (t, 1) consumption for consumer (i, t) is defined by
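The demand function for individual $(i,t)$ in subperiod $(t,1)$ is $\eta_{i1} : \Delta \times \operatorname{int} C_i \times \mathcal{M}(\Delta^2) \to C_i$, defined by
$$\eta_{i1}(p_{t1}, z_{it1}, m_t) = \arg\max_{c_{it1}} v_i(c_{it1}, z_{it1}, m_t) \quad \text{s.t. } p_{t1} \cdot (z_{it1} - c_{it1}) = 0.$$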
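The two-stage structure of this dynamic programming problem can be sketched numerically. The Python code below is a minimal illustration under strong assumptions that are not in the paper: two goods, a hypothetical utility function with complementarity across subperiods, beliefs with finite support, and grid search on each budget line in place of an exact argmax. All function names are hypothetical.

```python
import itertools
import numpy as np

def utility(c1, c2):
    # Hypothetical utility: each good is complementary across the two subperiods.
    return np.sqrt(c1[0] * c2[0]) + np.sqrt(c1[1] * c2[1])

def budget_line(p, z, n=201):
    # Interior grid of two-good bundles c satisfying p . (z - c) = 0.
    wealth = float(p @ z)
    x = np.linspace(0.0, wealth / p[0], n + 2)[1:-1]
    y = (wealth - p[0] * x) / p[1]
    return np.vstack([x, y])  # shape (2, n)

def h(c1, z2, p2):
    # Second-subperiod choice: argmax over c2 of u(c1, c2) on the budget line.
    candidates = budget_line(p2, z2)
    values = utility(c1[:, None], candidates)
    return candidates[:, np.argmax(values)]

def value(c1, beliefs):
    # Expected utility of c1 under discrete beliefs [(prob, z2, p2), ...].
    return sum(prob * utility(c1, h(c1, z2, p2)) for prob, z2, p2 in beliefs)

def demand(p1, z1, beliefs):
    # First-subperiod demand: argmax over c1 of value(c1, .) on the budget line.
    candidates = budget_line(p1, z1)
    values = [value(candidates[:, k], beliefs) for k in range(candidates.shape[1])]
    return candidates[:, int(np.argmax(values))]

def product_beliefs(z2_dist, p2_dist):
    # Product of a conditional endowment law and a second-period price forecast.
    return [(qz * qp, z2, p2)
            for (qz, z2), (qp, p2) in itertools.product(z2_dist, p2_dist)]

# Illustrative inputs (hypothetical numbers).
z2_dist = [(0.5, np.array([1.0, 2.0])), (0.5, np.array([2.0, 1.0]))]
p2_dist = [(0.5, np.array([0.3, 0.7])), (0.5, np.array([0.6, 0.4]))]
beliefs = product_beliefs(z2_dist, p2_dist)
c1_star = demand(np.array([0.5, 0.5]), np.array([1.0, 1.0]), beliefs)
```

Because the consumer cannot transfer wealth between subperiods in this sketch, the forecast influences first-subperiod demand only through the non-separability of the assumed utility function.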
The vector of induced demand functions is $\eta_1 = (\eta_{11}, \eta_{21}, \ldots, \eta_{I1})$ and the
function defined by $W_2(z_{t2}, c_{t1}) = \{p_{t2} : \zeta_2(p_{t2}, z_{t2}, c_{t1}) = 0\}$.
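A set of market-clearing prices of this kind can be illustrated by locating a zero of an excess demand function on the price simplex. The Python sketch below assumes, purely for illustration, a two-good exchange economy with Cobb-Douglas consumers; the expenditure shares stand in for whatever dependence on first-subperiod consumption the actual preferences generate, and the function names are hypothetical.

```python
def excess_demand_good1(p1, endowments, shares):
    # Aggregate excess demand for good 1 at prices (p1, 1 - p1).
    p2 = 1.0 - p1
    demand = sum(
        a * (p1 * z1 + p2 * z2) / p1 for a, (z1, z2) in zip(shares, endowments)
    )
    supply = sum(z1 for z1, _ in endowments)
    return demand - supply

def equilibrium_price(endowments, shares, tol=1e-12):
    # Bisection on p1 in (0, 1); excess demand for good 1 is decreasing in p1
    # for the Cobb-Douglas specification assumed here.
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess_demand_good1(mid, endowments, shares) > 0:
            lo = mid
        else:
            hi = mid
    p1 = 0.5 * (lo + hi)
    return (p1, 1.0 - p1)

# Two consumers with endowments (z1, z2) and good-1 expenditure shares.
endowments = [(1.0, 2.0), (3.0, 1.0)]
shares = [0.5, 0.25]
price = equilibrium_price(endowments, shares)
```

In this specification the equilibrium is unique, so the set of market-clearing prices is a singleton; by Walras' law the market for the second good clears at the same price. In general the set may contain several prices, which is why a measurable selection is needed.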
The second subperiod equilibrium price function $\psi_2 : C^\circ \times C \to \Delta$ is a measurable selection from $W_2$. $\psi_2$ is assumed to satisfy a regularity condition resembling the condition imposed upon $\psi_1$. Defining $D_2(c)$ for $c \in C$ by $D_2(c) = \{z_{t2} \in C^\circ : \psi_2(z_{t2}, \cdot) \text{ is discontinuous at } c\}$, the requisite assumption is that for all $c \in C$, $D_2(c)$ is a set of Lebesgue measure zero. This implies that for all $\theta \in \Theta$, the map $c_{t1} \mapsto \mu_\theta \psi_2^{-1}(\cdot, c_{t1})$ is continuous. The equilibrium price function $\psi : C^\circ \times C^\circ \times \mathcal{M}(\Delta^2) \to \Delta \times \Delta$ is defined by
$$\psi(z_{t1}, z_{t2}, m_t) = \bigl(\psi_1(z_{t1}, m_t),\ \psi_2(z_{t2}, \eta_1(\psi_1(z_{t1}, m_t), z_{t1}, m_t))\bigr).$$
The assumptions made regarding $\psi_1$ and $\psi_2$ guarantee that for all $m \in \mathcal{M}(\Delta^2)$, $\psi(z_{t1}, z_{t2}, \cdot)$ is continuous at $m$ except for a set of endowments of Lebesgue measure zero. It follows that the induced price distribution map $\beta_\theta : \mathcal{M}(\Delta^2) \to \mathcal{M}(\Delta^2)$ defined by $\beta_\theta(m_t)(A) = \mu_\theta \psi^{-1}(\cdot, m_t)(A)$ is continuous. A technical loose end arises from the fact that $\mathcal{M}(\Delta^2)$ is not compact. But by exploiting the tightness of the family $\{\mu_\theta : \theta \in \Theta\}$ and the strict monotonicity
of preferences, it can be shown that $\bigcup \{\operatorname{Im} \beta_\theta : \theta \in \Theta\}$ is tight. By Prohorov's Theorem [Billingsley (1968, Theorem 6.1)] this can be restated as follows: the family of probability measures that could conceivably be the induced measure on second subperiod prices has compact closure in $\mathcal{M}(\Delta)$. A corollary is that there is a compact, convex set $\mathcal{M}^* \subset \mathcal{M}(\Delta^2)$ such that $\beta_\theta(\mathcal{M}^*) \subset \mathcal{M}^*$ for all $\theta \in \Theta$. By the Schauder Fixed Point Theorem the restriction of $\beta_\theta$ to $\mathcal{M}^*$ has a fixed point. Accordingly, the results in the main part of the paper can be interpreted by identifying each function $f \in F$ with a function $\beta_\theta \mid \mathcal{M}^*$ for some $\theta \in \Theta$.
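The fixed point property can be illustrated on a finite outcome space, where a forecast is a probability vector and the induced map sends the simplex into itself. The Python sketch below uses a hypothetical affine map built from an arbitrary row-stochastic matrix `K` and a baseline distribution `base`; neither is taken from the paper. The Schauder theorem asserts existence only, and the simple iteration converges here solely because the illustrative map is a contraction in the l1 norm.

```python
import numpy as np

# Hypothetical ingredients: a row-stochastic matrix and a baseline distribution.
K = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])
base = np.array([0.2, 0.5, 0.3])

def induced_map(m, weight=0.5):
    # Maps a forecast m (a probability vector) to an outcome distribution.
    # The convex combination keeps the image inside the simplex.
    return (1.0 - weight) * base + weight * (m @ K)

def fixed_point(f, m0, tol=1e-12, max_iter=10_000):
    # Successive approximation; valid here because f is an l1-contraction.
    m = m0
    for _ in range(max_iter):
        nxt = f(m)
        if np.abs(nxt - m).sum() < tol:
            return nxt
        m = nxt
    raise RuntimeError("iteration did not converge")

m_star = fixed_point(induced_map, np.ones(3) / 3.0)
```

A forecast `m_star` with `induced_map(m_star)` equal to `m_star` is self-fulfilling in the sense of the text: the distribution of outcomes it induces coincides with the forecast itself.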
References
Allen, B., 1981, Generic existence of completely revealing equilibria for economies with uncertainty when prices convey information, Econometrica 49, 1173-1199.
Anderson, R. and H. Sonnenschein, 1982, On the existence of rational expectations equilibrium, Journal of Economic Theory 26, 261-278.
Arrow, K. and J.R. Green, 1973, Notes on expectations equilibria in Bayesian settings, IMMS working paper no. 33 (Stanford University, Stanford, CA).
Ash, R., 1972, Real analysis and probability (Academic Press, New York).
Billingsley, P., 1968, Convergence of probability measures (Wiley, New York).
Billingsley, P., 1979, Probability and measure (Wiley, New York).
Blackwell, D. and L. Dubins, 1962, Merging of opinions with increasing information, Annals of Mathematical Statistics 33, 882-886.
Blackwell, D. and L. Dubins, 1975, On existence and non-existence of proper, regular, conditional distributions, Annals of Probability 3, 741-752.
Blanchard, O., 1976, The non-transition of rational expectations, Unpublished manuscript (Department of Economics, MIT, Cambridge, MA).
Blume, L.E. and D. Easley, 1982, Learning to be rational, Journal of Economic Theory 26, 340-351.
Blume, L.E. and D. Easley, 1984, Rational expectations equilibrium: An alternative approach, Journal of Economic Theory 34, 116-129.
Blume, L.E., M.M. Bray and D. Easley, 1982, Introduction to the stability of rational expectations equilibrium, Journal of Economic Theory 26, 313-317.
Bray, M., 1982, Learning, estimation and the stability of rational expectations, Journal of Economic Theory 26, 318-339.
Bray, M. and D. Kreps, 1981, Rational learning and rational expectations, Mimeo. (Stanford University, Stanford, CA).
Cyert, R. and M. DeGroot, 1974, Rational expectations and Bayesian analysis, Journal of Political Economy 82, 521-536.
DeCanio, S., 1979, Rational expectations and learning from experience, Quarterly Journal of Economics 92, 45-57.
Dubins, L.E. and D. Freedman, 1965, A sharper form of the Borel-Cantelli Lemma and the strong law, Annals of Mathematical Statistics 36, 800-807.
Dunford, N. and J.T. Schwartz, 1957, Linear operators part I (Interscience Publishers, New York).
Feldman, M., 1982, Estimation and learning in models of rational expectations, Ph.D. dissertation (Department of Economics, University of California, Davis, CA).
Feldman, M., 1985, Bayesian learning and convergence to rational expectations, Unpublished manuscript (Department of Economics, University of California, Santa Barbara, CA).
Feldman, M., 1987, An example of convergence to rational expectations with heterogeneous beliefs, International Economic Review, forthcoming.
Freedman, D., 1973, Another note on the Borel-Cantelli Lemma and the strong law, with the Poisson approximation as a by-product, The Annals of Probability 1, 910-925.
Grandmont, J.M. and W. Hildenbrand, 1974, Stochastic processes of temporary equilibrium, Journal of Mathematical Economics 1, 247-277.
Harsanyi, J., 1967-1968, Games with incomplete information played by Bayesian players, Parts I, II, and III, Management Science 14, 159-182, 320-334, 486-502.
Hildenbrand, W., 1974, Core and equilibria of a large economy (Princeton University Press, Princeton, NJ).
Jordan, J.S., 1980, On the predictability of economic events, Econometrica 48, 955-972.
Jordan, J.S., 1982, Admissible market data structures: A complete characterization, Journal of Economic Theory 28, 19-31.
Jordan, J.S. and R. Radner, 1982, Rational expectations in microeconomic models: An overview, Journal of Economic Theory 26.
Kuratowski, K., 1966, Topology Volume I (Academic Press, New York).
Kuratowski, K., 1968, Topology Volume II (Academic Press, New York).
Lewis, G., 1981, The Phillips curve and Bayesian learning, Journal of Economic Theory 24, 240-264.
Marcet, A. and T. Sargent, 1985, Convergence of least squares learning mechanisms in self-referential linear stochastic models.
Mas-Colell, A., 1967, The theory of general economic equilibrium: A differentiable approach (Cambridge University Press, Cambridge).
Parthasarathy, K., 1967, Probability measures on metric spaces (Academic Press, New York).
Royden, H., 1968, Real analysis (Macmillan, New York).
Schal, M., 1975, On dynamic programming: Compactness of the space of policies, Stochastic Processes and their Applications 3, 345-364.
Smart, D., 1974, Fixed point theorems (Cambridge University Press, Cambridge).
Townsend, R., 1978, Market anticipations, rational expectations and Bayesian analysis, International Economic Review 19, 481-494.