Journal of Econometrics 17 (1981) 287-304. North-Holland Publishing Company

LATENT VARIABLE MODELS FOR TIME SERIES
A Frequency Domain Approach with an Application to the Permanent Income Hypothesis*

John F. GEWEKE
University of Wisconsin, Madison, WI 53706, USA

Kenneth J. SINGLETON
Carnegie-Mellon University, Pittsburgh, PA 15213, USA

Received April 1981, final version received August 1981
The theory of estimation and inference in a very general class of latent variable models for time series is developed by showing that the distribution theory for the finite Fourier transform of the observable variables in latent variable models for time series is isomorphic to that for the observable variables themselves in classical latent variable models. This implies that analytic work on classical latent variable models can be adapted to latent variable models for time series, an implication which is illustrated here in the context of a general canonical form. To provide an empirical example a latent variable model for permanent income is developed, its parameters are shown to be identified, and a variety of restrictions on these parameters implied by the permanent income hypothesis are tested.
1. Introduction

Recent developments in the theory of aggregate economic behavior which incorporate uncertainty and imperfect information have re-emphasized the importance of variables which are unobservable but clearly have an economic interpretation. Perhaps the single best example of this emphasis on latent variables in the theoretical literature is Lucas (1975), but the theme emerges in other papers of his (1972, 1973) and those of Sargent (1973, 1976), as well.¹ While this particular emphasis is recent, unobservable time series have always been important constructs in macroeconomic theory; e.g., the real rate of interest and price expectations in the theory of interest rates, and permanent income in the theory of consumer behavior.² That these unobservables have not been treated explicitly as latent variables in the associated empirical literature may be ascribed to the statistical problems which arise in latent variable models for time series. The serial correlation which characterizes time series, but not cross-sectional data, introduces non-trivial problems of estimation and identification [Anderson (1963)]. The most obvious difficulty is that one may not proceed under the usual assumption that the observables are an independent, identically distributed sample from an underlying population. More subtle problems arise when the influence of latent variables on observables, or vice versa, is not strictly contemporaneous, and when latent variables themselves are serially correlated.

* We are indebted to Arthur Goldberger and an anonymous referee for comments on earlier drafts, to Mostafa Baladi and Steve Symansky for assistance with the computations, and to the National Science Foundation for support through grant SOC 76-24428. Any remaining errors are our own.
¹ An application of the methods described in this paper, inspired by this literature, is provided by Sargent and Sims (1977).
² Singleton (1977) has treated the real rate of interest and price expectations explicitly as latent variables using the general approach set forth here.

In this paper we discuss a very general class of latent variable models for time series. We show how the complications peculiar to time series may be removed and how the estimation techniques and the distribution theory associated with cross-sectional applications may be adapted to time series.

The basic idea of the paper is simple. If the time series in question are a sample from the realization of a jointly stationary stochastic process, then the asymptotic joint distribution of the finite Fourier transform ordinates of the series is that of independent, complex normal random variables. The asymptotic distributions of all the ordinates are not identical, but those of a given number of ordinates closest to a prespecified frequency are, and the variance of each ordinate in the latter group is the spectral density of the time series at the prespecified frequency. Because of this property the distribution theory for the finite Fourier transform of the observable variables in latent variable models for time series is isomorphic to that for the observable variables themselves in classical latent variable models.
This isomorphism was exploited in earlier work [Geweke (1975, 1977), Sargent and Sims (1977)] to allow estimation and a test of general specification in the exploratory dynamic factor model. The contribution of the present paper is to demonstrate that the isomorphism extends to a very general canonical form, and to show how the analytical results for classical latent variable models may be applied directly to latent variable models for time series in this canonical form. In the process of this extension we develop the asymptotic distribution theory required for tests of hypotheses about specific parameters in a given model. We also illustrate how the time paths of the latent variables themselves may be estimated. The rest of the paper is organized as follows. In section 2 the conditions for the asymptotic normality of the finite Fourier transform of a covariance stationary time series are set forth, and the relation of maximum likelihood estimation in complex normal and real normal models is discussed. It is shown that certain numerical algorithms may be applied to the optimization problems arising in both cases. These results are the basis of a solution of the inference problem for latent variable models for time series and test statistics of the overall specification of these models proposed in section 3. Adaptation of the analytic work on latent variable models for cross-sections to our problems is illustrated in the context of a general canonical form. To provide an empirical example, we develop a latent
variable model for permanent income in section 4. It is shown that the behavioral parameters of the model are identified and that the permanent income hypothesis implies several testable restrictions on the model. The results of these tests, discussed in section 5, support some implications of the permanent income hypothesis but reject others.
2. Solution of a formal problem

Suppose that $z_1, \ldots, z_T$ is part of the realization of a stationary $r \times 1$ time series with Cramér representation

$$z_t = \int_{-\pi}^{\pi} \exp(it\lambda)\, dZ(\lambda),$$

and Wold representation

$$z_t = \sum_{s=0}^{\infty} A_s \varepsilon_{t-s}, \qquad \sum_{s=0}^{\infty} \|A_s\|^2 < \infty, \tag{1}$$

where $\|A_s\|^2$ denotes the largest eigenvalue of $A_s' A_s$, and the $\varepsilon_t$ are serially uncorrelated, $r \times 1$ random vectors with $E(\varepsilon_t) = 0$, $E(\varepsilon_t \varepsilon_t') = I$. The vector $dZ(\lambda)$ is a complex, stochastic differential with mean 0, orthogonal increments, and variance equal to the spectral density of $\{z_t\}$: $E\, dZ(\lambda)\, dZ(\lambda)' = S_z(\lambda)$.³ We further assume that the best and the best linear predictors of $z_t$, conditional on the history of $z_t$, are the same; this will be the case, for example, if the $\varepsilon_t$ in (1) are independent and identically distributed.

³ For matrices or vectors which are complex, transposition includes conjugation: i.e., if the element in the $i$th row and $j$th column of $A$ is denoted $a_{ij}$, the element in the $i$th row and $j$th column of $A'$ is $\bar{a}_{ji}$, where $\bar{a}_{ji}$ denotes the complex conjugate of $a_{ji}$.

The finite Fourier transform of $\{z_t\}_{t=1}^{T}$ is the $r \times 1$ vector

$$w(j, T) = (2\pi T)^{-1/2} \sum_{t=1}^{T} z_t \exp(it\lambda_j(T)), \qquad j = 1, \ldots, T,$$

where $\lambda_j(T) = 2\pi j/T$. For samples of the size likely to be encountered in economic applications the sequence $\{w(j, T)\}_{j=1}^{T}$ may be computed very accurately and at insignificant cost using the fast Fourier transform numerical algorithm [Cooley and Tukey (1965)]. Let $\lambda$ be a fixed frequency and $l$ any chosen positive integer. Consider the set of $l$ vectors

$$w(j_T + 1, T), \ldots, w(j_T + l, T), \tag{2}$$

with $j_T$ chosen to minimize $\sum_{i=1}^{l} |2\pi(j_T + i) - \lambda T|$. The joint distribution of the $l$ vectors in (2) converges to that of $l$ independent vectors, each of which has a complex normal distribution with zero mean and covariance matrix $S_z(\lambda)$, unless $\lambda = 0$ or $\lambda = \pi$, in which case the limiting distribution is ordinary normal with zero mean and variance matrix $S_z(0)$ or $S_z(\pi)$ [Hannan (1973)]. For $\lambda \neq 0$, $\lambda \neq \pi$, the limiting distribution is

$$\prod_{i=1}^{l} f(w_i; S_z(\lambda)) \propto |S_z(\lambda)|^{-l} \exp\Big(-\sum_{i=1}^{l} w_i' S_z(\lambda)^{-1} w_i\Big), \tag{3}$$

and the estimator

$$\hat{S}_z(\lambda) = l^{-1} \sum_{i=1}^{l} w_i w_i'$$

is an unconstrained, quasi-maximum likelihood estimator for $S_z(\lambda)$ based on the set of vectors $w_i = w(j_T + i, T)$, $i = 1, \ldots, l$. In view of the limiting distribution (3) for this set of vectors, the difference between $\hat{S}_z(\lambda)$ and the true (but unknown) maximum likelihood estimator converges to zero almost surely.

Most latent variable models for time series place restrictions on the spectral density matrix of the observed variables, just as latent variable models restrict covariance matrices in cross-section applications, and hypotheses regarding certain parameters of those models may further restrict the spectral density matrix. It is useful to distinguish between two types of restrictions which may arise. In the first case $S_z(\lambda)$ is constrained to be a function of a $k_\lambda$-element complex vector $\theta_\lambda$, that is $S_z(\lambda) = g_\lambda(\theta_\lambda)$, but no constraints are placed on the relation of the vectors $\theta_\lambda$ corresponding to different frequencies. [A simple example of this type of constraint is the restriction that the observed variables are uncorrelated at all leads and lags, implying that $S_z(\lambda)$ is a diagonal matrix at all frequencies.] Restrictions of this type do not constrain the 'spectral shape' of $\{z_t\}$, and correspond to restrictions in the time domain which do not constrain the autocovariance functions of the components of $\{z_t\}$ or lead-lag relations between these components. The second class of restrictions extends the first by allowing cross-frequency constraints on the $\theta_\lambda$. [A simple example of this type of constraint is the restriction that the observed variables are correlated only contemporaneously, implying that the off-diagonal elements of $S_z(\lambda)$ are real and the same at all frequencies.] Restrictions of this type arise whenever autocovariance functions of individual variables or lead-lag relations between them are restricted in the time domain. In general, computation of maximum likelihood estimates of the $\theta_\lambda$ must then proceed jointly at all frequencies of interest.
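The finite Fourier transform, the choice of the $l$ ordinates in (2), and the unconstrained estimator $\hat{S}_z(\lambda)$ described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the authors' program: the function names and the toy data are hypothetical, the transform is summed directly instead of through a fast Fourier transform, and the positive sign in the exponent follows the definition of $w(j, T)$ as printed.

```python
import cmath
import math


def finite_fourier_transform(z, j):
    """w(j, T) = (2*pi*T)**-0.5 * sum_{t=1..T} z_t * exp(i*t*lambda_j), lambda_j = 2*pi*j/T.

    z is a list of T observations, each a list of r real numbers.
    """
    T, r = len(z), len(z[0])
    lam_j = 2.0 * math.pi * j / T
    scale = 1.0 / math.sqrt(2.0 * math.pi * T)
    return [
        scale * sum(z[t - 1][k] * cmath.exp(1j * t * lam_j) for t in range(1, T + 1))
        for k in range(r)
    ]


def choose_j_T(T, lam, l):
    """Pick j_T so that ordinates j_T+1, ..., j_T+l minimize sum_i |2*pi*(j_T+i) - lam*T|."""
    def distance(j_T):
        return sum(abs(2.0 * math.pi * (j_T + i) - lam * T) for i in range(1, l + 1))
    return min(range(0, T - l + 1), key=distance)


def band_spectral_estimate(z, lam, l):
    """Unconstrained estimate of S_z(lam): the average of w_i w_i' over the chosen band."""
    T, r = len(z), len(z[0])
    j_T = choose_j_T(T, lam, l)
    ws = [finite_fourier_transform(z, j_T + i) for i in range(1, l + 1)]
    # The (a, b) element of w w' is w_a * conj(w_b), because ' includes conjugation.
    return [
        [sum(w[a] * w[b].conjugate() for w in ws) / l for b in range(r)]
        for a in range(r)
    ]


# Hypothetical toy data: T = 64 observations on r = 2 series.
z = [[math.cos(0.5 * t), math.cos(0.5 * t) + math.sin(0.3 * t)] for t in range(1, 65)]
S_hat = band_spectral_estimate(z, math.pi / 2, 5)
```

By construction the estimate is Hermitian with a real, non-negative diagonal, which is the complex counterpart of a sample covariance matrix.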
If the number of frequencies is even moderately large, then the computations required may be rather expensive. However, given estimates of $g_\lambda(\theta_\lambda)$ at several frequencies and their estimated second moments, cross-frequency constraints on the $\theta_\lambda$ are easy to test (because the $\hat{\theta}_\lambda$ are asymptotically independent) if not to impose. On the basis of these tests it may be decided whether or not to impose cross-frequency restrictions. An example of this procedure is provided in section 5.

When there are no cross-frequency constraints on the $\theta_\lambda$, computation of the maximum likelihood estimates $\hat{\theta}_\lambda$ may proceed frequency by frequency as follows. For notational convenience we dispense with the fixed argument $\lambda$, write $S_z = g(\theta)$, and express the likelihood function for the sample (2) as

$$L(\theta; w_1, \ldots, w_l) = \prod_{i=1}^{l} f(w_i; \theta). \tag{4}$$
Let $\theta^1$ and $\theta^2$ denote the real and imaginary parts of $\theta$, and $w_i^1$ and $w_i^2$ denote the real and imaginary parts of $w_i$, $i = 1, \ldots, l$, respectively. Then we can treat $L$ as a function $L^*$ of the $2k$ real parameters $(\theta^1, \theta^2)$ and the realizations of the $2rl$ real random variables $(w_1^1, \ldots, w_l^2)$,

$$L^*(\theta^1, \theta^2; w_1^1, w_1^2, \ldots, w_l^1, w_l^2) = L(\theta; w_1, \ldots, w_l). \tag{5}$$

If $L^*$ satisfies the usual regularity conditions, then the maximum likelihood estimates of $\theta^1$ and $\theta^2$ are among the solutions of the $2k$ equation system

$$\partial L^*/\partial\theta^1 = 0, \qquad \partial L^*/\partial\theta^2 = 0. \tag{6}$$

These first-order conditions can be simplified by defining the differential operator $\partial/\partial\theta = \tfrac{1}{2}(\partial/\partial\theta^1 + i\,\partial/\partial\theta^2)$. Conditions (6) are then equivalent to $\partial L^*/\partial\theta = 0$, which by virtue of (5) may be written

$$\partial L/\partial\theta = 0. \tag{7}$$
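A small numerical check may help fix ideas about the complex operator. The sketch below assumes the operator is $\tfrac{1}{2}(\partial/\partial\theta^1 + i\,\partial/\partial\theta^2)$ and applies it, by central differences, to a toy real-valued objective; the objective, the point `c`, and the function names are hypothetical and are not taken from the paper.

```python
def complex_gradient(L_star, theta, eps=1e-6):
    """Apply 0.5 * (d/d theta1 + i * d/d theta2) to a real function of (theta1, theta2)."""
    t1, t2 = theta.real, theta.imag
    d1 = (L_star(t1 + eps, t2) - L_star(t1 - eps, t2)) / (2 * eps)
    d2 = (L_star(t1, t2 + eps) - L_star(t1, t2 - eps)) / (2 * eps)
    return 0.5 * (d1 + 1j * d2)


c = 1.5 - 0.5j  # hypothetical location of the maximum


def L_star(t1, t2):
    # Toy concave objective -|theta - c|^2, written in its two real arguments.
    return -((t1 - c.real) ** 2 + (t2 - c.imag) ** 2)


grad_away = complex_gradient(L_star, 2.0 + 0.0j)  # analytically equal to -(theta - c)
grad_at_max = complex_gradient(L_star, c)         # both conditions in (6) hold here
```

The single complex condition vanishes exactly where the two real conditions in (6) vanish, which is the sense in which (6) and (7) are equivalent.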
The differential operator $\partial/\partial\theta$ satisfies the formal rules of differentiation and the chain rule [Nehari (1968, pp. 18-23)]. This formal similarity of $\partial/\partial\theta$ and ordinary differentiation has two important practical implications. First, in many instances $\partial L/\partial\theta$ may be obtained from the literature on maximum likelihood estimation in models with real random variables. If first-order conditions have been obtained for the problem in which the elements of $w_1, \ldots, w_l$ and $\theta$ are real and the likelihood function is (4), then (7) is given by these first-order conditions with $\theta$ and the $w_i$ read as complex numbers. Second, to maximize the likelihood function (4) a steepest descent numerical algorithm may be applied. In the steepest descent algorithm for determination of a local maximum of a real function $f$ of a vector of real arguments $x$, the sequence of estimates

$$x^{(j+1)} = x^{(j)} + h \cdot \partial f/\partial x \,\big|_{(x = x^{(j)})}$$

is formed. The parameter $h$ is a step size which can be determined by a variety of methods [Hamming (1973, pp. 668-669)], while $\partial f/\partial x\,|_{(x = x^{(j)})}$ is the analytic expression for the first derivative of $f$ with respect to $x$, evaluated at the preceding estimate $x^{(j)}$. In the case of the objective function $L^*$ the steepest descent algorithm will use the sequence of estimates

$$\theta^{1(j+1)} = \theta^{1(j)} + h \cdot \partial L^*/\partial\theta^1 \,\big|_{(\theta^1 = \theta^{1(j)},\ \theta^2 = \theta^{2(j)})},$$

$$\theta^{2(j+1)} = \theta^{2(j)} + h \cdot \partial L^*/\partial\theta^2 \,\big|_{(\theta^1 = \theta^{1(j)},\ \theta^2 = \theta^{2(j)})}.$$

From our definition of the differential operator $\partial/\partial\theta$, it is apparent that this sequence is equivalent to

$$\theta^{(j+1)} = \theta^{(j)} + h \cdot \partial L/\partial\theta \,\big|_{(\theta = \theta^{(j)})}.$$

Consequently, given the objective function and the analytic first derivatives for the real analogue of our maximum likelihood problem, all that is required is the reprogramming of the relevant steepest descent computer program in complex arithmetic.

Once the steepest descent algorithm has been used to maximize $L^*$ the information matrix can be approximated by evaluating (5) in a neighborhood of the values of $\theta^1$ and $\theta^2$ which maximize $L^*$. On the basis of this approximation confidence regions may be constructed and hypotheses tested in the usual way. Furthermore, evaluation of the likelihood function at $\hat{S}_z$ and at $g(\hat{\theta})$ provides a basis for testing the null hypothesis that $S_z = g(\theta)$. Since estimates and test statistics for any $J$ distinct frequencies $\lambda_1, \ldots, \lambda_J$ are asymptotically independent, a test of the joint null hypothesis $S_z(\lambda_j) = g_{\lambda_j}(\theta_{\lambda_j})$, $j = 1, \ldots, J$, is also straightforward.
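The claim that the real-parameter iteration can simply be rerun in complex arithmetic can be illustrated on a toy problem. The sketch below is not the authors' program: the objective $-|\theta - c|^2$, the point `c`, and the step sizes are hypothetical, and the factor $\tfrac{1}{2}$ in the assumed definition of $\partial/\partial\theta$ is absorbed into the step size of the complex iteration.

```python
def ascent_real(grad1, grad2, t1, t2, h, steps):
    """Gradient steps on the 2k real parameters (here k = 1)."""
    for _ in range(steps):
        t1, t2 = t1 + h * grad1(t1, t2), t2 + h * grad2(t1, t2)
    return t1, t2


def ascent_complex(grad, theta, h, steps):
    """The same iteration written once, in complex arithmetic."""
    for _ in range(steps):
        theta = theta + h * grad(theta)
    return theta


c = 1.5 - 0.5j  # hypothetical maximizer of the toy objective -|theta - c|^2


def grad1(t1, t2):
    return -2.0 * (t1 - c.real)  # derivative with respect to the real part


def grad2(t1, t2):
    return -2.0 * (t2 - c.imag)  # derivative with respect to the imaginary part


def grad(theta):
    return -(theta - c)  # 0.5 * (grad1 + i * grad2), written directly in theta


t1, t2 = ascent_real(grad1, grad2, 0.0, 0.0, h=0.1, steps=200)
theta = ascent_complex(grad, 0j, h=0.2, steps=200)  # step doubled to absorb the 1/2
```

Both iterations contract toward `c` at the same rate, so the real pair `(t1, t2)` and the complex iterate `theta` trace the same path.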
3. Latent variable models for time series

Most latent variable models for longitudinal data have the form

$$z_i = B\varepsilon_i, \qquad E(\varepsilon_i) = 0, \qquad \mathrm{var}(\varepsilon_i) = C, \qquad i = 1, \ldots, N. \tag{8}$$

The $r \times 1$ vectors $z_1, \ldots, z_N$ are observed, and $\varepsilon_1, \ldots, \varepsilon_N$ are $p \times 1$ vectors of latent variables for which $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$ if $i \neq j$. The $r \times p$ matrix $B$ and $p \times p$ matrix $C$ are functions of a vector of $k$ unknown real parameters $\theta$,

$$B = B(\theta), \qquad C = C(\theta). \tag{9}$$

Under the usual assumption that the $z_i$ are normally distributed the likelihood function is

$$L(\theta; z_1, \ldots, z_N) = (2\pi)^{-Nr/2}\, |\Sigma|^{-N/2} \exp\Big(-\tfrac{1}{2} \sum_{i=1}^{N} z_i' \Sigma^{-1} z_i\Big), \tag{10}$$

$$\Sigma = B(\theta)\, C(\theta)\, B(\theta)'. \tag{11}$$

The analogue of (8) for time series is

$$z_t = B(L)\varepsilon_t, \qquad E(\varepsilon_t) = 0, \qquad E(\varepsilon_t \varepsilon_{t-s}') = R_\varepsilon(s), \qquad t = 1, \ldots, T. \tag{12}$$
In (12) the subscript $t$ denotes time. The $r \times 1$ vectors $z_1, \ldots, z_T$ are observed, and the $\varepsilon_t$ are covariance stationary, $p \times 1$ vectors of latent variables whose autocovariance function is given by the $p \times p$ matrices $R_\varepsilon(s)$, $s = 0, \pm 1, \pm 2, \ldots$. The operator $B(L)$ is an $r \times p$ matrix of polynomials of infinite order in powers of the lag operator $L$, which is defined by $L^s u_t = u_{t-s}$ for any time series $\{u_t\}$. It has an explicit expansion $B(L) = \sum_{s=-\infty}^{\infty} B_s L^s$ and corresponding generating function $B(v) = \sum_{s=-\infty}^{\infty} B_s v^s$, where $v$ is any complex number. If the process $\{z_t\}$ satisfies the conditions of section 2 and $B(v)$ is analytic in an open annulus of the unit circle, then the limiting distribution of the set of complex vectors in (2) is complex normal with mean zero and covariance matrix $\tilde{B}(\lambda) S_\varepsilon(\lambda) \tilde{B}(\lambda)'$. [The $r \times p$, complex matrix $\tilde{B}(\lambda)$ is the Fourier transform of $B(L)$, $S_\varepsilon(\lambda)$ is the spectral density of $\varepsilon_t$, $\tilde{B}(\lambda) = \sum_{s=-\infty}^{\infty} B_s e^{-i\lambda s}$, and $S_\varepsilon(\lambda) = \sum_{s=-\infty}^{\infty} R_\varepsilon(s) e^{-i\lambda s}$.] Under these assumptions, the likelihood function for the sample $w_i$, $i = 1, \ldots, l$, is (3) with

$$S_z(\lambda) = \tilde{B}(\lambda)\, S_\varepsilon(\lambda)\, \tilde{B}(\lambda)'. \tag{13}$$

A comparison of (10)-(11) with (4) and (13) reveals the formal similarity between the estimation and inference problems for latent variable models of longitudinal and time series data. The decomposition of the spectral density matrix (13) represents a complex version of the decomposition of the covariance matrix (11). This observation, together with the discussion in section 2, implies that steepest descent algorithms previously developed for special cases of (8) [e.g., Joreskog (1969)] can be adapted to estimate the corresponding special cases of (12). To illustrate the process of identification, the imposition of identifying and overidentifying constraints, and inference, we turn to a simple application of a latent variable model for time series.
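The decomposition (13) can be made concrete with a small numerical example. The sketch below assumes the sign convention $\tilde{B}(\lambda) = \sum_s B_s e^{-i\lambda s}$ and uses a hypothetical model with $r = 2$ observables, $p = 1$ latent variable, and $B(L) = B_0 + B_1 L$; the helper names and the numbers are illustrative only.

```python
import cmath


def fourier_transform_of_lag_polynomial(B_coeffs, lam):
    """B~(lam) = sum_s B_s * exp(-i*lam*s); B_coeffs maps lag s to an r x p matrix."""
    first = next(iter(B_coeffs.values()))
    r, p = len(first), len(first[0])
    return [
        [sum(B[a][b] * cmath.exp(-1j * lam * s) for s, B in B_coeffs.items())
         for b in range(p)]
        for a in range(r)
    ]


def matmul(X, Y):
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def conj_transpose(X):
    # Transposition of a complex matrix includes conjugation.
    return [[X[i][j].conjugate() for i in range(len(X))] for j in range(len(X[0]))]


def spectral_density_of_observables(B_coeffs, S_eps, lam):
    """S_z(lam) = B~(lam) S_eps(lam) B~(lam)' as in (13)."""
    B_tilde = fourier_transform_of_lag_polynomial(B_coeffs, lam)
    return matmul(matmul(B_tilde, S_eps), conj_transpose(B_tilde))


# Hypothetical example: B_0 = (1, 0.5)', B_1 = (0, 0.25)', S_eps = 2 at this frequency.
B_coeffs = {0: [[1.0], [0.5]], 1: [[0.0], [0.25]]}
S_eps = [[2.0]]
S_z = spectral_density_of_observables(B_coeffs, S_eps, lam=cmath.pi / 2)
```

With a single latent variable the result has rank one, and the lagged loading shows up as a non-zero imaginary part in the off-diagonal element, which is what distinguishes (13) from its real counterpart (11).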
4. A latent variable model for permanent income

In his celebrated reconciliation of consumption functions estimated with cross-section data, short time series, and long time series, Milton Friedman (1957) proposed the model

$$c_t = \beta y_{pt} + u_t \tag{14}$$

for time series. In (14) $y_{pt}$ is aggregate permanent income at time $t$, $c_t$ is aggregate consumption, $\beta$ is the marginal propensity to consume out of permanent income, and $u_t$ is a disturbance term with mean zero, finite variance, and $E(u_t) = E(u_t y_{ps}) = 0$ for all integers $t$ and $s$. Although aggregate consumption is observed, permanent income is not. In order to estimate $\beta$, Friedman assumed that $y_{pt}$ is a geometrically declining distributed lag in $y_t$.

We shall briefly consider how the behavior posited in (14) might arise. In a single consumption good economy suppose that household $i$ determines its consumption $c_{it}$ of that good in period $t$ based on its net wealth, an appropriately discounted stream of future incomes $y_{it}^*$, preferences over time, and the classical intertemporal optimization process described by Fisher (1930). As is often done in estimates of the aggregate consumption function, ignore net wealth and the dependence on the rate of interest of the marginal propensity to consume out of the discounted stream of future incomes. The household's consumption decision problem then reduces to estimating its discounted stream of future incomes. If the estimates are rational expectations [as defined by Muth (1961)], then $c_{it}$ is a function of $E[y_{it}^* \mid \Phi_{it}]$, where $\Phi_{it}$ is a $p \times 1$ vector which comprises the $i$th household's information set at time $t$.

A variety of assumptions about functional forms will allow the aggregation of individual household decisions into (14). For purposes of illustration we shall assume that $c_{it}$ is a linear function of $E[y_{it}^* \mid \Phi_{it}]$ and a random variable independent of $E[y_{it}^* \mid \Phi_{it}]$, that $E[y_{it}^* \mid \Phi_{it}]$ is linear in the constituents of $\Phi_{it}$, that $\beta$ is the same for all households, and that either $\Phi_{it}$ is the same for all households or $\Phi_{it}$ differs across households but is used in the same way by all households, so that $E[y_{it}^* \mid \Phi_{it}]$ is the same function of $\Phi_{it}$ for all $i$. Just how one will estimate (14) depends on the nature of $\Phi_{it}$ and the joint distribution of the $y_t$ and $\Phi_{it}$.
If $\Phi_{it}$ includes only past aggregate income and $E[y_{it}^* \mid y_{t-1}, y_{t-2}, \ldots] = \sum_{s=1}^{\infty} \alpha_{is} y_{t-s}$, or if $\Phi_{it}$ includes only past household income and $E[y_{it}^* \mid y_{i,t-1}, y_{i,t-2}, \ldots] = \sum_{s=1}^{\infty} \gamma_s y_{i,t-s}$, then we have

$$c_t = \beta \sum_{s=1}^{\infty} \delta_s y_{t-s} + u_t, \tag{15}$$

which is (14) with $y_{pt} = \sum_i E[y_{it}^* \mid \Phi_{it}] = \sum_{s=1}^{\infty} \delta_s y_{t-s}$; $\delta_s = \sum_i \alpha_{is}$ in the first case and $\delta_s = \gamma_s$ in the second. In the first case,

$$y_t^* = \sum_i y_{it}^* = \sum_{s=1}^{\infty} \delta_s y_{t-s} + v_t \tag{16}$$

is a regression equation, i.e., $E(v_t) = E(v_t y_{t-s}) = 0$ for all $t$ and all $s > 0$; in the second case it is a regression equation if $y_{it}^* - E[y_{it}^* \mid \Phi_{it}]$ is uncorrelated with $y_{j,t-s}$ for all $j \neq i$ and $s > 0$. If the relation between $y_t^*$ and $y_t, y_{t+1}, \ldots$ is assumed known (typically $y_t^*$ is assumed equal to a sum of discounted future incomes), then (15) and (16) may be estimated jointly, with the appearance of the $\delta_s$ in both equations overidentifying $\beta$ in (15). In the context of (15) and (16) Friedman's assumption that $y_{pt}$ is a geometrically declining distributed lag in $y_t$ amounts to asserting a priori knowledge of the $\delta_s$.

In the present illustration we shall assume that the stream of future incomes of each household has two components which are uncorrelated at all leads and lags,

$$y_{it} = z_{it} + v_{it}, \qquad \mathrm{cov}(z_{it}, v_{is}) = 0 \quad \text{for all } t \text{ and } s. \tag{17}$$
Each household's information set allows perfect predictions of its permanent income, $z_{it}$. Transitory income, $v_{it}$, is independent of $\Phi_{i,t-s}$, $s > 0$, and so cannot be forecast. These assumptions imply that $v_{it}$ is serially uncorrelated. If permanent income could not be predicted perfectly and consumption depended on its expected value, then transitory income would become $v_{it} + z_{it} - E[z_{it} \mid \Phi_{it}]$, which in general is serially correlated. We make no assumptions about the actual constituents of any household's information set.

Each household is assumed to consume the services of several commodities. Let $c_{jit}$ denote the $i$th household's expenditures for the consumption of the services of commodity $j$ at time $t$, and suppose that households determine consumption using the same Cobb-Douglas utility function and a permanent income budget constraint. If consumption services could be measured perfectly and services were purchased at the time of consumption, then it would be the case that $c_{jit} = \beta_j z_{it}$. Because expenditures do not coincide with consumption of services and these expenditures themselves are not perfectly measured, we in fact have

$$c_{jit} = \beta_j z_{it} + u_{jit}. \tag{18}$$

Summing (18) across all households,

$$c_{jt} = \beta_j z_t + u_{jt}. \tag{19}$$

Similarly summing (17),

$$y_t = z_t + v_t. \tag{20}$$

We shall assume that the discrepancy between the expenditures on consumption goods and the value of consumption services consumed is such that $\mathrm{cov}(u_{jt}, u_{ks}) = 0$ whenever $j \neq k$, although each $u_{jt}$ will in general be serially correlated.
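To see what (19) and (20) imply for observable second moments, consider the static special case with no lags and with $z_t$, $v_t$ and the $u_{jt}$ mutually uncorrelated. The sketch below builds the implied covariance matrix of $(c_{1t}, \ldots, c_{Nt}, y_t)$ and recovers $\beta_1$ and the variance of $z_t$ from off-diagonal moments alone. The function name and all numerical values are hypothetical, and the contemporaneous setting is a simplification adopted here for illustration.

```python
def implied_covariance(betas, var_z, var_u, var_v):
    """Covariance matrix of (c_1, ..., c_N, y) when c_j = beta_j * z + u_j and y = z + v,
    with z, v and the u_j mutually uncorrelated (static case, no lags)."""
    loadings = list(betas) + [1.0]   # the income equation has loading one on z
    noise = list(var_u) + [var_v]
    n = len(loadings)
    return [
        [loadings[a] * loadings[b] * var_z + (noise[a] if a == b else 0.0)
         for b in range(n)]
        for a in range(n)
    ]


# Hypothetical population values for N = 2 consumption categories.
Sigma = implied_covariance(betas=[0.6, 0.3], var_z=4.0, var_u=[1.0, 0.5], var_v=2.0)

# Off-diagonal moments do not involve the error variances:
beta_1 = Sigma[0][1] / Sigma[2][1]               # cov(c_1, c_2) / cov(y, c_2)
var_z = Sigma[2][0] * Sigma[2][1] / Sigma[0][1]  # cov(y, c_1) * cov(y, c_2) / cov(c_1, c_2)
```

The unit loading on income is what pins down the scale of the latent variable, so at least two consumption categories alongside income are needed for these ratios to be available.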
Following Friedman's original model, the errors are also assumed to be uncorrelated with $z_t$ at all leads and lags. We shall impose the further restriction that $u_{jt}$ is uncorrelated with $v_s$ for all $t$ and $s$; since $v_t$ is aggregate transitory income, this assumption will be most appropriate when expenditures on the $j$th consumption good most nearly coincide with the value of the $j$th service consumed. For purposes of estimation and inference we shall work with the more general model

$$\begin{pmatrix} c_{1t} \\ \vdots \\ c_{Nt} \\ y_t \end{pmatrix} = \begin{pmatrix} \beta_1(L) \\ \vdots \\ \beta_N(L) \\ 1 \end{pmatrix} z_t + \begin{pmatrix} u_{1t} \\ \vdots \\ u_{Nt} \\ v_t \end{pmatrix}. \tag{21}$$

In (21) $z_t$, $v_t$, and the $u_{jt}$ are assumed to be mutually uncorrelated at all leads and lags, but each of these variables is permitted to be serially correlated. The model (21) is the very simple special case of (12)-(13) in which $r = N + 1$ and $p = N + 2$. Regarded as a statistical model, (21) is an example of the dynamic factor models for time series which have been studied by one of the authors [Geweke (1977)] and by Sargent and Sims (1977).⁴ In these earlier studies the use of the model was descriptive, whereas in the present application there is an underlying, explicit theory of economic behavior. Consequently the question of inference about behavioral parameters, which did not arise in the earlier work, is important here.

We turn first to identification. Let $\mathbf{y}_t = (c_{1t}, \ldots, c_{Nt}, y_t)'$, $\beta(L) = (\beta_1(L), \ldots, \beta_N(L), 1)'$, $\mathbf{u}_t = (u_{1t}, \ldots, u_{Nt}, v_t)'$; in this notation (21) becomes

$$\mathbf{y}_t = \beta(L) z_t + \mathbf{u}_t. \tag{22}$$
Let $S_y(\lambda)$, $S_z(\lambda)$, and $S_u(\lambda)$ be the spectral densities of $\mathbf{y}_t$, $z_t$, and $\mathbf{u}_t$, respectively, and let $\tilde{\beta}(\lambda) = (\tilde{\beta}_1(\lambda), \ldots, \tilde{\beta}_N(\lambda), 1)'$ where $\tilde{\beta}_j(\lambda) = \sum_{s=-\infty}^{\infty} \beta_{js} e^{-i\lambda s}$, $j = 1, \ldots, N$. Then from (21),

$$S_y(\lambda) = \tilde{\beta}(\lambda)\, S_z(\lambda)\, \tilde{\beta}(\lambda)' + S_u(\lambda). \tag{23}$$

In (23) the spectral density of $\mathbf{y}_t$ is the sum of a positive semidefinite matrix of rank one and a positive semidefinite diagonal matrix. Since most spectral density matrices of order three or greater cannot be decomposed this way, (23) imposes restrictions on the process $\{\mathbf{y}_t\}$. So long as $\tilde{\beta}(\lambda) \neq 0$, which we henceforth assume to be true for all frequencies $\lambda$, this decomposition is unique. The parameters $\tilde{\beta}(\lambda)$ and $S_z(\lambda)$ are identified because the last equation of (21) implies that the last element of $\tilde{\beta}(\lambda)$ is unity at all frequencies. This restriction resolves the factor rotation problem [postmultiplication of $\tilde{\beta}(\lambda)$ by a complex scalar of unit modulus] and provides a normalization which prevents $S_z(\lambda)$ from being multiplied by an arbitrary positive scalar while $\tilde{\beta}(\lambda)$ is inversely scaled. It is to be emphasized that the last equation of (21) imposes no restrictions on $S_y(\lambda)$ beyond those of (23); i.e., given (23) the further restrictions in (21) serve to exactly identify $\tilde{\beta}(\lambda)$ and $S_z(\lambda)$. In estimation, we impose the factor structure by requiring $S_y(\lambda) = \gamma(\lambda)\gamma(\lambda)' + S_u(\lambda)$ with the last element of $\gamma(\lambda)$ constrained to be real. Then $\tilde{\beta}_j(\lambda) = \gamma_j(\lambda)/\gamma_{N+1}(\lambda)$ and $S_z(\lambda) = [\gamma_{N+1}(\lambda)]^2$.

⁴ The use of several consumption goods or services as 'multiple indicators' was suggested by Goldberger (1972). Goldberger's strategy could easily be pursued using cross-sectional data, but to the best of our knowledge this paper represents the first effort to do so using either cross-sectional data or time series. Identification of the interesting parameters may also be achieved using 'multiple causes' of permanent income, and this strategy has been pursued by Attfield (1977), who used grouped cross-sectional data.

In (21) inference about the actual path of $z_t$ is possible, because $z_t^* = E[z_t \mid \mathbf{y}_{t-s}, s = 0, \pm 1, \pm 2, \ldots]$ can be estimated. For purposes of explaining how this is done, it is simpler to consider estimation of the $(N+1) \times 1$ vector $Z_t^* = E[\beta(L) z_t \mid \mathbf{y}_{t-s}, s = 0, \pm 1, \pm 2, \ldots]$, and then note that $z_t^*$ is just the last element of this vector. From (22) the cross-spectral density of $\beta(L) z_t$ and $\mathbf{y}_t$ at frequency $\lambda$ is $\tilde{\beta}(\lambda) S_z(\lambda) \tilde{\beta}(\lambda)' = \gamma(\lambda)\gamma(\lambda)'$, while the spectral density of $\mathbf{y}_t$ is given by (23). Both $\gamma(\lambda)\gamma(\lambda)'$ and $S_y(\lambda)$ are identified and consistently estimated by the procedures given in section 3 if the regularity conditions of section 2 hold. Let the $(N+1) \times (N+1)$ matrix function of $s$, $C(s)$, be the inverse Fourier transform of $\gamma(\lambda)\gamma(\lambda)'(S_y(\lambda))^{-1}$. If we add to our regularity conditions the condition that $\det(S_y(\lambda))$ is bounded uniformly away from zero, then the elements of $C(s)$ are square summable and $Z_t^* = \sum_{s=-\infty}^{\infty} C(s)\mathbf{y}_{t-s}$. Replacing $\gamma(\lambda)$ and $S_y(\lambda)$ by their respective consistent estimators $\hat{\gamma}(\lambda)$ and $\hat{S}_y(\lambda)$ and letting $\hat{C}(s)$ be the inverse Fourier transform of $\hat{\gamma}(\lambda)\hat{\gamma}(\lambda)'(\hat{S}_y(\lambda))^{-1}$, $Z_t^*$ is estimated by $\sum_{s=-\infty}^{\infty} \hat{C}(s)\mathbf{y}_{t-s}$. In practice, it is computationally efficient to form the finite Fourier transform $\tilde{y}(\lambda)$ of $\mathbf{y}_t$ and take the estimate of $Z_t^*$ to be the inverse Fourier transform of $\hat{\gamma}(\lambda)\hat{\gamma}(\lambda)'(\hat{S}_y(\lambda))^{-1}\tilde{y}(\lambda)$. Because of 'end effects' [truncation of $\hat{C}(s)$ in the first method and the implicit assumption of circularity in the second] a given $Z_t^*$ is consistently estimated only if observations are added to both ends of the sample as sample size increases. In application the $\hat{C}(s)$ usually diminish rapidly, so that for all but a few observations near the beginning and end of the sample we may entertain the idea that the $z_t^*$ have been estimated reasonably, and these estimates may have interesting interpretations in the context of the economic events of the time.
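The normalization used in estimation, a factor loading vector $\gamma(\lambda)$ whose last element is real, can be illustrated as follows. The sketch takes an arbitrary complex loading vector at one frequency, rotates it by a unit-modulus scalar so that its last element is real, and then reads off $\tilde{\beta}_j = \gamma_j/\gamma_{N+1}$ and $S_z = \gamma_{N+1}^2$. The function names and the numbers are hypothetical, and choosing the real last element to be positive is an extra convention adopted here.

```python
def normalize_loadings(gamma):
    """Rotate gamma so its last element is real and positive, then return
    (beta_tilde, S_z) with beta_j = gamma_j / gamma_last and S_z = gamma_last ** 2."""
    last = gamma[-1]
    phase = last.conjugate() / abs(last)   # unit-modulus scalar removing the phase of `last`
    rotated = [g * phase for g in gamma]
    g_last = rotated[-1].real
    beta_tilde = [g / g_last for g in rotated]
    return beta_tilde, g_last ** 2


def outer(x, scale=1.0):
    """scale * x x', with ' denoting the conjugate transpose."""
    return [[scale * a * b.conjugate() for b in x] for a in x]


gamma = [0.3 + 0.4j, 1.0 - 2.0j, 0.0 + 2.0j]  # hypothetical loadings, N = 2
beta_tilde, S_z = normalize_loadings(gamma)

rank_one_before = outer(gamma)
rank_one_after = outer(beta_tilde, scale=S_z)
max_gap = max(
    abs(rank_one_before[a][b] - rank_one_after[a][b])
    for a in range(3) for b in range(3)
)
```

The rank-one part of the spectral density is unchanged by the rotation, which is why the unit last element of $\tilde{\beta}(\lambda)$ identifies the parameters without restricting the spectral density any further.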
5. Empirical findings

In the U.S. national product accounts all personal consumption expenditures are classified as expenditures on durable goods, non-durable goods, and services. Certain categories of expenditures are identified within each group: autos and auto parts, and furniture and household equipment in durables; clothes and shoes, food and beverages, and gasoline and oil in non-durables; and household operation, housing, and transportation in services. Of these categories, clothes and shoes, food and beverages, and transportation were chosen for the present illustration. To the extent that durable goods expenditures represent saving, they will be correlated with transitory income $v_t$, violating the assumptions of our model. Housing and household operation were excluded because of substantial measurement error, in particular because housing operations and services are often not traded in the market. Since expenditures on gasoline and oil are a fairly close substitute for expenditures on public transportation, the assumption of a Cobb-Douglas utility function seemed unreasonable if both were included.⁵

Personal disposable income and the three types of consumption expenditures were deflated by the consumer price index and by population. All data are quarterly and seasonally adjusted. Inspection shows that the assumption of covariance stationarity is inappropriate for these four series. At the outset linear trends were removed from all four series by ordinary least squares regression, using data for the period 1950:I-1977:IV. All results reported here pertain to the deviations from these trends. Since the detrended series still exhibit strong serial correlation, leakage can lead to a large small sample bias in frequency domain estimators. To diminish this bias, each of the four series was prewhitened using the estimated coefficients from a fifth-order autoregression. Prewhitening is equivalent to multiplication of both sides of (21) and (22) by an $r \times r$ diagonal matrix of lag operators and does not affect the identification of the model. Estimates based on the prewhitened data were recolored in the usual way.

The parameters of (21) were estimated at the harmonic frequencies $2\pi j/112$, $j = 0, \ldots, 111$.
At each frequency, estimates and their estimated variance matrix were computed as described in sections 2 and 3, using 13 periodogram ordinates centered about that frequency. In table 1 parameter estimates and their estimated standard errors are provided for four equally spaced, nonoverlapping frequency groups. At each frequency (23) imposes five restrictions on the spectral density matrix $S_y(\lambda)$. Test statistics for each frequency are shown on the bottom line of table 1. The factor structure was accepted at each of the four frequencies, and for all four frequencies jointly. Corresponding time domain point estimates are given in table 2.

The factor model, a statistical restriction which provides identification of important behavioral parameters, fails to be rejected by the data. We therefore proceed to evaluate the plausibility of the parameter estimates which emerge from this model in the context of the permanent income hypothesis.

⁵ Very little pretesting was done before selecting the variables food, clothing, and transportation. If housing services are substituted for transportation the likelihood ratio test statistic for the restrictions (23) rises from the value 15.6 reported in table 1 for food, clothing and transportation, to 30.6. If the model is applied to the aggregates durable goods, non-durable goods, and services, the likelihood ratio test statistic is 38.7.
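The four bands of 13 periodogram ordinates listed in table 1 (2-14, 16-28, 30-42 and 44-56) can be checked against the stated centre frequencies with a short calculation. The sketch assumes that ordinate $k$ corresponds to the harmonic frequency $2\pi(k-1)/112$, i.e. that ordinates are numbered from 1 starting at $j = 0$; that numbering is an inference made here, not something the text states.

```python
T = 112  # quarterly observations, 1950:I through 1977:IV
bands = [(2, 14), (16, 28), (30, 42), (44, 56)]


def band_centre_over_pi(first, last, T):
    """Centre frequency of a band in units of pi, assuming ordinate k sits at 2*pi*(k-1)/T."""
    centre = (first + last) // 2
    return 2.0 * (centre - 1) / T


centres = [band_centre_over_pi(a, b, T) for a, b in bands]
widths = [b - a + 1 for a, b in bands]
```

Under that assumption the centres come out at 0.125, 0.375, 0.625 and 0.875 in units of $\pi$, each band holds 13 ordinates, and the bands do not overlap.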
Table 1
Frequency domain estimates for permanent income model.ᵃ

Frequency (λ)                 λ1 = 0.125π            λ2 = 0.375π            λ3 = 0.625π            λ4 = 0.875π
Periodogram ordinates used    2-14                   16-28                  30-42                  44-56

Re(β̃_1(λ))                     0.1311 (0.0370)        0.0605 (0.0419)        0.2455 (0.2415)       -0.0502 (0.0881)
Im(β̃_1(λ))                    -0.0612 (0.0464)       -0.0372 (0.0373)       -0.0688 (0.0908)        0.1739 (0.1103)
Re(β̃_2(λ))                     0.1057 (0.0186)        0.0778 (0.0350)        0.1093 (0.0659)       -0.0949 (0.0869)
Im(β̃_2(λ))                    -0.0025 (0.0199)       -0.0748 (0.0311)       -0.0060 (0.0562)        0.1517 (0.1002)
Re(β̃_3(λ))                     0.0181 (0.0059)        0.0075 (0.0070)        0.0207 (0.0220)       -0.0084 (0.0105)
Im(β̃_3(λ))                    -0.0029 (0.0063)        0.0121 (0.0071)        0.0139 (0.0208)        0.0036 (0.0107)
S_u1(λ)                        325.537 (9.638)        16.332 (4.474)         4.889 (6.381)          5.607 (2.203)
S_u2(λ)                        25.973 (18.378)        7.973 (2.747)          2.680 (1.533)          2.508 (2.063)
S_u3(λ)                        8.057 (2.291)          0.673 (0.182)          0.232 (0.044)          0.171 (0.108)
S_v(λ)                         2,246.000 (15.836)     0.000 (476.514)        505.518 (1,156.650)    155.860 (1,267.260)
S_z(λ)                         11,253X15 (4,129.680)  489.224 (213.512)      114.794 (170.789)      89.602 (77.403)

Test statistic, factor
structure (χ²(5))              1.549                  3.303                  6.666                  4.112

ᵃ Estimated standard errors are shown in parentheses.
In many formulations (including the one set forth in the previous section) transitory income cannot be forecast given the information available to agents at the time their consumption expenditures are made. The variance of transitory income conditional on the relevant information set must be the same as its unconditional variance; in particular transitory income must be serially uncorrelated if in each period consumers know the value of transitory income in all previous periods. If 'quarter' is a reasonable interpretation of 'period' then in (21) $v_t$ should be white noise. The point estimates of the autocorrelation function of $v_t$ displayed in table 2 indicate this is clearly not the case, but they do suggest
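Table 2
Time domain point estimates.

| $s$ | Food $\hat\beta_{1s}/\hat\beta_{10}$ | Food $\hat\beta_{1,-s}/\hat\beta_{10}$ | Clothing $\hat\beta_{2s}/\hat\beta_{20}$ | Clothing $\hat\beta_{2,-s}/\hat\beta_{20}$ | Transportation $\hat\beta_{3s}/\hat\beta_{30}$ | Transportation $\hat\beta_{3,-s}/\hat\beta_{30}$ |
|---|---|---|---|---|---|---|
| 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 1 | 0.16 | 0.44 | 0.65 | -0.64 | 0.32 | 0.62 |
| 2 | -0.05 | -0.73 | 0.31 | -0.80 | -0.30 | -0.39 |
| 3 | 0.15 | 0.68 | -0.22 | -0.53 | 0.69 | 0.34 |
| 4 | 0.31 | -0.35 | -0.20 | -0.47 | 0.05 | 0.08 |
| 5 | -0.03 | 0.14 | -0.54 | -0.25 | 0.07 | 0.05 |
| 6 | 0.08 | -0.24 | 0.43 | -0.09 | -0.12 | 0.08 |
| 7 | -0.12 | 0.09 | -0.12 | 0.10 | -0.21 | 0.26 |
| 8 | 0.14 | 0.05 | 0.16 | -0.19 | 0.34 | 0.06 |

Food: $\hat\beta_{10} = 0.110$, $\sum_s \hat\beta_{1s} = 0.107$. Clothing: $\hat\beta_{20} = 0.067$, $\sum_s \hat\beta_{2s} = 0.087$. Transportation: $\hat\beta_{30} = 0.012$, $\sum_s \hat\beta_{3s} = 0.038$.

| $s$ | Food $\rho_{u_1}(s)$ | Clothing $\rho_{u_2}(s)$ | Transportation $\rho_{u_3}(s)$ | Income $\rho_v(s)$ | Permanent income $\rho_z(s)$ |
|---|---|---|---|---|---|
| 0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 1 | 0.96 | 0.84 | 0.98 | 0.63 | 0.97 |
| 2 | 0.91 | 0.78 | 0.96 | 0.46 | 0.93 |
| 3 | 0.85 | 0.75 | 0.93 | 0.33 | 0.89 |
| 4 | 0.78 | 0.72 | 0.90 | 0.02 | 0.83 |
| 5 | 0.72 | 0.69 | 0.87 | -0.08 | 0.78 |
| 6 | 0.66 | 0.59 | 0.84 | -0.02 | 0.72 |
| 7 | 0.61 | 0.53 | 0.82 | -0.03 | 0.67 |
| 8 | 0.56 | 0.47 | 0.79 | 0.04 | 0.62 |
| Variance | var($u_1$) = 332 | var($u_2$) = 21.1 | var($u_3$) = 19.7 | var($v$) = 929 | var($z$) = 11,936 |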
that $v_t$ (unlike any of the other latent variables in the model) shows no correlation with values of itself lagged more than a year. Since tax accounts are settled once a year and many transfer payments are revised annually, it seems plausible that in a given quarter each consumer knows his transitory income four quarters previous, but transitory income for more recent quarters may be unknown.

Frequency domain parameter estimates can be used to test this idea formally. The hypothesis that $v_t$ (measured quarterly) is white noise is equivalent to its spectral density being flat, and this is inconsistent with the estimates in table 1; the relevant test statistic is $\chi^2(3) = 27.16$. The spectral density of the annual series $v_s^* = v_{4s}$, $s = 0, \pm 1, \pm 2, \ldots$, is obtained by 'folding' the spectral density of $v_t$ [Fishman (1969, pp. 36-38)]: $S_{v^*}(\lambda) = \sum_{k=0}^{3} S_v(\lambda + 0.5\pi k)$. The maximum likelihood estimate of $S_{v^*}(\lambda)$ is therefore $\hat S_{v^*}(\lambda) = \sum_{k=0}^{3} \hat S_v(\lambda + 0.5\pi k)$. Since $\hat S_{v^*}(0)$ and $\hat S_{v^*}(0.25\pi)$ are each sums of estimates based on non-overlapping groups of
periodogram ordinates, $\widehat{\mathrm{var}}(\hat S_{v^*}(0))$ and $\widehat{\mathrm{var}}(\hat S_{v^*}(0.25\pi))$ are easily calculated, and the asymptotic distribution of the test statistic $(\hat S_{v^*}(0) - \hat S_{v^*}(0.25\pi)) / (\widehat{\mathrm{var}}(\hat S_{v^*}(0)) + \widehat{\mathrm{var}}(\hat S_{v^*}(0.25\pi)))^{1/2}$ is standard normal if $S_{v^*}(0) = S_{v^*}(0.25\pi)$. The reasonable alternative to absence of serial correlation would seem to be positive serial correlation, i.e., $S_{v^*}(0) > S_{v^*}(0.25\pi)$. Similar test statistics can be computed using the $\hat S_{u_j}(\lambda)$. The values of the statistics are 0.02633 for $v_s^*$, 3.596 for the $u_{js}^*$ corresponding to food, 1.496 for clothing, and 3.426 for transportation. Absence of serial correlation is therefore accepted for transitory income, but rejected for the specific factors associated with food and transportation and rejected at the 10% level (but not the 5%) for clothing. Our evidence that annual observations on the specific factor associated with disposable personal income are serially uncorrelated while those on the specific factors associated with consumption are not supports the interpretation of $v_t$ as transitory income under the permanent income hypothesis.

The permanent income hypothesis specifies that current consumption depends on current, but not past or future, permanent income. Point estimates of the $\beta_j(L)$, shown in table 2, appear inconsistent with this restriction. Although the estimated contemporaneous coefficient $\hat\beta_{j0}$ is the largest in absolute value for all three consumption groups, other $\hat\beta_{js}$ are of the same order of magnitude, for negative as well as positive values of $s$. The values of the $\hat\beta_{j0}$ are positive, reasonable, and for food and clothing nearly the same as $\sum_{s=-\infty}^{\infty} \hat\beta_{js}$. The latter, in turn, are close to the share of the specified aggregate in disposable income, as would be expected. There is some evidence that values of $\beta_{js}$ for $s$ exceeding four quarters in absolute value may be zero, but no indication of one-sidedness of the $\beta_j(L)$ lag operators.
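The annual-sampling test of serial correlation described above can be illustrated with a small sketch. Everything in the first part is hypothetical: the moving-average and autoregressive autocovariances are invented examples, and the spectra omit the $1/(2\pi)$ factor. The sketch shows that folding a quarterly spectrum over four points spaced $0.5\pi$ apart gives a flat annual spectrum when autocovariances vanish at lags of four quarters or more, and a spectrum that is larger at 0 than at $0.25\pi$ under positive annual serial correlation. The second part converts the four reported test statistics into one-sided normal tail probabilities; the function names are assumptions made for this illustration.

```python
import math

def spectrum(gamma, lam):
    """Spectral density (without the 1/(2*pi) factor) from autocovariances gamma[0], gamma[1], ..."""
    return gamma[0] + 2.0 * sum(g * math.cos(lam * s) for s, g in enumerate(gamma) if s > 0)

def folded(gamma, lam, m=4):
    """Folded spectrum sum_{k=0}^{m-1} S(lam + 2*pi*k/m); for m = 4 the shifts are 0.5*pi*k."""
    return sum(spectrum(gamma, lam + 2.0 * math.pi * k / m) for k in range(m))

def z_stat(s0, s1, var0, var1):
    """Difference of two independent estimates divided by its estimated standard error."""
    return (s0 - s1) / math.sqrt(var0 + var1)

def upper_tail(z):
    """One-sided standard normal tail probability P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical MA(3) process: autocovariances are zero at lags 4 and beyond.
theta = [1.0, 0.5, 0.4, 0.3]
gamma_ma = [sum(theta[i] * theta[i + s] for i in range(len(theta) - s))
            for s in range(len(theta))]

# Hypothetical AR(1) process with coefficient 0.9 (autocorrelations truncated at lag 199).
gamma_ar = [0.9 ** s for s in range(200)]

flat_values = [folded(gamma_ma, lam) for lam in (0.0, 0.1, 0.25 * math.pi, 1.0)]
ar_at_zero = folded(gamma_ar, 0.0)
ar_at_quarter_pi = folded(gamma_ar, 0.25 * math.pi)

# Test statistics reported in the text, referred to the standard normal (one-sided).
reported = {"transitory income": 0.02633, "food": 3.596,
            "clothing": 1.496, "transportation": 3.426}
p_values = {name: upper_tail(z) for name, z in reported.items()}

print([round(v, 6) for v in flat_values])
print(round(ar_at_zero, 3), round(ar_at_quarter_pi, 3))
print({name: round(p, 4) for name, p in p_values.items()})
```

The helper `z_stat` shows the form of the statistic; the estimates and variances it would need are not reproduced in the text, so only the reported values of the statistic are used here.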
The net effect of the $\hat\beta_{js}$, $s \neq 0$, is to render the estimated path of consumption out of permanent income not as smooth as estimated permanent income itself.

The hypothesis that current consumption depends only contemporaneously on permanent income may be tested formally in the frequency domain; actually, we are testing jointly the hypotheses that current consumption and measured income are linked to permanent income only in the current period, since failure of the latter specification, imposed here as an identifying restriction, would lead to the appearance of a non-contemporaneous relationship between permanent income and measured consumption. This hypothesis imposes cross-frequency restrictions, but they are of a rather simple type. For the four frequencies for which estimates are reported in table 1, the restrictions are $\mathrm{Re}(\beta_j(\lambda_k) - \beta_j(\lambda_{k+1})) = 0$, $k = 1, 2, 3$, and $\mathrm{Im}(\beta_j(\lambda_k)) = 0$, $k = 1, 2, 3, 4$, for $j = 1, 2, 3$. A Wald test of these 21 restrictions can be based on the estimates and their estimated variance matrix; the test statistic is $\chi^2(21) = 613$. The hypothesis is rejected for each of the three consumption groups, as well, for which test statistics [all $\chi^2(7)$] are 51.02, 41.55, and 56.46, respectively. Point estimates and test statistics both suggest strongly that the contemporaneous effects hypothesis must be rejected.

As a final means of evaluating the permanent income hypothesis interpretation
of the latent variable $z_t$, the projection of $z_t$ on $y_t$ and the $c_{jt}$ was estimated as described in section 4. Estimated detrended $z_t$ is plotted along with detrended disposable personal income in fig. 1. The overall, convex shape of the plotted values for both series results from the exponential growth path of per capita constant dollar consumption expenditures in conjunction with the linear trend which was removed from each series. In the long run, estimated permanent income and measured income have a close association, but over intervals as long as several quarters there can be a considerable departure of one from the other. As one would expect, estimated projected permanent income responds in a smoother
[Figure 1: time series plot. Horizontal axis: 1950:I to 1975:I. Vertical axis: -200 to 300. Legend: disposable personal income (per capita 1967 dollars); dotted line, estimated projected permanent income (per capita 1967 dollars).]

Fig. 1. Per capita personal disposable income and estimated projection of per capita permanent income on disposable income, food, clothing, and transportation, detrended seasonally adjusted quarterly totals at annual rates, 1950:I-1977:IV.
way to changes in economic conditions than does measured income. The boom of 1974 and the subsequent collapse provide the most spectacular example of this characteristic in the postwar period, and suggest that estimates of permanent income based only on a projection on measured income would not agree well with those presented in fig. 1. The general tendency for estimated projected permanent income to be relatively smooth across business cycle peaks and troughs is consistent with the permanent income interpretation of $z_t$ in our latent variable model.

The evidence regarding the permanent income hypothesis provided by this approach is mixed. On the one hand, the variances of permanent and transitory income relative to each other and to consumption are about what should be expected, as is the fact that permanent income and consumption out of permanent income exhibit substantial serial correlation while transitory income is serially uncorrelated at intervals of at least a year. On the other hand, the serial correlation of permanent income does not account for all of the serial correlation in permanent consumption as stipulated in most formulations of the permanent income hypothesis. The latter difficulty could be ascribed to the fact that the timing of consumption itself departs from the timing of consumption expenditures, but we do not find that argument persuasive: it would imply that although clothing consumption expenditures might not be tied contemporaneously to permanent income, food expenditures should be very nearly so, and there is no evidence of this distinction in our estimates. A further difficulty is that the relationship between permanent income and consumption appears to be two-sided rather than contemporaneous. This could again be ascribed to timing and measurement problems, or to the variability of the real rate of interest documented elsewhere by one of the authors [Singleton (1980)].
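The mechanics of the Wald test behind the two-sidedness finding can be sketched for a single consumption group. The sketch below uses the food estimates and standard errors as read from table 1, and it assumes a diagonal variance matrix because the estimated covariances are not reported; for that reason it cannot reproduce the reported statistic of 51.02 and should be read only as an illustration of how the seven restrictions are formed. The function names `solve` and `wald` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def wald(theta, var, R):
    """Wald statistic (R theta)' (R V R')^{-1} (R theta) for a diagonal V with entries var."""
    q, p = len(R), len(theta)
    r = [sum(R[i][j] * theta[j] for j in range(p)) for i in range(q)]
    M = [[sum(R[i][k] * var[k] * R[j][k] for k in range(p)) for j in range(q)]
         for i in range(q)]
    return sum(ri * xi for ri, xi in zip(r, solve(M, r)))

# Food estimates at the four frequencies (table 1), standard errors in the *_se lists.
re_est = [0.1311, 0.0605, 0.2455, -0.0502]
re_se = [0.0370, 0.0419, 0.2415, 0.0881]
im_est = [-0.0612, -0.0372, -0.0688, 0.1739]
im_se = [0.0464, 0.0373, 0.0908, 0.1103]

theta = re_est + im_est
var = [s * s for s in re_se + im_se]   # assumption: covariances ignored

K = 4
R = []
for k in range(K - 1):                 # Re beta(lambda_k) - Re beta(lambda_{k+1}) = 0
    row = [0.0] * (2 * K)
    row[k], row[k + 1] = 1.0, -1.0
    R.append(row)
for k in range(K):                     # Im beta(lambda_k) = 0
    row = [0.0] * (2 * K)
    row[K + k] = 1.0
    R.append(row)

W = wald(theta, var, R)
print(len(R), round(W, 2))
```

With three consumption groups the same construction yields 3 x 7 = 21 restrictions, matching the count given in the text; a full replication would need the complete estimated variance matrix.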
The permanent income hypothesis seems to provide a reasonable, if rough, description of the behavior of aggregate consumption over time, but its specification of the exact, temporal relation between consumption and permanent income should perhaps not be taken literally.
References

Anderson, T.W., 1963, The use of factor analysis in the statistical analysis of multiple time series, Psychometrika 28, 1-25.
Attfield, C., 1977, Estimation of a model containing unobserved variables using grouped observations, Journal of Econometrics 6, 51-65.
Cooley, J.W. and J.W. Tukey, 1965, An algorithm for the machine calculation of complex Fourier series, Mathematics of Computation 19, 297-301.
Fisher, I., 1930, The theory of interest (Macmillan, New York).
Fishman, G., 1969, Spectral methods in econometrics (Harvard University Press, Cambridge, MA).
Friedman, M., 1957, A theory of the consumption function (Princeton University Press for NBER, Princeton, NJ).
Geweke, J.F., 1975, Employment turnover and wage dynamics in U.S. manufacturing, Ph.D. dissertation (University of Minnesota, Minneapolis, MN).
Geweke, J.F., 1977, The dynamic factor analysis of economic time series models, in: D.J. Aigner and A.S. Goldberger, eds., Latent variables in socio-economic models, Ch. 19 (North-Holland, Amsterdam) 365-383.
Goldberger, A.S., 1972, Structural equation methods in the social sciences, Econometrica 40, 979-1001.
Hamming, R.W., 1973, Numerical methods for scientists and engineers, 2nd ed. (McGraw-Hill, New York).
Hannan, E.J., 1973, Central limit theorems for time series regressions, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 26, 157-170.
Jöreskog, K.G., 1969, A general approach to confirmatory maximum likelihood factor analysis, Psychometrika 34, 183-202.
Lucas, R.E., 1972, Expectations and the neutrality of money, Journal of Economic Theory 4, 103-124.
Lucas, R.E., 1973, Some international evidence on output-inflation tradeoffs, American Economic Review 63, 326-334.
Lucas, R.E., 1975, An equilibrium model of the business cycle, Journal of Political Economy 83, 1113-1144.
Muth, J.F., 1961, Rational expectations and the theory of price movements, Econometrica 29, 315-335.
Nehari, Z., 1968, Introduction to complex analysis (Allyn and Bacon, Boston, MA).
Sargent, T.J., 1973, Rational expectations, the real rate of interest, and the natural rate of unemployment, Brookings Papers on Economic Activity 2, 429-480.
Sargent, T.J., 1976, A classical macroeconometric model for the United States, Journal of Political Economy 84, 207-238.
Sargent, T.J. and C.A. Sims, 1977, Business cycle modeling without pretending to have too much a priori economic theory, in: New methods in business cycle research: Proceedings from a conference (Federal Reserve Bank of Minneapolis, Minneapolis, MN) 45-110.
Singleton, K.J., 1977, The cyclical behavior of the term structure of interest rates, Ph.D. dissertation (University of Wisconsin, Madison, WI).
Singleton, K.J., 1980, Real and nominal factors in the cyclical behavior of interest rates, output, and money, Working paper (Carnegie-Mellon University, Pittsburgh, PA).