Sources of error in economic time series

Journal of Econometrics 17 (1981) 305-321. North-Holland Publishing Company

SOURCES OF ERROR IN ECONOMIC TIME SERIES*

David A. PIERCE
Federal Reserve Board, Washington, DC 20551, USA

Received April 1981, final version received August 1981

This paper classifies and measures the major sources of error, uncertainty or noise in economic data, regarding such data as observations from stationary and non-stationary time series. Uncertainties due to seasonal adjustment, sampling, and transitory variation are studied, both as observable error (eventually removed from preliminary figures when they are revised) and as unobservable error (imbedded in both preliminary and final data). Correlations between different error sources are derived. The results are illustrated with U.S. money supply series.

1. Introduction

For a variety of reasons observed data on economic time series are subject to uncertainty or error. In some instances first published data are revised at a later date, so that these preliminary-data errors can be observed and corrected; in others the error persists in the final data. Cutting across this classification are the numerous varieties of error, including conceptual error, seasonal adjustment error, sampling error, etc. This paper attempts to classify and measure these sources of error in economic data, regarding such data as observations from stationary and non-stationary time series.

The remainder of this section summarizes the nature of the various errors that can occur. The time series models employed are introduced in section 2 and used there to analyze error in final data, or data not expected to undergo further revision. Revisions, or preliminary-data errors which are subsequently observed and removed, are considered in section 3, including possible correlation between these and final-data errors.

1.1. Major error sources

There are numerous reasons why an observed economic data series will differ from a 'true' one; we will distinguish the following:

(1) Conceptual error: 'Unemployment', or 'money', or whatever, may not really be what we try to construct, even if our construction of the purported series were perfect. Little will be said of the problem in the present study, except for the following special case.

(2) Transitory error (transitory variation): Irregular, evanescent fluctuation in a data series, presumably due to causes extraneous to those related to our concept of the series. This phenomenon was examined in some detail by the Committee on Monetary Statistics [Bach et al. (1976)] and by Porter et al. (1978).

(3) Sampling error: The true series is an aggregate of a universe of constituents (e.g. bank deposit totals, unemployed persons), and only a sample from this universe is available at a particular time, from which an estimate of the population figure is constructed. An important example for monetary statistics is the presence of non-member banks which report their deposits only one week each quarter. As an extreme case, there may be no data at all available, for example on monthly GNP, and any 'data' constructed by interpolation or other means are of course only estimates.

(4) Seasonal adjustment error: While this is also partly conceptual insofar as we often don't know very well what we want to remove from a series, in addition the seasonal adjustment technique may be faulty, and even the best method generally provides only an estimate of any 'true' seasonal factor that we are able to specify. If seasonal adjustment is not undertaken, seasonality in data may itself sometimes be regarded as error.

(5) Reporting error: This may result from simple clerical mistakes or may reflect more fundamental problems such as deliberate concealment and lies, faulty classification of observations or construction of questionnaires, etc. A lucid discussion of such sources of error appears in Morgenstern (1963, ch. II).

*Any views expressed are not necessarily related to those of the Federal Reserve System. Helpful comments on an earlier version were provided by W.P. Cleveland, A. Maravall, and D.W. Parke.

1.2. Error in preliminary and in final data

A second classification of error in economic data is according to whether (a) the error exists only in preliminary or first-published data and is eliminated in a subsequent version of the series, or (b) the error is imbedded in the final data, as well as in any preliminary versions of the series. Revisions are errors that are discovered and removed from preliminary data series when further information subsequently becomes available, a main example being revisions due to improved estimates of the seasonal component. Remaining errors, which by definition can never be observed, include conceptual error and transitory variation, and parts of seasonal adjustment and sampling errors.


Preliminary data error can be measured and analyzed empirically by comparing first-published and final data. Thus the existence and extent of such error is unambiguous and independent of any concept of a 'true' series. Estimating measures such as the standard error of revisions in a series is straightforward and would be useful in publicizing and quantifying this source of uncertainty in preliminary data.

On the other hand, error in final data is always unobservable (else it could be removed). In fact, its presence follows only from a concept of a true series, that is to say, a set of assumptions or model for the generation (definition, determination) of the underlying series values, of which our final data are estimates. Thus, as with variance estimation in general, there is necessarily error in our assessment of error. Yet there is precedent for the use of models in sampling, in seasonal adjustment [sometimes implicitly - see Cleveland and Tiao (1976)], and in analyzing transitory error [e.g., Porter et al. (1978) and references therein]; and frequently there is a robustness property in that alternative model specifications result in similar conclusions regarding error characteristics. In addition, properties of revisions can be derived from models, which can be compared with the properties exhibited by revisions actually occurring [see, for example, Pierce (1980)].

2. Unobservable error

In this section we consider the sources and measurement of error which remain in data in final form, that is, data not subject to further revision and with no further relevant information forthcoming.¹ As noted in section 1, it is only meaningful to talk about error in such series in the context of a model depicting the generation of the 'true' series and the various superposed errors. The errors considered are those due to seasonal adjustment, sampling and transitory variation in the context of an unobserved components model of the form

$$x_t = p_t + s_t + e_t, \tag{1}$$

in which $p_t$, $s_t$ and $e_t$ are respectively the 'trend', 'seasonal' and 'irregular' components of the observed series $x_t$. Furthermore, if $x_t$ is a sample survey estimate of a population figure $\mu_t$, then it is assumed that $e_t$ is the sum of the sampling error $\varepsilon_t$ and other irregular influences $\xi_t$,

$$e_t = \xi_t + \varepsilon_t. \tag{2}$$

We can thus also write

$$x_t = p_t + s_t + \xi_t + \varepsilon_t \tag{3}$$

or

$$x_t = \mu_t + \varepsilon_t. \tag{4}$$

It is usually assumed that the four components in (3) are independent, though in some cases this assumption will need to be examined. In particular, (3) is sometimes appropriate for the logarithms of a 'multiplicatively' generated time series, $X_t = P_t S_t Z_t E_t$. If such a series were (erroneously) written in the additive form without first logging, the components would be related; for example, the sampling error would be more variable given the occurrence of an (unusually) large transitory component.

¹ Of course this error is also present in preliminary data, in addition to the error treated in section 3.
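The remark about multiplicative series can be illustrated with a minimal sketch. All numerical values and the function names `log_decomposition` and `additive_sampling_error` are hypothetical, chosen only for illustration; the factors play the roles of the four multiplicative components named in the text. The sketch checks that logging turns the product into the additive form (3), and that on the original scale the sampling discrepancy scales with the transitory factor.

```python
import math
import random

def log_decomposition(p, s, z, e):
    """Return (log X, sum of component logs) for X = p * s * z * e."""
    x = p * s * z * e
    return math.log(x), math.log(p) + math.log(s) + math.log(z) + math.log(e)

def additive_sampling_error(p, s, z, e):
    """Sampling discrepancy on the original scale: X minus the error-free product."""
    return p * s * z * e - p * s * z

random.seed(0)
max_gap = 0.0
for t in range(120):
    p = 100.0 * math.exp(0.002 * t)                  # hypothetical trend level
    s = 1.0 + 0.05 * math.sin(2 * math.pi * t / 12)  # hypothetical seasonal factor
    z = math.exp(random.gauss(0.0, 0.01))            # hypothetical transitory factor
    e = math.exp(random.gauss(0.0, 0.005))           # hypothetical sampling-error factor
    log_x, sum_logs = log_decomposition(p, s, z, e)
    max_gap = max(max_gap, abs(log_x - sum_logs))

# The log series is additive in the logged components (up to rounding).
print(max_gap < 1e-9)
# Without logging, doubling the transitory factor doubles the sampling discrepancy.
print(additive_sampling_error(100.0, 1.0, 2.0, 1.01),
      additive_sampling_error(100.0, 1.0, 1.0, 1.01))
```

The second printout shows the dependence the text warns about: in unlogged additive form the sampling error is not independent of the transitory component.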

2.1. Model for $x_t$

We shall assume that the components $s_t$, $p_t$, $\xi_t$ and $\varepsilon_t$ of $x_t$ in (3) are mutually independent (except when otherwise noted) and each generated by stationary or homogeneously non-stationary stochastic processes as described in Box and Jenkins (1970). Thus, $s_t$ and $p_t$ are representable in the form

[Equations (6) and (7) are not recoverable from the source.]

where $\alpha_t$ and $\beta_t$ are white noise sequences with variances $\sigma_\alpha^2$ and $\sigma_\beta^2$; the one-sided polynomials appearing in (6) and (7) are absolutely convergent and non-zero for $|z| \le 1$; and $\Delta_p(B)$ and $\Delta_s(B)$ are 'differencing operators' such that the zeroes of $\Delta_p(z)$ and $\Delta_s(z)$ are on the unit circle. Examples of such operators are the ordinary and 'seasonal' differencing operators, $1 - B$ and $1 - B^{12}$, respectively. It is also assumed that suitable initial conditions [see, e.g., Box and Jenkins (1970, pp. 114-119)] are given for $p_t$ and $s_t$.

It is usually assumed that the transitory component $\xi_t$ is already white noise. The sampling error $\varepsilon_t$ is sometimes assumed to be white noise, for example, if the samples are non-overlapping, and in other instances to be serially correlated (section 2.3). The models (6) and (7) for $p_t$ and $s_t$ together with $\xi_t$ and $\varepsilon_t$ are known to imply a model for the observable series $x_t$ of the same form,

$$\Delta(B) x_t = \psi(B) a_t, \tag{8}$$

so that $\Delta(B)x_t$ is a linear, stationary, non-deterministic time series. If all differencing and summing operators are identically unity the series $x_t$ and its components are stationary; if $\Delta(z) \ne 1$ then $x_t$, and at least one of $p_t$, $s_t$ and $\varepsilon_t$, are non-stationary.

The types of error in an observed time series $x_t$ are in principle determined from the model (3); however, this determination is rarely unique, and four sets of problems may be identified. First, the models set forth above are flexible and have repeatedly been found realistic for representing series in terms of their own past histories, but they say nothing about relationships between (components of) $x_t$ and (components of) other economic variables. Such relationships may be felt by some to be relevant in defining seasonality, transitoriness, etc.² Second, given a model for $x_t$, such as an ARIMA model as above, there is ambiguity in the specification of (ARIMA) models for the components of $x_t$ in (3). Addressing this problem is a major task of this section. Third, even having resolved this identification problem there are different definitions of a 'true' series, depending on whether the seasonal and irregular parts of the series are considered 'error' or as part of the series of interest. For example, we may regard the ideal series as $p_t$ in some applications and as $p_t + \xi_t$ in others. Finally, the observed series that we wish to study is not unique, as $x_t$ itself is often modified in an attempt to deal with these sources of error or noise. The most common modification is seasonal adjustment, where an estimate $\hat{s}_t$ of $s_t$ is constructed and the observed series is

$$\tilde{x}_t = x_t - \hat{s}_t. \tag{9}$$

The 'error' in this series is then either

$$\tilde{x}_t - p_t = (s_t - \hat{s}_t) + \xi_t + \varepsilon_t,$$

or

$$\tilde{x}_t - (p_t + \xi_t) = (s_t - \hat{s}_t) + \varepsilon_t,$$

depending on the definition of the 'true' series.

² A further aspect of relationships is that a change in another variable, including a change in a feedback control rule, may affect the model structure.


We will generally assume that the true series is $p_t$, though we will not usually assume that an estimate $\hat{\xi}_t$ of transitory error has been constructed and removed from $x_t$. Thus while the error due to seasonality is

$$\delta_t = s_t - \hat{s}_t \tag{10}$$

rather than $s_t$ itself, since the series is assumed to have been seasonally adjusted, the error due to transitory variation is $\xi_t$ itself. This convention reflects the practice usually followed with economic data, which is to adjust them for seasonal variation but not for irregular variation. Thus, using (10), the seasonally adjusted series can be written as

$$\tilde{x}_t = p_t + \delta_t + \xi_t + \varepsilon_t. \tag{11}$$

The major sources of error in (final) economic data series are then the seasonal adjustment error $\delta_t$, the 'transitory' error $\xi_t$, and the sampling error $\varepsilon_t$.

2.2. Seasonal adjustment error

As noted in section 2.1, the seasonal adjustment error is the difference $\delta_t$ between an estimate $\hat{s}_t$ and a true seasonal component $s_t$. The error thus depends on both the model (the definition of what $s_t$ is) and the adjustment procedure used. Given the model preceding, particularly (8) for the series $x_t$ and (7) for its seasonal component, it is known [Whittle (1963, p. 57), Cleveland and Tiao (1976)] that the estimate $\hat{s}_t$ of $s_t$ that minimizes the mean square $E(\delta_t^2)$ of the seasonal adjustment error (10) is of the form

$$\hat{s}_t = v(B) x_t = \sum_{j=-\infty}^{\infty} v_j x_{t-j}, \tag{12}$$

where

$$v(z) = f_s(z)/f_x(z), \tag{13}$$

where $f_s(z)$ and $f_x(z)$ follow from the models (7) and (8) [the explicit expression given for $f_s(z)$ is not legible in the source], and where the convention $|h(z)|^2 = h(z)h(z^{-1})$ is employed. Thus the filter is symmetric, $v_j = v_{-j}$, as expected from the reversibility of the $x$-process. The numerator and denominator of (13) are the autocovariance generating functions (acgf's), or spectra at $z = e^{i\omega}$, of the component and over-all processes $\{s_t\}$ and $\{x_t\}$.

The nature of the seasonal adjustment error $\delta_t$ in (10) was examined by Pierce (1979) who found that $\delta_t$ is stationary (the MSE finite) if and only if the roots of the component process differencing operators $\Delta_p(z)$ and $\Delta_s(z)$ are distinct. Assuming this restriction is always imposed, then the seasonal adjustment error follows the stationary linear process

[Equation (14) is not recoverable from the source.]

where the driving series is white noise [its variance expression is garbled in the source], and where $\psi_n(B)$ defines the stochastic process followed by

[Equation (15) is not recoverable from the source.]

$n_t$ being the composite of all other, nonseasonal components of $x_t$. Using this result, the mean square error of the estimate $\hat{s}_t$ is

[Equation (16) is not recoverable from the source.]

Strictly speaking this means for determining the variance of the seasonal adjustment error is valid only for 'optimal' seasonal adjustment procedures of the form (12), (13). However, we note that the Census X-11 seasonal adjustment procedure [Shiskin, Young and Musgrave (1967)], which is used for most published seasonally adjusted U.S. economic time series, is essentially of the form (12). Moreover, in Cleveland (1972) a model of the form (7), (8) is presented such that the particular filter weights $\{v_j\}$ in (13) match very closely those of the X-11 program with standard options. This model has been found to be close to those fitted to a large number of economic time series, and therefore for such series it should be possible to use (16) to obtain a good approximation (perhaps a lower bound) to the variance of $\delta_t$. For example, ARIMA models for the log of the money supply (M-1), measured monthly, have often been of this form, when fitted with recent years' data, and in Pierce (1979) it was found that the standard deviation of the SA error was about 0.09 of one percent.

2.3. Sampling error

Many time series, such as the unemployment rate and various series relating to the U.S. money supply, are sample survey estimates of population totals, so that there is an error

$$\varepsilon_t = x_t - \mu_t$$

representing the discrepancy between the sample estimate $x_t$ and the population value $\mu_t$. For example, if $x_t$ is the mean,

$$x_t = (1/n) \sum_{i=1}^{n} x_{it},$$

of a simple random sample of size $n$ from a population of $N$ statistically independent elements, numbered as $x_{1t}, \ldots, x_{nt}, x_{n+1,t}, \ldots, x_{Nt}$, then it can be shown that the sampling error $\varepsilon_t$ is independent of $\mu_t$ with variance

$$\sigma_\varepsilon^2 = (\sigma^2/n)(1 - n/N).$$

For more complex designs, such as stratified and multistage surveys, the sampling error variance can also be determined, and the error is also independent of the population value [Cochran (1963)]. The fact that $x_t$ (and $\mu_t$, $\varepsilon_t$) are time series does not enter into the determination of sampling error variance $\sigma_\varepsilon^2$, given that $x_t$ is a function of the sampled values $x_{it}$ at time $t$. Recently there have been investigations of ways to incorporate the historical time series information into the current estimate of $\mu_t$ [e.g., Scott, Smith and Jones (1977)], but the present paper assumes the conventional survey estimate is employed.

In contrast to the transitory error (section 2.4), the sampling error $\varepsilon_t$ is generally serially correlated, except in non-overlapping surveys, and the nature of the stochastic process generating $\varepsilon_t$ is frequently difficult to ascertain.

In considering the not seasonally adjusted series $x_t$, the sampling error $\varepsilon_t$, being independent of $\mu_t$, is also independent of the components of $\mu_t$, including the transitory error $\xi_t$. (The independence here depends on the assumption of independence of the population values $\{x_{it}\}$. Also the presence of the error $\varepsilon_t$ will be seen to complicate the determination of the variance $\sigma_\xi^2$.) However, the application of a seasonal adjustment filter such as (12) to $x_t$, and hence to all its components, affects the covariance properties of the different errors. For the sampling and seasonal adjustment errors, using (12),

$$E(\delta_t \varepsilon_{t+k}) = -E\{\varepsilon_{t+k}\, v(B)\varepsilon_t\} = -\sigma_\varepsilon^2 \sum_j v_j \rho_\varepsilon(j+k). \tag{17}$$

The correlation between the errors $\delta_t$ and $\varepsilon_t$ in $x_t$ is thus

$$\rho_{\delta\varepsilon}(0) = -(\sigma_\varepsilon/\sigma_\delta) \sum_j v_j \rho_\varepsilon(j), \tag{18}$$

which is often dominated by the ($j = 0$) term $-\sigma_\varepsilon v_0/\sigma_\delta$, since $\rho_\varepsilon(0) = 1$ dominates the autocorrelation function of $\varepsilon_t$ and $v_0$ is typically the largest weight in $v(B)$. The correlation (18) is thus generally negative. Intuitively, a large positive sampling error at time $t$ causes, through $v_0$, a larger-than-otherwise value of $\hat{s}_t$, tending to decrease (increase in the opposite direction) the seasonal adjustment error $\delta_t$.
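The two formulas of this subsection, the simple-random-sampling variance and the correlation (18), can be evaluated with a short sketch. The filter weights, the AR(1)-type autocorrelation function and the value of the seasonal adjustment error standard deviation below are hypothetical inputs chosen for illustration only; none of them comes from the paper, and the function names are likewise illustrative.

```python
import math

def srs_sampling_variance(sigma2, n, N):
    """Variance of the mean of a simple random sample of n out of N elements:
    (sigma^2 / n) * (1 - n / N)."""
    return (sigma2 / n) * (1.0 - n / N)

def corr_seasonal_sampling(v, rho_eps, sigma_eps, sigma_delta):
    """Equation (18): -(sigma_eps / sigma_delta) * sum_j v_j * rho_eps(j).

    v maps each lag j to the filter weight v_j; rho_eps(j) is the
    autocorrelation of the sampling error at lag j.
    """
    total = sum(weight * rho_eps(j) for j, weight in v.items())
    return -(sigma_eps / sigma_delta) * total

# Hypothetical symmetric filter weights (v_j = v_{-j}) with a dominant centre weight.
v = {-2: 0.05, -1: 0.10, 0: 0.40, 1: 0.10, 2: 0.05}

def rho_ar1(j):
    """Hypothetical AR(1)-type autocorrelation of the sampling error."""
    return 0.5 ** abs(j)

def rho_white(j):
    """Autocorrelation of a serially uncorrelated sampling error."""
    return 1.0 if j == 0 else 0.0

sigma_eps = math.sqrt(srs_sampling_variance(sigma2=4.0, n=100, N=1000))
sigma_delta = 0.5  # hypothetical standard deviation of the seasonal adjustment error

print(corr_seasonal_sampling(v, rho_ar1, sigma_eps, sigma_delta))
print(corr_seasonal_sampling(v, rho_white, sigma_eps, sigma_delta))
```

With a serially uncorrelated sampling error only the centre weight survives, so the second value equals the dominant term $-\sigma_\varepsilon v_0/\sigma_\delta$ discussed in the text; both values are negative because the weights used here are positive.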

2.4. Transitory error

Time series are very often characterized as consisting of the sum of 'signal' and 'noise', of 'systematic' and 'irregular' components or of 'permanent' and 'transitory' components [e.g., Bach et al. (1976), Whittle (1963)]. Such a classification is admittedly arbitrary, and what is 'noise' in, say, a univariate framework may in fact be systematic in a multivariate framework where inter-series relations are strong. But within the univariate approach there is a well established practice [Box, Hillmer and Tiao (1978), Porter et al. (1978), Shiskin, Young and Musgrave (1967)] of regarding that part of a series which is serially uncorrelated white noise as irregular, or transitory, and not (as much) related to the underlying phenomenon generating the true series. Statistically such a component is of no value in predicting the future or explaining the past of the observed series. In this subsection approaches are examined for specifying and measuring this transitory error in a series.

As noted in section 2.1, the transitory component $\xi_t$ in (3), along with the other components, is not uniquely determined (identified) from the model for $x_t$ itself, so that additional information is needed concerning the nature of this source of uncertainty. It has already been seen that its irregular or random nature requires that it be white noise. We assume that such a component is 'purely random' and not systematically related to anything else, in particular its own past and the other components of the given series $x_t$. The other idea necessary to identify the transitory component is essentially the converse of this one: if a component is purely random, then it is irregular or transitory. That is, if a series $x_t$ can be represented as the sum

$$x_t = n_t + \xi_t, \tag{19}$$

where $\xi_t$ is white noise independent of the aggregate $n_t$ of non-transitory components, then $\xi_t$ is labeled as transitory. This requires that the decomposition (19) be made so as to maximize the variance of $\xi_t$ (else $n_t$ itself could be so represented), or equivalently to minimize the variance of $n_t$. This 'minimum extraction principle' [Pierce (1978), see also Box, Hillmer and Tiao (1978)] suffices to determine uniquely the model for $n_t$ and the variance $\sigma_\xi^2$. It is worth noting that the irregular component produced by X-11 has this maximum variance property [Tiao and Hillmer (1978)].

Also we reiterate that the fact that $\xi_t$ is serially independent and independent of the other components of $x_t$ does not mean that there is not some other time series of interest with which $\xi_t$ is cross-correlated, and for this reason $\sigma_\xi^2$ could overstate the transitory variance relative to a larger information set. Yet in some instances it is possible to specify $n_t$ so as to 'explain' a great deal of the series; for example, if $x_t$ is a temporal aggregate (say measured monthly) of a basic (weekly or daily) series then a model for the disaggregated series, which can include deterministic as well as stochastic effects, will generally leave a residual which, when re-arranged, possesses smaller variance than when the series $x_t$ is modelled directly; see Geweke (1978) for this result and Porter et al. (1978) for instances of several such models.

The foregoing has dealt with not seasonally adjusted data. As in section 2.3, if $x_t$ (and hence its components) is seasonally adjusted, the error $\delta_t$ of this adjustment is correlated with $\xi_t$. Arguing as in the derivation leading up to (17),

$$E(\delta_t \xi_{t+k}) = -\sigma_\xi^2 \sum_j v_j \rho_\xi(j+k) = -\sigma_\xi^2 v_{-k} = -\sigma_\xi^2 v_k, \tag{20}$$

since $\rho_\xi(k) = 0$ if $k \ne 0$, $\xi_t$ being white noise, and since $v_{-k} = v_k$. Setting $k = 0$ in (20), the correlation coefficient between $\delta_t$ and $\xi_t$ is

$$\rho_{\delta\xi}(0) = -\sigma_\xi v_0/\sigma_\delta, \tag{21}$$

which is usually negative since the 'center weight' $v_0$ is generally positive. It is this negative correlation between the seasonal adjustment error and the irregular component of the series that underlies the common notion that seasonal adjustment tends to 'smooth' the data; a large positive value of $\xi_t$ [or of $\varepsilon_t$, from (17)] tends to be partially offset by the seasonal adjustment process.

In the presence of sampling error the problem of identifying the transitory component or measuring its variance is more difficult. If the sampling error $\varepsilon_t$ is serially independent, e.g. if the samples are non-overlapping, then the assumption of independence of $\varepsilon_t$ and $\xi_t$ would suffice; $\sigma_\varepsilon^2$ is available from sampling theory, and what is measured by the minimum extraction approach is $\sigma_{\xi+\varepsilon}^2$, the variance of the sum of the two error sources. Thus, the transitory variance,

$$\sigma_\xi^2 = \sigma_{\xi+\varepsilon}^2 - \sigma_\varepsilon^2, \tag{22}$$

can be readily determined. In the more usual case, however, the sampling error is autocorrelated, and by some means its stochastic process must be determined. Some guidelines for this are given in Scott, Smith and Jones (1977).

Note that the possibility of association between the transitory and sampling errors may appear to have been 'assumed away', as the four components of $x_t$ in (3) were assumed independent. However, it is important to note that $x_t$ is often the result of a preliminary transformation of the original data. If the sampling standard error were proportional to the transitory error, then the logarithmic transformation would be appropriate to stabilize the variance. Other transformations, e.g. those of Box and Cox (1964), may be needed in some situations.

We conclude this section with an illustration of the measurement of transitory variation in the (not seasonally adjusted) U.S. money supply, based on Porter et al. (1978), which also illustrates the power of temporal disaggregation in capturing systematic phenomena, as both weekly and daily data were employed. For daily data the basic model used was

$$x_t = p_t + s_{1t} + \xi_t, \tag{23}$$

where $s_{1t}$ represents only fixed day-of-week effects and the trend term $p_t$ represents all inter-week and longer-term effects. Three ways of estimating $p_t$ were employed, with different results; the method with the median transitory variance consisted of taking

$$\hat{p}_t = \frac{1}{5} \sum_{j=-2}^{2} x_{t+j}, \tag{24}$$

the moving average of the 5-day week centered about $x_t$. The standard deviation of $\xi_t$ in this approach was $\sigma_\xi = 0.41\%$, translating into about 0.1% for monthly data. [Using a moving 'quadratic' formula for $p_t$ yielded $\sigma_\xi = 0.31\%$; using (24) except summing only over the calendar week yielded $\sigma_\xi = 0.56\%$.] Note that this value is approximately the same as the standard deviation of $\delta_t$, the final seasonal adjustment error, with which we have seen $\xi_t$ is negatively correlated.

Using weekly data, however, the corresponding estimates of $\sigma_\xi$ [from a stochastic model where $\xi_t$ was estimated by a means analogous to (12)] varied about 0.55% (depending in this case on how the weekend was treated), yielding transitory standard errors of about one-fourth percent for monthly data, contrasted with one-tenth percent when constructed from the daily model (23).
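The orders of magnitude quoted above can be reproduced with a small sketch under one explicit assumption that is not stated in the text: that the monthly figure behaves like an average of independent daily (or weekly) transitory errors, with about 21 business days or 52/12 weeks per month. The helper `centered_ma5` illustrates a centered 5-day moving average of the kind described for the trend estimate; both function names are hypothetical.

```python
import math

def centered_ma5(x):
    """Centered 5-term moving average; None where the full window is unavailable."""
    out = []
    for t in range(len(x)):
        if t < 2 or t > len(x) - 3:
            out.append(None)
        else:
            out.append(sum(x[t - 2:t + 3]) / 5.0)
    return out

def std_of_average(sigma, n_obs):
    """Std. dev. of the mean of n_obs independent errors, each with std. dev. sigma."""
    return sigma / math.sqrt(n_obs)

# Assumption: about 21 business days, or 52/12 weeks, in a month.
monthly_from_daily = std_of_average(0.41, 21)
monthly_from_weekly = std_of_average(0.55, 52 / 12)

print(centered_ma5([1.0, 2.0, 3.0, 4.0, 5.0]))
print(round(monthly_from_daily, 2), round(monthly_from_weekly, 2))
```

Under this averaging assumption the daily figure of 0.41% scales to roughly one-tenth of a percent and the weekly figure of 0.55% to roughly one-fourth of a percent, in line with the monthly values reported in the text.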

3. Revisions in preliminary data (observable error)

In addition to the uncertainty or error in data as described in the previous section, there are usually further errors in first-published data, which are released before all relevant information is available. These errors, or revisions, are observable (eventually), and hence there is generated a series of such revisions whose statistical properties (e.g. their standard or average absolute deviation) can be investigated. While this is most directly done empirically, the models in the previous section can also yield revision error estimates. This section considers the measurement and the relative importance of various sources of revisions in first-released data, and possible covariances among revisions and between revisions and errors in final data.

3.1. Revisions due to seasonal adjustment

As seen in section 2.2, X-11 and other filtering seasonal adjustment procedures make use of data both prior and subsequent to the datum being adjusted, as both future and past observations ordinarily contain information pertinent to seasonality at a given point in the series. For example, when the series is generated by a stationary or homogeneously non-stationary stochastic process, the optimal (minimum mean square error) seasonal estimate (12), (13) was seen to be symmetric in the future and past. However, for the seasonal adjustment of current or recent data and for forecasting seasonal factors, which are more important problems than historical seasonal adjustment for interpreting or reacting to movements in the series, the relevant future of the series is not yet available. Thus, based on the observations that are available, preliminary estimates of the seasonal component are made, which are subsequently revised as more series values are observed, perhaps repeatedly, until the unobserved future no longer contains significant relevant information.

The nature and extent of these seasonal revisions is examined in Pierce (1980) under the assumption that the series $x_t$ can be adequately represented by a homogeneously non-stationary stochastic model of the form (8). Let

$$\hat{s}_t^{(m)} = v_m(B) x_t = \sum_{j=m}^{\infty} v_{mj} x_{t-j} \tag{25}$$

denote the seasonal estimate based on data $\{x_\tau, \tau \le t-m\}$. At any two such times $t-m$ and $t-n$, estimates $\hat{s}_t^{(m)}$ and $\hat{s}_t^{(n)}$ of $s_t$ can be calculated. Supposing $-\infty \le n < m$, so that $t-n > t-m$, the revision in the estimate $\hat{s}_t^{(m)}$ is

$$r_t^{(m,n)} = \hat{s}_t^{(m)} - \hat{s}_t^{(n)}. \tag{26}$$

This quantity may be regarded as that portion of the error in the (preliminary) seasonal estimate $\hat{s}_t^{(m)}$ which is detected or corrected as a result of information contained in the additional data $x_{t-m+1}, \ldots, x_{t-n}$. Of particular interest is the case $n = -\infty$, in which $\hat{s}_t^{(n)}$ is the final estimate $\hat{s}_t$. The total preliminary data error in $\hat{s}_t^{(m)}$, or equivalently in the preliminary seasonally adjusted figure $\tilde{x}_t^{(m)} = x_t - \hat{s}_t^{(m)}$, is

$$r_t^{(m)} = r_t^{(m,-\infty)} = \hat{s}_t^{(m)} - \hat{s}_t. \tag{27}$$

Given (25) and (8) the revisions themselves follow a stochastic process which in general can be characterized. An important case occurs when the seasonal estimate $\hat{s}_t^{(m)}$ can be represented as

$$\hat{s}_t^{(m)} = v(B)\, \hat{x}_t^{(m)}, \tag{28}$$

where

$$\hat{x}_\tau^{(m)} = E(x_\tau \mid x_{t-m}, x_{t-m-1}, \ldots), \qquad \tau > t-m,$$

is the extended series obtained by adjoining to the available series $\{x_\tau, \tau \le t-m\}$ a set of forecasts of $x_{t-m+1}, x_{t-m+2}, \ldots$, and where $v(B)$ is independent of $m$. Letting the series $\{\rho_j, -\infty < j < \infty\}$ be defined by

$$v(B) = \rho(B)\Delta(B)\psi^{-1}(B), \tag{29}$$

where $\Delta(B)$ and $\psi(B)$ are as in the model (8) for $x_t$, then it is shown in Pierce (1980) that, whenever (28) is satisfied, the revisions (26) follow the stochastic process

$$r_t^{(m,n)} = \sum_{j=n}^{m-1} \rho_j a_{t-j}, \tag{30}$$

which is a moving average of order $m-n-1$, so that the revision mean square is

$$\sigma_{m,n}^2 = \sigma_a^2 \sum_{j=n}^{m-1} \rho_j^2. \tag{31}$$


Moreover, the successive revisions are independent of each other and of the error $\delta_t$ [eq. (10)] in the final estimate $\hat{s}_t$. Finally, given a symmetric filtering procedure such as X-11, application of that procedure on the extended series, as in (28), minimizes the revision mean square of preliminary seasonally adjusted data. Thus, X-11/ARIMA would for this reason be expected to produce better initial data (smaller revisions) than the ordinary X-11 procedure.

In practice, the seasonal component is forecasted a year in advance so that the first-published seasonally adjusted data is of the form

$$\tilde{x}_t^{(m)} = x_t - \hat{s}_t^{(m)}, \qquad m = 1, \ldots, 12. \tag{32}$$

The above result then implies that the mean square of the total error in $\tilde{x}_t^{(m)}$ due to seasonal adjustment, say $\delta_t^{(m)}$, may be expressed as the sum of the revision mean square and the variance of the error in the final estimate,

$$E(\delta_t^{(m)})^2 = E(r_t^{(m)})^2 + E(\delta_t^2). \tag{33}$$

When the U.S. money supply (old M-1) was investigated it was found that the root mean square revision in first published data over 1974-77 was 0.18%, with little variation between months of the year (representing different values of $m$). If this is combined with the final-data error estimate in section 2.2, the standard deviation of the total error in M-1 due to seasonal adjustment is

$$[(0.18)^2 + (0.09)^2]^{1/2} = 0.20\%. \tag{34}$$

For example, if a preliminary seasonally adjusted monetary aggregate were $500 billion, then the ‘true’ figure would deviate from this value by more than $1 billion almost one-third of the time, due to uncertainty in seasonal adjustment.
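The arithmetic behind (34) and the dollar illustration can be checked with a brief sketch. The tail probability assumes the total error is normally distributed with zero mean, which is an assumption made here for illustration and not a statement from the paper; the function names are likewise hypothetical.

```python
import math

def total_sa_error_sd(revision_sd, final_sd):
    """Combine two independent error sources in quadrature, as in (33)-(34)."""
    return math.hypot(revision_sd, final_sd)

def prob_exceeds(threshold, sd):
    """P(|error| > threshold) for a zero-mean normal error (normality is assumed)."""
    return math.erfc(threshold / (sd * math.sqrt(2.0)))

total_pct = total_sa_error_sd(0.18, 0.09)            # percent
sd_billions = 500.0 * round(total_pct, 2) / 100.0    # 0.20% of a $500 billion aggregate

print(round(total_pct, 4))
print(sd_billions)
print(round(prob_exceeds(1.0, sd_billions), 3))
```

A deviation of more than one standard deviation (here $1 billion) has probability of roughly 0.32 under normality, which is the "almost one-third of the time" in the text.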

3.2. Revisions from other sources

The main source of preliminary data error in many economic time series is the seasonal adjustment process, which depends heavily and explicitly on information unavailable in current or recent time periods. Occasional instances of revision would occur when reporting errors or definitional changes necessitated a re-estimation of portions of the series. When data series are sample survey estimates, revisions would not ordinarily occur if the estimate of the population value $\mu_t$ were based only on the sample $x_t$ from the current time period. There has recently begun to be work on obtaining improved estimates of $\mu_t$ based on the series $x_t$ at other times as well [e.g. Scott, Smith and Jones (1977)], and revisions of these estimates would occur much as revisions in the seasonal component considered in the previous section.³

3.3. Joint effcts As was the case with the seasonal adjustment error in the final data, the revisions in preliminary seasonally adjusted data are frequently correlated with other error sources. For example, if

v,(B)- v,(B) = 4,,,(B)> so that

T:“‘.~’= R,,(B)xt,

E($!$+)

(35)

= E{~~R,,(B)x~_~} (36)

and similarly E(rj?,“‘i”,) = cr; 1 R$m~“‘p<(j+ k) =a2R’“.“’

5

k .

(37)

Thus seasonal revisions are correlated with sampling and transitory errors. In practice, the seasonal components or factors are usually forecasted a year ahead, so that the next annual revision represents the first occurrence of a ‘center weight’, vb”‘, in determining $“‘. Thus R~“~“‘= -I@’ and from (36) and (37) r~“~“‘, and the total revision rim), will generally be negatively correlated with 4, and with a,. These results provide support for the widely held view that unusual deviations in first-published data series tend to be mitigated in subsequent revisions of those series. Concerning seasonal and non-seasonal revisions in the money supply, Porter et al. (1978) found that the standard deviation of monthly M-I revisions was 2.33% (annual percentage rate) over the 1968-75 period. In Bach et al. (1976) it is reported that revision standard error for seasonally adjusted M-l is 3.20% of which 2.13% is due to seasonal factor revisions alone. Combining these results, an empirical estimate of covariance between seasonal revisions rjs’ and non-seasonal revisions rjn’ is

$\mathrm{cov}(r_t^{(s)}, r_t^{(n)}) = \tfrac{1}{2}\{\mathrm{var}(r_t^{(s)} + r_t^{(n)}) - \mathrm{var}(r_t^{(s)}) - \mathrm{var}(r_t^{(n)})\},$

which would appear negligible, in contrast with the negative covariance shown above between seasonal revisions and transitory error.

³ Similarly, if one actually estimated and removed a transitory component, the final transitory estimate would also be two-sided and the initial estimate subject to revision.
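As a rough check on the magnitude, the covariance identity for seasonal and non-seasonal revisions can be evaluated with the three standard deviations quoted in the text. The sketch assumes that the 2.33% figure of Porter et al. (1978) is the standard deviation of the non-seasonal revisions and that the 3.20% figure of Bach et al. (1976) is that of the total revision; the function name is illustrative only.

```python
def implied_covariance(sd_total, sd_seasonal, sd_nonseasonal):
    """cov(r_s, r_n) = 0.5 * {var(r_s + r_n) - var(r_s) - var(r_n)}."""
    return 0.5 * (sd_total**2 - sd_seasonal**2 - sd_nonseasonal**2)

# Standard deviations in percent (annual rate), as quoted in the text.
cov = implied_covariance(3.20, 2.13, 2.33)
corr = cov / (2.13 * 2.33)  # implied correlation between the two revisions

print(round(cov, 3), round(corr, 3))  # 0.137 0.028
```

Under these assumptions the implied covariance is about 0.14 (in squared percentage points), a correlation of roughly 0.03.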

3.4. Errors in levels (logs) versus changes (growth rates) of series

It has usually been assumed in this paper that we were investigating an error, say $e_t$, in a 'level' series of the form

$x_t = m_t + e_t,$    (38)

where $m_t$ is the error-free series. Frequently interest centers on the errors in changes $\nabla x_t$ of $x_t$ (which, if $x_t$ is the log of the original series $X_t$, are essentially the growth rates of $X_t$). The most important point that needs to be addressed is whether, if based on (38) we write

$\nabla x_t = \nabla m_t + \nabla e_t,$    (39)

the series $\nabla e_t$ is in fact the corresponding error series for $\nabla x_t$. In Pierce (1978) this 'filter invariance' property is explicitly assumed for the seasonal component (and hence the revisions and final seasonal component errors) of a series. However, seasonal adjustment procedures in general do not satisfy this property, and moreover we may wish to require the transitory components of both $x_t$ and $\nabla x_t$ to be white noise, whereas $\nabla e_t$ is autocorrelated if $e_t$ is white noise.

If in (39) $\nabla e_t$ is the corresponding error series for $\nabla x_t$, then it is straightforward to show that its variance is

$\sigma_{\nabla e}^2 = 2\sigma_e^2[1 - \rho_e(1)].$    (40)
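The variance in (40), $2\sigma_e^2[1 - \rho_e(1)]$, follows from $\mathrm{var}(\nabla e_t) = \mathrm{var}(e_t) + \mathrm{var}(e_{t-1}) - 2\,\mathrm{cov}(e_t, e_{t-1})$ for a stationary error. A minimal numerical sketch, assuming Gaussian errors and, for the autocorrelated case, a first-order autoregression with coefficient 0.9 (both are illustrative choices, not taken from the paper), compares the sample ratio $\mathrm{var}(\nabla e_t)/\mathrm{var}(e_t)$ with $2[1 - \rho_e(1)]$:

```python
import numpy as np

def simulate_ar1(phi, n, rng):
    """Simulate e_t = phi * e_{t-1} + u_t with unit-variance Gaussian u_t."""
    u = rng.normal(size=n)
    e = np.empty(n)
    e[0] = u[0] / np.sqrt(1.0 - phi**2)  # start from the stationary distribution
    for t in range(1, n):
        e[t] = phi * e[t - 1] + u[t]
    return e

def variance_ratio(e):
    """Sample var(e_t - e_{t-1}) / var(e_t)."""
    return np.var(np.diff(e)) / np.var(e)

rng = np.random.default_rng(0)
n = 200_000
ratio_white = variance_ratio(simulate_ar1(0.0, n, rng))  # rho_e(1) = 0
ratio_ar1 = variance_ratio(simulate_ar1(0.9, n, rng))    # rho_e(1) = 0.9

print(ratio_white, 2 * (1 - 0.0))  # sample ratio vs. 2[1 - rho_e(1)]
print(ratio_ar1, 2 * (1 - 0.9))
```

The two theoretical values are 2 and 0.2; the sample ratios should agree with them up to simulation noise.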

If $e_t$ is random (white noise, so that $\rho_e(1) = 0$), then differencing $x_t$ doubles the error variance. But it is important to note that the error variance can be much reduced by differencing if $e_t$ is highly and positively autocorrelated, reflecting the fact that large (positive or negative) errors in the undifferenced series would tend to occur in runs.

References

Bach, G.L., Phillip D. Cagan, Milton Friedman, Clifford G. Hildreth, Franco Modigliani and Arthur Okun, 1976, Improving the monetary aggregates: Report of the Advisory Committee on Monetary Statistics (Board of Governors of the Federal Reserve System, Washington, DC).

Box, George E.P. and David R. Cox, 1964, An analysis of transformations, Journal of the Royal Statistical Society B 26, no. 2, 211-252.

Box, George E.P. and Gwilym M. Jenkins, 1970, Time series analysis, forecasting and control, Rev. ed. 1976 (Holden-Day, San Francisco, CA).

Box, George E.P., S.C. Hillmer and G.C. Tiao, 1976, Analysis and modelling of seasonal time series, in: A. Zellner, ed., Seasonal analysis of economic time series (U.S. Department of Commerce, Bureau of the Census, Washington, DC) 309-334.

Cleveland, William P., 1972, Analysis and forecasting of seasonal time series, Ph.D. dissertation (University of Wisconsin, Madison, WI).

Cleveland, William P. and George C. Tiao, 1976, Decomposition of seasonal time series: A model for the Census X-11 program, Journal of the American Statistical Association 71, Sept., 581-587.

Cochran, William G., 1963, Sampling techniques (Wiley, New York).

Geweke, John, 1976, The temporal and sectoral aggregation of seasonally adjusted time series, in: A. Zellner, ed., Seasonal analysis of economic time series (U.S. Department of Commerce, Bureau of the Census, Washington, DC) 411-427.

Morgenstern, Oskar, 1963, On the accuracy of economic observations (Princeton University Press, Princeton, NJ).

Parke, Darrel W., 1978, Nonmember banks and estimation of the monetary aggregates, in: Improving the monetary aggregates: Staff papers (Board of Governors of the Federal Reserve System, Washington, DC) 55-70.

Pierce, David A., 1978, Seasonal adjustment when both deterministic and stochastic seasonality are present, in: A. Zellner, ed., Seasonal analysis of economic time series (U.S. Department of Commerce, Bureau of the Census, Washington, DC) 242-269.

Pierce, David A., 1979, Signal extraction error in nonstationary time series, Annals of Statistics 7, 1303-1320.

Pierce, David A., 1980, Data revisions with moving average seasonal adjustment procedures, Journal of Econometrics 14, Sept., 95-114.

Porter, Richard D., Agustin Maravall, Darrel W. Parke and David A. Pierce, 1978, Transitory variations in the monetary aggregates, in: Improving the monetary aggregates: Staff papers (Board of Governors of the Federal Reserve System, Washington, DC) 1-34.

Scott, A.J., T.M.F. Smith and R.G. Jones, 1977, Application of time series methods to the analysis of repeated surveys, International Statistical Review 45, 13-28.

Shiskin, Julius, Allan H. Young and John C. Musgrave, 1967, The X-11 variant of the Census method-II seasonal adjustment program, Technical paper no. 15, Feb. (U.S. Bureau of the Census, Washington, DC).

Tiao, George C. and Steven C. Hillmer, 1978, Some consideration of decomposition of a time series, Biometrika 65, Dec., 497-502.

Whittle, Peter, 1963, Prediction and regulation by linear least squares methods (English Universities Press, London).