Journal of Macroeconomics 24 (2002) 469–482 www.elsevier.com/locate/econbase
Comments on 'The state of macroeconomic forecasting'
Michael P. Clements
Department of Economics, University of Warwick, Coventry CV4 7AL, UK
Received 3 April 2000; accepted 7 November 2000
Abstract

This paper is a comment on Fildes and Stekler's survey of published forecast evaluations of UK and US output growth and inflation. Various explanations of the poor quality of macroeconomic forecasts are considered, and a number of routes by which progress might be made are suggested. © 2002 Elsevier Science Inc. All rights reserved.

JEL classification: C52; C53
Keywords: Forecast failure; Structural shifts; Model mis-specification
E-mail address: [email protected] (M.P. Clements). 0164-0704/02/$ - see front matter. PII: S0164-0704(02)00056-3

1. Introduction

I would like to begin by congratulating Robert Fildes and Herman Stekler (henceforth F&S) on compiling an impressive survey of published forecast evaluations of UK and US output growth and inflation. Working with other people's forecast evaluations has the obvious drawback that you take what you get. F&S rightly bemoan the lack of information, when comparing forecasts from different models, on whether one set of forecasts is statistically better than another (see, e.g., Diebold and Mariano, 1995), and on whether allowing for the uncertainty inherent in the estimates of the models' parameters would affect the inferences made (see West, 1996; West and McCracken, 1998). Valuable information might also be provided by forecast encompassing tests, that is, tests of whether the inferior set of forecasts can be used to improve the accuracy of the superior (see, e.g., Chong and Hendry, 1986). Nevertheless, the weight of evidence amassed on the poor quality of macroeconomic
forecasts leads F&S to consider what might be done to improve their accuracy, including developments already underway. Section 7 of their paper considers a number of routes by which progress might be made, including a brief discussion of the explanations that David Hendry and I have given for the occurrence of forecast failure and systematic forecast errors, and the partial remedies that we propose. My purpose in this paper is to expand on that discussion.

In surveying the published results on the quality of forecasts, F&S consider the following: the role of judgmental adjustments; whether one method is consistently superior to others; whether economic models beat simpler standards of comparison; and whether forecast combination is desirable. My discussion of why forecast failure occurs and how forecasts can be improved will touch on some of these issues.

The material in this paper is organised as follows. Section 2 describes what is meant by forecast failure, and introduces the example of forecasting UK consumers' expenditure that will be used to illustrate key aspects of the discussion. Section 3 argues that model mis-specification cannot in general cause forecast failure, and Section 4 argues that structural change is likely to be the prime culprit. Ways of mitigating the effects of structural change are described in Sections 5 and 6. Section 7 considers the possibility that non-linearities in economic time series can account for forecast failures. It is argued that neglected non-linearities are unlikely to be serious for first-moment prediction, but may be more important if we wish to predict conditional variances of macroeconomic series, and especially of financial time series. It is shown analytically that the conditional mean forecasts of a linear model may be quite close to the forecasts of a non-linear business-cycle model of US output growth, for the degree of non-linearity that appears to characterise such series. This underscores a growing body of empirical and simulation evidence that attests to the competitiveness of linear models for forecasting. Finally, Section 8 concludes.
2. Forecast failure in macroeconomics

We need to be explicit about what constitutes failure. We define forecast failure as a significant deterioration in forecast performance relative to the anticipated outcome, where the anticipated outcome is typically based on the historical performance of the model. Thus, the squared forecast errors being significantly greater than the model's estimated in-sample error variance signals forecast failure. 'Significantly greater' can be formalised in a simple statistical test that compares the mean of the squared forecast errors to the model's error variance. Similarly, testing for forecast bias can be viewed as a test of forecast failure that focuses on the first-moment properties of the forecast errors. Assuming the model's in-sample errors are on average zero (and this will be enforced by estimation methods such as OLS), a non-zero bias of the forecast errors constitutes a worse performance than expected, and thus forecast failure. A 'poor forecast performance', on the other hand, could simply indicate that the object to be forecast is intrinsically difficult to model, so that large forecast errors might be expected on the basis of the poor in-sample fit of the equation. When forecasts from a simple time-series forecasting device are available
to rival those from the preferred macroeconomic forecasting system, we might generalise the notion of forecast failure to include a significantly worse performance of the macro forecasting system. F&S provide evidence of forecast failure due to bias, although the published studies do not clearly distinguish between poor forecasts and forecast failure in terms of second-moment performance. They find that the economic systems do not clearly underperform time-series model benchmarks.

A motivation for our interest in developing a theory of forecasting is that failure, as defined here, has occurred regularly in many countries. In addition to the detailed evidence documented by F&S, the wider picture for the UK in recent times would include the consumer boom of the late 1980s. Consider forecasts of aggregate non-durable consumers' expenditure made for this period with the equation in Davidson et al. (1978) (known as DHSY), which is close to the equation in the UK HM Treasury model during the early 1980s. The equation relates annual changes in real consumption to changes in real personal disposable income and the inflation rate, with an equilibrium-correction mechanism from the previous year's differential between consumption and income, such that, on a steady-state growth path for income, consumption is proportional to income. The model was estimated over the sample 1962(2)–1982(4) and all the in-sample tests of model adequacy were acceptable. Fig. 1a–d shows: the fitted, actual and 1-step ex post forecast values; their cross plot with separate regression lines pre- and post-1982; the residuals and forecast errors scaled by the equation standard error; and the forecasts with 1-step 95% confidence bands around the forecasts. The forecast and sample regressions have distinctly different slopes and the post-sample residuals greatly exceed the in-sample ones. Many realizations lie outside their confidence intervals, and the Chow (1964) constancy test over 1983(1)–1992(3) yields $F(40, 79) = 2.90$, which rejects at the 1% level.¹ This grave deterioration in out-of-sample, relative to in-sample, performance could not occur in a constant, stationary world even if the model used for forecasting was only a poor representation of the process that generated the data: model mis-specification per se cannot cause forecast failure, as illustrated in Section 3. The evidence here suggests that the model is a good representation of the data over the period 1962–84.

Fig. 1. Modelling and forecasting the annual growth in consumers' expenditure, $\Delta_4 c_t$: graphical statistics for the DHSY model.

To account for serious mis-prediction, a theory of economic forecasting must allow for an evolving economy subject to unforeseen shifts and structural change, and economic models which only imperfectly capture that structure. Our recent research into the possible causes of forecast failure (reported in Clements and Hendry, 1998, 1999a) allows for less-than-perfect models and an evolving macroeconomy subject to structural breaks, and distinguishes sources that are peripheral from those that are central. For this approach to be useful in terms of delivering relevant conclusions about empirical forecasting, we require that the data generating process (DGP), under which we analytically derive the properties of various forecasting devices and models that are used in practice, adequately captures the appropriate aspects of the real world to be forecast. Over the last 15 years, a consensus has emerged that the economy can be approximated by a system of integrated variables, subject to restrictions implied by cointegration, perturbed by occasional non-constancies.² This is the form of DGP that underlies our analysis.
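The two notions of forecast failure defined in this section (squared forecast errors that are too large relative to the in-sample error variance, and a non-zero forecast bias) can be sketched in a few lines. This is only an illustrative sketch, not the constancy test reported for Fig. 1: the function name `forecast_failure_stats`, the neglect of parameter-estimation uncertainty, the Wilson-Hilferty approximation to the chi-squared critical value, and the numbers in the usage lines are all assumptions made for the example.

```python
import math

def forecast_failure_stats(in_sample_resid, forecast_errors, n_params):
    """Return (chi2_stat, approx_5pct_critical_value, bias_t_ratio)."""
    n = len(in_sample_resid)
    h = len(forecast_errors)
    # In-sample error variance with a degrees-of-freedom correction.
    sigma2 = sum(r * r for r in in_sample_resid) / (n - n_params)
    # Sum of squared forecast errors relative to the in-sample variance;
    # roughly chi-squared with h degrees of freedom if nothing has changed
    # (parameter-estimation uncertainty is ignored in this sketch).
    chi2_stat = sum(e * e for e in forecast_errors) / sigma2
    # Wilson-Hilferty approximation to the 95% chi-squared quantile.
    z95 = 1.6449
    crit = h * (1.0 - 2.0 / (9.0 * h) + z95 * math.sqrt(2.0 / (9.0 * h))) ** 3
    # First-moment check: mean forecast error over its standard error.
    mean_error = sum(forecast_errors) / h
    bias_t = mean_error / math.sqrt(sigma2 / h)
    return chi2_stat, crit, bias_t

# Hypothetical data: unit-variance in-sample residuals, then ten
# forecast errors that are all equal to 2 (large and one-signed).
resid = [1.0, -1.0] * 20
errors = [2.0] * 10
chi2_stat, crit, bias_t = forecast_failure_stats(resid, errors, n_params=0)
print(chi2_stat > crit, abs(bias_t) > 2.0)
```

In this made-up case both checks signal failure: the forecast errors are too large relative to the in-sample fit and they are systematically of one sign.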
The forecasting models are often formulated in accordance with some theoretical notions, but will be mis-specified to an unknown extent for the DGP, particularly when the economy is evolving. While theory often serves as a guide, the model will also typically have been selected by some empirical criteria (raising concerns about model-selection effects), and the parameters will have been estimated (possibly inconsistently) from (probably inaccurate) observations. Any or all of these mistakes might precipitate significantly bad forecasts. However, it transpires that shifts in underlying growth rates and equilibrium means over the forecast period are likely to be the primary cause of forecast failure. Other possible sources of forecasting errors, such as model mis-specification and estimation uncertainty, are less central. Moreover, these findings provide a key as to why some time-series models perform reasonably well, and why we might have expected F&S to find support for judgmental adjustments in the form of intercept corrections, as we discuss in Section 6. Because periods of forecast failure and economic turbulence often go hand in glove, it is hardly surprising that structural breaks and regime shifts should be identified as the primary cause of forecast failure. What is less obvious is that it is possible to narrow down the types of structural change that result in forecast failure, and show that many of the factors commonly held to be important, such as mis-specified models, play only a peripheral role.

¹ The computations and figures were done in GiveWin; see Doornik and Hendry (1996).
² On integration and cointegration, some seminal references are Nelson and Plosser (1982), Engle and Granger (1987), and Johansen (1988).
3. Model mis-specification

To see why model mis-specification per se will not cause forecast failure, consider the following very simple example based on a bivariate vector autoregression (VAR). Since Sims (1980), VARs have been very popular in empirical macroeconomics as a way of describing the inter-relationships between variables, and together with Bayesian VARs have been used extensively for forecasting (see, e.g., Doan et al., 1984). The data generating process is given by

$$
\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}
= \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix}
+ \begin{pmatrix} \Pi_{11} & \Pi_{12} \\ 0 & \Pi_{22} \end{pmatrix}
\begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}
+ \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix}. \tag{1}
$$
The absence of feedback from $y_{1,t-1}$ on $y_{2,t}$ due to setting $\Pi_{21} = 0$ simplifies the algebra but otherwise is of no consequence. Consider the following mis-specification: the forecasting model for $y_{1,t}$ omits $y_{2,t-1}$. Simply setting $\Pi_{12} = 0$ and supposing $y_{1,t} = \phi_1 + \Pi_{11} y_{1,t-1} + \epsilon_{1,t}$ is invalid because of the textbook 'problem' of 'omitted variable bias'. We need to derive the coefficient on $y_{1,t-1}$ under (1). The following expressions for the unconditional moments of $y_t = [y_{1,t}\; y_{2,t}]'$ are simple to derive assuming $\{y_t\}$ is stationary: $E[y_t] = \varphi = (I_2 - \Pi)^{-1}\phi$, $E[(y_t - \varphi)(y_t - \varphi)'] = M = \Omega + \Pi M \Pi'$ and $E[(y_t - \varphi)(y_{t-1} - \varphi)'] = \Pi M$. Given (1), the model to be used for forecasting is obtained by reduction as

$$
\begin{aligned}
y_{1,t} &= \phi_1 + \Pi_{11} y_{1,t-1} + \Pi_{12} y_{2,t-1} + \epsilon_{1,t} \\
&= (\phi_1 + \Pi_{12}\rho) + (\Pi_{11} + \Pi_{12}\Psi_{11}) y_{1,t-1} + (\Pi_{12} u_{2,t-1} + \epsilon_{1,t}) \\
&= \delta_1 + \Gamma_{11} y_{1,t-1} + v_{1,t},
\end{aligned} \tag{2}
$$

where $\rho$ and $\Psi_{11}$ are defined by

$$
y_{2,t} = \rho + \Psi_{11} y_{1,t} + u_{2,t}, \qquad E[y_{1,t} u_{2,t}] = 0 \tag{3}
$$

and so $\Psi_{11} = M_{21}/M_{11}$, and

$$
\rho = E[y_{2,t} - \Psi_{11} y_{1,t}] = \varphi_2 - \Psi_{11}\varphi_1 = (-\Psi_{11} : 1)(I_2 - \Pi)^{-1}\phi. \tag{4}
$$

Despite the mis-specification, the model given by

$$
y_{1,t} = \delta_1 + \Gamma_{11} y_{1,t-1} + v_{1,t} \tag{5}
$$

is well defined, and its error variance in any forecast period will on average match that in-sample. Notice that from the expressions for $\delta_1$ and for $\rho$:

$$
\delta_1 = \phi_1 + \Pi_{12}\rho = \phi_1 - \Pi_{12}\Psi_{11}\varphi_1 + \Pi_{12}\varphi_2.
$$
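The claim that the mis-specified model (5) forecasts about as well out of sample as it fits in sample, provided the DGP (1) is constant, can be checked by simulation. The sketch below is only illustrative: the parameter values for $\phi$ and $\Pi$, the independent unit-variance errors, the sample sizes and the helper names (`simulate_var`, `ols_ar1`, `one_step_errors`) are all assumptions chosen for the example rather than anything taken from the paper.

```python
import random

def simulate_var(n, phi, pi, rng):
    """Simulate y1 from the bivariate VAR(1) in (1), independent N(0, 1) errors."""
    # Start from the unconditional means implied by phi and pi (pi[1][0] = 0).
    m2 = phi[1] / (1.0 - pi[1][1])
    m1 = (phi[0] + pi[0][1] * m2) / (1.0 - pi[0][0])
    y1, y2 = m1, m2
    path = []
    for _ in range(n):
        # Both right-hand sides use the lagged values of y1 and y2.
        y1, y2 = (phi[0] + pi[0][0] * y1 + pi[0][1] * y2 + rng.gauss(0, 1),
                  phi[1] + pi[1][1] * y2 + rng.gauss(0, 1))
        path.append(y1)
    return path

def ols_ar1(y):
    """OLS of y_t on a constant and y_{t-1}; returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    slope = sxz / sxx
    return mz - slope * mx, slope

def one_step_errors(y, intercept, slope):
    """1-step errors of the AR(1) model that omits y2."""
    return [y[t] - intercept - slope * y[t - 1] for t in range(1, len(y))]

rng = random.Random(0)
y1 = simulate_var(40000, phi=(1.0, 1.0), pi=((0.5, 0.4), (0.0, 0.7)), rng=rng)
fit, hold = y1[:20000], y1[20000:]
d1, g11 = ols_ar1(fit)                       # estimates of delta_1 and Gamma_11
e_in = one_step_errors(fit, d1, g11)
e_out = one_step_errors(hold, d1, g11)
var_in = sum(e * e for e in e_in) / len(e_in)
msfe_out = sum(e * e for e in e_out) / len(e_out)
bias_out = sum(e_out) / len(e_out)
print(round(g11, 2), round(bias_out, 2), round(msfe_out / var_in, 2))
```

With a constant DGP the hold-out bias should be close to zero and the ratio of out-of-sample MSFE to in-sample error variance close to one, even though the fitted slope differs from $\Pi_{11}$.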
We can obtain $\phi_1$ from

$$
\phi = (I_2 - \Pi)\varphi = \begin{pmatrix} (1 - \Pi_{11})\varphi_1 - \Pi_{12}\varphi_2 \\ (1 - \Pi_{22})\varphi_2 \end{pmatrix},
$$

and hence

$$
\delta_1 = (1 - \Pi_{11})\varphi_1 - \Pi_{12}\varphi_2 - \Pi_{12}\Psi_{11}\varphi_1 + \Pi_{12}\varphi_2 = (1 - \Gamma_{11})\varphi_1. \tag{6}
$$

Substituting (6) into (5):

$$
y_{1,t} - \varphi_1 = \Gamma_{11}(y_{1,t-1} - \varphi_1) + v_{1,t}.
$$

The important point is that the equilibrium mean of $y_{1,t}$ is unchanged at $\varphi_1$ despite the mis-specification. This ensures that the forecasts from the mis-specified model are unbiased. Consider the model's forecast error in predicting $T+1$ conditional on period $T$:

$$
y_{1,T+1} - \tilde{y}_{1,T+1} = \phi_1 + \Pi_{11} y_{1,T} + \Pi_{12} y_{2,T} + \epsilon_{1,T+1} - \varphi_1 - \Gamma_{11}(y_{1,T} - \varphi_1),
$$

where $\tilde{y}_{1,T+1} = \varphi_1 + \Gamma_{11}(y_{1,T} - \varphi_1)$. The unconditional expectation of this forecast error is zero, $E[y_{1,T+1} - \tilde{y}_{1,T+1}] = 0$. So model mis-specification in the absence of non-constancy cannot account for forecast failure, although mis-specifications that concern deterministic components might. Suppose, for example, that the data display an upward trend but this feature is absent from the model. Mis-specifications of this sort are, however, easily detectable and are unlikely to be a problem in practice.³

³ Clements and Hendry (2001) consider the consequences for forecasting of mistaking deterministic and stochastic trends.

4. Structural breaks

It is a truism that the problem with forecasting is that it is difficult to foretell what the future will bring. The macroaggregates that we observe and wish to model, and forecast, are the outcomes of myriad inter-related decisions and actions taken by large numbers of heterogeneous agents with conflicting objectives. Changes in, e.g., 'institutional arrangements', which impinge at the micro-level can set in motion forces which manifest at the macro-level in sudden shifts in erstwhile stable relationships. Changes might alternatively be precipitated by political, social, financial, legal, or technological change. Examples for the UK of financial and legal changes include the introduction of interest-bearing chequing accounts and the removal of mortgage rationing. The latter is thought to have contributed to the failure in predicting consumers' expenditure in the 1980s. Even if the forecaster is aware that such changes are occurring, it may be hard to envision what, and how large, the consequences of such changes will be, and how quickly they will kick in. Often the best we may be able to do is to rapidly adjust our forecasts as the changes begin to bite to ensure that successive sequences of forecasts are not all systematically wrong in the same direction. It is one thing to make a large forecast error when something happened 'out of the blue' after the forecast was made, quite another to continue to put out forecasts which on average will have similarly large errors of the same sign. Unfortunately, our analysis suggests that the popular vector equilibrium-correction models, or cointegrated vector autoregressive models, will do just that. The systematic forecast errors in Fig. 1 are a prime example.

Compare these forecasts with those depicted in Figs. 2 and 3. The forecasts shown in Fig. 2 are based on a 'unit-root' model, whereby the annual growth in consumption in period $T+1$ ($\Delta_4 c_{T+1}$) is forecast to be equal to the annual growth observed in period $T$:

$$
\Delta_4 c_t = \Delta_4 c_{t-1} + u_t. \tag{7}
$$
There is no evidence of predictive failure: the two regressions (in and out of sample) have nearly equal slopes and the post-sample residuals are comparable to the in-sample ones. The unit-root model is clearly mis-specified in-sample: the residual variance exceeds that of the DHSY model by more than 100%, and there is evidence of fourth-order residual autocorrelation and non-normality of the equation errors. An alternative strategy that proves to be highly successful is to 'intercept correct' the DHSY model. That is, each time a forecast is made we add in the error at the forecast origin (so for predicting period $T+1$ from period $T$, we add to the forecast the error made in predicting $T$ at time $T-1$). The results of doing so are shown in Fig. 3.

Considerable effort has been devoted to understanding the causes of the predictive failure in the DHSY model, and there are many potential explanations, including the financial deregulation of the mid-1980s, demographic change and omitted wealth effects: see the review in Muellbauer (1994). The main point of the illustration is that previously successful equations such as DHSY did not forecast satisfactorily through that period, while a model that would have been rejected as being inadequate in-sample suffered no such failings, nor did forecasts based on the simple 'trick' of adding in the previous errors. This example might be treated as a curiosity, but that would be a mistake. The next section explains why.

Fig. 2. Graphical statistics of the unit-root model.

Fig. 3. Graphical statistics for the intercept-corrected DHSY model.

5. Error correction and equilibrium correction

We suppose that the economy can usefully be viewed as a system of integrated variables, subject to restrictions implied by cointegration, but perturbed by occasional non-constancies. Then, the analogue of the unit-root model is a VAR in differences (denoted DVAR) that ignores the information contained in the reduced-rank cointegrating matrix of long-run restrictions, and the 'economic model' is the vector equilibrium-correction system (VEqCM). Clements and Hendry (1996) show that, when the means of the long-run equilibrium relations are subject to unanticipated shifts (for example, the savings ratio in a model of consumption, perhaps because of financial deregulation such as changes in the degree of mortgage rationing), forecasts from the DVAR will be approximately unbiased, while the VEqCM will 'error-correct' to the old equilibrium and generate biased forecasts. So the results in the consumption example might be part of a pattern. For example, Clements and Hendry (1996) confirm the finding in Mizon (1995) that a DVAR model of UK
wages and prices over the period 1966–1993 performs satisfactorily, while models which include long-run information fail. Eitrheim et al. (1999) compare the forecast performance of the Norges Bank macroeconomic model with VARs in differences, and find that the latter do well for short-horizon forecasts. F&S discuss a number of published studies reporting results for VARs and economic models, although it is seldom clear whether the latter impose the full complement of unit roots, that is, are specified in terms of first differences. They find some support for Bayesian VARs, which might be quite similar to VARs with unit roots imposed, depending upon the specification of the priors.
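The contrast drawn in Sections 4 and 5 can be illustrated with a univariate analogue, offered here only as a hedged sketch. It assumes a first-order process whose equilibrium mean shifts by `delta` at the forecast origin, a forecaster who keeps using the pre-break mean with known parameters (standing in for the VEqCM), a no-change forecast (standing in for the DVAR), and an intercept-corrected version of the first model. The function name `mean_shift_experiment` and all parameter values are hypothetical.

```python
import random

def mean_shift_experiment(n_post=5000, rho=0.5, mu0=0.0, delta=4.0, seed=1):
    """Post-break bias and MSFE of three 1-step forecasting devices."""
    rng = random.Random(seed)
    mu1 = mu0 + delta          # new equilibrium mean, unknown to the forecaster
    y_prev = mu0               # last pre-break observation, at the old equilibrium
    e_prev = 0.0               # equilibrium-correction error at the forecast origin
    errs = {"eqcm": [], "diff": [], "ic": []}
    for _ in range(n_post):
        y = mu1 + rho * (y_prev - mu1) + rng.gauss(0, 1)   # post-break DGP
        f_eqcm = mu0 + rho * (y_prev - mu0)   # corrects towards the OLD mean
        f_diff = y_prev                        # no-change forecast
        f_ic = f_eqcm + e_prev                 # add in the previous error
        e_eqcm = y - f_eqcm
        errs["eqcm"].append(e_eqcm)
        errs["diff"].append(y - f_diff)
        errs["ic"].append(y - f_ic)
        y_prev, e_prev = y, e_eqcm
    return {k: (sum(v) / len(v), sum(e * e for e in v) / len(v))
            for k, v in errs.items()}

results = mean_shift_experiment()
for name, (bias, msfe) in results.items():
    print(name, round(bias, 2), round(msfe, 2))
```

Under these assumptions the equilibrium-correcting forecasts have an error of `(1 - rho) * delta` plus noise in every post-break period, whereas the no-change and intercept-corrected forecasts show almost no average bias, at the cost of a larger error variance than a correctly centred model would have.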
6. Intercept corrections

Forecast failure, in terms of biased forecasts, will occur when the model's forecasts and the mean value of the data are out of sync because of shifts in the process generating the data, relative to the model. Intercept corrections (ICs; pejoratively, 'con' or 'fudge' factors, or ad hoc corrections) can reduce this discrepancy once the shift has already occurred, and hence lessen the likelihood of failure. A very simple example taken from Clements and Hendry (1998) will illustrate. Let $y_t$ be generated around an unconditional mean $\mu$ by

$$
y_t = \mu + \epsilon_t, \qquad \text{where } \epsilon_t \sim \mathsf{IN}(0, \sigma^2). \tag{8}
$$

Estimate $\mu$ from a sample of size $T$ by least squares, and forecast $y_{T+1}$ by

$$
\hat{y}_{T+1} = \hat{\mu} = T^{-1}\sum_{t=1}^{T} y_t. \tag{9}
$$

When (8) is the DGP, (9) is the estimated conditional expectation $E[y_{T+1} \mid y_T] = \mu$ and, as $E[\hat{\mu}] = \mu$, provides the minimum-variance unbiased forecast. However, suppose that the DGP is instead

$$
y_t = \mu + \delta 1_{\{t \ge T_1\}} + \epsilon_t, \qquad \text{where } \epsilon_t \sim \mathsf{IN}(0, \sigma^2), \tag{10}
$$

and $1_{\{t \ge T_1\}}$ is an indicator with the value zero till time $T_1 < T$, after which it is unity. Consequently $E[y_{T+1} \mid y_T] = E[y_{T+1}] = \mu + \delta$, for which (9) is a poor estimate, since for $\kappa = T^{-1}T_1$,

$$
E[\hat{\mu}] = T^{-1}\left(\sum_{t=1}^{T_1} E[y_t] + \sum_{t=T_1+1}^{T} E[y_t]\right) = \mu + (1 - \kappa)\delta.
$$

However, the residual at $T$ was $\hat{u}_T = y_T - \hat{\mu}$, so to set the model 'back on track' (i.e., fit perfectly at the forecast origin), the IC $\hat{u}_T$ may be added to (9), to deliver the alternative forecast $\hat{y}_{\iota,T+1}$ given by
$$
\hat{y}_{\iota,T+1} = \hat{\mu} + \hat{u}_T = \hat{\mu} + y_T - \hat{\mu} = y_T. \tag{11}
$$
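The location-shift example can be checked numerically. The following Monte Carlo sketch simulates the DGP in (10) and compares the sample-mean forecast (9) with the intercept-corrected forecast (11). The function name `location_shift_mc` and every numerical value are assumptions made for illustration, and the shift is taken to apply to observations strictly after $T_1$, in line with the way the sums for $E[\hat{\mu}]$ are split.

```python
import random

def location_shift_mc(T=100, T1=60, mu=1.0, delta=2.0, sigma=1.0,
                      reps=5000, seed=2):
    """Return (mean of mu_hat, MSFE of sample-mean forecast, MSFE of IC forecast)."""
    rng = random.Random(seed)
    sum_mu_hat = 0.0
    se_mean = 0.0
    se_ic = 0.0
    for _ in range(reps):
        # Mean is mu up to T1 and mu + delta afterwards, including period T + 1.
        y = [mu + (delta if t > T1 else 0.0) + rng.gauss(0, sigma)
             for t in range(1, T + 2)]
        sample, y_next = y[:T], y[T]
        mu_hat = sum(sample) / T              # forecast (9)
        sum_mu_hat += mu_hat
        se_mean += (y_next - mu_hat) ** 2
        se_ic += (y_next - sample[-1]) ** 2   # forecast (11): y_T itself
    return sum_mu_hat / reps, se_mean / reps, se_ic / reps

mean_mu_hat, msfe_mean, msfe_ic = location_shift_mc()
kappa = 60 / 100
print(round(mean_mu_hat, 2), round(1.0 + (1 - kappa) * 2.0, 2))
print(round(msfe_mean, 2), round(msfe_ic, 2))
```

In this configuration the average of the sample mean sits between the old and new means, so the forecast (9) is biased for $y_{T+1}$, while the intercept-corrected forecast trades that bias for extra variance.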
Thus, this IC radically alters the implicit forecasting model, which becomes a random walk, as (11) shows. Consequently, $E[\hat{y}_{\iota,T+1}] = E[y_T] = \mu + \delta$, which is unconditionally unbiased, with unconditional MSFE:

$$
E[(y_{T+1} - \hat{y}_{\iota,T+1})^2] = 2\sigma^2, \tag{12}
$$

as against the minimum obtainable (for known parameters) of $\sigma^2$. ICs such as $\hat{u}_T$ can offset breaks that have already occurred, as evident from Fig. 3. As F&S note, judgmental adjustments in the form of ICs can be given a number of rationalizations, but we suspect that the main role they play in times of structural change is to automatically home in the model-generated forecasts on the changed data means. Clements and Hendry (1999a) discuss eight distinct, but obviously closely related, interpretations of why ICs work.

7. Omitted non-linearities and forecast performance

A number of authors have found that non-linear models of US output growth are not favoured on MSFE criteria. Pesaran and Potter (1997) find that their 'floor and ceiling' model is better at predicting conditional variances of the growth rates of output, but not the growth rates themselves. Clements and Krolzig (1998) find that linear models are just about as good empirically as popular Markov switching (MS-AR) (e.g., Hamilton, 1989) and threshold autoregressive (e.g., Tong, 1995) models of US output growth. More surprisingly, Monte Carlo results employing empirical business-cycle models for the data generating process deliver the same conclusion! Some simple algebra (taken from Clements and Krolzig, 1998) explains why this is the case. We focus on the Markov switching model because it allows an explicit analytical expression for the optimal predictor. For the sake of simplicity consider the first-order model

$$
\Delta y_t - \mu(s_t) = \alpha(\Delta y_{t-1} - \mu(s_{t-1})) + \epsilon_t, \tag{13}
$$

where $\epsilon_t \sim \mathsf{NID}(0, \sigma^2)$ and the conditional mean $\mu(s_t)$ of the growth rate of output ($\Delta y_t$) switches between two states:

$$
\mu(s_t) = \begin{cases} \mu_1 > 0 & \text{if } s_t = 1 \text{ ('expansion' or 'boom')}, \\ \mu_2 < 0 & \text{if } s_t = 2 \text{ ('contraction' or 'recession')}, \end{cases}
$$

and where the evolution of regimes is governed by a Markov chain with transition probabilities

$$
p_{ij} = \Pr(s_{t+1} = j \mid s_t = i), \qquad \sum_{j=1}^{M} p_{ij} = 1, \qquad \forall i, j \in \{1, 2\}, \tag{14}
$$

with $M = 2$ regimes here.
The seminal paper of Hamilton (1989) proposed a model of this form, except that lags up to order four were allowed, but this is of no consequence here.
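To make the two-regime model in (13) and (14) concrete, the following sketch simulates it directly. It is only an illustration under assumed values: the regime means, transition probabilities, autoregressive coefficient, error standard deviation and the function name `simulate_ms_ar` are hypothetical and are not estimates from any of the cited papers.

```python
import random

def simulate_ms_ar(n, mu1, mu2, p11, p22, alpha, sigma, seed=3):
    """Simulate growth rates and regimes from the first-order model (13)-(14)."""
    rng = random.Random(seed)
    s = 1                      # start in the expansion regime
    z = 0.0                    # Gaussian AR(1) deviation, dy_t - mu(s_t)
    dy, states = [], []
    for _ in range(n):
        # Stay in the current regime with probability p11 (or p22), else switch.
        stay = p11 if s == 1 else p22
        if rng.random() >= stay:
            s = 2 if s == 1 else 1
        z = alpha * z + rng.gauss(0, sigma)
        dy.append((mu1 if s == 1 else mu2) + z)
        states.append(s)
    return dy, states

dy, states = simulate_ms_ar(50000, mu1=1.0, mu2=-0.5, p11=0.9, p22=0.75,
                            alpha=0.64, sigma=0.8)
share_regime2 = states.count(2) / len(states)
mean_growth = sum(dy) / len(dy)
# Implied long-run share of regime 2: p12 / (p12 + p21).
print(round(share_regime2, 2), round(0.1 / (0.1 + 0.25), 2))
```

The simulated share of periods spent in the contraction regime can be compared with the long-run share implied by the assumed transition probabilities, which gives a quick sanity check on the regime dynamics.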
Notice that the model can be rewritten as the sum of two independent processes:

$$
\Delta y_t - \mu_y = \mu_t + z_t,
$$

where $\mu_y$ is the unconditional mean of $\Delta y_t$, such that $E[\mu_t] = E[z_t] = 0$. While the process $z_t$ is Gaussian:

$$
z_t = \alpha z_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \mathsf{NID}(0, \sigma^2),
$$

the other component, $\mu_t$, represents the contribution of the Markov chain:

$$
\mu_t = (\mu_2 - \mu_1)\zeta_t,
$$

where $\zeta_t = 1 - \Pr(s_t = 2)$ if $s_t = 2$ and $-\Pr(s_t = 2)$ otherwise. $\Pr(s_t = 2) = p_{12}/(p_{12} + p_{21})$ is the unconditional probability of regime 2. Invoking the unrestricted autoregressive representation of a Markov chain (see Krolzig, 1997, p. 40):

$$
\zeta_t = (p_{11} + p_{22} - 1)\zeta_{t-1} + v_t,
$$

then predictions of the hidden Markov chain are given by

$$
\hat{\zeta}_{t+h|t} = (p_{11} + p_{22} - 1)^h \hat{\zeta}_{t|t},
$$

where $\hat{\zeta}_{t|t} = E(\zeta_t \mid Y_t) = \Pr(s_t = 2 \mid Y_t) - \Pr(s_t = 2)$ is the filtered probability $\Pr(s_t = 2 \mid Y_t)$ of being in regime 2 corrected for the unconditional one. Thus the conditional mean of $\Delta y_{t+h}$ is given by

$$
\begin{aligned}
\widehat{\Delta y}_{t+h|t} - \mu_y &= \hat{\mu}_{t+h|t} + \hat{z}_{t+h|t} \\
&= (\mu_2 - \mu_1)(p_{11} + p_{22} - 1)^h \hat{\zeta}_{t|t} + \alpha^h\left[\Delta y_t - \mu_y - (\mu_2 - \mu_1)\hat{\zeta}_{t|t}\right] \\
&= \alpha^h(\Delta y_t - \mu_y) + (\mu_2 - \mu_1)\left[(p_{11} + p_{22} - 1)^h - \alpha^h\right]\hat{\zeta}_{t|t}.
\end{aligned} \tag{15}
$$
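The decomposition in (15) is easy to evaluate numerically, which shows how small the regime-dependent term becomes when the persistence of the Markov chain is close to that of the Gaussian component. The sketch below is illustrative only: the function name `ms_ar_forecast`, the regime means, the split of the persistence between `p11` and `p22`, the filtered probability and the current growth rate are all hypothetical, while the persistence of 0.65 and the autoregressive root of 0.64 are the values reported for the Clements and Krolzig (1998) estimates.

```python
def ms_ar_forecast(dy_t, mu_y, mu1, mu2, p11, p22, alpha, zeta_hat, h):
    """h-step conditional mean from (15), split into linear and regime parts."""
    persistence = p11 + p22 - 1.0
    linear = alpha ** h * (dy_t - mu_y)
    regime = (mu2 - mu1) * (persistence ** h - alpha ** h) * zeta_hat
    return mu_y + linear + regime, linear, regime

# Hypothetical regime means and transition probabilities (persistence 0.65).
mu1, mu2, p11, p22, alpha = 1.0, -0.5, 0.9, 0.75, 0.64
pr2 = (1 - p11) / ((1 - p11) + (1 - p22))   # unconditional Pr(s_t = 2)
mu_y = mu1 + (mu2 - mu1) * pr2              # unconditional mean of growth
zeta_hat = 0.95 - pr2                       # assumes filtered Pr(s_t = 2 | Y_t) = 0.95
for h in (1, 2, 4, 8):
    total, linear, regime = ms_ar_forecast(-0.8, mu_y, mu1, mu2, p11, p22,
                                           alpha, zeta_hat, h)
    print(h, round(linear, 4), round(regime, 4))
```

Even with the filter almost certain that the economy is in the contraction regime, the regime term is a small fraction of the linear term at every horizon shown, because the two persistence parameters nearly cancel inside the square bracket of (15).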
The first term in (15) is the optimal prediction rule for a linear model, and the contribution of the Markov regime-switching structure is given by the term multiplied by $\hat{\zeta}_{t|t}$, where $\hat{\zeta}_{t|t}$ contains the information about the most recent regime at the time the forecast is made. Thus the contribution of the non-linear part of (15) to the overall forecast depends on both the magnitude of the regime shifts, $|\mu_2 - \mu_1|$, and on the persistence of regime shifts $p_{11} + p_{22} - 1$ relative to the persistence of the Gaussian process, given by $\alpha$. For the model estimates in Clements and Krolzig (1998), $p_{11} + p_{22} - 1 = 0.65$, and the largest root of the AR polynomial is 0.64. This explains the success of the linear AR model in forecasting the MS-AR process. Since the predictive power of detected regime shifts is extremely small, $p_{11} + p_{22} - 1 \approx \alpha$ in (15), the conditional expectation collapses to a linear prediction rule.

But while linear models may be competitive with non-linear models for first-moment prediction for the degrees of non-linearity that characterise macroeconomic data (so the use of linear models is unlikely to be a serious cause of forecast failure), the same might not be true for higher moments, or indeed for density forecasts. Here we have in mind the findings of Pesaran and Potter (1997) noted above for conditional variances, and those of Clements and Smith (2000) which show that non-linear models of US output growth and unemployment may more accurately portray the
distribution of future realizations of these variables than linear models. The 'floor and ceiling' model exhibits regime-dependent heteroscedasticity in the underlying equation disturbances, and therefore suggests that the ability to predict the future values of the series varies over time, and systematically with the model-identified regime. The phenomenon of time-varying conditional forecast-error variances is of course much more common in financial time series, as manifest in autoregressive conditional heteroscedasticity (ARCH). ARCH models and their generalizations have become almost indispensable in the modelling of high-frequency financial series, which are typically characterised by thick-tailed unconditional distributions, variances that change over time, clusterings of large (small) changes, and periodic bouts of volatility⁴ (good surveys of the vast literature on ARCH and related models are Engle and Bollerslev (1987), Bollerslev et al. (1992), Bera and Higgins (1993), and Shephard (1996)). Clements and Taylor (2002) discuss a number of tests of interval adequacy that have power to reject intervals that do not widen and contract in line with the periodic volatility in the series, extending the approach to prediction interval evaluation of Christoffersen (1998).

8. Conclusions

F&S provide an excellent survey of the state of macroeconomic forecasting based on the results of published studies. I would argue that the state of the theory of macroeconomic forecasting is in reasonably good shape. In essence, forecast failure occurs because of unforeseen forecast-period events. Models which performed well within sample may fail out of sample when there are structural shifts (e.g., the DHSY model of consumers' expenditure), while models which provide a poor in-sample characterisation of the data exhibit no obvious problems (e.g., the unit-root consumption model).
That forecast failure in economics is not due to model mis-specification has a number of important methodological implications for econometric modelling, and these are explored in Clements and Hendry (1999b). An obvious implication of the work discussed above is that forecast performance in a world of deterministic shifts is not a good guide to model choice, unless the sole objective is short-term forecasting. Models which omit causal factors and cointegrating relations, and impose unit roots, may adapt more quickly in the face of unmodelled shifts, and so provide more accurate forecasts, but may omit the important channels of influence that are essential for policy analysis. Moreover, the simplest unit-root model forecast is essentially of 'no change'. The unit-root consumption model is of this sort: the annual increase to next quarter is simply equal to the observed annual increase to this quarter. As a close inspection of Fig. 2 indicates, the forecasts always lag the actual 'turning points' in the growth rates: when negative growth replaces positive growth, positive growth is initially forecast before the forecasts adapt. No-change forecasts are unable to anticipate changes in direction.

As F&S note, the record on anticipating turning points is poor. Recent work on incorporating leading-indicator type variables in the regime-switching functions of non-linear business-cycle models of output growth may lead to improvements.⁵ But an improved outlook for macroeconomic forecasting in general would appear to require more timely recognition of the structural shifts that are afoot in the economy, and a better appreciation of their likely macroeconomic effects. Making allowances for non-linearities in macroeconomic relationships and advances in macroeconomic theory do not tackle the root problem and are likely to yield only marginal improvements.

⁴ On the latter, Bollerslev and Ghysels (1996, p. 139) note that 'Most high-frequency asset returns exhibit seasonal volatility patterns'.
⁵ See, e.g., the work of Denise Osborn and her collaborators at the Centre for Growth and Business Cycle Research at the University of Manchester.

Acknowledgements

Financial support from the UK Economic and Social Research Council under grants L116251015 and L138251009 is gratefully acknowledged. I am grateful to David Hendry for helpful comments. This paper is largely based on joint research with David F. Hendry (Nuffield College, Oxford) over the last decade. Much of this work is conveniently summarised in two books, Forecasting Economic Time Series (Cambridge University Press, 1998) and Forecasting Non-stationary Economic Time Series (MIT, 1999). The comments on non-linear models, prediction intervals and density forecasts are based on recent research with Hans-Martin Krolzig (Nuffield College, Oxford), Jeremy Smith (Department of Economics, University of Warwick) and Nick Taylor (Cardiff Business School).

References

Bera, A.K., Higgins, M.L., 1993. ARCH models: Properties, estimation and testing. Journal of Economic Surveys 7, 305–366.
Bollerslev, T., Ghysels, E., 1996. Periodic autoregressive conditional heteroscedasticity. Journal of Business and Economic Statistics 14, 139–151.
Bollerslev, T., Chou, R.S., Kroner, K.F., 1992. ARCH modelling in finance: a review of the theory and empirical evidence. Journal of Econometrics 52, 5–59.
Chong, Y.Y., Hendry, D.F., 1986. Econometric evaluation of linear macro-economic models. Review of Economic Studies 53, 671–690. Reprinted in: Granger, C.W.J. (Ed.), Modelling Economic Series. Clarendon Press, Oxford.
Chow, G.C., 1964. A comparison of alternative estimators for simultaneous equations. Econometrica 32, 532–553.
Christoffersen, P.F., 1998. Evaluating interval forecasts. International Economic Review 39, 841–862.
Clements, M.P., Hendry, D.F., 1996. Intercept corrections and structural change. Journal of Applied Econometrics 11, 475–494.
Clements, M.P., Hendry, D.F., 1998. Forecasting Economic Time Series: The Marshall Lectures on Economic Forecasting. Cambridge University Press, Cambridge.
Clements, M.P., Hendry, D.F., 1999a. Forecasting Non-Stationary Economic Time Series: The Zeuthen Lectures on Economic Forecasting. MIT Press, Cambridge, MA.
Clements, M.P., Hendry, D.F., 1999b. Some methodological implications of forecasting failure. Mimeo., Department of Economics, University of Warwick.
Clements, M.P., Hendry, D.F., 2001. Forecasting with difference-stationary and trend-stationary models. Econometrics Journal 4, S1–S19.
Clements, M.P., Krolzig, H.-M., 1998. A comparison of the forecast performance of Markov-switching and threshold autoregressive models of US GNP. Econometrics Journal 1, C47–C75.
Clements, M.P., Smith, J., 2000. Evaluating the forecast densities of linear and non-linear models: Applications to output growth and unemployment. Journal of Forecasting 19, 255–276.
Clements, M.P., Taylor, N., 2002. Evaluating prediction intervals for high-frequency data. Journal of Applied Econometrics, forthcoming.
Davidson, J.E.H., Hendry, D.F., Srba, F., Yeo, J.S., 1978. Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom. Economic Journal 88, 661–692. Reprinted in: Hendry, D.F., Econometrics: Alchemy or Science. Blackwell Publishers, Oxford.
Diebold, F.X., Mariano, R.S., 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13, 253–263.
Doan, T., Litterman, R., Sims, C.A., 1984. Forecasting and conditional projection using realistic prior distributions. Econometric Reviews 3, 1–100.
Doornik, J.A., Hendry, D.F., 1996. GiveWin: An interactive empirical modelling program. Timberlake Consultants Press, London.
Eitrheim, Ø., Husebø, T.A., Nymoen, R., 1999. Equilibrium-correction versus differencing in macroeconometric forecasting. Economic Modelling 16, 515–544.
Engle, R.F., Bollerslev, T., 1987. Modelling the persistence of conditional variances. Econometric Reviews 5, 1–50.
Engle, R.F., Granger, C.W.J., 1987. Cointegration and error correction: Representation, estimation and testing. Econometrica 55, 251–276.
Hamilton, J.D., 1989. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57, 357–384.
Johansen, S., 1988. Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control 12, 231–254.
Krolzig, H.-M., 1997. Markov Switching Vector Autoregressions: Modelling, Statistical Inference and Application to Business Cycle Analysis. In: Lecture Notes in Economics and Mathematical Systems, Vol. 454. Springer-Verlag, Berlin.
Mizon, G.E., 1995. Progressive modelling of macroeconomic time series: the LSE methodology. In: Hoover, K.D. (Ed.), Macroeconometrics: Developments, Tensions and Prospects. Kluwer Academic Press, Dordrecht, pp. 107–169.
Muellbauer, J.N.J., 1994. The assessment: Consumer expenditure. Oxford Review of Economic Policy 10, 1–41.
Nelson, C.R., Plosser, C.I., 1982. Trends and random walks in macroeconomic time series: Some evidence and implications. Journal of Monetary Economics 10, 139–162.
Pesaran, M.H., Potter, S.M., 1997. A floor and ceiling model of US output. Journal of Economic Dynamics and Control 21, 661–695.
Shephard, N.G., 1996. Statistical aspects of ARCH and stochastic volatility. In: Cox, D.R., Hinkley, D.V., Barndorff-Nielsen, O.E. (Eds.), Time Series Models: In Econometrics, Finance and Other Fields. Chapman and Hall, London.
Sims, C.A., 1980. Macroeconomics and reality. Econometrica 48, 1–48. Reprinted in: Granger, C.W.J. (Ed.), Modelling Economic Series. Clarendon Press, Oxford.
Tong, H., 1995. Non-linear Time Series: A Dynamical System Approach. Clarendon Press, Oxford. First published 1990.
West, K.D., 1996. Asymptotic inference about predictive ability. Econometrica 64, 1067–1084.
West, K.D., McCracken, M.W., 1998. Regression-based tests of predictive ability. International Economic Review 39, 817–840.