Available online at www.sciencedirect.com
Journal of Policy Modeling 30 (2008) 417–429
Macroeconomic forecasting in the EMU
Does disaggregate modeling improve forecast accuracy?

Karsten Ruth

J.W. Goethe-University Frankfurt, Faculty of Economics and Business Administration, Graduate Program “Finance and Monetary Economics”, Mertonstr. 17-21, 60054 Frankfurt (Main), Germany

Received 1 December 2006; received in revised form 1 September 2007; accepted 1 December 2007
Available online 25 December 2007
Abstract

Accurate forecasts of aggregate European variables are crucial for conducting a union-wide monetary policy. This paper investigates empirically whether pooling forecasts from disaggregate models is a promising strategy for forecasting actual macroeconomic European variables. In contrast to previous studies we formulate an intermediate case of disaggregation with regard to forecast combination by pooling forecasts obtained from models which are separately specified and estimated for subgroups of actual EMU Member States. Moreover, by modeling different degrees of monetary autonomy across countries during the EMS-era we explicitly account for cross-country heterogeneity in advance of 1999. We find that policymakers might obtain more accurate forecasts of actual European macroeconomic variables by pooling subgroup-specific forecasts compared to forecasting with a single union-wide model.

© 2008 Society for Policy Modeling. Published by Elsevier Inc. All rights reserved.

JEL classification: C32; C53; E65

Keywords: Forecast pooling; Disaggregate modeling; Monetary policy; Euro area
E-mail address: karsten [email protected].
0161-8938/$ – see front matter © 2008 Society for Policy Modeling. Published by Elsevier Inc. All rights reserved.
doi:10.1016/j.jpolmod.2007.12.002

1. Introduction

Conducting a single monetary policy for the euro area mainly requires monitoring the development of aggregate European variables like area-wide inflation. This is why, from a policymaker’s perspective, accurate forecasts of union-wide macroeconomic variables are of crucial importance. To this day, economic forecasters depend heavily on econometric models which are fitted to historical pre-EMU data in order to forecast actual macroeconomic European variables. In fact, it has become quite common in the empirical literature to use aggregated single-country data before 1999 (i.e. “synthetic” euro data) for analyzing union-wide developments, see Fagan, Henry, and Mestre (2005). However, the use of aggregated data within a union-wide model ignores country-specific differences in economic structure and imposes a questionable degree of cross-country homogeneity.

In a recent study, Marcellino, Stock, and Watson (2003) show that the violation of this homogeneity assumption is directly reflected in the forecast performance of a model which is based on aggregated data. They find that the combination of country-specific forecasts often outperforms forecasts obtained from such an aggregate Euro-model.

Building on the work of Marcellino et al. (2003) this paper investigates whether forecasts of actual European variables can be improved by pooling forecasts from different models, which are separately estimated for subgroups of EMU Member States. A distinguishing feature of this study is that we focus on subgroups of countries in order to formulate an intermediate case of disaggregation between pooled country-specific forecasts and forecasts based on aggregated data.1 In particular, we aggregate data for similar countries, which are comprised in a particular subgroup, while varying the model specification between subgroups. Thereby, we might improve on forecasts from an aggregate Euro-model by drawing on different information sets, while avoiding the overparameterization that would result from neglecting homogeneity within a given subgroup, see Marcellino et al. (2003).
From a practitioner’s perspective this research strategy provides some advantages compared to both the country-specific and the aggregate approach: While it is not necessary to specify a separate model for each of the countries, it is possible to account for some cross-country differences in economic structure, which is not possible within an aggregate Euro-model. In fact, in this paper we put considerable effort into varying the model specification across the disaggregate subgroup-models in order to capture different economic conditions prevalent in different European countries. In particular, we account for unequal degrees of monetary autonomy during the pre-EMU era, thereby following an idea by Mojon and Peersman (2003) who group actual Member States of the EMU according to the monetary independence those countries exhibited within the European Monetary System (EMS). Thus, we explicitly exploit one of the advantages of disaggregation, namely “that the specification can be varied across micro units to suit the circumstances” (Barker & Pesaran, 1990, p. 6).

We illustrate our research strategy by predicting inflation, real output and interest rates since 1999 either by pooling subgroup-forecasts or by forecasting directly with an aggregate Euro-model. Comparing the out-of-sample forecast performance we ask whether the combination of subgroup-specific forecasts outperforms the forecast obtained from a single, aggregate model. In contrast to Marcellino et al. (2003), who performed out-of-sample forecasts for data up to 1997, we explicitly predict macroeconomic European variables since 1999. This might help to assess the usefulness of models, which are fitted to historical data from the pre-EMU era, for actual forecast purposes.
1 This approach is related to a study by Hubrich (2005) who asks whether one can improve European HICP-inflation forecasts by pooling inflation forecasts for sub-components of the HICP as opposed to predicting HICP-inflation directly.
The main finding of our forecasting exercise is that pooling subgroup-specific forecasts tends to be superior to forecasting with a single Euro-model. In particular, forecast accuracy improves with regard to forecasting inflation and interest rates while we do not observe such an improvement with regard to real output. Overall, we argue that formulating an intermediate case of disaggregation by comprising similar countries in a common subgroup and pooling subgroup-specific forecasts might be a promising strategy in the light of an increasing number of EMU Member States.

As the preparation of accurate forecasts constitutes a crucial part of the political decision-making process the qualitative results of this study are likely to be of practical relevance for policymakers and modelers. In particular, our results indicate that introducing prior knowledge about country-specific characteristics and cross-country differences into the forecasting process pays off in terms of forecast accuracy. In view of the empirical results, it therefore appears desirable to consider macroeconomic forecasting as an economic task and not solely as a statistical exercise.

The paper is organized as follows: Section 2 discusses the basic econometric setup and presents the subgroup-specific model specifications. In Section 3 we discuss the construction of aggregated data as well as the model selection. Section 4 presents the empirical results of the forecast comparison and provides some robustness analysis. Section 5 discusses potential policy implications of the empirical results. Finally, Section 6 concludes.

2. Model specification

2.1. Modeling the EMS context

2.1.1. The basic setup

The basic idea underlying the combination of forecasts2 is that a pooled forecast might benefit from variation of the model specification across the disaggregate units compared to forecasts obtained from a single rival model.
However, an increased accuracy by means of forecast combination always indicates the lack of an “overall” model, which incorporates all available information optimally (see Clements & Hendry, 1998, ch. 10). Referring to the encompassing principle (see Mizon, 1984), it should be part of a progressive research strategy to refine an existing model such that the improved model contains all useful features of the former model. Put differently, the aim of the researcher should be to combine existing information sets rather than forecasts based on different information sets (see Fang, 2003). In fact, the use of aggregated data within an area-wide model (see Fagan et al., 2005) can be considered as the attempt to pool information sets. However, this procedure ignores cross-country differences in economic structure by imposing a high degree of homogeneity. For the pre-EMU period in particular, imposing cross-country homogeneity turns out to be highly questionable. In the light of the asymmetric design of the European Monetary System (EMS) the assumption of homogeneous European countries is severely at variance with the historical European experience.

In contrast, our approach circumvents this restrictive homogeneity assumption. By explicitly modeling the EMS-context in advance of 1999 the subgroup-specific models do not only draw on different information sets but also differ in the way the available information is incorporated. This is why forecast combination seems to be a promising alternative to a direct
2 For early reviews on this topic see e.g. Clemen (1989) and Granger (1989).
forecast based on aggregated data when predicting actual European variables. This is in line with theoretical considerations by Clements and Hendry (1998, p. 227) who justify the forecast combination approach by emphasizing that “when models do not draw on a common information pool, and are essentially of a different nature or type [...] then the case for combination is [...] persuasive.”

We account for the country-specific “EMS constraint” by focusing on subgroups of countries. Thereby, we follow Mojon and Peersman (2003) who classify actual Member States of the EMU into three different subgroups. The main criterion underlying their classification is the degree of autonomy with regard to monetary policy decision-making during the pre-EMU era. Accordingly, we estimate three different subgroup-models. For the first model we exclusively use German data; the second model is estimated with data which were aggregated over Austria, Belgium and the Netherlands. Moreover, we specify a model for a subgroup including Finland, France, Italy, Portugal and Spain.3 Finally, we estimate an aggregate Euro-model with union-wide data, aggregated over nine Member States of the EMU.

The analysis is based on the class of vector error-correction models (VECMs).4 Following the familiar notation (see e.g. Johansen, 1995) a VECM can be written as:

\[
\begin{aligned}
\Delta y_t &= \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \sum_{j=0}^{k} B_j x_{t-j} + \Phi D_t + \varepsilon_t \\
&= \alpha\beta' y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \sum_{j=0}^{k} B_j x_{t-j} + \Phi D_t + \varepsilon_t
\end{aligned}
\tag{1}
\]
where y_t is a vector process of n endogenous variables, and x_t is a vector of m (stationary) exogenous variables. D_t contains deterministic components like a constant, a time trend or dummy variables. It follows from (1) that we can decompose Π into the product of the n × r matrices α and β, i.e. Π = αβ′, where r is the number of cointegrating relations (the cointegration rank). The columns of β contain the cointegrating vectors whereas α contains the adjustment coefficients.

We restrict our attention to four endogenous variables in order to keep our model setup parsimonious. These variables are the year-on-year inflation rate (π), (log) real GDP (GDP), a short-term nominal interest rate (i^s), and a long-term nominal interest rate (i^l). This is a natural choice in the monetary policy literature for specifying (cointegrated) vector-autoregressive (VAR) models to analyze the interaction between interest rates, prices and output. In particular, i^l is included in the endogenous vector since it has been shown that the precision of parameter estimates for this kind of VAR model substantially increases when a long-term nominal interest rate is incorporated, see Bagliano and Favero (1998). Finally, we mainly introduce differences between the subgroup-models by varying the exogenous variables included in x_t. In the following, the model specifications for the three different subgroups of EMU Member States as well as for the aggregate Euro-model are discussed.
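To make the role of the decomposition Π = αβ′ concrete, the following minimal sketch evaluates a single error-correction step of Eq. (1) for a toy system with n = 2 variables and rank r = 1. All coefficient values and the helper names (`matvec`, `outer`, `vecm_step`) are hypothetical and serve only as an illustration; exogenous and deterministic terms are omitted and the shock is set to zero.

```python
# Illustrative two-variable VECM step with cointegration rank r = 1.
# All numbers are hypothetical; they are not estimates from the paper.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def outer(a, b):
    """Outer product a b' of two vectors."""
    return [[ai * bj for bj in b] for ai in a]

alpha = [-0.2, 0.1]       # adjustment coefficients (n x r with n = 2, r = 1)
beta = [1.0, -1.0]        # cointegrating vector: y1 - y2 is the long-run relation
gamma1 = [[0.3, 0.0],     # short-run coefficients on the lagged difference
          [0.0, 0.2]]

pi = outer(alpha, beta)   # Pi = alpha beta' has reduced rank r = 1

def vecm_step(y_prev, dy_prev):
    """Return dy_t = Pi y_{t-1} + Gamma_1 dy_{t-1} (no exogenous terms, no shock)."""
    error_correction = matvec(pi, y_prev)
    short_run = matvec(gamma1, dy_prev)
    return [e + s for e, s in zip(error_correction, short_run)]

y_prev = [2.0, 1.0]       # y1 lies above its long-run relation with y2
dy = vecm_step(y_prev, [0.0, 0.0])
det_pi = pi[0][0] * pi[1][1] - pi[0][1] * pi[1][0]  # zero for a rank-one 2 x 2 matrix
```

In this toy case the first variable falls and the second rises, so the gap y1 − y2 shrinks, which is the adjustment behavior that α is meant to capture.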
3 In contrast to Mojon and Peersman (2003) we do not include Greece since it did not join the EMU before 2001. We exclude Luxembourg since it formed a monetary union with Belgium and had no independent monetary policy. Ireland is excluded due to data limitations.
4 There is a lot of work in econometrics highlighting the benefits of forecasting with cointegrated systems, see e.g. Engle and Yoo (1987), Clements and Hendry (1995), Hoffman and Rasche (1996), or Eitrheim, Husebo, and Nymoen (1999).
2.1.2. Subgroup I: Germany

Following Mojon and Peersman (2003) one of the most important issues when modeling the euro area in the pre-EMU era is to deal with the monetary leading role of Germany in an appropriate way. This is because the asymmetric design of the EMS, with the German de facto anchor role, led to a reduced monetary autonomy in most of the participating countries. Therefore, Germany constitutes a “subgroup” on its own. Referring to Eq. (1) the German country-specific model (henceforth GER) is specified as follows:

\[
\begin{aligned}
y_t &= [\pi_t^{ger},\ GDP_t^{ger},\ i_t^{s(ger)},\ i_t^{l(ger)}] \\
x_t &= [\Delta e_t^{DM/\$},\ \Delta\pi_t^{world},\ \Delta i_t^{US},\ \Delta GDP_t^{US}] \\
D_t &= [1,\ D911]
\end{aligned}
\]
where y_t includes the four endogenous variables. In the exogenous vector x_t we include the DM/$ exchange rate (e_t^{DM/$}) (converted to Euro before 1999), and the world inflation rate (π_t^{world})5 to capture the international economic conditions Germany was subject to. In addition, the U.S. federal funds rate (i_t^{US}) enters the exogenous vector since German monetary policy in the EMS-era might have been influenced by the Fed’s monetary policy decisions. Furthermore, U.S. output is also considered as a potential explanatory variable. In the deterministic vector we include an impulse dummy (D911) to account for the German reunification. This variable takes the value one for the first quarter of 1991, and zero otherwise. Finally, we include a constant in D_t, allowing for a linear time trend in the variables’ levels.

2.1.3. Subgroup II: Austria, Belgium, Netherlands

One consequence of the German anchor role within the EMS was a far-reaching lack of monetary autonomy in Austria, Belgium and the Netherlands. As a result of a fixed exchange rate parity vis-à-vis the German mark, those countries actually gave up autonomous monetary policy decision-making by tying their interest rates to the German ones. Therefore, first differences of the German short-term nominal interest rate enter the exogenous vector of the three-country subgroup-model (henceforth Sub3).6 The series for the year-on-year inflation rate, (log) GDP and interest rates are calculated as weighted averages of the respective country-specific time series (see Section 3.2):

\[
\begin{aligned}
y_t &= [\pi_t^{sub3},\ GDP_t^{sub3},\ i_t^{s(sub3)},\ i_t^{l(sub3)}] \\
x_t &= [\Delta i_t^{s(ger)},\ \Delta GDP_t^{ger},\ \Delta\pi_t^{ger}] \\
D_t &= [1,\ D911]
\end{aligned}
\]

Due to the regional neighborhood and the economic interrelationships it is likely that the three countries under consideration were substantially affected by economic developments in Germany. For modeling the interrelationships we include German GDP and inflation as exogenous variables. Finally, we assume that Austria, Belgium and the Netherlands were also economically affected by German reunification. Therefore, we include the impulse dummy D911 in addition to a constant in the deterministic vector D_t.

5 All exogenous variables enter in first differences since they are determined to be integrated of order 1, i.e. I(1), see Section 3.3.
6 Note that from 1999 onwards we always use forecast values (obtained from the GER-model) for German variables incorporated in the Sub3- and Sub5-model.
2.1.4. Subgroup III: Finland, France, Italy, Portugal, Spain

Although these five countries were also affected by the German de facto anchor role within the EMS they did not entirely give up an autonomous monetary policy. In fact there remained some degree of autonomy with regard to monetary policy decisions. To model this (moderate) monetary autonomy we introduce the differential between the short-term nominal interest rate aggregated over five countries (i_t^{s(sub5)}) and the German short-term rate in the model, i.e. i_t^{diff} = i_t^{s(sub5)} − i_t^{s(ger)}. The idea behind this modeling strategy is that the differential reflects the autonomous part of monetary policy decision-making:7 Holding the German interest rate fixed, e.g. a positive differential reflects an additional tightening of monetary policy. Since we are dealing with relative interest rate movements, observed changes in the differential might have alternative explanations. However, in all cases the differential indicates a deviation of interest rate setting from the German behavior such that interpreting the differential as autonomous part of monetary policy appears to be plausible. We specify the model for the five-country subgroup (henceforth Sub5) as follows:

\[
\begin{aligned}
y_t &= [\pi_t^{sub5},\ GDP_t^{sub5},\ i_t^{diff},\ i_t^{l(sub5)}] \\
x_t &= [\Delta\pi_t^{ger},\ \Delta GDP_t^{ger}] \\
D_t &= [1,\ D923]
\end{aligned}
\]
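As a small illustration of the differential that replaces the short-term rate in the Sub5-model, the sketch below computes i^{diff} from two rate series. The numbers are hypothetical and are not data from the paper; the variable names are chosen here for illustration only.

```python
# Hypothetical quarterly short-term rates in percent; not data from the paper.
i_s_sub5 = [9.5, 9.0, 8.8, 8.1]   # aggregated Sub5 short-term rate
i_s_ger = [6.0, 6.2, 6.1, 5.9]    # German short-term rate

# Differential as defined in the text: Sub5 rate minus German rate.
i_diff = [s - g for s, g in zip(i_s_sub5, i_s_ger)]

# Quarter-on-quarter changes of the differential. A rise in the differential
# while the German rate is unchanged would be read as additional tightening
# in the Sub5 countries.
d_i_diff = [b - a for a, b in zip(i_diff, i_diff[1:])]
```

Because the differential mixes movements of both rates, a change in it does not by itself say which side moved, which is the caveat the text raises about alternative explanations.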
Again, we include differences of German inflation in the exogenous vector of the Sub5-model to capture potential effects of German inflation which might have occurred via pressure on the exchange rate. Moreover, analogously to the Sub3-specification, we include differences of GDP^{ger} in the exogenous vector. Finally, we try to capture the effects of the ERM crisis by including the impulse dummy D923 in the deterministic vector. This variable takes the value one for the third quarter of 1992, and zero otherwise.

2.2. The aggregate Euro-model

In order to achieve comparability between the pooled subgroup-forecasts and the forecasts obtained from the single union-wide model it is of crucial importance to base the aggregate Euro-model on an appropriate information set. Put differently, the aim is to specify the Euro-model as if the information sets of all subgroup-models were pooled. By using all variables for the aggregate model which are also incorporated in the subgroup-models, forecasts from the Euro-model essentially draw on the same information as the pooled subgroup-forecasts. However, due to the uniform specification, the Euro-model crucially differs in the way this information is exploited.8 To achieve comparability we specify the aggregate Euro-model (henceforth EUR) like the German model, augmented by the D923 dummy variable. Since this model setup also comprises the “international” variables like the DM/$-exchange rate, world inflation, the federal funds rate and U.S. output this appears to be a natural choice
7 For instance, Mojon and Peersman (2003) remark that the model of the Banque de France defined the reaction function in terms of deviations from the German interest rate.
8 This fact becomes especially obvious with regard to the interest rate differential: Both the interest rate series for the Sub5-countries and for Germany enter the aggregate model through the aggregated i_t^{s(EUR)}-series, but in an apparently different way compared to the Sub5-model.
for a union-wide model specification. The model specification for the EUR-model reads as follows:

\[
\begin{aligned}
y_t &= [\pi_t^{EUR},\ GDP_t^{EUR},\ i_t^{s(EUR)},\ i_t^{l(EUR)}] \\
x_t &= [\Delta e_t^{DM/\$},\ \Delta\pi_t^{world},\ \Delta i_t^{US},\ \Delta GDP_t^{US}] \\
D_t &= [1,\ D911,\ D923]
\end{aligned}
\]
The following section discusses the particular models, empirically selected for the different subgroups.

3. Model selection

3.1. Data

We used quarterly data from the International Financial Statistics provided by the IMF. Year-on-year inflation was computed as the annual change of the (log) GDP deflator. For real output we took the GDP index with base year 1995. As short-term nominal interest rate we used the country-specific call money rate or money market rate, respectively. The 10-year government bond yield was chosen as long-term nominal interest rate. From 1999 onwards, country-specific interest rates are partly not available. In those cases, the EONIA-rate was chosen as short-term nominal interest rate; for the long-term rate we took the series for the European 10-year government bond yield. All models are estimated over the common estimation period 1982:1 to 1998:4. The out-of-sample forecast evaluation period is 1999:1 to 2002:4, i.e. we have 16 observations for our forecast evaluation exercise.9

3.2. Aggregation weights

An important question concerning the aggregation of single-country data arises with regard to the appropriate choice of aggregation weights. More specifically, in our study we have to distinguish between within-group and between-group aggregation weights: The within-group weights indicate the proportion a single country contributes to the aggregate series within the specific subgroup (i.e. Sub3 and Sub5). The between-group aggregation weights determine how a particular subgroup is weighted relative to the other subgroups. These weights are used later for pooling the subgroup-specific forecasts. Overall, we chose both the within-group and the between-group aggregation weights such that the final (overall) contribution of a single-country series equals the respective GDP-weight used by Fagan et al. (2005) for constructing “synthetic” euro data.10 Consequently, these overall weights were also employed for constructing the aggregated data for the EUR-model.
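The relation between the two kinds of weights can be sketched as follows. The country weights below are hypothetical placeholders and are not the Fagan et al. (2005) GDP-weights; the subgroup forecasts are likewise invented. The sketch only shows that a between-group weight equal to the sum of the overall weights of a subgroup, combined with within-group weights rescaled to sum to one, reproduces the overall weight of each country.

```python
# Hypothetical overall GDP weights for the nine countries (they sum to one).
# These are NOT the Fagan et al. (2005) weights; they only illustrate the logic.
overall = {
    "DE": 0.32,
    "AT": 0.03, "BE": 0.04, "NL": 0.06,
    "FI": 0.02, "FR": 0.22, "IT": 0.20, "PT": 0.02, "ES": 0.09,
}
groups = {
    "GER": ["DE"],
    "Sub3": ["AT", "BE", "NL"],
    "Sub5": ["FI", "FR", "IT", "PT", "ES"],
}

# Between-group weight: share of the subgroup in the nine-country aggregate.
between = {g: sum(overall[c] for c in members) for g, members in groups.items()}

# Within-group weight: share of a country inside its own subgroup.
within = {g: {c: overall[c] / between[g] for c in members}
          for g, members in groups.items()}

# Pooling: weight hypothetical subgroup forecasts with the between-group weights.
subgroup_forecast = {"GER": 1.5, "Sub3": 2.0, "Sub5": 2.5}
pooled_forecast = sum(between[g] * subgroup_forecast[g] for g in groups)
```

With this construction, within-group weight times between-group weight equals the overall weight for every country, so pooled subgroup series and directly aggregated series rest on the same country contributions.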
9 Due to the recursive estimation technique applied for computing forecasts of different horizons the initial estimation period 1982:1 to 1998:4 is often shortened or expanded, depending on the forecast horizon under regard. In particular, this variation of the initial estimation period ensures – irrespective of the forecast horizon – that we always consider paths of 16 observations within the forecast evaluation period.
10 These weights are EU11-shares of GDP at PPP-exchange rates (1995). They are only slightly adjusted due to the omission of Luxembourg and Ireland.
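One way to read footnote 9 is that, for an h-step-ahead forecast of a given target quarter, the estimation sample ends h quarters before that target, so that each horizon yields 16 forecasts for 1999:1 to 2002:4. The sketch below enumerates these pairs under that assumption; the function names are hypothetical and the paper does not spell out the exact scheme.

```python
def quarter_index(year, quarter):
    """Map (year, quarter) to a running integer index."""
    return year * 4 + (quarter - 1)

def quarter_label(index):
    """Inverse of quarter_index, formatted like '1998:4'."""
    year, q0 = divmod(index, 4)
    return f"{year}:{q0 + 1}"

def forecast_origins(h, first_target=(1999, 1), n_targets=16):
    """Pair each target quarter with the assumed last estimation quarter.

    Assumption: an h-step forecast for target t uses data up to t - h.
    """
    start = quarter_index(*first_target)
    return [(quarter_label(t - h), quarter_label(t))
            for t in range(start, start + n_targets)]

pairs_h1 = forecast_origins(1)    # first pair: ('1998:4', '1999:1')
pairs_h12 = forecast_origins(12)  # first pair: ('1996:1', '1999:1')
```

Under this reading the estimation sample for the first 12-step forecast ends in 1996:1, which is what "shortened" would mean relative to the common period ending in 1998:4.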
Table 1
Specification of the (subgroup-)models

Model   Endogenous                          Exogenous                                                            Deterministics    Lags
GER     π_t, GDP_t, i_t^s, i_t^l            Δe_t^{DM/$}, Δπ_{t−1}^{world}, Δi_{t−1}^{US}                         1, D911           1–3
Sub3    π_t, GDP_t, i_t^s, i_t^l            Δi_t^{s(ger)}, Δi_{t−1}^{s(ger)}, ΔGDP_t^{ger}                       1, D911           1–4
Sub5    π_t, GDP_t, i_t^{diff}, i_t^l       Δπ_{t−1}^{ger}                                                       1, D923           1–2
EUR     π_t, GDP_t, i_t^s, i_t^l            Δe_t^{DM/$}, Δπ_{t−1}^{world}, Δi_{t−1}^{US}, ΔGDP_{t−1}^{US}        1, D911, D923     1–4

Notes: The table summarizes the specifications of the subgroup-models, empirically selected according to the modeling strategy discussed in Section 2.

Table 2
Diagnostic statistics

Model   R̄²(π)   R̄²(GDP)   R̄²(i^{s/diff})   R̄²(i^l)   p(AC1)   p(AC4)   p(het)
GER     0.71     0.56       0.52              0.17       0.24     0.21     0.66
Sub3    0.29     0.38       0.71              0.24       0.84     0.86     0.07
Sub5    0.19     0.25       0.32              0.39       0.11     0.76     0.31
EUR     0.48     0.50       0.46              0.50       0.06     0.72     0.76

Notes: The table summarizes some diagnostic statistics for the (subgroup-)models. R̄²(·) denotes the adjusted R²-value for the different system equations. p(AC1) (p(AC4)) stands for the p-value of an LM-test on residual autocorrelation at lag 1 (4). p(het) denotes the p-value of the multivariate LM-test on heteroscedasticity.
3.3. The VECMs

For investigating the stochastic properties of the time series we performed Augmented Dickey-Fuller (ADF) unit root tests. The test results justify treating all variables as being integrated of order one, i.e. I(1). Johansen cointegration tests provided ambiguous results. Since it is a well-known observation that the test results of the Johansen procedure can crucially depend on the choice of the lag length we tried alternative lag specifications.11 In summary, the German and the Sub5-model exhibit a cointegration rank of at least r = 1 while there is only poor evidence for a cointegrating relationship in the EUR-model, and even no evidence in the Sub3-model. As a robustness analysis revealed that the forecasting results were improved when allowing for one cointegrating relation (compared to relying on simple differenced VAR models) we decided to impose a cointegration rank of one for each (subgroup-)model.12

The appropriate lag length for the different subgroup-models was determined according to the Akaike Information Criterion (AIC), assuming a maximum lag length of p − 1 = 6. The exogenous variables were chosen parsimoniously according to statistical significance in at least one of the four system equations. Table 1 summarizes the selected specifications for the different (subgroup-)models.

As we are primarily interested in the forecast performance we confine ourselves to presenting some diagnostic statistics for evaluating the in-sample fit. Table 2 summarizes R̄²-values for each system equation of the (subgroup-)models. Interestingly, for all equations of the aggregate Euro-model we observe R̄²-values around 0.50, while there is much more variation across the equations of the subgroup-models: For instance, while we obtain an R̄²-value of 0.71 for the inflation equation of the German model the corresponding value for the Sub5-model is only 0.19. On average, we observe a weaker in-sample fit for the subgroup-models compared to the aggregate Euro-model.

Finally, we find no indication of residual autocorrelation according to the multivariate LM-test (see Johansen, 1995, p. 22), evaluated at lag 1 and 4 (p(AC[·])). Similarly, considering the p-values from the multivariate White test (p(het), see Doornik, 1996), there is no indication of heteroscedasticity in any model.

11 Detailed results of ADF- and Johansen-tests are available upon request.
12 This is in line with Clements and Hendry (1995, p. 144) who conclude that imposing too few cointegrating relations is more costly in terms of forecast accuracy than allowing for a ’spurious’ levels term.

4. Comparison of forecast accuracy: subgroup-specific versus aggregate modeling

4.1. Evaluation methods

The relative forecast performance of rival forecasts was evaluated in different ways. First, we compared the root mean squared forecast error (RMSFE) of the rival forecasts for each variable and forecast horizon. To this end we calculated RMSFE-ratios. However, two competing forecasts are unlikely to produce equal RMSFE-values even if they exhibit a similar forecast performance. Simply comparing RMSFE-values does not take into account the sampling uncertainty underlying observed forecast differences. This is why we additionally applied the test proposed by Diebold and Mariano (1995). This test is a standard procedure in the forecasting literature for deciding on the significance of observed forecast differences.13

Since a good relative forecast performance does not ensure that a model provides accurate forecasts in absolute terms we also performed univariate benchmark forecasts. By formulating a “naïve” benchmark model we can decide on the overall adequacy of the multivariate models. We obtained the benchmark forecast by regressing each (differenced) variable on a constant and own lagged values (see Fair & Shiller, 1990, p. 381), where the lag order was chosen in accordance with the lag specification of the respective multivariate model.
Consequently, the benchmark forecast for the pooled multivariate subgroup-forecast was obtained by pooling univariate forecasts based on subgroup-specific data. In contrast, the benchmark forecast for the Euro-model was based on data aggregated over all nine countries under consideration.14

4.2. Forecasting results

Table 3 summarizes the results of our out-of-sample forecasting exercise. Firstly, it becomes apparent that the pooled subgroup-forecasts outperform their (pooled) benchmark forecasts in the majority of cases (Column 2, Pool-B1): Except for the GDP forecasts the RMSFE-ratios are always smaller than one, indicating predictive superiority of the pooled multivariate subgroup-forecasts. These differences in predictive accuracy become statistically significant at medium to long forecast horizons (h). In contrast, the pooled multivariate forecasts for GDP are significantly outperformed at short forecast horizons. The multivariate Euro-model-forecasts never improve significantly on their benchmark forecasts (Column 3, EUR-B2). Rather, the Euro-model-forecasts are sometimes even significantly outperformed by the univariate benchmark forecasts.
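The univariate benchmark of Section 4.1 can be sketched as an autoregression on first differences. The version below is a simplification: it uses a single lag, whereas the paper matches the lag order to the respective multivariate model, and it estimates the two coefficients by closed-form OLS. The function names and the input series are hypothetical.

```python
def fit_ar1_on_differences(levels):
    """OLS of the differenced series on a constant and its own first lag."""
    dy = [b - a for a, b in zip(levels, levels[1:])]
    x, z = dy[:-1], dy[1:]                 # regressor (lag) and regressand
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    const = mean_z - slope * mean_x
    return const, slope, dy[-1]

def forecast_levels(levels, h):
    """Iterate the fitted AR(1) in differences and cumulate to level forecasts."""
    const, slope, dy = fit_ar1_on_differences(levels)
    level, path = levels[-1], []
    for _ in range(h):
        dy = const + slope * dy
        level += dy
        path.append(level)
    return path

# Hypothetical series whose differences follow dy_t = 0.5 + 0.5 * dy_{t-1} exactly.
series = [0.0, 0.5, 1.25, 2.125, 3.0625, 4.03125]
const, slope, _ = fit_ar1_on_differences(series)
path = forecast_levels(series, 4)
```

For the pooled benchmark, forecasts of this kind would be computed on each subgroup's data and then combined with the between-group weights, while the Euro-model benchmark would use the nine-country aggregate directly.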
13 For a survey of different test procedures for evaluating differences in forecast accuracy see e.g. Diebold and Mariano (1995), or Diebold and Lopez (1996).
14 The aggregation weights are the same as discussed in Section 3.2.
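The two evaluation measures of Section 4.1 can be sketched in a few lines. The RMSFE-ratio follows directly from its definition. For the Diebold–Mariano statistic the sketch assumes a squared-error loss and a rectangular-window long-run variance with h − 1 autocovariances; the paper does not state which variant was used, so this is one common choice rather than the author's implementation. The forecast errors are hypothetical.

```python
import math

def rmsfe(errors):
    """Root mean squared forecast error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def diebold_mariano(e1, e2, h=1):
    """DM statistic for equal accuracy under squared-error loss.

    Assumption: long-run variance of the loss differential estimated with
    a rectangular window over h - 1 autocovariances. The estimate can be
    non-positive in small samples, in which case the statistic is undefined.
    """
    d = [a * a - b * b for a, b in zip(e1, e2)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    return d_bar / math.sqrt(long_run_var / n)

# Hypothetical forecast errors of two rival forecasts.
errors_a = [1.0, -1.0, 2.0, -2.0]
errors_b = [0.5, -0.5, 1.0, -1.0]

ratio = rmsfe(errors_b) / rmsfe(errors_a)   # below one: forecast b is more accurate
dm_stat = diebold_mariano(errors_a, errors_b, h=1)
```

A ratio below one together with a large positive statistic (loss of a minus loss of b) would point to forecast b being significantly more accurate; the statistic is usually compared with standard normal critical values.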
Table 3
Comparison of forecast performance: pooled subgroup-forecasts (Pool) vs. Euro-model-forecasts (EUR)

h     Pool-B1     EUR-B2      Pool–EUR

Inflation (π)
1     0.813       1.134**     0.618**
4     0.594***    1.073       0.507**
8     0.628***    0.954       0.672**
12    0.753***    0.981       0.810***

Gross domestic product (GDP)
1     1.203**     1.021       1.135
4     1.107***    0.809       1.308*
8     1.010       0.812       1.146
12    0.915       0.915       0.961

Short-term nominal interest rate (i^s)
1     0.804       1.036       0.774
4     0.701       0.842       0.891
8     0.604       0.979       0.672
12    0.557*      0.991       0.532***

Long-term nominal interest rate (i^l)
1     0.995       1.208*      0.821*
4     0.945       1.149*      0.844**
8     0.746**     1.036       0.733***
12    0.781*      0.989       0.823

Notes: Column 1 (h) shows the forecast horizon under regard. Column 2 (Pool-B1) reports the RMSFE-ratios for the pooled multivariate forecasts and their univariate benchmark forecasts. Values smaller than one indicate a better forecast performance of the former. Column 3 shows the respective ratios for the multivariate Euro-model-forecasts (EUR-B2). Column 4 compares the pooled subgroup-forecasts and the Euro-model-forecasts (Pool–EUR). */**/*** denotes significance of observed differences in forecast accuracy at the 10%/5%/1%-level (Diebold–Mariano test).
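To read the Pool–EUR column in percentage terms, the following short sketch converts the inflation ratios reported in Table 3 into RMSFE-reductions. It assumes that a ratio r means the pooled RMSFE equals r times the Euro-model RMSFE, which matches the table notes; the variable names are chosen here for illustration.

```python
# Pool–EUR RMSFE-ratios for inflation, copied from Table 3 (h = 1, 4, 8, 12).
horizons = [1, 4, 8, 12]
pool_vs_eur_inflation = [0.618, 0.507, 0.672, 0.810]

# Assumption: a ratio r means RMSFE(Pool) = r * RMSFE(EUR), so the
# percentage reduction achieved by pooling is (1 - r) * 100.
reductions = {h: (1.0 - r) * 100.0
              for h, r in zip(horizons, pool_vs_eur_inflation)}

# Horizon with the largest reduction.
best_horizon = max(reductions, key=reductions.get)
```

The largest reduction occurs at the four-quarter horizon and amounts to roughly 49%.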
However, for modeling purposes the main interest lies in the comparison between the pooled subgroup-forecasts and the Euro-model-forecasts (Pool-EUR), which directly reflects the observed differences in forecast performance relative to the univariate benchmarks: We obtain highly significant forecast improvements by pooling subgroup-forecasts for inflation, the long-term interest rate, and even the short-term rate at longer horizons. In particular, the superiority of the inflation forecasts is remarkable, as we obtain RMSFE-reductions of up to 50%. Given the importance of precise inflation forecasts for monetary policy, this finding is of particular interest. In contrast, in terms of the RMSFE-criterion, the pooled subgroup-forecasts appear to perform slightly weaker for GDP than the Euro-model-forecasts, although the differences in accuracy are not significant at the 5%-level. Nevertheless, the subgroup-specifications mainly seem to capture the interaction between nominal variables, while they provide less explanatory power for output dynamics.
Overall, it seems justified to conclude that the gains from pooling forecasts of subgroup-models more than compensate for the losses: In general, the pooled forecasts are at least as good as the direct Euro-model-forecasts and in many cases significantly better for inflation and interest rates.

4.3. Is variation of the model specification important?

From the perspective of a modeler it is of crucial importance to disentangle whether forecast improvements by pooling subgroup-forecasts in fact emerge from variation of the model specification across subgroups or solely from the increased within-group homogeneity obtained through the formation of subgroups. This is why we also considered pooled VECM-forecasts based on subgroup-models which were uniformly specified according to the Euro-model. Those forecasts were contrasted with the pooled VECM-forecasts from Section 4.2, which draw on different model specifications. Table 4 summarizes the results.

Table 4
Comparison of forecast performance: pooled subgroup-forecasts without (Pooluni) and with model variation (Pool)

h     Pooluni-B1   Pool-B2     Pooluni–Pool
Inflation (π)
1     0.980        0.813       1.306**
4     0.707        0.594***    1.384
8     0.666**      0.628***    1.093
12    0.768        0.753***    1.022
Gross domestic product (GDP)
1     0.941        1.203**     0.834**
4     0.725**      1.107***    0.670**
8     0.731*       1.010       0.755
12    0.979        0.915       1.111
Short-term nominal interest rate (i_s)
1     0.876        0.804       1.122
4     0.869        0.701       1.154
8     0.809        0.604       1.212*
12    0.934        0.557*      1.742
Long-term nominal interest rate (i_l)
1     1.253**      0.995       1.240**
4     1.092        0.945       1.136*
8     0.749        0.746**     0.988
12    0.862        0.781*      1.075

Notes: The table reports RMSFE-ratios for forecast comparisons between the pooled subgroup-forecasts without (Pooluni) and with variation of the model specification (Pool). Values above one indicate a better forecast performance of the last named model. */**/*** denote significance of observed differences in forecast accuracy at the 10%/5%/1%-level (Diebold–Mariano test).

Interestingly, according to the RMSFE-criterion the pooled uniform subgroup-forecasts (Pooluni) are remarkably weaker than the pooled forecasts with model variation (Pool) for inflation and interest rates. In contrast, the pooled uniform forecasts perform well with regard to GDP as a result of their good performance relative to their benchmark forecasts. Moreover, it becomes apparent that the observed overall improvement in accuracy from pooling subgroup-forecasts arises at least partly from the formation of subgroups (i.e. from increasing within-group homogeneity), as the pooled uniform forecasts (Pooluni) in general perform better relative to their benchmarks than the direct Euro-model forecasts (see Table 3). However, the forecast comparison above (Pooluni–Pool) confirms that it is mainly the variation of the disaggregate model specification which leads to forecast improvements from pooling forecasts. In fact, by holding the specification constant across subgroup-models we would not entirely exploit the advantages of disaggregation, namely “that the specification can be varied across micro-units to suit the circumstances” (Barker & Pesaran, 1990, p. 6).
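The pooling step behind both variants compared here can be sketched as a weighted average of subgroup forecast paths. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name, the subgroup labels "core" and "periphery", the weights, and the forecast values are all hypothetical, and the aggregation weights actually used are not given in this section.

```python
def pool_forecasts(subgroup_forecasts, weights):
    """Weighted average of subgroup forecast paths, horizon by horizon.

    subgroup_forecasts: dict mapping subgroup name -> list of forecasts
    weights: dict mapping subgroup name -> aggregation weight (assumed)
    """
    if set(subgroup_forecasts) != set(weights):
        raise ValueError("forecasts and weights must cover the same subgroups")
    lengths = {len(path) for path in subgroup_forecasts.values()}
    if len(lengths) != 1:
        raise ValueError("all forecast paths must have the same length")
    total = sum(weights.values())  # normalise so the weights sum to one
    n_horizons = lengths.pop()
    return [
        sum(weights[g] * subgroup_forecasts[g][i] for g in subgroup_forecasts) / total
        for i in range(n_horizons)
    ]


# Hypothetical inflation forecast paths (percent) for two illustrative subgroups
forecasts = {"core": [2.0, 2.1, 2.2], "periphery": [3.0, 2.9, 2.8]}
weights = {"core": 0.6, "periphery": 0.4}  # assumed aggregation weights
pooled = pool_forecasts(forecasts, weights)
```

In this sketch the uniform and the varied variants would differ only in how the subgroup paths passed to the function are produced, namely from identically or from differently specified subgroup-models. The aggregation itself is the same in both cases.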
5. Policy implications

In view of our empirical results, some implications for policymakers and modelers may be derived. First and foremost, the results cast some doubt on the use of a single Euro-model, based on aggregated data, for forecasting actual European macroeconomic variables. Rather, pooling forecasts of disaggregate models, separately specified for largely homogeneous subgroups of countries, often improves accuracy compared to the aggregate forecast. One explanation seems to be that the use of aggregated data within a single Euro-model imposes homogeneity restrictions whose inadequacy is directly reflected in the model’s forecast performance.

Second, when pursuing the strategy of forecast combination, variation of the specification for the disaggregate models turns out to be crucial. Put differently, merely increasing within-group homogeneity by forming subgroups does not fully exploit the advantages of disaggregation. Rather, it seems to be the flexibility with regard to the disaggregate specifications which makes the pooled forecasts in the majority of cases more accurate than the direct forecasts from the aggregate model. Thus, capturing the subgroup-specific economic circumstances by means of appropriate disaggregate model specifications appears to be a major challenge with regard to forecast pooling.

Third, it is worth stressing that from a modeler’s perspective the strategy of pooling subgroup-forecasts appears to be desirable, as it reduces modeling costs compared to a more disaggregate country-specific approach. In particular, describing similar countries by a common subgroup-model appears to be promising in the light of an increasing number of EMU Member States. While this modeling strategy avoids specifying a separate model for each of the (up to 25) countries, it allows the modeler to account for some cross-country differences in economic structure, e.g. between current Member States and the Accession Countries. Thus, forecast pooling of subgroup-specific forecasts seems to be well suited for a dynamically expanding currency union.

Deciding on the appropriate degree of both disaggregation and model variation is of crucial importance for the preparation of accurate forecasts, which are part of the political decision-making process. Overall, our results confirm that it is advantageous to incorporate prior knowledge about country-specific characteristics and cross-country differences in formulating forecasting models.15 Thereby, the qualitative results of this study are likely to be of particular practical relevance for policymakers and modelers.

6. Concluding remarks

This paper re-examined potential benefits of forecast pooling with regard to predicting actual macroeconomic variables in the euro area. A novelty of this study was that we pooled forecasts which were obtained from models for subgroups of countries rather than single countries, thereby formulating an intermediate case of disaggregation with regard to forecast combination. Moreover, we investigated whether varying the model specification across subgroups of current EMU Member States improves the accuracy of the pooled forecasts compared to relying on a single specification of an aggregate Euro-model. Our findings suggest accounting for cross-country heterogeneity by separately specifying different models for largely homogeneous subgroups of countries. Thus, combining forecasts from differently specified subgroup-models turns out to be a promising strategy for current and future forecasting purposes.
15 For a comprehensive description of how the European Central Bank (ECB) incorporates country-specific information into its area-wide projections, see ECB (2001).
Acknowledgements

I thank Dieter Nautz and Michael Binder for valuable comments and suggestions on the paper. Any errors are my own responsibility. Financial support by the German Research Foundation (DFG) is gratefully acknowledged. An earlier version of this paper is published as part of my doctoral thesis (ISBN 3-8322-4345-3).

References

Bagliano, F. C., & Favero, C. A. (1998). Measuring monetary policy with VAR models: An evaluation. European Economic Review, 42, 1069–1112.
Barker, T., & Pesaran, M. H. (Eds.). (1990). Disaggregation in econometric modelling. London/New York: Routledge.
Clemen, R. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5, 559–583.
Clements, M. P., & Hendry, D. F. (1995). Forecasting in cointegrated systems. Journal of Applied Econometrics, 10(2), 127–146.
Clements, M. P., & Hendry, D. F. (1998). Forecasting economic time series. Cambridge University Press.
Diebold, F. X., & Lopez, J. A. (1996). Forecast evaluation and combination. In G. Maddala & C. Rao (Eds.), Handbook of statistics (pp. 241–268). Amsterdam: Elsevier Science.
Diebold, F. X., & Mariano, R. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13(3), 253–263.
Doornik, J. A. (1996). Testing vector autocorrelation and heteroscedasticity in dynamic models. Working Paper, Nuffield College, Oxford.
ECB (2001). A guide to Eurosystem staff macroeconomic projection exercises. Download available at: http://www.ecb.int.
Eitrheim, O., Husebo, T., & Nymoen, R. (1999). Equilibrium-correction versus differencing in macroeconometric forecasting. Economic Modelling, 16(4), 515–545.
Engle, R., & Yoo, S. (1987). Forecasting and testing in cointegrated systems. Journal of Econometrics, 35, 143–159.
Fagan, G., Henry, J., & Mestre, R. (2005). An area-wide model (AWM) for the Euro area. Economic Modelling, 22, 39–59.
Fair, R. C., & Shiller, R. J. (1990). Comparing information in forecasts from econometric models. American Economic Review, 80(3), 375–389.
Fang, Y. (2003). Forecasting combination and encompassing tests. International Journal of Forecasting, 19, 87–94.
Granger, C. W. J. (1989). Combining forecasts–twenty years later. Journal of Forecasting, 8, 167–173.
Hoffman, D. L., & Rasche, R. H. (1996). Assessing forecast performance in a cointegrated system. Journal of Applied Econometrics, 11(5), 495–517.
Hubrich, K. (2005). Forecasting Euro area inflation: Does aggregating forecasts by HICP component improve forecast accuracy? International Journal of Forecasting, 21, 119–136.
Johansen, S. (1995). Likelihood-based inference in cointegrated vector autoregressive models. Oxford University Press.
Marcellino, M., Stock, J. H., & Watson, M. W. (2003). Macroeconomic forecasting in the Euro area: Country-specific versus area-wide information. European Economic Review, 47, 1–18.
Mizon, G. (1984). The encompassing approach in econometrics. In D. F. Hendry & K. Wallis (Eds.), Econometrics and quantitative economics (pp. 135–172). Oxford: Basil Blackwell.
Mojon, B., & Peersman, G. (2003). A VAR description of the effects of monetary policy in the individual countries of the Euro area. In I. Angeloni, A. Kashyap, & B. Mojon (Eds.), Monetary policy transmission in the Euro area (pp. 56–74). Cambridge University Press.