International Journal of Forecasting 33 (2017) 745–759
Contents lists available at ScienceDirect
International Journal of Forecasting journal homepage: www.elsevier.com/locate/ijforecast
Predicting recessions with boosted regression trees
Jörg Döpke a, Ulrich Fritsche b,1, Christian Pierdzioch c,∗
a University of Applied Sciences, Merseburg, Germany
b University Hamburg, Germany
c Helmut-Schmidt-University, Hamburg, Germany
Keywords: Recession forecasting; Boosting; Regression trees
Abstract
We use a machine-learning approach known as boosted regression trees (BRT) to reexamine the usefulness of selected leading indicators for predicting recessions. We estimate the BRT approach on German data and study the relative importance of the indicators and their marginal effects on the probability of a recession. Our results show that measures of the short-term interest rate and the term spread are important leading indicators. The recession probability is a nonlinear function of these leading indicators. The BRT approach also helps to uncover the way in which the recession probability depends on the interactions between the leading indicators. While the predictive power of the short-term interest rates has declined over time, the term spread and the stock market have gained in importance. The BRT approach shows a better out-of-sample performance than popular probit approaches.
© 2017 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
1. Introduction
Against the background of the Great Recession, researchers have started to reassess various major linear and nonlinear forecasting approaches (see Bec, Bouabdallah, & Ferrara, 2014; Ferrara, Marcellino, & Mogliani, 2015, among others) and leading indicators that are used widely in applied business-cycle research (see for example Drechsel & Scheufele, 2012). We contribute to this rapidly growing strand of research by using a machine-learning approach known as boosted regression trees (BRT) to reexamine the predictive value of selected leading indicators for forecasting recessions in Germany (on boosting, see Freund & Schapire, 1997; Friedman, 2001, 2002;
∗ Correspondence to: Helmut-Schmidt-University, Department of Economics, Holstenhofweg 85, P.O.B. 700822, 22008 Hamburg, Germany. E-mail address:
[email protected] (C. Pierdzioch). 1 Other affiliations: University Hamburg, KOF ETH Zurich, GWU Research Program on Forecasting, Germany.
Friedman, Hastie, & Tibshirani, 2000, or for a survey, see Bühlmann & Hothorn, 2007). The BRT approach is a modeling platform that makes it possible to develop a nuanced view of the relative importance of leading indicators for forecasting recessions, to capture any nonlinearities in the data, and to model interaction effects between leading indicators. The BRT approach combines elements of statistical boosting with techniques studied in the literature on regression trees. Boosting is a machine-learning technique in which an ensemble of simple base learners (weak learners) is built and combined in an iterative stagewise process in order to form a potentially complicated function known as a strong learner. The weak learners are simple individual regression trees, and the strong learner results from combining the individual regression trees in an additive way. The ensemble of trees is then used to compute recession forecasts. Regression trees use recursive binary splits to subdivide the space of leading indicators into non-overlapping regions in order to minimize some loss function. Regression trees lend themselves to the recovery of the informational content of leading indicators because regression trees capture, in a natural way, even complex nonlinearities in the link between the recession probability and leading indicators. Moreover, regression trees are insensitive to the inclusion of irrelevant variables in the list of leading indicators, and are robust to outliers in the data (on regression trees, see Breiman, Friedman, Olshen, & Stone, 1984). Aggregating over regions, and over trees, then allows the recovery of highly complicated links between the recession probability and a leading indicator. In addition, special techniques for the analysis of regression trees have been developed that make it straightforward to trace out the quantitative importance of leading indicators and their marginal effects on the recession probability. Regression trees can also be used to shed light on the ways in which the interaction of leading indicators changes the probability of a recession. While regression trees have several interesting advantages (see Hastie, Tibshirani, & Friedman, 2009, p. 351), their hierarchical structure makes them high-variance predictors. The BRT approach overcomes this drawback by using boosting techniques to combine several regression trees additively in order to form a low-variance predictor.
The BRT approach complements the widely-studied probit approach to the forecasting of recessions, which has been popular in the business-cycle literature since the early 1990s (Estrella & Hardouvelis, 1991; Estrella & Mishkin, 1998), and has also been used extensively as a tool for recession forecasting in recent research (Fritsche & Kuzin, 2005; Proaño & Theobald, 2014; Theobald, 2012, among others).
http://dx.doi.org/10.1016/j.ijforecast.2017.02.003 0169-2070/© 2017 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
Applications of regression trees to economics can be found in monetary economics (Orphanides & Porter, 2000), empirical finance (Savona, 2014, among others), and political and sports forecasting (see for example Cáceres & Malone, 2013; Lessman, Sung, & Johnson, 2010), as well as in the literature on the determinants of financial crises (see Savona & Vezzoli, 2015, among others). The list of applications of boosting to economics includes the research by Berge (2015), who uses boosting to model exchange rates, and Buchen and Wohlrabe (2011), who compare the performance of boosting with those of other widely studied forecasting schemes for forecasting the growth rate of U.S. industrial production. Lloyd (2014) and Taieb and Hyndman (2014) use boosting for forecasting the hourly loads of a U.S. utility, and Silva (2014) applies boosting to forecasting wind power generation. Robinzonov, Tutz, and Hothorn (2012) use boosting to forecast the monthly growth rate of German industrial production, and find a good performance at short and medium-term forecast horizons. Lehmann and Wohlrabe (2016) also use boosting to forecast German industrial production. They report various top leading indicators, including turnover, orders, and several survey indicators, for four different forecast horizons. Wohlrabe and Buchen (2014) use boosting to forecast several macroeconomic variables. Bai and Ng (2009) study boosting in the context of factor models. Mittnik, Robinzonov, and Spindler (2015) use boosted regression trees to model stock market volatility. Closely related to our research is the work of Ng (2014), who uses boosted regression trees to forecast
U.S. recessions, and finds that only a few predictors are important for predicting recessions (including interest-rate variables), but also that the relative importance of predictors has changed over time. Ng (2014) does not document marginal effects and abstracts from potential interaction effects of the predictor variables being studied, because the regression trees are restricted to stumps (that is, there is no hierarchical structure of the trees).
We find that measures of the short-term interest rate and the term spread are important leading indicators of recessions in Germany, but also that the BRT approach uses other indicators like business climate indicators and stock market returns to grow trees. The informational content of stock market returns for subsequent recessions is in line with both earlier findings for the U.S. documented by Estrella and Mishkin (1998), and recent theoretical and empirical results reported by Farmer (2012). For the G-7 countries, Bluedorn, Decressin, and Terrones (2013) report that drops in real equity prices are useful predictors of recession starts. Barro and Ursúa (2009) find that stock market crashes help to predict depressions, especially in times of a currency or banking crisis. For Germany, Drechsel and Scheufele (2012) also consider stock market returns as an indicator, but find it insignificant, at least before the Great Recession. For earlier evidence of the predictive power of the yield curve, see Duarte, Venetis, and Paya (2005), Estrella, Rodriguez, and Schich (2003), Ivanova, Lahiri, and Seitz (2000), and Rudebusch and Williams (2009), among others. Furthermore, marginal effects reveal nonlinearities in the link between the leading indicators and the probability of being in a recession. For example, the recession probability sharply decreases when the term spread changes sign from negative to positive, and it is a nonlinear negatively-sloped function of stock market returns.
The recession probability also increases sharply when the short-term interest rate rises above approximately 7%–8%. In contrast, the recession probability is relatively flat for lower short-term interest rates, which implies that monetary policy that operates at the zero lower bound may have a small effect on the recession probability. At the same time, an investigation of interaction effects shows that the increase in the recession probability that follows an increase of the short-term interest rate is larger in times of a bearish stock market than in times of a bull market. Simulation results and results of an out-of-sample forecasting experiment show that the BRT approach has a better out-of-sample performance than variants of the probit approach.
Section 2 of the paper briefly describes the BRT approach, while Section 3 describes our data. Section 4 then reports our results, and Section 5 concludes.
2. BRT approach
The machine-learning literature has developed several boosting algorithms for solving regression/classification problems under various loss functions (for surveys, see Bühlmann & Hothorn, 2007; Mayr, Binder, Gefeller, & Schmid, 2014a,b; Schapire, 2003). One of the earliest and most popular boosting algorithms is the Adaboost
algorithm (Freund & Schapire, 1997). Friedman et al. (2000) later developed variants of the Adaboost algorithm and traced its links to logistic regression models, while Friedman (2001, 2002) showed that boosting can be interpreted as a function-approximation problem that can be solved using a forward stage-wise gradient-descent technique.
Drawing on the results of Friedman et al. (2000), we model recessions as a binary variable $y_{t+h} \in \{0, 1\}$, where $y_{t+h} = 1$ denotes a recession, $t = 1, \ldots$ denotes a time index, and $h$ denotes a forecast horizon. The aim is to model the links between recessions and the leading indicators, $x_t = (x_{t,1}, x_{t,2}, \ldots)$, by means of a function $F(x_t)$, so as to minimize the expected value of a loss function $L$. Friedman et al. (2000) consider the exponential loss function
$$L(F) = E \exp\left(-\tilde{y}_{t+h} F(x_t)\right), \tag{1}$$
where, for ease of notation, $\tilde{y}_{t+h} = 2y_{t+h} - 1$, such that $\tilde{y}_{t+h} \in \{-1, 1\}$, and $E$ denotes the conditional expectations operator. The loss function increases when $\tilde{y}_{t+h}$ and $F(x_t)$ have different signs, and decreases when $\tilde{y}_{t+h}$ and $F(x_t)$ have the same sign (that is, when $F(x_t)$ helps to classify a recession). After denoting the conditional probability of being in a recession by $P(\tilde{y}_{t+h} = 1 | x_t)$, the right-hand side of Eq. (1) can be expanded to give
$$E \exp\left(-\tilde{y}_{t+h} F(x_t)\right) = P(\tilde{y}_{t+h} = 1 | x_t) \exp\left(-F(x_t)\right) + P(\tilde{y}_{t+h} = -1 | x_t) \exp\left(F(x_t)\right). \tag{2}$$
The loss function $L(F)$ is minimized by setting $F(x_t)$ to one-half of the log-odds ratio (see Friedman et al., 2000, p. 345):
$$F(x_t) = \frac{1}{2} \log \frac{P(\tilde{y}_{t+h} = 1 | x_t)}{P(\tilde{y}_{t+h} = -1 | x_t)}. \tag{3}$$
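The minimizer in Eq. (3) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the expected exponential loss of Eq. (2) on a grid of candidate values for F and compares the grid minimizer with one-half of the log-odds ratio. The function names and the probability value 0.2 are hypothetical and chosen only for illustration.

```python
import math


def expected_exp_loss(F, p):
    """Right-hand side of Eq. (2) for recession probability p and score F."""
    return p * math.exp(-F) + (1.0 - p) * math.exp(F)


def half_log_odds(p):
    """Closed-form minimizer from Eq. (3)."""
    return 0.5 * math.log(p / (1.0 - p))


def grid_minimizer(p, lo=-5.0, hi=5.0, steps=100001):
    """Brute-force search for the F that minimizes the expected loss."""
    best_F = lo
    best_loss = expected_exp_loss(lo, p)
    for i in range(1, steps):
        F = lo + (hi - lo) * i / (steps - 1)
        loss = expected_exp_loss(F, p)
        if loss < best_loss:
            best_F, best_loss = F, loss
    return best_F


def recession_probability(F):
    """Invert Eq. (3): map the score F back to a probability."""
    return 1.0 / (1.0 + math.exp(-2.0 * F))


p = 0.2  # hypothetical conditional recession probability
print(grid_minimizer(p), half_log_odds(p))
```

The last helper shows why results can be reported on a log-odds scale: a negative F corresponds to a recession probability below 0.5, and a positive F to a probability above 0.5.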
Thus, the function $F(x_t)$ can be estimated by modeling the log-odds ratio, which, in turn, can be modeled using the unconditional recession probability. However, the unconditional recession probability is a crude measure of the conditional recession probability, and boosting techniques show how this measure can be refined.
The idea of boosting is to break down the function-estimation problem into a series of simple problems by stipulating that $F(x_t)$ can be expressed as the sum of simpler functions, $T(x_t)$:
$$F(x_t) = \sum_{m=0}^{M} T_m(x_t), \tag{4}$$
where $m$ is the index of a weak learner and $M$ denotes an upper bound on the number of functions considered in the summation operation. The function $F(x_t)$ is a strong learner and the simpler functions $T(x_t)$ are weak learners. We use the following gradient-descent boosting algorithm to estimate the weak learners step-by-step in a forward stage-wise manner (Friedman, 2001, 2002):
1. Initialize the algorithm: $F_0 = T_0 = \frac{1}{2} \log \frac{P(\tilde{y}_{t+h} = 1)}{P(\tilde{y}_{t+h} = -1)}$.
2. Define some upper bound, $M$, for the number of weak learners.
3. For $m$ in 1 to $M$:
(a) Compute the negative gradient vector given by $z_{t,m} = -\partial L(F)/\partial F = \tilde{y}_{t+h} \exp(-\tilde{y}_{t+h} F(x_t))$.
(b) Fit a weak learner, $T_m(x_t)$, to the negative gradient vector.
(c) Update the function estimate, $F_m(x_t)$, by adding the weak learner $T_m(x_t)$ to $F_{m-1}(x_t)$.
(d) Equipped with the new function estimate, go back to Step (a).
4. At $m = M$: $F_M(x_t)$ has been computed as the sum of weak learners, $T_m(x_t)$, $m = 0, \ldots, M$.
We use regression trees as weak learners. A regression tree $T(x_t)$ with $J$ terminal nodes partitions the space of the leading indicators $x_t$ into $l$ non-overlapping rectangular regions $R_l$, in a binary and hierarchical top-down way, and predicts a region-specific constant $E(z_{t,m} | x_t \in R_l)$ at every terminal node. Every region is defined by a leading indicator $s \in x_t$, used for the partitioning, and a partitioning point $p$ (the value of an indicator at which a split is invoked). The splitting indicator and the partitioning point are chosen so as to minimize a quadratic loss function, defined over the negative gradient vector and the region-specific constant prediction. Regression trees are integrated into Step 3a by choosing terminal node responses such that (Friedman, 2002, Algorithm 1)
$$\gamma_{l,m} = \arg\min_{\gamma} \sum_{x_t \in R_{l,m}} L\left(F_{m-1}(x_t) + \gamma\right). \tag{5}$$
The minimization problem specified in Eq. (5) can be solved using Newton’s method (Friedman et al., 2000, p. 353), and Eq. (4) can then be rewritten as
$$F(x_t) = \sum_{m=0}^{M} \gamma_{l,m} \mathbf{1}_{x_t \in R_{l,m}}, \tag{6}$$
where $\mathbf{1}$ is an indicator function. Friedman (2001) further introduces a shrinkage parameter, $0 < \lambda \leq 1$, that curbs the influence of individual weak learners on the strong learner. Step 3c is then modified to
$$F_m(x_t) = F_{m-1}(x_t) + \lambda \gamma_{l,m} \mathbf{1}_{x_t \in R_{l,m}}. \tag{7}$$
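The algorithm above can be sketched in a few lines of code. The following is an illustrative sketch only, under several simplifying assumptions that are not taken from the paper: the weak learners are stumps (trees with a single split) rather than deeper trees, the terminal-node responses of Eq. (5) are approximated by a single Newton step starting from zero, and the two-indicator data set (term spread, short-term interest rate) is entirely made up. All function names are hypothetical.

```python
import math


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(X, z):
    """Least-squares stump for targets z; assumes some indicator varies."""
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        for a, b in zip(levels, levels[1:]):
            thr = 0.5 * (a + b)
            left = [zi for row, zi in zip(X, z) if row[j] <= thr]
            right = [zi for row, zi in zip(X, z) if row[j] > thr]
            loss = sse(left) + sse(right)
            if best is None or loss < best[0]:
                best = (loss, j, thr)
    return best[1], best[2]


def newton_gamma(y_tilde, F, region):
    """One Newton step from gamma = 0 for the problem in Eq. (5)."""
    w = [math.exp(-y_tilde[i] * F[i]) for i in region]
    return sum(y_tilde[i] * wi for i, wi in zip(region, w)) / sum(w)


def boost(X, y, M=50, lam=0.1):
    """Forward stage-wise boosting with stumps and shrinkage lam."""
    n = len(X)
    y_tilde = [2 * v - 1 for v in y]
    p = sum(y) / n
    F0 = 0.5 * math.log(p / (1.0 - p))  # Step 1: unconditional log-odds
    F = [F0] * n
    stumps = []
    for _ in range(M):  # Step 3
        # (a) negative gradient of the exponential loss
        z = [yt * math.exp(-yt * f) for yt, f in zip(y_tilde, F)]
        j, thr = fit_stump(X, z)  # (b) fit the weak learner
        update = {}
        for side in (True, False):
            region = [i for i in range(n) if (X[i][j] <= thr) == side]
            update[side] = lam * newton_gamma(y_tilde, F, region)
            for i in region:
                F[i] += update[side]  # (c) shrunken update, Eq. (7)
        stumps.append((j, thr, update[True], update[False]))
    return F0, stumps


def predict_probability(F0, stumps, x):
    """Sum the weak learners and map the score to a probability."""
    F = F0
    for j, thr, g_left, g_right in stumps:
        F += g_left if x[j] <= thr else g_right
    return 1.0 / (1.0 + math.exp(-2.0 * F))


# Made-up data: columns are (term spread, short-term interest rate).
X = [[-1.0, 8.0], [-0.5, 7.5], [0.5, 3.0], [1.0, 2.0],
     [1.5, 4.0], [-0.2, 9.0], [2.0, 1.0], [0.8, 5.0]]
y = [1, 1, 0, 0, 0, 1, 0, 0]
F0, stumps = boost(X, y)
print(predict_probability(F0, stumps, [-0.8, 8.5]))
```

A small shrinkage value makes each weak learner contribute only a fraction of its fitted response, so more boosting iterations are needed to reach the same fit, which is the trade-off behind Eq. (7).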
Finally, Friedman (2002) suggests that the model predictions be stabilized by injecting randomness at Step 3a. The resulting stochastic gradient-descent-boosting algorithm requires researchers to sample a subset from the data without replacement before fitting a weak learner. Only the sampled data are then used to estimate the next weak learner.
3. Data
3.1. Defining recessions
It is difficult to define a benchmark for recession periods because the question of what constitutes a ‘‘recession’’ is controversial. Most researchers who study U.S. data refer to the definition of recessions that was coined by the NBER Business Cycle Dating committee (National Bureau of Economic Research, 2014): ‘‘During a recession, a significant decline in economic activity spreads across the economy
and can last from a few months to more than a year. Similarly, during an expansion, economic activity rises substantially, spreads across the economy, and usually lasts for several years’’. When determining recession periods, the committee makes a broad assessment of the overall state of the economy, an approach which is in the tradition of Burns and Mitchell (1946). This so-called ‘‘classical’’ understanding of business cycles is in contrast to the alternative understanding of business cycles as deviations from a long-term trend or potential GDP. Because there is no ‘‘official’’ business cycle dating committee for Germany, we refer to the dating of peaks and troughs as published by the Economic Cycle Research Institute (2013). The ECRI recession periods track the swings in the production cycle nearly perfectly, and are in line with results of earlier research (see Kholodilin, 2005, and the literature cited therein).²
3.2. Selection of leading indicators
Researchers have studied various leading indicators of the German business cycle, and German recession periods in particular (Fritsche & Kuzin, 2005; Theobald, 2012). Like Drechsel and Scheufele (2012), we study selected financial indicators, surveys, real economy variables, prices, and composite leading indicators.³ Specifically, we use the following criteria to select the 35 leading indicators summarized in Table 1⁴:
1. Long-term availability: Because recessions are rare events, we need indicators that are available for long time spans. Our sample period starts in 1973/1 (data for 1973 are used for computing year-on-year changes) and ends in 2014/12.
2. Monthly availability: We restrict our analysis to monthly data because we want to study different forecasting horizons and to have many observations available.
Several macroeconomic time series are subject to revisions (Croushore, 2011). Because real-time data for Germany are only available for short sample periods, we use indicators which are observable in a timely fashion and not severely vulnerable to revisions. We consider (German and U.S.) industrial production, order inflow, and other macroeconomic data as leading indicators, because they are particularly relevant for business-cycle dating.
2 A figure showing the data is available from the authors. Our sample period starts in 1973/1, and the last forecast is made in 2014/12 (that is, for six-month-ahead forecasts, the last observation of the response variable is from 2015/6). There were five recessions in our sample period: one at the very beginning of the sample period (first oil-price shock), one in the early 1980s (second oil-price shock), one in the first half of the 1990s (after German reunification), one in the early 2000s (after the dotcom boom), and the Great Recession of 2008.
3 Some of the series used by Drechsel and Scheufele (2012) start after 1980, leaving only a very few recessions to be predicted. Others have been discontinued or are available only at a quarterly frequency. We also studied a model that features fewer leading indicators, but indicators that trace back to the early 1960s. The results for this model are not reported, but are available from the authors upon request.
4 Unit-root tests (not reported) show that the transformations of the leading indicators detailed in Table 1 generally result in stationary time series, with the exception of some monetary aggregates which are highly persistent. Given the limited power of unit root tests in small samples, we assume stationarity and perform all calculations under this assumption.
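The data preparation implied by this section can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' code: the year-on-year change is assumed here to be a percentage change over the same month of the previous year (the paper only says "change over the previous year"), the term spread is the long-term yield minus a short-term rate, and indicators observed in month t are paired with the recession indicator in month t + h. All function names and numbers are hypothetical.

```python
def yoy_change(series, lag=12):
    """Percentage change over the same month of the previous year.

    The first `lag` entries are None because no earlier value exists.
    """
    changes = [None] * lag
    for i in range(lag, len(series)):
        changes.append(100.0 * (series[i] / series[i - lag] - 1.0))
    return changes


def term_spread(long_yield, short_rate):
    """Long-term yield minus short-term rate, month by month."""
    return [l - s for l, s in zip(long_yield, short_rate)]


def align_for_horizon(x_rows, recession, h):
    """Pair the indicators at month t with the recession flag at t + h."""
    pairs = []
    for t in range(len(x_rows) - h):
        if any(v is None for v in x_rows[t]):
            continue  # skip months lost to the year-on-year transformation
        pairs.append((x_rows[t], recession[t + h]))
    return pairs


# Hypothetical monthly index: flat for a year, then 10% higher.
index = [100.0] * 12 + [110.0]
print(yoy_change(index)[12])
```

The alignment step is also why the first year of data is consumed by the transformation and why the last usable response observation lies h months after the last forecast origin.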
4. Empirical results
4.1. Model calibration
We use the R programming environment for statistical computing (R Core Team, 2015) for our empirical analysis, and the add-on package ‘‘gbm’’ (loss function ‘‘Adaboost’’, Ridgeway, 2015) for estimating the BRT model. We allow for a maximum tree depth of five, which should be sufficient to capture potential interaction effects.⁵ The minimum number of observations per terminal node is five, and the shrinkage parameter assumes the value λ = 0.005, with reasonably larger values giving similar results. Our estimation strategy in Sections 4.2, 4.3 and 4.6 is to use 70% of the data (sampling without replacement) to train the BRT model, and 30% for quasi out-of-sample model testing. We simulate this process 1,000 times to make statistical inference, using five-fold cross-validation to determine the optimal number of weak learners in every simulation run. In this respect, we fix the maximum number of weak learners to M = 3,000, but the cross-validated bias–variance minimized number of weak learners is typically much smaller. We study two forecast horizons: three and six months. Across all 1,000 simulations, the optimal number of weak learners averages about 746 (standard deviation: 173) when we study a forecast horizon of three months, and about 663 (standard deviation: 115) for a forecast horizon of six months. We select 50% of the training data at random in Step 3a to build the next weak learner in the recursion. We present in-sample results and results for our quasi out-of-sample experiment in Sections 4.2–4.6. In Sections 4.7 and 4.8, we assess the performance of the BRT approach based on out-of-sample forecasts.
4.2. Relative importance of leading indicators
The relative importance of a leading indicator in a regression tree is defined as the sum over non-terminal nodes of the squared improvement that results from using a leading indicator to form splits (Breiman et al., 1984), where this definition can be extended to boosted tree ensembles by averaging across weak learners (Friedman, 2001). In our application, we average across simulation runs.⁶ Measures of the short-term interest rate (especially the money market rate) and the term spread (especially when calculated as the yield on 9–10 year government bonds minus the money market rate) are among the most influential leading indicators (Fig. 1). While the term spread is relatively more important for a forecast horizon of three months, the money market rate is relatively more
5 A model that only features stumps (see also Ng, 2014) gives qualitatively similar results (not reported), but does not allow the type of interaction of indicators visualized in Fig. 5 to be studied. See also the results for a dynamic model that are mentioned briefly in Section 4.8.
6 For alternative ways to visualize the relative importance of predictors, see Kim and Swanson (2014), who compare various forecasting approaches, and Lehmann and Wohlrabe (2016), who study leading indicators of the growth rate of German industrial production.
Table 1. Data.
Series | Source | Transformation
Recession phases Order inflow industry ifo business climate (industry) fo business climate (industry, current situation) fo business climate (industry, expectations) Consumer price index (CPI) OECD stock market index Industrial production OECD consumer confidence U.S. industrial production Crude oil prices: West Texas Intermediate (WTI) OECD leading indicator (amplitude adjusted) Real narrow effective exchange rate for Germany U.S. effective federal funds rate Money market rate (a) Discount rate (b) 3m-money market rate (c) Yields on debt (4-5) Yields on debt (7-8) Long-term government yield (9-10) Term spread Term spread Term spread Spread Spread (corporate–government) Money supply M1 (d) Money supply M1, real Money supply M2 (e) Money supply M2, real Money supply M3 (e) Money supply M3, real Oil price, real OECD leading indicator, norm. OECD leading indicator, trend rest. Exchange rate (dollar) Unemployment rate
Economic Cycle Research Institute Deutsche Bundesbank FRED / ifo institute FRED / ifo institute FRED / ifo institute Deutsche Bundesbank OECD Monthly Economic Indicators Deutsche Bundesbank OECD Monthly Economic Indicators OECD Monthly Economic Indicators FRED OECD Monthly Economic Indicators FRED FRED Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank Deutsche Bundesbank, FRED
None yoy None None None yoy yoy yoy
Deutsche Bundesbank, FRED Deutsche Bundesbank, FRED
OECD Monthly Economic Indicators OECD Monthly Economic Indicators Deutsche Bundesbank, FRED Deutsche Bundesbank
yoy yoy None yoy None None None None None None None 10y: money 10y: discount 10y: 3 month money discount: money None yoy yoy M1 change: inflation yoy yoy M2 change: inflation yoy yoy M3 change: inflation yoy price change: inflation None yoy Prices 1+UR /100
Notes: yoy denotes the change over the previous year. FRED is the Federal Reserve Bank of St. Louis database. The data start in 1973/1 and the last forecast is made in 2014/12. (a) From 1999 onwards: EURIBOR one-month funds; before: money market rates reported by Frankfurt banks. (b) From 1999 onwards: ECB’s deposit facility rate; before: discount rate of the Bundesbank. (c) From 1999 onwards: EURIBOR three-month funds rate; before: money market rates reported by Frankfurt banks (three-month funds). (d) Up to 1980: M1 growth for Germany; from 1980 onwards: growth of ‘‘German contribution’’ to Euro-Zone M1. (e) Up to 1998: M2 (M3) growth for Germany; from 1999 onwards: growth of ‘‘German contribution’’ to Euro-Zone M2 (M3).
important for a forecast horizon of six months. The relative importance of stock market returns ranges between about 6% and 7%.⁷ The relative importance of the growth rate of money supply (M1) ranges between 5% and about 6%.⁸ The relative importance of the business-climate indicators decreases when we switch to a forecast horizon of six months. While the other leading indicators have relative importances below 5%, the BRT approach occasionally uses their informational content for tree building.
7 For U.S. data, Estrella and Mishkin (1998) find that stock prices have predictive value for recessions, especially at forecast horizons of between one and three quarters, and Farmer (2012) emphasizes the predictive value of the stock market for the unemployment rate. On the predictive power of stock prices for recession starts and depressions, see also Bluedorn et al. (2013) and Barro and Ursúa (2009).
8 Some authors have considered real money growth as an indicator of coming recessions (Balbach & Karnosky, 1975; Brand, Reimers, & Seitz, 2003). The European Central Bank (2012) argues that monetary aggregates are useful leading indicators, but only during periods of financial crisis. Levanon, Ozyildirim, and Tanchua (2010) show that the real M2 lost its formerly well-documented leading indicator property in the U.S. during the first decade of the 21st century.
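The relative-importance measure used in this section can be illustrated with a short sketch. It assumes, hypothetically, that each fitted tree is available as a list of (indicator index, improvement) pairs, one pair per non-terminal node, where the improvement is the reduction in squared error achieved by the split. The per-tree sums are averaged across the ensemble and rescaled to percentages. The data structure, the function names, and the numbers are made up for illustration and do not reproduce the output format of any particular package.

```python
def tree_importance(splits, n_indicators):
    """Sum the split improvements per indicator within one tree."""
    importance = [0.0] * n_indicators
    for indicator, improvement in splits:
        importance[indicator] += improvement
    return importance


def ensemble_importance(trees, n_indicators):
    """Average per-tree importances and rescale so they sum to 100."""
    totals = [0.0] * n_indicators
    for splits in trees:
        for j, value in enumerate(tree_importance(splits, n_indicators)):
            totals[j] += value
    averages = [value / len(trees) for value in totals]
    scale = sum(averages)
    return [100.0 * value / scale for value in averages]


def average_over_runs(runs):
    """Average the percentage importances across simulation runs."""
    n = len(runs[0])
    return [sum(run[j] for run in runs) / len(runs) for j in range(n)]


# Two hypothetical trees over three indicators.
trees = [[(0, 3.0), (1, 1.0)], [(0, 1.0), (2, 3.0)]]
print(ensemble_importance(trees, 3))
```

Because the measure is rescaled to sum to 100, a value such as 6% means that an indicator accounts for 6% of the total split improvement, not that it changes the recession probability by six percentage points.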
4.3. Marginal effects
Figs. 2 and 3 show marginal effects for forecast horizons of three and six months, where the shaded areas denote 95% confidence intervals computed across 1,000 simulation runs. The marginal effects show the effect of a leading indicator (horizontal axis) on the probability of a recession (vertical axis, log-odds scale), where the effects of the other leading indicators are controlled for using the weighted-traversal technique described by Friedman (2001, p. 1221). The recession probability shows an abrupt increase for short-term interest rates of around 7%–8%, and stays constant for higher short-term interest rates. A negative term spread is associated with a higher recession probability than a positive term spread, where the log-odds ratio changes significantly when the term structure is flat. The recession probability is higher in times of a bearish stock market and decreases gradually over the range of stock market returns from −40% to 20%. A higher growth rate of money supply (M1) is associated with a lower recession probability. The recession probability is lower for
Fig. 1. Relative importance of leading indicators. Note: For definitions of leading indicators, see Table 1. The relative importance is averaged over 1,000 simulation runs.
larger values of two of the business-climate indicators, though this effect is somewhat more visible for a forecast horizon of three months. The recession probability increases when the inflation rate and the oil price increase (for the six-month forecast horizon).
4.4. Changes in marginal effects
Fig. 4 shows the changes over time in the marginal effects of the short-term interest rate, the term spread, and stock market returns. The marginal effects plotted in the figure are based on estimates of the BRT model over the full sample period (data up to and including 2014/12) and two shorter sample periods (data up to and including either 1989/12 or 1999/12). The function that summarizes the marginal effects of the short-term interest rate undergoes a clockwise rotation as the sample period lengthens. The marginal-effect curve for the term spread shifts upward, with the shift being more pronounced for a negative term spread than for a positive one. Similarly, the marginal-effect curve for stock market returns shifts upwards for the longer sample periods, with the shift being more pronounced for a bearish stock market.
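A marginal-effect curve of the kind discussed in Sections 4.3 and 4.4 can be approximated by brute force. The sketch below does not implement the weighted-traversal technique of Friedman (2001) used in the paper. Instead it uses the simpler averaging approach: fix one indicator at a grid value, leave the other indicators at their observed values, and average the model's predictions over the sample. The prediction function, the observations, and the grid are hypothetical stand-ins for a fitted BRT model and real data.

```python
def partial_dependence(predict, rows, indicator, grid):
    """Average prediction over the sample with one indicator held fixed."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[indicator] = value  # overwrite only the chosen indicator
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_log_odds(x):
    """Hypothetical stand-in for a fitted model (log-odds scale)."""
    spread, short_rate = x
    score = -1.0 if spread > 0.0 else 1.0
    if short_rate > 7.0:
        score += 0.5
    return score


# Made-up observations: (term spread, short-term interest rate).
rows = [[0.5, 3.0], [-0.5, 8.0]]
print(partial_dependence(toy_log_odds, rows, 0, [-1.0, 1.0]))
```

Estimating such a curve on samples of different length, as in Fig. 4, only requires refitting the model on each sample and calling the same function again.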
4.5. Interaction effects
Fig. 5 shows the recession probability (on a log-odds scale) as a function of the short-term interest rate and the term spread for alternative realizations of stock market returns. The BRT model is estimated on the full sample of data. The recession probability that corresponds to a high short-term interest rate is smaller (the negative log-odds ratio implies that the recession probability is smaller than 0.5) in a bull market than in a bear market. Similarly, a positive term spread is associated with a smaller recession probability in a bull market than in a bear market. The shift in the marginal-response curves is not parallel, but rather becomes nonlinear in the presence of interaction effects when the tree depth exceeds one.
4.6. ROC analysis
The estimated recession probability can be mapped back to a recession classifier by checking whether $\hat{P}(y_{t+h} = 1 | x_t)$ exceeds some cutoff value. An ROC curve plots the resulting rate of true positives as a function of the rate of
Fig. 2. Marginal effects (forecast horizon: three months). Note: Horizontal axis = mean per quantile of 2.5% width of the leading indicators computed across 1,000 simulation runs. Black line = mean per quantile of the log-odds ratio computed across simulation runs. Shaded area = 95% confidence interval computed for every quantile. For definitions of the leading indicators, see Table 1.
Fig. 3. Marginal effects (forecast horizon: six months). Note: Horizontal axis = mean per quantile of 2.5% width of the leading indicators computed across 1,000 simulation runs. Black line = mean per quantile of the log-odds ratio computed across simulation runs. Shaded area = 95% confidence intervals computed for every quantile. For definitions of the leading indicators, see Table 1.
Fig. 4. Changing marginal-effect curves. Note: Short-term interest rate = money market rate. Term spread = the yield on 9–10 year government bonds minus the money market rate. For definitions of the leading indicators, see Table 1.
false positives, as the cutoff value varies.⁹ The area under a ROC curve, AUROC, summarizes the forecast performance. Perfect (pure noise) forecasts give AUROC = 1 (AUROC = 0.5).¹⁰ The AUROC statistic is estimated using the result that it is linked to the Wilcoxon–Mann–Whitney U statistic (Bamber, 1975; Hanley & McNeil, 1982). We first compute the AUROC statistic for the 1,000 simulated samples of the quasi out-of-sample forecasting experiment (30% test data), then subject a probit model to the same quasi out-of-sample forecasting experiment (we use the same random seed for the training and test data). For each simulation run, we estimate probit models that feature one predictor at a time, and use the model that maximizes the pseudo-R² statistic for forecasting purposes. Finally, we subtract the AUROC statistics for the probit approach from those computed for the BRT approach. The difference in the AUROC statistics is significantly positive (Table 2).¹¹
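The link between the AUROC statistic and the Wilcoxon–Mann–Whitney U statistic can be made concrete with a short sketch. The AUROC equals the share of (recession, non-recession) pairs in which the recession observation receives the higher predicted probability, with ties counted as one half. The code below is an illustration with made-up forecasts. The function names are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auroc(scores, labels):
    """AUROC as the normalized Mann-Whitney U statistic."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def roc_point(scores, labels, cutoff):
    """(false-positive rate, true-positive rate) for a given cutoff."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s > cutoff)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s > cutoff)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos


# Made-up recession probabilities and realized outcomes.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
print(auroc(scores, labels), roc_point(scores, labels, 0.5))
```

Varying the cutoff in `roc_point` from 1 down to 0 traces out the ROC curve, and `auroc` summarizes that curve without requiring any particular cutoff to be chosen.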
We also compare the BRT approach with an alternative probit approach. To this end, we use Bayesian model averaging (BMA) to form weighted forecasts from the individual probit models.12 The results, given in Table 2, show that the BRT approach outperforms the BMA probit approach.

4.7. Recursive estimation

Next, we analyze recursive out-of-sample forecasts. We start by estimating the BRT model on data up to and including 1979/12, then compute forecasts. At the end of the next month, we reestimate the BRT model and use the reestimated model to make the next forecasts. This continues until we reach the end of the sample period.

9 While ROC curves are often used in the machine-learning literature to study the predictive power, it is only recently that they have become popular in economics (Berge & Jordà, 2011; Lahiri & Wang, 2013; Liu & Moench, 2016; Pierdzioch & Rülke, 2015; Schneider & Gorr, 2015).
10 If AUROC < 0.5, reversing the definition of a signal implies that AUROC > 0.5.
11 Studying the comparative performances of the BRT and the probit approach in Table 2 is more interesting than studying their absolute performances. The quasi out-of-sample forecasting experiment implies that the test data are scattered across the sample, so that information from business-cycle developments that occurred later in the sample is used to study how the model performs on the test data. Thus, the quasi out-of-sample forecasting experiment overestimates the out-of-sample performances (see also Table 3).
12 Specifically, like Berge (2015), we use the result derived by Raftery (1995, p. 145) to approximate the a posteriori probability, $\mathrm{post}_i$, of model $i$ by means of the Bayesian information criterion (BIC, see Schwarz, 1978), as $\mathrm{post}_i = \exp(-\tfrac{1}{2}\mathrm{BIC}_i) / \sum_{j=1}^{N(x_t)} \exp(-\tfrac{1}{2}\mathrm{BIC}_j)$, where $N(x_t)$ denotes the number of leading indicators in $x_t$. The BIC is computed as $\mathrm{BIC}_i = -2\,LL_i + 2n$, where $LL_i$ denotes the maximized log likelihood function of model $i$, and $2n$ denotes the number of parameters (an intercept plus the coefficient of the leading indicator being studied) times the number of observations in the training data.
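The BIC-based approximation of the model weights in footnote 12 can be sketched in a few lines. This is an illustration rather than the authors' code: the function names and the numerical inputs are hypothetical, and the helper `bic` uses the textbook Schwarz form, −2·LL + k·ln(n), which is an assumption on our part. Because every single-predictor probit model has the same number of parameters and is fitted to the same training sample, the penalty term is identical across models and cancels in the weights, which therefore depend only on the maximized log likelihoods.

```python
import math
from typing import List, Sequence


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Schwarz criterion in its textbook form, -2*LL + k*ln(n) (an assumption)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def bma_weights(bics: Sequence[float]) -> List[float]:
    """post_i = exp(-BIC_i / 2) / sum_j exp(-BIC_j / 2)."""
    best = min(bics)
    # Subtracting the smallest BIC leaves the ratios unchanged but avoids underflow.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def bma_forecast(probabilities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of the single-predictor probit forecasts."""
    return sum(w * p for w, p in zip(weights, probabilities))


# Hypothetical log likelihoods of three one-predictor probit models.
log_liks = [-50.0, -52.0, -60.0]
weights = bma_weights([bic(ll, n_params=2, n_obs=300) for ll in log_liks])
combined = bma_forecast([0.80, 0.60, 0.10], weights)
```

In this made-up example, a log-likelihood gap of 2 between the first two models translates into a weight ratio of exp(2), so the combined forecast is dominated by the best-fitting model.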
Fig. 5. Interaction effects: influence of the stock market. Note: Short-term interest rate = money market rate. Term spread = the yield on 9–10 year government bonds minus the money market rate. Stock market returns assume a value of two standard deviations below their mean (dashed line; bear market; returns approximately −37%), an intermediate value (solid line; mean returns of about 5%), and a value of two standard deviations above their mean (dotted line; bull market; returns roughly 48%). For definitions of the leading indicators, see Table 1.
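Curves of the kind shown in Fig. 5 can be obtained by averaging the predicted log-odds over the sample while one indicator is moved along a grid and a second indicator is held at a fixed value. The sketch below illustrates this partial-dependence idea under stated assumptions: whether the authors computed their curves in exactly this way is not stated in this section, and `toy_model`, the variable names, and the coefficients are hypothetical stand-ins for a fitted BRT model. The fixed values −37 and 48 are the approximate bear-market and bull-market returns reported in the caption.

```python
import math
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def log_odds(p: float, eps: float = 1e-12) -> float:
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def marginal_effect_curve(
    predict: Callable[[Row], float],
    sample: Sequence[Row],
    var: str,
    grid: Sequence[float],
    fixed: Dict[str, float],
) -> List[float]:
    """Average log-odds over the sample with `var` set to each grid value
    and the variables in `fixed` held at the given values."""
    curve = []
    for value in grid:
        total = 0.0
        for row in sample:
            scenario = {**row, **fixed, var: value}
            total += log_odds(predict(scenario))
        curve.append(total / len(sample))
    return curve


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for a fitted BRT: the effect of the interest
    rate is stronger when stock returns are low."""
    z = row["rate"] * (1.0 - 0.02 * row["stock"]) + row["other"]
    return 1.0 / (1.0 + math.exp(-z))


sample = [
    {"rate": 3.0, "stock": 5.0, "other": 0.0},
    {"rate": 6.0, "stock": 10.0, "other": 2.0},
]
bear = marginal_effect_curve(toy_model, sample, "rate", [0.0, 1.0], {"stock": -37.0})
bull = marginal_effect_curve(toy_model, sample, "rate", [0.0, 1.0], {"stock": 48.0})
```

Comparing the slopes of `bear` and `bull` shows how an interaction appears in such curves: in this toy model the interest-rate curve is steep in the bear-market scenario and almost flat in the bull-market scenario.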
Fig. 6. Out-of-sample performances. Note: Shaded areas indicate recessions. The forecast horizon in the left (right) panel is three (six) months.

Table 2. AUROC results (simulation study).

Forecast horizon   Mean BRT   Mean probit   Difference in means   CI lower bound   CI upper bound
BRT versus probit
3 months           0.9937     0.8582        0.1354                0.0747           0.2055
6 months           0.9880     0.8939        0.0941                0.0515           0.1513
BRT versus BMA probit
3 months           0.9937     0.8672        0.1264                0.0656           0.1958
6 months           0.9880     0.8949        0.0930                0.0506           0.1436
Note: CI = 95% confidence interval. Test fraction: 30%. Number of simulation runs: 1,000. Difference in means denotes the average, across simulation runs, of the AUROC statistic for the BRT approach minus that for the (BMA) probit approach.
The results (Fig. 6) show that the model captures the recession of the 1980s and the beginning of the recession of the early 1990s well, but is a little late in predicting the end of the recession of the early 1980s, especially in the case of six-month-ahead forecasts. The model also describes the end of the recession of the 1990s well, although the recession probability becomes unstable towards the end of the recession. At the start of the recession of the early 2000s, the recession probability shows some noticeable upticks and downticks. Furthermore, the model signals the beginning of the Great Recession, but is a bit late in signalling the end of this short-lived recession.

Fig. 7 shows the change in the relative importance of the various leading indicators since 1980.13 The results show a decline in the relative importance of short-term interest rates, beginning in the 1990s. The relative importance of the term spread (the yield on 9–10 year government bonds minus the money market rate) starts increasing in the second half of the 1990s, reaching roughly 20% by the end of the sample period when the forecast horizon is three months (about 10% when the forecast horizon is six months).14 While the relative importance of most other leading indicators has stayed more or less constant at a low level over the years, the relative importance of the business climate indicators has increased slightly, but only for a forecast horizon of three months. We also observe an increase (at a low level) in the relative importance of the growth rate of money supply (M1). Furthermore, the results show that the relative importance of the unemployment rate has declined in recent years. This decline may indicate that the unemployment rate in Germany is less cyclical today than it was, say, in the mid-seventies. The relative importance of stock market returns started increasing around 2000, reaching about 6% at the end of the sample period for both forecast horizons (see also Fig. 1).
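The recursive scheme of Section 4.7 that underlies the forecasts in Fig. 6 can be sketched as an expanding-window loop. The code is a simplified illustration, not the authors' implementation: `recursive_forecasts`, `first_origin`, and the toy `base_rate_fitter` (which stands in for fitting a BRT model) are hypothetical names. The sketch also assumes that, in month t, only those training pairs enter the estimation whose outcome, horizon months ahead, has already been observed.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Model = Callable[[Row], float]  # maps indicators to a recession probability
Fitter = Callable[[Sequence[Row], Sequence[int]], Model]


def recursive_forecasts(
    X: Sequence[Row],
    y: Sequence[int],
    fit: Fitter,
    first_origin: int,
    horizon: int,
) -> List[Tuple[int, float]]:
    """Expanding-window forecasts of y[t + horizon] made in month t.

    first_origin is the index of the first forecast origin (1979/12 in the
    paper). Assumption: only pairs (X[s], y[s + horizon]) whose outcome is
    already observed in month t enter the training sample.
    """
    if first_origin < horizon:
        raise ValueError("first_origin must leave at least one training pair")
    forecasts = []
    for t in range(first_origin, len(X) - horizon):
        n_train = t - horizon + 1  # pairs with s + horizon <= t
        model = fit(X[:n_train], y[horizon : t + 1])  # re-estimated every month
        forecasts.append((t + horizon, model(X[t])))
    return forecasts


def base_rate_fitter(train_X: Sequence[Row], train_y: Sequence[int]) -> Model:
    """Toy stand-in for a BRT fit: always predicts the training recession share."""
    share = sum(train_y) / len(train_y)
    return lambda row: share


X = [[float(i)] for i in range(8)]
y = [0, 0, 1, 1, 0, 0, 1, 1]
out = recursive_forecasts(X, y, base_rate_fitter, first_origin=3, horizon=2)
```

Each element of `out` pairs the index of the target month with the forecast made `horizon` months earlier, so the list can be compared directly with the realized recession indicator when computing out-of-sample statistics.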
Table 3 summarizes the out-of-sample AUROC results. As expected, the point estimates are somewhat smaller than those given in Table 2, but they are still significantly above the pure noise benchmark for both the BRT and probit approaches. The BRT approach performs better than the probit approach, corroborating the results of the quasi out-of-sample experiment.

13 In regard to the stability of prediction models for U.S. and German inflation and real activity, see Estrella et al. (2003).
14 The results (not reported) of a recursive out-of-sample forecasting experiment based on a BRT approach that uses a larger numerical value for the learning rate (λ = 0.05) are qualitatively similar.

4.8. A dynamic model

As a robustness check, we analyze a dynamic forecasting model. We introduce dynamics by adding twelve lags of every variable to our vector of leading indicators and also add twelve lags of the recession indicator. The dynamic model features a total of 467 potential leading indicators. We estimate the dynamic model recursively in order to compute out-of-sample forecasts, using data up to and including 1979/12 to initialize the recursive estimation, as in Section 4.7. In order to save CPU time, we estimate the dynamic model using stumps.

Fig. 8 summarizes the results for the dynamic model. Panel A reports the relative importance of the top leading indicators. We condense the estimation output by reporting measures of the relative importance averaged across time. As in the case of the static model (Section 4.7), measures of the short-term interest rate and the term spread dominate the list of top leading indicators.15 Panel B reports the evolution of the relative importance of the money market rate, the term spread, and stock market returns over time. We aggregate the relative importance across the contemporaneous and lagged values of each variable.16 The relative importance of the money market rate started decreasing in the mid-1990s, while the relative importance of the term spread increased. The relative importance of stock market returns started to increase in the early 2000s. Panel C reports the estimated recession probabilities, which resemble those given in Fig. 6.17

5. Concluding remarks

From the point of view of applied business-cycle forecasting, machine-learning techniques are not a substitute for experience in business-cycle forecasting in general, and in interpreting changes in estimated recession probabilities in particular. The BRT approach that we have studied in this research complements the probit approach as a useful instrument for practical business-cycle analysis, but also has additional features. One such feature is that the BRT approach is a natural modelling platform for analyzing the relative importance of leading indicators. Another feature is that the BRT approach makes it possible to study the nonlinear marginal effects of leading indicators on the recession probability. The BRT approach also helps to uncover the ways in which leading indicators interact when predicting recessions.
We have considered a static BRT approach for studying such interaction effects, and a simplified dynamic model that features stumps for capturing the informational content of lagged predictors. The question of whether a static or dynamic BRT approach is the method of choice for forecasting recessions depends on the facets of the data that a researcher is interested in. Our results show that the short-term interest rate and the term spread are two important leading indicators of
15 The lagged recession indicator is the top predictor for a forecast horizon of three months. Two issues should be mentioned in this regard. First, when using lags of the response variable as predictors, the question arises as to when a recession indicator is known to a forecaster such that he or she can use the indicator to compute forecasts. For example, Nyberg (2010) assumes a lag of nine months. Second, information on the stance of the business cycle can be taken into account in various different ways (by using a lagged recession indicator, the lagged recession probability, or a combination thereof; see Kauppi & Saikkonen, 2008). We leave a comparison of the various modeling choices in the context of a BRT approach for future research.
16 For example, the relative importance reported for the money market rate is the sum of the relative importance values computed for the contemporaneous and all lagged money market rates.
17 The AUROC statistics are 0.9735 (0.0107) for a three-month horizon and 0.9597 (0.0133) for a six-month horizon.
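The construction of the lagged predictors for the dynamic model and the aggregation described in footnote 16 can be illustrated as follows. The helper names and the column-naming scheme `<name>_lag<k>` are hypothetical and chosen only for this sketch. With twelve lags, every leading indicator contributes 13 columns and the recession indicator contributes 12; the reported total of 467 predictors would be consistent with 35 × 13 + 12, although the number of indicators in Table 1 is an inference here and is not stated in this section.

```python
from typing import Dict, List, Sequence


def add_lags(
    series: Dict[str, Sequence[float]],
    n_lags: int,
    include_current: bool = True,
) -> Dict[str, List[float]]:
    """Build lagged columns, dropping the first n_lags rows so that all
    columns are aligned on the same months."""
    first = 0 if include_current else 1
    columns: Dict[str, List[float]] = {}
    for name, values in series.items():
        for lag in range(first, n_lags + 1):
            # Row i of the output refers to month n_lags + i of the input.
            columns[f"{name}_lag{lag}"] = list(values[n_lags - lag : len(values) - lag])
    return columns


def aggregate_importance(importance: Dict[str, float]) -> Dict[str, float]:
    """Sum relative importance over the contemporaneous and lagged values
    of each variable (footnote 16)."""
    totals: Dict[str, float] = {}
    for column, value in importance.items():
        base = column.rsplit("_lag", 1)[0]
        totals[base] = totals.get(base, 0.0) + value
    return totals


lagged = add_lags({"rate": [1.0, 2.0, 3.0, 4.0, 5.0]}, n_lags=2)
totals = aggregate_importance({"rate_lag0": 10.0, "rate_lag1": 5.0, "spread_lag0": 85.0})
```

For the recession indicator, `include_current=False` keeps only the lags, which mirrors the statement that twelve lags of the recession indicator are added to the predictor set.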
Fig. 7. The changing relative importance of leading indicators. Note: Black (red) solid (dashed) lines represent forecast horizons of three (six) months. For definitions of the leading indicators, see Table 1.
Table 3. AUROC results (recursive estimation).

Forecast horizon   BRT      SD       Probit   SD       BMA probit   SD
3 months           0.9769   0.0100   0.8562   0.0236   0.8561       0.0236
6 months           0.9693   0.0117   0.8689   0.0229   0.8704       0.0228
Note: SD = standard deviation.
Fig. 8. Results for the dynamic model. Note: The graphs on the left (right) are for forecast horizons of three (six) months. Panel A: Relative importance averaged across recursive estimation periods. Panel B: Relative importance aggregated across contemporaneous and lagged leading indicators. Term spread = yield on 9–10 year government bonds minus the money market rate. Panel C: Shaded areas indicate recessions.
recessions in Germany. While the relative importance of the short-term interest rate has decreased over time, the relative importance of the term spread has increased. Given that the relative importance of stock market returns has increased somewhat over time, it would be interesting to investigate the role of the stock market in more detail in future research using the BRT approach or some other machine-learning technique.

Our results further show that the BRT approach can be a useful technique for the analysis of economic policy. The changes in the relative importance of the short-term interest rate as a leading indicator of recessions that we have detected may have implications for monetary policy. The unresponsiveness of the marginal-effect curve to variation in the short-term interest rate when the short-term interest rate is close to its zero lower bound is also interesting from a monetary-policy perspective. Another result that is of interest for both monetary-policy analysis and economic model building concerns the way in which the marginal effect of variation in the short-term interest rate on the recession probability depends on the state of the stock market. This state dependence deserves further attention in future research.

Acknowledgments

We thank Artur Tarassow, Christian Proaño and two anonymous referees for helpful comments. We thank the German Science Foundation for financial support (Project Macroeconomic Forecasting in Great Crises; Grant number: FR 2677/4-1).

References

Bai, J., & Ng, S. (2009). Boosting diffusion indices. Journal of Applied Econometrics, 24(4), 607–629.
Balbach, A., & Karnosky, D. S. (1975). Real money balances: A good forecasting device and a good policy target? Federal Reserve Bank of St. Louis Review, 57(9), 11–15.
Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology, 12(4), 387–415.
Barro, R. J., & Ursúa, J. F. (2009). Stock-market crashes and depressions. Working Paper No. 14760, National Bureau of Economic Research, Cambridge, Mass.
Bec, F., Bouabdallah, O., & Ferrara, L. (2014). The way out of recessions: A forecasting analysis for some Euro area countries. International Journal of Forecasting, 30(3), 539–549.
Berge, T. J. (2015). Predicting recessions with leading indicators: Model averaging and selection over the business cycle. Journal of Forecasting, 34(6), 455–471.
Berge, T. J., & Jordà, Ò. (2011). Evaluating the classification of economic activity into recessions and expansions. American Economic Journal: Macroeconomics, 3(2), 246–277.
Bluedorn, J. C., Decressin, J., & Terrones, M. E. (2013). Do asset price drops foreshadow recessions? IMF Working Paper No. WP/13/203, International Monetary Fund.
Brand, C., Reimers, H.-E., & Seitz, F. (2003). Forecasting real GDP: What role for narrow money? Working Paper No. 254, European Central Bank.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey: Wadsworth and Brooks/Cole.
Buchen, T., & Wohlrabe, K. (2011). Forecasting with many predictors: Is boosting a viable alternative? Economics Letters, 113(1), 16–18.
Bühlmann, P., & Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.
Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. National Bureau of Economic Research.
Cáceres, N., & Malone, S. W. (2013). Forecasting leadership transitions around the world. International Journal of Forecasting, 29(4), 575–591.
Croushore, D. (2011). Frontiers of real-time data analysis. Journal of Economic Literature, 49(1), 72–100.
Drechsel, K., & Scheufele, R. (2012). The performance of short-term forecasts of the German economy before and during the 2008/2009 recession. International Journal of Forecasting, 28(2), 428–445.
Duarte, A., Venetis, I. A., & Paya, I. (2005). Predicting real growth and the probability of recession in the Euro area using the yield spread. International Journal of Forecasting, 21(2), 261–277.
Economic Cycle Research Institute (2013). Business cycle peak and trough dates, 1948–2011. URL http://www.businesscycle.com/ecribusiness-cycles/international-business-cycle-dates-chronologies. Date of download: 19.09.2013.
Estrella, A., & Hardouvelis, G. A. (1991). The term structure as a predictor of real economic activity. The Journal of Finance, 46(2), 555–576.
Estrella, A., & Mishkin, F. S. (1998). Predicting US recessions: Financial variables as leading indicators. Review of Economics and Statistics, 80(1), 45–61.
Estrella, A., Rodriguez, A. P., & Schich, S. (2003). How stable is the predictive power of the yield curve? Evidence from Germany and the United States. Review of Economics and Statistics, 85(3), 629–644.
European Central Bank (2012). Money and credit growth after economic and financial crisis: A historical global perspective. Monthly Bulletin, 69–85.
Farmer, R. E. (2012). The stock market crash of 2008 caused the great recession: Theory and evidence. Journal of Economic Dynamics and Control, 36(5), 693–707.
Ferrara, L., Marcellino, M., & Mogliani, M. (2015). Macroeconomic forecasting during the great recession: The return of non-linearity? International Journal of Forecasting, 31(3), 664–679.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189–1232.
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics and Data Analysis, 38(4), 367–378.
Friedman, J. H., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. The Annals of Statistics, 28(2), 337–407.
Fritsche, U., & Kuzin, V. (2005). Prediction of business cycle turning points in Germany. Jahrbücher für Nationalökonomie und Statistik, 225(1), 22–43.
Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Berlin: Springer.
Ivanova, D., Lahiri, K., & Seitz, F. (2000). Interest rate spreads as predictors of German inflation and business cycles. International Journal of Forecasting, 16(1), 39–58.
Kauppi, H., & Saikkonen, P. (2008). Predicting U.S. recessions with dynamic binary response models. Review of Economics and Statistics, 90, 777–791.
Kholodilin, K. A. (2005). Forecasting the German cyclical turning points: Dynamic bi-factor model with Markov switching. Jahrbücher für Nationalökonomie und Statistik, 225(6), 653–674.
Kim, H. H., & Swanson, N. R. (2014). Forecasting financial and macroeconomic variables using data reduction methods: New empirical evidence. Journal of Econometrics, 178(2), 352–367.
Lahiri, K., & Wang, J. G. (2013). Evaluating probability forecasts for GDP declines using alternative methodologies. International Journal of Forecasting, 29(1), 175–190.
Lehmann, R., & Wohlrabe, K. (2016). Looking into the black box of boosting: The case of Germany. Applied Economics Letters, 23(17), 1229–1233.
Lessmann, S., Sung, M.-C., & Johnson, J. E. V. (2010). Alternative methods of predicting competitive events: An application in horserace betting markets. International Journal of Forecasting, 26(3), 518–536.
Levanon, G., Ozyildirim, A., & Tanchua, J. (2010). Real M2 and its impact on the conference board leading economic index (LEI) for the United States. Business Cycle Indicators. A monthly report from The Conference Board U.S. and Global Indicators Program, 15(3), 3–6.
Liu, W., & Moench, E. (2016). What predicts US recessions? International Journal of Forecasting, 32(4), 1138–1150.
Lloyd, J. R. (2014). GEFCom2012 hierarchical load forecasting: Gradient boosting machines and Gaussian processes. International Journal of Forecasting, 30(2), 369–374.
Mayr, A., Binder, H., Gefeller, O., & Schmid, M. (2014a). The evolution of boosting algorithms: From machine learning to statistical modelling. Methods of Information in Medicine, 6(1), 419–427.
Mayr, A., Binder, H., Gefeller, O., & Schmid, M. (2014b). Extending statistical boosting: An overview of recent methodological developments. Methods of Information in Medicine, 6(2), 428–435.
Mittnik, S., Robinzonov, N., & Spindler, M. (2015). Stock market volatility: Identifying major drivers and the nature of their impact. Journal of Banking and Finance, 58, 1–14.
National Bureau of Economic Research (2014). The NBER's Business Cycle Dating Committee. URL http://www.nber.org/cycles/recessions.html. (Last checked: 21.11.2014).
Ng, S. (2014). Boosting recessions. Canadian Journal of Economics, 47(1), 1–34.
Nyberg, H. (2010). Dynamic probit models and financial variables in recession forecasting. Journal of Forecasting, 29, 215–230.
Orphanides, A., & Porter, R. (2000). P* revisited: Money-based inflation forecasts with a changing equilibrium velocity. Journal of Economics and Business, 52(1/2), 87–100.
Pierdzioch, C., & Rülke, J.-C. (2015). On the directional accuracy of forecasts of emerging market exchange rates. International Review of Economics and Finance, 38(4), 369–376.
Proaño, C. R., & Theobald, T. (2014). Predicting recessions with a composite real-time dynamic probit model. International Journal of Forecasting, 30(4), 898–917.
R Core Team (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, version 3.1.3, URL http://www.R-project.org/.
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163.
Ridgeway, G., with contributions from others (2015). gbm: Generalized boosted regression models. R package version 2.1.1. http://CRAN.R-project.org/package=gbm.
Robinzonov, N., Tutz, G., & Hothorn, T. (2012). Boosting techniques for nonlinear time series models. AStA Advances in Statistical Analysis, 96(1), 99–122.
Rudebusch, G. D., & Williams, J. C. (2009). Forecasting recessions: The puzzle of the enduring power of the yield curve. Journal of Business & Economic Statistics, 27(4), 492–503.
Savona, R. (2014). Hedge fund systemic risk signals. European Journal of Operational Research, 236(1), 282–291.
Savona, R., & Vezzoli, M. (2015). Fitting and forecasting sovereign defaults using multiple risk signals. Oxford Bulletin of Economics and Statistics, 77(1), 66–92.
Schapire, R. E. (2003). The boosting approach to machine learning: An overview. In D. D. Denison, M. H. Hansen, C. C. Holmes, B. Mallick, & B. Yu (Eds.), Lecture notes in statistics: vol. 171. Nonlinear estimation and classification (pp. 149–171). Springer.
Schneider, M. J., & Gorr, W. L. (2015). ROC-based model estimation for forecasting large changes in demand. International Journal of Forecasting, 31(2), 253–262.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
Silva, L. (2014). A feature engineering approach to wind power forecasting GEFCom 2012. International Journal of Forecasting, 30(2), 395–401.
Taieb, S. B., & Hyndman, R. J. (2014). A gradient boosting approach to the Kaggle load forecasting competition. International Journal of Forecasting, 30(2), 382–394.
Theobald, T. (2012). Combining recession probability forecasts from a dynamic probit indicator. Working Paper 89, IMK at the Hans Boeckler Foundation, Macroeconomic Policy Institute.
Wohlrabe, K., & Buchen, T. (2014). Assessing the macroeconomic forecasting performance of boosting: Evidence for the United States, the Euro area and Germany. Journal of Forecasting, 33(4), 231–242.
Jörg Döpke, Professor of Economics, University of Applied Sciences Merseburg, Faculty of Economics. Research interests: macroeconomics, business-cycle research, forecasting, econometric modelling. Ulrich Fritsche, Professor of Economics, University of Hamburg, Faculty of Socioeconomics. Research interests: macroeconomics, business-cycle research, forecasting, econometric modelling. Christian Pierdzioch, Professor of Economics, Helmut-Schmidt-University Hamburg, Faculty of Social Sciences and Economics. Research interests: macroeconomics, monetary economics, forecasting, political economy, sports economics.