Forecasting inflation and GDP growth using heuristic optimisation of information criteria and variable reduction methods

George Kapetanios a, Massimiliano Marcellino b,c,d, Fotis Papailias e,f,*

a School of Economics and Finance, Queen Mary, University of London, UK
b Department of Economics, Bocconi University, Italy
c Innocenzo Gasparini Institute for Economic Research (IGIER), Italy
d Centre for Economic Policy Research (CEPR), UK
e Queen's University Management School, Queen's University Belfast, UK
f quantf research 1
Article history: Received 5 February 2014; Received in revised form 26 February 2015; Accepted 27 February 2015; Available online xxxx.

Keywords: Heuristic optimisation; Information criteria; Unbalanced datasets; Forecasting; Inflation; GDP; Principal components; Partial least squares; Bayesian shrinkage regression.

Abstract

Forecasting macroeconomic variables using many predictors is considered. Model selection and model reduction approaches are compared. Model selection includes heuristic optimisation of information criteria using simulated annealing, genetic algorithms, MC3 and sequential testing. Model reduction employs the methods of principal components, partial least squares and Bayesian shrinkage regression. The problem of unbalanced datasets is discussed and potential solutions are suggested. An out-of-sample forecasting exercise provides evidence that these methods are useful in predicting the growth rates of quarterly GDP and monthly inflation.
1. Introduction

Selecting appropriate forecasting methods for macroeconomic variables has long been debated among researchers, academics and economic analysts. During the last decade, attention has focused on the development of methods that cope with a large, and possibly unbalanced, set of predictors. This literature can be divided into two main approaches: (i) variable selection and (ii) variable reduction. In the former, the aim is to identify the specific predictors with the highest information content for the target variable; in the latter, the large information set is summarised into a smaller number of efficient predictors. Several methods have been proposed within each approach.

Sequential testing, which belongs to the variable selection approach, has been analysed by Krolzig and Hendry (2001) and is associated with what is often described as the "general-to-specific" methodology.
* Correspondence to: Queen's University Management School, Queen's University Belfast, Riddel Hall, 185 Stranmillis Road, BT9 5EE, Northern Ireland, UK. Tel.: +44 2890974667; fax: +44 2890974201. E-mail addresses: [email protected], [email protected] (F. Papailias).
1 www.quantf.com.
Starting from a general statistical model that captures the dynamics of the data, the model is reduced by sequentially testing the significance of the regressors, in order to achieve parsimony while retaining accuracy. We implement Sequential Testing (ST) along the lines of Hoover and Perez (1999).

An alternative method for variable selection is based on the use of information criteria. While conceptually simple, this method is computationally very demanding when applied in a large-dataset context. To tackle this issue, Kapetanios (2006) uses heuristic methods for the optimisation of information criteria. We apply similar heuristic algorithms to select the combination (subset) of regressors that minimises a given information criterion. We use the Akaike (1974) Information Criterion (AIC) and the Schwarz (1978) Bayesian Information Criterion (BIC); results using the Hannan and Quinn (1979) Information Criterion are omitted, as they are qualitatively similar to those for BIC. In terms of heuristic optimisation algorithms, we consider: (i) Simulated Annealing (SA), (ii) the Genetic Algorithm (GA), and (iii) MC3. Recently, Buchen and Wohlrabe (2011) evaluated the boosting method on the Stock and Watson (2006) dataset, providing evidence that alternatives of this kind deserve further exploration.

The above techniques have been analysed by Acosta-González and Fernández-Rodríguez (2007), Alcock and Burrage (2004), Baragona et al. (2004), Brooks et al. (2003), Jacobson and Yücesan (2004), Jerrell and Campione (2001), Maringer (2005) and Gilli et al. (2011), among others. However, their forecasting performance has not been extensively covered. We therefore perform a forecasting exercise comparing sequential testing and variable selection based on non-standard approaches to the optimisation of information criteria. We adopt the "direct" forecasting approach, which can be more robust in the presence of model misspecification (see Marcellino et al., 2006 for a detailed discussion), and rank the models and methods according to their Root Mean Square Forecast Error (RMSFE). An AR model, which typically produces good and robust forecasts for several macroeconomic variables, acts as the benchmark.

The forecasting comparison also includes variable reduction methods, which have often been used in the recent forecasting literature. Specifically, we evaluate: (i) Principal Components (PC), (ii) Partial Least Squares (PLS), and (iii) Bayesian Shrinkage Regression (BR). To choose the number of factors and the shrinkage parameter, we use a grid search that minimises the Root Mean Squared Forecast Error in a cross-validation set.

We focus on forecasting quarterly GDP growth and monthly inflation in the euro area, two key macroeconomic indicators, based on a large set of 195 monthly variables extracted from the Eurostat Principal European Economic Indicators (PEEIs) dataset. We also discuss how to handle the mixed quarterly/monthly frequency in the case of GDP forecasting. The forecasting exercise investigates: (i) which combination of heuristic method and information criterion is most accurate; (ii) the top performers within the variable reduction approaches; and (iii) the relative ranking of variable selection and variable reduction methods.
Overall, our findings indicate that variable selection based on heuristic optimisation of information criteria often outperforms the variable reduction methods (and the AR benchmark), matched only in some cases by Bayesian Shrinkage Regression using LASSO. As expected, AIC results in a larger set of predictors than BIC and, owing to the lack of parsimony, often deteriorates the forecasting performance. From an economic point of view, the selected regressors are also reasonable. Specifically, labour market variables (wages and salaries, the employment index and the unemployment rate) are most frequently selected by the heuristic approaches to forecast HICP inflation, together with its own lags. Interest rates, inflation, money supply and other monetary variables are used in most cases to forecast the GDP growth rate.

The rest of the paper is organised as follows. Section 2 briefly discusses the various methods and algorithms used in our empirical evaluation. Section 3 describes the forecast evaluation, the data, and how to cope with the data unbalancedness problem. Section 4 discusses the forecast results. Section 5 summarises our main findings and conclusions.

2. Variable selection and variable reduction

Consider the following regression model:

$$y_t = \alpha + \beta_0' x_t^0 + \epsilon_t, \qquad t = 1, \ldots, T, \tag{1}$$

where $x_t^0$ is a k-dimensional vector of stationary predetermined variables and the superscript 0 denotes the true regression model. Let the set of all available variables at time t be represented by the N-dimensional vector $x_t = (x_{1,t}, \ldots, x_{N,t})'$, with N much larger than k and the set of variables in $x_t^0$ contained in $x_t$. The aim of the analysis is to determine $x_t^0$ starting from $x_t$.

Formally, let $I = (I_1, \ldots, I_N)'$ denote a vector of zeros and ones, which we refer to as a string. Let $I^0$ be the string for which $I_i^0 = 1$ if $x_{i,t}$ is an element of $x_t^0$ and zero otherwise. We wish to determine $I^0$ starting from $I$. In small samples $I^0$ may not represent the best-fitting model for the data at hand, so selection cannot be based on goodness of fit alone. Information criteria should therefore be used to select the relevant variables in Eq. (1). The generic form of an information criterion is

$$IC(I) = -2L(I) + C_T(I), \tag{2}$$

where $L(I)$ is the log-likelihood of the model associated with string $I$ and $C_T(I)$ is the penalty term related to the same string. The two penalty terms we use are $2\tilde{m}(I)$ and $\ln(T)\tilde{m}(I)$, corresponding to AIC and BIC respectively, where $\tilde{m}(I)$ is the number of free parameters in the model resulting from string $I$.
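For concreteness, a minimal sketch of evaluating Eq. (2) for a candidate string follows. It assumes Gaussian (pseudo) maximum likelihood, so that fitting reduces to OLS with an intercept, and it counts the intercept plus the selected coefficients as $\tilde{m}(I)$; the function and its interface are illustrative rather than taken from the paper.

```python
import numpy as np

def information_criterion(y, X, string, criterion="BIC"):
    """Evaluate Eq. (2) for the model implied by a 0/1 inclusion string.

    y: (T,) target; X: (T, N) candidate regressors; string: (N,) zeros and ones.
    Gaussian likelihood, so the fit is plain OLS with an intercept.
    """
    y, X = np.asarray(y, float), np.asarray(X, float)
    keep = np.asarray(string).astype(bool)
    T = len(y)
    Z = np.column_stack([np.ones(T), X[:, keep]])           # intercept + selected regressors
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)            # OLS fit
    resid = y - Z @ beta
    sigma2 = resid @ resid / T                              # ML variance estimate
    loglik = -0.5 * T * (np.log(2 * np.pi * sigma2) + 1.0)  # Gaussian log-likelihood L(I)
    m = Z.shape[1]                                          # free parameters, m~(I)
    penalty = 2.0 * m if criterion == "AIC" else np.log(T) * m
    return -2.0 * loglik + penalty                          # Eq. (2)
```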
It is straightforward, under relatively weak conditions on $x_{j,t}$ and $\epsilon_{j,t}$ and using the results of, say, Sin and White (1996), to show that the string which minimises $IC(\cdot)$ converges to $I^0$ with probability approaching one as $T \to \infty$, as long as (i) $C_T(I) \to \infty$ and (ii) $C_T(I)/T \to 0$. More specifically, the main assumptions needed for the results of Sin and White (1996) to hold, assuming the models are estimated by Gaussian or pseudo maximum likelihood (which, in the simplest case of spherical errors, is equivalent to OLS), are: (i) measurability, continuity and twice differentiability of the log-likelihood function, together with a standard identifiability assumption; (ii) a uniform weak law of large numbers for the log-likelihood of each observation and its second derivative; and (iii) a central limit theorem for the first derivative of the log-likelihood of each observation. Conditions (ii) and (iii) can be obtained by assuming, e.g., that the $x_{j,t}$ are weakly dependent (say, near epoch dependent) processes and the $\epsilon_{j,t}$ are martingale difference processes. Hence, consistency of model selection, as long as the penalty-related conditions hold, is straightforwardly obtained in our context. In particular, BIC consistently estimates the true model in the sense of Sin and White (1996), but AIC is inconsistent, as its $C_T$ remains bounded as $T \to \infty$.

For small-dimensional $x_t$, evaluating the information criteria for all strings may be feasible, as, e.g., in AR lag order selection. In the case of lag selection the problem is made even easier by the existence of a natural ordering of the variables, but in more general cases (e.g., in regression models) such an ordering may not be available. Moreover, as soon as N exceeds, say, 10 or 15 units, evaluating all strings is not feasible: since $I$ is a binary sequence, there exist $2^N$ strings to be evaluated. For example, when N = 50, and optimistically assuming that 100,000 strings can be evaluated per second, we would still need about 357 years to evaluate all strings. A solution that overcomes this problem has recently been proposed by Gatu and Kontoghiorghes (2006). In addition, we face a discrete minimisation problem, so many standard minimisation algorithms cannot be applied.

To overcome these difficulties, we consider heuristic optimisation approaches: (i) simulated annealing (SA), (ii) the genetic algorithm (GA), and (iii) MC3. These approaches are employed here in their "standard" form; for detailed information see Brooks et al. (2003), Brüggemann et al. (2003), Gilli et al. (2011), Goffe et al. (1994), Hajek (1988), Hartl and Belew (1990), Kapetanios (2006), Krolzig and Hendry (2001), Maringer (2005), Morinaka et al. (2001) and Fernandez et al. (2001), among others. As mentioned in the Introduction, we also consider variable selection by means of sequential testing (ST); see Hendry (1995, 1997) and Hoover and Perez (1999), among others, for more information.
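To illustrate how such a heuristic explores the string space, the following is a minimal simulated-annealing sketch; the single-flip neighbourhood and geometric cooling schedule are assumptions of this sketch, not a description of the exact scheme of the paper (the tuning values actually used are those of Kapetanios (2006), reported in the Appendix).

```python
import numpy as np

def simulated_annealing(ic, N, n_iter=5000, temp=10.0, cooling=0.999, seed=0):
    """Minimise an information criterion over 0/1 strings of length N.

    ic: callable mapping a (N,) 0/1 integer array to the criterion value.
    """
    rng = np.random.default_rng(seed)
    current = rng.integers(0, 2, size=N)          # random initial string
    cur_val = ic(current)
    best, best_val = current.copy(), cur_val
    for _ in range(n_iter):
        cand = current.copy()
        cand[rng.integers(N)] ^= 1                # flip one inclusion indicator
        cand_val = ic(cand)
        delta = cand_val - cur_val
        # always accept improvements; accept deteriorations with prob. exp(-delta/temp)
        if delta < 0 or rng.random() < np.exp(-delta / temp):
            current, cur_val = cand, cand_val
            if cur_val < best_val:
                best, best_val = current.copy(), cur_val
        temp *= cooling                           # geometric cooling
    return best, best_val
```

Coupled with the sketch above, `ic = lambda s: information_criterion(y, X, s, "BIC")` reproduces the selection objective; GA and MC3 differ only in how candidate strings are proposed and accepted.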
In terms of variable reduction approaches, factor methods have been at the forefront of developments in forecasting with large datasets and in fact started this literature; see, e.g., the influential work of Stock and Watson (2002a). The assumption is that the co-movements across the N variables in $x_t$ can be captured by a small number, r, of unobserved factors, grouped in the vector $F_t = (F_{1,t}, \ldots, F_{r,t})'$. Formally,

$$\tilde{x}_t = \Lambda' F_t + e_t, \tag{3}$$

where $\tilde{x}_t$ may be equal to $x_t$ or may involve other variables such as, e.g., lags and leads of $x_t$, and $\Lambda$ is an $r \times N$ matrix of parameters describing how the individual indicator variables relate to each of the r factors, which we denote with the term 'loadings'. In Eq. (3), $e_t$ is a vector of zero-mean I(0) errors representing the idiosyncratic component of each variable. It is important that the number of factors is small, so that the potentially many explanatory variables in Eq. (1) can be replaced by the few factors $F_t$.

To extract (or estimate) the common factors we consider: (i) Principal Components (PC) and (ii) Partial Least Squares (PLS). A third alternative to reduce the dimensionality of $x_t$ is Bayesian Shrinkage Regression (BR), and we evaluate both Ridge and LASSO regressions. More information regarding the theoretical features of these methods and examples of their application can be found in De Mol et al. (2008), Forni et al. (2000, 2005), Helland (1988, 1990), Stock and Watson (2002a,b) and Wold (1982), among others.

To choose the number of factors for the PC and PLS methods, and the shrinkage parameter for BR, we employ a grid search cross-validation method. The function optimised in the cross-validation exercise is the Root Mean Squared Forecast Error as defined in Eq. (6) in the next section. We set up the cross-validation exercise in a similar fashion to Baillie et al. (2014). A maximum of 30 factors for PC and PLS, and 5N shrinkage parameter values (with step 0.1) for BR, are considered.

In all cases, we present results in terms of the RMSFE relative to an AR(1) benchmark, so that values smaller than one imply that a method beats the benchmark. The best method is the one with the smallest RMSFE value.
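The grid search just described can be sketched as follows for PC: factors are estimated by principal components on the standardised panel, and r is chosen to minimise the RMSFE of the direct h-step forecast over the last `train` observations. For brevity the factors are extracted once from the full panel, whereas the actual exercise would re-estimate them recursively; the function names and interface are illustrative.

```python
import numpy as np

def pc_factors(X, r):
    """First r principal-component factor estimates from the standardised panel X (T x N)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :r] * s[:r]                       # (T, r) factor estimates

def select_n_factors(y, X, h, train, max_r=30):
    """Grid search over r, minimising the RMSFE of direct h-step forecasts
    over the last `train` observations (the cross-validation window)."""
    T = len(y)
    best_r, best_rmsfe = 1, np.inf
    for r in range(1, max_r + 1):
        F = pc_factors(X, r)                      # re-estimated recursively in the real exercise
        errors = []
        for t in range(T - train, T):             # pseudo out-of-sample CV rounds
            Z = np.column_stack([np.ones(t - h), F[:t - h]])
            beta, *_ = np.linalg.lstsq(Z, y[h:t], rcond=None)
            errors.append(y[t] - np.r_[1.0, F[t - h]] @ beta)
        rmsfe = np.sqrt(np.mean(np.square(errors)))
        if rmsfe < best_rmsfe:
            best_r, best_rmsfe = r, rmsfe
    return best_r
```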
3. Forecasting exercise and data description

3.1. Structure of the forecasting exercise

Assume that $x_t^0$ has been obtained using one of the methods previously described. In the next step, we use $x_t^0$ in combination with the direct approach to predict future values of the variable of interest, $y_{t+h}$. The forecasts are given by

$$\hat{y}^f_{t+h} = \hat{\beta}_h' x_t^0, \tag{4}$$

where $\hat{\beta}_h$ is obtained by regressing $y_t$ on $x_{t-h}^0$ and h denotes the forecast horizon, with $h = 1, 2, \ldots, H$.
A summary of the (pseudo) out-of-sample forecasting algorithm follows.

1. For $h = 1, 2, \ldots, H$, use an initial sample of $T_1$ observations ($T_1 = T - Eval - h$), where $Eval$ is the size of the evaluation sample.
2. With any of the methods described in the previous section, obtain $x_{t'}^0$, $t' = 1, 2, \ldots, T_1$.
3. Regress $y_t$ on $x_{t-h}^0$ (for each h) and obtain $\hat{\beta}_h$.
4. Calculate the forecasts of $y_{t+h}$ (for each h) using $x_{t'}^0$ and $\hat{\beta}_h$, obtaining $\hat{y}^f_t = (\hat{y}^f_{t+1}, \ldots, \hat{y}^f_{t+H})$.
5. Repeat the whole procedure, increasing the initial sample from $T_1$ to $T_l = T_{l-1} + 1$ until $T_l = T - h$.

At the end of the iterations, we have gathered $Eval$ forecast values for each forecast horizon h. The forecast errors are then calculated as

$$e^f_{t+h} = y_{t+h} - \hat{y}^f_{t+h}, \tag{5}$$

and they can be used to compute the Root Mean Squared Forecast Error (RMSFE) for a given model, defined as

$$RMSFE_h = \sqrt{\frac{1}{Eval} \sum_{j=1}^{Eval} \left( e^f_{t+h,j} \right)^2 }. \tag{6}$$
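A compact sketch of steps 1–5 for one horizon follows. For brevity the selected predictor set x0 is held fixed across rounds, whereas step 2 of the algorithm re-selects it on each expanding sample; names are illustrative.

```python
import numpy as np

def rmsfe(y, x0, h, n_eval):
    """RMSFE of Eq. (6) for direct h-step forecasts, Eq. (4), over n_eval rounds.

    y: (T,) target; x0: (T, k) selected predictors x_t^0.
    """
    T = len(y)
    errors = []
    for t in range(T - n_eval, T):                         # expanding estimation window
        Z = np.column_stack([np.ones(t - h), x0[:t - h]])
        beta, *_ = np.linalg.lstsq(Z, y[h:t], rcond=None)  # regress y_t on x0_{t-h}
        errors.append(y[t] - np.r_[1.0, x0[t - h]] @ beta) # Eqs. (4)-(5)
    return np.sqrt(np.mean(np.square(errors)))             # Eq. (6)
```

The relative figures reported below are this quantity divided by the same quantity for the AR(1) benchmark, i.e. with x0 equal to the first lag of y.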
3.2. Data

Our dependent variables are the growth rates of:

• Monthly HICP (EA-16) (source: Eurostat),
• Quarterly GDP (EA-16), seasonally adjusted (source: ECB).

We calculate the growth rate using the log transformation:

$$g_t(y_t) = \ln\left( \frac{y_t}{y_{t-1}} \right). \tag{7}$$
The seasonal adjustment is made by the ECB and the series are provided already transformed. The sample spans 1996-Q2 to 2008-Q4 for the GDP series in levels and, consequently, 1996-Q3 to 2008-Q4 for its quarter-on-quarter growth rate. The HICP series spans January 1996 to February 2009 in levels and February 1996 to February 2009 for the month-on-month inflation rate.

We have 195 monthly predictors available (source: Eurostat, ECB), spanning January 1996 to February 2009. The dataset is the same as that used in Foroni and Marcellino (2014) and contains a large universe of variables that are potentially useful in forecasting key macroeconomic variables in the euro area (see Table 1 for the mnemonics). Furthermore, in the spirit of Stock and Watson (2002a), we have transformed the series for stationarity using first differences or log differences as appropriate (see Table 2 for the transformations, noting that some of the variables remain unchanged).

An obvious problem that arises when using quarterly regressands and monthly regressors is how to cope with the frequency mismatch. We assess three different approaches (a code sketch follows at the end of this subsection):

Tr1 Take the quarterly average of each monthly indicator.
Tr2 Take the observations in the last month of each quarter only.
Tr3 Split each variable into three indicators, containing observations for, respectively, the first, second and third months of each quarter. This approach is in the spirit of the U-MIDAS regressions introduced in Foroni et al. (2015). It is less restrictive than the other methods, but it leads to a further increase in the number of predictors, which becomes N = 3 × 195 = 585.

In the case of monthly HICP inflation, we start the recursive forecast exercise using 110 observations for the first estimation sample and 36 observations (3 years) for out-of-sample evaluation. We also repeat the exercise using a first in-sample size of 85 observations, which allows for a 60-observation (5-year) out-of-sample evaluation period. Regarding quarterly GDP growth, we start the forecasting algorithm using the first 33 observations and then perform the forecast exercise recursively over 12 evaluation periods (i.e., 3 years of out-of-sample data). We set h = 1, 2, ..., 12 for inflation and h = 1, 2, ..., 6 for GDP growth.

Finally, we point out that in our context one could also produce monthly updates of the quarterly GDP growth forecasts. This exercise is considered in Bulligan et al. (2014); their results are qualitatively similar to ours, confirming the relevance of variable selection.
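A sketch of the three alignments follows, assuming the monthly series starts in the first month of a quarter; the function name is illustrative.

```python
import numpy as np

def monthly_to_quarterly(x, how="Tr1"):
    """Align one monthly predictor with a quarterly regressand.

    Tr1: within-quarter average; Tr2: last month of each quarter;
    Tr3: three quarterly series (U-MIDAS style), one per month of the quarter.
    """
    m = np.asarray(x)[: 3 * (len(x) // 3)].reshape(-1, 3)  # one row per quarter
    if how == "Tr1":
        return m.mean(axis=1)
    if how == "Tr2":
        return m[:, 2]
    if how == "Tr3":
        return m          # columns = first, second, third month of each quarter
    raise ValueError(how)
```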
Table 1
Set of predictors (mnemonics). [Table flattened in extraction; it lists the 195 monthly predictors by number and Eurostat/ECB mnemonic: HICP components and exclusion-based aggregates (CP−HI00 to CP−HI12, CP−HIE, CP−HIF, CP−HI00XE, etc.); producer, wage, employment and import price indices by industry grouping (...−IS−PPI, ...−IS−WSI, ...−IS−EPI, ...−IS−IMPR); business and consumer survey balances (BS−...); unemployment rates and levels (RT−LM−UN−..., 1000−PERS−LM−UN−...); industrial production, orders, turnover and retail trade (IS−IP, IS−IO, ...−IS−ITD/ITND/ITT, IS−DIT, IS−CAR); money supply (M1, M2, M3); interest and exchange rates (3MI−RT, FIBOR3M/6M/1Y, EXA−RT−USD/JPY/GBP); bond yields (EMECB2Y–7Y, EMGBOND, LTGBY−RT); stock prices (DJES50I, BDSHRPRCF); and ECB balance sheet items (BDECBX..., BDEBDBSIA, BDWU...).]
Table 2
Transformations applied to predictors 1–195.

#          Transformation
1–52       First differences of logs
53–95      No change
96–98      First differences
99–101     No change
102–171    First differences of logs
172–176    First differences
177–178    First differences of logs
179–188    First differences
189–195    First differences of logs
3.3. Handling unbalanced datasets

Some of the 195 predictors we consider are not available for the full sample period: they either start at a later date, present some scattered missing observations, or are unavailable at the very end of the sample. The effect of missing observations is twofold.

First, we need to adapt the estimation methods. There are model-specific ways to handle missing observations: for example, if a factor model is used, the assumed factor structure can be exploited both for the estimation of the unobserved factors and for the interpolation of missing observations using the factor estimates. This can be implemented via an EM algorithm, as discussed in the literature. However, it would be useful to have a generic (non-model-specific) method. For this, we note that most estimation methods rely on sample moments of the form $\frac{1}{T}\sum_{t=1}^{T} f(x_t)$; for example, the (i, j)-th element of $X'X$ in the Bayesian regression is given by $\frac{1}{T}\sum_{t=1}^{T} x_{j,t} x_{i,t}$. Therefore, as a generic device for handling missing observations in parameter estimation, we suggest using $\frac{1}{T}\sum_{t=1, t \in I_x}^{T} f(x_t)$, where $I_x$ denotes the set of time indices for which an observation of $x_t$ exists. This enables the estimation of the moments needed for parameter estimation.

The second effect of missing observations arises when end-of-sample observations for a predictor $x_t$ are not available but are needed for forecasting. In this case, we suggest as a generic solution the use of simple AR models to forecast the missing value(s) of $x_t$. This approach works well in a related context in Kuzin et al. (2011) and Marcellino and Schumacher (2010). Both devices are sketched in code below.

Mixed sampling frequency can be considered a special case of missing observations, where for variables collected at the lower frequency some observations are systematically missing. This case was considered in the previous subsection.
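A minimal sketch of both devices, assuming missing values are coded as NaN; the cross-moment averages only over dates where both series are observed, and the ragged edge is extended by iterated AR(1) forecasts (the functions are illustrative and, for the AR fill, assume the gaps are at the end of the sample).

```python
import numpy as np

def pairwise_moment(xi, xj):
    """(i, j)-th element of X'X / T over the dates where both series are observed."""
    ok = ~np.isnan(xi) & ~np.isnan(xj)
    return (xi[ok] * xj[ok]).mean()

def ar1_fill_ragged_edge(x):
    """Extend a predictor whose end-of-sample values are missing with iterated AR(1) forecasts."""
    n_obs = np.max(np.nonzero(~np.isnan(x))[0]) + 1       # index after the last observed value
    v = x[:n_obs]                                         # assumed fully observed up to the edge
    Z = np.column_stack([np.ones(n_obs - 1), v[:-1]])
    (c, phi), *_ = np.linalg.lstsq(Z, v[1:], rcond=None)  # OLS estimate of the AR(1)
    out = x.copy()
    for t in range(n_obs, len(x)):                        # iterate the forecast over the edge
        out[t] = c + phi * out[t - 1]
    return out
```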
4. Discussion of empirical results

In this section we discuss the outcome of the forecasting exercise for monthly HICP inflation and quarterly GDP growth. All figures are in terms of the RMSFE of each method relative to the AR(1) benchmark; specifically, we report the ratio $RMSFE_{Alternative}/RMSFE_{Benchmark}$. The AR(1) benchmark serves as a common point of reference and allows comparisons across approaches. To keep the results readable, we report Sequential Testing at the 1% level of statistical significance (this reduces the number of variables to which ST is applied and improves performance by dropping insignificant regressors), and the heuristic approaches that use BIC in the selection process. Next, we take a closer look at the variable selection methods and examine how the optimal solutions change over 30 repetitions (reporting the case of HICP with 36 evaluation periods). Finally, we discuss the variables selected by the most successful heuristic approaches and by the Bayesian Shrinkage Regression using LASSO, which sheds some extra light on the economic meaning of the "best" predictors.

In all cases, we have included the first lag of the dependent variable in the set of predictors. In the grid search cross-validation used in the variable reduction methods, we use a training set of 36 observations for HICP inflation and 12 observations for GDP growth. In other words, we find the factor count or shrinkage parameter value that provided the smallest RMSFE over the past 3 years and use it for the out-of-sample forecasting exercise.
Table 3
Forecasting HICP: RMSFE relative to the AR(1) benchmark.

                 Eval = 36                                  Eval = 60
Method           h=1    h=3    h=6    h=12   Avg.   SD      h=1    h=3    h=6    h=12   Avg.   SD
ST1              0.860  0.797  0.971  1.136  0.941  0.149   0.827  0.837  0.961  1.303  0.982  0.223
SA-BIC           0.890  0.867  0.996  1.240  0.998  0.171   0.882  0.873  1.173  1.335  1.066  0.227
GA-BIC           0.899  0.861  0.999  1.335  1.024  0.216   0.916  0.855  1.177  1.582  1.132  0.330
MC3-BIC          0.862  0.837  1.018  1.288  1.001  0.207   0.950  0.901  1.054  1.260  1.041  0.159
PC-CrV           0.979  0.957  1.150  1.408  1.124  0.208   0.973  0.975  1.151  1.519  1.155  0.257
PLS-CrV          0.975  0.955  1.134  1.401  1.116  0.206   0.979  0.962  1.120  1.478  1.135  0.239
BR-CrV           1.071  1.009  1.096  1.354  1.132  0.152   1.104  1.039  1.082  1.553  1.195  0.240
BR-LASSO-CrV     0.894  0.829  0.943  1.202  0.967  0.164   0.891  0.861  0.915  1.184  0.963  0.149
We use the following notation: Sequential Testing at the 1% significance level (ST1), Simulated Annealing with BIC (SA-BIC), Genetic Algorithm with BIC (GA-BIC), MC3 with BIC (MC3-BIC), Principal Components with the number of factors selected by grid search cross-validation (PC-CrV), Partial Least Squares with factors selected by grid search cross-validation (PLS-CrV), Bayesian Shrinkage Regression using Ridge regressions with the shrinkage parameter selected by grid search cross-validation (BR-CrV), and Bayesian Shrinkage Regression using LASSO regressions with the shrinkage parameter selected by grid search cross-validation (BR-LASSO-CrV).

4.1. Forecasting HICP inflation

The results for HICP inflation forecasting are reported in Table 3 for h = 1, 3, 6, 12 steps ahead. The table is divided into two panels: the left panel reports the relative RMSFE of each method in a forecast exercise with 36 evaluation rounds, whereas the right panel reports the same results for an exercise with 60 evaluation rounds.

Starting with the 36-evaluation-period exercise, for h = 1, ST1 and MC3-BIC are the best methods, with relative RMSFEs of 0.860 and 0.862, respectively. The other heuristic approaches also beat the AR(1) benchmark and the variable reduction methods, among which only BR-LASSO-CrV, with a relative RMSFE of 0.894, is competitive. This is also evident in Fig. 1, which plots the forecasts of all methods (including the AR(1) benchmark) against the actual values, after translating the growth forecasts into levels to make the comparison more visible.

Next, for h = 3 the best two methods are ST1 and BR-LASSO-CrV, with relative RMSFEs of 0.797 and 0.829 respectively. SA-BIC, GA-BIC and MC3-BIC still perform better than the benchmark and the variable reduction methods of Principal Components and Partial Least Squares. The same qualitative findings hold for h = 6, where the top performers are again ST1 and BR-LASSO-CrV. However, at the longer horizon h = 12, none of the methods outperforms the simple AR(1).

For the 60-evaluation-period exercise, ST1 is again the top performer for h = 1, 3, 6. The second best method is SA-BIC for h = 1, GA-BIC for h = 3 and BR-LASSO-CrV for h = 6.

4.2. Convergence of heuristics

Another issue which often arises when using heuristic optimisation is convergence to the optimal solution. To investigate this, Fig. 2 reports boxplots of the optimal fitness values for BIC over 30 repetitions across the 36 out-of-sample periods. All three heuristic approaches follow the same uptrend as we move across time. Specifically, after the fifth out-of-sample period SA-BIC provides solutions with tighter convergence and a small number of extreme values. GA-BIC is also accurate in terms of convergence, although its boxes are slightly larger. Finally, MC3-BIC produces more extreme values towards the end of the out-of-sample evaluation period. In general, the heuristic approaches, as employed here and for the purposes of the forecasting exercise, do not suffer from convergence problems.

4.3. Forecasting the GDP growth rate

Before discussing the GDP results in detail, it is worth mentioning that the available data are very limited, allowing for an initial in-sample size of just 33 observations. This is not ideal, but we can still draw some useful conclusions.
Table 4 is divided into three panels: the top panel reports results using the quarterly-average transformation (Tr1), the middle panel the last-month-of-quarter transformation (Tr2), and the bottom panel the U-MIDAS approach in which each monthly predictor is split into three variables containing the first, second and third months of each quarter (Tr3).

The first important fact emerging from Table 4 is the superior performance of BR-LASSO-CrV across all three transformations for h = 1, h = 2 and h = 4. It achieves an average relative RMSFE of 0.667, 0.704 and 0.669 for Tr1, Tr2 and Tr3, respectively.
Fig. 1. Forecasting HICP, h = 1.
We also observe a common pattern in the ranking of the methods: BR-LASSO-CrV is better for short-term forecasting (up to 4 quarters), while PC-CrV and PLS-CrV are better for longer-term forecasts (h = 6).

In detail, using Tr1 we see that ST1 outperforms the other methods (apart from BR-LASSO-CrV) for h = 1. For h = 2, BR-LASSO-CrV and BR-CrV are the two best-performing methods, followed by ST1 and GA-BIC. For h = 4 we again have BR-LASSO-CrV and BR-CrV as the top performers, with relative RMSFEs of 0.585 and 0.780 respectively. In the last case of h = 6 quarters ahead, PC-CrV and PLS-CrV provide better forecasts, with relative RMSFEs of 0.988 and 0.937 respectively. These results are illustrated in Fig. 3.

Using Tr2 we again see the superiority of BR-LASSO-CrV, with an average relative RMSFE of 0.704. The next best method across all forecast horizons is MC3-BIC, with an average of 0.860. Similarly, for Tr3, BR-LASSO-CrV and BR-CrV are followed by MC3-BIC, with averages of 0.669, 0.675 and 0.999 respectively.

4.4. Selected variables

Having established the usefulness of the heuristic variable selection approaches and of the Bayesian Shrinkage Regression using LASSO, we now examine the underlying selected variables. Table 5 reports the top twenty variables selected by the three heuristics and by BR-LASSO-CrV across the evaluation periods. The table is divided into four panels: (i) the top left panel concerns HICP inflation forecasting with 36 evaluation periods, (ii) the top right panel GDP growth forecasting using Tr1, (iii) the bottom left panel GDP growth forecasting using Tr2, and (iv) the bottom right panel GDP growth forecasting using Tr3. In each panel we present the variable categories, the variable numbers, the number of times each variable is selected by the different methods and, finally, the total number of selections.

4.4.1. HICP inflation

Starting with the HICP, we see in the top left panel of Table 5 that the most frequently selected variables fall into the following categories: (i) inflation-related variables; (ii) labour variables, including employment, unemployment, employment and unemployment expectations, and wages and salaries; (iii) building activity variables; and (iv) financial variables, including total liabilities and non-equity holdings.
Fig. 2. Forecasting HICP. Convergence of heuristic algorithms.
These variables are not selected in the same manner by each method. Specifically, ST1 selects the HICP-All Items variable in 33 of the 36 out-of-sample evaluation rounds (92% of the time), with employment, unemployment, wages and salaries, and building activity selected between 33% and 53% of the time. GA-BIC focuses more on the top variables in two categories: HICP-All Items (50% of the time) and the number of persons employed (33% of the time). MC3-BIC follows a pattern very similar to ST1, selecting the HICP-All Items variable 83% of the time and wages and salaries 44% of the time.
Table 4
Forecasting GDP: RMSFE relative to the AR(1) benchmark.

Tr1
Method           h=1    h=2    h=4    h=6    Avg.   SD
ST1              0.791  0.768  0.991  0.995  0.886  0.124
SA-BIC           3.362  2.843  1.015  1.560  2.195  1.092
GA-BIC           1.191  0.872  0.946  1.375  1.096  0.231
MC3-BIC          0.976  0.931  1.080  1.163  1.038  0.104
PC-CrV           1.506  1.345  1.075  0.988  1.228  0.240
PLS-CrV          1.675  1.426  1.065  0.937  1.276  0.337
BR-CrV           0.963  0.756  0.780  1.334  0.958  0.267
BR-LASSO-CrV     0.535  0.530  0.585  1.018  0.667  0.235

Tr2
Method           h=1    h=2    h=4    h=6    Avg.   SD
ST1              1.001  0.648  0.953  2.593  1.299  0.877
SA-BIC           0.936  0.944  1.109  1.443  1.108  0.237
GA-BIC           0.794  0.893  2.993  1.080  1.440  1.042
MC3-BIC          0.857  0.830  0.735  1.018  0.860  0.117
PC-CrV           1.484  1.314  1.060  0.987  1.211  0.230
PLS-CrV          1.612  1.421  1.067  0.922  1.256  0.317
BR-CrV           1.151  0.843  0.625  1.157  0.944  0.258
BR-LASSO-CrV     0.631  0.570  0.601  1.012  0.704  0.207

Tr3
Method           h=1    h=2    h=4    h=6    Avg.   SD
ST1              0.977  0.802  0.900  1.452  1.033  0.288
SA-BIC           2.091  2.056  1.251  1.089  1.622  0.526
GA-BIC           0.802  1.399  1.155  1.061  1.104  0.247
MC3-BIC          0.922  1.073  0.733  1.268  0.999  0.227
PC-CrV           1.689  1.416  1.055  0.977  1.284  0.331
PLS-CrV          1.505  1.371  1.042  0.952  1.217  0.263
BR-CrV           0.707  0.557  0.568  0.869  0.675  0.146
BR-LASSO-CrV     0.565  0.538  0.578  0.997  0.669  0.219
On the other hand, BR-LASSO-CrV tends to select financial variables most of the time: total liabilities is selected 100% of the time and non-equity holdings 86% of the time, followed by inflation variables. In general, all methods (on average) use inflation-related variables and labour market variables as their main forecasting instruments. This selection highlights the forecasting power of (i) the "autoregressive" nature of the series and (ii) a potential Phillips curve relationship.

4.4.2. GDP growth rate

Moving to GDP growth forecasting, the results for the three transformations are shown in the top right, bottom left and bottom right panels of Table 5. The top selected variables (across all transformations and methods) now fall into the categories of: (i) inflation-related variables; (ii) financial market variables, including interest rates, exchange rates, debt securities, deposits and stock price indexes; (iii) production variables; (iv) money supply; and (v) unemployment variables.

These findings confirm the well-known ability of the yield curve to predict recessions, along with the impact of labour market variables on GDP. Another notable fact is the influence of the German economy on the euro area's GDP: using the Tr3 transformation, the DAX index is selected 100% of the time (12 out of 12 evaluation rounds) by the BR-LASSO-CrV approach.

5. Summary and conclusions

The issue of forecasting key macroeconomic variables is approached using indicators from a large unbalanced dataset. The predictors are selected with heuristic optimisation techniques or summarised with reduction methods, such as factor analysis, in order to reduce the computational burden. Overall, our work indicates that these methods are worth considering, as some of the results are promising.

We focus on forecasting euro area HICP inflation and the GDP growth rate, finding that sequential testing at the 1% significance level and MC3 using BIC provide more accurate forecasts than the remaining heuristic methods. The Bayesian Shrinkage Regression using LASSO is the best method in the GDP growth forecasting exercise. As discussed, it is also noteworthy that the variable selection methods and the Bayesian Shrinkage Regression using LASSO tend to choose variables with strong economic links to the dependent variables, which explains their successful performance.
Table 5
Most frequently selected variables. [Table flattened in extraction; for each of the four exercises (HICP with Eval = 36; GDP Tr1, Tr2 and Tr3, each with Eval = 12) it lists the twenty most frequently selected variables with their category, variable number, the number of times each was selected by SA-BIC, GA-BIC, MC3-BIC and BR-LASSO-CrV, and the total number of selections. For HICP, the leading entries include HICP-All Items, persons employed, gross wages and salaries, unemployment expectations, building activity, total liabilities and non-equity holdings; for GDP, the leading entries include HICP components, 2-, 3- and 5-year bond yields, 3-month interest rates, bank bond yields, M1–M3, new orders, debt securities issued, euro area deposits, the EUR/JPY exchange rate, unemployment measures and, under Tr3, the DAX index.]
Fig. 3. Forecasting GDP using Tr1, h = 1.
Acknowledgements

The authors would like to thank the Editor, the Associate Editor, three anonymous referees and the participants of (i) the CREATES Seminar Series presentation and (ii) the Department of Economics, University of Ioannina Seminar Series presentation for their helpful comments, which improved the quality of this paper. Any remaining errors are our own. Additional results are available upon request.

Appendix. Parameters setup and normalisation

For the simulated annealing and genetic algorithms we use the same (default) values as in Kapetanios (2006): in the simulated annealing, h = 1, Bv = 500, Bs = 5000, T0 = 10; in the genetic algorithm, m = 200, Bg = 200, pc = 0.6 and pm = 0.1. We allow the maximum counter of convergence iterations to be 10 and 500 times, respectively. In all heuristics we have used the data as is; however, in PC and PLS we have normalised the regressors to zero-mean, unit-variance series. For the Bayesian Shrinkage Regression with LASSO we use the iterative Landweber algorithm with the same parameter values as in De Mol et al. (2008).
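For reference, a minimal sketch of the iterated Landweber scheme with soft thresholding, in the spirit of De Mol et al. (2008), is given below; the step size and the scaling of the threshold relative to the penalty are assumptions of this sketch.

```python
import numpy as np

def lasso_landweber(X, y, lam, n_iter=1000):
    """LASSO coefficients via iterated Landweber steps with soft thresholding:
    beta <- S(beta + tau * X'(y - X beta); tau * lam).
    """
    tau = 1.0 / (np.linalg.norm(X, 2) ** 2)   # tau < 1/||X||^2 ensures convergence
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = beta + tau * X.T @ (y - X @ beta)                        # Landweber (gradient) step
        beta = np.sign(g) * np.maximum(np.abs(g) - tau * lam, 0.0)   # soft threshold
    return beta
```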
References

Acosta-González, E., Fernández-Rodríguez, F., 2007. Model selection via genetic algorithms illustrated with cross-country growth data. Empir. Econom. 33, 313–337.
Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Automat. Control 19, 716–723.
Alcock, J., Burrage, K., 2004. A genetic estimation algorithm for parameters of stochastic ordinary differential equations. Comput. Statist. Data Anal. 42 (2), 255–275.
Baillie, R.T., Kapetanios, G., Papailias, F., 2014. Bandwidth selection by cross-validation for forecasting long memory financial time series. J. Empir. Financ. 29, 129–143.
Baragona, R., Battaglia, F., Cucina, D., 2004. Fitting piecewise linear threshold autoregressive models by means of genetic algorithms. Comput. Statist. Data Anal. 47, 277–295.
Brooks, S.P., Friel, N., King, R., 2003. Classical model selection via simulated annealing. J. Roy. Statist. Soc. Ser. B 65, 503–520.
Brüggemann, R., Krolzig, H.M., Lütkepohl, H., 2003. Comparison of Model Reduction Methods for VAR Processes. Technical Report 2003-W13. Nuffield College, University of Oxford.
Buchen, T., Wohlrabe, K., 2011. Forecasting with many predictors: is boosting a viable alternative? Econom. Lett. 113, 16–18.
Bulligan, G., Marcellino, M., Venditti, F., 2014. Forecasting economic activity with targeted predictors. Int. J. Forecast. 31, 188–206.
De Mol, C., Giannone, D., Reichlin, L., 2008. Forecasting using a large number of predictors: is Bayesian shrinkage a valid alternative to principal components? J. Econometrics 146, 318–328.
Fernandez, C., Ley, E., Steel, M.F.J., 2001. Benchmark priors for Bayesian model averaging. J. Econometrics 100, 381–427.
Forni, M., Hallin, M., Lippi, M., Reichlin, L., 2000. The generalized dynamic-factor model: identification and estimation. Rev. Econ. Stat. 82, 540–554.
Forni, M., Hallin, M., Lippi, M., Reichlin, L., 2005. The generalized dynamic factor model: one-sided estimation and forecasting. J. Amer. Statist. Assoc. 100 (471), 830–840.
Foroni, C., Marcellino, M., 2014. A comparison of mixed frequency approaches for modelling euro area macroeconomic variables. Int. J. Forecast. 30, 554–568.
Foroni, C., Marcellino, M., Schumacher, C., 2015. Unrestricted mixed data sampling (MIDAS): MIDAS regressions with unrestricted lag polynomials. J. Roy. Statist. Soc. Ser. A 178, 57–82.
Gatu, C., Kontoghiorghes, E.J., 2006. Branch-and-bound algorithms for computing the best-subset regression models. J. Comput. Graph. Statist. 15, 139–156.
Gilli, M., Maringer, D., Schumann, E., 2011. Numerical Methods and Optimization in Finance. Academic Press.
Goffe, W.L., Ferrier, G.D., Rogers, J., 1994. Global optimization of statistical functions with simulated annealing. J. Econometrics 60 (1), 65–99.
Hajek, B., 1988. Cooling schedules for optimal annealing. Math. Oper. Res. 13 (2), 311–331.
Hannan, E.J., Quinn, B.G., 1979. The determination of the order of an autoregression. J. Roy. Statist. Soc. Ser. B 41, 190–195.
Hartl, R.F., Belew, R.K., 1990. A Global Convergence Proof for a Class of Genetic Algorithms. Technical Report. Technical University of Vienna.
Helland, I.S., 1988. On the structure of partial least squares regression. Comm. Statist. Simulation Comput. 17, 581–607.
Helland, I.S., 1990. Partial least squares regression and statistical models. Scand. J. Statist. 17, 97–114.
Hendry, D.F., 1995. Dynamic Econometrics. Oxford University Press.
Hendry, D.F., 1997. On congruent econometric relations: a comment. Carnegie-Rochester Conference Series on Public Policy 47, 163–190.
Hoover, K.D., Perez, S.J., 1999. Data mining reconsidered: encompassing and the general-to-specific approach to specification search. Econom. J. 2, 167–191.
Jacobson, S.H., Yücesan, E., 2004. Global optimization performance measures for generalized hill climbing algorithms. J. Global Optim. 29, 173–190.
Jerrell, M.E., Campione, W.A., 2001. Global optimization of econometric functions. J. Global Optim. 20 (3–4), 273–295.
Kapetanios, G., 2006. Variable selection in regression models using non-standard optimisation of information criteria. Comput. Statist. Data Anal. 52 (1), 4–15.
Krolzig, H.M., Hendry, D.F., 2001. Computer automation of general-to-specific model selection procedures. J. Econ. Dynam. Control 25 (6–7), 831–886.
Kuzin, V., Marcellino, M., Schumacher, C., 2011. MIDAS vs. mixed-frequency VAR: nowcasting GDP in the euro area. Int. J. Forecast. 27, 529–542.
Marcellino, M., Schumacher, C., 2010. Factor-MIDAS for now- and forecasting with ragged-edge data: a model comparison for German GDP. Oxf. Bull. Econ. Stat. 72, 518–550.
Marcellino, M., Stock, J., Watson, M., 2006. A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. J. Econometrics 135, 499–526.
Maringer, D., 2005. Portfolio Management with Heuristic Optimization. Springer, Dordrecht.
Morinaka, Y., Yoshikawa, M., Amagasa, T., 2001. The L-index: an indexing structure for efficient subsequence matching in time sequence databases. In: Proceedings of the Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining.
Schwarz, G., 1978. Estimating the dimension of a model. Ann. Statist. 6, 461–464.
Sin, C.Y., White, H., 1996. Information criteria for selecting possibly misspecified parametric models. J. Econometrics 71 (1–2), 207–225.
Stock, J., Watson, M., 2002a. Forecasting using principal components from a large number of predictors. J. Amer. Statist. Assoc. 97, 1167–1179.
Stock, J., Watson, M., 2002b. Macroeconomic forecasting using diffusion indexes. J. Bus. Econom. Statist. 20, 147–162.
Stock, J., Watson, M., 2006. Forecasting with many predictors. In: Elliott, G., Granger, C.W.J., Timmermann, A. (Eds.), Handbook of Economic Forecasting. Elsevier, Amsterdam, pp. 515–554.
Wold, H., 1982. Soft modeling: the basic design and some extensions. In: Jöreskog, K.G., Wold, H. (Eds.), Systems Under Indirect Observation, Part 2. North-Holland, Amsterdam, pp. 1–54.