Journal of Econometrics 123 (2004) 67 – 87
www.elsevier.com/locate/econbase
Bootstrapping the HEGY seasonal unit root tests
Peter Burridge a,*, A.M. Robert Taylor b
a Department of Economics, City University, Northampton Square, London EC1V 0HB, UK
b Department of Economics, University of Birmingham, Birmingham B15 2TT, UK
Accepted 22 October 2003
Abstract

This paper proposes bootstrap versions of the seasonal unit root tests of, inter alia, Hylleberg, Engle, Granger and Yoo (J. Econometrics 44 (1990) 215–238) (HEGY). We report a simulation study of the properties of both the conventional and bootstrapped seasonal unit root tests when applied to series having higher-order serial correlation and/or periodic heteroscedasticity, both of which are known to severely distort the significance level of the conventional tests. Our results demonstrate that the bootstrap provides good approximations to the statistics' null distributions. Moreover, the bootstrap corrects the adverse effects of data-dependent lag selection seen in the conventional augmented HEGY tests. The bootstrapped tests have comparable power to (infeasible) exactly significance-level-corrected lag-augmented HEGY tests, and their use is recommended. © 2003 Elsevier B.V. All rights reserved.

JEL classification: C12; C15; C22; C52

Keywords: Seasonal unit roots; Bootstrap tests; Higher-order serial correlation; Periodic heteroscedasticity; Data-based lag selection
1. Introduction

Seasonally observed economic time series are now routinely tested for unit autoregressive roots at both zero and seasonal frequencies. While many such series may be driven by serially correlated periodically heteroscedastic shocks, most of the literature devoted to the properties of the tests, or to the invention of new variants, pays rather little attention to this feature. It has been widely assumed that higher-order serial correlation is relatively innocuous, with "correction" along the lines of the Augmented Dickey–Fuller test being the standard response, while apart from the study of periodic
* Corresponding author. Tel.: +44-20-7040-8919; fax: +44-20-7040-8580. E-mail address: [email protected] (P. Burridge).
0304-4076/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2003.10.029
P. Burridge, A.M. Robert Taylor / Journal of Econometrics 123 (2004) 67 – 87
seasonal models by Ghysels et al. (1996), periodic heteroscedasticity (PH) has most often been ignored altogether. However, two recent papers by Burridge and Taylor (2001a, b), BTa and BTb henceforth, show that in the case of the tests of Hylleberg et al. (1990), HEGY, lag augmentation of the test equation is at best a partial solution to the serial correlation (SC) problem, while certain of the HEGY tests are also sensitive to PH. Users of the tests may thus have reservations about their robustness which have not been fully addressed. This study tackles the issue head-on; we demonstrate that the effects on the sampling distributions of the HEGY statistics induced by shocks which are serially correlated, periodically heteroscedastic, and possibly asymmetric, can be successfully accommodated by the use of a bootstrap. Such effects can be uncomfortably large, as we also show. Moreover, we show that our bootstrap method largely stops the inflation of test significance levels above their nominal levels caused by the use of data-dependent lag selection procedures in small samples, discussed in the context of the conventional augmented HEGY tests by Taylor (1997).

The use of the bootstrap in the unit root context is not new, and its properties are gradually coming to be better understood. Applications include the bootstrapped Dickey–Fuller tests investigated by Nankervis and Savin (1996), Rayner (1990), Ferretti and Romo (1996), and van Giersbergen (1998); the latter also mentions the possibility of bootstrapping the HEGY tests, but reports no details. Psaradakis (2000) shows that the bootstrap may be used to approximate the sampling distribution of the seasonal unit root test of Dickey et al. (1984), at least in the case when the driving shocks are serially uncorrelated and homoscedastic.
In the cointegration setting, use of the bootstrap has also been proposed, but problems arising from the inconsistency of spurious regressions in the non-stationary case explored recently by Phillips (2001) mean that it must be applied with great care. In the present context, no such problems arise, as the test statistics we study are calculated from regressions in which parameter estimates are consistent (though at different rates) under both null and alternatives of interest.

In the non-seasonal setting, and with lag length known a priori, Ferretti and Romo (1996) prove the asymptotic validity of a bootstrap which imposes the unit root but uses estimated stationary higher (fixed) order dynamics to recolour the shocks in the resampling scheme. In their case, the test statistic being bootstrapped is calculated from an unaugmented regression. A similar approach is taken by Park (2002), who allows the shocks to follow a general linear process. Our approach differs from these authors' in that we incorporate data-based lag selection into the equation used to calculate the seasonal unit root test statistics, as is currently near-universal empirical practice,$^{1}$ and we explicitly allow for PH. The data-based lag selection algorithm we employ is that outlined for the HEGY tests in Beaulieu and Miron (1993, pp. 318–319).

BTa and BTb derive representations for the limiting null distributions of the HEGY test statistics when the driving shocks follow autoregressions of known finite order, say p, with (potentially) periodically heteroscedastic shocks. They derive these representations for the case where the HEGY test regression is augmented with a fixed number of lags,
1 Presumably, following the example set in the empirical application in the original HEGY paper.
this being at least as large as p. However, so far as we are aware, the limiting null distributions of the HEGY-type statistics implemented with data-based lag selection have yet to be established in the literature. We can therefore only conjecture that the limiting null distributions of our bootstrapped statistics will coincide with these, though for finite samples, the numerical evidence we present is very favourable.

Our null model is a (possibly) seasonally periodically heteroscedastic non-stationary autoregression with (possibly) serially correlated and (possibly) asymmetric shocks. We handle nuisance parameters in the dynamics, that is, higher-order SC, by fitting an autoregression using the aforementioned data-based lag selection procedure, while any seasonal heteroscedasticity and skewness that may be present are captured in the residuals, which are then re-sampled separately for each season, and re-coloured in the bootstrap using the estimated dynamic nuisance parameters. Within the bootstrap, the seasonal unit roots are imposed, thus avoiding the difficulties with the use of estimated unit roots discussed by Basawa et al. (1991). Furthermore, in applications, the bootstrap delivers estimated tail probabilities, which are the quantities required for inference, and so the unreliability of tabulated critical values, highlighted by Horowitz and Savin (2000), is not an issue provided we can show that the tail probabilities delivered are accurate.

BTa and BTb investigate the regression-based (augmented) HEGY tests as developed further by Beaulieu and Miron (1993), Ghysels et al. (1994), Smith and Taylor (1998, 1999a, b) and Taylor (1998). The effects of PH and SC are shown to be most severe in the sampling distributions of the statistics for testing unit roots at harmonic seasonal frequencies, with the joint F-statistic being shown to be more robust than its component t-statistics (defined below) in both cases.
However, even the joint F-tests, as usually implemented, can be very seriously too liberal as we shall demonstrate. The now common practice of basing inference on the joint F-tests is undoubtedly wise, subject to the proviso that if any significant PH is present, then in the conventional approach the critical values used should be modified to allow for this. Even more serious empirical significance level distortions are seen for the joint frequency F-tests of Ghysels et al. (1994) and Taylor (1998). Allowing the shocks to be asymmetric could further disturb the statistics' finite-sample distributions (we have not found evidence of this in our experiments, but the possibility remains), and so caution suggests there is a case for making use of information about asymmetry when conducting the tests. Finally, as shown by Taylor (1997), the use of data-based lag selection makes the conventional approach very liberal in certain cases, and so a method which automatically corrects for this effect is highly desirable.

Ignoring the lag selection issue for the moment, there are essentially three ways in which non-IID shocks might be accommodated. Firstly, tables of critical values of the statistics affected by PH, SC and asymmetry could be produced, with their use being controlled by indicators calculated from sample information. Secondly, a more time-consuming approach could be used, in which, say, the SC and PH patterns are estimated and then used to set the parameters of a Monte Carlo simulation to approximate the sampling distribution of the affected test statistics. Finally, we can perform a full bootstrap, in which the empirical distribution of the seasonal residuals from the test equation is re-sampled to approximate the null distributions of the test statistics.
The first approach has a major drawback, even when PH alone is present; in BTa it was found that the critical values of all but two of the HEGY statistics depend on the pattern of PH, so any such tables would have to be very extensive (see BTa for details and related literature). This problem would certainly be worse if SC were also present. The second approach was advocated to deal with PH in BTa, and following a referee's suggestion we extended it to allow for SC, and compared it to the third approach. Once the lag-selection algorithm is incorporated, it differs from the third method only in the way in which the shocks are drawn, and we found the results (see Section 4) were unaffected by this choice.

Of course, the bootstrap approach can only be used successfully if the tail probabilities of the bootstrap null distributions are good approximations to those of the true sampling distributions of the test statistics at practically relevant sample sizes. Our numerical results show that this is broadly the case, and that such tests have very satisfactory power, relative to tests using nominal critical values, under conditions in which the latter are correct, but also more generally. In some cases we are able to achieve a dramatic improvement over the conventional approach, and so we now favour use of the bootstrap to implement the HEGY tests.

The rest of the paper is organised as follows. Section 2 describes briefly the regression-based approach to seasonal unit root testing, and defines the various test statistics. Section 3 describes our bootstrap methodology, while Section 4 briefly describes the effects of SC and PH on conventionally calculated test statistics as discussed in BTa and BTb, and shows how the bootstrap tests ameliorate these problems much more successfully than the conventional approach. Section 5 concludes.

2. Seasonal unit root tests

Using the set-up of BTa and BTb, we consider a quarterly series which can be written as the sum of a deterministic component, $d_{4t+s}$, and a purely autoregressive (AR) process; that is, $x_{4t+s} = y_{4t+s} + d_{4t+s}$, where

$$a(L)y_{4t+s} = v_{4t+s}, \qquad s = -3,\dots,0, \quad t = 1,2,\dots,T,$$
$$d_{4t+s} = \gamma_s + \delta_s(4t+s), \qquad \phi(L)v_{4t+s} = u_{4t+s}, \tag{2.1}$$

where $a(L) = 1 - \sum_{j=1}^{4}\alpha_j L^j$ is a fourth order polynomial in the usual lag operator, L. This design allows for periodic (seasonal) intercepts and time trends through $\gamma_s$ and $\delta_s$, respectively. The shocks $\{v_{4t+s}\}$ are an AR(m) process, in which the m roots of $\phi(z) = 0$ all lie outside the unit circle. Consequently, $\{y_{4t+s}\}$ itself is an AR(m + 4). We allow the shocks $\{u_{4t+s}\}$ to be periodically heteroscedastic, and to have an asymmetric distribution; that is, we define the annualised vector innovation process $u_t = (u_{4t-3}, u_{4t-2}, u_{4t-1}, u_{4t})'$ and assume that $u_t \sim IID(0, \Sigma)$ with $\Sigma = \mathrm{diag}(\sigma_{-3}^2,\dots,\sigma_0^2)$, and with finite fourth moments. The innovations are otherwise unrestricted. In our experiments, we used a centred $\chi^2(1)$ distribution to investigate the effects of asymmetry.
At least for this (heavily skewed) distribution, these turned out to be negligible and are therefore not reported. The shocks, $\{u_{4t+s}\}$, and hence $\{x_{4t+s}\}$, display PH unless $\sigma_s \equiv \sigma$, for all s. We are concerned with regression-based tests for seasonal unit roots in the AR(4) lag polynomial a(L); that is, the null hypothesis of interest is

$$H_0: a(L) = \Delta_4 \equiv 1 - L^4. \tag{2.2}$$

In order to derive the HEGY tests, the polynomial a(L) is factorised at the seasonal frequencies $\omega_k \equiv 2\pi k/4$, k = 0, 1, 2, and expanded around the seasonal unit roots $\exp(\pm i2\pi k/4)$, k = 0, 1, 2, to obtain the auxiliary regression equation

$$\Delta_4 x_{4t+s} = \gamma_s^* + \delta_s^*(4t+s) + \pi_1 x_{1,4t+s-1} + \pi_2 x_{2,4t+s-1} + \pi_3 x_{3,4t+s-1} + \pi_4 x_{4,4t+s-1} + \sum_{j=1}^{m}\phi_j \Delta_4 x_{4t+s-j} + u_{4t+s}, \tag{2.3}$$

which may be estimated along 4t + s = m + 5, ..., 4T. The inclusion of seasonal level and trend dummies in (2.3), whose parameters $\gamma_s^*$ and $\delta_s^*$, respectively, are linear mappings of $\gamma_s$ and $\delta_s$ of (2.1), s = −3, ..., 0, ensures that the sampling distributions of the estimated coefficients on the transformed level variables, $x_{j,4t+s}$, j = 1, ..., 4, and their associated t- and F-statistics, are unaffected by the $\gamma \equiv (\gamma_{-3},\dots,\gamma_0)'$ and $\delta \equiv (\delta_{-3},\dots,\delta_0)'$ parameters, as shown below. The transformed level variables which correspond to the seasonal frequencies $\omega_k = 2\pi k/4$ are given by

$$x_{1,4t+s} \equiv (1 + L + L^2 + L^3)x_{4t+s}, \qquad x_{2,4t+s} \equiv -(1 - L + L^2 - L^3)x_{4t+s},$$
$$x_{3,4t+s} \equiv -L(1 - L^2)x_{4t+s}, \qquad x_{4,4t+s} \equiv -(1 - L^2)x_{4t+s}. \tag{2.4}$$
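The filters in (2.4) can be illustrated with a minimal NumPy sketch. The function name `hegy_transforms` and the choice to drop the first three observations (which are lost to the lags) are illustrative assumptions, not part of the paper.

```python
import numpy as np

def hegy_transforms(x):
    """Apply the four filters of (2.4) to a series x.

    Hypothetical helper: returns x1..x4 for time indices 3..n-1 (0-based),
    i.e. the first three observations are dropped because of the lags.
    """
    x = np.asarray(x, dtype=float)
    x0, l1, l2, l3 = x[3:], x[2:-1], x[1:-2], x[:-3]   # x_t, x_{t-1}, x_{t-2}, x_{t-3}
    x1 = x0 + l1 + l2 + l3            # (1 + L + L^2 + L^3) x_t   (zero frequency)
    x2 = -(x0 - l1 + l2 - l3)         # -(1 - L + L^2 - L^3) x_t  (Nyquist frequency)
    x3 = -(l1 - l3)                   # -L(1 - L^2) x_t           (harmonic frequencies)
    x4 = -(x0 - l2)                   # -(1 - L^2) x_t            (harmonic frequencies)
    return x1, x2, x3, x4
```

In regression (2.3) these variables enter lagged once, so a caller would align `x1[:-1]`, and so on, with the fourth differences of the later observations.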
The existence of unit roots at the zero ($\omega = 0$), Nyquist ($\omega = \pi$) and harmonic seasonal frequencies ($\omega = \pi/2, 3\pi/2$), respectively, implies that $\pi_1 = 0$, $\pi_2 = 0$ and $\pi_3 = \pi_4 = 0$, in (2.3), and, using an obvious notation, the tests we study here are the regression t-statistics, $t_1$, $t_2$, $t_3$ (one-sided) and $t_4$ (two-sided), together with the F-statistics, $F_{34}$ for $\pi_3 = \pi_4 = 0$, $F_{234}$ for $\pi_2 = \pi_3 = \pi_4 = 0$, and $F_{1234}$ for $\pi_1 = \pi_2 = \pi_3 = \pi_4 = 0$. The $F_{1234}$ statistic therefore facilitates an overall test of $H_0$ of (2.2). Percentiles from approximations to the finite-sample null distributions of these various statistics, obtained by Monte Carlo simulation assuming that $\{v_{4t+s}\} \sim IN(0, 1)$, are given by HEGY (Tables 1a and 1b, pp. 226–227), Smith and Taylor (1998, Tables 1a–1b, p. 276) and Ghysels et al. (1994, Tables C.1 and C.2, pp. 440–441). Some limited tabulations for the $t_1$, $t_2$, $t_3$, $t_4$ and $F_{34}$ statistics arising from the PH case appear in BTa.

To demonstrate that the sampling distributions of these statistics do not depend on either $\gamma$ or $\delta$, it is sufficient to establish that the deterministic part of both the regressand and each of the regressors in (2.3) lies in the span of the matrix, $D^*$, defined below.
Write the generic seasonal intercept and trend dummies as

$$D = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 4 \\
1 & 0 & 0 & 0 & 5 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 4T
\end{pmatrix} \tag{2.5}$$

and let $D^*$ denote the $(4T - m - 4) \times 8$ matrix obtained from D by deleting the first m + 4 rows, and write $P_{D^*} = D^*(D^{*\prime}D^*)^{-1}D^{*\prime}$ for the projection matrix associated with $D^*$. The trend in the dependent variable, $\Delta_4 x_{4t+s}$, lies in the span of $D^*$: writing $\Delta_4 x$ for the column vector containing $\Delta_4 x_{4t+s}$ and defining y in the obvious way, we find that

$$\Delta_4 x = \begin{pmatrix} y_{m+5} - y_{m+1} + d_{m+5} - d_{m+1} \\ \vdots \\ y_{4T} - y_{4T-4} + d_{4T} - d_{4T-4} \end{pmatrix} = \Delta_4 y + D^* \begin{pmatrix} 4\delta \\ 0 \end{pmatrix}.$$

Turning to the transformed level variables, writing x for the column vector containing $x_{4t+s}$ and $x_{-1}$ for the vector containing $x_{4t+s-1}$ and so on, we see that each of the transformed level variables defined in (2.4) can be written as a linear combination of the $x_j$, j = −3, ..., 0. That is, $x = y + D^*(\gamma', \delta')'$, and $x_{-1} = y_{-1} + D^* M_{-1}(\gamma', \delta')'$, where

$$M_{-1} = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}$$

and so on, where the non-singular matrices, $M_{-j}$, apply the necessary column operations to obtain lags of the rows of $D^*$. Thus, the trends in the transformed level variables are in the span of $D^*$, and evidently the same is true of the lagged $\Delta_4 x$
terms. Consequently, as asserted, the sampling distributions of all the statistics of interest are invariant to $\gamma$ and $\delta$. We exploit this in the bootstrap resampling, setting the presample values to zero and omitting the estimated deterministic component.

3. The bootstrap

3.1. Initial estimation

Our algorithm begins with estimation of (2.3), with the lag length, m, and the intervening lags to be retained, selected using the sequential elimination procedure advocated by Beaulieu and Miron (1993, pp. 318–319). That is, a maximum lag, $m_{\max}$, is specified (for programming convenience, rather than necessity, we took this to be a multiple of four), the deterministics are specified, and the test equation is estimated. Thereafter, if any lagged fourth differences have t-statistics smaller than 1.65 in absolute value (i.e. insignificant at an approximate level$^{2}$ of 10%), the least significant lag is removed and the equation re-estimated. This continues until all the included lags are significant, at which point their estimated coefficients and the seven unit root test statistics of Section 2 are recorded, and the residual vector is stored. In small samples there is the possibility that the resulting estimated lag polynomial could have one or more explosive roots, and we found that the performance of the bootstrap was improved if any such root was shrunk to have modulus less than unity; in our experiments we scaled such estimated roots to have modulus equal to 0.999.

3.2. Bootstrap samples

The residuals for each quarter are stored separately, and we draw a sample from each of their empirical distributions; these four independently drawn samples are merged, preserving the seasonal ordering, into the vector, $u^*$, which is then used to construct a bootstrapped observed sequence under the null hypothesis via the equation

$$\hat{\phi}(L)\Delta_4 x^*_{4t+s} = u^*_{4t+s}, \tag{3.1}$$

initialised at 0. The full fitting algorithm, using the same maximum lag, selection method, and deterministics, is then applied to the $x^*_{4t+s}$ series and the resulting test statistics compared with the originals. We repeat this procedure for a large number of bootstrap samples and record how many of the bootstrapped statistics are more extreme (in the relevant tail(s)) than the original, thus locating the latter in the bootstrap null cumulative distribution function (cdf). An important feature of the fitted equation is that the test statistics and the residual vector are unaffected by the deterministic parameters under both the null and alternatives, so that in simulating the null distribution by bootstrap sampling

2 We experimented with both tighter and more liberal lag selection rules; in most cases the former inflated the HEGY test significance levels, while the latter reduced power. Users of our algorithm might wish to experiment with several values.
from the seasonal residual empirical cdfs we need not incorporate the fitted deterministic parameters in the bootstrap samples so long as they are included in the test equation fit to those samples. By this simple means we are able to completely avoid the problems with induced quadratic trends identified by Inoue and Kilian (2002).

For our procedure to have the correct significance level, we need the relevant quantiles of the bootstrap cdf to coincide with those of the true sampling distributions of the various test statistics under the null hypothesis with the true nuisance parameters present. For the bootstrap tests to have power against a given alternative, $H_1$, we require the sampling distributions of the tests under $H_1$ to have mass in the relevant tails of the bootstrap distribution under $H_1$. We explore these issues numerically in Section 4.

3.3. An alternative treatment of heteroscedasticity

Rather than re-sample directly from the residuals, we could use these to consistently estimate the four seasonal variances, $\sigma_{-3}^2,\dots,\sigma_0^2$, and then generate periodically heteroscedastic shocks by scaling standard Normal pseudo-random numbers by the $\hat{\sigma}_s$, using these in place of the $u^*_{4t+s}$. This corresponds to the second method described in the introduction; we comment on its relative performance in Section 4.

4. A comparison of the bootstrap and conventional procedures

4.1. Choosing the number of bootstrap replications in our experimental design

To evaluate the empirical significance level, or the power, of a bootstrap applied to any given data-generation process requires two layers of Monte Carlo simulation; in the outer layer we draw N pseudo-random samples from the DGP and calculate the HEGY test statistics, $[t, F] = [t_1, t_2, t_3, t_4, F_{34}, F_{234}, F_{1234}]$, say, and in the inner layer we generate n bootstrap samples of the DGP with the null hypothesis imposed, as described above, calculating the HEGY statistics, $[t^*, F^*]$, say, for each such sample, and counting how many of these are more extreme (in the relevant tail(s)) than the original $[t, F]$. For concreteness, suppose we wish to evaluate the empirical significance level of a bootstrap applied to $t_1$ with nominal level 5%. Such a bootstrap test rejects its null hypothesis if fewer than n/20 of the $t_1^*$ lie below $t_1$. Having chosen n (see below), we apply this nominal rule to N samples from the null DGP, finding that the test rejects the null hypothesis $N_{t_1,5\%}$ times, say; the empirical size of the bootstrapped $t_1$ test applied to the DGP in question is then estimated to be $N_{t_1,5\%}/N$. If the original samples are drawn under an alternative, then this procedure delivers an estimate of the power of the bootstrap with nominal significance level, 5%.

For sample size, 4T, such an experiment requires approximately $4T \times N \times n$ calls to whichever pseudo-random number generator is used. If the generator had period $\le 2^{31} - 1$, for example, then with 4T = 200, N = 10,000 and n = 2000 it would complete nearly two whole cycles for each experiment.$^{3}$ In view of this, the Monte Carlo results

3 We are grateful to a referee for alerting us to this problem.
below are based on the so-called Kiss Monster$^{4}$ pseudo-random number generator implemented in GAUSS 5.0, which has period at least $2^{3859}$. All experiments reported here used N = 10,000; the sample size, 4T, and number of bootstrap replications, n, are as given case by case below.

As discussed by Davidson and MacKinnon (2000), if the power vs. level curve of an exact test is locally concave, then power of a bootstrap version with correct significance level should be an increasing function of the number of bootstrap replications, n, with upper bound given by the power of the exact test. That is, if a random (i.e. estimated) critical value is used for inference, as is implicit when we use the bootstrap order statistics, then bootstrap power must be an increasing function of the precision of the critical value, that is, of the number of bootstrap replications. However, our purpose in this paper is to estimate the empirical level or power of the proposed bootstrap rather than to provide the most reliable possible inference on any given sample, so to determine a suitable n, we did a preliminary experiment. The results, reported in Table 1, were obtained as follows. To represent the null hypothesis, we generated samples from the DGP, $(1 - L^4)y_{4t+s} = u_{4t+s}$, and for the alternative, $(1 - 0.8L^4)y_{4t+s} = u_{4t+s}$, with $u_{4t+s} \sim NIID(0, 1)$ in both cases. We set $m_{\max} = 0$ and allowed for seasonal de-meaning only, via $\gamma_s^*$ in (2.3). The procedure adopted was therefore very simple, and is as follows.

Step (i): Set outer loop counters (for nominal levels $l_j$ = 1%, 2.5%, 5%, and 10%) to zero, and initialise the random-number generator.
Step (ii): Generate a sample of length T years from the foregoing DGP.
Step (iii): Estimate Eq. (2.3) with m = 0 and no seasonal trend dummies, storing the seven HEGY test statistics, $[t, F]$, and the vector of residuals. Set inner loop counters, $c_1$ to $c_7$, to zero.
Step (iv): From the residuals for each season, draw a pseudo-random sample with replacement, preserving the seasonal ordering, combine these into the vector, $u^*$, say, and generate the bootstrap sample from the null DGP via $(1 - L^4)y^*_{4t+s} = u^*_{4t+s}$.
Step (v): Re-estimate (2.3) using $y^*_{4t+s}$ with m = 0 and no seasonal trend dummies, to obtain a draw of $[t^*, F^*]$ from the bootstrap distribution.
Step (vi): If any given element of $[t^*, F^*]$ is more extreme (in the relevant tail(s)) than the corresponding element of $[t, F]$, increment the corresponding counter, $c_i$, by 1.
Step (vii): Do steps (iv) to (vi) a total of n times.
Step (viii): For each i = 1, ..., 7, increment the jth outer loop counter, $N_{i,l_j}$, by 1 if $c_i/n < l_j$.
Step (ix): Do steps (ii) to (viii) a total of N times, estimating the empirical significance level (or power) at each nominal level$^{5}$ as $N_{i,l_j}/N$.

We ran the above algorithm with n = 400 and 2000; it appears from Table 1 that any variation in power (at level 5%) between these two n values is quite small, at least for the alternative chosen.
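Steps (iii) to (vii) can be sketched in NumPy for a single observed sample as follows. This is an illustration under stated assumptions rather than the authors' GAUSS implementation: the function names `hegy_fit` and `bootstrap_pvalues` are hypothetical, m = 0 with seasonal intercepts only is imposed as in the preliminary experiment, and the rejection tails (lower tail for $t_1$, $t_2$, $t_3$; two-sided for $t_4$; upper tail for the F-statistics) are assumed.

```python
import numpy as np

def hegy_fit(x):
    """OLS fit of (2.3) with m = 0 and seasonal intercepts only (assumed setup).

    Returns [t1, t2, t3, t4, F34, F234, F1234], the residuals, and the
    season (index mod 4) of each residual.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(4, x.size)                           # usable observations
    d4 = x[t] - x[t - 4]                               # Delta_4 x_t
    z1 = x[t - 1] + x[t - 2] + x[t - 3] + x[t - 4]     # x_{1,t-1}
    z2 = -(x[t - 1] - x[t - 2] + x[t - 3] - x[t - 4])  # x_{2,t-1}
    z3 = -(x[t - 2] - x[t - 4])                        # x_{3,t-1}
    z4 = -(x[t - 1] - x[t - 3])                        # x_{4,t-1}
    season = t % 4
    dummies = (season[:, None] == np.arange(4)).astype(float)
    X = np.column_stack([z1, z2, z3, z4, dummies])
    beta, *_ = np.linalg.lstsq(X, d4, rcond=None)
    resid = d4 - X @ beta
    s2 = resid @ resid / (d4.size - X.shape[1])
    V = s2 * np.linalg.inv(X.T @ X)                    # estimated covariance matrix
    tstats = beta[:4] / np.sqrt(np.diag(V)[:4])

    def F(idx):                                        # F-test that beta[idx] = 0
        b = beta[idx]
        return b @ np.linalg.solve(V[np.ix_(idx, idx)], b) / len(idx)

    stats = np.r_[tstats, F([2, 3]), F([1, 2, 3]), F([0, 1, 2, 3])]
    return stats, resid, season

def bootstrap_pvalues(x, n_boot=400, rng=None):
    """Seasonal-residual bootstrap tail probabilities for the seven statistics."""
    rng = np.random.default_rng(rng)
    stats, resid, season = hegy_fit(x)                 # Step (iii)
    by_season = [resid[season == s] for s in range(4)]
    n = len(x)
    counts = np.zeros(7)
    for _ in range(n_boot):                            # Step (vii)
        y = np.zeros(n)
        for s in range(4):                             # Step (iv): resample per season
            k = len(range(s, n, 4))
            draw = rng.choice(by_season[s], size=k, replace=True)
            y[s::4] = np.cumsum(draw)                  # (1 - L^4) y* = u*, zero presample
        b, _, _ = hegy_fit(y)                          # Step (v)
        counts[:3] += b[:3] < stats[:3]                # Step (vi): lower tail (assumed)
        counts[3] += abs(b[3]) > abs(stats[3])         # two-sided for t4
        counts[4:] += b[4:] > stats[4:]                # upper tail for F-statistics
    return stats, counts / n_boot
```

Wrapping `bootstrap_pvalues` in an outer loop over N simulated samples and counting how often each tail probability falls below the nominal level $l_j$ would correspond to Steps (viii) and (ix).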
4 Analysis of the properties of this generator may be found at http://www.aptech.com/papers/rndKMi.pdf
5 Here and throughout the paper we report results only for nominal significance level, 5%, as these are qualitatively similar to results for nominal levels 10%, 2.5% and 1%.
Table 1
Empirical level ($\alpha_4 = 1$) and power ($\alpha_4 = 0.8$) of bootstrap seasonal unit root tests with nominal level 0.05. Seasonal de-meaning, using N = 10,000 outer (Monte Carlo) replications and n bootstrap replications, with $m_{\max} = 0$. DGP: $(1 - \alpha_4 L^4)x_{4t+s} = v_{4t+s} \sim NIID(0, 1)$. Each entry is level/power.

T    n      t1           t2           t3           t4           F34          F234         F1234
13   400    0.049/0.088  0.049/0.088  0.051/0.107  0.054/0.052  0.050/0.100  0.053/0.112  0.048/0.115
13   2000   0.053/0.092  0.050/0.090  0.048/0.112  0.055/0.043  0.056/0.102  0.050/0.115  0.052/0.119
25   400    0.050/0.144  0.048/0.132  0.047/0.212  0.053/0.037  0.049/0.184  0.049/0.233  0.049/0.276
25   2000   0.054/0.137  0.053/0.136  0.053/0.224  0.055/0.037  0.052/0.193  0.051/0.252  0.053/0.289
37   400    0.051/0.211  0.051/0.217  0.049/0.424  0.056/0.037  0.049/0.355  0.049/0.474  0.051/0.562
37   2000   0.051/0.221  0.050/0.225  0.050/0.427  0.054/0.031  0.052/0.355  0.051/0.483  0.048/0.570
49   400    0.052/0.348  0.053/0.336  0.051/0.665  0.049/0.029  0.049/0.572  0.048/0.734  0.049/0.834
49   2000   0.049/0.338  0.051/0.336  0.047/0.663  0.052/0.031  0.049/0.574  0.053/0.737  0.052/0.838
The Monte Carlo standard error of the estimated levels in the table is approximately $(0.05 \times 0.95/10{,}000)^{1/2} \approx 0.0022$, while, for example, the standard error of the estimated power of $F_{34}$ for T = 25 is approximately $(0.18 \times 0.82/10{,}000)^{1/2} \approx 0.0038$, and so the difference between the estimated powers for n = 400 and 2000, which is 0.01, has standard error approximately $0.0038 \times 2^{1/2} = 0.0054$, and is therefore statistically significant, though not practically so. Such a difference in experimentally observed power, of the order of 1%, occurs more or less randomly throughout the table, and so, given the small size of this effect, in our main experiments we use n = 400 to economise on computing time. The bootstrap powers we report below can safely be treated as conservative estimates, therefore.

4.2. The main experiments

4.2.1. Motivation

To motivate these results, we note that the asymptotic distributions of the HEGY test statistics with PH present, allowing for finite-order autoregressive SC by lag augmentation, were obtained by BTa, who found that those of $t_1$ and $t_2$ were unaffected by the PH, while $t_3$ and $t_4$ were affected, as was $F_{34}$ though to a lesser extent. However, BTa did not consider the joint frequency $F_{234}$ and $F_{1234}$ tests, and, as we will demonstrate, these tests are in fact very badly affected by PH; this is a serious drawback, given that these two tests are generally used in an attempt to control the level of the overall testing procedure; see the discussion in, inter alia, Taylor (1998, pp. 353–354) and Smith and Taylor (1998, p. 284). BTa showed that the $F_{34}$ test could be corrected for PH via a pretest procedure in which first-round residuals were used to estimate the PH pattern, which would then be used to estimate critical values by Monte
Carlo simulation using seasonally heteroscedastic pseudo-random Normal shocks. Ignoring any asymmetry in the shocks driving the DGP had rather little effect. In BTb it was shown that in the homoscedastic finite-order SC case, the asymptotic distributions were corrected by lag augmentation in every case except the $t_3$ and $t_4$ statistics; notably, $F_{34}$ (as well as the joint frequency $F_{234}$ and $F_{1234}$ tests) was also shown to have correct level asymptotically under these conditions. However, even in the absence of PH and SC, Taylor (1997) has demonstrated that the small sample distributions of the augmented HEGY statistics can differ substantially from those of the unaugmented statistics. We now explore how well the proposed bootstrap procedure deals with these various difficulties.

4.3. Experimental design

We have simulated both standard and bootstrap HEGY statistics for data generated according to (2.1) with $a(L) = 1 - \alpha_4 L^4$. Results are reported only for the seasonal de-meaning case, those for demeaning and detrending being qualitatively similar. We considered both $\alpha_4 = 1.0$ [the null model] and $\alpha_4 = 0.8$ [alternative], in each case for various levels of PH, namely: (1) [none] $\sigma_{-3} = \sigma_{-2} = \sigma_{-1} = \sigma_0 = 1$; (2) [moderate] $\sigma_{-3} = 3$, $\sigma_{-2} = 1$, $\sigma_{-1} = 3$, $\sigma_0 = 1$; (3) [extreme] $\sigma_{-3} = 30$, $\sigma_{-2} = 1$, $\sigma_{-1} = 1$, $\sigma_0 = 1$. Four levels of first-order serial correlation were also induced via $(1 - \phi L)v_{4t+s} = u_{4t+s}$, with (1) [none] $\phi = 0$; (2) [weak] $\phi = 0.1$; (3) [moderate] $\phi = 0.5$; and (4) [strong] $\phi = 0.9$. We report results for the nominal 5% significance level, and with the maximum lag length set to 4. That is, with reference to the procedure outlined in Section 4.1, at Step (iii), the estimated Eq. (2.3) has lags selected via the Beaulieu and Miron (1993, pp. 318–319) algorithm with $m_{\max} = 4$, and in Step (iv) the bootstrap sample is generated from $\hat{\phi}(L)(1 - L^4)y^*_{4t+s} = u^*_{4t+s}$, while in Step (v) the test Eq. (2.3) is again estimated using the lag selection algorithm with $m_{\max} = 4$.
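The data-generating process of this design can be sketched as follows. The function name `simulate_dgp`, the zero initial conditions for both recursions, and the use of Normal innovations are assumptions made for illustration; the text does not state how the simulated series were initialised.

```python
import numpy as np

def simulate_dgp(T, alpha4=1.0, phi=0.0, sigma=(1.0, 1.0, 1.0, 1.0), rng=None):
    """Simulate 4T quarterly observations from
        (1 - alpha4 L^4) x_t = v_t,   (1 - phi L) v_t = u_t,
    with u_t ~ N(0, sigma_s^2) and sigma ordered as seasons s = -3, -2, -1, 0.
    Both recursions start from zero (an assumption of this sketch).
    """
    rng = np.random.default_rng(rng)
    n = 4 * T
    scale = np.tile(np.asarray(sigma, dtype=float), T)   # periodic standard deviations
    u = scale * rng.standard_normal(n)
    v = np.empty(n)
    x = np.empty(n)
    for t in range(n):
        v[t] = (phi * v[t - 1] if t >= 1 else 0.0) + u[t]       # AR(1) serial correlation
        x[t] = (alpha4 * x[t - 4] if t >= 4 else 0.0) + v[t]    # seasonal AR(4) component
    return x

# Example: extreme PH with moderate SC under the null, T = 25 years.
x_null = simulate_dgp(25, alpha4=1.0, phi=0.5, sigma=(30.0, 1.0, 1.0, 1.0), rng=123)
```

Setting `alpha4=0.8` gives the alternative considered in the experiments, and `sigma=(3, 1, 3, 1)` gives the moderate PH pattern.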
This bootstrap was implemented in two ways; the reported results are for experiments in which the resampling was from the fitted seasonal residuals, while in a parallel set of experiments we used the residuals only to estimate the pattern of periodic heteroscedasticity, which was then applied to pseudo-Normal random numbers (the second method mentioned in the penultimate paragraph of Section 1). The latter approach yielded results essentially identical to those we report below, and so no further details are given.

4.4. Neither periodic heteroscedasticity nor serial correlation

We first consider any power loss that might arise from use of the bootstrap when it is unnecessary; that is, when the shocks are in fact independent homoscedastic Normal. To that end, in Table 2 we compare the bootstrap with a version of current standard practice, that is, in which a lag selection algorithm is employed, but in which the resulting HEGY statistics are referred to the sampling distribution appropriate when the shocks are IID, there is no higher-order autocorrelation present and no lag selection. Since the latter procedure (STD in the table) turns out, most often, to be too liberal, we also tabulate the power of a superior though infeasible test: the figures
78
t1 −3 −2 −1 0 T = 13 1111 3131 30 1 1 1
T = 25 1111 3131 30 1 1 1
BS STD LC BS STD LC BS STD LC BS STD LC BS STD LC BS STD LC
t2
t3
t4
F34
F234
F1234
4 = 1
4 = 0:8
4 = 1
4 = 0:8
4 = 1
4 = 0:8
4 = 1
4 = 0:8
4 = 1
4 = 0:8
4 = 1
4 = 0:8
4 = 1
4 = 0:8
0.051 0.075
0.098 0.141 0.098 0.119 0.163 0.118 0.123 0.244 0.100
0.053 0.076
0.099 0.142 0.090 0.119 0.167 0.107 0.122 0.242 0.106
0.049 0.078
0.116 0.182 0.126 0.131 0.268 0.109 0.110 0.377 0.096
0.097 0.071
0.090 0.069 0.045 0.033 0.007 0.034 0.012 0.000 0.036
0.033 0.081
0.073 0.167 0.115 0.106 0.189 0.107 0.102 0.291 0.096
0.033 0.085
0.083 0.187 0.122 0.106 0.217 0.115 0.087 0.317 0.086
0.032 0.082
0.080 0.182 0.121 0.101 0.224 0.116 0.069 0.314 0.073
0.147 0.162 0.152 0.166 0.186 0.176 0.171 0.245 0.173
0.047 0.058
0.221 0.252 0.234 0.207 0.351 0.212 0.161 0.422 0.157
0.065 0.062
0.061 0.051 0.040 0.027 0.004 0.030 0.013 0.000 0.028
0.042 0.061
0.176 0.245 0.203 0.190 0.283 0.204 0.160 0.340 0.158
0.042 0.067
0.219 0.298 0.258 0.234 0.333 0.260 0.144 0.374 0.139
0.042 0.065
0.055 0.081 0.062 0.141
0.051 0.060 0.051 0.062 0.053 0.088
0.141 0.161 0.142 0.166 0.197 0.174 0.173 0.252 0.162
0.054 0.081 0.063 0.140
0.051 0.062 0.052 0.063 0.054 0.086
0.059 0.136 0.060 0.242
0.054 0.107 0.052 0.192
0.046 0.009 0.018 0.000
0.044 0.004 0.029 0.000
0.046 0.095 0.055 0.179
0.050 0.077 0.052 0.149
0.044 0.095 0.052 0.203
0.048 0.073 0.053 0.173
0.047 0.100 0.050 0.217
0.050 0.092 0.050 0.192
0.253 0.348 0.300 0.237 0.383 0.278 0.126 0.393 0.115
P. Burridge, A.M. Robert Taylor / Journal of Econometrics 123 (2004) 67 – 87
Table 2 Empirical level (4 = 1) and power (4 = 0:8) of bootstrap (BS), standard (STD) and level-corrected standard (LC) Seasonal unit root tests, all with nominal level 0.05. Seasonal de-meaning, and mmax = 4. DGP: (1 − 4 L4 )x4t+s = v4t+s ∼ NIID(0; s2 ), s = −3; : : : ; 0
T = 37 1111
30 1 1 1
T = 49 1111 3131 30 1 1 1
BS STD LC BS STD LC BS STD LC
0.050 0.053 0.053 0.056 0.050 0.078
0.050 0.056 0.052 0.056 0.053 0.074
0.227 0.236 0.238 0.240 0.269 0.257 0.251 0.308 0.228 0.330 0.354 0.349 0.351 0.395 0.344 0.348 0.423 0.354
0.048 0.065 0.046 0.063 0.049 0.078
0.047 0.055 0.051 0.055 0.052 0.070
0.228 0.268 0.235 0.244 0.274 0.248 0.250 0.317 0.233 0.334 0.360 0.307 0.355 0.377 0.334 0.349 0.406 0.348
0.050 0.059 0.057 0.112 0.049 0.187
0.049 0.059 0.049 0.107 0.052 0.185
0.408 0.466 0.417 0.324 0.514 0.321 0.239 0.533 0.218 0.628 0.650 0.619 0.470 0.658 0.473 0.339 0.673 0.341
0.064 0.062 0.048 0.006 0.034 0.000
0.056 0.051 0.045 0.002 0.035 0.000
0.049 0.042 0.039 0.023 0.001 0.026 0.013 0.000 0.019 0.043 0.033 0.038 0.018 0.001 0.023 0.011 0.000 0.020
0.043 0.063 0.054 0.076 0.049 0.132
0.047 0.061 0.047 0.072 0.052 0.126
0.329 0.401 0.342 0.309 0.413 0.317 0.238 0.427 0.218 0.541 0.571 0.528 0.452 0.546 0.460 0.338 0.555 0.339
0.046 0.063 0.052 0.078 0.050 0.163
0.045 0.057 0.048 0.067 0.053 0.154
0.441 0.530 0.458 0.412 0.524 0.426 0.224 0.475 0.203 0.701 0.729 0.693 0.616 0.685 0.622 0.319 0.607 0.323
0.045 0.061 0.052 0.082 0.049 0.181
0.047 0.057 0.052 0.086 0.052 0.182
0.532 0.594 0.564 0.444 0.571 0.438 0.206 0.491 0.185 0.804 0.839 0.810 0.664 0.776 0.668 0.303 0.648 0.305
Notes: ‘BS’ refers to the full bootstrap statistics, ‘STD’ refers to the corresponding conventional augmented HEGY statistics and ‘LC’ refers to level-corrected powers for the conventional HEGY tests. N = 10;000 and n = 400 for all BS experiments.
P. Burridge, A.M. Robert Taylor / Journal of Econometrics 123 (2004) 67 – 87
3131
BS STD LC BS STD LC BS STD LC
79
80
P. Burridge, A.M. Robert Taylor / Journal of Econometrics 123 (2004) 67 – 87
in the rows labelled, LC (level-corrected), are the powers obtained by referring the standard HEGY statistics not to the nominal 5% critical values but to actual 5% empirical critical values obtained by Monte Carlo simulation for the given DGP with lag selection. Examining the rows labelled “1 1 1 1” in Table 2, we see, for example, that with 13 years’ data the bootstrapped t1 , t2 and t3 statistics are correctly sized and have power against the alternative, (1 − 0:8L4 ) (which has four roots of magnitude about 0:95), which is comparable to that of the infeasible exactly level-corrected procedure. The STD rows reveal that the use of the lag selection algorithm followed by reference to the nominal critical values computed without lag selection (i.e. with no lagged dependent variables) results in clear level-inJation. Though ameliorated by increasing sample size, the level inJation reduces only slowly; for example, with 37 years’ data, the t2 statistic still has level 6:5% at nominal 5% while the corresponding bootstrapped test has level 4:8%. The bootstrap remains superior even with 49 years’ data, except in the case of the biased t4 statistic. Neither the bootstrap nor the standard approach do a good job with the joint F-statistics at the smallest sample size, T = 13, for which the bootstrap is rather conservative, unlike the standard approach which is liberal. As sample size increases, both sets of joint F tests approach their correct levels, though the slightly conservative bootstrap tests are closer to the nominal level than are the slightly liberal standard tests. Comparing the powers of the bootstrap and level-corrected standard tests, we see that when the BS has level 5%, the two tests are essentially the same if we allow for the slight under-estimation of BS power suggested by Table 1. Since the level-corrected standard test is infeasible, this is an impressive performance by the bootstrap. 4.5. 
Periodic heteroscedasticity When the shocks are periodically heteroscedastic but not serially correlated, the superiority of the bootstrap tests is very apparent; for example, with T = 13, the bootstrapped t1 , t2 , F34 , F234 , and F1234 tests all have level within about 1% of nominal level, or better, while all the corresponding standard tests are too liberal, very badly so in the case of the F tests when the PH is extreme. Most worryingly, the level inJation a8ecting the standard tests is greatest in the case of the overall test of H0 of (2.2), F1234 . For example, for T = 13, the empirical level of the standard HEGY F1234 test is over 8% for the homoscedastic case, rising to 10% for the moderate PH pattern and almost 22% for the extreme PH pattern, in each case on a nominal 5% level. By T = 49 these empirical levels have fallen only to 5:7%, 8:6%, and 18:2%, respectively. In contrast, the bootstrapped F1234 statistic has empirical level very close to the nominal level, except for the homoscedastic case where it has level 3% for T = 13. Evidently, use of the standard F1234 test to control overall level will not work well if there is PH present, even for a reasonably large sample, while the bootstrap version will control the overall level even with a small sample. More generally, with T = 49, the overall pattern is much the same as with T = 13, the bootstrap doing an excellent job on both level and power (even the t3 and t4 statistics are about correct here), while the standard tests are still too liberal. The
P. Burridge, A.M. Robert Taylor / Journal of Econometrics 123 (2004) 67 – 87
81
strange behaviour of t3 and t4 is discussed at some length in BTa; the PH patterns 2 2 + −1 ) to we have chosen are intended to mimic the fact that if the ratio of (−3 2 2 (−2 + 0 ) tends towards in5nity, then the asymptotic distribution of the t4 statistic collapses to zero, while that of t3 has a heavy leftwards shift. In the light of this, it is frankly quite remarkable how nearly correct the levels of the bootstrapped versions are. In contrast, the standard t3 and t4 tests are essentially unusable when the shocks have PH. 4.6. Periodic heteroscedasticity and serial correlation In our main experiments, reported in Table 3, we combine periodic heteroscedasticity and serial correlation, together with the data-based lag selection procedure (again using mmax = 4). We also allow asymmetric shocks, but only the results for Normal shocks are reported, as those for centred chi-square shocks are no di8erent. Interest therefore now centres on the e8ects on the tests of introducing weak (=0:1), moderate (=0:5) or strong ( = 0:9) serial correlation into the shocks. For the smallest sample size, T = 13, in 59 of the 63 cases (test, PH, SC) displayed in the table, the bootstrap has level closer to the nominal 5% than does the standard test. Neither the bootstrap nor the standard procedure yields the correct level for all of the tests, although the bootstrap procedure is quite clearly superior, notably for the widely used joint F-tests. An interesting feature of the results is that the empirical levels are generally somewhat higher for moderate ( = 0:5) than for either strong ( = 0:9) or weak ( = 0:1) serial correlation. However, this e8ect is not present at the larger sample sizes, suggesting that low power of the lag-selection tests is the likely cause; that is, the closer is to zero, the smaller is the chance of retaining any lagged regressors. Moreover, the closer is to zero, the smaller is the impact on the seasonal unit root tests of omitting lagged regressors. 
As in Table 2, there is no evidence of any power loss relative to the infeasible exact-level test. With 25 years of data, the bootstrap has level closer to 5% in 53 out of 63 cases, and is closer for every one of the joint F-tests. Again, there is no general loss of power compared to the (infeasible) exactly level-corrected test, except for some of the joint F-tests for which the bootstrap remains slightly conservative. For $T = 37$, in every case bar four the bootstrap level is closer to 5% than is that of the standard procedure, and the power loss relative to the infeasible exactly level-corrected test is never more than 3%, the various joint F-tests no longer being conservative. Finally, with 49 years of data, the levels of the bootstrap tests are in 53 out of 63 cases closer to 5% than those of the standard tests, and there is no loss of power relative to the exactly corrected tests. We conclude that, at least with these levels of serial correlation, and with moderate or even quite extreme periodic heteroscedasticity present, and in samples of the sizes typically found in applied work, the bootstrap is much the most reliable way of controlling the levels of the HEGY tests. Furthermore, we find that the powers of the bootstrap tests are not significantly different from those of the corresponding infeasible exactly level-corrected tests whenever the bootstrap test has level close to 5%, as is usually the case.
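The empirical levels and level-corrected (LC) powers discussed in this section can be obtained from simulated statistics along the following lines. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the statistics are toy Normal draws standing in for simulated lower-tailed t-type statistics (they are not results from the paper), and `nominal_crit` is a made-up placeholder for a tabulated critical value. For the F-type statistics the rejection region would be the upper tail instead.

```python
import random


def empirical_critical_value(null_stats, level=0.05):
    """Lower-tail empirical critical value: the order statistic below which
    a share `level` of the simulated null statistics falls."""
    ordered = sorted(null_stats)
    return ordered[int(level * len(ordered))]


def rejection_rate(stats, crit):
    """Share of statistics strictly below the critical value."""
    return sum(s < crit for s in stats) / len(stats)


# Toy stand-ins for N = 10,000 simulated statistics (illustrative values only).
rng = random.Random(0)
null_stats = [rng.gauss(-1.5, 1.0) for _ in range(10000)]   # under the null
alt_stats = [rng.gauss(-2.5, 1.0) for _ in range(10000)]    # under the alternative

nominal_crit = -2.9                                   # hypothetical tabulated 5% value
level_std = rejection_rate(null_stats, nominal_crit)  # empirical level of the standard test
crit_lc = empirical_critical_value(null_stats)        # infeasible exact 5% critical value
level_lc = rejection_rate(null_stats, crit_lc)        # 0.05 by construction
power_lc = rejection_rate(alt_stats, crit_lc)         # level-corrected power
```

The level-corrected test is infeasible in practice because `crit_lc` depends on the true DGP, which is why it serves only as a benchmark for the bootstrap.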
Table 3
Empirical level ($\alpha_4 = 1$) and power ($\alpha_4 = 0.8$) of bootstrap (BS), standard (STD) and standard level-corrected (LC) seasonal unit root tests, all run at the nominal 0.05 level. Seasonal de-meaning, and $m_{\max} = 4$. DGP: $(1 - \alpha_4 L^4)x_{4t+s} = v_{4t+s}$, $(1 - \rho L)v_{4t+s} = u_{4t+s} \sim NIID(0, \sigma_s^2)$, $s = -3, \ldots, 0$.
The PH column lists $\sigma_{-3}\ \sigma_{-2}\ \sigma_{-1}\ \sigma_0$. Level: $\alpha_4 = 1$; Power: $\alpha_4 = 0.8$.

                        t1            t2            t3            t4            F34           F234          F1234
T   PH        rho  Test Level Power   Level Power   Level Power   Level Power   Level Power   Level Power   Level Power
13  1 1 1 1   0.1  BS   0.047 0.087   0.059 0.113   0.046 0.114   0.099 0.100   0.035 0.079   0.039 0.101   0.034 0.087
                   STD  0.066 0.129   0.087 0.161   0.078 0.188   0.140 0.138   0.084 0.176   0.097 0.226   0.084 0.194
                   LC     -   0.088     -   0.115     -   0.117     -   0.096     -   0.108     -   0.133     -   0.120
              0.5  BS   0.041 0.070   0.090 0.177   0.048 0.124   0.143 0.208   0.066 0.179   0.116 0.307   0.104 0.267
                   STD  0.074 0.099   0.135 0.247   0.081 0.183   0.173 0.256   0.151 0.309   0.223 0.459   0.207 0.417
                   LC     -   0.071     -   0.091     -   0.129     -   0.053     -   0.135     -   0.166     -   0.144
              0.9  BS   0.056 0.074   0.059 0.153   0.042 0.105   0.115 0.219   0.048 0.148   0.083 0.240   0.089 0.245
                   STD  0.107 0.134   0.100 0.193   0.058 0.131   0.213 0.327   0.116 0.238   0.155 0.301   0.172 0.324
                   LC     -   0.064     -   0.113     -   0.112     -   0.070     -   0.143     -   0.167     -   0.153
    3 1 3 1   0.1  BS   0.046 0.103   0.064 0.134   0.062 0.134   0.060 0.048   0.050 0.113   0.050 0.128   0.049 0.109
                   STD  0.078 0.147   0.100 0.195   0.127 0.266   0.031 0.022   0.096 0.198   0.114 0.252   0.118 0.235
                   LC     -   0.094     -   0.096     -   0.111     -   0.079     -   0.104     -   0.123     -   0.108
              0.5  BS   0.051 0.076   0.105 0.214   0.055 0.128   0.181 0.297   0.097 0.215   0.154 0.366   0.138 0.313
                   STD  0.078 0.115   0.163 0.280   0.107 0.234   0.123 0.190   0.172 0.344   0.266 0.508   0.257 0.469
                   LC     -   0.082     -   0.090     -   0.119     -   0.040     -   0.127     -   0.157     -   0.133
              0.9  BS   0.057 0.074   0.065 0.166   0.031 0.095   0.131 0.279   0.064 0.194   0.097 0.280   0.105 0.285
                   STD  0.092 0.121   0.103 0.205   0.056 0.139   0.206 0.316   0.128 0.270   0.157 0.321   0.183 0.347
                   LC     -   0.062     -   0.090     -   0.134     -   0.071     -   0.151     -   0.152     -   0.133
    30 1 1 1  0.1  BS   0.064 0.123   0.065 0.129   0.064 0.115   0.184 0.309   0.060 0.107   0.063 0.102   0.060 0.077
                   STD  0.130 0.232   0.130 0.256   0.240 0.377   0.000 0.000   0.176 0.299   0.215 0.338   0.218 0.322
                   LC     -   0.106     -   0.105     -   0.095     -   0.074     -   0.094     -   0.089     -   0.075
              0.5  BS   0.076 0.125   0.092 0.172   0.073 0.132   0.318 0.484   0.141 0.248   0.200 0.342   0.150 0.233
                   STD  0.119 0.200   0.193 0.355   0.221 0.369   0.002 0.004   0.284 0.473   0.339 0.542   0.321 0.499
                   LC     -   0.091     -   0.105     -   0.106     -   0.053     -   0.108     -   0.103     -   0.084
              0.9  BS   0.065 0.090   0.078 0.159   0.029 0.060   0.243 0.358   0.125 0.248   0.128 0.249   0.140 0.255
                   STD  0.109 0.150   0.135 0.249   0.120 0.234   0.160 0.257   0.153 0.256   0.174 0.267   0.213 0.304
                   LC     -   0.073     -   0.104     -   0.110     -   0.048     -   0.112     -   0.107     -   0.083
25  1 1 1 1   0.1  BS   0.047 0.130   0.060 0.176   0.050 0.227   0.079 0.076   0.048 0.188   0.052 0.267   0.049 0.276
                   STD  0.055 0.138   0.085 0.215   0.059 0.260   0.128 0.125   0.072 0.260   0.088 0.369   0.073 0.369
                   LC     -   0.123     -   0.130     -   0.203     -   0.091     -   0.186     -   0.224     -   0.256
              0.5  BS   0.050 0.124   0.041 0.127   0.043 0.180   0.065 0.114   0.037 0.166   0.036 0.171   0.041 0.229
                   STD  0.050 0.128   0.061 0.167   0.053 0.201   0.124 0.200   0.067 0.256   0.071 0.292   0.079 0.365
                   LC     -   0.112     -   0.158     -   0.201     -   0.083     -   0.217     -   0.253     -   0.312
              0.9  BS   0.059 0.095   0.046 0.119   0.044 0.127   0.063 0.107   0.041 0.146   0.037 0.165   0.043 0.204
                   STD  0.076 0.122   0.049 0.133   0.036 0.102   0.191 0.348   0.062 0.202   0.062 0.254   0.074 0.316
                   LC     -   0.083     -   0.140     -   0.151     -   0.080     -   0.203     -   0.257     -   0.275
    3 1 3 1   0.1  BS   0.045 0.135   0.067 0.222   0.055 0.221   0.069 0.061   0.057 0.219   0.064 0.302   0.058 0.274
                   STD  0.049 0.152   0.091 0.251   0.111 0.344   0.026 0.022   0.082 0.288   0.098 0.397   0.101 0.381
                   LC     -   0.138     -   0.171     -   0.193     -   0.081     -   0.196     -   0.265     -   0.256
              0.5  BS   0.049 0.131   0.034 0.110   0.045 0.145   0.054 0.106   0.042 0.154   0.030 0.142   0.037 0.189
                   STD  0.059 0.139   0.049 0.163   0.060 0.233   0.056 0.115   0.075 0.267   0.071 0.297   0.091 0.370
                   LC     -   0.104     -   0.147     -   0.176     -   0.063     -   0.180     -   0.199     -   0.228
              0.9  BS   0.055 0.083   0.049 0.116   0.032 0.076   0.072 0.164   0.047 0.146   0.044 0.148   0.047 0.166
                   STD  0.072 0.102   0.049 0.131   0.020 0.088   0.159 0.336   0.065 0.227   0.065 0.254   0.083 0.309
                   LC     -   0.077     -   0.130     -   0.153     -   0.075     -   0.184     -   0.228     -   0.213
    30 1 1 1  0.1  BS   0.049 0.155   0.067 0.217   0.059 0.173   0.254 0.600   0.062 0.181   0.066 0.179   0.060 0.140
                   STD  0.080 0.200   0.106 0.284   0.195 0.410   0.000 0.000   0.153 0.333   0.194 0.401   0.197 0.383
                   LC     -   0.156     -   0.176     -   0.151     -   0.109     -   0.152     -   0.146     -   0.127
              0.5  BS   0.056 0.146   0.029 0.115   0.032 0.125   0.082 0.202   0.035 0.129   0.033 0.110   0.043 0.123
                   STD  0.081 0.183   0.058 0.168   0.106 0.272   0.020 0.081   0.114 0.276   0.137 0.302   0.178 0.364
                   LC     -   0.148     -   0.159     -   0.165     -   0.092     -   0.156     -   0.143     -   0.128
              0.9  BS   0.044 0.078   0.028 0.076   0.005 0.025   0.158 0.339   0.030 0.076   0.031 0.057   0.041 0.070
                   STD  0.065 0.107   0.042 0.134   0.020 0.083   0.124 0.277   0.091 0.225   0.121 0.268   0.161 0.319
                   LC     -   0.077     -   0.145     -   0.157     -   0.086     -   0.143     -   0.129     -   0.098
37  1 1 1 1   0.1  BS   0.046 0.187   0.062 0.271   0.050 0.402   0.067 0.066   0.051 0.340   0.056 0.486   0.051 0.541
                   STD  0.052 0.229   0.073 0.319   0.061 0.452   0.101 0.104   0.062 0.410   0.081 0.597   0.072 0.667
                   LC     -   0.191     -   0.221     -   0.411     -   0.084     -   0.349     -   0.469     -   0.542
              0.5  BS   0.052 0.187   0.048 0.195   0.049 0.290   0.059 0.091   0.046 0.298   0.045 0.374   0.044 0.494
                   STD  0.056 0.201   0.056 0.225   0.054 0.313   0.115 0.224   0.061 0.381   0.063 0.476   0.056 0.576
                   LC     -   0.196     -   0.210     -   0.331     -   0.055     -   0.324     -   0.408     -   0.529
              0.9  BS   0.055 0.127   0.050 0.214   0.049 0.190   0.061 0.151   0.045 0.319   0.046 0.424   0.049 0.487
                   STD  0.063 0.148   0.062 0.233   0.035 0.167   0.193 0.450   0.057 0.364   0.060 0.487   0.061 0.531
                   LC     -   0.114     -   0.213     -   0.229     -   0.108     -   0.328     -   0.447     -   0.501
    3 1 3 1   0.1  BS   0.046 0.197   0.066 0.321   0.055 0.333   0.074 0.076   0.055 0.335   0.060 0.488   0.056 0.478
                   STD  0.044 0.227   0.088 0.364   0.104 0.502   0.019 0.015   0.073 0.411   0.099 0.595   0.097 0.619
                   LC     -   0.211     -   0.253     -   0.307     -   0.070     -   0.298     -   0.433     -   0.436
              0.5  BS   0.053 0.183   0.049 0.202   0.048 0.235   0.065 0.121   0.052 0.276   0.048 0.346   0.049 0.398
                   STD  0.062 0.218   0.057 0.244   0.062 0.343   0.046 0.123   0.070 0.273   0.067 0.346   0.080 0.427
                   LC     -   0.188     -   0.220     -   0.251     -   0.081     -   0.292     -   0.376     -   0.430
              0.9  BS   0.051 0.115   0.050 0.208   0.032 0.139   0.069 0.238   0.050 0.288   0.048 0.379   0.047 0.393
                   STD  0.058 0.135   0.057 0.229   0.024 0.147   0.153 0.470   0.065 0.378   0.061 0.483   0.070 0.526
                   LC     -   0.106     -   0.222     -   0.248     -   0.115     -   0.301     -   0.402     -   0.421
    30 1 1 1  0.1  BS   0.043 0.200   0.069 0.321   0.056 0.249   0.254 0.751   0.059 0.262   0.063 0.270   0.058 0.221
                   STD  0.067 0.254   0.103 0.401   0.188 0.543   0.000 0.000   0.134 0.441   0.185 0.546   0.196 0.538
                   LC     -   0.216     -   0.271     -   0.237     -   0.202     -   0.240     -   0.230     -   0.202
              0.5  BS   0.051 0.205   0.046 0.221   0.036 0.173   0.131 0.426   0.049 0.213   0.048 0.191   0.049 0.194
                   STD  0.071 0.255   0.059 0.260   0.100 0.379   0.005 0.031   0.105 0.391   0.132 0.449   0.162 0.498
                   LC     -   0.189     -   0.232     -   0.238     -   0.113     -   0.226     -   0.205     -   0.190
              0.9  BS   0.039 0.086   0.048 0.211   0.009 0.050   0.127 0.398   0.050 0.199   0.053 0.192   0.052 0.159
                   STD  0.065 0.113   0.060 0.231   0.027 0.136   0.141 0.446   0.106 0.359   0.132 0.424   0.162 0.461
                   LC     -   0.095     -   0.230     -   0.239     -   0.137     -   0.210     -   0.207     -   0.178
49  1 1 1 1   0.1  BS   0.047 0.296   0.060 0.383   0.053 0.621   0.068 0.064   0.053 0.545   0.057 0.721   0.053 0.811
                   STD  0.050 0.297   0.065 0.403   0.059 0.686   0.118 0.113   0.073 0.628   0.078 0.800   0.073 0.871
                   LC     -   0.301     -   0.383     -   0.652     -   0.089     -   0.564     -   0.752     -   0.851
              0.5  BS   0.050 0.282   0.048 0.315   0.050 0.452   0.060 0.104   0.049 0.519   0.047 0.668   0.050 0.795
                   STD  0.057 0.319   0.051 0.324   0.045 0.457   0.108 0.235   0.051 0.561   0.049 0.688   0.052 0.834
                   LC     -   0.266     -   0.320     -   0.448     -   0.088     -   0.551     -   0.691     -   0.810
              0.9  BS   0.054 0.173   0.052 0.316   0.049 0.281   0.057 0.194   0.047 0.521   0.047 0.680   0.047 0.730
                   STD  0.062 0.193   0.053 0.321   0.034 0.243   0.187 0.536   0.060 0.561   0.051 0.700   0.059 0.780
                   LC     -   0.158     -   0.308     -   0.300     -   0.155     -   0.540     -   0.701     -   0.762
    3 1 3 1   0.1  BS   0.041 0.284   0.065 0.431   0.054 0.472   0.072 0.081   0.055 0.473   0.062 0.670   0.054 0.679
                   STD  0.050 0.301   0.071 0.453   0.112 0.686   0.025 0.018   0.082 0.601   0.091 0.785   0.098 0.815
                   LC     -   0.298     -   0.363     -   0.468     -   0.079     -   0.465     -   0.653     -   0.676
              0.5  BS   0.051 0.283   0.051 0.320   0.047 0.373   0.057 0.166   0.051 0.459   0.050 0.607   0.055 0.668
                   STD  0.060 0.307   0.054 0.329   0.056 0.477   0.046 0.156   0.062 0.533   0.057 0.659   0.081 0.769
                   LC     -   0.283     -   0.326     -   0.399     -   0.094     -   0.438     -   0.608     -   0.668
              0.9  BS   0.050 0.163   0.053 0.322   0.038 0.226   0.061 0.307   0.052 0.447   0.050 0.605   0.050 0.618
                   STD  0.064 0.185   0.047 0.314   0.026 0.208   0.147 0.564   0.066 0.521   0.057 0.648   0.077 0.736
                   LC     -   0.144     -   0.324     -   0.365     -   0.180     -   0.460     -   0.636     -   0.650
    30 1 1 1  0.1  BS   0.039 0.265   0.065 0.445   0.053 0.348   0.251 0.855   0.056 0.368   0.062 0.383   0.055 0.326
                   STD  0.053 0.322   0.086 0.504   0.183 0.684   0.000 0.000   0.131 0.592   0.173 0.685   0.187 0.675
                   LC     -   0.311     -   0.370     -   0.336     -   0.273     -   0.339     -   0.340     -   0.308
              0.5  BS   0.053 0.280   0.051 0.334   0.041 0.261   0.116 0.504   0.052 0.321   0.053 0.302   0.054 0.294
                   STD  0.059 0.343   0.047 0.353   0.090 0.496   0.001 0.044   0.097 0.514   0.121 0.566   0.149 0.636
                   LC     -   0.297     -   0.358     -   0.353     -   0.211     -   0.343     -   0.329     -   0.312
              0.9  BS   0.038 0.106   0.048 0.333   0.017 0.122   0.101 0.482   0.048 0.326   0.050 0.313   0.048 0.265
                   STD  0.056 0.147   0.048 0.320   0.020 0.197   0.125 0.587   0.095 0.491   0.125 0.564   0.164 0.621
                   LC     -   0.115     -   0.319     -   0.328     -   0.191     -   0.320     -   0.297     -   0.264

Notes: See notes for Table 2.
5. Conclusions

In this paper we have shown experimentally that the current practice of lag-augmenting the regression used to implement the HEGY seasonal unit root tests, followed by use of tabulated critical values produced under the assumption that the DGP has serially uncorrelated homoscedastic shocks with an unaugmented test regression, produces far less reliable inferences than an alternative bootstrap procedure. This is especially the case for the joint-frequency F-tests advocated by Ghysels et al. (1994) and Taylor (1998) as a means of controlling overall test level. In the experiments reported, the number, n, of bootstrap replications was kept quite small, at 400, to economise on computer time. In applications, the choice of the number of bootstrap replications is less constrained by computer time, and optimising the bootstrap's power for the specific series in hand becomes important. Furthermore, increasing the number of bootstrap replications, perhaps to many thousands, will certainly yield more accurate estimates of tail probabilities in any given application. Such questions have been explored in recent papers by Andrews and Buchinsky (2001) and Davidson and MacKinnon (2000, 2002), to which we refer readers for further analysis. Since it will almost always be necessary in applied work to somehow correct for higher-order serial correlation, and often for periodic heteroscedasticity too, at the time of writing, our bootstrap method appears to be, by quite a wide margin, the most reliable basis upon which to conduct the HEGY tests.

Acknowledgements

The authors gratefully acknowledge financial support for this research provided by the Economic and Social Research Council of the United Kingdom under research grant R000223963. We are also grateful to Co-Editor Arnold Zellner, an Associate Editor, two anonymous referees, Richard J. Smith, and participants at the UK Econometric Study Group Meeting, Bristol, July 2002, the Econometric Society European Meeting, Venice, August 2002, and the Economics Seminar Series at Melbourne University, in particular Jan Kiviet, for helpful comments and suggestions on earlier drafts of this paper. We thank Marie Brixtofte for excellent research assistance.

References

Andrews, D.W.K., Buchinsky, M., 2001. Evaluation of a three-step method for choosing the number of bootstrap repetitions. Journal of Econometrics 103, 345–386.
Basawa, I.V., Mallik, A.K., McCormick, W.P., Reeves, J.H., Taylor, R.L., 1991. Bootstrapping unstable first-order autoregressive processes. Annals of Statistics 19 (2), 1098–1101.
Beaulieu, J.J., Miron, J.A., 1993. Seasonal unit roots in aggregate U.S. data. Journal of Econometrics 55, 305–328.
Burridge, P., Taylor, A.M.R., 2001a. On regression-based tests for seasonal unit roots in the presence of periodic heteroscedasticity. Journal of Econometrics 104, 91–117.
Burridge, P., Taylor, A.M.R., 2001b. On the properties of regression-based tests for seasonal unit roots in the presence of higher-order serial correlation. Journal of Business and Economic Statistics 19 (3), 374–379.
Davidson, R., MacKinnon, J.G., 2000. Bootstrap tests: how many bootstraps? Econometric Reviews 19, 55–68.
Davidson, R., MacKinnon, J.G., 2002. Fast double bootstrap tests of nonnested linear regression models. Econometric Reviews 21, 417–427.
Dickey, D.A., Hasza, D.P., Fuller, W.A., 1984. Testing for unit roots in seasonal time series. Journal of the American Statistical Association 79, 355–367.
Ferretti, N., Romo, J., 1996. Bootstrap tests for unit root AR(1) models. Biometrika 84, 849–860.
Ghysels, E., Lee, H.S., Noh, J., 1994. Testing for unit roots in seasonal time series: some theoretical extensions and a Monte Carlo investigation. Journal of Econometrics 62, 415–442.
Ghysels, E., Hall, A., Lee, H.S., 1996. On periodic structures and testing for seasonal unit roots. Journal of the American Statistical Association 91, 1551–1559.
van Giersbergen, N.P.A., 1998. Bootstrapping dynamic econometric models. Ph.D. Thesis No. 184, Tinbergen Institute Series, Thesis Publishers Amsterdam, 1998.
Horowitz, J.L., Savin, N.E., 2000. Empirically relevant critical values for hypothesis tests: a bootstrap approach. Journal of Econometrics 95, 375–390.
Hylleberg, S., Engle, R.F., Granger, C.W.J., Yoo, B.S., 1990. Seasonal integration and cointegration. Journal of Econometrics 44, 215–238.
Inoue, A., Kilian, L., 2002. Bootstrapping autoregressive processes with possible unit roots. Econometrica 70 (1), 377–391.
Nankervis, J.C., Savin, N.E., 1996. Level and power of the bootstrap t test in the AR(1) model with trend. Journal of Business and Economic Statistics 14 (2), 161–168.
Park, J.Y., 2002. An invariance principle for sieve bootstrap in time series. Econometric Theory 18 (2), 469–490.
Phillips, P.C.B., 2001. Bootstrapping spurious regression. Cowles Foundation Discussion Paper 1330, May 2001.
Psaradakis, Z., 2000. Bootstrap tests for unit roots in seasonal autoregressive models. Statistics and Probability Letters 50, 389–395.
Rayner, R.K., 1990. Bootstrapping p values and power in the first-order autoregression: a Monte Carlo investigation. Journal of Business and Economic Statistics 8, 251–263.
Smith, R.J., Taylor, A.M.R., 1998. Additional critical values and asymptotic representations for seasonal unit root tests. Journal of Econometrics 85, 269–288.
Smith, R.J., Taylor, A.M.R., 1999a. Regression-based seasonal unit root tests. University of Birmingham Discussion Papers in Economics, 99–15.
Smith, R.J., Taylor, A.M.R., 1999b. Likelihood ratio tests for seasonal unit roots. Journal of Time Series Analysis 20, 453–476.
Taylor, A.M.R., 1997. On the practical problems of computing seasonal unit root tests. International Journal of Forecasting 13, 307–318.
Taylor, A.M.R., 1998. Testing for unit roots in monthly time series. Journal of Time Series Analysis 19, 349–368.