Spurious correlation of I(0) regressors in models with an I(1) dependent variable

Economics Letters 91 (2006) 184 – 189 www.elsevier.com/locate/econbase

Chris Stewart *
Department of Economics, Finance and International Business, London Metropolitan University, 84 Moorgate, London, EC2M 6SQ, United Kingdom
Received 29 March 2005; received in revised form 25 August 2005; accepted 11 November 2005
Available online 20 March 2006

Abstract

Hassler [Hassler, U., 1996. Spurious regressions when stationary regressors are included, Economics Letters, 50, 25–31] shows that t- and F-tests for zero restrictions on I(0) regressors in equations with an I(1) dependent variable do not diverge to infinity asymptotically. He concludes that there is no spurious significance for these regressors. Using Monte Carlo simulation we demonstrate that spurious correlation generally occurs in such regressions.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Spurious regression; Stationary regressors; Simulation
JEL classification: C15

1. Introduction

Spurious regression refers to exaggerated correlation indicated by various statistics, generally in a linear regression model (LRM). The focus has typically been on the appearance of correlation between independently generated variables, as indicated by t- and F-tests for zero restrictions on coefficients. In particular, the coefficients of the regressors are found to be statistically significant more frequently than the specified nominal level of significance. This will be the case if such t- or F-test statistics diverge to infinity asymptotically, because they will then generally exceed (in magnitude) their corresponding critical values.

Numerous cases where spurious regression is evident have been discussed in the literature. Granger and Newbold (1974), Phillips (1986) and Entorf (1997) demonstrate that there will be spurious regression on the slope coefficients in Ordinary Least Squares (OLS) regressions of independently generated I(1) variables. Spurious correlations are also found in regressions involving random walks and linear trends, according to Nelson and Kang (1984) and Durlauf and Phillips (1988). Haldrup (1994) and Marmol (1995, 1996) demonstrate that spurious correlations are evident in OLS regressions involving combinations of series with integer orders of integration equal to, or greater than, one. Spurious regressions are shown to occur in models with series generated by various combinations of different types of stationary process (with and without linear trends, and possibly allowing for time-varying means due to structural breaks or seasonality) by Jenkins and Watts (1968), Granger et al. (2001), Hassler (2003) and Kim et al. (2004). Marmol (1998) and Tsay and Chung (2000) demonstrate that spurious correlation generally occurs in regressions involving various combinations of fractionally integrated processes (both stationary and nonstationary).¹

One of the few situations where spurious regression is suggested not to arise is on the coefficients of I(0) regressors when the dependent variable is I(1). The work of Hassler (1996), who has examined this case, is reconsidered in the next section. An empirical analysis is provided in Section 3. Section 4 concludes.

* Tel.: +44 20 7320 1651; fax: +44 20 7320 1414. E-mail address: [email protected].
0165-1765/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.econlet.2005.11.014

2. Spurious regression involving I(0) regressors: theory and evidence

Hassler (1996) derives the asymptotic theory for model (1) and for the unbalanced regression, which is Eq. (1) with $\hat{\beta}_2 = 0$:

$$Y_t = \hat{\alpha} + \hat{\beta}_1' X_{1t} + \hat{\beta}_2' X_{2t} + \hat{u}_t \qquad (1)$$

where $Y_t$ is an I(1) random walk (without drift), $X_{1t}$ is an $m_1 \times 1$ vector of I(0) processes, $X_{2t}$ is an $m_2 \times 1$ vector of I(1) random walks (without drifts), and $\hat{\beta}_1$ and $\hat{\beta}_2$ are the respective $m_1 \times 1$ and $m_2 \times 1$ coefficient vectors on the regressors. The intercept is $\hat{\alpha}$. All variables are assumed to be independently generated and the error term is I(1), so that there is no cointegration.

Hassler (1996) demonstrates that, for model (1),

$$\hat{\alpha} = O_p(T^{1/2}), \quad \hat{\beta}_1 = O_p(1), \quad \hat{\beta}_2 = O_p(1), \quad t_{\beta_1} = O_p(1), \quad t_{\beta_2} = O_p(T^{1/2}),$$
$$F_{\beta_1} = O_p(1), \quad F_{\beta_2} = O_p(T), \quad s^2 = O_p(T), \quad R^2 = O_p(1), \quad DW = O_p(T^{-1}).$$

Here $t_{\beta_1}$ and $t_{\beta_2}$ denote the t-statistics for testing that the individual coefficients on the I(0) and I(1) variables, respectively, are zero. $F_{\beta_1}$ and $F_{\beta_2}$ are the respective F-tests of zero restrictions on all of the I(0) variables' coefficients and all of the I(1) variables' coefficients. Further, $s^2$ is the residual variance estimator, $R^2$ is the coefficient of determination and $DW$ is the Durbin–Watson statistic.² Of particular interest is that the coefficient estimator and the t- and F-statistics associated with the I(0) variables all converge to random variables. It is noted that, “…it is somehow surprising that $\hat{\beta}_1$ does not converge to zero. Nevertheless, we observe no spurious significance of $\hat{\beta}_1$…meaning that t- and F-statistics do not diverge” (Hassler, 1996, p. 29).

Hassler (1996) also demonstrates that, for the unbalanced regression,

$$\hat{\beta}_1 = O_p(1), \quad t_{\beta_1} = O_p(1), \quad F_{\beta_1} = O_p(1), \quad s^2 = O_p(T), \quad R^2 = O_p(T^{-1}), \quad DW = O_p(T^{-1}).$$

As for model (1), the I(0) regressors' coefficients and their associated t- and F-statistics do not diverge to infinity.

Granger et al. (2001, Table 2, p. 901) report simulation results for the slope coefficient's t-ratio, $t_{\beta_1}$, in the unbalanced regression when the regressor follows a first-order autoregressive process, AR(1), with autocorrelation coefficient equal to 0.5.³ The reported rejection probabilities (with a nominal size of 5%) are 24.8%, 26.2% and 22.8% for sample sizes of 100, 500 and 2000, respectively, indicating spurious regression. These results are inconsistent with Hassler's (1996) interpretation of his findings.

This inconsistency arises because Hassler (1996) interprets the non-divergence of $t_{\beta_1}$ and $F_{\beta_1}$ as indicating no spurious significance. However, the occurrence of spurious regression depends on the (absolute) average size of the random variable to which the test statistic converges, relative to its critical value. Theoretically, for the unbalanced regression the average size of the test statistic will depend on the specification of the data generation process (DGP) of the I(0) regressor. For the two-variable LRM when both variables are positively autocorrelated (assuming both are AR(1), where the dependent variable has a unit root), the t-ratio of the slope coefficient will be biased upwards in the presence of residual autocorrelation.⁴ Further, this bias will increase with the size of the autocorrelation coefficient in the DGP of the regressor. This paper employs simulation methods to determine whether spurious correlation occurs in such regressions.

¹ There is no spurious regression in some cases involving stationary series.
² All statistics are based on OLS estimators.

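3. Monte Carlo simulation experiments

For the Monte Carlo simulation experiment we specify the DGPs of the three variables, $X_{1t}$, $X_{2t}$ and $Y_t$, as:

$$X_{1t} = \rho_{X_1} X_{1,t-1} + \nu_{X_1,t}, \qquad X_{1,0} = 0, \qquad \nu_{X_1,t} \sim N(0, 1) \qquad (2)$$

$$X_{2t} = X_{2,t-1} + \nu_{X_2,t}, \qquad X_{2,0} = 0, \qquad \nu_{X_2,t} \sim N(0, 1) \qquad (3)$$

$$Y_t = Y_{t-1} + \nu_{Y,t}, \qquad Y_0 = 0, \qquad \nu_{Y,t} \sim N(0, 1) \qquad (4)$$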

where $X_{1,0}$, $X_{2,0}$ and $Y_0$ denote the initial values of the variables (all are set to zero), $\rho_{X_1}$ is the autocorrelation coefficient and $\nu_{X_1,t}$, $\nu_{X_2,t}$ and $\nu_{Y,t}$ are stochastic errors generated using independent standard normal distributions. The model to be estimated is:

$$Y_t = \hat{\alpha} + \hat{\beta}_1 X_{1t} + \hat{\beta}_2 X_{2t} + \hat{u}_t. \qquad (5)$$
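The design in Eqs. (2)–(5) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function names (`simulate_sample`, `t_ratio_b1`, `monte_carlo`), the use of NumPy, the asymptotic critical value of 1.96 and the reduced default number of replications are all assumptions made here for brevity.

```python
import numpy as np


def simulate_sample(T, rho, rng, burn_in=500):
    """Draw one sample from DGPs (2)-(4), discarding `burn_in` initial observations."""
    n = T + burn_in
    nu_x1, nu_x2, nu_y = rng.standard_normal((3, n))
    # AR(1) regressor with X_{1,0} = 0 (a random walk when rho = 1).
    x1 = np.empty(n)
    prev = 0.0
    for t in range(n):
        prev = rho * prev + nu_x1[t]
        x1[t] = prev
    x2 = np.cumsum(nu_x2)  # driftless random walk, X_{2,0} = 0
    y = np.cumsum(nu_y)    # driftless random walk, Y_0 = 0
    return y[burn_in:], x1[burn_in:], x2[burn_in:]


def t_ratio_b1(y, x1, x2):
    """OLS t-ratio on X1 in the regression of Y on a constant, X1 and X2 (Eq. (5))."""
    T = y.shape[0]
    X = np.column_stack([np.ones(T), x1, x2])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (T - X.shape[1])  # residual variance estimator
    se = np.sqrt(s2 * np.diag(xtx_inv))
    return beta[1] / se[1]


def monte_carlo(T, rho, reps=1000, crit=1.96, seed=0):
    """Return (average |t|, empirical size) for the t-ratio on the I(0) regressor."""
    rng = np.random.default_rng(seed)
    t_stats = np.array(
        [t_ratio_b1(*simulate_sample(T, rho, rng)) for _ in range(reps)]
    )
    abs_t = np.abs(t_stats)
    return abs_t.mean(), (abs_t > crit).mean()


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        avg_t, size = monte_carlo(T=100, rho=rho)
        print(f"rho={rho:.1f}  avg|t|={avg_t:.2f}  empirical size={size:.1%}")
```

With far fewer replications than the 10,000 used in the paper, the figures printed by such a sketch will be noisier than the published ones, so they should be read as indicative only.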

Eq. (5) is estimated by OLS for various sample sizes (T = 25, 50, 100, 200, 500, 1000, 5000, 10,000)⁵ and values of the autocorrelation coefficient, $\rho_{X_1}$ ($0 \le \rho_{X_1} \le 1$, rising in 0.1 unit increments). From the 10,000 replications we calculate the average magnitude of the t-ratio of the stationary regressor, $t_{\beta_1}$, and its associated empirical size (percentage mean rejections of the null) using a 5% nominal level of significance.

The average magnitudes of $t_{\beta_1}$ are reported in Table 1. For $0 \le \rho_{X_1} \le 0.9$, $t_{\beta_1}$ does not diverge to infinity. However, the average magnitude of $t_{\beta_1}$ increases as the value of $\rho_{X_1}$ rises. For example, the average magnitude of $t_{\beta_1}$ is in the range of 0.80 to 0.83 for $\rho_{X_1} = 0$, ranges from 1.25 to 1.40 for $\rho_{X_1} = 0.5$, and takes on values between 1.98 and 3.44 for $\rho_{X_1} = 0.9$. This is not inconsistent with Hassler's (1996) theoretical results, but neither is it apparent from them. Clearly, as these average magnitudes rise (relative to the absolute critical value) so will the empirical size.

Table 2 reports the empirical size for $t_{\beta_1}$ using a 5% critical value (*** denotes that the size of the test is significantly different from 5% at the 1% level). The empirical size notably increases both as the sample size rises and as $\rho_{X_1}$ becomes larger. For example, when $\rho_{X_1} = 0$ the empirical size of the test is not significantly greater than the nominal size of 5% for all sample sizes considered (no evident spurious regression). However, when $0.1 \le \rho_{X_1} \le 0.9$ the empirical size of the test is significantly different from the nominal size for all sample sizes, suggesting significant spurious regression. The severity of spurious regression can be gauged by considering the percentage mean rejections for various values of $\rho_{X_1}$. When $\rho_{X_1} = 0.1$ the empirical size of the test ranges from 6.44% to 7.74%, indicating a very minor degree of spurious correlation. A moderate but notable degree of spurious regression is evident when $\rho_{X_1} = 0.5$, with the empirical size falling between 18.09% and 27.06%. Finally, the empirical size is in the range of 37.77% to 65.17% when $\rho_{X_1} = 0.9$, which suggests a substantial level of spurious correlation.

This evidence refutes Hassler's (1996) interpretation of the non-divergence of $t_{\beta_1}$ as indicating that there will be no spurious significance for $\hat{\beta}_1$. Our findings suggest that, in this case, while there will not always be spurious regression, it will generally arise to some notable degree and sometimes to a substantial degree.

Table 1
Average magnitudes of $t_{\beta_1}$ in the 3-variable LRM

              Sample size (T)
$\rho_{X_1}$  25      50      100     200     500     1000    5000    10000
0.00          0.83    0.82    0.80    0.81    0.80    0.81    0.80    0.80
0.10          0.88    0.89    0.87    0.88    0.87    0.88    0.88    0.88
0.20          0.97    0.96    0.97    0.97    0.99    0.98    0.97    0.98
0.30          1.04    1.09    1.08    1.08    1.09    1.09    1.08    1.08
0.40          1.14    1.16    1.20    1.18    1.21    1.21    1.22    1.23
0.50          1.25    1.30    1.34    1.39    1.37    1.39    1.38    1.40
0.60          1.36    1.48    1.53    1.56    1.57    1.58    1.60    1.58
0.70          1.52    1.70    1.77    1.83    1.87    1.90    1.92    1.92
0.80          1.76    2.00    2.19    2.30    2.33    2.39    2.41    2.40
0.90          1.98    2.54    2.94    3.21    3.35    3.42    3.44    3.44
1.00          2.37    3.38    4.83    6.81    10.97   15.60   34.78   48.71

Number of samples (replications) = 10,000; maximum number of observations in a sample = 10,500; number of discarded initial observations = 500. DGP $X_{1t}$: $X_{1t} = \rho_{X_1} X_{1,t-1} + \nu_{X_1,t}$, $X_{1,0} = 0$, $\nu_{X_1,t} \sim N(0, 1)$; DGP $X_{2t}$: $X_{2t} = X_{2,t-1} + \nu_{X_2,t}$, $X_{2,0} = 0$, $\nu_{X_2,t} \sim N(0, 1)$; DGP $Y_t$: $Y_t = Y_{t-1} + \nu_{Y,t}$, $Y_0 = 0$, $\nu_{Y,t} \sim N(0, 1)$. Regression: $Y_t = \hat{\alpha} + \hat{\beta}_1 X_{1t} + \hat{\beta}_2 X_{2t} + \hat{u}_t$. The average magnitudes of $t_{\beta_1}$ are reported.

Table 2
Percentage mean rejections of $H_0$: $\beta_1 = 0$ in the 3-variable LRM

              Sample size (T)
$\rho_{X_1}$  25          50          100         200         500         1000        5000        10000
0.00          5.05%       5.05%       4.81%       4.83%       4.96%       5.37%       4.81%       4.92%
0.10          6.44%***    7.46%***    7.34%***    7.28%***    7.54%***    7.74%***    7.64%***    7.63%***
0.20          9.27%***    9.98%***    10.29%***   10.31%***   11.25%***   10.82%***   11.05%***   10.99%***
0.30          11.78%***   13.97%***   13.98%***   14.56%***   14.99%***   14.96%***   14.55%***   14.81%***
0.40          15.07%***   16.98%***   18.44%***   18.49%***   19.75%***   19.69%***   19.81%***   20.64%***
0.50          18.09%***   21.57%***   23.38%***   26.15%***   25.71%***   26.27%***   25.58%***   27.06%***
0.60          21.96%***   27.49%***   29.61%***   31.20%***   31.64%***   32.28%***   32.68%***   31.93%***
0.70          26.83%***   33.76%***   36.33%***   38.79%***   40.32%***   40.86%***   41.93%***   41.44%***
0.80          33.38%***   41.40%***   46.65%***   48.98%***   49.73%***   50.61%***   51.32%***   51.33%***
0.90          37.77%***   51.32%***   57.95%***   62.37%***   63.67%***   65.17%***   64.87%***   64.49%***
1.00          45.24%***   60.81%***   71.75%***   79.81%***   87.25%***   91.35%***   96.13%***   97.14%***

Number of samples (replications) = 10,000; maximum number of observations in a sample = 10,500; number of discarded initial observations = 500. DGP $X_{1t}$: $X_{1t} = \rho_{X_1} X_{1,t-1} + \nu_{X_1,t}$, $X_{1,0} = 0$, $\nu_{X_1,t} \sim N(0, 1)$; DGP $X_{2t}$: $X_{2t} = X_{2,t-1} + \nu_{X_2,t}$, $X_{2,0} = 0$, $\nu_{X_2,t} \sim N(0, 1)$; DGP $Y_t$: $Y_t = Y_{t-1} + \nu_{Y,t}$, $Y_0 = 0$, $\nu_{Y,t} \sim N(0, 1)$. Regression: $Y_t = \hat{\alpha} + \hat{\beta}_1 X_{1t} + \hat{\beta}_2 X_{2t} + \hat{u}_t$. The percentage mean rejections of the t-ratio ($t_{\beta_1}$) for the null, $H_0$: $\beta_1 = 0$, against the alternative, $H_1$: $\beta_1 \ne 0$, are reported. *** denotes the percentage mean rejections of $t_{\beta_1}$ being significantly different from 5% at the 1% level.

Further simulation results not reported here are provided in the working paper version of this article; see Stewart (2005). These show that:

• The I(0) regressor also suffers from spurious regression in the unbalanced two-variable LRM.
• Hassler's (1996) theoretical results for the other statistics in both the two- and three-variable LRM are consistent with the simulation evidence.
• The use of Heteroscedasticity Autocorrelation Consistent standard errors in both two- and three-variable LRMs generally reduces the degree of spurious regression for the I(0) regressor.
• When the I(0) variable, $X_{1,t}$, is generated using a first-order moving average process there remains evidence of spurious regression for this regressor in both LRMs. However, it is not as severe as when the variable's DGP is AR(1).

³ Granger et al. (2001) do not comment upon these results: they appear to be an externality in their analysis of spurious correlation among stationary variables.
⁴ Hassler (1996) demonstrates that DW converges to zero for model (1) (with and without $\hat{\beta}_2 = 0$), indicating residual autocorrelation.
⁵ These are the sample sizes after the first 500 observations have been discarded.
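The connection between a non-divergent t-ratio and an inflated empirical size can be illustrated with a rough back-of-the-envelope calculation. The sketch below rests on two assumptions that are not taken from the paper: (i) purely for illustration, the t-ratio is treated as approximately zero-mean normal with standard deviation $\sigma$, so that $E|t| = \sigma\sqrt{2/\pi}$ and the rejection rate is $2[1 - \Phi(1.96/\sigma)]$; and (ii) the significance of a deviation from 5% is checked with a normal approximation to the binomial over 10,000 replications, since the paper does not state which test underlies the *** markers. The function names `implied_size` and `size_z_stat` are hypothetical.

```python
import math


def implied_size(avg_abs_t, crit=1.96):
    """Two-sided rejection rate implied by an average |t|, assuming t ~ N(0, sigma^2)."""
    # For a zero-mean normal, E|t| = sigma * sqrt(2 / pi), so back out sigma.
    sigma = avg_abs_t * math.sqrt(math.pi / 2.0)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) with z = crit / sigma.
    return math.erfc(crit / (sigma * math.sqrt(2.0)))


def size_z_stat(p_hat, reps=10_000, nominal=0.05):
    """z-statistic for H0: true rejection rate equals `nominal` (normal approximation)."""
    se = math.sqrt(nominal * (1.0 - nominal) / reps)
    return (p_hat - nominal) / se


if __name__ == "__main__":
    # (rho, average |t| at T = 10,000 from Table 1, empirical size from Table 2)
    rows = [(0.0, 0.80, 0.0492), (0.5, 1.40, 0.2706), (0.9, 3.44, 0.6449)]
    for rho, avg_t, reported in rows:
        print(
            f"rho={rho:.1f}  implied size={implied_size(avg_t):.1%}  "
            f"reported={reported:.1%}  z vs 5%={size_z_stat(reported):.1f}"
        )
```

Under these assumptions the average magnitudes of 1.40 and 3.44 in Table 1 map to rejection rates of roughly 26% and 65%, in the neighbourhood of the corresponding entries in Table 2, which is one way to see why a statistic that is $O_p(1)$ can still over-reject heavily. An absolute z-statistic above 2.576 corresponds to significance at the 1% level in this approximation.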

4. Conclusion

This paper demonstrates that spurious regression will generally occur in a regression of an I(1) dependent variable on an I(0) regressor, with or without another I(1) regressor. This does not support Hassler's (1996) interpretation of his most important result, namely that there is no spurious significance for I(0) regressors in a model with an I(1) dependent variable. It implies that spurious correlation is more widespread than previously thought and that it will affect frequently applied regressions; see, for example, Stock and Watson (1993).

Acknowledgement

The author acknowledges the helpful comments of an anonymous referee.

References

Durlauf, S.N., Phillips, P.C.B., 1988. Trends versus random walks in time series analysis. Econometrica 56 (6), 1333–1354.
Entorf, H., 1997. Random walks with drifts: nonsense regression and spurious fixed-effect estimation. Journal of Econometrics 80, 287–296.
Granger, C.W.J., Newbold, P., 1974. Spurious regressions in econometrics. Journal of Econometrics 2, 111–120.
Granger, C.W.J., Hyung, N., Jeon, Y., 2001. Spurious regressions with stationary series. Applied Economics 33, 899–904.
Haldrup, N., 1994. The asymptotics of single-equation cointegration regressions with I(1) and I(2) variables. Journal of Econometrics 63, 153–181.
Hassler, U., 1996. Spurious regressions when stationary regressors are included. Economics Letters 50, 25–31.
Hassler, U., 2003. Nonsense regressions due to neglected time-varying means. Statistical Papers 44, 169–182.
Jenkins, J.M., Watts, D.G., 1968. Spectral Analysis and its Applications. Holden-Day, San Francisco.
Kim, T., Lee, Y., Newbold, P., 2004. Spurious regressions with stationary processes around linear trends. Economics Letters 83, 257–262.
Marmol, F., 1995. Spurious regressions between I(d) processes. Journal of Time Series Analysis 16 (3), 313–321.
Marmol, F., 1996. Nonsense regressions between integrated processes of different orders. Oxford Bulletin of Economics and Statistics 58 (3), 525–536.
Marmol, F., 1998. Spurious regression theory with nonstationary fractionally integrated processes. Journal of Econometrics 84, 233–250.
Nelson, C.R., Kang, H., 1984. Pitfalls in the use of time as an explanatory variable in regression. Journal of Business and Economic Statistics 2, 73–82.
Phillips, P.C.B., 1986. Understanding spurious regressions in econometrics. Journal of Econometrics 33, 311–340.
Stewart, C., 2005. Spurious correlation of I(0) regressors in regressions with an I(1) dependent variable. Department of Economics Discussion Paper. London Metropolitan University.
Stock, J.H., Watson, M.W., 1993. A simple estimator of cointegrating vectors in higher order integrated systems. Econometrica 61, 783–820.
Tsay, W., Chung, C., 2000. The spurious regression of fractionally integrated processes. Journal of Econometrics 96, 155–182.