A spatial Hausman test


Economics Letters 101 (2008) 282–284

Contents lists available at ScienceDirect

Economics Letters journal homepage: www.elsevier.com/locate/econbase

R. Kelley Pace a,⁎,1, James P. LeSage b

a LREC Endowed Chair of Real Estate, Department of Finance, E.J. Ourso College of Business Administration, Louisiana State University, Baton Rouge, LA 70803-6308, United States

b Fields Endowed Chair in Urban and Regional Economics, McCoy College of Business Administration, Department of Finance and Economics, Texas State University - San Marcos, San Marcos, Texas 78666, United States

Article info

Article history: Received 20 October 2007; received in revised form 2 September 2008; accepted 16 September 2008; available online 23 September 2008

Keywords: Spatial autoregression; Specification test; Spatial econometrics; SAR; SEM

JEL classification: C11; C13

Abstract

Often, authors report materially different OLS and spatial error model estimates. However, under the null of correct specification, these estimates should be similar. We propose a spatial Hausman test and conduct a Monte Carlo experiment to examine its performance. © 2008 Elsevier B.V. All rights reserved.

1. Introduction

Both ordinary least squares (OLS) and the spatial error model (SEM) have been widely applied to spatial data. Often, authors provide both sets of estimates along with standard errors, allowing a pairwise comparison. This type of comparison reveals cases where OLS and SEM estimates are quite similar (Pace, 1997; Cohen and Coughlin, 2006), other indeterminate cases where various, but not obviously significant, differences exist (Neill et al., 2007; Theebe, 2004), and cases where differences appear to be statistically significant. For example, Brasington (2007), in a study on the willingness to pay for public schools, found OLS and SEM coefficients with different signs on variables representing educational attainment and owner-occupied housing. In a study on retail location, Lee and Pace (2005) report an OLS estimate relating store size to sales that was negative and significant, while the SEM estimate was positive and significant. In fact, in his seminal paper on spatial regression models, Ord (1975) reports OLS and SEM estimates from a univariate model (with intercept) where the slope coefficient differs by two standard errors.

⁎ Corresponding author. Tel.: +1 225 578 6256 (OFF); fax: +1 225 578 9065.
E-mail addresses: [email protected] (R. Kelley Pace), [email protected] (J.P. LeSage).
1 The authors would like to thank David Brasington, Dek Terrell, Donald Lacombe and Jennifer Zhu for their valuable comments. In addition, the author would like to acknowledge support from NSF SES-0729259, 0729264 as well as the Louisiana and Texas Sea Grant Programs.
0165-1765/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.econlet.2008.09.003

Under the SEM assumptions, OLS and SEM regression parameter estimates should be unbiased (Anselin, 1988, p. 59). This suggests that significant differences in regression parameter estimates will arise only from misspecification. We formalize this result with a spatial Hausman specification test for significant differences between OLS and SEM estimates. In a Monte Carlo experiment, we show that the spatial Hausman test has the correct size.

2. Spatial Hausman test

The linear model where the disturbances are independent and identically distributed (iid) represents a simple data generating process (DGP) that we label the iid DGP, shown in Eq. (1). The $n$ by 1 vector $y$ represents the regressand, the matrix $X$ contains $n$ observations on $k$ exogenous explanatory variables, $\beta$ is a $k$ by 1 vector of regression parameters, and $\varepsilon$ is an $n$ by 1 vector of iid disturbances.

$$y = X\beta + \varepsilon. \tag{1}$$

The canonical estimator for the iid DGP is $\hat{\beta} = (X'X)^{-1}X'y$, or OLS. The iid error model has been widely used with spatial data samples where the observations represent points or regions located in space. As an alternative, assume the disturbances follow a spatial autoregressive process, labeled the spatial error DGP in Eq. (2),

$$y = X\beta + (I - \rho W)^{-1}\varepsilon \tag{2}$$


where $\rho$ is a scalar parameter that governs spatial dependence, and $W$ is an $n$ by $n$ spatial weight matrix with zeros on the diagonal and nonnegative off-diagonal elements. Any two observations $i$ and $j$ are neighbors if $W_{ij} > 0$ ($i \neq j$), and the elements of $W$ are fixed prior to estimation. Powers of $W$ have a related interpretation: a positive element $ij$ in $W^2$ means that $j$ is a second-order neighbor of observation $i$ (a neighbor to a neighbor). Similar interpretations apply to higher-order powers of $W$. Assuming the inverse exists, $(I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \ldots$, and the variance–covariance matrix of the disturbances, $\Omega$, equals $\sigma^2 (I - \rho W)^{-1}(I - \rho W')^{-1}$. Therefore, the elements of the weight matrix directly affect corresponding elements in the variance–covariance matrix. Note that the spatial error DGP (2) nests the iid DGP (1) as a special case when $\rho = 0$. The canonical estimator for the spatial error DGP, associated with the spatial error model (SEM), appears in Eq. (3).

$$\tilde{\beta} = \left[ X'(I - \tilde{\rho} W)'(I - \tilde{\rho} W) X \right]^{-1} X'(I - \tilde{\rho} W)'(I - \tilde{\rho} W) y. \tag{3}$$
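As a purely illustrative aid, the following Python sketch builds a small weight matrix and checks the properties described for Eq. (2). The ring-shaped neighbor structure, the function names `ring_weights` and `simulate_spatial_error`, and all numerical settings are assumptions made for this example, not choices taken from the paper.

```python
import numpy as np


def ring_weights(n):
    """Hypothetical row-normalized W: each point's neighbors are its two adjacent points on a ring."""
    W = np.zeros((n, n))
    i = np.arange(n)
    W[i, (i + 1) % n] = 0.5
    W[i, (i - 1) % n] = 0.5
    return W


def simulate_spatial_error(X, beta, W, rho, sigma2, rng):
    """Draw y = X beta + (I - rho W)^{-1} eps with iid N(0, sigma2) eps, as in Eq. (2)."""
    n = X.shape[0]
    eps = np.sqrt(sigma2) * rng.standard_normal(n)
    return X @ beta + np.linalg.solve(np.eye(n) - rho * W, eps)


n, rho, sigma2 = 8, 0.6, 0.2  # illustrative values only
W = ring_weights(n)
I = np.eye(n)

# Positive entries in row 0 of W^2 are neighbors of neighbors of observation 0
# (observation 0 itself appears because it is a neighbor of its own neighbors).
second_order = np.flatnonzero((W @ W)[0] > 0)

# The truncated series I + rho W + rho^2 W^2 + ... approaches (I - rho W)^{-1}.
A_inv = np.linalg.inv(I - rho * W)
series = sum(rho**p * np.linalg.matrix_power(W, p) for p in range(200))

# Disturbance covariance: Omega = sigma^2 (I - rho W)^{-1} (I - rho W')^{-1}.
Omega = sigma2 * A_inv @ A_inv.T

rng = np.random.default_rng(0)
X = rng.standard_normal((n, 2))
y = simulate_spatial_error(X, np.array([1.0, -1.0]), W, rho, sigma2, rng)
```

Dense matrix inversion is used here only because the example is tiny; for samples of the size used in applied work, sparse methods would normally be preferable.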

Due to the transformation of the variables implied by the SEM, these models require estimation by maximum likelihood or some other technique to avoid biased estimation of the spatial dependence parameter, $\rho$ (Ord, 1975, p. 121). Assuming the spatial error DGP, consistency of maximum likelihood (Mardia and Marshall, 1984; Lee, 2004) should lead to an estimate $\tilde{\rho}$ close to the true value $\rho$ in large samples. A consequence is that under the spatial error DGP, the maximum likelihood estimates $\tilde{\beta}$ approach the true $\beta$ as $n$ becomes large. Also, under the assumed spatial error DGP, the estimates $\hat{\beta}$ from the OLS model are unbiased (Anselin, 1988, p. 59). Unbiasedness guarantees the weaker property of consistency for OLS estimates.

$$\hat{\beta} = (X'X)^{-1}X'\left[ X\beta + (I - \rho W)^{-1}\varepsilon \right] \tag{4}$$

$$\hat{\beta} = \beta + (X'X)^{-1}X'(I - \rho W)^{-1}\varepsilon \tag{5}$$

$$E(\hat{\beta}) = \beta. \tag{6}$$

Although OLS and SEM estimators under the spatial error DGP should yield estimates that approach $\beta$ for large $n$, the literature contains examples where the estimates do not appear similar. A Hausman test (Hausman, 1978) can be used whenever under the null hypothesis there are two consistent estimators differing in efficiency, and under the alternative hypothesis of misspecification the two estimators yield divergent results. We propose such a test for statistical differences between the OLS and SEM estimates. Given the theoretical results set forth above, a significant difference between the two sets of estimates suggests misspecification. Let $\tilde{\delta} = \hat{\beta} - \tilde{\beta}$ represent the difference between OLS and SEM estimates of the model parameters. Under the null hypothesis of the spatial error DGP, the Hausman test statistic $\tilde{T}$ has the simple form shown in Eq. (7),

$$\tilde{T} = \tilde{\delta}' \left( \tilde{V}_o - \tilde{V}_s \right)^{-1} \tilde{\delta} \tag{7}$$

where $\tilde{V}_o$ is a consistent estimate of the variance–covariance matrix associated with $\hat{\beta}$ from OLS (under the null of the spatial error DGP), and $\tilde{V}_s$ is a consistent estimate of the variance–covariance matrix associated with $\tilde{\beta}$ from the spatial error model. The statistic $\tilde{T}$ follows a Chi-squared distribution with degrees-of-freedom equal to the number of regression parameters under test. See Davidson and MacKinnon (1993, p. 389–395) for a clear exposition of Hausman tests.

The estimated variance–covariance matrix $\tilde{V}_s$ in Eq. (8) has a well-known form (Anselin, 1988).

$$\tilde{V}_s = \tilde{\sigma}^2 \left[ X'(I - \tilde{\rho} W)'(I - \tilde{\rho} W) X \right]^{-1}. \tag{8}$$

However, the usual OLS variance–covariance matrix $\sigma^2 (X'X)^{-1}$ is inconsistent under the null of a spatial error DGP (Cordy and Griffith, 1993, p. 1167–1168). Nonetheless, deriving a consistent estimator of the OLS variance–covariance matrix under a spatial error DGP is straightforward (Cordy and Griffith, 1993, p. 1167). Given the DGP with a known value of $\rho$, the variance of the OLS estimates appears in Eq. (10).

$$\hat{\beta} - E(\hat{\beta}) = (X'X)^{-1}X'(I - \rho W)^{-1}\varepsilon \tag{9}$$

$$V_o = \sigma^2 (X'X)^{-1}X'(I - \rho W)^{-1}(I - \rho W')^{-1}X(X'X)^{-1} \tag{10}$$

$$\tilde{V}_o = \tilde{\sigma}^2 (X'X)^{-1}X'(I - \tilde{\rho} W)^{-1}(I - \tilde{\rho} W')^{-1}X(X'X)^{-1}. \tag{11}$$

Under the assumed null of the spatial error process, the maximum likelihood estimate, $\tilde{\sigma}^2$, based on the variance of the residuals from the SEM provides a consistent estimate of $\sigma^2$. The maximum likelihood estimate $\tilde{\rho}$ provides a consistent estimate of $\rho$. These estimates permit a feasible calculation of the variance of the least squares estimates as in Eq. (11). Along with $\hat{\beta}$, $\tilde{\beta}$, and $\tilde{V}_s$, this completes all the necessary ingredients for computing $\tilde{T}$.

3. Monte Carlo results

Table 1
Empirical versus theoretical sizes and spatial dependence ρ

ρ       Theoretical size
        0.01    0.05    0.10    0.25    0.50
0.30    0.0093  0.0499  0.0984  0.2472  0.5023
0.60    0.0098  0.0521  0.0964  0.2498  0.5067
0.90    0.0130  0.0533  0.1049  0.2576  0.5038
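To make the computation concrete, the following minimal Python sketch assembles the statistic in Eq. (7) from the variance estimates in Eqs. (8) and (11) for a single simulated sample. It rests on several assumptions that are not taken from the paper: a ring-shaped weight matrix, dense linear algebra, a bounded one-dimensional search over the concentrated log-likelihood, and the hypothetical function names `ring_weights`, `sem_ml`, and `spatial_hausman`. The sketch also omits the constant term, because with this particular symmetric ring matrix the OLS and SEM intercepts differ by an exact linear function of the slope differences, which would make the matrix inverted in Eq. (7) singular.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2


def ring_weights(n):
    """Hypothetical row-normalized W: neighbors are the two adjacent points on a ring."""
    W = np.zeros((n, n))
    i = np.arange(n)
    W[i, (i + 1) % n] = 0.5
    W[i, (i - 1) % n] = 0.5
    return W


def sem_ml(y, X, W):
    """SEM maximum likelihood via a bounded search over the concentrated log-likelihood in rho."""
    n = len(y)
    eye = np.eye(n)

    def fit(rho):
        A = eye - rho * W
        ys, Xs = A @ y, A @ X
        beta = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)  # Eq. (3) for a given rho
        resid = ys - Xs @ beta
        return beta, resid @ resid / n

    def neg_loglik(rho):
        _, sigma2 = fit(rho)
        _, logdet = np.linalg.slogdet(eye - rho * W)
        return 0.5 * n * np.log(sigma2) - logdet

    rho = minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded").x
    beta, sigma2 = fit(rho)
    return beta, rho, sigma2


def spatial_hausman(y, X, W):
    """Return the statistic of Eq. (7) and its Chi-squared p-value with k degrees of freedom."""
    n, k = X.shape
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    beta_sem, rho, sigma2 = sem_ml(y, X, W)
    A = np.eye(n) - rho * W
    XtX_inv = np.linalg.inv(X.T @ X)
    B = np.linalg.solve(A.T, X)                          # (I - rho W')^{-1} X
    V_o = sigma2 * XtX_inv @ (B.T @ B) @ XtX_inv         # Eq. (11)
    V_s = sigma2 * np.linalg.inv((A @ X).T @ (A @ X))    # Eq. (8)
    delta = beta_ols - beta_sem
    T = float(delta @ np.linalg.solve(V_o - V_s, delta)) # Eq. (7)
    return T, float(chi2.sf(T, df=k))


# One illustrative draw from the spatial error DGP (settings are arbitrary).
rng = np.random.default_rng(0)
n, k, rho_true = 400, 3, 0.6
W = ring_weights(n)
X = rng.standard_normal((n, k))
eps = np.sqrt(0.2) * rng.standard_normal(n)
y = X @ np.ones(k) + np.linalg.solve(np.eye(n) - rho_true * W, eps)
T, p_value = spatial_hausman(y, X, W)
```

Because the sample is generated under the null, a large `p_value` would be the typical outcome, although any single draw can reject by chance.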

To obtain some idea of the performance of the spatial Hausman test under controlled conditions, we simulated a spatial error process with 3000 observations using five explanatory variables (including a constant term), and a setting of $\sigma^2 = 0.2$. Three cases were considered based on $\rho$ = 0.3, 0.6, and 0.9, corresponding to low, medium, and high levels of spatial dependence. For each case, we simulated 10,000 separate trials of $y$, estimated the SEM via maximum likelihood and the iid model via OLS, and calculated the spatial Hausman test statistic $\tilde{T}$. The empirical size was determined by the proportion of $\tilde{T}$ values from the 10,000 trials that exceeded the Chi-squared critical values at levels of 0.01, 0.05, 0.10, 0.25, and 0.50 for 5 degrees-of-freedom. As shown in Table 1, the empirical sizes conformed closely to the theoretical sizes.

4. Conclusion

Most testing relies on choosing models with better fit. However, the magnitudes of the regression parameter estimates themselves have value, particularly in cases like the one examined here where theoretical results suggest that OLS and SEM should produce similar estimates in large samples. For many fundamentally spatial problems such as those involving real estate data, SEM will almost always yield a significantly higher likelihood than OLS. For a given set of variables, a divergence between the coefficient estimates from SEM and OLS suggests that neither is yielding regression parameter estimates matching the underlying parameters in the DGP. This calls into question the use of either OLS or SEM for that set of variables.


What if a particular specification fails the spatial Hausman test? Since both estimators are unbiased under the null hypothesis, a rejection implies a failure of the orthogonality condition, i.e., the stochastic disturbance is correlated with one or more right-hand-side variables. Since the differences between OLS and the SEM arise in the presence of spatial dependence, enriching the spatial specification for the explanatory variables may reduce the correlation between the stochastic disturbance and the included variables. Often a more general model with separate spatial lags of both the explanatory variables and the dependent variable (which nests the SEM) may be appropriate, as it fits better and has a richer spatial interpretation. However, if theory suggests an error model, adding spatially lagged explanatory variables can reduce the Hausman test statistic below its critical value. Naturally, the results will depend upon the transformations of the explanatory and dependent variables as well as the actual explanatory variables chosen by the investigator. The spatial Hausman test developed here could be easily extended to other models of spatial disturbances such as conditional autoregression, moving average, geostatistical, and the matrix exponential spatial specification (Cordy and Griffith, 1993; Dubin, 1988; LeSage and Pace, 2007).

References

Anselin, Luc, 1988. Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht.
Brasington, David M., 2007. Private schools and the willingness to pay for public schooling. Education Finance and Policy 2, 152–174.
Cohen, Jeffrey P., Coughlin, Cletus C., 2006. Spatial Hedonic models of airport noise, proximity, and housing prices. Working paper 2006-026B, Federal Reserve Bank of St. Louis.
Cordy, Clifford B., Griffith, Daniel A., 1993. Efficiency of least squares estimators in the presence of spatial autocorrelation. Communications in Statistics – Simulation and Computation 22, 1161–1179.
Davidson, Russell, MacKinnon, James, 1993. Estimation and Inference in Econometrics. Oxford University Press, New York.
Dubin, Robin, 1988. Estimation of regression coefficients in the presence of spatially autocorrelated error terms. Review of Economics and Statistics 70, 466–474.
Hausman, J.A., 1978. Specification tests in econometrics. Econometrica 46, 1251–1272.
Lee, L.-F., 2004. Asymptotic distributions of quasi-maximum likelihood estimators for spatial econometric models. Econometrica 72, 1899–1926.
Lee, Ming Long, Pace, R. Kelley, 2005. Spatial distribution of retail sales. Journal of Real Estate Finance and Economics 31, 53–69.
LeSage, James P., Pace, R. Kelley, 2007. A matrix exponential spatial specification. Journal of Econometrics 140, 190–214.
Mardia, K.V., Marshall, R.J., 1984. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71, 135–146.
Neill, Helen R., Hassenzahl, David M., Assane, Djecto D., 2007. Estimating the effect of air quality: spatial versus traditional hedonic price models. Southern Economic Journal 73, 1088–1111.
Ord, Keith, 1975. Estimation methods for models of spatial interaction. Journal of the American Statistical Association 70, 120–126.
Pace, R. Kelley, 1997. Performing large-scale spatial autoregressions. Economics Letters 54, 283–291.
Theebe, Marcel A.J., 2004. Planes, trains, and automobiles: the impact of traffic noise on house prices. Journal of Real Estate Finance and Economics 28, 209–234.
