Chapter 14
Systems of Spatial Equations

14.1 INTRODUCTORY COMMENTS

Although many spatial models consist of a single equation, in many cases some of the variables in those models are determined in a system of equations along with the dependent variable.¹ Such single-equation models are clearly set in a limited information framework. There are, however, many examples of spatial simultaneous equations models, with applications in many fields of economics. In some of these studies, not all of the available information is used. For example, Jeanty et al. (2010) considered a two-equation spatial simultaneous model of population migration and housing price dynamics. However, they did not estimate their model by a systems method. Instead, their equations were estimated by the general spatial 2SLS procedure described in Chapter 2.

Another example of a systems model is given in Mukerji (2009). She considered a three-equation system explaining three endogenous variables: the volatility of GDP growth rates across countries, the growth rate of a country's GDP per capita, and an index describing a country's capital account openness. For each of these three endogenous variables, she specified a structural equation. These equations had additional endogenous variables. Her model was estimated by a "natural" systems generalization of the general spatial 2SLS procedure described in Chapter 2. Her estimation procedure, which is described below, was defined by Kelejian and Prucha (2004) as general spatial 3SLS, henceforth GS3SLS.² This is a systems instrumental variable procedure in that, as will become clear below, it takes into account potential cross-equation correlations of the error terms. At the time, Mukerji's estimation procedure was state-of-the-art.

One important point should be noted here. Since Mukerji's (2009) model had additional endogenous variables, it could not have been estimated by maximum likelihood. As indicated earlier in this book, in general, if a model has more endogenous variables than equations, it cannot be consistently estimated by maximum likelihood. The reason is that the joint distribution of the endogenous variables involves unknown equations. In this regard, we note that many single-equation spatial models have been estimated by maximum likelihood under the assumption that all of the regressors are exogenous. Unfortunately, in many of these cases the exogeneity assumption can easily be called into question!

1. See, e.g., Kelejian and Prucha (2004), Fingleton and Le Gallo (2008), Mukerji (2009), Royuela (2011), Hoogstra (2012), Hoogstra et al. (2011), Lastauskas and Tatsi (2013), Jeanty et al. (2010), and Kelejian and Piras (2014), among others. These studies focus on more than one structural equation, or on one structural equation which involves more than one endogenous variable. In such models the unspecified equations have implications for estimation which will become clear.

2. Because Mukerji's weighting matrix was assumed to be endogenous, her procedure should, perhaps, be viewed as an evident variation of GS3SLS. The GS3SLS is discussed in Section 14.8 below.

Spatial Econometrics. http://dx.doi.org/10.1016/B978-0-12-813387-3.00014-7 Copyright © 2017 Elsevier Inc. All rights reserved.
14.2 AN ILLUSTRATIVE TWO-EQUATIONS MODEL

Before presenting the description and procedures involved in a general G-equations model, we discuss a two-equations model. The extension of the results to the general model is then presented. We consider two specifications of the error term. In the first, the error term is specified nonparametrically. In the second, the error term is specified in terms of a spatial autoregressive model. Estimation details are given for both specifications.
14.3 THE MODEL WITH NONPARAMETRIC ERROR TERMS

Consider the two-equation model

$$
\begin{aligned}
y_1 &= X_1\beta_1 + \rho_{1,11} W_{11} y_1 + \rho_{1,12} W_{12} y_2 + \rho_{1,13}\, y_2 + Y_1\delta_1 + u_1, \\
y_2 &= X_2\beta_2 + \rho_{1,21} W_{21} y_1 + \rho_{1,22} W_{22} y_2 + \rho_{1,23}\, y_1 + Y_2\delta_2 + u_2
\end{aligned}
\tag{14.3.1}
$$

where, for $j = 1, 2$, $y_j$ is an $N \times 1$ vector of observations on the dependent variable in the $j$th equation; $X_j$ is an $N \times k_j$ matrix of observations on $k_j$ exogenous variables that appear in the $j$th equation; $W_{ji}$ is the $N \times N$ weighting matrix in the $j$th equation that relates $y_j$ to $y_i$; $Y_j$ is an $N \times q_j$ matrix of observations on $q_j$ endogenous variables that are in the $j$th equation; $u_j$ is the corresponding $N \times 1$ error term; and $\beta_j$, $\delta_j$, and $\rho_{1,ji}$ (for $i = 1, 2, 3$; $j = 1, 2$) are the corresponding parameters.

Assume that the researcher does not know the equations determining $Y_1$ and $Y_2$. Because of this, and because the error terms will be specified nonparametrically, maximum likelihood cannot be considered. Therefore we will describe IV estimation for the two-equation system in (14.3.1). Of course, the results given below can easily be restricted to the case in which additional endogenous variables are not present.

For future reference, note that since $Y_1$ and $Y_2$ are matrices of endogenous variables, they feed back through a general system to $y_1$ and $y_2$, which are also clearly interrelated. Let $\Phi$ be the $N \times h$, $h \geq \max(q_1, q_2)$, matrix of observations on exogenous variables in that general system which do not appear in (14.3.1). Typically, researchers do not have data on all of the exogenous variables in that general system.

The two-equation system in (14.3.1) can be stacked in two useful ways. The first way illustrates a property of the solution of the two equations for $y_1$ and $y_2$ in terms of $X_1$, $X_2$, $Y_1$, $Y_2$, $u_1$, $u_2$, and the weighting matrices. The second way is useful for estimation. Let

$$
\begin{aligned}
y &= (y_1', y_2')', & X &= \operatorname{diag}_{i=1}^{2}(X_i), & \beta &= (\beta_1', \beta_2')', \\
Y &= \operatorname{diag}_{i=1}^{2}(Y_i), & \delta &= (\delta_1', \delta_2')', & u &= (u_1', u_2')',
\end{aligned}
\tag{14.3.2}
$$

and, in order to describe the system in (14.3.1), let

$$
W = \begin{bmatrix}
\rho_{1,11} W_{11} & \rho_{1,12} W_{12} + \rho_{1,13} I_N \\
\rho_{1,21} W_{21} + \rho_{1,23} I_N & \rho_{1,22} W_{22}
\end{bmatrix}.
\tag{14.3.3}
$$
Given this notation, the first stacked form of (14.3.1), which is useful for describing a property of the solution, is

$$
y = X\beta + Wy + Y\delta + u.
\tag{14.3.4}
$$

For example, assuming $(I_{2N} - W)$ is nonsingular,

$$
y = (I_{2N} - W)^{-1}[X\beta + Y\delta + u].
\tag{14.3.5}
$$
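As a numerical check on (14.3.3)–(14.3.5), the following minimal sketch builds the stacked matrix $W$, solves for $y$, and confirms that the solution satisfies the first structural equation in (14.3.1). Everything specific here is an assumption made for illustration: the ring-shaped weighting matrices, the parameter values, and the helper names `ring_weights` and `stacked_W`. The matrices `Y1` and `Y2` are drawn as exogenous stand-ins, since only the algebra of the solution is being illustrated.

```python
import numpy as np

def ring_weights(N, k):
    """Row-normalized weights: neighbors are the k units on either side of a ring."""
    I = np.eye(N)
    W = sum(np.roll(I, d, axis=1) + np.roll(I, -d, axis=1) for d in range(1, k + 1))
    return W / W.sum(axis=1, keepdims=True)

def stacked_W(W11, W12, W21, W22, rho):
    """Build the 2N x 2N matrix W of (14.3.3); rho[(j, i)] holds rho_{1,ji}."""
    I = np.eye(W11.shape[0])
    return np.block([
        [rho[1, 1] * W11, rho[1, 2] * W12 + rho[1, 3] * I],
        [rho[2, 1] * W21 + rho[2, 3] * I, rho[2, 2] * W22],
    ])

rng = np.random.default_rng(0)
N = 50
W11, W12, W21, W22 = (ring_weights(N, k) for k in (1, 2, 3, 4))
# Hypothetical parameter values, small enough that (I - W) is nonsingular.
rho = {(1, 1): 0.3, (1, 2): 0.2, (1, 3): 0.1,
       (2, 1): 0.2, (2, 2): 0.3, (2, 3): -0.1}
X1, X2 = rng.normal(size=(N, 2)), rng.normal(size=(N, 3))
Y1, Y2 = rng.normal(size=(N, 1)), rng.normal(size=(N, 1))  # stand-ins only
beta1, beta2 = np.array([1.0, -1.0]), np.array([0.5, 0.5, 2.0])
delta1, delta2 = np.array([0.7]), np.array([-0.4])
u1, u2 = rng.normal(size=N), rng.normal(size=N)

W = stacked_W(W11, W12, W21, W22, rho)
rhs = np.concatenate([X1 @ beta1 + Y1 @ delta1 + u1,
                      X2 @ beta2 + Y2 @ delta2 + u2])
y = np.linalg.solve(np.eye(2 * N) - W, rhs)   # solution (14.3.5)
y1, y2 = y[:N], y[N:]

# Right-hand side of the first equation in (14.3.1), evaluated at the solution.
eq1_rhs = (X1 @ beta1 + rho[1, 1] * W11 @ y1 + rho[1, 2] * W12 @ y2
           + rho[1, 3] * y2 + Y1 @ delta1 + u1)
eq2_rhs = (X2 @ beta2 + rho[2, 1] * W21 @ y1 + rho[2, 2] * W22 @ y2
           + rho[2, 3] * y1 + Y2 @ delta2 + u2)
```

Because every block of $(I_{2N} - W)^{-1}$ is generally nonzero, each of `y1` and `y2` depends on all of the right-hand-side variables of both equations.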
The result in (14.3.5), together with the definition of $W$ in (14.3.3), implies that both $y_1$ and $y_2$ depend upon $X_1$, $X_2$, $Y_1$, and $Y_2$, as well as interactions of these terms with all four of the weighting matrices. Recall that $\Phi$ is the matrix of observed exogenous variables in the system determining $Y_1$ and $Y_2$. The implication of (14.3.5) is then that if the two-equation system is to be estimated by instrumental variables, the instruments should include the matrices $X_1$, $X_2$, and $\Phi$, as well as products of these matrices with all four of the weighting matrices and, perhaps, their squares. The instrument matrix, say $H_*$, for both equations could then be taken as

$$
\begin{aligned}
H_* = [\, & X_1, X_2, \Phi,\; W_{11}X_1, W_{12}X_1, W_{21}X_1, W_{22}X_1, \\
& W_{11}X_2, W_{12}X_2, W_{21}X_2, W_{22}X_2,\; W_{11}\Phi, W_{12}\Phi, W_{21}\Phi, W_{22}\Phi \,]_{LI}
\end{aligned}
\tag{14.3.6}
$$

where, for ease of notation, the squares of the weighting matrices are not indicated and, again, $LI$ denotes the linearly independent columns of the corresponding matrix. Assume, quite reasonably, that the number of linearly independent columns in $H_*$ is at least as large as the number of parameters in each of the two equations. Finally, let $H$ be the stacked instrument matrix,

$$
H = \operatorname{diag}_{i=1}^{2}(H_*).
\tag{14.3.7}
$$
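The construction in (14.3.6)–(14.3.7) can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the helper `li_columns` (a simple greedy rank check), the ring-shaped weighting matrices, and the simulated data, including the array `Phi` standing for the observed exogenous variables from the wider system, are all hypothetical. With row-normalized weights, the spatial lag of a constant is again a constant, so the $LI$ step removes those redundant columns.

```python
import numpy as np

def ring_weights(N, k):
    """Row-normalized weights: neighbors are the k units on either side of a ring."""
    I = np.eye(N)
    W = sum(np.roll(I, d, axis=1) + np.roll(I, -d, axis=1) for d in range(1, k + 1))
    return W / W.sum(axis=1, keepdims=True)

def li_columns(A, tol=1e-8):
    """Keep each column that is linearly independent of the columns kept so far."""
    keep = []
    for j in range(A.shape[1]):
        if np.linalg.matrix_rank(A[:, keep + [j]], tol=tol) == len(keep) + 1:
            keep.append(j)
    return A[:, keep]

rng = np.random.default_rng(1)
N = 60
const = np.ones((N, 1))
X1 = np.hstack([const, rng.normal(size=(N, 1))])   # exogenous, equation 1
X2 = np.hstack([const, rng.normal(size=(N, 2))])   # exogenous, equation 2
Phi = rng.normal(size=(N, 2))                      # exogenous, wider system
Ws = [ring_weights(N, k) for k in (1, 2, 3, 4)]    # W11, W12, W21, W22

base = [X1, X2, Phi]
# Ordered as in (14.3.6): all four lags of X1, then of X2, then of Phi.
lags = [W @ B for B in base for W in Ws]
raw = np.hstack(base + lags)

H_star = li_columns(raw)              # (14.3.6)
H = np.kron(np.eye(2), H_star)        # block diagonal, (14.3.7)
```

The greedy rank check is quadratic in the number of candidate columns, which is acceptable for a sketch; a pivoted QR decomposition would be the usual choice for larger instrument sets.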
Now consider the second way of rewriting the model by stacking the two equations. In particular, let

$$
\begin{aligned}
y_j &= Z_j\gamma_j + u_j, \\
Z_j &= (X_j,\; W_{jj}y_j,\; W_{ji}y_i,\; y_i,\; Y_j), \\
\gamma_j &= (\beta_j',\; \rho_{1,jj},\; \rho_{1,ji},\; \rho_{1,j3},\; \delta_j')', \qquad j, i = 1, 2;\; j \neq i,
\end{aligned}
\tag{14.3.8}
$$

and correspondingly let

$$
Z = \operatorname{diag}_{i=1}^{2}(Z_i), \qquad \gamma = (\gamma_1', \gamma_2')'.
\tag{14.3.9}
$$

Given this notation, the second stacked version of the model, which is useful for estimation, is

$$
y = Z\gamma + u.
\tag{14.3.10}
$$
The Error Term

For the model in (14.3.10), assume the error term $u$, as defined in (14.3.2), is

$$
u = R\varepsilon
\tag{14.3.11}
$$

where $R$ is an unknown $2N \times 2N$ matrix, and $\varepsilon$ is a $2N \times 1$ vector of innovations.
14.4 ASSUMPTIONS OF THE MODEL

Assumption 14.1. (a) The elements of $\varepsilon$ are i.i.d. with mean and variance of 0 and 1, respectively, and finite fourth moment.³ (b) The row and column sums of the $2N \times 2N$ unknown exogenous matrix $R$ are uniformly bounded in absolute value. (c) $R$ is nonsingular.

Assumption 14.2. For $N$ large enough, $X_1$ and $X_2$, and therefore $X$, have full column rank.

Assumption 14.3. (a) $|\rho_{1,ij}| < 1.0$, $i = 1, 2$; $j = 1, 2, 3$. (b) The main diagonal elements of $W$ are all zero. (c) The matrix $(I_{2N} - W)$ is nonsingular for all $|\rho_{1,ij}| < 1.0$. In addition, for all values of $\rho_{1,ij}$ stated in part (a), the row and column sums of $W$ and $(I_{2N} - W)^{-1}$ are uniformly bounded in absolute value.

Assumption 14.4. The elements of the matrix of instruments $H$ in (14.3.7) are uniformly bounded in absolute value. In addition, $H$ has full column rank for $N$ large enough.

Assumption 14.5.

$$
\begin{aligned}
\text{(A)}\quad & \lim_{N\to\infty} N^{-1} H'H = Q_{HH}, \\
\text{(B)}\quad & \lim_{N\to\infty} N^{-1} H'RR'H = Q_{HRRH}, \\
\text{(C)}\quad & \operatorname*{plim}_{N\to\infty} N^{-1} H'Z = Q_{HZ}
\end{aligned}
\tag{14.4.1}
$$

where $Q_{HH}$, $Q_{HRRH}$, and $Q_{HZ}$ have full column rank.

3. This specification does not allow for triangular arrays. It is given here because it is intuitive. For a formal statement which allows for triangular arrays, see Section A.15 in the appendix on large sample theory.
14.5 INTERPRETATION OF THE ASSUMPTIONS

The above five assumptions are reasonable. Parts (a) and (b) of Assumption 14.1 allow the error terms to be both spatially correlated and heteroskedastic in an unknown fashion. Parts (a)–(c) of Assumption 14.1, along with (14.3.11), imply that $E(uu') = RR'$ and that $RR'$ is nonsingular, so that no error term is a linear combination of the other error terms. Assumption 14.2 just rules out perfect multicollinearity. This is an underlying assumption in virtually every model. Among other things, Assumption 14.3 ensures that the two-equation system is complete in the sense that the two specified equations can be solved for the two dependent variables being explained. The force of Assumption 14.4 is that the instruments are uniformly bounded, and $H$ does not contain redundancies. Assumption 14.5 is standard in most large sample analyses. Among other things, it rules out "peculiar" sequences of variables. An example of a variable that would violate Assumption 14.5 would be $x_i$ whose values are:

$$
1,\; 2, 2,\; 3, 3, 3,\; 1, 1, 1, 1,\; 2, 2, 2, 2, 2,\; 3, 3, 3, 3, 3, 3,\; \ldots .
\tag{14.5.1}
$$
14.6 ESTIMATION AND INFERENCE

The form of the model to be estimated is (14.3.10) and (14.3.11). Let $\hat{Z} = H(H'H)^{-1}H'Z$, where $H$ is defined in (14.3.7). Since the error term is nonparametrically specified as in (14.3.11), variations on a GLS procedure cannot be considered. Thus, the estimator of $\gamma$ in (14.3.10) is just the 2SLS estimator, namely

$$
\hat{\gamma} = (\hat{Z}'\hat{Z})^{-1}\hat{Z}'y.
\tag{14.6.1}
$$

Since $\hat{Z}'Z = \hat{Z}'\hat{Z}$, we leave it as an exercise to show that, given (14.3.10) and (14.3.11),

$$
\begin{aligned}
N^{1/2}(\hat{\gamma} - \gamma) &= N^{1/2}(\hat{Z}'\hat{Z})^{-1}\hat{Z}'R\varepsilon \\
&= [(N^{-1}\hat{Z}'\hat{Z})^{-1}]\,[N^{-1}Z'H]\,[(N^{-1}H'H)^{-1}]\,N^{-1/2}H'R\varepsilon.
\end{aligned}
\tag{14.6.2}
$$

It is also left as an exercise to show that, given Assumptions 14.1–14.5,

$$
\begin{aligned}
N^{1/2}(\hat{\gamma} - \gamma) &\xrightarrow{D} N(0,\; F\,Q_{HRRH}\,F'), \\
F &= (Q_{HZ}'\,Q_{HH}^{-1}\,Q_{HZ})^{-1}\,Q_{HZ}'\,Q_{HH}^{-1}.
\end{aligned}
\tag{14.6.3}
$$

Inference would be based on the finite sample approximation

$$
\hat{\gamma} \simeq N(\gamma,\; N^{-1}\hat{F}\,\hat{Q}_{HRRH}\,\hat{F}')
\tag{14.6.4}
$$

where

$$
\hat{F} = (\hat{Z}'\hat{Z})^{-1}\,Z'H\,(N^{-1}H'H)^{-1}
$$

and $\hat{Q}_{HRRH}$ is the HAC estimator of $Q_{HRRH}$.

Note that since $\gamma = (\gamma_1', \gamma_2')'$, the result in (14.6.4) can be used not only to test hypotheses about $\gamma_1$ or $\gamma_2$, but also hypotheses relating to possible cross-equation restrictions. For example, suppose $\gamma_1$ is $3 \times 1$ and $\gamma_2$ is $5 \times 1$. Suppose the researcher wants to test the hypothesis, say $H_0$, that the second element of $\gamma_1$ is the same as the sum of the third and fourth elements of $\gamma_2$. This hypothesis can be expressed as

$$
H_0: S_1\gamma_1 - S_2\gamma_2 = 0
\tag{14.6.5}
$$

where $S_1 = (0, 1, 0)$ and $S_2 = (0, 0, 1, 1, 0)$. The same hypothesis can also be expressed as

$$
H_0: [S_1, -S_2]\begin{bmatrix}\gamma_1 \\ \gamma_2\end{bmatrix} = 0.
\tag{14.6.6}
$$

Let $S = [S_1, -S_2]$ and $\hat{\gamma} = (\hat{\gamma}_1', \hat{\gamma}_2')'$. Then, under $H_0$, inferences would be based on

$$
\begin{aligned}
S\hat{\gamma} &\simeq N(0,\; VC_{S\hat{\gamma}}), \\
VC_{S\hat{\gamma}} &= S\,[N^{-1}\hat{F}\,\hat{Q}_{HRRH}\,\hat{F}']\,S'.
\end{aligned}
\tag{14.6.7}
$$

Thus, $H_0$ would be rejected at the 5% level if

$$
(S\hat{\gamma})'\,[VC_{S\hat{\gamma}}]^{-1}\,(S\hat{\gamma}) > \chi^2_1(0.95).
\tag{14.6.8}
$$
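The estimator (14.6.1), the variance approximation (14.6.4), and the Wald test in (14.6.8) can be sketched numerically as follows. The sketch rests on several assumptions that are not part of the text: the function names, the simulated design (no spatial lags, one endogenous regressor per equation), and, most importantly, the estimate of $Q_{HRRH}$. Because the simulated errors are independent across observations and correlated only across equations, the sketch uses $N^{-1}(\hat{\Sigma} \otimes H_*'H_*)$ as a stand-in; with real spatial data the HAC estimator referred to in the text would be used instead.

```python
import numpy as np

def stacked_2sls(y, Z, H):
    """2SLS for the stacked system y = Z gamma + u, as in (14.6.1)."""
    Z_hat = H @ np.linalg.solve(H.T @ H, H.T @ Z)   # projection of Z on H
    gamma_hat = np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y)
    return gamma_hat, Z_hat

def vc_2sls(Z, Z_hat, H, Q_hat, N):
    """N^{-1} F_hat Q_hat F_hat' from (14.6.4); Q_hat is supplied by the caller."""
    F_hat = np.linalg.solve(Z_hat.T @ Z_hat, Z.T @ H) @ np.linalg.inv(H.T @ H / N)
    return F_hat @ Q_hat @ F_hat.T / N

def wald_stat(S, gamma_hat, VC):
    """Quadratic form on the left-hand side of (14.6.8)."""
    S = np.atleast_2d(S)
    r = S @ gamma_hat
    return float(r @ np.linalg.solve(S @ VC @ S.T, r))

rng = np.random.default_rng(2)
N = 2000
H_star = np.hstack([np.ones((N, 1)), rng.normal(size=(N, 6))])
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
U = rng.multivariate_normal(np.zeros(2), Sigma, size=N)
u1, u2 = U[:, 0], U[:, 1]
# Endogenous regressors: driven by the instruments and by the own-equation error.
q1 = H_star[:, 1:].sum(axis=1) + 0.5 * u1 + rng.normal(size=N)
q2 = H_star[:, 1:].sum(axis=1) + 0.5 * u2 + rng.normal(size=N)
Z1 = np.column_stack([H_star[:, :2], q1])              # gamma_1 is 3 x 1
Z2 = np.column_stack([H_star[:, [0, 2, 3, 4]], q2])    # gamma_2 is 5 x 1
gamma1 = np.array([1.0, 0.5, 2.0])
gamma2 = np.array([0.0, 1.0, 0.3, 0.2, -1.0])          # H0 holds: 0.5 = 0.3 + 0.2
y1, y2 = Z1 @ gamma1 + u1, Z2 @ gamma2 + u2

y = np.concatenate([y1, y2])
Z = np.block([[Z1, np.zeros((N, 5))], [np.zeros((N, 3)), Z2]])
H = np.kron(np.eye(2), H_star)
gamma_hat, Z_hat = stacked_2sls(y, Z, H)

# Stand-in for the HAC estimate (valid only for this simulated design).
resid = (y - Z @ gamma_hat).reshape(2, N)
Sigma_hat = resid @ resid.T / N
Q_hat = np.kron(Sigma_hat, H_star.T @ H_star) / N
VC = vc_2sls(Z, Z_hat, H, Q_hat, N)

S = np.array([0, 1, 0, 0, 0, -1, -1, 0], dtype=float)  # [S1, -S2]
W_stat = wald_stat(S, gamma_hat, VC)
reject = W_stat > 3.841   # 0.95 quantile of chi-square with 1 degree of freedom
```

Passing `Q_hat` in as an argument keeps the variance formula separate from the choice of estimator for $Q_{HRRH}$, so a HAC estimate can be substituted without changing the rest of the code.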
Illustration 14.6.1: Volatility and financial development

In this example we use the data set from Mukerji (2009) to examine whether the impact of capital account convertibility on the volatility of economic growth depends on financial development. Mukerji (2009) estimated a system of three simultaneous equations. The dependent variables in these three equations are growth volatility, economic growth, and financial development. The model specified in the original contribution was based on a panel of countries and four time periods, with each time period corresponding to a decade (1960–1969, 1970–1979, 1980–1989, and 1990–1999). The panel was unbalanced and therefore the estimation used in the original paper simply pooled the data. Dummies for the decades were included to account for this. The system in matrix notation is described by the following three equations:

$$
\begin{aligned}
V &= \alpha_0 + \alpha_1 K + \alpha_2 F + \alpha_3 (F \times K) + \alpha_4 G + \alpha_5 WV + X_1\alpha_6 + \varepsilon_1, \\
G &= \beta_0 + \beta_1 K + \beta_2 F + \beta_3 V + \beta_4 WG + \beta_5 WV + X_2\beta_6 + \varepsilon_2, \\
F &= \gamma_0 + \gamma_1 F_{-1} + \gamma_2 K + \gamma_3 G + X_3\gamma_4 + \varepsilon_3
\end{aligned}
$$

where $V$ is a vector of observations on the growth volatilities of countries; $G$ is a vector of observations on the mean annual growth rates of real gross domestic product per capita; $F$ is a vector of observations on the average financial development of countries; $F_{-1}$ is the time lag of $F$; $K$ is a vector of indices of capital account openness of those countries; $F \times K$ is the interaction of $F$ and $K$ (the row "Capital account openness × FD" in Table 14.6.1); $W$ is a spatial weighting matrix⁴; $X_1$, $X_2$, and $X_3$ are matrices of exogenous variables specific to each of the equations; and $\varepsilon_1$, $\varepsilon_2$, and $\varepsilon_3$ are vectors of spatially correlated error terms (see Mukerji (2009) for more details). In the original contribution, the instruments used were a dummy variable for diversified exporters, the ratio of initial investment to gross domestic product, initial secondary school enrollment, the average long term growth of trade partners, the average dispersion of growth rates among sectors, the time lag of financial development, $X_1$, $X_2$, $X_3$, and the spatial lags of all of the exogenous variables. Because the endogenous variables were assumed to be simultaneously determined, the disturbances in the three equations were assumed to be correlated. Mukerji (2009) used a GS3SLS procedure to consistently estimate the parameters of the model.

Mukerji's results are reported in the first three columns of Table 14.6.1.⁵ As a point of comparison, the last three columns are obtained by using the simultaneous estimation procedure with nonparametric error terms presented in Section 14.6. Clearly, we expect to see differences both in the estimated coefficients and in their standard errors. The reason is that the original system was estimated by GS3SLS, while the estimation procedure described in Section 14.6 is based on 2SLS. A glance at Table 14.6.1 reveals that very few of the variables in the volatility and growth equations show substantial differences between the two methods. For the financial development equation the differences are somewhat larger, but the overall statistical significance is unchanged.

4. The weights of the spatial weighting matrix $W$ consisted of average trade shares of the top 20 trading partners of each country in the sample. In this context $W$ was then treated as endogenous and a variant of a gravity model was used to instrument the trade shares in the weighting matrix (for additional details, see Mukerji, 2009).
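14.7 THE MODEL WITH SAR ERROR TERMS

In this section we assume that the error terms for the model in (14.3.1) and (14.3.8) are determined as

$$
u_j = \rho_{2,j} M_j u_j + v_j, \qquad |\rho_{2,j}| < 1, \quad j = 1, 2
\tag{14.7.1}
$$

where $M_j$ is an observed $N \times N$ exogenous weighting matrix, and $v_j$ is an $N \times 1$ vector of innovations. Again let $u = (u_1', u_2')'$, and let $v = (v_1', v_2')'$ be a random $2N \times 1$ vector. Then, the two-equation error term system in (14.7.1) can be expressed as

$$
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
=
\begin{bmatrix} \rho_{2,1} M_1 & 0 \\ 0 & \rho_{2,2} M_2 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
+
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix},
\tag{14.7.2}
$$

or

$$
u =
\begin{bmatrix} \rho_{2,1} I_N & 0 \\ 0 & \rho_{2,2} I_N \end{bmatrix}
\begin{bmatrix} M_1 & 0 \\ 0 & M_2 \end{bmatrix}
u + v.
$$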
5. A comparison between the results in Table 14.6.1 and those of Mukerji’s original contribution reveals some minor differences. Those differences are due to the fact that the equation for financial development was estimated from a slightly different number of observations. This was due to missing values in some of the variables used to estimate the equations for volatility and growth. To keep things simple, we reestimated the model over the same number of observations for the three equations. The results reported in Table 14.6.1 reflect those changes.
TABLE 14.6.1 Results of a simultaneous equation model for volatility, growth, and financial development

| Variable | Growth (GS3SLS) | Volatility (GS3SLS) | FD (GS3SLS) | Growth (2SLS) | Volatility (2SLS) | FD (2SLS) |
|---|---|---|---|---|---|---|
| Constant | −3.219 (3.045) | 8.285 (3.229) | −64.507 (22.773) | −3.711 (2.900) | 8.058 (3.819) | −59.239 (21.459) |
| Initial GDP pc/1000 | 0.000 (0.000) | 0.000 (0.000) | 0.001 (0.000) | 0.000 (0.000) | 0.000 (0.000) | 0.001 (0.000) |
| Initial inflation | 0.000 (0.001) | −0.001 (0.001) | 0.001 (0.004) | 0.000 (0.000) | −0.001 (0.001) | 0.000 (0.002) |
| Initial trade share of GDP | 0.000 (0.006) | −0.004 (0.007) | 0.054 (0.047) | 0.001 (0.007) | −0.004 (0.006) | 0.069 (0.059) |
| Democracy | −1.183 (0.598) | −1.363 (0.619) | −3.236 (4.446) | −1.158 (0.537) | −1.382 (0.682) | −4.775 (3.631) |
| Black market premium | −0.272 (0.129) | 0.177 (0.160) | −0.211 (1.095) | −0.287 (0.586) | 0.166 (0.503) | −0.510 (0.916) |
| Revolutions | −0.549 (0.496) | 0.843 (0.555) | −0.007 (3.916) | −0.558 (0.586) | 0.790 (0.503) | −0.252 (3.732) |
| Gini | −0.024 (0.018) | 0.035 (0.021) | 0.385 (0.138) | −0.024 (0.020) | 0.036 (0.020) | 0.331 (0.149) |
| Rate of change of trade | 0.203 (0.052) | 0.049 (0.066) | −0.486 (0.473) | 0.203 (0.064) | 0.047 (0.151) | −0.305 (0.463) |
| SD rate of change of trade | −0.001 (0.017) | 0.045 (0.019) | −0.142 (0.132) | −0.003 (0.020) | 0.044 (0.049) | −0.135 (0.135) |
| Log population | 0.306 (0.147) | −0.413 (0.169) | 2.209 (1.182) | 0.340 (0.133) | −0.403 (0.194) | 2.435 (1.109) |
| SD inflation | −0.001 (0.001) | 0.003 (0.001) | 0.002 (0.010) | −0.001 (0.001) | 0.003 (0.001) | 0.002 (0.005) |
| Initial investment | 0.076 (0.026) | | | 0.074 (0.026) | | |
| Initial secondary education | 0.012 (0.008) | | | 0.010 (0.009) | | |
−0.475 (0.304)
Diversified exporter Lagged financial development
0.904 (0.065) 2.503 (1.114)
Dispersion of sectoral growth rate Volatility
−0.056 (0.141)
Growth FD Capital account openness
0.005 (0.009) −0.187 (0.374)
Capital account openness × FD WG WV
−0.412 (0.271)
0.769 (0.297) −0.206 (0.243)
0.931 (0.077) 1.529 (1.076) −0.048 (0.157)
0.008 (0.161) 0.035 (0.013) 2.445 (1.238) −0.050 (0.023)
−0.278 (0.234)
3.998 (1.018)
6.611 (3.090)
0.003 (0.008) −0.184 (0.351)
0.753 (0.281) −0.149 (0.198)
0.028 (0.164) 0.030 (0.016) 2.237 (1.102) −0.046 (0.023)
−0.242 (0.226)
3.071 (1.230)
6.094 (3.354)
The random vector $v$ accounts for cross-equation correlation. Specifically, let

$$
v = (\Sigma_*' \otimes I_N)\,\zeta, \qquad
\Sigma_*'\Sigma_* = \Sigma, \qquad
\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
\tag{14.7.3}
$$

where $\zeta$ is a $2N \times 1$ random vector. For this error specification assume the following:

Assumption 14.6. The elements of $\zeta$ are i.i.d. with mean and variance of 0 and 1, respectively, and finite fourth moment.⁶

Assumption 14.7. (a) $|\rho_{2,j}| < 1$, $j = 1, 2$. (b) $|\sigma_{ii}| < c$, where $c$ is a finite constant, $i = 1, 2$.

Assumption 14.8. (a) The diagonal elements of $M_j$ are all zero, $j = 1, 2$. (b) The row and column sums of $M_j$ and $(I_N - aM_j)^{-1}$ are uniformly bounded in absolute value for all $|a| < 1$, $j = 1, 2$.

Note that Assumption 14.6 and (14.7.3) imply

$$
E(vv') = \Sigma_v = (\Sigma \otimes I_N)
\tag{14.7.4}
$$

or, in more standard notation, $E(v_i v_j') = \sigma_{ij} I_N$, $i, j = 1, 2$.
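The error process of this section can be sketched numerically: draw i.i.d. innovations, impose the cross-equation covariance through a Kronecker product as in (14.7.3), and then generate the SAR errors of (14.7.1). The values of the covariance matrix and of the autoregressive parameters, the ring-shaped weighting matrix, and all variable names are assumptions made for this illustration. The factor of the covariance matrix is taken from a Cholesky decomposition, which is one convenient choice among many.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
I_N = np.eye(N)

# Hypothetical cross-equation covariance matrix and a factor with
# Sigma = Sigma_star' Sigma_star (transpose of the lower Cholesky factor).
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])
Sigma_star = np.linalg.cholesky(Sigma).T

innov = rng.normal(size=2 * N)                    # i.i.d. (0, 1) innovations
v = np.kron(Sigma_star.T, I_N) @ innov            # (14.7.3)
v1, v2 = v[:N], v[N:]

# Implied covariance of v: (Sigma_star' x I)(Sigma_star x I) = Sigma x I_N.
VC_v = np.kron(Sigma_star.T, I_N) @ np.kron(Sigma_star, I_N)

# SAR errors (14.7.1) with a row-normalized ring weighting matrix (zero diagonal).
M = (np.roll(I_N, 1, axis=1) + np.roll(I_N, -1, axis=1)) / 2
rho2 = (0.4, -0.3)
u1 = np.linalg.solve(I_N - rho2[0] * M, v1)
u2 = np.linalg.solve(I_N - rho2[1] * M, v2)
```

The same innovation draws feed both equations only through the $2 \times 2$ factor, so observations `i` in the two equations are correlated while different observations of `v` are not.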
6. This specification does not allow for triangular arrays. It is given because it is intuitive. For a formal statement which allows for triangular arrays, see Section A.15 in the appendix on large sample theory.

14.8 ESTIMATION AND INFERENCE: GS3SLS

An Illustration

The model in (14.3.8) and (14.3.10), with error terms defined by (14.7.1)–(14.7.3), has additional endogenous regressors and so it does not lend itself to maximum likelihood estimation. Again, an IV procedure will be used. To illustrate the nature of the estimation procedure, assume for the moment that the parameters in the error term process are known. Let

$$
\begin{aligned}
y_j(\rho_{2,j}) &= (I_N - \rho_{2,j} M_j)\,y_j, \quad j = 1, 2, \\
Z_j(\rho_{2,j}) &= (I_N - \rho_{2,j} M_j)\,Z_j, \quad j = 1, 2, \\
y(\rho_2) &= (y_1(\rho_{2,1})',\; y_2(\rho_{2,2})')', \\
Z(\rho_2) &= \operatorname{diag}_{j=1}^{2}[Z_j(\rho_{2,j})].
\end{aligned}
\tag{14.8.1}
$$

Then the model in (14.3.10), with error terms in (14.7.1)–(14.7.3), can be written as

$$
\begin{aligned}
y(\rho_2) &= Z(\rho_2)\gamma + v, \\
E(vv') &= (\Sigma \otimes I_N).
\end{aligned}
\tag{14.8.2}
$$

The estimator of $\gamma$ in (14.8.2) should account for the cross-equation correlation of the error terms. Let $A$ be a $2 \times 2$ matrix that diagonalizes $\Sigma$, i.e., $A'\Sigma A = I_2$, so that $\Sigma^{-1} = AA'$. Let $P = (A \otimes I_N)$. Then

$$
\begin{aligned}
P'(\Sigma \otimes I_N)P &= (A'\Sigma A \otimes I_N) \\
&= I_{2N}.
\end{aligned}
\tag{14.8.3}
$$

For future reference, note that an implication of (14.8.3) is

$$
\begin{aligned}
PP' &= (A \otimes I_N)(A' \otimes I_N) \\
&= (\Sigma^{-1} \otimes I_N).
\end{aligned}
\tag{14.8.4}
$$

Multiplying the first line of (14.8.2) across by $P'$ yields

$$
\begin{aligned}
P'y(\rho_2) &= P'Z(\rho_2)\gamma + \xi, \\
\xi &= P'v
\end{aligned}
\tag{14.8.5}
$$

where $E(\xi) = 0$ and $E(\xi\xi') = I_{2N}$. The ideal set of instruments for the system in (14.8.5) is

$$
E[P'Z(\rho_2)] = P'E[Z(\rho_2)].
\tag{14.8.6}
$$
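The algebra in (14.8.3) and (14.8.4) is easy to confirm numerically. In the minimal sketch below, the covariance matrix is a hypothetical example, and $A$ is taken to be the lower Cholesky factor of its inverse, which is one matrix satisfying $\Sigma^{-1} = AA'$; any other such factor would serve equally well.

```python
import numpy as np

N = 5
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])     # hypothetical covariance matrix
Sigma_inv = np.linalg.inv(Sigma)

# Lower Cholesky factor of the inverse, so that Sigma^{-1} = A A'.
A = np.linalg.cholesky(Sigma_inv)
P = np.kron(A, np.eye(N))

whitened = P.T @ np.kron(Sigma, np.eye(N)) @ P   # left side of (14.8.3)
PPt = P @ P.T                                    # left side of (14.8.4)
```

Here `whitened` equals the $2N \times 2N$ identity matrix and `PPt` equals $\Sigma^{-1} \otimes I_N$, up to floating-point error.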
In general, unless the entire system is linear, and all of the model equations determining the additional endogenous regressors, namely $Y_1$ and $Y_2$, are known, an exact expression for $E[Z(\rho_2)]$ will not be available. However, $E[Z(\rho_2)]$ can be approximated in terms of the available data on the instruments. Specifically, the instrument matrix for both equations can be taken to be $H_\&$, where

$$
H_\& = (H_*,\; M_1 H_*,\; M_2 H_*)
\tag{14.8.7}
$$

with $M_1$ and $M_2$ being defined in (14.7.2) and $H_*$ defined in (14.3.6). Let $H_\#$ be the block diagonal matrix of instruments

$$
H_\# = \operatorname{diag}_{j=1}^{2}(H_\&).
\tag{14.8.8}
$$
Given $H_\#$, the evident approximation to $P'E[Z(\rho_2)]$ in (14.8.6) is

$$
\begin{aligned}
\widehat{P'E[Z(\rho_2)]} &= P'H_\#(H_\#'H_\#)^{-1}H_\#'Z(\rho_2) \\
&= P'\hat{Z}(\rho_2)
\end{aligned}
\tag{14.8.9}
$$

where $\hat{Z}(\rho_2) = H_\#(H_\#'H_\#)^{-1}H_\#'Z(\rho_2)$. Given (14.8.9) and (14.8.4), the evident estimator of $\gamma$ in (14.8.5) can be expressed as

$$
\begin{aligned}
\hat{\gamma} &= [\hat{Z}(\rho_2)'PP'\hat{Z}(\rho_2)]^{-1}\,\hat{Z}(\rho_2)'PP'\,y(\rho_2) \\
&= [\hat{Z}(\rho_2)'[\Sigma^{-1} \otimes I_N]\hat{Z}(\rho_2)]^{-1}\,\hat{Z}(\rho_2)'[\Sigma^{-1} \otimes I_N]\,y(\rho_2).
\end{aligned}
\tag{14.8.10}
$$

General Feasible Estimation: Extension to a G-Equation System

The estimator in (14.8.10) is not feasible because the parameters of the error process are not known. It also relates to only a two-equation system. However, given its development, the feasible counterpart which relates to a G-equation system should be evident. To simplify notation in generalizing to G equations, suppose the regressor matrix in the $j$th equation is

$$
Z_j = (X_j,\; W_{j*}y_{j*},\; y_{j\bullet},\; Y_j), \qquad j = 1, \ldots, G
\tag{14.8.11}
$$
where $W_{j*}y_{j*}$ is "shorthand" for the list of spatially lagged dependent variables that appear in the $j$th equation, and $y_{j\bullet}$ is "shorthand" for the list of dependent variables that appear in the $j$th equation (where $y_j$, of course, is excluded). For example, if the only spatially lagged dependent variables in the $j$th equation are $y_j$, $y_1$, and $y_5$, then $W_{j*}y_{j*} = (W_{jj}y_j,\; W_{j1}y_1,\; W_{j5}y_5)$. Given this notation, and generalizing the above notation in an obvious way, the G-equation system is

$$
\begin{aligned}
y_j &= Z_j\gamma_j + u_j, \qquad j = 1, \ldots, G, \\
u_j &= \rho_{2,j}M_j u_j + v_j, \\
E(v_j) &= 0, \quad E(v_j v_j') = \sigma_{jj}I_N, \quad E(v_j v_i') = \sigma_{ji}I_N.
\end{aligned}
\tag{14.8.12}
$$

Now let $\Phi$ be the matrix of observations on the known exogenous variables that are in this general G-equation system, but not in (14.8.12), which generate the additional endogenous variables $Y_1, \ldots, Y_G$; see (14.8.11). Similarly, let the instrument matrix $H_*$ as defined in (14.3.6) be extended to include all of the known and observed exogenous variables, as well as the products of these with all of the weighting matrices. In order to account for the SAR transformation of the error terms described below, we extend this generalized G-equation systems version of $H_*$ to $H_{**}$:

$$
H_{**} = (H_*,\; M_1H_*,\; M_2H_*,\; \ldots,\; M_GH_*)_{LI}.
\tag{14.8.13}
$$

Finally, let the instrument matrix be

$$
H_{IV} = \operatorname{diag}_{j=1}^{G}(H_{**}).
\tag{14.8.14}
$$
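Given this notation, for the G-equation system in (14.8.12), the calculation of the feasible generalized counterpart to $\hat{\gamma}$ in (14.8.10) is as follows:

Step 1. Estimate the $j$th equation by 2SLS to get $\hat{\gamma}_j$, using the instrument matrix $H_{**}$, the diagonal block of $H_{IV}$ as defined in (14.8.14).

Step 2. Obtain the estimated residuals $\hat{u}_j = y_j - Z_j\hat{\gamma}_j$.

Step 3. Use $\hat{u}_j$ and the GMM procedure described in Chapter 2 to obtain $\hat{\rho}_{2,j}$.

Step 4. Use $\hat{\rho}_{2,j}$ and $\hat{u}_j$ to estimate $\Sigma$ as follows:

$$
\begin{aligned}
\hat{v}_j &= \hat{u}_j - \hat{\rho}_{2,j}M_j\hat{u}_j, \\
\hat{\sigma}_{ij} &= N^{-1}\hat{v}_j'\hat{v}_i, \quad i, j = 1, \ldots, G, \\
\hat{\Sigma} &= \{\hat{\sigma}_{ij}\}; \quad i, j = 1, \ldots, G.
\end{aligned}
\tag{14.8.15}
$$

Step 5. Calculate the transformed variables, using the error-term weighting matrices $M_j$ as in (14.8.1):

$$
\begin{aligned}
y_j(\hat{\rho}_{2,j}) &= (I_N - \hat{\rho}_{2,j}M_j)\,y_j, \\
Z_j(\hat{\rho}_{2,j}) &= (I_N - \hat{\rho}_{2,j}M_j)\,Z_j, \\
y(\hat{\rho}_2) &= (y_1(\hat{\rho}_{2,1})', \ldots, y_G(\hat{\rho}_{2,G})')', \\
Z(\hat{\rho}_2) &= \operatorname{diag}_{j=1}^{G}[Z_j(\hat{\rho}_{2,j})].
\end{aligned}
\tag{14.8.16}
$$

Step 6. Obtain the GS3SLS estimator of $\gamma = (\gamma_1', \ldots, \gamma_G')'$ as

$$
\hat{\gamma} = [\hat{Z}(\hat{\rho}_2)'[\hat{\Sigma}^{-1} \otimes I_N]\hat{Z}(\hat{\rho}_2)]^{-1}\,\hat{Z}(\hat{\rho}_2)'[\hat{\Sigma}^{-1} \otimes I_N]\,y(\hat{\rho}_2)
\tag{14.8.17}
$$

where $\hat{Z}(\hat{\rho}_2) = H_{IV}(H_{IV}'H_{IV})^{-1}H_{IV}'Z(\hat{\rho}_2)$, in parallel with (14.8.9).

Inferences can be based on the small sample approximation

$$
\begin{aligned}
\hat{\gamma} &\simeq N(\gamma,\; \widehat{VC}_{\hat{\gamma}}), \\
\widehat{VC}_{\hat{\gamma}} &= [\hat{Z}(\hat{\rho}_2)'[\hat{\Sigma}^{-1} \otimes I_N]\hat{Z}(\hat{\rho}_2)]^{-1}.
\end{aligned}
\tag{14.8.18}
$$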
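Steps 1, 2, 4, 5, and 6 can be sketched in a few lines of NumPy. The sketch is a hedged illustration rather than a full implementation: Step 3 (the GMM estimator of the autoregressive parameters from Chapter 2) is not implemented, so the estimates are passed in as the argument `rho_hat`, and in the simulated example the true values are used as a stand-in. The function names, the simulated two-equation design, and the ring-shaped weighting matrix are all assumptions. The block formulas used in Step 6 follow from the block diagonal structure of the regressor and instrument matrices, so the large Kronecker products never need to be formed.

```python
import numpy as np

def project(H, A):
    """Fitted values from regressing the columns of A on the instruments H."""
    return H @ np.linalg.solve(H.T @ H, H.T @ A)

def tsls(y, Z, H):
    """Single-equation 2SLS estimator."""
    Z_hat = project(H, Z)
    return np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y)

def gs3sls(ys, Zs, Ms, H_ss, rho_hat):
    """GS3SLS given rho_hat (Step 3 is assumed to have been done elsewhere)."""
    G, N = len(ys), ys[0].shape[0]
    I = np.eye(N)
    # Steps 1-2: equation-by-equation 2SLS residuals.
    u_hat = [y - Z @ tsls(y, Z, H_ss) for y, Z in zip(ys, Zs)]
    # Step 4: innovations and their cross-equation covariance, (14.8.15).
    V = np.column_stack([u - r * (M @ u) for u, r, M in zip(u_hat, rho_hat, Ms)])
    Sigma_hat = V.T @ V / N
    S_inv = np.linalg.inv(Sigma_hat)
    # Step 5: transformed variables, (14.8.16), and projected regressors.
    y_t = [(I - r * M) @ y for y, r, M in zip(ys, rho_hat, Ms)]
    Z_t = [project(H_ss, (I - r * M) @ Z) for Z, r, M in zip(Zs, rho_hat, Ms)]
    # Step 6: (14.8.17) and (14.8.18), assembled block by block.
    A = np.block([[S_inv[i, j] * Z_t[i].T @ Z_t[j] for j in range(G)]
                  for i in range(G)])
    b = np.concatenate([sum(S_inv[i, j] * Z_t[i].T @ y_t[j] for j in range(G))
                        for i in range(G)])
    VC = np.linalg.inv(A)
    return VC @ b, VC, Sigma_hat

# Simulated two-equation example (all values hypothetical).
rng = np.random.default_rng(4)
N = 900
I = np.eye(N)
M = (np.roll(I, 1, axis=1) + np.roll(I, -1, axis=1)) / 2
h = rng.normal(size=(N, 4))
H_star = np.hstack([np.ones((N, 1)), h])
H_ss = np.hstack([H_star, M @ h])      # lag of the constant is dropped (LI)
rho2 = [0.4, -0.3]
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
V = rng.multivariate_normal(np.zeros(2), Sigma, size=N)
u = [np.linalg.solve(I - r * M, V[:, j]) for j, r in enumerate(rho2)]
q1 = h[:, 1] + h[:, 2] + 0.5 * V[:, 0] + rng.normal(size=N)   # endogenous
q2 = h[:, 0] + h[:, 3] + 0.5 * V[:, 1] + rng.normal(size=N)   # endogenous
Zs = [np.column_stack([np.ones(N), h[:, 0], q1]),
      np.column_stack([np.ones(N), h[:, 1], q2])]
gammas = [np.array([1.0, 0.5, 2.0]), np.array([-1.0, 1.0, 0.5])]
ys = [Z @ g + e for Z, g, e in zip(Zs, gammas, u)]

# True rho values stand in for the Step 3 GMM estimates.
gamma_hat, VC, Sigma_hat = gs3sls(ys, Zs, [M, M], H_ss, rho2)
```

Standard errors for the elements of `gamma_hat` are the square roots of the diagonal of `VC`, in line with (14.8.18).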
SUGGESTED PROBLEMS

1. If $W_{12} = 0 = W_{21}$ and $\delta_1 = 0$ in (14.3.1), what IVs would you use to estimate the first equation in (14.3.1)?
2. If $W_{12} = 0 = W_{21}$ but $\delta_1 \neq 0$, how would your IVs change for the estimation of the first equation in (14.3.1)?
3. Suppose $\Sigma = I_2$ in (14.7.3). What would your estimator of $\gamma$ in (14.8.2) be?
4. Given the result in (14.6.2), demonstrate the result in (14.6.3).