Economics Letters 79 (2003) 385–392
www.elsevier.com/locate/econbase
Partial GLS regression

Hailong Qian (a), Peter Schmidt (b,*)

(a) Department of Economics, John Cook School of Business, Saint Louis University, 3674 Lindell Blvd., St. Louis, MO 63108, USA
(b) Department of Economics, Michigan State University, East Lansing, MI 48824, USA

Received 13 November 2002; accepted 10 December 2002
Abstract

We derive a novel expression for the GLS estimate of a subset of the regression coefficients. We give conditions for the partial equality of OLS and GLS, and we derive a feasible GLS estimator for the fixed effects panel data model. © 2003 Elsevier Science B.V. All rights reserved.

Keywords: Generalized least squares; Fixed effects; Panel data

JEL classification: C20; C23
1. Introduction

Consider a regression model of the usual form, where the regressors are partitioned into two sets:

y = Xβ + ε = X1β1 + X2β2 + ε,  (1)

where y is T×1, X = (X1, X2) is T×K, X1 is T×K1 and X2 is T×K2, with K = K1 + K2. Let Σ be the assumed error variance matrix of ε. Since this paper is about the numerical equivalence of estimators, it does not matter whether this assumption is correct.

It is well known that the ordinary least squares (OLS) estimator of β1 can be obtained by 'partialling out' X2: regress X1 on X2, calculate the residuals, and then regress y on these residuals. A similar result exists for the generalized least squares (GLS) estimator of β1, but it is more complicated. Since GLS is the same as the OLS regression of Σ^(-1/2)y on Σ^(-1/2)X, we can partial out X2 by regressing Σ^(-1/2)X1 on Σ^(-1/2)X2, calculating the residuals, and then regressing Σ^(-1/2)y on these residuals. Note that the first step in this procedure, the OLS regression of Σ^(-1/2)X1 on Σ^(-1/2)X2, amounts to a GLS regression.

In this paper we give a new expression for the GLS estimator of β1. The interesting feature of this expression is that it has a two-step interpretation in which the first step is the same as for OLS: calculate the residuals from an OLS regression of X1 on X2. That is, we partial out X2 with the same first step as for OLS; only the second step differs.

We give two applications of this result, which are interesting in their own right. First, we derive conditions for the partial equality of OLS and GLS, by which we mean the equality of the OLS and GLS estimators of β1. Second, we show that a feasible fixed-effects GLS estimator exists, even though Σ cannot be estimated.

The paper is organized as follows. Section 2 derives a new expression for the GLS estimator of a subset of the parameters. Section 3 provides a list of easily checkable and equivalent conditions for partial equality of the OLS and GLS estimators. Section 4 shows the efficiency of a feasible fixed-effects GLS estimator for a panel data model. Section 5 gives some concluding remarks, while Appendix A contains proofs of two results stated in the text.

* Corresponding author. Tel.: +1-517-355-8381; fax: +1-517-432-1068. E-mail address: [email protected] (P. Schmidt).
doi:10.1016/S0165-1765(03)00033-8
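The OLS 'partialling out' result described above (the Frisch-Waugh-Lovell theorem) is easy to check numerically. The following sketch, with purely illustrative data and dimensions, confirms that regressing y on the residuals of X1 after an OLS regression on X2 reproduces the OLS estimate of β1:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K1, K2 = 40, 2, 3
X1 = rng.standard_normal((T, K1))
X2 = rng.standard_normal((T, K2))
X = np.hstack([X1, X2])
y = rng.standard_normal(T)

# OLS on all regressors at once
b_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: residuals from the OLS regression of X1 on X2
X1_res = X1 - X2 @ np.linalg.lstsq(X2, X1, rcond=None)[0]
# Step 2: OLS of y on those residuals
b1_partial = np.linalg.lstsq(X1_res, y, rcond=None)[0]

assert np.allclose(b1_partial, b_full[:K1])
```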
2. An expression for the partial GLS estimator

We consider the regression model given in (1) above. We assume that X is of full column rank, so that X′X is positive definite, and that Σ is also positive definite. Then we define the OLS estimator β̂ and the GLS estimator β̃:

β̂ = (X′X)⁻¹X′y,  (2A)
β̃ = (X′Σ⁻¹X)⁻¹X′Σ⁻¹y.  (2B)

Using the partitioned-matrix inverse rule, it is easy to see that the OLS estimator β̂1 and the GLS estimator β̃1 can be written as

β̂1 = (X1′M2X1)⁻¹X1′M2y,  (3A)
β̃1 = (X1′AX1)⁻¹X1′Ay,  (3B)

with

M2 = I_T − X2(X2′X2)⁻¹X2′,  (4A)
A = Σ⁻¹ − Σ⁻¹X2(X2′Σ⁻¹X2)⁻¹X2′Σ⁻¹.  (4B)

Also, for later use we define

B = M2ΣM2.  (5)
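The partitioned formulas (3A) and (3B) can be verified numerically. The sketch below, with illustrative dimensions and an arbitrary positive definite Σ, checks that they reproduce the β1 blocks of the full OLS and GLS estimators (2A) and (2B):

```python
import numpy as np

rng = np.random.default_rng(0)
T, K1, K2 = 40, 2, 3
X1 = rng.standard_normal((T, K1))
X2 = rng.standard_normal((T, K2))
X = np.hstack([X1, X2])
y = rng.standard_normal(T)
W = rng.standard_normal((T, T))
Sigma = W @ W.T + T * np.eye(T)   # an arbitrary positive definite error variance
Si = np.linalg.inv(Sigma)

b_ols = np.linalg.solve(X.T @ X, X.T @ y)              # (2A)
b_gls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)    # (2B)

M2 = np.eye(T) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)          # (4A)
A = Si - Si @ X2 @ np.linalg.solve(X2.T @ Si @ X2, X2.T @ Si)   # (4B)

b1_ols = np.linalg.solve(X1.T @ M2 @ X1, X1.T @ M2 @ y)   # (3A)
b1_gls = np.linalg.solve(X1.T @ A @ X1, X1.T @ A @ y)     # (3B)

assert np.allclose(b1_ols, b_ols[:K1])
assert np.allclose(b1_gls, b_gls[:K1])
```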
We now wish to derive a useful alternative expression for β̃1. Consider multiplying Eq. (1) by M2 to obtain

M2y = M2X1β1 + M2ε.  (6)
It is well known that OLS applied to (6) yields β̂1. We next consider GLS estimation of (6). If Σ is the assumed variance matrix of ε, and if X is treated as fixed, it would be natural to take the variance matrix of M2ε as B = M2ΣM2. Then the GLS estimator based on (6) would be

β̆1 = (X1′M2B⁺M2X1)⁻¹X1′M2B⁺M2y,  (7)

where B⁺ denotes the Moore-Penrose pseudo-inverse of B; see, e.g., Theil (1971, Section 6.7). This leads to our main result.

Theorem 1. Let β̃1 be the GLS estimator of β1, as given in Eq. (3B) above, and let β̆1 be as defined in Eq. (7). Then β̆1 = β̃1.

Proof. In Appendix A we prove that M2B⁺M2 = B⁺ = A, with A defined in Eq. (4B) above. Therefore, substituting M2B⁺M2 = A into (7), we have β̆1 = β̃1, using (3B). ∎

This result is of interest for at least two reasons. First, we note that the premultiplication by M2 in Eq. (6) amounts to taking residuals from an OLS regression on X2. Since we are dealing here with an error structure that calls for GLS regression, one might have expected that an OLS partialling out of X2 would involve a loss of information, but it does not. Second, there are cases in which Σ is not consistently estimable, but M2ΣM2 is consistently estimable. (The fixed-effects GLS model, to be discussed later, is such a case.) Then our result shows that, although there is no feasible GLS estimator for all of β, there is a feasible GLS estimator for a subset of β, namely β1.
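Theorem 1 can likewise be checked numerically. The sketch below, on illustrative data, computes β̆1 from Eq. (7) using NumPy's Moore-Penrose pseudo-inverse and confirms both the Appendix A identity M2B⁺M2 = A and the equality β̆1 = β̃1:

```python
import numpy as np

rng = np.random.default_rng(1)
T, K1, K2 = 30, 2, 2
X1 = rng.standard_normal((T, K1))
X2 = rng.standard_normal((T, K2))
y = rng.standard_normal(T)
W = rng.standard_normal((T, T))
Sigma = W @ W.T + T * np.eye(T)   # arbitrary positive definite error variance
Si = np.linalg.inv(Sigma)

M2 = np.eye(T) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)          # (4A)
A = Si - Si @ X2 @ np.linalg.solve(X2.T @ Si @ X2, X2.T @ Si)   # (4B)
B = M2 @ Sigma @ M2                                             # (5)
Bp = np.linalg.pinv(B, rcond=1e-10)   # Moore-Penrose pseudo-inverse of B

b1_gls = np.linalg.solve(X1.T @ A @ X1, X1.T @ A @ y)            # (3B)
b1_breve = np.linalg.solve(X1.T @ M2 @ Bp @ M2 @ X1,
                           X1.T @ M2 @ Bp @ M2 @ y)              # (7)

assert np.allclose(M2 @ Bp @ M2, A)   # the identity proved in Appendix A
assert np.allclose(b1_breve, b1_gls)  # Theorem 1
```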
3. Conditions for partial equality of OLS and GLS

A large number of papers have been written identifying conditions under which, in the linear regression model, OLS and GLS are identical. Examples include Rao (1965), Zyskind (1967), Kruskal (1968), Rao and Mitra (1971), Anderson (1971), Gourieroux and Monfort (1980) and Puntanen and Styan (1989). Amemiya (1985, pp. 182–183) gives a convenient summary, in the form of a list of logically identical necessary and sufficient conditions for the equality of OLS and GLS. These conditions depend on the relationship between the regressor matrix and the (assumed) error variance matrix.

This section addresses the related question of when OLS equals GLS for a subset of the regression coefficients. That is, in some cases we may have equality for some parameters and not for others, and this case is not covered by the results summarized by Amemiya. We call this partial equality of OLS and GLS. A condition for partial equality of OLS and GLS has been given by Gourieroux and Monfort (1980), and an equivalent condition can be obtained from Rao and Mitra (1971, Theorem 8.2.2, p. 159). However, there does not seem to be a systematic treatment of partial equality that parallels the literature on full equality of OLS and GLS. We will give a list of equivalent necessary and sufficient conditions for partial equality, and show how Theorem 1 can be used to prove their equivalence in a simple way.

As a preliminary result, we first restate Amemiya's result for the case that Σ is singular. In this case GLS can still be defined, by replacing Σ⁻¹ by Σ⁺, the Moore-Penrose pseudo-inverse of Σ. Theorem 2 below gives conditions very similar to those in Amemiya for the equality of GLS and OLS when Σ
is singular. However, compared to his results, Theorem 2 contains the additional assumption that X is in the column space of Σ, which would be automatically true when Σ is nonsingular. To explain this assumption, let F be the matrix of characteristic vectors of Σ corresponding to the non-zero characteristic roots, and let G be the matrix of characteristic vectors corresponding to the zero roots. We assume in Theorem 2 that X′ΣX and X′Σ⁺X are nonsingular, for which a necessary and sufficient condition is that F′X has full column rank. Theil (1971, p. 277) refers to this as the 'first rank condition'. His 'second rank condition' (p. 278) is that G′X = 0, or equivalently that X is in the column space of Σ. To motivate this assumption, note that ΣG = 0, which corresponds to G′ε = 0 (with probability equal to 1) in the model y = Xβ + ε. Typically this corresponds to some sort of adding up that is also satisfied identically by the data, so that G′y = 0 and G′X = 0. More pointedly, if G′ε = 0, then G′y = G′Xβ, which would restrict β unless G′X = 0. So, unless G′X = 0, efficient estimation would require the imposition of linear restrictions on β, and we cannot expect GLS to be efficient. (In fact, without this condition we cannot even prove that GLS is efficient relative to OLS.) Therefore we will restrict our attention to the case that X is in the column space of Σ.

Theorem 2. Let Σ be positive semidefinite, with Moore-Penrose pseudo-inverse Σ⁺. Assume that X′X, X′ΣX and X′Σ⁺X are positive definite, and that X is in the column space of Σ. Then the following are equivalent:

(A) (X′X)⁻¹X′ = (X′Σ⁺X)⁻¹X′Σ⁺.
(B) (X′X)⁻¹X′ΣX(X′X)⁻¹ = (X′Σ⁺X)⁻¹.
(C) ΣX = XC1 for some nonsingular C1.
(D) X = HC2 for some nonsingular C2, where the columns of H are K characteristic vectors of Σ, corresponding to positive characteristic roots.
(E) X′ΣZ = 0 for any Z such that Z′X = 0.
(F) Σ = XΓX′ + ZΘZ′ + σ²I for some Γ, Θ and Z such that Z′X = 0, and some nonnegative scalar σ².

For a proof of this result, see Rao and Mitra (1971, Theorem 8.2.1, p. 155, and Corollary 3, p. 157). They also give a few other equivalent conditions.

We now return to the problem of partial equality of OLS and GLS. We partition the model as in Eq. (1), and we seek conditions under which β̂1 = β̃1, so that OLS is the same as GLS for β1. Consider Eq. (6), from which X2 has been partialled out using M2. Theorem 1 shows that the GLS estimator β̃1 can be obtained from GLS applied to Eq. (6) using the appropriate generalized inverse. Also, as noted above, the OLS estimator β̂1 is just OLS applied to Eq. (6). Therefore the question of when β̃1 = β̂1 is the same as the question of when OLS equals GLS for Eq. (6). Note that B = M2ΣM2 is singular even though Σ is nonsingular. Theorem 2 is applicable because it does not require the covariance matrix B to be nonsingular and because the regressors of (6), M2X1, are in the column space of the covariance matrix, B = M2ΣM2. This leads to our result for equality of β̂1 and β̃1.

Theorem 3. Let A and B be as defined in Eqs. (4B) and (5). Let X′X, Σ, X1′AX1 and X1′BX1 be positive definite. Then the following are equivalent:

(A) (X1′M2X1)⁻¹X1′M2 = (X1′AX1)⁻¹X1′A.
(B) (X1′M2X1)⁻¹X1′M2BM2X1(X1′M2X1)⁻¹ = (X1′AX1)⁻¹.
(C) BM2X1 = M2X1C1 for some nonsingular C1.
(D) M2X1 = HC2 for some nonsingular C2, where the columns of H are K1 characteristic vectors of B, corresponding to positive characteristic roots.
(E) X1′M2BZ = 0 for any Z such that Z′M2X1 = 0.
(F) B = M2X1ΓX1′M2 + ZΘZ′ + σ²I_T for some Γ, Θ and Z such that Z′M2X1 = 0, and some nonnegative scalar σ².
(G) ΣM2X1 = XD for some D of full column rank.
(B*) (X1′M2X1)⁻¹X1′M2ΣM2X1(X1′M2X1)⁻¹ = (X1′AX1)⁻¹.
(C*) M2ΣM2X1 = M2X1C1 for some nonsingular C1.
(E*) X1′M2ΣM2Z = 0 for any Z such that Z′M2X1 = 0.

The proof of this result is straightforward. The equivalence of conditions (A), (B), (C), (D), (E) and (F) follows from Theorem 2. We simply replace 'X' by M2X1, 'Σ' by B = M2ΣM2, and 'Σ⁺' by B⁺ = A. Conditions (B*), (C*) and (E*) are just restatements of conditions (B), (C) and (E), using the fact that BM2 = M2B = B = M2ΣM2.

Condition (G) deserves a little more explanation. It is the condition of Gourieroux and Monfort (1980, p. 1086). Let L be the linear subspace of R^T spanned by the columns of X (i.e., the column space of X); let ΣL be the column space of ΣX; let L2 be the column space of X2; and let L2* be the orthogonal complement of L2 in L. Their condition for full equality of OLS and GLS is that ΣL ⊂ L. If Σ is nonsingular, this is just the requirement that ΣX = XC for some nonsingular C, which is one of Amemiya's conditions. For partial equality of OLS and GLS (for β1), their condition is that ΣL2* ⊂ L. If Σ is nonsingular, this corresponds to the condition that ΣM2X1 = XD for some D of full column rank, which is condition (G). In Appendix A, we show that this condition is equivalent to condition (C), and therefore to the other conditions in Theorem 3.
4. A feasible fixed-effects GLS estimator

In this section we consider a 'fixed effects' model for panel data:

y_it = x_it β + γ_i + ε_it,  i = 1, 2, ..., N,  t = 1, 2, ..., T;  (8A)
y_i = X_i β + e_T γ_i + ε_i,  i = 1, 2, ..., N;  (8B)
y = Xβ + Zγ + ε.  (8C)

Here expression (8A) is for a single observation, (8B) is for T observations for a single individual, and (8C) is for all NT observations. Thus x_it is 1×K, X_i is T×K, and X is NT×K, for example. Also, e_T is the T×1 vector of ones, Z ≡ I_N ⊗ e_T is the NT×N matrix of individual dummy variables, and γ is the N×1 vector containing the γ_i, where I_N is the identity matrix of dimension N and ⊗ denotes the Kronecker product.

We follow Kiefer (1980) in three ways that define the model more precisely. First, we treat the γ_i as 'fixed'. Thus we make no assumptions about the γ_i, and the parameters of interest are the elements of
β. Second, we have in mind large N and small T, so that asymptotics would be considered as N → ∞ with T fixed. Third, we treat Σ = Var(ε_i) as unrestricted, which is meaningful when T is fixed.

It is well known that OLS applied to (8) yields the 'within' estimator of β:

β̂ = (X′QX)⁻¹X′Qy,  (9)

where Q = I − Z(Z′Z)⁻¹Z′ = I_N ⊗ Q_T with Q_T = I_T − e_T e_T′/T. Q is the matrix that converts y and X into deviations from individual means (the 'within transformation').

Now consider GLS estimation of (8). If Σ is known, we can calculate the GLS estimator of β and γ, say β̃ and γ̃, and under standard assumptions these would be best linear unbiased. However, as pointed out by Kiefer (1980, p. 199), it is not possible to obtain a consistent (as N → ∞ with T fixed) estimator of Σ. Thus there is no feasible fixed-effects GLS estimator of (both) β and γ. However, the point of this section is that there is a feasible fixed-effects GLS estimator of β.

To see why this is possible, we first apply the within transformation (premultiplication of (8C) by Q) to obtain:

Qy = QXβ + Qε.  (10)

Next, we apply GLS to (10), treating the variance matrix of Qε as B = Q(I_N ⊗ Σ)Q = I_N ⊗ (Q_T Σ Q_T). This variance matrix is singular, so we use the generalized inverse of B, and we obtain:

β̆ = (X′QB⁺QX)⁻¹X′QB⁺Qy.  (11)
This estimator was suggested by Kiefer (1980, p. 199). See also Im et al. (1999, pp. 185–186), who give some alternative forms of the estimator. Kiefer noted that, when Σ is unknown, this estimator is still feasible, because Q_T Σ Q_T can be estimated consistently (even though Σ cannot); for example, Q_T Σ Q_T can be estimated from the within residuals. However, Theorem 1 implies that β̆ equals the GLS estimator of β (β̃). (The correspondence is that 'X' here is X1 in Section 2, 'Z' is X2, 'β' is β1, and 'Q' is M2.) That is, β̆ is not just an alternative to GLS estimation of the original (untransformed) model (8). For β (though not for γ), it is the GLS estimator in (8). Given the discussion of Section 2, this equivalence may seem unsurprising, but it is a novel result that is contrary to the wisdom of the existing literature. For example, Kiefer (1980, p. 200) says, incorrectly, that "this is not the GLS estimator which would be used if the variance–covariance matrix were known."
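This equivalence can be illustrated numerically. The sketch below, with illustrative dimensions and a known Σ, computes the estimator (11) and the β block of full GLS on the dummy-variable model (8C), and confirms that they coincide:

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 20, 4, 2
X = rng.standard_normal((N * T, K))
y = rng.standard_normal(N * T)
Z = np.kron(np.eye(N), np.ones((T, 1)))   # individual dummies, Z = I_N kron e_T

W = rng.standard_normal((T, T))
Sigma = W @ W.T + T * np.eye(T)           # unrestricted T x T intertemporal variance
QT = np.eye(T) - np.ones((T, T)) / T      # Q_T = I_T - e_T e_T'/T
Q = np.kron(np.eye(N), QT)                # within transformation

# Full GLS of (8C) on (X, Z), with Var(eps) = I_N kron Sigma; keep the beta block.
XZ = np.hstack([X, Z])
Omega_inv = np.kron(np.eye(N), np.linalg.inv(Sigma))
b_gls = np.linalg.solve(XZ.T @ Omega_inv @ XZ, XZ.T @ Omega_inv @ y)[:K]

# Estimator (11): GLS on the within-transformed model with B+ = I_N kron (QT Sigma QT)+.
Bp = np.kron(np.eye(N), np.linalg.pinv(QT @ Sigma @ QT, rcond=1e-10))
b_breve = np.linalg.solve(X.T @ Q @ Bp @ Q @ X, X.T @ Q @ Bp @ Q @ y)

assert np.allclose(b_breve, b_gls)
```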
5. Concluding remarks In this paper, we consider partial GLS estimation, by which we mean GLS estimation of a subset of the parameters of a linear regression model. We derive a new formula for the partial GLS estimator. This has a simple two-step interpretation: in the first step, we partial out the regressors corresponding to parameters that are not of primary interest, and in the second step, we apply GLS to the residuals from the first step, using the appropriate generalized inverse of the covariance matrix of the implicit transformed model. It is interesting that the first step involves an OLS (as opposed to GLS) regression. Our partial GLS approach is particularly relevant when the error covariance matrix in the original
(untransformed) model is not estimable but the error covariance matrix in the transformed model is estimable. In order to demonstrate the usefulness of our expression for partial GLS, we give two applications. In the first, we use it to derive a list of equivalent conditions for partial equality of the OLS and GLS estimators. In the second application, we use our result to show that there is a feasible GLS estimator for the fixed-effects regression model, and that Kiefer's (1980) feasible fixed-effects estimator is in fact the same as GLS for the parameters of interest in the original (untransformed) model. Similar results could no doubt be derived for other models, such as the panel data model in which some regressors other than the constant term have coefficients that vary over individuals.
Acknowledgements

This paper is a revision and extension of an earlier paper, 'Conditions for the Partial Equality of OLS and GLS'. The authors would like to thank Professors Alan Rogers and Peter Phillips and other participants of the New Zealand Econometrics Study Group Meeting, Auckland, July 1999, for their very helpful comments on the earlier version of the paper. The first author gratefully acknowledges a summer research grant from the John Cook School of Business of Saint Louis University.
Appendix A

A.1. Proof that M2B⁺M2 = B⁺ = A

It is easy to establish that AX2 = 0, AM2 = A, M2AM2 = A, and AΣA = A. Using these results we obtain

A(M2ΣM2) = AΣM2 = M2,
(M2ΣM2)A = M2ΣA = M2,
(M2ΣM2)A(M2ΣM2) = M2ΣM2,
A(M2ΣM2)A = A.

Since M2 is symmetric, these are exactly the four Penrose conditions for A to be the Moore-Penrose pseudo-inverse of B = M2ΣM2. But this implies A = B⁺, and hence M2B⁺M2 = M2AM2 = A. ∎

A.2. Proof of equivalence of (C) and (G) in Theorem 3

First, we show that (G) implies (C). Thus, suppose that ΣM2X1 = XD for some D of full column rank. Partition X and D conformably, so that XD = X1D1 + X2D2. Then

BM2X1 = M2ΣM2X1 = M2XD = M2X1D1.

Pre-multiplying by X1′M2, we obtain D1 = (X1′M2X1)⁻¹X1′M2BM2X1, which is nonsingular because X1′M2BM2X1 = X1′BX1 and X1′M2X1 are assumed to be positive definite. Thus, condition (G) implies condition (C).

Next, we show that (C) implies (G). So, we suppose that BM2X1 = M2X1C1 for some nonsingular C1. Then
M2ΣM2X1 = M2X1C1, i.e., M2[ΣM2X1 − X1C1] = 0, which implies that, for some C2,

ΣM2X1 − X1C1 = X2C2,

so that

ΣM2X1 = X1C1 + X2C2 = XD,

where D′ = (C1′, C2′) and where D must have full column rank because C1 is nonsingular. ∎

References

Amemiya, T., 1985. Advanced Econometrics. Harvard University Press, Cambridge, MA.
Anderson, T.W., 1971. The Statistical Analysis of Time Series. Wiley, New York.
Gourieroux, C., Monfort, A., 1980. Sufficient linear structures: econometric applications. Econometrica 48, 1083–1097.
Im, K.S., Ahn, S.C., Schmidt, P., Wooldridge, J., 1999. Efficient estimation of panel data models with strictly exogenous explanatory variables. Journal of Econometrics 93, 177–201.
Kiefer, N.M., 1980. Estimation of fixed-effects models for time series of cross-sections with arbitrary intertemporal covariance. Journal of Econometrics 14, 195–202.
Kruskal, W., 1968. When are Gauss-Markoff and least squares estimates identical? A coordinate-free approach. Annals of Mathematical Statistics 39, 70–75.
Puntanen, S., Styan, G.P.H., 1989. The equality of the ordinary least squares estimator and the best linear unbiased estimator. The American Statistician 43, 153–164.
Rao, C.R., 1965. The theory of least squares when the parameters are stochastic and its applications to the analysis of growth curves. Biometrika 52, 447–458.
Rao, C.R., Mitra, S.K., 1971. Generalized Inverse of Matrices and Its Applications. Wiley, New York.
Theil, H., 1971. Principles of Econometrics. Wiley, New York.
Zyskind, G., 1967. On canonical forms, nonnegative covariance matrices and best and simple least squares linear estimators in linear models. Annals of Mathematical Statistics 38, 1092–1109.