ARTICLE IN PRESS
Statistics & Probability Letters 77 (2007) 75–82 www.elsevier.com/locate/stapro
Asymptotic efficiency of the ordinary least-squares estimator for SUR models with integrated regressors
Dong Wan Shin a,*, Han Joon Kim b, Won-Chul Jhee c
a Department of Statistics, Ewha University, Seoul 120-750, South Korea
b Korea Ocean Research and Development Institute, Ansan 425-600, South Korea
c Department of Industrial Engineering, Hongik University, Seoul 121-791, Korea
Received 10 November 2003; received in revised form 28 February 2005; accepted 23 May 2006 Available online 18 July 2006
Abstract
For seemingly unrelated regression (SUR) models with integrated regressors, two sufficient conditions are identified under which the ordinary least-squares estimator (OLSE) is asymptotically efficient. The first condition is that every pair of regressor processes is cointegrated in a specific way: one regressor is a linear combination of the other regressor up to a zero-mean stationary error. The second condition is that, for every pair of regressor processes, the pair of error processes deriving the regressor processes have zero long-run covariance.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Cointegration; Efficiency; Generalized least-squares estimator; Long-run covariance
* Corresponding author. Tel.: +82 2 3277 2614; fax: +82 2 3277 3607. E-mail address: [email protected] (D.W. Shin).
0167-7152/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2006.05.024

1. Introduction

SUR models have attracted much attention from statisticians and econometricians with regard to the efficiency of the OLSE. Since the seminal work of Zellner (1962), researchers have tried to characterize conditions under which the OLSE is the best linear unbiased estimator (BLUE). The two best-known sufficient conditions for a SUR model $y_{kt} = x_{kt}'\beta_k + e_{kt}$, $t = 1, \ldots, n$, $k = 1, 2, \ldots, K$, consisting of $K$ seemingly unrelated linear regression models with serially uncorrelated errors $e_{k1}, \ldots, e_{kn}$ such that $\mathrm{cov}(e_{kt}, e_{ks}) = 0$, $t \ne s$, are (i) $x_{1t} = x_{2t} = \cdots = x_{Kt}$ or (ii) $\mathrm{cov}(e_{kt}, e_{\ell t}) = 0$ for all $k \ne \ell$, established by Zellner (1962). Several extensions were made by Baltagi (1988), Bartels and Fiebig (1991), and references therein.

For many economic and financial applications, the regressor $x_t$ is usually an integrated process $x_t = x_{t-1} + u_t$, denoted by $I(1)$, where $u_t$ is a zero-mean stationary process having positive long-run variance, i.e., $\sum_{h=-\infty}^{\infty} \mathrm{cov}(u_t, u_{t+h}) > 0$. For the time series regression model $y_t = x_t'\beta + z_t$ with $I(1)$ regressor $x_t$ and stationary error process $\{z_t\}$ independent of $\{x_t\}$, it is well known that the OLSE $\hat\beta_O$ is asymptotically efficient. Kramer (1986) and Phillips and Park (1988) showed that, if $z_t$ is a stationary autoregressive process, then, as $n \to \infty$, $n(\hat\beta_O - \beta)$ has the same nontrivial limiting distribution as $n(\hat\beta_G - \beta)$, where $\hat\beta_G$ is the generalized least-squares estimator (GLSE). The asymptotic efficiency study of the OLSE for time series regressions with $I(1)$ regressors was extended by Kramer and Hassler (1998) and Shin and Oh (2002) to models having fractionally integrated regressors and general unstable regressors with characteristic roots on the unit circle, respectively.

Our interest here is the asymptotic efficiency of the OLSE in the SUR model $y_{kt} = x_{kt}'\beta_k + z_{kt}$ with $I(1)$ regressors $x_{kt}$ and stationary error processes $\{z_{kt}\}$ independent of $\{x_{kt}\}$. This issue is important because models with $I(1)$ regressors are frequently encountered in economic and financial applications. We mention three examples with $I(1)$ regressors. The first is estimating cointegrating relations by regression, for which various real data examples can be found in books such as Johnston and DiNardo (1997, Section 8.4) and Hamilton (1994, Section 19). The second is predicting stock returns using integrated explanatory variables, as in Torous et al. (2004) and references therein. The third, as in Li (1999) for panel data sets of several countries, is modelling purchasing power parity (PPP), in which the log exchange rate is regressed on the log domestic price level and the log foreign price level in order to test proportionality (the regression parameter for the log domestic price level equals one) and symmetry (the two regression parameters have the same magnitude with opposite signs). The hypotheses of proportionality and symmetry jointly state stationarity of the exchange rate.

Phillips and Hansen (1990), Kitamura and Phillips (1997), Shin and Oh (2004), and others developed the so-called "fully modified (FM)" estimation procedures, which address serial correlation in the error terms and endogeneity of the regressors in the context of $I(1)$ regressors. Also, Moon (1999) considered SUR models with integrated regressors and developed a GLS-estimation procedure adopting the FM-estimation.
Our major finding is that the OLSE is asymptotically efficient under (i)$'$ for every pair $(k, \ell)$, $x_{kt}$ and $x_{\ell t}$ are cointegrated in a specific way, or (ii)$'$ the long-run covariance of $z_{kt}$ and $z_{\ell t}$ is zero, that is, $\sum_{h=-\infty}^{\infty} \mathrm{cov}(z_{kt}, z_{\ell,t+h}) = 0$ for all $k \ne \ell$. Condition (i)$'$ is a stochastic version of (i): it states that each of the $I(1)$ regressors $x_{1t}, \ldots, x_{Kt}$ is a linear combination of a common stochastic trend up to an $I(0)$ error process, i.e., a zero-mean stationary error having positive long-run variance. One trivial situation satisfying (i)$'$ is the common regressor model with $x_{1t} = x_{2t} = \cdots = x_{Kt}$, which is identical with (i). Condition (ii)$'$ states that $z_{kt}$ and $z_{\ell t}$, $k \ne \ell$, have zero long-run covariance, while condition (ii) states that $e_{kt}$ and $e_{\ell t}$, $k \ne \ell$, are uncorrelated.

2. Models and results

We first analyze models without deterministic trends. Later in this section, the analysis is extended to models with deterministic trends. Consider a SUR model

$$y_{kt} = x_{kt}'\beta_k + z_{kt}, \qquad x_{kt} = x_{k,t-1} + u_{kt}, \tag{1}$$

$t = 1, \ldots, n$, $k = 1, \ldots, K$, where $\beta_k$ is an unknown $p_k \times 1$ vector, $p_k \ge 1$, $\{z_{kt};\ k = 1, \ldots, K\}$ are stationary processes independent of $\{x_{kt};\ k = 1, \ldots, K\}$, and $x_{k0}$ is a given random variable. Here and in the sequel, "stationarity" means "covariance stationarity". We rewrite model (1) as $Y = X\beta + Z$, where $Y = (Y_1', \ldots, Y_K')'$, $X = \mathrm{diag}(X_1, \ldots, X_K)$, $Z = (Z_1', \ldots, Z_K')'$, $Y_k = (y_{k1}, \ldots, y_{kn})'$, $X_k = (x_{k1}', \ldots, x_{kn}')'$, and $Z_k = (z_{k1}, \ldots, z_{kn})'$. The OLSE and the GLSE are

$$\hat\beta_O = (X'X)^{-1}X'Y \quad\text{and}\quad \hat\beta_G = (X'\Gamma^{-1}X)^{-1}X'\Gamma^{-1}Y,$$

respectively, where $\Gamma = \mathrm{var}(Z)$. Let $\gamma_h^{k\ell} = \mathrm{cov}(z_{kt}, z_{\ell,t+h})$, $c_{k\ell} = \sum_{h=-\infty}^{\infty}\gamma_h^{k\ell}$, and $\Gamma_{k\ell} = \mathrm{cov}(Z_k, Z_\ell)$, $k, \ell = 1, \ldots, K$. Let $R_i^{k\ell} = \sum_{h=i}^{\infty}\gamma_h^{k\ell}$ for $i \ge 0$ and $R_i^{k\ell} = \sum_{h=-\infty}^{i}\gamma_h^{k\ell}$ for $i < 0$. For a matrix $B$, $\|B\| = \{\mathrm{tr}(BB')\}^{1/2}$ denotes the matrix norm of $B$.

We state the conditions required for our analysis.

A1. For each $k = 1, \ldots, K$, $v_{kt} = (u_{kt}', z_{kt})'$ is a covariance stationary process having a positive definite covariance matrix and satisfying the following properties:
(a) $v_{kt}$ is a strong mixing sequence with mixing numbers $\delta_m$ such that $\sum_{m=1}^{\infty}\delta_m^{1-2/\gamma} < \infty$ for some $\gamma > 2$;
(b) $v_{kt}$ has zero mean and finite $\gamma$th order moment;
(c) $\sum_{j=-\infty}^{\infty}\|E(v_{kj}v_{k0}')\| < \infty$ and $\sum_{j=-\infty}^{\infty}E(v_{kj}v_{k0}')$ is positive definite.
A2. $\inf_n \lambda_m(\Gamma) > 0$, where $\lambda_m(\Gamma)$ is the minimum eigenvalue of $\Gamma$.
A3. $\sum_{i=-n}^{n}(R_i^{k\ell})^2 = o(n)$, $k, \ell = 1, \ldots, K$.
A4. $\{u_{1t}, u_{2t}, \ldots, u_{Kt}\}$ and $\{z_{1t}, z_{2t}, \ldots, z_{Kt}\}$ are independent.
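The two estimators defined above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes $K = 2$ equations sharing one random-walk regressor (condition (i)) with serially uncorrelated errors, so that $\Gamma = \Sigma \otimes I_n$. The helper names `block_diag` and `sur_ols_gls`, the matrix `Sigma`, the sample size and the seed are all hypothetical choices made for illustration. In this special case $\Gamma X = X\Sigma$ holds exactly, so the OLSE and the GLSE should coincide up to rounding error.

```python
import numpy as np

def block_diag(blocks):
    """Stack 2-D arrays along the diagonal: X = diag(X_1, ..., X_K)."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def sur_ols_gls(Y, X, Gamma):
    """Return (OLSE, GLSE) for the stacked model Y = X beta + Z, var(Z) = Gamma."""
    beta_o = np.linalg.solve(X.T @ X, X.T @ Y)
    Gi_X = np.linalg.solve(Gamma, X)   # Gamma^{-1} X
    Gi_Y = np.linalg.solve(Gamma, Y)   # Gamma^{-1} Y
    beta_g = np.linalg.solve(X.T @ Gi_X, X.T @ Gi_Y)
    return beta_o, beta_g

rng = np.random.default_rng(0)         # illustrative seed
n, K = 200, 2                          # illustrative sizes
X1 = np.cumsum(rng.standard_normal((n, 1)), axis=0)  # I(1) regressor with x_0 = 0
X = block_diag([X1, X1])               # common regressor in both equations
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])           # assumed contemporaneous covariance
Gamma = np.kron(Sigma, np.eye(n))      # blocks Gamma_kl = sigma_kl * I_n
Z = np.linalg.cholesky(Gamma) @ rng.standard_normal(K * n)
beta = np.array([1.0, -0.5])           # assumed true coefficients
Y = X @ beta + Z

beta_o, beta_g = sur_ols_gls(Y, X, Gamma)
print(beta_o, beta_g)
```

With different regressors in each equation and correlated errors, the two estimates generally differ in finite samples; the paper's results concern when that difference vanishes asymptotically.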
A5. (a) $c_{k\ell} = 0$ for all $k \ne \ell$, or (b) $p_1 = p_2 = \cdots = p_K = p$ and $x_{kt}' = x_{\ell t}'B_{\ell k} + w_{\ell k t}'$, $k, \ell = 1, \ldots, K$, for some $p \times p$ matrices $B_{11}, B_{12}, \ldots, B_{KK}$ and zero-mean stationary processes $w_{11t}, \ldots, w_{KKt}$ satisfying that

$$C_B = \begin{bmatrix} c_{11}B_{11} & \cdots & c_{1K}B_{1K} \\ \vdots & & \vdots \\ c_{K1}B_{K1} & \cdots & c_{KK}B_{KK} \end{bmatrix}$$

is nonsingular.

Condition A1 imposes the invariance principle on the process $v_{kt} = (u_{kt}', z_{kt})'$, so that the weak limit of $n^{-1/2}\sum_{t=1}^{[nr]}v_{kt}$ is a Brownian motion, as shown by Corollary 2.2 of Phillips and Durlauf (1986), where $[nr]$ is the integer part of $nr$, $0 \le r \le 1$. Condition A1(c) requires that the Brownian motion has a positive definite covariance matrix and hence that $\sum_{s=1}^{t}(u_{ks}', z_{ks})'$ is not cointegrated. Condition A1 can be replaced by any other condition that yields the invariance principle for the process $v_{kt} = (u_{kt}', z_{kt})'$. This condition is not a binding one because the invariance principle holds for a wide class of stationary errors containing stationary invertible vector ARMA processes as special cases.

The second condition, A2, states that $\Gamma$ is nonsingular even for large $n$. The third one, A3, restricts the rate at which the cross-covariance functions $\gamma_h^{k\ell}$ of the error terms $z_{kt}$ and $z_{\ell t}$ decay to zero. A sufficient condition for A3 is $\sum_{h=0}^{n}|h\gamma_h^{k\ell}| = o(n)$, which can be easily shown using $\sum_{h=0}^{\infty}|\gamma_h^{k\ell}| < \infty$. Note that $\sum_{h=0}^{\infty}|\gamma_h^{k\ell}| < \infty$ implies $\sum_{h=0}^{n}|h\gamma_h^{k\ell}| \le n\sum_{h=0}^{n}|\gamma_h^{k\ell}| = O(n)$. Therefore, the condition $\sum_{i=-n}^{n}|R_i^{k\ell}|^2 = o(n)$ is slightly more restrictive than the usual absolute summability $\sum_h|\gamma_h^{k\ell}| < \infty$. Note that all stationary invertible vector ARMA processes satisfy A2 and A3 because the cross covariance $\gamma_h^{k\ell}$ declines exponentially as $|h|$ increases. However, the class of stationary processes specified by A2 and A3 is much wider than the class of stationary invertible vector ARMA processes. If $\gamma_h^{k\ell} = O(h^{-1-\kappa})$ for some $\kappa > 0$, then $\sum_{h=0}^{n}h|\gamma_h^{k\ell}| = O(n^{1-\kappa}) = o(n)$ and A3 is satisfied. The class of errors with $\gamma_h^{k\ell} = O(h^{-1-\kappa})$ contains the class of "intermediate memory errors" characterized by $\gamma_h^{k\ell} = Ch^{-1-\kappa}\{1 + o(1)\}$, $\kappa > 0$, $C \ne 0$, which is a subclass of long memory errors characterized by $\gamma_h^{k\ell} = Ch^{-\kappa}\{1 + o(1)\}$, $\kappa > 0$, $C \ne 0$; see Brockwell and Davis (1990, Section 13.2). Therefore, the class of errors in our analysis is much more general than the class of stationary AR processes in the analysis of Kramer (1986) and Phillips and Park (1988).

The fourth condition, A4, is the independence condition employed in the usual asymptotic efficiency study of the OLSE; see, for example, Kramer (1986), Phillips and Park (1988) and Kramer and Hassler (1998).

The last condition, A5, is the most crucial one. Condition A5(a) together with A1 implies that the weak limits of the partial sum processes $n^{-1/2}\sum_{t=1}^{[nr]}z_{kt}$ and $n^{-1/2}\sum_{t=1}^{[nr]}z_{\ell t}$, $k \ne \ell$, are independent Brownian motions; see Phillips and Durlauf (1986). Condition A5(b) implies that, for each combination of $(k, \ell)$, $x_{kt}$ and $x_{\ell t}$ are cointegrated in the specific way that $x_{kt}$ is identical with $x_{\ell t}$ up to right multiplication by some $p \times p$ nonsingular matrix $B_{\ell k}$ and an additive zero-mean stationary error $w_{\ell k t}$, which will be referred to as "$x_{kt}$ and $x_{\ell t}$ are similar" in the sequel. Note that, if $x_{1t}, \ldots, x_{Kt}$ are all one-dimensional, then under A5(b) every pair of regressors is cointegrated in the usual sense. Under A5(b), all the regressors $x_{1t}, x_{2t}, \ldots, x_{Kt}$ are similar because $x_{kt}$ is similar to $x_{\ell t}$ and $x_{\ell t}$ is similar to $x_{1t}$. Therefore, all the $I(1)$ regressors are driven by a common $I(1)$ process up to nonsingular linear transformations. In this sense, we may say that the regressors have a common stochastic trend. The following lemma characterizes a sufficient condition for A5(b), which states that all the regressors $x_{kt}$, $k = 1, \ldots, K$, have the same dimension and every pair of the regressors is cointegrated componentwise.
Lemma 1. Assume $p_1 = \cdots = p_K = p$. Let $x_{kjt}$ be the $j$th component of $x_{kt} = (x_{k1t}, \ldots, x_{kpt})'$. Assume that for each $k = 1, \ldots, K$; $j = 1, \ldots, p$, there exists $d_{kj} \ne 0$ such that $x_{kjt} - d_{kj}x_{1jt}$ is stationary. Also assume that

$$C = \begin{bmatrix} c_{11} & \cdots & c_{1K} \\ \vdots & & \vdots \\ c_{K1} & \cdots & c_{KK} \end{bmatrix}$$

is nonsingular. Then A5(b) holds.

The first three conditions A1–A3 are mild ones because they are satisfied by many important time series models, including stationary and invertible ARMA processes as special cases. The last two conditions A4 and A5 are binding ones in that A4 is satisfied only if $x_t$ is exogenous and A5 is satisfied under the above cointegrated or independent cases. Since A1–A3 are mild, the independence condition A4 is usually imposed in efficiency studies of regression models, and A5(a) is a trivial condition of independence, the key condition for asymptotic efficiency of the OLSE in our SUR model is A5(b). Note that the condition of Lemma 1 for A5(b) can be easily verified. One may apply cointegration tests to see whether there is $d_{kj} \ne 0$ in Lemma 1, i.e., whether $x_{kjt}$ and $x_{1jt}$ are cointegrated.
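As an informal illustration of the componentwise condition in Lemma 1, the sketch below simulates one pair $(x_{1jt}, x_{kjt})$ with a known coefficient, estimates $d_{kj}$ by a no-intercept regression, and compares the lag-1 sample autocorrelation of the residual with that of the $I(1)$ level. This is only a diagnostic under assumed data (white-noise error, hypothetical value `d_true`, illustrative seed and sample size); it is not a formal cointegration test, for which critical values from the cointegration literature would be needed.

```python
import numpy as np

rng = np.random.default_rng(0)               # illustrative seed
n = 2000                                     # illustrative sample size
d_true = 2.0                                 # hypothetical coefficient d_kj

x1 = np.cumsum(rng.standard_normal(n))       # I(1) component x_{1jt}
xk = d_true * x1 + rng.standard_normal(n)    # x_{kjt} = d_kj * x_{1jt} + stationary error

# Least-squares estimate of d_kj (regression through the origin).
d_hat = (x1 @ xk) / (x1 @ x1)
resid = xk - d_hat * x1

def lag1_autocorr(v):
    """Lag-1 sample autocorrelation of a demeaned series."""
    v = v - v.mean()
    return (v[1:] @ v[:-1]) / (v @ v)

rho_resid = lag1_autocorr(resid)             # near 0 here: residual looks stationary
rho_level = lag1_autocorr(x1)                # near 1: the level is I(1)
print(d_hat, rho_resid, rho_level)
```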
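The following two lemmas characterize algebraic and probabilistic aspects of $\Gamma X$ which are useful for our analysis.

Lemma 2. For any $k, \ell = 1, \ldots, K$, if $\sum_{h=-\infty}^{\infty}|\gamma_h^{k\ell}| < \infty$, then

$$\Gamma_{k\ell}X_\ell = c_{k\ell}X_\ell - C_{k\ell}U_\ell - \varphi_{k\ell}x_{\ell n}', \tag{2}$$

where $U_\ell = (u_{\ell 1}', \ldots, u_{\ell n}')'$,

$$C_{k\ell} = \begin{pmatrix} R_1^{k\ell} & -R_{-1}^{k\ell} & -R_{-2}^{k\ell} & \cdots & -R_{-n+1}^{k\ell} \\ R_2^{k\ell} & R_1^{k\ell} & -R_{-1}^{k\ell} & \cdots & -R_{-n+2}^{k\ell} \\ R_3^{k\ell} & R_2^{k\ell} & R_1^{k\ell} & \cdots & -R_{-n+3}^{k\ell} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_n^{k\ell} & R_{n-1}^{k\ell} & R_{n-2}^{k\ell} & \cdots & R_1^{k\ell} \end{pmatrix}, \qquad \varphi_{k\ell} = \begin{pmatrix} R_{-n}^{k\ell} \\ R_{-n+1}^{k\ell} \\ R_{-n+2}^{k\ell} \\ \vdots \\ R_{-1}^{k\ell} \end{pmatrix},$$

and $c_{k\ell}$, $R_h^{k\ell}$ are defined in the first paragraph of Section 2.

Lemma 3. Under A1–A2, as $n \to \infty$, $\|C_{k\ell}U_\ell\| = o_p(n)$ and $\|\varphi_{k\ell}x_{\ell n}'\| = o_p(n)$ for any $k, \ell = 1, \ldots, K$.

Lemmas 2 and 3 show that, in the decomposition (2) of $\Gamma_{k\ell}X_\ell$, the first term $c_{k\ell}X_\ell$ dominates the other $o_p(n)$ terms $C_{k\ell}U_\ell$ and $\varphi_{k\ell}x_{\ell n}'$ because $\|X_\ell\|$ is of probabilistic order $n$. Therefore,

$$\Gamma X = \begin{pmatrix} \Gamma_{11}X_1 & \cdots & \Gamma_{1K}X_K \\ \vdots & & \vdots \\ \Gamma_{K1}X_1 & \cdots & \Gamma_{KK}X_K \end{pmatrix} = \begin{pmatrix} c_{11}X_1 & \cdots & c_{1K}X_K \\ \vdots & & \vdots \\ c_{K1}X_1 & \cdots & c_{KK}X_K \end{pmatrix} + \begin{pmatrix} V_{11} & \cdots & V_{1K} \\ \vdots & & \vdots \\ V_{K1} & \cdots & V_{KK} \end{pmatrix},$$

for some negligible terms $V_{\ell k}$ with $\|V_{\ell k}\| = o_p(n)$. Now, if A5(a) holds so that $c_{k\ell} = 0$ for all $k \ne \ell$, then

$$\begin{pmatrix} c_{11}X_1 & \cdots & c_{1K}X_K \\ \vdots & & \vdots \\ c_{K1}X_1 & \cdots & c_{KK}X_K \end{pmatrix} = \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_K \end{pmatrix}\begin{pmatrix} c_{11}I_{p_1} & & 0 \\ & \ddots & \\ 0 & & c_{KK}I_{p_K} \end{pmatrix} = XC$$

for some $(\sum p_k) \times (\sum p_k)$ matrix $C$, where $I_{p_k}$ is the $p_k \times p_k$ identity matrix. Next, if A5(b) holds, then $X_k = X_\ell B_{\ell k} + W_{\ell k}$, where $W_{\ell k} = (w_{\ell k1}', \ldots, w_{\ell kn}')'$. Thus

$$\begin{pmatrix} c_{11}X_1 & \cdots & c_{1K}X_K \\ \vdots & & \vdots \\ c_{K1}X_1 & \cdots & c_{KK}X_K \end{pmatrix} = \begin{pmatrix} c_{11}(X_1B_{11} + W_{11}) & \cdots & c_{1K}(X_1B_{1K} + W_{1K}) \\ \vdots & & \vdots \\ c_{K1}(X_KB_{K1} + W_{K1}) & \cdots & c_{KK}(X_KB_{KK} + W_{KK}) \end{pmatrix}$$

$$= \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_K \end{pmatrix}\begin{pmatrix} c_{11}B_{11} & \cdots & c_{1K}B_{1K} \\ \vdots & & \vdots \\ c_{K1}B_{K1} & \cdots & c_{KK}B_{KK} \end{pmatrix} + \begin{pmatrix} c_{11}W_{11} & \cdots & c_{1K}W_{1K} \\ \vdots & & \vdots \\ c_{K1}W_{K1} & \cdots & c_{KK}W_{KK} \end{pmatrix} = XC + \begin{pmatrix} c_{11}W_{11} & \cdots & c_{1K}W_{1K} \\ \vdots & & \vdots \\ c_{K1}W_{K1} & \cdots & c_{KK}W_{KK} \end{pmatrix}$$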
for some $(Kp) \times (Kp)$ matrix $C$. Note that $\|W_{\ell k}\| = O_p(n^{1/2})$ because $w_{\ell k t}$ is a zero-mean stationary process. Therefore, if we have A5 as well as A1 and A3, then

$$\Gamma X = XC + V \tag{3}$$

for some $V$ with $\|V\| = o_p(n)$. Note that, if $V = 0$, i.e., $\Gamma X = XC$, then the OLSE and the GLSE are numerically equivalent; see, for example, Kruskal (1968) and Zyskind (1967). Now, in our SUR model, the remainder term $V$ is asymptotically negligible in the sense that $\|V\|$ is of smaller probabilistic order than $\|XC\|$, which renders the OLSE asymptotically as efficient as the GLSE, as shown in the following theorem.

Theorem 1. Consider model (1). If conditions A1–A5 hold, then, as $n \to \infty$, $n(\hat\beta_O - \beta)$ and $n(\hat\beta_G - \beta)$ have the same nontrivial limiting distribution.

The result of Theorem 1 can be extended to models with polynomial trends

$$y_{kt} = g_{kt}'\alpha_k + x_{kt}'\beta_k + z_{kt}, \qquad x_{kt} = x_{k,t-1} + u_{kt}, \tag{4}$$

where $g_{kt} = (1, t, \ldots, t^{q_k})'$ and $\alpha_k$ and $\beta_k$ are vectors of unknown parameters with dimensions $q_k + 1 \ge 0$ and $p_k \ge 0$, respectively. Let $\alpha = (\alpha_1', \ldots, \alpha_K')'$ and $\beta = (\beta_1', \ldots, \beta_K')'$. The OLSE $(\hat\alpha_O', \hat\beta_O')'$ and the GLSE $(\hat\alpha_G', \hat\beta_G')'$ are defined in the obvious manner. We need A1–A4 and, instead of A5:

A5$'$. (a$'$) A5(a) holds, or (b$'$) A5(b) holds together with $q_1 = q_2 = \cdots = q_K$ satisfying that

$$\begin{bmatrix} c_{11} & \cdots & c_{1K} \\ \vdots & & \vdots \\ c_{K1} & \cdots & c_{KK} \end{bmatrix}$$

is nonsingular.

Now, letting $T_j = (1^j, \ldots, n^j)'$, $j = 1, 2, \ldots$, noting $t^j = (t-1)^j + u_{jt}$ with $u_{jt}$ a polynomial in $t$ of order $(j-1)$, and applying Lemma 2, we get

$$\Gamma_{k\ell}T_j = c_{k\ell}T_j + V_{gj} \tag{5}$$

for some $n$-vector $V_{gj}$ such that $\|V_{gj}\| = o(n^{j+1/2})$ by an argument similar to Lemma 3. The result (5), together with the fact that $\|T_j\|$ is of order $n^{j+1/2}$, allows us to establish the extended result below.

Theorem 2. Consider model (4). If conditions A1–A4 and A5$'$ hold, then, as $n \to \infty$, $[n^{1/2}A_n(\hat\alpha_O - \alpha)', n(\hat\beta_O - \beta)']$ and $[n^{1/2}A_n(\hat\alpha_G - \alpha)', n(\hat\beta_G - \beta)']$ have the same nontrivial limiting distribution, where $A_n = \mathrm{diag}[A_{1n}, \ldots, A_{Kn}]$ and $A_{kn} = \mathrm{diag}[n^0, n^1, \ldots, n^{q_k}]$.

Remark 1. When $p_k = 0$ for all $k$, condition A5$'$ reduces to (a$'$) $q_1 = \cdots = q_K$ or (b$'$) $c_{k\ell} = 0$ for $k \ne \ell$. Therefore, for models consisting solely of polynomial time trends, the OLSE is asymptotically efficient if each regression equation has the same polynomial order or every pair of regression error processes has zero long-run covariance. This analysis extends the result of Grenander and Rosenblatt (1957) on asymptotic efficiency of the OLSE for polynomial regression models to SUR models with polynomial trends.

Remark 2. For models with nonzero long-run covariance, in order for the OLSE to be asymptotically efficient, in addition to every pair of regressors being cointegrated in the specific way, we need every regression equation to have a polynomial time trend of the same order. However, for models with zero long-run covariance, the orders of the deterministic time trends need not be the same in order for the OLSE to be asymptotically efficient.

Acknowledgments

The authors are very grateful to an associate editor and a referee for many helpful comments. This work was supported by Korea Research Foundation Grant KRF-2002-042-C00008.
Appendix. Proofs

Proof of Lemma 1. We have $B_{\ell k} = \mathrm{diag}(d_{k1}/d_{\ell 1}, \ldots, d_{kp}/d_{\ell p})$. Thanks to the diagonal structure of $B_{\ell k}$, for nonsingularity of $C_B$ in A5(b), it suffices to show that of

$$C_j = \begin{bmatrix} c_{11}d_{1j}/d_{1j} & \cdots & c_{1K}d_{Kj}/d_{1j} \\ \vdots & & \vdots \\ c_{K1}d_{1j}/d_{Kj} & \cdots & c_{KK}d_{Kj}/d_{Kj} \end{bmatrix}$$

for $j = 1, \ldots, p$. Letting $D_j = \mathrm{diag}(d_{1j}, \ldots, d_{Kj})$ and noting $C_j = D_j^{-1}CD_j$, we get nonsingularity of $C_j$ from that of $C$. □
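The similarity step in the proof of Lemma 1 can be checked on a small example. The sketch below uses an arbitrary, assumed $3 \times 3$ matrix `C` and arbitrary nonzero scalars `d` (both hypothetical, chosen only for illustration), builds $C_j$ entrywise as $c_{k\ell}d_{\ell j}/d_{kj}$, and confirms that it equals $D_j^{-1}CD_j$, so that the two matrices share the same determinant.

```python
import numpy as np

# Assumed long-run covariance matrix C = [c_kl] and nonzero scalars d_kj for a fixed j.
C = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.5, 0.4],
              [0.3, 0.4, 1.0]])
d = np.array([1.0, -2.0, 0.5])
K = len(d)

# Entrywise definition used in the proof: (k, l) entry is c_kl * d_lj / d_kj.
C_j = np.array([[C[k, l] * d[l] / d[k] for l in range(K)] for k in range(K)])

# Similarity transform D_j^{-1} C D_j with D_j = diag(d_1j, ..., d_Kj).
D = np.diag(d)
C_sim = np.linalg.inv(D) @ C @ D

det_C = np.linalg.det(C)
det_Cj = np.linalg.det(C_j)
print(np.allclose(C_j, C_sim), det_C, det_Cj)
```

Because a similarity transform preserves the determinant, $C_j$ is nonsingular exactly when $C$ is.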
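Proof of Lemma 2. In the proof of this lemma, for notational simplicity, we suppress the superscript $k\ell$ and use $(\gamma_h, R_i)$ for $(\gamma_h^{k\ell}, R_i^{k\ell})$. For a matrix $B$, let $(B)_i$ denote the $i$th row vector of $B$. Interchanging the order of summations, together with finiteness of $\sum_{h=-\infty}^{\infty}|\gamma_h|$, we get

$$(\Gamma_{k\ell}X_\ell)_i = \sum_{j=1}^{n}\gamma_{i-j}x_{\ell j}' = \sum_{j=1}^{n}\gamma_{i-j}\sum_{t=1}^{j}u_{\ell t}' = \sum_{t=1}^{n}\sum_{j=t}^{n}\gamma_{i-j}u_{\ell t}'$$

$$= \sum_{t=1}^{i}\sum_{j=t}^{n}\gamma_{i-j}u_{\ell t}' + \sum_{t=i+1}^{n}\sum_{j=t}^{n}\gamma_{i-j}u_{\ell t}'$$

$$= \sum_{t=1}^{i}(\gamma_{i-n} + \gamma_{i-n+1} + \cdots + \gamma_{-1} + \gamma_0 + \gamma_1 + \cdots + \gamma_{i-t})u_{\ell t}' + \sum_{t=i+1}^{n}(\gamma_{i-n} + \cdots + \gamma_{i-t})u_{\ell t}'$$

$$= \sum_{t=1}^{i}\Big(\sum_{h=-\infty}^{\infty}\gamma_h - R_{i-n-1} - R_{i-t+1}\Big)u_{\ell t}' + \sum_{t=i+1}^{n}(R_{i-t} - R_{i-n-1})u_{\ell t}'$$

$$= c_{k\ell}x_{\ell i}' - \sum_{t=1}^{i}R_{i-t+1}u_{\ell t}' + \sum_{t=i+1}^{n}R_{i-t}u_{\ell t}' - R_{i-n-1}x_{\ell n}'$$

and hence the result. □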
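The decomposition (2) of Lemma 2 can be verified numerically in a case where all tail sums have closed forms. The sketch below is an illustration under stated assumptions, not part of the original proof: it takes $\gamma_h = \rho^{|h|}$ (so $c = (1+\rho)/(1-\rho)$ and $R_i = \rho^{|i|}/(1-\rho)$), starts the regressor at $x_{\ell 0} = 0$, and uses the sign convention in which the entry of $C_{k\ell}$ in row $i$, column $t$ is $R_{i-t+1}$ for $t \le i$ and $-R_{i-t}$ for $t > i$. The values of `rho`, `n`, `p` and the seed are arbitrary.

```python
import numpy as np

rho, n, p = 0.5, 8, 2                  # assumed decay rate and illustrative sizes
rng = np.random.default_rng(1)         # illustrative seed

def R(i):
    """Tail sums of gamma_h = rho**|h|: sum over h >= i (i >= 0) or h <= i (i < 0)."""
    return rho ** abs(i) / (1.0 - rho)

c = (1.0 + rho) / (1.0 - rho)          # long-run covariance: sum of gamma_h over all h
idx = np.arange(1, n + 1)
Gamma = rho ** np.abs(idx[:, None] - idx[None, :])   # (i, j) entry gamma_{i-j}

U = rng.standard_normal((n, p))        # increments u_t (rows)
X = np.cumsum(U, axis=0)               # x_t = x_{t-1} + u_t with x_0 = 0

# C has entry R_{i-t+1} on and below the diagonal, -R_{i-t} above it.
C = np.empty((n, n))
for i in idx:
    for t in idx:
        C[i - 1, t - 1] = R(i - t + 1) if t <= i else -R(i - t)

phi = np.array([R(i - n - 1) for i in idx])          # (R_{-n}, ..., R_{-1})'

lhs = Gamma @ X
rhs = c * X - C @ U - np.outer(phi, X[-1])
print(np.max(np.abs(lhs - rhs)))
```

For a nonzero initial value $x_{\ell 0}$, an additional term involving $x_{\ell 0}$ would appear; the sketch sidesteps this by setting it to zero.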
Proof of Lemma 3. The proof is the same as that of Proposition 2 of Shin and Oh (2002) for the unit root case. □

Proof of Theorem 1. According to A1,

$$n^{-1/2}\sum_{t=1}^{[nr]}[u_{1t}', \ldots, u_{Kt}', z_{1t}, \ldots, z_{Kt}]' \xrightarrow{d} [B_1'(r), \ldots, B_K'(r), W_1(r), \ldots, W_K(r)]'$$

for some Brownian motions $B_1(r), \ldots, B_K(r), W_1(r), \ldots, W_K(r)$ with positive definite covariance matrices, where $\xrightarrow{d}$ denotes convergence in distribution. Hence, $[n^{-2}X'X, n^{-1}X'Z] \xrightarrow{d} [H_{xx}, \Delta_x]$, where $H_{xx} = \mathrm{diag}[\int B_1B_1', \ldots, \int B_KB_K']$ and $\Delta_x = [\int B_1\,dW_1, \ldots, \int B_K\,dW_K]$. By A1, $H_{xx}$ is nonsingular almost surely. Note that we need the independence condition A4 for $n^{-1}X'Z \xrightarrow{d} \Delta_x$. We now have $n(\hat\beta_O - \beta) \xrightarrow{d} H_{xx}^{-1}\Delta_x$. By A5, $C$ in (3) is nonsingular. Thanks to (3), $X'\Gamma^{-1} = C^{-1}X' - C^{-1}V'\Gamma^{-1}$. Hence,

$$n^{-2}X'\Gamma^{-1}X = n^{-2}C^{-1}X'X - n^{-2}C^{-1}V'\Gamma^{-1}X = n^{-2}C^{-1}X'X + o_p(1) \xrightarrow{d} C^{-1}H_{xx}, \tag{6}$$

because $\|V'\Gamma^{-1}X\| \le \lambda_M(\Gamma^{-1})\|V\|\|X\| = o_p(n)O_p(n) = o_p(n^2)$ by A3 and Lemma 3, where, for a symmetric matrix $B$, $\lambda_M(B)$ denotes the maximum eigenvalue of $B$. We have used the fact that $\sup_n\lambda_M(\Gamma^{-1}) = [\inf_n\lambda_m(\Gamma)]^{-1} < \infty$ by A2. Also, we show

$$V'\Gamma^{-1}Z = o_p(n) \tag{7}$$

to get

$$n^{-1}X'\Gamma^{-1}Z = n^{-1}C^{-1}X'Z - n^{-1}C^{-1}V'\Gamma^{-1}Z = n^{-1}C^{-1}X'Z + o_p(1) \xrightarrow{d} C^{-1}\Delta_x. \tag{8}$$

Then $n(\hat\beta_G - \beta) = \{n^{-2}C^{-1}X'X + o_p(1)\}^{-1}\{n^{-1}C^{-1}X'Z + o_p(1)\} \xrightarrow{d} H_{xx}^{-1}\Delta_x$, establishing the result. Now, it remains to show (7).
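We first provide a proof for (7) by assuming that $x_{1t}, \ldots, x_{Kt}$ are all one-dimensional. Observe that, by Lemma 2, the $\ell$th column vector of $\Gamma X$ is

$$\begin{pmatrix} \Gamma_{1\ell}X_\ell \\ \vdots \\ \Gamma_{K\ell}X_\ell \end{pmatrix} = \begin{pmatrix} c_{1\ell}X_\ell \\ \vdots \\ c_{K\ell}X_\ell \end{pmatrix} - \begin{pmatrix} C_{1\ell}U_\ell \\ \vdots \\ C_{K\ell}U_\ell \end{pmatrix} - \begin{pmatrix} \varphi_{1\ell}x_{\ell n}' \\ \vdots \\ \varphi_{K\ell}x_{\ell n}' \end{pmatrix} = \begin{pmatrix} c_{1\ell}X_\ell \\ \vdots \\ c_{K\ell}X_\ell \end{pmatrix} - V_\ell^1 - V_\ell^2, \tag{9}$$

say. It suffices to show $Z'\Gamma^{-1}V_\ell^m = o_p(n)$, $m = 1, 2$, for each $\ell = 1, \ldots, K$. Let $\ell \in \{1, \ldots, K\}$ be fixed. Observe that, by the independence of $Z$ and $V_\ell^1$ from A4 and $E(Z) = 0$, we have

$$\mathrm{var}(Z'\Gamma^{-1}V_\ell^1) = E[\mathrm{tr}(Z'\Gamma^{-1}V_\ell^1V_\ell^{1\prime}\Gamma^{-1}Z)] = \mathrm{tr}[E(\Gamma^{-1}V_\ell^1V_\ell^{1\prime}\Gamma^{-1}ZZ')] = \mathrm{tr}[\Gamma^{-1}\mathrm{var}(V_\ell^1)].$$

For a symmetric $n \times n$ matrix $B$, let $\lambda(B) = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $\lambda_1 \ge \cdots \ge \lambda_n$ are the eigenvalues of $B$. Repeated applications of the inequality $\mathrm{tr}(BD) \le \mathrm{tr}(\lambda(B)\lambda(D))$ from Fuller (1987, p. 394) show that $\mathrm{tr}(\Gamma^{-1}\mathrm{var}(V_\ell^1))$ is bounded by

$$\lambda_M(\Gamma^{-1})\sum_{k=1}^{K}\mathrm{tr}(C_{k\ell}\Sigma_{\ell\ell}C_{k\ell}') \le \lambda_M(\Gamma^{-1})\lambda_M(\Sigma_{\ell\ell})\sum_{k=1}^{K}\mathrm{tr}(C_{k\ell}C_{k\ell}') \le \lambda_M(\Gamma^{-1})\lambda_M(\Sigma_{\ell\ell})\sum_{k=1}^{K}n\sum_{s=-n}^{n}(R_s^{k\ell})^2,$$

which is $o(n^2)$ by A2 and A3, where $\Sigma_{\ell\ell} = \mathrm{var}(U_\ell)$. This, together with $E(Z'\Gamma^{-1}V_\ell^1) = 0$, yields $Z'\Gamma^{-1}V_\ell^1 = o_p(n)$. Note that we have used boundedness of $\lambda_M(\Sigma_{\ell\ell})$, which is a consequence of $\lambda_M(\Sigma_{\ell\ell}) \le \sum_{h=-\infty}^{\infty}|\gamma_h^{\ell\ell}| < \infty$ by Gershgorin's theorem (Horn and Johnson, 1985) and A1(c). Finally, since $\|\varphi_{k\ell}\| = (\sum_{i=1}^{n}R_{-i}^2)^{1/2} = o(n^{1/2})$ and $x_{\ell n} = O_p(n^{1/2})$, we get $\|\varphi_{k\ell}x_{\ell n}'\| = o_p(n)$ and hence $Z'\Gamma^{-1}V_\ell^2 = o_p(n)$.

Now, consider models with regressors of general dimension. Note that, for $L \in \{1, 2, \ldots, \sum p_k\}$, the $L$th column vector of $\Gamma X$ is the same as (9) if $X_\ell$, $x_{\ell n}$, $U_\ell$ are replaced by some column vectors of $X_{\ell_0}$, $x_{\ell_0 n}$, $U_{\ell_0}$, respectively, for some $\ell_0 \in \{1, 2, \ldots, K\}$. Therefore, the above proof for models with one-dimensional regressors also applies to models with regressors of general dimension if the $\ell$th column vector of $\Gamma X$ is replaced by the $L$th column vector. □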
Proof of Theorem 2. It is obvious that

$$(n^{-1}A_n^{-1}G'GA_n^{-1},\ n^{-3/2}X'GA_n^{-1},\ n^{-2}X'X,\ n^{-1/2}A_n^{-1}G'Z,\ n^{-1}X'Z) \xrightarrow{d} (H_{gg}, H_{xg}, H_{xx}, \Delta_g, \Delta_x), \tag{10}$$

say, with an almost surely positive definite matrix

$$H = \begin{bmatrix} H_{gg} & H_{gx} \\ H_{xg} & H_{xx} \end{bmatrix},$$

where $H_{gx} = H_{xg}'$. Therefore, the limiting distribution of $[n^{1/2}A_n(\hat\alpha_O - \alpha)', n(\hat\beta_O - \beta)']$ is $H^{-1}\Delta$, where $\Delta = [\Delta_g', \Delta_x']'$. Letting $T_j = (1, 2^j, \ldots, n^j)'$, $G_k = (T_0, T_1, \ldots, T_{q_k})$, $G = \mathrm{diag}(G_1, \ldots, G_K)$, we write (4) as $Y = G\alpha + X\beta + Z$. Following the proofs of Lemmas 2 and 3, and utilizing the fact that $g_{kt} = g_{k,t-1} + u_{gkt}$ with each element of $u_{gkt}$ of degree one less than the corresponding element of $g_{kt}$ (instead of the fact that $u_{kt}$ is of smaller order than $x_{kt}$), we get $\Gamma G = GC_g + V_g$ with $V_g = (V_{g1}, \ldots, V_{gQ})$ satisfying $\|V_{gi}\| = o(a_{ni})$, where $a_{ni}$ is the order of $\|G_i\|$, $Q = \sum_{k=1}^{K}q_k$, $G_i$ is the $i$th column vector of $G = (G_1, \ldots, G_Q)$, and $C_g$ is defined below. If $q_1 = q_2 = \cdots = q_K = q$, then $A_n = I_K \otimes \mathrm{diag}(n^0, \ldots, n^q)$ and

$$C_g = \begin{bmatrix} c_{11} & \cdots & c_{1K} \\ \vdots & & \vdots \\ c_{K1} & \cdots & c_{KK} \end{bmatrix} \otimes I_{q+1},$$

where $\otimes$ denotes the Kronecker product. Note that $A_n$ is diagonal and, if $c_{k\ell} = 0$, $k \ne \ell$, then $C_g$ is diagonal. We hence have $C_gA_n = A_nC_g$ for both cases A5$'$(a$'$) and A5$'$(b$'$). By A5$'$, $C_g$ is nonsingular. Now, following the argument of (6) and (8),

$$n^{-1}A_n^{-1}G'\Gamma^{-1}GA_n^{-1} = n^{-1}A_n^{-1}C_g^{-1}G'GA_n^{-1} + o(1) = C_g^{-1}n^{-1}A_n^{-1}G'GA_n^{-1} + o(1),$$

$$n^{-3/2}A_n^{-1}G'\Gamma^{-1}X = n^{-3/2}A_n^{-1}C_g^{-1}G'X + o_p(1) = C_g^{-1}n^{-3/2}A_n^{-1}G'X + o_p(1),$$

$$n^{-3/2}X'\Gamma^{-1}GA_n^{-1} = C^{-1}n^{-3/2}X'GA_n^{-1} + o_p(1),$$

$$n^{-1/2}A_n^{-1}G'\Gamma^{-1}Z = n^{-1/2}A_n^{-1}C_g^{-1}G'Z + o_p(1) = C_g^{-1}n^{-1/2}A_n^{-1}G'Z + o_p(1).$$
Therefore, the result follows from

$$\begin{pmatrix} n^{-1}A_n^{-1}G'\Gamma^{-1}GA_n^{-1} & n^{-3/2}A_n^{-1}G'\Gamma^{-1}X \\ n^{-3/2}X'\Gamma^{-1}GA_n^{-1} & n^{-2}X'\Gamma^{-1}X \end{pmatrix} = \begin{pmatrix} C_g^{-1} & 0 \\ 0 & C^{-1} \end{pmatrix}\begin{pmatrix} n^{-1}A_n^{-1}G'GA_n^{-1} & n^{-3/2}A_n^{-1}G'X \\ n^{-3/2}X'GA_n^{-1} & n^{-2}X'X \end{pmatrix} + o_p(1)$$

and

$$\begin{pmatrix} n^{-1/2}A_n^{-1}G'\Gamma^{-1}Z \\ n^{-1}X'\Gamma^{-1}Z \end{pmatrix} = \begin{pmatrix} C_g^{-1} & 0 \\ 0 & C^{-1} \end{pmatrix}\begin{pmatrix} n^{-1/2}A_n^{-1}G'Z \\ n^{-1}X'Z \end{pmatrix} + o_p(1)$$

together with (6), (8), and (10). □

References

Baltagi, B.H., 1988. The efficiency of OLS in a seemingly unrelated regressions model. Econometric Theory 4, 536–537.
Bartels, R., Fiebig, D.G., 1991. A simple characterization of seemingly unrelated regressions models in which OLS is BLUE. Amer. Statist. 45, 137–140.
Brockwell, P.J., Davis, R.A., 1990. Time Series: Theory and Methods, second ed. Springer, New York.
Fuller, W.A., 1987. Measurement Error Models. Wiley, New York.
Grenander, U., Rosenblatt, M., 1957. Statistical Analysis of Stationary Time Series. Wiley, New York.
Hamilton, J.D., 1994. Time Series Analysis. Princeton University Press, New Jersey.
Horn, R.A., Johnson, C.R., 1985. Matrix Analysis. Cambridge University Press, Cambridge.
Johnston, J., DiNardo, J., 1997. Econometric Methods, fourth ed. The McGraw-Hill Companies, New York.
Kitamura, Y., Phillips, P.C.B., 1997. Fully modified IV, GIVE and GMM estimation with possibly non-stationary regressors and instruments. J. Econom. 80, 85–123.
Kramer, W., 1986. Least squares regression when the independent variable follows an ARIMA process. J. Amer. Statist. Assoc. 81, 150–154.
Kramer, W., Hassler, U., 1998. Limiting efficiency of OLS vs. GLS when the regressors are fractionally integrated. Econom. Lett. 60, 285–290.
Kruskal, W., 1968. When are Gauss–Markov and least squares estimators identical? A coordinate free approach. Ann. Math. Statist. 39, 70–75.
Li, K., 1999. Testing symmetry and proportionality in PPP: a panel-data approach. J. Business Econom. Statist. 17, 409–418.
Moon, H.R., 1999. A note on fully-modified estimation of seemingly unrelated regressions models with integrated regressors. Econom. Lett. 65, 25–31.
Phillips, P.C.B., Durlauf, S.N., 1986. Multiple time series regression with integrated processes. Rev. Econom. Stud. 473–495.
Phillips, P.C.B., Hansen, B.E., 1990. Statistical inference in instrumental variables regression with I(1) processes. Rev. Econom. Stud. 57, 99–125.
Phillips, P.C.B., Park, J., 1988. Asymptotic equivalence of ordinary least squares and generalized least squares in regressions with integrated regressors. J. Amer. Statist. Assoc. 83, 111–115.
Shin, D.W., Oh, M.S., 2002. Asymptotic efficiency of the ordinary least squares estimator for regressions with unstable regressors. Econom. Theory 18, 1121–1138.
Shin, D.W., Oh, M.S., 2004. Fully-modified semiparametric GLS estimation for regressions with nonstationary seasonal regressors. J. Econom. 122, 247–280.
Torous, W., Valkanov, R., Yan, S., 2004. On predicting stock returns with nearly integrated explanatory variables. J. Business 77, 937–966.
Zellner, A., 1962. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J. Amer. Statist. Assoc. 57, 348–368.
Zyskind, G., 1967. On canonical forms, nonnegative covariance matrices and best and simple least squares linear estimators in linear models. Ann. Math. Statist. 38, 1092–1109.