Journal of Econometrics 166 (2012) 167–183

Some properties of the LIML estimator in a dynamic panel structural equation

Kentaro Akashi (a), Naoto Kunitomo (b,*)

(a) Faculty of Economics, Gakushuin University, Japan
(b) Graduate School of Economics, University of Tokyo, Japan

Article history: Received 14 March 2010; received in revised form 20 December 2010; accepted 19 August 2011; available online 3 September 2011.
JEL classification: C23; C33; C36

Abstract: We investigate the finite sample and asymptotic properties of the within-groups (WG), the random-effects quasi-maximum likelihood (RQML), the generalized method of moments (GMM) and the limited information maximum likelihood (LIML) estimators for a panel autoregressive structural equation model with random effects when both T (the time dimension) and N (the cross-section dimension) are large. When the forward-filtering due to Alvarez and Arellano (2003) is used, the WG, RQML and GMM estimators are significantly biased when both T and N are large and T/N does not converge to zero. The LIML estimator has desirable asymptotic properties when T/N converges to a constant. © 2011 Elsevier B.V. All rights reserved.

Keywords: Dynamic panel model; Simultaneous equation; Within-groups estimator; RQML; GMM; LIML; Many orthogonal conditions

1. Introduction

Recently there has been growing interest in panel econometric models in micro-econometrics, and they have become indispensable tools for econometric analysis. (See Hsiao (2003), Arellano (2003a,b) and Baltagi (2005), for instance.) In particular, dynamic panel models have often been used in empirical applications. Earlier investigations of some aspects of dynamic panel models were carried out by Anderson and Hsiao (1981, 1982). However, there remain non-trivial statistical problems in estimating dynamic panel econometric models. When we use lagged explained variables as well as other explanatory variables with individual effects in panel regression models, a natural question among econometricians is what would happen if one of the variables were actually endogenous in the economic system. When there is an endogenous variable in a dynamic panel model with individual effects, it is not obvious how to estimate such a structural equation, because complicated interactions are generated



Corresponding author. Tel.: +81 3 5841 5614; fax: +81 3 5841 5521. E-mail address: [email protected] (N. Kunitomo).

doi:10.1016/j.jeconom.2011.08.005

by the lagged endogenous variables and the individual effects in the econometric model at the same time. In a pioneering work, Alvarez and Arellano (2003) investigated the asymptotic behavior of alternative estimation methods, namely the WG (within-groups), the random-effects quasi-maximum likelihood (RQML), the GMM (generalized method of moments) and the LIML (limited information maximum likelihood) estimators, for the coefficient of a dynamic panel regression model when both N (the number of individuals) and T (the number of observation periods) are large. They studied the asymptotic properties of these estimators when both N and T go to infinity and derived their asymptotic distributions. They obtained interesting findings; however, one remaining major issue in econometrics is the effect of endogeneity among the explanatory variables in a dynamic panel structural equation. One important aspect of this problem is that, except when T is very small, dynamic panel models involve many orthogonality conditions, and the use of GMM can then be problematic because of incidental parameters, in the light of recent work on the estimation of structural equations by Anderson et al. (2008, 2010). The main purpose of this paper is to investigate the finite sample and the asymptotic properties of alternative estimators (the WG, the RQML, the GMM and the LIML estimators) for a


coefficient in a specific dynamic panel structural equation. The model we consider is intentionally very simple, because this makes it possible to obtain precise information on the finite sample as well as the asymptotic properties of alternative estimators, which should eventually be useful for practical problems. In a companion paper (Akashi and Kunitomo, 2010b), we develop the general formulation of the estimation methods for dynamic panel structural equations with endogeneity, individual effects and many orthogonality conditions. There we derive rather general results, including the asymptotic distributions of alternative estimation methods and the asymptotic optimality of estimation, and those results may look rather complicated at first glance. In order to present our general results in a meaningful and persuasive way, this paper uses a particular dynamic panel structural equation with an endogenous variable and individual effects as the typical and standard case, which was originally used by Blundell and Bond (2000).

In Section 2, we present the panel structural model and define the alternative estimation methods. In Section 3 we establish the asymptotic properties of the estimators considered and discuss their asymptotic behavior. In Section 4 we discuss the finite sample properties of the estimators based on their empirical distribution functions in Monte Carlo simulations. In Section 5, some concluding remarks are presented. All mathematical proofs of our theoretical results are in Appendix A and some figures on the normalized distribution functions of alternative estimators are in Appendix B.

2. The panel model and estimation methods

We consider a dynamic panel model with an endogenous variable (Blundell and Bond, 2000)

  y_{it}^{(1)} = \beta_2 y_{it}^{(2)} + \gamma_1 y_{i,t-1}^{(1)} + \eta_i + u_{it}^{(1)},   (2.1)

  y_{it}^{(2)} = \gamma_2 y_{i,t-1}^{(2)} + \delta \eta_i + u_{it}^{(2)},   (2.2)

for i = 1, ..., N; t = 1, ..., T, where (u_{it}^{(1)}, u_{it}^{(2)}) are the disturbance terms. In the first structural equation there is an endogenous variable y_{it}^{(2)} together with a lagged endogenous regressor y_{i,t-1}^{(1)} and individual effects \eta_i. Define the reduced form

  y_{it} = \Pi y_{i,t-1} + \pi_i + v_{it},   (2.3)

where y_{it} = (y_{it}^{(1)}, y_{it}^{(2)})', v_{it} = (v_{it}^{(1)}, v_{it}^{(2)})' and

  \Pi = B^{-1} \begin{pmatrix} \gamma_1 & 0 \\ 0 & \gamma_2 \end{pmatrix}, \quad
  \pi_i = B^{-1} \begin{pmatrix} 1 \\ \delta \end{pmatrix} \eta_i, \quad
  v_{it} = B^{-1} \begin{pmatrix} u_{it}^{(1)} \\ u_{it}^{(2)} \end{pmatrix}, \quad
  B = \begin{pmatrix} 1 & -\beta_2 \\ 0 & 1 \end{pmatrix}.   (2.4)

Although we call Eq. (2.3) or (2.5) the reduced form, y_{i,t-1} is correlated with the unobserved \pi_i in (2.3), and y_{i,t-1}^{(f)} is also correlated with v_{it}^{(f)} in (2.5) below as a consequence of applying the forward filtering. This is an important difference from the standard simultaneous equation problem. Alvarez and Arellano (2003) considered the single equation (2.1) when \beta_2 = 0 and investigated the estimation of the coefficient \gamma_1 without (2.2). They showed that the WG, the GMM and the LIML estimators are consistent and have the same asymptotic variance when both N and T go to infinity while their ratio converges to a constant. An important aspect of their setting is the lack of endogeneity between the reduced form (2.5) and the structural equation (2.1), which is of our main interest in the present study. For the dynamic structural equation problem the parameters of interest are both \beta_2 and \gamma_1, and we focus on four estimators of these parameters in this paper.

2.1. The within-groups and the RQML estimators

We define the forward-filter (Arellano and Bover, 1995) on the variables z_{it} (i = 1, ..., N; t = 1, ..., T) by the set of transformations such that each element z_{it}^{(f)} constructed from z_{is} (s = t, ..., T) has the form

  z_{it}^{(f)} = c_t \left[ z_{it} - \frac{1}{T-t}\,(z_{i,t+1} + \cdots + z_{iT}) \right]   (t = 1, ..., T-1),

with c_t^2 = (T-t)/(T-t+1). By applying the forward-filter to each variable on both sides of (2.3), we write the forward-filtered reduced form

  y_{it}^{(f)} = \Pi y_{i,t-1}^{(f)} + v_{it}^{(f)},   (2.5)

where the superscript (f) denotes the forward-filtered variable.

Define the forward-filtered variables y_t^{(g,f)} = (y_{1t}^{(g,f)}, ..., y_{Nt}^{(g,f)})' (g = 1, 2) for the two endogenous variables y_{it}^{(g)}, and set the N x 2 matrix Y_{t\cdot t-1}^{(f)} = (y_t^{(2,f)}, y_{t-1}^{(1,f)}). Then the WG (within-groups) estimator of \theta = (\beta_2, \gamma_1)' is given by

  \hat\theta_{WG} = \left[ \sum_{t=1}^{T-1} Y_{t\cdot t-1}^{(f)\prime} Y_{t\cdot t-1}^{(f)} \right]^{-1} \left[ \sum_{t=1}^{T-1} Y_{t\cdot t-1}^{(f)\prime} y_t^{(1,f)} \right],   (2.6)

where the forward-filtered variables are generated by the forward orthogonal deviations (T-1) x T matrix A_f, such that A_f' A_f = Q_T, A_f A_f' = I_{T-1}, Q_T = I_T - \iota_T \iota_T'/T (the WG operator) and \iota_T is a T x 1 vector whose elements are ones. The form (2.6) is the OLS estimator written in terms of the orthogonal deviations, and notice that, since A_f \iota_T = 0, the individual effects are differenced out when the orthogonal deviations of the associated variables are taken from the original observations.

Alvarez and Arellano (2003) also investigated the properties of the random effects quasi-ML (RQML) estimator, which can be defined by

  \hat\theta_{RQML} = \arg\min_{\theta} \left\{ \sum_{t=1}^{T-1} \log\left[ (y_t^{(1,f)} - Y_{t\cdot t-1}^{(f)}\theta)'(y_t^{(1,f)} - Y_{t\cdot t-1}^{(f)}\theta) \right] + \frac{1}{T-1}\log\left[ (\bar y^{(1)} - \bar Y_{\cdot -1}\theta)' S_0 (\bar y^{(1)} - \bar Y_{\cdot -1}\theta) \right] \right\},   (2.7)

where \bar y^{(1)} = T^{-1}\sum_{t=1}^{T} y_t^{(1)}, \bar Y_{\cdot -1} = T^{-1}\sum_{t=1}^{T} Y_{t\cdot t-1}, S_0 = I_N - y_0^{(1)}(y_0^{(1)\prime} y_0^{(1)})^{-1} y_0^{(1)\prime}, y_t^{(1)} = (y_{1t}^{(1)}, ..., y_{Nt}^{(1)})' is an N x 1 vector and Y_{t\cdot t-1} = (y_t^{(2)}, y_{t-1}^{(1)}) is an N x 2 matrix.

2.2. A GMM estimator

A GMM (generalized method of moments) estimator is given by

  \hat\theta_{GMM} = \left[ \sum_{t=1}^{T-1} Y_{t\cdot t-1}^{(f)\prime} M_t Y_{t\cdot t-1}^{(f)} \right]^{-1} \left[ \sum_{t=1}^{T-1} Y_{t\cdot t-1}^{(f)\prime} M_t y_t^{(1,f)} \right],   (2.8)

where M_t = Z_t (Z_t' Z_t)^{-1} Z_t' and Z_t = (y_0^{(1)}, y_0^{(2)}, ..., y_{t-1}^{(1)}, y_{t-1}^{(2)}) is the N x 2t matrix of instrumental variables. The GMM estimator is identical to the one given by Arellano and Bond (1991),^1 which has the form written in terms of the orthogonal deviations. An interesting feature of this estimation method is that it uses all available instrumental variables at each t, and the orthogonality conditions are given by

  E[y_{is}^{(g)} u_{it}^{(1,f)}] = 0   (0 ≤ s < t = 1, ..., T-1; g = 1, 2),

where u_{it}^{(1,f)} stands for the forward-filtered structural error term. In this formulation the number of orthogonality conditions r_n can often be substantial, i.e., r_n = 2 x T(T-1)/2. In this paper we use the notation that the total number of observations is n = NT, and r_n can depend on n.

^1 See Chapter 4 of Hsiao (2003), for instance.
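To make the forward-filtering in (2.5) and the estimators (2.6) and (2.8) concrete, here is a minimal numpy sketch. It is our own illustration, not code from the paper: the array layout (N x (T+1) arrays whose column 0 holds y_{i0}) and the convention of applying the same orthogonal-deviations transform to the lagged variable are assumptions made only for this sketch.

```python
import numpy as np

def forward_filter(z):
    """Forward orthogonal deviations: for an N x T array z, return the N x (T-1) array
    z_it^(f) = c_t [ z_it - (z_{i,t+1} + ... + z_{iT}) / (T - t) ], c_t^2 = (T-t)/(T-t+1)."""
    N, T = z.shape
    out = np.empty((N, T - 1))
    for t in range(T - 1):
        remaining = T - (t + 1)                     # number of future periods averaged over
        c_t = np.sqrt(remaining / (remaining + 1.0))
        out[:, t] = c_t * (z[:, t] - z[:, t + 1:].mean(axis=1))
    return out

def _filtered_blocks(y1, y2):
    """Filtered y^(1)_t, y^(2)_t and the filtered lag y^(1)_{t-1}, for t = 1, ..., T-1."""
    return (forward_filter(y1[:, 1:]), forward_filter(y2[:, 1:]), forward_filter(y1[:, :-1]))

def wg_estimator(y1, y2):
    """Within-groups estimator (2.6) of theta = (beta_2, gamma_1)'."""
    y1f, y2f, y1lagf = _filtered_blocks(y1, y2)
    A = np.zeros((2, 2)); b = np.zeros(2)
    for t in range(y1f.shape[1]):
        Y = np.column_stack([y2f[:, t], y1lagf[:, t]])       # Y^(f)_{t.t-1}
        A += Y.T @ Y
        b += Y.T @ y1f[:, t]
    return np.linalg.solve(A, b)

def gmm_estimator(y1, y2):
    """GMM estimator (2.8) with instruments Z_t = (y_0, ..., y_{t-1}) of both variables."""
    y1f, y2f, y1lagf = _filtered_blocks(y1, y2)
    A = np.zeros((2, 2)); b = np.zeros(2)
    for t in range(y1f.shape[1]):                            # loop index t is the paper's t - 1
        Z = np.column_stack([y1[:, :t + 1], y2[:, :t + 1]])  # N x 2t instrument matrix
        M = Z @ np.linalg.pinv(Z.T @ Z) @ Z.T                # projection M_t = Z_t(Z_t'Z_t)^{-1}Z_t'
        Y = np.column_stack([y2f[:, t], y1lagf[:, t]])
        A += Y.T @ M @ Y
        b += Y.T @ M @ y1f[:, t]
    return np.linalg.solve(A, b)
```

The pseudo-inverse is used only as a numerical safeguard; the text assumes that the limit of T/N is at most 1/2 precisely so that Z'_{T-1} Z_{T-1} is nonsingular.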

2.3. The LIML estimator

The LIML (limited information maximum likelihood) estimator was originally developed by Anderson and Rubin (1949, 1950) for the classical simultaneous equation problem, and we apply this estimation method to the sets of filtered variables in a dynamic panel structural equation model. Define two 3 x 3 matrices by

  G^{(f)} = \sum_{t=1}^{T-1} (y_t^{(1,f)}, Y_{t\cdot t-1}^{(f)})' \, M_t \, (y_t^{(1,f)}, Y_{t\cdot t-1}^{(f)})   (2.9)

and

  H^{(f)} = \sum_{t=1}^{T-1} (y_t^{(1,f)}, Y_{t\cdot t-1}^{(f)})' \, [I_N - M_t] \, (y_t^{(1,f)}, Y_{t\cdot t-1}^{(f)}).   (2.10)

Then the LIML estimator (1, -\hat\theta_{LI}')' = (1, -\hat\beta_{2\cdot LI}, -\hat\gamma_{1\cdot LI})' is defined by

  \left[ \frac{1}{n} G^{(f)} - \lambda_n \frac{1}{q_n} H^{(f)} \right] \begin{pmatrix} 1 \\ -\hat\theta_{LI} \end{pmatrix} = 0,   (2.11)

where n = NT, q_n = n - r_n, and \lambda_n is the smallest root of

  \left| \frac{1}{n} G^{(f)} - \lambda_n \frac{1}{q_n} H^{(f)} \right| = 0.   (2.12)

The solution to (2.11) gives the minimum of the variance ratio

  R_n = \frac{(1, -\theta') G^{(f)} (1, -\theta')'}{(1, -\theta') H^{(f)} (1, -\theta')'}.   (2.13)

We note that the LIML estimation in our formulation does not depend on a particular distribution of the disturbances, although the original derivation by Anderson and Rubin (1949, 1950) assumed normal disturbances, and it can be interpreted as a semi-parametric estimation method. By setting \lambda_n = 0 in (2.11), the solution minimizing the numerator of (2.13) is the (forward-filtered) two-stage least squares (TSLS) estimator, which is equivalent to (2.8) in this case. The relation between the LIML and the TSLS estimators has been investigated intensively in the literature on classical structural equation estimation (Anderson et al. (1982), for instance) and on structural equation estimation with many instruments (Anderson et al. (2008, 2010), for instance).

3. Asymptotic properties of estimators

In this section we state our main results on the asymptotic properties of estimators when both N and T go to infinity. For this purpose we first state a set of assumptions.

Assumptions.
(A1) {v_{it}} (i = 1, ..., N; t = 1, ..., T) are i.i.d. across time and individuals and independent of \eta_i and y_{i0}, with E[v_{it}] = 0, E[v_{it} v_{it}'] = \Omega, and E[\|v_{it}\|^8] exists.
(A2) The initial observations satisfy
  y_{i0} = (I_2 - \Pi)^{-1} \pi_i + w_{i0}   (i = 1, ..., N),
where w_{i0} is independent of \eta_i and i.i.d. with the steady state distribution of the homogeneous process, so that we can write w_{i0} = \sum_{j=0}^{\infty} \Pi^j v_{i(-j)}.
(A3) \eta_i are i.i.d. across individuals with E[\eta_i] = 0, Var[\eta_i] = \sigma_\eta^2, and finite fourth order moments.
(A4) The true parameters satisfy |\gamma_1| < 1, |\gamma_2| < 1, \gamma_2 \neq 0, and \gamma_1 \neq \gamma_2.

The conditions (A1)-(A3) and the stationarity condition (A4) are analogues of the assumptions used in Alvarez and Arellano (2003). They could certainly be relaxed, for instance to panel econometric models with fixed individual effects, but this would add further complications to our derivations and discussions. The condition \gamma_2 \neq 0 is the rank condition for the identification of \beta_2. It is mathematically convenient to assume \gamma_1 \neq \gamma_2 for analyzing the dynamic process of y_{it}. We also assume that the limit of T/N is equal to 1/2 or less. This condition is necessary to define the GMM and LIML estimators appropriately, that is, to ensure the non-singularity of Z_{T-1}' Z_{T-1}.

Let

  \pi = (I_2 - \Pi)^{-1} B^{-1} \begin{pmatrix} 1 \\ \delta \end{pmatrix}   (3.1)

and let the underlying stationary process be

  w_{it} = y_{it} - \pi \eta_i.   (3.2)

Then we write the auto-covariance matrices of {w_{it}} as \Gamma_h = E[w_{it} w_{it+h}'] for h ≥ 0 under the stationarity assumption, which are given by

  \Gamma_h = \sum_{j=0}^{\infty} \Pi^j \Omega (\Pi^{j+h})'   (3.3)

and \Omega = (\omega_{gh}) (g, h = 1, 2). For the WG, the RQML and the GMM estimators, we have the next result. The proof is given in Appendix A.

Theorem 1. Let Assumptions (A1)-(A4) hold.
(i) As T \to \infty, regardless of whether N is fixed or tends to infinity,

  \hat\theta_{WG} - \begin{pmatrix} \beta_2 \\ \gamma_1 \end{pmatrix} \overset{p}{\to} \left[ \Phi^* + E(v_{it}^{(2)2}) \begin{pmatrix} 1 \\ 0 \end{pmatrix} (1, 0) \right]^{-1} \begin{pmatrix} E(v_{it}^{(2)} u_{it}^{(1)}) \\ 0 \end{pmatrix},   (3.4)

and \hat\theta_{RQML} has the same asymptotic property as (3.4).
(ii) Assume T/N \to c and 0 ≤ c ≤ 1/2 as N and T \to \infty. Then

  \hat\theta_{GMM} - \begin{pmatrix} \beta_2 \\ \gamma_1 \end{pmatrix} \overset{p}{\to} \left[ \Phi^* + c\, E(v_{it}^{(2)2}) \begin{pmatrix} 1 \\ 0 \end{pmatrix} (1, 0) \right]^{-1} \begin{pmatrix} c\, E(v_{it}^{(2)} u_{it}^{(1)}) \\ 0 \end{pmatrix},   (3.5)

where \Phi^* = D' \Gamma_0 D and

  D = \begin{pmatrix} 0 & 1 \\ \gamma_2 & 0 \end{pmatrix}.

(iii) When c = 0, we additionally assume that 0 ≤ \lim_{N,T\to\infty} (T^3/N) = d_a < \infty. Then

  \sqrt{NT}\left[ \hat\theta_{GMM} - \begin{pmatrix} \beta_2 \\ \gamma_1 \end{pmatrix} \right] \overset{d}{\to} N(b_0^{(a)}, \sigma^2 \Phi^{*-1}),   (3.6)

where

  b_0^{(a)} = d_a^{1/2}\, \Phi^{*-1} \begin{pmatrix} (0,1)\Omega\beta \\ 0 \end{pmatrix}.   (3.7)
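As an illustration of how the LIML estimator defined through (2.9)-(2.12) can be computed, here is a minimal sketch (ours, not the authors' code) that accumulates G^{(f)} and H^{(f)} and takes the eigenvector attached to the smallest root of (2.12); forward_filter and the array conventions are those of the sketch in Section 2.

```python
import numpy as np
from scipy.linalg import eigh   # generalized symmetric eigenproblem G v = lambda H v

def liml_estimator(y1, y2):
    """LIML estimator of theta = (beta_2, gamma_1)' via (2.11)-(2.12).
    Uses forward_filter and the N x (T+1) array layout from the earlier sketch."""
    N, T = y1.shape[0], y1.shape[1] - 1
    y1f, y2f = forward_filter(y1[:, 1:]), forward_filter(y2[:, 1:])
    y1lagf = forward_filter(y1[:, :-1])
    G = np.zeros((3, 3)); H = np.zeros((3, 3)); r_n = 0
    for t in range(T - 1):
        Z = np.column_stack([y1[:, :t + 1], y2[:, :t + 1]])       # N x 2t instruments
        M = Z @ np.linalg.pinv(Z.T @ Z) @ Z.T
        W = np.column_stack([y1f[:, t], y2f[:, t], y1lagf[:, t]]) # (y_t^(1,f), Y^(f)_{t.t-1})
        G += W.T @ M @ W
        H += W.T @ (np.eye(N) - M) @ W
        r_n += Z.shape[1]                                         # accumulates r_n = 2 T(T-1)/2
    n, q_n = N * T, N * T - r_n
    vals, vecs = eigh(G / n, H / q_n)    # eigenvalues in ascending order; H/q_n assumed p.d.
    v = vecs[:, 0]                       # eigenvector for the smallest root lambda_n
    v = v / v[0]                         # normalize to (1, -theta')'
    return -v[1:]
```

Setting the root to zero instead, i.e. minimizing only the numerator of (2.13), reproduces the forward-filtered TSLS estimator, which coincides with (2.8) here.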

When E(v_{it}^{(2)} u_{it}^{(1)}) ≠ 0, so that we have an endogenous variable, the WG estimator is generally inconsistent even though T tends to infinity, and the GMM estimator also becomes inconsistent if c ≠ 0. Arellano (2003a,b) has derived the order of the endogeneity bias of the GMM estimator in a more general setting.

For the LIML estimator we give lengthy arguments deriving its asymptotic behavior when both N and T go to infinity while T/N tends to a positive constant. We summarize our results; the proof is given in Appendix A.

Theorem 2. Let Assumptions (A1)-(A4) hold. Assume N and T \to \infty and T/N \to c (0 ≤ c ≤ 1/2). Then \hat\theta_{LI} \overset{p}{\to} \theta and

  \sqrt{NT}\left[ \hat\theta_{LI} - \begin{pmatrix} \beta_2 \\ \gamma_1 \end{pmatrix} \right] \overset{d}{\to} N(b_c^{(a)}, \Psi^*),   (3.8)

where

  b_c^{(a)} = \left[ -\frac{\sqrt{c}}{1-c} \right] \Phi^{*-1} D'(I_2 - \Pi)^{-1}\Omega\beta,   (3.9)

  \Psi^* = \Phi^{*-1}\left[ \sigma^2\Phi^* + (c^*|\Omega| + \Xi_4)\begin{pmatrix} 1 \\ 0 \end{pmatrix}(1,0) + \Xi_3 + \Xi_3' \right]\Phi^{*-1},   (3.10)

  \Xi_3 = \frac{E(u_{it}^{(1)2} u_{it}^{\perp})}{1-c}\, \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1} D'\left( E[W_{t-1}' d_t], 0 \right),   (3.11)

  \Xi_4 = \frac{E[(u_{it}^{(1)2} - \sigma^2)\, u_{it}^{\perp 2}]}{(1-c)^2}\, \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}\left( E[d_t' d_t] - c^2 \right),   (3.12)

and \beta = (1, -\beta_2)', \sigma^2 = E[u_{it}^{(1)2}], c^* = c/(1-c), W_{t-1}' = (w_{1t-1}, ..., w_{Nt-1}) is a 2 x N matrix, d_t is the N x 1 vector containing the diagonal elements of M_t = Z_t(Z_t'Z_t)^{-1}Z_t', and u_{it}^{\perp} is defined by

  u_{it}^{\perp} = (0, 1)\left[ I_2 - \frac{\Omega\beta\beta'}{\beta'\Omega\beta} \right] v_{it},   (3.13)

provided that \Xi_3 and \Xi_4 are well-defined.

The asymptotic covariance (3.10) of the LIML estimator has the same structure as the recent result by Anderson et al. (2010). However, we have an extra asymptotic bias term, which depends on the limiting behavior of T/N. This is due to the effects of the forward-filtering in our formulation of dynamic panel structural equations. When the disturbances are normally distributed, the asymptotic covariance becomes

  \Psi^* = \Phi^{*-1}\left[ \sigma^2\Phi^* + c^*|\Omega|\begin{pmatrix} 1 \\ 0 \end{pmatrix}(1,0) \right]\Phi^{*-1},   (3.14)

which is much simpler than the general case of (3.10). We expect that the additional two terms in (3.10) are often small in comparison with the two leading terms. Because we have many orthogonality conditions, the second term of (3.10) is present when c ≠ 0.
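To make the formulas above operational, a small numerical sketch (ours; the function and parameter names are assumptions, not from the paper) evaluates \Phi^* = D'\Gamma_0 D, the LIML bias b_c^{(a)} in (3.9) and the normal-disturbance covariance \Psi^* in (3.14) from (\beta_2, \gamma_1, \gamma_2, \Omega, c), using the relation \Gamma_0 - \Pi\Gamma_0\Pi' = \Omega implied by (3.3).

```python
import numpy as np

def asymptotic_bias_and_variance(beta2, gamma1, gamma2, Omega, c):
    """Evaluate Phi* = D' Gamma_0 D, the LIML bias b_c^(a) of (3.9) and the
    normal-disturbance covariance Psi* of (3.14) for given parameter values."""
    B = np.array([[1.0, -beta2], [0.0, 1.0]])
    Pi = np.linalg.inv(B) @ np.diag([gamma1, gamma2])           # reduced-form matrix (2.4)
    # Gamma_0 solves Gamma_0 - Pi Gamma_0 Pi' = Omega, i.e. vec(Gamma_0) = (I4 - Pi x Pi)^{-1} vec(Omega)
    vecG0 = np.linalg.solve(np.eye(4) - np.kron(Pi, Pi), Omega.flatten(order="F"))
    Gamma0 = vecG0.reshape(2, 2, order="F")
    D = np.array([[0.0, 1.0], [gamma2, 0.0]])
    Phi = D.T @ Gamma0 @ D                                      # Phi* = D' Gamma_0 D
    beta = np.array([1.0, -beta2])
    sigma2 = beta @ Omega @ beta                                # sigma^2 = E[u_it^(1)2], since u^(1) = v^(1) - beta2 v^(2)
    Phi_inv = np.linalg.inv(Phi)
    b_c = -np.sqrt(c) / (1.0 - c) * Phi_inv @ D.T @ np.linalg.inv(np.eye(2) - Pi) @ Omega @ beta   # (3.9)
    c_star = c / (1.0 - c)
    e1 = np.array([[1.0], [0.0]])
    Psi = Phi_inv @ (sigma2 * Phi + c_star * np.linalg.det(Omega) * (e1 @ e1.T)) @ Phi_inv         # (3.14)
    return Phi, b_c, Psi
```

For the design of Section 4, (\beta_2, \gamma_1, \gamma_2) = (.5, .5, .3) with unit error variances, covariance .3 and c = 1/3, this routine supplies the quantities needed in the normalizations (4.1) and (4.2) below.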

4. On finite sample properties

It is important to investigate the finite sample properties of estimators, partly because they are not necessarily similar to their asymptotic properties. One simple example is the fact that the exact moments of some estimators do not necessarily exist. (In that case it may be meaningless to compare the exact MSEs of alternative estimators and their Monte Carlo analogues.) Hence we have investigated the distribution functions of several estimators in the normalized form given by

  \frac{\sqrt{NT}}{\sigma} \begin{pmatrix} (\phi^{11})^{-1/2} & 0 \\ 0 & (\phi^{22})^{-1/2} \end{pmatrix} \begin{pmatrix} \hat\beta_2 - \beta_2 \\ \hat\gamma_1 - \gamma_1 \end{pmatrix},   (4.1)

where \phi^{11} and \phi^{22} are the (1,1)-th and (2,2)-th elements of \Phi^{*-1}, respectively. We have chosen this standardization because the limiting (marginal) distributions of the LIML estimator of each coefficient in the form of (3.8) are N(0, 1) when c = 0.

We have conducted our numerical investigations in a systematic way. We first set the unknown parameters as (\beta_2, \gamma_1) = (.5, .5), \gamma_2 = .3, \delta = Var[\eta_i]/\sigma^2 = 1, Var[v_{it}^{(g)}] = 1 (g = 1, 2) and Cov[v_{it}^{(1)}, v_{it}^{(2)}] = .3. Then we generate a large number of normal random variables by simulation and calculate the empirical distribution function in the form of (4.1). We use 5000 replications for each case and a smoothing technique to estimate the empirical distribution functions. The details of the simulations are similar to those explained by Anderson et al. (2005, 2008). We report mainly the results for (N, T) = (75, 25) and (150, 50) as typical cases among a large number of experiments.

When N and T are large, the WG estimator is badly biased. (We have a similar observation for the RQML estimator; we discuss its properties later.) The GMM estimator is badly biased unless T is much smaller than N, with T^3/N converging to a constant as the minimum requirement. We have confirmed the asymptotic behavior of these estimators in Figs. 2, 4, 6 and 8. Figs. 1, 3, 5 and 7 show the distribution function of the LIML estimator in the particular normalization (4.1). We have found that the distributions are sometimes biased and that the normalization by the limiting covariance matrix is not appropriate in such situations (the circles in the figures are the standard normal distribution function N(0, 1)). In many figures we have drawn the distribution function after removing the bias term, as the bias-corrected LIML (LIML-bias-correct in the figures), and we have also drawn the distribution function of the LIML estimator using the normalization factor given by Theorem 2, as the variance-corrected LIML (LIML-var-correct in the figures), that is,

  \sqrt{NT} \begin{pmatrix} \psi_{11}^{-1/2} & 0 \\ 0 & \psi_{22}^{-1/2} \end{pmatrix} \left[ \begin{pmatrix} \hat\beta_2 - \beta_2 \\ \hat\gamma_1 - \gamma_1 \end{pmatrix} - \frac{1}{\sqrt{NT}}\, b_c^{(a)} \right],   (4.2)

where \psi_{11} and \psi_{22} are the (1,1)-th and the (2,2)-th elements of \Psi^* in (3.14), respectively. The resulting curves are called the LIML distribution with large-K asymptotics (LIML-large-K in the figures), which corresponds to the case in which both the bias and the variance of the distribution function of the LIML estimator are corrected based on Theorem 2.

From these figures, we have confirmed that the limiting normal distributions approximate the finite sample distribution functions of the LIML estimator quite well, as Theorems 1 and 2 state. There are immediate implications. First, the GMM estimator is badly biased when T is large and it should not be used unless T is very small. (The WG estimator is badly biased even when T is small.) Second, in order to use the limiting normal distribution of the LIML estimator for statistical inference, it is important to adjust the asymptotic bias and the asymptotic variance using the formulas of Theorem 2. Since we have explicit formulas for the bias and the covariance, it is straightforward to use them in practical applications.

Fig. 1. β2 (BB-model): N = 75, T = 25, c = 1/3.
Fig. 2. β2 (BB-model): N = 75, T = 25, c = 1/3.
Fig. 3. γ1 (BB-model): N = 75, T = 25, c = 1/3.
Fig. 4. γ1 (BB-model): N = 75, T = 25, c = 1/3.
Fig. 5. β2 (BB-model): N = 150, T = 50, c = 1/3.
Fig. 6. β2 (BB-model): N = 150, T = 50, c = 1/3.
Fig. 7. γ1 (BB-model): N = 150, T = 50, c = 1/3.
Fig. 8. γ1 (BB-model): N = 150, T = 50, c = 1/3.
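The Monte Carlo design described above can be reproduced along the following lines. This is our own minimal sketch (the burn-in length, random-number generator and function names are assumptions, and the paper's smoothing of the empirical distribution functions is not included); normalized_statistics reuses asymptotic_bias_and_variance from the sketch at the end of Section 3.

```python
import numpy as np

def simulate_bb_model(N, T, beta2=0.5, gamma1=0.5, gamma2=0.3, delta=1.0,
                      omega12=0.3, var_eta=1.0, burn_in=100, rng=None):
    """One panel from (2.1)-(2.2) with normal disturbances and reduced-form error
    covariance Omega = [[1, omega12], [omega12, 1]] as in Section 4.
    Returns N x (T+1) arrays (y1, y2); column 0 holds the initial observations."""
    rng = np.random.default_rng(rng)
    eta = np.sqrt(var_eta) * rng.standard_normal(N)
    L = np.linalg.cholesky(np.array([[1.0, omega12], [omega12, 1.0]]))
    y1 = np.zeros(N); y2 = np.zeros(N)
    out1 = np.empty((N, T + 1)); out2 = np.empty((N, T + 1))
    for s in range(burn_in + T + 1):               # burn-in approximates the stationary start (A2)
        v = rng.standard_normal((N, 2)) @ L.T      # reduced-form errors v_it
        u1, u2 = v[:, 0] - beta2 * v[:, 1], v[:, 1]  # structural errors u_it = B v_it
        y2 = gamma2 * y2 + delta * eta + u2        # (2.2)
        y1 = beta2 * y2 + gamma1 * y1 + eta + u1   # (2.1)
        if s >= burn_in:
            out1[:, s - burn_in], out2[:, s - burn_in] = y1, y2
    return out1, out2

def normalized_statistics(theta_hat, beta2, gamma1, gamma2, Omega, c, N, T):
    """Standardize an estimate as in (4.1) and in the bias/variance-corrected form (4.2)."""
    Phi, b_c, Psi = asymptotic_bias_and_variance(beta2, gamma1, gamma2, Omega, c)
    beta = np.array([1.0, -beta2])
    sigma = np.sqrt(beta @ Omega @ beta)           # sigma = (E[u^(1)2])^{1/2}
    diff = np.asarray(theta_hat) - np.array([beta2, gamma1])
    z1 = np.sqrt(N * T) / sigma * diff / np.sqrt(np.diag(np.linalg.inv(Phi)))     # (4.1)
    z2 = np.sqrt(N * T) * (diff - b_c / np.sqrt(N * T)) / np.sqrt(np.diag(Psi))   # (4.2)
    return z1, z2
```

A full experiment then loops over replications, applies wg_estimator, gmm_estimator and liml_estimator from the earlier sketches to each simulated panel, and tabulates the empirical distributions of z1 and z2.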

In order to make a comparison with the results reported by Alvarez and Arellano (2003) and Anderson et al. (2008), we have conducted similar calculations without endogeneity by setting β2 = 0 and without the dynamic effect by setting γ1 = 0, respectively. We give the distribution functions of the standardized WG, GMM and LIML estimators as Figs. 11, 12, 15 and 16 for the first case. We have confirmed the numerical results of Alvarez and Arellano (2003) in the sense that it is important to adjust the bias terms in the limiting distributions of the GMM and LIML estimators. We give the distribution functions of the standardized WG, GMM and LIML estimators as Figs. 9, 10, 13 and 14 for the second case. The numerical results are similar to those reported by Anderson et al. (2008) for the TSLS, the GMM and the LIML estimators, although there are some differences due to the bias terms in the panel LIML estimation.

We have conducted a large number of related numerical analyses. Among many possibilities, we report our results on two issues which may be of some interest to empirical researchers. First, there is a natural question about what happens when the dynamic coefficient γ1 is more persistent than in the cases reported so far; in particular, it is interesting to see the result when the estimated coefficient is comparable to the one in Blundell and Bond (2000). For this purpose we report one case with (β2, γ1) = (.5, .8) and (N, T) = (300, 100) as Figs. 17 and 18. We have confirmed that the normal approximations are valid even in this case. As the dynamic coefficient γ1 approaches 1, however, we expect them to deteriorate, because that corresponds to the non-stationarity bound. The general features and merits of the alternative estimators are not substantially changed in our experiments.

Second, we report some properties of the RQML estimator, which is defined in Section 2.1, because it has some interesting properties for the particular model used by Alvarez and Arellano (2003). We give Table 1 on the finite sample properties of the RQML estimator. From Table 1 it is important to notice that the RQML estimator has properties similar to the WG estimator in the sense that its asymptotic bias is significant. This is the main reason why we did not draw their distribution functions in the figures.

Table 1
A comparison of RQML with other estimators.

(N, T, γ1); β2 = 0.5              β̂2                                   γ̂1
                          WG     RQML   GMM    LIML           WG     RQML   GMM    LIML
(75, 25, 0.5)   median    .317   .349   .339   .525           .463   .539   .468   .468
                iqr       .030   .030   .048   .378           .026   .030   .032   .052
(150, 50, 0.5)  median    .317   .333   .341   .519           .490   .526   .493   .486
                iqr       .015   .015   .024   .153           .013   .014   .015   .021

Note: iqr is the 75th–25th interquartile range. The estimators θ̂_WG and θ̂_GMM converge to (.316, .516) and (.342, .513) for γ1 = 0.5, respectively.

Fig. 9. β2 (Just-identified case): N = 75, T = 25, c = 1/6.
Fig. 10. β2 (Just-identified case): N = 75, T = 25, c = 1/6.
Fig. 11. γ1 (AA-model): N = 75, T = 25, c = 1/6. Note: The two distribution functions of LIML and LIML-var-correct in Fig. 11 are indistinguishable. The two distribution functions of LIML-bias-correct and LIML-large-K are also indistinguishable in this case.
Fig. 12. γ1 (AA-model): N = 75, T = 25, c = 1/6.
Fig. 13. β2 (Just-identified case): N = 150, T = 50, c = 1/6.
Fig. 14. β2 (Just-identified case): N = 150, T = 50, c = 1/6.
Fig. 15. γ1 (AA-model): N = 150, T = 50, c = 1/6. Note: The two distribution functions of LIML and LIML-var-correct in Fig. 15 are indistinguishable. The two distribution functions of LIML-bias-correct and LIML-large-K are also indistinguishable in this case.
Fig. 16. γ1 (AA-model): N = 150, T = 50, c = 1/6.
Fig. 17. β2 (BB-model: γ1 = 0.8): N = 300, T = 100, c = 1/3.
Fig. 18. γ1 (BB-model: γ1 = 0.8): N = 300, T = 100, c = 1/3. Note: The orders of the four distribution functions in Figs. 17 and 18 are different from the other figures. We have compared the WG, GMM, LIML-var-correct and LIML-large-K of β2 and γ1 for the case when γ2 is relatively large.

5. Some concluding remarks

In this paper we have investigated the finite sample and asymptotic properties of the WG, the RQML, the GMM and the LIML estimators for the coefficients of a particular dynamic panel structural equation, namely the model used by Blundell and Bond (2000) with one endogenous variable. We have investigated the conditions for the consistency and the asymptotic normality of the WG, RQML, GMM and LIML estimators when both N and T go to infinity. We have derived the asymptotic distributions and the asymptotic bias terms of the GMM and the LIML estimators explicitly. Although we have a finite number of observations in actual applications, we have confirmed that our asymptotic results agree with the finite sample properties of the estimators based on a large number of Monte Carlo experiments. When N and T are reasonably large, our results show the asymptotic robustness of the panel LIML estimation with many instruments, which agrees with the recent results obtained by Anderson et al. (2008, 2010) for the standard structural equation estimation. We have also pointed out that it is possible to use the bias correction of the LIML estimator for general dynamic panel structural equations if necessary.

Finally, as we mentioned in the Introduction, the results reported in this paper can be generalized to more general dynamic panel structural equations with some complications. Some results on the asymptotic properties of estimators and of testing procedures have been developed by Akashi and Kunitomo (2010b) and Akashi (2008), respectively, in a more general framework. They suggest that the essential characteristics of the good performance of the LIML estimation in the dynamic panel structural equation reported in this paper remain the same.

Acknowledgments

(AK11-8-17) This is a revision of Discussion Paper CIRJE-F-707, Graduate School of Economics, University of Tokyo. We thank Yukitoshi Matsushita and Ryo Okui for useful discussions, and we also thank the editor, the associate editor and the referees of this journal for comments on previous versions, including an inconsistency in the figures of the original Appendix B.

Proof of Lemma A.2. Let M′µ = (µ1 , . . . , µN ), µi = ηi π and we construct the N × 2 error matrix of the linear projection of Mµ on Z∗t ,

Appendix A. Mathematical proofs

1) 2) (µ∗( , µ∗( ) = Mµ − Z∗t (γ (t 1) , γ (t 2) ), t t

This Appendix A gathers the mathematical derivations of our results in Section 3. The most parts of our derivations are rather straight-forward applications and some extensions of Alvarez and Arellano (2003) and Anderson et al. (2010). For the sake of completeness, we give some details. We shall use the notations as (g ) (g ) (g ) (g ) (1) the N × 1 vectors ut = (uit ), vt = (vit ), wt −1 = (wi(t −1) ) (1)

Lemma A.1. Let dt and ds be N × 1 vectors containing the diagonal elements of Mt and Ms , respectively, such that tr(Mt ) = d′t ιN = 2t, tr(Ms ) = d′s ιN = 2s and d′t ds ≤ 2 max(t , s). Then, under the conditions (A1) and (A3) in Section 3, for l ≥ r ≥ t , p ≥ q ≥ s, t ≥ s,

ϵ , ϵp Ms ϵ(qb) ]  (3) (m + m(2) )(2s) + m(0) E [d′t ds ]    (a)2 (b) E [ϵit ϵit ]E [d′t Ms ϵ(qb) ] = (3)   m (2s)

if l = r = p ̸= q < t, if l = p ̸= r = q, otherwise,

(b)

(b)2

where |E [dt Ms ϵq ]| ≤ 2(st E [ϵit ′

m

(2)

1/2

])

(1)

(a)

(b)

(a)2 (b)2

(2)

(a)

(b)

(a) (b)

(a)

(b)

(a)

(b)

(A.1)

(a)2

(A.2)

(b)

(vit ) (g = 1, 2) such as (v(t 2) , ut ).

Lemma A.2. Under Assumptions (A1)–(A4), assume both N and T → ∞ while T /N → c (0 ≤ c ≤ 1/2). Then

NT t =1

wt −1

(2)

  ,  

. ... ...

Φt −3 Φt −2

I2

Φ

(A.6)

I2

′ ′ where Φ = L−1 Π L and (2t ) × 1 vector lt = (π ′ L−1 , . . . , π ′ L−1 )′ .

Then we have

γ (t g ) = (ση2 lt l′t + At )−1 (π(g ,1) ση2 lt ) =

π(g ,1) ση2 1 1 + ση2 l′t A− t lt

1 A− t lt

(g = 1, 2),

(A.7)

1 A− t

 =  ...   O2

.. .

···

O2

..

. . . . P−1 + ΦP−1 Φ′ −P−1 Φ′ −1 ... −ΦP I2 + ΦP−1 Φ′

O2 O2

O2

    ,  

g) = µ(i g ) − z∗it γ (t g ) µ∗( it   = µ(i g ) − ηi l′t γ (t g ) − (L−1 wi0 )′ , . . . , (L−1 wit −1 )′ γ t(g )     π(g ,1) 2 ∗′ − 1 = η − σ wit At lt i η − 1 1 + ση2 l′t At lt = π(g ,1) µ∗it (, say), (A.8)

∗(g )2

E [µit

(1)

Mt (wt , wt −1 )

]=

(π(g ,1) )2 ση2 1 1 + ση2 l′t A− t lt

,

(A.9)



→ Φ∗ + c E [vit(2)2 ] p

where w∗it = (w′i0 , . . . , w′it −1 ).

 

1 (1, 0), 0

(A.3)

where

Φ∗ = D′ Γ 0 D,

.. . Φ′



(g )

(1)′

..

where P = I2 − Φ′ Φ. Hence, for g = 1, 2, we find that

]E [ϵit(b)2 ],

and (ϵt , ϵt ) takes any pair of N × 1 vectors from random variables

  ′ T −1 1 − w(t 2)

Φt −2 Φt −1

2

m(0) = m(0) (ϵt , ϵt ) = m(1) − 2m(2) − m(3) , (a)

.. .



P−1 −P−1 Φ′ O2 −ΦP−1 P−1 + ΦP−1 Φ′ 

= m (ϵt , ϵt ) = (E [ϵit ϵit ]) ,

m(3) = m(3) (ϵt , ϵt ) = E [ϵit

I2

.. .

(Φ′ )t −1



,

= m (ϵt , ϵt ) = E [ϵit ϵit ],

Φ

···

and the inverse matrix of At is a block-tridiagonal symmetric matrix such that

if l = r = p = q,

0

m

  At =   

Φ′

I2



(a)′

Cov[ϵ

(1)

where γ t = (E [ ])−1 E [z∗it µi(g ,1) ] (g = 1, 2) are (2t ) × 1 vectors and Z∗t = Zt Γ 0t · (µi(g ,1) is the g-th component of µi .) We take Γ 0t as the 2t × 2t block-diagonal matrix whose 2 × 2 diagonal ′ blocks are lower triangular matrix L−1 such that Γ 0 = LL′ . Let also At be the 2t × 2t partitioned symmetric matrix as

(2)

for g = 1, 2, and the N × 2 matrices Vt = (vt , vt ), Yt −1 = (y(t 1−)1 , y(t 2−)1 ) and Wt −1 = (w(t 1−)1 , w(t 2−)1 ). (Here Vt and Wt −1 are N × 2 matrices.) First, we prepare two lemmas for our derivation. The first lemma is a direct application of Lemma C1 of Alvarez and Arellano (2003) and so we omit the proof. (See Akashi and Kunitomo (2010a).)

(a)′ Mt r(b) l

(g )

(A.5)

′ z∗it z∗it

′ Since π ′ L−1 (I2 − Φ)P−1/2 ̸= 0′ and

lim

 D=

0

γ2

t →∞



1 . 0

(A.4)

1 l′t A− t lt

t

 ′  = π′ L−1 P−1 + ΦP−1 Φ′ − P−1 Φ′ − ΦP−1 L−1 π  ′   ′ ′ = P−1/2 (I2 − Φ′ )L−1 π P−1/2 (I2 − Φ′ )L−1 π ,

K. Akashi, N. Kunitomo / Journal of Econometrics 166 (2012) 167–183 1 which is O(1), we have established that l′t A− t lt

∗(g )2

E [µit

= O(t ) and

 ≤

] = O(t −1 ).

π(g ,1) π(h,1)

2 

∗(g )

∗(g )4

1, 2). Because of the form (A.8), we have E [µit O(t 2 ) = O(t −2 ) (g = 1, 2). Next, we shall consider the decomposition (g )′ (h) wt −1 Mt wt −1

= =

(g )′ (h) wt −1 wt −1 (g )′ (h) wt −1 wt −1



(g )′ wt −1 (IN ∗(g )′

(g = ] = O(t ) × −4

(IN − Mt )µt∗(h)

− µt

2 −− T2

    1

O



(g , h = 1, 2),

(A.10)

E

T −1 1 − (2)′ (2) vt Mt vt NT t =1

(2)

∗(1)

∗(2)

= Yt −1 − Z∗t ), Mt = Zt (Zt Zt ) Zt and (IN − Mt )

(γ t , γ t ) − (µt , µt [Yt −1 − Z∗t (γ (t 1) , γ (t 2) )] = 0. Therefore, for g , h = 1, 2, T −1 1 −

NT t =1

=

(2)

Also by using wt

 (g )′

(h)

Var

E [wt −1 Mt wt −1 ]

(g )′ (h) E [wit −1 wit −1 ]

T −1 1 −



NT t =1

∗(g )′

E [µt

∗(h)

(IN − Mt )µt



]. Var

g) h) | ≤ |π(g ,1) π(h,1) |(µ∗t µ∗t ), |µ∗( (IN − Mt )µ∗( t t ′

(A.11)

since (IN − Mt ) is an idempotent matrix. Then

=

NT

=O

log T T



Var

.

E [wi(t −1) w′i(t −1) ]

 ′

Wt −1 Wt −1

1

=

N

 Var

=O

1

T −1 |π(g ,1) π(h,1) | −

T t =1

NT

→ 0.

(A.13)

µt µ∗t

2 −− T2

s

t >s

(2)2 T −1 E [vit ] −

N 2T 2

(g )′

(g )

E [wt −1 Mt wt −1 ],

t =1

(2)

(A.17)



Mt vt

 (E [vit(2)2 ])2 + (E [vit(2)4 ] − 3E [vit(2)2 ]) (2t ) (A.18) (2)

(g )′

(2) (2)′ vt Mt vt

(2)

wit −1 wit −1

(A.14)

=

γ

Var [µ∗it2 ]

t

 Cov[µ∗it2 , µ∗is2 ]

h 1

0

  h β2 γ2 γ γ2 − γ1  01

0



γ2h

1

, v(s2)



 1 0

 −1 β2 γ2 γ2 − γ1  1

φβ (γ − γ ) , γ2h h 1

h 2



where φβ = β2 γ2 /(γ1 − γ2 ). Then the auto-covariance matrices are given by Γ h = E [wit w′it +h ]

∑∞ i ′ i +h = , where Ω = E [vit v′it ]. By using the relation i=0 Π ΩΠ ′ ′ Γ h − ΠΓ h Π = ΩΠh , we find that v ec [Γ h ] = (I4 − Π ⊗ Π)−1 ′ v ec [ΩΠh ] and (I4 − Π ⊗ Π) is an upper triangular matrix. By a direct calculation, the elements of Γ h are given by 1

(1 − γ12 )(1 − γ1 γ2 )

[(1 − γ1 γ2 )(ω11 + ω12 φβ )

+ (γ1 β2 γ2 )(ω21 + ω22 φβ )]γ1h +

t =1

N

,

 ′



∗′

 2  π(g ,1) π(h,1) 1 −

+



γh(1,1) =



T2

T −1 1−



and

=

=

= o(1).

(A.12)

E [Wt −1 Mt Wt −1 ] −

NT



(Step 1) First, we utilize the dynamic process wit = yit −πηi and we shall evaluate the asymptotic behavior of the quantities appeared in the form of the WG estimator. Under the condition γ1 ̸= γ2 , Π h can be decomposed as



Var

= γ2 wt(2−)1 + v(t 2) , it is sufficient to show that

E [µ∗it2 ]



NT t =1



N 2 T 2 t =1

1 Πh = 

T −1 1 −

(A.16)

Proof of Theorem 1. The proof consists of three steps.

To establish the convergence in probability by using the relations from (A.32) to (A.34) below, we have



(2)′

vt

T −1  −

1

0

NT t =1

(2)

(2t )E [vit(2)2 ]

(2)

Hence T −1 1 −

(g )′

wt −1 Mt vt ] = 0 for

Ms vs ] = 0 hold for t > s (g = 1, 2), we have the result due to Lemma A.1. 

t =1



t =1

Since Cov[wt −1 Mt vt , ws−1 Ms vs ] = 0 and Cov[

t =1

T

NT t =1

∑T −1

E [µ∗t µ∗t ]

T −1 |π(g ,1) π(h,1) | −



T −1 1 −

=

(g )′ (2) wt −1 Mt vt

(g )′

    T −1 1 − ∗(g )′ ∗(h)′  E [µt (IN − Mt )µt ]   NT  t =1 T −1 |π(g ,1) π(h,1) | −

(A.15)

= o(1) (g = 1, 2),

T −1 1 −

NT t =1



(g , h = 1, 2)



T −1 1 −

NT t =1

Moreover, ′

,

→ c E [vit(2)2 ].

∗′ ∗ −1 ∗′



s



where the second equality follows from Wt −1 (1)

1

O

t

t >s

s

t2

t

which is o(1). Also we have E [(1/NT ) g = 1, 2 and then

(h) Mt )wt −1



+

O

T2

N

Furthermore, we evaluate the fourth-order moments of µit


  1 − 1

1

(1 − γ12 )(1 − γ1 γ2 )(1 − γ22 )

× [(ω12 − ω22 φβ )(1 − γ22 )(γ1 β2 γ2 ) + (β22 γ22 )(1 + γ1 γ2 ) − (1 − γ1 γ2 )(1 − γ22 )(ω12 φβ )]γ2h , γh(1,2) =

1

(1 − γ1 γ2 )(1 − γ22 )

[(1 − γ22 )ω12 + β2 γ22 ω22 ]γ2h ,


1

γh(2,1) =

γh(2,2)



In order to evaluate the first element of the leading term in (A.23), ∑ (2) (1) ∑ (2) (1) we use the relation that Var [ t wit uit ] = E [( t wit uit )2 ] −

[ω21 + ω22 φβ ]γ1h (1 − γ 1 γ2 ) ]  [ 2 2 β2 γ2 ω22 γ2h , + − φ ω β 22 1 − γ22   ω22 = γ2h . 1 − γ22

(T E [vit(2) u(it1) ])2 . Then for the first term of this relation (2)2 (1)2

 − N

E

(f )′

(1,f )

Yt ·t −1 ut

t =1

(2) (1)

(A.19)

(1)2

E

1 (1) ui ] − ι′T E [yi(−1) u′i ]ιT , T

(2)′

(A.20)

1) (1) (1) = (y(i12) , . . . , y(iT2) ), y(i(− 1) = (yi0 , . . . , yi(T −1) ) and ( 1 ) ( 1 ) u′i = (ui1 , . . . , uiT ). ′

where yi

(1)

(1) (1)

Since the (t , s) elements of E [yi(−1) u′i ] are γ1t E [vit uit ] + φβ (γ1t −

γ )E [vit uit ] if t > s and 0 (otherwise), we obtain    N  − N  (2) (1) (1)′ (1) (1) E yi(−1) QT ui = − E [vit uit ] + φβ E [vit uit ]  ×

1

T−

1 − γ1 (2) (1)

− φβ E [vit uit ] E

 N −

(2)′

yi



N T −





 1 Var   √NT   =

 T−

1−γ

T 2

1 − γ2



,

1

+O

T

[

1 1 − γ2

 γ   1 − γ T ] 2 2 1− . (A.22) T 1 − γ2

N −



(2)′

E

+E

1)2 E [w ¯ i((− 1) ]

yi(−1) QT ui

i =1 N



T

−−

NT i=1 t =1



T − N i=1



(2)

w ¯i

1) w ¯ i((− 1)

(2)

wit



T 1−

T t =1 T

u¯ i =

wit(2) ,

1 − (1) u . T t =1 it



2   − E [u(it1) vit(2) ] 

(A.25)

wit(1−) 1  



 γ = 0(2,2) γ1(1,2)

w ¯ i(2)2

1) w ¯ i(2) w ¯ i((− 1)

1) w ¯ i((− ¯ i(2) 1) w

1)2 w ¯ i((− 1)

=

1

 j T −1 − −

T2

Γh +

j=0 h=0

The first double sum S0T = represented as

(1)

uit

γ1(1,2) γ0(1,1)



 ,

(A.26)

j T −1 − −



j=1 h=1

h=0



Γ 0 Π h = Γ 0 ST can be



u¯ i ,

(A.23)

(A.27)

(2,2)

∑ T −1 ∑ j j=0

.

Γh ′

S0T = Γ 0 [Π ′ (Π T − I2 )(Π ′ − I2 )−1 − T I2 ](Π ′ − I2 )−1 ,

(A.28)

and the elements of ST are given by

[πii (1 − πiiT )(1 − πii ) − T ](πii − 1) (g = 1, 2) (1 − π11 )2 (1 − π22 )2   1 T = [([π22 (π11 − 1) (1 − π11 )2 (1 − π22 )2

ST (g ,g ) =

where

w ¯ i(2) =

γ22 1 − γ22

and

y QT ui    i =1 i    −  N   (1)′

N



.

T −1 1 − (f )′ (f ) Y Y NT t =1 t ·t −1 t ·t −1





(2)2

(1 )2

(1)2 (2)2

(A.21)

Next, we need to evaluate the variances of the associated quantities and we write

1

yi(−1) QT ui

Third, we need to evaluate the expectation of quadratic form

= N E [vit(2) u(it1) ] T −

= Var √



(2)′

(1)2





(A.24)

E [uit ]Γ0(1,1)

QT ui



  1 Var   √NT 

N −

E [uit vit ] + E [uit ]E [vit ]





 

yi QT ui     i= 1    − N   (1)′

 

i =1



w ¯ i(2)



i= 1

1 − γ1

1 − γ2

(1)2

(1)

1 − γ1

1

(1)2

N i=1

 T



(1)2



T

i =1

(1)

since Var [(·)¯ui ] = O(1/T 2 ) (see Page 1139 of Alvarez and Arellano (2003)). Then

(2) (1)

t 2

(2)

u¯ i 1) w ¯ i((− 1)  (2)     w ¯i 1 , u¯ i = O = T × Var 1) T w ¯ i((− 1)

Var

=

(1)

For the second term of (A.23), we have

 (1)′ E [yi(−1)

(2)

For the second element of the leading term of (A.23), we use ∑T (1) (1) (1) (1) E [uit −j wit −1−j ] = 0 for j ̸= 0 and E [( t =1 uit wit −1 )2 ] = T ×

and



(2) (1)

E [uit wit −1 ] = T × E [uit ]E [wit −1 ].

yi(−1) QT ui

N 1 − (1)′ y QT ui N i=1 i(−1)

γ22 (1 − γ22 )

= (E [vit(2) u(it1) ])2 (t ̸= s).

i=1



(2)2

E [wis uis wit uit ] = E [wi(s∧t ) ui(s∧t ) ]E [vi(s∨t ) ui(s∨t ) ]



(2)′

yi QT ui     = E  iN=1  , − (1)′ 



(1)2

(t = s),

For the within-groups estimator, we need to evaluate

 T −1 −

(1)2 (2)2

E [wit uit ] = E [uit vit ] + E [uit ]E [vit ]

1) w ¯ i((− 1) =

T −1 1−

T t =0

wit(1) ,

ST (2,1)

T T T + π12 φβ (π11 − π22 )](π11 − 1) − π22 π12 (π11 − 1))(π11 − 1) T − (π22 (1 − π22 )(1 − π22 ) − T )π12 ], ST (1,2) = 0,


where Π = {πgh } (g , h = 1, 2). Also define S1T = Γ 1 ST , and then we use the relation

(2)2

] = [(S0T + S′1(T −1) )/T 2 ](1,1) .

E

T −1 1 − (f )′ (f ) Y Y NT t =1 t ·t −1 t ·t −1



 =

γ0(2,2) γ1(1,2)

1 γ1(1,2) . +O γ0(1,1) T

Var

(A.30)

=

1 − (f )′ (f ) Y Y NT t =1 t ·t −1 t ·t −1

N

 Var

T

t

wi(2) wi((1t)−1)

wi((1t)−1) wi(2)

wi((1t)−21)

(2)

w ¯i

w ¯i

1) w ¯ i((− 1)

1)2 w ¯ i((− 1)

1) w ¯ i((− ¯ i(2) 1) w

Var



T t =1

 Var

T 1 −



T t =1

 +2  +

(1)2

1+γ γ



1−γ

2 2



Var





T 1−

T t =1

 E

 T −1 −

= 2T

(A.31)

1 + γ12



1−γ

2 1

(1)2

Cov[wit

(v(t 1−)1−j , v(t 2−)1−j )



(1)2

Cov wit

(A.32)

, wit(11)2



→0



ct

T −1 −

T −t (f )

ct v˜ tT Mt ut ,

t =1

where (1)

(1)′



wt −1 = (w1t −1 , . . . , wNt −1 )′ ,

T −t −

′  j+h

,

Π(1)

(A.39)

(1) (2)

 1−

T − (1 − γ t )

1



2

T (1 − γ2 ) t =1

t

(1) (2)

(1) (1)

1 − γ1

 ×

1−

T − (1 − γ t )

1

 .

1

T (1 − γ1 ) t =1

t

(A.40)

as

 (γ ) γ22 φT −2t (f ) (2)′ yt Mt ut = ct γ2 − wt −1 Mt ut T − t t =1 t =1     (γ ) T −1 − γ2 φT −2t 1 (γ ) (2) vt − (φT −2t v(t 2+)1 + · · · + ct 1− T − t T − t t =1 ′

T −1 −

(2,f )′



T −1 −

(f )

(f )

Mt ut .

(A.41)

Again we use the facts that (a) only the second term has non-zero (2)′

˜ (t 1−)1 w

′

(1)

(1) (2)

mean, (b) E [vt +j Mt ut +k ] = (2t )E [uit vit ] for j = k and zero (γ )

(A.34)

1

(A.38)

E [uit vit + φβ uit vit ]

E [yt

(γ )

(f )

Mt ut ] (γ )



(γ )

(γ2 )

γ2 φT −2t φ 2 + · · · + φ1 = (2t )E [uit vit ] 1 − + T −t T −t (T − t )2  (2t )E [u(it1) vit(2) ] 1 (γ ) = (T − t ) + − γ2 φT −2t T −t +1 1 − γ2    (γ2 )  φT −t γ2 − . 1 − γ2 T −t (1) (2)

ct2

(f )

Mt ut

(γ )

otherwise, and (c) φ1 2 + · · · + φT −2t −1 = (T − t − φT −2t )/(1 − γ2 ). Then for each t, we have (2,f )′

(1) wt −1

(A.37)

(A.33)

(2) (1) t =1 wit wi(t −1) ] = O(1). Also we have 



(γ )

− φ1 2 )v(T2−) 1 ]



E [φβ uit vit ]

+ φ1 2 v(T2) )

∑T

T −1 −

(γ1 )

)v(t 2) + · · · + (φ1

h= 1

Similarly, we decompose yt

  (2)2 Var wit ,

, wit(11) wit(12) ]

t =1

(1)



1 − γ2



(Step 2) In order to evaluate the asymptotic behavior of the GMM estimator, we shall utilize Lemmas A.1 and A.2. When the forwardfilter is applied to data, we shall use the decomposition

(1)

(γ2 )

(for x = γ1 , γ2 ),

(γ )



−φ

(2,f )



  (1)2 (12)2 Cov wit , wit ,

=

,

(1,f )′ (f ) yt −1 Mt ut



probability convergence of the WG estimator.

t =1

(γ1 )

t =1

 ,

1 + γ12

wi((2t)−1) wi((1t)−1)

(1,f )′ (f ) yt −1 Mt ut

[(φ

j +h

√ ∑T √ ∑T (2) (1) (2) because Var [(1/ T ) t =1 wit wit ] = Var [(1/ T ) t =1 wit (β2 wit(2) + γ1 wi((1t)−1) + u(it1) )] and the variances in the right-hand side of (A.31) are O(1) as T → ∞. Hence we have established the

T −1 −

∞ −

(1)



1 − γ12



1−γ γ 1 + γ22





2 2 1 2 2 2 1 2

and Var [(1/ T )







wit

1−x

˜ t −1 = w

AR(1) processes, or wit + wit , those coefficients are γ1 and γ2 , respectively. Then we can show that

wi((2t)−21)

1 − xj

φj(x) =

(12)

(11)

T 1 −

φβ

T −t T −t T −t = v˜ (tT11) + v˜ (tT12) , (say),

which is o(1) eventually. The second term is O(N −1 T −2 ) by using the same arguments as for (A.24). In order to show that the first (1) terms are O(N −1 T −1 ), we decompose wit as the sum of the two



(γ )

[φT −1t v(t 1) + · · · + phi1 1 v(T1−) 1 ]

and Π (1) denotes the first row of Π j+h . Since only the second term in the right-hand side of (A.35) has nonzero mean (the same calculation of (A46) and (A47) of Alvarez and Arellano (2003)), we obtain



wit(2)2



(2)2

 − 

1−

(γ )

T −t

+

(A.36)

j=0

T −1

1

1

=

 



Moreover, we shall evaluate the covariance matrix



(1)

(A.29)

1) ′ Similarly, we can show that E [w ¯ i(2) w ¯ i((− 1) ] = [(S1T + S0(T −1) )/ T 2 ](1,2) , and



= (u(1t1,f ) , . . . , u(Nt1,f ) )′ ,

ut

v˜ tT

(1)2

E [w ¯ i(−1) ] = [(S0T + S′1(T −1) )/T 2 ](2,2) , E [w ¯i

(f )




(A.42)

Furthermore, by taking the sum of (A.42), we have (A.35)

E

 T −1 −

(2,f )′

yt

(f )



Mt ut

t =1

(1) (2)

= 2E [uit vit ]



(T + 2)(T − 1) 2

 +

γ2 1 − γ2



(T + 1)

T − 1 s=2

s


 −

γ2



1 − γ2



(T − 2) −

(γ ) T − φs−21 − (s − 1) s=2 

 (γ ) T − φs−21 γ2 (T + 1) 1 − γ2 s(s − 1) s=2



− γ 2 (T + 1)

(γ ) T − φs−21



s

T −

(γ ) φs−21



1



 (1)′  T −1 − wt −1 (2)′

,

(A.43)

′ T −1 1 − v(2) t 0′ n t =1





 =

T −1 1 −

Mt ut + op (1).

Var [vt

(A.44)

(A.45)

2 Then the asymptotic covariance–variance matrix becomes √ Φ (σ ∑ ∗ ∗−1 2 ∗−1 Φ )Φ = σ Φ . Under the condition t tr(Mt )/( NT ) < ∞, its asymptotic bias is given by

∗−1

b0

= lim



N ,T →∞

NT t =1

∗−1

(A.46)

0

The derivation of the asymptotic normality of the GMM estimator is the special case of α2t = 0 in (A.96) below and it is omitted. Finally, by applying the similar arguments used in Page 1153 of Alvarez and Arellano (2003) on the asymptotic behavior of the RML estimator, which is omitted here, we have the result on the RQML estimator.  (Step 1) First, we need the convergence result on (1/n)G and (1/qn )H(f ) . We use similar arguments as Akashi and Kunitomo (2010b) with Lemma A.1 and we have 1 (f ) p G → G0 = n

 ′ θ I2

Φ∗ (θ, I2 ) + c

= Φθ + c



Ω 0′





0 , 0

Ω 0′



0 0

(say)

(A.47)

and 1 qn

(f ) p

H

→ H0 =

 Ω

λ(1nf ) =

0′

(A.48)

   Φθ + (c − p lim λn ) Ω′  n→∞

0

 0  = 0. 0 

(A.49)

(A.51)

where c∗ = c /(1 − c ). Then by using the rank relation of Φθ (1, −θ ′ )′ = 0, we have (f )



(f )



cc∗ H1 ]

1



−θ

+ op (1).

(A.52)

Also by multiplying (0, I2 ) to (A.52) from the left and substituting λ(1nf ) for (A.52), we find



βˆ 2LI − β2 γˆ1LI − γ1

 √

(f )

(f )



1



= (0, I2 )[G1 − λ1n H0 − cc∗ H1 ] + op (1) −θ [   ] 1 Ωβ = (0, I2 ) I3 − ′ (1, −θ ′ ) β Ωβ 0   √ 1 (f ) (f ) × [G1 − cc∗ H1 ] + op (1), −θ (f )



(A.53)

(f )

cc∗ H1 ](1, −θ ′ )′ , we decompose G(f )

′ G(f ) = G(f ,1) + G(f ,2) + G(f ,2) + G(f ,3) ,

(A.54)

where G(f ,1) = D∗



T −1 −

(f )′

(f )

Yt −1 Mt Yt −1 D∗ ,

t =1

G(f ,2) = D∗

T −1 ′ −

(f )′

(f )

Yt −1 Mt (Vt , 0),

t =1

G(f ,3) =

T −1  − t =1

(f )′

Vt 0′



(f )

Mt (Vt , 0),

  (f ) (1,f ) (2,f ) D = (π11 , π12 )′ , D and Yt −1 = (yt −1 , yt −1 ). Then by using the ′ ′ ∗ relation D (1, −θ ) = 0, we write   T −1 √ 1 ′ − (f )′ 1 (f ) [G(1f ) − cc∗ H(1f ) ] = √ D∗ Yt −1 Mt ut −θ n t =1    T −1  (f )′  − 1 Ωβ (f ) Vt +√ Mt ut − rn ′ ′ ∗

0

0

t =1

√ T −1 cc∗ ′ − (f )′ (f ) − √ D∗ Yt −1 [IN − Mt ]ut qn

where Φ∗ = D′ Γ 0 D. Then we have



√ (1, −θ ′ )[G(1f ) − cc∗ H(1f ) ](1, −θ ′ )′ + op (1), (1, −θ ′ )H0 (1, −θ ′ )′

n



0 , 0

1

By multiplying (1, −θ) to (A.50) from the left, we have

Proof of Theorem 2. The proof consists of four steps. (f )



n(0, θˆ − θ).

n

In order to evaluate [G1 − as

  (1) ( 1, 0)E [vit uit ] × .

(f )

+ √ [G1 − λ1n H0 ]

where β = (1, −β2 )′ .

 tr(Mt ) Φ

−θ

(f )

Mt ut ]

= O(c ).

T −1 −

(f )



−θ    1 1 1 1 = op √ . (A.50) + √ [G0 − cH0 ]b1 − √ [cH(1f ) ] −θ qn n n



(2)′

NT t =1

1

(f )

n[(1/n)G(f ) − G0 ], H1 =





Φ∗ n

T −1 1 − (2)′ vt Mt ut √ n t =1

1

[G0 − cH0 ]



1

(f )

By using Lemma A.1,





(f )

Φθ b1 = [G1 − λ1n H0 −

ut

wt −1

+√

(a)

(f )

(f )



s=2

n(θˆ GM − θ) = √ D n t =1

Var

(f )



(Step 3) We now turn to the asymptotic distribution and the asymptotic bias of the GMM estimator when c = 0. (The general case has been treated in Akashi and Kunitomo (2010b).) The normalized GMM estimators are asymptotically equivalent to



(f )

Define G1 , H1 , λ1n and b1 by G1 =

qn [(1/qn )H(f ) − H0 ], λ1n = n(λn − c ) and b1 = By substituting these variables into (2.12), we have

which is O(T 2 ). Because we have the convergence in probability to the limits of (A.42) and (A.43), the first factor of (3.5) can be established by using (A.47) and (A.65) below.

Φ

p

c > p limn→∞ λn , then | · | > 0. Hence λn → c.



s=2



By the singularity of Φθ , we first notice that c is a solution. If

t =1

  √ − T −1  (f )′  cc∗ Ωβ (f ) Vt − √ [IN − Mt ]ut − qn . ′ 0′ qn

t =1

0

(A.55)


Also we use the relation

√ √ cc∗ / qn − c∗ / n = o(1) and set



1

Nt = Mt − c∗ (IN − Mt ) =

1−c

[Mt − cIN ].

(A.56)

Then we have the representation



1

= √ D′ n

− t =1

tend to zero. First, we notice that Var [Υ3n ] → 0 from (A.24). Second, we notice



T −1

−

1 ut + √ n t =1

u⊥ t 0′



(1)

(1)2

Nt ut

E [uit ]

=

′ Ωββ′ ′ , u⊥ ) = ( 0 , 1 ) I − (v(t f ) , v′t ). 2 t ′ β Ωβ

]

(A.58)

E [uit ]

=





n t =1

(1)′ wt −1 Mt ut

(1)

(1)



(1)′

(1)

−(Υ21n − Υ22n ) − c∗



n t =1

(1)′ wt −1 ut

T −1 1 −

(1)

n t =1

Var [Υ11n ] ≤

[



(1)

Υ12n = √

n t =1 T − t

Υ21n

(1)

Υ22n

˜

− Υ3n

,

(A.59)

×

,

(A.61)

(A.62)

(1)

(A.63)

n t =1

Υ3n =

N i=1

1) ¯ w ¯ i((− 1 ) ui .

(A.64)

√ ∑ By using Lemma A.2, the leading term of (A.59) and (1/ n) t

(2)′ wt −1 Mt ut

 Var



n t =1

(g )′ wt −1 Mt ut

 =

+ (1 + |φβ |)ST 1 (|γ1 |)

N 1 + |γ1 |

1 − |γ1 | log T

T



+ |φβ | − 2 ST 2 (1) ]

2





T

ST 2 (|γ1 |) + |φβ |ST 2 (|γ2 |)

,

(A.68)

where 2

S T 1 ( x) =

T

2

(x + · · · + xT −2 ) +

T −1

(1)2 T −1 E [uit ] −

NT

= O(1).

S T 2 ( x) =

x 2

3 x T −1

+ ··· +

T

.

Then (1)

Var [Υ12n ] =

=

(g )′ (g ) E [wt −1 Mt wt −1 ]

1 NT

Var

 T −1 − t =1

ct T −t

(1)2 T −1 E [uit ] −

NT

t =1

′ ˜ (t 1−)1 Mt u(t f ) w

ct2



(1)

(1)′

(T − t )

2

˜ t −1 Mt w ˜ t −1 ] E [w

(A.69)

t =1

(A.65)

because (1)′

(f )

(1)′

(f ) (f )′

˜ t −1 Mt ut ] = E [w ˜ t −1 Mt ut ut Var [w 2 Note that u(f ) = (u − u ¯ tT )/ct . t t

(x + · · · + xT −3 )

2

must satisfy

T −1 1 −

(2)

+ · · · + x,

T −1 1 − (1)′ ¯ tT , = √ v˜ tT Mt u N T −

(2)′



1 − |γ1 |

=O

T −1 1 − (1)′ = √ v˜ tT Mt ut , n t =1



NT  1

1

T

[

(A.60)

(f ) (1)′ wt −1 Mt ut

(1)

(E [(w(01) w(01) )(w(02) w(02) )])1/2 E [u(it1)2 ]



(1)′

ct

(1)′



¯ tT , wt −1 Mt u

T −1 1 −

(1)′

E [uit ](E [(w0 w0 )(w0 w0 )])1/2



Υ11n = √

(2)

(1)2

(1)

where

(1)

(2)′

+ ··· + 2 ] + |φβ |ST 1 (|γ2 |)

(1)

(2)

(E [(w0 w0 )(w0 w0 )])1/2 . By using the relation (E [(w0 w(01) ) ′ (w(02) w(02) )])1/2 = O(N ), we can evaluate as

− Υ11n − Υ12n

T −1 1 −

(1)′

From the covariance-stationarity, we have |E [ws−1 Ms ws−1 ]| ≤

=



(A.67)

where [A](i,j) is the (i, j)-th element of A.

×

T −1 1 −

(1)

(1)′

2) (1) + [Πt −s ](1,2) w(s− 1 )Ms ws−1 ],

T −1 1 − (1,f )′ (f ) yt −1 Nt ut n t =1

(1)

(1)

E [([Πt −s ](1,1) ws−1

(T − s + 1)



1−c

(1)′

E [Es [wt −1 ]Ms ws−1 ]

(T − s + 1) (1)2

(A.57)

(Step 2) At this step we shall evaluate the orders of the quantities appeared in (A.57). (They shall be used to derive the limiting distribution of the LIML estimator in Step 3.) ¯ tT = (ut + · · · + uT )/(T − t + 1). To evaluate the sampling Let u error of γˆ1.LI , we further decompose the first term of (A.57) as2

=

(A.66)

1) = E [w(t 1−)1 Mt Et [u¯ tT u¯ ′sT ]Ms w(s− 1]



(say),

[



(1)

(1)′

E [wt −1 Mt u¯ tT u¯ ′sT Ms ws−1 ].

NT t =1 s=1

(1)′

where O(1) is the terms involved in the asymptotic bias term (and hence the expected values of a1n and a2n are of the small order o(1), which shall be discussed later in Step 4 with the more details), and

1

T −1 T −1 1 −−

E [wt −1 Mt u¯ tT u¯ ′sT Ms ws−1 ]

+ O(1) + op (1)

(ut

(1)



(1)′ wt −1 (2)′ wt −1

= a1n + a2n + O(1) + op (1),

(⊥,f )′

(1)

For t ≥ s,

0

T −1

(1)

(1)

(1)

− (f )′ √ βˆ − β2 1 (f ) = √ D′ Φ∗ n 2LI Yt −1 Nt ut γˆ1LI − γ1 n t =1 ′ T −1  1 − u(⊥,f ) (f ) t Nt ut + op (1) +√ ′ n t =1

(1)

Next, we shall show that the variances of Υ11n , Υ12n , Υ21n and Υ22n

Var [Υ11n ] =

T −1






(1)

˜ t −1 ] Mt w

˜ (t 1−)1 Mt w ˜ (t 1−)1 ], = E [u(it1)2 ]E [w ′

(A.70)


and the covariance terms are zero. In effect, for t > s, we have (f )

(1)′

(1)′

˜ t −1 Mt ut , w ˜ s−1 Ms us(f ) ] Cov[w =

′ ′ 1) ˜ (t 1−)1 Mt Et [u(t f ) u(sf ) ]Ms w ˜ (s− E [w 1]

(γ1 )2

By using Lemma A.1 and the fact that φj we have

= 0,

(A.71)

(1)

(1)2 T −1 E [uit ] −

NT

=

(1)′

(T − t )

T −1

h =1

(1)

(A.72)

(·)j ]| in the left-hand side is

∑∞

j =0 (γ ) (γ ) (γ ) (γ ) |φT −1t |2 + c (1,2) |φT −1t ||φT −2t | + c (2) |φT −2t |2 (1) (1,2) (2)

bounded by c positive constants c

,c

and c

for some

.

(1)

We turn to evaluate the variance of Υ21n . By using Lemma A.1, (11)

(11)

(12)

the only non zero terms to be considered are a0n , a1n , a0n and (12)

a1n , which are defined by



  1 (1) Var Υ21n =

Var

NT

 T −1 −

(11)



+ 2 Cov

T −1 −

ut Mt v˜ tT

(11)

ut Mt v˜ tT , ′

t =1

+ Var

 T −1 −

T −1 −



(12)

(12)



(1 − γ1 ) NT 2

(1)

where m(k) (ut , vt ) (k = 0, 2, 3) are defined as the same way as (A.1) and (A.2) in Lemma A.1. The last equality follows from

[uit ]) |a1n

ut Mt v˜ tT

1/2

a0n

=

(γ1 )2

+ · · · + φ1 (12)

a0n

=

T −1 φβ2 −

   T −2 φ (γ1 )2 E [v (1)2 u(1) ]E [d′ M u(1) ] it it T −t −1 t +1 t t (T − t )(T − t − 1)  (γ )2 φ1 1 E [vit(1)2 u(it1) ]E [d′T −1 Mt u(t 1) ]  + ··· +   (T − t )



NT t =1 (T − t ) (γ )

(11)

a1n

(12)

a1n

(1 − γ1 )2 

(A.73)

(1)

T −2 − t =1

1

t +1

(T − t )

T −t −1

(γ )

(γ )

× T

NT

 (γ ) (γ ) T −2 − (φT −1t −1 − φT −2t −1 )2 Cov[u′t Mt v(t 2+)1 , u′t +1 Mt +1 v(t 2+)1 ] × (T − t )(T − t − 1) t =1  (γ1 ) (φ1 − φ (γ2 ) )2 Cov[u′t Mt v(T2−) 1 , u′T −1 MT −1 v(T2−) 1 ] + ··· + . (T − t )

 =O (12)

1 2

+ ··· +

(log T )2 N



+ ··· +

])1/2 2

(1 − γ1 )2

(γ )

2φβ2

(1)2

uit ]|(22 E [uit

4|E [vit

[  ((φT −1t − φT −2t )2 Var [u′t Mt v(t 2) ]

])1/2 2 NT

(1)2 (1)



+ · · · + (φ1 1 − φ1 2 )2 Var [u′t Mt v(T2−) 1 ]),  (γ )2 (1) (1) T −2 2 − φT −1t −1 Cov[u′t Mt vt +1 , u′t +1 Mt +1 vt +1 ] = NT t =1 (T − t )(T − t − 1)  (γ )2 φ1 1 Cov[u′t Mt v(T1−) 1 , u′T −1 MT −1 v(T1−) 1 ] + ··· + , (T − t ) =

uit ]|(22 E [uit

4|E [vit

Var [u′t Mt vT −1 ]), 2

(1)2

(1)2 (1)

t =1

(γ )2 (φT −1t Var [u′t Mt v(t 1) ] 2

1

≤ 2(t + h)(E

, we have

× NT t =1 (T − t )

(A.75)

2 − |=  NT  t =1

where (11)

(A.74)

N

(11)

+ (a(0n12) + a(1n12) ),

1

([m(3) (ut , v(t 1) ) + m(2) (ut , v(t 1) )

+ |m (ut , vt )|] + (T − t − 1)m(3) (ut , v(t 1) ))   log T =O ,

= (a(0n11) + a(1n11) )   T −1 T −1 − − 2 (11) (12) ′ ′ ut Mt v˜ tT , ut Mt v˜ tT + Cov

T −1 1 −

t =1

t

(T − t )2

(1)

(0)

(1)2



t =1

T −1 1 −

4·2

)m(3) (ut , v(t 1) ))

Moreover, from the fact that |E [d′t +h Mt ut ]|

ut Mt v˜ tT

t =1

NT

(γ1 )2

(1)



t =1



(φT −1t [m(3) (ut , v(t 1) ) + m(2) (ut , v(t 1) )

T −1 T −1 − − t (T − t − 1) (T − s)(s − 1) = = O(T log T ). 2 (T − t ) s2 t =1 s =1



t =1



(γ )2

NT t =1 (T − t )2

(γ )2

h =1

which is o(1) since the term |[

2t

)[m(3) (ut , v(t 1) )(2t )])

+ |m(0) (ut , v(t 1) )|] + (φT −1t −1 + · · · + φ1

t =1

j =0

T −1 1 −



(T − t )2     ′  ∞ T −t T −t − − − j +h j+h × Π(1) Ω Π(1) , NT

(γ1 )2

+ (φT −1t −1 + · · · + φ1

(1)

˜ t −1 w ˜ t −1 ] E [w 2

ct2 N

E [uit ] −

(γ )2

(φT −1t [(m(3) (ut , v(t 1) )

NT t =1 (T − t )2

(γ )2

ct2

t =1

(1)2

1

+ m(2) (ut , v(t 1) ))(2t ) + m(0) (ut , v(t 1) )E [d′t dt ]]

(f ) (f )′

since Et [ut us ] = O. Moreover, Var [Υ12n ] ≤

T −1 1 −

(11)

a0n =

< 4/(1 − γ1 )2 for all j,



NT 1

T −1



T −1

1 2



1

+ ··· +



1 T −1

] +1

.

(A.76)

= O(log T /N ) and a(1n12) = O((log T )2 /N ) are analogous

Also a0n (11)

(11)

to a0n and a1n , respectively.

(1)

Finally, we consider the variance of Υ22n . By the same arguments as used for Lemma A.1, (1)′

(1)

(1)

¯ tT ] = [m(3) (˜vtT , u¯ tT ) + m(2) (˜vtT , u¯ tT )](2t ) Var [˜vtT Mt u + m(0) (˜v(tT1) , u¯ tT )E [d′t dt ] ≤ (2t )[m(1) (˜v(tT1) , u¯ tT ) + m(3) (˜v(tT1) , u¯ tT )] (1)

(1)

(2)

¯ tT ) − m since E [dt dt ] ≤ 2t , m (˜vtT , u (k = 1, 2, 3) are non-negative. ′

(1)

(A.77)

(˜vtT , u¯ tT ) ≥ 0 and m(k)


Moreover,
$$
m^{(3)}(\tilde{\mathbf v}_{tT}^{(1)},\bar{\mathbf u}_{tT})
=\mathrm{Var}\Big[\big([\phi_{T-1\,t}^{(\gamma)}v_{it}^{(1)}+\phi_\beta(\phi_{T-1\,t}^{(\gamma)}-\phi_{T-2\,t}^{(\gamma)})v_{it}^{(2)}]
+\cdots+[\phi_{1}^{(\gamma_1)}v_{iT-1}^{(1)}+\phi_\beta(\phi_{1}^{(\gamma_1)}-\phi_{1}^{(\gamma_2)})v_{iT-1}^{(2)}]\big)/(T-t)\Big]
\times\mathrm{Var}\big[(u_{it}^{(1)}+\cdots+u_{iT}^{(1)})/(T-t+1)\big]
=O\Big(\frac{1}{(T-t)^{2}}\Big).\tag{A.78}
$$
It is because the terms $\boldsymbol\phi_{T-t}'\mathbf v_{it}$ and $\boldsymbol\phi_{T-s}'\mathbf v_{is}$ are independent for $t\ne s$, where $\boldsymbol\phi_{T-t}'=(\phi_{T-1\,t},\ \phi_\beta(\phi_{T-1\,t}-\phi_{T-2\,t}))$, and each $\|\boldsymbol\phi_{T-t}\|$ is bounded. As for the second term we have
$$
m^{(1)}(\tilde{\mathbf v}_{tT}^{(1)},\bar{\mathbf u}_{tT})
=\frac{1}{(T-t)^{2}(T-t+1)^{2}}
E\big[(\boldsymbol\phi_{T-t}'\mathbf v_{it}+\cdots+\boldsymbol\phi_{1}'\mathbf v_{iT-1})^{2}(u_{it}^{(1)}+\cdots+u_{iT}^{(1)})^{2}\big]
=O\Big(\frac{1}{(T-t)^{2}}\Big),\tag{A.79}
$$
since each term $E[(\boldsymbol\phi_{T-t}'\mathbf v_{it})^{2}u_{it}^{(1)2}]$ and $E[(\boldsymbol\phi_{T-t}'\mathbf v_{it})^{2}]E[u_{is}^{(1)2}]$ $(t\ne s)$ appearing in the numerator is $O((T-t+1)(T-t))$. Thus
$$
\mathrm{Var}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT}]=O\Big(\frac{t}{(T-t)^{2}}\Big).\tag{A.80}
$$
The variance of $\Upsilon_{22n}^{(1)}$ is given by
$$
\mathrm{Var}\Big[\frac{1}{\sqrt{NT}}\sum_{t=1}^{T-1}\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT}\Big]
=\frac{1}{NT}\sum_{t=1}^{T-1}\mathrm{Var}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT}]
+\frac{2}{NT}\sum_{s=1}^{T-1}\sum_{t>s}\mathrm{Cov}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT},\ \tilde{\mathbf v}_{sT}^{(1)\prime}\mathbf M_s\bar{\mathbf u}_{sT}],\tag{A.81}
$$
and the first term is
$$
\frac{1}{NT}\sum_{t=1}^{T-1}O\Big(\frac{t}{(T-t)^{2}}\Big)=O\Big(\frac{1}{N}\Big).
$$
Since $\mathrm{Cov}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT},\ \tilde{\mathbf v}_{sT}^{(1)\prime}\mathbf M_s\bar{\mathbf u}_{sT}]\le(\mathrm{Var}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT}])^{1/2}(\mathrm{Var}[\tilde{\mathbf v}_{sT}^{(1)\prime}\mathbf M_s\bar{\mathbf u}_{sT}])^{1/2}$ and by Lemma A.2, we can evaluate the second term as
$$
\frac{2}{NT}\sum_{s=1}^{T-1}\sum_{t>s}\big|\mathrm{Cov}[\tilde{\mathbf v}_{tT}^{(1)\prime}\mathbf M_t\bar{\mathbf u}_{tT},\ \tilde{\mathbf v}_{sT}^{(1)\prime}\mathbf M_s\bar{\mathbf u}_{sT}]\big|
\le\frac{2}{NT}\sum_{s=1}^{T-1}\sum_{t>s}O\Big(\frac{\sqrt t}{T-t}\Big)O\Big(\frac{\sqrt s}{T-s}\Big)
\le\frac{2}{NT}\,O\Big(\sum_{t}\frac{t}{T-t}\Big)O\Big(\sum_{s}\frac{1}{T-s}\Big)
=O\Big(\frac{(\log T)^{2}}{N}\Big).\tag{A.82}
$$
Also we have the relation that the remaining averaged term can be written as $b_{0n}^{(1)}+b_{1n}^{(1)}$, where $b_{0n}^{(1)}$ is constructed from $(T/N)\sum_{i=1}^{N}\bar v_i^{(g)}\bar u_i$ and $|b_{1n}^{(1)}|=o(1)$.
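The bounds above repeatedly use the fact that, for independent mean-zero vectors and an idempotent projection matrix of rank $O(t)$, a bilinear form such as $\mathbf u_t'\mathbf M_t\mathbf v_t^{(1)}$ has variance growing only like the rank (the number of instruments). A minimal simulation of this scaling is sketched below with generic random instruments in place of the paper's $\mathbf Z_t$; the design values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, reps = 400, 2000       # illustrative cross-section size and replications

def var_bilinear(rank):
    # projection matrix of the given rank built from random "instruments"
    Z = rng.standard_normal((N, rank))
    M = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
    vals = []
    for _ in range(reps):
        u = rng.standard_normal(N)   # independent of v
        v = rng.standard_normal(N)
        vals.append(u @ M @ v)
    return np.var(vals)

for t in (5, 20, 80):
    # Var[u'Mv] / rank stays roughly constant (about sigma_u^2 * sigma_v^2 = 1)
    print(t, var_bilinear(t) / t)
```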


We turn to evaluate the sampling error of the second part of (A.57) for $\hat\beta_{2LI}$. First, we consider
$$
\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf y_{t-1}^{(2,f)\prime}\mathbf N_t\mathbf u_t^{(f)}
=\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf y_{t-1}^{(2)\prime}\mathbf N_t\mathbf u_t^{(f)}+O(1)+o_p(1),\tag{A.83}
$$
where we notice that the $O(1)$ corresponds to the term associated with the asymptotic bias to be treated at Step 4. For the term $(1/\sqrt n)\sum_t\mathbf u_t^{(\perp,f)\prime}\mathbf N_t\mathbf u_t^{(f)}$, by using $\mathbf v_t^{(g,f)}=(\mathbf v_t^{(g)}-\bar{\mathbf v}_{tT}^{(g)})/c_t$ and $\mathbf u_t^{(f)}=(\mathbf u_t-\bar{\mathbf u}_{tT})/c_t$ $(g=1,2)$, we have
$$
\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf v_t^{(g,f)\prime}\mathbf N_t\mathbf u_t^{(f)}
=\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf v_t^{(g)\prime}\mathbf M_t\mathbf u_t
+\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\frac{1}{T-t}\mathbf v_t^{(g)\prime}\mathbf M_t\mathbf u_t
-\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\big(c_t^{-2}\mathbf v_t^{(g)\prime}\mathbf M_t\bar{\mathbf u}_{tT}
+c_t^{-2}\bar{\mathbf v}_{tT}^{(g)\prime}\mathbf M_t\mathbf u_t
-c_t^{-2}\bar{\mathbf v}_{tT}^{(g)\prime}\mathbf M_t\bar{\mathbf u}_{tT}\big)
-\frac{c_*}{\sqrt n}\sum_{t=1}^{T-1}\mathbf v_t^{(g,f)\prime}\mathbf u_t^{(f)}.\tag{A.84}
$$
By using Lemma A.1, $\mathrm{Var}[\mathbf v_t^{(g)\prime}\mathbf M_t\mathbf u_t]\le O(t)$ and $\mathrm{Cov}[\mathbf v_t^{(g)\prime}\mathbf M_t\mathbf u_t,\ \mathbf v_s^{(g)\prime}\mathbf M_s\mathbf u_s]=0$, we have
$$
\mathrm{Var}\Big[\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\frac{1}{T-t}\mathbf v_t^{(g)\prime}\mathbf M_t\mathbf u_t\Big]
\le\frac{1}{NT}\sum_{t=1}^{T-1}\Big(\frac{1}{T-t}\Big)^{2}O(t)
=O\Big(\frac{1}{N}\Big).\tag{A.85}
$$
The third, fourth and fifth terms of (A.84) are analogous to $\Upsilon_{21n}^{(1)}$ and $\Upsilon_{22n}^{(1)}$, respectively, and their variances tend to zero. The last term also tends to zero by the similar arguments as for (A.24). Moreover,
$$
\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'\mathbf M_t\mathbf u_t-\frac{c_*}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'\mathbf u_t
=\frac{1-c_*}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'\mathbf u_t+o_p(1),\tag{A.86}
$$
since
$$
\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'\mathbf M_t\mathbf u_t
=\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'\mathbf u_t
-\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf W_{t-1}'(\mathbf I_N-\mathbf M_t)\mathbf u_t\tag{A.87}
$$
and, for $g=1,2$,
$$
\mathrm{Var}\Big[\frac{1}{\sqrt n}\sum_{t=1}^{T-1}\mathbf w_{t-1}^{(g)\prime}(\mathbf I_N-\mathbf M_t)\mathbf u_t\Big]
=\frac{1}{NT}\sum_{t=1}^{T-1}E[u_{it}^{(1)2}]\,E[\mathbf w_{t-1}^{(g)\prime}(\mathbf I_N-\mathbf M_t)\mathbf w_{t-1}^{(g)}]
=\frac{1}{NT}\sum_{t=1}^{T-1}E[u_{it}^{(1)2}]\,E[\boldsymbol\mu_t^{*(g)\prime}(\mathbf I_N-\mathbf M_t)\boldsymbol\mu_t^{*(g)}]
=O\Big(\frac{\log T}{T}\Big).\tag{A.88}
$$
For the above arguments, the leading order which has to tend to zero is $(\log T)^{2}/N$. Provided $T/N\to c\ge0$, we have $(\log T)^{2}/N\sim c(\log T)^{2}/T\to0$ because $\lim_{T\to\infty}c(\log T)^{2}/T=\lim_{T\to\infty}2c(\log T/T)=0$.
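This closing order argument — that $(\log T)^{2}/N\to0$ whenever $T/N\to c\ge0$ — is easy to illustrate numerically; the constant $c$ in the sketch below is an arbitrary illustrative choice, not a design value from the paper.

```python
import math

c = 0.5  # illustrative value of the limit of T/N
for T in (10**2, 10**4, 10**6, 10**8):
    N = int(T / c)
    # (log T)^2 / N shrinks toward zero as T grows with T/N held near c
    print(T, math.log(T) ** 2 / N)
```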


(Step 3) At this step we shall evaluate the variance–covariance term and derive the limiting distribution of the LIML estimator by using the form (A.57). First, by using the stationarity and direct calculations, we have
$$
E[\mathbf a_{1n}\mathbf a_{1n}']=\frac{\sigma^{2}}{n}\mathbf D'\sum_{t=1}^{T-1}E[\mathbf W_{t-1}'\mathbf W_{t-1}]\mathbf D\to\sigma^{2}\boldsymbol\Phi_*,\tag{A.89}
$$
where $\sigma^{2}=E[u_{it}^{(1)2}]$. Under the conditions stated in Assumptions (A1)–(A4) of Section 3, we have assumed
$$
\frac{1}{n}\sum_{t=1}^{T-1}\mathbf W_{t-1}\mathbf d_t\overset{p}{\to}\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}E[\mathbf W_{t-1}\mathbf d_t],\tag{A.90}
$$
$$
\frac{1}{n}\sum_{t=1}^{T-1}\mathbf d_t'\mathbf d_t\overset{p}{\to}\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}E[\mathbf d_t'\mathbf d_t],\tag{A.91}
$$
where $\mathbf d_t$ denotes the $N\times1$ vector containing the diagonal elements of $\mathbf M_t=\mathbf Z_t(\mathbf Z_t'\mathbf Z_t)^{-1}\mathbf Z_t'$. Then by the facts that the $i$-th element of $E_t[\mathbf u_t\mathbf u_t^{\perp\prime}\mathbf N_t\mathbf u_t]$ is equal to $N_{t(i,i)}E[u_{it}^{(1)2}u_{it}^{\perp}]$ and $E[w_{it-1}^{(g)}]=0$ $(g=1,2)$, we have
$$
E[\mathbf a_{1n}\mathbf a_{2n}']
=\frac{1}{n}\mathbf D'\sum_{t=1}^{T-1}\Big(E\big[\mathbf W_{t-1}'E_t[\mathbf u_t\mathbf u_t^{\perp\prime}\mathbf N_t\mathbf u_t]\big],\ \mathbf 0\Big)
=\frac{E[u_{it}^{(1)2}u_{it}^{\perp}]}{1-c}\,\frac{1}{n}\mathbf D'\sum_{t=1}^{T-1}\Big(E[\mathbf W_{t-1}'(\mathbf d_t-c\,\boldsymbol\iota_N)],\ \mathbf 0\Big)
\to\frac{E[u_{it}^{(1)2}u_{it}^{\perp}]}{1-c}\lim_{n\to\infty}\frac{1}{n}\mathbf D'\sum_{t=1}^{T-1}\Big(E[\mathbf W_{t-1}'\mathbf d_t],\ \mathbf 0\Big).\tag{A.92}
$$
Also
$$
E[\mathbf a_{2n}\mathbf a_{2n}']
=\frac{1}{n}\sum_{t=1}^{T-1}E\big[\mathbf u_t^{\perp\prime}\mathbf N_t[\sigma^{2}\mathbf I_N+(\mathbf u_t\mathbf u_t'-\sigma^{2}\mathbf I_N)]\mathbf N_t\mathbf u_t^{\perp}\big]
\begin{pmatrix}1\\0\end{pmatrix}(1,\ 0).\tag{A.93}
$$
We first notice that the first term of (A.93) converges to
$$
\frac{1}{n}\sum_{t=1}^{T-1}\mathrm{tr}(\mathbf N_t^{2})\,\sigma^{2}E[u_{it}^{\perp2}]\to(c_*\sigma^{2})E[u_{it}^{\perp2}]=c_*|\boldsymbol\Omega|,\tag{A.94}
$$
where $E[u_{it}^{\perp2}]=(1/\sigma^{2})[\boldsymbol\Omega\sigma^{2}-\boldsymbol\Omega\boldsymbol\beta\boldsymbol\beta'\boldsymbol\Omega]_{(2,2)}$ and $[\boldsymbol\Omega\sigma^{2}-\boldsymbol\Omega\boldsymbol\beta\boldsymbol\beta'\boldsymbol\Omega]_{(2,2)}=|\boldsymbol\Omega|$. By using the result of (A.21) and (A.40), for the second term we have
$$
\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}E\big[\mathbf u_t^{\perp\prime}\mathbf N_t[\mathbf u_t\mathbf u_t'-\sigma^{2}\mathbf I_N]\mathbf N_t\mathbf u_t^{\perp}\big]
=\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}E\Big[\sum_{i,j=1}^{N}[\mathbf u_t\mathbf u_t'-\sigma^{2}\mathbf I_N]_{(i,j)}
\big(u_{1t}^{\perp}N_{t(i,1)}+\cdots+u_{Nt}^{\perp}N_{t(i,N)}\big)\big(u_{1t}^{\perp}N_{t(j,1)}+\cdots+u_{Nt}^{\perp}N_{t(j,N)}\big)\Big]
$$
$$
=\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}\sum_{i=1}^{N}E\big[(N_{t(i,i)})^{2}E_t[(u_{it}^{2}-\sigma^{2})u_{it}^{\perp2}]\big]
=\frac{E[(u_{it}^{2}-\sigma^{2})u_{it}^{\perp2}]}{(1-c)^{2}}\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{T-1}\big(E[\mathbf d_t'\mathbf d_t]-2\mathrm{tr}(\mathbf M_t)c+Nc^{2}\big),\tag{A.95}
$$
where the second equality is from the facts that $E_t[u_{it}^{\perp}u_{jt}]=E[u_{it}^{\perp}u_{jt}]=0$ for any $i,j$ and that $E\big[[\mathbf u_t\mathbf u_t'-\sigma^{2}\mathbf I_N]_{(i,i)}\big]E[u_{jt}^{\perp2}]=0$ for $i\ne j$.

For any $N$, define a sequence of $2\times1$ vectors as the martingale difference sequence for the central limit theorem (CLT) as
$$
\boldsymbol\alpha_t=\frac{1}{\sqrt N}\Big[\mathbf D'\big(\mathbf w_{t-1}^{(1)},\ \mathbf w_{t-1}^{(2)}\big)'\mathbf u_t
+\big(\mathbf u_t^{\perp\prime}\mathbf N_t\mathbf u_t\big)\begin{pmatrix}1\\0\end{pmatrix}\Big]
=\boldsymbol\alpha_{1t}+\boldsymbol\alpha_{2t},\quad(\text{say})\tag{A.96}
$$
where $(1/\sqrt T)\sum_{t=1}^{T-1}\boldsymbol\alpha_{1t}=\mathbf a_{1n}$, $(1/\sqrt T)\sum_{t=1}^{T-1}\boldsymbol\alpha_{2t}=\mathbf a_{2n}$ and $(1/\sqrt T)\sum_{t=1}^{T-1}\boldsymbol\alpha_t=\mathbf a_n$ (see (A.57) for the notation). To apply the martingale central limit theorem, for any $2\times1$ constant vector $\boldsymbol\zeta$, we check the condition that $(1/T)\sum_t\boldsymbol\zeta'E_t[(\boldsymbol\alpha_{1t}+\boldsymbol\alpha_{2t})(\boldsymbol\alpha_{1t}+\boldsymbol\alpha_{2t})']\boldsymbol\zeta\overset{p}{\to}\lim_{n\to\infty}\boldsymbol\zeta'E[(\boldsymbol\alpha_{1t}+\boldsymbol\alpha_{2t})(\boldsymbol\alpha_{1t}+\boldsymbol\alpha_{2t})']\boldsymbol\zeta$. The relevant Lyapunov conditions hold from the result of Akashi and Kunitomo (2010b). By using the facts that $(1/n)\sum_t\mathbf W_{t-1}'\mathbf W_{t-1}\overset{p}{\to}\boldsymbol\Gamma_0$, $(1/n)\sum_t\mathbf W_{t-1}'\boldsymbol\iota_N\overset{p}{\to}\mathbf 0$, (A.90) and (A.91), we have $(1/T)\sum_{t=1}^{T-1}E_t[\boldsymbol\alpha_{1t}\boldsymbol\alpha_{1t}']\overset{p}{\to}\lim_{n\to\infty}E[\boldsymbol\alpha_{1t}\boldsymbol\alpha_{1t}']$, $(1/T)\sum_{t=1}^{T-1}E_t[\boldsymbol\alpha_{1t}\boldsymbol\alpha_{2t}']\overset{p}{\to}\lim_{n\to\infty}E[\boldsymbol\alpha_{1t}\boldsymbol\alpha_{2t}']$ and $(1/T)\sum_{t=1}^{T-1}E_t[\boldsymbol\alpha_{2t}\boldsymbol\alpha_{2t}']\overset{p}{\to}\lim_{n\to\infty}E[\boldsymbol\alpha_{2t}\boldsymbol\alpha_{2t}']$.
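Step 3 rests on a martingale central limit theorem: once the averaged conditional second moments of the martingale differences converge, the self-normalized sum is approximately standard normal. The toy simulation below uses a generic scalar martingale difference array (not the paper's $\boldsymbol\alpha_t$) purely to illustrate that behaviour; all design values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
T, reps = 500, 5000
sums = np.empty(reps)
for r in range(reps):
    e = rng.standard_normal(T)
    scale = np.ones(T)
    scale[1:] = np.sqrt(0.5 + 0.5 * e[:-1] ** 2)   # depends only on the past
    alpha = e * scale                              # martingale differences: E_t[alpha_t] = 0
    sums[r] = alpha.sum() / np.sqrt(np.sum(scale ** 2))  # self-normalized sum
print(np.mean(sums), np.var(sums))      # approximately 0 and 1
print(np.mean(np.abs(sums) < 1.96))     # approximately 0.95, as for N(0, 1)
```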


 √



c

=

(1) (1)

(1) (2)

E [φβ uit vit ]



1 − γ2

1−c

(1) (2)

E [uit vit + φβ uit vit ]



1 − γ1

  √  (1) (1) E [uit vit ] β2 γ2 E [u(it1) vit(2) ] − c + . = 1−c 1 − γ1 (1 − γ1 )(1 − γ2 )

(A.99)

By using the similar calculation as for (A.40), (2,f )′

(1) (2)

(f )

E [yt −1 Mt ut ] = −

2T E [uit vit ] 1 − γ2

 × 1−

T − (1 − γ t )

1

2

T (1 − γ2 ) t =1

t

lim E

n→∞

= lim

T −1 1 − (2,f )′ (f ) yt −1 Nt ut √ n t =1

n→∞

1

,

(A.100)

1−c

− lim

c

n→∞





1−c

E [uit vit ]

(1) (2)







1 − γ2

NT N



(1) (2)



2T



NT

E [uit vit ]



1 − γ2

  √  (1) (2) − c E [uit vit ] = . 1−c 1 − γ2

(A.101)

Hence we summarize these results as lim µn =

n→∞

[ √ ] − c D′ (I2 − Π )−1 Ωβ, 1−c

(A.102)

(1)

where Ωβ = E [vit uit ]. Thus, we can define the asymptotic bias of



n(θˆ LI − θ) by

b(ca) = Φ∗−1 lim µn .  n→∞
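To make the arithmetic of (A.102)–(A.103) concrete, the sketch below evaluates the bias expression for purely hypothetical parameter values; every quantity assigned here ($c$, $\gamma_1$, $\gamma_2$, $\beta_2$, $\boldsymbol\Pi$, $\boldsymbol\Omega$, $\mathbf D$, $\boldsymbol\Phi_*$) is an assumed placeholder and is not taken from the paper's design. The bias scales with $\sqrt c/(1-c)$ and therefore vanishes when $T/N\to0$.

```python
import numpy as np

# Hypothetical values only -- placeholders for the limits appearing in (A.102)-(A.103).
c = 0.5                                     # assumed limit of T/N
gamma1, gamma2, beta2 = 0.4, 0.3, 0.6       # assumed coefficients
Pi = np.array([[gamma1, beta2 * gamma2],    # assumed Pi matrix
               [0.0,    gamma2]])
Omega = np.array([[1.0, 0.3],
                  [0.3, 1.0]])              # assumed reduced-form covariance
beta = np.array([1.0, -beta2])              # assumed structural coefficient vector
D = np.array([[0.0, 1.0],
              [1.0, 0.0]])                  # assumed selection matrix
Phi_star = np.array([[2.0, 0.5],
                     [0.5, 1.5]])           # assumed limit matrix Phi_*

mu = (-np.sqrt(c) / (1.0 - c)) * D.T @ np.linalg.inv(np.eye(2) - Pi) @ Omega @ beta  # (A.102)
bias = np.linalg.solve(Phi_star, mu)                                                 # (A.103)
print("lim mu_n:", mu)
print("asymptotic bias b(c):", bias)
```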

Appendix B. Some figures

In the figures the distribution functions of the WG, the GMM and the LIML estimators are shown with the large sample normalization (i.e. the case of c = 0) and the large-K normalization. The limiting (marginal) distributions for the LIML estimator for each coefficient in the large-K asymptotics are N(0, 1) as n → ∞, which are denoted as ''o'' based on Theorem 2. (The meanings of LIML-bias-correct, LIML-var-correct and LIML-large-K have been explained in Section 4.) For the sake of comparisons, the distribution functions of the WG and the GMM estimators are normalized in the same way and presented in the figures. The details of the numerical computation method are similar to those explained in Anderson et al. (2005, 2008). The BB model stands for the one by Blundell and Bond (2000) and the AA model stands for the one by Alvarez and Arellano (2003). The just-identified model is the simultaneous equation with γ1 = 0.

References
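A minimal Monte Carlo sketch in the spirit of these figures — for a much simpler univariate dynamic panel and the WG estimator only, with an arbitrary illustrative design rather than the BB or AA models — compares the empirical distribution function of the normalized estimator against a mean-zero normal reference with matched scale, so that the remaining discrepancy reflects the location bias of the estimator.

```python
import numpy as np
from math import erf, sqrt

# Toy design (illustrative only): y_it = gamma*y_{i,t-1} + eta_i + u_it.
rng = np.random.default_rng(2)
gamma, T, N, reps = 0.5, 25, 100, 2000
stats = np.empty(reps)
for r in range(reps):
    eta = rng.standard_normal(N)
    y = np.zeros((T + 1, N))
    for t in range(1, T + 1):
        y[t] = gamma * y[t - 1] + eta + rng.standard_normal(N)
    ylag, ycur = y[:-1], y[1:]
    ylag_d = ylag - ylag.mean(axis=0)      # within-groups demeaning over time
    ycur_d = ycur - ycur.mean(axis=0)
    g_hat = (ylag_d * ycur_d).sum() / (ylag_d ** 2).sum()
    stats[r] = sqrt(N * T) * (g_hat - gamma)

s = stats.std()
for x in (-4.0, -2.0, 0.0, 2.0):
    ecdf = (stats <= x).mean()                       # empirical distribution function
    ncdf = 0.5 * (1 + erf(x / (s * sqrt(2))))        # N(0, s^2) reference
    print(x, round(ecdf, 3), round(ncdf, 3))         # gap reflects the WG bias
```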

Akashi, K., 2008. t-Test in dynamic panel structural equations. Unpublished Manuscript.
Akashi, K., Kunitomo, N., 2010a. Some properties of the LIML estimator in a dynamic panel structural equation. Discussion Paper CIRJE-F-707, Graduate School of Economics, University of Tokyo. http://www.e.utokyo.ac.jp/cirje/research/dp/2010/2010cf707.pdf.
Akashi, K., Kunitomo, N., 2010b. The limited information maximum likelihood approach to dynamic panel structural equations. Discussion Paper CIRJE-F-708, Graduate School of Economics, University of Tokyo.
Alvarez, J., Arellano, M., 2003. The time series and cross section asymptotics of dynamic panel data estimators. Econometrica 71, 1121–1159.
Anderson, T.W., Hsiao, C., 1981. Estimation of dynamic models with error components. Journal of the American Statistical Association 76, 598–606.
Anderson, T.W., Hsiao, C., 1982. Formulation and estimation of dynamic models with panel data. Journal of Econometrics 18, 47–82.
Anderson, T.W., Rubin, H., 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20, 46–63.
Anderson, T.W., Rubin, H., 1950. The asymptotic properties of estimates of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 21, 570–582.
Anderson, T.W., Kunitomo, N., Matsushita, Y., 2005. A new light from old wisdoms: alternative estimation methods of simultaneous equations with possibly many instruments. Discussion Paper CIRJE-F-321, Graduate School of Economics, University of Tokyo.
Anderson, T.W., Kunitomo, N., Matsushita, Y., 2008. On finite sample properties of alternative estimators of coefficients in a structural equation with many instruments. Discussion Paper CIRJE-F-576, Graduate School of Economics, University of Tokyo. Forthcoming in Journal of Econometrics.
Anderson, T.W., Kunitomo, N., Matsushita, Y., 2010. On the asymptotic optimality of the LIML estimator with possibly many instruments. Journal of Econometrics 157, 191–204.
Anderson, T.W., Kunitomo, N., Sawa, T., 1982. Evaluation of the distribution function of the limited information maximum likelihood estimator. Econometrica 50, 1009–1027.
Arellano, M., 2003a. Panel Data Econometrics. Oxford University Press.
Arellano, M., 2003b. Modelling optimal instrumental variables for dynamic panel data models. Unpublished Manuscript.
Arellano, M., Bond, S.R., 1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies 58, 277–297.
Arellano, M., Bover, O., 1995. Another look at the instrumental variable estimation of error-components models. Journal of Econometrics 68, 29–51.
Baltagi, Badi H., 2005. Econometric Analysis of Panel Data, third ed. John Wiley.
Blundell, R., Bond, S., 2000. GMM estimation with persistent panel data: an application to production functions. Econometric Reviews 19 (3), 321–340.
Hsiao, Cheng, 2003. Analysis of Panel Data, second ed. Cambridge University Press.