Hausman-type tests for individual and time effects in the panel regression model with incomplete data


Journal of the Korean Statistical Society journal homepage: www.elsevier.com/locate/jkss

Jing Chen, Rongxian Yue, Jianhong Wu *

College of Mathematics and Science, Shanghai Normal University, Shanghai 200234, China

Article info

Article history: Received 30 June 2017; Accepted 11 April 2018; Available online xxxx

AMS 2000 subject classifications: 62F05; 62H15

Keywords: Error component models; Hausman-type tests; Incomplete panels; Moment estimation; Random effects

Abstract

By comparing estimators of the variance of the idiosyncratic error at different levels of robustness, two Hausman-type test statistics are constructed for the existence of individual and time effects, respectively, in the panel regression model with incomplete data. The resulting test statistics have several desirable properties. Firstly, they are robust to the presence of one effect when the other is tested. Secondly, they are immune to non-normality of the disturbances, since no distributional conditions are needed in the construction of the statistics. Thirdly, they perform more robustly than the main competitors in the literature when the covariates are correlated with the effects. Additionally, they are very simple and carry no heavy computational burden. Joint tests for both effects are also discussed. Monte Carlo evidence shows that the proposed tests have the desired finite sample properties, and a real data analysis gives further support. © 2018 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

1. Introduction

Panel data models with two-way error components can capture the heteroscedasticity across individuals and time points, and have been widely used in econometrics. Since misspecification of the random effects in the error component usually causes modeling biases and even inefficient statistical inference, many econometric studies have focused on testing for the existence of random effects; see, e.g., Baltagi, Chang, and Li (1992), Bera, Sosa Escudero, and Yoon (2001), Breusch and Pagan (1980), Honda (1985), Moulton and Randolph (1989), Castagnetti, Rossi, and Trapani (2015a, b), Montes-Rojas (2010), etc. More details can also be found in the two books by Hsiao (2003) and Baltagi (2008).

Most of the existing literature deals with fully observed data; in practice, however, one often has to face incomplete data. For example, in labor or consumer panels, households may drop out after certain time points due to migration. Recently, statistical modeling for incomplete panels has received more and more attention in both theory and application. Readers can refer to Baltagi (1985), Baltagi, Chang, and Li (1998), Baltagi, Song, and Jung (2002), Oya (2004), Shao, Xiao, and Xu (2011), Song and Jung (2001), Sosa-Escudero and Bera (2008), Wansbeek and Kapteyn (1989), Yue, Li, and Zhang (2017), etc. Some of them focused on parameter estimation and the rest focused on hypothesis tests for the random effects. In this paper, we mainly develop some robust random effect tests for the panel regression model with incomplete data.

In the following, we give a brief review of the potential competitors in the literature. Baltagi and Li (1990) extended the Lagrange multiplier (LM) tests of Breusch and Pagan (1980) to the incomplete panel data case. Moulton (1987) extended the uniformly most powerful test (UMPT) of Honda (1985) to the incomplete one-way error component model. However, when there are high correlations among the regressors or the number of regressors is very large, the test of Moulton (1987) can lead to incorrect inference. Moulton and Randolph (1989) then proposed a standardized

* Corresponding author. E-mail address: [email protected] (J. Wu).

https://doi.org/10.1016/j.jkss.2018.04.002 1226-3192/© 2018 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

Please cite this article in press as: Chen, J., et al., Hausman-type tests for individual and time effects in the panel regression model with incomplete data. Journal of the Korean Statistical Society (2018), https://doi.org/10.1016/j.jkss.2018.04.002.


Lagrange multiplier (SLM) test, which has better critical value approximations. Baltagi et al. (1998) extended the standardized version of the one-sided LM test of Honda (1991) to the incomplete panel data case and derived the standardized version of the locally mean most powerful (LMMP) test suggested by Baltagi, Chang, and Li (1992). Note that the above individual (time) effect tests, designed for the one-way error component model, can have distorted empirical sizes if the omitted time (individual) effect does exist. Song and Jung (2001) therefore extended the results of Baltagi et al. (1992) to the incomplete case and proposed conditional LM tests for the presence of the individual (time) effect assuming that the time (individual) effect is present. Note that the above tests rely on normality of the idiosyncratic errors and on independence among the covariates, the effects and the idiosyncratic errors, although some of them are robust to the distributional assumptions. Moreover, most of the above tests may carry a heavy computational burden due to the iterative computation needed for parameter estimation.

For the case of complete data, Wu and Zhu (2011) proposed two robust tests based on the coefficient estimators of an artificial autoregression fitted to differenced residuals, at the cost of some loss of power. Wu and Li (2014) further constructed two moment-based tests by comparing estimators of the variance of the idiosyncratic error at different levels of robustness. The resulting tests have the following desirable properties, since they use one or two transformations to wipe out the possible time and/or individual effects. They are robust to the presence of one effect when the other is tested, which is achieved by applying the corresponding transformation to remove the redundant effect. They also allow dependence among the idiosyncratic errors, the covariates and the effects.
However, this mechanism is limited to balanced panel models, because when data are randomly missing in the panel, the time effect cannot be eliminated by the centering transformation. In this paper, we consider the two-way error component model with incomplete panels, where the missing patterns are random (MCAR). The corresponding cost is that individuals are required to come in pairs that share exactly the same missing pattern. Inspired by the first difference transformation in Wu (2016), we adopt two transformations, the first difference transformation over the individual index and the orthogonal transformation over the time index, to eliminate the possible time and individual effects, respectively. From the transformed models, we derive two different moment estimators of the variance of the idiosyncratic error. One is consistent regardless of the existence of random effects, and the other is consistent if and only if the individual (time) effect is absent. Motivated by Hausman's specification test (Hausman, 1978), we compare the two estimators at different levels of robustness and construct two test statistics for the existence of the individual effect and the time effect, respectively, which possess two robustness properties. The first is resistance to misspecification of one effect when testing the other; this robustness is achieved by applying the corresponding transformation to remove the redundant effect. This mechanism also ensures that the tests can be applied when the two effects are correlated with the covariates. We also demonstrate that the ANOVA F tests are asymptotically equivalent to variations of our tests.
A power study shows that the tests can detect local alternatives distinct from the null at the parametric rate and have larger asymptotic power than the corresponding ANOVA F tests when the individual heterogeneity effects are correlated with the regressors. The second robustness property is that the tests are distribution-free, since the statistics are based on the comparison of different moment estimators and no distributional assumptions are used in their construction. Additionally, they are very simple and carry no heavy computational burden. Joint tests for both effects are also discussed. Monte Carlo evidence shows that the proposed tests have the desired finite sample properties.

The rest of the paper is organized as follows. Section 2 presents the two-way error component model with incomplete panels and several higher order moment estimators of the variance of the idiosyncratic error used to construct moment-based test statistics. In Sections 3–5, we construct tests for the presence of the random effects and study their asymptotic properties under the null and the alternatives. Monte Carlo simulation experiments and the corresponding results are reported in Section 6. Section 7 applies our methods to a real data example. Technical proofs are relegated to the Appendix.

2. Model and estimation of moments

The error component regression model considered is

$$y_{it} = \alpha + X_{it}'\beta + u_{it}, \quad i = 1, 2, \ldots, n, \quad t = 1, 2, \ldots, T,$$

where $y_{it}$ denotes the observation on the dependent variable for the $i$th individual at the $t$th time period, $\alpha$ is a scalar, $X_{it}$ is the observation on $k$ explanatory variables for individual $i$ at time $t$, $\beta$ is a $k \times 1$ vector of coefficients of the explanatory variables and $u_{it}$ is the error component term. In this paper, the panel data are incomplete and, for generality, the missing pattern is allowed to be random. For notational convenience, we define $y_i = (y_{i1}, y_{i2}, \ldots, y_{iT_i})'$, where $y_{it}$ denotes the $t$th observed value for the $i$th individual and $T_i$ is the number of time periods with observations for the $i$th individual. Define $X_i$ similarly. The model can then be rewritten in vector form as

$$y_i = \alpha\iota_{T_i} + X_i\beta + u_i, \quad \text{with} \quad u_i = \mu_i\iota_{T_i} + D_i\eta + v_i, \quad i = 1, 2, \ldots, n, \tag{1}$$

where $u_i = (u_{i1}, u_{i2}, \ldots, u_{iT_i})'$, $\mu_i$ denotes the $i$th individual effect, which is assumed to be a random variable with zero mean and finite variance $\sigma_\mu^2$, $\iota_{T_i}$ is a $T_i$-dimensional vector with all elements equal to one, $v_i = (v_{i1}, v_{i2}, \ldots, v_{iT_i})'$ and $\{v_{it}\}$ are assumed to be IID$(0, \sigma_v^2)$ across individuals and time points, $\eta = (\eta_1, \eta_2, \ldots, \eta_T)'$, and $\eta_t$ denotes the $t$th time effect, which is assumed to be a random variable with zero mean and finite variance. Here $T$ denotes the total number of time periods, and $D_i$ denotes a $T_i \times T$ matrix obtained from the identity matrix $I_T$ by omitting the rows corresponding to time periods not observed for the $i$th cross section. The matrix $D_i$ thus records which time points are observed, and which are missing, for the $i$th individual.

Since the missing pattern is random, it is not easy to eliminate the time effect $\eta_t$ by the commonly used centering transformation (e.g. Wu & Li, 2014). For simplicity, we assume that individuals come in pairs sharing the same missing pattern, so that the time effect can be easily eliminated by a difference operator. That is, we assume that the individuals with the same missing pattern are written in pairs in vector form as follows,

$$y_{i-1} = \alpha\iota_{T_i} + X_{i-1}\beta + \mu_{i-1}\iota_{T_i} + D_i\eta + v_{i-1}, \tag{2}$$

$$y_i = \alpha\iota_{T_i} + X_i\beta + \mu_i\iota_{T_i} + D_i\eta + v_i, \quad i = 2j \le n. \tag{3}$$
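The role of the selection matrix $D_i$ and of the pairing assumption can be illustrated with a minimal NumPy sketch. The helper name `selection_matrix`, the 0-based period indexing and the numerical values are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def selection_matrix(observed, T):
    """Return the T_i x T matrix D_i: the rows of I_T for the observed periods (0-based)."""
    observed = np.sort(np.asarray(observed, dtype=int))
    return np.eye(T)[observed]

T = 5
observed = [0, 1, 3]                       # hypothetical pattern: periods 1, 2 and 4 are observed
D = selection_matrix(observed, T)

eta = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # illustrative time effects
picked = D @ eta                           # time effects attached to the observed periods only

# Two individuals sharing the same pattern: differencing removes D @ eta,
# leaving only the difference of the individual effects times the ones vector.
mu_prev, mu_cur = 0.3, -0.7
u_prev = mu_prev + D @ eta
u_cur = mu_cur + D @ eta
diff = u_cur - u_prev
```

Because both members of a pair carry the same $D_i\eta$, the difference contains no time effect, which is the reason the paired design is imposed.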

To construct robust test statistics, we first apply two transformations to model (3). Subtracting Eq. (2) from Eq. (3) gives

$$\Delta y_{2j} = \Delta X_{2j}\beta + \Delta\mu_{2j}\iota_{T_{2j}} + \Delta v_{2j}, \quad j = 1, 2, \ldots, m, \quad m = [n/2], \tag{4}$$

where $\Delta y_{2j} = (\Delta y_{2j,1}, \Delta y_{2j,2}, \ldots, \Delta y_{2j,T_{2j}})'$, $\Delta X_{2j} = (\Delta X_{2j,1}, \Delta X_{2j,2}, \ldots, \Delta X_{2j,T_{2j}})'$, $\Delta v_{2j} = (\Delta v_{2j,1}, \Delta v_{2j,2}, \ldots, \Delta v_{2j,T_{2j}})'$, "$[\cdot]$" denotes the integer part, and "$\Delta$" stands for the difference operator over the individual index, i.e., $\Delta y_{2j,t} = y_{2j,t} - y_{2j-1,t}$, $\Delta X_{2j,t} = X_{2j,t} - X_{2j-1,t}$, $\Delta v_{2j,t} = v_{2j,t} - v_{2j-1,t}$, $\Delta\mu_{2j} = \mu_{2j} - \mu_{2j-1}$. Then we can find a matrix $Q_{T_{2j}}$ such that $(\iota_{T_{2j}}/\sqrt{T_{2j}},\, Q_{T_{2j}})$ is a $T_{2j} \times T_{2j}$ orthogonal matrix, and hence $Q_{T_{2j}}'\iota_{T_{2j}} = 0$. Premultiplying model (4) by $Q_{T_{2j}}'$ to wipe out the individual effect, we obtain

$$Q_{T_{2j}}'\Delta y_{2j} = Q_{T_{2j}}'\Delta X_{2j}\beta + Q_{T_{2j}}'\Delta v_{2j}, \quad j = 1, 2, \ldots, m, \quad m = [n/2]. \tag{5}$$
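One way to obtain such a matrix $Q_T$ numerically is sketched below. The construction via a complete QR decomposition of the ones vector is an assumption chosen for convenience (any orthonormal basis of the complement of $\iota_T$ would do), and the helper name `within_basis` is hypothetical. The sketch also checks that $Q_TQ_T' = I_T - \iota_T\iota_T'/T$ whatever basis is picked, and evaluates $\mathrm{tr}(2\sigma_v^2 Q_TQ_T') = 2(T-1)\sigma_v^2$, the second moment of $\|Q_T'\Delta v\|$ when the components of $\Delta v$ are uncorrelated with variance $2\sigma_v^2$.

```python
import numpy as np

def within_basis(T):
    """T x (T-1) matrix Q with orthonormal columns, all orthogonal to the ones vector."""
    ones = np.ones((T, 1))
    # Complete QR of the ones vector: the first column is +/- ones/sqrt(T),
    # the remaining T-1 columns span its orthogonal complement.
    q_full, _ = np.linalg.qr(ones, mode="complete")
    return q_full[:, 1:]

T = 4
Q = within_basis(T)
ones = np.ones(T)
P = Q @ Q.T                                         # projector Q Q'
P_closed_form = np.eye(T) - np.outer(ones, ones) / T

# A different basis of the same subspace (Q rotated by an orthogonal R) gives the same projector.
rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.normal(size=(T - 1, T - 1)))
Q_alt = Q @ R

sigma2_v = 1.5                                      # illustrative error variance
second_moment = 2.0 * sigma2_v * np.trace(P)        # equals 2 (T - 1) sigma_v^2
```

Since only the projector enters the estimators, an implementation can work with $I_T - \iota_T\iota_T'/T$ directly and never form $Q_T$.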

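We consider the ordinary least squares (OLS) estimation for model (5),

$$\hat\beta = \arg\min_{\beta}\sum_{j=1}^{m}\|Q_{T_{2j}}'\Delta y_{2j} - Q_{T_{2j}}'\Delta X_{2j}\beta\|^2 = \Big(\sum_{j=1}^{m}\Delta X_{2j}'P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j}\Big)^{-1}\sum_{j=1}^{m}\Delta X_{2j}'P^{\perp}_{\iota_{T_{2j}}}\Delta y_{2j}, \tag{6}$$

where $\|\cdot\|$ is the Euclidean norm and $P^{\perp}_{\iota_{T_{2j}}} = Q_{T_{2j}}Q_{T_{2j}}' = I_{T_{2j}} - \frac{1}{T_{2j}}\iota_{T_{2j}}\iota_{T_{2j}}'$ does not depend on the choice of the matrix $Q_{T_{2j}}$. Denote $Q_{T_{2j}} = (q_{2j,1}, q_{2j,2}, \ldots, q_{2j,T_{2j}-1})$ and $q_{2j,l} = (q_{2j,1l}, q_{2j,2l}, \ldots, q_{2j,T_{2j}l})'$ for $l = 1, 2, \ldots, T_{2j}-1$, $j = 1, 2, \ldots, m$. It holds that

$$E\|Q_{T_{2j}}'\Delta v_{2j}\|^2 = E\Big[\sum_{l=1}^{T_{2j}-1}(q_{2j,l}'\Delta v_{2j})^2\Big] = 2(T_{2j}-1)\sigma_v^2,$$

$$E\Big[\sum_{l=1}^{T_{2j}-1}(q_{2j,l}'\Delta v_{2j})^4\Big] = 2\Big(\sum_{l=1}^{T_{2j}-1}\sum_{k=1}^{T_{2j}}q_{2j,kl}^4\Big)\gamma_v^4 + 3\Big[4(T_{2j}-1) - \sum_{l=1}^{T_{2j}-1}\sum_{k=1}^{T_{2j}}2q_{2j,kl}^4\Big](\sigma_v^2)^2,$$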

and then from model (5), we can obtain a moment estimator of the variance of the idiosyncratic error $v_{it}$ by

$$\hat\sigma_{0v}^2 = \frac{1}{nc_{0n}}\sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2,$$

where $c_{0n} = c_{1n} - 1$ with $c_{1n} = \frac{1}{n}\sum_{i=1}^{n}T_i$, and $\hat\beta$ is given by (6). Let $\gamma_v^4$ stand for the fourth order moment of the error $v_{it}$; we can then also obtain the estimator of $\gamma_v^4$

$$\hat\gamma_v^4 = \frac{1}{nc_{00n}}\sum_{j=1}^{m}\sum_{l=1}^{T_{2j}-1}[q_{2j,l}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)]^4 - 3\Big(\frac{2c_{0n}}{c_{00n}} - 1\Big)(\hat\sigma_{0v}^2)^2,$$

where $c_{00n} = \frac{1}{n}\sum_{i=1}^{n}\sum_{l=1}^{T_i-1}\sum_{k=1}^{T_i}q_{i,kl}^4$. Under some regularity conditions, we can prove that $\hat\beta$ is consistent, and then $\hat\sigma_{0v}^2$ and $\hat\gamma_v^4$ can be proven to be consistent as $n \to \infty$, regardless of the presence of individual and time effects. Note that all the asymptotic results in this paper are based on the setting in which the number of individuals $n$ tends to infinity and the numbers of time periods $T_i$ are fixed.
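The estimators $\hat\beta$ of Eq. (6) and $\hat\sigma_{0v}^2$ can be sketched as follows. The input layout (a list of tuples, one per pair of individuals sharing a missing pattern), the function names and the assumption that $n$ is even are illustrative choices, not taken from the paper. The demonstration uses data without idiosyncratic noise, so that the removal of individual and time effects can be checked exactly.

```python
import numpy as np

def within_projector(T):
    """I_T - (1/T) * ones * ones', i.e. the projector Q_T Q_T'."""
    return np.eye(T) - np.ones((T, T)) / T

def paired_within_estimates(pairs):
    """pairs: list of (y_prev, X_prev, y_cur, X_cur), each pair sharing one missing pattern.
    Returns beta_hat from Eq. (6) and the moment estimator sigma2_0v."""
    k = pairs[0][1].shape[1]
    xpx, xpy = np.zeros((k, k)), np.zeros(k)
    diffs = []
    for y_prev, X_prev, y_cur, X_cur in pairs:
        dy, dX = y_cur - y_prev, X_cur - X_prev       # difference over the individual index
        P = within_projector(len(dy))
        xpx += dX.T @ P @ dX
        xpy += dX.T @ P @ dy
        diffs.append((dy, dX, P))
    beta_hat = np.linalg.solve(xpx, xpy)
    n = 2 * len(pairs)                                # assumes an even number of individuals
    c1n = 2 * sum(len(dy) for dy, _, _ in diffs) / n  # average number of observed periods
    c0n = c1n - 1.0
    # ||Q'r||^2 equals r' P r, so Q itself is never needed.
    rss = sum((dy - dX @ beta_hat) @ P @ (dy - dX @ beta_hat) for dy, dX, P in diffs)
    return beta_hat, rss / (n * c0n)

# Noise-free illustration with large individual and time effects.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])
eta = rng.normal(size=6)                              # time effects for T = 6 periods
pairs = []
for j in range(10):
    obs = np.sort(rng.choice(6, size=4, replace=False))   # pattern shared within the pair
    members = []
    for _ in range(2):
        X = rng.normal(size=(4, 2))
        mu = 5.0 * rng.normal()
        members.append((0.5 + X @ beta_true + mu + eta[obs], X))
    (y0, X0), (y1, X1) = members
    pairs.append((y0, X0, y1, X1))

beta_hat, sigma2_0v = paired_within_estimates(pairs)
```

In this noise-free setting the estimate coincides with the true coefficients and the variance estimate is zero up to rounding, even though neither effect was modeled.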


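3. Test for the individual effect

Consider the hypotheses as follows,

$$H_0^{\mu}: \sigma_\mu^2 = 0 \quad \text{vs} \quad H_1^{\mu}: \sigma_\mu^2 > 0.$$

Note that, under $H_0^{\mu}: \sigma_\mu^2 = 0$, i.e., $\mu_1 = \mu_2 = \cdots = \mu_n = 0$, model (4) becomes

$$\Delta y_{2j} = \Delta X_{2j}\beta + \Delta v_{2j}, \quad j = 1, 2, \ldots, m, \quad m = [n/2]. \tag{7}$$

From the equation $E\|\Delta v_{2j}\|^2 = E(\sum_{k=1}^{T_{2j}}\Delta v_{2j,k}^2) = 2T_{2j}\sigma_v^2$, we get another estimator of $\sigma_v^2$,

$$\hat\sigma_{1v}^2 = \frac{1}{nc_{1n}}\sum_{j=1}^{m}\|\Delta y_{2j} - \Delta X_{2j}\hat\beta\|^2,$$

where $\hat\beta$ is given by (6). Note that $\hat\sigma_{0v}^2$ is consistent under both $H_0^{\mu}$ and $H_1^{\mu}$, whereas $\hat\sigma_{1v}^2$ is consistent only under $H_0^{\mu}$. So the difference of the two moment estimators should be small under $H_0^{\mu}$ but large under $H_1^{\mu}$. Following the idea of the test of Wu and Li (2014), we construct a Hausman-type test from the difference of the two estimators,

$$\Omega_\mu = \Phi_n^{-1/2}\cdot\sqrt{n}(\hat\sigma_{1v}^2 - \hat\sigma_{0v}^2),$$

where the scalar $\Phi_n = a_n\hat\gamma_v^4 + b_n(\hat\sigma_{0v}^2)^2$, and $a_n$ and $b_n$ are as follows,

$$a_n = \frac{1}{n}\sum_{j=1}^{m}2\Big(\frac{T_{2j}}{c_{1n}^2} - 2\frac{T_{2j}-1}{c_{1n}c_{0n}} + \frac{T_{2j}-1}{c_{0n}^2} - \frac{T_{2j}-1}{T_{2j}c_{0n}^2}\Big),$$

$$b_n = \frac{1}{n}\sum_{j=1}^{m}2\Big[\frac{T_{2j}(2T_{2j}+1)}{c_{1n}^2} - \frac{2(T_{2j}-1)(2T_{2j}+1)}{c_{1n}c_{0n}} + \frac{(T_{2j}-1)(2T_{2j}^2 - T_{2j} + 3)}{c_{0n}^2T_{2j}}\Big].$$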
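The scaling weights $a_n$, $b_n$ and the statistic $\Omega_\mu$ can be sketched as below. The formulas coded here are the ones read from the text, with the factor 2 placed in front of each summand; that placement is an assumption, supported by the balanced case $T_{2j} \equiv T$, where it gives $a_n = 0$ and $b_n = 4/(T(T-1))$. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def hausman_scale_weights(pair_lengths):
    """pair_lengths[j] = T_{2j}, the common number of observed periods of pair j.
    Assumes n = 2 * (number of pairs). Returns (a_n, b_n)."""
    T = np.asarray(pair_lengths, dtype=float)
    n = 2 * T.size
    c1n = 2 * T.sum() / n            # average number of observed periods per individual
    c0n = c1n - 1.0
    a_terms = (T / c1n**2 - 2 * (T - 1) / (c1n * c0n)
               + (T - 1) / c0n**2 - (T - 1) / (T * c0n**2))
    b_terms = (T * (2 * T + 1) / c1n**2
               - 2 * (T - 1) * (2 * T + 1) / (c1n * c0n)
               + (T - 1) * (2 * T**2 - T + 3) / (T * c0n**2))
    return 2 * a_terms.sum() / n, 2 * b_terms.sum() / n

def omega_mu(sigma2_1v, sigma2_0v, gamma4, a_n, b_n, n):
    """Hausman-type contrast sqrt(n) * (sigma2_1v - sigma2_0v), scaled by Phi_n^(-1/2)."""
    phi_n = a_n * gamma4 + b_n * sigma2_0v**2
    return np.sqrt(n) * (sigma2_1v - sigma2_0v) / np.sqrt(phi_n)

# Balanced illustration: 50 pairs, each observed for T = 5 periods.
a_n, b_n = hausman_scale_weights([5] * 50)
stat = omega_mu(sigma2_1v=1.1, sigma2_0v=1.0, gamma4=3.0, a_n=a_n, b_n=b_n, n=100)
```

In the balanced case the fourth moment drops out of $\Phi_n$, so the scaling depends on $\hat\sigma_{0v}^2$ only.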

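Assumptions.

(A1): $\{X_i\}$ are i.i.d. across individuals, $|\Sigma_i| > 0$, $i = 1, 2, 3$, with $\Sigma_1 = EX_i'X_i - EX_i'EX_i$, $\Sigma_2 = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}EX_i'P^{\perp}_{\iota_{T_i}}X_i$, $\Sigma_3 = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}(EX_i'P^{\perp}_{\iota_{T_i}}X_i - EX_i'P^{\perp}_{\iota_{T_i}}EX_i)$.

(A2): $E(v_{it}^4) < \infty$, $E(X_{it}v_{it}) = 0$, $\lim_{n\to\infty}\frac{2}{n}\sum_{i=1}^{n}[E(X_i'P^{\perp}_{\iota_{T_i}}v_iv_i'P^{\perp}_{\iota_{T_i}}X_i) - \sigma_v^2EX_i'P^{\perp}_{\iota_{T_i}}EX_i] = \Sigma_4$.

(A3): $\{\mu_i\}$ are i.i.d. with mean zero and variance $\sigma_\mu^2 = n^{-1/2}\sigma_1^2$ with a constant $\sigma_1^2 \ge 0$, and $E(\mu_iv_i) = 0$, $\lim_{n\to\infty}n^{-3/4}\sum_{i=1}^{n}E(X_i'\mu_i\iota_{T_i}) = \Sigma_5$, $n^{1/2}E(\mu_i^2\|v_i\|^2) = \Sigma_6$.

Theorem 1. For model (1), if Assumptions (A1)–(A3) hold, we have that $\Omega_\mu - \Phi_n^{-1/2}\sigma_1^2 \xrightarrow{D} N(0, 1)$, as $n \to \infty$, where $\Phi_n = a_n\hat\gamma_v^4 + b_n(\hat\sigma_{0v}^2)^2$.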

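Under $H_0^{\mu}$, we propose another consistent estimator of $\sigma_v^2$,

$$\tilde\sigma_{1v}^2 = \frac{1}{nc_{1n}}\sum_{j=1}^{m}\|\Delta y_{2j} - \Delta X_{2j}\tilde\beta_1\|^2,$$

where

$$\tilde\beta_1 = \arg\min_{\beta}\sum_{j=1}^{m}\|\Delta y_{2j} - \Delta X_{2j}\beta\|^2 = \Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\sum_{j=1}^{m}\Delta X_{2j}'\Delta y_{2j}.$$

We construct a new test statistic

$$\Omega_\mu^* = \Phi_n^{-1/2}\cdot\sqrt{n}(\tilde\sigma_{1v}^2 - \hat\sigma_{0v}^2) = \hat\sigma_{0v}^2\Phi_n^{-1/2}\cdot\sqrt{n}\Big(\frac{\tilde\sigma_{1v}^2}{\hat\sigma_{0v}^2} - 1\Big) = \theta_{1n}F_\mu + \omega_{1n},$$

where

$$F_\mu = \frac{\sum_{j=1}^{m}\{\|\Delta y_{2j} - \Delta X_{2j}\tilde\beta_1\|^2 - \|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2\}/n}{\sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2/(\sum_{i=1}^{n}T_i - n - 2k)},$$

$$\theta_{1n} = \hat\sigma_{0v}^2\Phi_n^{-1/2}\cdot\sqrt{n}\,\frac{c_{0n}}{c_{1n}}\Big(c_{1n} - 1 - \frac{2k}{n}\Big)^{-1} \quad \text{and} \quad \omega_{1n} = -\hat\sigma_{0v}^2\Phi_n^{-1/2}\cdot\sqrt{n}\,\frac{1}{c_{1n}}.$$

The constant term $\omega_{1n}$ is negative because $c_{0n}/c_{1n} - 1 = -1/c_{1n}$.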


The statistic $F_\mu$ is the ANOVA F test statistic in the case of incomplete panels. Under the assumption that $v_{it}$ is normally distributed, the statistic $F_\mu$ follows an exact F distribution with $n$ and $\sum_{i=1}^{n}T_i - n - 2k$ degrees of freedom.

Corollary 1. If the conditions of Theorem 1 hold, then

$$\Omega_\mu^* - \Phi_n^{-1/2}\Big(\sigma_1^2 - \frac{1}{c_{1n}}\Sigma_5'\Sigma_1^{-1}\Sigma_5\Big) \xrightarrow{D} N(0, 1), \quad \text{as } n \to \infty.$$

Note that $\Sigma_5 = 0$ when $\sigma_1^2 = 0$, and it can be shown that $0 \le \frac{1}{c_{1n}}\Sigma_5'\Sigma_1^{-1}\Sigma_5 \le \sigma_1^2$ and $\frac{1}{c_{1n}}\Sigma_5'\Sigma_1^{-1}\Sigma_5 = 0$ if and only if $\Sigma_5 = 0$. Since $\Omega_\mu^*$ is an affine transformation of the ANOVA F statistic, comparing the drift terms in Theorem 1 and Corollary 1 shows that our test $\Omega_\mu$ is asymptotically more powerful than the ANOVA F test when the individual effect is correlated with the covariates. However, the statistic $\Omega_\mu$ does not follow an F distribution up to an affine transformation.

4. Test for the time effect

In this section, we propose test statistics for the presence of the time effect and suppose that $\{\eta_t\}$ are random variables with zero mean and finite variance. To check the heteroscedasticity of $\{\eta_t\}$, the hypotheses for the time effect are formalized as follows (e.g. Wu & Li, 2014),

$$H_0^{\eta}: \mathrm{var}(\eta_1) = \mathrm{var}(\eta_2) = \cdots = \mathrm{var}(\eta_T) = 0 \quad \text{vs} \quad H_1^{\eta}: \text{at least one of them is nonzero.}$$

To eliminate the possible individual effect, we premultiply model (3) by $Q_{T_i}'$ and obtain

$$Q_{T_i}'y_i = Q_{T_i}'X_i\beta + Q_{T_i}'D_i\eta + Q_{T_i}'v_i, \quad i = 1, 2, \ldots, n.$$

Under $H_0^{\eta}$, since $E\|Q_{T_i}'v_i\|^2 = (T_i - 1)\sigma_v^2$, we derive a consistent estimator of $\sigma_v^2$,

$$\hat\sigma_{2v}^2 = \frac{1}{nc_{0n}}\sum_{i=1}^{n}\|Q_{T_i}'(y_i - X_i\hat\beta)\|^2, \quad \text{where } \hat\beta \text{ is given by (6).}$$

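Assumptions.

(A4): $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}D_i'P^{\perp}_{\iota_{T_i}}D_i = \Upsilon_1$, $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}EX_i'P^{\perp}_{\iota_{T_i}}v_iv_i'P^{\perp}_{\iota_{T_i}}X_i = \Sigma_7$, $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}EX_i'P^{\perp}_{\iota_{T_i}}D_i = \Sigma_8$.

To check the sensitivity of $\Omega_\eta$, we give the condition on $\eta$ as follows,

(A5): $\eta = n^{-1/4}\zeta$, where $\zeta$ is a $T$-dimensional random vector with zero mean and $\sigma_2^2 = E\|\zeta\|^2 < \infty$. The case with $\sigma_2^2 = 0$ corresponds to $H_0^{\eta}$, and $\zeta$ may be correlated with the covariates and even the idiosyncratic errors.

Theorem 2. If Assumptions (A1), (A2), (A4) and (A5) hold, then

$$\Omega_\eta - \frac{1}{\sqrt{2c_{0n}}}\,\zeta'\Upsilon_1\zeta/\hat\sigma_{0v}^2 \xrightarrow{D} N(0, 1), \quad \text{as } n \to \infty,$$

where

$$\Omega_\eta = \sqrt{\frac{nc_{0n}}{2}}\Big(\frac{\hat\sigma_{2v}^2}{\hat\sigma_{0v}^2} - 1\Big).$$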
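The time-effect statistic $\Omega_\eta = \sqrt{nc_{0n}/2}\,(\hat\sigma_{2v}^2/\hat\sigma_{0v}^2 - 1)$ needs no fourth-moment estimate, so it can be sketched in a few lines. The input layout (a list of `(y_i, X_i)` arrays in which consecutive entries form a pair with the same missing pattern) and the function names are assumptions made for illustration; $n$ is taken to be even.

```python
import numpy as np

def within_projector(T):
    """I_T - (1/T) * ones * ones', i.e. the projector Q_T Q_T'."""
    return np.eye(T) - np.ones((T, T)) / T

def omega_eta(units):
    """Time-effect statistic sqrt(n * c0n / 2) * (sigma2_2v / sigma2_0v - 1).

    units: list of (y_i, X_i); units[2j] and units[2j + 1] are assumed to share
    the same missing pattern (hypothetical input layout).
    """
    n = len(units)
    k = units[0][1].shape[1]
    xpx, xpy = np.zeros((k, k)), np.zeros(k)
    for j in range(0, n, 2):
        (y_a, X_a), (y_b, X_b) = units[j], units[j + 1]
        dy, dX = y_b - y_a, X_b - X_a
        P = within_projector(len(dy))
        xpx += dX.T @ P @ dX
        xpy += dX.T @ P @ dy
    beta_hat = np.linalg.solve(xpx, xpy)              # Eq. (6)

    c0n = sum(len(y) for y, _ in units) / n - 1.0
    rss_pairs = 0.0    # sum_j ||Q'(dy_2j - dX_2j beta_hat)||^2
    rss_units = 0.0    # sum_i ||Q'(y_i - X_i beta_hat)||^2
    for j in range(0, n, 2):
        (y_a, X_a), (y_b, X_b) = units[j], units[j + 1]
        P = within_projector(len(y_a))
        r_a, r_b = y_a - X_a @ beta_hat, y_b - X_b @ beta_hat
        rss_units += r_a @ P @ r_a + r_b @ P @ r_b
        d = r_b - r_a
        rss_pairs += d @ P @ d
    sigma2_0v = rss_pairs / (n * c0n)
    sigma2_2v = rss_units / (n * c0n)
    return np.sqrt(n * c0n / 2.0) * (sigma2_2v / sigma2_0v - 1.0)
```

Under the null, the statistic would be compared with standard normal critical values, as Theorem 2 states. Adding an arbitrary constant to all observations of one individual leaves the value unchanged, which reflects the robustness to individual effects described in the text.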

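Similar to the statistic $\Omega_\mu^*$ constructed in the previous section, we can also derive another consistent estimator of $\sigma_v^2$ under $H_0^{\eta}$ as follows,

$$\tilde\sigma_{2v}^2 = \frac{1}{nc_{0n}}\sum_{i=1}^{n}\|Q_{T_i}'(y_i - X_i\tilde\beta_2)\|^2,$$

where

$$\tilde\beta_2 = \arg\min_{\beta}\sum_{i=1}^{n}\|Q_{T_i}'(y_i - X_i\beta)\|^2 = \Big(\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}X_i\Big)^{-1}\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}y_i.$$

Corollary 2. If the conditions of Theorem 2 hold, then

$$\Omega_\eta^* - \frac{1}{\sqrt{2c_{0n}}}\,\zeta'\Upsilon_1\zeta/\hat\sigma_{0v}^2 \xrightarrow{D} N(0, 1), \quad \text{as } n \to \infty,$$

where

$$\Omega_\eta^* = \sqrt{\frac{nc_{0n}}{2}}\Big(\frac{\tilde\sigma_{2v}^2}{\hat\sigma_{0v}^2} - 1\Big).$$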


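Further, it is easy to obtain that under $H_0^{\eta}$,

$$\Omega_\eta^* = \frac{k\sqrt{nc_{0n}/2}}{\sum_{i=1}^{n}T_i - n - 2k}\,F_\eta,$$

where

$$F_\eta = \frac{\{\sum_{i=1}^{n}\|Q_{T_i}'(y_i - X_i\tilde\beta_2)\|^2 - \sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2\}/k}{\sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2/(\sum_{i=1}^{n}T_i - n - 2k)}.$$

The ANOVA F test statistic $F_\eta$ follows an F distribution with $k$ and $\sum_{i=1}^{n}T_i - n - 2k$ degrees of freedom when the error $v_{it}$ is normally distributed.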

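5. Test jointly for both individual and time effects

To test for the presence of individual and time effects jointly, we consider the hypotheses as follows,

$$H_0^{\mu\eta}: \sigma_\mu^2 = \mathrm{var}(\eta_1) = \mathrm{var}(\eta_2) = \cdots = \mathrm{var}(\eta_T) = 0 \quad \text{vs} \quad H_1^{\mu\eta}: \text{at least one of them is nonzero.}$$

Under $H_0^{\mu\eta}$, model (1) reduces to

$$y_i = \alpha\iota_{T_i} + X_i\beta + v_i, \quad i = 1, 2, \ldots, n.$$

Since $E\|v_i\|^2 = T_i\sigma_v^2$, we derive a consistent estimator of $\sigma_v^2$ as follows,

$$\hat\sigma_{3v}^2 = \frac{1}{nc_{1n}}\sum_{i=1}^{n}\|y_i - \hat\alpha\iota_{T_i} - X_i\hat\beta\|^2,$$

where $\hat\alpha = \frac{1}{nc_{1n}}\sum_{i=1}^{n}\iota_{T_i}'(y_i - X_i\hat\beta)$ and $\hat\beta$ is given by (6). To study the asymptotic behavior of the joint test statistic $\Omega_{\mu\eta}$ defined below, we give the assumptions as follows,

(A6): $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}D_i'D_i = \Upsilon_2$, $\lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}D_i'\iota_{T_i} = \Omega_3$, $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}EX_i'D_i = \Sigma_9$, $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}EX_i'\iota_{T_i} = \Sigma_{10}$.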

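Theorem 3. If Assumptions (A1)–(A6) hold, then

$$\Omega_{\mu\eta} - \Phi_n^{*-1/2}\Big(\sigma_1^2 + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta\Big) \xrightarrow{D} N(0, 1), \quad \text{as } n \to \infty,$$

where $\Omega_{\mu\eta} = \Phi_n^{*-1/2}\sqrt{n}(\hat\sigma_{3v}^2 - \hat\sigma_{0v}^2)$, $\Phi_n^* = a_n^*\hat\gamma_v^4 + (b_n^* + \frac{2}{c_{0n}})(\hat\sigma_{0v}^2)^2$ with

$$a_n^* = \frac{1}{n}\sum_{i=1}^{n}\Big(\frac{T_i}{c_{1n}^2} - 2\frac{T_i-1}{c_{1n}c_{0n}} + \frac{T_i-1}{c_{0n}^2} - \frac{T_i-1}{c_{0n}^2T_i}\Big)$$

and

$$b_n^* = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{T_i(T_i-1)}{c_{1n}^2} - \frac{2(T_i-1)^2}{c_{1n}c_{0n}} + \frac{(T_i-1)(2T_i^2 - 2T_i + 3)}{c_{0n}^2T_i}\Big].$$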

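From model (3), we derive an estimator that is consistent only under $H_0^{\mu\eta}$ as follows,

$$\tilde\beta_3 = \Big(\sum_{i=1}^{n}X_i'X_i - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\sum_{i=1}^{n}\iota_{T_i}'X_i\Big)^{-1}\Big(\sum_{i=1}^{n}X_i'y_i - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\sum_{i=1}^{n}\iota_{T_i}'y_i\Big),$$

$$\tilde\alpha_3 = \frac{1}{nc_{1n}}\sum_{i=1}^{n}\iota_{T_i}'(y_i - X_i\tilde\beta_3) \quad \text{and} \quad \tilde\sigma_{3v}^2 = \frac{1}{nc_{1n}}\sum_{i=1}^{n}\|y_i - \tilde\alpha_3\iota_{T_i} - X_i\tilde\beta_3\|^2.$$

As in the above two sections, we construct another test statistic

$$\Omega_{\mu\eta}^* = \Phi_n^{*-1/2}\cdot\sqrt{n}(\tilde\sigma_{3v}^2 - \hat\sigma_{0v}^2) = \theta_{1n}^*F_{\mu\eta} + \omega_{1n}^*,$$

where

$$F_{\mu\eta} = \frac{\{\sum_{i=1}^{n}\|y_i - \tilde\alpha_3\iota_{T_i} - X_i\tilde\beta_3\|^2 - \sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2\}/(n - 1 + k)}{\sum_{j=1}^{m}\|Q_{T_{2j}}'(\Delta y_{2j} - \Delta X_{2j}\hat\beta)\|^2/(\sum_{i=1}^{n}T_i - n - 2k)},$$

$$\theta_{1n}^* = \hat\sigma_{0v}^2\Phi_n^{*-1/2}\cdot\sqrt{n}\,\frac{c_{0n}}{c_{1n}}\cdot\frac{n - 1 + k}{\sum_{i=1}^{n}T_i - n - 2k} \quad \text{and} \quad \omega_{1n}^* = -\hat\sigma_{0v}^2\Phi_n^{*-1/2}\cdot\sqrt{n}\,\frac{1}{c_{1n}}.$$

As for $\omega_{1n}$ in Section 3, the constant term is negative because $c_{0n}/c_{1n} - 1 = -1/c_{1n}$.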


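When the error $v_{it}$ is normally distributed, the ANOVA F test statistic $F_{\mu\eta}$ follows an F distribution with $n - 1 + k$ and $\sum_{i=1}^{n}T_i - n - 2k$ degrees of freedom.

Corollary 3. Suppose that $EX_i'X_i = \Sigma_0$ and the conditions of Theorem 3 hold, then

$$\Omega_{\mu\eta}^* - \Phi_n^{*-1/2}\Big(\sigma_1^2 - \kappa + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta\Big) \xrightarrow{D} N(0, 1), \quad \text{as } n \to \infty,$$

where $\Omega_{\mu\eta}^* = \hat\sigma_{0v}^2\Phi_n^{*-1/2}\sqrt{n}\Big(\frac{\tilde\sigma_{3v}^2}{\hat\sigma_{0v}^2} - 1\Big)$, $\kappa = \frac{1}{c_{1n}}(\Sigma_5 + \Sigma_9\zeta)'(\Sigma_0 - \frac{1}{c_{1n}}\Sigma_{10}\Sigma_{10}')(\Sigma_5 + \Sigma_9\zeta)$ and $\Phi_n^* = a_n^*\hat\gamma_v^4 + (b_n^* + \frac{2}{c_{1n}})(\hat\sigma_{0v}^2)^2$.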

6. Simulation studies

We consider the linear regression model as follows,

$$y_{it} = 0.5 + X_{it}^{(1)} + 2X_{it}^{(2)} + \mu_i + \eta_t + v_{it}, \quad i = 1, 2, \ldots, n, \quad t = 1, 2, \ldots, T_i,$$

where $T_i = T - p_i$, $T$ is the total number of time periods, and $p_i$ is the number of missed time periods, drawn from the truncated Poisson distribution with mean 1 such that $T_i = T - p_i > 0$. The variables $X_{it}^{(1)}$, $X_{it}^{(2)}$, $\mu_i$ and $\eta_t$ follow normal distributions with zero mean, $\mathrm{var}(X_{it}^{(1)}) = \mathrm{var}(X_{it}^{(2)}) = 1$, $\mathrm{var}(\mu_i) = \sigma_\mu^2$, $\mathrm{var}(\eta_t) = \sigma_\eta^2$, $\mathrm{corr}(X_{it}^{(1)}, \mu_i) = \rho$, $\mathrm{corr}(X_{it}^{(1)}, X_{is}^{(1)}) = \rho^2$ for $t \ne s$, and $v_{it}$ follows the standard normal distribution $N(0, 1)$ or the scaled Student's t distribution $\sqrt{0.6}\,t(5)$ with 5 degrees of freedom. Let $X_i^{(1)} = (X_{i1}^{(1)}, \ldots, X_{iT_i}^{(1)})'$. The sequences $\{X_i^{(1)}\}$, $\{X_{it}^{(2)}\}$, $\{\mu_i\}$, $\{\eta_t\}$ and $\{v_{it}\}$ are set to be i.i.d., and are independent of each other except for $\{X_i^{(1)}\}$ and $\{\mu_i\}$ when $\rho \ne 0$.

Three Monte Carlo experiments are carried out to compare the size, power and robustness properties of our tests and the competitors mentioned in the Introduction. For the individual effect and the time effect, respectively, the competitors are the test of Baltagi and Li (1990), which extended the Breusch and Pagan (1980) test ($LM_\mu$ and $LM_\eta$) to the incomplete two-way error component model; the Honda (1985) test ($HO_\mu$ and $HO_\eta$) using incomplete panel data and its standardized version ($SLM_\mu$ and $SLM_\eta$) derived by Moulton and Randolph (1989); the conditional LM tests ($CLM_\mu$ and $CLM_\eta$) provided by Song and Jung (2001), which extended the work of Baltagi et al. (1992) to the incomplete two-way error component model; and the ANOVA F test ($F_\mu$ and $F_\eta$). Moreover, it is reported that the test of Oya (2004) and the HO test are asymptotically equivalent (see, e.g., Oya, 2004). For the joint hypothesis test for both effects, the main potential competitors in the literature are the Baltagi and Li (1990) test ($LM_{\mu\eta}$), the ANOVA F test ($F_{\mu\eta}$) and the Baltagi et al. (1998) tests. The latter include five tests denoted by $HO_{\mu\eta}$, $SHO_{\mu\eta}$, $KW_{\mu\eta}$, $SKW_{\mu\eta}$ and $GHM_{\mu\eta}$, which respectively extended the Honda (1991) test, the standardized version of the Honda (1991) test, the King and Wu (1997) test, the standardized version of the King and Wu (1997) test and the Gourieroux, Holly, and Monfort (1982) test to the incomplete panel data case. We perform 1000 replications for all experiments and obtain the empirical sizes and powers at the nominal level 5%.

The first experiment concerns the performance of the test statistics for the individual effect in the case with $\rho = 0$. Tables 1–2 present simulation results for panel sizes $(n, T) = (100, 5)$, $(200, 5)$ and $(200, 10)$, with $v_{it} \sim N(0, 1)$ and $\sqrt{0.6}\,t(5)$ respectively. In Table 1, we see that, in the absence of time effects, the tests have the desired empirical sizes except for $CLM_\mu$.
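The data-generating design described at the start of this section can be sketched as follows. Several details are assumptions introduced here because the text does not spell them out: the truncated Poisson draw is implemented by rejection, the missing periods are chosen uniformly at random, consecutive individuals form the pairs that share a pattern, and the correlation structure of $X^{(1)}$ is produced through a standardized latent factor $z_i$ with $\mu_i = \sigma_\mu z_i$. All function names are hypothetical.

```python
import numpy as np

def draw_missing_count(rng, T, mean=1.0):
    """Poisson(mean) draw truncated to 0..T-1 by rejection, so that T_i = T - p_i > 0."""
    while True:
        p = int(rng.poisson(mean))
        if p < T:
            return p

def simulate_panel(n, T, sigma_mu, sigma_eta, rho, rng, t_errors=False):
    """Return a list of (y_i, X_i, observed_periods); units 2j and 2j+1 share a pattern."""
    eta = sigma_eta * rng.normal(size=T)
    units = []
    for _ in range(n // 2):
        p = draw_missing_count(rng, T)
        observed = np.sort(rng.choice(T, size=T - p, replace=False))
        Ti = observed.size
        for _ in range(2):                       # both members of the pair share `observed`
            z = rng.normal()                     # standardized individual factor, mu_i = sigma_mu * z
            # corr(x1_t, mu) = rho and corr(x1_t, x1_s) = rho^2 for t != s, with unit variance
            x1 = rho * z + np.sqrt(1.0 - rho**2) * rng.normal(size=Ti)
            x2 = rng.normal(size=Ti)
            if t_errors:
                v = np.sqrt(0.6) * rng.standard_t(5, size=Ti)   # unit-variance scaled t(5)
            else:
                v = rng.normal(size=Ti)
            y = 0.5 + x1 + 2.0 * x2 + sigma_mu * z + eta[observed] + v
            units.append((y, np.column_stack([x1, x2]), observed))
    return units

panel = simulate_panel(n=20, T=5, sigma_mu=0.2, sigma_eta=0.5, rho=0.4,
                       rng=np.random.default_rng(1))
```

The scaling $\sqrt{0.6}$ gives the t(5) errors unit variance, since $\mathrm{var}(t(5)) = 5/3$, which keeps the two error designs comparable.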
However, when $\sigma_\eta^2$ becomes large, the sizes of $LM_\mu$, $HO_\mu$ and $SLM_\mu$ are all distorted, whereas $F_\mu$ and $\Omega_\mu$ still perform well. The test $CLM_\mu$ is also size-distorted for very short time lengths such as $T_i \le 5$, but it attains the desired empirical size and is robust to the presence of time effects as the time length increases. Also, we see that the two-sided $LM_\mu$ test rejects the null hypothesis too frequently, while the one-sided $HO_\mu$ and $SLM_\mu$ tests fall far below the nominal size. This confirms similar results for the incomplete two-way error component model by Baltagi et al. (1998). In addition, our test $\Omega_\mu$ dominates all other tests in power except $CLM_\mu$ in the cases of small panel size. In particular, $\Omega_\mu$ slightly outperforms $F_\mu$ in nearly all cases.

Table 2 lists the performance of all the test statistics under the non-normal setting, where $\{v_{it}\}$ follows the scaled Student's t distribution $\sqrt{0.6}\,t(5)$. It shows that these test statistics perform similarly for the different distributions of $v_{it}$; hence they are robust to non-normality.

Table 3 presents the results of these statistics for the sample size $(n, T) = (200, 10)$ with $\rho = 0.4$ and $0.8$. When $\rho = 0$, the correlated setting reduces to the standard setting in evaluating the empirical sizes of the statistics. Here we only consider the case in which $\{v_{it}\}$ follows the standard normal distribution. It shows that, when the value of $\rho$ increases, the empirical powers of the competing tests vary dramatically, whereas the powers of our test $\Omega_\mu$ are not affected. That is, our test is robust to correlation between the covariates and the effects.

The second experiment evaluates the performance of the proposed test $\Omega_\eta$ and five other commonly used tests for the time effect, $LM_\eta$, $HO_\eta$, $SLM_\eta$, $CLM_\eta$ and $F_\eta$. We conduct the simulation experiment under the normal setting $v_{it} \sim N(0, 1)$, with sample size $(n, T) = (100, 5)$ and correlation coefficient $\rho = 0$.
In Table 4, we see that our proposed statistic is not as good as expected, since its power is smaller than those of the other statistics except $F_\eta$. However, our test statistic $\Omega_\eta$ and $CLM_\eta$ are immune to misspecification of the individual effect, whereas $LM_\eta$, $HO_\eta$ and $SLM_\eta$ are negatively affected by it. We also run the experiment under the correlated setting, and the corresponding results are similar to those of the first experiment.

The third experiment compares the proposed test $\Omega_{\mu\eta}$ with the following seven joint tests for both random effects, $LM_{\mu\eta}$, $HO_{\mu\eta}$, $KW_{\mu\eta}$, $SHO_{\mu\eta}$, $SKW_{\mu\eta}$, $GHM_{\mu\eta}$ and $F_{\mu\eta}$. The idiosyncratic error follows the standard normal distribution


Table 1
Empirical sizes and powers of the test Ωµ and other five tests for the individual effect under the standard setting. The nominal rate is 5%.

ση    σµ    LMµ    HOµ    SLMµ   CLMµ   Fµ     Ωµ

(n, T) = (100, 5)
0.0   0.0   0.042  0.044  0.053  0.124  0.065  0.062
0.0   0.1   0.061  0.085  0.093  0.212  0.107  0.136
0.0   0.2   0.162  0.293  0.311  0.518  0.248  0.321
0.5   0.0   0.207  0.003  0.009  0.134  0.069  0.065
0.5   0.1   0.152  0.013  0.015  0.198  0.093  0.137
0.5   0.2   0.101  0.084  0.083  0.496  0.285  0.311
1.0   0.0   0.715  0.001  0.000  0.132  0.065  0.063
1.0   0.1   0.718  0.004  0.004  0.208  0.129  0.135
1.0   0.2   0.565  0.007  0.008  0.498  0.257  0.310

(n, T) = (200, 5)
0.0   0.0   0.058  0.052  0.054  0.140  0.060  0.052
0.0   0.1   0.063  0.121  0.128  0.222  0.124  0.132
0.0   0.2   0.173  0.452  0.468  0.499  0.359  0.472
0.5   0.0   0.388  0.005  0.004  0.139  0.063  0.058
0.5   0.1   0.277  0.012  0.009  0.208  0.123  0.125
0.5   0.2   0.173  0.092  0.083  0.479  0.383  0.404
1.0   0.0   0.882  0.001  0.001  0.132  0.066  0.056
1.0   0.1   0.830  0.001  0.002  0.218  0.129  0.136
1.0   0.2   0.706  0.014  0.012  0.505  0.367  0.407

(n, T) = (200, 10)
0.0   0.0   0.053  0.061  0.057  0.042  0.061  0.055
0.0   0.1   0.117  0.216  0.218  0.211  0.203  0.226
0.0   0.2   0.823  0.926  0.941  0.919  0.761  0.834
0.5   0.0   0.394  0.003  0.002  0.052  0.066  0.056
0.5   0.1   0.193  0.012  0.011  0.216  0.217  0.221
0.5   0.2   0.226  0.388  0.364  0.896  0.774  0.802
1.0   0.0   0.956  0.000  0.000  0.048  0.062  0.058
1.0   0.1   0.897  0.000  0.000  0.208  0.210  0.225
1.0   0.2   0.595  0.007  0.014  0.866  0.800  0.795

N(0, 1), the sample size is set to (n, T ) = (200, 5), and the correlation coefficient is set to ρ = 0 and 0.8. Table 5 lists the empirical sizes and powers of these tests. From this table, we see that the empirical sizes of HOµη and KWµη are inaccurate, but the empirical sizes of standardized version of these tests and the other tests are closer to the true significant value. This coincides with the results of Baltagi et al. (1998). Table 5 also shows that Ωµη performs better in power than LMµη , HOµη , KWµη and SKWµη when ση = 0 and σµ is relatively small. We also see that the test Ωµη is more powerful than Fµη . Note that the correlation happens only between individual effect and covariates, and then there is no correlation when σµ = 0. It is clear from Table 5 that the power of our test Ωµη is robust to possible correlation, however, all the other tests are affected severely by the high correlation. 7. A real example To investigate the productivity of public capital in private production, Munnell (1990) proposed the following Cobb– Douglas production function, lnY = α + β1 lnK1 + β2 lnK2 + β3 lnL + β4 Unemp + u, where Y is gross state product, K1 is public capital which includes highways and streets, water and sewer facilities and other public buildings and structures, K2 is the private capital stock based on the Bureau of Economic Analysis national stock estimates, L is labor input measured as employment in nonagricultural payrolls, Unemp is the state unemployment rate included to capture business cycle effects. This panel data consists of annual observations of 48 contiguous states covering the period 1970–1986. Baltagi and Pinnoi (1995) performed a simple F -test to test the significance of state specific effects and a test for time effects given the existence of state specific effects. These turned out to be significant at the 5% level. 
In order to demonstrate the performance of the proposed tests clearly, we chose three subsets of this data set that closely resemble the missing patterns in the simulation study. Data 1, Data 2 and Data 3 are observed with total numbers of time periods T = 5, 10 and 17, respectively, and the number of missed time periods is drawn from the truncated Poisson distribution with mean one. For the three data sets, we compute the values of our statistics and of the competitors for comparison. Table 6 gives the values of the test statistics for individual effects; the corresponding p-values are all less than 0.0001 for the three sets. Clearly, for all three sets, every test mentioned above indicates that the individual effect is present. However, since it is uncertain whether a time effect is present, the inferences drawn from LMµ, HOµ and SLMµ are not convincing.

Please cite this article in press as: Chen, J., et al., Hausman-type tests for individual and time effects in the panel regression model with incomplete data. Journal of the Korean Statistical Society (2018), https://doi.org/10.1016/j.jkss.2018.04.002.
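The construction of an incomplete panel with Poisson-distributed numbers of missed periods can be sketched as follows. This is only a rough illustration under stated assumptions: the truncation rule (redraw until at least two periods remain), the uniformly random choice of which periods are dropped, and the helper names `poisson_draw`, `missed_periods` and `observed_pattern` are all hypothetical, since the exact scheme is not restated in this section.

```python
import math
import random

def poisson_draw(rng, lam=1.0):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def missed_periods(rng, total_periods, min_observed=2):
    """Truncated draw (assumed rule): redraw until min_observed periods remain."""
    while True:
        k = poisson_draw(rng)
        if k <= total_periods - min_observed:
            return k

def observed_pattern(rng, total_periods):
    """Return the sorted list of periods kept for one unit."""
    k = missed_periods(rng, total_periods)
    dropped = set(rng.sample(range(1, total_periods + 1), k))
    return [t for t in range(1, total_periods + 1) if t not in dropped]

rng = random.Random(2018)
# 48 units with T = 5, mirroring the dimensions of Data 1.
panel = {unit: observed_pattern(rng, total_periods=5) for unit in range(1, 49)}
```

Each entry of `panel` lists the periods observed for one unit, so the number of observed periods varies across units as in an incomplete panel.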


Table 2 Empirical sizes and powers of the test Ωµ and other five tests for the individual effect under the non-normal setting. The nominal rate is 5%.

| (n, T) | ση | σµ | LMµ | HOµ | SLMµ | CLMµ | Fµ | Ωµ |
|---|---|---|---|---|---|---|---|---|
| (100, 5) | 0.0 | 0.0 | 0.048 | 0.041 | 0.050 | 0.126 | 0.065 | 0.066 |
| | | 0.1 | 0.053 | 0.094 | 0.110 | 0.187 | 0.094 | 0.141 |
| | | 0.2 | 0.174 | 0.276 | 0.302 | 0.465 | 0.239 | 0.309 |
| | 0.5 | 0.0 | 0.237 | 0.005 | 0.007 | 0.130 | 0.071 | 0.068 |
| | | 0.1 | 0.145 | 0.010 | 0.009 | 0.192 | 0.106 | 0.150 |
| | | 0.2 | 0.092 | 0.091 | 0.072 | 0.487 | 0.267 | 0.292 |
| | 1.0 | 0.0 | 0.782 | 0.002 | 0.001 | 0.128 | 0.068 | 0.067 |
| | | 0.1 | 0.706 | 0.002 | 0.003 | 0.201 | 0.109 | 0.129 |
| | | 0.2 | 0.564 | 0.004 | 0.008 | 0.455 | 0.259 | 0.301 |
| (200, 5) | 0.0 | 0.0 | 0.049 | 0.056 | 0.056 | 0.140 | 0.060 | 0.058 |
| | | 0.1 | 0.065 | 0.107 | 0.118 | 0.199 | 0.121 | 0.131 |
| | | 0.2 | 0.289 | 0.428 | 0.429 | 0.501 | 0.362 | 0.432 |
| | 0.5 | 0.0 | 0.353 | 0.002 | 0.004 | 0.125 | 0.063 | 0.052 |
| | | 0.1 | 0.291 | 0.017 | 0.006 | 0.202 | 0.126 | 0.137 |
| | | 0.2 | 0.135 | 0.106 | 0.098 | 0.496 | 0.352 | 0.407 |
| | 1.0 | 0.0 | 0.876 | 0.000 | 0.001 | 0.138 | 0.064 | 0.057 |
| | | 0.1 | 0.827 | 0.002 | 0.003 | 0.187 | 0.113 | 0.134 |
| | | 0.2 | 0.709 | 0.008 | 0.012 | 0.503 | 0.348 | 0.410 |
| (200, 10) | 0.0 | 0.0 | 0.051 | 0.047 | 0.057 | 0.046 | 0.059 | 0.053 |
| | | 0.1 | 0.120 | 0.246 | 0.212 | 0.212 | 0.204 | 0.215 |
| | | 0.2 | 0.866 | 0.923 | 0.935 | 0.901 | 0.764 | 0.789 |
| | 0.5 | 0.0 | 0.372 | 0.002 | 0.003 | 0.050 | 0.062 | 0.055 |
| | | 0.1 | 0.185 | 0.007 | 0.021 | 0.211 | 0.178 | 0.208 |
| | | 0.2 | 0.231 | 0.318 | 0.321 | 0.898 | 0.772 | 0.776 |
| | 1.0 | 0.0 | 0.958 | 0.001 | 0.000 | 0.048 | 0.057 | 0.052 |
| | | 0.1 | 0.891 | 0.000 | 0.001 | 0.209 | 0.201 | 0.211 |
| | | 0.2 | 0.645 | 0.022 | 0.015 | 0.902 | 0.788 | 0.796 |

Table 3 Empirical sizes and powers of the test Ωµ and other five tests for the individual effect under the correlated setting. The nominal rate is 5% and (n, T) = (200, 10).

| ση | σµ | ρ | LMµ | HOµ | SLMµ | CLMµ | Fµ | Ωµ |
|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.1 | 0.4 | 0.078 | 0.148 | 0.181 | 0.198 | 0.159 | 0.221 |
| | | 0.8 | 0.043 | 0.064 | 0.060 | 0.089 | 0.093 | 0.251 |
| | 0.2 | 0.4 | 0.583 | 0.756 | 0.787 | 0.889 | 0.677 | 0.776 |
| | | 0.8 | 0.070 | 0.105 | 0.120 | 0.189 | 0.289 | 0.777 |
| 0.5 | 0.1 | 0.4 | 0.203 | 0.004 | 0.005 | 0.212 | 0.169 | 0.230 |
| | | 0.8 | 0.350 | 0.001 | 0.003 | 0.076 | 0.087 | 0.252 |
| | 0.2 | 0.4 | 0.112 | 0.161 | 0.157 | 0.876 | 0.639 | 0.784 |
| | | 0.8 | 0.257 | 0.005 | 0.004 | 0.223 | 0.277 | 0.729 |
| 1.0 | 0.1 | 0.4 | 0.932 | 0.000 | 0.000 | 0.223 | 0.154 | 0.221 |
| | | 0.8 | 0.951 | 0.000 | 0.000 | 0.077 | 0.098 | 0.245 |
| | 0.2 | 0.4 | 0.745 | 0.003 | 0.001 | 0.899 | 0.628 | 0.774 |
| | | 0.8 | 0.937 | 0.000 | 0.000 | 0.222 | 0.286 | 0.750 |

Table 7 gives the results of the tests for the time effect. For all three data sets, the null hypothesis is rejected by CLMη, Fη and Ωη, with very small p-values. However, except for Data 2, the null hypothesis cannot be rejected by LMη, HOη and SLMη at the 0.05 significance level. This may be because the individual effect is present, which affects the inferences from LMη, HOη and SLMη. Finally, we also consider the joint tests for the presence of both effects. The corresponding p-values of these tests are all less than 0.0001 for the three sets mentioned above, although the values of the statistics differ markedly because of their different asymptotic distributions under the null. See Table 8 for more details.


Table 4 Empirical sizes and powers of the test Ωη and other five tests for the time effect. The nominal rate is 5%, (n, T) = (100, 5).

| σµ | ση | LMη | HOη | SLMη | CLMη | Fη | Ωη |
|---|---|---|---|---|---|---|---|
| 0.0 | 0.0 | 0.043 | 0.041 | 0.066 | 0.082 | 0.070 | 0.052 |
| | 0.1 | 0.192 | 0.257 | 0.362 | 0.355 | 0.108 | 0.135 |
| | 0.2 | 0.629 | 0.696 | 0.768 | 0.763 | 0.194 | 0.284 |
| 0.5 | 0.0 | 0.012 | 0.009 | 0.028 | 0.085 | 0.072 | 0.056 |
| | 0.1 | 0.110 | 0.120 | 0.243 | 0.345 | 0.104 | 0.126 |
| | 0.2 | 0.517 | 0.618 | 0.688 | 0.758 | 0.217 | 0.226 |
| 1.0 | 0.0 | 0.005 | 0.000 | 0.004 | 0.080 | 0.070 | 0.058 |
| | 0.1 | 0.032 | 0.035 | 0.088 | 0.370 | 0.094 | 0.125 |
| | 0.2 | 0.293 | 0.384 | 0.482 | 0.761 | 0.188 | 0.214 |

Table 5 Empirical sizes and powers of the test Ωµη and other seven joint tests for both individual and time effects. The nominal rate is 5% and (n, T) = (200, 5).

| ρ | ση | σµ | LMµη | HOµη | SHOµη | KWµη | SKWµη | GHMµη | Fµη | Ωµη |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.0 | 0.0 | 0.048 | 0.028 | 0.052 | 0.029 | 0.061 | 0.038 | 0.062 | 0.059 |
| | | 0.1 | 0.054 | 0.047 | 0.010 | 0.031 | 0.070 | 0.068 | 0.101 | 0.116 |
| | | 0.2 | 0.202 | 0.205 | 0.322 | 0.047 | 0.099 | 0.361 | 0.212 | 0.244 |
| | 0.1 | 0.0 | 0.321 | 0.360 | 0.462 | 0.477 | 0.586 | 0.428 | 0.117 | 0.121 |
| | | 0.1 | 0.336 | 0.435 | 0.533 | 0.494 | 0.583 | 0.484 | 0.141 | 0.133 |
| | | 0.2 | 0.491 | 0.633 | 0.749 | 0.526 | 0.596 | 0.662 | 0.236 | 0.298 |
| | 0.2 | 0.0 | 0.817 | 0.837 | 0.875 | 0.892 | 0.912 | 0.843 | 0.183 | 0.214 |
| | | 0.1 | 0.840 | 0.859 | 0.894 | 0.893 | 0.916 | 0.870 | 0.259 | 0.285 |
| | | 0.2 | 0.876 | 0.924 | 0.898 | 0.894 | 0.926 | 0.912 | 0.409 | 0.499 |
| 0.8 | 0.0 | 0.1 | 0.039 | 0.021 | 0.075 | 0.046 | 0.072 | 0.054 | 0.087 | 0.146 |
| | | 0.2 | 0.035 | 0.027 | 0.060 | 0.023 | 0.074 | 0.050 | 0.124 | 0.270 |
| | 0.1 | 0.1 | 0.384 | 0.355 | 0.412 | 0.462 | 0.587 | 0.430 | 0.119 | 0.177 |
| | | 0.2 | 0.343 | 0.386 | 0.398 | 0.459 | 0.398 | 0.439 | 0.139 | 0.296 |
| | 0.2 | 0.1 | 0.824 | 0.809 | 0.833 | 0.872 | 0.874 | 0.857 | 0.231 | 0.293 |
| | | 0.2 | 0.806 | 0.830 | 0.792 | 0.870 | 0.813 | 0.863 | 0.258 | 0.462 |

Table 6 Values of several statistics for individual effect.

| | LMµ | HOµ | SLMµ | CLMµ | Fµ | Ωµ |
|---|---|---|---|---|---|---|
| Data 1 | 193.36 | 13.90 | 14.78 | 686.30 | 428.72 | 1227.44 |
| Data 2 | 1158.37 | 34.03 | 35.79 | 461.04 | 507.35 | 1679.39 |
| Data 3 | 3468.07 | 58.89 | 61.82 | 399.60 | 615.16 | 2079.90 |

Table 7 Values of several statistics for time effect.

| | LMη | HOη | SLMη | CLMη | Fη | Ωη |
|---|---|---|---|---|---|---|
| Data 1 | 0.29 (0.58) | 0.54 (0.29) | 0.98 (0.16) | 15.01 (0.00) | 45.94 (0.00) | 11.76 (0.00) |
| Data 2 | 6.01 (0.014) | 2.45 (0.007) | 2.96 (0.001) | 33.70 (0.00) | 43.20 (0.00) | 4.40 (0.00) |
| Data 3 | 1.17 (0.27) | 1.08 (0.13) | 1.38 (0.08) | 3.40 (0.00) | 124.98 (0.00) | 11.03 (0.00) |

Note: The values in the parentheses (·) are the p-values of the corresponding test statistics.
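The p-values in parentheses can be related to the statistics with a short calculation. The sketch below assumes that LMη is referred to a chi-square distribution with one degree of freedom and HOη to the upper tail of the standard normal, as is usual for Breusch–Pagan-type and Honda-type tests; the reference distribution actually used for each column is not restated in this section, so this is an illustration rather than a reproduction of the authors' computation. The function names are hypothetical.

```python
import math

def p_upper_normal(z):
    """Upper-tail probability P(Z > z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def p_chi2_one_df(x):
    """Upper-tail probability P(X > x) for X ~ chi-square with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))

# Data 2 entries of Table 7 under the assumed reference distributions.
p_lm_data2 = p_chi2_one_df(6.01)   # LM statistic, assumed chi-square(1)
p_ho_data2 = p_upper_normal(2.45)  # HO statistic, assumed one-sided N(0, 1)
```

Under these assumptions the two values round to the p-values reported for Data 2 in the LMη and HOη columns.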

Table 8 Values of several statistics for both of the two effects.

| | LMµη | HOµη | SHOµη | KWµη | SKWµη | Fµη | Ωµη |
|---|---|---|---|---|---|---|---|
| Data 1 | 193.65 | 10.21 | 11.93 | 4.41 | 5.53 | 352.84 | 674.48 |
| Data 2 | 1164.38 | 25.80 | 28.37 | 15.85 | 17.75 | 408.25 | 603.09 |
| Data 3 | 3469.24 | 42.40 | 45.44 | 30.58 | 32.85 | 495.58 | 584.47 |

Acknowledgments

We thank two anonymous referees for valuable comments that led to the substantial improvement of this paper. This paper is supported in part by the National Natural Science Foundation of China (11671263, 11471216).


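Appendix. Technical details

Proof of Theorem 1. Since $\{X_i\}$ and $\{v_i\}$ are both i.i.d. sequences, under Assumptions (A1) and (A2) we have that

$$\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j} = \Sigma_3 + o_p(1) \tag{8}$$

and

$$\lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{m} E\big(\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j}\,\Delta v_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j}\big) = \Sigma_4. \tag{9}$$

Together with (8) and (9), it implies that

$$
\begin{aligned}
\sqrt{n}(\hat\beta-\beta)
&= \Big(\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j}\Big)^{-1}\Big(\frac{1}{\sqrt{n}}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j}\Big) \\
&= \Sigma_3^{-1}\,\frac{1}{\sqrt{n}}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j} + o_p(1) \xrightarrow{D} N(0,\ \Sigma_3^{-1}\Sigma_4\Sigma_3^{-1}).
\end{aligned}
\tag{10}
$$

Hence

$$\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j} = O_p(1),\qquad \frac{1}{\sqrt{n}}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j} = O_p(1)\qquad\text{and}\qquad \sqrt{n}(\hat\beta-\beta) = O_p(1).$$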

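Using these results, we show that

$$
\begin{aligned}
\sqrt{n}\,\hat\sigma_{0v}^2
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}\big\|Q_{T_{2j}}'(\Delta y_{2j}-\Delta X_{2j}\hat\beta)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}\big\|Q_{T_{2j}}'\Delta v_{2j}+Q_{T_{2j}}'\Delta X_{2j}(\beta-\hat\beta)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}\Delta v_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j}
 + \frac{1}{\sqrt{n}\,c_{0n}}\cdot\sqrt{n}(\beta-\hat\beta)'\cdot\Big(\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta X_{2j}\Big)\cdot\sqrt{n}(\beta-\hat\beta) \\
&\quad + \frac{2}{\sqrt{n}\,c_{0n}}\cdot\sqrt{n}(\beta-\hat\beta)'\cdot\Big(\frac{1}{\sqrt{n}}\sum_{j=1}^{m}\Delta X_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j}\Big) \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}\Delta v_{2j}' P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j} + o_p(1).
\end{aligned}
\tag{11}
$$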
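The step in (11) that replaces a squared norm $\|Q_T'x\|^2$ by the quadratic form $x'P^{\perp}_{\iota_T}x$ relies on $Q_TQ_T' = P^{\perp}_{\iota_T} = I_T - \iota_T\iota_T'/T$. The sketch below checks this identity numerically. It assumes that $Q_T$ may be taken as any $T\times(T-1)$ matrix with orthonormal columns orthogonal to $\iota_T$ (its definition is not restated in this appendix); the Helmert-type construction and the function names are illustrative choices only.

```python
import math

def helmert_q(T):
    """T x (T-1) matrix with orthonormal columns, each orthogonal to iota_T."""
    cols = []
    for k in range(1, T):
        scale = math.sqrt(k * (k + 1))
        # k equal entries, then one balancing entry, then zeros.
        cols.append([1.0 / scale] * k + [-k / scale] + [0.0] * (T - k - 1))
    # Store as rows: Q[t][k] is the t-th entry of column k.
    return [[cols[k][t] for k in range(T - 1)] for t in range(T)]

def norm_sq_qt_x(Q, x):
    """Squared Euclidean norm of Q'x."""
    T, r = len(Q), len(Q[0])
    return sum(sum(Q[t][k] * x[t] for t in range(T)) ** 2 for k in range(r))

def quad_form_p_perp(x):
    """x' P_perp x with P_perp = I - iota iota'/T (centred sum of squares)."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x)

x = [0.3, -1.2, 2.5, 0.7, -0.4]
lhs = norm_sq_qt_x(helmert_q(len(x)), x)
rhs = quad_form_p_perp(x)
```

The two quantities agree up to rounding, which is why the estimator built from $Q_{T_{2j}}'$ can be analysed through $P^{\perp}_{\iota_{T_{2j}}}$.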

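Under Assumptions (A1) and (A2), we obtain

$$\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j} = \Sigma_1 + o_p(1)\qquad\text{and}\qquad \frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}'\Delta v_{2j} = o_p(1).$$

And under Assumption (A3), we show that

$$\frac{1}{\sqrt{n}}\sum_{j=1}^{m}\|\Delta\mu_{2j}\iota_{T_{2j}}\|^2 = \sigma_1^2 + o_p(1),\qquad n^{-1/4}\sum_{j=1}^{m}\Delta\mu_{2j}\iota_{T_{2j}}'\Delta v_{2j} = O_p(1)$$

and

$$n^{-3/4}\sum_{j=1}^{m}\Delta\mu_{2j}\iota_{T_{2j}}'\Delta X_{2j} = O_p(1).$$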


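By using the method similar to (11), it follows from model (4) that

$$
\begin{aligned}
\sqrt{n}\,\hat\sigma_{1v}^2
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\big\|\Delta v_{2j}+\Delta\mu_{2j}\iota_{T_{2j}}+\Delta X_{2j}(\beta-\hat\beta)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\Delta v_{2j}'\Delta v_{2j} + \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\|\Delta\mu_{2j}\iota_{T_{2j}}\|^2 + o_p(1) \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\Delta v_{2j}'\Delta v_{2j} + \sigma_1^2 + o_p(1).
\end{aligned}
\tag{12}
$$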

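Thus, using (11) and (12), we obtain

$$\sqrt{n}(\hat\sigma_{1v}^2-\hat\sigma_{0v}^2) = \sigma_1^2 + \frac{1}{\sqrt{n}}\sum_{j=1}^{m}\xi_{2j} + o_p(1),$$

where $\xi_{2j} = \Delta v_{2j}'\big(\frac{1}{c_{1n}}I_{T_{2j}} - \frac{1}{c_{0n}}P^{\perp}_{\iota_{T_{2j}}}\big)\Delta v_{2j}$, and it holds that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{m}E(\xi_{2j}) = 0$$

and

$$
\begin{aligned}
E(\xi_{2j}^2) ={}& 2\Big(\frac{T_{2j}}{c_{1n}^2} - \frac{2(T_{2j}-1)}{c_{1n}c_{0n}} + \frac{(T_{2j}-1)^2}{T_{2j}c_{0n}^2}\Big)\gamma_{v4} \\
&+ 2\Big[\frac{T_{2j}(2T_{2j}+1)}{c_{1n}^2} - \frac{2(T_{2j}-1)(2T_{2j}+1)}{c_{1n}c_{0n}} + \frac{(T_{2j}-1)(2T_{2j}^2-T_{2j}+3)}{c_{0n}^2T_{2j}}\Big](\sigma_v^2)^2.
\end{aligned}
$$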
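The second-moment expression for $\xi_{2j}$ can be cross-checked against the general moment identity for a quadratic form $w'Aw$ in i.i.d. mean-zero entries, $E(w'Aw)^2 = (m_4 - 3s_2^2)\sum_t a_{tt}^2 + s_2^2[(\mathrm{tr}A)^2 + 2\,\mathrm{tr}(A^2)]$. The sketch assumes, as the surrounding derivation suggests, that the entries of $v$ are i.i.d. with variance $\sigma_v^2$ and fourth moment $\gamma_{v4} = E(v^4)$, and that $\Delta v_{2j} = v_{2j} - v_{2j-1}$, so that each entry of $\Delta v_{2j}$ has variance $2\sigma_v^2$ and fourth moment $2\gamma_{v4} + 6(\sigma_v^2)^2$. The function names and the numerical inputs are hypothetical.

```python
def second_moment_general(T, c1, c0, sigma2, gamma4):
    """E(xi^2) for xi = w'Aw via the i.i.d. quadratic-form moment identity."""
    # A = I/c1 - P_perp/c0 with P_perp = I - iota iota'/T.
    A = [[(1.0 / c1 if s == t else 0.0)
          - ((1.0 if s == t else 0.0) - 1.0 / T) / c0
          for t in range(T)] for s in range(T)]
    s2 = 2.0 * sigma2                      # variance of each entry of w
    m4 = 2.0 * gamma4 + 6.0 * sigma2 ** 2  # fourth moment of each entry of w
    diag_sq = sum(A[t][t] ** 2 for t in range(T))
    tr = sum(A[t][t] for t in range(T))
    tr_sq = sum(A[s][t] ** 2 for s in range(T) for t in range(T))  # tr(A^2), A symmetric
    return (m4 - 3.0 * s2 ** 2) * diag_sq + s2 ** 2 * (tr ** 2 + 2.0 * tr_sq)

def second_moment_closed(T, c1, c0, sigma2, gamma4):
    """Closed-form expression for E(xi_2j^2) as displayed in the text."""
    g = T / c1 ** 2 - 2 * (T - 1) / (c1 * c0) + (T - 1) ** 2 / (T * c0 ** 2)
    s = (T * (2 * T + 1) / c1 ** 2
         - 2 * (T - 1) * (2 * T + 1) / (c1 * c0)
         + (T - 1) * (2 * T ** 2 - T + 3) / (c0 ** 2 * T))
    return 2 * g * gamma4 + 2 * s * sigma2 ** 2

# Illustrative inputs only.
gaps = [abs(second_moment_general(T, 4.2, 3.3, 1.5, 9.0)
            - second_moment_closed(T, 4.2, 3.3, 1.5, 9.0)) for T in (2, 5, 10)]
```

The entries of `gaps` are zero up to rounding, so the displayed formula agrees with the general identity for these inputs.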

Note that $\{\xi_{2j}\}$ is an independent sequence, and it follows from Eqs. (11), (12) and the central limit theorem that

$$\Phi_n^{-1/2}\cdot\sqrt{n}(\hat\sigma_{1v}^2-\hat\sigma_{0v}^2) - \Phi_n^{-1/2}\sigma_1^2 \xrightarrow{D} N(0,1).\qquad\square$$

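Proof of Corollary 1. Notice that $\Delta y_{2j} = \Delta X_{2j}\beta + \Delta\mu_{2j}\iota_{T_{2j}} + \Delta v_{2j}$, and then

$$
\begin{aligned}
\tilde\beta_1
&= \Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta y_{2j}\Big) \\
&= \Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\sum_{j=1}^{m}\Delta X_{2j}'(\Delta X_{2j}\beta + \Delta\mu_{2j}\iota_{T_{2j}} + \Delta v_{2j}) \\
&= \beta + \Delta_{1n} + \Delta_{2n},
\end{aligned}
$$

where

$$\Delta_{1n} = \Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\sum_{j=1}^{m}\Delta X_{2j}'\Delta\mu_{2j}\iota_{T_{2j}}$$

and

$$\Delta_{2n} = \Big(\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\sum_{j=1}^{m}\Delta X_{2j}'\Delta v_{2j}.$$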
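The decomposition $\tilde\beta_1 = \beta + \Delta_{1n} + \Delta_{2n}$ is an exact algebraic identity and can be verified on simulated pairwise differences. The sketch below is a minimal illustration with a single scalar covariate; the pair lengths, the value of β and the Gaussian draws are hypothetical choices and not the simulation design of the paper.

```python
import random

rng = random.Random(0)
beta = 1.5
pairs = []  # pair j holds differenced series of common length T_2j
for T in (3, 4, 5, 4):
    dx = [rng.gauss(0, 1) for _ in range(T)]   # Delta X_2j (scalar covariate)
    dmu = rng.gauss(0, 0.5)                    # Delta mu_2j
    dv = [rng.gauss(0, 1) for _ in range(T)]   # Delta v_2j
    dy = [beta * x + dmu + e for x, e in zip(dx, dv)]
    pairs.append((dx, dmu, dv, dy))

# Pooled least squares on the differenced data.
sxx = sum(x * x for dx, _, _, _ in pairs for x in dx)
beta_tilde = sum(x * y for dx, _, _, dy in pairs for x, y in zip(dx, dy)) / sxx

# The two error components of the decomposition.
delta_1n = sum(x * dmu for dx, dmu, _, _ in pairs for x in dx) / sxx
delta_2n = sum(x * e for dx, _, dv, _ in pairs for x, e in zip(dx, dv)) / sxx
```

Here `delta_1n` carries the contribution of the differenced individual effects and `delta_2n` that of the idiosyncratic errors, matching the roles of $\Delta_{1n}$ and $\Delta_{2n}$ in the proof.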

Under Assumptions (A1)–(A3), we show that

$$n^{1/4}\Delta_{1n} = \Big(\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\Big(n^{-3/4}\sum_{j=1}^{m}\Delta X_{2j}'\Delta\mu_{2j}\iota_{T_{2j}}\Big) = \Sigma_1^{-1}\Sigma_5 + o_p(1)$$

and

$$n^{1/4}\Delta_{2n} = n^{-1/4}\Big(\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)^{-1}\Big(n^{-1/2}\sum_{j=1}^{m}\Delta X_{2j}'\Delta v_{2j}\Big) = o_p(1).$$

Thus, it holds that

$$n^{1/4}(\tilde\beta_1-\beta) = n^{1/4}\Delta_{1n} + n^{1/4}\Delta_{2n} = \Sigma_1^{-1}\Sigma_5 + o_p(1). \tag{13}$$

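Together with (13) and Assumptions (A1)–(A3), we have that

$$
\begin{aligned}
\sqrt{n}\,\tilde\sigma_{1v}^2
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\big\|\Delta v_{2j}+\Delta\mu_{2j}\iota_{T_{2j}}+\Delta X_{2j}(\beta-\tilde\beta_1)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\Delta v_{2j}'\Delta v_{2j} + \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\|\Delta\mu_{2j}\iota_{T_{2j}}\|^2 \\
&\quad + \frac{1}{c_{1n}}\cdot n^{1/4}(\tilde\beta_1-\beta)'\cdot\Big(\frac{1}{n}\sum_{j=1}^{m}\Delta X_{2j}'\Delta X_{2j}\Big)\cdot n^{1/4}(\tilde\beta_1-\beta) \\
&\quad - \frac{2}{c_{1n}}\cdot n^{1/4}(\tilde\beta_1-\beta)'\cdot n^{-3/4}\sum_{j=1}^{m}\Delta X_{2j}'\Delta\mu_{2j}\iota_{T_{2j}} + o_p(1) \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{j=1}^{m}\Delta v_{2j}'\Delta v_{2j} + \sigma_1^2 - \frac{1}{c_{1n}}\Sigma_5'\Sigma_1^{-1}\Sigma_5 + o_p(1).
\end{aligned}
$$

Therefore,

$$\Phi_n^{-1/2}\sqrt{n}(\tilde\sigma_{1v}^2-\hat\sigma_{0v}^2) - \Phi_n^{-1/2}\Big(\sigma_1^2 - \frac{1}{c_{1n}}\Sigma_5'\Sigma_1^{-1}\Sigma_5\Big) \xrightarrow{D} N(0,1).\qquad\square$$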

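Proof of Theorem 2. From Assumptions (A4) and (A5), we show that

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta'D_i'P^{\perp}_{\iota_{T_i}}D_i\eta = \zeta'\Upsilon_1\zeta + o_p(1),\qquad \frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_i'P^{\perp}_{\iota_{T_i}}D_i = O_p(1)$$

and

$$\frac{1}{n}\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}v_i = o_p(1),\qquad \frac{1}{n}\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}D_i = O_p(1).$$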

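Thus, by using these results and the method similar to (11), we show that

$$
\begin{aligned}
\sqrt{n}\,\hat\sigma_{2v}^2
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}\big\|Q_{T_i}'(y_i-X_i\hat\beta)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}\big\|Q_{T_i}'v_i+Q_{T_i}'D_i\eta+Q_{T_i}'X_i(\beta-\hat\beta)\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}v_i'P^{\perp}_{\iota_{T_i}}v_i + \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}\eta'D_i'P^{\perp}_{\iota_{T_i}}D_i\eta + o_p(1) \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}v_i'P^{\perp}_{\iota_{T_i}}v_i + \frac{1}{c_{0n}}\zeta'\Upsilon_1\zeta + o_p(1)
\end{aligned}
$$

and

$$
\begin{aligned}
\sqrt{n}\,\hat\sigma_{0v}^2
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}\Delta v_{2j}'P^{\perp}_{\iota_{T_{2j}}}\Delta v_{2j} + o_p(1) \\
&= \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}v_i'P^{\perp}_{\iota_{T_i}}v_i - \frac{2}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j} + o_p(1).
\end{aligned}
\tag{14}
$$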


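Furthermore, note that $\{v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j}\}$ is a sequence of independent variables with zero mean and variance $\mathrm{var}(v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j}) = (T_{2j}-1)(\sigma_v^2)^2$. Therefore, we obtain that

$$\sqrt{n}(\hat\sigma_{2v}^2-\hat\sigma_{0v}^2) = \frac{2}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j} + \frac{1}{c_{0n}}\zeta'\Upsilon_1\zeta + o_p(1),$$

and then

$$\sqrt{\frac{c_{0n}}{2}}\cdot\sqrt{n}\Big(\frac{\hat\sigma_{2v}^2}{\hat\sigma_{0v}^2}-1\Big) - \frac{1}{\sqrt{2c_{0n}}}\,\zeta'\Upsilon_1\zeta/\hat\sigma_{0v}^2 \xrightarrow{D} N(0,1),\quad\text{as } n\to\infty.\qquad\square$$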
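The variance $(T_{2j}-1)(\sigma_v^2)^2$ of the cross term follows from a trace calculation: for independent mean-zero vectors $u$ and $w$ with covariance $\sigma_v^2 I_T$, $E[(u'P^{\perp}_{\iota_T}w)^2] = (\sigma_v^2)^2\sum_{s,t}P_{st}^2 = (\sigma_v^2)^2\,\mathrm{tr}(P^{\perp}_{\iota_T}) = (\sigma_v^2)^2(T-1)$. The sketch below evaluates the sum of squared entries directly; the function names and the value of $\sigma_v^2$ are illustrative, and the calculation assumes uncorrelated entries within each vector.

```python
def p_perp(T):
    """P_perp = I_T - iota iota'/T as a list of rows."""
    return [[(1.0 if s == t else 0.0) - 1.0 / T for t in range(T)]
            for s in range(T)]

def cross_form_variance(T, sigma2):
    """var(u' P_perp w) = sigma2^2 * sum_{s,t} P_st^2 for independent u, w."""
    P = p_perp(T)
    return sigma2 ** 2 * sum(P[s][t] ** 2 for s in range(T) for t in range(T))

# Compare with the closed form (T - 1) * sigma2^2 for several lengths T.
sigma2 = 1.7
gaps = [abs(cross_form_variance(T, sigma2) - (T - 1) * sigma2 ** 2)
        for T in range(2, 9)]
```

Since $P^{\perp}_{\iota_T}$ is symmetric and idempotent, the sum of its squared entries equals its trace, which is the only property the check uses.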

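Proof of Corollary 2. Under Assumption (A4), by using the method similar to (10), we have that

$$\sqrt{n}(\tilde\beta_2-\beta) = \Big(\frac{1}{n}\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}X_i\Big)^{-1}\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i'P^{\perp}_{\iota_{T_i}}v_i\Big) \xrightarrow{D} N(0,\ \Sigma_2^{-1}\Sigma_7\Sigma_2^{-1}).$$

Thus, by the method similar to the proof of Theorem 2, under Assumptions (A1), (A2), (A4) and (A5), we obtain that

$$\sqrt{n}\,\tilde\sigma_{2v}^2 = \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}\big\|Q_{T_i}'v_i+Q_{T_i}'D_i\eta+Q_{T_i}'X_i(\beta-\tilde\beta_2)\big\|^2 = \frac{1}{\sqrt{n}\,c_{0n}}\sum_{i=1}^{n}v_i'P^{\perp}_{\iota_{T_i}}v_i + \frac{1}{c_{0n}}\zeta'\Upsilon_1\zeta + o_p(1),$$

and hence

$$\sqrt{\frac{c_{0n}}{2}}\cdot\sqrt{n}\Big(\frac{\tilde\sigma_{2v}^2}{\hat\sigma_{0v}^2}-1\Big) - \frac{1}{\sqrt{2c_{0n}}}\,\zeta'\Upsilon_1\zeta/\hat\sigma_{0v}^2 \xrightarrow{D} N(0,1).\qquad\square$$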

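Proof of Theorem 3. We first study the asymptotic property of $\hat\alpha$. Notice that $\iota_{T_i}'y_i = T_i\alpha + \iota_{T_i}'X_i\beta + T_i\mu_i + \iota_{T_i}'D_i\eta + \iota_{T_i}'v_i$, and then $\hat\alpha = \frac{1}{nc_{1n}}\sum_{i=1}^{n}\iota_{T_i}'(y_i-X_i\hat\beta)$. Under Assumption (A6), it holds that

$$
\begin{aligned}
\sqrt{n}(\hat\alpha-\alpha)
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}\big[\iota_{T_i}'X_i(\beta-\hat\beta) + T_i\mu_i + \iota_{T_i}'D_i\eta + \iota_{T_i}'v_i\big] \\
&= \frac{1}{n\,c_{1n}}\sum_{i=1}^{n}\iota_{T_i}'X_i\cdot\sqrt{n}(\beta-\hat\beta) + o_p(1) = O_p(1),
\end{aligned}
$$

since we show that

$$\frac{1}{n}\sum_{i=1}^{n}\iota_{T_i}'X_i = \Sigma_{10} + o_p(1),\qquad \frac{1}{\sqrt{n}}\sum_{i=1}^{n}D_i'\iota_{T_i} = O_p(1),\qquad \frac{1}{\sqrt{n}}\sum_{i=1}^{n}T_i\mu_i = O_p(1)$$

and

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\iota_{T_i}'v_i = O_p(1).$$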

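Hence, a tedious calculation yields

$$
\begin{aligned}
\sqrt{n}\,\hat\sigma_{3v}^2
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}\big\|y_i-\hat\alpha\iota_{T_i}-X_i\hat\beta\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}\big\|(\alpha-\hat\alpha)\iota_{T_i}+X_i(\beta-\hat\beta)+\mu_i\iota_{T_i}+D_i\eta+v_i\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}v_i'v_i + \frac{\sqrt{n}}{c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}T_i\mu_i^2 + \frac{1}{\sqrt{n}\,c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}T_i\cdot[\sqrt{n}(\hat\alpha-\alpha)]^2 \\
&\quad + \frac{\sqrt{n}}{c_{1n}}\cdot n^{-1/4}\zeta'\cdot\frac{1}{n}\sum_{i=1}^{n}D_i'D_i\cdot n^{-1/4}\zeta + o_p(1) \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}v_i'v_i + \sigma_1^2 + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta + o_p(1),
\end{aligned}
$$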

since we have that $\frac{\sqrt{n}}{c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}T_i\mu_i^2 = \sigma_1^2 + o_p(1)$, $\frac{1}{n}\sum_{i=1}^{n}D_i'D_i = \Upsilon_2$, and all the other items are $o_p(1)$. Furthermore, with (11),

$$
\begin{aligned}
\sqrt{n}(\hat\sigma_{3v}^2-\hat\sigma_{0v}^2)
={}& \frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_i'\Big(\frac{1}{c_{1n}}I_{T_i}-\frac{1}{c_{0n}}P^{\perp}_{\iota_{T_i}}\Big)v_i + \frac{2}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j} \\
&+ \sigma_1^2 + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta + o_p(1).
\end{aligned}
\tag{15}
$$

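Using the result of Wu, Qin, and Ding (2015), the first term on the right-hand side of (15) satisfies

$$\Psi_n^{*-1/2}\,\frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_i'\Big(\frac{1}{c_{1n}}I_{T_i}-\frac{1}{c_{0n}}P^{\perp}_{\iota_{T_i}}\Big)v_i \xrightarrow{D} N(0,1), \tag{16}$$

where $\Psi_n^* = a_n^*\hat\gamma_{v4} + b_n^*(\hat\sigma_{0v}^2)^2$. The second term on the right-hand side of (15) has the same limit as in the proof of Theorem 2, which is

$$\Gamma_n^{*-1/2}\,\frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}2v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j} \xrightarrow{D} N(0,1), \tag{17}$$

where $\Gamma_n^* = \frac{2}{c_{0n}}(\hat\sigma_{0v}^2)^2$. We further show that (16) and (17) are asymptotically independent, since $\{v_i\}$ are independent and identically distributed and then

$$
\begin{aligned}
&E\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_i'\Big(\frac{1}{c_{1n}}I_{T_i}-\frac{1}{c_{0n}}P^{\perp}_{\iota_{T_i}}\Big)v_i\cdot\frac{1}{\sqrt{n}\,c_{0n}}\sum_{j=1}^{m}2v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j}\Big) \\
&\qquad= \frac{2}{n\,c_{0n}}\sum_{i=1}^{n}\sum_{j=1}^{m}E\Big(v_i'\Big(\frac{1}{c_{1n}}I_{T_i}-\frac{1}{c_{0n}}P^{\perp}_{\iota_{T_i}}\Big)v_i\,v_{2j-1}'P^{\perp}_{\iota_{T_{2j}}}v_{2j}\Big) = 0.
\end{aligned}
$$

Then we have that

$$\Phi_n^{*-1/2}\sqrt{n}(\hat\sigma_{3v}^2-\hat\sigma_{0v}^2) - \Phi_n^{*-1/2}\Big(\sigma_1^2 + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta\Big) \xrightarrow{D} N(0,1).\qquad\square$$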

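Proof of Corollary 3. We first investigate the asymptotic property of $\tilde\beta_3$ and $\tilde\alpha_3$:

$$\tilde\beta_3-\beta = \Delta_{3n}^{-1}\Delta_{4n} + \Delta_{3n}^{-1}\Delta_{5n} + \Delta_{3n}^{-1}\Delta_{6n},$$

where

$$\Delta_{3n} = \sum_{i=1}^{n}X_i'X_i - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\sum_{i=1}^{n}\iota_{T_i}'X_i,$$

$$\Delta_{4n} = \sum_{i=1}^{n}X_i'v_i - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\sum_{i=1}^{n}\iota_{T_i}'v_i,$$

$$\Delta_{5n} = \sum_{i=1}^{n}X_i'\mu_i\iota_{T_i} - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\sum_{i=1}^{n}T_i\mu_i$$

and

$$\Delta_{6n} = \sum_{i=1}^{n}X_i'D_i\eta - \frac{1}{nc_{1n}}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\sum_{i=1}^{n}\iota_{T_i}'D_i\eta.$$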
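The structure of $\Delta_{3n}$ and of the companion terms is that of least squares with an intercept on the stacked unbalanced data, with the intercept partialled out. The sketch below checks this for a scalar covariate: the slope computed from the partialled-out sums, together with the intercept $\tilde\alpha = \frac{1}{nc_{1n}}\sum_i\iota_{T_i}'(y_i - X_i\tilde\beta)$, satisfies both least-squares normal equations. It assumes $nc_{1n} = \sum_i T_i$, which is what the formula for $\hat\alpha$ in the proof of Theorem 3 suggests; the data-generating values are hypothetical and, for simplicity, omit the individual and time effects.

```python
import random

rng = random.Random(1)
alpha, beta = 0.5, 2.0
units = []
for T in (3, 5, 4, 6):  # unbalanced panel: T_i differs across units
    x = [rng.gauss(0, 1) for _ in range(T)]
    y = [alpha + beta * xi + rng.gauss(0, 0.3) for xi in x]
    units.append((x, y))

N = sum(len(x) for x, _ in units)   # n * c_1n, assuming c_1n is the mean of T_i
sx = sum(sum(x) for x, _ in units)  # sum_i iota' X_i
sy = sum(sum(y) for _, y in units)  # sum_i iota' y_i
sxx = sum(xi * xi for x, _ in units for xi in x)
sxy = sum(xi * yi for x, y in units for xi, yi in zip(x, y))

delta_3n = sxx - sx * sx / N                 # scalar analogue of Delta_3n
beta_tilde = (sxy - sx * sy / N) / delta_3n  # slope after partialling out iota
alpha_tilde = (sy - sx * beta_tilde) / N     # intercept from averaged residuals

# Normal equations of least squares with an intercept.
resid_sum = sum(yi - alpha_tilde - beta_tilde * xi
                for x, y in units for xi, yi in zip(x, y))
resid_x = sum(xi * (yi - alpha_tilde - beta_tilde * xi)
              for x, y in units for xi, yi in zip(x, y))
```

Both `resid_sum` and `resid_x` vanish up to rounding, so the partialled-out expressions coincide with ordinary least squares on the stacked observations.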

Then, we have

$$n^{1/4}(\tilde\beta_3-\beta) = n^{-1/4}\Big(\frac{1}{n}\Delta_{3n}\Big)^{-1}\Big(\frac{1}{\sqrt{n}}\Delta_{4n}\Big) + \Big(\frac{1}{n}\Delta_{3n}\Big)^{-1}\big(n^{-3/4}\Delta_{5n}\big) + \Big(\frac{1}{n}\Delta_{3n}\Big)^{-1}\big(n^{-3/4}\Delta_{6n}\big).$$

Let $E(X_i'X_i) = \Sigma_0$, $\big|\Sigma_0 - \frac{1}{c_{1n}}\Sigma_{10}'\Sigma_{10}\big| > 0$ and

$$\frac{1}{n}\sum_{i=1}^{n}E\big[(X_i-\Sigma_{10}'\iota_{T_i})'v_iv_i'(X_i-\Sigma_{10}'\iota_{T_i})\big] = \Sigma_{11} + o_p(1).$$

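By Assumption (A6), we have that

$$\frac{1}{n}\Delta_{3n} = \frac{1}{n}\sum_{i=1}^{n}X_i'X_i - \frac{1}{c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\frac{1}{n}\sum_{i=1}^{n}\iota_{T_i}'X_i = \Sigma_0 - \frac{1}{c_{1n}}\Sigma_{10}'\Sigma_{10} + o_p(1),$$

$$\frac{1}{\sqrt{n}}\Delta_{4n} \xrightarrow{D} N(0,\Sigma_{11}),$$

$$n^{-3/4}\Delta_{5n} = n^{-3/4}\sum_{i=1}^{n}X_i'\mu_i\iota_{T_i} - n^{-1/4}\cdot\frac{1}{c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\frac{1}{\sqrt{n}}\sum_{i=1}^{n}T_i\mu_i = \Sigma_5 + o_p(1)$$

and

$$n^{-3/4}\Delta_{6n} = n^{1/4}\cdot\frac{1}{n}\sum_{i=1}^{n}X_i'D_i\cdot n^{-1/4}\zeta - n^{-1/4}\cdot\frac{1}{c_{1n}}\cdot\frac{1}{n}\sum_{i=1}^{n}X_i'\iota_{T_i}\cdot\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\iota_{T_i}'D_i\cdot n^{-1/4}\zeta = \Sigma_9\zeta + o_p(1).$$

Therefore,

$$n^{1/4}(\tilde\beta_3-\beta) = \Big(\Sigma_0 - \frac{1}{c_{1n}}\Sigma_{10}'\Sigma_{10}\Big)^{-1}(\Sigma_5+\Sigma_9\zeta) + o_p(1). \tag{18}$$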

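Similarly, it holds that

$$
\begin{aligned}
n^{1/4}(\tilde\alpha_3-\alpha)
&= n^{1/4}\Big[\frac{1}{nc_{1n}}\sum_{i=1}^{n}\iota_{T_i}'(y_i-X_i\tilde\beta_3) - \alpha\Big] \\
&= n^{1/4}\,\frac{1}{c_{1n}}\Big[\frac{1}{n}\sum_{i=1}^{n}\iota_{T_i}'X_i(\beta-\tilde\beta_3) + \frac{1}{n}\sum_{i=1}^{n}T_i\mu_i + \frac{1}{n}\sum_{i=1}^{n}\iota_{T_i}'D_i\eta + \frac{1}{n}\sum_{i=1}^{n}\iota_{T_i}'v_i\Big] \\
&= -\frac{1}{c_{1n}}\Sigma_{10}\Big(\Sigma_0 - \frac{1}{c_{1n}}\Sigma_{10}'\Sigma_{10}\Big)^{-1}(\Sigma_5+\Sigma_9\zeta) + o_p(1).
\end{aligned}
\tag{19}
$$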

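Using (18), (19) and Assumptions (A1)–(A6), a tedious but straightforward calculation shows that

$$
\begin{aligned}
\sqrt{n}\,\tilde\sigma_{3v}^2
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}\big\|y_i-\tilde\alpha\iota_{T_i}-X_i\tilde\beta\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}\big\|(\alpha-\tilde\alpha)\iota_{T_i}+X_i(\beta-\tilde\beta)+\mu_i\iota_{T_i}+D_i\eta+v_i\big\|^2 \\
&= \frac{1}{\sqrt{n}\,c_{1n}}\sum_{i=1}^{n}v_i'v_i + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta + \sigma_1^2 - \kappa + o_p(1),
\end{aligned}
$$

where $\kappa = \frac{1}{c_{1n}}(\Sigma_5+\Sigma_9\zeta)'\big(\Sigma_0-\frac{1}{c_{1n}}\Sigma_{10}'\Sigma_{10}\big)^{-1}(\Sigma_5+\Sigma_9\zeta)$. Thus,

$$\Omega_{\mu\eta}^* - \Phi_n^{*-1/2}\Big(\sigma_1^2 - \kappa + \frac{1}{c_{1n}}\zeta'\Upsilon_2\zeta\Big) \xrightarrow{D} N(0,1),\quad\text{as } n\to\infty,$$

where $\Omega_{\mu\eta}^* = \Phi_n^{*-1/2}\sqrt{n}(\tilde\sigma_{3v}^2-\hat\sigma_{0v}^2)$ and $\Phi_n^* = a_n^*\hat\gamma_{v4} + \big(b_n^* + \frac{2}{c_{1n}}\big)(\hat\sigma_{0v}^2)^2$. $\square$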

References

Baltagi, B., & Pinnoi, N. (1995). Public capital stock and state productivity growth: further evidence from an error components model. Empirical Economics, 20, 351–359.
Baltagi, B. H. (1985). Pooling cross-sections with unequal time series lengths. Economics Letters, 18(2–3), 133–136.
Baltagi, B. H. (2008). Econometric analysis of panel data. John Wiley and Sons, Ltd.
Baltagi, B. H., Chang, Y., & Li, Q. (1992). Monte Carlo results on several new and existing tests for the error component model. Journal of Econometrics, 54, 95–120.
Baltagi, B. H., Chang, Y., & Li, Q. (1998). Testing for random individual and time effects using unbalanced panel data. Advances in Econometrics, 13, 1–20.
Baltagi, B. H., & Li, Q. (1990). A Lagrange multiplier test for the error components model with incomplete panels. Econometric Reviews, 9(1), 103–107.
Baltagi, B. H., Song, S. H., & Jung, B. C. (2002). A comparative study of alternative estimators for the unbalanced two-way error component regression model. The Econometrics Journal, 5, 480–493.
Bera, A., Sosa Escudero, W., & Yoon, M. (2001). Tests for the error-component model in the presence of local misspecification. Journal of Econometrics, 101, 1–23.
Breusch, T. S., & Pagan, A. R. (1980). The Lagrange multiplier test and its applications to model specification in econometrics. Review of Economic Studies, 47, 239–253.
Castagnetti, C., Rossi, E., & Trapani, L. (2015a). Inference on factor structures in heterogeneous panels. Journal of Econometrics, 184, 145–157.
Castagnetti, C., Rossi, E., & Trapani, L. (2015b). Testing for no factor structures: On the use of Hausman-type statistics. Economics Letters, 130, 66–68.
Gourieroux, C., Holly, A., & Monfort, A. (1982). Likelihood ratio test, Wald test, and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica, 50(1), 63–80.
Hausman, J. A. (1978). Specification tests in econometrics. Econometrica, 46, 1251–1271.
Honda, Y. (1985). Testing the error components model with non-normal disturbances. Review of Economic Studies, 52, 681–690.
Honda, Y. (1991). A standardized test for the error components model with the two-way layout. Economics Letters, 37, 125–128.
Hsiao, C. (2003). Analysis of panel data. In C. Hsiao (Ed.), Cambridge: Cambridge University Press.
King, M., & Wu, P. (1997). Locally optimal one-sided tests for multiparameter hypotheses. Econometric Reviews, 16(2), 131–156.
Montes-Rojas, G. V. (2010). Testing for random effects and serial correlation in spatial autoregressive models. Journal of Statistical Planning and Inference, 140(4), 1013–1020.
Moulton, B. (1987). Diagnostics for group effects in regression analysis. Journal of Business & Economic Statistics, 5(2), 275–282.
Moulton, B., & Randolph, W. (1989). Alternative tests of the error components model. Econometrica, 57, 685–693.
Munnell, A. (1990). Why has productivity growth declined? Productivity and public investment. New England Economic Review, 3–22.
Oya, K. (2004). Test of random effects with incomplete panel data. Mathematics and Computers in Simulation, 64, 409–419.
Shao, J., Xiao, Z. G., & Xu, R. F. (2011). Estimation with unbalanced panel data having covariate measurement error. Journal of Statistical Planning and Inference, 141, 800–808.
Song, S. H., & Jung, B. C. (2001). Tests for panel regression model with unbalanced data. Journal of the Korean Statistical Society, 30, 511–527.
Sosa-Escudero, W., & Bera, A. K. (2008). Tests for unbalanced error-components models under local misspecification. Stata Journal, 8(1), 68–78.
Wansbeek, T., & Kapteyn, A. (1989). Estimation of the error-components model with incomplete panels. Journal of Econometrics, 41(3), 341–361.
Wu, J. (2016). Robust random effects tests for two-way error component models with panel data. Economic Modelling, 59, 1–8.
Wu, J., & Li, G. (2014). Moment-based tests for individual and time effects in panel data models. Journal of Econometrics, 178, 569–581.
Wu, J., Qin, J., & Ding, Q. (2015). A moment-based test for individual effects in the error component model with incomplete panels. Statistics & Probability Letters, 104, 153–162.
Wu, J., & Zhu, L. X. (2011). Testing for serial correlation and random effects in a two-way error component regression model. Economic Modelling, 28, 2377–2388.
Yue, L., Li, G., & Zhang, J. (2017). Statistical inference for the unbalanced two-way error component regression model with errors-in-variables. Journal of the Korean Statistical Society, 46(4), 593–607.