The impact of a Hausman pretest, applied to panel data, on the coverage probability of confidence intervals


Economics Letters 131 (2015) 12–15


Economics Letters journal homepage: www.elsevier.com/locate/ecolet

Paul Kabaila ∗, Rheanna Mainzer, Davide Farchione

Department of Mathematics and Statistics, La Trobe University, Australia

Highlights

• A Hausman pretest is commonly used in the analysis of panel data.
• We assess the finite sample impact of this pretest on a confidence interval.
• We present 3 new theorems on the coverage probability of this confidence interval.
• Our numerical results lead us to reject this confidence interval.

Article info

Article history: Received 24 February 2015; received in revised form 19 March 2015; accepted 21 March 2015; available online 27 March 2015.

JEL classification: C1; C12

Abstract

In the analysis of panel data that includes a time-varying covariate, a Hausman pretest is commonly used to decide whether subsequent inference is made using the random effects model or the fixed effects model. We consider the effect of this pretest on the coverage probability of a confidence interval for the slope parameter. We prove three new finite sample theorems that make it easy to assess, for a wide variety of circumstances, the effect of the Hausman pretest on the minimum coverage probability of this confidence interval.

© 2015 Elsevier B.V. All rights reserved.

Keywords: Coverage probability; Fixed effects model; Hausman specification test; Panel data; Random effects model

1. Introduction

In the analysis of panel data that includes a time-varying covariate, a preliminary Hausman (1978) test is commonly used to decide whether subsequent inference is made using the random effects model or the fixed effects model. If the Hausman pretest rejects the null hypothesis of no correlation between the random effect and the time-varying covariate, then the fixed effects model is chosen for subsequent inference; otherwise the random effects model is chosen. This preliminary model selection procedure has been widely used in econometrics (see e.g. Wooldridge, 2002 and Baltagi, 2005) and has been implemented in popular computer programs including SAS, Stata, EViews and R. Guggenberger (2010) gives examples of the practical application of this procedure.



Corresponding author. Tel.: +61 3 9479 2594; fax: +61 3 9479 2466. E-mail address: [email protected] (P. Kabaila).

http://dx.doi.org/10.1016/j.econlet.2015.03.031 0165-1765/© 2015 Elsevier B.V. All rights reserved.

Thus, what is widely used in the analysis of panel data that includes a time-varying covariate is the following two-stage procedure. In the first stage, the Hausman pretest is used to decide whether subsequent inference is made using the random effects model or the fixed effects model. In the second stage, the inference of interest is carried out as if the model chosen in the first stage had been given to us a priori as the true model. Guggenberger (2010) considers this two-stage procedure when the inference of interest is a hypothesis test about the slope parameter. We consider the case that the inference of interest is a confidence interval for the slope parameter. We prove three new theorems on the finite sample properties of the coverage probability function of this confidence interval. By the duality between hypothesis tests and confidence intervals, these new theorems imply corresponding new results when the inference of interest is a hypothesis test.

In Section 2, we consider the practical situation that the random error and random effect variances are estimated from the data. The coverage probability of the confidence interval resulting from the two-stage procedure is determined by 4 known quantities and 5 unknown parameters. Theorem 1 states that, apart from the known quantities, this coverage probability is actually determined by only 3 unknown parameters. Theorem 2 states that this coverage probability is an even function of the non-exogeneity parameter, so that the time required to compute the coverage is halved. In Section 3, we consider the coverage probability of the confidence interval resulting from the two-stage procedure when the random error and random effect variances are assumed to be known. Theorem 3 states that this coverage probability, conditional on the time-varying covariates, can be found exactly by the evaluation of the bivariate normal cumulative distribution function. This theorem is used (via a control variate) to reduce the variance of the simulation based estimators of the coverage probability of the confidence interval resulting from the two-stage procedure (when random error and random effect variances are estimated). Outline proofs of Theorems 1–3 are given in the Appendix.

As illustrated in Fig. 1, the Hausman pretest, with the usual small nominal level of significance, can lead to this confidence interval having minimum coverage probability far below nominal. If the nominal level of significance is increased to 50% then the minimum coverage probability is much closer to the nominal coverage. However, the expected length of this confidence interval exceeds the expected length of the confidence interval based on the fixed effects model with the same minimum coverage. This confirms the rejection of the two-stage procedure by Guggenberger (2010). The results presented in this paper were computed using programs written in the language R.

Fig. 1. Graphs of the coverage probability function, minimized over τ, of the confidence interval resulting from the two-stage procedure. This minimum coverage is considered as a function of ψ, for both compound symmetry (CS) and first order autoregression (AR) structures of G, where ρ = 0.4. The unbiased estimators of the random error and random effect variances are used. We consider $\tilde\alpha \in \{0.05, 0.50\}$, N = 100, T = 3 and 1 − α = 0.95.

2. The model and the practical two-stage procedure (random error and random effect variances are estimated)

Let $y_{it}$ and $x_{it}$ denote the response variable and the time-varying covariate, respectively, for individual i (i = 1, ..., N) at time t (t = 1, ..., T). Suppose that

$$y_{it} = a + \beta x_{it} + \mu_i + \varepsilon_{it}, \tag{1}$$

where the $\varepsilon_{it}$'s and the $(\mu_i, x_{i1}, \dots, x_{iT})$'s are independent, the $\varepsilon_{it}$'s are i.i.d. $N(0, \sigma_\varepsilon^2)$ and the $\mu_i$'s are i.i.d. $N(0, \sigma_\mu^2)$. We call β the slope parameter, $\sigma_\varepsilon^2$ the error variance and $\sigma_\mu^2$ the random effect variance. The $\varepsilon_{it}$'s and the $\mu_i$'s are unobserved. Suppose that the parameter of interest is β and that the inference of interest is a confidence interval for β. Let $x = (x_{11}, \dots, x_{1T}, x_{21}, \dots, x_{2T}, \dots, x_{N1}, \dots, x_{NT})$. Also suppose that the $(\mu_i, x_{i1}, \dots, x_{iT})$'s are i.i.d. multivariate normally distributed with zero mean and covariance matrix

$$\begin{pmatrix} \sigma_\mu^2 & \tilde\tau\,\sigma_\mu\sigma_x\,e' \\ \tilde\tau\,\sigma_\mu\sigma_x\,e & \sigma_x^2\,G \end{pmatrix}, \tag{2}$$

where e is a T-vector of 1's, G is a T × T matrix with 1's on the diagonal and $\tilde\tau$ is a parameter that measures the dependence between $\mu_i$ and $(x_{i1}, \dots, x_{iT})$. We consider two models for G: (a) the off-diagonal elements of G are all ρ (compound symmetry) and (b) the (i, j)'th element of G is $\rho^{|i-j|}$ (first order autoregression). We define the "non-exogeneity parameter" τ as follows. For the case of compound symmetry, $\tau = \tilde\tau\,\big(T/(1 + (T-1)\rho)\big)^{1/2}$ and, for first order autoregression, $\tau = \tilde\tau\,\big((T(1-\rho) + 2\rho)/(1+\rho)\big)^{1/2}$. In both cases τ is a correlation, so τ ∈ (−1, 1). If τ = 0 then the $x_{it}$'s are exogenous variables.

Assume, for the moment, that $\sigma_\varepsilon$ and $\sigma_\mu$ are known. When τ = 0, a confidence interval for β may be found as follows. Let $\psi = \sigma_\mu/\sigma_\varepsilon$. Condition on x and use the GLS estimator $\hat\beta(\psi)$ of β. Let $z_c = \Phi^{-1}(c)$, where Φ denotes the N(0, 1) cdf. Define the following confidence interval for β:

$$I(\psi) = \Big[\hat\beta(\psi) - z_{1-\alpha/2}\big(\mathrm{Var}_0(\hat\beta(\psi) \mid x)\big)^{1/2},\ \hat\beta(\psi) + z_{1-\alpha/2}\big(\mathrm{Var}_0(\hat\beta(\psi) \mid x)\big)^{1/2}\Big],$$

where $\mathrm{Var}_0(\hat\beta(\psi) \mid x)$ denotes the variance of $\hat\beta(\psi)$, conditional on x, when τ = 0. The confidence interval I(ψ) has coverage probability 1 − α when τ = 0.

Averaging (1) over t = 1, ..., T for each i = 1, ..., N, we obtain

$$\bar y_i = a + \beta \bar x_i + \mu_i + \bar\varepsilon_i, \tag{3}$$

where $\bar y_i = T^{-1}\sum_{t=1}^{T} y_{it}$, $\bar x_i = T^{-1}\sum_{t=1}^{T} x_{it}$ and $\bar\varepsilon_i = T^{-1}\sum_{t=1}^{T} \varepsilon_{it}$.
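As a rough illustration (not the authors' R code), the two definitions of the non-exogeneity parameter τ and the unit means that enter (3) can be sketched in Python. The function names and the toy panel below are hypothetical and chosen only for this example.

```python
import math


def tau_compound_symmetry(tau_tilde, T, rho):
    """tau for compound-symmetry G: tau_tilde * (T / (1 + (T - 1) rho))^(1/2)."""
    return tau_tilde * math.sqrt(T / (1.0 + (T - 1) * rho))


def tau_first_order_ar(tau_tilde, T, rho):
    """tau for first order autoregressive G:
    tau_tilde * ((T (1 - rho) + 2 rho) / (1 + rho))^(1/2)."""
    return tau_tilde * math.sqrt((T * (1.0 - rho) + 2.0 * rho) / (1.0 + rho))


def unit_means(panel):
    """Time average for each individual; panel[i][t] is unit i at time t."""
    return [sum(row) / len(row) for row in panel]


# Hypothetical toy panel with N = 2 individuals and T = 3 time points.
x = [[1.0, 2.0, 3.0], [0.0, 4.0, 2.0]]
x_bar = unit_means(x)

# The same dependence parameter gives different tau under the two structures.
tau_cs = tau_compound_symmetry(0.3, T=3, rho=0.4)
tau_ar = tau_first_order_ar(0.3, T=3, rho=0.4)
```

With ρ = 0 both functions reduce to $\tilde\tau\sqrt{T}$, which agrees with the choice $\tilde\tau = \tau/\sqrt{T}$ and G = I used in the proof of Theorem 2.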

Model (3) is called the between effects model. When τ = 0, an alternative estimator of β is $\hat\beta_B$, the OLS estimator based on the model (3), when we condition on x. This estimator does not require knowledge of ψ. Subtracting (3) from (1), we obtain

$$y_{it} - \bar y_i = \beta(x_{it} - \bar x_i) + (\varepsilon_{it} - \bar\varepsilon_i). \tag{4}$$

This model is called the fixed effects model. We estimate β by $\hat\beta_W$, the OLS estimator based on this model. Define the following confidence interval for β:

$$J(\sigma_\varepsilon) = \Big[\hat\beta_W - z_{1-\alpha/2}\big(\mathrm{Var}(\hat\beta_W \mid x)\big)^{1/2},\ \hat\beta_W + z_{1-\alpha/2}\big(\mathrm{Var}(\hat\beta_W \mid x)\big)^{1/2}\Big],$$

where $\mathrm{Var}(\hat\beta_W \mid x)$ denotes the variance of $\hat\beta_W$, conditional on x. Irrespective of the value of τ, the confidence interval $J(\sigma_\varepsilon)$ has coverage probability 1 − α.

In practice, we do not know whether or not τ = 0. The usual procedure is to use a Hausman pretest to test $H_0: \tau = 0$ against $H_a: \tau \neq 0$. We consider this pretest, based on the test statistic

$$H(\sigma_\varepsilon, \sigma_\mu) = \frac{(\hat\beta_W - \hat\beta_B)^2}{\mathrm{Var}(\hat\beta_W \mid x) + \mathrm{Var}_0(\hat\beta_B \mid x)}, \tag{5}$$

where $\mathrm{Var}_0(\hat\beta_B \mid x)$ denotes the variance of $\hat\beta_B$ conditional on x and assuming that τ = 0. This test statistic has a $\chi^2_1$ distribution under $H_0$. Suppose that we accept $H_0$ if $H(\sigma_\varepsilon, \sigma_\mu) \le z^2_{1-\tilde\alpha/2}$; otherwise we reject $H_0$. Note that $\tilde\alpha$ is the level of significance of this test.

We now describe the two-stage procedure. If $H_0$ is accepted then we use the confidence interval I(ψ); otherwise we use the confidence interval $J(\sigma_\varepsilon)$. Let $K(\sigma_\varepsilon, \sigma_\mu)$ denote the confidence interval, with nominal coverage 1 − α, that results from this two-stage procedure. Now let $P(\beta \in K(\sigma_\varepsilon, \sigma_\mu) \mid x)$ denote the coverage probability of $K(\sigma_\varepsilon, \sigma_\mu)$, conditional on x. Observe that $P(\beta \in K(\sigma_\varepsilon, \sigma_\mu) \mid x)$ is equal to

$$\begin{aligned}
&P\big(\beta \in I(\sigma_\varepsilon, \sigma_\mu),\ H(\sigma_\varepsilon, \sigma_\mu) \le z^2_{1-\tilde\alpha/2} \mid x\big) + P\big(\beta \in J(\sigma_\varepsilon),\ H(\sigma_\varepsilon, \sigma_\mu) > z^2_{1-\tilde\alpha/2} \mid x\big) \\
&\quad = P\big(|g_I| \le z_{1-\alpha/2},\ |h| \le z_{1-\tilde\alpha/2} \mid x\big) + P\big(|g_J| \le z_{1-\alpha/2},\ |h| > z_{1-\tilde\alpha/2} \mid x\big),
\end{aligned} \tag{6}$$

where $g_I = (\hat\beta(\psi) - \beta)/\big(\mathrm{Var}_0(\hat\beta(\psi) \mid x)\big)^{1/2}$, $g_J = (\hat\beta_W - \beta)/\big(\mathrm{Var}(\hat\beta_W \mid x)\big)^{1/2}$ and $h = (\hat\beta_W - \hat\beta_B)/\big(\mathrm{Var}(\hat\beta_W \mid x) + \mathrm{Var}_0(\hat\beta_B \mid x)\big)^{1/2}$. The right-hand side of (6) is determined by the conditional bivariate normal distributions of $(g_I, h)$ and $(g_J, h)$, described in Theorem 3.

Of course, in practice, $\sigma_\varepsilon$ and $\sigma_\mu$ are not known and need to be estimated. So, in practice, the two-stage procedure results in the confidence interval $K(\hat\sigma_\varepsilon, \hat\sigma_\mu)$, where $\hat\sigma_\varepsilon$ and $\hat\sigma_\mu$ denote estimators of $\sigma_\varepsilon$ and $\sigma_\mu$, respectively. The coverage probability of the confidence interval constructed from this two-stage procedure is $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$. The following theorems state properties of this coverage probability.

Theorem 1. For $(\hat\sigma_\varepsilon, \hat\sigma_\mu)$ any of the pairs of estimators listed in the Appendix, $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$ is determined by N, T, $\tilde\alpha$, 1 − α, ψ, ρ and τ.

Theorem 2. Suppose that N, T, $\tilde\alpha$, 1 − α, ψ and ρ are fixed. When $\hat\sigma_\varepsilon$ and $\hat\sigma_\mu$ are any of the pairs of estimators listed in the Appendix, $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$ is an even function of τ ∈ (−1, 1).

The minimum over τ of $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$ depends on only two unknown parameters, ψ and ρ. In practice, ψ is not known. However, one is likely to have some background knowledge about ρ. Therefore we fix ρ and plot the graph of $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$, minimized over τ, as a function of ψ (see Fig. 1). For nominal significance level $\tilde\alpha = 0.05$ of the Hausman pretest, this minimized coverage probability is far below the nominal coverage for ψ ≈ 0.2. Also, for $\tilde\alpha = 0.5$, we see an improvement in the minimum coverage probability. However, for $\tilde\alpha = 0.5$ and compound symmetry, the expected length of $J(\hat\sigma_\varepsilon)$, with the same coverage as the coverage of $K(\hat\sigma_\varepsilon, \hat\sigma_\mu)$ minimized over (ρ, τ, ψ) ∈ [0, 0.8] × (−1, 1) × (0, ∞), is less than the expected length of $K(\hat\sigma_\varepsilon, \hat\sigma_\mu)$ for all (ρ, τ, ψ) ∈ [0, 0.8] × (−1, 1) × (0, ∞).

3. The two-stage procedure when random error and random effect variances are assumed known

In this section we suppose that $\sigma_\varepsilon$ and $\sigma_\mu$ are known and that G has a compound symmetry structure. We show that $P(\beta \in K(\sigma_\varepsilon, \sigma_\mu) \mid x)$ can be computed exactly using the bivariate normal distribution.
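Before turning to the exact result, the construction of the two-stage interval for known variances can be sketched in Python. This is a minimal sketch, not the authors' R implementation: the function name, argument names and toy data are hypothetical. It assumes a balanced panel and uses the standard representation of the GLS slope estimator as an inverse-variance weighted average of the within and between estimators, which the text above does not spell out.

```python
import math
from statistics import NormalDist


def two_stage_interval(y, x, sigma_eps, sigma_mu, alpha=0.05, alpha_tilde=0.05):
    """Two-stage interval for known variances (illustrative sketch).

    y[i][t], x[i][t]: balanced panel with N units and T periods.
    Returns (lower, upper, model) with model "random" or "fixed".
    """
    N, T = len(x), len(x[0])
    xbar = [sum(row) / T for row in x]
    ybar = [sum(row) / T for row in y]
    gx, gy = sum(xbar) / N, sum(ybar) / N

    ssw = sum((x[i][t] - xbar[i]) ** 2 for i in range(N) for t in range(T))
    ssb = sum((xb - gx) ** 2 for xb in xbar)

    # Within (fixed effects) and between OLS slope estimators.
    b_w = sum((x[i][t] - xbar[i]) * (y[i][t] - ybar[i])
              for i in range(N) for t in range(T)) / ssw
    b_b = sum((xbar[i] - gx) * (ybar[i] - gy) for i in range(N)) / ssb

    psi = sigma_mu / sigma_eps
    var_w = sigma_eps ** 2 / ssw                          # Var(b_w | x)
    var_b0 = sigma_eps ** 2 * (psi ** 2 + 1.0 / T) / ssb  # Var_0(b_b | x)

    # Assumption: GLS slope as inverse-variance weighted average (tau = 0).
    weight_w = (1.0 / var_w) / (1.0 / var_w + 1.0 / var_b0)
    b_gls = weight_w * b_w + (1.0 - weight_w) * b_b
    var_gls0 = 1.0 / (1.0 / var_w + 1.0 / var_b0)

    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_pre = NormalDist().inv_cdf(1.0 - alpha_tilde / 2.0)

    # Hausman statistic (5); H0 is accepted when it does not exceed z_pre^2.
    hausman = (b_w - b_b) ** 2 / (var_w + var_b0)
    if hausman <= z_pre ** 2:
        half = z * math.sqrt(var_gls0)
        return b_gls - half, b_gls + half, "random"
    half = z * math.sqrt(var_w)
    return b_w - half, b_w + half, "fixed"


# Hypothetical toy data: slope 2, unit effects strongly related to x.
x_demo = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
effects = [-10.0, 0.0, 10.0]
y_demo = [[2.0 * v + effects[i] for v in row] for i, row in enumerate(x_demo)]
lower, upper, model = two_stage_interval(y_demo, x_demo, sigma_eps=1.0, sigma_mu=1.0)
```

In the toy data the between estimator is pulled far from the within estimator by the unit effects, so the pretest rejects and the interval is centred on the within estimator.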

 

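Theorem 3. Let $\bar x = (NT)^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}$, $SSB = \sum_{i=1}^{N}(\bar x_i - \bar x)^2$ and $SSW = \sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it} - \bar x_i)^2$. Let $p(x) = SSB/\mathrm{Var}(\bar x_i)$, where $\mathrm{Var}(\bar x_i) = \sigma_x^2(1 + (T-1)\rho)/T$. Also let $r(x) = SSB/SSW$ and $q(\psi, T) = \psi^2 + (1/T)$. Conditional on x, $(g_I, h)$ and $(g_J, h)$ have bivariate normal distributions, where $E(g_J \mid x) = 0$, $\mathrm{Var}(g_J \mid x) = 1$,

$$\begin{aligned}
E(g_I \mid x) &= \frac{\tau\,\psi\,p^{1/2}(x)}{\big(q(\psi, T) + q^2(\psi, T)/r(x)\big)^{1/2}}, \\
\mathrm{Var}(g_I \mid x) &= 1 - \frac{\tau^2\psi^2}{q(\psi, T) + q^2(\psi, T)/r(x)}, \\
E(h \mid x) &= \frac{-\tau\,\psi\,p^{1/2}(x)}{\big(r(x) + q(\psi, T)\big)^{1/2}}, \\
\mathrm{Var}(h \mid x) &= 1 - \frac{\tau^2\psi^2}{r(x) + q(\psi, T)}, \\
\mathrm{Cov}(g_I, h \mid x) &= \frac{\tau^2\psi^2}{\big(q(\psi, T)\,r(x) + q^2(\psi, T)\big)^{1/2}\big(1 + q(\psi, T)/r(x)\big)^{1/2}}, \\
\mathrm{Cov}(g_J, h \mid x) &= \frac{1}{\big(1 + q(\psi, T)/r(x)\big)^{1/2}}.
\end{aligned}$$

Similarly to Theorems 1 and 2, $P(\beta \in K(\sigma_\varepsilon, \sigma_\mu) \mid x)$ is determined by N, T, x, $\tilde\alpha$, 1 − α, ψ, ρ and τ, and $P(\beta \in K(\sigma_\varepsilon, \sigma_\mu) \mid x)$ is an even function of τ ∈ (−1, 1).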
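The moments in Theorem 3 can be turned into a numerical approximation of the conditional coverage in (6). The sketch below is only illustrative: the function names and the example values of p(x) and r(x) are hypothetical, and plain Monte Carlo sampling from the two bivariate normal distributions is swapped in for the exact bivariate normal cdf evaluation that the theorem makes possible, because the Python standard library has no bivariate normal cdf.

```python
import math
import random
from statistics import NormalDist


def theorem3_moments(tau, psi, T, p, r):
    """Conditional moments from Theorem 3; p = p(x), r = r(x)."""
    q = psi ** 2 + 1.0 / T
    d_i = q + q ** 2 / r   # denominator term for g_I
    d_h = r + q            # denominator term for h
    return {
        "E_gI": tau * psi * math.sqrt(p) / math.sqrt(d_i),
        "Var_gI": 1.0 - tau ** 2 * psi ** 2 / d_i,
        "E_h": -tau * psi * math.sqrt(p) / math.sqrt(d_h),
        "Var_h": 1.0 - tau ** 2 * psi ** 2 / d_h,
        "Cov_gI_h": tau ** 2 * psi ** 2
        / (math.sqrt(q * r + q ** 2) * math.sqrt(1.0 + q / r)),
        "Cov_gJ_h": 1.0 / math.sqrt(1.0 + q / r),
    }


def conditional_coverage_mc(m, alpha=0.05, alpha_tilde=0.05, draws=200_000, seed=1):
    """Monte Carlo approximation of the right-hand side of (6)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_pre = NormalDist().inv_cdf(1.0 - alpha_tilde / 2.0)

    def draw(mean_g, var_g, cov):
        # Sample h first, then g from its conditional normal given h.
        h = m["E_h"] + math.sqrt(m["Var_h"]) * rng.gauss(0.0, 1.0)
        slope = cov / m["Var_h"]
        resid_var = max(var_g - cov ** 2 / m["Var_h"], 0.0)
        g = mean_g + slope * (h - m["E_h"]) + math.sqrt(resid_var) * rng.gauss(0.0, 1.0)
        return g, h

    hits = 0
    for _ in range(draws):
        g_i, h = draw(m["E_gI"], m["Var_gI"], m["Cov_gI_h"])
        if abs(g_i) <= z and abs(h) <= z_pre:   # pretest accepts, I covers
            hits += 1
        g_j, h = draw(0.0, 1.0, m["Cov_gJ_h"])
        if abs(g_j) <= z and abs(h) > z_pre:    # pretest rejects, J covers
            hits += 1
    return hits / draws


# Hypothetical inputs: p(x) = 100 and r(x) = 0.5 are illustrative values only.
moments = theorem3_moments(tau=0.5, psi=0.2, T=3, p=100.0, r=0.5)
coverage = conditional_coverage_mc(moments, draws=20_000)
```

Replacing τ by −τ changes only the signs of the two conditional means, which is consistent with the conditional coverage being an even function of τ.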

 



Appendix

It has been suggested (see e.g. Maddala and Mount, 1973, Hsiao, 1986 and Baltagi, 2005) that negative estimates of variance be replaced by 0. We use this kind of approach to ensure that $\hat\sigma_\varepsilon^2$ is always positive and $\hat\sigma_\mu^2$ is always nonnegative. We consider the usual unbiased estimators and maximum likelihood estimators, given by Hsiao (1986), and Wooldridge's (2002) estimators. Theorems 1–3 hold for the three Hausman test statistics given by Hausman and Taylor (1981) and these three pairs of estimators. For the sake of brevity, the proofs are presented only for the test statistic (5). The proofs of Theorems 1 and 3 use Eq. (1.3) of Maddala (1971).

Suppose that N, T, $\tilde\alpha$, 1 − α, x, $\sigma_\varepsilon$ and $\sigma_\mu$ are given. Let $\hat g_I$, $\hat g_J$ and $\hat h$ denote the statistics $g_I$, $g_J$ and h when $\sigma_\varepsilon$ and $\sigma_\mu$ are replaced by the estimators $\hat\sigma_\varepsilon$ and $\hat\sigma_\mu$. The coverage probability $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$ is equal to

$$\begin{aligned}
&P\big(\beta \in I(\hat\sigma_\varepsilon, \hat\sigma_\mu),\ H(\hat\sigma_\varepsilon, \hat\sigma_\mu) \le z^2_{1-\tilde\alpha/2}\big) + P\big(\beta \in J(\hat\sigma_\varepsilon),\ H(\hat\sigma_\varepsilon, \hat\sigma_\mu) > z^2_{1-\tilde\alpha/2}\big) \\
&\quad = P\big(|\hat g_I| \le z_{1-\alpha/2},\ |\hat h| \le z_{1-\tilde\alpha/2}\big) + P\big(|\hat g_J| \le z_{1-\alpha/2},\ |\hat h| > z_{1-\tilde\alpha/2}\big).
\end{aligned} \tag{7}$$

Proof of Theorem 1. Let $x^\dagger_{it} = x_{it}/\sigma_x$, $\varepsilon^\dagger_{it} = \varepsilon_{it}/\sigma_\varepsilon$ and $\mu^\dagger_i = \mu_i/\sigma_\mu$. The joint distribution of the $\varepsilon^\dagger_{it}$'s and the $(\mu^\dagger_i, x^\dagger_{i1}, \dots, x^\dagger_{iT})$'s is determined by ρ and τ. Now express $\hat g_I$, $\hat g_J$ and $\hat h$ in terms of the $x^\dagger_{it}$'s, $\varepsilon^\dagger_{it}$'s, $\mu^\dagger_i$'s and ψ. Therefore the distributions of both $(\hat g_I, \hat h)$ and $(\hat g_J, \hat h)$ are functions of N, T, $\tilde\alpha$ and 1 − α and the unknown parameters ψ, ρ and τ. The theorem now follows from (7).

Proof of Theorem 2. Assume that $(\mu_i, x_{i1}, \dots, x_{iT})$ has a multivariate normal distribution with mean 0 and covariance matrix (2), where $\tilde\tau = \tau/\sqrt{T}$ and G = I. The proof for G with either compound symmetry or first order autoregression structure (ρ ≠ 0) is similar. By Theorem 1, $P(\beta \in K(\hat\sigma_\varepsilon, \hat\sigma_\mu))$ is a function of τ. Let $x^*_{it} = -x_{it}$ for i = 1, ..., N and t = 1, ..., T. For τ = d, $(\mu_i, x_{i1}, \dots, x_{iT})$ has a multivariate normal distribution with mean 0 and covariance matrix (2), where $\tilde\tau = d/\sqrt{T}$ and G = I. For τ = −d, $(\mu_i, x^*_{i1}, \dots, x^*_{iT})$ has the same distribution. Therefore $-\hat g_I$, $-\hat g_J$ and $-\hat h$ are the same functions of the $x^*_{it}$'s, $\varepsilon_{it}$'s and $\mu_i$'s as $\hat g_I$, $\hat g_J$ and $\hat h$, respectively, are functions of the $x_{it}$'s, $\varepsilon_{it}$'s and $\mu_i$'s. The theorem now follows from (7).

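Proof of Theorem 3. Suppose that G has 1's on the diagonal and ρ elsewhere (compound symmetry). It may be shown that $\tau = \mathrm{Corr}(\mu_i, \bar x_i)$ and so

$$\begin{pmatrix} \mu_i \\ \bar x_i \end{pmatrix} \sim N\left(0,\ \begin{pmatrix} \sigma_\mu^2 & \tau\,\sigma_\mu\sigma_{\bar x} \\ \tau\,\sigma_\mu\sigma_{\bar x} & \sigma_{\bar x}^2 \end{pmatrix}\right),$$

where $\sigma_{\bar x}^2 = \mathrm{Var}(\bar x_i)$. Therefore, conditional on x, $\mu_i \sim N\big(\tau(\sigma_\mu/\sigma_{\bar x})\,\bar x_i,\ \sigma_\mu^2(1 - \tau^2)\big)$. Thus $E(\hat\beta_B \mid x) = \beta + \tau(\sigma_\mu/\sigma_{\bar x})$ and

$$\mathrm{Var}(\hat\beta_B \mid x) = \frac{\sigma_\varepsilon^2\big((1 - \tau^2)\psi^2 + (1/T)\big)}{SSW\,r(x)}.$$

Also, $E(\hat\beta_W \mid x) = \beta$, $\mathrm{Var}(\hat\beta_W \mid x) = \sigma_\varepsilon^2/SSW$ and $\mathrm{Cov}(\hat\beta_B, \hat\beta_W \mid x) = 0$. Conditional on x, $\hat\beta_B$ and $\hat\beta_W$ are independent normally distributed random variables with the stated conditional means and variances. Conditional on x, the distributions of $(g_I, h)$ and $(g_J, h)$ are determined by the distributions of $(\hat\beta_W - \beta, \hat\beta_W - \hat\beta_B)$ and $(\hat\beta(\psi) - \beta, \hat\beta_W - \hat\beta_B)$. The distributions of $(\hat\beta_W - \beta, \hat\beta_W - \hat\beta_B)$ and $(\hat\beta(\psi) - \beta, \hat\beta_W - \hat\beta_B)$, conditional on x, are bivariate normal. Theorem 3 follows from this.

References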

Baltagi, B.H., 2005. Econometric Analysis of Panel Data, third ed. John Wiley & Sons, Ltd.

Guggenberger, P., 2010. The impact of a Hausman pretest on the size of a hypothesis test: the panel data case. J. Econometrics 156, 337–343.

Hausman, J.A., 1978. Specification tests in econometrics. Econometrica 46, 1251–1271.

Hausman, J.A., Taylor, W.E., 1981. Panel data and unobservable individual effects. Econometrica 49, 1377–1398.

Hsiao, C., 1986. Analysis of Panel Data. Cambridge University Press, Cambridge.

Maddala, G.S., 1971. The use of variance components models in pooling cross section and time series data. Econometrica 39, 341–358.

Maddala, G.S., Mount, T.D., 1973. A comparative study of alternative estimators for variance component models used in econometric applications. J. Amer. Statist. Assoc. 68, 324–328.

Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge.