The power envelope of panel unit root tests in case stationary alternatives offset explosive ones


I. Gaia Becheri a,∗, Feike C. Drost b, Ramon van den Akker b, Oliver Wichert b

a Delft Institute of Applied Mathematics (DIAM), Delft University of Technology, Delft, The Netherlands
b Econometrics Group, CentER, Tilburg University, Tilburg, The Netherlands

∗ Corresponding author. E-mail address: [email protected] (I.G. Becheri).

Article history: Received 26 August 2015; Received in revised form 23 September 2015; Accepted 23 September 2015; Available online 3 October 2015.

Abstract: We derive the power envelope for panel unit root tests where heterogeneous alternatives are modeled via zero-expectation random perturbations. We obtain an asymptotically UMP test and discuss how to proceed when one is agnostic about the expectation of the perturbations.

Keywords: Panel unit root test; Local Asymptotic Normality; Limit experiment; Asymptotic power envelope.

1. Introduction

We start from the setup of Moon et al. (2007), also adopted by Becheri et al. (2015); both papers study the asymptotic power envelope for the unit root testing problem in a Gaussian, cross-sectionally independent panel where the observations Y_it, for i = 1, ..., n and t = 1, ..., T, are generated by

$$Y_{it} = m_i + Y_{it}^0, \qquad Y_{it}^0 = \rho_i Y_{i,t-1}^0 + \sigma_i \varepsilon_{it},$$

with m_i a deterministic unobserved fixed effect, Y_{i0}^0 = 0, and ε_it satisfying Assumption 1.1(a). Both papers assume the heterogeneous autoregression coefficients ρ_i to be generated according to the random coefficient structure ρ_i = 1 + hU_i/(√n T), where U_1, ..., U_n are i.i.d. unobserved random variables with mean 1 and unknown distribution. The results of Moon et al. (2007) and Becheri et al. (2015) cannot be extended to the case where the perturbations have zero mean, since the power envelopes would then be flat (which intuitively means that there exist no tests that can detect alternatives at the localizing rate √n T). In this note we assume the U_i to have mean zero and, more specifically, to satisfy Assumption 1.1(c), and we reparameterize ρ_i as

$$\rho_i = 1 + \frac{h}{T n^{\gamma}}\, U_i, \qquad h \le 0, \tag{1}$$

for some appropriate value of γ. Note that, U_i being unobserved, the sign of h is unidentified; thus there is no loss of generality in assuming h ≤ 0.
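To fix ideas, the following simulation sketch (ours, not part of the original paper; all names are illustrative) draws a panel from this DGP under the local alternative (1) with γ = 1/4 and Rademacher perturbations U_i, which have mean zero, unit variance and bounded support as required by Assumption 1.1(c) below. For h < 0 roughly half of the units become locally stationary and the other half locally explosive, which is the situation described in the title.

import numpy as np

def simulate_panel(n, T, h, gamma=0.25, rng=None):
    # Y_it = m_i + Y0_it,  Y0_it = rho_i Y0_{i,t-1} + sigma_i eps_it,  Y0_i0 = 0,
    # rho_i = 1 + h U_i / (T n^gamma),  U_i i.i.d. Rademacher,  eps_it i.i.d. N(0, 1).
    rng = np.random.default_rng() if rng is None else rng
    m = rng.normal(size=n)                     # unobserved fixed effects (arbitrary here)
    sigma = rng.uniform(0.5, 2.0, size=n)      # heterogeneous scales, sigma_i > 0
    U = rng.choice([-1.0, 1.0], size=n)        # zero-mean, bounded perturbations
    rho = 1.0 + h * U / (T * n ** gamma)
    eps = rng.normal(size=(n, T))
    Y0 = np.zeros((n, T + 1))
    for t in range(1, T + 1):
        Y0[:, t] = rho * Y0[:, t - 1] + sigma * eps[:, t - 1]
    return m[:, None] + Y0                     # columns are t = 0, 1, ..., T

Y = simulate_panel(n=200, T=200, h=-2.0)       # h = 0 gives the null of a unit root in every unit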






Assumption 1.1. (a) The innovations ε_it, i, t ∈ N, are i.i.d. N(0, 1). (b) The deterministic scale parameters σ_i are strictly positive, i.e. σ_i > 0 for i ∈ N. (c) The perturbations U_i, i ∈ N, are i.i.d. with mean 0 and variance 1, have bounded support, and are independent of the idiosyncratic shocks ε_it, i, t ∈ N. Moreover, the moment generating function of U_1 exists on an open interval containing 0.

Throughout, we are interested in testing the unit root hypothesis

$$H_0: h = 0 \quad \text{versus} \quad H_a: h < 0. \tag{2}$$

Under the null hypothesis, each panel unit has a unit root whereas, under the alternative, there are both explosive and stationary time series {Y_it, t ∈ N}. Assumption 1.1 allows the U_i to have an atom at zero, so a random fraction of the time series {Y_it, t ∈ N} might have a unit root under the alternative. In this note we show that, under Assumption 1.1, the alternatives are contiguous to the null hypothesis if γ = 1/4. Note that this is a different rate than the one in Moon et al. (2007) and Becheri et al. (2015) (where U_1 has expectation 1 and γ = 1/2). We derive the UMP test for (2) and we also compare this test to the UMP test for the setting where the expectation of U_1 is 1. The m_i and σ_i are treated as unknown nuisance parameters.

2. Main results

First we derive the limit experiment of the model where m_i and σ_i are known. This yields the power envelope for the testing problem (2). In Section 2.2, we prove adaptivity of our problem with respect to the nuisance parameters m_i and σ_i and propose an optimal test.

2.1. Limit experiment and power envelope

In this section we assume the parameters m_i and σ_i to be known. The limit experiment for this model is given in Proposition 2.1.

Let P_h^{(n,T)} denote the law of Y := {Y_it, i = 1, ..., n, t = 1, ..., T}, P̃_h^{(n,T)} the law of Y conditional on U_1, ..., U_n, and P_u the law of U_1, ..., U_n. Note that, thanks to Assumption 1.1(c), under the null, the law of U_1, ..., U_n conditional on Y is still P_u. Unless otherwise indicated, all expectations are taken under H_0.

In order to derive the limit experiment, we have to study the likelihood ratio of our model, that is dP_h^{(n,T)}/dP_0^{(n,T)}. To compute it, we use the following relation between likelihood ratios:

$$\frac{dP_h^{(n,T)}}{dP_0^{(n,T)}} = E\left[\left.\frac{d\tilde P_h^{(n,T)}}{d\tilde P_0^{(n,T)}}\,\right|\, Y\right], \tag{3}$$

where dP̃_h^{(n,T)}/dP̃_0^{(n,T)} is the likelihood ratio of the model where both Y_it and U_i are observed.

Let ΔY_it = Y_it − Y_{i,t−1} for i = 1, ..., n and t = 1, ..., T, and let us introduce the partial sum process W_i^{(T)} as

$$W_i^{(T)}(u) := \frac{1}{\sqrt{T}\,\sigma_i}\sum_{t=1}^{[Tu]} \Delta Y_{it},$$

and define

$$X_i^{(T)} := \int_0^1 W_i^{(T)}(u-)\,\mathrm{d}W_i^{(T)}(u) \quad\text{and}\quad J_i^{(T)} := \int_0^1 \bigl(W_i^{(T)}(u-)\bigr)^2\,\mathrm{d}u, \tag{4}$$

where W(u−) = lim_{x→u−} W(x). The likelihood ratio dP̃_h^{(n,T)}/dP̃_0^{(n,T)} can be easily computed thanks to Assumption 1.1(a) and it is given by

$$\frac{d\tilde P_h^{(n,T)}}{d\tilde P_0^{(n,T)}} = \prod_{i=1}^n \exp\left( U_i\,\frac{h}{n^{1/4}}\, X_i^{(T)} - \frac{h^2 U_i^2}{2\sqrt{n}}\, J_i^{(T)}\right). \tag{5}$$
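As a concrete illustration (our own sketch, with the σ_i treated as known, as in this subsection), the functionals in (4) reduce to simple array operations on the standardized increments:

import numpy as np

def panel_functionals(dY, sigma):
    # dY: (n, T) array of increments Delta Y_it; sigma: (n,) known scale parameters.
    # Returns X_i^(T) and J_i^(T) of (4), using W_i^(T)(t/T) = (T^{-1/2}/sigma_i) * sum_{s<=t} Delta Y_is.
    n, T = dY.shape
    e = dY / sigma[:, None]
    W_prev = np.concatenate([np.zeros((n, 1)), np.cumsum(e, axis=1)[:, :-1]], axis=1) / np.sqrt(T)
    X = np.sum(W_prev * e, axis=1) / np.sqrt(T)   # discretized integral of W(u-) dW(u)
    J = np.sum(W_prev ** 2, axis=1) / T           # discretized integral of W(u-)^2 du
    return X, J

For large T and under the null, X_i^{(T)} is close to (W_i²(1) − 1)/2 and J_i^{(T)} to the occupation integral of a standard Brownian motion W_i.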

In the following proposition, we make use of (3)–(5) to establish the LAN property for the model of interest under joint asymptotics (T, n) → ∞, as in Becheri et al. (2015). The proof is postponed to the Appendix.

Remark 2.1. Note that (T, n) → ∞ means that min(T, n) → ∞.

Proposition 2.1. Let Assumption 1.1 hold and put γ = 1/4. Then, under P_0^{(n,T)},

$$\log\frac{dP_h^{(n,T)}}{dP_0^{(n,T)}} = h^2 \Delta_{n,T} - \frac{1}{2}\, h^4 J + o_p(1), \quad \text{as } (T, n)\to\infty, \tag{6}$$

where J = 5/8 and, under P_0^{(n,T)},

$$\Delta_{n,T} = \frac{1}{2}\sum_{i=1}^n \frac{(X_i^{(T)})^2 - J_i^{(T)}}{\sqrt{n}} \;\xrightarrow{\;d\;}\; N(0, J). \tag{7}$$

Moreover, under P_h^{(n,T)}, Δ_{n,T} →d N(h²J, J) as (T, n) → ∞.
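As an informal numerical check of Proposition 2.1 (our sketch; under H_0 the standardized increments ΔY_it/σ_i are simply i.i.d. N(0, 1)), the central sequence should be approximately centred at zero with variance close to J = 5/8:

import numpy as np

rng = np.random.default_rng(0)
n, T, reps = 100, 200, 500
stats = []
for _ in range(reps):
    e = rng.normal(size=(n, T))                    # standardized increments under H_0
    W_prev = np.concatenate([np.zeros((n, 1)), np.cumsum(e, axis=1)[:, :-1]], axis=1) / np.sqrt(T)
    X = np.sum(W_prev * e, axis=1) / np.sqrt(T)
    J = np.sum(W_prev ** 2, axis=1) / T
    stats.append(np.sum(X ** 2 - J) / (2 * np.sqrt(n)))
print(np.mean(stats), np.var(stats))               # should be close to 0 and 5/8 = 0.625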

Remark 2.2. In Assumption 1.1(a), it might be possible to replace the Gaussian assumption by some milder conditions. Plausibly, the results of this note still hold if the ε_it satisfy a functional central limit theorem for arrays that would ensure convergence of the partial sums to Wiener processes. However, this is beyond the aim of this note.

This proposition and an application of Theorem 9.4 in van der Vaart (2000) imply that the sequence of experiments {P_h^{(n,T)} : h ∈ R_−} converges to the experiment {N(h²J, J) : h ∈ R_−}. (Theorem 9.4 in van der Vaart (2000) needs to be applied to the model where ρ_i is re-parameterized in terms of the local parameter h̃ = h² as ρ_i = 1 − (√h̃/(T n^{1/4})) U_i.) Using the Asymptotic Representation Theorem (see, for example, van der Vaart, 2000, Chapter 15), we can thus obtain the (asymptotic) power envelope for testing hypothesis (2). The resulting power envelope is presented in the following corollary.

Corollary 2.1. Let Assumption 1.1 hold, γ = 1/4, α ∈ (0, 1), and denote z_α = Φ^{−1}(1 − α). Consider a test φ(Y_11, ..., Y_nT) of level α with power π_{n,T}(h). Then, for all h, we have

$$\limsup_{(T,n)\to\infty} \pi_{n,T}(h) \le \Phi\!\left(-z_\alpha + h^2\sqrt{J}\right), \tag{8}$$

where Φ denotes the cumulative distribution function of the standard normal distribution. Moreover, let

$$t_{n,T} = \frac{\Delta_{n,T}}{\sqrt{J}} = \sqrt{\frac{2}{5}}\;\frac{1}{\sqrt{n}}\sum_{i=1}^n \left[(X_i^{(T)})^2 - J_i^{(T)}\right]. \tag{9}$$

Then the test ψ_{n,T} = 1{t_{n,T} ≥ z_α} attains the upper bound (8) uniformly in h.

Remark 2.3. Note that this test is semiparametrically optimal in the sense that the power envelope (8) does not depend on the distribution of the perturbations U_i.

2.2. A feasible test

In this section we treat m_i and σ_i as unknown nuisance parameters. We show that the unit root testing problem is adaptive with respect to these parameters, that is, the power envelope can still be attained when m_i and σ_i are unknown, provided n/T → 0. (This additional assumption on n and T is needed to handle an increasing number of nuisance parameters; it is standard in the literature, see, for instance, Moon et al. (2007) and Becheri et al. (2015).) In fact, we can define a test whose (local and asymptotic) power achieves the power envelope (8) while being invariant with respect to the m_i and using estimates of the σ_i. This test is based on a feasible version of the central sequence Δ_{n,T}, obtained by replacing σ_i², i = 1, ..., n, by

$$\hat\sigma_i^2 = \frac{1}{T-1}\sum_{t=2}^{T}(\Delta Y_{it})^2.$$

Our test statistic t̂_{n,T} is thus defined on the basis of (9) as

$$\hat t_{n,T} = \sqrt{\frac{2}{5}}\,\frac{1}{\sqrt{n}}\sum_{i=1}^n\left[\left(\frac{1}{T}\sum_{t=3}^{T}\frac{1}{\hat\sigma_i}\left(\sum_{s=2}^{t-1}\frac{\Delta Y_{is}}{\hat\sigma_i}\right)\Delta Y_{it}\right)^{2} - \frac{1}{T^2}\sum_{t=3}^{T}\frac{1}{\hat\sigma_i^2}\left(\sum_{s=2}^{t-1}\Delta Y_{is}\right)^{2}\right]$$

$$\phantom{\hat t_{n,T}} = \sqrt{\frac{2}{5}}\,\frac{1}{\sqrt{n}}\sum_{i=1}^n\left[\frac{\sigma_i^4}{\hat\sigma_i^4}\,(X_i^{(T)})^2 - \frac{\sigma_i^2}{\hat\sigma_i^2}\,J_i^{(T)} - \frac{\sigma_i^4}{\hat\sigma_i^4}\,r^a_{i,T} + \frac{\sigma_i^2}{\hat\sigma_i^2}\,r^b_{i,T}\right],$$

where

$$r^a_{i,T} = \frac{1}{\sigma_i^4}\left(\frac{1}{T}\sum_{t=2}^{T}\Delta Y_{i1}\,\Delta Y_{it}\right)^{2} + \frac{2}{\sigma_i^4}\left(\frac{1}{T}\sum_{t=2}^{T}\Delta Y_{i1}\,\Delta Y_{it}\right)\left(\frac{1}{T}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\Delta Y_{is}\,\Delta Y_{it}\right)$$

and

$$r^b_{i,T} = \frac{T-1}{T^2}\,\frac{\Delta Y_{i1}^2}{\sigma_i^2} + \frac{2\,\Delta Y_{i1}}{T^2\,\sigma_i^2}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\Delta Y_{is}.$$

Note that r^a_{i,T} and r^b_{i,T} are remainder terms due to not observing Y_{i0} = m_i. Proposition 2.2 below shows that t̂_{n,T} is asymptotically equivalent to t_{n,T} in the sense that the two statistics differ only by a term of order o_P(1).
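For illustration, here is a minimal sketch (ours; the helper name is hypothetical) of the feasible statistic: σ_i² is replaced by σ̂_i², and the first increment, which would require the unobserved initial level Y_{i0} = m_i, is dropped.

import numpy as np

def t_hat(Y):
    # Y: observed (n, T) panel, columns t = 1, ..., T; the level Y_i0 = m_i is not observed.
    n, T = Y.shape
    dY = np.diff(Y, axis=1)                        # Delta Y_it for t = 2, ..., T
    sigma2_hat = np.mean(dY ** 2, axis=1)          # (1/(T-1)) sum_{t=2}^T (Delta Y_it)^2
    e = dY / np.sqrt(sigma2_hat)[:, None]
    S = np.cumsum(e, axis=1)[:, :-1]               # sum_{s=2}^{t-1} e_is, aligned with t = 3, ..., T
    X_hat = np.sum(S * e[:, 1:], axis=1) / T
    J_hat = np.sum(S ** 2, axis=1) / T ** 2
    return np.sqrt(2.0 / 5.0) * np.sum(X_hat ** 2 - J_hat) / np.sqrt(n)

# e.g. with simulate_panel from the Section 1 sketch: t_hat(simulate_panel(200, 200, h=-2.0)[:, 1:]);
# the one-sided test discussed in Remark 2.4 below rejects H_0 at level alpha when this exceeds z_alpha.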

Proposition 2.2. Let Assumption 1.1 hold and suppose n/T → 0. Then we have, for all h and under P_h^{(n,T)}, as (T, n) → ∞,

$$\hat t_{n,T} = t_{n,T} + o_P(1). \tag{10}$$

Remark 2.4. From (10) and Corollary 2.1, it readily follows that the test ψ̂_{n,T} = 1{t̂_{n,T} > z_α} is asymptotically UMP.

3. Testing for a unit root when EU_1 is unknown

In practice, it may be difficult to determine whether some data were generated under the DGP introduced in Section 1, where EU_1 = 0, or under the DGP considered in Moon et al. (2007) and Becheri et al. (2015), where U_1 satisfies Assumption 3.1, i.e. EU_1 = 1. In this section we address the problem of testing for a unit root while being agnostic about the first moment of U_1. For notational simplicity, we consider the test statistics t_{n,T} and τ_{n,T} (introduced below) which, as in Section 2.1, rely on the nuisance parameters being known; the extension to their estimated, feasible counterparts is immediate as long as n/T → 0.

Assumption 3.1. The perturbations U_i, i ∈ N, are i.i.d. with mean 1 and independent of the idiosyncratic shocks ε_it, i, t ∈ N. Moreover, the moment generating function of U_1 exists on an open interval containing 0.

Note that we need γ = 1/2 to ensure contiguity of the alternatives with respect to the null hypothesis under Assumption 3.1 (see Moon et al., 2007 and Becheri et al., 2015). In Section 2, we have shown that optimal inference for the testing problem (2) can be based on t_{n,T} when EU_1 = 0. Becheri et al. (2015) shows that, if EU_1 = 1, optimal inference can be based on

$$\tau_{n,T} = \sqrt{\frac{2}{n}}\sum_{i=1}^n X_i^{(T)}.$$

Let us denote by Q_h^{(n,T)} the law of Y when the U_i satisfy Assumption 3.1 and ρ_i satisfies (1) with γ = 1/2. Clearly, since Q_0^{(n,T)} = P_0^{(n,T)}, the statistics t_{n,T} and τ_{n,T} converge to a standard normal distribution under both P_0^{(n,T)} and Q_0^{(n,T)} (see Proposition 2.1 and Proposition 4.2 in Becheri et al., 2015). This implies that both statistics are valid in terms of size for testing the unit root hypothesis (2) irrespective of the expectation of U_1. In what follows, we propose two tests based on the statistics t_{n,T} and τ_{n,T} having power against h < 0 even when we do not know whether U_1 satisfies Assumption 1.1(c) or Assumption 3.1.

Lemma 3.1 provides the distribution of t_{n,T} and τ_{n,T} under P_h^{(n,T)} and Q_h^{(n,T)}. Its proof relies on a straightforward application of Le Cam's third lemma and can be found in the Appendix.

Lemma 3.1. Let Assumption 1.1(a)–(b) hold.

(i) Let Assumption 1.1(c) hold and γ = 1/4. Then, under P_h^{(n,T)}, as (T, n) → ∞,

$$t_{n,T} \xrightarrow{\;d\;} N\bigl(h^2\sqrt{5/8},\, 1\bigr) \quad\text{and}\quad \tau_{n,T} \xrightarrow{\;d\;} N\bigl(h^2\sqrt{2/9},\, 1\bigr).$$

(ii) Let Assumption 3.1 hold and γ = 1/2. Then, under Q_h^{(n,T)}, as (T, n) → ∞,

$$t_{n,T} \xrightarrow{\;d\;} N\bigl(h\sqrt{8/45},\, 1\bigr) \quad\text{and}\quad \tau_{n,T} \xrightarrow{\;d\;} N\bigl(h/\sqrt{2},\, 1\bigr).$$

This result provides guidance on defining tests that do not rely on knowing the expectation of U_1 and it enables us to compute their (local and asymptotic) power. From Lemma 3.1(i), we conclude that, under P_h^{(n,T)}, one would reject for large values of either test statistic. On the contrary, under Q_h^{(n,T)}, one would reject for small values. This suggests that, when it is not known whether EU_1 = 0 or EU_1 = 1, one should reject for both large and small values of t_{n,T} and τ_{n,T}. Following this lead, we can define two tests having power both under P_h^{(n,T)} and Q_h^{(n,T)}:

$$\varphi_{n,T} = 1 - 1\{-z_{\alpha/2} < t_{n,T} < z_{\alpha/2}\} \quad\text{and}\quad \tilde\varphi_{n,T} = 1 - 1\{-z_{\alpha/2} < \tau_{n,T} < z_{\alpha/2}\}.$$

From Lemma 3.1, we easily obtain the (local and asymptotic) powers of these tests, which are presented in Corollary 3.1 below.
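A short sketch (ours) of the two decision rules, with the σ_i treated as known as in this section; z_{α/2} is the standard normal quantile.

import numpy as np
from scipy.stats import norm

def two_sided_tests(dY, sigma, alpha=0.05):
    # dY: (n, T) increments Delta Y_it; sigma: known scale parameters.
    n, T = dY.shape
    e = dY / sigma[:, None]
    W_prev = np.concatenate([np.zeros((n, 1)), np.cumsum(e, axis=1)[:, :-1]], axis=1) / np.sqrt(T)
    X = np.sum(W_prev * e, axis=1) / np.sqrt(T)
    J = np.sum(W_prev ** 2, axis=1) / T
    t = np.sqrt(2.0 / 5.0) * np.sum(X ** 2 - J) / np.sqrt(n)    # t_{n,T}
    tau = np.sqrt(2.0 / n) * np.sum(X)                           # tau_{n,T}
    z = norm.ppf(1 - alpha / 2)
    return {"phi_rejects": abs(t) >= z, "phi_tilde_rejects": abs(tau) >= z}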


Corollary 3.1. Let Assumption 1.1(a)–(b) hold.

(i) Let Assumption 1.1(c) hold and ρ_i satisfy (1) with γ = 1/4. Then, under P_h^{(n,T)},

$$\lim_{(T,n)\to\infty}\bigl(1 - P_h^{(n,T)}[-z_{\alpha/2} < t_{n,T} < z_{\alpha/2}]\bigr) = \Phi\bigl(-z_{\alpha/2} - h^2\sqrt{5/8}\bigr) + \Phi\bigl(-z_{\alpha/2} + h^2\sqrt{5/8}\bigr),$$

and

$$\lim_{(T,n)\to\infty}\bigl(1 - P_h^{(n,T)}[-z_{\alpha/2} < \tau_{n,T} < z_{\alpha/2}]\bigr) = \Phi\bigl(-z_{\alpha/2} - h^2\sqrt{2/9}\bigr) + \Phi\bigl(-z_{\alpha/2} + h^2\sqrt{2/9}\bigr).$$

(ii) Let Assumption 3.1 hold and ρ_i satisfy (1) with γ = 1/2. Then, under Q_h^{(n,T)},

$$\lim_{(T,n)\to\infty}\bigl(1 - Q_h^{(n,T)}[-z_{\alpha/2} < t_{n,T} < z_{\alpha/2}]\bigr) = \Phi\bigl(-z_{\alpha/2} - h\sqrt{8/45}\bigr) + \Phi\bigl(-z_{\alpha/2} + h\sqrt{8/45}\bigr),$$

and

$$\lim_{(T,n)\to\infty}\bigl(1 - Q_h^{(n,T)}[-z_{\alpha/2} < \tau_{n,T} < z_{\alpha/2}]\bigr) = \Phi\bigl(-z_{\alpha/2} - h/\sqrt{2}\bigr) + \Phi\bigl(-z_{\alpha/2} + h/\sqrt{2}\bigr).$$
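These limiting powers are easy to tabulate; a small sketch (ours):

import numpy as np
from scipy.stats import norm

def power_two_sided(shift, alpha=0.05):
    # Phi(-z_{alpha/2} - shift) + Phi(-z_{alpha/2} + shift), as in Corollary 3.1.
    z = norm.ppf(1 - alpha / 2)
    return norm.cdf(-z - shift) + norm.cdf(-z + shift)

h = -2.0
print(power_two_sided(h ** 2 * np.sqrt(5 / 8)))   # phi        under P_h  (EU_1 = 0)
print(power_two_sided(h ** 2 * np.sqrt(2 / 9)))   # phi~       under P_h
print(power_two_sided(h * np.sqrt(8 / 45)))       # phi        under Q_h  (EU_1 = 1)
print(power_two_sided(h / np.sqrt(2)))            # phi~       under Q_h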

Corollary 3.1(i) shows that, under P_h^{(n,T)}, the power of the test φ_{n,T} is (asymptotically) higher than that of φ̃_{n,T}, while from Corollary 3.1(ii) it follows that, under Q_h^{(n,T)}, the power of φ̃_{n,T} is higher than that of φ_{n,T}. Furthermore, from Corollary 3.1 it is clear that neither φ_{n,T} nor φ̃_{n,T} is optimal. It is, however, important to note that the one-sided test ψ_{n,T}, which is optimal under P_h^{(n,T)}, always has power less than the size α in the Q_h^{(n,T)}-model. A similar remark applies to the test 1{τ_{n,T} < −z_α}, which is optimal under Q_h^{(n,T)} but has power less than α under P_h^{(n,T)}. This implies that these one-sided tests are of little use when it is not possible to decide between the models P_h^{(n,T)} and Q_h^{(n,T)}. Therefore, when it is not possible to determine under which DGP the data were generated, we recommend using the two-sided tests.

Appendix. Proofs

A.1. Proof of Proposition 2.1

In the following, we first establish convergence (7), then we prove the expansion (6), and finally we establish the convergence result under the alternative. All probabilities and expectations are evaluated under P_0^{(n,T)} unless otherwise stated.

For m = 2, ..., 8, we introduce the random variables K_{mi} = f_m(X_i^{(T)}, J_i^{(T)}), i = 1, ..., n, where

$$f_2(x, j) = \frac{x^2 - j}{2}, \qquad f_3(x, j) = \frac{x^3 - 3xj}{6}, \qquad f_4(x, j) = \frac{x^4 - 6x^2 j + 3j^2}{24},$$
$$f_5(x, j) = \frac{3xj^2 - 2x^3 j}{24}, \qquad f_6(x, j) = \frac{3x^2 j^2 - j^3}{48}, \qquad f_7(x, j) = \frac{-xj^3}{48}, \qquad f_8(x, j) = \frac{j^4}{384},$$

and X_i^{(T)} and J_i^{(T)} are as defined in (4). Note that these are approximations to the stochastic integrals X_i = ∫_0^1 W_i(u) dW_i(u) = (W_i²(1) − 1)/2 and J_i = ∫_0^1 W_i²(u) du, based on independent Brownian motions W_i, and that, for fixed m, the variables K_{mi}, i = 1, ..., n, are i.i.d.

Put μ_m^{(T)} = EK_{m1} and σ_m^{(T)} = √(Var(K_{m1})). Some tedious calculations show, as T → ∞,

$$\mu_m^{(T)} = E f_m(X_1^{(T)}, J_1^{(T)}) \to \mu_m = E f_m(X_1, J_1) \quad\text{and}\quad \sigma_m^{(T)} = \sqrt{\operatorname{Var}\bigl(f_m(X_1^{(T)}, J_1^{(T)})\bigr)} \to \sigma_m = \sqrt{\operatorname{Var}\bigl(f_m(X_1, J_1)\bigr)}.$$

Although it is not strictly necessary to demonstrate convergence to the moments of the limiting process (X_i, J_i), it provides some additional intuition why the sequences μ_m^{(T)} and σ_m^{(T)} are bounded. Furthermore, the limits μ_m and σ_m can be easily obtained from Itô calculus. In particular, we obtain μ_2^{(T)} = μ_2 = μ_3^{(T)} = μ_3 = 0, μ_4^{(T)} → μ_4 = 0, and σ_2^{(T)} → σ_2 = √(5/8).

Using once again the Gaussianity of our innovations, it can be demonstrated that higher moments of K_{m1} are bounded as well. The previous considerations on the moments of K_{m1} enable us to apply a Central Limit Theorem for a double array of random variables (see Serfling, 1980, p. 32),

$$\sum_{i=1}^n \frac{K_{mi} - \mu_m^{(T)}}{\sqrt{n}\,\sigma_m^{(T)}} \;\xrightarrow{\;d\;}\; N(0, 1) \quad\text{and}\quad \frac{1}{n}\sum_{i=1}^n \bigl(K_{mi}^2 - (\sigma_m^{(T)})^2 - (\mu_m^{(T)})^2\bigr) \;\xrightarrow{\;P\;}\; 0. \tag{A.1}$$


As μ_2^{(T)} = 0 and σ_2^{(T)} → √(5/8), (A.1) establishes the limiting distribution (7) of the central sequence Δ_{n,T} as well as a useful approximation to the Fisher information J = 5/8, namely

$$\Delta_{n,T} = \frac{1}{\sqrt{n}}\sum_{i=1}^n K_{2i} \;\xrightarrow{\;d\;}\; N(0, J) \quad\text{and}\quad \frac{1}{n}\sum_{i=1}^n K_{2i}^2 \;\xrightarrow{\;P\;}\; J.$$
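A quick numerical illustration of this approximation (our sketch; one large panel under H_0 with σ_i = 1):

import numpy as np

rng = np.random.default_rng(1)
n, T = 5000, 500
e = rng.normal(size=(n, T))                        # standardized increments under H_0
W_prev = np.concatenate([np.zeros((n, 1)), np.cumsum(e, axis=1)[:, :-1]], axis=1) / np.sqrt(T)
X = np.sum(W_prev * e, axis=1) / np.sqrt(T)
J = np.sum(W_prev ** 2, axis=1) / T
K2 = (X ** 2 - J) / 2
print(K2.mean(), (K2 ** 2).mean())                 # approximately mu_2 = 0 and J = 5/8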

Next, we obtain the desired expansion of the log-likelihood ratio. Define a_i = h (U_i/n^{1/4}) X_i^{(T)} and b_i = −h² (U_i²/(2√n)) J_i^{(T)}. From (3) and (5) and using the independence across i, we get

$$\log\frac{dP_h^{(n,T)}}{dP_0^{(n,T)}} = \sum_{i=1}^n \log E\bigl(\exp(a_i + b_i) \mid Y\bigr).$$

Expanding the exponential, we have, for some 0 ≤ |ξ_{1,i}| = |ξ_1(U_i, n, T, X_i^{(T)}, J_i^{(T)})| ≤ |a_i + b_i|,

$$\log\frac{dP_h^{(n,T)}}{dP_0^{(n,T)}} = \sum_{i=1}^n \log E\left(1 + \sum_{k=1}^4 \frac{1}{k!}(a_i + b_i)^k + \frac{1}{5!}\,e^{\xi_{1,i}}(a_i + b_i)^5 \,\Bigm|\, Y\right) = \sum_{i=1}^n \log\bigl(1 + L_i^{(T,n)}\bigr).$$

Recall that U_i, with EU_i = 0 and EU_i² = 1, is independent of both X_i^{(T)} and J_i^{(T)} (see Assumption 1.1); hence

$$L_i^{(T,n)} = E\left(\sum_{k=1}^4 \frac{1}{k!}(a_i + b_i)^k + \frac{1}{5!}\,e^{\xi_{1,i}}(a_i + b_i)^5 \,\Bigm|\, Y\right) = h^2\,\frac{1}{\sqrt{n}}\,K_{2i} + \sum_{m=3}^{8} h^m n^{-m/4}\,(EU_1^m)\,K_{mi} + \frac{1}{120}\,E\bigl(e^{\xi_{1,i}}(a_i + b_i)^5 \mid Y\bigr).$$

Using boundedness of moments and employing the following inequality for i.i.d. random variables due to Gumbel (1954),

$$E\max_{i\le n}|K_{mi}|^{\ell} \le E|K_{m1}|^{\ell} + \frac{n-1}{\sqrt{2n-1}}\sqrt{\operatorname{Var}\bigl(|K_{m1}|^{\ell}\bigr)}, \qquad \ell > 0,$$

we obtain E max_{i≤n}|K_{mi}|^ℓ = O(√n). Therefore, the Markov inequality implies max_{i≤n}|K_{mi}| = o_p(n^α) for any α > 0. A similar reasoning shows

$$\zeta_{nT} = n^{1/5}\max_{i\le n}\left(\left|\frac{h}{n^{1/4}}\,X_i^{(T)}\right| + \frac{h^2}{2\sqrt{n}}\,J_i^{(T)}\right) = o_p(1).$$

This implies that the final term of L_i^{(T,n)} is asymptotically negligible. Indeed, using again a similar reasoning as before, we find that, for all ε > 0, there exist n, T and a constant k such that

$$\max_{i\le n}\bigl|E\bigl(e^{\xi_{1,i}}(a_i + b_i)^5 \mid Y\bigr)\bigr| \le \frac{k}{n}\,e^{k\zeta_{nT}}\,\zeta_{nT}^{5} = o_p(n^{-1}),$$

where k is a finite positive constant that depends on the support of U_1 (use Assumption 1.1(c)). Collect the previous results and repeatedly use the Central Limit Theorem for a double array of random variables (Serfling, 1980, p. 32), to obtain

$$P_0^{(n,T)}\Bigl[\max_{i\le n}\bigl|L_i^{(T,n)}\bigr| < \epsilon\Bigr] \to 1, \tag{A.2}$$

$$\sum_{i=1}^n L_i^{(T,n)} = h^2\,\frac{1}{\sqrt{n}}\sum_{i=1}^n K_{2i} + o_p(1) \;\xrightarrow{\;d\;}\; N(0, h^4 J), \tag{A.3}$$

$$\sum_{i=1}^n \bigl(L_i^{(T,n)}\bigr)^2 = h^4\,\frac{1}{n}\sum_{i=1}^n K_{2i}^2 + o_p(1) = h^4\,\frac{5}{8} + o_p(1) \;\xrightarrow{\;p\;}\; h^4 J. \tag{A.4}$$

Subsequently, proceed with an expansion of the logarithm in the log-likelihood ratio, yielding

$$\log\frac{dP_h^{(n,T)}}{dP_0^{(n,T)}} = \sum_{i=1}^n \log\bigl(1 + L_i^{(T,n)}\bigr) = \sum_{i=1}^n L_i^{(T,n)} - \sum_{i=1}^n \frac{\bigl(L_i^{(T,n)}\bigr)^2}{2} + \sum_{i=1}^n \frac{\bigl(L_i^{(T,n)}\bigr)^3}{3\bigl(1 + \xi_{2,i}\bigr)^3},$$

for some ξ_{2,i} between 0 and L_i^{(T,n)}. Since |ξ_{2,i}| ≤ |L_i^{(T,n)}| ≤ ε (with probability converging to one), we find the bound

$$\left|\sum_{i=1}^n \frac{\bigl(L_i^{(T,n)}\bigr)^3}{3\bigl(1 + \xi_{2,i}\bigr)^3}\right| \le \sum_{i=1}^n \frac{\bigl|L_i^{(T,n)}\bigr|^3}{3(1-\epsilon)^3} \le \frac{\epsilon}{3(1-\epsilon)^3}\sum_{i=1}^n \bigl|L_i^{(T,n)}\bigr|^2.$$

Therefore, (A.2)–(A.4) establish the desired expansion. Finally, an application of Le Cam's third lemma immediately yields convergence of the central sequence to a normal N(h²J, J) distribution under the local alternatives. □


A.2. Proof of Proposition 2.2

As we have shown that our model is LAN, contiguity is obtained from Le Cam's first lemma. Hence we only have to prove (10) under P_0^{(n,T)}. In the remainder of this proof, all expressions, probabilities and expectations are evaluated under P_0^{(n,T)}. We have

$$\hat t_{n,T} - t_{n,T} = \sqrt{\frac{2}{5}}\;\frac{1}{\sqrt{n}}\sum_{i=1}^n\left[r^X_{i,nT} - r^J_{i,nT} - \frac{\sigma_i^4}{\hat\sigma_i^4}\,r^a_{i,T} + \frac{\sigma_i^2}{\hat\sigma_i^2}\,r^b_{i,T}\right],$$

where

$$r^X_{i,nT} = \left(\frac{\sigma_i^4}{\hat\sigma_i^4} - 1\right)(X_i^{(T)})^2 \quad\text{and}\quad r^J_{i,nT} = \left(\frac{\sigma_i^2}{\hat\sigma_i^2} - 1\right) J_i^{(T)}.$$

Below, we analyze one term at a time and prove each one is o_P(1). First we obtain some handy relationships on replacing the scale parameters σ_i by estimates. We obtain

$$E\sum_{i=1}^n\left(\frac{\hat\sigma_i^2}{\sigma_i^2} - 1\right)^2 = \frac{2n}{T-1} \to 0.$$

Hence Σ_{i=1}^n (σ̂_i²/σ_i² − 1)² = o_P(1), and we also have min_{i≤n} σ̂_i²/σ_i² →P 1 and max_{i≤n} σ̂_i²/σ_i² →P 1. This also implies

$$\sum_{i=1}^n\left(\frac{\sigma_i^2}{\hat\sigma_i^2} - 1\right)^2 \le \sum_{i=1}^n\left(\frac{\hat\sigma_i^2}{\sigma_i^2} - 1\right)^2\left(\min_{i\le n}\frac{\hat\sigma_i^4}{\sigma_i^4}\right)^{-1} = o_P(1),$$

$$\sum_{i=1}^n\left(\frac{\sigma_i^4}{\hat\sigma_i^4} - 1\right)^2 \le \sum_{i=1}^n\left(\frac{\hat\sigma_i^2}{\sigma_i^2} - 1\right)^2\left(\max_{i\le n}\frac{\hat\sigma_i^2}{\sigma_i^2} + 1\right)^2\left(\min_{i\le n}\frac{\hat\sigma_i^8}{\sigma_i^8}\right)^{-1} = o_P(1).$$

Since the averages (1/n)Σ_{i=1}^n (X_i^{(T)})^4 and (1/n)Σ_{i=1}^n (J_i^{(T)})² are bounded in probability, the Cauchy–Schwarz inequality yields that the leading remainder terms due to r^X_{i,nT} and r^J_{i,nT} are negligible:

$$\left|\frac{1}{\sqrt{n}}\sum_{i=1}^n r^X_{i,nT}\right| \le \sqrt{\sum_{i=1}^n\left(\frac{\sigma_i^4}{\hat\sigma_i^4} - 1\right)^2\;\frac{1}{n}\sum_{i=1}^n (X_i^{(T)})^4} = o_P(1),$$

$$\left|\frac{1}{\sqrt{n}}\sum_{i=1}^n r^J_{i,nT}\right| \le \sqrt{\sum_{i=1}^n\left(\frac{\sigma_i^2}{\hat\sigma_i^2} - 1\right)^2\;\frac{1}{n}\sum_{i=1}^n (J_i^{(T)})^2} = o_P(1).$$

Finally, we show that the remainder terms due to not observing the initial observations Y_{i0} are negligible. Using (a + b)² ≤ 2a² + 2b² and Cauchy–Schwarz, we have

$$\left|\frac{1}{\sqrt{n}}\sum_{i=1}^n \frac{\sigma_i^4}{\hat\sigma_i^4}\,r^a_{i,T}\right| \le 2\max_{i\le n}\frac{\sigma_i^4}{\hat\sigma_i^4}\left[\sum_{i=1}^n \epsilon_{i1}^2\left(\frac{1}{T}\sum_{t=2}^{T}\epsilon_{it}\right)^2 + \sqrt{\sum_{i=1}^n \epsilon_{i1}^2\left(\frac{1}{T}\sum_{t=2}^{T}\epsilon_{it}\right)^2\;\frac{1}{n}\sum_{i=1}^n\left(\frac{1}{T}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\epsilon_{is}\epsilon_{it}\right)^2}\;\right],$$

$$\left|\frac{1}{\sqrt{n}}\sum_{i=1}^n \frac{\sigma_i^2}{\hat\sigma_i^2}\,r^b_{i,T}\right| \le 2\max_{i\le n}\frac{\sigma_i^2}{\hat\sigma_i^2}\left[\frac{1}{\sqrt{n}\,T}\sum_{i=1}^n \epsilon_{i1}^2 + \sqrt{\frac{1}{T}\sum_{i=1}^n \epsilon_{i1}^2\;\frac{1}{n}\sum_{i=1}^n\left(\frac{1}{T\sqrt{T}}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\epsilon_{is}\right)^2}\;\right].$$

To obtain the desired negligibility of these two remaining terms, observe (take expectations and note the similarity to the proofs of the LAN theorem)

$$\sum_{i=1}^n \epsilon_{i1}^2\left(\frac{1}{T}\sum_{t=2}^{T}\epsilon_{it}\right)^2 = o_P(1), \qquad \frac{1}{n}\sum_{i=1}^n\left(\frac{1}{T}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\epsilon_{is}\epsilon_{it}\right)^2 = O_P(1),$$

$$\frac{1}{T}\sum_{i=1}^n \epsilon_{i1}^2 = o_P(1), \qquad \frac{1}{n}\sum_{i=1}^n\left(\frac{1}{T\sqrt{T}}\sum_{t=3}^{T}\sum_{s=2}^{t-1}\epsilon_{is}\right)^2 = O_P(1). \qquad \Box$$


A.3. Proof of Lemma 3.1

Note that the first part of statement (i) and the second part of statement (ii) follow from Proposition 2.1 and Proposition 4.2 of Becheri et al. (2015), respectively. The other two statements can also be obtained by a straightforward application of Le Cam's third lemma. To obtain the appropriate shifts under the local alternatives, we calculate the covariance between the central sequences in both set-ups (see also Appendix A.1):

$$\operatorname{Cov}\left(\frac{X_1^2 - J_1}{2},\, X_1\right) = E\left[\frac{X_1^3 - 3X_1 J_1}{6}\right] + \frac{1}{3}\,E X_1^3 = \mu_3 + \frac{1}{3}\,E X_1^3 = \frac{1}{3},$$

since μ_3 = 0 (see Appendix A.1) and EX_1³ = 1.

To compute the distribution of τ_{n,T} under P_h^{(n,T)}, we need to consider the (asymptotic) covariance between τ_{n,T} and the log-likelihood ratio log dP_h^{(n,T)}/dP_0^{(n,T)}. Since the central sequence Δ_{n,T} is multiplied by h² and the τ_{n,T}-statistic has a factor √2 in front of the X_i, the shift under the local alternatives P_h^{(n,T)} is h²√2/3: τ_{n,T} →d N(h²√(2/9), 1).

Similarly, we compute the distribution of t_{n,T} under Q_h^{(n,T)}. To obtain the covariance between t_{n,T} and the log-likelihood ratio log dQ_h^{(n,T)}/dQ_0^{(n,T)}, note that, in the quadratic expansion of log dQ_h^{(n,T)}/dQ_0^{(n,T)} from Proposition 4.2 of Becheri et al. (2015), the central sequence is multiplied by h while the t_{n,T}-statistic has a factor √(8/5). Hence, under Q_h^{(n,T)}, we obtain a shift of h√(8/5)/3: t_{n,T} →d N(h√(8/45), 1).

This completes the proof of the lemma. □

References

Becheri, I.G., Drost, F.C., van den Akker, R., 2015. Asymptotically UMP panel unit root tests—the effect of heterogeneity in the alternatives. Econometric Theory 31, 539–559.
Gumbel, E.J., 1954. The maxima of the mean largest value and of the range. Ann. Math. Statist. 25, 76–84.
Moon, H.R., Perron, B., Phillips, P.C.B., 2007. Incidental trends and the power of panel unit root tests. J. Econometrics 141, 416–459.
Serfling, R.J., 1980. Approximation Theorems of Mathematical Statistics. John Wiley & Sons.
van der Vaart, A.W., 2000. Asymptotic Statistics. Cambridge University Press.