Journal of Statistical Planning and Inference 133 (2005) 359–379
www.elsevier.com/locate/jspi

Higher-order comparisons of asymptotic confidence intervals

Yoshihiko Maesono*
Faculty of Economics, Kyushu University, Hakozaki 6-19-1, Fukuoka 812-8581, Japan

Received 21 July 2002; accepted 4 February 2004; available online 11 June 2004

Abstract

This paper discusses properties of asymptotic confidence intervals based on a normalizing transformation and on a Cornish–Fisher inversion of a studentized statistic. The normalizing transformation discussed herein removes bias, skewness and kurtosis. The Cornish–Fisher inversion is obtained from the Edgeworth expansion of the studentized statistic with residual term $o(n^{-1})$. Asymptotic mean-squared errors and simulation results are also discussed.

© 2004 Elsevier B.V. All rights reserved.

MSC: primary 62G20; secondary 62E20

Keywords: Asymptotic U-statistics; Confidence interval; Cornish–Fisher inversion; Mean-squared error; Normalizing transformation

1. Introduction

Let $X_1, X_2, \ldots, X_n$ be independently and identically distributed random vectors (possibly taking values in any measurable space) with distribution function $F(x)$, and let $\hat\theta = \hat\theta(X_1, \ldots, X_n)$ be an estimator of a parameter $\theta$. Usually, the statistic $\sqrt{n}(\hat\theta - \theta)$ is asymptotically normal with mean zero and variance $\sigma^2$; more practically, substituting a consistent estimator $\hat\sigma$ for $\sigma$, we have
$$P\{\sqrt{n}(\hat\theta - \theta)/\hat\sigma \le z\} = \Phi(z) + O(n^{-1/2}),$$

* Corresponding author. Tel.: +81-92-6422485; fax: +81-92-6422512.

E-mail address: [email protected] (Yoshihiko Maesono). 0378-3758/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2004.02.009


where $\Phi(z)$ is the distribution function of the standard normal $N(0,1)$. Several methods have been proposed to improve the approximation of the distribution of $\hat\theta$: Edgeworth expansions, saddle-point approximations, normalizing transformations, etc. If we have an Edgeworth expansion of the studentized statistic, we can construct an asymptotic confidence interval which improves the convergence rate using a Cornish–Fisher inversion. On the other hand, Fujioka and Maesono (2000) have obtained a normalizing transformation $\psi(s)$ that satisfies
$$P\{\psi(\sqrt{n}(\hat\theta - \theta)/\hat\sigma) \le z\} = \Phi(z) + o(n^{-1}).$$
Here $\psi(s)$ is a polynomial and monotone function of $s$. This paper compares confidence intervals based on the normalizing transformation $\psi(s)$ and on the Edgeworth expansion of the studentized statistic. Using the H-decomposition of Hoeffding (1961), we obtain asymptotic representations of the confidence bounds and discuss their asymptotic mean-squared errors. Section 2 reviews the normalizing transformation and the Cornish–Fisher inversion and discusses the confidence bounds of the two methods. In Section 3, asymptotic representations of the confidence bounds are obtained using the H-decomposition, together with their asymptotic mean-squared errors and the convergence rates of the coverage probabilities. Section 4 examines an application to U-statistics. Section 5 considers examples both theoretically and numerically.

2. Normalizing transformation and Cornish–Fisher inversion

Let us assume that the studentized statistic $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$ has the following asymptotic representation:
$$\frac{\sqrt{n}(\hat\theta - \theta)}{\hat\sigma} = n^{-1/2}\lambda + V_n + R_{n,1}, \qquad (A1)$$

where $\hat\sigma^2$ is a consistent estimator of $\sigma^2 = n\,\mathrm{Var}(\hat\theta)$ and
$$V_n = n^{-1/2}\sum_{i=1}^{n} g_1(X_i) + n^{-3/2}\sum_{i=1}^{n} \tilde g_1(X_i) + n^{-3/2}\sum_{1\le i<j\le n} g_2(X_i,X_j) + n^{-5/2}\sum_{1\le i<j<k\le n} g_3(X_i,X_j,X_k).$$
Here $g_2$ and $g_3$ are symmetric functions that are invariant under permutation of their arguments; $g_1$, $g_2$, $g_3$ and $\tilde g_1$ may depend on the parameter $\theta$, and $\lambda$ is a bias. Furthermore, we assume that
$$E[g_1(X_1)] = E[\tilde g_1(X_1)] = 0, \qquad E[g_1^2(X_1)] = 1, \qquad (A2)$$
$$E[g_2(X_1,X_2)\,|\,X_1] = E[g_3(X_1,X_2,X_3)\,|\,X_1,X_2] = 0 \quad \text{a.s.} \qquad (A3)$$
and
$$P\{|R_{n,1}| \ge n^{-1}(\log n)^{-3/2}\} = o(n^{-1}). \qquad (A4)$$


Hereafter we will use the single symbol $R_{n,1}$ for a remainder that may differ from one appearance to the next but always satisfies (A4). Similarly, we use another type of remainder $R_{n,2}$ which satisfies
$$P\{|R_{n,2}| \ge n^{-1}\varepsilon_n\} = o(n^{-1})$$
for a non-random sequence $\varepsilon_n \to 0$ (as $n \to \infty$). When we discuss asymptotic expansions up to the order $n^{-1}$, we can ignore $R_{n,1}$ and $R_{n,2}$. Using the von Mises expansion or the H-decomposition, we can show that many interesting studentized statistics satisfy assumptions (A1)–(A4). In Section 4, we will examine whether the studentized U-statistic satisfies these conditions. We also assume the moment condition
$$E\{|g_1(X_1)|^4 + |\tilde g_1(X_1)|^3 + |g_3(X_1,X_2,X_3)|^4\} < \infty. \qquad (M1)$$

Lai and Wang (1993) call $V_n$ an asymptotic U-statistic and have obtained its Edgeworth expansion with remainder term $o(n^{-1})$. Modifying their result, we obtain the Edgeworth expansion of $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$. Let us define
$$\eta^2 = E[g_2^2(X_1,X_2)], \qquad \lambda_1 = E[g_1(X_1)\tilde g_1(X_1)], \qquad m_1 = E[g_1^3(X_1)],$$
$$m_2 = E[g_1(X_1)g_1(X_2)g_2(X_1,X_2)], \qquad m_3 = E[g_1^4(X_1)],$$
$$m_4 = E[g_1^2(X_1)g_1(X_2)g_2(X_1,X_2)], \qquad m_5 = E[g_1(X_1)g_1(X_2)g_2(X_1,X_3)g_2(X_2,X_3)],$$
$$m_6 = E[g_1(X_1)g_1(X_2)g_1(X_3)g_3(X_1,X_2,X_3)],$$
$$\kappa_3 = m_1 + 3m_2, \qquad \kappa_4 = m_3 - 3 + 12m_4 + 12m_5 + 4m_6,$$
$$P_1(z) = \lambda + \frac{\kappa_3}{6}(z^2 - 1)$$
and
$$P_2(z) = \left(\frac{\eta^2}{2} + \lambda_1 + \frac{\lambda^2}{2}\right)z + \left(\frac{\kappa_4}{24} + \frac{\lambda\kappa_3}{6}\right)(z^3 - 3z) + \frac{\kappa_3^2}{72}(z^5 - 10z^3 + 15z).$$

For the validity of the Edgeworth expansion, we assume the following condition:

(C): There exist constants $c_v$ and Borel functions $\omega_v: R \to R$ such that $E[\omega_v(X_1)] = 0$, $E|\omega_v(X_1)|^d < \infty$ for some $d \ge 5$ and $g_2(X_1,X_2) = \sum_{v=1}^{K} c_v\,\omega_v(X_1)\omega_v(X_2)$ a.s.; moreover, for some $0 < \delta < \min\{1, 2(1 - 11/(3d))\}$,
$$\limsup_{|t|\to\infty}\ \sup_{|s_1|+\cdots+|s_K| \le |t|^{-\delta}} \left| E\exp\Big(it\Big\{g_1(X_1) + \sum_{v=1}^{K} s_v\,\omega_v(X_1)\Big\}\Big) \right| < 1.$$

Then, from Lai and Wang (1993), we have the following lemma:

Lemma 1. Assume (A1)–(A4) and the moment condition (M1). If
$$\limsup_{|t|\to\infty} |E[\exp\{it g_1(X_1)\}]| < 1$$


and the condition (C) is satisfied, we have
$$P\left\{\frac{\sqrt{n}(\hat\theta - \theta)}{\hat\sigma} \le z\right\} = \Phi(z) - n^{-1/2}\phi(z)P_1(z) - n^{-1}\phi(z)P_2(z) + o(n^{-1}),$$
where $\phi(z)$ is the density function of the standard normal distribution.

Lai and Wang (1993) have obtained another condition on $g_2(x,y)$ that ensures the validity of the asymptotic expansion. The Cornish–Fisher inversion of the $\alpha$-quantile is given as
$$z_\alpha + n^{-1/2}P_1(z_\alpha) + n^{-1}P_3(z_\alpha),$$
where $z_\alpha$ is the $\alpha$-quantile of the standard normal distribution $N(0,1)$ and
$$P_3(z_\alpha) = \frac{\kappa_4}{24}(z_\alpha^3 - 3z_\alpha) - \frac{\kappa_3^2}{36}(2z_\alpha^3 - 5z_\alpha) + \left(\lambda_1 + \frac{\eta^2}{2}\right)z_\alpha.$$
Let $\hat\lambda$, $\hat\kappa_3$, $\hat\kappa_4$, $\hat\lambda_1$ and $\hat\eta^2$ be estimators of $\lambda$, $\kappa_3$, $\kappa_4$, $\lambda_1$ and $\eta^2$. Substituting these estimators into $P_1(z_\alpha)$ and $P_3(z_\alpha)$, we have an estimator of the Cornish–Fisher inversion
$$Q_{Stu}(\alpha) = z_\alpha + n^{-1/2}\hat P_1(z_\alpha) + n^{-1}\hat P_3(z_\alpha). \qquad (1)$$
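As a computational sketch (ours, not from the paper), the plug-in Cornish–Fisher quantile (1) can be evaluated directly from the formulas for $P_1$ and $P_3$ above; the arguments `lam, k3, k4, lam1, eta2` stand for the estimates $\hat\lambda, \hat\kappa_3, \hat\kappa_4, \hat\lambda_1, \hat\eta^2$, which are assumed to be supplied:

```python
from statistics import NormalDist

def q_stu(alpha, n, lam, k3, k4, lam1, eta2):
    """Cornish-Fisher quantile approximation (1):
    z_a + n^{-1/2} P1(z_a) + n^{-1} P3(z_a),
    transcribing P1 and P3 as displayed in the text."""
    z = NormalDist().inv_cdf(alpha)
    p1 = lam + k3 * (z ** 2 - 1) / 6.0
    p3 = (k4 / 24.0) * (z ** 3 - 3 * z) \
         - (k3 ** 2 / 36.0) * (2 * z ** 3 - 5 * z) \
         + (lam1 + eta2 / 2.0) * z
    return z + p1 / n ** 0.5 + p3 / n
```

With all cumulant corrections equal to zero, the approximation reduces to the normal quantile $z_\alpha$, as it should.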

On the other hand, Fujioka and Maesono (2000) have proposed a normalizing transformation which simultaneously removes bias, skewness and kurtosis. To remove the bias and skewness we must estimate the parameters
$$p = -\frac{\kappa_3}{6} \qquad \text{and} \qquad q = \frac{\kappa_3}{6} - \lambda.$$
Using the estimators $\hat p$ and $\hat q$, Hall (1992) proposed the monotone transformation $\psi_1(\cdot)$, which removes the bias and skewness. Its asymptotic inversion is given by
$$\tilde\psi_1^{-1}(s) = s - n^{-1/2}(\hat p s^2 + \hat q). \qquad (2)$$
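The following sketch (ours) illustrates the relationship between $\psi_1$ and its asymptotic inversion (2). The cubic form of the forward transformation used here is the monotone correction as we recall it from Hall (1992); only the inversion (2) is displayed in the text, so the forward map below should be read as an assumption:

```python
def psi1(s, n, p, q):
    # Hall-type monotone bias/skewness correction (cubic form; assumed here,
    # since the text displays only the inversion).
    return s + (p * s ** 2 + q) / n ** 0.5 + (p ** 2 * s ** 3) / (3.0 * n)

def psi1_inv(s, n, p, q):
    # Asymptotic inversion (2): s - n^{-1/2}(p s^2 + q).
    return s - (p * s ** 2 + q) / n ** 0.5
```

Composing the two maps returns the starting point up to an $O(n^{-1})$ error, which is exactly the sense in which (2) is an "asymptotic" inverse.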

Further, Fujioka and Maesono (2000) obtained a higher-order transformation which also removes the kurtosis. We assume that
$$n^{-1/2}\hat p = n^{-1/2}p + n^{-3/2}\sum_{i=1}^{n} p_1(X_i) + R_{n,1}, \qquad (A5)$$
$$n^{-1/2}\hat q = n^{-1/2}q + n^{-3/2}\sum_{i=1}^{n} q_1(X_i) + R_{n,1} \qquad (A6)$$
and
$$E[p_1(X_1)] = E[q_1(X_1)] = 0. \qquad (A7)$$

Let us define
$$\delta_3 = E[g_1(X_1)p_1(X_1)], \qquad \delta_4 = E[g_1(X_1)q_1(X_1)],$$
$$u = \frac18 - \frac73 p^2 - pm_1 - 3pm_2 - \frac{1}{24}m_3 - \frac12 m_4 - \frac12 m_5 - \frac16 m_6 - \delta_3$$
and
$$v = -\frac38 - 2p\lambda + 5p^2 + 2pm_1 + 6pm_2 + \frac18 m_3 + \frac32 m_4 + \frac32 m_5 + \frac12 m_6 - \lambda_1 - \frac{\eta^2}{2} - \delta_4.$$

Using estimators $\hat u$ and $\hat v$ of $u$ and $v$, Fujioka and Maesono (2000) proposed the monotone transformation
$$T_n = \psi_2(S_n) = \psi_2\left(\psi_1\left(\frac{\sqrt{n}(\hat\theta - \theta)}{\hat\sigma}\right)\right),$$
which removes bias, skewness and kurtosis simultaneously. Under some regularity conditions, Fujioka and Maesono (2000) showed that
$$P(T_n \le x) = \Phi(x) + o(n^{-1}). \qquad (3)$$
Using the perturbation method, Barndorff-Nielsen and Cox (1989) and Fujioka and Maesono (2000) obtained an asymptotic inversion $S_n \approx \tilde\psi_2^{-1}(T_n)$, where
$$\tilde\psi_2^{-1}(t) = t - n^{-1}\{\hat v t + \hat u t^3\}. \qquad (4)$$
As with $\hat p$ and $\hat q$, we assume that $\hat u$ and $\hat v$ satisfy
$$n^{-1}\hat u = n^{-1}u + R_{n,1} \qquad (A8)$$
and
$$n^{-1}\hat v = n^{-1}v + R_{n,1}, \qquad (A9)$$
and the moment condition
$$E\{|g_1(X_1)|^5 + |\tilde g_1(X_1)|^3 + |g_2(X_1,X_2)|^5 + |g_3(X_1,X_2,X_3)|^5\} < \infty. \qquad (M2)$$
Combining the approximations $\tilde\psi_1^{-1}$ and $\tilde\psi_2^{-1}$ in (2) and (4), we obtain the following approximation of the $\alpha$-quantile of the distribution of $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$:
$$Q_{Nor}(\alpha) = z_\alpha - n^{-1/2}(\hat p z_\alpha^2 + \hat q) + n^{-1}\left\{\left(\tfrac53\hat p^2 - \hat u\right)z_\alpha^3 + (2\hat p\hat q - \hat v)z_\alpha\right\}. \qquad (5)$$
The above discussion yields one-sided asymptotic confidence intervals with nominal level $100(1-\alpha)\%$ and lower bounds
$$\hat\theta - n^{-1/2}\hat\sigma Q_{Nor}(\alpha) \qquad \text{and} \qquad \hat\theta - n^{-1/2}\hat\sigma Q_{Stu}(\alpha).$$
We will compare $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$ both theoretically and numerically.
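For concreteness, a small sketch (ours) of the quantile approximation (5) and the resulting one-sided confidence bound; `p, q, u, v` stand for the plug-in estimates $\hat p, \hat q, \hat u, \hat v$, which are assumed given:

```python
from statistics import NormalDist

def q_nor(alpha, n, p, q, u, v):
    """Quantile approximation (5) from the normalizing transformation."""
    z = NormalDist().inv_cdf(alpha)
    return z - (p * z ** 2 + q) / n ** 0.5 \
             + ((5.0 / 3.0 * p ** 2 - u) * z ** 3 + (2 * p * q - v) * z) / n

def lower_bound(theta_hat, sigma_hat, n, q_alpha):
    """One-sided bound: theta_hat - n^{-1/2} * sigma_hat * Q(alpha)."""
    return theta_hat - sigma_hat * q_alpha / n ** 0.5
```

The same `lower_bound` function applies to both methods; only the quantile approximation $Q_{Nor}(\alpha)$ or $Q_{Stu}(\alpha)$ changes.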


3. Asymptotic representations of confidence bounds and mean-squared errors

Using the H-decomposition, we obtain the following asymptotic representations.

Theorem 1. Assume (A1)–(A9), the moment condition (M2), $E|p_1(X_1)|^{2+\delta} < \infty$ and $E|q_1(X_1)|^{2+\delta} < \infty$ for some $\delta > 0$. For $z_\alpha = O(1)$, we get
$$Q_{Nor}(\alpha) = z_\alpha + a(\alpha) - n^{-3/2}\sum_{i=1}^{n}\{z_\alpha^2 p_1(X_i) + q_1(X_i)\} + R_{n,1}$$
and
$$Q_{Stu}(\alpha) = z_\alpha + b(\alpha) - n^{-3/2}\sum_{i=1}^{n}\{z_\alpha^2 p_1(X_i) + q_1(X_i)\} + R_{n,1},$$
where
$$a(\alpha) = -n^{-1/2}(pz_\alpha^2 + q) + n^{-1}\left\{\left(\tfrac53 p^2 - u\right)z_\alpha^3 + (2pq - v)z_\alpha\right\}$$
and
$$b(\alpha) = -n^{-1/2}(pz_\alpha^2 + q) + n^{-1}P_3(z_\alpha).$$

Proof. See Appendix.

Subsequently, using the asymptotic representations of $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$, we can obtain the convergence rates of the coverage probabilities. Let us define
$$\Gamma(x,y) = g_2(x,y) + 2p\,g_1(x)g_1(y).$$
Then we have the following theorem:

Theorem 2. Assume that (A1)–(A9), the moment condition (M2), $E|p_1(X_1)|^{2+\delta} < \infty$ and $E|q_1(X_1)|^{2+\delta} < \infty$ (some $\delta > 0$) all hold. Moreover, assume that $\Gamma(x,y)$ satisfies the condition (C) and
$$\limsup_{|t|\to\infty} |E[\exp\{it g_1(X_1)\}]| < 1.$$

Then, for $z_\alpha = O(1)$, we have
$$P\left\{\frac{\sqrt{n}(\hat\theta - \theta)}{\hat\sigma} \le Q_{Nor}(\alpha)\right\} = \alpha + o(n^{-1})$$
and
$$P\left\{\frac{\sqrt{n}(\hat\theta - \theta)}{\hat\sigma} \le Q_{Stu}(\alpha)\right\} = \alpha - n^{-1}(\delta_3 z_\alpha^3 + \delta_4 z_\alpha)\phi(z_\alpha) + o(n^{-1}).$$

Proof. See Appendix.


Because $Q_{Nor}(\alpha)$ is an asymptotic inversion of the normalizing transformation $\psi_2(\psi_1(\cdot))$, $Q_{Nor}(\alpha)$ is better than $Q_{Stu}(\alpha)$ in the sense of accuracy of the coverage probability. The difference between $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$ is small: by direct computation we can show that
$$\left(\tfrac53 p^2 - u\right)z_\alpha^3 + (2pq - v)z_\alpha = P_3(z_\alpha) + \delta_3 z_\alpha^3 + \delta_4 z_\alpha,$$
implying that
$$a(\alpha) = b(\alpha) + n^{-1}(\delta_3 z_\alpha^3 + \delta_4 z_\alpha).$$
Note that $\delta_3$ and $\delta_4$ depend on $p_1(\cdot)$ and $q_1(\cdot)$; $Q_{Stu}(\alpha)$ ignores these terms, so $b(\alpha)$ does not include $\delta_3$ and $\delta_4$.

If we know the exact distribution of the studentized statistic $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$, we can obtain asymptotic mean-squared errors of $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$. Let $t_\alpha$ be the $\alpha$-point of the distribution of $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$. A one-sided confidence interval with nominal level $100(1-\alpha)\%$ has lower bound
$$\hat\theta - n^{-1/2}\hat\sigma t_\alpha.$$
Thus the asymptotic mean-squared errors of $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$ are
$$\mathrm{AMSE}(Q_{Nor}(\alpha)) = \{z_\alpha + a(\alpha) - t_\alpha\}^2 + n^{-2}\{z_\alpha^4 E[p_1^2(X_1)] + 2z_\alpha^2 E[p_1(X_1)q_1(X_1)] + E[q_1^2(X_1)]\}$$
and
$$\mathrm{AMSE}(Q_{Stu}(\alpha)) = \{z_\alpha + b(\alpha) - t_\alpha\}^2 + n^{-2}\{z_\alpha^4 E[p_1^2(X_1)] + 2z_\alpha^2 E[p_1(X_1)q_1(X_1)] + E[q_1^2(X_1)]\}.$$
The difference of these asymptotic mean-squared errors is
$$\mathrm{AMSE}(Q_{Nor}(\alpha)) - \mathrm{AMSE}(Q_{Stu}(\alpha)) = 2n^{-1}(\delta_3 z_\alpha^3 + \delta_4 z_\alpha)[z_\alpha + b(\alpha) - t_\alpha] + n^{-2}(\delta_3 z_\alpha^3 + \delta_4 z_\alpha)^2.$$
Because $z_\alpha + b(\alpha)$ is the Cornish–Fisher inversion of the distribution of $\sqrt{n}(\hat\theta - \theta)/\hat\sigma$, we can show that, for $z_\alpha = O(1)$,
$$z_\alpha + b(\alpha) - t_\alpha = o(n^{-1}).$$
Thus, we obtain the following theorem:

Theorem 3. Under the same conditions as Theorem 2, we have
$$\mathrm{AMSE}(Q_{Nor}(\alpha)) - \mathrm{AMSE}(Q_{Stu}(\alpha)) = n^{-2}(\delta_3 z_\alpha^3 + \delta_4 z_\alpha)^2 + o(n^{-2}).$$
Therefore, in the sense of the asymptotic mean-squared error, $Q_{Stu}(\alpha)$ is always better than $Q_{Nor}(\alpha)$. The simulation study in Section 5 will verify this.


4. Application to U-statistics

This section addresses the case of U-statistics. For a symmetric kernel $h(x_1,\ldots,x_r)$, the U-statistic of degree $r$ is given as
$$U_n = \binom{n}{r}^{-1}\sum_{1\le i_1<\cdots<i_r\le n} h(X_{i_1}, X_{i_2}, \ldots, X_{i_r}).$$
$U_n$ is an unbiased estimator of $\theta = E(U_n) = E[h(X_1,\ldots,X_r)]$; we will consider the confidence interval of $\theta$. In view of the H-decomposition, let us define
$$h_1(x) = E[h(x, X_2, \ldots, X_r)] - \theta,$$
$$h_2(x,y) = E[h(x, y, X_3, \ldots, X_r)] - \theta - h_1(x) - h_1(y)$$
and
$$h_3(x,y,z) = E[h(x,y,z,X_4,\ldots,X_r)] - \theta - h_2(x,y) - h_2(x,z) - h_2(y,z) - h_1(x) - h_1(y) - h_1(z).$$
These functions may depend on the parameter $\theta$. Using the above functions, Maesono (1997) obtained an asymptotic representation of the studentized U-statistic. Let us define

$$\hat\sigma^2 = (n-1)\sum_{i=1}^{n}(U_n^{(i)} - U_n)^2,$$
where $U_n^{(i)}$ denotes the U-statistic computed from the sample of $n-1$ points with $X_i$ left out; thus $\hat\sigma^2$ is a jackknife estimator of $n\,\mathrm{Var}(U_n)$. Maesono (1997) proved that if $E|h(X_1,\ldots,X_r)|^9 < \infty$ and $E[h_1^2(X_1)] = \sigma_1^2 > 0$, the studentized U-statistic $\sqrt{n}(U_n - \theta)/\hat\sigma$ satisfies conditions (A1)–(A4). From Fujioka and Maesono (2000), we have that
$$\lambda = -\frac{e_1}{2\sigma_1^3} - \frac{e_2}{\sigma_1^3}, \qquad p = \frac{e_1}{3\sigma_1^3} + \frac{e_2}{2\sigma_1^3} \qquad \text{and} \qquad q = \frac{e_1}{6\sigma_1^3} + \frac{e_2}{2\sigma_1^3}, \qquad (6)$$

where
$$e_1 = E[h_1^3(X_1)] \qquad \text{and} \qquad e_2 = (r-1)E[h_1(X_1)h_1(X_2)h_2(X_1,X_2)].$$
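The jackknife variance estimator $\hat\sigma^2$ above is straightforward to compute; the following sketch (ours) implements it for a degree-2 kernel by brute force over leave-one-out samples:

```python
from itertools import combinations

def u_stat(xs, h):
    """Degree-2 U-statistic with symmetric kernel h."""
    pairs = list(combinations(xs, 2))
    return sum(h(a, b) for a, b in pairs) / len(pairs)

def jackknife_var(xs, h):
    """sigma_hat^2 = (n-1) * sum_i (U_n^{(i)} - U_n)^2,
    the jackknife estimate of n * Var(U_n) displayed in the text."""
    n = len(xs)
    un = u_stat(xs, h)
    total = 0.0
    for i in range(n):
        loo = xs[:i] + xs[i + 1:]  # sample with X_i left out
        total += (u_stat(loo, h) - un) ** 2
    return (n - 1) * total
```

As a sanity check, for the kernel $h(a,b) = (a+b)/2$ the U-statistic is the sample mean, and $\hat\sigma^2$ reduces to the usual unbiased sample variance.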

Based on an idea of Hinkley and Wei (1984), Fujioka and Maesono (2000) proposed jackknife estimators $\hat\sigma_1$, $\hat e_1$ and $\hat e_2$ of $\sigma_1$, $e_1$ and $e_2$. Substituting these estimators into $\lambda$, $p$ and $q$ in (6), we obtain estimators $\hat\lambda$, $\hat p$ and $\hat q$. Maesono (1998, Theorems 3.2 and 3.3) obtained asymptotic representations of jackknife estimators of the third cumulants $e_1/\sigma_1^3 + 3e_2/\sigma_1^3$ and $-2e_1/\sigma_1^3 - 3e_2/\sigma_1^3$ of the standardized and the studentized U-statistics when $r = 2$. Modifying the result of Maesono (1998), we can show that the estimators $\hat p$ and $\hat q$ satisfy (A5) and (A6); the functions $p_1(x)$ and $q_1(x)$ are given in Fujioka and Maesono (2000). For the second step, let us define
$$\sigma_2^2 = (r-1)^2 E[h_2^2(X_1,X_2)], \qquad e_3 = E[h_1^4(X_1)], \qquad e_4 = (r-1)E[h_1^2(X_1)h_1(X_2)h_2(X_1,X_2)],$$
$$e_5 = (r-1)^2 E[h_1(X_1)h_1(X_2)h_2(X_1,X_3)h_2(X_2,X_3)]$$


and
$$e_6 = (r-1)(r-2)E[h_1(X_1)h_1(X_2)h_1(X_3)h_3(X_1,X_2,X_3)].$$
Then it follows from Fujioka and Maesono (2000) that
$$u = \frac12 + \frac{11e_1^2}{27\sigma_1^6} + \frac{11e_1e_2}{9\sigma_1^6} + \frac{11e_2^2}{12\sigma_1^6} - \frac{e_3}{4\sigma_1^4} - \frac{e_4}{2\sigma_1^4} - \frac{e_5}{2\sigma_1^4} + \frac{e_6}{3\sigma_1^4}$$
and
$$v = 1 + \frac{\sigma_2^2}{4\sigma_1^2} + \frac{31e_1^2}{72\sigma_1^6} + \frac{7e_2^2}{6\sigma_1^6} + \frac{11e_1e_2}{4\sigma_1^6} - \frac{5e_3}{12\sigma_1^4} - \frac{5e_4}{2\sigma_1^4} - \frac{3e_5}{2\sigma_1^4}.$$
Because the parameters $\sigma_2^2, e_3, \ldots, e_6$ also depend on $F(\cdot)$ and $h(x_1,\ldots,x_r)$, we must estimate them. As for $\hat\sigma_1$, $\hat e_1$ and $\hat e_2$, Fujioka and Maesono (2000) proposed jackknife estimators $\hat\sigma_2^2, \hat e_3, \ldots, \hat e_6$. Substituting $\hat\sigma_1^2$, $\hat\sigma_2^2$, $\hat e_1, \ldots, \hat e_6$ into $u$ and $v$, we obtain estimators $\hat u$ and $\hat v$. Using the same method as Maesono (1998), it is possible to prove that $\hat u$ and $\hat v$ satisfy (A8) and (A9) when $E|h(X_1,\ldots,X_r)|^9 < \infty$. Combining the above results, we can construct a normalizing transformation that removes the bias, skewness and kurtosis, and $\hat T_n^* = \psi_2(\psi_1(\sqrt{n}(U_n - \theta)/\hat\sigma))$ satisfies Eq. (3). The approximation of the $\alpha$-quantile based on the normalizing transformation is given by (5). On the other hand, applying Lemma 1, Maesono (1997) obtained the Edgeworth expansion of the studentized U-statistic. It follows from Maesono (1997) that

$$\kappa_3 = -\frac{2e_1}{\sigma_1^3} - \frac{3e_2}{\sigma_1^3}$$
and
$$\kappa_4 = 12 + \frac{12e_1^2}{\sigma_1^6} + \frac{36e_2^2}{\sigma_1^6} + \frac{42e_1e_2}{\sigma_1^6} - \frac{2e_3}{\sigma_1^4} - \frac{24e_4}{\sigma_1^4} - \frac{12e_5}{\sigma_1^4} - \frac{8e_6}{\sigma_1^4}.$$
Maesono (1997) also gives the corresponding expressions for $\lambda_1$ and $\eta^2$ in terms of $\sigma_2^2$ and $e_1, \ldots, e_6$. Under some regularity conditions, Maesono (1997) showed that $\sqrt{n}(U_n - \theta)/\hat\sigma$ satisfies Lemma 1. Substituting the estimators $\hat\sigma_1^2$, $\hat\sigma_2^2$, $\hat e_1, \ldots, \hat e_6$ into $\lambda_1$, $\eta^2$, $\kappa_3$ and $\kappa_4$, we obtain estimators $\hat\lambda_1$, $\hat\eta^2$, $\hat\kappa_3$ and $\hat\kappa_4$, and hence the approximation $Q_{Stu}(\alpha)$ in (1).

5. Examples We will discuss the cases of the sample mean and variance.


Example 1. We consider the confidence interval of $\theta = E(X_i)$ based on the sample mean
$$\hat\theta = \bar X = \frac1n\sum_{i=1}^{n} X_i.$$
The sample mean is a U-statistic with kernel $h(x) = x$, so the results for U-statistics apply. Let us define
$$\mu_k = E[(X_1 - \mu)^k], \qquad k = 2, 3, 4.$$
Then we have
$$\sigma_1^2 = \mu_2, \qquad \sigma_2^2 = 0, \qquad e_1 = \mu_3, \qquad e_3 = \mu_4, \qquad e_2 = e_4 = e_5 = e_6 = 0,$$
$$p = \frac{\mu_3}{3\mu_2^{3/2}}, \qquad q = \frac{\mu_3}{6\mu_2^{3/2}},$$
$$u = \frac12 + \frac{11\mu_3^2}{27\mu_2^3} - \frac{\mu_4}{4\mu_2^2}$$
and
$$v = 1 + \frac{31\mu_3^2}{72\mu_2^3} - \frac{5\mu_4}{12\mu_2^2}.$$
From Fujioka and Maesono (2000), if $X_1$ has a continuous distribution and $E|X_1|^9 < \infty$, the sample mean satisfies the condition (C) and the other conditions of Theorems 1–3. Direct computation yields
$$Q_{Nor}(\alpha) = z_\alpha - n^{-1/2}\left(\frac{\mu_3}{3\mu_2^{3/2}}z_\alpha^2 + \frac{\mu_3}{6\mu_2^{3/2}}\right) + n^{-1}\left\{\left(-\frac{2\mu_3^2}{9\mu_2^3} - \frac12 + \frac{\mu_4}{4\mu_2^2}\right)z_\alpha^3 + \left(-1 - \frac{23\mu_3^2}{72\mu_2^3} + \frac{5\mu_4}{12\mu_2^2}\right)z_\alpha\right\}$$
$$\qquad - n^{-3/2}\left(z_\alpha^2 + \frac12\right)\sum_{i=1}^{n}\left[\frac{(X_i-\mu)^3 - \mu_3}{3\mu_2^{3/2}} - \frac{\mu_3\{(X_i-\mu)^2 - \mu_2\}}{2\mu_2^{5/2}} - \frac{2(X_i-\mu)}{\mu_2^{1/2}}\right] + R_{n,1}$$
and
$$Q_{Stu}(\alpha) = z_\alpha - n^{-1/2}\left(\frac{\mu_3}{3\mu_2^{3/2}}z_\alpha^2 + \frac{\mu_3}{6\mu_2^{3/2}}\right) + n^{-1}\left\{\left(\frac12 + \frac{5\mu_3^2}{18\mu_2^3} - \frac{\mu_4}{12\mu_2^2}\right)z_\alpha^3 + \left(-\frac12 - \frac{5\mu_3^2}{72\mu_2^3} + \frac{\mu_4}{4\mu_2^2}\right)z_\alpha\right\}$$
$$\qquad - n^{-3/2}\left(z_\alpha^2 + \frac12\right)\sum_{i=1}^{n}\left[\frac{(X_i-\mu)^3 - \mu_3}{3\mu_2^{3/2}} - \frac{\mu_3\{(X_i-\mu)^2 - \mu_2\}}{2\mu_2^{5/2}} - \frac{2(X_i-\mu)}{\mu_2^{1/2}}\right] + R_{n,1}.$$

Note that
$$\delta_3 = -1 - \frac{\mu_3^2}{2\mu_2^3} + \frac{\mu_4}{3\mu_2^2} \qquad \text{and} \qquad \delta_4 = -\frac12 - \frac{\mu_3^2}{4\mu_2^3} + \frac{\mu_4}{6\mu_2^2} = \frac{\delta_3}{2}.$$

We will consider the confidence interval of the mean $\theta = E(X_1)$ with nominal level $100(1-\alpha)\%$. If $X_i \sim N(\theta, \sigma^2)$, then
$$\frac{\sqrt{n}(\bar X - \theta)}{\hat\mu_2^{1/2}} \sim t\text{-distribution with } n-1 \text{ degrees of freedom},$$
where $\hat\mu_2 = \sum_{i=1}^{n}(X_i - \bar X)^2/(n-1)$. Because $\mu_3 = 0$ and $\mu_4 = 3\mu_2^2$, we can show that
$$Q_{Nor}(\alpha) = z_\alpha + n^{-1}\frac{z_\alpha^3 + z_\alpha}{4} - n^{-3/2}\left(z_\alpha^2 + \frac12\right)\sum_{i=1}^{n}\left[\frac{(X_i-\theta)^3}{3\mu_2^{3/2}} - \frac{2(X_i-\theta)}{\mu_2^{1/2}}\right] + R_{n,1} = Q_{Stu}(\alpha) + R_{n,1}.$$
Let $t_{n-1}(\alpha)$ be the $\alpha$-point of the $t$-distribution with $n-1$ degrees of freedom. Then the mean-squared errors of $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$ are

$$\mathrm{AMSE}(Q_{Nor}(\alpha)) = \left\{z_\alpha + n^{-1}\frac{z_\alpha^3 + z_\alpha}{4} - t_{n-1}(\alpha)\right\}^2 + \frac{5(2z_\alpha^2 + 1)^2}{12n^2} + o(n^{-3}) = \mathrm{AMSE}(Q_{Stu}(\alpha)) + o(n^{-3}).$$

$\mathrm{AMSE}(Q_{Nor}(\alpha))$ and $\mathrm{AMSE}(Q_{Stu}(\alpha))$ coincide up to the order $n^{-2}$. Note that from the Edgeworth expansion of the $t$-distribution, we have
$$z_\alpha + n^{-1}\frac{z_\alpha^3 + z_\alpha}{4} - t_{n-1}(\alpha) = o(n^{-1}).$$
Further, let us consider the case in which the underlying distribution is the gamma distribution with density function $f(x) = x^{\nu-1}e^{-x}/\Gamma(\nu)$ $(x \ge 0)$, $f(x) = 0$ $(x < 0)$. Note that $E(X_1) = \nu$. Because $\mu_2 = \nu$, $\mu_3 = 2\nu$ and $\mu_4 = 3\nu^2 + 6\nu$, we have $\delta_3 = \delta_4 = 0$. Thus $Q_{Nor}(\alpha) = Q_{Stu}(\alpha) + R_{n,1}$, and the two asymptotic mean-squared errors again coincide up to the order $n^{-2}$. In the cases of the double exponential distribution $f(x) = \exp\{-|x-\theta|\}/2$ and the absolute normal distribution, where $X_i = |G_i|$ and $G_i \sim N(0,1)$, $\delta_3$ and $\delta_4$ are not zero, so there are some differences between $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$. We compare $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$ by simulation because we cannot obtain the exact percentile points $t_\alpha$. Tables 1–3 show simulation results of coverage probabilities for the sample mean when the distribution of $X_i$ is normal, absolute normal, and double exponential. The table values are coverage probabilities based on 1,000,000 replications. $Q_1$ and $Q_2$ denote the coverage probabilities of $Q_{Nor}(\alpha)$ and $Q_{Stu}(\alpha)$, respectively. $Q_3$ denotes the coverage probability based on the Cornish–Fisher inversion of an Edgeworth expansion of the standardized statistic with


Table 1
Coverage prob. (normal distribution)

  n          α = 0.010     0.025      0.050      0.950      0.975      0.990
  20    Q1      0.015431   0.031742   0.057217   0.942143   0.967836   0.984256
        Q2      0.011824   0.028151   0.054762   0.944696   0.971424   0.987909
        Q3      0.014720   0.031690   0.057767   0.941477   0.967947   0.985118
        Qt      0.009878   0.024610   0.049355   0.949964   0.975010   0.990055
  50    Q1      0.010981   0.026378   0.051621   0.948728   0.973901   0.989267
        Q2      0.010854   0.026648   0.052521   0.947791   0.972800   0.989405
        Q3      0.011832   0.027699   0.053462   0.946913   0.972641   0.988480
        Qt      0.009985   0.024967   0.050137   0.950302   0.975274   0.990254
  100   Q1      0.010280   0.025328   0.050578   0.949836   0.974941   0.989912
        Q2      0.010461   0.025838   0.051456   0.948957   0.974499   0.989718
        Q3      0.010864   0.026301   0.051943   0.948605   0.974031   0.989319
        Qt      0.010441   0.025055   0.050312   0.950195   0.975334   0.989751
  200   Q1      0.010096   0.025229   0.050180   0.950466   0.975196   0.989934
        Q2      0.010263   0.025566   0.050713   0.949938   0.974845   0.989733
        Q3      0.010436   0.025765   0.050903   0.949760   0.974621   0.989607
        Qt      0.009973   0.025076   0.050035   0.950617   0.975345   0.990056

residual term $o(n^{-1})$. In Table 1, $Q_t$ denotes the coverage probabilities of the exact $t$-distribution. The simulation results show that $Q_1$ is slightly better than $Q_2$ overall: $Q_1$ is slightly worse than $Q_2$ when the sample size is small, but better when the sample size is large. Both $Q_1$ and $Q_2$ are quite comparable to $Q_t$ when the sample size is large. The accuracy of the coverage probabilities of $Q_3$ is much worse than that of $Q_1$ and $Q_2$. These results coincide with Theorem 2. Next we simulate the mean-squared errors of $Q_{Nor}(\alpha)$, $Q_{Stu}(\alpha)$ and the inversion of the Edgeworth expansion of the standardized statistic. Because we cannot obtain the exact $\alpha$-point $t_\alpha$ of $\sqrt{n}(\bar X - \mu)/\hat\mu_2^{1/2}$ except when the underlying distribution is normal, we first simulate the $\alpha$-point. Non-parametric estimation of the $\alpha$-point requires an ordering of the simulated values, and since the estimation of $t_\alpha$ is based on 10,000,000 replications, it is impractical to sort all of them. Therefore, we first fix the values $t^* = -2.326, -1.960, -1.650, 1.650, 1.960, 2.326$, which are the 0.01-, 0.025-, 0.05-, 0.95-, 0.975- and 0.99-points of the standard normal distribution $N(0,1)$. We then simulate the probability $\alpha^* = P\{\sqrt{n}(\bar X - \mu)/\hat\mu_2^{1/2} \le t^*\}$, and using $\alpha^*$ we obtain estimates of the mean-squared errors of $Q_{Nor}(\alpha^*)$ and $Q_{Stu}(\alpha^*)$ based on 1,000,000 replications. Tables 4–6 show simulation results of average values of the mean-squared errors for the sample mean,
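The coverage simulations above follow a standard Monte Carlo scheme. A minimal sketch (ours, with arbitrary illustrative parameters) for the one-sided bound of the normal-mean case: the event $\{\hat\theta - n^{-1/2}\hat\sigma Q(\alpha) \le \theta\}$ is equivalent to $\{\sqrt{n}(\hat\theta-\theta)/\hat\sigma \le Q(\alpha)\}$, so coverage is estimated by counting that event:

```python
import math
import random

def coverage(n, reps, q_alpha, seed=0):
    """Monte Carlo coverage of the bound  theta_hat - sigma_hat*q_alpha/sqrt(n)
    for the mean of standard normal data (true theta = 0).  Counts the
    equivalent event  sqrt(n)*mean/s <= q_alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(xs) / n
        s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
        t = math.sqrt(n) * m / math.sqrt(s2)
        hits += (t <= q_alpha)
    return hits / reps
```

In the paper's experiments `reps` is 1,000,000; smaller values suffice for a quick check.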


Table 2
Coverage prob. (absolute normal)

  n          α = 0.010     0.025      0.050      0.950      0.975      0.990
  20    Q1      0.022908   0.041609   0.068413   0.949779   0.974858   0.989507
        Q2      0.016648   0.035124   0.063095   0.955620   0.980740   0.993801
        Q3      0.040882   0.064094   0.093423   0.965180   0.985184   0.994859
  50    Q1      0.013042   0.028974   0.054527   0.950813   0.975525   0.990380
        Q2      0.011725   0.027703   0.053900   0.951516   0.976806   0.991566
        Q3      0.026281   0.046960   0.075156   0.963030   0.984453   0.995073
  100   Q1      0.011094   0.026362   0.051463   0.950485   0.975420   0.990256
        Q2      0.010648   0.026070   0.051495   0.950529   0.975758   0.990688
        Q3      0.020417   0.039356   0.066676   0.960798   0.982956   0.994446
  200   Q1      0.010398   0.025377   0.050387   0.950683   0.975465   0.990193
        Q2      0.010281   0.025301   0.050544   0.950544   0.975584   0.990346
        Q3      0.016776   0.034416   0.061238   0.958614   0.981465   0.993724

Table 3
Coverage prob. (double exponential)

  n          α = 0.010     0.025      0.050      0.950      0.975      0.990
  20    Q1      0.021184   0.039941   0.067268   0.932305   0.959708   0.978717
        Q2      0.016441   0.037590   0.068735   0.930909   0.961865   0.978660
        Q3      0.010627   0.026780   0.054587   0.944973   0.972767   0.989436
  50    Q1      0.012890   0.029010   0.054738   0.945172   0.971011   0.987185
        Q2      0.015526   0.033708   0.061141   0.938591   0.966387   0.984428
        Q3      0.009309   0.024833   0.051386   0.948406   0.975088   0.990676
  100   Q1      0.011132   0.026447   0.051678   0.948213   0.973568   0.989005
        Q2      0.013666   0.030338   0.056632   0.943134   0.969546   0.986487
        Q3      0.009430   0.024742   0.050561   0.949269   0.975373   0.990686
  200   Q1      0.010299   0.025308   0.050030   0.949245   0.974278   0.989566
        Q2      0.012039   0.027845   0.053126   0.946218   0.971736   0.987781
        Q3      0.009542   0.024645   0.049701   0.949498   0.974953   0.990347


Table 4
Mean-squared errors (normal)

  n = 20    α* = 0.01565    0.03251    0.05831    0.94194    0.96764    0.98439
       Q1        0.056451   0.029496   0.015512   0.015302   0.029191   0.056243
       Q2        0.031726   0.019283   0.011802   0.011631   0.019098   0.031615
       Q3        0.035534   0.015160   0.006173   0.005881   0.014704   0.035172
  n = 50    α* = 0.01211    0.02788    0.05327    0.94698    0.97229    0.98799
       Q1        0.009258   0.004840   0.002595   0.002581   0.004796   0.009103
       Q2        0.007810   0.004454   0.002611   0.002576   0.004377   0.007637
       Q3        0.006542   0.002555   0.001030   0.000971   0.002373   0.005962
  n = 100   α* = 0.01102    0.02643    0.05180    0.94849    0.97362    0.98899
       Q1        0.002332   0.001246   0.000682   0.000665   0.001237   0.002330
       Q2        0.002242   0.001291   0.000779   0.000712   0.001268   0.002244
       Q3        0.001629   0.000696   0.000325   0.000233   0.000649   0.001627
  n = 200   α* = 0.01053    0.02575    0.05094    0.94926    0.97438    0.98952
       Q1        0.000594   0.000313   0.000173   0.000169   0.000313   0.000584
       Q2        0.000640   0.000345   0.000212   0.000188   0.000322   0.000607
       Q3        0.000494   0.000175   0.000090   0.000058   0.000131   0.000415

Table 5
Mean-squared errors (absolute normal)

  n = 20    α* = 0.03505    0.05748    0.08707    0.96269    0.98292    0.99373
       Q1        0.128018   0.068689   0.037094   0.010438   0.022791   0.053236
       Q2        0.074637   0.044657   0.027061   0.009087   0.016533   0.030778
       Q3        0.338172   0.179752   0.093855   0.038309   0.062392   0.089529
  n = 50    α* = 0.02242    0.04227    0.07048    0.96066    0.98227    0.99385
       Q1        0.022431   0.011703   0.006012   0.001721   0.003493   0.007459
       Q2        0.013850   0.008104   0.004627   0.002435   0.004091   0.007035
       Q3        0.148156   0.076704   0.038447   0.024693   0.045025   0.079245
  n = 100   α* = 0.01766    0.03601    0.06347    0.95851    0.98098    0.99340
       Q1        0.005914   0.002914   0.001501   0.000508   0.000996   0.001737
       Q2        0.003810   0.002079   0.001191   0.000802   0.001376   0.002437
       Q3        0.076940   0.038633   0.019265   0.014443   0.027473   0.055336
  n = 200   α* = 0.01489    0.03220    0.05902    0.95654    0.97977    0.99272
       Q1        0.001441   0.000676   0.000372   0.000151   0.000265   0.000592
       Q2        0.000975   0.000500   0.000313   0.000234   0.000395   0.000726
       Q3        0.038617   0.019020   0.009682   0.007914   0.016151   0.029312


Table 6 Mean-squared errors (double exponential)



n 20

0.01363

0.03079

0.05799

0.94195

0.96911

0.98631

Q1 Q2 Q3

0.129003 0.112600 0.028364

0.067731 0.060061 0.014246

0.036386 0.034096 0.007374

0.036272 0.034099 0.007497

0.067298 0.059803 0.014552

0.128103 0.111731 0.028716

50 Q1 Q2 Q3

0.01136 0.063759 0.040743 0.007296

0.02729 0.031454 0.022235 0.003408

0.05336 0.015291 0.012548 0.001857

0.94676 0.015418 0.012528 0.001796

0.97278 0.031263 0.022262 0.003475

0.98870 0.064174 0.040862 0.007226

100 Q1 Q2 Q3

0.01066 0.025925 0.014921 0.002429

0.02615 0.012399 0.008182 0.001152

0.05167 0.006061 0.004589 0.000565

0.94827 0.006048 0.004593 0.000564

0.97388 0.012370 0.008186 0.001151

0.98939 0.026094 0.014920 0.002390

200 Q1 Q2 Q3

0.01031 0.008551 0.004732 0.000714

0.02557 0.004056 0.002588 0.000331

0.05077 0.002006 0.001432 0.000146

0.94909 0.001912 0.001468 0.000177

0.97437 0.003902 0.002627 0.000367

0.98966 0.008545 0.006008 0.000714

when the distribution of Xi is normal, absolute normal and double exponential. The table values are based on 1,000,000 replications. Q1 and Q2 denote the mean-squared errors of QNor ( ) and QStu ( ), respectively. Q3 denotes the mean-squared error based on the Cornish–Fisher inversion of the Edgeworth expansion of the standardized statistic. Simulation results show that Q2 is slightly better than Q1 . These results coincide with Theorem 3. When the underlying distribution is symmetric (normal and double exponential), Q3 is better than Q1 and Q2 , but Q3 is much worse in the case of double exponential. Q3 is also much worse when the underlying distribution is the lognormal distribution (not listed here). Q1 and Q2 are stable both in the sense of the coverage probability and the mean-squared error, but Q3 is not. Finally, we consider the sample variance. Example 2. We consider the confidence interval of  = Var(X1 ) with the sample variance

ˆ =

n 1  2 ¯ 2= (Xi − X) n−1 n(n − 1) i=1

 1  i
1 (Xi − Xj )2 . 2

The sample variance is a U-statistic with kernel h(x, y) = (x − y)/2; therefore, we can apply the results of the U-statistics. Let us define

1 = E(X1 ),

 = E[(X1 − 1 )2 ] and k = E[(X1 − 1 )k ] k = 3, 4 . . . 8.

374

Yoshihiko Maesono / Journal of Statistical Planning and Inference 133 (2005) 359 – 379

It follows from Fujioka and Maesono (2000) that

23 , 4 2    1 e3 = 16 (8 − 46  + 64 2 − 34 ), e4 = − 3 (5 − 23 ), e5 = 3 , e6 = 0, 8 4 28 − 56  − 245 3 + 34 2 + 6323  3 = − 1 + 6(4 − 2 )2 ( − 34  − 323 + 23 )(6 − 24  − 423 + 3 ) − 6 4(4 − 2 )3

21 = 41 (4 − 2 ),

22 = 2 ,

e1 =

1 8

(6 − 34  + 23 ),

e2 = −

and

4 = − −

1 28 − 56  − 365 3 + 34 2 + 10223  + 2 12(4 − 2 )2 (6 − 34  − 623 + 23 )(6 − 24  − 423 + 3 ) 2(4 − 2 )3

.

If X1 has a continuous distribution and E|X1 |18 < ∞, the sample variance satisfies the condition (C) and the other conditions of Theorem 1 (see, Fujioka and Maesono (2000)). If Xi ∼ N(1 , ), we can show that 3 = 0, 4 = 0 and √

53z 3 79z −1/2 2 QNor ( )=z − n (2z 2 + 1) + n−1 + 3 36 36 n  + n−3/2 {z 2 p1 (Xi ) + q1 (Xi )} + Rn, i=1

= QStu ( ) + Rn, . Thus if the underlying distribution is normal, the mean-squared errors coincide to the order n−2 , as the sample mean. In cases of the double exponential and the absolute normal distributions, 3 and 4 are not 0. Simulation results of the coverage probabilities and the mean-squared errors are very similar to those of the sample mean. QStu ( ) and QNor ( ) are quite comparable in the coverage probability. The mean-squared error of QStu ( ) is slightly better than QNor ( ).

Acknowledgements The author wishes to thank the referees for their helpful comments, which have improved the manuscript significantly. He is also grateful to the hospitality of the Centre for Mathematics and its Applications at the Australian National University, where he carried out a portion of this study.

Yoshihiko Maesono / Journal of Statistical Planning and Inference 133 (2005) 359 – 379

375

Appendix. Using the H-decomposition, we first obtain the moment bounds (see, Hoeffding (1961); Maesono (1997)). Let r (x1 , . . . , xr ) be a real-valued function which is symmetric in its arguments, and Ar =

 Cn,r

r (Xi1 , Xi2 , . . . , Xir ),

 where Cn,r indicates that the summation is taken over all integers i1 , . . . , ir satisfying 1  i1 < i2 < · · · < ir  n. Assume that E[r (X1 , . . . , Xr )|X1 , . . . , Xr−1 ] = 0

a.s.

Then, if E|r (X1 , . . . , Xr )|b < ∞ for 1  b < 2, there exists a positive constant B , which may depend on  and F, but not on n, such that E|Ar |b  B nr .

(7)

And for 2  b, if E|r (X1 , . . . , Xr )|b < ∞,there exists a positive constant B , which may depend on  and F but not on n, such that E|Ar |b  B nbr/2 .

(8)

Proof of Theorem 1. From assumptions (A5) and (A6), we can show that n−1/2 (pz ˆ 2 + q) ˆ = n−1/2 (pz2 + q) + n−3/2

n  i=1

{z 2 p1 (Xi ) + q1 (X1 )} + Rn, .

Further we have n−1 pˆ 2 = n−1 p 2 + n−3

n  n  i=1 j =1

+ 2pn−2 n−3

n n  

n 

2 p1 (Xi )p1 (Xi ) + Rn,

p1 (Xi ) + 2pR n, + 2Rn, n−3/2

i=1

n 

p1 (Xi ),

i=1

p1 (Xi )p1 (Xj )

i=1 j =1 n  −3

=n

i=1

{p12 (Xi ) − E[p12 (X1 )]} + n−2 E[p12 (X1 )] + 2n−3

 Cn,2

p1 (Xi )p1 (Xj )

376

Yoshihiko Maesono / Journal of Statistical Planning and Inference 133 (2005) 359 – 379

and E[p_1(X_1) p_1(X_2) | X_1] = 0 a.s. It follows from (7), (8) and the Markov inequality that

  P( |n^{−3} Σ_{i=1}^n {p_1²(X_i) − E[p_1²(X_1)]}| ≥ n^{−1}(log n)^{−3/2} )
    ≤ n^{−3(1+δ/2)} E|Σ_{i=1}^n {p_1²(X_i) − E[p_1²(X_1)]}|^{1+δ/2} / {n^{−1−δ/2}(log n)^{−3/2−3δ/4}}
    = O(n^{−1−δ}(log n)^{3/2+3δ/4}) = o(n^{−1})

and

  P( |2n^{−3} Σ_{C_{n,2}} p_1(X_i) p_1(X_j)| ≥ n^{−1}(log n)^{−3/2} )
    ≤ 4n^{−6} E|Σ_{C_{n,2}} p_1(X_i) p_1(X_j)|² / {n^{−2}(log n)^{−3}} = o(n^{−1}).

Similarly, we can show that 2p n^{−2} Σ_{i=1}^n p_1(X_i) = R_{n,α}. It is obvious that 2p R_{n,α} = R_{n,α} and R_{n,α}² = R_{n,α}. Finally, we have

  P( |2R_{n,α} n^{−3/2} Σ_{i=1}^n p_1(X_i)| ≥ n^{−1}(log n)^{−3/2} )
    ≤ P{ |R_{n,α}| ≥ n^{−1}(log n)^{−3/2} } + P( |2n^{−3/2} Σ_{i=1}^n p_1(X_i)| ≥ 1 ) = o(n^{−1}).
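The order-of-magnitude argument behind the Markov-inequality step can be sanity-checked numerically. Here X ~ N(0, 1) and p_1(x) = x² − 1 are illustrative stand-ins (not the paper's p_1), so that E[p_1²(X_1)] = 2:

```python
import math, random

# X ~ N(0,1) and p1(x) = x^2 - 1 are illustrative stand-ins (not the paper's
# p1); then E[p1^2(X_1)] = E[(X^2-1)^2] = 2.  The centered sum
# n^{-3} sum_i {p1^2(X_i) - 2} is O_p(n^{-5/2}), so it essentially never
# reaches the n^{-1}(log n)^{-3/2} threshold used in the Markov step.
random.seed(1)
n, reps = 200, 2000
threshold = n ** -1 * math.log(n) ** -1.5
exceed = 0
for _ in range(reps):
    s = sum((random.gauss(0.0, 1.0) ** 2 - 1.0) ** 2 - 2.0 for _ in range(n))
    exceed += abs(n ** -3 * s) >= threshold
print(exceed, "exceedances in", reps, "replications")
```

At n = 200 the statistic is roughly three orders of magnitude below the threshold, so no exceedances occur.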

Thus, we get n^{−1} p̂² = n^{−1} p² + R_{n,α}. Similarly we can show that n^{−1} p̂q̂ = n^{−1} pq + R_{n,α}. It follows from assumptions (A8) and (A9) that n^{−1} û z_α³ = n^{−1} u z_α³ + R_{n,α} and n^{−1} v̂ z_α = n^{−1} v z_α + R_{n,α}. Thereby, we obtain the desired result.

Proof of Theorem 2. From Theorem 1, we can show that

  P( √n(θ̂ − θ)/σ̂ ≤ Q_{Nor}(α) )
    = P( √n(θ̂ − θ)/σ̂ + n^{−3/2} Σ_{i=1}^n [z_α² p_1(X_i) + q_1(X_i)] ≤ z_α + a(α) ) + o(n^{−1}).


Here, we have

  √n(θ̂ − θ)/σ̂ + n^{−3/2} Σ_{i=1}^n [z_α² p_1(X_i) + q_1(X_i)]
    = n^{−1/2} Σ_{i=1}^n g_1(X_i) + n^{−3/2} Σ_{i=1}^n g_1*(X_i) + n^{−3/2} Σ_{1≤i<j≤n} g_2(X_i, X_j)
      + n^{−5/2} Σ_{1≤i<j<k≤n} g_3(X_i, X_j, X_k) + R_{n,α},

where g_1*(x) = g̃_1(x) + z_α² p_1(x) + q_1(x). Thus we can use the Edgeworth expansion of the asymptotic U-statistic. It follows from Lemma 1 that

  P( √n(θ̂ − θ)/σ̂ ≤ Q_{Nor}(α) )
    = Φ(z_α + a(α)) − φ(z_α + a(α)){n^{−1/2} P_1(z_α + a(α)) + n^{−1} P_2*(z_α + a(α))} + o(n^{−1}),

where

  P_2*(z) = z³ E[g_1(X_1) p_1(X_1)] + z E[g_1(X_1) q_1(X_1)] + P_2(z).   (9)

Let us define

  a_1(α) = −(p z_α² + q)  and  a_2(α) = ((5/3)p² − u) z_α³ + (2pq − v) z_α.   (10)

Using the Taylor expansion, we have

  Φ(z_α + a(α)) = Φ(z_α) + n^{−1/2} a_1(α) φ(z_α) + n^{−1} { a_2(α) − (p z_α² + q)² z_α/2 } φ(z_α) + o(n^{−1}),

  −n^{−1/2} φ(z_α + a(α)) P_1(z_α + a(α))
    = −n^{−1/2} P_1(z_α) φ(z_α) − n^{−1} a_1(α) {p z_α³ + (q − 2p) z_α} φ(z_α) + o(n^{−1})

and

  −n^{−1} P_2*(z_α + a(α)) φ(z_α + a(α)) = −n^{−1} P_2*(z_α) φ(z_α) + o(n^{−1}).


Thus, we get

  P( √n(θ̂ − θ)/σ̂ ≤ Q_{Nor}(α) )
    = Φ(z_α) + n^{−1/2} {a_1(α) − P_1(z_α)} φ(z_α)
      + n^{−1} { a_2(α) − (p z_α² + q)² z_α/2 − a_1(α)(p z_α³ + (q − 2p) z_α) − P_2*(z_α) } φ(z_α) + o(n^{−1}).

For the coefficient of the term n^{−1/2} φ(z_α), we have a_1(α) − P_1(z_α) = 0. Let c_1 be the coefficient of n^{−1} φ(z_α) z_α. It follows from (9) and (10) that

  c_1 = −v + (1/2)p² − (1/2)(κ_2 − 1) − (1/4)(κ_2 − κ_4) + (1/2)κ_3 + (1/8)κ_4 − (5/24)κ_3².

Note that κ_3 = −6p, q = −p − κ_1 and κ_4 = m_3 − 3 + 12m_4 + 12m_5 + 4m_6. Substituting these values, we get c_1 = −12p² − 2p m_1 − 6p m_2. Because p = −(m_1 + 3m_2)/6, we have c_1 = 0. Let c_2 be the coefficient of n^{−1} φ(z_α) z_α³. From direct computation, we have

  c_2 = −u + (11/3)p² − (κ_4 − 3)/24.

Thus, substituting u, p and κ_4, we can show that c_2 = 0. Similarly, let c_3 be the coefficient of n^{−1} φ(z_α) z_α⁵; thereby, we obtain

  c_3 = −(1/2)p² + p² − (1/72)κ_3² = −(1/2)p² + p² − (1/2)p² = 0.

This means that

  P( √n(θ̂ − θ)/σ̂ ≤ Q_{Nor}(α) ) = α + o(n^{−1}).

It follows from Lemma 1 that

  P( √n(θ̂ − θ)/σ̂ + n^{−3/2} Σ_{i=1}^n [z_α² p_1(X_i) + q_1(X_i)] ≤ x )
    = Φ(x) − n^{−1/2} φ(x) P_1(x) − n^{−1} φ(x) P_2*(x) + o(n^{−1}),

where P_2*(x) is defined in (9). Because z_α + b(α) is the Cornish–Fisher inversion, we can show that

  Φ(z_α + b(α)) − φ(z_α + b(α)){n^{−1/2} P_1(z_α + b(α)) + n^{−1} P_2(z_α + b(α))} = Φ(z_α) + o(n^{−1}).
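The Cornish–Fisher step can also be checked symbolically. The explicit form of b(α) used below is the standard one-term-per-order inversion formula, assumed here rather than quoted from the paper; P_1(x) = −(p x² + q) as in the text, and P_2 is taken as a generic odd cubic:

```python
import sympy as sp

# For an expansion  G(x) = Phi(x) - phi(x) { e P1(x) + e^2 P2(x) },
# the Cornish-Fisher inverted quantile z + b with
#   b = e P1(z) + e^2 { P1(z) P1'(z) - z P1(z)^2 / 2 + P2(z) }
# satisfies G(z + b) = Phi(z) + O(e^3), the identity used in the proof.
# The form of b is the standard inversion formula (assumed, not quoted).
z, x, e, p, q, c, d = sp.symbols('z x e p q c d')
phi_x = sp.exp(-x ** 2 / 2) / sp.sqrt(2 * sp.pi)
Phi_x = sp.Rational(1, 2) * (1 + sp.erf(x / sp.sqrt(2)))
P1 = lambda t: -(p * t ** 2 + q)          # P1 as in the text
P2 = lambda t: c * t ** 3 + d * t         # generic odd cubic
G = Phi_x - phi_x * (e * P1(x) + e ** 2 * P2(x))
b = e * P1(z) + e ** 2 * (P1(z) * sp.diff(P1(z), z) - z * P1(z) ** 2 / 2 + P2(z))
expr = G.subs(x, z + b).series(e, 0, 3).removeO()
Phi_z = sp.Rational(1, 2) * (1 + sp.erf(z / sp.sqrt(2)))
diff = sp.simplify(sp.expand(expr - Phi_z))
print(diff)
```

The e and e² terms cancel exactly, which is precisely the displayed identity with e = n^{−1/2}.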


Thus we have

  P( √n(θ̂ − θ)/σ̂ ≤ Q_{Stu}(α) )
    = P( √n(θ̂ − θ)/σ̂ + n^{−3/2} Σ_{i=1}^n [z_α² p_1(X_i) + q_1(X_i)] ≤ z_α + b(α) ) + o(n^{−1})
    = Φ(z_α) − n^{−1} φ(z_α + b(α))(Δ_3 z_α² + Δ_4){z_α + b(α)} + o(n^{−1})
    = α − n^{−1} (Δ_3 z_α² + Δ_4) z_α φ(z_α) + o(n^{−1}),

where Δ_3 = E[g_1(X_1) p_1(X_1)] and Δ_4 = E[g_1(X_1) q_1(X_1)] are the coefficients of the difference P_2*(x) − P_2(x) in (9). This completes the proof of Theorem 2.
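The two substitution identities that close the proof are elementary; a symbolic confirmation:

```python
import sympy as sp

# With kappa_3 = -6p:   c3 = -p^2/2 + p^2 - kappa_3^2/72 = 0, and with
# p = -(m1 + 3 m2)/6:   c1 = -12 p^2 - 2 p m1 - 6 p m2 = 0,
# exactly as stated in the proof of Theorem 2.
p, m1, m2 = sp.symbols('p m1 m2')
kappa3 = -6 * p
c3 = -p ** 2 / 2 + p ** 2 - kappa3 ** 2 / 72
c1 = (-12 * p ** 2 - 2 * p * m1 - 6 * p * m2).subs(p, -(m1 + 3 * m2) / 6)
print(sp.simplify(c3), sp.expand(c1))
```

Both expressions reduce to zero, so the n^{−1} coefficients of z_α and z_α⁵ vanish as claimed.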



References

Barndorff-Nielsen, O.E., Cox, D.R., 1989. Asymptotic Techniques for Use in Statistics. Chapman & Hall, London.
Fujioka, Y., Maesono, Y., 2000. Higher order normalizing transformations for asymptotic U-statistics. J. Statist. Plann. Inference 83, 47–74.
Hall, P., 1992. On the removal of skewness by transformation. J. Roy. Statist. Soc. B 54, 221–228.
Hoeffding, W., 1961. The strong law of large numbers for U-statistics. University of North Carolina Institute of Statistics, Mimeo Series No. 302.
Lai, T.L., Wang, J.Q., 1993. Edgeworth expansion for symmetric statistics with applications to bootstrap methods. Statist. Sinica 3, 517–542.
Maesono, Y., 1997. Edgeworth expansions of a studentized U-statistic and a jackknife estimator of variance. J. Statist. Plann. Inference 61, 61–84.
Maesono, Y., 1998. Asymptotic properties of jackknife skewness estimators and Edgeworth expansions. Bull. Inform. Cybernet. 30, 51–68.