Journal of Statistical Planning and Inference 115 (2003) 171 – 179
www.elsevier.com/locate/jspi
Local Bahadur efficiency of some goodness-of-fit tests under skew alternatives

A. Durio^{a,∗}, Ya.Yu. Nikitin^{b,1}

^a Department of Statistics and Applied Mathematics “D. de Castro” of Turin University, P.zza Arbarello 8, 10122 Torino, Italy
^b Department of Mathematics and Mechanics of St. Petersburg State University, Bibliotechnaya sq., 2, Stary Peterhof 198504, Russia

Received 22 April 2001; accepted 24 January 2002
Abstract

The efficiency of most known distribution-free goodness-of-fit tests such as Kolmogorov–Smirnov, Cramér–von Mises and their variants was studied mainly under the classical alternatives of location and scale. Hence, it is interesting to compare the efficiencies of these tests under asymmetric alternatives like the skew alternative proposed by Azzalini (Scand. J. Stat. 12 (1985) 171). We calculate and compare local Bahadur efficiencies of many known statistics for skew alternatives and discuss the conditions of their local optimality. © 2002 Elsevier Science B.V. All rights reserved.

MSC: primary 62G10; secondary 62E10; 62G20

Keywords: Bahadur efficiency; Skew alternative; Kullback–Leibler information; Local index; Local asymptotic optimality
1. Introduction

Goodness-of-fit testing is one of the most interesting problems of statistics. If the hypothetical distribution is continuous, it is reasonable to apply distribution-free tests using the Kolmogorov–Smirnov and Cramér–von Mises statistics and their numerous variants (see, e.g., Shorack and Wellner (1986) or Nikitin (1995) for their description). Their asymptotic properties have been well studied.

∗ Corresponding author. Tel.: +39-11-670-6251; fax: +39-11-670-6239. E-mail addresses: [email protected] (A. Durio), [email protected] (Ya.Yu. Nikitin).
1 The author was supported by grants of RFBR No. 01-01-0245 and 00-15-96019.
© 2002 Elsevier Science B.V. All rights reserved. 0378-3758/03/$ - see front matter. PII: S0378-3758(02)00155-6
172 A. Durio, Ya.Yu. Nikitin / Journal of Statistical Planning and Inference 115 (2003) 171 – 179
Most examples of efficiency calculations involve the classical alternatives of location and scale (see, e.g., Abrahamson, 1967; Groeneboom and Shorack, 1981; Nikitin, 1995). However, in many practical problems these models are unrealistic, as the alternative distributions are often skewed and lose their symmetry. The most interesting and simple example of such an alternative model in the case of the normal distribution was introduced by Azzalini (1985, 1986).

Let $\Phi$ and $\varphi$ denote the distribution function (d.f.) and the density of the standard normal law. Azzalini (1985, 1986) proposed the skew-normal distribution depending on the real parameter $\theta$ and having the density

$$g(x;\theta) = 2\varphi(x)\Phi(\theta x), \qquad x \in R^1,\ \theta \ge 0.$$

It is evident that for any $\theta$ the function $g(x;\theta)$ is a density and that for $\theta = 0$ we obtain the standard normal density. Later, the properties of the Azzalini model and its generalizations were considered, e.g., by Henze (1986), Johnson et al. (1988), Liseo (1990), Azzalini and Dalla Valle (1996) and Chiogna (1998). For any symmetric distribution function $F$ with density $f$ we can introduce analogously the corresponding skew distribution with the density

$$h(x;\theta) = 2f(x)F(\theta x), \qquad x \in R^1,\ \theta \ge 0. \tag{1}$$
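As an illustration of construction (1), the following minimal sketch builds the skewed density from a symmetric density and its d.f. and checks numerically that it integrates to one. The helper names `skew_density` and `integrate`, the integration range and the value $\theta = 2$ are assumptions made for this example only; they do not come from the paper.

```python
from statistics import NormalDist


def skew_density(f, F, theta):
    """Return h(x; theta) = 2 f(x) F(theta x) for a symmetric density f with d.f. F."""
    def h(x):
        return 2.0 * f(x) * F(theta * x)
    return h


def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells (hypothetical helper)."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))


std_normal = NormalDist()  # standard normal law: pdf is phi, cdf is Phi

# Skew-normal density g(x; theta) = 2 phi(x) Phi(theta x) for theta = 0 and theta = 2.
g0 = skew_density(std_normal.pdf, std_normal.cdf, 0.0)
g2 = skew_density(std_normal.pdf, std_normal.cdf, 2.0)

total_mass = integrate(g2, -10.0, 10.0)                  # close to 1
mean_g2 = integrate(lambda x: x * g2(x), -10.0, 10.0)    # positive: mass moves to the right
```

With $\theta = 0$ the factor $2F(0) = 1$, so `g0` coincides with the standard normal density, in line with the remark above.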
The interest in skew models, in both the normal and the non-normal case, has increased considerably in the past few years, as seen in the recent papers of Azzalini and Capitanio (1999), Arnold and Beaver (2000), Pewsey (2000), Genton et al. (2001), Branco and Dey (2001), among others. Therefore, it is quite interesting to analyze the efficiencies of the distribution-free tests mentioned above with respect to the skew alternative.

General formulas for local Bahadur efficiencies in the case of simple one-parameter families can be found in Nikitin (1995, Chapter 2), but the special structure of alternative (1) requires some regularity conditions on the density $f$, which are imposed in Section 2. We calculate these efficiencies for five model distributions with different tail behavior. The results obtained in Section 3 demonstrate that the ordering of tests by their efficiency is similar to the location case.

However, we get entirely different and somewhat surprising results concerning the conditions of local Bahadur optimality of our tests under skew alternatives. As shown in Section 4, some tests (including Kolmogorov’s) are never locally optimal in a broad class of symmetric probability laws. Two tests enjoy this form of optimality for the uniform density, while the omega-square test is locally optimal for the arcsine density. This is in contrast with the phenomena discovered for the location case (see Nikitin, 1995, Chapter 6).
2. Tests and regularity conditions

Let $X_1,\dots,X_n$ be a sample with the density $h(x;\theta)$ given by (1) and depending on the known symmetric density $f$ and a real parameter $\theta \ge 0$. Denoting by $H(x;\theta)$ the d.f. corresponding to this density, we want to test the goodness-of-fit hypothesis $H_0: \theta = 0$ against the alternative $H_1: \theta > 0$. Let $F_n$ be the empirical d.f. based on the sample $X_1,\dots,X_n$, namely

$$F_n(t) = n^{-1}\sum_{i=1}^{n} \mathbf{1}\{X_i \le t\}, \qquad t \in R^1.$$
We consider the well-known goodness-of-fit tests based on the Kolmogorov statistic

$$D_n = \sup_t |F_n(t) - F(t)|,$$

on the Watson–Darling statistic, which is the centered variant of $D_n$:

$$G_n = \sup_t \Bigl| F_n(t) - F(t) - \int_{R^1} (F_n(s) - F(s))\,dF(s) \Bigr|,$$

on the Chapman–Moses statistic

$$\omega_n^1 = \int_{R^1} [F(t) - F_n(t)]\,dF(t),$$

on the Cramér–von Mises statistic

$$\omega_n^2 = \int_{R^1} [F_n(t) - F(t)]^2\,dF(t),$$

and on its centered version, called the Watson statistic,

$$U_n^2 = \int_{R^1} \Bigl[ F_n(t) - F(t) - \int_{R^1} (F_n(s) - F(s))\,dF(s) \Bigr]^2 dF(t).$$
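The five statistics can be evaluated from the transformed order statistics $u_{(i)} = F(X_{(i)})$. The sketch below is one possible implementation, not the authors' code: the function name `gof_statistics` is hypothetical, and it relies on the standard computing formulas $\int (F_n - F)\,dF = 1/2 - \bar u$ and $n\omega_n^2 = 1/(12n) + \sum_i (u_{(i)} - (2i-1)/(2n))^2$, which are assumptions brought in from outside the paper.

```python
def gof_statistics(sample, F):
    """Return D_n, G_n, omega_n^1, omega_n^2 and U_n^2 for a sample and a null d.f. F."""
    u = sorted(F(x) for x in sample)                   # probability integral transform
    n = len(u)
    after = [(i + 1) / n - u[i] for i in range(n)]     # F_n - F at each order statistic
    before = [i / n - u[i] for i in range(n)]          # left limit of F_n - F there
    mean_u = sum(u) / n
    centre = 0.5 - mean_u                              # integral of (F_n - F) dF
    d_n = max(max(after), -min(before))
    # F_n - F vanishes at both ends of the range, hence the extra candidate 0.0.
    g_n = max(abs(v - centre) for v in after + before + [0.0])
    omega1 = mean_u - 0.5                              # integral of (F - F_n) dF
    omega2 = (1 / (12 * n)
              + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))) / n
    u2 = omega2 - centre ** 2                          # centred Cramer-von Mises
    return {"D": d_n, "G": g_n, "omega1": omega1, "omega2": omega2, "U2": u2}


# Example with the uniform d.f. on [-1, 1] as the null hypothesis.
STATS = gof_statistics([-0.8, -0.2, 0.4], lambda x: (x + 1) / 2)
```

In this form $\omega_n^1$ is positive when the sample is shifted to the right of the null distribution, which is the direction in which the skew alternative (1) moves the mass.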
Note that we use a slightly different (but equivalent) form of the statistic $\omega_n^1$ in order to ensure consistency under skew alternative (1). The limiting distributions of all these statistics are well known (see, e.g., Shorack and Wellner, 1986). All these distributions (except that of $\omega_n^1$) are non-normal, and do not allow us to calculate the Pitman efficiency. Instead, we will consider the exact local Bahadur efficiency. This type of efficiency was introduced and developed by Bahadur (1967, 1971). The measure of local Bahadur efficiency (local means that the alternative is close to the null hypothesis) is the so-called local index. General formulas for the local indices and for the local efficiencies of the considered statistics are given in Nikitin (1995). They were derived under certain regularity conditions on the family of alternatives, which we now impose on family (1).

Condition 1: We consider in (1) only symmetric densities $f$ with finite variance and with $f(0) > 0$ which are positive and differentiable within their support; by symmetry we always have $f'(0) = 0$.

Condition 2: Let $f$ be such that uniformly in $x \in R^1$

$$H(x;\theta) - F(x) \sim 2\theta f(0)\int_{-\infty}^{x} u f(u)\,du \quad \text{as } \theta \to 0. \tag{2}$$
It is easy to formulate sufficient conditions ensuring (2). For instance, assuming the existence of bounded $f'$ and Condition 1, we have for any $x$

$$H(x;\theta) - F(x) = 2\theta f(0)\int_{-\infty}^{x} u f(u)\,du + \theta^2 \int_{-\infty}^{x} u^2 f(u) f'(\tilde\theta u)\,du, \qquad 0 < \tilde\theta < \theta.$$

The last term is of order $O(\theta^2)$ as $\theta \to 0$, and we get (2).

An important quantity is the Kullback–Leibler information (see its definition and properties in Bahadur (1971) or Nikitin (1995)). In our notation, it is

$$K(f;\theta) = \int_{R^1} \ln\{h(x;\theta)/h(x;0)\}\,h(x;\theta)\,dx = 2\int_{R^1} \ln\{2F(\theta x)\}\,f(x)F(\theta x)\,dx.$$
Condition 3: Suppose that

$$K(f;\theta) \sim 2\theta^2 f^2(0)\int_{R^1} x^2 f(x)\,dx \quad \text{as } \theta \to 0. \tag{3}$$
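Relation (3) can be checked numerically for a concrete density. The sketch below does this for the standard normal law, where the right-hand side of (3) equals $\theta^2/\pi$; the function names, the truncation of the integral to $[-12, 12]$ and the chosen values of $\theta$ are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

_std = NormalDist()  # standard normal: f = pdf, F = cdf


def kl_information(theta, lo=-12.0, hi=12.0, n=40000):
    """K(f; theta) = 2 * integral of ln{2 F(theta x)} f(x) F(theta x) dx (midpoint rule)."""
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * step
        cdf = _std.cdf(theta * x)
        total += math.log(2.0 * cdf) * _std.pdf(x) * cdf
    return 2.0 * total * step


def kl_leading_term(theta):
    """Right-hand side of (3) for the normal law: 2 theta^2 f(0)^2 Var X_1 with Var X_1 = 1."""
    return 2.0 * theta ** 2 * _std.pdf(0.0) ** 2


# Ratio of the two sides of (3); it should approach 1 as theta decreases.
RATIOS = {theta: kl_information(theta) / kl_leading_term(theta)
          for theta in (0.4, 0.2, 0.1, 0.05)}
```

The ratios stored in `RATIOS` move towards 1 as $\theta$ decreases, which is what (3) asserts for this density.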
In fact, as proved in Durio and Nikitin (2001), the behavior (3) is typical and is valid if the second derivative of $f$ is bounded around zero and if the third absolute moment of $f$ exists.

It is important to observe that conditions (2) and (3) are sometimes valid even if the derivative of $f$ is unbounded. An important example is the symmetric arcsine density

$$f(x) = \bigl(\pi\sqrt{1 - x^2}\bigr)^{-1}\,\mathbf{1}\{-1 < x < 1\}. \tag{4}$$

This density and its derivative are clearly unbounded on $(-1, 1)$. However, for the corresponding skewed d.f. $H(x;\theta)$, relations (2) and (3) are true (see Durio and Nikitin (2001) for the detailed proof).

Denote by $\mathcal{F}$ the set of symmetric densities $f$ satisfying Conditions 1–3. In the rest of the paper we consider only densities $f$ from the class $\mathcal{F}$.

3. Calculation of local efficiencies

Suppose that $T = \{T_n\}$ is a sequence of statistics such that as $n \to \infty$

$$T_n \to b(T;f;\theta) \quad \text{in probability under } H_1$$

and under $H_0$

$$n^{-1}\ln P_{H_0}(T_n \ge \varepsilon) \to -r(T;\varepsilon),$$

where the function $r(T;\varepsilon)$ is continuous in $\varepsilon$ for sufficiently small $\varepsilon$. Then the exact Bahadur slope (see Bahadur (1967, 1971)) is defined as $c(T;f;\theta) = 2r(T; b(T;f;\theta))$ and the local Bahadur efficiency is given by

$$e_B(T;f) = \lim_{\theta \to 0+} \frac{c(T;f;\theta)}{2K(f;\theta)}. \tag{5}$$
In all the examples that are examined in the paper, we have

$$c(T;f;\theta) \sim l(T;f)\,\theta^2 \quad \text{as } \theta \to 0+,$$

where $l(T;f)$ is called the local index. Then by (3), (5) and by considering the modified local indices $l^*(T;f) = l(T;f)/(4f^2(0))$, we have

$$e_B(T;f) = \frac{l(T;f)}{4f^2(0)\,\operatorname{Var} X_1} = \frac{l^*(T;f)}{\operatorname{Var} X_1}. \tag{6}$$
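For readers who want the intermediate step, (6) is obtained by substituting the asymptotics of the slope and of the Kullback–Leibler information (3) into (5). The only extra fact used is that a symmetric density has mean zero, so that $\int x^2 f(x)\,dx = \operatorname{Var} X_1$ under $H_0$. A short derivation in the notation of this section:

```latex
\begin{align*}
e_B(T;f)
  &= \lim_{\theta\to 0+}\frac{c(T;f;\theta)}{2K(f;\theta)}
   = \lim_{\theta\to 0+}
     \frac{l(T;f)\,\theta^{2}}{4\theta^{2}f^{2}(0)\int_{R^1}x^{2}f(x)\,dx} \\
  &= \frac{l(T;f)}{4f^{2}(0)\operatorname{Var}X_1}
   = \frac{l^{*}(T;f)}{\operatorname{Var}X_1}.
\end{align*}
```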
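It is well known (see Nikitin, 1995, Chapter 2) that as $\varepsilon \to 0$

$$r(D;\varepsilon) \sim 2\varepsilon^2, \quad r(G;\varepsilon) \sim 6\varepsilon^2, \quad r(\omega^1;\varepsilon) \sim 6\varepsilon^2, \quad r(\omega^2;\varepsilon) \sim \frac{\pi^2}{2}\,\varepsilon, \quad r(U^2;\varepsilon) \sim 2\pi^2\varepsilon.$$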
By using (2) and the Glivenko–Cantelli theorem and by setting

$$g(x) = \int_{-\infty}^{x} u f(u)\,du, \qquad I = \int_{-\infty}^{+\infty} f(u)g(u)\,du, \qquad J = \int_{-\infty}^{+\infty} f(u)g^2(u)\,du,$$

we get

$$b(D;f;\theta) \sim 2\theta f(0)\sup_x |g(x)|, \qquad b(G;f;\theta) \sim 2\theta f(0)\sup_x |g(x) - I|, \qquad b(\omega^1;f;\theta) \sim 2\theta f(0)\,|I|,$$

$$b(\omega^2;f;\theta) \sim 4\theta^2 f^2(0)\,J, \qquad b(U^2;f;\theta) \sim 4\theta^2 f^2(0)\,[J - I^2].$$

Hence, we obtain the following formulas for the modified local indices $l^*(T;f)$ of the statistics considered in this paper:

$$l^*(D;f) = 4\Bigl(\sup_x |g(x)|\Bigr)^2, \qquad l^*(G;f) = 12\Bigl(\sup_x |g(x) - I|\Bigr)^2, \qquad l^*(\omega^1;f) = 12 I^2,$$

$$l^*(\omega^2;f) = \pi^2 J, \qquad l^*(U^2;f) = 4\pi^2\,[J - I^2].$$
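To make the passage from the large-deviation asymptotics to the modified local indices explicit, here is the computation for two representative cases, one with a quadratic rate function ($D_n$) and one with a linear rate function ($\omega_n^2$). It only combines $c = 2r(T; b)$ with the definition $l^* = l/(4f^2(0))$; the remaining three indices follow in the same way.

```latex
\begin{align*}
c(D;f;\theta) &= 2\,r\bigl(D;\,b(D;f;\theta)\bigr)
  \sim 2\cdot 2\,\Bigl(2\theta f(0)\sup_x|g(x)|\Bigr)^{2}
  = 16\,\theta^{2}f^{2}(0)\Bigl(\sup_x|g(x)|\Bigr)^{2},\\
l^{*}(D;f) &= \frac{l(D;f)}{4f^{2}(0)} = 4\Bigl(\sup_x|g(x)|\Bigr)^{2};\\[4pt]
c(\omega^{2};f;\theta) &\sim 2\cdot\frac{\pi^{2}}{2}\cdot 4\theta^{2}f^{2}(0)\,J
  = 4\pi^{2}\theta^{2}f^{2}(0)\,J,\\
l^{*}(\omega^{2};f) &= \frac{l(\omega^{2};f)}{4f^{2}(0)} = \pi^{2}J .
\end{align*}
```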
We can simplify some of these formulas. Since $g'(x) = x f(x)$ has the sign of $x$, we have $g(0) \le g(x) \le 0$. This implies that $l^*(D;f) = 4g^2(0) = (E|X_1|)^2$ together with $g(0) \le I \le 0$. Hence

$$\sup_x |g(x) - I| = \max(|I|,\ |I - g(0)|) = \begin{cases} -I & \text{if } |g(0)| \le 2|I|, \\ I - g(0) & \text{if } |g(0)| > 2|I|. \end{cases}$$

As a result, if $|g(0)| \le 2|I|$, then $l^*(G;f) = l^*(\omega^1;f) = 12 I^2$. This relation will be valid for all cases that are examined herein.

We will calculate the local indices for five model densities $f$ defined on $R^1$. The first density is the standard normal one, $f_1(x) = (2\pi)^{-1/2}\exp(-x^2/2)$; the second density is the logistic one, $f_2(x) = \exp(x)/(1 + \exp(x))^2$; the third density $f_3$ is the arcsine density (4); the fourth density $f_4$ is the uniform one on $[-1, 1]$; and the fifth density $f_5$ is given by

$$f_5(x) = \frac{8}{3\pi}\,(1 + x^2)^{-3}, \qquad x \in R^1,$$
which has a power-law decreasing tail and resembles the Cauchy distribution. Put for brevity, for $k = 1, \dots, 5$,

$$g_k(x) = \int_{-\infty}^{x} u f_k(u)\,du, \qquad I_k = \int_{-\infty}^{\infty} f_k(u)g_k(u)\,du, \qquad J_k = \int_{-\infty}^{\infty} f_k(u)g_k^2(u)\,du.$$

It is easy to see that for the considered densities

$$g_1(x) = -(2\pi)^{-1/2}\exp(-x^2/2), \qquad x \in R^1;$$

$$g_2(x) = -\ln(1 + \exp(x)) + \frac{x\exp(x)}{1 + \exp(x)}, \qquad x \in R^1;$$

$$g_3(x) = -\pi^{-1}\sqrt{1 - x^2}, \qquad -1 \le x \le 1;$$

$$g_4(x) = -\tfrac{1}{4}(1 - x^2), \qquad -1 \le x \le 1;$$

$$g_5(x) = -\frac{2}{3\pi(1 + x^2)^2}, \qquad x \in R^1.$$
Note that, in the case of the skew alternative, the Kullback–Leibler information is, according to (3), proportional to the variance, which in our five cases is, respectively, $1$, $\pi^2/3$, $1/2$, $1/3$ and $1/3$.

We begin by calculating the efficiencies for the normal density $f_1$. Clearly, $l^*(D;f_1) = 2/\pi \approx 0.6366$ and $I_1 = -1/(2\sqrt{\pi})$. As $g_1(0) = -(2\pi)^{-1/2}$, so that $|g_1(0)| \le 2|I_1|$, we easily get

$$l^*(G;f_1) = l^*(\omega^1;f_1) = 12/(2\sqrt{\pi})^2 = 3/\pi \approx 0.9549.$$

Moreover $J_1 = 1/(2\pi\sqrt{3})$, hence $l^*(\omega^2;f_1) = \pi/(2\sqrt{3}) \approx 0.9068$ and

$$l^*(U^2;f_1) = 4\pi^2\Bigl(\frac{1}{2\pi\sqrt{3}} - \frac{1}{4\pi}\Bigr) = \frac{\pi(2\sqrt{3} - 3)}{3} \approx 0.4860.$$

To get the local Bahadur efficiencies we must divide these indices by the variance, which is 1. Comparing with Nikitin (1995, Chapter 2), we recognize the same values of local indices as in the location case. This invariance is a consequence of the equation

$$g_1(x) = \int_{-\infty}^{x} u f_1(u)\,du = -f_1(x),$$
which is a characteristic property of the symmetric normal law and is not preserved for other distributions. Hence, the phenomenon of the same local Bahadur efficiencies for the skew and location alternatives for the tests under examination is a characterization of the normal law in the class $\mathcal{F}$.

Now we proceed to the calculation of local indices for the other four densities. First of all we report that

$$g_2^2(0) = \ln^2 2, \qquad g_3^2(0) = \pi^{-2}, \qquad g_4^2(0) = \tfrac{1}{16}, \qquad g_5^2(0) = 4/(9\pi^2).$$

Moreover,

$$I_2 = -\tfrac{1}{2}, \qquad I_3 = -2/\pi^2, \qquad I_4 = -\tfrac{1}{6}, \qquad I_5 \approx -0.1547,$$

$$J_2 \approx 0.2850, \qquad J_3 = 1/(2\pi^2), \qquad J_4 = 1/30, \qquad J_5 \approx 0.0271.$$
Table 1
Local Bahadur efficiencies under skew alternatives

Statistic       Distribution
                Gauss     Logistic   Arcsine   Uniform   f5
$D_n$           0.637     0.584      0.810     0.750     0.540
$G_n$           0.955     0.912      0.985     1         0.862
$\omega_n^1$    0.955     0.912      0.985     1         0.862
$\omega_n^2$    0.907     0.855      1         0.987     0.802
$U_n^2$         0.486     0.420      0.758     0.658     0.373
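The entries of Table 1 can be reproduced numerically from the formulas for $l^*(T;f)$ and from (6). The sketch below is an independent check, not the authors' computation: it uses only the Python standard library and a midpoint rule, integrates the first four models on the probability scale $t = F(x)$, and uses the substitution $x = \tan t$ for $f_5$. All names (`MODELS`, `efficiencies`, `TABLE`) and the number of integration cells are assumptions made for this example.

```python
import math
from statistics import NormalDist

N_CELLS = 20000  # number of midpoint cells; an arbitrary accuracy choice


def midpoint(func, a, b, n=N_CELLS):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))


# Each model is (x(t), weight(t), g(x), a, b, variance), chosen so that the integral of
# q(x) f(x) dx over the support equals the integral of q(x(t)) weight(t) dt over [a, b].
# For the first four densities t = F(x), hence weight = 1 on [0, 1].
MODELS = {
    "Gauss": (NormalDist().inv_cdf, lambda t: 1.0,
              lambda x: -math.exp(-x * x / 2) / math.sqrt(2 * math.pi),
              0.0, 1.0, 1.0),
    "Logistic": (lambda t: math.log(t / (1 - t)), lambda t: 1.0,
                 lambda x: -math.log1p(math.exp(x)) + x * math.exp(x) / (1 + math.exp(x)),
                 0.0, 1.0, math.pi ** 2 / 3),
    "Arcsine": (lambda t: -math.cos(math.pi * t), lambda t: 1.0,
                lambda x: -math.sqrt(max(0.0, 1 - x * x)) / math.pi,
                0.0, 1.0, 0.5),
    "Uniform": (lambda t: 2 * t - 1, lambda t: 1.0,
                lambda x: -(1 - x * x) / 4,
                0.0, 1.0, 1 / 3),
    # Substituting x = tan(t) gives f5(x) dx = (8 / (3 pi)) cos(t)^4 dt.
    "f5": (math.tan, lambda t: 8 * math.cos(t) ** 4 / (3 * math.pi),
           lambda x: -2 / (3 * math.pi * (1 + x * x) ** 2),
           -math.pi / 2, math.pi / 2, 1 / 3),
}


def efficiencies(model):
    """Local Bahadur efficiencies l*(T; f) / Var X_1 for the five statistics."""
    x_of_t, weight, g, a, b, variance = model
    i_val = midpoint(lambda t: g(x_of_t(t)) * weight(t), a, b)        # I
    j_val = midpoint(lambda t: g(x_of_t(t)) ** 2 * weight(t), a, b)   # J
    g0 = g(0.0)
    sup_centred = max(abs(i_val), abs(i_val - g0))                    # sup |g(x) - I|
    indices = {
        "D": 4 * g0 ** 2,
        "G": 12 * sup_centred ** 2,
        "omega1": 12 * i_val ** 2,
        "omega2": math.pi ** 2 * j_val,
        "U2": 4 * math.pi ** 2 * (j_val - i_val ** 2),
    }
    return {name: value / variance for name, value in indices.items()}


TABLE = {name: efficiencies(model) for name, model in MODELS.items()}
```

Each value in `TABLE` agrees with the corresponding entry of Table 1 to within $10^{-3}$.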
The knowledge of these values is sufficient to derive the local Bahadur efficiencies according to (6), using the formulas for $l^*(T;f)$ given above. Our calculations are summarized in Table 1.

A comparison of Table 1 with Table 3 in Nikitin (1995, p. 80) shows that the ordering of tests is similar to the location case. This is favorable for practitioners, as they seldom know the structure of the alternative but can use the same test both for location and skew models. We underline the maximal efficiency 1 for $G_n$ and $\omega_n^1$ in the case of the uniform distribution and for $\omega_n^2$ in the case of the arcsine density. Our theoretical analysis of this surprising result is given in Section 4 below.

Note that the so-called Pitman limiting relative efficiency of the considered statistics is the same as the local Bahadur efficiency calculated above (under somewhat stronger regularity conditions). It can be verified in the same way as in Wieand (1976) and Nikitin (1995).

4. Conditions of local optimality

As is well known (see Bahadur, 1967; Nikitin, 1995, Chapter 6), the local asymptotic optimality (LAO) of a sequence of statistics in the Bahadur sense means that the efficiency is one or, equivalently, that the local exact slope and $2K(f;\theta)$ are equivalent as $\theta \to 0+$. Under the regularity conditions described in Section 2, it follows that for a given sequence $T_n$ one should have

$$l^*(T;f) = \int_{R^1} x^2 f(x)\,dx. \tag{7}$$
We are interested in those densities $f \in \mathcal{F}$ for which (7) is true. Such densities form the so-called domain of LAO in $\mathcal{F}$. The study of this “inverse” problem was started by Nikitin (1984), although the importance of exploring the conditions of maximal Bahadur efficiency had already been pointed out by Savage (1969).

In the case of the Kolmogorov statistic, due to the symmetry of $f$, we have $l^*(D;f) = (E|X_1|)^2$. Hence the condition of LAO (7) becomes $(E|X_1|)^2 = E X_1^2$, that is, $\operatorname{Var}|X_1| = 0$, which is impossible for distributions from $\mathcal{F}$. Hence the domain of LAO of $D_n$ is empty in $\mathcal{F}$.
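More interesting is the case of the statistic $\omega_n^1$. If $f \in \mathcal{F}$, integrating by parts, we get

$$l^*(\omega^1;f) = 12\Bigl(\int_{-\infty}^{\infty}\Bigl(\int_{-\infty}^{x} u f(u)\,du\Bigr) f(x)\,dx\Bigr)^2 = 12\Bigl(\int_{-\infty}^{\infty} u f(u)F(u)\,du\Bigr)^2,$$

hence by the Cauchy–Schwarz inequality, we have

$$12\Bigl(\int_{-\infty}^{\infty} u f(u)F(u)\,du\Bigr)^2 = 12\Bigl(\int_{-\infty}^{\infty} u f(u)\,(F(u) - 1/2)\,du\Bigr)^2 \le 12\int_{-\infty}^{\infty} u^2 f(u)\,du \int_{-\infty}^{\infty} (F(u) - 1/2)^2\,dF(u) = E X_1^2.$$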
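As a quick check of the bound just displayed, one can verify directly that it is attained for the uniform density on $[-1, 1]$, in agreement with the value 1 in Table 1. The computation below uses only $f_4$ and its d.f.:

```latex
\begin{align*}
f(u) &= \tfrac12\,\mathbf{1}\{|u|\le 1\}, \qquad
F(u)-\tfrac12 = \tfrac{u}{2}\quad(|u|\le 1),\\
\int_{-1}^{1} u\,f(u)\,\bigl(F(u)-\tfrac12\bigr)\,du
  &= \int_{-1}^{1}\frac{u^{2}}{4}\,du = \frac16,\\
12\Bigl(\frac16\Bigr)^{2} &= \frac13
  = \int_{-1}^{1}\frac{u^{2}}{2}\,du = \mathsf{E}X_1^{2}.
\end{align*}
```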
It follows that the condition of LAO corresponds to equality in the Cauchy–Schwarz inequality, which takes place iff

$$F(x) - \tfrac{1}{2} = C_1 x$$

on the support of $f$. This implies that $f$ is constant on a symmetric interval around zero. The statement under discussion also follows from Theorem 6.1 of Karlin and Studden (1966). We can consider this result as a characterization of the symmetric uniform distribution. We remark that the local optimality of the same statistic under the location alternative is valid for the logistic distribution (see Nikitin, 1995, Chapter 6), which, on the other hand, emphasizes the difference between these two types of alternatives.

In the case of the statistics $G_n$ and $\omega_n^2$ the arguments are similar. The densities $f$ for which the LAO condition takes place are, respectively, the symmetric uniform density and the arcsine density (see the details in Durio and Nikitin, 2001). It is worth mentioning that we gained a new characterization of the arcsine density in the class $\mathcal{F}$ by the property of LAO of the $\omega_n^2$-statistic under the skew alternative. There are other characterizations of this distribution (see, e.g., Norton, 1978; Shantaram, 1978), but they are very rare. Note that in the location case the LAO property of $\omega_n^2$ takes place for the hyperbolic cosine density (see Nikitin, 1995, Chapter 6). These facts explain the presence of 1 in Table 1.

Finally, the domain of LAO for the Watson statistic is empty, as proved in Durio and Nikitin (2001).

Acknowledgements

The authors are indebted to the coordinating editor and the referees for careful reading of the manuscript and for their many important remarks and suggestions.

References

Abrahamson, I.G., 1967. Exact Bahadur efficiencies for the Kolmogorov–Smirnov and Kuiper one- and two-sample statistics. Ann. Math. Statist. 38 (2), 1475–1490.
Arnold, B.C., Beaver, R.J., 2000. The skew-Cauchy distribution. Statist. Probab. Lett. 49 (3), 285–290.
Azzalini, A., 1985. A class of distributions which includes the normal ones. Scand. J. Statist. 12, 171–178.
Azzalini, A., 1986. Further results on a class of distributions which includes the normal ones. Statistica 46, 199–208.
Azzalini, A., Capitanio, A., 1999. Statistical applications of the multivariate normal skew distribution. J. Roy. Statist. Soc., Ser. B 61 (3), 579–602.
Azzalini, A., Dalla Valle, A., 1996. The multivariate skew-normal distribution. Biometrika 83 (4), 715–726.
Bahadur, R.R., 1967. Rates of convergence of estimates and test statistics. Ann. Math. Statist. 38, 303–324.
Bahadur, R.R., 1971. Some Limit Theorems in Statistics. SIAM, Philadelphia.
Branco, M.D., Dey, D., 2001. A general class of multivariate skew-elliptical distributions. J. Multivariate Anal. 79 (1), 99–113.
Chiogna, M., 1998. Some results on the scalar skew-normal distribution. J. Ital. Statist. Soc. 7, 1–13.
Durio, A., Nikitin, Ya.Yu., 2001. Local asymptotic efficiency of some goodness-of-fit tests under skew alternatives. Working Paper No. 4/2001, International Centre for Economic Research (ICER), Torino, 18pp.
Genton, M.G., He, L., Liu, X., 2001. Moments of skew-normal random vectors and their quadratic forms. Statist. Probab. Lett. 51, 319–325.
Groeneboom, P., Shorack, G., 1981. Large deviations of goodness-of-fit statistics and linear combinations of order statistics. Ann. Probab. 9, 971–987.
Henze, N., 1986. A probabilistic representation of the ‘skew-normal’ distribution. Scand. J. Statist. 13, 271–275.
Johnson, N.L., Kotz, S., Read, C.B., 1988. Skew-normal distributions. In: Johnson, N.L., Kotz, S., Read, C.B. (Eds.), Encyclopedia of Statistical Sciences, Vol. 8. Wiley, New York, pp. 507–508.
Karlin, S., Studden, W.J., 1966. Tchebycheff Systems: with Applications in Analysis and Statistics. Wiley, New York.
Liseo, B., 1990. The skew-normal class of densities: aspects of inference from the Bayesian point of view. Statistica 50 (1), 71–82.
Nikitin, Ya.Yu., 1984. Bahadur local asymptotic optimality and characterization problems. Theory Probab. Appl. 29, 79–92.
Nikitin, Ya.Yu., 1995. Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press, Cambridge.
Norton, R.M., 1978. Moment properties and the arcsine law. Sankhya A 40 (2), 192–198.
Pewsey, A., 2000. Problems of inference for Azzalini’s skew-normal distribution. J. Appl. Statist. 27 (7), 859–870.
Savage, I.R., 1969. Nonparametric statistics: a personal review. Sankhya A 31, 107–144.
Shantaram, R., 1978. A characterization of the arcsine law. Sankhya A 40 (2), 199–207.
Shorack, G., Wellner, J., 1986. Empirical Processes with Applications to Statistics. Wiley, New York.
Wieand, H.S., 1976. A condition under which the Pitman and Bahadur approaches to efficiency coincide. Ann. Statist. 4, 1003–1011.