Journal of Statistical Planning and Inference 123 (2004) 13 – 40

www.elsevier.com/locate/jspi

Estimation problems for distributions with heavy tails

Zhaozhi Fan

Institut für Mathematische Stochastik, Universität Göttingen, Lotze Strasse 13, 37083 Göttingen, Germany

Received 19 October 2001; accepted 26 January 2003

Abstract

We establish estimators for the tail index of heavy-tailed distributions. A large deviation principle is proved. An estimator with U-statistic structure is constructed and its asymptotic normality is established. An estimator based on a re-sampling procedure is developed, which makes it possible to simulate U-statistics in a simpler way. Simulation studies show the advantages of our estimators over competing ones.

MSC: 62E07; 62F12; 62F40

Keywords: Heavy-tailed distribution; Domain of attraction; Stable distribution; Tail index; U-statistics; Re-sampling; Simulation

1. Introduction

Estimation of tail indices of heavy-tailed distributions has been an active field of research for quite a long time. Fama and Roll (1968) started to estimate the index of an α-stable distribution. Under the same assumption on the distributions, Press (1972) and Zolotarev (1986) constructed different estimators.


Hill (1975) investigated the special case of a heavy-tailed distribution with Pareto-type tail. He constructed a simple estimator for the tail index by maximizing the conditional likelihood function. In fact, Hill's estimator is a plotting procedure: it depends on the number k of largest observations chosen to calculate the estimates. It is one of the most popular estimators of the tail index. Pickands (1975) constructed a different estimator based on order statistics.

In this paper we propose new estimators of the tail index. Each of these estimators performs very well no matter whether the distribution of the parent population is stable or has some other heavy tail. Moreover, the convergence rate of our estimators is better than (log n)^{-1}.

The paper is organized as follows. In Section 2 we introduce the preliminaries to the problem. In Section 3 we describe the basic structure of our new estimator and prove a large deviation principle; we then develop our estimator with U-statistic structure and prove its asymptotic normality when both the sample size and the degree of the U-statistic go to infinity, and we discuss another estimator with incomplete U-statistic structure. In Section 4 we analyze the simulation results.

2. Preliminaries

Definition 2.1. Suppose X_1, X_2, ..., X_n, ... is a sequence of i.i.d. random variables with distribution F. If there are sequences of positive numbers {a_n} and real numbers {b_n} such that

$$\frac{X_1 + X_2 + \cdots + X_n}{a_n} + b_n \Rightarrow X \qquad (1)$$

for some random variable X, then we say F is in the domain of attraction of a stable distribution, denoted F ∈ DA(α), where α ∈ (0, 2] is the stable exponent of the corresponding stable distribution, and we call X a stable random variable. Throughout the text, the symbol "⇒" means convergence in distribution. When the X_i are i.i.d. with finite variance, (1) is the statement of the ordinary central limit theorem.

One important property of stable distributions is that their family is closed under convolution: if X, Y are independent and identically distributed stable random variables, then for any given positive numbers A and B there exist a positive C and a real D such that

$$AX + BY \stackrel{d}{=} CX + D,$$

where C can be determined by A^α + B^α = C^α, and α ∈ (0, 2) is the exponent of the stable distribution, sometimes called the stable index. The random variables X and Y are said to obey an α-stable distribution. The case D = 0 corresponds to a strictly stable distribution of X.

The normalizing constants a_n in (1) can be chosen to satisfy the following equation:

$$nL(a_n) = \begin{cases} a_n^{\alpha}\,\dfrac{\Gamma(2-\alpha)}{|1-\alpha|}\,\cos\dfrac{\pi\alpha}{2}, & \alpha \neq 1,\\[6pt] a_n\,\dfrac{\pi}{2}, & \alpha = 1, \end{cases}$$

and we will always assume this choice.


According to Theorem 2.1.1 of Ibragimov and Linnik (1971), a_n must be of the form a_n = n^{1/α} h(n), where h(n) is slowly varying at infinity in the sense of Karamata. That is, we have

$$\frac{\sum_{i=1}^{n} X_i}{n^{1/\alpha}\,h(n)} + b_n \Rightarrow X,$$

where X is some α-stable random variable.

Proposition 2.2. A distribution F is in the domain of attraction of some α-stable law iff

$$x^{\alpha}(1 - F(x)) \sim p\,L(x), \qquad x^{\alpha}F(-x) \sim q\,L(x) \qquad \text{as } x \to \infty,$$

where p, q ≥ 0, p + q = 1, and L(x) is a slowly varying function with Karamata's representation

$$L(x) = c(x)\exp\left(\int_B^x \frac{\varepsilon(t)}{t}\,dt\right), \qquad c(t)\to c,\ \varepsilon(t)\to 0 \text{ as } t\to\infty,$$

where B > 1 and c is some constant. In this case we say F is a heavy-tailed distribution, and α is called the tail index. If F is an α-stable distribution, then the slowly varying function L(x) = c.

Proposition 2.3. If a distribution F is attracted to some α-stable law, then its characteristic function f(t) is

$$f(t) = \exp\left\{-c_0|t|^{\alpha}L(1/|t|)\left(1 - i\beta\,\mathrm{sign}(t)\tan\frac{\pi\alpha}{2}\right)(1+o(1))\right\}$$

for α ≠ 1 (see Ibragimov and Linnik, 1971, Theorem 2.6.5). Note that the characteristic function in the case α = 1 is more complicated (see Aaronson and Denker, 1998, Theorem 2 for details).

3. The structure of the estimators

3.1. Construction of the estimator and the large deviation principle

Let us start with a sequence of i.i.d. strictly α-stable random variables X_1, X_2, ..., X_n, ....


The sum-preserving property of stable laws says

$$\frac{X_1 + X_2 + \cdots + X_n}{n^{1/\alpha}} \stackrel{d}{=} X_1,$$

and this leads to

$$\frac{\log\left|\sum_{i=1}^{n} X_i\right|}{\log n} - \frac{1}{\alpha} \stackrel{d}{=} \frac{\log|X_1|}{\log n}. \qquad (2)$$

Let

$$\hat{\alpha}_n^{-1} = \frac{\log\left|\sum_{i=1}^{n} X_i\right|}{\log n}; \qquad (3)$$

then α̂_n^{-1} → α^{-1} in probability as n → ∞. It follows that α̂_n^{-1} is a reasonable choice for an estimator of the parameter α^{-1}.

In the more general situation of stable attraction we have a similar relation in the sense of weak convergence: a slowly varying function enters the calculation, but by the properties of slowly varying functions the same form of estimator can still be used.
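As a quick numerical illustration (ours, not part of the paper; the choice of a Cauchy sample is arbitrary), the basic estimator (3) is a one-liner in Python:

```python
import numpy as np

def alpha_inv_hat(x):
    """Basic estimator (3): log|sum of the sample| / log(sample size)."""
    return np.log(np.abs(x.sum())) / np.log(len(x))

rng = np.random.default_rng(0)
x = rng.standard_cauchy(100_000)  # standard Cauchy is strictly 1-stable, so 1/alpha = 1
print(alpha_inv_hat(x))           # near 1, but note the slow (log n)^{-1} convergence
```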

We have the following theorem:

Theorem 3.1. Let X_1, X_2, ..., X_n be i.i.d. random variables with distribution F, which is attracted to some α-stable law, and let α̂_n^{-1} be defined by (3). Then, for any given ε > 0, we have the following large deviation principle:

$$\lim_{n\to\infty}\frac{\log P\{\hat{\alpha}_n^{-1} - \alpha^{-1} < -\varepsilon\}}{\log n} = -\varepsilon,$$

$$\lim_{n\to\infty}\frac{\log P\{\hat{\alpha}_n^{-1} - \alpha^{-1} > \varepsilon\}}{\log n} = -\varepsilon\alpha.$$

Remark 3.2. The proof of the theorem is a straightforward calculation of the corresponding large deviation probabilities, using a lemma of Heyde (1968) for the second equation and the Fourier inversion formula for the first one.

Lemma 3.3 (Heyde). Suppose S_n = Σ_{i=1}^{n} X_i, let a_n, X, X_n be defined as in (1), and let c_n → ∞ as n → ∞. Then

$$\lim_{n\to\infty}\frac{P\{|S_n| > a_n c_n\}}{n\,P\{|X_1| > a_n c_n\}} = 1,$$

or equivalently,

$$\lim_{n\to\infty}\frac{P\{|S_n| > a_n c_n\}}{P\{\max_{1\le k\le n}|X_k| > a_n c_n\}} = 1.$$


Proof of Theorem 3.1. For any given ε > 0, as n → ∞,

$$\begin{aligned}
P\{\hat{\alpha}_n^{-1} - \alpha^{-1} > \varepsilon\}
&= P\left\{\hat{\alpha}_n^{-1} - \alpha^{-1} - \frac{\log h(n)}{\log n} > \varepsilon - \frac{\log h(n)}{\log n}\right\}
= P\left\{\frac{|S_n|}{a_n} > \frac{n^{\varepsilon}}{h(n)}\right\}\\
&\sim n\,P\left\{|X_1| > a_n\,\frac{n^{\varepsilon}}{h(n)}\right\} \quad \text{(by Lemma 3.3)}
= n\,P\{|X_1| > n^{1/\alpha+\varepsilon}\}\\
&\sim n\,(n^{1/\alpha+\varepsilon})^{-\alpha} L(n^{1/\alpha+\varepsilon})
= n^{-\varepsilon\alpha}\,L(n^{1/\alpha+\varepsilon}),
\end{aligned}$$

where L is slowly varying at infinity. Taking the logarithm of the last term and dividing it by log n, we obtain the second equation.

The first part of the theorem can be proved as follows:

$$\begin{aligned}
P\{\hat{\alpha}_n^{-1} - \alpha^{-1} < -\varepsilon\}
&= P\left\{\frac{|S_n|}{n^{1/\alpha}} < n^{-\varepsilon}\right\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{\exp(i\,n^{-\varepsilon}t) - \exp(-i\,n^{-\varepsilon}t)}{it}\,f_n(t)\,dt\\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{2\sin(n^{-\varepsilon}t)}{t}\,f_n(t)\,dt
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{2\sin(n^{-\varepsilon}t)}{t}\,f\!\left(\frac{t}{n^{1/\alpha}}\right)^{\!n} dt\\
&\sim \frac{2}{\pi}\,n^{-\varepsilon}\int_0^{\infty}\exp\left\{-c_0|t|^{\alpha}L(n^{1/\alpha})\left(1 - i\,\mathrm{sign}(t)\tan\frac{\pi\alpha}{2}\right)(1+o(1))\right\} dt\\
&\sim \frac{2}{\pi}\,n^{-\varepsilon}\int_0^{\infty}\exp\left(-c_0 t^{\alpha}L(n^{1/\alpha})(1+o(1))\right)\cos\left(c_1 t^{\alpha}L(n^{1/\alpha})\right) dt\\
&\sim K\,n^{-\varepsilon}\,L(n^{1/\alpha})^{-1/\alpha},
\end{aligned}$$

where K is a constant.

3.2. Estimating the stable index using an estimator with U-statistic structure

3.2.1. U-statistic as an estimator

It is well known that U-statistics usually have better statistical properties when used to estimate an unknown parameter. Based on the statistic we have used to estimate


the index of an attracting law, we can construct the following estimator of α^{-1} with U-statistic structure. Suppose now that X_1, X_2, ..., X_n are i.i.d. copies of X with a distribution F attracted to an α-stable law S. Let m < n, and denote by h the function

$$h(x_1, x_2, \ldots, x_m) = \frac{\log\left|\sum_{i=1}^{m} x_i\right|}{\log m}. \qquad (4)$$

We consider h as the kernel of a U-statistic defined by

$$\hat\alpha_U^{-1} = U_n(h) = \binom{n}{m}^{-1}\sum_{1\le i_1 < i_2 < \cdots < i_m \le n} h(X_{i_1}, X_{i_2}, \ldots, X_{i_m}). \qquad (5)$$

This is an estimator of α^{-1}. By the discussion above we know that when m is large enough,

$$E\,h(X_1, X_2, \ldots, X_m)^k < \infty \qquad (6)$$

for every k ∈ N. In fact, if m → ∞, we have (see Meerschaert and Scheffler (1998), Theorem 2)

$$E\left(h(X_1, X_2, \ldots, X_m) - \frac{1}{\alpha}\right) \to 0, \qquad (7)$$

$$E\left(h(X_1, X_2, \ldots, X_m) - \frac{1}{\alpha}\right)^2 \to 0, \qquad (8)$$

that is, the expectation of h tends to α^{-1} and the mean square error tends to 0.

The distribution of α̂_U^{-1} is difficult to obtain, and so is its asymptotic variance. A good way to overcome this difficulty is jackknifing the variance (Arvesen, 1969). Let U_n^0 denote the U-statistic based on the observations X_1, X_2, ..., X_n and U_{n-1}^i the U-statistic based on all observations except X_i, that is, X_1, X_2, ..., X_{i-1}, X_{i+1}, ..., X_n. Let Ū_{n-1} denote the average of the U_{n-1}^i, i = 1, 2, ..., n. It is known that

$$S_n^2 := (n-1)\sum_{i=1}^{n}\left(U_{n-1}^i - \bar U_{n-1}\right)^2 \xrightarrow{p} m^2\zeta_1, \qquad (9)$$

where ζ₁ = Var(E(h(X_1, X_2, ..., X_m)|X_1) − E h(X_1, X_2, ..., X_m)).

In this way we obtain the consistency of α̂_U^{-1} as an estimator of α^{-1}, and also its asymptotic normality under some restrictions on m:

Theorem 3.4. Suppose X_1, X_2, ..., X_n are i.i.d. observations with the same distribution F, which is attracted to an α-stable law, and let α̂_U^{-1} be defined by (5). If m → ∞ and m = o(n^{1/2}) as n → ∞, then

$$\sqrt{n}\,S_n^{-1}\left(\hat\alpha_U^{-1} - E\hat\alpha_U^{-1}\right) \Rightarrow N(0,1) \qquad \text{as } n \to \infty.$$

Proof. Let

$$\psi_{c,m}h(x_1, x_2, \ldots, x_c) = (\delta_{x_1} - P)(\delta_{x_2} - P)\cdots(\delta_{x_c} - P)P^{m-c}h, \qquad (10)$$


c = 1, 2, ..., m, where

$$Q_1\cdots Q_m\,h = \int\cdots\int h\,dQ_1(x_1)\cdots dQ_m(x_m).$$

Hoeffding's decomposition of α̂_U^{-1} gives

$$\hat\alpha_U^{-1} = E(h(X_1, X_2, \ldots, X_m)) + \frac{m}{n}\sum_{i=1}^{n}\psi_{1,m}(h)(X_i) + R_n,$$

where

$$R_n = \sum_{c=2}^{m}\binom{m}{c}\,U_n^{(c)}(\psi_{c,m}(h)).$$

The main part of the proof is to verify that the variance of the remainder term R_n is an infinitesimal of higher order compared with that of (m/n)Σ_{i=1}^{n} ψ_{1,m}(h)(X_i). As we know, if X has a distribution in the domain of attraction of an α-stable law, then X² sign(X) has a distribution attracted to an α/2-stable law, so without loss of generality we can suppose that the stable index α is less than 1. The kernel of the U-statistic can be written in the form

$$h(X_1, X_2, \ldots, X_m) = \frac{\log(m-1)}{\alpha\log m} + \frac{\log l(m-1)}{\log m} + \frac{\log|Y_{m-1}|}{\log m} + \frac{\log\left|1 + X_1/\bigl(Y_{m-1}(m-1)^{1/\alpha}\,l(m-1)\bigr)\right|}{\log m},$$

where Y_{m-1} = Σ_{i=2}^{m} X_i/((m-1)^{1/α} l(m-1)) is independent of X_1, and l(·) is slowly varying at infinity. Hence

$$\zeta_1 = \mathrm{Var}(g_1(X_1)) = \frac{\mathrm{Var}\!\left(E\!\left(\log\left|1 + \dfrac{X_1}{Y_{m-1}(m-1)^{1/\alpha}\,l(m-1)}\right| \,\Big|\, X_1\right)\right)}{(\log m)^2} = \frac{\mathrm{Var}\,Z}{(\log m)^2},$$

where Z = E(log|1 + X_1/(Y_{m-1}(m-1)^{1/α} l(m-1))| | X_1). While EZ ≈ 1/(αm),


using the following inequality we have

$$\begin{aligned}
EZ^2 &\ge m^2|EZ|^2\,P(|Z| > m|EZ|)
\approx \frac{1}{\alpha^2}\,P\left\{|Z| > \frac{1}{\alpha}\right\}\\
&\ge \frac{1}{\alpha^2}\,P\left\{\left|\log\left|1 + \frac{X_1}{Y_{m-1}(m-1)^{1/\alpha}\,l(m-1)}\right|\right| > \frac{1}{\alpha}\right\}
\end{aligned}$$

(since X > a ⇒ E(X|Y) > a, so P(E(X|Y) > a) ≥ P(X > a))

$$\begin{aligned}
&= \frac{1}{\alpha^2}\,P\left\{\frac{X_1}{Y_{m-1}} > (m-1)^{1/\alpha}\,l(m-1)\,(e^{1/\alpha} - 1)\right\}
+ \frac{1}{\alpha^2}\,P\left\{\frac{X_1}{Y_{m-1}} \le -(m-1)^{1/\alpha}\,l(m-1)\,(1 + e^{1/\alpha})\right\}\\
&\quad + \frac{1}{\alpha^2}\,P\left\{\frac{X_1}{Y_{m-1}} \le -(m-1)^{1/\alpha}\,l(m-1)\,(1 - e^{-1/\alpha})\right\}
- \frac{1}{\alpha^2}\,P\left\{\frac{X_1}{Y_{m-1}} \le -(m-1)^{1/\alpha}\,l(m-1)\,(1 + e^{-1/\alpha})\right\}\\
&:= I_1 + I_2 + I_3 - I_4.
\end{aligned}$$

Now we calculate I_1; the other terms are of the same order as I_1:

$$\begin{aligned}
I_1 &:= \frac{1}{\alpha^2}\,P\left\{\frac{X_1}{Y_{m-1}} > A_m\right\}\\
&= \frac{1}{\alpha^2}\int_0^{+\infty} P(X_1 > A_m s)\,f_{Y_{m-1}}(s)\,ds + \frac{1}{\alpha^2}\int_{-\infty}^{0} P(X_1 \le A_m s)\,f_{Y_{m-1}}(s)\,ds\\
&\sim \frac{c_1}{m-1}\int_0^{+\infty} s^{-\alpha} f(s)\,ds + \frac{c_2}{m-1}\int_{-\infty}^{0} |s|^{-\alpha} f(s)\,ds
\sim \frac{1}{m}\cdot d_1,
\end{aligned}$$

where the last step is based on Proposition 2.2, the fact that f_{Y_{m-1}} → f according to the uniform convergence theorem for regularly varying functions with negative index (see Bingham et al., 1989, p. 23, Theorem 1.5.2), and the convergence of the integrals. Here A_m = (m-1)^{1/α} l(m-1)(e^{1/α} − 1), Y_{m-1} = Σ_{i=2}^{m} X_i/((m-1)^{1/α} l(m-1)), l is slowly varying at infinity, f_{Y_{m-1}} is the density of Y_{m-1}, f is the density of a stable distribution, and c_1, c_2, d_1 are positive constants. Hence

$$\zeta_1 = \frac{\mathrm{Var}(Z)}{(\log m)^2} \ge \frac{d_2}{m(\log m)^2},$$

where d_2 is some positive constant. Furthermore, by the theory of U-statistics (Lee, 1990, p. 15, Theorem 4) we have

$$\zeta_1 \le \frac{\zeta_2}{2} \le \frac{\zeta_3}{3} \le \cdots \le \frac{\zeta_m}{m},$$

where, for k = 1, 2, ..., m, ζ_k = Var(h_k(X_1, X_2, ..., X_k)) and h_k(x_1, ..., x_k) = E(h(x_1, ..., x_k, X_{k+1}, ..., X_m)). Following the method of Meerschaert and Scheffler (1998, Theorem 2), we have

$$\zeta_m \approx \frac{\mathrm{Var}(\log|Y|)}{(\log m)^2},$$

where Y is an α-stable random variable and Var(log|Y|) < +∞, whence we obtain that

$$\zeta_1 = \frac{d}{m(\log m)^2}$$

for some positive d. Under this setting of U-statistics, we have a representation of the variance of α̂_U^{-1}:

$$\mathrm{Var}(\hat\alpha_U^{-1}) = \binom{n}{m}^{-1}\sum_{c=1}^{m}\binom{m}{c}\binom{n-m}{m-c}\zeta_c
= \binom{n}{m}^{-1}\binom{m}{1}\binom{n-m}{m-1}\zeta_1 + \binom{n}{m}^{-1}\sum_{c=2}^{m}\binom{m}{c}\binom{n-m}{m-c}\zeta_c.$$

We also have another representation from Hoeffding's decomposition:

$$\mathrm{Var}(\hat\alpha_U^{-1}) = \binom{m}{1}^2\binom{n}{1}^{-1}\mathrm{Var}(\psi_{1,m}(h)) + \sum_{c=2}^{m}\binom{m}{c}^2\binom{n}{c}^{-1}\mathrm{Var}(\psi_{c,m}(h)) = \frac{m^2}{n}\,\zeta_1 + \mathrm{Var}(R_n).$$

In case m = o(n^{1/2}) as n → ∞, we have

$$\binom{n}{m}^{-1}\binom{m}{1}\binom{n-m}{m-1} = \frac{m^2}{n}\left(1 - \frac{m-1}{n-1}\right)\left(1 - \frac{m-1}{n-2}\right)\cdots\left(1 - \frac{m-1}{n-m+1}\right) \sim \frac{m^2}{n}.$$

So from the two representations of Var(α̂_U^{-1}) we know that

$$\mathrm{Var}(R_n) \sim \binom{n}{m}^{-1}\sum_{c=2}^{m}\binom{m}{c}\binom{n-m}{m-c}\zeta_c = \frac{m^2\zeta_1}{n}\,O(m^2/n).$$

Now we have

$$\hat\alpha_U^{-1} - E\hat\alpha_U^{-1} = \frac{m}{n}\sum_{i=1}^{n}\psi_{1,m}(h)(X_i) + R_n.$$

The sum on the right-hand side is an average of n i.i.d. random variables whose variance can be estimated through Eq. (9), and the remainder term R_n is a controlled error under the condition m = o(n^{1/2}). At this point, the central limit theorem completes the proof.

Note that the conclusion of the above theorem can also be stated in the following way: if X_1, X_2, ..., X_n are i.i.d. copies of X in the domain of attraction of a stable law, then we have

$$\sqrt{n}\,S_n^{-1}\left(\hat\alpha_U^{-1} - \frac{1}{\alpha} + c_m\right) \Rightarrow N(0,1) \qquad \text{as } n \to \infty, \qquad (11)$$

where c_m → 0 as m → ∞ and m²/n → 0. That is, α̂_U^{-1} is an asymptotically normal estimator of α^{-1} with very small bias when the sample size is large.

Remark 3.5. The asymptotic normality of α̂_U^{-1} holds only when m = o(n^{1/2}); otherwise its asymptotic limit can be some Wiener–Itô stochastic integral with respect to a Wiener measure or a Poisson measure, since it is then a U-statistic of infinite degree m, and the convergence is seriously affected.
Remark 3.6. Using U-statistics to estimate the stable index does not reduce the bias of the estimator, but the variance becomes smaller. Our estimator is improved in the sense that it possesses a smaller mean square error (MSE).
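To make the construction concrete, here is a brute-force Python sketch (our illustration, not the author's code) of the complete U-statistic (5) with kernel (4) and the jackknife variance (9); it is feasible only for small n and m, which is precisely the motivation for the incomplete versions discussed next:

```python
import numpy as np
from itertools import combinations

def u_stat(x, m):
    """Complete U-statistic (5) with kernel (4): average of log|sum|/log m
    over all m-subsets of the sample. Cost grows like C(n, m)."""
    return np.mean([np.log(abs(sum(c))) / np.log(m) for c in combinations(x, m)])

def jackknife_sn2(x, m):
    """Jackknife variance estimate S_n^2 of (9) from leave-one-out U-statistics."""
    n = len(x)
    loo = np.array([u_stat(np.delete(x, i), m) for i in range(n)])
    return (n - 1) * np.sum((loo - loo.mean()) ** 2)

rng = np.random.default_rng(1)
x = rng.standard_cauchy(20)        # alpha = 1, so alpha^{-1} = 1
est, sn2 = u_stat(x, 4), jackknife_sn2(x, 4)
print(est, np.sqrt(sn2 / len(x)))  # estimate and approximate standard error
```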

3.2.2. Estimating the stable index using incomplete U-statistics

The estimator of a stable index with U-statistic structure has very nice properties. However, if the sample size n or the degree m is large, the calculation of α̂_U^{-1} can be quite elaborate, since it involves averaging $\binom{n}{m}$ terms. Because of the dependence among the summands involved, we can omit some summands of α̂_U^{-1} without unduly inflating the variance. We are thus led to consider a special form of incomplete U-statistics.

Let the assumptions be the same as in the previous section. Suppose that the sample size has the form n = km; otherwise we delete some terms. The i.i.d. observations X_1, X_2, ..., X_n are partitioned into k independent sub-samples, each of size m. From each sub-sample we get one estimator of the stable index, α̂_m^{-1}(j), j = 1, 2, ..., k, and the average of these estimators is our incomplete U-statistic

$$\hat\alpha_{U_0}^{-1} = \frac{1}{k}\sum_{j=1}^{k}\hat\alpha_m^{-1}(j).$$

In this case, the variance of α̂_{U₀}^{-1} is

$$\mathrm{Var}(\hat\alpha_{U_0}^{-1}) = \frac{1}{k}\,\mathrm{Var}(\hat\alpha_m^{-1}(j)), \qquad j = 1, 2, \ldots, k.$$

Comparing it with the variance of the complete U-statistic, we get its asymptotic relative efficiency (ARE) as

$$\mathrm{ARE} = \lim_{n\to\infty}\mathrm{Var}(\hat\alpha_U^{-1})/\mathrm{Var}(\hat\alpha_{U_0}^{-1}) = m\zeta_1/\zeta_m,$$

for given m. By Lemma 1.1.4 of Koroljuk and Borovskich (1994) we have mζ₁/ζ_m < 1. So the ARE depends on the ratio ζ₁/ζ_m and may be close to 1. The asymptotic normality of the estimator can be obtained directly from the properties of U-statistics. For practical use, we have also the following result:

Theorem 3.7. Let α̂_{U₀}^{-1} and α̂_m^{-1}(j), j = 1, 2, ..., k, be defined as above, and

$$s_k^2 = \frac{1}{k-1}\sum_{j=1}^{k}\left(\hat\alpha_m^{-1}(j) - \hat\alpha_{U_0}^{-1}\right)^2.$$

Then if n → ∞ and m is fixed, we have the following normal approximation:

$$\sqrt{k}\;\frac{\hat\alpha_{U_0}^{-1} - E\hat\alpha_{U_0}^{-1}}{s_k} \Rightarrow N(0,1). \qquad (12)$$
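In Python the disjoint-subset version reduces to a few lines (our sketch; the block size is illustrative):

```python
import numpy as np

def alpha_inv_incomplete(x, m):
    """Incomplete U-statistic: average the basic estimator over k = n // m
    disjoint sub-samples of size m; also return the std. error s_k/sqrt(k) of (12)."""
    k = len(x) // m
    blocks = x[:k * m].reshape(k, m)
    ests = np.log(np.abs(blocks.sum(axis=1))) / np.log(m)
    return ests.mean(), ests.std(ddof=1) / np.sqrt(k)

rng = np.random.default_rng(2)
x = rng.standard_cauchy(1000)                   # alpha^{-1} = 1
est, se = alpha_inv_incomplete(x, m=50)         # 20 sub-samples of size 50, as in Section 4
print(est, (est - 1.96 * se, est + 1.96 * se))  # estimate and asymptotic 95% interval
```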

Proof. The proof of the theorem is direct. From the partition of the sample we have that the sub-estimators α̂_m^{-1}(j), j = 1, 2, ..., k, are independent and identically distributed random variables, and α̂_{U₀}^{-1} is exactly their sample mean. The expectations of α̂_{U₀}^{-1} and α̂_m^{-1}(j), j = 1, 2, ..., k, are all equal to each other. The result now follows from routine calculations.


Note that the exact distribution of α̂_{U₀}^{-1} is almost impossible to obtain, so we cannot construct an exact confidence interval for α^{-1}. But from the above theorem we can construct asymptotic confidence intervals, and thereby asymptotic tests for hypotheses on α^{-1}.

Remark 3.8. Our incomplete U-statistic is obtained by choosing k disjoint subsets of the sample successively. We can use other methods to select sub-samples. For example, it is possible to choose the subsets at random with replacement; this is indeed a type of bootstrap. Suppose we have selected L subsets, or sub-samples, each of size m, at random with replacement. Then the estimator with incomplete U-statistic structure has the variance

$$\mathrm{Var}(\hat\alpha_{U_0}^{-1}) = \zeta_m/L + (1 - L^{-1})\,\mathrm{Var}(\hat\alpha_U^{-1}).$$

Hence, repeating the re-sampling procedure infinitely often, we obtain a variance asymptotically very close to that of the complete U-statistic. The random re-sampling procedure without replacement has similar properties. Furthermore, the re-sampling procedure can also be designed to achieve the smallest variance for a given sub-sample size m.

From the discussion above we know that the degree m of the underlying U-statistic can be freely chosen in the range of the sample size. But for different choices of m the estimator performs differently, so we look for an optimal degree. In the strictly stable case,
$$\hat\alpha_{U_0}^{-1} - \alpha^{-1} = \frac{1}{k}\sum_{j=1}^{k}\left(\hat\alpha_m^{-1}(j) - \alpha^{-1}\right) \stackrel{d}{=} \frac{1}{k}\sum_{j=1}^{k}\frac{\log|X_j|}{\log m},$$

where the last step is based on Eq. (2). Its mean square error can be calculated as

$$\mathrm{MSE}(\hat\alpha_{U_0}^{-1}) = E\left(\frac{1}{k}\sum_{j=1}^{k}\frac{\log|X_j|}{\log m}\right)^{2}
= \frac{1}{k}\,\frac{E(\log|X_1|)^2}{(\log m)^2} + \frac{k-1}{k}\,\frac{(E\log|X_1|)^2}{(\log m)^2}
= \frac{\mu_0^2}{(\log m)^2} + \frac{\pi^2}{12\,k(\log m)^2}\left(\frac{2}{\alpha^2} + 1\right),$$

where X_1, X_2, ..., X_n are i.i.d. random variables with a strictly stable distribution and μ₀ = E log|X_1|. In the case of a standard stable distribution, i.e. scale parameter σ = 1, we have μ₀ = C(1/α − 1), where C = 0.5772... is Euler's constant (see Zolotarev, 1986, Chapter 4); we assume further that the underlying stable distribution is symmetric. Now we can start the optimization procedure with a given initial value of α̂₀^{-1}. The optimal value of m is the one minimizing the mean square error of α̂_{U₀}^{-1}. This can be carried out by iteration. In case the parent distribution is merely attracted to a stable law, we have to consider the slowly varying function in addition; we suggest a similar procedure as in the stable case, but further information about the slowly varying function is needed.
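A sketch of one step of this optimization, assuming the plug-in MSE formula above for a standard symmetric strictly stable parent (the grid and the single-step update are our choices):

```python
import numpy as np

EULER_C = 0.5772156649  # Euler's constant C

def mse_u0(m, n, alpha_inv):
    """Plug-in MSE of the sub-sample estimator:
    mu0^2/(log m)^2 + (pi^2/12)(2/alpha^2 + 1)/(k (log m)^2), with k = n // m."""
    k = n // m
    mu0 = EULER_C * (alpha_inv - 1.0)  # mu0 = E log|X_1| = C(1/alpha - 1)
    return (mu0 ** 2 + np.pi ** 2 * (2 * alpha_inv ** 2 + 1) / (12 * k)) / np.log(m) ** 2

def optimal_m(n, alpha_inv0):
    """Degree m minimizing the plug-in MSE for an initial value of 1/alpha;
    re-estimate 1/alpha at this m and iterate."""
    grid = np.arange(5, n // 2 + 1)
    return grid[np.argmin([mse_u0(m, n, alpha_inv0) for m in grid])]

print(optimal_m(1000, alpha_inv0=1.0))  # one iteration step
```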

3.2.3. Estimating the stable index by re-sampling

The estimator of α^{-1} with U-statistic structure can be treated as a special kind of re-sampling from the given observations. In the following we give a randomized re-sampling procedure. Suppose that X_1, X_2, ..., X_n are i.i.d. observations from some population attracted to an α-stable law, and let Y_1, Y_2, ..., Y_n be i.i.d. copies of a Bernoulli random variable Y with parameter p, independent of X. Define a new random sequence Z_i, i = 1, 2, ..., n, by Z_i = X_i · Y_i; then, as z → ∞,

$$P\{|Z_i| > z\} = P\{|X_i|\cdot Y_i > z\} = P\{|X_i| > z\}\cdot p \sim p\,z^{-\alpha}L(z).$$

Hence we have the normalizing constant a_n = (np)^{1/α} l(n) for the convergence of the partial sums of the Z_i, i = 1, 2, ..., n, where l(n) is a slowly varying function. Considering the attraction property for the re-sampled observations, we have

$$\frac{\log\left|\sum_{i=1}^{n} Z_i\right|}{\log np} \to \frac{1}{\alpha}$$

as n → ∞. Furthermore, by the strong law of large numbers we also have

$$\frac{\sum_{i=1}^{n} Y_i}{n} \xrightarrow{a.s.} p.$$

Summarizing, we obtain the following theorem:

Theorem 3.9. Suppose X_i, i = 1, 2, ..., n, are i.i.d. copies of X ∼ F, which is attracted to some α-stable law, and let Y_i, i = 1, 2, ..., n, be observations from a Bernoulli


distribution with parameter p, where X and Y are assumed to be independent. Then

$$\hat\alpha_n^{-1}(r1) = \frac{\log\left|\sum_{i=1}^{n} X_i\cdot Y_i\right|}{\log\sum_{i=1}^{n} Y_i}$$

is a consistent estimator of α^{-1}.

Suppose further that we have k i.i.d. Bernoulli samples of size n, Y_{ij}, i = 1, ..., n, j = 1, ..., k. From every Bernoulli sample we get a re-sample from the population in the domain of stable attraction, and thus k estimates of α^{-1} from the re-sampled observations. The average of these estimates is our re-sampled estimator:

$$\hat\alpha_n^{-1}(R) = \frac{1}{k}\sum_{j=1}^{k}\hat\alpha_n^{-1}(rj), \qquad \text{where } \hat\alpha_n^{-1}(rj) = \frac{\log\left|\sum_{i=1}^{n} X_i\cdot Y_{ij}\right|}{\log\sum_{i=1}^{n} Y_{ij}}.$$

The above theorem is derived from α̂_n^{-1} by a minor change. Note, however, that it opens another possibility to simulate U-statistics in a simpler way. We can also omit the upper and lower extreme values of the repeated estimates of this procedure to get a robust estimate of the stable index. Using this method, we get a good estimate of the unknown parameter in a simple way, and it may be close to the complete U-statistic. This method is simpler than that of the incomplete U-statistic which takes re-samples at random without replacement. α̂_n^{-1}(R) is a consistent estimator of α^{-1} and also has the property of asymptotic normality.
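A Python sketch of this randomized re-sampling estimator (ours; the rate p = 0.1 and 2000 repetitions follow the simulation settings of Section 4, while the trimming fraction for the robust variant is a hypothetical choice):

```python
import numpy as np

def alpha_inv_resample(x, p=0.1, n_rep=2000, trim=0.05, seed=None):
    """Bernoulli re-sampling estimator (Theorem 3.9): average of
    log|sum of a random thinning| / log(number of points kept),
    trimming the extreme repetitions for robustness."""
    rng = np.random.default_rng(seed)
    ests = []
    for _ in range(n_rep):
        keep = rng.random(len(x)) < p      # i.i.d. Bernoulli(p) selectors Y_i
        n_kept = keep.sum()
        if n_kept > 1:
            ests.append(np.log(np.abs(x[keep].sum())) / np.log(n_kept))
    ests = np.sort(ests)
    cut = int(trim * len(ests))            # drop upper and lower extremes
    return ests[cut:len(ests) - cut].mean()

rng = np.random.default_rng(3)
x = rng.standard_cauchy(400)               # small-sample setting of Section 4.2.1
print(alpha_inv_resample(x, seed=4))       # should be near 1/alpha = 1
```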

3.2.4. Estimating the stable index in the case of dependence

Now we consider exponentially ψ-mixing sequences of random variables. A stationary sequence X_1, X_2, ... is called ψ-mixing if

$$\left|\frac{P(A\cap B)}{P(A)P(B)} - 1\right| \le \psi(n)$$

for A ∈ σ{X_j : 1 ≤ j ≤ k}, B ∈ σ{X_j : j ≥ k + n}, for all k, n ∈ N, where the mixing coefficients ψ(n) → 0 as n → ∞. If there exist 0 < ρ < 1 and positive K such that ψ(n) = Kρⁿ, n ∈ N, then the sequence is called exponentially mixing.

Theorem 3.10. Let (X_n, n ≥ 1) be a strictly stationary sequence of random variables such that

$$P\{|X_1| > x\} \sim x^{-\alpha}L(x) \qquad (x \to \infty),$$


where 0 < α < 2 and L(x) is a slowly varying function with Karamata's representation such that

$$\varepsilon(t) = o\left(\frac{1}{\log t}\right).$$

Define b_n by the relation

$$\lim_{n\to\infty} n\,P\{|X_1| > b_n\} = 1.$$

Suppose that

(i) (X_n, n ≥ 1) is exponentially ψ-mixing with ψ(n) = Kρⁿ, n ∈ N, K > 0, 0 < ρ < 1, and
(ii) the X_n, n ≥ 1, are symmetric if 1 < α < 2.

Then, for all x_n = O(n^γ), where γ > 1/α, as n → ∞,

$$x_n^{\alpha}\left|P\left\{\frac{S_n}{b_n}\in x_n A\right\} - n\,P\left\{\frac{X_1}{b_n}\in x_n A\right\}\right| \to 0$$

for all A ∈ B, where B is the Borel σ-algebra and 0 ∉ Ā. In particular,

$$n\,P\left\{\frac{X_1}{b_n}\in A\right\} \to \mu(A)$$

implies

$$x_n^{\alpha}\,P\left\{\frac{S_n}{b_n}\in x_n A\right\} \to \mu(A)$$

as n → ∞.

Remark 3.11. Regarding condition (ii) of the theorem: for any random variable X, if Y takes the two values 1 and −1 with the same probability 1/2 and is independent of X, then Z = XY is symmetric.

Proof of Theorem 3.10. Without loss of generality we can suppose that A = (1, ∞); then we have

$$\begin{aligned}
x_n^{\alpha}\sum_{1\le i<j\le n} P(X_i > b_n x_n,\ X_j > b_n x_n)
&\le x_n^{\alpha}\sum_{1\le i\le n}(n-i)(K\rho^i + 1)\,P(X_1 > b_n x_n)^2\\
&\le x_n^{\alpha}\,\frac{n(n-1)}{2}\,(1+K)\left(x_n^{-\alpha} b_n^{-\alpha} L(x_n b_n)\right)^2\\
&= x_n^{-\alpha}(1+K)\,\frac{n(n-1)}{2n^2}\left(\frac{L(x_n b_n)}{L(b_n)}\right)^2\\
&\sim \frac{1+K}{2}\,x_n^{-\alpha} \to 0, \qquad \text{(IE)}
\end{aligned}$$

where

$$\frac{L(x_n b_n)}{L(b_n)} = \exp\left(\int_{b_n}^{x_n b_n}\frac{\varepsilon(t)}{t}\,dt\right)
= \exp\left(\int_{\frac{1}{\alpha}\log n + \log h(n)}^{\frac{1}{\alpha}\log n + \log h(n) + \log x_n}\varepsilon(e^z)\,dz\right)
\sim \exp\left(\int_{1/\alpha}^{1/\alpha + \log x_n/\log n}\varepsilon(\exp(x\log n))\,\log n\,dx\right)$$

$$= \exp\left(\log(x_n)\,\varepsilon(n^{\gamma'})\right) \quad \text{(by the mean value theorem, with } \tfrac{1}{\alpha} < \gamma' < \tfrac{1}{\alpha} + \tfrac{\log x_n}{\log n}\text{)}
= \exp\left(\log(x_n)\,o\!\left(\frac{1}{\gamma'\log n}\right)\right)
= \exp\left(o\!\left(\frac{\log x_n}{\gamma'\log n}\right)\right) \to 1.$$

Fix 0 < η < 1. Since

$$\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > \eta b_n x_n\}}\right| > b_n x_n\right\} \,\triangle\, \left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > b_n x_n\}}\right| > b_n x_n\right\}
\subset \{\exists\, i \neq j : |X_i| > \eta b_n x_n,\ |X_j| > \eta b_n x_n\}
= \bigcup_{1\le i<j\le n}\{|X_i| > \eta b_n x_n,\ |X_j| > \eta b_n x_n\},$$

using (IE) for ηx_n we have that

$$\lim_{n\to\infty} x_n^{\alpha}\left|P\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > b_n x_n\}}\right| > b_n x_n\right\} - P\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > \eta b_n x_n\}}\right| > b_n x_n\right\}\right| = 0.$$


Moreover, since

$$\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > b_n x_n\}}\right| > b_n x_n\right\} = \bigcup_{k=1}^{n}\{X_k > b_n x_n\}$$

up to the event that two of the |X_k| exceed b_n x_n, it follows from Bonferroni's inequality and (IE) that

$$\lim_{n\to\infty} x_n^{\alpha}\left|P\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > b_n x_n\}}\right| > b_n x_n\right\} - n\,P(X_1 > b_n x_n)\right|
\le \lim_{n\to\infty} x_n^{\alpha}\sum_{1\le i<j\le n} P(X_i > b_n x_n,\ X_j > b_n x_n) = 0.$$

Following the proof of Theorem 2 in Denker and Jakubowski (1989), it suffices to show that, for 1 < α < 2,

$$\lim_{\eta\to 0}\limsup_{n\to\infty}\, x_n^{\alpha}\, P\left\{\left|\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| < \eta b_n x_n\}}\right| > b_n x_n\right\} = 0.$$

Since ψ-mixing implies ϕ-mixing, the above equation is true. So we have

$$\lim_{n\to\infty} x_n^{\alpha}\left|P\left\{\frac{S_n}{b_n} > x_n\right\} - n\,P\left\{\frac{X_1}{b_n} > x_n\right\}\right|
= \lim_{n\to\infty} x_n^{\alpha}\left|P\left\{\frac{\sum_{k=1}^{n} X_k \mathbf{1}_{\{|X_k| > \eta b_n x_n\}}}{b_n} > x_n\right\} - n\,P\left\{\frac{X_1}{b_n} > x_n\right\}\right|
\le \lim_{n\to\infty} x_n^{\alpha}\sum_{1\le i<j\le n} P(X_i > b_n x_n,\ X_j > b_n x_n) = 0.$$

From Theorem 3.10 we have the following corollary:

Corollary 3.12. Under the conditions of Theorem 3.10, the sequence S_n/b_n, n ≥ 1, converges weakly to a strictly α-stable law ν whose Lévy measure has density

$$f(x) = |x|^{-1-\alpha}\left(c\,\mathbf{1}_{\{x>0\}} + (1-c)\,\mathbf{1}_{\{x<0\}}\right),$$

where b_n = n^{1/α} h(n) and h(n) is slowly varying at infinity.

Similar to the independent case, we have the following theorem.

Theorem 3.13. Under the assumptions of Theorem 3.10,

$$\hat\alpha_n^{-1} = \frac{\log\left|\sum_{i=1}^{n} X_i\right|}{\log n}$$

is a consistent estimator of α^{-1}.


Proof. The proof follows directly from Theorem 3.10 and some simple calculations.

3.3. Improvement of the estimator of α

As we already mentioned above, the convergence rate of our estimator α̂_n^{-1} is very slow, and when the scale parameter of the attracting stable law is not equal to 1 it introduces a large bias, even for very large sample sizes. An improvement of the estimator of α is therefore necessary.

Let X be a strictly stable random variable with scale parameter σ and index α. From Zolotarev (1986, Chapter 4, pp. 218–220) we know that

$$E\log|X| = \log\sigma + C\left(\frac{1}{\alpha} - 1\right),$$

where C = 0.577216... is Euler's constant. Let B(α, σ) = log σ + C(1/α − 1). If we have an i.i.d. sample X_1, X_2, ..., X_n from the parent population X, then an unbiased estimator of B(α, σ) is

$$\hat B(\alpha, \sigma) = \frac{1}{n}\sum_{i=1}^{n}\log|X_i|.$$

Suppose now we have N = kn observations from some strictly stable distribution with index α and scale parameter σ. The observations are partitioned in the following way:

X_{11}, X_{12}, ..., X_{1n}; X_{21}, X_{22}, ..., X_{2n}; ...; X_{k1}, X_{k2}, ..., X_{kn}.

From every sub-sample X_{i1}, X_{i2}, ..., X_{in} we can get an estimator of α^{-1}, say α̂_n^{-1}(i), i = 1, 2, ..., k, and we have

$$\log n\left(\hat\alpha_n^{-1}(i) - \frac{1}{\alpha}\right) \stackrel{d}{=} \log|X_i|.$$

Taking the average of the k estimators α̂_n^{-1}(i), i = 1, 2, ..., k, we have

$$\hat\alpha_{U_0}^{-1} = \frac{1}{k}\sum_{i=1}^{k}\hat\alpha_n^{-1}(i) \qquad (13)$$

and

$$\log n\left(\hat\alpha_{U_0}^{-1} - \frac{1}{\alpha}\right) \stackrel{d}{=} \frac{1}{k}\sum_{i=1}^{k}\log|X_i|.$$

If k is large enough, the right-hand side of the above equation tends to the expectation of log|X_i|, which can be estimated from the sample mean of the log|X_{ij}|. Now we get our unbiased estimator of 1/α as follows:

$$\hat\alpha_{U_0 s}^{-1} = \hat\alpha_{U_0}^{-1} - \frac{1}{kn\log n}\sum_{i=1}^{k}\sum_{j=1}^{n}\log|X_{ij}|. \qquad (14)$$


Then we have the following theorem:

Theorem 3.14. Suppose X_1, X_2, ..., X_N are i.i.d. observations from a strictly α-stable distribution, N = kn. Let α̂_{U₀s}^{-1} be defined by (14). Then

$$E\hat\alpha_{U_0 s}^{-1} = \frac{1}{\alpha}.$$

The scale parameter can be estimated as follows:

$$\widehat{\log\sigma} = \frac{1}{kn}\sum_{i=1}^{k}\sum_{j=1}^{n}\log|X_{ij}| - C\left(\hat\alpha_{U_0 s}^{-1} - 1\right),$$

and it is an asymptotically unbiased estimator of log σ.

Remark 3.15. If both k and n go to infinity, we have asymptotic normality of α̂_{U₀s}^{-1}, which is unbiased and has an improved variance.
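A compact Python sketch of the bias-corrected estimator (14) together with the scale estimate (our illustration; the sub-sample size is arbitrary):

```python
import numpy as np

EULER_C = 0.5772156649

def alpha_inv_corrected(x, n_sub):
    """Bias-corrected estimator (14): sub-sample average of the basic estimator
    minus (mean of log|X|)/log(n_sub); also returns the asymptotically unbiased
    estimate of log(sigma) for a strictly stable parent."""
    k = len(x) // n_sub
    blocks = x[:k * n_sub].reshape(k, n_sub)
    a_u0 = np.mean(np.log(np.abs(blocks.sum(axis=1))) / np.log(n_sub))  # eq. (13)
    mean_log = np.mean(np.log(np.abs(blocks)))                          # B-hat
    a_u0s = a_u0 - mean_log / np.log(n_sub)                             # eq. (14)
    log_sigma = mean_log - EULER_C * (a_u0s - 1.0)
    return a_u0s, log_sigma

rng = np.random.default_rng(5)
x = 2.0 * rng.standard_cauchy(3000)       # Cauchy with scale sigma = 2 (alpha = 1)
print(alpha_inv_corrected(x, n_sub=100))  # approx (1.0, log 2 = 0.693)
```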

Remark 3.16. In the case of other heavy-tailed distributions, we can also use the estimator α̂_{U₀s}^{-1} as an approximation. The simulation shows that it will overestimate the real parameter by about 10%.

3.4. The case of a non-strict population

In the sections above we have discussed methods for estimating the stable index α and the corresponding properties of the estimators. Everything said so far assumed that the corresponding parent population is strictly distributed. If the strictness assumption does not hold, the estimator can usually not be used when α ≥ 1. If we want the estimator to remain applicable, we should first of all make the observations strictly distributed. In what follows we give some suggestions for transforming non-strictly distributed observations into symmetric ones.

Let X_1, X_2, ..., X_n be i.i.d. copies of some population X attracted to a stable law, where X is not necessarily strictly distributed. Let Y_1, Y_2, ..., Y_n be i.i.d. random variables with the distribution

$$P\{Y_i = 1\} = P\{Y_i = -1\} = \frac{1}{2},$$

and define new random variables by Z_i = X_i · Y_i, i = 1, 2, ..., n. Then we have, for any given x > 0,

$$P\{Z_i < x\} = \tfrac{1}{2}\left[P\{X < x\} + P\{X > -x\}\right], \qquad
P\{Z_i > -x\} = \tfrac{1}{2}\left[P\{X < x\} + P\{X > -x\}\right],$$

that is, the random variables Z_i, i = 1, 2, ..., n, are symmetric, and of course also strictly distributed. Furthermore, if the attracting stable law has index α, we have, as x → ∞,

$$P\{Z_i > x\} = \tfrac{1}{2}\left[P\{X > x\} + P\{X < -x\}\right] \sim \tfrac{1}{2}x^{-\alpha}L(x),$$

$$P\{Z_i < -x\} = \tfrac{1}{2}\left[P\{X > x\} + P\{X < -x\}\right] \sim \tfrac{1}{2}x^{-\alpha}L(x).$$

Note that the procedure can also be repeated, and an average of the correspondingly generated estimators is available.

Another way to make our estimators applicable without the strictness assumption is to make the exponent smaller, less than 1, say. Then the attraction property holds without adding a centering constant, that is, the convergence is the same as in the strict case. In fact, if X ∼ F and F is in the domain of attraction of an α-stable law, let Y = |X|² sign(X); then Y is also in the domain of attraction of a stable law, but the index of the attracting stable distribution changes to α/2. From the assumption α < 2 we know α/2 < 1. We then have the corresponding estimator of α:

$$\widehat{\alpha_n^{-1}}(q) = \frac{\log\left|\sum_{i=1}^{n} Y_i\right|}{2\log n}.$$

The simulation results show that even when the parent population is strict, the above adjustment gives better estimates.

There is yet another method to make a distribution strict: suppose X_1, X_2, ..., X_n, where n = 2m, are i.i.d. observations from a distribution F. Then the random variables Y_1, Y_2, ..., Y_m, where Y_i = X_{2i-1} − X_{2i}, i = 1, 2, ..., m, are strictly distributed.
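The three transformations can be compared directly; in the following sketch (ours; the Pareto-type parent is an arbitrary non-strict example) each variant recovers 1/α where the raw estimator fails:

```python
import numpy as np

def est(z):
    """Basic estimator (3): log|sum| / log(sample size)."""
    return np.log(np.abs(z.sum())) / np.log(len(z))

def symmetrize(x, rng):
    """Z_i = X_i * Y_i with P(Y = 1) = P(Y = -1) = 1/2: symmetric, hence strict."""
    return x * rng.choice([-1.0, 1.0], size=len(x))

def halve_index(x):
    """Y = |X|^2 sign(X) halves the tail index to alpha/2 < 1, where no
    centering is needed; compensate with the factor 2 in the denominator."""
    return np.sign(x) * x ** 2

def pair_differences(x):
    """Y_i = X_{2i-1} - X_{2i}: differences of i.i.d. variables are symmetric."""
    x = x[:len(x) // 2 * 2]
    return x[0::2] - x[1::2]

rng = np.random.default_rng(8)
x = rng.pareto(1.5, 2000)        # one-sided, tail index 1.5: not strictly distributed

print(est(x))                    # biased: the positive drift dominates the sum
print(est(symmetrize(x, rng)))   # approx 1/1.5
print(est(halve_index(x)) / 2)   # approx 1/1.5
print(est(pair_differences(x)))  # approx 1/1.5
```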
4. Simulations

In this section we compare our estimators to the most commonly used ones by simulation. Our estimators are simulated in two cases: (1) the parent distribution is stable; (2) the parent distribution is attracted to some α-stable law.

4.1. The simulation of estimators of the stable index

For different values of α, the stable observations are generated by the method of Chambers, Mallows and Stuck (1976).

Table 1
Simulation—stable—n = 1000, 1000 replicates

α     MEAN      STD       MAXL      MINL      95%       5%
0.2   0.179975  0.011743  0.233847  0.142098  0.199641  0.161013
0.4   0.368623  0.025317  0.452184  0.299404  0.410056  0.327046
0.6   0.565932  0.041946  0.719763  0.453906  0.636796  0.500755
0.8   0.780657  0.060605  1.036442  0.606317  0.886737  0.686527
1.0   0.986729  0.0797    1.338058  0.754691  1.122484  0.859136
1.2   1.200678  0.104251  1.761665  0.921855  1.384139  1.003652
1.4   1.406406  0.129193  2.060279  1.088249  1.641126  1.220774
1.6   1.602115  0.155149  2.473736  1.267983  1.886896  1.399116
1.8   1.783365  0.17714   2.633028  1.43042   2.079229  1.543659
In this representation, W is a standard exponential variable, θ is uniformly distributed on (−π/2, π/2), θ₀ = −(π/2)·βk(α)/α with k(α) = 1 − |1 − α|, and β is the skewness of the stable distribution. We take β = 0 here for simplicity; α = 1 corresponds to a Cauchy distribution.
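For reference, a minimal sketch of the symmetric (β = 0) case, in which the Chambers–Mallows–Stuck formula reduces to sin(αθ)/(cos θ)^{1/α} · (cos((1−α)θ)/W)^{(1−α)/α} (our Python code, not the SAS procedure used in the paper):

```python
import numpy as np

def sym_stable(alpha, size, rng):
    """Symmetric alpha-stable variates via Chambers-Mallows-Stuck with beta = 0:
    theta ~ U(-pi/2, pi/2), W ~ Exp(1)."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(theta)  # the Cauchy case
    return (np.sin(alpha * theta) / np.cos(theta) ** (1 / alpha)
            * (np.cos((1 - alpha) * theta) / w) ** ((1 - alpha) / alpha))

rng = np.random.default_rng(6)
x = sym_stable(0.8, 20 * 50, rng)
# Sub-sample estimator on 20 groups of size 50, as in Table 1:
print(np.mean(np.log(np.abs(x.reshape(20, 50).sum(axis=1))) / np.log(50)))  # ~ 1/0.8
```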

We obtain the observations directly from the SAS random number generating procedure. The sample size is 20 × 50 = 1000: we partition the sample into 20 groups, or sub-samples, each of size 50. One thousand repetitions are taken. The simulated mean, standard deviation, the maximal and minimal estimate and the 95%- and 5%-quantiles are given in Table 1. The simulation results show that the estimator is asymptotically unbiased and the simulated standard deviations are very small; the simulated 90% confidence intervals of the parameters are very short.

Next we present the comparison of our estimator α̂_{U₀} with the estimators of Press and Zolotarev. The estimator of Press performs better when α > 1 and that of Zolotarev is better when α < 1. Our estimator α̂_{U₀} performs almost as well as the better of the Press and Zolotarev estimators throughout. Note that the estimator of Press is difficult to simulate 1000 times when the parameter α < 0.4: the computer refused to calculate the cosine or sine transformation of the observed data. The estimator of Zolotarev also has a problem: we have observed many samples from stable distributions for which the associated estimates of 1/α² take negative values.

In the following two tables, MEAN_U0 and STD_U0 denote the simulated mean values and standard deviations of our estimator α̂_{U₀}, and MAX_U0 and MIN_U0 the maxima and minima; the sub-indices z and p represent the Zolotarev and Press estimators. The sample size here is 20 × 50 = 1000 (Table 2).

In the case of small sample size, n = 200 for example, we have simulated our re-sampling estimator 200 times with stable parent distributions. Our re-sampling procedure gives very good estimates of the unknown parameters, while the estimators of Press and Zolotarev perform less well. When α is greater than 1, the Zolotarev estimator performs poorly, but our re-sampling estimator and the Press estimator perform very well. When α is less than 1, all three estimators perform well, but the Press estimator is relatively worse.
Table 2
Comparison simulation—stable—n = 1000, 1000 replicates

α     MEAN_U0  MEAN_z  MEAN_p  STD_U0  STD_z   STD_p
0.4   0.370    0.401   0.400   0.0243  0.0137  0.0354
0.6   0.565    0.599   0.600   0.0420  0.0218  0.0376
0.8   0.785    0.803   0.802   0.0664  0.0324  0.0438
1.0   0.991    1.003   1.000   0.0847  0.0498  0.0505
1.2   1.203    1.207   1.203   0.1010  0.0679  0.0570
1.4   1.405    1.415   1.407   0.1197  0.0971  0.0619
1.6   1.610    1.612   1.605   0.1550  0.1337  0.0646
1.8   1.791    1.823   1.801   0.1825  0.1860  0.0586

Table 3
Comparison simulation—stable—200 times

α     MEAN_r    MEAN_z    MEAN_p    STD_r     STD_z     STD_p
0.3   0.277025  0.300399  0.288717  0.032808  0.022099  0.067261
0.6   0.570816  0.600956  0.600964  0.067654  0.049142  0.08977
0.9   0.903337  0.923222  0.916601  0.115214  0.086165  0.100267
1.2   1.201662  1.213172  1.20449   0.149892  0.171192  0.134035
1.5   1.55455   1.631239  1.537978  0.182088  0.330738  0.144458
1.8   1.876307  2.023097  1.819125  0.161744  0.730564  0.107914

In the following there are some results of the above simulation. Here MEAN and STD represent the simulated mean and standard deviation of the estimators, and the sub-indices r, z, p denote our re-sampling estimator, the estimator of Zolotarev and that of Press, respectively (Table 3).

4.2. The simulation of estimators of the tail index

Now we come to the simulation of the estimators for the tail index of heavy-tailed distributions. The simulation of random variables whose distributions have heavy tails is based on the following theorem; we have generated all our observations in this way.

Theorem 4.1. Let X be a random variable with probability density f(x), x ∈ R, satisfying the condition

$$\lim_{x\to 0} f(x) = m > 0.$$

Then

$$Y = \mathrm{sign}(X)\cdot|X|^{-1/\alpha}$$

has a distribution with heavy tails of tail index α, i.e.,

$$y^{\alpha}\,P\{|Y| > y\} \sim L(y) \qquad (y \to \infty),$$

where L(y) is slowly varying at infinity.

Proof. For any real y > 0,

$$y^{\alpha}P\{|Y| > y\} = y^{\alpha}P\{|X|^{-1/\alpha} > y\} = y^{\alpha}P\{|X| < y^{-\alpha}\}
= y^{\alpha}\int_{-y^{-\alpha}}^{y^{-\alpha}} f(t)\,dt = \int_{-1}^{1} f(zy^{-\alpha})\,dz =: L(y).$$

Now, for any given λ > 0,

$$\frac{L(\lambda y)}{L(y)} = \frac{\int_{-1}^{1} f(z\lambda^{-\alpha}y^{-\alpha})\,dz}{\int_{-1}^{1} f(zy^{-\alpha})\,dz}
\to \frac{\int_{-1}^{1}\lim_{t\to 0}f(t)\,dz}{\int_{-1}^{1}\lim_{t\to 0}f(t)\,dz} = 1 \qquad (y\to\infty).$$

It follows that L(y) is slowly varying and the distribution of Y has heavy tails of index α.
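For example (our choice of seed density; any density positive and continuous at zero works), heavy-tailed samples with a prescribed tail index can be generated as follows:

```python
import numpy as np

def heavy_tailed(alpha, size, rng):
    """Theorem 4.1: Y = sign(X) |X|^{-1/alpha} has tail index alpha whenever the
    density of X is positive and continuous at 0; here X ~ U(-1, 1), f(0) = 1/2."""
    x = rng.uniform(-1.0, 1.0, size)
    return np.sign(x) * np.abs(x) ** (-1.0 / alpha)

rng = np.random.default_rng(7)
y = heavy_tailed(0.8, 30 * 100, rng)
# Sub-sample estimator on 30 groups of size 100, as in Table 4:
print(np.mean(np.log(np.abs(y.reshape(30, 100).sum(axis=1))) / np.log(100)))  # ~ 1/0.8
```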

Based on the above theorem, we have simulated our estimator α̂_{U₀} of the tail index when the parent distributions have heavy tails. The tail indices are taken from 0.2 to 1.8 with step size 0.2. For each value of α we have taken a sample of size 30 × 100 = 3000, and one thousand replicates are simulated. The simulation results show that our estimator α̂_{U₀} underestimates the real parameters a little, about 5% below the target, but the standard deviations are very small: all 1000 estimates lie very near the theoretical value, and the simulated ranges of the estimator are very narrow. See Table 4 for details.

4.2.1. The simulation results of α̂_n^{-1}(R)

Next consider the simulation results of the re-sampling procedure. The sample size here is 400. We have compared the unimproved and the re-sampling estimator of α. The re-sampling rate is 0.1; 2000 re-samples and 100 repetitions are taken.

The simulation results show that the estimator without re-sampling adjustment behaves very badly when the sample size is small, but when the re-sampling procedure


Table 4
Simulation result—DA(α)—1000 times

α     MEAN      STD       MAXL      MINL
0.2   0.1857    0.008701  0.215748  0.161254
0.4   0.372936  0.017568  0.429575  0.319699
0.6   0.560768  0.027235  0.660525  0.488642
0.8   0.752233  0.038655  0.869054  0.632564
1.0   0.942731  0.04766   1.114957  0.804083
1.2   1.136058  0.056685  1.305414  0.959664
1.4   1.32609   0.07605   1.579052  1.157234
1.6   1.525933  0.086198  1.845454  1.287219
1.8   1.718552  0.101254  2.097628  1.451296
Table 5
Simulation results for α̂_n(R)

α     MEAN      MEAN_r    STD        STD_r
0.2   0.193972  0.189954  0.039187   0.016387
0.4   0.383693  0.374682  0.068228   0.034126
0.6   0.58773   0.56753   0.1142144  0.052592
0.8   0.767623  0.764183  0.129302   0.061988
1.0   1.0273    0.966376  0.20466    0.0931
1.2   1.2426    1.1453    0.477682   0.09827
1.4   1.423468  1.33073   0.327833   0.110699
1.6   1.668838  1.540794  0.542352   0.132829
1.8   1.82359   1.71423   0.4848     0.138599
is used, we get a great improvement both in unbiasedness and in standard deviation. Let α̂_n(R) denote the estimator α̂_{U₀} when it is used in the re-sampling procedure. In Table 5, MEAN and STD denote the means and standard deviations of the estimator α̂_n, and MEAN_r and STD_r those of the re-sampled estimator. The simulation results tell us that it is very dangerous to use the unadjusted estimator α̂_n directly, especially when α is larger than 1. The re-sampled estimator, however, performs very well: unbiasedness holds and the simulated standard deviations are very small (Table 5).

4.3. Comparison with the estimators of Hill and of De Haan and Resnick

The most important advantages of our estimator are that we use all of the observations and that the formula is very simple. The following simulated comparisons show that our estimator also performs very well. In the simulation study, our averaged estimator is simulated together with the sub-sampling procedure. Our estimator is


Table 6
Comparison simulation results—with Hill's and R&D's estimators

α     MEAN      MEAN_h     MEAN_rd   STD       STD_h     STD_rd
0.2   0.184255  0.200443   0.187198  0.012125  0.020616  0.043228
0.4   0.369841  0.402636   0.373979  0.025111  0.041024  0.085669
0.6   0.555074  0.602877   0.563899  0.039116  0.059673  0.128889
0.8   0.744661  0.8056175  0.756091  0.052415  0.078557  0.17507
1.0   0.935727  1.003484   0.942006  0.068127  0.102838  0.215082
1.2   1.120411  1.199682   1.118505  0.080414  0.123391  0.264387
1.4   1.324354  1.407946   1.322749  0.096013  0.146552  0.298518
1.6   1.523432  1.603543   1.487837  0.113431  0.160963  0.39942
1.8   1.721861  1.797191   1.671671  0.126096  0.18476   0.446097
2.0   1.927878  1.999309   1.869232  0.143364  0.201738  0.439047
N     1.96267   3.461711   5.078234  0.197853  0.312094  0.612247

unbiased and always has the smallest standard deviations compared with Hill's estimator and De Haan and Resnick's estimator. Here are the simulations of a comparison with Hill's estimator and that of De Haan and Resnick, where the parent distributions are in the domain of attraction of a stable distribution and the sample size is 20 × 50 = 1000. One thousand repetitions are computed (Table 6). In the table, MEAN and STD denote the simulated mean values and standard deviations of our estimator, and the sub-indices h and rd represent Hill's and De Haan and Resnick's estimators; N denotes the standard normal distribution.

The simulation results in the above table show that our estimator always performs better than Hill's and De Haan and Resnick's estimators in the sense of mean squared error (MSE). In fact, the MSE of Hill's estimator is about 2 times that of ours, and the estimator of De Haan and Resnick performs worse still. One more thing to note: when the three estimators are used to estimate the index of a normal distribution, only our estimator performs very well; the estimates of the other two are totally misleading.

The case of small sample size: when the sample size is not large, say 200, we have also compared our estimator to the other two, using our re-sampling procedure. Our simulation results show that the standard deviations of Hill's estimator become too large, but our re-sampling procedure still yields very good estimates. The comparison is simulated for α = 0.3, 0.6, 1.2, 1.6, with 200 repetitions. The re-sampling rate here is 15%, and the corresponding k (the number of largest order statistics in Hill's and R&D's estimators) is taken to be 20 (Table 7).


Table 7
Comparison simulation results—with Hill's and R&D's estimators

α     MEAN      MEAN_h    MEAN_rd   STD       STD_h     STD_rd
0.3   0.280791  0.307381  0.280648  0.034833  0.069951  0.097299
0.6   0.568137  0.636665  0.564266  0.065783  0.135139  0.184693
1.2   1.15863   1.287416  1.17409   0.141467  0.285386  0.389861
1.6   1.545009  1.740074  1.548522  0.184832  0.417576  0.551903

sub-samples. Because the rate of convergence of our estimator is only of order (log n)−1 , n is the sample size, but at the same time the convergence region is very narrow. It means we can get from every sub-sample an estimate almost as good as that from the whole sample. When taking average of the sub-estimates we can improve the standard deviations considerably. When the sample size is not so large, we cannot get much by sub-sampling procedures. In this situation we can use our re-sampling procedure. By deleting the extremal estimates in the re-sampling procedure, we can get a relatively robust estimate of the corresponding parameter with small standard deviation. Or we can use our estimator with U-statistic structure. That is the best estimator we can get from the sum preserving property. Acknowledgements The author is grateful to his Ph.D. advisor Professor Dr. Manfred Denker for his constant encouragement, helpful suggestions and valuable discussions in the preparation of the paper. References Aaronson, J., Denker, M., 1998. Characteristic functions of random variables attracted to 1-stable laws. Ann. Probab. 26 (1), 399–415. Arvesen, J.N., 1969. Jackkni9ng u-statistics. Ann. Math. Statist. 40 (6), 2076–2100. Bingham, N.H., Goldie, C.M., 1989. In: Teugels, J.L. (Ed.), Regular Variation. Cambridge University Express, Cambridge. Chambers, J.M., Mallows, C.L., Stuck, B.W., 1976. A method for simulating stable random variables. JASA 71 (354), 340–344. CsIorgIo, S., Deheuvels, P., Mason, D., 1985. Kernel estimates of the tail index of a distribution. Ann. Statist. 13, 1050–1077. Denker, M., Jakubowski, A., 1989. Stable limit distributions for strongly mixing sequences. Statist. Probab. Lett. 8, 477– 483. Fama, E.F., Roll, R., 1968. Some properties of symmetric stable distributions. J. Amer. Statist. Assoc. 63, 817–836. Feuerverger, A., Hall, P., 1999. Estimating a tail exponent by modelling departure from a Pareto distribution. Ann. Statist. 27 (2), 760–781. De Haan, L., Pereira, T.T., 1999. Estimating the index of a stable distribution. Statist. Probab. Lett. 41, 39–55.


De Haan, L., Resnick, S., 1980. A simple asymptotic estimate for the index of a stable distribution. J. Roy. Statist. Soc. Ser. B 42, 83–88. Hall, P., 1982. On some simple estimates of an exponent of regular variation. J. Roy Statist. Soc. B 44 (1), 37–42. Heyde, C.C., 1968. On large deviation probabilities in the case of attraction to a non-normal stable law. SANKHYA 30, 253–258. Hill, B.M., 1975. A simple general approach to inference about the tail of a distribution. Ann. Statist. 3, 1163–1174. Holt, D.R., Crow, E.L., 1973. Tables and graphs of the stable probability density functions. J. Res. Bur. Standards Sect. B 77, 143–198. Ibragimov, I.A., Linnik, Yu.V., 1971. Independent and Stationary Sequences of Random Variables. Wolters– Noordho< publishing, Groningen, The Netherlands. Koroljuk, V.S., Borovskich, Yu.V., 1994. Theory of U-Statistics. Kluwer Academic Publishers, Dordrecht. Meerschaert, M.M., ScheJer, H.-P., 1998. A simple robust estimation method for the thickness of heavy tails. J. Statist. Plann. Inference 71, 19–34. Mittnik, S., Rachev, S.T., 1993. Modeling asset returns with alternative stable models. Econometric Rev. 12, 261–330. Pickands, J., 1975. Statistical inference using extreme order statistics. Ann. Statist. 3, 119–131. Press, S.J., 1972. Estimation in univariate and multivariate stable distributions. J. Amer. Statist. Assoc. 67 (340), 842–846. Rempala, G., Gupta, A., 1999. Weak limits of u-statistics of in9nite order. Random Operators Stochast. Equations 7 (1), 39–52. Zolotarev, V.M., 1986. One-Dimensional Stable Distributions. Transl. Math. Monographs, vol. 65.