Journal of Statistical Planning and Inference 76 (1999) 1–17
On the bootstrap and the moving block bootstrap for the maximum of a stationary process

Krishna B. Athreya¹, Jun-ichiro Fukuchi*, Soumendra N. Lahiri²

Iowa State University, IA, USA

Received 30 June 1997; accepted 22 June 1998

* Corresponding author; present address: Faculty of Economics, Hiroshima University, 1-2-1 Kagamiyama, Higashi-Hiroshima 739, Japan.
¹ Research supported in part by NSF Grant DMS 92-04938.
² Research supported in part by NSF Grant DMS 95-05124.
Abstract

In this paper, asymptotic properties of bootstrap methods for the maximum of a stationary process are investigated. It is shown that Efron's bootstrap provides a valid approximation to the sampling distribution of the normalized maximum only in a restricted situation, whereas the moving block bootstrap is successful in a more general situation. © 1999 Elsevier Science B.V. All rights reserved.

AMS classification: primary 62G09; 60G70; 60G10

Keywords: Bootstrap; Maximum; Moving block bootstrap; Stationary process
1. Introduction

Efron (1979) introduced the bootstrap method of estimating the sampling distributions of statistics. It is well known that when observations are independent and identically distributed (i.i.d.), Efron's bootstrap (EB) provides a valid approximation to the sampling distributions of a wide variety of statistics (the EB is then said to be consistent), including sample means, sample quantiles and von Mises functionals (Bickel and Freedman, 1981). It is also known that, for some statistics, the EB does not approximate their distributions at all. The maximum of random variables is one example for which the EB fails to be consistent. Recently, Swanepoel (1986), Deheuvels et al. (1993) and Athreya and Fukuchi (1994, 1997) (hereafter referred to as AF1 and AF2, respectively) showed that one can make the EB consistent for extremes of i.i.d. random variables by making the resample size suitably smaller than the sample size. A natural question is: Is the bootstrap still consistent for the maximum of a stationary process? The aim of this paper is to investigate the asymptotic behavior of the EB and of the moving block bootstrap (MBB), introduced by Kunsch (1989) and Liu and Singh (1992), for this problem. A main result of this paper is that the EB provides a valid approximation to the distribution of the maximum for a restricted class of stationary processes, but not in general. The MBB, on the other hand, provides a valid approximation for a wider class of stationary processes.

The rest of this paper is organized as follows. Section 2 reviews some basic results on extremes of stationary processes. In Section 3, the asymptotic properties of the EB are investigated under Leadbetter's conditions $D(u_n)$ and $D'(u_n)$. In Section 4, asymptotic properties of the EB and the MBB are investigated when the process has extremal index $\theta \neq 1$.
2. Preliminaries

Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process with marginal cumulative distribution function (cdf) $F$ and let $M_n = \max\{X_1, X_2, \ldots, X_n\}$. There is a considerable body of results on the asymptotic behavior of the maximum $M_n$ (see, e.g., Leadbetter et al., 1983). In order to obtain results similar to those for i.i.d. random variables, the distributional mixing condition $D(u_n)$ introduced by Leadbetter (1974) is required. For each $h$ with $1 \le h \le n-1$, define
$$\alpha_{n,h} = \sup\{|P\{X_j \le u_n,\ j \in A \cup B\} - P\{X_j \le u_n,\ j \in A\}\,P\{X_j \le u_n,\ j \in B\}| : A \subset \{1,\ldots,k\},\ B \subset \{k+h,\ldots,n\},\ 1 \le k \le n-h\}.$$
Then $\{X_i\}_{i=1}^{\infty}$ is said to satisfy $D(u_n)$ if $\alpha_{n,h_n} \to 0$ for some $h_n = o(n)$. The condition $D(u_n)$ is much weaker than the strong mixing condition. The classical result concerning the possible extremal types holds for a stationary process satisfying $D(u_n)$.

Theorem 1 (Leadbetter, 1974). Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process. Suppose that there exist $a_n > 0$, $b_n \in \mathbb{R}$ and a nondegenerate cdf $G$ such that
$$P\{a_n^{-1}(M_n - b_n) \le x\} \to G(x) \quad (1)$$
for each continuity point $x$ of $G$. Suppose that $D(u_n)$ is satisfied for $u_n = a_n x + b_n$ for each $x \in \mathbb{R}$. Then $G$ is one of the following classical types:
$$\Lambda(x) = \exp(-\exp(-x)), \quad x \in \mathbb{R};$$
$$\Phi_\alpha(x) = \begin{cases} 0, & x < 0, \\ \exp(-x^{-\alpha}), & x \ge 0; \end{cases}$$
$$\Psi_\alpha(x) = \begin{cases} \exp(-(-x)^{\alpha}), & x < 0, \\ 1, & x \ge 0, \end{cases} \quad (2)$$
where $\alpha > 0$.
Leadbetter (1974) introduced a further condition $D'(u_n)$ as follows. A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to satisfy $D'(u_n)$ if
$$\limsup_{n \to \infty}\ n \sum_{j=2}^{[n/k]} P\{X_1 > u_n,\ X_j > u_n\} \to 0$$
as $k \to \infty$, where $[x]$ is the integer part of $x \in \mathbb{R}$. Under $D(u_n)$ and $D'(u_n)$, the asymptotic theory of extremes is considerably simplified, as the following theorem shows.

Theorem 2 (Leadbetter, 1974). Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process with marginal cdf $F$ and let $\{u_n\}$ be a sequence of constants such that $D(u_n)$ and $D'(u_n)$ hold. Let $0 \le \tau < \infty$. Then
$$P\{M_n \le u_n\} \to e^{-\tau} \quad (3)$$
if and only if
$$n\{1 - F(u_n)\} \to \tau. \quad (4)$$
An implication of Theorem 2 is that if $D(u_n)$ and $D'(u_n)$ are satisfied for $u_n = a_n x + b_n$ for each $x \in \mathbb{R}$, where $a_n$ and $b_n$ are as in Eq. (1), then the limit distribution is determined only by the marginal cdf $F$ and is not affected by the joint distribution of $\{X_i\}_{i=1}^{\infty}$ at all.

The asymptotic theory of extremes without assuming $D'(u_n)$ has also been developed by several authors. Chernick (1981) has shown that if for each $\tau > 0$ there is a sequence $u_n(\tau)$ such that $P\{M_n \le u_n(\tau)\}$ has a limit, and if Eq. (4) and $D(u_n)$ are satisfied with $u_n = u_n(\tau)$, then
$$P\{M_n \le u_n(\tau)\} \to e^{-\theta\tau} \quad (5)$$
for each $\tau > 0$ and for some $\theta$ with $0 \le \theta \le 1$. A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to have extremal index $\theta$ if Eqs. (4) and (5) hold for each $\tau > 0$. Characterizations of $\theta$ and the asymptotic theory for extremes of a stationary process which has extremal index $\theta$ were developed by, among others, Leadbetter (1983), Hsing et al. (1988), Leadbetter and Nandagopalan (1989), and Chernick et al. (1991).

Let $F$ and $G$ be nondegenerate cdf's. A cdf $F$ is said to belong to the (maximal) domain of attraction of $G$ (written $F \in D(G)$) if there exist sequences of real numbers $\{a_n\}$ and $\{b_n\}$ such that
$$F^{n}(a_n x + b_n) \to G(x) \quad (6)$$
as $n \to \infty$, for every continuity point $x$ of $G$. It is well known (see de Haan, 1970) that the normalizing constants $a_n$ and $b_n$ in Eq. (6) may be chosen as follows:

(i) if $F \in D(\Lambda)$: $a_n = F^{-1}\!\left(1 - \dfrac{1}{en}\right) - \gamma_n$, $b_n = \gamma_n$; $\quad$ (7)
(ii) if $F \in D(\Phi_\alpha)$: $a_n = \gamma_n$, $b_n = 0$; $\quad$ (8)

(iii) if $F \in D(\Psi_\alpha)$: $a_n = x_F - \gamma_n$, $b_n = x_F$; $\quad$ (9)
where $F^{-1}(u) = \inf\{x: F(x) \ge u\}$, $x_F = \sup\{x: F(x) < 1\}$ and $\gamma_n = F^{-1}(1 - n^{-1})$. From the above definition of the extremal index, it is seen that if $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta$ and $F \in D(G)$, where $F$ is the marginal cdf of $\{X_i\}_{i=1}^{\infty}$ and $G = \Lambda$, $\Phi_\alpha$ or $\Psi_\alpha$ for some $\alpha > 0$, then with the constants given in Eqs. (7)–(9), $a_n^{-1}(M_n - b_n)$ has a nondegenerate limit distribution whose type is $G$.

Listed below are several mixing conditions which will be used in later sections.

Definition 1. Let $\{\mathbf{u}_n\} = \{(u_n^{(1)}, \ldots, u_n^{(r)})'\}$ be a sequence in $\mathbb{R}^r$. For each $h$ with $1 \le h \le n-1$, define
$$\alpha_{n,h}(\mathbf{u}_n) = \sup\{|P\{X_j \le v_j,\ j \in A \cup B\} - P\{X_j \le v_j,\ j \in A\}\,P\{X_j \le v_j,\ j \in B\}| : A \subset \{1,\ldots,k\},\ B \subset \{k+h,\ldots,n\},\ 1 \le k \le n-h\},$$
where each $v_j$ is any choice of the $r$ values $u_n^{(1)}, \ldots, u_n^{(r)}$. A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to satisfy the condition $D_r(\mathbf{u}_n)$ if $\alpha_{n,h_n}(\mathbf{u}_n) \to 0$, as $n \to \infty$, for some sequence $h_n = o(n)$.

Definition 2. For each $n, i, j$ with $1 \le i \le j \le n$ and a sequence $\{u_n\}$, define $\mathcal{F}_i^{j}(u_n)$ to be the $\sigma$-algebra generated by the events $\{X_s \le u_n\}$, $i \le s \le j$. Also, for each $n$ and $1 \le h \le n-1$, write
$$\tilde{\alpha}_h(u_n) = \sup\{|P(A \cap B) - P(A)P(B)| : A \in \mathcal{F}_1^{k}(u_n),\ B \in \mathcal{F}_{k+h}^{\infty}(u_n),\ k \in \mathbb{N}\},$$
where $\mathbb{N}$ is the set of positive integers. A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to satisfy the condition $\Delta(u_n)$ if $\tilde{\alpha}_{h_n}(u_n) \to 0$ for some $h_n = o(n)$.

Definition 3. For each $h \ge 1$, define
$$\alpha_h^-(u_n) = \sup\{|P\{X_j > u_n,\ j \in A \cup B\} - P\{X_j > u_n,\ j \in A\}\,P\{X_j > u_n,\ j \in B\}| : A \subset \{1,\ldots,k\},\ B \subset \{k+h,\ldots\},\ k \ge 1\}.$$
A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to satisfy the condition $D^-(u_n)$ if $\alpha_{h_n}^-(u_n) \to 0$ for some $h_n = o(n)$.

The condition $D_r(\mathbf{u}_n)$ is from Leadbetter et al. (1983, p. 107). The condition $\Delta(u_n)$ was introduced by Hsing et al. (1988) to prove the convergence of the point process of exceedances.
3. Efron's bootstrap under $D(u_n)$ and $D'(u_n)$

Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process with marginal cdf $F$. For $x \in \mathbb{R}$, let
$$F_n(x) = n^{-1}\sum_{i=1}^{n} I(X_i \le x)$$
be the empirical distribution function (edf) of the sample $\mathbf{X}_n := (X_1, X_2, \ldots, X_n)$. In the following, the subscript $n$ in $\{m_n\}$, $\{l_n\}$ and $\{k_n\}$ is suppressed for notational simplicity. Let $\{m\}$ be a sequence of positive integers such that $m \to \infty$ as $n \to \infty$. Given $\mathbf{X}_n$, let $Y_1, Y_2, \ldots, Y_m$ be conditionally i.i.d. random variables with cdf $F_n$ and let $Y_{1:m} \le Y_{2:m} \le \cdots \le Y_{m:m}$ be their order statistics. Define
$$G_n(x) = P\{a_n^{-1}(M_n - b_n) \le x\}$$
and
$$H_{n,m}(x) = P\{a_m^{-1}(Y_{m:m} - b_m) \le x \mid \mathbf{X}_n\}.$$
$H_{n,m}$ is called the Efron bootstrap (EB) distribution of $a_n^{-1}(M_n - b_n)$. Here $n$ and $m$ are called the sample size and the resample size, respectively. Note that the EB ignores the dependence structure of the process: its resampling scheme is i.i.d. sampling.

It is known that, for extremes of i.i.d. random variables, the EB distribution $H_{n,m}$ has a random limit if $m_n = n$, and thus the EB fails to provide a valid approximation to $G_n$; see Angus (1993) for the case in which the normalizing constants are estimated and AF1 for the case in which they are not. For a stationary process, a similar result holds.

Theorem 3. Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process such that Eq. (1) holds for some $a_n > 0$, $b_n \in \mathbb{R}$ and a nondegenerate cdf $G$. Let $\mathbf{u}_n = (u_n^{(1)}, u_n^{(2)}, \ldots, u_n^{(r)})'$. Suppose that $D'(u_n)$ holds for all sequences $u_n = a_n x + b_n$, $x \in \mathbb{R}$, and that $D_r(\mathbf{u}_n)$ holds for all $r = 1, 2, \ldots$ and all sequences $u_n^{(k)} = a_n x_k + b_n$, $1 \le k \le r$, for arbitrary choices of the $x_k$. If $m = n$, then
$$H_{n,m}(x) \xrightarrow{d} \exp\{-V(x)\},$$
where $V(x)$ is a Poisson random variable with mean $-\log G(x)$.

Proof. The notation in this proof follows that of Resnick (1987, Ch. 3). Let $\mathcal{B}$ be the Borel $\sigma$-algebra of subsets of $\mathbb{R}$. For $x \in \mathbb{R}$, define the measure $\varepsilon_x$ on $\mathcal{B}$ by $\varepsilon_x(A) = 1$ if $x \in A$ and $\varepsilon_x(A) = 0$ if $x \notin A$, for $A \in \mathcal{B}$. Let $\zeta_{n,m} = \sum_{i=1}^{n} \varepsilon_{a_n^{-1}(X_i - b_n)}$. Under the assumptions, it is known (Leadbetter et al., 1983, Theorem 5.7.2) that the point process $\zeta_{n,m}$ converges weakly to a Poisson random measure PRM($\mu$), where $\mu$ is the measure determined by $\mu(x, \infty] = -\log G(x)$, $x \in \mathbb{R}$. The result follows from a variant of the continuous mapping theorem (Billingsley, 1968, Theorem 5.5).
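For concreteness, $H_{n,m}$ can be approximated by Monte Carlo resampling from $F_n$. The following Python sketch is illustrative only and is not part of the original paper; the function name, its arguments and the default number of resamples are arbitrary choices.

```python
import numpy as np

def eb_max_cdf(x_sample, m, a_m, b_m, x_grid, n_boot=2000, seed=None):
    """Monte Carlo approximation of the Efron bootstrap (EB) distribution
    H_{n,m}(x) = P{ a_m^{-1}(Y_{m:m} - b_m) <= x | X_n },
    where Y_1, ..., Y_m are drawn i.i.d. from the empirical cdf F_n."""
    rng = np.random.default_rng(seed)
    x_sample = np.asarray(x_sample, dtype=float)
    # i.i.d. resampling from F_n: the time order of X_1, ..., X_n is ignored
    resamples = rng.choice(x_sample, size=(n_boot, m), replace=True)
    normalized_max = (resamples.max(axis=1) - b_m) / a_m
    return np.array([np.mean(normalized_max <= x) for x in x_grid])
```

With $m = n$ this approximates the random limit of Theorem 3, whereas Theorem 4 below gives conditions under which the choice $m = o(n)$ makes $H_{n,m}(x)$ converge to $G(x)$.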
Swanepoel (1986), Deheuvels et al. (1993), AF1 and AF2 showed that the EB is consistent for the maximum of i.i.d. random variables if $m = o(n)$. It is known that in general the EB fails drastically for dependent random variables (cf. Remark 2.1 of Singh, 1981). Since the i.i.d. resampling scheme of the EB destroys the time order in the sample, the EB fails to approximate the sampling distribution of a statistic of interest whenever its limit distribution depends on the joint distribution of the process. However, since the limit distribution $G$ of the normalized maximum is completely determined by the marginal distribution $F$ when the conditions $D(u_n)$ and $D'(u_n)$ are fulfilled, it is natural to conjecture that the EB is still consistent in this situation if $m = o(n)$. In fact, this conjecture is correct under $D(u_n)$, $D'(u_n)$ and an additional condition $D^-(u_n)$.

Theorem 4. Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process such that Eq. (1) holds for some $a_n > 0$, $b_n \in \mathbb{R}$ and a nondegenerate cdf $G$. Suppose that $D(u_n)$, $D'(u_n)$ and $D^-(u_n)$ hold for a sequence $u_n = a_n x + b_n$, $x \in \mathbb{R}$, and that the mixing coefficient $\alpha_h^-(u_n)$ of $D^-(u_n)$ satisfies $\lim_{n\to\infty} n^2 \alpha_{[n\delta]}^-(u_n) < \infty$ for every $\delta > 0$. Assume $m = o(n)$ and
$$\lim_{p\to 0}\lim_{n\to\infty}\left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n-1}(n-j)\,r_n(j) = 0,$$
where
$$r_n(j) = \mathrm{Cov}\{I(X_1 > u_m),\ I(X_{j+1} > u_m)\} = P(X_1 > u_m,\ X_{j+1} > u_m) - P^2(X_1 > u_m).$$
Then
$$H_{n,m}(x) \xrightarrow{p} G(x) \quad (10)$$
as $n \to \infty$.

Proof. Let $c(x) = -\log G(x)$. Since $H_{n,m}(x) = [1 - m\{1 - F_n(u_m)\}/m]^{m}$, it is enough to show that $m\{1 - F_n(u_m)\} \xrightarrow{p} c(x)$. We have $E[m\{1 - F_n(u_m)\}] = m\{1 - F(u_m)\} \to c(x)$ and
$$\mathrm{Var}[m\{1 - F_n(u_m)\}] = \mathrm{Var}\left[\frac{m}{n}\sum_{i=1}^{n} I(X_i > u_m)\right] = \left(\frac{m}{n}\right)^{2}\left\{n\,r_n(0) + 2\sum_{j=1}^{n-1}(n-j)\,r_n(j)\right\} = A_{n,1} + A_{n,2} \quad \text{(say)}.$$
Then
$$A_{n,1} = \frac{m^2}{n}\,P(X_1 > u_m)\{1 - P(X_1 > u_m)\} \sim \frac{m}{n}\,c(x) \to 0,$$
as $n \to \infty$. For each $p \in (0,1)$,
$$\tfrac{1}{2}|A_{n,2}| = \left|\left(\frac{m}{n}\right)^{2}\sum_{j=1}^{[np]}(n-j)\,r_n(j) + \left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n-1}(n-j)\,r_n(j)\right|$$
$$\le \frac{m^2}{n}\sum_{j=2}^{[np]+1} P(X_1 > u_m,\ X_j > u_m) + \frac{m^2}{n}\,[np]\,P^2(X_1 > u_m) + \left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n-1}(n-j)\,r_n(j)$$
$$= B_{n,1} + B_{n,2} + \left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n-1}(n-j)\,r_n(j) \quad \text{(say)}.$$
Then
$$B_{n,1} = \frac{m}{n}\,m\sum_{j=2}^{[mp]} P(X_1 > u_m,\ X_j > u_m) + \frac{m^2}{n}\sum_{j=[mp]+1}^{[np]+1} P(X_1 > u_m,\ X_j > u_m)$$
$$\le \frac{m}{n}\,m\sum_{j=2}^{[mp]} P(X_1 > u_m,\ X_j > u_m) + \frac{m^2}{n}\sum_{j=[mp]+1}^{[np]+1} P^2(X_1 > u_m) + \frac{m^2}{n}\,([np] - [mp] + 1)\,\alpha_{[mp]}^-(u_m)$$
$$\sim 0 + p\,m^2 P^2(X_1 > u_m) + p\,m^2 \alpha_{[mp]}^-(u_m) \quad (\text{by } D'(u_n)).$$
Thus $\lim_{n\to\infty} B_{n,1} \le p\,c^2(x) + pK$ for some $K > 0$, by assumption, and $B_{n,2} \to p\,c^2(x)$ as $n \to \infty$. Therefore,
$$\lim_{n\to\infty}\mathrm{Var}[m\{1 - F_n(u_m)\}] = \lim_{p\to 0}\lim_{n\to\infty}\mathrm{Var}[m\{1 - F_n(u_m)\}] \le \lim_{p\to 0}\{4p\,c^2(x) + 2pK\} + 2\lim_{p\to 0}\lim_{n\to\infty}\left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n-1}(n-j)\,r_n(j) = 0.$$
Hence $H_{n,m}(x) \xrightarrow{p} G(x)$ as $n \to \infty$.

Remark. If convergence (10) takes place for every $x \in \mathbb{R}$, then the convergence is uniform over $\mathbb{R}$, because the limiting distribution $G$ is continuous; a proof can be found in Theorem 2.1 of Politis and Romano (1994) or Lemma 1 of AF2. The same remark applies to the other results in this paper.

The next corollary gives a simpler condition on $m$ by assuming a power-law decay of $\alpha_h^-(u_n)$.

Corollary 1. Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process such that Eq. (1) holds for some $a_n > 0$, $b_n \in \mathbb{R}$ and a nondegenerate cdf $G$. Suppose that $D(u_n)$, $D'(u_n)$ and $D^-(u_n)$ hold for a sequence $u_n = a_n x + b_n$, $x \in \mathbb{R}$, and that the mixing coefficient $\alpha_h^-(u_n)$ of $D^-(u_n)$ satisfies $\alpha_h^-(u_n) \le C h^{-\beta}$ for some $\beta > 2$ and $C > 0$, for every $n, h \in \mathbb{N}$. If $m = o(n)$, then Eq. (10) holds.
Proof. By $D^-(u_n)$, $|r_n(j)| \le \alpha_j^-(u_m) \le C j^{-\beta}$; thus
$$\left(\frac{m}{n}\right)^{2}\left|\sum_{j=[np]+1}^{n}(n-j)\,r_n(j)\right| \le \left(\frac{m}{n}\right)^{2}\sum_{j=[np]+1}^{n}(n-j)\,j^{-\beta} = \frac{m^2}{n^{\beta}}\cdot\frac{1}{n}\sum_{j=[np]+1}^{n}\left(\frac{j}{n}\right)^{-\beta}\left(1 - \frac{j}{n}\right),$$
which converges to zero, since $m = o(n)$, $\beta > 2$, and
$$\frac{1}{n}\sum_{j=[np]+1}^{n}\left(\frac{j}{n}\right)^{-\beta}\left(1 - \frac{j}{n}\right) \to \int_{p}^{1} x^{-\beta}(1-x)\,dx < \infty$$
as $n \to \infty$. Therefore Theorem 4 yields the result.

The strong consistency of the bootstrap can be proved for a stationary process satisfying $\Delta(u_n)$ and $D'(u_n)$. The next lemma is standard.

Lemma 1. Suppose that the condition $\Delta(u_n)$ holds for a stationary process $\{X_i\}_{i=1}^{\infty}$. Let $Y$ and $Z$ be $\mathcal{F}_1^{j}(u_n)$- and $\mathcal{F}_{j+h}^{\infty}(u_n)$-measurable random variables, respectively, such that $|Y| \le M_1$ and $|Z| \le M_2$. Then
$$|E(YZ) - E(Y)E(Z)| \le 4\,\tilde{\alpha}_h(u_n)\,M_1 M_2.$$

Proof. See, for example, Theorem 17.2.1 of Ibragimov and Linnik (1971).

Theorem 5. Suppose that $\Delta(u_n)$ and $D'(u_n)$ hold for a sequence $u_n = a_n x + b_n$, $x \in \mathbb{R}$, and that the mixing coefficient $\tilde{\alpha}_h(u_n)$ of $\Delta(u_n)$ satisfies $\tilde{\alpha}_h(u_n) \le C\rho^{h}$ for some $\rho \in (0,1)$ and $C > 0$, for every $n, h \in \mathbb{N}$. If $m = O(n^{\gamma})$ for some $\gamma \in (0, 1/2)$, then
$$H_{n,m}(x) \to G(x)$$
with probability 1, as $n \to \infty$.

Proof. Let $T_{n,m} = \sum_{i=1}^{n} I(X_i > u_m)$ and $p_m = P(X_1 > u_m)$. It suffices to show that $(m/n)(T_{n,m} - np_m) \to 0$ with probability 1. What follows is based on a blocking argument. Let $r$ be a positive integer, let $h \sim n^{1/(2r+1)}$ and let $k = [n/(2h)]$ be the integer part of $n/(2h)$. Define $Z_j = I(X_j > u_m) - p_m$, $j = 1, 2, \ldots, n$, and define the block sums
$$U_{i,n} = \sum_{j=2(i-1)h+1}^{(2i-1)h} Z_j, \qquad V_{i,n} = \sum_{j=(2i-1)h+1}^{2ih} Z_j, \qquad i = 1, 2, \ldots, k-1;$$
$$U_{k,n} = \sum_{j=2(k-1)h+1}^{(2k-1)h} Z_j, \qquad V_{k,n} = \sum_{j=(2k-1)h+1}^{n} Z_j.$$
Then
$$E(T_{n,m} - np_m)^{2r} = E\left(\sum_{j=1}^{n} Z_j\right)^{2r} = E\left(\sum_{i=1}^{k} U_{i,n} + \sum_{i=1}^{k} V_{i,n}\right)^{2r} \le 2^{2r}\left\{E\left(\sum_{i=1}^{k} U_{i,n}\right)^{2r} + E\left(\sum_{i=1}^{k} V_{i,n}\right)^{2r}\right\}.$$
Define $A_{j,2r} = \{(x_1, x_2, \ldots, x_j): x_i \in \mathbb{N},\ x_1 + x_2 + \cdots + x_j = 2r\}$. Then
$$E\left(\sum_{i=1}^{k} U_{i,n}\right)^{2r} = \sum_{j=1}^{2r}\ \sum_{(\beta_1,\ldots,\beta_j)\in A_{j,2r}}\ \sum_{1 \le i_1 < \cdots < i_j \le k} E(U_{i_1,n}^{\beta_1} U_{i_2,n}^{\beta_2} \cdots U_{i_j,n}^{\beta_j}).$$
Since the $U_{i,n}$'s are separated by at least $h$ variables, it follows from Lemma 1 that
$$\left|E(U_{i_1,n}^{\beta_1} U_{i_2,n}^{\beta_2} \cdots U_{i_j,n}^{\beta_j}) - \prod_{s=1}^{j} E(U_{i_s,n}^{\beta_s})\right| \le 16(j-1)h^{2r}\tilde{\alpha}_h(u_m) \le 32 C r h^{2r}\rho^{h} \to 0,$$
as $n \to \infty$. Clearly, there exists $K > 0$ such that $|E U_{i,n}^{s}| \le K h^{s}$ for each $s = 1, 2, \ldots, 2r$. Let
$$C(r) = \sum_{j=1}^{r}\ \sum_{(\beta_1,\ldots,\beta_j)\in A_{j,2r}} \frac{K^{j}}{j!} + 1.$$
Since $\#\{(i_1, i_2, \ldots, i_j): 1 \le i_1 < \cdots < i_j \le k\} = \binom{k}{j} \le k^{j}/j!$ for large $n$, it follows that
$$E\left(\sum_{i=1}^{k} U_{i,n}\right)^{2r} = \sum_{j=1}^{2r}\ \sum_{(\beta_1,\ldots,\beta_j)\in A_{j,2r}}\ \sum_{1 \le i_1 < \cdots < i_j \le k}\ \prod_{s=1}^{j} E(U_{i_s,n}^{\beta_s}) + O(h^{2r}\rho^{h})$$
$$= \sum_{j=1}^{r}\ \sum_{(\beta_1,\ldots,\beta_j)\in A_{j,2r}}\ \sum_{1 \le i_1 < \cdots < i_j \le k}\ \prod_{s=1}^{j} E(U_{i_s,n}^{\beta_s}) + O(h^{2r}\rho^{h})$$
$$\le \sum_{j=1}^{r}\ \sum_{(\beta_1,\ldots,\beta_j)\in A_{j,2r}} \frac{k^{j}}{j!}\,K^{j} h^{2r} + O(h^{2r}\rho^{h}) \le C(r)\,k^{r} h^{2r},$$
where the terms with $j$ larger than $r$ vanish, since any such term contains a factor $E U_{i,n} = 0$ (at least one $\beta_s$ equals 1 when $j > r$). Therefore,
$$P\left\{\frac{m}{n}|T_{n,m} - np_m| > \varepsilon\right\} \le \varepsilon^{-2r}\left(\frac{m}{n}\right)^{2r} E(T_{n,m} - np_m)^{2r} \le \varepsilon^{-2r}\,2^{2r+1} C(r)\left(\frac{m}{n}\right)^{2r} k^{r} h^{2r}$$
for large $n$, and
$$\sum_{n=1}^{\infty}\left(\frac{m}{n}\right)^{2r} k^{r} h^{2r} \le 2^{-r}\sum_{n=1}^{\infty}\frac{m^{2r} h^{r}}{n^{r}} \le \sum_{n=1}^{\infty} n^{-(r - 2r\gamma)}\, n^{r/(2r+1)} < \infty$$
for sufficiently large $r$, since $-(r - 2r\gamma) + r/(2r+1) < -r(1 - 2\gamma) + 2^{-1} < -1$ for sufficiently large $r$. The Borel-Cantelli lemma completes the proof.

In the arguments above, the normalizing constants $a_n$ and $b_n$ are assumed to be known. However, when the cdf $F$ is unknown, $a_n$ and $b_n$ are also unknown and need to be estimated. Let $\hat{a}_m$ and $\hat{b}_m$ be estimators of $a_m$ and $b_m$, and define
$$\hat{H}_{n,m}(x) = P\{\hat{a}_m^{-1}(Y_{m:m} - \hat{b}_m) \le x \mid \mathbf{X}_n\}.$$
Natural estimators of $a_m$ and $b_m$ are their empirical counterparts, and $\hat{H}_{n,m}(x)$ is consistent with this choice of $\hat{a}_m$ and $\hat{b}_m$, as the next theorem shows.

Theorem 6. Let $v_n = [n/m]$ and $v_n' = [n/(em)]$. Define $\hat{a}_m$ and $\hat{b}_m$ as follows:

(i) if $F \in D(\Lambda)$: $\hat{a}_m = F_n^{-1}\!\left(1 - \dfrac{1}{em}\right) - F_n^{-1}\!\left(1 - \dfrac{1}{m}\right) = X_{n-v_n':n} - X_{n-v_n:n}$, $\quad \hat{b}_m = F_n^{-1}\!\left(1 - \dfrac{1}{m}\right) = X_{n-v_n:n}$;

(ii) if $F \in D(\Phi_\alpha)$: $\hat{a}_m = F_n^{-1}\!\left(1 - \dfrac{1}{m}\right) = X_{n-v_n:n}$, $\quad \hat{b}_m = 0$;

(iii) if $F \in D(\Psi_\alpha)$: $\hat{a}_m = x_{F_n} - F_n^{-1}\!\left(1 - \dfrac{1}{m}\right) = X_{n:n} - X_{n-v_n:n}$, $\quad \hat{b}_m = x_{F_n} = X_{n:n}$.

Then, under the assumptions of Theorem 4 or Corollary 1,
$$\hat{H}_{n,m}(x) \xrightarrow{p} G(x)$$
as $n \to \infty$.

Proof. We give the proof only for the case $F \in D(\Psi_\alpha)$; the proofs for the other cases are similar. From Theorem 2 of AF2, it is enough to show that $\hat{a}_m/a_m \xrightarrow{p} 1$ and $a_m^{-1}(\hat{b}_m - b_m) \xrightarrow{p} 0$, which, in the case $F \in D(\Psi_\alpha)$, reduce to
$$\frac{x_F - \gamma_n}{x_F - \gamma_m} \to 0 \quad (11)$$
and
$$\frac{x_F - X_{n-v_n:n}}{x_F - \gamma_m} \xrightarrow{p} 1. \quad (12)$$
Convergence (11) was proved in Theorem 3 of AF2. It was shown in the proof of Theorem 4 that
$$m\{1 - F_n(a_m x + b_m)\} \xrightarrow{p} c(x) \quad (13)$$
for every $x \in \mathbb{R}$. An inspection of the proof of Theorem 3 of AF2 shows that Eq. (13) implies Eq. (12).
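For concreteness, the empirical normalizing constants of Theorem 6 are simple functions of the order statistics. The following Python sketch is illustrative only; the function name and the `domain` labels are assumptions, not notation from the paper.

```python
import numpy as np

def empirical_norming_constants(x_sample, m, domain):
    """Empirical counterparts of (a_m, b_m) as in Theorem 6.
    domain: 'gumbel'  for F in D(Lambda),
            'frechet' for F in D(Phi_alpha),
            'weibull' for F in D(Psi_alpha)."""
    x = np.sort(np.asarray(x_sample, dtype=float))  # x[i] = X_{i+1:n}
    n = len(x)
    v = n // m                        # v_n  = [n/m]
    v_prime = int(n // (np.e * m))    # v'_n = [n/(em)]
    if domain == 'gumbel':    # (i)   a_m = X_{n-v':n} - X_{n-v:n},  b_m = X_{n-v:n}
        return x[n - v_prime - 1] - x[n - v - 1], x[n - v - 1]
    if domain == 'frechet':   # (ii)  a_m = X_{n-v:n},               b_m = 0
        return x[n - v - 1], 0.0
    if domain == 'weibull':   # (iii) a_m = X_{n:n} - X_{n-v:n},     b_m = X_{n:n}
        return x[-1] - x[n - v - 1], x[-1]
    raise ValueError("domain must be 'gumbel', 'frechet' or 'weibull'")
```

These estimates can be plugged into the bootstrap distributions in place of $a_m$ and $b_m$, as in $\hat{H}_{n,m}$ above and in $\hat{H}_{n,l,k}$ of Section 4.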
4. Efron's and moving block bootstrap when $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta$

In this section, asymptotic properties of the EB and the moving block bootstrap (MBB) are investigated when the process $\{X_i\}_{i=1}^{\infty}$ does not satisfy $D'(u_n)$. Instead, it is assumed that $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta$, $0 \le \theta \le 1$. In order to obtain the results in this section, we introduce a weak dependence condition $D^+$ which is stronger than $D(u_n)$ but weaker than the strong mixing condition.

Definition 4. For each $h \ge 1$, define
$$g(h) = \sup\{|P\{X_j \le u,\ j \in A \cup B\} - P\{X_j \le u,\ j \in A\}\,P\{X_j \le u,\ j \in B\}|,\ |P\{X_j > u,\ j \in A \cup B\} - P\{X_j > u,\ j \in A\}\,P\{X_j > u,\ j \in B\}| : A \subset \{1,\ldots,k\},\ B \subset \{k+h,\ldots\},\ k \in \mathbb{N},\ u \in \mathbb{R}\}.$$
A stationary process $\{X_i\}_{i=1}^{\infty}$ is said to satisfy the condition $D^+$ if $g(h) \to 0$ as $h \to \infty$.

It is easy to show that the EB is inconsistent when $\theta \neq 1$, even if $m/n \to 0$, as follows. Assume that $D^+$ holds for $\{X_i\}_{i=1}^{\infty}$ with $g(h) = h^{-\beta}$, where $\beta > 1$, and that $m^2/n \to 0$. Then
$$\left(\frac{m}{n}\right)^{2}\left|\sum_{j=1}^{n-1}(n-j)\,r_n(j)\right| \le \frac{m^2}{n}\sum_{j=1}^{n-1} j^{-\beta} \to 0.$$
Thus it follows from the proof of Theorem 4 that $H_{n,m}(x) \xrightarrow{p} e^{-\tau(x)}$, where $\tau(x) = \lim_{n\to\infty} n\{1 - F(a_n x + b_n)\}$, while $G_n(x) \to e^{-\theta\tau(x)}$. Hence the EB is inconsistent. The reason for the failure of the EB in this case is that the extremal index $\theta$ is determined by the joint distribution of $\{X_i\}_{i=1}^{\infty}$ (Leadbetter, 1983; Leadbetter and Nandagopalan, 1989; Chernick et al., 1991), while the EB cannot capture any dependence structure of $\{X_i\}_{i=1}^{\infty}$: in the Efron bootstrap sample, the dependence structure of $(X_1, X_2, \ldots, X_n)$ is destroyed because the Efron bootstrap ignores the time order in $(X_1, X_2, \ldots, X_n)$.

The moving block bootstrap (MBB) was introduced by Kunsch (1989) and independently by Liu and Singh (1992) to overcome this drawback of the Efron bootstrap for dependent observations, and they showed that the MBB is consistent for the sample mean of a stationary process. In the following, it will be shown that the MBB is also consistent for the maximum of a stationary process.

The MBB method for $a_n^{-1}(M_n - b_n)$ can be described as follows. Let $\{l\}$ and $\{k\}$ be sequences of positive integers such that $1 \le l \le n$. Define blocks of length $l$,
$$\mathbf{X}_{i,l} = (X_i, X_{i+1}, \ldots, X_{i+l-1}), \quad i = 1, 2, \ldots, n-l+1.$$
Next, a sample $(\mathbf{X}_{1,l}^{*}, \mathbf{X}_{2,l}^{*}, \ldots, \mathbf{X}_{k,l}^{*})$ (the MBB sample) is drawn randomly with replacement from $(\mathbf{X}_{1,l}, \mathbf{X}_{2,l}, \ldots, \mathbf{X}_{n-l+1,l})$. Thus, in the MBB sample, the dependence structure is preserved at least within each block. Define the MBB maximum
$$M_m^{*} = \max\{\mathbf{X}_{1,l}^{*}, \mathbf{X}_{2,l}^{*}, \ldots, \mathbf{X}_{k,l}^{*}\},$$
where $m = kl$ and the maximum is taken over all elements of the $\mathbf{X}_{i,l}^{*}$, $i = 1, \ldots, k$. Now define the MBB distribution of $a_n^{-1}(M_n - b_n)$ by
$$H_{n,l,k}(x) = P\{a_m^{-1}(M_m^{*} - b_m) \le x \mid \mathbf{X}_n\}.$$
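As an illustration of this resampling scheme (a sketch under assumed names, not code from the paper), $H_{n,l,k}$ can be approximated by Monte Carlo using only the blockwise maxima of the observed series, as in the proof of Theorem 7 below.

```python
import numpy as np

def mbb_max_cdf(x_sample, l, k, a_m, b_m, x_grid, n_boot=2000, seed=None):
    """Monte Carlo approximation of the MBB distribution
    H_{n,l,k}(x) = P{ a_m^{-1}(M*_m - b_m) <= x | X_n },  with m = k*l.
    M*_m is the maximum over k blocks of length l drawn with replacement
    from the n - l + 1 overlapping blocks of the sample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x_sample, dtype=float)
    n = len(x)
    # blockwise maxima M_{i,l} = max{X_i, ..., X_{i+l-1}}, i = 1, ..., n-l+1
    block_max = np.array([x[i:i + l].max() for i in range(n - l + 1)])
    # resample k block indices with replacement; within-block dependence is preserved
    idx = rng.integers(0, n - l + 1, size=(n_boot, k))
    normalized_max = (block_max[idx].max(axis=1) - b_m) / a_m
    return np.array([np.mean(normalized_max <= x) for x in x_grid])
```

In contrast to the EB sketch in Section 3, only the time order across blocks is broken; each resampled block retains the dependence of $l$ consecutive observations.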
We need the following variant of Lemma 2.1 of Leadbetter (1983); the proof can be found in Fukuchi (1994, p. 48).

Lemma 2. Let $\{X_i\}_{i=1}^{\infty}$ be a stationary process satisfying $D^+$. Let $\{u_n\}$ be a sequence of real numbers and $\{k\}$ a sequence of integers such that $k = o(n)$. Suppose that there exists a sequence $\{h_n\}$ of positive integers such that $h_n \to \infty$, $kh_n/n \to 0$ and $k\,g(h_n) \to 0$. Then
$$P(M_n \le u_n) - P^{k}(M_l \le u_n) \to 0,$$
where $l = [n/k]$.

The next result gives sufficient conditions for the consistency of the MBB.

Theorem 7. Suppose that a stationary process $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta \in [0,1]$ and satisfies $D^+$. Let $\{l\}$ and $\{k\}$ be sequences of integers and let $m = kl$. Suppose that $k^2 l/n \to 0$, that $k^2 n^{-1}\sum_{i=1}^{n} g(i) \to 0$, where $g(\cdot)$ is the mixing coefficient of $D^+$, and that there exists a sequence of positive integers $\{h_n\}$ such that $h_n/l \to 0$ and $k\,g(h_n) \to 0$, as $n \to \infty$. Then
$$H_{n,l,k}(x) \xrightarrow{p} G(x) \quad (14)$$
as $n \to \infty$.

Proof. Define the blockwise maximum $M_{i,l}$ of the $i$th block by
$$M_{i,l} = \max\{X_i, X_{i+1}, \ldots, X_{i+l-1}\}, \quad i = 1, 2, \ldots, n-l+1.$$
Then draw a sample $\{M_{1,l}^{*}, M_{2,l}^{*}, \ldots, M_{k,l}^{*}\}$ randomly with replacement from $\{M_{1,l}, M_{2,l}, \ldots, M_{n-l+1,l}\}$. Clearly, $M_m^{*}$ is distributionally (conditionally on $\mathbf{X}_n$) equal to $\max\{M_{1,l}^{*}, M_{2,l}^{*}, \ldots, M_{k,l}^{*}\}$. Let $u_n = a_n x + b_n$ be such that $n\{1 - F(u_n)\}$ has a finite limit for each $x \in \mathbb{R}$, and write $\tau(x) := \lim_{n\to\infty} n\{1 - F(u_n)\}$; thus $P\{a_n^{-1}(M_n - b_n) \le x\} \to e^{-\theta\tau(x)}$ for each $x \in \mathbb{R}$. Let $N = n-l+1$ and let $F_{n,l}(x) = N^{-1}\sum_{i=1}^{N} I(M_{i,l} \le x)$ be the empirical distribution function of $\{M_{1,l}, M_{2,l}, \ldots, M_{N,l}\}$. Then
$$H_{n,l,k}(x) = P\{M_{i,l}^{*} \le u_m,\ i = 1, 2, \ldots, k \mid \mathbf{X}_n\} = F_{n,l}^{k}(u_m) = \left[1 - \frac{k\{1 - F_{n,l}(u_m)\}}{k}\right]^{k}.$$
Thus it is enough to identify the limit of $k\{1 - F_{n,l}(u_m)\}$. By letting $m$ play the role of $n$ in Lemma 2, it follows that
$$P(M_l \le u_m) \sim P^{1/k}(M_m \le u_m) \sim \exp\{-\theta\tau(x)\,l m^{-1}\},$$
as $n \to \infty$. Therefore,
$$E[k\{1 - F_{n,l}(u_m)\}] = \frac{m}{l}\,P(M_{1,l} > u_m) \sim \frac{m}{l}\,[1 - \exp\{-\theta\tau(x)\,l m^{-1}\}] \sim \theta\tau(x), \quad (15)$$
as $n \to \infty$. Now define
$$c_m(i) := \mathrm{Cov}\{I(M_{1,l} > u_m),\ I(M_{1+i,l} > u_m)\} = P(M_{1,l} > u_m,\ M_{1+i,l} > u_m) - P^2(M_{1,l} > u_m).$$
Then
$$\mathrm{Var}[k\{1 - F_{n,l}(u_m)\}] = \left(\frac{k}{N}\right)^{2}\mathrm{Var}\left[\sum_{i=1}^{N} I(M_{i,l} > u_m)\right] = \frac{k^2}{N}\,c_m(0) + 2\left(\frac{k}{N}\right)^{2}\sum_{i=1}^{N-1}(N-i)\,c_m(i). \quad (16)$$
The first term of Eq. (16) is
$$\frac{k^2}{N}\,c_m(0) \le \frac{k^2}{N}\,P(M_{1,l} > u_m) = \left(\frac{m}{l}\right)^{2}\frac{1}{n-l+1}\,P(M_{1,l} > u_m) \sim \frac{m}{nl}\,\theta\tau(x) \quad (\text{from Eq. (15)}) \to 0,$$
as $n \to \infty$. If $l < i + 1$, then it follows from $D^+$ that
$$c_m(i) = P\left[\left\{\bigcup_{j=1}^{l}(X_j > u_m)\right\} \cap \left\{\bigcup_{j=i+1}^{i+l}(X_j > u_m)\right\}\right] - P^2\left[\bigcup_{j=1}^{l}(X_j > u_m)\right] \le g(i + 1 - l).$$
The second term of Eq. (16) is proportional to
$$\left(\frac{k}{N}\right)^{2}\sum_{i=1}^{N-1}(N-i)\,c_m(i) = \left(\frac{k}{N}\right)^{2}\sum_{i=1}^{l-1}(N-i)\,c_m(i) + \left(\frac{k}{N}\right)^{2}\sum_{i=l}^{N-1}(N-i)\,c_m(i) = A_{n,1} + A_{n,2} \quad \text{(say)}.$$
Then
$$|A_{n,1}| \le \left(\frac{k}{N}\right)^{2} N(l-1) - \left(\frac{k}{N}\right)^{2}\sum_{i=1}^{l-1} i \sim \frac{k^2}{N}\left(l - \frac{l^2}{2N}\right) \to 0,$$
as $n \to \infty$, by assumption. Also,
$$|A_{n,2}| \le \left(\frac{k}{N}\right)^{2}\sum_{i=l}^{N-1}(N-i)\,g(i+1-l) = \left(\frac{k}{N}\right)^{2}\sum_{i=1}^{N-l}(N-l-i+1)\,g(i) = \left(\frac{k}{N}\right)^{2}(N-l)\sum_{i=1}^{n-2l+1} g(i) - \left(\frac{k}{N}\right)^{2}\sum_{i=1}^{n-2l+1}(i-1)\,g(i) \to 0,$$
as $n \to \infty$, by assumption. Therefore $\mathrm{Var}[k\{1 - F_{n,l}(u_m)\}] \to 0$, and thus the theorem is proved.

The following corollary of Theorem 7 gives a simpler condition on $l$ and $k$ by assuming a power-law decay of the mixing coefficient $g(h)$. In the following, $a \wedge b$ denotes $\min(a, b)$.

Corollary 2. Suppose that a stationary process $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta \in [0,1]$ and satisfies $D^+$ with $g(h) \le C h^{-\beta}$ for some $\beta > 1$ and $C > 0$. If the block size $l$ and the number of blocks $k$ in the MBB sample satisfy $l = O(n^{\varepsilon})$ and $k = O(n^{\delta})$ for some $\varepsilon, \delta$ with $0 < \varepsilon < 1$ and $0 < \delta < \varepsilon \wedge (1-\varepsilon)/2$, then Eq. (14) holds.

Proof. Given $\varepsilon$ and $\delta$ satisfying $0 < \varepsilon < 1$ and $0 < \delta < \varepsilon \wedge (1-\varepsilon)/2$, let $\zeta$ be such that $\delta\beta^{-1} < \zeta < \varepsilon$ and let $h = n^{\zeta}$. Then $h\,l^{-1} \to 0$ and $k\,g(h) \to 0$. It is easy to see
that $k^2 l/n \to 0$ and $k^2 n^{-1}\sum_{i=1}^{n} g(i) \to 0$, and thus the desired result follows from Theorem 7.
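As a usage note (an illustrative choice, not a recommendation made in the paper), block parameters of the form $l = [n^{\varepsilon}]$ and $k = [n^{\delta}]$ with $0 < \delta < \varepsilon \wedge (1-\varepsilon)/2$ satisfy the conditions of Corollary 2; for example:

```python
def mbb_block_parameters(n, eps=0.5, delta=0.2):
    """One admissible choice under Corollary 2: l = [n**eps], k = [n**delta].
    The defaults satisfy 0 < delta < min(eps, (1 - eps)/2)."""
    assert 0 < delta < min(eps, (1 - eps) / 2)
    l = max(1, int(n ** eps))
    k = max(1, int(n ** delta))
    return l, k   # resample size m = k*l, here of order n**(eps + delta) = o(n)
```

The resulting $l$ and $k$ can then be used in the MBB sketch given after the definition of $H_{n,l,k}$.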
Define the MBB distribution of $a_n^{-1}(M_n - b_n)$ with estimated normalizing constants as
$$\hat{H}_{n,l,k}(x) = P\{\hat{a}_m^{-1}(M_m^{*} - \hat{b}_m) \le x \mid \mathbf{X}_n\},$$
where $\hat{a}_m$ and $\hat{b}_m$ are defined as in Theorem 6.

Theorem 8. Suppose that a stationary process $\{X_i\}_{i=1}^{\infty}$ has extremal index $\theta \in [0,1]$ and satisfies $D^+$ with $g(h) \le C h^{-\beta}$ for some $C > 0$. Let $l = O(n^{\varepsilon})$ and $k = O(n^{\delta})$. If either

(i) $\beta > 2$, $0 < \varepsilon < 1$ and $0 < \delta < \varepsilon \wedge \dfrac{1-\varepsilon}{2}$, or

(ii) $\beta > 1$, $0 < \varepsilon < 1/2$ and $0 < \delta < \varepsilon \wedge \dfrac{1-2\varepsilon}{2}$,

then
$$\hat{H}_{n,l,k}(x) \xrightarrow{p} G(x) \quad (17)$$
as $n \to \infty$.

Proof. The proof is similar to that of Theorem 6. Let $F \in D(\Psi_\alpha)$. Then $n\{1 - F(a_n x + b_n)\} \to (-x)^{\alpha} = \tau(x)$ (say), where $a_n = x_F - \gamma_n$ and $b_n = x_F$. It is clear from Corollary 2 that $H_{n,l,k}(x) \xrightarrow{p} G(x)$. Thus it is enough to show Eq. (11). Suppose condition (i) holds. Since $m/n = O(n^{-(1-\varepsilon-\delta)}) \to 0$, $m\{1 - F_n(a_m x + b_m)\} \xrightarrow{p} \tau(x)$, as was shown in the proof of Corollary 1. Convergence (17) then follows from the same argument as in the proof of Theorem 6. The proof for the case in which condition (ii) holds is similar.

Remark. To the knowledge of the authors, there is no formal procedure for checking from the data whether the data-generating process (DGP) satisfies certain mixing conditions. However, some subclasses of stationary processes are known to satisfy the mixing conditions given in this paper, and in many applications it is reasonable to assume that the DGP belongs to such a class. One such class is that of stationary ARMA processes with absolutely continuous error terms. These processes are known to be geometrically strong mixing, and thus all the mixing conditions given in this paper are satisfied. See Doukhan (1995) for other processes which are geometrically strong mixing.

5. Conclusions

We showed that the MBB provides a valid approximation to the cdf of the normalized maximum for a wider class of stationary processes than the EB does. Results obtained
in this paper are somewhat parallel to those of Lahiri (1995), who showed that the MBB approximates the distribution of the sample mean of a stationary process with a heavy-tailed marginal cdf if the MBB resample size $m$ is of order $o(n)$. A drawback of our results is that the conditions on the resample size $m$ do not uniquely determine the value of $m$. Finding a practical method of choosing $m$ is a topic for future research.

Acknowledgements

We wish to thank the referee for his/her suggestions, which led to an improved version of the paper. This work was based on the second author's doctoral dissertation (Fukuchi, 1994) at Iowa State University. We are grateful to Professors Yasuo Amemiya, Jay Breidt, Ken Koehler and Ananda Weerasinghe for their comments.

References

Angus, J., 1993. Asymptotic theory for bootstrapping the extremes. Commun. Statist. Theory Methods 22 (1), 15–30.
Athreya, K.B., Fukuchi, J., 1994. Bootstrapping extremes of i.i.d. random variables. In: Galambos, J., Lechner, J., Simiu, E. (Eds.), Proc. Conf. on Extreme Value Theory and Applications, vol. 3, NIST Special Publication 866.
Athreya, K.B., Fukuchi, J., 1997. Confidence intervals for endpoints of a c.d.f. via bootstrap. J. Statist. Plann. Inference 58, 299–320.
Bickel, P.J., Freedman, D.A., 1981. Some asymptotic theory for the bootstrap. Ann. Statist. 9, 1196–1217.
Billingsley, P., 1968. Convergence of Probability Measures. Wiley, New York.
Chernick, M.R., 1981. A limit theorem for the maximum of autoregressive processes with uniform marginal distribution. Ann. Probab. 9, 145–149.
Chernick, M.R., Hsing, T., McCormick, W.P., 1991. Calculating the extremal index for a class of stationary sequences. Adv. Appl. Probab. 23, 835–850.
Deheuvels, P., Mason, D., Shorack, G., 1993. Some results on the influence of extremes on the bootstrap. Ann. Inst. H. Poincaré 29, 83–103.
Doukhan, P., 1995. Mixing: Properties and Examples. Springer, New York.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1–26.
Fukuchi, J., 1994. Bootstrapping extremes of random variables. Ph.D. Dissertation, Department of Statistics, Iowa State University.
Haan, L. de, 1970. On regular variation and its application to the weak convergence of sample extremes. Mathematical Centre Tracts, vol. 32, Mathematical Centre, Amsterdam.
Hsing, T., Hüsler, J., Leadbetter, M.R., 1988. On the exceedance point process for a stationary sequence. Probab. Theory Related Fields 78, 97–112.
Ibragimov, I.A., Linnik, Y.V., 1971. Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff, Netherlands.
Kunsch, H.R., 1989. The jackknife and the bootstrap for general stationary observations. Ann. Statist. 17, 1217–1224.
Lahiri, S.N., 1995. On the asymptotic behaviour of the moving block bootstrap for normalized sums of heavy-tail random variables. Ann. Statist. 23, 1331–1349.
Leadbetter, M.R., 1974. On extreme values in stationary sequences. Z. Wahrscheinlichkeitstheor. Verw. Geb. 28, 289–303.
Leadbetter, M.R., 1983. Extremes and local dependence in stationary sequences. Z. Wahrscheinlichkeitstheor. Verw. Geb. 65, 291–306.
Leadbetter, M.R., Lindgren, G., Rootzén, H., 1983. Extremes and Related Properties of Random Sequences and Processes. Springer, Berlin.
Leadbetter, M.R., Nandagopalan, S., 1989. On exceedance point processes for stationary sequences under mild oscillation restrictions. In: Extreme Value Theory. Lecture Notes in Statistics, vol. 51. Springer, Berlin, pp. 69–80.
Liu, R.Y., Singh, K., 1992. Moving blocks jackknife and bootstrap capture weak dependence. In: Lepage, R., Billard, L. (Eds.), Exploring the Limits of Bootstrap. Wiley, New York, pp. 225–248.
Politis, D.N., Romano, J.P., 1994. Large sample confidence regions based on subsamples under minimal assumptions. Ann. Statist. 22, 2031–2050.
Resnick, S., 1987. Extreme Values, Regular Variation, and Point Processes. Springer, New York.
Singh, K., 1981. On the asymptotic accuracy of Efron's bootstrap. Ann. Statist. 9, 1187–1195.
Swanepoel, J.W.H., 1986. A note on proving that the (modified) bootstrap works. Commun. Statist. Theory Methods 15 (11), 3193–3203.