Statistics & Probability Letters 8 (1989) 315-323
North-Holland, September 1989

ON THE NON-PARAMETRIC ESTIMATION OF THE BIVARIATE EXTREME-VALUE DISTRIBUTIONS

Paul DEHEUVELS
Université Paris VI, France

José TIAGO de OLIVEIRA
University of Lisbon, Portugal

Received August 1988
Revised October 1988

Abstract: We establish the consistency of a non-parametric estimator of bivariate extreme-value distributions.

AMS 1980 Subject Classifications: Primary 62G30; Secondary 62G05, 62E20, 60F05.

Keywords: extreme values, bivariate distributions, non-parametric statistics, weak and strong convergence, order statistics.
1. Introduction and main result

Let $(X_1, Y_1), (X_2, Y_2), \ldots$ be an i.i.d. sequence of random vectors with a common bivariate extreme value distribution with standard Gumbel marginals, i.e.

$\Lambda(x, y) = P(X_1 \le x,\ Y_1 \le y) = \exp\{-(e^{-x} + e^{-y})\,k(x - y)\}$  for $-\infty < x, y < \infty$,  (1.1)

where the dependence function $k(\cdot)$ satisfies suitable conditions specified below. Such distributions (see e.g. Galambos, 1978) arise as the non-degenerate weak limits of $(a_n^{-1}(\max(\xi_1, \ldots, \xi_n) - b_n),\ c_n^{-1}(\max(\zeta_1, \ldots, \zeta_n) - d_n))$, where $(\xi_i, \zeta_i)$, $i = 1, 2, \ldots$, denotes an i.i.d. sequence of random vectors, and $a_n > 0$, $c_n > 0$, $b_n$, $d_n$ are suitable normalizing constants.

The problem of the estimation of the marginal distributions being solved, we will assume throughout that, as in (1.1), the distributions of $X_1$ and $Y_1$ are known, and concentrate our interest on the estimation of $k(\cdot)$. Since the choice of the marginal distributions is then a matter of pure convenience, we may transform the coordinates by setting $U_i = \exp(-X_i)$ and $V_i = \exp(-Y_i)$ for $i = 1, 2, \ldots$. It is obvious from (1.1) that $U_i$ and $V_i$ have standard exponential distributions. Their joint distribution is given by

$H(u, v) = P(U_1 > u,\ V_1 > v) = \exp\{-(u + v)\,A(u/(u + v))\}$  for $u > 0$ and $v > 0$,  (1.2)

where $\{A(s),\ 0 \le s \le 1\}$ is related to $\{k(z),\ -\infty < z < \infty\}$ via the equations

$k(z) = A(1/(e^{z} + 1))$  and  $A(s) = k(\log((1 - s)/s))$  for $-\infty < z < \infty$ and $0 < s < 1$.  (1.3)
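As a concrete illustration of (1.2)-(1.3), the following short Python sketch uses the symmetric logistic dependence function $A_r(s) = (s^r + (1 - s)^r)^{1/r}$, $r \ge 1$, a standard parametric example which is assumed here for illustration and is not taken from this paper, and converts between $A(\cdot)$ and $k(\cdot)$ numerically; the helper names are of course hypothetical.

```python
# Illustration of the relation (1.3) between k(.) and A(.), using the
# symmetric logistic family A_r(s) = (s^r + (1-s)^r)^(1/r) as a stand-in
# dependence function (an assumed example, not one used in the paper).
import numpy as np

def A_logistic(s, r=2.0):
    """Logistic (Gumbel) dependence function; r = 1 gives independence, A == 1."""
    s = np.asarray(s, dtype=float)
    return (s**r + (1.0 - s)**r) ** (1.0 / r)

def k_from_A(z, A=A_logistic):
    """k(z) = A(1/(e^z + 1)), cf. (1.3)."""
    return A(1.0 / (np.exp(z) + 1.0))

def A_from_k(s, k=k_from_A):
    """A(s) = k(log((1-s)/s)), cf. (1.3); the two maps are mutual inverses."""
    return k(np.log((1.0 - s) / s))

if __name__ == "__main__":
    s = np.linspace(0.01, 0.99, 9)
    # Round-trip check: going A -> k -> A recovers the original values.
    print(np.max(np.abs(A_from_k(s) - A_logistic(s))))   # ~1e-16
```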
Necessary and sufficient conditions for $k(\cdot)$ to be the dependence function of an extreme value distribution can be written in a simple form in terms of $A(\cdot)$ (Pickands, 1981). They are as follows:
(A1) $A(\cdot)$ is convex on $[0, 1]$;
(A2) $\max(s, 1 - s) \le A(s) \le 1$ for $0 \le s \le 1$.
These conditions imply that $A(s)/s$ is non-increasing and $A(1 - s)/(1 - s)$ is non-decreasing on $[0, 1]$. Also, $A(0) = A(1) = 1$.

The problem of the non-parametric estimation of $A(\cdot)$ (or, equivalently, of $k(\cdot)$) has received some attention lately (see e.g. Deheuvels, 1984; Tawn, 1987; Smith, Tawn and Yuen, 1987). A simple estimator of $A(\cdot)$ has been introduced by Pickands (1981) as follows:

$\hat A_n(s) = n \Big/ \sum_{i=1}^{n} \min\big(U_i/(1 - s),\ V_i/s\big)$  for $0 \le s \le 1$.  (1.4)

In spite of the fact that $\hat A_n(\cdot)$ has nice properties (note that $E(1/\hat A_n(s)) = 1/A(s)$) and that, for any fixed $0 \le s \le 1$, $\hat A_n(s) - A(s) = O(n^{-1/2}(\log\log n)^{1/2})$ a.s., it has the drawback of not being admissible, in the sense that it does not satisfy (A1)-(A2). It therefore has to be modified into an estimator $\tilde A_n(\cdot)$ by (i) truncating $\hat A_n$ to ensure that (A2) holds, (ii) building the convex hull of the extremal points of the graph of the resulting estimate, and (iii) taking $\tilde A_n$ to be an interpolating spline between these points. Since the construction of $\tilde A_n$ is rather involved and leads to a function having a complicated implicit dependence upon the data, it is very natural to seek other ways of estimating $A(\cdot)$. It turns out that the convexity assumption (A1) is crucial here, since a non-convex estimate necessitates a convex-hull-type modification such as that used for Pickands' estimator.

Motivated by the fact that a sum of convex functions is convex, Tiago de Oliveira (1987) introduced the estimators of $A(s)$ defined by

$A_{n,\delta_n}(s) = 1 - n^{-1} R(\delta_n) \sum_{i=1}^{n} \min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big)$  for $0 \le s \le 1$,  (1.5)

where $0 < \delta_n < 1$ is a sequence of exponents such that $\delta_n \uparrow 1$ as $n \to \infty$, and where $R(\delta)$ is a function of $0 < \delta < 1$ such that $R(\delta)/(1 - \delta) \to 1$ as $\delta \uparrow 1$.

The aim of this paper is to investigate the limiting behavior of $A_{n,\delta_n}$ as $n \to \infty$. Our main result is stated in the following theorem, whose proof is postponed to Section 3.

Theorem 1.1. We have

$\sup_{0 \le s \le 1} |A_{n,\delta_n}(s) - A(s)| \to 0$  in probability as $n \to \infty$,

for any $A(\cdot)$ if and only if the sequence $0 < \delta_n < 1$ satisfies the condition that

$\delta_n \to 1$  and  $(1 - \delta_n)\log n \to \infty$  as $n \to \infty$.  (1.6)
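To make condition (1.6) concrete, the short sketch below evaluates $(1 - \delta_n)\log n$ for two candidate sequences: $\delta_n = 1 - (\log\log n)/\log n$ satisfies (1.6), while $\delta_n = 1 - 1/\log n$ does not (it corresponds to the case $\alpha = 1$ of Theorem 3.1 below). Both sequences are illustrative choices, not recommendations made in the paper.

```python
# Two candidate exponent sequences checked against condition (1.6).
import numpy as np

n = np.array([1e3, 1e4, 1e6, 1e9])
logn = np.log(n)

delta_a = 1.0 - np.log(logn) / logn      # (1 - delta_n) log n = log log n -> infinity
delta_b = 1.0 - 1.0 / logn               # (1 - delta_n) log n = 1, bounded

print((1.0 - delta_a) * logn)            # grows without bound: satisfies (1.6)
print((1.0 - delta_b) * logn)            # stays equal to 1: fails (1.6)
```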
We discuss the special case where $A(s) = 1$ for $0 \le s \le 1$, which corresponds to independence of the coordinates, in Section 2. Section 3 is devoted to the limiting behavior of $A_{n,\delta_n}$ for a general $A(\cdot)$.

It is obvious that $A_{n,\delta_n}(\cdot)$ satisfies (A1). On the other hand, it does not necessarily satisfy the first inequality of (A2) (note that always $A_{n,\delta_n}(s) \le 1$ for $0 \le s \le 1$). Since the supremum of two convex functions is convex, the following simple modification of $A_{n,\delta_n}$ gives an estimator of $A(\cdot)$ fulfilling (A1) and (A2):

$\bar A_{n,\delta_n}(s) = \max\big(A_{n,\delta_n}(s),\ \max(1 - s,\ s)\big)$  for $0 \le s \le 1$.  (1.7)
It is straightforward that $\bar A_{n,\delta_n}$ is consistent if and only if $A_{n,\delta_n}$ is. Therefore, we will limit ourselves in the sequel to the study of $A_{n,\delta_n}$. Moreover, the explicit choice of $R(\cdot)$ is not important as far as asymptotic properties are concerned, but it needs to be specified for the practical use of this estimate. It turns out that the following two choices of $R(\cdot)$ can be recommended. The first one has the advantage of simplicity, the second is justified by bias arguments:

$R(\delta) = 1 - \delta$,  (1.8)

$R(\delta) = 1/\Gamma(1 - \delta)$  for $0 < \delta < 1$.  (1.9)

Since $\Gamma(z) = (1/z)(1 - \gamma z + o(z))$ as $z \downarrow 0$, where $\gamma$ is Euler's constant, we see that either choice (1.8) or (1.9) fulfills our assumptions. In the remainder of this paper, we shall use mainly the first one for the sake of simplicity.
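For concreteness, the sketch below implements Pickands' estimator (1.4) and the estimator $A_{n,\delta_n}$ of (1.5) with the choice $R(\delta) = 1 - \delta$ of (1.8), together with the admissible modification (1.7). It is a minimal illustration of these formulas, assuming the data have already been reduced to the exponential coordinates $(U_i, V_i)$ of (1.2); the function names are hypothetical.

```python
# Minimal sketch of the estimators (1.4), (1.5) and (1.7), assuming the sample
# is given in the exponential coordinates (U_i, V_i) of (1.2).
import numpy as np

def pickands_A(u, v, s):
    """Pickands' estimator (1.4): n / sum_i min(U_i/(1-s), V_i/s), for 0 < s < 1."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    s = np.atleast_1d(np.asarray(s, float))
    denom = np.minimum(u / (1.0 - s[:, None]), v / s[:, None]).sum(axis=1)
    return len(u) / denom

def A_n_delta(u, v, s, delta, R=lambda d: 1.0 - d):
    """Estimator (1.5), with R(delta) = 1 - delta by default, cf. (1.8)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    s = np.atleast_1d(np.asarray(s, float))
    w = np.minimum((1.0 - s[:, None]) / u**delta, s[:, None] / v**delta)
    return 1.0 - R(delta) * w.mean(axis=1)

def A_n_delta_admissible(u, v, s, delta):
    """Modification (1.7): enforce the lower bound max(s, 1-s) of (A2)."""
    s = np.atleast_1d(np.asarray(s, float))
    return np.maximum(A_n_delta(u, v, s, delta), np.maximum(s, 1.0 - s))
```

The guard against $s = 0$ or $s = 1$ is left to the caller; at the endpoints $A(0) = A(1) = 1$ can be set directly.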
2. The case of independent margins

This case corresponds to $A(s) \equiv 1$ in (1.2). Set

$W_i(s, \delta) = \min\big((1 - s)/U_i^{\delta},\ s/V_i^{\delta}\big)$  for $i = 1, 2, \ldots$, $0 \le s \le 1$ and $0 < \delta \le 1$.  (2.1)

By (1.2) (see (3.1) in the sequel) we have

$P(W_1(s, \delta) > x) = 1 - \exp(-x^{-1/\delta}s^{1/\delta}) - \exp(-x^{-1/\delta}(1 - s)^{1/\delta}) + \exp(-x^{-1/\delta}((1 - s)^{1/\delta} + s^{1/\delta}))$  for $x > 0$.  (2.2)

Straightforward expansions show that

$P(W_1(s, \delta) > x) = x^{-2/\delta}(s(1 - s))^{1/\delta} + x^{-3/\delta}\varepsilon(s, \delta, x)$,  (2.3)

where $|\varepsilon(s, \delta, x)| \le \tfrac{1}{2}((1 - s)^{1/\delta} + s^{1/\delta})^{3}$. It follows that, for all $0 < \delta < 1$, $E(W_1^{2}(s, \delta)) < \infty$, while $E(W_1^{2}(s, 1)) = \infty$. We have, in particular,

$E(W_1(s, 1)) = -(1 - s)\log(1 - s) - s\log s$  for all $0 \le s \le 1$.  (2.4)

The main result of this section may now be stated as follows.

Theorem 2.1. Let $A(s) \equiv 1$. For any sequence $0 < \delta_n < 1$ such that $\delta_n \uparrow 1$ as $n \to \infty$, and for any fixed $0 \le s \le 1$, we have almost surely

$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big) = -(1 - s)\log(1 - s) - s\log s$.  (2.5)

Remark 2.1. It is obvious from (2.5) that, under the assumptions of Theorem 2.1, for any fixed $0 \le s \le 1$,

$A_{n,\delta_n}(s) - A(s) = (1 + o(1))(1 - \delta_n)\big(s\log s + (1 - s)\log(1 - s)\big) \to 0$  a.s. as $n \to \infty$.  (2.6)
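Since the limit in (2.5) is explicit, it can be checked directly by simulation; the sketch below draws i.i.d. standard exponential pairs (the case $A \equiv 1$) and compares the empirical mean of $W_i(s, \delta)$ with $-(1 - s)\log(1 - s) - s\log s$. The sample size and the fixed value of $\delta$, standing in for a $\delta_n$ close to 1, are arbitrary illustrative choices; exact agreement is reached only in the limit $\delta_n \uparrow 1$.

```python
# Simulation check of Theorem 2.1 / (2.4)-(2.5) in the independent case A == 1.
# Sample size and delta are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
u = rng.exponential(size=n)          # U_i, standard exponential
v = rng.exponential(size=n)          # V_i, independent of U_i, so A(s) == 1

delta = 0.99                         # stands in for a value of delta_n close to 1

for s in (0.1, 0.3, 0.5):
    w = np.minimum((1.0 - s) / u**delta, s / v**delta)
    limit = -(1.0 - s) * np.log(1.0 - s) - s * np.log(s)
    # Agreement is approximate for fixed delta < 1 and improves as delta -> 1.
    print(f"s={s}: empirical mean {w.mean():.4f}  vs  limit {limit:.4f}")
```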
Since $A_{n,\delta_n}(\cdot)$ and $A(\cdot)$ are convex functions, it is straightforward that the pointwise almost sure consistency of $A_{n,\delta_n}(s)$ to $A(s)$ implies uniform almost sure consistency. Therefore, we have the following corollary.

Corollary 2.1. Let $A(s) \equiv 1$. Then, for any sequence $0 < \delta_n < 1$ such that $\delta_n \uparrow 1$ as $n \to \infty$, we have

$\lim_{n \to \infty}\ \sup_{0 \le s \le 1} |A_{n,\delta_n}(s) - A(s)| = 0$  almost surely.  □  (2.7)
Remark 2.2. Since $A_{n,1}(s) = A(s) = 1$, the best choice of $\delta_n$ is given here by $\delta_n = 1$. We will see in the following section that, whenever $A(s) < 1$, such a choice is impossible, and we need to impose that $1 - \delta_n$ is bounded from below by a suitable positive sequence. In any case, (2.6) gives an evaluation of the exact rate of pointwise consistency of $A_{n,\delta_n}(s)$ to $A(s)$. This rate cannot be better than $O(1 - \delta_n)$ almost surely as $n \to \infty$.

Proof of Theorem 2.1. Let $\eta_i = U_i/V_i$, $i = 1, 2, \ldots$. Routine computations show that, for $\rho > 0$,

$P(U_1/V_1 > \rho) = P(V_1/U_1 > \rho) = P(\eta_1 > \rho) = 1/(1 + \rho)$,  (2.8)

and, for $0 < \rho < \infty$,

$G_{\rho}(z) := P(1/U_1 \le z \mid U_1/V_1 > \rho) = P(1/V_1 \le z \mid V_1/U_1 > \rho) = e^{-1/z}\big(1 + \rho - \rho e^{-1/(\rho z)}\big)$  for $z > 0$.  (2.9)
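The identities (2.8) and (2.9) are elementary but easy to get wrong by a constant, so a quick Monte Carlo check may be helpful; the sketch below verifies (2.8) and compares the empirical conditional distribution of $1/U_1$ given $U_1/V_1 > \rho$ with $G_\rho$. The sample size, the value of $\rho$ and the evaluation point $z$ are arbitrary.

```python
# Monte Carlo sanity check of (2.8) and (2.9) for i.i.d. standard exponentials.
import numpy as np

rng = np.random.default_rng(1)
n, rho = 1_000_000, 1.5
u, v = rng.exponential(size=n), rng.exponential(size=n)

# (2.8): P(U/V > rho) = 1/(1 + rho).
sel = u / v > rho
print(sel.mean(), 1.0 / (1.0 + rho))

# (2.9): conditional d.f. of 1/U given U/V > rho, compared with G_rho(z).
z = 0.8
emp = np.mean(1.0 / u[sel] <= z)
G = np.exp(-1.0 / z) * (1.0 + rho - rho * np.exp(-1.0 / (rho * z)))
print(emp, G)
```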
Denote by $F_n(\rho) = n^{-1}\#\{\eta_i \le \rho:\ 1 \le i \le n\}$ the empirical distribution function of the $\eta_i$. Fix any $0 < s \le \tfrac{1}{2}$ and write, recalling (2.1),

$W_i(s, \delta_n) = \min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big) = (1 - s)/U_i^{\delta_n}$ for $\eta_i > \rho_n$, and $= s/V_i^{\delta_n}$ for $\eta_i \le \rho_n$,  (2.10)

with $\rho_n := ((1 - s)/s)^{1/\delta_n}$. Notice that $\rho_n \downarrow \rho_{\infty} := (1 - s)/s \ge 1$ as $\delta_n \uparrow 1$. Consider now the following three cases (recall (2.1) for the definition of $W_i(s, \delta)$).

Case 1. Assume that $\eta_i \le \rho_{\infty}$. It follows from (2.10) that

$W_i(s, \delta_n) = s/V_i^{\delta_n}$  and  $W_i(s, 1) = s/V_i$.  (2.11)

Case 2. Assume that $\eta_i > \rho_n$. We obtain

$W_i(s, \delta_n) = (1 - s)/U_i^{\delta_n}$  and  $W_i(s, 1) = (1 - s)/U_i$.  (2.12)

Case 3. Assume that $\rho_{\infty} < \eta_i \le \rho_n$. We have likewise

$W_i(s, \delta_n) = s/V_i^{\delta_n}$  and  $W_i(s, 1) = (1 - s)/U_i$.  (2.13)

By (2.11), (2.12) and (2.13), we obtain the upper bound

$\frac{1}{n}\sum_{i=1}^{n} |W_i(s, \delta_n) - W_i(s, 1)| \le S_{1,n} + S_{2,n} + S_{3,n}$,  (2.14)

where $S_{1,n}$ and $S_{2,n}$ collect the contributions of the indices covered by Cases 1 and 2, and where $S_{3,n}$, the contribution of Case 3, involves the proportion $F_n(\rho_n) - F_n(\rho_{\infty})$ of indices with $\rho_{\infty} < \eta_i \le \rho_n$.
Rewrite, using (2.9), the sum corresponding to Case 3 as a sum of $n(F_n(\rho_n) - F_n(\rho_{\infty}))$ terms which may be bounded in terms of an i.i.d. sequence $Z_1, Z_2, \ldots$ of random variables having distribution function $P(Z_1 \le z) = G_{\rho_{\infty}}(z)$. Note that, since $0 < \delta_n < 1$, $Z_1^{\delta_n} \le \max(1, Z_1)$, which by (2.9) has finite expectation. It follows that, almost surely,

$S_{3,n} = \big(1 - ((1 - s)/s)^{\delta_n - 1}\big)\,O(1) = (1 - \delta_n)\,O(1)$  as $n \to \infty$.  (2.15)

By using similar arguments for $S_{1,n}$ and $S_{2,n}$, we obtain likewise that, almost surely,

$S_{i,n} = (1 - \delta_n)\,O(1)$  as $n \to \infty$, for $i = 1$ and $i = 2$.  (2.16)

A joint use of (2.14), (2.15) and (2.16) shows that, almost surely as $n \to \infty$,

$\frac{1}{n}\sum_{i=1}^{n} W_i(s, \delta_n) = \frac{1}{n}\sum_{i=1}^{n} W_i(s, 1) + (1 - \delta_n)\,O(1)$.  (2.17)

By (2.4), the law of large numbers implies that

$\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} W_i(s, 1) = -(1 - s)\log(1 - s) - s\log s$  a.s.,  (2.18)

which, by (2.17) and a similar argument for $\tfrac{1}{2} < s \le 1$, completes the proof of the theorem.  □
3. The general case

In this section, we assume that $0 < s < 1$ is fixed and such that $A(s) < 1$. (If no such $s$ exists, then $A(\cdot) \equiv 1$ and we are back in the case of independent margins treated in Section 2.) For $z > 0$,

$P\big(\min((1 - s)/U_1^{\delta},\ s/V_1^{\delta}) > z\big) = 1 - \exp(-z^{-1/\delta}(1 - s)^{1/\delta}) - \exp(-z^{-1/\delta}s^{1/\delta}) + \exp\big(-z^{-1/\delta}\big(s^{1/\delta} + (1 - s)^{1/\delta}\big)\,A\big(s^{1/\delta}/(s^{1/\delta} + (1 - s)^{1/\delta})\big)\big)$.  (3.1)

Observe that (A1)-(A2) imply that, for any $0 \le w', w'' \le 1$,

$|A(w') - A(w'')| \le |w' - w''|$.  (3.2)

This implies the existence of a $0 < \Delta < 1$ such that

$\sup_{\Delta \le \delta \le 1} A\big(s^{1/\delta}/(s^{1/\delta} + (1 - s)^{1/\delta})\big) < 1$.  (3.3)
By (3.1) and (3.3), we have, uniformly over $\Delta \le \delta \le 1$,

$P\big(\min((1 - s)/U_1^{\delta},\ s/V_1^{\delta}) > z\big) = (1 + o(1))\,z^{-1/\delta}\big(s^{1/\delta} + (1 - s)^{1/\delta}\big)\Big(1 - A\big(s^{1/\delta}/(s^{1/\delta} + (1 - s)^{1/\delta})\big)\Big)$  as $z \to \infty$.  (3.4)

In particular, for $\delta = 1$, we get

$P\big(\min((1 - s)/U_1,\ s/V_1) > z\big) = (1 + o(1))\,z^{-1}(1 - A(s))$  as $z \to \infty$.  (3.5)

It follows from (3.5) that $E\big(\min((1 - s)/U_1,\ s/V_1)\big) = \infty$. On the other hand, for $\Delta \le \delta < 1$,

$\mu(\delta) := E\big(\min((1 - s)/U_1^{\delta},\ s/V_1^{\delta})\big) = \Gamma(1 - \delta)\Big[1 - \big(s^{1/\delta} + (1 - s)^{1/\delta}\big)^{\delta}\,A\big(s^{1/\delta}/(s^{1/\delta} + (1 - s)^{1/\delta})\big)^{\delta}\Big] = (1 + o(1))(1 - \delta)^{-1}(1 - A(s))$  as $\delta \uparrow 1$.  (3.6)

In the sequel, we will make use of the fact, obvious from (3.1) and (3.4), that there exists a function $C(\delta)$, bounded on the interval $\Delta \le \delta \le 1$, such that

$P\big(\min((1 - s)/U_1^{\delta},\ s/V_1^{\delta}) \ge z\big) \le C(\delta)\,z^{-1/\delta}$  for all $z > 0$.  (3.7)
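Both (3.1) and the closed form in (3.6) can be checked numerically: the sketch below integrates the survival function (3.1) with scipy and compares the result with the Gamma-function expression of (3.6) and with its asymptotic equivalent $(1 - A(s))/(1 - \delta)$, again using the logistic dependence function as an assumed example; the particular values of $s$ and $\delta$ are arbitrary.

```python
# Numerical check of the mean mu(delta) in (3.6), using the survival function
# (3.1) and the logistic dependence function A_r as an assumed example.
import numpy as np
from math import gamma
from scipy.integrate import quad

def A_logistic(t, r=2.0):
    return (t**r + (1.0 - t)**r) ** (1.0 / r)

def survival(z, s, delta, A=A_logistic):
    """P(min((1-s)/U^delta, s/V^delta) > z), cf. (3.1)."""
    a = z ** (-1.0 / delta) * (1.0 - s) ** (1.0 / delta)
    b = z ** (-1.0 / delta) * s ** (1.0 / delta)
    c = a + b
    Astar = A(b / c)   # A(s^{1/delta} / (s^{1/delta} + (1-s)^{1/delta})), free of z
    return 1.0 - np.exp(-a) - np.exp(-b) + np.exp(-c * Astar)

s, delta = 0.3, 0.7
mu_quad, _ = quad(survival, 0.0, np.inf, args=(s, delta))
w = s ** (1.0 / delta) + (1.0 - s) ** (1.0 / delta)
mu_formula = gamma(1.0 - delta) * (1.0 - w**delta * A_logistic(s ** (1.0 / delta) / w) ** delta)
print(mu_quad, mu_formula, (1.0 - A_logistic(s)) / (1.0 - delta))
```

The first two printed values agree closely; the third, the asymptotic equivalent, is only approached as $\delta \uparrow 1$.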
Our next lemma is inspired by Theorem 27, p. 283, of Petrov (1975), and is of independent interest.

Lemma 3.1. Let $\theta_1, \theta_2, \ldots$ be an i.i.d. sequence of random variables with partial sums $S_n = \theta_1 + \cdots + \theta_n$, $n = 1, 2, \ldots$, and let $m_n$ denote the median of the distribution of $S_n$. Assume that, for some constants $C > 0$ and $0 < t \le \tfrac{1}{2}$,

$P(|\theta_1| \ge z) \le C z^{-t-1}$  for all $z > 0$.  (3.8)

Then we have

$P(|S_n - m_n| \ge nz) \le 2^{7/2}\,5\,C\,n^{-t} z^{-t-1}$  for all $z > 0$.  (3.9)

Proof. Denote by $\tilde\theta_1, \tilde\theta_2, \ldots$ an i.i.d. sequence of random variables having the same distribution as $\theta_1 - \theta_2$, and set $\tilde S_n = \sum_{i=1}^{n}\tilde\theta_i$. Further, let $\tilde\theta_{in} = \tilde\theta_i\mathbf{1}\{|\tilde\theta_i| \le nz\}$ for $1 \le i \le n$, and $\tilde S_n^{*} = \sum_{i=1}^{n}\tilde\theta_{in}$. By Petrov (1975, inequality (4.5), p. 284), we have, for all $\tau \ge 0$, $n \ge 1$ and $a \in \mathbb{R}$,

$\tfrac{1}{2}P(|S_n - m_n| \ge \tau) \le P(|\tilde S_n| \ge \tau) \le 2P(|S_n - a| \ge \tfrac{1}{2}\tau)$.  (3.10)

By (3.8) and (3.10) taken with $n = 1$ and $a = 0$, we have $P(|\tilde\theta_1| \ge x) \le 2P(|\theta_1| \ge \tfrac{1}{2}x) \le 2^{5/2}Cx^{-t-1}$ for $x > 0$. Another application of (3.10) shows that, for all $z > 0$,

$P(|S_n - m_n| \ge nz) \le 2P(|\tilde S_n| \ge nz) \le 2nP(|\tilde\theta_1| \ge nz) + 2P(|\tilde S_n^{*}| \ge nz) \le 2^{7/2}Cn^{-t}z^{-t-1} + 2n(nz)^{-2}E(\tilde\theta_{1n}^{2})$.  (3.11)

An integration by parts yields

$E(\tilde\theta_{1n}^{2}) = \int_{0}^{nz} x^{2}\,dP(|\tilde\theta_1| \le x) \le 2\int_{0}^{nz} xP(|\tilde\theta_1| \ge x)\,dx \le 2^{7/2}C\int_{0}^{nz} x^{-t}\,dx \le 2^{9/2}C(nz)^{1-t}$,

which, jointly with (3.11), implies (3.9) as sought.  □
We will apply Lemma 3.1 with $\theta_i = W_i(s, \delta_n) = \min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big)$ for $i = 1, 2, \ldots$, and $t = -1 + 1/\delta_n$. The following inequality (see Petrov, 1975, p. 285) will be useful to obtain a lower bound matching (3.9). We have, for all $z \ge 0$, $n \ge 1$ and $a \in \mathbb{R}$,

$P(|S_n - a| \ge nz) \ge \tfrac{1}{2}P(|\tilde S_n| \ge 2nz) \ge \tfrac{1}{4}\,nP(|\tilde\theta_1| \ge 2nz)\big(1 - nP(|\tilde\theta_1| \ge 2nz)\big)$.  (3.12)

Consider the case where $\theta_i = W_i(s, \delta_n)$ for $i = 1, 2, \ldots$; obviously the median $m(\delta_n)$ of $\theta_1$ is bounded above by a finite constant $M_s$ whenever $\Delta \le \delta_n \le 1$. It follows from (3.10) that, whenever $\tau > M_s$,

$\tfrac{1}{2}P(\theta_1 \ge \tau + M_s) \le P(|\tilde\theta_1| \ge \tau) = 2P(\tilde\theta_1 \ge \tau) \le 2P(\theta_1 \ge \tau)$,  (3.13)

where in the last inequality we make use of the fact that $\tilde\theta_1$ has the distribution of $\theta_1 - \theta_2 \le \theta_1$. By (3.12) and (3.13), we see that, whenever $nz \ge 2M_s$, for all $a \in \mathbb{R}$,

$P(|S_n - a| \ge nz) \ge \tfrac{1}{8}\,nP(\theta_1 \ge 4nz)\big(1 - 2nP(\theta_1 \ge 2nz)\big)$.  (3.14)

Observe, using (3.4), that $nP(\theta_1 \ge nz)$ is of the same order as $n^{-t}z^{-t-1}(1 - A(s)) = n^{1-1/\delta_n}z^{-1/\delta_n}(1 - A(s))$. This, jointly with (3.9) and (3.14), suffices to prove the following lemma.

Lemma 3.2. In order that

$P\Big(\Big|\sum_{i=1}^{n}\min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big) - n a_n\Big| \ge n\varepsilon_n\Big) \to 0$  as $n \to \infty$,  (3.15)

for some non-random sequence $a_n$, it is necessary and sufficient that the sequence $\varepsilon_n > 0$ satisfies

$\lim_{n \to \infty} n^{1 - 1/\delta_n}\varepsilon_n^{-1/\delta_n} = 0$,  (3.16)

in which case we may choose $na_n$ equal to the median of the distribution of $\sum_{i=1}^{n}\min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big)$.  □

In order to complete our study, we need to evaluate the median of $\sum_{i=1}^{n}W_i(s, \delta_n)$. We will do better in the following lemma, which gives the limiting distribution of this sum. First, we introduce some notation. Let $\xi_{in} = n^{-\delta_n}W_i(s, \delta_n)$ and $F_n(x) = P(\xi_{in} \le x)$, and denote by $f_n(x) = F_n'(x)$ the density of $\xi_{in}$. It follows from (3.4) that, as $n \to \infty$,

$n(1 - F_n(x)) \to x^{-1}(1 - A(s))$  for $x > 0$, and $nF_n(x) \to 0$ for $x < 0$.  (3.17)

By (3.1) we have the expansion

$f_n(x) = \frac{1}{n\delta_n}\,x^{-1-1/\delta_n}\big(s^{1/\delta_n} + (1 - s)^{1/\delta_n}\big)\Big(1 - A\big(s^{1/\delta_n}/(s^{1/\delta_n} + (1 - s)^{1/\delta_n})\big)\Big) + n^{-1}x^{-1-1/\delta_n}\,O\big(\min(1, (nx^{1/\delta_n})^{-1})\big)$  for $x > 0$.  (3.18)

It follows that, for $\tfrac{1}{2} \le \delta_n < 1$ and a fixed $\varepsilon > 0$,

$n\int_{0}^{\varepsilon} x^{2} f_n(x)\,dx = O\big(\varepsilon^{2 - 1/\delta_n}/(2 - 1/\delta_n)\big)$,

and a simple argument (see (3.23) in the sequel) shows that

$n\Big(\int_{0}^{\varepsilon} x f_n(x)\,dx\Big)^{2} = O\Big(n^{-1}\big((n^{1-\delta_n} - 1)/(1 - \delta_n)\big)^{2}\Big) \to 0$  as $n \to \infty$.  (3.19)

We now make use of Petrov (1975, Theorem 8, pp. 81-82), which, jointly with (3.17) and (3.19), ensures the existence of a non-random sequence $b_n$ such that the limiting distribution of $\sum_{i=1}^{n}\xi_{in} - b_n$ is an infinitely divisible distribution. Using Petrov (1975, Theorem 7, p. 81), we may also obtain an evaluation of $b_n$. The corresponding result is stated in our next lemma.

Lemma 3.3. Assume that $\delta_n \uparrow 1$. Then there exists a non-random sequence $\beta_n$ such that the limiting distribution of

$n^{-\delta_n}\sum_{i=1}^{n}\min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big) - \beta_n$  (3.20)

is the completely asymmetric stable distribution given (in Lévy's form) by the characteristic function

$G(t) = \exp\Big\{{\rm i}\gamma t + \int_{0}^{\infty}\Big(e^{{\rm i}tx} - 1 - \frac{{\rm i}tx}{1 + x^{2}}\Big)\,dL(x)\Big\}$,  (3.21)

where $L(x) = -x^{-1}(1 - A(s))$ for $x > 0$ and where $\gamma$ is a real constant. Moreover, for any fixed $\tau > 0$, we have, as $n \to \infty$,

$\beta_n = n\int_{0}^{\tau} x\,dF_n(x) + O(1)$.  (3.22)

We proceed now to the evaluation of $\beta_n$. Fix $\tau > 0$, $\lambda > 0$, and set $u_n = \lambda n^{-\delta_n}$. An integration by parts gives

$n\int_{0}^{\tau} x\,dF_n(x) = -n\tau\big(1 - F_n(\tau)\big) + n\int_{0}^{u_n}\big(1 - F_n(x)\big)\,dx + n\int_{u_n}^{\tau}\big(1 - F_n(x)\big)\,dx =: A_n + B_n + C_n$.

By (3.17), $A_n \to -(1 - A(s))$ as $n \to \infty$. For $B_n$, we use the bounds $0 \le B_n \le nu_n = \lambda n^{1-\delta_n}$. Finally, by (3.1), there exists a constant $K_s$ such that, uniformly over $x \ge u_n$,

$\big|n(1 - F_n(x)) - x^{-1/\delta_n}R(\delta_n, s)\big| \le K_s\,n^{-1}x^{-2/\delta_n}$,

where

$R(\delta_n, s) = \big(s^{1/\delta_n} + (1 - s)^{1/\delta_n}\big)\Big(1 - A\big(s^{1/\delta_n}/(s^{1/\delta_n} + (1 - s)^{1/\delta_n})\big)\Big) \to 1 - A(s)$  as $\delta_n \uparrow 1$.

It follows that

$C_n = \big\{\delta_n\lambda^{1-1/\delta_n}\big\}R(\delta_n, s)\,\frac{n^{1-\delta_n}}{1 - \delta_n} - \delta_n\,\frac{\tau^{1-1/\delta_n}}{1 - \delta_n}\,R(\delta_n, s) + O\big(n^{1-\delta_n}\big)$.

If we assume that $\delta_n \uparrow 1$, the first term dominates the others if and only if $n^{1-\delta_n} \to \infty$, or equivalently if $(1 - \delta_n)\log n \to \infty$, in which case we have

$C_n = (1 + o(1))\,\frac{n^{1-\delta_n}}{1 - \delta_n}\,(1 - A(s))$  as $n \to \infty$.  (3.23a)

On the other hand, if we assume that $(1 - \delta_n)\log n \to \alpha \in (0, \infty)$, we get likewise

$C_n = (1 + o(1))\,\frac{e^{\alpha} - 1}{1 - \delta_n}\,(1 - A(s))$  as $n \to \infty$.  (3.23b)

Finally, if $(1 - \delta_n)\log n \to 0$, we obtain

$C_n = (1 + o(1))(\log n)(1 - A(s)) = o\big(1/(1 - \delta_n)\big)$  as $n \to \infty$.  (3.23c)

It is obvious from the arguments above that in all three cases we have

$\beta_n = C_n\big(1 + o(1)\big) + O(1)$.  (3.24)

We may now state the main result of this section.
Theorem 3.1. Assume that $0 < s < 1$ is such that $A(s) < 1$. Let also $0 < \delta_n < 1$ be a sequence such that $\delta_n \to 1$ as $n \to \infty$. Then:
(i) If $(1 - \delta_n)\log n \to \infty$ as $n \to \infty$, we have

$A_{n,\delta_n}(s) \to A(s)$  in probability as $n \to \infty$;  (3.25)

(ii) If $(1 - \delta_n)\log n \to \alpha \in (0, \infty)$ as $n \to \infty$, we have

$A_{n,\delta_n}(s) \to 1 - (1 - A(s))\big(1 - e^{-\alpha}\big)$  in probability as $n \to \infty$;  (3.26)

(iii) If $(1 - \delta_n)\log n \to 0$ as $n \to \infty$, we have

$A_{n,\delta_n}(s) \to 1$  in probability as $n \to \infty$.  (3.27)

Proof. It is an obvious consequence of (3.20)-(3.24), which imply that

$A_{n,\delta_n}(s) = 1 - R(\delta_n)\,n^{\delta_n - 1}\big(\beta_n + O_P(1)\big)$,  (3.28)

together with the fact that $R(\delta_n)/(1 - \delta_n) \to 1$.  □

Proof of Theorem 1.1. It follows immediately from (3.28).  □
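As a sanity check of Theorem 3.1(i), and hence of Theorem 1.1, one can simulate from a bivariate extreme-value model with a known dependence function and watch $A_{n,\delta_n}(s)$ approach $A(s)$. The sketch below uses the logistic model, whose copula (the Gumbel copula) can be sampled by numerically inverting its conditional distribution, transforms to the exponential coordinates of (1.2), and applies (1.5) with (1.8). The logistic model, the parameter $r$, the sample size and the sequence $\delta_n = 1 - (\log\log n)/\log n$ (which satisfies (1.6)) are illustrative assumptions, not prescriptions from the paper.

```python
# Sanity-check sketch for Theorem 3.1(i): simulate from a logistic (Gumbel)
# bivariate extreme-value model, transform to exponential coordinates as in
# (1.2), and compare A_{n,delta_n}(s) with the true A(s).
import numpy as np
from scipy.optimize import brentq

r = 2.0                                   # logistic dependence parameter (r >= 1)

def A_true(s):
    return (s**r + (1.0 - s)**r) ** (1.0 / r)

def sample_gumbel_copula(n, rng):
    """Sample (U*, V*) with uniform margins from the Gumbel copula of parameter r
    by inverting its conditional distribution C_{2|1}(v | u) numerically."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = np.empty(n)
    for i in range(n):
        x = -np.log(u[i])
        def cond(vv):                     # dC(u,v)/du, increasing from 0 to 1 in v
            y = -np.log(vv)
            S = (x**r + y**r) ** (1.0 / r)
            return np.exp(-S) * S**(1.0 - r) * x**(r - 1.0) / u[i] - w[i]
        v[i] = brentq(cond, 1e-12, 1.0 - 1e-12)
    return u, v

rng = np.random.default_rng(2)
n = 20_000
ustar, vstar = sample_gumbel_copula(n, rng)
U, V = -np.log(ustar), -np.log(vstar)     # exponential coordinates of (1.2)

delta_n = 1.0 - np.log(np.log(n)) / np.log(n)   # satisfies condition (1.6)
s = 0.3
W = np.minimum((1.0 - s) / U**delta_n, s / V**delta_n)
A_est = 1.0 - (1.0 - delta_n) * W.mean()        # estimator (1.5) with (1.8)
print(A_est, A_true(s))
```

Since the attainable rate is only $O_P(1/\log n)$ (see Remark 3.1 below), only rough agreement should be expected at moderate sample sizes.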
Remark 3.1. (i) In view of Lemmas 3.1 and 3.2, we have a simple characterization of all possible rates of consistency of $A_{n,\delta_n}(s)$. A disappointing fact, which follows easily from Theorems 2.1 and 3.1, is that $A_{n,\delta_n}(s) - A(s)$ cannot achieve a rate of consistency better than $O_P(1/\log n)$.

(ii) In order to obtain better performance, it is necessary to modify the form of $A_{n,\delta_n}$ so as to enable consistency for values of $\delta_n$ as close as possible to one. The following example of a modified estimator, tailored to cover the cases where $(1 - \delta_n)\log n \to 0$ or $(1 - \delta_n)\log n \to \alpha \in (0, \infty)$, follows this idea:

$\tilde A_{n,\delta_n}(s) = 1 - \frac{1 - \delta_n}{n - n^{\delta_n}}\sum_{i=1}^{n}\min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big)$  for $0 \le s \le 1$.  (3.29)

If we use again (3.20)-(3.24) as in (3.28), we see that

$\frac{1 - \delta_n}{n - n^{\delta_n}}\sum_{i=1}^{n}\min\big((1 - s)/U_i^{\delta_n},\ s/V_i^{\delta_n}\big) = (1 + o(1))(1 - A(s)) + \frac{1 - \delta_n}{n^{1-\delta_n} - 1}\,O_P(1)$,  (3.30)

where the $O_P(1)$ corresponds to the limiting distribution (3.21). A consequence of this representation is that, in spite of the fact that $\tilde A_{n,\delta_n}$ is consistent in the case where $(1 - \delta_n)\log n \to 0$, the rate of consistency given by (3.30) is again $O_P(1/\log n)$. It follows that these modifications give only a slight improvement over the original estimate.
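A minimal sketch of the modified estimator (3.29) follows, under the same assumptions as the earlier estimator code (data already reduced to the exponential coordinates of (1.2)); the normalization $(1 - \delta_n)/(n - n^{\delta_n})$ is the one displayed above.

```python
# Sketch of the modified estimator (3.29): as (1.5) with R(delta) = 1 - delta,
# but with the normalization n replaced by n - n^delta.
import numpy as np

def A_n_delta_modified(u, v, s, delta):
    u, v = np.asarray(u, float), np.asarray(v, float)
    n = len(u)
    s = np.atleast_1d(np.asarray(s, float))
    w = np.minimum((1.0 - s[:, None]) / u**delta, s[:, None] / v**delta).sum(axis=1)
    return 1.0 - (1.0 - delta) / (n - n**delta) * w
```

As noted above, this changes the centering just enough to keep the estimator consistent when $(1 - \delta_n)\log n$ stays bounded; the attainable rate remains $O_P(1/\log n)$.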
References

Deheuvels, P. (1984), Point processes and multivariate extreme values II, in: P.R. Krishnaiah, ed., Multivariate Analysis, Vol. VI (North-Holland, Amsterdam) pp. 145-164.
Galambos, J. (1978), The Asymptotic Theory of Extreme Order Statistics (Wiley, New York).
Petrov, V.V. (1975), Sums of Independent Random Variables (Springer, Berlin).
Pickands, J. III (1981), Multivariate extreme value distributions, in: Bull. Internat. Statist. Inst., Proc. 43rd Session (Buenos Aires) pp. 859-878.
Smith, R.L., J.A. Tawn and H.K. Yuen (1987), Statistics of multivariate extremes, Preprint, University of Surrey (Guildford, UK).
Tawn, J.A. (1987), Bivariate extreme value theory: models and estimation, Tech. Rept. 57, University of Surrey (Guildford, UK).
Tiago de Oliveira, J. (1987), Intrinsic estimation of the dependence structure for bivariate extremes, Tech. Rept. 87-18, Dept. of Statistics, Iowa State University (Ames, IA).