Journal of Statistical Planning and Inference 138 (2008) 2253 – 2266 www.elsevier.com/locate/jspi
Distributions and moments of record values in a sequence of maximally dependent random variables

Ch.A. Charalambides^a, Tomasz Rychlik^b

^a Department of Mathematics, University of Athens, Athens, Greece
^b Institute of Mathematics, Polish Academy of Sciences, Toruń, Poland
Received 3 October 2006; received in revised form 14 September 2007; accepted 14 September 2007 Available online 4 March 2008
Abstract

A sequence of possibly dependent random variables is maximally dependent if all the sample maxima in the sequence have stochastically maximal distributions in the class of all distributions with the same marginals. For a sequence of maximally dependent standard uniform random variables, we determine the distribution functions of record times and values. We show that the distribution of the record occurrence times coincides with the respective distribution for the i.i.d. sequence, and the distributions of the record values are stochastically maximal in the class of sequences with the same record times distributions, containing all the exchangeable sequences. We also derive analytic formulae for the moments of record values from the maximally dependent sequence, and compare them with those of the i.i.d. case.
© 2008 Elsevier B.V. All rights reserved.

MSC: primary 60E15; secondary 60G70; 05A30

Keywords: Maximally dependent sequence; Exchangeable sequence; Sample maximum; Record time; Record value; Stochastic order; Signless Stirling numbers of the first kind; Riemann zeta function
1. Introduction

Dependent random variables X_1, X_2, ..., X_j, with marginal distribution functions F_1, F_2, ..., F_j, respectively, are called maximally dependent if their maximum X_{j:j} = max{X_1, X_2, ..., X_j} has the distribution function

F_{j:j}(x) = \max\left\{0,\ 1 - \sum_{i=1}^{j} [1 - F_i(x)]\right\}.
It can be shown that the right-hand side of this expression is the best uniform lower bound for the values of distribution functions of the maxima of arbitrarily dependent random variables with given marginals F_1, F_2, ..., F_j. Obviously, maximally dependent random variables have the maximal expectations of arbitrary non-decreasing functions of the sample maximum among the samples with the same marginals. In other words, maxima of the maximally dependent variables are maximal in the stochastic order among the maxima over the class. We say that a sequence
X_1, X_2, ..., X_j, ..., with respective marginals F_1, F_2, ..., F_j, ..., is maximally dependent if X_1, X_2, ..., X_j are maximally dependent for all j = 1, 2, .... Various sequences of maximally dependent random variables have been constructed and studied. Mallows (1969) presented the first example of an absolutely continuous maximally dependent standard uniform sample, solving the problem of maximizing the expectation of the maximum of dependent uniform random variables. Lai and Robbins (1976) extended the result to samples with arbitrary marginal distributions. Furthermore, Lai and Robbins (1978) defined an infinite sequence of maximally dependent uniform random variables, and called it canonical. It is a sequence of piecewise linear deterministic functions of a single uniform random variable. By the standard quantile transformation X_j = F^{-1}(U_j), j = 1, 2, ..., one can obtain maximally dependent identically distributed sequences with an arbitrary common distribution function F. Lai and Robbins (1978) determined limiting values and distributions of such sequences. Using more delicate arguments, Tchen (1980) constructed maximally dependent sequences with arbitrary non-identical marginals. Rychlik (1992) constructed identically distributed samples with the stochastically maximal distributions
P(X_{k:j} \le x) = \max\left\{0,\ \frac{jF(x) + 1 - k}{j + 1 - k}\right\}

of an arbitrary order statistic X_{k:j}, k = 1, 2, ..., j. He showed that stochastically maximal distributions do not exist for sequences of order statistics other than the sample maxima; it is only possible to construct joint distributions so that some subsequences of order statistics possess the property of being stochastically extreme. Rychlik (1995) determined necessary and sufficient conditions on non-identical marginals F_1, F_2, ..., F_j so that the analogous bounds
P(X_{k:j} \le x) = \max\left\{0,\ \frac{\sum_{i=1}^{j} F_i(x) + 1 - k}{j + 1 - k}\right\}

are uniformly attained for all real x. In the present paper, we consider records based on maximally dependent uniform sequences. The records in a random sequence X_1, X_2, ..., X_j, ..., or, more precisely, the upper records, are the values of the sequence which are greater than all the preceding ones. By convention, we assume that the first record occurs at time T_1 = 1, and its value coincides with the first observation, R_1 = X_1. The consecutive record occurrence times and values are defined as

T_k = \min\{j > T_{k-1}: X_j > X_{j-1:j-1}\},  k = 2, 3, ...,

and

R_k = X_{T_k},  k = 2, 3, ...,
respectively. If the observations X_1, X_2, ..., X_j, ... are independent and have an identical continuous distribution function F, then the record sequence is infinite almost surely and the record occurrence times have the probability functions

P(T_k = n) = \frac{|s(n-1, k-1)|}{n!},  n = k, k+1, ...,  k = 1, 2, ...,   (1.1)

where |s(n, k)|, k = 1, 2, ..., n, n = 1, 2, ..., are the signless Stirling numbers of the first kind, which are defined by

(t + n - 1)_n = \sum_{k=0}^{n} |s(n, k)| t^k,  n = 0, 1, ...,
where (x)_n = x(x - 1) \cdots (x - n + 1) (cf. Charalambides, 2002, p. 278). Note that the distributions of the record times do not depend on the particular form of the marginal distribution function F. The distribution functions of the record values in the i.i.d. continuous case have the form

F_{R_k}(x) = 1 - (1 - F(x)) \sum_{i=0}^{k-1} \frac{[-\log(1 - F(x))]^i}{i!},  k = 1, 2, ....
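For numerical work with (1.1), the signless Stirling numbers are conveniently generated by their standard triangular recurrence |s(n, k)| = |s(n-1, k-1)| + (n-1)|s(n-1, k)| (the same recurrence is quoted in Section 2 below). The following Python sketch is an illustration added here, not part of the original article; the function names are ours.

from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling1_unsigned(n, k):
    """Signless Stirling number of the first kind |s(n, k)|, computed from the
    triangular recurrence |s(n, k)| = |s(n-1, k-1)| + (n-1)|s(n-1, k)|."""
    if n == k == 0:
        return 1
    if n == 0 or k == 0 or k > n:
        return 0
    return stirling1_unsigned(n - 1, k - 1) + (n - 1) * stirling1_unsigned(n - 1, k)

def record_time_pmf(k, n):
    """P(T_k = n) = |s(n-1, k-1)|/n!, formula (1.1), valid for n >= k >= 1."""
    return stirling1_unsigned(n - 1, k - 1) / factorial(n) if n >= k else 0.0

# For k = 2, (1.1) reduces to P(T_2 = n) = 1/(n(n-1)):
print([round(record_time_pmf(2, n), 4) for n in range(2, 7)])   # [0.5, 0.1667, 0.0833, 0.05, 0.0333]
# The truncated total mass is close to 1 but converges slowly (the record times have heavy tails):
print(sum(record_time_pmf(3, n) for n in range(3, 150)))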
If F has a density f, then the corresponding density function of R_k is

f_{R_k}(x) = \frac{[-\log(1 - F(x))]^{k-1}}{(k-1)!} f(x),  k = 1, 2, ....   (1.2)
Comprehensive studies of records for the standard i.i.d. model and some more general ones are presented in the monographs of Arnold et al. (1998) and Nevzorov (2001). The sequence of maximally dependent random variables considered in this paper is constructed as follows. Let U_1, U_2, ..., U_j, ..., be a sequence of random variables defined recursively by

U_j = \begin{cases} (j-1)\left(U_{j-1:j-1} - 1 + \frac{1}{j}\right) + 1, & 1 - \frac{1}{j-1} \le U_{j-1:j-1} < 1 - \frac{1}{j}, \\ (j-1)\left(U_{j-1:j-1} - 1 + \frac{1}{j}\right), & 1 - \frac{1}{j} \le U_{j-1:j-1} \le 1, \end{cases}   (1.3)

for j = 2, 3, ..., with U_1 = U_{1:1} uniformly distributed on the interval [0, 1]. In Section 2, we first show that this sequence is maximally dependent uniform. Then we determine the distributions of occurrence times and values of records based on this sequence. In particular, we prove that the record times have the distribution (1.1), identical with that of the i.i.d. case. We also conclude that the respective distribution of record values is maximal in the stochastic order among all dependent standard uniform sequences with the same distribution of record times. By use of the quantile transformation, the statements can be extended to identically distributed sequences with an arbitrary continuous distribution. In particular, the expectations of any monotone non-decreasing functions of records from (1.3) are maximal. In Section 3, we establish closed analytic formulae for the moments of record values based on the sequence. We also compare the expectations and variances of records from the maximally dependent and i.i.d. sequences.

2. Distribution and density functions

Consider the sequence of random variables U_1, U_2, ..., U_j, ..., defined by (1.3). Note that the functions g_j(x) = (j-1)(x - 1 + 1/j) + 1 and h_j(x) = (j-1)(x - 1 + 1/j), j = 2, 3, ..., appearing on the right-hand side of (1.3), are linear increasing and transform the intervals [1 - 1/(j-1), 1 - 1/j] and [1 - 1/j, 1] onto [1 - 1/j, 1] and [0, 1 - 1/j], respectively. Therefore the sequence of partial maxima U_{j:j} = max{U_1, U_2, ..., U_j}, j = 1, 2, ..., may be expressed recursively by

U_{j:j} = \begin{cases} (j-1)\left(U_{j-1:j-1} - 1 + \frac{1}{j}\right) + 1, & 1 - \frac{1}{j-1} \le U_{j-1:j-1} < 1 - \frac{1}{j}, \\ U_{j-1:j-1}, & 1 - \frac{1}{j} \le U_{j-1:j-1} \le 1, \end{cases}   (2.1)

for j = 2, 3, ..., with U_{1:1} = U_1. In the following lemma, we show that U_{j:j} is uniformly distributed on [1 - 1/j, 1], j = 1, 2, ..., and conclude that U_j defined by (1.3) is uniformly distributed on [0, 1], j = 1, 2, ..., which implies that the sequence is maximally dependent uniform.

Lemma 2.1. The sequence of random variables U_1, U_2, ..., U_j, ..., defined by (1.3) is maximally dependent uniform.

Proof. Note that by (2.1) the random variable U_{j:j} is supported on [1 - 1/j, 1], j = 1, 2, .... It will be shown by induction that U_{j:j} is uniformly distributed on [1 - 1/j, 1], j = 1, 2, .... The random variable U_{1:1} = U_1 is, by assumption, uniformly distributed on [0, 1]. Suppose that the random variable U_{j-1:j-1} is uniformly distributed on [1 - 1/(j-1), 1], that is,

P(U_{j-1:j-1} \le u) = (j-1)(u - 1) + 1,  1 - \frac{1}{j-1} \le u \le 1.
Then, for 1 - 1/j \le u \le 1, it follows that

P(U_{j:j} \le u) = P\left((j-1)\left(U_{j-1:j-1} - 1 + \tfrac{1}{j}\right) + 1 \le u,\ U_{j-1:j-1} < 1 - \tfrac{1}{j}\right) + P\left(U_{j-1:j-1} \le u,\ U_{j-1:j-1} \ge 1 - \tfrac{1}{j}\right)
 = P\left(0 \le U_{j-1:j-1} \le 1 - \tfrac{1}{j} - \tfrac{1-u}{j-1}\right) + P\left(1 - \tfrac{1}{j} \le U_{j-1:j-1} \le u\right)
 = \left(u - 1 + \tfrac{1}{j}\right) + \left((j-1)(u-1) + 1 - \tfrac{1}{j}\right) = j(u - 1) + 1.

Therefore, by the induction argument, U_{j:j} is uniformly distributed on [1 - 1/j, 1] for all j = 1, 2, .... This implies that

P(U_j \le u) = P\left((j-1)\left(U_{j-1:j-1} - 1 + \tfrac{1}{j}\right) \le u,\ U_{j-1:j-1} \ge 1 - \tfrac{1}{j}\right) = P\left(1 - \tfrac{1}{j} \le U_{j-1:j-1} \le 1 - \tfrac{1}{j} + \tfrac{u}{j-1}\right) = u,

for 0 \le u < 1 - 1/j, and

P(U_j \le u) = P\left((j-1)\left(U_{j-1:j-1} - 1 + \tfrac{1}{j}\right) \le 1 - \tfrac{1}{j},\ U_{j-1:j-1} \ge 1 - \tfrac{1}{j}\right) + P\left((j-1)\left(U_{j-1:j-1} - 1 + \tfrac{1}{j}\right) + 1 \le u,\ U_{j-1:j-1} < 1 - \tfrac{1}{j}\right)
 = P\left(U_{j-1:j-1} \ge 1 - \tfrac{1}{j}\right) + P\left(U_{j-1:j-1} \le 1 - \tfrac{1}{j} - \tfrac{1-u}{j-1}\right) = u,

for 1 - 1/j \le u \le 1, which completes the proof. □
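The recursion (1.3), combined with (2.1) for the running maximum, is easy to simulate, which gives a quick empirical illustration of Lemma 2.1. The following Python sketch is added here for illustration and is not taken from the paper; the function names and sample sizes are ours.

import random

def simulate(n_terms, rng):
    """One realisation of U_1, ..., U_n from (1.3), together with the running
    maxima U_{1:1}, ..., U_{n:n} obtained via (2.1)."""
    u = rng.random()
    seq, maxima, m = [u], [u], u
    for j in range(2, n_terms + 1):
        if m < 1 - 1 / j:                        # U_{j-1:j-1} in [1 - 1/(j-1), 1 - 1/j)
            u_j = (j - 1) * (m - 1 + 1 / j) + 1  # first branch: a new maximum appears
            m = u_j
        else:                                    # U_{j-1:j-1} in [1 - 1/j, 1]
            u_j = (j - 1) * (m - 1 + 1 / j)      # second branch: maximum unchanged
        seq.append(u_j)
        maxima.append(m)
    return seq, maxima

rng = random.Random(0)
u5, m5 = [], []
for _ in range(100_000):
    seq, maxima = simulate(10, rng)
    u5.append(seq[4])        # realisations of U_5
    m5.append(maxima[4])     # realisations of U_{5:5}
print(min(u5), max(u5), sum(u5) / len(u5))   # roughly 0, 1, with mean close to 1/2
print(min(m5), max(m5), sum(m5) / len(m5))   # roughly 0.8, 1, with mean close to 0.9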
Note that the recursive definition (1.3) is much simpler than the one for the canonical sequence of Lai and Robbins (1978), and provides easier calculations and evaluations of records. In order to derive the probability function of the record occurrence times T_k, consider the sequence of record indicator random variables

I_j = \mathbf{1}\{\text{a record occurs at time } j\} = \mathbf{1}\{U_{j:j} > U_{j-1:j-1}\},  j = 2, 3, ....   (2.2)
The following lemma establishes two important properties of this sequence of random variables.

Lemma 2.2. Let U_j, j = 1, 2, ..., be the sequence of maximally dependent random variables defined by (1.3). Then the record indicator random variables I_j, j = 1, 2, ..., are independent and

P(I_j = 1) = \frac{1}{j},  j = 1, 2, ....

Proof. Clearly P(I_1 = 1) = P(T_1 = 1) = 1, by definition. Since (2.1) holds and U_{j:j} is uniformly distributed on the interval [1 - 1/j, 1], j = 1, 2, ..., we have

P(I_j = 1) = P(U_{j:j} > U_{j-1:j-1}) = P\left(1 - \frac{1}{j-1} \le U_{j-1:j-1} < 1 - \frac{1}{j}\right) = \frac{1}{j},

for j = 2, 3, .... Since the record indicators are binary, in order to prove that they are independent it suffices to check that

P(I_1 = 1, I_2 = 1, \ldots, I_n = 1) = \prod_{j=1}^{n} P(I_j = 1)
(cf. Arnold et al., 1998, Exercise 9, p. 45); the product representations for the probabilities of the remaining events can be recovered from it. Clearly, by (2.1) and (2.2), we have

\{I_1 = 1, I_2 = 1, \ldots, I_n = 1\} = \{U_{2:2} > U_1, U_{3:3} > U_{2:2}, \ldots, U_{n:n} > U_{n-1:n-1}\}
 = \left\{U_1 \in \left[0, \tfrac{1}{2}\right), U_{2:2} \in \left[\tfrac{1}{2}, \tfrac{2}{3}\right), \ldots, U_{n-1:n-1} \in \left[1 - \tfrac{1}{n-1}, 1 - \tfrac{1}{n}\right)\right\}.

Introducing the functions g_j(x) = (j-1)(x - 1 + 1/j) + 1, j = 2, 3, \ldots, n-1, and G_j(x) = g_j(G_{j-1}(x)), j = 3, 4, \ldots, n-1, with G_2(x) = g_2(x), we find

\{I_1 = 1, I_2 = 1, \ldots, I_n = 1\} = \left\{U_1 \in \left[0, \tfrac{1}{2}\right), g_2(U_1) \in \left[\tfrac{1}{2}, \tfrac{2}{3}\right), \ldots, g_{n-1}(U_{n-2:n-2}) \in \left[1 - \tfrac{1}{n-1}, 1 - \tfrac{1}{n}\right)\right\}
 = \left\{U_1 \in \left[0, \tfrac{1}{2}\right), G_2(U_1) \in \left[\tfrac{1}{2}, \tfrac{2}{3}\right), \ldots, G_{n-1}(U_1) \in \left[1 - \tfrac{1}{n-1}, 1 - \tfrac{1}{n}\right)\right\}.

We easily verify that the G_j(x) are linear increasing, with G_j(0) = 1 - 1/j and G_j(1/n!) = 1 - 1/j + (j-1)!/n!, j = 2, \ldots, n-1. It follows that

x \in \left[0, \tfrac{1}{2}\right),\ G_2(x) \in \left[\tfrac{1}{2}, \tfrac{2}{3}\right),\ \ldots,\ G_{n-1}(x) \in \left[1 - \tfrac{1}{n-1}, 1 - \tfrac{1}{n}\right)

if and only if x \in [0, 1/n!], and so

P(I_1 = 1, I_2 = 1, \ldots, I_n = 1) = P\left(U_1 \in \left[0, \tfrac{1}{n!}\right]\right) = \frac{1}{n!} = \prod_{j=1}^{n} P(I_j = 1).  □
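Lemma 2.2 can likewise be checked empirically. The sketch below is an added illustration, not the authors' code; it re-implements the recursion (1.3) in compressed form, notes that a record occurs at time j exactly when the first branch fires, and estimates P(I_j = 1) for j = 2, ..., 6 as well as P(I_2 = ... = I_6 = 1), which should be close to 1/j and 1/6! = 1/720, respectively.

import random
from math import factorial

def indicators(n_terms, rng):
    """Record indicators I_2, ..., I_n for one realisation of (1.3)."""
    m, ind = rng.random(), []                  # m carries U_{j-1:j-1}
    for j in range(2, n_terms + 1):
        if m < 1 - 1 / j:                      # a record occurs at time j
            m = (j - 1) * (m - 1 + 1 / j) + 1
            ind.append(1)
        else:                                  # no record, maximum unchanged
            ind.append(0)
    return ind

rng, n, reps = random.Random(1), 6, 200_000
counts, all_records = [0] * (n - 1), 0
for _ in range(reps):
    ind = indicators(n, rng)
    all_records += all(ind)
    for j, v in enumerate(ind):
        counts[j] += v
for j in range(2, n + 1):
    print(j, counts[j - 2] / reps, round(1 / j, 4))   # estimate of P(I_j = 1) vs 1/j
print(all_records / reps, 1 / factorial(n))           # estimate of P(I_2 = ... = I_6 = 1) vs 1/720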
The distribution of the record indicators uniquely determines the distribution of the record occurrence times (cf. Nevzorov, 2001, Remark 28.4, p. 121). The former is identical with that of the i.i.d. sequence. Therefore we immediately have

Theorem 2.1. Let U_j, j = 1, 2, ..., be the sequence of maximally dependent random variables defined by (1.3). Then the probability functions of the kth record times T_k, k = 1, 2, ..., are given by (1.1).

It can be shown (see Nevzorov, 2001, Theorem 28.2, p. 119) that the statement of Lemma 2.2 holds for arbitrary exchangeable sequences X_1, X_2, ..., X_j, ... such that P(X_1 = X_2) = 0. The elements of (1.3) are almost surely distinct, but they are not exchangeable, as the following example shows. Suppose that U_1 \in A_1 = [0.2, 0.21], say. Then, by definition, U_2 \in A_2 = [0.7, 0.71] and U_3 \in A_3 = [0.06666..., 0.08666...]. We also easily check that U_1 \in A_3 implies U_2 \in [0.56666..., 0.58666...] and U_3 \in [0.83333..., 0.87333...]. Accordingly, P(U_1 \in A_1, U_2 \in A_2, U_3 \in A_3) = 0.01, whereas P(U_1 \in A_3, U_2 \in A_1, U_3 \in A_2) = P(U_1 \in A_3, U_2 \in A_2, U_3 \in A_1) = 0. Note that the pair of random variables U_1 and

U_2 = \begin{cases} U_1 + \tfrac{1}{2}, & U_1 \le \tfrac{1}{2}, \\ U_1 - \tfrac{1}{2}, & U_1 > \tfrac{1}{2}, \end{cases}

is exchangeable. Certainly, there exist sequences of dependent random variables with identical continuous distributions such that the distributions of record times are different. The most trivial examples are infinite repetitions of finite sequences, X_1, X_2, ..., X_k, X_1, X_2, ..., X_k, ..., say, which have a finite number of records with occurrence times not exceeding k.
On the other hand, Rychlik (2001, Theorem 33, p. 132) constructed sequences such that T_3 > N and R_2 > F^{-1}(1 - 1/N) almost surely for arbitrarily large N. The distributions of the record values are derived in the following theorem.

Theorem 2.2. Let R_k be the kth record value in the maximally dependent sequence U_j, j = 1, 2, ..., defined by (1.3). Then the distribution function of R_k is given by

F_{R_k}(u) = 1 - \frac{1}{r!} \sum_{j=1}^{k-1} |s(r, j)| - \frac{|s(r, k)|(1 - u)}{(r-1)!},   (2.3)

for 1 - 1/r \le u < 1 - 1/(r+1), r = k, k+1, ..., while F_{R_k}(u) = 0 for u < 1 - 1/k and F_{R_k}(u) = 1 for u \ge 1. Its density function is given by

f_{R_k}(u) = \frac{|s(r, k)|}{(r-1)!},  1 - \frac{1}{r} \le u < 1 - \frac{1}{r+1},  r = k, k+1, ...,   (2.4)

while f_{R_k}(u) = 0 for u < 1 - 1/k or u > 1.

Proof. The distribution function of R_k may be expressed as a mixture distribution. Specifically, from the expression

P(R_k \le u, T_k = n) = P(T_k = n) P(R_k \le u \mid T_k = n) = P(T_k = n) P(U_{n:n} \le u),  0 \le u \le 1,  n = k, k+1, ...,

it follows that

F_{R_k}(u) = \sum_{n=k}^{\infty} P(T_k = n) F_{U_{n:n}}(u),  0 \le u \le 1.   (2.5)
Since

F_{U_{n:n}}(u) = \begin{cases} 0, & -\infty < u < 1 - \tfrac{1}{n}, \\ 1 - n(1 - u), & 1 - \tfrac{1}{n} \le u < 1, \\ 1, & 1 \le u < \infty, \end{cases}

we get the expression

F_{R_k}(u) = \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{n!} \{1 - n(1 - u)\} = \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{n!} - (1 - u) \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{(n-1)!},   (2.6)

for 1 - 1/r \le u < 1 - 1/(r+1) and r = k, k+1, .... Using the triangular recurrence relation

|s(n, k)| = |s(n-1, k-1)| + (n-1)|s(n-1, k)|,  n = k, k+1, ...,

with |s(k-1, k)| = 0 (see Charalambides, 2002, p. 294), we get the relation

\sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{(n-1)!} = \sum_{n=k}^{r} \frac{|s(n, k)|}{(n-1)!} - \sum_{n=k+1}^{r} \frac{|s(n-1, k)|}{(n-2)!},
which implies that the second sum in (2.6) has the form

s_{r,k} = \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{(n-1)!} = \frac{|s(r, k)|}{(r-1)!}.   (2.7)

The first sum in (2.6),

S_{r,k} = \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{n!},

may be expressed as follows, using again the triangular recurrence relation of the signless Stirling numbers of the first kind. We then get the expression

\sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{n!} - \sum_{n=k+1}^{r} \frac{|s(n-1, k)|}{n!} = \sum_{n=k}^{r} \frac{|s(n, k)|}{n!} - \sum_{n=k+1}^{r} \frac{|s(n-1, k)|}{(n-1)!},

which implies the recurrence relation

S_{r,k} - S_{r,k+1} = \frac{|s(r, k)|}{r!},  k = 1, 2, ...,

with

S_{r,1} = \sum_{n=1}^{r} \frac{|s(n-1, 0)|}{n!} = 1,

and so

S_{r,k} = \sum_{n=k}^{r} \frac{|s(n-1, k-1)|}{n!} = 1 - \frac{1}{r!} \sum_{j=1}^{k-1} |s(r, j)|.   (2.8)
Introducing (2.7) and (2.8) into (2.6), the required expression (2.3) is deduced. Differentiating (2.3) in each interval, we immediately obtain (2.4). □

For large k, with u \in [1 - 1/k, 1] close to 1 and r \ge k large, the signless Stirling numbers of the first kind can be approximated as

|s(r, k)| \approx \frac{(r-1)! (\log r + \gamma)^{k-1}}{(k-1)!},

where \gamma = 0.57721... denotes the Euler constant (see Charalambides, 2002, p. 323). Then, for large k, the following approximation of the density function (2.4) is readily deduced:

f_{R_k}(u) \approx \frac{(\log r + \gamma)^{k-1}}{(k-1)!},  1 - \frac{1}{r} \le u < 1 - \frac{1}{r+1},  r = k, k+1, ...   (2.9)

(cf. (1.2) with F(x) = x and f(x) = 1, 0 \le x \le 1, and notice that here r is the integral part of 1/(1 - u)).

The distribution and density functions of the kth record value R_k in the sequence of maximally dependent random variables (X_1, X_2, ..., X_j, ...) = (F^{-1}(U_1), F^{-1}(U_2), ..., F^{-1}(U_j), ...), with a common absolutely continuous marginal distribution function F(x) and density f(x), x \in \mathbb{R}, may be deduced from (2.3) and (2.4) as follows.

Corollary 2.1. Let R_k be the kth record value in the sequence X_1 = F^{-1}(U_1), X_2 = F^{-1}(U_2), ..., X_j = F^{-1}(U_j), ... of maximally dependent random variables with a common absolutely continuous marginal distribution function F(x), x \in \mathbb{R}. Then the distribution function of R_k is given by

F_{R_k}(x) = 1 - \frac{1}{r!} \sum_{j=1}^{k-1} |s(r, j)| - \frac{|s(r, k)|}{(r-1)!} (1 - F(x)),   (2.10)
for F^{-1}(1 - 1/r) \le x < F^{-1}(1 - 1/(r+1)), r = k, k+1, ..., while F_{R_k}(x) = 0 for x < F^{-1}(1 - 1/k). Also, if F^{-1}(1) < \infty, then in addition F_{R_k}(x) = 1 for x \ge F^{-1}(1). Its density function is given by

f_{R_k}(x) = \frac{|s(r, k)|}{(r-1)!} f(x),  F^{-1}\left(1 - \frac{1}{r}\right) \le x < F^{-1}\left(1 - \frac{1}{r+1}\right),

for r = k, k+1, ..., while f_{R_k}(x) = 0 for x < F^{-1}(1 - 1/k) or x \ge F^{-1}(1).

Remark 2.1. Using representation (2.5), we note that (2.10) are the uniformly minimal distribution functions of record values based on possibly dependent identically F-distributed random variables such that the respective record occurrence times have distributions (1.1). The class contains all the sequences of exchangeable, almost surely distinct random variables with marginal F. Consequently, (2.3) and its quantile transformations provide maximal expectations of all non-decreasing functions of record values. In particular, by positivity of standard uniform random variables, we conclude that all the moments of records from (1.3) are maximal in the class of uniform sequences with record time distributions identical with those of the i.i.d. sequences.
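Formula (2.3) can be confronted numerically with the mixture representation (2.5): for fixed u < 1 only the terms with n \le r, r being the integral part of 1/(1 - u), contribute to (2.5), so the comparison involves no truncation error. The Python sketch below is an added illustration under these conventions, not code from the paper; the function names are ours.

from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def s1(n, k):
    """Signless Stirling numbers of the first kind via the triangular recurrence."""
    if n == k == 0:
        return 1
    if n == 0 or k == 0 or k > n:
        return 0
    return s1(n - 1, k - 1) + (n - 1) * s1(n - 1, k)

def cdf_closed(k, u):
    """F_{R_k}(u) from (2.3), with r chosen so that 1 - 1/r <= u < 1 - 1/(r+1)."""
    if u < 1 - 1 / k:
        return 0.0
    if u >= 1:
        return 1.0
    r = int(1 / (1 - u))
    return (1 - sum(s1(r, j) for j in range(1, k)) / factorial(r)
            - s1(r, k) * (1 - u) / factorial(r - 1))

def cdf_mixture(k, u):
    """F_{R_k}(u) from (2.5); F_{U_{n:n}}(u) vanishes for n > 1/(1 - u)."""
    r = int(1 / (1 - u))
    return sum(s1(n - 1, k - 1) / factorial(n) * (1 - n * (1 - u))
               for n in range(k, r + 1))

for k, u in [(2, 0.7), (3, 0.9), (3, 0.93), (4, 0.97)]:
    print(k, u, round(cdf_closed(k, u), 6), round(cdf_mixture(k, u), 6))   # equal up to rounding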
3. Moments

The moments of the record values in (1.3) are derived in the following theorem.

Theorem 3.1. Let R_k be the kth record value in the sequence of maximally dependent uniform random variables (1.3). The jth moment of R_k satisfies the recurrence relation

E(R_k^j) = E(R_{k-1}^j) + h_j(k),  k = 2, 3, ...,  E(R_1^j) = \frac{1}{j+1},   (3.1)

where

h_j(k) = \sum_{i=1}^{j} (-1)^{i-1} \frac{\binom{j}{i}}{i+1} \sum_{m=i}^{\infty} \frac{|s(m, i)|}{(m-1)!(m+1)^k},   (3.2)

for k = 2, 3, ..., j = 1, 2, ..., and so

E(R_k^j) = \frac{1}{j+1} + \sum_{r=2}^{k} h_j(r),  j = 1, 2, ....   (3.3)
Proof. Clearly, by (2.5),

E(R_k^j) = \sum_{n=k}^{\infty} P(T_k = n) E(U_{n:n}^j).

Since U_{n:n} is uniformly distributed on the interval [1 - 1/n, 1], its moments satisfy

E(U_{n:n}^j) = \frac{n}{j+1}\left[1 - \left(1 - \frac{1}{n}\right)^{j+1}\right] = 1 + \sum_{i=1}^{j} (-1)^i \frac{\binom{j}{i}}{(i+1)n^i},

and so, by (1.1), we have

E(R_k^j) = 1 + \sum_{i=1}^{j} (-1)^i \frac{\binom{j}{i}}{i+1} \sum_{n=k}^{\infty} \frac{|s(n-1, k-1)|}{(n-1)!\,n^{i+1}}.

Then, using the expansion

\frac{1}{t^k} = \sum_{n=k}^{\infty} \frac{|s(n-1, k-1)|}{(t+n-1)_n},   (3.4)
(see Charalambides, 2002, p. 300) with the variable n changed to m = n - 1, and setting k = i + 1 and t = n, we get

E(R_k^j) = 1 + \sum_{i=1}^{j} (-1)^i \frac{\binom{j}{i}}{i+1} \sum_{n=k}^{\infty} \sum_{m=i}^{\infty} \frac{|s(n-1, k-1)|\,|s(m, i)|}{(n-1)!\,(n+m)_{m+1}}
 = 1 + \sum_{i=1}^{j} (-1)^i \frac{\binom{j}{i}}{i+1} \sum_{m=i}^{\infty} \frac{|s(m, i)|}{m!} \sum_{n=k}^{\infty} \frac{|s(n-1, k-1)|}{(m+n)_n}.

If we evaluate the inner series using again (3.4), with t = m + 1, then the last expression reduces to

E(R_k^j) = 1 + \sum_{i=1}^{j} (-1)^i \frac{\binom{j}{i}}{i+1} \sum_{m=i}^{\infty} \frac{|s(m, i)|}{m!(m+1)^k}.

Finally, the identity

\frac{1}{m(m+1)} = \frac{1}{m} - \frac{1}{m+1},  m = 1, 2, ...,   (3.5)

implies the expression for the right-hand series

\sum_{m=i}^{\infty} \frac{|s(m, i)|}{m!(m+1)^k} = \sum_{m=i}^{\infty} \frac{|s(m, i)|}{m!(m+1)^{k-1}} - \sum_{m=i}^{\infty} \frac{|s(m, i)|}{(m-1)!(m+1)^k},

from which the recurrence relation (3.1) is readily deduced. □
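Theorem 3.1 can be illustrated numerically by truncating the inner series in (3.2). For j = 1, 2 the Stirling ratios simplify, since |s(m, 1)| = (m - 1)! and |s(m, 2)| = (m - 1)! H_{m-1} (identities also used in the proof of Corollary 3.1 below), so no large factorials are needed. The following Python sketch is an added illustration, not the authors' code; the truncation point and names are ours, and the printed values agree with Table 1 below only up to the truncation error.

def series(k, m_max=500_000):
    """Truncated S_i(k) = sum_{m >= i} |s(m, i)|/((m-1)!(m+1)^k) for i = 1, 2,
    using |s(m, 1)| = (m-1)! and |s(m, 2)| = (m-1)! H_{m-1}."""
    s_1 = s_2 = 0.0
    harmonic = 0.0                       # H_{m-1}, starting from H_0 = 0
    for m in range(1, m_max + 1):
        w = (m + 1.0) ** -k
        s_1 += w
        s_2 += harmonic * w
        harmonic += 1.0 / m
    return s_1, s_2

def mean_and_variance(k_max):
    """E(R_k) and Var(R_k) built from (3.1)-(3.3) with j = 1, 2."""
    e1, e2, rows = 1 / 2, 1 / 3, []      # E(R_1) = 1/2, E(R_1^2) = 1/3
    for k in range(2, k_max + 1):
        s_1, s_2 = series(k)
        e1 += s_1 / 2                    # h_1(k)
        e2 += s_1 - s_2 / 3              # h_2(k)
        rows.append((k, e1, e2 - e1 * e1))
    return rows

for k, mean, var in mean_and_variance(4):
    print(k, round(mean, 4), round(var, 4))   # compare with Table 1 below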
In particular, the expectations and variances of the record values can be expressed in terms of the Riemann zeta function. The derivation of the variance expression is facilitated by the following lemma, which is of interest on its own.

Lemma 3.1. The series

h(k) = \sum_{m=1}^{\infty} \frac{H_m}{(m+2)^k},  k = 2, 3, ...,   (3.6)

where

H_m = \sum_{i=1}^{m} \frac{1}{i},  m = 1, 2, ...,

is the mth harmonic number, can be expressed as

h(k) = \sum_{r=2}^{k} \zeta(r) + \frac{k}{2}\zeta(k+1) - \frac{1}{2}\sum_{r=2}^{k-1} \zeta(r)\zeta(k-r+1) - k,   (3.7)

for k = 3, 4, ..., and h(2) = \zeta(2) + \zeta(3) - 2, with

\zeta(k) = \sum_{i=1}^{\infty} \frac{1}{i^k},  k > 1,   (3.8)

being the Riemann zeta function.

Proof. Using the recurrence relation

H_{m+1} = H_m + \frac{1}{m+1},  m = 1, 2, ...,  H_1 = 1,
for the harmonic numbers, we get

\sum_{m=1}^{\infty} \frac{H_m}{(m+2)^k} = \sum_{m=1}^{\infty} \frac{H_m}{(m+1)^k} - \sum_{m=1}^{\infty} \frac{1}{m(m+1)^k}.

Similarly,

\sum_{m=1}^{\infty} \frac{H_m}{(m+1)^k} = \sum_{m=1}^{\infty} \frac{H_m}{m^k} - \sum_{m=1}^{\infty} \frac{1}{m^{k+1}}

holds, and so series (3.6) may be expressed as

h(k) = d(k) - \zeta(k+1) - \lambda(k),  k = 2, 3, ...,   (3.9)

where

d(k) = \sum_{m=1}^{\infty} \frac{H_m}{m^k},  k = 2, 3, ...,   (3.10)

is a Dirichlet series with the harmonic numbers as coefficients, and

\lambda(k) = \sum_{m=1}^{\infty} \frac{1}{m(m+1)^k},  k = 1, 2, ....   (3.11)

Euler in 1775 expressed (3.10) in terms of the Riemann zeta function as

d(k) = \frac{k+2}{2}\zeta(k+1) - \frac{1}{2}\sum_{r=2}^{k-1} \zeta(r)\zeta(k-r+1),   (3.12)

for k = 3, 4, ..., and d(2) = 2\zeta(3) (cf. Berndt, 1985, p. 252 et seq. for a detailed historical account). Also, using identity (3.5), we see that series (3.11) satisfies the first order recurrence relation

\lambda(k) - \lambda(k-1) = 1 - \zeta(k),  k = 2, 3, ...,  \lambda(1) = 1,

and so

\lambda(k) = k - \sum_{r=2}^{k} \zeta(r),  k = 2, 3, ...,  \lambda(1) = 1.   (3.13)

Plugging (3.12) and (3.13) into (3.9), we deduce the required expression (3.7). □
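The closed form (3.7) is easy to verify numerically against the defining series (3.6). The sketch below is an added check, not part of the paper; it assumes the third-party mpmath package for the zeta values in (3.8) (any sufficiently accurate implementation would do), and truncates (3.6), whose tail is negligible at the chosen cut-off for k >= 3.

from mpmath import zeta   # assumption: mpmath is available

def h_direct(k, m_max=200_000):
    """h(k) from the defining series (3.6), truncated at m_max."""
    total, harmonic = 0.0, 0.0
    for m in range(1, m_max + 1):
        harmonic += 1.0 / m              # H_m
        total += harmonic / (m + 2.0) ** k
    return total

def h_closed(k):
    """h(k) from (3.7) for k >= 3, and h(2) = zeta(2) + zeta(3) - 2."""
    if k == 2:
        return float(zeta(2) + zeta(3) - 2)
    return float(sum(zeta(r) for r in range(2, k + 1)) + k * zeta(k + 1) / 2
                 - sum(zeta(r) * zeta(k - r + 1) for r in range(2, k)) / 2 - k)

for k in (3, 4, 5):
    print(k, round(h_direct(k), 8), round(h_closed(k), 8))   # the two columns should agree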
The expectations and variances of record values are specified in the following corollary of Theorem 3.1.

Corollary 3.1. Let R_k be the kth record value in the sequence of maximally dependent random variables (1.3). The expected values of R_k satisfy the recurrence relation

E(R_k) = E(R_{k-1}) + \frac{1}{2}[\zeta(k) - 1],  k = 2, 3, ...,  E(R_1) = \frac{1}{2},   (3.14)

and so

E(R_k) = \frac{1}{2}\left[\sum_{r=2}^{k} \zeta(r) - (k - 2)\right],  k = 2, 3, ...,   (3.15)

where \zeta(k) is the Riemann zeta function (3.8). The variances of R_k satisfy the recurrence relation

Var(R_k) = Var(R_{k-1}) - \frac{1}{12} g(k),  k = 2, 3, ...,  Var(R_1) = \frac{1}{12},   (3.16)
and so

Var(R_k) = \frac{1}{12} - \frac{1}{12}\sum_{r=2}^{k} g(r),  k = 2, 3, ...,

with

g(k) = 4h(k) + 3[3 - \zeta(k)]^2 + 6[\zeta(k) - 1]\left[\sum_{r=2}^{k-1} \zeta(r) - (k - 3)\right] - 12,

where the series h(k) is expressed in terms of the Riemann zeta function (3.8) via (3.7).

Proof. Since |s(m, 1)| = (m - 1)!, the series (3.2) for j = 1 reduces to

h_1(k) = \frac{1}{2}\sum_{m=1}^{\infty} \frac{1}{(m+1)^k} = \frac{1}{2}[\zeta(k) - 1],  k = 1, 2, ....
Then the recurrence relation (3.14) and the explicit expression (3.15) are readily deduced from the recurrence relation (3.1) and the explicit expression (3.3), respectively. Also, since |s(m, 1)| = (m - 1)! and |s(m, 2)| = (m - 1)! H_{m-1}, the series (3.2) for j = 2 reduces to

h_2(k) = \sum_{m=1}^{\infty} \frac{1}{(m+1)^k} - \frac{1}{3}\sum_{m=1}^{\infty} \frac{H_{m-1}}{(m+1)^k} = \zeta(k) - 1 - \frac{1}{3}h(k),  k = 1, 2, ....

Thus, by (3.1), we have

E(R_k^2) = E(R_{k-1}^2) + [\zeta(k) - 1] - \frac{1}{3}h(k),  k = 2, 3, ...,  E(R_1^2) = \frac{1}{3}.   (3.17)
Then the recurrence relation (3.16) for the variance Var(R_k) = E(R_k^2) - [E(R_k)]^2 is obtained from (3.14) and (3.17) after some algebraic manipulations. □

Arnold and Balakrishnan (1989, p. 148) calculated E(R_2) = \zeta(2)/2 = \pi^2/12, and noted that it is greater than the expectation of the second record value in the i.i.d. uniform sequence. We showed in Remark 2.1 that this relation holds for all record values and all moments. Using recurrence relations (3.14) and (3.16) together with a table of the Riemann zeta function (3.8) (see Abramowitz and Stegun, 1965, Table 23.3, p. 811), we determine numerical values of the expectation E(R_k) and variance Var(R_k) of the kth record value R_k, k = 2, 3, ..., 12, from the sequence (1.3), and present them in Table 1.

Table 1. Expectations and variances of record values

 k    E(R_k)        E(R̃_k)        [1-E(R̃_k)]/[1-E(R_k)]    Var(R_k)      Var(R̃_k)      Var(R̃_k)/Var(R_k)
 2    0.82246703    0.75000000    1.40818917                0.01948506    0.04861111    2.49478943
 3    0.92349549    0.87500000    1.63389050                0.00595948    0.02141204    3.59293933
 4    0.96465710    0.93750000    1.76838922                0.00194151    0.00843943    4.34684039
 5    0.98312098    0.96875000    1.85141074                0.00064611    0.00313866    4.85779132
 6    0.99179251    0.98437500    1.90374899                0.00021666    0.00112760    5.20436344
 7    0.99596715    0.99218750    1.93721519                0.00007282    0.00039621    5.44089196
 8    0.99800583    0.99609375    1.95883227                0.00002448    0.00013716    5.60369049
 9    0.99901002    0.99804688    1.97290062                0.00000822    0.00004699    5.71679243
10    0.99950731    0.99902344    1.98210814                0.00000276    0.00001598    5.79611185
11    0.99975441    0.99951172    1.98816030                0.00000092    0.00000541    5.85224502
12    0.99987745    0.99975586    1.99215155                0.00000031    0.00000182    5.89230535
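The entries of Table 1 can be regenerated directly from (3.15) and (3.16), combined with the i.i.d. formulas (3.18) and (3.19) stated just below the table. The following sketch is an added convenience, not the authors' code; as in the previous sketch, it assumes the third-party mpmath package for the zeta values.

from mpmath import zeta   # assumption: mpmath is available

def h(k):
    """h(k) via the closed form (3.7), with h(2) = zeta(2) + zeta(3) - 2."""
    if k == 2:
        return zeta(2) + zeta(3) - 2
    return (sum(zeta(r) for r in range(2, k + 1)) + k * zeta(k + 1) / 2
            - sum(zeta(r) * zeta(k - r + 1) for r in range(2, k)) / 2 - k)

def g(k):
    """g(k) from Corollary 3.1."""
    return (4 * h(k) + 3 * (3 - zeta(k)) ** 2
            + 6 * (zeta(k) - 1) * (sum(zeta(r) for r in range(2, k)) - (k - 3)) - 12)

var = 1.0 / 12                                                    # Var(R_1)
for k in range(2, 13):
    mean = (sum(zeta(r) for r in range(2, k + 1)) - (k - 2)) / 2  # (3.15)
    var -= g(k) / 12                                              # (3.16)
    mean_iid, var_iid = 1 - 2.0 ** -k, 3.0 ** -k - 4.0 ** -k      # (3.18) and (3.19)
    print(k, f"{float(mean):.8f}", f"{float(mean_iid):.8f}",
          f"{float((1 - mean_iid) / (1 - mean)):.8f}",
          f"{float(var):.8f}", f"{float(var_iid):.8f}", f"{float(var_iid / var):.8f}")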
For comparison, we also present the respective values

E(\tilde{R}_k) = 1 - \frac{1}{2^k}   (3.18)

and

Var(\tilde{R}_k) = \frac{1}{3^k} - \frac{1}{4^k},   (3.19)

for records R̃_k, k = 2, 3, ..., 12, based on a sequence of independent standard uniform random variables. Both expectation sequences tend to 1, and the record expectations of the dependent sequence do so faster, as claimed. Also, both variance sequences decrease to zero, and those of the dependent sequence are smaller. In the same table, we also present values of the ratios [1 - E(R̃_k)]/[1 - E(R_k)] and Var(R̃_k)/Var(R_k), for k = 2, 3, ..., 12. Limiting values of these ratios are established in the following theorem.

Theorem 3.2. Let R_k and R̃_k, k = 1, 2, ..., denote the record values from the maximally dependent sequence U_1, U_2, ..., U_j, ... defined in (1.3) and from the independent standard uniform sequence Ũ_1, Ũ_2, ..., Ũ_j, ..., respectively. Then

\lim_{k\to\infty} \frac{E[(1 - \tilde{R}_k)^j]}{E[(1 - R_k)^j]} = (j+1)!,  j = 1, 2, ....   (3.20)

In particular,

\lim_{k\to\infty} \frac{1 - E(\tilde{R}_k)}{1 - E(R_k)} = 2  and  \lim_{k\to\infty} \frac{Var(\tilde{R}_k)}{Var(R_k)} = 6.
Proof. Clearly

E[(1 - \tilde{U}_{n:n})^j] = \frac{j!}{(n+j)_j},  E[(1 - U_{n:n})^j] = \frac{1}{(j+1)n^j},

and so, by (2.5), we find

E[(1 - \tilde{R}_k)^j] = \sum_{n=k}^{\infty} P(T_k = n) E[(1 - \tilde{U}_{n:n})^j] = \sum_{n=k}^{\infty} P(T_k = n) \frac{j!}{(n+j)_j}

and

E[(1 - R_k)^j] = \sum_{n=k}^{\infty} P(T_k = n) \frac{1}{(j+1)n^j}.

Then, applying the inequalities

\frac{1}{(n+j)_j} < \frac{1}{n^j},  n = 1, 2, ...,  j = 1, 2, ...,
and

\frac{(n+j)_j}{n^j} < \frac{(k+j)_j}{k^j},  n = k+1, k+2, ...,  j = 1, 2, ...,

we get

E[(1 - \tilde{R}_k)^j] = (j+1)! \sum_{n=k}^{\infty} P(T_k = n) \frac{1}{(j+1)(n+j)_j} < (j+1)! \sum_{n=k}^{\infty} P(T_k = n) \frac{1}{(j+1)n^j} = (j+1)!\,E[(1 - R_k)^j]

and

E[(1 - R_k)^j] = \sum_{n=k}^{\infty} P(T_k = n) \frac{(n+j)_j}{(j+1)!\,n^j} \cdot \frac{j!}{(n+j)_j} < \frac{(k+j)_j}{(j+1)!\,k^j} E[(1 - \tilde{R}_k)^j],

respectively. Thus

(j+1)! \frac{k^j}{(k+j)_j} < \frac{E[(1 - \tilde{R}_k)^j]}{E[(1 - R_k)^j]} < (j+1)!,  j = 1, 2, ...,

and letting k \to \infty, we obtain (3.20). The limits of the variance ratios may be deduced by considering the expression

\frac{Var(\tilde{R}_k)}{Var(R_k)} = \left(1 - \frac{[E(1 - \tilde{R}_k)]^2}{E[(1 - \tilde{R}_k)^2]}\right) \Big/ \left(\frac{E[(1 - R_k)^2]}{E[(1 - \tilde{R}_k)^2]} - \frac{[E(1 - R_k)]^2}{E[(1 - \tilde{R}_k)^2]}\right).   (3.21)

Specifically, from (3.18) and (3.19), we get

\frac{[E(1 - \tilde{R}_k)]^2}{E[(1 - \tilde{R}_k)^2]} = \left(\frac{3}{4}\right)^k,  k = 1, 2, ...,

and so

\lim_{k\to\infty} \frac{[E(1 - \tilde{R}_k)]^2}{E[(1 - \tilde{R}_k)^2]} = 0.

Also, by (3.20),

\lim_{k\to\infty} \frac{[E(1 - R_k)]^2}{E[(1 - \tilde{R}_k)^2]} = \lim_{k\to\infty} \frac{[E(1 - R_k)]^2}{[E(1 - \tilde{R}_k)]^2} \cdot \frac{[E(1 - \tilde{R}_k)]^2}{E[(1 - \tilde{R}_k)^2]} = 0.

Letting k \to \infty in (3.21), we find the required limit. □
Remark 3.1. As we noted in Remark 2.1, all the raw moments of records in the maximally dependent sequence (1.3) are greater than the respective record moments in the uniform sequences with the same distribution of record times. Lai and Robbins (1978, Theorem 1(ii)) proved that the variances of maxima in the maximally dependent uniform vectors are minimal in the class of arbitrarily dependent random vectors with uniform marginal distributions; for instance, the variance is asymptotically 12 times smaller than that of the maximum in the i.i.d. sequence. A similar claim cannot be stated for record values. Using (2.5), we can only obtain the minimal value of E[Var(R_k | T_k)] in the class of sequences with the classic laws of record occurrence times.
Remark 3.2. Record values from the i.i.d. uniform sequences do not have a non-degenerate limiting distribution (see Resnick, 1973, Theorems 4.2–4.4). Records in sequence (1.3) have asymptotic representations (2.9) of their density functions similar to (1.2). The expectations and variances tend to their limits at the same rate. Therefore one can hardly expect that the distributions of record values from the maximally dependent uniform variables have a limiting distribution.

Acknowledgments

The research of the first author was partially supported by the University of Athens Research Special Account under Grant 70/4/3406. The second author was partially supported by the Polish Ministry of Science and Higher Education Grant no. 1 P03A 015 30.

References

Abramowitz, M., Stegun, I.A., 1965. Handbook of Mathematical Functions. Dover, New York.
Arnold, B.C., Balakrishnan, N., 1989. Relations, Bounds and Approximations for Order Statistics. Lecture Notes in Statistics, vol. 53. Springer, New York.
Arnold, B.C., Balakrishnan, N., Nagaraja, H.N., 1998. Records. Wiley, New York.
Berndt, B.C., 1985. Ramanujan's Notebooks, Part I. Springer, New York.
Charalambides, Ch.A., 2002. Enumerative Combinatorics. Chapman & Hall/CRC, Boca Raton, FL.
Lai, T.L., Robbins, H., 1976. Maximally dependent random variables. Proc. Natl. Acad. Sci. USA 73, 286–288.
Lai, T.L., Robbins, H., 1978. A class of dependent random variables and their maxima. Z. Wahrsch. Verw. Gebiete 42, 89–111.
Mallows, C.L., 1969. Extrema of expectations of uniform order statistics. SIAM Rev. 11, 410–411.
Nevzorov, V.B., 2001. Records: Mathematical Theory. Translations of Mathematical Monographs, vol. 194. American Mathematical Society, Providence, RI.
Resnick, S.I., 1973. Limit laws for record values. Stochastic Process. Appl. 1, 67–82.
Rychlik, T., 1992. Stochastically extremal distributions of order statistics for dependent samples. Statist. Probab. Lett. 13, 337–341.
Rychlik, T., 1995. Bounds for order statistics based on dependent variables with given nonidentical distributions. Statist. Probab. Lett. 23, 351–358.
Rychlik, T., 2001. Projecting Statistical Functionals. Lecture Notes in Statistics, vol. 160. Springer, New York.
Tchen, A., 1980. Inequalities for distributions with given marginals. Ann. Probab. 8, 814–827.