Exponential characterizations motivated by the structure of order statistics in samples of size two


Statistics and Probability Letters 83 (2013) 596–601


Barry C. Arnold (Department of Statistics, University of California, Riverside, USA)

Jose A. Villasenor (Department of Statistics, Colegio de Postgraduados, Montecillo, Mexico; corresponding author)

Article history: Received 29 March 2012; received in revised form 25 October 2012; accepted 28 October 2012; available online 7 November 2012.

Keywords: Convolution; Order statistics; Functional equation; Failure rate

Abstract. Motivated by the observation that, for a sample of size two from an exponential distribution, the largest order statistic is distributed as the convolution of two independent exponential random variables whose distributions differ only in their intensity or rate parameter, a spectrum of related characterizations of the exponential distribution is identified and verified. © 2012 Elsevier B.V. All rights reserved.

1. Introduction

If we consider X1, X2, a sample of size two from an exponential distribution, it is well known that the two spacings X1:2 and X2:2 − X1:2 are independent exponentially distributed random variables. Consequently X2:2 has as its distribution a convolution of two independent exponential random variables (with differing intensity or rate parameters). For notation, we will write X ∼ exp(λ) if the density of X is of the form fX(x) = λe^{−λx} I(x > 0); its survival function is then F̄X(x) = P{X > x} = e^{−λx}, x > 0. In this paper we will refer to λ as an intensity parameter. Our observation about the structure of the joint distribution of the two spacings corresponding to an exponential sample of size two can be expressed as

X2:2 =d X1 + (1/2) X2,

where =d denotes equality in distribution and can be read as "has the same distribution as". Thus X2:2 has the same distribution as the convolution of two exponential variables with different intensity parameters. This property holds for exponential samples but may be expected to fail for samples from other distributions. We will confirm the truth of this assertion, under mild regularity conditions, in Section 2. In addition, the standard exponential distribution (corresponding to the case in which λ = 1) has the also well-known striking property that F̄X(x) = fX(x), x > 0; i.e., it has a constant failure rate. Combining this distributional property with the convolution property displayed above, we may write the following extensive list of distributional properties that are all satisfied by a sample of size two from a distribution function F with density f and survival function F̄, when F is a standard exponential distribution function:

X1 + (1/2) X2 =d max{X1, X2},  (1)
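Identity (1) lends itself to a quick simulation check. The sketch below compares the empirical distributions of X1 + (1/2)X2 and max{X1, X2} for standard exponential data via a hand-rolled two-sample Kolmogorov–Smirnov statistic; the sample size, seed, and tolerance are arbitrary illustrative choices.

```python
import random

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: sup over x of |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:          # tie: advance both empirical CDFs together
            i += 1
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d

rng = random.Random(0)
n = 100_000
conv = [rng.expovariate(1) + rng.expovariate(1) / 2 for _ in range(n)]
maxs = [max(rng.expovariate(1), rng.expovariate(1)) for _ in range(n)]
print(ks_distance(conv, maxs))  # small, consistent with equality in distribution
```

Running the same comparison on non-exponential inputs (uniform samples, say) should produce a visibly larger distance, which is the content of the characterization.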


X1 + (1/2) X2 has density 2[f(x) − f(2x)],  (2)

X1 + (1/2) X2 has density 2[F̄(x) − F̄(2x)],  (3)

X1 + (1/2) X2 has density 2[f(x) − F̄(2x)],  (4)

X1 + (1/2) X2 has density 2[F̄(x) − f(2x)],  (5)

max{X1, X2} has density 2[f(x) − f(2x)],  (6)

max{X1, X2} has density 2[F̄(x) − F̄(2x)],  (7)

max{X1, X2} has density 2[f(x) − F̄(2x)],  (8)

max{X1, X2} has density 2[F̄(x) − f(2x)],  (9)

f(x) − f(2x) = F̄(x) − F̄(2x).  (10)
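Since the standard exponential has f(x) = F̄(x) = e^{−x}, the four right-hand sides appearing in (2)–(9) coincide as functions, and the common density integrates to one. A numeric sanity check (the grid and step size are arbitrary):

```python
import math

f = lambda x: math.exp(-x)      # standard exponential density
Fbar = lambda x: math.exp(-x)   # survival function; equals f for the standard exponential

# The four candidate densities from (2)-(9) agree pointwise on a grid...
for x in [0.05 * k for k in range(1, 200)]:
    vals = [2 * (f(x) - f(2 * x)),
            2 * (Fbar(x) - Fbar(2 * x)),
            2 * (f(x) - Fbar(2 * x)),
            2 * (Fbar(x) - f(2 * x))]
    assert max(vals) - min(vals) < 1e-12

# ...and the common density 2(e^{-x} - e^{-2x}) integrates to one (midpoint rule on [0, 40]).
step = 0.001
total = sum(2 * (f(x) - f(2 * x)) * step
            for x in [step * (k + 0.5) for k in range(40_000)])
print(total)  # close to 1
```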

It will be shown that each one of these conditions, on its own, sometimes with mild regularity assumptions on the form of F, is sufficient to guarantee that f(x) = λe^{−λx} I(x > 0) for some λ > 0. Throughout, we assume that we are dealing with absolutely continuous positive random variables (thus F(0) = 0) with density function f(x). A convenient survey of other exponential characterizations may be found in Chapter 19 of Johnson et al. (1994); see also Arnold and Huang (1995).

Remark 1. Characterizations are of particular interest when they shed light on the consequences of certain distributional assumptions and/or can be used to assess the plausibility of such assumptions via suitable tests of hypotheses. For example, consider the characterization based on Eq. (1). Eq. (1) will hold if a parallel system of two identical components exhibits the same reliability as a single component provided with a cold standby component with doubled failure rate. If such a situation is deemed plausible, then the assumption of an exponential distribution for the failure times will lead to an acceptable model. On the other hand, if a sample of n X's is available, then one can randomly divide the data set into four subsets, relabeled as

U1, U2, ..., Un/4,  V1, V2, ..., Vn/4,  W1, W2, ..., Wn/4,  Z1, Z2, ..., Zn/4.

Then for i = 1, 2, ..., n/4 define Si = max{Ui, Vi} and Ti = Wi + (1/2)Zi. The S's and the T's will have a common distribution if and only if the original X's have an exponential distribution. Any standard two-sample non-parametric test can be used to compare the sample distribution functions of the S's and the T's, to provide evidence regarding the acceptability of the exponential model.

2. The characterizations

In all theorems below, it is assumed that X1, X2 is a sample of size two from a distribution F, assumed to be absolutely continuous with F(0) = 0, with density function f(x), and with Laplace transform ζ(t) = E(e^{−tX1}). We begin by recalling a useful lemma which will be used in two of the theorems.

Lemma 1. If a function g : [0, ∞) → (−∞, ∞) has a right derivative at 0, denoted by g′(0), and satisfies g(t) = 2g(t/2) for every t ≥ 0, then g(t) = t g′(0) for every t ≥ 0.

Proof. For any t > 0, since g(t) = 2g(t/2), it follows by induction that g(t) = 2^k g(t/2^k) for all k = 1, 2, .... (Note also that g(0) = 2g(0) forces g(0) = 0.) Consequently,

g(t) = lim_{k→∞} 2^k g(t/2^k) = lim_{k→∞} t · [g(t/2^k)/(t/2^k)] = t g′(0).  □
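The limiting step of the proof uses only the existence of the right derivative: for any g with g(0) = 0, one has 2^k g(t/2^k) → t g′(0) as k → ∞. A small numeric illustration with the arbitrary choice g = sin (so g′(0) = 1):

```python
import math

g = math.sin   # any choice with g(0) = 0; here g'(0) = 1
t = 1.7
for k in (5, 15, 25):
    print(k, 2**k * g(t / 2**k))   # approaches t * g'(0) = 1.7
```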

Rather than present ten separate theorems, one for each of the candidate characterization conditions (1)–(10), we will group those which use the same regularity condition.
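The data-splitting test of Remark 1 is straightforward to implement. In the sketch below, the subset sizes, seed, comparison grid, and acceptance threshold are all illustrative choices, and the simulated exponential data stand in for an observed sample; the empirical distribution functions of the S's and T's are compared on a grid.

```python
import random

rng = random.Random(7)
# Hypothetical data set; in practice these would be the observed X's.
data = [rng.expovariate(1.0) for _ in range(40_000)]

rng.shuffle(data)
q = len(data) // 4
U, V, W, Z = (data[i * q:(i + 1) * q] for i in range(4))

S = [max(u, v) for u, v in zip(U, V)]   # S_i = max{U_i, V_i}
T = [w + z / 2 for w, z in zip(W, Z)]   # T_i = W_i + (1/2) Z_i

def ecdf(sample, x):
    """Empirical distribution function of `sample` evaluated at x."""
    return sum(v <= x for v in sample) / len(sample)

grid = [0.1 * k for k in range(1, 50)]
d = max(abs(ecdf(S, x) - ecdf(T, x)) for x in grid)
print(d)  # small when the X's are exponential
```

In practice one would replace the grid comparison with any standard two-sample test (e.g., Kolmogorov–Smirnov) and reject the exponential model when the distance is large.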


Theorem 1. If f has derivatives of all orders in a neighborhood of 0, and X1 + (1/2)X2 =d max{X1, X2}, then X1 ∼ exp(λ) for some λ > 0.

Proof. Equating the density of X1 + (1/2)X2 (a convolution) with that of max{X1, X2} and dividing by 2, the hypothesis gives

∫_0^x [f(x − y)f(2y) − f(x)f(y)] dy = 0 for all x ≥ 0.

Differentiating this expression three times with respect to x, using Leibniz's rule, yields

4f(0)f^(2)(2x) − 3f(x)f^(2)(x) − 3[f^(1)(x)]² + 2f^(1)(0)f^(1)(2x) + f^(2)(0)f(2x) − f^(2)(x)f(x) + ∫_0^x [f^(3)(x − y)f(2y) − f^(3)(x)f(y)] dy = 0,  (11)

where f^(j)(x) = d^j f(x)/dx^j, j = 1, 2, ....

Setting x = 0 in (11) gives f(0)f^(2)(0) − [f^(1)(0)]² = 0. Thus

f^(2)(0) = [f^(1)(0)]²/f(0) = (f^(1)(0)/f(0)) f^(1)(0).

If we differentiate four times and set x = 0, we similarly obtain 4f(0)f^(3)(0) − 4f^(1)(0)f^(2)(0) = 0, so that

f^(3)(0) = (f^(1)(0)/f(0))² f^(1)(0).

Further differentiation and an induction argument yield f^(k)(0) = (f^(1)(0)/f(0))^{k−1} f^(1)(0) for k = 4, 5, .... It follows that

f(x) = Σ_{k=0}^∞ f^(k)(0) x^k/k! = f(0) + Σ_{k=1}^∞ (f^(1)(0)/f(0))^{k−1} f^(1)(0) x^k/k! = f(0) exp[x f^(1)(0)/f(0)].

For this to integrate to one, we must have f^(1)(0)/f(0) < 0. Writing f^(1)(0)/f(0) = −λ with λ > 0, integration to one forces f(0) = λ, so the density becomes f(x) = λe^{−λx} I(x > 0), i.e., X1 ∼ exp(λ).  □
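The integral equation at the start of the proof holds identically for f(x) = λe^{−λx}: in that case f(x − y)f(2y) = λ²e^{−λx}e^{−λy} = f(x)f(y) for 0 < y < x, so the integrand vanishes pointwise. A numeric spot check (the rate λ = 2.3 is an arbitrary choice):

```python
import math

lam = 2.3  # arbitrary intensity parameter
f = lambda x: lam * math.exp(-lam * x)

for x in (0.5, 1.0, 3.0):
    for j in range(1, 20):
        y = x * j / 20          # interior points 0 < y < x
        assert abs(f(x - y) * f(2 * y) - f(x) * f(y)) < 1e-12
print("integrand vanishes on the grid")
```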

Theorem 2. Assume that E(X1) is finite. Then:

(i) if X1 + (1/2)X2 has density 2[f(x) − f(2x)], x > 0, then X1 ∼ exp(λ) for some λ > 0; and
(ii) if X1 + (1/2)X2 has density 2[F̄(x) − F̄(2x)], x > 0, then X1 ∼ exp(1).

Proof. For (i), by hypothesis we have, for t ≥ 0,

ζ(t)ζ(t/2) = E(e^{−t(X1 + (1/2)X2)}) = ∫_0^∞ e^{−tx} 2[f(x) − f(2x)] dx = 2ζ(t) − ζ(t/2).

From this, defining β(t) = [1/ζ(t)] − 1, we find that β(t) = 2β(t/2). Since E(X1) is finite, ζ(t) has a finite derivative at 0 and so does β(t). Denote β′(0) by 1/λ. It follows from Lemma 1 that β(t) = t/λ for t ≥ 0. Thus ζ(t) = (1 + t/λ)^{−1}, i.e., X1 ∼ exp(λ).

For (ii), using integration by parts it may be verified that ξ(t) = ∫_0^∞ e^{−tx} F̄(x) dx = (1/t)(1 − ζ(t)). By hypothesis we have, for t ≥ 0,

ζ(t)ζ(t/2) = E(e^{−t(X1 + (1/2)X2)}) = ∫_0^∞ e^{−tx} 2[F̄(x) − F̄(2x)] dx = 2ξ(t) − ξ(t/2) = (2/t)[ζ(t/2) − ζ(t)].

From this, defining γ(t) = (1 − ζ(t))/(t ζ(t)), the equation becomes 2γ(t) − γ(t/2) = 1, i.e., γ(t) − 1 = [γ(t/2) − 1]/2. Since E(X1) is finite, γ(t) → E(X1) as t → 0+, and iterating gives γ(t) − 1 = [γ(t/2^k) − 1]/2^k → 0 as k → ∞. Thus γ(t) ≡ 1, so ζ(t) = (1 + t)^{−1}, and we may conclude that X1 ∼ exp(1).  □

Theorem 3. Assume that E(e^{tX1}) is finite for all t in a neighborhood of 0. Then:


(i) if X1 + (1/2)X2 has density 2[f(x) − F̄(2x)], x > 0, then X1 ∼ exp(1); and
(ii) if X1 + (1/2)X2 has density 2[F̄(x) − f(2x)], x > 0, then X1 ∼ exp(1).

Proof. As in Theorem 2, ξ(t) = ∫_0^∞ e^{−tx} F̄(x) dx = (1/t)(1 − ζ(t)). For (i), by hypothesis we then have, for t ≥ 0,

ζ(t)ζ(t/2) = E(e^{−t(X1 + (1/2)X2)}) = ∫_0^∞ e^{−tx} 2[f(x) − F̄(2x)] dx = 2ζ(t) − ξ(t/2) = 2ζ(t) − (2/t)[1 − ζ(t/2)].

If we define β(t) = [1/ζ(t)] − 1, then the equation ζ(t)ζ(t/2) = 2ζ(t) − (2/t)[1 − ζ(t/2)] is equivalent to

C(t) = 2β(t)β(t/2) − 2tβ(t/2) + 2β(t/2) − t = 0.  (12)

By hypothesis, X1 has a moment generating function, and consequently ζ(t) and β(t) have power series expansions in a neighborhood of 0. Thus we can write β(t) = Σ_{k=0}^∞ b_k t^k; also write C(t) = Σ_{k=0}^∞ c_k t^k. Expanding C(t) in (12) in powers of t, all coefficients must equal 0. Note that b0 = β(0) = [1/ζ(0)] − 1 = 0. Using b0 = 0, the coefficient of t reduces to c1 = b1 − 1 = 0, which implies b1 = 1; the coefficient of t² reduces to c2 = b1² − b1 + b2/2 = b2/2 = 0, which implies b2 = 0. Since, for k > 2,

c_k = Σ_{j=0}^k (b_j/2^{j−1}) b_{k−j} − (b_{k−1}/2^{k−2}) + (b_k/2^{k−1}),

an inductive argument may be used to conclude that b_k = 0 for every k > 2. It follows that β(t) = t and so ζ(t) = (1 + t)^{−1}, i.e., X1 ∼ exp(1).

For (ii), by hypothesis we have, for t ≥ 0,

ζ(t)ζ(t/2) = E(e^{−t(X1 + (1/2)X2)}) = ∫_0^∞ e^{−tx} 2[F̄(x) − f(2x)] dx = (2/t)(1 − ζ(t)) − ζ(t/2).

As before, define β(t) = [1/ζ(t)] − 1; then this equation is equivalent to

D(t) = 2β(t)β(t/2) + 2β(t) − tβ(t) − 2t = 0.  (13)

Because ζ(t) and β(t) have power series expansions in a neighborhood of 0, we can write β(t) = Σ_{k=0}^∞ b_k t^k; also write D(t) = Σ_{k=0}^∞ d_k t^k. Expanding D(t) in (13) in powers of t, all coefficients must equal 0. Again b0 = β(0) = 0, so the coefficient of t gives d1 = 2b1 − 2 = 0, which implies b1 = 1, and the coefficient of t² gives d2 = b1² + 2b2 − b1 = 2b2 = 0, which implies b2 = 0. Since, for k > 2,

d_k = Σ_{j=0}^k (b_j/2^{j−1}) b_{k−j} + 2b_k − b_{k−1},

an inductive argument may be used to conclude that b_k = 0 for every k > 2. It follows that β(t) = t and so ζ(t) = (1 + t)^{−1}, i.e., X1 ∼ exp(1).  □

Theorem 4. If max{X1, X2} has density 2[f(x) − f(2x)], x > 0, then X1 ∼ exp(λ) for some λ > 0.

Proof. By hypothesis, 2F(x)f(x) = 2[f(x) − f(2x)] holds for every x ≥ 0. Divide both sides of the equation by 2 and then integrate both sides with respect to x from 0 to y. This yields F²(y) = 2F(y) − F(2y) for y ≥ 0, or equivalently F̄²(y) = F̄(2y). Thus

log F̄(y) = 2 log F̄(y/2) = 2^k log F̄(y/2^k) = y · [log F̄(y/2^k)]/(y/2^k) for k = 1, 2, ....

However, log F̄ is right differentiable at 0; denote its right derivative at 0 by −λ, where λ > 0. It follows that

log F̄(y) = lim_{k→∞} y · [log F̄(y/2^k)]/(y/2^k) = −λy,

and consequently F(y) = 1 − e^{−λy} for y ≥ 0, i.e., X1 ∼ exp(λ).  □


Theorem 5. Assume that the distribution function F of X1 admits a power series representation of the form F(x) = Σ_{j=0}^∞ c_j x^j for x ∈ (0, ∞).

(i) If max{X1, X2} has density 2[F̄(x) − F̄(2x)], x > 0, then X1 ∼ exp(1).
(ii) If max{X1, X2} has density 2[f(x) − F̄(2x)], x > 0, then X1 ∼ exp(1).
(iii) If max{X1, X2} has density 2[F̄(x) − f(2x)], x > 0, then X1 ∼ exp(1).

Proof. For (i), by hypothesis we have 2F(x)f(x) = 2[F̄(x) − F̄(2x)]. Since F̄(x) − F̄(2x) = F(2x) − F(x), this is equivalent to

D(x) = Σ_{k=0}^∞ d_k x^k = F(x)f(x) + F(x) − F(2x) = 0.  (14)

Since F(x) = Σ_{j=0}^∞ c_j x^j, we have f(x) = Σ_{j=0}^∞ (j + 1)c_{j+1} x^j. Since all the coefficients of powers of x are equal to 0 in the power series expansion of D(x) in (14), it follows that for each k = 1, 2, ... it must be the case that

0 = d_k = Σ_{i=0}^k c_i (k − i + 1) c_{k−i+1} + c_k − 2^k c_k.  (15)

Since F(0) = 0, it follows that c0 = 0. From (15) with k = 1, it then follows that c1² = c1, so that c1 = 0 or 1. If c1 = 0, it is readily determined using (15) that c_k = 0 for all k, in which case F(x) would not be a valid distribution function. So we must have c1 = 1. From (15) with k = 2, the terms in c2 cancel, which leaves c2 undetermined. For reasons which will soon become apparent, define c2 = −δ/2, where δ ∈ (−∞, ∞). From (15) with k = 3, we have c3 = (2/3)c2², i.e., c3 = δ²/3!. From (15) with k = 4, c4 = (1/3)c2³, i.e., c4 = −δ³/4!. An inductive argument, using (15), then yields c_k = (−δ)^{k−1}/k! for k = 2, 3, .... It then follows that

F(x) = x + Σ_{j=2}^∞ (−δ)^{j−1} x^j/j! = (1/δ)(1 − e^{−δx}).  (16)

However, the expression in (16) is a valid distribution function on (0, ∞) only when δ = 1: for δ ≤ 0 it is unbounded (with F(x) = x as the limiting case δ = 0), while for positive δ ≠ 1 its limit as x → ∞ is 1/δ ≠ 1. Thus δ = 1 and F(x) = 1 − e^{−x}, i.e., X1 ∼ exp(1).

Remark 2. An alternative argument involves defining h(x) = ∫_x^∞ F̄(y) dy, which satisfies the equation h(2x) − 2h(x) = h′(x)[2 + h′(x)].

For (ii), by hypothesis we have 2F(x)f(x) = 2[f(x) − F̄(2x)]. Equivalently,

G(x) = Σ_{k=0}^∞ g_k x^k = F(x)f(x) − f(x) + 1 − F(2x) = 0.  (17)

Since F(x) = Σ_{j=0}^∞ c_j x^j, we have f(x) = Σ_{j=0}^∞ (j + 1)c_{j+1} x^j. Since all the coefficients of powers of x are equal to 0 in the power series expansion of G(x) in (17), it follows that for each k = 1, 2, ... it must be the case that

0 = g_k = Σ_{i=0}^k c_i (k − i + 1) c_{k−i+1} − (k + 1)c_{k+1} − 2^k c_k.  (18)

Since F(0) = 0, it follows that c0 = 0. Since g0 = 0, it follows that c1 = 1. From (18) with k = 1, we have 2c2 = −c1, i.e., c2 = −1/2. From (18) with k = 2, we have −c2 = 3c3, whence c3 = 1/6 = 1/3!. From (18) with k = 3, we have 2c2² − 4c3 = 4c4, whence c4 = −1/4!. An inductive argument, using (18), then yields c_k = (−1)^{k−1}/k! for k = 2, 3, .... Thus F(x) = 1 − e^{−x}, i.e., X1 ∼ exp(1).

Remark 3. An alternative argument involves defining h(x) = ∫_x^∞ F̄(y) dy, which satisfies the equation h(2x) = [h′(x)]².

For (iii), a similar argument as in (ii) applies.  □
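The recursion (18) pins down every coefficient once c0 = 0 and c1 = 1 are fixed, and exact rational iteration reproduces the Maclaurin coefficients of 1 − e^{−x}, in line with the induction in the proof. A sketch (twelve coefficients is an arbitrary stopping point):

```python
from fractions import Fraction
from math import factorial

c = [Fraction(0), Fraction(1)]            # c_0 = 0, c_1 = 1
for k in range(1, 12):
    # Solve (18) for c_{k+1}; the i = 0 term of the sum drops out because c_0 = 0.
    s = sum(c[i] * (k - i + 1) * c[k - i + 1] for i in range(1, k + 1))
    c.append((s - 2**k * c[k]) / (k + 1))

# Compare with the exponential coefficients c_k = (-1)^{k-1}/k!.
for k in range(2, 13):
    assert c[k] == Fraction((-1)**(k - 1), factorial(k))
print([str(x) for x in c[:6]])   # 0, 1, -1/2, 1/6, -1/24, 1/120
```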


Theorem 6. If f is right continuous at 0, and if f(x) − f(2x) = F̄(x) − F̄(2x), x > 0, then X1 ∼ exp(1).

Proof. Since f(x) − f(2x) = F̄(x) − F̄(2x), it follows that f(x) − F̄(x) = f(2x) − F̄(2x) for all x ≥ 0. Define Q(x) = f(x) − F̄(x). Then for any x ≥ 0 and any k = 0, 1, 2, ..., we have Q(x) = Q(x/2) = ... = Q(x/2^k). Consequently,

Q(x) = lim_{k→∞} Q(x/2^k) = Q(0+) = f(0+) − F̄(0+) = f(0+) − 1.

On the other hand, since F̄(x) → 0 as x → ∞,

lim_{x→∞} f(x) = lim_{x→∞} [f(x) − F̄(x)] = lim_{x→∞} Q(x) = f(0+) − 1.

But since f is integrable and this limit exists, we must have lim_{x→∞} f(x) = 0. Hence f(0+) = 1, Q ≡ 0, and therefore f(x) = F̄(x) for every x ≥ 0, i.e., X1 ∼ exp(1).  □

3. Samples of size k > 2

Parallel conjectures are possible regarding characterizations based on properties of exponential samples of size k, where k > 2. For example, with k = 3 and X1, X2, X3 i.i.d. standard exponential variables, we have:

X3:3 =d X1 + (1/2) X2 + (1/3) X3,

X3:3 =d X2:2 + (1/3) X3,

X3:3 has density 3[f(x) − 2f(2x) + f(3x)],

and

X1 + (1/2) X2 + (1/3) X3 has density 3[f(x) − 2f(2x) + f(3x)];

where X2:2 = max{X1, X2} and X3:3 = max{X1, X2, X3}. It is conjectured that such properties, and related parallel properties based on samples of size k > 3, also characterize the exponential distribution. As of now, these issues are still being studied.

Remark 4. If we assume, for i.i.d. random variables X1, X2, ... with E|X1| < ∞, that for every n it is the case that X_{n:n} =d Σ_{i=1}^n Xi/i, then necessarily the Xi's have a common exponential distribution. In fact, it is enough to assume that for every n, E(X_{n:n}) = E(Σ_{i=1}^n Xi/i), since the sequence of expected maxima determines the distribution.
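For standard exponential samples, the expected-value condition of Remark 4 reads E(X_{n:n}) = Σ_{i=1}^n 1/i, the n-th harmonic number, which is also E(Σ_{i=1}^n Xi/i). A Monte Carlo spot check for n = 5 (the replication count and seed are arbitrary choices):

```python
import random

rng = random.Random(1)
n, reps = 5, 200_000
mean_max = sum(max(rng.expovariate(1) for _ in range(n))
               for _ in range(reps)) / reps
harmonic = sum(1 / i for i in range(1, n + 1))  # 1 + 1/2 + ... + 1/5
print(mean_max, harmonic)  # the two agree up to Monte Carlo error
```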

Acknowledgments

The authors wish to thank the anonymous referees for their constructive comments on the original version of the manuscript, which helped considerably to improve the presentation of this paper.

References

Arnold, B.C., Huang, J.S., 1995. Characterizations. In: Balakrishnan, N., Basu, A.P. (Eds.), The Exponential Distribution: Theory, Methods and Applications. Gordon and Breach, Amsterdam, pp. 185–203 (Chapter 12).

Johnson, N.L., Kotz, S., Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1, second ed. John Wiley and Sons, New York.