Journal of Statistical Planning and Inference 138 (2008) 964–970
www.elsevier.com/locate/jspi

Characterization based on convex conditional mean function

Ramesh C. Gupta^a,∗, S.N.U.A. Kirmani^b

^a Department of Mathematics and Statistics, University of Maine, Orono, ME 04469-5752, USA
^b Department of Mathematics, University of Northern Iowa, Cedar Falls, IA 50614, USA

Received 20 June 2005; received in revised form 25 May 2006; accepted 8 March 2007
Available online 3 July 2007

Abstract

This paper addresses the problem of characterizing a distribution by means of a convex conditional mean function. The characterization is proved, under mild conditions, by showing that a certain non-linear differential equation has a unique solution.
© 2007 Published by Elsevier B.V.

Keywords: Truncated moments; Non-linear differential equation; Uniqueness; Order statistics; Record values
1. Introduction

Suppose X is a random variable with absolutely continuous distribution function F having support [a, b] and finite mean. Then,

$$ E(X \mid X < x) = \frac{\int_a^x t\,dF(t)}{F(x)}, \qquad (1.1) $$

and

$$ E(X \mid X > x) = \frac{\int_x^b t\,dF(t)}{1 - F(x)}. \qquad (1.2) $$
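As a small numerical illustration (ours, not part of the original paper), the truncated means in (1.1) and (1.2) can be computed by quadrature; the standard exponential distribution and the function names d and m below are our arbitrary choices.

```python
# A minimal sketch (our illustration): the left- and right-truncated
# means (1.1)-(1.2) for a standard exponential distribution.
import numpy as np
from scipy import integrate, stats

dist = stats.expon()  # assumed example distribution, support [0, inf)

def d(x):
    """E(X | X < x): integral of t dF(t) over (0, x), divided by F(x)."""
    num, _ = integrate.quad(lambda t: t * dist.pdf(t), 0.0, x)
    return num / dist.cdf(x)

def m(x):
    """E(X | X > x): integral of t dF(t) over (x, inf), divided by 1 - F(x)."""
    num, _ = integrate.quad(lambda t: t * dist.pdf(t), x, np.inf)
    return num / dist.sf(x)

print(d(1.0), m(1.0))  # for the exponential law, m(x) = x + 1 (memoryless)
```

The memoryless property gives m(x) = x + 1 for the exponential law, which the quadrature reproduces; this makes the pair (d, m) easy to sanity-check.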
It is well known that, individually, d(x) = E(X|X < x) and m(x) = E(X|X > x) each determine the distribution uniquely. In addition to these functions, Ruiz and Navarro (1996) have shown that the function m(x, y) = E(X | x ≤ X ≤ y) also characterizes the distribution. In the reliability literature, the function μ(x) = E(X − x | X > x) is called the mean residual life function and the function δ(x) = E(x − X | X < x) is called the reverse mean residual life function. It is well known that μ(x) and δ(x) characterize the distribution; see Ruiz and Navarro (1996) and Gupta (1975). In the context of some characterizations based on the properties of certain regression functions associated with order statistics, record values and the original observation, the following question was raised: Does a convex linear combination of E(X|X < x) and E(X|X > x) characterize the distribution?

∗ Corresponding author. E-mail address: [email protected] (R.C. Gupta).
doi:10.1016/j.jspi.2007.03.059
To answer this question, Nagaraja and Nevzorov (1997) proved the following result:

Theorem 1.1. Let X and Y be two random variables with support S = [a, b], −∞ ≤ a < b ≤ ∞, and continuous cdfs F and G, respectively. Assume that E(X) exists and there is a point x0 such that, for some α, 0 < α < 1, the following conditions are satisfied:

$$ F(x_0) = G(x_0) = \alpha \qquad (1.3) $$

and

$$ E(X \mid X < x_0) = E(Y \mid Y < x_0). \qquad (1.4) $$

Then, F = G if and only if

$$ \alpha E(X \mid X < x) + (1-\alpha)E(X \mid X > x) = \alpha E(Y \mid Y < x) + (1-\alpha)E(Y \mid Y > x), \quad x \in (a, b). \qquad (1.5) $$
In the same paper, they conjectured that the characterization holds even without assumptions (1.3) and (1.4). In order to shed light on this conjecture, Wesolowski and Gupta (2001) observed that (1.3) holds if E(X) = E(Y); in fact, (1.3) and the equality of the two means are equivalent, see Remark 2 of Nagaraja and Nevzorov (1997). Wesolowski and Gupta (2001) remarked that if M(x) = αE(X|X < x) + (1 − α)E(X|X > x) has a "simple form", then, even with further restrictions on the family of distributions under consideration, condition (1.4) cannot be omitted if one is to pin down the distribution function F. Consequently, strictly speaking, Nagaraja and Nevzorov's conjecture is incorrect. On the other hand, some scale families of distributions are uniquely determined by an M of a "simple form", and in that sense the intuition behind the conjecture is correct.

More recently, Lillo (2004) claimed that, assuming the continuity of F and that its support is known, the conjecture is true without assumptions (1.3) and (1.4). Unfortunately, her proof is incorrect. This can be seen as follows. Assuming that there are two different distribution functions F and G corresponding to the same function M, and that F(x0) = α (note that such a point exists because of the continuity of F), Lillo (2004) obtains

$$ m_1(x) = \frac{M(x)\,(G(x) - F(x)) + (\alpha - G(x))\,m_2(x) - (1-\alpha)(\mu_1 - \mu_2)}{\alpha - F(x)} \quad \text{for all } a < x < b, \qquad (1.6) $$

where μ1 and μ2 are the two means, and m1(x) and m2(x) are the right truncated moments corresponding to the two distribution functions F and G. From the above relation, by taking the limit as x → x0, she erroneously concludes that G(x0) = α and μ1 = μ2, and hence that F = G. However, as also noticed by the Associate Editor, this amounts to claiming that, if the limit of g1(x) + g2(x) is 0 as x goes to x0, then the individual limits are 0. This, of course, is not true without additional conditions.

In the present paper, we further explore the validity of Nagaraja and Nevzorov's (1997) conjecture and show that, indeed, M(x) characterizes the distribution without assumptions (1.3) and (1.4). Our approach is entirely different from the ones presented earlier and essentially depends on showing the uniqueness of the solution of a non-linear differential equation. Note that, normally, solving such a differential equation explicitly is not feasible, except in very simple cases and under some stringent conditions. Some such cases have been considered by Wesolowski and Gupta (2001) and Akhundov et al. (2004). The characterization problem, in the case of Akhundov et al. (2004), arises in the context of characterizing distributions by some regression function of order statistics.

2. The characterization result

The main result of this paper, namely Theorem 2.5, exploits several results, including the following theorem, which gives a sufficient condition for the uniqueness of the solution of the initial value problem (IVP): y′ = f(x, y), y(x0) = y0, where f is a given function of two variables whose domain is a region D ⊂ R², (x0, y0) is a specified point in D, and y is the unknown function. By a solution of the IVP on an interval I ⊂ R, we mean a function φ(x) such that (i) φ is differentiable on I, (ii) the graph of φ lies in D, (iii) φ(x0) = y0, and (iv) φ′(x) = f(x, φ(x)) for all x ∈ I. The following theorem, when coupled with other results (including Lemma 2.2), will help in proving our characterization result.
Theorem 2.1. Let the function f be defined and continuous in a domain D ⊂ R², and let f satisfy a Lipschitz condition (with respect to y) in D, namely,

$$ |f(x, y_1) - f(x, y_2)| \le K|y_1 - y_2|, \qquad K > 0, \qquad (2.1) $$

for every pair of points (x, y1) and (x, y2) in D. Then the function y = φ(x) satisfying the IVP y′ = f(x, y), φ(x0) = y0, x ∈ I, is unique.
Proof. Suppose there exists another differentiable function ψ such that the point (x, ψ(x)) ∈ D and (d/dx)ψ(x) = f[x, ψ(x)] for x ∈ I, with ψ(x0) = y0. We shall show that ψ(x) = φ(x) for all x ∈ I. Since both φ and ψ satisfy the differential equation y′ = f(x, y), we have

$$ \varphi(x) = y_0 + \int_{x_0}^{x} f[t, \varphi(t)]\,dt \qquad (2.2) $$

and

$$ \psi(x) = y_0 + \int_{x_0}^{x} f[t, \psi(t)]\,dt \qquad (2.3) $$
for all x ∈ I. The above equations give

$$ |\varphi(x) - \psi(x)| \le \int_{x_0}^{x} |f(t, \varphi(t)) - f(t, \psi(t))|\,dt. \qquad (2.4) $$

Since the functions φ and ψ are continuous on I, so is the function g, where g(x) = φ(x) − ψ(x). Thus, for some constant B(m) > 0, we have

$$ |\varphi(x) - \psi(x)| \le B(m) \quad \text{for all } x \in I_m, \qquad (2.5) $$
where Im = [−m, m]. Now we utilize Eq. (2.1) with y1 = φ(x) and y2 = ψ(x) to give

$$ |f(x, \varphi(x)) - f(x, \psi(x))| \le K|\varphi(x) - \psi(x)|. \qquad (2.6) $$

Thus, we have

$$ |\varphi(x) - \psi(x)| \le K \int_{x_0}^{x} |\varphi(t) - \psi(t)|\,dt \le K B(m) \int_{x_0}^{x} dt = K B(m)(x - x_0) \qquad (2.7) $$
for all x ∈ Im. We shall now prove by mathematical induction that, for n = 1, 2, 3, …,

$$ |\varphi(x) - \psi(x)| \le B(m) K^n \frac{|x - x_0|^n}{n!} \qquad (2.8) $$

for all x ∈ Im. Assume that (2.8) is true for n = k, that is,

$$ |\varphi(x) - \psi(x)| \le B(m) K^k \frac{|x - x_0|^k}{k!} \quad \text{for all } x \in I_m. $$

Then,

$$ |\varphi(x) - \psi(x)| \le \int_{x_0}^{x} |f(t, \varphi(t)) - f(t, \psi(t))|\,dt \le K \int_{x_0}^{x} |\varphi(t) - \psi(t)|\,dt \qquad (2.9) $$

$$ \le \frac{B(m) K^k\, K}{k!} \int_{x_0}^{x} (t - x_0)^k\,dt = B(m) K^{k+1} \frac{(x - x_0)^{k+1}}{(k+1)!} \qquad (2.10) $$

for all x ∈ Im.
Thus, the inductive proof of inequality (2.8) is complete. Letting x − x0 = h, inequality (2.8) can be written as

$$ |\varphi(x) - \psi(x)| \le B(m) \frac{(Kh)^n}{n!}. $$

The term (Kh)^n/n! is the general term in the expansion of e^{Kh}, which converges for all finite values of Kh. A necessary condition for a series to converge is that its general term (i.e., its nth term) approaches zero as n → ∞. Hence B(m)(Kh)^n/n! → 0 as n → ∞. Thus,

$$ 0 \le |\varphi(x) - \psi(x)| \le \lim_{n \to \infty} B(m) \frac{(Kh)^n}{n!} = 0. $$

Therefore, |φ(x) − ψ(x)| = 0 for all x ∈ Im. Since m is arbitrary, φ(x) = ψ(x) for all x ∈ I, and the proof is complete. □
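To make the factorial-rate bound (2.8) concrete, the following sketch (our illustration; the right-hand side f(x, y) = −2y, the interval, and the starting guesses are all arbitrary choices of ours) applies the Picard map behind (2.2) and (2.3) to two different initial guesses and prints the shrinking gap between them.

```python
# A minimal sketch (our illustration) of the contraction behind Theorem 2.1:
# the Picard map phi |-> y0 + int_{x0}^x f(t, phi(t)) dt, cf. (2.2)-(2.3),
# drives any two candidate solutions of y' = f(x, y), y(0) = 1 together.
import numpy as np

f = lambda x, y: -2.0 * y            # Lipschitz in y with constant K = 2
xs = np.linspace(0.0, 1.0, 201)      # plays the role of the interval I_m
y0 = 1.0

def picard(phi):
    """One Picard iteration, using cumulative trapezoidal quadrature."""
    vals = f(xs, phi)
    steps = (vals[1:] + vals[:-1]) / 2.0 * np.diff(xs)
    return y0 + np.concatenate(([0.0], np.cumsum(steps)))

phi = np.full_like(xs, y0)           # one starting guess
psi = np.cos(3.0 * xs)               # a very different starting guess
for n in range(1, 13):
    phi, psi = picard(phi), picard(psi)
    # the gap obeys a bound of the form B(m) K^n |x - x0|^n / n!, cf. (2.8)
    print(n, float(np.max(np.abs(phi - psi))))
```

The printed maxima decay roughly like 2^n/n!, mirroring (2.8); both sequences converge to the unique solution e^{−2x}.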
For any function f(x, y) of two variables defined in D ⊂ R², we now present a sufficient condition which guarantees that the Lipschitz condition is satisfied by f(x, y) in D.

Lemma 2.2. Suppose that the function f is continuous in a convex region D ⊂ R². Suppose further that ∂f/∂y exists and is continuous in D. Then the function f satisfies a Lipschitz condition in D.

Proof. Let R ⊂ D be a closed rectangle with center at (x0, y0) and sides parallel to the x, y axes. Let D0 be the interior of R; thus, D0 is an open rectangle consisting of R minus its boundary. Clearly, D0 is a convex domain. Since the function ∂f/∂y is continuous in D, it is bounded on the compact set R ⊂ D and hence also bounded on D0 ⊂ R. We shall now show that the function f satisfies a Lipschitz condition with respect to y in D0. Since ∂f/∂y is bounded on D0, we have |∂f/∂y| ≤ K on D0 for some K > 0. Then, by the law of the mean,

$$ f(x, y_1) - f(x, y_2) = (y_1 - y_2)\, \frac{\partial}{\partial y} f(x, Y) $$

for some Y between y1 and y2. Thus, we have

$$ |f(x, y_1) - f(x, y_2)| = |y_1 - y_2| \left| \frac{\partial}{\partial y} f(x, Y) \right| \le K |y_1 - y_2|. $$

Thus, f satisfies a Lipschitz condition on D0 and hence on D. □
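As a quick numerical companion to Lemma 2.2 (our example, not the paper's): for f(x, y) = sin(x)·y² on the convex strip |y| ≤ 2 we have |∂f/∂y| = |2y sin x| ≤ 4, so sampled difference quotients should never exceed K = 4.

```python
# Our illustration of Lemma 2.2: a bounded, continuous df/dy on a convex
# region forces a Lipschitz condition in y. Here f(x, y) = sin(x) * y**2
# on |y| <= 2, where |df/dy| = |2 * y * sin(x)| <= 4 = K.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x, y: np.sin(x) * y ** 2

x = rng.uniform(-5.0, 5.0, 100_000)
y1 = rng.uniform(-2.0, 2.0, 100_000)
y2 = rng.uniform(-2.0, 2.0, 100_000)
quotients = np.abs(f(x, y1) - f(x, y2)) / np.abs(y1 - y2)
print(quotients.max())  # empirically bounded by K = 4
```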
We now proceed to prove our characterization result. For that, we define d(x) = E(X|X < x), m(x) = E(X|X > x), and M(x) = αd(x) + (1 − α)m(x). With these definitions, the following identities hold:

$$ \text{(i)}\quad d(x) = \frac{E(X) - (1 - F(x))\,m(x)}{F(x)}, \qquad (2.11) $$

$$ \text{(ii)}\quad m(x) = \frac{E(X) - F(x)\,d(x)}{1 - F(x)}, \qquad (2.12) $$

$$ \text{(iii)}\quad M(x) = \frac{\alpha E(X) + m(x)\,(F(x) - \alpha)}{F(x)}, \qquad (2.13) $$

$$ \text{(iv)}\quad M(x) = \frac{\alpha E(X) - (\alpha - F(x))\,m(x)}{F(x)}. \qquad (2.14) $$
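Each identity follows from the total-expectation decomposition E(X) = F(x)d(x) + (1 − F(x))m(x). The following sketch (ours; the exponential distribution and the values α = 0.3, x = 1 are arbitrary test choices) checks (2.11) and (2.13) numerically.

```python
# Our numerical check of identities (2.11) and (2.13).
import numpy as np
from scipy import integrate, stats

dist = stats.expon()  # assumed test distribution
d = lambda x: integrate.quad(lambda t: t * dist.pdf(t), 0, x)[0] / dist.cdf(x)
m = lambda x: integrate.quad(lambda t: t * dist.pdf(t), x, np.inf)[0] / dist.sf(x)

alpha, x, EX = 0.3, 1.0, dist.mean()
F, Fbar = dist.cdf(x), dist.sf(x)
M = alpha * d(x) + (1 - alpha) * m(x)             # definition of M(x)

print(d(x) - (EX - Fbar * m(x)) / F)              # (2.11): should be ~0
print(M - (alpha * EX + m(x) * (F - alpha)) / F)  # (2.13): should be ~0
```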
The function M(x) is called the convex mean residual life function. We now state the following useful result.

Theorem 2.3. Suppose that there are two absolutely continuous distribution functions F and G, with corresponding random variables X and Y, such that M1(x) = M2(x), where M1(x) = αE(X|X < x) + (1 − α)E(X|X > x) and M2(x) = αE(Y|Y < x) + (1 − α)E(Y|Y > x). Then, there exists a point x0 such that F(x0) = G(x0) = α if and only if E(X) = E(Y).

Proof. A proof of the above result is available in Nagaraja and Nevzorov (1997). □
The following result will be useful in proving our main characterization result.

Theorem 2.4. Let X be an absolutely continuous random variable with distribution function F. Then, with the notation described earlier,

$$ F'(x) = \frac{-F(x)\,\bar F(x)\,(\alpha - F(x))\,M'(x)}{(\alpha - F(x))(1 - 2F(x))\,M(x) + F(x)\,\bar F(x)\,M(x) - c\alpha(1-\alpha) - x(\alpha - F(x))^2}, \qquad (2.15) $$

where c = E(X) < ∞ and F̄(x) = 1 − F(x).

Proof. For the proof, we refer to Wesolowski and Gupta (2001). □

We are now ready to present our main characterization result.

Theorem 2.5. Let F and G be two absolutely continuous distribution functions having the same mean, with corresponding random variables X and Y, such that M1(x) = M2(x) for all x and some α ∈ [0, 1], where M1(x) = αE(X|X < x) + (1 − α)E(X|X > x) and M2(x) = αE(Y|Y < x) + (1 − α)E(Y|Y > x). Then, F = G.

Proof. Since F is continuous, there exists a point x0 such that F(x0) = α. Because the two means are equal, Theorem 2.3 gives F(x0) = G(x0) = α. By Theorem 2.4, we have

$$ F'(x) = \frac{-F(x)\,\bar F(x)\,(\alpha - F(x))\,M_1'(x)}{(\alpha - F(x))(1 - 2F(x))\,M_1(x) + F(x)\,\bar F(x)\,M_1(x) - c\alpha(1-\alpha) - x(\alpha - F(x))^2}. $$

Similarly,

$$ G'(x) = \frac{-G(x)\,\bar G(x)\,(\alpha - G(x))\,M_2'(x)}{(\alpha - G(x))(1 - 2G(x))\,M_2(x) + G(x)\,\bar G(x)\,M_2(x) - c\alpha(1-\alpha) - x(\alpha - G(x))^2}. $$
Suppose now that M1(x) = M2(x) = M(x), say. Then, for all x, F′(x) = η(x, F(x)) and G′(x) = η(x, G(x)), where

$$ \eta(x, y) = \frac{-y(1-y)(\alpha - y)\,M'(x)}{(\alpha - y)(1 - 2y)\,M(x) + y(1-y)\,M(x) - c\alpha(1-\alpha) - x(\alpha - y)^2}. $$
Thus F and G solve the same initial value problem y′ = η(x, y), y(x0) = α. It follows from Theorem 2.1, coupled with Lemma 2.2, that F = G. This proves our main characterization result. □
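Although (2.15) can rarely be solved in closed form (cf. Remark 2.1 below), one can verify numerically that a known F satisfies it. The sketch below (our check; the exponential distribution, α = 0.3, and the finite-difference step are arbitrary choices) compares the density F′(x) with the right-hand side of (2.15).

```python
# Our numerical verification that a known F satisfies Eq. (2.15).
import numpy as np
from scipy import integrate, stats

dist = stats.expon()                       # assumed test distribution
alpha, eps, c = 0.3, 1e-5, dist.mean()     # c = E(X)

d = lambda x: integrate.quad(lambda t: t * dist.pdf(t), 0, x)[0] / dist.cdf(x)
m = lambda x: integrate.quad(lambda t: t * dist.pdf(t), x, np.inf)[0] / dist.sf(x)
M = lambda x: alpha * d(x) + (1 - alpha) * m(x)

for x in (0.5, 1.0, 2.0):
    F, Fb = dist.cdf(x), dist.sf(x)
    Mp = (M(x + eps) - M(x - eps)) / (2 * eps)   # central difference for M'(x)
    rhs = -F * Fb * (alpha - F) * Mp / (
        (alpha - F) * (1 - 2 * F) * M(x) + F * Fb * M(x)
        - c * alpha * (1 - alpha) - x * (alpha - F) ** 2
    )
    print(x, dist.pdf(x), rhs)  # density and RHS of (2.15) should agree
```

Note that at x0, where F(x0) = α, the decomposition E(X) = F(x)d(x) + F̄(x)m(x) gives M(x0) = c, so both the numerator and the denominator of (2.15) vanish there; this degeneracy is one reason the uniqueness argument requires the care taken above.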
Remark 2.1. Normally, solving the differential equation (2.15) is not feasible except in very simple cases and under some stringent conditions. Wesolowski and Gupta (2001) attempted to solve it when E(X) = 0 and M(x) = Ax. Even in this case, they had to assume that α = 1/2, since this was the only value of α for which they were able to complete a thorough analysis of the family of distributions characterized by M(x), including explicit forms of F for some special values of A.

Remark 2.2. Akhundov et al. (2004) considered the case M(x) = Ax and obtained the distribution function F in terms of its quantile function and some constants. Our characterization result shows that Akhundov et al.'s result can be extended to more general functions M(x), under some mild conditions. All the papers cited earlier mention various kinds of characterization results based on some regression functions of order statistics and record values. Thus, our result solves several characterization problems which arise in various fields.

Finally, we present another characterization result based on Theorem 2.5. It is well known that the function m_X(t) = E(X|X > t) characterizes the distribution of a continuously distributed random variable X. We show below that if the distribution of X is symmetric about zero (i.e., X and −X are identically distributed), then the function Δ_X(t) = m_X(t) − m_X(−t) also characterizes the distribution. Although important in its own right, this characterization is, in fact, a consequence of Theorem 2.5.

Theorem 2.6. Let X and Y have respective continuous distribution functions F and G with finite means and common support [−θ, θ], where θ ∈ (0, ∞]. Suppose that F(t) = 1 − F(−t) and G(t) = 1 − G(−t) for all t ∈ [−θ, θ]. Then,

$$ \Delta_X(t) = \Delta_Y(t) \quad \text{for all } t \in [0, \theta] \qquad (2.16) $$

if, and only if, F = G.

Proof. In view of the symmetry of the distribution of X,

E(X|X < t) = E(−X | −X < t) = −E(X | X > −t) = −m_X(−t) for all t ∈ [−θ, θ].

Hence,

E(X|X > t) + E(X|X < t) = m_X(t) − m_X(−t) = Δ_X(t) for all t ∈ [−θ, θ].

Suppose now that (2.16) holds. Since Δ_X(−t) = −Δ_X(t), we get

E(X|X > t) + E(X|X < t) = E(Y|Y > t) + E(Y|Y < t) for all t ∈ [−θ, θ].

Further, by symmetry about 0, we have E(X) = E(Y). Applying Theorem 2.5 with α = 1/2, we conclude that F = G. □
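A direct numerical check of the symmetry identity driving this proof (our illustration; the standard normal is an arbitrary symmetric choice): E(X|X < t) = −m_X(−t), so Δ_X(t) coincides with E(X|X > t) + E(X|X < t).

```python
# Our check of the symmetry identity in the proof of Theorem 2.6 for
# X ~ N(0, 1): E(X | X < t) = -m_X(-t), hence
# Delta_X(t) = m_X(t) - m_X(-t) = E(X | X > t) + E(X | X < t).
import numpy as np
from scipy import integrate, stats

dist = stats.norm()  # assumed symmetric example distribution
m = lambda t: integrate.quad(lambda u: u * dist.pdf(u), t, np.inf)[0] / dist.sf(t)
d = lambda t: integrate.quad(lambda u: u * dist.pdf(u), -np.inf, t)[0] / dist.cdf(t)

for t in (0.5, 1.5):
    print(d(t) + m(-t))                    # identity: should be ~0
    print((m(t) - m(-t)) - (m(t) + d(t)))  # Delta form: should be ~0
```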
Acknowledgments

The authors are thankful to two referees, the Associate Editor and the Executive Editor for some useful comments which enhanced the presentation.

References

Akhundov, I.S., Balakrishnan, N., Nevzorov, V.B., 2004. New characterizations by properties of midrange and related statistics. Commun. Statist. 33 (12), 3133–3143.
Gupta, R.C., 1975. On the characterization of distributions by conditional expectations. Commun. Statist. 4, 99–103.
Lillo, R.E., 2004. On the characterizing property of the convex conditional mean function. Statist. Probab. Lett. 66, 19–24.
Nagaraja, H.N., Nevzorov, V.B., 1997. On characterization based on record values and order statistics. J. Statist. Plann. Inference 63, 271–284.
Ruiz, J.M., Navarro, J., 1996. Characterizations based on conditional expectations of the doubled truncated distribution. Ann. Inst. Statist. Math. 48 (3), 563–572.
Wesolowski, J., Gupta, A.K., 2001. Linearity of convex mean residual life. J. Statist. Plann. Inference 99, 183–191.