Some results on generalized past entropy

Asok K. Nanda, Prasanta Paul

Department of Mathematics, Indian Institute of Technology, Kharagpur 721 302, India

Received 3 November 2003; accepted 12 January 2005; available online 22 March 2005

Journal of Statistical Planning and Inference 136 (2006) 3659-3674. doi:10.1016/j.jspi.2005.01.006

Abstract

In the context of information theory, Shannon's entropy plays an important role. Since this entropy is not applicable to a system that has survived for some units of time, the concept of residual entropy has been developed in the literature. The present paper deals with the dual situation, in which the random variable is truncated above some point t, i.e. its support is taken to be (0, t). Some ordering and aging properties are defined in terms of generalized past entropy, and their properties are studied. Quite a few results available in the literature are generalized, and the uniform distribution is characterized through the generalized past entropy.

MSC: 94AXX; 94A17

Keywords: IUL; IUL($\alpha$); Measure of information; Past entropy; Residual entropy; Reversed hazard rate function

1. Introduction

Let X be an absolutely continuous nonnegative random variable having distribution function $F(t) = P(X \le t)$ and survival function $\bar F(t) = P(X > t)$. Suppose X denotes the lifetime of a component/system or of a living organism, and let $f(t) = F'(t)$ denote the lifetime density function. Shannon (1948) was the first to introduce entropy, known as Shannon's entropy or Shannon's information measure, into information theory.

For an absolutely continuous random variable X having probability density function f, Shannon's entropy is defined as
$$
H(X) = -\int_0^{\infty} f(x)\ln f(x)\,dx = -E(\ln f(X)). \qquad (1.1)
$$
The properties and virtues of $H(X)$ have been thoroughly investigated by Shannon (1948) and Wiener (1961). Further, many generalizations of (1.1) have been proposed by Arimoto (1971), Ferreri (1980), Havrda and Charvát (1967), Khinchin (1957), Rényi (1961), Sharma and Mittal (1977), Sharma and Taneja (1975), Taneja (1975, 1990), and Varma (1966). Khinchin (1957) generalized (1.1) by choosing a convex function $\varphi$ with $\varphi(1) = 0$ and defined the measure
$$
H_{\varphi}(X) = \int f(x)\,\varphi(f(x))\,dx. \qquad (1.2)
$$
In recent years, the modification of Shannon entropy as a measure of uncertainty in residual lifetime distributions has drawn the attention of many researchers (cf. Ebrahimi, 1996; Ebrahimi and Kirmani, 1996a, b; Ebrahimi and Pellerey, 1995; Belzunce et al., 2004). Ebrahimi (1996) defined the uncertainty of the residual lifetime distribution of a component, $H(f;t)$, by truncating the distribution below some point t, as
$$
H(f;t) = -\int_t^{\infty} \frac{f(x)}{\bar F(t)}\,\ln\frac{f(x)}{\bar F(t)}\,dx, \qquad (1.3)
$$
where $f(x)$ is the failure density function. Di Crescenzo and Longobardi (2002) have introduced past entropy over $(0,t)$, since it is reasonable to presume that in many realistic situations uncertainty is not necessarily related to the future but can also refer to the past. They have also shown the necessity of past entropy and its relation with the residual entropy. If X denotes the lifetime of an item or of a living organism, then the past entropy (or uncertainty of lifetime distribution) of the item is defined as
$$
H^*(X;t) = -\int_0^{t} \frac{f(x)}{F(t)}\,\ln\frac{f(x)}{F(t)}\,dx. \qquad (1.4)
$$
Nanda and Paul (2005) have studied some properties and applications of past entropy. Gupta and Nanda (2002) define generalized uncertainties of the lifetime distribution, $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$, by truncating the distribution above some point t, as
$$
H^*_{1\alpha}(X;t) = \frac{1}{\alpha-1}\left[1 - \int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx\right] \qquad (1.5)
$$
and
$$
H^*_{2\alpha}(X;t) = \frac{1}{1-\alpha}\,\ln \int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx. \qquad (1.6)
$$

Note that, as $\alpha \to 1$, (1.5) and (1.6) reduce to $H^*(X;t)$. Again, as $\alpha \to 1$ and $t \to \infty$, (1.5) and (1.6) reduce to $H(X)$. We call $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$ the first kind past entropy of order $\alpha$ and the second kind past entropy of order $\alpha$, respectively. One can easily verify that, as $t \to \infty$, (1.5) reduces to $H_{1\alpha}(f)$ and (1.6) reduces to $H_{2\alpha}(f)$, defined in Gupta and Nanda (2002). It is to be noted here that the nonparametric class IUL($\alpha$), defined in Section 3, includes the IUL class defined in Nanda and Paul (2005); as a result, the results for this class reported in this paper hold for the smaller IUL class as well, which can be seen by taking the limit as $\alpha \to 1$ in the respective results proved here.

In forensic science, a lifetime distribution truncated above t is of utmost importance. Looking into this aspect, in this paper we analyze how the generalized past entropies behave when the distribution is truncated above t. Based on the generalized past entropy, one stochastic order is defined and its properties are studied in Section 2; it is shown that the order defined here is closed under increasing linear transformations. In Section 3, a nonparametric class is defined based on the generalized past entropy, and its properties are studied. Some characterization results based on the generalized past entropies are given in Section 4; it is shown that, under a certain condition, the generalized past entropies uniquely determine the distribution function, and the uniform distribution is characterized in terms of the generalized past entropies. Some discrete distribution results are studied in Section 5, where the discrete uniform distribution is characterized in terms of the generalized past entropy. Throughout this paper, the words increasing and decreasing are not used in the strict sense.
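The three measures in (1.4)-(1.6) are all single truncated integrals, so they are straightforward to evaluate numerically. The following sketch (not part of the paper; Python with SciPy quadrature, and a standard exponential density chosen purely as an illustrative example) computes $H^*(X;t)$, $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$ and illustrates the limit $\alpha \to 1$.

```python
# Minimal numerical sketch (not from the paper): evaluates H*(X;t),
# H*_{1,alpha}(X;t) and H*_{2,alpha}(X;t) of (1.4)-(1.6) by quadrature
# for an assumed Exp(1) lifetime, and checks the alpha -> 1 limit.
import numpy as np
from scipy.integrate import quad

def f(x):                      # assumed example density: Exp(1)
    return np.exp(-x)

def F(x):                      # its distribution function
    return 1.0 - np.exp(-x)

def past_entropy(t):           # H*(X;t), eq. (1.4)
    integrand = lambda x: (f(x) / F(t)) * np.log(f(x) / F(t))
    return -quad(integrand, 0.0, t)[0]

def H1(t, alpha):              # first kind past entropy of order alpha, eq. (1.5)
    I = quad(lambda x: (f(x) / F(t)) ** alpha, 0.0, t)[0]
    return (1.0 - I) / (alpha - 1.0)

def H2(t, alpha):              # second kind past entropy of order alpha, eq. (1.6)
    I = quad(lambda x: (f(x) / F(t)) ** alpha, 0.0, t)[0]
    return np.log(I) / (1.0 - alpha)

t = 2.0
print(past_entropy(t))              # H*(X;t)
print(H1(t, 1.001), H2(t, 1.001))   # both should be close to H*(X;t)
print(H1(t, 2.0), H2(t, 2.0))       # order-2 generalized past entropies
```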

2. Properties of a lifetime random variable based on generalized past entropy

Nanda and Paul (2005) have defined an order of life distributions based on the measure $H^*(X;t)$ as follows.

Definition 2.1. Let X and Y be two random variables, with supports $(l_X, u_X)$ and $(l_Y, u_Y)$, denoting the lifetimes of two components with density functions f and g, respectively. Then X is said to be greater than Y in past entropy order (written as $X \geq_{\rm PE} Y$) if $H^*(X;t) \geq H^*(Y;t)$ for all $t \in (\max(l_X, l_Y), \infty)$. Here $u_X$ and $u_Y$ may be $\infty$, and $l_X$, $l_Y$ may be zero.

In this section, we give a new partial order based on the generalized past entropies.

Definition 2.2. Let X (resp. Y) be a random variable with support $(l_X, u_X)$ (resp. $(l_Y, u_Y)$). Then X is said to be greater than Y in generalized past entropy of order $\alpha$ (written as $X \geq_{\rm GPE(\alpha)} Y$) if $H^*_{1\alpha}(X;t) \geq H^*_{1\alpha}(Y;t)$ (or, equivalently, $H^*_{2\alpha}(X;t) \geq H^*_{2\alpha}(Y;t)$) for all $t \in (\max(l_X, l_Y), \infty)$. Here $u_X$ and $u_Y$ may be $\infty$, and $l_X$, $l_Y$ may be zero.

Remark 2.1. It is to be noted that, as $\alpha \to 1$, Definitions 2.2 and 2.1 become identical.


Let $S^*_{\alpha}$ be the set of all pairs of distributions which are GPE($\alpha$) ordered, $\alpha > 0$, and let S be the set of those which are PE ordered. Then clearly $S^*_{\alpha} \supseteq S$. The following counterexample shows that $S^*_{\alpha}$ is a strict superset of S.

Counterexample 2.1. Let X be a nonnegative random variable with distribution function
$$
F(x) = \begin{cases} \exp\left(-\dfrac{1}{2} - \dfrac{1}{x}\right) & \text{if } 0 \le x \le 1,\\[6pt] \exp\left(-2 + \dfrac{x^2}{2}\right) & \text{if } 1 \le x \le 2,\\[6pt] 1 & \text{if } x \ge 2, \end{cases}
$$
and let Y be another nonnegative random variable with distribution function
$$
G(x) = \begin{cases} \dfrac{x^2}{4} & \text{if } 0 \le x \le 2,\\[6pt] 1 & \text{if } x \ge 2. \end{cases}
$$
Case 1: When $0 < t < 1$,
$$
\int_0^{t} \frac{f(x)}{F(t)}\ln\frac{f(x)}{F(t)}\,dx - \int_0^{t} \frac{g(x)}{G(t)}\ln\frac{g(x)}{G(t)}\,dx
= -\ln t + e^{1/t}\int_0^{t}\left(\frac{2}{x} - \frac{1}{x^{2}}\right)e^{-1/x}\,dx - \ln 2 + \frac{1}{2}
= B(t) \quad \text{(say)}.
$$

Note that $B(0.0246) = -0.591694$ and $B(0.0248) = 7.744852$. Therefore, it is clear that $B(t)$ crosses the horizontal axis at some $t \in (0, 1)$. Thus, X and Y are not PE ordered. Again, note that
$$
\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx - \int_0^{t}\left(\frac{g(x)}{G(t)}\right)^{\alpha} dx = e^{\alpha/t}\int_0^{t}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx - \frac{2^{\alpha}}{\alpha+1}\,t^{1-\alpha}.
$$

Case 2: When $1 < t < 2$,
$$
\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx - \int_0^{t}\left(\frac{g(x)}{G(t)}\right)^{\alpha} dx = e^{-\alpha t^2/2}\left[e^{3\alpha/2}\int_0^{1}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx + \int_1^{t} x^{\alpha} e^{\alpha x^2/2}\,dx\right] - \frac{2^{\alpha}}{\alpha+1}\,t^{1-\alpha}.
$$
Let us write
$$
B_1(t) = \int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx - \int_0^{t}\left(\frac{g(x)}{G(t)}\right)^{\alpha} dx.
$$


Then, it can be checked that
$$
B_1(t) = \begin{cases} e^{\alpha/t}\displaystyle\int_0^{t}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx - \dfrac{2^{\alpha}}{\alpha+1}\,t^{1-\alpha} & \text{if } 0 < t \le 1,\\[10pt] e^{-\alpha t^2/2}\left[e^{3\alpha/2}\displaystyle\int_0^{1}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx + \int_1^{t} x^{\alpha} e^{\alpha x^2/2}\,dx\right] - \dfrac{2^{\alpha}}{\alpha+1}\,t^{1-\alpha} & \text{if } 1 \le t < 2, \end{cases}
$$
is negative for all values of $t \in (0, 2)$ when $\alpha = 0.5$.

Case 3: When $t > 2$,
$$
\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx - \int_0^{t}\left(\frac{g(x)}{G(t)}\right)^{\alpha} dx = e^{-\alpha/2}\int_0^{1}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx + e^{-2\alpha}\int_1^{2} x^{\alpha} e^{\alpha x^2/2}\,dx - \frac{2}{\alpha+1} = -0.0570091
$$
when $\alpha = 0.5$.

Thus, combining all the cases, we get that, when $\alpha = 0.5$, X and Y are GPE($\alpha$) ordered, but they are not PE ordered.
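One way to check such a claim is to evaluate the difference $H^*_{1\alpha}(X;t) - H^*_{1\alpha}(Y;t)$ on a grid of t values. The sketch below (an illustration only; the grid and the SciPy-based quadrature are assumptions, not part of the paper) does this for $\alpha = 0.5$: a difference of constant sign over $(0,2)$ is consistent with the GPE(0.5) ordering claimed above.

```python
# Sketch: numerical comparison of the generalized past entropies of X and Y
# from Counterexample 2.1 at alpha = 0.5; grid and quadrature are assumptions.
import numpy as np
from scipy.integrate import quad

def F(x):                                   # distribution function of X
    if x <= 1.0:
        return np.exp(-0.5 - 1.0 / x)
    if x <= 2.0:
        return np.exp(-2.0 + x * x / 2.0)
    return 1.0

def f(x):                                   # density of X
    if x <= 1.0:
        return np.exp(-0.5 - 1.0 / x) / x ** 2
    if x <= 2.0:
        return x * np.exp(-2.0 + x * x / 2.0)
    return 0.0

def G(x):                                   # distribution function of Y
    return min(x * x / 4.0, 1.0)

def g(x):                                   # density of Y
    return x / 2.0 if x < 2.0 else 0.0

def H1(dens, cdf, t, alpha):                # first kind past entropy, eq. (1.5)
    pieces = [(0.0, min(t, 1.0))] + ([(1.0, t)] if t > 1.0 else [])
    I = sum(quad(lambda x: (dens(x) / cdf(t)) ** alpha, a, b)[0] for a, b in pieces)
    return (1.0 - I) / (alpha - 1.0)

alpha = 0.5
diffs = [H1(f, F, t, alpha) - H1(g, G, t, alpha) for t in np.linspace(0.05, 1.95, 39)]
print(min(diffs), max(diffs))   # a single sign over the grid indicates a GPE(0.5) ordering
```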

Ebrahimi and Pellerey (1995) mentioned, for the residual entropy, that no relations exist between the order based on residual entropy and the classical stochastic orders. Here we justify a similar claim by giving two examples. The following example shows that X and Y may be stochastically ordered and yet fail to be GPE($\alpha$) ordered. Recall that a nonnegative random variable X is said to be greater than another nonnegative random variable Y in stochastic order (written as $X \geq_{\rm st} Y$) if $F(x) \le G(x)$ for all $x > 0$, where F and G are the distribution functions of X and Y, respectively.

Example 2.1. Let X and Y be two nonnegative random variables having distribution functions as defined in Counterexample 2.1. Then, writing $B_2(t) = F(t) - G(t)$, we have
$$
B_2(t) = \begin{cases} \exp\left(-\dfrac{1}{2} - \dfrac{1}{t}\right) - \dfrac{t^2}{4} & \text{if } 0 < t \le 1,\\[8pt] \exp\left(-2 + \dfrac{t^2}{2}\right) - \dfrac{t^2}{4} & \text{if } 1 \le t < 2, \end{cases} \;\le\; 0.
$$
Thus $X \geq_{\rm st} Y$. Again, we have, for $\alpha = 2$, $B_1(1.6) = -0.0191663$ and $B_1(1.7) = 0.0333548$, where $B_1(t)$ is as defined in Counterexample 2.1. Hence X and Y are not GPE($\alpha$) ordered for $\alpha = 2$.

Below is an example which shows that X and Y may be GPE($\alpha$) ordered without being stochastically ordered.


Example 2.2. Let X be a nonnegative random variable having distribution function
$$
F(x) = \begin{cases} \dfrac{x^2+x}{4} & \text{if } 0 \le x \le 1,\\[6pt] \dfrac{x}{2} & \text{if } 1 \le x \le 2,\\[6pt] 1 & \text{if } x \ge 2, \end{cases}
$$
and let Y be another nonnegative random variable with distribution function
$$
G(x) = \begin{cases} \exp\left(\dfrac{1}{2} - \dfrac{1}{x}\right) & \text{if } 0 \le x \le 2,\\[6pt] 1 & \text{if } x \ge 2. \end{cases}
$$
Then $B_3(t) = F(t) - G(t)$ is given by
$$
B_3(t) = \begin{cases} \dfrac{t^2+t}{4} - \exp\left(\dfrac{1}{2} - \dfrac{1}{t}\right) & \text{if } 0 \le t \le 1,\\[8pt] \dfrac{t}{2} - \exp\left(\dfrac{1}{2} - \dfrac{1}{t}\right) & \text{if } 1 \le t \le 2, \end{cases}
$$
and $B_4(t) = \int_0^{t}(f(x)/F(t))^{\alpha}\,dx - \int_0^{t}(g(x)/G(t))^{\alpha}\,dx$ is given by
$$
B_4(t) = \begin{cases} \dfrac{(2t+1)^{\alpha+1} - 1}{2(\alpha+1)(t^2+t)^{\alpha}} - e^{\alpha/t}\displaystyle\int_0^{t}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx & \text{if } 0 \le t \le 1,\\[12pt] \dfrac{3^{\alpha+1} - 1}{2^{\alpha+1}(\alpha+1)t^{\alpha}} + \dfrac{t-1}{t^{\alpha}} - e^{\alpha/t}\displaystyle\int_0^{t}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx & \text{if } 1 \le t \le 2. \end{cases}
$$
Note that $B_3(0.3) = 0.03868352$ and $B_3(0.5) = -0.03563016$. Thus, it is clear that $B_3(t)$ crosses the horizontal axis, and therefore X and Y are not stochastically ordered. It can also be checked that $B_4(t)$ is negative for all $t \in (0, 2)$ when $\alpha = 2$. Hence $X \geq_{\rm GPE(\alpha)} Y$ for $\alpha = 2$.

The following proposition, which gives the values of the functions $H^*_{1\alpha}$ and $H^*_{2\alpha}$ under linear transformations, will be used in proving the upcoming theorems of this section. The proof is omitted.

Proposition 2.1. For any absolutely continuous random variable X, define $Z = aX + b$, where $a > 0$ and $b \ge 0$ are constants. Then, for $t > b$,

(i) $H^*_{1\alpha}(Z;t) = \dfrac{1}{\alpha-1}\left[1 - a^{1-\alpha}\left\{1 - (\alpha-1)H^*_{1\alpha}\left(X; \dfrac{t-b}{a}\right)\right\}\right]$;

(ii) $H^*_{2\alpha}(Z;t) = \ln a + H^*_{2\alpha}\left(X; \dfrac{t-b}{a}\right)$.

The following theorem shows that the GPE($\alpha$) order defined earlier is closed under increasing linear transformations.
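Proposition 2.1 can be checked numerically for any convenient distribution. The sketch below (assumed example: X exponential with unit rate, and arbitrarily chosen constants a, b, $\alpha$, t) computes the generalized past entropies of $Z = aX + b$ directly by quadrature and compares them with the right-hand sides of (i) and (ii).

```python
# Small numerical sketch (not from the paper) of Proposition 2.1 for an
# assumed X ~ Exp(1) and Z = aX + b with arbitrarily chosen constants.
import numpy as np
from scipy.integrate import quad

a, b, alpha, t = 2.0, 1.0, 2.0, 4.0          # assumed illustrative constants, t > b

fX = lambda x: np.exp(-x)                    # density and cdf of X
FX = lambda x: 1.0 - np.exp(-x)
fZ = lambda z: fX((z - b) / a) / a           # density and cdf of Z = aX + b
FZ = lambda z: FX((z - b) / a)

def H1(dens, cdf, lo, t):                    # first kind past entropy of order alpha
    I = quad(lambda x: (dens(x) / cdf(t)) ** alpha, lo, t)[0]
    return (1.0 - I) / (alpha - 1.0)

def H2(dens, cdf, lo, t):                    # second kind past entropy of order alpha
    I = quad(lambda x: (dens(x) / cdf(t)) ** alpha, lo, t)[0]
    return np.log(I) / (1.0 - alpha)

lhs1 = H1(fZ, FZ, b, t)
rhs1 = (1.0 - a ** (1.0 - alpha) * (1.0 - (alpha - 1.0) * H1(fX, FX, 0.0, (t - b) / a))) / (alpha - 1.0)
lhs2 = H2(fZ, FZ, b, t)
rhs2 = np.log(a) + H2(fX, FX, 0.0, (t - b) / a)
print(lhs1, rhs1)                            # should agree up to quadrature error
print(lhs2, rhs2)
```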


Theorem 2.1. For two absolutely continuous random variables X and Y, define $Z_1 = a_1 X + b_1$ and $Z_2 = a_2 Y + b_2$, with $a_1, a_2 > 0$ and $b_1, b_2 \ge 0$. Let (i) $X \geq_{\rm GPE(\alpha)} Y$, (ii) $a_1 \le a_2$, (iii) $b_1 \le b_2$. Then $Z_1 \geq_{\rm GPE(\alpha)} Z_2$ if $H^*_{1\alpha}(X;t)$ or $H^*_{1\alpha}(Y;t)$ is increasing in $t > b_1$.

Proof. Suppose $H^*_{1\alpha}(X;t)$ is increasing in t. Since $(t-b_1)/a_1 \ge (t-b_2)/a_2$, we have
$$
H^*_{1\alpha}\left(X; \frac{t-b_1}{a_1}\right) \ge H^*_{1\alpha}\left(X; \frac{t-b_2}{a_2}\right). \qquad (2.1)
$$
Further, $X \geq_{\rm GPE(\alpha)} Y$ implies
$$
H^*_{1\alpha}\left(X; \frac{t-b_2}{a_2}\right) \ge H^*_{1\alpha}\left(Y; \frac{t-b_2}{a_2}\right). \qquad (2.2)
$$
Combining (2.1) and (2.2) and using Proposition 2.1, we have $Z_1 \geq_{\rm GPE(\alpha)} Z_2$. If $H^*_{1\alpha}(Y;t)$ is increasing in t, the proof is similar and hence omitted. $\square$

Below is an example which shows that $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$ can be increasing in $t > 0$.

Example 2.3. Let X be a nonnegative random variable having distribution function $F(x)$ as defined in Example 2.2. Then we have
$$
H^*_{1\alpha}(X;t) = \begin{cases} \dfrac{1}{\alpha-1}\left[1 - \dfrac{(2t+1)^{\alpha+1}-1}{2(\alpha+1)(t^2+t)^{\alpha}}\right] & \text{if } 0 < t \le 1,\\[12pt] \dfrac{1}{\alpha-1}\left[1 - \dfrac{3^{\alpha+1}-1}{2^{\alpha+1}(\alpha+1)t^{\alpha}} - \dfrac{t-1}{t^{\alpha}}\right] & \text{if } 1 \le t \le 2,\\[12pt] \dfrac{1}{\alpha-1}\left[1 - \dfrac{3^{\alpha+1}-1}{(\alpha+1)2^{2\alpha+1}} - \dfrac{1}{2^{\alpha}}\right] & \text{if } t \ge 2, \end{cases}
$$
and
$$
H^*_{2\alpha}(X;t) = \begin{cases} \dfrac{1}{1-\alpha}\,\ln\left[\dfrac{(2t+1)^{\alpha+1}-1}{2(\alpha+1)(t^2+t)^{\alpha}}\right] & \text{if } 0 < t \le 1,\\[12pt] \dfrac{1}{1-\alpha}\,\ln\left[\dfrac{3^{\alpha+1}-1}{2^{\alpha+1}(\alpha+1)t^{\alpha}} + \dfrac{t-1}{t^{\alpha}}\right] & \text{if } 1 \le t \le 2,\\[12pt] \dfrac{1}{1-\alpha}\,\ln\left[\dfrac{3^{\alpha+1}-1}{(\alpha+1)2^{2\alpha+1}} + \dfrac{1}{2^{\alpha}}\right] & \text{if } t \ge 2. \end{cases}
$$
It is not very hard to check that, for $\alpha = 2$, $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$ are increasing in t. From this example we see that the class of distributions for which $H^*_{1\alpha}(X;t)$ (or, equivalently, $H^*_{2\alpha}(X;t)$) is increasing is not void.


Corollary 2.1. Let $Z_1 = a_1 X + b_1$ and $Z_2 = a_2 Y + b_2$, with $a_1, a_2 > 0$ and $b_1, b_2 \ge 0$. Further, let $a_1 \le a_2$ and $b_1 \le b_2$. Then $Z_1 \geq_{\rm PE} Z_2$ if $X \geq_{\rm PE} Y$ and either $H^*(X;t)$ or $H^*(Y;t)$ is increasing in $t > b_1$.

Corollary 2.2. Let X and Y be two absolutely continuous random variables such that $X \geq_{\rm GPE(\alpha)} Y$. Define $X_1 = aX + b$ and $Y_1 = aY + b$, where $a > 0$ and $b \ge 0$ are constants. Then $X_1 \geq_{\rm GPE(\alpha)} Y_1$ if either $H^*_{1\alpha}(X;t)$ or $H^*_{1\alpha}(Y;t)$ is increasing in $t \ge b$.

Although Corollary 2.2 is a direct consequence of Theorem 2.1, a stronger result can be proved. This is stated below without proof.

Theorem 2.2. Let X and Y be two absolutely continuous random variables. Define $X_1 = aX + b$ and $Y_1 = aY + b$, where $a > 0$ and $b \ge 0$. Then $X_1 \geq_{\rm GPE(\alpha)} Y_1$ if $X \geq_{\rm GPE(\alpha)} Y$.

Corollary 2.3. Let $X_1$ and $Y_1$ be as defined in Theorem 2.2. Then $X_1 \geq_{\rm PE} Y_1$ if $X \geq_{\rm PE} Y$.

Remark 2.2. Corollary 2.3 is given in Nanda and Paul (2005).

The following example shows an application of Theorem 2.2.

Example 2.4. Let X and Y be two exponentially distributed random variables with means $1/2$ and $1/\lambda$ ($\lambda > 0$), respectively, and let $X_1$ and $Y_1$ be as defined in Theorem 2.2. Then, for $\alpha = 2$, one can see that X and Y are GPE($\alpha$) ordered; hence, by Theorem 2.2, $X_1$ and $Y_1$ are GPE($\alpha$) ordered for $\alpha = 2$. The same conclusion holds if we take $\alpha = 0.5$.
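A numerical illustration of Theorem 2.2 in the spirit of Example 2.4 is sketched below (the rates, the constants a and b, and the grid of t values are assumptions made for illustration only): the sign of $H^*_{1\alpha}(X;t) - H^*_{1\alpha}(Y;t)$ is compared with the sign of the corresponding difference for $X_1 = aX + b$ and $Y_1 = aY + b$ evaluated at $at + b$.

```python
# Hedged numerical illustration of Theorem 2.2 (all constants are assumed):
# two exponentials X, Y and their common linear transforms X1, Y1.
import numpy as np
from scipy.integrate import quad

lam1, lam2, a, b, alpha = 2.0, 1.0, 1.5, 0.5, 2.0   # assumed constants

def exp_pair(lam):                  # density and cdf of an exponential with rate lam
    return (lambda x: lam * np.exp(-lam * x), lambda x: 1.0 - np.exp(-lam * x))

def shifted_pair(dens, cdf, a, b):  # density and cdf of aX + b
    return (lambda z: dens((z - b) / a) / a, lambda z: cdf((z - b) / a))

def H1(dens, cdf, lo, t):           # first kind past entropy of order alpha, eq. (1.5)
    I = quad(lambda x: (dens(x) / cdf(t)) ** alpha, lo, t)[0]
    return (1.0 - I) / (alpha - 1.0)

fX, FX = exp_pair(lam1)
fY, FY = exp_pair(lam2)
fX1, FX1 = shifted_pair(fX, FX, a, b)
fY1, FY1 = shifted_pair(fY, FY, a, b)

for t in np.linspace(0.5, 5.0, 10):
    d_orig = H1(fX, FX, 0.0, t) - H1(fY, FY, 0.0, t)
    d_shift = H1(fX1, FX1, b, a * t + b) - H1(fY1, FY1, b, a * t + b)
    print(round(t, 2), round(d_orig, 5), round(d_shift, 5))  # the two differences share a sign
```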

3. A new nonparametric class based on generalized past entropy

Nanda and Paul (2005) have defined a nonparametric class of life distributions based on the measure $H^*(X;t)$ as follows.

Definition 3.1. A random variable X is said to have increasing uncertainty of life (IUL) if $H^*(X;t)$ is increasing in $t \ge 0$.

We define the following nonparametric class based on the measure $H^*_{1\alpha}(X;t)$ or $H^*_{2\alpha}(X;t)$.

Definition 3.2. A nonnegative random variable X is said to have IUL of order $\alpha$ (written IUL($\alpha$)) if $H^*_{1\alpha}(X;t)$ (or, equivalently, $H^*_{2\alpha}(X;t)$) is increasing in $t \ge 0$.

Remark 3.1. It can be noted that, as $\alpha \to 1$, Definitions 3.2 and 3.1 become identical.


The following counterexample shows that the IUL($\alpha$) class does not, in general, coincide with the IUL class.

Counterexample 3.1. Let X be a nonnegative random variable as defined in Counterexample 2.1. When $0 < t < 1$,
$$
L(t) \stackrel{\rm def}{=} \int_0^{t}\frac{f(x)}{F(t)}\ln\frac{f(x)}{F(t)}\,dx
$$
is not monotone, since $L(0.02) = 57.824$, $L(0.0243) = -34.1333$ and $L(0.03) = 6.09096$. On the other hand, writing $L_1(t) = \int_0^{t}(f(x)/F(t))^{\alpha}\,dx$, we have
$$
L_1(t) = \begin{cases} e^{\alpha/t}\displaystyle\int_0^{t}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx & \text{if } 0 < t \le 1,\\[10pt] e^{-\alpha t^2/2}\left[e^{3\alpha/2}\displaystyle\int_0^{1}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx + \int_1^{t} x^{\alpha} e^{\alpha x^2/2}\,dx\right] & \text{if } 1 \le t \le 2,\\[10pt] 1.27632 & \text{if } t \ge 2, \end{cases}
$$
which is increasing in t for $\alpha = 0.5$. Thus, we get that X is not IUL, but it is IUL(0.5).

To see that not all distributions are monotone in terms of $H^*_{1\alpha}(X;t)$, we provide an example below.

Example 3.1. Let X be a nonnegative random variable with distribution function as defined in Counterexample 2.1. Then, for $1 \le t \le 2$,
$$
\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx = e^{-\alpha t^2/2}\left[e^{3\alpha/2}\int_0^{1}\frac{e^{-\alpha/x}}{x^{2\alpha}}\,dx + \int_1^{t} x^{\alpha} e^{\alpha x^2/2}\,dx\right].
$$
Hence, one can see that, for $\alpha = 2$ and $1 \le t \le 2$,
$$
H^*_{1\alpha}(X;t) = 1 - e^{-t^2}\left[e^{3}\int_0^{1}\frac{e^{-2/x}}{x^{4}}\,dx + \int_1^{t} x^{2} e^{x^2}\,dx\right],
$$
which is not monotone.

The following theorem shows that the nonparametric class given in Definition 3.2 is closed under linear transformations.

Theorem 3.1. Let $X \in$ IUL($\alpha$). Define $Z = aX + b$, with $a > 0$ and $b \ge 0$. Then $Z \in$ IUL($\alpha$).

Proof. The proof follows from the definition along with Proposition 2.1. $\square$

An application of the above theorem is given below.

Example 3.2. Let X be an exponentially distributed random variable having distribution function
$$
F(x) = 1 - e^{-\lambda x}, \qquad x \ge 0,\ \lambda > 0,
$$

and let $Z = aX + b$, with $a > 0$ and $b \ge 0$. Let us take $\alpha = 2$. Then one can show that
$$
H^*_{1\alpha}(X;t) = 1 - \int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{2} dx = 1 - \frac{\lambda}{2}\left(\frac{1+e^{-\lambda t}}{1-e^{-\lambda t}}\right)
$$
is increasing in $t > 0$. This implies that $X \in$ IUL(2). Therefore, by using Theorem 3.1, we have $Z \in$ IUL(2).

It is noted in Nanda and Paul (2005) that DRHR $\subset$ IUL. Further, IUL($\alpha$) reduces to IUL as $\alpha \to 1$. Thus, DRHR $\subset$ IUL $\subset$ IUL($\alpha$). A class of distributions is said to be DRHR (decreasing in reversed hazard rate) if $\phi(x) = f(x)/F(x)$ is decreasing in x.

The theorem below gives upper bounds on the reversed hazard rate function $\phi(t)$ in terms of $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$. The proof is simple and hence omitted.

Theorem 3.2. If X is IUL($\alpha$), then
$$
\phi(t) \le \left[\alpha\left(1 - (\alpha-1)H^*_{1\alpha}(X;t)\right)\right]^{1/(\alpha-1)}
$$
and
$$
\phi(t) \le \alpha^{1/(\alpha-1)}\, e^{-H^*_{2\alpha}(X;t)}.
$$
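The first bound of Theorem 3.2 can be verified directly for the exponential distribution of Example 3.2, which is IUL(2). The sketch below (with an assumed rate $\lambda$; the closed form of $H^*_{1,2}(X;t)$ is the one derived in Example 3.2) checks that the reversed hazard rate never exceeds the bound on a grid of t values.

```python
# Sketch: checking the first bound of Theorem 3.2 for the exponential of
# Example 3.2 with alpha = 2; the rate and the grid are assumed values.
import numpy as np

lam, alpha = 1.0, 2.0
phi = lambda t: lam * np.exp(-lam * t) / (1.0 - np.exp(-lam * t))                   # reversed hazard rate
H1 = lambda t: 1.0 - 0.5 * lam * (1.0 + np.exp(-lam * t)) / (1.0 - np.exp(-lam * t))  # Example 3.2

for t in np.linspace(0.2, 5.0, 10):
    bound = (alpha * (1.0 - (alpha - 1.0) * H1(t))) ** (1.0 / (alpha - 1.0))
    print(round(t, 1), phi(t) <= bound + 1e-12)   # expected True at every grid point
```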

Remark 3.2. Since the distribution function and the reversed hazard rate function are equivalent (in the sense that if one is known, the other can be obtained uniquely), Theorem 3.2 also yields a bound on the distribution function.

4. Some characterization results

In this section, we give a few characterizations of distributions in terms of the generalized uncertainty. Differentiating both sides of (1.5) with respect to t, we get
$$
\phi^{\alpha}(t) - \alpha\phi(t)\left[1 - (\alpha-1)H^*_{1\alpha}(X;t)\right] = -(\alpha-1)\,\frac{d}{dt}H^*_{1\alpha}(X;t),
$$
where $\phi(t) = f(t)/F(t)$ is the reversed hazard rate function of the random variable X. Hence, for a fixed $t > 0$, $\phi(t)$ is a solution of $h(x) = 0$, where
$$
h(x) = x^{\alpha} - \alpha x\left[1 - (\alpha-1)H^*_{1\alpha}(X;t)\right] + (\alpha-1)\,\frac{d}{dt}H^*_{1\alpha}(X;t). \qquad (4.1)
$$
Differentiating both sides of (4.1) with respect to x, we get
$$
h'(x) = \alpha x^{\alpha-1} - \alpha\left[1 - (\alpha-1)H^*_{1\alpha}(X;t)\right].
$$
Note that $h'(x) = 0$ gives $x = \left[1 - (\alpha-1)H^*_{1\alpha}(X;t)\right]^{1/(\alpha-1)} = x_0$ (say).


Proposition 4.1. If $H^*_{1\alpha}(X;t)$ is increasing in $t > 0$, then
(i) $h(x) = 0$ has a unique solution if $h(x_0) = 0$; that unique solution is the reversed hazard rate;
(ii) $h(x) = 0$ has two solutions if $h(x_0) \neq 0$; of these two solutions, at least one is the reversed hazard rate.

Proof. We prove the result in two cases.
Case 1: Let $\alpha > 1$. Then $h(0) > 0$, since $H^*_{1\alpha}(X;t)$ is increasing in $t > 0$. Further, one can show that $h(x)$ is a convex function with minimum occurring at $x = x_0$. So $h(x) = 0$ has a unique solution when $h(x_0) = 0$.
Case 2: Let $\alpha < 1$. Then $h(0) < 0$, and $h(x)$ is a concave function with maximum at $x = x_0$. So $h(x) = 0$ has a unique solution when $h(x_0) = 0$.
Combining both cases, we get that if $H^*_{1\alpha}(X;t)$ is increasing in $t > 0$ and $h(x_0) = 0$, then $h(x) = 0$ has a unique solution. Since $\phi(t)$ is a solution of $h(x) = 0$, this unique solution is the reversed hazard rate.
The proof of (ii) follows easily from the above two cases: if $h(x_0) \neq 0$, then $h(x) = 0$ has two solutions and, of these two, one must be the reversed hazard rate. $\square$

Below is one counterexample where both solutions of $h(x) = 0$ are reversed hazard rates.

Counterexample 4.1. Let X be a nonnegative random variable having a beta distribution with density function $f(t) = c\,t^{c-1}$, $0 < t < 1$ (and $= 0$ otherwise), where $\tfrac{1}{2} < c < 1$. Then it can be verified that, for $\alpha = 2$, $H^*_{1\alpha}(X;t) = 1 - c^2/((2c-1)t)$, which is increasing in t, and $h(x_0) \neq 0$. Further, writing $\phi(t)$ for the reversed hazard rate function of X, we have
$$
\frac{x_0}{\phi(t)} = \frac{c}{2c-1} > 1
$$
for $t \in (0, 1)$. Thus, for every $t > 0$, $h(x) = 0$ has two solutions $\phi(t)$ and $\phi^*(t)$ such that $\phi(t) < x_0 < \phi^*(t)$. Hence $\phi^*(t)$ must also be a reversed hazard rate function.
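For the distribution of Counterexample 4.1 everything in (4.1) is available in closed form, so Proposition 4.1(ii) can be illustrated numerically. The sketch below (c and t are assumed values satisfying $\tfrac12 < c < 1$ and $0 < t < 1$) computes the two roots of $h(x) = 0$ for $\alpha = 2$; they should come out as $c/t$ and $c/((2c-1)t)$, both of which are reversed hazard rates of power distributions on $(0,1)$.

```python
# Sketch of Counterexample 4.1 / Proposition 4.1(ii); c and t are assumed values.
# For alpha = 2, h(x) = x**2 - 2*x*(1 - H1) + dH1 has two real roots.
import numpy as np

c, t = 0.75, 0.4                               # assumed: 1/2 < c < 1, 0 < t < 1
H1 = 1.0 - c ** 2 / ((2.0 * c - 1.0) * t)      # H*_{1,2}(X;t) of Counterexample 4.1
dH1 = c ** 2 / ((2.0 * c - 1.0) * t ** 2)      # its derivative with respect to t

b_coef = -2.0 * (1.0 - H1)                     # coefficient of x in h(x) for alpha = 2
disc = b_coef ** 2 - 4.0 * dH1
r1 = (-b_coef - np.sqrt(disc)) / 2.0
r2 = (-b_coef + np.sqrt(disc)) / 2.0
print(r1, r2)                                  # the two roots of h(x) = 0
print(c / t, c / ((2.0 * c - 1.0) * t))        # expected values: c/t and c/((2c-1)t)
```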

Below we characterize the uniform distribution by the generalized uncertainty.

Theorem 4.1. The uniform distribution over $(a, b)$, $a < b$, can be characterized by the first kind uncertainty $H^*_{1\alpha}(X;t) = (\alpha-1)^{-1}[1 - (t-a)^{1-\alpha}]$, $a < t < b$; i.e. a random variable X over $(a, b)$, $a < b$, has the uniform distribution if and only if $H^*_{1\alpha}(X;t) = (\alpha-1)^{-1}[1 - (t-a)^{1-\alpha}]$, $a < t < b$.

Proof. The 'only if' part is straightforward. To prove the 'if' part, note that $H^*_{1\alpha}(X;t) = (\alpha-1)^{-1}[1 - (t-a)^{1-\alpha}]$ gives
$$
\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx = (t-a)^{1-\alpha}, \qquad a < t < b.
$$


Differentiating both sides of the above expression with respect to t, we get
$$
-\alpha\phi(t)\int_0^{t}\left(\frac{f(x)}{F(t)}\right)^{\alpha} dx + \phi^{\alpha}(t) = (1-\alpha)(t-a)^{-\alpha},
$$
or, equivalently,
$$
\phi^{\alpha}(t) - \alpha\phi(t)(t-a)^{1-\alpha} - (1-\alpha)(t-a)^{-\alpha} = 0.
$$
Thus, for a fixed $t > a$, $\phi(t)$ is a solution of $k(x) = 0$, where
$$
k(x) = x^{\alpha} - \alpha x\,(t-a)^{1-\alpha} - (1-\alpha)(t-a)^{-\alpha}. \qquad (4.2)
$$
Differentiating both sides of (4.2) with respect to x, we get $k'(x) = \alpha x^{\alpha-1} - \alpha(t-a)^{1-\alpha}$. Note that $k'(x) = 0$ gives $x = (t-a)^{-1} = \bar x_0$ (say). Also, observe that $k(\bar x_0) = 0$.
Case 1: Let $\alpha > 1$. Then, from (4.2), we get $k(0) > 0$, and $k(x)$ is a convex function with minimum occurring at $x = \bar x_0$. So $k(x) = 0$ has the unique solution $x = \bar x_0$, since $k(\bar x_0) = 0$.
Case 2: Let $0 < \alpha < 1$. Then $k(0) < 0$, and $k(x)$ is a concave function with maximum occurring at $x = \bar x_0$. Therefore, $k(x) = 0$ has the unique solution $x = \bar x_0$.
Combining both cases, we get that $k(x) = 0$ has the unique solution $x = \bar x_0$. Since $\phi(t)$ is a solution of $k(x) = 0$, we have $\phi(t) = \bar x_0 = (t-a)^{-1}$. This is the reversed hazard rate function of the uniform distribution over $(a, b)$, $a < b$. Hence the theorem is established. $\square$

The proof of the following theorem follows along the same lines as that of Theorem 4.1 and is omitted.

Theorem 4.2. The uniform distribution over $(a, b)$, $a < b$, can be characterized by the second kind uncertainty $H^*_{2\alpha}(X;t) = \ln(t-a)$, $a < t < b$; i.e. a random variable X over $(a, b)$, $a < b$, has the uniform distribution if and only if $H^*_{2\alpha}(X;t) = \ln(t-a)$, $a < t < b$.
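Theorems 4.1 and 4.2 are easy to confirm numerically, since for the uniform distribution the truncated integral in (1.5) is available in closed form. The sketch below (a, b, $\alpha$ and the grid are assumed values) compares quadrature-based values of $H^*_{1\alpha}(X;t)$ and $H^*_{2\alpha}(X;t)$ with the characterizing expressions.

```python
# Small check (assumed a, b, alpha) of Theorems 4.1 and 4.2 for the uniform
# distribution on (a, b): quadrature values versus the closed-form expressions.
import numpy as np
from scipy.integrate import quad

a, b, alpha = 1.0, 3.0, 2.0
f = lambda x: 1.0 / (b - a)                    # uniform density on (a, b)
F = lambda x: (x - a) / (b - a)                # its distribution function

for t in np.linspace(1.2, 2.8, 5):
    I = quad(lambda x: (f(x) / F(t)) ** alpha, a, t)[0]
    H1 = (1.0 - I) / (alpha - 1.0)
    H2 = np.log(I) / (1.0 - alpha)
    print(round(H1 - (1.0 - (t - a) ** (1.0 - alpha)) / (alpha - 1.0), 10),
          round(H2 - np.log(t - a), 10))       # both differences should be ~0
```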

5. Discrete distribution results based on generalized uncertainty

Let X be a discrete random variable taking values $x_0, x_1, x_2, \ldots, x_n$ with respective probabilities $p_0, p_1, p_2, \ldots, p_n$. Here n may be infinite. The past uncertainty of a discrete lifetime distribution is defined in Nanda and Paul (2005) as
$$
H^*(p;j) = -\sum_{k=0}^{j}\frac{p_k}{P(j)}\,\ln\frac{p_k}{P(j)}, \qquad (5.1)
$$
where $P(j) = \sum_{k=0}^{j} p_k$ is the discrete distribution function of X and $p = (p_0, p_1, p_2, \ldots, p_n)$. Let us define
$$
H^*_{1\alpha}(p;j) = \frac{1}{\alpha-1}\left[1 - \sum_{k=0}^{j}\left(\frac{p_k}{P(j)}\right)^{\alpha}\right] \qquad (5.2)
$$
and
$$
H^*_{2\alpha}(p;j) = \frac{1}{1-\alpha}\,\ln\left[\sum_{k=0}^{j}\left(\frac{p_k}{P(j)}\right)^{\alpha}\right] \qquad (5.3)
$$

as the first kind discrete generalized uncertainty of order $\alpha$ and the second kind discrete generalized uncertainty of order $\alpha$, respectively. It can be noted that, as $\alpha \to 1$, (5.2) and (5.3) reduce to (5.1).

The following lemma (cf. Mitrinović, 1970, p. 282) is useful in proving the upcoming proposition.

Lemma 5.1. If $\lambda \ge 1$, $\lambda + \mu \ge 1$, and $x_k > 0$ for $k = 0, 1, 2, \ldots, j$, then
$$
\sum_{k=0}^{j} x_k^{\lambda}\left(\sum_{i=0}^{k} x_i\right)^{\mu} \le \left(\sum_{k=0}^{j} x_k\right)^{\lambda+\mu}.
$$

Proposition 5.1. For any real numbers $x_k > 0$, $k = 0, 1, 2, \ldots, j$, we have
(i) if $\alpha \ge 1$, then
$$
\sum_{k=0}^{j} x_k^{\alpha} \le \left(\sum_{k=0}^{j} x_k\right)^{\alpha}; \qquad (5.4)
$$
(ii) if $\alpha \le 1$, then
$$
\sum_{k=0}^{j} x_k^{\alpha} \ge \left(\sum_{k=0}^{j} x_k\right)^{\alpha}.
$$

Proof. (i) follows from Lemma 5.1 by putting $\mu = 0$. To prove (ii), we replace $\alpha$ by $1/\alpha$ in (5.4), so that we have, for $\alpha \le 1$ and $x_k > 0$, $k = 0, 1, 2, \ldots, j$,
$$
\sum_{k=0}^{j} x_k^{1/\alpha} \le \left(\sum_{k=0}^{j} x_k\right)^{1/\alpha},
$$
which gives
$$
\left(\sum_{k=0}^{j} x_k^{1/\alpha}\right)^{\alpha} \le \sum_{k=0}^{j} x_k.
$$
Now the result follows by substituting $x_k^{\alpha}$ in place of $x_k$. Hence the proposition follows. $\square$

Thus, from Proposition 5.1, we see that $H^*_{1\alpha}(p;j)$ and $H^*_{2\alpha}(p;j)$ are nonnegative for all j and all $\alpha > 0$.

It is very important to note that there exists no discrete distribution for which $H^*(p;j)$, $H^*_{1\alpha}(p;j)$ or $H^*_{2\alpha}(p;j)$ is decreasing in j. This is because, at $j = 0$, all of them are zero, and each of $H^*(p;j)$, $H^*_{1\alpha}(p;j)$ and $H^*_{2\alpha}(p;j)$ is nonnegative for all j. Thus, each of these functions is nonnegative with value zero at $j = 0$, and hence cannot be decreasing in j.

Throughout this section we write $H^*_{1\alpha}(p;j) = H_{1\alpha}(j)$ and $H^*_{2\alpha}(p;j) = H_{2\alpha}(j)$. The following theorem characterizes the discrete uniform distribution by increasing first kind discrete uncertainty of order $\alpha$.

Theorem 5.1. A random variable X with support $\{0, 1, 2, \ldots, n\}$ has the discrete uniform distribution if and only if $H_{1\alpha}(j) = \dfrac{1}{\alpha-1}\left[1 - (j+1)^{1-\alpha}\right]$, for $j = 0, 1, 2, \ldots, n$.

Proof. The 'only if' part is straightforward. To prove the 'if' part, suppose that, for $j = 0, 1, 2, \ldots, n$, $H_{1\alpha}(j) = \dfrac{1}{\alpha-1}[1 - (j+1)^{1-\alpha}]$. This gives
$$
\sum_{k=0}^{j} p_k^{\alpha} = P^{\alpha}(j)\,(j+1)^{1-\alpha}. \qquad (5.5)
$$
Replacing j by $j+1$ in (5.5), we have
$$
\sum_{k=0}^{j+1} p_k^{\alpha} = P^{\alpha}(j+1)\,(j+2)^{1-\alpha}. \qquad (5.6)
$$
Subtracting (5.5) from (5.6) and writing $p_{j+1} = P(j+1) - P(j)$, we have
$$
\left[P(j+1) - P(j)\right]^{\alpha} = P^{\alpha}(j+1)\,(j+2)^{1-\alpha} - P^{\alpha}(j)\,(j+1)^{1-\alpha},
$$
which is equivalent to
$$
(1 - \theta_j)^{\alpha} = (j+2)^{1-\alpha} - \theta_j^{\alpha}\,(j+1)^{1-\alpha},
$$
where $\theta_j = P(j)/P(j+1) \in (0, 1)$. It can be noted that, for each fixed j, $x = \theta_j$ is a solution of $\psi_j(x) = 0$, where
$$
\psi_j(x) = (1-x)^{\alpha} - (j+2)^{1-\alpha} + x^{\alpha}(j+1)^{1-\alpha}. \qquad (5.7)
$$


Differentiating both sides of (5.7) with respect to x, we get
$$
\psi_j'(x) = -\alpha(1-x)^{\alpha-1} + \alpha x^{\alpha-1}(j+1)^{1-\alpha}.
$$
Thus, $\psi_j'(x) = 0$ gives $x = (j+1)/(j+2) = t_j$ (say). Again, note that $\psi_j(t_j) = 0$.
Case 1: Let $\alpha > 1$. Then, from (5.7), we get $\psi_j(0) \ge 0$ and $\psi_j(1) \ge 0$, since $H_{1\alpha}(j)$ is increasing in j. Again,
$$
\psi_j'(x) \begin{cases} < 0 & \text{if } x < t_j,\\ = 0 & \text{if } x = t_j,\\ > 0 & \text{if } x > t_j. \end{cases}
$$
So $\psi_j(x) = 0$ has the unique solution $x = t_j$, since $\psi_j(t_j) = 0$.
Case 2: Let $0 < \alpha < 1$. Here $\psi_j(0) \le 0$ and $\psi_j(1) \le 0$. One can easily verify that
$$
\psi_j'(x) \begin{cases} > 0 & \text{if } x < t_j,\\ = 0 & \text{if } x = t_j,\\ < 0 & \text{if } x > t_j, \end{cases}
$$
which gives that $\psi_j(x) = 0$ has the unique solution $x = t_j$, since $\psi_j(t_j) = 0$.
Combining both cases, we get that $\psi_j(x) = 0$ has the unique solution $x = t_j$. Again, since $\theta_j$ is a solution of $\psi_j(x) = 0$, we have $\theta_j = t_j = (j+1)/(j+2)$. Thus, for $j = 0, 1, 2, \ldots, n$,
$$
\frac{P(j)}{P(j+1)} = \frac{j+1}{j+2}.
$$
This gives $P(j) = (j+1)\,p_{j+1}$, i.e.
$$
\sum_{k=0}^{j} p_k = (j+1)\,p_{j+1} \qquad \text{for } j = 0, 1, 2, \ldots, n.
$$
Substituting $j = 0, 1, 2, \ldots, n$ in the above expression, we get $p_0 = p_1 = p_2 = \cdots = p_n$. This gives that X has the discrete uniform distribution with support $\{0, 1, 2, \ldots, n\}$. Hence the theorem is established. $\square$

The following theorem characterizes the discrete uniform distribution by increasing second kind discrete uncertainty of order $\alpha$. The proof is similar to that of Theorem 5.1 and is not detailed here.

Theorem 5.2. A random variable X with support $\{0, 1, 2, \ldots, n\}$ has the discrete uniform distribution if and only if $H_{2\alpha}(j) = \ln(j+1)$, for $j = 0, 1, 2, \ldots, n$.

Remark 5.1. In general, $H_{1\alpha}(j)$ or $H_{2\alpha}(j)$ does not uniquely determine p. For example, if X has the Bernoulli distribution $B(p)$ with $P(X=1) = p$, then one can verify that
$$
H_{1\alpha}(j) = \begin{cases} 0 & \text{if } j = 0,\\[6pt] \dfrac{1}{\alpha-1}\left[1 - p^{\alpha} - (1-p)^{\alpha}\right] & \text{if } j = 1, \end{cases}
$$


and
$$
H_{2\alpha}(j) = \begin{cases} 0 & \text{if } j = 0,\\[6pt] \dfrac{1}{1-\alpha}\,\ln\left[p^{\alpha} + (1-p)^{\alpha}\right] & \text{if } j = 1. \end{cases}
$$
Let us take $\alpha = 2$ and $p \neq \tfrac{1}{2}$. Then one can verify that $H_{1\alpha}(j)$ [resp. $H_{2\alpha}(j)$] is increasing in j, and that $H_{1\alpha}(j)$ [resp. $H_{2\alpha}(j)$] corresponds to either $B(p)$ or $B(1-p)$.
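The discrete quantities (5.2) and (5.3) are finite sums and can be evaluated directly. The sketch below (n, $\alpha$ and p are assumed values) checks the characterizing expressions of Theorems 5.1 and 5.2 for the discrete uniform distribution and evaluates the Bernoulli case of Remark 5.1, for which $B(p)$ and $B(1-p)$ give identical values.

```python
# Sketch for Section 5 (n, alpha, p are assumed): discrete generalized past
# entropies (5.2)-(5.3) for the discrete uniform and Bernoulli distributions.
import numpy as np

def H1_disc(p, j, alpha):                      # first kind discrete uncertainty, eq. (5.2)
    P = np.sum(p[: j + 1])
    return (1.0 - np.sum((p[: j + 1] / P) ** alpha)) / (alpha - 1.0)

def H2_disc(p, j, alpha):                      # second kind discrete uncertainty, eq. (5.3)
    P = np.sum(p[: j + 1])
    return np.log(np.sum((p[: j + 1] / P) ** alpha)) / (1.0 - alpha)

n, alpha = 5, 2.0
unif = np.full(n + 1, 1.0 / (n + 1))           # discrete uniform on {0, ..., n}
for j in range(n + 1):
    print(j,
          np.isclose(H1_disc(unif, j, alpha), (1.0 - (j + 1) ** (1.0 - alpha)) / (alpha - 1.0)),
          np.isclose(H2_disc(unif, j, alpha), np.log(j + 1.0)))   # True, True (Theorems 5.1, 5.2)

p = 0.3                                        # Remark 5.1: B(p) and B(1-p) are not distinguished
bern = lambda q: np.array([1.0 - q, q])
print(H1_disc(bern(p), 1, alpha), H1_disc(bern(1.0 - p), 1, alpha))  # identical values
```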

Acknowledgements

We sincerely thank an anonymous referee for his/her careful reading of the manuscript and for some thought-provoking questions. His/her valuable comments have led us to clarify quite a few points and to correct some mistakes, improving the quality of the paper substantially.

References

Arimoto, S., 1971. Information-theoretical considerations on estimation problems. Inform. Control 19, 181-194.
Belzunce, F., Navarro, J., Ruiz, J.M., Aguila, Y., 2004. Some results on residual entropy function. Metrika 59, 147-161.
Di Crescenzo, A., Longobardi, M., 2002. Entropy-based measure of uncertainty in past lifetime distributions. J. Appl. Probab. 39, 434-440.
Ebrahimi, N., 1996. How to measure uncertainty about residual lifetime. Sankhyā A 58, 48-57.
Ebrahimi, N., Kirmani, S.N.U.A., 1996a. Some results on ordering of survival functions through uncertainty. Statist. Probab. Lett. 29, 167-176.
Ebrahimi, N., Kirmani, S.N.U.A., 1996b. A characterization of the proportional hazards model through a measure of discrimination between two residual life distributions. Biometrika 83 (1), 233-235.
Ebrahimi, N., Pellerey, F., 1995. New partial ordering of survival functions based on the notion of uncertainty. J. Appl. Probab. 32, 202-211.
Ferreri, C., 1980. Hypoentropy and related heterogeneity divergence measures. Statistica 40, 55-118.
Gupta, R.D., Nanda, A.K., 2002. α- and β-entropies and relative entropies of distributions. J. Statist. Theory Appl. 1 (3), 177-190.
Havrda, J., Charvát, F., 1967. Quantification method of classification process: concept of structural α-entropy. Kybernetika 3, 30-35.
Khinchin, A.J., 1957. Mathematical Foundations of Information Theory. Dover, New York.
Mitrinović, D.S., 1970. Analytic Inequalities. Springer, Berlin, New York.
Nanda, A.K., Paul, P., 2005. Some properties of past entropy and their applications. Metrika, to appear.
Rényi, A., 1961. On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547-561.
Shannon, C.E., 1948. A mathematical theory of communication. Bell System Tech. J. 27, 379-423.
Sharma, B.D., Mittal, P., 1977. New non-additive measures of relative information. J. Combin. Inform. System Sci. 2, 122-133.
Sharma, B.D., Taneja, I.J., 1975. Entropy of type (α, β) and other generalized measures in information theory. Metrika 22, 205-215.
Taneja, I.J., 1975. A study of generalized measures in information theory. Ph.D. Thesis, University of Delhi.
Taneja, I.J., 1990. On generalized entropy with applications. In: Ricciardi, L.M. (Ed.), Lectures in Applied Mathematics and Informatics. Manchester University Press, Manchester, pp. 107-169.
Varma, R.S., 1966. Generalizations of Rényi's entropy of order α. J. Math. Sci. 1, 34-48.
Wiener, N., 1961. Cybernetics, second ed. MIT Press, Wiley, New York.