An extended Lindley distribution


Journal of the Korean Statistical Society 41 (2012) 75–85


Journal of the Korean Statistical Society journal homepage: www.elsevier.com/locate/jkss

Hassan S. Bakouch a,∗, Bander M. Al-Zahrani a, Ali A. Al-Shomrani a, Vitor A.A. Marchi b, Francisco Louzada c

a Statistics Department, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
b Department of Statistics, Universidade Federal de São Carlos, P.O. Box 676, São Carlos, 13.565-905, Brazil
c Department of Applied Mathematics and Statistics, Universidade de São Paulo, P.O. Box 668, São Carlos, 13.566-590, Brazil

Article info

Article history: Received 28 February 2011; Accepted 13 June 2011; Available online 12 July 2011

AMS 2000 subject classifications: Primary 62E15; Secondary 62E20

Keywords: Distributional properties; Inactivity time; Lindley distribution; Lifetime data; Unimodal failure rate; Bathtub failure rate

Abstract

In this paper we introduce an extension of the Lindley distribution which offers a more flexible model for lifetime data. Several statistical properties of the distribution are explored, such as the density, (reversed) failure rate, (reversed) mean residual lifetime, moments, order statistics, Bonferroni and Lorenz curves. Estimation using the maximum likelihood and inference of a random sample from the distribution are investigated. A real data application illustrates the performance of the distribution. © 2011 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

1. Introduction

In many applied sciences such as medicine, engineering and finance, amongst others, modeling and analyzing lifetime data are crucial. Several lifetime distributions have been used to model such data, for instance the exponential, Weibull, gamma and Rayleigh distributions and their generalizations (see, e.g., Gupta & Kundu, 1999 and Nadarajah & Kotz, 2006). Each distribution has its own characteristics, due specifically to the shape of its failure rate function, which may be monotonically decreasing, monotonically increasing or constant, or non-monotone, being bathtub shaped or unimodal. Here we consider the Lindley distribution, which was introduced by Lindley (1958), with density function

$$f(x, \lambda) = \frac{\lambda^{2}}{1+\lambda}(1+x)e^{-\lambda x}, \qquad x > 0,\ \lambda > 0,$$

and distribution function

$$F(x, \lambda) = 1 - \frac{1+\lambda+\lambda x}{1+\lambda}\,e^{-\lambda x}. \tag{1}$$

∗ Corresponding address: Department of Mathematics, Faculty of Science, Tanta University, Tanta, Egypt. E-mail address: [email protected] (H.S. Bakouch).

1226-3192/$ – see front matter © 2011 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.jkss.2011.06.002

Although the Lindley distribution has drawn far less attention in the statistical literature than the well-known exponential distribution, both have closed-form density and distribution functions and, according to Hussain (2006), the Lindley distribution is important for studying stress–strength reliability modeling. Besides, some researchers have proposed new classes of distributions based on modifications of the Lindley distribution and studied their properties. The main idea is always to embed existing distributions in more flexible structures. Sankaran (1970) introduced the discrete Poisson–Lindley distribution by combining the Poisson and Lindley distributions. Ghitany, Atieh, and Nadarajah (2008) investigated most of the statistical properties of the Lindley distribution, showing that it may provide a better fit than the exponential distribution. Mahmoudi and Zakerzadeh (2010) proposed an extended version of the compound Poisson distribution, obtained by compounding the Poisson distribution with the generalized Lindley distribution. Louzada, Roman, and Cancho (2011) proposed the complementary exponential geometric distribution by combining the geometric and the exponential distributions.

The aim of this paper is to introduce an extension of the Lindley distribution which offers a more flexible distribution for modeling lifetime data, namely in reliability, in terms of its failure rate shapes. The new distribution can accommodate both decreasing and increasing failure rates, as its predecessors do, as well as unimodal and bathtub shaped failure rates.

We extend the Lindley distribution by exponentiation. Several authors have considered extensions of usual survival distributions by following this idea. For instance, Mudholkar, Srivastava, and Freimer (1995) considered the exponentiated Weibull distribution as a generalization of the Weibull distribution, Gupta and Kundu (1999) introduced the exponentiated exponential distribution as a generalization of the usual exponential distribution, Nadarajah and Kotz (2006) proposed exponentiated type distributions extending the Fréchet, gamma, Gumbel and Weibull distributions, and Barriga, Louzada-Neto, and Cancho (2011) proposed the complementary exponential power distribution by exponentiating the exponential power distribution proposed by Smith and Bain (1975). Hence, we extend the Lindley distribution via its distribution function, by considering a particular exponentiation of (1) given by

$$F(x) = 1 - \left(\frac{1+\lambda+\lambda x}{1+\lambda}\right)^{\alpha} e^{-(\lambda x)^{\beta}}, \tag{2}$$

where α ∈ R− ∪ {0, 1}, λ > 0 and β ≥ 0. Hereafter, the extension of the Lindley distribution shall be denoted by the extended Lindley (EL) distribution.

The motivation for the proposed EL distribution given by (2) arises from its ability to model lifetime data with increasing, decreasing, unimodal and bathtub shaped failure rates. This flexibility comes from the parameters α and β, as we shall see in the next section. By virtue of this, the model represents a good alternative to the popular gamma and Weibull lifetime distributions, which cannot exhibit unimodal or bathtub shaped failure rates. Also, we shall see later that the parameter λ can be interpreted as an upper bound on the failure rate function, which is an important characteristic for lifetime models. Not many lifetime distributions have their parameters directly interpretable in terms of their hazard rate functions, such as the Weibull. Another attractive feature of the EL distribution is that it has closed form expressions for its cumulative distribution function and failure rate function, which is not the case, for instance, for the gamma distribution.

Moreover, the EL distribution has several particular cases. For α = 1 and β = 1 the EL distribution reduces to the Lindley distribution (Lindley, 1958), and for α = 0 it reduces to the Weibull distribution. Also, for β = 0 and α ∈ R−, the EL distribution reduces to a Pareto distribution given by $F(x) = 1 - \left(\frac{c}{c+x}\right)^{\delta}$, where x > 0, c = 1 + 1/λ and δ > 0. Moreover, for β = 1 and α ∈ R−, the distribution function defined by (2) reduces to the tapered (or generalized) Pareto distribution with $F(x) = 1 - \left(\frac{c}{c+x}\right)^{\delta} e^{-\lambda x}$, where x > 0, c = 1 + 1/λ and δ > 0. This distribution has been discussed by Kagan and Schoenberg (2001) and used to model the sizes of earthquakes. Besides, Eq. (2) represents the product of the survival functions (1 − F(x)) of the Lomax (Pareto II) and Weibull distributions, respectively, for any β and α ∈ R− (Murthy, Swartz, & Yuen, 1973). That is, the EL distribution can be seen as a mixture of a Lomax and a Weibull distribution for such α and β.

Throughout the paper we shall use the following equations:







$$\int_{b}^{\infty} x^{c}(1+\lambda+\lambda x)^{\alpha}(\lambda x)^{a}e^{-(\lambda x)^{\beta}}\,dx = \frac{1}{\lambda^{1+c}\beta}\sum_{j=0}^{\infty}\binom{\alpha}{j}(1+\lambda)^{\alpha-j}\,\Gamma\!\left(\frac{1+a+c+j}{\beta},\,(\lambda b)^{\beta}\right), \tag{3}$$

where Γ(s, b) is the upper incomplete gamma function given by $\Gamma(s, b) = \int_{b}^{\infty} t^{s-1}e^{-t}\,dt$, and

$$\int_{0}^{b} (1+\lambda+\lambda x)^{\alpha}(\lambda x)^{a}e^{-(\lambda x)^{\beta}}\,dx = \frac{1}{\lambda\beta}\sum_{i=0}^{\infty}\binom{\alpha}{i}(1+\lambda)^{\alpha-i}\,\gamma\!\left(\frac{1+i+a}{\beta},\,(\lambda b)^{\beta}\right), \tag{4}$$

where γ(α, b) is the lower incomplete gamma function given by $\gamma(\alpha, b) = \int_{0}^{b} t^{\alpha-1}e^{-t}\,dt = \sum_{k=0}^{\infty}\frac{(-1)^{k}}{k!}\,\frac{b^{\alpha+k}}{\alpha+k}$.

The remainder of this paper is organized as follows. Various statistical and reliability measures of the EL distribution are explored in Section 2. Estimation of the distribution parameters by maximum likelihood and inference of a random sample from the EL distribution are investigated in Section 3. In Section 4 a real data application illustrates the performance of the EL distribution over other competing reliability distributions. Section 5, with some concluding remarks, ends the paper.

[Figure 1: six panels of the probability density function versus x. Panel titles: β = 0.5, λ = 5.0; β = 1.0, λ = 5.0; β = 5.0, λ = 5.0; β = 10.0, λ = 0.01; β = 100.0, λ = 0.01; α = −4, β = 1.5, λ = 0.001. Legends: α = −1.0, −3.0, −5.0 or α = −1.0, −10.0, −100.0.]

Fig. 1. Plot of the probability density EL function, Upper Panels: versus x for selected parameters α and β, divided by 11.74671, 8.333771 and 8.375672, respectively; Lower Panels: versus x for selected α, β and λ.

2. Statistical and reliability measures

In this section, we give some important statistical and reliability measures for the EL distribution.

2.1. Density, survival and failure rate functions

The density and survival functions associated with (2) are given by

$$f(x) = \frac{\lambda(1+\lambda+\lambda x)^{\alpha-1}}{(1+\lambda)^{\alpha}}\left(\beta(1+\lambda+\lambda x)(\lambda x)^{\beta-1} - \alpha\right)e^{-(\lambda x)^{\beta}}, \qquad x > 0, \tag{5}$$

$$\bar{F}(x) = \left(\frac{1+\lambda+\lambda x}{1+\lambda}\right)^{\alpha} e^{-(\lambda x)^{\beta}}, \qquad x > 0. \tag{6}$$

Density function plots of the EL distribution are displayed in Fig. 1 for different values of α, λ and β. From (5) and (6), the failure (or hazard) rate function is given by

$$r(x) = \frac{\beta(1+\lambda+\lambda x)\lambda^{\beta}x^{\beta-1} - \lambda\alpha}{1+\lambda+\lambda x}. \tag{7}$$

The first derivative of r(x) is

$$r'(x) = \beta(\beta-1)\lambda^{\beta}x^{\beta-2} + \frac{\lambda^{2}\alpha}{(1+\lambda+\lambda x)^{2}}.$$

It is obvious that r′(x) ≤ 0 for β ≤ 1 and α ≤ 0. More generally, the function r(x) is increasing for α > k and decreasing for α < k, where $k = -\beta(\beta-1)(\lambda x)^{\beta-2}(1+\lambda+\lambda x)^{2}$. Besides, for some combinations of parameter values the failure rate function is unimodal or bathtub shaped. Therefore, one concludes that the shape parameters α and β influence the shape of the failure rate function. Fig. 2 shows some shapes of the failure rate function for different values of α, β and λ.

For β > 1, r(0) = f(0) = −λα/(1 + λ). Therefore, at the origin r(x) varies continuously with the parameters. This is in contrast with the Weibull and gamma distribution families, where r(0) = 0 or r(0) = ∞ and hence r(0) is discontinuous in the parameters of such families.
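The relations among (5), (6), (7) and the derivative of r(x) can be cross-checked numerically. The sketch below is illustrative only: the function names and the parameter values are assumptions, and the finite-difference step is an arbitrary choice.

```python
import math


def el_pdf(x, alpha, beta, lam):
    """Density (5) of the EL distribution for x > 0."""
    t = 1.0 + lam + lam * x
    u = lam * x
    bracket = beta * t * u ** (beta - 1.0) - alpha
    return lam * t ** (alpha - 1.0) / (1.0 + lam) ** alpha * bracket * math.exp(-(u ** beta))


def el_survival(x, alpha, beta, lam):
    """Survival function (6)."""
    t = 1.0 + lam + lam * x
    return (t / (1.0 + lam)) ** alpha * math.exp(-((lam * x) ** beta))


def el_hazard(x, alpha, beta, lam):
    """Failure rate (7), written as two separate terms."""
    return beta * lam ** beta * x ** (beta - 1.0) - lam * alpha / (1.0 + lam + lam * x)


def el_hazard_slope(x, alpha, beta, lam):
    """First derivative r'(x) of the failure rate, valid for x > 0."""
    t = 1.0 + lam + lam * x
    return beta * (beta - 1.0) * lam ** beta * x ** (beta - 2.0) + lam ** 2 * alpha / t ** 2


alpha, beta, lam = -3.0, 0.5, 5.0  # illustrative values
for x in (0.05, 0.1, 0.3):
    # (7) agrees with the ratio of (5) to (6).
    assert math.isclose(el_hazard(x, alpha, beta, lam),
                        el_pdf(x, alpha, beta, lam) / el_survival(x, alpha, beta, lam))
    # The closed-form slope agrees with a central finite difference.
    h = 1e-6
    numeric = (el_hazard(x + h, alpha, beta, lam) - el_hazard(x - h, alpha, beta, lam)) / (2 * h)
    assert math.isclose(el_hazard_slope(x, alpha, beta, lam), numeric, rel_tol=1e-5)
    # beta <= 1 and alpha <= 0 give a decreasing failure rate.
    assert el_hazard_slope(x, alpha, beta, lam) <= 0.0

# For beta > 1 the failure rate at the origin equals -lam * alpha / (1 + lam).
assert math.isclose(el_hazard(0.0, -4.0, 1.5, 0.001), 0.004 / 1.001)
```

Writing the hazard as the sum of a Weibull-type term and a Lomax-type term makes the role of each shape parameter easier to see than the single fraction in (7).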

[Figure 2: six panels of the hazard rate function versus x. Panel titles: β = 0.5, λ = 5.0; β = 1.0, λ = 5.0; β = 5.0, λ = 5.0; β = 10.0, λ = 0.01; β = 100.0, λ = 0.01; α = −4, β = 1.5, λ = 0.001. Legends: α = −1.0, −3.0, −5.0 or α = −1.0, −10.0, −100.0.]

Fig. 2. Plot of the hazard rate EL function, Upper Panels: versus x for selected parameters α and β; Lower Panel: unimodal and bathtub shaped.

For β = 1, $\lim_{x\to\infty} r(x) = \lambda$; the function r(x) is bounded above by λ and continuous in the parameters of the EL distribution, like the gamma distribution but unlike the Weibull one. Also, for this case, the EL distribution is reduced to an exponential distribution.

2.2. Quantiles, moments and order statistics

The pth quantile x_p of the EL distribution, defined by F(x_p) = p, satisfies

$$x_{p} = \left[\frac{\alpha}{\lambda^{\beta}}\,\ln\!\left(\frac{1+\lambda+\lambda x_{p}}{(1+\lambda)(1-p)^{1/\alpha}}\right)\right]^{1/\beta}.$$

Many of the interesting characteristics and features of a distribution can be studied through its moments. Let X be a random variable following the EL distribution with parameters α, λ and β. Expressions for the mathematical expectation, variance and the rth moment about the origin of X can be obtained using the well-known formula

$$E[X^{r}] = r\int_{0}^{\infty} x^{r-1}\bar{F}(x)\,dx.$$

Hence, it follows that

$$E[X^{r}] = \frac{r}{\beta\lambda^{r}}\sum_{j=0}^{\infty}\binom{\alpha}{j}\frac{\Gamma\!\left(\frac{r+j}{\beta}\right)}{(1+\lambda)^{j}}.$$

In particular, α = 1 implies

$$E[X] = \mu = \frac{(1+\lambda)\Gamma(1/\beta) + \Gamma(2/\beta)}{\lambda\beta(1+\lambda)}.$$
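Because the quantile equation is implicit in x_p, it has to be solved numerically. The sketch below is one possible approach, not the authors' procedure: it uses bisection on F(x) = p, and it evaluates the moment series only for α in {0, 1}, where the binomial sum is finite. The helper names, tolerances and the Simpson integration limits are assumptions.

```python
import math


def el_survival(x, alpha, beta, lam):
    """Survival function (6) of the EL distribution."""
    t = 1.0 + lam + lam * x
    return (t / (1.0 + lam)) ** alpha * math.exp(-((lam * x) ** beta))


def el_quantile(p, alpha, beta, lam, tol=1e-12):
    """Solve F(x_p) = p by bisection; assumes F is increasing on (0, inf)."""
    lo, hi = 0.0, 1.0
    while 1.0 - el_survival(hi, alpha, beta, lam) < p:
        hi *= 2.0  # expand the bracket until F(hi) >= p
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if 1.0 - el_survival(mid, alpha, beta, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def el_moment(r, alpha, beta, lam):
    """rth raw moment from the series, for integer alpha in {0, 1} (finite sum)."""
    total = sum(math.comb(alpha, j) * math.gamma((r + j) / beta) / (1.0 + lam) ** j
                for j in range(alpha + 1))
    return r / (beta * lam ** r) * total


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


# The bisection root satisfies the implicit quantile equation.
alpha, beta, lam, p = -2.0, 1.5, 0.5, 0.5
x_p = el_quantile(p, alpha, beta, lam)
inner = (1.0 + lam + lam * x_p) / ((1.0 + lam) * (1.0 - p) ** (1.0 / alpha))
assert math.isclose(x_p, (alpha / lam ** beta * math.log(inner)) ** (1.0 / beta), rel_tol=1e-8)

# For alpha = beta = 1 the mean equals (2 + lam) / (lam * (1 + lam)).
assert math.isclose(el_moment(1, 1, 1.0, lam), (2.0 + lam) / (lam * (1.0 + lam)))
# The same mean is recovered by integrating the survival function.
numeric_mean = simpson(lambda x: el_survival(x, 1, 1.0, lam), 0.0, 200.0)
assert math.isclose(numeric_mean, el_moment(1, 1, 1.0, lam), rel_tol=1e-6)
```

Bisection is used here only because it needs nothing beyond monotonicity of F; any bracketing root finder would serve equally well.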

Moreover, for α = β = 1 we get μ = (2 + λ)/(λ(1 + λ)), which is the mean of the original Lindley distribution.

Now, we discuss some properties of the order statistics of the EL distribution. Order statistics are among the most fundamental tools in non-parametric statistics and inference. Let X_1, ..., X_n be a random sample taken from the EL distribution and let X_{1:n}, ..., X_{n:n} denote the corresponding order statistics. Then, the pdf f_{i:n}(x) of the ith order statistic X_{i:n} is given by

$$f_{i:n}(x) = \frac{n!\,\lambda\left(\beta(1+\lambda+\lambda x)(\lambda x)^{\beta-1}-\alpha\right)(1+\lambda+\lambda x)^{(n-i+1)\alpha-1}}{(n-i)!\,(i-1)!\,(1+\lambda)^{n\alpha}}\left[(1+\lambda)^{\alpha} - (1+\lambda+\lambda x)^{\alpha}e^{-(\lambda x)^{\beta}}\right]^{i-1} e^{-(n-i+1)(\lambda x)^{\beta}}.$$

The rth moment of the ith order statistic X_{i:n} can be obtained from the known result

$$E[X_{i:n}^{r}] = r\sum_{k=n-i+1}^{n}(-1)^{k-n+i-1}\binom{k-1}{n-i}\binom{n}{k}\int_{0}^{\infty} x^{r-1}[\bar{F}(x)]^{k}\,dx;$$

substituting $\bar{F}(x)$ given by (6) implies

$$E[X_{i:n}^{r}] = \frac{r}{\beta\lambda^{r}}\sum_{k=n-i+1}^{n}(-1)^{k-n+i-1}\binom{k-1}{n-i}\binom{n}{k}\sum_{j=0}^{\infty}\binom{k\alpha}{j}\frac{\Gamma\!\left(\frac{r+j}{\beta}\right)}{k^{(r+j)/\beta}(1+\lambda)^{j}}.$$

2.3. Residual life and reversed failure rate function

Given that a component survives up to time t ≥ 0, the residual life is the period beyond t until the time of failure, defined by the conditional random variable X − t | X > t. In reliability, it is well known that the mean residual life function and the ratio of two consecutive moments of residual life determine the distribution uniquely (Gupta & Gupta, 1983). Therefore, we obtain the rth-order moment of the residual life via the general formula

$$\mu_{r}(t) = E\left((X-t)^{r}\mid X>t\right) = \frac{1}{\bar{F}(t)}\int_{t}^{\infty} r(x-t)^{r-1}\bar{F}(x)\,dx.$$

Applying Eq. (3) and the binomial expansion of (x − t)^{r−1}, and substituting $\bar{F}(\cdot)$ given by (6) into the formula above, gives

$$\mu_{r}(t) = \frac{r(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}}}{\beta\lambda^{r}(1+\lambda+\lambda t)^{\alpha}}\sum_{j=0}^{r-1}\binom{r-1}{j}(-1)^{j}(\lambda t)^{j}\sum_{k=0}^{\infty}\binom{\alpha}{k}\frac{\Gamma\!\left(\frac{r-j+k}{\beta},\,(\lambda t)^{\beta}\right)}{(1+\lambda)^{k}}, \qquad r \ge 1.$$

Thus, we obtain the mean residual life of the EL distribution as

$$\mu_{1}(t) = \mu(t) = \frac{e^{(\lambda t)^{\beta}}}{\lambda\beta(1+\lambda+\lambda t)^{\alpha}}\sum_{k=0}^{\infty}\binom{\alpha}{k}(1+\lambda)^{\alpha-k}\,\Gamma\!\left(\frac{1+k}{\beta},\,(\lambda t)^{\beta}\right).$$

In particular, for α = 1, we obtain

$$\mu(0) = E(X) = \frac{(1+\lambda)\Gamma(1/\beta) + \Gamma(2/\beta)}{\lambda\beta(1+\lambda)}.$$

Also, if α = β = 1, then

$$\mu(t) = \frac{2+\lambda+\lambda t}{\lambda(1+\lambda+\lambda t)},$$

which is the mean residual lifetime function of the original Lindley distribution. Moreover, the second moment of the residual life of the EL distribution is

$$\mu_{2}(t) = \frac{2e^{(\lambda t)^{\beta}}}{\beta\lambda^{2}(1+\lambda+\lambda t)^{\alpha}}\sum_{k=0}^{\infty}\binom{\alpha}{k}(1+\lambda)^{\alpha-k}\left[\Gamma\!\left(\frac{2+k}{\beta},\,(\lambda t)^{\beta}\right) - \lambda t\,\Gamma\!\left(\frac{1+k}{\beta},\,(\lambda t)^{\beta}\right)\right].$$

The variance of the residual life of the EL distribution can be obtained easily using μ_2(t) and μ(t), and consequently its coefficient of variation.

On the other hand, we analogously discuss the reversed residual life and some of its properties. The reversed residual life can be defined as the conditional random variable t − X | X ≤ t, which denotes the time elapsed from the failure of a component given that its life is less than or equal to t. This random variable may also be called the inactivity time (or time since failure); for more details see Kundu and Nanda (2010) and Nanda, Singh, Misra, and Paul (2003). Also, in reliability, the mean reversed residual life and the ratio of two consecutive moments of the reversed residual life characterize the distribution uniquely. Using (2) and (5), the reversed failure (or reversed hazard) rate function is given by

$$h(x) = \frac{f(x)}{F(x)} = \frac{\lambda(1+\lambda+\lambda x)^{\alpha-1}\left(\beta(1+\lambda+\lambda x)(\lambda x)^{\beta-1}-\alpha\right)}{(1+\lambda)^{\alpha}e^{(\lambda x)^{\beta}} - (1+\lambda+\lambda x)^{\alpha}}.$$

Note that h(0) = ∞, so h(0) is discontinuous in the parameters of the EL distribution. Fig. 3 shows some shapes of the reversed failure rate function for different values of α, β and λ.

[Figure 3: six panels of the reversed hazard rate function versus x. Panel titles: β = 0.5, λ = 5.0; β = 1.0, λ = 5.0; β = 5.0, λ = 5.0; β = 10.0, λ = 0.01; β = 100.0, λ = 0.01; α = −4, β = 1.5, λ = 0.001. Legends: α = −1.0, −3.0, −5.0 or α = −1.0, −10.0, −100.0.]

Fig. 3. Plot of the reversed hazard rate EL function for selected parameters α, β and λ.

The rth-order moment of the reversed residual life can be obtained by the well known formula

$$m_{r}(t) = E\left((t-X)^{r}\mid X\le t\right) = \frac{1}{F(t)}\int_{0}^{t} r(t-x)^{r-1}F(x)\,dx;$$

hence,

$$m_{r}(t) = \frac{r\,t^{r-1}(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}}}{(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}} - (1+\lambda+\lambda t)^{\alpha}}\sum_{k=0}^{r-1}\binom{r-1}{k}\left(-\frac{1}{t}\right)^{k}\left\{\frac{t^{1+k}}{1+k} - \frac{1}{\beta\lambda^{1+k}}\sum_{i=0}^{\infty}\binom{\alpha}{i}\frac{\gamma\!\left(\frac{1+i+k}{\beta},\,(\lambda t)^{\beta}\right)}{(1+\lambda)^{i}}\right\}, \qquad r \ge 1.$$

Thus, the mean of the reversed residual life of the EL distribution is given by

$$m_{1}(t) = \frac{(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}}}{(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}} - (1+\lambda+\lambda t)^{\alpha}}\left\{t - \frac{1}{\lambda\beta}\sum_{i=0}^{\infty}\binom{\alpha}{i}\frac{\gamma\!\left(\frac{1+i}{\beta},\,(\lambda t)^{\beta}\right)}{(1+\lambda)^{i}}\right\},$$

and the second moment of the reversed residual life of the EL distribution is given by

$$m_{2}(t) = \frac{t(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}}}{(1+\lambda)^{\alpha}e^{(\lambda t)^{\beta}} - (1+\lambda+\lambda t)^{\alpha}}\left\{t + \frac{2}{\lambda\beta}\sum_{i=0}^{\infty}\binom{\alpha}{i}\frac{1}{(1+\lambda)^{i}}\left[\frac{\gamma\!\left(\frac{2+i}{\beta},\,(\lambda t)^{\beta}\right)}{\lambda t} - \gamma\!\left(\frac{1+i}{\beta},\,(\lambda t)^{\beta}\right)\right]\right\}.$$

Using m_1(t) and m_2(t) we obtain the variance of the reversed residual life of the EL distribution, and hence its coefficient of variation.
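The mean residual life and the mean inactivity time can also be evaluated by direct numerical integration of their defining formulas, which gives a simple check on the series expressions. The sketch below is an illustration under stated assumptions: the helper names, the truncation point of the infinite integral and the Simpson grid are arbitrary choices, and the identity m_1(t) = (t − μ + S(t)μ(t))/F(t) used in the last check is a standard relation that is not stated in the paper.

```python
import math


def el_survival(x, alpha, beta, lam):
    """Survival function (6) of the EL distribution."""
    t = 1.0 + lam + lam * x
    return (t / (1.0 + lam)) ** alpha * math.exp(-((lam * x) ** beta))


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def mean_residual_life(t, alpha, beta, lam, upper=400.0):
    """mu(t) = (1 / S(t)) * integral of S over (t, inf), truncated at `upper`."""
    integral = simpson(lambda x: el_survival(x, alpha, beta, lam), t, upper)
    return integral / el_survival(t, alpha, beta, lam)


def mean_inactivity_time(t, alpha, beta, lam):
    """m_1(t) = (1 / F(t)) * integral of F over (0, t)."""
    def cdf(x):
        return 1.0 - el_survival(x, alpha, beta, lam)
    return simpson(cdf, 0.0, t) / cdf(t)


lam, t = 0.5, 2.0  # illustrative values
# Lindley case alpha = beta = 1: closed-form mean residual life.
closed_form = (2.0 + lam + lam * t) / (lam * (1.0 + lam + lam * t))
mrl = mean_residual_life(t, 1, 1.0, lam)
assert math.isclose(mrl, closed_form, rel_tol=1e-6)

# Cross-check of the mean inactivity time against the mean residual life.
mean = (2.0 + lam) / (lam * (1.0 + lam))
surv = el_survival(t, 1, 1.0, lam)
expected_m1 = (t - mean + surv * closed_form) / (1.0 - surv)
assert math.isclose(mean_inactivity_time(t, 1, 1.0, lam), expected_m1, rel_tol=1e-6)
```

The truncation at `upper` is harmless here because the survival function decays exponentially; for small λ the limit would need to be increased accordingly.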

2.4. Bonferroni and Lorenz curves

The Bonferroni and Lorenz curves and the Gini index have many applications, not only in economics to study income and poverty, but also in other fields like reliability, medicine and insurance. The Bonferroni curve B_F[F(x)] is given by

$$B_{F}[F(x)] = \frac{1}{\mu F(x)}\int_{0}^{x} u f(u)\,du,$$

or equivalently by

$$B_{F}(p) = \frac{1}{\mu p}\int_{0}^{p} F^{-1}(t)\,dt,$$

where p = F(x) and F^{−1}(t) = inf{x : F(x) ≥ t}. From the relationship between the Bonferroni curve and the mean residual lifetime given by Theorem 2.1 of Pundir, Arora, and Jain (2005), the Bonferroni curve of the distribution function F of the EL distribution is given by

$$B_{F}[F(x)] = \frac{1}{1-\left(\frac{1+\lambda+\lambda x}{1+\lambda}\right)^{\alpha}e^{-(\lambda x)^{\beta}}}\left[1 - \frac{1}{\lambda\mu\beta}\sum_{j=0}^{\infty}\binom{\alpha}{j}\frac{\Gamma\!\left(\frac{1+j}{\beta},\,(\lambda x)^{\beta}\right)}{(1+\lambda)^{j}} - \frac{x}{\mu}\left(\frac{1+\lambda+\lambda x}{1+\lambda}\right)^{\alpha}e^{-(\lambda x)^{\beta}}\right].$$

Also, the Lorenz curve of F that follows the EL distribution can be obtained via the expression L_F[F(x)] = B_F[F(x)]F(x). The scaled total time and cumulative total time on test transform of a distribution function F (Pundir et al., 2005) are defined by

$$S_{F}[F(t)] = \frac{1}{\mu}\int_{0}^{t}\bar{F}(u)\,du$$

and

$$C_{F} = \int_{0}^{1} S_{F}[F(t)]f(t)\,dt,$$

respectively. If F(t) is the EL distribution function specified by (2), then using formula (4),

$$S_{F}[F(t)] = \frac{1}{\lambda\mu}\sum_{j=0}^{\infty}\binom{\alpha}{j}\frac{1}{(1+\lambda)^{j}}\sum_{k=0}^{\infty}\frac{(-1)^{k}(\lambda t)^{1+k\beta+j}}{k!\,(1+k\beta+j)}$$

and

$$C_{F} = \frac{1}{\lambda\mu}\sum_{j=0}^{\infty}\binom{\alpha}{j}\sum_{k=0}^{\infty}\frac{(-1)^{k}}{(1+\lambda)^{j}\,k!\,(1+k\beta+j)}\left\{\sum_{i=0}^{\infty}\binom{\alpha}{i}\frac{\gamma\!\left(\frac{1+i+j+\beta+k\beta}{\beta},\,\lambda^{\beta}\right)}{(1+\lambda)^{i}} - \frac{\alpha}{\beta}\sum_{i=0}^{\infty}\binom{\alpha-1}{i}\frac{\gamma\!\left(\frac{2+i+j+k\beta}{\beta},\,\lambda^{\beta}\right)}{(1+\lambda)^{1+i}}\right\}.$$

The Gini index can be obtained from the relationship G = 1 − C_F.

2.5. Mean deviations

The amount of scatter in a population can be measured by the totality of deviations from the mean and median. For a random variable X with pdf f(x), distribution function F(x), mean μ = E(X) and M = Median(X), the mean deviation about the mean and the mean deviation about the median are defined, respectively, by

$$\delta_{1}(X) = \int_{0}^{\infty}|x-\mu|f(x)\,dx = 2\mu F(\mu) - 2\mu + 2\int_{\mu}^{\infty} x f(x)\,dx$$

and

$$\delta_{2}(X) = \int_{0}^{\infty}|x-M|f(x)\,dx = 2MF(M) - M - \mu + 2\int_{M}^{\infty} x f(x)\,dx.$$

If X is an EL random variable with density (5), then using formula (3),

$$\delta_{1}(X) = 2\mu F(\mu) - 2\mu + 2L(\mu)$$

and

$$\delta_{2}(X) = 2MF(M) - M - \mu + 2L(M),$$

where

$$L(b) = \frac{1}{\lambda}\sum_{j=0}^{\infty}\binom{\alpha}{j}\frac{\Gamma\!\left(\frac{1+\beta+j}{\beta},\,(\lambda b)^{\beta}\right)}{(1+\lambda)^{j}} - \frac{\alpha}{\lambda\beta}\sum_{j=0}^{\infty}\binom{\alpha-1}{j}\frac{\Gamma\!\left(\frac{2+j}{\beta},\,(\lambda b)^{\beta}\right)}{(1+\lambda)^{1+j}}.$$

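3. Estimation and inference

In this section we consider maximum likelihood estimation and provide expressions for the associated observed Fisher information matrix. Assuming the lifetimes are independently distributed, the maximum likelihood estimates (MLEs) of the parameters are obtained by direct maximization of the log-likelihood function, which follows from taking logarithms of the density (5):

$$\log L(\alpha,\beta,\lambda) = n\left(\log\lambda - \alpha\log(1+\lambda)\right) + (\alpha-1)\sum_{i=1}^{n}\log(1+\lambda+\lambda x_{i}) + \sum_{i=1}^{n}\log\!\left(\beta(1+\lambda+\lambda x_{i})(\lambda x_{i})^{\beta-1}-\alpha\right) - \sum_{i=1}^{n}(\lambda x_{i})^{\beta}. \tag{8}$$

It follows that the maximum likelihood estimators, say $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\lambda}$, are the simultaneous solutions of the equations

$$\sum_{i=1}^{n}\left(\beta(1+\lambda+\lambda x_{i})(\lambda x_{i})^{\beta-1}-\alpha\right)^{-1} = -n\log(1+\lambda) + \sum_{i=1}^{n}\log(1+\lambda+\lambda x_{i}),$$

$$\sum_{i=1}^{n}\frac{(1+\lambda+\lambda x_{i})(\lambda x_{i})^{\beta-1}\left(1+\beta\log(\lambda x_{i})\right)}{\beta(1+\lambda+\lambda x_{i})(\lambda x_{i})^{\beta-1}-\alpha} = \sum_{i=1}^{n}(\lambda x_{i})^{\beta}\log(\lambda x_{i})$$

and

$$\beta\lambda^{-1}\sum_{i=1}^{n}(\lambda x_{i})^{\beta} = n\left(\frac{1}{\lambda}-\frac{\alpha}{1+\lambda}\right) + \sum_{i=1}^{n}\frac{(\alpha-1)(1+x_{i})}{1+\lambda+\lambda x_{i}} + \sum_{i=1}^{n}\frac{\beta(\lambda x_{i})^{\beta-1}\left(1+x_{i}+(1+\lambda+\lambda x_{i})\lambda^{-1}(\beta-1)\right)}{\beta(1+\lambda+\lambda x_{i})(\lambda x_{i})^{\beta-1}-\alpha}.$$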
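In practice the log-likelihood is maximized numerically. The following is a minimal sketch of the objective function only, computed as the sum of the logarithms of the density (5); the function names and the sample values are made up for illustration, and the Weibull parameterization used in the check is the α = 0 case of the EL density.

```python
import math


def el_log_pdf(x, alpha, beta, lam):
    """Logarithm of the EL density (5); returns -inf where (5) is not positive."""
    t = 1.0 + lam + lam * x
    u = lam * x
    bracket = beta * t * u ** (beta - 1.0) - alpha
    if bracket <= 0.0:
        return float("-inf")
    return (math.log(lam) - alpha * math.log(1.0 + lam) + (alpha - 1.0) * math.log(t)
            + math.log(bracket) - u ** beta)


def el_log_likelihood(data, alpha, beta, lam):
    """Log-likelihood of an independent sample under the EL distribution."""
    return sum(el_log_pdf(x, alpha, beta, lam) for x in data)


def weibull_log_likelihood(data, beta, lam):
    """Weibull log-likelihood with density beta*lam*(lam*x)**(beta-1)*exp(-(lam*x)**beta)."""
    return sum(math.log(beta * lam) + (beta - 1.0) * math.log(lam * x) - (lam * x) ** beta
               for x in data)


# Made-up lifetimes, for illustration only.
sample = [12.0, 45.0, 80.0, 150.0, 210.0, 365.0]

# With alpha = 0 the EL log-likelihood coincides with the Weibull one.
assert math.isclose(el_log_likelihood(sample, 0.0, 1.4, 0.005),
                    weibull_log_likelihood(sample, 1.4, 0.005))

# Two candidate parameter vectors can be compared through their log-likelihoods.
print(el_log_likelihood(sample, -1.0, 1.4, 0.005))
print(el_log_likelihood(sample, -1.0, 3.0, 0.003))
```

Returning −inf where the bracketed factor of (5) is not positive keeps a numerical optimizer away from parameter values at which the density is not valid.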

For interval estimation of (α, β, λ) we consider the observed Fisher information matrix given by

$$I_{F}(\alpha,\beta,\lambda) = -\left.\begin{pmatrix} I_{\alpha\alpha} & I_{\alpha\beta} & I_{\alpha\lambda} \\ I_{\beta\alpha} & I_{\beta\beta} & I_{\beta\lambda} \\ I_{\lambda\alpha} & I_{\lambda\beta} & I_{\lambda\lambda} \end{pmatrix}\right|_{(\alpha,\beta,\lambda)=(\hat{\alpha},\hat{\beta},\hat{\lambda})}, \tag{9}$$

where the elements of the matrix I_F(α, β, λ) are given in the Appendix. Under conditions that are fulfilled for the parameters α, β and λ in the interior of the parameter space, the asymptotic distribution of $(\hat{\alpha}-\alpha,\ \hat{\beta}-\beta,\ \hat{\lambda}-\lambda)$, as n → ∞, is trivariate normal with zero mean and variance–covariance matrix $I_{F}^{-1}(\alpha,\beta,\lambda)$.

4. Data analysis

In this section we illustrate the applicability of the EL distribution to a sociological problem by considering a real dataset on the recidivistic performance of individuals within a correctional release program. The data are presented in Table 1 of Stollmack and Harris (1974) and consist of 61 observed recidivism failure times (in days) of individuals released directly from correctional institutions to parole in the District of Columbia, USA.

We fit the EL distribution to the real dataset and compare its fit with some usual survival distributions, namely: the Weibull distribution with pdf $f(x) = (\beta/\lambda)(x/\lambda)^{\beta-1}e^{-(x/\lambda)^{\beta}}$, where β is the shape parameter and λ is the scale parameter; the gamma distribution with pdf $f(x) = x^{r-1}e^{-x/s}/(s^{r}\,\Gamma(r))$, with shape parameter r and scale parameter s; the exponentiated exponential (Exp. Exp.) distribution (Gupta & Kundu, 1999) with pdf $f(x) = \alpha\lambda(1-e^{-\lambda x})^{\alpha-1}e^{-\lambda x}$, where α is the shape parameter and λ is the scale parameter; the complementary exponential geometric (CEG) distribution (Louzada et al., 2011) with pdf $f(x) = \alpha\beta e^{-\alpha x}\left(e^{-\alpha x}(1-\beta)+\beta\right)^{-2}$, where α > 0 is a scale parameter and 0 < β < 1 is a shape parameter; and the Modified Weibull (MW) distribution (Lai, Xie, & Murthy, 2003) with pdf $f(x) = \alpha x^{\beta-1}(\beta+\lambda x)e^{\lambda x}\exp\{-\alpha x^{\beta}e^{\lambda x}\}$, where α, β ≥ 0 and λ > 0.

In order to compare distributions we consider the −LOG values, where $\mathrm{LOG} = \log L(\hat{\alpha},\hat{\beta},\hat{\lambda})$, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are defined, respectively, by −2LOG + 2q and −2LOG + q log(n), where $(\hat{\alpha},\hat{\beta},\hat{\lambda})$ is the vector of MLEs, q is the number of parameters estimated and n is the sample size. The best distribution corresponds to the lowest −LOG, AIC and BIC values.

Table 1 shows the values of the AIC, BIC and −LOG, and also the Kolmogorov–Smirnov (K–S) statistics with their p-values. Table 2 shows the parameter MLEs for each of the six fitted distributions. The values of the AIC, BIC, −LOG and the K–S statistic with their p-values in Table 1 indicate that the EL distribution is a strong competitor to other distributions commonly used in the literature for fitting lifetime data, and moreover gives the best fit according to the AIC, BIC, −LOG and K–S criteria.

These conclusions are corroborated by the fitted pdf, survival and hazard rate functions of the EL, Weibull, gamma, Exp. Exp., MW and CEG distributions superimposed on the Kaplan–Meier fit (see Fig. 4). We observe a clear difference between the fitted curves, which is a strong motivation for choosing the most suitable distribution for fitting the data. The EL pdf, survival and hazard rate functions give the most adequate fit.

[Figure 4: three panels with legends EL, Weibull, Gamma, Exp. Exp., CEG and MW: fitted densities versus t, estimated S(t) versus time, and smooth hazard rate versus time.]

Fig. 4. Fitted density functions on the histogram, and fitted survival and hazard rate functions on the Kaplan–Meier estimate (irregular line).

Also, this has been investigated by means of quantile–quantile plots. Fig. 5 presents the QQ-plots for the considered dataset for all distributions considered in this paper. A QQ-plot consists of plots of the observed quantiles, Q(j), j = 1, 2, ..., n, against the quantiles predicted by the fitted model. From the above results, it is evident that the EL distribution is the best distribution for fitting the dataset compared to the other distributions considered here.
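The information criteria defined above can be recomputed from the reported −LOG values. The sketch below does this with the −LOG values of Table 1 and n = 61; the parameter counts q are read off the parameter columns of Table 2, and the variable and function names are illustrative.

```python
import math

N_OBS = 61  # number of observed recidivism failure times

# Model name -> (-LOG value from Table 1, number of estimated parameters q).
FITS = {
    "EL": (379.5907, 3),
    "Weibull": (383.7480, 2),
    "Gamma": (385.4857, 2),
    "Exp. Exp.": (385.8358, 2),
    "CEG": (381.9845, 2),
    "MW": (393.9610, 3),
}


def aic(neg_log_lik, q):
    """Akaike information criterion, -2 log L + 2 q."""
    return 2.0 * neg_log_lik + 2.0 * q


def bic(neg_log_lik, q, n):
    """Bayesian information criterion, -2 log L + q log(n)."""
    return 2.0 * neg_log_lik + q * math.log(n)


for model, (neg_ll, q) in FITS.items():
    print(f"{model:10s} AIC = {aic(neg_ll, q):.4f}  BIC = {bic(neg_ll, q, N_OBS):.4f}")

# The model with the smallest AIC is the EL distribution.
best = min(FITS, key=lambda m: aic(*FITS[m]))
assert best == "EL"
```

The recomputed values agree with the AIC and BIC columns of Table 1 up to rounding in the last reported digit.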
[Figure 5: six QQ-plot panels (EL, Weibull, Gamma, Exp. Exp., CEG, MW), each plotting quantiles of the input sample against quantiles of the fitted distribution.]

Fig. 5. QQ plot for the dataset.

Table 1
Comparison criterion.

Model        AIC        BIC        −LOG       K–S statistic   p-value
EL           765.1814   771.5141   379.5907   0.0441          0.9998
Weibull      771.4959   775.7177   383.7480   0.0770          0.8627
Gamma        774.9714   779.1931   385.4857   0.0968          0.6173
Exp. Exp.    775.6715   779.8933   385.8358   0.0993          0.5842
CEG          767.9690   772.1907   381.9845   0.0746          0.8861
MW           793.9221   800.2547   393.9610   0.1109          0.4411

Table 2
Parameter MLEs.

Model        α          β         λ          r (scale)   s (shape)
EL           −1.2418    3.2274    0.0027     –           –
Weibull      –          1.3785    229.3306   –           –
Gamma        –          –         –          0.0068      1.4340
Exp. Exp.    1.4076     –         0.0058     –           –
CEG          0.0092     0.2088    –          –           –
MW           0.1352     0.0300    0.0071     –           –

5. Concluding remarks

In this paper we introduce an extension of the Lindley distribution. The new distribution is much more flexible than its predecessor, the Lindley distribution, presenting decreasing, increasing, unimodal and bathtub shaped failure rates. We provide statistical properties of the EL distribution including reliability measures: the density, (reversed) failure rate, (reversed) mean residual lifetime, moments, quantiles, order statistics, and Bonferroni and Lorenz curves. Estimation via maximum likelihood is straightforward. We also derived the observed information matrix. A real data application of the EL distribution shows that it could provide a better fit than a set of usual statistical distributions considered in lifetime data analysis.

Acknowledgments

The authors thank the editor and referees for their important criticism and comments which led to improvement of the manuscript. This Project was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under grant no. 3-091/430. The authors, therefore, acknowledge with thanks DSR support for Scientific Research. Vitor A.A. Marchi and Francisco Louzada are supported by Brazilian organizations CAPES and CNPq, respectively.

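Appendix

In this Appendix we give the elements of the observed Fisher information matrix in Eq. (9). From Eq. (8) we obtain:

$$I_{\alpha\alpha} = -\sum_{i=1}^{n}(\beta T_{lbi}-\alpha)^{-2},$$

$$I_{\alpha\beta} = I_{\beta\alpha} = \sum_{i=1}^{n}\frac{T_{lbi}+\beta T_{lbi}\log(\lambda x_{i})}{(\beta T_{lbi}-\alpha)^{2}},$$

$$I_{\alpha\lambda} = I_{\lambda\alpha} = -\frac{n}{1+\lambda} + \sum_{i=1}^{n}\frac{1+x_{i}}{T_{li}} + \sum_{i=1}^{n}\frac{\beta T_{bi}\left(1+x_{i}+T_{li}(\beta-1)\lambda^{-1}\right)}{(\beta T_{lbi}-\alpha)^{2}},$$

$$I_{\beta\beta} = \sum_{i=1}^{n}\left[\frac{T_{lbi}\left(2\log(\lambda x_{i})+\beta(\log(\lambda x_{i}))^{2}\right)}{\beta T_{lbi}-\alpha} - \frac{\left(T_{lbi}+\beta T_{lbi}\log(\lambda x_{i})\right)^{2}}{(\beta T_{lbi}-\alpha)^{2}}\right] - \sum_{i=1}^{n}(\lambda x_{i})^{\beta}(\log(\lambda x_{i}))^{2},$$

$$I_{\beta\lambda} = I_{\lambda\beta} = \sum_{i=1}^{n}\left[\frac{\left(T_{bi}(1+x_{i})+T_{lbi}(\beta-1)\lambda^{-1}\right)\left(1+\beta\log(\lambda x_{i})\right)+\beta T_{lbi}\lambda^{-1}}{\beta T_{lbi}-\alpha} - \frac{T_{lbi}\left(1+\beta\log(\lambda x_{i})\right)\left(\beta T_{bi}(1+x_{i})+\beta T_{lbi}(\beta-1)\lambda^{-1}\right)}{(\beta T_{lbi}-\alpha)^{2}} - \frac{(\lambda x_{i})^{\beta}\left(1+\beta\log(\lambda x_{i})\right)}{\lambda}\right],$$

and

$$I_{\lambda\lambda} = n\left(-\frac{1}{\lambda^{2}}+\frac{\alpha}{(1+\lambda)^{2}}\right) - \sum_{i=1}^{n}\frac{(\alpha-1)(1+x_{i})^{2}}{T_{li}^{2}} + \sum_{i=1}^{n}\left[\frac{2\beta(1+x_{i})T_{bi}(\beta-1)\lambda^{-1}+\beta T_{lbi}(\beta-1)(\beta-2)\lambda^{-2}}{\beta T_{lbi}-\alpha} - \frac{(\beta T_{bi})^{2}\left(1+x_{i}+T_{li}(\beta-1)\lambda^{-1}\right)^{2}}{(\beta T_{lbi}-\alpha)^{2}}\right] - \sum_{i=1}^{n}\left((\lambda x_{i})^{\beta}\beta^{2}\lambda^{-2}-\beta(\lambda x_{i})^{\beta}\lambda^{-2}\right),$$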

where $T_{li} = 1+\lambda+\lambda x_{i}$, $T_{bi} = (\lambda x_{i})^{\beta-1}$ and $T_{lbi} = T_{li}T_{bi}$.

References

Barriga, G. D. C., Louzada-Neto, F., & Cancho, V. G. (2011). The complementary exponential power lifetime model. Computational Statistics and Data Analysis, 54(5), 1250–1259.
Ghitany, M. E., Atieh, B., & Nadarajah, S. (2008). Lindley distribution and its application. Mathematics and Computers in Simulation, 78(4), 493–506.
Gupta, P. L., & Gupta, R. C. (1983). On the moments of residual life in reliability and some characterization results. Communications in Statistics - Theory and Methods, 12, 449–461.
Gupta, R. D., & Kundu, D. (1999). Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41(2), 173–188.
Hussain, E. (2006). The non-linear functions of order statistics and their properties in selected probability models. Ph.D. thesis. Department of Statistics, University of Karachi, Pakistan.
Kagan, Y. Y., & Schoenberg, F. (2001). Estimation of the upper cutoff parameter for the tapered Pareto distribution. Journal of Applied Probability, 38, 158–175.
Kundu, C., & Nanda, A. K. (2010). Some reliability properties of the inactivity time. Communications in Statistics - Theory and Methods, 39, 899–911.
Lai, C. D., Xie, M., & Murthy, N. P. (2003). A modified Weibull distribution. IEEE Transactions on Reliability, 52, 33–37.
Lindley, D. V. (1958). Fiducial distributions and Bayes theorem. Journal of the Royal Statistical Society, 20(1), 102–107.
Louzada, F., Roman, M., & Cancho, V. G. (2011). The complementary exponential geometric distribution: model, properties, and a comparison with its counterpart. Computational Statistics & Data Analysis, 55(8), 2516–2524.
Mahmoudi, E., & Zakerzadeh, H. (2010). Generalized Poisson–Lindley distribution. Communications in Statistics - Theory and Methods, 39, 1785–1798.
Mudholkar, G. S., Srivastava, D. K., & Freimer, M. (1995). The exponentiated Weibull family: a reanalysis of the bus-motor-failure data. Technometrics, 37(4), 436–445.
Murthy, V. K., Swartz, G., & Yuen, K. (1973). Realistic models for mortality rates and estimation, I and II. Technical reports. Department of Biostatistics, University of California, Los Angeles, California.
Nadarajah, S., & Kotz, S. (2006). The exponentiated type distributions. Acta Applicandae Mathematicae, 92(2), 97–111.
Nanda, A. K., Singh, H., Misra, N., & Paul, P. (2003). Reliability properties of reversed residual lifetime. Communications in Statistics - Theory and Methods, 32, 2031–2042.
Pundir, S., Arora, S., & Jain, K. (2005). Bonferroni curve and the related statistical inference. Statistics and Probability Letters, 75(2), 140–150.
Sankaran, M. (1970). The discrete Poisson–Lindley distribution. Biometrics, 26, 145–149.
Smith, R. M., & Bain, L. J. (1975). An exponential power life-testing distribution. Communications in Statistics - Theory and Methods, 4, 469–481.
Stollmack, S., & Harris, C. (1974). Failure-rate analysis applied to recidivism data. Operations Research, 22(6), 1192–1205.