Exponentiated generalized linear exponential distribution


Applied Mathematical Modelling 37 (2013) 2838–2849

Contents lists available at SciVerse ScienceDirect

Applied Mathematical Modelling journal homepage: www.elsevier.com/locate/apm

Ammar M. Sarhan a,*, Abd EL-Baset A. Ahmad b, Ibtesam A. Alasbahi b

a Department of Math, Stats & CS, StFX University, Antigonish, NS, Canada B2G 2W5
b Department of Mathematics, Assiut University, Assiut, Egypt

Article info

Article history: Received 1 January 2012. Received in revised form 22 May 2012. Accepted 6 June 2012. Available online 21 June 2012.

Keywords: Distribution theory; Estimation theory; Reliability analysis

Abstract

A new generalization of the linear exponential distribution, called the generalized linear exponential distribution, was recently proposed by Mahmoud and Alam [1]. Another generalization of the linear exponential distribution, named the generalized linear failure rate distribution, was introduced by Sarhan and Kundu [2]. This paper proposes a further generalization of the linear exponential distribution that contains both. We refer to this new generalization as the exponentiated generalized linear exponential distribution. The new distribution is important because, in addition to the above two models, it contains several well-known distributions as special sub-models, such as the exponentiated Weibull distribution. It also provides more flexibility for analyzing complex real data sets. We study some statistical properties of the new distribution and discuss maximum likelihood estimation of its parameters. Three real data sets are analyzed using the new distribution, and the results show that the exponentiated generalized linear exponential distribution can be used quite effectively in analyzing real lifetime data.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

The linear exponential (LE) distribution, which has the exponential and Rayleigh distributions as sub-models, is a very well-known distribution for modeling lifetime data and phenomena with linearly increasing failure rates. However, the LE distribution does not provide a reasonable parametric fit for phenomena with decreasing, nonlinearly increasing, or non-monotone failure rates such as the bathtub shape, which are common in firmware reliability modeling and biological studies; see Lai et al. [3] and Zhang et al. [4]. Bathtub failure rate curves have nearly flat middle portions, and the corresponding densities have a positive antimode. An example of a bathtub-shaped failure rate arises in firmware reliability [4]. Models that exhibit a bathtub-shaped failure rate function are very useful in reliability analysis, particularly in reliability-related decision making and cost analysis (Xie et al. [13]) and in firmware reliability modeling [4].

Many parametric families of distributions with bathtub-shaped failure rates have been constructed in the past two decades. A good review of some of these models is presented by Pham and Lai [5]. Among them is the exponentiated Weibull (EW) distribution, proposed by Mudholkar and Srivastava [6]. More recently, Sarhan and Kundu [2] presented the generalized linear failure rate (GLFR) distribution, and Mahmoud and Alam [1] proposed the generalized linear exponential (GLE) distribution. None of these three distributions (EW, GLFR and GLE) can be derived as a sub-model of another.

In this paper, we introduce a new four-parameter distribution, referred to as the exponentiated generalized linear exponential (EGLE) distribution, in the hope that it will attract applications in disciplines such as survival analysis, reliability and biology. One of the main reasons for introducing this new distribution is that it contains the three distributions mentioned above as sub-models. The fourth parameter makes it more flexible than its sub-models in describing different types of real data. Generally, the EGLE distribution generalizes the GLFR, GLE, EW, generalized exponential (GE) [7], generalized Rayleigh (GR) [8] and LE distributions, among several others. Because of its flexibility in accommodating different forms of the hazard function, the EGLE distribution seems suitable for a variety of problems in fitting survival data. It is not only convenient for fitting data with a bathtub-shaped failure rate but is also suitable for testing the goodness-of-fit of special sub-models such as the EW, GLFR and GLE distributions.

The rest of the paper is organized as follows. In Section 2, we define the EGLE distribution, discuss some special sub-models, and provide its cumulative distribution function (cdf), probability density function (pdf) and hazard function; a formula for generating random samples from the EGLE distribution is also given there. Section 3 discusses some important statistical properties of the EGLE distribution, such as the mode, median, quantiles, ordinary moments and measures of skewness and kurtosis. Section 4 discusses the distribution of the order statistics. Maximum likelihood estimates of the four parameters of the distribution are presented in Section 5. Section 6 provides three applications to real data. Section 7 concludes the paper. The paper also contains an appendix giving technical details.

* Corresponding author: A.M. Sarhan. Permanent address: Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt.
http://dx.doi.org/10.1016/j.apm.2012.06.019

2. The EGLE distribution

A non-negative random variable $X$ has the EGLE distribution with parameter vector $\theta = (a, b, c, d)$, denoted EGLE$(\theta)$ or EGLE$(a, b, c, d)$, if its cdf is

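\[
F(x;\theta) = \left[1 - e^{-\left(ax + \frac{b}{2}x^2\right)^c}\right]^d, \qquad x \ge 0, \tag{2.1}
\]

where $a, b \ge 0$ (such that $a + b > 0$) and $c, d > 0$. The two parameters $a$ and $b$ are scale parameters, while $c$ and $d$ are shape parameters. The pdf of the EGLE distribution is

\[
f(x;\theta) = cd\,(a + bx)\left(ax + \tfrac{b}{2}x^2\right)^{c-1}\left[1 - e^{-\left(ax + \frac{b}{2}x^2\right)^c}\right]^{d-1} e^{-\left(ax + \frac{b}{2}x^2\right)^c}, \qquad x \ge 0. \tag{2.2}
\]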
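As a minimal sketch (not part of the original paper), the cdf (2.1) and pdf (2.2) can be coded directly. The function names `egle_cdf` and `egle_pdf`, the use of NumPy, and the integration grid are assumptions made for illustration. The trapezoidal sum is only a sanity check that the density integrates to approximately one.

```python
import numpy as np

def egle_cdf(x, a, b, c, d):
    """Cdf (2.1) of the EGLE(a, b, c, d) distribution for x >= 0."""
    x = np.asarray(x, dtype=float)
    H = a * x + 0.5 * b * x**2              # a x + (b/2) x^2
    return (-np.expm1(-(H**c))) ** d        # -expm1(-t) equals 1 - exp(-t)

def egle_pdf(x, a, b, c, d):
    """Pdf (2.2) of the EGLE(a, b, c, d) distribution for x > 0."""
    x = np.asarray(x, dtype=float)
    H = a * x + 0.5 * b * x**2
    t = H**c
    return c * d * (a + b * x) * H ** (c - 1) * (-np.expm1(-t)) ** (d - 1) * np.exp(-t)

theta = (0.7, 2.5, 1.5, 2.0)                # one of the parameter vectors shown in Fig. 1
grid = np.linspace(1e-6, 5.0, 200_001)      # illustrative grid, chosen by hand
y = egle_pdf(grid, *theta)
area = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid)))   # trapezoidal rule
print(area, float(egle_cdf(5.0, *theta)))
```

Using `expm1` instead of `1 - exp(...)` is a numerical-stability choice for small arguments and does not change the formulas.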

One advantage of the EGLE distribution is that it has a closed-form cdf, which enables us to generate random numbers from it using the following simple formula:

\[
X = \begin{cases}
-\dfrac{a}{b} + \dfrac{1}{b}\sqrt{a^2 + 2b\left[-\ln\left(1 - U^{1/d}\right)\right]^{1/c}}, & \text{if } b \ne 0,\\[2ex]
\dfrac{1}{a}\left[-\ln\left(1 - U^{1/d}\right)\right]^{1/c}, & \text{if } b = 0,
\end{cases} \tag{2.3}
\]

where $U$ is a uniformly distributed random variable on the interval $(0, 1)$. Formula (2.3) can be used to generate random samples from a wide set of sub-models of the EGLE distribution, such as the exponential, generalized exponential, Rayleigh, generalized Rayleigh, linear exponential (linear failure rate), generalized linear failure rate, generalized linear exponential, Weibull and exponentiated Weibull distributions.

When $d$ is an integer, the EGLE$(\theta)$ distribution can be interpreted as the lifetime distribution of a parallel system consisting of $d$ independent and identical units whose lifetimes follow the GLE$(a, b, c)$ distribution. The proposed distribution generalizes many well-known distributions that show different patterns of the hazard function. Table 1 summarizes some of the more recent sub-models that exhibit a bathtub-shaped hazard function.

The pdf of the EGLE distribution can be written in terms of the cumulative hazard and hazard functions of the LE$(a, b)$ distribution as

\[
f(x;\theta) = dc\, h_{LE}(x)\,[H_{LE}(x)]^{c-1}\, e^{-[H_{LE}(x)]^c}\left[1 - e^{-[H_{LE}(x)]^c}\right]^{d-1}, \qquad x \ge 0, \tag{2.4}
\]

where $H_{LE}(x) = H_{LE}(x; a, b) = ax + \frac{b}{2}x^2$ and $h_{LE}(x) = h_{LE}(x; a, b) = a + bx$ are the cumulative hazard and hazard functions of the LE distribution, respectively. Plots of the EGLE pdf for selected choices of the parameter vector $\theta = (a, b, c, d)$ are given in Fig. 1. From this figure, it is apparent that the pdf of the EGLE distribution can be either unimodal or right skewed.

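Table 1
Some recent sub-models of the EGLE$(a, b, c, d)$ distribution.

| The model | cdf | Special case | Author |
|---|---|---|---|
| EW$(\sigma, c, d)$ | $\left[1 - e^{-(x/\sigma)^c}\right]^d$ | $b = 0$, $a = \sigma^{-1}$ | [6] |
| GLFR$(a, b, d)$ | $\left[1 - e^{-ax - \frac{b}{2}x^2}\right]^d$ | $c = 1$ | [2] |
| GLE$(a, b, c)$ | $1 - e^{-\left(ax + \frac{b}{2}x^2\right)^c}$ | $d = 1$ | [1] |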
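The inversion formula (2.3) can be sketched in a few lines. The function name `egle_rvs`, the seed and the sample size below are illustrative assumptions, not part of the paper. The final lines compare the empirical cdf of the simulated sample with the closed form (2.1) at a single point.

```python
import numpy as np

def egle_rvs(size, a, b, c, d, rng=None):
    """Draw EGLE(a, b, c, d) variates by inverting the cdf, following (2.3)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=size)
    # H is the value that a x + (b/2) x^2 must take: H = [-ln(1 - U^(1/d))]^(1/c)
    H = (-np.log1p(-u ** (1.0 / d))) ** (1.0 / c)
    if b == 0:
        return H / a                         # second branch of (2.3)
    return (-a + np.sqrt(a * a + 2.0 * b * H)) / b

rng = np.random.default_rng(12345)           # arbitrary seed, for reproducibility only
theta = (0.7, 2.5, 1.5, 2.0)
sample = egle_rvs(100_000, *theta, rng=rng)

x0 = 0.5
a, b, c, d = theta
F0 = (1.0 - np.exp(-((a * x0 + 0.5 * b * x0**2) ** c))) ** d   # cdf (2.1) at x0
print(np.mean(sample <= x0), F0)
```

Setting `b = 0` or `c = 1` or `d = 1` in the call gives samples from the corresponding sub-models listed in Table 1.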

Fig. 1. The pdf of the EGLE distribution at different $\theta$. [Plot of $f(x;\theta)$ against $x$ for $\theta = (0.7, 2.5, 1.5, 0.5)$, $(0.7, 2.5, 1.5, 2)$, $(0.7, 2.5, 0.5, 2)$ and $(0.7, 2.5, 0.5, 0.5)$.]

The cdf (2.1) of the EGLE distribution can be constructed in two ways: (i) by raising the cumulative hazard rate function of the LE distribution, $H(x; a, b)$, embedded in the GLFR distribution, to an arbitrary power $c > 0$; or (ii) by raising the cdf of the GLE distribution, $F_{GLE}(x; a, b, c)$, to an arbitrary power $d > 0$. Following (ii), $F_{GLE}(x; a, b, c)$ is the baseline distribution and $F(x;\theta)$ may be referred to as the exponentiated $F_{GLE}(x; a, b, c)$ distribution. The relation between the corresponding probability density functions is $f(x;\theta) = d\,[F_{GLE}(x; a, b, c)]^{d-1} f_{GLE}(x; a, b, c)$. Note that when $d > 1$ the multiplicative factor $d\,[F_{GLE}(x; a, b, c)]^{d-1}$ is greater than one for larger values of $x$ and smaller than one for smaller values of $x$, and the reverse holds when $d < 1$. This immediately implies that the ordinary moments associated with the pdf (2.2) are strictly larger (smaller) than those associated with the pdf $f_{GLE}(x)$ when $d > 1$ ($d < 1$). The hazard function of the EGLE$(\theta)$ distribution is

\[
h(x;\theta) = \frac{cd\,(a + bx)\left(ax + \frac{b}{2}x^2\right)^{c-1}\left[1 - e^{-\left(ax + \frac{b}{2}x^2\right)^c}\right]^{d-1} e^{-\left(ax + \frac{b}{2}x^2\right)^c}}{1 - \left[1 - e^{-\left(ax + \frac{b}{2}x^2\right)^c}\right]^{d}}, \qquad x \ge 0. \tag{2.5}
\]
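The hazard function (2.5) is the pdf (2.2) divided by the survival function $1 - F(x;\theta)$. The sketch below, with the assumed helper names `egle_hazard` and `egle_log_survival`, evaluates (2.5) and checks it against the identity $h(x) = -\frac{d}{dx}\ln[1 - F(x)]$ using a central finite difference. The evaluation point and step size are arbitrary choices.

```python
import numpy as np

def egle_hazard(x, a, b, c, d):
    """Hazard function (2.5) of the EGLE(a, b, c, d) distribution."""
    H = a * x + 0.5 * b * x**2
    t = H**c
    G = -np.expm1(-t)                        # GLE cdf: 1 - exp(-H^c)
    pdf = c * d * (a + b * x) * H ** (c - 1) * G ** (d - 1) * np.exp(-t)
    return pdf / (1.0 - G**d)

def egle_log_survival(x, a, b, c, d):
    """ln[1 - F(x; theta)] with F taken from (2.1)."""
    H = a * x + 0.5 * b * x**2
    G = -np.expm1(-(H**c))
    return np.log1p(-(G**d))

theta = (0.7, 2.5, 1.5, 0.5)                 # one of the parameter vectors shown in Fig. 2
x0, eps = 0.4, 1e-6
finite_diff = -(egle_log_survival(x0 + eps, *theta)
                - egle_log_survival(x0 - eps, *theta)) / (2 * eps)
print(egle_hazard(x0, *theta), finite_diff)
```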

Plots of the hazard function of the EGLE distribution, for selected choices of the parameter vector $\theta = (a, b, c, d)$, are given in Fig. 2. From this figure, it is apparent that the hazard function of the EGLE distribution can be decreasing, increasing, or bathtub-shaped, which makes the distribution more flexible in fitting different lifetime data sets.

Fig. 2. The hazard function of the EGLE distribution at different $\theta$. [Plot of $h(x;\theta)$ against $x$ for $\theta = (0.7, 2.5, 1.5, 0.5)$, $(0.7, 2.5, 0.5, 0.5)$, $(0.7, 2.5, 1.5, 2)$ and $(0.7, 2.5, 0.5, 2)$.]

The pdf (2.2) of the EGLE distribution can be written as a linear combination of pdfs of the GLE distribution. For $d > 0$, a series expansion for $(1 - w)^{d-1}$, for $|w| < 1$, is

\[
(1 - w)^{d-1} = \sum_{j=0}^{\infty} \frac{(-1)^j\,\Gamma(d)}{\Gamma(d - j)\, j!}\, w^j, \tag{2.6}
\]

where $\Gamma(\cdot)$ is the gamma function. Since $e^{-[H_{LE}(x)]^c} < 1$ for $x > 0$, using the series expansion (2.6) in (2.2) yields

\[
f(x;\theta) = \sum_{j=0}^{\infty} \frac{(-1)^j\,\Gamma(d + 1)}{\Gamma(d - j)\,(j + 1)!}\, f_{GLE}\!\left(x;\ (j+1)^{1/c} a,\ (j+1)^{1/c} b,\ c\right). \tag{2.7}
\]
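For a positive integer $d$ the sum in (2.7) is finite (terms $j = 0, \ldots, d - 1$), so the representation can be checked numerically. The following sketch compares the closed-form pdf (2.2) with the right-hand side of (2.7). The helper names and the evaluation grid are assumptions for illustration, and the gamma-function ratio is written with factorials, which is valid only for integer $d$.

```python
import math
import numpy as np

def gle_pdf(x, a, b, c):
    """GLE(a, b, c) density, i.e. (2.2) with d = 1."""
    H = a * x + 0.5 * b * x**2
    return c * (a + b * x) * H ** (c - 1) * np.exp(-(H**c))

def egle_pdf(x, a, b, c, d):
    """Closed-form EGLE density (2.2)."""
    H = a * x + 0.5 * b * x**2
    t = H**c
    return c * d * (a + b * x) * H ** (c - 1) * (-np.expm1(-t)) ** (d - 1) * np.exp(-t)

def egle_pdf_series(x, a, b, c, d):
    """Right-hand side of (2.7) for a positive integer d."""
    total = np.zeros_like(x, dtype=float)
    for j in range(d):
        # Gamma(d + 1) / (Gamma(d - j) (j + 1)!) = d! / ((d - j - 1)! (j + 1)!)
        w = (-1) ** j * math.factorial(d) / (
            math.factorial(d - j - 1) * math.factorial(j + 1))
        s = (j + 1) ** (1.0 / c)             # rescaling applied to a and b
        total += w * gle_pdf(x, s * a, s * b, c)
    return total

x = np.linspace(0.05, 2.0, 40)
a, b, c, d = 0.7, 2.5, 1.5, 3
max_gap = float(np.max(np.abs(egle_pdf(x, a, b, c, d) - egle_pdf_series(x, a, b, c, d))))
print(max_gap)
```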

When $d$ is a positive integer, the index $j$ in (2.7) stops at $d - 1$. The linear combination (2.7) enables us to obtain some mathematical properties of the EGLE distribution directly from those of the GLE distribution, for example the moments, the moment generating function and the characteristic function. Software packages such as MATHEMATICA, MATLAB and MAPLE can be used to compute (2.7) numerically; such packages are currently able to handle formidable analytical expressions.

3. Statistical properties

3.1. Mean, median and mode

As expected, the mean of the EGLE$(a, b, c, d)$ distribution cannot be derived in explicit form. However, it can be expressed as a linear combination of the means of GLE$(a^*, b^*, c)$ distributions with different values of $a^*$ and $b^*$. The general moments of the EGLE$(a, b, c, d)$ distribution are presented later. The quantile $x_q$ of the EGLE$(a, b, c, d)$ distribution is given by

\[
x_q = \frac{1}{b}\left\{-a + \sqrt{a^2 + 2b\left[-\ln\left(1 - q^{1/d}\right)\right]^{1/c}}\right\}. \tag{3.1}
\]

Using (3.1), the median of the EGLE$(a, b, c, d)$ distribution is obtained by setting $q = \frac{1}{2}$. Moreover, the mode of the EGLE$(a, b, c, d)$ distribution can be obtained as a non-negative solution of the following nonlinear equation:

\[
(a + bx)^2\left\{c - 1 + \frac{c(d - 1)\left(ax + \frac{b}{2}x^2\right)^c}{e^{\left(ax + \frac{b}{2}x^2\right)^c} - 1}\right\} + \left(ax + \tfrac{b}{2}x^2\right)\left\{b - c\,(a + bx)^2\left(ax + \tfrac{b}{2}x^2\right)^{c-1}\right\} = 0. \tag{3.2}
\]
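Because the mode equation has no explicit solution in general, its root has to be located numerically. The sketch below writes the equation in the form obtained by setting the derivative of the logarithm of (2.2) to zero and multiplying through by $(a + bx)(ax + \frac{b}{2}x^2)$. It then applies plain bisection. The bracket $[10^{-3}, 3]$, the tolerance and the helper names are assumptions chosen for one unimodal case from Fig. 1. A brute-force grid maximisation of the pdf serves as a cross-check.

```python
import numpy as np

def mode_equation(x, a, b, c, d):
    """Derivative of ln f(x; theta), multiplied by (a + b x) H(x), with H = a x + (b/2) x^2."""
    H = a * x + 0.5 * b * x**2
    h = a + b * x
    t = H**c
    return h**2 * (c - 1 + c * (d - 1) * t / np.expm1(t)) + H * (b - c * h**2 * H ** (c - 1))

def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

theta = (0.7, 2.5, 1.5, 2.0)
mode = bisect(lambda x: mode_equation(x, *theta), 1e-3, 3.0)

# Cross-check: maximise the pdf (2.2) on a fine grid.
a, b, c, d = theta
grid = np.linspace(1e-3, 3.0, 300_001)
Hg = a * grid + 0.5 * b * grid**2
pdf = (c * d * (a + b * grid) * Hg ** (c - 1)
       * (-np.expm1(-(Hg**c))) ** (d - 1) * np.exp(-(Hg**c)))
grid_mode = float(grid[np.argmax(pdf)])
print(mode, grid_mode)
```

Bisection only finds one root inside the bracket, so for parameter values where the density is decreasing or has an antimode the bracket and the interpretation of the root would need to be reconsidered.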

It is not possible to obtain an explicit solution of (3.2) in the general case. Numerical methods such as bisection or fixed-point iteration should be used; see [9]. Explicit forms can be derived in several special cases.

3.2. Moments

Moments are necessary and important in any statistical analysis, especially in applications. They can be used to study the most important features and characteristics of a distribution (e.g., central tendency, dispersion, skewness and kurtosis). Let $X$ be a random variable with density function (2.2). The $k$th ordinary moment of the EGLE distribution is $\mu_k(\theta) = E[X^k] = \int_0^\infty x^k f(x;\theta)\,dx$. Using (2.7), we can derive $\mu_k(\theta)$ as

\[
\mu_k(\theta) = \sum_{j=0}^{\infty} \frac{(-1)^j\,\Gamma(d + 1)}{\Gamma(d - j)\,(j + 1)!}\, \mu_{k,GLE}\!\left((j+1)^{1/c} a,\ (j+1)^{1/c} b,\ c\right), \tag{3.3}
\]

where $\mu_{k,GLE}\!\left((j+1)^{1/c} a, (j+1)^{1/c} b, c\right)$ is the $k$th ordinary moment of a GLE-distributed random variable with parameters $(j+1)^{1/c} a$, $(j+1)^{1/c} b$ and $c$. Using Eq. (9) in [1], we have

\[
\mu_{k,GLE}(a, b, c) = \sum_{i=0}^{k}\sum_{\ell=0}^{\infty} \binom{k}{i}\binom{\frac{1}{2}(k - i)}{\ell} \frac{(-1)^i\, 2^{\frac{k-i}{2} - \ell}\, a^{2\ell + i}}{b^{\frac{k+i}{2} + \ell}}\, \Gamma\!\left(\frac{k - i}{2c} - \frac{\ell}{c} + 1\right). \tag{3.4}
\]

If $d$ is a positive integer, the upper limit of the sum in (3.3) is $d - 1$. This result shows a useful application of the infinite linear combination representation of the EGLE probability density function. Based on the first four ordinary moments of the EGLE distribution, the measures of skewness $\alpha(\theta)$ and kurtosis $\kappa(\theta)$ of the EGLE distribution can be obtained as

\[
\alpha(\theta) = \frac{\mu_3(\theta) - 3\mu_1(\theta)\mu_2(\theta) + 2\mu_1^3(\theta)}{\left[\mu_2(\theta) - \mu_1^2(\theta)\right]^{3/2}}, \tag{3.5}
\]

and

\[
\kappa(\theta) = \frac{\mu_4(\theta) - 4\mu_1(\theta)\mu_3(\theta) + 6\mu_1^2(\theta)\mu_2(\theta) - 3\mu_1^4(\theta)}{\left[\mu_2(\theta) - \mu_1^2(\theta)\right]^{2}}. \tag{3.6}
\]

Fig. 3. Plots of the skewness and kurtosis of the EGLE distribution as a function of $d$ for some values of $c$ ($c = 0.5, 1.0, 1.5$) and $a = 0.7$, $b = 2.5$.
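As an alternative to the series (3.3) and (3.4), the moments can be approximated by direct numerical integration of $x^k f(x;\theta)$, which gives a quick way to evaluate (3.5) and (3.6). The sketch below uses `scipy.integrate.quad`. The helper names are assumptions. The exponential sub-model ($a = 1$, $b = 0$, $c = d = 1$), for which theory gives skewness 2 and kurtosis 9, is used as a check.

```python
import numpy as np
from scipy.integrate import quad

def egle_pdf(x, a, b, c, d):
    """EGLE density (2.2)."""
    H = a * x + 0.5 * b * x**2
    t = H**c
    return c * d * (a + b * x) * H ** (c - 1) * (-np.expm1(-t)) ** (d - 1) * np.exp(-t)

def raw_moment(k, theta):
    """k-th ordinary moment mu_k(theta), by numerical integration of x^k f(x; theta)."""
    value, _ = quad(lambda x: x**k * egle_pdf(x, *theta), 0.0, np.inf)
    return value

def skewness_kurtosis(theta):
    m1, m2, m3, m4 = (raw_moment(k, theta) for k in (1, 2, 3, 4))
    var = m2 - m1**2
    skew = (m3 - 3 * m1 * m2 + 2 * m1**3) / var**1.5                    # (3.5)
    kurt = (m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4) / var**2     # (3.6)
    return skew, kurt

exp_skew, exp_kurt = skewness_kurtosis((1.0, 0.0, 1.0, 1.0))   # exponential sub-model
print(exp_skew, exp_kurt)
print(skewness_kurtosis((0.7, 2.5, 1.5, 2.0)))
```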

Plots of the skewness and kurtosis of the EGLE distribution as a function of $d$ for selected values of $c$ and $a = 0.7$, $b = 2.5$ are given in Fig. 3. It is observed that: (i) $\alpha(\theta)$ and $\kappa(\theta)$ decrease as $d$ increases when $c < 1$; (ii) when $c \ge 1$, $\alpha(\theta)$ and $\kappa(\theta)$ first decrease as $d$ increases and then start increasing.

4. Order statistics

Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the EGLE$(\theta)$ distribution with pdf and cdf given by (2.2) and (2.1), respectively. Let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ denote the order statistics obtained from this sample. We now give the probability density function of $X_{r:n}$, say $f_{r:n}(x;\theta)$, and the moments of $X_{r:n}$, $r = 1, 2, \ldots, n$, from which the measures of skewness and kurtosis of the distribution of $X_{r:n}$ follow. The pdf of $X_{r:n}$ is given by

\[
f_{r:n}(x;\theta) = \frac{1}{B(r, n - r + 1)}\,[F(x;\theta)]^{r-1}\,[1 - F(x;\theta)]^{n-r} f(x;\theta), \tag{4.7}
\]

where $f(x;\theta)$ and $F(x;\theta)$ are the pdf and cdf of the EGLE$(\theta)$ distribution given by (2.2) and (2.1), respectively, and $B(\cdot, \cdot)$ is the beta function. Since $0 < F(x;\theta) < 1$ for $x > 0$, we can use the binomial series expansion of $[1 - F(x;\theta)]^{n-r}$, given by

\[
[1 - F(x;\theta)]^{n-r} = \sum_{j=0}^{n-r} \binom{n-r}{j} (-1)^j\,[F(x;\theta)]^j. \tag{4.8}
\]

Therefore,

\[
f_{r:n}(x;\theta) = \frac{f(x; a, b, c, d)}{B(r, n - r + 1)} \sum_{j=0}^{n-r} \binom{n-r}{j} (-1)^j\,[F(x; a, b, c, d)]^{r+j-1}. \tag{4.9}
\]

Substituting from (2.2) and (2.1) into (4.9), one gets

\[
f_{r:n}(x;\theta) = \sum_{j=0}^{n-r} \frac{(-1)^j\, n!}{j!\,(r-1)!\,(n-r-j)!\,(r+j)}\, f\big(x; a, b, c, (r+j)d\big). \tag{4.10}
\]

Relation (4.10) shows that $f_{r:n}(x;\theta)$ is a weighted sum of exponentiated generalized linear exponential densities with different shape parameters $(r + j)d$. Using (2.7), (3.3) and (3.4), we can express the $k$th moment of the $i$th order statistic $X_{i:n}$ as a linear combination of the $k$th moments of GLE distributions with different parameters. Therefore, the measures of skewness and kurtosis of the distribution of $X_{i:n}$ can be calculated.
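The equivalence of the beta-weighted form (4.7) and the expansion (4.10) can be verified numerically. The sketch below evaluates both on a grid for $n = 5$, $r = 2$. These values, the parameter vector and the function names are illustrative assumptions.

```python
import math
import numpy as np

def egle_cdf(x, a, b, c, d):
    H = a * x + 0.5 * b * x**2
    return (-np.expm1(-(H**c))) ** d

def egle_pdf(x, a, b, c, d):
    H = a * x + 0.5 * b * x**2
    t = H**c
    return c * d * (a + b * x) * H ** (c - 1) * (-np.expm1(-t)) ** (d - 1) * np.exp(-t)

def order_stat_pdf_direct(x, r, n, theta):
    """Form (4.7); 1 / B(r, n - r + 1) = n! / ((r - 1)! (n - r)!)."""
    F, f = egle_cdf(x, *theta), egle_pdf(x, *theta)
    inv_beta = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    return inv_beta * F ** (r - 1) * (1.0 - F) ** (n - r) * f

def order_stat_pdf_expansion(x, r, n, theta):
    """Form (4.10): combination of EGLE densities with shape parameter (r + j) d."""
    a, b, c, d = theta
    total = np.zeros_like(x, dtype=float)
    for j in range(n - r + 1):
        w = (-1) ** j * math.factorial(n) / (
            math.factorial(j) * math.factorial(r - 1)
            * math.factorial(n - r - j) * (r + j))
        total += w * egle_pdf(x, a, b, c, (r + j) * d)
    return total

x = np.linspace(0.05, 2.0, 40)
theta = (0.7, 2.5, 1.5, 0.5)
gap = float(np.max(np.abs(order_stat_pdf_direct(x, 2, 5, theta)
                          - order_stat_pdf_expansion(x, 2, 5, theta))))
print(gap)
```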


5. Estimation and inference

We now discuss estimation of the model parameters by the method of maximum likelihood in the presence of a right-censored sample. In a right-censored sample, it is assumed that $N$ independent and identical units are put on a life test, and the test is terminated when a pre-specified time $T$ is reached. The observations obtained from this test are $N$, $T$, the number $n$ of units that failed before the censoring time $T$, and the lifetimes of these $n$ failed units. Let $x = \{x_1 \le x_2 \le \cdots \le x_n;\ N, n, T\}$ be a right-censored sample when the lifetimes of the tested units follow the EGLE distribution with unknown parameter vector $\theta = (a, b, c, d)$. The log-likelihood function $L(\theta)$ for $\theta$ is

\[
\begin{aligned}
L(\theta) = {} & n \ln d + n \ln c + \sum_{i=1}^{n} \ln(a + bx_i) + (c - 1)\sum_{i=1}^{n} \ln\left(ax_i + \tfrac{b}{2}x_i^2\right) - \sum_{i=1}^{n} \left(ax_i + \tfrac{b}{2}x_i^2\right)^c \\
& + (d - 1)\sum_{i=1}^{n} \ln\left[1 - e^{-\left(ax_i + \frac{b}{2}x_i^2\right)^c}\right] + (N - n)\ln\left\{1 - \left[1 - e^{-\left(aT + \frac{b}{2}T^2\right)^c}\right]^d\right\}.
\end{aligned} \tag{5.11}
\]
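The log-likelihood (5.11) translates directly into code. In the sketch below, the function name `egle_loglik` and the small data set are hypothetical and serve only to illustrate the bookkeeping: each observed failure contributes $\ln f(x_i;\theta)$ and each of the $N - n$ censored units contributes $\ln[1 - F(T;\theta)]$.

```python
import numpy as np

def egle_loglik(theta, x, N, T):
    """Log-likelihood (5.11) for a right-censored sample.

    x : observed failure times (all <= T), N : units on test, T : censoring time.
    """
    a, b, c, d = theta
    x = np.asarray(x, dtype=float)
    n = x.size
    H = a * x + 0.5 * b * x**2
    G = -np.expm1(-(H**c))                   # 1 - exp(-H^c) at each failure time
    ll = n * np.log(d) + n * np.log(c)
    ll += np.sum(np.log(a + b * x)) + (c - 1) * np.sum(np.log(H)) - np.sum(H**c)
    ll += (d - 1) * np.sum(np.log(G))
    if N > n:                                # censored units contribute ln[1 - F(T)]
        HT = a * T + 0.5 * b * T**2
        GT = -np.expm1(-(HT**c))
        ll += (N - n) * np.log1p(-(GT**d))
    return float(ll)

# Hypothetical example: 3 failures observed among N = 5 units, test stopped at T = 1.
theta = (0.7, 2.5, 1.5, 2.0)
x_obs = [0.2, 0.5, 0.9]
print(egle_loglik(theta, x_obs, N=5, T=1.0))
```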

Writing $H(x) = ax + \frac{b}{2}x^2$ for brevity, the components of the score vector $U(\theta) = (U_a, U_b, U_c, U_d)^T$ are

\[
\begin{aligned}
U_a = {} & \sum_{i=1}^{n} \frac{1}{a + bx_i} + (c - 1)\sum_{i=1}^{n} \frac{x_i}{H(x_i)} - c\sum_{i=1}^{n} x_i\,[H(x_i)]^{c-1} + c(d - 1)\sum_{i=1}^{n} \frac{x_i\,[H(x_i)]^{c-1} e^{-[H(x_i)]^c}}{1 - e^{-[H(x_i)]^c}} \\
& - (N - n)\,\frac{T\,dc\,[H(T)]^{c-1} e^{-[H(T)]^c}\left[1 - e^{-[H(T)]^c}\right]^{d-1}}{1 - \left[1 - e^{-[H(T)]^c}\right]^d},
\end{aligned}
\]

\[
\begin{aligned}
U_b = {} & \sum_{i=1}^{n} \frac{x_i}{a + bx_i} + \frac{c - 1}{2}\sum_{i=1}^{n} \frac{x_i^2}{H(x_i)} - \frac{c}{2}\sum_{i=1}^{n} x_i^2\,[H(x_i)]^{c-1} + \frac{c(d - 1)}{2}\sum_{i=1}^{n} \frac{x_i^2\,[H(x_i)]^{c-1} e^{-[H(x_i)]^c}}{1 - e^{-[H(x_i)]^c}} \\
& - \frac{(N - n)}{2}\,\frac{T^2\,dc\,[H(T)]^{c-1} e^{-[H(T)]^c}\left[1 - e^{-[H(T)]^c}\right]^{d-1}}{1 - \left[1 - e^{-[H(T)]^c}\right]^d},
\end{aligned}
\]

\[
\begin{aligned}
U_c = {} & \frac{n}{c} + \sum_{i=1}^{n} \ln H(x_i) - \sum_{i=1}^{n} [H(x_i)]^c \ln H(x_i) + (d - 1)\sum_{i=1}^{n} \frac{[H(x_i)]^c e^{-[H(x_i)]^c} \ln H(x_i)}{1 - e^{-[H(x_i)]^c}} \\
& - (N - n)\,\frac{d\,[H(T)]^c e^{-[H(T)]^c}\left[1 - e^{-[H(T)]^c}\right]^{d-1} \ln H(T)}{1 - \left[1 - e^{-[H(T)]^c}\right]^d},
\end{aligned}
\]

and

\[
U_d = \frac{n}{d} + \sum_{i=1}^{n} \ln\left[1 - e^{-[H(x_i)]^c}\right] - (N - n)\,\frac{\left[1 - e^{-[H(T)]^c}\right]^d \ln\left[1 - e^{-[H(T)]^c}\right]}{1 - \left[1 - e^{-[H(T)]^c}\right]^d}.
\]

Setting these expressions to zero, $U(\theta) = 0$, and solving them simultaneously gives the maximum likelihood estimate (MLE) $\hat\theta = (\hat a, \hat b, \hat c, \hat d)^T$ of the four parameters. This system of four nonlinear equations cannot be solved analytically, and mathematical or statistical software should be used to obtain a numerical solution via iterative techniques such as the Newton–Raphson method.

For asymptotic interval estimation of the four parameters $a$, $b$, $c$ and $d$, we obtain the observed Fisher information matrix. The elements of the $4 \times 4$ observed information matrix $I(\theta) = -\partial^2 L/\partial\theta\,\partial\theta^T$ are given in Appendix A. The multivariate normal $N_4(0, I(\hat\theta)^{-1})$ distribution can be used to construct asymptotic confidence intervals for the parameters. The asymptotic $100(1 - \alpha)\%$ confidence intervals of $a$, $b$, $c$ and $d$ are $\hat a \pm Z_{\alpha/2}\,SE(\hat a)$, $\hat b \pm Z_{\alpha/2}\,SE(\hat b)$, $\hat c \pm Z_{\alpha/2}\,SE(\hat c)$ and $\hat d \pm Z_{\alpha/2}\,SE(\hat d)$, respectively, where $Z_{\alpha/2}$ is the $(1 - \alpha/2)$ quantile of the standard normal distribution and $SE(\cdot)$ is the square root of the diagonal element of $I(\hat\theta)^{-1}$ corresponding to each parameter.

Different goodness-of-fit measures can be applied to test the superiority of the EGLE distribution over other models. In Section 6, we mainly use the Kolmogorov–Smirnov (K–S) test as a nonparametric test and the likelihood ratio (LR) test as a parametric one to illustrate how the EGLE distribution can be compared with the GLE, GLFR, EW, LFR and W distributions in fitting real data sets.
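A minimal sketch of the numerical maximisation follows. It does not implement the Newton–Raphson iteration mentioned in the text. Instead it minimises the negative log-likelihood (5.11) with SciPy's derivative-free Nelder-Mead method, working on the logarithms of the parameters so that all four stay positive. This reparameterisation excludes the boundary cases $a = 0$ and $b = 0$. The simulated data set, the seed, the starting point and the function name are assumptions, and the result may be a local optimum.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(log_theta, x, N, T):
    """Negative of (5.11), parameterised by log(a), log(b), log(c), log(d)."""
    a, b, c, d = np.exp(log_theta)
    with np.errstate(all="ignore"):
        H = a * x + 0.5 * b * x**2
        HT = a * T + 0.5 * b * T**2
        G = -np.expm1(-(H**c))
        GT = -np.expm1(-(HT**c))
        ll = (x.size * np.log(c * d) + np.sum(np.log(a + b * x))
              + (c - 1) * np.sum(np.log(H)) - np.sum(H**c)
              + (d - 1) * np.sum(np.log(G))
              + (N - x.size) * np.log1p(-(GT**d)))
    return -ll if np.isfinite(ll) else np.inf

# Simulated right-censored sample (hypothetical data generated via formula (2.3)).
rng = np.random.default_rng(2012)
a0, b0, c0, d0 = 0.7, 2.5, 1.5, 2.0
u = rng.uniform(size=300)
Hs = (-np.log1p(-u ** (1.0 / d0))) ** (1.0 / c0)
lifetimes = (-a0 + np.sqrt(a0**2 + 2.0 * b0 * Hs)) / b0
T = 1.0
x = np.sort(lifetimes[lifetimes <= T])       # observed failures
N = lifetimes.size                           # units on test

start = np.zeros(4)                          # corresponds to a = b = c = d = 1
res = minimize(neg_loglik, start, args=(x, N, T), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
theta_hat = np.exp(res.x)
print(theta_hat, -res.fun)
```

Standard errors for the confidence intervals described above would additionally require the observed information matrix at `theta_hat`, for example from the expressions in Appendix A or from a numerical Hessian.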


6. Data analysis

In this section, we analyze three real data sets (two complete and one censored) to demonstrate the performance of the EGLE distribution in practice. The first data set is a sample of 50 devices taken from [1,10], which possesses a bathtub-shaped hazard rate. The second data set is a sample of 40 patients taken from Abuammoh et al. [12], which possesses an increasing hazard rate. The third data set is censored and consists of 82 prisoners taken from different places in the Middle East; it possesses a bathtub-shaped hazard rate and is available from the corresponding author.

For each data set, we compare the fits of the EGLE, GLE, GLFR and EW distributions. The main reasons for comparing our new model with the GLE, GLFR and EW models are: (1) our model generalizes these three models; and (2) the hazard function of each of these four models can be increasing, decreasing or bathtub-shaped. Furthermore, we compare the EGLE with the Weibull (W) and linear failure rate (LFR) distributions, in order to see the benefit of introducing the new four-parameter distribution relative to simpler two-parameter distributions.

We test the following five null hypotheses: (i) $H_0: d = 1$, the data follow the GLE$(a, b, c)$ distribution; (ii) $H_0: c = 1$, the data follow the GLFR$(a, b, d)$ distribution; (iii) $H_0: b = 0$, $a = \sigma^{-1}$, the data follow the EW$(\sigma, c, d)$ distribution; (iv) $H_0: b = 0$, $a = \sigma^{-1}$, $d = 1$, the data follow the W$(\sigma, c)$ (Weibull) distribution; and (v) $H_0: c = 1$, $d = 1$, the data follow the LFR$(a, b)$ (linear failure rate) distribution, against the alternative hypothesis $H_a$: the data follow the EGLE$(a, b, c, d)$ distribution. Parametric and nonparametric test statistics are used to test each null hypothesis $H_0$ against $H_a$, namely the Kolmogorov–Smirnov (K–S) and likelihood ratio (LR) test statistics.

6.1. Devices times data

The data set refers to the lifetimes of 50 devices provided by Aarset [10]. Table 2 gives the measurements of the data set. Table 3 shows the MLEs of the parameters of each distribution used, the observed K–S test statistic values and the corresponding p-values for the six models. Fig. 4(a) and (b) give the empirical and parametric survival and hazard functions, respectively, for the devices data set.

From Table 3, based on the p-values associated with the K–S values, we can conclude that: (i) the Weibull distribution is rejected at any level of significance $\alpha \ge 0.005$; (ii) the LFR distribution should be rejected at any level of significance $\alpha \ge 0.04$; (iii) the EW distribution is rejected at $\alpha \ge 0.06$; (iv) both the GLFR and GLE distributions are rejected at $\alpha \ge 0.14$; (v) the EGLE distribution is not rejected at $\alpha \le 0.21$; and (vi) the EGLE distribution is the best distribution among all those used here to fit the data set, in the sense of having the highest p-value.

For parametric comparisons, we have used the likelihood ratio (LR) test statistic, $\Lambda_{H_0} = 2(L_{H_a} - L_{H_0})$, to test the null hypotheses against the alternative mentioned above. In addition, the Akaike Information Criterion (AIC) of Akaike [11] is used to select the best model among several models. The AIC is given by $\mathrm{AIC} = -2L_{Model} + 2p$, where $p$ is the number of parameters of the model. The best model to fit the data is the one with the lowest AIC. Table 4 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC. From the p-values it is clear that we reject all the null hypotheses at level of significance $\alpha \ge 1.9 \times 10^{-3}$. Also, the EGLE distribution has the lowest AIC. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set. This conclusion supports the results obtained from the K–S test mentioned above.
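The likelihood ratio statistic, its p-value and the AIC can be reproduced from the reported log-likelihood values. The sketch below uses the Weibull row of Table 4 and assumes that the tabulated log-likelihoods are negative ($L_{H_0} = -241.002$ and $L_{H_a} = -224.337$). It also assumes the usual chi-square reference distribution with df equal to the number of restricted parameters. The function names are illustrative.

```python
from scipy.stats import chi2

def lr_test(loglik_alt, loglik_null, df):
    """LR statistic 2 (L_Ha - L_H0) and its chi-square p-value."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, float(chi2.sf(stat, df))

def aic(loglik, n_params):
    """AIC = -2 L + 2 p."""
    return -2.0 * loglik + 2 * n_params

# Weibull null hypothesis for the devices data (two restrictions, two free parameters).
stat, p_value = lr_test(loglik_alt=-224.337, loglik_null=-241.002, df=2)
print(stat, p_value, aic(-241.002, 2))
```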

Table 2
(Devices data) Lifetimes of 50 devices [10].

0.1 0.2 1 1 1 1 1 2 3 6 7 11 12 18 18 18 18 18
21 32 36 40 45 46 47 50 55 60 63 63 67 67 67 67 72 75
79 82 82 83 84 84 84 85 85 85 85 85 86 86

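Table 3
The MLEs of the parameters, the K–S values and p-values for the devices data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| W$(\sigma, c)$ | $\hat\sigma = 44.913$, $\hat c = 0.949$ | 0.2397 | 0.0052 |
| LFR$(a, b)$ | $\hat a = 0.014$, $\hat b = 2.4 \times 10^{-4}$ | 0.1955 | 0.0370 |
| EW$(\sigma, c, d)$ | $\hat\sigma = 91.023$, $\hat c = 4.69$, $\hat d = 0.146$ | 0.1841 | 0.0590 |
| GLFR$(a, b, d)$ | $\hat a = 3.822 \times 10^{-3}$, $\hat b = 3.074 \times 10^{-4}$, $\hat d = 0.533$ | 0.1620 | 0.1293 |
| GLE$(a, b, c)$ | $\hat a = 9.621 \times 10^{-3}$, $\hat b = 4.52 \times 10^{-4}$, $\hat c = 0.73$ | 0.1598 | 0.1391 |
| EGLE$(a, b, c, d)$ | $\hat a = 3.307 \times 10^{-3}$, $\hat b = 1.738 \times 10^{-4}$, $\hat c = 4.564$, $\hat d = 0.112$ | 0.1475 | 0.2055 |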

Fig. 4. (a) The empirical and estimated survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for the devices data. (b) Empirical and estimated hazard functions of the EGLE, GLE, GLFR and EW models for the devices data.

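Table 4
$H_0$, the log-likelihood values, the LR test statistics, p-values and AIC values for the devices data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W$(\sigma, c)$ | $b = 0$, $a = \sigma^{-1}$, $d = 1$ | $-241.002$ | 33.330 | 2 | $5.787 \times 10^{-8}$ | 486.004 |
| LFR$(a, b)$ | $c = 1$, $d = 1$ | $-238.064$ | 27.454 | 2 | $1.093 \times 10^{-6}$ | 480.128 |
| GLE$(a, b, c)$ | $d = 1$ | $-235.926$ | 23.178 | 1 | $1.477 \times 10^{-6}$ | 370.240 |
| GLFR$(a, b, d)$ | $c = 1$ | $-233.145$ | 17.616 | 1 | $2.703 \times 10^{-5}$ | 369.173 |
| EW$(\sigma, c, d)$ | $b = 0$, $a = \sigma^{-1}$ | $-229.114$ | 9.554 | 1 | $1.995 \times 10^{-3}$ | 360.464 |
| EGLE$(a, b, c, d)$ | $L_{H_a} = -224.337$ | | | | | 358.502 |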

Table 5
(Leukemia data) Lifetimes of 40 patients suffering from leukemia.

115 181 255 418 441 461 516 739 743 789
807 865 924 983 1024 1062 1063 1165 1191 1222
1222 1251 1277 1290 1357 1369 1408 1455 1478 1549
1578 1578 1599 1603 1605 1696 1735 1799 1815 1852

Fig. 5. (a) The empirical and estimated survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for the leukemia data. (b) Empirical and estimated hazard functions of the EGLE, GLE, GLFR and EW models for the leukemia data.


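Table 6
The MLEs of the parameters, the K–S values and p-values for the leukemia data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| LFR$(a, b)$ | $\hat a = 9.501 \times 10^{-4}$, $\hat b = 4.229 \times 10^{-7}$ | 0.3585 | $4.143 \times 10^{-5}$ |
| W$(\sigma, c)$ | $\hat\sigma = 1143.3$, $\hat c = 1.055$ | 0.2680 | 0.0050 |
| EW$(\sigma, c, d)$ | $\hat\sigma = 734.185$, $\hat c = 1.265$, $\hat d = 2.973$ | 0.1321 | 0.4494 |
| GLFR$(a, b, d)$ | $\hat a = 2.102 \times 10^{-4}$, $\hat b = 1.389 \times 10^{-6}$, $\hat d = 1.553$ | 0.1183 | 0.5884 |
| GLE$(a, b, c)$ | $\hat a = 7.591 \times 10^{-5}$, $\hat b = 1.131 \times 10^{-6}$, $\hat c = 1.260$ | 0.1105 | 0.6727 |
| EGLE$(a, b, c, d)$ | $\hat a = 3.278 \times 10^{-5}$, $\hat b = 7.147 \times 10^{-7}$, $\hat c = 3.410$, $\hat d = 0.333$ | 0.0917 | 0.8591 |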

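Table 7
The MLE, the log-likelihood values, p-values and AIC values for the leukemia data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W$(\sigma, c)$ | $b = 0$, $a = \sigma^{-1}$, $d = 1$ | $-319.874$ | 38.555 | 2 | $4.245 \times 10^{-9}$ | 643.747 |
| LFR$(a, b)$ | $c = 1$, $d = 1$ | $-318.458$ | 35.723 | 2 | $1.749 \times 10^{-8}$ | 640.916 |
| EW$(\sigma, c, d)$ | $b = 0$, $a = \sigma^{-1}$ | $-308.933$ | 16.674 | 1 | $4.438 \times 10^{-5}$ | 623.866 |
| GLFR$(a, b, d)$ | $c = 1$ | $-305.338$ | 9.484 | 1 | $2.072 \times 10^{-3}$ | 616.677 |
| GLE$(a, b, c)$ | $d = 1$ | $-304.111$ | 7.029 | 1 | $8.019 \times 10^{-3}$ | 614.222 |
| EGLE$(a, b, c, d)$ | $L_{H_a} = -300.596$ | | | | | 609.192 |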

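Table 8
The MLEs of the parameters, the associated K–S values and p-values for the drug data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| LFR$(a, b)$ | $\hat a = 2.473 \times 10^{-4}$, $\hat b = 4.002 \times 10^{-6}$ | 0.1512 | 0.0422 |
| W$(\sigma, c)$ | $\hat\sigma = 603.995$, $\hat c = 1.66$ | 0.1421 | 0.0659 |
| GLFR$(a, b, d)$ | $\hat a = 1.24 \times 10^{-4}$, $\hat b = 3.444 \times 10^{-6}$, $\hat d = 0.732$ | 0.1154 | 0.2085 |
| EW$(\sigma, c, d)$ | $\hat\sigma = 858.334$, $\hat c = 1.716$, $\hat d = 0.537$ | 0.1094 | 0.2602 |
| GLE$(a, b, c)$ | $\hat a = 3.445 \times 10^{-5}$, $\hat b = 5.427 \times 10^{-6}$, $\hat c = 0.582$ | 0.0984 | 0.3806 |
| EGLE$(a, b, c, d)$ | $\hat a = 4.686 \times 10^{-4}$, $\hat b = 1.17 \times 10^{-6}$, $\hat c = 3.414$, $\hat d = 0.195$ | 0.0621 | 0.8900 |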

6.2. Leukemia data

Table 5 gives the data set studied by Abuammoh et al. [12], which represents the lifetimes in days of 40 patients suffering from leukemia from one of the Ministry of Health hospitals in Saudi Arabia. Fig. 5(a) shows the empirical and estimated survival functions for every model. Fig. 5(b) gives the empirical and fitted hazard functions of the models that are not rejected for the leukemia data. Fig. 5(b) shows an increasing hazard rate for the data set. Hence, any of the EGLE, GLE, GLFR, EW, W and LFR distributions could be appropriate to fit such data. To see which of these distributions is most appropriate, we calculate the MLEs of the parameters of each model and then use different test statistics to compare them. Table 6 gives the MLEs of the parameters, the K–S statistic and the corresponding p-value for every model.

Based on the p-values associated with the K–S values given in Table 6, we can conclude that: (i) both the W and LFR distributions should be rejected at any level of significance $\alpha > 0.005$; (ii) none of the four models EW, GLFR, GLE and EGLE is rejected at any reasonable level of significance; and (iii) the EGLE distribution is the best distribution among all those used here to fit the data set, in the sense of having the highest p-value.

Table 7 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC for the leukemia data. From the p-values it is clear that we reject all the null hypotheses at level of significance $\alpha \ge 8.019 \times 10^{-3}$. Also, the EGLE distribution has the lowest AIC. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set. This conclusion gives a more accurate comparison than that obtained from the K–S test mentioned above.

6.3. Drug data

The data concern a random sample of 82 prisoners who were imprisoned for drug offences and then released together under a general pardon. During the following 111 weeks, 65 of them were arrested again for either drug abuse or drug dealing. The lifetime data consist of the times at which the prisoners returned to prison after their release. The data were collected from a prison in the Middle East and are available from the corresponding author.

Fig. 6. (a) The empirical and fitted survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for the drug data. (b) The empirical and fitted hazard functions of the EGLE, GLE, GLFR, EW, W and LFR models for the drug data.

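Table 9 The MLE, the log-likelihood values, p-values and AIC values for drug data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W($\sigma, \gamma$) | $b = 0,\ a = \sigma^{-1},\ d = 1$ | -208.908 | 11.803 | 2 | $2.735 \times 10^{-3}$ | 421.815 |
| LFR($a, b$) | $c = 1,\ d = 1$ | -207.872 | 9.732 | 2 | $7.705 \times 10^{-3}$ | 419.744 |
| GLFR($a, b, d$) | $c = 1$ | -205.400 | 4.789 | 1 | 0.029 | 416.801 |
| GLE($a, b, c$) | $d = 1$ | -205.345 | 4.678 | 1 | 0.031 | 416.690 |
| EW($\sigma, \gamma, d$) | $b = 0,\ a = \sigma^{-1}$ | -204.471 | 2.929 | 1 | 0.087 | 414.941 |
| EGLE($a, b, c, d$) | | $L_{H_a} = -203.006$ | | | | 414.012 |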
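The likelihood ratio statistics, p-values and AIC values in Table 9 can be checked from the log-likelihoods alone. The sketch below assumes the usual definitions: the statistic is twice the difference between the log-likelihood of the full model and that of the sub-model, its reference distribution is chi-square with df equal to the difference in the number of parameters, and AIC = 2k - 2L. The log-likelihoods are taken as negative, which is the sign that reproduces the reported AIC values. The function names are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

def lrt_and_aic(loglik_h0, k_h0, loglik_full, k_full):
    """Return (statistic, df, p-value, AIC of the sub-model)."""
    stat = 2.0 * (loglik_full - loglik_h0)
    df = k_full - k_h0
    aic = 2.0 * k_h0 - 2.0 * loglik_h0
    return stat, df, chi2_sf(stat, df), aic

if __name__ == "__main__":
    full = (-203.006, 4)                       # EGLE: log-likelihood, parameters
    sub_models = {
        "W": (-208.908, 2),
        "LFR": (-207.872, 2),
        "GLFR": (-205.400, 3),
        "GLE": (-205.345, 3),
        "EW": (-204.471, 3),
    }
    for name, (loglik, k) in sub_models.items():
        print(name, lrt_and_aic(loglik, k, *full))
```

Small differences in the last digit relative to the table are expected, because the table was presumably computed from unrounded log-likelihoods.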

Table 8 shows the MLEs of the parameters of every distribution used, the observed K–S test statistics and the corresponding p-values for the six models. Fig. 6(a) gives the empirical and parametric survival functions for the drug data set. Fig. 6(b) shows the empirical and parametric hazard functions for the drug data set. From Fig. 6(b), the data show a bathtub-shaped hazard; it is therefore expected that one of the EW, GLE, GLFR and EGLE distributions might be appropriate to fit it. Based on the p-values associated with the K–S statistics in Table 8, we conclude that: (i) the LFR distribution should be rejected at any level of significance $\alpha \ge 0.0422$, (ii) the W model should be rejected at any level of significance $\alpha \ge 0.0659$, (iii) none of the four models EW, GLFR, GLE and EGLE is rejected at any level of significance $\alpha \le 0.2086$, and (iv) the EGLE distribution is the best among all those used here to fit the data set, in the sense of having the highest p-value.

For more accurate comparisons between these distributions, we perform a further analysis using the likelihood ratio test statistics. Table 9 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC for the drug data. From the p-values in Table 9, we can immediately conclude that: (i) both the LFR and W distributions are rejected at any level of significance $\alpha \ge 7.705 \times 10^{-3}$, (ii) the GLFR distribution should be rejected at any significance level $\alpha \ge 0.029$, (iii) the GLE distribution should be rejected at any significance level $\alpha \ge 0.031$, and (iv) the EW distribution should be rejected at any significance level $\alpha \ge 0.087$. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set.

7. Conclusion

We have introduced a four-parameter distribution, called the exponentiated generalized linear exponential distribution, as a simple extension of the generalized linear exponential distribution [1], the generalized linear failure rate distribution [2] and the exponentiated Weibull distribution [6]. We discussed some statistical properties of the distribution, including the mean, median, mode, moments, measures of skewness and kurtosis, and the probability density of the order statistics and their moments. Maximum likelihood estimation of the four parameters of the new distribution is discussed, and we obtained the observed Fisher information matrix.


Three real data sets are analyzed using the new distribution, which is compared with the three immediate sub-models mentioned above as well as with two simpler two-parameter distributions (Weibull and linear failure rate). The comparisons show that the new distribution provides a better fit than those three sub-models to the three data sets. We hope that the new distribution will attract a wider set of applications in lifetime data and reliability analysis.

Acknowledgments

The authors thank the referees for their valuable comments, which improved the earlier version of the manuscript.

Appendix A

The observed Fisher information matrix, for a complete data set, is

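$$
I(\hat{\theta}) = -\begin{pmatrix}
U_{aa} & U_{ab} & U_{ac} & U_{ad} \\
\cdot & U_{bb} & U_{bc} & U_{bd} \\
\cdot & \cdot & U_{cc} & U_{cd} \\
\cdot & \cdot & \cdot & U_{dd}
\end{pmatrix}_{\theta = \hat{\theta}},
$$

whose elements, with $\psi_i = h_{LE}(x_i; a, b) = a + b x_i$, $\varpi_i = H_{LE}(x_i; a, b) = a x_i + \frac{b}{2} x_i^2$ and $\omega_i = \exp\left\{-\left(a x_i + \frac{b}{2} x_i^2\right)^c\right\}$, are given by

$$
U_{aa} = -\sum_{i=1}^{n} \frac{1}{\psi_i^2} - (c-1) \sum_{i=1}^{n} \frac{x_i^2}{\varpi_i^2} - c(c-1) \sum_{i=1}^{n} x_i^2 \varpi_i^{c-2} + c(d-1) \sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-2} \omega_i \left[(1-\omega_i)(c-1) - c\,\varpi_i^c\right]}{(1-\omega_i)^2},
$$

$$
U_{ab} = -\sum_{i=1}^{n} \frac{x_i}{\psi_i^2} - \frac{c-1}{2} \sum_{i=1}^{n} \frac{x_i^3}{\varpi_i^2} - \frac{c(c-1)}{2} \sum_{i=1}^{n} x_i^3 \varpi_i^{c-2} + \frac{c(d-1)}{2} \sum_{i=1}^{n} \frac{x_i^3 \varpi_i^{c-2} \omega_i \left[(1-\omega_i)(c-1) - c\,\varpi_i^c\right]}{(1-\omega_i)^2},
$$

$$
U_{ac} = \sum_{i=1}^{n} \frac{x_i}{\varpi_i} - c \sum_{i=1}^{n} x_i \varpi_i^{c-1} \ln \varpi_i - \sum_{i=1}^{n} x_i \varpi_i^{c-1} + (d-1) \sum_{i=1}^{n} \frac{x_i \varpi_i^{c-1} \omega_i \left[(1-\omega_i)(1 + c \ln \varpi_i) - c\,\varpi_i^c \ln \varpi_i\right]}{(1-\omega_i)^2},
$$

$$
U_{ad} = c \sum_{i=1}^{n} \frac{x_i \varpi_i^{c-1} \omega_i}{1-\omega_i},
$$

$$
U_{bb} = -\sum_{i=1}^{n} \frac{x_i^2}{\psi_i^2} - \frac{c-1}{4} \sum_{i=1}^{n} \frac{x_i^4}{\varpi_i^2} - \frac{c(c-1)}{4} \sum_{i=1}^{n} x_i^4 \varpi_i^{c-2} + \frac{c(d-1)}{4} \sum_{i=1}^{n} \frac{x_i^4 \varpi_i^{c-2} \omega_i \left[(1-\omega_i)(c-1) - c\,\varpi_i^c\right]}{(1-\omega_i)^2},
$$

$$
U_{bc} = \frac{1}{2} \sum_{i=1}^{n} \frac{x_i^2}{\varpi_i} - \frac{c}{2} \sum_{i=1}^{n} x_i^2 \varpi_i^{c-1} \ln \varpi_i - \frac{1}{2} \sum_{i=1}^{n} x_i^2 \varpi_i^{c-1} + \frac{d-1}{2} \sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-1} \omega_i \left[(1-\omega_i)(1 + c \ln \varpi_i) - c\,\varpi_i^c \ln \varpi_i\right]}{(1-\omega_i)^2},
$$

$$
U_{bd} = \frac{c}{2} \sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-1} \omega_i}{1-\omega_i},
$$

$$
U_{cc} = -\frac{n}{c^2} - \sum_{i=1}^{n} \varpi_i^c (\ln \varpi_i)^2 + (d-1) \sum_{i=1}^{n} \frac{\omega_i \varpi_i^c (\ln \varpi_i)^2 \left[(1-\omega_i) - \varpi_i^c\right]}{(1-\omega_i)^2},
$$

$$
U_{cd} = \sum_{i=1}^{n} \frac{\omega_i \varpi_i^c \ln \varpi_i}{1-\omega_i},
$$

and

$$
U_{dd} = -\frac{n}{d^2}.
$$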
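The entries above are second partial derivatives of the log-likelihood, so they can be checked numerically. The sketch below is one way to do this, assuming the complete-data log-likelihood L = n ln(cd) + Σ ln ψ_i + (c-1) Σ ln ϖ_i - Σ ϖ_i^c + (d-1) Σ ln(1-ω_i), which follows from the assumed EGLE density; it compares finite-difference estimates with the closed forms for U_ad, U_cd and U_dd. The data, the parameter values and the function names are hypothetical.

```python
import math

def loglik(theta, xs):
    """Complete-data log-likelihood under the assumed EGLE density."""
    a, b, c, d = theta
    total = len(xs) * math.log(c * d)
    for x in xs:
        psi = a + b * x
        varpi = a * x + 0.5 * b * x * x
        omega = math.exp(-varpi ** c)
        total += (math.log(psi) + (c - 1) * math.log(varpi)
                  - varpi ** c + (d - 1) * math.log(1.0 - omega))
    return total

def hessian_entry(f, theta, i, j, h=1e-4):
    """Central finite-difference estimate of d^2 f / (d theta_i d theta_j)."""
    def shifted(di, dj):
        t = list(theta)
        t[i] += di
        t[j] += dj
        return f(t)
    return (shifted(h, h) - shifted(h, -h)
            - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)

def _terms(theta, xs):
    """Yield (x_i, varpi_i, omega_i) for each observation."""
    a, b, c, _ = theta
    for x in xs:
        varpi = a * x + 0.5 * b * x * x
        yield x, varpi, math.exp(-varpi ** c)

def u_ad(theta, xs):
    c = theta[2]
    return c * sum(x * v ** (c - 1) * w / (1.0 - w) for x, v, w in _terms(theta, xs))

def u_cd(theta, xs):
    c = theta[2]
    return sum(w * v ** c * math.log(v) / (1.0 - w) for _, v, w in _terms(theta, xs))

def u_dd(theta, xs):
    return -len(xs) / theta[3] ** 2

if __name__ == "__main__":
    xs = [0.5, 1.2, 2.0, 3.1, 4.4]            # hypothetical lifetimes
    theta = [0.3, 0.1, 1.5, 2.0]              # hypothetical (a, b, c, d)
    f = lambda t: loglik(t, xs)
    print(hessian_entry(f, theta, 0, 3), u_ad(theta, xs))
    print(hessian_entry(f, theta, 2, 3), u_cd(theta, xs))
    print(hessian_entry(f, theta, 3, 3), u_dd(theta, xs))
```

The same comparison can be repeated for the remaining entries; agreement to several decimal places is a useful guard against sign or index slips in the closed forms.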

References

[1] M.A.W. Mahmoud, F.M.A. Alam, The generalized linear exponential distribution, Statist. Probabil. Lett. 80 (2010) 1005–1014.
[2] A. Sarhan, D. Kundu, Generalized linear failure rate distribution, Commun. Statist. Theory Methods 38 (5) (2009) 642–660.
[3] C.D. Lai, M. Xie, D.N.P. Murthy, Bathtub shaped failure rate distributions, in: N. Balakrishnan, C.R. Rao (Eds.), Handbook in Reliability, vol. 20, 2001, pp. 69–104.
[4] T. Zhang, M. Xie, L.C. Tang, S.H. Ng, Reliability and modeling of systems integrated with firmware and hardware, Int. J. Reliab. Quality Safety Eng. 12 (3) (2005) 227–239.
[5] H. Pham, C.D. Lai, On recent generalizations of the Weibull distribution, IEEE Trans. Reliab. 56 (2007) 454–458.
[6] G.S. Mudholkar, D.K. Srivastava, Exponentiated Weibull family for analyzing bathtub failure rate data, IEEE Trans. Reliab. 42 (1993) 299–302.
[7] R. Gupta, D. Kundu, Generalized exponential distribution, Aust. N. Z. J. Statist. 41 (2) (1999) 173–188.
[8] J.G. Surles, W.J. Padgett, Some properties of a scaled Burr type X distribution, J. Statist. Plann. Inference 128 (2005) 271–280.
[9] R.L. Burden, J.D. Faires, Numerical Analysis, ninth ed., Brooks/Cole, Cengage Learning, 2011.
[10] M.V. Aarset, How to identify bathtub hazard rate, IEEE Trans. Reliab. R-36 (1987) 106–108.
[11] H. Akaike, A new look at statistical model identification, IEEE Trans. Automat. Control 19 (1974) 716–723.
[12] A.M. Abouammoh, S.A. Abdulghani, I.S. Qamber, On partial orderings and testing of new better than renewal used classes, Reliab. Eng. Syst. Safety 43 (1994) 37–41.
[13] M. Xie, Y. Tang, T.N. Goh, A modified Weibull extension with bathtub-shaped failure rate function, Reliab. Eng. Syst. Safety 76 (2002) 279–285.