Applied Mathematical Modelling 37 (2013) 2838–2849
Exponentiated generalized linear exponential distribution
Ammar M. Sarhan (a,*), Abd EL-Baset A. Ahmad (b), Ibtesam A. Alasbahi (b)
(a) Department of Math, Stats & CS, StFX University, Antigonish, NS, Canada B2G 2W5
(b) Department of Mathematics, Assiut University, Assiut, Egypt
Article info
Article history: Received 1 January 2012; Received in revised form 22 May 2012; Accepted 6 June 2012; Available online 21 June 2012
Keywords: Distribution theory; Estimation theory; Reliability analysis

Abstract
A generalization of the linear exponential distribution, called the generalized linear exponential distribution, was recently proposed by Mahmoud and Alam [1]. Another generalization of the linear exponential distribution, named the generalized linear failure rate distribution, was introduced by Sarhan and Kundu [2]. This paper proposes a further generalization of the linear exponential distribution that contains both. We refer to this new generalization as the exponentiated generalized linear exponential distribution. The new distribution is important because, in addition to the above two models, it contains several well-known distributions as special sub-models, such as the exponentiated Weibull distribution. It also provides more flexibility for analyzing complex real data sets. We study some statistical properties of the new distribution and discuss maximum likelihood estimation of its parameters. Three real data sets are analyzed using the new distribution; the results show that the exponentiated generalized linear exponential distribution can be used quite effectively in analyzing real lifetime data.
© 2012 Elsevier Inc. All rights reserved.
* Corresponding author. Permanent address: Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt. E-mail addresses: [email protected], [email protected] (A.M. Sarhan).
0307-904X/$ - see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.apm.2012.06.019

1. Introduction
The linear exponential (LE) distribution, which has the exponential and Rayleigh distributions as sub-models, is a very well-known distribution for modeling lifetime data and phenomena with linearly increasing failure rates. However, the LE distribution does not provide a reasonable parametric fit for phenomena with decreasing, nonlinearly increasing, or non-monotone failure rates such as the bathtub shape, which are common in firmware reliability modeling and biological studies; see Lai et al. [3] and Zhang et al. [4]. Bathtub failure rate curves have nearly flat middle portions, and the corresponding densities have a positive antimode. An example of a bathtub-shaped failure rate is firmware reliability [4]. Models that exhibit a bathtub-shaped failure rate function are very useful in reliability analysis, particularly in reliability-related decision making and cost analysis (Xie et al. [13]) and in firmware reliability modeling [4].

Many parametric families of distributions with bathtub-shaped failure rates have been constructed in the past two decades. A good review of some of these models is presented by Pham and Lai [5]. Among them is the exponentiated Weibull (EW) distribution, proposed by Mudholkar and Srivastava [6]. More recently, Sarhan and Kundu [2] presented the generalized linear failure rate (GLFR) distribution and Mahmoud and Alam [1] proposed the generalized linear exponential (GLE) distribution. None of these three distributions (EW, GLFR, and GLE) can be derived as a sub-model of another.

In this paper, we introduce a new distribution with four parameters, referred to as the exponentiated generalized linear exponential (EGLE) distribution, in the hope that it will attract applications in disciplines such as survival analysis, reliability, biology and others. One of the main reasons for introducing this new distribution is that it contains the three distributions mentioned above as sub-models. The fourth parameter makes it more flexible than its sub-models in describing different types of real data. In general, the EGLE distribution generalizes the GLFR, GLE, EW, generalized exponential (GE) [7], generalized Rayleigh (GR) [8] and LE distributions, among several others. Because of its flexibility in accommodating different forms of the hazard function, the EGLE distribution seems suitable for a variety of problems in fitting survival data. It is not only convenient for fitting data with a bathtub-shaped failure rate but also suitable for testing the goodness-of-fit of special sub-models such as the EW, GLFR and GLE distributions.

The rest of the paper is organized as follows. In Section 2, we define the EGLE distribution, discuss some special sub-models, and provide its cumulative distribution function (cdf), probability density function (pdf) and hazard function. A formula for generating random samples from the EGLE distribution is also given in Section 2. Section 3 discusses some important statistical properties of the EGLE distribution, such as the mode, the median, the quantiles, the ordinary moments and the measures of skewness and kurtosis. Section 4 discusses the distribution of the order statistics. Maximum likelihood estimates of the four parameters of the distribution are presented in Section 5. Section 6 provides three applications to real data. Section 7 concludes the paper. The paper also contains an appendix giving technical details.

2. The EGLE distribution
A non-negative random variable $X$ has the EGLE distribution with parameter vector $\theta = (a, b, c, d)$, written EGLE($\theta$) or EGLE($a, b, c, d$), if its cdf is
\[
F(x;\theta) = \left[ 1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}} \right]^{d}, \qquad x \ge 0, \tag{2.1}
\]
where $a, b \ge 0$ (such that $a + b > 0$) and $c, d > 0$. The two parameters $a$ and $b$ are scale parameters, while $c$ and $d$ are shape parameters. The pdf of the EGLE distribution is
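\[
f(x;\theta) = c\,d\,(a + bx)\left(ax + \tfrac{b}{2}x^{2}\right)^{c-1} \left[ 1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}} \right]^{d-1} e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}, \qquad x \ge 0. \tag{2.2}
\]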
One advantage of the EGLE distribution is that its cdf has a closed form, which enables us to generate random numbers from it using the following simple formula
\[
X = \begin{cases}
-\dfrac{a}{b} + \dfrac{1}{b}\sqrt{a^{2} + 2b\left[-\ln\left(1 - U^{1/d}\right)\right]^{1/c}}, & \text{if } b \neq 0, \\[2ex]
\dfrac{1}{a}\left[-\ln\left(1 - U^{1/d}\right)\right]^{1/c}, & \text{if } b = 0,
\end{cases} \tag{2.3}
\]
where $U$ is a random variable uniformly distributed on the interval (0, 1). Formula (2.3) can be used to generate random samples from a wide set of sub-models of the EGLE distribution, such as the exponential, generalized exponential, Rayleigh, generalized Rayleigh, linear exponential (linear failure rate), generalized linear failure rate, generalized linear exponential, Weibull and exponentiated Weibull distributions.

When $d$ is an integer, the EGLE($\theta$) distribution can be interpreted as the lifetime distribution of a parallel system consisting of $d$ independent and identical units whose lifetimes follow the GLE($a, b, c$) distribution.

The proposed distribution generalizes many well-known distributions that show different patterns of the hazard function. Table 1 summarizes some of the more recent sub-models that show a bathtub-shaped hazard function. The pdf of the EGLE distribution can be written in terms of the cumulative hazard and hazard functions of the LE($a, b$) distribution as
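\[
f(x;\theta) = d\,c\,h_{LE}(x)\left[H_{LE}(x)\right]^{c-1} e^{-\left[H_{LE}(x)\right]^{c}} \left[ 1 - e^{-\left[H_{LE}(x)\right]^{c}} \right]^{d-1}, \qquad x \ge 0, \tag{2.4}
\]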
where $H_{LE}(x) = H_{LE}(x; a, b) = ax + \tfrac{b}{2}x^{2}$ and $h_{LE}(x) = h_{LE}(x; a, b) = a + bx$ are the cumulative hazard and hazard functions of the LE distribution, respectively. Plots of the EGLE pdf for selected choices of the parameter vector $\theta = (a, b, c, d)$ are given in Fig. 1. From this figure, it is immediate that the pdf of the EGLE distribution can be either unimodal or right skewed.
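The definitions (2.1), (2.2) and the inversion formula (2.3) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names (`egle_cdf`, `egle_pdf`, `egle_rvs`) are hypothetical and not taken from the paper.

```python
import math
import random


def cum_hazard_le(x, a, b):
    """Cumulative hazard of LE(a, b): H_LE(x) = a*x + (b/2)*x**2."""
    return a * x + 0.5 * b * x * x


def egle_cdf(x, a, b, c, d):
    """cdf (2.1): [1 - exp(-H_LE(x)**c)]**d for x >= 0."""
    if x <= 0.0:
        return 0.0
    return (-math.expm1(-cum_hazard_le(x, a, b) ** c)) ** d


def egle_pdf(x, a, b, c, d):
    """pdf (2.2), evaluated for x > 0 (returns 0 otherwise)."""
    if x <= 0.0:
        return 0.0
    H = cum_hazard_le(x, a, b)
    base = -math.expm1(-H ** c)  # 1 - exp(-H^c)
    return c * d * (a + b * x) * H ** (c - 1.0) * base ** (d - 1.0) * math.exp(-H ** c)


def egle_rvs(n, a, b, c, d, rng=random):
    """Draw n variates by inverting the cdf, following (2.3)."""
    sample = []
    for _ in range(n):
        u = rng.random()
        t = (-math.log1p(-u ** (1.0 / d))) ** (1.0 / c)
        if b == 0:
            sample.append(t / a)
        else:
            sample.append((-a + math.sqrt(a * a + 2.0 * b * t)) / b)
    return sample
```

For example, `egle_rvs(1000, 0.7, 2.5, 1.5, 2.0)` draws from one of the parameter vectors plotted in Fig. 1; passing a seeded `random.Random` instance as `rng` makes the draw reproducible.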
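Table 1. Some recent sub-models of the EGLE($a, b, c, d$) distribution.

| The model | cdf | Special case | Author |
|---|---|---|---|
| EW($\sigma, c, d$) | $\left[1 - e^{-(x/\sigma)^{c}}\right]^{d}$ | $b = 0$, $a = \sigma^{-1}$ | [6] |
| GLFR($a, b, d$) | $\left[1 - e^{-ax - \frac{b}{2}x^{2}}\right]^{d}$ | $c = 1$ | [2] |
| GLE($a, b, c$) | $1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}$ | $d = 1$ | [1] |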
[Figure 1: plot of $f(x;\theta)$ against $x$ for $\theta$ = (0.7, 2.5, 1.5, 0.5), (0.7, 2.5, 1.5, 2), (0.7, 2.5, 0.5, 2) and (0.7, 2.5, 0.5, 0.5).]
Fig. 1. The pdf of the EGLE distribution at different $\theta$.
The cdf of the EGLE distribution (2.1) can be constructed in two ways: (i) by raising the cumulative hazard function of the LE distribution, $H_{LE}(x; a, b)$, embedded in the GLFR distribution, to an arbitrary power $c > 0$; or (ii) by raising the cdf of the GLE distribution, $F_{GLE}(x; a, b, c)$, to an arbitrary power $d > 0$. Following case (ii), $F_{GLE}(x; a, b, c)$ is the baseline distribution and $F(x;\theta)$ may be referred to as the exponentiated $F_{GLE}(x; a, b, c)$ distribution. The relation between the corresponding probability density functions is $f(x;\theta) = d\,[F_{GLE}(x; a, b, c)]^{d-1} f_{GLE}(x; a, b, c)$. For $d > 1$ and $d < 1$ and for larger (smaller) values of $x$, the factor $d\,[F_{GLE}(x; a, b, c)]^{d-1}$ is greater (smaller) and smaller (greater) than one, respectively. This implies that the ordinary moments associated with the pdf (2.2) are strictly larger (smaller) than those associated with the pdf $f_{GLE}(x)$ when $d > 1$ ($d < 1$). The hazard function of the EGLE($\theta$) distribution is
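\[
h(x;\theta) = \frac{c\,d\,(a + bx)\left(ax + \tfrac{b}{2}x^{2}\right)^{c-1}\left[1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}\right]^{d-1} e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}}{1 - \left[1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}\right]^{d}}, \qquad x \ge 0. \tag{2.5}
\]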
Plots of the hazard function of the EGLE distribution, for selected choices of the parameter vector $\theta = (a, b, c, d)$, are given in Fig. 2. From this figure, it is immediate that the hazard function of the EGLE distribution can be decreasing, increasing, or bathtub shaped, which makes the distribution more flexible in fitting different lifetime data sets.

[Figure 2: plot of $h(x;\theta)$ against $x$ for $\theta$ = (0.7, 2.5, 1.5, 0.5), (0.7, 2.5, 0.5, 0.5), (0.7, 2.5, 1.5, 2) and (0.7, 2.5, 0.5, 2).]
Fig. 2. The hazard function of the EGLE distribution at different $\theta$.

The pdf of the EGLE distribution (2.2) can be written as a linear combination of pdfs of the GLE distribution. For $d > 0$, a series expansion for $(1 - w)^{d-1}$, for $|w| < 1$, is
\[
(1 - w)^{d-1} = \sum_{j=0}^{\infty} \frac{(-1)^{j}\,\Gamma(d)}{\Gamma(d - j)\, j!}\, w^{j}, \tag{2.6}
\]
where $\Gamma(\cdot)$ is the gamma function. Since $e^{-[H_{LE}(x; a, b)]^{c}} < 1$ for $x > 0$, using the series expansion (2.6) in (2.2) yields
\[
f(x;\theta) = \sum_{j=0}^{\infty} \frac{(-1)^{j}\,\Gamma(d + 1)}{\Gamma(d - j)\,(j + 1)!}\, f_{GLE}\!\left(x;\, (j + 1)^{1/c} a,\, (j + 1)^{1/c} b,\, c\right). \tag{2.7}
\]
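As a quick numerical sanity check of the linear combination (2.7), the sketch below compares the closed-form pdf (2.2) with the finite sum obtained when $d$ is a positive integer (terms $j = 0, \dots, d - 1$). The function names are illustrative assumptions, and the GLE pdf is taken to be (2.2) with $d = 1$.

```python
import math


def gle_pdf(x, a, b, c):
    """pdf of GLE(a, b, c), i.e. (2.2) with d = 1."""
    H = a * x + 0.5 * b * x * x
    return c * (a + b * x) * H ** (c - 1.0) * math.exp(-H ** c)


def egle_pdf(x, a, b, c, d):
    """Closed-form EGLE pdf (2.2) for x > 0."""
    H = a * x + 0.5 * b * x * x
    base = -math.expm1(-H ** c)  # 1 - exp(-H^c)
    return d * base ** (d - 1.0) * gle_pdf(x, a, b, c)


def egle_pdf_mixture(x, a, b, c, d):
    """Finite form of (2.7); d must be a positive integer here."""
    total = 0.0
    for j in range(d):
        weight = ((-1) ** j * math.gamma(d + 1)
                  / (math.gamma(d - j) * math.factorial(j + 1)))
        scale = (j + 1) ** (1.0 / c)  # rescales both a and b
        total += weight * gle_pdf(x, scale * a, scale * b, c)
    return total
```

With, say, `a, b, c, d = 0.7, 2.5, 1.5, 3`, the two functions should agree up to floating-point rounding at any `x > 0`.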
When $d$ is a positive integer, the index $j$ in (2.7) stops at $d - 1$. The linear combination (2.7) enables us to obtain some mathematical properties of the EGLE distribution directly from those of the GLE distribution, for example the moments, the moment generating function and the characteristic function. Software packages such as MATHEMATICA, MATLAB and MAPLE can be used to compute (2.7) numerically; such packages are currently able to handle formidable analytical expressions.

3. Statistical properties

3.1. Mean, median and mode
As expected, the mean of the EGLE($a, b, c, d$) distribution cannot be derived in an explicit form. However, it can be expressed as a linear combination of the means of GLE($a^{*}, b^{*}, c$) distributions with different values of $a^{*}$ and $b^{*}$. The general moments of the EGLE($a, b, c, d$) distribution are presented later. The quantile $x_q$ of the EGLE($a, b, c, d$) distribution is given by
\[
x_q = \frac{1}{b}\left\{ -a + \sqrt{a^{2} + 2b\left[-\ln\left(1 - q^{1/d}\right)\right]^{1/c}} \right\}. \tag{3.1}
\]
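Using (3.1), the median of the EGLE($a, b, c, d$) distribution can be obtained by setting $q = \tfrac{1}{2}$. Moreover, the mode of the EGLE($a, b, c, d$) distribution can be obtained as a nonnegative solution of the following nonlinear equation
\[
(a + bx)^{2}\left\{ c - 1 + c(d - 1)\,\frac{\left(ax + \tfrac{b}{2}x^{2}\right)^{c} e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}}{1 - e^{-\left(ax + \frac{b}{2}x^{2}\right)^{c}}} - c\left(ax + \tfrac{b}{2}x^{2}\right)^{c} \right\} + b\left(ax + \tfrac{b}{2}x^{2}\right) = 0. \tag{3.2}
\]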
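The median and the mode can be computed along the following lines. This is only a sketch under stated assumptions: the median uses the quantile formula (3.1), while the mode is found by a golden-section search on the log of the pdf (2.2) rather than by solving the mode equation directly, which presumes a single interior mode between the 0.001 and 0.999 quantiles. All function names are hypothetical.

```python
import math


def egle_quantile(q, a, b, c, d):
    """Quantile (3.1); the b == 0 branch mirrors the b = 0 case of (2.3)."""
    t = (-math.log1p(-q ** (1.0 / d))) ** (1.0 / c)
    if b == 0:
        return t / a
    return (-a + math.sqrt(a * a + 2.0 * b * t)) / b


def egle_logpdf(x, a, b, c, d):
    """Logarithm of the pdf (2.2) for x > 0."""
    H = a * x + 0.5 * b * x * x
    return (math.log(c * d) + math.log(a + b * x) + (c - 1.0) * math.log(H)
            + (d - 1.0) * math.log(-math.expm1(-H ** c)) - H ** c)


def egle_mode(a, b, c, d, tol=1e-10):
    """Golden-section search for the maximiser of the log-pdf.

    Assumes the pdf is unimodal with its mode inside the search bracket.
    """
    lo = egle_quantile(1e-3, a, b, c, d)
    hi = egle_quantile(0.999, a, b, c, d)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1 = egle_logpdf(x1, a, b, c, d)
    f2 = egle_logpdf(x2, a, b, c, d)
    while hi - lo > tol:
        if f1 < f2:  # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = egle_logpdf(x2, a, b, c, d)
        else:        # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = egle_logpdf(x1, a, b, c, d)
    return 0.5 * (lo + hi)
```

For instance, `egle_quantile(0.5, 0.7, 2.5, 1.5, 2.0)` gives the median and `egle_mode(0.7, 2.5, 1.5, 2.0)` the mode for one of the parameter vectors used in Fig. 1. For decreasing densities the mode sits at the boundary and this search is not appropriate.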
It is not possible to obtain an explicit solution to (3.2) in the general case. Numerical methods such as bisection or fixed-point iteration should be used; see [9]. Explicit forms can be derived in various special cases.

3.2. Moments
Moments are necessary and important in any statistical analysis, especially in applications. They can be used to study the most important features and characteristics of a distribution (e.g., tendency, dispersion, skewness and kurtosis). Let $X$ be a random variable with density function (2.2). The $k$th ordinary moment of the EGLE distribution is $\mu_k(\theta) = E[X^{k}] = \int_{0}^{\infty} x^{k} f(x;\theta)\,dx$. Using (2.7), we can derive $\mu_k(\theta)$ as
\[
\mu_k(\theta) = \sum_{j=0}^{\infty} \frac{(-1)^{j}\,\Gamma(d + 1)}{\Gamma(d - j)\,(j + 1)!}\, \mu_{k,GLE}\!\left((j + 1)^{1/c} a,\, (j + 1)^{1/c} b,\, c\right), \tag{3.3}
\]
where $\mu_{k,GLE}\!\left((j + 1)^{1/c} a, (j + 1)^{1/c} b, c\right)$ is the $k$th ordinary moment of a GLE distributed random variable with parameters $(j + 1)^{1/c} a$, $(j + 1)^{1/c} b$ and $c$. Using Eq. (9) in [1], we have
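\[
\mu_{k,GLE}(a, b, c) = \sum_{i=0}^{k} \sum_{\ell=0}^{\infty} \binom{k}{i} \binom{\tfrac{1}{2}(k - i)}{\ell} \frac{(-1)^{i}\, 2^{\frac{k-i}{2} - \ell}\, a^{2\ell + i}}{b^{\frac{k+i}{2} + \ell}}\, \Gamma\!\left(\frac{k - i}{2c} - \frac{\ell}{c} + 1\right). \tag{3.4}
\]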
If $d$ is a positive integer, the upper limit of the sum in (3.3) will be $d - 1$. This result shows a useful application of the infinite linear combination representation of the EGLE probability density function. Based on the first four ordinary moments of the EGLE distribution, the measures of skewness $\alpha(\theta)$ and kurtosis $\kappa(\theta)$ of the EGLE distribution can be obtained as
\[
\alpha(\theta) = \frac{\mu_3(\theta) - 3\mu_1(\theta)\mu_2(\theta) + 2\mu_1^{3}(\theta)}{\left[\mu_2(\theta) - \mu_1^{2}(\theta)\right]^{3/2}}, \tag{3.5}
\]
and
\[
\kappa(\theta) = \frac{\mu_4(\theta) - 4\mu_1(\theta)\mu_3(\theta) + 6\mu_1^{2}(\theta)\mu_2(\theta) - 3\mu_1^{4}(\theta)}{\left[\mu_2(\theta) - \mu_1^{2}(\theta)\right]^{2}}. \tag{3.6}
\]

[Figure 3: two panels plotting the skewness and the kurtosis against $d$ (from 0 to 10) for $c = 0.5$, $c = 1.0$ and $c = 1.5$.]
Fig. 3. Plots of the skewness and kurtosis of the EGLE distribution as a function of $d$ for some values of $c$ and $a = 0.7$, $b = 2.5$.
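The skewness (3.5) and kurtosis (3.6) can also be approximated without the series (3.3) and (3.4). The sketch below is one possible numerical route, not the paper's method: it uses the standard identity $E[X^{k}] = \int_{0}^{\infty} k x^{k-1}[1 - F(x;\theta)]\,dx$ with composite Simpson's rule on a truncated range. Function names, the grid size and the truncation threshold are assumptions made for illustration.

```python
import math


def egle_sf(x, a, b, c, d):
    """Survival function 1 - F(x; theta), with F as in (2.1)."""
    H = a * x + 0.5 * b * x * x
    return 1.0 - (-math.expm1(-H ** c)) ** d


def egle_moment(k, a, b, c, d, n=20000):
    """k-th ordinary moment via Simpson's rule on k*x^(k-1)*S(x); n must be even."""
    upper = 1.0
    while egle_sf(upper, a, b, c, d) > 1e-15:  # truncate where the tail is negligible
        upper *= 2.0
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * k * x ** (k - 1) * egle_sf(x, a, b, c, d)
    return total * h / 3.0


def egle_skewness_kurtosis(a, b, c, d):
    """Skewness (3.5) and kurtosis (3.6) from the first four ordinary moments."""
    m1, m2, m3, m4 = (egle_moment(k, a, b, c, d) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return skew, kurt
```

A convenient check is the exponential special case `egle_skewness_kurtosis(1.0, 0.0, 1.0, 1.0)`, whose skewness and kurtosis are known to be 2 and 9.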
Plots of the skewness and kurtosis of the EGLE distribution as a function of $d$ for selected values of $c$ and $a = 0.7$, $b = 2.5$ are given in Fig. 3. It is observed that: (i) $\alpha(\theta)$ and $\kappa(\theta)$ decrease as $d$ increases when $c < 1$; (ii) when $c \ge 1$, $\alpha(\theta)$ and $\kappa(\theta)$ first decrease as $d$ increases and then start increasing.

4. Order statistics
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the EGLE($\theta$) distribution with pdf and cdf given by (2.2) and (2.1), respectively. Let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ denote the order statistics obtained from this sample. We now give the probability density function of $X_{r:n}$, say $f_{r:n}(x;\theta)$, and the moments of $X_{r:n}$, $r = 1, 2, \ldots, n$. From these, the measures of skewness and kurtosis of the distribution of $X_{r:n}$ can be obtained. The pdf of $X_{r:n}$ is given by
\[
f_{r:n}(x;\theta) = \frac{1}{B(r, n - r + 1)} \left[F(x;\theta)\right]^{r-1} \left[1 - F(x;\theta)\right]^{n-r} f(x;\theta), \tag{4.7}
\]
where $f(x;\theta)$ and $F(x;\theta)$ are the pdf and cdf of the EGLE($\theta$) distribution given by (2.2) and (2.1), respectively, and $B(\cdot,\cdot)$ is the beta function. Since $0 < F(x;\theta) < 1$ for $x > 0$, we can use the binomial series expansion for $[1 - F(x;\theta)]^{n-r}$, given by
\[
\left[1 - F(x;\theta)\right]^{n-r} = \sum_{j=0}^{n-r} \binom{n - r}{j} (-1)^{j} \left[F(x;\theta)\right]^{j}, \tag{4.8}
\]
Therefore,
\[
f_{r:n}(x;\theta) = \frac{1}{B(r, n - r + 1)}\, f(x; a, b, c, d) \sum_{j=0}^{n-r} \binom{n - r}{j} (-1)^{j} \left[F(x; a, b, c, d)\right]^{r+j-1}. \tag{4.9}
\]
Substituting from (2.2) and (2.1) into (4.9), one gets
\[
f_{r:n}(x;\theta) = \sum_{j=0}^{n-r} \frac{(-1)^{j}\, n!}{j!\,(r - 1)!\,(n - r - j)!\,(r + j)}\, f\!\left(x; a, b, c, (r + j)d\right). \tag{4.10}
\]
Relation (4.10) shows that $f_{r:n}(x;\theta)$ is a weighted average of exponentiated generalized linear exponential densities with different values of the shape parameter $d$. Using (2.7), (3.3), and (3.4), we can express the $k$th moment of the $i$th order statistic $X_{i:n}$ as a linear combination of the $k$th moments of the GLE distribution with different parameters. Therefore, the measures of skewness and kurtosis of the distribution of $X_{i:n}$ can be calculated.
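The agreement between the direct form (4.7) and the expansion (4.10) is easy to check numerically. The following sketch evaluates both at a point; the function names and the test values of $n$, $r$ and $\theta$ are illustrative assumptions.

```python
import math


def egle_cdf(x, a, b, c, d):
    """cdf (2.1) for x > 0."""
    H = a * x + 0.5 * b * x * x
    return (-math.expm1(-H ** c)) ** d


def egle_pdf(x, a, b, c, d):
    """pdf (2.2) for x > 0."""
    H = a * x + 0.5 * b * x * x
    base = -math.expm1(-H ** c)
    return c * d * (a + b * x) * H ** (c - 1.0) * base ** (d - 1.0) * math.exp(-H ** c)


def order_stat_pdf(x, r, n, a, b, c, d):
    """pdf of the r-th order statistic, direct form (4.7)."""
    F = egle_cdf(x, a, b, c, d)
    beta = math.gamma(r) * math.gamma(n - r + 1) / math.gamma(n + 1)  # B(r, n-r+1)
    return F ** (r - 1) * (1.0 - F) ** (n - r) * egle_pdf(x, a, b, c, d) / beta


def order_stat_pdf_expansion(x, r, n, a, b, c, d):
    """pdf of the r-th order statistic as the finite sum (4.10)."""
    total = 0.0
    for j in range(n - r + 1):
        coef = ((-1) ** j * math.factorial(n)
                / (math.factorial(j) * math.factorial(r - 1)
                   * math.factorial(n - r - j) * (r + j)))
        total += coef * egle_pdf(x, a, b, c, (r + j) * d)
    return total
```

For moderate `n` the two evaluations should coincide up to rounding; for large `n - r` the alternating sum in (4.10) may lose accuracy to cancellation, in which case the direct form is the safer choice.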
5. Estimation and inference
We now discuss the estimation of the model parameters by the method of maximum likelihood in the presence of a right-censored sample. In a right-censored sample, it is assumed that $N$ independent and identical units are put on a life test. The test is terminated when a pre-specified time, say $T$, is reached. The observations obtained from this test are $N$, $T$, $n$ (the number of units that failed before the censoring time $T$) and the lifetimes of these $n$ failed units. Let $x = \{x_1 \le x_2 \le \cdots \le x_n;\ N, n, T\}$ be a right-censored sample in which the lifetimes of the tested units follow the EGLE distribution with unknown parameter vector $\theta = (a, b, c, d)$. The log-likelihood function $L(\theta)$ for $\theta$ is
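\[
\begin{aligned}
L(\theta) = {} & n \ln d + n \ln c + \sum_{i=1}^{n} \ln(a + b x_i) + (c - 1) \sum_{i=1}^{n} \ln\!\left(a x_i + \tfrac{b}{2} x_i^{2}\right) - \sum_{i=1}^{n} \left(a x_i + \tfrac{b}{2} x_i^{2}\right)^{c} \\
& + (d - 1) \sum_{i=1}^{n} \ln\!\left[1 - e^{-\left(a x_i + \frac{b}{2} x_i^{2}\right)^{c}}\right] + (N - n) \ln\!\left\{1 - \left[1 - e^{-\left(aT + \frac{b}{2} T^{2}\right)^{c}}\right]^{d}\right\}.
\end{aligned} \tag{5.11}
\]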
The components of the score vector $U(\theta) = (U_a, U_b, U_c, U_d)^{T}$ are
given below, where $H_i = H_{LE}(x_i; a, b) = a x_i + \tfrac{b}{2} x_i^{2}$ and $H_T = H_{LE}(T; a, b) = aT + \tfrac{b}{2} T^{2}$:
\[
\begin{aligned}
U_a = {} & \sum_{i=1}^{n} \frac{1}{a + b x_i} + (c - 1) \sum_{i=1}^{n} \frac{x_i}{H_i} - c \sum_{i=1}^{n} x_i H_i^{c-1} + c(d - 1) \sum_{i=1}^{n} \frac{x_i H_i^{c-1} e^{-H_i^{c}}}{1 - e^{-H_i^{c}}} \\
& - (N - n)\, \frac{T\, d\, c\, H_T^{c-1} e^{-H_T^{c}} \left[1 - e^{-H_T^{c}}\right]^{d-1}}{1 - \left[1 - e^{-H_T^{c}}\right]^{d}},
\end{aligned}
\]
\[
\begin{aligned}
U_b = {} & \sum_{i=1}^{n} \frac{x_i}{a + b x_i} + \frac{c - 1}{2} \sum_{i=1}^{n} \frac{x_i^{2}}{H_i} - \frac{c}{2} \sum_{i=1}^{n} x_i^{2} H_i^{c-1} + \frac{c(d - 1)}{2} \sum_{i=1}^{n} \frac{x_i^{2} H_i^{c-1} e^{-H_i^{c}}}{1 - e^{-H_i^{c}}} \\
& - (N - n)\, \frac{T^{2}\, d\, c\, H_T^{c-1} e^{-H_T^{c}} \left[1 - e^{-H_T^{c}}\right]^{d-1}}{2\left\{1 - \left[1 - e^{-H_T^{c}}\right]^{d}\right\}},
\end{aligned}
\]
\[
\begin{aligned}
U_c = {} & \frac{n}{c} + \sum_{i=1}^{n} \ln H_i - \sum_{i=1}^{n} H_i^{c} \ln H_i + (d - 1) \sum_{i=1}^{n} \frac{H_i^{c} e^{-H_i^{c}} \ln H_i}{1 - e^{-H_i^{c}}} \\
& - (N - n)\, \frac{d\, H_T^{c} e^{-H_T^{c}} \left[1 - e^{-H_T^{c}}\right]^{d-1} \ln H_T}{1 - \left[1 - e^{-H_T^{c}}\right]^{d}},
\end{aligned}
\]
and
\[
U_d = \frac{n}{d} + \sum_{i=1}^{n} \ln\!\left[1 - e^{-H_i^{c}}\right] - (N - n)\, \frac{\left[1 - e^{-H_T^{c}}\right]^{d} \ln\!\left[1 - e^{-H_T^{c}}\right]}{1 - \left[1 - e^{-H_T^{c}}\right]^{d}}.
\]
Setting these expressions to zero, $U(\theta) = 0$, and solving them simultaneously gives the maximum likelihood estimate (MLE) $\hat{\theta} = (\hat{a}, \hat{b}, \hat{c}, \hat{d})^{T}$ of the four parameters. This system of four nonlinear equations cannot be solved analytically, and mathematical or statistical software should be used to obtain a numerical solution via iterative techniques such as the Newton–Raphson method.

For asymptotic interval estimation of the four parameters $a$, $b$, $c$ and $d$, we obtain the observed Fisher information matrix. The elements of the $4 \times 4$ observed information matrix $I(\theta) = -\partial^{2} L / \partial\theta\,\partial\theta^{T}$ are given in Appendix A. The multivariate normal $N_4(0, I(\hat{\theta})^{-1})$ distribution can be used to construct asymptotic confidence intervals for the parameters. The asymptotic $100(1 - \alpha)\%$ confidence intervals of $a$, $b$, $c$ and $d$ are $\hat{a} \pm Z_{\alpha/2}\,SE(\hat{a})$, $\hat{b} \pm Z_{\alpha/2}\,SE(\hat{b})$, $\hat{c} \pm Z_{\alpha/2}\,SE(\hat{c})$ and $\hat{d} \pm Z_{\alpha/2}\,SE(\hat{d})$, respectively, where $Z_{\alpha/2}$ is the $(1 - \alpha/2)$ quantile of the standard normal distribution and $SE(\cdot)$ is the square root of the diagonal element of $I(\hat{\theta})^{-1}$ corresponding to each parameter.

Different types of goodness-of-fit tests can be applied to compare the EGLE distribution with other models. In Section 6, we mainly use the Kolmogorov–Smirnov (K–S) test as a non-parametric test and the likelihood ratio (LR) test as a parametric one to illustrate how the EGLE distribution can be compared with the GLE, GLFR, EW, LFR and W distributions in fitting real data sets.
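To make the structure of the right-censored log-likelihood (5.11) concrete, the sketch below evaluates it for a given parameter vector. It is a minimal illustration, not the authors' implementation: the function name `egle_loglik` and its argument layout are assumptions, and any general-purpose numerical optimiser could be applied to its negative to approximate the MLE.

```python
import math


def egle_loglik(theta, times, N, T):
    """Log-likelihood (5.11) for a sample right-censored at time T.

    theta : (a, b, c, d)
    times : the n observed failure times (all positive and at most T)
    N     : total number of units on test (N >= n)
    """
    a, b, c, d = theta
    n = len(times)
    ll = n * math.log(d) + n * math.log(c)
    for x in times:
        H = a * x + 0.5 * b * x * x
        ll += (math.log(a + b * x) + (c - 1.0) * math.log(H) - H ** c
               + (d - 1.0) * math.log(-math.expm1(-H ** c)))
    if N > n:
        # Each of the N - n censored units contributes ln[1 - F(T; theta)].
        H_T = a * T + 0.5 * b * T * T
        ll += (N - n) * math.log(1.0 - (-math.expm1(-H_T ** c)) ** d)
    return ll
```

With `N == len(times)` the censoring term vanishes and the function reduces to the complete-sample log-likelihood, i.e. the sum of the log of the pdf (2.2) over the observations.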
6. Data analysis
In this section, we analyze three real data sets (two of them complete and one censored) to demonstrate the performance of the EGLE distribution in practice. The first data set is a sample of 50 components taken from [1,10], which possesses a bathtub-shaped hazard rate. The second data set is a sample of 40 patients taken from Abuammoh et al. [12], which possesses an increasing hazard rate. The third data set is censored and consists of 82 prisoners from different places in the Middle East; it possesses a bathtub-shaped hazard rate. The third data set is available from the corresponding author.

For each data set, we compare the fits of the EGLE, GLE, GLFR and EW distributions. The main reasons for comparing our new model with the GLE, GLFR and EW models are: (1) our model generalizes these three models; and (2) the hazard function of each of these four models can be increasing, decreasing or bathtub shaped. Furthermore, we compare the EGLE with the Weibull (W) and the linear failure rate (LFR) distributions, in order to see the benefit of introducing the new four-parameter distribution over simpler two-parameter distributions.

We test the following five null hypotheses: (i) $H_0: d = 1$, the data follow the GLE($a, b, c$) distribution; (ii) $H_0: c = 1$, the data follow the GLFR($a, b, d$) distribution; (iii) $H_0: b = 0, a = \sigma^{-1}$, the data follow the EW($\sigma, c, d$) distribution; (iv) $H_0: b = 0, a = \sigma^{-1}, d = 1$, the data follow the W($\sigma, c$) "Weibull" distribution; and (v) $H_0: c = 1, d = 1$, the data follow the LFR($a, b$) "linear failure rate" distribution; each against the alternative hypothesis $H_a$: the data follow the EGLE($a, b, c, d$) distribution. Parametric and nonparametric test statistics are used to test the above null hypotheses against $H_a$. We use the Kolmogorov–Smirnov (K–S) test and the likelihood ratio (LR) test statistics.

6.1. Devices times data
The data set refers to the lifetimes of 50 devices provided by Aarset [10]. Table 2 gives the measurements of the data set. Table 3 shows the MLEs of the parameters of every distribution used, the observed K–S test statistic values and the corresponding p-values for the six models used. Fig. 4(a) and (b) give the empirical and parametric survival and hazard functions, respectively, for the devices data set.

From Table 3, based on the p-values associated with the K–S values, we can conclude that: (i) the Weibull distribution is rejected at any level of significance $\alpha \ge 0.005$; (ii) the LFR distribution should be rejected at any level of significance $\alpha \ge 0.04$; (iii) the EW distribution is rejected at $\alpha \ge 0.06$; (iv) both the GLFR and GLE distributions are rejected at $\alpha \ge 0.14$; (v) the EGLE distribution is not rejected at $\alpha \le 0.21$; and (vi) the EGLE distribution is the best distribution among all those used here to fit the data set, in the sense of having the highest p-value.

For parametric comparisons, we use the likelihood ratio (LR) test statistic, $\Lambda_{H_0} = 2(L_{H_a} - L_{H_0})$, to test the null hypotheses against the alternative mentioned above. In addition, the Akaike Information Criterion (AIC) of Akaike [11] is used to select the best model among several models. The AIC is given by $\mathrm{AIC} = -2 L_{\mathrm{Model}} + 2p$, where $p$ is the number of parameters of the model. The best model to fit the data is the one with the lowest AIC. Table 4 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC. From the p-values it is clear that we reject all the null hypotheses at level of significance $\alpha \ge 1.9 \times 10^{-3}$. Also, the EGLE distribution has the lowest AIC. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set. This conclusion supports the results obtained from the K–S test mentioned above.
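The likelihood ratio statistic, its p-value and the AIC described above are simple to compute once the maximised log-likelihoods are available. The sketch below uses only the standard library and closed forms for the chi-square upper tail with 1 or 2 degrees of freedom, which are the only cases needed for the five null hypotheses; the function names are illustrative.

```python
import math


def lr_statistic(loglik_alt, loglik_null):
    """Likelihood ratio statistic: 2 * (L_Ha - L_H0)."""
    return 2.0 * (loglik_alt - loglik_null)


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch covers df = 1 and df = 2 only")


def aic(loglik, n_params):
    """Akaike Information Criterion: -2 * L + 2 * p."""
    return -2.0 * loglik + 2.0 * n_params


# Devices data: EW under H0 (log-likelihood -229.114) against EGLE (-224.337), df = 1.
stat_ew = lr_statistic(-224.337, -229.114)
p_ew = chi2_sf(stat_ew, 1)
```

Here `stat_ew` is 9.554 and `p_ew` is about 1.995e-3; the same functions applied to the Weibull fit (log-likelihood -241.002, two parameters, df = 2) give a p-value of about 5.787e-8 and an AIC of 486.004.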
Table 2. (Devices data) Lifetimes of 50 devices [10].
0.1, 0.2, 1, 1, 1, 1, 1, 2, 3, 6,
7, 11, 12, 18, 18, 18, 18, 18, 21, 32,
36, 40, 45, 46, 47, 50, 55, 60, 63, 63,
67, 67, 67, 67, 72, 75, 79, 82, 82, 83,
84, 84, 84, 85, 85, 85, 85, 85, 86, 86
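Table 3. The MLEs of the parameters, the K–S values and p-values for devices data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| W($\sigma, c$) | $\hat{\sigma} = 44.913$, $\hat{c} = 0.949$ | 0.2397 | 0.0052 |
| LFR($a, b$) | $\hat{a} = 0.014$, $\hat{b} = 2.4 \times 10^{-4}$ | 0.1955 | 0.0370 |
| EW($\sigma, c, d$) | $\hat{\sigma} = 91.023$, $\hat{c} = 4.69$, $\hat{d} = 0.146$ | 0.1841 | 0.0590 |
| GLFR($a, b, d$) | $\hat{a} = 3.822 \times 10^{-3}$, $\hat{b} = 3.074 \times 10^{-4}$, $\hat{d} = 0.533$ | 0.1620 | 0.1293 |
| GLE($a, b, c$) | $\hat{a} = 9.621 \times 10^{-3}$, $\hat{b} = 4.52 \times 10^{-4}$, $\hat{c} = 0.73$ | 0.1598 | 0.1391 |
| EGLE($a, b, c, d$) | $\hat{a} = 3.307 \times 10^{-3}$, $\hat{b} = 1.738 \times 10^{-4}$, $\hat{c} = 4.564$, $\hat{d} = 0.112$ | 0.1475 | 0.2055 |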
[Figure 4: panel (a) plots the survival function against $x$; panel (b) plots the hazard function against $x$.]
Fig. 4. (a) The empirical and estimated survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for devices data. (b) Empirical and estimated hazard functions of the EGLE, GLE, GLFR and EW models for the devices data.
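Table 4. $H_0$, the log-likelihood values, the LR test statistics, p-values and AIC values for devices data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W($\sigma, c$) | $b = 0$, $a = \sigma^{-1}$, $d = 1$ | -241.002 | 33.330 | 2 | $5.787 \times 10^{-8}$ | 486.004 |
| LFR($a, b$) | $c = 1$, $d = 1$ | -238.064 | 27.454 | 2 | $1.093 \times 10^{-6}$ | 480.128 |
| GLE($a, b, c$) | $d = 1$ | -235.926 | 23.178 | 1 | $1.477 \times 10^{-6}$ | 370.240 |
| GLFR($a, b, d$) | $c = 1$ | -233.145 | 17.616 | 1 | $2.703 \times 10^{-5}$ | 369.173 |
| EW($\sigma, c, d$) | $b = 0$, $a = \sigma^{-1}$ | -229.114 | 9.554 | 1 | $1.995 \times 10^{-3}$ | 360.464 |
| EGLE($a, b, c, d$) | | $L_{H_a} = -224.337$ | – | – | – | 358.502 |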
Table 5. (Leukemia data) Lifetimes of 40 patients suffering from leukemia.
115, 181, 255, 418, 441, 461, 516, 739, 743, 789,
807, 865, 924, 983, 1024, 1062, 1063, 1165, 1191, 1222,
1222, 1251, 1277, 1290, 1357, 1369, 1408, 1455, 1478, 1549,
1578, 1578, 1599, 1603, 1605, 1696, 1735, 1799, 1815, 1852

[Figure 5: panel (a) plots the survival function against $x$; panel (b) plots the hazard function against $x$.]
Fig. 5. (a) The empirical and estimated survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for leukemia data. (b) Empirical and estimated hazard functions of the EGLE, GLE, GLFR and EW models for the leukemia data.
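Table 6. The MLEs of the parameters, the K–S values and p-values for leukemia data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| LFR($a, b$) | $\hat{a} = 9.501 \times 10^{-4}$, $\hat{b} = 4.229 \times 10^{-7}$ | 0.3585 | $4.143 \times 10^{-5}$ |
| W($\sigma, c$) | $\hat{\sigma} = 1143.3$, $\hat{c} = 1.055$ | 0.2680 | 0.0050 |
| EW($\sigma, c, d$) | $\hat{\sigma} = 734.185$, $\hat{c} = 1.265$, $\hat{d} = 2.973$ | 0.1321 | 0.4494 |
| GLFR($a, b, d$) | $\hat{a} = 2.102 \times 10^{-4}$, $\hat{b} = 1.389 \times 10^{-6}$, $\hat{d} = 1.553$ | 0.1183 | 0.5884 |
| GLE($a, b, c$) | $\hat{a} = 7.591 \times 10^{-5}$, $\hat{b} = 1.131 \times 10^{-6}$, $\hat{c} = 1.260$ | 0.1105 | 0.6727 |
| EGLE($a, b, c, d$) | $\hat{a} = 3.278 \times 10^{-5}$, $\hat{b} = 7.147 \times 10^{-7}$, $\hat{c} = 3.410$, $\hat{d} = 0.333$ | 0.0917 | 0.8591 |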
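Table 7. $H_0$, the log-likelihood values, the LR test statistics, p-values and AIC values for leukemia data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W($\sigma, c$) | $b = 0$, $a = \sigma^{-1}$, $d = 1$ | -319.874 | 38.555 | 2 | $4.245 \times 10^{-9}$ | 643.747 |
| LFR($a, b$) | $c = 1$, $d = 1$ | -318.458 | 35.723 | 2 | $1.749 \times 10^{-8}$ | 640.916 |
| EW($\sigma, c, d$) | $b = 0$, $a = \sigma^{-1}$ | -308.933 | 16.674 | 1 | $4.438 \times 10^{-5}$ | 623.866 |
| GLFR($a, b, d$) | $c = 1$ | -305.338 | 9.484 | 1 | $2.072 \times 10^{-3}$ | 616.677 |
| GLE($a, b, c$) | $d = 1$ | -304.111 | 7.029 | 1 | $8.019 \times 10^{-3}$ | 614.222 |
| EGLE($a, b, c, d$) | | $L_{H_a} = -300.596$ | – | – | – | 609.192 |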
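Table 8. The MLEs of the parameters, the associated K–S values and p-values for drug data.

| The model | MLE of the parameters | K–S | p-Value |
|---|---|---|---|
| LFR($a, b$) | $\hat{a} = 2.473 \times 10^{-4}$, $\hat{b} = 4.002 \times 10^{-6}$ | 0.1512 | 0.0422 |
| W($\sigma, c$) | $\hat{\sigma} = 603.995$, $\hat{c} = 1.66$ | 0.1421 | 0.0659 |
| GLFR($a, b, d$) | $\hat{a} = 1.24 \times 10^{-4}$, $\hat{b} = 3.444 \times 10^{-6}$, $\hat{d} = 0.732$ | 0.1154 | 0.2085 |
| EW($\sigma, c, d$) | $\hat{\sigma} = 858.334$, $\hat{c} = 1.716$, $\hat{d} = 0.537$ | 0.1094 | 0.2602 |
| GLE($a, b, c$) | $\hat{a} = 3.445 \times 10^{-5}$, $\hat{b} = 5.427 \times 10^{-6}$, $\hat{c} = 0.582$ | 0.0984 | 0.3806 |
| EGLE($a, b, c, d$) | $\hat{a} = 4.686 \times 10^{-4}$, $\hat{b} = 1.17 \times 10^{-6}$, $\hat{c} = 3.414$, $\hat{d} = 0.195$ | 0.0621 | 0.8900 |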
6.2. Leukemia data
Table 5 gives the data set studied by Abuammoh et al. [12], which represents the lifetimes in days of 40 patients suffering from leukemia at one of the Ministry of Health hospitals in Saudi Arabia. Fig. 5(a) shows the empirical and estimated survival functions under every model. Fig. 5(b) gives the empirical and fitted hazard functions of the models that are not rejected for the leukemia data. Fig. 5(b) shows an increasing hazard rate for the data set. Hence, any of the EGLE, GLE, GLFR, EW, W and LFR distributions could be appropriate to fit such data. To see which of these distributions is most appropriate, we calculate the MLEs of the parameters of each model and then use different test statistics to compare them. Table 6 gives the MLEs of the parameters, the K–S statistic and the corresponding p-value for every model.

Based on the p-values associated with the K–S values given in Table 6, we can conclude that: (i) both the W and LFR distributions should be rejected at any level of significance $\alpha > 0.005$; (ii) none of the four models EW, GLFR, GLE and EGLE is rejected at any reasonable level of significance; and (iii) the EGLE distribution is the best distribution among all those used here to fit the data set, in the sense of having the highest p-value.

Table 7 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC for the leukemia data. From the p-values it is clear that we reject all the null hypotheses at level of significance $\alpha \ge 8.019 \times 10^{-3}$. Also, the EGLE distribution has the lowest AIC. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set. This conclusion gives a more accurate comparison than that obtained from the K–S test mentioned above.

6.3. Drug data
The data come from a random sample of 82 prisoners who were imprisoned for drug offences and then released together under a general pardon. During 111 weeks, 65 of them were arrested again for either drug abuse or drug sale. The lifetime data consist of the times at which the prisoners returned to prison after their release. The data were collected from a prison in the Middle East and are available from the corresponding author.
[Figure 6: panel (a) plots the survival function against $x$; panel (b) plots the hazard function against $x$.]
Fig. 6. (a) The empirical and fitted survival functions of the EGLE, GLE, GLFR, EW, W and LFR models for the drug data. (b) The empirical and fitted hazard functions of the EGLE, GLE, GLFR, EW, W and LFR models for the drug data.
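Table 9. $H_0$, the log-likelihood values, the LR test statistics, p-values and AIC values for drug data.

| The model | $H_0$ | $L_{H_0}$ | $\Lambda_{H_0}$ | df | p-Value | AIC |
|---|---|---|---|---|---|---|
| W($\sigma, c$) | $b = 0$, $a = \sigma^{-1}$, $d = 1$ | -208.908 | 11.803 | 2 | $2.735 \times 10^{-3}$ | 421.815 |
| LFR($a, b$) | $c = 1$, $d = 1$ | -207.872 | 9.732 | 2 | $7.705 \times 10^{-3}$ | 419.744 |
| GLFR($a, b, d$) | $c = 1$ | -205.400 | 4.789 | 1 | 0.029 | 416.801 |
| GLE($a, b, c$) | $d = 1$ | -205.345 | 4.678 | 1 | 0.031 | 416.690 |
| EW($\sigma, c, d$) | $b = 0$, $a = \sigma^{-1}$ | -204.471 | 2.929 | 1 | 0.087 | 414.941 |
| EGLE($a, b, c, d$) | | $L_{H_a} = -203.006$ | – | – | – | 414.012 |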
Table 8 shows the MLEs of the parameters of every distribution used, the observed K–S test statistic values and the corresponding p-values for the six models used. Fig. 6(a) gives the empirical and parametric survival functions for the drug data set. Fig. 6(b) shows the empirical and parametric hazard functions for the drug data set. From Fig. 6(b), the data show a bathtub-shaped hazard; therefore it is expected that one of the EW, GLE, GLFR and EGLE distributions might be appropriate to fit them.

Based on the p-values associated with the K–S values given in Table 8, we can conclude that: (i) the LFR distribution should be rejected at any level of significance $\alpha \ge 0.0422$; (ii) the W model should be rejected at any level of significance $\alpha \ge 0.0659$; (iii) none of the four models EW, GLFR, GLE and EGLE is rejected at any level of significance $\alpha \le 0.2085$; and (iv) the EGLE distribution is the best distribution among all those used here to fit the data set, in the sense of having the highest p-value.

For more accurate comparisons between these distributions, we perform a further analysis using the likelihood ratio test statistics. Table 9 gives the null hypothesis $H_0$, the value of the log-likelihood function under $H_0$, $L_{H_0}$, the value of the likelihood ratio test statistic, $\Lambda_{H_0}$, its degrees of freedom, df, the corresponding p-value and the AIC for the drug data. From the p-values in Table 9, we can immediately conclude that: (i) both the LFR and W distributions are rejected at any level of significance $\alpha \ge 7.705 \times 10^{-3}$; (ii) the GLFR distribution should be rejected at the $\alpha \ge 0.029$ significance level; (iii) the GLE distribution should be rejected at the $\alpha \ge 0.031$ significance level; and (iv) the EW distribution should be rejected at the $\alpha \ge 0.087$ significance level. We conclude that the EGLE distribution is the best among all distributions used here to fit the current data set.

7. Conclusion
We have introduced a four-parameter distribution, called the exponentiated generalized linear exponential distribution, as a simple extension of the generalized linear exponential distribution [1], the generalized linear failure rate distribution [2] and the exponentiated Weibull distribution [6]. We discussed some statistical properties of the distribution, including the mean, median, mode, moments, measures of skewness and kurtosis, and the probability density of the order statistics and their moments. The maximum likelihood estimates of the four parameters of the new distribution are discussed, and we obtained the observed Fisher information matrix.
Three real data sets were analyzed using the new distribution, which was compared with the three immediate sub-models mentioned above and with two simpler two-parameter distributions (Weibull and linear failure rate). The comparisons showed that the new distribution provides a better fit to the three data sets than these competing distributions. We hope the new distribution will attract a wider range of applications in lifetime data and reliability analysis.
Acknowledgments
The authors thank the referees for their valuable comments, which improved the earlier version of the manuscript.
Appendix A
The elements of the observed Fisher information matrix for a complete data set,
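\[
I(\hat{\theta}) = -\begin{pmatrix}
U_{aa} & U_{ab} & U_{ac} & U_{ad} \\
 & U_{bb} & U_{bc} & U_{bd} \\
 & & U_{cc} & U_{cd} \\
 & & & U_{dd}
\end{pmatrix}_{\theta = \hat{\theta}},
\]
are given below, where $U_{jk}$ denotes the second partial derivative of the log-likelihood with respect to the parameters $j$ and $k$, $\psi_i = h_{LE}(x_i; a, b) = a + b x_i$, $\varpi_i = H_{LE}(x_i; a, b) = a x_i + \frac{b}{2} x_i^2$ and $\omega_i = \exp\{-(a x_i + \frac{b}{2} x_i^2)^c\}$:
\[
U_{aa} = -\sum_{i=1}^{n} \frac{1}{\psi_i^2} - (c-1)\sum_{i=1}^{n} \frac{x_i^2}{\varpi_i^2} - c(c-1)\sum_{i=1}^{n} x_i^2 \varpi_i^{c-2} + c(d-1)\sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-2}\omega_i\,[(1-\omega_i)(c-1) - c\varpi_i^{c}]}{(1-\omega_i)^2},
\]
\[
U_{ab} = -\sum_{i=1}^{n} \frac{x_i}{\psi_i^2} - \frac{c-1}{2}\sum_{i=1}^{n} \frac{x_i^3}{\varpi_i^2} - \frac{c}{2}(c-1)\sum_{i=1}^{n} x_i^3 \varpi_i^{c-2} + \frac{c}{2}(d-1)\sum_{i=1}^{n} \frac{x_i^3 \varpi_i^{c-2}\omega_i\,[(1-\omega_i)(c-1) - c\varpi_i^{c}]}{(1-\omega_i)^2},
\]
\[
U_{ac} = \sum_{i=1}^{n} \frac{x_i}{\varpi_i} - c\sum_{i=1}^{n} x_i \varpi_i^{c-1}\ln\varpi_i - \sum_{i=1}^{n} x_i \varpi_i^{c-1} + (d-1)\sum_{i=1}^{n} \frac{x_i \varpi_i^{c-1}\omega_i\,[(1-\omega_i)(1 + c\ln\varpi_i) - c\varpi_i^{c}\ln\varpi_i]}{(1-\omega_i)^2},
\]
\[
U_{ad} = c\sum_{i=1}^{n} \frac{x_i \varpi_i^{c-1}\omega_i}{1-\omega_i},
\]
\[
U_{bb} = -\sum_{i=1}^{n} \frac{x_i^2}{\psi_i^2} - \frac{c-1}{4}\sum_{i=1}^{n} \frac{x_i^4}{\varpi_i^2} - \frac{c(c-1)}{4}\sum_{i=1}^{n} x_i^4 \varpi_i^{c-2} + \frac{c(d-1)}{4}\sum_{i=1}^{n} \frac{x_i^4 \varpi_i^{c-2}\omega_i\,[(1-\omega_i)(c-1) - c\varpi_i^{c}]}{(1-\omega_i)^2},
\]
\[
U_{bc} = \frac{1}{2}\sum_{i=1}^{n} \frac{x_i^2}{\varpi_i} - \frac{c}{2}\sum_{i=1}^{n} x_i^2 \varpi_i^{c-1}\ln\varpi_i - \frac{1}{2}\sum_{i=1}^{n} x_i^2 \varpi_i^{c-1} + \frac{d-1}{2}\sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-1}\omega_i\,[(1-\omega_i)(1 + c\ln\varpi_i) - c\varpi_i^{c}\ln\varpi_i]}{(1-\omega_i)^2},
\]
\[
U_{bd} = \frac{c}{2}\sum_{i=1}^{n} \frac{x_i^2 \varpi_i^{c-1}\omega_i}{1-\omega_i},
\]
\[
U_{cc} = -\frac{n}{c^2} - \sum_{i=1}^{n} \varpi_i^{c}(\ln\varpi_i)^2 + (d-1)\sum_{i=1}^{n} \frac{\omega_i \varpi_i^{c}(\ln\varpi_i)^2\,[(1-\omega_i) - \varpi_i^{c}]}{(1-\omega_i)^2},
\]
\[
U_{cd} = \sum_{i=1}^{n} \frac{\omega_i \varpi_i^{c}\ln\varpi_i}{1-\omega_i}
\quad\text{and}\quad
U_{dd} = -\frac{n}{d^2}.
\]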
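Closed-form second derivatives of this kind are easy to get wrong, so it can be worth checking them numerically. The Python sketch below compares two of the simpler entries with central finite differences of the log-likelihood. It rests on stated assumptions: the log-likelihood is taken to be n ln c + n ln d + Σ ln(a + b x_i) + (c − 1) Σ ln(a x_i + (b/2) x_i²) − Σ (a x_i + (b/2) x_i²)^c + (d − 1) Σ ln(1 − exp{−(a x_i + (b/2) x_i²)^c}), and the entries are read as second partial derivatives of the log-likelihood, so that the entry for (d, d) is −n/d² and the entry for (c, d) is Σ ω_i ϖ_i^c ln ϖ_i / (1 − ω_i). The sample, the parameter values and all function names are hypothetical.

```python
import math

def egle_loglik(x, a, b, c, d):
    """Complete-sample log-likelihood under the assumed EGLE density."""
    total = len(x) * (math.log(c) + math.log(d))
    for xi in x:
        psi = a + b * xi                      # psi_i = a + b x_i
        varpi = a * xi + 0.5 * b * xi ** 2    # varpi_i = a x_i + (b/2) x_i^2
        omega = math.exp(-(varpi ** c))       # omega_i = exp(-varpi_i^c)
        total += (math.log(psi) + (c - 1.0) * math.log(varpi)
                  - varpi ** c + (d - 1.0) * math.log(1.0 - omega))
    return total

def u_dd(n, d):
    """Closed form assumed for the second derivative in d."""
    return -n / d ** 2

def u_cd(x, a, b, c):
    """Closed form assumed for the mixed derivative in c and d."""
    total = 0.0
    for xi in x:
        varpi = a * xi + 0.5 * b * xi ** 2
        omega = math.exp(-(varpi ** c))
        total += omega * varpi ** c * math.log(varpi) / (1.0 - omega)
    return total

def fd_dd(x, a, b, c, d, h=1e-3):
    """Central finite-difference second derivative in d."""
    def f(dd):
        return egle_loglik(x, a, b, c, dd)
    return (f(d + h) - 2.0 * f(d) + f(d - h)) / h ** 2

def fd_cd(x, a, b, c, d, h=1e-3):
    """Central finite-difference mixed derivative in c and d."""
    def f(cc, dd):
        return egle_loglik(x, a, b, cc, dd)
    return (f(c + h, d + h) - f(c + h, d - h)
            - f(c - h, d + h) + f(c - h, d - h)) / (4.0 * h ** 2)

sample = [0.1, 0.2, 1.0, 1.5, 2.3, 3.0]   # hypothetical lifetimes
a, b, c, d = 0.5, 0.3, 1.2, 0.8           # hypothetical parameter values
print(u_dd(len(sample), d), fd_dd(sample, a, b, c, d))
print(u_cd(sample, a, b, c), fd_cd(sample, a, b, c, d))
```

Each printed pair should agree to several decimal places. The same finite-difference check extends to the remaining entries by differencing in the corresponding pair of parameters.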
References
[1] M.A.W. Mahmoud, F.M.A. Alam, The generalized linear exponential distribution, Statist. Probabil. Lett. 80 (2010) 1005–1014.
[2] A. Sarhan, D. Kundu, Generalized linear failure rate distribution, Commun. Statist. Theory Methods 38 (5) (2009) 642–660.
[3] C.D. Lai, M. Xie, D.N.P. Murthy, Bathtub shaped failure rate distributions, in: N. Balakrishnan, C.R. Rao (Eds.), Handbook in Reliability, vol. 20, 2001, pp. 69–104.
[4] T. Zhang, M. Xie, L.C. Tang, S.H. Ng, Reliability and modeling of systems integrated with firmware and hardware, Int. J. Reliab. Quality Safety Eng. 12 (3) (2005) 227–239.
[5] H. Pham, C.D. Lai, On recent generalizations of the Weibull distribution, IEEE Trans. Reliab. 56 (2007) 454–458.
[6] G.S. Mudholkar, D.K. Srivastava, Exponentiated Weibull family for analyzing bathtub failure rate data, IEEE Trans. Reliab. 42 (1993) 299–302.
[7] R. Gupta, D. Kundu, Generalized exponential distribution, Aust. N. Z. J. Statist. 41 (2) (1999) 173–188.
[8] J.G. Surles, W.J. Padgett, Some properties of a scaled Burr type X distribution, J. Statist. Plann. Inference 128 (2005) 271–280.
[9] R.L. Burden, J.D. Faires, Numerical Analysis, ninth ed., Brooks/Cole, Cengage Learning, 2011.
[10] M.V. Aarset, How to identify bathtub hazard rate, IEEE Trans. Reliab. R-36 (1987) 106–108.
[11] H. Akaike, A new look at the statistical model identification, IEEE Trans. Automat. Control 19 (1974) 716–723.
[12] A.M. Abouammoh, S.A. Abdulghani, I.S. Qamber, On partial orderings and testing of new better than renewal used classes, Reliab. Eng. Syst. Safety 43 (1994) 37–41.
[13] M. Xie, Y. Tang, T.N. Goh, A modified Weibull extension with bathtub-shaped failure rate function, Reliab. Eng. Syst. Safety 76 (2002) 279–285.