Computational Statistics and Data Analysis 55 (2011) 677–686
Contents lists available at ScienceDirect
Computational Statistics and Data Analysis journal homepage: www.elsevier.com/locate/csda
The Poisson-exponential lifetime distribution Vicente G. Cancho a , Franscisco Louzada-Neto b,∗ , Gladys D.C. Barriga c a
Department of Applied Mathematics and Statistics, Universidade de São Paulo, Brazil
b
Department of Statistics, Universidade de São Paulo, Brazil
c
FEB, Universidade Estadual Paulista, Brazil
article
info
Article history: Received 22 November 2009 Received in revised form 26 May 2010 Accepted 26 May 2010 Available online 23 June 2010 Keywords: Complementary risks EM algorithm Exponential distribution Poisson distribution Survival analysis
abstract In this paper we proposed a new two-parameters lifetime distribution with increasing failure rate. The new distribution arises on a latent complementary risk problem base. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulae for its reliability and failure rate functions, quantiles and moments, including the mean and variance. A simple EM-type algorithm for iteratively computing maximum likelihood estimates is presented. The Fisher information matrix is derived analytically in order to obtaining the asymptotic covariance matrix. The methodology is illustrated on a real data set. © 2010 Elsevier B.V. All rights reserved.
1. Introduction The exponential distribution (ED) provides a simple, elegant and close form solution to many problems in lifetime testing and reliability studies. However, the ED does not provide a reasonable parametric fit for some practical applications where the underlying hazard rates are nonconstant, presenting monotone shapes. In recent years, in order to overcome such a problem, new classes of models were introduced based on modifications of the ED. Gupta and Kundu (1999) proposed a generalized ED. This extended family can accommodate data with increasing and decreasing failure rate functions. Kus (2007) proposed another modification of the ED with a decreasing failure rate function. While Barreto-Souza and CribariNeto (2009) generalizes the distribution proposed by Kus (2007) by including a power parameter in his distribution. In this paper, we propose a new distribution family based on the ED with increasing failure rate function. Its genesis is stated on a complementary risk problem base (Basu and Klein, 1982) in presence of latent risks, in the sense that there is no information about which factor was responsible for the component failure and only the maximum lifetime value among all risks is observed. The paper is organized as follows. Section 2 presents the new distribution and its properties. Section 3 outlines an EM-type algorithm for maximum likelihood estimation. Section 4 contains the results of a Monte Carlo simulation on the finite sample behavior of the maximum likelihood estimates and the EM algorithm. In Section 5 the methodology is illustrated in a real data set. Some final comments are presented in Section 6. 2. The Poisson-exponential distribution Let Y be a nonnegative random variable denoting the lifetime of a component in some population. The random variable Y is said to have a Poisson-Exponential distribution (PED) with parameters λ > 0 and θ > 0 if its probability density function
∗
Corresponding address: Departamento de Estatística, Universidade Federal de São Carlos, Caixa Postal 676, CEP 13565-905, São Carlos, São Paulo, Brazil. E-mail address:
[email protected] (F. Louzada-Neto).
0167-9473/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2010.05.033
678
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
Fig. 1. Left panel: probability density function of the PE distribution. Right panel: survival function of the PED. We fixed λ = 1.
is given by −λy
θ λe−λy−θ e , y > 0. (1) 1 − e−θ The parameter λ controls the scale of the distribution while the parameter θ controls its shape. As θ approaches zero, the PED converges to an ED with parameter λ. Fig. 1 (left panel) shows the probability density function (1) for θ = 0.1, 1, 2, 4, 8, which is decreasing if 0 < θ < 1 and unimodal for θ ≥ 1. The modal value λe−1 is obtained at y = log θ /λ. f (y) =
The reliability function of a PED random variable is given by S (y) =
1 − e−θ e
−λy
1 − e−θ
,
y > 0.
(2)
Fig. 1 (right panel) shows some survival function shapes for same fixed values of θ . We simulate the PED considering log(θ ) − log(− log(u − e−θ (u − 1))) Q (u) = F −1 (u) = ,
λ
where u has the uniform U (0, 1) distribution and F (y) = 1 − S (y) is the distribution function of Y . 2.1. Failure rate From (1) and (2) it is easy to verify that the failure rate function is given by
θ λ exp −λy − θ e−λy , h(y) = 1 − exp −θ e−λy
y > 0.
(3)
Fig. 2 (left panel) shows some failure rate function shapes for same fixed values of θ . The failure rate function (3) increases with time but stabilizes at λ. Indeed, the initial and long-term failure rate function values are both finite and are given by h(0) = λθ (eθ − 1) and h(∞) = λ. Let y0 = inf{H (y, y + δ)} = 1, where H (y, y + δ) = h(y + δ)/h(y) and δ > 0, be the threshold from which we observed stabilization of the failure rate function. Fig. 2 (right panel) shows the H (y, y + δ) = h(y + δ)/h(y) against time for same fixed values of θ . We observe that the threshold, y0 , is an increasing function of θ . This means that the threshold is lower for small values of the parameter θ in comparison with the threshold obtained for large values of θ . The concept of an increasing failure rate is very attractive in an engineering context where it has often been related to a mathematical representation of wearout (Marshall and Olkin, 2007). Besides, in some real life situations after increasing the failure rate the function may become stabilized. For instance, consider deep groove ball bearings. The ball bearings are a type of rolling-element bearing, which use balls to maintain the separation between the moving parts of the bearing with the purpose of reducing rotational friction and supporting radial and axial loads. In a ball bearing endurance test the number of revolutions is usually measured before failure of the ball bearings (Lawless, 2003). The ball bearings may be exposed
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
679
Fig. 2. Left panel: failure rate function of the PE distribution. Right panel: H (y, y + δ) against time. We fixed λ = 1.
to several unobserved complementary risks, such as risk of contamination from dirt from the casting of the casing, wear particles from hardened steel gear wheels and harsh working environments, amongst others, leading to a increasing failure rate function. However, if a ball bearing survives exposure to several complementary risks its failure rate may stabilize during the rest of its mission time. As another real life situation, consider a observational dataset on the serum-reversal time (in days) of children contaminated with HIV from vertical transmission (Perdona, 2006). Serum-reversal is a process of the disappearance of anti-HIV antibodies (antitoxin) in blood (neutralization of anti-HIV serology) in an individual who previously showed positive anti-HIV serology. Serum-reversal can occur in children born from mothers infected with HIV. Their children are born with positive anti-HIV serology (vertical HIV transmission), which can occur due to unobserved complementary risks, such as intrauterine and intra-parturition transplacental transmission of the mother antibodies to her baby during labor and in the period following childbirth while the infant is breastfed. After a few months, the mother antibodies are eliminated and the anti-HIV serology may change from positive to negative, leading to an increasing failure rate function. However, as time passes, the chance of serum-reversal is stabilized. 2.2. Genesis Complementary risk problems (Basu and Klein, 1982) arise in several areas, such as public health, actuarial science, biomedical studies, demography and industrial reliability. In the classical complementary risk scenarios the lifetime associated with a particular risk is not observable, rather we observe only the maximum lifetime value among all risks. Simplistically, in reliability, we observe only the maximum component lifetime of a parallel system. That is, the observable quantities for each component are the maximum lifetime value to failure among all risks, and the cause of failure. Full statistical procedures and extensive literature are available to deal with complementary risk problems and interested readers can refer to Lawless (2003), Crowder et al. (1991) and Cox and Oakes (1984). A difficulty arises if the risks are latent in the sense that there is no information about which factor was responsible for the component failure, which can be often observed in field data. We call these latent complementary risk data. On many occasions this information is not available or it is impossible that the true cause of failure is specified by an expert. In reliability, the components can be totally destroyed in the experiment. Further, the true cause of failure can be masked from our view. In modular systems, the need to keep a system running means that a module that contains many components can be replaced without the identification of the exact failing component. Goetghebeur and Ryan (1995) addressed the problem of assessing covariate effects based on a semi-parametric proportional hazards structure for each failure type when the failure type is unknown for some individuals. Reiser et al. (1995) considered statistical procedures for analyzing masked data, but their procedure cannot be applied when all observations have an unknown cause of failure. Lu and Tsiatis (2001) presents a multiple imputation method for estimating regression coefficients for risk modeling with a missing cause of failure. A comparison of two partial likelihood approaches for risk modeling with a missing cause of failure is presented in Lu and Tsiatis (2005). Our model can be derived as follows. Let M be a random variable denoting the number of complementary risks related to the occurrence of an event of interest. Further, assume that M has a zero truncated Poisson distribution with a probability mass function given by P (M = m) =
e−θ θ m m!(1 − e−θ )
,
m = 1, 2 . . . , θ > 0.
(4)
680
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
Let Tj (j = 1, 2, . . .) denote the time-to-event due to the j-th complementary risks, hereafter lifetime, which are independent of M. Given M = m, the random variables Tj , j = 1, . . . , m are assumed to be independent and identically distributed according to a ED with a common distribution function with a probability density function given by f (t ; λ) = λ exp (−λt ) ,
t > 0, λ > 0.
(5)
In the latent complementary risks scenario, the number of causes M and the lifetime Tj associated with a particular cause are not observable (latent variables), but only the maximum lifetime Y among all causes is usually observed. So, the component lifetime is defined as Y = max (T1 , T2 , . . . , TM ) .
(6)
The following result shows that the random variable Y has a probability density function given by (1). Proposition 2.1. If the random variable Y is defined as in (6), then, considering (4) and (5), Y is distributed according to a PED with probability density function given by (1). The proof of Proposition 2.1 is given in the Appendix. The PED parameters have a direct interpretation in terms of complementary risks. The θ /(1 − e−θ ) represents the mean of the number of complementary risks, while λ denotes the failure rate of the distribution of time-to-event due to individual complementary risks. In comparison with the formulation of Kus (2007) we follow an opposite way, since he defines the component lifetime as Y = min (T1 , T2 , . . . , TM ), while we are considering Y = max (T1 , T2 , . . . , TM ) as stated in (6). 2.3. Moments Some of the most important features and characteristics of a distribution can be studied through its moments, such as mean, variance, tending, dispersion skewness and kurtosis. A general expression for the rth ordinary moment µ0r = E (Y r ) of the PED can be obtained analytically considering the generalized hypergeometric function denoted by Fp,q (a, b, θ ) and defined as
Fp,q (a, b, θ ) =
j=0
p Q
θj
∞ X
Γ (ai + j)Γ (ai )−1
i=1
Γ (j + 1)
q Q
,
(7)
Γ (bi + j)Γ (bi )−1
i=1
where a = [a1 , . . . , ap ], p is the number of operands of a, b = [b1 , . . . , bq ], and q is the number of operands of b. We then formulate the following result. Proposition 2.2. Suppose Y denotes a random variable from a PED with parameters λ > 0 and θ > 0 and probability density function given by (1), then
µ0r =
θ Γ (r + 1) Fr +1,r +1 ([1, . . . , 1], [2, . . . , 2], −θ ), λr (1 − e−θ )
(8)
where Fp,q (a, b, θ ) is the generalized hypergeometric function. The proof of 2.2 is obtained by direct integration and it is then omitted. Considering (8), the mean and variance of the of the PED are given, respectively, by
θ F2,2 ([1, 1], [2, 2], −θ ), λ(1 − e−θ ) θ θ 2 Var(Y ) = 2 F ([ 1 , 1 , 1 ], [ 2 , 2 , 2 ], −θ ) − F ([ 1 , 1 ], [ 2 , 2 ], −θ ) . 3 , 3 λ (1 − e−θ ) (1 − e−θ ) 2,2 E (Y ) =
3. Inference In this section we discuss inference and interval estimation for the model parameters as well as proposing an EMalgorithm. 3.1. Estimation by maximum likelihood The log-likelihood function based on an observed sample of size n, y = (y1 , . . . , yn ), from a PED is given by
`(ϑ) = n log(θ λ) − λ
n X i=1
yi − θ
n X i=1
e−λyi − n log(1 − e−θ ).
(9)
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
681
The MLEs of ϑ = (θ , λ) can be directly obtained by maximizing the log-likelihood function (9), or alternatively, by finding the solution for the following two nonlinear equations, n X ∂`(ϑ) n n = − θ − e−λyi = 0, ∂θ θ e −1 i =1
(10)
n n X X ∂`(ϑ) n = − yi + θ yi e−λyi = 0. ∂λ λ i =1 i=1
(11)
The Eq. (11) can be solved exactly for θ , i.e., n P
θˆ =
yi − nb λ−1
i=1 n
P
,
(12)
yi e−λˆ yi
i=1
ˆ , where θˆ and λˆ are the MLEs for θ and λ, respectively. Therefore, it is sufficient to solve the conditional upon the value of λ Eq. (10) iteratively in order to find the MLE for λ. In the following the Propositions 3.1 and 3.2 express which conditions are needed in order to obtain the existence and uniqueness of the MLE when the other parameter is given (or known). Proposition 3.1. Let U1 (θ; λ, y) denote the score function expressed by (10), where λ is the true value of the parameter. Then, Pn −λy i < n. the root of U1 (θ; λ, y) = 0 and θˆ is unique for i=1 e 2
Pn Proposition 3.2. Let U2 (λ; θ , y) denote the score function expressed by (11) and y¯ = n−1 i=1 yi , where θ is the true value of ˆ lies in the interval [¯y(θ + 1)]−1 , y¯ −1 . the parameter. Then, for given θ ∈ (0, e2 ) the root of U2 (λ; θ , y) = 0 and λ
The proof of the Proposition 3.1 is presented in the Appendix and proof of Proposition 3.2 following directly from Theorem 2 of Kus (2007). 3.2. An EM-algorithm In this subsection we develop an EM-algorithm (Dempster et al., 1977) for obtaining the MLEs of the PED parameters. Let y = (y1 , . . . , yn )> and m = (m1 , . . . , mn )> . It follows that the complete log-likelihood function associated with (y, m) is given by n X
`c (ϑ|y, m) =
log(f (yi , mi |ϑ)),
(13)
i =1
where f (yi , mi |ϑ) = mi θ mi λ[1 − e−λyi ]mi −1 e−θ−λyi /Γ (mi + 1)(1 − e−θ ), mi = 1, 2, . . ., yi > 0. bi = E [Mi |ϑ = b Considering m ϑ, yi ] we can direct to the implementation of the M-step, which consists in maximizing the expected complete data function or the Q -function over ϑ , given by (k)
(k)
Q (ϑ|b ϑ ) = E [`c (ϑ)|y, b ϑ ] = c + n log λ +
n n X X (b mi − 1) log(1 − e−λyi ) − λ yi i=1
+
n X
i =1
bi log θ − n log(eθ − 1), m
i=1
(k)
bi = 1 + b where c is a constant that is independent of ϑ , m θ (1 − e−λyi ) and b ϑ Then, the E-step and the M-step of the algorithm are given by, b
(k)
is an updated value of b ϑ.
(k)
bi . E-step: Given a current estimate b ϑ , compute m (k)
M-step: Update b ϑ n
P b θ (k+1) =
i=1
(k)
by maximizing Q (ϑ|b ϑ ) over ϑ, which leads to the following nice expressions b(k+1)
(k)
bi (1 − eθ m
,
n
b λ(k+1) =
)
n n P
i =1
yi −
n P i=1
(k)
b(k+1) y
(b mi −1)yi e−λ
i
.
b(k+1) y i 1−e−λ
The EM iteration for both parameters θ and λ is not explicit and a Newton-Raphson algorithm for the M-Step of a EM cycle is required (Kus, 2007).
682
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
3.3. Interval estimation Large sample inference for the parameters can be based, in principle, on the MLEs and their estimated standard errors. Following Cox and Hinkley (1974), under suitable regularity conditions, it can be shown that D R(ϑ) b ϑ − ϑ −→ N (0, I) ,
as n −→ ∞, where I denotes the unity matrix and R(ϑ) denotes the Choleski decomposition of the Fisher information matrix, I(ϑ), that is R(ϑ)> R(ϑ) = I(ϑ). For PED the Fisher information matrix is given by
∂ 2 `(ϑ) 2 ∂θ I(ϑ) = − ∂ 2 `(ϑ) E ∂θ ∂λ E
∂ 2 `(ϑ) ∂λ ∂θ ∂ 2 `(ϑ) E ∂λ2
E
,
where the second derivatives are given by ∂ 2 `(ϑ)/∂θ 2 = −n/θ 2 + neθ /(eθ − 1)2 , ∂ 2 `(ϑ)/∂θ ∂λ = P ∂ 2 `(ϑ)/∂λ2 = −nλ−2 − θ ni=1 y2i e−λyi . Then, the elements of the Fisher Information matrix are derived as,
Pn
i=1
yi e−λyi and
∂ 2 `(ϑ) n neθ = 2− θ , 2 ∂θ θ (e − 1)2 2 ∂ `(ϑ) nθ E =− F2,2 ([2, 2], [3, 3], −θ ), ∂θ ∂λ 4λ(1 − e−θ ) 2 ∂ `(ϑ) nθ 2 n E = 2 F3,3 ([2, 2, 2], [3, 3, 3], −θ ) + 2 . 2 ∂λ 4λ (1 − e−θ ) λ
E
ˆ provides the asymptotic variance-covariance matrix of the MLES. Alternative estimates The inverse of I(ϑ), evaluated at ϑ can be obtained from the inverse observed information matrix since it is a consistent estimator of I−1 (ϑ). 4. Simulation study This section presents the results of a simulation study based on the assumptions given in the Propositions 3.1 and 3.2, which guarantee the existence and uniqueness of the solution of the score functions. The proposed EM-algorithm is considered. No restriction has been imposed on the maximum number of iterations and convergence is assumed when the absolute difference between successive estimates are less that 10−5 . We first assess the accuracy of the approximation of the variance and covariance of the MLEs determined though the Fisher information matrix. The study was based on 1000 generated dataset from the PED with five different set of parameters ˆ and Cov(θˆ , λ) ˆ as well as the approximate values by for n = 30, 50, 100 and 200. The simulated values of Var(θˆ ), Var(λ) averaging the corresponding values obtained from the expected and observed information matrices are presented in Table 1. It is observed that the approximated values determined from the expected and observed information matrices are close to the simulated ones for large values of n. Furthermore, it is noted that the approximation becomes quite accurate as n increases. Additionally, the variances and covariances of MLEs obtained from the observed information matrix are quite close to that obtained from the expected information matrix for large value of n. Besides, simulations have been performed in order to investigate the proposed estimator of θ and λ and the convergence of the proposed EM scheme. We generated 1000 samples of size n = 30, 50, 100 and 200 from the PED for each one of the ten ˆ , and the average set of values of ϑ. The results are condensate in Table 2, which shows the averages of the 1000 MLEs, Av(ϑ) ˆ number of iterations to convergence, Av(k), together with their standard errors, SE (ϑ) and SE (k). Besides, Table 2 shows the coverage probability of the 95% confidence intervals for parameters of the PED, C (ϑ). Table 2 indicates the following results: convergence has been achieved in all cases, even when the starting values are poor and this emphasizes the numerical stability of the EM algorithm. The differences between the average estimates and the true values are almost always smaller than one empirical SE. These results suggest the EM estimates have performed consistently. The standard errors of the MLEs decrease when sample size increases. The empirical coverage probabilities are close to the nominal coverage level. Particularly, as sample size increases. 5. Application In this section we reanalyze the data set extracted from Lawless (2003). The lifetimes are the number of million revolutions before failure for each one of the 23 ball bearings on an endurance test of deep groove ball bearings. As pointed out in Section 2.1, even though the complementary risks are unobserved, one may speculate on some possible complementary risks for the deep groove ball bearings data. For instance, we can consider the risk of contamination from
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
683
Table 1 Mean of the variances and covariances of the MLEs.
(θ, λ)
n
Simulated
From expected Information
From observed information
Var(θˆ )
ˆ) Var(λ
ˆ λˆ ) Cov(θ,
Var(θˆ )
ˆ) Var(λ
ˆ λˆ ) Var(θ,
Var(θˆ )
ˆ) Var(λ
ˆ λˆ ) Var(θ,
30
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
1.351 1.225 1.192 1.736 3.211
0.391 0.299 0.195 0.116 0.0922
0.608 0.491 0.372 0.336 0.418
1.371 1.322 1.367 2.356 5.513
0.425 0.322 0.211 0.127 0.101
0.622 0.5211 0.4118 0.402 0.559
1.269 1.282 1.349 2.422 5.517
0.400 0.321 0.212 0.127 0.102
0.568 0.506 0.409 0.407 0.562
50
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
0.811 0.735 0.715 1.041 1.927
0.235 0.179 0.117 0.070 0.056
0.365 0.294 0.223 0.202 0.251
0.820 0.781 0.778 1.217 2.464
0.246 0.192 0.124 0.073 0.058
0.369 0.312 0.238 0.221 0.289
0.775 0.764 0.774 1.221 2.710
0.234 0.186 0.122 0.073 0.059
0.346 0.302 0.236 0.221 0.305
100
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
0.405 0.367 0.358 0.521 0.963
0.117 0.090 0.059 0.0350 0.0276
0.182 0.147 0.112 0.101 0.125
0.411 0.381 0.371 0.561 1.081
0.120 0.093 0.061 0.0357 0.028
0.185 0.153 0.116 0.106 0.134
0.400 0.377 0.371 0.554 1.079
0.117 0.091 0.060 0.036 0.029
0.179 0.150 0.115 0.105 0.134
200
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
0.203 0.184 0.179 0.260 0.482
0.059 0.045 0.029 0.018 0.014
0.091 0.074 0.056 0.050 0.063
0.205 0.188 0.182 0.269 0.516
0.060 0.046 0.030 0.018 0.014
0.092 0.075 0.057 0.051 0.065
0.201 0.187 0.182 0.269 0.507
0.059 0.046 0.030 0.018 0.014
0.091 0.075 0.056 0.051 0.065
Fig. 3. Left panel: Empirical scaled TTT-Transform for the data. Right panel: Kaplan–Meier curve with estimated RFs of the PED and ED.
dirt from the casting of the casing, the wear particles from hardened steel gear wheels and the harsh working environments, amongst others. Firstly, in order to identify the shape of a lifetime data failure rate function we shall consider a graphical method on Pr Pbased r the TTT plot (Aarset, 1987). In its empirical version the TTT plot is given by G(r /n) = [( i=1 Yi:n ) + (n − r )Yr :n ]/( i=1 Yi:n ), where r = 1, . . . , n and Yi:n represent the order statistics of the sample. It has been shown that the failure rate function is increasing (decreasing) if the TTT plot is concave (convex). Although, the TTT plot is only a sufficient condition, not a necessary one for indicating the failure rate function shape, it is used here as a crude indicative of its shape. Fig. 3 (left panel) shows the TTT plot for the considered data, which is concave indicating an increasing failure rate function, which can be properly accommodated by a PED. Then, the PED was fitted to the data. In order to obtain the MLEs of the parameters we used the EM-algorithm, considering as an initial guess θ = 1 and λ = 1/¯y (a moment estimator for λ from an independent identically distributed exponential observation). Of course, this choice is not foolproof and it is advisable to run the method several times from different starting values. Table 3 presents the MLEs and the corresponding 95% confidence intervals of the PED parameters, which were based on the Fisher information matrix at the MLEs, given by, I(b ϑ) = −
0.41204 −137.4731
−137.4731 . 73647.5948
684
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
Table 2
ˆ , the average number of iterations to convergence, Av(k), their standard errors, SE (ϑ) ˆ and SE (k), and the coverage The averages of the 1000 MLEs, Av(ϑ) probabilities of the 95% confidence intervals for parameters of the PED, C (ϑ). The initial values are given by ϑ(0) = (θ (0) , λ(0) ). ϑ
ϑ(0)
Av(b ϑ)
SE(b ϑ)
C
Av(k)
SE(k)
30
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (1.0, 1.0) (2.0, 3.0) (3.5, 1.0) (6.0, 3.0) (8.5, 1.0)
(0.882, 2.250) (1.309, 2.174) (2.225, 2.095) (4.379, 2.081) (6.736, 2.073) (0.942, 2.280) (1.263, 2.168) (2.247, 2.123) (4.412, 2.06) (6.860, 2.087)
(0.860, 0.567) (1.022, 0.5477) (1.174, 0.454) (1.646, 0.371) (2.466, 0.353) (0.888, 0.570) (1.038, 0.540) (1.136, 0.469) (1.642, 0.376) (2.388, 0.330)
(0.969, 0.967) (0.964, 0.941) (0.982, 0.965) (0.979, 0.975) (0.964, 0.942) (0.963, 0.959) (0.964, 0.962) (0.982, 0.937) (0.970, 0.939) (0.979, 0.95)
200.780 211.341 213.563 222.167 201.332 255.115 234.818 235.345 224.165 214.165
4.678 3.111 2.992 3.123 3.341 5.001 4.678 5.236 4.165 8.321
50
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (1.0, 1.0) (2.0 3.0) (3.5 1.0) (6.0 3.0) (8.5 1.0)
(0.780, 2.163) (1.194, 2.121) (2.110, 2.046) (4.280, 2.064) (6.529, 2.063) (0.7378, 2.152) (1.165, 2.103) (2.081, 2.04) (4.300, 2.055) (6.459, 2.046)
(0.723, 0.44) (0.803, 0.440) (0.860, 0.343) (1.101 0.276) (1.739, 0.268) (0.709, 0.437) (0.811, 0.429) (0.874, 0.356) (1.183, 0.279) (1.687, 0.259)
(0.949, 0.956) (0.965 0.949) (0.971, 0.951) (0.972 0.944) (0.961, 0.929) (0.959, 0.962) (0.959, 0.951) (0.972, 0.943) (0.967, 0.939) (0.959, 0.945)
129.159 128.132 139.211 110.260 111.810 150.630 149.024 161.680 131.690 154.140
3.632 4.784 4.322 3.461 4.607 6.261 4.689 4.231 5.733 6.141
100
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (1.0, 1.0) (2.0, 3.0) (3.5, 1.0) (6.0, 3.0) (8.5, 1.0)
(0.6014, 2.068) (1.047, 2.029) (2.059, 2.020) (4.112, 2.023) (6.23, 2.022) (0.601, 2.068) (1.100, 2.059) (2.071, 2.031) (4.064, 2.016) (6.175, 2.021)
(0.516, 0.312) (0.5958, 0.299) (0.5943, 0.230) (0.7498, 0.190) (1.053, 0.170) (0.525, 0.312) (0.596, 0.291) (0.628, 0.247) (0.734, 0.184) (1.124, 0.210)
(0.952, 0.953) (0.949, 0.951) (0.952, 0.954) (0.956, 0.948) (0.952, 0.955) (0.951, 0.948) (0.947, 0.955) (0.959, 0.953) (0.957, 0.956) (0.955, 0.946)
138.961 128.827 127.845 129.936 141.400 140.72 138.833 131.660 141.140 151.140
2.522 3.559 3.319 4.148 1.420 4.313 2.738 2.156 3.029 2.945
200
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0)
(0.5, 2.0) (1.0, 2.0) (2.0, 2.0) (4.0, 2.0) (6.0, 2.0) (1.0, 1.0) (2.0, 3.0) (3.5, 1.0) (6.0, 3.0) (8.5, 1.0)
(0.561, 2.028) (1.015, 2.015) (2.029, 2.010) (4.060, 2.006) (6.134, 2.018) (0.564, 2.035) (1.017, 2.019) (2.045, 2.019) (4.072, 2.017) (6.039, 1.998)
(0.404, 0.234) (0.435, 0.215) (0.425, 0.170) (0.518, 0.130) (0.733, 0.117) (0.414, 0.251) (0.443, 0.216) (0.417, 0.181) (0.533, 0.135) (0.774, 0.148)
(0.954, 0.951) (0.951, 0.952) (0.954, 0.951) (0.953, 0.949) (0.951, 0.950) (0.957, 0.949) (0.949, 0.949) (0.956, 0.943) (0.952, 0.946) (0.949, 0.951)
131.705 126.530 118.478 121.576 136.030 140.820 136.913 121.700 139.75 150.32
3.420 3.281 3.037 2.364 3.3712 5.496 3.080 4.347 3.848 4.132
n
For comparison of nested models, which is the case when comparing the PED with the ED, we can compute the maximum values of the unrestricted and restricted log-likelihoods to obtain the likelihood ratio statistics (LRS). For testing H0 : θ = 0 versus H1 : θ > 0 we consider the LRS, wn = 2(lPED − lED ), where lED and lPED are the log-likelihoods for the model under the restricted hypothesis H0 and under the unrestricted hypothesis H1 under a sample of size n, respectively. Taking into account that the test is performed in the boundary of the parameter space, following Maller and Zhou (1995), the LRS, wn , is assumed to be asymptotically distributed as a symmetric mixture of a chi-squared distribution with one degree of freedom and a point-mass at zero. Then, limn→∞ P (wn ≤ c ) = 1/2 + 1/2P (χ12 ≤ c ), where P (χ12 ≤ c ) denotes a random variable with a chi-square distribution with one degree of freedom. Large positive values of wn give favorable evidence to the full model. The last column of Table 3 presents the log-likelihood values for both models. Thus, wn is equal to 16.5744, with a p-value < 0.0000, which is strong evidence in favor of the PED. We also compare the PED and ED by inspection of the Akaikés information criterion (AIC), −2`(b ϑ) + 2q, and Schawarźs Bayesian information criterion (BIC), −2`(b ϑ) + q log(n), where q is the number of parameters in the model and n is the sample size. The preferred model is the one with the smaller value on each criterion. The estimated statistics AIC and BIC for the PED are equal to 230.3042 and 235.5145, respectively. While, the estimated statistics AIC and BIC for the ED are equal to 244.8786 and 247.4838, respectively. The PED overcomes the corresponding ED in both considered criterion. These results are corroborated by the plot in the right panel of Fig. 3, which compares the estimated failure rate function obtained via ED and PED fitting on the empirical Kaplan–Meier failure rate estimate. Besides, for the sake of illustration, in order to compare our PED with other generalization of the exponential distribution, we fitted the generalization of the exponential Poisson (GEP) distribution proposed by Barreto-Souza and Cribari-Neto (2009), with pdf given by, f (y) = αθ λ exp(−θ − λy + θ e−λy )(1 − exp(−θ + θ e−λy ))α−1 (1 − e−θ )−α ,
y>0
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
685
Table 3 MLEs and their corresponding 95% confidence intervals. Parameter Distribution
θ
λ
`(·)
PED
7.33919 (2.3678 ;12.3105) —
0.03584 (0.0241; 0.0476) 0.01384 (0.00819 ; 0.01950)
−113.1521
ED
−121.4393
with all parameters α, θ , λ > 0. The MLEs of the parameters of the GEP model are given by αˆ = 5.7527, θˆ = 0.2415 and λˆ = 0.0319, with the maximum log-likelihood value equal to `(·) = −112.9745. The estimated statistics AIC and BIC for the GEP distribution are equal to 231.9489 and 239.7645, respectively. Notice that our PED outperforms the GEP distribution in both considered criterion. 6. Concluding remarks In this paper we proposed the PED as a possible extension to the well known ED. We provide a mathematical treatment of this distribution including a formal proof of its probability density function and explicit algebraic formulae for its rate function and failure rate function, quantiles and moments (particularly, for the mean and variance). The parameters of the PED have a direct interpretation in terms of complementary risks. We discussed MLE, providing an EM-algorithm, which presented an adequate numerical stability in the simulation study performed. The PED allows a straightforwardly nested hypothesis testing procedure for comparison with its ED particular case. The practical relevance and applicability of the PED were demonstrated in a real data set. In Section 3.2, we observed that the EM iteration for both parameters θ and λ is not explicit and a Newton-Raphson algorithm for M-Step of a EM cycle is required. As pointed out by a referee, in order to overcome such a problem one may implement the nested EM algorithm proposed by Karlis (2009). This would however introduce extra difficulties in the analysis and needs further work, since the EM algorithm proposed by Karlis (2009) considers a situation were the EM iteration is not explicit for a unique parameter, while in our case, the EM iteration is not explicit for both parameters. Acknowledgement The research of Francisco Louzada-Neto is funded by the Brazilian organization CNPq. Appendix Proof of Proposition 2.1 The conditional density function of (6) given M = m is given by f (y|M = m, λ) = mλ[1 − e−λy ]m−1 e−λy ,
t > 0, m = 1, . . . .
Them, the marginal probability density function of Y is given by f (y) =
∞ X
mλ[1 − e−λy ]m−1 e−λy ×
m=1
θ m e−θ m!(1 − e−θ )
=
∞ θ λe−θ−λy X [θ (1 − e−λy )]m−1 1 − e−θ m=1 (m − 1)!
=
θ λe−λy−θ e 1 − e−θ
−λy
.
This completes the proof. Proof of Proposition 2.2 −λyi Since limθ→0 U1 (θ; λ, y) = n/2 − > 0 when i=1 e−λyi < n/2. On the other hand, Since limθ→∞ i =1 e Pn −λy U1 (θ; λ, y) = − i=1 e i then U1 (θ; λ, y) < 0 and therefore, the equation U1 (θ; λ, y) = 0 has at least one root. For the proof of proposition, we need to show the U1 is strictly decreasing in θ . To prove the U1 strictly decreasing in θ , the first derivative of U1 is considered and it is given by
Pn
U10 (θ; λ, y) = −n
e2θ − 2eθ − θ 2 eθ + 1
θ 2 (eθ − 1)2
Pn
.
686
V.G. Cancho et al. / Computational Statistics and Data Analysis 55 (2011) 677–686
If w(θ ) = e2θ − 2eθ − θ 2 eθ + 1 > 0, then U10 (θ; λ, y) < 0 and, hence, U1 is strictly decreasing in θ . Since w(θ ) = 4 exp(θ ) [sinh(θ /2) − θ /2] [sinh(θ /2) + θ /2] and sinh(θ /2) − θ /2 > 0, then w(θ ) > 0. This completes the proof. In fact, many steps in our proof are borrowed from Kus (2007). References Aarset, M.V., 1987. How to identify a bathtub hazard rate. IEEE Transactions on Reliability 2 (1), 106–108. Barreto-Souza, W., Cribari-Neto, F., 2009. A generalization of the exponential-Poisson distribution. Statistics and Probability Letters 79, 2493–2500. Basu, A., Klein, J., 1982. Some recent development in competing risks theory. In: Crowley, J., Johnson, R.A. (Eds.), Survival Analysis, Vol. 1 (1). IMS, Hayward, pp. 216–229. Cox, D., Oakes, D., 1984. Analysis of Survival Data. Chapman and Hall, London. Cox, D.R., Hinkley, D.V., 1974. Theoretical Statistics. Chapman and Hall, London. Crowder, M., Kimber, A., Smith, R., Sweeting, T., 1991. Statistical Analysis of Reliability Data. Chapman and Hall, London. Dempster, A., Laird, N., Rubin, D., 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39, 1–38. Goetghebeur, E., Ryan, L., 1995. A modified log rank test for competing risks with missing failure type. Biometrika 77 (2), 207–211. Gupta, R., Kundu, D., 1999. Generalized exponential distributions. Australian and New Zealand Journal of Statistics 41 (2), 173–188. Karlis, D., 2009. A note on the exponential Poisson distribution: A nested em algorithm. Computational Statistics & Data analysis 53 (4), 894–899. Kus, C., 2007. A new lifetime distribution distributions. Computational Statistics & Data analysis 11, 4497–4509. Lawless, J.F., 2003. Statistical Models and Methods for Lifetime Data, second edition. Wiley, New York, NY. Lu, K., Tsiatis, A.A., 2001. Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics 57 (4), 1191–1197. Lu, K., Tsiatis, A.A., 2005. Comparison between two partial likelihood approaches for the competing risks model with missing cause of failure. Lifetime Data Analysis 11 (1), 29–40. Maller, R.A., Zhou, S., 1995. Testing for the presence of immune or cured individuals in censored survival data. Biometrics 51 (4), 1197–1205. Marshall, A.W., Olkin, I., 2007. Life Distributions: Structure of Nonparametric, Semiparametric and Parametric Families. Springer, New York. Perdona, G.S.C., 2006. Modelos de Riscos Aplicados à Análise de Sobrevivência. Doctoral Thesis. Institute of Computer Science and Mathematics, University of São Paulo, Brazil (in Portuguese). Reiser, B., Guttman, I., Lin, D., Guess, M., Usher, J., 1995. Bayesian inference for masked system lifetime data. Applied Statistics 44 (1), 79–90.