Accepted Manuscript
The beta generalized Gompertz distribution Lazhar Benkhelifa PII: DOI: Reference:
S0307-904X(17)30132-4 10.1016/j.apm.2017.02.036 APM 11624
To appear in:
Applied Mathematical Modelling
Received date: Revised date: Accepted date:
10 August 2015 19 January 2017 13 February 2017
Please cite this article as: Lazhar Benkhelifa, The beta generalized Gompertz distribution, Applied Mathematical Modelling (2017), doi: 10.1016/j.apm.2017.02.036
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Highlights • A new continuous model called the beta generalized Gompertz distribution is introduced.
CR IP T
• Some mathematical properties of the new model are studied.
• The model parameters are estimated by maximum likelihood.
AC
CE
PT
ED
M
AN US
• The usefulness of the new model is illustrated in an application to real data using goodness-of-fit tests.
1
Manuscript - For Review lick here to view linked References
The beta generalized Gompertz distribution
CR IP T
Lazhar Benkhelifa Laboratory of Applied Mathematics, Mohamed Khider University, Biskra, 07000, Algeria
AN US
Department of Mathematics and Informatics, Larbi Ben M’Hidi University, Oum El Bouaghi, 04000, Algeria Abstract
PT
ED
M
A new five-parameter continuous model called the beta generalized Gompertz distribution is introduced and studied. This distribution contains the Gompertz, generalized Gompertz, beta Gompertz, generalized exponential, beta generalized exponential, exponential and beta exponential distributions as special sub-models. Some mathematical properties of the new model are derived. We show that the density function of the new distribution can be expressed as a linear combination of Gompertz densities. We obtain explicit expressions for the moments, moment generating function, quantile function, density function of the order statistics and their moments, mean deviations, Bonferroni and Lorenz curves and R´enyi entropy. The model parameters are estimated by using the maximum likelihood method of estimation and the observed information matrix is determined. Finally, an application to real data set is given to illustrate the usefulness of the proposed model.
CE
Keywords: Beta distribution, Gompertz distribution, Moments, Order statistics, Maximum likelihood estimation, Observed information matrix.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
ACCEPTED MANUSCRIPT
Corresponding author:
[email protected] (L. Benkhelifa). 1
ACCEPTED MANUSCRIPT
1
Introduction
CR IP T
To describe human mortality and establish actuarial tables, Gompertz [1] introduced a two-parameter family of continuous probability models, known as the Gompertz distribution. Since then, this distribution has received much attention and has been used in several areas of science and technology such as biology (see [2]), gerontology (see [3]), computer (see [4]), marketing science (see [5]) and medical science (see [6]).
ED
M
AN US
The studies of statistical methodology and properties of the Gompertz distribution have been contributed by many authors, see for example, [7], [8], [9], [10], [11], [12] and [13]. Garg et al. [14] used the maximum likelihood method to derive the point estimators of the parameters of the Gompertz distribution. An exact confidence interval and exact joint confidence region for the parameters of this distribution under type II censored data were developed by Chen [11] while Wu et al. [15] developed an exact confidence interval and exact joint confidence region for the parameters under the first-failure censored sampling plans. Wu et al. [16] proposed unweighted and weighted least squares estimates for the parameters of the Gompertz distribution under the complete data and the first failure-censored data. By using the generalised integro-exponential function, Lenart [17] obtained an exact formula of the moment generating function and central moments for the Gompertz distribution. Lenart and Missov [18] discussed a goodness-of-fit test for this distribution.
PT
A random variable X is said to have a Gompertz distribution, with parameters β > 0 and λ > 0, if its cumulative distribution function (cdf) and probability density function (pdf), for x ≥ 0, are given by β
CE
H (x; β, λ) = 1 − e− λ (e
and
β
h (x; β, λ) = βeλx e− λ (e
)
λx −1
),
λx −1
(1)
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
respectively. The Gompertz pdf is unimodal and has positive skewness whereas the Gompertz hazard rate function is increasing. The kth moment of X is given by (see [17]) k! β k−1 β k λ E X = k e E1 , (2) λ λ Z ∞ 1 k (log x)k x−s e−zx dz is the generalized integro-exponential where Es (z) = k! 1
function (see [19]). By using the Meijer G-function (for more details on this 2
ACCEPTED MANUSCRIPT
ED
M
AN US
CR IP T
function see [19]), Lenart [17] gave an alternative representation of E X k as k! β k+1,0 β 1, . . . , 1 k E X = k e λ Gk,k+1 , λ λ 0, . . . , 0 where the Meijer G-function is a generalised hypergeometric function. It is defined by the contour integral Z n a1 , . . . , ap Πm 1 j=1 Γ (bj − t) Πj=1 Γ (1 − aj + t) m,n = Gp,q z z t dt, b1 , . . . , bq 2πi C Πqj=m+1 Γ (1 − bj + t) Πpj=n+1 Γ (aj − t) √ where i = −1 is the complex unit and C denotes an integration path, see Section 9.3 in [20] for a description of this path. El-Gohary et al. [21] generalized the Gompertz (GG) distribution using the Lehmann alternative type I (due to Lehmann [22]) by taking the αth power of the cdf H, say (H (x; β, α))α . This new family of distribution was called as the exponentiated Gompertz distribution. The main advantage of the GG distribution is that its hazard rate function can be bathtub shaped, increasing, constant and decreasing depending on the values of its parameters. Another important characteristic of this distribution is that it contains, as special cases, the Gompertz, exponential and generalized exponential distributions. The cdf and pdf of the GG distribution are given by α β λx G (x) = 1 − e− λ (e −1) (3) and
β
g (x) = αβeλx e− λ (e
α−1 ) 1 − e− λβ (eλx −1) ,
λx −1
(4)
CE
PT
respectively, for x ≥ 0, α > 0, β > 0 and λ > 0. In this paper, we propose a new generalization of the Gompertz distribution, based on the methodology of Eugene et al. [23]. We will call this generalization as the beta generalized Gompertz (BGG) distribution. We provide a comprehensive description of some of its mathematical properties. The rest of this paper is organized as follows. In Section 2, we define the BGG model. In Section 3, we present some special sub-models of the BGG distribution. Section 4 provides useful expansions for its cdf and pdf. Sections 5 and 6 deal with the moments and generating functions, respectively. The quantile function is given in Section 7. In Section 8, we investigate the order statistics, their moments and L-moments. The mean deviations and R´enyi entropy are addressed in Sections 9 and 10, respectively. In Section 11, we present maximum likelihood estimation and calculate the elements of the observed information matrix. In Section 12, we illustrate the importance of the BGG distribution through the analysis of a real data set. Finally, conclusions are given in Section 13.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
3
ACCEPTED MANUSCRIPT
2
The model definition
CR IP T
For an arbitrary baseline cdf G (x) of an absolutely continuous random variable, Eugene et al. [23] defined the cdf of the beta-G distribution by Z G(x) 1 ta−1 (1 − t)b−1 dt, (5) F (x) = IG(x) (a, b) = B (a, b) 0 where a > 0 and b > 0 are two additional parameters, Iy (a, b)= By (a, b)/B (a, b)
f (x) =
AN US
Ry is the incomplete beta function ratio, By (a, b) = 0 ta−1 (1 − t)b−1 dt is the incomplete beta function, B (a, b) = Γ (a) Γ (b) /Γ (a + b) is the complete beta function and Γ (·) is the gamma function. The shape parameters a and b introduce skewness and vary tail weights. The pdf corresponding to (5) is g (x) {G (x)}a−1 {1 − G (x)}b−1 , B (a, b)
(6)
PT
ED
M
where g(x) = dG(x)/dx is the baseline density function. By using Equation (5), several authors have defined and studied many new distributions. For example, Eugene et al. [23] introduced the beta-normal distribution. The beta exponential was defined and studied by Nadarajah and Kotz [24]. The beta Gompertz distribution was introduced by Jafari et al. [25]. The beta Weibull distribution was proposed by Lee et al. [26]. Cintra et al. [27] introduced the beta generalized normal distribution. Barreto-Souza et al. [28], Mahmoudi [29], Singla et al. [30] and Cordeiro et al. [31] defined the beta generalized exponential, beta generalized Pareto, beta generalized Weibull and beta exponentiated Weibull distributions, respectively.
CE
In a similar manner, we can extend the GG distribution, because it has closed-form cumulative function. By replacing (3) in (5), we obtain the BGG distribution with cdf given by F (x) = I
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
1−e
−
β λx e −1) λ(
α
(a, b) =
1 B (a, b)
Z
1−e
−
β eλx −1 λ
(
)
0
!α
ta−1 (1 − t)b−1 dt,
(7) for x ≥ 0, α > 0, β > 0, λ > 0, a > 0 and b > 0. Hence, the associated density function with five positive parameters α, β, λ, a and b has the form β
αβeλx e− λ (e f (x) = B (a, b)
)n
λx −1
β −λ (eλx −1)
1−e
oαa−1 n
−β eλx −1) λ(
1− 1−e
α ob−1
.
(8) When X is a random variable following the BGG distribution, it will be denoted by X ∼ BGG (α, β, λ, a, b). In Figure 1, we illustrate some possible 4
ACCEPTED MANUSCRIPT
CR IP T
shapes of the density function (8) for selected parameter values α, β, λ, a and b. From this Figure, we observe that the density function can take various forms depending on the parameter values. It is evident that the BGG distribution is much more flexible than the GG distribution. The hazard rate function defined by h(x) = f (x)/[1 − F (x)] is an important quantity characterizing lifetime phenomena. It can be loosely interpreted as the conditional probability of failure, given that it has survived to time t. The hazard function of the BGG distribution is given by β
αβeλx e− λ (e h (x) =
n oαa−1 n α ob−1 β λx ) 1 − e− βλ (eλx −1) 1 − 1 − e− λ (e −1)
λx −1
α
(a, b)
AN US
B (a, b) − B
1−e
−
β λx e −1) λ(
.
M
Plots of the hazard rate function of the BGG distribution for some parameter values are shown in Figure 2. We observe that the hazard rate function of the BGG distribution can be bathtub shaped, constant, increasing or decreasing depending on the values of the parameters.
ED
[Insert Figure 1 Here] Figure 1. Plots of the BGG density function for different values of α, β, λ, a and b.
3
CE
PT
[Insert Figure 2 Here] Figure 2. Plots of the BGG hazard rate function for different values of α, β, λ, a and b.
Special sub-models
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
In this section, some sub-models of the BGG distribution for selected values of the parameters are presented. • For a = b = 1 the BGG distribution reduces to the GG distribution with the pdf (4).
5
ACCEPTED MANUSCRIPT
• For α = 1 the BGG distribution reduces to the beta Gompertz (BG) distribution with the pdf
for x ≥ 0, β > 0, λ > 0, a > 0 and b > 0.
CR IP T
bβ λx oa−1 β βeλx e− λ (e −1) n λx f (x) = , 1 − e− λ (e −1) B (a, b)
• For α = a = b = 1 the BGG distribution reduces to the Gompertz distribution with the pdf (1).
f (x) =
AN US
• If λ tends to zero, then the BGG distribution reduces to the pdf of a beta generalized exponential (BGE) distribution αa−1 α b−1 αβe−βx 1 − e−βx 1 − 1 − e−βx , B (a, b)
for x ≥ 0, α > 0, β > 0.
M
• For a = b = 1 and λ tends to zero the BGG distribution reduces to the generalized exponential (GE) distribution with the pdf α−1 f (x) = αβe−βx 1 − e−βx ,
ED
for x ≥ 0, α > 0 and β > 0, > 0, a > 0 b > 0.
f (x) =
CE
PT
• For α = 1 and λ tends to zero the BGG distribution reduces to the beta exponential (BE) distribution with the pdf α−1 βe−bβx 1 − e−βx , B (a, b)
for x ≥ 0, β > 0, a > 0 and b > 0.
• For α = a = b = 1 and λ tends to zero the BGG distribution reduces to the exponential distribution with the pdf
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
f (x) = βe−βx ,
for x ≥ 0 and β > 0.
The BGG distribution is an important competitive model (see Section 12) to the Gompertz, generalized Gompertz, beta Gompertz, generalized exponential, beta generalized exponential, exponential and beta exponential distributions since it includes all these models as special cases.
6
ACCEPTED MANUSCRIPT
4
Expansions for the cumulative and density functions
(1 − z)
η−1
CR IP T
In this section, we obtain expansions for the cdf (7) and pdf (8) of the BGG distribution in terms of infinite or finite weighted sums of cdf’s and pdf’s of Gompertz distributions respectively. If η > 0 real non-integer and |z| < 1, we have the binomial series expansion η−1 j z . (−1) = j j=0 ∞ X
j
(9)
AN US
If η is an integer, the index j in (9) stops at η − 1.
By substituting the binomial series expansion of (1 − t)b−1 in (7), for b > 0 real non-integer, we obtain
After integration, we obtain
1−e
−
β eλx −1 λ
(
)
!α
ta+j−1 dt.
M
Z 1 j b−1 F (x) = (−1) B (a, b) j=0 j 0 ∞ X
j
∞
b−1
ED
X (−1) j 1 F (x) = B (a, b) j=0 (a + j)
iα(a+j) h β eλx −1) −λ ( . 1−e
h iα(a+j) ) < 1 for x > 0, by using (9), 1 − e− λβ (eλx −1) can
PT β
Since 0 < e− λ (e be expressed as
λx −1
∞ h iα(a+j) X kβ λx −1 λx α (a + j) k −β e ( ) 1−e λ e− λ (e −1) . = (−1) k k=0
CE AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Then
∞
j
b−1
X (−1) j 1 F (x) = B (a, b) j=0 (a + j)
∞ X k=0
(−1)
k
α (a + j) − kβ λx e λ (e −1) . k
Therefore, an expansion for the cdf of the BGG distribution is α(a+j) ∞ X (−1)j+k b−1 j k (1 − H (x; kβ, λ)) , F (x) = (a + j) B (a, b) j,k=0 7
(10)
ACCEPTED MANUSCRIPT
where H (x; kβ, λ) is the Gompertz cdf with parameters kβ and λ. By differentiating Equation (10), we get an expansion for the BGG density function ∞ X
wk h (x; kβ, λ) ,
(11)
k=0
where wk =
∞ X (−1)j+k+1 j=0
b−1 j
α(a+j) k
(a + j) B (a, b)
CR IP T
f (x) =
5
ED
M
AN US
and h (x; kβ, λ) is the Gompertz pdf with parameters kβ and λ. If b > 0 is an integer, the index j in the sum stops at b − 1 and if both α and a are integers, then the index k in the sum stops at α (a + j). P It is clear that ∞ k=0 wk = 1. Equation (11) holds for any parameter values and shows that the pdf of BGG distribution can be expressed as an infinite linear combination of Gompertz densities. We can use these expansions as an alternative way of numerical integration. Hence, some mathematical properties of the BGG distribution can be obtained directly from those properties of the Gompertz distribution. For example, the ordinary, inverse and factorial moments, moment generating function and characteristic function.
Moments
CE
PT
Here, moments of the BGG distribution are presented. In any statistical analysis especially in applied work, we hardly need to emphasize the necessity and importance of moments. Some of the most important features and characteristics of a distribution like dispersion, skewness and kurtosis can be studied through moments. If X ∼ BGG (α, β, λ, a, b) , then the non-central rth moment of X can be determined from (11) as
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
r
E (X ) = =
∞ X
k=0 ∞ X k=0
wk
Z
∞
xr h (x; kβ, λ) dx
0
r wk E X(kβ,λ) ,
r is the rth moment of the Gompertz distribution with where E X(kβ,λ) parameters kβ and λ, i.e., the rth moment of the BGG distribution can be written as simple linear combination of rth moment of the Gompertz 8
ACCEPTED MANUSCRIPT
∞ X
r! kβ wk r e λ E1r−1 E (X ) = λ k=0 r
kβ λ
or by the more widely used Meijer G-function ∞ X
,
(12)
kβ 1, . . . , 1 . λ 0, . . . , 0
AN US
r! kβ r+1,0 wk r e λ Gr,r+1 E (X ) = λ j,k=0 r
CR IP T
distribution. Since the series given in (11) converges for any parameter values (because the binomial series expansion (9) converges for any real non-integer number η > 0 and |z| < 1) and since the rth moment of the Gompertz distribution exists (see, [17]), the rth moment of the BGG distribution exists. From (2), we obtain
The skewness and the kurtosis, respectively, are γ1 =
µ3 3/2
µ2
and γ2 =
µ4 , µ22
M
where µ2 , µ3 and µ4 are the second, third and fourth central moments and for p = 2, 3, 4, . . . we define
ED
µp = E(X − E(X))p .
CE
PT
The moments of the BGG distribution can not be evaluated exactly in closed form. We have computed the first four moments, variance, skewness and kurtosis of the BGG distribution numerically for some selected values of the parameters using R software and the values are given in Table 1. [Insert Table 1 Here]
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
6
Moment generating function
In this section, we derive two representations for the moment generating function of the BGG distribution. Let X ∼ BGG (α, β, λ, a, b). The moment generating function of X, say M(s) = E esX , is an alternative specification of its probability distribution. First, we have Z ∞ M (s) = esx f (x) dx. 0
9
ACCEPTED MANUSCRIPT
By using the Maclaurin series expansion of an exponential function, we get ∞ X sr
r!
r=0
E (X r ) ,
CR IP T
M (s) =
where E (X r ) is given by (12). Then, we obtain a first representation for M (s) as ∞ X sr kβ r−1 kβ λ . wk r e E1 M (s) = λ λ r,k=0
A second representation for M (s) is obtained from (11) as ∞ X
AN US
M (s) =
wk Mk (s),
k=0
M
where Mk (s) is the moment generating function of Y following the Gompertz distribution with parameters kβ and λ. The moment generating function of Y is Z ∞ β λy Mk (s) = esy kβeλy e− λ (e −1) dy. 0
λy
yields Z kβ Mk (s) = kβe
ED
The change of variable z = e
∞
kβ
s
z λ −1 e− λ z dz.
1
PT
After elementary computations, we get s λ
1− λs
kβ
e Γ
CE
Mk (s) = λ (kβ)
s kβ , λ λ
,
where Γ (u, v) is the incomplete gamma function defined by Z ∞ Γ (u, v) = xu−1 e−x dx.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
v
Therefore, the second representation for M(s) is M(s) = λ
s λ
∞ X
wk (kβ)
k=0
10
1− λs
kβ
e Γ
s kβ , λ λ
.
(13)
ACCEPTED MANUSCRIPT
7
Quantile function
CR IP T
The quantile function of the BGG distribution, say x = Q(u), can be obtained by inverting BGG cdf (7) as −1 α1 λ 1 , (14) Q (u) = log 1 − log 1 − Iu (a, b) λ β
where Iu−1 (a, b) denotes the inverse of the incomplete beta function with parameters a and b. We can obtain an expansion for the inverse of the beta incomplete function in the Wolfram website (http.//functions.wolfram.com/06.23.06.0004.01), given by
AN US
b − 1 2 (b − 1) (a2 + 3ba − a + 5b − 4) 3 w w + b+1 2 (a + 1)2 (a + 2) (b − 1) {a4 + (6b − 1) a3 + (b + 2) (8b − 5) a2 } 4 + w 3 (a + 1)3 (a + 2) (a + 3) (b − 1) {(33b2 − 30b + 4) a + b (31b − 47) + 18} 4 + w 3 (a + 1)3 (a + 2) (a + 3) + O u5/u ,
M
Iu−1 (a, b) = w +
PT
ED
where w = [auB(a, b)]1/a for a > 0. Therefore, it is easy to simulate the BGG distribution. Let V be a beta random variable with parameters a > 0 and b > 0. Then, the random variable X given by 1 1 λ X = log 1 − log 1 − V α , (15) λ β
CE
follows the BGG distribution. From Equation (15), we can generate a random variable X having the BGG distribution when the parameters are known. The median can be derived from (14) be setting u = 1/2.
It is well known that the measures of skewness and kurtosis are sensitive to outliers. There are many heavy-tailed distributions for which these measures are infinite. Because of this limitation of the measures of skewness and kurtosis, a number of alternative measures based on quantiles have been proposed. Bowley proposed a coefficient of skewness based on quartiles
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
B=
Q(3/4) + Q(1/4) − 2Q(2/4) , Q(3/4) − Q(1/4)
whereas, Moors proposed a robust measure of kurtosis based on sample quartiles Q(7/8) − Q(5/8) + Q(3/8) − Q(1/8) M= . Q(6/8) − Q(2/8) 11
ACCEPTED MANUSCRIPT
The measures B and M are less sensitive to outliers and they exist even for distributions for which the moments fail to exist.
Order statistics
CR IP T
8
fi,n (x) =
AN US
In this section, the distribution of the ith order statistic for the BGG distribution is given. The order statistics play an important role in statistics, in general, and in reliability theory and life testing in paricular. Let X1 , . . . , Xn be a simple random sample from BGG distribution and let X1,n , . . . , Xn,n denote the order statistics obtained from this sample. The pdf of ith order statistic Xi,n is given by f (x) {F (x)}i−1 {1 − F (x)}n−i , for i = 1, . . . , n, B (i, n − i + 1)
where B (·, ·) is the beta function and f (x) and F (x) denote the pdf and cdf of the BGG distribution, respectively.
ED
M
Now, we derive an expansion for fi,n (x). By using (9) of {1 − F (x)}n−i given by n−i X n−i l n−i {F (x)}l , {1 − F (x)} = (−1) l l=0 we obtain
PT
n−i X f (x) l n−i (−1) {F (x)}i+l−1 . fi,n (x) = l B (i, n − i + 1) l=0
CE
From (10) and (11), we have βeλx fi,n (x) = B (i, n − i + 1)
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
∞ X
rwr ur
r=0
! n−i X l=0
where
eλx −1) −β λ(
u=e
and
wk∗
=
!i+l−1 X ∞ n − i (−1)l+1 wk∗ uk , l r=0
∞ X (−1)j+k
b−1 j
(a + j) B (a, b)
j=0
α(a+j) k
.
From [20], for a power series raised to a positive integer s, we have !s ∞ ∞ X X i ai u = ds,i ui , i=0
i=0
12
(16)
ACCEPTED MANUSCRIPT
where the coefficients ds,i (for i = 1, 2, . . .) can be determined from the recurrence equation −1
i X
m=1
[m (s + 1) − i] am ds,i−m
(17)
CR IP T
ds,i = (ia0 )
where ci+l−1,k = (kw0 )−1
k X
m=1
AN US
and ds,0 = as0 . Hence, ds,i comes directly from ds,0 , . . . , ds,i−1 and, therefore, from a0 , . . . , ai . By using Equations (16) and (17), it follows that ! n−i X ∞ ∞ X X βeλx l+1 n − i r (−1) ci+l−1,k uk , rwr u fi,n (x) = l B (i, n − i + 1) r=0 r=0 l=0 ∗ [m (i + l) − k] wm ci+l−1,k−m ,
and ci+l−1,0 = (w0∗ )i+l−1 = (−1)i+l−1 . Here, ci+l−1,k follows from ci−1+l,0 , . . . , ci−1+l,k−1 ∗ and, therefore, from w0∗ , . . . , wk−1 . Combining terms, we obtain
ED
M
n−i X ∞ X ∞ X βeλx l+1 n − i fi,n (x) = (−1) rci+l−1,k wr uk+r . l B (i, n − i + 1) l=0 r=0 k=0 β
Substituting u = e− λ (e
) into the above expression gives
λx −1
CE
then
PT
n−i X ∞ X ∞ X (k+r)β λx βeλx l+1 n − i rci+l−1,k wr e− λ (e −1) , (−1) fi,n (x) = l B (i, n − i + 1) l=0 r=0 k=0
where
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
fi,n (x) =
∞ X ∞ X
q (k, r) h (x; (k + r) β, λ) ,
(18)
r=0 k=0
n−i X rwr l+1 n − i ci+l−1,k . q (k, r) = (−1) l (k + r) B (i, n − i + 1) l=0
Equation (18) reveals that the density function of the BGG order statistics is a linear combination of Gompertz densities. So, we can easily obtain the mathematical properties for Xi,n such as ordinary and incomplete moments, mean deviations, among others, directly from those quantities of the Gompertz distribution. Since (18) holds for any parameter values and since the 13
ACCEPTED MANUSCRIPT
CR IP T
pth moment of the Gompertz distribution exists (see, [17]), the pth moment of Xi,n exists. Then, the pth moment of Xi,n is ∞ X ∞ X p! (k+r)β p−1 (k + r) β p q (k, r) p e λ E1 E Xi,n = . (19) λ λ r=0 k=0
M
AN US
L-moments are computed from linear combinations of the ordered data values. They have several advantages over ordinary moments for example, they apply for any distribution having finite mean, no higher-order moments need be finite. The sth L-moment is computed from linear combinations of the ordered data values (hence the prefix L) by s−i X s−1+i s−1+i s − 1 βi , λs = (−1) i i i=0 i where βi = E X {F (X)} = (i + 1)−1 E(Xi+1,i+1 ). Then, in particular, we have λ1 = β0 , λ2 = 2β1 − β0 , λ3 = 6β2 − 6β1 + β0 and λ4 = 20β3 − 30β2 + 12β1 − β0 . From (19), we can obtain expansions for the L-moments of the BGG distribution with p = 1.
Mean deviations
ED
9
PT
Let X ∼ BGG (α, β, λ, a, b). The mean deviations of X about the mean and about the median can be used as measures of spread in a population. They are given by Z ∞
δ1 = E (|X − µ|) =
CE
and
δ2 = E (|X − M|) =
0
Z
∞
|x − µ| f (x) dx
|x − M| f (x) dx,
0
respectively, where the mean µ = E(X) and M = Median(X) denotes the median with µ is calculated from (12) with r = 1 and M is calculated as the solution of the non-linear equation F (x) = 1/2. The measures δ1 and δ2 can be expressed as Z ∞ Z ∞ δ1 = 2µF (µ) − 2µ + 2 xf (x) dx and δ2 = −2µ + 2 xf (x) dx.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
µ
M
From (11), we have Z Z ∞ ∞ X kβwk xf (x) dx = A
∞
A
k=0
14
kβ
xeλx e− λ (e
) dx.
λx −1
ACCEPTED MANUSCRIPT
By a change of variables, we get ∞
xf (x) dx =
A
β ∞ X kβe λ
k=0
λ
wk
Z
∞
kβ
(log z) e− λ z dx.
eλA
CR IP T
Z
Changing variables and integrating by parts yield Z ∞ β ∞ kβ λA kβ λA kβe λ X − kβ eλA e e λ , e wk λA + e λ Γ 0, xf (x) dx = λ k=0 λ A Therefore
(20)
AN US
where the Γ function in (20) is defined as in (13).
β ∞ kβ λµ kβ λµ 2kβe λ X − kβ eλµ e λ λ wk λµ + e Γ 0, δ1 = 2µF (µ) − 2µ + e e λ λ k=0
and
k=0
M
β ∞ kβ λM kβ λM 2kβe λ X − kβ eλM e wk λM + e λ Γ 0, δ2 = −2µ + . e e λ λ λ
PT
ED
We can use (20) to determine Bonferroni and Lorenz curves which have applications not only in economics to study income and poverty, but also in other fields like reliability, demography, insurance and medicine. The Lorenz and Bonferroni curves are given by Z Z ∞ 1 1 1 ∞ xf (x) dx and B (p) = − xf (x) dx, L (p) = 1 − µ q p pµ q
CE
respectively, where q = Q(p) is calculated from (14) for a given probability p.
10
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
R´ enyi entropy
The entropy of a random variable is a measure of uncertainty variation and has been used in various situations in science and engineering. In the literature, several measures of entropy have been studied and compared. Here, we give the R´enyi entropy for the BGG distribution. One of the popular entropy measure is the R´enyi entropy defined, for X ∼ BGG (α, β, λ, a, b), by Z ∞ 1 γ IR (γ) = log {f (x)} dx , γ > 0, γ 6= 1, 1−γ 0 15
ACCEPTED MANUSCRIPT
!γ β oγ(αa−1) n λ β γβ λx λx αβe {f (x)}γ = eγλx e− λ e 1 − e− λ (e −1) B (a, b) n α oγ(b−1) β λx × 1 − 1 − e− λ (e −1) .
CR IP T
where
By using (9), if γ (b − 1) real non-integer, we obtain n ∞ α oγ(b−1) X oαr n β β λx r γ (b − 1) eλx −1) −λ ( = (−1) 1 − e− λ (e −1) , 1− 1−e r r=0 β
γ
{f (x)} =
αβe λ B (a, b)
!γ
γλx − γβ eλx λ
e
γ (b − 1) (−1) r r=0
AN US
then
e
∞ X
r
oαr+γ(αa−1) n β eλx −1) −λ ( . × 1−e
M
By using (9) again, if αr + γ (αa − 1) real non-integer, we obtain ∞ oαr+γ(αa−1) X n lβ β λx l αr + γ (αa − 1) eλx −1) −λ ( e− λ (e −1) , (−1) 1−e = l l=0
ED
then
β
αβe λ B (a, b)
γ
{f (x)} =
!γ
∞ X ∞ X
(−1)
r+l
r=0 l=0
γ (b − 1) r
PT
αr + γ (αa − 1) γλx lβ − (l+γ)β eλx e eλe λ × . l
CE
Therefore Z ∞ 0
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
β
γ
{f (x)} dx =
αβe λ B (a, b)
!γ
∞ X ∞ X r=0 l=0
(−1)
r+l
γ (b − 1) r
Z αr + γ (αa − 1) lβ ∞ γλx − (l+γ)β eλx eλ × dx. e e λ l 0
(l+γ)β λx
By setting z = e λ e , we obtain !γ ∞ Z ∞ β X λ αβe γ r+l γ (b − 1) {f (x)} dx = (−1) r B (a, b) 0 r,l=0 lβ αr + γ (αa − 1) e λ λγ−1 (l + γ) β × , Γ γ, l (βl + βγ)γ λ 16
ACCEPTED MANUSCRIPT
where the Γ function involved in the above equation is defined as in (13). Finally, we obtain β
αβe λ B (a, b)
!
CR IP T
γ log IR (γ) = 1−γ
γ (b − 1) αr + γ (αa − 1) (−1) l r r,l=0 ! lβ (l + γ) β e λ λγ−1 . Γ γ, × (βl + βγ)γ λ ∞ X
r+l
AN US
1 + log 1−γ
11
Estimation of model parameters
M
Here, the maximum likelihood estimates (MLEs) of the parameters of the BGG distribution are presented. Let x1 , x2 , . . . , xn be the observed values of n independent observations drawn from the BGG distribution with unknown parameter vector ξ = (α, β, λ, a, b)T . The total log-likelihood function for ξ, denoted by ℓ (ξ), can be written as n
ED
X nβ − n log [B (a, b)] + λ xi ℓ (ξ) = n log (α) + n log (β) + λ i=1 n
n
X β X λxi e + (αa − 1) log (1 − zi ) λ i=1 i=1
PT
−
CE
+ (b − 1)
n X i=1
log (1 − (1 − zi )α ) ,
β λxi where zi = e− λ (e −1) is a transformed observation. Taking the partial derivatives of ℓ (ξ) with respect to α, β, λ, a and b we obtain the components of the score vector U (ξ) as follows
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
n
n
X X (1 − zi )α log (1 − zi ) n Uα (ξ) = + a log (1 − zi ) − (b − 1) , α log (1 − (1 − zi )α ) i=1 i=1 n n n n 1 X λxi (αa − 1) X eλxi − 1 zi e + Uβ (ξ) = + − β λ λ i=1 λ (1 − zi ) i=1 n α−1 α (b − 1) X eλxi − 1 zi (1 − zi ) + , λ (1 − (1 − zi )α ) i=1 17
ACCEPTED MANUSCRIPT
i=1
log (1 − zi )
AN US
Ua (ξ) = −n [ψ (a) + ψ (a + b)] + α
n X
CR IP T
n n β X λxi nβ X xi + 2 e (1 − λxi ) Uλ (ξ) = − 2 + λ λ i=1 i=1 n β (aα − 1) X λxi eλxi − eλxi + 1 zi αβ (b − 1) − + λ2 1 − z λ2 i i=1 n X λxi eλxi − eλxi + 1 zi (1 − zi )α−1 , × 1 − (1 − zi )α i=1
and
Ub (ξ) = −n [ψ (b) + ψ (a + b)] + α where ψ (·) is the digamma function.
n X i=1
log (1 − (1 − zi )α ) ,
ED
M
T b λ, b b The MLE ξb = α b, β, a, bb of ξ = (α, β, λ, a, b)T is obtained by solving the non-linear likelihood equations simultaneously Uα (ξ) = 0, Uβ (ξ) = 0, Uλ (ξ) = 0, Ua (ξ) and Ub (ξ) = 0. Since these equations cannot be solved analytically, we can use the statistical software to evaluate them numerically using iterative methods such as the Newton-Raphson algorithm.
CE
PT
To construct approximate confidence intervals and for testing hypotheses on the parameters, we use the normal approximation for the MLE of ξ. Under the usual regularity conditions that the parameters are in the interior of the parameter space but not on the boundary, see [32, pp. 461–463], the asymptotic distribution of √ n ξb − ξ is N5 0, I −1 (ξ) ,
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
where I −1 (ξ) is the inverse of the expected information matrix I (ξ). This asymptotic behavior holds if I (ξ) is replaced by Jn ξb , where Jn (ξ) is the observed information matrix given by Uαα Uαβ Uαλ Uαa Uαb . Uββ Uβλ Uβa Uβb . Uλλ Uλa Uλb Jn (ξ) = − . , . . . Uaa Uab . . . . Ubb 18
ACCEPTED MANUSCRIPT
whose elements are given in Appendix 1.
2
b a ± Z ω2
2
p var (b a) and bb ± Z ω2
r
var bb ,
CR IP T
With this result, we use the multivariate normal distribution to construct approximate confidence intervals for the parameters α, β, λ, a and b which are given by r r p b b b b , α b ± Z ω var (b α), β ± Z ω var β , λ ± Z ω var λ 2
AN US
respectively, where var (·) is the diagonal element of Jn−1 ξb corresponding to each parameter and Z ω2 is the quantile 100(1 − ω/2)% of the standard normal distribution.
ED
M
We can compute the maximum values of the unrestricted and restricted loglikelihoods to construct the likelihood ratio (LR) statistics for testing some sub-models (see Section 3) of the BGG distribution. For example, we may use LR statistics to check if the fit using the BGG distribution is statistically “superior” to the fits using the GG, BG, GE, BGE, BE, exponential and Gompertz distributions (see Section 3) for a given data set. Considering the T (0) partition ξ= ξ T1 , ξ T2 , tests of hypotheses of the type H0 : ξ1 = ξ1 versus (0) H1 : ξ1n6=ξ1 can be operformed via LR statistics. The LR test statistic is w = 2 ℓ ξb − ℓ ξe , where ξb and ξe are the MLEs of ξ under H1 and H0
CE
PT
respectively. Under H0 , the statistic w is asymptotically distributed as χ2p where p is the dimension of the vector ξ1 of interest. The LR test rejects H0 if w > ζγ , where ζγ denotes the upper 100% point of the χ2p distribution with p degrees of freedom.
12
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Application
In this section, we give an application using a well-known real data set to prove the flexibility and potentiality of the proposed model. The data set is given by Aarset [33] and represents the lifetimes of 50 devices (in weeks). This data set, also reported in [34], [35] and [36], is: 0.1, 0.2, 1, 1, 1, 1, 1, 2, 3, 6,7, 1,1, 12, 18, 18, 18, 18, 18, 21, 32,36, 40 ,45 ,46, 47, 50, 55 ,60, 63, 63,67,67,67, 67, 72 ,75 ,79 ,82 ,82 ,83,84 ,84, 84, 85, 85, 85, 85, 85, 86, 86. We summarize some descriptive statistics in Table 2 for the data set.
19
ACCEPTED MANUSCRIPT
[Insert Table 2 Here]
AN US
CR IP T
There is a qualitative information about the shape of the hazard function, in some applications, to select a particular model. The so-called total time on test (TTT) plot (see [33]) We obtain the TTTPis a very useful device. P plot by plotting G(r/n) = [ ri=1 Ti,n + (n − r) Tr,n ] / ni=1 Ti,n against r/n, where r = 1, . . . , n and Ti,n , i = 1, . . . , n, are the order statistics of the sample (see [35]). The TTT plot for the Aarset data set is shown in Figure 3(a), which takes a convex shape followed by a concave shape. This corresponds to a bathtub shaped hazard rate function. Hence, the BGG distribution is appropriate for modeling this data set.
CE
PT
ED
M
We compare the fits of the BGG distribution with some of its sub-models GG, BG, GE, BGE, BE, exponential and Gompertz distributions (see Section 3). The model parameters are estimated by the method of maximum likelihood. The MLEs are obtained by using the bbmle package in R (see Appendix 2). Table 3 lists the MLEs of the parameters of the considered models, the corresponding standard errors in parentheses, minus twice the maximized log-likelihood −2ℓ ξb , Kolmogorov-Smirnov (K-S) test statistics and the corresponding p-values in parentheses. In order to verify which distribution fits better to the Aarset data set, we can also perform formal goodness-of-fit tests. We use the Cram´er–von Mises (CM) and Anderson-Darling (AD) test statistics (for more detail on CM and AD tests see [37]). In general, the better distribution to fit the Aarset data corresponds to smaller values of these statistics. Let H (·, Θ) be the cdf, where the form of H (·, Θ) is known but Θ (a k-dimensional parameter vector) is unknown and needs to be estimated from the data set. For any cdf H (·, Θ), we follow the next steps to obtain the statistics CM and AD: b and yi = Φ−1 (vi ) where the x′ s are in as• Calculate vi = H xi , Θ i b is an estimate of Θ and Φ−1 (·) is the inverse of the cending order, Θ standard normal cdf Φ (·) . q Pn Pn yi − y 2 1 1 • Calculate ui = Φ , where y = n i=1 yi and sy = n−1 i=1 (yi − y) . sy
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
• Calculate
2 n X 2i − 1 1 + ui − W = 12n i=1 2n 2
20
ACCEPTED MANUSCRIPT
and n
1X {(2i − 1) log (ui ) + (2n + 1 − 2i) log (1 − ui )} . n i=1
CR IP T
A2 = −n −
Then, the statistics CM and AD are 3 1 9 2 2 and AD = A 1 + CM = W 1 + + . 12n 4n 4n2
AN US
The values of the CM and AD test statistics and their p-values for the Aarset data set are given in Table 4. The results of Tables 3 and 4 indicate that the BGG model has the largest p-values and the lowest K-S, −2ℓ ξb , CM and AD values among all the fitted models. Therefore, the BGG model gives an excellent fit than its sub-models for the Aarset data set.
M
We can also compare these models using the LR test statistics. The LR statistics are listed in Table 5. So, we conclude that there is no difference between the fitted BGG and BG distributions, but these distributions provide a better fit for the Aarset data than other models based on the LR test at the 5% significance level.
CE
PT
ED
The plots of the histogram of the Aarset data and the fitted BGG, GG, BG, GE, BGE, BE, exponential and Gompertz densities are given in Figure 4(a) whereas the estimated cumulative functions and the probability plots are shown in Figures 4(b) and 5 respectively. From these plots, we conclude that the BGG model yields the best fits and hence can be adequate for modeling this data set. Figure 3(b) reveals that the estimated hazard rate function has a bathtub shape. The asymptotic variance-covariance matrix of the MLEs of the BGG model parameters is given by 1.474204 0.003120 −0.000104 −0.188992 −0.181351 0.003120 0.0000286 −0.000085 −0.000250 −0.000609 −0.000104 −0.000085 0.000382 −0.000773 0.000355 . −0.188992 −0.000250 −0.000773 0.027299 0.024212 −0.181351 −0.000609 0.000355 0.024212 0.030850
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
So, approximate 95% confidence intervals for the parameters α, β, λ, a and b are 1.3644 ± 2.3798, 0.0029 ± 0.0105, 0.0638 ± 0.0383, 0.1982 ± 0.3238 and 0.1776 ± 0.3443 respectively.
21
ACCEPTED MANUSCRIPT
[Insert Table 3 Here]
[Insert Table 5 Here]
CR IP T
[Insert Table 4 Here]
[Insert Figure 3 Here] Figure 3. (a) TTT plot for the Aarset data (b) Estimated hazard rate function of the BGG distribution for the Aarset data.
AN US
[Insert Figure 4 Here] Figure 4. (a) Observed histogram and the fitted pdf’s (b) Empirical cdf and the fitted cdf’s for the Aarset data.
Conclusions
ED
13
M
[Insert Figure 5 Here] Figure 5. Probability plot for the Aarset data.
CE
PT
In this paper, we have introduced a new five-parameter model called the beta generalized Gompertz distribution. We have provided a mathematical treatment of the new distribution including expansions for the density function, ordinary moments, generating function, quantile function, order statistics, mean deviations, Bonferroni and Lorenz curves and R´enyi entropy. The maximum likelihood method is used to estimate the model parameters of the new distribution and the observed information matrix is derived. We have demonstrated that the proposed distribution fits a well-known real data set better than its sub-models using goodness-of-fit tests. We hope that the new proposed model may attract wider applications in several research areas.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Acknowledgments The author sincerely thanks the Editor and two anonymous reviewers for their valuable comments which improved the original version of the paper.
22
ACCEPTED MANUSCRIPT
References
CR IP T
[1] B. Gompertz, On the nature of the function expressive of the law of human mortality and on the new mode of determining the value of life contingencies, Phil. Trans. Royal Soc. A 115 (1825) 513–580. [2] A.C. Economos, Rate of aging, rate of dying and the mechanism of mortality, Arch. Gerontol. Geriatr. 1 (1982) 3–27. [3] K.S. Brown, W. F. Forbes, A mathematical model of aging processes, J. Gerontol. 29 (1974) 46–51.
AN US
[4] K. Ohishi, H. Okamura, T. Dohi, Gompertz software reliability model: estimation algorithm and empirical validation, J. Syst. Softw. 82 (2009) 535–543.
[5] A. C. Bemmaor, N. Glady, Modeling purchasing behavior with sudden ”death”: A flexible customer lifetime model, Manag. Sci. 58 (2012) 1012–1021.
ED
M
[6] H.M. Moustafa, S.G. Ramadan, Errors of misclassification and their probability distributions when the parent populations are Gompertz, Appl. Math. Comput. 163 (2005) 423–442. [7] C.B. Read, Gompertz Distribution. Encyclopedia of Statistical Sciences, Wiley, New York, 1983.
PT
[8] R. Makany, A theoretical basis of Gompertz’s curve, Biometrical J. 33 (1991) 121–128.
CE
[9] B.R. Rao, C.V. Damaraju, New better than used and other concepts for a class of life distribution, Biometrical J. 34 (1992) 919–935. [10] P.H. Franses, Fitting a Gompertz curve, J. Oper. Res. Soc. 45 (1994) 109–113.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
[11] Z. Chen, Parameter estimation of the Gompertz population, Biometrical J. 39 (1997) 117–124.
[12] J.W. Wu, W.C. Lee, Characterization of the mixtures of Gompertz distributions by conditional expectation of order statistics, Biometrical J. 41 (1999) 371–381.
23
ACCEPTED MANUSCRIPT
[13] S. Minimol, P. Y. Thomas, On characterization of Gompertz distribution by properties of generalized record values, J. Stat. Theo. and Apps. 13 (2014) 38–45.
CR IP T
[14] M.L. Garg, B.R. Rao, C. K. Redmond, Maximum likelihood estimation of the parameters of the Gompertz survival function, J. Royal Stat. Society 19 (1970) 152–159.
[15] J.W. Wu, W.L. Hung, C.H. Tsai, Estimation of the parameters of the Gompertz distribution under the first failure-censored sampling plan, Statistics 37 (2003) 517–525.
AN US
[16] J.W. Wu, W.L. Hung, C.H. Tsai, Estimation of parameters of the Gompertz distribution using the least squares method, Appl. Math. Comput. 158 (2004) 133–147.
[17] A. Lenart, The moments of Gompertz distribution and maximum likelihood estimation of its parameters, Scand. Actuar. J. 2014(3) (2014) 255–277.
M
[18] A.Lenart, T.I. Missov, Goodness-of-fit tests for the Gompertz distribution, Commun. Stat. Theor. M. 45 (2016) 2920-2937.
ED
[19] M. Milgram, The generalized integro-exponential function, Math. Comp. 44 (1985) 443-458.
PT
[20] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 2000.
CE
[21] A. El-Gohary, A. Alshamrani, A. Al-Otaibi, The generalized gompertz distribution, Appl. Math. Model. 37 (2013) 13–24. [22] E.L. Lehmann, The power of rank tests, Ann. Math. Stat. 24 (1953) 23-43.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
[23] N. Eugene, C. Lee, F. Famoye, Beta-normal distribution and its applications, Commun. Stat. Theor. M. 31 (2002) 497–512.
[24] S. Nadarajah, S. Kotz, The beta exponential distribution, Reliab. Eng. Syst. Saf. 91 (2005) 689–697. [25] A. A. Jafari, S. Tahmasebi, M. Alizadeh, The beta Gompertz distribution, Rev. Colomb. Estad. 37 (2014) 139–156.
24
ACCEPTED MANUSCRIPT
[26] C. Lee, F. Famoye, and O. Olumolade, Beta-Weibull distribution: some properties and applications to censored data, J. Modern Appl. Stat. Methods 6 (2007) 173–186.
CR IP T
[27] R.J. Cintra, L.C. Rˆego, G.M. Cordeiro, A.D.C. Nascimento, Beta generalized normal distribution with an application for SAR image processing, Statistics 48 (2014) 279–294. [28] W. Barreto-Souza, A.H.S. Santos, G.M. Cordeiro, The beta generalized exponential distributions, J. Stat. Comput. Simul. 80 (2010) 159–172.
AN US
[29] E. Mahmoudi, The beta generalized Pareto distribution with application to lifetime data, Math. Comput. Simul. 81 (2011) 2414–2430.
[30] N. Singla , K. Jain,S. K. Sharma, The beta generalized Weibull distribution: properties and applications, Reliab. Eng. Syst. Saf. 102 (2012) 5–15.
M
[31] G.M. Cordeiro, A.E. Gomes, C.Q. da-Silva, E M.M. Ortega, The beta exponentiated Weibull distribution, J. Stat. Comput. Simul. 83 (2013) 114–138.
ED
[32] E.L. Lehmann, G. Casella, Theory of point estimation, 2nd ed, New York, Springer, 1998.
PT
[33] M.V. Aarset, How to identify bathtub hazard rate, IEEE Trans. Rel. R-36 (1987) 106–108.
CE
[34] G.S. Mudholkar, D.K. Srivastava, Exponentiated Weibull family for analyzing bathtub failure-real data, IEEE Trans Reliab 42 (1993) 299–302. [35] G.S. Mudholkar, D.K. Srivastava, G.D. Kollia, A generalization of the Weibull distribution with application to the analysis of survival data, J Am Stat Assoc 91 (1996) 1575–1583.
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
[36] F.K. Wang, A new model with bathtub-shaped failure rate using an additive Burr XII distribution, Reliab. Eng. Syst. Saf. 70 (2000) 305– 312.
[37] G Chen, N. Balakrishnan, A general purpose approximate goodness-offit test, J. Qual. Technol. 27 (1995) 154–161.
25
ACCEPTED MANUSCRIPT
Appendix 1.
n
CR IP T
β λxi Let zi = e− λ (e −1) and ti = eλxi (λxi − 1) + 1. The elements of the observed information matrix Jn (ξ) are
Uαα
X (1 − zi )α [log (1 − zi )]2 n = − 2 − (b − 1) , 2 α (1 − (1 − zi )α ) i=1
Uαβ
n n α−1 a X eλxi − 1 zi (b − 1) X eλxi − 1 zi (1 − zi ) = + λ i=1 1 − zi λ 1 − (1 − zi )α i=1 n
AN US
α (b − 1) X (1 − zi )α−1 log (1 − zi ) , + 2 λ (1 − (1 − zi )α ) i=1 n
Uαλ
n
β (b − 1) X ti zi (1 − zi )α−1 βa X ti zi + =− 2 λ i=1 1 − zi λ2 1 − (1 − zi )α i=1 n
Uαa =
M
αβ (b − 1) X ti zi (1 − zi )α−1 log (1 − zi ) , α 2 λ2 (1 − (1 − z ) ) i i=1 n X i=1
ED
+
log (1 − zi ) , Uαb =
n X (1 − zi )α log (1 − zi ) i=1
1 − (1 − zi )α
CE
PT
2 n n (aα − 1) X eλxi − 1 zi Uββ = − 2 − β λ2 (1 − zi )2 i=1 2 n α−1 zi α (b − 1) X eλxi − 1 (1 − zi ) + 2 α λ2 (1 − (1 − zi ) ) i=1 2 n α−2 zi α (b − 1) (α − 1) X eλxi − 1 (1 − zi ) + α 2 2 λ (1 − (1 − zi ) ) i=1 2 n 2α−2 2 zi α2 (b − 1) X eλxi − 1 (1 − zi ) , + α 2 2 λ (1 − (1 − zi ) ) i=1
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
26
,
ACCEPTED MANUSCRIPT
CR IP T
n n 1 X λxi (αa − 1) X ti zi n e (1 − λxi ) − Uβλ = − 2 + 2 λ λ i=1 λ2 1 − zi i=1 n β (αa − 1) X ti zi eλxi − 1 + λ3 (1 − zi )2 i=1 n
CE
PT
ED
M
AN US
α (b − 1) X ti zi (1 − zi )α−1 + λ2 1 − (1 − zi )α i=1 n α−1 αβ (b − 1) X ti zi eλxi − 1 (1 − zi ) − 2 λ3 (1 − (1 − zi )α ) i=1 n 2α−1 αβ (b − 1) X ti zi eλxi − 1 (1 − zi ) + 2 λ3 (1 − (1 − zi )α ) i=1 n α−2 αβ (b − 1) (α − 1) X ti zi2 eλxi − 1 (1 − zi ) − 2 λ3 (1 − (1 − zi )α ) i=1 n 2α−2 αβ (b − 1) X ti zi2 eλxi − 1 (1 − zi ) , − 2 λ3 (1 − (1 − zi )α ) i=1 n α X zi eλxi − 1 , Uβa = λ i=1 1 − zi n α−1 α X zi eλxi − 1 (1 − zi ) Uβb = , λ i=1 1 − (1 − zi )α
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
27
ACCEPTED MANUSCRIPT
Uλλ
n n nβ 2β X λxi β X 2 λxi = 3 − 3 xe e (1 − λxi ) − λ λ i=1 λ i=1 i n
n
n
2αβ (a − 1) X ti zi (1 − zi )α−1 λ3 1 − (1 − zi )α i=1
AN US
−
CR IP T
β (αa − 1) X zi x2i eλxi 2β (αa − 1) X ti zi − λ3 1 − zi λ 1 − zi i=1 i=1 n β 2 (αa − 1) X ti zi λβxi eλxi − eλxi + 1 + λ4 (1 − zi )2 i=1
+
n
αβ (b − 1) X zi x2i eλxi (1 − zi )α−1 + 2 λ (1 − (1 − zi )α ) i=1
CE
PT
ED
M
n α−1 β 2 (b − 1) X ti zi λβxi eλxi − eλxi + 1 (1 − zi ) + 2 λ4 (1 − (1 − zi )α ) i=1 n α−2 αβ 2 (b − 1) (α − 1) X ti zi2 λβxi eλxi − eλxi + 1 (1 − zi ) + 2 λ4 (1 − (1 − zi )α ) i=1 n αβ (b − 1) X zi x2i eλxi (1 − zi )2α−1 − 2 λ (1 − (1 − zi )α ) i=1 n 2α−1 αβ 2 (b − 1) X ti zi λβxi eλxi − eλxi + 1 (1 − zi ) + 2 λ4 (1 − (1 − zi )α ) i=1 n 2α−2 αβ 2 (b − 1) X ti zi2 λβxi eλxi − eλxi + 1 (1 − zi ) , + 2 λ4 (1 − (1 − zi )α ) i=1
Uλa
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
n n αβ X ti zi αβ X ti zi (1 − zi )α−1 =− 2 , Uλb = 3 , λ i=1 1 − zi λ i=1 1 − (1 − zi )α
Uaa = −n [ψ ′ (a) − ψ ′ (a + b)] , Uab = nψ ′ (a + b)
and
Ubb = −n [ψ ′ (b) − ψ ′ (a + b)] .
28
ACCEPTED MANUSCRIPT
Appendix 2. R Codes
CE
PT
ED
M
AN US
CR IP T
# Define the pdf of BGG distribution pdf.BGG=function(x,alpha,beta,lambda,a,b){ y=((alpha*beta*exp(lambda*x)*exp(-(beta/lambda)*(exp(lambda*x)-1)))/ beta(a,b))*((1-exp(-(beta/lambda)*(exp(lambda*x)-1)))ˆ(alpha*a-1))* (1-((1-exp(-(beta/lambda)*(exp(lambda*x)-1)))ˆalpha))ˆ(b-1) return(y) } # Define the cdf of BGG distribution cdf.BGG=function(x,alpha,beta,lambda,a,b){ y=pbeta((1-exp(-(beta/lambda)*(exp(lambda*x)-1)))ˆalpha,a,b) return(y) } # Define the hazard rate function of BGG distribution h.BGG=function(x,alpha,beta,lambda,a,b){ y=pdf.BGG(x,alpha,beta,lambda,a,b )/(1-cdf.BGG(x,alpha,beta,lambda,a,b)) return(y) } # Define the moments of BGG distribution moment=function(alpha,beta,lambda,a,b,r){ f=function(x,alpha,beta,lambda,a,b,r){ (xˆr)*(pdf.BGG(x,alpha,beta,lambda,a,b))} y=integrate(f,lower=0,upper=Inf,subdivisions=1000, alpha=alpha,beta=beta,lambda=lambda,a=a,b=b,r=r)return(y) } # Calculate MLEs and variance-covariance matrix of the BGG distribution library(bbmle) x=c(X) # X is the Aarset data fn.BGG=function(alpha,beta,lambda,a,b){ -sum(log(alpha)+log(beta)+(lambda*x)-((beta/lambda)*(exp(lambda*x) -1))-log(beta(a,b))+(alpha*a-1)*log(1-exp(-(beta/lambda)*(exp(lambda*x) -1)))+(b-1)*log(1-((1-exp(-(beta/lambda)*(exp(lambda*x)-1)))ˆalpha))) } mle.results=mle2(fn.BGG,start=list(alpha=alpha,beta=beta, lambda=lambda,a=a,b=b),hessian.opts=TRUE) summary(mle.results) vcov(mle.results)
AC
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
29
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
igure 1
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
igure 2
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
igure 3
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
igure 4
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
igure 5
ACCEPTED MANUSCRIPT
ables
AC
CE
PT
ED
M
AN US
CR IP T
Table 1: Moments of the BGG distribution for some parameter values of a and b for fixed values α = 0.5, β = 2 and λ = 1.5. E (X r ) a = 1, b = 2.5 a = 0.5, b = 3.5 a = 0.25, b = 1 a = 1, b = 0.25 E (X) 0.068694 0.019526 0.067259 0.670931 2 E (X ) 0.013773 0.002281 0.026267 0.673307 E (X 3 ) 0.004198 0.000470 0.014821 0.791913 4 E (X ) 0.001632 0.000133 0.010130 1.027140 Variance 0.009055 0.001899 0.021743 0.223158 Skewness 2.330989 4.247404 3.159338 0.386296 Kurtosis 9.777357 27.897731 14.371600 2.258868
1
ACCEPTED MANUSCRIPT
Table 2: Descriptive statistics for the Aarset data set. Median Variance 1st Qu. 3rd Qu. Min. Max. 47.00 1111.23 9.50 80.50 0.10 86.00
AC
CE
PT
ED
M
AN US
CR IP T
Mean 44.61
2
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
Table 3: MLEs of the model parameters, the corresponding standard errors ˆ K-S statistics and the corresponding p-values in given in parentheses, -2ℓ(ξ), parentheses. Model MLEs −2ℓ ξb K-S α β λ a b (p-value) BGG 1.3644 0.0029 0.0638 0.1982 0.1776 445.6100 0.1266 (1.2142) (0.0054) (0.0195) (0.1652) (0.1756) (0.3871) GG 0.4107 0.0016 0.0440 1 1 452.8962 0.1339 (0.1148) (0.0018) (0.0160) − − (0.3203) BG 1 0.0035 0.0615 0.2546 0.1774 446.8758 0.1333 − (0.0057) (0.0198) (0.0720) (0.0827) (0.3607) GE 0.6888 0.0177 0 1 1 485.0540 0.2106 (0.1162) (0.0035) − − − (0.0217) BGE 0.5835 0.3691 0 0.4968 0.0549 483.0022 0.1864 (0.0528) (0.0384) − (0.1642) (0.0096) (0.0577) BE 1 0.1880 0 0.4986 0.1053 480.7882 0.1889 − (0.1430) − (0.1445) (0.0769) (0.0525) Gompertz 1 0.0110 0.0181 1 1 479.9493 0.1824 − (0.0033) (0.0059) − − (0.0671) Exponential 1 0.0224 0 1 1 489.4002 0.1904 − (0.0031) − − − (0.0495)
3
ACCEPTED MANUSCRIPT
AD statistics with their p-values. AD p-value 1.3036 0.2651 1.5067 0.1106 1.3727 0.2427 3.1526 0.0152 2.8012 0.0354 2.8153 0.0328 2.1765 0.0010 3.1694 0.0047
CR IP T
CM and p-value 0.3882 0.1737 0.3491 0.0199 0.0439 0.0400 0.0324 0.0258
AC
CE
PT
ED
M
AN US
Table 4: The Model CM BGG 0.1658 GG 0.2075 BG 0.1775 GE 0.5168 BGE 0.4442 BE 0.4517 Gompertz 0.3320 Exponential 0.5199
4
GG BG GE BGE BE Gompertz Exponential
Table w 7.28615 1.26573 39.44396 37.39218 35.17813 34.33926 43.79018
5: LR tests. p-value 0.02617 0.26057 1.39766 × 10−8 9.66085 × 10−10 2.29702 × 10−8 1.67995 × 10−7 7.09292 × 10−9
AC
CE
PT
ED
M
AN US
Model BGG vs BGG vs BGG vs BGG vs BGG vs BGG vs BGG vs
CR IP T
ACCEPTED MANUSCRIPT
5