Computational Statistics & Data Analysis 51 (2007) 4369 – 4378 www.elsevier.com/locate/csda
Finite mixture of gamma distributions: A conjugate prior Jamal A. Al-Saleha,∗ , Satish K. Agarwalb a Department of Statistics and Operation Research, Faculty of Science, Kuwait University, P.O. Box 5969, Safat, Kuwait b Department of Mathematics, College of Science, University of Bahrain, P.O. Box 32038, Bahrain
Received 27 September 2005; received in revised form 4 April 2006; accepted 4 June 2006 Available online 30 June 2006
Abstract A finite mixture of gamma distributions [Finite mixture of certain distributions. Comm. Statist. Theory Methods 31(12), 2123–2137] is used as a conjugate prior, which gives a nice form of posterior distribution. This class of conjugate priors offers a more flexible class of priors than the class of gamma prior distributions. The usefulness of a mixture gamma-type prior and the posterior of uncertain parameters for the Poisson distribution are illustrated by using Markov Chain Monte Carlo (MCMC), Gibbs sampling approach, on hierarchical models. Using the generalized hypergeometric function, the method to approximate maximum likelihood estimators for the parameters of Agarwal and Al-Saleh [Generalized gamma type distribution and its hazard rate function. Comm. Statist. Theory Methods 30(2), 309–318] generalized gamma-type distribution is also suggested. © 2006 Elsevier B.V. All rights reserved. Keywords: Bayesian analysis; Conjugate prior; Generalized hypergeometric function; Gibbs sampling; Hierarchical models; Kobayashi’s gamma-type function; Markov Chain Monte Carlo; Maximum likelihood estimation; Mixture gamma distributions
1. Introduction By introducing new parameter(s), new generalizations of the two-parameter gamma distribution can be considered. The main contributors in recent years have been Lee and Gross (1991), Pham and Almhana (1995), Agarwal and Kalla (1996), and Agarwal and Al-Saleh (2001). The Al-Saleh and Agarwal (2002) generalized gamma distribution can also be written as a finite mixture of gamma distributions and provides a flexible skewed density over the positive range. This mixture distribution is useful, particularly for applications in which the subjective views of information come from two or more sources and the views are close to each other. Some examples are the assessments of a subject expert and the scores given by judges in a competition. To show the usefulness of the Al-Saleh and Agarwal (2002) distribution, let us consider a situation in which prior knowledge is used to obtain a posterior distribution, say of uncertain parameter of the Poisson distribution, where follows a mixture of gamma distributions. A single complicated prior distribution that is not fully dominated by the data not only misleads the posterior distribution, but may also lead to complicated numerical integration if the prior is non-conjugate. The other choice is to use a conventional mixture of gamma distributions as the prior, but this requires knowledge of the weights. If it is not possible to approximate the weights, then the number of parameters will increase. In such cases, an appropriate finite mixture of distributions as a conjugate prior is worth examining. This not only yields a suitable posterior distribution, but also provides computation simplicity. One such ∗ Corresponding author. Tel.: +965 335150; fax: +965 533150.
E-mail addresses:
[email protected] (J.A. Al-Saleh),
[email protected] (S.K. Agarwal). 0167-9473/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2006.06.005
4370
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
choice for some problems may be the Al-Saleh and Agarwal (2002) finite mixture of gamma distributions. The specific weights attached to the suggested prior distribution are functions of the parameters, and hence the distribution we define represents a mixture of gamma densities with only three parameters. Bayesian estimation of the parameters of a finite mixture of distributions is dealt with by West (1992), and Hobert et al. (1999). Diebolt and Robert (1994) suggested a Bayesian analysis of mixture models for unknown weights in which the components belong to the exponential family. They derived an exploitable posterior distribution with a Dirichlet prior for weights. Hsiao et al. (1998) considered a Bayesian approach by applying a prior mixture of gamma distributions to a problem involving transmission tomography. Hsiao et al. (2001) used a deterministic annealing algorithm for the Bayesian reconstruction method introduced by Hsiao et al. (1998). The present paper is a step forward in this direction. In Section 2 of the paper we have redefined the Agarwal and Al-Saleh (2001) gamma-type distribution from which the Al-Saleh and Agarwal (2002) three-parameter generalized gamma distribution is derived. This distribution (Al-Saleh and Agarwal, 2002) is used as a conjugate prior in the paper, which may be of some interest for Bayesian approaches. In Section 3, a new type of gamma prior distribution is introduced. In Section 4, Bayesian analysis is carried out to show the posterior distribution of uncertain parameter for the Poisson distribution as a new mixture gamma-type distribution. Section 5 presents simulation analyses to show the usefulness of the finite mixture of gamma densities as a conjugate prior distribution. We have also considered a real-life data set to strengthen our conjecture that the proposed conjugate prior may be useful for certain problems by considering a three-stage hierarchical model. We have noted that many statistical properties of the Agarwal and Al-Saleh (2001) distribution are not available in the literature; hence, in Appendix A, the method to approximate the maximum likelihood estimators for the parameters is given, along with simulation analyses. 2. Kobayashi’s gamma-type distribution Using Kobayashi’s (1991) generalized gamma function, Agarwal and Al-Saleh (2001) defined the generalized gamma-type distribution with parameters m, n, c and , with probability density function (pdf) f (x) =
cm− x m−1 e−cx (m, n)(x + n/c)
,
x > 0, m, n, c > 0, ∈ R.
(2.1)
Eq. (2.1) can be reduced to a three-parameter generalized gamma distribution when we let n = c and then taking c = 1/b f (x) =
x m−1 e−x/b (m, 1/b)(x + 1) bm−
where
(m, 1/b) =
∞ 0
x m−1 e−x (x + 1/b)
,
x > 0, m, b > 0, ∈ R,
dx.
(2.2)
(2.3)
It can be seen that for = 0, Eq. (2.2) reduces to ordinary gamma distribution. The integral in Eq. (2.3) is useful in many problems of diffraction theory and corrosion problems in new machines (see Kobayashi, 1991). This integral can be computed using the transformation concerning the generalized hypergeometric function F (n; d : z). This function is also known as Barnes’s extended hypergeometric function. If there are u coefficients in n, and v coefficients in d, this function is also known as u Fv , (see Andrews, 1998, Chapter 9) and is defined as k ∞ u i=1 ((ni + k)/(ni )) z , v (2.4) F ([n1 , n2 , . . . , nu ]; [d1 , d2 , . . . , dv ] : z) = i=1 ((di + k)/(di ) k! k=0 where the u and v refer to the number of numerator and denominator parameters, respectively, in its series representation. Using Eq. (2.4), Eq. (2.3) can be computed numerically as (m, 1/b) = (m)( − m)()−1 b−m F ([m], [1 − + m]; 1/b) + (m − )F ([], [1 − m + ]; 1/b),
(2.5)
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
4371
for all values of m, b > 0, and ∈ R/{0}. For the values ±(m − ) = 0, −1, −2, . . ., Eq. (2.5) can be computed using the exponential integral. 3. New gamma-type prior In this section, we introduce a new type of three-parameter finite mixture of gamma distributions, by using Eq. (2.1) when n = c, and reducing the range of the parameter to take negative integers, = −k(k = 0, 1, 2, . . . , l), l is fixed (see Al-Saleh and Agarwal, 2002). We have considered being a negative integer because the motivation of the paper is to define a new type of finite mixture of gamma distributions. Thus, we introduce the gamma-type distribution with parameters m, c > 0 and integer k, gt(m, c, k) with pdf as f (x) =
cm+k x m−1 (x + 1)k e−cx , k (m, c)
where k (m, c) =
∞ 0
x m−1 (x
+ c)k e−x
x > 0, k = 0, 1, . . . , l,
(3.1)
dx. Using the finite binomial series (a + b) = k
k i=0
k a i bk−i , Eq. (3.1) i
can be redefined as k
cm+k x m+i−1 e−cx k f (x) =
k i k ∞ m+j −1 k−j −x i=0 , c e dx 0 x j j =0
=
1 Cm,k
k
k k−i c (m)i gamma(m + i, c), i
(3.2)
i=0
where gamma(m+i, c) is gamma pdf as a function of x, with parameters m+i and c, and (m)j is known as ‘Pochhammer symbol’ and Cm,k =
k
k k−j c (m)j , j
(m)j = m(m + 1) · · · (m + j − 1) and (m)0 = 1.
j =0
Eq. (3.2) is a finite mixture of gamma distributions with parameters m, c > 0, and k = 0, 1, . . . , l. This equation introduces a class of new finite mixture of gamma distributions as a prior to give more flexibility to Bayesian methods in choice of the mixing k. Until now, users have a choice of using the conventional gamma (k = 0) or a subjective approach of guessing weights (if one wants to use mixing gamma densities as a prior). The later choice is not only subjective, but also requires too much information. In the next section, we will show that our class of mixture of gamma distributions, as a prior, is not only conjugate but also requires little information in guessing the weights. 4. Bayesian analysis In order to show that the finite mixture of gamma distributions defined in Eq. (3.2) is a conjugate prior of uncertain , xn ) from a Poisson distribution with parameter for the Poisson distribution, consider n observations x = (x1 , . . . unknown parameter . The likelihood function is l(|x) ∝ T e−n , where T = xi . Consider the prior density p() ∝
k i=0
1 (k − i)!i!
2 S0
i (v/2)i gamma((v + 2i)/2, S0 /2),
(4.1)
where gamma((v + 2i)/2, S0 /2) is the gamma pdf as a function of , with parameters (v + 2i)/2 and S0 /2. In other words, is a finite mixture of gamma random variables. Thus, the posterior density is p(|x) ∝
k i=0
1 (k − i)!i!
2 S1
i
v1 S1 , ((v + 2T )/2)i gamma 2 2
,
(4.2)
4372
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
Table 1 Pump reliability data at a pressurized water reactor power plant Pump
ti
xi
1 2 3 4 5 6 7 8 9 10
94.50 15.70 62.90 126.00 5.24 31.40 1.05 1.05 2.10 10.50
5 1 5 14 3 10 1 1 4 22
where S1 = S0 + 2n, and v1 = v + 2(T + i). Eq. (4.2) is a finite mixture of gamma random variables. Hence, the class of finite mixture of gamma distributions, Eq. (3.2), is a conjugate prior of uncertain parameter for the Poisson distribution. Once the posterior is known, the predictive distribution would be ∞ p(x) = p(x|)p() d 0
∞
∝ 0
∝
k
i
2 1 v1 S1 x − , d ((v + 2T )/2)i gamma e (k − i)!i! S1 2 2 x!
k i=0
i=0
2i v1 . (S1 + 2)−(v1 /2)−x x + (k − i)!i!x! 2
(4.3)
Setting = 1 − (S1 + 2)−1 , at least when v1 /2 is an integer, the predictive distribution is a finite mixture of a negative binomial, that is
k
v
v 2i 1 1 f (x) ∝ −i NeBin , , (4.4) (k − i)!i! 2 2 i=0
where NeBin (v1 /2, ) is the negative binomial probability mass function (pmf) as a function of x, with parameters v1 /2 and . Further, it is not difficult to see that the negative binomial distribution in Eq. (4.4) can be generalized for the real value of v1 /2. George et al. (1993), suggested a conjugate gamma prior on uncertain parameter for the Poisson distribution. In the next section we will expand their work by considering, Eq. (4.1), a conjugate gamma-type prior for the uncertain parameter which can also be written as a finite mixture of gamma distributions. This mixture of gamma densities is conjugate, which make this mixture more convenient to work with as a prior. Besides, k 0 offers a more flexible class of priors to choose among the class of gamma prior distributions. 5. Illustrations George et al. (1993) analyses data from Gaver and O’Muircheartaigh (1987). The model they considered relates to 10-pump reliability data set at a pressurized water nuclear reactor power plant. Let the number of failures xi follows a Poisson distribution i.e. xi ∼ Poisson(ri ti ), i = 1, . . . , 10, where ri is the failure rate for pump i and ti is the length of operation time of the pump (in 1000 s of hours). The data are shown in Table 1. George et al. (1993), considered a gamma prior for the failure rates ri . We have expanded their work by considering a conjugate gamma-type prior distribution for the failure rates, that is ri ∼ gt(, , k), i = 1, . . . , 10. It is worth to mention that for k = 0, the choice of prior on the failure rates is the same as considered by George et al. (1993). The case k > 0 implies that there are more than one expert assessment about prior on the failure rates ri (or there are more than
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
4373
Table 2 Parameter estimates for the power plant pumps model when k = 0 Variable
Mean
SD
MC
2.5%
Median
97.5%
0.701 0.938 0.060 0.102 0.090 0.116 0.603 0.609 0.894 0.880 1.566 1.991
0.271 0.545 0.025 0.080 0.038 0.030 0.320 0.138 0.726 0.711 0.7718 0.4229
0.0043 0.0087 2.5E − 4 9.2E − 4 3.7E − 4 2.8E − 4 0.0029 0.0013 0.0073 0.0080 0.0083 0.0058
0.2904 0.1907 0.0212 0.0079 0.0317 0.0636 0.1509 0.3734 0.0761 0.0743 0.4669 1.251
0.6638 0.8351 0.0560 0.0818 0.0849 0.1137 0.5466 0.5965 0.7124 0.7033 1.432 1.963
1.33 2.28 0.1160 0.3095 0.1789 0.1812 1.38 0.909 2.79 2.71 3.412 2.886
r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
Table 3 Parameter estimates for the power plant pumps model when k = 1 Variable
Mean
SD
MC error
2.5%
Median
97.5%
0.120 0.840 0.065 0.116 0.094 0.118 0.557 0.599 0.753 0.763 1.471 1.960
0.112 0.548 0.026 0.078 0.038 0.030 0.297 0.137 0.635 0.650 0.755 0.423
0.0068 0.0283 3.0E − 4 0.0015 4.6E − 4 3.2E − 4 0.0056 0.0014 0.0161 0.0140 0.0141 0.0062
0.0087 0.1665 0.0245 0.0162 0.0353 0.0671 0.1395 0.3631 0.0709 0.0727 0.4223 1.224
0.0885 0.7159 0.0611 0.0996 0.0890 0.1153 0.4996 0.5875 0.5819 0.5828 1.328 1.930
0.4476 2.215 0.125 0.313 0.182 0.183 1.276 0.8968 2.402 2.467 3.308 2.86
r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
Table 4 Parameter estimates for the power plant pumps model when k = 2 Variable
Mean
SD
MC error
2.5%
Median
97.5%
0.0242 0.5707 0.0679 0.1200 0.0954 0.1168 0.5361 0.5972 0.6965 0.6726 1.404 1.928
0.0256 0.3407 0.0256 0.0756 0.037 0.0294 0.3028 0.1362 0.6128 0.5848 0.7162 0.4229
0.0013 0.0203 3.0E − 4 0.0014 5.6E − 4 3.86E − 4 0.0055 0.0018 0.0145 0.0120 0.0177 0.0072
0.0016 0.1157 0.0277 0.0240 0.0388 0.0659 0.1215 0.3635 0.0709 0.0673 0.3898 1.202
0.01646 0.5004 0.0643 0.1029 0.0901 0.1146 0.4784 0.5867 0.5184 0.4999 1.276 1.898
0.0939 1.397 0.1265 0.3097 0.1814 0.1812 1.282 0.8926 2.322 2.235 3.136 2.868
r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
one possible available information about ri from different sources) and their assessment/information are very close to each other. Here we would like to emphasize that the purpose of considering ri ∼ gt(, , k), i = 1, . . . , 10, is to give more flexibility to Bayesian methods in the choice of prior by choosing different k. To understand how this mixing k would affect the posterior results, two cases are considered: k is known, and k is unknown. We first consider the case
4374
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
Table 5 Parameter estimates for the power plant pumps model when k is unknown Variable
Mean
SD
MC error
2.50%
Median
97.50%
0.2035 1.954 29.01 0.573 0.0666 0.1172 0.0913 0.1157 0.7220 0.6302 1.708 1.715 2.329 2.177
0.1465 1.338 11.82 0.237 0.0241 0.0808 0.0365 0.0298 0.3755 0.1425 1.293 1.311 1.044 0.4543
0.0137 0.0211 0.8251 0.0161 6.6E − 04 0.0015 5.1E − 04 3.7E − 04 0.0059 0.0015 0.0228 0.0220 0.0126 0.0050
0.0133 0.2351 6 0.1053 0.0283 0.0230 0.0369 0.0656 0.1601 0.3867 0.1599 0.1439 0.7476 1.368
0.1622 1.664 29 0.5754 0.0636 0.09532 0.08503 0.113 0.663 0.62 1.397 1.4 2.176 2.153
0.5204 5.274 49 0.9766 0.1217 0.3288 0.1786 0.1804 1.62 0.9383 5.032 5.02 4.77 3.156
k l r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
when k is known and k = 0, 1, 2. A three-stage hierarchical model implemented is as follows: at the first stage we have ri ∼ gt(, , k), i = 1, . . . , 10, at the second stage we assume the following prior specification for the hyperparameters , and ; ∼ exponential(), and ∼ gamma( , ) independent, at the third stage we take = 0.1, = 1.0, and k = 0, 1, and 2, respectively. A Markov Chain Monte Carlo (MCMC) Gibbs sampling approach implemented in using 䉷BUGS computer software gives an analysis of estimates. A burn in of 1000 updates followed by a further 10 000 updates are given in Tables 2–4 for the parameter estimates for the case k = 0, 1, and 2, respectively. Secondly, when k is unknown a three-stage implemented hierarchical model is as follows: at the first stage we have ri ∼ gt(, , k), i = 1, . . . , 10, at the second stage we assume the following prior specification for the hyperparameters , , and k: ∼ exponential(), ∼ gamma( , ), and k ∼ binomial(l, 50) independent, at the third stage we take l ∼ beta(0.5, 1.0), = 0.1, and = 1.0. A MCMC Gibbs sampling approach gives an analysis of estimates. A burn in of 1000 updates followed by a further 10 000 updates are given in Table 5. Examination of the above simulations (Tables 2–5) yields the following observations: 1. The posterior distribution means of the estimate of for k = 0, 1 and 2 are 0.701, 0.120 and 0.0242, respectively, while the prior mean is 1.0 for all cases (k = 0, 1, 2). There is a clear and substantial shift of the posterior mean to the left as k increases from 0 to 1 and from 1 to 2. The posterior standard deviation (SD) for k = 0, 1 and 2 is 0.271, 0.112 and 0.0256, respectively (prior SD is 1.0), and hence a decrease in posterior SD as k increases from 0 to 1 and from 1 to 2. Comparison of the MC error for k = 0 with those for k = 1 and k = 2 shows that the MC error is slightly greater for k = 1, while it decreased for k = 2. For unknown k, the posterior mean is 0.2035 (prior mean is 1.0) and posterior SD is 0.1465 (prior SD is 1.0). Comparison with the posterior mean of 0.701 for k = 0 once again shows a shift to the left and a considerable decrease in posterior SD. However, there is an increase in the MC error. Comparison of the estimates of for unknown k with the known values for k = 1 and k = 2 shows that the shift of the posterior mean for unknown k is towards the right, concomitant with an increase in SD and MC error. 2. The posterior distribution means of the estimate of for k = 0, 1 and 2 are 0.938, 0.840, and 0.5707, respectively, while the prior mean is 2.0 for all cases of k. The behavior of the posterior mean is more or less similar to that of the estimate of for k = 0, 1 and 2; the posterior mean shifts to the left as k increases. However, for unknown k, the posterior mean shifts to the right with respect to the posterior mean for k = 0, 1 and 2. The MC error more or less remains the same as that for k = 0, 1 and 2. For unknown k, the posterior mean is 1.954 (prior mean is 2.0) and the posterior SD is 1.338 (prior SD is 1.414). However, for unknown k, the posterior mean shifts substantially to the right compared to the posterior mean for the cases k = 0, 1, 2, with an increase in the MC error. 3. For the estimate k, the posterior mean is 29.0 (prior mean is 16.7) and the posterior SD is 11.82 (prior SD is 10.0). The posterior distribution has little mass at smaller k, a shift to the right from the prior mean, with a slight increase in posterior SD.
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
4375
4. For the estimate l, the posterior mean is 0.573 (prior mean is 0.333) and the posterior SD is 0.237 (prior SD is 0.298). The behavior of the posterior mean and SD is more or less similar to that for the estimate of k. 5. For the estimates r1 − r6 , the posterior mean and SD for all cases of k, including unknown k, increase slightly or remain more or less the same, with a slight increase in MC error for k = 1, 2 and for unknown k in comparison to k = 0. 6. For the estimates r7 − r10 , the posterior mean shifts slightly to the left for k = 1 and 2, and the posterior SD decreases slightly or remains approximately the same, with a small increase in the MC error in comparison to k = 0. For unknown k the trend is reversed: the posterior mean shifts to the right with a slight increase in SD and MC error (except for r10 , for which the posterior SD more or less remains the same and the MC error decreases), representing a dramatic change in these estimates compared with the case of k known. In brief, the values of the posterior distribution means vary to some extent across the results for k = 0, 1 and 2. For a few pumps, the values are also similar when k is unknown. However, the differences for pumps r7 and r8 are dramatic. The difference is clearer in the case when k is unknown compared to k = 0, 1, 2. Hence, in the above example, when k is unknown, the analysis using a gamma-type prior distribution for the failure rates seems more successful than the gamma prior (k = 0) suggested by George et al. (1993). The proposed class of prior distributions offers more flexibility for Bayesian methods to choose among the existing classes of priors. 6. Conclusion A mixture of gamma distributions derived from that of Al-Saleh and Agarwal (2002) was used as a conjugate prior distribution. This mixture of gamma distributions has a certain advantage over competitors, in that it does not require a subjective approach involving guessing the mixing weights. The usefulness of the proposed mixture gamma-type prior distribution, with a posterior of uncertain parameter for the Poisson distribution, was shown using a MCMC and a Gibbs sampling approach for hierarchical models. An additional advantage of this prior is that it represents a threeparameter mixture of gamma densities. (In the literature, mixture distributions require too much information, such as guess weights, besides the guess values of the parameters). A further advantage of the proposed prior distribution is that it gives more user flexibility for situations in which the failure rate is ri ∼ gt(, , k), i = 1, . . . .p; k > 0. (So far, users have employed a gamma prior distribution, i.e. k = 0). To show the greater scope of the proposed prior, we considered the example of a 10-pump reliability data set at a pressurized water nuclear reactor power plant reported by George et al. (1993), who used a gamma prior distribution (k = 0) with Bayesian analysis. We have given some idea of how Bayesian methods for making inference about an uncertain Poisson parameter would work, on the basis of prior knowledge that behaves as finite mixture of gamma distributions and of observed data from a Poisson process. The results reveal that the estimate ri values vary slightly across the results for known k values. For pumps r1 − r6 , the estimates are similar when k is known and unknown, while for pumps r7 − r10 (especially r7 and r8 ) the differences are dramatic. The reason for this might be that some pumps operate continuously in nuclear power plants, while other pumps operate occasionally or as a standby. This difference is more evident in the case of k unknown than for k known. Acknowledgments The authors thank the referees and the associate editor for the valuable and constructive suggestions in the original and the revised manuscript. The support by the Research Administration, Kuwait University, Kuwait, is acknowledged by the first author. Appendix A. Maximum likelihood estimation It is well known that the maximum likelihood estimation (MLE) of even the two parameter gamma distribution needs a special technique. See Cohen and Whitten (1988), and Balakrishnan and Cohen (1991). In this section, we introduce the MLE technique for the generalized gamma-type distribution. The Kobayashi’s (1991) generalized gamma function has three digamma-type functions (m) (m, b, ), (b) (m, b, ), and () (m, b, ). To obtain these three functions, let us start with the derivatives of Eq. (2.3) with respect to m, b,
4376
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
and , respectively, which are as follows: ∞ m−1 x ln(x)e−x j dx, (m, 1/b) = jm (x + 1/b) 0 ∞ j x m−1 e−x dx, (m, 1/b) = j (x + 1/b)+1 b2 0 ∞ j x m−1 ln(x + 1/b)e−x − dx. (m, 1/b) = j (x + 1/b) 0
(A.1)
(A.2)
(A.3)
In Eq. (A.1), replacing ln(x) with Frullani integral representation (see Andrews, 1998, p. 94), we can rewrite Eq. (A.1) as
∞ m−1 −x ∞ −u j x e e − e−ux (m, 1/b) = du dx jm u (x + 1/b) 0 0
∞ ∞ m−1 −x −u e − e−ux x e dx du. (A.4) u (x + 1/b) 0 0 Here the order of integration is reversed. Now, splitting the inside integral in Eq. (A.4) into a sum of integrals, and recalling that ∞ m−1 −x(u+1) x e (m, (u + 1)/b) dx = . (x + 1/b) (u + 1)m− 0 Then Eq. (A.1) can be written in the following form: ∞ m−1 −x(u+1) ∞ x e j 1 −u ∞ x m−1 e−x dx − dx du e (m, 1/b) = u jm (x + 1/b) (x + 1/b) 0 0 0
∞ 1 −u (m, (u + 1)/b) = du e (m, 1/b) − u (u + 1)m− 0 and this leads to
(m)
∞
(m, b, ) = 0
1 −u (m, (u + 1)/b) du. e − u (m, 1/b)(u + 1)m−
(A.5)
(A.6)
A similar derivation will give ()
∞
(m, b, ) = 0
1 −u e−u/b (m, (u + 1)/b) du. e − u (m, 1/b)(u + 1)m−
(A.7)
It can be seen from Eq. (A.2) that
(b) (m, b, ) =
+1 (m, 1/b) . (m, 1/b)b2
(A.8)
The integrals in Eqs. (A.6) and (A.7) are quite complicated. Therefore, their values can be computed numerically using the right-hand side of Eq. (2.5), and dividing by the derivatives of the right-hand side of Eq. (2.5) with respect to m, , and b. (The derivatives of generalized hypergeometric function with respect to m, , and b, are considered in the derivation). The numerical methods available in most mathematical computer software, such as 䉷IMSL, @Mathematica, and 䉷Maple, can solve this problem very easily. Given values of n independent random variables Xl , X2 , . . . , Xn ,
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
4377
ˆ of m, b, and , respectively, are ˆ and , each distributed as in Eq. (2.2), then the maximum likelihood estimators m, ˆ b, log Xi ˆ ˆ + log(b) ˆ ) ˆ − = 0, (A.9)
(m) (m, ˆ b, n log(Xi + 1) (ˆ ) ˆ ˆ ˆ
(m, = 0, (A.10) ˆ b, ) − log(b) − n Xi ˆ ˆ + (m ˆ − ˆ b
ˆ (b) ˆ ) b[ = 0. (A.11) (m, ˆ b, ˆ − ))] n We have made an attempt to use the MLE technique, using generalized hypergeometric function, to approximate the digamma-type functions of the parameters of the generalized gamma-type distribution given by Agarwal and Al-Saleh (2001). For MLE, we considered Agarwal and Al-Saleh (2001) generalized gamma distribution instead of Al-Saleh and Agarwal (2002), because the former is more general case than the later. The random generator function (RNGCT) used to simulate data in the statistical package 䉷IMSL are considered with m = 6.4, c = 1.0, and k = 3, one hundred simulated data are taken from the finite mixtures of gamma distributions, gt(m, c, k), (Eq. (3.1)). Now, the maximum likelihood equations when k is known are log Xi (m)
(m, c , 3) − log( c ) − = 0, (A.12) n Xi c ( c ) (m, c , 3) + − (m +3) = 0. (A.13) n The mathematical computer software @Mathematica, and 䉷Maple, can solve Eqs. (A.12) and (A.13) using the function (FSOLOVE). In our example the solutions returned are m =6.14 and c =0.84, as with little bias due to the RNGCT used in 䉷IMSL to simulate the data from the PDF of the finite mixture of gamma distributions. Appendix B. Computational tools B.1. Bayesian analysis In the illustrations (Section 5) we used the ’zero trick’ a powerful tool available at the prior level in the Gibbs sampler algorithm @BUGS software, which can be adopted as follows: if for a parameter, say theta, we want to use a prior distribution, and it is not part of the standard distributions, a single zero Poisson observation (with mean phi = phi(theta)) contributes a term exp(-phi) to the likelihood for theta, when this is combined with a ’flat’ prior for theta the correct prior distribution results. An example in @BUGS code is # trick using a zero zero < − 0 theta ∼ dflat() # flat prior phi <-expression for −log(desired prior for theta) zero ∼ dpois(phi). B.2. Maximum likelihood estimation The RNGCT in the statistical package 䉷IMSL, can simulate data from the generalized gamma-type distribution (see Appendix A.) This RNGCT uses the inverse CDF technique, by interpolation of points of the distribution function given in a setup table. The interpolation used is a technique, which gives a description of the Akima algorithm. The relative errors using the Akima interpolation are generally considered very good. The mathematical computer software @Mathematica and 䉷Maple, can then solve Eqs. ((A.12) and (A.13), see Appendix A) using the function (FSOLVE). This function provides a useful and powerful tool to compute the generalized hypergeometric function numerically and approximate the digamma type functions of the parameters of the generalized gamma-type distribution given by Agarwal and Al-Saleh (2001).
4378
J.A. Al-Saleh, S.K. Agarwal / Computational Statistics & Data Analysis 51 (2007) 4369 – 4378
References Agarwal, S.K., Kalla, S.L., 1996. A generalized gamma distribution and its application in reliability. Commun. Statist. Theory Methods 25 (1), 201–210. Agarwal, S.K., Al-Saleh, J.A., 2001. Generalized gamma type distribution and its hazard rate function. Comm. Statist. Theory Methods 30 (2), 309–318. Al-Saleh, J.A., Agarwal, S.K., 2002. Finite mixture of certain distributions. Comm. Statist. Theory Methods 31 (12), 2123–2137. Andrews, L.C., 1998. Special Functions of Mathematics for Engineers. Oxford University Press, Oxford, UK. Balakrishnan, N., Cohen, A.C., 1991. Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, USA. Cohen, A.C., Whitten, B.J., 1988. Parameter Estimation in Reliability and Life Span Models. Marcel Dekker, New York, USA. Diebolt, J., Robert, C.P., 1994. Estimation of finite mixture distributions by Bayesian sampling. J. Roy. Statist. Soc. (Ser. B) 56, 363–373. Gaver, D.P., O’Muircheartaigh, I.G., 1987. Robust empirical Bayes analyses of event rates. Technometrics 29 (1), 1–15. George, E., Makov, U., Smith, A., 1993. Conjugate likelihood distributions. Scandinavian J. Statist. 20, 147–156. Hobert, J.P., Robert, C.P., Titterington, D.M., 1999. On perfect simulation for some mixtures of distributions. Statist. Comput. 9, 287–298. Hsiao, I.-T., Rangarajan, A., Gindi, G., 1998. Joint-MAP reconstruction/segmentation for transmission tomography using mixture models as priors. Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference, vol. II, pp. 1689–1693. Hsiao, I.-T., Rangarajan, A., Gindi, G., 2001. Bayesian image reconstruction for transmission tomography using mixture model priors and deterministic annealing algorithms. Proceedings of SPIE MI. Kobayashi, K., 1991. On generalized gamma functions occurring in diffraction theory. J. Phys. Soc. Japan 60 (5), 1501–1512. Lee, M., Gross, A., 1991. Lifetime distributions under unknown environment. J. Statist. Plann. Inference 29, 137–143. Pham, T., Almhana, J., 1995. The generalized gamma distribution: its hazard rate and strength model. IEEE Trans. Reliability 44, 392–397. West, M. 1992. Modeling with mixtures. Bayesian Statistics, vol. 4, Oxford, Oxford University Press.