Applied Mathematics and Computation 182 (2006) 200–209 www.elsevier.com/locate/amc
Discrete distributions from moment generating function

P.L. Novi Inverardi, A. Tagliani *

Department of Computer and Management Sciences, University of Trento, via Inama, 1 – 38100 Trento, Italy

* Corresponding author. E-mail address: [email protected] (A. Tagliani).

doi:10.1016/j.amc.2006.03.048
Abstract

The recovery of positive discrete distributions from their moment generating function (mgf) is considered. From the mgf and some integer moments, proper fractional moments are obtained. The latter represent the available information about the distribution. The maximum entropy machinery is then invoked to find the approximate distribution. It is proved that the approximant converges in entropy, in information divergence, and hence in total variation, so that accurate expected values may be obtained. Some numerical experiments are illustrated.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Entropy; Fractional moments; Hankel matrix; Integer moments; Maximum entropy; Moment generating function
1. Introduction

In Applied Probability and Statistics, the moment generating function (mgf) is usually considered a valuable tool for calculating moments or for identifying the distribution of a sum of independent random variables. When the underlying distribution is absolutely continuous, the corresponding density is recovered by mgf inversion, drawing on the plethora of Laplace transform inversion techniques. On the contrary, if the underlying distribution is discrete, inversion techniques seem lacking. On the other hand, discrete distributions have many important applications in the study of queueing systems [8] and in risk theory.

In this paper, we propose a probability mass function (pmf) reconstruction procedure from its mgf, based on the Maximum Entropy approach and passing through fractional moments. It is a well-known fact that, given M pieces of information I_1, I_2, …, I_M in terms of expected values, the Maximum Entropy method [5] allows us to recover the Shannon-entropy maximizing pmf p^(M), which is the most consistent and coherent with (and only with) the information contained in the M quantities I_1, I_2, …, I_M. However, in order for the reconstructed pmf to be operative, the quantities I_1, I_2, …, I_M must characterize the underlying distribution. This requirement is not a negligible aspect of the procedure, and the most popular choice of I_1, I_2, …, I_M consists of the integer moments which, in many (although not all) cases, characterize the distribution, i.e., guarantee the existence of a unique distribution having these moments. Recently, Novi Inverardi and Tagliani [11] proposed to involve fractional moments in density
recovering, on the basis of the following motivations: first, there is a result of Lin [10] which supports the characterization of a distribution through its fractional moments; second, the Maximum Entropy pmf p^(M) recovered from fractional moments converges in entropy to the true pmf p. This last result means that if we are interested in approximating some characteristic constants of a discrete distribution (expected values, probabilities or other), then the equivalent counterparts evaluated on p^(M) are as close as we like to the true values, the closeness depending on the (increasing) value of M. For recovering a distribution (continuous or discrete) in the Maximum Entropy setup, fractional moments are definitely better than integer moments, as shown in Novi Inverardi and Tagliani [11]. But they are not always easy to evaluate. Traditionally, the mgf of a random variable X is used to generate positive integer moments of X. But it is clear that the mgf also contains a wealth of knowledge about arbitrary real moments and hence about fractional moments. Taking this into account, to obtain fractional moments Cressie and Borkent [3] exploited the mgf and its derivatives, while Klar [7] additionally exploited the knowledge of a set of integer moments, which can be obtained by proper integration of the mgf along a contour C of the complex plane. We will use these results to produce the building blocks, i.e., the fractional moments, of our discrete distribution reconstruction procedure. Once the fractional moments are available, the approximant p^(M) of the pmf p is obtained through the usual Maximum Entropy machinery.

2. Fractional moments from mgf

Let X = {x_1, x_2, …} be a countable discrete positive random variable with pmf p = {p_1, p_2, …}, let M(t), t complex, be its mgf, and let {μ_j}_{j=1}^∞ be its infinite sequence of integer moments (which characterize X). From M(t), fractional moments E(X^a) = Σ_i x_i^a p_i may be obtained through proper procedures, some of which are listed below.

1. By repeated differentiation of the mgf, by hand or through a symbolic manipulation language such as MACSYMA, MATHEMATICA or MAPLE, fractional calculus provides [3] E(X^a), where

E(X^a) = \frac{1}{\Gamma(k)} \int_{-\infty}^{0} (-z)^{k-1} \frac{d^n M(z)}{dz^n}\, dz, \quad k = n - a, \ k \in (0, 1), \ n \in \mathbb{N}.        (2.1)

Here Γ(·) indicates the Gamma function, and only real values of M(t) are required in (2.1).

2. Whenever repeated differentiation is a difficult task, E(X^a), 0 < a < 1, may be obtained through integration, according to Hoffmann-Jorgensen [4]:

E(X^a) = \frac{a}{\Gamma(1-a)} \int_0^{\infty} \frac{1 - M(-z)}{z^{a+1}}\, dz, \quad 0 < a < 1.        (2.2)

It is an easy task to prove that (2.2) stems from (2.1) by setting n = 1. Recalling that E(X^a) = Σ_i x_i^a p_i is an analytic function of a, if a*, 0 < a* < 1, is an arbitrary fixed value of a, we may obtain E(X^{a*}) and its first n derivatives d^j E(X^a)/da^j |_{a*}, j = 1, …, n, as

\frac{d^j}{da^j} E(X^a) \Big|_{a^*} = (-1)^j \frac{a^*}{\Gamma(1-a^*)} \int_0^{\infty} \frac{1 - M(-z)}{z^{a^*+1}} \ln^j z \, dz, \quad j = 1, \ldots, n,        (2.3)

from which, through a Taylor expansion, approximate values of E(X^a) can be obtained as

E(X^a) \simeq E_{appr}(X^a) = \sum_{j=0}^{n} \frac{E^{(j)}(X^{a^*})}{j!} (a - a^*)^j.        (2.4)
E_appr(X^a) is defined within an interval centered at a*, whose radius R is governed by the abscissa of convergence of E(X^a).

3. From M(t) and some integer moments {μ_j}_{j=1}^{M-1}, fractional moments may be obtained, according to Klar [7], as

E(X^{r+M-1}) = (-1)^M \frac{\prod_{j=0}^{M-1} (r+j)}{\Gamma(1-r)} \int_0^{\infty} s^{-r-M} \Big[ M(-s) - \sum_{j=0}^{M-1} (-1)^j \frac{\mu_j s^j}{j!} \Big]\, ds        (2.5)
with r ∈ (0, 1) and M = 1, 2, … It remains to calculate {μ_j}_{j=1}^{M-1}. The following numerical procedure, according to Choudhury and Lucantoni [1], may be useful. Expanding M(t), with t complex, in a Taylor series at t = 0, we get M(t) = \sum_{n=0}^{\infty} \frac{\mu_n}{n!} t^n. If C represents a closed counterclockwise contour in the complex t-plane and M(t) is analytic within and on C, then the Cauchy integral formula for derivatives yields

\mu_n = \frac{n!}{2\pi i} \oint_C \frac{M(t)}{t^{n+1}}\, dt.

Choosing as contour a circle of radius r_n, the previous integral may be converted into an integral over a real variable θ ∈ [0, 2π]. Applying the m-point trapezoidal rule to evaluate this integral numerically, we finally obtain

\mu_n \simeq \frac{n!}{m\, r_n^n} \sum_{j=1}^{m} M(r_n e^{2\pi i j / m})\, e^{-2\pi i n j / m}.        (2.6)

An accurate calculation of μ_n for high values of n is obtained by considering, rather than M(t), the adaptively modified M(b_n t), with b_n = (n-1) μ_{n-2}/μ_{n-1}, which provides b_n^n μ_n rather than μ_n. Here r_n = 10^{-c/(2n)} is chosen to achieve accuracy of the order of 10^{-c} in the computation of μ_n. As a consequence, fractional moments E(X^a), a > 0, may be efficiently calculated through (2.5).
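As an illustration, both computational routes lend themselves to direct numerical evaluation. The following sketch is our own (not code from the paper): it evaluates (2.2) by quadrature and (2.6) by the trapezoidal contour rule, using the Poisson mgf as a test case (the adaptive scaling of [1] is omitted for brevity); numpy and scipy are assumed available.

```python
import math
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma
from scipy.stats import poisson

def frac_moment(mgf, a):
    """E(X^a), 0 < a < 1, via Eq. (2.2): a/Gamma(1-a) * int_0^inf (1 - M(-z))/z^(a+1) dz."""
    f = lambda z: (1.0 - mgf(-z)) / z ** (a + 1.0)
    i1, _ = quad(f, 0.0, 1.0)            # integrand ~ z^(-a) near 0: integrable
    i2, _ = quad(f, 1.0, np.inf)
    return a / gamma(1.0 - a) * (i1 + i2)

def int_moments(mgf, n_max, m=256, c=8):
    """Integer moments mu_1..mu_n_max via the trapezoidal contour rule, Eq. (2.6)."""
    mus = []
    for n in range(1, n_max + 1):
        r = 10.0 ** (-c / (2.0 * n))     # r_n = 10^(-c/(2n)): aliasing error ~ 10^(-c)
        s = sum(mgf(r * np.exp(2j * np.pi * j / m)) * np.exp(-2j * np.pi * n * j / m)
                for j in range(m))
        mus.append(math.factorial(n) * s.real / (m * r ** n))
    return mus

lam, a = 5.0, 0.5
mgf = lambda t: np.exp(lam * (np.exp(t) - 1.0))   # Poisson(lam) mgf, entire function
x = np.arange(200)
print(frac_moment(mgf, a), np.sum(x ** a * poisson.pmf(x, lam)))  # E(X^0.5), two ways
print(int_moments(mgf, 4))               # exact values: 5, 30, 205, 1555
```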
3. Recovering p from fractional moments

Let X be a countable positive discrete r.v. with pmf p and Shannon entropy H[p] = -\sum_i p_i \ln p_i, and let the fractional moments E(X^{a_j}), j = 0, …, M, with a_0 = 0 (so that E(X^{a_0}) = 1), be the available information about X. From the studies of Kesavan and Kapur [6], we know that the Shannon-entropy maximizing pmf p^(M) = {p_1^(M), p_2^(M), …} which shares the M fractional moments E(X^{a_j}), j = 0, …, M, of p is

p_i^{(M)} = \exp\Big( -\sum_{j=0}^{M} \lambda_j x_i^{a_j} \Big), \quad i = 1, 2, \ldots        (3.1)
Here (λ_0, …, λ_M) are Lagrange multipliers, determined by the condition that the first M fractional moments of p^(M) coincide with those of p, i.e.,

E(X^{a_j}) = \sum_i x_i^{a_j} p_i^{(M)}, \quad j = 0, \ldots, M, \ a_0 = 0.        (3.2)
The Shannon entropy H[p^(M)] of p^(M) is given by

H[p^{(M)}] = -\sum_i p_i^{(M)} \ln p_i^{(M)} = \sum_{j=0}^{M} \lambda_j E(X^{a_j}).        (3.3)
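To fix ideas, here is a small sketch of (3.1)–(3.3) on a truncated support; the exponents and multipliers are arbitrary illustrative choices of ours, not values from the paper.

```python
import numpy as np

x = np.arange(0.0, 50.0)                   # truncated support {0, 1, ..., 49}
alphas = np.array([0.0, 0.25, 0.5, 0.75])  # a_0 = 0 plus fractional exponents
lambdas = np.array([0.0, 0.5, 0.5, 0.5])   # illustrative multipliers, lambda_0 fixed below

powers = x[:, None] ** alphas[None, :]     # x_i^{a_j}  (x^0 = 1)
p = np.exp(-powers @ lambdas)              # Eq. (3.1), up to normalization
lambdas[0] = np.log(p.sum())               # fix lambda_0 so that sum_i p_i = 1
p /= p.sum()

mu = powers.T @ p                          # fractional moments, Eq. (3.2); mu[0] = 1
H = -(p * np.log(p)).sum()
print(np.allclose(H, lambdas @ mu))        # Eq. (3.3): H[p^(M)] = sum_j lambda_j mu_j
```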
The choice of the fractional moments {E(X^{a_1}), …, E(X^{a_M})} relies upon both a theorem of Lin [10] concerning the characterization of X and the convergence of p^(M) to p, shown in the next section.
Theorem 3.1 [10]. Let {a_j}_{j=0}^∞ be an infinite sequence, with a_0 = 0, a_j ∈ (0, a*) and E(X^{a*}) < +∞; then the sequence of expected values {E(X^{a_j})}_{j=0}^∞ guarantees the existence of a unique p.
4. The convergence

4.1. Convergence of p^(M) to p

Tagliani [12] has shown that if the infinite sequence of integer moments {μ_j}_{j=0}^∞ determines p uniquely, then the ME approximant p^(M), constrained by the first M moments of that sequence, converges in entropy to p. Taking this into account, we will prove the following theorem:
Theorem 4.1. Let {E(X^{a_j})}_{j=0}^M be fractional moments of p, with a_0 = 0, a_j ∈ (0, a*), E(X^{a*}) < +∞, and a_j = jΔa = j a*/(M+1), j = 0, …, M, equispaced, with Δa = a*/(M+1). Then
\lim_{M \to \infty} H[p^{(M)}] =: \lim_{M \to \infty} \Big( -\sum_i p_i^{(M)} \ln p_i^{(M)} \Big) = H[p] =: -\sum_i p_i \ln p_i.        (4.1)
Proof. (3.2) may be written as

E(X^{j\Delta a}) = \sum_i x_i^{j\Delta a} \exp\Big( -\sum_{k=0}^{M} \lambda_k x_i^{k\Delta a} \Big).        (4.2)
Setting t_i = x_i^{\Delta a}, (4.2) becomes

\mu_j^* =: E(X^{j\Delta a}) = \sum_i t_i^{\,j} \exp\Big( -\sum_{k=0}^{M} \lambda_k t_i^k \Big), \quad j = 0, \ldots, M,        (4.3)
which is an ordinary moment problem relative to a ME distribution X*. The latter has pmf p*^(M) = {p*_1^(M), p*_2^(M), …}, with p*_i^(M) = exp(-\sum_{k=0}^{M} λ_k t_i^k), and outcomes t_i. Both ME distributions p^(M) and p*^(M) have the same entropy H[p^(M)] = H[p*^(M)] = \sum_{j=0}^{M} λ_j E(X^{jΔa}) (same masses, but located at different points x_i and t_i, respectively). More precisely, X* is a set of distributions, depending on M. Letting M → ∞, the ordinary moments {μ_j* = E(X^{jΔa})}_{j=0}^M relative to X* are nothing but the chosen fractional moments. The latter, thanks to Theorem 3.1, characterize a unique distribution. Then the ordinary moment problem (4.3) is determinate. As a consequence, from Tagliani [12], p*^(M) converges in entropy to p, i.e., lim_{M→∞} H[p*^(M)] = H[p]. Since H[p*^(M)] = H[p^(M)] for every M, (4.1) holds. □

4.2. Other kinds of convergence

Entropy convergence, proved in Section 4.1, is a strong condition which turns out to guarantee weaker convergences. Let us introduce some preliminary quantities. If P and Q are probability measures with the same outcomes and point probabilities p_i and q_i, respectively, then the total variation between P and Q is defined as

\| P - Q \| = \sum_i | p_i - q_i |        (4.4)
and the information divergence is defined as

D(p; p^{(M)}) = \sum_i p_i \ln \frac{p_i}{p_i^{(M)}}.        (4.5)
If p and p^(M) have the same fractional moments E(X^{a_j}), j = 0, …, M, then

D(p; p^{(M)}) = H[p^{(M)}] - H[p]        (4.6)
holds. Indeed,

D(p; p^{(M)}) = \sum_i p_i \ln \frac{p_i}{p_i^{(M)}} = -H[p] + \sum_i p_i \sum_{j=0}^{M} \lambda_j x_i^{a_j} = -H[p] + \sum_{j=0}^{M} \lambda_j E(X^{a_j}) = H[p^{(M)}] - H[p].
Further, total variation and information divergence are related via the inequality [2]

D(p; p^{(M)}) \ge \frac{1}{2 \ln 2}\, \| p - p^{(M)} \|^2.        (4.7)
Then entropy convergence entails convergence in information divergence and in total variation, and the latter implies convergence in distribution. For each bounded function g having support {x_1, x_2, …}, the following upper bound on the error in the calculation of expected values may be obtained, taking into account (4.6) and (4.7):
| E_p(g) - E_{p^{(M)}}(g) | = \Big| \sum_i g_i p_i - \sum_i g_i p_i^{(M)} \Big| \le \sum_i | g_i (p_i - p_i^{(M)}) | \le \| g \|_\infty \sum_i | p_i - p_i^{(M)} |
    \le \| g \|_\infty \sqrt{ 2 \ln 2\, D(p; p^{(M)}) } = \| g \|_\infty \sqrt{ 2 \ln 2\, ( H[p^{(M)}] - H[p] ) }.        (4.8)
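The inequality (4.7) is straightforward to check numerically. In the following sketch (our own check, not the paper's), the divergence is computed with base-2 logarithms, the form in which Cover and Thomas [2] state the constant 1/(2 ln 2):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    p, q = rng.dirichlet(np.ones(20)), rng.dirichlet(np.ones(20))
    D = np.sum(p * np.log2(p / q))                 # information divergence (base-2 logs)
    V = np.sum(np.abs(p - q))                      # total variation, Eq. (4.4)
    assert D >= V ** 2 / (2 * np.log(2)) - 1e-12   # Eq. (4.7)
print("(4.7) holds on 1000 random pmf pairs")
```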
4.3. On the choice of (a_1, …, a_M)

The entropy convergence theorem of Section 4.1 and (4.8) suggest how to formulate the choice criterion for (a_1, …, a_M), with 0 = a_0 < a_1 < ⋯ < a_M. The optimal exponents are obtained as

\{a_j\}_{j=1}^{M} : \ H[p^{(M)}] = \text{minimum}.        (4.9)

The sequence is optimal in the sense that it accelerates the convergence of H[p^(M)] to H[p]. Equivalently, it uses a minimum number of fractional moments to reach a prefixed (even if unknown) gap H[p^(M)] - H[p]. The choice (4.9) reflects a principle of parsimony, selecting a model with the lowest number of parameters while avoiding the drawbacks of numerical instability which affect any inverse problem. With the choice (4.9), Theorem 4.1 is not only satisfied, but the convergence of H[p^(M)] to H[p] is also accelerated. Once {a_j}_{j=1}^M are fixed, the Lagrange multipliers are calculated by solving the following dual problem [6]:

\min_{\lambda_j} \Big\{ \ln \sum_i \exp\Big( -\sum_{j=1}^{M} \lambda_j x_i^{a_j} \Big) + \sum_{j=1}^{M} \lambda_j E(X^{a_j}) \Big\}.        (4.10)

For practical purposes, the optimal choice (4.9) is obtained as

\inf_{a_j} \min_{\lambda_j} \Big\{ \ln \sum_i \exp\Big( -\sum_{j=1}^{M} \lambda_j x_i^{a_j} \Big) + \sum_{j=1}^{M} \lambda_j E(X^{a_j}) \Big\}.        (4.11)
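A sketch of the dual problem (4.10) with scipy's BFGS minimizer, on a finite support and with exponents fixed by us for illustration; the outer infimum over (a_1, …, a_M) in (4.11), which the paper performs by Monte Carlo search, would simply wrap this inner minimization:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

x = np.arange(0, 21, dtype=float)             # support of a Binomial(20, 0.5) test pmf
p_true = binom.pmf(np.arange(21), 20, 0.5)

alphas = np.array([0.4, 0.8, 1.2, 1.6])       # illustrative exponents a_1..a_M
mu = np.array([(x ** a) @ p_true for a in alphas])   # prescribed E(X^{a_j})
powers = x[:, None] ** alphas[None, :]

def dual(lam):                                # Eq. (4.10)
    return np.log(np.sum(np.exp(-powers @ lam))) + lam @ mu

lam = minimize(dual, np.zeros(len(alphas)), method="BFGS").x
w = np.exp(-powers @ lam)
p_M = w / w.sum()                             # lambda_0 absorbed by normalization
gap = -(p_M * np.log(p_M)).sum() + (p_true * np.log(p_true)).sum()
print(gap)                                    # H[p^(M)] - H[p] >= 0, Eq. (4.6)
```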
5. Sensitivity analysis

From M(t), {μ_j}_{j=1}^{M-1} is numerically obtained, and then an approximate value E_app(X^a) of E_ex(X^a) =: E(X^a), a > 0, is obtained by (2.1)–(2.5). From (4.8) we saw that the approximant p^(M) is suitable for the calculation of expected values. We now prove that this calculation is stable when E_ex(X^a) is replaced by E_app(X^a), with |E_ex(X^a) - E_app(X^a)| ≪ 1. In the numerical examples below, the Lagrange multipliers have been calculated replacing E_ex(X^{a_j}) by E_appr(X^{a_j}). Let us call p^(M)(λ) the approximation of p^(M) obtained using E_ex(X^{a_j}), and p^(M)(λ + Δλ) the one obtained using E_appr(X^{a_j}). In both cases we assume that the optimal exponents (a_1, …, a_M) coincide, since E_appr(X^{a_j}) ≃ E_ex(X^{a_j}). We want to estimate the difference

| E_{p^{(M)}(\lambda+\Delta\lambda)}(g) - E_{p^{(M)}(\lambda)}(g) |        (5.1)

in the calculation of expected values. For notational convenience we put

\mu_j = E(X^{a_j}) = \sum_k x_k^{a_j} p_k^{(M)}, \qquad \mu_{i,j} = \sum_k x_k^{a_i} x_k^{a_j} p_k^{(M)} = \sum_k x_k^{a_i + a_j} p_k^{(M)}, \qquad \Delta\mu_j = E_{appr}(X^{a_j}) - E_{ex}(X^{a_j}).
Since x^{a_j} ∈ l²_{p^(M)}, where as usual

l^2_{p^{(M)}} = \Big\{ x^{a_j} : \sum_i x_i^{2 a_j} p_i^{(M)} < +\infty, \ j = 1, \ldots, M, \ M \ge 1 \Big\},        (5.2)
it follows that:

1. the matrix with entries μ_{i,j}, G_M = [μ_{i,j}], is a Gram matrix;
2. the functions {x^{a_j}} are linearly independent on the support {x_0, x_1, …}, and then |G_M| > 0;
3. the space V_M of moments, given by the convex hull generated by the points {x_i^{a_1}, …, x_i^{a_M}}, i = 1, 2, …, has a non-empty interior (Krein [9]); such a convex hull is a convex polyhedron (polytope) whose vertices are located on the convex hull of the curve {x^{a_1}, …, x^{a_M}}, x ∈ [0, ∞);
4. if the sequence of prescribed moments {E(X^{a_1}), …, E(X^{a_M})} is an inner point of V_M, then there are uncountably many probability distributions having such prescribed moments, one of them being p^(M). Otherwise:
5. the Gram matrix is singular if {x^{a_1}, …, x^{a_M}} is a set of functions linearly dependent on the support {x_0, x_1, …};
6. if the sequence of prescribed moments {E(X^{a_1}), …, E(X^{a_M})} is a boundary point of V_M, then there is a unique probability distribution having such moments; such a distribution takes on a finite number of outcomes [9].

If a_n, with 0 ≤ n ≤ M, is an arbitrary index, then from

\sum_i x_i^{a_n} \exp\Big( -\sum_{j=0}^{M} \lambda_j x_i^{a_j} \Big) = E(X^{a_n}) = \mu_n, \quad n = 0, \ldots, M,        (5.3)

differentiating both sides with respect to μ_i = E(X^{a_i}), 0 ≤ i ≤ M, while the remaining expected values are held fixed, we obtain

G_M \, [\, d\lambda_0/d\mu_i, \ \ldots, \ d\lambda_M/d\mu_i \,]^T = -e_{i+1},        (5.4)

where e_{i+1} is the canonical unit column vector of R^{M+1}.
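A sketch of the sensitivity system (5.4), again on a truncated support with illustrative exponents and multipliers of our own choosing: build the Gram matrix G_M from the inner products above and solve for the derivatives of the multipliers with respect to the prescribed moments.

```python
import numpy as np

x = np.arange(0.0, 50.0)
alphas = np.array([0.0, 0.3, 0.6, 0.9])
lambdas = np.array([0.0, 0.4, 0.3, 0.2])
powers = x[:, None] ** alphas[None, :]

p = np.exp(-powers @ lambdas)
lambdas[0] = np.log(p.sum())                 # normalize: sum_i p_i = 1
p /= p.sum()

# Gram matrix entries mu_{i,j} = sum_k x_k^{a_i + a_j} p_k^{(M)}
G = np.einsum("ki,kj,k->ij", powers, powers, p)

# Eq. (5.4): G_M (dlambda/dmu_i) = -e_{i+1}; column i gives the sensitivity of
# all multipliers to a perturbation of the i-th prescribed moment.
dlam_dmu = -np.linalg.solve(G, np.eye(len(alphas)))
print(dlam_dmu)
```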
5.1. Relative error calculation

Let us consider the relative error at point x_i,

\varepsilon[p_i^{(M)}] =: \frac{p_i^{(M)}(\lambda + \Delta\lambda) - p_i^{(M)}(\lambda)}{p_i^{(M)}(\lambda)}.        (5.5)
It is to be noted that if all the expected values μ_i are changed to μ_i + Δμ_i, then the corresponding λ_i becomes λ_i + Δλ_i, with Δλ_i = Δλ_i(Δμ_0, …, Δμ_M). By Taylor expansion we have

\varepsilon[p_i^{(M)}] =: \frac{p_i^{(M)}(\lambda + \Delta\lambda) - p_i^{(M)}(\lambda)}{p_i^{(M)}(\lambda)} = \exp\Big( -\sum_{j=0}^{M} \Delta\lambda_j x_i^{a_j} \Big) - 1        (5.6)

\simeq -\sum_{j=0}^{M} x_i^{a_j} \sum_{k=0}^{M} \frac{d\lambda_j}{d\mu_k} \Delta\mu_k        (5.7)
= \sum_{j=0}^{M} x_i^{a_j} e_{j+1}^T \Big( G_M^{-1} [\Delta\mu_0, 0, \ldots, 0]^T + G_M^{-1} [0, \Delta\mu_1, 0, \ldots, 0]^T + \cdots + G_M^{-1} [0, \ldots, 0, \Delta\mu_M]^T \Big) = \sum_{j=0}^{M} x_i^{a_j} e_{j+1}^T G_M^{-1} [\Delta\mu_0, \ldots, \Delta\mu_M]^T        (5.8)

= \big( [x_i^{a_0}, 0, \ldots, 0] + [0, x_i^{a_1}, 0, \ldots, 0] + \cdots + [0, \ldots, 0, x_i^{a_M}] \big)\, G_M^{-1} [\Delta\mu_0, \ldots, \Delta\mu_M]^T        (5.9)

= [x_i^{a_0}, x_i^{a_1}, \ldots, x_i^{a_M}]\, G_M^{-1} [\Delta\mu_0, \ldots, \Delta\mu_M]^T = -\frac{1}{|G_M|} \begin{vmatrix} 0 & x_i^{a_0} & x_i^{a_1} & \cdots & x_i^{a_M} \\ \Delta\mu_0 & \mu_{0,0} & \mu_{0,1} & \cdots & \mu_{0,M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \Delta\mu_M & \mu_{M,0} & \mu_{M,1} & \cdots & \mu_{M,M} \end{vmatrix}.        (5.10)
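The bordered-determinant identity behind (5.10), v^T G^{-1} u = -det([[0, v^T], [u, G]]) / det(G), is easy to verify numerically (our own check):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
G = A @ A.T + 5 * np.eye(5)                  # symmetric positive definite stand-in for G_M
v, u = rng.standard_normal(5), rng.standard_normal(5)

bordered = np.block([[np.zeros((1, 1)), v[None, :]],
                     [u[:, None], G]])
lhs = v @ np.linalg.solve(G, u)
rhs = -np.linalg.det(bordered) / np.linalg.det(G)
print(np.isclose(lhs, rhs))                  # True
```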
5.2. Calculation of expected values

(4.4) may be evaluated under the following assumption:

\frac{\Delta\mu_j}{\mu_j} = \frac{E_{appr}(X^{a_j}) - E_{ex}(X^{a_j})}{E_{ex}(X^{a_j})} = \Delta\mu = \text{constant} \quad \forall j.        (5.11)

Substituting (5.11) into (5.10) and recalling that μ_{j,0} = μ_j, we have

\varepsilon[p_i^{(M)}] = \Delta\mu\, \frac{|G_M|}{|G_M|} = \Delta\mu.        (5.12)

As far as the expected value E[g] is concerned, we have

| E_{p^{(M)}(\lambda+\Delta\lambda)}[g] - E_{p^{(M)}(\lambda)}[g] | = \Big| \sum_i \big( p_i^{(M)}(\lambda+\Delta\lambda) - p_i^{(M)}(\lambda) \big) g_i \Big|        (5.13)

\le \| g \|_\infty \sum_i | p_i^{(M)}(\lambda+\Delta\lambda) - p_i^{(M)}(\lambda) |        (5.14)

= \| g \|_\infty\, \Delta\mu \sum_i p_i^{(M)}(\lambda) = \| g \|_\infty\, \Delta\mu.        (5.15)
Hence, under hypothesis (5.11), the calculation of expected values is stable under small fluctuations of E_ex(X^a).

6. Numerical results

The efficiency of the method exposed above in recovering some common pmfs is illustrated through several examples. In all the examples the fractional moments are obtained through (2.5). The approximating pmf is given by (3.1), whose parameters and optimal fractional moments are estimated by means of (4.11); thus the recovery procedure is purely numerical. In particular, (4.11) is solved through a Monte Carlo procedure as far as the set (a_1, …, a_M) is concerned, so that the method is time consuming. Out of curiosity, in all the examples a graphical comparison of p^(M) and p is reported too, even though only entropy convergence is guaranteed by our procedure; hence D(p; p^(M)) = H[p^(M)] - H[p] is the only significant quantity to be reported.
Example 1. Let X ~ Binomial(n, k), so that p_i = \binom{n}{i} k^i (1-k)^{n-i}, i = 0, \ldots, n, and M(t) = (k e^t + 1 - k)^n for all t. Here n = 20 and k = 0.5 are chosen, and then H[p] ≃ 2.2234239158. Maximum entropy pmfs with an increasing number M of fractional moments are considered.
Table 1
Entropy difference of distributions having an increasing number of common fractional moments

M    H[p^(M)] - H[p]
1    0.7508E0
2    0.2313E-3
3    0.3668E-4
4    0.5794E-5
[Figure] Fig. 1. (a) Exact pmf, (b) graphical comparison of p^(M) and p, when M = 4.
In Table 1 the entropy difference H[p^(M)] - H[p] is reported. Entropy convergence is fast or, equivalently, 3-4 fractional moments capture approximately all the information content of the distribution. Fig. 1 shows a graphical comparison of p^(M) and p when M = 4.
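An end-to-end sketch of Example 1 (our own code): instead of the paper's Monte Carlo search (4.11) over exponents, the equispaced choice of Theorem 4.1 with a* = 1 is used, and since the support is finite the fractional moments are computed here by direct summation rather than via (2.5).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

n, k, M = 20, 0.5, 4
x = np.arange(0, n + 1, dtype=float)
p = binom.pmf(np.arange(n + 1), n, k)
H_true = -(p * np.log(p)).sum()              # ~2.2234

alphas = np.arange(1, M + 1) / (M + 1)       # a_j = j * a*/(M+1), with a* = 1
mu = np.array([(x ** a) @ p for a in alphas])
powers = x[:, None] ** alphas[None, :]

dual = lambda lam: np.log(np.sum(np.exp(-powers @ lam))) + lam @ mu   # Eq. (4.10)
lam = minimize(dual, np.zeros(M), method="BFGS").x
w = np.exp(-powers @ lam)
p_M = w / w.sum()
print(-(p_M * np.log(p_M)).sum() - H_true)   # entropy gap H[p^(M)] - H[p]
```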
Table 2
Entropy difference of distributions having an increasing number of common fractional moments

M    H[p^(M)] - H[p]
1    0.2220E0
2    0.1669E-3
3    0.2034E-5
4    0.5437E-7
[Figure] Fig. 2. (a) Exact pmf, (b) graphical comparison of p^(M) and p, when M = 4.
Example 2. Let X ~ Poisson(k), so that p_i = e^{-k} k^i / i!, i = 0, 1, \ldots, and M(t) = e^{k(e^t - 1)} for all t. Here k = 5 is set, and then H[p] ≃ 2.204395243628. The same quantities as in Example 1 are reported, and entropy convergence is fast (Table 2): approximately 3-4 fractional moments capture all the information content of the distribution. Fig. 2 shows a graphical comparison of p^(M) and p when M = 4.

Example 3. The number of customers served in a busy period of an M/M/1 queue with traffic intensity ρ has pmf [8]
p_i = \frac{1}{i} \binom{2(i-1)}{i-1} \frac{\rho^{i-1}}{(1+\rho)^{2i-1}}, \quad i = 1, 2, \ldots        (6.1)

and M(t) = \frac{1 - \sqrt{1 - b e^t}}{\sqrt{b\rho}}, where b = \frac{4\rho}{(1+\rho)^2}. Setting ρ = 0.75, then H[p] ≃ 1.805976781052. From Table 3 we conclude that entropy convergence is fast, and Fig. 3 shows a graphical comparison of p^(M) and p when M = 5.
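As a sanity check of the pair (6.1)–M(t) as reconstructed above (our own check): the pmf should normalize for ρ < 1, and the mean number served should equal M'(0) = 1/(1 - ρ).

```python
import numpy as np
from scipy.special import gammaln

rho = 0.75
i = np.arange(1.0, 20001.0)
logp = (gammaln(2 * i - 1) - 2 * gammaln(i)  # log C(2(i-1), i-1), kept in log space
        + (i - 1) * np.log(rho) - (2 * i - 1) * np.log(1 + rho) - np.log(i))
p = np.exp(logp)
print(p.sum())                               # ~1.0 for rho < 1

b = 4 * rho / (1 + rho) ** 2
M = lambda t: (1 - np.sqrt(1 - b * np.exp(t))) / np.sqrt(b * rho)
h = 1e-6
print((i * p).sum(), (M(h) - M(-h)) / (2 * h), 1 / (1 - rho))  # all ~4.0
```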
Example 4. A mixture of two Poisson distributions is considered, with

p_i = x e^{-k_1} \frac{k_1^i}{i!} + (1-x) e^{-k_2} \frac{k_2^i}{i!}, \quad i = 0, 1, \ldots        (6.2)

M(t) = x e^{k_1 (e^t - 1)} + (1-x) e^{k_2 (e^t - 1)} \quad \forall t.        (6.3)

Setting k_1 = 1, k_2 = 7 and x = 0.5, then H[p] ≃ 2.40986482. Entropy convergence is slow; perhaps the complex structure of the distribution slows down the convergence. A high number of fractional moments is then required to capture the information content of the distribution (Table 4). Fig. 4 shows a graphical comparison of p^(M) and p when M = 7.
Table 3
Entropy difference of distributions having an increasing number of common fractional moments

M    H[p^(M)] - H[p]
1    0.6107E-2
2    0.3698E-2
3    0.1483E-3
4    0.6652E-4
5    0.1043E-4
[Figure] Fig. 3. (a) Exact pmf, (b) graphical comparison of p^(M) and p, when M = 5.
Table 4
Entropy difference of distributions having an increasing number of common fractional moments

M    H[p^(M)] - H[p]
1    0.6965E-1
2    0.4901E-1
3    0.1817E-1
4    0.4486E-2
5    0.2494E-2
6    0.1677E-2
7    0.1360E-2
[Figure] Fig. 4. (a) Exact pmf, (b) graphical comparison of p^(M) and p, when M = 7.
From the examples reported above, we may draw the conclusion that fractional moments are a powerful tool for recovering a pmf when its mgf is given. A few fractional moments capture approximately all the information content carried by the mgf. Expected values are accurately calculated and instability problems are avoided.

References

[1] G.L. Choudhury, D.M. Lucantoni, Numerical computation of the moments of a probability distribution from its transform, Operations Research 44 (2) (1996) 368–381.
[2] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[3] N. Cressie, M. Borkent, The moment generating function has its moments, Journal of Statistical Planning and Inference 13 (1986) 337–344.
[4] J. Hoffmann-Jorgensen, Probability with a View towards Statistics, vol. 1, Chapman and Hall, New York, 1994.
[5] E.T. Jaynes, Where do we stand on maximum entropy, in: R.D. Levine, M. Tribus (Eds.), The Maximum Entropy Formalism, MIT Press, Cambridge, MA, 1978, pp. 15–118.
[6] H.K. Kesavan, J.N. Kapur, Entropy Optimization Principles with Applications, Academic Press, New York, 1992.
[7] B. Klar, On a test for exponentiality against Laplace order dominance, Statistics 37 (6) (2003) 505–515.
[8] L. Kleinrock, Queueing Systems, Volume 1: Theory, Wiley, New York, 1975.
[9] M.G. Krein, The ideas of P.L. Chebyshev and A.A. Markov in the theory of limiting values of integrals and their further development, American Mathematical Society Translations, Series 2 12 (1959).
[10] G.D. Lin, Characterizations of distributions via moments, Sankhya: The Indian Journal of Statistics, Series A 54 (1992) 128–132.
[11] P.L. Novi Inverardi, A. Tagliani, Maximum entropy density estimation from fractional moments, Communications in Statistics – Theory and Methods 32 (2) (2003) 327–346.
[12] A. Tagliani, Inverse Z transform and moment problem, Probability in the Engineering and Informational Sciences 14 (2000) 393–404.