Journal of Statistical Planning and Inference 136 (2006) 159–182
www.elsevier.com/locate/jspi

On some properties of a class of weighted quasi-binomial distributions

Subrata Chakraborty^a,*, Kishore K. Das^b
a Department of Statistics, Dibrugarh University, Dibrugarh 786004, Assam, India
b Department of Statistics, Gauhati University, Guwahati, Assam, India

Received 15 December 2002; accepted 18 June 2004. Available online 14 August 2004

Abstract

A class of weighted quasi-binomial distributions is derived as the class of weighted distributions of QBD I (Sankhyā 36 (Ser. B, Part 4) 391). The moments, inverse moments, recurrence relations among moments, bounds for the mode, the problem of estimation, the fitting of real-life data by different methods and the limiting distributions of the class are studied. Consul's (Comm. Statist. Theory Methods 19(2) (1990) 477) results on QBD I emerge as particular cases.
© 2004 Elsevier B.V. All rights reserved.

MSC: 60E05; 62E15; 62F10

Keywords: Quasi-binomial distributions; Weighted quasi-binomial distribution; Moments; Inverse moments; Recurrence relation; Limiting distributions
1. Introduction

Consul (1974) first introduced the notion of an urn model with a pre-determined strategy, through a two-urn model, and developed the quasi-binomial distribution (QBD) I with probability function (p.f.)

Pr(X = k) = C(n, k) p (p + kθ)^{k−1} (1 − p − kθ)^{n−k},   k = 0(1)n,  −p/n < θ < (1 − p)/n,   (1)

where C(n, k) denotes the binomial coefficient. The distribution was developed
* Corresponding author. E-mail address: [email protected] (S. Chakraborty).
doi:10.1016/j.jspi.2004.05.015
using sampling-with-replacement schemes. He gave justification for and mentioned applications of these distributions in various fields. Consul and Mittal (1975) defined QBD II using a four-urn model with a pre-determined strategy as

Pr(X = k) = [p(1 − p − nθ)/(1 − nθ)] C(n, k) (p + kθ)^{k−1} (1 − p − kθ)^{n−k−1},   k = 0(1)n;  −p/n < θ < (1 − p)/n,   (2)
and indicated a large number of possible applications. Janardan (1975) obtained the QBD as a particular case of his quasi-Polya distribution. Berg (1985) generated QBD I from modified Charlier type B expansions, also obtaining its mean and variance, and noted how Abel's generalisation of the binomial identity (Riordan, 1968) is related to the QBDs. Berg and Mutafchiev (1990) mentioned that new modified QBDs can be developed using Abel's generalisations of the binomial formula; they also indicated how different Abel identities can be used to define new weighted QBDs, and showed applications of some QBDs and modified QBDs in random mapping problems. Charalambides (1990), in his paper on Abel series distributions (ASD), discussed QBD I and QBD II; he also obtained μ′_1, μ_{(2)} and μ_2 for QBD I and QBD II from those of the ASD by using generating functions and Bell polynomials (Riordan, 1968), but the formulas for μ_{(2)} and μ_2 were not in compact form. Consul (1990) studied properties of QBD I with p.f. (1), deducing moments and inverse moments and considering maximum likelihood estimation and data fitting. Berg and Nowicki (1991) showed how QBD I arises in connection with the distribution of the size of the forest consisting of all trees rooted in a loop in a random mapping (see also Jaworski, 1984). They also indicated how difficult inference with the QBD is, and showed that for large sample sizes the QBD approaches the generalised Poisson distribution (GPD) (Consul and Jain, 1973a). As pointed out in Berg and Mutafchiev (1990), Mishra et al. (1992) and Das (1993, 1994) used Abel's generalisation of the binomial identities (Riordan, 1968) to define a class of distributions, called the class of quasi-binomial distributions, having p.f.

Pr(X = k) = C(n, k) (p + kθ)^{k+s} (1 − p − kθ)^{n−k+t} / B_n(p, q; s, t; θ),   k = 0(1)n,   (3)

where s and t are integers, p + q + nθ = 1 and

B_n(p, q; s, t; θ) = Σ_{k=0}^{n} C(n, k) (p + kθ)^{k+s} (1 − p − kθ)^{n−k+t},   (4)
and obtained QBD I, QBD II and some new QBDs by choosing different values of s and t in the p.f. (3) as follows:

(i) s = −1, t = 0: QBD type I (Consul, 1974).
(ii) s = −1, t = −1: QBD type II (Consul and Mittal, 1975).
(iii) s = −2, t = 0: QBD type III (Das, 1993, 1994).
(iv) s = −2, t = −1: QBD type IV (Das, 1993, 1994).
(v) s = 0, t = 0: QBD type VII (Das, 1993, 1994).
We have noticed an error in the formula for the rth factorial moment of (3) presented in Das (1993, 1994). Mishra and Singh (2000) obtained the first four moments about the origin for QBD I by defining a factorial power series. Chakraborty and Das (2000) and Chakraborty (2001) obtained QBD I as a particular case of a unified probability model, and QBD II from a generalised probability model (Chakraborty, 2001). Since the class of distributions in (3) contains some well-known distributions such as QBD I and QBD II, besides many new QBDs, we study some important aspects of this class in the present article. In Section 2 we first show how the class in (3) can be derived as a class of weighted distributions of QBD I given in Eq. (1) by choosing different weight functions, and we call it the class of weighted quasi-binomial distributions (WQBD). In the next section we present formulas for factorial moments, moments about the origin and central moments, in general and in particular for QBD I and QBD II, and we obtain the mean, variance, μ_3 and μ_4. In Section 4 two general results for inverse moments and some important particular cases are stated, besides some results on inverse factorial moments. Here we have used Abel's summation formulas, umbral notations (Riordan, 1958, 1968) and some generalisations thereof (Chakraborty, 2001), which lead to considerable simplification of the otherwise complicated formulae and their proofs. A list of identities related to a class of Abel's generalisations of the binomial identities is presented in the appendix. In Section 5 we state a general result on the mode of the class of distributions, while in Section 6 we study the problem of parameter estimation by various methods, including the maximum likelihood (ML) method, and present data fitting by ML estimation with four illustrations. In the final section we show how, for large values of n and under certain conditions, the class of WQBD approaches the class of weighted generalised Poisson distributions (Chakraborty, 2001).
2. A class of weighted QBDs

The weighted probability mass function (p.m.f.) Q(x) corresponding to the p.m.f. p(x) of a random variable X with weight function w(x) is given by

Q(x) = w(x)p(x) / Σ_x w(x)p(x).

In case w(x) = x, the distribution is said to be size biased or mean weighted (see Johnson et al., 1992, p. 145).

Definition 1. A random variable X is said to follow the class of WQBD if its p.f. has the form

Pr(X = k) = C(n, k) (p + kθ)^{k+s} (q + (n − k)θ)^{n−k+t} / B_n(p, q; s, t; θ),   (5)
where

B_n(p, q; s, t; θ) = Σ_{k=0}^{n} C(n, k) (p + kθ)^{k+s} (q + (n − k)θ)^{n−k+t}   (Riordan, 1968),   (6)

p and q are non-negative fractions with p + q + nθ = 1, −p/n < θ < (1 − p)/n, and s, t are integers. Alternatively, (5) can also be written as

Pr(X = k) = C(n, k) (p + kθ)^{k+s} (1 − p − kθ)^{n−k+t} / B_n(p, 1 − p − nθ; s, t; θ).   (7)
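The class (5)–(7) is straightforward to evaluate numerically. The following sketch (an illustration, not part of the original paper; the parameter values are arbitrary admissible choices) computes the WQBD p.f. and checks that s = −1, t = 0 recovers QBD I of Eq. (1):

```python
from math import comb

def B(n, p, q, s, t, theta):
    # Normalising constant B_n(p, q; s, t; theta) of Eq. (6)
    return sum(comb(n, k) * (p + k*theta)**(k + s)
               * (q + (n - k)*theta)**(n - k + t) for k in range(n + 1))

def wqbd_pmf(k, n, p, theta, s, t):
    # Pr(X = k) of Eq. (5), with q = 1 - p - n*theta
    q = 1 - p - n*theta
    return (comb(n, k) * (p + k*theta)**(k + s)
            * (q + (n - k)*theta)**(n - k + t)) / B(n, p, q, s, t, theta)

n, p, theta = 6, 0.2, 0.05            # arbitrary admissible values
total = sum(wqbd_pmf(k, n, p, theta, -1, 0) for k in range(n + 1))
qbd1 = [comb(n, k) * p * (p + k*theta)**(k - 1) * (1 - p - k*theta)**(n - k)
        for k in range(n + 1)]
```

Here total equals 1 by construction and, since B_n(p, q; −1, 0; θ) = 1/p (an Abel identity), wqbd_pmf(k, n, p, θ, −1, 0) agrees with the QBD I probabilities qbd1[k].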
Theorem 1. If X ∼ QBD I (Consul, 1974) with parameters n, p, θ, then the weighted distribution of X with weight w(x) = x^{(s+1)}(n − x)^{(t)}, where x^{(r)} = x(x − 1) ⋯ (x − r + 1), is given by

Pr(X = k) = C(n − s − t − 1, k − s − 1) {p + (s + 1)θ + (k − s − 1)θ}^{k−1} {1 − p − (s + 1)θ − (k − s − 1)θ}^{n−k} / B_{n−s−t−1}(p + (s + 1)θ, 1 − p − (n − t)θ; s, t; θ),   k = (s + 1)(1)(n − t).   (8)

Hence, the distribution of Y = X − s − 1 is given by

Pr(Y = k) = C(m, k) (p′ + kθ)^{k+s} (1 − p′ − kθ)^{m−k+t} / B_m(p′, 1 − p′ − mθ; s, t; θ),   k = 0(1)m,   (9)

where m = n − s − t − 1, p′ = p + (s + 1)θ and −p′/m < θ < (1 − p′)/m, which is the class of QBD given in (7).

Some important observations: denoting the class (7) by WQBD(n; p, θ; s, t), it can be seen that the form of the weighted QBD I in (8) is (s + 1) + WQBD(n − s − t − 1; p + (s + 1)θ, θ; s, t). In particular, for

I. s = 0, t = 0 and θ = 0, we get the size-biased form of the binomial distribution (Johnson et al., 1992, p. 146) as 1 + Binomial(n − 1; p).
II. s = 0, t = 0, the size-biased form of QBD I becomes 1 + WQBD(n − 1; p + θ, θ; 0, 0).
Theorem 2. If X ∼ QBD I (Consul, 1974) with parameters n, p, θ, then the weighted distribution of X with weight w(x) = (p + xθ)^{s+1}(1 − p − xθ)^{t} is given by (7).

Note: Following Gupta (1975), we may refer to the class of distributions having the p.f. (8) as a class of mixed moment distributions of X ∼ QBD I(n, p, θ) having p.f. (1).
Both Theorems 1 and 2 provide characterisations of the class of distributions in (7) by showing how the class can be derived as a class of weighted distributions of QBD I.
3. Moments

Here Abel's generalisations of the binomial identities (Riordan, 1968; Chakraborty, 2001) and umbrals (Riordan, 1958, 1968; Chakraborty, 2001) have been exploited to derive, in compact form, the moments of the class of weighted quasi-binomial distributions in general and of QBD I and QBD II in particular.

Theorem 3. The rth order descending factorial moment of the class of WQBD is given by

μ_{(r)}(n; p; s, t; θ) = n^{(r)} B_{n−r}(p + rθ, q; s + r, t; θ) / B_n(p, q; s, t; θ),
where n^{(r)} = n(n − 1) ⋯ (n − r + 1) and μ_{(r)}(n; p; s, t; θ) denotes the rth order descending factorial moment.

Proof. The rth order descending factorial moment is

μ_{(r)}(n; p; s, t; θ) = E[X^{(r)}] = Σ_{k=0}^{n} k^{(r)} C(n, k) (p + kθ)^{k+s} (q + (n − k)θ)^{n−k+t} / B_n(p, q; s, t; θ)
= [n^{(r)} / B_n(p, q; s, t; θ)] Σ_{k=r}^{n} C(n − r, k − r) (p + kθ)^{k+s} (q + (n − k)θ)^{n−k+t}
= [n^{(r)} / B_n(p, q; s, t; θ)] Σ_{l=0}^{n−r} C(n − r, l) (p + rθ + lθ)^{l+r+s} (q + (n − r − l)θ)^{n−r−l+t}
= n^{(r)} B_{n−r}(p + rθ, q; s + r, t; θ) / B_n(p, q; s, t; θ).
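Theorem 3 can be checked numerically; the sketch below (illustrative only, with arbitrary admissible parameters) compares E[X^{(r)}] computed directly from the p.f. (7) with the closed form of the theorem:

```python
from math import comb

def B(n, p, q, s, t, th):
    # Normalising constant of Eq. (6)
    return sum(comb(n, k)*(p + k*th)**(k + s)*(q + (n - k)*th)**(n - k + t)
               for k in range(n + 1))

def falling(x, r):
    # Descending factorial x^(r) = x(x-1)...(x-r+1)
    out = 1
    for i in range(r):
        out *= x - i
    return out

n, p, th, s, t = 7, 0.15, 0.04, -1, 0
q = 1 - p - n*th
pmf = [comb(n, k)*(p + k*th)**(k + s)*(q + (n - k)*th)**(n - k + t)
       / B(n, p, q, s, t, th) for k in range(n + 1)]

checks = []
for r in range(1, 4):
    direct = sum(falling(k, r)*pmf[k] for k in range(n + 1))
    formula = falling(n, r)*B(n - r, p + r*th, q, s + r, t, th)/B(n, p, q, s, t, th)
    checks.append(abs(direct - formula))
```

The two computations agree to floating-point accuracy for r = 1, 2, 3.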
Theorem 4. The rth order moment about the origin of the class of WQBD is given by

μ′_r(n; p; s, t; θ) = Σ_{j=0}^{r} S(r, j) n^{(j)} B_{n−j}(p + jθ, q; s + j, t; θ) / B_n(p, q; s, t; θ),
where S(r, j) are the Stirling numbers of the second kind (Riordan, 1958; Johnson et al., 1992), defined as

S(i, j) = Δ^j 0^i / j!,  with S(i, j) = 0 for i < j and S(i, j) = 1 for j = 1 or i = j,

which also satisfy the recurrence S(i + 1, j) = j S(i, j) + S(i, j − 1).
Proof. The rth order moment about the origin is

μ′_r(n; p; s, t; θ) = E[X^r] = Σ_{j=0}^{r} S(r, j) μ_{(j)}(n; p; s, t; θ)
= Σ_{j=0}^{r} S(r, j) n^{(j)} B_{n−j}(p + jθ, q; s + j, t; θ) / B_n(p, q; s, t; θ).
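The conversion E[X^r] = Σ_j S(r, j) E[X^{(j)}] used in this proof holds for any distribution with finite support; a quick sketch (illustrative, with an arbitrary p.m.f.) using the recurrence above:

```python
def stirling2(i, j):
    # Stirling numbers of the second kind via S(i+1, j) = j*S(i, j) + S(i, j-1)
    if j == 0:
        return 1 if i == 0 else 0
    if j > i:
        return 0
    return j*stirling2(i - 1, j) + stirling2(i - 1, j - 1)

def falling(x, r):
    out = 1
    for i in range(r):
        out *= x - i
    return out

pmf = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}   # an arbitrary discrete p.m.f.
gaps = []
for r in range(1, 5):
    power = sum(k**r * w for k, w in pmf.items())
    via_fact = sum(stirling2(r, j)*sum(falling(k, j)*w for k, w in pmf.items())
                   for j in range(r + 1))
    gaps.append(abs(power - via_fact))
```

For instance stirling2(4, 2) returns 7, and the power moments agree with their factorial-moment expansions for r = 1, ..., 4.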
Theorem 5. The rth order moment about the mean of the class of WQBD is given by

μ_r(n; p; s, t; θ) = [1 / B_n(p, q; s, t; θ)] Σ_{j=0}^{r} C(r, j) (−1)^{r−j} (μ′_1)^{r−j} Σ_{ℓ=0}^{j} S(j, ℓ) n^{(ℓ)} B_{n−ℓ}(p + ℓθ, q; s + ℓ, t; θ).

Proof. The rth order moment about the mean is

μ_r(n; p; s, t; θ) = E[X − μ′_1]^r = Σ_{j=0}^{r} C(r, j) (−1)^{r−j} (μ′_1)^{r−j} μ′_j.

Now, on using Theorem 4, we get

μ_r(n; p; s, t; θ) = [1 / B_n(p, q; s, t; θ)] Σ_{j=0}^{r} C(r, j) (−1)^{r−j} (μ′_1)^{r−j} Σ_{ℓ=0}^{j} S(j, ℓ) n^{(ℓ)} B_{n−ℓ}(p + ℓθ, q; s + ℓ, t; θ).
3.1. Recurrence relation of moments

μ′_r(n; p; s, t; θ) = [n B_{n−1}(p + θ, q; s + 1, t; θ) / B_n(p, q; s, t; θ)] Σ_{j=0}^{r−1} C(r − 1, j) μ′_j(n − 1; p + θ; s + 1, t; θ).
Repeated application of the above relation gives

μ′_r(n; p; s, t; θ) = [n^{(2)} B_{n−2}(p + 2θ, q; s + 2, t; θ) / B_n(p, q; s, t; θ)] Σ_{j=0}^{r−1} Σ_{ℓ=0}^{j−1} C(r − 1, j) C(j − 1, ℓ) μ′_ℓ(n − 2; p + 2θ; s + 2, t; θ).

All the relations stated above for the class of WQBD are of a general nature. Using different combinations of integer values of r, s and t, the moments of the different quasi-binomial distributions belonging to the WQBD class may be obtained. It may be noted that in all the results above p + q + nθ = 1, that is, q = 1 − p − nθ.

3.2. First four central moments of some of the WQBDs

3.2.1. Moments of the QBD I
μ′_1 = np Σ_{ℓ=0}^{n−1} (n − 1)^{(ℓ)} θ^ℓ,

μ_2 = n^{(2)} p Σ_{ℓ=0}^{n−2} Σ_{w=0}^{ℓ} (n − 2)^{(ℓ)} θ^ℓ (p + 2θ + wθ) + np Σ_{ℓ=0}^{n−1} (n − 1)^{(ℓ)} θ^ℓ − n² p² [Σ_{ℓ=0}^{n−1} (n − 1)^{(ℓ)} θ^ℓ]².
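These two expressions can be confirmed against the p.f. (1) directly; a sketch (not from the paper, with arbitrary admissible parameter values):

```python
from math import comb

def falling(x, r):
    out = 1
    for i in range(r):
        out *= x - i
    return out

n, p, th = 6, 0.2, 0.05
pmf = [comb(n, k)*p*(p + k*th)**(k - 1)*(1 - p - k*th)**(n - k)
       for k in range(n + 1)]
mean = sum(k*w for k, w in enumerate(pmf))
var = sum(k*k*w for k, w in enumerate(pmf)) - mean**2

# mu'_1 and mu_2 formulas for QBD I
mu1 = n*p*sum(falling(n - 1, l)*th**l for l in range(n))
mu2 = (falling(n, 2)*p*sum(falling(n - 2, l)*th**l*(p + 2*th + w*th)
                           for l in range(n - 1) for w in range(l + 1))
       + mu1 - mu1**2)
```

Both mu1 and mu2 match the directly computed mean and variance to floating-point accuracy.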
It can be verified that the above results are equal to those of Consul (1990):

μ_3 = (2μ′_1³ − 3μ′_1² + μ′_1) + 3p(1 − μ′_1) Σ_{ℓ=0}^{n−2} Σ_{w=0}^{ℓ} n^{(ℓ+2)} θ^ℓ (p + 2θ + wθ)
+ p Σ_{ℓ=0}^{n−3} Σ_{w=0}^{ℓ+1} n^{(ℓ+3)} θ^{ℓ+1} w(ℓ − w + 1)(p + 3θ + wθ)
+ p Σ_{ℓ=0}^{n−3} Σ_{w=0}^{ℓ} Σ_{ν=0}^{w} n^{(ℓ+3)} θ^ℓ (p + 3θ + νθ)(p + 3θ + (w − ν)θ).
μ_4 = p[p^{−1} μ′_1⁴ + n(1 + θα)^{n−1}{−4μ′_1³ + 6μ′_1² − 4μ′_1 + 1}
+ n^{(2)} (1 + θα + θβ(p + 2θ))^{n−2}{6μ′_1² − 12μ′_1 + 7}
+ n^{(3)} [(1 + θα + θβ(p + 3θ; 2))^{n−3} + (1 + θα^{(2)} + θβ(p + 3θ))^{n−3}]{−4μ′_1 + 6}
+ n^{(4)} [(1 + θα + θβ(p + 4θ; 3))^{n−4} + 3(1 + θα^{(2)} + θβ(p + 4θ) + θγ(p + 4θ))^{n−4}
+ (1 + θα^{(2)} + θβ(0) + θβ(p + 4θ))^{n−4} + (1 + θα^{(3)} + θβ(p + 4θ))^{n−4}]].
3.2.2. Moments of the QBD II

μ′_1 = np / (1 − nθ),

μ_2 = [np / (1 − nθ)] Σ_{ℓ=0}^{n−2} (n − 1)^{(ℓ+1)} θ^ℓ (p + 2θ + ℓθ) + [np / (1 − nθ)] [1 − n(θ + p)] / (1 − nθ),

μ_3 = (2μ′_1³ − 3μ′_1² + μ′_1) + [3n^{(2)} p(1 − μ′_1) / (1 − nθ)] Σ_{ℓ=0}^{n−2} (n − 2)^{(ℓ)} θ^ℓ (p + 2θ + ℓθ)
+ [n^{(3)} p / (1 − nθ)] [Σ_{ℓ=0}^{n−3} Σ_{w=0}^{ℓ} (n − 3)^{(ℓ)} θ^ℓ (p + 3θ + wθ)(p + 3θ + (ℓ − w)θ)
+ Σ_{ℓ=0}^{n−3} Σ_{w=0}^{ℓ+1} (n − 3)^{(ℓ)} θ^{ℓ+1} w(ℓ − w + 1)(p + 3θ + wθ)],
μ_4 = μ′_1⁴ + [p / (1 − nθ)] [n(−4μ′_1³ + 6μ′_1² − 4μ′_1 + 1)
+ n^{(2)} (6μ′_1² − 12μ′_1 + 7)(1 + θβ(p + 2θ))^{n−2}
+ n^{(3)} (−4μ′_1 + 6){(1 + θβ(p + 3θ; 2))^{n−3} + (1 + θα + θβ(p + 3θ))^{n−3}}
+ n^{(4)} {B_n(p + 4θ, q; 3, 0; θ) + n B_{n−1}(p + 4θ, q + θ; 3, 0; θ)}].
Further expansion has been avoided, as the expression becomes too long. Here

α_k ≡ k!,   α^{(j)}_k ≡ (α + ⋯ + α [j terms])^k = C(k + j − 1, k) k!,
β_k(x) ≡ k!(x + kz),   γ_k(x) ≡ (kz) k!(x + kz),   δ_k(x) ≡ (kz)² k!(x + kz)
and β_k(x; j) = [β(x) + ⋯ + β(x) (j terms)]^k

(Riordan, 1968; Chakraborty, 2001), with z = θ in the formulas above.
4. Inverse moments

The importance of inverse (negative) moments is well known in the estimation of the parameters of a model and in testing the efficiency of various estimators. They are equally useful in life testing and in survey sampling, where ratio estimators are employed. Consul (1990) gave a detailed account of the negative moments of QBD I. In this section similar properties are studied for the class of WQBD by deriving general formulas and listing some particular cases for QBD I and QBD II.
Theorem 6. If X ∼ class of WQBD (7), then

E[X^{(r)} (n − X)^{(u)} / {(p + Xθ)^v (1 − p − Xθ)^w}] = n^{(r+u)} B_{n−r−u}(p + rθ, 1 − p − (n − u)θ; s + r − v, t + u − w; θ) / B_n(p, 1 − p − nθ; s, t; θ).

Proof.

E[X^{(r)} (n − X)^{(u)} / {(p + Xθ)^v (1 − p − Xθ)^w}]
= Σ_{k=0}^{n} k^{(r)} (n − k)^{(u)} C(n, k) (p + kθ)^{k+s−v} (1 − p − kθ)^{n−k+t−w} / B_n(p, 1 − p − nθ; s, t; θ)
= [n^{(r+u)} / B_n(p, 1 − p − nθ; s, t; θ)] Σ_{k=r}^{n−u} C(n − r − u, k − r) (p + kθ)^{k+s−v} (1 − p − kθ)^{n−k+t−w}
= [n^{(r+u)} / B_n(p, 1 − p − nθ; s, t; θ)] Σ_{l=0}^{n−r−u} C(n − r − u, l) (p + rθ + lθ)^{l+r+s−v} (1 − (p + rθ) − lθ)^{n−r−u−l+t+u−w}
= n^{(r+u)} B_{n−r−u}(p + rθ, 1 − p − (n − u)θ; s + r − v, t + u − w; θ) / B_n(p, 1 − p − nθ; s, t; θ).

Some important results on inverse moments of QBD I and QBD II, obtained from the above general formula and Abel's summation formulas (Riordan, 1968; Chakraborty, 2001), are listed below.

4.1. QBD I

E(p + Xθ)^{−1} = 1/p − nθ/(p + θ),

EX(p + Xθ)^{−1} = np/(p + θ),

EX^{(2)}(p + Xθ)^{−1} = p Σ_{j=0}^{n−2} θ^j n^{(j+2)},

E(p + Xθ)^{−2} = p^{−2} − nθ p^{−1}(p + θ)^{−1} − nθ(p + θ)^{−2} + n^{(2)} θ² (p + θ)^{−1}(p + 2θ)^{−1},¹

EX(p + Xθ)^{−2} = np(p + θ)^{−2} − n^{(2)} pθ (p + θ)^{−1}(p + 2θ)^{−1},
EX^{(2)}(p + Xθ)^{−2} = n^{(2)} p / (p + 2θ),

EX²(p + Xθ)^{−2} = np(p + θ)^{−2} + n^{(2)} p² (p + θ)^{−1}(p + 2θ)^{−1},

EX^{(3)}(p + Xθ)^{−2} = p Σ_{j=0}^{n−3} θ^j n^{(j+3)},

EX²(X − 1)(p + Xθ)^{−2} = p Σ_{j=0}^{n−3} θ^j n^{(j+3)} + 2n^{(2)} p (p + 2θ)^{−1},

E(p + Xθ)^{−3} = p^{−3} − nθ(p + θ)^{−1}[p^{−2} + p^{−1}(p + θ)^{−1} + (p + θ)^{−2}]
+ n^{(2)} θ² (p + θ)^{−1}(p + 2θ)^{−1}[p^{−1} + (p + θ)^{−1} + (p + 2θ)^{−1}]
− n^{(3)} θ³ (p + θ)^{−1}(p + 2θ)^{−1}(p + 3θ)^{−1},

EX(p + Xθ)^{−3} = np(p + θ)^{−3} − n^{(2)} pθ[(p + θ)^{−1}(p + 2θ)^{−2} + (p + θ)^{−2}(p + 2θ)^{−1}]
+ n^{(3)} pθ² (p + θ)^{−1}(p + 2θ)^{−1}(p + 3θ)^{−1},

EX^{(2)}(p + Xθ)^{−3} = n^{(2)} p (p + 2θ)^{−2} − n^{(3)} pθ (p + 2θ)^{−1}(p + 3θ)^{−1},

EX^{(3)}(p + Xθ)^{−3} = n^{(3)} p / (p + 3θ),

E(1 − p − Xθ)^{−1} = (1 − nθ) / (1 − p − nθ),

EX(1 − p − Xθ)^{−1} = np / (1 − p − nθ),

E(1 − p − Xθ)^{−2} = (1 − nθ)/(1 − p − nθ)² − nθ(1 − (n − 1)θ)/[(1 − p − nθ)(1 − p − (n − 1)θ)],¹

EX(1 − p − Xθ)^{−2} = [np/(1 − p − nθ)][1/(1 − p − nθ) − (n − 1)θ/(1 − p − (n − 1)θ)],

E(n − X)(1 − p − Xθ)^{−2} = n(1 − (n − 1)θ)/(1 − p − (n − 1)θ),

E(n − X)X(1 − p − Xθ)^{−2} = n^{(2)} p/(1 − p − (n − 1)θ),

E(n − X)X^{(2)}(1 − p − Xθ)^{−2} = [p/(1 − p − (n − 1)θ)] Σ_{ℓ=0}^{n−3} n^{(ℓ+3)} θ^ℓ (p + (ℓ + 2)θ),
E(n − X)(1 − p − Xθ)^{−3} = n(1 − (n − 1)θ)(1 − p − (n − 1)θ)^{−2} − n^{(2)} θ(1 − (n − 2)θ)/[(1 − p − (n − 1)θ)(1 − p − (n − 2)θ)],¹

E(n − X)X(1 − p − Xθ)^{−3} = [n^{(2)} p/(1 − p − (n − 1)θ)][1/(1 − p − (n − 1)θ) − (n − 2)θ/(1 − p − (n − 2)θ)],

E(n − X)X^{(2)}(1 − p − Xθ)^{−3} = [n^{(3)} p/(1 − p − (n − 1)θ)²] Σ_{ℓ=0}^{n−3} (n − 3)^{(ℓ)} θ^ℓ (p + 2θ + ℓθ)
− [n^{(4)} pθ/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ))] Σ_{ℓ=0}^{n−4} (n − 4)^{(ℓ)} θ^ℓ (p + 2θ + ℓθ).
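A numerical spot-check of two of the QBD I results above (illustrative; the parameter values are arbitrary admissible choices):

```python
from math import comb

n, p, th = 5, 0.25, 0.05
pmf = [comb(n, k)*p*(p + k*th)**(k - 1)*(1 - p - k*th)**(n - k)
       for k in range(n + 1)]

lhs1 = sum(w/(p + k*th) for k, w in enumerate(pmf))
rhs1 = 1/p - n*th/(p + th)                       # E(p + X theta)^{-1}

lhs2 = sum(k*w/(p + k*th)**2 for k, w in enumerate(pmf))
rhs2 = (n*p/(p + th)**2                          # EX(p + X theta)^{-2}
        - n*(n - 1)*p*th/((p + th)*(p + 2*th)))
```

Both closed forms agree with the direct expectations to floating-point accuracy.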
Note: For the expressions tagged with ¹, the corresponding expressions in Consul (1990) are incorrect.

4.2. QBD II

E(p + Xθ)^{−1} = 1/p − nθ(1 − (n − 1)θ)/[(1 − nθ)(p + θ)],

E[X/(p + Xθ)] = np(1 − (n − 1)θ)/[(1 − nθ)(p + θ)],

E[X^{(2)}/(p + Xθ)] = n^{(2)} p/(1 − nθ),

E(p + Xθ)^{−2} = 1/p² − nθ(2p + θ)(1 − (n − 1)θ)/[p(1 − nθ)(p + θ)²] + n^{(2)} θ²(1 − (n − 2)θ)/[(1 − nθ)(p + θ)(p + 2θ)],

E[X/(p + Xθ)²] = [np/(1 − nθ)][(1 − (n − 1)θ)/(p + θ)² − (n − 1)θ(1 − (n − 2)θ)/((p + θ)(p + 2θ))],

EX^{(2)}(p + Xθ)^{−2} = n^{(2)} p(1 − (n − 2)θ)/[(p + 2θ)(1 − nθ)],

EX²(p + Xθ)^{−2} = np(1 − (n − 1)θ)/[(1 − nθ)(p + θ)²] + n^{(2)} p²(1 − (n − 2)θ)/[(p + 2θ)(1 − nθ)(p + θ)],

EX^{(3)}(p + Xθ)^{−2} = n^{(3)} p/(1 − nθ),
EX²(X − 1)(p + Xθ)^{−2} = [n^{(2)} p/(1 − nθ)][(n − 2) + 2(1 − (n − 2)θ)/(p + 2θ)],

EX(p + Xθ)^{−3} = [np/(1 − nθ)][(1 − (n − 1)θ)/(p + θ)³ − (n − 1)θ(2p + 3θ)(1 − (n − 2)θ)/((p + θ)²(p + 2θ)²)
+ (n − 1)^{(2)} θ²(1 − (n − 3)θ)/((p + θ)(p + 2θ)(p + 3θ))],

EX^{(2)}(p + Xθ)^{−3} = [p/(1 − nθ)][n^{(2)}(1 − (n − 2)θ)/(p + 2θ)² − n^{(3)} θ(1 − (n − 3)θ)/((p + 2θ)(p + 3θ))],

EX^{(3)}(p + Xθ)^{−3} = n^{(3)} p(1 − (n − 3)θ)/[(1 − nθ)(p + 3θ)],

E(1 − p − Xθ)^{−1} = 1/(1 − p − nθ) − nθ(1 − (n − 1)θ)/[(1 − nθ)(1 − p − (n − 1)θ)],

EX(1 − p − Xθ)^{−1} = [np/(1 − nθ)][1/(1 − p − nθ) − (n − 1)θ/(1 − p − (n − 1)θ)],

E(1 − p − Xθ)^{−2} = 1/(1 − p − nθ)² − nθ(2 − 2p − (2n − 1)θ)(1 − (n − 1)θ)/[(1 − nθ)(1 − p − nθ)(1 − p − (n − 1)θ)²]
+ n^{(2)} θ²(1 − (n − 2)θ)/[(1 − nθ)(1 − p − (n − 1)θ)(1 − p − (n − 2)θ)],

EX(1 − p − Xθ)^{−2} = [np/(1 − nθ)][1/(1 − p − nθ)² − (n − 1)θ(2 − 2p − (2n − 1)θ)/((1 − p − nθ)(1 − p − (n − 1)θ)²)
+ (n − 1)^{(2)} θ²/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ))],

E(n − X)(1 − p − Xθ)^{−2} = [n(1 − p − nθ)/(1 − nθ)][(1 − (n − 1)θ)/(1 − p − (n − 1)θ)²
− (n − 1)θ(1 − (n − 2)θ)/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ))],
E(n − X)X(1 − p − Xθ)^{−2} = [n^{(2)} p(1 − p − nθ)/(1 − nθ)][1/(1 − p − (n − 1)θ)²
− (n − 2)θ/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ))],

E(n − X)X^{(2)}(1 − p − Xθ)^{−2} = [n^{(3)} p(1 − p − nθ)/(1 − nθ)][Σ_{ℓ=0}^{n−3} (n − 3)^{(ℓ)} θ^ℓ (p + 2θ + ℓθ)/(1 − p − (n − 1)θ)²
− (n − 3)θ Σ_{ℓ=0}^{n−4} (n − 4)^{(ℓ)} θ^ℓ (p + 2θ + ℓθ)/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ))],

E(n − X)(1 − p − Xθ)^{−3} = [n(1 − p − nθ)/(1 − nθ)][(1 − (n − 1)θ)/(1 − p − (n − 1)θ)³
− (n − 1)θ(2 − 2p − (2n − 3)θ)(1 − (n − 2)θ)/((1 − p − (n − 1)θ)²(1 − p − (n − 2)θ)²)
+ (n − 1)^{(2)} θ²(1 − (n − 3)θ)/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ)(1 − p − (n − 3)θ))],

E(n − X)X(1 − p − Xθ)^{−3} = [n^{(2)} p(1 − p − nθ)/(1 − nθ)][1/(1 − p − (n − 1)θ)³
− (n − 2)θ(2 − 2p − (2n − 3)θ)/((1 − p − (n − 1)θ)²(1 − p − (n − 2)θ)²)
+ (n − 2)^{(2)} θ²/((1 − p − (n − 1)θ)(1 − p − (n − 2)θ)(1 − p − (n − 3)θ))].
4.3. Inverse factorial moments

Theorem 7. If X ∼ class of WQBD (7), then

E[1/(X + 1)^{[r]}] = E[1/(X + r)^{(r)}]
= B_{n+r}(p − rθ, 1 − p − nθ; s − r, t; θ) / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)]
− Σ_{l=0}^{r−1} C(n + r, l)(p − rθ + lθ)^{l+s−r}(1 − p + (r − l)θ)^{n+r−l+t} / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)],

where X^{[r]} = X(X + 1) ⋯ (X + r − 1).
Proof.

E[1/(X + 1)^{[r]}] = E[1/(X + r)^{(r)}] = Σ_{k=0}^{n} [1/(k + 1)^{[r]}] C(n, k)(p + kθ)^{k+s}(1 − p − kθ)^{n−k+t} / B_n(p, 1 − p − nθ; s, t; θ)
= Σ_{k=0}^{n} C(n + r, k + r)(p − rθ + (k + r)θ)^{(k+r)+(s−r)}(1 − (p − rθ) − (k + r)θ)^{(n+r)−(k+r)+t} / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)]
= [Σ_{l=0}^{n+r} − Σ_{l=0}^{r−1}] C(n + r, l)(p − rθ + lθ)^{l+s−r}(1 − (p − rθ) − lθ)^{n+r−l+t} / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)]
= B_{n+r}(p − rθ, 1 − p − nθ; s − r, t; θ) / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)]
− Σ_{l=0}^{r−1} C(n + r, l)(p − rθ + lθ)^{l+s−r}(1 − p + (r − l)θ)^{n+r−l+t} / [(n + 1)^{[r]} B_n(p, 1 − p − nθ; s, t; θ)].
Putting θ = 0, we get for the binomial distribution (Johnson et al., 1992, p. 109) with parameters n, p

E[1/(X + r)^{(r)}] = [1 − Σ_{l=0}^{r−1} C(n + r, l) p^l (1 − p)^{n+r−l}] / [(n + r)^{(r)} p^r].
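The binomial special case is easy to confirm numerically; a sketch for r = 2 (illustrative, with arbitrary n and p):

```python
from math import comb

n, p, r = 8, 0.3, 2
pmf = [comb(n, k)*p**k*(1 - p)**(n - k) for k in range(n + 1)]

# E[1/(X+r)^{(r)}] with r = 2, i.e. E[1/((X+1)(X+2))]
lhs = sum(w/((k + 1)*(k + 2)) for k, w in enumerate(pmf))
tail = sum(comb(n + r, l)*p**l*(1 - p)**(n + r - l) for l in range(r))
rhs = (1 - tail)/((n + r)*(n + r - 1)*p**r)
```

The left- and right-hand sides agree to floating-point accuracy.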
In particular, the following results can be derived.

4.3.1. QBD I

E[1/(X + 1)] = [p − (n + 1)θ(p − θ) − p(1 − p + θ)^{n+1}] / [(n + 1)(p − θ)²],

E[1/((X + 1)(X + 2))] = [(p − θ)^{−2}(p − 2θ)^{−3} / ((n + 1)(n + 2))] [p(p − θ)² − (n + 2)pθ(p − 2θ)(2p − 3θ)
+ (n + 1)(n + 2)θ²(p − θ)(p − 2θ)² − p(p − θ)²(1 − p + 2θ)^{n+2} − (n + 2)p(p − 2θ)³(1 − p + θ)^{n+1}].
4.3.2. QBD II

E[1/(X + 1)] = [p(1 − (n + 1)θ) − (n + 1)θ(1 − nθ)(p − θ) − p(1 − p − nθ)(1 − p + θ)^n] / [(n + 1)(1 − nθ)(p − θ)²],

E[1/((X + 1)(X + 2))] = [1/((n + 1)(n + 2)(1 − nθ)(p − θ)²(p − 2θ)³)] {p(p − θ)²(1 − (n + 2)θ)
− (n + 2)pθ(2p − 3θ)(1 − (n + 1)θ)(p − 2θ) + (n + 1)(n + 2)θ²(1 − nθ)(p − θ)(p − 2θ)²
− p(1 − p − nθ)(p − θ)²(1 − p + 2θ)^{n+1} − (n + 2)p(1 − p − nθ)(p − 2θ)³(1 − p + θ)^n}.
5. Mode of the class of WQBDs Consul (1990) obtained bounds for the mode of QBD I. Here an attempt has been made to find similar results for the class of WQBDs. Theorem 8. Denoting the mode by m, it is observed that m lies between l and u where l is the real positive root of the equation
2 m3 − [2n − 1 + 2(t − 1)]m2 2 + [1 − p(1 + (s + 1) − 2 − 2n + n + (t − 1)) − 2 + (s + 1)(1 − n − (t − 1)) − 2n(1 − n − (t − 1))]m + (1 − p)2 + n(p + (s + 1))(1 − p − n − (t − 1)) > 0 and u=
(n + 1)(p + (s − 1)) . 1 − (n − (t + s) + 2)
Proof. Similar to the one provided for QBD I in Consul (1990). It is seen that for s = −1, t = 0 the above result reduces to the bound given by Consul (1990). (We have noticed some printing errors, in the Consul (1990) result. Like in Eq. (4.2) the last expression should be (1 − p − m + ) not (1 − p − m − ) and in the line before the line preceding Eq. (4.2) one more expression (1 − p − M ) should be multiplied with the expressions in the right-hand side of the inequality.) Similar bounds for other distributions belonging to the WQBD class can be easily derived using the above result.
174
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
6. Estimation and data fitting Consul (1990) discussed ML estimation of the parameters of the QBD I for raw as well as for grouped data sets and suggested starting values for solving the ML equations numerically. He also provide exact solutions when the number of classes are small (two, three and four). Here the problem of estimation of the parameters of QBD I and QBD II using different methods of estimation have been discussed. It is assumed that observed frequency in a random sample of size N are nk , k = 0(1)m for different classes, i.e., m k=0 nk = N , where m is of course the largest value observed. Here, the parameter n is estimated by m, x¯ is the sample mean, f0 frequency of zeros and f1 frequency ones. Since the analytical solution of the ML equations were not easy, they are solved numerically using Newton–Rapson method. The second ordered partial derivatives needed for implementing the method have been provided. In solving ML equations numerically by successive approximation, estimates of p and obtained by other methods may be taken as the starting values for p and . 6.1. QBD I The p.f. of QBD I with parameters p, is given by n p (p + k )k−1 (1 − p − k )n−k . pk = k I. By proportion of zeros and ones 1/n f0 pˆ = 1 − , N f1 1/(n−1) ˆ . = 1 − pˆ − npN ˆ II. By proportion of zeros and sample mean Once the estimate of p is obtained, the estimate of can be obtained by numerically solving the equation np
n−1
(n − 1)(i) i − x¯ = 0.
(10)
i−0
Eq. (10) can be solved by standard techniques like the Newton–Rapson. Initial value may be 0 or the estimate of as given by method of proportion of zeros and ones. III. ML method The log-likelihood function is given by l = log L ∝ N log p + +
n k=0
n
nk (k − 1) log(p + k )
k=0
nk (n − k) log(1 − p − k ).
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
175
The two likelihood equations obtained by partially differentiating l w.r.t. p and are n
n
k=0
k=0
N nk (k − 1) nk (n − k) jl − = 0, = + jp p p + k 1 − p − k ⇒ h = 0, say. n n jl nk (k − 1)k nk (n − k)k = 0, = − j p + k 1 − p − k k=0
⇒ g = 0,
k=0
say.
The partial derivatives of h and g w.r.t. p and are n
n
nk (k − 1) nk (n − k) jh N − , =− 2 − p jp (p + k )2 k=0 (1 − p − k )2 k=0 n
n
nk (k − 1)k nk (n − k)k jg − , =− jp (p + k )2 (1 − p − k )2 k=0 k=0 n
n
nk (k − 1)k 2 nk (n − k)k 2 jg =− − . j (p + k )2 (1 − p − k )2 k=0 k=0 It may be noted that
jg jh = . j jp [It may be noted that in Section 7 of Consul (1990) in p. 496 the expressions for the second order partial derivatives of likelihood function given in Eqs. (7.7) and (7.8) were incorrect. Also the initial value formula in Eq. (7.14) in p. 497 may lead to complex values for the initial value for .] 6.2. QBD II The p.f. of QBD II is given by p(1 − p − n) n pk = (p + k )k−1 (1 − p − k )n−k−1 , k 1 − n
1 − n = 0.
I. By proportion of zeros and ones First, the estimate of the parameter p is obtained by numerically solving the following equation 1/(n−2) (1 − p)n − p0 (1 − p)n − p0 np 1 − p − 1−p− n[(1 − p)n−1 − p0 ] (1 − p)n−1 − p0 1/(n−2) (1 − p)n − p0 − p 1− =0 (11) (1 − p)n−1 − p0
176
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
Using the Newton–Rapson method, one can get a root of (11). Then the estimate of the parameter is obtained from the equation n =
(1 − p)n − p0
(12)
(1 − p)n−1 − p0
II. Sample proportion of zeros and the mean The estimate of p is obtained by numerically solving the following equation: p(1 − p)n−1 (n − x) ¯ − np 0 p = 0, then substituting the value of p in (12), can be estimated. III. ML method The log-likelihood function is given by n p(1 − p − n) + nk (k − 1) log(p + k ) l = log L ∝ N log 1 − n +
n
k=0
nk (n − k − 1) log(1 − p − k ).
k=0
The two likelihood equations obtained by partially differentiating l w.r.t. p and are n
n
k=0
k=0
nk (k − 1) nk (n − k − 1) N jl N = + − =0 − jp p 1 − p − n p + k 1 − p − k ⇒ g = 0,
say, n
n
k=0
k=0
nk (k − 1)k nk (n − k − 1)k jl nNp =0 − + = − 1 − p − k (1 − p − n)(1 − n) p + k j ⇒ h = 0,
say.
The partial derivatives of g and h w.r.t. p and are n
n
nk (k − 1) nk (n − k − 1) N jg N =− 2 − − − , jp p (1 − p − n)2 k=0 (p + k )2 k=0 (1 − p − k )2 n
n
nk (k − 1)k nk (n − k − 1)k jg nN =− − − , 2 j (1 − p − n)2 k=0 (p + k )2 (1 − p − k ) k=0 n
n
nk (k − 1)k 2 nk (n − k − 1)k 2 jh n2 Np(2 − p − 2n) = − − . 2 2 j (1 − p − n) (1 − np) (p + k )2 (1 − p − k )2 k=0 k=0 It may be noted that
jh jg = . jp j
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
177
Table 1 Observed and expected frequencies of European Corn borer in 1296 Corn plants No. of borers per plant
Observed no. of plants
QBD I
QBD II
0 1 2 3
907 275 88 23 3
906.41 277.40 85.90 22.59 3.70
906.45 277.26 86.01 22.60 3.68
4 pˆ ˆ
2 d.f.
0.0855 0.0591 0.0758 1
0.0797 0.0557 0.0677 1
6.3. Some examples of data fitting
Example 1. Here, McGuire, Brindley and Bancroft’s data on the European Corn borer, used by Shumway and Gurland (1960), Crow and Bardwell (1965) and Consul (1990) are considered. See Table 1. As measured by 2, QBD II gives better fit. Example 2. Classical data derived from haemacytometer yeast cell counts observed by ‘Student’ in 400 squares (Crow and Bardwell, 1965; Consul, 1990), see also Hand et al. (1994). See Table 2. Here both the models give almost equally good fit, but QBD II is better. It may be noted that for this set of data, Neyman Type A model provides better fitting than the QBDs (see Consul, 1990, p. 500). Example 3. Taken from Ord et al. (1979), based on field data on D. bimaculatus by time of the day. See Table 3. While both the models are equally good it can be seen that only the QBD II preserves the symmetry of the original data. It should be noted here that QBD II is a symmetric distribution when = (1 − 2p)/n. Example 4. This data about the incidence of flying bombs in an area in south London during World War II is taken from Feller (1968) used by Clarke (1946). See Table 4. Clearly from the values of z and expected frequencies, both the models are equally good.
178
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
Table 2 Distribution of yeast cells per square in a haemacytometer No. of cells per square
Observed no. of squares
QBD I
QBD II
0 1 2 3 4 5
213 128 37 18 3 1
215.73 118.28 47.23 14.89 3.43 0.44
215.72 118.29 47.25 14.89 3.42 0.44
pˆ ˆ
0.1162 0.0391 3.6087 1
2 d.f.
0.1110 0.0376 3.6082 1
Table 3 Distribution of number of seeds by time of day Time
Observed no. seeds
QBD I
QBD II
0 1 2 3 4 5
7 4 5 5 4 7
6.50 5.38 4.55 4.18 4.40 6.99
6.77 4.95 4.28 4.28 4.95 6.77
0.2729 0.1346 0.4575 2
0.1934 0.1226 0.6225 2
pˆ ˆ
2 d.f.
Table 4 Distribution of number of hits per square No. of hits
No. of 1/4 km squares
QBD I
QBD II
0 1 2 3 4 5
229 211 93 35 7 1
231.35 203.60 100.24 33.01 7.04 0.76
231.35 203.59 100.26 33.01 7.04 0.76
pˆ ˆ
2 d.f.
0.1668 0.0263 0.9409 2
0.1619 0.0257 0.9344 2
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
179
7. Limiting distribution Theorem 9. As n → ∞ and p, → 0 such that np = , n = the class of WQBD (7) tends to the class of weighted GPD (Chakraborty, 2001) class with parameters (; s; ). Tables 1–4. Proof. Since lim ns
n→∞
n k
(p + k )k+s (1 − p − k )n−k+t =
1 ( + k )k+s e−(+k ) , k!
(13)
hence, lim ns Bn (p, 1 − p − n; s, t; ) → e− K(; s; ),
n→∞
where K(a; s; z) = i 0 (1/ i!)e−iz (a + iz)i+s (Nandi et al., 1999; Chakraborty, 2001). Therefore, n k+s (1 − p − k )n−k+t ( + k )k+s e−k k (p + k ) → . Bn (p, 1 − p − n; s, t; ) k!K(; s; ) In particular it can be easily seen that both QBD I and QBD II tend to GPD (Consul and Jain 1973a, b).
Appendix A. Some results related to a class of Abel’s generalisations of binomial identities (Chakraborty, 2001; Riordan, 1968) which are used repeatedly in deriving many results of the paper have been presented here. The general form of the class of the Able sums is given by Bn (a1 , a2 ; s, t; z) =
n n (a1 + kz)k+s (a2 + (n − k)z)n−k+t . k k=0
The parameter z will not be disposed of as it is important in the context of the present work. For z = 1, Bn (a1 , a2 ; s, t; z) reduces to An (a1 , a2 ; s, t) of Riordan (1968). Bn (a1 , a2 ; −4, 0; z) −1 −1 −2 −1 = a1−4 (a1 + a2 + nz)n − nza −1 1 (a + z) [a1 + a1 (a1 + z)
+ (a1 + z)−2 ](a1 + a2 + nz)n−1 + n(2) z2 a1−1 (a1 + z)−1 (a1 + 2z)−1 [a1−1 + (a1 + 2z)−2 + (a1 + z)−1 (a1 + 2z)−1 ](a1 + a2 + nz)n−2
+ n(3) z3 a1−1 (a1 + z)−1 (a1 + 2z)−1 (a1 + 3z)−1 (a1 + a2 + nz)n−3 ,
180
S. Chakraborty, K.K. Das / Journal of Statistical Planning and Inference 136 (2006) 159 – 182
Bn (a1 , a2 ; −3, 0; z) = a1−3 (a1 + z)−2 (a1 + 2z)−1 [(a1 + z)2 (a1 + 2z)(a1 + a2 + nz)n
− nza 1 (a1 + 2z)(2a1 + z)(a1 + a2 + nz)n−1 + n(n − 1)a12 z2 (a1 + z) × (a1 + a2 + nz)n−2 ],
Bn (a1 , a2 ; −2, 0; z) = a1−2 (a1 + z)−1 [(a1 + z)(a1 + a2 + nz)n − nza 1 (a1 + a2 + nz)n−1 ],
Bn (a1 , a2 ; −1, 0; z) = a1−1 (a1 + a2 + nz)n , Bn (a1 , a2 ; 0, 0; z) = (a1 + a2 + nz + z)n , Bn (a1 , a2 ; 1, 0; z) = (a1 + a2 + nz + z + z (a1 ))n , Bn (a1 , a2 ; 2, 0; z) = (a1 + a2 + nz + z + z (a1 ; 2))n + (a1 + a2 + nz + z(2) + z (a1 ))n , Bn (a1 , a2 ; 3, 0; z) = (a1 + a2 + nz + z + z (a1 ; 3))n
+ 3(a1 + a2 + nz + z(2) + z (a1 ) + z (a1 ))n + (a1 + a2 + nz + z(2) + z (0) + z (a1 ))n + (a1 + a2 + nz + z(3) + z (a1 ))n ,
B_n(a_1, a_2; -3, -1; z) = \frac{a_1+a_2}{a_1^{3} a_2}(a_1+a_2+nz)^{n-1}
  - \frac{nz(2a_1+z)(a_1+a_2+z)}{a_1^{2}(a_1+z)^{2} a_2}(a_1+a_2+nz)^{n-2}
  + \frac{n(n-1) z^{2} (a_1+a_2+2z)}{a_1(a_1+z)(a_1+2z) a_2}(a_1+a_2+nz)^{n-3},

B_n(a_1, a_2; -2, -1; z) = \frac{a_1+a_2}{a_1^{2} a_2}(a_1+a_2+nz)^{n-1} - \frac{nz(a_1+a_2+z)}{a_1(a_1+z) a_2}(a_1+a_2+nz)^{n-2},

B_n(a_1, a_2; -1, -1; z) = \frac{a_1+a_2}{a_1 a_2}(a_1+a_2+nz)^{n-1},

B_n(a_1, a_2; -2, 1; z) = \frac{1}{a_1^{2}}(a_1+a_2+nz+z\beta(a_2))^{n} - \frac{nz}{a_1(a_1+z)}(a_1+a_2+nz+z\beta(a_2))^{n-1},

B_n(a_1, a_2; -1, 1; z) = a_1^{-1}(a_1+a_2+nz+z\beta(a_2))^{n},

B_n(a_1, a_2; 1, 1; z) = (a_1+a_2+nz+z\alpha+z\beta(a_1)+z\beta(a_2))^{n},

B_n(a_1, a_2; -1, 2; z) = a_1^{-1}[(a_1+a_2+nz+z\beta(a_2; 2))^{n} + (a_1+a_2+nz+z\alpha+z\beta(a_2))^{n}],
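The rational closed forms with t = −1 can again be checked exactly against the defining sum. A minimal sketch (helper name and parameter values are ours) for the cases (s, t) = (−1, −1) and (−2, −1):

```python
from fractions import Fraction as F
from math import comb

def B(n, a1, a2, s, t, z):
    """B_n(a1, a2; s, t; z) by direct summation in exact rational arithmetic."""
    return sum(comb(n, k) * (a1 + k * z) ** (k + s) * (a2 + (n - k) * z) ** (n - k + t)
               for k in range(n + 1))

n, a1, a2, z = 6, F(2), F(3), F(1, 2)
A = a1 + a2 + n * z

# B_n(a1, a2; -1, -1; z) = (a1 + a2)/(a1 a2) * A^(n-1)
assert B(n, a1, a2, -1, -1, z) == (a1 + a2) / (a1 * a2) * A ** (n - 1)

# B_n(a1, a2; -2, -1; z): two-term closed form
assert B(n, a1, a2, -2, -1, z) == ((a1 + a2) / (a1 ** 2 * a2) * A ** (n - 1)
                                   - n * z * (a1 + a2 + z) / (a1 * (a1 + z) * a2) * A ** (n - 2))
print("closed forms for t = -1 verified")
```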
B_n(a_1, a_2; 1, 2; z) = (a_1+a_2+nz+z\alpha+z\beta(a_1)+z\beta(a_2; 2))^{n},

B_n(a_1, a_2; 2, 2; z) = (a_1+a_2+nz+z\alpha+z\beta(a_1; 2)+z\beta(a_2; 2))^{n}
  + (a_1+a_2+nz+z\alpha(2)+z\beta(a_1)+z\beta(a_2; 2))^{n}
  + (a_1+a_2+nz+z\alpha(2)+z\beta(a_1; 2)+z\beta(a_2))^{n}
  + (a_1+a_2+nz+z\alpha(3)+z\beta(a_1)+z\beta(a_2))^{n},

B_n(a_1, a_2; -2, -2; z) = \frac{a_1+a_2}{a_1^{2} a_2^{2}}(a_1+a_2+nz)^{n-1}
  - \frac{nz(a_1+a_2+z)\{a_2(a_1+z)+a_1(a_2+z)\}}{a_1^{2} a_2^{2} (a_1+z)(a_2+z)}(a_1+a_2+nz)^{n-2}
  + \frac{n^{(2)} z^{2} (a_1+a_2+2z)}{a_1 a_2 (a_1+z)(a_2+z)}(a_1+a_2+nz)^{n-3},
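The three-term closed form for (s, t) = (−2, −2) can also be confirmed exactly; a minimal sketch (helper name and parameter values are ours):

```python
from fractions import Fraction as F
from math import comb

def B(n, a1, a2, s, t, z):
    """B_n(a1, a2; s, t; z) by direct summation in exact rational arithmetic."""
    return sum(comb(n, k) * (a1 + k * z) ** (k + s) * (a2 + (n - k) * z) ** (n - k + t)
               for k in range(n + 1))

n, a1, a2, z = 6, F(2), F(3), F(1, 2)
A = a1 + a2 + n * z
closed = ((a1 + a2) / (a1 ** 2 * a2 ** 2) * A ** (n - 1)
          - n * z * (a1 + a2 + z) * (a2 * (a1 + z) + a1 * (a2 + z))
            / (a1 ** 2 * a2 ** 2 * (a1 + z) * (a2 + z)) * A ** (n - 2)
          + n * (n - 1) * z ** 2 * (a1 + a2 + 2 * z)
            / (a1 * a2 * (a1 + z) * (a2 + z)) * A ** (n - 3))
assert B(n, a1, a2, -2, -2, z) == closed
print("B_n(a1, a2; -2, -2; z) closed form verified")
```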
where, umbrally,

\alpha^{k} \equiv \alpha_k = k!,

\alpha^{k}(j) \equiv \alpha_k(j) = (\underbrace{\alpha + \cdots + \alpha}_{j \text{ terms}})^{k} = \binom{k+j-1}{k} k!,

\beta^{k}(x) \equiv \beta_k(x) = k!(x+kz),

\gamma^{k}(x) \equiv \gamma_k(x) = (kz)\, k!(x+kz),

\delta^{k}(x) \equiv \delta_k(x) = (kz)^{2}\, k!(x+kz),

and

\beta^{k}(x; j) \equiv \beta_k(x; j) = [\underbrace{\beta(x) + \cdots + \beta(x)}_{j \text{ terms}}]^{k},
Here z is, of course, the z in B_n(a_1, a_2; s, t; z). Putting z = 1, all the corresponding results for A_n(a_1, a_2; s, t) tabulated on p. 23 of Riordan (1968) can be obtained. It may be noted that B_n(a_1, a_2; s, t; z) = B_n(a_2, a_1; t, s; z).

References

Berg, S., 1985. Generating discrete distributions from modified Charlier type B expansions. Contributions to Probability and Statistics in Honour of Gunnar Blom, Lund, pp. 39–48.
Berg, S., Mutafchiev, L., 1990. Random mapping with an attracting center: Lagrangian distributions and a regression function. J. Appl. Probab. 27, 622–636.
Berg, S., Nowicki, K., 1991. Statistical inference for a class of modified power series distributions with applications to random mapping theory. J. Statist. Plann. Inference 28, 247–261.
Chakraborty, S., 2001. Some aspects of discrete probability distributions. Ph.D. Thesis, Tezpur University, Tezpur, Assam, India.
Chakraborty, S., Das, K., 2000. An unified discrete probability model. J. Assam Science Society 41 (1), 15–25.
Charalambides, C., 1990. Abel series distributions with applications to fluctuation of sample functions of stochastic processes. Comm. Statist. Theory Methods 19, 317–335.
Clarke, R., 1946. An application of the Poisson distribution. J. Inst. Actuaries 72, 48.
Consul, P., 1974. A simple urn model dependent upon predetermined strategy. Sankhyā 36 (Ser. B, Part 4), 391–399.
Consul, P., 1990. On some properties and applications of quasi-binomial distribution. Comm. Statist. Theory Methods 19 (2), 477–504.
Consul, P., Jain, G., 1973a. A generalization of the Poisson distribution. Technometrics 15, 791–799.
Consul, P., Jain, G., 1973b. On some interesting properties of the generalized Poisson distribution. Biometrische Z. 15, 495–500.
Consul, P., Mittal, S., 1975. A new urn model with predetermined strategy. Biometrische Z. 17, 67–75.
Crow, E., Bardwell, G., 1965. Estimation of the parameters of the hyper-Poisson distributions. In: Patil, G.P. (Ed.), Classical and Contagious Discrete Distributions. Statistical Publishing Society, Calcutta; Pergamon Press, Oxford, pp. 127–140.
Das, K., 1993. Some aspects of a class of quasi-binomial distributions. Assam Statist. Rev. 7 (1), 33–40.
Das, K., 1994. Some aspects of discrete distributions. Ph.D. Thesis, Gauhati University, Guwahati 781014, Assam, India.
Feller, W., 1968. An Introduction to Probability Theory and its Applications, vol. 1, third ed. Wiley, New York.
Gupta, R., 1975. Some characterisations of discrete distributions by properties of their moment distributions. Comm. Statist. 4, 761–765.
Hand, D., Daly, F., Lunn, A., McConway, K., Ostrowski, E., 1994. A Handbook of Small Data Sets. Chapman & Hall, London, UK.
Janardan, K., 1975. Markov–Polya urn model with predetermined strategies—I. Gujarat Statist. Rev. 2 (1), 17–32.
Jaworski, J., 1984. On random mapping (t, pj). J. Appl. Probab. 21, 186–191.
Johnson, N., Kotz, S., Kemp, A., 1992. Univariate Discrete Distributions, second ed. Wiley, New York.
Mishra, A., Tiwary, D., Singh, S., 1992. A class of quasi-binomial distributions. Sankhyā, Ser. B 54 (1), 67–76.
Mishra, A., Singh, A., 2000. Moments of the quasi-binomial distribution. Assam Statist. Rev. 13 (1), 13–20.
Nandi, S., Nath, D., Das, K., 1999. A class of generalized Poisson distribution. Statistica LIX (4), 487–498.
Ord, J., Patil, G., Taillie, C., 1979. Statistical Distributions in Ecology. International Publishing House, USA.
Riordan, J., 1958. An Introduction to Combinatorial Analysis. Wiley, New York.
Riordan, J., 1968. Combinatorial Identities. Wiley, New York.
Shumway, R., Gurland, J., 1960. A fitting procedure for some generalized Poisson distributions. Skandinavisk Aktuarietidskrift 43, 87–108.