Discrete distributions in the extended FGM family: The p.g.f. approach


Journal of Statistical Planning and Inference 139 (2009) 3891-3899
journal homepage: www.elsevier.com/locate/jspi

Violetta E. Piperigou
Department of Mathematics, University of Patras, 26500 Rio, Greece

ARTICLE INFO

Available online 22 May 2009
MSC: 60E05; 60E10; 62H20; 62H05; 62E15; 62H12

ABSTRACT

In this article the probability generating functions of the extended Farlie–Gumbel–Morgenstern family for discrete distributions are derived. Using the probability generating function approach, various properties are examined and expressions for the probabilities, the moments, and the form of the conditional distributions are obtained. Bivariate versions of the geometric and Poisson distributions are used as illustrative examples. Their covariance structure and the estimation of parameters for a data set are briefly discussed. A new copula is also introduced. © 2009 Elsevier B.V. All rights reserved.

Keywords: Discrete data fit; Discrete Farlie–Gumbel–Morgenstern family; Geometric distribution; Marginal-conditional distribution; Multivariate distributions; Negative correlation; Poisson distribution; Probability generating function

1. Introduction

Discrete random variables (rvs) taking non-negative integer values have received considerable attention in the literature in an effort to explain phenomena in various areas of application. For an extensive account of bivariate and multivariate distributions one can refer to the books by Kocherlakota and Kocherlakota (1992) and Johnson et al. (1997). Models of bivariate (or multivariate) discrete distributions have been constructed by methods of convolution, random summation and mixing of distributions. The techniques commonly used for the study of various features are related to the structure of the models, and the probability generating function (pgf) facilitates the derivation of various properties. Hence, (recurrence) expressions for the joint probabilities, different types of moments, and the form of the conditional distributions are obtained. However, many of these models, see Section 3.1, have restricted correlation structures imposed by the way they are constructed. Some negatively correlated bivariate Poisson distributions are discussed in Griffiths et al. (1979), and a model with Poisson marginals allowing negative correlation has been developed by Lakshminarayana et al. (1999). Nelsen (1987) constructs probability functions for dependent discrete random variables with any possible correlation value using convex linear combinations of the probability functions for the Fréchet boundary distributions. In this article, we use standard techniques in the study of bivariate discrete distributions to obtain results on the discrete distributions that belong to the extended Farlie–Gumbel–Morgenstern (FGM) family. In Section 3 the pgf of the FGM family for discrete distributions is obtained and the distribution with geometric marginals is discussed as an alternative to a well-known


bivariate geometric distribution. In Section 4 the pgf of the extended FGM family, in a particular case, is derived and the properties of this family of discrete distributions are given in detail. A distribution with Poisson marginals is used to fit some biological data. A new copula is also constructed in an effort to bypass difficulties in determining the possible values of a parameter.

2. The probability generating function approach

In this section we give the basic definitions and prove the necessary relations that we will use in the following sections to obtain the probability generating function of the FGM family. Let us consider a non-negative univariate discrete random variable X with probability mass function (pmf) and cumulative distribution function (cdf) given, respectively, by

P(x) = Pr[X = x]   and   F(x) = Pr[X ≤ x]   for x = 0, 1, ....

The survival function (sf) of the rv X is defined as the probability S(x) = Pr[X > x]. The generating function of the pmf (pgf) and the generating function of the cdf (dgf) of the random variable X are defined as the series

Π(u) = Σ_{x=0}^{∞} P(x) u^x   and   D(u) = Σ_{x=0}^{∞} F(x) u^x,

which converge at least for −1 ≤ u ≤ 1 and at least in the open interval −1 < u < 1, respectively. Feller (1968, p. 265) gives a relation between the generating function of the survival function and the pgf of the rv X:

Σ_{x=0}^{∞} S(x) u^x = [1 − Π(u)]/(1 − u)   for −1 < u < 1.

Since

Σ_{x=0}^{∞} S(x) u^x = Σ_{x=0}^{∞} u^x − Σ_{x=0}^{∞} F(x) u^x = 1/(1 − u) − D(u),

the following corollary is derived, which establishes a one-to-one correspondence between the pgf and the dgf of a univariate discrete rv X.

Corollary 1. For −1 < u < 1,

D(u) = Π(u)/(1 − u).   (2.1)
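As a small numerical illustration (added here, not part of the original text), relation (2.1) can be checked by comparing truncated series for a Geometric(p) distribution, for which Π(u) = (1 − p)/(1 − pu):

```python
# Numerical sanity check of Corollary 1: D(u) = Pi(u) / (1 - u)
# for a Geometric(p) distribution with pmf P(x) = (1 - p) * p**x.
# The truncation at N terms is an approximation, accurate for |u| < 1.

def geometric_pmf(x, p):
    return (1.0 - p) * p**x

def geometric_cdf(x, p):
    return 1.0 - p**(x + 1)

def pgf(u, p, N=2000):
    return sum(geometric_pmf(x, p) * u**x for x in range(N))

def dgf(u, p, N=2000):
    return sum(geometric_cdf(x, p) * u**x for x in range(N))

p, u = 0.4, 0.7
lhs = dgf(u, p)
rhs = pgf(u, p) / (1.0 - u)
closed_form = (1.0 - p) / ((1.0 - p * u) * (1.0 - u))  # Pi(u)/(1-u) in closed form
print(lhs, rhs, closed_form)   # the three values agree up to rounding error
```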

Consider now a bivariate discrete rv X = (X_1, X_2) with joint pmf and joint cdf given, respectively, by

P_12(x_1, x_2) = Pr[X_1 = x_1, X_2 = x_2]   and   F_12(x_1, x_2) = Pr[X_1 ≤ x_1, X_2 ≤ x_2],

for x_i = 0, 1, ..., and i = 1, 2. The pgf of the rv X is defined as

Π_12(u_1, u_2) = Σ_{x_1=0}^{∞} Σ_{x_2=0}^{∞} P_12(x_1, x_2) u_1^{x_1} u_2^{x_2},

which converges absolutely for at least −1 ≤ u_1, u_2 ≤ 1, and its dgf is defined as

D_12(u_1, u_2) = Σ_{x_1=0}^{∞} Σ_{x_2=0}^{∞} F_12(x_1, x_2) u_1^{x_1} u_2^{x_2},

which converges at least in the open square −1 < u_1, u_2 < 1. The following theorem also establishes a one-to-one correspondence between the pgf and the dgf of a bivariate discrete rv X = (X_1, X_2).

Theorem 1. For −1 < u_1, u_2 < 1,

D_12(u_1, u_2) = Π_12(u_1, u_2)/[(1 − u_1)(1 − u_2)].   (2.2)

Proof. The coefficient of u_1^{x_1} u_2^{x_2} in

(1 − u_1)(1 − u_2) D_12(u_1, u_2) = (1 − u_1 − u_2 + u_1 u_2) D_12(u_1, u_2)

equals

F_12(x_1, x_2) − F_12(x_1 − 1, x_2) − F_12(x_1, x_2 − 1) + F_12(x_1 − 1, x_2 − 1) = P_12(x_1, x_2)   when x_1, x_2 ≥ 1,
F_12(0, x_2) − F_12(0, x_2 − 1) = P_12(0, x_2)   when x_1 = 0 and x_2 ≥ 1,
F_12(x_1, 0) − F_12(x_1 − 1, 0) = P_12(x_1, 0)   when x_1 ≥ 1 and x_2 = 0,
F_12(0, 0) = P_12(0, 0)   when x_1 = x_2 = 0.

Therefore, (1 − u_1)(1 − u_2) D_12(u_1, u_2) = Π_12(u_1, u_2), as asserted. □
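Relation (2.2) can likewise be verified by brute force on any small joint pmf; the sketch below (an illustration added here, using an arbitrary random pmf) compares the truncated double series for the dgf with Π_12(u_1, u_2)/[(1 − u_1)(1 − u_2)]:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4                                   # joint pmf supported on {0,...,N}^2
P = rng.random((N + 1, N + 1))
P /= P.sum()                            # arbitrary joint pmf
F = P.cumsum(axis=0).cumsum(axis=1)     # joint cdf on the support

def cdf(x1, x2):
    # beyond the support the cdf stays at its boundary value
    return F[min(x1, N), min(x2, N)]

def pgf(u1, u2):
    x = np.arange(N + 1)
    return (P * np.outer(u1**x, u2**x)).sum()

def dgf(u1, u2, M=400):
    # truncated double series; converges for |u1|, |u2| < 1
    return sum(cdf(x1, x2) * u1**x1 * u2**x2
               for x1 in range(M) for x2 in range(M))

u1, u2 = 0.6, -0.3
lhs = dgf(u1, u2)
rhs = pgf(u1, u2) / ((1 - u1) * (1 - u2))
print(lhs, rhs)        # the two values agree up to truncation error
```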

3. The Farlie–Gumbel–Morgenstern family

For i = 1, 2 consider the discrete rv's X_i from the distributions F_i with pmf's P_i(x), cdf's F_i(x), pgf's Π_i(u_i) and dgf's D_i(u_i), and let E_i be the sets of all values of F_i(x_i) with the exception of 0 and 1, i.e., E_i = {F_i(x), x = 0, 1, ...}\{0, 1}. By definition, the bivariate rv X = (X_1, X_2) belongs to the Farlie–Gumbel–Morgenstern family of bivariate distributions when its joint cdf is given by

F_12(x_1, x_2) = F_1(x_1) F_2(x_2) [1 + α {1 − F_1(x_1)} {1 − F_2(x_2)}],   (3.1)

with

−min{ 1/(M_1 M_2), 1/[(1 − m_1)(1 − m_2)] } ≤ α ≤ min{ 1/[M_1 (1 − m_2)], 1/[(1 − m_1) M_2] },

where m_i = inf E_i and M_i = sup E_i. The admissible values for the coefficient α are given by Cambanis (1977). Let D_12(u_1, u_2) be the dgf and Π_12(u_1, u_2) be the pgf of the rv X.

Theorem 2. Consider the discrete rv X that belongs to the FGM family given by (3.1). Its pgf is given by

Π_12(u_1, u_2) = Π_1(u_1) Π_2(u_2) [ 1 + α {1 − Π_1^{(2):2}(u_1)/Π_1(u_1)} {1 − Π_2^{(2):2}(u_2)/Π_2(u_2)} ],   (3.2)

where Π_i^{(2):2}(u_i) is the pgf of the rv X_i^{(2):2} = max{X_i^1, X_i^2}, with the rv's X_i^j, j = 1, 2, being iid from the distribution F_i, for i = 1, 2.

Proof. Multiplying both sides of relation (3.1) by u_1^{x_1} u_2^{x_2} and summing over x_1, x_2, we obtain

Σ_{x_1=0}^{∞} Σ_{x_2=0}^{∞} F_12(x_1, x_2) u_1^{x_1} u_2^{x_2}
  = Σ_{x_1=0}^{∞} F_1(x_1) u_1^{x_1} Σ_{x_2=0}^{∞} F_2(x_2) u_2^{x_2}
  + α { Σ_{x_1=0}^{∞} F_1(x_1) u_1^{x_1} − Σ_{x_1=0}^{∞} F_1^2(x_1) u_1^{x_1} } { Σ_{x_2=0}^{∞} F_2(x_2) u_2^{x_2} − Σ_{x_2=0}^{∞} F_2^2(x_2) u_2^{x_2} }.

Note that F_i^2(x) is the cdf of X_i^{(2):2}, which is the greatest of two independent rv's each distributed as F_i. Letting D_i^{(2):2}(u_i) be its dgf and Π_i^{(2):2}(u_i) be its pgf, we obtain

D_12(u_1, u_2) = D_1(u_1) D_2(u_2) + α {D_1(u_1) − D_1^{(2):2}(u_1)} {D_2(u_2) − D_2^{(2):2}(u_2)}.

Hence, using relations (2.1) and (2.2), this becomes

Π_12(u_1, u_2)/[(1 − u_1)(1 − u_2)] = [Π_1(u_1)/(1 − u_1)] [Π_2(u_2)/(1 − u_2)]
  + α { Π_1(u_1)/(1 − u_1) − Π_1^{(2):2}(u_1)/(1 − u_1) } { Π_2(u_2)/(1 − u_2) − Π_2^{(2):2}(u_2)/(1 − u_2) },

which is relation (3.2), for −1 < u_1, u_2 < 1. For u_1 = 1 or u_2 = 1 relation (3.2) also holds, since Π_12(u_1, 1) = Π_1(u_1) and Π_12(1, u_2) = Π_2(u_2). □

Since the derivation of X_i^{(2):2} is rather difficult even in the case of continuous distributions, in the sequel we will consider the case of geometric marginals.
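As an added numerical illustration of Theorem 2, the sketch below builds the FGM joint cdf (3.1) for two geometric marginals, differences it to obtain the joint pmf, and compares the resulting (truncated) pgf with the right-hand side of (3.2); the pgf of X_i^{(2):2} is obtained from the squared cdf. The parameter values are arbitrary admissible choices.

```python
import numpy as np

p1, p2, alpha = 0.4, 0.6, 0.8
N = 300                                   # truncation of the supports
x = np.arange(N + 1)

F1 = 1 - p1**(x + 1)                      # Geometric(p) cdf at 0, 1, ..., N
F2 = 1 - p2**(x + 1)

# FGM joint cdf (3.1) and the joint pmf obtained by double differencing
F12 = np.outer(F1, F2) * (1 + alpha * np.outer(1 - F1, 1 - F2))
P12 = np.diff(np.diff(np.pad(F12, ((1, 0), (1, 0))), axis=0), axis=1)

def pgf_1d(pmf, u):
    return (pmf * u**np.arange(len(pmf))).sum()

u1, u2 = 0.5, -0.7
lhs = (P12 * np.outer(u1**x, u2**x)).sum()          # pgf from the joint pmf

# right-hand side of (3.2): marginal pgf's and pgf's of the max of two iid copies
P1 = np.diff(np.pad(F1, (1, 0)))                    # marginal pmf's
P2 = np.diff(np.pad(F2, (1, 0)))
P1_max = np.diff(np.pad(F1**2, (1, 0)))             # pmf of X_i^{(2):2}
P2_max = np.diff(np.pad(F2**2, (1, 0)))

Pi1, Pi2 = pgf_1d(P1, u1), pgf_1d(P2, u2)
Pi1_max, Pi2_max = pgf_1d(P1_max, u1), pgf_1d(P2_max, u2)
rhs = Pi1 * Pi2 * (1 + alpha * (1 - Pi1_max / Pi1) * (1 - Pi2_max / Pi2))

print(lhs, rhs)    # agreement up to truncation error
```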


[Fig. 1. Upper and lower bounds for the correlation in the cases p_1 = p_2 (left panel) and p_1 = p_2/4 (right panel), for the FGM geometric and the compound bivariate geometric models.]
3.1. Geometric marginals

In the case F_i ≡ Geometric(p_i), with pmf P_i(x) = (1 − p_i) p_i^x, x = 0, 1, ..., the pgf is given by Π_i(u_i) = (1 − p_i)/(1 − p_i u_i). The sequence of the cdf, at the points x = 0, 1, ..., is F_i(x) = 1 − p_i^{x+1}, and the dgf of the rv's X_i^{(2):2} is given by

D_i^{(2):2}(u_i) = Σ_{x=0}^{∞} (1 − p_i^{x+1})^2 u_i^x = 1/(1 − u_i) − 2p_i/(1 − p_i u_i) + p_i^2/(1 − p_i^2 u_i).

Hence, the pgf of the FGM family with geometric marginals is given by

Π_12(u_1, u_2) = [(1 − p_1)/(1 − p_1 u_1)] [(1 − p_2)/(1 − p_2 u_2)]
  + α* [ 1 − (1 − p_1)/(1 − p_1 u_1) − (1 − p_2)/(1 − p_2 u_2) + (1 − p_1)(1 − p_2)/{(1 − p_1 u_1)(1 − p_2 u_2)} ] [(1 − p_1^2)/(1 − p_1^2 u_1)] [(1 − p_2^2)/(1 − p_2^2 u_2)],   (3.3)

where α* = α/[(1 + p_1)(1 + p_2)]. Since m_i = inf{E_i} = 1 − p_i and M_i = sup{E_i} = 1, the admissible values of α are −1 ≤ α ≤ min{1/p_1, 1/p_2}.

Remark. Mahfoud and Patil (1982) gave the definition of the negative mixture of distributions. We can observe that the FGM geometric distribution discussed here can be written as a negative mixture of five bivariate distributions, each with independent marginals, which are univariate geometric or convolutions of geometric distributions.

The covariance is given, according to Johnson and Kotz (1977) (see also relation (4.5) in Section 4.1.2), by

cov(X_1, X_2) = α p_1 p_2 / [(1 − p_1^2)(1 − p_2^2)],

and hence the correlation coefficient can take both positive and negative values; in fact its range is

−(p_1 p_2)^{1/2}/[(1 + p_1)(1 + p_2)] ≤ corr(X_1, X_2) ≤ min{1/p_1, 1/p_2} (p_1 p_2)^{1/2}/[(1 + p_1)(1 + p_2)].
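These bounds are straightforward to tabulate; the following short sketch (an illustration, not from the original text) evaluates them for the two configurations considered below:

```python
import numpy as np

def fgm_geometric_corr_bounds(p1, p2):
    """Correlation bounds of the FGM geometric model, obtained by plugging the
    admissible extremes alpha = -1 and alpha = min(1/p1, 1/p2) into
    corr = alpha * sqrt(p1*p2) / ((1 + p1) * (1 + p2))."""
    scale = np.sqrt(p1 * p2) / ((1 + p1) * (1 + p2))
    return -scale, min(1 / p1, 1 / p2) * scale

for p2 in (0.2, 0.4, 0.6, 0.8):
    lo_eq, hi_eq = fgm_geometric_corr_bounds(p2, p2)        # case p1 = p2
    lo_q, hi_q = fgm_geometric_corr_bounds(p2 / 4, p2)      # case p1 = p2/4
    print(f"p2={p2:.1f}  p1=p2: [{lo_eq:+.3f}, {hi_eq:+.3f}]  "
          f"p1=p2/4: [{lo_q:+.3f}, {hi_q:+.3f}]")
```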

In Fig. 1 these bounds are plotted for two cases, when p_1 = p_2 and p_1 = p_2/4. A well-known bivariate geometric distribution is the one constructed as a compound bivariate Poisson, and its pgf is given by

Π*(u_1, u_2) = (1 − p_1* − p_2* − θ*)/(1 − p_1* u_1 − p_2* u_2 − θ* u_1 u_2),   (3.4)

with 0 < p_1*, p_2*, θ* < 1 and 0 < p_1* + p_2* + θ* < 1. For more details about this model one can refer to Kocherlakota and Kocherlakota (1992).


Table 1
Observed (first entry) and fitted distribution (second entry) for the number of plants Lacistema aggregatum (X_1) and Protium guianense (X_2).^a

X_1\X_2        0          1          2          3          4        OBS    FGM
0          34  39.23   8   8.70   3   2.39   1   0.78   0   0.28     46   51.37
1          12  13.04  13   6.94   6   2.99   1   1.21   0   0.48     32   24.67
2           4   4.92   3   4.01   1   1.88   0   0.78   0   0.31      8   11.90
3           5   2.05   3   2.09   2   1.01   1   0.43   0   0.17     11    5.75
4           2   0.92   0   1.05   0   0.51   0   0.22   0   0.09      2    2.78
5           0   0.43   0   0.52   0   0.25   0   0.11   1   0.04      1    1.35
OBS FGM    57  60.59  27  23.30  12   9.04   3   3.52   1   1.37    100   97.83

p̂_1 = 0.48, p̂_2 = 0.39 and α̂ = 1.31.
^a Data from Kocherlakota and Kocherlakota (1992, p. 243).

To have the same marginal distributions as the FGM model (3.3) we require

p_1* = [p_1(1 − p_2) − θ*(1 − p_1)]/(1 − p_1 p_2)   and   p_2* = [p_2(1 − p_1) − θ*(1 − p_2)]/(1 − p_1 p_2),

and then

0 ≤ θ* ≤ min{ p_1(1 − p_2)/(1 − p_1), p_2(1 − p_1)/(1 − p_2) }.

However, for the bivariate geometric given by (3.4) the correlation is always positive, since

corr(X_1*, X_2*) = (θ* + p_1* p_2*)/[(1 − p_1*)(1 − p_2*)(p_1* + θ*)(p_2* + θ*)]^{1/2}.
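As an illustration (added here), the following sketch matches the compound model (3.4) to given geometric marginals and evaluates its correlation, which makes the restriction to positive correlations explicit; the parameter values are arbitrary.

```python
import numpy as np

def compound_params(p1, p2, theta):
    """Parameters (p1*, p2*) of the compound bivariate geometric (3.4) that
    reproduce Geometric(p1) and Geometric(p2) marginals, for an admissible
    0 <= theta <= min(p1(1-p2)/(1-p1), p2(1-p1)/(1-p2))."""
    p1s = (p1 * (1 - p2) - theta * (1 - p1)) / (1 - p1 * p2)
    p2s = (p2 * (1 - p1) - theta * (1 - p2)) / (1 - p1 * p2)
    return p1s, p2s

def compound_corr(p1s, p2s, theta):
    num = theta + p1s * p2s
    den = np.sqrt((1 - p1s) * (1 - p2s) * (p1s + theta) * (p2s + theta))
    return num / den

p1, p2 = 0.4, 0.6
theta_max = min(p1 * (1 - p2) / (1 - p1), p2 * (1 - p1) / (1 - p2))
for theta in np.linspace(0.0, theta_max, 5):
    p1s, p2s = compound_params(p1, p2, theta)
    print(f"theta={theta:.3f}  p1*={p1s:.3f}  p2*={p2s:.3f}  "
          f"corr={compound_corr(p1s, p2s, theta):.3f}")
```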

The bounds for the correlation can be obtained after tedious calculations, and in Fig. 1 these bounds are given for the special cases p_1 = p_2 and p_1 = p_2/4. This model seems inappropriate for fitting data with negative or rather small correlation, like the data presented in Table 1, where

x̄_1 = 0.94,   m_{2,0} = 1.31,   x̄_2 = 0.64,   m_{0,2} = 0.77   and   m_{1,1} = 0.27

(denoting the central sample moments of order (r, s) by m_{r,s} = (1/n) Σ_{i=1}^{n} (x_{1i} − x̄_1)^r (x_{2i} − x̄_2)^s, where (x_{1i}, x_{2i}), i = 1, ..., n, are the observations and n is the sample size). These data are fitted satisfactorily, X²_FGM = 17.96 with df = 13, by the FGM geometric distribution (3.3). The method of moments has been used to estimate the parameters of the model:

p̂_i = x̄_i/(1 + x̄_i)   for i = 1, 2,   and   α̂ = (1 + p̂_1)(1 + p̂_2) m_{1,1} / √(p̂_1 p̂_2 m_{2,0} m_{0,2}).
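A minimal sketch of this moment estimation (added as an illustration), using only the summary statistics quoted above, so the resulting α̂ differs slightly from the reported value because the inputs are rounded:

```python
import math

# Summary statistics reported for the Table 1 data
xbar1, xbar2 = 0.94, 0.64
m20, m02, m11 = 1.31, 0.77, 0.27

# Method-of-moments estimators for the FGM geometric model (3.3)
p1_hat = xbar1 / (1 + xbar1)
p2_hat = xbar2 / (1 + xbar2)
corr_hat = m11 / math.sqrt(m20 * m02)
alpha_hat = corr_hat * (1 + p1_hat) * (1 + p2_hat) / math.sqrt(p1_hat * p2_hat)

print(round(p1_hat, 2), round(p2_hat, 2), round(alpha_hat, 2))
# ~0.48, ~0.39 and ~1.28, close to the reported 1.31 (the inputs above are rounded)
```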

4. The extended FGM family

Consider now the extended Farlie–Gumbel–Morgenstern family of bivariate distributions with joint cdf

F_12(x_1, x_2) = F_1(x_1) F_2(x_2) [1 + α S_1*(x_1) S_2*(x_2)],   (4.1)

where F_i*(x) = 1 − S_i*(x) (i = 1, 2) are cdf's, but not necessarily identical with F_i(x). Of course, α must satisfy the conditions for the distribution to be proper. In general, its possible values depend upon the bounds of

[F_i(x*) S_i*(x*) − F_i(x) S_i*(x)] / [F_i(x*) − F_i(x)]   for x < x*, i = 1, 2.

Results on generalizations of the FGM family have been given recently by Bairamov et al. (2001) and Bairamov and Kotz (2002).


As a particular case of this family we suppose S_i*(x_i) = p_i^{x_i+1}, which is the survival function of a Geometric(p_i) rv. This case is considered both for the simplicity of the relations and for the importance of the geometric distribution as the discrete analogue of the exponential distribution.

Theorem 3. Consider the discrete rv X that belongs to the extended FGM family given by (4.1), with S_i*(x_i) = p_i^{x_i+1}. Its pgf is given by

Π_12(u_1, u_2) = Π_1(u_1) Π_2(u_2) + α* [Π_1(p_1 u_1)/Π_1(p_1)] [Π_2(p_2 u_2)/Π_2(p_2)] [1 − (1 − p_1)/(1 − p_1 u_1)] [1 − (1 − p_2)/(1 − p_2 u_2)],   (4.2)

where α* = α Π_1(p_1) Π_2(p_2).

Proof. Multiplying both sides of relation (4.1) by u_1^{x_1} u_2^{x_2} and summing over the values of x_1, x_2, we have

Σ_{x_1=0}^{∞} Σ_{x_2=0}^{∞} F_12(x_1, x_2) u_1^{x_1} u_2^{x_2} = Σ_{x_1=0}^{∞} F_1(x_1) u_1^{x_1} Σ_{x_2=0}^{∞} F_2(x_2) u_2^{x_2} + α p_1 p_2 Σ_{x_1=0}^{∞} F_1(x_1) p_1^{x_1} u_1^{x_1} Σ_{x_2=0}^{∞} F_2(x_2) p_2^{x_2} u_2^{x_2},

and therefore

D_12(u_1, u_2) = D_1(u_1) D_2(u_2) + α p_1 p_2 D_1(p_1 u_1) D_2(p_2 u_2).

Using Corollary 1 and Theorem 1 this is equivalent to

Π_12(u_1, u_2)/[(1 − u_1)(1 − u_2)] = [Π_1(u_1)/(1 − u_1)] [Π_2(u_2)/(1 − u_2)] + α p_1 p_2 [Π_1(p_1 u_1)/(1 − p_1 u_1)] [Π_2(p_2 u_2)/(1 − p_2 u_2)],

with −1 < u_1, u_2 < 1, and finally

Π_12(u_1, u_2) = Π_1(u_1) Π_2(u_2) + α p_1 p_2 [(1 − u_1)/(1 − p_1 u_1)] [(1 − u_2)/(1 − p_2 u_2)] Π_1(p_1 u_1) Π_2(p_2 u_2)
  = Π_1(u_1) Π_2(u_2) + α Π_1(p_1) Π_2(p_2) [1 − (1 − p_1)/(1 − p_1 u_1)] [1 − (1 − p_2)/(1 − p_2 u_2)] [Π_1(p_1 u_1)/Π_1(p_1)] [Π_2(p_2 u_2)/Π_2(p_2)].



For u_1 = 1 or u_2 = 1 relation (4.2) also holds, since Π_12(u_1, 1) = Π_1(u_1) and Π_12(1, u_2) = Π_2(u_2). □

Remark. The pgf of the bivariate rv (X_1, X_2) can be expressed as a finite negative mixture of bivariate distributions, each with independent marginals, which are either the marginal distributions of the model or convolutions of geometric distributions with the distributions of rv's X_i* that have pgf's of the form Π_i(p_i u_i)/Π_i(p_i) (i = 1, 2).

4.1. Properties

By appropriate differentiation of the pgf various properties of the distribution are derived. From relation (4.2), by successive differentiation with respect to the arguments, we have

Π_12^{(r,s)}(u_1, u_2) = Π_1^{(r)}(u_1) Π_2^{(s)}(u_2)
  + α* [ p_1^r Π_1^{(r)}(p_1 u_1)/Π_1(p_1) − Σ_{j=0}^{r} \binom{r}{j} j! (1 − p_1) p_1^r Π_1^{(r−j)}(p_1 u_1) / {(1 − p_1 u_1)^{j+1} Π_1(p_1)} ]
  × [ p_2^s Π_2^{(s)}(p_2 u_2)/Π_2(p_2) − Σ_{j=0}^{s} \binom{s}{j} j! (1 − p_2) p_2^s Π_2^{(s−j)}(p_2 u_2) / {(1 − p_2 u_2)^{j+1} Π_2(p_2)} ],

where

Π_12^{(r,s)}(u_1, u_2) = ∂^{r+s} Π_12(u_1, u_2)/∂u_1^r ∂u_2^s   and   Π_i^{(r)}(u_i) = d^r Π_i(u_i)/du_i^r.

4.1.1. Probabilities

Since P_12(x_1, x_2) = Π_12^{(x_1,x_2)}(u_1, u_2)|_{u_1=u_2=0} / (x_1! x_2!), we have

P_12(x_1, x_2) = P_1(x_1) P_2(x_2) [ 1 + α p_1^{x_1} p_2^{x_2} {1 − (1 − p_1) F_1(x_1)/P_1(x_1)} {1 − (1 − p_2) F_2(x_2)/P_2(x_2)} ],   (4.3)

for x_1, x_2 = 0, 1, ....
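Formula (4.3) is easy to evaluate numerically. The sketch below is only an illustration; it assumes Poisson marginals with parameter values close to the estimates of Section 4.2 and checks that (4.3) sums to one and reproduces the marginals (the admissibility of α for unbounded supports is the issue addressed by truncation in Section 4.2):

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def extended_fgm_pmf(x1, x2, lam1, lam2, p1, p2, alpha):
    """Joint probability (4.3) of the extended FGM family with S_i*(x) = p_i^(x+1)."""
    P1, P2 = poisson_pmf(x1, lam1), poisson_pmf(x2, lam2)
    F1 = sum(poisson_pmf(k, lam1) for k in range(x1 + 1))
    F2 = sum(poisson_pmf(k, lam2) for k in range(x2 + 1))
    return P1 * P2 * (1 + alpha * p1**x1 * p2**x2
                      * (1 - (1 - p1) * F1 / P1)
                      * (1 - (1 - p2) * F2 / P2))

# illustrative parameter values (close to the Section 4.2 estimates)
lam1, lam2, p1, p2, alpha = 1.7, 2.1, 0.55, 0.57, -0.05
N = 40                                   # effective truncation of the supports
P = np.array([[extended_fgm_pmf(i, j, lam1, lam2, p1, p2, alpha)
               for j in range(N)] for i in range(N)])

print(P.sum())                           # ~1
print(np.allclose(P.sum(axis=1)[:10],
                  [poisson_pmf(i, lam1) for i in range(10)]))   # marginal of X1
```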


4.1.2. Factorial moments

The descending factorial moments are derived from the pgf as

μ_{(r,s)}(X_1, X_2) = E[X_1(X_1 − 1)···(X_1 − r + 1) X_2(X_2 − 1)···(X_2 − s + 1)] = Π_12^{(r,s)}(u_1, u_2)|_{u_1=u_2=1};

hence, the following result is easily obtained:

μ_{(r,s)}(X_1, X_2) = μ_{(r)}(X_1) μ_{(s)}(X_2) + α* [ μ_{(r)}(X_1*) − Σ_{j=0}^{r} C_j(r, p_1) μ_{(r−j)}(X_1*) ] [ μ_{(s)}(X_2*) − Σ_{j=0}^{s} C_j(s, p_2) μ_{(s−j)}(X_2*) ],   (4.4)

where

C_j(r, p) = [r!/(r − j)!] {p/(1 − p)}^j,

and X_i* is a rv with pgf Π_i(p_i u_i)/Π_i(p_i), for i = 1, 2. The covariance is given by

cov(X_1, X_2) = α Π_1(p_1) Π_2(p_2) [p_1/(1 − p_1)] [p_2/(1 − p_2)],   (4.5)

and has the same sign as α.

4.1.3. Conditional distributions

The pgf of the conditional distribution is determined by applying the differentiation formula (see Kocherlakota and Kocherlakota, 1992)

Π_{X_1|X_2=x_2}(u_1) = Π_12^{(0,x_2)}(u_1, u_2)|_{u_2=0} / Π_12^{(0,x_2)}(u_1, u_2)|_{u_1=1, u_2=0}.

Therefore,

Π_{X_1|X_2=x_2}(u_1) = Π_1(u_1) + c(x_2) [ Π_1(p_1 u_1)/Π_1(p_1) − {(1 − p_1)/(1 − p_1 u_1)} Π_1(p_1 u_1)/Π_1(p_1) ],   (4.6)

where

c(x_2) = α Π_1(p_1) p_2^{x_2} P_2(x_2) [ p_2 − (1 − p_2) F_2(x_2 − 1)/P_2(x_2) ]   for x_2 > 0,   and   c(0) = α Π_1(p_1) p_2 P_2(0).

The corresponding conditional means are given by

E(X_1 | X_2 = x_2) = E(X_1) − α Π_1(p_1) p_2^{x_2} P_2(x_2) [ p_2 − (1 − p_2) F_2(x_2 − 1)/P_2(x_2) ] p_1/(1 − p_1)   for x_2 > 0,

and

E(X_1 | X_2 = 0) = E(X_1) − α Π_1(p_1) p_2 P_2(0) p_1/(1 − p_1).
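The covariance formula (4.5) can be confirmed numerically from the pmf (4.3). A brief illustrative sketch (added here) with Geometric(q_i) marginals; the parameter values are arbitrary choices for which (4.3) is a proper pmf:

```python
import numpy as np

# Check of the covariance formula (4.5) for the extended FGM family with
# Geometric(q_i) marginals and S_i*(x) = p_i^(x+1); values are illustrative.
q1, q2, p1, p2, alpha = 0.6, 0.7, 0.3, 0.4, 0.7
N = 400
x = np.arange(N)

P1 = (1 - q1) * q1**x                 # marginal pmf's
P2 = (1 - q2) * q2**x
F1, F2 = P1.cumsum(), P2.cumsum()     # marginal cdf's

# joint pmf (4.3)
g1 = p1**x * (1 - (1 - p1) * F1 / P1)
g2 = p2**x * (1 - (1 - p2) * F2 / P2)
P12 = np.outer(P1, P2) * (1 + alpha * np.outer(g1, g2))

EX1 = (x * P12.sum(axis=1)).sum()
EX2 = (x * P12.sum(axis=0)).sum()
EX1X2 = (np.outer(x, x) * P12).sum()
cov_direct = EX1X2 - EX1 * EX2

Pi1_p1 = (1 - q1) / (1 - q1 * p1)     # Pi_i(p_i) for geometric marginals
Pi2_p2 = (1 - q2) / (1 - q2 * p2)
cov_45 = alpha * Pi1_p1 * Pi2_p2 * p1 * p2 / ((1 - p1) * (1 - p2))

print(cov_direct, cov_45)             # agree up to truncation error
```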

4.2. An example with the Poisson distribution

If, for i = 1, 2, F_i ∼ Poisson(λ_i) with pmf P_i(x) = e^{−λ_i} λ_i^x / x! for x = 0, 1, ... and pgf Π_i(u_i) = exp[λ_i(u_i − 1)], then the rv's X_i* ∼ Poisson(p_i λ_i), since their pgf's are

Π_i(p_i u_i)/Π_i(p_i) = exp[λ_i(p_i u_i − 1)]/exp[λ_i(p_i − 1)] = exp[λ_i p_i (u_i − 1)].


Table 2
Observed (first entry) and fitted distribution (second entry) for the number of seeds (X_1) and plants (X_2) grown over a plot of five square feet.^a

X_1\X_2        0          1          2          3          4          5       OBS    EFGM
0           7  21.2   41  45.2   54  47.8   40  33.7   21  17.8    9   7.5    172   173.21
1          36  37.3   79  78.6   83  82.7   59  58.0   30  30.5   13  12.8    300   299.79
2          39  32.6   70  68.2   69  71.5   47  50.0   25  26.2   10  11.0    260   259.33
3          24  18.9   41  39.4   39  41.2   26  28.7   14  15.0    6   6.3    150   149.53
4          10   8.3   18  17.1   18  17.8   11  12.4    6   6.5    2   2.7     65    64.68
5           3   2.9    6   6.0    6   6.2    4   4.3    2   2.2    1   0.9     22    22.39
OBS EFGM  119 121.1  255 254.4  269 267.1  187 187.0   98  98.1   41  41.2    969   968.9

λ̂_1 = 1.73, λ̂_2 = 2.10, α̂ = −0.076, p̂_1 = 0.55 and p̂_2 = 0.57.
^a Data from Lakshminarayana et al. (1999).

To specify the possible values for α, the quantities

[F_i(x*) p_i^{x*+1} − F_i(x) p_i^{x+1}] / [F_i(x*) − F_i(x)],   i = 1, 2,

must be bounded for every x < x*. This condition, in general, holds if the supports of the distributions F_i, i = 1, 2, are bounded. Under this restriction, if x_i^max is the maximum observed value of the rv X_i (i = 1, 2) in the data set, then we can use truncated Poisson distributions as marginals, with pmf P_i^T(x) given by

P_i^T(x) = P_i(x)/F_i(x_i^max)   for x = 0, 1, ..., x_i^max.

Then, the pmf of the model with truncated Poisson marginals is given by

P_12^T(x_1, x_2) = [P_1(x_1)/F_1(x_1^max)] [P_2(x_2)/F_2(x_2^max)] [ 1 + α p_1^{x_1} p_2^{x_2} {1 − (1 − p_1) F_1(x_1)/P_1(x_1)} {1 − (1 − p_2) F_2(x_2)/P_2(x_2)} ].   (4.7)

In this case, the possible values of the parameter α are −1/(p_1 p_2) ≤ α ≤ 1/(p_1 p_2). In Table 2 we fit satisfactorily, X² = 17.2 with df = 29, the model (4.7) to a data set with

x̄_1 = 1.69,   m_{2,0} = 1.55,   x̄_2 = 2.01,   m_{0,2} = 1.73   and negative covariance   m_{1,1} = −0.15.

To estimate the parameters we use sample moments, conditional sample moments and observed frequencies. The estimators for λ_i are obtained from the respective moment equations

x̄_i = [Σ_{x=0}^{x_i^max} x e^{−λ̂_i} λ̂_i^x/x!] / [Σ_{x=0}^{x_i^max} e^{−λ̂_i} λ̂_i^x/x!] = λ̂_i [Σ_{x=0}^{x_i^max−1} λ̂_i^x/x!] / [Σ_{x=0}^{x_i^max} λ̂_i^x/x!] = λ̂_i F_i(x_i^max − 1)/F_i(x_i^max)   for i = 1, 2.

The estimators for p_i and α are obtained by solving iteratively the equations for the conditional means at zero and the covariance:

x̄_{1|x_2=0} = x̄_1 − α̂ [p̂_1/(1 − p̂_1)] p̂_2 [Σ_{x=0}^{x_1^max} (λ̂_1 p̂_1)^x/x!] / ( [Σ_{x=0}^{x_1^max} λ̂_1^x/x!] [Σ_{x=0}^{x_2^max} λ̂_2^x/x!] ),

x̄_{2|x_1=0} = x̄_2 − α̂ [p̂_2/(1 − p̂_2)] p̂_1 [Σ_{x=0}^{x_2^max} (λ̂_2 p̂_2)^x/x!] / ( [Σ_{x=0}^{x_2^max} λ̂_2^x/x!] [Σ_{x=0}^{x_1^max} λ̂_1^x/x!] ),

m_{1,1} = α̂ [p̂_1/(1 − p̂_1)] [p̂_2/(1 − p̂_2)] [Σ_{x=0}^{x_1^max} (λ̂_1 p̂_1)^x/x!] [Σ_{x=0}^{x_2^max} (λ̂_2 p̂_2)^x/x!] / ( [Σ_{x=0}^{x_1^max} λ̂_1^x/x!] [Σ_{x=0}^{x_2^max} λ̂_2^x/x!] ).
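As an illustration (added here), the sketch below evaluates the truncated-Poisson model (4.7) at the estimates reported under Table 2; multiplying by n = 969 gives fitted frequencies comparable to the fitted entries of Table 2:

```python
import numpy as np
from math import exp, factorial

def truncated_fgm_poisson(lam1, lam2, p1, p2, alpha, x1max, x2max):
    """Cell probabilities of model (4.7) on {0,...,x1max} x {0,...,x2max}."""
    def pois(lam, xmax):
        pmf = np.array([exp(-lam) * lam**k / factorial(k) for k in range(xmax + 1)])
        return pmf, pmf.cumsum()          # untruncated Poisson pmf and cdf values

    P1, F1 = pois(lam1, x1max)
    P2, F2 = pois(lam2, x2max)
    g1 = p1**np.arange(x1max + 1) * (1 - (1 - p1) * F1 / P1)
    g2 = p2**np.arange(x2max + 1) * (1 - (1 - p2) * F2 / P2)
    base = np.outer(P1 / F1[-1], P2 / F2[-1])        # truncated Poisson marginals
    return base * (1 + alpha * np.outer(g1, g2))

# estimates reported under Table 2
fitted = 969 * truncated_fgm_poisson(1.73, 2.10, 0.55, 0.57, -0.076, 5, 5)
print(np.round(fitted, 1))
print(fitted.sum())          # ~969
```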


5. Related models

To facilitate the calculation of the possible values of the parameter α, we can use

1 − (1 − p_i) F_i(x_i)/(1 − p_i F_i(x_i))

in place of S_i*(x_i). This is the sf of the geometric-maximum stable distribution of Y_i, discussed by Marshall and Olkin (1997). It is derived as

Y_i = max{X_i^1, X_i^2, ..., X_i^{N_i}},

where N_i ∼ Geometric(p_i) + 1, independent of the X_i^j, which are iid with X_i^j ∼ F_i (i = 1, 2). The rv Y_i is stochastically greater than X_i, Y_i >_st X_i, as

(1 − p_i) F_i(x_i)/(1 − p_i F_i(x_i)) < F_i(x_i)   for every x_i.

The cdf of the extended FGM family for this particular case is

F_12(x_1, x_2) = F_1(x_1) F_2(x_2) [ 1 + α {1 − (1 − p_1) F_1(x_1)/(1 − p_1 F_1(x_1))} {1 − (1 − p_2) F_2(x_2)/(1 − p_2 F_2(x_2))} ],   (5.1)

with 0 < p_1, p_2 < 1 and α_min ≤ α ≤ α_max, where

α_min = −min{ (1 − p_1 M_1)(1 − p_1)(1 − p_2 M_2)(1 − p_2) / [(1 − m_1 + p_1 M_1)(1 − m_2 + p_2 M_2)], (1 − p_1 M_1)(1 − p_1)(1 − p_2 M_2)(1 − p_2) / (M_1 M_2) }

and

α_max = min{ (1 − p_1 M_1)(1 − p_1)(1 − p_2 M_2)(1 − p_2) / [M_1 (1 − m_2 + p_2 M_2)], (1 − p_1 M_1)(1 − p_1)(1 − p_2 M_2)(1 − p_2) / [(1 − m_1 + p_1 M_1) M_2] }.

For p_1 = p_2 = 0 we obtain the FGM family given by (3.1). The form of the corresponding copula is given by

C(u, v) = uv [ 1 + α (1 − u)(1 − v) / {(1 − p_1 u)(1 − p_2 v)} ].

Studying the properties of this copula is beyond the scope of this paper; although it might seem promising, its study for discrete distributions cannot be facilitated by the pgf approach proposed earlier, as it is not feasible, in the general case, to find an explicit formula for the pgf of the cdf given by (5.1).

Acknowledgements

The author is grateful to the referees for comments and suggestions.

References

Bairamov, I., Kotz, S., 2002. Dependence structure and symmetry of Huang–Kotz FGM distributions and their extensions. Metrika 56, 55–72.
Bairamov, I., Kotz, S., Bekçi, M., 2001. New generalized Farlie–Gumbel–Morgenstern distributions and concomitants of order statistics. J. Appl. Statist. 28, 521–536.
Cambanis, S., 1977. Some properties and generalizations of multivariate Eyraud–Gumbel–Morgenstern distributions. J. Multivariate Anal. 7, 551–559.
Feller, W., 1968. An Introduction to Probability Theory and Its Applications, vol. 1. Wiley, New York.
Griffiths, R.C., Milne, R.K., Wood, R., 1979. Aspects of correlation in bivariate Poisson distributions and processes. Austral. J. Statist. 21, 238–255.
Johnson, N.L., Kotz, S., 1977. On some generalized Farlie–Gumbel–Morgenstern distributions-II. Regression, correlation and further generalizations. Comm. Statist. Theory Methods 6, 485–496.
Johnson, N.L., Kotz, S., Balakrishnan, N., 1997. Discrete Multivariate Distributions. Wiley, New York.
Kocherlakota, S., Kocherlakota, K., 1992. Bivariate Discrete Distributions. Marcel Dekker, New York.
Lakshminarayana, J., Pandit, S.N.N., Rao, K.S., 1999. On a bivariate Poisson distribution. Comm. Statist. Theory Methods 28, 267–276.
Mahfoud, M., Patil, G.P., 1982. On weighted distributions. In: Kallianpur, G., Krishnaiah, P.R., Ghosh, J.K. (Eds.), Statistics and Probability: Essays in Honor of C.R. Rao. North-Holland, Amsterdam, pp. 479–492.
Marshall, A.W., Olkin, I., 1997. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84, 641–652.
Nelsen, R.B., 1987. Discrete bivariate distributions with given marginals and correlation. Comm. Statist. Simulation Comput. 16, 199–208.