Variance of the truncated negative binomial distribution

Accepted Manuscript

PII: S0304-4076(16)30161-0
DOI: http://dx.doi.org/10.1016/j.jeconom.2016.09.002
Reference: ECONOM 4290

To appear in: Journal of Econometrics

Received date: 29 August 2016
Revised date: 29 August 2016
Accepted date: 1 September 2016

Please cite this article as: Shonkwiler, J.S., Variance of the truncated negative binomial distribution. Journal of Econometrics (2016), http://dx.doi.org/10.1016/j.jeconom.2016.09.002

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

J. S. Shonkwiler
Department of Agricultural and Applied Economics, University of Georgia, Athens, Georgia, USA.
E-mail address: [email protected]
Telephone: 706.542.0847
Fax: 706.542.0739

Abstract

Citations to formulas for the moments of the truncated negative binomial distribution usually reference the paper by Gurmu and Trivedi (1992). However, the second moments of the truncated negative binomial given there are incorrect. We derive the correct second moments for both the left and right truncated negative binomial distribution. The second moments of the truncated distributions are written in a form that shows they will converge to the second moment of the un-truncated distribution when the truncated first moment approaches the un-truncated first moment.

JEL classification: C46; C18; C52

Keywords: Negative binomial; Truncation; Incomplete beta function

1. Introduction

The purpose of this note is to derive the moments of the truncated negative binomial distribution. Unfortunately, previous derivations of the second moments are incorrect, with the exception of Geyer (2007). The paper by Gurmu and Trivedi (1992; hereafter GT) appears to be the first published report to provide the moments of the truncated negative binomial distribution. There is an error in the GT formula for the variance that has been propagated in the literature. The recent book by Cameron and Trivedi (2013) reproduces both moments of GT. Additionally, Johnson et al. (2005, p. 236) refer readers to both GT and the earlier edition of Cameron and Trivedi (1998) for moments and details relating to the left and right truncated negative binomial distributions.

Both truncated and censored count data models are widely used in areas such as recreation demand, insurance claims, shopping trips, and survey data where respondents report intervals rather than exact counts. Since the negative binomial regression model can account for over-dispersion and can nest the Poisson regression model, it is commonly used for truncated and censored count data and hurdle or two-part models.

2. The truncated negative binomial distribution

The (generalized) negative binomial distribution can be represented as

$$P(Y = y) = \frac{\Gamma(y+a)}{\Gamma(y+1)\,\Gamma(a)} \left(\frac{\mu}{\mu+a}\right)^{y} \left(\frac{a}{\mu+a}\right)^{a} \tag{1}$$

where $a = (1/\alpha)\mu^{\phi}$. It is well known that $E[Y] = \mu$ and $V[Y] = \mu + \alpha\mu^{2-\phi}$. We will term $\alpha$ the dispersion parameter and recognize that in most settings $\phi$ will equal zero, in which case we have what has been called the NB2 distribution.
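As a quick numerical illustration of equation (1), the following minimal sketch evaluates the probability mass function on the log scale and checks the stated mean and variance by direct summation. The function names, the parameter values, and the summation cutoff `y_max` are illustrative assumptions, not part of the original note.

```python
import math

def nb_pmf(y, mu, alpha, phi=0.0):
    """Equation (1) with a = (1/alpha) * mu**phi, computed via log-gamma."""
    a = mu ** phi / alpha
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + y * math.log(mu / (mu + a))
             + a * math.log(a / (mu + a)))
    return math.exp(log_p)

def nb_moments(mu, alpha, phi=0.0, y_max=5000):
    """Mean and variance by summing over y = 0..y_max.

    y_max is an assumed cutoff; the neglected tail is negligible for the
    moderate parameter values used here.
    """
    probs = [nb_pmf(y, mu, alpha, phi) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return mean, second - mean ** 2

mu, alpha = 2.5, 0.8                           # illustrative values
mean_nb2, var_nb2 = nb_moments(mu, alpha)      # phi = 0 (NB2)
mean_nb1, var_nb1 = nb_moments(mu, alpha, 1.0) # phi = 1
print(mean_nb2, var_nb2, mu + alpha * mu ** 2)
print(mean_nb1, var_nb1, mu + alpha * mu)
```

The printed pairs compare the summed variance with $\mu + \alpha\mu^{2-\phi}$ for $\phi = 0$ and $\phi = 1$.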

By applying the Laplace transform to the canonical parameter for the truncated form of this distribution and by exploiting the relationship between the incomplete beta function and the cumulative distribution function of the negative binomial distribution (Johnson et al. 2005), we find that for left truncation

$$E[Y \mid Y > c] = \mu + \frac{(a+\mu)(c+1)\,h(c+1)}{a\,\bigl(1 - \Pr(Y \le c)\bigr)} = \mu^{*}. \tag{2}$$

Here $h(c+1)$ denotes the un-truncated negative binomial probability mass function (equation 1) evaluated at $c+1$. The negative binomial cumulative distribution function may be conveniently evaluated with the incomplete beta function, such as the one programmed in MATLAB. An alternative and perhaps more attractive form of (2) is equation (2.15) in GT with $r = c+1$.
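Equation (2) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it uses only the Python standard library, so $\Pr(Y \le c)$ is obtained by summing the mass function rather than through an incomplete beta routine, and the function names, parameter values, and cutoff `y_max` are hypothetical choices.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated negative binomial pmf h(y) of equation (1)."""
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + y * math.log(mu / (mu + a))
             + a * math.log(a / (mu + a)))
    return math.exp(log_p)

def left_truncated_mean(mu, a, c):
    """Equation (2): E[Y | Y > c]."""
    # Pr(Y <= c) by direct summation (stand-in for the incomplete beta form).
    cdf_c = sum(nb_pmf(y, mu, a) for y in range(c + 1))
    return mu + (a + mu) * (c + 1) * nb_pmf(c + 1, mu, a) / (a * (1.0 - cdf_c))

def left_truncated_mean_brute(mu, a, c, y_max=5000):
    """Same quantity from the definition, summing y = c+1..y_max (assumed cutoff)."""
    ys = range(c + 1, y_max + 1)
    probs = [nb_pmf(y, mu, a) for y in ys]
    return sum(y * p for y, p in zip(ys, probs)) / sum(probs)

mu, a = 2.5, 1.25  # illustrative values; a = 1/alpha in the NB2 case
for c in (0, 3):
    print(c, left_truncated_mean(mu, a, c), left_truncated_mean_brute(mu, a, c))
```

For $c = 0$ the formula reduces to the familiar zero-truncated mean $\mu / (1 - h(0))$, which gives a second independent check.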

The truncated variance can be derived as (details available from the author)

$$V(Y \mid Y > c) = \mu^{*} + c(\mu^{*} - \mu) + \mu^{*}\mu(1 + 1/a) - \mu^{*2}. \tag{3}$$

Note that as $\mu^{*} \to \mu$, $V(Y \mid Y > c) \to V(Y)$. Also, it can be shown that the GT left-truncated variance always underestimates the true left-truncated variance. For example, in the NB2 case, the GT formula for the left-truncated variance omits the term $\alpha(\mu^{*}\mu - \mu^{2})$, which is positive because $\mu^{*} > \mu$ under left truncation.
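The variance in equation (3) can be compared with a brute-force calculation from the truncated mass function. This is a minimal sketch using only the standard library; the function names, parameter values, and the cutoff `y_max` are assumptions made for illustration. Setting `c = -1` corresponds to no truncation and reproduces the un-truncated moments, in line with the convergence remark above.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated negative binomial pmf h(y) of equation (1)."""
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + y * math.log(mu / (mu + a))
             + a * math.log(a / (mu + a)))
    return math.exp(log_p)

def left_truncated_moments(mu, a, c):
    """Mean from equation (2) and variance from equation (3)."""
    cdf_c = sum(nb_pmf(y, mu, a) for y in range(c + 1))  # Pr(Y <= c)
    mu_star = mu + (a + mu) * (c + 1) * nb_pmf(c + 1, mu, a) / (a * (1.0 - cdf_c))
    var = (mu_star + c * (mu_star - mu)
           + mu_star * mu * (1.0 + 1.0 / a) - mu_star ** 2)
    return mu_star, var

def brute_force_moments(mu, a, c, y_max=5000):
    """Mean and variance of Y | Y > c by summing y = c+1..y_max (assumed cutoff)."""
    ys = range(c + 1, y_max + 1)
    probs = [nb_pmf(y, mu, a) for y in ys]
    total = sum(probs)
    m1 = sum(y * p for y, p in zip(ys, probs)) / total
    m2 = sum(y * y * p for y, p in zip(ys, probs)) / total
    return m1, m2 - m1 ** 2

mu, a = 2.5, 1.25  # illustrative values; alpha = 1/a = 0.8 in the NB2 case
for c in (-1, 0, 3):
    print(c, left_truncated_moments(mu, a, c), brute_force_moments(mu, a, c))

# Size of the NB2 term that the text says is missing from the GT variance.
mu_star, _ = left_truncated_moments(mu, a, 3)
omitted_term = (1.0 / a) * (mu_star * mu - mu ** 2)
print(omitted_term)
```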

An unpublished document by Geyer (2007) has also derived the left-truncated variance of the negative binomial distribution but did not recognize that the incomplete beta function can represent the negative binomial cumulative distribution function. Under the parameterization $\alpha = a$, $p = a/(a+\mu)$, and $k = c$ (so that in Geyer's notation $\alpha$ is the shape parameter $a$, not the dispersion parameter of equation 1), Geyer reported that

$$V(Y \mid Y > k) = \frac{\alpha(1-p)}{p^{2}} - \frac{k+1}{p^{2}(1+\beta)} \left\{ -(1-p) + \frac{(k+1+\alpha)(1-p)}{1+\beta} + \bigl[\alpha - p(k+1+\alpha)\bigr] \frac{\beta}{1+\beta} \right\} \tag{4}$$

where $\beta = \Pr(Y > k+1)/\Pr(Y = k+1)$. Recognizing that

$$1 + \beta = \frac{(a+\mu)(c+1)}{a(\mu^{*} - \mu)}$$

and substituting for $\alpha$, $p$, and $k$ in terms of the original parameters, it can be shown that (4) collapses to (3).
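One possible route for the reduction of (4) to (3) is sketched below. It is not taken from the original note; the shorthand $d = \mu^{*} - \mu$ is introduced here purely for convenience.

```latex
% Shorthand (assumption of this sketch): d = \mu^{*} - \mu.
% With p = a/(a+\mu) and k = c, the identity for 1+\beta gives
%   (k+1) / (p (1+\beta)) = d .
\begin{align*}
\frac{(k+1+\alpha)(1-p)}{1+\beta}
  + \bigl[\alpha - p(k+1+\alpha)\bigr]\frac{\beta}{1+\beta}
  &= \frac{k+1}{1+\beta} + \alpha(1-p) - p(k+1)
   = p\,d + \alpha(1-p) - p(k+1), \\[4pt]
V(Y \mid Y > k)
  &= \frac{\alpha(1-p)}{p^{2}}
     - \frac{d}{p}\Bigl\{ -(1-p) + p\,d + \alpha(1-p) - p(k+1) \Bigr\} \\
  &= \frac{\alpha(1-p)}{p^{2}}
     + d\Bigl[ \frac{1-p}{p} - d - \frac{\alpha(1-p)}{p} + (k+1) \Bigr].
\end{align*}
% Substituting \alpha = a, p = a/(a+\mu), k = c:
%   \alpha(1-p)/p = \mu,  (1-p)/p = \mu/a,  \alpha(1-p)/p^{2} = \mu + \mu^{2}/a .
\begin{align*}
V(Y \mid Y > c)
  &= \mu + \frac{\mu^{2}}{a} + d\Bigl[ c + 1 + \frac{\mu}{a} - \mu - d \Bigr].
\end{align*}
% Expanding (3) with \mu^{*} = \mu + d gives the same expression:
\begin{align*}
\mu^{*} + c(\mu^{*}-\mu) + \mu^{*}\mu\Bigl(1+\frac{1}{a}\Bigr) - \mu^{*2}
  &= \mu + \frac{\mu^{2}}{a} + d\Bigl[ c + 1 + \frac{\mu}{a} - \mu - d \Bigr].
\end{align*}
```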

For completeness we consider the effect of right truncation on the moments.

$$E[Y \mid Y \le c] = \mu - \frac{(a+\mu)(c+1)\,h(c+1)}{a\,\Pr(Y \le c)} = \mu^{0} \tag{5}$$

and

$$V(Y \mid Y \le c) = \mu^{0} + c(\mu^{0} - \mu) + \mu^{0}\mu(1 + 1/a) - (\mu^{0})^{2}. \tag{6}$$
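Because the right-truncated support $0, \dots, c$ is finite, equations (5) and (6) can be compared exactly (up to floating-point error) with moments computed from the definition. The following is a minimal sketch; the function names and parameter values are illustrative assumptions.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated negative binomial pmf h(y) of equation (1)."""
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + y * math.log(mu / (mu + a))
             + a * math.log(a / (mu + a)))
    return math.exp(log_p)

def right_truncated_moments(mu, a, c):
    """Mean from equation (5) and variance from equation (6)."""
    cdf_c = sum(nb_pmf(y, mu, a) for y in range(c + 1))  # Pr(Y <= c)
    mu_0 = mu - (a + mu) * (c + 1) * nb_pmf(c + 1, mu, a) / (a * cdf_c)
    var = mu_0 + c * (mu_0 - mu) + mu_0 * mu * (1.0 + 1.0 / a) - mu_0 ** 2
    return mu_0, var

def brute_force_moments(mu, a, c):
    """Mean and variance of Y | Y <= c from the finite sum over y = 0..c."""
    probs = [nb_pmf(y, mu, a) for y in range(c + 1)]
    total = sum(probs)
    m1 = sum(y * p for y, p in enumerate(probs)) / total
    m2 = sum(y * y * p for y, p in enumerate(probs)) / total
    return m1, m2 - m1 ** 2

mu, a = 2.5, 1.25  # illustrative values
for c in (0, 3, 10):
    print(c, right_truncated_moments(mu, a, c), brute_force_moments(mu, a, c))
```

With $c = 0$ the only admissible value is zero, so both the mean and the variance should vanish; this gives a simple sanity check on the signs in (5) and (6).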

References

Cameron, A. C. and P. K. Trivedi, 1998. Regression analysis of count data. Cambridge University Press, Cambridge, U.K.

Cameron, A. C. and P. K. Trivedi, 2013. Regression analysis of count data, Second edition. Cambridge University Press, New York, N.Y.

Geyer, C. J., 2007. Lower-Truncated Poisson and Negative Binomial Distributions. https://cran.r-project.org/web/packages/aster/vignettes/trunc.pdf

Greene, W. H., 2012. Econometric analysis, Seventh edition. Prentice Hall, Upper Saddle River, New Jersey.

Gurmu, S. and P. K. Trivedi, 1992. Overdispersion tests for truncated Poisson regression models. Journal of Econometrics 54, 347-370.

Johnson, N. L., A. W. Kemp and S. Kotz, 2005. Univariate discrete distributions, Third edition. John Wiley & Sons, Hoboken, New Jersey.