Accepted Manuscript

Variance of the truncated negative binomial distribution

J.S. Shonkwiler

PII: S0304-4076(16)30161-0
DOI: http://dx.doi.org/10.1016/j.jeconom.2016.09.002
Reference: ECONOM 4290

To appear in: Journal of Econometrics
Received date: 29 August 2016
Revised date: 29 August 2016
Accepted date: 1 September 2016

Please cite this article as: Shonkwiler, J.S., Variance of the truncated negative binomial distribution. Journal of Econometrics (2016), http://dx.doi.org/10.1016/j.jeconom.2016.09.002

Variance of the Truncated Negative Binomial Distribution

J. S. Shonkwiler

Department of Agricultural and Applied Economics, University of Georgia, Athens, Georgia, USA.
E-mail address: [email protected]
Telephone: 706.542.0847
Fax: 706.542.0739

Abstract

Citations to formulas for the moments of the truncated negative binomial distribution usually reference the paper by Gurmu and Trivedi (1992). However, their second moments of the truncated negative binomial are incorrect. We derive the correct second moments for both the left- and right-truncated negative binomial distribution. The second moments of the truncated distributions are written in a form that shows they converge to the second moment of the un-truncated distribution when the truncated first moment approaches the un-truncated first moment.

JEL classification: C46; C18; C52
Keywords: Negative binomial; Truncation; Incomplete beta function

1. Introduction

The purpose of this note is to derive the moments of the truncated negative binomial distribution. Unfortunately, previous derivations of the second moments are incorrect, with the exception of Geyer (2007). The paper by Gurmu and Trivedi (1992; hereafter GT) appears to be the first published report to provide the moments of the truncated negative binomial distribution. The GT formula for the variance contains an error that has been propagated in the literature. The recent book by Cameron and Trivedi (2013) reproduces both moments of GT. Additionally, Johnson et al. (2005, p. 236) refer readers to both GT and the earlier edition of Cameron and Trivedi (1998) for moments and details relating to the left- and right-truncated negative binomial distributions.

Both truncated and censored count data models are widely used in areas such as recreation demand, insurance claims, shopping trips, and survey data where respondents report intervals rather than exact counts. Because the negative binomial regression model can account for over-dispersion and can nest the Poisson regression model, it is commonly used for truncated and censored count data and for hurdle or two-part models.

2. The truncated negative binomial distribution

The (generalized) negative binomial distribution can be represented as

$$P(Y = y) = \frac{\Gamma(y+a)}{\Gamma(y+1)\,\Gamma(a)} \left(\frac{\mu}{\mu+a}\right)^{y} \left(\frac{a}{\mu+a}\right)^{a} \tag{1}$$

where $a = (1/\alpha)\mu^{\rho}$. It is well known that $E[Y] = \mu$ and $V[Y] = \mu + \alpha\mu^{2-\rho}$. We will term $\alpha$ the dispersion parameter and recognize that in most settings $\rho$ will equal zero, in which case we have what has been called the NB2 distribution.
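
As an illustration of equation (1) and the stated moments, the following is a minimal Python sketch, not part of the original note. The function names `nb_pmf` and `moments`, the parameter values, and the summation cutoff `upper` are assumptions chosen for the example. It evaluates the probability mass function on the log scale and checks by direct summation that the mean is $\mu$ and the variance is $\mu + \alpha\mu^{2-\rho}$.

```python
import math

def nb_pmf(y, mu, alpha, rho=0.0):
    """Equation (1): P(Y = y), with a = mu**rho / alpha."""
    a = mu ** rho / alpha
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + y * math.log(mu / (mu + a)) + a * math.log(a / (mu + a)))
    return math.exp(log_p)

def moments(mu, alpha, rho=0.0, upper=500):
    """Mean and variance by direct summation over y = 0..upper-1.

    The cutoff `upper` is an assumption; it must be large enough that
    the neglected tail mass is negligible for the chosen parameters.
    """
    probs = [nb_pmf(y, mu, alpha, rho) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, alpha = 3.0, 0.5  # illustrative values only
for rho in (0.0, 1.0):
    mean, var = moments(mu, alpha, rho)
    # Last column is the closed-form variance mu + alpha * mu**(2 - rho).
    print(rho, mean, var, mu + alpha * mu ** (2 - rho))
```

Setting `rho=0.0` gives the NB2 case discussed in the text.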

By applying the Laplace transform to the canonical parameter for the truncated form of this distribution and by exploiting the relationship between the incomplete beta function and the cumulative distribution function of the negative binomial distribution (Johnson et al. 2005), we find that for left truncation

$$E[Y \mid Y > c] = \mu + \frac{(a+\mu)(c+1)\,h(c+1)}{a\,\bigl(1 - \Pr(Y \le c)\bigr)} = \mu^{*}. \tag{2}$$

Here $h(c+1)$ denotes the un-truncated negative binomial probability mass function (equation 1) evaluated at $c+1$. The negative binomial cumulative distribution function may be conveniently evaluated with the incomplete beta function, such as the one programmed in MATLAB. An alternative and perhaps more attractive form of (2) is equation (2.15) in GT with $r = c+1$.
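
The left-truncated mean in (2) can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's code: the function names, parameter values, and summation cutoff are hypothetical. To stay within the Python standard library, the cumulative distribution function is computed by direct summation rather than with the incomplete beta function. The identity behind the incomplete beta route is $\Pr(Y \le c) = I_p(a, c+1)$ with $p = a/(a+\mu)$, where $I_p$ is the regularized incomplete beta function; with SciPy installed, `scipy.special.betainc(a, c + 1, a / (a + mu))` should return the same value.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated pmf h(y) from equation (1), for a given a."""
    return math.exp(math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
                    + y * math.log(mu / (mu + a)) + a * math.log(a / (mu + a)))

def nb_cdf(c, mu, a):
    """Pr(Y <= c) by direct summation (stand-in for the incomplete beta)."""
    return sum(nb_pmf(y, mu, a) for y in range(c + 1))

def left_truncated_mean(c, mu, a):
    """Equation (2): E[Y | Y > c]."""
    h = nb_pmf(c + 1, mu, a)
    return mu + (a + mu) * (c + 1) * h / (a * (1.0 - nb_cdf(c, mu, a)))

def brute_force_mean(c, mu, a, upper=500):
    """E[Y | Y > c] by summing over y = c+1..upper-1 (cutoff is an assumption)."""
    ys = range(c + 1, upper)
    probs = [nb_pmf(y, mu, a) for y in ys]
    return sum(y * p for y, p in zip(ys, probs)) / sum(probs)

mu, a = 3.0, 2.0  # illustrative values only
for c in (0, 2, 5):
    print(c, left_truncated_mean(c, mu, a), brute_force_mean(c, mu, a))
```

For $c = 0$ the formula reduces to $\mu / (1 - \Pr(Y = 0))$, the familiar zero-truncated mean.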

The truncated variance can be derived as (details available from the author)

$$V(Y \mid Y > c) = \mu^{*} + c(\mu^{*} - \mu) + \mu^{*}\mu(1 + 1/a) - \mu^{*2}. \tag{3}$$

Note that as $\mu^{*} \to \mu$, $V(Y \mid Y > c) \to V(Y)$. Also, it can be shown that the GT left-truncated variance always underestimates the true left-truncated variance. For example, in the NB2 case, the GT formula for the left-truncated variance omits the term $\alpha(\mu^{*}\mu - \mu^{2})$.
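
Equation (3) can be compared with a brute-force calculation of the truncated variance. The following Python sketch is illustrative only; the function names, parameter values, and summation cutoff are assumptions, and the GT formula itself is not reproduced. It also prints the NB2 term $\alpha(\mu^{*}\mu - \mu^{2})$ mentioned above, which is positive whenever $\mu^{*} > \mu$.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated pmf h(y) from equation (1), for a given a."""
    return math.exp(math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
                    + y * math.log(mu / (mu + a)) + a * math.log(a / (mu + a)))

def left_truncated_moments(c, mu, a):
    """Return (mu_star, variance) from equations (2) and (3)."""
    tail = 1.0 - sum(nb_pmf(y, mu, a) for y in range(c + 1))  # Pr(Y > c)
    mu_star = mu + (a + mu) * (c + 1) * nb_pmf(c + 1, mu, a) / (a * tail)
    var = (mu_star + c * (mu_star - mu)
           + mu_star * mu * (1.0 + 1.0 / a) - mu_star ** 2)
    return mu_star, var

def brute_force_variance(c, mu, a, upper=500):
    """V(Y | Y > c) by summing over y = c+1..upper-1 (cutoff is an assumption)."""
    ys = range(c + 1, upper)
    probs = [nb_pmf(y, mu, a) for y in ys]
    total = sum(probs)
    m1 = sum(y * p for y, p in zip(ys, probs)) / total
    m2 = sum(y * y * p for y, p in zip(ys, probs)) / total
    return m2 - m1 * m1

mu, alpha = 3.0, 0.5   # illustrative NB2 values (rho = 0), so a = 1 / alpha
a = 1.0 / alpha
for c in (0, 2, 5):
    mu_star, var = left_truncated_moments(c, mu, a)
    omitted = alpha * (mu_star * mu - mu ** 2)  # term the note says GT omits
    print(c, var, brute_force_variance(c, mu, a), omitted)
```

Passing `c = -1` removes the truncation, so the function returns $\mu$ and $\mu + \mu^{2}/a$, which matches the limiting behavior noted in the text.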

An unpublished document by Geyer (2007) has also derived the left-truncated variance of the negative binomial distribution but did not recognize that the incomplete beta function can represent the negative binomial cumulative distribution function. Under the parameterization $\alpha = a$, $p = a/(a+\mu)$, and $k = c$, Geyer reported that

$$V(Y \mid Y > k) = \frac{\alpha(1-p)}{p^{2}} - \frac{k+1}{p^{2}(1+\beta)} \left[ -(1-p) + \frac{(k+1+\alpha)(1-p)}{1+\beta} + \bigl[\alpha - p(k+1+\alpha)\bigr] \frac{\beta}{1+\beta} \right] \tag{4}$$

where $\beta = \Pr(Y > k+1) / \Pr(Y = k+1)$. Recognizing that

$$1 + \beta = \frac{(a+\mu)(k+1)}{a(\mu^{*} - \mu)}$$

and substituting for $\alpha$, $p$, and $k$ in terms of the original parameters, it can be shown that (4) collapses to (3).
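
The collapse of (4) to (3) can be traced in a few lines. The sketch below is one possible route, not necessarily the author's. The shorthand $r = 1/(1+\beta)$ is introduced here for convenience and does not appear in the original. The steps use $V(Y) = \alpha(1-p)/p^{2} = \mu + \mu^{2}/a$, $\alpha(1-p)/p = \mu$, $(1-p)/p = \mu/a$, and, from the expression for $1+\beta$, $(k+1)r/p = \mu^{*} - \mu$.

```latex
\begin{aligned}
V(Y \mid Y > k)
 &= V(Y) - \frac{(k+1)\,r}{p^{2}}
    \Bigl\{ -(1-p) + (k+1+\alpha)(1-p)\,r
            + \bigl[\alpha - p(k+1+\alpha)\bigr](1-r) \Bigr\} \\
 &= V(Y) + \frac{(k+1)\,r}{p^{2}}
    \Bigl\{ (1-p)(1-\alpha) + p(k+1) - (k+1)\,r \Bigr\} \\
 &= V(Y) + (\mu^{*}-\mu)
    \Bigl\{ \frac{\mu}{a} - \mu + (k+1) - (\mu^{*}-\mu) \Bigr\} \\
 &= \mu + \frac{\mu^{2}}{a} + (k+1)(\mu^{*}-\mu)
    + \frac{\mu(\mu^{*}-\mu)}{a} - \mu(\mu^{*}-\mu) - (\mu^{*}-\mu)^{2} \\
 &= \mu^{*} + k(\mu^{*}-\mu) + \mu^{*}\mu\,(1 + 1/a) - \mu^{*2}.
\end{aligned}
```

With $k = c$ the last line is equation (3). The second line follows by collecting the terms in $r$ inside the braces, where the coefficients of $(k+1+\alpha)$ cancel and leave $-(k+1)r$.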

For completeness we consider the effect of right truncation on the moments:

$$E[Y \mid Y \le c] = \mu - \frac{(a+\mu)(c+1)\,h(c+1)}{a\,\Pr(Y \le c)} = \mu^{0} \tag{5}$$

and

$$V(Y \mid Y \le c) = \mu^{0} + c(\mu^{0} - \mu) + \mu^{0}\mu(1 + 1/a) - (\mu^{0})^{2}. \tag{6}$$
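
The right-truncated moments in (5) and (6) can be checked in the same way. This Python sketch is again illustrative: the function names and parameter values are assumptions, and the cumulative distribution function is summed directly rather than evaluated through the incomplete beta function. Because the support under right truncation is finite, the brute-force comparison involves no summation cutoff.

```python
import math

def nb_pmf(y, mu, a):
    """Un-truncated pmf h(y) from equation (1), for a given a."""
    return math.exp(math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
                    + y * math.log(mu / (mu + a)) + a * math.log(a / (mu + a)))

def right_truncated_moments(c, mu, a):
    """Return (mu_0, variance) from equations (5) and (6)."""
    cdf = sum(nb_pmf(y, mu, a) for y in range(c + 1))  # Pr(Y <= c)
    mu_0 = mu - (a + mu) * (c + 1) * nb_pmf(c + 1, mu, a) / (a * cdf)
    var = mu_0 + c * (mu_0 - mu) + mu_0 * mu * (1.0 + 1.0 / a) - mu_0 ** 2
    return mu_0, var

def brute_force_moments(c, mu, a):
    """Mean and variance of Y given Y <= c by summing over y = 0..c."""
    ys = range(c + 1)
    probs = [nb_pmf(y, mu, a) for y in ys]
    total = sum(probs)
    m1 = sum(y * p for y, p in zip(ys, probs)) / total
    m2 = sum(y * y * p for y, p in zip(ys, probs)) / total
    return m1, m2 - m1 * m1

mu, a = 3.0, 2.0  # illustrative values only
for c in (1, 3, 10):
    print(c, right_truncated_moments(c, mu, a), brute_force_moments(c, mu, a))
```

As $c$ grows, $\mu^{0}$ approaches $\mu$ and the variance in (6) approaches $\mu + \mu^{2}/a$, mirroring the left-truncated case.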

References

Cameron, A. C. and P. K. Trivedi, 1998. Regression analysis of count data. Cambridge University Press, Cambridge, U.K.
Cameron, A. C. and P. K. Trivedi, 2013. Regression analysis of count data, Second edition. Cambridge University Press, New York, N.Y.
Geyer, C. J., 2007. Lower-Truncated Poisson and Negative Binomial Distributions. https://cran.r-project.org/web/packages/aster/vignettes/trunc.pdf
Greene, W. H., 2012. Econometric analysis, Seventh edition. Prentice Hall, Upper Saddle River, New Jersey.
Gurmu, S. and P. K. Trivedi, 1992. Overdispersion tests for truncated Poisson regression models. Journal of Econometrics 54, 347-370.
Johnson, N. L., A. W. Kemp and S. Kotz, 2005. Univariate discrete distributions, Third edition. John Wiley & Sons, Hoboken, New Jersey.