Mixed INAR(1) Poisson regression models: Analyzing heterogeneity and serial dependencies in longitudinal count data



Journal of Econometrics 89 (1999) 317-338

Ulf Böckenholt*

Department of Psychology, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA

Abstract

This paper presents finite mixture versions of integer-valued autoregressive (INAR) Poisson regression models for investigating regularity and predictability of purchase behavior over time. The approach facilitates the analysis of heterogeneity and serial correlation effects as well as conditional and marginal analyses of the effects of covariates. An application to scanner panel data of detergents yields substantive insights into sources of autodependencies in individual category purchases. © 1999 Elsevier Science S.A. All rights reserved.

JEL classification: C14; C22; C23; C25; M31

Keywords: Autoregression; Binomial thinning; Count data; Finite mixture

1. Introduction

Key objectives of marketing researchers are understanding and predicting consumers’ buying behavior. Many in-depth studies of markets and consumers are available that provide a detailed picture of consumers’ buying habits under different pricing, advertising, and promotional activities. Although, initially, these studies dealt with consumer behavior at a particular point in time, the recent availability of electronic scanning and bar coding technologies led to numerous investigations examining longitudinal aspects of stability and change

* E-mail: [email protected]

0304-4076/99/$ - see front matter © 1999 Elsevier Science S.A. All rights reserved. PII: S0304-4076(98)00069-4


in brand preferences. In particular, much work has focused on finding regularities in repeat-buying data of nondurable goods by assuming that a customer’s purchase behavior follows a Poisson process (for an overview, see Ehrenberg, 1988; Goodhardt et al., 1984; Morrison and Schmittlein, 1981, 1988). For instance, the effects of store and marketing variables on purchase behavior are frequently analyzed by mixed Poisson regression models (Böckenholt, 1993; Wedel and DeSarbo, 1995; Ramaswamy et al., 1993). The regression part of the Poisson model captures the effect of the marketing variables and the mixing part takes into account different sources of heterogeneity in the data. The latter part is necessary because individual differences in buying behavior frequently lead to systematic violations of the mean-variance equality constraint of the Poisson distribution, with the consequence that parameter estimates are inefficient and the covariance matrix of the Poisson model is biased. By allowing for the possibility that the Poisson parameters may vary according to some distribution, less restrictive mean-variance relationships are possible (Hausman et al., 1984). However, care needs to be taken in specifying the underlying distribution of the rate parameters because incorrect specifications may lead to a loss of consistency. It is for this reason that the recent development of finite mixture Poisson regression models appears very promising for the analysis of purchase data: they allow for arbitrary heterogeneity by approximating the underlying distributions of the rate parameters and regression weights without making assumptions about their parametric form (Böckenholt, 1993; Dillon and Gupta, 1996; Wedel et al., 1993; Wang et al., 1998; Van Duijn and Böckenholt, 1995).

In contrast to the extensive research on different sources of heterogeneity in purchase data, attention has only recently shifted to the issue of time dependencies. To some extent, this may be a result of the view that individual-level purchases of frequently bought products are likely to be independent and that any observed serial dependencies are essentially nuisance parameters which do not contain useful information about the underlying choice processes. This paper takes a different position by arguing that modeling serial dependencies can reveal interesting regularities in buying behavior. New insights are obtained by the decomposition of count data into a carry-over part representing the influence of previous time periods, and an innovation part capturing the effects of the present situation on purchase behavior. A similar approach is taken in the analysis of count panel data on technological innovations by Blundell et al. (1995a, 1995b).

The main goal of this paper is to introduce mixture versions of integer-valued autoregressive (INAR) Poisson regression models. These models are based on the work by Al-Osh and Alzaid (1987), Brännäs (1995), McKenzie (1988) and Steutel and van Harn (1979). By including the finite mixture Poisson regression model as a special case, the proposed mixed INAR(1)-Poisson models facilitate separate tests of heterogeneity and serial dependency effects. Estimation and


testing issues are also discussed. An empirical illustration involving the analysis of inventory effects is presented in the last section.

2. Modeling purchase incidence behavior

2.1. Cross-sectional data

A frequently used scenario in marketing applications is that of consumers’ purchasing behavior following a Poisson process. Denote the total number of purchase incidences of person $n$ ($n = 1, \ldots, N$) within time period $t$ ($t = 0, \ldots, T$) by $X_{nt}$ and the rate parameter of the Poisson distribution by $\lambda_{nt}$. Typically, the rate parameter is expressed as a non-negative function of covariates, $\mathbf{z}_{nt}$, with $\lambda_{nt} = \exp(\beta_0 + \mathbf{z}_{nt}\boldsymbol{\beta})$. When individuals differ in their regression parameters, the mean-variance equality assumption of the Poisson model is violated in an analysis of the aggregate data. Instead, the variance of the counts exceeds their mean, which is referred to as overdispersion.

One solution to the overdispersion phenomenon is a random effects approach which postulates some distribution of the regression parameters in the population (Hausman et al., 1984; Lawless, 1987; McCullagh and Nelder, 1989). Although this approach provides a parsimonious representation of individual difference effects, it has the potential shortcoming that estimates of the regression parameters are sensitive to the specified distributional form. Misspecifications may lead to loss of efficiency and to biased regression parameter and standard error estimates (Gourieroux et al., 1984; Wang et al., 1996). To some extent these difficulties can be avoided with a finite mixture approach (cf. Titterington et al., 1985; Dillon and Kumar, 1994). Under the assumption that the population of consumers consists of $S$ mutually exclusive and exhaustive subpopulations with relative sizes $\pi_s$ ($s = 1, \ldots, S$), the finite mixture Poisson regression model can be written as
\[ \Pr(x_{nt} \mid \boldsymbol{\lambda}_{nt}, \boldsymbol{\pi}) = \sum_{s=1}^{S} \pi_s \Pr(x_{nt} \mid \lambda_{nts}), \]
where $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_S)$ and $\boldsymbol{\lambda}_{nt} = (\lambda_{nt1}, \ldots, \lambda_{ntS})$ (Böckenholt, 1993; Wedel et al., 1993; Wedel and DeSarbo, 1994).
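The mixture probability above, and the overdispersion it induces, can be illustrated with a minimal Python sketch. The function names, the single covariate, and all numerical values are hypothetical choices made for illustration only; they are not taken from the paper.

```python
import math

def poisson_pmf(x, lam):
    """Pr(X = x) for a Poisson variable with rate lam > 0."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def segment_rate(beta0, beta, z):
    """Log-linear rate exp(beta0 + z . beta) for one covariate vector z."""
    return math.exp(beta0 + sum(b * zi for b, zi in zip(beta, z)))

def mixture_pmf(x, pis, rates):
    """Finite mixture Poisson probability: sum_s pi_s * Pr(x | lambda_s)."""
    return sum(p * poisson_pmf(x, lam) for p, lam in zip(pis, rates))

def mixture_mean_var(pis, rates):
    """Mean and variance of the mixture. The variance exceeds the mean
    whenever the segment rates differ (overdispersion)."""
    mean = sum(p * lam for p, lam in zip(pis, rates))
    var = mean + sum(p * lam ** 2 for p, lam in zip(pis, rates)) - mean ** 2
    return mean, var

# Hypothetical two-segment example (illustrative numbers only).
pis = [0.4, 0.6]
rates = [segment_rate(-1.0, [0.5], [1.0]), segment_rate(0.5, [0.2], [1.0])]
mean, var = mixture_mean_var(pis, rates)
```

With a single segment the variance equals the mean, which is the Poisson equality constraint that the mixture relaxes.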
The mixture-specific rate parameters $\lambda_{nts}$ are an exponential function of covariates with $\lambda_{nts} = \exp(\beta_{0s} + \mathbf{z}_{nt}\boldsymbol{\beta}_s)$. All regression parameters are allowed to differ among the mixture components. As a result, subpopulations can be identified that vary in their reactions to marketing mix variables. Thus, in contrast to random effects models which treat overdispersion as a nuisance factor that complicates statistical inference, finite mixture models may provide additional insights about different sources of heterogeneity in the population. For example, in an application of a finite mixture Poisson


regression model to mail-order data, Wedel et al. (1993) found that although reactions of customers to direct-mail variables varied substantially, consumers could be classified into three segments with the purchase behavior of each segment showing readily interpretable differences in reactivity to the direct-mail variables.

2.2. Panel data

The Poisson regression model and its extensions are tailored for applications in which the parameters of a regression model are estimated for cross-sectional data. In this case, it is reasonable to assume independence among the observations. However, for data collected over time it may be necessary to take into account dependencies among the observations. Both observation- and parameter-driven approaches are available to model autodependencies in count data (Cox, 1981). The INAR(1)-Poisson model (McKenzie, 1988; Al-Osh and Alzaid, 1987) and the Poisson transition model (Zeger and Qaqish, 1988) are prominent examples of the former approach. The latter approach is used in the work by Böckenholt and Langeheine (1996), Harvey (1989), Smith (1979) and Zeger (1988). In parameter-driven models, time dependencies imply overdispersion. Consider, for example, the latent change model by Böckenholt and Langeheine (1996) which assumes shifts among the mixture components over time. For three time periods the latent change model may be written as
\[ \Pr(x_{nt-1}, x_{nt}, x_{nt+1} \mid \boldsymbol{\theta}) = \sum_{a}\sum_{b}\sum_{c} \pi_{abc} \Pr(x_{nt-1} \mid \lambda_{nt-1,a}) \Pr(x_{nt} \mid \lambda_{nt,b}) \Pr(x_{nt+1} \mid \lambda_{nt+1,c}), \]
where $\boldsymbol{\theta}$ is a parameter vector and $\pi_{abc}$ denotes the probability of belonging to the mixture components $a$, $b$, and $c$ during time periods $t-1$, $t$, and $t+1$, respectively. One useful hypothesis is that the stochastic switching process among the mixture components is a stationary Markov chain with $\pi_{abc} = \pi_a \pi_{b \mid a} \pi_{c \mid b}$.

Despite their flexibility in modeling both heterogeneity and time dependencies, latent change models have two shortcomings which may limit their applicability for the analysis of purchase records.
First, a latent change process implies that the marginal distribution of the counts is overdispersed. This result may be viewed as a disadvantage because, presently, most of the empirical evidence supports the notion that individual-level purchase behavior of nondurable goods is well-described by a Poisson process. Second, latent change models are not parsimonious because the number of mixture components increases exponentially with the number of time periods. The next section


focuses on an observation-driven approach proposed independently by McKenzie (1988) and Al-Osh and Alzaid (1987) and further developed by Brännäs (1995). Their INAR(1)-Poisson model is extended to facilitate the analysis of time-(in)dependent covariates. In addition, a finite mixture formulation is derived to account for unobserved heterogeneity effects in panel data.

3. A Mixed INAR(1)-Poisson regression model

The INAR(1)-Poisson model proposed by McKenzie (1988) and Al-Osh and Alzaid (1987) is attractive for the analysis of longitudinal purchase data for three reasons. First, under this model the marginal distributions of the counts are Poisson, which agrees with the current notion that purchase behavior of nondurable goods is well-described by a Poisson process. Second, the dependence structure is parsimonious, requiring only a small number of parameters. Third, the process introducing time dependence has a behavioral interpretation and may reflect additional information about the choice behavior.

The INAR(1)-Poisson model decomposes discrete observations into two parts, a carry-over part which represents the influence of previous time periods and an innovation part which captures the effects of the present situation on purchase behavior. It is shown below that this decomposition facilitates conditional and marginal analyses of the effects of covariates on purchase behavior. Thus, in contrast to the Poisson transition model (Zeger and Qaqish, 1988), the regression version of the INAR(1)-Poisson model may be used when the effects of covariates on the marginal distribution are of primary interest (whereas the dependence among observations is regarded as nuisance), and when the conditional distribution is modeled as a function of covariates.

Dependencies between past and present choice frequencies are introduced by the binomial thinning operator $\circ$ (Steutel and van Harn, 1979).
Let $X$ be a discrete random variable defined on the nonnegative integers, and $Y_i$ be a sequence of i.i.d. binary random variables, independent of $X$, such that $\Pr(Y_i = 1) = 1 - \Pr(Y_i = 0) = \alpha$ where $\alpha \in [0, 1]$. Then the thinning operator is defined by
\[ \alpha \circ X = \sum_{i=1}^{X} Y_i = B(\alpha, X), \]
where $B(\alpha, X)$ is the binomial distribution for $X$ trials with probability of success $\alpha$. By applying the binomial thinning operator $X_t$ can be decomposed into two parts. One part is a function of the number of choices at the previous time period, $C_{t-1}$, and another part, $I_t$, reflects the influence of the current choice situation,
\[ X_t = C_{t-1} + I_t, \tag{1} \]


where $C_{t-1} = \alpha_t \circ X_{t-1}$. Thus, a stochastic INAR(1) process with Poisson($\lambda_t$) [in short, $P(\lambda_t)$] margins is obtained when both $X_0$ and $I_t$ are independently Poisson distributed with parameters $\lambda_0$ and $\lambda^I_t = \lambda_t - \alpha_t\lambda_{t-1}$, for all $t$.

From this representation, it is clear that correlations can only be positive and that their size is determined by $\alpha_t$ and the marginal distributions of the choice frequencies. Because the realization of the binomial thinning operation, $c_{t-1}$, is common to both $x_{t-1}$ and $x_t$, their covariance is $\gamma_{t,t-1} = \alpha_t\lambda_{t-1}$ and the corresponding correlation is $\rho_{t,t-1} = \alpha_t\sqrt{\lambda_{t-1}/\lambda_t}$.

From Eq. (1), it follows that the conditional distribution of $X_t$ given $X_{t-1}$ is a convolution of $I_t \sim P(\lambda^I_t)$ and $C_{t-1} \sim B(\alpha_t, X_{t-1})$ which can be written as
\[ \Pr(x_t \mid x_{t-1}, \alpha_t, \lambda_t, \lambda_{t-1}) = x_{t-1}!\, \exp(-[\lambda_t - \alpha_t\lambda_{t-1}])\, (\lambda_t - \alpha_t\lambda_{t-1})^{x_t} (1-\alpha_t)^{x_{t-1}} \sum_{k=0}^{\min(x_{t-1},\, x_t)} w_k \left[ \frac{\alpha_t}{(1-\alpha_t)(\lambda_t - \alpha_t\lambda_{t-1})} \right]^{k}, \tag{2} \]

where $w_k = ((x_{t-1}-k)!\,(x_t-k)!\,k!)^{-1}$. An important property of the conditional distribution is that both the regression function and the conditional variance are linear in $x_{t-1}$ with $E(X_t \mid x_{t-1}) = \lambda^I_t + \alpha_t x_{t-1}$, and $V(X_t \mid x_{t-1}) = \lambda^I_t + (1-\alpha_t)\alpha_t x_{t-1}$, respectively. Note, however, that unlike an AR(1) process with normally distributed innovations, $X_t$ given $\lambda^I_t$ and $x_{t-1}$ is still a random variable. In addition, when $x_{t-1}$ exceeds $\lambda_{t-1}/(1-\alpha_t)$, $V(X_t \mid x_{t-1})$ is larger than $V(X_t)$.

Because of its Markovian structure, $\Pr(x_t \mid x_{t-1}, x_{t-2}, \ldots) = \Pr(x_t \mid x_{t-1})$, and the joint distribution for $T+1$ time periods is given by
\[ \Pr(x_0, x_1, \ldots, x_T \mid \boldsymbol{\lambda}, \boldsymbol{\alpha}) = \Pr(x_0 \mid \lambda_0) \prod_{t=1}^{T} \Pr(x_t \mid x_{t-1}; \lambda_t, \lambda_{t-1}, \alpha_t), \tag{3} \]
where $\boldsymbol{\lambda} = (\lambda_0, \lambda_1, \ldots, \lambda_T)$ and $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_T)$.

3.1. Alternative parameterizations and extensions

Different expressions for the transition probabilities can be derived which may prove useful when covariates are taken into account. In Eq. (2) the transition probabilities are expressed in terms of the marginal rate parameters and the thinning probabilities. Alternatively, the transition probabilities can be written as a function of the innovation rate parameters by substituting $\lambda^I_t$ for $\lambda_t - \alpha_t\lambda_{t-1}$ in Eq. (2). Moreover, instead of separating the carry-over effect of a previous time period into $\alpha_t$ and $\lambda_{t-1}$, it may be of interest to write it directly as $\gamma_{t,t-1}$. As a result, by choosing an appropriate specification, conditional and marginal analyses of the effects of covariates can be conducted.
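The decomposition into a thinned carry-over part and a Poisson innovation, and the resulting transition probabilities, can be sketched in Python as follows. This is a minimal illustration under the assumption of time-constant rate and thinning parameters in the simulation; the function names and the Knuth-style Poisson sampler are choices made here, not part of the paper.

```python
import math
import random

def poisson_pmf(x, lam):
    """Pr(X = x) for a Poisson variable with rate lam > 0."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def binom_pmf(k, n, a):
    """Pr(K = k) for K ~ B(a, n), the distribution of the thinned count."""
    return math.comb(n, k) * a ** k * (1.0 - a) ** (n - k)

def transition_pmf(x_t, x_prev, alpha, lam_t, lam_prev):
    """Pr(x_t | x_prev) as the convolution of the binomial carry-over
    and a Poisson innovation with rate lam_t - alpha * lam_prev."""
    lam_innov = lam_t - alpha * lam_prev
    if lam_innov <= 0.0:
        raise ValueError("innovation rate must be positive")
    return sum(binom_pmf(k, x_prev, alpha) * poisson_pmf(x_t - k, lam_innov)
               for k in range(min(x_prev, x_t) + 1))

def draw_poisson(lam, rng):
    """Knuth's multiplication sampler; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_inar1(lam, alpha, n_steps, seed=0):
    """Simulate a stationary INAR(1) path with Poisson(lam) margins."""
    rng = random.Random(seed)
    x = draw_poisson(lam, rng)
    path = [x]
    for _ in range(n_steps):
        carry = sum(rng.random() < alpha for _ in range(x))  # alpha o x
        x = carry + draw_poisson(lam * (1.0 - alpha), rng)   # add innovation
        path.append(x)
    return path
```

Summing `x * transition_pmf(x, x_prev, ...)` over `x` recovers the linear regression function, innovation rate plus `alpha * x_prev`, which is a convenient numerical check of the convolution.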


Conceptually, higher-order extensions of the autoregressive process are straightforward. For simplicity, consider a second-order process defined by
\[ X_t = \alpha_1 \circ X_{t-1} + \alpha_2 \circ X_{t-2} + I_t, \tag{4} \]
where $\sum_i \alpha_i < 1$, and $I_t$ is distributed independently of $X_{t-1}$ and $X_{t-2}$. An important feature of this Poisson INAR(2) process is that its regression function of $x_t$ on $x_{t-1}$ and $x_{t-2}$ is not linear in general and can be written as
\[ E(X_t \mid x_{t-1}, x_{t-2}) = \lambda^I_t + \alpha_1 x_{t-1} + \alpha_2 \lambda_{t-2} \frac{\Pr(x_{t-1}, x_{t-2}-1)}{\Pr(x_{t-1}, x_{t-2})}, \]
where $\Pr(x_{t-1}, x_{t-2})$ is the bivariate Poisson function (McKendrick, 1926). As a result, approaches that are based on the assumption of a linear regression in lagged counts are not consistent with a higher-order autoregressive Poisson model. Unfortunately, maximum likelihood estimation of a higher-order autoregressive Poisson model is not straightforward because it lacks the Markovian property (Alzaid and Al-Osh, 1990).

3.2. A finite mixture representation

For the analysis of purchase records at the aggregate level, it is assumed that the heterogeneous population of consumers consists of $S$ mutually exclusive and exhaustive segments. Within each segment the time-dependent counts are described by an INAR(1)-Poisson regression model, and the probability of observing a sequence of purchase frequencies, $\mathbf{x}_n = (x_{n0}, x_{n1}, \ldots, x_{nT})$, by person $n$ given that this person is a member of segment $s$ is
\[ \Pr(\mathbf{x}_n \mid \boldsymbol{\lambda}_{ns}, \boldsymbol{\alpha}_n) = \Pr(x_{n0} \mid \lambda_{n0s}) \prod_{t=1}^{T} \Pr(x_{nt} \mid x_{nt-1}; \lambda_{nts}, \lambda_{nt-1,s}, \alpha_{nt}), \]
where $\boldsymbol{\alpha}_n = (\alpha_{n1}, \ldots, \alpha_{nT})$, $\boldsymbol{\lambda}_{ns} = (\lambda_{n0s}, \lambda_{n1s}, \ldots, \lambda_{nTs})$, and $\lambda_{nts}$ is the mean rate of person $n$ in class $s$ during time period $t$. The marginal probability of observing a sequence of purchase counts is
\[ \Pr(\mathbf{x}_n \mid (\boldsymbol{\lambda}_{n1}, \ldots, \boldsymbol{\lambda}_{nS}), \boldsymbol{\alpha}_n, \boldsymbol{\pi}) = \sum_{s=1}^{S} \pi_s \Pr(x_{n0} \mid \lambda_{n0s}) \prod_{t=1}^{T} \Pr(x_{nt} \mid x_{nt-1}; \lambda_{nts}, \lambda_{nt-1,s}, \alpha_{nt}). \tag{5} \]
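As a hedged illustration of Eq. (5), the following Python sketch evaluates the log of the mixture probability of one count sequence. It assumes a thinning probability that is constant over time and shared by all segments; the function names are hypothetical, and the log-sum-exp step is a standard numerical device added here rather than something described in the paper.

```python
import math

def poisson_pmf(x, lam):
    """Pr(X = x) for a Poisson variable with rate lam > 0."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def transition_pmf(x_t, x_prev, alpha, lam_t, lam_prev):
    """INAR(1)-Poisson transition probability Pr(x_t | x_prev)."""
    lam_innov = lam_t - alpha * lam_prev
    return sum(math.comb(x_prev, k) * alpha ** k * (1.0 - alpha) ** (x_prev - k)
               * poisson_pmf(x_t - k, lam_innov)
               for k in range(min(x_prev, x_t) + 1))

def sequence_loglik(xs, lams, alpha):
    """Log-probability of the sequence xs within one segment;
    lams[t] is the marginal rate in period t."""
    ll = math.log(poisson_pmf(xs[0], lams[0]))
    for t in range(1, len(xs)):
        ll += math.log(transition_pmf(xs[t], xs[t - 1], alpha,
                                      lams[t], lams[t - 1]))
    return ll

def mixture_loglik(xs, pis, lams_by_segment, alpha):
    """Log of the segment-weighted sequence probability (log-sum-exp)."""
    terms = [math.log(p) + sequence_loglik(xs, lams, alpha)
             for p, lams in zip(pis, lams_by_segment)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

Setting `alpha` to zero reduces each segment to independent Poisson counts, which gives the finite mixture Poisson model as the special case mentioned in the Introduction.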

3.2.1. Conditional distributions

As for the $N = 1$ case, it is instructive to derive the conditional distributions of the finite mixture models with an autoregressive component. The conditional


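expectation and variance can be written as
\[ E(X_t \mid x_{t-1}) = \sum_{s=1}^{S} \pi_{s \mid x_{t-1}} \lambda^I_{ts} + \alpha_t x_{t-1} \]
and
\[ V(X_t \mid x_{t-1}) = \sum_{s} \pi_{s \mid x_{t-1}} \lambda^I_{ts} + \sum_{s} \pi_{s \mid x_{t-1}} (\lambda^I_{ts})^2 - \Bigl( \sum_{s} \pi_{s \mid x_{t-1}} \lambda^I_{ts} \Bigr)^2 + (1-\alpha_t)\alpha_t x_{t-1}, \]
where
\[ \pi_{s \mid x_{t-1}} = \frac{\pi_s \exp(-\lambda_{t-1,s})\, \lambda_{t-1,s}^{x_{t-1}}}{\sum_{s'} \pi_{s'} \exp(-\lambda_{t-1,s'})\, \lambda_{t-1,s'}^{x_{t-1}}}. \]
Although, in general, the regression function is not linear in $x_{t-1}$, it is clear that the slope of the regression function depends on the size of the autoregressive component. The impact of the autoregressive component decreases exponentially as a function of the lag size. Thus, when conditioning on $x_{t-k}$, we obtain
\[ E(X_t \mid x_{t-k}) = \sum_{s=1}^{S} \pi_{s \mid x_{t-k}} (\lambda_{ts} - \alpha^k \lambda_{t-k,s}) + \alpha^k x_{t-k}. \tag{6} \]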

In contrast, under a finite mixture model without an autoregressive component the slope of the regression function does not depend on the lag size and simplifies to
\[ E(X_t \mid x_{t-k}) = \sum_{s=1}^{S} \pi_{s \mid x_{t-k}} \lambda_{ts}. \tag{7} \]
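The contrast between Eqs. (6) and (7) can be made concrete with a short Python sketch. It assumes segment rates and a thinning probability that do not change over time, so that the rates at lag $k$ equal the current ones; function names and numbers are illustrative assumptions.

```python
import math

def posterior_weights(x_lag, pis, rates):
    """Segment weights given the lagged count x_lag; the factorial term
    of the Poisson probability cancels in the ratio."""
    raw = [p * math.exp(-lam) * lam ** x_lag for p, lam in zip(pis, rates)]
    total = sum(raw)
    return [r / total for r in raw]

def cond_mean_inar(x_lag, k, pis, rates, alpha):
    """Lag-k regression function with an autoregressive component,
    assuming time-constant segment rates and thinning probability."""
    w = posterior_weights(x_lag, pis, rates)
    a_k = alpha ** k
    return sum(ws * (lam - a_k * lam) for ws, lam in zip(w, rates)) + a_k * x_lag

def cond_mean_mixture(x_lag, pis, rates):
    """Regression function of a mixture without autoregression: only the
    segment weights depend on the lagged count, not the lag size."""
    w = posterior_weights(x_lag, pis, rates)
    return sum(ws * lam for ws, lam in zip(w, rates))
```

Evaluating `cond_mean_inar` for `k = 1` and `k = 2` at the same lagged count shows the lag-dependent slope, whereas `cond_mean_mixture` returns the same value for every lag.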

3.2.2. Exogenous variables and profiling of segments

The individual rate parameters as well as thinning and covariance parameters are expressed as a function of covariates which are assumed to satisfy the weak exogeneity condition (Brännäs, 1995; Engle et al., 1983). For example, the marginal rate parameter at time period $t$ can be represented as
\[ \lambda_{nts} = \exp\Bigl( \beta_{0s} + \sum_{l} z_{ntl}\beta_{ls} \Bigr), \]
where $z_{ntl}$ is the value of the $l$th covariate observed for person $n$ during time period $t$. Alternatively, the innovation term may be related to exogenous


variables,
\[ \lambda^I_{nts} = \exp\Bigl( \beta^I_{0s} + \sum_{l} z_{ntl}\beta^I_{ls} \Bigr). \]
In addition, the carry-over probability, $\alpha_{nt}$, or the covariance between adjacent time periods, $\gamma_{nt,t-1}$, can be expressed as a logistic or exponential function of covariates, respectively,
\[ \alpha_{nt} = \Bigl[ 1 + \exp\Bigl( -\delta_0 - \sum_{l} z_{ntl}\delta_l \Bigr) \Bigr]^{-1}, \]
and
\[ \gamma_{nt,t-1} = \exp\Bigl( \varphi_0 + \sum_{l} z_{ntl}\varphi_l \Bigr). \]
In some applications it may prove useful to express prior probabilities of segment membership as a logistic function of demographic and socio-economic variables by writing

\[ \pi_{s \mid d_n} = \frac{\exp(\tau_{0s} + \sum_{l} d_{nl}\tau_{ls})}{\sum_{s'=1}^{S} \exp(\tau_{0s'} + \sum_{l} d_{nl}\tau_{ls'})}, \]
where $d_{nl}$ is the value of the $l$th demographic covariate observed for person $n$. For reasons of identifiability we set $\tau_{lS} = 0$, $l = 0, \ldots, L$ (Dayton and Macready, 1988). Demographic variables that are predictive of segment membership are useful in targeting consumer segments and improving the accuracy in assigning individuals to segments (Dillon et al., 1993; Gupta and Chintagunta, 1994).

3.2.3. Estimation

An expectation-maximization (EM) algorithm is used for parameter estimation of the mixed INAR(1)-Poisson regression model (Dempster et al., 1977). Because detailed derivations of the EM algorithm have been given elsewhere (Wang et al., 1998; Wedel et al., 1993), only the major results necessary for implementing the algorithm are presented.

Assuming random sampling of $N$ individual count vectors $\mathbf{x}_n$, we specify the log-likelihood function as
\[ \ln L = \sum_{n=1}^{N} \ln \Pr(\mathbf{x}_n \mid \boldsymbol{\theta}) = \sum_{n=1}^{N} \ln \sum_{s=1}^{S} \pi_{s \mid d_n} \Pr(\mathbf{x}_n \mid \boldsymbol{\theta}_s), \]


where $\boldsymbol{\theta}$ and $\boldsymbol{\theta}_s$ are vectors containing the full and class-specific set of parameters, respectively. In the E-step, the posterior probabilities of class membership are determined as
\[ \pi_{s \mid x_n, d_n} = \pi_{s \mid d_n} \Pr(\mathbf{x}_n \mid \boldsymbol{\theta}_s) / \Pr(\mathbf{x}_n \mid \boldsymbol{\theta}), \]
and, in the M-step, the log-likelihood function of the ‘complete’ data is maximized with respect to the class size and regression parameters,
\[ \ln L_c = \sum_{n=1}^{N} \sum_{s=1}^{S} \pi_{s \mid x_n, d_n} \bigl( \ln \Pr(\mathbf{x}_n \mid \boldsymbol{\theta}_s) + \ln \pi_{s \mid d_n} \bigr). \]
At each M-step estimates of $\pi_{s \mid d_n}$ and the regression parameters are obtained by Newton-Raphson methods. Although the likelihood function of the regression parameters is easy to evaluate and maximize, in the reported applications the conditional likelihood is maximized by omitting $\Pr(x_{n0} \mid \lambda_{n0})$. Derivatives of the log conditional likelihood function are obtained by following Sprott’s (Sprott, 1983) approach for estimating parameters of a convolution. Monte Carlo studies of the INAR(1) model without covariates are reported by Al-Osh and Alzaid (1987), Ronning and Jung (1992), and Brännäs (1995).

3.2.4. Identifiability

By using Theorem 2 of Teicher (1963) it is straightforward to show that the mixed INAR(1)-Poisson regression model is identifiable. Al-Hussaini and Ahmad (1981) apply this theorem to prove the identifiability of the bivariate Poisson model. The identifiability of mixtures of the conditional distribution given by Eq. (2) follows from the result that





\[ \lim_{c \to \infty} \frac{M_2(c \mid x_{t-1})}{M_1(c \mid x_{t-1})} = 0, \]
where
\[ M_s(c \mid x_{t-1}) = \exp\bigl(\lambda^I_{ts}(\exp(c) - 1)\bigr)\, \bigl(1 - \alpha_t(1 - \exp(c))\bigr)^{x_{t-1}} \]
is the moment generating function of Eq. (2) and $\lambda^I_{t1} > \lambda^I_{t2} > 0$. Because each of the conditional distributions is identifiable, it can be shown that their product given in Eq. (3) is also identifiable (Teicher, 1967). A sufficient condition for the identifiability of the regression part of the finite mixture models is that the matrices of covariates, $Z_t = [z_{ntl}]$ and $D = [d_{nl}]$, are of full rank.
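The alternation between E- and M-steps described in Section 3.2.3 can be sketched in Python for a deliberately simplified case: an intercept-only finite mixture of Poisson counts without carry-over (thinning probability zero). In that special case the M-step has a closed form, so this sketch does not use the Newton-Raphson updates that the regression model requires; all names and starting values are hypothetical.

```python
import math

def log_poisson(x, lam):
    """Log of the Poisson probability Pr(X = x) for rate lam > 0."""
    return -lam + x * math.log(lam) - math.lgamma(x + 1)

def em_poisson_mixture(data, pis, rates, n_iter=200):
    """EM for an intercept-only Poisson mixture. data is a list of
    per-person count sequences; pis and rates are starting values."""
    n_seg = len(pis)
    post = []
    for _ in range(n_iter):
        # E-step: posterior segment probabilities for each person.
        post = []
        for xs in data:
            logs = [math.log(p) + sum(log_poisson(x, lam) for x in xs)
                    for p, lam in zip(pis, rates)]
            m = max(logs)
            w = [math.exp(v - m) for v in logs]
            total = sum(w)
            post.append([wi / total for wi in w])
        # M-step: closed-form updates of segment sizes and rates.
        pis = [sum(p[s] for p in post) / len(data) for s in range(n_seg)]
        rates = [sum(p[s] * sum(xs) for p, xs in zip(post, data))
                 / sum(p[s] * len(xs) for p, xs in zip(post, data))
                 for s in range(n_seg)]
    return pis, rates, post
```

For example, `em_poisson_mixture(data, [0.5, 0.5], [1.0, 4.0])` returns updated segment sizes, segment rates, and the posterior membership probabilities from the final E-step.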


3.2.5. Model selection and tests

Although in some applications the number of segments $S$ may be deduced from an underlying theory about potential sources of heterogeneity in a data set, typically $S$ is unknown. Unfortunately, standard likelihood ratio tests are not valid for determining $S$ and, presently, only heuristics are available that provide some guidance in selecting the true number of segments. In the reported application, Akaike’s information criterion (AIC) (Akaike, 1973) is used, which penalizes the log likelihood by the number of parameters. Several Monte Carlo studies demonstrated that the AIC criterion is useful in choosing the correct number of segments of mixed Poisson regression models. When AIC does not select the true model, it frequently chooses models with too many segments (Wang et al., 1996). We therefore consider $S$ determined on the basis of AIC as an upper limit of the number of segments in a data set.

After specifying $S$, detailed residual analyses are essential for assessing the quality of the model fit. Marginal fits of the mixed Poisson regression model may be determined by computing Pearson residuals which satisfy
\[ \hat{\tau}^{*}_{nt} = \frac{x_{nt} - \sum_{s} \hat{\pi}_{s \mid d_n} \hat{\lambda}_{nts}}{\sqrt{\sum_{s} \hat{\pi}_{s \mid d_n} \hat{\lambda}_{nts} + \sum_{s} \hat{\pi}_{s \mid d_n} \hat{\lambda}_{nts}^{2} - \bigl(\sum_{s} \hat{\pi}_{s \mid d_n} \hat{\lambda}_{nts}\bigr)^{2}}}. \]
The sum of the squared Pearson residuals yields a Pearson goodness-of-fit statistic for specified $t$ (Wang et al., 1996). In a similar fashion, the conditional fit of the mixed INAR(1)-Poisson regression model can be assessed by computing
\[ \hat{\tau}^{**}_{nt} = \frac{x_{nt} - \bigl(\sum_{s=1}^{S} \hat{\pi}_{s \mid x_{nt-1}, d_n} \hat{\lambda}^{I}_{nts} + \hat{\alpha}_{nt} x_{nt-1}\bigr)}{\sqrt{\sum_{s} \hat{\pi}_{s \mid x_{nt-1}, d_n} \hat{\lambda}^{I}_{nts} + \sum_{s} \hat{\pi}_{s \mid x_{nt-1}, d_n} (\hat{\lambda}^{I}_{nts})^{2} - \bigl(\sum_{s} \hat{\pi}_{s \mid x_{nt-1}, d_n} \hat{\lambda}^{I}_{nts}\bigr)^{2} + (1-\hat{\alpha}_{nt})\hat{\alpha}_{nt} x_{nt-1}}}. \]
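Both residuals are simple functions of the fitted segment weights and rates, as the following Python sketch shows for a single observation. The inputs are assumed to be estimates already obtained from a fitted model; the function names are hypothetical.

```python
import math

def marginal_pearson(x, weights, rates):
    """Marginal Pearson residual: observed count minus the mixture mean,
    scaled by the square root of the mixture variance."""
    mean = sum(w * lam for w, lam in zip(weights, rates))
    var = mean + sum(w * lam ** 2 for w, lam in zip(weights, rates)) - mean ** 2
    return (x - mean) / math.sqrt(var)

def conditional_pearson(x, x_prev, weights, innov_rates, alpha):
    """Conditional Pearson residual given the previous count x_prev;
    weights are segment probabilities conditional on x_prev."""
    m = sum(w * lam for w, lam in zip(weights, innov_rates))
    mean = m + alpha * x_prev
    var = (m + sum(w * lam ** 2 for w, lam in zip(weights, innov_rates))
           - m ** 2 + (1.0 - alpha) * alpha * x_prev)
    return (x - mean) / math.sqrt(var)
```

Squaring and summing `marginal_pearson` over persons for a fixed period gives the goodness-of-fit statistic mentioned in the text.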

In addition to these global tests of fit, a variety of specific tests are available to determine whether the data are overdispersed with respect to the Poisson distribution and whether the individual-level data are consistent with an INAR(1)-Poisson model. Overdispersion of the marginal distributions can be assessed by applying the score test statistics proposed by Dean (1992). Mills and Seneta (1991) developed a straightforward test to investigate the validity of the assumptions underlying the INAR(1)-Poisson model. For large $T$ they obtain the remarkable result that under the null hypothesis of a stationary INAR(1)-Poisson model normed sample partial autocorrelations at lag $\geq 2$ are both asymptotically jointly normal and independent. Consequently, on the basis of a single time series $(x_{n0}, \ldots, x_{nT})$
\[ \frac{T\hat{\eta}^{2}_{k+1}}{\hat{\sigma}^{2}_{k+1}} \sim \chi^{2}(1) \quad \text{and} \quad T \sum_{k=1}^{K} \frac{\hat{\eta}^{2}_{k+1}}{\hat{\sigma}^{2}_{k+1}} \sim \chi^{2}(K), \]


where gL is the partial autocorrelation at lag k estimated by least squares and for I k*2 aI(1!a)(1!a!a#2a) . p"1# I (1#a)(1#a#a)j' Their proposed test is almost identical with Quenouille’s test for a stationary AR(1) process which, partially, is a result of the fact that $\sigma^{2}_{k} \approx 1$ for moderate $\alpha$.

4. Analysis of category purchase incidence data

It is well-known that category purchase frequencies observed for adjacent weeks are positively correlated. However, the sources of this positive correlation are unclear. A frequently made assumption is that these correlations are exclusively a result of individual differences in purchase rates and reactions to marketing mix variables. Although this assumption has never been systematically tested, some indirect evidence that it may be incomplete was recently obtained by Kahn and Morrison (1989). These authors found that purchase incidence data are less strongly correlated for light buyers than for heavy buyers. However, their result was obtained by inspecting histograms of interpurchase times, which is a problematic procedure because empirical interpurchase times are frequently right censored (Wheat and Morrison, 1990). The following investigations provide strong evidence that there are systematic serial dependencies in category purchase data after accounting for various sources of heterogeneity. This result is obtained by fitting the INAR(1) model to members of a random subsample of an A. C. Nielsen ERIM database. Predictors of the size of these serial correlations are identified by the autoregressive part of the mixed INAR(1)-Poisson regression model.

4.1. Data and variables

The data used in this application are powder detergent purchases. Records of purchases and marketing-mix variables are available for about 5000 households. These records were obtained on the basis of an identification card issued to each household.
When this card was presented at the checkout counter of participating stores, purchases were recorded electronically. The detergent purchases are available on a weekly basis. A time period of 32 weeks was specified and separated into three intervals of 8, 16, and 8 weeks. $N = 100$ panel members were randomly selected provided they made at least one category purchase during the first and third time period. Over the specified time period of 32 weeks, the weekly average quantity purchase of a panel member is 29.9 oz, and


19.3, 25.8, and 35.0 oz are the 25th, 50th, and 75th percentiles, respectively. The corresponding weekly average category purchase incidence is 0.48 with an interquartile range of 0.28. The covariates used in the analysis were purchased quantity and the demographic variables household size, income and working status for profiling the segments of the mixture models.

In many studies, quantity purchase is used to estimate the inventory of a household which is assumed to affect the shopping rate of a household (Gupta, 1991; Neslin et al., 1985). Specifically, household inventory during time period $t$, $z_{it}$, is operationalized as
\[ z_{it} = z_{i,t-1} + z_{q,t-1} - \bar{z}_q = \sum_{h=0}^{t-1} (z_{q,t-h-1} - \bar{z}_q), \tag{8} \]
where $z_{qt}$ is the purchased quantity during time period $t$ and $z_{q0} = \bar{z}_q$. The average quantity purchase $\bar{z}_q$ is interpreted as consumption rate of a household. Unfortunately, little is known about how accurately this inventory estimate reflects the actual inventory of a household. When applying Eq. (8), two obvious difficulties arise. First, for some time periods the inventory estimate may become negative, particularly when the household purchases are infrequent. In this case, it seems reasonable to re-set the inventory value to zero. However, this ad-hoc solution may lead to poor inventory estimates at subsequent time points. Second, the recursive computation of inventory in Eq. (8) introduces high serial dependencies complicating the estimation of the effect of this variable. The INAR(1)-Poisson approach facilitates an alternative investigation of the inventory effect that avoids these difficulties.

According to the INAR(1)-Poisson model, the purchase rate $\lambda_t$ can be decomposed as $\lambda_t = \lambda^I_t + \alpha\lambda_{t-1}$. Under the hypothesis that the purchased quantity at a previous time period affects the current innovation part of the purchase rate, $\lambda^I_t = z_{q,t-1} - \bar{z}_q$, we obtain
\[ \lambda_t = \sum_{h=0}^{t-1} \alpha^{h} (z_{q,t-h-1} - \bar{z}_q). \tag{9} \]
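The arithmetic linking Eqs. (8) and (9) can be checked with a small Python sketch. It only illustrates the two recursions: the quantities and the consumption rate are made-up numbers, inventory is not truncated at zero, and the resulting values are not constrained to be valid (positive) rates.

```python
def inventory(quantities, consumption):
    """Running inventory: previous inventory plus last period's purchased
    quantity minus the consumption rate, starting from zero."""
    inv = [0.0]
    for q_prev in quantities[:-1]:
        inv.append(inv[-1] + q_prev - consumption)
    return inv

def carryover_weighted(quantities, consumption, alpha):
    """Geometrically weighted sum of lagged quantity deviations, computed
    by the recursion value_t = (q_{t-1} - consumption) + alpha * value_{t-1}."""
    out = [0.0]
    for q_prev in quantities[:-1]:
        out.append(q_prev - consumption + alpha * out[-1])
    return out

# Hypothetical weekly quantities (oz) and consumption rate.
q = [32.0, 0.0, 64.0, 0.0, 0.0]
inv = inventory(q, 20.0)
weighted = carryover_weighted(q, 20.0, 0.5)
```

With `alpha = 1.0` the weighted sum coincides with the inventory series, while smaller values of `alpha` discount purchases made further in the past.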

The right-hand side of Eq. (9) reduces to Eq. (8) when the carry-over probability $\alpha$ is equal to one. For $\alpha < 1$, the impact of lagged quantity purchases depends on the size of the carry-over probability. This representation suggests that the carry-over probabilities may be systematically affected by a household’s quantity purchases. For households with high average quantity purchases carry-over probabilities may be larger than for households with low average quantity purchases because in the latter case lagged quantity purchases may be of little predictive value. Similarly, a high value of $z_{q,t-1}$ may reduce the effect of previous quantity purchases. Thus, the


Fig. 1. Observed lag 1 and lag 2 autoregression functions for panelists with low and high average quantity purchases.

size of $\alpha$ may be moderated by both the average quantity purchase and the purchased quantity at $(t-1)$. This hypothesis can be tested by expressing $\alpha_t$ as a logistic function of $z_{q,t-1}$ and $\bar{z}_q$,
\[ \alpha_t = \bigl(1 + \exp[-(\delta_0 + \delta_1 z_{q,t-1} + \delta_2 \bar{z}_q)]\bigr)^{-1}, \tag{10} \]
where $\delta_1 < 0$ and $\delta_2 > 0$. Preliminary support for the hypothesized effect of the average quantity variable is obtained from Fig. 1. This figure contains the averaged autoregression functions of the purchase incidences with a lag of 1 and 2 for panel members with low and high average quantity purchases. Panel members were assigned to the low and high categories on the basis of a median split of their $\bar{z}_q$ values. Two observations are noteworthy. First, there appear to be both intercept and slope differences between the low and high $\bar{z}_q$ groups, suggesting that consumers with high average quantity purchases have larger innovation rates and stronger autodependencies than consumers with low average quantity purchases. Second, there appear to be intercept and slope differences for the lag 1 and lag 2 autoregression functions. These effects are expected under the mixed INAR(1)-Poisson model but not under the mixed Poisson regression models without autodependencies [see Eqs. (6) and (7)].
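A minimal Python sketch of Eq. (10) shows how the hypothesized signs translate into carry-over probabilities and into the slope of the lag 1 regression function. The coefficient values below are invented for illustration and are not estimates from the paper.

```python
import math

def carry_over_prob(z_prev, z_bar, d0, d1, d2):
    """Logistic carry-over probability as a function of last period's
    purchased quantity z_prev and the average quantity z_bar."""
    return 1.0 / (1.0 + math.exp(-(d0 + d1 * z_prev + d2 * z_bar)))

def conditional_mean(x_prev, lam_innov, alpha):
    """INAR(1) regression function: innovation rate plus alpha * x_prev."""
    return lam_innov + alpha * x_prev

# Hypothetical coefficients with d1 < 0 and d2 > 0.
d0, d1, d2 = -1.0, -0.02, 0.05
alpha_low = carry_over_prob(0.0, 15.0, d0, d1, d2)    # low average quantity
alpha_high = carry_over_prob(0.0, 40.0, d0, d1, d2)   # high average quantity
alpha_after_purchase = carry_over_prob(64.0, 40.0, d0, d1, d2)
```

Under these assumed signs, a household with a higher average quantity has a steeper regression of current on previous purchase incidence, and a large purchase in the previous period flattens it again.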


4.2. Benchmark models

In the analysis of the data several benchmark models are included. These models include mixed Poisson regressions without autodependencies and the first-order Poisson Markov chain proposed by Zeger and Qaqish (1988). According to the latter model, transition probabilities are specified by a Poisson distribution with rate parameter

$$E(x_{nt} \mid x_{nt-1}) = \lambda^*_{nt} = \exp\Big(\beta_0 + \sum_l z_{ntl}\beta_l\Big)\left[\frac{x^*_{nt-1}}{\exp\big(\beta_0 + \sum_l z_{nt-1,l}\beta_l\big)}\right]^{f_t},$$

where $x^*_{nt-1} = \max(x_{nt-1}, c)$, $0 < c < 1$. Autocorrelations are represented by $f_t$, which is expressed as $f_t = \delta^*_0 + \delta^*_1 z_{q,t-1} + \delta^*_2 \bar{z}_q$. The parameter $c$ determines the probability that $x_{nt} > 0$ given $x_{nt-1} = 0$. In contrast to the INAR(1)-Poisson model, the limiting stationary distribution of the Poisson transition model is not known. A distinctive feature of the first-order Markov chain is that it specifies a multiplicative effect through $x^*_{t-1}$,

$$\frac{\partial E(x_{nt} \mid x_{nt-1})}{\partial z_{ntl}} = \lambda^*_{nt}\left[\beta_l + \delta^*_l \ln\left(\frac{x^*_{nt-1}}{\exp\big(\beta_0 + \sum_{l'} z_{nt-1,l'}\beta_{l'}\big)}\right)\right]$$



(Blundell et al., 1995). For comparison, under the INAR(1)-Poisson model the immediate effect of a change in $z_{ntl}$ is given by

$$\frac{\partial E(x_{nt} \mid x_{nt-1})}{\partial z_{ntl}} = \beta_l \lambda^I_{nt} + \delta_l x_{nt-1}.$$

4.3. Results

The results of the two mixed Poisson regression models are presented first. The rate parameters of these models are expressed as a function of a household's consumption rate and either the inventory measure or the quantity purchase at $(t-1)$. Inventory was computed according to Eq. (8) and constrained to be non-negative. Although $z_{it}$ and $z_{q,t-1}$ were computed on the basis of the 32 weeks, data analyses are restricted to the second time interval of 16 weeks. Previous studies showed that consumers are more likely to wait if they have a large product inventory (Ailawadi and Neslin, 1996; Bucklin and Lattin, 1991; Chintagunta, 1993; Gupta, 1991; Neslin et al., 1985). As a result, a negative effect of inventory on the number of future purchase incidences may be expected. One-, two-, and three-segment solutions were fitted to the weekly purchase data of the 100 consumers. The data of the first week were omitted from the analyses to obtain log-likelihood values that are comparable to the conditional log-likelihood values of the autoregressive regression models. In the two- and three-segment solutions the intercepts and the effects of inventory (or quantity) were allowed to differ among segments to test whether some consumers are more sensitive to their inventory (or quantity) level than others (Neslin and Schneider Stone, 1996).

The log-likelihoods, number of estimated parameters, and the corresponding AIC statistics of the one-, two-, and three-segment Poisson regression models with either $z_{it}$ or $z_{q,t-1}$ as a covariate are listed in Table 1. The parameter estimates of both two-segment solutions are presented in Tables 2 and 3. Table 2 shows that the two segments with inventory as a covariate differ predominantly in their intercepts. Purchase incidence rates depend strongly on past inventory levels. The effects of inventory are negative and of about equal size for both segments. Thus, segment-specific inventory sensitivity is not observed. The size of the first segment is estimated as $\hat{\pi}_1 = \exp(-0.52)/(1 + \exp(-0.52)) = 0.37$. Similar results are obtained for the two-segment solution (in Table 3) with $z_{q,t-1}$ as a covariate. Note, however, that the effect of the quantity purchase at $(t-1)$ is positive, which casts some doubt on the adequacy of this mixed Poisson model.

Table 1
Fit statistics of mixed Poisson regression models

| No. segments | No. parameters | Quantity: $-\log l$ | Quantity: AIC | Inventory: $-\log l$ | Inventory: AIC |
|---|---|---|---|---|---|
| 1 | 3 | 1248.5 | 2503.0 | 1253.3 | 2512.6 |
| 2 | 7 | 1237.3 | 2488.6 | 1240.0 | 2494.0 |
| 3 | 11 | 1237.2 | 2496.4 | 1239.3 | 2500.6 |

Table 2
Parameter estimates (and standard errors) of two-segment Poisson regression model (with inventory)

| Effect | Parameter | Segment 1 | Segment 2 | Common to both segments |
|---|---|---|---|---|
| Intercept | $\hat{\beta}_{0s}$ | -0.31 (0.12) | -0.84 (0.09) | |
| Inventory | $\hat{\beta}_{1s}$ | -0.04 (0.02) | -0.09 (0.02) | |
| Consumption rate | $\hat{\beta}_{2s}$ | 3.37 (0.67) | 3.18 (0.55) | |
| Class size | $\ln(\hat{\pi}_2/\hat{\pi}_1)$ | | | 0.52 (0.46) |

Table 3
Parameter estimates (and standard errors) of two-segment Poisson regression model (with quantity)

| Effect | Parameter | Segment 1 | Segment 2 | Common to both segments |
|---|---|---|---|---|
| Intercept | $\hat{\beta}_{0s}$ | -0.43 (0.11) | -1.10 (0.08) | |
| Quantity | $\hat{\beta}_{1s}$ | 0.13 (0.06) | 0.25 (0.05) | |
| Consumption rate | $\hat{\beta}_{2s}$ | 3.45 (0.70) | 3.31 (0.51) | |
| Class size | $\ln(\hat{\pi}_2/\hat{\pi}_1)$ | | | 0.74 (0.46) |

The application of the INAR(1)-Poisson model shows that both mixed Poisson regression models are misspecified because they ignore autodependencies in the data. Even the simplest model with constant purchase rates and stationary transition probabilities, which requires the estimation of only two parameters ($\lambda$ and $\alpha$), yields a higher log-likelihood value ($-1191.8$) than any of the mixed Poisson regression models. When the two potential predictors $z_{q,t-1}$ and $\bar{z}_q$ of the incidence rates and the carry-over probabilities (see Eq. (10)) are included, the conditional log-likelihood increases to $-1115.9$. For assessing individual difference effects we report the fit statistics of two additional models. First, the sum of the individual conditional log-likelihoods obtained from estimating the INAR(1)-Poisson model (without covariates) for every panel member is $-1116.4$. Second, the two-segment INAR(1)-Poisson regression model yields a log-likelihood value of $-1109.3$, which differs little from that of the single-segment model. Clearly, the covariates and the autoregressive components account to a significant extent for the variability in the rate parameters. No further fit improvement was obtained by a three-segment INAR(1)-Poisson model. The log-likelihoods, number of estimated parameters, and the corresponding AIC statistics of the one-, two-, and three-segment INAR(1)-Poisson regression models are summarized in Table 4. Table 4 also includes the fit statistics of a Poisson transition model with $c = 0.01$. This model fits significantly better than any of the mixed Poisson regression models but worse than the INAR(1)-Poisson model, which supports the choice of an additive autoregressive representation over a multiplicative one.
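The contrast between the additive and the multiplicative representation can be made concrete with a short sketch of the two conditional means. It assumes the standard INAR(1) result that the conditional mean is the carry-over part plus the innovation rate, and it writes the Poisson transition mean in the Zeger and Qaqish (1988) form given in Section 4.2. The function and argument names are hypothetical; `rate_now` and `rate_prev` stand for the exponentiated linear predictors at $t$ and $t-1$.

```python
import math


def inar1_conditional_mean(alpha, innovation_rate, x_prev):
    """Additive form: carried-over incidences plus the innovation rate."""
    return alpha * x_prev + innovation_rate


def poisson_transition_mean(rate_now, rate_prev, x_prev, f, c=0.01):
    """Multiplicative form: rate_now * (x*_{t-1} / rate_prev) ** f."""
    x_star = max(x_prev, c)   # c keeps the mean positive when x_prev == 0
    return rate_now * (x_star / rate_prev) ** f


# Hypothetical values, chosen only to show how the two forms react to x_prev.
for x_prev in (0, 1, 2, 4):
    additive = inar1_conditional_mean(alpha=0.5, innovation_rate=0.4, x_prev=x_prev)
    multiplicative = poisson_transition_mean(rate_now=0.8, rate_prev=0.8,
                                             x_prev=x_prev, f=0.5)
    print(x_prev, additive, multiplicative)
```

In the additive form a zero count in the previous period leaves the innovation rate untouched, whereas in the multiplicative form the same zero scales the whole rate down by a factor that depends on the constant $c$.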
To understand the differences between the two segments of the INAR(1)-Poisson model, the class size parameters were expressed as a logistic function of the three demographic covariates household size, working status, and income. The last two variables are dummy-coded to represent whether a household consists of a full-time working couple, and whether the income of a household exceeds \$35,000. The log-likelihood of this model is $-1103.6$ and the $\chi^2$-statistic of the demographic variables is 11.3 with 3 degrees of freedom.

Table 4
Fit statistics of autoregressive Poisson regression models

| No. segments | No. parameters | INAR(1) Poisson: $-\log l$ | INAR(1) Poisson: AIC | Poisson transition: $-\log l$ | Poisson transition: AIC |
|---|---|---|---|---|---|
| 1 | 6 | 1115.9 | 2243.8 | 1199.1 | 2410.2 |
| 2 | 10 | 1109.3 | 2238.6 | | |
| 3 | 14 | 1109.1 | 2246.2 | | |
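The fit statistics above can be checked with a few lines of arithmetic. The sketch below recomputes the AIC values of Table 4 from the negative log-likelihoods and parameter counts, and evaluates the tail probability of the reported $\chi^2$-statistic of 11.3 with 3 degrees of freedom. The helper names are hypothetical, and the closed-form tail probability is specific to 3 degrees of freedom. Twice the difference of the rounded log-likelihoods ($-1109.3$ and $-1103.6$) gives 11.4, so the reported 11.3 presumably rests on unrounded values.

```python
import math


def aic(neg_log_lik, n_params):
    """Akaike information criterion from a negative log-likelihood."""
    return 2.0 * neg_log_lik + 2.0 * n_params


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# (negative log-likelihood, number of parameters) as listed in Table 4
inar_models = {1: (1115.9, 6), 2: (1109.3, 10), 3: (1109.1, 14)}
for segments, (nll, k) in inar_models.items():
    print(segments, round(aic(nll, k), 1))

print(round(aic(1199.1, 6), 1))   # Poisson transition model, one segment
print(chi2_sf_3df(11.3))          # tail probability of the demographic test
```

The tail probability comes out at roughly 0.01, which is in line with the conclusion that the demographic variables contribute to separating the segments.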


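Table 5
Parameter estimates (and standard errors) of two-segment INAR(1)-Poisson regression model with demographic variables

| Effect | Parameter | Segment 1 | Segment 2 | Common to both segments |
|---|---|---|---|---|
| Intercept | $\hat{\beta}^I_{0s}$ | -1.37 (0.10) | -2.54 (0.27) | |
| Quantity | $\hat{\beta}^I_{1s}$ | -0.32 (0.10) | -1.79 (0.33) | |
| Consumption rate | $\hat{\beta}^I_{2s}$ | 2.94 (0.62) | 0.32 (0.92) | |
| AR-intercept | $\hat{\delta}_0$ | | | 0.14 (0.09) |
| AR-quantity | $\hat{\delta}_1$ | | | -0.62 (0.09) |
| AR-consumption rate | $\hat{\delta}_2$ | | | 1.38 (0.51) |
| Class size intercept | $\hat{q}_0$ | | | -1.24 (0.48) |
| Household size | $\hat{q}_1$ | | | 0.31 (0.28) |
| Income | $\hat{q}_2$ | | | 0.68 (0.72) |
| Dual-work | $\hat{q}_3$ | | | 2.42 (0.82) |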
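To show how the logistic parts of Table 5 translate into probabilities, the following sketch plugs the reported estimates into the logistic function. Two assumptions are made that this section does not state explicitly: the class size logit is taken to describe membership in the second segment, and the covariate values passed to the functions are purely illustrative because the scaling of the quantity and household size variables is not given here. The function names are hypothetical.

```python
import math


def logistic(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def carry_over(z_prev, z_bar):
    """Carry-over probability from the AR estimates in Table 5 (form of Eq. (10))."""
    return logistic(0.14 - 0.62 * z_prev + 1.38 * z_bar)


def prob_segment2(household_size, high_income, dual_work):
    """Probability of second-segment membership (assumed sign convention)."""
    return logistic(-1.24 + 0.31 * household_size
                    + 0.68 * high_income + 2.42 * dual_work)


# Illustrative inputs only: a larger lagged quantity lowers the carry-over
# probability, a larger average quantity raises it.
print(carry_over(z_prev=0.0, z_bar=1.0), carry_over(z_prev=2.0, z_bar=1.0))

# Dual-working households are more likely to fall into the second segment.
print(prob_segment2(2, 0, 0), prob_segment2(2, 0, 1))

# The same transformation gives the first-segment size reported for Table 2.
print(round(1.0 - logistic(0.52), 2))
```

The last line applies the same inverse-logit step to the class size estimate of Table 2 and returns the first-segment size of 0.37 quoted in the text.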

From the parameter estimates and standard errors in Table 5 we conclude that households with full-time working couples have a higher probability of belonging to the second segment. The other two demographic variables are not useful in separating the two segments. Members of the first segment have higher purchase incidences than members of the second segment. For both segments the effect of $z_{q,t-1}$ on $\lambda^I_t$ is negative, indicating that a large quantity purchase during $(t-1)$ reduces the number of purchase incidences during the subsequent time period. However, the size of this effect is segment-specific. For members of the second segment $z_{q,t-1}$ appears to be more predictive of $\lambda^I_t$ than for members of the first segment. In contrast, the effect of average quantity (referred to as consumption rate in Table 5) on $\lambda^I_t$ is positive but only significant for the first segment. Note also that the autocorrelations are not time-homogeneous. Average quantity is a positive predictor of the size of the carry-over probabilities while purchased quantity at $(t-1)$ has a negative effect on the size of $\alpha$. Thus, the predictive power of $x_{nt-1}$ for $x_{nt}$ depends systematically on $z_{q,t-1}$.

In summary, these analyses provided several major results. First, it was shown that there are strong autocorrelations among individual category purchase incidences which are related to past quantity purchases. This result explicates the finding in the literature that 'low' quantity buyers exhibit fewer serial dependencies in their purchase behavior than 'high' quantity buyers. Second, the INAR(1)-Poisson model yielded a better fit than the Zeger and Qaqish (1988) model. This finding provides indirect support for the additive as opposed to the multiplicative decomposition of the conditional expectations.
Third, by expressing the innovation part of the INAR(1)-Poisson model as a function of $z_{q,t-1}$, we could avoid the difficulties that arise in estimating the inventory level of a household without losing the conceptual benefits of this variable. Moreover, the separate analyses of the carry-over and innovation parts of $\lambda_t$ showed that panel members with higher purchase incidences (segment 1) appear to be more regular in their shopping behavior but less sensitive to their previous quantity purchases than consumers with lower purchase incidences (segment 2). Fourth, comparisons between the mixed Poisson regression models with and without autocorrelations demonstrate that less overdispersion could be detected in the data after accounting for autocorrelations. A similar result was obtained by Allenby and Lenk (1994) in an analysis of brand choice data. These authors found that the degree of estimated preference heterogeneity was substantially reduced when considering autocorrelations among the choices. Clearly, it is important to separate heterogeneity from carry-over effects. Approaches based on the assumption that the numbers of purchase incidences in successive time periods are independent do not provide an accurate description of recurrent choice data because they systematically underestimate the regularity and predictability of purchase incidences over time.

5. Discussion

This paper addressed the issue of autodependencies in longitudinal count data to disentangle the separate effects of heterogeneity and serial correlations in repeated count data (Blundell et al., 1995). A mixed INAR(1)-Poisson regression was presented that allows for different parameterizations of the first-order transition probabilities. Attractive features of the presented model include its parsimonious structure and its derivation from a plausible data generation mechanism, which may prove instrumental in testing specific hypotheses about sources of non-stationarities and autodependencies. For the same reason, however, the applicability of the approach is limited because it allows for first-order dependencies only and does not facilitate the analysis of random covariates. It is therefore important to carefully consider the appropriateness of the data generation mechanism in applications of the model.
Fortunately, the presented maximum likelihood framework greatly facilitates tests of the validity and adequacy of the mixed INAR(1)-Poisson regression model and its many potential uses in analyzing longitudinal count data.

Acknowledgements

The author is grateful to Bill Dillon, Eijte Foekens, Peter Leeflang and two anonymous reviewers for helpful comments and suggestions. This research was partially supported by the National Science Foundation grant SBR-9409531.

References

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (Eds.), Proceedings of the 2nd International Symposium on Information Theory. Akadémiai Kiadó, Budapest, pp. 267–281.


Ailawadi, K., Neslin, S.A., 1996. The effect of promotion on consumption: buying more and consuming it faster. Unpublished manuscript, Dartmouth College.
Al-Hussaini, E.K., Ahmad, K.E., 1981. On the identifiability of finite mixtures of distributions. IEEE Transactions on Information Theory 27, 664–668.
Allenby, G.M., Lenk, P.J., 1994. Modeling household purchase behavior with logistic normal regression. Journal of the American Statistical Association 89, 1218–1231.
Al-Osh, M.A., Alzaid, A.A., 1987. First-order integer-valued autoregressive (INAR(1)) process. Journal of Time Series Analysis 8, 261–275.
Alzaid, A.A., Al-Osh, M.A., 1990. An integer-valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability 27, 314–324.
Blundell, R., Griffith, R., Van Reenen, J., 1995a. Dynamic count data models of technological innovation. The Economic Journal 105, 333–344.
Blundell, R., Griffith, R., Windmeijer, F., 1995b. Individual effects and dynamics in count data. Discussion paper 95-03, Department of Economics, University College, London.
Böckenholt, U., 1993. A latent class regression approach for the analysis of recurrent choice data. British Journal of Mathematical and Statistical Psychology 46, 95–118.
Böckenholt, U., Langeheine, R., 1996. Latent change in recurrent choice data. Psychometrika 61, 285–302.
Brännäs, K., 1995. Explanatory variables in the AR(1) model. Umeå Economic Studies no. 381, University of Umeå.
Bucklin, R.E., Lattin, J.M., 1991. A two-state model of purchase incidence and brand choice. Marketing Science 19, 24–39.
Chintagunta, P.K., 1993. Investigating purchase incidence, brand choice, and purchase quantity decisions of households. Marketing Science 12, 184–208.
Cox, D.R., 1981. Statistical analysis of time series: some recent developments. Scandinavian Journal of Statistics 8, 93–115.
Dayton, C.M., MacReady, G.B., 1988. Concomitant variable latent class models. Journal of the American Statistical Association 83, 173–178.
Dean, C., 1992. Testing for overdispersion in Poisson and binomial regression models. Journal of the American Statistical Association 80, 451–457.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM-algorithm. Journal of the Royal Statistical Society Series B 39, 1–38.
Dillon, W.R., Gupta, S., 1996. A segment-level model of category volume and brand choice. Marketing Science 15, 38–59.
Dillon, W.R., Kumar, A., 1994. Latent structure and other mixture models in marketing: an integrative survey and overview. In: Bagozzi, R.P. (Ed.), Advanced Methods of Marketing Research. Blackwell, Cambridge, pp. 352–388.
Dillon, W.R., Kumar, A., Smith de Borrero, M., 1993. Capturing individual differences in paired comparisons: an extended BTL model incorporating descriptor variables. Journal of Marketing Research 30, 42–51.
Ehrenberg, A.S.C., 1988. Repeat Buying. Oxford University Press, New York.
Engle, R.F., Hendry, D.F., Richard, J.F., 1983. Exogeneity. Econometrica 51, 277–404.
Goodhardt, G.J., Ehrenberg, A.S.C., Chatfield, C., 1984. The Dirichlet: a comprehensive model of buying behavior. Journal of the Royal Statistical Society Series A 147, 621–655.
Gourieroux, C., Monfort, A., Trognon, A., 1984. Pseudo maximum likelihood methods: applications to Poisson models. Econometrica 50, 701–720.
Gupta, S., 1991. Stochastic models of interpurchase time with time-dependent covariates. Journal of Marketing Research 28, 1–15.
Gupta, S., Chintagunta, P.K., 1994. On using demographic variables to determine segment membership in logit mixture models. Journal of Marketing Research 31, 128–136.


Harvey, A.C., 1989. Forecasting Structural Time Series Models and the Kalman Filter. Cambridge University Press, Cambridge.
Hausman, J., Hall, B.H., Griliches, Z., 1984. Econometric models for count data with an application to the Patents-R and D relationship. Econometrica 52, 909–938.
Kahn, B.E., Morrison, D.G., 1989. A note on 'random' purchasing: additional insights from Dunn, Reader and Wrigley. Applied Statistics 38, 111–114.
Lawless, J.F., 1987. Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics 15, 209–225.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models. Chapman and Hall, London.
McKenzie, E., 1988. Some ARMA models for dependent sequences of Poisson counts. Advances in Applied Probability 20, 822–835.
McKendrick, A.G., 1926. Applications of mathematics to medical problems. Proceedings of the Edinburgh Mathematical Society 44, 98–130.
Mills, T.M., Seneta, E., 1991. Independence of partial autocorrelations for a classical immigration branching process. Stochastic Processes and their Applications 37, 275–279.
Morrison, D.G., Schmittlein, D.C., 1981. Predicting future random events based on past performance. Management Science 27, 1006–1023.
Morrison, D.G., Schmittlein, D.C., 1988. Generalizing the NBD model for customer purchases: what are the implications and is it worth the effort. Journal of Business and Economic Statistics 6, 145–159.
Neslin, S.A., Henderson, C., Quelch, J., 1985. Consumer promotions and the acceleration of product purchases. Marketing Science 4, 147–165.
Neslin, S.A., Schneider Stone, L.G., 1996. Consumer inventory sensitivity and the postpromotion dip. Marketing Letters 7, 77–94.
Ramaswamy, V., Anderson, E.W., DeSarbo, W.S., 1993. A disaggregate negative binomial regression procedure for count data analysis. Management Science 40, 405–417.
Ronning, G., Jung, R.C., 1992. Estimation of a first-order autoregressive process with Poisson marginals for count data. In: Fahrmeir, L., Francis, B., Gilchrist, R., Tutz, G. (Eds.), Advances in GLIM and Statistical Modeling. Springer, Berlin, pp. 188–194.
Smith, J.Q., 1979. A generalization of the Bayesian forecasting model. Journal of the Royal Statistical Society Series B 41, 375–387.
Sprott, D.A., 1983. Estimating the parameters of a convolution by maximum likelihood. Journal of the American Statistical Association 78, 460–467.
Steutel, F.W., Harn, K., 1979. Discrete analogues of self-decomposability and stability. Annals of Probability 7, 893–899.
Teicher, H., 1963. Identifiability of finite mixtures. Annals of Mathematical Statistics 34, 1265–1269.
Teicher, H., 1967. Identifiability of mixtures of product measures. Annals of Mathematical Statistics 38, 1300–1302.
Titterington, D.M., Smith, A.F., Makov, U.E., 1985. Statistical Analysis of Finite Mixture Distributions. Wiley, New York.
Van Duijn, M.A.J., Böckenholt, U., 1995. Mixture models for the analysis of repeated count data. Applied Statistics 44, 473–485.
Wang, P., Cockburn, I.M., Puterman, M.L., 1998. Analysis of patent data: a mixed-Poisson-regression-model approach. Journal of Business and Economic Statistics 16, 27–41.
Wang, P., Puterman, M.L., Cockburn, I., Le, N., 1996. Mixed Poisson regression models with covariate dependent rates. Biometrics 52, 381–400.
Wedel, M., DeSarbo, W.S., 1994. A review of recent developments in latent class regression models. In: Bagozzi, R.P. (Ed.), Advanced Methods of Marketing Research. Blackwell, Cambridge, pp. 352–388.


Wedel, M., DeSarbo, W.S., 1995. A mixture likelihood approach for generalized linear models. Journal of Classification 12, 1–35.
Wedel, M., DeSarbo, W.S., Bult, J.R., Ramaswamy, V., 1993. A latent class Poisson regression model for heterogeneous count data. Journal of Applied Econometrics 8, 397–411.
Wheat, R.D., Morrison, D.G., 1990. Assessing purchase timing models: whether or not is preferable to when. Marketing Science 27, 87–93.
Zeger, S.L., 1988. A regression model for time series of counts. Biometrika 75, 621–629.
Zeger, S.L., Qaqish, B., 1988. Markov regression models for time series: a quasi-likelihood approach. Biometrics 44, 1019–1031.