Insurance: Mathematics and Economics 34 (2004) 177–192
Heterogeneous INAR(1) model with application to car insurance
C. Gourieroux a, J. Jasiak b,∗
a CREST, CEPREMAP, University of Toronto, Toronto, Canada
b Department of Economics, York University, 4700 Keele Street, Toronto, Ont., Canada M3J 1P3
Received 1 January 2002; received in final form October 2003; accepted 20 November 2003
Abstract

The bonus-malus scheme shows how the history of claim arrivals determines the dynamics of insurance premium. It is important to distinguish to what extent changes in the insurance premium are explained by lagged claim counts introduced among explanatory variables and by unobservable heterogeneity included in the model, which needs to be regularly updated. For this purpose, we introduce the integer valued autoregressive (INAR) model with unobserved heterogeneity. The model is applied to premium updating in car insurance and compared to the standard method based on the negative binomial distribution. We find that the premium depends on the claim history and that the timing of claim arrivals matters. This result is different from the outcome of the standard framework in which the average number of claims per year is the only relevant factor.

© 2003 Elsevier B.V. All rights reserved.

JEL classification: C41; D4

Keywords: Count data; Nonlinear autoregression; Car insurance; Bonus-malus
1. Introduction

The basic model for claim counts in car insurance relies on a representation of the claim history of each policyowner by a short, integer valued sequence of yearly counts of claims. The simplest model, called the Poisson model, assumes independence of claim counts for each individual, which are Poisson distributed and can depend on some observed explanatory variables. Its extended version, used for insurance premium updating and called the negative binomial model, accommodates serial dependence of claim counts, by introducing unobserved individual heterogeneity. The serial dependence in claim sequences is generated by integrating out the unobserved factor, and by updating its prediction when individual information increases.

In this approach the effect of claim history can be introduced in two ways: (i) by including lagged claim counts among the explanatory variables, (ii) by accounting for individual heterogeneity which entails regularly updated predictions. The aim of this paper is to discuss these two techniques and the extent to which they determine the behavior of the premium in time, that is the bonus-malus scheme.

∗ Corresponding author. Tel.: +1-416-736-2100; fax: +1-416-736-5987. E-mail address: [email protected] (J. Jasiak).
0167-6687/$ – see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.insmatheco.2003.11.005
C. Gourieroux, J. Jasiak / Insurance: Mathematics and Economics 34 (2004) 177–192
In Section 2 we review the standard premium updating approach, called the negative binomial model, which will be used as a benchmark for further comparisons. A major problem with this approach is the way in which lagged endogenous counts are usually included among the explanatory variables. It makes the model unsuitable for simple analysis of risk predictions at large horizons and for finding the stationarity properties of the count process. In Section 3 we review the integer valued autoregressive (INAR) model introduced in the time series literature to account for serial dependence in count processes. Next, it is shown how the INAR model can be extended to accommodate unobserved heterogeneity. In Section 4 simulation studies are performed to disentangle the effects of lagged counts used as explanatory variables from the effects of unobservable heterogeneity for the bonus-malus scheme. For expository purposes the premium updating is considered for one given policyowner. Section 5 concludes the paper.
2. The negative binomial approach

Let us consider the policyowner $i$ and denote by $Y_{i,1}, \ldots, Y_{i,T}$ the number of claims per year submitted by this individual. In the actuarial literature, the standard approach¹ assumes that

(A1) $Y_{i,1}, \ldots, Y_{i,T}, Y_{i,T+1}$ are independent conditional on an unobservable heterogeneity factor $\mu_i$, and have the same Poisson distribution $P[\mu_i \lambda_i]$, where $\lambda_i$ is given.
(A2) The distribution of the heterogeneity factor $\mu_i$ is $\gamma(A, A)$.²,³

Then the (frequency part of the) pure premium⁴ at date $T$ is equal to

$$P_{i,T} = E[Y_{i,T+1} \mid Y_{i,1}, \ldots, Y_{i,T}] = \lambda_i E[\mu_i \mid Y_{i,1}, \ldots, Y_{i,T}]. \tag{2.1}$$

It involves the conditional distribution of the heterogeneity factor given the claim history. Under Assumptions A1 and A2, the premium is given by

$$P_{i,T} = \lambda_i \frac{A + Y_{i,1} + \cdots + Y_{i,T}}{A + T\lambda_i}, \tag{2.2}$$

and is time varying. At the beginning of the contract the premium is $P_{i,0} = \lambda_i$ (sometimes called prior premium); then, in the next periods, it changes according to the observed claim history (these are called posterior premia):

$$P_{i,1} = \lambda_i \frac{A + Y_{i,1}}{A + \lambda_i}, \qquad P_{i,2} = \lambda_i \frac{A + Y_{i,1} + Y_{i,2}}{A + 2\lambda_i}, \ldots$$

In particular, we find that the total number of claims is a sufficient statistic of the claim history for computing the premium.

This approach is easily extended to account for the effect of time dependent characteristics of the policyholder. If for instance $\lambda_{it} = g(x_{it})$ is a function of time varying individual characteristics $x_{it}$, expression (2.2) becomes

$$P_{i,T} = \lambda_{i,T+1} \frac{A + \sum_{t=1}^{T} Y_{i,t}}{A + \sum_{t=1}^{T} \lambda_{i,t}}. \tag{2.3}$$

We observe that the total number of claims still remains a sufficient statistic for the computation of the premium.

1 See, e.g. Hausman et al. (1984), Gourieroux et al. (1984a,b), Lemaire (1985, 1995), Cameron and Trivedi (1986), Lawless (1987), Dionne and Vanasse (1989), Gourieroux (1999a,b), Pinquet (2000) and Gourieroux and Jasiak (2001, Chapter 5).
2 The two parameters of the gamma distribution are identical to ensure that $E\mu_i = 1$, that is, to be able to interpret $\lambda_i$ as the expected number of claims.
3 For other applications such as marketing, it may be preferable to introduce a discrete heterogeneity distribution (Bockenholt, 1999).
4 The complete pure premium also includes the cost of the claim. It is equal to the frequency part times the expected cost per claim, when cost per claim and claim occurrence are independent.
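The premium formulas (2.2) and (2.3) can be illustrated with a minimal Python sketch. The function names `nb_premium` and `nb_premium_constant` and their argument names are illustrative assumptions, not notation from the text; the sketch only evaluates the closed-form expressions under Assumptions A1 and A2.

```python
from typing import Sequence


def nb_premium(A: float, claims: Sequence[int],
               lams: Sequence[float], lam_next: float) -> float:
    """Frequency part of the pure premium, formula (2.3).

    claims   : observed yearly claim counts Y_1, ..., Y_T
    lams     : a priori expected counts lambda_1, ..., lambda_T
    lam_next : a priori expected count lambda_{T+1} for the coming year
    """
    if len(claims) != len(lams):
        raise ValueError("claims and lams must have the same length")
    return lam_next * (A + sum(claims)) / (A + sum(lams))


def nb_premium_constant(A: float, lam: float, claims: Sequence[int]) -> float:
    """Special case (2.2): lambda does not vary over time."""
    return nb_premium(A, claims, [lam] * len(claims), lam)


# Prior premium (no history) and posterior premium after one year with one claim.
prior = nb_premium_constant(9.0, 0.1, [])
after_one_claim = nb_premium_constant(9.0, 0.1, [1])

# Only the total number of claims matters: these three histories give one premium.
same_total = {nb_premium_constant(9.0, 0.1, h) for h in ([2, 0], [1, 1], [0, 2])}
```

With an empty history the function returns the prior premium λ, and permuting the yearly counts leaves the result unchanged, which is the sufficiency property noted above.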
The change in the premium between $T$ and $T + 1$, say, can be due to a change in the observable individual characteristics $x_{i,t}$ and to the improvement in predicting the unobservable factor when the claim history increases. In particular, when $\lambda_i$ is time independent, that is when the policyholder remains in the same class of risk a priori, the effect of claim history on the premium dynamics (generally called bonus-malus) is easy to formalize. Indeed, denoting by $\bar{Y}_{i,T}$ the average yearly number of claims over the first $T$ years, we have

$$P_{i,T+1} - P_{i,T} = \lambda_i \left[ \frac{A + (T+1)\bar{Y}_{i,T+1}}{A + (T+1)\lambda_i} - \frac{A + T\bar{Y}_{i,T}}{A + T\lambda_i} \right], \tag{2.4}$$

$$P_{i,T+1} - P_{i,T} = \frac{\lambda_i}{A + (T+1)\lambda_i} \left[ Y_{i,T+1} - \left( \frac{A}{A + T\lambda_i}\lambda_i + \frac{T\lambda_i}{A + T\lambda_i}\bar{Y}_{i,T} \right) \right], \tag{2.5}$$

which is the credibility formula. We observe that the premium increases when the future number of claims is larger than its expected value evaluated at time $T$.⁵

The negative binomial model has several advantages: it is easy to understand, to fit (by using the SAS package GENMOD) and to apply for ratemaking. However, the limitation of this model is that it represents the claim history as an unweighted average of past observed numbers of claims. This characteristic of the negative binomial model affects the fit and, as a consequence, the prediction of future claims. Thus it is necessary to extend the basic model by introducing different weights for past claims, or by including nonlinear effects of past claims. Such an extended specification would have to satisfy the following requirements:

1. It has to allow for ratemaking both at the beginning of the contract and during the life of the contract. Typically, the model has to specify in a coherent way the marginal distribution of $Y_{i,t}$ (given the exogenous characteristics) for computing the premium of a new driver, the conditional distribution of $Y_{i,t}$ given $Y_{i,t-1}$ (given the exogenous characteristics) for updating the premium after 1 year, the conditional distribution of $Y_{i,t}$ given $Y_{i,t-1}$ and $Y_{i,t-2}$ (given the exogenous characteristics) for updating the premium after 2 years, and so on.
2. It has to be sufficiently flexible with respect to the dynamic specification to provide reasonable fit.
3. It has to be easy to understand.

The fact that the negative binomial model is too restrictive is well known, and for this reason standard software packages, such as GENMOD in SAS, allow for more complicated specifications.⁶

2.1. The Poisson regression and negative binomial regression software

The first extension of the negative binomial regression and the Poisson regression consists in accommodating the effects of claim history by means of explanatory variables. In particular, when adverse selection effects are taken into account, the expected future number of claims, which depends on the claim history, can be included as an explanatory variable.⁷ It is also possible to use as an explanatory variable the age of a previous claim (see, e.g. the discussions in Gerber and Jones, 1975; Sundt, 1988; Pinquet, 2003). In practice the parameter $\lambda_{i,t}$ is typically written as

$$\lambda_{i,t} = \exp[x_i a + z_{it} b + Y_{i,t-1} c], \tag{2.6}$$

where $x_i$ are time invariant characteristics of the policyholder, $z_{it}$ denote exogenous time dependent characteristics (such as the type of the car) and $Y_{i,t-1}$ is the lagged number of claims. In this framework, the claim history has a double effect on the premium, due to the specification of $\lambda_{i,t}$ and to the predicted heterogeneity.

5 The discussion is still focused on the frequency part of the bonus-malus.
6 GENMOD is a procedure based on the theory of the so-called generalized linear model. It is closely related to the pseudo- or quasi-likelihood approach introduced in McCullagh (1983) and Gourieroux et al. (1984a,b) (see McCullagh and Nelder, 1989; Gourieroux and Monfort, 1993 for surveys).
7 See, e.g. Puelz and Snow (1994), Chiappori and Salanié (2000) and Dionne et al. (1998, 2001).
Let us now focus on the specification of the observable component as a function of claim history. For ease of exposition, let us assume that $b = 0$ (no time dependent exogenous variables), and that there is no unobservable heterogeneity, that is $\mu_i = 1$. Then the process $(Y_{i,t})$ is a Markov process with conditional distribution $P[\exp(x_i a + Y_{i,t-1} c)]$. This specification is a special case of a Poisson regression model. It is easy to implement in practice, especially since commercial software such as SAS and GLIM offers straightforward procedures to estimate the parameters of the model. However, this specification has some structural drawbacks:

(∗) The stationarity properties of the dynamic Poisson regression model are not established. This remark is especially important for ratemaking. It means that the stationary distribution, that is the marginal distribution of $Y_{i,t}$, does not necessarily exist, and if it exists, its analytical expression is not known. Thus we cannot compute the premium for a new driver.
(∗∗) The transitions in more than one step do not admit simple analytical expressions. This is very inconvenient for risk prediction at horizons larger than one and for financial hedging or securitization of portfolios of car insurance contracts.

2.2. The Poisson AR(1) software

Some extensions of the Poisson model that include “random effects” are available from GENMOD, in particular the so-called Poisson AR(1) model. It is important to note that the GENMOD procedures are essentially semi-parametric estimation procedures, based on the notion of estimating equations. They provide consistent estimators of the parameters of interest for a given parametric specification of the expected number of claims and a given weighting matrix. This approach has several drawbacks:

(∗) As noted by Liang and Zeger (1986, Section 5), the method is consistent, but not fully efficient. To achieve full efficiency, the maximum likelihood approach has to be used, which requires complete specification of the model to be estimated.
(∗∗) The approach is semi-parametric, i.e. “The estimating equations are derived without specifying the joint distribution of the subject’s observations” (Liang and Zeger, 1986, p. 13). As such it allows one to compute the expected number of claims and the variance of that number at horizon 1, but not the expectation and variance of the number of claims at larger horizons, nor the conditional quantiles, which are needed to determine the reserve. For such purposes it is necessary to specify completely the conditional distribution, not only the conditional mean and variance.⁸

3. Integer valued autoregressive process

The integer valued autoregressive process has been introduced in the time series literature to circumvent the limitations of the Poisson regression model.⁹ We first recall the definition of the INAR process and explain how it is used for prediction making at horizons greater than 1. Next, the INAR model is extended to account for unobservable heterogeneity and time dependent exogenous characteristics.

3.1. The INAR process

The rationale for designing a specific autoregressive model for count data is the following. Let us consider a standard linear autoregressive process $Y_t$:

$$Y_t = \rho Y_{t-1} + \varepsilon_t, \text{ say},$$

where $\rho$ is a parameter with values in $(-1, 1)$ and $(\varepsilon_t)$ is a sequence of independent identically distributed variables. In order to obtain an integer valued $Y_t$ the following constraints have to be imposed: (i) $\varepsilon_t$ is integer valued, (ii) $\rho = -1$, 0 or 1. Such constraints limit the practical use of linear autoregression in the framework of count variables.

A natural idea is to replace the deterministic effect of lagged $Y_t$'s by a stochastic one. This can be done by introducing the so-called thinning operator. Let us introduce i.i.d. Bernoulli variables $U_{j,t}$ with distribution $B(1, p)$, $p \in [0, 1]$. Then the thinning operator, denoted as $B_t(p)$, associates with $Y_{t-1}$ the variable:

$$B_t(p) \circ Y_{t-1} = \sum_{j=1}^{Y_{t-1}} U_{j,t}.$$

Conditional on $Y_{t-1}$, the thinned variable follows a binomial distribution $B(Y_{t-1}, p)$, which ensures integer values of the autoregressive term for the linear specification of the conditional expectation.

The thinning operator is used to define an integer valued autoregressive process. For expository purposes we focus on the autoregressive process of order 1, but the approach is easy to extend to higher autoregressive orders (see the references given above). The process is defined by the recursive equation:

$$Y_t = B_t(p) \circ Y_{t-1} + \varepsilon_t, \tag{3.1}$$

where $(\varepsilon_t)$ is a sequence of independent variables with discrete nonnegative values and identical distributions. The $B_t(p)$ are independent binomial thinning operators which are independent of the error terms. The model with i.i.d. Poisson variables $(Y_t)$, which underlies the negative binomial specification discussed in Section 2, is obtained when: (i) $p = 0$, (ii) $\varepsilon_t \sim P(\lambda)$. Thus the specification (3.1) extends the standard one in two respects. It includes an autoregressive effect, and it allows the distribution of the error term to be of any type.

Under the assumption of Poisson distributed error terms, it is easy to check that the INAR(1) defines a count process which has a marginal Poisson distribution with a modified parameter $P[\lambda/(1 - p)]$. Thus both innovation and marginal distributions are Poisson.

Simple dynamic properties of the process are easily inferred from its integer valued moving average representation. Indeed by recursive substitution we get

$$\begin{aligned} Y_t &= \varepsilon_t + B_t(p) \circ \{B_{t-1}(p) \circ Y_{t-2} + \varepsilon_{t-1}\} \\ &= \varepsilon_t + B_{t,2}(p) \circ B_{t-1}(p) \circ Y_{t-2} + B_{t,1}(p) \circ \varepsilon_{t-1} \\ &= \varepsilon_t + B_{t,1}(p) \circ \varepsilon_{t-1} + B_{t,2}(p) \circ B_{t-1,1}(p) \circ \varepsilon_{t-2} + \cdots + B_{t,h}(p) \circ B_{t-1,h-1}(p) \circ \cdots \circ B_{t-h+1,1}(p) \circ \varepsilon_{t-h} + \cdots, \end{aligned}$$

where the binomial thinning operators $B_{t,h}(p)$ are independent for any $t$, $h$.¹⁰ In particular, we have (Al-Osh and Alzaid, 1987; McKenzie, 1988):

(i) $E Y_t = \dfrac{E\varepsilon_t}{1 - p}$;
(ii) $\gamma(h) = \mathrm{Cov}(Y_t, Y_{t-h}) = \dfrac{p^h}{1 - p} E\varepsilon_t + \dfrac{p^h}{1 - p^2}(V\varepsilon_t - E\varepsilon_t)$.

Thus the autocorrelation function features exponential decay, typical for “linear” autoregression.

8 Conditional mean and variance characterize the conditional distribution for a conditionally normal distribution. The normality assumption is clearly not appropriate in the case of count variables.
9 See McKenzie (1985, 1988), Al-Osh and Alzaid (1987), Alzaid and Al-Osh (1990), Brannas (1995), Greene (1997) and Brannas and Hellstrom (2001) for a survey.
10 Indeed, for any pair of independent binomial thinning operators $B_1(p)$ and $B_2(p)$, we get $B_1(p) \circ z_1 + B_2(p) \circ z_2 = B(p) \circ (z_1 + z_2)$, where $B(p)$ is another binomial thinning operator.
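The recursion (3.1) with Poisson innovations can be sketched in Python using only the standard library. This is an illustrative simulation, not the authors' code: the helper names, the seed, and the choice of Knuth's multiplication method for Poisson draws are assumptions made here, and the sampler is only adequate for small means.

```python
import math
import random


def poisson_draw(rng: random.Random, lam: float) -> int:
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def thin(rng: random.Random, count: int, p: float) -> int:
    """Binomial thinning B(p) o count: a sum of `count` Bernoulli(p) draws."""
    return sum(1 for _ in range(count) if rng.random() < p)


def simulate_inar1(lam: float, p: float, n_steps: int, seed: int = 0) -> list:
    """Simulate Y_t = B_t(p) o Y_{t-1} + eps_t with eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    y = poisson_draw(rng, lam / (1.0 - p))  # start from the stationary Poisson law
    path = []
    for _ in range(n_steps):
        y = thin(rng, y, p) + poisson_draw(rng, lam)
        path.append(y)
    return path


def lag1_autocorrelation(path: list) -> float:
    """Sample autocorrelation at lag 1."""
    m = sum(path) / len(path)
    num = sum((a - m) * (b - m) for a, b in zip(path[1:], path[:-1]))
    den = sum((a - m) ** 2 for a in path)
    return num / den


path = simulate_inar1(lam=0.3, p=0.3, n_steps=20000)
sample_mean = sum(path) / len(path)          # compare with lam / (1 - p)
sample_rho1 = lag1_autocorrelation(path)     # compare with p
```

On a long simulated path the sample mean should be close to λ/(1 − p) and the lag-1 autocorrelation close to p, in line with the moment formulas (i) and (ii) for Poisson innovations.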
3.2. Prediction formulas

In actuarial science, it is important to predict and manage the risk at different horizons. For this purpose, tractable prediction formulas become necessary. For an INAR process, the formulas for predicting a future number of claims at any horizon $h$ are easily derived. Indeed we have

$$\begin{aligned} E(Y_t \mid Y_{t-h}) &= E\{\varepsilon_t + B_{t,1}(p) \circ \varepsilon_{t-1} + \cdots + B_{t,h-1}(p) \circ B_{t-1,h-2}(p) \circ \cdots \circ B_{t-h+2,1}(p) \circ \varepsilon_{t-h+1} \\ &\qquad + B_{t,h}(p) \circ \cdots \circ B_{t-h+1}(p) \circ Y_{t-h} \mid Y_{t-h}\} \\ &= E\varepsilon_t\,(1 + p + \cdots + p^{h-1}) + p^h Y_{t-h} \\ &= p^h Y_{t-h} + \frac{1 - p^h}{1 - p} E\varepsilon_t. \end{aligned}$$

This prediction formula is an equivalent of the standard linear AR(1)-based forecast function. However, it provides only the conditional expectation of the future number of claims. For practical purposes it is preferable to know the conditional distribution itself. The reason is that it also allows one to evaluate the risk associated with an insurance contract by considering either the conditional variance, a conditional quantile (i.e. a value at risk (VaR)), or a conditional TailVar.

The predictive distribution at horizon $h$ is easily derived for Poisson distributed innovations. Indeed, the current value of the INAR(1) process can be written as a function of a past value at any lag $h$. We get

$$Y_t = B_t(p^h) \circ Y_{t-h} + \varepsilon_t^{(h)},$$

where the error $\varepsilon_t^{(h)}$ is now independent of the binomial thinning operator, and has a Poisson distribution $P[\lambda(1 - p^h)/(1 - p)]$. In particular,

$$E(Y_t \mid Y_{t-h}) = p^h Y_{t-h} + \lambda \frac{1 - p^h}{1 - p},$$

$$V(Y_t \mid Y_{t-h}) = p^h(1 - p^h) Y_{t-h} + \lambda \frac{1 - p^h}{1 - p} = E(Y_t \mid Y_{t-h}) - p^{2h} Y_{t-h}.$$
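The horizon-h conditional mean and variance for Poisson innovations translate directly into code. The following Python sketch uses a hypothetical function name, `inar1_forecast`, and assumes 0 ≤ p < 1.

```python
def inar1_forecast(y_past: int, lam: float, p: float, h: int):
    """Conditional mean and variance of Y_t given Y_{t-h} = y_past.

    Assumes an INAR(1) process with Poisson(lam) innovations and 0 <= p < 1.
    """
    ph = p ** h
    # Mean (and variance) of the Poisson error term eps_t^(h).
    innovation_part = lam * (1.0 - ph) / (1.0 - p)
    mean = ph * y_past + innovation_part
    variance = ph * (1.0 - ph) * y_past + innovation_part
    return mean, variance


# One-step and long-horizon forecasts after observing two claims.
mean_1, var_1 = inar1_forecast(y_past=2, lam=0.3, p=0.3, h=1)
mean_far, var_far = inar1_forecast(y_past=2, lam=0.3, p=0.3, h=50)
```

For any positive past count the variance falls below the mean by p^{2h} times that count, and at long horizons both converge to the stationary Poisson parameter λ/(1 − p).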
We see that the conditional distribution of $Y_t$ given $Y_{t-h}$ features underdispersion. This property of underdispersion can be surprising at first sight, especially since according to the literature, the claim counts observed in car insurance feature historical overdispersion. To make sure that the model provides a good fit to the data, overdispersion can be introduced into the INAR model by including an unobservable heterogeneity factor, which is a usual approach. For example, it justifies the replacement of the basic Poisson model by the negative binomial model.

Moreover, it is important to note that the notion of over–underdispersion depends on the conditioning set. In general, overdispersion is observed in empirical distributions of claim counts, when no conditioning variables are considered, but it diminishes in the presence of explanatory variables.¹¹,¹² Indeed, a conditional variance is on average strictly smaller than the marginal one by the variance analysis equation. This is exactly the point which is illustrated here, with the lagged claim count as the explanatory variable. The marginal distribution is Poisson without under- or overdispersion whereas the conditional distribution features underdispersion.

3.3. Estimation

In practice the claim dynamics vary among the individuals depending on the individual characteristics. Let us assume that all individual characteristics are observed and time independent. Then the $\lambda$ parameter can be written as a function of $x_i$: $\lambda_i = \exp(x_i \theta)$.

11 In practice underdispersion can even be observed in some particular classes of risk. For instance estimated underdispersion arises automatically for any class of risk where the only observed numbers of claims are 0 or 1. Indeed the sample variance is equal to $\bar{Y}(1 - \bar{Y}) < \bar{Y}$.
12 As a consequence the risk and the insurance premium can diminish when additional individual characteristics are taken into account (see Gourieroux, 1999b for a detailed discussion).
From the prediction formulas, it follows that:

$$E(Y_{i,t} \mid Y_{i,t-1}, x_i) = p Y_{i,t-1} + \exp(x_i \theta) = \mu(y_{i,t-1}, x_i, \theta),$$

$$V(Y_{i,t} \mid Y_{i,t-1}, x_i) = p(1 - p) Y_{i,t-1} + \exp(x_i \theta) = \sigma^2(y_{i,t-1}, x_i, \theta).$$

Thus the parametric specifications of the conditional mean and variance are known and the parameters $p$ and $\theta$ can be estimated consistently by solving recursively the estimating equations, in which $\theta_l$ is the estimate from iteration $l$:

$$\sum_i \sum_t \frac{1}{\sigma^2(y_{i,t-1}, x_i, \theta_l)} \left[ y_{i,t} - \mu(y_{i,t-1}, x_i, \theta_{l+1}) \right] = 0$$

(see the Newton–Raphson algorithm in GENMOD). This approach is not fully efficient and can be improved in this respect by applying maximum likelihood. Since the expression of the likelihood function is complicated, simulated maximum likelihood can be used instead. In practice, the simulated log-likelihood function has to be given as input to the maximum likelihood SAS procedure.

The same remark holds when unobserved heterogeneity is introduced. Let us for instance consider the first and second order moments. We get

$$E[Y_{i,t} \mid Y_{i,t-1}, x_i, \mu_i] = p Y_{i,t-1} + \mu_i \exp(x_i \theta).$$

Thus by integrating out the unobserved heterogeneity, we get

$$E[Y_{i,t} \mid Y_{i,t-1}, x_i] = p Y_{i,t-1} + \exp(x_i \theta).$$

Similarly, along the lines of Gourieroux et al. (1984b):

$$\begin{aligned} V(Y_{i,t} \mid Y_{i,t-1}, x_i) &= V[E(Y_{i,t} \mid Y_{i,t-1}, x_i, \mu_i) \mid Y_{i,t-1}, x_i] + E[V(Y_{i,t} \mid Y_{i,t-1}, x_i, \mu_i) \mid Y_{i,t-1}, x_i] \\ &= V(\mu_i) \exp(2 x_i \theta) + p(1 - p) Y_{i,t-1} + \exp(x_i \theta). \end{aligned}$$

We again get explicit parametric expressions of the first two conditional moments. Thus by using the estimating equations approach (or equivalently the pseudo(quasi)-maximum likelihood),¹³ we derive consistent estimators of $p$, $\theta$ and the variance of unobserved heterogeneity. It is important to note that this semi-parametric method does not allow one to estimate the complete heterogeneity distribution, except when it depends on one parameter only. In contrast, the maximum likelihood method allows for efficient estimation of the whole distribution.

3.4. The INAR(1) process with gamma heterogeneity

Since a sequence of i.i.d. Poisson variables follows an INAR process of order 1 with Poisson innovations and a zero binomial thinning operator (that is $p = 0$), it is natural to extend the INAR(1) model by introducing gamma heterogeneity. For ease of exposition the individual index $i$ is suppressed in this section. The INAR(1) process with gamma heterogeneity is defined under the following assumptions:

(A1∗) The count variables $Y_1, \ldots, Y_{T+1}$ follow the INAR(1) process:

$$Y_t = B(p) \circ Y_{t-1} + \varepsilon_t = Z_t + \varepsilon_t,$$

where conditional on the past $Y_{t-1}, Y_{t-2}, \ldots$ and on the unobservable heterogeneity factor $\mu$, the variables $Z_t$ and $\varepsilon_t$ are independent with distributions $B(Y_{t-1}, p)$ and $P(\lambda\mu)$, respectively.
(A2) The distribution of $\mu$ is $\gamma(A, A)$.

13 See McCullagh (1983), Gourieroux et al. (1984a,b), McCullagh and Nelder (1989), Gourieroux and Monfort (1993) and Diggle et al. (2002) for surveys.
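Assumptions (A1∗) and (A2) can be turned into a small simulation. The Python sketch below is illustrative only: the function names and parameter values are assumptions made here, and the first-year count is drawn from the stationary Poisson law with parameter λµ/(1 − p) given µ, which is itself an assumption about the initial condition.

```python
import math
import random


def poisson_draw(rng: random.Random, lam: float) -> int:
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_policyholder(rng: random.Random, lam: float, p: float,
                          A: float, T: int):
    """Draw mu ~ gamma(A, A) and a claim history Y_1, ..., Y_T."""
    # gammavariate takes (shape, scale); scale 1/A gives mean 1 and variance 1/A.
    mu = rng.gammavariate(A, 1.0 / A)
    y = poisson_draw(rng, lam * mu / (1.0 - p))  # assumed stationary start
    history = [y]
    for _ in range(T - 1):
        z = sum(1 for _ in range(y) if rng.random() < p)  # Z_t ~ B(Y_{t-1}, p)
        y = z + poisson_draw(rng, lam * mu)               # eps_t ~ P(lam * mu)
        history.append(y)
    return mu, history


rng = random.Random(1)
first_year = [simulate_policyholder(rng, 1.0, 0.3, 2.0, 3)[1][0]
              for _ in range(20000)]
mean_y1 = sum(first_year) / len(first_year)
var_y1 = sum((v - mean_y1) ** 2 for v in first_year) / len(first_year)
```

Across simulated policyholders the first-year counts should show a variance clearly above the mean, illustrating how the heterogeneity factor restores marginal overdispersion.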
This model can easily be extended to include the effects of exogenous time varying explanatory variables. These would be channeled by the parameter λ (in the usual way), and also by the serial dependence parameter p. This last extension is less common in the literature and consists in creating cross effects between the lagged endogenous counts and the exogenous variables. Such an extension is not considered in the next sections, where we focus on risk predictions. The reason is that in practice, a technique called crystallization is used, which consists in predicting risk with the exogenous variables (even time varying ones) held fixed at their current values. For example, future characteristics of a car are considered fixed and equal to the current ones. This approach avoids having to predict the car renewal scheme, but disregards the associated risk.
4. Premium analysis for INAR(1) process with gamma heterogeneity

Let us compute the (frequency part of the) pure premium for the INAR(1) process with gamma heterogeneity and discuss the differences between this result and the one derived under the standard negative binomial approach without lagged endogenous variables.

4.1. The pure premium

At date (year) $T$ the pure premium is

$$P_T = E[Y_{T+1} \mid Y_1, \ldots, Y_T] = E[E(Y_{T+1} \mid Y_1, \ldots, Y_T, \mu) \mid Y_1, \ldots, Y_T] = E[p Y_T + \lambda\mu \mid Y_1, \ldots, Y_T],$$

$$P_T = p Y_T + \lambda E[\mu \mid Y_1, \ldots, Y_T]. \tag{4.1}$$

The risk on the count variable $Y_{T+1}$ can be measured by

$$\begin{aligned} R_T &= V[Y_{T+1} \mid Y_1, \ldots, Y_T] \\ &= E[V[Y_{T+1} \mid Y_1, \ldots, Y_T, \mu] \mid Y_1, \ldots, Y_T] + V[E[Y_{T+1} \mid Y_1, \ldots, Y_T, \mu] \mid Y_1, \ldots, Y_T] \\ &= E[p(1 - p) Y_T + \lambda\mu \mid Y_1, \ldots, Y_T] + V[p Y_T + \lambda\mu \mid Y_1, \ldots, Y_T], \end{aligned}$$

$$R_T = p(1 - p) Y_T + \lambda E[\mu \mid Y_1, \ldots, Y_T] + \lambda^2 V[\mu \mid Y_1, \ldots, Y_T]. \tag{4.2}$$
As in the standard negative binomial framework, the pure premium and the risk measure depend on the conditional distribution of the heterogeneity factor given the claim history. The analytical expression of the conditional distribution of the heterogeneity factor is given below for a short claim history $T = 0, \ldots, 3$, which pertains to a new policyowner and to a customer with a seniority of up to 3 years.

Proposition 1.

(i) For $T = 0$, the conditional distribution of $\mu$ is $\gamma(A, A)$.

(ii) For $T = 1$, the conditional distribution of $\mu$ given $Y_1$ is

$$\gamma\left(A + y_1,\; A + \frac{\lambda}{1 - p}\right).$$

(iii) For $T = 2$, the conditional distribution of $\mu$ given $Y_1, Y_2$ is

$$\sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, y_1, y_2)\, \gamma\left(A + y_1 + y_2 - z_2,\; A + \lambda + \frac{\lambda}{1 - p}\right) \times \left[\sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, y_1, y_2)\right]^{-1},$$

where

$$\pi(z_2, y_1, y_2) = C_{y_1}^{z_2} \left(\frac{p}{1 - p}\right)^{z_2} \frac{1}{\lambda^{z_2}} \frac{1}{(y_2 - z_2)!} \frac{\Gamma(A + y_1 + y_2 - z_2)}{[A + \lambda + \lambda/(1 - p)]^{A + y_1 + y_2 - z_2}}.$$

(iv) For $T = 3$, the conditional distribution of $\mu$ given $Y_1, Y_2, Y_3$ is

$$\sum_{z_3=0}^{\min(y_2,y_3)} \sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, z_3, y_1, y_2, y_3)\, \gamma\left(A + y_1 + y_2 + y_3 - z_2 - z_3,\; A + 2\lambda + \frac{\lambda}{1 - p}\right) \times \left[\sum_{z_3=0}^{\min(y_2,y_3)} \sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, z_3, y_1, y_2, y_3)\right]^{-1},$$

where

$$\pi(z_2, z_3, y_1, y_2, y_3) = C_{y_2}^{z_3} C_{y_1}^{z_2} \left(\frac{p}{1 - p}\right)^{z_2 + z_3} \frac{1}{\lambda^{z_2 + z_3}} \frac{1}{(y_2 - z_2)!} \frac{1}{(y_3 - z_3)!} \times \frac{\Gamma(A + y_1 + y_2 + y_3 - z_2 - z_3)}{[A + 2\lambda + \lambda/(1 - p)]^{A + y_1 + y_2 + y_3 - z_2 - z_3}}.$$

Proof. See Appendix A. □
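For T = 1 the posterior in Proposition 1(ii) is a single gamma distribution, so formulas (4.1) and (4.2) can be evaluated in closed form. The Python sketch below assumes the shape and rate parametrization of γ(a, b), with mean a/b and variance a/b², which is the one consistent with γ(A, A) having mean 1; the function name is hypothetical.

```python
def premium_and_risk_after_one_year(y1: int, lam: float, p: float, A: float):
    """Pure premium P_1 (4.1) and risk R_1 (4.2) given the first-year count y1.

    Uses mu | y1 ~ gamma(shape = A + y1, rate = A + lam / (1 - p)),
    as in Proposition 1(ii); requires 0 <= p < 1.
    """
    shape = A + y1
    rate = A + lam / (1.0 - p)
    mu_mean = shape / rate           # E[mu | y1]
    mu_var = shape / rate ** 2       # V[mu | y1]
    premium = p * y1 + lam * mu_mean
    risk = p * (1.0 - p) * y1 + lam * mu_mean + lam ** 2 * mu_var
    return premium, risk


# One claim in the first year, with lam = 0.3, p = 0.3, A = 9.
premium_1, risk_1 = premium_and_risk_after_one_year(1, 0.3, 0.3, 9.0)

# With p = 0 the premium reduces to the negative binomial value (2.2).
premium_nb, _ = premium_and_risk_after_one_year(1, 0.3, 0.0, 9.0)
```

Setting p = 0 recovers the standard negative binomial premium λ(A + y₁)/(A + λ), which gives a quick consistency check of the implementation.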
Thus the conditional distribution of the heterogeneity factor is a mixture of gamma distributions with weights and degrees of freedom which depend on the claim history. The prediction of the heterogeneity factor and the pure premium follow directly from formula (4.1).

Proposition 2.

(i) For $T = 0$, $\hat{\mu}_0 = E(\mu) = 1$, $P_0 = \lambda$.

(ii) For $T = 1$, $\hat{\mu}_1 = E(\mu \mid y_1) = (A + y_1)/(A + \lambda/(1 - p))$ and $P_1 = p y_1 + \lambda[(A + y_1)/(A + \lambda/(1 - p))]$.

(iii) For $T = 2$, $P_2 = p y_2 + \lambda\hat{\mu}_2$, where

$$\hat{\mu}_2 = \sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, y_1, y_2) \frac{A + y_1 + y_2 - z_2}{A + \lambda + \lambda/(1 - p)} \times \left[\sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, y_1, y_2)\right]^{-1}.$$

(iv) For $T = 3$, $P_3 = p y_3 + \lambda\hat{\mu}_3$, where

$$\hat{\mu}_3 = \sum_{z_3=0}^{\min(y_2,y_3)} \sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, z_3, y_1, y_2, y_3) \frac{A + y_1 + y_2 + y_3 - z_2 - z_3}{A + 2\lambda + \lambda/(1 - p)} \times \left[\sum_{z_3=0}^{\min(y_2,y_3)} \sum_{z_2=0}^{\min(y_1,y_2)} \pi(z_2, z_3, y_1, y_2, y_3)\right]^{-1}.$$
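The T = 2 case of Proposition 2 can be sketched as a short Python routine. The function names are hypothetical, and the weights are rescaled by the factor Γ(A + y₁ + y₂)/D^{A + y₁ + y₂}, which does not depend on z₂ and cancels in the ratio; this rescaling is a numerical convenience assumed here, not part of the original formulas.

```python
import math


def predicted_heterogeneity_t2(y1: int, y2: int, lam: float,
                               p: float, A: float) -> float:
    """mu_hat_2 = E[mu | y1, y2] from Proposition 2(iii); requires 0 <= p < 1."""
    D = A + lam + lam / (1.0 - p)
    a0 = A + y1 + y2
    ratio = p / ((1.0 - p) * lam)
    num = den = 0.0
    for z in range(min(y1, y2) + 1):
        # pi(z, y1, y2) divided by the z-independent factor Gamma(a0) / D**a0.
        weight = (math.comb(y1, z) * ratio ** z / math.factorial(y2 - z)
                  * math.exp(math.lgamma(a0 - z) - math.lgamma(a0)) * D ** z)
        num += weight * (a0 - z) / D
        den += weight
    return num / den


def premium_t2(y1: int, y2: int, lam: float, p: float, A: float) -> float:
    """Pure premium P_2 = p * y2 + lam * mu_hat_2."""
    return p * y2 + lam * predicted_heterogeneity_t2(y1, y2, lam, p, A)


# The three histories with two claims in total, for lam = 0.3, p = 0.3, A = 9.
lam, p, A = 0.3, 0.3, 9.0
mu_20 = predicted_heterogeneity_t2(2, 0, lam, p, A)
mu_02 = predicted_heterogeneity_t2(0, 2, lam, p, A)
mu_11 = predicted_heterogeneity_t2(1, 1, lam, p, A)
premia = {h: premium_t2(h[0], h[1], lam, p, A) for h in [(2, 0), (1, 1), (0, 2)]}
```

Because each sum has at most min(y₁, y₂) + 1 terms, the computation is cheap for realistic claim counts; with p = 0 only the z₂ = 0 term survives and the negative binomial premium is recovered.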
We get analytical formulas that are easy to implement in practice, especially since the number of terms to be summed up is often very small, due to $y_t$ being equal to 0 or 1 in many years.

4.2. Comparison of pure premia

We can now compare the premia given in Proposition 2 with the standard negative binomial premium (the one obtained for $p = 0$).

• Case $T = 0$: At the beginning of the contract, no claim history is available and the two premia coincide.
• Case $T = 1$: After 1 year, we get a pure premium which is affine with respect to $y_1$ and monotone with respect to $p$. It ranges from the standard premium $\lambda[(A + y_1)/(A + \lambda)]$ for $p = 0$ to $y_1$ for $p = 1$. It is an increasing function of the autoregressive coefficient $p$ if $y_1 > \lambda$, and a decreasing function of $p$ otherwise.
• Case $T = 2$: In this case, the pure premium is no longer affine with respect to $y_1 + y_2$, or even with respect to $(y_1, y_2)$. For instance we get
◦ if $\min(y_1, y_2) = 0$,

$$P_2 = p y_2 + \lambda \frac{A + y_1 + y_2}{A + \lambda + \lambda/(1 - p)},$$

◦ if $\min(y_1, y_2) = 1$,

$$\begin{aligned} P_2 = p y_2 + \frac{\lambda}{A + \lambda + \lambda/(1 - p)} &\left[ \frac{1}{y_2} \frac{A + y_1 + y_2 - 1}{A + \lambda + \lambda/(1 - p)} (A + y_1 + y_2) + y_1 \frac{p}{1 - p} \frac{1}{\lambda} (A + y_1 + y_2 - 1) \right] \\ &\times \left[ \frac{1}{y_2} \frac{A + y_1 + y_2 - 1}{A + \lambda + \lambda/(1 - p)} + y_1 \frac{p}{1 - p} \frac{1}{\lambda} \right]^{-1}. \end{aligned}$$

Let us compare three claim histories formed by the following sequences of two claim counts: (2, 0), (1, 1) and (0, 2), respectively. In the standard negative binomial approach without lagged endogenous explanatory variables, the predicted heterogeneity $\hat{\mu}_2$ (the frequency part of the pure premium, respectively) depends only on the sum $y_1 + y_2$ and has the same value for the three claim histories. In the presence of an autoregressive effect, we get the following results:

claim history (2, 0):

$$\hat{\mu}_2 = \frac{A + 2}{A + \lambda + \lambda/(1 - p)}, \qquad P_2 = \lambda \frac{A + 2}{A + \lambda + \lambda/(1 - p)}.$$

claim history (0, 2):

$$\hat{\mu}_2 = \frac{A + 2}{A + \lambda + \lambda/(1 - p)}, \qquad P_2 = 2p + \lambda \frac{A + 2}{A + \lambda + \lambda/(1 - p)}.$$

claim history (1, 1):

$$\hat{\mu}_2 = \left[ \frac{(A + 1)(A + 2)}{A + \lambda + \lambda/(1 - p)} + \frac{p}{1 - p} \frac{1}{\lambda} (A + 1) \right] \left[ A + 1 + \frac{p}{1 - p} \frac{1}{\lambda} \left( A + \lambda + \frac{\lambda}{1 - p} \right) \right]^{-1}, \qquad P_2 = p + \lambda\hat{\mu}_2.$$
In the INAR(1) framework the three claim histories are distinguishable and the time of the claim arrival (i.e. in the first or in the second year) matters. The predicted heterogeneity is identical for sequences (0, 2) and (2, 0), but the
pure premia differ. Moreover, the predicted heterogeneity is lower in claim history (1, 1) than in claim histories (0, 2) and (2, 0). Thus the INAR(1) specification is a convenient way to account for the time of claim arrival, and to solve the question considered in Pinquet et al. (2001) and Pinquet (2000). As an illustration, we provide Figs. 1–5 that show the expected heterogeneity factor and the pure premium as functions of parameters λ, λ = 0.1, 0.2, . . . , 1 and p for A = 9. Straightforward interpretations of the parameter values are obtained by considering the standard case p = 0. In the negative binomial framework λ is simply the expected yearly number of claims. When λ = 0.1 and A = 9, the initial premium is equal to λ = 0.1; the premium after 1 year and one accident is P1 = λ[(A + 1)/(A + λ)] = 0.1(10/9.1) = 0.11, which corresponds to a 10% increase of the frequency part of the insurance premium. In Figs. 1–5, parameter p is measured on the x-axis, and one curve is plotted for each value of λ. For p = 0, the INAR(1) premia coincide with the standard negative binomial ones, that is P2 = λ[(A + 2)/(A + 2λ)] = λ[11/(9 + 2λ)] and are identical for the three claim histories. The effects of the autoregressive parameter p are as follows: (i) The premium increases with p for claim history (0, 2), in which a large number (2) of claims arrives at the end of the period considered, and decreases with p for claim history (2, 0) in which both claims arrive at the beginning of the period. (ii) The premium decreases with p in claim history (1, 1), in which the claims are more homogeneously spread in time. The decrease is stronger for small values of p in (1, 1) compared to claim history (2, 0), since the regularity in claim arrivals provides much information about the heterogeneity factor. In contrast, the decrease becomes weaker for about p = 0.8 which implies that unobserved heterogeneity is expected.
[Fig. 1. Expected factor for Y = (1, 1). Panel title: "mu hat for Y=(1,1), A=9"; x-axis: p.]

[Fig. 2. Premium for Y = (1, 1). Panel title: "p hat for Y=(1,1), A=9"; x-axis: p.]

[Fig. 3. Expected factor for Y = (2, 0) or Y = (0, 2). Panel title: "mu hat for Y=(2,0), A=9"; x-axis: p.]

[Fig. 4. Premium for Y = (2, 0). Panel title: "p hat for Y=(2,0), A=9"; x-axis: p.]

[Fig. 5. Premium for Y = (0, 2). Panel title: "p hat for Y=(0,2), A=9"; x-axis: p.]
(iii) When p = 1 the count process becomes a martingale; the pure premium is fixed and equal to the lagged number of claims.

Finally, it is interesting to look at the evolution of the premia in time. As an illustration, we fix the parameters at λ = 0.8, p = 0.3, A = 9. For the claim histories considered, we provide below a summary of the evolution of the premium over the first 3 years.

                          P0      P1      P2
Claim history (1, 1)
  Negative binomial       0.3     0.318   0.378
  Mixed INAR(1)           0.3     0.618   0.629
Claim history (2, 0)
  Negative binomial       0.3     0.354   0.378
  Mixed INAR(1)           0.3     0.948   0.444
Claim history (0, 2)
  Negative binomial       0.3     0.288   0.378
  Mixed INAR(1)           0.3     0.285   0.844
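The negative binomial benchmark used in this section (the case p = 0) can be sketched in a few lines. The function below is only an illustration: the name `nb_premium` is hypothetical, and the general expression λ(A + Σy)/(A + tλ) is the usual gamma-Poisson extension of the formulas for P1 and P2 quoted in the text, assuming a gamma heterogeneity factor with mean 1 and parameter A. It reproduces the worked example λ = 0.1, A = 9 rather than the table entries.

```python
def nb_premium(lam, A, claims):
    """Negative binomial pure premium after observing the yearly counts in `claims`.

    Assumes the standard gamma-Poisson setting: the heterogeneity factor is
    gamma distributed with mean 1 and parameter A, and yearly counts are
    Poisson with mean lam * factor (the case p = 0 in the text).
    """
    t = len(claims)                      # number of observed years
    return lam * (A + sum(claims)) / (A + t * lam)


# Worked example from the text: lam = 0.1, A = 9, one accident in year 1.
p1 = nb_premium(0.1, 9, [1])             # lam * (A + 1) / (A + lam) = 0.1 * 10 / 9.1

# With p = 0 only the total number of claims matters, not their timing.
p2_by_history = {h: nb_premium(0.1, 9, list(h)) for h in [(1, 1), (2, 0), (0, 2)]}
```

Under these assumptions the three claim histories give the same value of P2, which is the property that the INAR(1) specification with p > 0 relaxes.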
5. Concluding remarks

The integer valued autoregressive model with unobserved heterogeneity is a tractable specification which allows for a detailed analysis of the effects of claim history on the insurance premium (see footnote 14). Indeed, it is possible to separate the effects due to the introduction of lagged claim counts among the explanatory variables from those due to the unobservable heterogeneity factor.

The analysis can be extended in several directions. For instance, we can introduce a time dependent unobservable heterogeneity factor µit with suitable autoregressive dynamics. This extension would make it possible to model the moral hazard phenomenon: the heterogeneity component can be interpreted as an individual effort which depends on lagged claims by means of the expected future malus if an accident occurs (see, e.g. Gourieroux, 1999b). Models with dynamic heterogeneity have been considered for instance in Gourieroux and Visser (1997), Gourieroux and Jasiak (2000, 2001), Pinquet et al. (2001) and Pinquet (2003, Section 4).

Another topic for further research concerns misspecification. The effects of changing the specification from a Poisson regression model to a negative binomial model are well known for risk predictions and for the premium level. In this paper we described how these change when an INAR(1) model with heterogeneity is used instead of a negative binomial model. However, we have not discussed the estimation bias. It would be interesting to examine the magnitude of the bias if a negative binomial model is estimated while the true model is an INAR(1) with heterogeneity.
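The remark in footnote 14 that the model is easy to simulate can be illustrated with a minimal sketch. It follows the densities used in Appendix A: a gamma heterogeneity factor µ with density proportional to µ^(A−1) exp(−Aµ), a first count that is Poisson with mean λµ/(1 − p), and subsequent counts obtained by binomial thinning of the previous count plus a Poisson(λµ) innovation. The function names and the simple Poisson sampler are illustrative choices, not part of the paper, and the sketch covers simulation only, not the simulation based estimators themselves.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by multiplying uniforms (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_mixed_inar1(lam, p, A, T, rng):
    """Simulate one claim history of length T from the INAR(1) model with
    gamma heterogeneity, as specified by the densities in Appendix A.

    Returns the drawn heterogeneity factor and the list of yearly counts.
    Requires 0 <= p < 1.
    """
    # gammavariate takes (shape, scale): shape A and scale 1/A give mean 1.
    mu = rng.gammavariate(A, 1.0 / A)
    # Initial count: Poisson with mean lam * mu / (1 - p).
    y = [poisson(rng, lam * mu / (1.0 - p))]
    for _ in range(T - 1):
        # Binomial thinning: each previous claim "survives" with probability p.
        survivors = sum(rng.random() < p for _ in range(y[-1]))
        # Innovation: new claims arrive as Poisson(lam * mu).
        y.append(survivors + poisson(rng, lam * mu))
    return mu, y


rng = random.Random(2004)                # fixed seed for reproducibility
mu, history = simulate_mixed_inar1(lam=0.3, p=0.3, A=9.0, T=3, rng=rng)
```

Repeating the call over many policyholders gives simulated panels whose yearly mean count should be close to λ/(1 − p), which is a quick sanity check before using such draws in a simulated likelihood or moment criterion.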
14 Like the negative binomial model, this model is also very easy to simulate and therefore can be estimated by standard simulation based methods, such as simulated maximum likelihood or the simulated method of moments (see, e.g. Gourieroux and Monfort, 1996).

Appendix A. Conditional heterogeneity distribution

• Case T = 1
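The joint distribution of $Y_1$ and $\mu$ is:

\[
\begin{aligned}
l(y_1, \mu) &= l(y_1 \mid \mu)\, l(\mu)
= \exp\left(-\frac{\lambda\mu}{1-p}\right) \frac{[\lambda\mu/(1-p)]^{y_1}}{y_1!}\, \frac{A^A}{\Gamma(A)} \exp(-\mu A)\, \mu^{A-1} \\
&= C(y_1)\, \mu^{y_1+A-1} \exp\left[-\mu\left(A + \frac{\lambda}{1-p}\right)\right], \text{ say.}
\end{aligned}
\]

We find that the conditional distribution of $\mu$ is $\gamma[A + y_1, A + \lambda/(1-p)]$.

• Case T = 2

The joint distribution of $Y_2$, $Y_1$, $\mu$ is (see, e.g. Bockenholt, 1999, Eq. (2)):

\[
\begin{aligned}
l(y_2, y_1, \mu) &= l(y_2 \mid y_1, \mu)\, l(y_1 \mid \mu)\, l(\mu) \\
&= \left\{ \sum_{z_2=0}^{\min(y_1,y_2)} C_{y_1}^{z_2}\, p^{z_2} (1-p)^{y_1-z_2} \exp(-\lambda\mu) \frac{(\lambda\mu)^{y_2-z_2}}{(y_2-z_2)!} \right\} \\
&\quad \times \exp\left(-\frac{\lambda\mu}{1-p}\right) \frac{[\lambda\mu/(1-p)]^{y_1}}{y_1!}\, \frac{A^A}{\Gamma(A)} \exp(-\mu A)\, \mu^{A-1} \\
&= C(y_1, y_2) \sum_{z_2=0}^{\min(y_1,y_2)} \left\{ C_{y_1}^{z_2} \left(\frac{p}{1-p}\right)^{z_2} \frac{1}{\lambda^{z_2}}\, \frac{1}{(y_2-z_2)!}\, \mu^{A+y_1+y_2-z_2-1} \right\} \\
&\quad \times \exp\left[-\mu\left(A + \lambda + \frac{\lambda}{1-p}\right)\right].
\end{aligned}
\]

The result follows directly.

• Case T = 3

We get

\[
\begin{aligned}
l(y_3, y_2, y_1, \mu) &= \left\{ \sum_{z_3=0}^{\min(y_2,y_3)} C_{y_2}^{z_3}\, p^{z_3} (1-p)^{y_2-z_3} \exp(-\lambda\mu) \frac{(\lambda\mu)^{y_3-z_3}}{(y_3-z_3)!} \right\} \\
&\quad \times \left\{ \sum_{z_2=0}^{\min(y_1,y_2)} C_{y_1}^{z_2}\, p^{z_2} (1-p)^{y_1-z_2} \exp(-\lambda\mu) \frac{(\lambda\mu)^{y_2-z_2}}{(y_2-z_2)!} \right\} \\
&\quad \times \exp\left(-\frac{\lambda\mu}{1-p}\right) \frac{[\lambda\mu/(1-p)]^{y_1}}{y_1!}\, \frac{A^A}{\Gamma(A)} \exp(-\mu A)\, \mu^{A-1} \\
&= C(y_1, y_2, y_3) \sum_{z_3=0}^{\min(y_2,y_3)} \sum_{z_2=0}^{\min(y_1,y_2)} \left\{ C_{y_2}^{z_3} C_{y_1}^{z_2} \left(\frac{p}{1-p}\right)^{z_2+z_3} \frac{1}{\lambda^{z_2+z_3}}\, \frac{1}{(y_2-z_2)!}\, \frac{1}{(y_3-z_3)!} \right. \\
&\quad \left. \times\, \mu^{A+y_1+y_2+y_3-z_2-z_3-1} \right\} \exp\left[-\mu\left(A + 2\lambda + \frac{\lambda}{1-p}\right)\right].
\end{aligned}
\]

The result follows.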
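The expressions above imply that, for T = 1, the conditional mean of µ is (A + y1)/(A + λ/(1 − p)), and that, for T = 2, the conditional distribution of µ is a finite mixture of gamma distributions with shape A + y1 + y2 − z2 and common rate A + λ + λ/(1 − p). The sketch below computes the corresponding expected heterogeneity factor by integrating each term of the sum against µ. The function names are hypothetical, and the code is meant as a numerical check of the derivation rather than as the authors' implementation.

```python
import math


def posterior_mean_T1(lam, p, A, y1):
    """Mean of the gamma[A + y1, A + lam/(1 - p)] conditional distribution."""
    return (A + y1) / (A + lam / (1.0 - p))


def posterior_mean_T2(lam, p, A, y1, y2):
    """Expected heterogeneity factor given (y1, y2), from the T = 2 joint density.

    Each term of the sum over z is proportional to a gamma kernel
    mu**(shape - 1) * exp(-rate * mu) with shape = A + y1 + y2 - z.
    """
    rate = A + lam + lam / (1.0 - p)
    num = den = 0.0
    for z in range(min(y1, y2) + 1):
        shape = A + y1 + y2 - z
        # Coefficient of the kernel, up to the common factor C(y1, y2).
        coef = math.comb(y1, z) * (p / ((1.0 - p) * lam)) ** z / math.factorial(y2 - z)
        # Integral of the kernel over mu: Gamma(shape) / rate**shape.
        mass = coef * math.exp(math.lgamma(shape) - shape * math.log(rate))
        den += mass
        num += mass * shape / rate       # each gamma component has mean shape / rate
    return num / den


# Illustrative values: lam = 0.5, p = 0.3, A = 9.
factor_11 = posterior_mean_T2(0.5, 0.3, 9.0, 1, 1)
factor_20 = posterior_mean_T2(0.5, 0.3, 9.0, 2, 0)
factor_02 = posterior_mean_T2(0.5, 0.3, 9.0, 0, 2)
```

With these formulas the expected factor is the same for histories (2, 0) and (0, 2), is lower for (1, 1) when p > 0, and reduces to (A + y1 + y2)/(A + 2λ) when p = 0, in line with the discussion of Figs. 1 and 3.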
References

Al-Osh, M., Alzaid, A., 1987. First order integer valued autoregressive (INAR(1)) processes. Journal of Time Series Analysis 8, 261–275.
Alzaid, A., Al-Osh, M., 1990. An integer valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability 27, 314–324.
Bockenholt, U., 1999. Mixed INAR(1) Poisson regression models: analyzing heterogeneity and serial dependence in longitudinal count data. Journal of Econometrics 89, 317–338.
Brannas, K., 1995. Explanatory Variables in the AR(1) Count Data Model. Working Paper No. 385. Department of Economics, University of Umea, Sweden.
Brannas, K., Hellstrom, J., 2001. Generalizations to the integer valued AR(1) model. Econometric Reviews 20, 425–443.
Cameron, C., Trivedi, P., 1986. Econometric models based on count data: comparisons and applications of some estimators. Journal of Applied Econometrics 1, 29–53.
Chiappori, P.A., Salanié, B., 2000. Testing for asymmetric information in insurance markets. Journal of Political Economy 108, 56–78.
Diggle, P., Heagerty, P., Liang, K., Zeger, S., 2002. Analysis of Longitudinal Data, 2nd ed. Oxford University Press, Oxford.
Dionne, G., Gourieroux, C., Vanasse, C., 1998. Evidence of adverse selection in automobile insurance market. In: Dionne, G., Laberge-Nadeau, C. (Eds.), Automobile Insurance. Kluwer Academic Publishers, Dordrecht, pp. 13–41.
Dionne, G., Gourieroux, C., Vanasse, C., 2001. Evidence of adverse selection in automobile insurance market. Journal of Political Economy 109, 444–453.
Dionne, G., Vanasse, C., 1989. A generalization of automobile insurance rating models: the negative binomial distribution with a regression component. ASTIN Bulletin 19, 199–221.
Gerber, H., Jones, D., 1975. Credibility formulas of the updating type. Transactions of the Society of Actuaries 27, 31–52.
Gourieroux, C., 1999a. Statistique de l’Assurance. Economica, Paris.
Gourieroux, C., 1999b. Econometrics of risk classification in insurance. Geneva Papers for Risk and Insurance 24, 119–137.
Gourieroux, C., Jasiak, J., 2000. Nonlinear panel data models with dynamic heterogeneity. In: Krishnakumar, J., Ronchetti, E. (Eds.), Panel Data Econometrics. Future Directions, pp. 127–147.
Gourieroux, C., Jasiak, J., 2001. Econometric analysis of individual risk. http://dept.econ.yorku.ca/∼jasiakj.
Gourieroux, C., Monfort, A., 1993. Pseudo-maximum likelihood methods. In: Maddala, Rao, Vinod (Eds.), Handbook of Statistics, vol. 11. Elsevier, Amsterdam, pp. 335–362.
Gourieroux, C., Monfort, A., 1996. Simulation Based Econometric Methods. Oxford University Press, Oxford, UK.
Gourieroux, C., Monfort, A., Trognon, A., 1984a. Pseudo-maximum likelihood methods: theory. Econometrica 52, 681–700.
Gourieroux, C., Monfort, A., Trognon, A., 1984b. Pseudo-maximum likelihood methods: application to Poisson models. Econometrica 52, 701–720.
Gourieroux, C., Visser, M., 1997. A count data model with unobserved heterogeneity. Journal of Econometrics 79, 247–268.
Greene, W., 1997. Econometric Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Hausman, J., Hall, B., Griliches, Z., 1984. Econometric models for count data with an application to the patents R and D relationship. Econometrica 52, 909–938.
Lawless, J., 1987. Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics 15, 209–225.
Lemaire, J., 1985. Automobile Insurance: Actuarial Models. Kluwer Academic Publishers, Nijhoff.
Lemaire, J., 1995. Bonus-malus System in Automobile Insurance. Kluwer Academic Publishers, Dordrecht.
Liang, K., Zeger, S., 1986. Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.
McCullagh, P., 1983. Quasi-likelihood function. Annals of Statistics 11, 59–67.
McCullagh, P., Nelder, J., 1989. Generalized Linear Models. Chapman & Hall, London.
McKenzie, E., 1985. Some simple models for discrete variate time series. Water Resource Bulletin 21, 645–650.
McKenzie, E., 1988. Some ARMA models for dependent sequences for Poisson counts. Advances in Applied Probability 20, 822–835.
Pinquet, J., 2000. Experience rating through heterogeneous models. In: Dionne, G. (Ed.), Handbook of Insurance. Kluwer Academic Publishers, Dordrecht, Chapter 14, pp. 459–500.
Pinquet, J., Guillen, M., Bolance, C., 2001. Allowance for the age of claims in bonus-malus systems. ASTIN Bulletin 31, 337–340.
Pinquet, J., 2003. In: Bertrand, P., Gourieroux, C. (Eds.), Prise en compte de l’ancienneté des périodes dans les modèles de risque en fréquence: aspects théoriques et applications à la tarification des risques en assurance automobile. Journal de la Société Française de Statistique, in press.
Puelz, R., Snow, A., 1994. Evidence on adverse selection: equilibrium signalling and cross subsidization in the insurance market. Journal of Political Economy 102 (2), 236–257.
Sundt, B., 1988. Credibility estimators with geometric weights. Insurance: Mathematics and Economics 7, 113–122.