Dependence modeling in non-life insurance using the Bernstein copula


Insurance: Mathematics and Economics 50 (2012) 430–436

Contents lists available at SciVerse ScienceDirect

Insurance: Mathematics and Economics journal homepage: www.elsevier.com/locate/ime

Dependence modeling in non-life insurance using the Bernstein copula✩

Dorothea Diers a, Martin Eling b, Sebastian D. Marek c,∗

a Provinzial NordWest Holding AG, 48143 Münster, Germany
b Institute of Insurance Economics, University of St. Gallen, 9010 St. Gallen, Switzerland
c Institute of Insurance Science, University of Ulm, 89069 Ulm, Germany

Article info

Article history: Received July 2011; received in revised form February 2012; accepted 10 February 2012

Keywords: Non-life insurance; Copulas; Bernstein copula; Goodness-of-fit; Simulation

Abstract

This paper illustrates the modeling of dependence structures of non-life insurance risks using the Bernstein copula. We conduct a goodness-of-fit analysis and compare the Bernstein copula with other widely used copulas. Then, we illustrate the use of the Bernstein copula in a value-at-risk and tail-value-at-risk simulation study. For both analyses we utilize German claims data on storm, flood, and water damage insurance for calibration. Our results highlight the advantages of the Bernstein copula, including its flexibility in mapping inhomogeneous dependence structures and its easy use in a simulation context due to its representation as a mixture of independent Beta densities. Practitioners and regulators working toward appropriate modeling of dependences in a risk management and solvency context can benefit from our results. © 2012 Elsevier B.V. All rights reserved.

1. Introduction

Using copulas in risk management has recently become popular in both academia and practice. Copula applications are presented for modeling dependence between stock returns (Jondeau and Rockinger, 2006), CDO pricing (Hofert and Scherer, 2011), currency option pricing (Salmon and Schleicher, 2006), or internal risk models (Eling and Toplek, 2009); further areas of application can be found, e.g., in Genest et al. (2009a). However, several popular copulas, such as elliptical and Archimedean copulas, exhibit a certain degree of symmetry or are restricted to certain correlation structures, which is not always suitable for risk modeling in practice.

The Bernstein copula, which has only recently received attention in an insurance context (Pfeifer et al., 2009), has the potential to overcome these drawbacks while still being applicable in higher dimensions. It is a flexible, non-parametric copula capable of approximating any copula arbitrarily well and thus may serve as a model for an unknown underlying dependence structure (Sancetta and Satchell, 2004). We explore this flexibility in a realistic environment by fitting the Bernstein copula to empirical claims data from six lines of business and simulating the aggregate value-at-risk and tail-value-at-risk. Our data are from lines of business driven by exposure to natural perils. The resulting portfolio is difficult to model since different lines are combined, resulting in an inhomogeneous dependence structure.

Goodness-of-fit is an aspect usually not considered in the literature on copula modeling and calibration (Embrechts, 2009). Thus, this paper contributes to the literature by implementing the Bernstein copula in a higher dimension, assessing its goodness-of-fit in the modeling of dependence structures of non-life insurance risks, and illustrating its use in a simulation context. For this purpose we use the representation of the Bernstein copula as a mixture of Beta densities.¹ This representation facilitates an efficient random sampling algorithm, which has, to our knowledge, so far not been applied in the context of the Bernstein copula.

To preview our results, we show that the Bernstein copula performs especially well when multiple risk classes with inhomogeneous dependence structure are combined. We conclude that the Bernstein copula is a promising alternative for modeling dependence structures in internal risk models. Practitioners working on calibration and implementation of such models, as well as regulators responsible for the validation of internal risk models, can benefit from our results.

The remainder of this paper is organized as follows. In Section 2, we present the analyzed copulas, parameter estimation, and sampling algorithms. In Section 3, we introduce the data. Section 4 reports the results of the goodness-of-fit analysis. In Section 5, we show the value-at-risk and tail-value-at-risk simulation results. We conclude in Section 6.

✩ We are grateful to an anonymous referee for valuable suggestions. We also thank Martin Hampel, Christian Hering, Andreas Niemeyer, Thomas Parnitzke, Dietmar Pfeifer, Jan-Philipp Schmidt, and the participants of the 2010 World Risk and Insurance Economics Congress, the 2011 Annual Meeting of the German Insurance Science Association, and the 15th International Congress on Insurance: Mathematics and Economics for their comments. Furthermore, we acknowledge use of the Common Ulm Stuttgart Server project's computing facilities.

∗ Corresponding author. E-mail address: [email protected] (S.D. Marek). URL: http://www.uni-ulm.de/mawi/ivw (S.D. Marek).

¹ The representation of densities based on Bernstein polynomials as a mixture of Beta distributions is an already established result; for instance, Sancetta and Satchell (2004) use it in their representation of the Bernstein copula's Spearman's rho. We are grateful to an anonymous referee for making us aware of this representation.

0167-6687/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.insmatheco.2012.02.007

2. Analyzed copulas

2.1. Motivation for analyzing the Bernstein copula

We use the non-parametric Bernstein copula to analyze its potential for solving selected problems in the application of standard copula approaches (symmetry, parameter restrictions, applicability in high dimensions). The type of problem considered here is of high importance in light of Solvency II, which requires adequate modeling of the dependences between different types of risks in an insurance company. In this context, Eling and Toplek (2009) discuss various elliptical and nested Archimedean copulas. We extend that analysis with a flexible modeling alternative that is easy to use and readily calibrated to empirical data.

There are many other non-parametric copulas, such as the grid-type copula (Pfeifer et al., 2009; Kulpa, 1999), the box copula (Hummel, 2009), or the Fourier copula (Lowin, 2010). We focus on the Bernstein copula for four reasons. First, it has recently attracted attention in insurance modeling and is thus a natural candidate for further analysis. In this context, Pfeifer et al. (2009) focus on the mathematical properties and calculation of the Bernstein copula and apply it to a two-dimensional dataset of insurance claims. We use the Bernstein copula in a higher-dimensional framework and analyze its statistical fit. Second, the Bernstein copula is attractive from a modeling perspective.
Standard copula approaches from the elliptical and Archimedean class exhibit a certain degree of symmetry or are restricted to certain correlation structures, which may not always be desirable. The Bernstein copula is not bound to these limitations and thus can provide a more adequate estimate of the underlying dependence structure. Third, because calibration and simulation efforts do not increase exponentially with the dimension, the Bernstein copula is also suitable in higher dimensions, which is a major advantage compared to other parametric and non-parametric estimators. Fourth, its mathematical properties are interesting as the Bernstein estimator converges to the underlying dependence structure, provides a higher rate of consistency than other common non-parametric estimators (Sancetta and Satchell, 2004; Kulpa, 1999), and does not suffer from boundary bias as do kernel-based copulas.

However, the Bernstein copula shares the disadvantages of other non-parametric estimators (e.g., the bias-variance tradeoff). Even though it can approximate any behavior in the tail, it cannot model asymptotic tail dependence (Sancetta and Satchell, 2004). Approximation quality may thus vary.

2.2. Considered copulas

In this analysis, we consider the independence copula as well as various parametric and non-parametric copulas. Specifically, we analyze the parametric elliptical (Gauss and Student) and Archimedean (Gumbel and Clayton) classes and the Bernstein, grid-type, and kernel copulas as non-parametric alternatives.

The independence copula, in which all risk classes are considered as independent, serves as the benchmark. The implementation of dependence structures and copulas in internal risk models increases model complexity and thus costs. Therefore, more complex copulas should at least perform better than this benchmark.

From the class of elliptical copulas we choose the Gauss and the Student copulas. Stemming from elliptical distributions, both induce symmetric dependence structures, which may not always be suitable in an insurance context with possibly erratic or clustered claim realizations.

From the class of Archimedean copulas we consider the Clayton and Gumbel copulas. For this class there are two possible setups: the exchangeable case and the non-exchangeable case. The exchangeable, single-parametric case is less adequate in our context because it results in identical margins and dependence among all risk classes. In the non-exchangeable case, nested Archimedean copulas (NACs) couple multiple Archimedean copulas with different generators or, as a simplification, with the same generator. This construction better reflects the dependence situation in our data and will be described in more detail in Section 3.

As non-parametric copulas we consider the Bernstein copula, the closely related grid-type copula, and a kernel-based copula. For the Bernstein and grid-type copula, we use the notation from Pfeifer et al. (2009) with $d \in \mathbb{N}$ denoting the dimension of the copula (i.e., the number of considered risk classes), $n \in \mathbb{N}$ the sample size used for calibration of the copula, and $m_i \in \mathbb{N}$ the grid size for $i = 1, \ldots, d$. Let $T_i = \{0, 1, \ldots, m_i - 1\}$ and

\[ I_{k_1,\ldots,k_d} := \prod_{i=1}^{d} \left( \frac{k_i}{m_i}, \frac{k_i + 1}{m_i} \right] \]

for all possible choices of $(k_1, \ldots, k_d) \in \prod_{i=1}^{d} T_i$, where the products denote Cartesian products. Thus $I_{k_1,\ldots,k_d}$ describes a grid over the $[0, 1]^d$ hypercube with $\prod_{i=1}^{d} m_i$ cells. Let $U = (U_1, \ldots, U_d)$ be some discrete random vector with uniform margins over $T_i$. With $p(k_1, \ldots, k_d) := P\left( \bigcap_{i=1}^{d} \{U_i = k_i\} \right)$ and $(u_1, \ldots, u_d) \in [0, 1]^d$, the general Bernstein and grid-type copula density can be given as

\[ c(u_1, \ldots, u_d) := \sum_{k_1=0}^{m_1-1} \cdots \sum_{k_d=0}^{m_d-1} p(k_1, \ldots, k_d) \prod_{i=1}^{d} m_i \, \varphi(m_i, k_i, u_i). \]

For $\varphi(m, k, u) = B(m-1, k, u) := \binom{m-1}{k} u^k (1-u)^{m-1-k}$, i.e., the Bernstein polynomials, we receive the Bernstein density, and for $\varphi(m, k, u) = \mathbf{1}_{\left( \frac{k}{m}, \frac{k+1}{m} \right]}(u)$, i.e., the indicator function, we receive the grid-type density. For this analysis, we choose $m_i \equiv m$ constant. For each of the cells $I_{k_1,\ldots,k_d}$, a probability estimate can be obtained, which is summarized in a $d$-dimensional contingency table $p$. The Bernstein polynomials can be interpreted as smoothing functions that disperse parts of the probability mass to surrounding cells. The degree of the polynomials determines the intensity of smoothing.

For the kernel copula we use a setup adopted from Fermanian and Scaillet (2003) with a Gaussian kernel. We evaluate the kernel copula by $C(u_1, \ldots, u_d) = \hat{F}(\hat{\xi}_1, \ldots, \hat{\xi}_d)$, with $\hat{F}$ as multivariate Gaussian kernel distribution function and $\hat{\xi}_i$ as a kernel-based estimate of the quantile with probability level $u_i$ of the $i$-th data vector. This construction corrects to some extent the boundary bias inherent in kernel-based copulas. There are alternatives that may provide better performance; for example, Chen and Huang (2007) propose local-linear kernels and Bouezmarni and Rombouts (2009) suggest a semi-parametric setup with non-parametric margins and parametric copula functions. For our higher-dimensional analysis we need, however, an efficient sampling algorithm, which is not yet available for these modified estimators. For this reason we stay with the Fermanian and Scaillet (2003) setup.

2.3. Copula estimation

For parameter estimation of elliptical copulas we rely on the canonical maximum likelihood (CML) method. We disregard the full maximum likelihood method, which requires maximization of a demanding, high-dimensional likelihood function. We also disregard the inference for margins method, which requires explicit selection and fitting of the marginal distributions. Wrong assumptions and bad fit of margins may severely distort simulation results (see Genest et al., 2009b, which also gives a brief overview of estimation techniques). Using the CML method, observations are transformed into relative ranks, with the average rank used for tied ranks, and parameters are estimated using a pseudo-maximum likelihood approach.

In the NAC case, we use CML estimation for the parameters of a bivariate dataset. For parameters describing the dependence between more than two vectors, the average Kendall's tau of all rank-transformed bivariate vector combinations is converted into the corresponding copula parameter (as proposed by Savu and Trede, 2010). This simplification allows us to derive an estimate of the copula parameter for those line combinations in our data with mixed, i.e., positive and negative, correlation structure. If a bivariate dataset or average Kendall's tau implies a negative correlation, other approaches must be used since the parameter ranges of the NAC generator functions are restricted to positive correlations. One approach might be to set non-allowed correlations to the nearest allowed value and use the resulting parameter estimate. In this paper, all relevant correlations are positive so that such an approach is not needed.

For the non-parametric Bernstein copula, we need to approximate the probabilities $p(k_1, \ldots, k_d)$. Following Pfeifer et al. (2009) and our approach in the parametric case, we obtain estimates based on the relative ranks of the original observations. Then, the relative frequency of the observations in each of the cells $I_{k_1,\ldots,k_d}$ of the grid is calculated. We note, however, that the resulting margins are not necessarily distributed uniformly.
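The estimation step for the Bernstein copula can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the cells are assumed to be half-open intervals $(k/m, (k+1)/m]$, and ties receive the average rank as described for the CML method. The last function evaluates the Bernstein density for a fitted table with a constant grid size m.

```python
from collections import Counter
from math import ceil, comb


def average_ranks(column):
    """1-based ranks of the values in `column`; tied values share their mean rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and column[order[end + 1]] == column[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def contingency_table(data, m):
    """Relative frequency of the rank-transformed observations in each grid cell.

    `data` is a list of d-dimensional observations. The result maps a cell
    index (k_1, ..., k_d) to its estimated probability; empty cells are omitted.
    Assumption: relative rank r/n falls into cell k when k/m < r/n <= (k+1)/m.
    """
    n, d = len(data), len(data[0])
    ranks = [average_ranks([row[j] for row in data]) for j in range(d)]
    counts = Counter(
        tuple(min(m - 1, max(0, ceil(ranks[j][i] * m / n) - 1)) for j in range(d))
        for i in range(n)
    )
    return {cell: count / n for cell, count in counts.items()}


def bernstein_density(u, p, m):
    """Bernstein copula density at the point u for table p and grid size m."""
    total = 0.0
    for cell, prob in p.items():
        term = prob
        for k, ui in zip(cell, u):
            term *= m * comb(m - 1, k) * ui**k * (1 - ui) ** (m - 1 - k)
        total += term
    return total
```

Storing only the non-empty cells keeps the table small in higher dimensions, since at most n of the m^d cells can contain observations.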
The resulting table is also used for the grid-type copula. For the kernel copula, relative rank data are again used and no further estimation is needed.

2.4. Random sampling

Random numbers from the elliptical copulas are drawn using an inversion method. Sampling from the nested Clayton and Gumbel copula uses the efficient sampling algorithm developed by Hofert (2011).

For sampling from the Bernstein copula, Pfeifer et al. (2009) use rejection sampling since the usual methods (inversion, conditional sampling) are not feasible due to the complexity of the density and distribution function. We, however, take advantage of the fact that the Bernstein copula can be represented as a mixture of independent Beta distributions. This relationship facilitates an efficient random sampling algorithm, which has, to our knowledge, to date not been applied in this context. Sampling a $d$-dimensional vector of random numbers $(U_1, \ldots, U_d)$ from the Bernstein copula involves the following two steps.

1. Sample $(K_1, \ldots, K_d) \in \{0, \ldots, m-1\}^d$ such that $P((K_1, \ldots, K_d) = (k_1, \ldots, k_d)) = p(k_1, \ldots, k_d)$, with $p$ being the $d$-dimensional contingency table estimated as described above.
2. Sample $(U_1, \ldots, U_d)$, with independent $U_i \sim \mathrm{Beta}(K_i + 1, m - K_i)$ for $i = 1, \ldots, d$.

For the grid-type copula, we sample from the uniform distribution with support $\left( \frac{K_i}{m}, \frac{K_i + 1}{m} \right]$ in step 2 ($\varphi(\cdot)$ is the indicator function with this support). For the kernel copula, we use a modified bootstrap approach described by Hörmann and Leydold (2000) as well as their method for setting the smoothing parameter.
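The two sampling steps can be sketched in Python using only the standard library. The sketch rests on the identity that the Beta(k + 1, m − k) density equals m·B(m − 1, k, u), which is what makes the Bernstein density a mixture of products of Beta densities. The function name and the dictionary layout of the table p (cell index mapped to probability) are assumptions made for illustration, not taken from the paper.

```python
import random


def sample_copula(p, m, size, rng=None, grid_type=False):
    """Draw `size` vectors from the Bernstein (or grid-type) copula.

    p: dict mapping a cell index (k_1, ..., k_d) to its probability.
    m: common grid size; every k_i lies in {0, ..., m - 1}.
    """
    rng = rng or random.Random()
    cells = list(p)
    weights = [p[cell] for cell in cells]
    draws = []
    # Step 1: pick a cell (K_1, ..., K_d) with probability p(K_1, ..., K_d).
    for cell in rng.choices(cells, weights=weights, k=size):
        if grid_type:
            # Grid-type copula: uniform within the chosen cell.
            u = tuple((k + rng.random()) / m for k in cell)
        else:
            # Step 2: independent Beta(K_i + 1, m - K_i) draws per coordinate.
            u = tuple(rng.betavariate(k + 1, m - k) for k in cell)
        draws.append(u)
    return draws
```

For example, `sample_copula({(0, 0): 0.5, (1, 1): 0.5}, m=2, size=5, rng=random.Random(42))` returns five bivariate draws from a positively dependent Bernstein copula. The cost per draw is linear in d, which matches the observation that simulation effort does not grow exponentially with the dimension.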

3. Data

The dataset used to calibrate the copulas consists of claims data from six lines of business provided by a medium-sized German insurer. We consider industry storm insurance (IS), homeowners storm insurance (HS), contents storm insurance (CS), contents flood insurance (CF), homeowners flood insurance (HF), and water damage insurance (WD). The data encompass absolute claim sizes but no claim frequencies, are denominated in euros, and are corrected for inflation. The time span covered is 2000–2006, giving us 84 monthly observations per line.²

The motivation for considering these lines of business is that they were described by the insurer as difficult to model in practice. For these lines, there is a high claim potential and no (known) causal relation between included sectors, like storm and water damages. Note that even though we consider lines with exposure to natural perils, the return period is much shorter than for natural catastrophes such as earthquakes or hurricanes. The claims in our data occur regularly, with storm claims concentrated in the spring and winter months. Other lines, such as the water damage insurance, are not prone to such seasonality.

Also, risk modeling of natural catastrophe claims in the context of commercial modeling suites (such as, e.g., RMS, EQECAT, AIR) is distinct from our approach. In such models a set of hazard scenarios is applied to a portfolio of insured objects, which can originate from several lines of business and represent the insurer's exposure. With detailed information on the exposure, such as geographical location and physical properties, an estimate of the vulnerability of the insured objects to the simulated hazards can be derived. Combined, this yields an estimate of the financial losses or claims.

Table 1. Correlation and coefficient of variation of lines of business included in the empirical dataset.

| Line | CoV  | Corr. IS | Corr. HS | Corr. CS | Corr. CF | Corr. HF | Corr. WD |
|------|------|----------|----------|----------|----------|----------|----------|
| IS   | 2.21 | 1.00     |          |          |          |          |          |
| HS   | 2.09 | 0.80     | 1.00     |          |          |          |          |
| CS   | 1.54 | 0.68     | 0.74     | 1.00     |          |          |          |
| CF   | 2.28 | 0.16     | 0.18     | 0.33     | 1.00     |          |          |
| HF   | 2.12 | 0.12     | 0.12     | 0.22     | 0.58     | 1.00     |          |
| WD   | 0.32 | 0.01     | −0.01    | −0.10    | −0.07    | −0.01    | 1.00     |

Table 1 reports Kendall's rank correlation among the considered lines and the coefficient of variation (CoV, defined as quotient of standard deviation and mean) per line. The coefficient of variation highlights the high claim potential. It is larger than 2 for the storm and flood lines (with one exception) and notably smaller than 0.5 for the water damage claims. Variation of the CoV may well be an indication of variation in dependence structures. Further, there is strong correlation among the storm lines (0.80, 0.74, and 0.68) and flood lines (0.58), which seems reasonable as different types of property in one region could be equally affected by the same peril. The correlation of the water damage claims with the other lines is mixed, partly positive and partly negative, but smaller than the other correlations.

In the NAC case, we group claims by line of business on the lower level. When modeling all six lines simultaneously, this structure can be easily visualized by the resulting copula for a hierarchy with two levels:

\[ C(u_1, \ldots, u_6; \theta_0, \theta_{1,1}, \theta_{1,2}) = C\bigl( C(u_1, u_2, u_3; \theta_{1,1}),\; C(u_4, u_5; \theta_{1,2}),\; u_6;\; \theta_0 \bigr). \]

On the lower level, one copula parameter describes the dependence structure between the three storm claim vectors ($\theta_{1,1}$) and one the dependence of the two flood claim vectors ($\theta_{1,2}$). On the upper level, the parameter $\theta_0$ describes the overall dependence structure between all claim vectors.

² We also conducted this analysis using weekly instead of monthly data. The results lead to identical conclusions and are available from the authors upon request.
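The summary statistics of the kind shown in Table 1, and the conversion of Kendall's tau into an Archimedean copula parameter, can be sketched as follows. This is an illustrative sketch under stated assumptions: the paper does not say which tie correction or which standard deviation estimator was used, so tau-b and the sample standard deviation are assumed here, and all function names are hypothetical. The two conversion formulas are the standard relations tau = θ/(θ + 2) for the Clayton copula and tau = 1 − 1/θ for the Gumbel copula.

```python
from math import sqrt
from statistics import mean, stdev


def coefficient_of_variation(claims):
    """Quotient of standard deviation and mean (sample standard deviation assumed)."""
    return stdev(claims) / mean(claims)


def kendall_tau_b(x, y):
    """Kendall's rank correlation with the tau-b tie correction (an assumption)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / sqrt((pairs - ties_x) * (pairs - ties_y))


def clayton_theta(tau):
    """Clayton parameter implied by Kendall's tau; requires 0 < tau < 1."""
    return 2 * tau / (1 - tau)


def gumbel_theta(tau):
    """Gumbel parameter implied by Kendall's tau; requires 0 <= tau < 1."""
    return 1 / (1 - tau)
```

Both conversions are only defined for non-negative dependence, which mirrors the restriction to positive correlations noted for the NAC generators.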


4. Goodness-of-fit analysis

We first conduct a two-dimensional benchmark analysis to evaluate the power of the employed test in a simple setting. This is followed by an evaluation of the fit of all copulas for the six-dimensional claims dataset described in Section 3. Additionally, for the sake of robustness, we conduct a three-dimensional analysis. In all cases, the Bernstein copula is compared to the seven other introduced copulas.

4.1. Test setup

Genest et al. (2009b) compare the power of various tests for different copula choices. Based on a large-scale Monte Carlo experiment, they find that no single test outperforms all others and therefore derive a ranking indicating that a blanket test based on the Cramér–von Mises statistic is a good choice. The test is for the copula $C$ coming from some copula class $\mathcal{C}_0$, i.e., $H_0: C \in \mathcal{C}_0$. Let $\hat{C}_n$ denote an estimate of $C$ under $H_0$ using the available empirical data with $n$ observations. The Cramér–von Mises test statistic is defined as

\[ S_n = \int_{[0,1]^d} \mathbb{C}_n(u)^2 \, dC_n(u), \]

with $\mathbb{C}_n = \sqrt{n}\,(C_n - \hat{C}_n)$ and $C_n(u) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(U_{i1} \le u_1, \ldots, U_{id} \le u_d)$ as the non-parametric distribution or the so-called empirical copula of the available data $(U_1, \ldots, U_d)$.

Since the distribution of $S_n$ is unknown, p-values are estimated using a bootstrap approach based on the algorithm from Genest et al. (2009b). The validity of this bootstrap approach has only been established for the parametric case (Genest and Rémillard, 2008) and application for the non-parametric copulas can yield inconsistent results, i.e., p-values can systematically decrease for increased fit. We thus apply the bootstrap approach only for the parametric copulas and use the resulting p-values for model selection. The chosen level of significance for rejection is $\alpha = 0.10$. For comparison with the non-parametric alternatives, we use the value of the test statistic and interpret a lower value as an indication for better fit. The quality of the test is influenced by the sample size, the number of bootstrap runs, and the grid size.

Sample size. Since only the distribution of the assumed copula under $H_0$ is known and the real dependence structure is approximated by $C_n(u)$, the value and accuracy of the test statistic depend on the sample size. Effects of sample size on copula-based goodness-of-fit tests are considered in Genest and Rémillard (2008) and Genest et al. (2009b), finding that adequate sample sizes, in a two-dimensional setting, can range from 250 to 1000 data points, depending on the test and copula being used. As our datasets show, real-world datasets may contain fewer observations.

Number of bootstrap runs. More bootstrap runs gradually increase approximation accuracy of the p-value. Genest and Rémillard (2008) and Genest et al. (2009b) choose 1000 bootstrap runs in their analyses; we always choose 10,000 bootstrap runs and test for the validity of this choice.

Grid size. The parameter $m$ denotes the fineness of the grid on which the data are approximated for the Bernstein and grid-type copulas, as well as the degree of the Bernstein polynomials used for smoothing. The choice of grid size may have a significant impact on estimation results and fit, but the selection of an optimal value with respect to fit and degree of the Bernstein polynomials is still an open research question (Pfeifer et al., 2009). For this analysis, we consider at least $m = 10$.
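Because the empirical copula $C_n$ places mass $1/n$ on each pseudo-observation, the integral defining $S_n$ reduces to a sum of squared differences between the empirical and the hypothesized copula over the observations. The following Python sketch illustrates this computation under that reading; the function names are hypothetical, the bootstrap for the p-value is not included, and the independence copula serves only as an example of a hypothesized model.

```python
from math import prod


def empirical_copula(pseudo_obs, u):
    """Share of pseudo-observations that are componentwise <= u."""
    hits = sum(all(x <= t for x, t in zip(obs, u)) for obs in pseudo_obs)
    return hits / len(pseudo_obs)


def cramer_von_mises(pseudo_obs, copula_cdf):
    """S_n: squared distance between empirical and hypothesized copula,
    summed over the pseudo-observations themselves."""
    return sum(
        (empirical_copula(pseudo_obs, obs) - copula_cdf(obs)) ** 2
        for obs in pseudo_obs
    )


def independence_cdf(u):
    """Distribution function of the independence copula."""
    return prod(u)
```

With `pseudo_obs` holding the relative ranks of the data, `cramer_von_mises(pseudo_obs, independence_cdf)` gives the statistic for the independence benchmark; any other fitted copula can be passed as a callable in the same way.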

4.2. Two-dimensional benchmark analysis

With this benchmark analysis, we intend to illustrate the adequacy of the goodness-of-fit (gof) test and the influence of grid size on the fit of the Bernstein copula. We draw 100 bivariate random samples from the Clayton copula with parameter $\theta = 2$ and apply the gof test to this sample (any other copula and parameter could be chosen as well). Results are presented in Table 2. The simulated data and resulting Bernstein density are shown in Fig. 1.

Table 2. Goodness-of-fit test statistics and bootstrapped p-values of copulas fitted to the two-dimensional benchmark dataset.

| Copula       | m  | S_n    | p-value |
|--------------|----|--------|---------|
| Independence |    | 0.7624 | 0.0000  |
| Gauss        |    | 0.0770 | 0.0002  |
| Student      |    | 0.0633 | 0.0006  |
| Clayton      |    | 0.0133 | 0.8145  |
| Gumbel       |    | 0.1387 | 0.0000  |
| Bernstein    | 10 | 0.0787 |         |
| Bernstein    | 20 | 0.0351 |         |
| Grid-type    | 10 | 0.0175 |         |
| Grid-type    | 20 | 0.0106 |         |
| Kernel       |    | 0.4668 |         |

All parametric copulas except the Clayton copula are rejected for the simulated dataset. Also, among the non-parametric copulas, only the grid-type copula yields a better fit than the Clayton copula. The test and number of bootstrap runs thus appear to be appropriate, as the correct copula (Clayton) is not rejected and shows a good fit.

Several additional insights can be gained from this benchmark analysis. The value of the test statistic, as expected, is decreasing in $m$. A finer grid increases the fit; the Bernstein and grid-type copulas are thus superior to their parametric counterparts if $m$ is large enough. Even though both copulas converge to the empirical copula, with increasing $m$, the fit of the grid-type copula will become better than the fit of the Bernstein copula. Since no smoothing (indicator function instead of Bernstein polynomials) is conducted, the grid-type copula, beyond a certain $m$, will basically equal the empirical copula, explaining this effect. Using the grid-type copula with such $m$ for risk modeling and random sampling would essentially equal bootstrapping from historical observations. If this is not desired, the Bernstein copula, a lower grid size, or both may be preferable. Also for lower grid sizes, the Bernstein copula may be preferable over the grid-type copula, as it disperses probability mass of historical observations over a range of possible outcomes. Thus, it provides a model that allows for a broader range of outcomes, which might be preferable in risk modeling.
The kernel copula provides the second-worst value for $S_n$, so that the symmetric Gaussian kernel copula clearly is not a viable alternative to the other non-parametric copulas.

Fig. 1. Scatterplot of data used in the gof benchmark analysis and fitted Bernstein density. (a) Simulated rank data. (b) Fitted Bernstein density (m = 20).

4.3. Six-dimensional analysis

Table 3 presents the results of the six-dimensional goodness-of-fit analysis. The independence assumption provides the worst results in terms of the test statistic, which shows that dependence between risk classes should not be ignored. Considering the p-values, all parametric copulas have to be rejected at a 10% significance level. For $m = 10$, the fit of the Bernstein copula (in terms of $S_n$) is superior to the fit of the independence, Clayton, and kernel copulas, but inferior to that of the Gauss, Student, and Gumbel copulas. Increasing $m$ to a value of 20 improves the fit of the Bernstein copula. In this case, $S_n$ is 0.0625, which is close to the best parametric copula. These results thus again document that the goodness-of-fit of the Bernstein copula increases with the grid size $m$, since this reduces the distance to the empirical distribution.

Table 3. Goodness-of-fit test statistics and bootstrapped p-values of copulas fitted to the six-dimensional empirical dataset.

| Copula       | m  | S_n    | p-value |
|--------------|----|--------|---------|
| Independence |    | 0.7335 | 0.0000  |
| Gauss        |    | 0.0624 | 0.0202  |
| Student      |    | 0.0516 | 0.0519  |
| Clayton      |    | 0.1408 | 0.0025  |
| Gumbel       |    | 0.0732 | 0.0663  |
| Bernstein    | 10 | 0.1064 |         |
| Bernstein    | 20 | 0.0625 |         |
| Grid-type    | 10 | 0.0247 |         |
| Grid-type    | 20 | 0.0200 |         |
| Kernel       |    | 0.5709 |         |

The grid-type copula, which is conceptually close to the Bernstein copula, provides for both $m = 10$ and $m = 20$ a very good fit, as indicated by $S_n$. As mentioned, it performs better than the Bernstein copula due to the different smoothing concepts. The kernel copula again shows a bad fit. Overall, the Student and grid-type copula provide the best results in the six-dimensional analysis. The Bernstein copula is close to the Student copula if $m = 20$, but the fit can be further improved by increasing $m$. Our results thus emphasize the relevance of the choice of grid size, which is at the discretion of the person calibrating the model.

4.4. Robustness

Why do we not find a superior fit for the Bernstein copula? To investigate this question more closely, we conduct various robustness tests. The fit of the Bernstein copula depends on the grid size and on the specific dependence structure of the data sample. If the dataset is more homogeneous, parametric copulas such as the elliptical copulas might perform well. If, however, the dataset is less homogeneous, the Bernstein copula might be a more adequate modeling approach.

To test this relationship, we consider three three-dimensional datasets and rerun the gof test. The first two datasets are inhomogeneous, i.e., they are created from lines with low correlation and include storm, flood, and water damage claims. For the third dataset, we combine the three storm lines. Due to the high correlation among the lines, this third dataset can be considered as more homogeneous. For Archimedean copulas, the exchangeable setup is used. We expect that the Bernstein copula will perform especially well for the first two datasets. Results are reported in Table 4.

In line with our expectations, the Bernstein copula provides a very good fit for the first two datasets. In these cases, only the fit of the grid-type copula is better. For the third dataset, the only parametric copula not rejected is the Gumbel copula. That the fit, in contrast to our expectations, decreases for the elliptical copulas can be explained by three outliers in this third dataset. The fit of both elliptical copulas is comparable to that of the Gumbel copula when those are removed. The grid-type copula shows the smallest deviation from the empirical copula. The fit of the Bernstein copula has decreased in comparison to the first two datasets, reflected by a higher $S_n$. The Bernstein copula thus provides a good fit for inhomogeneous datasets, but is less appropriate in situations with stronger correlation between claims.

5. Risk modeling using value-at-risk and tail-value-at-risk

To study the effects of the different dependence structures in risk modeling we simulate random claims data from the fitted copulas: 500,000 six-dimensional random variates are drawn from the copulas fitted to the six-dimensional dataset and inverted using the generalized Pareto distribution with the density

\[ f(x; k, \sigma, \rho) = \sigma^{-1} \left( 1 + \frac{k (x - \rho)}{\sigma} \right)^{-(k+1)/k}. \]

The simulated claims are summed up to an aggregated claim. We consider a single set of parameters for all six lines ($k = 0.50$, $\sigma = 200$, $\rho = 1$) to isolate the effects of the dependence structure. The simulated value-at-risk (VaR) and tail-value-at-risk (TVaR) of the aggregate claim at the 99.5% quantile are presented in Table 5. Plots for varying confidence levels are shown in Figs. 2 and 3. In the following discussion we first focus on VaR and then compare the results for VaR and TVaR.

The results shed some light on the practical effects of inappropriate dependence modeling. Specifically, the independence copula, which provides a bad fit, leads to the lowest VaR.
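The simulation setup can be sketched in Python as follows. Integrating the stated density gives the distribution function $F(x) = 1 - (1 + k(x - \rho)/\sigma)^{-1/k}$, whose inverse is used to transform copula draws into claims. The estimators for VaR and TVaR are assumptions, since the paper does not spell them out: VaR is taken as an empirical quantile and TVaR as the mean of the simulated losses at or above it. All function names are hypothetical.

```python
def gpd_quantile(u, k=0.5, sigma=200.0, rho=1.0):
    """Inverse of the generalized Pareto distribution function, for 0 <= u < 1."""
    return rho + sigma / k * ((1.0 - u) ** (-k) - 1.0)


def aggregate_claims(copula_draws, **gpd_params):
    """Transform each copula draw line by line and sum to an aggregate claim."""
    return [
        sum(gpd_quantile(u, **gpd_params) for u in draw) for draw in copula_draws
    ]


def var_tvar(losses, level=0.995):
    """Empirical VaR and TVaR (mean of losses at or above the VaR); an assumed
    convention, other quantile definitions differ slightly in small samples."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, int(level * len(ordered)))
    tail = ordered[index:]
    return ordered[index], sum(tail) / len(tail)
```

As a usage example, passing draws from any of the fitted copulas to `aggregate_claims` and the result to `var_tvar` yields the two risk measures for that dependence model; only the copula draws change between models, which is what isolates the effect of the dependence structure.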
The kernel copula, which provides a bad fit as well, yields a higher VaR. The Gauss, Student, and Gumbel copulas, which provide a better fit to the empirical dependence structure, yield an even higher VaR. The Bernstein, grid-type, and Clayton copulas provide very similar results and fall between the independence and the other parametric copulas. The Bernstein copula provides a slightly higher VaR than the grid-type copula. This effect is even more pronounced for the TVaR. Comparing the results for VaR and TVaR shows that the kernel copula produces relatively high losses at the far upper range, which is also reflected in the plots in Figs. 2 and 3. Aside from these differences, the results for VaR and TVaR are very similar. However,


Table 4
Goodness-of-fit test statistics and bootstrapped p-values of copulas fitted to the three-dimensional datasets considered in the robustness tests.

| Copula | m | IS-CF-WD $S_n$ | IS-CF-WD p-value | IS-HF-WD $S_n$ | IS-HF-WD p-value | IS-HS-CS $S_n$ | IS-HS-CS p-value |
|---|---|---|---|---|---|---|---|
| Independence | | 0.0461 | 0.3004 | 0.0614 | 0.1487 | 3.2771 | 0.0000 |
| Gauss | | 0.0293 | 0.6164 | 0.0364 | 0.3021 | 0.0620 | 0.0081 |
| Student | | 0.0303 | 0.4854 | 0.0358 | 0.2597 | 0.0421 | 0.0325 |
| Clayton | | 0.0351 | 0.3483 | 0.0426 | 0.1555 | 0.1097 | 0.0000 |
| Gumbel | | 0.0354 | 0.3243 | 0.0400 | 0.1985 | 0.0330 | 0.1180 |
| Bernstein | 10 | 0.0214 | | 0.0287 | | 0.2809 | |
| Grid-type | 10 | 0.0119 | | 0.0176 | | 0.0251 | |
| Kernel | | 0.0412 | | 0.0542 | | 2.1883 | |
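The $S_n$ values in Table 4 measure the distance between the empirical copula and a hypothesized copula. The Python sketch below is not the authors' implementation. It assumes the common rank-based Cramér–von Mises form $S_n = \sum_i (C_n(U_i) - C(U_i))^2$, pseudo-observations scaled by $n + 1$, and, to avoid any parameter estimation, the independence copula as the hypothesized model. All function names are hypothetical and the demo data are synthetic.

```python
import numpy as np

def pseudo_observations(x):
    """Rank-transform each column into (0, 1): U_ij = rank_ij / (n + 1).

    Assumes there are no ties within a column.
    """
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1  # ranks 1..n
    return ranks / (n + 1.0)

def empirical_copula(u, points):
    """C_n(p): share of rows of u that are <= p in every coordinate."""
    below = np.all(u[None, :, :] <= points[:, None, :], axis=2)
    return below.mean(axis=1)

def independence_cdf(u):
    """Independence copula C(u) = u_1 * ... * u_d."""
    return np.prod(u, axis=1)

def cvm_statistic(u, copula_cdf):
    """S_n = sum_i (C_n(U_i) - C(U_i))^2 for a hypothesized copula C."""
    diff = empirical_copula(u, u) - copula_cdf(u)
    return float(np.sum(diff ** 2))

def bootstrap_p_value(s_obs, n, d, n_boot=200, seed=0):
    """Approximate p-value: share of bootstrap statistics >= observed S_n.

    Each replicate draws n points from the independence copula, converts
    them to pseudo-observations, and recomputes S_n.
    """
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_boot):
        u_star = pseudo_observations(rng.random((n, d)))
        if cvm_statistic(u_star, independence_cdf) >= s_obs:
            exceed += 1
    return exceed / n_boot

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z = rng.random(30)
    x = np.column_stack([z, 2.0 * z, z ** 3])  # synthetic, perfectly dependent
    u = pseudo_observations(x)
    s_n = cvm_statistic(u, independence_cdf)
    print("S_n:", s_n)
    print("bootstrap p-value:", bootstrap_p_value(s_n, n=30, d=3))
```

For a parametric family such as the Gauss or Gumbel copula, each bootstrap replicate would additionally require re-estimating the copula parameter before recomputing $S_n$; that step is omitted here.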

Table 5
Value-at-risk and tail-value-at-risk (in thousand Euro) at 99.5% quantile simulated according to the copulas fitted to the six-dimensional empirical dataset.

| Copula | m | VaR | TVaR |
|---|---|---|---|
| Independence | | 15.436 | 29.186 |
| Gauss | | 21.382 | 41.226 |
| Student | | 22.694 | 41.482 |
| Gumbel | | 26.730 | 53.015 |
| Clayton | | 17.506 | 32.168 |
| Bernstein | 10 | 16.905 | 32.596 |
| Grid-type | 10 | 16.851 | 29.949 |
| Kernel | | 19.546 | 52.442 |
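The simulation behind Table 5 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' code. Uniform variates are mapped to claims with the inverse of the generalized Pareto distribution function, $x = \rho + (\sigma/k)((1 - u)^{-k} - 1)$, which follows from the density given in Section 5. The claims are summed across the six lines, and VaR and TVaR are estimated empirically. The function names are hypothetical, the independence copula stands in for a fitted copula, and TVaR is taken as the mean of simulated losses at or above the VaR.

```python
import numpy as np

def gpd_quantile(u, k=0.50, sigma=200.0, rho=1.0):
    """Inverse distribution function of the generalized Pareto distribution
    with density (1/sigma) * (1 + k * (x - rho) / sigma) ** (-(k + 1) / k)."""
    return rho + sigma / k * ((1.0 - u) ** (-k) - 1.0)

def var_tvar(losses, level=0.995):
    """Empirical VaR (quantile) and TVaR (mean of losses >= VaR)."""
    var = np.quantile(losses, level)
    tail = losses[losses >= var]
    return float(var), float(tail.mean())

def simulate_var_tvar(n_sims=500_000, n_lines=6, level=0.995, seed=0):
    """Simulate aggregate claims and return (VaR, TVaR).

    Assumption: independent uniforms are used as a stand-in; draws from a
    fitted copula would replace `u` to reflect a dependence structure.
    """
    rng = np.random.default_rng(seed)
    u = rng.random((n_sims, n_lines))        # values in [0, 1)
    aggregate = gpd_quantile(u).sum(axis=1)  # sum claims over all lines
    return var_tvar(aggregate, level)

if __name__ == "__main__":
    var, tvar = simulate_var_tvar()
    print("VaR 99.5%:", var)
    print("TVaR 99.5%:", tvar)
```

Because $k = 0.50$ implies a very heavy tail, the TVaR estimate varies noticeably between seeds even with 500,000 simulations, so results from this sketch should be read as indicative only.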

the ranking of riskiness partly depends on the risk measure used. The results of the risk analysis again emphasize that the dependence structure should not be ignored in risk modeling. We see substantial differences between the alternatives, which illustrates the practical importance of considering different models. The VaR from the independence copula might be too low, while the kernel and some parametric copulas lead to relatively high VaR values. The VaR values for the Bernstein and grid-type copulas lie between these extremes. Moreover, for simulation purposes not only the fit of the copula, but also its properties (e.g., as influenced by m) have to be considered.

Fig. 2. Quantile plots of value-at-risk (in thousand Euro) for varying confidence levels simulated according to the copulas fitted to the six-dimensional empirical dataset.

Fig. 3. Quantile plots of tail-value-at-risk (in thousand Euro) for varying confidence levels simulated according to the copulas fitted to the six-dimensional empirical dataset.

6. Conclusion

This paper illustrates the modeling of dependence structures of non-life insurance risks using the Bernstein copula. We conduct a goodness-of-fit analysis to assess the Bernstein copula's fit compared to that of other widely used copulas (parametric elliptical and Archimedean, as well as other non-parametric copulas) and illustrate its use in a simulation context. Real-world claims data are used to calibrate the dependence structure. Goodness-of-fit analyses are rarely conducted in copula modeling, but are important for interpreting the results accurately. We use a blanket test based on the Cramér–von Mises test statistic and bootstrapping for approximation of p-values.

Our results show that the Bernstein copula is a flexible approach, but not the solution to all modeling problems. For example, in our six-dimensional dataset we found other approaches to have a better fit. However, the fit of the Bernstein copula can be improved by increasing the grid size. We also illustrate that the Bernstein copula performs well when used with inhomogeneous or less symmetric datasets. Additionally, simulating different risk measures according to the fitted dependence structure illustrates that inadequate modeling may yield an over- or underestimation of the risk situation compared to the alternative with the best fit. A general benefit of the Bernstein copula is its applicability in higher dimensions and its use of all available information, which is notably not the case for NACs, in which an aggregate parameter is employed. Thus, even though the Bernstein copula's performance depends on data and calibration, our results are important for insurance companies and regulators who rely on stochastic models for determining solvency capital requirements under Solvency II. The results may also be useful in conducting the "Own Risk and Solvency Assessment" that is required by Solvency II.
This analysis also highlights several practical issues regarding the implementation and application of the Bernstein copula that need further research and that are of high importance for practitioners working on the implementation of such models. The representation of the Bernstein copula as a mixture of Beta distributions greatly increases sampling efficiency, but the choice of grid size is an important aspect influencing fit and variance of the estimator.

Some limitations apply to this analysis. The number of data points used to calibrate the dependence structure is relatively small, but as these are realistic datasets and calibration of internal models is usually conducted on a monthly or yearly basis, insurance companies face the same limitations. We choose the grid size equal for each dimension and equidistant; relaxing this condition could further improve estimation results. Additionally, further tests of statistical fit that do not rely on the empirical copula as benchmark could be insightful.

Overall, the Bernstein copula is a promising modeling alternative and can enrich the set of copulas used in internal risk models, especially in those cases where the dependence structure is inhomogeneous, not extremely highly correlated, and data are sparse. The empirical part of this paper demonstrates that these are frequently found conditions and thus provides a sound motivation for application of the Bernstein copula.

References

Bouezmarni, T., Rombouts, J., 2009. Semiparametric multivariate density estimation for positive data using copulas. Computational Statistics & Data Analysis 53, 2040–2054.
Chen, S.X., Huang, T.M., 2007. Nonparametric estimation of copula functions for dependence modelling. Canadian Journal of Statistics 35, 265–282.
Eling, M., Toplek, D., 2009. Modeling and management of nonlinear dependencies–copulas in dynamic financial analysis. Journal of Risk and Insurance 76, 651–681.

Embrechts, P., 2009. Copulas: a personal view. Journal of Risk and Insurance 76, 639–650.
Fermanian, J.D., Scaillet, O., 2003. Nonparametric estimation of copulas for time series. Journal of Risk 5, 25–54.
Genest, C., Gendron, M., Bourdeau-Brien, M., 2009a. The advent of copulas in finance. European Journal of Finance 15, 609–618.
Genest, C., Rémillard, B., 2008. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques 44, 1096–1127.
Genest, C., Rémillard, B., Beaudoin, D., 2009b. Goodness-of-fit tests for copulas: a review and a power study. Insurance: Mathematics and Economics 44, 199–213.
Hofert, M., 2011. Efficiently sampling nested Archimedean copulas. Computational Statistics & Data Analysis 55, 57–70.
Hofert, M., Scherer, M., 2011. CDO pricing with nested Archimedean copulas. Quantitative Finance 11, 775–787.
Hörmann, W., Leydold, J., 2000. Automatic random variate generation for simulation input. In: Proceedings of the 2000 Winter Simulation Conference.
Hummel, C., 2009. Shaping tail dependencies by nesting box copulas. ArXiv e-prints arXiv:0906.4853.
Jondeau, E., Rockinger, M., 2006. The copula-GARCH model of conditional dependencies: an international stock market application. Journal of International Money and Finance 25, 827–853.
Kulpa, T., 1999. On approximation of copulas. International Journal of Mathematics and Mathematical Sciences 22, 259–269.
Lowin, J.L., 2010. The Fourier copula: theory & applications. Working Paper. Harvard University.
Pfeifer, D., Strassburger, D., Philipps, J., 2009. Modelling and simulation of dependence structures in nonlife insurance with Bernstein copulas. Working Paper. Carl von Ossietzky University, Oldenburg.
Salmon, M.H., Schleicher, C., 2006. Pricing multivariate currency options with copulas. WP06-21. Working Papers Series. Warwick Business School, Financial Econometrics Research Centre.
Sancetta, A., Satchell, S., 2004. The Bernstein copula and its applications to modeling and approximations of multivariate distributions. Econometric Theory 20, 1–38.
Savu, C., Trede, M., 2010. Hierarchies of Archimedean copulas. Quantitative Finance 10, 295–304.