Multi-population mortality models: A factor copula approach

Multi-population mortality models: A factor copula approach

Insurance: Mathematics and Economics ( ) – Contents lists available at ScienceDirect Insurance: Mathematics and Economics journal homepage: www.el...

577KB Sizes 1 Downloads 32 Views

Insurance: Mathematics and Economics (

)



Contents lists available at ScienceDirect

Insurance: Mathematics and Economics journal homepage: www.elsevier.com/locate/ime

Multi-population mortality models: A factor copula approach Hua Chen a,∗ , Richard MacMinn b , Tao Sun a a

Temple University, United States

b

Illinois State University, United States

article

info

Article history: Available online xxxx Keywords: Multi-population mortality model Factor copulas Maximum entropy principle Mortality/longevity risk pricing Mortality/longevity risk hedging

abstract Modeling mortality co-movements for multiple populations have significant implications for mortality/longevity risk management. A few two-population mortality models have been proposed to date. They are typically based on the assumption that the forecasted mortality experiences of two or more related populations converge in the long run. This assumption might be justified by the long-term mortality cointegration and thus be applicable to longevity risk modeling. However, it seems too strong to model the short-term mortality dependence. In this paper, we propose a two-stage procedure based on the time series analysis and a factor copula approach to model mortality dependence for multiple populations. In the first stage, we filter the mortality dynamics of each population using an ARMA–GARCH process with heavy-tailed innovations. In the second stage, we model the residual risk using a one-factor copula model that is widely applicable to high dimension data and very flexible in terms of model specification. We then illustrate how to use our mortality model and the maximum entropy approach for mortality risk pricing and hedging. Our model generates par spreads that are very close to the actual spreads of the Vita III mortality bond. We also propose a longevity trend bond and demonstrate how to use this bond to hedge residual longevity risk of an insurer with both annuity and life books of business. © 2015 Elsevier B.V. All rights reserved.

1. Introduction In recent years, a few mortality models with multiple populations have been proposed and developed. Modeling mortality dependence for two or more populations is still, however, in its infancy. Multi-population mortality models are typically structured assuming that the forecasted mortality experiences of two or more related populations are linked together and do not diverge over the long run. This assumption might be justified by the long-term mortality co-movements and thus be applicable to longevity risk modeling. It seems, however, too strong an assumption to use in modeling short-term mortality dependence. Here, we propose a copula-based multivariate model to capture mortality dependence for multiple populations. In the first stage, we filter the mortality dynamics of each population using the ARMA–GARCH process with heavy-tailed innovations. In the second stage, we model the

∗ Correspondence to: Department of Risk, Insurance, and Health Management, Temple University, 1801 Liacouras Walk, 625 Alter Hall, Philadelphia, PA 19122, United States. E-mail addresses: [email protected] (H. Chen), [email protected] (R. MacMinn), [email protected] (T. Sun). http://dx.doi.org/10.1016/j.insmatheco.2015.03.022 0167-6687/© 2015 Elsevier B.V. All rights reserved.

residual risk using a one-factor copula model that is widely applicable to high dimension data. We then use our mortality model to price the Swiss Re Vita III bond and a longevity bond in the spirit of the Kortis bond to analyze the hedging problem of an insurer with both life and annuity books of business. In order to have a thorough assessment of mortality/longevity risk, the correlation or co-integration between mortality improvements of different populations have to be evaluated. Cairns et al. (2011) provide a detailed discussion on why correlations among multiple populations must be taken into account. First, from the natural hedging perspective, a typical life insurer may want to hedge mortality risk from insured lives with longevity risk from annuitants. These two insured groups, however, may have different but correlated patterns of mortality experience. Second, sometimes the mortality data may not be available or sufficiently reliable for a small population due to a small number of deaths, a limited number of calendar years of data, age range or simply poor quality of data, thus causing highly inaccurate parameter estimates. Jointly modeling the small population and a larger linked population allows the small-population mortality forecasts to be consistent with those of the larger population. Third, with the rapid development of mortality securitizations, the payoffs of almost all mortality bonds are contingent on a weighted mortality index

2

H. Chen et al. / Insurance: Mathematics and Economics (

)



based on multiple populations.1 We need to understand mortality correlations in order to better evaluate the overall risk and the final payoff to investors. Last but not least, entities seeking to hedge their exposures to mortality/longevity risk using capital market instruments need to determine hedge ratios that minimize basis risk between their own population and the population associated with the hedging instruments. A few two-population stochastic mortality models have been proposed. Li and Lee (2005) present an augmented common factor model for a group of populations, imposing a common mortality change by age but allowing each its own age pattern and level of mortality. Li and Hardy (2011) consider four extensions to the Lee–Carter model to incorporate mortality dependence. Dowd et al. (2011) develop a gravity mortality model for two-related but different sized populations, where gravity effects bring the state variables of the small population toward those of the large population in a manner consistent with biological reasonableness. A similar model was developed by Jarner and Kryger (2011). Cairns et al. (2011) introduce a general framework for two-population mortality modeling; they employ a mean-reverting process that permits different short-run trends in mortality rates but parallel long-run improvements. Zhou et al. (2013a) propose a twopopulation Lee–Carter model with transitory jumps under this framework. Yang and Wang (2013) and Zhou et al. (2013b) apply a co-integration analysis to investigate the long-run equilibrium in multi-country mortality data and use a vector error correction model (VECM) for mortality forecasts. A common feature of these models is that they are constructed in a way that mortality forecasts in different populations do not diverge in the long run. This assumption might be justified by the long-term mortality comovements and so be applicable to longevity risk modeling, but it seems too strong to model the short-term mortality dependence. Lin et al. (2013) develop a jump diffusion mortality model for multiple countries where mortality dependence comes from common jumps and the correlation between idiosyncratic risks. Their approach focuses on analyzing correlations among the raw mortality data but the raw data exhibit noise such as autocorrelation or volatility clustering that could possibly confound the dependence structure. Their approach also entails the use of normalized multivariate exponential tilting to price the risks and that implicitly assumes that the risks follow a Gaussian copula. We believe this is problematic since Gaussian copulas lack tail dependence and are therefore inadequate to model the joint mortality events. Our approach is related to Lin et al. (2013), but our model uses a two-stage multivariate analysis based on a factor copula approach which overcomes the problems noted here and has other advantages that we subsequently discuss. Our mortality model has two stages. The first stage estimates the conditional distribution of mortality rates for each population. In the seminal work of Lee and Carter (1992), an ARIMA model is used to forecast the time varying mortality factor. They assume that the error terms are white noise with zero mean and a small constant volatility; that assumption of homoscedasticity, however, is not realistic. Lee and Miller (2001) argue that the observed logarithm of central death rates is quite variable and the volatility is time varying. Recently, GARCH-related models have been used to model mortality rates (see, e.g. Gao and Hu, 2009; Giacometti et al., 2012; Chai et al., 2013). Along with this line of research, we use an ARMA–GARCH model to fit the mortality data of each population in order to remove autocorrelation and conditional heteroskedasticity from mortality time series. Instead

of modeling morality jumps explicitly, we assume that innovations follow a heavy-tailed distribution. This is supported by empirical evidence provided in Giacometti et al. (2009, 2012) and Wang et al. (2013). Giacometti et al. (2009) observe that for some age groups the Lee–Carter model with normal inverse Gaussian (NIG) innovations produces a dominant fit compared to the Gaussian one. Giacometti et al. (2012) find that an AR–ARCH model with Student-t innovations is more suited to Italian data. Wang et al. (2013) conclude that the Renshaw and Haberman (2006) model with non-Gaussian innovations generates better forecasts for England and Wales, France, and Italy. The second stage of our model is designed to capture mortality dependence among residuals revealed by the first stage. A very common and intuitive way to capture the dependence is to use copulas. Copulas have been studied in both actuarial science and finance to examine dependencies among risks (Frees and Valdez, 1998; Embrechts et al., 2003). Particularly, in mortality studies copulas have been applied to model the bivariate survival function of the two lives of couples (see, e.g. Frees et al., 1996; Carriere, 2000; Shemyakin and Youn, 2006; Youn and Shemyakin, 1999, 2001; Denuit et al., 2001). Surprisingly, we find no prior research using copula models in the multi-population mortality analysis. In addition, the majority of aforementioned papers use copulas in the Elliptical or Archimedean family, which usually have only one or two parameters to characterize the dependence between all variables, and are thus quite restrictive when the number of variables increases. To overcome this drawback and increase the flexibility of our model, we adopt factor copula models proposed by Oh and Patton (2014).2 Their model has some advantages. First, it is based on a simple linear, additive factor structure for the copula and is particularly attractive for high dimension applications. Second, the factor copula permits separate development of the marginal distributions and the copula model. Hence, this method allows the use of any existing model for the univariate analysis with the subsequent focus on the copula model. Third, the factor copula provides more flexibility of the model according to the number of variables and available data. To the best of our knowledge, our model is the first to use the factor copula to fit mortality data for multiple populations.3 Pricing mortality-linked derivatives is challenging in an incomplete market. With sparse market price data, some prevalent pricing methodologies, such as the arbitrage free pricing method (Cairns et al., 2006; Bauer et al., 2010), the Wang transform (Dowd et al., 2006; Denuit et al., 2007; Lin and Cox, 2008; Chen and Cox, 2009), or the Esscher transform (Chen et al., 2010; Li et al., 2010), require the user to make one or more subjective assumptions to derive the market prices of risk. The pricing process becomes even more difficult when multiple risks are involved. For this reason, Li (2010), Kogure and Kurachi (2010) and Li and Ng (2011) advocate the use of the maximum entropy approach that ‘‘does not strictly require the use of security prices to predict other security prices’’. This is particularly important in today’s market where there are a limited number of market participants and only a few mortality

1 There are only two exceptions, i.e., the Tartan mortality bond sponsored by Scottish Re in 2006 and the Atlas IX mortality bond sponsored by SCOR Re in 2013. Their payoffs depend on US mortality data only.

stage procedure which is very similar to our model. They use an ARMA process in the first stage and time-varying copulas in the Elliptical or Archimedean family in the second stage.

2 A one-factor copula model was first introduced in Vasicek (1987) to evaluate loan loss distributions and Li (2000) applies the Gaussian copula to multi-name credit derivatives. The model is generalized in Andersen and Sidenius (2004), Hull and White (2004), and Laurent and Gregory (2005). Extending Hull and White (2004)’s work, Oh and Patton (2014) use factor copula models to capture the dependence among firms in the S&P 100. Their paper presents a ‘‘simulated method of moments’’ approach to accommodate this high dimensional data. 3 At the same time this article was written, Wang et al. (2013) proposed a two-

H. Chen et al. / Insurance: Mathematics and Economics (

transactions. More market prices, however, can be easily incorporated to derive the risk-neutral measure when the mortality market becomes more mature. In this paper, we use Vita II (Tranche B, C and D), Tartan (Tranche B) and Osiris (Tranche B2, C and D) bonds as our pricing basis to derive the risk-neutral distribution by adopting the maximum entropy approach. We then study mortality risk pricing and hedging using some real-world examples. In the first example, we calibrate the par spreads for the Vita III mortality bond issued by Swiss Re in December 2006, and compare our estimates with the actual spreads that were offered. Our estimated spreads are very close to the actual ones. In the second example, we illustrate how a life insurer can hedge the residual risk in its life and annuity books of business using a longevity trend bond whose structure is similar to the Kortis bond issued by Swiss Re.4 Our paper contributes to the extant literature in several ways. First, we propose a copula-based multivariate mortality model to capture mortality correlations among multiple populations. Copulas are powerful tools to model the dependence for tail events or extreme risks, therefore our model has significant applications to mortality risk pricing and hedging for correlated risks. Second, instead of using copulas in the Elliptical or Archimedean family, we introduce a factor copula model which has a simple mathematical form and a flexible model specification and thus is widely applicable to high dimension data. We only consider mortality correlations across countries in this analysis, but our model can be extended to model mortality dependence between male and female and among different age or cohort groups. Third, although there are some studies that use previously issued mortalitylinked securities to calibrate the market prices of risk or infer the risk-neutral measure, we find few papers that use the derived risk-neutral measure to estimate prices for some real mortality transactions. So it is hard to evaluate the accuracy of their modeling and/or pricing approaches. In contrast, we apply our model to some real-world examples such as the Vita III bond. We find our model generates prices that are very close to those observed in this transaction. The remainder of this paper is organized as follows. In Section 2, we introduce the two-stage model framework based on the time series analysis and a factor copula approach. In Section 3, we investigate mortality data in six countries and estimate the parameters in our two-stage model. In Section 4, we present the maximum entropy principle as our pricing methodology. In Section 5, we illustrate how to use our proposed mortality model for mortality pricing and hedging. Concluding remarks are in Section 6. 2. The copula-based multivariate analysis

)



3

2.1. The first stage: the conditional marginal distribution Suppose we need to model the dependence structure of time series data, yit , i = 1, 2, . . . , N , t = 1, 2, . . . , T . We first model the conditional marginal distribution of yit . Based on the Box and Jenkins (1976) approach, we can fit yit with a conditional mean model ARMA (r , m) yit = ci +

r 

ϕik yi,t −k +

k=1

m 

ρik ei,t −k + eit ,

(1)

k=1

and a conditional variance model GARCH (p, q)6

σit2 = ωi +

p 

αik σi2,t −k +

k=1

q 

βik e2i,t −k ,

(2)

k=1

where eit = σit ηit and ηit is an independent and identically distributed sequence of standardized innovations. 2.2. The second stage: a factor copula approach In the mortality/longevity securitization, we intend to capture the joint distribution of tail risks, which relies on the dependence structure of the innovation process. We use the straightforward approach of a copula model to describe the joint distribution of innovations. The copula model for the standardized residuals can be structured as

ηt ≡ [η1t , . . . , ηNt ]T ∼ iid Fη (η1t , . . . , ηNt )   = C Fη1 (η1t ) , . . . , FηN (ηNt ) ; θ

(3)

where C denotes the copula function, which is parameterized by a vector of parameters θ . θ can be estimated using the simulationbased estimation approach proposed by Oh and Patton (2013). Their method is close to the Simulated Method of Moments (SMM), but is not exactly the same as the SMM, because the ‘‘moments’’ that are used in estimation are functions of rank statistics. These rank dependent measures are ‘‘pure’’ measures of dependence, meaning that they are solely affected by the changes in the copula and not by the changes in the marginal distribution. For a detailed discussion of these ‘‘pure’’ dependence measures, see Joe (1997) and Nelsen (2006). The SMM copula estimator is based on a simulation from the joint distribution of a vector of latent variables X , FX (θ), with an implied copula C (θ). Briefly, we estimate the vector of parameters, θ , by minimizing the sum squared errors between rank dependence measures computed using simulations from FX (θ ) and those

 T

Copula-based multivariate models allow researchers to specify the models for the marginal distributions separately from the dependence structure that links these distributions to form the joint distribution. This enriches the class of multivariate distributions and permits a much greater degree of flexibility in model specification. If the parameters of the marginal distributions are separable from those for the copula, we may estimate those parameters separately in two stages, which greatly simplifies the estimation procedure and facilitates the study of high-dimension multivariate problems.5

computed from estimated standardized residuals ηˆ t t =1 . The details of the simulated-based method of moments are given in the Appendix. In this study, we consider a factor copula model proposed by Oh and Patton (2014), which is generated by the following structure Xi = γi Z + εi

εi ∼ iid Fε Z ⊥εi ∀i

(4)

X ≡ [X1 , . . . , XN ] ∼ FX (x1t , . . . , xNt ) ′

  = C FX1 (x1t ) , . . . , FXN (xNt ) ; θ , 4 When this article was written, the payoff function of the Kortis longevity bond had not yet been disclosed. In the spirit of the Kortis bond, we propose a payoff structure that can effectively hedge the residual risk by accounting for the mortality correlation between the US and UK. 5 Clearly, two-stage estimation is asymptotically less efficient than one-stage estimation. However, simulation studies in Joe (2005) and Patton (2006) indicate that this loss is not great in many cases.

where the Xi are latent variables, Z is the common factor and the εi are idiosyncratic factors. It is noteworthy that we impose the same

6 We also estimated the ARMA–GJR–GARCH model. However, we find no ‘‘leverage’’ effect in the mortality improvement rate for each country.

4

H. Chen et al. / Insurance: Mathematics and Economics (

dependence structure, or the copula model, for latent variables X and standardized innovations η but their marginal distributions might be different. This class of factor copula models has a simple form but a very flexible dependence structure. For example, by allowing for a fattailed common factor the model captures the possibility of correlated crashes in different markets or correlated mortality jumps in different countries. By assuming a skewed distribution on the common factor the model allows for an asymmetric tail dependence structure, consistent with the fact that the stock returns are more correlated in crashes than in booms or mortality dependence is stronger during mortality deteriorations than during mortality improvements. If we impose the same loading on the common factor we have an equi-dependence structure, whereas different loadings on the common factor enable us to model heterogeneous pairs of variables at the cost of a more difficult estimation problem. We can also simplify the problem by assuming sub-sets of variables (for example, mortality rates for countries in the same continent or those for different age groups in the same country) have the same loading on the common factor. Furthermore, the one-factor model can be extended to a multi-factor model to capture heterogeneous pairwise dependence within the overall multivariate copula.

)



Fig. 1. The population mortality index in each country.

3. Model estimation 3.1. Data We collect mortality data for US, UK, France, Germany, Japan and Canada from 1900 to 2006.7 The US data is from the National Center for Health Statistics (NCHS).8 The NCHS reports the ageadjusted death rate per 100,000 population (2000 standard) for selected causes of death. Our mortality data for UK, France, and Canada are drawn from the Human Mortality Database.910 The Japanese population mortality data are obtained from the Vital Statistics of the Ministry of Health, Labor and Welfare in Japan.1112 The German population mortality data are provided by the German Federal Statistical Office.13 3.2. Conditional marginal distribution Fig. 2. The mortality improvement rate in each country.

(i)

Denote qt the population mortality index for country i at time t. It is computed by dividing the total number of deaths in different ages by the total population in each country. Fig. 1 presents the dynamics of the mortality index for each country. Clearly the mortality indices are not stationary. So we compute the first difference (i) (i) of the logged index, denoted by yit ≡ ln qt − ln qt −1 , to obtain the mortality improvement rate in each country, as can be seen in Fig. 2.14

7 Our selected countries and sample period are exactly the same as those used in Lin et al. (2013). We follow the procedure as described in Lin et al. (2013) to collect and compile mortality data for these six countries. We thank Yijia Lin for providing their mortality data for the purpose of comparison. 8 Data source: http://www.cdc.gov/nchs/datawh/statab/unpubd/mortabs.htm. 9 Data source: Human Mortality Database, available at www.mortality.org. 10 We collect the Canada cohort data from 1900 to 1920 and the period data from 1921 to 2006. The Canadian cohort data do not capture the impact of the 1918 flu epidemic on the whole population. As suggested by Lin et al. (2013), we inflate the Canadian population mortality index in 1918 by 0.0045 to account for the excess death rate due to the flu pandemic. 11 Data source: www.mhlw.go.jp. 12 The Japanese Vital Statistics database misses the mortality data from 1944 to 1946. So we use the linear interpolation to generate the population mortality indices for these three years. 13 Data source: www.destatis.de. 14 A positive y indicates mortality deterioration and a negative y suggests it it mortality improvement.

We perform the Augmented Dickey–Fuller (ADF) test to assess the null hypothesis of a unit root, the results of which are reported in Table 1for the lag number equal to 5. The test rejects the null hypothesis of a unit root at the 5% significance level or above, suggesting that the mortality improvement rate in each country is stationary. Table 1 also reports the summary statistics for the mortality improvement rate in these six countries. The average mortality improvement rate for each country is negative, indicating mortality improves over time. It is slightly skewed to the left (except for France) and has a highly positive kurtosis. As described in Section 2.1, we fit the mortality improvement rate, yit , with a conditional mean model ARMA (r , m) and a conditional variance model GARCH (p, q). We consider four types of distributions for the error terms: (1) the standard normal distribution; (2) the student-t distribution with the degree of freedom v ; (3) the normal inverse Gaussian (NIG) distribution (Wang et al., 2013), i.e., f η ( x) =

   αδ exp δ α 2 − β 2 + β (x − µ) π    K1 α δ 2 + (x − µ)2  × , δ 2 + (x − µ)2

(5)

where Kυ is the modified Bessel function of the third kind, µ is the location parameter, δ is the scale parameter, and α and β determine the shape of the NIG distribution. We assume µ +

H. Chen et al. / Insurance: Mathematics and Economics (

)



5

Table 1 Summary statistics and the ADF test results for the mortality improvement rate. US # of Obs. Mean Std. Skewness Kurtosis ADF Test (Lag = 5)

106 −0.0107 0.0376 −1.1146 10.1589 −2.9731∗∗∗ (0.0039)

UK

France

106 −0.0062 0.0695 −2.0769 16.1469 −6.1537∗∗∗ (0.0010)

Germany

106 −0.0092 0.0947 0.5525 10.7566 −5.3691∗∗∗ (0.0010)

106 −0.0075 0.0737 −1.6446 17.3179 −4.9050∗∗∗ (0.0010)

Japan 106 −0.0083 0.0514 −0.2313 6.8143 −2.4931∗∗ (0.0132)

Canada 106

−0.0073 0.0549

−0.4675 30.9440

−3.4614∗∗∗ (0.0010)

Note: Test statistics are tabulated in the table and p-values are reported in parenthesis. Matlab reported a minimum p-value of 0.001 when the test statistic is below the critical value. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level. Table 2 Optimal ARMA–GARCH model for each country.

US UK France Germany Japan Canada

AR

MA

ARCH

GARCH

Error Distribution

BIC

0 1 2 0 1 1

1 0 1 0 2 0

1 1 1 0 0 1

1 1 1 0 0 1

Student-t Variance Gamma Variance Gamma Student-t Student-t Student-t

−442.89 −409.28 −431.33 −335.38 −338.90 −458.19

 3  βδ/ α 2 − β 2 = 0 and δα 2 / α 2 − β 2 = 1 to ensure E [ηit ] = 0 and Var [ηit ] = 1; and (4) the variance gamma (VG) distribution (Wang et al., 2013), i.e.,

 2 λ α − β 2 |x − µ|λ−1/2 Kλ−1/2 (α|x − µ|) β(x−µ) fη (x) = e , √ π Γ (λ) (2α)λ−1/2

(6)

where Γ (·) is the Gamma function,µ is the location parameter, α and β determine the shape  of the variance gamma  distribution. We assume µ + 2βλ/ α 2 − β 2 = 0 and 2λ 1 + 2β 2 /  2    α − β 2 / α 2 − β 2 = 1 to ensure E [ηit ] = 0 and Var [ηit ] = 1. We use the Bayesian Information Criterion (BIC) to determine the appropriate lags in the conditional mean and conditional variance models and the best distribution for the error terms. Lag orders up to five are tested for the above criteria. Table 2 reports the optimal lag orders and the innovation distribution for each country and the associated BIC value as well. Four countries (US, UK, France and Canada) are characterized by the ARCH effect while the other two do not. The Student-t distribution is the best for US, Germany, Japan and Canada, whereas the variance gamma distribution is the best for UK and France. Based on the selected models, parameter estimates for each country are reported in Table 3. Almost all the coefficients are highly significant at the 1% level. More importantly, our selected model for each country successfully accounts for serial correlation and conditional heteroskedasticity after the ARMA–GARCH fitting. We perform the Ljung–Box Q-test on the standardized innovations for lag orders up to 10. The test results fail to reject the null hypothesis of zero autocorrelation. The Ljung–Box Q-test on the squared standardized innovations and Engle’s ARCH test on the standardized innovations confirm no conditional heteroskedasticity. These test results are shown in Table 4. 3.3. A one-factor copula model In this subsection, we use the one-factor model with varying loadings on the common factor. Based on the distribution assumptions for the factors, we consider four models: Student t-Normal, Student t–Student t, Skewed t-Normal and Skewed t-Student t. For example, in the Skewed t-Normal factor copula model, we assume that the common factor Z follows a skewed t distribution with two

shape parameters: a skewness parameter, λ ∈ (−1, 1), which controls for the degree of asymmetry, and a degree of freedom parameter, ν ∈ (2, ∞), which controls the thickness of the tails.15 We assume that the idiosyncratic factor ε follows a normal distribution with mean 0 and variance σ 2 . We further assume that they are all independent. To implement the SMM estimator of the copula models, we have to choose rank dependence measures as our ‘‘moments’’.16 We use pair-wise Spearman’s rho and the 25th and 75th quantile dependence.17 The upper (lower) triangle matrix in Table 5 reports Spearman’s rho (Kendall’s tau) rank correlations between standardized residuals obtained from the optimal ARMA–GARCH models. All the correlation coefficients are positive and significant at the 5% level or above except for those between Canada and Germany, UK and Japan (which are positive but insignificant). We report the pair-wise quantile dependence measures at the 25th and 75th percentiles in Table 6. The upper triangle matrix reports the 25th percentile dependence, and the lower triangle matrix reports the 75th percentile dependence. All quantile dependence measures are significant at the 10% level or higher. We have six countries in our sample and consider three moment conditions. Based on the moment construction procedure described in the Appendix, we compute forty-five moments and use them in the SMM estimation. The parameter estimates in each of the factor copula model are reported in Table 7. Based on the value of the objective function (i.e. the sum squared errors of the moment conditions), the skewed t–t factor copula model is the best. Under this model, the common factor and the idiosyncratic factors follow a heavy-tailed distribution with a degree of freedom of 3.2.

15 We follow Hansen (1994) to define the skewed t distribution. Its probability density function is given by

   2 −(v+1)/2  1 bz + a   ,  bc 1 + v − 2 1 − λ g (z |v, λ) =  −(v+1)/2   2   1 bz + a  bc 1 + ,  v−2 1+λ

z < −a/b z ≥ −a/b,

where λ < 1. The constants a, b, and c are given by a =  v−22 < v2 < ∞, and2−1 < Γ ((v+1)/2) . 4λc v− , b = 1 + 3λ − a2 , c = √π(v− 1 2)Γ (v/2) 16 All rank dependence measures are invariant to strictly increasing transformations, meaning that they are solely affected by the changes in the copula, not by the changes in the marginal distribution. Therefore, the use of rank dependence measures can facilitate our goal of capturing the dependence structure in the second stage, which is separated from the calibration of the marginal distribution in the first stage. 17 We consider other pairs of tail dependence measures as well, such as the 5th and 95th percentile dependence. Most of them are not statistically significant, thus failing to convey much information about tail dependence. As another robustness check, we use Kendall’s tau, the 25th and 75th percentile dependence as the ‘‘moments’’ to estimate the parameters of the factor copula model. The parameter estimates using this new set of rank dependence measures are very close to those using Spearman’s rho, the 25th and 75th percentile dependence.

6

H. Chen et al. / Insurance: Mathematics and Economics (

)



Table 3 Parameter estimates for the ARMA–GARCH model. Parameter

US

UK ∗∗∗

−0.0102

c

(0.0000)

ϕ1

−0.0073

France ∗∗∗

−0.0059

(0.0002) −1.3122∗∗∗ (0.0000) −0.3629∗∗∗ (0.0000) 0.9921∗∗∗ (0.0000)

−0.3357∗∗∗ (0.0002)

Japan

∗∗∗

−0.0179

(0.0084) −0.4690∗∗∗ (0.0000)

ϕ2 ρ1

Germany ∗∗∗

0.0005 (0.2526) 0.9069∗∗∗ (0.0000)

(0.0096)

0.0000∗∗∗ (0.0000) 0.0950∗∗∗ (0.0087) 0.8574∗∗∗ (0.0064)

α β

0.0000∗∗∗ (0.0000) 0.1886∗∗∗ (0.0023) 0.7112∗∗∗ (0.0023)

−0.0048∗∗∗ (0.0009)

−0.2142∗∗∗ (0.0091)

−1.1201∗∗∗ (0.0000) 0.3613∗∗∗ (0.0001)

ρ2 ω

Canada

0.0001∗∗∗ (0.0000) 0.1366∗∗ (0.0018) 0.7632∗∗∗ (0.0020)

0.0000 (0.0000) 0.0772∗∗∗ (0.0128) 0.9051∗∗∗ (0.0103)

Note: p-values are reported in parenthesis. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level.

Table 4 Q test and ARCH test for conditional distribution models. Lag 1 5

Ljung–Box Q-Test for the Standardized Innovations

10 1 Ljung–Box Q-Test for the squared Standardized Innovations

5 10 1 5

Engle’s ARCH Test for the Standardized Innovations

10

US

UK

France

Germany

Japan

Canada

2.1868 (0.1392) 3.9151 (0.5617) 7.3461 (0.6924)

2.5922 (0.1074) 9.5005∗ (0.0907) 16.0975∗ (0.0969)

0.2777 (0.5982) 3.8416 (0.5724) 11.1512 (0.3459)

0.8564 (0.3547) 6.7462 (0.2402) 9.4451 (0.4904)

0.3950 (0.5297) 2.2269 (0.8169) 4.7110 (0.9096)

0.3722 (0.5418) 2.4996 (0.7766) 5.5956 (0.8480)

5.5046∗∗ (0.0190) 8.5057 (0.1305) 10.6797 (0.3830)

0.0015 (0.9694) 0.4454 (0.9940) 1.4609 (0.9991)

0.1125 (0.7373) 1.3533 (0.9294) 2.4019 (0.9922)

2.3182 (0.1279) 11.3988∗∗ (0.0440) 11.5735 (0.3146)

2.6579 (0.1030) 5.3994 (0.3691) 6.9876 (0.7266)

0.0206 (0.8859) 0.1081 (0.9998) 0.2512 (1.0000)

5.3234∗∗ (0.0210) 8.1706 (0.1471) 8.7499 (0.5560)

0.0014 (0.9699) 0.3938 (0.9955) 1.1191 (0.9997)

0.1084 (0.7420) 1.1709 (0.9476) 2.1202 (0.9953)

2.2339 (0.1350) 9.0187 (0.1083) 9.9096 (0.4485)

2.5667 (0.1091) 4.1373 (0.5298) 5.3104 (0.8695)

0.0199 (0.8879) 0.0927 (0.9999) 0.2131 (1.0000)

Note: Test statistics are tabulated in the table and p-values are reported in parenthesis. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level.

Table 5 Spearman’s rho and Kendall’s tau.

US UK France Germany Japan Canada

US

UK

France

Germany

Japan

Canada

1

0.2923∗∗∗ (0.0024)

0.2660∗∗∗ (0.0059) 0.4786∗∗∗ (0.0000)

0.2481∗∗ (0.0104) 0.4840∗∗∗ (0.0000) 0.5536∗∗∗ (0.0000)

0.2440∗∗ (0.0117) 0.1238 (0.2060) 0.3337∗∗∗ (0.0005) 0.2856∗∗∗ (0.0030)

0.3512∗∗∗ (0.0002) 0.2057∗∗ (0.0344) 0.1950∗∗ (0.0451) 0.1114 (0.2554) 0.1914∗∗ (0.0494)

0.2029∗∗∗ (0.0021) 0.1770∗∗∗ (0.0072) 0.1676∗∗ (0.0113) 0.1615∗∗ (0.0142) 0.2449∗∗∗ (0.0002)

1 0.3373∗∗∗ (0.0000) 0.3289∗∗∗ (0.0000) 0.0882 (0.1810) 0.1414∗∗ (0.0319)

1 0.4032∗∗∗ (0.0000) 0.2313∗∗∗ (0.0004) 0.1371∗∗ (0.0375)

1 0.2123∗∗∗ (0.0013) 0.0720 (0.2771)

1 0.1303∗∗ (0.0481)

1

Note: The upper (lower) triangle matrix reports Spearman’s rho (Kendall’s tau) rank correlations. P-values are reported in parenthesis. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level.

Moreover, the common factor is positively skewed, suggesting that mortality dependence is stronger during mortality deteriorations than during mortality improvements. This conforms to our understanding that mortality jumps due to, for example, rare pandemic events are more likely to occur simultaneously. The loadings to the common factor are all positive, meaning that all countries are pos-

itively exposed to a common mortality trend or mortality shocks. When it comes to the magnitude of the loadings, France has the largest value, followed by two other European countries (Germany and UK); US, Japan and Canada fall into the second tier. The ranks of the loadings for these countries are consistent with mortality dynamics as depicted in Figs. 1 and 2, where France has more mor-

H. Chen et al. / Insurance: Mathematics and Economics (

)



7

Table 6 The 25th and 75th percentile dependence measures.

US

US

UK

France

Germany

Japan

Canada

1

0.4151∗∗∗ (0.0002)

0.3396∗∗∗ (0.0009) 0.4151∗∗∗ (0.0002)

0.3396∗∗∗ (0.0008) 0.3774∗∗∗ (0.0004) 0.4528∗∗∗ (0.0001)

0.9057∗∗∗ (0.0000) 0.3019∗∗∗ (0.0017) 0.4151∗∗∗ (0.0002) 0.3396∗∗∗ (0.0008)

0.2642∗∗∗ (0.0032) 0.2264∗∗∗ (0.0065) 0.4151∗∗∗ (0.0002) 0.4906∗∗∗ (0.0001) 0.7925∗∗∗ (0.0000)

0.4151∗∗∗ (0.0147) 0.3019∗∗ (0.0569) 0.4528∗∗ (0.0088) 0.9057 ∗∗∗ (0.0000) 0.4528∗∗∗ (0.0080)

UK France Germany Japan Canada

1 0.4151∗∗ (0.0155) 0.4151∗∗∗ (0.0140) 0.4528∗∗∗ (0.0082) 0.3019∗ (0.0595)

1 0.4528∗∗∗ (0.0089) 0.3396∗∗ (0.0377) 0.4528∗∗∗ (0.0089)

1 0.3396∗∗ (0.0398) 0.2642∗ (0.0855)

1 0.7547∗∗ (0.0000)

1

Note: The upper (lower) triangle matrix reports the 25th (75th) percentile dependence. P-values are calculated based on bootstrapping with 10,000 replications and reported in parenthesis. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level.

depending on the state of nature ω. Under any risk-neutral measure Q , we have

Table 7 Parameter estimates in the factor copula model. t-normal

γUS γUK γFracne γGermany γJapan γCanda 1/v

∗∗∗

0.6250 (0.0000) 1.0000∗∗∗ (0.0000) 4.3750∗∗∗ (0.0000) 1.2500∗∗∗ (0.0000) 0.6250∗∗∗ (0.0000) 0.5000∗∗∗ (0.0000) 0.2500∗∗∗ (0.0000)

t–t 0.4980 (0.0000) 0.7480∗∗∗ (0.0000) 1.2402∗∗∗ (0.0000) 0.9980∗∗∗ (0.0000) 0.3730∗∗∗ (0.0000) 0.2480∗∗∗ (0.0000) 0.3124∗∗∗ (0.0000)

λ σ2 Value of Obj. Fun.

1.7510∗∗∗ (0.0000) 0.2856

skew t-normal ∗∗∗

0.2777

∗∗∗

1.1230 (0.0000) 1.7480∗∗∗ (0.0001) 1.9980∗∗∗ (0.0000) 1.1230∗∗∗ (0.0000) 0.9975∗∗∗ (0.0000) 0.7480∗∗∗ (0.0000) 0.1247∗∗∗ (0.0000) 0.0911∗∗∗ (0.0000) 2.0010 ∗∗∗ (0.0000) 0.2785

skew t–t 0.7480∗∗∗ (0.0000) 1.2480∗∗∗ (0.0003) 1.8730∗∗∗ (0.0000) 1.7481∗∗∗ (0.0000) 0.7480∗∗∗ (0.0000) 0.4908∗∗∗ (0.0000) 0.3126∗∗∗ (0.0000) 0.0795∗∗∗ (0.0000)

0.2715

Note: p-values are reported in parenthesis. ∗∗∗ (∗∗ or ∗ ) indicates significance at the 1% (5% or 10%) level.

tality spikes than any other countries in our sample and the three European countries as a whole experience more volatile mortality developments than their counterparties in North America and Asia. 4. Pricing methodology: maximum entropy principle In this paper, we consider ‘‘canonical valuation’’, developed by Stutzer (1996), as our pricing strategy. This method identifies a risk-neutral measure by minimizing the Kullback–Leibler information criterion, or equivalently, by maximizing the Shannon entropy (Rittelli, 2000), so it is also called as the maximum entropy principle. The use of canonical valuation has been considered in only a few previous studies on pricing mortality-linked securities (see, e.g. Li, 2010; Kogure and Kurachi, 2010; Li and Ng, 2011). Unlike the Wang transform or other risk-neutralizing strategies, the maximum entropy approach ‘‘does not strictly require the use of security prices to predict other security prices’’ (Li, 2010; Li and Ng, 2011). This advantage is particularly important in today’s market of mortality securitization where we do not have sufficient market transaction data to infer the risk-neutral probability distribution. Nevertheless, in the future when the market becomes more mature, the method can be modified easily to incorporate more market prices in estimating a risk-neutral density. Suppose in a market there are m distinct primary securities, whose values are contingent on the state of nature ω. Denote Si the time zero price for the ith primary security in the market and Si (ω) the time zero discounted payoff at the risk-free interest rate

E Q [Si (ω)] = Si ,

i = 1 , 2 , . . . , m.

(7)

In an incomplete market there are infinitely many risk-neutral measures. One way to choose a risk-neutral measure is to minimize the Kullback–Leibler information divergence: Q0 = arg min D(Q , P ) = arg min E Q

P



dP

Q

subject to E [Si (ω)] = Si , Q

dQ

ln

dQ



dP

(8)

i = 1 , . . . , m.

Intuitively, the Kullback–Leibler information criterion measures the information gained by deviating from the physical measure P to the risk-neutral measure Q . The physical measure may be viewed as the prior distribution from a Bayesian perspective. With the given prices of the m primary securities, we readily update the prior by adding up the information contained in these prices. In other words, we update the probability measure to achieve a minimal information gain, subject to the price constraints. To implement, we first generate a large number of scenarios ωj (j = 1, . . . , n) with an equal probability under the physical measure P, i.e., Pr(ω = ωj ) ≡ πj = 1/n. Let πj∗ (j = 1, . . . , n) be the probability of scenario ωj under a risk-neutral measure Q . The constrained maximization problem can be rewritten as

 ∗ n  πj  ∗ n ∗ , πj j=1 = arg min πj ln ∗ πj πj j =1 subject to

n 

πj∗ = 1 and

j =1

n 

Si (ωj )πj∗ = Si ,

(9)

j =1

i = 1, 2, . . . , m. The solution to this minimization problem is well known to be

m 

 πj exp γi Si (ωj ) i=1  , πˆ j∗ = n m   πj exp γi Si (ωj ) j =1

for j = 1, 2, . . . , n

(10)

i =1

m i i=1

and {γ } is a set of constants obtained by solving the following equations n 

Si =

Si (ωj ) exp

m 

j =1

 γi Si (ωj )

i=1 n  j=1

exp

m  i=1

 γi Si (ωj )

,

for i = 1, 2, . . . , m.

(11)

8

H. Chen et al. / Insurance: Mathematics and Economics (

)



Table 8 The Vita III mortality bond. Vita III Tranche A-VII Size Issue date Maturity Coupon(bps) Spread (with dependence) Spread (without dependence) Attachment Detachment Index

We notice that Eq. (11) coincides with a particular case of the Esscher transform (Gerber and Shiu, 1994) applied to the empirical distribution. It is justified by the equivalence between the minimization of the Kullback–Leibler information criteria and the maximization of expected exponential utility (Rittelli, 2000), which also indicates a connection between the Esscher transform and the maximum entropy principle (Kull, 2002). While the Esscher transform relies on a single parameter and thus usually incorporates only one risk-neutrality constraint, the maximum entropy principle can easily adapt to multiple risk-neutrality constraints. After solving for the risk-neutral probability πj∗ (j = 1, . . . , n), we can easily place a value on another security whose payoff is contingent on the scenarios that we generate. Suppose the security   has a time zero discounted payoff at a risk free interest rate, v ωj , for scenario ωj (j = 1, . . . , n), the price of this security, V , is given by V =

n 

v(ωj )πj . ∗

Vita III Tranche B-I

Vita III Tranche B-II

Euro 100M $90M $50M December 2006 December 2006 December 2006 5 years 4 years 5 years EURIBOR+80 LIBOR+110 LIBOR+112 83 114 105 69 98 104 1.25 ∗ [(q04 + q05)/2] 1.20 ∗ [(q04 + q05)/2] 1.20 ∗ [(q04 + q05)/2] 1.45 ∗ [(q04 + q05)/2] 1.25 ∗ [(q04 + q05)/2] 1.25 ∗ [(q04 + q05)/2] US 62.5%, UK 17.5%, Germany 7.5%, Japan 7.5%, Canada 5%

(12)

which applies predetermined weights, wi (US 62.5%, UK 17.5%, Germany 7.5%, Japan 7.5% and Canada 5%), to the annual population N (i) mortality, qit , i.e., qt = i=1 wi qt . The principal of the Vita III notes begins to be at risk if the average CMI over any two-year measurement period within the risk coverage period, denoted by q +q qˆ t ≡ t −12 t , exceeds predefined percentages (125% for Tranche A-VII, 120% for Tranche B-I and B-II) of the base level q0 . The bond will be exhausted if qˆ t is above 145% of the base level for Tranche A-VII, 125% for Tranche B-I and B-II. The base level q0 is defined as q +q the average CMI in 2004 and 2005, i.e., q0 = 04 2 05 . The loss, as a percentage of the bond principal, in year t can be written as

 0,    qˆ t − a Lt = ,   b−a 1,

if qˆ t < a if a ≤ qˆ t ≤ b ,

(13)

if qˆ t > b

j =1

5. Applications 5.1. Mortality risk pricing

where a is the attachment point and b is the detachment point described above. Coupons are only paid on the remaining principal at a rate of LIBOR (or EURIBOR) plus a spread (80, 110 or 112 bps for Tranche A-VII, B-I and B-II, respectively). Risk-neural pricing

Swiss Re has a strong track record of transforming pure risks into speculative risks that are traded in capital markets; this was initially done with natural catastrophe bonds but more recently by periodically securitizing its life risks. Swiss Re has obtained over USD 1.5 billion in extreme mortality risk protection from its Vita program since 2003. In this subsection, we use the Vita III bond that was issued by Swiss Re as one example to illustrate how to apply our model framework to price mortality-linked securities with payoffs linked to the mortality experience of multiple populations. The vita III bond The coverage for Swiss Re through Vita III notes was well balanced with respect to maturity (4 or 5 years), layering (different attachment or detachment points), currency (Euro or US dollars) and monoline involvement (wrapped vs unwrapped).18 We consider three unwrapped tranches, i.e., Tranche A-VII, B-I, and B-II. The basic characteristics and the payoff structure of these three tranches are summarized in Table 8. These three tranches were issued in December 2006, with maturities of 4 years (Tranche B-I) and 5 years (Tranche A-VII or B-II). Tranche A-VII is denominated in Euros with an issue size of 100 million, whereas Tranche B-I and B-II are denominated in US dollars with issue sizes of 90 and 50 million respectively. The coverage of Vita III is based on a combined mortality index (CMI),

18 Unwrapped tranches refer to the tranches of which the principal repayments are not guaranteed by monoline insurers.

To derive the risk-neutral measure, we select the following seven mortality bond transactions as those used in Lin et al. (2013): the Vita II bond (Tranches B, C and D), the Tartan bond (Tranche B) and the Osiris bond (Tranche B2, C and D). We summarize the issue size, date, maturity, coupon rate, attachment and detachment points, and index composition for these bonds in Table 9. Using the copula-based multivariate analysis described in Section 3, we simulate 100,000 paths of mortality rates for each country, i.e., 100,000 mortality scenarios with an equal probability πj = 1/100,000 (j = 1, . . . , 100,000). For each simulated mortality path, we calculate the mortality index, qˆ t , for each bond and thus the discounted bond payoff at a risk-free interest rate.19 We then solve for the risk-neutral probability πj∗ (j = 1, . . . , 100,000) for each path based on Eqs. (10) and (11). Given the risk-neutral probability, we solve for the actuarially fair premium over LIBOR (or EURIBOR) for each tranche and report them in Table 8. Comparing our estimates with the actual spreads that Vita III offered, we can

19 We use the U.S. Treasury yield curve rates or the AAA-rated euro area central government bond yield curve rates as the risk-free interest rates. U.S. Treasury yield curve rates are available at the US Department of the Treasury http://www. treasury.gov/resource-center/data-chart-center/interest-rates/Pages/default.aspx AAA-rated euro area central government bond yield curve rates are available at the European Central Bank http://www.ecb.europa.eu/stats/money/yc/html/index.en. html.

H. Chen et al. / Insurance: Mathematics and Economics (

)



9

Table 9 Mortality bonds used to derive the risk-neutral measure. Source: Bauer and Kramer (2007) and Lin et al. (2013). Vita II Tranche B

Vita II Tranche C

Vita II Tranche D

Size Issue date Maturity Coupon(bps) Attachment Detachment Index

$62M $200M April 2005 April 2005 5 years 5 years LIBOR+90 LIBOR+140 1.20*[(q02+q03)/2] 1.15*[(q02+q03)/2] 1.25*[(q02+q03)/2] 1.20*[(q02+q03)/2] US 62.5%, UK 17.5%, Germany 7.5%, Japan 7.5%, Canada 5%

Size Issue date Maturity Coupon(bps) Attachment Detachment Index

Tartan Tranche B $ 80M May 2006 3 years LIBOR+300 1.10 ∗ [(q04 + q05)/2] 1.15 ∗ [(q04 + q05)/2] US 100%

Size Issue date Maturity Coupon(bps) Attachment Detachment Index

Osiris Tranche B2 EURO 50M November 2006 4 years EURIBOR+120 1.14 ∗ [(q04 + q05)/2] 1.19 ∗ [(q04 + q05)/2] US 15%, Japan 25%, France 60%

see that our estimates are very close to the actual par spreads. This finding confirms the validity of our model to be used for pricing mortality-linked securities with correlated morality risks. Note that in our factor copula model mortality rates across countries are correlated due to the existence of the common factor Z . If we restrict the coefficient of Z , γi , equal to zero, then we reach the case without mortality dependence. We simulate 100,000 paths of mortality rates for each country and solve for the actuarially fair premium for each tranche under this assumption, the results of which are reported in Table 8 as well. Comparing with the case assuming mortality dependence, the spread for each tranche without dependence becomes lower. This result is intuitive because the morality rates among these countries have positive correlations with each other (see Spearman’s rho and the Kendall’s tau in Table 5). A mortality model with positive correlations generates a higher mortality risk than the model without dependence since the mortality index is more likely to exceed the trigger level. As a result, investors will request higher premiums for the higher risk that they take. 5.2. Longevity risk hedging Longevity exposures are estimated by Swiss Re to be $23 trillion globally. The financial modeling firm, Risk Management Solutions (RMS), estimates that U.S. and European pensions are at risk of losing 10% of its $10 trillion portfolio in a once-in-100-year event that increases the average life expectancy of pensioners by five years.20 As a global reinsurer, Swiss Re exposes itself to significant longevity risk and mortality risk in its books of pension and life business. Though these two types of risks can offset each other, such natural hedging strategies may not be perfect since the mortality experiences of the reference populations in these two lines of business are not exactly the same. In order to hedge the residual longevity risk, Swiss Re issued the first longevity trend bond, i.e., the Kortis bond, in December 2010. It was reported that ‘‘the bond would trigger in the event there is a large divergence in

20 Source: http://online.wsj.com/article/BT-CO-20130207-713432.html.

$100M April 2005 5 years LIBOR+190 1.10*[(q02+q03)/2] 1.15*[(q02+q03)/2]

Osiris Tranche C $150M November 2006 4 years LIBOR+285 1.10 ∗ [(q04 + q05)/2] 1.14 ∗ [(q04 + q05)/2]

Osiris Tranche D $100M November 2006 4 years LIBOR+500 1.06 ∗ [(q04 + q05)/2] 1.10 ∗ [(q04 + q05)/2]

the mortality improvements experienced between male lives aged 75–85 in England & Wales and male lives aged 55–65 in the US’’.21 To illustrate how to use such kind of a bond to hedge residual longevity risk of a life insurer, we create a simplified example described as follows. Suppose that a life insurer has books of pensions and life insurance business. There are 100,000 pensioners and 1000 life insurance policyholders in each pool. The pension plan pays out $1 each year to every survivor in the plan and the life insurance policy pays out $1000 upon death of the policyholder. Instead of assuming the life insurer’s books of business are based on a certain cohort in the US or UK, we assume the pension plan consists entirely of members whose mortality characteristics are the same as the UK population and the life insurance pool consists of members whose mortality experience is the same as the US population.22 (i) (i) (i) We denote Qt = (q1 , . . . , qt ) the mortality  path  for country i (i = US or UK) from time 1 to time t. Denote V1 QtUS the time-zero value of liabilities in the insurer’s life insurance portfolio   which is contingent on the US mortality path, QtUS , and V2 QtUK the timezero value of liabilities in its pension plans depending on the UK mortality path, QtUK . The total time-zero value of its liabilities, if the plan is unhedged, can be written as X = V1 QtUS + V2 QtUK .









(14)

Mortality improvement in the US will reduce the payments to its life insurance policies and mortality improvement in the UK will increase the payouts to the pension plans. This natural hedge,

21 http://www.swissre.com/media/news_releases/Swiss_Re_completes_first_ longevity_trend_bond_transferring_USD_50_million_of_longevity_trend_risk_to_ the_capital_markets.html. 22 We understand that pension plans and life insurers target different age groups. Pensioners are usually older than life insurance policyholders. However, we currently do not have mortality data for different age groups in the six countries (US, UK, France, Germany, Japan and Canada) for the time period 1900–2006. The Human Mortality Database does not report age-specific mortality rates for such a long time period for some countries (for example, US data starts in 1933, Germany starts in 1990, Japan starts in 1947 and Canada starts in 1921). Our proposed multipopulation mortality model can be extended to jointly estimate the mortality rates for male and female, and different age groups once such data is available.

10

H. Chen et al. / Insurance: Mathematics and Economics (

)



Table 10 Hedge ratio and hedge efficiency for each tranche.

·

Attachment (1-b)

Detachment (1-a)

E[Loss]

Correlation

Spread(bps)

Var(X )

Var(X ∗ )

Hedge Demand

Efficiency

Tranche 1 Tranche 2 Tranche 3 Tranche 4 Tranche 5 Tranche 6 Tranche 7 Tranche 8 Tranche 9 Tranche 10

0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70

0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75

0.0654 0.0769 0.0879 0.1027 0.1191 0.1391 0.1604 0.1861 0.2156 0.2506

0.3073 0.3264 0.3370 0.3520 0.3674 0.3870 0.4031 0.4172 0.4276 0.4373

90 116 142 177 216 266 321 391 474 580

623606.1 623606.1 623606.1 623606.1 623606.1 623606.1 623606.1 623606.1 623606.1 623606.1

548284.0 539513.2 534748.2 526933.5 521478.6 511148.8 502939.2 495878.1 488232.0 484292.4

10.42 10.21 9.83 9.45 9.04 8.78 8.44 8.02 7.61 7.11

0.1208 0.1348 0.1425 0.1550 0.1638 0.1803 0.1935 0.2048 0.2171 0.2234

Note: E[loss] is the expected bond principal loss. Correlation refers to the correlation between bond principal loss and the insurer’s total unhedged liability.

however, is imperfect because the life and pension pools are based on two different populations. The insurer therefore wants to hedge the residual risk by issuing a bond with a structure similar to the Kortis bond. The bond has a face value of $100 and will mature in five years. Bond investors receive an annual coupon payment based on the LIBOR plus a fixed margin. The bond principal loss, as a percentage of the total principal, in year t is given by

 US 0, if qUK  t  − qt > b∆q0   UK  b∆q0 − qt − qUS  t , Lt = , ( b − a)∆q0  UK US   if a ∆ q ≤ q − q ≤ b ∆ q 0 0 t t   1, if qUK − qUS t t < a∆q0

(15)

(i)

where ∆q0 = qUK − qUS 0 0 and q0 (i = US or UK) is defined as the two-year population mortality average in 2004 and 2005, (i) (i) q +q

(i)

− qUS i.e., q0 = 04 2 05 . The bond triggers when qUK t t is lower than UK b∆q0 , and the bond is exhausted when qt − qUS is lower than t a∆q0 . This payoff structure is straightforward to understand after UK US we rearrange the terms. The  UStrigger,  qt − qt ≤ b∆q0 , can be reUK UK US arranged as q0 − qt − q0 − qt ≥ (1 − b) ∆q0 , which means that the bond principal reduces once the difference in mortality (i) (i) improvement, q0 − qt , between UK and US is large enough. That is to say, the attachment point is indeed 1 − b in this case. Similarly we can show that the detachment point is 1 − a. Suppose the insurer has issued hunits of the bond to hedge the risk. Its time-zero value of hedged liabilities can then be written as X ∗ = V1 QtUS + V2 QtUK + h × H (Qt ) ,













h

      = Var V1 QtUS + V2 QtUK + h × H (Qt ) .

(17)

Upon determining the optimal hedge ratio, we can measure hedge effectiveness, R, in terms of the amount of risk reduction, i.e., R=1−

Var (X ∗ ) Var (X )

.23

As we can see from Table 10, the expected principal loss of the bond increases with the increase in the attachment and detachment points. So does the spread for each tranche. Tranche 9 has a spread of 474 bps, which is the closest to that of the Kortis bond (472 bps). By issuing 7.61 units of the tranche 9 bond, the insurer can reduce the variance of its total liability by 21.71%. We further observe that a higher correlation between the bond principal loss and the insurer’s unhedged total liability is associated with a lower demand for the bond and a higher level of hedge efficiency. This pattern can be seen even clearer in Fig. 3.

(16)

where H is the present value of all payoffs in the hedging portfolio. Some questions arise such as: What is the actuarially fair premium spread that the insurer offers? What is the optimal hedge ratio, h, to minimize the uncertainty in the insurer’s cash flows? How shall we assess the hedging effectiveness? These questions can easily be answered within our model framework. We can solve for the par spread following the same procedure as that described in Section 5.1. The optimal hedge ratio, h, can be chosen by minimizing the variance in the present value of cash flows, i.e., h∗ = arg min Var X ∗

Fig. 3. Hedge demand and hedge effectiveness.

(18)

23 We use variance as our risk metric to select the optimal hedge ratio and assess the hedge effectiveness. This choice is consistent with the use of hedge effectiveness

6. Conclusions Many life insurers and reinsurers operate internationally and pool policies across countries. It is, therefore, increasingly important for them to understand the joint mortality dynamics for multiple populations and assess mortality risk in their own books of business. To hedge the risk, they actively participate in mortality securitizations and engage in issuing or trading mortalitylinked securities. The reference populations associated with these hedging instruments usually exhibit mortality improvement rates that are different from those of the hedger’s population and this introduces basis risk that the hedger must manage. In addition, almost all mortality securitizations, albeit their difference in structure, have payoffs contingent on a weighted average mortality index of multiple populations. Therefore, developing mortality models for multiple populations and understanding the correlated mortality risks is crucial for life insurers and reinsurers.

in international accounting standards (see Coughlan et al., 2004). Other metrics such as value at risk (Li and Hardy, 2011) or expected shortfall (Wylie et al., 2010) can be used.

H. Chen et al. / Insurance: Mathematics and Economics (

Although there is a rich literature for mortality modeling in a single population, only a few papers (see, e.g., the references cited in introduction) examine two-population mortality models. These models, however, are based on a critical assumption that mortality rates of two populations do not diverge in the long run. This seems justifiable for long-run mortality forecasts but seems too strong for short-term mortality forecasts in which the mortality correlations may result from some catastrophic events such as a pandemic. In addition, in spite of the extensive use of copula models in finance and economics, no previous research has explored its application to multi-population mortality analysis. We contribute to the literature by proposing a copula-based multivariate mortality model to describe mortality dynamics and correlations for multiple populations. Copulas are designed to capture correlations among risks, particularly tail risks, which are well suited to modeling and pricing extreme mortality risks. More importantly, the factor copula model that we introduce has superior benefits in several aspects. First, the parameter sets in marginal distributions and the copula for this model are separated and this allows a two-stage procedure that greatly simplifies the task of parameter estimation. Therefore, it is particularly useful to high dimension application. This characteristic is important when the number of populations that we need to model is large. Second, this factor copula model has a simple (additive) form but can produce meaningful model implications. Our model selection leads to a skewed t–t factor copula model. The fat-tailed and positively skewed common factor can generate mortality data corresponding to catastrophic morality events such as natural disasters or pandemics. It also indicates that mortality dependence is stronger during mortality deterioration than during mortality improvements. Third, this model is very flexible in model specification according to data availability and the number of variables that need to be specified. We can extend our model to incorporate the cohort effect and other socioeconomic effects by adding more factors. We can also reduce the number of parameters by creating an ‘equi-dependence’ structure. We leave that to future research. Finally, we illustrate how to apply our model to mortality risk pricing and hedging. We adopt the maximum entropy approach as our pricing methodology and use seven mortality bonds as primary assets to derive a risk-neutral measure. We then use that measure to calibrate the prices of the Vita III bond. Our estimated spreads are quite close to the actual ones, indicating our model is useful for mortality risk pricing. We give another example that has a structure similar to the Kortis bond. We show how a life insurer with pension and life insurance books of business can use the bond to hedge the residual longevity risk that remains after natural hedging. We also evaluate the hedging effectiveness. Appendix. Simulated method of moments

˜ S (θ) be a vector of dependence measures computed using Let m ˆ T be the corresponding S simulations from Fx (θ), {Xs }Ss=1 , and let m vector of dependence measures computed from the standardized  T

residuals ηˆ t t =1 . The SMM estimator is defined as

ˆ T gT ,S (θ ) θˆT ,S ≡ arg min gT′ ,S (θ ) W θ∈Θ

ˆ T is some positive definite ˆT − m ˜ S (θ) and W where gT ,S (θ ) ≡ m weight matrix. In this paper, we use an identify matrix as the weight matrix because we do not have any prior on the weights that we need to assign to each dependence measure. Under regularity conditions, Oh and Patton (2013) show that if S /T → ∞ as T → ∞, the SMM estimator is consistent and asymptotically normal  d √  T θˆT ,S − θ0 − → N (0, Ω0 )

as T , S → ∞.

)



11

The variance–covariance matrix Ω0 is defined as

 −1   −1 ′  , G0 W0 Σ0 W0 G0 G′0 W0 G0 Ω0 = G′0 W0 G0   ˆ T , G0 ≡ ∇θ g0 (θ0 ) and g0 (θ0 ) ≡ p-limT ,S →∞ where Σ0 ≡ avar m gT ,S (θ). Oh and Patton (2013) also present the distribution of a test of the over-identifying restriction (the J test). They show that a simple independent and identically distributed (i.i.d.) bootstrap can be used to consistently estimate Σ0 and that a standard numerical ˆ will consistently derivative of gT ,S (θ ) at θˆT ,S , denoted by G, estimate G0 under the condition that the step size of the numerical derivative goes to zero slower than T −1/2 . Moments construction Suppose there are N countries. Denote δij one of the chosen dependence measure (for instance, Spearman’s rho) between country i and j, and define dependence matrix as



1

 δ12

DN ×N =   ..

.

δ1N

δ12 1

.. .

δ2N

··· ··· .. . ···

 δ1N δ2N  ..  . . δNN

Considering the copula model with one common factor but different loadings for each country, we create a vector of dependence measures [δ(1,2) , δ(1,3) , . . . , δ(N −1,N ) ] using all off diagonal terms in the upper triangular matrix of DN ×N as the moment conditions, where δ(i,j) is the pair-wise dependence measure between country i and country j. In this way, we are able to take into account the heterogeneity in pair-wise dependence measures among different countries. The number of the moment conditions for one chosen dependence measure is equal to N (N − 1)/2. In our numerical study, we have six countries in our sample and consider three moment conditions, i.e., spearman’s rho and the 25th and 75th percentile dependence. This gives as a total of 3N (N − 1)/2 = 45 moments for estimation. We consider a time series of length T = 106, and we use S = 100,000 simulations in the computation of the dependence measures to be estimated in the SMM optimization. Standard errors are based on 10,000 bootstraps ˆ to estimate ΣT ,S , and with a step size εT = 0.01 to compute G. References Andersen, L.B.G., Sidenius, J., 2004. Extensions to the Gaussian copula: Random recovery and random factor loadings. J. Credit Risk 1 (1), 29–70. Bauer, D., Boerger, M., Russ, J., 2010. On the pricing of longevity-linked securities. Insurance Math. Econom. 46 (1), 139–149. Bauer, D., Kramer, F.W., 2007. Risk and valuation of mortality contingent catastrophe bonds. Working Paper, Ulm University, Germany. Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control, second ed.. Holden-Day, San Francisco. Cairns, A.J.G., Blake, D., Dowd, K., 2006. A Two-factor model for stochastic mortality with parameter uncertainty: Theory and calibration. J. Risk Insur. 73 (4), 687–718. Cairns, A.J.G., Blake, D., Dowd, K., Coughlan, G.D., Khalaf-Allah, M., 2011. Bayesian stochastic mortality modelling for two populations. Astin Bull. 41, 29–59. Carriere, J.F., 2000. Bivariate survival models for coupled lives. Scand. Actuar. J. 1, 17–31. Chai, C.M.H., Siu, T.K., Zhou, X., 2013. A double-exponential GARCH model for stochastic mortality. Eur. Actuar. J. 3 (2), 385–406. Chen, H., Cox, S.H., 2009. Modeling mortality with jumps: Applications to mortality securitization. J. Risk Insur. 76 (3), 727–751. Chen, H., Cox, S.H., Wang, S.S., 2010. Is the home equity conversion program in the United States sustainable? Evidence from pricing mortgage insurance premiums and non-recourse provisions using the conditional Esscher transform. Insurance Math. Econom. 46 (2), 371–384. Coughlan, G.D., Emery, S., Kolb, J., 2004. HEAT (Hedge Effectiveness Analysis Toolkit): A consistent framework for assessing hedge effectiveness under IAS 39 and FAS 133. J. Deriv. Account. 1 (2), 221–272. Denuit, M., Devolder, P., Goderniaux, A.C., 2007. Securitization of longevity risk: Pricing survivor bonds with Wang Transform in the Lee–Carter framework. J. Risk Insur. 74 (1), 87–113.

12

H. Chen et al. / Insurance: Mathematics and Economics (

Denuit, M., Dhaene, J., Le Bailly de Tilleghem, C., Teghem, S., 2001. Measuring the impact of dependence among insured lifelengths. Belg. Actuar. Bull. 1 (1), 18–39. Dowd, K., Blake, D., Cairns, A.J.G., Dawson, P., 2006. Survivor swaps. J. Risk Insur. 73 (1), 1–17. Dowd, K., Cairns, A.J.G., Blake, D., Coughlan, G.D., Khalaf-Allah, M., 2011. A gravity model of mortality rates for two related populations. N. Am. Actuar. J. 15 (2), 334–356. Embrechts, P., Lindskog, F., McNeil, A., 2003. Modeling dependence with copulas and applications to risk management. In: Rachev, S. (Ed.), Handbook of Heavy Tailed Distributions in Finance. Elsevier, pp. 329–384. Chapter 8. Frees, E.W., Carriere, J.F., Valdez, E.A., 1996. Annuity valuation with dependent mortality. J. Risk Insur. 63 (2), 229–261. Frees, E.W., Valdez, E.A., 1998. Understanding relationships using copula. N. Am. Actuar. J. 2 (1), 1–25. Gao, Q., Hu, C., 2009. Dynamic mortality factor model with conditional heteroskedasticity. Insurance Math. Econom. 45 (3), 410–423. Gerber, H.U., Shiu, E.S.W., 1994. Option pricing by esscher transforms. Trans. Soc. Actuar. 46, 99–191. Giacometti, R., Bertocchi, M., Rachev, S.T., Fabozzi, F.J., 2012. A comparison of the Lee–Carter model and AR–ARCH model for forecasting mortality rates. Insurance Math. Econom. 50 (1), 85–93. Giacometti, R., Ortobelli, S., Bertocchi, M.I., 2009. Impact of different distributional assumptions in forecasting Italian mortality rates. Invest. Manag. Financ. Innov. 6, 186–193. Hansen, B.E., 1994. Autoregressive conditional density estimation. Internat. Econom. Rev. 35 (3), 705–730. Hull, J., White, A., 2004. Valuation of a CDO and nth to default CDS without Monte Carlo simulation. J. Deriv. 12 (2), 8–23. Jarner, S.F., Kryger, E.M., 2011. Modelling adult mortality in small populations: the SAINT model. Astin Bull. 41, 377–418. Joe, H., 1997. Multivariate Models and Dependence Concepts. In: Monographs in Statistics and Probability, vol. 73. Chapman and Hall, London. Joe, H., 2005. Asymptotic efficiency of the two-stage estimation method for copulabased models. J. Multivariate Anal. 94, 401–419. Kogure, A., Kurachi, Y., 2010. A Bayesian approach to pricing longevity risk based on risk-neutral predictive distributions. Insurance Math. Econom. 46 (1), 162–172. Kull, A., 2002. A unifying approach to pricing insurance and financial risk. CAS Forum Winter 2003, Data Management, Quality and Technology Call Papers and Ratemaking Discussion Papers, pp. 317–350. Laurent, P., Gregory, J.K., 2005. Basket default swaps, CDOs and factor copulas. J. Risk 7 (4), 103–122. Lee, R.D., Carter, L.R., 1992. Modeling and forecasting U.S. mortality. J. Amer. Statist. Assoc. 87, 659–671. Lee, R.D., Miller, T., 2001. Evaluating the performance of the Lee–Carter method for forecasting mortality. Demography 38 (4), 537–549. Li, D.X., 2000. On default correlation: A copula function approach. J. Fixed Income 9 (4), 43–54.

)



Li, J.S.-H., 2010. Pricing longevity risk with the parametric bootstrap: A maximum entropy approach. Insurance Math. Econom. 47 (2), 176–186. Li, J.S.-H., Hardy, M.R., 2011. Measuring basis risk in longevity hedges. N. Am. Actuar. J. 15 (2), 177–200. Li, J.S.-H., Hardy, M.R., Tan, K.S., 2010. Pricing and hedging the no-negative-equityguarantee in equity release mechanisms. J. Risk Insur. 77 (2), 499–522. Li, N., Lee, R., 2005. Coherent mortality forecasts for a group of population: An extension to the classical Lee–Carter approach. Demography 42, 575–594. Li, J.S.-H., Ng, A.C.-Y., 2011. Canonical valuation of mortality-linked securities. J. Risk Insur. 78 (4), 853–884. Lin, Y., Cox, S.H., 2008. Securitization of catastrophe mortality risks. Insurance Math. Econom. 42 (2), 628–637. Lin, Y., Liu, S., Yu, J., 2013. Pricing mortality securities with correlated mortality indexes. J. Risk Insur. 80 (4), 921–948. Nelsen, R.B., 2006. An Introduction to Copulas, second ed.. Springer, U.S.A. Oh, D.H., Patton, A.J., 2013. Simulated method of moments estimation for copulabased multivariate models. J. Amer. Statist. Assoc. 108 (502), 689–700. Oh, D.H., Patton, A.J., 2014. Modelling dependence in high dimensions with factor copulas, working paper, Duke University. Patton, A.J., 2006. Estimation of multivariate models for time series of possibly different lengths. J. Appl. Econom. 21 (2), 147–173. Renshaw, A.E., Haberman, S., 2006. A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insurance Math. Econom. 38, 556–570. Rittelli, M., 2000. The minimal entropy martingale measure and the valuation problem in incomplete market. Math. Finance 10, 39–52. Shemyakin, A., Youn, H., 2006. Copula models of joint last survivor analysis. Appl. Stoch. Models Bus. Ind. 22, 211–224. Stutzer, M., 1996. A simple nonparametric approach to derivative security valuation. J. Finance 51, 1633–1652. Vasicek, O., 1987. Probability of loss on loan portfolio. KMV Corporation, Mimeo. Wang, C.-W., Huang, H.-C, Liu, I.-C., 2013. Mortality modeling with non-Gaussian innovations and applications to the valuation of longevity swaps. J. Risk Insur. 80 (3), 775–798. Wylie, J.J., Zhang, Q., Siu, T.K., 2010. Can expected shortfall and value-at-risk be used to statically hedge options? Quant. Finance 10, 575–583. Yang, S.S., Wang, C.-W., 2013. Pricing and securitization of multi-country longevity risk with mortality dependence. Insurance Math. Econom. 52 (2), 157–169. Youn, H., Shemyakin, A., 1999. Statistical aspects of joint life insurance pricing. In: 1999 Proceedings of the Business and Statistics Section of the American Statistical Association, pp. 34–38. Youn, H., Shemyakin, A., 2001. Pricing practices for joint last survivor insurance. Actuar. Res. Clearing House 2001.1. Zhou, R., Li, J.S.-H., Tan, K.S., 2013a. Pricing standardized mortality securitizations: A two-population model with transitory jump effects. J. Risk Insur. 80 (3), 733–774. Zhou, R., Wang, Y., Kaufhold, K., Li, J.S.-H., Tan, K.S., 2013b. Modeling mortality of multiple populations with vector error correction models: Applications to solvency II, working paper.