Insurance: Mathematics and Economics 71 (2016) 93–102
Contents lists available at ScienceDirect
Insurance: Mathematics and Economics journal homepage: www.elsevier.com/locate/ime
Issues with the Smith–Wilson method Andreas Lagerås a,b,∗ , Mathias Lindholm b a b
AFA Insurance, Sweden Department of Mathematics, Stockholm University, Sweden
article
info
Article history: Received February 2016 Received in revised form August 2016 Accepted 26 August 2016 Available online 6 September 2016 Keywords: Smith–Wilson Discount curve Yield curve Interpolation Extrapolation Hedging Totally positive matrix Solvency II
abstract We analyse various features of the Smith–Wilson method used for discounting under the EU regulation Solvency II, with special attention to hedging. In particular, we show that all key rate duration hedges of liabilities beyond the Last Liquid Point will be peculiar. Moreover, we show that there is a connection between the occurrence of negative discount factors and singularities in the convergence criterion used to calibrate the model. The main tool used for analysing hedges is a novel stochastic representation of the Smith–Wilson method. © 2016 Elsevier B.V. All rights reserved.
1. Introduction In the present paper we analyse the mandated method for calculating the basic risk-free interest rate under Solvency II, the so-called Smith–Wilson method. This is an extra- and interpolation method, which is based on a curve fitting procedure applied to bond prices. The technique is described in a research note by Smith and Wilson from 2001, see Smith and Wilson (2000). Since Smith and Wilson (2000) is not publicly available, we have chosen to follow the notation of the European Insurance and Occupational Pensions Authority (EIOPA) given in EIOPA (2015). The primary aim with the current paper is to present problems with the Smith–Wilson method, especially with regards to hedging interest rate risk. We show analytically that the oscillating behaviour observed numerically by Ovtchinnikov (2015) and Rebel (2012) is always present (Section 3.1). Our main theoretical tool is a representation of Smith–Wilson discount factors as expected values of a certain Gaussian process (Section 2). With notation from EIOPA (2015), we have that the discount factor for tenor t, when fitted to N prices for zero coupon bonds
∗
Corresponding author. E-mail address:
[email protected] (A. Lagerås).
http://dx.doi.org/10.1016/j.insmatheco.2016.08.009 0167-6687/© 2016 Elsevier B.V. All rights reserved.
with tenors u1 , . . . , uN , is P (t ) := e−ωt +
N
ζj W (t , uj ),
t ≥ 0,
(1)
j =1
where ω := log(1 + UFR) and UFR is the so-called Ultimate Forward Rate,
W (t , uj ) := e−ω(t +uj ) α min(t , uj ) − e−α max(t ,uj ) sinh(α min(t , uj )) ,
(2)
and α is a parameter determining the rate of convergence to the UFR. Based on the above it is seen that the ζj ’s are obtained by solving the linear equation system defined by (1) and (2) given by the specific time points u1 , . . . , uN . Another name for uN given in the regulatory framework is the Last Liquid Point (LLP), i.e. the last tenor of the supporting zero coupon bonds that are provided by the market. The UFR is set to 4.2% for the Eurozone. In general, a higher value of α implies faster convergence to UFR. EIOPA (EIOPA, 2015, Paragraph 164) has decided that α should be set as small as possible, though with the lower bound 0.05, while ensuring that the forward intensity f (t ) := − dtd log P (t ) differs at most 0.0001 from ω (defined above) at a certain tenor called the Convergence Point (CP):
|f (CP ) − ω| ≤ 0.0001.
(3)
94
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
This optimisation of α can be troublesome to implement numerically since the left hand side of (3), seen as a function of α , can have singularities (Section 3.3). We also point out below that having the forward yield to converge to a fixed UFR gives rise to an inconsistency with how the interest rate stress scenarios are specified in Solvency II (Section 3.4). The method can also be applied to coupon bearing bonds, or swaps, but there is no loss in generality in considering only zero coupon bonds. The generalisation is particularly simple since a coupon bearing bond can be seen as a linear combination of zero coupon bonds and the Smith–Wilson method is linear in bond prices. We also note that the market data that is used as input for the Smith–Wilson method should undergo a credit adjustment. This is nothing that we will specify further, but refer the reader to EIOPA (2015) and merely state that this adjustment is of no relevance for the results below. If anything, the variable credit adjustment will make hedging even harder. Note that the current market practice is to adjust for the credit risk by using two interest curves, one for discounting and one for projecting forward rates, see Ametrano and Bianchetti (2013) and Mercurio (2009). Further, the methodology to derive the UFR is under review, see EIOPA (2016), and in future regulations one might expect UFR to change year by year. For later convenience we will here state relevant abbreviations and notation: LLP is the Last Liquid Point for where the zero coupon bond market support ends. UFR is the Ultimate Forward Rate, i.e. 4.2% for most currencies. ω is the continuously compounded ultimate forward rate, i.e. ω = log(1 + UFR). CP is the Convergence Point where the UFR should be reached. α is the mean reversion parameter that determines the rate of convergence to the UFR. u is a vector with tenors of the market zero coupon bonds. p is a vector of observed zero coupon prices at times to maturities u, that is p = (p1 , . . . , pLLP )′ . r is a vector of observed zero coupon spot rates at times to maturities u, that is r = (r1 , . . . , rLLP )′ , i.e. pi := p(r )i = e−ri ui . H (s, t ) is the following function: H (s, t ) := α min(s, t ) − e−α max(s,t ) sinh(α min(s, t )). W (s, t ) is defined as W (s, t ) := e−ω(s+t ) H (s, t ). Q is a diagonal matrix with Qii = e−ωui =: qi , i = 1, . . . , LLP. H is a matrix with elements Hij = (H (ui , uj ))ij . W is a matrix with elements Wij = (W (ui , uj ))ij . Note that W = QHQ . W (t , u) is defined as W (t , u) := (W (t , u1 ), . . . , W (t , uLLP ))′ . H (t , u) is defined analogously. b is the solution to the equation p = q + QHQb (note that this is the zero coupon case). sinh[α u′ ] denotes sinh( · ) applied component-wise to the vector α u′ . P (t ) is the discount function at t that suppress the dependence on the market support, i.e. P (t ) := P (t ; p(r )) ≡ P (t ; p) ≡ P (t ; r ). P c is the present value of the cash flow c w.r.t. Smith– Wilson discounting using P (t ), i.e. P c := t ct P (t ). Hence, P c := P c (t ; p(r )) ≡ P c (t ; p) ≡ P c (t ; r ). 2. Representing Smith–Wilson discount factors The problems with the Smith–Wilson method that will be highlighted in later sections are centred around problems
regarding hedging. In order to understand this in more detail we have found that the representation of the method from Andersson and Lindholm (2013) and Lagerås (2014) will prove useful. We will now give a full account of how the extrapolated discount factors of the Smith–Wilson method can be treated as an expected value of a certain stochastic process: Let {Xt : t ≥ 0} be an Ornstein–Uhlenbeck process with dXt = −α Xt dt + α 3/2 dBt , where α > 0 is a mean reversion parameter, t and X0 ∼ N (0, α 2 ) independent of B, and let X¯ t := 0 Xs ds and Yt := e−ωt (1 + X¯ t ). Given this we can state the following theorem: Theorem 1. P (t ) = E[Yt |Yui = pui ; i = 1, . . . , N ]. In other words: the Smith–Wilson bond price function can be interpreted as the conditional expected value of a certain nonstationary Gaussian process. Note that α will govern both mean reversion and volatility. Since {Yt : t ≥ 0} is a Gaussian process we have that P (t ), being a conditional expected value, is an affine function of p: P (t ) = E[Yt ] + Cov[Yt , Y ]Cov[Y , Y ]−1 (p − E[Y ])
=: β0 (t ) + β(t )′ p,
(4)
where β0 (t ) and β(t ) are functions of t, but not p, if α is considered a fixed parameter. If α is set by the convergence criterion, β0 and β are functions of p as well as t. The main aim of this paper is to analytically show problems inherent in the Smith–Wilson method which will affect hedging of liabilities. From this perspective it is evident that the reformulation of the bond price function according to Eq. (4) will prove useful, and in particular the behaviour of the β ’s will be of interest: Theorem 2. If t > uN , sign(βi (t )) = (−1)N −i for i = 1, . . . , N. This has peculiar consequences for hedging interest rate risk. The proofs of Theorems 1 and 2 are given in Section 5. 3. Problems with the Smith–Wilson method When the Smith–Wilson method was proposed by the regulator, it was known to have both advantages and disadvantages, see EIOPA (2010). One of the advantages expressed at that time is that it is not only an interpolation method, but also an extrapolation method. Hagan and West (2006) have produced a broad survey of interpolation methods where they list several desiderata. A good interpolation method should (a) (b) (c) (d)
match market data and be computationally fast to fit, generate smooth forward rates, generate non-negative forward rates, generate stable forward rates, i.e. small changes in market input should not generate large changes in forward rates, (e) generate hedges that are local, i.e. a liability should be hedged with market instruments with tenors as close as possible to that of the liability. Let us elaborate on these, especially in the context of valuing and hedging insurance liabilities which requires extrapolation as well as interpolation, and check how well the Smith–Wilson method fulfils the criterion. The first part of Point (a), concerning match to market data, is required since a liability with a cash flow identical to one of a traded financial instrument should have a value equal to that instrument, and it can be hedged by buying that instrument. The Smith–Wilson method satisfies this condition. A computationally fast fit is necessary if one wants to perform Monte Carlo simulations, and even though the Smith–Wilson method only has
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
one parameter, α , we show in Section 3.3 that it is not completely straightforward to optimise. The Smith–Wilson method fulfils point (b), having smooth forward rates. We note that a higher degree of smoothness of an interpolation method necessarily makes it more non-local, since any interpolated value depends on observed values in a larger neighbourhood. Point (c) is old dogma in Financial Mathematics based on the belief that negative interest rates would be arbitraged away due to the existence of cash, which is an alternative zero yielding investment. However, since the financial crisis of 2008, we have seen an ever larger part of the global market for government bonds trading at negative yields. As of summer 2016 government bonds with a face value of approx 12 trillion USD have negative yields, (Rebel, 2012). All Swiss government bonds trade with negative yields, the longest with 48 years to maturity, (Moore, 2016). This criterion is therefore in direct conflict with (a) and should be dropped. Even if negative rates are observed empirically, we still think that discount factors should be positive. As noted already in EIOPA (2010), the Smith–Wilson method does not guarantee positive discount factors, and in Section 3.2 we show an example with real market data where this pathological phenomenon occurs. The last two criteria are related to hedging. Point (d) means that small changes in the market data should not translate to large changes in the value of the liabilities. As we will see below, this condition is not fulfilled by the Smith–Wilson method. The last point (e) can be understood by an example. If there is a liquid market for swaps with tenors 1, 2,. . . , 10, 12, 15 and 20 years and a trader has on her book an illiquid 11 year swap, her middle office should value that swap in such a way that she can hedge it with positions in the 10 year and 12 year swap, being the two closest liquid instruments, and without needing substantial positions in the 8 and 20 year swaps, say. Both non-locality in a hedge (e) and the instability of interpolated rates (d) seem to be natural consequences of nonlocality in the interpolation method, see Hagan and West (2006). The Smith–Wilson method is non-local in this sense, and the issue is exacerbated in the extrapolated part of the curve as we will now discuss. 3.1. Hedging If you have a liability, i.e. a debt, of 1 unit of currency with tenor t, i.e. time to maturity t, its market value is that of a default free zero coupon bond with the same tenor, since if you buy the zero coupon bond you know that you will be able to pay your debt no matter what happens with interest rates. This is the essence of market valuation of liabilities and also of hedging. If you have liabilities with several tenors, you could theoretically match them by buying the zero coupon bonds with the same tenors. However, in practice, you might have liabilities with tenors for which there are no zero coupon bonds in the market. To value these liabilities one need an inter- or extrapolation method such as Smith–Wilson, to give the liability a model price. Our focus is on insurance liabilities, but the need also exists for purely financial liabilities, such as for over the counter (OTC) derivatives. In general, the liquidly traded financial instruments used as input for a yield curve, are not zero coupon bonds but rather swaps and interest rate futures, or coupon bearing bonds. As the market rates change, the model produces a new price for the non-traded liability. A hedge of that liability is a portfolio of instruments put together in such a way that its value closely tracks the value of the liability, no matter how the market rates change.
95
Following Hagan and West (2006) and financial market practice (Ametrano and Bianchetti, 2013; Bloomberg Finance, 2016), we assume the instruments that are used to construct the yield curve are also the ones available for hedging purposes. One typically shifts the yield of each market instrument one basis point, i.e. a hundredth of a percentage point, while keeping the others unchanged, and compare the changes in prices of the market instruments to that of the liability. This gives so called key rate durations (KRD). KRD’s, or rather dollar values of a basis point (which go by abbreviations such as BPV or DV01), are the preferred measure of interest sensitivities and hedge construction in Ovtchinnikov (2015) and Rebel (2012). The hedge portfolio has positions in the market instruments such that all its key rate durations match that of the liability. The inter- or extrapolated model price is typically nonlinear in the market prices so the hedge cannot match too large changes in the market prices before it needs to be rebalanced. Nevertheless, as long as the market prices do not change too much, it should perform well, and in particular it would track the liability’s price even if the market rates would shift in a non-parallel fashion. However, the Smith–Wilson method, is not typical in that it actually is linear in the market instruments. To be more precise, Eq. (4) shows that P (t ) is affine in the prices of the zero coupon bonds for tenors u1 , . . . , uN . Thus, Eq. (4) gives the recipe for a ‘‘perfect’’ hedge: own βi (t ) units of the zero coupon bond with time to maturity ui for i = 1, . . . , N, and have the amount β0 (t ) in cash. No matter how the values of the zero coupon bonds fluctuate, the combined portfolio will have the value P (t ) so the key rate duration hedge need not be rebalanced as the market rates change.1 Alas, by Theorem 2, the recipe for the perfect hedge is quite strange when you actually try to procure the ingredients: If t > uN you will need a positive amount of the zero coupon bond at uN , a negative amount of the zero coupon bond at uN −1 , then again a positive amount of the one at uN −2 , etc. This oscillating behaviour was observed empirically for some tested yield curves by Ovtchinnikov (2015) and Rebel (2012), and Theorem 2 shows that the pattern is present for all Smith–Wilson curves without exception. It is also worth noting that a hedge constructed according to the above procedure will due to the sign-changes have a sum of the absolute values of the exposures that is larger than the present value of the liabilities that one wants to hedge. If the curve is constructed from swaps or coupon bearing bonds, as is done in practice, the oscillating pattern typically does not hold for the hedge positions in the shortest tenors though it still holds for the hedge positions in the largest tenors. Another issue with the hedge is that as soon as time passes, with 1t equal to, say, a month, P (t ) is calculated with new zero coupon bonds at u1 , . . . , uN , whereas our portfolio has bonds with maturities u1 − 1t , . . . , uN − 1t. If the signs of holding amounts had all been positive, this might not have been such a big practical issue. In that case one could conceivably have changed the portfolio weights little by little and still have had an acceptable hedge. Theorem 2 implies that whatever holding one has with a particular maturity date must be sold and changed into the opposite exposure in the time span of one year in the case of when all time to maturities u1 , . . . , uN are one year apart. For many currencies uN − uN −1 = 5 years. Even in this case the turnover would be impractically large. Example 1. Consider an initial EUR curve with observed market zero coupon rates for tenors 1, 2,. . . , 10, 12, 15, and 20 all equal
1 This only holds assuming α is constant. If the changing market prices force a change of α the hedge is no longer perfect. For moderate changes of the yield curve, the hedge should still perform well.
96
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
to 4.2%. This means that the Smith–Wilson curve is flat at 4.2% for all tenors. A liability of 100 EUR with maturity 30 years, i.e. in the extrapolated part of the curve will have the present value . 100 · 1.042−30 = 29. The discount factor P (30) = β0 (30)+β(30)′ p . where β0 (30) = 0, βi (30) = 0.00 for i = 1, . . . , 6, and i
7
8
βi (30) 0.01 βi (30)pi 1
9
10
−0.05 0.19 −3 13
12
15 20 −1.64 1.96 −88 86
−0.38 0.76 −26 47
Note that the positions in the zero coupon bonds at the last three maturities: 46, −88 and 86, are considerably larger in absolute value than the present value 29 of the liability. This example is artificial in the sense that the EUR curve should be based on swap rates, adjusted for credit risk. If we assume the adjusted swap rates to all equal 4.2%, the Smith–Wilson curve is once again flat at 4.2% and the present value of the 100 EUR liability with maturity 30 years is 29 EUR. The table below shows the swap positions (nominal amounts) in the perfect hedge. Positive values means receiving fixed rate and paying floating, and negative values the opposite. We multiply the β ’s by 100 since all swaps trade at par. i
1
βi,swap (30)·
−1 −1 −1 −1 −1 −2 0
2
3
4
5
6
7
8
9
−7 18
10
12
−43 76
15
20
−180 174
100
As for the zero coupon rates we see that the highest tenors have large positions with oscillating signs, though the oscillating pattern does not hold for the shortest tenors. In general, the liabilities of an insurance company do not all come due the same date, and if we have undiscounted liabilities ct with time to maturities t = 1, 2 . . . , the present value of all c liabilities can still find the hedge by Eq. (4) is P = t ct P (t ). One c c ′ ′ since t ct P (t ) = t ct β0 (t ) + t ct β(t ) p =: β0 (t ) + β (t ) p. Example 2. Consider the same curve as in Example 1, but let the liability cash flow equal 10/1.10k for k = 1, 2, . . . . The undiscounted value of the liabilities is 100 EUR and the present value is 68. The weighted average time to maturity is 11 years and the Macaulay duration is about 8 years. The present value of liabilities discounted by extrapolated yields, i.e. longer than 20 years, is 4.47 which is less than 7% of the total present value. These liabilities will contribute to the oscillating behaviour of the hedge. A priori one might think that the amount is such a small part of the total that the hedge of the overall liabilities would consist of only positive exposures, but then one would be mistaken, since the penultimate position is negative. The last row of the table shows the hedge if one, more realistically, would use swaps. i
1
βic 9 βic pi 9 c βswap · 6
2 8 8 6
3 8 7 5
4 7 6 5
5 6 5 4
6 6 4 4
7 5 4 4
8 4 3 3
9 5 4 4
10 2 1 0
12 15 9 14
15
20
−8 29 −4 13 −12 26
100 We note that this pattern is also seen in Rebel (2012, Fig. 2). In view of these examples, one could expect insurance companies with long liabilities and therefore interest rate sensitive balance sheets, to try to obtain long interest exposure in the last liquid point, and short interest rate exposure in the penultimate liquid point, in essence exposure to the forward rate between the penultimate and last liquid point. This demand, in turn, would push down that forward rate, distorting the market curve and leading to lower extrapolated yields, and thus higher values for the liabilities. This could become quite destabilising. To conclude, in Section 3.1 we have seen that criteria (d), stability of forward rates, and (e), locality of hedge, from Section 3 are severely violated by the Smith–Wilson method. Next we continue by analysing negativity of forward rates, and in particular, negativity of discount factors.
3.2. Negative discount factors Discount factors extrapolated by the Smith–Wilson method may become negative when the market curve has a steep slope for high tenors, i.e. the last market forward rates are high. This has been noted by supervisory authorities, e.g. EIOPA (2010); Finanstillsynet (2010). Example 3. The Russian market swap rates at the close of 16 December 2014 were as shown in Fig. 1. The stipulated LLP for Russian ruble is 10 years and the CP is 60 years. With a credit risk adjustment, meaning we subtract 0.35 percentage points from each rate, we have convergence in the sense of Eq. (3) with α = 0.1646, and discount factors are negative for all tenors larger than 15 years, as seen in Fig. 2. Fig. 3 shows the corresponding forward and zero coupon rates. We see how the rates diverge towards infinity as the discount factor approaches 0. The effect of the credit risk adjustment is negligible in this case, and the results are qualitatively the same if we use a CRA equal to zero. This is nonsense and clearly very undesirable. A single set of market inputs and Smith–Wilson curve output may be checked manually, but when market rates are simulated or drawn from an Economic Scenario Generator, and the Smith–Wilson method is applied to them, checks must be automated. EIOPA does not specify how to amend the method when discount factors become negative. One solution is to increase α even further. In Example 3, it suffices to increase α to 0.2853 to avoid negative discount factors, see Fig. 4. This is nothing strange, since an increase in α corresponds to an increase in the speed of mean reversion, and hence an increase in the stiffness of the curve. In order to see that this is always possible, one can argue as follows: for large values of α and t ≥ s the function W (s, t ) ∼ α min(s, t ), which corresponds to the covariance function of Brownian motion, implying that the conditional expected value, i.e. the discount function at t, will be close to e−ωt . Since this holds for any t ≥ s, it will in particular hold for t = CP ≥ ui . Thus, increasing α will eventually make the discount factors positive. Note that there is no contradiction between negative discount factors and convergence of continuously compounded forward rates, i.e. the forward intensities. This because the latter are gradients of the corresponding bond prices, or alternatively put, gradients of discount factors. The problem with negative discount factors is rather that they cannot be represented as any real, as a converse to imaginary, spot rates. Thus, we have seen that the Smith–Wilson method may violate condition (c) of Section 3, non-negative forward rates, but also that negative discount factors may produced, which is even worse. Moreover, we have also seen that there may be situations when the convergence of the method may be towards a false root, which in some sense also violates condition (a) of Section 3. This is something that will be analysed further in the next section. 3.3. Tricky to find α : negative discount factors revisited Another problem with the calibration of α is that there may be singularities in the domain where α is optimised. Consider the same Russian data as in Example 3. Fig. 5 show g (α) := |f (CP )−ω|, the left hand side of Eq. (3). Note that the function g (α) diverges to infinity at α = 0.2847 and that the equation g (α) = 0.0001 has three roots, α = 0.1646, 0.2842 and 0.2853. Only the largest of these three correspond to positive discount factors. To understand this better, consider the convergence criterion defined by EIOPA, i.e. EIOPA (2015, Paragraphs 160–166), that can be expressed as g (α) = |h(α)| := |f (CP ) − ω| =
α , |1 − κ eαCP |
(5)
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
97
Fig. 1. Russian swap rates. Source: Bloomberg Finance L.P.
Fig. 2. Discount factors for Russian ruble as of 16 December 2014.
Fig. 3. Smith–Wilson forward and zero coupon curves fitted to the Russian swap data with a credit risk adjustment, as of 16 December 2014.
where
κ :=
Now note that 1 + α u′ Qb
sinh[α u′ ]Qb
.
In order for a singularity to arise it is hence necessary that 1 − κ eα CP ≡ 0, which is equivalent to 1 + (α u′ − e−α CP sinh[α u′ ])Qb = 0.
(6)
P (CP ) = e−ωCP 1 + (α u′ − e−α CP sinh[α u′ ])Qb , i.e. g (α) from (5) is only singular iff P (CP ) ≡ 0. Moreover, P (CP ) is continuous as a function of α . Consequently a singularity in the domain where α is optimised can only occur if the input market spot curve will result in a negative discount factors for some of the eligible α values, i.e. for some α ≥ 0.05 according to EIOPA’s specification.
98
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
Fig. 4. Smith–Wilson forward and zero coupon curves fitted to the Russian swap data with a credit risk adjustment, as of 16 December 2014. The solid and dotted curves have α = 0.2853 which gives all positive discount factors, and the dashed and dot-dashed curves have α = 0.1646 which gives negative discount factors for tenors longer than 15 years. There is essentially no difference between the curves in the interpolated part, i.e. for tenors less than 10 years.
Fig. 5. With the market data in Example 3, we have to increase α from 0.05 to 0.1646 to get g (α) below 0.0001. However, we see that g (α) diverges to infinity at α = 0.2847.
To conclude, from the previous Section 3.2 we know that there may be situations when we have convergence according to g (α) from (5), but where the resulting discount factors are negative. We have now learned that if we want to solve this situation by increasing α , which we know will work, we need to pass a singularity. Furthermore, we now know that even if the optimisation algorithm will not end up in this pathological situation, as soon as the input market spot rate may give rise to negative discount factors for some α ≥ 0.05, there will be a singularity that must be avoided by the optimisation algorithm that one uses.
amount: approximately more than ±0.2 · 4.2 = 0.84 percentage points. This goes against the whole idea of the ultimate forward rate being a constant, or, as is suggested for the future UFR methodology, only can change by 0.20 percentage points per year, see EIOPA (2016, Paragraph 20). Note that there is no easy way to fix this: One alternative could be to only stress the spot rates up until the LLP and thereafter recalibrate α with an unchanged UFR. Another alternative could be to instead stress all forward rates up until the UFR, and thereafter re-calculate the implied spot rates, hence without the need of recalibrating α .
3.4. What do the EIOPA stresses really mean? 4. Discussion and concluding remarks In the Solvency II regulation the interest rate stress is defined as follows: let rt denote the basic risk-free spot rate for a zero coupon bond with t time units to maturity, and let rts denote its stressed counterpart. According to the regulation the stressed spot rate is given by rts := rt (1 + st ), where st is a pre-specified constant that is positive or negative depending on whether one considers increasing or decreasing interest rates. The shift is at least ±0.20 (for tenors longer than 90 years). For long tenors, such as the CP, the shift is essentially parallel, i.e. it does not vary much with t. This means that the forward rate essentially shifts with the same factor as the rate, and in particular the forward rate at the CP shifts with a substantial
4.1. What are we hedging? The hedging described in Section 3.1 corresponds to a ‘‘perfect’’ hedge of a deterministic cash flow w.r.t. to arbitrary shifts of the observed market rates. In many situations it is common to make simpler hedges w.r.t. parallel shifts of the interest rate curve. This corresponds to using modified duration as measure of interest rate risk. This is however something that should be done with caution when it comes to the Smith–Wilson method, since we know from Section 3.4 that a parallel shift of a Smith–Wilson curve is inconsistent with the method itself. Moreover, from the definition of the Smith–Wilson method it is evident that the entire yield curve will depend on all market observations p, or equivalently r.
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
If one still wants to use a single number to describe the interest rate risk, such as modified duration, we argue that it is better to assess the modified duration w.r.t. to the actual market rates. That is, consider the modified duration of a Smith–Wilson discounted cash flow with respect to the underlying observed spot rates. It is straightforward to obtain expression for this quantity under Smith–Wilson discounting (calculations not included). Let c = (c1 , . . . , ck )′ denote the cash flow at maturities t = (t1 , . . . , tk )′ , where some of the ti ’s may coincide with the observed points ui . Then the modified duration with respect to the underlying observed spot rates, Dsw (c ; r ), is given by Dsw (c ; r ) := −
1 Pc
lim
P c (r + δ) − P c (r )
(r ) δ→0 k
δ
ci W (t , u)W
=
−1
u
ci e−ωti + W (t , u)W−1 (p − q)
.
(7)
Note that the calculations leading up to (7) does not include a recalibration of α . By using (7) one gets an understanding of the interest rate sensitivity of c with respect to infinitesimal parallel shifts of the underlying observed market rates, which can be used for hedging purposes. Regarding the inappropriateness of parallel shifts of a Smith–Wilson curve, we again refer the reader to Section 3.4. 4.2. On the parametrisation of the Smith–Wilson method, choice of kernel functions and related topics From Section 2 we know that the Smith–Wilson method can be interpreted as the expected value of a certain Gaussian process conditional on a number of perfect observations. An interesting observation is that Theorem 1, and especially Eq. (4), imply that one can replace the Wilson-kernel function by any proper covariance function and still keep the process interpretation of Theorem 1. One can note that this is in fact close to what is done in the original paper (Smith and Wilson, 2000) by Smith and Wilson where they start with the following general bond price model N
ζj Kj (t ),
Moreover, the second term is non-negative since Cov[Y , Y ]−1 is positive definite, and thus 0 < Var[Yt ] − Cov[Yt , Y ]Cov[Y , Y ]−1 Cov[Yt , Y ]′ < Var[Yt ] → 0,
i=1
P (t ) := e−ωt +
regularity conditions it imposes on models of the form given by (8). From a kriging perspective, one would instead discriminate between covariance functions by comparing the Mean Squared Error of Prediction (MSEP), where the MSEP is the conditional covariance that companions the conditional expectation given by Theorem 1. Hence, by giving the Smith–Wilson method a statistical interpretation it is possible to analyse and compare covariance functions statistically, not only w.r.t. MSEP. Given the kriging representation one might be tempted to use MSEP as an alternative for calibrating α in the Wilson-function by e.g. minimising α at the CP. This is not feasible, since the MSEP for the Wilson-function w.r.t. α lacks a minimum for α > 0. By definition the MSEP is non-negative, that is Var[Yt ] − Cov[Yt , Y ]Cov[Y , Y ]−1 Cov[Yt , Y ]′ > 0.
i =1 k
99
(8)
j =1
where Kj (t ), i = 1, . . . , N, denotes an arbitrary kernel function evaluated in (t , uj ). Given the above process interpretation the Kj (t ) functions should correspond to a proper covariance function evaluated in the points (t , uj ), which brings us into the realm of kriging. Within the area of geostatistics, there is a theory known as kriging, see e.g. Cressie (1991), which in many aspects resembles the interpretation of the Smith–Wilson method as an expected value of a stochastic process. The method of kriging was introduced in spatial statistics and can be described as follows: you start by observing a number of outcomes from an unknown stochastic process (for our needs one-dimensional outcomes ordered in time), that is you assume that you make perfect observations at known locations (time points). Given these observations you want to inter/extrapolate between these known points by making assumptions on the underlying process that you have observations from. In light of this the Smith–Wilson method can be seen as onedimensional kriging, that is Theorem 1 gives us the Best Linear Unbiased Predictor (BLUP), given that we treat the theoretical unconditional expected values as known a priori. If we return to the choice of kernel function proposed in Smith and Wilson (2000), the Wilson-function, this choice is motivated with that the Wilson-function is optimal with respect to certain
as α → 0, regardless of the value of t. 4.3. A hedgeable Smith–Wilson method and alternatives From Theorem 2 we know that the choice of Wilson-function as kernel will result in an oscillating hedge. If we would consider changing to another kernel or covariance function, we want to avoid re-inventing a new un-hedgeable procedure. From the proof of Theorem 2 it follows that in order for all the β ’s to be positive we need conditions on the determinant of the covariance matrix associated with the covariance function. In Karlin and Rinott (1983) it is ascertained that all β ’s will be non-negative if the inverse of the covariance matrix, i.e. the precision matrix, is a so-called Mmatrix, see result (b) on p. 420 in Karlin and Rinott (1983), where the definition of a M-matrix is as follows: Definition (M-matrix, Karlin and Rinott, 1983). A real n × n matrix A s.t. Aij ≤ 0 for all i ̸= j with Aii > 0 is an M-matrix iff one of the following holds (i) There exists a vector x ∈ Rn with all positive components such that Ax > 0. (ii) A is non-singular and all elements of A−1 are non-negative. (iii) All principal minors of A are positive. A simple example of a Gaussian process that has an M-matrix as a precision matrix is an Ornstein–Uhlenbeck process, i.e. if the kernel underlying the Smith–Wilson method had been that of an Ornstein–Uhlenbeck process rather than an integrated Ornstein–Uhlenbeck, then the hedge would not be oscillating. Since this process is Markov, one would only have local dependency, so that liabilities in the extrapolated part of the curve would be perfectly hedged with only the zero coupon bond at the LLP. More generally, if the covariance function of a Gaussian process on R+ is the Green function of a symmetric transient Markov process, then the squared Gaussian process is infinitely divisible (Eisenbaum and Kaspi, 2006, Thm. 3.4), and the precision matrix of the Gaussian process is an M-matrix (Marcus and Rosen, 2006, Cor. 13.2.5). Another concrete example is a fractional Brownian motion {BH t : t ≥ 0} with Hurst index H < 12 so that its increments over disjoint intervals are negatively correlated, see Eisenbaum (2003, Thm 2.1). To obtain a fractional Ornstein–Uhlenbeck process, one can use the SDE defining the ordinary Ornstein–Uhlenbeck process, or one could use a Lamperti transformation, and in general these two processes are different, see Cheridito et al. (2003). The one defined by the Lamperti transformation, i.e. X¯ t := e−α t BH will have an eα t /H M-matrix as precision matrix if the generating fractional Brownian
100
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
Fig. 6. Curves fitted to the same market data (2% for tenors 1, . . . , 10, 12, 15 and 20 years) and the same convergence criterion, see (3), using different kernels. Note the jagged curves for the fractional and ordinary Ornstein–Uhlenbeck kernels.
motion has a Hurst index H ≤ 12 , since the Lamperti transformation scales the fractional Brownian motion monotonically in both time and space. With such a kernel having H < 21 , liabilities in the extrapolated part of the curve would have a positive dependence on all market observations. Even if curves produced by Ornstein–Uhlenbeck kernels, possibly fractional, would have more reasonable hedges, they would have some downsides when it comes to smoothness. As we argue below, kinks in a zero coupon curve, meaning a discontinuous forward rate curve, need not be a problem. However, not only would these zero coupon curves have kinks, but they would also, in general, be jagged or have a sawtooth look in the interpolated part of the curve even if the market observations are monotone. This is more prevalent when the Hurst index H < 21 but can also be seen
in the non-fractional case H = 12 , when market observations are far from the UFR, see Fig. 6. A full investigation of curve methods generated by Gaussian process with M-matrix precision matrices is beyond the scope of this paper. The examples above indicate that it is not easy to find the right kernel. Other recent contributions in the Smith–Wilson vein is given by de Kort and Vellekoop (2016). Apart from providing a theoretical justification of the original Smith–Wilson method they analyse similar optimisation schemes and particular focus is on the situation where UFR is treated as a free parameter, given α . Further, they also extend their analysis to the situation when the objective function is defined in terms of yields and forward rates, which result in expressions for inter- and extrapolation similar to the original Smith–Wilson method. As pointed out in de Kort and Vellekoop (2016) the problem with negative discount factors is avoided by applying the method directly to yields and forward rates. Since these methods give smooth and non-local curves, we would expect similar issues with regards to hedging as for the Smith–Wilson method. Instead of using a generalised or amended Smith–Wilson method we would rather recommend a simple but robust method. We agree with de Kort and Vellekoop (2016) and Filipović (2009) that it is inadvisable to fit a curve directly to the discount factors. In fact, we think that one should consider using piecewise constant forward rates within the interpolated part of the curve and setting the forward rate equal to UFR beyond the LLP. The discontinuous forward rates violate the desideratum (b). Even if some may find this unaesthetic we think it is a price worth paying for the increase in interpretability, stability and hedgeability (Hagan and West, 2006), noting also that the curve is arbitrage free when restricted to the instruments that build the curve.
As with any curve method, when applied to instruments other than the building blocks, it might give a price that does not correspond to the market price of that instrument, e.g. a forward rate agreement. We admit that this would be a problem if the curve were intended to price financial derivatives, but the purpose here is to value insurance liabilities that are considerably less complex. Note also that any financial assets and derivatives the insurance company might hold should be valued not by this curve but by the market. 4.4. Concluding remarks In the present paper we have provided an alternative stochastic representation of the Smith–Wilson method (Theorem 1). This representation has nothing to do with the original derivation of the Smith–Wilson method, but it provides a stochastic representation of the method. We have shown that the stochastic representation may be useful for interpreting and analysing the Smith–Wilson method. In particular we have shown that the method always will result in oscillating hedges w.r.t. to the underlying supporting market spot rates (Theorem 2). This a highly undesirable feature of the method. Further, we have given an example where the resulting Smith–Wilson discount curve will take on negative values while fulfilling the convergence criterion, a situation which is total nonsense, but previously known to be able to occur, see e.g. EIOPA (2010); Finanstillsynet (2010). In the present paper we have extended this example to show that there may also occur singularities in the convergence criterion itself and analytically shown that this is a direct consequence of the occurrence of negative discount factors. That is, given that there are negative discount factors for some α ≥ 0.05 there will exist a singularity in the domain where α is being optimised. Moreover, we also provided necessary and sufficient conditions under which a change of kernel or covariance function will result in a bond price model that does not inherit the oscillating hedge behaviour, i.e. its inverse is an M-matrix. This is a nice feature, since given that you want an affine bond price model, you do not need to redo the entire functional analytical optimisation that initially lead up to the original Smith–Wilson method, but can merely change the kernel function and check the necessary conditions. However, we have not found a kernel that satisfies the other basic needs of an inter- and extrapolation method. Instead of trying to amend the Smith–Wilson method by changing the kernel we think regulators should mandate a simple and robust method such as piecewise constant forward rates.
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
We have also commented on the straightforward connection between the Smith–Wilson method and so-called kriging. This was another way to, probabilistically, be able justify different choices of alternative kernels. If one is interested in this topic, one can note that kriging in itself is a special case of so-called Bayesian nonparametrics and that kriging under certain conditions is closely connected to spline smoothing (Rebel, 2012). 5. Proofs Proof of Theorem 1. First note that the Yt has mean function E[Yt ] = e−ωt (1 + E[X¯ t ]) = e−ωt . We intend to show that Yt has the Wilson function W (s, t ) as its covariance function. Let 0 ≤ s ≤ t. The non-stationary Ornstein–Uhlenbeck process Xt = X0 e−α t + 3/2 t −α(t −s) α e dBs has 0 Cov[Xs , Xt ] =: K (s, t ) −α(s+t )
=e
Cov X0 + α
3/2
s
eα u dBu , X0
0
+α
t
3/2
αu
For the proof of Theorem 2 we will use the matrix property total positivity, see, e.g., Karlin (1968). For an n-dimensional matrix A = (aij )i,j∈{1,...,n} , let A[I, J ] := (aij )i∈I,j∈J be the submatrix formed by rows I and columns J from A, and let A[−i, −j] = (akl )k̸=i,l̸=j be the submatrix formed by deleting row i and column j. Recall that a minor of order k of a matrix A is the determinant det A[I, J ], where the number of elements in both I and J is k. Definition (Total Positivity). A n × n matrix is said to be totally positive (TP) if all its minors of order k = 1, . . . , n are non-negative. For functions of two arguments, say f (s, t ), which we call kernels, there are parallel definitions of total positivity, viz. f (s, t ) is said to be totally positive if the matrices (f (si , tj ))i,j∈{1,...,n} are totally positive for all n and s1 < · · · < sn and t1 < · · · < tn . We use the following notation for determinants of matrices constructed by kernels:
f
101
e dBu
s1 t1
··· ···
sn tn
:= det (f (si , tj ))i,j∈{1,...,n} .
0
s = e−α(s+t ) Var[X0 ] + α 3 e2α u du 0 2 α = e−α(s+t ) α 2 + (e2αs − 1)
2
=α e
2 −α t
cosh(α s).
The integrated process X¯ t thus has, Cov[X¯ s , X¯ t ] =: H (s, t ) =
=
+
+
0≤u≤v≤s
K (u, v) dudv
0≤u≤s 0≤v≤t
0≤v≤u≤s
K (u, v) dudv
0≤u≤s≤v≤t
= 2 + K (u, v) dudv 0≤u≤v≤s 0≤u≤s≤v≤t v s =2 α e−αv α cosh(α u) du dv 0 0 t s −αv + α e dv α cosh(α u) du s 0 s =2 α e−αv sinh(αv) dv + (e−αs − e−αt ) sinh(α s) 0
= α s − e−αt sinh(α s), and we arrive at the covariance function Cov[Ys , Yt ] = e−ω(s+t ) Cov[X¯ s , X¯ t ]
= e−ω(s+t ) (α s − e−αt sinh(α s)) = W (s, t ).
Write Y := (Yu1 , . . . , YuN )′ and note that since Yt is a Gaussian process
E[Yt |Y = p] = E[Yt ] + Cov[Yt , Y ]Cov[Y , Y ]−1 (p − E[Y ])
= e−ωt + Cov[Yt , Y ]ζ = e−ωt +
N
=e
+
N
Proof of Lemma 1. By Kallenberg (2002, Prop. 13.7), K (s, u) = K (s, t )K (t , u)/K (t , t ) for s ≤ t ≤ u. Since √ K is nonvanishing the correlation function c (s, t ) := K (s, t )/ K (s, s)K (t , t ) is welldefined for all s, t ≥ 0 and satisfies c (s, u) = c (s, t )c (t , u) for s ≤ t ≤ u. With s = 0 we see that √ c (t , u) = c (0, u)/c (0, t ). Define h(t ) := 1/c (0, t )2 and g (t ) := K (t , t )/c (0, t ). We can now write K (s, t ) = g (s) min(h(s), h(t ))g (t ) for all s, t ≥ 0. The function h is nondecreasing since c (0, t ) = c (0, s)c (s, t ) ≤ c (0, s) for s ≤ t, and thus we can apply (Ando, 1987, Example I.(f), p. 213) and conclude that K is totally positive. We will need the following observation about continuous nonnegative functions. Lemma 2. Let f : Rm → R be a non-negative continuous function and let A be an m-dimensional box, which might include some or none of its boundary ∂ A. If f is positive somewhere on ∂ A, then A f (x)dx is positive. Proof of Lemma 2. By assumption, there exists an ϵ > 0 and an x0 ∈ ∂ A such that f (x0 ) > 2ϵ > 0. Since f is continuous, there exists a δ > 0 such that f (x) > ϵ for all x ∈ Bδ (x0 ) := {x ∈ Rm : |x − x0 | < δ}. Since the intersection between the ball Bδ (x0 ) and A has positive m-dimensional volume and f is non-negative in general, we get
f (x)dx =
A
>ϵ
Cov[Yt , Yui ]ζi
i=1
−ωt
Lemma 1. Let K be the covariance function of a Gaussian Markov process on R+ . If K is nonvanishing, then it is totally positive. In particular, the covariance function of the Ornstein–Uhlenbeck process {Xt : t ≥ 0} is totally positive.
A∩Bδ (x0 )
f (x)dx +
A∩Bδ (x0 )
A\Bδ (x0 )
f (x)dx
1dx + 0 > 0.
We are now ready to prove Theorem 2. W (t , ui )ζi ,
i=1
as desired, where we have identified ζ Cov(Y , Y )−1 (p − E[Y ]).
:= (ζ1 , . . . , ζN )′ :=
Proof of Theorem 2. Let us fix t ≡ uN +1 > uN . We note that the sign of the regression coefficient βi equals that of the partial correlation coefficient of YuN +1 and Yui given all other Yuj , j ∈ {1, . . . , N } \ {i}. This partial correlation coefficient equals that of
102
A. Lagerås, M. Lindholm / Insurance: Mathematics and Economics 71 (2016) 93–102
X¯ uN +1 and X¯ ui given all other X¯ uj , since the two processes have the same correlation matrix. Let us call this partial correlation coefficient pi,N +1 , and let X¯ := (X¯ u1 , . . . , X¯ uN +1 )′ . The partial correlation coefficient has the opposite sign to the element on row i and column N + 1 in the inverse of the covariance matrix H := Cov[X¯ , X¯ ]: With B = (bij )i,j=1,...,N +1 := H−1 , sign(pi,N +1 ) = −sign(bi,N +1 ). By Cramer’s rule bi,N +1 = (−1)i+N +1
det H[−(N + 1), −i] det H
[−i, −(N + 1)]
N +1−i det H
= (−1)
det H
,
u ··· u is a non-negative function. Furthermore we have K ( u1 · · · uN ) > 1 N 0 since this is the determinant of the covariance matrix Cov[X , X ] ′ where X := (Xu1 , . . . , XuN ) . We finally note that the point
(u1 , . . . , uN , v1 , . . . , vN ) := (u1 , . . . , uN , u1 , . . . , uN ) ∈ ∂ A, and by Lemma 2 we conclude that the integral (9) is positive.
We note that Lemma 1 is stronger than what we need for the proof, and the theorem actually holds for any Smith–Wilson type method where the kernel H is the covariance function of a process that is an integrated Gaussian Markov process (subject to the technical conditions in the lemma), or even more generally, where the integrand’s kernel K is totally positive.
since H is a covariance matrix and thus symmetric. The determinant of H is positive and therefore
Acknowledgements
sign(pi,N +1 ) = −sign(bi,N +1 ) = (−1)N −i sign(det H[−i, −(N + 1)]).
The first author is thankful for the support by AFA Insurance. Opinions expressed in this paper are not necessarily those of AFA Insurance. The second author is grateful to Håkan Andersson for the discussions following the joint work (Andersson and Lindholm, 2013). The paper has benefited from the helpful comments of an anonymous referee.
We thus need to prove that det H[−i, −(N + 1)] > 0 for i = 1, . . . , N. With H (s, t ) being the covariance function of X¯ t , we have det H[−i, −(N + 1)] = H
u1 u1
··· ···
ui − 1 ui − 1
··· ···
ui+1 ui
uN + 1 uN
.
References
We can write the kernel H as a double integral of the kernel K : H (s, t ) =
0≤v≤s 0≤w≤t
= 0≤v 0≤w
K (v, w)dv dw
1{s ≥ v} K (v, w) 1{w ≤ t } dv dw. =:L(s,v)
=:R(w,t )
The kernel L produces matrices that have ones below a diagonal, and zeros above. The kernel R is similar with ones above a diagonal. Their determinants therefore equal either zero or one, and they equal one only if the ones are on the main diagonal of the matrix. L R
··· ···
sn tn
··· ···
sn tn
s1 t1 s1 t1
= 1{t1 ≤ s1 < t2 ≤ s2 < · · · < tn ≤ sn } = 1{s1 ≤ t1 < s2 ≤ t2 < · · · < sn ≤ tn }.
We use this in the continuous version of the Cauchy–Binet formula (Karlin, 1968, Eq. (3.1.2)) and obtain
H
· · · ui−1 · · · ui−1 = ··· u1 u1
ui + 1 ui
L
··· ··· u1
uN + 1 uN
··· ···
ui − 1
ui+1
··· ···
v1 vi−1 vi v1 · · · vN w1 · · · wN ×K R dvdw w1 · · · wN u1 · · · uN v1 · · · vN = ··· K dvdw , w · · · wN 1 A 0≤v1 <···
uN + 1
vN
(9)
where A := {0 ≤ v1 ≤ u1 < · · · < vi−1 ≤ ui−1 < vi ≤ ui+1 < · · ·
< vN ≤ uN +1 , 0 ≤ w1 ≤ u1 < · · · < wN ≤ uN }. Since K (s, t ) is a continuous function, and the determinant det(aij )i,j=1,...,n is a continuous function of all elements aij , i, j = s ··· s 1, . . . , n, the determinant K ( t1 · · · tn ) is continuous as a function 1 n s ··· s of s1 . . . , sn , t1 , . . . , tn . By Lemma 1 we also know that K ( t1 · · · tn ) 1
n
Ametrano, F.M., Bianchetti, M., 2013. Everything you always wanted to know about multiple interest rate curve bootstrapping but were afraid to ask. Available at SSRN: http://ssrn.com/abstract=2219548. Andersson, H., Lindholm, M., 2013. On the relation between the Smith-Wilson method and integrated Ornstein–Uhlenbeck processes. Research Report in Mathematical Statistics 2013:1. Stockholm University. Ando, T., 1987. Totally positive matrices. Linear Algebra Appl. 90, 165–219. Bloomberg Finance L.P. 2016. Documentation for the Swap Manager function (SWPM) in the Bloomberg Terminal. Cheridito, P., Kawaguchi, H., Maejima, M., 2003. Fractional Ornstein–Uhlenbeck processes. Electron. J. Probab. 8, 1–14. Cressie, N., 1991. Statistics for Spatial Data. John Wiley & Sons, ISBN: 0-471-843369. de Kort, J., Vellekoop, M.H., 2016. Term structure extrapolation and asymptotic forward rates. Insurance Math. Econom. 67, 107–119. EIOPA, 2010. QIS 5 risk-free interest rates – Extrapolation method, qis5 Background Document. EIOPA, 2015. Technical documentation of the methodology to derive EIOPA’s riskfree interest rate term structures, EIOPA-BoS-15/035. EIOPA, 2016. Consultation Paper on the methodology to derive the UFR and its implementation, EIOPA-CP-16/03. Eisenbaum, N., 2003. On the infinite divisibility of squared Gaussian processes. Probab. Theory Related Fields 125, 381–392. Eisenbaum, N., Kaspi, H., 2006. A characterization of the infinitely divisible squared Gaussian process. Ann. Probab. 34, 728–742. Filipović, D., 2009. Term-Structure Models. Springer, ISBN: 978-3-540-09726-6. Finanstillsynet, 2010. A Technical Note on the Smith-Wilson Method. The Financial Supervisory Authority of Norway. Hagan, P.S., West, G., 2006. Interpolation methods for curve construction. Appl. Math. Finance 13, 89–129. Kallenberg, O., 2002. Foundations of Modern Probability, second ed. Springer, ISBN: 0-387-95313-2. Karlin, S., 1968. Total Positivity. Stanford University Press. Karlin, S., Rinott, Y., 1983. M-Matrices as covariance matrices of multinormal distributions. Linear Algebra Appl. 52–53, 419–438. Lagerås, A., 2014. How to hedge extrapolated yield curves. Research Report in Mathematical Statistics 2014:1. Stockholm University. Marcus, M.B., Rosen, J., 2006. Markov Processes, Gaussian Processes, and Local Times. Cambridge University Press, ISBN: 978-0-521-86300-1. Mercurio, B., 2009. Interest rates and the credit crunch: new formulas and market models. Bloomberg Portfolio Research Paper 2010-01-FRONTIERS, Available at SSRN: http://ssrn.com/abstract=1332205. Moore, E., 2016. Entire stock of Swiss bond yields turns negative. Financial Times 2016-07-01. Ovtchinnikov, V., 2015. Parameter setting for yield curve extrapolation and the implications for hedging Stockholm University. Rebel, L., 2012. The ultimate forward rate methodology to valuate pensions liabilities: a review and an alternative methodology. Research Report. VU University, Amsterdam and Cardano Rotterdam. Samson, A., 2016. Negative-yield government debt surges $1.3tn to $11.7tn. Financial Times 2016-06-30. Smith, A., Wilson, T., 2000. Fitting Yield Curves with Long Term Constraints. Research report. Bacon & Woodrow.