Multiple risk measures for multivariate dynamic heavy–tailed models


Author’s Accepted Manuscript

Multiple Risk Measures for Multivariate Dynamic Heavy–Tailed Models

Mauro Bernardi, Antonello Maruotti, Lea Petrella

www.elsevier.com

PII: S0927-5398(17)30035-X
DOI: http://dx.doi.org/10.1016/j.jempfin.2017.04.005
Reference: EMPFIN975

To appear in: Journal of Empirical Finance

Received date: 20 May 2014
Revised date: 22 April 2017
Accepted date: 26 April 2017

Cite this article as: Mauro Bernardi, Antonello Maruotti and Lea Petrella, Multiple Risk Measures for Multivariate Dynamic Heavy–Tailed Models, Journal of Empirical Finance, http://dx.doi.org/10.1016/j.jempfin.2017.04.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


May 4, 2017

Abstract

The dynamic evolution of tail–risk interdependence among institutions is of primary importance when extreme events such as financial crises occur. In this paper we introduce two new risk measures that generalise the Conditional Value–at–Risk and the Conditional Expected Shortfall to a multiple setting. The proposed risk measures aim to capture extreme tail co–movements among several interconnected market participants experiencing contemporaneous distress instances. Analytical expressions for the risk measures are obtained under a parametric model that postulates a joint dynamic evolution of the underlying institutions’ losses and gains. We consider a multivariate Student–t version of Markov Switching models as a robust alternative to the usual multivariate Gaussian specification, accounting for heavy tails and time–varying non–linear correlations. An empirical application to US banks is considered to show that our model–based risk measurement framework provides a better characterisation of the dynamic evolution of the overall risk of a financial system and a more complete picture of how the risk spreads among institutions.

Keywords: Markov–Switching models, tail risk interdependence, risk measures, conditional Value–at–Risk, conditional expected shortfall, systemic risk.

1 Introduction

The recent global financial crisis, originated by the bursting of the US subprime mortgage bubble in August 2007, and the consequent downturn in economic activity leading to the 2008–2012 global recession, highlighted the strong negative impact of the large–scale collapse of financial institutions on other banks as well as on the real economy. After the failure of the Bear Stearns hedge funds on August 5th, 2007, the threat of total collapse of large financial institutions, and the consequent downturns in stock markets around the world, caused a worldwide increase in financial market volatility and a sudden tightening of liquidity conditions. The spillover effect of a downturn in the financial system has been advocated as the main reason for massive public interventions and bailouts of distressed banks. Bank failures have direct effects on the real economy because of their linkage to the manufacturing industry through the credit mechanism and the significant role they play as financial intermediaries in the monetary transmission channel. Nevertheless, before the 2007–2008 crisis, banking regulation and risk capital allocation were based on individual risk measures such as the Value–at–Risk (VaR). Unfortunately, such a risk measure fails to consider the institution as part of a system which might itself experience instability and spread new sources of risk, defined as “systemic” because they concern the distress or even the collapse of the entire financial system.

Recently, Adrian and Brunnermeier (2016) introduced the so–called Co–movement Value–at–Risk (CoVaR) as a measure of systemic risk. They define the CoVaR as the overall Value–at–Risk (VaR) of an institution conditional on another institution being under distress. Unlike the VaR, the CoVaR not only captures the overall risk embedded in each institution, but also reflects individual contributions to the systemic risk, accounting for extreme tail co–movements; see also Bernardi et al. (2015) and Girardi and Ergün (2013) for alternative approaches to the calculation of CoVaR. The literature on co–movement risk measures has proliferated during the last few years: Acharya et al. (2012), Billio et al. (2012), Bernardi et al. (2015), Bernardi and Catania (2015), Bernal et al. (2014), Castro and Ferrari (2014), Girardi and Ergün (2013), Jäger-Ambrożewicz (2013), Sordo et al. (2015), Hautsch et al. (2014), Engle et al. (2014), Lucas et al. (2014), Drehmann and Tarashev (2013), Huang et al. (2012b), Huang et al. (2012a) and Bernardi et al. (2017) form a necessarily incomplete list of the major recent contributions along the main research directions in this rapidly growing field. Bisias et al. (2012) provide an extensive and up–to–date survey of the systemic risk measures that have been proposed, while the literature on systemic risk is comprehensively reviewed by Benoit et al. (2016).
However, despite its obvious relevance as a first attempt to overcome the limitations of the traditional risk measures, the CoVaR, as originally introduced by Adrian and Brunnermeier (2016), characterises the systemic contribution of an institution as the pairwise tail dependence between that institution and another institution belonging to the same financial system, while ignoring any potentially relevant impact of all the remaining institutions, see Bernardi and Petrella (2015) and the discussion therein. Adrian and Brunnermeier (2016) mitigate the problem by considering the CoVaR between a global financial index representative of the whole financial market and each institution belonging to that market. The resulting single index model can be seen as a factor model where the only risky factor is the institution’s returns. However, recent financial crises are characterised by the contemporaneous distress of several institutions, emphasising that the original CoVaR approach may suffer from a relevant bias due to the neglected overall dependence among institutions. The spillover effect of a financial downturn may propagate from one or more institutions being in distress at the same time to the entire market, but the mechanisms of such propagation cannot be inferred by solely looking at either marginal pairwise risk measures or the effect on the overall index. Moreover, as argued by Bernardi et al. (2015), underestimation of the simultaneous occurrence of interdependent rare events and the consequent bias in evaluating the transmission of risks among institutions or sectors may cause misleading policy reactions by the authorities. As a consequence, new overall risk measures that account for contemporaneous multiple distress as conditioning events should be considered.
Here, we adopt this new multivariate perspective of systemic risk measurement and we propose to generalise the CoVaR approach of Adrian and Brunnermeier (2016) to the Multiple–CoVaR, accounting for multiple contemporaneous distress instances potentially affecting the market participants at the same time. The interdependence among institutions has been recently recognised to be a key requirement in assessing the vulnerability of financial institutions to systemic events, beyond the exposure to common shocks. Recent research focuses on the degree of interconnectedness within the financial industry. Using principal component analysis and the Granger causality test, Billio et al. (2012) analyse stock price data on hedge funds, banks, brokers and insurers for the period 1994–2008. They find that financial institutions became significantly linked during the period 2001–2008 and, more interestingly, they find an asymmetric interconnection relationship between banks and insurers, on the one hand, and hedge funds and brokers, on the other hand. Adams et al. (2014) instead develop a state–dependent sensitivity Value–at–Risk approach able to quantify the direction, size and duration of risk spillovers among systemically important financial institutions. Their analysis considers a system of quantile regressions involving the entire panel of institutions, conditional on the relevant state of the system. Their results confirm that commercial banks and hedge funds play a relevant role in the transmission of shocks to other financial institutions. The same results are confirmed by Acharya et al. (2010) using the systemic expected shortfall risk measure. They also find that insurance firms are overall the least systemically risky and that the top three systemically risky insurers were heavily involved in providing financial guarantees for structured products in the credit derivatives market. As another relevant contribution, Hautsch et al. (2014) propose a network analysis of the tail interdependence among institutions. In their empirical analysis for the pre–GFC period, they find that only two insurance companies, American International Group and Cincinnati Financial Corp, qualified as systemically relevant, while most of the remaining insurance companies were considered to be of a low level of systemic importance.
In this paper, we develop a multivariate model–based approach to measure the dynamic evolution of tail risk interdependence accounting for well–known characteristics of financial time series such as non–negligible tail dependence, volatility clustering, fat tails, asymmetry and the presence of different correlation regimes, see, e.g., Pelletier (2006). To achieve this goal, we consider the class of multivariate Markov Switching Models (MSMs), see, e.g., Ang and Bekaert (2004), Bulla et al. (2011). Multivariate MSMs may be recognised as a challenging and promising approach for analysing and modelling financial time series, mainly because of their ability to reproduce some of the most important stylised facts and to account for nonlinearity and persistence of the visited states (see, e.g., Guidolin and Timmermann 2005, Harris and Küçüközmen 2001 and Gettinby et al. 2004). Those features are crucial aspects in market return analysis and risk modelling. Markov Switching models have been extensively used in the literature, see, e.g., Rydén et al. (1998), Ang and Bekaert (2004), Bulla and Bulla (2006), Bulla et al. (2011), and Hamilton (1989, 1990). Moreover, Geweke and Amisano (2010, 2011) showed that MSMs outperform their competitors in predicting daily returns of financial time series, especially during episodes of high volatility. As regards the component densities, we consider both multivariate Gaussian and Student–t distributions. Although the multivariate Normal specification may be considered a starting point for our analysis, the choice between the two distributions is empirically tested on the data. Indeed, it is well known that the multivariate Gaussian distribution has some deficiencies, in particular when financial time series modelling and risk assessment are the major concerns.
Specifically, the Gaussian model relies on the assumption that each marginal follows a Normal distribution, which is often an unrealistic assumption for daily returns, which are characterised by the presence of extreme observations (see, e.g., Bulla 2011). More importantly, as documented by Embrechts et al. (2002), non–linearities or strong deviations from the elliptical and symmetric distribution for the underlying observed process may affect the association measures and, as a consequence, the covariance will no longer capture the complete dependence structure. Moreover, the tail behaviour of the multivariate Gaussian distribution leads to independent extremes, resulting in the potential underestimation of the probabilities of simultaneous occurrence of rare events (see, e.g., Čížek et al. 2011, Demarta and McNeil 2005 and McNeil et al. 2015). The latter eventuality may cause an incorrect assessment of economic risks, which has played a key role in the ongoing international financial crisis. To overcome all the above mentioned limitations of the multivariate Gaussian distribution we consider multivariate Student–t MSMs as a natural robust extension.

Several justifications can be put forward to motivate the use of multivariate MSMs as the model driving the systemic risk building procedure. Specifically, MSMs allow policy makers to differentiate institutions’ tail risk exposure depending on the state of the economy identified by the latent Markovian process. Being intrinsically dynamic, MSMs favour a conditional framework for the quantification of systemic risk calculated on the MSM predictive distribution. Since the predictive distribution of MSMs is a finite mixture of the Markovian component densities, we are able to compute analytically the proposed systemic risk measures conditional on the past history of the process. This way, we contribute to evaluating risk measures that are essentially dynamic, because they rely on time–varying loadings of individual risk factors represented by the individual Value–at–Risk. The resulting evolution over time of the risk measures provides important monitoring tools for market–based macro–prudential or financial stability regulation.
Once the modelling framework has been properly set, we develop a coherent model–based approach to measure the tail risk interdependence among institutions which relies on the predictive distribution of the postulated Markov model as well as on an extension of the CoVaR approach of Adrian and Brunnermeier (2016) accounting for multiple contemporaneous distress instances. Specifically, we introduce the Multiple–CoVaR (MCoVaR) and the Multiple–CoES (MCoES) risk measures, extending and improving the Adrian and Brunnermeier (2016) CoVaR risk measurement framework. The proposed risk measures aim to overcome the limitation of the CoVaR approach by considering the interconnections among market participants, which are all connected by economic and financial relationships. As documented by Billio et al. (2012), ignoring the interconnections among institutions may be particularly dangerous during periods of financial market instability, when several institutions may contemporaneously experience distress instances. The proposed methodology evaluates the contribution of each financial institution to the total risk, accounting for the fact that several institutions may jointly experience contemporaneous distress events. To this aim, we build a sequence of conditional risk measures which differ in the set of conditioning distress events they consider as relevant, qualifying different measures of risk contribution. The conditioning events refer to all the possible combinations of institutions being in distress. More specifically, the strategy developed in this paper to assess individual risk contributions relies on the ∆MCoVaR and ∆MCoES.
Those marginal contributions are measured as the difference between the MCoVaR (MCoES) of each institution j conditional on a given set of different institutions being under distress and the MCoVaR (MCoES) of institution j evaluated when the same set of conditioning institutions are at their normal state, identified as the median state, as in Adrian and Brunnermeier (2016). Whenever j is assumed to be the market index, the risk measures introduced throughout the paper identify the systemic risk. To compose the puzzle of attributing the systemic importance of each institution we apply the Shapley value methodology initially proposed by Shapley (1953) in the field of cooperative games. The idea behind the Shapley value has been previously considered by Tarashev et al. (2010) and Cao (2013) in the field of systemic risk attribution and by Bernardi and Petrella (2015) to measure financial sectors’ interconnections. The portion of the overall value that the Shapley methodology attributes to each of the players in a cooperative game equals the average of this player’s marginal contribution to the value created by all possible permutations of the set of players. In our setting this value coincides with the systemic risk generated by market participants. The additivity axiom satisfied by the Shapley value methodology ensures that the systemic risk allocation is efficient, in the sense that the shares of systemic risk attributed to individual institutions exactly sum to the total risk, i.e., the ∆MCoVaR (or ∆MCoES) of all the financial institutions in the system being in financial distress. This property allows us to overcome the deficiency of the standard ∆CoVaR definition of Adrian and Brunnermeier (2016), for which the sum of individual contributions does not equal the total risk measure, providing misleading information for policy purposes.

The proposed methodology is then applied to a reduced panel of major US financial institutions previously considered by Acharya et al. (2010), belonging to the Standard and Poor’s 500 (S&P500) index. Comparing the assumptions on the MSM component density, we observe that the Student–t distribution is strongly preferred to the Gaussian one, and this supports the use of fat–tailed distributions. Our multivariate Student–t MSM is able to distinguish and cluster time periods corresponding to different risk–return profiles and to model the persistence of visited states. Moreover, by employing the MCoVaR and MCoES we are able to capture extreme tail co–movements and to provide the dynamic evolution of the total systemic risk as well as the marginal contribution of each considered bank.
Concerning the total risk, we find that the overall systemic risk during the 2007–2009 financial crisis is larger than the total systemic risk during the European sovereign debt crisis at its peak. Our main empirical result suggests that the marginal contribution to the systemic risk of individual banks varies through time and that, particularly during periods of financial instability, it changes dramatically, both in order of importance and in levels.

The remainder of the paper is organised as follows. Section 2 introduces the multivariate Markov–Switching models and provides the estimation methodology. Section 3 introduces some useful results concerning the marginal and conditional distributions of multivariate mixtures and the proposed systemic risk measures. Section 4 details the Shapley value methodology that composes the puzzle of individual systemic risk attribution. Section 5 presents results based on an illustrative panel of major US financial institutions belonging to the S&P500 composite index. Section 6 concludes. An extensive Appendix provides the proofs of the theorems introduced throughout the paper, while a supplementary Appendix available online details all the estimation results.

2 Markov–switching models

In this section we first introduce the basic MSM framework and we extend it to account for the dynamic evolution of the conditional mean and marginal volatilities. Then, we detail the estimation methodology of the model parameters under both the Gaussian and the Student–t assumption. For an up–to–date review of MSMs see, e.g., Cappé et al. (2005), Zucchini and MacDonald (2009) and Dymarski (2011).


2.1 Model setup

Let $\{Y_t, t = 1, 2, \dots, T\}$ denote a sequence of multivariate observations, where $Y_t = \{Y_{1,t}, Y_{2,t}, \dots, Y_{p,t}\} \in \mathbb{R}^p$, while $\{S_t, t = 1, 2, \dots, T\}$ is a Markov chain defined on the state space $\{1, 2, \dots, L\}$. A MSM is a stochastic process consisting of two parts: the underlying unobserved process $\{S_t\}$, fulfilling the Markov property, i.e.

$$P(S_t = s_t \mid S_{1:t-1} = s_{1:t-1}) = P(S_t = s_t \mid S_{t-1} = s_{t-1}),$$

where $S_{1:t-1} = (S_1, S_2, \dots, S_{t-1})$ and $s_{1:t-1} = (s_1, s_2, \dots, s_{t-1})$, and the state–dependent observation process $\{Y_t\}$, for which the conditional independence property, i.e.

$$f(Y_t = y_t \mid Y_{1:t-1} = y_{1:t-1}, S_{1:t} = s_{1:t}) = f(Y_t = y_t \mid S_t = s_t, y_{1:t-1}),$$

holds, where $f(\cdot)$ denotes a generic probability density function. The literature on MSMs for continuous data is dominated by Gaussian MSMs (Hamilton 1989, Rydén et al. 1998, Bialkowski 2003, Bartolucci and Farcomeni 2010), with few exceptions (Bartolucci and Farcomeni 2009 and Lagona and Picone 2013). Under the Gaussian assumption, the state–specific distribution of $Y_t$ is given by

$$Y_t \mid S_t = s_t \sim N_p\big(\mu_{s_t}(y_{1:t-1}), \Sigma_{s_t}(y_{1:t-1})\big), \tag{2.1}$$

where $N_p\big(\mu_{s_t}(y_{1:t-1}), \Sigma_{s_t}(y_{1:t-1})\big)$ denotes the multivariate Gaussian distribution with mean $\mu_{s_t}(y_{1:t-1})$ and covariance matrix $\Sigma_{s_t}(y_{1:t-1})$. Hereafter, the notation $\mu_{s_t}(y_{1:t-1})$ and $\Sigma_{s_t}(y_{1:t-1})$ simply denotes that, in principle, the location and scale parameters $\mu_{s_t}$ and $\Sigma_{s_t}$ depend on the realisation of the hidden Markov chain at time $t$, $s_t$, but may also depend on past realisations of the observed process up to time $t-1$, $y_{1:t-1}$. Time series models for financial data should account for several well–known departures from normality such as heavy tails, and should be robust to outliers and able to capture extreme events.
Those reasons motivate our choice of the multivariate Student–t assumption for the MSM component densities:

$$Y_t \mid S_t = s_t \sim T_p\big(\mu_{s_t}(y_{1:t-1}), \Sigma_{s_t}(y_{1:t-1}), \nu_{s_t}\big), \tag{2.2}$$

where $T_p\big(\mu_{s_t}(y_{1:t-1}), \Sigma_{s_t}(y_{1:t-1}), \nu_{s_t}\big)$ denotes the multivariate Student–t distribution with location $\mu_{s_t}(y_{1:t-1})$, scale matrix $\Sigma_{s_t}(y_{1:t-1})$ and degrees of freedom equal to $\nu_{s_t}$. As $\nu_{s_t}$ tends to infinity, the distribution in equation (2.2) approaches the Gaussian distribution, with mean $\mu_{s_t}$ and variance–covariance matrix $\Sigma_{s_t}$. Hence the parameter $\nu_{s_t}$ may be viewed as a robustness tuning parameter, which needs to be estimated along with all other model parameters. For the purpose of developing the inferential procedures in the next Section, we recall that the multivariate Student–t distribution can be expressed as a scale mixture of multivariate Gaussian distributions

$$Y_t \mid (S_t = s_t, W_t = w_t) \sim N_p\left(\mu_{s_t}, \frac{\Sigma_{s_t}}{w_t}\right), \tag{2.3}$$

where $\{W_t, t = 1, 2, \dots, T\}$ are independent and identically distributed random variables having distribution

$$W_t \mid S_t = s_t \sim G\left(\frac{\nu_{s_t}}{2}, \frac{\nu_{s_t}}{2}\right), \tag{2.4}$$

with $G(\alpha, \beta)$ denoting the Gamma distribution with parameters $\alpha > 0$ and $\beta > 0$, see, e.g., Kotz and Nadarajah (2004). The model is completed by the specification of the Markov chain that drives the hidden states at each time point $t$. To this purpose let $q_{l,k} = P(S_t = k \mid S_{t-1} = l)$, $\forall l, k \in \{1, 2, \dots, L\}$, denote the probability that state $k$ is visited at time $t$ given that at time $t-1$ the chain was visiting state $l$. We indicate with $\delta_l = P(S_1 = l)$ the initial probability of being in state $l \in \{1, 2, \dots, L\}$ at time 1, and we refer to $Q = \{q_{l,k}\}_{l,k=1,2,\dots,L}$ as the transition probability matrix of the Markov chain.
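As an illustration of the data–generating process described above, the following minimal sketch simulates a Markov–switching Student–t model through the scale–mixture representation (2.3)–(2.4), with static state–specific locations and scale matrices. The function names (`cholesky`, `simulate_ms_student_t`) and all parameter values are hypothetical choices made for this example only; they are not taken from the paper. States are indexed from 0 in the code.

```python
import math
import random


def cholesky(m):
    """Lower-triangular Cholesky factor of a small symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def simulate_ms_student_t(T, Q, delta, mu, Sigma, nu, rng):
    """Draw (states, observations) from an L-state Markov-switching Student-t model.

    Each observation uses the scale mixture (2.3)-(2.4):
    W_t ~ Gamma(shape nu/2, rate nu/2), then Y_t | W_t ~ N(mu, Sigma / W_t).
    """
    L = len(delta)
    chol = [cholesky(S) for S in Sigma]
    states, obs = [], []
    s = rng.choices(range(L), weights=delta)[0]      # S_1 ~ delta
    for _ in range(T):
        # random.gammavariate takes (shape, scale): rate nu/2 means scale 2/nu.
        w = rng.gammavariate(nu[s] / 2.0, 2.0 / nu[s])
        z = [rng.gauss(0.0, 1.0) for _ in mu[s]]
        y = [mu[s][i] + sum(chol[s][i][k] * z[k] for k in range(i + 1)) / math.sqrt(w)
             for i in range(len(mu[s]))]
        states.append(s)
        obs.append(y)
        s = rng.choices(range(L), weights=Q[s])[0]   # next state ~ row s of Q
    return states, obs


# Hypothetical two-state, two-asset example (illustrative values only).
rng = random.Random(42)
Q = [[0.95, 0.05], [0.10, 0.90]]
delta = [0.5, 0.5]
mu = [[0.05, 0.03], [-0.10, -0.08]]
Sigma = [[[1.0, 0.3], [0.3, 1.0]], [[4.0, 2.4], [2.4, 4.0]]]
nu = [8.0, 4.0]
states, obs = simulate_ms_student_t(500, Q, delta, mu, Sigma, nu, rng)
```

Dividing the Gaussian draw by $\sqrt{w_t}$ is what produces the heavier tails: small values of $w_t$ inflate the scale of that particular observation.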

2.2 Dynamic extension

In this section, we extend the MSM to account for the possible time–varying nature of the parameters conditional on past observations. Specifically, we assume a first–order vector autoregressive (VAR) process for the state–specific conditional location parameters

$$\mu_{s_t}(y_{1:t-1}) = a_{s_t} + A_{s_t} y_{t-1}, \tag{2.5}$$

for $s_t = 1, 2, \dots, L$, where $a_{s_t} \in \mathbb{R}^p$ and $A_{s_t}$ is a $(p \times p)$ matrix whose eigenvalues are assumed to be less than one in modulus, to preserve stationarity of the autoregressive process. Concerning the specification of the state–specific scale matrix, $\Sigma_{s_t}(y_{1:t-1})$, we assume that it can be decomposed in the following way

$$\Sigma_{s_t}(y_{1:t-1}) = D_t C_{s_t} D_t, \tag{2.6}$$

for $s_t = 1, 2, \dots, L$, where $D_t = \mathrm{diag}\{\sigma_{1,t}, \sigma_{2,t}, \dots, \sigma_{p,t}\}$ is the diagonal matrix containing the marginal volatilities, which are assumed to depend on time but not on the latent state $s_t$, while the full matrix $C_{s_t}$ is allowed to depend on the realisation of the hidden Markov process $s_t$ and has diagonal elements equal to one and off–diagonal elements less than one in absolute value, to preserve the model identification. To account for the volatility clustering phenomenon, we allow the marginal volatilities to evolve over time by assuming the following Absolute–Value GARCH(1, 1) specification of Taylor (2007)

$$\sigma_{j,t} = \omega_j + \alpha_j |\varepsilon_{j,t-1}| + \beta_j \sigma_{j,t-1}, \tag{2.7}$$

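where $\omega_j > 0$, $\alpha_j \ge 0$ and $\beta_j \ge 0$, for $j = 1, 2, \dots, p$. Here, $\varepsilon_{j,t-1}$ denotes the $j$–th element of $\varepsilon_{t-1} = y_{t-1} - a_{s_{t-1}} - A_{s_{t-1}} y_{t-2}$, for $j = 1, 2, \dots, p$. Moreover, given the imposed stationary dynamic evolution of the VAR process, we assume $y_0 \sim N\big(\hat{y}_{1|0}, V_{1|0}\big)$, where $\hat{y}_{1|0} = (I_p - A)^{-1} a$ and $\mathrm{vec}\big(V_{1|0}\big) = \big(I_{p^2} - A_{s_t} \otimes A_{s_t}\big)^{-1} \mathrm{vec}\big(\bar{D} C_{s_t} \bar{D}\big)$ and $\bar{D} = \mathrm{diag}\{\bar{\sigma}_1, \bar{\sigma}_2, \dots, \bar{\sigma}_p\}$ with generic element defined by the unconditional expectation of the Absolute–Value GARCH(1, 1) dynamics defined in the next proposition.

Proposition 2.1. Let $Y_t$ follow the Gaussian dynamic MSM defined in equations (2.1)–(2.2) with location and scale dynamics specified in equations (2.5)–(2.6) and (2.7). Then the first four unconditional moments of $\varepsilon_{j,t}$, where $\varepsilon_{j,t}$ denotes the $j$–th element of $\varepsilon_t = y_t - a_{s_t} - A_{s_t} y_{t-1}$, are

$$E(\varepsilon_{j,t}) = 0 \tag{2.8}$$

$$E\big(\varepsilon_{j,t}^2\big) = \frac{\omega_j^2 \big(1 + \gamma_j^{(0)}\big)}{\big(1 - \gamma_j^{(0)}\big)\big(1 - \gamma_j^{(1)}\big)} \tag{2.9}$$

$$E\big(\varepsilon_{j,t}^3\big) = \frac{\omega_j^3 \Big(1 + 2\big(\gamma_j^{(0)} + \gamma_j^{(1)}\big) + \gamma_j^{(0)} \gamma_j^{(1)}\Big)}{\big(1 - \gamma_j^{(0)}\big)\big(1 - \gamma_j^{(1)}\big)\big(1 - \gamma_j^{(2)}\big)} \tag{2.10}$$

$$E\big(\varepsilon_{j,t}^4\big) = \frac{\omega_j^4 \Big(1 + 3\gamma_j^{(0)}\gamma_j^{(1)} + 5\gamma_j^{(0)}\gamma_j^{(2)} + 5\gamma_j^{(0)}\gamma_j^{(1)}\gamma_j^{(2)} + 3\gamma_j^{(1)}\gamma_j^{(2)} + 3\gamma_j^{(0)} + 4\gamma_j^{(1)} + 3\gamma_j^{(2)}\Big)}{\big(1 - \gamma_j^{(0)}\big)\big(1 - \gamma_j^{(1)}\big)\big(1 - \gamma_j^{(2)}\big)\big(1 - \gamma_j^{(3)}\big)}, \tag{2.11}$$

where $b = \sqrt{\dfrac{2}{\pi}}$ for the Gaussian and $b = \dfrac{2\nu\,\Gamma\big(\frac{\nu+1}{2}\big)}{(\nu - 1)\sqrt{\nu\pi}\,\Gamma\big(\frac{\nu}{2}\big)}$ for the Student–t distribution respectively, with

$$\gamma_j^{(0)} = b\alpha_j + \beta_j < 1 \tag{2.12}$$

$$\gamma_j^{(1)} = \alpha_j^2 + \beta_j^2 + 2b\alpha_j\beta_j < 1 \tag{2.13}$$

$$\gamma_j^{(2)} = 2b\alpha_j^3 + \beta_j^3 + 3\alpha_j^2\beta_j + 3b\alpha_j\beta_j^2 \neq 1 \tag{2.14}$$

$$\gamma_j^{(3)} = \alpha_j^4 + 8b\alpha_j^3\beta_j + 6\alpha_j^2\beta_j^2 + 4\alpha_j\beta_j^3 + \beta_j^4 \neq 1, \tag{2.15}$$

and $E(\sigma_{j,t}) = \dfrac{\omega_j}{1 - \gamma_j^{(0)}}$, for $j = 1, 2, \dots, p$.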
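The constants appearing in Proposition 2.1 are straightforward to evaluate numerically. The sketch below computes $b$ for the Gaussian and Student–t cases and the coefficients $\gamma_j^{(0)}, \dots, \gamma_j^{(3)}$ as printed in equations (2.12)–(2.15). The function names and the parameter values are hypothetical and serve only to illustrate the formulas.

```python
import math


def b_gaussian():
    """b = sqrt(2/pi), the mean absolute value of a standard Gaussian variate."""
    return math.sqrt(2.0 / math.pi)


def b_student_t(nu):
    """b for the Student-t case as given in Proposition 2.1 (requires nu > 1)."""
    if nu <= 1.0:
        raise ValueError("nu must exceed 1 for the first absolute moment to exist")
    # Work on the log scale to avoid overflow of the Gamma function for large nu.
    log_ratio = math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return 2.0 * nu * math.exp(log_ratio) / ((nu - 1.0) * math.sqrt(nu * math.pi))


def gamma_coefficients(alpha, beta, b):
    """Coefficients gamma^(0), ..., gamma^(3) as printed in (2.12)-(2.15)."""
    g0 = b * alpha + beta
    g1 = alpha**2 + beta**2 + 2.0 * b * alpha * beta
    g2 = 2.0 * b * alpha**3 + beta**3 + 3.0 * alpha**2 * beta + 3.0 * b * alpha * beta**2
    g3 = (alpha**4 + 8.0 * b * alpha**3 * beta + 6.0 * alpha**2 * beta**2
          + 4.0 * alpha * beta**3 + beta**4)
    return g0, g1, g2, g3


# Hypothetical AV-GARCH(1, 1) parameters, for illustration only.
omega, alpha, beta = 0.1, 0.2, 0.5
g0, g1, g2, g3 = gamma_coefficients(alpha, beta, b_gaussian())
mean_sigma = omega / (1.0 - g0)   # E(sigma_{j,t}) from Proposition 2.1
```

As the degrees of freedom grow, `b_student_t(nu)` approaches `b_gaussian()`, in line with the Gaussian limit of the Student–t distribution noted in Section 2.1.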

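Before proving Proposition 2.1 we introduce the following Lemma.

Lemma 2.1. Let $Y_t$ follow the Gaussian dynamic MSM defined in equations (2.1)–(2.2) with location and scale dynamics specified in equations (2.5)–(2.6) and (2.7), and let $\varepsilon_{j,t}$ denote the $j$–th element of $\varepsilon_t = y_t - a_{s_t} - A_{s_t} y_{t-1}$, for $j = 1, 2, \dots, p$. Then

$$E(|\varepsilon_{j,t}|) = b\,E(\sigma_{j,t}) \tag{2.16}$$

$$E\big(|\varepsilon_{j,t}|^3\big) = 2b\,E\big(\sigma_{j,t}^3\big) \tag{2.17}$$

$$E\big(|\varepsilon_{j,t}|\,\sigma_{j,t}^k\big) = b\,E\big(\sigma_{j,t}^{k+1}\big), \quad \forall k = 1, 2, 3, \tag{2.18}$$

$$E\big(\varepsilon_{j,t}^2\,\sigma_{j,t}^k\big) = E\big(\sigma_{j,t}^{k+2}\big), \quad \forall k = 1, 2, \tag{2.19}$$

$$E\big(|\varepsilon_{j,t}|^3\,\sigma_{j,t}\big) = 2b\,E\big(\sigma_{j,t}^4\big). \tag{2.20}$$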

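Proof. Equations (2.16) and (2.17) follow from standard integration results, while

$$E\big(|\varepsilon_{j,t}|\,\sigma_{j,t}^k\big) = E\Big[E\big(|\varepsilon_{j,t}|\,\sigma_{j,t}^k \mid \mathcal{F}_{t-1}\big)\Big] = E\Big[\sigma_{j,t}^k\,E\big(|\varepsilon_{j,t}| \mid \mathcal{F}_{t-1}\big)\Big] = b\,E\big(\sigma_{j,t}^{k+1}\big), \quad \forall k = 1, 2, 3,$$

$$E\big(\varepsilon_{j,t}^2\,\sigma_{j,t}^k\big) = E\Big[E\big(\varepsilon_{j,t}^2\,\sigma_{j,t}^k \mid \mathcal{F}_{t-1}\big)\Big] = E\Big[\sigma_{j,t}^k\,E\big(\varepsilon_{j,t}^2 \mid \mathcal{F}_{t-1}\big)\Big] = E\big(\sigma_{j,t}^{k+2}\big), \quad \forall k = 1, 2,$$

$$E\big(|\varepsilon_{j,t}|^3\,\sigma_{j,t}\big) = E\Big[E\big(|\varepsilon_{j,t}|^3\,\sigma_{j,t} \mid \mathcal{F}_{t-1}\big)\Big] = E\Big[\sigma_{j,t}\,E\big(|\varepsilon_{j,t}|^3 \mid \mathcal{F}_{t-1}\big)\Big] = 2b\,E\big(\sigma_{j,t}^4\big),$$

for $j = 1, 2, \dots, p$.

Proof. (of Proposition 2.1) To get the first order stationarity condition, observe that, by Lemma 2.1,

$$E(\sigma_{j,t}) = \omega_j + \alpha_j E(|\varepsilon_{j,t-1}|) + \beta_j E(\sigma_{j,t-1}) = \omega_j + \alpha_j b\,E(\sigma_{j,t-1}) + \beta_j E(\sigma_{j,t-1}),$$

and the result follows immediately. Concerning equations (2.9), (2.10) and (2.11), note that, using the law of iterated expectation, we get

$$E\big(|\varepsilon_{j,t}|^k\big) = E\Big(E\big(|\varepsilon_{j,t}|^k \mid \mathcal{F}_{t-1}\big)\Big) = E\big(\sigma_{j,t}^k\big),$$

for $k = 2, 3, 4$, and

$$\begin{aligned}
E\big(\sigma_{j,t}^2\big) &= E\big(\omega_j + \alpha_j |\varepsilon_{j,t-1}| + \beta_j \sigma_{j,t-1}\big)^2 \\
&= \omega_j^2 + \big(\alpha_j^2 + \beta_j^2\big) E\big(\sigma_{j,t-1}^2\big) + 2\omega_j(\beta_j + b\alpha_j) E(\sigma_{j,t-1}) + 2\alpha_j\beta_j E\big(|\varepsilon_{j,t-1}|\,\sigma_{j,t-1}\big).
\end{aligned} \tag{2.21}$$

Using Lemma 2.1 and plugging the expression of $E(\sigma_{j,t-1})$ into equation (2.21), we obtain

$$\begin{aligned}
E\big(\sigma_{j,t}^2\big) &= \omega_j^2 + \big(\alpha_j^2 + \beta_j^2 + 2b\alpha_j\beta_j\big) E\big(\sigma_{j,t-1}^2\big) + 2\omega_j(\beta_j + b\alpha_j) E(\sigma_{j,t-1}) \\
&= \omega_j^2 + \big(\alpha_j^2 + \beta_j^2 + 2b\alpha_j\beta_j\big) E\big(\sigma_{j,t-1}^2\big) + \frac{2\omega_j^2(\beta_j + b\alpha_j)}{1 - b\alpha_j - \beta_j},
\end{aligned}$$

which completes the proof of equation (2.9). Concerning the third unconditional moment, we have

$$\begin{aligned}
E\big(\sigma_{j,t}^3\big) &= E\big(\omega_j + \alpha_j |\varepsilon_{j,t-1}| + \beta_j \sigma_{j,t-1}\big)^3 \\
&= \omega_j^3 + \alpha_j^3 E\big(|\varepsilon_{j,t-1}|^3\big) + \beta_j^3 E\big(\sigma_{j,t-1}^3\big) + 3\omega_j^2\alpha_j E|\varepsilon_{j,t-1}| \\
&\quad + 3\omega_j^2\beta_j E(\sigma_{j,t-1}) + 3\omega_j\alpha_j^2 E\big(\varepsilon_{j,t-1}^2\big) + 3\omega_j\beta_j^2 E\big(\sigma_{j,t-1}^2\big) \\
&\quad + 3\alpha_j^2\beta_j E\big(\varepsilon_{j,t-1}^2\,\sigma_{j,t-1}\big) + 3\alpha_j\beta_j^2 E\big(|\varepsilon_{j,t-1}|\,\sigma_{j,t-1}^2\big) + 6\omega_j\alpha_j\beta_j E\big(|\varepsilon_{j,t-1}|\,\sigma_{j,t-1}\big),
\end{aligned}$$

and using Lemma 2.1, we get

$$\begin{aligned}
E\big(\sigma_{j,t}^3\big) &= \omega_j^3 + 3\omega_j^2(b\alpha_j + \beta_j) E(\sigma_{j,t-1}) + 3\omega_j\big(\alpha_j^2 + \beta_j^2 + 2b\alpha_j\beta_j\big) E\big(\sigma_{j,t-1}^2\big) \\
&\quad + \big(2b\alpha_j^3 + \beta_j^3 + 3\alpha_j^2\beta_j + 3b\alpha_j\beta_j^2\big) E\big(\sigma_{j,t-1}^3\big).
\end{aligned}$$

Substituting the expression for $E\big(\sigma_{j,t-1}^k\big)$, $k = 1, 2, 3$, and rearranging terms completes the proof of equation (2.10). Concerning the fourth unconditional moment, we have

$$\begin{aligned}
E\big(\sigma_{j,t}^4\big) &= E\big(\omega_j + \alpha_j |\varepsilon_{j,t-1}| + \beta_j \sigma_{j,t-1}\big)^4 \\
&= \omega_j^4 + 4\omega_j^3(\beta_j + b\alpha_j) E(\sigma_{j,t-1}) + 6\omega_j^2\big(\alpha_j^2 + \beta_j^2 + 2b\alpha_j\beta_j\big) E\big(\sigma_{j,t-1}^2\big) \\
&\quad + 4\omega_j\big(2b\alpha_j^3 + \beta_j^3 + 3\alpha_j^2\beta_j + 3b\alpha_j\beta_j^2\big) E\big(\sigma_{j,t-1}^3\big) \\
&\quad + \big(\alpha_j^4 + 8b\alpha_j^3\beta_j + 6\alpha_j^2\beta_j^2 + 4\alpha_j\beta_j^3 + \beta_j^4\big) E\big(\sigma_{j,t-1}^4\big) \\
&= \omega_j^4 + 4\omega_j^3\gamma_j^{(0)} E(\sigma_{j,t-1}) + 6\omega_j^2\gamma_j^{(1)} E\big(\sigma_{j,t-1}^2\big) + 4\omega_j\gamma_j^{(2)} E\big(\sigma_{j,t-1}^3\big) \\
&\quad + \big(\alpha_j^4 + 8b\alpha_j^3\beta_j + 6\alpha_j^2\beta_j^2 + 4\alpha_j\beta_j^3 + \beta_j^4\big) E\big(\sigma_{j,t-1}^4\big),
\end{aligned} \tag{2.22}$$

and substituting for the expressions for $E(\sigma_{j,t-1})$, $E\big(\sigma_{j,t-1}^2\big)$ and $E\big(\sigma_{j,t-1}^3\big)$ we get equation (2.11).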
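The first two moments can be checked by simulation. The following sketch, under the Gaussian assumption and with hypothetical parameter values, simulates a long univariate path of the Absolute–Value GARCH(1, 1) recursion (2.7) and compares the sample averages of $\sigma_{j,t}$ and $\varepsilon_{j,t}^2$ with $E(\sigma_{j,t}) = \omega_j / (1 - \gamma_j^{(0)})$ and with equation (2.9). It is a numerical sanity check, not part of the proof, and the function names are illustrative.

```python
import math
import random


def avgarch_moments(omega, alpha, beta, b):
    """Closed-form E(sigma) and E(eps^2) implied by Proposition 2.1."""
    g0 = b * alpha + beta                                # gamma^(0), eq. (2.12)
    g1 = alpha**2 + beta**2 + 2.0 * b * alpha * beta     # gamma^(1), eq. (2.13)
    m1 = omega / (1.0 - g0)
    m2 = omega**2 * (1.0 + g0) / ((1.0 - g0) * (1.0 - g1))   # eq. (2.9)
    return m1, m2


def simulate_avgarch(omega, alpha, beta, n, burn_in, rng):
    """Sample means of sigma_t and eps_t^2 along one simulated Gaussian path."""
    sigma = omega / (1.0 - beta)   # arbitrary positive starting value
    eps = 0.0
    sum_sigma = 0.0
    sum_eps2 = 0.0
    for t in range(n + burn_in):
        sigma = omega + alpha * abs(eps) + beta * sigma   # recursion (2.7)
        eps = sigma * rng.gauss(0.0, 1.0)                 # Gaussian innovation
        if t >= burn_in:
            sum_sigma += sigma
            sum_eps2 += eps * eps
    return sum_sigma / n, sum_eps2 / n


# Hypothetical parameter values chosen so that gamma^(0) and gamma^(1) are below one.
omega, alpha, beta = 0.1, 0.2, 0.5
b = math.sqrt(2.0 / math.pi)
theory_m1, theory_m2 = avgarch_moments(omega, alpha, beta, b)
sim_m1, sim_m2 = simulate_avgarch(omega, alpha, beta, 200_000, 1_000, random.Random(1))
```

With a path of this length the simulated and theoretical values should agree to within Monte Carlo error; more persistent parameter choices would require longer paths for the same accuracy.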

2.3 Estimation and inference

The MSM parameters are generally estimated using the maximum–likelihood method, see, for example, McLachlan and Peel (2000) and Cappé et al. (2005). Let $\Xi = \big(\{a_l, A_l, C_l, \nu_l\}_{l=1}^{L}, Q, \delta, \vartheta_j, j = 1, 2, \dots, p\big)$ be the set of all model parameters, let $\vartheta_j = (\omega_j, \alpha_j, \beta_j)$, for $j = 1, 2, \dots, p$, denote the set of marginal parameters, and let $f(y_t)$ be a diagonal matrix with the conditional densities $f(Y_t = y_t \mid S_t = s_t, y_{1:t-1})$ on the main diagonal. Then, the likelihood of a MSM can be written as

$$L(\Xi) = \delta f(y_1)\, Q f(y_2)\, Q \times \cdots \times f(y_{T-1})\, Q f(y_T)\, 1'. \tag{2.23}$$
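The matrix product in equation (2.23) can be evaluated recursively from left to right. The sketch below is one common way of doing so (a scaled forward recursion, which avoids numerical underflow for long series); it is offered as an illustration and is not claimed to be the implementation used in the paper. The function name `msm_log_likelihood` and the input layout, where `dens[t][l]` holds the state–$l$ conditional density evaluated at the $t$–th observation, are assumptions made for this example.

```python
import math


def msm_log_likelihood(delta, Q, dens):
    """Log of L = delta f(y_1) Q f(y_2) ... Q f(y_T) 1', computed with scaling.

    delta : initial state probabilities, length L
    Q     : L x L transition matrix, Q[l][k] = P(S_t = k | S_{t-1} = l)
    dens  : dens[t][l] = conditional density of observation t in state l
    """
    L = len(delta)
    # Row vector delta f(y_1).
    alpha = [delta[l] * dens[0][l] for l in range(L)]
    loglik = 0.0
    for t in range(len(dens)):
        if t > 0:
            # Post-multiply the current row vector by Q f(y_t).
            alpha = [sum(alpha[l] * Q[l][k] for l in range(L)) * dens[t][k]
                     for k in range(L)]
        scale = sum(alpha)            # rescale so the vector sums to one
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

The sum of the logged scaling constants equals the log of the full product, because each rescaling only divides the row vector by a positive scalar.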

Finding the value of the parameters $\Xi$ that maximise the log–transformation of equation (2.23) under the constraints $\sum_{l=1}^{L} \delta_l = 1$ and $\sum_{k=1}^{L} q_{l,k} = 1$ is not an easy problem. Instead, it is straightforward to maximise equation (2.23) using the Expectation–Maximization (EM) algorithm of Dempster et al. (1977). Hereafter, we focus on the EM algorithm, which has been previously applied to the case of finite mixtures of univariate Student–t distributions by Peel and McLachlan (2000). For the purpose of applying the EM algorithm, the vector of observations $y_t$, $t = 1, 2, \dots, T$, is regarded as being incomplete. Following the implementation described in Peel and McLachlan (2000) in a finite mixture context, two missing data structures are consequently introduced. The first one is related to the unobservable Markovian states, i.e., $z_t = (z_{t,1}, z_{t,2}, \dots, z_{t,L})$ and $zz_t = (zz_{t,1,1}, zz_{t,1,2}, \dots, zz_{t,l,k}, \dots, zz_{t,L,L})$ defined as

$$z_{t,l} = \begin{cases} 1 & \text{if } S_t = l \\ 0 & \text{otherwise} \end{cases}
\qquad
zz_{t,l,k} = \begin{cases} 1 & \text{if } S_{t-1} = l,\ S_t = k \\ 0 & \text{otherwise.} \end{cases}$$

The second type of missing data structure, $\varpi_t$, $\forall t = 1, 2, \dots, T$, relates to the scale mixture representation in equations (2.3)–(2.4); the $\varpi_t$ are assumed to be conditionally independent given the component labels $z_{t,l}$, $l = 1, 2, \dots, L$, $\forall t = 1, 2, \dots, T$. Augmenting the observations $\{Y_t, t = 1, 2, \dots, T\}$ with the latent variables $\{\varpi_t, z_{t,l}, zz_{t,l,k}, t = 1, 2, \dots, T;\ l = 1, 2, \dots, L\}$ gives the following complete–data log–likelihood:

$$\begin{aligned}
\log L_c(\Xi) \propto{} & \sum_{l=1}^{L} z_{1,l} \log(\delta_l) + \sum_{l=1}^{L} \sum_{k=1}^{L} \sum_{t=1}^{T} zz_{t,l,k} \log(q_{l,k}) \\
& - \frac{1}{2} \sum_{l=1}^{L} \sum_{t=1}^{T} z_{t,l} \big(p \log(2\pi) + \log|C_l|\big) - \sum_{l=1}^{L} \sum_{t=1}^{T} \sum_{j=1}^{p} z_{t,l} \log(\sigma_{j,t}) \\
& - \frac{1}{2} \sum_{l=1}^{L} \sum_{t=1}^{T} z_{t,l}\, \varpi_{t,l}\, \varepsilon_{t,l}' C_l^{-1} \varepsilon_{t,l} \\
& + \sum_{l=1}^{L} \sum_{t=1}^{T} z_{t,l} \left( \frac{\nu_l}{2} \log\Big(\frac{\nu_l}{2}\Big) - \log \Gamma\Big(\frac{\nu_l}{2}\Big) \right) \\
& + \sum_{l=1}^{L} \sum_{t=1}^{T} z_{t,l} \left( \frac{\nu_l}{2} \big(\log(\varpi_{t,l}) - \varpi_{t,l}\big) + \Big(\frac{p}{2} - 1\Big) \log(\varpi_{t,l}) \right),
\end{aligned} \tag{2.24}$$

where $\varepsilon_{t,l} = D_t^{-1}\left(y_t - \mu_{t,l}\right)$, with $\mu_{t,l} = a_l + A_l y_{t-1}$ and $D_t = \mathrm{diag}\{\sigma_{1,t}, \sigma_{2,t}, \dots, \sigma_{p,t}\}$, for $l = 1, 2, \dots, L$, and $\sigma_{j,t}$, for $j = 1, 2, \dots, p$, follow the AV–GARCH(1, 1) process defined in Section 2.2. The EM algorithm consists of two major steps, one for expectation (E–step) and one for maximization (M–step), see McLachlan and Krishnan (2007). At the $(m + 1)$–th iteration the EM algorithm proceeds as follows:

(i) E–step: compute the conditional expectation of the complete–data log–likelihood (B.1) given the observed data $\{y_t\}_{t=1}^{T}$ and the $m$–th iteration parameter updates $\Xi^{(m)}$,

$$\mathcal{Q}\left(\Xi, \Xi^{(m)}\right) = E_{\Xi^{(m)}}\left[\log L_c(\Xi) \mid \{y_t\}_{1}^{T}\right]; \tag{2.25}$$

(ii) M–step: choose $\Xi^{(m+1)}$ by maximising (2.25) with respect to $\Xi$,

$$\Xi^{(m+1)} = \arg\max_{\Xi}\ \mathcal{Q}\left(\Xi, \Xi^{(m)}\right). \tag{2.26}$$

One nice feature of the EM algorithm is that the solution of the M–step is available in closed form for Gaussian and Student–t MSMs for all the parameters, with the only exception of the degrees of freedom $\nu_l$, $l = 1, 2, \dots, L$. One possible solution for estimating $\nu_l$ is to adopt the approximation provided by Shoham (2002) for mixtures of Student–t distributions. The EM algorithm for the dynamic MSM model defined in Sections 2.1–2.2 is detailed in Appendix B, while Appendix C details the procedure used to obtain the score and Hessian matrix of the estimated parameters.
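To give a flavour of how the E–step and M–step alternate when the Student–t is written as a Gaussian scale mixture, the following Python sketch treats a deliberately simplified case: a single regime ($L = 1$), univariate data, no autoregressive or GARCH dynamics, and a fixed number of degrees of freedom. It is not the algorithm of Appendix B. It relies on the standard result that, under the Gamma mixing distribution, the conditional expectation of the latent scale is $(\nu + 1)/(\nu + d_t^2)$ with $d_t^2 = (y_t - \mu)^2/\sigma^2$; the function name `em_student_t` is hypothetical.

```python
def em_student_t(y, nu, n_iter=200):
    """EM updates for the location mu and squared scale sigma2 of a
    univariate Student-t with fixed degrees of freedom nu.

    Simplified illustration: one regime, no dynamics, nu not estimated.
    Assumes the observations in y are not all identical.
    """
    n = len(y)
    # start from the sample mean and variance
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    for _ in range(n_iter):
        # E-step: expected latent scale given the data,
        # E[w_t | y_t] = (nu + 1) / (nu + d_t^2)
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
        # M-step: weighted mean, then weighted squared scale
        mu = sum(wi * v for wi, v in zip(w, y)) / sum(w)
        sigma2 = sum(wi * (v - mu) ** 2 for wi, v in zip(w, y)) / n
    return mu, sigma2
```

Observations far from the current location receive small weights in the E–step, which is the mechanism behind the robustness of the Student–t specification relative to the Gaussian one.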

3 Risk measures

As discussed in the Introduction, one of the main contributions of this paper is a model–based approach to quantifying systemic risk. Assessing financial risks requires appropriately defined risk measures that account for potential spillover effects among institutions belonging to a given financial market. Systemic risk measures overcome the main deficiency of traditional risk measures, such as the Value–at–Risk and the expected shortfall, which evaluate individual institutions' risk contributions independently of each other. Financial market regulation based on such interdependence risk measures aims to stabilise the risk of the financial system, mitigating the probability of joint occurrence of extreme losses. After the 2007–2008 global financial crisis many contributions in the financial–econometric literature have been devoted to this topic: see, for example, the Marginal Expected Shortfall (MES) risk measure of Acharya et al. (2010) and the Systemic–RISK (SRISK) measure jointly proposed by Brownlees and Engle (2012) and Acharya et al. (2012), for a portfolio approach to measuring the overall system–wide risk. Acharya and Richardson (2009), Huang et al. (2012b) and Billio et al. (2012) instead measure the marginal contribution of individual institutions to the systemic risk using network approaches. In this paper we follow the former stream of literature and extend to a multivariate framework the CoVaR approach of Adrian and Brunnermeier (2016) for measuring the overall risk contributions of individual institutions. The original definition of CoVaR at confidence levels $(\tau_1, \tau_2)$, with $\tau_j \in (0, 1)$ for $j = 1, 2$, i.e., $\mathrm{CoVaR}_{i|j}^{\tau_1|\tau_2}$, considers two different institutions $i$ and $j$, such that

$$P\left(Y_i \le \mathrm{CoVaR}_{i|j}^{\tau_1|\tau_2} \mid Y_j = \mathrm{VaR}_j^{\tau_1}\right) = \tau_2, \tag{3.1}$$

where $Y_i$ and $Y_j$ denote the returns of institutions $i$ and $j$ and $\mathrm{VaR}_j^{\tau_1}$ denotes the “univariate marginal” Value–at–Risk of asset $j$.

Alternatively, if $Y_i$ instead denotes the returns of a global index that represents the whole financial system, then the CoVaR becomes the VaR of the whole financial system conditional on a single institution $j$ being in financial distress. Adrian and Brunnermeier (2016) consider this second definition of CoVaR to measure the contribution of institution $j$ to the overall systemic riskiness. Recent financial crises are characterised by the contemporaneous default of several institutions, highlighting the relevance of risk measures that account for the joint occurrence of multiple extreme losses. For this reason in Section 3.3 we introduce two new systemic risk measures: the Multiple–CoVaR (MCoVaR), which extends the CoVaR of Adrian and Brunnermeier (2016) to the case where more than a single institution experiences distress instances, and the Multiple–CoES (MCoES), which generalises the $\mathrm{CoES}_{i|j}^{\tau_1|\tau_2}$ of Adrian and Brunnermeier (2016), i.e.,

$$\mathrm{CoES}_{i|j}^{\tau_1|\tau_2} \equiv E\left(Y_i \mid Y_i \le \hat{y}_i^{\tau_2},\ Y_j = \mathrm{ES}_j^{\tau_1}\right), \tag{3.2}$$

where $\mathrm{ES}_j^{\tau_1}$ is the univariate marginal expected shortfall of asset $j$. By definition, MCoVaR and MCoES rely on the conditional and marginal return distributions. In what follows we embed the proposed risk measures within the MSM framework. In particular, we provide analytical expressions for the marginal and conditional “predictive” distributions of the Markov–Switching model, which allow us to obtain explicit formulae for the MCoVaR and the MCoES under the Gaussian and Student–t assumptions for the component density.

3.1 Preliminary results

In this subsection we first recall a known result for the $h$–steps ahead “predictive” distribution of the MSMs and then we prove two theorems concerning the marginal and conditional predictive distributions under the multivariate Gaussian and Student–t assumptions for the component density. The $h$–step ahead predictive distribution of the observed process $Y_{t+h}$ at time $t + h$, given information up to time $t$, $\mathcal{F}_t$, is a finite mixture of component specific distributions (see, e.g., Zucchini and MacDonald 2009 and Cappé et al. 2005),

$$p\left(Y_{t+h} \mid \mathcal{F}_t\right) = \sum_{l=1}^{L} \pi_l^{(h)} f_{Y_{t+h}}\left(y_{t+h} \mid S_{t+h} = l\right), \tag{3.3}$$

with mixing weights

$$\pi_l^{(h)} = \sum_{j=1}^{L} Q^{h}_{j,l}\, P\left(S_t = j \mid \mathcal{F}_t\right), \tag{3.4}$$

where $Q^{h}_{j,l}$ is the $(j, l)$–th entry of the Markovian transition matrix $Q$ to the power $h \in \mathbb{N} \setminus \{0\}$. Under multivariate Gaussian and Student–t assumptions on $f_{Y_{t+h}}\left(y_{t+h} \mid S_{t+h}\right)$ we have the following results. Without loss of generality, in the next results we remove the dependence on time of the involved variables.

Theorem 3.1 (Conditional and marginal distributions of multivariate Gaussian mixtures). Let $Y$ be a $p$–dimensional Gaussian mixture, i.e., $Y \sim \sum_{l=1}^{L} \eta_l N_p\left(y \mid \mu_l, \Sigma_l\right)$, and assume $Y$ partitioned into $Y = \left[Y_1', Y_2'\right]'$, where $Y_1$ and $Y_2$ are of dimension $\dim(Y_1) = p_1$ and $\dim(Y_2) = p_2 = p - p_1$, respectively. The mean vectors $\mu_l$ and the variance–covariance matrices $\Sigma_l$, for $l = 1, 2, \dots, L$, are partitioned accordingly into

$$\mu_l = \left[\mu_l^{1\prime}, \mu_l^{2\prime}\right]' \quad \text{and} \quad \Sigma_l = \begin{bmatrix} \Sigma_l^{(1,1)} & \Sigma_l^{(1,2)} \\ \Sigma_l^{(2,1)} & \Sigma_l^{(2,2)} \end{bmatrix},$$

respectively, where $\Sigma_l^{(2,2)}$ is a $p_2 \times p_2$ positive definite matrix. Then:

(i) the marginal distribution of $Y_2$ is

$$f_{Y_2}(y_2) = \sum_{l=1}^{L} \eta_l N_{p_2}\left(y_2 \mid \mu_l^{2}, \Sigma_l^{(2,2)}\right),$$

(ii) the conditional distribution of $Y_1$ given $Y_2 = y_2$ is

$$f_{Y_1}\left(y_1 \mid Y_2 = y_2\right) = \sum_{l=1}^{L} \tilde{\eta}_l(y_2)\, N_{p_1}\left(y_1 \mid \mu_l^{1|2}(y_2), \Sigma_l^{1|2}\right),$$

where the mixing weights have the following expression

$$\tilde{\eta}_l(y_2) = \frac{\eta_l N_{p_2}\left(y_2 \mid \mu_l^{2}, \Sigma_l^{(2,2)}\right)}{\sum_{k=1}^{L} \eta_k N_{p_2}\left(y_2 \mid \mu_k^{2}, \Sigma_k^{(2,2)}\right)}, \tag{3.5}$$

and the conditional moments of each component are

$$\mu_l^{1|2}(y_2) = \mu_l^{1} + \Sigma_l^{(1,2)} \Sigma_l^{(2,2)-1}\left(y_2 - \mu_l^{2}\right), \tag{3.6}$$

$$\Sigma_l^{1|2} = \Sigma_l^{(1,1)} - \Sigma_l^{(1,2)} \Sigma_l^{(2,2)-1} \Sigma_l^{(2,1)}, \tag{3.7}$$

$\forall l = 1, 2, \dots, L$.

Proof. (i) The first result follows from standard integration. (ii) Concerning the second result, applying result (i) we can factorise the joint density in the following way:

$$f_{Y_1 \mid Y_2 = y_2}(y_1) = \frac{f_{Y_1, Y_2}(y_1, y_2)}{f_{Y_2}(y_2)} = \frac{\sum_{l=1}^{L} \eta_l N_{p_2}\left(y_2 \mid \mu_l^{2}, \Sigma_l^{(2,2)}\right) N_{p_1}\left(y_1 \mid \mu_l^{1|2}(y_2), \Sigma_l^{1|2}\right)}{\sum_{k=1}^{L} \eta_k N_{p_2}\left(y_2 \mid \mu_k^{2}, \Sigma_k^{(2,2)}\right)},$$

which is a mixture of multivariate Gaussian distributions with mixing proportions defined in equation (3.5). □
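The content of Theorem 3.1 can be sketched for the simplest case $p_1 = p_2 = 1$, where all blocks are scalars and no matrix inversion is needed. The Python example below is a hedged illustration only: the function names and the argument layout (one list entry per mixture component) are hypothetical and are not taken from the paper.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_mixture(y2, eta, mu1, mu2, s11, s12, s22):
    """Conditional law of Y1 given Y2 = y2 for a bivariate Gaussian mixture.

    eta      : component weights
    mu1, mu2 : component means of Y1 and Y2
    s11, s12, s22 : component variances and covariance
    Returns (weights, means, variances) of the conditional mixture.
    """
    # numerators of (3.5): eta_l * N(y2 | mu2_l, s22_l)
    num = [e * normal_pdf(y2, m2, v) for e, m2, v in zip(eta, mu2, s22)]
    total = sum(num)  # marginal density of Y2 at y2, Theorem 3.1 (i)
    weights = [n / total for n in num]
    # conditional means (3.6) and variances (3.7), scalar case
    means = [m1 + c / v * (y2 - m2) for m1, m2, c, v in zip(mu1, mu2, s12, s22)]
    variances = [a - c * c / v for a, c, v in zip(s11, s12, s22)]
    return weights, means, variances
```

Note that conditioning changes the mixing weights as well as the component moments: a component whose marginal density at $y_2$ is negligible contributes almost nothing to the conditional distribution.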

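Theorem 3.2 (Conditional and marginal distributions of multivariate Student–t mixtures). Let $Y$ be a $p$–dimensional Student–t mixture, i.e., $Y \sim \sum_{l=1}^{L} \eta_l T_p\left(y \mid \mu_l, \Sigma_l, \nu_l\right)$, and assume $Y$ partitioned into $Y = \left[Y_1', Y_2'\right]'$, where $Y_1$ and $Y_2$ are of dimension $\dim(Y_1) = p_1$ and $\dim(Y_2) = p_2 = p - p_1$, respectively. The location parameter vectors $\mu_l$ and the scale matrices $\Sigma_l$, for $l = 1, 2, \dots, L$, are partitioned accordingly into

$$\mu_l = \left[\mu_l^{1\prime}, \mu_l^{2\prime}\right]' \quad \text{and} \quad \Sigma_l = \begin{bmatrix} \Sigma_l^{(1,1)} & \Sigma_l^{(1,2)} \\ \Sigma_l^{(2,1)} & \Sigma_l^{(2,2)} \end{bmatrix},$$

respectively, where $\Sigma_l^{(2,2)}$ is a $p_2 \times p_2$ positive definite matrix. Then:

(i) the marginal distribution of $Y_2$ is

$$f_{Y_2}(y_2) = \sum_{l=1}^{L} \eta_l T_{p_2}\left(y_2 \mid \mu_l^{2}, \Sigma_l^{(2,2)}, \nu_l\right),$$

(ii) the conditional distribution of $Y_1$ given $Y_2 = y_2$ is

$$f_{Y_1}\left(y_1 \mid Y_2 = y_2\right) = \sum_{l=1}^{L} \tilde{\eta}_l(y_2)\, T_{p_1}\left(y_1 \mid \mu_l^{1|2}(y_2), \Sigma_l^{*}, p_1 + \nu_l\right),$$

where

$$\Sigma_l^{*} = \left[1 + \frac{1}{\nu_l}\left(y_2 - \mu_l^{2}\right)' \Sigma_l^{(2,2)-1}\left(y_2 - \mu_l^{2}\right)\right] \Sigma_l^{1|2}\, \frac{\nu_l}{p_1 + \nu_l},$$

the mixing weights have the following expression

$$\tilde{\eta}_l(y_2) = \frac{\eta_l T_{p_2}\left(y_2 \mid \mu_l^{2}, \Sigma_l^{(2,2)}, \nu_l\right)}{\sum_{k=1}^{L} \eta_k T_{p_2}\left(y_2 \mid \mu_k^{2}, \Sigma_k^{(2,2)}, \nu_k\right)}, \tag{3.8}$$

and $\mu_l^{1|2}$ and $\Sigma_l^{1|2}$ are defined in equations (3.6) and (3.7), respectively.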

Proof. (i) The first result follows from standard integration, using the result for the marginal distribution of the multivariate Student–t distribution provided by Sutradhar (1984). (ii) It follows immediately by factorising the joint density as in Theorem 3.1 and by applying the result for the conditional distribution of the multivariate Student–t distribution provided by Sutradhar (1984). □

In the next Section we provide analytical expressions for the univariate (marginal) VaR and ES risk measures, which represent the main ingredients of the MCoVaR and the MCoES.

3.2 Marginal risk measures: VaR and ES

The Value–at–Risk for a risky asset at a given confidence level $\tau$ is the $(1 - \tau)$–quantile of the distribution of the asset return, and measures the minimum loss that can occur in the $(1 - \tau) \times 100\%$ of worst cases, see Jorion (2007). When the random variable $Y$ has an absolutely continuous density function $f_Y(y)$ with cumulative distribution function $F_Y(y)$, the VaR can be calculated as $\mathrm{VaR}_\tau(Y) \equiv \hat{y}^{\tau} = F_Y^{-1}(1 - \tau)$. Despite its popularity and wide usage as a risk management tool, the VaR lacks the sub–additivity property and therefore does not account for the benefits of diversification, see, for example, Artzner et al. (1999). In principle, well diversified portfolios of risky assets should be less risky than non–diversified ones. To overcome this problem, Acerbi (2002) introduces the so–called spectral risk measures and, among those, the Expected Shortfall (ES). The ES is a coherent risk measure (see Acerbi and Tasche 2002) and it is defined as the expected value of $Y$ truncated below the VaR. Therefore, the ES can be calculated as the Tail Conditional Expectation (TCE) of $Y$ conditioned at its VaR level, i.e., the expected value of $Y$ conditional on being below a given threshold $\hat{y}^{\tau}$, $\mathrm{ES}_\tau(Y) \equiv \mathrm{TCE}_Y(\hat{y}^{\tau})$. Moreover, the ES provides a more conservative measure of risk than the VaR for the same confidence level $\tau$, and it represents an effective tool for analysing the tail of the distribution, see Nadarajah et al. (2014) for an up–to–date overview of ES estimation methods. Bernardi (2013) and Bernardi et al. (2012) show that, under the Gaussian mixture distribution, $\mathrm{TCE}_Y(\hat{y}^{\tau})$ is a convex linear combination of component specific TCEs. In what follows we provide a similar result for the Student–t distribution.

Proposition 3.1 (TCE for univariate Student–t mixtures). Let $Y$ be a univariate Student–t mixture, i.e., $Y \sim \sum_{l=1}^{L} \eta_l T\left(y \mid \mu_l, \sigma_l^2, \nu_l\right)$; then $\mathrm{TCE}_Y(\hat{y}^{\tau})$ is given by

$$\mathrm{TCE}_Y(\hat{y}^{\tau}) = \sum_{l=1}^{L} \pi_l\, \mathrm{TCE}_{Y,l}(\hat{y}^{\tau}),$$

where

$$\mathrm{TCE}_{Y,l}(\hat{y}^{\tau}) = \frac{\Gamma\left(\frac{1}{2}(\nu_l - 1)\right) \nu_l^{\frac{1}{2}\nu_l}}{2\sqrt{\pi}\, \Gamma\left(\frac{1}{2}\nu_l\right)} \left[\nu_l + \left(\frac{\hat{y}^{\tau} - \mu_l}{\sigma_l}\right)^2\right]^{-\frac{1}{2}(\nu_l - 1)}, \tag{3.9}$$

with weights

$$\pi_l = \frac{\eta_l F_Y\left(\hat{y}^{\tau}, \mu_l, \sigma_l^2, \nu_l\right)}{\sum_{k=1}^{L} \eta_k F_Y\left(\hat{y}^{\tau}, \mu_k, \sigma_k^2, \nu_k\right)},$$

for $l = 1, 2, \dots, L$, and $\sum_{l=1}^{L} \pi_l = 1$, where $F_Y\left(\hat{y}^{\tau}, \mu, \sigma^2, \nu\right)$ is the cdf of a Student–t random variable with parameters $\left(\mu, \sigma^2, \nu\right)$ evaluated at $\hat{y}^{\tau}$.

Proof. See Appendix D. □

The following corollary provides the analytical expression for the ES of Student–t mixtures.

Corollary 3.1. Under the same assumptions of Proposition 3.1 we have

$$\mathrm{ES}_\tau(Y) \equiv \mathrm{TCE}_Y(\hat{y}^{\tau}) = \frac{1}{\tau} \sum_{l=1}^{L} \pi_l(\hat{y}^{\tau})\, \mathrm{TCE}_{Y,l}(\hat{y}^{\tau}), \tag{3.10}$$

with $\pi_l(\hat{y}^{\tau}) = \eta_l F\left(\hat{y}^{\tau}, \mu_l, \sigma_l^2, \nu_l\right)$, for $l = 1, 2, \dots, L$.
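The convex–combination structure of the tail conditional expectation can be illustrated numerically. The Python sketch below deliberately uses Gaussian components, i.e., the case attributed above to Bernardi (2013) and Bernardi et al. (2012), rather than the Student–t components of Proposition 3.1, because the Gaussian cdf is available in the standard library. The component TCE used is the textbook expression $E[Y \mid Y \le y] = \mu - \sigma \phi(z)/\Phi(z)$ with $z = (y - \mu)/\sigma$; the function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for pdf and cdf evaluations


def component_tce(y, m, s):
    """E[Y | Y <= y] for Y ~ N(m, s^2).

    Assumes the threshold is not so extreme that the cdf underflows to zero.
    """
    z = (y - m) / s
    return m - s * STD.pdf(z) / STD.cdf(z)


def mixture_tce(y, eta, mu, sigma):
    """Tail conditional expectation of a Gaussian mixture at threshold y.

    Weights are eta_l * F_l(y), normalised to sum to one, so the result
    is a convex combination of the component-specific TCEs.
    """
    mass = [e * STD.cdf((y - m) / s) for e, m, s in zip(eta, mu, sigma)]
    total = sum(mass)  # equals the mixture cdf F_Y(y)
    return sum(w / total * component_tce(y, m, s)
               for w, m, s in zip(mass, mu, sigma))
```

When the threshold `y` is chosen as the quantile solving $F_Y(y) = \tau$, the normalising constant `total` equals $\tau$, which is the factor $1/\tau$ appearing in equation (3.10).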

3.3 Conditional risk measures: MCoVaR and MCoES

As discussed at the beginning of this Section, CoVaR and CoES represent measures of extreme interdependence which go beyond traditional idiosyncratic risk measures to account for potential spillover effects among institutions. As introduced by Adrian and Brunnermeier (2016), CoVaR and CoES are pairwise risk measures, where the systemic risk contribution of each financial institution is evaluated independently of all the remaining market participants. When dealing with global systemic risk assessment it is instead of fundamental importance to capture interconnections among interacting market participants. This aspect becomes more relevant during periods of financial market crisis, when several institutions may contemporaneously experience distress instances. For this reason we generalise the CoVaR approach of Adrian and Brunnermeier (2016) by introducing the Multiple–CoVaR (MCoVaR) and the Multiple–CoES (MCoES) for multivariate Markov–Switching models. Let $S = \{1, 2, \dots, p\}$ be a set of $p$ institutions; we assume that the conditioning event is a set of $d$ institutions under distress indexed by $J_D = \{j_1, j_2, \dots, j_d\} \subset {}^{d}C_{p-1}$, where ${}^{d}C_{p-1}$ is the set of all possible combinations of $p - 1$ elements of class $d$, with $d \le p - 1$. Moreover, for an institution $i \in S$ with $i \notin J_D$, let $J_N = \overline{J}_D$, the complement of $J_D$ among the remaining $p - 1$ institutions, be the set of institutions being at the “normal” state. We define the “Multiple–CoVaR”, $\mathrm{MCoVaR}_{i|J_D}^{\tau_1|\tau_2}$, where $(\tau_1, \tau_2)$ with $\tau_j \in (0, 1)$ for $j = 1, 2$ are given confidence levels, as follows.

Definition 3.1. Let $Y = (Y_1, Y_2, \dots, Y_i, \dots, Y_p)$ be the vector of institution returns; then $\mathrm{MCoVaR}_{i|J_D}^{\tau_1|\tau_2} \equiv \mathrm{CoVaR}_{\tau_1}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right)$ is the Value–at–Risk of institution $i \in S$ at confidence level $\tau_1$, conditional on the set of institutions $J_D$ being at their individual $\mathrm{VaR}_{\tau_2}$–level $\hat{\mathbf{y}}_{J_D}^{\tau_2} = \left(\hat{y}_{j_1}^{\tau_2}, \hat{y}_{j_2}^{\tau_2}, \dots, \hat{y}_{j_d}^{\tau_2}\right)$ and the set of institutions $J_N$ being at their individual $\mathrm{VaR}_{0.5}$–level $\hat{\mathbf{y}}_{J_N}^{0.5} = \left(\hat{y}_{j_{d+1}}^{0.5}, \hat{y}_{j_{d+2}}^{0.5}, \dots, \hat{y}_{j_{p-1}}^{0.5}\right)$, i.e., $\mathrm{MCoVaR}_{i|J_D}^{\tau_1|\tau_2}$ satisfies the following equation

$$P\left(Y_i \le \mathrm{MCoVaR}_{i|J_D}^{\tau_1|\tau_2} \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right) = \tau_1, \tag{3.11}$$

for $i = 1, 2, \dots, p$.

We remark that $\mathrm{MCoVaR}_{i|J_D}^{\tau_1|\tau_2}$ is the $\tau_1$–quantile of the conditional predictive distribution of $Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}$ defined in equations (3.3)–(3.4). Therefore, the MCoVaR can be obtained by inverting the cdf of the conditional predictive density provided by Theorems 3.1 and 3.2 for the Gaussian and Student–t distributions. The expression for the conditional predictive cumulative distribution function is

$$F_{Y_i}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right) = \sum_{l=1}^{L} \tilde{\eta}_l\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}, \hat{\mathbf{y}}_{J_N}^{0.5}\right) F_{Y_i}^{l}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right), \tag{3.12}$$

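where the component weights $\tilde{\eta}_l\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}, \hat{\mathbf{y}}_{J_N}^{0.5}\right)$ are defined as in equation (3.5) or (3.8), depending on the assumption made on the component densities, and $F_{Y_i}^{l}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right)$ is the cdf of the $l$–th mixture component. The lack of the sub–additivity property of the VaR suggests introducing, in addition to the CoVaR, the Conditional Expected Shortfall (CoES), defined by Adrian and Brunnermeier (2016) for any two institutions $i$ and $j$ as the ES evaluated on the conditional distribution of $Y_i$ given $Y_j$. The following definition extends the CoES to the MCoES.

Definition 3.2. Let $Y = (Y_1, Y_2, \dots, Y_i, \dots, Y_p)$ be the vector of institution returns; then $\mathrm{MCoES}_{i|J_D}^{\tau_1|\tau_2}$ is the expected shortfall of institution $i \in S$ at confidence level $\tau_1$, conditional on the set of institutions $J_D$ being at their individual $\mathrm{ES}_{\tau_2}$–level $\hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right) = \left(\hat{\psi}_{y_{j_1}}\left(\hat{y}_{j_1}^{\tau_2}\right), \hat{\psi}_{y_{j_2}}\left(\hat{y}_{j_2}^{\tau_2}\right), \dots, \hat{\psi}_{y_{j_d}}\left(\hat{y}_{j_d}^{\tau_2}\right)\right)$ and the set of institutions $J_N$ being at their individual $\mathrm{ES}_{0.5}$–level $\hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right) = \left(\hat{\psi}_{y_{j_{d+1}}}\left(\hat{y}_{j_{d+1}}^{0.5}\right), \hat{\psi}_{y_{j_{d+2}}}\left(\hat{y}_{j_{d+2}}^{0.5}\right), \dots, \hat{\psi}_{y_{j_{p-1}}}\left(\hat{y}_{j_{p-1}}^{0.5}\right)\right)$, with $\hat{\psi}_{y_j}\left(\hat{y}_j^{\tau}\right) \equiv \mathrm{ES}_\tau(Y_j)$, $\forall j = 1, 2, \dots, d$, and can be defined in the following way

$$\mathrm{MCoES}_{i|J_D}^{\tau_1|\tau_2} \equiv \mathrm{CoES}_{\tau_1}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right), \mathbf{Y}_{J_N} = \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)\right),$$

for $i = 1, 2, \dots, p$. In the MS framework considered here the MCoES reduces to the following weighted average

$$\mathrm{MCoES}_{i|J_D}^{\tau_1|\tau_2} = \frac{1}{\tau_1} \sum_{l=1}^{L} \tilde{\eta}_l\left(\hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right), \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)\right) \times \hat{\psi}_{y_i}^{l}\left(\hat{y}_i^{\tau_1} \mid \hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right), \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)\right), \tag{3.13}$$

where

(i) $\hat{\psi}_{y_i}^{l}\left(\hat{y}_i^{\tau_1} \mid \hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right), \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)\right)$, for $l = 1, 2, \dots, L$, is the $l$–th component specific ES of the predictive distribution of $Y_i$ conditional on the set of institutions $J_D$ being at their individual $\mathrm{ES}_{\tau_2}$–level, $\mathbf{Y}_{J_D} = \hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right)$, and the set of institutions $J_N$ being at their individual $\mathrm{ES}_{0.5}$–level, $\mathbf{Y}_{J_N} = \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)$;

(ii) $\tilde{\eta}_l\left(\hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right), \hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)\right)$ is the weight associated with the $l$–th component ES, whose analytical expression can be found by plugging $\hat{\boldsymbol{\psi}}_{y_{J_D}}\left(\hat{\mathbf{y}}_{J_D}^{\tau_2}\right)$ and $\hat{\boldsymbol{\psi}}_{y_{J_N}}\left(\hat{\mathbf{y}}_{J_N}^{0.5}\right)$ into equations (3.5) and (3.8) for the multivariate Gaussian and multivariate Student–t MSM, respectively.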
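The inversion of the conditional predictive cdf (3.12) that yields the MCoVaR in (3.11) has no closed form for a mixture, but it is a one–dimensional root–finding problem. The Python sketch below is an illustration under simplifying assumptions: the components are Gaussian, the conditional weights, means and standard deviations are taken as given inputs (for instance from Theorem 3.1), and the root is found by plain bisection. The names `conditional_cdf` and `mcovar` are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal cdf provider


def conditional_cdf(y, weights, means, sds):
    """Mixture cdf as in (3.12): sum over components of weight * F_l(y)."""
    return sum(w * STD.cdf((y - m) / s) for w, m, s in zip(weights, means, sds))


def mcovar(tau1, weights, means, sds, n_iter=200):
    """Solve conditional_cdf(y) = tau1 by bisection, as required by (3.11).

    weights, means, sds describe the conditional Gaussian mixture of Y_i
    given the conditioning institutions; they are assumed precomputed.
    """
    # bracket wide enough to contain any non-degenerate quantile
    lo = min(m - 40.0 * s for m, s in zip(means, sds))
    hi = max(m + 40.0 * s for m, s in zip(means, sds))
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if conditional_cdf(mid, weights, means, sds) < tau1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `mcovar` twice, once with the conditional parameters implied by the distress scenario and once with those implied by all conditioning institutions being at their median, and taking the difference gives a quantity of the same form as the contribution measures discussed in the next Section.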

4 Individual contributions to overall risk

When dealing with systemic risk measures, it is important to quantify the marginal contribution of individual institutions to the overall risk. In their seminal paper, Adrian and Brunnermeier (2016) suggest that marginal contributions can be evaluated by the $\Delta\mathrm{CoVaR}$. Within their framework, where only two assets are considered, the $\Delta\mathrm{CoVaR}$ is defined as the difference between the CoVaR of institution $i$ conditional on institution $j$ being under distress and the CoVaR of the same institution $i$ when institution $j$ is at its median state. In this case the median state identifies the non–distress events of institution $j$. However, since during periods of financial instability several institutions may experience financial distress at the same time, building on the same idea behind the MCoVaR (MCoES), it is straightforward to generalise $\Delta\mathrm{CoVaR}$ ($\Delta\mathrm{CoES}$) to Multiple–$\Delta\mathrm{CoVaR}$ (Multiple–$\Delta\mathrm{CoES}$). Hence, we define the Multiple–$\Delta\mathrm{CoVaR}$, $\Delta^{M}\mathrm{CoVaR}_{i|J_D}^{\tau_1|\tau_2}$, as follows:

$$\Delta^{M}\mathrm{CoVaR}_{i|J_D}^{\tau_1|\tau_2} = \mathrm{CoVaR}_{\tau_1}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right) - \mathrm{CoVaR}_{\tau_1}\left(Y_i \mid \mathbf{Y}_{J_D \cup J_N} = \hat{\mathbf{y}}_{J_D \cup J_N}^{0.5}\right), \tag{4.1}$$

for $i = 1, 2, \dots, p$. The $\Delta^{M}\mathrm{CoVaR}_{i|J_D}^{\tau_1|\tau_2}$ provides more complete information on which combination of distressed institutions makes the largest contribution to the risk of the $i$–th institution and conveys a sharper signal to the regulator. It is worth noting that when $J_D = S \setminus \{i\}$ the proposed $\Delta^{M}\mathrm{CoVaR}$ risk measure quantifies the total risk contribution to the $i$–th institution, which is useful to assess the total amount of risk. Note also that when $d = 1$, the $\Delta^{M}\mathrm{CoVaR}$ does not coincide with the Adrian and Brunnermeier (2016) $\Delta\mathrm{CoVaR}$ definition. Following a similar idea, we can define the “Multiple–$\Delta\mathrm{CoES}$”, $\Delta^{M}\mathrm{CoES}_{i|J_D}^{\tau_1|\tau_2}$, as follows

$$\Delta^{M}\mathrm{CoES}_{i|J_D}^{\tau_1|\tau_2} = \mathrm{CoES}_{i|J_D}^{\tau_1|\tau_2}\left(Y_i \mid \mathbf{Y}_{J_D} = \hat{\mathbf{y}}_{J_D}^{\tau_2}, \mathbf{Y}_{J_N} = \hat{\mathbf{y}}_{J_N}^{0.5}\right) - \mathrm{CoES}_{i|J_D}^{\tau_1|\tau_2}\left(Y_i \mid Y_i \le \hat{\psi}_{y_i}\left(\hat{y}_i^{\tau_1}\right), \mathbf{Y}_{J_D \cup J_N} = \hat{\mathbf{y}}_{J_D \cup J_N}^{0.5}\right), \tag{4.2}$$

for $i = 1, 2, \dots, p$. In a multivariate environment, it is of fundamental importance to determine how the total risk is shared among individual market participants. This can be accomplished by using the Shapley value methodology, see, e.g., Shapley (1953), which decomposes the total risk into individual contributions as the result of a cooperative game–theoretic problem where institutions act as the players and the total gain relates to the total risk. The Shapley (1953) value methodology is used to efficiently evaluate the financial risk of each institution or portfolio shared by each of the remaining participants in the financial system. The Shapley value methodology is introduced in the next Section.

4.1 Shapley value methodology

The Shapley value, Shapley (1953), initially formulated in a cooperative game theory framework, is used here to efficiently allocate the overall risk in a multivariate environment where financial institutions are interconnected and belong to a system which might itself experience instability and spread new sources of risk. Within this framework the total loss of the coalition coincides with the overall systemic risk generated by the financial system. The Shapley value methodology has been previously applied in a different context by Koyluoglu and Stoker (2002) and, as a measure for attributing systemic risk, by Tarashev et al. (2010) and Cao (2013). The Shapley value methodology has been proposed to share a utility or a cost among the participants of a cooperative game where players can encourage cooperative behaviour and form coalitions. Specifically, for each institution $i = 1, 2, \dots, p$, a cooperative game is denoted by the couple $(\vartheta_i, p - 1)$, where $p$ is the total number of institutions in the market and $\vartheta_i : 2^{p-1} \to \mathbb{R}$ is the loss function of individual $i$, assigning the cost $\vartheta_i(H)$ to each coalition $H \subset S \setminus \{i\}$, where $S = \{1, 2, \dots, p\}$, provided that $\vartheta_i(\emptyset) = 0$, $\forall i = 1, 2, \dots, p$. In our risk measurement framework, the loss function $\vartheta_i$, $i = 1, 2, \dots, p$, coincides with the $\Delta^{M}\mathrm{CoVaR}$ ($\Delta^{M}\mathrm{CoES}$), and it assigns to each of the $2^{p-1}$ possible groups of institutions $H \subset S \setminus \{i\}$ its marginal contribution to the total risk in such a way that $\vartheta_i(S \setminus \{i\})$ is the total risk of institution $i \in S$. In the cooperative game theory framework, the function $\vartheta_i(\cdot)$ should be sub–additive, i.e., for all disjoint coalitions $H_j, H_k \subset S \setminus \{i\}$ with $j \ne k$, such that $H_j \cap H_k = \emptyset$, $\vartheta_i(H_j \cup H_k) \le \vartheta_i(H_j) + \vartheta_i(H_k)$.
This means that the contribution of a union of disjoint coalitions is not greater than the sum of the coalitions' separate values; in addition, the contribution of an “empty” coalition is zero. By definition, the sub–additivity property holds when the loss function is inherited from a coherent risk measure. The Shapley value is one of the possible ways to distribute the total risk of institution $i \in S$, i.e., $\vartheta_i(S \setminus \{i\})$, among all the remaining institutions belonging to the financial system, assuming that they all collaborate. For alternative solutions to a cooperative game see Banzhaf (1965). In particular, the Shapley value of institution $j \in S \setminus \{i\}$, denoted by $\mathrm{ShV}_i(j)$, determines the amount of risk contribution of the $j$–th institution on institution $i$ and satisfies the individual rationality condition, i.e., $\mathrm{ShV}_i(j) \ge \vartheta_i(\{j\})$, $\forall i, j \in S$, with $j \ne i$, where $\vartheta_i(\{j\})$ is the marginal risk contribution of institution $j$ if it does not cooperate, and the collective rationality condition, i.e., $\sum_{j=1}^{p} \mathrm{ShV}_i(j) = \vartheta_i(S)$, $\forall i \in S$. The Shapley values are obtained as

$$\mathrm{ShV}_i(j) = \sum_{H \subset S \setminus \{i, j\}} \frac{|H|!\left(|S \setminus \{i\}| - |H| - 1\right)!}{|S \setminus \{i\}|!} \left[\vartheta_i(H \cup \{j\}) - \vartheta_i(H)\right], \tag{4.3}$$

for $i, j = 1, 2, \dots, p$ with $j \ne i$, where, for any given institution $i = 1, 2, \dots, p$ and coalition $H \subset S \setminus \{i, j\}$, $\vartheta_i(H)$ and $\vartheta_i(H \cup \{j\})$ denote the losses associated with the coalitions $H$ and $H \cup \{j\}$, respectively, and the sum extends over all the subsets $H$ of $S \setminus \{i\}$ not containing institution $j$. The losses associated with those two coalitions in equation (4.3), i.e., $\vartheta_i(H)$ and $\vartheta_i(H \cup \{j\})$, are evaluated by means of the $\Delta^{M}\mathrm{CoVaR}$ or the $\Delta^{M}\mathrm{CoES}$ in the following way:

$$\vartheta_i^{V}(H) = \Delta^{M}\mathrm{CoVaR}_{i|H}^{\tau_1|\tau_2}, \qquad \vartheta_i^{V}(H \cup \{j\}) = \Delta^{M}\mathrm{CoVaR}_{i|H \cup \{j\}}^{\tau_1|\tau_2}, \tag{4.4}$$

$$\vartheta_i^{E}(H) = \Delta^{M}\mathrm{CoES}_{i|H}^{\tau_1|\tau_2}, \qquad \vartheta_i^{E}(H \cup \{j\}) = \Delta^{M}\mathrm{CoES}_{i|H \cup \{j\}}^{\tau_1|\tau_2}, \tag{4.5}$$

where $\Delta^{M}\mathrm{CoVaR}_{i|J_D}^{\tau_1|\tau_2}$ and $\Delta^{M}\mathrm{CoES}_{i|J_D}^{\tau_1|\tau_2}$ have been defined in equations (4.1) and (4.2) for a generic set of distressed institutions $J_D$. For each market participant $i = 1, 2, \dots, p$, the portion of the overall value that the Shapley methodology attributes to each of the players in a cooperative game equals the average of this player's marginal contribution to the value created by all possible permutations of the set of remaining players, as evaluated by the $\Delta^{M}\mathrm{CoVaR}$ and $\Delta^{M}\mathrm{CoES}$ risk measures. The additivity axiom satisfied by the Shapley value methodology ensures that the risk allocation is efficient, in the sense that the risk shares attributed to each individual sector $i = 1, 2, \dots, p$ exactly sum to the total risk, i.e., the $\Delta^{M}\mathrm{CoVaR}$ (or $\Delta^{M}\mathrm{CoES}$) generated by the remaining financial institutions in the system being in distress. Moreover, the Shapley value is the unique set of measures of systemic importance that satisfies the following axioms.

(i) Efficiency. The efficiency axiom states that the total risk of institution $i$ is distributed among all the remaining market participants without any loss or gain. Formally:

$$\sum_{j=1, j \ne i}^{p} \mathrm{ShV}_i(j) = \vartheta_i(S \setminus \{i\}), \tag{4.6}$$

for $i \in S$. The importance of this additivity property can be clearly understood when considering macro–prudential regulations. Additive risk measures imply that supervisors will not penalise the economy without any reason. The efficiency axiom implies that, when the Shapley value is used as risk distributor, the total gain of the coalition coincides with the overall risk generated by the financial system.

(ii) Symmetry. The symmetry property states that if the marginal risk contributions of any two institutions for any possible subset $H$ not containing those institutions are the same, then their Shapley values should be the same. Formally, the symmetry property states that, if $\vartheta_i(H \cup \{j\}) = \vartheta_i(H \cup \{k\})$, for any $k \ne i, j$, with $i \ne j$ and for any subset $H \subset S \setminus \{i, j, k\}$, then $\mathrm{ShV}_i(j) = \mathrm{ShV}_i(k)$. The symmetry axiom simply implies that the Shapley value is permutation invariant and it induces a fairness property of the resulting risk distributor.

(iii) Dummy axiom. The dummy axiom states that if the risk of institution $j$ is independent of any other institution's risk conditionally on institution $i$, with $i \ne j$, then the risk share of $j$ should be exactly equal to its risk alone. More formally, if $\vartheta_i(H \cup \{j\}) = \vartheta_i(\{j\})$, $\forall H \subset S \setminus \{i, j\}$, then $\mathrm{ShV}_i(j) = \vartheta_i(\{j\})$. This is the case where the standard $\Delta\mathrm{CoVaR}$ of Adrian and Brunnermeier (2016) coincides with the Shapley value $\Delta^{M}\mathrm{CoVaR}$ methodology. As argued by Cao (2013), since in general the risk measure of institution $j$ is not conditionally orthogonal to any other institution's risk, the standard CoVaR approach and the Shapley value marginal contribution do not deliver the same risk ordering.

(iv) Linearity (or additivity). If $j$ and $k$, with $j, k \in H$ and $j \ne k$, are two different institutions with associated loss functions $\vartheta_i(j)$ and $\vartheta_i(k)$, with $\vartheta_i(j) \ne \vartheta_i(k)$, such that their linear combination delivers a new function $\tilde{\vartheta}_i = w_j \vartheta_i(j) + w_k \vartheta_i(k)$, with $w_l > 0$, $\forall l \in \{j, k\}$, then the distributed risks should equal the weighted average of individual risk contributions as evaluated by their respective Shapley values, i.e., $\mathrm{ShV}_i\left(\tilde{\vartheta}_i\right) = w_j \mathrm{ShV}_i(j) + w_k \mathrm{ShV}_i(k)$.

(v) Zero player. The empty coalition receives zero risk contribution, i.e., $\vartheta_i(\emptyset) = 0$, for any $i = 1, 2, \dots, p$.
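The combinatorial weights in equation (4.3) can be made concrete with a short Python sketch. It enumerates all coalitions directly, which is feasible only for a small number of institutions since the number of subsets grows as $2^{p-1}$. The loss function is passed in as an arbitrary callable; in the paper it would be the $\Delta^{M}\mathrm{CoVaR}$ or $\Delta^{M}\mathrm{CoES}$ of equations (4.4)–(4.5), whereas here it is left abstract. The function name `shapley_values` is hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, loss):
    """Shapley values following the weighting scheme of (4.3).

    players : list of hashable labels (the institutions other than i)
    loss    : callable mapping a frozenset of players to a number,
              with loss(frozenset()) == 0
    Returns a dict mapping each player j to its Shapley value.
    """
    n = len(players)
    values = {}
    for j in players:
        others = [k for k in players if k != j]
        total = 0.0
        for size in range(n):
            # weight |H|! (n - |H| - 1)! / n! for coalitions H of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for members in combinations(others, size):
                H = frozenset(members)
                # marginal contribution of j when joining coalition H
                total += weight * (loss(H | {j}) - loss(H))
        values[j] = total
    return values
```

By the efficiency axiom (4.6), the returned values sum to the loss of the full coalition, which provides a simple numerical check of any implementation.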

5 Empirical analysis

In this Section we apply the econometric framework and the methodology previously introduced to examine the evolution of systemic risk in the US financial system during the period 1974–2015, covering the recent global financial crisis. The goal of this empirical exercise is to analyse how the systemic risk spreads among the different institutions by inspecting the evolution of the risk measures introduced in previous sections.

5.1 Data

We consider a panel of sixteen US financial institutions belonging to the Standard and Poor's Composite Index (S&P500) and previously considered by Acharya et al. (2010) and Girardi and Ergün (2013). All the considered institutions had an equity market capitalisation in excess of 5bln USD as of the end of June 2007. The sampling period is from January 4, 1974 to July 24, 2015, consisting of 2192 weekly observations for each institution. To avoid missing data, we consider all those institutions belonging to the dataset of Acharya et al. (2010) that were listed at the beginning of the sampling period. All the time series have been downloaded from the Datastream Database. Table 1 provides the list of the institutions included in the sample, their ticker, the date of the first available observation and the date of the last observation. Three of the institutions included in the list, namely Safeco Corp. (SAF), Unionbancal Corp. (UB) and Marshall Isley (MI), experienced distress instances during the period we consider. Specifically, on July 5, 2011 Marshall & Ilsley Corporation was acquired by BMO Financial Group through its subsidiary BMO Financial Corp, while on September 24, 2008 Liberty Mutual Group completed the acquisition of Safeco and on November 4, 2008 the Bank of Tokyo–Mitsubishi UFJ, Ltd. acquired 100% of Unionbancal Corp. Moreover, we also include the American International Group (AIG), which was a central player in the global financial crisis of 2008 and was bailed out by the federal government for 180bln USD. Their inclusion is motivated by the desire to understand how the proposed multiple risk measures behave in practice when the default/distress of one or more institutions approaches. Unfortunately, the Shapley MCoVaR/MCoES obtained as solutions of the cooperative game over two different sets of institutions potentially in distress are not comparable.

Therefore we adopt the strategy of splitting the whole sample into different sub–samples as follows. The estimation/validation period goes from the beginning of the sample period (January 4, 1974) to the end of 2007 (December 28, 2007), comprising 1793 weekly observations. Parameter estimation and goodness–of–fit tests are performed using this first sample of observations. The remaining part of the available sample, from January 4, 2008 to the end of the period (400 weekly observations), is left to assess the out–of–sample performance of the proposed risk measurement framework. Specifically, the out–of–sample period is divided into four subsamples: the first subsample goes from January 4, 2008 to September 26, 2008, immediately prior to the collapse of Safeco Corp.; the second subsample goes from September 26, 2008 to November 14, 2008, immediately before the collapse of Unionbancal Corp.; the third one goes to July 1, 2011, before the Marshall Isley collapse; and the fourth one goes till the end of the period.

| Name | Ticker | Date of first observation | Date of last observation | Symbol |
|---|---|---|---|---|
| Depositories | | | | |
| Bank of America | BAC | 02/01/1973 | 24/07/2015 | ∗, Red |
| BB & T | BBT | 02/01/1973 | 24/07/2015 | ⋄, Red |
| Bank of New York Mellon | BK | 02/01/1973 | 24/07/2015 | +, Red |
| Comerica Inc | CMA | 02/01/1973 | 24/07/2015 | ◦, Red |
| JP Morgan Chase | JPM | 02/01/1973 | 24/07/2015 | ∗, Blue |
| Marshall Isley | MI | 02/01/1973 | Acquisition 06/07/11 | ⋄, Blue |
| M & T Bank Corp | MTB | 02/01/1973 | 24/07/2015 | +, Blue |
| Northern Trust Corp | NTRS | 02/01/1973 | 24/07/2015 | ◦, Blue |
| Unionbancal Corp | UB | 02/01/1973 | Acquisition 17/11/08 | ∗, Dark |
| Wells Fargo | WFC | 02/01/1973 | 24/07/2015 | ⋄, Dark |
| Insurance | | | | |
| American International Group | AIG | 23/08/1973 | 24/07/2015 | +, Dark |
| Chubb Corp | CB | 23/08/1973 | 24/07/2015 | ◦, Dark |
| Cincinnati Financial Corp | CINF | 23/08/1973 | 24/07/2015 | ∗, Green |
| CNA Financial Corp | CNA | 23/08/1973 | 24/07/2015 | ⋄, Green |
| Humana Inc | HUM | 23/08/1973 | 24/07/2015 | +, Green |
| Safeco Corp | SAF | 23/08/1973 | Acquisition 03/10/08 | ◦, Green |

Table 1: Name and classifications of US financial institutions. We consider 16 of the 102 US financial institutions that Acharya et al. (2010) consider in their analysis. Most of the institutions have been excluded because of the limited length of their return series. Marshall Isley, Unionbancal Corp. and Safeco Corp. have been acquired by BMO Financial Group, Bank of Tokyo–Mitsubishi UFJ, Ltd. and Liberty Mutual Group, respectively, before the end of the sample period. For those institutions, the acquisition date is reported in the next to last column. The last column reports the list of symbols used to identify the institutions.

Table 2 reports the descriptive statistics of the weekly returns of the overall index (S&P500) and of the financial institutions by group (depositories and insurance companies) over the two periods. A closer inspection of Table 2 reveals that, as expected, the volatility of all institutions' returns (except for Unionbancal Corp.) increases during the global financial crisis, while skewness and kurtosis become more extreme. This empirical evidence is more relevant for those institutions that experienced distress instances during the period of crisis. Interestingly, the 1% stress levels in the penultimate column of Table 2 point out that individual risk measures, like the VaR, sometimes fail to detect the systemic relevance of the institutions. For example, Unionbancal Corp. and Safeco Corp. report a 1% stress level of about −4.7 and −3.0, respectively, while, over the same period, Bank of America reports a value roughly ten times larger, −41.0. Here, we examine whether stock market co–movements have changed over time, with a focus on the period of the recent global financial crisis. On the one hand, we expect that the global nature of the financial crisis might imply that the co–movements become stronger, with an increase in the long–run risks. On the other hand, given the heterogeneous composition


Depositories. In sample: from 04/01/1974 to 28/12/2007

| Name | Min | Max | Mean | Std. Dev. | Skewness | Kurtosis | 1% Str. Lev. | JB |
|---|---|---|---|---|---|---|---|---|
| Bank of America | -25.612 | 25.788 | 0.118 | 4.462 | -0.095 | 7.030 | -12.422 | 1213.821 |
| BB&T | -17.980 | 19.238 | 0.150 | 3.016 | 0.321 | 8.531 | -8.511 | 2312.545 |
| Bank of New York Mellon | -29.702 | 23.894 | 0.198 | 4.081 | -0.120 | 8.388 | -9.569 | 2169.767 |
| Comerica Inc | -30.149 | 19.416 | 0.167 | 3.286 | -0.582 | 9.379 | -9.329 | 3135.745 |
| JP Morgan Chase | -28.434 | 18.610 | 0.106 | 4.436 | -0.151 | 5.568 | -11.667 | 498.743 |
| Marshall Isley | -15.388 | 13.103 | 0.206 | 3.048 | -0.013 | 5.720 | -8.404 | 551.855 |
| M&T Bank Corp | -27.871 | 20.510 | 0.253 | 3.517 | 0.008 | 9.670 | -9.032 | 3317.801 |
| Northern Trust Corp | -20.555 | 22.997 | 0.245 | 3.498 | 0.297 | 8.444 | -9.018 | 2237.089 |
| Unionbancal Corp | -36.382 | 38.255 | 0.139 | 3.852 | 0.009 | 15.284 | -11.518 | 11253.675 |
| Wells Fargo | -15.281 | 17.130 | 0.176 | 3.590 | 0.102 | 4.952 | -8.972 | 287.169 |

Depositories. Out of sample: from 04/01/2008 to 24/07/2015

| Name | Min | Max | Mean | Std. Dev. | Skewness | Kurtosis | 1% Str. Lev. | JB |
|---|---|---|---|---|---|---|---|---|
| Bank of America | -61.669 | 61.271 | -0.203 | 9.412 | -0.919 | 18.459 | -41.031 | 4019.038 |
| BB&T | -28.458 | 32.471 | 0.089 | 5.623 | 0.074 | 10.772 | -16.746 | 1001.994 |
| Bank of New York Mellon | -30.053 | 29.327 | -0.033 | 5.162 | -0.454 | 12.602 | -16.060 | 1542.605 |
| Comerica Inc | -41.365 | 39.353 | 0.050 | 6.603 | -0.534 | 14.232 | -26.734 | 2111.097 |
| JP Morgan Chase | -46.416 | 33.475 | 0.127 | 5.997 | -1.344 | 18.938 | -21.264 | 4332.366 |
| Marshall Isley | -51.135 | 43.399 | -0.295 | 8.726 | -1.074 | 15.581 | -36.957 | 2701.344 |
| M&T Bank Corp | -28.703 | 23.163 | 0.127 | 5.058 | -0.400 | 10.568 | -15.706 | 960.454 |
| Northern Trust Corp | -36.735 | 25.280 | 0.008 | 4.731 | -1.133 | 16.046 | -14.283 | 2907.815 |
| Unionbancal Corp | -10.299 | 17.994 | 0.107 | 1.911 | 3.343 | 36.652 | -4.766 | 19521.694 |
| Wells Fargo | -57.290 | 54.115 | 0.179 | 6.880 | -0.561 | 27.188 | -24.980 | 9722.761 |

Insurance. In sample: from 04/01/1974 to 28/12/2007

| Name | Min | Max | Mean | Std. Dev. | Skewness | Kurtosis | 1% Str. Lev. | JB |
|---|---|---|---|---|---|---|---|---|
| American International Group | -15.496 | 21.390 | 0.220 | 3.909 | 0.237 | 5.301 | -9.991 | 411.547 |
| Chubb Corp | -16.469 | 19.820 | 0.166 | 3.935 | 0.221 | 5.454 | -10.547 | 463.518 |
| Cincinnati Financial Corp | -17.975 | 25.144 | 0.248 | 3.327 | 0.353 | 7.758 | -8.360 | 1725.916 |
| CNA Financial Corp | -36.466 | 44.392 | 0.137 | 4.863 | 0.342 | 12.949 | -12.899 | 7417.948 |
| Humana Inc | -33.787 | 20.410 | 0.364 | 5.754 | -0.464 | 6.098 | -17.865 | 780.180 |
| Safeco Corp | -23.867 | 30.913 | 0.154 | 3.837 | 0.008 | 8.273 | -10.148 | 2073.898 |

Insurance. Out of sample: from 04/01/2008 to 24/07/2015

| Name | Min | Max | Mean | Std. Dev. | Skewness | Kurtosis | 1% Str. Lev. | JB |
|---|---|---|---|---|---|---|---|---|
| American International Group | -187.551 | 137.402 | -0.676 | 15.300 | -2.973 | 75.650 | -44.709 | 88113.825 |
| Chubb Corp | -24.545 | 23.834 | 0.205 | 3.602 | -0.388 | 18.543 | -9.961 | 4016.427 |
| Cincinnati Financial Corp | -34.517 | 17.562 | 0.087 | 3.890 | -1.761 | 21.044 | -12.017 | 5604.925 |
| CNA Financial Corp | -59.818 | 32.696 | 0.048 | 5.956 | -2.352 | 33.661 | -17.279 | 15956.709 |
| Humana Inc | -54.892 | 20.584 | 0.211 | 6.109 | -2.603 | 23.770 | -20.644 | 7603.093 |
| Safeco Corp | -8.118 | 36.065 | 0.059 | 1.979 | 14.863 | 278.237 | -3.039 | 1270934.386 |

Overall index: S&P500

| Period | Min | Max | Mean | Std. Dev. | Skewness | Kurtosis | 1% Str. Lev. | JB |
|---|---|---|---|---|---|---|---|---|
| In sample: from 04/01/1974 to 28/12/2007 | -18.293 | 11.385 | 0.150 | 2.155 | -0.541 | 7.383 | -5.662 | 1519.943 |
| Out of sample: from 04/01/2008 to 24/07/2015 | -20.261 | 16.529 | 0.096 | 2.846 | -1.529 | 17.128 | -8.114 | 3465.338 |

Table 2: Summary statistics of the US financial institutions in the panel and of the S&P500 overall index, for the “in sample” period from January 2nd, 1974 to December 28, 2007 and the “out of sample” period from January 2nd, 2008 to July 24th, 2015. The eighth column, denoted by “1% Str. Lev.”, is the 1% empirical quantile of the returns distribution, while the last column, denoted by “JB”, is the value of the Jarque–Bera test statistic.

of the considered panel, where institutions differ strongly by market capitalisation and other individual characteristics (either observable, like the core business, or unobservable, like debt composition or the level of market linkage), we would expect that different institutions were hit rather unequally by the global crisis of 2008. For example, by looking at the kurtosis index in Table 2 we observe that insurance companies like American International Group (AIG) and Safeco Corp. (SAF) have heavier tails than other institutions in the panel (like BB&T, for example), and we would expect them to be more affected by negative events. Figure 1 shows the time series of cumulative returns for all the considered institutions, from January 2nd, 1974 till the end of the sample. Vertical shaded areas refer to the following events: the “Black Monday” (October 19, 1987), the “Black Wednesday” (September 16, 1992), the Asian crisis (July, 1997), the Russian crisis (August, 1998), the September 11, 2001 Twin Towers attack, the global financial crisis of 2008 identified by the period from the Bear Stearns hedge funds collapse (August 5, 2007) to the stock market crash of March 9, 2009 and the European sovereign–debt crisis of April 2010 identified by the period from the announcement

by the Greek government of the doubling of the budget deficit on October 18, 2009 to the beginning of the European Stability Mechanism (ESM) on October 08, 2012. The figure gives insights into the effect of crisis periods on each institution and on the overall market represented by the S&P500 composite index (thin red dashed line). After the 2001 Twin Towers attack and until the middle of 2007, the US financial system experienced a long period of small perturbations and stability, which ended shortly after the collapse of two Bear Stearns hedge funds in early August 2007. Starting from August 2007, the financial market experienced a huge fall: the subprime mortgage crisis led to a financial crisis and a subsequent recession that began in 2008. Several major financial institutions collapsed in September 2008, with significant disruption in the flow of credit to businesses and consumers and the onset of a severe global recession. The system hit the bottom in March 2009, and then started a slow recovery which culminated just before the European sovereign–debt crisis of April 2010. It is interesting to note that, since the beginning of the global financial crisis of 2008, all the considered institutions, as well as the market index, experienced huge capital losses, with American International Group (AIG) and Bank of America (BAC) being the most affected by the crisis. Except for American International Group (AIG), Bank of America (BAC) and Humana Inc. (HUM), all the other institutions become more correlated after the European sovereign–debt crisis, displaying similar trends.
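As a minimal sketch of how the columns of Table 2 can be computed from a series of weekly returns, the following Python fragment evaluates the sample moments, the 1% empirical quantile ("1% Str. Lev.") and the Jarque–Bera statistic JB = n/6 (S² + (K − 3)²/4). The function names, the simulated Student–t sample and the use of raw (non-excess) kurtosis with moments divided by n are assumptions made for illustration; the simulated data are not the paper's data.

```python
import math
import random

def summary_stats(returns, level=0.01):
    """Descriptive statistics in the spirit of Table 2 (illustrative sketch)."""
    n = len(returns)
    mean = sum(returns) / n
    # Central moments, dividing by n (an assumption of this sketch).
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # raw kurtosis: equals 3 for a Gaussian
    # Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4).
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Empirical quantile: order statistic at position ceil(level * n).
    ordered = sorted(returns)
    idx = max(0, math.ceil(level * n) - 1)
    return {"min": ordered[0], "max": ordered[-1], "mean": mean,
            "std": math.sqrt(m2), "skewness": skew, "kurtosis": kurt,
            "stress_level": ordered[idx], "jb": jb}

def simulate_student_t(n, dof, scale, seed=0):
    """Draw n scaled Student-t variates (synthetic stand-in for weekly returns)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with `dof` degrees of freedom
        draws.append(scale * z / math.sqrt(chi2 / dof))
    return draws

# Synthetic heavy-tailed sample; parameter values are hypothetical.
returns = simulate_student_t(n=1500, dof=5, scale=3.0)
stats = summary_stats(returns)
for name, value in stats.items():
    print(f"{name:>12s}: {value:10.3f}")
```

With heavy–tailed data the kurtosis lies well above 3 and the JB statistic becomes large, which is the pattern displayed by the out–of–sample rows of Table 2.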

5.2 Full sample estimation results

We estimate the multivariate dynamic Markov–switching models with Gaussian and Student–t innovations defined in Section 2.1 over the in–sample period from January 4, 1974 to December 28, 2007. A fundamental problem in fitting MS models is the choice of the number of latent states, L. Indeed, increasing the number of states always improves the model fit, but the improvement comes at a quadratic cost in terms of the number of parameters, so the improvement in fit has to be traded off against this increase. As regards model selection, we consider the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), which penalise the negative log–likelihood depending on the number of non–redundant parameters (see, e.g., Rydén 2008). Given the large number of parameters involved in the proposed HMMs, we only compare three models differing in the number of states, L. Table 3 reports the maximum log–likelihood, AIC, and BIC for Gaussian and Student–t MSMs as the number of hidden states varies from 1 to 3. For each value of L, we use 20 random starting points to initialise model parameters, and we report the results corresponding to the best solution in terms of log–likelihood. We stop the algorithms when the increase in the log–likelihood is less than 10^{-5}; for each of the two models, whenever possible, we use the same starting points. The selected model involves 2 latent states with Student–t innovations, because of its superior ability to account for the asymmetry and kurtosis displayed by the observed data in a parsimonious way as compared to the Gaussian alternative. To save space, parameter estimates for the selected model are reported in Tables 1–6 of the supplementary appendix available online. As expected, hidden regimes are mainly identified by correlations. Contrary to common findings, state–specific return means differ significantly across latent states, favouring the rejection of the null hypothesis that the conditional means are equal.
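The trade–off between fit and number of parameters can be illustrated with a short Python sketch that evaluates AIC = −2 LogL + 2k and BIC = −2 LogL + k log(n) for the Gaussian fits, using the log–likelihoods and parameter counts reported in Table 3. The number of observations `n_obs` is not stated in this section, so the value used below is a hypothetical placeholder and the resulting BIC values need not coincide with those in Table 3; the function and variable names are likewise illustrative.

```python
import math

def aic(loglik, n_params):
    """Akaike Information Criterion: -2 logL + 2 k."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian (Schwarz) Information Criterion: -2 logL + k log(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# (number of states L, log-likelihood, number of parameters): Gaussian rows of Table 3.
gaussian_fits = [(1, 63195.2746, 493), (2, 63759.5697, 938), (3, 64093.1789, 1385)]

# Hypothetical sample size (assumption: roughly the number of in-sample weeks).
n_obs = 1774

aic_by_states = {L: aic(ll, k) for L, ll, k in gaussian_fits}
bic_by_states = {L: bic(ll, k, n_obs) for L, ll, k in gaussian_fits}

# The preferred model minimises the criterion.
best_by_aic = min(aic_by_states, key=aic_by_states.get)
best_by_bic = min(bic_by_states, key=bic_by_states.get)

for L in sorted(aic_by_states):
    print(f"L={L}: AIC={aic_by_states[L]:.1f}  BIC={bic_by_states[L]:.1f}")
print("best by AIC:", best_by_aic, "| best by BIC:", best_by_bic)
```

The AIC values obtained in this way agree with the Gaussian AIC column of Table 3 up to rounding, and the same two functions apply unchanged to the Student–t fits.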
The different values of means and variances across states exacerbate the visited state persistence observed in our time series data. This evidence also emerges by inspecting the bottom panel of Figure 1, which reports the smoothed estimate of the hidden chain. Smoothed states p(S_t = k | Y_{1:T}), for k = 1, 2, over the whole sample period are estimated by running the


Figure 1: (Top panel): weekly cumulative returns of the financial institutions in the panel and the S&P500 index (thin red dashed line), from January 2nd, 1974 till the end of the sample. Vertical shaded areas represent major financial downturns: the “Black Monday” (October 19, 1987) (light grey), the “Black Wednesday” (September 16, 1992) (light green), the Asian crisis (July, 1997) (grey), the Russian crisis (August, 1998) (light blue), the September 11, 2001 Twin Towers attack (blue), the global financial crisis of 2008 (light yellow) identified by the period from the Bear Stearns hedge funds collapse (August 5, 2007) to the stock market crash of March 9, 2009 and the European sovereign–debt crisis of April 2010 (light red) identified by the period from the announcement by the Greek government of doubling of budget deficit of October 18, 2009 to the beginning of the European Stability Mechanism (ESM) of October 08, 2012. (Bottom panel): smoothed states p(S_t = k | Y_{1:T}), for k = 1, 2, over the whole sample period. Smoothed states are obtained by running the Forward–Filtering–Backward–Smoothing (FFBS) algorithm using the parameter estimates of the selected model reported in a supplementary appendix available online.

FFBS algorithm using the parameter estimates of the selected model in Table 3; see Frühwirth-Schnatter (2006). Clearly, the selected MSM identifies periods of crisis as well as stable phases. Indeed, according to the state–specific return means, we identify a positive and a negative regime. During a financial crisis (identified by State 1, red line in the bottom panel of Figure 1), stock returns experience large negative mean returns, and correlations are fairly large. Conversely, during more stable phases (identified by State 2, grey line in the bottom panel of Figure 1), stock returns fluctuate around a positive mean, and variances are relatively low. As a consequence, our model is able to distinguish and cluster time periods corresponding to different risk–return profiles. This evidence is confirmed by the small estimated values of the degrees–of–freedom parameters ν, which detect the presence of fat tails (see Table 2 in the supplementary material). Moreover, the estimated transition probability matrix Q (see Table 2 in the supplementary material), which governs the evolution over time of regime switching, is characterised by high persistence in the first state, which explains the low rate at which the system transits outside a crisis period. An interesting

Gaussian innovations

| L | LogL | N. Par. | AIC | BIC |
|---|---|---|---|---|
| 1 | 63195.2746 | 493 | -125404.5 | -122697.1 |
| 2 | 63759.5697 | 938 | -125643.1 | -120491.9 |
| 3 | 64093.1789 | 1385 | -125416.3 | -117810.4 |

Student–t innovations

| L | LogL | N. Par. | AIC | BIC |
|---|---|---|---|---|
| 1 | 64579.9245 | 494 | -127371.8 | -124658.9 |
| 2 | 66073.2373 | 940 | -130266.4 | -125104.3 |
| 3 | 66133.3003 | 1388 | -129490.6 | -121868.1 |

Table 3: Log–likelihood (LogL), number of parameters (N. Par.), Akaike (AIC) and Schwarz (BIC) information criteria for the HMM–VAR model with Gaussian and Student–t innovations fitted to the panel of US financial institutions and the S&P500 index in Table 1. The AIC/BIC denoted in bold face indicates the selected model.

aspect is related to crisis periods: a burst of crisis (as those identified in State 1) is likely to be followed by another crisis period. Thus, coming out suddenly from a crisis period is unlikely. As is well known, the persistence of regimes plays an important role in generating correlation clustering, whereby periods of high correlation are followed by high volatility, and periods of low correlation are followed by low volatility. The multivariate systemic risk measurement framework is able to identify different spillover effects among stocks, measured by the state–specific correlations. As expected, correlations are higher during crisis periods than during more stable phases. This evidence has several important consequences for the tail risk interdependence analysed in the next subsection. Nevertheless, even though the parameter estimates support the policy decision–making process, they do not provide enough information to evaluate extreme tail interdependence between stocks, and risk measures should be constructed on the basis of the obtained parameter estimates.
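The link between the diagonal of the transition matrix and regime persistence can be made concrete with a small Python sketch. For a two–state chain, the expected duration of regime k is 1/(1 − q_kk) and the long–run share of time spent in state 1 is q_21/(q_12 + q_21). The matrix `Q` below is hypothetical and chosen only for illustration (the estimated matrix is reported in the supplementary material and is not reproduced here); the function names are also assumptions of this sketch.

```python
def expected_duration(stay_prob):
    """Expected number of consecutive periods spent in a regime with staying probability stay_prob."""
    return 1.0 / (1.0 - stay_prob)

def stationary_distribution(Q):
    """Stationary distribution of a 2-state chain with transition matrix Q (rows sum to one)."""
    leave_1 = Q[0][1]  # P(state 1 -> state 2)
    leave_2 = Q[1][0]  # P(state 2 -> state 1)
    p1 = leave_2 / (leave_1 + leave_2)
    return [p1, 1.0 - p1]

def predict_regime(prob, Q, steps=1):
    """Propagate regime probabilities `steps` periods ahead: prob_{t+1} = prob_t Q."""
    for _ in range(steps):
        prob = [sum(prob[j] * Q[j][k] for j in range(len(Q))) for k in range(len(Q))]
    return prob

# Hypothetical transition matrix (NOT the paper's estimates): state 1 = crisis, state 2 = stable.
Q = [[0.95, 0.05],
     [0.02, 0.98]]

crisis_duration = expected_duration(Q[0][0])   # about 20 weeks
stable_duration = expected_duration(Q[1][1])   # about 50 weeks
long_run = stationary_distribution(Q)          # long-run share of time in each regime
next_week = predict_regime([1.0, 0.0], Q)      # regime probabilities one week after a crisis week

print(crisis_duration, stable_duration, long_run, next_week)
```

A staying probability close to one therefore translates directly into long expected crisis spells, which is the sense in which a sudden exit from a crisis period is unlikely.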

5.3 Marginal contributions to systemic risk

One of the most important questions a systemic risk measure should answer is the identification of institutions that are systemically more important or contribute more to the vulnerability of the whole financial system. The systemic risk framework described in previous sections measures the dynamic evolution of the systemic relevance of each institution. Moreover, a relevant byproduct of the multivariate approach and of the cooperative game framework used to attribute risk to market participants is the availability of a measure of the total risk generated by the financial system at each point in time. The measure of total risk provides an important monitoring tool for market–based macro–prudential or financial stability regulation. For the basket of institutions described in Section 5.1, we calculate the predicted marginal contributions of each institution to the overall systemic risk by means of the Shapley value ∆M CoVaR and ∆M CoES at τ1 = τ2 = 0.05 over the out–of–sample period from January 4, 2008 to the end of the period (July 24, 2015). The evaluation of the MCoVaR requires the prior estimation of the marginal VaR for each institution. Moreover, VaRs and CoVaRs are quantiles of the marginal and conditional distributions, respectively. In order to evaluate

the predictive ability of the two risk measures, VaR and CoVaR, we perform backtesting procedures over the out–of–sample period from January 4, 2008 to the end of the sample and we calculate the well–known unconditional and conditional coverage tests of Kupiec (1995) and Christoffersen (1998), and the Dynamic Quantile (DQ) test of Engle and Manganelli (2004), for τ = 0.05 and for all the considered institutions. The VaR backtesting procedures have been performed by looking at the violations of the corresponding estimated marginal quantiles, while, as concerns the CoVaR, we only backtested the “total systemic risk”, by looking at the violations of the MCoVaR when all the institutions are below their marginal VaRs. Similarly, for the definition of CoVaR of Adrian and Brunnermeier (2016) we considered the violations of the marginal VaR of the overall index when each institution is below its marginal VaR level. The two tables summarising all the results for both risk measures are available as supplementary material. In both cases we reach satisfactory results: neither the conditional nor the unconditional coverage test rejects the null hypothesis that the violation series are martingale difference sequences, and the DQ test never rejects the null hypothesis. The top panels in Figure 2 plot the systemic risk contribution of each institution, while the bottom panels report the total risk evolution over the different periods identified in Section 5.1. For example, the top panel of sub–figure (2a) plots the evolution of the systemic risk contributions (evaluated by the ∆M CoVaR) over the first sub–period from January 4, 2008 to the collapse of Safeco Corp. (September 26, 2008). The bottom panel of sub–figure (2a) plots the total systemic risk evolution over the same period.
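To illustrate the simplest of the backtests mentioned above, the following Python sketch implements the unconditional coverage likelihood ratio test of Kupiec (1995): given a series of violation indicators (return below the VaR forecast) and the nominal level τ, the statistic compares the binomial likelihood under the nominal rate with the likelihood under the observed violation rate, and is asymptotically chi–square with one degree of freedom. The function names and the violation counts in the usage lines are hypothetical; the conditional coverage and DQ tests are not covered by this sketch.

```python
import math

def _xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def kupiec_test(violations, tau):
    """Unconditional coverage LR test; returns (LR statistic, asymptotic p-value)."""
    n = len(violations)
    x = sum(1 for v in violations if v)  # number of VaR violations
    pi_hat = x / n                       # observed violation rate
    loglik_null = _xlogy(n - x, 1.0 - tau) + _xlogy(x, tau)
    loglik_alt = _xlogy(n - x, 1.0 - pi_hat) + _xlogy(x, pi_hat)
    lr = max(-2.0 * (loglik_null - loglik_alt), 0.0)
    # Survival function of a chi-square(1) variable: P(X > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

def violation_series(returns, var_forecasts):
    """Indicator of a violation: the realised return falls below the VaR forecast."""
    return [r < v for r, v in zip(returns, var_forecasts)]

# Hypothetical example: 200 weeks, nominal level 5%.
well_calibrated = [True] * 10 + [False] * 190   # 5% of violations
too_many = [True] * 30 + [False] * 170          # 15% of violations

lr_ok, p_ok = kupiec_test(well_calibrated, tau=0.05)
lr_bad, p_bad = kupiec_test(too_many, tau=0.05)
print(f"calibrated: LR={lr_ok:.3f}, p={p_ok:.3f}")
print(f"too many:   LR={lr_bad:.3f}, p={p_bad:.6f}")
```

A large p–value, as in the first case, means that the null hypothesis of correct unconditional coverage is not rejected.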
The inspection of the bottom panels of Figure 2 reveals that the total systemic risk is at its minimum level before the Bear Stearns hedge funds collapse (August 5, 2007) and then increases significantly from 2007 until the middle of 2008. Subsequently, the system experienced a long period of financial instability and high volatility (between September 2008, when Lehman Brothers failed, and September 2009), during which the overall systemic risk reached the highest level the system had ever experienced. Then the total systemic risk decreased suddenly, reaching the pre–Bear Stearns collapse level in the middle of 2009. This was probably a major consequence of the adoption of the US Supervisory Capital Assessment Program (SCAP) by the Federal Reserve System. The main aim of the SCAP, whose results were released on May 7, 2009, was to determine whether the largest US financial organisations had sufficient capital buffers to withstand the recession and the financial market turmoil. The market remained calm until the first round of the European sovereign debt crisis in May 2010, after Greece received aid of 14.5 billion euros, as documented by the decrease of the total systemic risk. Almost a year later (June 13, 2011), Standard & Poor’s downgraded Greek debt from B to CCC, and the total systemic risk rose sharply, reaching in summer 2011 its highest peak since the 2007–2009 financial crisis. Concerning the total systemic risk exposure during the recent global financial crisis of 2007–2008, we observe that it is about three times as large as it was during the previous period (2002–2007) of financial stability and about twice as large as it was during the 1987 financial crisis (not reported in Figure 2), except for the Black Monday week, when we observe roughly the same level as in 2008.
The top panels of Figure 2 plot the individual marginal contributions to the systemic risk calculated by means of the Shapley value ShV_i based on ∆M CoVaR_{i|J_d} and ∆M CoES_{i|J_d}, respectively. Here, the index i denotes each individual institution’s risk contribution calculated with respect to the market index (S&P500). Those panels track the systemic risk importance of each institution in percentage points over the four sub–periods defined in Section 5.1. As

[Figure 2 panels: (a) Period I: ∆CoVaR; (b) Period II: ∆CoVaR; (c) Period III: ∆CoVaR; (d) Period IV: ∆CoVaR; (e) Period I: ∆CoES; (f) Period II: ∆CoES; (g) Period III: ∆CoES; (h) Period IV: ∆CoES. Horizontal axes report dates.]

Figure 2: Shapley value ∆M CoVaR and ∆M CoES marginal systemic risk contributions of each institution over the four different periods, from January 4, 2008 to September 26, 2008 (period I), from September 26, 2008 to November 14, 2008 (period II), from November 14, 2008 to July 1, 2011 (period III) and from July 1, 2011 to July 24, 2015 (period IV). Shaded areas denote the different periods. For each period, the top panel reports the marginal contributions to systemic risk evaluated by means of the Shapley value methodology ShV_i based on ∆CoVaR and ∆CoES, for all the banks in the panel, while the bottom panel reports the total systemic risk measured by ∆M CoVaR_{k|S}^{τ1|τ1} and by ∆M CoES_{k|S}^{τ1|τ1}, where k denotes the S&P500 index and S denotes the set of indexes of the remaining assets. Vertical dashed lines represent major financial downturns: for a detailed description see Table 6.

[Figure 3 panels: (a) Period I; (b) Period II; (c) Period III; (d) Period IV. Horizontal axes report dates.]

Figure 3: ∆CoVaR and ∆CoES marginal systemic risk contributions of Adrian and Brunnermeier (2016) for each institution over the four different periods, from January 4, 2008 to September 26, 2008 (period 1), from September 26, 2008 to November 14, 2008 (period 2), from November 14, 2008 to July 1, 2011 (period 3) and from July 1, 2011 to July 24, 2015 (period 4). Shaded areas denote the different periods. Vertical dashed lines represent major financial downturns: for a detailed description see Table 6.

expected, the level of systemic importance of the different institutions is not constant: it changes over time and, during periods of financial instability, it is characterised by higher volatility levels. It is worth mentioning that the Shapley value ∆M CoVaR and ∆M CoES delivered by the procedure presented in the previous sections represent the share of the total systemic risk attributed to each institution. Moreover, the Shapley values at any point in time are obtained as the result of a cooperative game where all the possible combinations of institutions in distress at that time play the role of coalitions in the standard cooperative game theory approach. As a consequence, Shapley values that refer to different groups of coalitions cannot be compared. Those reasons motivate the choice to split the whole out–of–sample period into four sub–samples of different length, within which the number of institutions does not change because of the collapse of one or more institutions. In Figure 2 we have four plots for the Shapley value ∆M CoVaR (∆M CoES), each of which refers to a different sub–period. Different sub–periods are denoted by different background colours. For example, Figure 2b plots the Shapley values ∆M CoVaR during the second period (identified as the period between the collapse of Safeco Corp. and that of Unionbancal Corp.) in light blue, and the Shapley values ∆M CoVaR that would have been realised during the first period if the default of Safeco Corp. had not occurred (in light grey). Table 4 summarises the mean and standard deviation of the Shapley values ∆M CoVaR and


Mean

| Name | SV ∆M CoVaR Per. I | SV ∆M CoVaR Per. II | SV ∆M CoVaR Per. III | SV ∆M CoVaR Per. IV | SV ∆M CoES Per. I | SV ∆M CoES Per. II | SV ∆M CoES Per. III | SV ∆M CoES Per. IV |
|---|---|---|---|---|---|---|---|---|
| BAC | 7.101 | 7.682 | 8.147 | 8.276 | 7.150 | 7.766 | 8.191 | 8.316 |
| BBT | 5.848 | 6.477 | 6.486 | 7.263 | 5.826 | 6.443 | 6.443 | 7.234 |
| BK | 6.354 | 6.888 | 6.975 | 7.772 | 6.357 | 6.880 | 6.964 | 7.779 |
| CMA | 5.846 | 6.710 | 6.717 | 7.439 | 5.825 | 6.686 | 6.686 | 7.421 |
| JPM | 6.630 | 6.616 | 7.311 | 8.069 | 6.644 | 6.603 | 7.321 | 8.096 |
| MI | 5.491 | 6.411 | 7.101 | – | 5.453 | 6.376 | 7.082 | – |
| MTB | 5.712 | 6.355 | 6.525 | 7.036 | 5.682 | 6.340 | 6.491 | 6.990 |
| NTRS | 5.714 | 6.372 | 6.365 | 6.964 | 5.683 | 6.347 | 6.321 | 6.914 |
| UB | 6.303 | 6.123 | – | – | 6.303 | 6.088 | – | – |
| WFC | 5.702 | 6.061 | 6.919 | 7.343 | 5.673 | 6.024 | 6.904 | 7.319 |
| AIG | 8.953 | 9.300 | 9.704 | 8.679 | 9.087 | 9.425 | 9.818 | 8.741 |
| CB | 5.542 | 5.694 | 6.222 | 7.056 | 5.508 | 5.680 | 6.191 | 7.013 |
| CINF | 5.052 | 5.334 | 5.690 | 6.287 | 4.995 | 5.295 | 5.624 | 6.190 |
| CNA | 6.688 | 6.998 | 7.428 | 7.932 | 6.712 | 7.038 | 7.458 | 7.949 |
| HUM | 7.489 | 6.981 | 8.409 | 9.882 | 7.558 | 7.008 | 8.505 | 10.038 |
| SAF | 5.575 | – | – | – | 5.545 | – | – | – |

Std Dev.

| Name | SV ∆M CoVaR Per. I | SV ∆M CoVaR Per. II | SV ∆M CoVaR Per. III | SV ∆M CoVaR Per. IV | SV ∆M CoES Per. I | SV ∆M CoES Per. II | SV ∆M CoES Per. III | SV ∆M CoES Per. IV |
|---|---|---|---|---|---|---|---|---|
| BAC | 0.576 | 0.844 | 0.518 | 0.160 | 0.617 | 0.889 | 0.531 | 0.167 |
| BBT | 0.445 | 0.813 | 0.430 | 0.172 | 0.466 | 0.857 | 0.459 | 0.183 |
| BK | 0.456 | 0.929 | 0.475 | 0.191 | 0.477 | 0.973 | 0.510 | 0.203 |
| CMA | 0.299 | 0.975 | 0.429 | 0.121 | 0.317 | 1.021 | 0.457 | 0.129 |
| JPM | 0.392 | 0.777 | 0.486 | 0.161 | 0.409 | 0.798 | 0.521 | 0.172 |
| MI | 0.241 | 0.691 | 0.657 | – | 0.253 | 0.731 | 0.684 | – |
| MTB | 0.214 | 0.555 | 0.329 | 0.161 | 0.225 | 0.578 | 0.351 | 0.171 |
| NTRS | 0.356 | 0.608 | 0.233 | 0.120 | 0.372 | 0.615 | 0.245 | 0.125 |
| UB | 0.321 | 0.706 | – | – | 0.340 | 0.739 | – | – |
| WFC | 0.472 | 0.528 | 0.271 | 0.094 | 0.501 | 0.591 | 0.290 | 0.101 |
| AIG | 0.906 | 1.199 | 2.146 | 1.034 | 0.948 | 1.244 | 2.222 | 1.076 |
| CB | 0.553 | 1.189 | 0.750 | 0.252 | 0.584 | 1.201 | 0.778 | 0.260 |
| CINF | 0.482 | 0.824 | 0.594 | 0.157 | 0.501 | 0.826 | 0.616 | 0.157 |
| CNA | 0.634 | 1.186 | 0.559 | 0.113 | 0.667 | 1.216 | 0.598 | 0.120 |
| HUM | 0.686 | 1.082 | 0.998 | 0.410 | 0.728 | 1.090 | 1.063 | 0.442 |
| SAF | 0.329 | – | – | – | 0.341 | – | – | – |

Table 4: Summary statistics of the Shapley Value ∆M CoVaR and ∆M CoES (in %) over the four different periods.

∆M CoES over the different sub–periods. Figure 3 instead plots the ∆CoVaR of Adrian and Brunnermeier (2016), while Table 5 summarises the mean and standard deviation of the CoVaR estimates. Backtesting results for VaR and CoVaR are reported in the supplementary materials available online. The CoVaR of Adrian and Brunnermeier (2016) is reported to highlight the major differences with our approach. In Bernardi and Petrella (2015) we perform a similar experiment and compare the MCoVaR as defined here with that obtained by estimating the bivariate model on each institution coupled with the overall index. In that way we were able to

| Name | Mean Per. I | Mean Per. II | Mean Per. III | Mean Per. IV | Std Dev. Per. I | Std Dev. Per. II | Std Dev. Per. III | Std Dev. Per. IV |
|---|---|---|---|---|---|---|---|---|
| BAC | 8.969 | 9.530 | 11.221 | 13.089 | 0.110 | 0.117 | 0.134 | 0.135 |
| BBT | 7.977 | 8.069 | 20.080 | 26.589 | 0.128 | 0.142 | 0.465 | 0.522 |
| BK | 8.063 | 8.426 | 11.445 | 12.944 | 0.124 | 0.132 | 0.198 | 0.193 |
| CMA | 9.137 | 16.623 | 15.245 | 20.742 | 0.127 | 0.237 | 0.226 | 0.262 |
| JPM | 6.446 | 6.506 | 8.648 | 11.544 | 0.101 | 0.109 | 0.162 | 0.182 |
| MI | 25.117 | 26.452 | 15.548 | – | 0.357 | 0.371 | 0.266 | – |
| MTB | 6.330 | 6.751 | 15.206 | 20.946 | 0.112 | 0.137 | 0.376 | 0.297 |
| NTRS | 13.391 | 7.106 | 10.514 | 14.482 | 0.186 | 0.116 | 0.190 | 0.222 |
| UB | 3.925 | 6.695 | – | – | 0.072 | 0.122 | – | – |
| WFC | 4.977 | 4.976 | 14.725 | 17.212 | 0.088 | 0.086 | 0.250 | 0.247 |
| AIG | 12.008 | 11.262 | 21.568 | 22.488 | 0.165 | 0.158 | 0.353 | 0.347 |
| CB | 11.471 | 16.348 | 16.598 | 19.380 | 0.360 | 0.312 | 0.372 | 0.347 |
| CINF | 3.614 | 4.335 | 6.202 | 5.725 | 0.066 | 0.081 | 0.102 | 0.088 |
| CNA | 11.932 | 11.838 | 13.252 | 16.555 | 0.168 | 0.183 | 0.192 | 0.200 |
| HUM | 8.372 | 7.671 | 7.680 | 7.607 | 0.116 | 0.112 | 0.122 | 0.105 |
| SAF | 6.877 | – | – | – | 0.139 | – | – | – |

Table 5: Summary statistics of the ∆CoVaR of Adrian and Brunnermeier (2016) (in %) over the four different periods.

assess the relevance and the impact of the multiple risk measurement framework based on the proposed game theoretical approach, disregarding the effect of the selected model. Here, instead, we compare the MCoVaR with the original definition of CoVaR. The main difference between Figure 2 and Figure 3 is the absence in the latter of the bottom panels plotting the evolution of the total systemic risk. Indeed, this absence reflects the main difference between the proposed approach and the ∆CoVaR approach of Adrian and Brunnermeier (2016): the Shapley value methodology computes the total systemic risk and then distributes it among the market participants, while the ∆CoVaR only allows one to rank the systemic importance of the different institutions. The inspection of the top panels of Figure 2 (see also Table 4) reveals that, on the one hand, as expected, Bank of America (BAC) and American International Group (AIG) are, respectively, the bank and the insurance institution with the largest systemic importance weights during the whole period. Indeed, BAC is one of the largest US banks while, as said above, AIG experienced distress instances and, on September 16, 2008, the Federal Reserve provided AIG with a two–year loan of 85 billion USD to prevent its bankruptcy. On the other hand, Safeco Corp (SAF), Unionbancal Corp (UB) and Marshall Isley (MI), which experienced distress instances during the analysed period, are always included among the companies that contribute least to the systemic failure. This apparently strange result should be interpreted taking into account that the proposed method aims to identify the institutions that are systemically important. Those institutions may substantially differ from the set of institutions having a large predicted probability of distress, since what is relevant from a “systemic” perspective are the consequences of the distress of one or more institutions on all the remaining institutions that belong to the system.
Indeed SAF, UB and MI are never identified as systemically important by the ∆M CoVaR (∆M CoES) in any of the subsequent periods. The relevance of AIG and the fact that the institutions that suffer distress instances during the period (SAF, UB and MI) are not identified as systemically relevant are partly confirmed by looking at Figure 3 and the summary Table 5, which provide results for the ∆CoVaR of Adrian and Brunnermeier (2016), except for MI, which is the most systemically important institution according to the ∆CoVaR in Periods I and II. Another difference between the ∆CoVaR of Adrian and Brunnermeier (2016) and the proposed ∆M CoVaR concerns the ordering of systemic importance and the level of individual systemic importance. Indeed, the results for the ∆M CoVaR are more stable over time, while the Adrian and Brunnermeier (2016) CoVaR levels display larger variance.
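The attribution step discussed in this section can be sketched in a few lines of Python. The function below computes Shapley values for a generic characteristic function by averaging each player's marginal contribution over all coalitions of the other players; by construction, the shares add up to the value of the grand coalition, which is the sense in which the total risk is distributed among participants. The characteristic function `toy_risk`, the player labels and the numbers in `STAND_ALONE` are purely hypothetical and are not the ∆M CoVaR–based function used in the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Shapley value of each player for the characteristic function `value`."""
    n = len(players)
    shares = {}
    for i in players:
        others = [p for p in players if p != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of `size` other players: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi += weight * (value(s | {i}) - value(s))
        shares[i] = phi
    return shares

# Hypothetical stand-alone risk figures for three illustrative institutions.
STAND_ALONE = {"A": 3.0, "B": 1.0, "C": 2.0}

def toy_risk(coalition):
    """Toy characteristic function: stand-alone risks plus 0.5 for every pair in joint distress."""
    k = len(coalition)
    return sum(STAND_ALONE[p] for p in coalition) + 0.5 * k * (k - 1) / 2

players = list(STAND_ALONE)
shares = shapley_values(players, toy_risk)
total = toy_risk(frozenset(players))
percent = {p: 100.0 * shares[p] / total for p in players}

print(shares)   # risk attributed to each institution
print(percent)  # the same attribution in percentage points of the total
```

Because the number of coalitions grows exponentially with the number of players, this exhaustive enumeration is meant only as an illustration for a small set of institutions.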

6 Conclusion

In this paper, we develop a multivariate model–based approach to measure the dynamic evolution of the overall risk and how it spreads among interconnected institutions belonging to a financial system which might itself experience instability and spread new sources of systemic risk. To this end, we consider both Gaussian and Student–t Markov Switching models accounting for multiple underlying risk–return profiles. The risk measurement framework developed throughout the paper is made up of two main ingredients. The first one consists of two new tail co–movement risk measures, namely the MCoVaR and the MCoES, which naturally extend and improve the Adrian and Brunnermeier (2016) ideas of CoVaR and CoES to account for the multiple joint occurrence of extreme distress events. Those measures are analytically available for the Gaussian and Student–t Markov switching models and they are calculated on the model’s predictive distributions. The second ingredient, the Shapley value, combines those multiple risk measures into an overall systemic risk indicator that essentially attributes the risk to the market participants. As a byproduct, the procedure returns the dynamic evolution of the total amount of risk defined as an appropriate extreme event related to the financial system conditioned on all the institutions belonging to the system being under distress. The proposed risk measurement framework overcomes the limitation of the CoVaR approach of Adrian and Brunnermeier (2016) which only relies on conditionally independent pairwise quantile–based risk measures. To summarise, the idea behind this paper is to develop a model–based approach to measure extreme tail risk co–movements among financial institutions which relies on the superior ability of Markov switching models to capture the main empirical evidence of stylised facts of financial returns. 
The methodology is then applied to assess the individual marginal contribution to the systemic risk of major US financial institutions previously considered by Acharya et al. (2010) and belonging to the Standard and Poor’s 500 index. Comparing the Student–t and Gaussian results, we observe that the former assumption is preferred by standard information criteria, and this evidence supports the use of fat–tailed distributions for financial data. Moreover, the choice of the Student–t distribution is also justified on theoretical grounds, because it accounts for non–linear dependence among tail events, which is not possible under the multivariate Gaussian assumption. Our empirical results suggest that the marginal contribution of individual institutions to the systemic risk evolves dynamically over time and changes dramatically, both in order of importance and in levels, during periods of financial crisis. More importantly, we observe that coupling a model able to correctly identify different volatility regimes with suitably defined systemic risk measures that account for contemporaneous multiple distress events allows us to improve the understanding and the prediction of how

financial market turmoils impact on individual and systemic risks. Our results have several important policy implications. First, our analysis provides useful suggestions for the ongoing discussion on the imposition of capital requirements on systemically important institutions to prevent financial system disasters spillover effects. Concerning this aspect the main implication of our analysis is that capital requirements should be based on forward looking risk measures accounting for the evolving economic and financial conditions. Second, although the proposed model does not consider any exogenous information concerning individual institutions, this information can be easily included in our general framework improving policy analysis. Third, our systemic risk contribution is designed to provide a predictive systemic risk measure that can be interpreted as an early warning indicator. Of course the indicator can be improved by considering long–run forecasting exercises. To this end the risk measures should be modified to consider new dynamic evolving extreme tail events.

A Financial crisis timeline

Date                  Event
January 21, 2008      the global stock markets suffer their largest fall since September 2001.
March 16, 2008        the Bear Stearns acquisition by JP Morgan Chase.
September 15, 2008    the Lehman’s failure.
March 9, 2009         the peak of the onset of the recent global financial crisis.
December 8, 2009      the downgrading of Greece’s credit rating from A- to BBB+ by Fitch ratings agency.
May 18, 2010          the Greece achievement of 18bn USD bailout from EFSF, IMF and bilateral loans.
November 29, 2010     the Ireland achievement of 113bn USD bailout from EU, IMF and EFSF.
May 05, 2011          the Portugal bailout from ECM.
August 05, 2011       the S&P downgrading of US sovereign debt.
March 16, 2013        the Cyprus achievement of 13bn USD bailout from ECM.
April 07, 2013        the conference of the Portuguese Prime Minister regarding the high court’s block of austerity plans.
April 30, 2013        the approval of the Cyprus bailout by the Euro Parliament.
September 17, 2013    the drop of car sales to the lowest recorded level in the Euro area.
June 03, 2014         the drop of Eurozone inflation, and the consequent increasing pressure on the Central Bank.
September 09, 2014    the Mediterranean countries prepare for further unrest.
November 28, 2014     the Italian unemployment rate reaches its record high since 1977.
June 26, 2015         the Greek government unilaterally broke off negotiations with the Eurogroup.

Table 6: Financial crisis timeline.

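B ECM algorithm

In this Appendix we detail the EM algorithm for the Student–t HMM defined in Section 2.

(i) E–step: at iteration (m + 1), the E–step requires the computation of the so–called Q–function, which calculates the conditional expectation of the complete–data log–likelihood given the observations and the current parameter estimates $\Xi^{(m)}$

\[
\begin{aligned}
Q\left(\Xi, \Xi^{(m)}\right) \propto{} & \sum_{l=1}^{L} \hat z_{1,l}^{(m)} \log(\delta_l) + \sum_{l=1}^{L}\sum_{k=1}^{L}\sum_{t=1}^{T} \widehat{zz}_{t,l,k}^{(m)} \log(q_{l,k}) \\
& - \frac{1}{2}\sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \left(p \log(2\pi) + \log|C_l|\right) - \sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \sum_{j=1}^{p} \log(\sigma_{j,t}) \\
& - \frac{1}{2}\sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)}\, \mathrm{tr}\left(C_l^{-1} \varepsilon_{t,l} \varepsilon_{t,l}'\right) \\
& + \sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \left(\frac{\nu_l}{2} \log\left(\frac{\nu_l}{2}\right) - \log\Gamma\left(\frac{\nu_l}{2}\right)\right) \\
& + \frac{1}{2}\sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \nu_l \left(\log\left(\hat\varpi_{t,l}^{(m)}\right) - \hat\varpi_{t,l}^{(m)}\right) \\
& + \sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \left(\frac{p}{2} - 1\right) \log\left(\hat\varpi_{t,l}^{(m)}\right),
\end{aligned}
\tag{B.1}
\]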

where $\varepsilon_{t,l} = D_t^{-1}\left(y_t - \mu_{t,l}\right)$, with $\mu_{t,l} = a_l + A_l y_{t-1}$, $D_t = \mathrm{diag}\{\sigma_{1,t}, \sigma_{2,t}, \dots, \sigma_{p,t}\}$, for $l = 1, 2, \dots, L$, and $\sigma_{j,t}$, for $j = 1, 2, \dots, p$, follow the AV–GARCH(1, 1) process defined in Section 2.2. Here, the conditional expectations $\hat z_{t,l}^{(m)} = E\left(z_{t,l} \mid y_{1:T}, \Xi^{(m)}\right)$ and $\widehat{zz}_{t,l,k}^{(m)} = E\left(zz_{t,l,k} \mid y_{1:T}, \Xi^{(m)}\right)$, $\forall t = 1, 2, \dots, T$ and $\forall l, k = 1, 2, \dots, L$, are computed via the well–known Forward–Filtering Backward–Smoothing (FFBS) recursive algorithm; see, e.g., Baum et al. (1970) and Frühwirth-Schnatter (2006). Moreover, $\hat\varpi_{t,l}^{(m)}$ denotes the current estimate of the conditional expectation of $\varpi_t$ given the observation at time t, $y_t$, and $z_{t,l} = 1$, and it is computed as in Peel and McLachlan (2000):

\[
\hat\varpi_{t,l}^{(m)} = E\left(\varpi_t \mid y_{1:T}, z_{t,l} = 1, \Xi^{(m)}\right) = \frac{\hat\nu_l^{(m)} + p}{\hat\nu_l^{(m)} + d_M\left(y_t, \hat\mu_{t,l}^{(m)}, \hat\Sigma_{t,l}^{(m)}\right)}, \tag{B.2}
\]

where $\Sigma_{t,l} = D_t C_l D_t$ and $\mu_{t,l} = a_l + A_l y_{t-1}$, for $t = 1, 2, \dots, T$ and $\forall l = 1, 2, \dots, L$, and $d_M(x_t, \mu_t, \Sigma_t)$ denotes the Mahalanobis distance.

(ii) M–step: at iteration (m + 1), the M–step maximizes the function $Q\left(\Xi, \Xi^{(m)}\right)$ with respect to $\Xi$ to determine the next set of parameters $\Xi^{(m+1)}$. The updated estimates of the hidden parameters, the mean vector $\mu_l$, and the scale matrix $\Sigma_l$ are given by the following expressions.
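The E–step weight in equation (B.2) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the example is restricted to p = 2 so that the scale matrix can be inverted by hand, and it assumes that d_M is the quadratic form (y − µ)′Σ⁻¹(y − µ), as in Peel and McLachlan (2000).

```python
def mahalanobis_sq_2d(y, mu, sigma):
    """Quadratic form (y - mu)' Sigma^{-1} (y - mu) for a 2x2 scale matrix.

    Assumption: d_M in (B.2) is this squared Mahalanobis distance.
    """
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0.0:
        raise ValueError("scale matrix must be positive definite")
    e1, e2 = y[0] - mu[0], y[1] - mu[1]
    # Explicit inverse of a 2x2 matrix applied to the residual vector.
    return (s22 * e1 * e1 - (s12 + s21) * e1 * e2 + s11 * e2 * e2) / det


def t_weight(nu, p, d_m):
    """Conditional expectation of the mixing variable, as in (B.2)."""
    return (nu + p) / (nu + d_m)


# Hypothetical inputs: nu = 4 degrees of freedom, p = 2 series.
nu, p = 4.0, 2
sigma = [[1.0, 0.3], [0.3, 1.0]]
typical = t_weight(nu, p, mahalanobis_sq_2d([0.5, -0.2], [0.0, 0.0], sigma))
outlier = t_weight(nu, p, mahalanobis_sq_2d([4.0, -4.0], [0.0, 0.0], sigma))
print(typical, outlier)
```

Observations far from the state–specific location receive a small weight, which is how the Student–t specification downweights extreme returns in the subsequent CM–steps.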

CM–1 The vector of initial probabilities $\delta$ and the transition matrix of the Markov chain $Q$ have closed–form updating equations:

\[
\delta_l^{(m+1)} = \hat z_{1,l}^{(m)}, \qquad
q_{l,k}^{(m+1)} = \frac{\sum_{t=2}^{T} \widehat{zz}_{t,l,k}^{(m)}}{\sum_{k=1}^{L}\sum_{t=2}^{T} \widehat{zz}_{t,l,k}^{(m)}},
\]

for $l, k = 1, 2, \dots, L$.

CM–2 The parameters driving the volatility dynamics, $\vartheta$, enter the complete–data log–likelihood in equation (B.1) only through the following quantity:

\[
Q_\vartheta\left(\vartheta, \Xi^{(m)}\right) \propto - \sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \sum_{j=1}^{p} \log(\sigma_{j,t}) - \frac{1}{2}\sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)}\, \mathrm{tr}\left(\hat C_l^{(m)\,-1} D_t^{-2} \hat S_{t,l}^{(m)}\right),
\]

where $\hat S_{t,l}^{(m)} = \left(y_t - \hat a_l^{(m)} - \hat A_l^{(m)} y_{t-1}\right)\left(y_t - \hat a_l^{(m)} - \hat A_l^{(m)} y_{t-1}\right)'$ and $D_t = \mathrm{diag}\{\sigma_{1,t}, \sigma_{2,t}, \dots, \sigma_{p,t}\}$, for $l = 1, 2, \dots, L$, and $\sigma_{j,t}$, for $j = 1, 2, \dots, p$, follow the AV–GARCH(1, 1) process defined in Section 2.2. The update of the volatility parameters can be obtained as the result of the following optimisation problem

\[
\hat\vartheta^{(m+1)} = \arg\max_{\vartheta} Q_\vartheta\left(\vartheta, \Xi^{(m)}\right). \tag{B.3}
\]
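The closed–form CM–1 update of the initial probabilities and of the transition matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the nested–list layout are hypothetical, and the smoothed probabilities are taken as given, as if already produced by an FFBS pass.

```python
def update_chain_parameters(z1, zz):
    """CM-1 update of the initial distribution and the transition matrix.

    z1[l]       : smoothed probability of state l at time t = 1
    zz[t][l][k] : smoothed joint probability of state l at time t-1 and
                  state k at time t, stored for t = 2, ..., T
    (hypothetical layout; both arrays are assumed to come from FFBS)
    """
    n_states = len(z1)
    delta = list(z1)  # delta_l^(m+1) = z_hat_{1,l}^(m)
    transition = []
    for l in range(n_states):
        # Expected number of l -> k transitions, summed over time.
        counts = [sum(zz_t[l][k] for zz_t in zz) for k in range(n_states)]
        row_total = sum(counts)
        if row_total <= 0.0:
            raise ValueError("state %d is never visited" % l)
        transition.append([c / row_total for c in counts])
    return delta, transition


# Toy smoothed probabilities for L = 2 states and T = 3 observations.
z1 = [0.7, 0.3]
zz = [
    [[0.6, 0.1], [0.1, 0.2]],
    [[0.5, 0.2], [0.1, 0.2]],
]
delta, transition = update_chain_parameters(z1, zz)
print(delta, transition)
```

Each row of the updated transition matrix is the expected number of transitions out of a state, normalised to sum to one.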

CM–3 The correlation matrices $C = \{C_1, C_2, \dots, C_L\}$ enter the complete–data log–likelihood in equation (B.1) only through the following quantity:

\[
Q_C\left(C, \Xi^{(m)}\right) \propto \sum_{l=1}^{L} Q_C^{(l)}\left(C_l, \Xi^{(m)}\right),
\]

\[
Q_C^{(l)}\left(C_l, \Xi^{(m)}\right) \propto - \frac{1}{2}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \log|C_l| - \frac{1}{2}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)}\, \mathrm{tr}\left(C_l^{-1} \hat D_t^{(m+1)\,-2} \hat S_{t,l}^{(m)}\right), \tag{B.4}
\]

where $\hat S_{t,l}^{(m)} = \left(y_t - \hat a_l^{(m)} - \hat A_l^{(m)} y_{t-1}\right)\left(y_t - \hat a_l^{(m)} - \hat A_l^{(m)} y_{t-1}\right)'$ and $\hat D_t^{(m+1)} = \mathrm{diag}\left\{\hat\sigma_{1,t}^{(m+1)}, \hat\sigma_{2,t}^{(m+1)}, \dots, \hat\sigma_{p,t}^{(m+1)}\right\}$, for $t = 1, 2, \dots, T$. The update of the correlation matrix can be obtained as the result of the following optimisation problem

\[
\hat C_l^{(m+1)} = \arg\max_{C_l} Q_C^{(l)}\left(C_l, \Xi^{(m)}\right), \tag{B.5}
\]

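for $l = 1, 2, \dots, L$. To speed up the optimisation procedure, the starting point should be carefully selected at each iteration. As suggested by Pelletier (2006), the optimisation procedure can be initialised at

\[
\hat C_l^{(m+1)} = \left[\mathrm{diag}\left\{\tilde C_l^{(m+1)}\right\}\right]^{-\frac{1}{2}} \tilde C_l^{(m+1)} \left[\mathrm{diag}\left\{\tilde C_l^{(m+1)}\right\}\right]^{-\frac{1}{2}}, \tag{B.6}
\]

\[
\tilde C_l^{(m+1)} = \frac{\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)} \left(\hat D_t^{(m+1)\,-2} \hat S_{t,l}^{(m)}\right)}{\sum_{t=1}^{T} \hat z_{t,l}^{(m)}}. \tag{B.7}
\]

CM–4 Concerning the autoregressive parameters $(a_l, A_l)$, they are updated by maximising the following quantity

\[
Q_{AR}\left(a_l, A_l, \Xi^{(m)}\right) \propto - \frac{1}{2}\sum_{l=1}^{L}\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)}\, \tilde\varepsilon_{t,l}' \hat\Sigma_{t,l}^{(m+1)\,-1} \tilde\varepsilon_{t,l},
\]

where $\hat\Sigma_{t,l}^{(m+1)} = \hat D_t^{(m+1)} \hat C_l^{(m+1)} \hat D_t^{(m+1)}$, with $\tilde\varepsilon_{t,l} = y_t - a_l - A_l y_{t-1}$, for $t = 1, 2, \dots, T$. The updates of the parameters $\left(\hat a_l, \hat A_l\right)$, for $l = 1, 2, \dots, L$, are available in closed form

\[
\hat a_l^{(m+1)} = \left[\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)} \hat\Sigma_{t,l}^{(m+1)\,-1}\right]^{-1} \times \left(\sum_{t=2}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)} \hat\Sigma_{t,l}^{(m+1)\,-1} \left(y_t - A_l y_{t-1}\right)\right),
\]

\[
\hat A_l^{(m+1)} = \left[\sum_{t=2}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)} \hat\Sigma_{t,l}^{(m+1)\,-1} y_{t-1} y_{t-1}'\right]^{-1} \times \left(\sum_{t=2}^{T} \hat z_{t,l}^{(m)} \hat\varpi_{t,l}^{(m)} \hat\Sigma_{t,l}^{(m+1)\,-1} \left(y_t - \hat a_l^{(m+1)}\right) y_{t-1}'\right).
\]

CM–5 The updated estimate $\nu_l^{(m+1)}$ does not exist in closed form but is given as the solution of the equation:

\[
-\psi\left(\frac{\nu_l}{2}\right) + \log\left(\frac{\nu_l}{2}\right) + 1 + \frac{\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \left(\log\left(\hat\varpi_{t,l}^{(m)}\right) - \hat\varpi_{t,l}^{(m)}\right)}{\sum_{t=1}^{T} \hat z_{t,l}^{(m)}} + \psi\left(\frac{\nu_l^{(m)} + p}{2}\right) - \log\left(\frac{\nu_l^{(m)} + p}{2}\right) = 0,
\]

where $\psi(\cdot)$ is the Digamma function. The solution can be determined by a bisection algorithm or quasi–Newton methods. As an alternative, we adopt the following approximation due to Shoham (2002):

\[
\nu_l^{(m+1)} = \frac{2}{h_l + \log(h_l) - 1} + a_0 \left[1 + \mathrm{erf}\left(a_1 \log\left(\frac{a_2}{h_l + \log(h_l) - 1}\right)\right)\right],
\]

with $a_0 = 0.0416$, $a_1 = 0.6594$, $a_2 = 2.1971$, where $\mathrm{erf}(\cdot)$ is the error function and

\[
h_l = - \frac{\sum_{t=1}^{T} \hat z_{t,l}^{(m)} \left[\psi\left(\frac{p + \nu_l^{(m)}}{2}\right) + \log\left(\frac{2}{\nu_l^{(m)} + d_M\left(y_t, \mu_{t,l}^{(m+1)}, \Sigma_{t,l}^{(m+1)}\right)}\right) - \hat\varpi_{t,l}^{(m)}\right]}{\sum_{t=1}^{T} \hat z_{t,l}^{(m)}},
\]

for $l = 1, 2, \dots, L$.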
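The two routes to the degrees–of–freedom update in CM–5 can be compared with a small, standard–library–only Python sketch. It rests on an assumption that should be checked against the paper: substituting the weight (B.2) into the CM–5 equation reduces it to log(ν/2) − ψ(ν/2) + 1 − h = 0, with h the quantity h_l defined above. The function names and the home–made digamma routine are illustrative, not the authors' code.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0.0:
        raise ValueError("digamma is implemented here for x > 0 only")
    acc = 0.0
    while x < 6.0:  # shift the argument upwards: psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series


def nu_bisection(h, lo=0.05, hi=1000.0, tol=1e-10):
    """Solve log(nu/2) - psi(nu/2) + 1 - h = 0 for nu by bisection."""
    def g(nu):
        return math.log(0.5 * nu) - digamma(0.5 * nu) + 1.0 - h

    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:  # g is decreasing in nu
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nu_shoham(h):
    """Closed-form approximation with the constants a0, a1, a2 of CM-5."""
    s = h + math.log(h) - 1.0
    return 2.0 / s + 0.0416 * (1.0 + math.erf(0.6594 * math.log(2.1971 / s)))


# Illustration: build the value of h implied by a known nu = 5, then recover nu.
nu_true = 5.0
h = 1.0 + math.log(0.5 * nu_true) - digamma(0.5 * nu_true)
print(nu_bisection(h), nu_shoham(h))
```

Bisection recovers the root up to the chosen tolerance, while the approximation trades a small error for a single function evaluation per state and iteration.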

C Score and information matrix

In this brief appendix we detail the procedure used to obtain the score and Hessian matrix of the estimated parameters.

C.1 The score vector

The EM algorithm can be exploited to obtain analytical formulae for the score function $\nabla_t(\phi, y_t) \equiv \partial \ell_t\left(y_t \mid y_{1:t-1}, \phi\right) / \partial\phi$, where $\phi$ denotes the vector of model parameters $\phi = \left(\left\{a_l', \mathrm{vec}(A_l)', \mathrm{vech}(C_l)', \nu_l\right\}_{l=1}^{L}, \mathrm{vech}(Q)', \delta', \vartheta'\right)'$ and $\ell_t(\cdot)$ is the log–likelihood at time $t = 1, 2, \dots, T$. The method exploits the following decomposition of the joint log–density of the observed process $y_t$ and the latent factors $(z_{t,l}, \varpi_{t,l})$, given the past information $y_{1:t-1}$ and the parameters $\phi$:

\[
\begin{aligned}
\ell_t\left(y_t, z_{t,l}, \varpi_{t,l} \mid y_{1:t-1}, \phi\right) &= \ell_t\left(y_t \mid y_{1:t-1}, z_{t,l}, \varpi_{t,l}, \phi\right) + \ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t-1}, \phi\right) \\
&= \ell_t\left(y_t \mid y_{1:t-1}, \phi\right) + \ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t}, \phi\right),
\end{aligned}
\tag{C.1}
\]

where $\ell_t\left(y_t \mid y_{1:t-1}, z_{t,l}, \varpi_{t,l}, \phi\right)$ is the conditional log–density of the observed process $y_t$ given the past history of the process itself $y_{1:t-1}$, the latent factors $(z_{t,l}, \varpi_{t,l})$ and the parameters $\phi$; $\ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t-1}, \phi\right)$ is the conditional log–density of the latent factors given the observations $y_{1:t-1}$ and the parameters $\phi$; and $\ell_t\left(y_t \mid y_{1:t-1}, \phi\right)$ and $\ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t}, \phi\right)$ are the conditional log–densities of the observable and unobservable processes, respectively. By differentiating both sides of the identity in equation (C.1) with respect to the vector of parameters $\phi$ and taking expectations of both sides with respect to the joint density of the missing factors, we end up with the following expression for the observed score

\[
\nabla_t(\phi, y_t) = E\left[\frac{\partial \ell_t\left(y_t \mid y_{1:t-1}, z_{t,l}, \varpi_{t,l}, \phi\right)}{\partial\phi} \,\Big|\, Y_{1:T}\right] + E\left[\frac{\partial \ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t-1}, \phi\right)}{\partial\phi} \,\Big|\, Y_{1:T}\right],
\]

because $E\left[\frac{\partial \ell_t\left(z_{t,l}, \varpi_{t,l} \mid y_{1:t}, \phi\right)}{\partial\phi} \,\big|\, Y_{1:T}\right] = 0$ and $\nabla_t(\phi, y_t) = E\left[\frac{\partial \ell_t\left(y_t \mid y_{1:t-1}, \phi\right)}{\partial\phi} \,\big|\, Y_{1:T}\right]$; see Tanner (1996), Louis (1982) and McLachlan and Krishnan (2007). In this way, we decompose the observed score $\nabla_t(\phi, y_t)$ as the sum of the expected value of the score of a multivariate Normal log–density function and the expected value of the score of the distribution of the mixing variables $(z_t, \varpi_t)$, which can be easily calculated analytically. For example, if the component density is a multivariate Student–t distribution, then the conditional distribution of the latent factor $\varpi_t$ is

\[
p\left(\varpi_t \mid y_{1:T}, z_{t,l} = 1, \phi\right) \sim \mathcal{G}\left(\frac{\nu_l + p}{2}, \frac{\nu_l + d_M\left(y_t, \mu_{t,l}, \Sigma_{t,l}\right)}{2}\right), \tag{C.2}
\]

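where $\mu_{t,l}$ and $\Sigma_{t,l}$ are the model parameters conditional on the information up to time t, i.e., $\mu_{t,l} = a_l + A_l y_{t-1}$ and $\Sigma_{t,l} = D_t C_l D_t$, for $t = 1, 2, \dots, T$ and $\forall l = 1, 2, \dots, L$, and $d_M(y_t, \mu_t, \Sigma_t)$ denotes the Mahalanobis distance. The marginal distribution of $z_{t,l}$ is Multinomial, with parameters that can be calculated using the FFBS recursive algorithm; see, e.g., Baum et al. (1970) and Frühwirth-Schnatter (2006). The score vector of the VAR–HMM model is

\[
\frac{\partial \log L_c(\Xi)}{\partial \mu_l} = \sum_{t=1}^{T} \hat z_{t,l} \hat\varpi_{t,l} \Sigma_{t,l}^{-1} \left(y_t - \mu_l - A_l y_{t-1}\right) \tag{C.3}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial A_l} = \sum_{t=1}^{T} \hat z_{t,l} \hat\varpi_{t,l} \Sigma_{t,l}^{-1} \left(y_t - \mu_l - A_l y_{t-1}\right) y_{t-1}' \tag{C.4}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial C_l} = -\frac{1}{2}\sum_{t=1}^{T} \hat z_{t,l} C_l^{-1} + \frac{1}{2}\sum_{t=1}^{T} \hat z_{t,l} \hat\varpi_{t,l} C_l^{-1} \varepsilon_{t,l} \varepsilon_{t,l}' C_l^{-1} \tag{C.5}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial \delta_l} = \frac{\hat z_{1,l}}{\delta_l} \tag{C.6}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial q_{l,k}} = \frac{\widehat{zz}_{t,l,k}}{q_{l,k}} \tag{C.7}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial \nu_l} = \frac{1}{2}\log\left(\frac{\nu_l}{2}\right) + \frac{1}{2} - \frac{1}{2}\Psi\left(\frac{\nu_l}{2}\right) + \frac{1}{2}\Psi\left(\nu_l + p\right) - \frac{1}{2}\log\left(\nu_l + d_M\left(y_t, \mu_{t,l}, \Sigma_{t,l}\right)\right) \tag{C.8}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial \omega} = -\sum_{t=1}^{T}\sum_{l=1}^{L} \left(\frac{\hat z_{t,l}}{\sigma_{j,t}} - \hat z_{t,l} \hat\varpi_{t,l} \tilde D_t^{(\omega)} C_l^{-1} S_{t,l}\right) \tag{C.9}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial \alpha} = -\sum_{t=1}^{T}\sum_{l=1}^{L} \left(\frac{\hat z_{t,l} |y_{j,t-1}|}{\sigma_{j,t}} - \hat z_{t,l} \hat\varpi_{t,l} \tilde D_t^{(\alpha)} C_l^{-1} S_{t,l}\right) \tag{C.10}
\]

\[
\frac{\partial \log L_c(\Xi)}{\partial \beta} = -\sum_{t=1}^{T}\sum_{l=1}^{L} \left(\frac{\hat z_{t,l} \sigma_{j,t-1}}{\sigma_{j,t}} - \hat z_{t,l} \hat\varpi_{t,l} \tilde D_t^{(\beta)} C_l^{-1} S_{t,l}\right), \tag{C.11}
\]

where

\[
\tilde D_t^{(\omega)} = \mathrm{diag}\left\{-2\sigma_{1,t}^{-3}, -2\sigma_{2,t}^{-3}, \dots, -2\sigma_{p,t}^{-3}\right\} \tag{C.12}
\]

\[
\tilde D_t^{(\alpha)} = \mathrm{diag}\left\{-2\sigma_{1,t}^{-3} |y_{1,t-1}|, -2\sigma_{2,t}^{-3} |y_{2,t-1}|, \dots, -2\sigma_{p,t}^{-3} |y_{p,t-1}|\right\} \tag{C.13}
\]

\[
\tilde D_t^{(\beta)} = \mathrm{diag}\left\{-2\sigma_{1,t}^{-3} \sigma_{1,t-1}, -2\sigma_{2,t}^{-3} \sigma_{2,t-1}, \dots, -2\sigma_{p,t}^{-3} \sigma_{p,t-1}\right\}, \tag{C.14}
\]

and $S_{t,l} = \left(y_t - \mu_{t,l}\right)\left(y_t - \mu_{t,l}\right)'$, $\Sigma_{t,l} = D_t C_l D_t$ and $\mu_{t,l} = a_l + A_l y_{t-1}$, for $t = 1, 2, \dots, T$ and $\forall l = 1, 2, \dots, L$, and $d_M(x_t, \mu_t, \Sigma_t)$ denotes the Mahalanobis distance.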

C.2 The information matrix

Under standard regularity conditions, the score vector $\nabla_t(\phi, y_t)$ evaluated at the true parameter vector $\phi_0$ has the martingale difference property; therefore the maximum likelihood estimator will be consistent and asymptotically normally distributed, with asymptotic variance–covariance matrix given by the inverse of the information matrix

\[
I(\phi_0) = \operatorname*{p\,lim}_{T \to \infty} \frac{1}{T}\sum_{t=1}^{T} \nabla_t(\phi_0, y_t) \nabla_t(\phi_0, y_t)' = E\left(\nabla_t(\phi_0, y_t) \nabla_t(\phi_0, y_t)'\right), \tag{C.15}
\]

which is not available in closed form (see Fiorentini et al. 2003) and can be consistently estimated as

\[
\hat I\left(\hat\phi\right) = \frac{1}{T}\sum_{t=1}^{T} \nabla_t\left(\hat\phi, y_t\right) \nabla_t\left(\hat\phi, y_t\right)',
\]

where $\nabla_t\left(\hat\phi, y_t\right)$ is the observed score at time t evaluated at $\hat\phi$, the maximum likelihood estimate of $\phi$.
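The outer–product estimator of the information matrix can be sketched as follows. The function name is hypothetical and the toy data are invented purely for illustration; the per–period score vectors are assumed to be available from the formulae in Appendix C.1.

```python
def opg_information(scores):
    """Average outer product of the per-period score vectors.

    scores[t] is the score vector at time t evaluated at the ML estimate
    (assumed to be supplied by the caller).
    """
    n_obs = len(scores)
    if n_obs == 0:
        raise ValueError("at least one score vector is required")
    k = len(scores[0])
    info = [[0.0] * k for _ in range(k)]
    for g in scores:
        if len(g) != k:
            raise ValueError("all score vectors must have the same length")
        for i in range(k):
            for j in range(k):
                info[i][j] += g[i] * g[j] / n_obs
    return info


# Toy scalar case (made-up data): Gaussian mean with unit variance,
# for which the score at time t is y_t minus the estimated mean.
y = [0.3, -1.2, 0.8, 1.9, -0.4, -1.4]
mu_hat = sum(y) / len(y)
scores = [[y_t - mu_hat] for y_t in y]
info = opg_information(scores)
std_err = (1.0 / (len(y) * info[0][0])) ** 0.5
print(info, std_err)
```

Standard errors follow from the square roots of the diagonal of the inverse estimate divided by T; the scalar case above avoids an explicit matrix inversion.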

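D Risk measures

Proof of Proposition 3.1. Let Y be a univariate Student–t mixture defined as in Proposition 3.1, then the τ–level TCE of Y is

\[
\begin{aligned}
\mathrm{TCE}_Y\left(\hat y^\tau\right) &= E\left(Y \mid Y \le \hat y^\tau\right) \\
&= \frac{1}{P\left(Y \le \hat y^\tau\right)} \int_{-\infty}^{\hat y^\tau} y \left[\sum_{l=1}^{L} \eta_l\, \mathcal{T}\left(y \mid \mu_l, \sigma_l^2, \nu_l\right)\right] dy \\
&= \sum_{l=1}^{L} \frac{\eta_l F_Y\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right)}{P\left(Y \le \hat y^\tau\right)}\, \mathrm{TCE}_{Y,l}\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right) \\
&= \sum_{l=1}^{L} \pi_l\, \mathrm{TCE}_{Y,l}\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right),
\end{aligned}
\]

where $P\left(Y \le \hat y^\tau\right) = \sum_{l=1}^{L} \eta_l F_Y\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right)$, with $F_Y\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right)$ and $\pi_l$, $\forall l = 1, 2, \dots, L$, defined as in Proposition 3.1. The TCE of each mixture component $\mathrm{TCE}_{Y,l}\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right)$, for $l = 1, 2, \dots, L$, can be evaluated as

\[
\mathrm{TCE}_{Y,l}\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right) = \frac{1}{F_Y\left(\hat y^\tau, \mu_l, \sigma_l^2, \nu_l\right)} \int_{-\infty}^{\hat y^\tau} y\, \frac{\Gamma\left(\frac{\nu_l + 1}{2}\right)}{\Gamma\left(\frac{\nu_l}{2}\right) \sqrt{\sigma_l^2 \pi \nu_l}} \left[1 + \frac{(y - \mu_l)^2}{\nu_l \sigma_l^2}\right]^{-\frac{\nu_l + 1}{2}} dy,
\]

and follows from standard integration results.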

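Proposition D.1 (TCE for multivariate Gaussian distributions). Let Y be a multivariate Gaussian random variable of dimension d, i.e. $Y \sim N_d(\mu, \Lambda C \Lambda)$ with $\Lambda = \mathrm{Diag}\{\sigma_1, \sigma_2, \dots, \sigma_d\}$ and correlation matrix C, then the multivariate tail conditional expectation of Y, i.e. the mean of Y truncated below the threshold $\tilde y$, is

\[
\mathrm{TCE}_Y\left(\tilde y, \mu, \Lambda, C\right) = \mu + \frac{\Lambda C \hat\phi_{\tilde z}}{\Phi\left(\tilde z\right)} \tag{D.1}
\]

with

\[
\hat\phi_{\tilde z} = \begin{pmatrix} \phi\left(\tilde z_1\right) \Phi_{-1}\left(\tilde z_1\right) \\ \phi\left(\tilde z_2\right) \Phi_{-2}\left(\tilde z_2\right) \\ \vdots \\ \phi\left(\tilde z_d\right) \Phi_{-d}\left(\tilde z_d\right) \end{pmatrix} \tag{D.2}
\]

where $\tilde z = \Lambda^{-1}\left(\tilde y - \mu\right)$, $\phi(\cdot)$ denotes the pdf of the standardized Gaussian distribution and

\[
\Phi_{-j}\left(\tilde z_j\right) = \int_{z \le \tilde z} \phi(z)\, dz, \quad \forall j = 1, 2, \dots, d, \tag{D.3}
\]

and $\bar z = \Omega_{22,j}^{-\frac{1}{2}} \left(\bar z_{2,j} - \Omega_{22,j} C_{22,j}^{-1} C_{21,j} C_{1|2,j}^{-1} \tilde z_j\right)$, $\Omega_{22,j} = \left[C_{22,j} - C_{21,j} C_{12,j}\right]^{-1}$.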

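Proof. Let $Z \sim N_d(0, C)$ be a d–dimensional Gaussian random variable and consider $Y = \mu + \Lambda Z$; then Y has a Gaussian distribution, i.e. $Y \sim N_d(\mu, \Lambda C \Lambda)$, and the TCE of Y is

\[
\mathrm{TCE}_Y\left(\tilde y, \mu, \Lambda, C\right) = E\left(Y \mid Y \le \tilde y\right) = \mu + \Lambda E\left(Z \mid Z \le \tilde z\right) = \mu + \Lambda\, \mathrm{TCE}_Z\left(\tilde z, C\right), \tag{D.4}
\]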

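where $\mathrm{TCE}_Z\left(\tilde z, C\right)$ is the TCE of the Gaussian distribution $Z \sim N_d(0, C)$, and can be evaluated as follows

\[
\mathrm{TCE}_Z\left(\tilde z, C\right) = E\left(z, z \le \tilde z\right) \equiv \begin{pmatrix} E\left(z_1, z \le \tilde z\right) \\ E\left(z_2, z \le \tilde z\right) \\ \vdots \\ E\left(z_d, z \le \tilde z\right) \end{pmatrix}. \tag{D.5}
\]

Let us consider $E\left(z_j, z \le \tilde z\right)$, $\forall j = 1, 2, \dots, d$; we have

\[
E\left(z_j, z \le \tilde z\right) = \int_{Z_{-j} \le \tilde z_{-j}} \underbrace{\left[\int_{Z_j \le \tilde z_j} z_j\, \phi\left(z_j, \mu_{[j|-j]}, \sigma^2_{[j|-j]}\right) dz_j\right]}_{\text{Integral A}} \times\, \phi_{d-1}\left(z_{-j}, 0, C_{-j,-j}\right) dz_{-j}, \tag{D.6}
\]

where $\mu_{[j|-j]} = C_{[j,-j]} C^{-1}_{[-j,-j]} z_{-j}$ and $\sigma^2_{[j|-j]} = 1 - C_{[j,-j]} C^{-1}_{[-j,-j]} C_{[-j,j]}$. Let us now consider the integral A:

\[
\int_{Z_j \le \tilde z_j} z_j\, \phi\left(z_j, \mu_{[j|-j]}, \sigma^2_{[j|-j]}\right) dz_j,
\]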

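applying the transformation $t = \frac{z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}$ we get:

\[
\int_{Z_j \le \tilde z_j} z_j\, \phi\left(z_j, \mu_{[j|-j]}, \sigma^2_{[j|-j]}\right) dz_j = -\sigma_{[j|-j]}\, \phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}, 0, 1\right) + C_{j,-j} C^{-1}_{-j,-j} z_{-j}\, \Phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}, 0, 1\right).
\]

Plugging this last expression into equation (D.6) we obtain

\[
\begin{aligned}
E\left(z_j, z \le \tilde z\right) ={} & -\sigma_{[j|-j]} \underbrace{\int_{Z_{-j} \le \tilde z_{-j}} \phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j}}_{\text{Integral B}} \\
& + C_{j,-j} C^{-1}_{-j,-j} \int_{Z_{-j} \le \tilde z_{-j}} z_{-j}\, \Phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}, 0, 1\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j}.
\end{aligned}
\]

Considering now the integral B

\[
\begin{aligned}
& \int_{Z_{-j} \le \tilde z_{-j}} \phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j} \\
& \quad = \int_{Z_{-j} \le \tilde z_{-j}} \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}\right)^2\right\} \frac{1}{(2\pi)^{\frac{d-1}{2}} |C_{[-j,-j]}|^{\frac{1}{2}}} \exp\left\{-\frac{1}{2} z_{-j}' C^{-1}_{[-j,-j]} z_{-j}\right\} dz_{-j},
\end{aligned}
\tag{D.7}
\]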

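and completing the square in the following way

\[
\begin{aligned}
& z_{-j}' C^{-1}_{[-j,-j]} z_{-j} + \left(\tilde z_j - C_{[j,-j]} C^{-1}_{[-j,-j]} z_{-j}\right)' \sigma^{-1}_{[j|-j]} \left(\tilde z_j - C_{[j,-j]} C^{-1}_{[-j,-j]} z_{-j}\right) \\
& \quad = \tilde z_j' \tilde z_j + \left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} \sigma^{-1}_{[j|-j]} \tilde z_j\right)' C^{-1}_{[-j|j]} \left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} \sigma^{-1}_{[j|-j]} \tilde z_j\right)
\end{aligned}
\]

it becomes

\[
\begin{aligned}
& \int_{Z_{-j} \le \tilde z_{-j}} \phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j} \\
& \quad = (2\pi)^{-\frac{d}{2}} |C_{[-j,-j]}|^{-\frac{1}{2}} \exp\left\{-\frac{1}{2} \tilde z_j' \tilde z_j\right\} \int_{Z_{-j} \le \tilde z_{-j}} \exp\left\{-\frac{1}{2}\left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[j|-j]} \tilde z_j\right)' C^{-1}_{[-j|j]} \left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[j|-j]} \tilde z_j\right)\right\} dz_{-j} \\
& \quad = \frac{\phi\left(\tilde z_j\right)}{(2\pi)^{\frac{d-1}{2}} |C_{[-j,-j]}|^{\frac{1}{2}}} \int_{Z_{-j} \le \tilde z_{-j}} \exp\left\{-\frac{1}{2}\left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[j|-j]} \tilde z_j\right)' C^{-1}_{[-j|j]} \left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[j|-j]} \tilde z_j\right)\right\} dz_{-j}.
\end{aligned}
\tag{D.8}
\]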

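Considering the transformation $t = C_{[-j|j]}^{-\frac{1}{2}} \left(z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)$ and defining $\tilde t \equiv C_{[-j|j]}^{-\frac{1}{2}} \left(\tilde z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)$, the previous integral B becomes

\[
\begin{aligned}
& \int_{Z_{-j} \le \tilde z_{-j}} \phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j} \\
& \quad = \frac{\phi\left(\tilde z_j\right)}{|C_{[-j,-j]}|^{\frac{1}{2}}}\, |C_{[-j|j]}|^{\frac{1}{2}} \int_{t \le \tilde t} \frac{1}{(2\pi)^{\frac{d-1}{2}}} \exp\left\{-\frac{1}{2} t' t\right\} dt \\
& \quad = \frac{\phi\left(\tilde z_j\right)}{|C_{[-j,-j]}|^{\frac{1}{2}}}\, |C_{[-j|j]}|^{\frac{1}{2}}\, \Phi\left(C_{[-j|j]}^{-\frac{1}{2}} \left(\tilde z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)\right).
\end{aligned}
\tag{D.9}
\]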

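Let us now consider the last part of the integral in equation (D.7)

\[
\begin{aligned}
& \int_{Z_{-j} \le \tilde z_{-j}} z_{-j}\, \Phi\left(\frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}, 0, 1\right) \phi_{d-1}\left(z_{-j}, 0, C_{[-j,-j]}\right) dz_{-j} \\
& \quad = \int_{Z_{-j} \le \tilde z_{-j}} z_{-j} \int_{t \le \frac{\tilde z_j - \mu_{[j|-j]}}{\sigma_{[j|-j]}}} \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2} t^2\right\} \frac{1}{(2\pi)^{\frac{d-1}{2}}} \exp\left\{-\frac{1}{2} z_{-j}' C^{-1}_{[-j,-j]} z_{-j}\right\} dt\, dz_{-j} \\
& \quad = \int_{Z_{-j} \le \tilde z_{-j}} \int_{Z_j \le \tilde z_j} z_{-j}\, \frac{1}{\sigma_{[j|-j]} \sqrt{2\pi}} \exp\left\{-\frac{\left(z_j - \mu_{[j|-j]}\right)^2}{2\sigma^2_{[j|-j]}}\right\} \frac{1}{(2\pi)^{\frac{d-1}{2}}} \exp\left\{-\frac{1}{2} z_{-j}' C^{-1}_{[-j,-j]} z_{-j}\right\} dz_j\, dz_{-j} \\
& \quad = \frac{1}{\sigma_{[j|-j]}} \int_{Z \le \tilde z} z_{-j}\, \frac{1}{(2\pi)^{\frac{d}{2}}} \exp\left\{-\frac{1}{2} z' C^{-1} z\right\} dz \\
& \quad = \frac{|C|^{\frac{1}{2}}}{\sigma_{[j|-j]}} \int_{Z \le \tilde z} z_{-j}\, \frac{\exp\left\{-\frac{1}{2} z' C^{-1} z\right\}}{|C|^{\frac{1}{2}} (2\pi)^{\frac{d}{2}}}\, dz \\
& \quad = \frac{|C|^{\frac{1}{2}}}{\sigma_{[j|-j]}}\, E\left(z_{-j}, Z \le \tilde z\right) \\
& \quad = |C_{[-j,-j]}|^{\frac{1}{2}}\, E\left(z_{-j}, Z \le \tilde z\right),
\end{aligned}
\]

because $|C| = |C_{[-j,-j]}|\, \sigma^2_{[j|-j]}$. Concluding, we have that:

\[
\begin{aligned}
E\left(z_j, z \le \tilde z\right) ={} & -\sigma_{[j,-j]}\, \phi\left(\tilde z_j\right) \frac{|C_{[-j|j]}|^{\frac{1}{2}}}{|C_{[-j|-j]}|^{\frac{1}{2}}}\, \Phi\left(C_{[-j|j]}^{-\frac{1}{2}} \left(\tilde z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)\right) \\
& + C_{[j,-j]} C^{-1}_{[-j,-j]} |C_{[-j,-j]}|^{\frac{1}{2}}\, E\left(z_{-j}, Z \le \tilde z\right),
\end{aligned}
\tag{D.10}
\]

for $j = 1, 2, \dots, d$. Let

\[
\hat z_j \equiv E\left(z_j, z \le \tilde z\right), \qquad \hat z_{-j} \equiv E\left(z_{-j}, z \le \tilde z\right); \tag{D.11}
\]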

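rewriting the previous equation (D.10) as

\[
\hat z_j - C_{[j,-j]} C^{-1}_{[-j,-j]} |C_{[-j,-j]}|^{\frac{1}{2}}\, \hat z_{-j} = -\sigma_{[j,-j]}\, \phi\left(\tilde z_j\right) \frac{|C_{[-j|j]}|^{\frac{1}{2}}}{|C_{[-j|-j]}|^{\frac{1}{2}}}\, \Phi\left(C_{[-j|j]}^{-\frac{1}{2}} \left(\tilde z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)\right) \tag{D.12}
\]

we get a system of d equations with d unknowns, $A \hat z = b$, where the matrix A has diagonal elements $a_{i,i} = 1$, for $i = 1, 2, \dots, d$, and off–diagonal elements $a_{i,j}$, $i \ne j$, being the j–th element of the (d − 1)–dimensional vector $-C_{[i,-i]} C^{-1}_{[-i,-i]} |C_{[i,-i]}|^{\frac{1}{2}}$, and the vector b has the j–th generic element equal to

\[
b_j = -\sigma_{[j,-j]}\, \phi\left(\tilde z_j\right) \frac{|C_{[-j|j]}|^{\frac{1}{2}}}{|C_{[-j|-j]}|^{\frac{1}{2}}}\, \Phi\left(C_{[-j|j]}^{-\frac{1}{2}} \left(\tilde z_{-j} - C_{[-j|j]} C^{-1}_{[-j,-j]} C_{[-j,j]} C^{-1}_{[-j|j]} \tilde z_j\right)\right) \tag{D.13}
\]

for $j = 1, 2, \dots, d$. Solving the previous system of equations completes the proof. Without loss of generality we can consider the case where d = 2, where

\[
\begin{aligned}
\hat z = E\left(z \mid z \le \tilde z\right) &= A^{-1} b \\
&= \frac{1}{1 - \rho^2} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \begin{pmatrix} -\left(1 - \rho^2\right) \phi\left(\tilde z_1\right) \Phi\left(\frac{\tilde z_2 - \rho \tilde z_1}{\sqrt{1 - \rho^2}}\right) \\ -\left(1 - \rho^2\right) \phi\left(\tilde z_2\right) \Phi\left(\frac{\tilde z_1 - \rho \tilde z_2}{\sqrt{1 - \rho^2}}\right) \end{pmatrix} \\
&= \begin{pmatrix} -\phi\left(\tilde z_1\right) \Phi\left(\frac{\tilde z_2 - \rho \tilde z_1}{\sqrt{1 - \rho^2}}\right) - \rho\, \phi\left(\tilde z_2\right) \Phi\left(\frac{\tilde z_1 - \rho \tilde z_2}{\sqrt{1 - \rho^2}}\right) \\ -\phi\left(\tilde z_2\right) \Phi\left(\frac{\tilde z_1 - \rho \tilde z_2}{\sqrt{1 - \rho^2}}\right) - \rho\, \phi\left(\tilde z_1\right) \Phi\left(\frac{\tilde z_2 - \rho \tilde z_1}{\sqrt{1 - \rho^2}}\right) \end{pmatrix} \\
&= C \begin{pmatrix} -\phi\left(\tilde z_1\right) \Phi\left(\frac{\tilde z_2 - \rho \tilde z_1}{\sqrt{1 - \rho^2}}\right) \\ -\phi\left(\tilde z_2\right) \Phi\left(\frac{\tilde z_1 - \rho \tilde z_2}{\sqrt{1 - \rho^2}}\right) \end{pmatrix},
\end{aligned}
\tag{D.14}
\]

with $\rho = C_{[1,2]} C^{-1}_{[2,2]} |C_{[1,2]}|^{\frac{1}{2}}$.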
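The bivariate expression in (D.14) can be checked numerically with a short Python sketch. It is an illustration under stated assumptions rather than the authors' code: the closed form is read as the partial moment E[z_i 1{z ≤ z̃}], so that the tail conditional expectation is obtained by dividing by the joint probability of the event, and the brute–force midpoint integration (with a hypothetical lower cut–off of −8) is used only as a cross–check.

```python
import math


def phi(x):
    """Standard Gaussian pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard Gaussian cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def partial_moments(a, b, rho):
    """Closed form of E[z_i 1{z1 <= a, z2 <= b}], i = 1, 2, as in (D.14)."""
    s = math.sqrt(1.0 - rho * rho)
    u1 = -phi(a) * Phi((b - rho * a) / s)
    u2 = -phi(b) * Phi((a - rho * b) / s)
    # Pre-multiplication by the 2x2 correlation matrix C.
    return (u1 + rho * u2, u2 + rho * u1)


def midpoint_check(a, b, rho, lower=-8.0, n=300):
    """Brute-force midpoint rule; returns (m1, m2, probability)."""
    det = 1.0 - rho * rho
    norm = 1.0 / (2.0 * math.pi * math.sqrt(det))
    h1 = (a - lower) / n
    h2 = (b - lower) / n
    m1 = m2 = prob = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * h1
        for j in range(n):
            y = lower + (j + 0.5) * h2
            w = norm * math.exp(-(x * x - 2.0 * rho * x * y + y * y) / (2.0 * det)) * h1 * h2
            m1 += x * w
            m2 += y * w
            prob += w
    return m1, m2, prob


# Hypothetical thresholds and correlation.
a, b, rho = -1.0, -0.5, 0.6
closed = partial_moments(a, b, rho)
m1, m2, prob = midpoint_check(a, b, rho)
tce = (closed[0] / prob, closed[1] / prob)  # conditional means given z <= z_tilde
print(closed, (m1, m2), tce)
```

With zero correlation the first partial moment collapses to −φ(a)Φ(b), which gives a quick sanity check of the formula.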



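Proposition D.2 (TCE for multivariate Student–t distributions). Let Y be a multivariate Student–t random variable, i.e. $Y \sim \mathcal{T}_d(\mu, \Lambda C \Lambda, \nu)$ with $\Lambda = \mathrm{Diag}\{\sigma_1, \sigma_2, \dots, \sigma_d\}$, correlation matrix C and degrees of freedom $\nu$, then the multivariate tail conditional expectation of Y is

\[
E\left(Y, Y \le \tilde y\right) = \mu + \Lambda C \hat\phi_{\tilde z} \tag{D.15}
\]

with

\[
\hat\phi_{\tilde z} = \begin{pmatrix} \phi\left(\tilde z_1\right) \Phi_{-1}\left(\tilde z_1\right) \\ \phi\left(\tilde z_2\right) \Phi_{-2}\left(\tilde z_2\right) \\ \vdots \\ \phi\left(\tilde z_d\right) \Phi_{-d}\left(\tilde z_d\right) \end{pmatrix} \tag{D.16}
\]

where $\tilde z = \Lambda^{-1}\left(\tilde y - \mu\right)$, $\phi(\cdot)$ denotes the pdf of the standardized Student–t distribution and

\[
\Phi_{-j}\left(\tilde z_j\right) = \int_{z \le \tilde z} \phi(z)\, dz, \quad \forall j = 1, 2, \dots, d, \tag{D.17}
\]

and $\bar z = \Omega_{22,j}^{-\frac{1}{2}} \left(\bar z_{2,j} - \Omega_{22,j} C_{22,j}^{-1} C_{21,j} C_{1|2,j}^{-1} \tilde z_j\right)$, $\Omega_{22,j} = \left[C_{22,j} - C_{21,j} C_{12,j}\right]^{-1}$.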

Proof. The proof is exactly as that reported for the Gaussian case, with the only exception that here we exploit the scale representation of the Student–t distribution in equation (2.3).

Proposition D.3 (TCE for multivariate Gaussian and Student–t mixtures). Let Y be a multivariate Gaussian (or Student–t) mixture, i.e. $Y \sim \sum_{l=1}^{L} \eta_l\, \mathcal{N}\left(y \mid \mu_l, \Sigma_l\right)$ (or $Y \sim \sum_{l=1}^{L} \eta_l\, \mathcal{T}\left(y \mid \mu_l, \Sigma_l, \nu_l\right)$), then the tail conditional expectation of Y is a convex linear combination of the tail conditional expectations of the components:

\[
\mathrm{TCE}_Y\left(\tilde y, L\right) = \sum_{l=1}^{L} \pi_l\, \mathrm{TCE}_l\left(\tilde y\right) \tag{D.18}
\]

where the weights are

\[
\pi_l = \frac{\eta_l\, \Phi\left(\tilde y, \mu_l, \Sigma_l\right)}{\sum_{l=1}^{L} \eta_l\, \Phi\left(\tilde y, \mu_l, \Sigma_l\right)}
\]

in the Gaussian case, and

\[
\pi_l = \frac{\eta_l\, t\left(\tilde y, \mu_l, \Sigma_l, \nu_l\right)}{\sum_{l=1}^{L} \eta_l\, t\left(\tilde y, \mu_l, \Sigma_l, \nu_l\right)}
\]

in the Student–t case, $l = 1, 2, \dots, L$, with $\sum_{l=1}^{L} \pi_l = 1$, and $\Phi(\cdot)$ and $t(\cdot)$ denote the Gaussian and Student–t cdf, respectively.

Proof. See Bernardi (2013) for a similar proof involving Skew Normal mixtures.
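The convex–combination structure of Proposition D.3 can be illustrated in the simplest setting, a univariate Gaussian mixture, where every ingredient is available from the Python standard library. The function names and parameter values below are hypothetical; the sketch uses the textbook result that a Gaussian variable truncated from above at y has mean µ − σφ(z)/Φ(z), with z = (y − µ)/σ.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_tce(y, mu, sigma):
    """E(Y | Y <= y) for a univariate Gaussian component."""
    z = (y - mu) / sigma
    return mu - sigma * norm_pdf(z) / norm_cdf(z)


def mixture_tce(y, eta, mu, sigma):
    """Mixture TCE as a weighted sum of component TCEs.

    The weights are pi_l = eta_l * F_l(y) / sum_l eta_l * F_l(y).
    Note: thresholds so extreme that every cdf underflows to zero are
    not handled in this sketch.
    """
    tail = [e * norm_cdf((y - m) / s) for e, m, s in zip(eta, mu, sigma)]
    total = sum(tail)
    weights = [w / total for w in tail]
    tce = sum(w * gaussian_tce(y, m, s) for w, m, s in zip(weights, mu, sigma))
    return tce, weights


# Hypothetical two-regime example: a calm state and a rare volatile state.
tce, weights = mixture_tce(-2.0, eta=[0.9, 0.1], mu=[0.0, 0.0], sigma=[1.0, 3.0])
print(tce, weights)
```

In the example the volatile state has a prior weight of 0.1 but a much larger tail weight, which is the mechanism by which a high–volatility regime dominates the mixture risk measure at extreme confidence levels.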



Acknowledgements. This research is supported by the Italian Ministry of Research PRIN 2013–2015, “Multivariate Statistical Methods for Risk Assessment” (MISURA), by the “Carlo Giannini Research Fellowship”, the “Centro Interuniversitario di Econometria” (CIdE) and “UniCredit Foundation”, and by the 2011 Sapienza University of Rome Research Project.


References

Acerbi, C. (2002). Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance, 26(7):1505–1518.

Acerbi, C. and Tasche, D. (2002). Expected shortfall: A natural coherent alternative to value at risk. Economic Notes, 31(2):379–388.

Acharya, V., Engle, R., and Richardson, M. (2012). Capital shortfall: a new approach to ranking and regulating systemic risks. American Economic Review, 102:59–64.

Acharya, V., Pedersen, L., Philippon, T., and Richardson, M. (2010). Measuring systemic risk. In Acharya, V., Cooley, T., Richardson, M., and Walter, I., editors, Regulating Wall Street: The Dodd–Frank Act and the New Architecture of Global Finance. John Wiley & Sons.

Acharya, V. and Richardson, M. (2009). Restoring financial stability: how to repair a failed system. John Wiley & Sons.

Adams, Z., Fuss, R., and Gropp, R. (2014). Spillover effects among financial institutions: A state-dependent sensitivity value-at-risk approach. Journal of Financial and Quantitative Analysis, 49(3):575–598.

Adrian, T. and Brunnermeier, M. (2011). CoVaR. Working paper.

Adrian, T. and Brunnermeier, M. K. (2016). CoVaR. American Economic Review, 106(7):1705–41.

Ang, A. and Bekaert, G. (2004). How regimes affect asset allocation. Financial Analysts Journal, 60(2):86–99.

Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D. (1999). Coherent measures of risk. Mathematical finance, 9(3):203–228.

Banzhaf, J. F. (1965). Weighted voting doesn’t work: A mathematical analysis. Rutgers L. Rev., 19:317.

Bartolucci, F. and Farcomeni, A. (2009). A multivariate extension of the dynamic logit model for longitudinal data based on a latent markov heterogeneity structure. Journal of the American Statistical Association, 104(486):816–831.

Bartolucci, F. and Farcomeni, A. (2010). A note on the mixture transition distribution and hidden markov models. Journal of Time Series Analysis, 31(2):132–138.

Baum, L., Petrie, T., Soules, G., and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of markov chains. Ann. Math. Statist., 41:164.

Benoit, S., Colliard, J.-E., Hurlin, C., and Pérignon, C. (2016). Where the risks lie: A survey on systemic risk. Review of Finance.

Bernal, O., Gnabo, J.-Y., and Guilmin, G. (2014). Assessing the contribution of banks, insurance and other financial services to systemic risk. Journal of Banking & Finance, 47:270–287.

Bernardi, M. (2013). Risk measures for skew normal mixtures. Statistics & Probability Letters, 83:1819–1824.

Bernardi, M. and Catania, L. (2015). Switching-gas copula models with application to systemic risk. arXiv preprint arXiv:1504.03733.

Bernardi, M., Durante, F., and Jaworski, P. (2017). CoVaR of families of copulas. Statistics & Probability Letters, 120:8–17.

Bernardi, M., Gayraud, G., and Petrella, L. (2015). Bayesian tail risk interdependence using quantile regression. Bayesian Anal., 10(3):553–603.

Bernardi, M., Maruotti, A., and Petrella, L. (2012). Skew mixture models for loss distributions: a bayesian approach. Insurance: Mathematics and Economics, 51:617–623.

Bernardi, M. and Petrella, L. (2015). Interconnected risk contributions: A heavy-tail approach to analyze US financial sectors. Journal of Risk and Financial Management, 8(2):198.

Bialkowski, J. (2003). Modelling returns on stock indices for western and central european stock exchanges - a markov switching approach. Southeast. Eur. J. Econ., 2:81.

Billio, M., Getmansky, M., Lo, A., and Pellizon, L. (2012). Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Econometrics, 101:535–559.

Bisias, D., Flood, M., Lo, A. W., and Valavanis, S. (2012). A survey of systemic risk analytics. Annual Review of Financial Economics, 4(1):255–296.

Brownlees, C. and Engle, R. (2012). Volatility, correlation and tails for systemic risk measurement. Working paper, European Business School.

Bulla, J. (2011). Hidden markov models with t components. increased persistence and other aspects. Quantitative Finance, 11(3):459–475.

Bulla, J. and Bulla, I. (2006). Stylized facts of financial time series and hidden semi-markov models. Comput. Statist. Data Anal., 51:2192.

Bulla, J., Mergner, S., Bulla, I., Sesboüé, A., and Chesneau, C. (2011). Markov-switching asset allocation: Do profitable strategies exist? Journal of Asset Management, 12(5):310–321.

Cao, Z. (2013). Multi-covar and shapley value: a systemic risk measure. Banq. France Work. Pap.

Cappé, O., Moulines, E., and Rydén, T. (2005). Inference in hidden Markov models. Springer.

Castro, C. and Ferrari, S. (2014). Measuring and testing for the systemically important financial institutions. Journal of Empirical Finance, 25(0):1–14.

Christoffersen, P. F. (1998). Evaluating interval forecasts. Internat. Econom. Rev., 39(4):841–862. Symposium on Forecasting and Empirical Methods in Macroeconomics and Finance.

Čížek, P., Härdle, W. K., and Weron, R., editors (2011). Statistical tools for finance and insurance. Springer, Heidelberg, second edition.

Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International statistical review, 73(1):111–129.

Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data via the em algorithm. J. R. Statist. Soc. Ser. B, 39:1.

Drehmann, M. and Tarashev, N. (2013). Measuring the systemic importance of interconnected banks. Journal of Financial Intermediation, 22(4):586–607.

Dymarski, P. (2011). Hidden Markov Models, Theory and Applications. Rijeka, HR, Intech.

Embrechts, P., McNeil, A. J., and Straumann, D. (2002). Correlation and dependence in risk management: properties and pitfalls. In Risk management: value at risk and beyond (Cambridge, 1998), pages 176–223. Cambridge Univ. Press, Cambridge.

Engle, R., Jondeau, E., and Rockinger, M. (2014). Systemic risk in europe. Review of Finance.

Engle, R. F. and Manganelli, S. (2004). CAViaR: conditional autoregressive value at risk by regression quantiles. J. Bus. Econom. Statist., 22(4):367–381.

Fiorentini, G., Sentana, E., and Calzolari, G. (2003). Maximum likelihood estimation and inference in multivariate conditionally heteroscedastic dynamic regression models with student t innovations. Journal of Business & Economic Statistics, 21(4):532–546.

Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models: Modeling and Applications to Random Processes. Springer.

Gettinby, G., Sinclair, C., Power, D., and Brown, R. (2004). An analysis of the distribution of extreme share returns in the uk from 1975 to 2000. J. Business Finance Account., 31:607.

Geweke, J. and Amisano, G. (2010). Comparing and evaluating bayesian predictive distributions of asset returns. International Journal of Forecasting, 26(2):216–230. Special Issue: Bayesian Forecasting in Economics.

Geweke, J. and Amisano, G. (2011). Hierarchical markov normal mixture models with applications to financial asset returns. Journal of Applied Econometrics, 26(1):1–29.

Girardi, G. and Ergün, A. (2013). Systemic risk measurement: Multivariate garch estimation of covar. Journal of Banking & Finance, 37:3169–3180.

Guidolin, M. and Timmermann, A. (2005). Economic implications of bull and bear regimes in uk stock and bond returns. Econ. J., 115:111.

Hamilton, J. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57:357.

Hamilton, J. (1990). Analysis of time series subject to changes in regime. J. Econometr., 45:39.

Harris, R. D. F. and Küçüközmen, C. C. (2001). The empirical distribution of uk and us stock returns. Journal of Business Finance & Accounting, 28(5-6):715–740.

Hautsch, N., Schaumburg, J., and Schienle, M. (2014). Financial network systemic risk contributions. Review of Finance.

Huang, X., Zhou, H., and Zhu, H. (2012a). Assessing the systemic risk of a heterogeneous portfolio of banks during the recent financial crisis. Journal of Financial Stability, 8(3):193–205. The Financial Crisis of 2008, Credit Markets and Effects on Developed and Emerging Economies.

Huang, X., Zhou, H., and Zhu, H. (2012b). Systemic risk contributions. Journal of Financial Services Research, 42(1):55–83.

Jäger-Ambrożewicz, M. (2013). Closed form solutions of measures of systemic risk. Ann. Univ. Sci. Budapestinensis Rolando Eötvös Nomin. Sect. Comput., 39:215–225.

Jorion, P. (2007). Value-at-Risk: The new benchmark for managing financial risk. McGraw-Hill, Chicago.

Kotz, S. and Nadarajah, S. (2004). Multivariate t-distributions and their applications. Cambridge University Press.

Koyluoglu, U. and Stoker, J. (2002). Honour your contribution. Risk, 15(4):90–94.

Kupiec, P. (1995). Techniques for verifying the accuracy of risk measurement models. Journal of Derivatives, 3:73–84.

Lagona, F. and Picone, M. (2013). Maximum likelihood estimation of bivariate circular hidden markov models from incomplete data. Journal of Statistical Computation and Simulation, 83(7):1223–1237.

Louis, T. A. (1982). Finding the observed information matrix when using the em algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 44(2):226–233.

Lucas, A., Schwaab, B., and Zhang, X. (2014). Conditional euro area sovereign default risk. Journal of Business & Economic Statistics, 32(2):271–284.

McLachlan, G. and Krishnan, T. (2007). The EM algorithm and extensions, volume 382. John Wiley & Sons.

McLachlan, G. and Peel, D. (2000). Finite mixture models. John Wiley & Sons.

McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative risk management: Concepts, techniques and tools. Princeton university press, second edition.

Nadarajah, S., Zhang, B., and Chan, S. (2014). Estimation methods for expected shortfall. Quantitative Finance, 14(2):271–291.

Peel, D. and McLachlan, G. J. (2000). Robust mixture modelling using the t distribution. Statistics and Computing, 10(4):339–348.

Pelletier, D. (2006). Regime switching for dynamic correlations. Journal of econometrics, 131(1):445–473.

Rydén, T. (2008). EM versus Markov chain Monte Carlo for estimation of hidden Markov models: a computational perspective. Bayesian Anal., 3(4):659–688.

Rydén, T., Teräsvirta, T., and Åsbrink, S. (1998). Stylized facts of daily return series and the hidden markov model. Journal of Applied Econometrics, 13(3):217–244.

Shapley, L. S. (1953). A value for n-person games. Contributions to the Theory of Games, 2:307–317.

Shoham, S. (2002). Robust clustering by deterministic agglomeration EM of mixtures of multivariate t-distributions. Pattern Recognition, 35(5):1127–1142. Handwriting Processing and Applications.

Sordo, M. A., Suárez-Llorens, A., and Bello, A. J. (2015). Comparison of conditional distributions in portfolios of dependent risks. Insurance: Mathematics and Economics, 61:62–69.

Sutradhar, B. C. (1984). Contributions to multivariate analysis based on elliptic t model. Unpublished Ph.D. Thesis. The University of Western Ontario, Canada.

Tanner, M. A. (1996). Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions. Springer Series in Statistics. Springer, 3rd edition.

Tarashev, N. A., Borio, C. E., and Tsatsaronis, K. (2010). Attributing systemic risk to individual institutions. BIS Working paper.

Taylor, S. J. (2007). Modelling financial time series. World Scientific Publishing.

Zucchini, W. and MacDonald, I. L. (2009). Hidden Markov models for time series: an introduction using R. CRC Press.