Combining a self-exciting point process with the truncated generalized Pareto distribution: An extreme risk analysis under price limits

Journal of Empirical Finance 57 (2020) 52–70


Combining a self-exciting point process with the truncated generalized Pareto distribution: An extreme risk analysis under price limits

Jingru Ji a, Donghua Wang a,b,∗, Dinghai Xu c, Chi Xu a

a School of Business, East China University of Science and Technology, Shanghai 200237, China
b Department of Finance, East China University of Science and Technology, Shanghai 200237, China
c Department of Economics, University of Waterloo, Ontario, N2L 3G1, Canada

ARTICLE INFO

JEL classification: C32, C51, G10

Keywords: Self-exciting point process; Truncated generalized Pareto distribution; Predictable marks; Price limits; Branching process

ABSTRACT

In this paper, we introduce a general framework that combines the self-exciting point process with the truncated generalized Pareto distribution to measure extreme risks in stock markets under price limits. We incorporate predictable marks, defined by letting the variance of the mark distribution depend on the previous events via the intensity, into the model setting. The proposed process accommodates many important empirical characteristics, such as heavy-tailedness, extreme risk clustering and price limits. We derive a closed-form solution for the objective likelihood, based on which the proposed model can be estimated via the standard maximum likelihood estimation algorithm. Furthermore, closed-form measures of the Value-at-Risk and Expected Shortfall are also derived. For empirical illustration, we use the China Securities Index 300 (with a ±10% price restriction) in the analysis. In general, the results from both in-sample fitting and out-of-sample forecasting measures show that the proposed process explains the empirical data well. We also investigate the cascade effect of the China stock market by introducing the branching process to distinguish the endogenous risks from the exogenous risks.

1. Introduction

Price-limit trading policy has been widely adopted in various stock markets around the world. According to Deb et al. (2005), 41 out of 58 major stock exchanges, including Austria, Belgium, China, Egypt, France, Greece, Italy, Japan, Mexico, South Africa, Turkey, etc., have implemented certain price restrictions on their equity exchanges. The price-limit mechanism can be traced back to the Black Monday market crash in October of 1987. To ease the impact of such extreme fluctuations, various authorities have set up trading boundaries on price movements within a certain time period. Take a typical daily price limit on a stock market as an example: during a trading day, the transaction price of a stock can only move within a restricted interval. The boundaries of this interval are determined by a certain percentage of the previous trading day's closing price. Different markets impose different bounds: for instance, Austria imposes 5%; Mainland China imposes 10%; Greece imposes 8%; etc. The debate over the price-limit trading policy has lasted for ages. Some argue that the price-limit policy may lead to ineffectiveness or even unintended, destructive market behavior (Brennan, 1986; Chen et al., 2019). Empirically, there is no consensus about whether price limits exert a magnet effect (Sifat and Mohamad, 2018; Chan et al., 2005; Cho et al., 2003; Hsieh et al., 2009) or a cool-off effect (Fernandes and Rocha, 2007), and references therein.1

∗ Correspondence to: School of Business, East China University of Science and Technology, 130 Meilong Road, P.O. Box 114, Shanghai 200237, China.

E-mail addresses: [email protected] (D. Wang), [email protected] (D. Xu).
1 Thanks to the Associate Editor for pointing out the important related works in the price limits literature.

https://doi.org/10.1016/j.jempfin.2020.03.003 Received 17 September 2019; Received in revised form 12 January 2020; Accepted 30 March 2020 Available online 7 April 2020 0927-5398/© 2020 Elsevier B.V. All rights reserved.


Fig. 1. The left panel shows the mean excess plot of CSI 300 negative daily log-returns from January 5th, 2010 to December 31st, 2015, while the right panel shows the mean excess plot of S&P 500 negative daily log-returns with the same time interval.

The focus of this paper is on modeling extreme risk behavior under price limits. In particular, we concentrate on the left tail of the log-return distribution; that is, we are interested in analyzing the events that cause extreme losses. Chavez-Demoulin and McGill (2012) mention that there are two stylized characteristics to be taken into account when modeling the distribution of log-returns: heavy tails and volatility clustering. One popular approach in the literature to study fat-tail behavior is the Extreme Value Theory (EVT) introduced by Embrechts et al. (1997). One of the well-known models within the EVT is the peaks over threshold (POT) approach, which assumes that extreme risks are independently and identically distributed from a generalized Pareto distribution (GPD). In particular, the GPD can capture different tail behavior through changing the value of its shape parameter (Chavez-Demoulin et al., 2006). The EVT has been widely adopted in the field of risk management; see McNeil (1998), McNeil and Frey (2000) or Bali (2007) for instance. Several extensions combine the POT with point processes (such as Chavez-Demoulin et al. (2005), Chavez-Demoulin et al. (2006), etc.). However, such processes may not be suitable for modeling data from markets with price limits. For illustration, we present a comparison plot in Fig. 1. The left panel of Fig. 1 shows the mean excess plot of China Securities Index 300 (CSI 300) negative daily log-returns from January 5th, 2010 to December 31st, 2015, while the right panel shows the mean excess plot of Standard and Poor's 500 (S&P 500) negative daily log-returns over the same time interval. In the POT framework, the scatters of the mean excess plot should lie visually close to an upward-sloping line. However, in the left panel of Fig. 1, the scatters over 0.04 show a clear downward trend. This interesting observation suggests that the proposed process should accommodate the price restriction as one important feature of the framework.
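The mean excess diagnostic behind Fig. 1 can be reproduced from any loss series in a few lines. Below is a minimal sketch; the simulated Pareto losses stand in for the actual index data, and all names are illustrative assumptions:

```python
import numpy as np

def mean_excess(losses, thresholds):
    """Empirical mean excess function e(u) = E[X - u | X > u].

    A roughly linear upward-sloping plot of e(u) against u is the usual
    graphical signature of GPD-type tail behavior; a downward bend at
    high thresholds (as for CSI 300 above 0.04) motivates truncation.
    """
    losses = np.asarray(losses, dtype=float)
    return np.array([
        (losses[losses > u] - u).mean() if (losses > u).any() else np.nan
        for u in thresholds
    ])

# Toy example with simulated heavy-tailed losses (not the CSI 300 data).
rng = np.random.default_rng(0)
x = rng.pareto(3.0, 5000) * 0.01          # stand-in for negative log-returns
u_grid = np.quantile(x, np.linspace(0.80, 0.99, 20))
e = mean_excess(x, u_grid)
assert e.shape == (20,) and np.all(e > 0)
```

Plotting `e` against `u_grid` gives the mean excess plot of Fig. 1.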
Another interesting empirical phenomenon is the clustering of extreme risk occurrences. An extreme event may trigger a series of crashes, which produces a so-called short-term autocorrelation effect. To accommodate such a clustering effect, researchers adopt the self-exciting process proposed by Hawkes (1971). For example, Moreno et al. (2011) study the shot-noise jump diffusion model, and their empirical study indicates that 73% of extreme risks may cause autocorrelation effects. Aït-Sahalia et al. (2015) propose a multi-dimensional Hawkes jump-diffusion model considering both self-exciting and cross-exciting effects. Evidence shows that all financial markets contain self-exciting effects, while cross-exciting effects are shown to be asymmetric. Further research related to modeling clusters can be found in Bowsher (2007), Errais et al. (2010), Bollerslev et al. (2013) and the references therein. When the price restriction policy is applied to all the stocks in an index, the return of the index should be considered as an aggregate of limited-domain variables, since the index return is also restricted to the same limits (although it rarely hits the bounds empirically). Therefore, in this paper, we propose a general framework, which combines a self-exciting point process with the truncated GPD, to analyze extreme risk behavior in markets with price limits. Similar to GARCH modeling, where the variance is a function of past shocks, we assume that the variance of the mark distribution depends on the previous events via the intensity. McNeil et al. (2005) define this setting as predictable marks, which distinguishes this work from previous studies such as Lee and Seo (2017).2 For empirical illustration purposes, throughout the paper, we use the CSI 300 data from 2010 to 2015. The closed-form solution for the objective likelihood function is derived for the proposed model, and the standard maximum likelihood estimation algorithm is implemented.
We also derive the closed-form expressions of the Value-at-Risk (VaR) and Expected Shortfall (ES) measures under the proposed framework, which provides a convenient way for model backtesting and out-of-sample forecasting. Furthermore, following Filimonov and Sornette (2015), we apply the language of branching processes to separate the exogenous risks from the endogenous risks. Thus, it is practicable to study the endogenous influence of a given extreme risk from different aspects, such

2 Thanks to an anonymous referee for pointing out this important property and providing us with the related literature.


as the duration of its influence, the total loss triggered by its next generation descendants, etc. These results are useful for both market participants and regulators.

The rest of the paper is organized as follows. Section 2 presents the model that combines the self-exciting point process with the truncated GPD, together with the parameter estimation procedure using the maximum likelihood estimation algorithm. Section 3 derives the closed-form expressions for the VaR and ES and conducts the empirical analysis using the CSI 300. In particular, Section 3.1 performs model diagnostics checks and backtests the in-sample VaR estimates. Section 3.2 forecasts the out-of-sample VaR and ES and evaluates the performance of the forecasting measures based on three likelihood ratio (LR) tests. Section 4 provides a risk distinguishing technique to separate the exogenous risks from the endogenous risks and studies the cascade effect triggered by a given extreme risk. Section 5 concludes. The proofs for all propositions are collected in Appendix A and the detailed derivation of the standard error structure (Hessian matrix) is presented in Appendix B.

2. Self-exciting point process with the truncated GPD

In this section, we propose a self-exciting point process to model the extreme risk behavior. We incorporate the flexible truncated GPD into the framework of the marked point process to accommodate the price limits. We denote $x_t$ as the daily negative log-return at time $t$.3 The extreme risks are defined as the observations whose magnitudes exceed a certain threshold $u$. Suppose that $N_{u,t}$ is the total number of extreme risks up to time $t$. The occurrence times of such extreme values can be denoted as $T_1, T_2, \ldots, T_{N_{u,t}}$, and the corresponding magnitudes are denoted as $X_{T_1}, X_{T_2}, \ldots, X_{T_{N_{u,t}}}$. The information set including both times and magnitudes of these extreme risks at time $t$ is defined as $\mathcal{H}_t$, that is, the sigma algebra generated by the process. In general, there are three key assumptions as follows.
Assumptions.

(i) The occurrence times of extreme risks over a certain threshold follow a self-exciting point process;
(ii) The magnitudes of extreme risks over a certain threshold follow the truncated GPD;
(iii) The occurrence times and magnitudes of extreme risks over a certain threshold are dependent in the short term, but the dependence decays as time increases.

Following Chavez-Demoulin and McGill (2012) and McNeil et al. (2005), the frequency of extreme risks is described via a self-exciting point process. The conditional rate $\tau(t)$ is defined as

$$\tau(t) = \tau_0 + \rho v(t), \quad \tau_0 \ge 0, \; \rho \ge 0 \tag{1}$$

with

$$v(t) = \int_{-\infty}^{t} \delta(X_s)\, g(t-s)\, dN(s) \tag{2}$$

where $X_s$ represents the magnitude of the extreme risk at time $s$, and $\delta(\cdot)$ and $g(\cdot)$ are the impact and decay functions respectively. In particular, $\delta(\cdot)$ determines the influence of extreme risks on $\tau(t)$, while $g(\cdot)$ controls the rate of the decay process triggered by extreme risks. $N(s)$ is a counting measure. We define $v(t)$ as the ''Hawkes kernel''. In our model, the clustering mechanism of the occurrence times relies on the Hawkes kernel. Once an extreme event occurs at time $s$, the intensity of the point process suddenly increases by $\rho\delta(X_s)$ and then decays over time. This process captures the aggregation of the occurrence times of extreme risks, and thus the clustering effect. Grothe et al. (2014) propose that the forms of $\delta(\cdot)$ and $g(\cdot)$ influence the stationarity condition of the process. In this paper, we set

$$\delta(X_s) = F_{u,s}(X_s) \tag{3}$$

$$g(t-s) = e^{-\gamma(t-s)} \tag{4}$$

where $F_{u,s}(\cdot)$ represents the time-varying cumulative distribution function (CDF) of extreme risks over the threshold $u$ at time $s$.4 Note that when $X_s \le u$, $\delta(X_s) = 0$. It is worth mentioning that there are two main features in this setting. First, Eqs. (3) and (4) can make the self-exciting process stationary under certain conditions.5 Second, and more importantly, in this setting we can distinguish the exogenous risks, triggered by the ground intensity $\tau_0$, from the endogenous risks, which are caused by the Hawkes kernel.

Based on assumption (ii), the distribution of $X_t$ can be expressed as

$$F_{\mu,t}(x) = P\left(X_t < x \,\middle|\, \mu \le X_t \le \phi, \mathcal{H}_t\right) = \frac{1 - \left(1 + \xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi}}{1 - \left(1 + \xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi}} \tag{5}$$

3 For simplicity, we analyze the daily negative log-returns by mapping extreme risks into positive values.
4 The detailed definition is provided in Proposition 1.
5 We provide the stationarity proof in Proposition 5.


where $\psi \ge 0$, $\alpha \ge 0$, $\xi \ne 0$ and $\phi \ge \mu \ge 0$. In Eq. (5), $\phi$ represents the upper bound of the negative daily log-returns. In our model, we focus on the extreme risks over the threshold $u$ within a certain restricted price range. Hence, the domain of the exceedances is $[u, \phi]$. Note that the scale parameter $\psi + \alpha v(t)$ contains a time-varying Hawkes kernel, which corresponds to the predictable marks setting in McNeil et al. (2005). This setting accommodates the short-term dependence between occurrences and exceedances, while in the long run this dependence decays over time. This is consistent with our assumption (iii).

Proposition 1. The CDF of $X_t$, $F_{u,t}(x)$, is

$$F_{u,t}(x) = P\left(X_t < x \,\middle|\, u \le X_t \le \phi, \mathcal{H}_t\right) = \frac{\left(1 + \xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi}}{\left(1 + \xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi}} \tag{6}$$

with its corresponding PDF6

$$f_{u,t}(x) = \frac{\left[\psi + \alpha v(t)\right]^{-1}\left(1 + \xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}-1}}{\left(1 + \xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-1/\xi}} \tag{7}$$

where $\psi \ge 0$, $\alpha \ge 0$, $\xi \ne 0$ and $\phi \ge u \ge \mu \ge 0$.

Proof. See Appendix A. □

Proposition 2. Suppose the intensity of extreme risks is defined in (1), and the PDF of the magnitudes has the form in (7). The closed form of the log likelihood function, $l$, can be expressed as follows

$$l = -\tau_0 T - \rho \int_0^T v(s)\, ds + \sum_{i=1}^{N_{u,T}} \log\left\{ \frac{\left[\tau_0 + \rho v(T_i)\right] \left(1 + \xi\frac{X_{T_i}-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}-1}}{\left[\psi + \alpha v(T_i)\right] \left[\left(1 + \xi\frac{u-\mu}{\psi+\alpha v(T_i)}\right)^{-1/\xi} - \left(1 + \xi\frac{\phi-\mu}{\psi+\alpha v(T_i)}\right)^{-1/\xi}\right]} \right\} \tag{8}$$

where $T$ stands for the sample size of the daily negative log-returns, and $v(T_i) = \sum_{j<i} \delta_j e^{-\gamma(T_i-T_j)}$, where $\delta_j \overset{\Delta}{=} \delta(X_{T_j})$.

Proof. See Appendix A. □
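The truncated GPD of Proposition 1 is straightforward to evaluate numerically. A minimal sketch of Eqs. (6)–(7), using the rough magnitudes of the later parameter estimates purely as illustrative inputs:

```python
import numpy as np

def trunc_gpd_cdf(x, u, phi, mu, xi, psi_t):
    """CDF of exceedances on [u, phi], Eq. (6); psi_t = psi + alpha*v(t)."""
    A = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    return (A(x) - A(u)) / (A(phi) - A(u))

def trunc_gpd_pdf(x, u, phi, mu, xi, psi_t):
    """Density of exceedances on [u, phi], Eq. (7)."""
    A = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    dens = (1.0 / psi_t) * (1.0 + xi * (x - mu) / psi_t) ** (-1.0 / xi - 1.0)
    return dens / (A(u) - A(phi))

# Sanity checks: the CDF runs from 0 at the threshold u to 1 at the
# price-limit bound phi. Parameter values are illustrative.
u, phi, mu, xi, psi_t = 0.02, 0.1054, 0.0166, 0.4651, 0.0049
assert abs(trunc_gpd_cdf(u, u, phi, mu, xi, psi_t)) < 1e-12
assert abs(trunc_gpd_cdf(phi, u, phi, mu, xi, psi_t) - 1.0) < 1e-12
```

Because the distribution is truncated at $\phi$, the density integrates to one over $[u, \phi]$ rather than over an unbounded tail, which is what accommodates the price limit.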

Substituting (3) and (4) into $\int_0^T v(s)\, ds$, we get

$$\int_0^T v(s)\, ds = \int_0^T \sum_{i:\,0<T_i<s} \delta_i e^{-\gamma(s-T_i)}\, ds = \sum_{j=0}^{N_{u,T}} \int_{T_j}^{T_{j+1}} e^{-\gamma s} \sum_{i:\,0<T_i<s} \delta_i e^{\gamma T_i}\, ds = \sum_{j=1}^{N_{u,T}} \frac{e^{-\gamma T_j} - e^{-\gamma T_{j+1}}}{\gamma} \sum_{i=1}^{j} \delta_i e^{\gamma T_i} \tag{9}$$

where $T_{N_{u,T}+1}$ should be interpreted as $T$ (the $j = 0$ term vanishes since no event has occurred before $T_1$). As mentioned earlier, $u$ and $\phi$ are preset. In total, we have seven unknown parameters to be estimated, which can be denoted as $\boldsymbol{\theta} = (\tau_0, \rho, \psi, \alpha, \mu, \xi, \gamma)'$. Since the objective function contains numerous parameters, the estimation could be computationally expensive. Following Chavez-Demoulin and McGill (2012), the genetic algorithm (Storn and Price, 1997) is used for optimization. It is worth mentioning that the genetic algorithm does not require the objective function to be differentiable. In addition, the genetic algorithm is less affected by the local minimum problem, which is a common issue for well-known gradient-based algorithms such as the quasi-Newton methods.

Our proposed model can be applied to any market with price limits. This paper uses the China stock market for empirical illustration. We consider the CSI 300 daily index value (upper panel) and its corresponding negative log-returns (lower panel) from January 5th, 2010 to December 31st, 2015, presented in Fig. 2. There are 1455 daily negative log-returns in total. Table 1 presents the summary statistics for the returns and exceedances (with the threshold $u = 0.02$). Note that the left tail of the daily log-returns is mapped into the positive domain for convenience. To estimate the model, the threshold $u$ is set to 0.02, which corresponds to the 92.51% quantile of the empirical distribution.7 With this threshold level, we observe 109 exceedances in total in our sample. The other preset parameter $\phi$ is set as 0.1054

6 Both the CDF and PDF of the exceedances are used in deriving the closed form of the log likelihood function of the model for estimation.
7 Based on a simulation study, Chavez-Demoulin (1999) suggests that choosing a threshold corresponding to the 90%–95% quantile of the empirical distribution is appropriate. Results based on different threshold levels are available upon request.


Table 1
Summary statistics.

                Total negative log-returns    Excess negative log-returns
Mean (%)        −0.0037                       3.3899
St. deviation   0.0161                        0.0164
Skewness        0.5968                        1.7127
Kurtosis        7.0085                        5.4177

Fig. 2. CSI 300 daily index value (upper panel) and its corresponding negative log-returns (lower panel) from January 5th, 2010 to December 31st, 2015.

Table 2
Parameter estimates.

τ0         ρ          ψ          α          μ          ξ          γ
0.0399     0.0549     0.0049     0.0042     0.0166     0.4651     0.0569
(0.0044)   (0.0115)   (0.0013)   (0.0013)   (0.0006)   (0.1723)   (0.0152)

Note: The standard errors are reported in parentheses.

(−log(0.9)), which is equivalent to the 10% lower bound of the price limits in the China stock market. Parameters are estimated via maximum likelihood based on the closed-form objective function in (8). The results are reported in Table 2, with standard errors in parentheses. Appendix B provides the details of deriving the Hessian matrix. All the parameters are statistically significant. The shape parameter ξ is greater than zero, indicating that the exceedances are heavy tailed. The decay parameter γ is relatively small; thus, the self-exciting effect cannot be ignored. ψ and α are at the same level, which indicates that the setting of predictable marks is necessary. Based on the results in Table 2, we can conclude that in our sample data the exogenous risk has a proportion of 3.99% (with τ0 = 0.0399), while the endogenous risk has a proportion of 3.5% over the total sample.
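The estimation just described can be sketched compactly: Eq. (8) is evaluated in one sequential pass over the exceedances (the impact values $\delta_i$ must be built recursively, since each depends on the Hawkes kernel at its own occurrence time), with the compensator integral in the closed form of Eq. (9), and then maximized by the gradient-free algorithm of Storn and Price (1997), available in SciPy as `differential_evolution`. The simulated exceedances and the parameter bounds below are illustrative assumptions, not the paper's data:

```python
import numpy as np
from scipy.optimize import differential_evolution

def log_likelihood(theta, times, marks, u, phi, T):
    """Eq. (8): log likelihood of the marked self-exciting process.

    theta = (tau0, rho, psi, alpha, mu, xi, gamma); `times` and `marks`
    hold the occurrence times T_i and magnitudes X_{T_i} of exceedances.
    """
    tau0, rho, psi, alpha, mu, xi, gamma = theta
    n = len(times)
    A = lambda y, s: (1.0 + xi * (y - mu) / s) ** (-1.0 / xi)
    v = np.zeros(n)        # Hawkes kernel v(T_i), built recursively
    delta = np.zeros(n)    # impact delta_i = F_{u,T_i}(X_{T_i}), Eq. (3)
    ll = -tau0 * T
    for i in range(n):
        if i > 0:
            v[i] = np.sum(delta[:i] * np.exp(-gamma * (times[i] - times[:i])))
        s = psi + alpha * v[i]
        delta[i] = (A(marks[i], s) - A(u, s)) / (A(phi, s) - A(u, s))
        dens = (1.0 / s) * (1.0 + xi * (marks[i] - mu) / s) ** (-1.0 / xi - 1.0)
        dens /= A(u, s) - A(phi, s)
        ll += np.log((tau0 + rho * v[i]) * dens)
    # compensator term -rho * int_0^T v(s) ds via the closed form of Eq. (9)
    for j in range(n):
        nxt = times[j + 1] if j + 1 < n else T
        ll -= rho * (np.exp(-gamma * times[j]) - np.exp(-gamma * nxt)) / gamma \
              * np.sum(delta[:j + 1] * np.exp(gamma * times[:j + 1]))
    return ll

# Maximization over simulated exceedances; bounds are illustrative.
rng = np.random.default_rng(1)
times = np.sort(rng.uniform(0.0, 500.0, 30))
marks = rng.uniform(0.021, 0.09, 30)
bounds = [(1e-4, 0.5), (1e-4, 0.5), (1e-4, 0.05), (0.0, 0.05),
          (0.0, 0.02), (0.05, 0.95), (1e-3, 0.5)]
res = differential_evolution(
    lambda th: -log_likelihood(th, times, marks, u=0.02, phi=0.1054, T=500.0),
    bounds, maxiter=20, seed=1, tol=1e-6)
```

As the text notes, the population-based optimizer needs no derivatives of the objective, which is convenient given the recursive dependence of $\delta_i$ on the parameters.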

3. Empirical analysis

3.1. Model estimation diagnostics and VaR backtesting

In this section, we conduct post-estimation diagnostic tests to investigate the validity of the proposed framework. One common way to validate the model is to construct tests of the original model assumptions. In other words, if a model is well constructed and estimated, all its assumptions should be satisfied. For assumptions (i) and (ii), we design two tests based on the probability integral transformations (Diebold et al., 1998).

Proposition 3. Based on the proposed model, specified in (1)–(5), we define two types of residuals as follows

$$\chi_i = 1 - \exp\left[-\hat\tau_0\left(T_{i+1} - T_i\right) - \hat\rho \int_{T_i}^{T_{i+1}} v(s)\, ds\right] \tag{10}$$

J. Ji, D. Wang, D. Xu et al.

Journal of Empirical Finance 57 (2020) 52–70

Fig. 3. Q-Q plot of 𝜒 (left panel) and 𝑚 (right panel) from January 5th, 2010 to December 31st, 2015.

and

$$m_i = \frac{\left(1 + \hat\xi\frac{X_{T_i}-\hat\mu}{\hat\psi+\hat\alpha \hat v(T_i)}\right)^{-1/\hat\xi} - \left(1 + \hat\xi\frac{\phi-\hat\mu}{\hat\psi+\hat\alpha \hat v(T_i)}\right)^{-1/\hat\xi}}{\left(1 + \hat\xi\frac{u-\hat\mu}{\hat\psi+\hat\alpha \hat v(T_i)}\right)^{-1/\hat\xi} - \left(1 + \hat\xi\frac{\phi-\hat\mu}{\hat\psi+\hat\alpha \hat v(T_i)}\right)^{-1/\hat\xi}} \tag{11}$$
where $i = 1, 2, \ldots, N_{u,T}$ and $T_{N_{u,T}+1} = T$. If the estimated model satisfies assumptions (i) and (ii), $\chi$ and $m$ should each be uniformly distributed over the interval $[0, 1]$.

Proof. See Appendix A. □

$\chi$ and $m$ defined in (10) and (11) are designed to validate assumptions (i) and (ii) respectively. We present the Q-Q plots of both types of residuals in Fig. 3: the left panel is for $\chi$ and the right panel is for $m$. Both residual quantiles are constructed against $U(0,1)$. One can see that in both cases the empirical quantiles of $\chi$ and $m$ fit the 45-degree line well, which indicates that both empirical distributions are approximately uniform. To support the graphical evidence, we furthermore carry out the Kolmogorov–Smirnov (K–S) test of both $\chi$ and $m$ against $U(0,1)$. The p-values of the two K–S tests are 0.5939 and 0.5077 respectively, indicating that the null hypotheses cannot be rejected at the 10% level. This implies that neither empirical distribution is statistically different from $U(0,1)$; assumptions (i) and (ii) are therefore reasonable under the estimated model. It is worth noting that, to the best of our knowledge, there is no formal test in the literature directly on our assumption (iii). However, assumption (iii) can be indirectly validated through backtesting the in-sample VaR and ES, which reflect the dependence between occurrence times and magnitudes of extreme risks. In other words, if the estimated model captures the dependence structure well, it should have good in-sample VaR and ES performance.

Proposition 4. The VaR and ES of quantile $p$ at time $t+1$ of the proposed model, specified in (1)–(5), are8

$$VaR^p_{t+1} = \mu - \frac{\tilde\psi(t)}{\xi} + \frac{\tilde\psi(t)}{\xi} \left\{ -\frac{1-p}{\tau(t)} \left[ \left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{u-\mu}{\tilde\psi(t)}\right)^{-1/\xi} \right] + \left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-1/\xi} \right\}^{-\xi} \tag{12}$$

8 The VaR and ES estimates are derived as conditional measures when the extreme risks exceed the threshold level $u$. The VaR or ES estimates might be overestimated during periods without sample observations. Thanks to an anonymous referee for pointing this out.


Fig. 4. The upper and middle panels show the VaR and ES estimates respectively, while the bottom panel exhibits the time-varying intensity. The dataset starts from January 5th, 2010 and ends at December 31st, 2015.

$$ES^p_{t+1} = \frac{\phi\left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-1/\xi} - VaR^p_{t+1}\left(1 + \xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-1/\xi}}{\left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-1/\xi}} - \frac{\tilde\psi(t)}{\xi-1} \cdot \frac{\left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1} - \left(1 + \xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1}}{\left(1 + \xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-1/\xi} - \left(1 + \xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-1/\xi}} \tag{13}$$

where $\tilde\psi(t) = \psi + \alpha v(t)$.

Proof. See Appendix A. □

Eqs. (12) and (13) provide the VaR and ES estimates of quantile $p$ at $t+1$ conditional on $\mathcal{H}_t$. From the closed-form VaR and ES measures, one can see that the time variation of the VaR and ES is driven by the Hawkes kernel $v(t)$. Since we use 7.49% of the left tail of our sample data to model the extreme risks, it is possible to investigate the VaR and ES performance at both the 95% and 99% levels. The upper panel of Fig. 4 shows the time-varying VaR estimates based on Eq. (12). The red dashed line corresponds to the 95% VaR estimates while the blue dotted line corresponds to the 99% VaR estimates. Similarly, the middle panel of Fig. 4 shows the time-varying ES estimates based on Eq. (13), with the same color coding. Finally, the estimated intensity $\tau(t)$ is plotted in the bottom panel. Note that the self-exciting property of our one-step-ahead risk estimates is driven by the Hawkes kernel. In other words, the VaR and ES estimates at $t+1$ will rise if an extreme risk over the threshold $u$ occurs at $t$; otherwise, they decay slowly and approach a certain value.

Following Chavez-Demoulin and McGill (2012), three LR tests based on the violations9 are utilized in the backtesting procedure. The first LR test, known as the unconditional coverage test, focuses on the ratio of the empirical violations to the theoretical quantile $p$ (Kupiec, 1995). The second LR test, suggested by Christofferson (1998), aims to test the independence of the violations; in other words, the occurrences of violations should follow a homogeneous Poisson process with intensity $p$. The third LR test, named the conditional coverage test, jointly tests the above two hypotheses (Christofferson, 1998). Table 3 reports the results of the three LR tests based on the VaR estimates for the CSI 300 from January 5th, 2010 to December 31st, 2015.
‘‘***" indicates statistical significance at the 10% level. The second column in Table 3 shows the p-values of the 95% VaR estimates under the three LR tests. We cannot reject the null hypotheses at the 10% level in any case. Similar results are found for the 99% VaR measures as well. The good performance of the VaR estimates indicates that the dependence between both occurrence times and magnitudes of extreme risks is well measured in sample. In the following subsection, we evaluate the forecasting performance of our proposed model.

9 A violation refers to the case in which the observation exceeds the VaR estimate.
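The closed forms in Proposition 4 are simple to implement once $\tau(t)$ and $\tilde\psi(t)$ are available from the fitted model. A sketch of Eqs. (12)–(13); the inputs below (including the assumed value of $\tau(t)$) are illustrative, chosen to be of the same order of magnitude as the reported estimates:

```python
import numpy as np

def var_es(p, tau_t, u, phi, mu, xi, psi_t):
    """Closed-form one-step-ahead VaR (Eq. (12)) and ES (Eq. (13)).

    psi_t stands for psi + alpha*v(t) and tau_t for tau(t), both taken
    from the fitted model at time t.
    """
    A = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    inner = -(1.0 - p) / tau_t * (A(phi) - A(u)) + A(phi)
    var = mu - psi_t / xi + psi_t / xi * inner ** (-xi)
    A1 = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi + 1.0)
    denom = A(phi) - A(var)
    es = (phi * A(phi) - var * A(var)) / denom \
         - psi_t / (xi - 1.0) * (A1(phi) - A1(var)) / denom
    return var, es

# With illustrative inputs and an assumed tau(t) = 0.06, the 99% VaR lies
# strictly between the threshold u and the price-limit bound phi, and the
# ES exceeds the VaR, as it must.
var99, es99 = var_es(0.99, 0.06, u=0.02, phi=0.1054, mu=0.0166,
                     xi=0.4651, psi_t=0.0049)
assert 0.02 < var99 < 0.1054
assert var99 < es99 < 0.1054
```

Note that both measures are bounded above by $\phi$: under price limits, even the expected shortfall cannot exceed the daily loss bound.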


Table 3
The three LR tests based on VaR estimates for CSI 300 from January 5th, 2010 to December 31st, 2015.

Test            95% VaR (p-value)    99% VaR (p-value)
Unconditional   0.413***             0.705***
Independence    0.999***             0.999***
Conditional     0.715***             0.931***
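For reference, the first of the three LR tests can be sketched as follows; the violation series below is simulated (not the paper's), and the corner cases of zero or all violations are not handled:

```python
import numpy as np
from scipy.stats import chi2

def kupiec_lr(violations, p):
    """Unconditional coverage test (Kupiec, 1995).

    `violations` is a 0/1 series marking the days on which the realized
    loss exceeded the VaR estimate; under the null the violation rate
    equals 1 - p.  Assumes at least one violation and one non-violation.
    Returns the LR statistic and its chi-square(1) p-value.
    """
    n, x = len(violations), int(np.sum(violations))
    ll = lambda q: x * np.log(q) + (n - x) * np.log(1.0 - q)
    lr = -2.0 * (ll(1.0 - p) - ll(x / n))
    return lr, chi2.sf(lr, df=1)

# 70 violations over 1455 days against a 95% VaR: the empirical rate of
# 4.81% is close to the theoretical 5%, so the null is not rejected.
viol = np.zeros(1455)
viol[:70] = 1.0
lr, pval = kupiec_lr(viol, p=0.95)
assert lr >= 0.0 and pval > 0.10
```

The independence and conditional coverage tests of Christofferson (1998) follow the same LR construction, with the likelihood replaced by that of a first-order Markov chain for the violation indicator.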

Fig. 5. The upper and middle panels show the VaR and ES estimates respectively, while the bottom panel exhibits the time-varying intensity. The dataset starts from January 4th, 2016 and ends at June 30th, 2016.

Table 4
The three LR tests based on out-of-sample VaR estimates for CSI 300 from January 4th, 2016 to June 30th, 2016.

Test            95% VaR (p-value)    99% VaR (p-value)
Unconditional   0.983***             0.497***
Independence    0.999***             0.995***
Conditional     0.999***             0.794***

3.2. Out-of-sample forecasting

Based on the closed-form solutions in Proposition 4, our model is able to forecast the out-of-sample (one-step-ahead) VaR and ES. Moreover, the prediction results can also be evaluated by the three LR tests mentioned in Section 3.1. The out-of-sample period is from January 4th, 2016 to June 30th, 2016, containing 120 observations in the forecasting window. The results of the out-of-sample tests are similar to those of the in-sample tests. The upper and middle panels of Fig. 5 show the VaR and ES estimates at the 95% and 99% levels respectively, while the bottom panel exhibits the time-varying intensity. The results of the three LR tests are reported in Table 4, indicating that all three tests of both the 95% and 99% VaR estimates are significant at the 10% level.

4. Risk analysis via branching process

The self-exciting process in our model can be interpreted as a branching process (Møller and Rasmussen, 2005). All extreme risks can be split into exogenous and endogenous risks.10 Theoretically, exogenous risks are defined as immigrants, whose occurrences correspond to the ground intensity τ0. Descendants are classified via their generations, and they belong to the endogenous risks. In a branching process, each immigrant first triggers its first generation descendants; the first generation descendants then trigger the second generation descendants, and so on. Therefore, each immigrant constructs a family of extreme risks. In other words, every descendant in the family originates from a certain immigrant. We present Fig. 6 for such a demonstration. Fig. 6 shows the branching process transferred from a Hawkes process, ignoring the marks. Each square node stands for an extreme risk over the threshold u, where the red square nodes represent immigrants and the blue square nodes represent descendants. All square nodes are numbered with their corresponding generations.
The edge effect is defined as the case in which an immigrant occurs during a family that is triggered by a former immigrant. For instance, in Fig. 6, a descendant in Family 1 occurs after the immigrant in Family 2; this is called the edge effect.

10 We would like to thank the Associate Editor and an anonymous referee for their comments and suggestions on the classification of the exogenous and endogenous risks, which helped us to establish a link between the branching process and the Hawkes process in this paper.

Fig. 6. The branching process transferred from a Hawkes process, ignoring the marks.

In this section, we study the cascade effect, which can be defined as the general statistical properties of the next generation descendants triggered by a certain event, such as stationarity, the duration interval, the expected number of descendants, and the distribution of the total loss triggered by the descendants. Among these properties, we focus on stationarity first. Helmstetter and Sornette (2002) study the average number of next generation descendants, defined by the branching coefficient $n$. The branching process is stationary when $n < 1$; otherwise, the number of triggered descendants is explosive and approaches infinity.

Proposition 5. If the intensity of the occurrences is specified in (1) and (2) with the impact and decay functions as in (3) and (4) respectively, and the CDF of extreme risks follows (6), the model has the branching coefficient $n = \frac{\rho}{2\gamma}$. The stationarity condition is $\frac{\rho}{2\gamma} < 1$.

Proof. See Appendix A. □

The proof of Proposition 5 is similar to the proof of Proposition 2.1 in Grothe et al. (2014). Note that the form of the impact function in Chavez-Demoulin and McGill (2012) makes the branching process nonstationary; thus, in our model, we use the impact function modified from Grothe et al. (2014). According to Table 2, the branching coefficient of our model is 0.4829, which satisfies the stationarity condition ($n < 1$).

In the framework of the branching process, the immigrants and descendants of different generations are well identified. Marsan and Lengliné (2008) and Zhuang et al. (2002) design two methods to classify the immigrants and descendants of different generations in Hawkes processes. Both classification algorithms are based on the ''thinning probabilities'' defined as follows

$$\eta_{ik} = \frac{\rho\, \delta\left(X_{T_i}\right) e^{-\gamma\left(T_k - T_i\right)}}{\tau\left(T_k\right)} \tag{14}$$

where $0 < i < k$.
$\eta_{ik}$ stands for the probability that the $i$th event triggers the $k$th event. Thus, the probability $\eta_{0k}$, which denotes the probability that the $k$th event is triggered by the ground intensity, can be expressed as

$$\eta_{0k} = 1 - \sum_{i=1}^{k-1} \eta_{ik} \tag{15}$$

Following Marsan and Lengliné (2008) and Zhuang et al. (2002), we can classify the immigrants and descendants of different generations via the following procedure:

Algorithm 1.
(1) Compute $\eta_{0k}$ and $\eta_{ik}$ for the $k$th extreme risk in the process.
(2) If $\eta_{0k} > 0.5$, classify the $k$th event as an immigrant with generation 0 and go to step (4). Otherwise, go to the next step.
(3) Find the extreme risk with the largest $\eta_{ik}$ over all $i < k$, and classify the $k$th event as its next generation.
(4) If $k = N_{u,T}$, stop the algorithm. Otherwise, go back to step (1) and analyze the next extreme risk.
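Algorithm 1 can be sketched directly from the thinning probabilities, written here with the factor ρ so that the contributions of past events and the ground intensity sum to the fitted intensity. The fitted quantities (impact values and intensities at the event times) are assumed given, and the toy events are illustrative:

```python
import numpy as np

def classify_events(times, deltas, tau, rho, gamma):
    """Algorithm 1: split exceedances into immigrants and descendants.

    `deltas` holds the impact values delta(X_{T_i}) and `tau` the fitted
    intensities tau(T_k) at the event times (both from the estimated
    model).  Returns, per event, its parent index (-1 for immigrants)
    and its generation.
    """
    n = len(times)
    parent = np.full(n, -1)
    generation = np.zeros(n, dtype=int)
    for k in range(1, n):
        # thinning probabilities of Eq. (14) for all earlier events i < k
        eta = rho * deltas[:k] * np.exp(-gamma * (times[k] - times[:k])) / tau[k]
        if 1.0 - eta.sum() > 0.5:          # Eq. (15): immigrant
            continue
        i = int(np.argmax(eta))            # most likely triggering event
        parent[k], generation[k] = i, generation[i] + 1
    return parent, generation

# Toy history: a tight pair of events followed by a distant one.  The
# second event is classified as a first-generation descendant of the
# first, while the distant event starts a new family as an immigrant.
times = np.array([0.0, 0.5, 100.0])
deltas = np.array([0.9, 0.9, 0.9])
tau0, rho, gamma = 0.04, 0.5, 0.06
tau = np.array([tau0 + rho * np.sum(deltas[:k] * np.exp(-gamma * (times[k] - times[:k])))
                for k in range(3)])
parent, generation = classify_events(times, deltas, tau, rho, gamma)
assert parent.tolist() == [-1, 0, -1]
assert generation.tolist() == [0, 1, 0]
```

Grouping events by the root of their parent chain recovers the families plotted in Fig. 7.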

Fig. 7 presents the different categories of the extreme risks based on Algorithm 1. Different colors represent different families; an immigrant together with its descendants is plotted in the same color. Based on Algorithm 1, the 109 exceedances are categorized into 60 families, with 1.8167 exceedances per family on average. In our model, the branching coefficient is 0.4829, which implies an average of 1.9339 exceedances per family. One can see that our model provides a very close estimate of the average number of exceedances. Moreover, in Fig. 7, there is significant clustering of extreme risks during the second half of 2015, which is consistent with our previous VaR and ES estimates presented in Fig. 4. In our model, an extreme risk may trigger descendants regardless of its identity, so it is important to study the cascade effect. For simplicity, we assume that an extreme risk occurs at time 0, with magnitude $X$. We only consider its next generation descendants,

J. Ji, D. Wang, D. Xu et al.

Journal of Empirical Finance 57 (2020) 52–70

Fig. 7. Different colors represent different families of extreme risks. An immigrant together with its descendants is plotted in the same color. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

based on the following three aspects: (a) the duration over which the next-generation descendants may be triggered; (b) the number of next-generation descendants in the duration; and (c) the total loss triggered by the next-generation descendants in the duration.11

Proposition 6. Consider the influence of an extreme risk occurring at time 0 with magnitude $X_0$. The Hawkes kernel which only considers this extreme risk can be written as $v(t) \triangleq \delta\left(X_0\right) g(t) = \delta e^{-\gamma t}$. The duration that triggers the next-generation descendants has the form
$$T^* = -\frac{1}{\gamma}\log\left(\frac{\varepsilon}{\rho\delta}\right) \tag{16}$$
where $\varepsilon$ stands for the tolerance. Then, the expectation and variance of the number of next-generation descendants in the duration can be derived as
$$\mathbf{E}\left[N_{X_0}\left(0, T^*\right)\right] = \frac{\rho\delta\left(1 - e^{-\gamma T^*}\right)}{\gamma} \tag{17}$$
$$\mathbf{D}\left[N_{X_0}\left(0, T^*\right)\right] = \frac{\rho\delta\left(1 - e^{-\gamma T^*}\right)}{\gamma} \tag{18}$$

For simplicity, we set
(i) $\bar{\psi} = \psi + \alpha\delta e^{-\gamma t}$;
(ii) $f\left(\theta_1, \theta_2\right) = \left(1 + \xi\frac{\theta_1 - \mu}{\bar{\psi}}\right)^{-\frac{1}{\xi} + \theta_2}$.

Finally, the expectation and variance of the total loss triggered by the next-generation descendants in the duration can be written as
$$\mathbf{E}\left[X_t N_{X_0}\left(0, T^*\right)\,\big|\,X_0\right] = \int_0^{T^*} A\rho\delta e^{-\gamma t}\,dt \tag{19}$$
$$\mathbf{D}\left[X_t N_{X_0}\left(0, T^*\right)\,\big|\,X_0\right] = \int_0^{T^*} \left[B\rho\delta e^{-\gamma t} + \left(B - A^2\right)\rho^2\delta^2 e^{-2\gamma t}\right]dt \tag{20}$$
where
(i) $A = \dfrac{\phi f(\phi,0) - u f(u,0) - \frac{\bar{\psi}}{\xi-1}\left[f(\phi,1) - f(u,1)\right]}{f(\phi,0) - f(u,0)}$;
(ii) $B = \dfrac{\phi^2 f(\phi,0) - u^2 f(u,0) - \frac{2\bar{\psi}}{\xi-1}\left\{\phi f(\phi,1) - u f(u,1) - \frac{\bar{\psi}}{2\xi-1}\left[f(\phi,2) - f(u,2)\right]\right\}}{f(\phi,0) - f(u,0)}$.

11 The closed-form solution for the distribution of the total loss triggered by the next-generation descendants in the duration is not directly available. In the analysis of the cascade effect, the occurrences and the magnitudes are independent. Thus, the total loss in each specific case can be simulated in three steps: (i) simulate the occurrences via the thinning algorithm (Lewis and Shedler, 1979); (ii) simulate the magnitudes from the PDF at each occurrence time; (iii) compute the total loss. The distribution of the total loss is then obtained by collecting the total losses over all simulated cases.
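The three simulation steps of footnote 11 can be sketched as follows. The function names and parameter choices are ours; the inverse CDF follows from Eq. (6), and the thinning step exploits the fact that the kernel $\rho\delta e^{-\gamma t}$ is decreasing, so $\rho\delta$ bounds the intensity.

```python
import numpy as np

rng = np.random.default_rng(0)

def inv_trunc_gpd(p, u, phi, mu, xi, psi_bar):
    """Inverse of the truncated-GPD CDF in Eq. (6) on [u, phi]."""
    a = lambda x: (1.0 + xi * (x - mu) / psi_bar) ** (-1.0 / xi)
    target = a(u) + p * (a(phi) - a(u))
    return mu + (psi_bar / xi) * (target ** (-xi) - 1.0)

def simulate_total_loss(rho, delta, gamma, u, phi, mu, xi, psi, alpha, eps=1e-3):
    """One simulated draw of the total loss from next-generation descendants,
    following steps (i)-(iii) of footnote 11 (a sketch)."""
    t_star = -np.log(eps / (rho * delta)) / gamma
    lam_max = rho * delta              # decreasing kernel => lambda(0) is a bound
    t, total = 0.0, 0.0
    while True:                        # (i) Lewis-Shedler thinning
        t += rng.exponential(1.0 / lam_max)
        if t > t_star:
            break
        if rng.uniform() <= np.exp(-gamma * t):   # accept w.p. lambda(t)/lam_max
            psi_bar = psi + alpha * delta * np.exp(-gamma * t)
            total += inv_trunc_gpd(rng.uniform(), u, phi, mu, xi, psi_bar)  # (ii)
    return total                       # (iii)
```

Repeating `simulate_total_loss` many times yields the simulated distribution of the total loss.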

Fig. 8. Each red cross and blue dot represent, respectively, the expectation and standard deviation of the total loss triggered by the next-generation descendants of a certain extreme risk in the duration. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Proof. See Appendix A. □

Proposition 6 provides closed-form expressions for the three aspects of the cascade effect mentioned earlier. The derivations of Eqs. (16)–(18) address the first two aspects. In practice, the third aspect is of great significance because it measures the overall impact triggered by a certain risk via both the expectation and the variance. Different from the VaR or ES, Eq. (19) measures the total loss of the next-generation descendants caused by a certain extreme risk in the duration, and Eq. (20) measures the dispersion of that overall impact.

Fig. 8 shows the result for the third aspect based on our in-sample dataset. In particular, we set the tolerance $\varepsilon$ to 0.001, a relatively small value, so that the duration is long enough to contain the overall impact triggered by the next-generation descendants. Moreover, when computing Eqs. (19) and (20) for an extreme risk, the duration is split into 10,000 equidistant pieces, which is sufficient to approximate the corresponding theoretical value. Each red cross and blue dot represent, respectively, the expectation and standard deviation of the total loss triggered by the next-generation descendants of a certain extreme risk in the duration. Note that the standard deviations are always higher than the expectations, indicating that this dispersion should not be ignored.

Normally, a family of extreme risks contains only one to two members (see Fig. 7). However, we also observe some ``extreme'' cases in which a family contains more than five extreme risks. For instance, one can see a clear cluster of crashes in the second half of 2015. Interestingly, we find that during that period most of the stocks were overpriced. The China Securities Regulatory Commission carried out rigid market regulations that restricted the capital liquidity of the equity market. Market panic began to spread when the first extreme risk occurred, and a series of extreme risks was subsequently triggered.12

5. Conclusion

This paper proposes a general framework that combines the Hawkes process with the truncated GPD to analyze extreme risks in financial markets with price limits. For empirical illustration, the proposed model is applied to the China stock market with its ±10% price restrictions. We estimate the model via maximum likelihood estimation and design post-estimation diagnostic tests to validate the model assumptions. We also forecast the out-of-sample one-step-ahead VaR and ES measures. In general, the results indicate good performance in both in-sample fitting and out-of-sample risk forecasting. Finally, we separate the exogenous risks from the endogenous risks and investigate the cascade effect triggered by the endogenous influence of a specific extreme risk from different aspects, such as the duration of influence and the total loss triggered by its next-generation descendants.

In future research, we would consider a high-dimensional extension of the proposed model and study risk contagion between different stock markets. Moreover, by invoking the spatial Hawkes process from the earthquake literature, it would also be interesting to construct a self-exciting model that takes trading volumes into account.

CRediT authorship contribution statement

Jingru Ji: Conceptualization, Methodology, Software, Writing - original draft, Writing - review & editing, Funding acquisition. Donghua Wang: Conceptualization, Writing - review & editing, Supervision, Project administration, Funding acquisition. Dinghai Xu: Methodology, Writing - original draft, Writing - review & editing, Project administration, Funding acquisition. Chi Xu: Funding acquisition.

12 We believe that in practice it will be hard to forecast the cascade effect accurately without taking some macro/policy variables into account. We leave this interesting extension for further research.

Acknowledgment

All authors warmly thank the editor, Professor Valkanov, the associate editor and two anonymous referees for their valuable comments and suggestions on earlier versions of the paper. This research is supported by the National Science Foundation of China [grant numbers: 71171083 and 71771087]. Jingru Ji is also grateful to Shenwan Hongyuan Securities and the Shanghai Post-doctoral Excellence Program [grant number: 2019086]. Dinghai Xu would like to acknowledge the financial support from the International Research Partnership Grant (IRPG) at the University of Waterloo. All errors remain ours.

Appendix A. Proofs

Proof of Proposition 1. CDF and PDF of Exceedances

Proof. Based on Eq. (5), we can obtain
$$\bar{F}_{u,t}(x) = P\left(X_t > x \,\big|\, u \le X_t \le \phi, \mathcal{H}_t\right)
= \left[1 - \frac{1 - \left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{1 - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}\right]\left[1 - \frac{1 - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{1 - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}\right]^{-1}
= \frac{\left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}} \tag{A.1}$$

Therefore, $F_{u,t}(x) = 1 - \bar{F}_{u,t}(x)$, which is equivalent to
$$F_{u,t}(x) = \frac{\left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}.$$

Then, the PDF of $F_{u,t}(x)$ satisfies
$$\int_u^x f_{u,t}(s)\,ds = \frac{\left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}} \tag{A.2}$$

Taking the derivative with respect to $x$ on both sides, we get
$$f_{u,t}(x) = \frac{-\left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}-1}}{\left[\psi+\alpha v(t)\right]\left[\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}\right]} \qquad\square$$
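The truncated-GPD CDF and PDF derived in Proposition 1 admit a quick numerical sanity check. The sketch below is ours, with `psi_t` standing for $\psi + \alpha v(t)$ and the parameter values purely illustrative; $F_{u,t}(u)=0$, $F_{u,t}(\phi)=1$, and the density integrates to one over $[u,\phi]$.

```python
import numpy as np

def trunc_gpd_cdf(x, u, phi, mu, xi, psi_t):
    """F_{u,t}(x) from the proof of Proposition 1; psi_t = psi + alpha*v(t)."""
    a = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    return (a(x) - a(u)) / (a(phi) - a(u))

def trunc_gpd_pdf(x, u, phi, mu, xi, psi_t):
    """Matching density f_{u,t}(x) from Eq. (A.2)."""
    a = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    return -(1.0 + xi * (x - mu) / psi_t) ** (-1.0 / xi - 1.0) \
        / (psi_t * (a(phi) - a(u)))
```

Both the numerator and the denominator of the density are negative (the GPD survival term is decreasing in $x$), so the density itself is positive on $[u, \phi]$.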

Proof of Proposition 2. The Log-likelihood Function

Proof. The interval between occurrence times (denoted as $\Delta T_i = T_{i+1} - T_i$) should follow an exponential distribution with time-varying parameter $\tau\left(T_i\right)$. Thus, substituting $\Delta T_i$ into its PDF, we can get
$$\lambda_{u,T_i}\left(\Delta T_i\right) = \left[\tau_0 + \rho v\left(T_i\right)\right] e^{-\tau_0\left(T_{i+1}-T_i\right) - \rho\int_{T_i}^{T_{i+1}} v(s)\,ds} \tag{A.3}$$
where $T_{N_{u,T}+1} = T$. Combining Eq. (7) with (A.3), we obtain the log-likelihood function $l$ as follows
$$l = \log\left[\prod_{i=1}^{N_{u,T}} \lambda_{u,T_i}\left(\Delta T_i\right) f_{u,T_i}\left(X_{T_i}\right)\right]$$
$$= \log\left\{\prod_{i=1}^{N_{u,T}} \left[\tau_0 + \rho v\left(T_i\right)\right]\exp\left[-\tau_0\left(T_{i+1}-T_i\right) - \rho\int_{T_i}^{T_{i+1}} v(s)\,ds\right]\right\} + \log\left\{\prod_{i=1}^{N_{u,T}} \frac{-\left(1+\xi\frac{X_{T_i}-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}-1}}{\left[\psi+\alpha v\left(T_i\right)\right]\left[\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}}\right]}\right\}$$
$$= -\tau_0 T - \rho\int_0^T v(s)\,ds + \sum_{i=1}^{N_{u,T}} \log\left\{\frac{-\left[\tau_0 + \rho v\left(T_i\right)\right]\left(1+\xi\frac{X_{T_i}-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}-1}}{\left[\psi+\alpha v\left(T_i\right)\right]\left[\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}}\right]}\right\} \qquad\square$$
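The log-likelihood of Proposition 2 can be coded directly. The sketch below is our own, not the authors' implementation: it takes the impact $\delta_i$ to be the truncated-GPD CDF evaluated at the mark (as in Appendix B) and uses the closed-form compensator $\int_0^T v(s)\,ds = \sum_i \delta_i\left(1-e^{-\gamma(T-T_i)}\right)/\gamma$.

```python
import numpy as np

def log_likelihood(times, marks, T, u, phi, params):
    """Log-likelihood of Proposition 2 (a sketch), with
    v(t) = sum_{T_j < t} delta_j * exp(-gamma (t - T_j))."""
    tau0, rho, psi, alpha, mu, xi, gamma = params
    times, marks = np.asarray(times), np.asarray(marks)
    delta = np.empty(len(times))
    ll = -tau0 * T
    for i in range(len(times)):
        v = np.sum(delta[:i] * np.exp(-gamma * (times[i] - times[:i])))
        psi_t = psi + alpha * v
        a = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
        dens = -(1.0 + xi * (marks[i] - mu) / psi_t) ** (-1.0 / xi - 1.0) \
            / (psi_t * (a(phi) - a(u)))
        ll += np.log((tau0 + rho * v) * dens)
        delta[i] = (a(marks[i]) - a(u)) / (a(phi) - a(u))  # impact delta_i
    # compensator: -rho * integral_0^T v(s) ds in closed form
    ll -= rho * np.sum(delta * (1.0 - np.exp(-gamma * (T - times))) / gamma)
    return ll
```

Maximizing this function over the seven parameters reproduces the estimation step; the paper uses a standard maximum likelihood algorithm for this purpose.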

Proof of Proposition 3. Diagnostic Residuals

Proof. Under assumption (i), the intervals between occurrence times should follow an exponential distribution with time-varying parameter $\tau\left(T_i\right)$. Therefore, substituting $\Delta T_i$ into its CDF, we can get
$$\Lambda_{u,T_i}\left(\Delta T_i\right) = 1 - \exp\left[-\hat\tau_0\left(T_{i+1}-T_i\right) - \hat\rho\int_{T_i}^{T_{i+1}} v(s)\,ds\right] \tag{A.4}$$
where $i = 1,2,\dots,N_{u,T}$ and $T_{N_{u,T}+1} = T$. Thus, $\chi_i \triangleq \Lambda_{u,T_i}\left(\Delta T_i\right) \sim U(0,1)$.

Under assumption (ii), the CDF of $X_{T_i}$ should be Eq. (6). Therefore, substituting $X_{T_i}$ into its CDF, we can obtain
$$F_{u,T_i}\left(X_{T_i}\right) = \frac{\left(1+\hat\xi\frac{X_{T_i}-\hat\mu}{\hat\psi+\hat\alpha\hat v(T_i)}\right)^{-\frac{1}{\hat\xi}} - \left(1+\hat\xi\frac{u-\hat\mu}{\hat\psi+\hat\alpha\hat v(T_i)}\right)^{-\frac{1}{\hat\xi}}}{\left(1+\hat\xi\frac{\phi-\hat\mu}{\hat\psi+\hat\alpha\hat v(T_i)}\right)^{-\frac{1}{\hat\xi}} - \left(1+\hat\xi\frac{u-\hat\mu}{\hat\psi+\hat\alpha\hat v(T_i)}\right)^{-\frac{1}{\hat\xi}}} \tag{A.5}$$
where $i = 1,2,\dots,N_{u,T}$. Thus, $m_i \triangleq F_{u,T_i}\left(X_{T_i}\right) \sim U(0,1)$. □

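If the model is well specified, both residual series $\{\chi_i\}$ and $\{m_i\}$ should behave as i.i.d. $U(0,1)$ draws. A minimal uniformity check via the one-sample Kolmogorov–Smirnov statistic (our own helper, not part of the paper):

```python
import numpy as np

def ks_uniform(sample):
    """One-sample Kolmogorov-Smirnov statistic against U(0,1), used to check
    that the residuals chi_i and m_i of Proposition 3 look uniform."""
    s = np.sort(np.asarray(sample))
    n = len(s)
    d_plus = np.max(np.arange(1, n + 1) / n - s)   # ECDF above the sample
    d_minus = np.max(s - np.arange(n) / n)         # ECDF below the sample
    return max(d_plus, d_minus)
```

At the 5% level, uniformity is rejected when the statistic exceeds roughly $1.36/\sqrt{n}$ for moderate to large $n$.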
Proof of Proposition 4. VaR and ES

Proof. First, we consider the VaR of the model. Based on Eq. (6), we can obtain
$$\frac{\bar F_{\mu,t}(x)}{\bar F_{\mu,t}(u)} = \frac{\left(1+\xi\frac{x-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}} \tag{A.6}$$
which can be expressed as
$$1 - p = \left[\tau_0 + \rho v(t)\right]\frac{\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(t)}\right)^{-\frac{1}{\xi}}} \tag{A.7}$$
For simplicity, we set $\tilde\psi(t) = \psi + \alpha v(t)$. Thus, we can derive the expression of $VaR^p_{t+1}$ from (A.7) and obtain
$$VaR^p_{t+1} = \mu - \frac{\tilde\psi(t)}{\xi} + \frac{\tilde\psi(t)}{\xi}\left\{\frac{1-p}{\tau(t)}\left[\left(1+\xi\frac{u-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right] + \left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right\}^{-\xi}.$$

Then, we consider the ES. To obtain the expression of the ES, we carry out two steps. First, we derive the survival CDF $\bar F_{v,t}(x)$ conditional on $u \le v \le \phi$ and $\mathcal H_t$:
$$\bar F_{v,t}(x) = \frac{\bar F_{\mu,t}(x)}{\bar F_{\mu,t}(v)} = \frac{\bar F_{\mu,t}(x)}{\bar F_{\mu,t}(u)}\,\frac{\bar F_{\mu,t}(u)}{\bar F_{\mu,t}(v)} = \frac{\bar F_{u,t}(x)}{\bar F_{u,t}(v)} = \frac{\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{v-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}} \tag{A.8}$$
with its PDF as follows
$$f_{v,t}(x) = \frac{-\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}-1}}{\tilde\psi(t)\left[\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{v-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right]} \tag{A.9}$$

Then, we move on to the second step: deriving the ES from (A.9). This process is demonstrated as follows
$$ES^p_{t+1} = \mathbf{E}\left(x \,\big|\, VaR^p_{t+1}\le x\le\phi, \mathcal H_t\right) = \int_{VaR^p_{t+1}}^{\phi} x\,\frac{-\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}-1}}{\tilde\psi(t)\left[\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right]}\,dx$$
$$= -\frac{\int_{VaR^p_{t+1}}^{\phi} x\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}-1} dx}{\tilde\psi(t)\left[\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right]} \tag{A.10}$$

Now, we consider $\int_{VaR^p_{t+1}}^{\phi} x\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}-1} dx$. Invoking integration by parts, we get
$$\int_{VaR^p_{t+1}}^{\phi} x\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}-1} dx = -\tilde\psi(t)\int_{VaR^p_{t+1}}^{\phi} x\,d\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}$$
$$= -\tilde\psi(t)\left[\phi\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - VaR^p_{t+1}\left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right] + \tilde\psi(t)\int_{VaR^p_{t+1}}^{\phi}\left(1+\xi\frac{x-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} dx$$
$$= -\tilde\psi(t)\left[\phi\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - VaR^p_{t+1}\left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}\right] + \frac{\tilde\psi^2(t)}{\xi-1}\left[\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1}\right] \tag{A.11}$$

Substituting (A.11) into (A.10), we obtain
$$ES^p_{t+1} = \frac{\phi\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - VaR^p_{t+1}\left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}} - \frac{\tilde\psi(t)}{\xi-1}\,\frac{\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}+1}}{\left(1+\xi\frac{\phi-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{VaR^p_{t+1}-\mu}{\tilde\psi(t)}\right)^{-\frac{1}{\xi}}} \qquad\square$$



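The closed-form VaR and ES above can be evaluated directly. In the sketch below (our own, with illustrative parameter names), `tau_t` is the current intensity $\tau(t)$ and `psi_t` is $\tilde\psi(t) = \psi + \alpha v(t)$.

```python
import numpy as np

def var_es(p, tau_t, u, phi, mu, xi, psi_t):
    """Closed-form VaR and ES of Proposition 4 (a sketch)."""
    a = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi)
    b = lambda y: (1.0 + xi * (y - mu) / psi_t) ** (-1.0 / xi + 1.0)
    # invert (A.7): a(VaR) = (1-p)/tau * [a(u) - a(phi)] + a(phi)
    inner = (1.0 - p) / tau_t * (a(u) - a(phi)) + a(phi)
    var = mu - psi_t / xi + (psi_t / xi) * inner ** (-xi)
    # ES as the conditional mean on [VaR, phi]
    denom = a(phi) - a(var)
    es = (phi * a(phi) - var * a(var)) / denom \
        - psi_t / (xi - 1.0) * (b(phi) - b(var)) / denom
    return var, es
```

By construction the VaR lies in $(u, \phi)$ whenever $0 < (1-p)/\tau(t) < 1$, and the ES, being a conditional mean over $[VaR^p_{t+1}, \phi]$, lies between the VaR and the truncation point $\phi$.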
Proof of Proposition 5. Stationarity Condition

Proof. Following Daley and Vere-Jones (2005), our model is stationary under the condition that $\mathbf{E}[\tau(t)] = \bar\tau \in (0, +\infty)$. From Eqs. (1) and (2), we obtain
$$\bar\tau = \tau_0 + \rho\,\mathbf{E}\left[\int_{-\infty}^{t}\delta\left(X_s\right) g(t-s)\,dN(s)\right] \tag{A.12}$$
According to Daley and Vere-Jones (2005), $n = \rho\,\mathbf{E}\left[\int_{-\infty}^{t}\delta\left(X_s\right) g(t-s)\,dN(s)\right]\big/\bar\tau$.
$$\bar\tau = \tau_0 + \rho\int_{-\infty}^{t} g(t-s)\,\mathbf{E}\left[\delta\left(X_s\right)dN(s)\right] = \tau_0 + \rho\int_{-\infty}^{t} g(t-s)\,\mathbf{E}\left\{\mathbf{E}\left[\delta\left(X_s\right)dN(s)\,\big|\,\mathcal H_s\right]\right\}$$
$$= \tau_0 + \rho\int_{-\infty}^{t} g(t-s)\,\mathbf{E}\left[\int_u^{\phi} F_{u,s}(x)\,f_{u,s}(x)\,dx\;\tau(s)\right]ds = \tau_0 + \frac{\rho\bar\tau}{2}\int_{-\infty}^{t} g(t-s)\,ds = \tau_0 + \frac{\rho\bar\tau}{2}\int_{-\infty}^{t} e^{-\gamma(t-s)}\,ds = \tau_0 + \frac{\rho\bar\tau}{2\gamma} \tag{A.13}$$
Thus, it is easy to derive that
$$\bar\tau = \frac{\tau_0}{1 - \frac{\rho}{2\gamma}} \tag{A.14}$$
Under the assumption of stationarity, we obtain
$$n = \frac{\rho}{2\gamma} < 1 \qquad\square$$

Proof of Proposition 6. The Cascade Effect of a Given Extreme Risk

Proof. First, we consider the duration over which the next-generation descendants may be triggered. Based on our model setting, we get $v(t) = \delta e^{-\gamma t}$. Note that $\rho\delta e^{-\gamma t}\searrow 0$ as $t\to\infty$, meaning that the influence decays as time increases. Setting $\rho\delta e^{-\gamma t}\le\varepsilon$ gives $t \ge -\frac{1}{\gamma}\log\left(\frac{\varepsilon}{\rho\delta}\right)$, which is equivalent to
$$T^* = -\frac{1}{\gamma}\log\left(\frac{\varepsilon}{\rho\delta}\right),$$
where $\varepsilon$ stands for the tolerance.

Second, we consider the expectation and variance of the number of next-generation descendants in the duration:
$$\mathbf{E}\left[N_{X_0}\left(0,T^*\right)\right] = \int_0^{T^*}\rho\delta e^{-\gamma t}\,dt = \frac{\rho\delta\left(1-e^{-\gamma T^*}\right)}{\gamma} \tag{A.15}$$
$$\mathbf{D}\left[N_{X_0}\left(0,T^*\right)\right] = \int_0^{T^*}\rho\delta e^{-\gamma t}\,dt + \int_0^{T^*}\!\!\int_0^{T^*}\left\{\mathbf{E}\left[\frac{dN(s)}{ds}\frac{dN(t)}{dt}\right] - \mathbf{E}\left[\frac{dN(s)}{ds}\right]\mathbf{E}\left[\frac{dN(t)}{dt}\right]\right\}ds\,dt = \mathbf{E}\left[N_{X_0}\left(0,T^*\right)\right] + 0 = \frac{\rho\delta\left(1-e^{-\gamma T^*}\right)}{\gamma} \tag{A.16}$$
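The quantities in Proposition 5 and in Eqs. (16)–(17) are cheap to compute. A sketch (our own helper; the parameter values in the test are only illustrative, chosen so that $\rho/(2\gamma)$ equals the paper's reported branching coefficient):

```python
import numpy as np

def branching_stats(rho, gamma, delta, eps=1e-3):
    """Branching coefficient n = rho/(2*gamma) (Proposition 5), the implied
    mean family size 1/(1-n), and the duration T* and expected number of
    next-generation descendants from Eqs. (16)-(17)."""
    n = rho / (2.0 * gamma)
    assert n < 1.0, "nonstationary branching process"
    t_star = -np.log(eps / (rho * delta)) / gamma
    e_desc = rho * delta * (1.0 - np.exp(-gamma * t_star)) / gamma
    return n, 1.0 / (1.0 - n), t_star, e_desc
```

With $n = 0.4829$, the mean family size $1/(1-n) \approx 1.9339$, which is the value reported in the text for the average number of exceedances per family.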

Third, we consider the expectation and variance of the total loss triggered by the next-generation descendants in the duration:
$$\mathbf{E}\left[X_t N_{X_0}\left(0,T^*\right)\big|X_0\right] = \mathbf{E}\left[\int_0^{T^*}\frac{X_t\,dN_{X_0}(t)}{dt}\,\Big|\,X_0\,dt\right] = \int_0^{T^*}\mathbf{E}\left[X_t\big|X_0\right]\rho\delta e^{-\gamma t}\,dt \tag{A.17}$$
Consider $\mathbf{E}\left[X_t\big|X_0\right]$. For simplicity, we denote
(i) $\bar\psi = \psi + \alpha\delta e^{-\gamma t}$;
(ii) $f\left(\theta_1,\theta_2\right) = \left(1+\xi\frac{\theta_1-\mu}{\bar\psi}\right)^{-\frac{1}{\xi}+\theta_2}$.
Invoking Eq. (13), we obtain
$$\mathbf{E}\left[X_t\big|X_0\right] = \frac{\phi f(\phi,0) - u f(u,0) - \frac{\bar\psi}{\xi-1}\left[f(\phi,1)-f(u,1)\right]}{f(\phi,0)-f(u,0)} \triangleq A \tag{A.18}$$
Substituting (A.18) into (A.17), we get
$$\mathbf{E}\left[X_t N_{X_0}\left(0,T^*\right)\big|X_0\right] = \int_0^{T^*} A\rho\delta e^{-\gamma t}\,dt.$$

Based on our model setting, we conclude that for all $s\ne t$ the random variables $\frac{X_s\,dN_{X_0}(s)}{ds}$ and $\frac{X_t\,dN_{X_0}(t)}{dt}$ are independent. Thus, the variance of the total loss can be derived as follows
$$\mathbf{D}\left[X_t N_{X_0}\left(0,T^*\right)\big|X_0\right] = \int_0^{T^*}\mathbf{D}\left[X_t N_{X_0}(t)\big|X_0\right]dt$$
$$= \int_0^{T^*}\mathbf{E}\left\{\left[\frac{X_t\,dN_{X_0}(t)}{dt}\right]^2\Big|X_0\right\}dt - \int_0^{T^*}\mathbf{E}^2\left[\frac{X_t\,dN_{X_0}(t)}{dt}\Big|X_0\right]dt$$
$$= \int_0^{T^*}\mathbf{E}\left[X_t^2\big|X_0\right]\mathbf{E}\left[\left(\frac{dN_{X_0}(t)}{dt}\right)^2\Big|X_0\right]dt - \int_0^{T^*}\mathbf{E}^2\left[X_t\big|X_0\right]\mathbf{E}^2\left[\frac{dN_{X_0}(t)}{dt}\Big|X_0\right]dt$$
$$= \int_0^{T^*}\mathbf{E}\left[X_t^2\big|X_0\right]\left\{\mathbf{E}^2\left[\frac{dN_{X_0}(t)}{dt}\Big|X_0\right] + \mathbf{D}\left[\frac{dN_{X_0}(t)}{dt}\Big|X_0\right]\right\}dt - \int_0^{T^*} A^2\rho^2\delta^2 e^{-2\gamma t}\,dt$$
$$= \int_0^{T^*}\mathbf{E}\left[X_t^2\big|X_0\right]\left(\rho^2\delta^2 e^{-2\gamma t} + \rho\delta e^{-\gamma t}\right)dt - \int_0^{T^*} A^2\rho^2\delta^2 e^{-2\gamma t}\,dt \tag{A.19}$$

Consider $\mathbf{E}\left[X_t^2\big|X_0\right]$:
$$\mathbf{E}\left[X_t^2\big|X_0\right] = \int_u^{\phi} x^2\,\frac{-f(x,-1)}{\bar\psi\left[f(\phi,0)-f(u,0)\right]}\,dx = -\frac{\int_u^{\phi} x^2 f(x,-1)\,dx}{\bar\psi\left[f(\phi,0)-f(u,0)\right]} \tag{A.20}$$
Consider $\int_u^{\phi} x^2 f(x,-1)\,dx$:
$$\int_u^{\phi} x^2 f(x,-1)\,dx = -\bar\psi\int_u^{\phi} x^2\,df(x,0) = -\bar\psi\left[\phi^2 f(\phi,0) - u^2 f(u,0) - 2\int_u^{\phi} x f(x,0)\,dx\right]$$
$$= -\bar\psi\left\{\phi^2 f(\phi,0) - u^2 f(u,0) - \frac{2\bar\psi}{\xi-1}\left[\phi f(\phi,1) - u f(u,1) - \int_u^{\phi} f(x,1)\,dx\right]\right\}$$
$$= -\bar\psi\left\{\phi^2 f(\phi,0) - u^2 f(u,0) - \frac{2\bar\psi}{\xi-1}\left[\phi f(\phi,1) - u f(u,1) - \frac{\bar\psi}{2\xi-1}\left(f(\phi,2)-f(u,2)\right)\right]\right\} \tag{A.21}$$
Substituting (A.21) into (A.20), we get
$$\mathbf{E}\left[X_t^2\big|X_0\right] = \frac{\phi^2 f(\phi,0) - u^2 f(u,0) - \frac{2\bar\psi}{\xi-1}\left\{\phi f(\phi,1) - u f(u,1) - \frac{\bar\psi}{2\xi-1}\left[f(\phi,2)-f(u,2)\right]\right\}}{f(\phi,0)-f(u,0)} \triangleq B \tag{A.22}$$

Substituting (A.22) into (A.19), we get
$$\mathbf{D}\left[X_t N_{X_0}\left(0,T^*\right)\big|X_0\right] = \int_0^{T^*}\left[B\rho\delta e^{-\gamma t} + \left(B-A^2\right)\rho^2\delta^2 e^{-2\gamma t}\right]dt \qquad\square$$

Appendix B. Hessian matrix

In this paper, we use the outer product of gradients to compute the Hessian matrix. Therefore, Appendix B provides the first-order derivatives of the log-likelihood function.
Denote $\boldsymbol\theta = \left(\tau_0,\rho,\psi,\alpha,\mu,\xi,\gamma\right)'$.
$$l_k = -\tau_0 T_{k+1} - \rho\int_0^{T_{k+1}} v(s)\,ds + \sum_{i=1}^{k}\log\left\{\frac{-\left[\tau_0+\rho v\left(T_i\right)\right]\left(1+\xi\frac{X_{T_i}-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}-1}}{\left[\psi+\alpha v\left(T_i\right)\right]\left[\left(1+\xi\frac{\phi-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha v(T_i)}\right)^{-\frac{1}{\xi}}\right]}\right\} \tag{B.1}$$
where
$$v\left(T_i\right) = \sum_{j<i}\delta_j e^{-\gamma\left(T_i-T_j\right)} \tag{B.2}$$
$$\int_0^{T_{k+1}} v(s)\,ds = \sum_{j=1}^{k}\frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma}\sum_{i=1}^{j}\delta_i e^{\gamma T_i} \tag{B.3}$$
$$\delta_i = \frac{\left(1+\xi\frac{X_{T_i}-\mu}{\psi+\alpha\sum_{j<i}\delta_j e^{-\gamma(T_i-T_j)}}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha\sum_{j<i}\delta_j e^{-\gamma(T_i-T_j)}}\right)^{-\frac{1}{\xi}}}{\left(1+\xi\frac{\phi-\mu}{\psi+\alpha\sum_{j<i}\delta_j e^{-\gamma(T_i-T_j)}}\right)^{-\frac{1}{\xi}} - \left(1+\xi\frac{u-\mu}{\psi+\alpha\sum_{j<i}\delta_j e^{-\gamma(T_i-T_j)}}\right)^{-\frac{1}{\xi}}} \tag{B.4}$$

For simplicity, we denote
$$\tilde\psi^i = \psi + \alpha\sum_{j<i}\delta_j e^{-\gamma\left(T_i-T_j\right)} \tag{B.5}$$
$$\tilde\psi_q^i = \frac{\partial\tilde\psi^i}{\partial q} \tag{B.6}$$
$$\tilde\tau^i = \tau_0 + \rho\sum_{j<i}\delta_j e^{-\gamma\left(T_i-T_j\right)} \tag{B.7}$$
$$\tilde\tau_q^i = \frac{\partial\tilde\tau^i}{\partial q} \tag{B.8}$$
where $q\in\{\psi,\alpha,\mu,\xi,\gamma\}$.

Considering (B.5), we can obtain
$$\tilde\psi_\psi^i = \frac{\partial\tilde\psi^i}{\partial\psi} = 1 + \alpha\sum_{j<i}\frac{\partial\delta_j}{\partial\psi}\,e^{-\gamma\left(T_i-T_j\right)} \tag{B.9}$$
$$\tilde\psi_q^i = \frac{\partial\tilde\psi^i}{\partial q} = \alpha\sum_{j<i}\frac{\partial\delta_j}{\partial q}\,e^{-\gamma\left(T_i-T_j\right)},\qquad q\in\{\mu,\xi\} \tag{B.10}$$
$$\tilde\psi_\alpha^i = \frac{\partial\tilde\psi^i}{\partial\alpha} = \sum_{j<i}\delta_j e^{-\gamma\left(T_i-T_j\right)} + \alpha\sum_{j<i}\frac{\partial\delta_j}{\partial\alpha}\,e^{-\gamma\left(T_i-T_j\right)} \tag{B.11}$$
$$\tilde\psi_\gamma^i = \frac{\partial\tilde\psi^i}{\partial\gamma} = \alpha\sum_{j<i}\left[\frac{\partial\delta_j}{\partial\gamma}\,e^{-\gamma\left(T_i-T_j\right)} + \left(T_j-T_i\right)\delta_j e^{-\gamma\left(T_i-T_j\right)}\right] \tag{B.12}$$

Considering (B.7), we can obtain
$$\tilde\tau_q^i = \frac{\partial\tilde\tau^i}{\partial q} = \rho\sum_{j<i}\frac{\partial\delta_j}{\partial q}\,e^{-\gamma\left(T_i-T_j\right)},\qquad q\in\{\psi,\alpha,\mu,\xi\} \tag{B.13}$$
$$\tilde\tau_\gamma^i = \frac{\partial\tilde\tau^i}{\partial\gamma} = \rho\sum_{j<i}\left[\frac{\partial\delta_j}{\partial\gamma}\,e^{-\gamma\left(T_i-T_j\right)} + \left(T_j-T_i\right)\delta_j e^{-\gamma\left(T_i-T_j\right)}\right] \tag{B.14}$$

Denote
$$f^i(x) = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}} \tag{B.15}$$
$$f_q^i(x) = \frac{\partial f^i(x)}{\partial q} \tag{B.16}$$
$$g^i(x) = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-1} \tag{B.17}$$
$$g_q^i(x) = \frac{\partial g^i(x)}{\partial q} \tag{B.18}$$
where $q\in\{\psi,\alpha,\mu,\xi,\gamma\}$.

Considering (B.15), we can obtain
$$\frac{\partial f^i(x)}{\partial q} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-1}\frac{(x-\mu)\,\tilde\psi_q^i}{\left(\tilde\psi^i\right)^2},\qquad q\in\{\psi,\alpha,\gamma\} \tag{B.19}$$
$$\frac{\partial f^i(x)}{\partial\mu} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-1}\left[\frac{1}{\tilde\psi^i} + \frac{(x-\mu)\,\tilde\psi_\mu^i}{\left(\tilde\psi^i\right)^2}\right] \tag{B.20}$$
$$\frac{\partial f^i(x)}{\partial\xi} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}}\left[\frac{1}{\xi^2}\log\left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right) - \frac{1}{\xi}\,\frac{(x-\mu)\tilde\psi^i - \xi(x-\mu)\tilde\psi_\xi^i}{\left(\tilde\psi^i\right)^2 + \xi\tilde\psi^i(x-\mu)}\right] \tag{B.21}$$

Considering (B.17), we can obtain
$$\frac{\partial g^i(x)}{\partial q} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-2}\frac{(1+\xi)(x-\mu)\,\tilde\psi_q^i}{\left(\tilde\psi^i\right)^2},\qquad q\in\{\psi,\alpha,\gamma\} \tag{B.22}$$
$$\frac{\partial g^i(x)}{\partial\mu} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-2}\left[\frac{1+\xi}{\tilde\psi^i} + \frac{(1+\xi)(x-\mu)\,\tilde\psi_\mu^i}{\left(\tilde\psi^i\right)^2}\right] \tag{B.23}$$
$$\frac{\partial g^i(x)}{\partial\xi} = \left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right)^{-\frac{1}{\xi}-1}\left[\frac{1}{\xi^2}\log\left(1+\xi\frac{x-\mu}{\tilde\psi^i}\right) - \frac{(1+\xi)(x-\mu)}{\xi\tilde\psi^i + \xi^2(x-\mu)} + \frac{(1+\xi)(x-\mu)\,\tilde\psi_\xi^i}{\left(\tilde\psi^i\right)^2 + \xi\tilde\psi^i(x-\mu)}\right] \tag{B.24}$$

Consider $\frac{\partial\delta_i}{\partial q}$,

where $q\in\{\psi,\alpha,\mu,\xi,\gamma\}$:
$$\frac{\partial\delta_i}{\partial q} = \frac{\partial}{\partial q}\left[\frac{f^i\left(X_{T_i}\right)-f^i(u)}{f^i(\phi)-f^i(u)}\right] = \frac{f_q^i\left(X_{T_i}\right)-f_q^i(u)}{f^i(\phi)-f^i(u)} - \frac{\left[f^i\left(X_{T_i}\right)-f^i(u)\right]\left[f_q^i(\phi)-f_q^i(u)\right]}{\left[f^i(\phi)-f^i(u)\right]^2} \tag{B.25}$$
Considering $h_k = -\tau_0 T_{k+1} - \rho\sum_{j=1}^{k}\frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma}\sum_{i=1}^{j}\delta_i e^{\gamma T_i}$, we can obtain
$$\frac{\partial h_k}{\partial q} = -\rho\sum_{j=1}^{k}\frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma}\sum_{i=1}^{j}\frac{\partial\delta_i}{\partial q}\,e^{\gamma T_i},\qquad q\in\{\psi,\alpha,\mu,\xi\} \tag{B.26}$$
$$\frac{\partial h_k}{\partial\gamma} = -\rho\sum_{j=1}^{k}\left[\frac{T_{j+1}e^{-\gamma T_{j+1}}-T_j e^{-\gamma T_j}}{\gamma} - \frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma^2}\right]\sum_{i=1}^{j}\delta_i e^{\gamma T_i} - \rho\sum_{j=1}^{k}\frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma}\sum_{i=1}^{j}\left(\frac{\partial\delta_i}{\partial\gamma}\,e^{\gamma T_i} + T_i\delta_i e^{\gamma T_i}\right) \tag{B.27}$$
Finally, we consider $\frac{\partial l_k}{\partial q}$:
$$\frac{\partial l_k}{\partial\tau_0} = -T_{k+1} + \sum_{i=1}^{k}\frac{1}{\tilde\tau^i} \tag{B.28}$$
$$\frac{\partial l_k}{\partial\rho} = -\sum_{j=1}^{k}\frac{e^{-\gamma T_j}-e^{-\gamma T_{j+1}}}{\gamma}\sum_{i=1}^{j}\delta_i e^{\gamma T_i} + \sum_{i=1}^{k}\frac{\sum_{j<i}\delta_j e^{-\gamma\left(T_i-T_j\right)}}{\tilde\tau^i} \tag{B.29}$$
$$\frac{\partial l_k}{\partial q} = \frac{\partial h_k}{\partial q} + \sum_{i=1}^{k}\left[\frac{\tilde\tau_q^i}{\tilde\tau^i} - \frac{\tilde\psi_q^i}{\tilde\psi^i} + \frac{g_q^i\left(X_{T_i}\right)}{g^i\left(X_{T_i}\right)} - \frac{f_q^i(\phi)-f_q^i(u)}{f^i(\phi)-f^i(u)}\right] \tag{B.30}$$
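Given the per-event score vectors assembled from (B.25)–(B.30), the outer-product-of-gradients step itself is a one-liner. A sketch (our own helper; `scores` stacks the gradient contributions $\partial l_i/\partial\boldsymbol\theta$):

```python
import numpy as np

def opg_covariance(scores):
    """Outer-product-of-gradients approximation used in Appendix B: the
    information matrix is sum_i s_i s_i', and its inverse estimates the
    parameter covariance. `scores` has shape (n_events, n_params)."""
    scores = np.asarray(scores)
    info = scores.T @ scores   # OPG approximation to the Hessian
    return np.linalg.inv(info)
```

Standard errors are then the square roots of the diagonal of the returned matrix.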

where $q\in\{\psi,\alpha,\mu,\xi,\gamma\}$.

References

Aït-Sahalia, Y., Cacho-Diaz, J., Laeven, R.J.A., 2015. Modeling financial contagion using mutually exciting jump processes. J. Financ. Econ. 117 (3), 585–606.
Bali, T.G., 2007. An extreme value approach to estimating interest-rate volatility: Pricing implications for interest-rate options. Manage. Sci. 53 (2), 323–339.
Bollerslev, T., Todorov, V., Li, S.Z., 2013. Jump tails, extreme dependencies, and the distribution of stock returns. J. Econometrics 172 (2), 307–324.
Bowsher, C.G., 2007. Modelling security market events in continuous time: Intensity based, multivariate point process models. J. Econometrics 141 (2), 876–912.
Brennan, M.J., 1986. A theory of price limits in futures markets. J. Financ. Econ. 16 (2), 213–233.
Chan, S.H., Kim, K.A., Rhee, S.G., 2005. Price limit performance: Evidence from transactions data and the limit order book. J. Empir. Financ. 12 (2), 269–290.
Chavez-Demoulin, V., 1999. Two Problems in Environmental Statistics: Capture–Recapture Analysis and Smooth Extremal Models (Ph.D. thesis). Department of Mathematics, Swiss Federal Institute of Technology, Lausanne.
Chavez-Demoulin, V., Davison, A.C., McNeil, A.J., 2005. Estimating value-at-risk: A point process approach. Quant. Finance 5 (2), 227–234.
Chavez-Demoulin, V., Embrechts, P., Nešlehová, J., 2006. Quantitative models for operational risk: Extremes, dependence and aggregation. J. Bank. Financ. 30 (10), 2635–2658.
Chavez-Demoulin, V., McGill, J.A., 2012. High-frequency financial data modeling using Hawkes processes. J. Bank. Financ. 36 (12), 3415–3426.
Chen, T., Gao, Z., He, J., Jiang, W., Xiong, W., 2019. Daily price limits and destructive market behavior. J. Econometrics 208 (1), 249–264.
Cho, D.D., Russell, J., Tiao, G.C., Tsay, R., 2003. The magnet effect of price limits: Evidence from high-frequency data on Taiwan stock exchange. J. Empir. Financ. 10 (1–2), 133–168.
Christofferson, P.F., 1998. Evaluating interval forecasts. Internat. Econom. Rev. 39 (4), 841–862.
Daley, D.J., Vere-Jones, D., 2005. An Introduction to the Theory of Point Processes. In: Elementary Theory and Methods, vol. 1, Springer, New York.
Deb, S.S., Kalev, P.S., Marisetty, V.B., 2010. Are price limits really bad for equity markets? J. Bank. Financ. 34 (10), 2462–2471.
Diebold, F.X., Gunther, T.A., Tay, A.S., 1998. Evaluating density forecasts with applications to financial risk management. Internat. Econom. Rev. 39 (4), 863–883.
Embrechts, P., Klüppelberg, C., Mikosch, T., 1997. Modelling Extremal Events for Insurance and Finance. Springer, Berlin.
Errais, E., Giesecke, K., Goldberg, L.R., 2010. Affine point processes and portfolio credit risk. SIAM J. Financial Math. 1 (1), 642–665.
Fernandes, M., Rocha, M.A.D.S., 2007. Are price limits on futures markets that cool? Evidence from the Brazilian mercantile and futures exchange. J. Financ. Econ. 5 (2), 219–242.
Filimonov, V., Sornette, D., 2015. Apparent criticality and calibration issues in the Hawkes self-excited point process model: Application to high-frequency financial data. Quant. Finance 15 (8), 1293–1314.
Grothe, O., Korniichuk, V., Manner, H., 2014. Modelling multivariate extreme events using self-exciting point processes. J. Econometrics 182 (2), 269–289.
Hawkes, A.G., 1971. Point spectra of some mutually exciting point processes. J. R. Stat. Soc. Ser. B Stat. Methodol. 33 (3), 438–443.
Helmstetter, A., Sornette, D., 2002. Sub-critical and super-critical regimes in epidemic models of earthquake aftershocks. J. Geophys. Res.-Solid Earth 107 (B10), 1–21.
Hsieh, P.H., Yong, H.K., Yang, J.J., 2009. The magnet effect of price limits: A logit approach. J. Empir. Financ. 16 (5), 830–837.
Kupiec, P.H., 1995. Techniques for verifying the accuracy of risk measurement models. J. Deriv. 3 (2), 73–84.
Lee, K., Seo, B.K., 2017. Marked Hawkes process modeling of price dynamics and volatility estimation. J. Empir. Financ. 40, 174–200.
Lewis, P.A.W., Shedler, G.S., 1979. Simulation of nonhomogeneous Poisson processes by thinning. Nav. Res. Logist. Q. 26 (3), 403–413.
Marsan, D., Lengliné, O., 2008. Extending earthquakes' reach through cascading. Science 319 (5866), 1076–1079.
McNeil, A.J., 1998. Statistical analysis of extreme values: From insurance, finance, hydrology and other fields. J. Amer. Statist. Assoc. 93 (444), 1516–1519.
McNeil, A.J., Frey, R., 2000. Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. J. Empir. Financ. 7 (3), 271–300.
McNeil, A.J., Frey, R., Embrechts, P., 2005. Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press, Princeton.
Møller, J., Rasmussen, J.G., 2005. Perfect simulation of Hawkes processes. Adv. Appl. Probab. 37 (3), 629–646.
Moreno, M., Serrano, P., Stute, W., 2011. Statistical properties and economic implications of jump-diffusion processes with shot-noise effects. European J. Oper. Res. 214 (3), 656–664.
Sifat, I.M., Mohamad, A., 2018. Trading aggression when price limit hits are imminent: NARDL based intraday investigation of magnet effect. J. Behav. Exp. Finance 20, 1–8.
Storn, R., Price, K., 1997. Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11 (4), 341–359.
Zhuang, J., Ogata, Y., Vere-Jones, D., 2002. Stochastic declustering of space–time earthquake occurrences. J. Amer. Statist. Assoc. 97 (458), 369–380.