Available online at www.sciencedirect.com
Mathematics and Computers in Simulation 79 (2009) 2579–2596
Bayesian analysis of stochastic volatility models with mixture-of-normal distributions Manabu Asai Faculty of Economics, Soka University, 1-236 Tangi-cho, Hachioji-shi, Tokyo 192-8577, Japan Available online 24 December 2008
Abstract Stochastic volatility (SV) models usually assume that the distribution of asset returns conditional on the latent volatility is normal. This article analyzes SV models with a mixture-of-normal distributions in order to compare with other heavy-tailed distributions such as the Student-t distribution and generalized error distribution (GED). A Bayesian method via Markov-chain Monte Carlo (MCMC) techniques is used to estimate parameters and Bayes factors are calculated to compare the fit of distributions. The method is illustrated by analyzing daily data from the Yen/Dollar exchange rate and the Tokyo stock price index (TOPIX). According to Bayes factors, we find that while the t distribution fits the TOPIX better than the normal, the GED and the normal mixture, the mixture-of-normal distributions give a better fit to the Yen/Dollar exchange rate than other models. The effects of the specification of error distributions on the Bayesian confidence intervals of future returns are also examined. Comparison of SV with GARCH models shows that there are cases that the SV model with the normal distribution is less effective to capture leptokurtosis than the GARCH with heavy-tailed distributions. © 2008 IMACS. Published by Elsevier B.V. All rights reserved. Keywords: Bayes factor; Mixture-of-normal distributions; Generalized error distribution; Marginal likelihood; Markov-chain Monte Carlo; Multimove sampler; Student-t distribution
1. Introduction It has long been recognized that daily asset returns are leptokurtic, and hence some authors model stock returns as i.i.d. draws from fat-tailed distributions; see Mandelbrot [27] and Fama [18]. It is also a well-known phenomenon that the asset return volatility changes randomly over time. If so, the unconditional distribution is leptokurtic even though the conditional distribution is normal; see Bollerslev et al. ([8], p. 2963). Note, however, that it does not mean that the leptokurtosis of asset returns can fully be explained by changes in volatility. Actually, several authors have found that the conditional distribution is also leptokurtic by assuming leptokurtic distributions for the conditional distribution in ARCH-type models. McAleer [28] provides a comprehensive comparison of a wide range of univariate and multivariate, conditional and stochastic volatility models. Bollerslev [7] uses the Student-t distribution, while Nelson [31] uses the generalized error distribution (GED). Bollerslev et al. [8] and Watanabe [36] have found that the Student-t distribution is adequate for capturing the excess kurtosis of the conditional distribution for daily US and Japanese stock returns, respectively. Recently, Bai et al. [5] found that a mixture of two normal distributions gives a better fit to the IBM daily returns and exchange rate returns of the Dollar/Deutschemark than the Student-t distribution. E-mail address:
[email protected]. 0378-4754/$36.00 © 2008 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.matcom.2008.12.013
2580
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
In the literature of SV models, Liesenfeld and Jung [26] fitted a Student-t distribution and a GED as well as a normal distribution to the error distribution in the SV model, by using the simulated maximum likelihood method developed by Danielsson and Richard [15] and Danielsson [14]. In addition to a normal distribution, a Student-t distribution and a GED, the current paper considers a mixture-ofnormal distributions as the error distribution in the SV model and compares which distribution is the most adequate. In order to estimate these models, the paper extends Bayesian method introduced by Jacquier et al. [22] and developed by Shephard and Pitt [33]. Specifically, both of the model parameters and the latent volatility are sampled from their posterior distribution using Markov-chain Monte Carlo (MCMC) techniques, and simulated draws are used for Bayesian posterior analysis. The latent volatility are sampled using the multi-move sampler proposed by Shephard and Pitt [33] to improve the convergence rate of the MCMC. Bayes factors are calculated by using the method proposed by Chib [9], Chib and Jeliazkov [11,12] to compare the fit of distributions. Using the Bayesian MCMC method, the SV models with four kinds of distributions are fitted to daily data from the Yen/Dollar exchange rate (YEN/USD) and the TOPIX. According to Bayes factors, we find that the mixture-ofnormal distributions and the Student-t distribution fit both data series better than the normal and the GED. While the mixture-of-normal distributions gives a better fit to the YEN/USD returns, the Student-t distributions does to TOPIX returns. The paper also examine how the specification of error distributions influences the autocorrelation functions of squared returns, the confidence intervals of future returns, and Bayes factors compared to GARCH specifications. The rest of the paper is organized as follows. Section 2 shortly explains the SV model with four distributions mentioned above. Section 3 develops the Bayesian method for the analysis of the SV models with these distributions. These SV models are fitted to daily data from the YEN/USD and the TOPIX in Section 4. Conclusions are given in Section 5. 2. SV models with heavy-tailed distributions The SV model analyzed in this article is the standard one given by rt = t exp(ht /2),
t ∼ i.i.d.,E(t ) = 0, V (t ) = 1,
ht = μ + φ(ht−1 − μ) + ηt ,
ηt ∼
i.i.d.N(0, ση2 ),
(1) (2)
where |φ| < 1, and rt is the asset return on day t from which the mean and autocorrelations are removed. In the following, exp(ht /2) is called as volatility, so that ht represents the log of squared volatility. t and ηs are assumed to be serially and mutually independent for all t and s. It is a well-known phenomenon that daily asset returns have leptokurtic distributions. The kurtosis of rt following the 2 above SV model is given by k = E(rt4 )/E(rt2 ) = E(4t ) exp[σh2 ], where σh2 = ση2 /(1 − φ2 ) represents the unconditional variance of ht ; see Appendix A in Liesenfeld and Jung [26] for the derivation. The standard normal distribution is usually assumed for the distribution of t . If so, E(4t ) = 3 and hence the kurtosis of rt is k = 3 exp[σh2 ] ≥ 3, where k = 3 only if σh2 = 0. This result indicates that, as long as the volatility changes over time, the unconditional distribution of rt is leptokurtic even if t follows the standard normal. However, it does not necessarily follow that the leptokurtosis of asset returns can fully be explained by changes in volatility. The distribution of t itself may possibly be leptokurtic. The paper considers a mixture of two normal distributions as follows; N(0, σ 2 ) with probability (1 − p), t ∼ i.i.d. mixture normal(ξ, p) = N(0, ξσ 2 ) with probabilityp, where 0 < ξ < 1 and σ 2 = (1 − p + ξp)−1 so that Var(2t ) = 1. The probability density function (PDF) of t is given by 1−p p 2t 2t +√ (3) exp − exp − 2 . f (t ) = 2ξσ 2 2σ 2πσ 2 2πξσ 2 The kurtosis of the mixture-of-normal distributions is given by E(4t ) = 3 + 3p(1 − p)(ξ − 1)2 /(1 − p + ξp)2 , which is greater than 3 when 0 < p < 1.
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2581
For the purpose of comparison, the paper also consider a Student-t distribution and a GED. The PDF of the t-distribution with mean zero and variance normalized to one is given by f (t ) = [π(ν − 2)]
−(ν+1)/2 + 1)/2) 2t , 1+ (ν/2) ν−2
−1/2 ((ν
ν > 2,
(4)
where ν represents the degree-of-freedom and (·) denotes the gamma function. As long as ν > 4, the kurtosis of the t-distribution is E(4t ) = 3(ν − 2)/(ν − 4), which is greater than 3 if ν < ∞. Needless to say, the t− distribution approaches the standard normal distribution when ν → ∞. The PDF of the GED with mean zero and variance one is given by f (t ) =
υ exp[−(1/2)|t /β|υ ] , β (1/υ)2(1+1/υ)
0 < υ < ∞,
(5)
1/2
where β = [2−2/υ (1/υ)/ (3/υ)] . The kurtosis of the GED is given by E(4t ) = (1/υ) (5/υ)/[ (3/υ)]2 , which is greater than 3 if υ < 2. The GED becomes the standard normal distribution when υ = 2. 3. Bayesian analysis This section explains the method used for the analysis of the SV models with a mixture of two normal distributions (hereafter, SV-MN). 3.1. Overview of estimation methods for SV models Asai et al. [4] discuss recent theoretical developments in the SV literature, including Bayesian and non-Bayesian methods. Before coming to MCMC methods, it is worthwhile to draw attention to the likelihood-oriented procedures of Fridman and Harris [19], Sandmann and Koopman [32] and Watanabe [35]. Fridman and Harris [19] and Watanabe [35] independently apply Kitagawa’s [25] non-Gaussian state-space-model filtering and smoothing procedure to evaluate the likelihood through recursive numerical integration. Sandmann and Koopman [32] proposed the Monte Carlo maximum likelihood method, of which likelihood function can be approximated arbitrarily by decomposing it into a Gaussian part, constructed by the Kalman filter, and a remainder function, whose expectation is evaluated by simulation. Monte Carlo results conducted by Fridman and Harris [19], Sandmann and Koopman [32] and Watanabe [35] show that properties of their methods are very close to those of Jacquier et al. [22]. While the procedures proposed by Fridman and Harris [19] and Watanabe [35] are more computationally demanding than the MCMC of Jacquier et al. [22], the Monte Carlo likelihood method proposed by Sandmann and Koopman [32] is much easier to implement. It should be noted that the latter method yields larger bias than the Bayesian MCMC approach when the unconditional variance of the time-varying log-volatility is relatively small. In addition to reasons mentioned above, we avoid the likelihood oriented approach since it is a hard task to compare SV-t with SV-GED. With respect to the Student-t distribution, Chib et al. [13] and Jacquier et al. [23] consider fat-tailed SV models, and develop Bayesian MCMC techniques to estimate them. While Chib et al. [13] improve the so-called “integration sampler” proposed by Kim et al. [24], Jacquier et al. [23] use the single move sampler of Jacquier et al. [22]. Illustrative examples in de Jong and Shephard [16], Shephard and Pitt [33] and Kim et al. [24] about the SVnormal model indicate that the single-move sampler would produce a highly correlated sample sequence when state variables are highly autocorrelated. The single move sampler is, thus, inefficient in the sense that it needs to repeat the sampling a huge number of times. Apart from the integration sampler, the multi-move sampler proposed by Shephard and Pitt [33] and corrected by Watanabe and Omori [38] is more efficient than the single-move sampler. Based on simulated and real data, Asai [2] found by using efficiency factor of Geweke [20] that the integration sampler outperforms the multi-move sampler for sampling (φ, ση ), while the former is usually less efficient than latter for sampling μ and latent volatilities. Hence, it is worth extending the multi-move sampler to fat-tailed SV models. Added to this reason, we point out that the integration sampler is hard to apply to the SV-GED and the SV-normal-mixture models.
2582
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
3.2. Parameter estimation For the MCMC estimation of the SV-MN model, unknown parameters as well as the latent variable {ht }Tt=0 are sampled from their full conditional distributions. The prior distribution of h0 is set to be the unconditional distribution of ht , i.e. h0 ∼ N(μ, ση2 /(1 − φ2 )). For the unknown parameters in the SV-MN model, the prior distributions for (μ, φ, ση2 ) follows Kim et al. [24], namely, μ ∼ N(k1 , k2 ), (φ + 1)/2 ∼ Beta(φ1 , φ2 ), ση2 ∼ IG(σr /2, Sσ /2), where Beta(·, ·) and IG(·, ·) represent the beta and inverse gamma distributions, respectively. Specifically, hyperparameters are specified as k1 = 0, k2 = 10, φ1 = 20, φ2 = 1.5, σr = 5, and Sσ = 0.01 × σr . The paper assumes that the prior distributions for ξ and p are given by ξ ∼ Beta(ξ1 , ξ2 ), p ∼ Beta(r, r). Specifically, ξ1 = 2, ξ2 = 3, and r = 2 are used so that E(4t ) evaluated at the prior mean takes 3.8. These parameters as well as the latent variable {ht }Tt=0 are sampled from their full conditional distributions using MCMC techniques. It is straightforward to obtain the full conditional distributions of μ, φ and ση2 and sample from them; see Appendix A.1. For the purpose of sampling ξ and p, the paper introduces latent state vector (s1 , . . . , sT ), with st = 1 indicating t ∼ N(0, ξσ 2 ) with probability p and st = 2 indicating t ∼ N(0, σ 2 ) with probability 1 − p. We sample each st independently using the probability mass function Pr(st = i|{t }Tt=1 , ξ, p) (i = 1, 2). As for sampling ξ and p, it is not easy since variance of t is restricted to one. We sample p from its conditional distributions f (p|{t }Tt=1 , ξ) using the M–H algorithm and ξ from its conditional distributions f (ξ|{t }Tt=1 , p) using the A–R/M–H algorithm; see Appendix A.2. While it is straightforward to obtain the full conditional distributions of the parameters and h0 , sampling {ht }Tt=1 is complicated. The paper employs the multi-move sampler, which is proposed by Shephard and Pitt [33] and corrected by Watanabe and Omori [38], in order to sample {ht }Tt=1 . In the multi-move sampler, a block of disturbances {ηs }t+k s=t in Eq. (2) are sampled from their conditional density: t+k f ({ηs }t+k s=t |ht−1 , ht+k+1 , {rs }s=t , θ).
(6)
where θ = (μ, φ, ση2 ). See Appendix A.3 for details. The MCMC methods for the SV-normal, SV-t and SV-GED models are given in Watanabe and Asai [37]. The important point to note is the hyperparameters of prior distributions for ν, υ, ξ and p are determined so that E(4t ) evaluated at the prior mean for each model takes the same value, 3.8. 3.3. Bayes factors Model comparison in a Bayesian framework can be performed using posterior odds. Let R = {rt }Tt=1 denote the observation vector. Then, the posterior odds in favor of model i, Mi , to model j, Mj , is given by POR =
f (R|Mi ) f (Mi ) f (Mi |R) = , f (Mj |R) f (R|Mj ) f (Mj )
where f (R|Mi )/f (R|Mj ) and f (Mi )/f (Mj ) are called Bayes factor and prior odds, respectively. As is the usual practice, the prior odds is assumed to be 1, so that the posterior odds ratio is equal to the Bayes factor. To evaluate the Bayes factor, which is the ratio of the marginal likelihoods where f (R|Ml ) = f (R|Ml , θl )f (θl |Ml )dθl , the paper follows the basic marginal likelihood identity in Chib [9]. The log (base 10) of the marginal likelihood of model Mi can be written as1 log f (R|Mi ) = log f (R|Mi , θi ) + log f (θi |Mi ) − log f (θi |Mi , R), where θi is the set of unknown parameters for model Mi , log f (R|Mi , θi ) is the log-likelihood, log f (θi |Mi ) is the log of the prior density, and log f (θi |Mi , R) is the log of the posterior density. The above identity holds for any value of θi , but following Chib [9], we set θi at its posterior mean calculated using the MCMC draws. The likelihood is evaluated using the Accelerated Gaussian Importance Sampling proposed by Danielsson and Richard [15] and Danielsson [14], and the posterior density is calculated using the method of Chib [9] and Chib and Jeliazkov [11,12]. 1
Bayes factor is usually shown as its log (base 10) value. We denote log (base 10) as log and log (base e) as ln in this article.
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2583
Table 1 Descriptive statistics of daily returns (%) for the Yen/Dollar exchange rate and the TOPIX Statistics
Sample size Mean S.D. Kurtosis LB(12)
Yen/Dollar rate
TOPIX
Raw data
Raw data
Residual
2467 −0.0143 0.7395 6.5889 8.73
2403 −0.0268 1.2753 7.4684 35.09
2401 0 1.2615 7.2813 10.40
Note: For the Yen/Dollar exchange rate, statistics for the raw return series are calculated. For the TOPIX, statistics for the raw return series and the residual series from the AR(2) model are calculated. LB(12) is the heteroskedasticity-corrected Ljung-Box statistic including 12 lags for the return series. The corrected Ljung-Box statistic is calculated following Diebold [17]. The critical values for LB(12) are: 18.55 (10%), 21.03 (5%), and 26.22 (1%).
4. Empirical application 4.1. Data description and preliminary results This section illustrates the proposed MCMC method using daily data of the following financial series: the spot exchange rates for the Japanese Yen/U.S. Dollar (YEN/USD) from 4 January 1990 to 28 December 1999 and the TOPIX from 4 January 1990 to 30 September 1999. Both returns are defined as rt = 100(ln Pt − ln Pt−1 ) where Pt is the closing price on day t. The descriptive statistics are summarized in Table 1. The statistics reported are the mean, the standard deviation, the kurtosis, and the Ljung-Box (LB) statistics for 12 lags corrected for heteroskedasticity following Diebold [17]. The kurtosis of the returns for both series is significantly above three, indicating leptokurtic return distributions. The LB statistics indicate that the return for the YEN/USD is serially uncorrelated while the return for the TOPIX is serially correlated. Hence, as for {rt }, we use for the former series returns with the mean subtracted and for the latter series the residuals from the AR(2) model, where the lag length 2 is selected based on the Schwarz Information Criterion (SIC). 4.2. Estimation details The MCMC simulation are conducted with 20,000 iterations. The first 10,000 draws are discarded and then the next 10,000 are recorded. ση and exp(μ/2) are recorded in place of ση2 and μ. Using these 10,000 draws for each of the parameters, the posterior means, the standard errors of the posterior means, the 95% intervals, and the convergence diagnostic (CD) statistics proposed by Geweke [20] are calculated. The posterior means are computed by averaging the simulated draws. The standard errors of the posterior means are computed using a Parzen window with a bandwidth of 1000; see Geweke [20] for details. The 95% intervals are calculated using the 2.5th and 97.5th percentiles of the simulated draws. The CDs have asymptotically standard normal distributions under the null of the convergence, and they are calculated in order to assess the convergence of Markov chains. Before all computations, we conducted the simulation comparison check proposed by Geweke [21] which has power against errors of all types in MCMC simulation work, including expression of probability density kernels, the algorithm itself, and the code that implements the algorithm. 4.3. Estimation results Table 2 shows the results for the mean-subtracted return for the YEN/USD. Table 2(A) presents log (base 10) of Bayes factors. The log Bayes factor of the SV-normal model versus the SV-t model is −6.73 and that of the SV-t versus the SV-GED is 2.43, indicating that the SV-t fits the YEN/USD data better than the SV-GED, and the SV-GED does better than the SV-normal. This result is consistent with that of Liesenfeld and Jung [26]. As for SV-MN model, it gives a better fit than other models. Table 2(B) shows the posterior means, the 95% intervals, and the CD statistics for the model parameters, the unconditional variance of log volatility σh2 (= ση2 /(1 − φ2 )), and the kurtosis. According to the CD values, the null
2584
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
Table 2 Estimation results for the Yen/Dollar exchange rate
(A) Bayes factors Normal t GED Parameter (B) Parameter estimates Normal distribution exp(μ/2) φ ση σh2 Kurtosis Student-t distribution exp(μ/2) φ ση ν σh2 Kurtosis GED exp(μ/2) φ ση υ σh2 Kurtosis Mixture-of-normal distributions exp(μ/2) φ ση ξ p σh2 Kurtosis
t
GED
Mixture
−6.73 – –
−4.30 2.43 –
−7.17 −0.43 −2.87
Mean
Standard error
95% interval
CD
0.6288 0.9509 0.2331 0.5854 5.4135
0.0004 0.0014 0.0039 0.0032 0.0169
[0.5639,0.6969] [0.9218,0.9736] [0.1764,0.2976] [0.4260,0.8057] [4.5934,6.7147]
−0.36 −1.15 1.36 0.79 0.76
0.6507 0.9819 0.1265 8.2365 0.4815 7.5485
0.0013 0.0006 0.0025 0.1919 0.0037 0.0964
[0.5518,0.7612] [0.9682,0.9924] [0.0918,0.1682] [5.9779,12.360] [0.2902,0.8255] [5.5128,11.129]
−1.34 −0.73 1.04 −0.07 −1.23 0.80
0.6443 0.9773 0.1446 1.5583 0.4990 7.5827
0.0006 0.0010 0.0035 0.0049 0.0040 0.0266
[0.5549,0.7417] [0.9564,0.9903] [0.1035,0.2037] [1.4160,1.7253] [0.3140,0.8125] [4.9537,8.2490]
−0.60 −0.39 0.10 0.26 −1.22 −1.06
0.6515 0.9823 0.1241 0.2816 0.7969 0.4746 7.1520
0.0006 0.0007 0.0031 0.0047 0.0093 0.0037 0.0559
[0.5519,0.7586] [0.9687,0.9925] [0.0929,0.1641] [0.2019,0.3742] [0.6478,0.8951] [0.2854,0.8124] [5.5397,10.217]
−0.24 −0.72 1.29 0.89 0.10 1.77 0.12
Note: The numbers in the table are log (base 10) of Bayes factors for row model against column model. Note: The first 10,000 draws are discarded and then the next 10,000 are used for calculating the posterior means, the standard errors of the posterior means, 95% interval, and the convergence diagnostic (CD) statistics proposed by Geweke [20]. The posterior means are computed by averaging the simulated draws. The standard errors of the posterior means are computed using a Parzen window with a bandwidth of 1000. The 95% intervals are calculated using the 2.5th and 97.5th percentiles of the simulated draws. σh2 represents the unconditional variance of volatility ση2 /(1 − φ2 ).
hypothesis that the sequence of 10,000 draws is stationary is accepted at any standard level for all parameters in all models. The results of SV-t about (exp(μ/2), φ, ση ) are very similar to those of SV-MN. The posterior mean and 95% interval of φ of the SV-t and the SV-MN are placed on upward, compared to those of the SV-normal and the SV-GED, showing a higher persistence in volatility of the SV-t than those of the other two models. The posterior means and 95% intervals of the conditional standard deviation of log volatility ση and the unconditional variance of log volatility σh2 are both smaller in the SV-t and the SV-MN than those in the SV-normal and SV-GED, indicating that the volatility of the SV-t and the SV-MN are less variable than those of the other models. These results are also consistent with those of Liesenfeld and Jung [26]. An interesting result is that the posterior mean of the kurtosis of the SV-MN, which is smaller than those of SV-t and SV-GED and larger than that of SV-normal, is closest to sample kurtosis of the data, 6.59, among four models. Table 3 shows the results for the AR(2) residuals of the TOPIX return series. Log values of Bayes factors presented in Table 3(A) indicate that the SV-t fits the YEN/USD data better than the SV-GED, and the SV-GED does better
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2585
Table 3 Estimation results for the TOPIX
(A) Bayes factors Normal t GED Parameter
Mean
(B) Parameter estimates Normal distribution exp(μ/2) 1.0444 φ 0.9536 ση 0.2539 σh2 0.7317 Kurtosis 6.2831 Student-t distribution exp(μ/2) 1.0769 φ 0.9705 ση 0.1891 ν 11.059 0.6426 σh2 Kurtosis 7.6561 GED exp(μ/2) 1.0617 φ 0.9633 ση 0.2165 υ 1.7015 0.6751 σh2 Kurtosis 6.7083 Mixture-of-normal distributions exp(μ/2) 1.0705 φ 0.9675 ση 0.2017 ξ 0.3729 p 0.7551 0.6615 σh2 Kurtosis 7.4086
t
GED
Mixture
−3.04 – –
−1.95 1.09 –
−2.87 0.17 −0.91
Standard error
95% interval
CD
0.0011 0.0010 0.0031 0.0034 0.0211
[0.9248,1.1774] [0.9291,0.9727] [0.2021,0.3134] [0.5379,1.0082] [5.1371,8.2216]
1.02 1.19 −1.26 −1.80 −1.76
0.0023 0.0008 0.0030 0.4299 0.0041 0.0930
[0.9334,1.2415] [0.9534,0.9843] [0.1461,0.2434] [7.1471,18.313] [0.4304,0.9690] [5.8491,10.919]
0.19 0.54 −0.92 −1.23 −1.70 −0.36
0.0014 0.0010 0.0036 0.0057 0.0049 0.0151
[0.9315,1.2102] [0.9424,0.9803] [0.1676,0.2713] [1.5256,1.8947] [0.4680,0.9682] [5.3957,9.2467]
−0.28 −0.24 −0.00 0.12 −0.91 −1.62
0.0014 0.0010 0.0037 0.0143 0.0293 0.0045 0.0892
[0.9324,1.2280] [0.9472,0.9823] [0.1594,0.2596] [0.2054,0.5493] [0.4537,0.9529] [0.4537,0.9694] [5.7163,10.491]
−0.54 −1.36 1.31 1.74 0.08 0.55 −1.44
Note: The numbers in the table are log (base 10) of Bayes factors for row model against column model. Note: The first 10,000 draws are discarded and then the next 10,000 are used for calculating the posterior means, the standard errors of the posterior means, 95% interval, and the convergence diagnostic (CD) statistics proposed by Geweke [20]. The posterior means are computed by averaging the simulated draws. The standard errors of the posterior means are computed using a Parzen window with a bandwidth of 1000. The 95% intervals are calculated using the 2.5th and 97.5th percentiles of the simulated draws. σh2 represents the unconditional variance of volatility ση2 /(1 − φ2 ).
than the SV-normal. As for the SV-MN model, it fits better than the SV-GED and SV-normal. Although the SV-MN fits worse than the SV-t, the quantitative difference is very small. The differences among the four models are smaller, compared to those for the YEN/USD series. According to the estimates of φ, ση , and σh2 in Table 3(B), the posterior mean and 95% interval of φ are larger and those of the conditional standard deviation and the unconditional variance of volatility are smaller in the SV-t and the SV-MN than those in the SV-normal and the SV-GED. Again, the posterior mean of the kurtosis of the SV-MN, which is smaller than those of SV-t and SV-GED and larger than that of SV-normal, is closest to sample kurtosis of the data, 7.28, among four models. The reason why the volatility of the SV-t or SV-MN is estimated to be more persistent and less variable can be understood by comparing the PDF of the t-distribution to those of the normal and GED. Figs. 1 and 2 show these PDFs for the YEN/USD and the TOPIX, respectively, where ν, υ, ξ and p are set equal to the corresponding posterior means (Fig. 2 is omitted). Needless to say, the PDFs of the t-distribution, GED and the mixture-of-normal distributions have fatter tails than that of the normal. The PDF of the GED puts emphasis on the sharpness around the mean rather
2586
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
Fig. 1. PDF of the standard normal, t, and GED Yen/Dollar exchange rate: (A) center area and (B) tail area.
than the tail fatness, so that the PDF of the t-distribution has a fatter tail than that of the GED. Note also the strong similarity between t-distribution and the mixture-of-normal. The latter, however, has fatter tail than former. Therefore, the SV-t and SV-MN attribute a larger proportion of extreme return values to t instead of ηt than the SV-normal and the SV-GED do, making the volatility of the SV-t less variable. It also increases the persistence in volatility of the SV-t if extreme returns are less persistent. This interpretation is confirmed by comparing the volatility estimates. Figs. 3 and 4 plot the posterior means of volatilities {exp(ht /2)} jointly with the absolute returns for the YEN/USD and the absolute values for the AR(2) residuals of the TOPIX returns, respectively. For the YEN/USD data, the posterior means of volatilities from the SV-t and SV-MN models exhibit smoother movements than those from the SV-normal and SV-GED models. Extreme returns such as the returns at the Asian currency crisis beginning in July 1997 (t = 1852) make the differences clear. The volatilities associated with these extreme return values jump up more under the SV-normal and the SV-GED than
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2587
Fig. 2. PDF of the standard normal, t, and GED TOPIX: (A) center area and (B) tail area.
under the SV-t or the SV-MN models. Contrary to the YEN/USD data, no major difference in the posterior means of volatilities among the three models can be seen in the TOPIX data because the differences in parameter estimates are small. From the viewpoint of Value at Risk (VaR), it is important to examine how the specification of conditional distribution influences the Bayesian confidence intervals of future returns. The Bayesian confidence interval of rT +1 can be obtained by sampling from: (7) f (rT +1 |R) = f (rT +1 |hT +1 , R)f (hT +1 |hT )f (hT |R)dhT +1 dhT , (1)
(N)
where R = {rt }Tt=1 . This sampling is straightforward. First, N draws {hT , . . . , hT } are sampled from f (hT |R) jointly with the model parameters using the MCMC method explained in Section 3. Then, given these N draws, sample N (1) (N) (1) (N) draws {hT +1 , . . . , hT +1 } from f (hT +1 |hT ), . . . , f (hT +1 |hT ) using Eq. (2). Finally, given these N draws, sample N
2588
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
Fig. 3. Volatility estimates: Yen/Dollar exchange rate: (A) normal distribution; (B) Student-t distribution; (C) GED; and (D) mixture-of-normal distributions. (1)
(N)
(1)
(N)
draws {rT +1 , . . . , rT +1 } from f (rT +1 |hT +1 ), . . . , f (rT +1 |hT +1 ) using Eq. (1). N is set equal to 10,000. GED-variates are sampled by the proposed method explained in Appendix B. For each model, the 90%, 95%, and 99% intervals are calculated with the posterior mean of volatility, exp(hT +1 /2). The confidence intervals of rT +1 would become narrower in ordering of the SV-MN, the SV-t, the SV-GED, and the SV-normal if the volatility exp(hT +1 /2) were the same in all four models. Table 4 shows the confidence intervals. The volatility estimates, however, differ among the four models, so that the effect of the specification of conditional distribution on the widths of confidence intervals is vague. For the YEN/USD data, the means of volatility decrease in ordering of the SV-MN, the SV-t, the SV-GED, and the SV-normal, so that the confidence intervals become narrower in the same order. For the TOPIX data, however, this is not the case. Since the means of volatility decrease in ordering of the SV-normal, the SV-MN, the SV-t, and the
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2589
Fig. 4. Volatility estimates: TOPIX: (A) normal distribution; (B) Student-t distribution; (C) GED; and (D) mixture-of-normal distributions.
SV-GED, the intervals in the SV-GED is narrower than those of the SV-normal. In any case, since the specification of the conditional distribution obviously affects the confidence intervals of rT +1 , it is important to specify the conditional distribution correctly in constructing confidence intervals. All analyses so far are based on the SV model defined by Eqs. (1) and (2). To analyze the effect of the volatility specification on the estimates of νυ, ξ and p, the following GARCH model are estimated with different distributions for {t }. rt = σt t ,
E(t ) = 0,
V (t ) = 1,
2 2 σt2 = a0 + a1 rt−1 + b1 σt−1 .
2590
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
Table 4 Confidence intervals for return at T + 1
Exchange rate 90% 95% 99% Mean of exp(hT +1 /2) TOPIX 90% 95% 99% Mean of exp(hT +1 /2)
Normal
t
GED
Mixture
[−0.7632, 0.7943] [−0.9419, 0.9761] [−1.3385, 1.4120] 0.4479
[−0.9821, 0.9967] [−1.2528, 1.2406] [−1.9177, 1.9182] 0.5171
[−0.8200, 0.8216] [−1.0263, 1.0393] [−1.5398, 1.4701] 0.4896
[−1.2754, 1.2315] [−1.5796, 1.5427] [−2.1503, 2.2214] 0.5196
[−3.2686, 3.1970] [−4.0301, 3.9822] [−5.6504, 5.4389] 1.9128
[−3.3220, 3.3391] [−4.3015, 4.2163] [−6.4485, 6.1505] 1.8544
[−3.1261, 3.1016] [−3.8508, 3.8180] [−5.4893, 5.6495] 1.8429
[−3.6175, 3.6476] [−4.5016, 4.4378] [−6.2837, 6.5519] 1.8568
We apply MCMC Bayesian methods proposed by Mitsui and Watanabe [29] to estimate the GARCH model. Although it is possible to use other methods such as Bauwens and Lubrano [6] and Nakatsuma [30], Asai [3] show that the approach used here is very simple and efficient. For evaluation of the posterior and prior ordinates, we used the method of Chib and Jeliazkov [12]. Tables 5 and 6 show the estimation results for the GARCH models. Within GARCH models, Bayes factors indicate that the GARCH-MN and the GARCH-t fit YEN/USD and TOPIX better than other models, respectively. This is consistent with the result based on the SV models. An interesting result is that the GARCH-MN and the GARCH-t are preferable to the SV-normal for the YEN/USD returns. The SV-t and SV-MN, however, fit both data series better than other models. For both data sets, the posterior means of ν in the GARCH-t model and υ in the GARCH-GED model Table 5 GARCH results for Yen/Dollar exchange rate t (A) Bayes factors Normal t GED Mixture SV-normal
(B) Parameter estimates a0 a1 b1
−30.92 – – – –
GED
Mixture
−26.06 4.86 – – –
−31.73 −0.78 −5.65 – –
SV-normal −28.36 2.56 −2.30 3.34 –
−35.53 −4.60 −9.47 −3.82 −7.17
Normal
t
GED
Normal mixture
0.0199 [0.0114,0.0297] 0.0916 [0.0669,0.1176] 0.8729 [0.8350,0.9092]
0.0089 [0.0040,0.0151] 0.0654 [0.0442,0.0897] 0.9210 [0.8929,0.9460] 5.7329 [4.6326,7.1967]
0.0130 [0.0061,0.0214] 0.0770 [0.0521,0.1033] 0.9000 [0.8641,0.9333]
0.0081 [0.0029,0.0135] 0.0602 [0.0376,0.0827] 0.9262 [0.8988,0.9535]
ν
1.3338 [1.2389,1.4359]
υ ξ p a1 + b1
SV-normal-mixture
0.9646 [0.9443,0.9820]
0.9867 [0.9751,0.9976]
0.9771 [0.9599,0.9921]
0.2203 [0.1794,0.2648] 0.8093 [0.7264,0.8741] 0.9865 [0.9744,0.9974]
Note: The numbers in the table are log (base 10) of Bayes factors for row model against column model. Note: The first 10,000 draws are discarded and then the next 10,000 are used for calculating the posterior means. 95% intervals are in parenthesis.
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2591
Table 6 GARCH results for TOPIX-residual t (A) Bayes factors Normal t GED mixture SV-normal
(B) Parameter estimates a0 a1 b1
−27.58 – – – –
GED
Mixture
−23.89 3.69 – – –
SV-normal
−27.06 0.52 −3.17 – –
−27.58 0.00 −3.69 −0.52 –
−29.53 −1.95 −5.65 −2.48 −1.95
Normal
t
GED
Normal mixture
0.0746 [0.0495,0.1037] 0.1490 [0.1170,0.1833] 0.8100 [0.7685,0.8498]
0.0534 [0.0310,0.0789] 0.1238 [0.0926,0.1578] 0.8484 [0.8120,0.8842] 6.1075 [4.9565,7.6084]
0.0626 [0.0371,0.0931] 0.1336 [0.0997,0.1717] 0.8309 [0.7863,0.8717]
0.0519 [0.0272,0.0770] 0.1125 [0.0806,0.1455] 0.8571 [0.8177,0.8950]
ν
1.3422 [1.2427,1.4457]
υ ξ p a1 + b1
SV-t
0.9591 [0.9364,0.9795]
0.9722 [0.9516,0.9922]
0.9642 [0.9407,0.9865]
0.2387 [0.1908,0.2893] 0.7907 [0.6794,0.8818] 0.9696 [0.9495,0.9895]
Note: The numbers in the table are log (base 10) of Bayes factors for row model against column model. Note: The first 10,000 draws are discarded and then the next 10,000 are used for calculating the posterior means. 95% intervals are in parenthesis.
are much smaller than the corresponding means in the SV-t and SV-GED models. Similarly, a comparison of ξ and p between SV and GARCH indicates that the GARCH model requires a more leptokurtic distribution for t than the SV 2 and the past squared return r 2 , while model. Volatility in the GARCH is determined only by the past volatility σt−1 t−1 that in the SV model depends also on the current shock ηt , which increases the leptokurtosis of rt and the estimates of ν and υ. In the GARCH model, volatility persistence is measured by a1 + b1 . In ordering of the GARCH-normal, the GARCH-GED, and the GARCH-t (or the GARCH-MN), the estimates of a1 + b1 rise, which is consistent with the result based the SV model. 5. Conclusions This paper analyzes SV models with the Student-t distribution, GED or the mixture-of-normal distributions. A Bayesian method via MCMC techniques is used to estimate parameters and Bayes factors are calculated to compare the fit of distributions. While the t distribution fits the TOPIX better than the normal, the GED and the normal mixture, the mixture-of-normal distributions gives a better fit to the YEN/USD than other models. Some comparison with GARCH results also indicate that the SV-normal model is insufficient to capture the leptokurtosis since the GARCH-t and GARCH-MN sometimes outperforms the SV-normal. Acknowledgments The author acknowledges the constructive comments and suggestions of a referee. The paper is based on the author’s contributions to Watanabe and Asai [37].
2592
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
Appendix A. Sampling method for the SV-MN models A.1. Sampling μ, φ and ση2 As shown by Kim et al. [24], the full conditional distributions of μ and ση2 are given by μ|· ∼ N(μ, ˆ σμ2 ), ση2 |· ∼ IG(A, B), where k2 ση2
σμ2 =
, k2 {T (1 − φ)2 + 1 − φ2 } + ση2 T 2)
(1 − φ (1 − φ) k 1 μ ˆ = σμ2 , h0 + (ht − ht−1 ) + ση2 ση2 k2 t=1
T + 1 + σr , A= 2 T
1 2 2 2 B= Sσ + (h0 − μ) (1 − φ ) + (ht − μ(1 − φ) − φht−1 ) . 2 t=1
It is straightforward to sample from these distributions. The log of the full conditional distribution of φ is represented by: T
ln f (φ|·) = const + g(φ) −
{ht − μ(1 − φ) − φht−1 }2
t=1
,
2ση2
where g(φ) = ln f (φ) −
(h1 − μ)2 (1 − φ2 ) 1 + ln(1 − φ2 ). 2ση2 2
The method of Chib and Greenberg [10], which is based on the Metropolis–Hastings algorithm, is applied in order to sample from this distribution. Specifically, given the current value φ(i−1) at the (i − 1) st iteration, samˆ Vφ )I[−1, 1] where φˆ = Tt=1 (ht − μ)(ht−1 − ple a proposal value φ∗ from the truncated normal distribution N(φ, μ)/ Tt=1 (ht−1 − μ)2 and Vφ = ση2 /{ Tt=0 (ht − μ)2 }. Then, accept this proposal value as φ(i) with probability exp{g(φ∗ ) − g(φ(i−1) )}. If the proposal value is rejected, set φ(i) to equal φ(i−1) . A.2. Sampling ξ, p and {st }Tt=1 Conditional on ξ and p, each latent state st is drawn by (2−i) (i−1) (2−i) 2 −1/2 P(st = i) ∝ p (1 − p) (ξ σ ) exp −
2t 2ξ (2−i) σ 2
,
(i = 1, 2),
where σ 2 = (1 − p + ξp)−1 . Under the Beta prior distribution, the conditional distribution of p is
T
2t r+T1 r+T2 2 −T/2 f (p|·) ∝ p , (1 − p) (σ ) exp − 2ξ (2−st ) σ 2 T
t=1
where T1 = t=1 (2 − st ) and T2 = T − T1 . To sample from this distribution, the Metropolis–Hastings algorithm is applied. Given the current value p(i−1) at the (i − 1) st iteration, sample a proposal value p∗ from Beta(r + T1 , r + T2 ).
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2593
Then, accept this proposal value as p(i) with probability exp(g(p∗ ) − g(p(i−1) )) where T T 1 2 g(p) = ln(1 − p + ξp) − t (1 − p + ξp)ξ (st −2) . 2 2 t=1
If the proposal value is rejected, set p(i) to equal p(i−1) . The log of conditional distribution of ξ is given by ln f (ξ|·) = const. + (ξ1 − 1) ln(ξ) + (ξ2 − 1) ln(1 − ξ) +
T ln(1 − p + ξp) 2
1
+ {(st − 2) ln(ξ) − 2t (1 − p + ξp)ξ (st −2) } 2 T
t=1
To sample from this density, the paper suggests the following method, which is based on the A–R/M–H algorithm proposed by Tierney [34]. Specifically, a normal distribution is used as a proposal density in the A–R step. The mean and variance of this distribution are chosen as follows. The mode of ξ ∗ by numerical optimization, which is used for the mean, and calculate d 2 ln f (ξ|·)/dξ 2 at ξ = ξ ∗ , which is used for the negative inverse of the variance. Note that the expectation of the second derivative given parameters is negative for all ξ(0 < ξ < 1). A.3. Sampling {ht }Tt=0 The full conditional distribution of h0 is h0 |· ∼ N(μ(1 − φ) + φh1 , ση2 ). It is straightforward to sample from this distribution. The log of the conditional distribution (6) is expressed as 1 2
η + ln f (rs |hs ). 2ση2 s=t s s=t t+k
t+k t+k t+k ln f ({ηs }t+k s=t |·) = const + ln f ({ηs }s=t ) + ln f ({rs }s=t |{hs }s=t ) = const −
t+k
For the SV-MN model, 1 rs2 exp(−hs ) , ln f (rs |hs , ss ) = const − hs − 2 2 2σ {ξI(ss = 1) + I(ss = 2)}
(A.1)
where σ 2 = (1 − p + ξp)−1 . It is straightforward to prove that these ln f (rs |·) are log-concave. Denote (A.1) by l(hs ) and write the derivative of this density with respect to hs as l and l , respectively. Applying a Taylor series expansion to the log-density ln f ({ηs }t+k ηs }t+k s=t yields: s=t |·) around some preliminary estimate {ˆ t+k t+k 1 2
1 2 l(hˆ s ) + (hs − hˆ s )l (hˆ s ) + (hs − hˆ s ) l (hˆ s ) = ln g, ln f ({ηs }t+k |·) ≈ const − η + s=t s 2 2ση s=t 2 s=t t+k
where {hˆ s }s=t are the estimate of {hs }t+k ηs }t+k s=t . s=t corresponding to {ˆ 2 Define variables vs and yˆ s as follows. For s = t, . . . , t + k − 1, vs = −1/ l (hˆ s )
(A.2)
yˆ s = hˆ s + vs l (hˆ s ).
(A.3)
For s = t + k, if t + k < T , vt+k = ση2 /{φ2 − ση2 l (hˆ t+k )} 2
(A.4)
Shephard and Pitt [33] define vs and yˆ s for all s using Eqs. (A.2) and (A.3). Watanabe and Omori [38] show that this mistake may cause a significant bias in estimates for the both parameters and latent variables. When t + k < T , vt+k and yˆ t+k must be defined using Eqs. (A.4) and (A.5).
2594
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
yˆ t+k
φ ˆ ˆ ˆ = ht+k + vt+k l (ht+k ) + 2 ht+k+1 − μ(1 − φ) − φht+k , ση
(A.5)
and if t + k = T , vt+k = −1/ l (hˆ t+k )
(A.6)
yˆ t+k = hˆ t+k + vt+k l (hˆ t+k ).
(A.7)
Then, the normalized version of g is a k-dimensional normal density, which is the exact density of {ηs }t+k s=t conditional on {ˆys }t+k s=t in the linear Gaussian state space model: yˆ s = hs + s ,
s ∼ N(0, vs ),
hs = μ(1 − φ) + φhs−1 + ηs ,
(A.8) ηs ∼ N(0, ση2 ),
(A.9)
t+k Applying the de Jong and Shephard [16] simulation smoother to this model with the artificial {ˆys }t+k s=t , {ηs }s=t can be sampled from the density g. t+k Following Shephard and Pitt [33], the expansion block {hˆ s }s=t selected as follows. Once an initial expansion block t+k {hˆ s }s=t is selected, it is possible to calculate the artificial {ˆys }t+k s=t . Then, applying the Kalman smoother to the linear t+k Gaussian state space model that consists of Eqs. (A.8) and (A.9) with the artificial {ˆys }t+k s=t yields the mean of {hs }s=t t+k conditional on {ˆys }t+k in the linear Gaussian state space model, which is used as the next {hˆ s } . Five iterations of s=t
t+k
s=t
this procedure are used in order to obtain the expansion block {hˆ s }s=t . Since g does not bound f, it is hard to use the conventional acceptance–rejection sampling method to simulate {ηs }t+k s=t from the true density f. Instead, Shephard and Pitt [33] suggest using the Acceptance–Rejection/ Metropolis–Hastings (A–R/M–H) algorithm proposed by Tierney [34](see also Chib and Greenberg [10] for details). Denote the previously sampled value of {ηs }t+k s=t by x. Suppose that the candidate y is produced from the acceptance–rejection algorithm. Then, the A–R/M–H algorithm proceeds as follows. (1) If f (x) < g(x), then let α = 1; If f (x) ≥ g(x) and f (y) < g(y), then let α = g(x)/f (x); If f (x) ≥ g(x) and f (y) ≥ g(y), then let α = min{(f (y)g(x)/f (x)g(y)), 1}. (2) Generate u from a standard uniform distribution. (3) If u ≤ α, return y. Else, return x. Implementation of the multi-move sampler requires the selection the knots. Following Shephard and Pitt [33], K knots are selected, that is equivalently K + 1blocks, randomly with Ui being independent uniforms and i + Ui ki = int T × , i = 1, . . . , K, K+2 where int[x] represents the operator that rounds x down to the nearest integer. The stochastic knots ensure that the method does not become stuck by an excessive amount of rejections. It should be noted that Asai [2] showed robustness with respect to the number of knots, by using simulated and real data. In all of the analyses in our paper, the number of knots K are set equal to 50, which indicate that the size of each block is about 50(= 2500/50). This size is from Shephard and Pitt [33], but there is no particular reason as mentioned above. Appendix B. Sampling from GED The density function of the GED with mean zero and variance one is given by Eq. (5), which can be written as 1 x v f (x) = Kβ−1 exp − , 0 < v < ∞, 2 β
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
2595
where
1/2 −2/v (1/v) , β= 2 (3/v)
−1
K = v[ (1/v)2(1+1/v) ]
.
Since the cumulative distribution function (CDF) of the GED is expressed by using the incomplete gamma function, we propose a sampling method by the probability integral transformation. When x < 0, the CDF of GED is given by ∞ x −1 v Kβ exp(−(−y/β) /2)dy = (K21/v /v)t 1/v−1 e−t dt F (x) = −∞
= (K21/v /v)
x∗
∞
x∗
t 1/v−1 e−t dt = (K (1/v)21/v /v){1 − P(1/v, x∗ )} =
1 2
1−P
1 ∗ ,x v
,
where x∗ = (−x/β)v /2, and P(·, ·) is the incomplete gamma function defined in Abramowitz and Stegun ([1], p. 260). it is easy to verify that F (0) =
(1/v, 0) (1/v) = = 1/2. 2 (1/v) 2 (1/v)
Since GED is a symmetric distribution, we have, for x ≥ 0, 1 1 1 x v F (x) = 1−P . , 2 v 2 β Setting u = F (x), the inverse function is given by ⎧ 1/v ⎪ ⎪ −1 1 , 1 − 2u ⎪ −β 2P ⎨ v x = F −1 (u) = 1/v ⎪ 1 ⎪ ⎪ ⎩ β 2P −1 , 2u − 1 v
(u < 0.5) (B.1) (u ≥ 0.5).
It is possible to rewrite the above equation by using the CDF of a χ2 -variant. Since the CDF of z ∼ χ2 (k) is given by Q(z|k) = P(k/2, z/2), the CDF of the GED becomes ⎧ 1 ⎪ ⎨ {1 − Q((−x/β)v |2/v)} (x < 0) F (x) = 2 ⎪ ⎩ 1 {1 + Q((x/β)v |2/v)} (x ≥ 0). 2 Thus, its inverse function is ⎧ 1/v 2 ⎪ ⎪ −1 ⎪ 1 − 2u| (u < 0.5) ⎨ −β Q v (B.2) x = F −1 (u) = 1/v ⎪ 2 ⎪ −1 ⎪ ⎩β Q 2u − 1| (u ≥ 0.5). v Therefore, it is possible to sample from GED by using Eqs. (B.1) or (B.2) and the probability integral transformation. References [1] [2] [3] [4] [5]
M. Abramowitz, N. Stegun, Handbook of Mathematical Functions, Dover Publications Inc., New York, 1970. M. Asai, Comparison of MCMC methods for estimating stochastic volatility models, Computational Economics 25 (2005) 281–301. M. Asai, Comparison of MCMC methods for estimating GARCH models, Journal of the Japan Statistical Society 36 (2006) 199–212. M. Asai, M. McAleer, J. Yu, Multivariate stochastic volatility: a review, Econometric Reviews 25 (2006) 145–175. X. Bai, J.R. Russell, G.C. Tiao, Kurtosis of GARCH and stochastic volatility models with non-normal innovations, Journal of Econometrics 114 (2003) 349–360. [6] L. Bauwens, M. Lubrano, Bayesian inference on GARCH models using the Gibbs sampler, Econometrics Journal 1 (1998) c23–c46.
2596
M. Asai / Mathematics and Computers in Simulation 79 (2009) 2579–2596
[7] T. Bollerslev, A conditional heteroskedastic time series model for speculative prices and rates of return, Review of Economics and Statistics 69 (1987) 542–547. [8] T. Bollerslev, R.F. Engle, D.B. Nelson, ARCH models, in: R.F. Engle, D. McFadden (Eds.), in: The Handbook of Econometrics, vol. 4, North-Holland, Amsterdam, 1994. [9] S. Chib, Marginal likelihood from the Gibbs output, Journal of the American Statistical Association 90 (1995) 1313–1321. [10] S. Chib, E. Greenberg, Bayes inference for regression models with ARMA(p,q) errors, Journal of Econometrics 64 (1994) 183–206. [11] S. Chib, I. Jeliazkov, Marginal likelihood from the Metropolis–Hastings output, Journal of the American Statistical Association 96 (2001) 270–291. [12] S. Chib, I. Jeliazkov, Accept–reject Metropolis–Hastings sampling and marginal likelihood estimation, Statistica Neerlandica 59 (2005) 30–44. [13] S. Chib, F. Nardari, N. Shephard, Markov chain Monte Carlo methods for stochastic volatility models, Journal of Econometrics 108 (2002) 281–316. [14] J. Danielsson, Stochastic volatility in asset prices: estimation with simulated maximum likelihood, Journal of Econometrics 64 (1994) 375–400. [15] J. Danielsson, J.F. Richard, Accelerated Gaussian importance sampler with applications to dynamic latent variable models, Journal of Applied Econometrics 8 (1993) 153–173. [16] P. de Jong, N. Shephard, The simulation smoother for time series models, Biometrika 82 (1995) 339–350. [17] F.X. Diebold, Empirical Modeling of Exchange Rate Dynamics, Springer-Verlag, New York, 1988. [18] E.F. Fama, The behavior of stock market prices, Journal of Business 38 (1965) 34–105. [19] M. Fridman, L. Harris, A maximum likelihood approach for non-Gaussian stochastic volatility models, Journal of Business & Economic Statistics 16 (1998) 284–291. [20] J. Geweke, Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments, in: J.M. Bernardo, J.O. Berger, A.P. Dawid, A.F.M. Smith (Eds.), Bayesian Statistics 4, Oxford University Press, Oxford, U.K., 1992, pp. 169–193. [21] J. Geweke, Getting it right: joint distribution tests of posterior simulators, Journal of the American Statistical Association 99 (2004) 799–804. [22] E. Jacquier, N.G. Polson, P.E. Rossi, Bayesian analysis of stochastic volatility models, Journal of Business & Economic Statistics 12 (1994) 371–389. [23] E. Jacquier, N.G. Polson, P.E. Rossi, Bayesian analysis of stochastic volatility models with fat-tails and correlated errors, Journal of Econometrics 122 (2004) 185–212. [24] S. Kim, N. Shephard, S. Chib, Stochastic volatility: likelihood inference and comparison with ARCH models, Review of Economic Studies 65 (1998) 361–393. [25] G. Kitagawa, Non-Gaussian state-space modeling of nonstationary time series (with discussion), Journal of the American Statistical Association 82 (1987) 1032–1063. [26] R. Liesenfeld, R.C. Jung, Stochastic volatility models: conditional normality versus heavy-tailed distributions, Journal of Applied Econometrics 15 (2000) 137–160. [27] B. Mandelbrot, The variation of certain speculative prices, Journal of Business 36 (1963) 394–419. [28] M. McAleer, Automated inference and learning in modeling financial volatility, Econometric Theory 21 (2005) 232–261. [29] H. Mitsui, T. Watanabe, Bayesian analysis of GARCH option pricing models (in Japanese), The Journal of the Japan Statistical Society 33 (Japanese Issue) (2003) 307–324. [30] T. Nakatsuma, Bayesian analysis of ARMA-GARCH models: a Markov chain sampling approach, Journal of Econometrics 95 (2000) 57–69. [31] D.B. Nelson, Conditional heteroskedasticity in asset returns: a new approach, Econometrica 59 (1991) 347–370. [32] G. Sandmann, S.J. Koopman, Estimation of stochastic volatility models via Monte Carlo maximum likelihood, Journal of Econometrics 87 (1998) 271–301. [33] N. Shephard, M.K. Pitt, Likelihood analysis of non-Gaussian measurement time series, Biometrika 84 (1997) 653–667. [34] L. Tierney, Markov chains for exploring posterior distributions (with discussion), Annals of Statistics 21 (1994) 1701–1762. [35] T. Watanabe, A non-linear filtering approach to stochastic volatility models with an application to daily stock returns, Journal of Applied Econometrics 14 (1999) 101–121. [36] T. Watanabe, Excess kurtosis of conditional distribution for daily stock returns: the case of Japan, Applied Economics Letters 7 (2000) 353–355. [37] T. Watanabe, M. Asai, Stochastic volatility models with heavy-tailed distributions: a Bayesian analysis, unpublished manuscript, Faculty of Economics, Tokyo Metropolitan University, 2003. [38] T. Watanabe, Y. Omori, A multi-move sampler for estimating non-Gaussian times series models: comments on Shephard and Pitt (1997), Biometrika 91 (2004) 246–248.