EMPFIN-00724; No of Pages 24 Journal of Empirical Finance xxx (2014) xxx–xxx
Contents lists available at ScienceDirect
Journal of Empirical Finance journal homepage: www.elsevier.com/locate/jempfin
Unit root vector autoregression with volatility induced stationarity Heino Bohn Nielsen, Anders Rahbek ⁎ Department of Economics, University of Copenhagen, Denmark
a r t i c l e
i n f o
Article history: Received 9 October 2013 Received in revised form 18 March 2014 Accepted 31 March 2014 Available online xxxx JEL classification: C32 Keywords: Vector autoregression Unit root Reduced rank Volatility induced stationarity Term structure Double autoregression
a b s t r a c t We propose a discrete-time multivariate model where lagged levels of the process enter both the conditional mean and the conditional variance. This way we allow for the empirically observed persistence in time series such as interest rates, often implying unit-roots, while at the same time maintain stationarity despite such unit-roots. Specifically, the model bridges vector autoregressions and multivariate ARCH models in which residuals are replaced by levels lagged. An empirical illustration using recent US term structure data is given in which the individual interest rates are found to have unit roots, have no finite first-order moments, but remain strictly stationary and ergodic. Moreover, they co-move in the sense that their spread has no unit root. The model thus allows for volatility induced stationarity, and the paper shows conditions under which the multivariate process is strictly stationary and geometrically ergodic. Interestingly, these conditions include the case of unit roots and a reduced rank structure in the conditional mean, known from linear co-integration. Asymptotic theory of the maximum likelihood estimatorspffiffiffi for a particular structured case (so-called self-exciting) is provided, and it is shown that T -convergence to Gaussian distributions apply despite unit roots as well as absence of finite first and higher order moments. Monte Carlo simulations illustrate the asymptotic theory. © 2014 Elsevier B.V. All rights reserved.
1. Introduction and summary This paper presents a new multivariate time series model which captures important stylized facts of the dynamics of term structure data. In particular the model allows for the typically observed persistence in interest rates often detected as unit-roots in the conditional mean in empirical analyses. But contrary to classic autoregressive models, the variables here enter the conditional variance as well. This can induce stationarity, such that despite unit-roots the multivariate process is stationary. In short, the model allows for co-movement of individual series such that for example spreads have no unit-roots, while at the same time the individual time series are allowed to have unit-roots and to be persistent but remain stationary. We present theory for inference as well as discuss properties of the applied model, and moreover demonstrate by our empirical analysis that the model captures surprisingly well dynamic features of US term structure data. Regarding the results on inference we show that standard asymptotic inference applies despite the fact that an immediate implication of the unit-roots present in the model is that the processes will have fat tails and only finite low order moments.
⁎ Corresponding author at: Department of Economics, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen K, Denmark. Tel.: +45 353 24031; fax: +45 353 23064. Rahbek is also affiliated with CREATES. E-mail addresses:
[email protected] (H.B. Nielsen),
[email protected] (A. Rahbek).
http://dx.doi.org/10.1016/j.jempfin.2014.03.008 0927-5398/© 2014 Elsevier B.V. All rights reserved.
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
2
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Our insistence on allowing for unit-roots is based on the rich literature in econometrics from which it is by now a stylized empirical fact that term structure data, as well as other financial economic time series, are persistent and appear to have unit-roots when modeled as autoregressive (AR) processes. However, the implied non-stationarity is often questioned from an economic, or finance, point of view, and alternative models which allow unit-root in the conditional mean, but are stationary due to the formulation of the conditional volatility have been proposed in the literature. A large part of such literature deals with continuous time models, where a key example is the celebrated Cox–Ingersoll–Ross model, cf. Cox et al. (1985), where both drift and volatility terms are functions of the level of the continuous time process. Our proposed model also fits well within the rich term structure literature, both in continuous as well as in discrete time, see for example Gourieroux et al. (2002), Le et al. (2010) and Carta et al. (2008) for affine term structure models. With respect to discrete time series models, the proposed model bridges two different time series modeling approaches: co-integration analysis, which is multivariate and allows for unit-roots or reduced rank, and the more recent double autoregressive modeling, which so far has been univariate. In the co-integrated vector AR models, see Johansen (1996), unit-roots, or equivalently, reduced rank in the conditional mean imply that the stochastic trends are non-stationary and have random walk behavior. This contrasts the univariate double autoregressive (DAR) models where a unit-root does not necessarily imply non-stationarity as lagged values of the process enter the conditional variance, see e.g. Borkovec and Klüppelberg (2001) and Ling (2004). To fix ideas consider initially the univariate DAR of order one as given by, xt ¼ xt−1 þ
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ωxx þ ϕ2xx x2t−1 zxt ;
ð1Þ
where zxt is i.i.d. N(0, 1), and ωxx N 0, ϕ2xx N 0. Despite the unit-root in the conditional mean, the process is strictly stationary provided 0 b ϕ2xx ≲ 2.42, which contrasts the case where squared lagged differences enter the conditional variance in which case unit-roots indeed imply non-stationarity, see Lange et al. (2011). In terms of our proposed term structure modeling the univariate unit-root DAR can be thought of as a model for the short term interest rate, driving the level and volatility of interest rates with different maturities. Consider thus next yt defined as the spread between an interest rate with long maturity and the short term rate xt, with the dynamics of yt given by, yt ¼ ρyt−1 þ
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ωyy þ ϕ2yy y2t−1 þ ϕ2yx x2t−1 zyt ;
ð2Þ
with zyt i.i.d. N(0, 1), and ωyy, ϕ2yy, ϕ2yx N 0, while the autoregressive parameter ρ ∈ ℝ. This way, the lagged short term rate xt, enters as a variance factor of the spread yt. That is, a stationary factor with a unit-root enters the conditional variance of the spread, where if |ρ| b 1 the interest rates are co-moving. Note that, as will be detailed in Section 3 for general versions of the process which include the above, that the process nests random-walk type behavior (as ϕxx approaches zero), and, despite the Gaussian innovations (zit, i = y, x), the joint process will have Paretian, or fat, tail like behavior. That is, the model indeed matches characteristics of financial data. Moreover, note that in terms of the affine term structure framework of Le et al. (2010) also the joint process (yt, xt)′ can be used as a multivariate (unobservable) factor. Within the unit-root and co-integration literature much attention has been devoted to the inclusion and role of constant terms. A key problem in this strand of literature is that a constant term μ may induce a linear trend due to the implied aggregation as caused by the unit-roots. This has lead to various ways of including restricted constants, and other deterministic terms, in multivariate co-integrated vector AR models, cf. Johansen (1996). However, as we show below, we do avoid such issues here and can include an unrestricted constant vector μ in our multivariate model. This is novel in the framework of nonlinear time series. In terms of Eq. (1), we show that adding a constant term, μx say, on the right hand side does not imply a linear trend, nor that the properties of xt in terms of stationarity and (geometric) ergodicity are changed, despite the unit-root in xt. For an introduction to the recent univariate DAR models, inference and estimation have been explored for univariate DAR models in Ling (2004, 2007) and Ling and Li (2008), while extremal and tail behavior have been analyzed for DAR models of order one in Borkovec (2000), Borkovec and Klüppelberg (2001) and Klüppelberg and Pergamenchtchikov (2004). As also used in the mentioned references DAR processes with Gaussian innovations may be restated as random coefficient autoregressions (RCAR), for which Aue et al. (2006) and Berkes et al. (2009) provide results, as well as key references, on estimation and inference. In fact, the RCAR approach is applied in Fong and Li (2004), where co-integration is discussed with random coefficients. The parametrization in Fong and Li (2004) of the conditional variance is quite different when compared to ours, and moreover Fong and Li (2004) apply a local approach where the conditional variance parameters loading the levels vanish at the rate of T, where T denotes the number of observations. Klüppelberg and Pergamenchtchikov (2007) study extremal behavior of a class of multivariate RCAR processes with finite second order moments, which excludes the unit-roots which is a main interest here. The paper is structured as follows: In the next section we describe in detail our proposed multivariate model. The model is a vector AR model with reduced rank structure in the conditional mean, allowing for unit-roots, while the conditional variance is cast in line with multivariate BEKK ARCH models but in levels. Next, we derive conditions for stationarity, geometric ergodicity and existence of moments, where it is emphasized that the unit-roots imply that the process in general, while being stationary, will only have finite small order moments. Asymptotic theory of the maximum likelihood estimators for the applied US term structure model is given. It is found that despite the fact that the processes lack finite even second and first order moments, Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
3
pffiffiffi maximum likelihood estimators (MLEs) are asymptotically standard T -Gaussian distributed, which is supported by Monte Carlo simulations. Observe that our focus is on the Markovian case of one lag in the conditional mean and variance; in the last sub-section we discuss the general non-Markovian case of further lags as well as other extensions left for future research. It is important though to stress that we found for the US term structure application the Markovian case to be sufficient. Throughout, the following notation is applied: The symbols → D and → P are used to indicate weak convergence and convergence in probability, respectively. For any p × r matrix α of rank r, r b p, let α⊥ denote a p × (p − r) matrix whose columns form a basis of the orthogonal complement of span (α). For any square matrix A, |A| denotes the determinant, ρ(A) denotes the spectral radius, while ‖A‖ denotes the norm of A, where the Euclidean norm is given by ‖A‖2 = tr{A′A}. We furthermore apply the non-standard notation A×2 ≔ AA′. Finally, for any matrices A and B, (A ⨂ B) is the Kronecker product. 2. The reduced rank AR model with volatility induced stationarity As explained in the introduction we wish to formulate a multivariate AR model which on the one hand allows for unit-roots in the conditional mean, while at the same time, allows levels to appear in the multivariate conditional ARCH part, which – possibly – induce stationarity. The model is notationally a little involved, and we discuss therefore in separate steps the conditional AR mean and the conditional ARCH parametrization to allow for level induced stationarity. As mentioned we study the Markovian case here, and discuss the non-trivial extension of the model to the non-Markovian case in Section 6. Consider first the conditional mean part. 2.1. Conditional mean Consider the p-dimensional vector autoregressive model of order one with a constant term as given by, 1=2
Y t ¼ μ þ ΦY t−1 þ ηt ;
ηt ¼ Ωt zt
where Φ is (p × p)-dimensional, μ is p-dimensional and zt i.i.d. N(0, Ip) such that Ωt is the conditional variance of Yt. Before specifying the parametrization of the conditional covariance Ωt, we note that k ≥ 1 unit-roots in the autoregressive polynomial, A(z) := Ip − Φz, z ∈ ℂ, implies that Π := Φ − Ip has reduced rank, r = p − k, under the well-known assumption: Assumption 1. Let A(z) = Ip − Φz, z ∈ ℂ. Assume that the characteristic equation, |A(z)| = 0, has k roots at z = 1, and the remaining (p − k) roots satisfy |z| N 1. As applied repeatedly in co-integration analysis, the reduced rank r of Π can be parametrized explicitly, 0
Π ¼ αβ ¼
r X
0
αi βi ;
i¼1
where αi and βi are p-dimensional vectors, i = 1, 2,…,r, and α = (α1, …, αr), β = (β1, …, βr). Thus we may rewrite the vector AR model, using Π and first order differences ΔYt = Yt − Yt − 1, 0
ΔY t ¼ μ þ αβ Y t−1 þ ηt ;
1=2
ηt ¼ Ωt zt :
ð3Þ
Next, observe that under Assumption 1, the following skew-projection identity from Johansen (1996) applies, 0
0
Ip ¼ β⊥;α ⊥ α ⊥ þ α β β ;
ð4Þ
−1 where α⊥ = (α⊥,1, …, α⊥,k) and β⊥ = (β⊥,1, …, β⊥,k), both of dimension (p × k), and where β⊥;α ⊥ :¼ β⊥ α 0⊥ β⊥ and αβ := α(β′α)−1. Use the skew-projection to decompose Yt as follows, 0
0
Y t ¼ β⊥;α ⊥ α ⊥ Y t þ α β β Y t :
ð5Þ
By pre-multiplying in Eq. (3) with α⊥′, it follows that the k-dimensional process, α⊥′Yt, is a unit-root process, and the k linear combinations α⊥′Yt in Eq. (5) are common stochastic unit-root factors which are loaded by the matrix β⊥;α⊥ into Yt. Likewise, pre-multiplying with β′ in Eq. (3) gives, β′Yt = β′μ + (Ir + β′α)β′Yt − 1 + β′ηt. That is, the r linear combinations in the conditional mean, the conditional mean relationships, β′Yt, are autoregressive with no unit-roots; we say that they co-move. Observe that as mentioned it is customary in co-integration analysis to consider restrictions on the constant vector μ to avoid the implied aggregation, and hence linear deterministic trend, arising from the reduced rank of Π. Despite the fact that α⊥′Yt indeed have unit-roots here these turn out not to imply aggregation of μ, see Theorem 1; this is due to the fact that the lagged levels enter the conditional variance to be defined next. Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
4
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
2.2. Conditional variance We propose a version of the BEKK ARCH model of Engle and Kroner (1995) for Ωt in Eq. (3) as it fits well our US data application. Naturally alternative choices for the functional specification of Ωt could be chosen by replacing lagged residuals (here ηt) with Yt in the rich class of multivariate ARCH formulations, see e.g. Bauwens et al. (2006) for a survey of multivariate ARCH specifications. While such constructions appear straightforward, we emphasize that each of these choices require separate treatment as the different specifications will have different implications for the properties of the model. Consider initially the unrestricted BEKK ARCH model of order h from Engle and Kroner (1995) as given by, Ωt ¼ Ω þ
h X
2
ðAi Y t−1 Þ ;
ð6Þ
i¼1
where Ai are (p × p) matrices and Ω N 0. Observe that in the BEKK ARCH model, the conditional variance is represented in terms of sums of squared terms and cross-product terms are omitted. The unrestricted BEKK ARCH is discussed in detail in Engle and Kroner (1995), including identification and generality of the parametrization in Eq. (6). While one can indeed apply this directly, we propose more structure to the BEKK ARCH such that the impact of the linear combinations β′Yt − 1 and α⊥′Yt − 1 in the conditional variance is transparent as in factor-ARCH models discussed in inter alia Bauwens et al. (2006). To do so, we use the skew-projection (Eq. (4)) again, which can be used to rewrite any (p × p)-dimensional matrix ϕ, as follows, h i h i 0 0 ϕ ¼ α β ϕββ þ β⊥;α⊥ ϕα ⊥β β þ β⊥;α ⊥ ϕα⊥ α⊥ þ α β ϕβα⊥ α ⊥ ð7Þ where αβ (p × r) and β⊥;α⊥ (p × k) are defined in Eq. (4), and ϕββ := β′ϕαβ (r × r), ϕα ⊥β :¼ α 0⊥ ϕα β (r × k), ϕα ⊥ α ⊥ :¼ α 0⊥ ϕβ⊥;α ⊥ (k × k) and finally, ϕβα ⊥ :¼ β0 ϕβ ⊥;α⊥ (r × k). Using Eqs. (7) in (6) and omitting cross-product terms, the suggested conditional covariance parametrization is then given by, 2 2 2 2 0 0 0 0 Ωt ¼ Ω þ α β ϕββ β Y t−1 þ β⊥;α ⊥ ϕα⊥β β Y t−1 þ β⊥;α⊥ ϕα ⊥ α ⊥ α ⊥ Y t−1 þ α β ϕβα ⊥ α ⊥ Y t−1 :
ð8Þ
With Ωt given in Eq. (8), we have imposed a structure which allow the lagged unit-root factors α⊥′Yt − 1 as well as lagged conditional mean relations β′Yt − 1 to enter the conditional variance of Yt. More precisely, the role of the parameters αβ and β⊥;α ⊥ is to load the factors β′Yt − 1 and α⊥′Yt − 1 in the conditional variance of the linear combinations β′Yt and α⊥′Yt respectively, as follows from using α⊥′αβ = 0 and β0 β⊥;α ⊥ ¼ 0; and hence the remaining parameters ϕij i, j = α⊥, β provide the “magnitude” or “size” of the loaded factors in the conditional variance. We add more structure by setting ϕα⊥β ¼ 0 in Eq. (8), 2 2 2 0 0 0 Ωt ¼ Ω þ α β ϕββ β Y t−1 þ β⊥;α⊥ ϕα ⊥ α ⊥ α ⊥ Y t−1 þ α β ϕβα ⊥ α ⊥ Y t−1 :
ð9Þ
This further structure has the immediate implication that the k stochastic common factors α⊥′Yt have k unit-roots and a conditional variance driven solely by their own past values since α⊥′αβ = 0. That is, in Eq. (9) the conditional mean of α⊥′ΔYt is α⊥′μ, while the conditional variance is given by, 2 0 0 0 α ⊥ Ωt α ⊥ ¼ α ⊥ Ωα ⊥ þ ϕα⊥ α⊥ α ⊥ Y t−1 : On the other hand the conditional covariance of β′Yt is driven by their own past β′Yt − 1 as well as the common unit-root factors α⊥′Yt − 1. We shall say that in this case, the k stochastic factors, α⊥′Yt, are self-exciting, while β′Yt are not. It is this version of the model which we successfully apply to the US term structure, and indeed find that the short rate is a self-exciting stochastic factor driving (parts of) the term structure. It is important to underline that while α⊥′Yt marginally is a multivariate unit root DAR process, the model does not exclude covariation in the sense that the conditional covariance, Cov(β′Yt, α⊥′Yt|Yt − 1) = β′Ωα⊥ is unrestricted since Ω is in Eq. (9). That is, it is indeed a multivariate model. 2.3. Parameters of the model For the statistical analysis the general model given by Eq. (3) with Ωt in Eqs. (8) or (9) is overparametrized as β and α (p × r) as well as α⊥ and β⊥ (p × k) are not identified without imposing just-identifying restrictions. As an example, one can apply the classic identification schemes well-known from co-integration, see Johansen (1996), where identified versions of β and α are given by, 0 −1 β c :¼ β c β
0 and α c :¼ α β c ;
ð10Þ
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
5
with c a known (p × r) matrix. Likewise, one may apply identified versions of β⊥ and α⊥ as given by, 0 β⊥;c :¼ Ip −cβc c⊥
0 −1 0 and α ⊥;c :¼ Ip −βc α c βc α c Þβ⊥;c :
ð11Þ
With this, or some other identification scheme, the parameters of the model are given by α, β, α⊥, β⊥ and Ω N 0, in addition to the loadings in the BEKK variance, ϕββ (r × r), ϕα⊥β (k × r), ϕα⊥ α⊥ (k × k) and ϕβα ⊥ (r × k). 3. Stationarity and geometric ergodicity We discuss here under which assumptions of the parameters the conditional mean relations β′Yt are stationary and geometrically ergodic, while also Yt despite its unit-roots in the characteristic polynomial is geometrically ergodic. As emphasized this is different from co-integration where the unit-roots imply that Yt is non-stationary, which again leads to asymptotic distributions of parameter estimates characterized by Brownian motions. Here the unit-roots imply ergodicity, but at the same time the processes cease to have finite even small moments and hence imply heavy tails in Yt. 3.1. AR BEKK model We start by formulating a general result regarding stationarity and geometric ergodicity in the AR model with unit-roots in the conditional mean part, and the general BEKK ARCH Ωt in Eq. (6) which embeds the restricted parametrizations in Eqs. (8) and (9), ΔY t ¼ μ þ
r X
0
1=2
α i βi Y t−1 þ Ωt zt ;
Ωt ¼ Ω þ
i¼1
h X
2
ðAi Y t−1 Þ
;
ð12Þ
i¼1
with zt i.i.d. N(0, Ip). Following this we provide more details for the case applied in the empirical example, where the stochastic factors are self-exciting, see Eq. (9). Our first result states as claimed conditions under which, despite the reduced rank, the process Yt is stationary and geometrically ergodic: Theorem 1. The process Yt given by Eq. (12) is stationary, and geometrically ergodic, if the associated top Lyapunov coefficient is strictly negative, that is, q 1 ð13Þ γ :¼ lim E log ∏t¼1 Q t b 0: q→∞ q Here Q t := (Ip + αβ′ + ϵt) and ϵt is (p × p)-dimensional i.i.d. Gaussian with mean zero and a covariance structure defined by E(ϵt ⊗ ϵt) = ∑hi = 1(Ai ⊗ Ai). We emphasize that while the characterization here of stationarity of Yt for a given set of parameters is implicit, the parameter set for which Eq. (13) holds is non-empty, as can be illustrated by simulations described below. We discuss this in more detail for the structured model where the conditional variance is parametrized such that the stochastic factors are self-exciting. The proof of Theorem 1 is located in Appendix A and is based on rewriting the multivariate process as a random coefficient autoregressive process and application of the drift criterion from Markov chain theory, see also Ling (2007) where a similar approach is used for univariate DAR models of general order. The proof, and hence Theorem 1 is not simple to generalize to other of the existing general covariance structures from multivariate ARCH in Bauwens et al. (2006), as many of the non-BEKK models, including the constant conditional correlation model, cannot be written on random coefficient form. Remark 1. Computing the Lyapunov coefficient in Eq. (13) by simulation, one may use that, q 1 a:s:; Q ∏ γ ¼ lim log t q→∞ q t¼1 as also noted by several authors, see e.g. Ling (2007) and Francq and Zakoïan (2010, Theorem 2.3). There is a rich and general literature on efficient computation of log‖∏qt = 1Q t‖ and hence γ. In the computations here, we apply QR-decompositions from Dieci and Van Vleck (1995). Remark 2. The classic co-integrated AR model is obtained with Ai = 0, in which case it is well-known that Yt is non-stationary. This conforms with our result as in this case Q t = Q is non-stochastic, and under Assumption 1, γ ¼ lim q→∞
q 1 log ∏t¼1 Q t k ¼ logρðQ Þ ¼ 0: q
Note also in this respect, that with γ = 0 then, contrary to the case of γ b 0, the constant term μ accumulates and generates a linear trend in the α⊥′Yt process. Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
6
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Remark 3. From Xie and Huang (2009, equation (1.2)) it follows that Yt has finite δ'th order moment, E‖Yt‖δ b ∞, if the moment h q δ i Lyapunov coefficient ρδ b 0 where ρδ :¼ limq→∞ 1q logE ∏ t¼1 Q t . Remark 4. As used in Klüppelberg and Pergamenchtchikov (2007), the general Lyapunov condition (13) is implied by the more explicit criterion, ρðEðQ t ⊗Q t ÞÞ b 1:
ð14Þ
However, this is a strong assumption for the multivariate case, in the sense that it implies that not only is Yt geometrically ergodic but also that it has finite second order moments, E‖Yt‖2 b ∞. Moreover, by definition this stronger assumption does not allow for unitroots in A(z), or equivalently the reduced rank of Π, Π = αβ′, and hence cannot be used to verify stationarity of Yt under Assumption 1. 3.2. Self-exciting stochastic factors When turning to the more structured case where the stochastic factors of the model, (α⊥j′Yt)j = 1,…,k, are self-exciting much more can be said in terms of verification of geometric ergodicity. In addition to results for the self-exciting case, we propose a further reduction of the model, which we label as separability. Under separability we may have that the conditional mean relations (β′Y i t)i = 1,…,r are stationary with finite second order moments, while the stochastic factors do not have any finite moments. Consider the p-dimensional process Yt as given by Eqs. (3) and (9) with self-exciting factors which we re-state here, 0
ΔY t ¼ μ þ αβ Y t−1 þ ηt ;
1=2
ηt ¼ Ωt zt ;
ð15Þ
h i2 h i2 h i2 0 0 0 þ α β ϕβα⊥ α ⊥ Y t−1 þ β⊥;α ⊥ ϕα ⊥ α⊥ α ⊥ Y t−1 ; Ωt ¼ Ω þ α β ϕββ β Y t−1
ð16Þ
using the already applied convention that ab = a(b′a)−1 for any (p × m) matrices a, b such that b′a has full rank. The next theorem states that Yt in this self-exciting case is stationary as induced by the conditional volatility. Moreover, the condition for h i2 0 stationarity does not depend on the contribution from the term α β ϕβα⊥ α ⊥ Y t−1 . Stated differently, whatever the value or size of ϕβα⊥ is, while the individual realizations of Yt will vary with ϕβα ⊥ , the conclusions regarding geometric ergodicity and stationarity remain unaffected. Thus we consider the following regularity conditions: and i.i.d. Gaussian distributed, Assumption 2. Define ϱt = Ir + β′α + ϵ ϱt , and γt = Ip − r + ϵγt , where ϵ ϱt and ϵγt are independent γ γ with mean zero and covariance structure given by E(ϵ ϱt ⊗ ϵ ϱt ) = (ϕββ ⊗ ϕββ) and E ϵt ⊗ ϵt ¼ ϕα⊥ α⊥ ⊗ ϕα⊥ α⊥ . Assume that, q 1 E log ∏t¼1 ϱt b 0 q→∞ q
λϱ :¼ lim
q 1 E log ∏t¼1 γt b 0: q→∞ q
and λγ :¼ lim
ð17Þ
One may observe that ϱt corresponds to a RCAR model for β′Yt without the α⊥′Yt process, and in this sense the assumption addresses a “skeleton” process for β′Yt. The fact that α⊥′Yt is self-exciting means that γt simply is the RCAR coefficient for this process. Note also that the covariance Ω in Ωt plays no role in the determination of the stochastic behavior. Thus the next theorem shows as claimed that ϕβα⊥ plays no role for the dynamic properties of Yt. The proof is given in Appendix A and is based on application of a novel drift-function for the drift criterion from Markov chain theory: Theorem 2. Under Assumptions 1 and 2, the p-dimensional process Yt in Eqs. (15)–(16) is geometrically ergodic, and has a stationary representation with E‖Yt‖δ b ∞, for some small δ ∈ (0, 1). Next, we consider the mentioned concept of separability, by this we mean that ϕβα ⊥ ¼ 0 in Eq. (16). That is, β′Yt − 1 enter the variance only in the linear combinations β′Yt, and likewise for α⊥′Yt. Note though that Ω is not restricted to be block-diagonal. In the case of separability we observe: Corollary 1. Assume that Assumptions 1–2 hold such that the p-dimensional process Yt in Eqs. (15)–(16) is geometrically ergodic. Assume furthermore that we have separability, ϕβα ⊥ ¼ 0, and that ρðEðϱt ⊗ ϱt ÞÞ b 1: Then the conditional mean relations β′Yt are geometrically ergodic and strongly mixing with geometric rate. Furthermore, they have finite second order moments, E‖β′Yt‖2 b ∞. Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
7
Remark 5. Regarding tail and extremal behavior of the r-dimensional process, β′Yt, some further results can be deduced from Klüppelberg and Pergamenchtchikov (2004, 2007) under the assumptions in Corollary 1 of separability. In particular, they find under regularity conditions (Klüppelberg and Pergamenchtchikov, 2007, condition H0), that the tails may be characterized as Pareto-like, that is, the conditional mean relations β′Yt here have finite second order moments and a tail index κβ N 2. Remark 6. Regarding tail and extremal properties of the k-dimensional stochastic factors, α⊥′Yt to our knowledge no results are known for k N 1, while for k = 1 these are studied in detail in Borkovec (2000) and Borkovec and Klüppelberg (2001). In particular, Borkovec and Klüppelberg (2001, Proposition 2) implies Pareto-like tails with index κ α ⊥ b 2 such that κ β N κ α ⊥ as expected. 4. Asymptotics As mentioned in the introduction we derive the asymptotics pffiffiffi for the bivariate model, which is the model applied in the empirical illustration. In particular it is shown that classic T -convergence to Gaussian distributions apply to all estimators including the reduced rank parameter β, despite the implied heavy tails of the process Yt. A small Monte Carlo simulation study in Section 4.1 illustrates that the Gaussian approximation works well even for smaller, or moderate, samples. The empirically applied bivariate model with a self-exciting factor is given by, 0
ΔY t ¼ μ þ αβ Y t−1 þ ηt ;
1=2
ηt ¼ Ωt zt
18
and h i2 h i2 h i2 0 0 0 Ωt ¼ Ω þ α β ϕββ β Y t−1 þ α β ϕβα ⊥ α ⊥ Y t−1 þ β⊥;α ⊥ ϕα ⊥ α⊥ α ⊥ Y t−1 ;
19
where zt is i.i.d. N(0, I2). With α = (a, 0)′ and β = (1, b)′, the parameters of the model, denoted θ, are given by μ = (μ1, μ2)′, the scalar parameters a, b, ϕ2ββ ; ϕ2βα⊥ , and ϕ2α⊥ α ⊥ , as well as the positive definite covariance matrix,
Ω¼
ω11 ω12
ω12 ω22
N 0;
thus with the parameters given by, n o 2 2 2 θ ¼ a; b; μ; Ω; ϕββ ; ϕβα ⊥ ; ϕα⊥ α⊥ ;
ð20Þ
the log-likelihood function to be maximized equals, LðθÞ ¼
T X t¼1
lt ðθÞ ¼ −
T h n oi 1X −1 0 log Ωt;θ þ tr Ωt;θ ηt;θ ηt;θ ; 2 t¼1
ð21Þ
where the notation ηt,θ = (ΔYt − μ − αβ′Yt − 1) and Ωt,θ is used to emphasize that ηt and Ωt are functions of θ. The result in Theorem 3 states that despite the lack of finite low order moments of Yt, the ML estimator ^θ is indeed asymptotically Gaussian distributed. The proof is given in Appendix B, and is based on deriving properties of the score, information and third order derivatives of the log-likelihood function in order to use Jensen and Rahbek (2004, Lemma 1). The findings are in line withpknown results for univariate models, such as in Jensen and Rahbek (2004) where it is shown that the (G) ffiffiffi ARCH parameters are T -consistent despite lack of finite moments of the GARCH process, as well as Ling (2004, 2007) for univariate DAR models. However, univariate results do not necessarily generalize to multivariate cases. Specifically, for the BEKK model, being the multivariate generalization of the univariate ARCH model, pffiffiffi the findings in Avarucci et al. (2013) demonstrate that high order moments of the BEKK processes are needed in general for T -Gaussian inference to apply. Our contrasting results below reflect that Avarucci et al. (2013) study the BEKK model with no conditional mean, and therefore our results show that by combining (a restricted) BEKK structure for the conditional variance and a reduced rank AR structure for the conditional mean, a different model, and hence inference, results despite the apparent similarities. Theorem 3. Consider the bivariate model in Eqs. (18)–(19), with zt i.i.d. N(0, I2), and parameters given by θ in Eq. (20). Then under Assumptions 1–2 and for θ0 satisfying ϕ2ij,0 N 0 for i, j = β, α⊥, Ω0 N 0, and |Ω0| N ψ N 0 for some constant ψ, there exists a fixed open neighborhood N (θ0) of the true value θ0 such that with probability tending to one as T → ∞, L(θ) has a unique maximum at ^θ. Moreover, ^θ is consistent and asymptotically Gaussian, pffiffiffi T ^θ−θ0 → D Nð0; Σθθ Þ; with Σθθ consistently estimated by the observed information evaluated at θ ¼ ^θ. Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
8
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx Table 1 Estimation results for the model in Eqs. (18)–(19). Standard errors in parentheses.
a b μ1 μ2 c11 c12 c22 ϕββ ϕβα⊥ ϕα ⊥ α ⊥ L ^θ
(i)
(ii)
Baseline
Restricted
−0.1627 (0.0436) −1.0042 (0.0305) 0.0822 (0.0247) 0.0030 (0.0128) 0.1409 (0.0191) 0.0364 (0.0223) 0.0677 (0.0182) 0.1770 (0.0375) 0.0243 (0.0035) 0.0563 (0.0044) −16.5021
0.1600 (0.0348) −1 0.0839 (0.0224) 0.0030 (0.0128) 0.1412 (0.0191) 0.0365 (0.0223) 0.0676 (0.0181) 0.1733 (0.0292) 0.0242 (0.0032) 0.0561 (0.0045) −16.5169
Remark 7. The regularity conditions assumed to hold include Assumptions 1 and 2, in addition to ϕ2ij,0 N 0 for i, j = β, α⊥, Ω0 N 0 and |Ω0| N ψ N 0 for some constant ψ. That ϕ2ij,0 N 0 holds corresponds to the assumption in classic ARCH literature of positive parameters, or equivalently, that ARCH effects are present, see for example Francq and Zakoïan (2010). That Ω0 is positive definite and satisfies that the determinant is bounded away from zero is a regularity condition applied, and also discussed in detail, in Comte and Lieberman (2003) as well as Jeantheau (1998), from where the regularity condition originates in the ARCH literature. It implies for example that |Ωt| N ψ, cf. Appendix B. Remark 8. In line with the literature on ARCH models, it would be of interest to extend the analysis to allow for non-Gaussian innovations, such as multivariate t-distributions. If in fact zt is not Gaussian, then in line with existing ARCH literature, the estimator studied would be referred to as the quasi maximum likelihood estimator (QMLE). Note in this respect, that changing the distributional assumptions on zt may also imply that the formulation of the conditional variance could be changed such that transformations of the squared level enter instead. That is the lagged score may enter instead as in the recent literature on generalized autoregressive score models, see e.g. Creal et al. (2012) and Harvey (2012).
4.1. Simulation study In this section we illustrate the results on asymptotic normality of the ML estimators stated in Theorem 3 by Monte Carlo simulations,1 for the model with volatility induced stationarity given by Eqs. (18)–(19). As suggested by an anonymous referee, we report simulations with parameter values set equal to the estimates in the empirical application for interest rates, i.e. column (ii) of Table 1 in Section 5 and report results for sample lengths of T = 313, as in the empirical application, and for T = 900. Moreover, the (conditional) innovations zt are i.i.d. N(0, I2) for t = 1, 2,…T. The parameter values satisfy the assumptions of Theorem 3: (i) αβ′ in Eq. (18) has reduced rank equal to one, and, (ii) the positivity constraints, ϕij N 0 for i, j = β, α⊥ and Ω N 0, apply. Remark 9. Note that if indeed the assumptions are violated, that is, if the parameter values were chosen on the boundary of the parameter space, then as demonstrated in Andrews (1999, 2001), and recently in Francq and Zakoïan (2009) for (G)ARCH estimation, different types of asymptotic limit distributions are to be expected. Moreover, for small samples, and for parameter values close to the boundary of the parameter space, the approximations are less accurate, and it would be of interest to investigate inference based on bootstrap approximations of the sampling distributions as successfully done for the co-integrated vector autoregressive processes in e.g. Swensen (2006) and Cavaliere et al. (2012). Fig. 1(A)–(I) report kernel estimates of the densities of the ten estimators based on 104 Monte Carlo replications.2 For both sample lengths, T ∈ {313, 900}, the kernel densities for most parameters of interest are quite close to a Gaussian reference
1
Numerical calculations have been done using Ox, see Doornik (2007). Occasionally, estimates in the simulations are on the boundary of the parameter space. These we discard in line with Cavaliere et al. (2012), where explosive cases were excluded in the context of bootstrapping. 2
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
(A) a
9
(B) b
20
7.5 T=313 T=900
5.0
10 2.5 -0.3
30
-0.2
-0.1
0.0
(C) μ1
-1.50
200
-1.25
-1.00
-0.75
-0.50
-0.25
(D)μ2
20 100
10 0.00
0.05
0.10
0.15
0.20
-0.03 -0.02 -0.01
(E) φββ 30
20
0
0.01 0.02 0.03 0.04
(F) φβα⊥
20 10
10 0
0.05
0.1
0.15
0.2
0.25
0.3
0.00
(G) φα
0.05
0.10
0.15
0.20
0.25
(H) c11
⊥α⊥
60 50
40
25
20 0
0.025 0.05 0.075
0.1
0.125 0.15 0.175
(I) c21
0.05 0.075
75
40
0.1
0.125 0.15 0.175
0.2
0.225
(J) c22
50
20
25 -0.050
-0.025
0.000
0.025
0.050
0.06
0.08
0.10
0.12
Fig. 1. Kernel density estimates of the simulated distributions of the estimated parameters. Simulations are based on T = 313 (red curve) and T = 900 (green curve) observations and 104 Monte Carlo replications. Thin black curves represent Gaussian distributions with matching mean and variance. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
distributions with matching mean and variance. Note that the Gaussian approximation is less accurate for the ‘volatility level’ parameters in Ω, parametrized as Ω = CC′, with C lower triangular. This is a well-known result from the theory of estimation of conditional volatility models in general, see for example Lumsdaine (1995) for univariate GARCH models. Remark 10. Unreported simulations, which are available upon request, for larger parameter values support further the asymptotic Gaussian approximations.
5. Empirical application To illustrate the use and interpretation of the model, consider a bivariate data set for monthly US interest rates, 1981:1–2006:12, covering the short end of the yield curve. The variables considered are the yield of a one year maturity zero coupon bond, Y1t, and a Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
10
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
(A) Interest rate levels 15
(B) Spread One year yield 3 month rate
2 10
5
0
1985
2
1990
1995
2000
2005
1985
(C) Change in the long rate
1990
1995
2000
2005
(D) Change in the short rate 2
0
0
-2
-2 1985
1990
1995
2000
2005
1985
1990
1995
2000
2005
Fig. 2. Monthly US interest rates, 1981–2006.
three month treasury bill rate, Y2t. The time series are illustrated in Fig. 2(A), while the spread, Y1t − Y2t, is given in Fig. 2(B). Fig. 2(C)–(D) show the first differences, where a pronounced heteroskedasticity is observed. We estimate the model in Eqs. (18)–(19) with zt i.i.d. N(0, I2). As in the simulations, we apply a Choleski factorization for Ω N 0, and we thus estimate the parameters a, b, μ, Ω = CC′ and ϕij, i, j = β, α⊥. Estimates and standard errors are reported in column (i) of Table 1. First, we note that the estimate of b is extremely close to minus unity, suggesting that the conditional mean relationship, β′Yt, is actually the interest rate spread. Testing the spread hypothesis, b = − 1, produces Wald and likelihood ratio statistics of 0.019 and 0.030, that are far from significant in the limiting χ2(1) distribution. Imposing the restriction produces the estimates in column (ii) of Table 1, and we note that estimates for the remaining parameters are basically unchanged. Below we continue the interpretation based on the restricted model in column (ii). An immediate implication of the restricted model is that the interest rate levels, Y1t and Y2t, are unit root processes whereas ^ ¼ −0:16, is the spread, β′Yt = Y1t − Y2t, is not; the estimated adjustment coefficient towards the conditional mean relationship, a clearly significant. Next, note that the parameters in the conditional variance, ϕββ ; ϕβα⊥ , and ϕα⊥ α ⊥ are all significantly different from zero. The self-exciting factor is the short rate, α⊥′Yt, which is a unit root DAR. In contrast, the conditional variance of the spread, β′Yt, is driven by the squared self-exciting short rate and the squared spread itself. Also recall that the model allows correlation between the short rate, α⊥′Yt, and the spread, β′Yt. The estimated conditional variances of Y1t and Y2t, i.e. the diagonal elements in Ωt, are shown in Fig. 3(A) and (B), and, not ^t , while surprisingly, the patterns share similarities with the short rate. Panels (C) and (D) of Fig. 3 show the estimated residuals, η panels (E) and (F) present the standardized residuals, ^zt . Note again the pronounced conditional heteroskedasticity, which is to a large extent accounted for by the model. To assess the stationarity and discuss the dynamic properties of the conditional mean relationship, β′Yt and the self-exciting factor, α⊥′Yt, we calculate the Lyapunov exponents based on the random coefficient autoregressive representation of the bivariate system, and it is convenient to consider the system multiplied by (β′, α⊥′)′. A direct application of Theorem 1 would lead to computing the top Lyapunov coefficient of the RCAR model with Qt = Q + ϵt, with Q = diag(0.840, 1) and the variance of ϵt given by 0
2
0:173 B 0 Eðϵt ⊗ ϵt Þ ¼ B @ 0 0
0 0 0 0
1 2 0 0:024 0 0 C C; 0 0 A 2 0 0:056
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
(A) Conditional variance, long rate
11
(B) Conditional variance, short rate 0.75
0.75 0.50 0.50 0.25
0.25
1985
1990
1995
2000
2005
1985
(C) Residuals, long rate
1990
1995
2000
2005
(D) Residuals, short rate
2 2
0
0
-2
-2 1985
1990
1995
2000
2005
1985
(E) Standardized residuals, long rate
1990
1995
2000
2005
(F) Standardized residuals, short rate 2.5
2.5
0.0 0.0 -2.5 -2.5 1985
1990
1995
2000
2005
1985
1990
1995
2000
2005
Fig. 3. Estimated conditional variance and residuals from the model reported in Table 1, column (ii). (A)–(B) show the estimated conditional variances. (C)–(D) show estimated residuals, while (E)–(F) show estimated residuals standardized with the square-root of the estimated conditional variance.
or alternatively, that E(vec(ϵt)vec(ϵt)′) = diag(0.1732, 0, 0.0242, 0.0562). Direct computation of the matrix product in Eq. (13) is numerically instable because the Lyapunov stability condition implies that the matrix product converges to zero exponentially fast. Instead, we follow Dieci and Van Vleck (1995) and note that the two Lyapunov exponents, characterizing the dynamics of the system, can be found as the log of the eigenvalues of Λ = limq → ∞((∏qt = 1Qt)′(∏qt = 1Qt))−2q. An efficient and numerically stable algorithm can be implemented using a sequential QR-decomposition of the matrix products, see Dieci and Van Vleck (1995, Section 2.1). For the estimated system, the two simulated Lyapunov exponents based on q = 108 random matrices, Qt, t = 1, 2, …, q, are given by ^ ¼ −0:19728 λ ϱ ð0:04218Þ
^ ¼ −0:00158; and λ γ ð0:000257Þ
where numbers in parentheses are standard errors based on numerical delta method. By Theorem 2, the top Lyapunov coefficient ^ϱ; λ ^ γ ; where λ ^ γ is associated with the self-exciting stochastic factor, xt = α⊥′Yt, while the smaller Lyapunov ^ ¼ max λ γ ^ ϱ ¼ −0:19728 , is associated with the conditional mean relationship, yt = β′Yt. Both are significantly negative, exponent, λ ^ ϱ bb λ ^ γ, reflecting implying stationarity and ergodicity of both series despite the presence of a unit root. In addition we note that λ that the non-unit root spread visually appears to be much more stable than the short rate. Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
12
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
(A) Realization 1: Levels
(B) Realization 1: Spread 4
Long rate Short rate
15
2 10
0
5
-2 -4
0
1985
1990
1995
2000
2005
1985
(C) Realization 2: Levels
1990
1995
2000
2005
(D) Realization 2: Spread
20 4
Long rate Short rate
15
2
10
0 -2
5
-4 0
1985
1990
1995
2000
2005
1985
(E) Realization 3: Levels 15
1995
2000
2005
(F) Realization 3: Spread
Long rate Short rate
4
2
10
0
5
0
1990
-2 1985
1990
1995
2000
2005
1985
1990
1995
2000
2005
Fig. 4. Three realizations of the process generated by Eqs. (18)–(19) with parameter values set equal to the estimated values in column (ii) of Table 1, and zt i.i.d. N(0, I2). The realizations clearly mimic the behavior of the observed interest rate series in Fig. 2.
Finally, to illustrate further the claimed dynamics of the estimated process, we show in Fig. 4 three realizations from a simulated process with parameter values set equal to the estimated values (as in the Monte Carlo simulation in Section 4.1). We simulate series, Yt = (Y1t, Y2t)′, and the corresponding spread, Y1t − Y2t. These series are (asymptotically) stationary and ergodic by construction, and it is quite clear that the processes have many similarities with the observed interest rates in Fig. 2: (i) There is a clear co-movements between levels; (ii) there is a clear (level dependent) heteroskedasticity; (iii) the series seem locally to behave as unit root processes, while the spread is much more stable; (iv) the series show periods of rapid increase or decrease of interest rates and show, occasionally, large jumps — corresponding to periods with rapid changes in the yield curve. 6. Extensions and concluding remarks To summarize, we propose to study the multivariate model given by Eq. (3), ΔYt = αβ′Yt − 1 + μ + Ω1/2 t zt, with a structure imposed on Ωt as in Eq.p(8) ffiffiffi or (9). We derive conditions for geometric ergodicity and establish that for the empirically applied bivariate model classic T -asymptotics hold. An immediate extension is to include more lags. One may consider the reparametrized p-dimensional vector autoregression of order m say with conditional heteroskedasticity,
0
ΔY t ¼ αβ Y t−1 þ
m−1 X
Γ i ΔY t−i þ ηt ;
1=2
η t ¼ Ωt z t
and t ¼ 1; …; T
ð22Þ
i¼1
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
13
with zt i.i.d. N(0, Ip) and α, β as before, while (Γi)i = 1,…,k − 1 are (p × p) matrices. The conditional covariance Ωt in the simplest case of the BEKK ARCH(m) is parametrized as a function of lagged levels Yt and differences ΔYt as, 2 m−1 2 X þ ϕδy;i ΔY t−i ; Ωt ¼ Ω þ ϕyy Y t−1
ð23Þ
i¼1
where Ω N 0, the parameters ϕyy, (ϕδy,i)1= 1,…,m − 1 are (p × p) matrices and the initial values Y0, ΔY0, …, ΔY−m + 1 are fixed in the statistical analysis of the model. The regularity condition in this case replacing Assumption 1 is given by replacing A(z) by −1 i Am(z) = (1 − z)Ip − αβ′z − ∑m i = 1 Γi(1 − z)z , see Johansen (1996). As to restricting further the model in line with the discussion of the Markovian case, while in principle straightforward in terms of parametrization, we refrain from this here as only the results on geometric ergodicity at this stage can be generalized. Note finally, that it seems likely that due to the complications of such further added parameters, one should apply bootstrapping rather than classic asymptotic inference, as also used in Cavaliere et al. (2010, 2012) for co-integration analysis under heteroskedasticity. Acknowledgement Funding from the Danish Council for Independent Research, Social Sciences, is gratefully acknowledged (Grant nos. 10-079774 and 12-124980). We would like to thank Søren Tolver Jensen for numerous discussions. We are very grateful to the guest editors Christian Conrad and Menelaos Karanasos, as well as two anonymous referees for helpful and encouraging comments on an earlier version of this paper. Also we are grateful to comments from Nour Meddahi, Stephane Gregoir, Søren Johansen and Shiqing Ling. The paper was presented at financial econometrics conferences in Copenhagen (HUKU), Heidelberg and Toulouse. Toulouse, as well as Hong Kong University of Science and Technology. Appendix A. Geometric ergodicity Throughout the appendix we use C and κ to denote generic positive constants. Proof of Theorem 1. The process Yt in Eq. (12) is a Markov chain, and the drift criterion from Markov chain theory, see Bec and Rahbek (2004), can be used to establish geometric ergodicity and stationarity of Yt. Observe that the process Yt in Eq. (12) can be rewritten as a random coefficient vector autoregression, Y t ¼ μ þ Q t Y t−1 þ ξt ; where ξt is i.i.d. N(0,Ω), with Q t := (Ip + αβ′ + ϵt) and where ϵt is a (p × p)-dimensional Gaussian i.i.d. mean zero and a covariance structure defined by E(ϵt ⊗ ϵt) = ∑hi = 1(Ai ⊗ Ai). Furthermore, ξt and ϵt are mutually independent. By Tjøstheim (1990) a drift function dy(⋅) can be applied to the k-step process, Ykt, rather than Yt itself, where Y kt ¼ Q ðkt;kÞ Y kðt−1Þ þ
k−1 X
Q ðkt; jÞ ξtk− j þ μ ;
j¼0
using the notation Q(n,j) := QnQn − 1 ⋯ Qn − (j − 1) for any n-indexed square matrix Qn and j ≥ 1, while Q(n,0) := I. An identical recursion is also used in Ling (2007, equation (A.9)), with the exception of the extra term here due to the constant vector μ here. With drift function, dy(y) = 1 + ‖y‖δ, where δ N 0 is chosen appropriately below, it follows as in Ling (2007, (A.3)) that we can choose δ ∈ (0, 1) and k such that E‖Q(kt,k)‖δ b 1. This we use here to see that μ plays no role in the argument as we get, δ δ E dy ðY kt ÞjY kðt−1Þ ¼ y ≤ E Q ðkt;kÞ kyk þ C: Hence with M some constant, and for dy(y) N M, we have 2 3 δ δ δ E Q y þ C k k ð kt;k Þ u 6 7 E Q ðkt;kÞ kyk þ C ¼ 4 5dy ðyÞ b ρdy ðyÞ; δ 1 þ kyk
where ρ is some constant with |ρ| b 1. Thus Yt is stationary and geometrically ergodic, provided q 1 E log ∏ Qt b 0: q→∞ q t¼1
γ :¼ lim
■
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
14
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Proof of Theorem 2. As in the proof of Theorem 1, observe that the process Yt in Eq. (15) is a Markov chain, and the drift criterion can be used to establish geometric ergodicity and stationarity of Yt. We do so in three steps: First, we define Zt as Yt appropriately rotated, and rewrite Zt as a random coefficient autoregression. Second, as before using Tjøstheim (1990), the k-step process, Zkt rather than Zt itself is used for the inspection of the drift criterion. Third, for the drift criterion a new drift function is applied to Zkt which exploits the multivariate structure of Zt, and hence Yt. We set μ = 0 without loss of generality as it plays no role in the derivations, see the proof of Theorem 1. = (β, α⊥)′Yt has the same transition density as the random It follows immediately that the Markov chain Zt = (y′,t x′)′ t coefficient process given by,
Z t ¼ Φt Z t−1 þ ϵt ;
where Φt ¼
ϱt 0
ξt : γt
ðA:1Þ
Here ϵt = (ϵy,t′, ϵx,t′)′ is i.i.d. Np(0, (β, α⊥)′Ω(β, α⊥)) and independent of the i.i.d. Gaussian (p × p) matrix sequence Φt. The ϱt (r × r), ξt (r × p − r) and γt (p − r × p − r) are all independent and i.i.d. Gaussian, with ϱt and γt defined in Assumption 2, while Eγt = 0 and Eðγ t ⊗ γt Þ ¼ Ip−r ⊗ Ip−r þ ϕα ⊥ α ⊥ ⊗ ϕα ⊥ α ⊥ . Now, Zt satisfies the regularity conditions such that the drift criterion can be applied and we apply a new drift function to the k-step process, Zkt. The new drift-function d(⋅) is given by, δ
δ
dðzÞ ¼ 1 þ kyk þ ckxk ; where δ and c are constants chosen appropriately below. More precisely, we show that with δ,c and M appropriately chosen constants, then for d(z) N M, 0 0 0 EðdðZ kt ÞjZ kðt−1Þ ¼ z ¼ y ; x Þ b ϕdðzÞ; where ϕ b 1, while for d(z) ≤ M the conditional expectation is bounded. Now from the definition of Zt we have as in the proof of Theorem 1, Z kt ¼ Φðkt;kÞ Z kðt−1Þ þ
k−1 X
Φðkt; jÞ ϵtk− j :
j¼0
Next, by definition of Φt in Eq. (A.1), for j ≥ 1, ! j ðtk; jÞ X ðt; jÞ ϱðtk; jÞ ξ ; ξ ¼ ϱðt;m−1Þ ξt−ðm−1Þ γðt−m; j−mÞ Φðkt; jÞ ¼ 0 γðtk; jÞ m¼1 while Φ(t,0) = Ip, implies ξ(t,0) = 0 (r × (p − r)). We thus find, with z = (y′, x′), δ δ EðdðZ kt ÞjZ kðt−1Þ ¼ zÞ ¼ 1 þ E kykt k jZ kðt−1Þ ¼ z þ cE kxkt k jZ kðt−1Þ ¼ z ; where, δ k−1 X δ ðtk;kÞ ðtk; jÞ xþ ϱðkt; jÞ ϵy;tk− j þ ξ ϵx;tk− j E kykt k jZ kðt−1Þ ¼ z ¼ E ϱðtk;kÞ y þ ξ j¼0 δ ðtk;kÞ δ δ δ ≤ E ϱðtk;kÞ kyk þ E ξ kxk þ C δ δ k−1 X δ δ E kxkt k jZ kðt−1Þ ¼ z ¼ E γ ðtk;kÞ x þ γðtk; jÞ ϵx;tk− j ≤ E γ ðtk;kÞ kxk þ C: j¼0 Collecting terms, we thus find
EðdðZ kt ÞjZ kðt−1Þ
3 2 δ δ ðtk;kÞ δ δ δ k x k κ þ E ϱ k y k þ E ξ þ cE γ ð tk;k Þ ð tk;k Þ 7 6 7dðzÞ: ¼ zÞ≤ 6 5 4 δ δ 1 þ ckxk þ kyk
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
15
Since Eq. (17) is assumed to hold, then as established below, one can choose some small positive δ, 0 b δ b 1, and k large enough such that, E‖ϱ(tk,k)‖δ b 1 and E‖γ(tk,k)‖δ b 1. This again means that, E‖ξ(tk,k)‖δ + cE‖γ(tk,k)‖δ b c, provided c is chosen such that, ðtk;kÞ δ E ξ cN δ N 0: 1−E γ ðtk;kÞ
ðA:2Þ
Hence for some M N 0, d(z) N M, then E(d(Zkt)|Zk(t − 1) = z) b ϕd(z), for some ϕ b 1. It is simple to see that E(d(Zkt)|Zk(t − 1) =z) is bounded for d(z) ≤ M and the result hold asdesired. That is, Zkt and hence Zt are geometrically ergodic, see Tjøstheim (1990), and hence Yt is, since by definition Y t ¼ β⊥;α ⊥ ; α β Z t using Eq. (4). Finally, we need to establish that E‖ϱ(tk,k)‖δ b 1 and E‖γ(tk,k)‖δ b 1 for some large k and small δ, provided Eq. (17) holds. This follows by standard arguments as in Ling (2007, proof of (A.3)), from which it holds that for some δϱ b 1 and kϱ large enough, δϱ ■ E ϱðtkϱ ;kϱ Þ b 1. Likewise for γ(tk,k), and we can choose δ = min(δϱ, δγ) and k = max(kϱ, kγ). Proof of Corollary 1. With Yt in Eqs. (15)–(16) such that ϕβα⊥ ¼ 0, then β′Yt is a Markov chain, which have a random coefficient representation, β′Yt = ϱtβ′Yt − 1 + ξβ,t, ϱt = Ir + β′α + ϵϱt with ξβ,t independent of ϱβ,t, and ξβ,t i.i.d. mean zero Gaussian with variance β′ Ω β. Thus by e.g. Feigin and Tweedie (1985, Theorem 3), β′Yt is geometrically ergodic with finite second order moments as claimed. ■ Appendix B. Asymptotics
Proof of Theorem 3. The result follows by establishing regularity conditions in Lemma 1 from Jensen and Rahbek (2004) for the score, information and third order derivatives of the log-likelihood function respectively. Specifically, Lemma B.1 below establishes condition (A.1) of Jensen and Rahbek (2004, Lemma 1), Lemma B.2 condition (A.2) and finally Lemma B.3 condition (A.3). The derivations are notationally quite involved, and we start by defining some key variables and expressions. Also to simplify 0 0 0 θ where e θ ¼ θ12 ; θ011 ; θ022 , and we leave μ out of θ. Thus the parameters in θ, of dimension pθ = 8, are given by θ ¼ a; b; e 0 2 2 θ11 ¼ ω11 ; ϕββ ; ϕβα⊥
θ12 ¼ ω12 ;
0 2 and θ22 ¼ ω22 ; ϕα ⊥ α ⊥ :
ðB:1Þ
With yt(b) := β′Yt, define also the variables, 0 2 2 λ11;t ¼ 1; yt−1 ðbÞ; Y 2t−1
λ12;t ¼ 1;
0 2 and λ22;t ¼ 1; Y 2t−1 :
ðB:2Þ
It will be useful to suppress the dependence on the parameter θ sometimes. We use in particular yt − 1 := yt − 1(b0) and Ωt :¼ Ωt;θ0 , that is omit θ when quantities are evaluated at θ = θ0. When the distinction between θ and θ0 is important, as for example when providing the uniform bounds for the third derivatives in Lemma B.3, we emphasize this in the arguments. For any (matrix) function of θ, f(θ), df(θ, dθ) denotes the differential of f in the direction dθ. To save space we let for example f˙b denote df(θ, db), that is, the differential in the direction db. We repeatedly use the definition of Ωt,θ and its inverse. Recall that by definition Ωt,θ is given by,
Ωt;θ ¼
ωt;11 ωt;21
ωt;12 ωt;22
h i 2 2 2 2 2 2 ¼ Ω þ S11 ϕββ yt−1 ðbÞ þ ϕβα⊥ Y 2t−1 þ Sbb ϕα⊥ α⊥ Y 2t−1 ;
ðB:3Þ
where we have introduced (two of) the following selection matrices,
S11 ¼
1 0
0 0 ; S12 ¼ 0 1
1 0 ; S22 ¼ 0 0
0 1
and Sbb ¼
2 −b : b −b 1
ðB:4Þ
Introduce next the notation for the inverse, −1
Ωt;θ ¼
t
ω11 t ω21
t
ω12 t ω22
!
¼ Ωt jΩt j
−1
where Ωt ¼
ωt;11 −ω12
−ω21 ; ωt;22
ðB:5Þ
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
16
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
t such that (Ω−1 t,θ )ij = ωij and (Ωt,θ)ij = ωt,ij for i, j = 1,2. Moreover, the determinant of Ωt,θ we write as,
2 2 4 2 2 δðt;θÞ :¼ Ωt;θ ¼ δ1 þ δ2 yt−1 ðbÞ þ δ3 Y 2t−1 þ δ4 Y 2t−1 þ δ5 yt−1 ðbÞY 2t−1
ðB:6Þ
with δ1 ¼ ω11 ω22 −ω212 ; δ2 ¼ ω22 ϕ2ββ ; δ3 ¼ ω11 ϕ2α⊥ α ⊥ þ ω22 ϕ2βα⊥ ; δ4 ¼ ϕ2βα ⊥ ϕ2α ⊥ α⊥ ; δ5 ¼ ϕ2ββ ϕ2α ⊥ α⊥ . Note that by assumption δ(t) ≥ δ1 ≥ ψ N 0. B.1. Score For the score we have the following result: Lemma B.1. The first order differential of the log-likelihood function is given by,
dlt ðθ; dθÞ ¼
o n o 1 n −1 0 −1 −1 tr Ωt;θ ηt;θ ηt;θ −I Ωt;θ dΩt;θ −tr Ωt ηt;θ dηt;θ ; 2
ðB:7Þ
0 0 with θ ¼ a; b; e θ . Under the assumptions of Theorem 3 it holds that the score is asymptotically Gaussian distributed, 1 pffiffiffi ∂LðθÞ=∂θjθ¼θ0 D N ð0; Σθθ Þ → T
ðB:8Þ
as T→∞:
Proof of Lemma B.1. The log-likelihood function is given in Eq. (21), such that with Ω˙ t;θ denoting the differential of Ωt,θ in the direction dθ, and similarly for η˙ t;θ , where ηt,θ := ΔYt − αβ′Yt − 1 = ΔYt − (a, 0)′yt − 1(b), dLðθ; dθÞ ¼
T X
dlt ðθ; dθÞ ¼ −
t¼1
T h n n n o 1X −1 −1 −1 0 −1 0 tr Ωt;θ Ω] t;θ g−tr Ωt;θ Ω] t;θ Ωt;θ ηt;θ ηt;θ g þ 2tr Ωt;θ ηt;θ η˙ t;θ : 2 t¼1
ðB:9Þ
] ¼ 0; η ] ¼ −y ðbÞð1; 0Þ0 da and hence using Eq. (B.9), Part S1: ∂lt(θ)/∂a: Standard calculus gives, Ω t;a t;a t−1 n o −1 ∂lt ðθÞ=∂ajθ¼θ0 ¼ tr Ωt;θ ηt;θ yt−1 ðbÞð1; 0Þ j
θ¼θ0
n o −1=2 ¼ tr Ωt zt ð1; 0Þ yt−1 ;
ðB:10Þ
which is a martingale difference (MGD) sequence with respect to the filtration F t generated by (ηt − s)s = 0,…,t − 1 and Y0. We find directly, E
h
2
2
t
2
∂lt ðθÞ=∂ajθ¼θ0 j F t−1 Þ ¼ yt−1 ω11 ¼ yt−1 ωt;22 =δðtÞ :
ðB:11Þ
2 Now using Eq. (B.6) we can conclude E ∂lt ðθÞ=∂ajθ¼θ0 b ∞ since,
E
h
2 2 2 i2 yt−1 ω22 þ ϕα⊥ α⊥ Y 2t−1 ≤ C: ∂lt ðθÞ=∂ajθ¼θ0 F t−1 ≤ 2 δ2 yt−1 þ δ3 Y 22t−1 þ δ4 Y 42t−1 þ δ5 y2t−1 Y 22t−1
ðB:12Þ
Part S2: ∂lt ðθÞ=∂e θ : Observe that η˙ t;θ˜ ¼ 0, and hence by Eq. (B.9), θ j dlt θ; de
¼
θ¼θ0
o 1 n −1=2 0 −1=2 zt zt −I Ωt tr Ωt Ω˙ t;θ˜ : 2
ðB:13Þ
As before this is a MGD, and using the identity in Bec et al. (2008, equation (48)), we find initially,
E
θ j dlt θ; de
θ¼θ0
2 h i2 1 −1 F ˙ ¼ : tr Ω Ω t t−1 t;θ˜ 2
ðB:14Þ
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
17
0 We consider in turn each term in e θ ¼ θ12 ; θ011 ; θ022 : Part S2.1: ∂lt(θ)/∂θ12: Using Eq. (B.14) and that by definition of Ωt,θ, Ω˙ t;θ12 ¼ S12 dθ12 where S12dθ12 is defined in Eq. (B.4): 2E
h
∂lt ðθÞ=∂θ12 jθ¼θ0
h i2 i2 n 2 o 2 −1 ¼ tr Ωt S12 =δðtÞ F t−1 ¼ tr Ωt S12 2 2 2 ¼ δðt Þ þ 2ω12 =δðt Þ ≤ 1 þ 2ω12 =ψ≤C;
ðB:15Þ
where we have used the definition of Ω−1 t , see Eq. (B.5), and δ(t) ≥ δ1 ≥ ψ N 0. Part S2.2: ∂lt(θ)/∂θ11: Using Eq. (B.13), we find ð∂lt ðθÞ=∂θ11 Þjθ¼θ0 ¼
o 1 n 0 −1=2 −1=2 S11 Ωt tr zt zt −I Ωt λ11;t ; 2
ðB:16Þ
and, using Eq. (B.14), 2E
h i n 2 o 0 0 2 ∂lt ðθÞ=∂θyy ∂lt ðθÞ=∂θyy F t−1 ¼ tr Ωt S11 λ11;t λ11;t =δðt Þ 0
2
2
¼ ωt;22 λ11;t λ11;t =δðt Þ : h i Therefore E ∂lt ðθÞ=∂θyy ∂lt ðθÞ=∂θ0yy F t−1 ≤C, using in particular that, ωt;22 =δðt Þ ≤ω22 =δ1 þ 1=ω11 ;
2
2
ωt;22 yt−1 =δðt Þ ≤2=ϕββ
2
2
and ωt;22 Y 2t−1 =δðt Þ ≤2=ϕβα ⊥ :
ðB:17Þ
Part S2.3: ∂lt(θ)/∂θ22: Using Eq. (B.13), ð∂lt ðθÞ=∂θ22 Þjθ¼θ0 ¼
o 1 n 0 −1=2 −1=2 λ22;t : S22 Ωt tr zt zt −I Ωt 2
ðB:18Þ
As before, using Eq. (B.14), E
0 2 2 0 2 ð∂lt ðθÞ=∂θ22 Þ ∂lt ðθÞ=∂θ22 F t−1 ¼ ωt;11 λ22;t λ22;t =δðt Þ
and hence, 0 2 E ð∂lt ðθÞ=∂θ22 Þ ∂lt ðθÞ=∂θ22 F t−1 ≤C
ðB:19Þ
using, similar to before, ωt;11 =δðt Þ ≤ω11 =δ1 þ 2=ω22
2
2
and ωt;11 Y 2t−1 =δðt Þ ≤3=ϕα ⊥ α⊥ :
ðB:20Þ
Part S3: ∂lt(θ)/∂b term: By definition η˙t;b ¼ −ða; 0Þ0 Y 2t−1 db and 0 2 2 2 2 bϕα⊥ α ⊥ Y 2t−1 þ ϕββ yt−1 ðbÞY 2t−1 ] @ Ω t;b ¼ 2 2 −ϕα⊥ α ⊥ Y 2t−1
2
2
−ϕα⊥ α ⊥ Y 2t−1
1 Adb:
0
ðB:21Þ
Thus, dlt ðθ; dbÞjθ¼θ0 ¼
o n oi 1 h n 0 −1=2 −1=2 −1=2 0 −2tr Ωt zt η˙ t;b ; tr zt zt −I Ωt Ω˙ t;b Ωt 2
ðB:22Þ
is a Martingale difference sequence, and we find
E
h
dlt ðθ; dbÞjθ¼θ0
h i2 i2 1 −1 0 −1 þ η˙ t;b Ωt η˙ t;b : F t−1 ¼ tr Ωt Ω˙ t;b 2
ðB:23Þ
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
18
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Next, using previously applied bounds, one immediately finds 0 −1 2 2 η˙ t;b Ωt η˙ t;b ¼ a Y 2t−1 ωt;22 =δðt Þ ≤C:
With respect to tr
h
˙ Ω−1 t Ωt;b
i2
¼ tr
ðB:24Þ
n 2 o 2 Ωt Ω˙ t;b =δðt Þ, quite lengthy and tedious calculations show that, with ci functions of the
(true) parameters θ0, h i2 2 4 4 ¼ c1 þ c2 Y 2t−1 þ c3 Y 2t−1 Y 2t−1 tr Ωt Ω˙ t;b 2 2 2 4 þ Y 2t−1 yt−1 ðbÞ c4 þ c5 Y 2t−1 þ c6 Y 2t−1 3 2 4 þ Y 2t−1 yt−1 ðbÞ c7 þ c8 Y 2t−1 þ c9 Y 2t−1 : Using the expression for δ(t) in Eq. (B.6), tr 7 of high power, c9Y2t − 1yt − 1, for which, . 2 7 c9 Y 2t−1 yt−1 δðt Þ ≤
n 2 o 2 =δðtÞ ≤C at θ = θ0. As an example consider a cross-product term with Y2t − 1 Ωt Ω˙ t;b
7 c9 Y 2t−1 yt−1 δ21 þ δ24 Y 82t−1 þ 2δ4 δ5 y2t−1 Y 62t−1
≤C;
using δi N 0. Part S4: Cross-terms: Collecting terms, and using the inequalities in Eqs. (B.20) and (B.17), we conclude that 0 E ð∂lt ðθÞ=∂θ22 Þ ∂lt ðθÞ=∂θ11 jθ¼θ F t−1 ≤C: 0
ðB:25Þ
We have here used Eqs. (B.16) and (B.18), as well as the identity Bec et al. (2008, equation (48)) to see that 1 2 1 2 0 t 0 0 2 E ð∂lt ðθÞ=∂θ22 Þ ∂lt ðθÞ=∂θ11 jθ¼θ jF t−1 ¼ ω12 λ22;t λ11;t ¼ ω12 λ22;t λ11;t =δðtÞ : 0 2 2 Next, we find h i 1 t t E ð∂lt ðθÞ=∂θ22 Þð∂lt ðθÞ=∂θ12 Þjθ¼θ0 F t−1 ¼ 2 ω12 ω22 λ22;t λ12;t ¼ 1 ω ω λ δ2 ≤C; and 2 12 t;11 22;t ðt Þ h i 1 t t E ð∂lt ðθÞ=∂θ11 Þð∂lt ðθÞ=∂θ12 Þjθ¼θ0 F t−1 ¼ 2 ω12 ω11 λ11;t λ22;t ¼ 1 ω ω λ =δ2 ≤C: 2 12 t;22 11;t ðtÞ
Observe furthermore, by definition, E Eqs. (B.10) and (B.22), to see that
θ ð∂lt ðθÞ=∂aÞj ∂lt ðθÞ=∂e
θ¼θ0
jF t−1
¼ 0. Next consider the directions da and db, and use
c1 þ c2 Y 22t−1 jyt−1 jjY 2t−1 j ≤ C; E ð∂lt ðθÞ=∂aÞð∂lt ðθÞ=∂bÞjθ¼θ0 jF t−1 ≤ δðt Þ jF t−1 ≤ C, as θ¼θ0
e with c1, c2 functions of θ0. Finally, also E ∂lt ðθÞ=∂θ ð∂lt ðθÞ=∂bÞj
E dlt θ; e θ dlt ðθ; dbÞj
θ¼θ0
jF t−1
¼
1 tr 2
h i 2 δðt Þ ≤C; Ωt Ω˙ e Ωt Ω˙ t;b t;θ
θ parameters (in Part computing the trace of the product, and using the bounds implied by δ2(t), as were applied for each case of the e 2) and for b (in Part 3). 0 0 Part S5: Application of CLT: With θ ¼ a; b; e θ , we have shown that ∂L(θ)/∂θ is a martingale difference (MGD) sequence with respect to the filtration F t generated by (ηt − s)s = 0,…,t − 1 and Y0. Moreover, as E ð∂lt ðθÞ=∂θÞ ∂lt ðθÞ=∂θ0 jθ¼θ jF t−1 b C, and 0
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
19
by geometric ergodicity, the regularity conditions of Brown (1971) apply, by the law of large numbers in Jensen and Rahbek (2007). Hence, p1ffiffiT ∂LðθÞ=∂θjθ¼θ0 D N ð0; Σθθ Þ, as claimed. →
■
B.2. Information Lemma B.2. For the observed information it holds under the assumptions of Theorem 3 that, as T → ∞, 1 2 0 P Σθθ : − ∂ LðθÞ=∂θ∂θ j T θ¼θ0 →
ðB:26Þ
Proof of Lemma B.2. We show below all terms in d2lt(θ, dθ, dθ) are bounded and hence that the law of large numbers in Jensen and Rahbek (2007) can be applied. Part I1: ∂2lt(θ)/∂a2: From the proof of Lemma B.1, ∂lt(θ)/∂a = tr{Ω−1 t,θ ηt,θ(yt − 1(b), 0)}, and hence n o 2 2 −1 2 ∂ lt ðθÞ=∂a ¼ −tr Ωt;θ S11 yt−1 ðbÞ;
ðB:27Þ
2 with E ∂ lt ðθÞ=∂a2 jθ¼θ0 ¼ E ωtyy y2t−1 ≤C by Eq. (B.12). Thus indeed, geometric ergodicity implies that the law of large numbers 1 2 in Jensen and Rahbek (2007) applies to − ∂ LðθÞ=∂a2 j . T θ¼θ0 0 2 −1 0 ˙ θ∂e θ : With e θ ≠e θ, then as dlt θ; de θ ¼ 12 tr Ω−1 η η −I Ω Part I2: ∂ lt ðθÞ=∂e Ω t;θ t;θ t;θ t;θ e , we get, t;θ
2 −1 −1 −1 0 −2d lt θ; de θ; de θ ¼ tr Ωt Ω˙ e Ωt Ω˙ e 2Ωt;θ ηt;θ ηt;θ −I ; t;θ t;θ
ðB:28Þ
:: θ; de θ ¼ 0. We have noting that the second order differential Ω ee :¼ dΩt;θ de t;θ;θ o 1 nh −1 ih −1 i −1 2 0 0 0 ∂ lt ðθÞ=∂θij ∂θkl ¼ − tr Ωt;θ Sij Ωt;θ Skl 2Ωt;θ ηt;θ ηt;θ −I λij;t λkl;t 2
ðB:29Þ
1=2
for i, j, k, l = 1, 2. Using ηt ¼ ηt;θ0 ¼ Ωt zt and |tr{AB}| ≤ ‖A‖‖B‖, the second derivatives at θ0 are bounded by, h −1 −1 2 κ kzt k þ 1 Ωt Sij Ωt Skl λij;t λkl;t ;
‖ ‖
where κ is some constant and hence the term will have finite expectation provided −1 −1 Ωt Sij Ωt Skl λij;t λkl;t ≤C;
‖ ‖
ðB:30Þ
as E‖zt‖2 b ∞. As Ω−1 = |Ωt|−1Ω∗t , t −1 Ωt Sij λij;t ¼ Ωt Sij λij;t =δðt Þ ¼: ρt ðijÞ:
Now if i = 1, j = 2, 2 2 2 2 2 ρt ð12Þ≤ ωt;22 þ ωt;11 þ 2ω12 =δðt Þ ≤ C; using repeatedly the inequalities in Eqs. (B.20) and (B.17). Likewise, 2 2 2 2 2 ρt ð22Þ ≤ κ ωt;11 þ ω12 1 þ Y 2t−1 =δðt Þ ≤ C; and 2 2 2 2 2 2 ρt ð11Þ ≤ κ ωt;22 þ ω12 1 þ Y 2t−1 þ yt−1 =δðt Þ ≤ C: Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
20
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Thus all terms in Eq. (B.30) are finite as desired. Part I3: ∂2lt(θ)/∂b2: From the proof of Lemma B.1 (Part S3), n o n o −1 −1 0 −1 0 þ 2tr Ωt;θ ηt;θ η˙ t;b ; −2dlt ðθ; dbÞ ¼ tr Ωt;θ Ω˙ t;b I−Ωt;θ ηt;θ ηt;θ
ðB:31Þ
:: :: and we find with Ω t;b;b ¼ dΩt;θ ðdb; db Þ; η t;b;b :¼ dηt;θ ðdb; db Þ ¼ 0, 2 −2dn lt θ; db; db o nh i h io −1 : −1 0 −1 −1 −1 0 ¼ tr Ωt;θ Ω_ t;b;b I−Ωt;θ ηt;θ ηt;θ þ2 tr Ωt;θ Ω˙ t;b Ωt;θ Ω˙ t;b Ωt;θ ηt;θ ηt;θ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðb1Þ n hðaÞ io n o n o −1 −1 −1 0 −1 −1 0 − tr Ωt;θ Ω˙ t;b Ωt;θ Ω˙ t;b þ2 tr Ωt;θ η˙ t;b η˙ t;b −4 tr Ωt;θ Ω˙ t;b Ωt;θ ηt;θ η˙ t;b : |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ð cÞ
ðb2Þ
ðdÞ
Consider first (a), which at θ = θ0 up to constants equals, n o −1=2 −1=2 0 2 2 2 S11 Ωt I−zt zt Y 2t−1 2 ϕα⊥ α ⊥ þ ϕββ dbdb : tr Ωt
ðB:32Þ
As in Part I2, using |tr{AB}| ≤ ‖A‖ ‖B‖ and with κ some constant, this is bounded by, 2 2 κ 1 þ kzt k Y 2t−1 Ωt S11 =δðt Þ ;
ðB:33Þ
which has finite expectation as E‖zt‖2 b ∞ and 2 2 2 Y 2t−1 Ωt S11 =δðt Þ ≤κY 2t−1 1 þ Y 2t−1 =δðt Þ ≤C: Likewise the terms in (b1) and (b2) have finite expectations as Ωt Ω˙ t;b k=δðt Þ ≤C, see Part S3 above in the proof of Lemma B.1. The term in (c) has finite expectation as it is bounded by κY 22t−1 ωt;22 =δðt Þ ≤C. Finally, the absolute value of (d) is bounded by κ‖zt‖ −1=2 as Ωt ≤C and as just applied, Ωt Ω˙ t;b k=δ ðt Þ ≤C. 2 −1 1 0 e e ˙ Part I4: ∂ lt ðθÞ=∂θ∂a: As dlt θ; dθ ¼ 2 tr Ω−1 , then t;θ ηt;θ ηt;θ −I Ωt;θ Ωt;e θ 2 −1 2 −1 d lt θ; de θ; da ¼ tr Ωt;θ ηt;θ yt−1 ðbÞ; 0 Ωt;θ Ω˙ e da t;θ and at θ = θ0, n o 2 −1=2 2 −1 zt yt−1 ; 0 Ωt Sij λij;t ∂ lt ðθÞ=∂θij ∂ajθ¼θ ¼ tr Ωt 0
ðB:34Þ
for i, j = 1, 2, which is bounded (and moreover all terms have expectation zero). Part I5: ∂2lt(θ)/∂a∂b: Now dlt(θ, da) = tr{Ω−1 t,θ ηt,θ(1, 0)}yt − 1(b)da, such that n o n o 2 −1 −1 d lt ðθ; da; dbÞ ¼ tr Ωt;θ ηt;θ ð1; 0Þ Y 2t−1 dadb þ tr Ωt;θ η˙ t;b ð1; 0Þ yt−1 ðbÞ da |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðbÞ n ðaÞ o −1 −1 þ tr Ωt;θ Ω˙ t;b Ωt;θ ηt;θ ð1; 0Þ yt−1 ðbÞ da |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ð cÞ
As above, the absolute value of (a) squared is bounded by κ‖zt‖2Y22t − 1(1 + Y22t − 1)/δ(t) and hence, as Y 22t−1 1 þ Y 22t−1 =δðt Þ ≤C, has finite expectation. Likewise, (b) is bounded by κ jyt−1 jjY 2t−1 j 1 þ Y 22t−1 =δðt Þ ≤C. And finally, (c) is bounded by κ‖zt‖ times, 1=2 2 2 2 4 3 c1 þ c2 yt−1 þ c3 Y 2t−1 c4 Y 2t−1 þ c5 Y 2t−1 þ c6 jyt−1 Y 2t−1 j þ c7 yt−1 Y 2t−1 −1=2 −1 ˙ jyt−1 j ≤C; ð1; 0ÞΩt;θ Ωt;b jyt−1 j ≤ Ωt 3=2 δðt Þ with ci constants, cf. the evaluations applied in Part S3 in the proof of Lemma B.1 and the definition of δ(t) in Eq. (B.6). Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
21
:: 2 2 2 Part I6: ∂ lt ðθÞ=∂e θ∂b: Observe that with Ω e :¼ d Ωt;θ de θ; db ; −2d lt θ; de θ; db is given by, t;θ;b −1 :: −1 0 −1 −1 0 −2 tr Ωt;θ Ω˙ e Ωt;θ ηt;θ η˙ t;b þ tr Ωt;θ Ω e I−Ωt;θ ηt;θ ηt;θ t;θ;b t;θ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðaÞ ðbÞ −1 −1 −1 0 : tr Ωt;θ Ω˙ e Ωt;θ Ω˙ t;b 2Ωt;θ ηt;θ ηt;θ −I t;θ |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ð cÞ
:: :: θ ¼ ϕ211 ; ϕ222. We find Next, at θ = θ0, (a) is bounded by κ 1 þ kzt k2 Ω−1 , where Ω e ¼ 0 apart from the cases where e t Ωt;e θ;b t;θ;b −1 :: −1 2 Ωt Ωt;ϕ2 ;b ≤κ Ωt S11 jyt−1 Y 2t−1 j≤κ jyt−1 Y 2t−1 j 1 þ Y 2t−1 =δðt Þ ≤C; 22
:: 2 2 and likewise, Ω−1 t Ω t;ϕ222 ;b ≤κY 2t−1 1 þ Y 2t−1 =δðt Þ ≤C. Similarly, we get for the term in (c) that its absolute value is bounded by ˙ ˙ κ(1 + ‖zt‖2) times Ω−1 and Ω−1 t Ωt;e t Ωt;b . The last two terms have been argued to be bounded by a constant in Part I2 and Part θ −1=2 2 2 ˙ I3 above respectively. Finally, the term in (b) is bounded by κ‖zt‖ using again Ω−1 ■ Y 2t−1 ≤C. t Ω e ≤C and that Ωt t;θ
B.3. Third derivatives The third derivatives are considered with θ varying in a compact neighborhood of the true value θ0, θ∈Kðθ0 Þ. Thus we let, U2 L2 L U L L ϕ2ij ∈ [ϕL2 ij , ϕij ] for i, j = α⊥, β and with ϕij N 0. Moreover, ωij ∈ [ωij, ωij ] for i, j = 1, 2, ω11, ω22 N 0 and with |Ω| = ω11ω22 − ω212 ≥ ψ N 0. Finally, a ∈ [aL, aU] and b ∈ [bL, bU]. Lemma B.3. With KðÞ just defined, and under the assumptions of Theorem 3, it holds that 1 3 sup ∂ LðθÞ=∂θi ∂θ j ∂θk ≤V T θ∈Kðθ0 Þ T 0 0 where V T P→Vb∞ and the indices i, j and k applied to θ refer to individual entries in θ, where θ ¼ a; b; e θ ;e θ ¼ θ12 ; θ011 ; θ022 . Proof of Lemma B.3. Observe first that trivially, ∂3lt(θ)/∂a3 = 0. Next, consider the parameter entry combinations in turn, observing that by definition of the neighborhood Kðθ0 Þ, h i h i L L 2 2 L L 2 2 δðt;θÞ ≥ψ þ δ2 þ δ5 Y 2t−1 yt−1 ðbÞ þ δ3 þ δ4 Y 2t−1 Y 2t−1 ;
ðB:35Þ
L L2 L2 L L2 L2 L L2 L2 L L with δL2 ¼ ωL22 ϕL2 ββ N0; δ3 ¼ ω11 ϕα ⊥ α ⊥ þ ω22 ϕβα ⊥ N0; δ4 ¼ ϕβα ⊥ ϕα ⊥ α ⊥ N0; δ5 ¼ ϕββ ϕα ⊥ α ⊥ N0. 2 Part TD1: ∂3lt(θ)/∂θij∂a2 for i, j = 1, 2. From Eq. (B.27), ∂2lt(θ)/∂a2 = − tr{Ω−1 t,θ S11}yt − 1(b) such that, n o 3 2 −1=2 −1 −1=2 2 yt−1 ðbÞλij;t : ∂ lt ðθÞ=∂a ∂θij ¼ tr Ωt;θ Sij Ωt;θ S11 Ωt;θ
and we find, 3 −1=2 −1=2 2 −1=2 −1=2 2 ∂ lt ðθÞ=∂θij ∂a ≤C Ωt;θ Sij Ωt;θ ‖λij;t k Ωt;θ S11 Ωt;θ yt−1 ðbÞ; which are uniformly bounded for i, j = 1, 2. Recall that λ11,t = (1, y2t − 1(b), Y22t − 1)′ such that evaluation of the second 1/2 1/2 S11Ω− ‖y2t − 1(b) is included by evaluating for i, j = 1, term, ‖Ω− t,θ t,θ ffi sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi h i2 −1 λ ðθÞ tr Ωt;θ S11 11;t 2 ωU22 þ ϕU2 1 þ Y 22t−1 þ y21t−1 ðbÞ α ⊥ α ⊥ Y 2t−1
L 2
¼ ωt;22 ðθÞ λ11;t =δðt;θÞ ≤ L 2 bÞ þ δL3 þ δL4 Y 22t−1 Y 22t−1 t−1 ð ψ þ δ2 þ δ5 Y 2t−1 y 2 U 3 U2 2 U U2 2 U U2 2 ω22 þ ϕα ⊥ α ⊥ Y 2t−1 ω22 þ ϕα ⊥ α ⊥ Y 2t−1 ω22 þ ϕα⊥ α ⊥ Y 2t−1 4 5 ≤C: ≤ þ þ ψ þ δL3 Y 22t−1 δL3 þ δL4 Y 22t−1 δL2 þ δL5 Y 22t−1 −1=2 −1=2 Ωt;θ S11 Ωt;θ λ11;t ¼
ðB:36Þ
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
22
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Next, for i, j = 2, −1=2 −1=2 Ωt;θ S22 Ωt;θ λ22;t ¼ ωt;11 ðθÞ λ22;t =δðt;θÞ 2 U2 2 ωU11 þ ϕU2 1 þ Y 22t−1 ββ yt−1 ðbÞ þ ϕβα ⊥ Y 2t−1
L 2
L 2 ≤ L 2 L 2 ψ þ δ2 þ δ5 Y 2t−1 yt−1ðbÞ þ δ3 þ δ4 Y 2t−1 Y 2t−1 2 2 2 ωU11 þ ϕU2 ωU11 þ ϕU2 ϕU2 βα ⊥ Y 2t−1 βα ⊥ Y 2t−1 ββ 1 þ Y 2t−1 ≤ þ þ L ≤C: ψ þ δL3 Y 22t−1 δL3 þ δL4 Y 22t−1 δ2 þ δL5 Y 22t−1
ðB:37Þ
And finally, −1=2 −1=2 Ωt;θ S12 Ωt;θ ≤κ
ω ðθÞ þ ω ðθÞ þ 2 ω ðθÞ t;22 t;12
L t;11
≤C: ψ þ δ2 þ δL5 Y 22t−1 y2t−1 ðbÞ þ δL3 þ δL4 Y 22t−1 Y 22t−1
Part TD2: ∂3lt(θ)/∂θij∂θkl′∂a for i, j, k, l = 1, 2: Using Eq. (B.29), n o 3 0 −1 −1 −1 0 ∂ lt ðθÞ=∂θij ∂θkl ∂a ¼ 2tr Ωt;θ Sij Ωt;θ Skl Ωt;θ ηt;θ ð1; 0Þ λij;t λkl;t yt−1 ðbÞ; for i, j, k, l = 1, 2. Applying the uniform bounds in Eqs. (B.36) and (B.37), 3 0 ∂ lt ðθÞ=∂θij ∂θkl ∂a −1=2 −1=2 −1=2 −1=2 −1=2 −1=2 ≤C Ωt;θ Sij Ωt;θ ‖λij;t k Ωt;θ Skl Ωt;θ λkl;t Ωt;θ ηt;θ ð1; 0ÞΩt;θ jyt−1 ðbÞj −1=2 −1=2 ≤C Ωt;θ ηt;θ ð1; 0ÞΩt;θ jyt−1 ðbÞj: Next, rewrite ηt,θ as 0 1=2 ηt;θ ¼ ηt;θ −ηt þ ηt ¼ ½ða−a0 Þyt−1 ðbÞ þ a0 ðb0 −bÞY 2t−1 ð1; 0Þ þ Ωt zt :
ðB:38Þ
Using this, h i −1=2 −1=2 1=2 −1=2 Ωt;θ ηt;θ ≤κ ð1; 0ÞΩt;θ ðjyt−1 ðbÞj þ jY 2t−1 jÞ þ Ωt;θ Ωt kzt k 2 3 1=2 U U2 2 ω þ ϕ Y ð j y ð b Þ j þ j Y j Þ 22 α 2t−1 α t−1 2t−1 ⊥ ⊥ 6 7 ≤κ 4 h i h i 1=2 þ kzt k5 ≤C½1 þ kzt k; L L 2 2 L L 2 2 ψ þ δ2 þ δ5 Y 2t−1 yt−1 ðbÞ þ δ3 þ δ4 Y 2t−1 Y 2t−1
ðB:39Þ
n o −1=2 1=2 2 where we have used that Ωt;θ Ωt ¼ tr Ω−1 t;θ Ωt ≤C. Moreover, U U2 2 2 ω þ ϕ Y 22 α 2t−1 α ⊥ ⊥ −1=2 2 ≤C: ð1; 0ÞΩt;θ yt−1 ðbÞ≤ δL2 þ δL5 Y 22t−1
ðB:40Þ
T
Hence we can use V T ¼ T1 ∑t¼1 C½1 þ kzt k since zt has finite first order moment. Part TD3: d3lt(θ, dθij, dθkl, dθmn) for i, j, k, l, m, n = 1, 2: By Eq. (B.28), we find that the third order differential is bounded by, 3 −1=2 −1=2 −1=2 −1=2 d lt θ; dθij ; dθkl ; dθmn ≤C Ωt;θ Sij Ωt;θ ‖λij;t k Ωt;θ Skl Ωt;θ λkl;t h i −1=2 2 −1=2 −1=2 2 Ωt;θ Sij Ωt;θ λmn;t Ωt;θ ηt;θ ≤C 1 þ kzt k ;
using Eqs. (B.36), (B.37) and (B.39). 2 Part TD4: ∂3lt(θ)/∂a2∂b: Recall that by Eq. (B.27), ∂2lt(θ)/∂a2 = − tr{Ω−1 t,θ S11}yt − 1(b) such that, n o n o 3 −1 −1 2 −1 d lt ðθ; da; da; dbÞ ¼ tr Ωt;θ Ω˙ t;b Ωt;θ S11 yt−1 ðbÞ þ 2tr Ωt;θ S11 yt−1 ðbÞY 2t−1 :
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
2 2 ðωU þϕU2 −1=2 −1=2 α ⊥ α ⊥ Y 2t−1 Þyt−1 ðbÞ Now, Ωt;θ S11 Ωt;θ y2t−1 ðbÞ≤ ψþ δL þδL Y 22 ≤C, and likewise, uniformly in Kðθ0 Þ, ½ 2 5 22t−1 y2t−1 ðbÞþ½δL3 þδL4 Y 22t−1 Y 22t−1 −1=2 ˙ −1=2 Ωt;θ Ωt;b Ωt;θ ≤C;
23
ðB:41Þ
using arguments similar to Part S3. Finally, for the last term we have, n o −1 tr Ωt;θ S11 yt−1 ðbÞY 2t−1 ≤
2 ωU22 þ ϕU2 α ⊥ α ⊥ Y 2t−1 jyt−1 ðbÞjjY 2t−1 j
≤C: ψ þ δL2 þ δL5 Y 22t−1 y2t−1 ðbÞ þ δL3 þ δL4 Y 22t−1 Y 22t−1
Part TD5: ∂3lt(θ)/∂a∂b2: From Part I3, proof of Lemma B.2, and using that by definition of Ωt,θ, we have Ω˙ t;a ¼ 0, n o n o 3 −1 :: −1 0 −1 −1 0 −2d lt ðθ; db; db; daÞ ¼ −2 tr Ωt;θ Ωt;b;b Ωt;θ ηt;θ η˙ t;a −4 tr Ωt;θ Ω˙ t;b Ωt;θ η˙ t;a η˙ t;b |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðaÞ ðbÞ nh i h io n ::0 o −1 −1 −1 0 −1 þ4 tr Ωt;θ Ω˙ t;b Ωt;θ Ω˙ t;b Ωt;θ ηt;θ η˙ t;a þ4 tr Ωt;θ η˙ t;b ηt;b;a |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðcÞ ðdÞ n o :: −1 −1 0 −8 tr Ωt;θ Ω˙ t;b Ωt;θ ηt;θ ηt;b;a : |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} ðeÞ
−1=2 −1=2 −1=2 :: −1=2 Using |tr{AB}| ≤ ‖A‖ ‖B‖ repeatedly, ∂3lt(θ)/∂a∂b2 is uniformly bounded if ‖Ω−1/2 ηt,θ‖, Ωt;θ Ω˙ t;b Ωt;θ ; Ωt;θ Ωt;b;b Ωt;θ ; t,θ −1=2 ˙ −1=2 :: Ωt;θ η t;b and Ωt;θ ηt;a;b are. For the first two see Eqs. (B.41) and (B.39) respectively, and the remaining ones are bounded using similar arguments. Part TD6: ∂3lt(θ)/∂b3: From Part I3, proof of Lemma B.2, we find as in Part TD5, that |∂3lt(θ)/∂b3| is uniformly bounded if, as used −1=2 :: −1=2 −1=2 −1=2 −1=2 −1=2 in Part TD5, Ωt;θ Ω˙ t;b Ωt;θ ; Ωt;θ ηt;θ ; Ωt;θ Ωt;b;b Ωt;θ and Ωt;θ η˙ t;b are. −1=2 −1=2 Part TD7: ∂3lt(θ)/∂θij∂b2: In addition to the quantities in Part TD5 and Part TD6, is uniformly bounded if also Ωt;θ Ω˙ t;θij Ωt;θ ; −1=2 ::: −1=2 :: −1=2 −1=2 Ωt;θ Ω t;b;b;θij Ωt;θ and Ωt;θ Ωt;b;θij Ωt;θ are. The first term for θij = θ12, θ11 and θ22 respectively, is dealt with in Part TD1, ::: while for the last two similar arguments can be used. For example, with Ω t;b;b;θij ¼ 0 for θij = θ12, θ22, while h i2 −1=2 ::: −1=2 2 2L 2 2 −1 ≤C: Ωt;θ Ω t;b;b;θ11 Ωt;θ ≤κ ϕββ þ ϕβα ⊥ Y 2t−1 tr Ωt;θ S11 −1=2 −1=2 Part TD8: ∂3lt(θ)/∂θij∂θkl′∂b: Using Part I2, simple calculations give that this is uniformly bounded as Ωt;θ Ω˙ t;θij ;b Ωt;θ (see −1=2 :: −1=2 −1=2 −1=2 −1=2 −1=2 Part TD1), Ωt;θ Ωt;θij ;b Ωt;θ (see Part TD7) as well as Ωt;θ Ω˙ t;b Ωt;θ ; Ωt;θ ηt;θ and Ωt;θ η˙ t;b are (see Part TD5). −1=2 −1=2 Part TD9: ∂3lt(θ)/∂a∂b∂θij: Using Part I5, we again find this is uniformly bounded as Ωt;θ Ω˙ t;θij Ωt;θ (see Part TD1), −1=2 ˙ ηt,θ‖ (see Part TD5) are, together with Ωt;θ η t;b and ‖Ω−1/2 t,θ ωU22 þ ϕ2α⊥ α⊥ Y 22t−1 2 −1=2 2 2 2 2 Y 2t−1 þ yt−1 ðbÞ ≤C; ð1; 0ÞΩt;θ Y 2t−1 þ yt−1 ðbÞ ≤ δðt;θÞ using Eqs. (B.35), (B.39) and (B.40).
■
References Andrews, D., 1999. Estimation when a parameter is on a boundary. Econometrica 67, 1341–1383. Andrews, D., 2001. Testing when a parameter is on the boundary of the maintained hypothesis. Econometrica 69 (3), 683–734. Aue, A., Horváth, L., Steinebach, J., 2006. Estimation in random coefficient autoregressive models. J. Time Ser. Anal. 27, 61–76. Avarucci, M., Beutner, E., Zaffaroni, P., 2013. On moment conditions for quasi-maximum likelihood estimation of multivariate ARCH models. Econ. Theory 29 (3), 545–566. Bauwens, L., Laurent, S., Rombouts, J.V., 2006. Multivariate GARCH models: a survey. J. Appl. Econ. 21, 79–109. Bec, F., Rahbek, A., 2004. Vector equilibrium correction models with non-linear discontinuous adjustments. Econ. J. 7, 628–651. Bec, F., Rahbek, A., Shephard, N., 2008. The ACR model: a multivariate dynamic mixture autoregression. Oxf. Bull. Econ. Stat. 70 (5), 583–618. Berkes, I., Horváth, L., Ling, S., 2009. Estimation in nonstationary random coefficient autoregressive models. J. Time Ser. Anal. 30, 395–416. Borkovec, M., 2000. Extremal behaviour of the autoregressive process with ARCH(1) errors. Stoch. Process. Appl. 85, 189–207.
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008
24
H.B. Nielsen, A. Rahbek / Journal of Empirical Finance xxx (2014) xxx–xxx
Borkovec, M., Klüppelberg, C., 2001. The tail of the stationary distribution of an autoregressive process with ARCH(1) errors. Ann. Appl. Probab. 11, 1220–1241. Brown, B.M., 1971. Martingale central limit theorems. Ann. Math. Stat. 42, 49–56. Carta, A., Fantazzini, D., Maggi, M.A., 2008. Discrete-time affine term structure models: an ARCH formulation. Int. J. Risk Assess. Manag. 11, 164–179. Cavaliere, G., Rahbek, A., Taylor, R.M., 2010. Cointegration rank testing under conditional heteroscedasticity. Econ. Theory 26, 1719–1760. Cavaliere, G., Rahbek, A., Taylor, R.M., 2012. Bootstrap determination of the co-integration rank in VAR models. Econometrica 80 (4), 1721–1740. Comte, F., Lieberman, O., 2003. Asymptotic theory for multivariate GARCH processes. J. Multivar. Anal. 84, 61–84. Cox, J.C., Ingersoll, J.E., Ross, S.A., 1985. A theory of term structure of interest rates. Econometrica 53, 385–407. Creal, D., Koopman, S.J., Lucas, A., 2012. A dynamic multivariate heavy-tailed model for time-varying volatilities and correlations. J. Bus. Econ. Stat. 29 (4), 552–563. Dieci, L., Van Vleck, E.S., 1995. Computation of a few Lyapunov exponents for continuous and discrete dynamical systems. Appl. Numer. Math. 17, 275–291. Doornik, J.A., 2007. Object-Oriented Matrix Programming Using Ox, 3rd ed. Timberlake Consultants Press, London. Engle, R., Kroner, K., 1995. Multivariate simultaneous generalized ARCH. Econ. Theory 11, 122–150. Feigin, P.D., Tweedie, R.L., 1985. Random coefficient autoregressive processes: a Markov chain analysis of stationarity and finiteness of moments. J. Time Ser. Anal. 6, 1–14. Fong, P.W., Li, W.K., 2004. Some results on cointegration with random coefficients in the error correction form: estimation and testing. J. Time Ser. Anal. 25, 419–441. Francq, C., Zakoïan, J.-M., 2009. Testing the nullity of GARCH coefficients: correction of the standard tests and relative efficiency comparisons. J. Am. Stat. Assoc. 104 (485), 313–324. Francq, C., Zakoïan, J.-M., 2010. GARCH Models: Structure, Statistical Inference and Financial Applications. Wiley. Gourieroux, C., Monfort, A., Polimenis, V., 2002. Affine term structure models. Working Papers 2002–49. Centre de Recherche en Economie et Statistique (CREST), Paris. Harvey, A.C., 2012. Dynamic Models for Volatility and Heavy Tails. Cambridge University Press, Cambridge. Jeantheau, T., 1998. Strong consistency of estimators for multivariate ARCH models. Econ. Theory 14, 70–86. Jensen, S.T., Rahbek, A., 2004. Asymptotic inference for nonstationary GARCH. Econ. Theory 20, 1203–1226. Jensen, S.T., Rahbek, A., 2007. A note on the law of large numbers for functions of geometrically ergodic time series. Econ. Theory 23, 761–766. Johansen, S., 1996. Likelihood-based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press. Klüppelberg, C., Pergamenchtchikov, S., 2004. The tail of the stationary distribution of a random coefficient AR(q) model. Ann. Appl. Probab. 14, 971–1005. Klüppelberg, C., Pergamenchtchikov, S., 2007. Extremal behaviour of models with multivariate random recurrence representation. Stoch. Process. Appl. 117, 432–456. Lange, T., Rahbek, A., Jensen, S.T., 2011. Estimation and asymptotic inference in the AR-ARCH model. Econ. Rev. 30, 129–153. Le, A., Singleton, K.J., Dai, Q., 2010. Discrete-time AffineQ term structure models with generalized market prices of risk. Rev. Financ. Stud. 23 (5), 2184–2227. Ling, S., 2004. Estimation and testing of stationarity for double-autoregressive models. J. R. Stat. Soc. Ser. B 66, 63–78. Ling, S., 2007. A double AR(p) model: structure and estimation. Stat. Sin. 17, 161–175. Ling, S., Li, D., 2008. Asymptotic inference for a nonstationary double AR(1) model. Biometrika 95, 257–263. Lumsdaine, R., 1995. Finite-sample properties of the maximum likelihood estimator in GARCH(1,1) and IGARCH (1,1) models: a Monte Carlo investigation. J. Bus. Econ. Stat. 13 (1), 1–10. Swensen, A.R., 2006. Bootstrap algorithms for testing and determining the cointegration rank in VAR models. Econometrica 74 (6), 1699–1714. Tjøstheim, D., 1990. Non-linear time series and Markov chains. Adv. Appl. Probab. 22, 587–611. Xie, W.C., Huang, Q., 2009. Simulation of moment Lyapunov exponents for linear homogeneous stochastic systems. J. Appl. Mech. 76, 031001.
Please cite this article as: Nielsen, H.B., Rahbek, A., Unit root vector autoregression with volatility induced stationarity, J. Empir. Finance (2014), http://dx.doi.org/10.1016/j.jempfin.2014.03.008