Journal of Econometrics 178 (2014) 694–705
Contents lists available at ScienceDirect
Journal of Econometrics journal homepage: www.elsevier.com/locate/jeconom
The delta expansion for the transition density of diffusion models✩ Yoon Dong Lee a,1 , Seongjoo Song b,1 , Eun-Kyung Lee c,∗ a
Sogang Business School, Sogang University, ShinSoo-Dong, MaPo-Gu, Seoul 121-742, Republic of Korea
b
Department of Statistics, Korea University, 145 Anam-Ro, Seongbuk-Gu, Seoul 136-701, Republic of Korea
c
Department of Statistics, Ewha Womans University, 11-1 DaeHyun-Dong, SeoDaeMoon-Gu, Seoul, 120-750, Republic of Korea
article
info
Article history: Received 17 October 2012 Received in revised form 11 October 2013 Accepted 16 October 2013 Available online 23 October 2013 Keywords: Diffusion model Transition density Edgeworth expansion Likelihood estimation Hermite expansion
abstract This paper is on the issue of finding a closed-form likelihood approximation of diffusion processes and rearranging the Hermite expansion in the order of the power of the observational time interval. We propose an algorithm that calculates the coefficients of the rearranged expansion that Aït-Sahalia (2002) suggested. That is, a general expression of the coefficients is provided explicitly, which as far as we know has not been given in the existing literature. We also introduce a reduced form of the rearranged expansion and call it as the delta expansion in the paper. Moreover, we are able to obtain an explicit expansion of the moments in the order of the power of the observational time interval. We examine the delta expansion and the Hermite expansion without rearrangement numerically to find that the delta expansion has such advantageous features as the order of the error bound can be more effectively attained. It is also found that our expansion gives a comparable numerical accuracy of the approximation to the expansion Aït-Sahalia (1999) suggested, while making any symbolic computation unnecessary. © 2013 Elsevier B.V. All rights reserved.
1. Introduction Diffusion models can be used to describe the movements of small particles or the fluctuations in the prices of financial assets. A diffusion process, X , is a continuous-time Markov random process defined by the following stochastic differential equation (SDE): dXt = µt dt + σt dWt ,
(1)
where W is a standard Brownian motion. Conditions that guarantee the existence of the solutions of (1) are found in Kloeden and Platen (1999) or Øksendal (2003), among others. When the drift and the diffusion coefficients, µt and σt , do not depend on time t directly and the Lipschitz continuity condition is satisfied, the diffusion is called the (time-homogeneous) Itô diffusion. In this paper, we will concentrate only on Itô diffusions for simplicity. Since actual observations of diffusion processes in reality are recorded discretely, in order to apply the method of likelihood
✩ This work was supported by the Sogang University Research Grant of 2012 (Y.D. Lee), by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2009-0093827, E.-K. Lee), and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2011-0026070, S. Song). ∗ Corresponding author. E-mail addresses:
[email protected] (Y.D. Lee),
[email protected] (S. Song),
[email protected] (E.-K. Lee). 1 The first two authors contributed equally to this work.
0304-4076/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jeconom.2013.10.008
estimation, we need to evaluate the transition density which is not known in general. In this direction, Aït-Sahalia (1999, 2002) made a break-through in evaluating the transition density approximately in a closed-form. He suggested two types of series expansions for univariate time-homogeneous diffusions; Aït-Sahalia (1999) considered the solution of the Fokker–Planck equation as a series of ∆, the length of the observation interval and AïtSahalia (2002) adopted Hermite polynomials as orthogonal bases to approximate the transition density. The expansion suggested in Aït-Sahalia (2002) is an improved version of the series expansion that Dacunha-Castelle and Florens-Zmirou (1986) suggested, by using Hermite polynomials instead of monomials as basis functions. There have been many papers published on this issue thereafter; for example, Egorov et al. (2003) extended the closed-form likelihood approximation of Aït-Sahalia (2002) for time-inhomogeneous diffusions and Bakshi and Ju (2005) and Bakshi et al. (2006) expanded the class of diffusion processes to which Aït-Sahalia’s expansion (1999; 2002) can be applied. Aït-Sahalia and Yu (2006) extended the previous result in the form of saddle point expansions, applicable to models for which Laplace or characteristic functions are well approximated by functions of series, and Aït-Sahalia (2008) developed a way to find a closed-form likelihood expansion for a multivariate time-homogeneous diffusion using series solution approach of the Fokker–Planck equation. Yu (2007) studied a closed-form approximation of the likelihood function for multivariate jump-diffusion processes and Choi (2011) found a closed-form expansion of the log likelihood for multivariate time-inhomogeneous diffusions. Li (2010) applied the series
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
approximation method to estimate the damped diffusion model. Regarding the asymptotic properties of the approximated maximum likelihood estimators such as consistency and the convergence rate under several conditions, one can refer to Chang and Chen (2011). Also, there are some works on the Bayesian inference based on the closed-form approximation of the likelihood by AïtSahalia (2002) such as Di Pietro (2001) and Stramer et al. (2010). On the other hand, the Edgeworth expansion that is commonly used for finding an asymptotic distribution of an estimator also uses Hermite polynomials. In this case, we can easily find important distributional properties such as symmetry, moments, or cumulants and we also find the relationship between the estimation based on moments and the estimation based on likelihoods. Hall (1995) and McCullagh (1987) explain the advantages of the Edgeworth expansion using Hermite polynomials as basis functions when the sampling distribution of a statistic is expanded. In approximating the distribution of an estimator, it arranges the terms not in the order of the degree of Hermite polynomials, but in the order of the power of the sample size that is of the main interest in the limiting process. Likewise, in the expansion of the transition density, it would be more suitable to arrange terms in the order of the power of ∆, the time increment between observations in the statistical viewpoint. In fact, Aït-Sahalia (2002) suggested a way of rearranging the series in the order of the power of ∆ for a finite term approximation, but did not provide a general formula for coefficients. (See Formulae (4.4)–(4.9) in Aït-Sahalia (2002).) He also mentioned another series approximation that arranges terms in the order of the power of ∆ in Aït-Sahalia (1999). This is reintroduced in AïtSahalia (2002) as Formula (4.10), which is the series solution of the Fokker–Planck equation. The solution is expressed in the order of the power of ∆, but the expansion itself is not related to Hermite polynomials. This series approximation is a simple numerical approximation that does not have any of the statistical meaning implied by Hermite polynomials connecting the density and moments. That is, although it gives a very good approximation to the true transition density and a more stable relative error than the Hermite expansion with the same error bound, the series solution of the Fokker–Planck equation does not give us any insight into the interpretations of the transition probability distribution. Moreover, in order to use its result in practice, we have to go through some preprocessing for the symbolic calculation, although the numerical accuracy of this method is better than Hermite expansions without rearrangement. This paper proposes an algorithm that calculates the coefficients of the rearranged expansion Aït-Sahalia (2002) suggested. The algorithm gives us an explicit representation of the general form of the expansion that arranges the terms of the Hermite expansion in the order of the power of ∆, by separating terms with the drift coefficient from terms with ∆ in applying Dynkin’s operator repeatedly. Although Aït-Sahalia (1999) already provided an explicit expression for the formula arranged in the order of ∆, our result has its own meaning since the Hermite series expansion and the series solution of the Fokker–Planck equation have different meanings and applications. See, for advantages of Hermite expansions, DasGupta (2008) or Barndorff-Nielsen and Cox (1989). Meanwhile, we also obtain an explicit expansion of the moments in the order of ∆. We also suggest using a reduced form of the rearranged Hermite expansion. We will call this reduced form as the delta expansion in this paper. Simulation experiments show that it gives a more precise approximation of the exact transition densities compared to approximations by the Hermite expansion before rearrangement and also gives comparable results to the series approximation of the solution of the Fokker–Planck equation, suggested in Aït-Sahalia (1999). Moreover, when we estimate the
695
model parameters of diffusions, the estimated values obtained by maximizing the delta expansion are more similar to the likelihood estimators than the ones found by maximizing the Hermite type expansions. The rest of the paper is organized as follows. In Section 2, the related concepts and terminologies as well as the relevant literature are briefly reviewed. In Section 3, we show the steps to obtain the delta expansion. In Section 4, the delta expansion is numerically compared to the Hermite expansion for well-known diffusion models. A real data example of US Federal Fund Rate is also provided. Section 5 concludes the paper. 2. Transition density and series expansions Transition density p∆ (x0 , x) is the probability density that a random process X evolves from X0 = x0 to X∆ = x, that is, P (X∆ ∈ d x|X0 = x0 ) = p∆ (x0 , x)dλ(x) where λ(·) is the Lebesgue measure. By the Itô’s lemma, when X satisfies the SDE (1), df (Xt ) = [Ax f ](Xt ) dt + σ (x)[Dx f ](Xt )dWt ,
(2)
where Dx is the partial differential operator (∂/∂ x) with respect to x and Ax is Dynkin’s operator defined as follows: 1
[Ax f ](x) = µ(x) [Dx f ](x) + σ 2 (x) [D2x f ](x). 2
With the function h(·) defined as h(x) =
x
du
(3)
σ (u)
and y = h(x), the transformed process, Yt = h(Xt ), becomes a unit diffusion such that dYt = a(Yt )dt + dWt ,
(4)
where a(y) =
µ(h−1 (y)) 1 ′ −1 − σ (h (y)). σ (h−1 (y)) 2
(5)
The function h(·) is called a Lamperti transformation. Between the transition density, p∆ (x0 , x) of X and q∆ (y0 , y) of Yt = h(Xt ), the following relationship holds: p∆ (x0 , x) =
q∆ (h(x0 ), h(x))
σ (x )
.
Therefore, it is sufficient to consider the transition density q∆ (y0 , y) of the process Y in (4) to derive the transition density p∆ (x0 , x) of any Itô diffusion X . Rogers (1985) and Dacunha-Castelle and Florens-Zmirou (1986) showed that the transition density for the diffusion in (4) is represented as follows: 1 q∆ (y0 , y) = √ φ
∆ y
y − y0
√ ∆
1 ∆ ˜ · eA(y)−A(y0 ) · E e− 2 0 g (Wt )dt , (6)
where A(y) = 0 a(v)dv , g (y) = a2 (y) + a′ (y), and φ(·) is the density of the standard normal distribution. The Brownian bridge ˜ t ; t ∈ (0, ∆)}, is assumed to satisfy the boundary conprocess, {W ˜ ˜ ∆ = y. The major difficulty in obtaining the ditions, W0 = y0 and W transition density q∆ (y0 , y) in (6) comes from the step of taking the expectation of a function of the path integral over the Brownian bridge. A lot of research has been done to overcome this problem such as Beskos et al. (2006a,b, 2009), Papaspiliopoulos (2011), and Papaspiliopoulos and Roberts (2012).
696
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
In this direction of research, Aït-Sahalia (2002) suggested a method to approximate q∆ (y0 , y) as follows. When Y has the transition density q∆ (y0 , y) from Y0 = y0 to Y∆ = y, the transformed 1
variable Z∆ = z (Y∆ ) through the function z (y) = ∆− 2 (y − y0 ) has the transition density 1/2 ∆ q∆ q (y0 , y0 + ∆1/2 z ). Z (y0 , z ) = ∆
He showed that ∆
qZ (y0 , z ) = φ(z )
∞
By = a(y) + Dy .
ηj Hj (z ),
Applying Lemma A1 to Hj (z (y)), ηjK in (10) can be represented
where ηj = ηj (∆, y0 ) = (1/j!) E [Hj (Z∆ )|Y0 = y0 ] and Hj (z ) is the jth degree Hermite polynomial. That is, Hj (z ) satisfies
(−Dz )j φ(z ) = Hj (z )φ(z ).
(7)
H0 (z ) = 1, H1 (z ) = z, H2 (z ) = z 2 − 1, and for j ≥ 2, Hj (z )s are obtained sequentially from the relation Hj+1 (z ) = zHj (z ) − jHj−1 (z ),
j > 0.
Aït-Sahalia (1999, 2002) used a different version of Hermite polynomials, defined through the sequential relation (Dz )j φ(z ) = Hj (z )φ(z ), but we will use (7) in this paper. To avoid possible confusion, we will call ηj = ηj (∆, y0 ) = (1/j!) E [Hj (Z∆ )|Y0 = y0 ] the jth scaled Hermite moment and ξj = j!ηj = E [Hj (Z∆ )|Y0 = y0 ] the jth Hermite moment. By applying the Itô–Taylor expansion to ηj (∆, y0 ), we have K
Aky Hj (z (y))
k=0
∆ k!
k
y0
∆ K +1 + (1/j!)E AKy Hj (z (Ys ))|Y0 = y0 . (K + 1)!
J
ηjK Hj (z )
(8)
(9)
where
−1 , k! 2k−r (k − r )!r ! 0,
k ≥ r, k < r,
(11)
the first few terms of wk,d (y) are readily written in terms of Ay , By , and Jk,r as shown in the Appendix. In order to get an explicit form of ηjK , it is important to obtain the general solution of wk,d (y) and it can be written as a linear combination of ζl,n (y) and Jk,r as in Lemma A2 in the Appendix. Definition 1 is for this function, ζl,n (y). In their repeated applications, the operators Ay and By are assumed to be applied in a right-to-left order to the constant Jk,r ; for example,
By Ay By Jk,r = By {Ay (By Jk,r )}. Since Ay 1 = 0, the terms of Byn Ay Jk,r , n = 0, 1, . . . are also 0
l +n l
possible orderings in applying
Definition 1. We define the sequence ζl,n (y), l, n = 0, 1, . . . as the summation of functions obtained by applying l Ay s and n By s to the constant 1 in all possible orderings. That is, ζ0,0 (y) = 1 and for l, n ≥ 1,
ζ0,n (y) = Byn 1, ζl,0 (y) = Aly 1, ζl,n (y) = Ay ζl−1,n (y) + By ζl,n−1 (y). For example, we have
K
η =
Jk,r =
l Ay s and n By s to a constant, because the operators Ay and By are not commutative; i.e., Ay By 1 ̸= By Ay 1.
j =0
K j
−d in terms of wk,d (y), D2k Hj (z (y)), and ∆. wk,d (y) is defined as y the solution of a recursive Eq. (22). By defining the sequence of constants of
for any k and r. And there are
The transition density q∆ Z (y0 , z ) of Z∆ is then approximated with q∆ ( y , z ) as 0 (J ,K ) q∆ (J ,K ) (y0 , z ) = φ(z )
j =0
ηj (∆, y0 ) = (1/j!)
As a first step to write the Hermite expansion in the order of the power of ∆, we will look into the representation of the repeated applications of Dynkin’s operator to a function f (y), that is, Aky f (y). Lemma A1 in the Appendix tells us that the repeated applications of Dynkin’s operator can be represented by a linear combination of −d the differential operator D2k and the sequence of the coefficient y function wk,d (y), for d = 0, . . . , 2k and k = 0, 1, 2, . . .. Note that throughout the paper, we use the operator By defined as
1 j! k=0
Aky Hj
(z (y))
y0
∆ k!
k
.
(10)
Since Ay = a(y)Dy + 21 D2y and Dy = ∆−1/2 Dz , we need to express
Aky Hj (z ) in an explicit form in order to rearrange the series q∆ (J ,K ) in the order of the power of ∆. In Aït-Sahalia (2002), ηjK was expressed in the order of the power of ∆ for a few small K values, but the general form was not provided. In fact, Aït-Sahalia (1999) provided an explicit expression of the approximation formula arranged in terms of ∆, which, however, was not the Hermite expansion. In the next section, we will propose a way to find a general formula for ηjK in (9) by computing the explicit form of Aky Hj (z ) and it is the main result of this paper. 3. Delta expansion Aït-Sahalia (2002) assumed regularity conditions on the drift and diffusion coefficients of diffusions to guarantee the existence of the solution of the SDEs given in (1) and (4) and to ensure some desirable properties of estimators of the model parameters. Here, we adopt three regularity conditions in his paper (Assumptions 1 through 3 in Aït-Sahalia (2002)) that guarantee the existence of the solution of the SDEs.
ζ1,n (y) =
n
Byk Ay Byn−k
1
k=0
ζ2,n (y) =
n n
Byl Ay Byk−l Ay Byn−k
1.
l=0 k=l
Since, for any function f (y),
Ay f (y) =
1 2
(By + a(y))(By − a(y))f (y) =
1 2
(By2 − g (y))f (y),
with g (y) = a2 (y) + a′ (y), we have
Byk Ay f (y) =
=
1 2
Byk (By2 − g (y))f (y)
1 2
Byk+2
−
k k i=0
i
(i)
g (y)
Byk−i
f (y),
easily by the mathematical induction. Note that By g = (g By + g ′ ) and g (i) (y) is the ith derivative of g (y) with respect to y. Let us denote v (i) (y) to be the ith derivative of −g (y) with respect to y for i ≥ 0. Then we can easily show that
v (i) (y) = −
i i r =−1
r+
a(r ) (y)a(i−r ) (y),
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
where a(−1) (y) = 1 and r + = max(0, r ). If we define v (−1) (y) = 1 for notational convenience, we also have
Byk Ay
f (y) =
k 1
k
v (i) (y)Byk−i f (y), ∗
i+
2 i=−1
ˆ K ,d,r (z ; K ) = H
Lemma 1. When p(n) = (p1 , . . . , pn ) is an n-dimensional n vector of nonnegative integers, pi , i = 1, 2, . . . , n, such that i=1 i pi = n, ζl,n (y) is expressed recursively as follows:
ζ0,0 (y) = 1,
ζl,0 (y) = 0, l = 1, 2, . . . (i−1) pi n 1 a (y) n! ζ0,n (y) = , n = 1, 2, . . . p! i! i =1 i p(n) n n 1 k ∗ ζl,n (y) = v (i) (y) Byk−i ζl−1,n−k (y), + 2 i=−1 + k=i
Furthermore, instead of using a pre-fixed number J, we consider J as a function of d to be 2K − d for any d = 0, 1, . . . , 2K in (17). With ˆ K ,k,d,r (z ; [(J + d)/2]− ) becomes this condition, H
(12)
where i∗ = i − i− and i− = − min(0, i). Note that i∗ = 2i for i < 0. Now, the following lemma shows the formula of ζl,n (y) in a recursive manner.
K −r K −r −k (−1)m H2m+2k+2r −d (z ) = H2r −d (z ), k!2k 2m m! k=0 m=0
since terms with the same values of m + k = 1, . . . , K − r will all cancel out. Thus, the approximated Hermite expansion is reduced to q∆ (∗,K ) (y0 , z )
= φ(z )
2K
d
d/2
∆
1 r!
r =[d/2]+
d=0
(13)
ζd−r ,2r −d (y0 ) H2r −d (z ) .
(18)
We will call this the delta expansion of transition density. (14)
Proof. See Appendix.
(15)
The transition probability Q ∆ (y0 , z ) = Pr (Z∆ ≤ z |Y0 = y0 ) is z obtained by integrating q∆ (y0 , z ) as Q ∆ (y0 , z ) = −∞ q∆ (y0 , u) dλ (u). The following corollary on the approximation of the transition probability is directly obtained from Theorem 1, by applying the property of Hermite polynomial, (7). The proof is omitted.
i
l = 1, 2, . . . ; n = 1, 2, . . . .
697
Proof. See Appendix.
∆ Corollary 1. The delta expansion Q(∗, K ) (y0 , z ) of the transition
In fact, we can also obtain a non-recursive formula for ζl,n using the recursive relation (15) as follows. One can use Lemma 1 or the non-recursive formula given below to find ηjK in (10):
∆ Q(∗, K ) (y0 , z ) = Φ (z ) − φ(z )
ζl,n (y) =
2l
×
···
v
(ip )
wl (n, i1 , i2 , . . . , il )
n−i∗ −i∗2 −···−i∗l 1
=
n−i1 ∗ −···−il−1 ∗
p∆ (J ,K ) (x0 , x)
···
∗ + n1 =i1 n2 =n1 −i1
l
nl =nl−1 −il−1 ∗ p=1
np
i+ p
(z (y)) =
2k
d=0
−d ζd−r ,2r −d (y)Jk,r D2k Hj (z (y)) y
(16)
r =[d/2]+
ing (10) for ηjK in (9), we obtain Theorem 1. This theorem shows the explicit representation of the approximation of the Hermite expansion and shows that the approximation is reduced to the delta expansion when an additional constraint regarding J is imposed. Theorem 1. The approximated transition density of Z∆ by the Hermite expansion in (9) is arranged in the order of the power of ∆ as follows:
= φ(z )
2K d=0
∆
d/2
= q∆ (J ,K ) (h(x0 ), h(x))/σ (x)
d
r =[d/2]+
1 r!
ζd−r ,2r −d (y0 )
× Hˆ K ,d,r (z ; [(J + d)/2]− ) where K −r L− r −k (−1)m ˆ K ,d,r (z ; L) = H H2m+2k+2r −d (z ). k!2k 2m m! k=0 m=0
(19)
defined in the same manner as (19), using q∆ (∗,K ) . In (18), the summation index d varies from 0 to an even integer 2K , but we can use any nonnegative integer rather than 2K without loss of generality. Then we can represent the same expansion as follows: ∆
for k, j = 0, 1, 2, . . .. By substituting (16) for Aky Hj (z (y)) in (10) and again, substitut-
q∆ (J ,K ) (y0 , z )
ζd−r ,2r −d (y0 ) H2r −d−1 (z ) .
in approximating the transition density p (x0 , x) of some wellknown diffusion models. The delta expansion p∆ (∗,K ) (x0 , x) is
.
d
r!
∆
Using [·]+ and [·]− , the ceiling and floor functions, respectively, we obtain the following as a direct result of Lemmas A1 and A2:
Aky Hj
∆d/2
In the next section we will numerically study the accuracy of
where wl (n, i1 , . . . , il )
p=1 n
1
r =[d/2]+
By
n−i1 ∗
d
×
il =−1
i1 =−1 i2 =−1
l
2K d=1
n−i∗ −i∗l−1 1
∗
n −i n 1 1
probability Q ∆ (y0 , z ) is obtained as follows:
(17)
qZ (y0 , z ) = φ(z )
K 1 k ζk−j,2j+q (y0 ) k=0 q=0
j =0
(k + j + q)!
H2j+q (z )
× ∆k+q/2 + O(∆K +1 ) .
(20)
Since the Hermite bases are orthogonal, the expansions of Hermite moments and the moments of Z∆ are obtained directly from (20) as we see in the next theorem. Theorem 2. For a diffusion Zt = z (Yt ), the Hermite moments, ξ2r +q (y0 ) = E [H2r +q (Z∆ )|Y0 = y0 ], and the moments of Z∆ , m2r +q (y0 ) = E [(Z∆ )2r +q |Y0 = y0 ], for a nonnegative integer r and q = 0, 1, are expanded with respect to ∆ as follows:
ξ2r +q (y0 ) = (2r + q)!
K ζk−r ,2r +q (y0 ) k=r
m2r +q (y0 ) = (2r + q)!
(k + r + q)!
(k,r ) K min ζk−j,2j+q (y0 )
k=0 k+q/2
×
∆k+q/2 + O(∆K +1 ),
j =0
2r −j (r − j)!
∆ + O(∆K +1 ). (k + j + q)!
698
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
(a) CIR.
(b) CIR. ∆
Fig. 1. The exact transition densities p (x0 , x) (the thicker solid lines), the normal approximation by the Euler approximation scheme (the dotted lines), and the 0th order delta expansion, (∗, 0) (the thinner solid lines) for the CIR model (left) with parameter θ = (0.2, 0.08, 0.1) and the OU model (right) with parameter θ = (0.5, 0.06, 0.3). For x0 , three cases of 0.04, 0.08 and 0.12 are tested for the CIR model and −0.15 and 0.25 are tested for the OU model. ∆ = 1/4 is assumed for both models.
We also get the expansion of the moment generating function, M (s; y0 ) = E [exp{s Z∆ }|Y0 = y0 ] as
K k ζk−j,2j (y0 ) 2j k · s ∆ M (s; y0 ) = exp 2 (k + j)! k=0 j=0 ζk−j,2j+1 (y0 ) 2j+1 k+1/2 + s ∆ + O(∆K +1 ). (k + j + 1)!
s2
Proof. See Appendix. Kessler (1997) showed the expansion of the first and the second moments using a contrast function but the expansion of higher order moments has not been shown in the previous literature, to the extent of our knowledge. Here, the expansion of higher order moments becomes possible due to special characteristics of Hermite polynomials as orthogonal polynomials. 4. Numerical examples In Section 3, we presented approximated Hermite expansions and the delta expansion of transition densities of diffusions. In this section, we compare these expansion methods in terms of their numerical accuracy and the performances in estimating model parameters. The Ornstein–Uhlenbeck (OU) model, the Cox–Ingersoll–Ross (CIR) model, and the geometric Brownian motion (GBM) are considered for comparison, since the exact values of the transition densities of those models can be calculated from closed-form expressions of their likelihood functions. These models are special cases of the Chan–Karolyi–Longstaff–Sanders (CKLS) model that Chan et al. (1992) suggested and is defined by the SDE such that γ
dXt = α(β − Xt )dt + σ Xt dWt .
(21)
The OU model and the CIR model have γ = 0 and γ = 1/2, respectively, and the GBM with parameter θ = (ρ, σ ) has α = −ρ , β = 0 and γ = 1. For these models, the difference d∆ (J ,K ) (x0 , x) between the approximation and the truth is exactly assessed by subtracting the true value from the approximated value, that is, ∆ ∆ d∆ (J ,K ) (x0 , x) = p(J ,K ) (x0 , x) − p (x0 , x). We mainly consider CIR models for checking the accuracy of the approximation methods, because CIR models have relatively more complicated transition densities than those in OU models or GBMs.
The CIR model corresponds to the case of γ = 1/2 in (21) and has three parameters θ = (α, β, σ ). We tested the case of θ = (0.2, 0.08, 0.1) which was used in Hurn et al. (2007). For the OU model, we tested the case of θ = (0.5, 0.06, 0.3), which was also considered in Aït-Sahalia (2002). For GBM, we studied the case of θ = (0.08, 0.25), which was considered in Jensen and Poulsen (2002). With this model, α = −0.08, β = 0, γ = 1, and σ = 0.25 in (21). The unit of ∆ is arbitrary but in much of recent research, ∆ is measured in years. That is, for quarterly observed data and monthly observed data, ∆ will be 1/4 and 1/12, respectively. In this section we used 1/4, 1/12, and 1/52 for ∆. Fig. 1 compares the exact transition density p∆ (x0 , x), the 0th order delta expansion p∆ (∗,0) (x0 , x), and the Euler approximation
with a mean x0 +µ(θ , x0 ) ∆ and variance ∆ σ 2 (θ , x0 ) of the normal distribution, for the CIR model and OU model. In the CIR model, the variances of the transition densities depend on the value of x0 , but they are invariant under the OU model. For both the CIR model and OU model, the variances of the 0-th order delta expansions and the Euler approximations are slightly less than those of the exact transition densities. Fig. 2 shows the differences d∆ (∗,K ) (x0 , x) for several values of K comparatively for the cases of ∆ = 1/4 and ∆ = 1/12. d∆ (∗,K ) (x0 , x) diminishes as the value of K increases and the differences for the case of ∆ = 1/12 decrease more rapidly than those for the case of ∆ = 1/4. Fig. 3 shows the boxplots of e∆ (J ,K ) (θ , N ), defined by
ˆ ˆ e∆ (J ,K ) (θ , N ) = θ(J ,K ) − θmle , obtained from 1000 simulation iterations for GBM with θ = (0.08, 0.25). N is the sample size. θˆmle is the value maximizing the likelihood function using the exact transition density and θˆ(J ,K ) is the value maximizing the approximated likelihood function using p∆ (J ,K ) (x0 , x). The cases of ∗, 1, 2, 4 for J and 1, 2, 2.5 for K are considered. ∗ represents the delta expansion. As the estimators θˆ(J ,K ) and θˆmle depend on the sample size N, the difference of the estimators e∆ (J ,K ) (θ , N ) also depends on N. The cases where estimators are dispersed too widely are omitted from Fig. 3 for better comparison. Simulation results of more cases including cases of ∆ = 1/52 are summarized in Tables 1–3. Each table shows the results of the CIR model, OU model, and GBM, respectively. For simulated samples, we generated each observation from the exact transition densities. The exact transition densities for the models are found in Hurn et al. (2007). We considered sample sizes of N = 200
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
(a) ∆ = 1/4.
699
(b) ∆ = 1/12.
Fig. 2. The difference d∆ (∗,K ) (x0 , x) for the CIR model with parameter θ = (0.2, 0.08, 0.1). The case of x0 = 0.08 was tested. The left panel shows the case of ∆ = 1/4 and
the right panel shows the case of ∆ = 1/12. All the delta expansions for K = 0, 0.5, 1, 1.5 are drawn. As K increases, the absolute values of the differences d∆ (∗,K ) (x0 , x) diminish uniformly. Note that K = 0.5 and 1.5 let the 2K in formula (18) be 1 and 3, respectively.
(a) ρˆ (J ,K ) − ρˆ mle .
(b) σˆ (J ,K ) − σˆ mle .
Fig. 3. Boxplots for the differences e∆ (J ,K ) (θ, N ) in the GBM with θ = (ρ, σ ) = (0.08, 0.25). Here N = 200 and ∆ = 1/4 are assumed. The boxplots for some cases are not shown, because they are too widely spread.
and N = 400 when ∆ = 1/4. When we decrease ∆ with a fixed value of N, there can be some inconsistency problem due to the shortened entire observation period. Thus, we increased N to 2600 and 5200 when ∆ = 1/52 so that we have the same entire observation period as the case of ∆ = 1/4. The boxplots in Fig. 3 are obtained for the case of N = 200. The values shown in Tables 1– 3 summarize e∆ (J ,K ) (θ , N ) by showing 1000× MD (mean difference) and 1000× RMSD (root mean square difference). The MD and RMSD are the average of e∆ (J ,K ) (θ , N ) and the square root of the
US federal fund rate from January, 1963 to October, 1998. When the CIR model is fitted with ∆ = 1/12, the estimated values of the parameters are listed in Table 4. The MLEs in Table 4 are obtained from the exact transition densities, and each row of the table corresponds to the maximum approximated likelihood estimator θˆ(J ,K ) . This table shows that the delta expansion with K = 2 already gives a very close estimate of the MLE. Fig. 5 shows the boxplots for the values of d∆ (J ,K ) (xi−1 , xi ), i = 1, 2, . . . , 432 applied to the US FFR
average of respectively, for all simulated samples. In each table, the RMSDs of the delta expansions, (∗, K )s, are much less than those of the approximated Hermite expansions, (J , K )s. Moreover, while the RMSDs of the delta expansions monotonically decrease as K increases, the RMSDs of the approximated Hermite expansions vary irregularly. The delta expansion is numerically more stable than the approximated Hermite expansions. The same phenomenon is observed in Fig. 3. Fig. 4 shows the time-series plot for the US Federal Fund Rate (US FFR) data, which has 432 observations of the monthly recorded
model was assumed to be the true model and the MLE θˆmle was pretended to be the true parameter θ . The boxplots show that the delta expansion (∗, 2) provides the most accurate approximation of the transition densities out of the compared cases. One can look at Fig. 6 for a comparison of the delta expansion with the series solution of the Fokker–Planck equation suggested by Aït-Sahalia (1999). For this simulation, the CIR model with θ = (0.2, 0.08, 0.1), x0 = 0.08 and ∆ = 1/4 were assumed. The dotted line is the difference between the true transition density and the approximation by the series solution of the Fokker–Planck
2 (e∆ (J ,K ) (θ , N )) ,
∆ data. In evaluating the values of p∆ (J ,K ) (x0 , x) and p (x0 , x), the CIR
700
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
Table 1 Differences e∆ (J ,K ) (θ, N ) for the CIR model with θ = (0.2, 0.08, 0.1). Reported values are 1000× mean difference and 1000× root mean square difference (in parentheses).
∆ = 1/4 and ∆ = 1/52 are assumed. The row with ‘MLE’ shows the average of MLEs for all simulated samples and the following row gives 1000× mean difference (θˆmle − θ ) and 1000× root mean square difference for all simulated samples.
θ
α
β
TRUE
0.2
0.08
MLE
∆ = 1/4 N = 200
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5) (1, 2) (2, 2) (3, 2) (4, 2)
∆ = 1/4 N = 400
0.293 93.103
(157.849)
(19.930)
(153.620) (29.455) (16.376) (4.380)
14.435 0.421 −0.018 −0.003
(24.765) (1.391) (0.156) (0.094)
−1.937
0.111 0.034 0.183 1.218
(4.343) (1.327) (2.372) (6.846)
−1.950
−19.574
(54.594) (26.637) (39.579) (61.471)
0.244 44.354
(86.920)
0.080 0.291
(13.607)
14.567 0.337 −0.010 −0.001
(20.169) (0.945) (0.112) (0.071)
−1.646
0.202
−1.657
0.042 1.186
(3.539) (0.930) (1.400) (4.610)
(162.355)
0.081 1.287
(20.902)
−0.057
(1.366)
(158.638) (17.752) (0.660) (0.172)
13.057 0.026 0.000 0.000
(24.870) (0.967) (0.047) (0.012)
−0.153
(0.167) (0.022) (0.005) (0.001)
−0.153
0.069
(1.618) (0.708) (0.901) (2.457)
0.000 0.001 −0.001
(88.741)
0.080 0.373
(14.307)
−0.021
(0.993)
(85.578) (8.928) (0.526) (0.080)
13.694 0.034 0.000 0.001
(19.950) (0.821) (0.084) (0.020)
−0.129
(0.135) (0.019) (0.003) (0.001)
(1.892) (0.712) (0.887) (2.350)
−0.130
−85.699 −10.464 0.366 0.114 14.172
−0.491 0.970
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
−36.974 −8.416 −0.085 0.073
(83.660) (18.683) (3.613) (1.202)
(1, 2) (2, 2) (3, 2) (4, 2)
4.473 −1.202 −1.699 −16.124
(29.670) (17.145) (22.447) (38.176)
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
−85.703 −1.121
(1, 2) (2, 2) (3, 2) (4, 2)
0.047 0.374 0.153 −2.683
0.022 0.006
0.244 43.687
∆ = 1/52 N = 5200
0.1
0.080 0.069
0.292 92.423
∆ = 1/52 N = 2600
σ
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
−36.821 −0.722
(1, 2) (2, 2) (3, 2) (4, 2)
−0.504 −0.309
0.000
−0.004
0.247
−2.308
(21.740) (9.395) (13.710) (30.440)
(16.254) (8.184) (11.049) (22.194)
−0.016
−0.019 −0.012 −0.001
−0.060 0.027 −0.028 0.068
0.100 0.337 0.086 −0.005 −0.003 0.075 0.026 0.033 0.100 0.259 0.061
−0.004 −0.002 0.052 0.025 0.055
(5.045) (2.096) (0.223) (0.038) (0.034) (2.109) (0.131) (0.179) (0.925) (3.692) (1.713) (0.090) (0.021) (0.015) (1.725) (0.078) (0.071) (0.517)
0.100
0.001 0.000 0.000
(0.167) (0.019) (0.021) (0.031)
0.100
0.000 0.000 0.000 0.000 0.001 0.000
(0.136) (0.017) (0.019) (0.029)
equation obtained from Matlab code provided on Aït-Sahalia’s website.3 Solid lines are d∆ (∗, K )s of delta expansions with errors of the order of O(∆4 ), O(∆4.5 ), and O(∆5 ). We can easily see from the figure that the delta expansions with an error of O(∆4.5 ) and O(∆5 ) give more precise results than the ones obtained by AïtSahalia (1999). 5. Discussion When the likelihood estimation is considered for estimating the diffusion parameters, it is inevitable that we obtain the transition density for the diffusion model. After Aït-Sahalia’s seminal work (1999; 2002), the method of direct approximation of transition densities in a closed-form has been the main approach to evaluate transition densities. In this paper, we proposed an algorithm to calculate the coefficients of terms explicitly in the Hermite expansion, rearranged in
Fig. 4. Time-series plot for the US FFR data.
3 http://www.princeton.edu/~yacine/closedformmle.htm.
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
701
Table 2 Differences e∆ (J ,K ) (θ, N ) for the OU model with θ = (0.5, 0.06, 0.3). Reported values are 1000× mean difference and 1000× root mean square difference (in parentheses).
∆ = 1/4 and ∆ = 1/52 are assumed. The row with ‘MLE’ shows the average of MLEs for all simulated samples and the following row gives 1000× mean difference (θˆmle − θ ) and 1000× root mean square difference for all simulated samples.
θ
α
β
σ
TRUE
0.5
0.06
0.3
MLE
∆ = 1/4 N = 200
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5) (1, 2) (2, 2) (3, 2) (4, 2)
∆ = 1/4 N = 400
∆ = 1/52 N = 2600
0.581 81.042
−116.800 −43.241
0.061 1.069
0.301 1.268
(90.189) (18.077) (4.066) (2.528)
−10.625
(37.018) (18.048) (25.351) (41.279)
−10.657
−0.058
−4.889 −2.448 −94.875
(113.328) (68.946) (97.374) (139.389)
−0.536 −0.331 −1.250
0.546 45.730
(126.067)
−2.675
(58.105)
29.278 −0.084 −0.039 0.002
(65.383) (13.696) (2.296) (1.452)
−9.957
−0.525
(31.170) (13.811) (21.270) (30.573)
−10.006
0.039 0.001 0.639
1.963 1.999 24.134
25.046
(85.990)
(207.391) (76.286) (28.666) (13.251)
−0.403 0.094 0.158
0.057
0.558 −0.075 −0.036
0.300 0.188
(11.478)
2.105 0.526
(142.945) (68.492) (13.227) (6.627)
(1, 2) (2, 2) (3, 2) (4, 2)
−13.102 −14.528 −23.038 −89.831
(77.788) (53.628) (76.949) (113.570)
0.581 81.179
(189.545)
0.064 3.818
(86.476)
37.756 0.010 −0.019 −0.003
(94.567) (1.065) (0.301) (0.137)
−0.847
0.134 0.015 0.024 1.070
(8.744) (1.040) (1.389) (14.443)
−0.846
−56.639 −3.753 0.016 −0.001
(180.886) (5.676) (0.738) (0.303)
(1, 2) (2, 2) (3, 2) (4, 2)
−2.571 −0.226 −0.527 −11.336
(24.983) (3.571) (5.354) (42.038)
0.060
0.456 0.140 0.892 0.300 0.080 0.003 0.000 0.000
(2.956)
−0.778
−0.016 −0.049 −0.081 −0.056
(6.134) (0.950) (1.115) (11.296)
−0.777
−0.001
(1, 2) (2, 2) (3, 2) (4, 2)
−3.597 −0.113 −0.286 −11.438
(17.549) (2.492) (3.469) (31.513)
the order of the power of ∆, the time increment between observations. We also introduced the delta expansion as a reduced form of the rearranged Hermite expansion and numerically compared its performance to Hermite expansions without rearrangement. In a statistical point of view, we would like to see how the transition density behaves as ∆ goes to 0 and in order to see that, we need to arrange the approximate transition density in the order of the power of ∆. A simple application of the Hermite expansion does not give us a formula arranged in terms of ∆, so we have to rearrange the series. One approach is to use the finite term approximation of the Hermite series expansion given in Aït-Sahalia (2002) and another approach is to use the formula given in AïtSahalia (1999), which is the series approximation of the solution of the Fokker–Planck equation. The Hermite series expansion in Aït-Sahalia (2002) provided explicit expressions of the coefficients for a few terms when arranged in terms of ∆ but did not provide the complete and general form of the coefficients. Another representation introduced in Aït-Sahalia (1999) and Formula (4.10) in Aït-Sahalia (2002) gives a general form of the coefficients when the expansion is arranged
42.091
(4.379) (0.883) (0.033) (0.017) (0.008)
0.300 0.037
(73.436) (0.982) (0.261) (0.086)
(107.838) (4.016) (0.530) (0.229)
(10.258) (0.825) (1.057) (3.458)
(0.882) (0.032) (0.033) (0.119)
−0.056 −0.011 −0.002
−12.809 −3.000
(10.207) (0.807) (0.184) (0.112)
0.004 0.005 0.010
−0.317
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
0.005
0.415 −0.037 −0.021
(112.874)
(58.698)
(11.093) (1.268) (0.482) (0.265) (11.131) (1.209) (1.637) (4.926)
−79.736 −48.401
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
(15.890)
0.518 0.025 0.768
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
0.537 37.357
∆ = 1/52 N = 5200
(188.605)
0.005 0.000 0.000 0.004 0.003 0.016
(0.794) (0.034) (0.013) (0.005) (0.793) (0.033) (0.034) (0.089)
in the order of ∆, but to be precise, it is not the Hermite series expansion. What we did in this paper is to provide an organized method of finding a general formula for the approach by Aït-Sahalia (2002) and to provide a statistically more meaningful expansion than the approach given by Aït-Sahalia (1999). One example of the statistical implication of our method is that an expansion of any Hermite moment can be directly obtained from the delta expansion. More accurate expansions of the moments will improve moment-based estimation methods as well as likelihood-based estimation methods. We developed an R-code to approximate the transition density p∆ (x0 , x) of X using the delta expansion. A primitive form approximating up to the order of O(∆7/2 ) was released on the internet. The R-code has been linked from Aït-Sahalia’s internet research page.4 Recently we also added an R-code for generating ζi,j (y) to the same
4 http://www.princeton.edu/~yacine/closedformmle.htm.
702
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
Table 3 Differences e∆ (J ,K ) (θ, N ) for the GBM with θ = (0.08, 0.25). Reported values are 1000× mean difference and 1000× root mean square difference (in parentheses). ∆ = 1/4 and ∆ = 1/52 are assumed. The row with ‘MLE’ shows the average of MLEs for all simulated samples and the following row gives 1000× mean difference (θˆmle − θ ) and 1000× root mean square difference for all simulated samples. With GBM, a(y) in Eq. (4) is constant, so the values in this table will not change by values of K when the value of J is fixed.
θ
ρ
σ
TRUE
0.08
0.25
MLE
∆ = 1/4 N = 200
0.082 1.834
(34.855)
−1.164
0.249
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5)
25.499 −0.042 −0.001 0.000
(45.874) (0.988) (0.038) (0.010)
1.891 0.005 0.000 0.000
(2.665) (0.118) (0.013) (0.003)
(1 , 2 ) (2 , 2 ) (4 , 2 ) (5 , 2 )
0.592
−0.042 −0.423 −22.091
(2.418) (0.988) (10.367) (33.397)
1.901 0.005 −0.077 0.750
(2.702) (0.118) (1.453) (3.668)
(24.655)
−1.174
(8.938)
(38.791) (0.611) (0.023) (0.009)
1.419 0.002 0.000 0.000
(2.045) (0.160) (0.009) (0.003)
−23.315
(1.380) (0.611) (6.658) (29.460)
1.449 0.002 −0.171 0.843
(1.985) (0.160) (1.164) (2.705)
0.080 0.448
(36.306)
−0.115
(3.391)
28.112 0.004 0.000 0.000
(46.805) (0.154) (0.005) (0.000)
0.142 0.000 0.000 0.000
(0.210) (0.012) (0.001) (0.000)
0.011 0.004 −0.086 −28.064
(0.432) (0.154) (2.923) (39.628)
0.143 0.000 −0.001 0.120
(0.210) (0.012) (0.046) (0.410)
0.081 0.851
(26.262)
−0.200
(2.541)
27.016 0.012 0.000 0.000
(39.558) (0.183) (0.000) (0.000)
0.125 0.000 0.000 0.000
(0.181) (0.019) (0.000) (0.000)
0.033 0.012 0.149 −28.167
(0.301) (0.183) (2.024) (34.515)
0.122 0.000 −0.004 0.125
(0.166) (0.019) (0.040) (0.283)
0.079
−1.319 ∆ = 1/4 N = 400
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5) (1 , 2 ) (2 , 2 ) (4 , 2 ) (5 , 2 )
∆ = 1/52 N = 2600
0.001 0.000 0.457
−0.046 0.188
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5) (1 , 2 ) (2 , 2 ) (4 , 2 ) (5 , 2 )
∆ = 1/52 N = 5200
27.477
−0.046
(∗, 0) (∗, 1) (∗, 2) (∗, 2.5) (1 , 2 ) (2 , 2 ) (4 , 2 ) (5 , 2 )
(12.506)
0.249 Fig. 5. Boxplots for d∆ J ,K (xi−1 , xi ), i = 1, 2, . . . , 432 for US FFR data. (∗, K )s are delta expansions and (J , K )s are Hermite expansions. The CIR model is assumed as the true model and the MLE is pretended to be the true parameter values.
0.250
0.250
Table 4 Estimates for US FFR data with the CIR model.
α
β
σ
MLE
0.21904
0.07205
0.06665
(∗, 0) (∗, 1) (∗, 1.5) (∗, 2)
0.21504 0.21865 0.23103 0.22213
0.10972 0.07160 0.07168 0.07196
0.06637 0.06665 0.06665 0.06665
(3, 1) (4, 1) (1, 2) (2, 2) (3, 2) (4, 2)
0.02245 0.35050 0.13777 0.56333 0.17894 0.07045
0.15577 0.08162 0.11538 0.05778 0.07617 0.12853
0.05975 0.10476 0.06638 0.06548 0.06667 0.06610
website. The R-code we developed will help evaluate the transition densities for various diffusion models. While the algorithm is
Fig. 6. Differences between the approximation and the actual transition density for the CIR model with parameters θ = (0.2, 0.08, 0.1), x0 = 0.08, and ∆ = 1/4. The dotted line is the case of the series solution (SS) of the Fokker–Planck equation obtained by the algorithm provided by Aït-Sahalia (1999). The thinner solid line is the case of the delta expansion for O(∆4 ) and the thicker solid line is the case of O(∆4.5 ). The case of O(∆5 ) is very flat and nearly indistinguishable from the x-axis.
able to calculate coefficients of any order, it does not require any preprocessing by a separate software performing symbolic calculations since we do not solve the Fokker–Planck equation. A more organized form of the R-code giving more accurate results is being tested and will be released soon. The focus of this paper is how to approximate the transition densities of diffusion models. Nevertheless, our original purpose was to find a good approximation of transition densities in order to estimate parameters of diffusion models. Although this paper mainly deals with approximating a transition density, we are currently working on parameter estimation problem adopting the ideas proposed in this paper. Also we are writing a companion paper where we look to improve the numerical accuracy of the method using the technique of exponential tilting.
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
Appendix
and we have
Lemma A1. For k = 0, 1, 2, . . . , repeated applications of Dynkin’s operator Ay to a function f (y) can be represented as follows:
Aky f
(y) =
2k
Byn+1 = (a + Dy )Byn
(i−1) pi n 1 a = n! (a + Dy ) p! i! i=1 i p(n) p n n 1 a(i−1) i a(j) = n! pj (j−1) p! i! a i=1 i j =0 p(n) (i−1) pi n n a(j) a 1 pj (j−1) . = n! p! i! a i=1 i p(n) j=0
wk,d (y)
−d D2k y
f (y),
d=0
where wk,d (y), k, d = 0, 1, 2, . . . satisfies
wk+1,d+2 (y) =
1 2
wk,d+2 (y) + By wk,d+1 (y) + Ay wk,d (y).
(22)
As boundary conditions, we assume w0,0 = 1, w0,d = 0 for d = 1, 2, . . . , and wk,d = 0 for k ≥ 1 and d ̸∈ {0, 1, . . . , 2k − 1}. Proof. We omit the proof and refer the reader to Lee et al. (2010) for details. The first few terms of wk,d are as follows: From the relation (22), we get w1,0 (y) = 21 , w1,1 (y) = a(y), and w1,2 (y) = 0, and for all k ≥ 1 we have
wk,0 (y) = By0 Jk,0 wk,1 (y) = By1 Jk,1 wk,2 (y) = By2 Jk,2 wk,3 (y) = By3 Jk,3 + Ay By Jk,2 wk,4 (y) = By4 Jk,4 + By Ay By + A2y By Jk,3 , with Jk,r defined as in (11). Lemma A2. The solution of (22) with the boundary conditions
w0,0 (y) = 1, w0,d (y) = 0, d = 1, 2, . . . , and wk,d (y) = 0 for k ≥ 1 and d ̸∈ {0, 1, . . . , 2k − 1} is given by wk,d (y) =
d
ζd−r ,2r −d (y) Jk,r ,
(23)
r =[d/2]+
Proof of Lemma 1. (13) is simple from 1. (15) is directly n Definition k obtained by observing ζl,n (y) = k=0 By Ay ζl−1,n−k (y) for l ≥ 1 and using (12). Now, we will prove (14) by the mathematical induction. Since ζ0,1 = a(y), this holds when n = 1. Thus, we only need to show that
(i−1) qi n +1 a (y) 1 , (n + 1)! = q ! i ! i i =1 q(n+1)
assuming
(a(i−1) )pi
a(j)
∗
a(j−1)
n+1
(a(i−1) )qi ,
i=1
= (j + 1)
i!
i=1
=
n+1 qi 1
i!
i=1
and p∗j
n+1 1
∗
i=1
pi !
= qj+1
n +1 1 . q ! i =1 i
If we label all p(n)’s as (p1 , p2 , . . . , pNn ) and denote pk (pk,1 , . . . , pk,n ) for 1 ≤ k ≤ Nn , (24) is going to be (i−1) pk,i Nn n n a 1 n +1 By = n! p ! i! k , i k=1 j=0,p ̸=0 i =1
=
k,j
(i−1) pi n 1 a (y) Byn = n! p ! i ! i =1 i p(n)
×
where q = q(n+1) is an (n+1)-dimensional vector of nonnegative n +1 integers qi , i = 1, . . . , n + 1 such that i=1 iqi = n + 1. p = p(n) is, as we defined before, an n-dimensional vector of nonnegative n integers pi , i = 1, . . . , n such that i=1 ipi = n. For notational simplicity, assume p0 = 1 and a(−1) = 1. Then
(a + Dy )
n+1
n+1 p∗ 1 i
Proof. It is simple by the mathematical induction and thus is omitted.
(24)
The number of such p(n)s is the number of partitions Nn . See http://oeis.org/A000041 for further information about the number of partitions. That is, there are Nn p(n)’s and Nn+1 q(n + 1)’s. In order to see the relationship between p(n)’s and q(n + 1)’s, let us define p∗ (n). For a fixed p = p(n) = (p1 , p2 , . . . , pn ), define p∗ as an (n + 1)-dimensional vector as (p∗1 , p∗2 , . . . , p∗n , p∗n+1 ) = (p1 , p2 , . . . , pn , 0). Now, we can think of two ways to generate q(n + 1) from p∗ . First of all, if we add 1 to p∗1 , the resulting vector will be one of q(n + 1)’s. Second, for any fixed vector p∗ , if we construct a vector as such, (p∗1 , . . . , p∗j − 1, p∗j+1 + 1, . . . , p∗n+1 ) for a 0 < j ≤ n with nonzero p∗j , then the resulting vector will also be one of q(n + 1)’s. Using the second method, we may end up with the same q(n + 1) vector from different p∗ ’s. After discarding the same q(n + 1) vector from the second method, we will obtain all of Nn+1 q(n + 1)’s. That is, any vector of q(n + 1) can be generated from p∗ through either of the two methods mentioned above. Note that pi ’s (1 ≤ i ≤ n) and qj ’s (1 ≤ j ≤ n + 1) must be nonnegative integers. For simplicity, let us combine the two methods by allowing j = 0 and assuming a(−1) = 1 in the following. Suppose q = (q1 , . . . , qn+1 ) is generated from p∗ by subtracting 1 from p∗j and adding 1 to p∗j+1 . For such p∗ and q,
i=1
for k, d = 0, 1, 2, . . ..
Byn+1
703
p n a(i−1) i i =1
i!
=
p n n a(i−1) i i =1
i!
j =0
a(j)
pj (j−1) a
,
a(j) pk,j (j−1) a
.
(25)
We can also see that the following holds. For any j such that corresponding pk,j is nonzero (0 ≤ j ≤ n, pk,0 = 1), there exists a q(n + 1) = (q1 , . . . , qn+1 ) such that
(i−1) pk,i n n +1 1 a a(j) 1 pk,j (j−1) = (j + 1)qj+1 p ! i ! a q ! i=1 k,i i=1 i (i−1) qi a × . i!
(26)
704
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
Moreover, when we look at the two summations, n
Nn
k=1 and as a whole in (25), it can be shown that any nonzero
as noted in the theorem, the expansion of (18) follows immediately.
qj+1 ’s (0 ≤ j ≤ n) from all of q(n + 1)’s are covered by the relation (26). Thus, we have
Proof of Theorem 2. The expansion of Hermite moments is simply obtained by interchanging the order of summations with respect to the index variables k and j in (20) and by applying the fact
j=0,pk,j ̸=0 ,
Byn+1 =
n!
q(n+1)
Since
(i−1) qi n n +1 a 1 . (j + 1)qj+1 q! i! j =0 i=1 i
E [Hj (S )Hr (S )] =
n +1 n j=1 jql,j = n + 1, the result follows: j=0 (j + 1)ql,j+1 =
Proof of Theorem 1. Since Dy = ∆−1/2 Dz , we have −d D2k y
Hj (z ) = ∆
d/2−k
−d D2k Hj z
Diz Hj
(z ) =
j = r, j ̸= r
(29)
sj j!
1
2
= exp s z − s 2
.
Since
j!
(j − i)!
Hj−i (z ),
i≤j
0,
∞ 1 s2r +q exp{s Z∆ } = exp s · H2r +q (Z∆ ) , 2 (2r + q)! r =0 q=0
i>j
1
2
the result directly follows. The expansion of moments m2r +q is obtained from
we have
Aky Hj (z (y)) =
2k
wk,d ∆d/2−k
d=2k−j
j! Hj−2k+d (z )
(j − 2k + d)!
m2r +q (y0 ) = E {Z∆ }2r +q |y0 = Ds2r +q M (s; y0 ) s=0 . Since
.
Since wk,d (y) = 0 for d < 0, terms with d < 0 are 0 in the above formula and we need to consider only cases of 2k − j ≥ 0. By substituting this for (9) and (10), we obtain the followings: q∆ (J ,K ) (y0 , z )
Hj (z )
j=0
and by applying the relation
r !, 0,
for a random variable S following the standard normal distribution. The moment generating function M (s) is obtained by using the fact ∞
(z ),
×
Ds2r +q s2j+2m+p =
= φ(z )
2K
∆d/2
d=0 M (J ,d,k)
×
wk,d (y0 )∆d/2
j! Hj−2k+d (0)
K
(2j + 2m + p)! s2j+2m+p−2r −q , (2j + 2m + p − 2r − q)! 2r +q
when we substitute s = 0 into Ds M (s; y0 ), all terms vanish except the term with m = r − j ≥ 0 and p = q. This leads to the result.
ζ0,2 (y) = a2 + a′
Hj (z )
(27)
ζ1,1 (y) = aa′ + a′′ /2 ζ0,3 (y) = a3 + 3aa′ + a′′ ζ1,2 (y) = 3a2 a′ + 2(a′ )2 + 7aa′′ /2 + a(3)
wk,d (y0 )
ζ0,4 (y) = a4 + 6a2 a′ + 3(a′ )3 + 4aa′′ + a(3)
k=[d/2]+
(28)
where M (J , d, k) = [(J + d)/2 − k]− . The new index variable m replaces j, with 2m = j − 2k + d. By substituting (23) for wk,d (y0 ) in (28) and interchanging the order of summations, (17) is readily obtained. Since
ˆ K ,d,r (z ; K ) = H2r −d (z ) H
s2j+2m+p ∆k+p/2
ζ0,1 (y) = a
(−1)m H2m+2k−d (z ) k!2m m! m=0
ζk−j,2j+p (y0 )
2m m!(k + j + p)! j=0 p=0 m=0
Examples of zeta: Here are examples of ζl,n to the order of ∆7/2 in the delta expansion. The dependency of a and its derivatives on y is suppressed for notational convenience. For higher orders of l and n (2l + n ≥ 6), the R-code available on the internet5 shows the detailed expression
otherwise,
j!k!(j − 2k + d)! j=2k−d
,
Since
d=0 k=[d/2]+ J
+ O(∆K +1 ).
we have
2m m!
K k 1 ∞ k=0
j = 2m ≥ 0,
2K K
m=0
M (s; y0 ) =
Now, by interchanging the order of summations and using the following property of the Hermite polynomial,
q∆ (J ,K ) (y0 , z ) = φ(z )
∞ s2m
we have
J K ∆k 1 k · [Ay Hj (z (y))]y0 Hj (z ) = φ(z ) j! k! k=0 j =0 J K 2k 1 · wk,d (y0 )∆d/2 = φ(z ) j ! k=0 d=2k−j j=0 j! Hj−2k+d (0) × Hj (z ). k!(j − 2k + d)!
(2m)! (−1)m , 2m m! Hj (0) = 0,
exp{s2 /2} =
ζ2,1 (y) = a(a′ )2 + a2 a′′ + 3a′ a′′ /2 + aa(3) + a(4) /4 ζ1,3 (y) = 6a3 a′ + 14a(a′ )2 + 11a2 a′′ + 12a′ a′′ + 7aa(3) + 3a(4) /2 ζ0,5 (y) = a5 + 10a3 a′ + 15a(a′ )2 + 10a2 a′′ + 10a′ a′′ + 5aa(3) + a(4)
5 http://blog.naver.com/widylee.
Y.D. Lee et al. / Journal of Econometrics 178 (2014) 694–705
ζ2,2 (y) = 7a (a ) + 4(a ) + 4a a + 22(a ) + 21aa a /4 + 7a2 a(3) + 8a′ a(3) + 4aa(4) + 3a(5) 2
′ 2
′ 3
3 ′′
′′ 2
′ ′′
ζ1,4 (y) = 10a4 a′ + 50a2 (a′ )2 + 20(a′ )3 + 25a3 a′′ + 90(a′′ )2 + 17aa′ a′′ + 25a2 a(3) + 26a′ a(3) + 23aa(4) /2 + 2a(5) ζ0,6 (y) = a6 + 15a4 a′ + 45a2 (a′ )2 + 15(a′ )3 + 20a3 a′′ + 60(a′′ )2 + 10aa′ a′′ + 15a2 a(3) + 15a′ a(3) + 6aa(4) + a(5) ζ3,1 (y) = a(a′ )3 + 4a2 a′ a′′ + 7a2 a′′ /2 + 7a(a′′ )2 /2 + a3 a(3) + 11aa′ a(3) /2 + 11a′′ a(3) /4 + 3a2 a(4) /2 + 7a′ a(4) /4 + 3aa(5) /4 + a(6) /8 ζ2,3 (y) = 25a3 (a′ )2 + 50a(a′ )3 + 10a4 a′′ + 125a2 a′ a′′ + 80a2 a′′ + 257a(a′′ )2 + 25a3 a(3) + 99aa′ a(3) + 40a′′ a(3) + 47a2 a(4) /2 + 25a′ a(4) + 39aa(5) /4 + 3a(6) /2 ζ1,5 (y) = 15a5 a′ + 130a3 (a′ )2 + 165a(a′ )3 + 95a4 a′′ /2 + 360a2 a′ a′′ + 395a2 a′′ /2 + 142a(a′′ )2 + 65a3 a(3) + 216aa′ a(3) + 155a′′ a(3) /2 + 93a2 a(4) /2 + 95a′ a(4) /2 + 17aa(5) + 5a(6) /2 ζ0,7 (y) = a7 + 21a5 a′ + 105a3 (a′ )2 + 105a(a′ )3 + 35a4 a′′ + 210a2 a′ a′′ + 105a2 a′′ + 70a(a′′ )2 + 35a3 a(3) + 105aa′ a(3) + 35a′′ a(3) + 21a2 a(4) + 21a′ a(4) + 7aa(5) + a(6) . References Aït-Sahalia, Y., 1999. Transition densities for interest rate and other nonlinear diffusions. Journal of Finance 54, 1361–1395. Aït-Sahalia, Y., 2002. Maximum-likelihood estimation of discretely-sampled diffusions: a closed-form approximation approach. Econometrica 70, 223–262. Aït-Sahalia, Y., 2008. Closed-form likelihood expansions for multivariate diffusions. The Annals of Statistics 36 (2), 906–937. Aït-Sahalia, Y., Yu, J., 2006. Saddlepoint approximations for continuous-time Markov processes. Journal of Econometrics 134, 507–551. Bakshi, G., Ju, N., 2005. A refinement to Aït-sahalia’s (2002) maximum likelihood estimation of discretely sampled diffusions: a closed-form approximation approach. Journal of Business 78, 2037–2052. Bakshi, G., Ju, N., Ou-Yang, H., 2006. Estimation of continuous-time models with an application to equity volatility. Journal of Financial Economics 82, 227–249. Barndorff-Nielsen, O.E., Cox, D.R., 1989. Asymptotic Techniques for Use in Statistics. Chapman and Hall, London. Beskos, A., Papaspiliopoulos, O., Robert, G.O., Fearnhead, P., 2006a. Retrospective exact simulation of diffusion sample paths with applications. Bernoulli 12, 1077–1098.
705
Beskos, A., Papaspiliopoulos, O., Robert, G.O., Fearnhead, P., 2006b. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes. The Journal of the Royal Statistical Society, Series B 68, 333–383. Beskos, A., Papaspiliopoulos, O., Robert, G.O., Fearnhead, P., 2009. Monte Carlo maximum likelihood estimation for discretely observed diffusion processes. Annals of Statistics 37, 223–245. Chan, K.C., Karolyi, G.A., Longstaff, F.A., Sanders, A.B., 1992. An empirical comparison of alternative models of the short-term interest rate. Journal of Finance 47, 1209–1227. Chang, J., Chen, S.X., 2011. On the approximate maximum likelihood estimation for diffusion processes. The Annals of Statistics 39, 2820–2851. Choi, S., 2011. Closed-form expansions for multivariate time-inhomogeneous diffusions. Journal of Econometrics 174, 45–65. Dacunha-Castelle, D., Florens-Zmirou, D., 1986. Estimation of the coefficients of a diffusion from discrete observations. Stochastics 19, 263–284. DasGupta, A., 2008. Asymptotic Theory of Statistics and Probability. Springer, New York. Di Pietro, M., 2001. Bayesian inference for discretely sampled diffusion processes with financial applications, Ph.D. Thesis, Department of Statistics, CarnegieMellon University. Egorov, A.V., Li, H., Xu, Y., 2003. Maximum likelihood estimation of time inhomogeneous diffusions. Journal of Econometrics 114, 107–139. Hall, P., 1995. The Bootstrap and Edgeworth Expansion. In: Springer Series in Statistics, Springer-Verlag, New York. Hurn, A., Jeisman, J., Lindsay, K., 2007. Seeing the wood for the trees: a critical evaluation of methods to estimate the parameters of stochastic differential equations. Journal of Financial Econometrics 5, 390–455. Jensen, B., Poulsen, R., 2002. Transition densities of diffusion processes: numerical comparison of approximation techniques. Journal of Derivatives 9, 18–32. Kessler, M., 1997. Estimation of an ergodic diffusion from discrete observations. Scandinavian Journal of Statistics 24:2, 211–229. Kloeden, P., Platen, E., 1999. Numerical Solution of Stochastic Differential Equations. Springer. Lee, E-K., Choi, Y., Lee, Y.D., 2010. A note on series approximation of transition density of diffusion processes. The Korean Journal of Applied Statistics 23 (2), 383–392 (in Korean). Li, M., 2010. A damped diffusion framework for financial modeling and closed-form maximum likelihood estimation. Journal of Economic Dynamics and Control 34, 132–157. McCullagh, P., 1987. Tensor Methods in Statistics. Chapman and Hall. Øksendal, B., 2003. Stochastic Differential Equations: An Introduction with Applications. Springer. Papaspiliopoulos, O., 2011. Monte Carlo probabilistic inference for diffusion processes: a methodological framework. In: Bayesian Time Series Models. Cambridge University Press, pp. 82–99. Papaspiliopoulos, O., Roberts, G.O., 2012. Importance sampling techniques for estimation of diffusion models. In: Statistical Methods for Stochastic Differential Equations, Monographs on Statistics and Applied Probability. Chapman and Hall, pp. 311–337. Rogers, L., 1985. Smooth transitional densities for one-dimensional diffusions. Bulletin of the London Mathematical Society 17, 157–161. Stramer, O., Bognar, M., Schneider, P., 2010. Bayesian inference for discretely sampled Markov processes with closed-form likelihood expansions. Journal of Financial Econometrics 8, 450–480. Yu, J., 2007. Closed-form likelihood approximation and estimation of jumpdiffusions with an application to the realignment risk of the Chinese Yuan. Journal of Econometrics 141, 1245–1280.