Convergence rate of weak Local Linearization schemes for stochastic differential equations with additive noise

Convergence rate of weak Local Linearization schemes for stochastic differential equations with additive noise

Journal of Computational and Applied Mathematics 279 (2015) 106–122 Contents lists available at ScienceDirect Journal of Computational and Applied M...

476KB Sizes 0 Downloads 35 Views

Journal of Computational and Applied Mathematics 279 (2015) 106–122

Contents lists available at ScienceDirect

Journal of Computational and Applied Mathematics journal homepage: www.elsevier.com/locate/cam

Convergence rate of weak Local Linearization schemes for stochastic differential equations with additive noise J.C. Jimenez a,∗ , F. Carbonell b a

Instituto de Cibernetica, Matematica y Fisica, Calle 15, No. 551, entre C y D, Vedado, La Habana 10400, Cuba

b

Biospective Inc., Montreal, Canada

article

info

Article history: Received 8 March 2014 Received in revised form 17 August 2014 Keywords: Local Linearization schemes Stochastic differential equations Additive noise Numerical integration

abstract There exists a diversity of weak Local Linearization (LL) schemes for the integration of stochastic differential equations with additive noise, which differ in the algorithms employed for the numerical implementation of the weak Local Linear discretizations. Despite convergence results for these discretizations have been already developed, the convergence of the weak LL schemes has not been considered up to date. In this work, a general result concerning the convergence rate of the weak LL schemes is presented, as well as specificities for a number of particular schemes. As an application, the convergence of weak LL schemes for equations driven by Poisson processes is presented in addition. © 2014 Elsevier B.V. All rights reserved.

1. Introduction The evaluation of Wiener functional space integrals and the estimation of diffusion processes are essential matters for the resolution of a number of problems in mathematical physics, biology, finance and other fields. In the solution of this kind of problems, weak numerical integrators for Stochastic Differential Equations (SDEs) have become an important tool [1–5]. Well-known are, for instance, the Euler, the Milstein, the Talay–Tubaro extrapolation, the Runge–Kutta and the Local Linearization methods (see [6] for a review of these methods). Specifically, weak Local Linearization (LL) schemes for SDEs with additive noise have played a prominent role in the construction of effective inference methods for SDEs [7–11], in the estimation of distribution functions in Monte Carlo Markov Chain methods [12–14] and the simulation of likelihood functions [15]. Extensive simulation studies carried out in these papers have showed that these LL schemes possess high numerical stability and remarkable computational efficiency. Other distinctive feature of the weak LL integrators is that they preserve the ergodicity and geometric ergodicity properties of a wide class of nonlinear SDEs [14]. This paper deals with an open problem related with the weak Local Linearization method. It has been proved that the order-β weak Local Linear discretization is the base for the construction and study of such a method [16]. Based on this discretization, a variety of numerical schemes can be derived, which mainly differ in the algorithm employed for the numerical implementation of the discretization. This feature provides flexibility to the LL method for suitable adjustments when it is applied to certain types of equations (e.g., large systems of SDEs, etc.). However, in contrast with the weak Local Linear discretization, the convergence rate of LL schemes has not been considered so far. This essential issue must be addressed for developing computationally efficient weak LL schemes.



Corresponding author. E-mail addresses: [email protected] (J.C. Jimenez), [email protected] (F. Carbonell).

http://dx.doi.org/10.1016/j.cam.2014.10.021 0377-0427/© 2014 Elsevier B.V. All rights reserved.

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

107

In this work, a main theorem about the convergence rate of weak LL schemes for SDEs with additive noise is derived. Based on this general result, convergence results for some specific LL schemes like those based on Padé and Krylov–Padé approximations are also obtained. As straightforward application, the convergence rate of some weak LL schemes for stochastic differential equations with jumps is also presented. Numerical simulations are also given in order to illustrate the proven theoretical results and the feasibility of the numerical schemes. A summary of basic results about the LL method is presented for supporting the subsequent presentation. 2. Notations and preliminaries Let (Ω , F , P ) be a complete probability space, and {Ft , t ≥ t0 } be an increasing right continuous family of complete sub σ -algebras of F . Consider a d-dimensional diffusion process x defined by the following stochastic differential equation with additive noise dx(t ) = f(t , x(t ))dt +

m 

gi (t )dwi (t )

(1)

i =1

x(t0 ) = x0 ,

(2)

where the drift coefficient f : [t0 , T ] × Rd → Rd and the diffusion coefficient gj : [t0 , T ] → Rd are differentiable functions, w = (w1 , . . . , wm ) is an m-dimensional Ft -adapted standard Wiener process, and x0 is a Ft0 -measurable random vector. The standard conditions for the existence and uniqueness of a solution for (1)–(2) are assumed. Consider the time discretization (t )h = {tn : n = 0, 1, . . . , N }, with maximum step-size h ∈ (0, 1), defined as a sequence of F -stopping times that satisfy t0 < t1 < · · · < tN = T and supn (hn ) ≤ h, w.p.1, where tn is Ftn -measurable for each n = 0, 1, . . . , N, and hn = tn+1 − tn . In addition, let us denote nt = max{n = 0, 1, 2, . . . : tn ≤ t and tn ∈ (t )h } for all t ∈ [t0 , T ]. 2.1. Weak Local Linear discretization The following basic results about the Local Linearization method are taken from [16]. Definition 1. For a given time discretization (t )h , the order-β (=1, 2) weak Local Linear discretization of the solution of (1)–(2) is defined by the recurrent relation yn+1 = yn + φβ (tn , yn ; hn ) + η(tn , yn ; hn ),

(3)

where

φβ (tn , yn ; hn ) =

hn



efx (tn ,yn )(hn −s) (fx (tn , yn )yn + aβn (tn + s))ds,

(4)

0

and η (tn , yn ; h) is a zero mean Gaussian random variable with variance

6(t , y; δ) =

δ



|

efx (t ,y)(δ−s) G(t + s)G| (t + s)efx (t ,y)(δ−s) ds.

(5)

0

Here, G(u) = [g1 (u), . . . , gm (u)] is a d × m matrix, β

a n ( u) =

f(tn , yn ) − fx (tn , yn )yn + ft (tn , yn )(u − tn )

  

for β = 1

m

1

1  an (u) + 2

(Id ⊗ g|j (tn ))fxx (tn , yn )gj (tn ) (u − tn )

for β = 2,

j =1

fx , ft denote the partial derivatives of f with respect to the variables x and t, respectively, fxx the Hessian matrix of f with respect to x, Id is the d-dimensional identity matrix, and the initial point y0 is assumed to be a Ft0 -measurable random vector. Denote by CPl the space of l time continuously differentiable functions with partial derivatives up to order l having polynomial growth. Theorem 2. Let x be the solution of the SDE (1)–(2), and y the order-β weak Local Linear discretization of x defined by (3). Suppose that the drift and diffusion coefficients of the SDE (1) satisfy the following conditions 2(β+1)

fk ∈ CP

|f(s, u)| +

2(β+1)

([t0 , T ] × Rd , R) and gki ∈ CP

m  i =1

|gi (s)| ≤ K (1 + |u|) ,

([t0 , T ], R)

(6) (7)

108

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

and

       ∂ f(s, u)   ∂ f(s, u)   ∂ 2 f(s, u)  2  + +   ∂ t   ∂ x   ∂ x2  δβ ≤ K

(8)

for all s ∈ [t0 , T ] and u ∈ Rd , where K is a positive constant. Further, suppose that the initial values of x and y satisfy E (|x0 |j ) < ∞,

E (|y0 |j ) < ∞,

|E (g (x0 )) − E (g (y0 ))| ≤ C0 hβ 2(β+1)

for j = 1, 2, . . . , some constant C0 > 0 and all g ∈ CP

(9)

(Rd , R). Then there exists a positive constant Cg such that

   E (g (x(T ))) − E g (yn )  ≤ Cg (T − t0 )hβ . T

(10)

By construction, the value yn+1 of the Local Linear discretization (3) is the weak solution of the piecewise linear SDE dz (t ) = (An z(t ) + aβn (t ))dt +

m 

gi (t )dwi (t ),

t ∈ (tn , tn+1 ],

(11)

i =1

z(tn ) = yn

(12)

at tn+1 for all tn+1 ∈ (t )h , where An = fx (tn , yn ). Hereafter, the following definitions and notations from [3] are required. Let M be the set of all the multi-indexes α = (j1 , . . . , jl(α) ) with ji ∈ {0, 1, . . . , m} and i = 1, . . . , l(α), where m is the dimension of w in (1). l(α) denotes the length of the multi-index α and n(α) the number of its zero components. −α and α− are the multi-indexes in M obtained by deleting the first and the last component of α , respectively. The multi-index of length zero will be denoted by v . Denote by Iα [·]tn ,tn +hn the multiple Ito integrals for all α ∈ M . Further, L0 =

d d m  ∂ 1   k l ∂2 ∂ + fk k + gj gj k l ∂t ∂x 2 k,l=1 j=1 ∂x ∂x k=1

denotes the diffusion operator for the SDE (1), and Lj =

d 

gkj

k=1

∂ , ∂ xk

for j = 1, . . . , m. Lemma 3. With β = 1, 2, let

Γβ = {α ∈ M : l(α) ≤ β} be a hierarchical set, and B (Γβ ) = {α ∈ M \Γβ : −α ∈ Γβ } the remainder set of Γβ . Further, let y = {y(t ), t ∈ [t0 , T ]} be the stochastic process defined as y(t ) = ynt + φβ (tnt , ynt ; t − tnt ) + η(tnt , ynt ; t − tnt ),

(13)

where the sequence {ynt }, n = 0, 1, . . . , is the Local Linear discretization (3), and let z = {z(t ), t ∈ [t0 , T ]} be the stochastic process defined by z(t ) = ynt +

 α∈Γβ /{ν}

Iα [Λα (tnt , ynt ; tnt , ytnt )]tnt ,t +

 α∈B (Γβ )

Iα [Λα (., y.; tnt , ynt )]tnt ,t ,

where, for all given (tnt , ynt ),

Λα (s, v; tnt , ynt ) =

Lj1 . . . Ljl(α)−1 pβ (s, v; tnt , ynt ) Lj1 . . . Ljl(α)−1 gjl(α) (s)



if jl(α) = 0 if jl(α) ̸= 0

is a function of s and v, and f(r , u) + fx (r , u)(v − u) + ft (r , u)(s − r ) m 1 pβ (s, v; r , u) = (Id ⊗ g|j (r ))fxx (r , u)gj (r ) (s − r ) p1 (s, v; r , u) +   2 j =1

  

for all r , s ∈ [t0 , T ], and u, v ∈ Rd . Then E (g (y(t ))) = E (g (z(t ))) , E g (y(t ) − y(tnt )) = E g (z(t ) − z(tnt ))









for β = 1 for β = 2

(14)

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122 2(β+1)

for all t ∈ [t0 , T ] and g ∈ CP

109

(Rd , R); and

Iα [Λα (tnt , ynt ; tnt , ynt )]tnt ,t = Iα [λα (tnt , ynt )]tnt ,t ,

(15)

for all α ∈ Γβ /{ν} and t ∈ [t0 , T ], where λα denotes the Ito coefficient function corresponding to the SDE (1). Note that, the stochastic process z defined in the previous lemma is the solution of the piecewise linear SDE (11)–(12) and Λα (·; tnt , ynt ) denotes the Ito coefficient functions corresponding to that equation. Therefore, (14) is the Ito–Taylor expansion of the process (13), which coincides with the Local Linear discretization (3) at each discretization time tn ∈ (t )h . 3. Weak Local Linearization schemes It can be noticed from its definition that the Local Linear discretization is still no tractable for numerical implementation purposes. The reason is that, in general, the integrals appearing in φβ and η cannot be analytically computed. Thus, depending on the way of computing these functions, different numerical schemes could be obtained. A precise definition for such schemes is the following. Definition 4. For a weak Local Linear discretization yn+1 = yn +φβ (tn , yn ; hn )+η(tn , yn ; hn ) of the SDE (1)–(2), any recursion of the form

 yn+1 =  yn +  φβ (tn , yn ; hn ) +  η(tn , yn ; hn ),

with  y0 = y0 ,

(16)

is called weak Local Linearization scheme, where  φβ and  η denote numerical algorithms to compute φβ and η respectively. ∂g

The first weak LL schemes were derived for autonomous equations SDE (i.e. ∂∂ tf = ∂ ti ≡ 0). Indeed, by integrating by parts in (4), the LL discretization (3) can be rewritten as yn+1 = yn + (fx (yn ))−1 (efx (yn )hn − Id )f(yn )

+

δβ2 2

(fx (yn ))−2 (efx (yn )hn − Id − fx (yn )hn )

m  (Id ⊗ g|j (tn ))fxx (tn , yn )gj (tn ) hn + η(tn , yn ; hn ),

(17)

j=1

which, for β = 1 and β = 2, were proposed in [17,18] and [7,19], respectively. For scalar SDEs with m constant diffusion coefficients g = [g1 , . . . , gm ] ∈ Rm , the variance of η(tn , ytn ; hn ) is given by

6(tn , ytn ; hn ) = (2fx (yn ))−1 (e2fx (yn )hn − 1)gg| , which is obtained by integrating by parts in (5) [18,7]. For multidimensional autonomous SDEs the variance 6 of η(tn , ytn ; hn ) is approximated by the numerical solution of the pencil equation [19] fx (yn )6 + 6f|x (yn ) = efx (yn )hn GG| e(fx (yn ))

⊤h

n

− GG| ,

(18)

where G = [g1 , . . . , gm ] is a d × m matrix of constant entries. However, the numerical implementations  yn+1 of the LL discretization (17) (i.e., the corresponding LL schemes) are not always computationally feasible since they might eventually fail when the Jacobian matrix fx ( yn ) is singular or ill-conditioned at some point  yn . Moreover, Eq. (18) might have a non unique solution for some particular values of fx ( yn ). For nonautonomous equations, the LL discretization (3) can be written as [20] yn+1 = hn e

h fx (tn ,yn ) 2n

 f(tn , yn ) − fx (tn , yn )yn +

hn 2

ft (tn , yn ) +

m hn 

4 j=1

( Id ⊗

 | gj

(tn ))fxx (tn , yn )gj (tn )

1/2 + efx (tn ,yn )hn yn +  6 (tn , yn ; hn )ξn+1 + r(tn+1 ),

where the variance 6 of η(tn , ytn ; hn ) is approximated by

   | | h hn hn hn fx (tn ,yn ) 2n  6(tn , yn ; hn ) = hn e G tn + G tn + efx (tn ,yn ) 2 , 2

2

{ξn } is a sequence of d-dimensional i.i.d. Gaussian random vectors, and r is a remainder term. These expressions can be obtained after some algebraic manipulations in (3) and by using quadrature formulas for approximating the integrals (4) and (5) with β = 2. The remainder term r(tn+1 ) represents the error due to these approximations. The LL schemes  yn+1 that can be obtained by numerical implementations of the above expression for yn+1 (and neglecting r(tn+1 )) overcome the restrictions for the Jacobian matrix fx of the previous ones, but at expense of an additional approximation.

110

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

Alternatively, other types of more general weak LL schemes have be proposed [16]. For SDEs with constant diffusion coefficients, (i.e. Eqs. (1) with gi (t ) ≡ gi for all t ∈ [t0 , T ]), Theorem 1 in [21] yields analytical expressions for φβ and 6 in terms of a single exponential matrix. Indeed,

φβ (tn , yn ; hn ) =

hn



efx (tn ,yn )(hn −s) f(tn , yn )ds +

hn



0



s

efx (tn ,yn )(hn −s) bβ (tn , yn )ds1 ds

0

0

= D14 (tn , yn ; hn ),  hn  | | 6(tn , yn ; hn ) = efx (tn ,yn )(hn −s) GG| e−fx (tn ,yn ) s ds efx (tn ,yn ) hn 0

= D12 (tn , yn ; hn )D|11 (tn , yn ; hn ), where bβ (tn , yn ) = aβ (tn , yn ) − fx (tn , yn )yn , aβ is the function defined in (3), G = [g1 , . . . , gm ] is a d × m matrix, and the block matrix D = (Dlj ), l, j = 1, . . . , 4 is defined as D(tn , yn ; hn ) = eCβ (tn ,yn )hn , with fx (tn , yn ) 0  Cβ (tn , yn ) =  0 0

GG| −fx (tn , yn )| 0 0



bβ (tn , yn ) 0 0 0

f(tn , yn ) 0  . 1 0



Therefore, the LL discretization (3) can be rewritten as yn+1 = yn + D14 (tn , yn ; hn ) + (D12 (tn , yn ; hn )D11 (tn , yn ; hn ))1/2 ξn+1 , |

(19)

starting with y0 = x0 . Here, 61/2 denotes the square root matrix of 6 and {ξn } is a sequence of d-dimensional i.i.d. Gaussian random vectors. In general, for SDEs of the form (1) with no constant diffusion coefficients, the approximation G(tn + s) ≈ Gβ (tn + s),

s ∈ [0, hn ]

provided by the truncated Taylor expansion Gβ (tn + s) =

β−1 i  d G(tn )

dt i

i =0

si

has been considered [16]. This approximation implies that

6(tn , yn ; hn ) ≈ 6β (tn , yn ; hn ), where

6β (tn , yn ; hn ) =



hn

efx (tn ,yn )(hn −s) H0 (tn )e−fx (tn ,yn ) s ds |

0 2β−2

+

hn

 i =1

 s

0

0

s0

...

0



si−2



e

fx (tn ,yn )(hn −s)

−fx (tn ,yn )| s

Hi (tn )e

dsi−1 . . . ds0 ds efx (tn ,yn )

|

hn

,

(20)

0

with Hi (tn ) =

 dl G(tn ) dj G(tn ) | l +j =i

dt l

dt j

,

i = 0, . . . , 2β − 2.

Hence, by Theorem 1 in [22] it is obtained

φβ (tn , yn ; hn ) = B1,2β+2 (tn , yn ; hn ), 6β (tn , yn ; hn ) = B1,2β (tn , yn ; hn )B11 (tn , yn ; hn ), |

(21)

where the block matrix B = (Blj ) is defined as B(tn , yn ; hn ) = eAβ (tn ,yn )hn

(22)

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

111

with

f (t , y ) x n n 0   ..  .    0 Aβ (tn , yn ) =   ..   .  ..  .

H2β−2 (tn ) −fx (tn , yn )| .. .

−fx (tn , yn )|

0

0

0

H2β−3 (tn ) Id

.. . .. .

.. . .. .

0

0

H0 (tn ) 0

bβ (tn , yn ) 0

f(tn , yn ) 0 

··· ··· .. . .. . .. .

Id

.. . .. .

−fx (tn , yn )|

0

0

··· ···

0 0

0 0

1 0

.. .

.. . .. .

     .     

In this way, the LL discretization (3) can be rewritten as yn+1 = yn + B1,2β+2 (tn , yn ; hn ) + (B1,2β (tn , yn ; hn )B1,1 (tn , yn ; hn ))1/2 ξn+1 + r(tn , yn ; hn ), |

(23)

1/2

where r(tn , yn ; hn ) = 61/2 (tn , yn ; hn )ξn+1 − 6β (tn , yn ; hn )ξn+1 . Note that numerical implementations  yn+1 of the expressions (19) and (23) reduce to the use of a suitable algorithm for computing exponential matrices, for instance, those based on rational Padé approximations or Krylov subspace methods (see [23] for an updated review). Remarkably, these expressions have no restriction on the Jacobian matrix fx , and do not involve the use of quadrature formulas either. Furthermore, for weak convergence purposes, the sequence {ξn } of i.i.d. Gaussian random vectors can be replaced by any other sequence of random vectors with similar moment properties. Thus, in all numerical implementation of the LL discretization, {ξn } can be replaced by the sequence { ξn } of i.i.d. two-points distributed random vectors with components  ξnk satisfying P ( ξnk = ±1) = 1/2. 4. Convergence rate of the weak Local Linearization schemes Clearly, a weak LL scheme will preserve the order β of the underlaying LL discretization if  φβ and  6 are suitable approximations to φβ and 6. This requirement is considered in the following results. Lemma 5. Suppose that the drift and diffusion coefficients of the SDE (1)–(2) satisfy the conditions (7)–(8). Suppose also that

  φβ (tn , yn |) hnα+1 yn ; hn ) −  φβ (tn , yn ; hn ) ≤ K (1 + | and

  6(tn , yn ; hn ) −  6(tn , yn ; hn ) ≤ K (1 + | yn |) hγn +1 , for some positive constant K and natural numbers α and γ . If the initial value  y0 of a LL scheme  yn defined as in (16) has finite moments of all orders, then

 E

2j



max | yn | |Ft0

0≤n≤nT

  2r ≤ M 1 + | y0 |

for some positive constant M and natural numbers j and r. Proof. By using conditions (7)–(8) it is obtained that

  φβ (tn , yn ; hn ) ≤ C1 (1 + | yn |) hn and 2 |6(tn , yn ; hn )| ≤ C2 (1 + | yn |) hn

where C1 and C2 are positive constants. In this way,

       φβ (tn , yn ; hn ) ≤ φβ (tn , yn ; hn ) +  φβ (tn , yn ; hn ) − φβ (tn , yn ; hn ) ≤ K1 (1 + | yn |) hn

(24)

and

     6(tn , yn ; hn ) ≤ |6(tn , yn ; hn )| +  6(tn , yn ; hn ) − 6(tn , y n ; hn )  2 ≤ K2 (1 + | yn |) hn ,

where K1 = C1 + K and K2 = C2 + K .

(25)

112

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

From (24) it follows that

      E  φβ (tn , yn ; hn ) +  6(tn , yn ; hn )1/2 ξn+1 |Ftn  = E  φβ (tn , yn ; hn )|Ftn  ≤ K1 (1 + | yn |) hn ,

(26)

whereas (25) implies that

 2 |  6(tn , yn ; hn )1/2 ξn+1  = ξn+1  6(tn , yn ; hn )ξn+1  |       ≤ ξ 6(tn , yn ; hn ) |ξn+1 | n+1

2 ≤ |ξn+1 |2 K2 (1 + | yn |) hn

and so

   φβ (tn , yn ; hn ) +  6(tn , yn ; hn )1/2 ξn+1  ≤ M (ξn+1 ) (1 + | yn |) hn1/2 , (27) √ where M (ξn+1 ) = (K1 + K2 |ξn+1 |) is a random variable with finite moments of all orders.  2j  yn | exists and Inequalities (26)–(27), condition E (|y0 |j ) < ∞, for all j = 1, 2, . . . , and Lemma 9.1 in [24] imply that E | is uniformly bounded with respect to nT for all n = 0, . . . , nT , which directly implies the assertion of the theorem.  The main convergence result in this paper is the following. Theorem 6. Let x be the solution of the SDE (1)–(2), yn+1 = yn + φβ (tn , yn ; hn ) + 6(tn , yn ; hn )1/2 ξn+1 the weak Local Linear discretization defined in (3), and

 yn+1 =  yn +  φβ (tn , yn ; hn ) +  6(tn , yn ; hn )1/2 ξn+1 a numerical implementation of yn+1 , where  φβ and  6 denote numerical algorithms for computing φβ and 6. Suppose that  φβ  and 6 fulfill the local conditions

  1 φβ (tn , yn |) hα+ y n ; hn ) −  φβ (tn , yn ; hn ) ≤ K (1 + | , n

(28)

  6(tn , yn |)hγn +1 . yn ; hn ) −  6(tn , yn ; hn ) ≤ K (1 + |

(29)

and

Then, under the assumptions of Theorem 2, there exists a positive constant Cg such that

   E (g (x(T ))) − E g ( ynT )  ≤ Cg hmin{α,β,γ } , 2(β+1)

for all g ∈ CP

(Rd , R).

Proof. From Lemma 5 we have

 E

2j max | yn | |Ft0

0≤n≤nT



  2r ≤ K1 1 + | y0 |

for some positive constant K1 and natural numbers j and r. On the other hand, noted that

 yn+1 =  zn+1 +  φβ (tn , yn ; hn ) − φβ (tn , yn ; hn ) + ( 6(tn , yn ; hn )1/2 − 6(tn , yn ; hn )1/2 )ξn+1 ,

(30)

where

 zn+1 =  yn + φβ (tn , yn ; hn ) + 6(tn , yn ; hn )1/2 ξn+1 is the solution of the linear SDE d z(t ) = pβ (t , z(t ); tn , yn )dt + G(t )dw(t )  z(tn ) =  yn for all t ∈ [tn , tn+1 ] and tn ∈ (t )h , where the function pβ is defined as in Lemma 3. Thus,

 yn+1 − yn =  zn+1 − zn +  φβ (tn , yn ; hn ) − φβ (tn , yn ; hn ) + ( 6(tn , yn ; hn )1/2 − 6(tn , yn ; hn )1/2 )ξn+1 .

(31)

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

113

From this and the algebraic inequality (a + b)2j ≤ 22j−1 (a2j + b2j ) it is obtained that 2j E | yn+1 − yn | |Ftn





  2j   2j ≤ 22j−1 E | zn+1 − zn | |Ftn + 24j−2 E  φβ (tn , yn ; hn ) − φβ (tn , yn ; hn ) |Ftn   2j + 24j−2 E ( 6(tn , yn ; hn )1/2 − 6(tn , yn ; hn )1/2 )ξn+1  |Ftn .

(32)

By Theorem 4.5.4 in [3] it follows that 2j 2j E | zn+1 − zn | |Ftn ≤ K2 1 + | yn | hjn ,









(33) 2j(α+1/2)

where K2 is a positive constant. From condition (28), and by using that hn



2j E  φβ (tn , yn ; hn ) − φβ (tn , yn ; hn ) |Ftn





< 1, it is obtained that

  2j yn | hjn , ≤ K3 1 + |

(34)

where K3 = 22j−1 K . Furthermore, due to the perturbation bounds for the Cholesky and SVD factorizations (Theorems 2.2.1 and 3.2.1 in [25]) there exists a positive constant C such that

     6(tn , yn ; hn )1/2 − 6(tn , yn ; hn )1/2  ≤ C  6(tn , yn ; hn ) − 6(tn , yn ; hn ) .

(35)

2j(γ +1/2)

< 1 it is obtained that   2j  2j yn ; hn )1/2 −  6(tn , yn ; hn )1/2 )ξn+1  |Ftn ≤ 6(tn , E (6(tn , yn ; hn )1/2 −  6(tn , yn ; hn )1/2  E |ξn+1 |2j   2j yn | hjn , ≤ K4 1 + |

From this, condition (29) and taking in to account that hn

(36)

where K4 = 22j−1 (CK )2j E |ξn+1 |2j . Inequalities (32)–(34) and (36) yield 2j 2j yn+1 − yn | |Ftn ≤ K5 1 + | yn | hjn , E |









(37)

where K5 is a positive constant. In addition, by the triangular inequality, we have that

           E Fp (    I [λ ( t , y )] | F y − y ) − F α α n n tn ,tn+1 tn  ≤ e 1 + e 2 , n +1 n p    α∈Γβ /{ν} where

          Iα [λα (tn , yn )]tn ,tn+1  |Ftn  , zn+1 − yn ) − Fp  e1 = E Fp (   α∈Γβ /{ν}   e2 = E (Fp ( yn+1 − yn ) − Fp ( zn+1 − yn )|Ftn ) and λα denotes the Ito coefficient function corresponding to the SDE (1). Then, by applying Lemma 5.11.7 in [3] to Eq. (31) it is obtained

            2r E Fp (  ≤ K 1 + |    z − y ) − F I [ Λ ( t , y )] | F yn | hnβ+1 , n + 1 n p α α n n t , t t n n+1 n     α∈Γβ /{ν} which by Lemma 3 is equivalent to 2r



e1 ≤ K 1 + | yn |



1 hβ+ , n

where Λα denotes the Ito coefficient function corresponding to the SDE (31). From Lemma 10 in [16], inequalities (33) and (37), and the Cauchy–Buniakovski inequality, it follows that 2 e2 ≤ E | zn+1 − yn+1 | |Ftn

 

1/2

l(p)−1

·

1/4    1/4   4j l(p)−1−j E | zn+1 − yn | |Ftn E | yn+1 − yn | |Ftn j =0

  1/2 2 ≤ K6 1 + | yn | E | zn+1 − yn+1 | |Ftn , 

2r1

114

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

where K6 is positive constant and r1 a natural number. By using the expression (30), inequality (35) and conditions (28)–(29), it follows that 2 E | zn+1 − yn+1 | |Ftn





  2 ≤ 2E φβ (tn , yn ; hn ) −  φβ (tn , yn ; hn ) |Ftn   2 + 2E (6(tn , yn ; hn )1/2 −  6(tn , yn ; hn )1/2 )ξn+1  |Ftn   2 ≤ 2K 2 (1 + C 2 ) 1 + | yn | hn2 min{α+1,γ +1} ,

and so



2r2

e2 ≤ K7 1 + | yn |



{α+1,γ +1} hmin , n

where K7 is positive constant and r2 a natural number. Hence,

            2r E Fp (    yn+1 − yn ) − Fp Iα [λα (tn , yn )]tn ,tn+1 |Ftn  ≤ (K + K7 ) 1 + | yn | hn1+min{α,β,γ } ,    α∈Γβ /{ν} where r is a natural number. The proof concludes by applying Theorem 14.5.2 in [3].



In order to show the application of previous theorems let us first consider the numerical implementation of the LL discretization (23) by means of the Padé approximation with the ‘‘scaling and squaring’’ procedure [23]. Theorem 7. Let kn  B(tn , yn ; hn ) = (Pp,q (2−kn Aβ (tn , yn )hn ))2 ,

Aβ (tn , yn )hn , Aβ (tn , yn ) the matrix defined in (22), and  −k  1 2(β+1)   kn the smallest natural number such that 2 Aβ (tn , yn )hn ≤ 2 . If the drift and diffusion coefficients of (1) are of class CP

where Pp,q (2−kn Aβ (tn , yn )hn ) denotes the (p, q)-Padé approximation of e2

−kn

and have uniformly bounded second derivatives, then the error of the weak LL scheme |  yn+1 =  yn +  B1,2β+2 (tn , yn ; hn ) + ( B1,2β (tn , yn ; hn ) B1,1 (tn , yn ; hn ))1/2 ξn+1 ,

(38)

is given by

  E (g (x(T ))) − E (g ( ynT )) ≤ Cg hmin{β,p+q} , 2(β+1)

for all g ∈ CP

(Rd , R), where x is the solution of (1)–(2) and Cg is a positive constant.

Proof. Let L1 , R2β+2 , R2β and R1 be matrices such that yn ; hn ) = L1 B(tn , yn ; hn )R2β+2 , B1,2β+2 (tn , B1,2β (tn , yn ; hn ) = L1 B(tn , yn ; hn )R2β , and B1,1 (tn , yn ; hn ) = L1 B(tn , yn ; hn )R1 , where the matrix B(tn , yn ; hn ) is defined in (22). Lemma 4.1 in [26] implies that

    kn   B(tn , yn ; hn ) −  B(tn , yn ; hn ) = eAβ (tn ,yn )hn − (Pp,q (2−kn Aβ (tn , yn )hn ))2    p+q+1 p+q+1   ≤ cp,q kn , Aβ (tn , yn ) Aβ (tn , yn ) h , n

where cp,q (k, |X|) = α 2−k(p+q)+3 e(1+ϵp,q )|X| with α = and ϵp,q = α( 12 )p+q−3 . Since the drift and diffusion   coefficients of (1) have uniformly bounded second derivatives, there exists a positive constant K1 such that Aβ (tn , yn ) < K1 p!q! (p+q)!(p+q+1)!

and Aβ (tn , yn ) ≤ K1 (1 + | yn |) for all n, which implies that |B(tn , yn ; hn )| < ∞ and  B(tn , yn ; hn ) < ∞ for all n. Thus, from these two bounds for Aβ (tn , yn ) it is obtained that









    φβ (tn , yn ; hn ) −  φβ (tn , yn ; hn ) = L1 (B(tn , yn ; hn ) −  B(tn , yn ; hn ))R2β+2  ≤ K2 (1 + | yn |) hnp+q+1 ,

(39)

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

115

  p+q+1 where K2 = cp,q (0, K1 ) |L1 | R2β+2  K1 . Similarly, it is obtained that     | | 6β (tn , yn ; hn ) −  6β (tn , yn ; hn ) = B1,2β (tn , yn ; hn )B1,1 (tn , yn ; hn ) −  B1,2β (tn , yn ; hn ) B1,1 (tn , yn ; hn )   | | ≤ L1 (B(tn , yn ; hn ) −  B(tn , yn ; hn ))R2β R1 B| (tn , yn ; hn )L1    | | + L1 B(tn , yn ; hn )R2β R1 (B(tn , yn ; hn ) −  B(tn , yn ; hn ))| L1         ≤ |L1 |2 R2β  |R1 | B(tn , yn ; hn ) −  B(tn , yn ; hn ) · |B(tn , yn ; hn )| +  B(tn , yn ; hn ) ≤ K3 (1 + | yn |) hnp+q+1 , where K3 is a positive constant. On the other hand, by using the expressions for 6 and 6β and taking into account that the drift and diffusion coefficients of (1) have uniformly bounded second derivatives it follows that

      | 6(tn , yn ; hn ) − 6β (tn , yn ; hn ) ≤ Chn sup G(tn + s)G| (tn + s) − Gβ (tn + s)Gβ (tn + s) s∈[0,hn ]      ≤ Chn sup |G(tn + s)| + Gβ (tn + s) G(tn + s) − Gβ (tn + s) s∈[0,hn ]

1 yn |) hβ+ ≤ C (1 + | , n

where C is a positive constant. From the last two inequalities it follows that

      6(tn , y n ; hn ) −  6(tn , yn ; hn ) ≤ 6(tn , yn ; hn ) − 6β (tn , yn ; hn ) + 6β (tn , yn ; hn ) −  6(tn , y n ; hn )  {β+1,p+q+1} ≤ (C + K3 ) (1 + | yn |) hmin . n

The proof concludes by using Theorem 6 with inequalities (39)–(40).

(40)



In addition, directly from the proof of the previous theorem, the following results. Corollary 8. For SDEs with constant diffusion coefficients, the error of the weak LL scheme |  yn+1 =  yn +  D14 (tn , yn ; hn ) + ( D12 (tn , yn ; hn ) D11 (tn , yn ; hn ))1/2 ξn+1 ,

obtained from (19) with  D(tn , yn ; hn ) = (Pp,q (2−kn Cβ (tn , yn )hn ))

2kn

(41)

, is given by

   E (g (x(T ))) − E g ( ynT )  ≤ Cg hmin{β,p+q} 2(β+1)

for all g ∈ CP

(Rd , R), where Cg is a positive constant.

Here, it is worth to recall that LL schemes (38) and (41) based on the Padé approximation for exponential matrix are A-stable [27], preserve the ergodicity of linear SDEs [14,27], and are geometrically ergodic for some class of nonlinear SDEs [14]. However, due to the use of Padé approximations, these numerical schemes are not suitable for large dimensional systems of SDEs. Instead, LL schemes based on Krylov method for matrix exponential [23] are recommendable. With the purpose of constructing a LL scheme based on the Krylov method, the LL discretization (23) can be rewritten as yn+1 = yn + P| (tn , yn ; hn )R2β+2 + (P| (tn , yn ; hn )R2β R1 P(tn , yn ; hn ))1/2 ξn+1 , |

|

Aβ (tn ,yn )hn | L1

where P(tn , yn ; hn ) = e

and L1 , R1 , R2β , R2β+2 are matrices such that

B1,2β+2 (tn , yn ; hn ) = L1 eAβ (tn ,yn )hn R2β+2 , B1,2β (tn , yn ; hn ) = L1 eAβ (tn ,yn )hn R2β , and B1,1 (tn , yn ; hn ) = L1 eAβ (tn ,yn )hn R1 . |

A (t ,y )h

Hence, the Krylov methods for matrix exponential can be used to compute P(tn , yn ; hn ) = e β n n n L1 . In this case, the Krylov algorithm must be applied several times, namely, as many times as the number of different columns in the matrix | L1 . That is, d times, since that L1 = [Id 0d×(2β d−d+2) ]. Clearly, this procedure is computationally inefficient unless a specific | implementation of the Krylov algorithm for the special form of the columns of L1 cut down the computational cost. A suitable alternative to this could be a one involving a single application of the Krylov method. With that purpose, the matrix P(tn , yn ; hn ) can be vectorized and then, by applying the Krylov method to the expression |

I ⊗Aβ (tn , yn )hn

vec(P(tn , yn ; hn )) = e d

vec(L1 ), |

|

116

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

the LL scheme |  yn+1 =  yn +  P| (tn , yn ; hn )R2β+2 + ( P| (tn , yn ; hn )R2β R1 P(tn , yn ; hn ))1/2 ξn+1

(42)

can be obtained. Here, p,q

vec( P(tn , yn ; hn )) = kmn ,kn (hn , Id ⊗ Aβ (tn , yn ), vec(L1 )), |

|

p,q

kmn ,kn denotes the (mn , p, q, kn )-Krylov–Padé approximation defined as in [26]. More precisely, p,q

km,k (δ, A, v) = β Vm Fkp,q (δ Hm )e1 , where β = |v|, e1 is the m-dimensional unitary vector, and the matrices Vm = [v1 , . . . , vm ] and Hm are the orthonormal basis of the Krylov space Km = span(v, Av, . . . , Am−1 v) and the upper Hessenberg matrix, respectively, resulting both from k

the well known Arnoldi algorithm. In addition, Fkp,q (δ Hm ) = (Pp,q (2−k δ Hm ))2 denotes the (p, q)-Padé approximation with

‘‘scaling and squaring’’ procedure for the computation of eδ Hm , and k is the smallest natural number such that 2−k δ Hm  ≤ 21 . The numerical scheme (42) also seems to be computationally inefficient since it involves the exponential of a large | dimensional 2d(β d + 1) × 2d(β d + 1) matrix Id ⊗ Aβ (tn , yn ). However, since this matrix has a block diagonal structure | yn ), one would be able to save a lot of computer storage capacity with an adequate algorithmic with diagonal entries Aβ (tn ,



|

implementation. In addition, the number mn of Krylov subspaces necessary to compute e

(Id ⊗Aβ (tn , yn ))hn

|

A (t , y )h e β n n n 1.



vec(L1 ) would have |

the same order of magnitude than that needed for computing Since mn ≪ d in typical situations, LL scheme (42) could result in a feasible algorithm. Nevertheless, an alternative scheme involving a single application of the Krylov method to a lower dimensional matrix can be derived by using the procedure developed in [28,29], which provides explicit formulas for the mean and variance of linear SDEs. In detail, the term φβ (tn , yn ; hn ) in the LL discretization (3) can be written as

φβ (tn , yn ; hn ) = LeCβ (tn ,yn )hn r, with

 Cβ (tn , yn ) =

fx (tn , yn ) 0 0

bβ (tn , yn ) 0 0

f(tn , yn ) 1 , 0



L = [Id 0d×2 ] and r| = [01×(d+1) 1]. Additionally, by applying the vectorization operator to the expression (20) we obtain vec(6β (tn , yn ; hn )) =



hn

e(fx (tn ,yn )⊕fx (tn ,yn ))(hn −s) vec(H0 (tn ))ds

0 2β−2

+

hn

 i =1

0

 s 0

s0

...

0

si−2



 (fx (tn ,yn )⊕fx (tn ,yn ))(hn −s)

e

vec(Hi (tn ))dsi−1 . . . ds0 ds ,

0

where ⊕ denotes the Kronecker sum operator. According to Theorem 1 in [22], the previous expressions for φβ (tn , yn ; hn ) and vec(6β (tn , yn ; hn )) can simultaneously be computed by means of a single exponential matrix. That is, φβ (tn , yn ; hn ) = L1 eMβ (tn ,yn )hn u and vec(6β (tn , yn ; hn )) = L2 eMβ (tn ,yn )hn u, with

f (t , y ) ⊕ f (t , y ) x n n x n n 0    0  Mβ (tn , yn ) =   0   ..  .

0 Cβ (tn , yn )

vec(H2β−2 (tn )) 0

vec(H2β−3 (tn )) 0

0

0

1

0

0

..

.. .

0

..

..

0

0

0

.. .

0

.

··· ··· .. .

vec(H0 (tn )) 0 



.. .

.

0

.

1 0

0

   ,    

u| = [01×d2 r| 01×2β−2 1], L1 = [0d×d2 L 0d×(2β−1) ] = [0d×d2 Id 0d×(2β+1) ], L2 = [Id2 0d2 ×(d+2β+1) ]. This yields the LL scheme

 yn+1 =  yn + L1 Q(tn , yn ; hn ) + (vec−1 (L2 Q(tn , yn ; hn )))1/2 ξn+1 ,

(43)

where p,q  Q(tn , yn ; hn ) = kmn ,kn (hn , Mβ (tn , yn ), u),

denotes the Krylov–Padé approximation to Q = eMβ (tn ,yn )hn u, and vec−1 denotes the inverse of the vectorization operator in the sense that it transform a d2 -dimensional vector into a d × d matrix.

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

117

From practical viewpoint, it is clear that the feasibility and computational efficiency (in time and memory) of these LL schemes based on the Krylov–Padé approximation will depend on a careful implementation of such approximation that takes into account the especial structure of the involved matrices. Next theorem illustrates the application of the general convergence result given by Theorem 6 to a particular LL scheme based on the Krylov–Padé approximation. 2(β+1)

Theorem 9. with uniformly bounded second derivatives and  If the drift  and diffusion coefficients of (1) are of class CP mn ≥ 2hn Mβ (tn , yn )2 for all n, then the error of the weak LL scheme (43) is given by

  E (g (x(T ))) − E (g ( ynT ))2 ≤ Cg hmin{β,m−1,p+q} , 2(β+1)

for all g ∈ CP

(Rd , R), where m = min{mn }, Cg is a positive constant, and |·|2 denotes the Euclidean norm.

Proof. According to Lemma 4.2 in [26] we have

    p,q Q(tn , yn ; hn ) −  Q(tn , yn ; hn )2 = eMβ (tn ,yn )hn u − kmn ,kn (hn , Mβ (tn , yn ), u)2   min{mn ,p+q+1}   ≤ Cmp,nq,kn 1, Mβ (tn , yn )2 hn Mβ (tn , yn )2 , p,q

1 m where Cm,κ (β, ρ) = β cp,q (κ, ρ) + 12β em−ρ ( m ) with cp,q (k, ρ) = α 2−k(p+q)+3 e(1+ϵp,q )ρ , α = (p+q)!(p+q+1)! and ϵp,q = 1 p+q−3 . Since the drift and diffusion coefficients of (1) have uniformly bounded second derivatives, there exists a positive α( 2 ) p!q!

yn |) for all n, which implies that |Q(tn , yn ; hn )|2 < ∞ constant M such that Mβ (tn , yn )2 < M and Mβ (tn , yn )2 ≤ M (1 + |

    Q(tn , yn ; hn )2 < ∞ for all n. and 





Thus, from these two alternative bounds for Mβ (tn , yn ) it is obtained that

    φβ (tn , yn ; hn ) −  φβ (tn , yn ; hn )2 = L1 Q(tn , yn ; hn ) − L1 Q(tn , yn ; hn )2   2 1/2 min{m,p+q+1} yn |2 ≤ M1 1 + | hn , 1 m where m = min{mn } and M1 = (cp,q (0, M ) + 12em+M ( m ) ) |L1 | M min{m−1,p+q} . Similarly, it is obtained that

    6β (tn , yn ; hn )) − vec−1 (L2 Q(tn , yn ; hn ))2 yn ; hn ) −  6(tn , yn ; hn )2 = vec−1 (L2 Q(tn ,   2 1/2 min{m,p+q+1} yn |2 ≤ M2 1 + | hn , 1 m where M2 = (cp,q (0, M ) + 12em+M ( m ) ) |L2 | M min{m−1,p+q} . The proof concludes by using Theorem 6 with last two inequalities.



In the theorem above the restriction to the norm |·|2 results from the condition mn ≥ 2hn Mβ (tn , yn )2 , which is required to assure the convergence of the Krylov–Padé approximation to the exponential matrices. Nevertheless, depending on the class of the matrix Mβ (tn , yn ) and/or the location and shape of its spectrum (see, e.g., [30] and references therein), such restriction might be discarded.





5. Extension to SDEs with jumps Consider a d-dimensional jump diffusion process z defined by the SDE dz(t ) = f(t , z(t ))dt +

m 

gi (t )dwi (t ) +

i =1

z(t0 ) = x0 ,

p 

hi (t , z(t ))dqi (t )

(44)

i=1

(45)

where f, gi , w are defined as in (1), hi : [t0 , T ] × Rd → Rd is a function, and qi (t ) is a Ft -adapted Poisson counting process ni (t ) with intensity µi . It is assumed that wi (t ) and qj (t ) are all independent with zero probability of simultaneous jumps for all t. Further, let us consider the sequence of jump times {σ }µi = {σi,n : n = 0, 1, 2, . . .} associated to qi (t ), which is defined as an increasing sequence of random variables such that σi,n+1 − σi,n is exponentially distributed with parameter µi , for all n and i. It is assumed that {σ }µi ⊂ (t )h for all i = 1, . . . , p, where (t )h is a time discretization defined as before. Definition 10 ([31]). For a given time discretization (t )h , the order-β weak Local Linear discretization of the jump diffusion process z is defined by the recursive relation zn = zn− +

p 

hi (tn , zn− )1nin ,

(46)

i =1

where zn− denotes the value yn of an order-β weak Local Linear discretization of diffusion process defined by (1) on [tn−1 , tn ] with initial condition x(tn−1 ) = zn−1 , and 1nin is the increment of the process nin at the time instant tn .

118

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

According to the results of the subsection above, it is easy to realize that numerical implementations of weak Local Linear discretization for SDE with jumps involve the use of weak LL schemes for equations with no jumps. Indeed, an weak Local Linearization scheme  zn for the integration of the SDE with jumps (44)–(45) can be defined as

 zn =  zn− +

p 

hi (tn , zn− )1nin ,

(47)

i =1

where  zn− denotes the value  yn of an order-β weak LL scheme for the SDE (1) on [tn−1 , tn ] with initial condition z(tn−1 ) =  yn−1 , and 1nin is the increment of the process nin at the time instant tn .

In order to study the convergence of the LL scheme (47) the following result is useful, which is a straightforward consequence of Theorem 6. Corollary 11. Under conditions of Theorem 6, all LL scheme for the SDE (1)–(2) defined as in (16) satisfies the hypothesis of Theorem 14.5.2 in [3]. The main convergence result is the following.

Theorem 12. Let  zn be the order-β LL scheme defined in (47) for the SDE with jumps (44)–(45). Suppose that the functions hi defined in (44) satisfy the linear growth bound

|hi (t , u)| ≤ K (1 + |u|) for t ∈ [t0 , T ] and u ∈ Rd . Then, under conditions of Theorem 6, there exists a positive constant Cg such that

   E (g (z(T ))) − E g ( znT )  ≤ Cg (T − t0 )hβ 2(β+1)

for all g ∈ CP

(Rd , R).

Proof. It directly follows from the definition (47), Corollary 11, and Theorem 13.6.1 in [32].



Here it is worth to mention that the above definitions and results can be easily adapted for SDEs driven for nonhomogeneous and/or compensated Poisson processes as well. 6. Numerical simulations In this section, numerical simulations are given in order to illustrate the proven theoretical results and the feasibility of the numerical schemes. Specifically, let us consider the numerical schemes (38) and (43) based on the (p, q)-Padé and (m, p, q)-Krylov–Padé approximations with different values of m, p and q. For the Padé approximation, the classical algorithm 11.3.1 in [33] was implemented, whereas for the Krylov–Padé approximation the implementation given in [30] was used. All the simulations were carried out in Matlab2011b. Example 1. Consider the 2-dimensional SDE

 

x d 1 x2



  1 =  1 2  dt + dw(t ), 2

x(0) =



0

0

x1

t ∈ [0, 1]

(48)

 

0 , 0

which corresponds to Example 17.1.1 in [3]. It turns out (see Exercise 17.1.2 in [3]) that for the function g (x) = e−x2 ,



1

E (g (x(1))) = E e− 2

 =

1 0

2e 1 + e2

(w(s))2 ds



.

The quantity

 e(δ) = |E (g (x(1))) − E (g (y(1)))| was used to estimate the order min{β, p + q} of weak convergence of the LL schemes LLβ − P (p, q) defined by (38) with different values of β , p and q. Here, y(t ), with t ∈ (t )δ , denotes the simulated weak solution of Eq. (48) obtained from any of these LL schemes with step size δ . In particular, we have used (p, q) = (1, 1) and (p, q) = (6, 6) for the Padé approximation, and the two possible values of β . For each scheme, the value E (g (y(1))) was estimated by the average over 10 000 simulated solutions. A number of 2000 of such values were computed. We arranged them into M = 20 batches with K = 100 values of E (g (y(1))) each, for computing the errors  ei,j (δ), i = 1, . . . , M; j = 1, . . . , K . The sample mean error of the ith batch

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

119

Table 1 Global discretization errors and their respective 90% confidence intervals for the LL schemes defined by (38) with different values of β , p and q.

δ

LL1 − P (1, 1)

LL1 − P (6, 6)

LL2 − P (1, 1)

LL2 − P (6, 6)

0.00707225 0.00334453 0.00161903 0.00092813

0.00352140 0.00069553 0.00017945 0.00006308

0.00176849 0.00035382 0.00008744 0.00002991

Confidence interval size ∆(δ) 2−4 0.00007252 0.00004896 2−5 0.00007320 0.00004251 2−6 0.00006179 0.00003934 2−7 0.00003577 0.00002245

0.00007342 0.00002444 0.00001199 0.00000173

0.00002708 0.00000944 0.00000441 0.00000131

Mean error  e(δ)

2−4 2−5 2−6 2−7

0.01024173 0.00507424 0.00255160 0.00150473

Fig. 1. Estimated value of the error  e(δ) in the integration of Eq. (48) by the order β = 1, 2 LL schemes defined by (38) with different values of (p, q) and various step sizes δ .

and of all batches were computed by

 ei (δ) =

K 1 

K j=1

 ei,j (δ),

and  e(δ) =

M 1 

M i=1

 ei (δ),

respectively. Hence, the confidence interval for  e(δ) is given by

[ e(δ) − ∆(δ), e(δ) + ∆(δ)], where

 ∆(δ) = t1−α/2,M −1

 σe2 (δ) M

,

 σe2 (δ) =

1

M 

M − 1 i=1

| ei (δ) − e(δ)|2 ,

and t1−α/2,M −1 denotes the 1 − α/2 percentile of the Student’s t distribution with M − 1 degrees for the significance level 0 < α < 1. Table 1 shows the estimated values  e(δ) and their respective 90% confidence interval (i.e. the values ∆(δ) for α = 0.1). The values  e(δ) corresponding to the LL schemes with (p, q) = (6, 6) had been already reported in [16], but they were included here for comparison purposes. As it was expected, the estimated errors  e(δ) as well as their 90% confidence intervals are smaller for the LL schemes with Padé approximation of higher order. For each β = 1, 2 and p, q = 1, 6, the estimated value  β was obtained from Table 1 as the slope of the straight line fitted to the set of points {log2 (δi ), log2 ( e(δi ))} , δi = 2−(i+3) , i = 1, . . . , 4. For (p, q) = (1, 1) the estimated values  β were of  β = 0.9292 ± 0.0581 and  β = 1.9362 ± 0.0478, while for (p, q) = (6, 6) were of  β = 0.9835 ± 0.0236 and  β = 1.9673 ± 0.0411. Fig. 1 shows the plot of the estimated errors as well as the fitted straight lines with slopes determined by  β . Despite the estimation of β is less accurate for the LL schemes with lower order

120

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122 Table 2 Global discretization errors corresponding to the numerical integration of the SDE of Example 2 through various LL schemes with δ = 2−5 . LL Scheme

d=5

d = 10

d = 15

d = 20

LL1 − P (1, 1) LL1 − P (6, 6) LL1 − KP (4, 1, 1) LL1 − KP (8, 1, 1) LL1 − KP (4, 6, 6) LL1 − KP (8, 6, 6) LL2 − P (1, 1) LL2 − P (6, 6) LL2 − KP (4, 1, 1) LL2 − KP (8, 1, 1) LL2 − KP (4, 6, 6) LL2 − KP (8, 6, 6)

0.01443 0.01427 0.01483 0.01431 0.01435 0.01432 0.01830 0.01760 0.02029 0.01794 0.01794 0.01776

0.01511 0.01485 0.01717 0.01713 0.01716 0.01521 0.02206 0.02112 0.02138 0.02124 0.02136 0.02124

0.01813 0.01636 0.01793 0.01644 0.01793 0.01643 0.02413 0.02301 0.02431 0.02361 0.02257 0.02251

0.01858 0.01717 0.01869 0.01770 0.01869 0.01729 0.02466 0.02310 0.02457 0.02342 0.02342 0.02341

Table 3 Relative CPU times for different numerical schemes with δ = 2−5 . LL Scheme

d=5

d = 10

d = 15

d = 20

LL1 − P (1, 1) LL1 − P (6, 6) LL1 − KP (4, 1, 1) LL1 − KP (8, 1, 1) LL1 − KP (4, 6, 6) LL1 − KP (8, 6, 6) LL2 − P (1, 1) LL2 − P (6, 6) LL2 − KP (4, 1, 1) LL2 − KP (8, 1, 1) LL2 − KP (4, 6, 6) LL2 − KP (8, 6, 6)

1 1.46 1.98 1.96 2.49 2.49 1.16 1.71 2.55 2.41 2.97 2.96

1 2.61 4.81 5.00 5.61 6.62 1.76 3.26 8.53 8.58 8.76 8.95

1 3.87 18.3 19.1 18.7 19.5 1.62 6.70 21.3 20.1 20.3 20.3

1 3.83 35.1 35.4 41.1 43.9 2.23 6.75 48.1 49.0 56.0 56.8

Padé approximation, these results corroborate the theoretical estimate for the convergence rate given in Theorem 7 for these schemes. Example 2. Consider the (d + 1)-dimensional SDE dxi = dwi (t ),

i = 1, . . . , d

(49)

d 1 

x2i dt , t ∈ [0, 1] 2d i=1 x1 (0) = x2 (0) = · · · = xd+1 (0) = 0,

dxd+1 =

that represents a multidimensional version of (48) with the additive action of thed-dimensional Wiener process

(w1 , . . . , wd ). For this equation, the functional g (x) = e−xd+1 also satisfies E (g (x(1))) =

2e 1+e2

(it can be proved similarly

to Exercise 17.1.2 in [3]). Table 2 shows, for different values of d, the estimated value of the error  e(δ = 2−5 ) of various LL schemes in the integration of the equation of Example 2. In addition to the LL schemes LLβ − P (p, q) considered in the previous example, the LL schemes LLβ − KP (m, p, q) defined by (43) in terms of the (m, p, q)-Krylov–Padé approximation is also used. Table 3 shows, for different values of d, the relative CPU time for each numerical scheme, which is obtained as the ratio of the actual CPU time for each numerical scheme to the minimum of all CPU times. As it was expected, the LL schemes with higher order Padé approximation provide better accuracy than the others. A similar result is obtained for the LL schemes with higher number of Krylov subspaces m. The LL schemes with lower order Padé approximation are faster than the others, which is also a predictable result for the low dimensionality of the considered equations and the Krylov–Padé algorithm of [30] used here. With regard to this last issue, note that the algorithm of [30] does not take into account the especial structure of the matrices involved in the LL scheme, which is necessary to obtain computationally efficient LL schemes based on Krylov–Padé approximations. 7. Remarks for the implementation of efficient schemes Results of this work provide order conditions for the approximations to φβ and η in such a way that the resulting weak LL scheme preserves the convergence rate of the underlaying LL discretization. In line with this, it is easy to recognize that

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

121

many of the current implementations of the weak LL schemes are not so efficient as they could be. Usually, they are obtained from approximations to φβ and η with higher order of convergence than that required, which involve an unnecessary extra computational cost. Notice that, mostly the professional numerical computing environments (as Matlab, Octave, etc.) provide subroutines for the computation of complex matrix operations up to the precision of the floating-point arithmetic. This includes the computation of inverse matrix, exponential matrix, and Schur decomposition required by various weak LL schemes. Therefore, as a rule, the existing numerical implementation of the weak LL schemes use an ‘‘exact’’ (up to the precision of the floating-point arithmetic) algorithm to compute φβ and η. For instance, the current Matlab implementations of the LL schemes (38) use the building-in subroutine ‘‘expm’’ that compute the exponential matrix with a high order Padé approximation. However, according to Theorem 7, the lower order (p, q)-Padé approximation with p + q = β is sufficient for preserving the convergence rate β = 1, 2 of these LL schemes. In this way, various matrix multiplications could be skipped at each integration step, which implies a sensible reduction of the overall computational time. Nevertheless, for choosing the convenient values of p and q besides the convergence rate and the computational cost other factors should be considered. For instance, it is known that for equations with small noise, integrators with ‘‘deterministic’’ high order (but low stochastic order) often yield far smaller errors with larger step size than schemes with ‘‘deterministically’’ same order as in the stochastic sense [34–36]. That is, higher deterministic order along with its increasing complexity may not be redundant. From this and by taking into account that for ODEs the stochastic schemes (38) reduce to the known deterministic Local Linearization scheme of order 2 (see, e.g., [37]), p + q should be taken equal to 2 for both, β = 1 and β = 2. Moreover, with q = p the mentioned deterministic scheme for ODE is A-stable, whereas with q = p + 1 or q = p + 2 such scheme is also L-stable. Hence, from the stability viewpoint, the stochastic scheme (38) with pair of values (0, 2), (1, 1) or (1, 2) for (p, q) can be recommended. Further, note that the deterministic LL scheme not only preserve the stability of the linear ODEs when p ≤ q ≤ p + 2, but they are also able to ‘‘exactly’’ (up to the precision of the floating-point arithmetic) integrate this class of equations when p + q = 12 (for the numerical precision of the current personal computers [38,39]). Therefore, for the special case of linear SDEs with small additive noise, the stochastic scheme (38) with the pair of values (6, 6) or (6, 7) for (p, q) might provide more accurate results. On the other hand, it is worth to mention here that additional improvement of the computational efficiency of the weak schemes (38) can be archived by means of an efficient implementation of Padé algorithm. Indeed, thanks to the especial block structure of the involved matrices, implementations as those proposed in Section V of [38] yield a considerable reduction of the computer storage and computational time required for evaluating the recursion (38). Note that the above discussion has been focused in the weak LL schemes based on Padé algorithm, but it is also valid for the schemes based on the Krylov–Padé approximation that have been consider in this paper. 8. Conclusions A main theorem concerning the convergence rate of the weak Local Linearization schemes for stochastic differential equations with additive noise was derived. Based on this general result, the convergence rate for some specific schemes were obtained and some illustrative simulations supporting the new findings were also presented. Results of this paper provide, for the first time, a clear guideline for the numerical implementation of computationally efficient weak LL schemes in forthcoming works. Acknowledgment The authors are grateful to the referee for her/his valuable comments that contributed to improve the presentation and discussion of the results in the paper. References [1] D. Talay, Second order discretization schemes of stochastic differential systems for the computation of the invariant law, Stoch. Stoch. Rep. 29 (1990) 13–36. [2] D. Talay, L. Tubaro, Expansion of the global error for numerical schemes solving stochastic differential equations, Stoch. Anal. Appl. 8 (1990) 94–120. [3] P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, second ed., Springer-Verlag, Berlin, 1995. [4] B.L.S. Prakasa-Rao, Statistical Inference for Diffussion Type Processes, Oxford University Press, 1999. [5] J.C. Jimenez, R. Biscay, T. Ozaki, Inference methods for discretely observed continuous-time stochastic volatility models: a commented overview, Asia-Pac. Financ. Markets 12 (2006) 109–141. [6] H. Schurz, Numerical analysis de stochastic differential equations without tears, in: D. Kannan, V. Lakahmikamtham (Eds.), Handbook of Stochastic Analysis & Applications, Marcell Dekker, 2002, pp. 237–358. [7] I. Shoji, T. Ozaki, Comparative study of estimation methods for continuous time stochastic process, J. Time Ser. Anal. 18 (1997) 485–506. [8] I. Shoji, T. Ozaki, Estimation for nonlinear stochastic differential equations by a local linearization method, Stoch. Anal. Appl. 16 (1998) 733–752. [9] G.B. Durham, A.R. Gallant, Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes, J. Bus. Econom. Statist. 20 (2002) 297–316. [10] H. Singer, Parameter estimation of nonlinear stochastic differential equations: simulated maximum likelihood versus extended Kalman filter and Ito–Taylor expansion, J. Comput. Graph. Statist. 11 (2002) 972–995. [11] A.S. Hurn, J.I. Jeisman, K.A. Lindsay, Seeing the wood for the trees: a critical evaluation of methods to estimate the parameters of stochastic differential equations, J. Financ. Econ. 5 (2007) 390–455. [12] O. Stramer, R.L. Tweedie, Langevin-type models I: diffussion with given stationary distributions and their discretizations, Methodol. Comput. Appl. Probab. 1 (1999) 283–306. [13] G.O. Roberts, O. Stramer, On inference for partially observed nonlinear diffusion models using the Metropolis–Hasting algorithm, Biometrika 88 (2001) 603–621.

122

J.C. Jimenez, F. Carbonell / Journal of Computational and Applied Mathematics 279 (2015) 106–122

[14] N.R. Hansen, Geometric ergodicity of discrete-time approximations to multivariate diffusions, Bernoulli 9 (2003) 725–743. [15] J. Nicolau, A new technique for simulating the likelihood of stochastic differential equations, Econom. J. 5 (2002) 91–103. [16] F. Carbonell, J.C. Jimenez, R.J. Biscay, Weak local linear discretizations for stochastic differential equations: convergence and numerical schemes, J. Comput. Appl. Math. 197 (2006) 578–596. [17] T. Ozaki, Nonlinear time series models and dynamical systems, in: E.J. Hannan, et al. (Eds.), Handbook of Statistics, Vol. 5, North-Holland, 1985, pp. 25–83. [18] T. Ozaki, A bridge between nonlinear time series models and nonlinear stochastic dynamical systems: a local linearization approach, Statist. Sinica 2 (1992) 113–135. [19] I. Shoji, T. Ozaki, A statistical method of estimation and simulation for systems of stochastic differential equations, Biometrika 85 (1998) 240–243. [20] C.M. Mora, Weak exponential schemes for stochastic differential equations with additive noise, IMA J. Numer. Anal. 25 (2005) 486–506. [21] C.F. Van Loan, Computing integrals involving the matrix exponential, IEEE Trans. Automat. Control AC-23 (1978) 395–404. [22] F. Carbonell, J.C. Jimenez, L. Pedroso, Computing multiple integrals involving matrix exponentials, J. Comput. Appl. Math. 213 (2008) 300–305. [23] C. Moler, C. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45 (2003) 3–49. [24] G.N. Milshtein, The Numerical Integration of Stochastic Differentials Equations, Ural University Press, 1988. [25] J.G. Sun, Rounding-error and perturbation bounds for the Cholesky and LDL| factorizations, Linear Algebra Appl. 173 (1992) 77–97. [26] J.C. Jimenez, H. de la Cruz, Convergence rate of strong local linearization schemes for stochastic differential equations with additive noise, BIT 52 (2012) 357–382. [27] H. de la Cruz, R.J. Biscay, J.C. Jimenez, F. Carbonell, T. Ozaki, High order local linearization methods: an approach for constructing A-stable high order explicit schemes for stochastic differential equations with additive noise, BIT 50 (2010) 509–539. [28] J.C. Jimenez, T. Ozaki, Linear estimation of continuous–discrete linear state space models with multiplicative noise, Systems Control Lett. 47 (2002) 91–101. [29] J.C. Jimenez, Simplified formulas for the mean and variance of linear stochastic differential equations, Institute of Statistical Mathematics Research Memo. No. 1153, Japan, Febrary 21, 2012. http://arxiv.org/abs/1207.5067 (submitted for publication). [30] R.B. Sidje, EXPOKIT: software package for computing matrix exponentials, ACM Trans. Math. Software 24 (1998) 130–156. [31] F. Carbonell, J.C. Jimenez, Weak local linear discretizations for stochastic differential equations with jumps, J. Appl. Probab. 45 (2008) 201–210. [32] E. Platen, N. Bruti-Liberati, Numerical Solution of Stochastic Differential Equations with Jumps in Finance, Springer-Verlag, Berlin, Heidelberg, 2010. [33] G.H. Golub, C.F. Van Loan, Matrix Computations, third ed., The Johns Hopkins University Press, 1996. [34] G. Milstein, M. Tretyakov, Mean-square numerical methods for stochastic differential equations with small noises, SIAM J. Sci. Comput. 18 (4) (1997) 1067–1087. [35] G. Milstein, M. Tretyakov, Numerical methods in the weak sense for stochastic differential equations with small noise, SIAM J. Numer. Anal. 34 (6) (1997) 2142–2167. [36] W. Römisch, R. Winkler, Stepsize control for mean-square numerical methods for stochastic differential equations with small noise, SIAM J. Sci. Comput. 28 (2) (2006) 604–625. [37] J.C. Jimenez, R. Biscay, C. Mora, L.M. Rodriguez, Dynamic properties of the local linearization method for initial-value problems, Appl. Math. Comput. 126 (2002) 63–81. [38] C. Moler, C. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 20 (1978) 801–836. [39] J.C. Jimenez, F. Carbonell, Rate of convergence of local linearization schemes for initial-value problems, Appl. Math. Comput. 171 (2005) 1282–1295.