ARTICLE IN PRESS
Stochastic Processes and their Applications 115 (2005) 1408–1436 www.elsevier.com/locate/spa
Exponential forgetting and geometric ergodicity for optimal filtering in general state-space models

Vladislav B. Tadić (a), Arnaud Doucet (b)

(a) Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield S1 3JD, UK
(b) Departments of Computer Science and Statistics, University of British Columbia, Vancouver, Canada V6T 122

Received 14 March 2005
Available online 15 April 2005
Abstract

State-space models are a very general class of time series capable of modeling dependent observations in a natural and interpretable way. We consider here the case where the latent process is modeled by a Markov chain taking its values in a continuous space and the observation at each time point admits a distribution that depends on both the current state of the Markov chain and the past observation. In this context, under given regularity assumptions, we establish that (1) the filter, and its derivatives with respect to some parameters in the model, have exponential forgetting properties and (2) the extended Markov chain, whose components are the latent process, the observation sequence, the filter and its derivatives, is geometrically ergodic. The regularity assumptions are typically satisfied when the latent process takes values in a compact space.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Exponential forgetting; Geometric ergodicity; Nonlinear filtering; Projective metric; State-space models
Corresponding author. Tel.: +44 1223 332 676; fax: +44 1223 332 662.
E-mail addresses: v.tadic@sheffield.ac.uk (V.B. Tadić), [email protected], [email protected] (A. Doucet).
0304-4149/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.spa.2005.03.005
1. Introduction

State-space models are widely used in many scientific fields. We consider here the case where the latent process is modeled by a Markov chain taking its values in a continuous space and the observation at each time point admits a distribution that depends on both the current state of the Markov chain and the past observation. In this context, under given regularity assumptions, we establish that (1) the filter, and its derivatives with respect to some parameters in the model, have exponential forgetting properties and (2) the extended Markov chain, whose components are the latent process, the observation sequence, the filter and its derivatives, is geometrically ergodic. The regularity assumptions are typically satisfied when the latent process takes values in a compact space. This extends the results of LeGland and Mevel [8] and Douc and Matias [5].

Related problems have recently been addressed in the literature (see the references), as these results have direct applications to misspecified models, identification, etc. Exponential forgetting properties of the filter have been established in [1,2,4,9]. Using the Hilbert metric approach pioneered in [2,9], we establish here exponential forgetting properties for the filter and its derivatives for a class of models more general than those presented in the literature. We also establish geometric ergodicity of the extended chain. In the case of finite hidden Markov models, this line of work was initiated by LeGland and Mevel [8] and generalized to a larger class of continuous state-space models in [5]. Simplifying the techniques introduced in [8], we obtain results for a class of continuous state-space models which encompasses the one addressed in [5]; in particular, strong differentiability assumptions for the Markov transition kernel and the likelihood are lifted and the likelihood does not have to be compactly supported.

The paper is organized as follows. In Section 2, the state-space model analyzed in this paper is defined. The optimal filter and its derivatives are also defined in this section. In Section 3, the results on the exponential forgetting of the filter and its derivatives are presented. The results on the geometric ergodicity of the extended Markov chain whose components are the latent process, the observation sequence, the filter and its derivatives are given in Section 4. The differentiability of the optimal filter is the subject of Section 5. Proofs of the results presented in Sections 3–5 are provided in Sections 6 and 7. In [12], the general results obtained here are applied to the stability analysis of the optimal filter and its derivatives for non-linear AR processes with Markov switching.
2. System and the optimal filter

Let $\Theta$ be an open subset of $\mathbb{R}$, while $(\Omega,\mathcal{F})$ is a measurable space. Let $\{X_n\}_{n\ge 0}$ and $\{Y_n\}_{n\ge 0}$ be $\mathbb{R}^p$- and $\mathbb{R}^q$-valued stochastic processes defined on $(\Omega,\mathcal{F})$, while $\lambda(\cdot)$ is a non-negative measure on $(\mathbb{R}^q,\mathcal{B}^q)$. For each $\theta\in\Theta$, there exist a probability $\mathcal{P}_\theta:\mathcal{F}\to[0,1]$, a transition probability kernel $P_\theta:\mathbb{R}^p\times\mathcal{B}^p\to[0,1]$ and a Borel-measurable function $q_\theta:\mathbb{R}^p\times\mathbb{R}^q\times\mathbb{R}^q\to[0,\infty)$ such that $\int q_\theta(x,y,y')\lambda(dy')=1$ for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, and $\{X_n\}_{n\ge 0}$, $\{Y_n\}_{n\ge 0}$ admit the following relations on $(\Omega,\mathcal{F},\mathcal{P}_\theta)$ for all $B_x\in\mathcal{B}^p$, $B_y\in\mathcal{B}^q$, $n\ge 0$:

$$\mathcal{P}_\theta(X_{n+1}\in B_x \mid X^n, Y^n) = P_\theta(X_n, B_x) \quad \text{w.p.1}, \tag{1}$$

$$\mathcal{P}_\theta(Y_{n+1}\in B_y \mid X^{n+1}, Y^n) = \int_{B_y} q_\theta(X_{n+1}, Y_n, y)\,\lambda(dy) \quad \text{w.p.1}, \tag{2}$$

where $X^n=(X_0,\dots,X_n)$ and $Y^n=(Y_0,\dots,Y_n)$, $n\ge 0$.

Throughout the paper the following notation is used. $M_0^p$, $M^p$ and $\tilde{M}^p$ are the families of probability measures, finite non-negative measures and finite signed measures (respectively) defined on the Borel measurable space $(\mathbb{R}^p,\mathcal{B}^p)$. $\|\cdot\|$ denotes the total variation norm of a signed measure, while $\mathcal{M}_0^p$, $\mathcal{M}^p$ and $\tilde{\mathcal{M}}^p$ are the families of measurable sets from $M_0^p$, $M^p$ and $\tilde{M}^p$ (respectively) induced by the total variation norm. $\delta_x$ is the Dirac measure on $(\mathbb{R}^p,\mathcal{B}^p)$ located at $x\in\mathbb{R}^p$, $I_B$ is the indicator function of the set $B\in\mathcal{B}^p$ and $I=I_{\mathbb{R}^p}$. For $\tilde\mu\in\tilde{M}^p$, a kernel $\tilde R:\tilde{M}^p\to\tilde{M}^p$ and a bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$, $\tilde\mu\tilde R$ denotes the measure which $\tilde R$ maps $\tilde\mu$ into, while $\tilde\mu\tilde R f$ is the integral of $f$ with respect to the measure $\tilde\mu\tilde R$. For a sequence $\{y_k\}_{k\ge 0}$, let $y^n=(y_0,\dots,y_n)$ for $n\ge 0$, and $y_i^n=(y_i,\dots,y_n)$ for $0\le i\le n$.

For $\theta\in\Theta$, let $\{R_\theta(y,y')\}_{y,y'\in\mathbb{R}^q}$ be a family of kernels defined as

$$\tilde\mu R_\theta(y,y') I_B = \int\!\!\int \tilde\mu(dx)\,P_\theta(x,dx')\,q_\theta(x',y,y')\,I_B(x')$$

for $\tilde\mu\in\tilde{M}^p$, $B\in\mathcal{B}^p$ (notice that $R_\theta(y,y')$ maps $\tilde{M}^p$ into $\tilde{M}^p$ for all $y,y'\in\mathbb{R}^q$), while $\{\tilde R_\theta(y,y')\}_{y,y'\in\mathbb{R}^q}$ is a family of kernels mapping $\tilde{M}^p$ into $\tilde{M}^p$ and having the property that $\tilde\mu\tilde R_\theta(\cdot,\cdot)I_B$ is Borel-measurable for all $\tilde\mu\in\tilde{M}^p$, $B\in\mathcal{B}^p$. Moreover, for $\theta\in\Theta$, $\mu\in M^p$, $\tilde\mu\in\tilde{M}^p$, $y,y'\in\mathbb{R}^q$, let $F_\theta(\mu,y,y')\in M^p$ and $\tilde F_\theta(\mu,\tilde\mu,y,y')\in\tilde{M}^p$ be measures defined as

$$F_\theta(\mu,y,y') I_B = (\mu R_\theta(y,y') I)^{-1}\,\mu R_\theta(y,y') I_B, \tag{3}$$

$$\begin{aligned}
\tilde F_\theta(\mu,\tilde\mu,y,y') I_B ={}& (\mu R_\theta(y,y') I)^{-1}\big(\tilde\mu R_\theta(y,y') - (\tilde\mu R_\theta(y,y') I)\,F_\theta(\mu,y,y')\big) I_B \\
&+ (\mu R_\theta(y,y') I)^{-1}\big(\mu \tilde R_\theta(y,y') - (\mu \tilde R_\theta(y,y') I)\,F_\theta(\mu,y,y')\big) I_B
\end{aligned} \tag{4}$$

for $B\in\mathcal{B}^p$. Furthermore, for $\theta\in\Theta$, $\mu\in M^p$, $\tilde\mu\in\tilde{M}^p$ and a sequence $\{y_k\}_{k\ge 0}$ from $\mathbb{R}^q$, let $\{F_\theta^n(\mu,y^n)\}_{n\ge 0}$ and $\{\tilde F_\theta^n(\mu,\tilde\mu,y^n)\}_{n\ge 0}$ be measures from $M^p$ and $\tilde{M}^p$ (respectively) defined as $F_\theta^0(\mu,y^0)=\mu$, $\tilde F_\theta^0(\mu,\tilde\mu,y^0)=\tilde\mu$ and

$$F_\theta^{n+1}(\mu,y^{n+1}) = F_\theta\big(F_\theta^n(\mu,y^n),\,y_n,\,y_{n+1}\big), \tag{5}$$

$$\tilde F_\theta^{n+1}(\mu,\tilde\mu,y^{n+1}) = \tilde F_\theta\big(F_\theta^n(\mu,y^n),\,\tilde F_\theta^n(\mu,\tilde\mu,y^n),\,y_n,\,y_{n+1}\big) \tag{6}$$

for $n\ge 0$.

It is straightforward to verify that for $n\ge 1$, $F_\theta^n(\mu,Y^n)$ is the optimal filter for estimating $X_n$ given $Y_0,\dots,Y_n$ in the probability space $(\Omega,\mathcal{F},\mathcal{P}_\theta)$. Under additional conditions establishing a relationship between $R_\theta(\cdot,\cdot)$ and $\tilde R_\theta(\cdot,\cdot)$ (see Section 5), for each $n\ge 1$, $\tilde F_\theta^n(\mu,\tilde\mu,Y^n)$ is a derivative of $F_\theta^n(\mu,Y^n)$ with respect to the parameter $\theta$.
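To make the filter recursion of this section concrete, the following is a minimal Python sketch for the special case of a finite state space, where measures become lists of weights and the transition kernel becomes a matrix. The function names (`filter_step`, `gaussian_lik`, `run_filter`), the Gaussian likelihood and its autoregressive coefficient 0.5 are illustrative assumptions, not part of the paper.

```python
import math

def filter_step(mu, P, lik):
    """One step of the filter map F(mu, y, y') on a finite state space.

    mu  : current filter, a list of probabilities over the states
    P   : transition matrix, P[i][j] = probability of moving from i to j
    lik : lik[j] = likelihood q(j, y, y') of the new observation in state j
    """
    n = len(mu)
    # Unnormalized measure mu R(y, y'): predict with P, then weight by the likelihood.
    unnorm = [sum(mu[i] * P[i][j] for i in range(n)) * lik[j] for j in range(n)]
    total = sum(unnorm)  # normalizing constant, the total mass of mu R(y, y')
    return [u / total for u in unnorm]

def gaussian_lik(means, y_prev, y_new):
    """Hypothetical likelihood: y_new ~ N(means[j] + 0.5 * y_prev, 1) in state j."""
    return [math.exp(-0.5 * (y_new - m - 0.5 * y_prev) ** 2) / math.sqrt(2 * math.pi)
            for m in means]

def run_filter(mu0, P, means, ys):
    """Iterate the one-step map along the observations ys = (y_0, ..., y_n)."""
    mu = list(mu0)
    for y_prev, y_new in zip(ys, ys[1:]):
        mu = filter_step(mu, P, gaussian_lik(means, y_prev, y_new))
    return mu
```

For instance, `run_filter([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [-1.0, 1.0], [0.0, 0.5, -0.3, 1.2])` returns the filter after three updates; the likelihood depends on both the current state and the previous observation, as in the model class considered here.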
3. Exponential forgetting

The problem of the exponential forgetting of $\{F_\theta^n(\mu,y^n)\}_{n\ge 0}$ and $\{\tilde F_\theta^n(\mu,\tilde\mu,y^n)\}_{n\ge 0}$ with respect to the initial conditions $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$ is considered in this section. This problem is analyzed under the following assumptions:

(A3.1) For all $\theta\in\Theta$, there exist a Borel-measurable function $\varepsilon_\theta:\mathbb{R}^q\times\mathbb{R}^q\to(0,1)$ and a family $\{\nu_\theta(y,y')\}_{y,y'\in\mathbb{R}^q}$ of measures from $M^p$ such that $\nu_\theta(\cdot,\cdot)I_B$ is Borel-measurable for all $B\in\mathcal{B}^p$ and

$$\varepsilon_\theta(y,y')\,\nu_\theta(y,y') I_B \le \delta_x R_\theta(y,y') I_B \le \varepsilon_\theta^{-1}(y,y')\,\nu_\theta(y,y') I_B$$

for all $x\in\mathbb{R}^p$, $y,y'\in\mathbb{R}^q$, $B\in\mathcal{B}^p$.

(A3.2) For all $\theta\in\Theta$, there exists a Borel-measurable function $\tilde\varepsilon_\theta:\mathbb{R}^q\times\mathbb{R}^q\to(0,1)$ such that

$$\|\mu\tilde R_\theta(y,y')\| \le \tilde\varepsilon_\theta^{-1}(y,y')\,\mu R_\theta(y,y') I$$

for all $\mu\in M^p$, $y,y'\in\mathbb{R}^q$.

Remark. It can easily be deduced that under (A3.1) and (A3.2), $\tilde\mu R_\theta(y,y')f$ and $\tilde\mu\tilde R_\theta(y,y')f$ are well-defined and finite for all $y,y'\in\mathbb{R}^q$, $\tilde\mu\in\tilde{M}^p$ and any bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$.

Assumption (A3.1) corresponds to the stability of the kernel $R_\theta(\cdot,\cdot)$, which itself is tightly related to the stability of the transition probability kernel $P_\theta(\cdot,\cdot)$. Assumptions of this kind were introduced in [4] and later used in [9]. Assumption (A3.1) is satisfied if for all $\theta\in\Theta$, there exist a constant $\varepsilon_\theta\in(0,1)$ and a measure $\nu_\theta\in M^p$ such that

$$\varepsilon_\theta\,\nu_\theta(B) \le P_\theta(x,B) \le \varepsilon_\theta^{-1}\,\nu_\theta(B) \tag{7}$$

for all $x\in\mathbb{R}^p$, $B\in\mathcal{B}^p$. On the other hand, (7) holds if (1) for all $\theta\in\Theta$, $P_\theta(\cdot,\cdot)$ has a density $p_\theta(\cdot,\cdot)$ with respect to a reference measure $\kappa\in M^p$, and (2) for all $\theta\in\Theta$, there exists a set $X_\theta\in\mathcal{B}^p$ such that $\int_{X_\theta} p_\theta(x,x')\kappa(dx')=1$ for all $x\in\mathbb{R}^p$ and $p_\theta(x,x')\ge\varepsilon_\theta$ for all $x,x'\in X_\theta$.

Assumption (A3.2) is related to the kernel $\tilde R_\theta(\cdot,\cdot)$. It can be shown that (A3.2) holds if (1) for all $\theta\in\Theta$, $P_\theta(\cdot,\cdot)$ has density $p_\theta(\cdot,\cdot)$ with respect to a reference measure $\kappa\in M^p$, (2) $p_\theta(x,x')$ and $q_\theta(x,y,y')$ are differentiable with respect to $\theta$ for all $x,x'\in\mathbb{R}^p$, $y,y'\in\mathbb{R}^q$, (3) for all $\theta\in\Theta$,

$$\sup_{x,x'\in\mathbb{R}^p} p_\theta^{-1}(x,x')\,|\tilde p_\theta(x,x')| < \infty,$$

$$\sup_{x\in\mathbb{R}^p,\; y,y'\in\mathbb{R}^q} q_\theta^{-1}(x,y,y')\,|\tilde q_\theta(x,y,y')| < \infty,$$

where $\tilde p_\theta(x,x')$, $\tilde q_\theta(x,y,y')$ are derivatives of $p_\theta(x,x')$, $q_\theta(x,y,y')$ (respectively) with respect to $\theta$, and (4) for all $\theta\in\Theta$, $\tilde R_\theta(\cdot,\cdot)$ is the derivative of $R_\theta(\cdot,\cdot)$ with respect to $\theta$, i.e., for all $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$, $B\in\mathcal{B}^p$,

$$\begin{aligned}
\tilde\mu\tilde R_\theta(y,y') I_B ={}& \int\!\!\int \tilde\mu(dx)\,\kappa(dx')\,\tilde p_\theta(x,x')\,q_\theta(x',y,y')\,I_B(x') \\
&+ \int\!\!\int \tilde\mu(dx)\,\kappa(dx')\,p_\theta(x,x')\,\tilde q_\theta(x',y,y')\,I_B(x').
\end{aligned}$$

For $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$, let

$$\tau_\theta(y,y') = \big(1-\varepsilon_\theta^{2}(y,y')\big)\big(1+\varepsilon_\theta^{2}(y,y')\big)^{-1},$$

$$\alpha_\theta(y,y') = 2\log^{-1}3\;\varepsilon_\theta^{-2}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde\alpha_\theta(y,y') = 80\log^{-1}3\;\varepsilon_\theta^{-6}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde\phi_\theta(y,y') = \varepsilon_\theta^{-4}(y,y')\,\tilde\varepsilon_\theta^{-1}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde\psi_\theta(y,y') = \varepsilon_\theta^{-6}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde\beta_\theta(y,y') = 2\log^{-1}3\;\varepsilon_\theta^{-4}(y,y')\,\tau_\theta^{-1}(y,y').$$
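Moreover, for $\theta\in\Theta$, $n\ge 1$, and a sequence $\{y_k\}_{k\ge 0}$ from $\mathbb{R}^q$, let

$$\alpha_\theta^n(y^n) = \alpha_\theta(y_0,y_1)\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i),$$

$$\tilde\alpha_\theta^n(y^n) = \tilde\alpha_\theta(y_0,y_1)\left(\sum_{i=1}^{n-1}\tilde\phi_\theta(y_{i-1},y_i)\,\tilde\psi_\theta(y_i,y_{i+1}) + \tilde\phi_\theta(y_{n-1},y_n)\right)\left(\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i)\right),$$

$$\tilde\beta_\theta^n(y^n) = \tilde\beta_\theta(y_0,y_1)\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i).$$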
The main results on the exponential forgetting of $\{F_\theta^n(\mu,y^n)\}_{n\ge 0}$ and $\{\tilde F_\theta^n(\mu,\tilde\mu,y^n)\}_{n\ge 0}$ with respect to the initial conditions $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$ are contained in the next two theorems.

Theorem 3.1. Let (A3.1) hold, while $\{y_k\}_{k\ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu,\mu'\in M_0^p$, $n\ge 1$,

$$\|F_\theta^n(\mu,y^n) - F_\theta^n(\mu',y^n)\| \le \alpha_\theta^n(y^n)\,\|\mu-\mu'\|.$$

Theorem 3.2. Let (A3.1) and (A3.2) hold, while $\theta\in\Theta$ and $\{y_k\}_{k\ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu,\mu'\in M_0^p$, $\tilde\mu,\tilde\mu'\in\tilde{M}^p$, $n\ge 1$,

$$\|\tilde F_\theta^n(\mu,\tilde\mu,y^n) - \tilde F_\theta^n(\mu',\tilde\mu',y^n)\| \le \tilde\alpha_\theta^n(y^n)\,\|\mu-\mu'\|\,(1+\|\tilde\mu\|) + \tilde\beta_\theta^n(y^n)\,\|\tilde\mu-\tilde\mu'\|. \tag{8}$$

Proofs of Theorems 3.1 and 3.2 are provided in Section 6. In [9], the exponential forgetting of $\{F_\theta^n(\mu,y^n)\}_{n\ge 0}$ has been considered, and the same results as those in Theorem 3.1 have been obtained (Theorem 3.1 has been included in the paper for the sake of completeness, since it is a crucial prerequisite for Theorem 3.2 and the results presented in the next section). The exponential forgetting of $\{\tilde F_\theta^n(\mu,\tilde\mu,y^n)\}_{n\ge 0}$ has been studied in [1,5,8]. Compared with the results of [1,5,8], Theorem 3.2 seems to be considerably more general. The results presented in [1] cover only the case of hidden Markov models with finite state and observation spaces, while the results of [5,8] are fairly restrictive for cases where the likelihood probability density functions $q_\theta(x,y,\cdot)$ are not compactly supported. Instead of $\varepsilon_\theta^{-1}(\cdot,\cdot)$, the upper bound of the left-hand side of (8) obtained in [5,8] (and extended to state-space models) depends on the function $\delta_\theta(\cdot,\cdot)$ defined as

$$\delta_\theta(y,y') = \frac{\sup_{x\in\mathbb{R}^p} q_\theta(x,y,y')}{\inf_{x\in\mathbb{R}^p} q_\theta(x,y,y')} \tag{9}$$

for $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$. However, if $q_\theta(x,y,\cdot)$ are not compactly supported, $\delta_\theta(y,y')$ can (and usually does) tend to infinity as $\|y\|,\|y'\|\to\infty$: it can easily be shown that $\delta_\theta(\cdot,\cdot)$ grows exponentially if $q_\theta(x,y,\cdot)$ are Gaussian probability density functions. On the other hand, if there exists a constant $\varepsilon_\theta\in(0,1)$ such that (7) holds, then $\varepsilon_\theta^{-1}(\cdot,\cdot)$ is itself bounded by the constant $\varepsilon_\theta^{-1}$.
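The forgetting bound of Theorem 3.1 can be checked numerically in a toy setting. The sketch below assumes a finite state space with a strictly positive transition matrix, so that a condition of type (7) holds with a constant $\varepsilon$; the helper names, the choice $\nu(j)=\sqrt{\min_i P_{ij}\max_i P_{ij}}$ and the unnormalized Gaussian likelihood are illustrative assumptions rather than constructions from the paper.

```python
import math

def filter_step(mu, P, lik):
    """One filter update: predict with P, reweight by the likelihood, normalize."""
    n = len(mu)
    unnorm = [sum(mu[i] * P[i][j] for i in range(n)) * lik[j] for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def tv_norm(mu, nu):
    """Total variation norm of mu - nu (sum of absolute differences)."""
    return sum(abs(a - b) for a, b in zip(mu, nu))

def mixing_constant(P):
    """A constant eps with eps * nu(j) <= P[i][j] <= nu(j) / eps for
    nu(j) = sqrt(min_i P[i][j] * max_i P[i][j]); needs strictly positive entries."""
    n = len(P)
    eps = 1.0
    for j in range(n):
        col = [P[i][j] for i in range(n)]
        eps = min(eps, math.sqrt(min(col) / max(col)))
    return eps

def forgetting_gaps(P, means, ys, mu, mu_prime):
    """Distance between two filters started at mu and mu_prime, together with the
    bound (2 / log 3) * eps^-2 * tau^(n-1) * ||mu - mu'|| for constant eps."""
    eps = mixing_constant(P)
    tau = (1 - eps ** 2) / (1 + eps ** 2)
    d0 = tv_norm(mu, mu_prime)
    gaps, bounds = [], []
    for n, y in enumerate(ys[1:], start=1):
        # Hypothetical likelihood: y ~ N(means[j], 1), constant factor dropped.
        lik = [math.exp(-0.5 * (y - m) ** 2) for m in means]
        mu = filter_step(mu, P, lik)
        mu_prime = filter_step(mu_prime, P, lik)
        gaps.append(tv_norm(mu, mu_prime))
        bounds.append(2 / math.log(3) * eps ** -2 * tau ** (n - 1) * d0)
    return gaps, bounds
```

Calling `forgetting_gaps` with two point masses as initial conditions returns the observed gaps next to the theoretical bounds; the bound is typically loose, but both sequences decay geometrically in `n`.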
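4. Geometric ergodicity

Let $\mathcal{P}:\mathcal{F}\to[0,1]$ be a probability measure on $(\Omega,\mathcal{F})$. Moreover, let $P:\mathbb{R}^p\times\mathcal{B}^p\to[0,1]$ be a transition probability kernel, while $q:\mathbb{R}^p\times\mathbb{R}^q\times\mathbb{R}^q\to[0,\infty)$ is a Borel-measurable function satisfying $\int q(x,y,y')\lambda(dy')=1$ for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$ ($\lambda(\cdot)$ is defined in Section 2). Suppose that $\{X_n\}_{n\ge 0}$ and $\{Y_n\}_{n\ge 0}$ are distributed on $(\Omega,\mathcal{F},\mathcal{P})$ according to

$$\mathcal{P}(X_{n+1}\in B_x \mid X^n,Y^n) = P(X_n,B_x) \quad \text{w.p.1}, \tag{10}$$

$$\mathcal{P}(Y_{n+1}\in B_y \mid X^{n+1},Y^n) = \int_{B_y} q(X_{n+1},Y_n,y)\,\lambda(dy) \quad \text{w.p.1} \tag{11}$$

for all $B_x\in\mathcal{B}^p$, $B_y\in\mathcal{B}^q$, $n\ge 0$. Let $S:M_0^{p+q}\to M_0^{p+q}$ be a transition probability kernel defined as

$$S(x,y,B) = \delta_{(x,y)} S I_B = \int\!\!\int P(x,dx')\,\lambda(dy')\,q(x',y,y')\,I_B(x',y')$$

for $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $B\in\mathcal{B}^p\times\mathcal{B}^q$. Furthermore, let $\mu_0\in M_0^p$, $\tilde\mu_0\in\tilde{M}^p$ be deterministic measures, while $\mu_n^\theta = F_\theta^n(\mu_0,Y^n)$, $\tilde\mu_n^\theta = \tilde F_\theta^n(\mu_0,\tilde\mu_0,Y^n)$ for $\theta\in\Theta$, $n\ge 1$. Then, for all $\theta\in\Theta$, $\{(X_n,Y_n,\mu_n^\theta)\}_{n\ge 0}$ and $\{(X_n,Y_n,\mu_n^\theta,\tilde\mu_n^\theta)\}_{n\ge 0}$ are Markov chains on $(\Omega,\mathcal{F},\mathcal{P})$ with values in $\mathbb{R}^p\times\mathbb{R}^q\times M_0^p$, $\mathbb{R}^p\times\mathbb{R}^q\times M_0^p\times\tilde{M}^p$ (respectively) and transition probability kernels $\Pi_\theta$, $\tilde\Pi_\theta$ (respectively) defined as

$$\Pi_\theta(x,y,\mu,B) = \delta_{(x,y,\mu)}\Pi_\theta I_B = \int S(x,y,dx',dy')\,I_B\big(x',y',F_\theta(\mu,y,y')\big), \tag{12}$$

$$\tilde\Pi_\theta(x,y,\mu,\tilde\mu,\tilde B) = \delta_{(x,y,\mu,\tilde\mu)}\tilde\Pi_\theta I_{\tilde B} = \int S(x,y,dx',dy')\,I_{\tilde B}\big(x',y',F_\theta(\mu,y,y'),\tilde F_\theta(\mu,\tilde\mu,y,y')\big) \tag{13}$$

for $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, $B\in\mathcal{B}^p\times\mathcal{B}^q\times\mathcal{M}_0^p$, $\tilde B\in\mathcal{B}^p\times\mathcal{B}^q\times\mathcal{M}_0^p\times\tilde{\mathcal{M}}^p$.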
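As an illustration of the extended chain $(X_n, Y_n, \mu_n^\theta)$, the following Python sketch simulates a finite-state true system and runs the filter of a possibly misspecified candidate model alongside it. All names, the Gaussian observation law and the autoregressive coefficient 0.5 are hypothetical choices made for this sketch only.

```python
import math
import random

def filter_step(mu, P, lik):
    """One filter update: predict with P, reweight by the likelihood, normalize."""
    n = len(mu)
    unnorm = [sum(mu[i] * P[i][j] for i in range(n)) * lik[j] for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def simulate_extended_chain(P_true, means_true, P_model, means_model,
                            mu0, n_steps, seed=0):
    """Simulate (X_n, Y_n, mu_n): the pair (X, Y) follows the true system,
    while the filter mu_n is computed with the candidate model."""
    rng = random.Random(seed)
    states = range(len(P_true))
    x, y = 0, 0.0  # hypothetical initial state and observation
    mu = list(mu0)
    path = []
    for _ in range(n_steps):
        # True system: the next state depends on the current state only.
        x = rng.choices(states, weights=P_true[x])[0]
        # True system: the new observation depends on the new state and on
        # the previous observation (assumed AR coefficient 0.5).
        y_new = rng.gauss(means_true[x] + 0.5 * y, 1.0)
        # Candidate model: filter update using the model's kernel and likelihood.
        lik = [math.exp(-0.5 * (y_new - m - 0.5 * y) ** 2) for m in means_model]
        mu = filter_step(mu, P_model, lik)
        y = y_new
        path.append((x, y, mu))
    return path
```

The triple appended at each step depends on the past only through the previous triple, which is the Markov property that the kernel in (12) expresses.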
The process $\{(X_n,Y_n)\}_{n\ge 0}$ distributed on $(\Omega,\mathcal{F},\mathcal{P})$ (and characterized by the transition probability kernel $P(\cdot,\cdot)$ and likelihood probability density function $q(\cdot,\cdot,\cdot)$) can be interpreted as a true system, while for $\theta\in\Theta$, the processes $\{(X_n,Y_n)\}_{n\ge 0}$ distributed on $(\Omega,\mathcal{F},\mathcal{P}_\theta)$ (and characterized by the transition probability kernel $P_\theta(\cdot,\cdot)$ and likelihood probability density function $q_\theta(\cdot,\cdot,\cdot)$) can be considered as a parameterized (candidate) model of the true system. In the context of system identification, the aim is to determine $\theta\in\Theta$ such that $\{(X_n,Y_n)\}_{n\ge 0}$ distributed on $(\Omega,\mathcal{F},\mathcal{P}_\theta)$ best approximates $\{(X_n,Y_n)\}_{n\ge 0}$ distributed on $(\Omega,\mathcal{F},\mathcal{P})$.

The problem of the geometric ergodicity of $\{(X_n,Y_n,\mu_n^\theta)\}_{n\ge 0}$ and $\{(X_n,Y_n,\mu_n^\theta,\tilde\mu_n^\theta)\}_{n\ge 0}$ is considered in this section. The problem is analyzed under the following assumptions:

(A4.1) For all $\theta\in\Theta$, there exist a constant $\varepsilon_\theta\in(0,1)$ and a family $\{\nu_\theta(y,y')\}_{y,y'\in\mathbb{R}^q}$ of measures from $M^p$ such that $\nu_\theta(\cdot,\cdot)I_B$ is Borel-measurable for all $B\in\mathcal{B}^p$ and

$$\varepsilon_\theta\,\nu_\theta(y,y') I_B \le \delta_x R_\theta(y,y') I_B \le \varepsilon_\theta^{-1}\,\nu_\theta(y,y') I_B$$

for all $x\in\mathbb{R}^p$, $y,y'\in\mathbb{R}^q$, $B\in\mathcal{B}^p$.

(A4.2) There exist a probability measure $\sigma\in M_0^{p+q}$, constants $C\in[1,\infty)$, $\rho\in(0,1)$ and a Borel-measurable function $\phi:\mathbb{R}^p\times\mathbb{R}^q\to[1,\infty)$ such that $\sigma\phi<\infty$, $\sigma S I_B=\sigma I_B$ for all $B\in\mathcal{B}^{p+q}$ and

$$|\delta_{(x,y)} S^n f - \sigma f| \le C\rho^n\phi(x,y)$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $n\ge 1$, and any Borel-measurable function $f:\mathbb{R}^p\times\mathbb{R}^q\to\mathbb{R}$ satisfying $0\le f(x,y)\le\phi(x,y)$ for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$.

Assumption (A4.1) corresponds to the stability of the kernel $R_\theta(\cdot,\cdot)$ and is a special case of (A3.1). It is satisfied if (7) holds. Assumption (A4.2) is related to the stability of the system $\{(X_n,Y_n)\}_{n\ge 0}$. It requires the Markov chain $\{(X_n,Y_n)\}_{n\ge 0}$ to be uniformly ergodic (for more details on this type of geometric ergodicity see [10, Chapter 14]). It is satisfied if the system is a hidden Markov model with a geometrically ergodic hidden process. Another situation where (A4.2) is satisfied is provided in [12].

Let $\tau_\theta = (1-\varepsilon_\theta^2)(1+\varepsilon_\theta^2)^{-1}$. The main results on the geometric ergodicity of $\{(X_n,Y_n,\mu_n^\theta)\}_{n\ge 0}$ and $\{(X_n,Y_n,\mu_n^\theta,\tilde\mu_n^\theta)\}_{n\ge 0}$ are contained in the next two theorems.

Theorem 4.1. Let (A4.1) and (A4.2) hold, while $\theta\in\Theta$ and $f:\mathbb{R}^p\times\mathbb{R}^q\times M_0^p\to\mathbb{R}$ is a $\mathcal{B}^p\times\mathcal{B}^q\times\mathcal{M}_0^p$-measurable function. Suppose that

$$|f(x,y,\mu)| \le \phi(x,y), \tag{14}$$

$$|f(x,y,\mu) - f(x,y,\mu')| \le \phi(x,y)\,\|\mu-\mu'\| \tag{15}$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $\mu,\mu'\in M_0^p$. Then, there exist constants $K_\theta\in[1,\infty)$, $\rho_\theta\in(0,1)$ (depending on $\varepsilon_\theta$, $C$, $\rho$, $\sigma\phi$ only) such that

$$|(\Pi_\theta^n f)(x,y,\mu) - (\Pi_\theta^n f)(x',y',\mu')| \le K_\theta\rho_\theta^n\big(\phi(x,y)+\phi(x',y')\big)$$

for all $x,x'\in\mathbb{R}^p$, $y,y'\in\mathbb{R}^q$, $\mu,\mu'\in M_0^p$, $n\ge 0$. Moreover, there exist constants $f_\theta\in\mathbb{R}$, $L_\theta\in[1,\infty)$ (depending on $\varepsilon_\theta$, $C$, $\rho$, $\sigma\phi$ only) such that

$$|(\Pi_\theta^n f)(x,y,\mu) - f_\theta| \le L_\theta\rho_\theta^n\phi(x,y)$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $\mu\in M_0^p$, $n\ge 1$.

Theorem 4.2. Let (A3.2), (A4.1) and (A4.2) hold, while $\theta\in\Theta$ and $f:\mathbb{R}^p\times\mathbb{R}^q\times M_0^p\times\tilde{M}^p\to\mathbb{R}$ is a $\mathcal{B}^p\times\mathcal{B}^q\times\mathcal{M}_0^p\times\tilde{\mathcal{M}}^p$-measurable function. Suppose that there exist constants $\alpha,\beta\in(1,\infty)$, $\gamma\in[0,\infty)$, $\tilde K_\theta\in[1,\infty)$ such that $\alpha^{-1}+\beta^{-1}=1$ and

$$|f(x,y,\mu,\tilde\mu)| \le \phi^{1/\beta}(x,y)\,(1+\|\tilde\mu\|^\gamma), \tag{16}$$

$$|f(x,y,\mu,\tilde\mu) - f(x,y,\mu',\tilde\mu')| \le \phi^{1/\beta}(x,y)\,(1+\|\tilde\mu\|^\gamma+\|\tilde\mu'\|^\gamma)\,(\|\mu-\mu'\|+\|\tilde\mu-\tilde\mu'\|), \tag{17}$$

$$\int \tilde\varepsilon_\theta^{-\alpha(\gamma+1)}(y,y')\,S(x,y,dx',dy') \le \tilde K_\theta\,\phi(x,y) \tag{18}$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $\mu,\mu'\in M_0^p$, $\tilde\mu,\tilde\mu'\in\tilde{M}^p$. Then, there exist constants $K_\theta\in[1,\infty)$, $\rho_\theta\in(0,1)$ (depending on $\varepsilon_\theta$, $\tilde K_\theta$, $C$, $\rho$, $\sigma\phi$, $\gamma$ only) such that

$$|(\tilde\Pi_\theta^n f)(x,y,\mu,\tilde\mu) - (\tilde\Pi_\theta^n f)(x',y',\mu',\tilde\mu')| \le K_\theta\rho_\theta^n\phi(x,y)\,(1+\|\tilde\mu\|^{\gamma+1}) + K_\theta\rho_\theta^n\phi(x',y')\,(1+\|\tilde\mu'\|^{\gamma+1})$$

for all $x,x'\in\mathbb{R}^p$, $y,y'\in\mathbb{R}^q$, $\mu,\mu'\in M_0^p$, $\tilde\mu,\tilde\mu'\in\tilde{M}^p$, $n\ge 1$. Moreover, if there exist a constant $\tilde L_\theta$ and a Borel-measurable function $\psi:\mathbb{R}^p\times\mathbb{R}^q\to[1,\infty)$ such that

$$\int \phi(x',y')\,\tilde\varepsilon_\theta^{-(\gamma+1)}(y,y')\,S(x,y,dx',dy') \le \tilde L_\theta\,\psi(x,y) \tag{19}$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, then there exist constants $f_\theta\in\mathbb{R}$, $L_\theta\in[1,\infty)$ (depending on $\varepsilon_\theta$, $\tilde K_\theta$, $\tilde L_\theta$, $C$, $\rho$, $\sigma\phi$, $\gamma$ only) such that

$$|(\tilde\Pi_\theta^n f)(x,y,\mu,\tilde\mu) - f_\theta| \le L_\theta\rho_\theta^n\big(\phi(x,y)+\psi(x,y)\big)(1+\|\tilde\mu\|^{\gamma+1})$$

for all $x\in\mathbb{R}^p$, $y\in\mathbb{R}^q$, $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, $n\ge 1$.

Proofs of Theorems 4.1 and 4.2 are provided in Section 7. In [6], the geometric ergodicity of $\{(X_n,Y_n,\mu_n^\theta)\}_{n\ge 0}$ has been considered and results similar to those of Theorem 4.1 have been obtained. However, the results of [6] have been proved using arguments which are completely different from and less transparent than those used in the proof of Theorem 4.1. The geometric ergodicity of $\{(X_n,Y_n,\mu_n^\theta,\tilde\mu_n^\theta)\}_{n\ge 0}$ has been studied in [5,8]. Compared with the results of [5,8], Theorem 4.2 seems to be considerably more general. In [5,8], the geometric ergodicity of $\{(X_n,Y_n,\mu_n^\theta,\tilde\mu_n^\theta)\}_{n\ge 0}$ has been demonstrated under conditions which are fairly restrictive for cases where the likelihood probability density functions $q_\theta(x,y,\cdot)$ are not compactly supported. The assumptions adopted in [5,8] (and extended to state-space models) require that

$$\sup_{x\in\mathbb{R}^p,\;y\in\mathbb{R}^q}\int \delta_\theta(y,y')\,q(x,y,y')\,\lambda(dy') < \infty \tag{20}$$

($\delta_\theta(\cdot,\cdot)$ is defined in (9)). However, it can easily be shown that (20) does not hold if $q_\theta(x,y,\cdot)$ are Gaussian probability density functions. On the other hand, Theorem 4.2 covers a fairly broad class of hidden Markov models and non-linear AR models with Markov switching (see [12]) and allows the likelihood probability density functions $q_\theta(x,y,\cdot)$ to be Gaussian.
5. Filter derivatives

For $\theta\in\Theta$, let $\mu_\theta\in M_0^p$ and $\tilde\mu_\theta\in\tilde{M}^p$. The problems of the weak differentiability of $\{F_\theta^n(\mu_\theta,y^n)\}_{n\ge 0}$ with respect to $\theta$ and determining the corresponding derivatives are considered in this section (see e.g. [11] for details on weak differentiability and weak derivatives). Let

$$A'_\theta = \{(y,y')\in\mathbb{R}^q\times\mathbb{R}^q : \|\nu_\theta(y,y')\| > 0\},$$

$$A''_\theta = \{(y,y')\in\mathbb{R}^q\times\mathbb{R}^q : \|\nu_\theta(y,y')\| = 0\},$$

while $A'=\bigcup_{\theta\in\Theta}A'_\theta$ and $A''=\bigcup_{\theta\in\Theta}A''_\theta$. The problems mentioned above are analyzed under the following assumptions:

(A5.1) For all $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$ and any bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$,

$$\lim_{\vartheta\to\theta} |\vartheta-\theta|^{-1}\sup_{x\in\mathbb{R}^p}\big|\delta_x\big(R_\vartheta(y,y') - R_\theta(y,y') - (\vartheta-\theta)\tilde R_\theta(y,y')\big)f\big| = 0.$$

(A5.2) For all $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$ and any bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$,

$$\lim_{\vartheta\to\theta} (\vartheta-\theta)^{-1}\big(\mu_\vartheta - \mu_\theta - (\vartheta-\theta)\tilde\mu_\theta\big)f = 0. \tag{21}$$

(A5.3) There exists a set $A\in\mathcal{B}^q\times\mathcal{B}^q$ such that $(\lambda\times\lambda)(A^c)=0$ and $A\subseteq A'\cup A''$.

Remark. Assumption (A5.1) implies that for all $y,y'\in\mathbb{R}^q$, $R_\theta(y,y')$ is weakly differentiable with respect to $\theta$ and $\tilde R_\theta(y,y')$ is its weak derivative. Similarly, (A5.2) implies that $\mu_\theta$ is weakly differentiable with respect to $\theta$ and $\tilde\mu_\theta$ is its weak derivative.

The main results on the weak differentiability of $\{F_\theta^n(\mu_\theta,y^n)\}_{n\ge 0}$ are contained in the next two theorems.
Theorem 5.1. Let (A3.1), (A3.2), (A5.1) and (A5.2) hold, while $\theta\in\Theta$ and $\{y_k\}_{k\ge 0}$ is any sequence from $\mathbb{R}^q$ satisfying $(y_n,y_{n+1})\in A'_\theta\cup A''$, $n\ge 0$. Then, for all $n\ge 0$, $F_\theta^n(\mu_\theta,y^n)$ is weakly differentiable in $\theta$ and $\tilde F_\theta^n(\mu_\theta,\tilde\mu_\theta,y^n)$ is its weak derivative, i.e.,

$$\lim_{\vartheta\to\theta} (\vartheta-\theta)^{-1}\big(F_\vartheta^n(\mu_\vartheta,y^n) - F_\theta^n(\mu_\theta,y^n) - (\vartheta-\theta)\tilde F_\theta^n(\mu_\theta,\tilde\mu_\theta,y^n)\big)f = 0$$

for $n\ge 0$ and any bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$.

Theorem 5.2. Let (A3.1), (A3.2) and (A5.1)–(A5.3) hold. Suppose that $\{X_n\}_{n\ge 0}$ and $\{Y_n\}_{n\ge 0}$ are distributed on $(\Omega,\mathcal{F},\mathcal{P})$ according to (10), (11). Then, there exists $\Lambda\in\mathcal{F}$ satisfying $\mathcal{P}(\Lambda)=0$ such that for all $\theta\in\Theta$, $n\ge 0$, $F_\theta^n(\mu_\theta,Y^n)$ is weakly differentiable in $\theta$ on $\Lambda^c$ and $\tilde F_\theta^n(\mu_\theta,\tilde\mu_\theta,Y^n)$ is its weak derivative on $\Lambda^c$, i.e.,

$$\lim_{\vartheta\to\theta} (\vartheta-\theta)^{-1}\big(F_\vartheta^n(\mu_\vartheta,Y^n) - F_\theta^n(\mu_\theta,Y^n) - (\vartheta-\theta)\tilde F_\theta^n(\mu_\theta,\tilde\mu_\theta,Y^n)\big)f = 0 \tag{22}$$

on $\Lambda^c$ for all $\theta\in\Theta$, $n\ge 0$, and any bounded Borel-measurable function $f:\mathbb{R}^p\to\mathbb{R}$.

Proofs of Theorems 5.1 and 5.2 are rather straightforward and so are not included here. They are provided in [12].

The results of Theorems 5.1 and 5.2 provide a general, but still simple way to check whether $\{F_\theta^n(\mu_\theta,Y^n)\}_{n\ge 0}$ are weakly differentiable and to calculate the corresponding derivatives. To the best of our knowledge, the weak differentiability of $\{F_\theta^n(\mu_\theta,Y^n)\}_{n\ge 0}$ has not been studied in the literature on optimal filtering.
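The derivative recursion (4), (6) can be compared with a finite-difference approximation in a small example. The sketch below assumes a hypothetical two-state model in which the switching probability is $\theta$ and the observation mean in state $j$ is $\theta c_j$; the function names and the model itself are illustrative assumptions and do not come from the paper.

```python
import math

def model(theta, y_new):
    """Hypothetical two-state model: returns the matrix R and its
    theta-derivative R_tilde for the new observation y_new."""
    P = [[1 - theta, theta], [theta, 1 - theta]]
    dP = [[-1.0, 1.0], [1.0, -1.0]]          # derivative of P in theta
    c = [-1.0, 1.0]
    # Gaussian likelihood with mean theta * c[j]; the constant factor is dropped.
    q = [math.exp(-0.5 * (y_new - theta * c[j]) ** 2) for j in range(2)]
    dq = [q[j] * (y_new - theta * c[j]) * c[j] for j in range(2)]
    R = [[P[i][j] * q[j] for j in range(2)] for i in range(2)]
    R_tilde = [[dP[i][j] * q[j] + P[i][j] * dq[j] for j in range(2)]
               for i in range(2)]
    return R, R_tilde

def apply_kernel(m, K):
    """Row vector times matrix: the measure m mapped through the kernel K."""
    return [sum(m[i] * K[i][j] for i in range(len(m))) for j in range(len(K[0]))]

def step(mu, mu_t, theta, y_new):
    """One step of the filter and of its derivative, following (3) and (4)."""
    R, R_tilde = model(theta, y_new)
    mu_R = apply_kernel(mu, R)
    norm = sum(mu_R)
    F = [v / norm for v in mu_R]
    a = apply_kernel(mu_t, R)        # mu_tilde R
    b = apply_kernel(mu, R_tilde)    # mu R_tilde
    F_t = [(a[j] - sum(a) * F[j] + b[j] - sum(b) * F[j]) / norm
           for j in range(len(F))]
    return F, F_t

def filter_and_derivative(theta, ys, mu0):
    """Run both recursions; mu0 does not depend on theta, so mu_tilde_0 = 0."""
    mu, mu_t = list(mu0), [0.0] * len(mu0)
    for y in ys[1:]:
        mu, mu_t = step(mu, mu_t, theta, y)
    return mu, mu_t
```

A central difference `(F(theta + h) - F(theta - h)) / (2 * h)` of the first output should agree with the second output up to discretization error, and the derivative has zero total mass because the filter is a probability measure for every parameter value.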
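6. Proof of Theorems 3.1 and 3.2

Let $d(\cdot,\cdot)$ be the Hilbert projective distance between measures from $M^p$, i.e.,

$$d(\mu,\mu') = \sup_{\substack{B,B'\in\mathcal{B}^p \\ \mu(B),\,\mu(B')>0}} \log\frac{\mu' I_B\;\mu I_{B'}}{\mu I_B\;\mu' I_{B'}}$$

if there exists a constant $\varepsilon\in(0,1)$ (depending on $\mu,\mu'$) such that $\varepsilon\,\mu I_B\le\mu' I_B\le\varepsilon^{-1}\mu I_B$ for all $B\in\mathcal{B}^p$, and $d(\mu,\mu')=\infty$ otherwise. For $\theta\in\Theta$, $y,y'\in\mathbb{R}^q$, let

$$\tilde a_\theta(y,y') = 2\log^{-1}3\;\varepsilon_\theta^{-4}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde b_\theta(y,y') = 4\log^{-1}3\;\varepsilon_\theta^{-6}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde d'_\theta(y,y') = 10\log^{-1}3\;\varepsilon_\theta^{-2}(y,y')\,\tau_\theta^{-1}(y,y'),$$

$$\tilde d''_\theta(y,y') = \varepsilon_\theta^{-4}(y,y')\,\tilde\varepsilon_\theta^{-1}(y,y')\,\tau_\theta^{-1}(y,y').$$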
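For measures on a finite set, the Hilbert projective distance used in this section reduces to the logarithm of the ratio between the largest and the smallest componentwise quotient. The Python sketch below illustrates this special case; the function names are hypothetical, and the finite-state restriction is an assumption made only for the illustration.

```python
import math

def hilbert_distance(mu, nu):
    """Hilbert projective distance between two non-negative weight vectors.

    Returns math.inf when the two vectors do not have the same support,
    i.e. when they are not comparable in the sense of the definition."""
    ratios = []
    for a, b in zip(mu, nu):
        if a == 0.0 and b == 0.0:
            continue                 # a point charged by neither measure
        if a == 0.0 or b == 0.0:
            return math.inf          # supports differ
        ratios.append(b / a)
    if not ratios:
        return 0.0
    return math.log(max(ratios) / min(ratios))

def tv_norm(mu, nu):
    """Total variation norm of mu - nu (sum of absolute differences)."""
    return sum(abs(a - b) for a, b in zip(mu, nu))

def apply_kernel(mu, K):
    """Row vector times matrix: the measure mu mapped through the kernel K."""
    return [sum(mu[i] * K[i][j] for i in range(len(mu))) for j in range(len(K[0]))]
```

The distance is projective: rescaling either argument leaves it unchanged, which is why it can be applied directly to unnormalized filter updates. Comparing `tv_norm(mu, nu)` with `2 / math.log(3) * hilbert_distance(mu, nu)` for probability vectors gives a quick numerical check of the first inequality of Lemma 6.1.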
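Moreover, for $\theta\in\Theta$, $n\ge 1$, and a sequence $\{y_k\}_{k\ge 0}$, let

$$\tilde a_\theta^n(y^n) = \tilde a_\theta(y_0,y_1)\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i),$$

$$\tilde b_\theta^n(y^n) = \tilde b_\theta(y_0,y_1)\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i),$$

$$\tilde c_\theta^n(y^n) = 2\,\varepsilon_\theta^{-2}(y_{n-1},y_n)\,\tilde\varepsilon_\theta^{-1}(y_{n-1},y_n),$$

$$\tilde d_\theta^n(y^n) = \tilde d'_\theta(y_0,y_1)\,\tilde d''_\theta(y_{n-1},y_n)\prod_{i=1}^{n}\tau_\theta(y_{i-1},y_i).$$

For $\theta\in\Theta$ and a sequence $\{y_k\}_{k\ge 0}$, let $\{R_\theta^n(y^n)\}_{n\ge 0}$ be kernels defined as $\mu R_\theta^0(y^0)I_B = \mu I_B$ and

$$\mu R_\theta^n(y^n) I_B = \mu R_\theta^{n-1}(y^{n-1})\,R_\theta(y_{n-1},y_n)\,I_B$$

for $\mu\in M^p$, $B\in\mathcal{B}^p$, $n\ge 1$ (notice that $R_\theta^n(y^n)$ maps $M^p$ into $M^p$ for $n\ge 0$). Moreover, for $\theta\in\Theta$, $\mu\in M^p$, $\tilde\mu\in\tilde{M}^p$ and a sequence $\{y_k\}_{k\ge 0}$ from $\mathbb{R}^q$, let $\{\tilde G_\theta^n(\mu,\tilde\mu,y^n)\}_{n\ge 0}$ and $\{\tilde H_\theta^n(\mu,y^n)\}_{n\ge 1}$ be measures from $\tilde{M}^p$ defined as $\tilde G_\theta^0(\mu,\tilde\mu,y^0)=\tilde\mu$ and

$$\tilde G_\theta^n(\mu,\tilde\mu,y^n) I_B = (\mu R_\theta^n(y^n) I)^{-1}\big(\tilde\mu R_\theta^n(y^n) - (\tilde\mu R_\theta^n(y^n) I)\,F_\theta^n(\mu,y^n)\big) I_B,$$

$$\begin{aligned}
\tilde H_\theta^n(\mu,y^n) I_B ={}& \big(F_\theta^{n-1}(\mu,y^{n-1}) R_\theta(y_{n-1},y_n) I\big)^{-1} F_\theta^{n-1}(\mu,y^{n-1})\,\tilde R_\theta(y_{n-1},y_n) I_B \\
&- \big(F_\theta^{n-1}(\mu,y^{n-1}) R_\theta(y_{n-1},y_n) I\big)^{-1}\big(F_\theta^{n-1}(\mu,y^{n-1})\,\tilde R_\theta(y_{n-1},y_n) I\big)\,F_\theta^n(\mu,y^n) I_B
\end{aligned}$$

for $B\in\mathcal{B}^p$, $n\ge 1$.

Lemma 6.1. Let (A3.1) hold. Then, for all $\mu,\mu'\in M_0^p$,

$$\|\mu-\mu'\| \le 2\log^{-1}3\;d(\mu,\mu'), \tag{23}$$

$$d\big(\mu R_\theta(y,y'),\,\mu' R_\theta(y,y')\big) \le \varepsilon_\theta^{-2}(y,y')\,\|\mu-\mu'\|, \tag{24}$$

$$d\big(\mu R_\theta(y,y'),\,\mu' R_\theta(y,y')\big) \le \tau_\theta(y,y')\,d(\mu,\mu'). \tag{25}$$

Inequality (23) is proved in [2], while inequalities (24) and (25) are proved in [9].

Proof of Theorem 3.1. Let $\mu,\mu'\in M_0^p$, while $\mu_n=F_\theta^n(\mu,y^n)$, $\mu'_n=F_\theta^n(\mu',y^n)$, $\varepsilon_n=\varepsilon_\theta(y_{n-1},y_n)$, $\tau_n=\tau_\theta(y_{n-1},y_n)$ for $n\ge 1$. Then, in order to prove the theorem's assertion, it is sufficient to show that for $n\ge 1$,

$$\|\mu_n-\mu'_n\| \le 2\log^{-1}3\;\varepsilon_1^{-2}\tau_1^{-1}\left(\prod_{i=1}^{n}\tau_i\right)\|\mu-\mu'\|.$$

It can easily be deduced from Lemma 6.1 that for $n\ge 0$,

$$\|\mu_n-\mu'_n\| \le 2\log^{-1}3\;d(\mu_n,\mu'_n),$$

$$d(\mu_{n+1},\mu'_{n+1}) \le \tau_{n+1}\,d(\mu_n,\mu'_n),$$

$$d(\mu_{n+1},\mu'_{n+1}) \le \varepsilon_{n+1}^{-2}\,\|\mu_n-\mu'_n\|.$$

Therefore,

$$\|\mu_1-\mu'_1\| \le 2\log^{-1}3\;d(\mu_1,\mu'_1) \le 2\log^{-1}3\;\varepsilon_1^{-2}\,\|\mu-\mu'\|,$$

$$\|\mu_n-\mu'_n\| \le 2\log^{-1}3\;d(\mu_n,\mu'_n) \le 2\log^{-1}3\left(\prod_{i=2}^{n}\tau_i\right)d(\mu_1,\mu'_1) \le 2\log^{-1}3\;\varepsilon_1^{-2}\tau_1^{-1}\left(\prod_{i=1}^{n}\tau_i\right)\|\mu-\mu'\|$$

for $n\ge 2$. This completes the proof. $\square$

Lemma 6.2. Let (A3.1) and (A3.2) hold, while $\theta\in\Theta$. Then, for all $y,y'\in\mathbb{R}^q$, $\tilde\mu\in\tilde{M}^p$,

$$\|\tilde\mu\tilde R_\theta(y,y')\| \le \varepsilon_\theta^{-1}(y,y')\,\tilde\varepsilon_\theta^{-1}(y,y')\,(\nu_\theta(y,y') I)\,\|\tilde\mu\|.$$

Proof. Let $\tilde\mu\in\tilde{M}^p$, while $\mu^+$ and $\mu^-$ are the positive and negative part of $\tilde\mu$. Then, it can easily be deduced from (A3.1) and (A3.2) that for all $y,y'\in\mathbb{R}^q$, $B\in\mathcal{B}^p$,

$$\mu^{\pm} R_\theta(y,y') I_B \le \varepsilon_\theta^{-1}(y,y')\,\mu^{\pm}\nu_\theta(y,y') I_B = \varepsilon_\theta^{-1}(y,y')\,\|\mu^{\pm}\|\,\nu_\theta(y,y') I_B,$$

$$|\tilde\mu\tilde R_\theta(y,y') I_B| \le \|\mu^+\tilde R_\theta(y,y')\| + \|\mu^-\tilde R_\theta(y,y')\| \le \tilde\varepsilon_\theta^{-1}(y,y')\,\mu^+ R_\theta(y,y') I + \tilde\varepsilon_\theta^{-1}(y,y')\,\mu^- R_\theta(y,y') I.$$

Consequently,

$$\|\tilde\mu\tilde R_\theta(y,y')\| \le \varepsilon_\theta^{-1}(y,y')\,\tilde\varepsilon_\theta^{-1}(y,y')\,(\nu_\theta(y,y') I)\,(\|\mu^+\|+\|\mu^-\|) = \varepsilon_\theta^{-1}(y,y')\,\tilde\varepsilon_\theta^{-1}(y,y')\,(\nu_\theta(y,y') I)\,\|\tilde\mu\|$$

for all $y,y'\in\mathbb{R}^q$. This completes the proof. $\square$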
Lemma 6.3. Let $\theta\in\Theta$, while $\{y_k\}_{k\ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu\in M^p$, $B\in\mathcal{B}^p$, $n\ge 1$,

$$F_\theta^n(\mu,y^n) I_B = (\mu R_\theta^n(y^n) I)^{-1}\,\mu R_\theta^n(y^n) I_B.$$

Proof. Let $\mu\in M^p$, while $U_n=R_\theta(y_{n-1},y_n)$, $V_n=R_\theta^n(y^n)$ for $n\ge 1$. Moreover, let $\{\mu_n\}_{n\ge 1}$ be measures from $M^p$ defined as $\mu_n I_B=(\mu V_n I)^{-1}\mu V_n I_B$ for $B\in\mathcal{B}^p$, $n\ge 1$. Then, in order to prove the lemma's assertion, it is sufficient to show that $\mu_n=F_\theta^n(\mu,y^n)$ for $n\ge 1$.

It is straightforward to verify that $\mu_1=F_\theta^1(\mu,y^1)$, as well as

$$F_\theta(\mu_n,y_n,y_{n+1}) I_B = (\mu_n U_{n+1} I)^{-1}\,\mu_n U_{n+1} I_B,$$

$$\mu_n U_{n+1} I_B = (\mu V_n I)^{-1}\,\mu V_{n+1} I_B$$

for $B\in\mathcal{B}^p$, $n\ge 1$. Therefore,

$$F_\theta(\mu_n,y_n,y_{n+1}) I_B = (\mu_n U_{n+1} I)^{-1}(\mu V_n I)^{-1}\,\mu V_{n+1} I_B = (\mu V_{n+1} I)^{-1}\,\mu V_{n+1} I_B = \mu_{n+1} I_B$$

for $B\in\mathcal{B}^p$, $n\ge 1$ (notice that $(\mu V_{n+1} I)^{-1}=(\mu_n U_{n+1} I)^{-1}(\mu V_n I)^{-1}$ for $n\ge 1$). Then, using mathematical induction, it can easily be deduced that $\mu_n=F_\theta^n(\mu,y^n)$ for $n\ge 2$. This completes the proof. $\square$

Lemma 6.4. Let $\theta\in\Theta$, while $\{y_k\}_{k\ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, $n\ge 0$,

$$\tilde F_\theta^n(\mu,\tilde\mu,y^n) = \tilde G_\theta^n(\mu,\tilde\mu,y^n) + \sum_{i=1}^{n}\tilde G_\theta^{n-i}\big(F_\theta^i(\mu,y^i),\,\tilde H_\theta^i(\mu,y^i),\,y_i^n\big).$$

Proof. Let $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, while $\mu_n=F_\theta^n(\mu,y^n)$ for $n\ge 0$, and $U_n=R_\theta(y_{n-1},y_n)$, $\tilde U_n=\tilde R_\theta(y_{n-1},y_n)$ for $n\ge 1$. Moreover, let $\tilde\nu_0=\tilde\mu$ and $\tilde\nu_n=\tilde H_\theta^n(\mu,y^n)$ for $n\ge 1$, while $\tilde\lambda_{i,n}=\tilde G_\theta^{n-i}(\mu_i,\tilde\nu_i,y_i^n)$, $V_{i+1,n}=R_\theta^{n-i}(y_i^n)$ for $0\le i\le n$. Furthermore, let $\tilde\mu_n=\sum_{i=0}^{n}\tilde\lambda_{i,n}$ for $n\ge 0$. Then, in order to prove the lemma's assertion, it is sufficient to show that $\tilde\mu_n=\tilde F_\theta^n(\mu,\tilde\mu,y^n)$ for $n\ge 0$.

It is straightforward to show that $\tilde\mu_0=\tilde F_\theta^0(\mu,\tilde\mu,y^0)$, as well as

$$\mu_n I_B = (\mu_i V_{i+1,n} I)^{-1}\,\mu_i V_{i+1,n} I_B, \tag{26}$$

$$\tilde\lambda_{i,n} I_B = (\mu_i V_{i+1,n} I)^{-1}\big(\tilde\nu_i V_{i+1,n} - (\tilde\nu_i V_{i+1,n} I)\,\mu_n\big) I_B \tag{27}$$

for $B\in\mathcal{B}^p$, $0\le i<n$ (in order to get (26), notice that $\mu_n=F_\theta^{n-i}(\mu_i,y_i^n)$ for $0\le i\le n$, and apply Lemma 6.3), and

$$\tilde\nu_n I_B = (\mu_{n-1} U_n I)^{-1}\big(\mu_{n-1}\tilde U_n - (\mu_{n-1}\tilde U_n I)\,\mu_n\big) I_B,$$

$$\tilde F_\theta(\mu_n,\tilde\mu_n,y_n,y_{n+1}) I_B = \sum_{i=0}^{n}(\mu_n U_{n+1} I)^{-1}\big(\tilde\lambda_{i,n} U_{n+1} - (\tilde\lambda_{i,n} U_{n+1} I)\,\mu_{n+1}\big) I_B + \tilde\nu_{n+1} I_B \tag{28}$$

for all $B\in\mathcal{B}^p$, $n\ge 1$. Therefore,

$$\mu_n U_{n+1} I_B = (\mu_i V_{i+1,n} I)^{-1}\,\mu_i V_{i+1,n+1} I_B, \tag{29}$$

$$\tilde\lambda_{i,n} U_{n+1} I_B = (\mu_i V_{i+1,n} I)^{-1}\big(\tilde\nu_i V_{i+1,n+1} - (\tilde\nu_i V_{i+1,n} I)\,\mu_n U_{n+1}\big) I_B \tag{30}$$

for all $B\in\mathcal{B}^p$, $0\le i\le n$, while (26), (29) and (30) imply

$$(\mu_i V_{i+1,n+1} I)^{-1} = (\mu_n U_{n+1} I)^{-1}(\mu_i V_{i+1,n} I)^{-1},$$

$$(\tilde\lambda_{i,n} U_{n+1} I)\,\mu_{n+1} I_B = (\mu_i V_{i+1,n} I)^{-1}\big((\tilde\nu_i V_{i+1,n+1} I)\,\mu_{n+1} - (\tilde\nu_i V_{i+1,n} I)\,\mu_n U_{n+1}\big) I_B$$

for all $B\in\mathcal{B}^p$, $0\le i\le n$. Consequently,

$$(\mu_n U_{n+1} I)^{-1}\big(\tilde\lambda_{i,n} U_{n+1} - (\tilde\lambda_{i,n} U_{n+1} I)\,\mu_{n+1}\big) I_B = \tilde\lambda_{i,n+1} I_B \tag{31}$$

for all $B\in\mathcal{B}^p$, $0\le i\le n$. Due to (28) and (31),

$$\tilde F_\theta(\mu_n,\tilde\mu_n,y_n,y_{n+1}) = \sum_{i=0}^{n}\tilde\lambda_{i,n+1} + \tilde\nu_{n+1} = \tilde\mu_{n+1}$$

for $n\ge 0$. Then, using mathematical induction, it can easily be deduced that $\tilde\mu_n=\tilde F_\theta^n(\mu,\tilde\mu,y^n)$ for $n\ge 1$. This completes the proof. $\square$

Lemma 6.5. Let (A3.1) hold, while $\theta\in\Theta$ and $\{y_k\}_{k\ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, $n\ge 1$,

$$\varepsilon_\theta(y_0,y_1)\big(\nu_\theta(y_0,y_1) R_\theta^{n-1}(y_1^n) I\big) \le \mu R_\theta^n(y^n) I,$$

$$|\tilde\mu R_\theta^n(y^n) I| \le \varepsilon_\theta^{-1}(y_0,y_1)\big(\nu_\theta(y_0,y_1) R_\theta^{n-1}(y_1^n) I\big)\,\|\tilde\mu\|.$$

Proof. Let $\mu\in M_0^p$, $\tilde\mu\in\tilde{M}^p$, while $\mu^+$ and $\mu^-$ are the positive and negative part of $\tilde\mu$ (respectively). Moreover, let $\varepsilon_1=\varepsilon_\theta(y_0,y_1)$, $\nu_1=\nu_\theta(y_0,y_1)$, $U_1=R_\theta(y_0,y_1)$ and $V_{i+1,n}=R_\theta^{n-i}(y_i^n)$ for $0\le i\le n$. Then, in order to prove the lemma's assertion, it is sufficient to show that for $n\ge 1$,

$$\varepsilon_1(\nu_1 V_{2,n} I) \le \mu V_{1,n} I,$$

$$|\tilde\mu V_{1,n} I| \le \varepsilon_1^{-1}(\nu_1 V_{2,n} I)\,\|\tilde\mu\|.$$

It can easily be deduced from (A3.1) that for $n\ge 1$,

$$\mu V_{1,n} I = \mu U_1 V_{2,n} I \ge \varepsilon_1(\mu\nu_1 V_{2,n} I) = \varepsilon_1(\nu_1 V_{2,n} I),$$

$$\mu^{\pm} V_{1,n} I = \mu^{\pm} U_1 V_{2,n} I \le \varepsilon_1^{-1}(\mu^{\pm}\nu_1 V_{2,n} I) = \varepsilon_1^{-1}(\nu_1 V_{2,n} I)\,\|\mu^{\pm}\|. \tag{32}$$

Since $-\mu^- V_{1,n} I\le\tilde\mu V_{1,n} I\le\mu^+ V_{1,n} I$ for $n\ge 1$, (32) implies

$$|\tilde\mu V_{1,n} I| \le \max\{\mu^+ V_{1,n} I,\;\mu^- V_{1,n} I\} \le \varepsilon_1^{-1}(\nu_1 V_{2,n} I)\max\{\|\mu^+\|,\|\mu^-\|\} \le \varepsilon_1^{-1}(\nu_1 V_{2,n} I)\,\|\tilde\mu\|$$

for $n\ge 1$. This completes the proof. $\square$
Lemma 6.6. Let (A3.1) hold, while y 2 Y and fyk gkX0 is a sequence from Rq . Then, for ~ p , nX1, all m 2 M p0 , m~ 2 M n
~ yn Þkpa~ ny ðyn Þkmk. ~ kG~ y ðm; m; p p ~ , while mþ , m are the positive and negative part of m~ Proof. Let m 2 M 0 , m~ 2 M p þ þ (respectively). If km k ¼ 0, let fmþ n gnX0 be measures from M satisfying kmn k ¼ 0 p for nX0, while in the case kmþ k40, fmþ n gnX0 are measures from M 0 defined as
ARTICLE IN PRESS 1422
V.B. Tadic´, A. Doucet / Stochastic Processes and their Applications 115 (2005) 1408–1436
p n þ n þ 1 þ þ mþ 0 I B ¼ ðm IÞ m I B for B 2 B , and mn ¼ F y ðm0 ; y Þ for nX1. Similarly, if p km k ¼ 0, let fm n gnX0 be measures from M satisfying kmn k ¼ 0 for nX0, while in p 1 the case km k40, fmn gnX0 are measures from M 0 defined as m 0 I B ¼ ðm IÞ m I B for p n n B 2 B , and mn ¼ F y ðm0 ; y Þ for nX1. Moreover, let 1 ¼ y ðy0 ; y1 Þ, while n ~ yn Þ for nX0. Furthermore, tn ¼ ty ðyn1 ; yn Þ for nX1, mn ¼ F ny ðm; yn Þ, m~ n ¼ F~ y ðm; m; ni n and V iþ1;n ¼ Ry ðyi Þ for 1pipn. Then, in order to prove the lemma’s assertion, it is sufficient to show that for nX1, ! n Y 1 4 1 ~ km~ n kp2 log 31 t1 ti kmk. i¼1
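It is straightforward to verify that for all $B \in \mathcal{B}^p$, $n \ge 1$,
\[
\tilde\mu I_B = (\mu^+ I) \mu_0^+ I_B - (\mu^- I) \mu_0^- I_B, \tag{33}
\]
\[
(\mu_0^{\pm} V_{1,n} I) \mu_n^{\pm} I_B = \mu_0^{\pm} V_{1,n} I_B, \tag{34}
\]
\[
\tilde\mu_n I_B = (\mu V_{1,n} I)^{-1} \big( \tilde\mu V_{1,n} - (\tilde\mu V_{1,n} I) \mu_n \big) I_B, \tag{35}
\]
while Theorem 3.1 and Lemma 6.5 imply
\[
\max\{\mu_0^+ V_{1,n} I, \mu_0^- V_{1,n} I\} \le \varepsilon_1^{-1} (\nu_1 V_{2,n} I), \tag{36}
\]
\[
\mu V_{1,n} I \ge \varepsilon_1 (\nu_1 V_{2,n} I), \tag{37}
\]
\[
(\mu^{\pm} I) \|\mu_n^{\pm} - \mu_n\| \le 2 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) (\mu^{\pm} I) \|\mu_0^{\pm} - \mu\| \tag{38}
\]
for $n \ge 1$. Due to (33)–(35),
\[
\tilde\mu_n I_B = (\mu V_{1,n} I)^{-1} (\mu_0^+ V_{1,n} I)(\mu^+ I)(\mu_n^+ - \mu_n) I_B - (\mu V_{1,n} I)^{-1} (\mu_0^- V_{1,n} I)(\mu^- I)(\mu_n^- - \mu_n) I_B \tag{39}
\]
for all $B \in \mathcal{B}^p$, $n \ge 1$, while (36), (37) and (39) yield
\[
(\mu V_{1,n} I)^{-1} \max\{\mu_0^+ V_{1,n} I, \mu_0^- V_{1,n} I\} \le \varepsilon_1^{-2}, \tag{40}
\]
\[
\|\tilde\mu_n\| \le (\mu V_{1,n} I)^{-1} (\mu_0^+ V_{1,n} I)(\mu^+ I) \|\mu_n^+ - \mu_n\| + (\mu V_{1,n} I)^{-1} (\mu_0^- V_{1,n} I)(\mu^- I) \|\mu_n^- - \mu_n\| \tag{41}
\]
for $n \ge 1$. Since
\[
(\mu^+ I) \|\mu_0^+ - \mu\| + (\mu^- I) \|\mu_0^- - \mu\| \le (\mu^+ I) \|\mu_0^+\| + (\mu^- I) \|\mu_0^-\| \le \|\mu^+\| + \|\mu^-\|
\]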
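(notice that $\|\mu_0^{\pm} - \mu\| \le \|\mu_0^{\pm}\| \le 1$), it can easily be deduced from (38), (40) and (41) that for $n \ge 1$,
\[
\|\tilde\mu_n\| \le 2 \log^{-1} 3 \; \varepsilon_1^{-4} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \big( (\mu^+ I) \|\mu_0^+ - \mu\| + (\mu^- I) \|\mu_0^- - \mu\| \big)
\le 2 \log^{-1} 3 \; \varepsilon_1^{-4} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\tilde\mu\|.
\]
This completes the proof. □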
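The forgetting bound of Theorem 3.1, used as (38) in the proof above, can be illustrated on a finite state space. The sketch below is only an illustration under stated assumptions: the kernels are hypothetical random positive matrices playing the role of $R_\theta(y_{k-1}, y_k)$, the norm is taken to be the sum of absolute differences, and the constants $\varepsilon_\theta$, $\tau_\theta$ are not computed. It tracks the distance between two filters started from different point masses and driven by the same kernels.

```python
import random

def filter_step(mu, U):
    """One normalized update mu -> mu U / (mu U 1) on a finite state space."""
    d = len(mu)
    unnorm = [sum(mu[i] * U[i][j] for i in range(d)) for j in range(d)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def forgetting_distances(d=3, n=30, seed=1):
    """Distances between two filters driven by the same hypothetical kernels."""
    rng = random.Random(seed)
    mu = [1.0] + [0.0] * (d - 1)         # point mass at the first state
    mu_prime = [0.0] * (d - 1) + [1.0]   # point mass at the last state
    dists = [sum(abs(a - b) for a, b in zip(mu, mu_prime))]
    for _ in range(n):
        # Entries bounded away from zero, so every kernel is mixing.
        U = [[rng.uniform(0.2, 1.0) for _ in range(d)] for _ in range(d)]
        mu, mu_prime = filter_step(mu, U), filter_step(mu_prime, U)
        dists.append(sum(abs(a - b) for a, b in zip(mu, mu_prime)))
    return dists
```

Calling `forgetting_distances()` returns a list whose first entry is 2.0 and whose later entries shrink roughly geometrically, which is the qualitative behavior that the product $\prod_i \tau_i$ quantifies.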
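Lemma 6.7. Let (A3.1) hold, while $\theta \in \Theta$ and $\{y_k\}_{k \ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu, \mu' \in \mathcal{M}_0^p$, $\tilde\mu, \tilde\mu' \in \tilde{\mathcal{M}}^p$, $n \ge 1$,
\[
\|\tilde G_\theta^n(\mu, \tilde\mu, y^n) - \tilde G_\theta^n(\mu, \tilde\mu', y^n)\| \le \tilde\alpha_\theta^n(y^n) \|\tilde\mu - \tilde\mu'\|, \tag{42}
\]
\[
\|\tilde G_\theta^n(\mu, \tilde\mu, y^n) - \tilde G_\theta^n(\mu', \tilde\mu, y^n)\| \le \tilde\beta_\theta^n(y^n) \|\mu - \mu'\| \|\tilde\mu\|. \tag{43}
\]
Proof. Relation (42) is a direct consequence of Lemma 6.6 and the definition of $\{\tilde G_\theta^n(\cdot, \cdot, \cdot)\}_{n \ge 1}$. Let $\mu, \mu' \in \mathcal{M}_0^p$, $\tilde\mu \in \tilde{\mathcal{M}}^p$, while $\varepsilon_1 = \varepsilon_\theta(y_0, y_1)$, $\nu_1 = \nu_\theta(y_0, y_1)$. Moreover, let $\mu_n = F_\theta^n(\mu, y^n)$, $\mu_n' = F_\theta^n(\mu', y^n)$, $\tilde\mu_n = \tilde G_\theta^n(\mu, \tilde\mu, y^n)$, $\tilde\mu_n' = \tilde G_\theta^n(\mu', \tilde\mu, y^n)$ for $n \ge 0$, and $V_{i+1,n} = R_\theta^{n-i}(y_i^n)$ for $0 \le i \le n$. Then, in order to prove (43), it is sufficient to show that for $n \ge 1$,
\[
\|\tilde\mu_n - \tilde\mu_n'\| \le 4 \log^{-1} 3 \; \varepsilon_1^{-6} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\| \|\tilde\mu\|. \tag{44}
\]
It is straightforward to verify that for all $B \in \mathcal{B}^p$, $n \ge 1$,
\[
\tilde\mu_n I_B = (\mu V_{1,n} I)^{-1} \big( \tilde\mu V_{1,n} - (\tilde\mu V_{1,n} I) \mu_n \big) I_B, \tag{45}
\]
\[
\tilde\mu_n' I_B = (\mu' V_{1,n} I)^{-1} \big( \tilde\mu V_{1,n} - (\tilde\mu V_{1,n} I) \mu_n' \big) I_B, \tag{46}
\]
while Theorem 3.1 and Lemmas 6.5, 6.6 imply
\[
\|\mu_n - \mu_n'\| \le 2 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\|, \tag{47}
\]
\[
\mu V_{1,n} I \ge \varepsilon_1 (\nu_1 V_{2,n} I), \tag{48}
\]
\[
|\tilde\mu V_{1,n} I| \le \varepsilon_1^{-1} (\nu_1 V_{2,n} I) \|\tilde\mu\|, \tag{49}
\]
\[
|(\mu - \mu') V_{1,n} I| \le \varepsilon_1^{-1} (\nu_1 V_{2,n} I) \|\mu - \mu'\|, \tag{50}
\]
\[
\|\tilde\mu_n'\| \le 2 \log^{-1} 3 \; \varepsilon_1^{-4} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\tilde\mu\| \tag{51}
\]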
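for $n \ge 1$. Due to (45) and (46),
\[
(\tilde\mu_n - \tilde\mu_n') I_B = -(\mu V_{1,n} I)^{-1} (\tilde\mu V_{1,n} I)(\mu_n - \mu_n') I_B - (\mu V_{1,n} I)^{-1} \big( (\mu - \mu') V_{1,n} I \big) \tilde\mu_n' I_B
\]
for all $B \in \mathcal{B}^p$, $n \ge 1$, while (48)–(50) yield
\[
(\mu V_{1,n} I)^{-1} |\tilde\mu V_{1,n} I| \le \varepsilon_1^{-2} \|\tilde\mu\|, \qquad
(\mu V_{1,n} I)^{-1} |(\mu - \mu') V_{1,n} I| \le \varepsilon_1^{-2} \|\mu - \mu'\|
\]
for $n \ge 1$. Therefore,
\[
\|\tilde\mu_n - \tilde\mu_n'\| \le (\mu V_{1,n} I)^{-1} |\tilde\mu V_{1,n} I| \, \|\mu_n - \mu_n'\| + (\mu V_{1,n} I)^{-1} |(\mu - \mu') V_{1,n} I| \, \|\tilde\mu_n'\|
\le \varepsilon_1^{-2} \|\mu_n - \mu_n'\| \|\tilde\mu\| + \varepsilon_1^{-2} \|\mu - \mu'\| \|\tilde\mu_n'\|
\]
for $n \ge 1$. Then, (47) and (51) imply
\[
\begin{aligned}
\|\tilde\mu_n - \tilde\mu_n'\|
&\le 2 \log^{-1} 3 \; \varepsilon_1^{-4} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\| \|\tilde\mu\|
 + 2 \log^{-1} 3 \; \varepsilon_1^{-6} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\| \|\tilde\mu\| \\
&\le 4 \log^{-1} 3 \; \varepsilon_1^{-6} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\| \|\tilde\mu\|
\end{aligned}
\]
for $n \ge 1$. This completes the proof. □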
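Lemma 6.8. Let (A3.1) and (A3.2) hold, while $\theta \in \Theta$ and $\{y_k\}_{k \ge 0}$ is a sequence from $\mathbb{R}^q$. Then, for all $\mu, \mu' \in \mathcal{M}_0^p$, $n \ge 1$,
\[
\|\tilde H_\theta^n(\mu, y^n)\| \le \tilde\psi_\theta^n(y^n),
\]
\[
\|\tilde H_\theta^n(\mu, y^n) - \tilde H_\theta^n(\mu', y^n)\| \le \tilde\delta_\theta^n(y^n) \|\mu - \mu'\|.
\]
Proof. Let $\mu, \mu' \in \mathcal{M}_0^p$, while $\mu_n = F_\theta^n(\mu, y^n)$, $\mu_n' = F_\theta^n(\mu', y^n)$ for $n \ge 0$, and $\tilde\mu_n = \tilde H_\theta^n(\mu, y^n)$, $\tilde\mu_n' = \tilde H_\theta^n(\mu', y^n)$, $\varepsilon_n = \varepsilon_\theta(y_{n-1}, y_n)$, $\tilde\varepsilon_n = \tilde\varepsilon_\theta(y_{n-1}, y_n)$, $\tau_n = \tau_\theta(y_{n-1}, y_n)$, $\nu_n = \nu_\theta(y_{n-1}, y_n)$, $U_n = R_\theta(y_{n-1}, y_n)$, $\tilde U_n = \tilde R_\theta(y_{n-1}, y_n)$ for $n \ge 1$. Then, in order to prove the lemma's assertion, it is sufficient to show that for $n \ge 1$,
\[
\|\tilde\mu_n\| \le 2 \varepsilon_n^{-2} \tilde\varepsilon_n^{-1},
\]
\[
\|\tilde\mu_n - \tilde\mu_n'\| \le 10 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \varepsilon_n^{-4} \tilde\varepsilon_n^{-1} \tau_n^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\|.
\]
It is straightforward to verify that for all $B \in \mathcal{B}^p$, $n \ge 1$,
\[
\tilde\mu_n I_B = (\mu_{n-1} U_n I)^{-1} \big( \mu_{n-1} \tilde U_n - (\mu_{n-1} \tilde U_n I) \mu_n \big) I_B, \tag{52}
\]
\[
\tilde\mu_n' I_B = (\mu_{n-1}' U_n I)^{-1} \big( \mu_{n-1}' \tilde U_n - (\mu_{n-1}' \tilde U_n I) \mu_n' \big) I_B. \tag{53}
\]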
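On the other hand, it can easily be deduced from Theorem 3.1 and Lemmas 6.2, 6.5 that for $n \ge 1$,
\[
\|\mu_n - \mu_n'\| \le 2 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \Big( \prod_{i=1}^n \tau_i \Big) \|\mu - \mu'\|, \tag{54}
\]
\[
\min\{\mu_{n-1} U_n I, \mu_{n-1}' U_n I\} \ge \varepsilon_n (\nu_n I), \tag{55}
\]
\[
|(\mu_{n-1} - \mu_{n-1}') U_n I| \le \|(\mu_{n-1} - \mu_{n-1}') U_n\| \le \varepsilon_n^{-1} (\nu_n I) \|\mu_{n-1} - \mu_{n-1}'\|, \tag{56}
\]
\[
\max\{|\mu_{n-1} \tilde U_n I|, |\mu_{n-1}' \tilde U_n I|\} \le \varepsilon_n^{-1} \tilde\varepsilon_n^{-1} (\nu_n I), \tag{57}
\]
\[
|(\mu_{n-1} - \mu_{n-1}') \tilde U_n I| \le \|(\mu_{n-1} - \mu_{n-1}') \tilde U_n\| \le \varepsilon_n^{-1} \tilde\varepsilon_n^{-1} (\nu_n I) \|\mu_{n-1} - \mu_{n-1}'\|. \tag{58}
\]
Due to (52) and (53),
\[
\begin{aligned}
(\tilde\mu_n - \tilde\mu_n') I_B
= {} & -(\mu_{n-1} U_n I)^{-1} (\mu_{n-1}' U_n I)^{-1} \big( (\mu_{n-1} - \mu_{n-1}') U_n I \big) \big( \mu_{n-1}' \tilde U_n - (\mu_{n-1}' \tilde U_n I) \mu_n' \big) I_B \\
& + (\mu_{n-1} U_n I)^{-1} \big( (\mu_{n-1} - \mu_{n-1}') \tilde U_n - ((\mu_{n-1} - \mu_{n-1}') \tilde U_n I) \mu_n \big) I_B \\
& - (\mu_{n-1} U_n I)^{-1} (\mu_{n-1}' \tilde U_n I)(\mu_n - \mu_n') I_B
\end{aligned}
\]
for all $B \in \mathcal{B}^p$, $n \ge 1$. Then, (52) and (55)–(58) imply
\[
\|\tilde\mu_n\| \le (\mu_{n-1} U_n I)^{-1} \big( \|\mu_{n-1} \tilde U_n\| + |\mu_{n-1} \tilde U_n I| \, \|\mu_n\| \big) \le 2 \varepsilon_n^{-2} \tilde\varepsilon_n^{-1},
\]
\[
\begin{aligned}
\|\tilde\mu_n - \tilde\mu_n'\|
\le {} & (\mu_{n-1} U_n I)^{-1} (\mu_{n-1}' U_n I)^{-1} |(\mu_{n-1} - \mu_{n-1}') U_n I| \big( \|\mu_{n-1}' \tilde U_n\| + |\mu_{n-1}' \tilde U_n I| \, \|\mu_n'\| \big) \\
& + (\mu_{n-1} U_n I)^{-1} \big( \|(\mu_{n-1} - \mu_{n-1}') \tilde U_n\| + |(\mu_{n-1} - \mu_{n-1}') \tilde U_n I| \, \|\mu_n\| \big) \\
& + (\mu_{n-1} U_n I)^{-1} |\mu_{n-1}' \tilde U_n I| \, \|\mu_n - \mu_n'\| \\
\le {} & 4 \varepsilon_n^{-4} \tilde\varepsilon_n^{-1} \|\mu_{n-1} - \mu_{n-1}'\| + \varepsilon_n^{-2} \tilde\varepsilon_n^{-1} \|\mu_n - \mu_n'\|
\end{aligned} \tag{59}
\]
for $n \ge 1$, while (54) and (59) yield
\[
\begin{aligned}
\|\tilde\mu_n - \tilde\mu_n'\|
&\le 8 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \varepsilon_n^{-4} \tilde\varepsilon_n^{-1} \Big( \prod_{i=1}^{n-1} \tau_i \Big) \|\mu - \mu'\|
 + 2 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \varepsilon_n^{-2} \tilde\varepsilon_n^{-1} \Big( \prod_{i=1}^{n} \tau_i \Big) \|\mu - \mu'\| \\
&\le 10 \log^{-1} 3 \; \varepsilon_1^{-2} \tau_1^{-1} \varepsilon_n^{-4} \tilde\varepsilon_n^{-1} \tau_n^{-1} \Big( \prod_{i=1}^{n} \tau_i \Big) \|\mu - \mu'\|
\end{aligned}
\]
for $n \ge 1$. This completes the proof. □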
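Proof of Theorem 3.2. Let $\mu, \mu' \in \mathcal{M}_0^p$, $\tilde\mu, \tilde\mu' \in \tilde{\mathcal{M}}^p$. Due to Lemma 6.4,
\[
\begin{aligned}
\tilde F_\theta^n(\mu, \tilde\mu, y^n) - \tilde F_\theta^n(\mu', \tilde\mu', y^n)
= {} & \big( \tilde G_\theta^n(\mu, \tilde\mu, y^n) - \tilde G_\theta^n(\mu', \tilde\mu, y^n) \big) + \big( \tilde G_\theta^n(\mu', \tilde\mu, y^n) - \tilde G_\theta^n(\mu', \tilde\mu', y^n) \big) \\
& + \sum_{i=1}^n \big( \tilde G_\theta^{n-i}(F_\theta^i(\mu, y^i), \tilde H_\theta^i(\mu, y^i), y_i^n) - \tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu, y^i), y_i^n) \big) \\
& + \sum_{i=1}^n \big( \tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu, y^i), y_i^n) - \tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu', y^i), y_i^n) \big)
\end{aligned} \tag{60}
\]
for $n \ge 1$, while Theorem 3.1 and Lemmas 6.7, 6.8 imply
\[
\|\tilde G_\theta^n(\mu, \tilde\mu, y^n) - \tilde G_\theta^n(\mu', \tilde\mu, y^n)\| \le \tilde\beta_\theta^n(y^n) \|\mu - \mu'\| \|\tilde\mu\|, \tag{61}
\]
\[
\|\tilde G_\theta^n(\mu', \tilde\mu, y^n) - \tilde G_\theta^n(\mu', \tilde\mu', y^n)\| \le \tilde\alpha_\theta^n(y^n) \|\tilde\mu - \tilde\mu'\|, \tag{62}
\]
\[
\tilde G_\theta^0(F_\theta^n(\mu, y^n), \tilde H_\theta^n(\mu, y^n), y_n^n) - \tilde G_\theta^0(F_\theta^n(\mu', y^n), \tilde H_\theta^n(\mu, y^n), y_n^n)
= \tilde H_\theta^n(\mu, y^n) - \tilde H_\theta^n(\mu, y^n) = 0, \tag{63}
\]
\[
\|\tilde G_\theta^0(F_\theta^n(\mu', y^n), \tilde H_\theta^n(\mu, y^n), y_n^n) - \tilde G_\theta^0(F_\theta^n(\mu', y^n), \tilde H_\theta^n(\mu', y^n), y_n^n)\|
= \|\tilde H_\theta^n(\mu, y^n) - \tilde H_\theta^n(\mu', y^n)\| \le \tilde\delta_\theta^n(y^n) \|\mu - \mu'\| \tag{64}
\]
for $n \ge 1$, and
\[
\begin{aligned}
& \|\tilde G_\theta^{n-i}(F_\theta^i(\mu, y^i), \tilde H_\theta^i(\mu, y^i), y_i^n) - \tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu, y^i), y_i^n)\| \\
& \quad \le \tilde\beta_\theta^{n-i}(y_i^n) \|F_\theta^i(\mu, y^i) - F_\theta^i(\mu', y^i)\| \, \|\tilde H_\theta^i(\mu, y^i)\|
\le \alpha_\theta^i(y^i) \tilde\psi_\theta^i(y^i) \tilde\beta_\theta^{n-i}(y_i^n) \|\mu - \mu'\|,
\end{aligned} \tag{65}
\]
\[
\begin{aligned}
& \|\tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu, y^i), y_i^n) - \tilde G_\theta^{n-i}(F_\theta^i(\mu', y^i), \tilde H_\theta^i(\mu', y^i), y_i^n)\| \\
& \quad \le \tilde\alpha_\theta^{n-i}(y_i^n) \|\tilde H_\theta^i(\mu, y^i) - \tilde H_\theta^i(\mu', y^i)\|
\le \tilde\delta_\theta^i(y^i) \tilde\alpha_\theta^{n-i}(y_i^n) \|\mu - \mu'\|
\end{aligned}
\]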
for 1pion. Since ~ ðy ; y Þc ~ bny ðyn Þp~ay ðy0 ; y1 Þf y 0 1 y ðy1 ; y2 Þ
n Y
ty ðyi1 ; yi Þ,
i¼1 n d~ y ðyn Þp~ay ðy0 ; y1 Þf~ y ðyn1 ; yn Þ
n Y j¼1
n a~ ny ðyn Þ ¼ b~ y ðyn Þ
ty ðyj1 ; yj Þ,
ð66Þ
for nX1, and ni
~ ðy ; y Þc~ ðy ; y Þ aiy ðyi Þ~ciy ðyi Þb~y ðyni Þp21 a~ y ðy0 ; y1 Þf y i1 i y i iþ1
n Y
ty ðyj1 ; yj Þ,
j¼1 i 1 n ~ ðy ; y Þc~ ðy ; y Þ ~ y ðy0 ; y1 Þf d~ y ðyi Þa~ ni y i1 i y i iþ1 y ðyi Þp2 a
n Y
ty ðyj1 ; yj Þ
j¼1
for $1 \le i < n$, it can easily be deduced from (60)–(66) that the theorem's assertion holds. □
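On a finite state space, the one-step versions of (45) and (52) give a recursion for the derivative of the filter with respect to a scalar parameter, which can be compared against a finite difference. The sketch below is an illustration under assumptions that are not in the paper: the kernel family $U(\theta)_{ij} = A_{ij} e^{\theta B_{ij}}$ with random matrices `A`, `B` is hypothetical, and the initial measure does not depend on $\theta$, so the initial derivative is zero.

```python
import math
import random

def kernel(theta, A, B):
    """Hypothetical family U(theta)[i][j] = A[i][j]*exp(theta*B[i][j]) and dU/dtheta."""
    d = len(A)
    U = [[A[i][j] * math.exp(theta * B[i][j]) for j in range(d)] for i in range(d)]
    dU = [[B[i][j] * U[i][j] for j in range(d)] for i in range(d)]
    return U, dU

def vec_mat(v, M):
    d = len(v)
    return [sum(v[i] * M[i][j] for i in range(d)) for j in range(d)]

def filter_and_tangent(theta, mu0, models):
    """Run the filter and its theta-derivative through the given (A, B) pairs."""
    mu = list(mu0)
    mu_t = [0.0] * len(mu0)              # derivative of a theta-free initial law
    for A, B in models:
        U, dU = kernel(theta, A, B)
        unnorm = vec_mat(mu, U)
        c = sum(unnorm)                  # mu_{n-1} U_n I
        new_mu = [u / c for u in unnorm]
        g_num = vec_mat(mu_t, U)         # tilde-mu_{n-1} U_n
        h_num = vec_mat(mu, dU)          # mu_{n-1} tilde-U_n
        g_mass, h_mass = sum(g_num), sum(h_num)
        # G-type term (propagated derivative) plus H-type term (kernel derivative).
        mu_t = [(g_num[j] - g_mass * new_mu[j]) / c
                + (h_num[j] - h_mass * new_mu[j]) / c
                for j in range(len(mu))]
        mu = new_mu
    return mu, mu_t

def tangent_vs_finite_difference(theta=0.3, d=3, n=8, h=1e-5, seed=2):
    rng = random.Random(seed)
    models = [([[rng.uniform(0.2, 1.0) for _ in range(d)] for _ in range(d)],
               [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)])
              for _ in range(n)]
    mu0 = [1.0 / d] * d
    _, tangent = filter_and_tangent(theta, mu0, models)
    up, _ = filter_and_tangent(theta + h, mu0, models)
    down, _ = filter_and_tangent(theta - h, mu0, models)
    fd = [(a - b) / (2 * h) for a, b in zip(up, down)]
    return max(abs(t - f) for t, f in zip(tangent, fd)), sum(tangent)
```

The second returned value is the total mass of the derivative, which should vanish because every filter is a probability measure; this matches the fact that the $\tilde G$ and $\tilde H$ terms each integrate to zero.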
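7. Proof of Theorems 4.1 and 4.2

Lemma 7.1. Let (A4.2) hold, while $f : \mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}$ is a Borel-measurable function satisfying $0 \le f(x, y) \le \phi(x, y)$ for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$. Then, for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$,
\[
\int f(x', y') \, |S^n - \sigma|(x, y, dx', dy') \le 2 C \rho^n \phi(x, y),
\]
\[
\int f(x', y') \, S^n(x, y, dx', dy') \le (C + \sigma\phi) \phi(x, y).
\]
Proof. For $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$, let $(S^n - \sigma)^+(x, y, \cdot)$ and $(S^n - \sigma)^-(x, y, \cdot)$ be the positive and negative part of $(S^n - \sigma)(x, y, \cdot)$. Then, for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$, there exist sets $B_n^+(x, y), B_n^-(x, y) \in \mathcal{B}^{p+q}$ such that
\[
(S^n - \sigma)^{\pm}(x, y, B) = \pm (S^n - \sigma)(x, y, B \cap B_n^{\pm}(x, y))
\]
for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $B \in \mathcal{B}^{p+q}$, $n \ge 0$. Therefore,
\[
\int f(x', y') (S^n - \sigma)^{\pm}(x, y, dx', dy') = \pm \int f(x', y') I_{B_n^{\pm}(x, y)}(x', y') (S^n - \sigma)(x, y, dx', dy')
\]
for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$. Then, it can easily be deduced from (A4.2) that for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$,
\[
\int f(x', y') (S^n - \sigma)^{\pm}(x, y, dx', dy') \le C \rho^n \phi(x, y),
\]
\[
\int f(x', y') (S^n - \sigma)(x, y, dx', dy') \le C \phi(x, y).
\]
Consequently,
\[
\int f(x', y') \, |S^n - \sigma|(x, y, dx', dy') = \int f(x', y') (S^n - \sigma)^+(x, y, dx', dy') + \int f(x', y') (S^n - \sigma)^-(x, y, dx', dy') \le 2 C \rho^n \phi(x, y),
\]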
\[
\int f(x', y') \, S^n(x, y, dx', dy') \le \sigma\phi + \int f(x', y') (S^n - \sigma)(x, y, dx', dy') \le (C + \sigma\phi) \phi(x, y)
\]
for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $n \ge 0$. This completes the proof. □
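The factor 2 in the first bound of Lemma 7.1 comes from splitting a signed measure into its positive and negative parts. The following minimal sketch checks that step for a signed measure on a finite set; the vectors `m`, `phi` and `f` are hypothetical random data, with `m` standing in for $(S^n - \sigma)(x, y, \cdot)$.

```python
import random

def weighted_total_variation(f, m):
    """Integral of f against |m| for a signed measure m on a finite set."""
    return sum(fi * abs(mi) for fi, mi in zip(f, m))

def sup_over_dominated(phi, m):
    """sup of |sum_i g_i m_i| over 0 <= g <= phi.

    The supremum is attained by g = phi restricted to the positive set B+
    or to the negative set B- of m, whichever gives the larger value.
    """
    pos = sum(p * mi for p, mi in zip(phi, m) if mi > 0)
    neg = sum(-p * mi for p, mi in zip(phi, m) if mi < 0)
    return max(pos, neg)

def jordan_check(d=6, seed=0):
    """Return (integral of f d|m|, 2 * sup_{0<=g<=phi} |integral of g dm|)."""
    rng = random.Random(seed)
    m = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    phi = [rng.uniform(1.0, 3.0) for _ in range(d)]
    f = [rng.random() * p for p in phi]   # any f with 0 <= f <= phi
    return weighted_total_variation(f, m), 2.0 * sup_over_dominated(phi, m)
```

If the supremum over dominated functions is at most $C\rho^n\phi(x, y)$, as the proof deduces from (A4.2), the first returned value is at most $2C\rho^n\phi(x, y)$.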
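Proof of Theorem 4.1. Let $\rho_\theta = \max^{1/2}\{\rho, \tau_\theta\}$, $M_\theta = 2 \log^{-1} 3 \; \varepsilon_\theta^{-2} \tau_\theta^{-2}$, $N_\theta = 2 M_\theta (C + \sigma\phi)^2$, while
\[
K_\theta = N_\theta \sup_{0 \le n} (n + 1) \max^{n/2}\{\rho, \tau_\theta\}
\]
and $L_\theta = (1 - \rho_\theta)^{-1} K_\theta$. Moreover, let $\nu \in \mathcal{M}_0^p$ be an arbitrary measure, while
\[
f_\theta^{n,i}(\mu, x, y_n, \ldots, y_i) = f(x, y_n, F_\theta^{n-i}(\mu, y_i^n))
\]
for $x \in \mathbb{R}^p$, $\mu \in \mathcal{M}_0^p$, $0 \le i \le n$, and a sequence $\{y_k\}_{k \ge 0}$ from $\mathbb{R}^q$. It is straightforward to verify that for all $x, x' \in \mathbb{R}^p$, $y, y' \in \mathbb{R}^q$, $\mu, \mu' \in \mathcal{M}_0^p$, $n \ge 1$,
\[
\begin{aligned}
& (P_\theta^n f)(x, y, \mu) - (P_\theta^n f)(x', y', \mu') \\
& = \int \!\cdots\! \int \big( f_\theta^{n,0}(\mu, x_n, y_n, \ldots, y_1, y) - f_\theta^{n,1}(\nu, x_n, y_n, \ldots, y_1) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_1, y_1, dx_2, dy_2) S(x, y, dx_1, dy_1) \\
& \quad - \int \!\cdots\! \int \big( f_\theta^{n,0}(\mu', x_n, y_n, \ldots, y_1, y') - f_\theta^{n,1}(\nu, x_n, y_n, \ldots, y_1) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_1, y_1, dx_2, dy_2) S(x', y', dx_1, dy_1) \\
& \quad + \sum_{i=1}^{n-1} \int \!\cdots\! \int \big( f_\theta^{n,i}(\nu, x_n, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, x_n, y_n, \ldots, y_{i+1}) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_i, y_i, dx_{i+1}, dy_{i+1}) (S^i - \sigma)(x, y, dx_i, dy_i) \\
& \quad - \sum_{i=1}^{n-1} \int \!\cdots\! \int \big( f_\theta^{n,i}(\nu, x_n, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, x_n, y_n, \ldots, y_{i+1}) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_i, y_i, dx_{i+1}, dy_{i+1}) (S^i - \sigma)(x', y', dx_i, dy_i) \\
& \quad + \int \!\! \int f_\theta^{n,n}(\nu, x_n, y_n) (S^n - \sigma)(x, y, dx_n, dy_n) - \int \!\! \int f_\theta^{n,n}(\nu, x_n, y_n) (S^n - \sigma)(x', y', dx_n, dy_n).
\end{aligned} \tag{67}
\]
On the other hand, it can easily be deduced from Theorem 3.1 and (A4.1) that for all $\mu \in \mathcal{M}_0^p$, $0 \le i < n$, and any sequence $\{y_k\}_{k \ge 0}$ from $\mathbb{R}^q$,
\[
\begin{aligned}
\|F_\theta^{n-i}(\mu, y_i^n) - F_\theta^{n-i-1}(\nu, y_{i+1}^n)\|
&= \|F_\theta^{n-i-1}(F_\theta(\mu, y_i, y_{i+1}), y_{i+1}^n) - F_\theta^{n-i-1}(\nu, y_{i+1}^n)\| \\
&\le 2 \log^{-1} 3 \; \varepsilon_\theta^{-2} \tau_\theta^{n-i-2} \|F_\theta(\mu, y_i, y_{i+1}) - \nu\| \le M_\theta \tau_\theta^{n-i}.
\end{aligned} \tag{68}
\]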
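Due to (14), (15) and (68),
\[
|f_\theta^{n,i}(\mu, x, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, x, y_n, \ldots, y_{i+1})|
\le \phi(x, y_n) \|F_\theta^{n-i}(\mu, y_i^n) - F_\theta^{n-i-1}(\nu, y_{i+1}^n)\|
\le M_\theta \tau_\theta^{n-i} \phi(x, y_n)
\]
for all $\mu \in \mathcal{M}_0^p$, $0 \le i < n$, and any sequence $\{y_k\}_{k \ge 0}$ from $\mathbb{R}^q$. Then, Lemma 7.1 implies
\[
\begin{aligned}
& \Big| \int \!\cdots\! \int \big( f_\theta^{n,0}(\mu, x_n, y_n, \ldots, y_1, y) - f_\theta^{n,1}(\nu, x_n, y_n, \ldots, y_1) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_1, y_1, dx_2, dy_2) S(x, y, dx_1, dy_1) \Big| \\
& \quad \le M_\theta \tau_\theta^n \int \phi(x_n, y_n) S^n(x, y, dx_n, dy_n) \le M_\theta (C + \sigma\phi) \tau_\theta^n \phi(x, y) \le N_\theta \tau_\theta^n \phi(x, y), \quad n \ge 1,
\end{aligned} \tag{69}
\]
\[
\Big| \int \!\! \int f_\theta^{n,n}(\nu, x_n, y_n) (S^n - \sigma)(x, y, dx_n, dy_n) \Big|
\le \int \phi(x_n, y_n) \, |S^n - \sigma|(x, y, dx_n, dy_n) \le 2 C \rho^n \phi(x, y) \le N_\theta \rho^n \phi(x, y) \tag{70}
\]
for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$, $n \ge 1$, and
\[
\begin{aligned}
& \Big| \int \!\cdots\! \int \big( f_\theta^{n,i}(\nu, x_n, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, x_n, y_n, \ldots, y_{i+1}) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_i, y_i, dx_{i+1}, dy_{i+1}) (S^i - \sigma)(x, y, dx_i, dy_i) \Big| \\
& \quad \le M_\theta \tau_\theta^{n-i} \int \!\! \int \phi(x_n, y_n) S^{n-i}(x_i, y_i, dx_n, dy_n) \, |S^i - \sigma|(x, y, dx_i, dy_i) \\
& \quad \le M_\theta (C + \sigma\phi) \tau_\theta^{n-i} \int \phi(x_i, y_i) \, |S^i - \sigma|(x, y, dx_i, dy_i) \\
& \quad \le 2 M_\theta C (C + \sigma\phi) \rho^i \tau_\theta^{n-i} \phi(x, y) \le N_\theta \max^n\{\rho, \tau_\theta\} \phi(x, y)
\end{aligned} \tag{71}
\]
for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$, $1 \le i < n$. Owing to (67) and (69)–(71),
\[
|(P_\theta^n f)(x, y, \mu) - (P_\theta^n f)(x', y', \mu')| \le N_\theta (n + 1) \max^n\{\rho, \tau_\theta\} (\phi(x, y) + \phi(x', y')) \le K_\theta \rho_\theta^n (\phi(x, y) + \phi(x', y')) \tag{72}
\]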
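for all $x, x' \in \mathbb{R}^p$, $y, y' \in \mathbb{R}^q$, $\mu, \mu' \in \mathcal{M}_0^p$, $n \ge 1$. Then, it can easily be deduced from Lemma 7.1 that for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$, $n \ge 1$,
\[
\begin{aligned}
|(P_\theta^{n+1} f)(x, y, \mu) - (P_\theta^n f)(x, y, \mu)|
&\le \int |(P_\theta^n f)(x', y', \mu') - (P_\theta^n f)(x, y, \mu)| \, P_\theta(x, y, \mu, dx', dy', d\mu') \\
&\le K_\theta \rho_\theta^n \phi(x, y) + K_\theta \rho_\theta^n \int \phi(x', y') S(x, y, dx', dy') \\
&\le K_\theta \rho_\theta^n \phi(x, y) + (C + \sigma\phi) K_\theta \rho_\theta^n \phi(x, y) \le 2 (C + \sigma\phi) K_\theta \rho_\theta^n \phi(x, y).
\end{aligned} \tag{73}
\]
Let
\[
f_\theta(x, y, \mu) = f(x, y, \mu) + \sum_{n=0}^{\infty} \big( (P_\theta^{n+1} f)(x, y, \mu) - (P_\theta^n f)(x, y, \mu) \big)
\]
for $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$. Then, (73) implies
\[
|(P_\theta^n f)(x, y, \mu) - f_\theta(x, y, \mu)| \le \sum_{i=n}^{\infty} |(P_\theta^{i+1} f)(x, y, \mu) - (P_\theta^i f)(x, y, \mu)|
\le 2 (C + \sigma\phi) K_\theta \phi(x, y) \sum_{i=n}^{\infty} \rho_\theta^i = L_\theta \rho_\theta^n \phi(x, y)
\]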
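for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$, $n \ge 1$, while (72) yields
\[
\begin{aligned}
|f_\theta(x, y, \mu) - f_\theta(x', y', \mu')|
\le {} & |(P_\theta^n f)(x, y, \mu) - (P_\theta^n f)(x', y', \mu')| + |(P_\theta^n f)(x, y, \mu) - f_\theta(x, y, \mu)| \\
& + |(P_\theta^n f)(x', y', \mu') - f_\theta(x', y', \mu')| \\
\le {} & (K_\theta + L_\theta) \rho_\theta^n (\phi(x, y) + \phi(x', y'))
\end{aligned}
\]
for all $x, x' \in \mathbb{R}^p$, $y, y' \in \mathbb{R}^q$, $\mu, \mu' \in \mathcal{M}_0^p$, $n \ge 1$. Consequently, there exists a constant $f_\theta \in \mathbb{R}$ such that $f_\theta(x, y, \mu) = f_\theta$ for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$, $\mu \in \mathcal{M}_0^p$. This completes the proof. □

Proof of Theorem 4.2. Let $\rho_\theta = \max^{1/2}\{\rho, \tau_\theta\}$, $M_\theta = 80 \log^{-1} 3 \; \varepsilon_\theta^{-16} \tau_\theta^{-2}$, $N_\theta = 6 \tilde K_\theta M_\theta^{\gamma+2} (C + \sigma\phi)^2$, while
\[
K_\theta = 2 N_\theta \sup_{1 \le n} n^{2(\gamma+1)} \max^{n/2}\{\rho, \tau_\theta\}
\]
and $L_\theta = 2^{\gamma+1} K_\theta \tilde L_\theta M_\theta^{\gamma+1} (1 - \rho_\theta)^{-1}$. Moreover, let $\nu \in \mathcal{M}_0^p$ be an arbitrary measure, while $\tilde\nu \in \tilde{\mathcal{M}}^p$ is a measure satisfying $\|\tilde\nu\| = 0$. Furthermore, let
\[
f_\theta^{n,i}(\mu, \tilde\mu, x, y_n, \ldots, y_i) = f(x, y_n, F_\theta^{n-i}(\mu, y_i^n), \tilde F_\theta^{n-i}(\mu, \tilde\mu, y_i^n))
\]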
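for $x \in \mathbb{R}^p$, $\mu \in \mathcal{M}_0^p$, $\tilde\mu \in \tilde{\mathcal{M}}^p$, $0 \le i \le n$, and a sequence $\{y_k\}_{k \ge 0}$ from $\mathbb{R}^q$. It is straightforward to verify that for all $x, x' \in \mathbb{R}^p$, $y, y' \in \mathbb{R}^q$, $\mu, \mu' \in \mathcal{M}_0^p$, $\tilde\mu, \tilde\mu' \in \tilde{\mathcal{M}}^p$, $n \ge 1$,
\[
\begin{aligned}
& (\tilde P_\theta^n f)(x, y, \mu, \tilde\mu) - (\tilde P_\theta^n f)(x', y', \mu', \tilde\mu') \\
& = \int \!\cdots\! \int \big( f_\theta^{n,0}(\mu, \tilde\mu, x_n, y_n, \ldots, y_1, y) - f_\theta^{n,1}(\nu, \tilde\nu, x_n, y_n, \ldots, y_1) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_1, y_1, dx_2, dy_2) S(x, y, dx_1, dy_1) \\
& \quad - \int \!\cdots\! \int \big( f_\theta^{n,0}(\mu', \tilde\mu', x_n, y_n, \ldots, y_1, y') - f_\theta^{n,1}(\nu, \tilde\nu, x_n, y_n, \ldots, y_1) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_1, y_1, dx_2, dy_2) S(x', y', dx_1, dy_1) \\
& \quad + \sum_{i=1}^{n-1} \int \!\cdots\! \int \big( f_\theta^{n,i}(\nu, \tilde\nu, x_n, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, \tilde\nu, x_n, y_n, \ldots, y_{i+1}) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_i, y_i, dx_{i+1}, dy_{i+1}) (S^i - \sigma)(x, y, dx_i, dy_i) \\
& \quad - \sum_{i=1}^{n-1} \int \!\cdots\! \int \big( f_\theta^{n,i}(\nu, \tilde\nu, x_n, y_n, \ldots, y_i) - f_\theta^{n,i+1}(\nu, \tilde\nu, x_n, y_n, \ldots, y_{i+1}) \big) \, S(x_{n-1}, y_{n-1}, dx_n, dy_n) \cdots S(x_i, y_i, dx_{i+1}, dy_{i+1}) (S^i - \sigma)(x', y', dx_i, dy_i) \\
& \quad + \int \!\! \int f_\theta^{n,n}(\nu, \tilde\nu, x_n, y_n) (S^n - \sigma)(x, y, dx_n, dy_n) - \int \!\! \int f_\theta^{n,n}(\nu, \tilde\nu, x_n, y_n) (S^n - \sigma)(x', y', dx_n, dy_n).
\end{aligned} \tag{74}
\]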
On the other hand, Theorems 3.1, 3.2, Lemmas 6.4, 6.6, 6.8 and (A4.1) imply 1
0
1
~ y; y0 ÞkpkG~ y ðm; m; ~ y; y0 Þk þ kG~ y ðF 1y ðm; y; y0 Þ; H~ y ðm; y; y0 Þ; y; y0 Þk kF~ y ðm; m; 1 ~ 1 0 ~ þ 2 log1 34 p2 log1 34 y kmk y ty kH y ðm; y; y Þk 1 1 ~y ðy; y0 Þ ~ þ 2 log1 36 p2 log1 34 y kmk y ty 0 ~ þ ~1 pM y ðkmk y ðy; y ÞÞ
ð75Þ
~ p , y; y0 2 Rq , and for all m 2 M p0 , m~ 2 M n ni1 ðn; yniþ1 Þk kF ni y ðm; yi Þ F y
ð76Þ p2 log
1
ni2 32 kF y ðm; yi ; yiþ1 Þ y ty
ni ni ~ yni ÞkpkG~ y ðm; m; ~ yni Þk þ kF~ y ðm; m;
nkpM y tni y ,
n X
ni j j n ~ ji kG~ y ðF ji y ðm; yi Þ; H y ðm; yi Þ; yj Þk
j¼iþ1 ni1 ~ þ 2 log1 34 kmk p2 log1 34 y ty y
n1 X j¼i
ji tnj1 kH~ y ðm; yji Þk y
1 ~ þ 4 log1 36 p2 log1 34 y kmk y ty
~ þ pM y kmk
n1 X
n1 X
!
~ 1 y ðyj ; yjþ1 Þ
j¼i
~ 1 y ðyj ; yjþ1 Þ ,
j¼i
ni ni1 ~ yni Þ F~ y kF~ y ðm; m; ðn; n~ ; yniþ1 Þk ni4 p80 log1 316 y ty
n1 X
!
nkÞ ~1 y ðyj ; yjþ1 Þ kF y ðm; yi ; yiþ1 Þ nkð1 þ k~
j¼iþ1 ni2 ~ ~ yi ; yiþ1 Þ n~ k þ 2 log1 34 kF y ðm; m; y ty
pM y tni y
~ yi ; yiþ1 Þk þ kF~ y ðm; m;
n1 X
!
~1 y ðyj ; yjþ1 Þ
j¼iþ1
for all m 2 M p0 , Consequently,
p
~ , m~ 2 M
ni ~ yni Þkg pM gy ng kF~ y ðm; m;
0pion,
~ gþ kmk
and
n1 X
any
sequence
fyk gkX0
Rq .
from
! ~g y ðyj ; yjþ1 Þ
,
j¼i ni ni ~ yni Þ F~ y ðn; n~ ; yni Þk kF~ y ðm; m; ni ~ þ ~ 1 pM 2y tni y ðkmk y ðyi ; yiþ1 ÞÞ þ M y ty
pM 2y tni y
~ þ kmk
n1 X
!
n1 X
~1 y ðyj ; yjþ1 Þ
j¼iþ1
~ 1 y ðyj ; yjþ1 Þ
ð77Þ
j¼i
for all m 2 M p0 , Therefore,
~ p, m~ 2 M
0pion
and
any
sequence
fyk gkX0
ni ni1 ~ yni Þkg þ kF~ y ~ gþ 1 þ kF~ y ðm; m; ðn; n~ ; yniþ1 Þkg p3M gy ng kmk
n1 X
Rq .
from
! ~g y ðyj ; yjþ1 Þ
j¼i
(78) p
~ , 0pion, and any sequence fyk gkX0 from Rq . Due to (16), (17), for all m 2 M p0 , m~ 2 M (76)–(78), ~ ; x; yÞjpf1=b ðx; yÞð1 þ k~nkg Þ ¼ f1=b ðx; yÞ j f n;n y ðn; n
(79)
for all x 2 Rp , y 2 Rq , nX1, and ~ x; yn ; . . . ; yi Þ f n;iþ1 j f n;i ðn; n~ ; x; yn ; . . . ; yiþ1 Þj y ðm; m; y n ni1 ðn; yniþ1 Þk pf1=b ðx; yn ÞkF ni y ðm; yi Þ F y ni ni1 ~ yni ÞÞkg þ kF~ y
ð1 þ kF~ y ðm; m; ðn; n~ ; yniþ1 ÞÞkg Þ ni
ni1
~ yni Þ F~ y þ f1=b ðx; yn ÞkF~ y ðm; m;
ðn; n~ ; yniþ1 Þk
ni ni1 ~ yni ÞÞkg þ kF~ y
ð1 þ kF~ y ðm; m; ðn; n~ ; yniþ1 ÞÞkg Þ ! n1 X gþ1 g ni 1=b g g ~ þ p3M y n ty f ðx; yn Þ kmk ~y ðyj ; yjþ1 Þ j¼i
þ
g ni 1=b 3M gþ2 ðx; yn Þ y n ty f
~ gþ
kmk
n1 X
~ þ kmk
n1 X
! ~1 y ðyj ; yjþ1 Þ
j¼i
! ~ g y ðyj ; yjþ1 Þ
j¼i g ni 1=b p6M gþ2 ðx; yn Þ y n ty f
~ þ kmk
n1 X
!gþ1 ~1 y ðyj ; yjþ1 Þ
j¼i 2g ni 1=b p6M gþ2 ðx; yn Þ y n ty f
~ kmk
gþ1
þ
n1 X
! ~ yðgþ1Þ ðyj ; yjþ1 Þ
ð80Þ
j¼i p
~ , 0pion, and any sequence fyk gkX0 from Rq . Owing to for all x 2 Rp , m 2 M p0 , m~ 2 M the Ho¨lder inequality, Lemma 7.1 and (18), Z Z Z
f1=b ðxn ; yn Þ~ðgþ1Þ ðyj ; yjþ1 Þ y
S nj1 ðxjþ1 ; yjþ1 ; dxn ; dyn ÞSðxj ; yj ; dxjþ1 ; dyjþ1 ÞS ji ðx; y; dxj ; dyj Þ Z 1=b p fðxn ; yn ÞSni ðx; y; dxn ; dyn Þ Z Z
1=a
pK~ y
Z
~yaðgþ1Þ ðyj ; yjþ1 ÞSðxj ; yj ; dxjþ1 ; dyjþ1 ÞS ji ðx; y; dxj ; dyj Þ fðxn ; yn ÞS ni ðx; y; dxn ; dyn Þ
1=b Z
1=a
fðxj ; yj ÞSji ðx; y; dxj ; dyj Þ
pK~ y ðC þ sfÞfðx; yÞ
1=a ð81Þ
for all x 2 Rp , y 2 Rq , 0pipjon, and Z
f1=b ðxn ; yn ÞjS n sjðx; y; dxn ; dyn Þp2Crn fðx; yÞ,
(82)
Z
f1=b ðxn ; yn ÞS n ðx; y; dxn ; dyn ÞpðC þ sfÞfðx; yÞ
(83)
for all x 2 Rp , y 2 Rq , nX1. Then, Lemma 7.1 yields Z Z Z Z
ðyj ; yjþ1 ÞSni1 ðxj ; yj ; dxjþ1 ; dyjþ1 ÞSðxj ; yj ; dxjþ1 ; dyjþ1 Þ f1=b ðxn ; yn Þ~ðgþ1Þ y
S ji ðxi ; yi ; dxj ; dyj ÞjS i sjðx; y; dxi ; dyi Þ Z ~ pK y ðC þ sfÞ fðxi ; yi ÞjSi sjðx; y; dxi ; dyi Þ pK~ y 2ðC þ sfÞ2 ri fðx; yÞ
ð84Þ
for all x 2 Rp , y 2 Rq , 1pipjon, and Z Z Z f1=b ðxn ; yn Þ~ðgþ1Þ ðyi ; yiþ1 Þ y
S ni1 ðxiþ1 ; yiþ1 ; dxn ; dyn ÞSðxi ; yi ; dxiþ1 ; dyiþ1 ÞS i ðx; y; dxi ; dyi Þ Z pK~ y ðC þ sfÞ fðxi ; yi ÞS i ðx; y; dxi ; dyi Þ pK~ y ðC þ sfÞ2 fðx; yÞ
ð85Þ
for all x 2 Rp , y 2 Rq , 0pion (set j ¼ i in (81) to get (85)). Due to (80), (79) and (82)–(85), Z Z Z ~ xn ; yn ; . . . ; y1 ; yÞ f n;1 ~ xn ; yn ; . . . ; y1 ÞÞ ðf n;0 y ðm; m; y ðm; m;
Sðxn1 ; yn1 ; dxn ; dyn Þ Sðx1 ; y1 ; dx2 ; dy2 ÞSðx; y; dx1 ; dy1 Þ Z 2g n gþ1 ~ f1=b ðxn ; yn ÞSn ðx; y; dxn ; dyn Þ p6M gþ2 n t k mk y y 2g n þ 6M gþ2 y n ty
n1 Z Z Z X
f1=b ðxn ; yn Þ~yðgþ1Þ ðyi ; yiþ1 Þ
i¼0
S ni1 ðxiþ1 ; yiþ1 ; dxn ; dyn Þ
Sðxi ; yi ; dxiþ1 ; dyiþ1 ÞS i ðx; y; dxi ; dyi Þ ~ gþ1 fðx; yÞ þ 6K~ y M gþ2 ðC þ sfÞ2 n2gþ1 tn fðx; yÞ p6M gþ2 ðC þ sfÞn2g tn kmk y
y
y
y
ð86Þ
~ gþ1 Þfðx; yÞ, pN y n2gþ1 tny ð1 þ kmk
Z Z f n;n ðn; n~ ; xn ; y ÞðSn sÞðx; y; dxn ; dy Þp f1=b ðxn ; y ÞjSn sjðx; y; dxn ; dy Þ n n n n y p2Crn fðx; yÞpN y rn fðx; yÞ
ð87Þ
~ p , nX1, and for all x 2 Rp , y 2 Rq , m 2 M p0 , m~ 2 M Z Z Z ~ ; xn ; yn ; . . . ; yi Þ f n;iþ1 ðf n;i ðn; n~ ; xn ; yn ; . . . ; yiþ1 ÞÞ y ðn; n y
Sðxn1 ; yn1 ; dxn ; dyn Þ Sðxi ; yi ; dxiþ1 ; dyiþ1 ÞðS sÞðx; y; dxi ; dyi Þ n1 Z Z Z Z X gþ2 2g ni p6M y n ty f1=b ðxn ; yn Þ~ðgþ1Þ ðyj ; yjþ1 Þ y i
j¼i
S nj1 ðxjþ1 ; yjþ1 ; dxn ; dyn ÞSðxj ; yj ; dxjþ1 ; dyjþ1 Þ
S ji ðxi ; yi ; dxj ; dyj ÞjSi sjðx; y; dxi ; dyi Þ 2 2gþ1 i ni r ty fðx; yÞ ¼ N y n2gþ1 maxn fr; ty g p6K~ y M gþ2 y ðC þ sfÞ n p
q
for all x 2 R , y 2 R , m 2
M p0 ,
ð88Þ
~ p , 1pion. Owing to (74) and (86) – (87), m~ 2 M
~ n f Þðx0 ; y0 ; m0 ; m~ 0 Þj ~ n f Þðx; y; m; mÞ ~ ðP jðP y y ~ gþ1 Þ þ fðx0 ; y0 Þð1 þ km~ 0 kgþ1 ÞÞ pN y n2gþ1 tny ðfðx; yÞð1 þ kmk þ N y n2ðgþ1Þ maxn fr; ty gðfðx; yÞ þ fðx0 ; y0 ÞÞ þ N y rn ðfðx; yÞ þ fðx0 ; y0 ÞÞ ð89Þ ~ gþ1 Þ þ fðx0 ; y0 Þð1 þ km~ 0 kgþ1 ÞÞ pK y rny ðfðx; yÞð1 þ kmk ~ p , nX1. ~ m~ 0 2 M for all x; x0 2 Rp , y; y0 2 Rq , m; m0 2 M p0 , m; Suppose that (18) holds for all x 2 Rp , y 2 Rq . Then, it can easily be deduced from ~ p , nX1, (75) and (89) that for all x 2 Rp , y 2 Rq , m 2 M p0 , m~ 2 M nþ1
~ jðP y
p
Z
n
~ f Þðx; y; m; mÞj ~ ðP ~ f Þðx; y; m; mÞ y n
n
~ f Þðx; y; m; mÞj ~ y ðx; y; m; m; ~ f Þðx0 ; y0 ; m0 ; m~ 0 Þ ðP ~ P ~ dx0 ; dy0 ; dm0 ; dm~ 0 Þ jðP y y
~ gþ1 Þ pK y rny fðx; yÞð1 þ kmk Z ~ y; y0 Þkgþ1 ÞSðx; y; dx0 ; dy0 Þ þ K y rny fðx0 ; y0 Þð1 þ kF~ y ðm; m; n ~ gþ1 Þ þ 2gþ1 K y M gþ1 ~ gþ1 Þ pK y rny fðx; yÞð1 þ kmk y ry ð1 þ kmk Z
fðx0 ; y0 Þ~ðgþ1Þ ðy; y0 ÞSðx; y; dx0 ; dy0 Þ y n ~ gþ1 Þ. p2gþ1 K y L~ y M gþ1 y ry ðfðx; yÞ þ cðx; yÞÞð1 þ kmk
Let ~ ¼ f ðx; y; m; mÞ ~ þ f y ðx; y; m; mÞ
1 X n¼0
nþ1
~ ððP y
~ n f Þðx; y; m; mÞÞ ~ ðP ~ f Þðx; y; m; mÞ y
ð90Þ
~ p . Then, (90) implies for x 2 Rp , y 2 Rq , m 2 M p0 , m~ 2 M n
~ f Þðx; y; m; mÞ ~ f y ðx; y; m; mÞj ~ jðP y 1 X ~ iþ1 f Þðx; y; m; mÞ ~ i f Þðx; y; m; mÞj ~ ðP ~ p jðP y y i¼n
~ gþ1 Þ p2gþ1 K y L~ y M gþ1 y ðfðx; yÞ þ cðx; yÞÞð1 þ kmk
1 X
riy
i¼n
~ gþ1 Þ ¼ Ly rny ðfðx; yÞ þ cðx; yÞÞð1 þ kmk ~ p , nX1, while (90) yields for all x 2 Rp , y 2 Rq , m 2 M p0 , m~ 2 M n
n
~ f Þðx; y; m; mÞ ~ f Þðx0 ; y0 ; m0 ; m~ 0 Þj ~ f y ðx0 ; y0 ; m0 ; m~ 0 ÞjpjðP ~ ðP j f y ðx; y; m; mÞ y y n ~ ~ ~ þ jðP f Þðx; y; m; mÞ f y ðx; y; m; mÞj y
~ n f Þðx0 ; y0 ; m0 ; m~ 0 Þ f y ðx0 ; y0 ; m0 ; m~ 0 Þj þ jðP y ~ aÞ pðK y þ Ly Þrny ðfðx; yÞ þ cðx; yÞÞð1 þ kmk þ ðK y þ Ly Þrny ðfðx0 ; y0 Þ þ cðx0 ; y0 ÞÞð1 þ km~ 0 ka Þ ~ p , nX1. Consequently, there exists a ~ m~ 0 2 M for all x; x0 2 Rp , y; y0 2 Rq , m; m0 2 M p0 , m; ~ p. ~ ¼ f y for all x 2 Rp , y 2 Rq , m 2 M p0 , m~ 2 M constant f y 2 R such that f y ðx; y; m; mÞ This completes the proof. &
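Both proofs absorb a polynomial factor in $n$ into a slightly slower geometric rate: with $m = \max\{\rho, \tau_\theta\}$ and $\rho_\theta = m^{1/2}$, one has $(n + 1) m^n \le \big( \sup_k (k + 1) m^{k/2} \big) \rho_\theta^n$, which is the step from $N_\theta$ to $K_\theta$ in the proof of Theorem 4.1. The sketch below checks this numerically. The values of `rho` and `tau` are hypothetical, and the supremum is approximated by scanning a finite range, which covers every index that is tested.

```python
def poly_geometric_sup(m, power=1, n_max=10_000):
    """Approximate sup over n >= 0 of (n + 1)**power * m**(n / 2) by scanning 0..n_max."""
    return max((n + 1) ** power * m ** (n / 2) for n in range(n_max + 1))

def check_domination(rho, tau, n_max=200):
    """Check (n + 1) * m**n <= K * rho_theta**n for n = 0..n_max, m = max(rho, tau)."""
    m = max(rho, tau)
    rho_theta = m ** 0.5
    k = poly_geometric_sup(m)
    # Small relative slack guards against floating-point rounding at the maximizer.
    return all((n + 1) * m ** n <= k * rho_theta ** n * (1 + 1e-9)
               for n in range(n_max + 1))
```

For example, `check_domination(0.9, 0.5)` compares the two sides for $m = 0.9$. The same argument with `power` set to $2(\gamma + 1)$ corresponds to the constant used in the proof of Theorem 4.2, up to the shift from $n + 1$ to $n$.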
References

[1] A. Arapostathis, S.I. Marcus, Analysis of an identification algorithm arising in the adaptive estimation of Markov chains, Math. Control Signals Systems 3 (1990) 1–29.
[2] R. Atar, O. Zeitouni, Exponential stability for nonlinear filtering, Ann. Inst. Henri Poincaré Probab. Statist. 33 (1997) 697–725.
[4] P. Del Moral, A. Guionnet, On the stability of interacting processes with applications to filtering and genetic algorithms, Ann. Inst. Henri Poincaré Probab. Statist. 37 (2001) 155–194.
[5] R. Douc, C. Matias, Asymptotics of the maximum likelihood estimator for general hidden Markov models, Bernoulli 7 (2001) 381–420.
[6] R. Douc, E. Moulines, T. Rydén, Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime, Ann. Statist. 32 (2004) 2254–2304.
[8] F. LeGland, L. Mevel, Exponential forgetting and geometric ergodicity in hidden Markov models, Math. Control Signals Systems 13 (2000) 63–93.
[9] F. LeGland, N. Oudjane, Stability and uniform approximation of nonlinear filters using the Hilbert metric, and application to particle filters, Ann. Appl. Probab. 14 (2004) 144–187.
[10] S.P. Meyn, R.L. Tweedie, Markov Chains and Stochastic Stability, Springer, Berlin, 1993.
[11] G.C. Pflug, Optimization of Stochastic Models: The Interface between Simulation and Optimization, Kluwer Academic Publisher, Dordrecht, 1996.
[12] V.B. Tadić, A. Doucet, Exponential forgetting and geometric ergodicity for optimal filtering in general state-space models, Technical Report, CUED-F-INFENG, Cambridge University, available on www-sigproc.eng.cam.ac.uk/ad2/gehmm.ps