AR(1) processes driven by second-chaos white noise: Berry–Esséen bounds for quadratic variation and parameter estimation


PII: S0304-4149(19)30390-4
DOI: https://doi.org/10.1016/j.spa.2020.02.007
Reference: SPA 3637

To appear in: Stochastic Processes and their Applications

Received date: 15 June 2019. Revised date: 1 February 2020. Accepted date: 17 February 2020.

Please cite this article as: S. Douissi, K. Es-Sebaiy, F. Alshahrani et al., AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation, Stochastic Processes and their Applications (2020), doi: https://doi.org/10.1016/j.spa.2020.02.007. © 2020 Published by Elsevier B.V.


AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation

Soukaina Douissi1 , Khalifa Es-Sebaiy2 , Fatimah Alshahrani3 , Frederi G. Viens4 . 1 Laboratory LIBMA, Faculty Semlalia, Cadi Ayyad University, 40000 Marrakech, Morocco. Email: [email protected]

2 Department of Mathematics, Faculty of Science, Kuwait University, Kuwait. Email: [email protected]

3 Department of Mathematical Science, Princess Nourah bint Abdulrahman University, Riyadh. 3,4 Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824. Email: [email protected], [email protected]

December 1, 2019

Abstract: In this paper, we study the asymptotic behavior of the quadratic variation for the class of AR(1) processes driven by white noise in the second Wiener chaos. Using tools from the analysis on Wiener space, we give an upper bound for the total-variation speed of convergence to the normal law, which we apply to study the estimation of the model's mean-reversion. Simulations are performed to illustrate the theoretical results.

Key words: Central limit theorem; Berry-Esséen; Malliavin calculus; parameter estimation; time series; Wiener chaos

2010 Mathematics Subject Classification: 60F05; 60H07; 62F12; 62M10

1 Introduction

The topic of statistical inference for stochastic processes has a long history, addressing a number of issues, though many difficult questions remain. At the same time, a number of application fields are anxious to see some practical progress in a selection of directions. Methodologies are sought which are not just statistically sound, but stand a good chance of being computationally implementable, if not nimble, to help practitioners make data-based decisions in stochastic problems with complex time evolutions. In this paper, which is motivated by parameter estimation within the above context, we propose a quantitatively sharp analysis in this direction, and we honor the scientific legacy of Prof. Larry Shepp.

(The first author is supported by the Fulbright joint supervision program for PhD students for the academic year 2018-2019 between Cadi Ayyad University and Michigan State University. The fourth author is partially supported by NSF awards DMS 1734183 and 1811779, and ONR award N00014-18-1-2192.)

Prof. Shepp is widely known for his seminal work on stochastic control, optimal stopping, and applications in areas such as investment finance. Often labeled an applied probabilist by those working in that area, he had the merit, among many other qualities, of showing by example that research activity in this area could benefit from an appreciation for the mathematical aesthetics of constructing stochastic objects for their own sake. His papers also showed that one's work is only as applied as one's ability to calibrate a stochastic model to a realistic scenario. As obvious as this view may seem, it is nonetheless in short supply in some current circles, where model sophistication seems to replace all other imperatives. Instead, we believe some of the principles guiding applied probability research should include (i) statistical parsimony and robustness, (ii) feature discovery, and above all, (iii) real-world impact, where mathematicians propose a real solution to a real problem. We think that Prof. Shepp would not have been shy about agreeing that his seminal and highly original works on optimal stopping and stochastic control [26], [3], including the invention [27] of the widely used Russian financial option, illustrate items (ii) and (iii) in this philosophy perfectly.

This leaves the question of how to estimate model parameters needed to implement applied solutions. Prof. Shepp proved on many occasions that this concern was also high on his list of objectives in applied work; he proposed methods aligning with our stated principle (i) above. The best example is the work for which Shepp is most widely known outside of our stochastic circles: the mathematical foundation of the Computed Tomography (CT) scanner, and in particular, the basis [28] for its data analysis. Prof. Shepp is less well known for his direct interest in statistically motivated stochastic modeling; the posthumous paper [8] is an instance of this, on asymptotics of auto-regressive processes with normal noise (innovations).

Our paper honors this legacy by providing a detailed and mathematically rigorous stochastic analysis of some building blocks needed in the data analysis of a simple class of stochastic processes. Our paper's originality is in working out detailed quantitative properties for auto-regressive processes with innovations in the second Wiener chaos. Our framework is parsimonious in the sense of being determined by a small number of parameters, while covering features of stationarity, mean-reversion, and heavier-than-normal tail weight. We focus on establishing rates of convergence in the central limit theorem for quadratic variations of these processes, which we are then able to transfer to similar rates for the model's moments-based parameter estimation. This precision would allow practitioners to determine the validity and uncertainty quantification of our estimates in the realistic setting of moderate sample size. Careless use of a method of moments would ignore the potential for abusive conclusions in this heavy-tailed time-series setting.

The remainder of this introduction begins with an overview of the landscape of parameter estimation for stochastic processes related to ours. The few included references call for the reader to find additional references therein, for the sake of conciseness. We then introduce the specific model class used in this paper. It represents a continuation of the current literature's motivation to calibrate stochastic models with features such as stochastic memory and path roughness. It constitutes a departure from the same literature's focus on the framework of Gaussian noise.

1.1 Parameter estimation for stochastic processes: historical and recent context

Some of the early impetus in parameter estimation for stochastic processes was inspired by classical ideas from frequentist statistics, such as the theoretical and practical superiority of maximum likelihood estimation (MLE) over other, less constrained methodologies in many contexts. We will not delve into the description of many such instances, citing only the seminal account [15], first published in Russian in 1974 (see references therein in Chapter 17, such as [22]). This was picked up two decades later in the context of processes driven by fractional Brownian motion, where it was shown that the martingale property used in earlier treatments was not a necessary ingredient to establish the properties of such MLEs: see in particular the treatment of processes with fractional noise in [13] and in [32]. It was also noticed that least-squares ideas, which led to MLEs in cases of white-noise-driven processes, did not share this property in the case of processes driven by fractional noise: this was pointed out in the continuous-time-based paper [12]. See also a more detailed account of this direction of work in [7] and references therein, including a discussion of the distinction between estimators based on continuous paths and those using discrete sampling under in-fill and increasing-horizon asymptotics. These were applied particularly to various versions of the Ornstein-Uhlenbeck process, as examples of processes with stationary increments and an ability to choose other features such as path regularity and short or long memory.

The impracticality of computing MLEs for parameters of stochastic processes in these feature-rich contexts led the community to consider other methodologies, looking more closely at least squares and beyond. A popular approach is to work with incarnations of the method of moments. A full study in the case of general stationary Gaussian sequences, with application to in-fill asymptotics for the fractional Ornstein-Uhlenbeck process, is in [2]. This paper relates the relatively long history of those works where estimation of a memory or Hölder-regularity parameter uses moments-based objects, particularly quadratic variations. It also shows that the generalized method of moments can, in principle, provide a number of options to access vectors of parameters for discretely observed Gaussian processes in a practical way. This was also illustrated recently in [9], where the Malliavin calculus and its connection to Stein's method were used to establish speeds of convergence in the central limit theorems for quadratic-variations-based estimators for discretely observed processes. The Stein-Malliavin technical methodology employed in [9] is that which was introduced by Nourdin and Peccati in 2009, as described in their 2012 research monograph [20]. Other estimation methods have also been proposed for general stationary time series, which we mention here, though they fall outside the scope of our paper, and they do not lead to the same precision as those based on the Stein-Malliavin method: see e.g. [35] and [36] for the Yule-Walker method and extensions.

While the paper [34] establishes that essentially every continuous-time stationary process can be represented as the solution of a Langevin (Ornstein-Uhlenbeck-type) equation with an appropriate noise distribution, the two aforementioned follow-up papers [35, 36], which present an analog in discrete time, do not, however, connect the discrete and continuous frameworks via any asymptotic theory.

Following an initial push in [32], most of the recent papers mentioned above, and recent references therein, state an explicit effort to work with discretely observed processes. At least in the increasing-horizon case, the papers [9] and [6] had the merit of pointing out that many of the discretization techniques used to pass from continuous-path to discrete-observation-based estimators were inefficient, and that it is preferable to work directly from the statistics of the discretely observed process. Our paper picks up this thread, and introduces a new direction of research which, to our knowledge, has not been approached by any authors: can the asymptotic normality of quadratic variations and related estimators, including very precise results on speeds of convergence, be obtained when the driving noise is not Gaussian?

The main underlying theoretical result we draw on is the optimal estimation of total-variation distances between chaos variables and the normal law, established in [19]. It was used for quadratic variations of stationary sequences in the Gaussian case in [18]. But when the Gaussian setting is abandoned, the result in [19] cannot be used directly. Instead, our paper makes a theoretical advance in the analysis on Wiener space, by drawing on a simple idea in the recent preprints [24] and [21]; our main result provides an example of a sum of chaos variables whose distance to the normal appears to be estimated optimally, whereas a standard use of the Schwarz inequality would result in a much weaker result. The precise location of the technique leading to this improvement is pointed out in the main body of our paper: see Theorem 8, particularly inequality (24) in its proof and the following brief discussion there, and Remark 12. This allows us to prove a Berry-Esséen-type speed of n^{−1/2}, rather than what would have resulted in a speed of n^{−1/4}.

1.2 A stationary process with second-chaos noise, and related literature

Given our intent to address the new issue of noise distribution, and knowing that Berry-Esséen-type questions for models with mere Gaussian noise already present technical challenges, we choose to minimize the number of technical issues to address in this paper by focusing on the simplest possible stationary model class which does not restrict the marginal noise distribution within a family which is tractable using Wiener chaos analysis and tools from the Malliavin calculus. This is the auto-regressive model of order 1 (a.k.a. AR(1)) with independent noise terms, where the noise distribution is in the second Wiener chaos, i.e.

    Y_n = a_0 + a_1 Y_{n−1} + ε_n,    (1)

where {ε_n ; n ∈ Z} is an i.i.d. sequence in the second Wiener chaos, and a_0 and a_1 are constants. The complete description and construction of this process and of the noise sequence is given in Section 3, see (8).

As explained in Section 2, the second Wiener chaos is a linear space, and since the model (1) is linear, its solution, if any, lies in the same chaos. This points to a simple theoretical motivation and justification for studying the increasing-horizon problem as opposed to the in-fill problem. We also include a practical motivation for doing so, further below in this section, coming from an environmental statistics question.

For the former motivation, note that the AR(1) specification (1), with essentially any square-integrable i.i.d. noise distribution, is known to converge weakly, after appropriate aggregation and scaling, to the so-called Ornstein-Uhlenbeck process (also known occasionally as the Vasicek process), which solves the stochastic differential equation

    dX_t = α(m − X_t) dt + σ dW(t),    (2)

where W is a standard Wiener process (Gaussian Brownian motion), and the parameters α, m, and σ are explicitly related to a_0, a_1 and Var[ε_n]. See for instance [29, Chapter 2], which covers the case of all square-integrable innovations; this paper assumes a piecewise linear interpolation in the normalization, which could be eliminated by switching to convergence in the Skorohod J1 topology. A reference avoiding linear interpolation, with convergence in the Skorohod J1 topology, is [5], where innovations are assumed to have four moments. In any case, this central limit theorem constrains the modeling of stationary/ergodic processes via a diffusive differential formulation: under in-fill asymptotics with weakly dependent noise, the AR(1) specification cannot preserve any non-normal noise distribution in the limit. It is of course possible to interpolate the above process Y in a number of ways, to result in a continuous-time process whose discrete-time marginals are those specified via (1).

However, we believe it is difficult or impossible to give a linear diffusion-type stochastic differential equation, akin to (2), whose fixed-time-step marginals are as in (1) for an arbitrary noise distribution, while simultaneously describing what second-chaos process differential would need to replace W in (2). The so-called Rosenblatt process (see [31]), the only known second-chaos continuous-time process with a stochastic calculus similar to W's, gives an example of a viable alternative to (2) living in the second chaos. But this process is known to have only a long-memory version. Thus it cannot be a proxy for any continuous-time analogue of (1), since the noise there has no memory. Similar issues would presumably exist for other non-Gaussian AR(1) and related auto-regressive processes. A few have been studied recently in the literature. We mention [11, 37], which cover various noise structures similar to second-chaos noises, and here again, no asymptotic or interpolation theory is provided to relate to continuous time. There does exist a general treatment in [8] of asymptotics for all AR(p) processes: the limit processes are the so-called Continuous-AR(p) processes, which are Gaussian, and have (p − 1)-differentiable paths (a form of very long memory for p > 1);

that paper assumes normal innovations to keep technicalities to a minimum.

Another indication that finding such a proxy may fail comes in the specific case of the so-called Gumbel distribution for ε_n. This law is a popular distribution for extreme-value modeling. The fact that this law is in the second chaos is a classical result (as a weighted sum of exponentials, see [25]), though it does not appear to be widely known in the extreme-value community. The standard (mean-zero) Gumbel law can be represented as

    Σ_{j=1}^∞ j^{−1} (E_j − 1),

where E_j = (N_j² + N̄_j²)/2 is a standard exponential variable (chi-squared with two degrees of freedom; N_j and N̄_j are i.i.d. standard normals). The Gumbel law is known to give rise to a second-chaos version of an isonormal Gaussian process, known as the Gumbel noise measure (or Gumbel process); that stochastic measure obeys the same laws as the white-noise measure (including independence of increments, which fails for the Rosenblatt noise), if one replaces the standard algebra of the reals by the max-plus algebra. This is explained in detail in the preprint [16]; also see references therein. By virtue of this change of algebra, stochastic differential specifications as in (2) cannot be defined using the Gumbel noise.
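The series representation above lends itself to a quick numerical sanity check. The sketch below (our illustration, not from the paper) truncates the series at J terms and verifies the first two moments against the known Gumbel values: mean 0 and variance π²/6.

```python
import numpy as np

# Monte Carlo check (illustrative) of the second-chaos series representation of
# the mean-zero Gumbel law: sum_{j>=1} (E_j - 1)/j with E_j i.i.d. Exp(1)
# has mean 0 and variance pi^2/6 = 1.6449..., matching the Gumbel distribution.
rng = np.random.default_rng(0)
J, N = 1000, 20_000                      # truncation level, Monte Carlo sample size
samples = np.zeros(N)
for j in range(1, J + 1):                # accumulate the truncated series term by term
    samples += (rng.exponential(size=N) - 1.0) / j

print(samples.mean())   # ~ 0
print(samples.var())    # ~ pi^2/6
```

The truncation error is mild: the discarded tail has mean zero and variance Σ_{j>J} j^{−2} ≈ 1/J.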

However, the discrete version of the Gumbel noise, an i.i.d. sequence (ε_n)_n with Gumbel marginals, is a good example of a noise type which can be used in the AR(1) process (1). This specific model, known as the AR(1) process with Gumbel noise (or innovations), presents a main motivation for our work. Recent references on this process, and on the closely related process where the marginals of Y are Gumbel-distributed, include [17] for a Bayesian study, [30] for applications to maxima in greenhouse gas concentration data, and [1] for AR processes in the broader extreme-value context. A survey on AR(1) models with different types of innovations and marginals, while not including the Gumbel, is in the unpublished manuscript [10]. The use of the Gumbel distribution for describing environmental time series, mainly when looking at extremes, is fairly widespread, but we do not cite this literature because it does not appear willing to acknowledge that time-series models driven by Gumbel innovations should be used, rather than tools for i.i.d. Gumbel data. This literature, which is easy to find, is also entirely unaware that the Gumbel distribution is in the second Wiener chaos.

All these reasons give us ample cause to investigate the basic method-of-moments building blocks for determining parameters in stationary time series with second-chaos innovations. For the sake of concentrating on the core mathematical analysis towards this end, we focus on the asymptotics of quadratic variations for models in the class (1). The methodology developed in [9] can then be adapted to handle any method-of-moments-based estimators, at the cost of some additional effort. We provide examples of this in the latter sections of this paper. Our main result is that, for any second-chaos innovations in (1), the quadratic variation of Y has explicit normal asymptotics, with a speed of convergence in total variation which matches the classical Berry-Esséen speed of n^{−1/2}.

The remainder of this paper is structured as follows. Section 2 provides elements from the analysis on Wiener space which will be used in the paper. Section 3 presents the details of the class of AR(1) models we will analyze. Section 4 computes the asymptotic variance of the AR(1)'s quadratic variation by looking separately at its 2nd-chaos and 4th-chaos components, whose asymptotics are of the same order. Section 5 establishes our main result, the Berry-Esséen speed of convergence in total variation for the normal fluctuations of the AR(1)'s quadratic variation. Finally, Section 6 defines a method-of-moments estimator for the mean-reversion rate of this AR(1) process, and establishes its asymptotic properties; a numerical study is included to gauge the distance between this renormalized estimator and the normal law.

2 Preliminaries

In this first section, we recall some elements from stochastic analysis that we will need in the paper. See [20], [23], and [33] for details. Any real, separable Hilbert space H gives rise to an isonormal Gaussian process: a centered Gaussian family (G(ϕ), ϕ ∈ H) of random variables on a probability space (Ω, F, P) such that E[G(ϕ)G(ψ)] = ⟨ϕ, ψ⟩_H. In this paper, it is enough to use the classical Wiener space, where H = L²([0,1]), though any H will also work. In the case H = L²([0,1]), G can be identified with the stochastic differential of a Wiener process W, and one interprets G(ϕ) := ∫₀¹ ϕ(s) dW(s).

The Wiener chaos of order n is defined as the closure in L²(Ω) of the linear span of the random variables H_n(G(ϕ)), where ϕ ∈ H, ‖ϕ‖_H = 1, and H_n is the Hermite polynomial of degree n. The intuitive Riemann-sum-based notion of the multiple Wiener stochastic integral I_n with respect to G ≡ W, in the sense of limits in L²(Ω), turns out to be an isometry between the Hilbert space H^{⊙n} (the symmetric tensor product) equipped with the scaled norm √(n!) ‖·‖_{H^{⊗n}}, and the Wiener chaos of order n under L²(Ω)'s norm. In any case, we have the following fundamental decomposition of L²(Ω) as a direct sum of all Wiener chaoses.

• The Wiener chaos expansion. For any F ∈ L²(Ω), there exists a unique sequence of functions f_n ∈ H^{⊙n} such that

    F = E[F] + Σ_{n=1}^∞ I_n(f_n),    (3)

where the terms are all mutually orthogonal in L²(Ω) and E[I_n(f_n)²] = n! ‖f_n‖²_{H^{⊗n}}.
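As a concrete illustration of the expansion (3) (a standard textbook example, not specific to this paper): for ϕ ∈ H with ‖ϕ‖_H = 1,

```latex
% Chaos expansion of F = W(\varphi)^2 when \|\varphi\|_{H} = 1:
% only the constant and the second-chaos term are non-zero.
W(\varphi)^2
  \;=\; \underbrace{1}_{=\,E[F]}
  \;+\; \underbrace{I_2\!\left(\varphi^{\otimes 2}\right)}_{f_2 \,=\, \varphi^{\otimes 2}},
\qquad
E\!\left[I_2\!\left(\varphi^{\otimes 2}\right)^2\right]
  \;=\; 2!\,\bigl\|\varphi^{\otimes 2}\bigr\|_{H^{\otimes 2}}^2 \;=\; 2 .
```

The resulting identity W(ϕ)² − 1 = I₂(ϕ^{⊗2}) is exactly the fact used in Section 3 to represent the noise terms.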

• Product formula and contractions. Since L²(Ω) is closed under multiplication, the special case of the above expansion exists for calculating products of Wiener integrals, and is explicit using contractions: for any integers p, q ≥ 1 and symmetric integrands f ∈ H^{⊙p} and g ∈ H^{⊙q},

    I_p(f) I_q(g) = Σ_{r=0}^{p∧q} r! C_p^r C_q^r I_{p+q−2r}(f ⊗̃_r g),    (4)

where f ⊗_r g is the contraction of order r of f and g, which is an element of H^{⊗(p+q−2r)} defined by

    (f ⊗_r g)(s₁, …, s_{p−r}, t₁, …, t_{q−r}) := ∫_{[0,1]^r} f(s₁, …, s_{p−r}, u₁, …, u_r) g(t₁, …, t_{q−r}, u₁, …, u_r) du₁ ⋯ du_r,

while f ⊗̃_r g denotes its symmetrization. More generally, the symmetrization f̃ of a function f is defined by f̃(x₁, …, x_p) = (1/p!) Σ_σ f(x_{σ(1)}, …, x_{σ(p)}), where the sum runs over all permutations σ of {1, …, p}. The special case p = q = 1 is particularly handy, and can be written in its symmetrized form:

    I₁(f) I₁(g) = 2^{−1} I₂(f ⊗ g + g ⊗ f) + ⟨f, g⟩_H.    (5)
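For illustration (a standard computation, included by us): applying (4) with p = q = 2 and f = g = ϕ^{⊗2}, where ‖ϕ‖_H = 1, the terms r = 0, 1, 2 give the 4th-chaos, 2nd-chaos and constant parts:

```latex
% Product formula (4) for the square of a second-chaos variable, \|\varphi\|_H = 1:
% r = 0 gives I_4, r = 1 gives 1!\,C_2^1 C_2^1 = 4 copies of I_2, r = 2 gives 2!\,\|\varphi\|_H^4.
I_2\!\left(\varphi^{\otimes 2}\right)^2
  = I_4\!\left(\varphi^{\otimes 4}\right)
  + 4\,\|\varphi\|_H^2\, I_2\!\left(\varphi^{\otimes 2}\right)
  + 2\,\|\varphi\|_H^4
  = I_4\!\left(\varphi^{\otimes 4}\right) + 4\, I_2\!\left(\varphi^{\otimes 2}\right) + 2 .
```

With N = I₁(ϕ), this reads (N² − 1)² = (N⁴ − 6N² + 3) + 4(N² − 1) + 2, which checks directly; the same expansion, applied to the kernels f_i, produces the decomposition (14) of the quadratic variation below.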


• Hypercontractivity in Wiener chaos. For h ∈ H^{⊗q}, the multiple Wiener integrals I_q(h), which exhaust the set H_q, satisfy a hypercontractivity property (equivalence in H_q of all L^p norms for all p > 2), which implies that for any F ∈ ⊕_{l=1}^q H_l (i.e. in a fixed sum of Wiener chaoses), we have

    E[|F|^p]^{1/p} ≤ c_{p,q} E[|F|²]^{1/2}

for any p > 2. It should be noted that the constants c_{p,q} above are known with some precision when F ∈ H_q: by Corollary 2.8.14 in [20], c_{p,q} = (p − 1)^{q/2}.

• Malliavin derivative and other operators on Wiener space. The Malliavin derivative operator D, and other operators on Wiener space, are needed briefly in this paper, to provide an efficient proof of the first theorem in Section 5, and to interpret an observation of I. Nourdin and G. Peccati, given below in (7), for a bound on the total-variation distance of any chaos law to the normal law. We do not provide any background on these operators, referring instead to Chapter 2 in [20], and briefly mention here the facts we will use in the proofs of Section 5, without spelling out all assumptions. Strictly speaking, all the results in this paper can be obtained without the following facts, but this would be exceedingly tedious and wholly nontransparent.

• The operator D maps I_q(f) to t ↦ q I_{q−1}(f(·, t)), and is consistent with the ordinary chain rule. Its domain is denoted by D^{1,2}, and includes all chaos variables.

• The operator L, known as the generator of the Ornstein-Uhlenbeck semigroup on Wiener space, maps I_q(f) to −q I_q(f), and L^{−1} denotes its pseudo-inverse: L's kernel is the constants, and all other chaoses are its eigenspaces. Combining this with the previous point, we obtain −D_t L^{−1} I_q(f) = I_{q−1}(f(·, t)).

• D has an adjoint δ in L²(Ω), which by definition satisfies the duality relation E⟨DF, u⟩_H = E[F δ(u)], where u is any stochastic process for which the expressions are defined. The domain of δ is a non-trivial object of study, but it is known to contain all square-integrable W-adapted processes in the case G = W, the Wiener process, where H = L²([0,1]).

• We have the relation F = δ(−DL^{−1}F).


• Distances between random variables. The following is classical. If X, Y are two real-valued random variables, then the total-variation distance between the law of X and the law of Y is given by

    d_TV(X, Y) := sup_{A ∈ B(R)} |P[X ∈ A] − P[Y ∈ A]|,

where the supremum is over all Borel sets. The Kolmogorov distance d_Kol(X, Y) is the same as d_TV, except one takes the sup over sets A of the form (−∞, z] for all real z. The Wasserstein distance uses Lipschitz rather than indicator functions:

    d_W(X, Y) := sup_{f ∈ Lip(1)} |E f(X) − E f(Y)|,

Lip(1) being the set of all Lipschitz functions with Lipschitz constant ≤ 1.
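As a small numerical illustration (ours, with hypothetical helper names Phi and F_X): the Kolmogorov distance between the standardized second-chaos variable X = (N² − 1)/√2 and the standard normal can be computed from the two explicit CDFs, since P[X ≤ z] = 2Φ(√(1 + √2 z)) − 1 for z > −1/√2, and 0 otherwise.

```python
import math

def Phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F_X(z):
    # CDF of X = (N^2 - 1)/sqrt(2): P[X <= z] = P[N^2 <= 1 + sqrt(2) z]
    t = 1.0 + math.sqrt(2.0) * z
    if t <= 0.0:
        return 0.0
    return 2.0 * Phi(math.sqrt(t)) - 1.0

# Kolmogorov distance d_Kol(X, N), approximated by a sup over a fine grid
d_kol = max(abs(F_X(i / 1000.0) - Phi(i / 1000.0)) for i in range(-5000, 5001))
print(round(d_kol, 3))   # ≈ 0.24
```

The sup is attained near the left endpoint z = −1/√2 of X's support; even this standardized single chi-square term is measurably far from normal, which is why quantitative CLT rates for sums are the relevant question.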

• The observation of Nourdin and Peccati. Let N denote the standard normal law. The following observation relates an integration-by-parts formula on Wiener space with a classical result of Ch. Stein. Let X ∈ D^{1,2} with E[X] = 0 and Var[X] = 1. Then (see [19, Proposition 2.4], or [20, Theorem 5.1.3]), for f ∈ C_b¹(R),

    E[X f(X)] = E[f′(X) ⟨DX, −DL^{−1}X⟩_H],

and by combining this with properties of solutions of Stein's equations, one gets

    d_TV(X, N) ≤ 2 E|1 − ⟨DX, −DL^{−1}X⟩_H|.    (6)

One notes in particular that when X ∈ H_q, since −L^{−1}X = q^{−1}X, we obtain conveniently

    d_TV(X, N) ≤ 2 E|1 − q^{−1}‖DX‖²_H|.    (7)
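To illustrate how the right-hand side of (7) scales (our own example, not from the paper): for the normalized average X_n = Σ_{i≤n} (N_i² − 1)/√(2n) ∈ H₂, one has q = 2 and q^{−1}‖DX_n‖²_H = n^{−1} Σ_{i≤n} N_i², a chi-square(n)/n variable, so the bound is 2 E|1 − χ²_n/n|, which decays at the Berry-Esséen rate n^{−1/2}.

```python
import numpy as np

# Monte Carlo estimate of the bound (7), 2 E|1 - chi2(n)/n|, for two values of n.
# The ratio between n = 100 and n = 10000 should be close to sqrt(10000/100) = 10.
rng = np.random.default_rng(1)
M = 200_000                      # Monte Carlo sample size per n
bounds = {}
for n in (100, 10_000):
    bounds[n] = 2.0 * np.abs(1.0 - rng.chisquare(n, size=M) / n).mean()
    print(n, bounds[n])          # shrinks roughly like 2*sqrt(2/n)*sqrt(2/pi)
```

This is the mechanism behind the n^{−1/2} total-variation rates obtained in this paper, here in the simplest i.i.d. second-chaos setting.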

• A convenient lemma. The following result is a direct consequence of the Borel-Cantelli Lemma (the proof is elementary; see e.g. [14]). It is convenient for establishing almost-sure convergences from L^p convergences.

Lemma 1 Let γ > 0 and let (Z_n)_{n∈N} be a sequence of random variables. If for every p ≥ 1 there exists a constant c_p > 0 such that, for all n ∈ N, ‖Z_n‖_{L^p(Ω)} ≤ c_p · n^{−γ}, then for all ε > 0 there exists a random variable η_ε such that

    |Z_n| ≤ η_ε · n^{−γ+ε} almost surely, for all n ∈ N.

Moreover, E|η_ε|^p < ∞ for all p ≥ 1.

3 The model

3.1 Definition

We consider the following AR(1) model:

    Y_n = a_0 + a_1 Y_{n−1} + ε_n,  n ≥ 1,
    ε_n = Σ_{j=1}^∞ σ_j (Z_{n,j}² − 1),    (8)
    Y_0 = y_0 ∈ R,

where a_0, a_1 and {σ_j, j ≥ 1} are real constants. The sequence of innovations {ε_n, n ≥ 1} is i.i.d., with distribution in the second Wiener chaos. It turns out that this sequence can be represented as in the second line above in (8), where the family {Z_{n,j}, n ≥ 1, j ≥ 1} are i.i.d. standard Gaussian random variables defined on (Ω, F, P), and {σ_j ; j ≥ 1} is a sequence of reals satisfying

    Σ_{j=1}^∞ σ_j² < ∞.    (9)

This is explained in [20, Section 2.7.4]. We assume that the mean-reversion parameter a_1 is such that |a_1| < 1. Under this condition, (8) also admits a stationary ergodic solution. Both the version above and the stationary version are linear functionals of elements of the form of ε_n, which are elements of the second Wiener chaos. Since this chaos is a vector space, both versions of Y take values in the second Wiener chaos. By truncating the series in (8), one obtains a process which is a sum of chi-squared variables, converging to Y in L²(Ω). Special cases where the sum is finite can be considered. In the figures below, we simulate 500 observations from such cases, to show the variety of behaviors, even with a limited number of terms in the noise series.
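Such a simulation can be sketched as follows (our illustration; simulate_ar1_second_chaos is a hypothetical helper name, and the parameters mimic one panel of Figure 1). Since E[ε_n] = 0 and |a₁| < 1, a long run should have sample mean near the stationary mean a₀/(1 − a₁), with the positive skewness discussed in Remark 2.

```python
import numpy as np

# Simulate the AR(1) model (8) with a finite second-chaos innovation
# eps_n = sum_j sigma_j (Z_{n,j}^2 - 1); here sigma_1 = sigma_2 = 2 gives
# an exponential-type (chi-squared with 2 degrees of freedom) white noise.
def simulate_ar1_second_chaos(n, a0, a1, y0, sigmas, rng):
    sigmas = np.asarray(sigmas, dtype=float)
    z = rng.standard_normal((n, len(sigmas)))
    eps = (z**2 - 1.0) @ sigmas          # centered second-chaos innovations
    y = np.empty(n)
    prev = y0
    for i in range(n):                   # the AR(1) recursion (8)
        prev = a0 + a1 * prev + eps[i]
        y[i] = prev
    return y

rng = np.random.default_rng(42)
y = simulate_ar1_second_chaos(50_000, a0=0.8, a1=0.5, y0=1.0,
                              sigmas=[2.0, 2.0], rng=rng)
skew = float(((y - y.mean())**3).mean() / y.std()**3)
print(y.mean())   # ~ a0/(1 - a1) = 1.6, since E[eps] = 0
print(skew)       # positive: the noise is right-skewed
```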

• When σ₁ = σ and σ_j = 0 for all j ≥ 2, ε corresponds to a scaled mean-zero chi-squared white noise with one degree of freedom: Z_{n,1}² ∼ χ²(1).

• When σ₁ = σ₂ = σ and σ_j = 0 for all j ≥ 3, an exponential white noise with rate parameter 1/(2σ): indeed Z_{n,1}² + Z_{n,2}² ∼ E(1/2).

• When σ₁ = −σ₂ and σ_j = 0 for all j > 2, a symmetric second-chaos white noise whose law is that of a product of normals: if N, N′ are two i.i.d. standard normals, then 2NN′ ∼ (Z_{n,1}² − 1) − (Z_{n,2}² − 1) = Z_{n,1}² − Z_{n,2}².

Figure 1: 500 observations from (8) for different values of a₀, a₁, y₀ and σ_j, j = 1, 2. Panels: (a) a₀ = 0.8, a₁ = 0.5, y₀ = 1, σ₁ = σ₂ = 2; (b) a₀ = 1.2, a₁ = 0.4, y₀ = 2, σ = 4; (c) a₀ = 2, a₁ = 0.7, y₀ = 3, σ = 2; (d) a₀ = 1.2, a₁ = 0.4, y₀ = 0, σ₁ = −σ₂ = 1. [Plots not reproduced here; only the panel captions are recoverable from the source.]

Remark 2 We can see from the figures above the asymmetry in figures (a), (b) and (c), due to the asymmetric nature of the noise; figure (d) shows more symmetry because of the choice σ₁ = −σ₂. We also notice that when the mean reversion is fairly strong and the noise is large, the shape of the observations is balanced (figure (a)), while when the noise is larger compared to the mean-reversion parameter, the observations look like an Ornstein-Uhlenbeck process with a noise larger than the drift (see figure (b)).

3.2 Quadratic variation

This paper's main goal is to determine the asymptotic distribution of the quadratic variation of the observations {Y_n, n ≥ 1} using analysis on Wiener space.

This will be facilitated by the fact, mentioned above, that the sequence (Y_n)_{n≥1} lives in the second Wiener chaos with respect to the Wiener process W, by virtue of being the solution of a linear equation with noise in the second chaos. To be more specific, observe that {Y_n, n ≥ 1} in (8) can be expressed as follows: for i ≥ 1,

    Y_i = d_i + Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j (Z_{k,j}² − 1),    (10)

where

    d_i = a_1^i y_0 + a_0 Σ_{k=1}^i a_1^{i−k}.    (11)
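The recursion (8) and the closed form (10)-(11) can be cross-checked numerically; the sketch below (ours) uses a simple one-term second-chaos innovation.

```python
import numpy as np

# Check that the recursion Y_i = a0 + a1*Y_{i-1} + eps_i agrees with the
# closed form (10): Y_i = d_i + sum_{k=1}^i a1^{i-k} eps_k, with d_i as in (11).
rng = np.random.default_rng(7)
a0, a1, y0, n = 2.0, 0.7, 3.0, 200
eps = rng.standard_normal(n)**2 - 1.0        # one-term second-chaos innovation

y_rec = np.empty(n)
prev = y0
for i in range(n):
    prev = a0 + a1 * prev + eps[i]
    y_rec[i] = prev

i_idx = np.arange(1, n + 1)
powers = a1 ** (i_idx[:, None] - i_idx[None, :])   # matrix of a1^{i-k}
powers = np.tril(powers)                           # keep only k <= i
d = a1**i_idx * y0 + a0 * powers.sum(axis=1)       # the deterministic part (11)
y_closed = d + powers @ eps                        # the closed form (10)
print(np.allclose(y_rec, y_closed))
```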

For the sake of ease of computation in Wiener chaos, it will be convenient throughout this paper to refer to the Wiener integral representation of the noise terms Z_{k,j}². For this, there exists {h_{k,j}, k ≥ 1, j ≥ 1}, an orthonormal family in L²([0,1]), for which Z_{k,j} = W(h_{k,j}) = I₁^W(h_{k,j}) = ∫₀¹ h_{k,j}(r) dW(r). Hence, using the fact, which comes from the most elementary application of the product formula (4), that W(ϕ)² − 1 = I₂^W(ϕ^{⊗2}), we have for i ≥ 1:

    Y_i = d_i + Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j (Z_{k,j}² − 1)
        = d_i + Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j (W²(h_{k,j}) − 1)
        = d_i + Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j I₂^W(h_{k,j}^{⊗2}).

Therefore, using the linearity property of multiple integrals, we can write, for i ≥ 1,

    Y_i = d_i + Ỹ_i, where Ỹ_i := I₂^W(f_i) and f_i := Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j h_{k,j}^{⊗2}.    (12)


A straightforward computation shows that under Assumption (9), the kernel $f_i \in L^2([0,1]^2)$ for all $i \geq 1$: indeed,

$$\|f_i\|^2_{L^2([0,1]^2)} = \sum_{j=1}^{\infty} \sigma_j^2 \times \frac{(1-a_1^{2i})}{(1-a_1^2)} \leq \frac{1}{(1-a_1^2)} \sum_{j=1}^{\infty} \sigma_j^2 < \infty.$$

Our main object of study in the next two sections is the asymptotics of the quadratic variation, defined as follows:

$$Q_n := \frac{1}{n} \sum_{i=1}^{n} (Y_i - d_i)^2 = \frac{1}{n} \sum_{i=1}^{n} \tilde{Y}_i^2. \tag{13}$$

Using product formula (4), we get

$$Q_n - E[Q_n] = \frac{1}{n} \sum_{i=1}^{n} \left( I_2^W(f_i)^2 - 2\|f_i\|^2_{L^2([0,1]^2)} \right) = \frac{1}{n} \sum_{i=1}^{n} I_4^W(f_i \otimes f_i) + \frac{4}{n} \sum_{i=1}^{n} I_2^W(f_i \otimes_1 f_i) = I_4^W\!\left( \frac{1}{n} \sum_{i=1}^{n} f_i \otimes f_i \right) + I_2^W\!\left( \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right) =: T_{4,n} + T_{2,n}. \tag{14}$$

In the next section, we show that the asymptotic variance of $\sqrt{n}\,(Q_n - E[Q_n])$ exists and we compute its speed of convergence. Then we establish a CLT for $Q_n$, and compute its Berry-Esséen speed of convergence in total variation.
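On data, $Q_n$ from (13) is a one-liner once the deterministic means $d_i$ from (11) are subtracted. A minimal sketch (ours, not the paper's code; it assumes $a_1 \neq 1$ and a path array $y = (Y_0, \dots, Y_n)$):

```python
import numpy as np

def quadratic_variation(y, a0, a1, y0):
    """Q_n = (1/n) * sum_{i=1}^n (Y_i - d_i)^2, as in (13)."""
    y = np.asarray(y, dtype=float)
    n = len(y) - 1
    i = np.arange(1, n + 1)
    d = a1 ** i * y0 + a0 * (1.0 - a1 ** i) / (1.0 - a1)   # d_i from (11)
    return float(np.mean((y[1:] - d) ** 2))
```

A path whose residuals $Y_i - d_i$ are constant gives that constant squared, which makes the function easy to sanity-check.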

4 Asymptotic variance of the quadratic variation

Using the orthogonality of multiple integrals living in different chaoses, to calculate the limiting variance of $\sqrt{n}\,(Q_n - E[Q_n])$, we need only study separately the second moments of the terms $T_{2,n}$ and $T_{4,n}$ given in (14).

4.1 Scale constant for $T_{2,n}$

Proposition 3 Under Assumption (9), with $T_{2,n}$ as in (14), for large $n$,

$$\left| E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^2)^2} \right| \leq \frac{C_1}{n}, \tag{15}$$

where $C_1 := 32 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1+a_1^2(5+6a_1^2)}{(1-a_1^4)^2(1-a_1^2)}$. In particular,

$$l_1 := \lim_{n\to\infty} E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] = \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^2)^2}. \tag{16}$$

Proof. We have $T_{2,n} = I_2^W\left( \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right)$, so by the isometry property (3) of multiple integrals, we get

$$E[T_{2,n}^2] = 2 \left\| \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right\|^2_{L^2([0,1]^2)} = \frac{32}{n^2} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_i,\, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)}.$$

Moreover, under Assumption (9), we have

$$(f_i \otimes_1 f_i)(x,y) = \sum_{k_1,k_2=1}^{i} a_1^{i-k_1} a_1^{i-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2}\, \big( h_{k_1,j_1}^{\otimes 2} \otimes_1 h_{k_2,j_2}^{\otimes 2} \big)(x,y) = \sum_{k_1,k_2=1}^{i} a_1^{i-k_1} a_1^{i-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2}\, (h_{k_1,j_1} \otimes h_{k_2,j_2})(x,y)\, \delta_{j_1,j_2} \delta_{k_1,k_2} = \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2\, (h_{k,j}^{\otimes 2})(x,y),$$

where $\delta_{ij}$ denotes the Kronecker delta, defined as follows: $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if $i = j$. Therefore, for $i, j \geq 1$ such that $j \geq i$, we get

$$\langle f_i \otimes_1 f_i,\, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)} = \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2\, \big\langle h_{k_1,j_1}^{\otimes 2},\, h_{k_2,j_2}^{\otimes 2} \big\rangle_{L^2([0,1]^2)} = \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2\, \big( \langle h_{k_1,j_1}, h_{k_2,j_2} \rangle_{L^2([0,1])} \big)^2$$

$$= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2\, \delta_{j_1,j_2} \delta_{k_1,k_2} = a_1^{2(j-i)} \sum_{k=1}^{i} a_1^{4(i-k)} \sum_{j=1}^{\infty} \sigma_j^4 = a_1^{2(j-i)} \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \times \left( \frac{1-a_1^{4i}}{1-a_1^4} \right). \tag{17}$$

Therefore, by (17), we have

$$E\big[(\sqrt{n}\,T_{2,n})^2\big] = \frac{32}{n} \sum_{i=1}^{n} \|f_i \otimes_1 f_i\|^2_{L^2([0,1]^2)} + \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i,\, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)}.$$

Moreover,

$$\left| \frac{32}{n} \sum_{i=1}^{n} \|f_i \otimes_1 f_i\|^2_{L^2([0,1]^2)} - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{1-a_1^4} \right| = \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{1-a_1^4}\, \frac{1}{n} \left| \sum_{i=1}^{n} a_1^{4i} \right| \leq \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^4)^2}\, \frac{1}{n}.$$

On the other hand,

$$\left| \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i,\, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)} - \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^4)(1-a_1^2)} \right| = \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^4)(1-a_1^2)} \left| \frac{1}{n} \sum_{i=1}^{n} \big( (1-a_1^{2(n-i)})(1-a_1^{4i}) - 1 \big) \right| \leq \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^4)(1-a_1^2)}\, \frac{3}{n} \sum_{i=0}^{n} a_1^{2i} \leq \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^4)(1-a_1^2)^2}\, \frac{3}{n}.$$

Consequently,

$$\left| E\big[(\sqrt{n}\,T_{2,n})^2\big] - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1-a_1^2)^2} \right| \leq \left| \frac{32}{n} \sum_{i=1}^{n} \|f_i \otimes_1 f_i\|^2 - \frac{32 \sum_j \sigma_j^4}{1-a_1^4} \right| + \left| \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i,\, f_j \otimes_1 f_j \rangle - \frac{64 a_1^2 \sum_j \sigma_j^4}{(1-a_1^4)(1-a_1^2)} \right|$$

$$\leq \frac{32 \sum_j \sigma_j^4}{(1-a_1^4)^2}\, \frac{1}{n} + \frac{192\, a_1^2 \sum_j \sigma_j^4}{(1-a_1^4)(1-a_1^2)^2}\, \frac{1}{n} = 32 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1+a_1^2(5+6a_1^2)}{(1-a_1^4)^2(1-a_1^2)}\, \frac{1}{n},$$

where the triangle-inequality step uses $\frac{32}{1-a_1^4} + \frac{64 a_1^2}{(1-a_1^4)(1-a_1^2)} = \frac{32}{(1-a_1^2)^2}$. The desired result is therefore obtained.

4.2 Scale constant for $T_{4,n}$

Proposition 4 Under Assumption (9), with $T_{4,n}$ as in (14), for large $n$,

$$\left| E\big[ (\sqrt{n}\, T_{4,n})^2 \big] - \frac{4}{(1-a_1^2)^2} \left( \sum_{j=1}^{\infty} \sigma_j^4 + \frac{1+a_1^2}{1-a_1^2} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \right) \right| \leq \frac{C_2}{n}, \tag{18}$$

where $C_2 := C_{2,1} + C_{2,2}$, with

$$C_{2,1} := 4 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \frac{3+17a_1^2}{(1-a_1^2)^4}, \qquad C_{2,2} := 4 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1+a_1^2(5+6a_1^2)}{(1-a_1^4)^2(1-a_1^2)}.$$

In particular,

$$l_2 := \lim_{n\to\infty} E\big[ (\sqrt{n}\, T_{4,n})^2 \big] = \frac{4}{(1-a_1^2)^2} \left( \sum_{j=1}^{\infty} \sigma_j^4 + \frac{1+a_1^2}{1-a_1^2} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \right). \tag{19}$$

Proof. By definition of the term $T_{4,n}$, we have

$$E\big[ (\sqrt{n}\, T_{4,n})^2 \big] = 4! \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i\, \widetilde{\otimes}\, f_i \right\|^2_{L^2([0,1]^4)} = \frac{4!}{n} \sum_{i,j=1}^{n} \big\langle f_i\, \widetilde{\otimes}\, f_i,\, f_j\, \widetilde{\otimes}\, f_j \big\rangle_{L^2([0,1]^4)},$$

where $f_i\, \widetilde{\otimes}\, f_i$ denotes the symmetrization of $f_i \otimes f_i$, because the kernel $\sum_{i=1}^{n} f_i \otimes f_i \in L^2([0,1]^4)$ is no longer symmetric. We deal with symmetrization by using a combinatorial formula, obtaining

$$4!\, \big\langle f_i\, \widetilde{\otimes}\, f_i,\, f_j\, \widetilde{\otimes}\, f_j \big\rangle_{L^2([0,1]^4)} = (2!)^2 \langle f_i \otimes f_i,\, f_j \otimes f_j \rangle_{L^2([0,1]^4)} + (2!)^2 \langle f_i \otimes_1 f_j,\, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)}.$$

Therefore

$$E\big[ (\sqrt{n}\, T_{4,n})^2 \big] = \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes f_i,\, f_j \otimes f_j \rangle_{L^2([0,1]^4)} + \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_j,\, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)} =: T_{4,1,n} + T_{4,2,n}.$$

Moreover,

$$T_{4,1,n} = \frac{4}{n} \sum_{i,j=1}^{n} \big( \langle f_i, f_j \rangle_{L^2([0,1]^2)} \big)^2 = \frac{4}{n} \sum_{i=1}^{n} \big( \langle f_i, f_i \rangle_{L^2([0,1]^2)} \big)^2 + \frac{8}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \big( \langle f_i, f_j \rangle_{L^2([0,1]^2)} \big)^2.$$

On the other hand, using (9), we have for $j \geq i$,

$$\langle f_i, f_j \rangle_{L^2([0,1]^2)} = \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{i-k_1} a_1^{j-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2}\, \delta_{j_1,j_2} \delta_{k_1,k_2} = a_1^{j-i} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right) \times \left( \frac{1-a_1^{2i}}{1-a_1^2} \right). \tag{20}$$

Therefore, by (20), we get

$$\left| T_{4,1,n} - \frac{4}{(1-a_1^2)^2} \left( \sum_{j} \sigma_j^2 \right)^2 - \frac{8a_1^2}{(1-a_1^2)^3} \left( \sum_{j} \sigma_j^2 \right)^2 \right| \leq \frac{4}{(1-a_1^2)^2} \left( \sum_j \sigma_j^2 \right)^2 \frac{1}{n} \sum_{i=1}^{n} \big| (1-a_1^{2i})^2 - 1 \big| + \frac{8a_1^2}{(1-a_1^2)^3} \left( \sum_j \sigma_j^2 \right)^2 \frac{1}{n} \sum_{i=1}^{n} \big| (1-a_1^{2(n-i)})(1-a_1^{2i})^2 - 1 \big|$$

$$\leq \frac{12}{(1-a_1^2)^3} \left( \sum_j \sigma_j^2 \right)^2 \frac{1}{n} + \frac{80\,a_1^2}{(1-a_1^2)^4} \left( \sum_j \sigma_j^2 \right)^2 \frac{1}{n} = \frac{4\,(3+17a_1^2)}{(1-a_1^2)^4} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \frac{1}{n} = \frac{C_{2,1}}{n};$$

since $\frac{4}{(1-a_1^2)^2} + \frac{8a_1^2}{(1-a_1^2)^3} = \frac{4(1+a_1^2)}{(1-a_1^2)^3}$, the limit of $T_{4,1,n}$ is the second summand of $l_2$. Now let us estimate

$$T_{4,2,n} = \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_j,\, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)}.$$

For $j \geq i$ and $x, y \in [0,1]$, we have

$$(f_i \otimes_1 f_j)(x,y) = a_1^{j-i} \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2\, (h_{k,j}^{\otimes 2})(x,y). \tag{21}$$

So, for $j \geq i$,

$$\langle f_i \otimes_1 f_j,\, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)} = \int_{[0,1]^2} (f_i \otimes_1 f_j)^2(x,y)\, dx\, dy = a_1^{2(j-i)} \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \times \left( \frac{1-a_1^{4i}}{1-a_1^4} \right).$$

Therefore, exactly as in the proof of Proposition 3 (with the constants 32 and 64 replaced by 4 and 8),

$$\left| T_{4,2,n} - \frac{4 \sum_j \sigma_j^4}{(1-a_1^2)^2} \right| \leq \frac{4 \sum_j \sigma_j^4}{(1-a_1^4)^2}\, \frac{1}{n} + \frac{24\, a_1^2 \sum_j \sigma_j^4}{(1-a_1^4)(1-a_1^2)^2}\, \frac{1}{n} = \frac{4 \left( \sum_j \sigma_j^4 \right) \left[ 1+a_1^2(5+6a_1^2) \right]}{(1-a_1^4)^2(1-a_1^2)}\, \frac{1}{n} = \frac{C_{2,2}}{n},$$

where we used $\frac{4}{1-a_1^4} + \frac{8a_1^2}{(1-a_1^4)(1-a_1^2)} = \frac{4}{(1-a_1^2)^2}$, which completes the proof.

To get a sense of how the two terms $T_{2,n}$ and $T_{4,n}$ compare to each other, we propose the following example, which shows that, despite one's best efforts, one should not expect either of these two terms to dominate the other.

Remark 5 In the AR(1) model (8) with chi-squared white noise, i.e. when $\sigma_1 = \sigma$ and $\sigma_j = 0$ for all $j \geq 2$, one can try to compare the two formulas for the asymptotic variances of $T_{2,n}$ and $T_{4,n}$. Avoiding the situation where $|a_1|$ is very close to 1, assuming for instance $|a_1| < 2^{-1/2}$, so that $1 - a_1^2 > 1/2$, when $n$ is large, we have

$$\mathrm{Var}(T_{2,n}) \sim \frac{1}{n}\, \frac{32\sigma^4}{(1-a_1^2)^2} \sim 4 \times (1-a_1^2) \times \mathrm{Var}(T_{4,n}) > 4 \times \frac{1}{2} \times \mathrm{Var}(T_{4,n}) = 2 \times \mathrm{Var}(T_{4,n}).$$

Therefore the sequence $T_{4,n}$ can be made to have a variance which is significantly smaller than that of $T_{2,n}$ in this case, but both of them converge to zero at the same speed $n^{-1}$.

Using the orthogonality between $T_{2,n}$ and $T_{4,n}$, Proposition 3 and Proposition 4, we conclude the following.

Theorem 6 Under Assumption (9), with $Q_n$ as in (13), for large $n$,

$$\big| n\, \mathrm{Var}(Q_n) - (l_1+l_2) \big| \leq \frac{C_1+C_2}{n},$$

and in particular the asymptotic variance of $Q_n$ is

$$\lim_{n\to\infty} n\, \mathrm{Var}(Q_n) = l_1 + l_2 = \frac{36}{(1-a_1^2)^2} \sum_{j=1}^{\infty} \sigma_j^4 + \frac{4(1+a_1^2)}{(1-a_1^2)^3} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2,$$

where $C_1$, $l_1$, $C_2$ and $l_2$ are given respectively in (15), (16), (18) and (19).

Remark 7

• From the previous theorem, we notice that for $n$ large, and fixed values of the noise scale parameter family $\{\sigma_j, j \geq 1\}$, the variance of $\sqrt{n}\, Q_n$ has high values when $|a_1|$ is close to 1, and approaches

$$36 \sum_{j=1}^{\infty} \sigma_j^4 + 4 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2$$

when $|a_1|$ is small.

• The previous theorem also shows that one can obtain other asymptotics depending on the relation between $a_1$ and the family $\{\sigma_j, j \geq 1\}$. For instance, when $|a_1|$ is close to 1, which is the limit of fast mean reversion, one can avoid an explosion of $Q_n$'s asymptotic variance by scaling the variance parameters appropriately, leading to a fast-mean-reversion and small-noise regime. Letting $1/\alpha := 1 - a_1^2$, where $\alpha$ is interpreted as a rate of mean reversion, one would only need to ensure that $\sum \sigma_j^4 = O(\alpha^{-2})$ and $\sum \sigma_j^2 = O(\alpha^{-3/2})$. In the example where there is a single non-zero value $\sigma$, for instance, we would obtain for large $\alpha$,

$$n \times \mathrm{Var}(Q_n) \sim 36\, \alpha^2 \sigma^4 + 8\, \alpha^3 \sigma^4;$$

here the second term dominates, and as $\alpha \to \infty$, assuming $\alpha^3 \sigma^4$ remains bounded, we would get an asymptotic variance of $8 \lim_{\alpha\to\infty} \alpha^3 \sigma^4$ if the limit exists.
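The closed form in Theorem 6 can be checked numerically. The sketch below is our own Monte Carlo illustration (single nonzero $\sigma$, $a_0 = y_0 = 0$, modest sizes), comparing $n\,\mathrm{Var}(Q_n)$ to $l_1 + l_2$; only rough agreement should be expected because of the $O(1/n)$ bias and the Monte Carlo error.

```python
import numpy as np

def asymptotic_variance(a1, sigmas):
    """l1 + l2 from Theorem 6:
    36*sum(sigma_j^4)/(1-a1^2)^2 + 4*(1+a1^2)*(sum(sigma_j^2))^2/(1-a1^2)^3."""
    s = np.asarray(sigmas, dtype=float)
    return (36.0 * np.sum(s ** 4) / (1 - a1 ** 2) ** 2
            + 4.0 * (1 + a1 ** 2) * np.sum(s ** 2) ** 2 / (1 - a1 ** 2) ** 3)

def mc_n_var_qn(n, a1, sigma, reps, seed=0):
    """Monte Carlo estimate of n*Var(Q_n), model with a0 = y0 = 0 so d_i = 0."""
    rng = np.random.default_rng(seed)
    qns = np.empty(reps)
    for r in range(reps):
        eps = sigma * (rng.standard_normal(n) ** 2 - 1.0)  # second-chaos noise
        y, prev = np.empty(n), 0.0
        for k in range(n):
            prev = a1 * prev + eps[k]
            y[k] = prev
        qns[r] = np.mean(y ** 2)     # Q_n, since d_i = 0 here
    return n * np.var(qns)
```

For $a_1 = 0.3$, $\sigma = 1$, the formula gives $36/0.8281 + 4\cdot 1.09/0.91^3 \approx 49.3$; a run with a few thousand replications should land in the same range.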

5 Berry-Esséen bound for the asymptotic normality of the quadratic variation

In this section, we prove that the quadratic variation defined in (13) is asymptotically normal and we estimate the speed of this convergence in total variation distance, showing it is of the Berry-Esséen-type order $n^{-1/2}$. For this aim, we will need the following theorem, which estimates the total variation distance to the normal of the standardized sum of variables in the 2nd and 4th chaoses.

Theorem 8 Let $F = I_2(f) + I_4(g)$ where $f \in L^2_s([0,1]^2)$ and $g \in L^2_s([0,1]^4)$. Then

$$d_{TV}\left( \frac{F}{\sqrt{EF^2}},\, N(0,1) \right) \leq \frac{4}{EF^2} \Big[ \sqrt{2}\, \|f \otimes_1 f\|_{L^2([0,1]^2)} + 2\sqrt{6!}\, \|g \otimes_1 g\|_{L^2([0,1]^6)} + 18\sqrt{4!}\, \|g \otimes_2 g\|_{L^2([0,1]^4)} + 36\sqrt{2}\, \|g \otimes_3 g\|_{L^2([0,1]^2)} + 9\sqrt{2}\, \sqrt{\langle f \otimes f,\, g \otimes_2 g \rangle_{L^2([0,1]^4)}} + 3\sqrt{4!}\, \sqrt{\|f \otimes_1 f\|_{L^2([0,1]^2)}\, \|g \otimes_3 g\|_{L^2([0,1]^2)}} \Big]. \tag{22}$$

Moreover, letting $R_F$ be the bracketed term on the right-hand side of (22), for any constant $\sigma > 0$, we have

$$d_{TV}\left( \frac{F}{\sigma},\, N(0,1) \right) \leq \frac{4}{\sigma^2}\, R_F + 2 \left| 1 - \frac{EF^2}{\sigma^2} \right|. \tag{23}$$

Proof. We have $F = I_2(f) + I_4(g)$.

Then $-D_t L^{-1} F = I_1(f(\cdot,t)) + I_3(g(\cdot,t)) =: u(t)$. Thus, using $F = \delta(-DL^{-1}F)$, we can write $F = \delta(u)$. Now we use the result of a simple calculation, labeled as (9) in the preprint [21] (see also [24]), to obtain

$$d_{TV}\left( \frac{F}{\sqrt{EF^2}},\, N(0,1) \right) \leq \frac{2}{EF^2} \sqrt{\mathrm{Var}\big( \langle DF, u \rangle_{L^2([0,1])} \big)} = \frac{2}{EF^2} \sqrt{E\Big[ \big( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \big)^2 \Big]}, \tag{24}$$

where the last equality comes from the duality relation $E\langle DF, u \rangle_{L^2([0,1])} = E(F\,\delta(u)) = EF^2$. The prior inequality appears to be used in a more general context here than what is stated in [21, Eq. (9)], but an immediate inspection of its proof therein shows that it applies to any situation where $F = \delta(u)$, using only general results such as Stein's lemma, the chain rule for the Malliavin derivative $D$, and the duality between $D$ and $\delta$. On the other hand, using the product formula (4),

$$\langle DF, u \rangle_{L^2([0,1])} = \int_0^1 u(t)\, D_t F\, dt = \int_0^1 u(t)\, \big( 2 I_1(f(\cdot,t)) + 4 I_3(g(\cdot,t)) \big)\, dt = \int_0^1 \Big( 2\, [I_1(f(\cdot,t))]^2 + 4\, [I_3(g(\cdot,t))]^2 + 6\, I_1(f(\cdot,t))\, I_3(g(\cdot,t)) \Big)\, dt$$

$$= \int_0^1 \Big[ 2 I_2\big( f(\cdot,t) \otimes f(\cdot,t) \big) + 2 \|f(\cdot,t)\|^2_{L^2([0,1])} + 4 I_6\big( g(\cdot,t) \otimes g(\cdot,t) \big) + 36 I_4\big( g(\cdot,t) \otimes_1 g(\cdot,t) \big) + 72 I_2\big( g(\cdot,t) \otimes_2 g(\cdot,t) \big) + 24\, g(\cdot,t) \otimes_3 g(\cdot,t) + 6 I_4\big( f(\cdot,t) \otimes g(\cdot,t) \big) + 18 I_2\big( f(\cdot,t) \otimes_1 g(\cdot,t) \big) \Big]\, dt.$$

Thus

$$\langle DF, u \rangle_{L^2([0,1])} - EF^2 = \int_0^1 \big( I_2(h_1(\cdot,t)) + I_4(h_2(\cdot,t)) + I_6(h_3(\cdot,t)) \big)\, dt,$$

where

$$h_1(\cdot,t) = 2\, f(\cdot,t) \otimes f(\cdot,t) + 72\, g(\cdot,t) \otimes_2 g(\cdot,t) + 18\, f(\cdot,t) \otimes_1 g(\cdot,t),$$
$$h_2(\cdot,t) = 36\, g(\cdot,t) \otimes_1 g(\cdot,t) + 6\, f(\cdot,t) \otimes g(\cdot,t),$$
$$h_3(\cdot,t) = 4\, g(\cdot,t) \otimes g(\cdot,t).$$

Therefore, using the Minkowski inequality,

$$\sqrt{E\Big[ \big( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \big)^2 \Big]} \leq \sqrt{2} \left\| \int_0^1 h_1(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} + \sqrt{4!} \left\| \int_0^1 h_2(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} + \sqrt{6!} \left\| \int_0^1 h_3(\cdot,t)\, dt \right\|_{L^2([0,1]^6)}.$$

Furthermore,

$$\left\| \int_0^1 h_1(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} \leq 2 \left\| \int_0^1 f(\cdot,t) \otimes f(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} + 72 \left\| \int_0^1 g(\cdot,t) \otimes_2 g(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} + 18 \left\| \int_0^1 f(\cdot,t) \otimes_1 g(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} = 2\, \|f \otimes_1 f\|_{L^2([0,1]^2)} + 72\, \|g \otimes_3 g\|_{L^2([0,1]^2)} + 18\, \sqrt{\langle f \otimes f,\, g \otimes_2 g \rangle_{L^2([0,1]^4)}}.$$

Also,

$$\left\| \int_0^1 h_2(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} \leq 36 \left\| \int_0^1 g(\cdot,t) \otimes_1 g(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} + 6 \left\| \int_0^1 f(\cdot,t) \otimes g(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} = 36\, \|g \otimes_2 g\|_{L^2([0,1]^4)} + 6\, \sqrt{\langle f \otimes_1 f,\, g \otimes_3 g \rangle_{L^2([0,1]^2)}} \leq 36\, \|g \otimes_2 g\|_{L^2([0,1]^4)} + 6\, \sqrt{\|f \otimes_1 f\|_{L^2([0,1]^2)}\, \|g \otimes_3 g\|_{L^2([0,1]^2)}},$$

and

$$\left\| \int_0^1 h_3(\cdot,t)\, dt \right\|_{L^2([0,1]^6)} = 4 \left\| \int_0^1 g(\cdot,t) \otimes g(\cdot,t)\, dt \right\|_{L^2([0,1]^6)} = 4\, \|g \otimes_1 g\|_{L^2([0,1]^6)}.$$

As a consequence,

$$\sqrt{E\Big[ \big( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \big)^2 \Big]} \leq 2\sqrt{2}\, \|f \otimes_1 f\|_{L^2([0,1]^2)} + 72\sqrt{2}\, \|g \otimes_3 g\|_{L^2([0,1]^2)} + 18\sqrt{2}\, \sqrt{\langle f \otimes f,\, g \otimes_2 g \rangle_{L^2([0,1]^4)}} + 36\sqrt{4!}\, \|g \otimes_2 g\|_{L^2([0,1]^4)} + 6\sqrt{4!}\, \sqrt{\|f \otimes_1 f\|_{L^2([0,1]^2)}\, \|g \otimes_3 g\|_{L^2([0,1]^2)}} + 4\sqrt{6!}\, \|g \otimes_1 g\|_{L^2([0,1]^6)}.$$

This, combined with (24), establishes inequality (22). For (23), we have by (6)

$$d_{TV}\left( \frac{F}{\sigma},\, N(0,1) \right) \leq \frac{2}{\sigma^2}\, E\Big[ \big| \sigma^2 - \langle DF, u \rangle_{L^2([0,1])} \big| \Big] \leq \frac{2}{\sigma^2}\, E\Big[ \big| E[F^2] - \langle DF, u \rangle_{L^2([0,1])} \big| \Big] + 2 \left| 1 - \frac{E[F^2]}{\sigma^2} \right|. \tag{25}$$

Inequality (23) follows using inequalities (25) and (22).

We will need Theorem 8 along with the next Lemmas 9, 10 and 11 in order to prove that the quadratic variation $Q_n$ satisfies the Berry-Esséen Theorem 13 below.

Lemma 9 Let $g_{2,n} := \frac{4}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes_1 f_i$, where $f_i$ is defined in (12). If Assumption (9) holds, then the kernel $g_{2,n}$ satisfies

$$\|g_{2,n} \otimes_1 g_{2,n}\|_{L^2([0,1]^2)} \leq \frac{C_3}{\sqrt{n}}, \tag{26}$$

where $C_3 := \sqrt{ 4! \cdot 4^4 \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \frac{1}{1-a_1^8} \left( \frac{1}{1-a_1^2} \right)^3 }$.

Proof. We have

$$\|g_{2,n} \otimes_1 g_{2,n}\|^2_{L^2([0,1]^2)} = \frac{4^4}{n^2} \sum_{i,j,k,l=1}^{n} \big\langle (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j),\, (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \big\rangle_{L^2([0,1]^2)}.$$

Moreover, by the above calculations and (9),

$$(f_i \otimes_1 f_i)(x,y) = \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2\, h_{k,j}^{\otimes 2}(x,y).$$

Hence

$$\big( (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j) \big)(x,y) = \int_{[0,1]} (f_i \otimes_1 f_i)(x,t)\, (f_j \otimes_1 f_j)(y,t)\, dt = \sum_{k_1=1}^{i \wedge j} a_1^{2(i+j-2k_1)} \sum_{j=1}^{\infty} \sigma_j^4\, (h_{k_1,j} \otimes h_{k_1,j})(x,y).$$

Similarly

$$\big( (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \big)(x,y) = \sum_{k_2=1}^{k \wedge l} a_1^{2(k+l-2k_2)} \sum_{j=1}^{\infty} \sigma_j^4\, (h_{k_2,j} \otimes h_{k_2,j})(x,y).$$

Therefore

$$\big\langle (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j),\, (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \big\rangle_{L^2([0,1]^2)} = \sum_{m=1}^{i \wedge j \wedge k \wedge l} a_1^{2(i+j+k+l-4m)} \left( \sum_{j=1}^{\infty} \sigma_j^8 \right).$$

Consequently,

$$\|g_{2,n} \otimes_1 g_{2,n}\|^2_{L^2([0,1]^2)} = \frac{4^4}{n^2} \sum_{i,j,k,l=1}^{n} \sum_{r=1}^{i \wedge j \wedge k \wedge l} a_1^{2(i+j+k+l-4r)} \sum_{j=1}^{\infty} \sigma_j^8 \leq 4!\, \frac{4^4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leq i \leq j \leq k \leq l \leq n} \sum_{r=1}^{i} a_1^{2(i+j+k+l-4r)}$$

$$= 4!\, \frac{4^4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leq i \leq j \leq k \leq l \leq n} a_1^{2(j+k+l-3i)}\, \frac{1-a_1^{8i}}{1-a_1^8} \leq 4!\, \frac{4^4}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1-a_1^8} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(k_1+k_2+k_3)} \leq 4!\, \frac{4^4}{n} \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \frac{1}{1-a_1^8} \left( \frac{1}{1-a_1^2} \right)^3,$$

where we used the change of variables $k_1 = j - i$, $k_2 = k - i$, $k_3 = l - i$. The desired result therefore follows.

Lemma 10 Let $g_{4,n} := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes f_i$, where $f_i$ is defined in (12). If Assumption (9) holds, then for every $r = 1, 2, 3$, the kernel $g_{4,n}$ satisfies

$$\|g_{4,n} \otimes_r g_{4,n}\|_{L^2([0,1]^{8-2r})} \leq \frac{C_{4,r}}{\sqrt{n}}, \tag{27}$$

where

$$C_{4,r} := \begin{cases} \sqrt{ 4! \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \left( \dfrac{1}{(1-a_1^2)(1-a_1^4)} \right)^3 } & \text{if } r = 1, \\[3mm] \sqrt{ 4! \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^4 \left( \dfrac{1}{1-a_1^2} \right)^7 } & \text{if } r = 2, \\[3mm] \sqrt{ 4! \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \dfrac{1}{(1-a_1^4)^2 (1-a_1^2)^4} } & \text{if } r = 3. \end{cases}$$

Proof. For $r = 1, 2, 3$, we have

$$\|g_{4,n} \otimes_r g_{4,n}\|^2_{L^2([0,1]^{2(4-r)})} \leq \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \big\langle (f_i \otimes f_i) \otimes_r (f_j \otimes f_j),\, (f_k \otimes f_k) \otimes_r (f_l \otimes f_l) \big\rangle_{L^2([0,1]^{2(4-r)})}.$$

For $r = 1$, we get

$$\big\langle (f_i \otimes f_i) \otimes_1 (f_j \otimes f_j),\, (f_k \otimes f_k) \otimes_1 (f_l \otimes f_l) \big\rangle_{L^2([0,1]^6)} = \langle f_i, f_k \rangle_{L^2([0,1]^2)}\, \langle f_j, f_l \rangle_{L^2([0,1]^2)}\, \langle f_i \otimes_1 f_j,\, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)},$$

so that

$$\|g_{4,n} \otimes_1 g_{4,n}\|^2_{L^2([0,1]^6)} \leq \frac{4!}{n^2} \sum_{1 \leq i \leq j \leq k \leq l \leq n} \big| \langle f_i, f_k \rangle\, \langle f_j, f_l \rangle\, \langle f_i \otimes_1 f_j,\, f_k \otimes_1 f_l \rangle \big|. \tag{28}$$

By (9), for all $1 \leq i, k \leq n$,

$$\langle f_i, f_k \rangle_{L^2([0,1]^2)} = \sum_{m=1}^{i \wedge k} a_1^{i+k-2m} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right). \tag{29}$$

Similarly, for all $1 \leq j, l \leq n$,

$$\langle f_j, f_l \rangle_{L^2([0,1]^2)} = \sum_{m=1}^{j \wedge l} a_1^{j+l-2m} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right).$$

On the other hand, for $1 \leq i \leq j \leq n$,

$$(f_i \otimes_1 f_j)(x,y) = \int_0^1 f_i(x,t)\, f_j(y,t)\, dt = \sum_{m=1}^{i \wedge j} a_1^{i+j-2m} \sum_{j=1}^{\infty} \sigma_j^2\, h_{m,j}^{\otimes 2}(x,y). \tag{30}$$

Hence, by (9), for all $1 \leq i, j, k, l \leq n$,

$$\langle f_i \otimes_1 f_j,\, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)} = \sum_{m=1}^{i \wedge j \wedge k \wedge l} a_1^{i+j+k+l-4m} \left( \sum_{j=1}^{\infty} \sigma_j^4 \right).$$

Therefore, from (28) and the above calculations, we have

$$\|g_{4,n} \otimes_1 g_{4,n}\|^2_{L^2([0,1]^6)} \leq \frac{4!}{n^2}\, \frac{\left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right)}{(1-a_1^2)^2 (1-a_1^4)} \sum_{1 \leq i \leq j \leq k \leq l \leq n} a_1^{2(k-i)}\, a_1^{2(l-i)} \leq \frac{4!}{n}\, \frac{\left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right)}{(1-a_1^2)^2 (1-a_1^4)} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(2k_1+2k_2+k_3)} \leq \frac{4!}{n} \left( \sum_j \sigma_j^4 \right) \left( \sum_j \sigma_j^2 \right)^2 \left( \frac{1}{(1-a_1^2)(1-a_1^4)} \right)^3,$$

where we used the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$.

For $r = 2$, we have

$$\|g_{4,n} \otimes_2 g_{4,n}\|^2_{L^2([0,1]^4)} \leq \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \langle f_i, f_j \rangle\, \langle f_k, f_l \rangle\, \langle f_i, f_k \rangle\, \langle f_j, f_l \rangle \leq \frac{4!}{n^2} \sum_{1 \leq i \leq j \leq k \leq l \leq n} \big| \langle f_i, f_j \rangle\, \langle f_k, f_l \rangle\, \langle f_i, f_k \rangle\, \langle f_j, f_l \rangle \big|.$$

Hence, by (29) and (9), we get

$$\|g_{4,n} \otimes_2 g_{4,n}\|^2_{L^2([0,1]^4)} \leq \frac{4!}{(1-a_1^2)^4}\, \frac{\left( \sum_j \sigma_j^2 \right)^4}{n^2} \sum_{1 \leq i \leq j \leq k \leq l \leq n} a_1^{2(l-i)} \leq \frac{4!}{(1-a_1^2)^4}\, \frac{\left( \sum_j \sigma_j^2 \right)^4}{n} \left( \sum_{k_1=0}^{n} a_1^{2k_1} \right)^3 \leq \frac{4!}{(1-a_1^2)^7}\, \frac{\left( \sum_j \sigma_j^2 \right)^4}{n},$$

where we used the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$, so that $2(l-i) = 2(k_1+k_2+k_3)$.

For $r = 3$, we have

$$\|g_{4,n} \otimes_3 g_{4,n}\|^2_{L^2([0,1]^2)} \leq \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \langle f_i, f_j \rangle\, \langle f_k, f_l \rangle\, \langle f_i \otimes_1 f_j,\, f_k \otimes_1 f_l \rangle \leq \frac{4!}{n^2} \sum_{1 \leq i \leq j \leq k \leq l \leq n} \big| \langle f_i, f_j \rangle\, \langle f_k, f_l \rangle\, \langle f_i \otimes_1 f_j,\, f_k \otimes_1 f_l \rangle \big| \leq \frac{4!}{(1-a_1^2)^2 (1-a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \frac{1}{n^2} \sum_{1 \leq i \leq j \leq k \leq l \leq n} a_1^{2(j-i)}\, a_1^{2(l-i)}.$$

Furthermore, using (29) and the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$, for which $2(j-i) + 2(l-i) = 2(2k_1+k_2+k_3)$, we obtain

$$\|g_{4,n} \otimes_3 g_{4,n}\|^2_{L^2([0,1]^2)} \leq \frac{4!}{(1-a_1^2)^2 (1-a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \frac{1}{n} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(2k_1+k_2+k_3)} \leq 4! \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \frac{1}{(1-a_1^4)^2 (1-a_1^2)^4}\, \frac{1}{n},$$

which ends the proof.

Lemma 11 Suppose that Assumption (9) holds, and consider the kernels $g_{2,n}$ and $g_{4,n}$ defined in Lemma 9 and Lemma 10 respectively. Then we have

$$\langle g_{2,n} \otimes g_{2,n},\, g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} \leq \frac{C_5}{n}, \tag{31}$$

where $C_5 := 4 \cdot 4! \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \frac{1}{1-a_1^8} \left( \frac{1}{1-a_1^2} \right)^3$.

Proof. We have

$$\langle g_{2,n} \otimes g_{2,n},\, g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} = \frac{4}{n^2} \sum_{i,j,k,l=1}^{n} \big\langle (f_i \otimes_1 f_i) \otimes (f_j \otimes_1 f_j),\, (f_k \otimes f_k) \otimes_2 (f_l \otimes f_l) \big\rangle_{L^2([0,1]^4)}$$

$$= \frac{4}{n^2} \sum_{i,j,k,l=1}^{n} \int_{[0,1]^4} (f_i \otimes_1 f_k)(x_1,x_3)\, (f_i \otimes_1 f_k)(x_1,x_4)\, (f_j \otimes_1 f_l)(x_2,x_3)\, (f_j \otimes_1 f_l)(x_2,x_4)\, dx_1\, dx_2\, dx_3\, dx_4.$$

Consequently, using (30), we get

$$\langle g_{2,n} \otimes g_{2,n},\, g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} \leq \frac{4}{n^2} \sum_{i,j,k,l=1}^{n} \sum_{r=1}^{i \wedge j \wedge k \wedge l} a_1^{2(i+j+k+l-4r)} \sum_{j=1}^{\infty} \sigma_j^8 \leq 4!\, \frac{4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leq i \leq j \leq k \leq l \leq n} a_1^{2(j+k+l-3i)}\, \frac{1-a_1^{8i}}{1-a_1^8}$$

$$\leq 4!\, \frac{4}{n} \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \frac{1}{1-a_1^8} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(k_1+k_2+k_3)} \leq 4!\, \frac{4}{n} \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \frac{1}{1-a_1^8} \left( \frac{1}{1-a_1^2} \right)^3 = \frac{C_5}{n},$$

where we used the change of variables $k_1 = j - i$, $k_2 = k - i$, $k_3 = l - i$.

Remark 12 It turns out that, when applying Theorem 8 to estimate the speed of convergence in the CLT for $Q_n$, the term $\sqrt{\langle f \otimes f,\, g \otimes_2 g \rangle_{L^2([0,1]^4)}}$ cannot merely be bounded above via Schwarz's inequality. See Lemma 11 and its proof. This is the key element which allows us to obtain the Berry-Esséen speed $n^{-1/2}$ in the next theorem.

Theorem 13 With $Q_n$ defined in (13), under Assumption (9), we have for all $n \geq 1$,

$$d_{TV}\left( \frac{\sqrt{n}\,(Q_n - E[Q_n])}{\sqrt{l_1+l_2}},\, N(0,1) \right) \leq \frac{C_0}{\sqrt{n}}, \tag{32}$$

where

$$C_0 := \frac{4}{l_1+l_2} \left[ \frac{1}{2}(C_1+C_2) + \sqrt{2}\, C_3 + 36\, C_{4,3} + 9\, C_5^{1/2} + 36\sqrt{3}\, C_{4,2} + 6\sqrt{3}\, C_3^{1/2} C_{4,3}^{1/2} + 12\sqrt{10}\, C_{4,1} \right],$$

with $l_1, l_2$ defined in the previous section in (16) and (19), and $C_1$, $C_2$, $C_3$, $C_{4,r}$, $r = 1, 2, 3$, and $C_5$ given in the results above, respectively in (15), (18), (26), (27) and (31). In particular, $\sqrt{n}\,(Q_n - E[Q_n])$ is asymptotically Gaussian, namely

$$\sqrt{n}\, (Q_n - E[Q_n]) \xrightarrow[n\to\infty]{\text{law}} N(0,\, l_1+l_2).$$

Proof. Based on the decomposition of $Q_n - E[Q_n]$ given in (14), we have

$$\sqrt{n}\, (Q_n - E[Q_n]) = \sqrt{n}\, T_{2,n} + \sqrt{n}\, T_{4,n} = I_2^W(g_{2,n}) + I_4^W(g_{4,n}), \tag{33}$$

where

$$g_{2,n} := \frac{4}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes_1 f_i \qquad \text{and} \qquad g_{4,n} := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes f_i.$$

Applying Theorem 8 to $\sqrt{n}\, (Q_n - E[Q_n])$, we get

$$d_{TV}\left( \frac{\sqrt{n}\,(Q_n - E[Q_n])}{\sqrt{l_1+l_2}},\, N(0,1) \right) \leq \frac{4}{l_1+l_2} \Big[ \sqrt{2}\, \|g_{2,n} \otimes_1 g_{2,n}\|_{L^2([0,1]^2)} + 36\sqrt{2}\, \|g_{4,n} \otimes_3 g_{4,n}\|_{L^2([0,1]^2)} + 9\sqrt{2}\, \sqrt{\langle g_{2,n} \otimes g_{2,n},\, g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)}} + 18\sqrt{4!}\, \|g_{4,n} \otimes_2 g_{4,n}\|_{L^2([0,1]^4)} + 3\sqrt{4!}\, \sqrt{\|g_{2,n} \otimes_1 g_{2,n}\|_{L^2([0,1]^2)}\, \|g_{4,n} \otimes_3 g_{4,n}\|_{L^2([0,1]^2)}} + 2\sqrt{6!}\, \|g_{4,n} \otimes_1 g_{4,n}\|_{L^2([0,1]^6)} \Big] + 2 \left| 1 - \frac{E\big[ \big( \sqrt{n}\,(Q_n - E[Q_n]) \big)^2 \big]}{l_1+l_2} \right|. \tag{34}$$

The bound (32) is then a direct consequence of inequality (34) and the estimates given respectively in (26), (27), (31) and Theorem 6.

6 Application: estimation of the mean-reversion parameter

In this section, to illustrate the implications of Theorem 13 for parameter estimation in an easily tractable case, we consider that we have observations $Y_n$ coming from a specific version of our second-chaos AR(1) model (8), namely that which is driven by a chi-squared white noise with one degree of freedom:

$$Y_n = a_0 + a_1 Y_{n-1} + \sigma (Z_n^2 - 1), \qquad n \geq 1, \tag{35}$$

where $a_0, a_1$ and $\sigma$ are real constants, $\{Z_n, n \geq 1\}$ are i.i.d. standard normal random variables, and $Y_0 = y_0 \in \mathbb{R}$. This is model (8) where all $\sigma_j$'s are zero except for the first one.

Proposition 14 The quadratic variation $Q_n$ defined in (13) for model (35) satisfies, for all $n \geq 1$,

$$\left| E[Q_n] - \frac{2\sigma^2}{1-a_1^2} \right| \leq \frac{C_6}{n}, \qquad \text{where } C_6 := \frac{2\sigma^2}{(1-a_1^2)^2}.$$

Proof. From the definition of $Q_n$ in (13), we have by the isometry property of multiple integrals

$$E[Q_n] - \frac{2\sigma^2}{1-a_1^2} = \frac{2}{n} \sum_{i=1}^{n} \|f_i\|^2_{L^2([0,1]^2)} - \frac{2\sigma^2}{1-a_1^2} = \frac{2\sigma^2}{n} \sum_{i=1}^{n} \left( \sum_{k=1}^{i} a_1^{2(i-k)} \right) - \frac{2\sigma^2}{1-a_1^2} = \frac{2\sigma^2}{1-a_1^2}\, \frac{1}{n} \sum_{i=1}^{n} (1-a_1^{2i}) - \frac{2\sigma^2}{1-a_1^2} = -\frac{2\sigma^2}{1-a_1^2}\, \frac{1}{n} \sum_{i=1}^{n} a_1^{2i},$$

so that

$$\left| E[Q_n] - \frac{2\sigma^2}{1-a_1^2} \right| \leq \frac{2\sigma^2}{(1-a_1^2)^2}\, \frac{1}{n}.$$

Remark 15 Assuming that $\sigma$ is known, Proposition 14 shows that the quadratic variation $Q_n$ is an asymptotically unbiased estimator of $2\sigma^2/(1-a_1^2)$, and thus, after a transformation, of $|a_1|$ as well:

$$\sqrt{1 - \frac{2\sigma^2}{\lim_{n\to\infty} E[Q_n]}} = |a_1|.$$

Therefore, using the fact that $E[Q_n]$ can be estimated via $Q_n$, we suggest the following moment estimator for the mean-reversion rate $|a_1|$:

$$\hat{a}_n := f(Q_n), \tag{36}$$

where

$$f(x) := \sqrt{1 - \frac{2\sigma^2}{x}}, \qquad x > 2\sigma^2. \tag{37}$$

Remark 16 Note that we chose to estimate the absolute value $|a_1|$ instead of $a_1$ because the quadratic variation only gives access to $a_1^2$. For an estimator that accesses the sign of $a_1$, we would presumably need to use a variation or another method of moments with an odd power. This is feasible, but it is beyond the scope of the paper.
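The transformation in (36)-(37) is elementary to implement; the only wrinkle is that a finite-sample $Q_n$ may fall at or below $2\sigma^2$, where $f$ is undefined. The clip to zero in the sketch below is our own safeguard, not part of the paper's definition.

```python
import numpy as np

def a_hat(qn, sigma):
    """Moment estimator (36): f(Q_n) with f(x) = sqrt(1 - 2*sigma^2/x), x > 2*sigma^2.
    Clipping at 0 (our safeguard) keeps the estimator defined for small samples."""
    return float(np.sqrt(max(1.0 - 2.0 * sigma ** 2 / qn, 0.0)))
```

Feeding the exact limit $2\sigma^2/(1-a_1^2)$ back into $f$ recovers $|a_1|$; for instance, with $\sigma = 1$ and $a_1 = 0.5$ the limit is $8/3$ and `a_hat(8/3, 1.0)` returns $0.5$ up to rounding.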

6.1 Properties of the estimator $\hat{a}_n$

Proposition 17 The estimator $\hat{a}_n$ of the mean-reversion parameter $|a_1|$ defined in (36) is strongly consistent, namely, almost surely,

$$\lim_{n\to\infty} \hat{a}_n = |a_1|.$$

Proof. We write $Q_n = \frac{V_n}{\sqrt{n}} + E[Q_n]$, with $V_n = \sqrt{n}\,(Q_n - E[Q_n])$. According to Theorem 6, $E[V_n^2] \to l_1 + l_2$ as $n \to \infty$. Hence, there exists a constant $C > 0$ such that for every $n \geq 1$,

$$\left( E\left[ \left( \frac{V_n}{\sqrt{n}} \right)^2 \right] \right)^{1/2} \leq \frac{C}{\sqrt{n}}.$$

Hence, by Lemma 1, we have almost surely $\frac{V_n}{\sqrt{n}} \to 0$ as $n \to \infty$. On the other hand, by Proposition 14, $E[Q_n] \to \frac{2\sigma^2}{1-a_1^2}$ as $n \to \infty$. Thus $Q_n \to \frac{2\sigma^2}{1-a_1^2}$ almost surely as $n \to \infty$, as announced.

Proposition 18 Under Assumption (9), the estimator $\hat{a}_n$ defined in (36) satisfies

$$d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}}\, (\hat{a}_n - |a_1|),\, N(0,1) \right) \lesssim n^{-1/2},$$

where $l_1$ and $l_2$ are given in Propositions 3 and 4 respectively and $\mu = \frac{2\sigma^2}{1-a_1^2}$. In particular, $\hat{a}_n$ is asymptotically Gaussian; more precisely we have, as $n \to +\infty$,

$$\sqrt{n}\, (\hat{a}_n - |a_1|) \xrightarrow{\text{law}} N\big( 0,\, f'(\mu)^2 \times (l_1+l_2) \big), \qquad \text{where} \quad f'(\mu)^2 \times (l_1+l_2) = \frac{(1-a_1^2)(5-4a_1^2)}{2a_1^2}.$$

Proof. For $x > 2\sigma^2$, the function $f$ defined in (37) has an inverse $f^{-1}(y) = \frac{2\sigma^2}{1-y^2}$. Let $\mu := f^{-1}(|a_1|) = \frac{2\sigma^2}{1-a_1^2} > 2\sigma^2$, for $a_1 \in (-1,1)$. On the other hand, we can write

$$\frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - \mu) = \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - E[Q_n]) + \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (E[Q_n] - \mu).$$

Therefore, from the properties of the Wasserstein metric and denoting $N \sim N(0,1)$, we get

$$d_W\left( \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - \mu),\, N \right) \leq \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, \big| E[Q_n] - \mu \big| + d_W\left( \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - E[Q_n]),\, N \right) \leq \frac{C_6\, n^{-1/2} + C_0\, n^{-1/2}}{\sqrt{l_1+l_2}} \lesssim n^{-1/2}, \tag{38}$$

where we used the bound (32) and Proposition 14 for the above bounds. On the other hand, since the function $f$ defined in (37) is a diffeomorphism and since $\hat{a}_n = f(Q_n)$, by the mean-value theorem there exists a random variable $\xi_n \in [|Q_n, \mu|]$ such that

$$\hat{a}_n - |a_1| = f'(\xi_n)\, (Q_n - \mu),$$

where the notation $[|a, b|]$ for two reals $a, b$ means $[a, b]$ if $a \leq b$ and $[b, a]$ if $b \leq a$. We have

$$d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}}\, (\hat{a}_n - |a_1|),\, N(0,1) \right) \leq d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}}\, (\hat{a}_n - |a_1|),\, \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - \mu) \right) + d_W\left( \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - \mu),\, N(0,1) \right).$$

According to (38), the last term is bounded by the speed $n^{-1/2}$. Moreover, applying the mean-value theorem again, since $f$ is twice continuously differentiable, there exists a random variable $\delta_n \in [|\xi_n, \mu|] \subset [|Q_n, \mu|]$ such that

$$d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}}\, (\hat{a}_n - |a_1|),\, \frac{\sqrt{n}}{\sqrt{l_1+l_2}}\, (Q_n - \mu) \right) \lesssim \sqrt{n}\, E\big[ \big| (Q_n - \mu)\, (f'(\xi_n) - f'(\mu)) \big| \big] = \sqrt{n}\, E\big[ \big| (Q_n - \mu)\, f''(\delta_n)\, (\xi_n - \mu) \big| \big] \leq \sqrt{n}\, E\big[ (Q_n - \mu)^2\, |f''(\delta_n)| \big] \leq \sqrt{n}\, \big( E\big[ (Q_n - \mu)^{2p} \big] \big)^{1/p}\, \big( E\big[ |f''(\delta_n)|^{p'} \big] \big)^{1/p'}, \tag{39}$$

where we used Hölder's inequality with $p, p'$ two reals greater than 1 such that $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, by the hypercontractivity property for multiple integrals (5), there exists a constant $C(p)$ such that

$$\sqrt{n}\, \big( E\big[ (Q_n - \mu)^{2p} \big] \big)^{1/p} \leq C(p)\, \sqrt{n}\, E\big[ (Q_n - \mu)^2 \big] \leq 2C(p)\, \sqrt{n}\, E\big[ (Q_n - E[Q_n])^2 \big] + 2C(p)\, \sqrt{n}\, (E[Q_n] - \mu)^2 \leq C\, n^{-1/2} + C\, n^{-3/2} \lesssim n^{-1/2}, \tag{40}$$

where we used the inequality $(a+b)^2 \leq 2a^2 + 2b^2$ for all $a, b \in \mathbb{R}$, and the bounds of Theorem 6 and Proposition 14 respectively. On the other hand, for all $x > 2\sigma^2$,

$$f''(x) = -\frac{2\sigma^2}{x^3} \left( 1 - \frac{3\sigma^2}{2x} \right) \left( 1 - \frac{2\sigma^2}{x} \right)^{-3/2} < 0.$$

Therefore, using (39) and (40), to obtain a bound for the term $d_W\big( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}}(\hat{a}_n - |a_1|),\, \frac{\sqrt{n}}{\sqrt{l_1+l_2}}(Q_n - \mu) \big)$, it remains to show that $E[|f''(\delta_n)|^{p'}]$ is finite for some $p' > 1$. But using the fact that $\delta_n \in [|Q_n, \mu|]$ and the monotonicity of $|f''|$, it is actually sufficient to show that for some $p' > 1$, we have

$$\sup_{n \geq 1} E\big[ |f''(Q_n)|^{p'} \big] = \sup_{n \geq 1} E\big[ |2\sigma^2 Q_n - 3\sigma^4|^{p'}\, |Q_n|^{-5p'/2}\, |Q_n - 2\sigma^2|^{-3p'/2} \big] < \infty.$$

The function $|f''|$ has two singularities, at 0 and at $2\sigma^2$, and thus is not bounded. But we can write

$$E\big[ |f''(Q_n)|^{p'} \big] = E\big[ |f''(Q_n)|^{p'}\, \mathbf{1}_{\{|Q_n - \mu| \geq n^{-1/2}\}} \big] + E\big[ |f''(Q_n)|^{p'}\, \mathbf{1}_{\{|Q_n - \mu| < n^{-1/2}\}} \big].$$

For the term $E\big[ |f''(Q_n)|^{p'}\, \mathbf{1}_{\{|Q_n - \mu| < n^{-1/2}\}} \big]$, since $\mu > 2\sigma^2$, we can pick $n$ such that $n^{-1/2} < \sigma^2$. Then $Q_n > 2\sigma^2 - \sigma^2 > 0$, so $Q_n$ is bounded away from 0 and the term $|Q_n|^{-5p'/2}$ has no singularity, for any $p' > 1$. For the term $|Q_n - 2\sigma^2|^{-3p'/2}$, we put $C := \frac{\mu - 2\sigma^2}{2}$; the constant $C \neq 0$ because $\mu \neq 2\sigma^2 \Leftrightarrow a_1 \neq 0$, and we can assume $a_1 \neq 0$ because there is no AR(1) process with $a_1 = 0$. Therefore, we can pick $n$ such that $n^{-1/2} < C$. In this case $Q_n - 2\sigma^2 = Q_n - \mu + 2C > -n^{-1/2} + 2C > 2C - C = C > 0$, hence the term $|Q_n - 2\sigma^2|^{-3p'/2}$ has no singularity at $2\sigma^2$ for any $p' > 1$. In conclusion, to avoid the singularities at both 0 and $2\sigma^2$, it is sufficient to pick $n$ such that

$$\frac{1}{\sqrt{n}} < \sigma^2 \wedge C = \begin{cases} \sigma^2 & \text{if } |a_1| \geq \frac{1}{\sqrt{2}}, \\ C & \text{if } |a_1| \leq \frac{1}{\sqrt{2}}. \end{cases}$$

For the other term, by the asymptotic normality of $\sqrt{n}\,(Q_n - \mu)$ and using a similar argument as in Section 5.2.2 of [9], there exists $p' > 1$ such that $E\big[ |f''(Q_n)|^{p'}\, \mathbf{1}_{\{|Q_n - \mu| \geq n^{-1/2}\}} \big] < +\infty$, which gives the desired result.

6.2 Numerical Results

The table below reports the mean and standard deviation of the proposed estimator (36) of the true value of the mean-reversion parameter

|a1 | = 0.10

Mean

n = 3000

0.2178

0.0901

n = 5000

0.1878

n = 10000

0.1630

|a1 | = 0.50

dened in

|a1 | = 0.70

Std dev

Mean

Std dev

Mean

Std dev

0.2887

0.0969

0.4946

0.0520

0.6962

0.0234

0.0866

0.2905

0.0616

0.4966

0.0413

0.6978

0.0270

0.0692

0.2928

0.0852

0.4974

0.0315

0.6987

0.0215

rna lP

Std dev

a ˆn

|a1 |.

|a1 | = 0.30

Mean

which gives

a ˆn from the quadratic variation Qn for dierent 2 and for xed σ chosen to be equal to 1. For each sample size n, the mean and the

We simulate the values of the estimator sample sizes

n

standard deviation are obtained by 500 replications. The table above conrms that the estimator is strongly consistent even for small values of values of

|a1 |.

Moreover, the estimator

a ˆn

n

is more ecient for values of

|a1 |

greater than

|ˆ an | is Therefore, a ˆn is

could be explained by the fact that the asymptotic variance of the limiting law of is high for small values of

|a1 |

and small for values of

Jou

more accurate as an estimator when gure below.

|a1 |

a ˆn

and has small standard deviations for dierent true

|a1 |

close to 1.

0.5,

this

(1−a21 )(5−4a21 ) 2a21 presumably

is closer to 1, e.g. greater than 0.5 as can be seen in the

33
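To make the experiment concrete, the following sketch simulates a stationary AR(1) sequence driven by a simple second-chaos innovation and recovers $|a_1|$ from the quadratic variation $Q_n = \frac{1}{n}\sum_k X_k^2$. The specific innovation $\varepsilon_k = \sigma(Z_k^2 - 1)$ (a centered chi-squared variable, a canonical second-chaos element) and the moment inversion $\hat{a}_n = \sqrt{1 - 2\sigma^2/Q_n}$ are our assumptions, chosen to be consistent with the identity $\mu = \mathbb{E}[Q_n] = 2\sigma^2/(1-a_1^2)$ implicit in the proof above; the paper's estimator (36) may differ in detail.

```python
import numpy as np

def simulate_ar1_chi2(n, a1, sigma=1.0, burn=500, rng=None):
    """Simulate X_k = a1*X_{k-1} + sigma*(Z_k^2 - 1), Z_k iid N(0,1).

    The innovation is centered chi-squared (second chaos), with variance
    2*sigma^2.  A burn-in is discarded so the returned path is close to
    stationary."""
    if rng is None:
        rng = np.random.default_rng()
    z = rng.standard_normal(n + burn)
    eps = sigma * (z**2 - 1.0)
    x = np.empty(n + burn)
    x[0] = eps[0]
    for k in range(1, n + burn):
        x[k] = a1 * x[k - 1] + eps[k]
    return x[burn:]

def estimate_a1(x, sigma=1.0):
    """Moment estimator of |a1| from Q_n = mean(X_k^2), using the assumed
    identity E[Q_n] = 2*sigma^2/(1 - a1^2)."""
    qn = np.mean(x**2)
    return np.sqrt(max(0.0, 1.0 - 2.0 * sigma**2 / qn))

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    est = [estimate_a1(simulate_ar1_chi2(5000, 0.5, rng=rng)) for _ in range(100)]
    print(np.mean(est), np.std(est))  # should be near 0.5, with a small spread
```

With this model the Monte Carlo means and standard deviations behave qualitatively like those in the table, improving as $n$ grows.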

Figure 2: Asymptotic variance for values of $|a_1|$ between 0.09 and 0.99.

To investigate the asymptotic distribution of $\hat{a}_n$ empirically, we need to compare the distribution of the following statistic
\[
\phi(n, a_1) := \frac{\sqrt{2a_1^2}}{\sqrt{(1 - a_1^2)(5 - 4a_1^2)}}\,\sqrt{n}\,(\hat{a}_n - |a_1|) \qquad (41)
\]
with the standard normal distribution $N(0,1)$. To this end, for the parameter choices $n = 3000$, $\sigma = 1$, $|a_1| = 0.5$, and based on 3000 replications, we obtained the following histogram:

Figure 3: Histogram of $\phi(n, a_1)$ with $n = 3000$, $|a_1| = 0.5$, $\sigma = 1$, 3000 replications.

Figure 3 shows that the normal approximation of the distribution of the statistic $\phi(n, a_1)$ is reasonable even if the sampling size $n$ is not very large. The table below compares statistics of $\phi(n, a_1)$ and $N(0,1)$ based on 3000 replications, with $n = 3000$ and $\sigma = 1$. The empirical mean, median and standard deviation of $\phi(n, a_1)$ match those of $N(0,1)$ very closely, corroborating our theoretical results.

Statistics      Mean        Median      Standard Deviation
N(0,1)          0           0           1
phi(n, a_1)     0.0048515   0.0020649   1.0004691
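Given a replication of $\hat{a}_n$, the standardization in (41) is plain arithmetic; a minimal helper (the function name `phi_stat` is ours):

```python
import math

def phi_stat(a_hat, n, a1):
    """Statistic phi(n, a1) from (41): rescales sqrt(n)*(a_hat - |a1|) by the
    reciprocal square root of the asymptotic variance
    (1 - a1^2)(5 - 4a1^2)/(2 a1^2)."""
    scale = math.sqrt(2.0 * a1**2 / ((1.0 - a1**2) * (5.0 - 4.0 * a1**2)))
    return scale * math.sqrt(n) * (a_hat - abs(a1))

# Example: a replication a_hat = 0.52 with n = 3000, |a1| = 0.5
print(phi_stat(0.52, 3000, 0.5))  # -> 0.4472...
```

Applying this to each of the 3000 replications yields the empirical distribution summarized in the table above.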

φ(n, a1 ) converges φ(n, a1 ) and N (0, 1).

We can check more precisely how fast is the statistic We chose to compute the Kolmogorov distance between

in law to

N (0,1).

For this aim, we

Jou

approximate the cumulative distribution function using empirical cumulation distribution function based on 500 replications of the computation of

φ(n, a1 )

for

n = 3000.

empirical and standard normal cumulative distribution functions.

35

The next gure shows the

repro of

Journal Pre-proof

The Kolmogorov distance between the two laws, which equals the sup norm of the difference of these cumulative distribution functions, computes to approximately 0.052. On the other hand, since (see for example Theorem 3.3 of [4] for a proof)
\[
d_{Kol}\big(\phi(n, a_1), N(0,1)\big) \leq 2\sqrt{d_W\big(\phi(n, a_1), N(0,1)\big)}, \qquad (42)
\]
the distance on the left-hand side should be bounded above by approximately $2 \times 3000^{-1/4} = 0.27$ times any constant coming from the upper bound in Proposition 18. This is five times larger than our estimate 0.052 of the actual Kolmogorov distance, a reassuring practical confirmation of Proposition 18 and of our underlying results on normal asymptotics of 2nd-chaos AR(1) quadratic variations. If that proposition's upper bound with its rate $n^{-1/2}$ applied directly to the Kolmogorov distance, as is known to be the case for the Berry-Esséen theorem in the classical CLT, the value 0.052 should be compared to $3000^{-1/2} \approx 0.018$, which is arguably of the same order of magnitude. This is a motivation to investigate whether the so-called delta method, which we used here to prove Proposition 18 under the Wasserstein distance, could also apply to the total variation distance, since the latter is known to be an upper bound on the Kolmogorov distance without the need for the square root as in the comparison (42).
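The empirical Kolmogorov distance reported above can be computed directly from the replications: sort the sample, and compare the empirical CDF against the $N(0,1)$ CDF just before and at each jump, where the supremum is attained. A minimal sketch, applied here (for a self-contained sanity check) to a seeded standard normal sample of the same size 500 rather than to the $\phi(n,a_1)$ replications:

```python
import numpy as np
from math import erf, sqrt

def std_normal_cdf(x):
    # CDF of N(0,1) expressed via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def kolmogorov_distance(sample):
    """Sup-distance between the empirical CDF of `sample` and the N(0,1) CDF.

    The empirical CDF is a step function, so the supremum is attained just
    before or at one of the order statistics; it suffices to check i/n and
    (i-1)/n against the normal CDF at each sorted sample point."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    cdf = np.array([std_normal_cdf(v) for v in x])
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)
    d_minus = np.max(cdf - np.arange(0, n) / n)
    return float(max(d_plus, d_minus))

rng = np.random.default_rng(0)
print(kolmogorov_distance(rng.standard_normal(500)))  # small, of order 1/sqrt(500)
```

Feeding the 500 values of $\phi(n, a_1)$ to `kolmogorov_distance` in place of the normal sample reproduces the computation behind the estimate 0.052.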

References

[1] Balakrishna, N. and Shiji, K. (2014). Extreme Value Autoregressive Model and Its Applications. Journal of Statistical Theory and Practice, 8 (3), 460-481.

[2] Barboza, L.A. and Viens, F. (2017). Parameter estimation of Gaussian stationary processes using the generalized method of moments. Electron. J. Statist., 11 (1), 401-439.

[3] Beneš, V.E., Shepp, L.A., and Witsenhausen, H.S. (1980). Some solvable stochastic control problems. Stochastics: An International Journal of Probability and Stochastic Processes, 4, 39-83.

[4] Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2011). Normal Approximation by Stein's Method. Springer-Verlag, Berlin.

[5] Cumberland, W.G. and Sykes, Z.M. (1982). Weak Convergence of an Autoregressive Process Used in Modeling Population Growth. Journal of Applied Probability, 19 (2), 450-455.

[6] Douissi, S., Es-Sebaiy, K., Viens, F. (2019). Berry-Esséen bounds for parameter estimation of general Gaussian processes. ALEA, Lat. Am. J. Probab. Math. Stat., 16, 633-664.

[7] El Onsy, B., Es-Sebaiy, K. and Viens, F. (2017). Parameter estimation for Ornstein-Uhlenbeck driven by fractional Ornstein-Uhlenbeck processes. Stochastics, 89 (2), 431-468.

[8] Ernst, Ph., Brown, L., Shepp, L., Wolpert, R. (2017). Stationary Gaussian Markov processes as limits of stationary autoregressive time series. Journal of Multivariate Analysis, 155, 180-186.

[9] Es-Sebaiy, K. and Viens, F. (2019). Optimal rates for parameter estimation of stationary Gaussian processes. Stochastic Processes and their Applications, 129 (9), 3018-3054.

[10] Grunwald, G.K., Hyndman, R.J., and Tedesco, L.M. (1996). A unified view of linear AR(1) models. Preprint.

[11] Hürlimann, W. (2012). On Non-Gaussian AR(1) Inflation Modeling. Journal of Statistical and Econometric Methods, 1 (1), 93-109.

[12] Hu, Y. and Nualart, D. (2010). Parameter estimation for fractional Ornstein-Uhlenbeck processes. Statist. Probab. Lett., 80, 1030-1038.

[13] Kleptsyna, M. and Le Breton, A. (2002). Statistical analysis of the fractional Ornstein-Uhlenbeck type process. Statist. Infer. Stoch. Proc., 5, 229-241.

[14] Kloeden, P. and Neuenkirch, A. (2007). The pathwise convergence of approximation schemes for stochastic differential equations. LMS J. Comp. Math., 10, 235-253.

[15] Liptser, R.S. and Shiryaev, A.N. (2001). Statistics of Random Processes: II. Applications. Springer-Verlag, New York.

[16] Maddison, C.J., Tarlow, D., Minka, T. (2015). A* Sampling. Preprint, available online at https://arxiv.org/abs/1411.0030v2.

[17] Nakajima, J., Kunihama, T., Omori, Y., and Frühwirth-Schnatter, S. (2012). Generalized extreme value distribution with time-dependence using the AR and MA models in state space form. Computational Statistics & Data Analysis, 56 (11), 3241-3259.

[18] Neufcourt, L. and Viens, F. (2016). A third-moment theorem and precise asymptotics for variations of stationary Gaussian sequences. ALEA, Lat. Am. J. Probab. Math. Stat., 13 (1), 239-264.

[19] Nourdin, I. and Peccati, G. (2015). The optimal fourth moment theorem. Proc. Amer. Math. Soc., 143, 3123-3133.

[20] Nourdin, I. and Peccati, G. (2012). Normal Approximations with Malliavin Calculus: from Stein's Method to Universality. Cambridge Tracts in Mathematics 192. Cambridge University Press, Cambridge.

[21] Nourdin, I., Peccati, G., and Yang, X. (2018). Berry-Esseen bounds in the Breuer-Major CLT and Gebelein's inequality. Preprint, https://arxiv.org/abs/1812.08838.

[22] Novikov, A.A. (1971). Sequential estimation of the parameters of diffusion type processes. Teor. Veroyatn. Primen., 16 (2), 394-396.

[23] Nualart, D. (2006). The Malliavin Calculus and Related Topics. Springer-Verlag, Berlin.

[24] Nualart, D. and Zhou, H. (2018). Total variation estimates in the Breuer-Major theorem. Preprint, arXiv:1807.09707.

[25] Schoenberg, I.J. (1951). On Pólya frequency functions. Journal d'Analyse Mathématique, 1, 331-374.

[26] Shepp, L.A. (1969). Explicit solutions to some problems of optimal stopping. The Annals of Mathematical Statistics, 40, 993-1010.

[27] Shepp, L.A. and Shiryaev, A.N. (1993). The Russian option: reduced regret. The Annals of Applied Probability, 3, 631-640.

[28] Shepp, L.A. and Vardi, Y. (1982). Maximum likelihood reconstruction for emission tomography. IEEE Transactions on Medical Imaging, 1, 113-122.

[29] Tanaka, K. (2017). Time Series Analysis: Nonstationary and Noninvertible Distribution Theory, volume 16. Wiley, New York.

[30] Toulemonde, G., Guillou, A., Naveau, P., Vrac, M. and Chevallier, F. (2010). Autoregressive models for maxima and their applications to CH4 and N2O. Environmetrics, 21 (2), 189-207.

[31] Tudor, C.A. (2008). Analysis of the Rosenblatt process. ESAIM: Probability and Statistics, 12, 230-257.

[32] Tudor, C. and Viens, F. (2007). Statistical aspects of the fractional stochastic calculus. Ann. Statist., 35 (3), 1183-1212.

[33] Üstünel, A.S. (1995). An Introduction to Analysis on Wiener Space. Lecture Notes in Math. 1610. Springer, Berlin.

[34] Viitasaari, L. (2016). Representation of stationary and stationary increment processes via Langevin equation and self-similar processes. Statistics and Probability Letters, 115, 45-53.

[35] Voutilainen, M., Viitasaari, L., and Ilmonen, P. (2017). On model fitting and estimation of strictly stationary processes. Modern Stochastics: Theory and Applications, 4 (4), 381-406.

[36] Voutilainen, M., Viitasaari, L., and Ilmonen, P. (2019). Note on AR(1) characterization of stationary processes and model fitting. Modern Stochastics: Theory and Applications, 6 (2), 195-207.

[37] Yan, Y. and Genton, M.G. (2019). Non-Gaussian autoregressive processes with Tukey g-and-h transformations. Environmetrics, 30 (2), e2503.