AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation
Soukaina Douissi, Khalifa Es-Sebaiy, Fatimah Alshahrani, Frederi G. Viens

Stochastic Processes and their Applications (2020). DOI: https://doi.org/10.1016/j.spa.2020.02.007. PII: S0304-4149(19)30390-4. Reference: SPA 3637.
Received 15 June 2019; revised 1 February 2020; accepted 17 February 2020. © 2020 Published by Elsevier B.V.
AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation
Soukaina Douissi^1, Khalifa Es-Sebaiy^2, Fatimah Alshahrani^3, Frederi G. Viens^4

1 Laboratory LIBMA, Faculty Semlalia, Cadi Ayyad University, 40000 Marrakech, Morocco. Email: [email protected]
2 Department of Mathematics, Faculty of Science, Kuwait University, Kuwait. Email: [email protected]
3 Department of Mathematical Science, Princess Nourah bint Abdulrahman University, Riyadh.
3,4 Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824. Email: [email protected], [email protected]
December 1, 2019

Abstract:
In this paper, we study the asymptotic behavior of the quadratic variation for the class of AR(1) processes driven by white noise in the second Wiener chaos. Using tools from the analysis on Wiener space, we give an upper bound for the total-variation speed of convergence to the normal law, which we apply to study the estimation of the model's mean-reversion parameter. Simulations are performed to illustrate the theoretical results.
Key words: Central limit theorem; Berry-Esséen; Malliavin calculus; parameter estimation; time series; Wiener chaos

2010 Mathematics Subject Classification: 60F05; 60H07; 62F12; 62M10

1 Introduction
The topic of statistical inference for stochastic processes has a long history, addressing a number of issues, though many difficult questions remain. At the same time, a number of application fields are anxious to see some practical progress in a selection of directions. Methodologies are sought which are not just statistically sound, but stand a good chance of being computationally implementable, if not nimble, to help practitioners make data-based decisions in stochastic problems with complex time evolutions. In this paper, which is motivated by parameter estimation within the above context, we propose a quantitatively sharp analysis in this direction, and we honor the scientific legacy of Prof. Larry Shepp.

[Footnote] The first author is supported by the Fulbright joint supervision program for PhD students for the academic year 2018-2019 between Cadi Ayyad University and Michigan State University. The fourth author is partially supported by NSF awards DMS 1734183 and 1811779, and ONR award N00014-18-1-2192.
Prof. Shepp is widely known for his seminal work on stochastic control, optimal stopping, and applications in areas such as investment finance. Often labeled an applied probabilist by those working in that area, he had the merit, among many other qualities, of showing by example that research activity in this area could benefit from an appreciation for the mathematical aesthetics of constructing stochastic objects for their own sake. His papers also showed that one's work is only as applied as one's ability to calibrate a stochastic model to a realistic scenario. As obvious as this view may seem, it is nonetheless in short supply in some current circles, where model sophistication seems to replace all other imperatives. Instead, we believe some of the principles guiding applied probability research should include (i) statistical parsimony and robustness, (ii) feature discovery, and above all, (iii) real-world impact, where mathematicians propose a real solution to a real problem. We think that Prof. Shepp would not have been shy about agreeing that his seminal and highly original works on optimal stopping and stochastic control [26], [3], including the invention [27] of the widely used Russian financial option, illustrate items (ii) and (iii) in this philosophy perfectly. This leaves the question of how to estimate the model parameters needed to implement applied solutions. Prof. Shepp proved on many occasions that this concern was also high on his list of objectives in applied work; he proposed methods aligning with our stated principle (i) above. The best example is the work for which Shepp is most widely known outside of our stochastic circles: the mathematical foundation of the Computed Tomography (CT) scanner, and in particular, the basis [28] for its data analysis. Prof. Shepp is less well known for his direct interest in statistically motivated stochastic modeling; the posthumous paper [8] is an instance of this, on asymptotics of auto-regressive processes with normal noise (innovations).

Our paper honors this legacy by providing a detailed and mathematically rigorous stochastic analysis of some building blocks needed in the data analysis of a simple class of stochastic processes. Our paper's originality is in working out detailed quantitative properties for auto-regressive processes with innovations in the second Wiener chaos. Our framework is parsimonious in the sense of being determined by a small number of parameters, while covering features of stationarity, mean-reversion, and heavier-than-normal tail weight. We focus on establishing rates of convergence in the central limit theorem for quadratic variations of these processes, which we are then able to transfer to similar rates for the model's moments-based parameter estimation. This precision would allow practitioners to determine the validity and uncertainty quantification of our estimates in the realistic setting of moderate sample size. Careless use of a method of moments would ignore the potential for abusive conclusions in this heavy-tailed time-series setting.
The remainder of this introduction begins with an overview of the landscape of parameter estimation for stochastic processes related to ours. The few included references call for the reader to find additional references therein, for the sake of conciseness. We then introduce the specific model class used in this paper. It represents a continuation of the current literature's motivation to calibrate stochastic models with features such as stochastic memory and path roughness. It constitutes a departure from the same literature's focus on the framework of Gaussian noise.
1.1 Parameter estimation for stochastic processes: historical and recent context

Some of the early impetus in parameter estimation for stochastic processes was inspired by classical ideas from frequentist statistics, such as the theoretical and practical superiority of maximum likelihood estimation (MLE) over other, less constrained methodologies in many contexts. We will not delve into the description of many such instances, citing only the seminal account [15], first published in Russian in 1974 (see references therein in Chapter 17, such as [22]). This was picked up two decades later in the context of processes driven by fractional Brownian motion, where it was shown that the martingale property used in earlier treatments was not a necessary ingredient to establish the properties of such MLEs: see in particular the treatment of processes with fractional noise in [13] and in [32]. It was also noticed that least-squares ideas, which led to MLEs in cases of white-noise driven processes, did not share this property in the case of processes driven by fractional noise: this was pointed out in the continuous-time based paper [12]. See also a more detailed account of this direction of work in [7] and references therein, including a discussion of the distinction between estimators based on continuous paths and those using discrete sampling under in-fill and increasing-horizon asymptotics. These were applied particularly to various versions of the Ornstein-Uhlenbeck process, as examples of processes with stationary increments and an ability to choose other features such as path regularity and short or long memory.

The impracticality of computing MLEs for parameters of stochastic processes in these feature-rich contexts led the community to consider other methodologies, looking more closely at least squares and beyond. A popular approach is to work with incarnations of the method of moments. A full study in the case of general stationary Gaussian sequences, with application to in-fill asymptotics for the fractional Ornstein-Uhlenbeck process, is in [2]. This paper relates the relatively long history of those works where estimation of a memory or Hölder-regularity parameter uses moments-based objects, particularly quadratic variations. It also shows that the generalized method of moments can, in principle, provide a number of options to access vectors of parameters for discretely observed Gaussian processes in a practical way. This was also illustrated recently in [9], where the Malliavin calculus and its connection to Stein's method was used to establish speeds of convergence in the central limit theorems for quadratic-variations-based estimators for discretely observed processes. The Stein-Malliavin technical methodology employed in [9] is that which was introduced by Nourdin and Peccati in 2009, as described in their 2012 research monograph [20]. Other estimation methods have been proposed for general stationary time series, which we mention here, though they fall outside the scope of our paper, and they do not lead to the same precision as those based on the Stein-Malliavin method: see e.g. [35] and [36] for the Yule-Walker method and extensions. While the paper [34] establishes that essentially every continuous-time stationary process can be represented as the solution of a Langevin (Ornstein-Uhlenbeck-type) equation with an appropriate noise distribution, the two aforementioned follow-up papers [35, 36], which present an analog in discrete time, do not, however, connect the discrete and continuous frameworks via any asymptotic theory.

Following an initial push in [32], most of the recent papers mentioned above, and recent references therein, state an explicit effort to work with discretely observed processes. At least in the increasing-horizon case, the papers [9] and [6] had the merit of pointing out that many of the discretization techniques used to pass from continuous-path to discrete-observation based estimators were inefficient, and that it is preferable to work directly from the statistics of the discretely observed process. Our paper picks up this thread, and introduces a new direction of research which, to our knowledge, has not been approached by any authors: can the asymptotic normality of quadratic variations and related estimators, including very precise results on speeds of convergence, be obtained when the driving noise is not Gaussian?
The main underlying theoretical result we draw on is the optimal estimation of total-variation distances between chaos variables and the normal law, established in [19]. It was used for quadratic variations of stationary sequences in the Gaussian case in [18]. But when the Gaussian setting is abandoned, the result in [19] cannot be used directly. Instead, our paper makes a theoretical advance in the analysis on Wiener space, by drawing on a simple idea in the recent preprints [24] and [21]; our main result provides an example of a sum of chaos variables whose distance to the normal appears to be estimated optimally, whereas a standard use of the Cauchy-Schwarz inequality would result in a much weaker result. The precise location of the technique leading to this improvement is pointed out in the main body of our paper: see Theorem 8, particularly inequality (24) in its proof and the brief discussion following it, and Remark 12. This allows us to prove a Berry-Esséen-type speed of n^{-1/2}, rather than what would otherwise have resulted in a speed of n^{-1/4}.
1.2 A stationary process with second-chaos noise, and related literature

Given our intent to address the new issue of noise distribution, and knowing that Berry-Esséen-type questions for models with mere Gaussian noise already present technical challenges, we choose to minimize the number of technical issues to address in this paper by focusing on the simplest possible stationary model class which does not restrict the marginal noise distribution within a family which is tractable using Wiener chaos analysis and tools from the Malliavin calculus. This is the auto-regressive model of order 1 (a.k.a. AR(1)) with independent noise terms, where the noise distribution is in the second Wiener chaos, i.e.

    Y_n = a_0 + a_1 Y_{n-1} + ε_n,                                              (1)

where {ε_n ; n ∈ Z} is an i.i.d. sequence in the second Wiener chaos, and a_0 and a_1 are constants. The complete description and construction of this process and of the noise sequence is given in Section 3; see (8).
As explained in Section 2, the second Wiener chaos is a linear space, and since the model (1) is linear, its solution, if any, lies in the same chaos. This points to a simple theoretical motivation and justification for studying the increasing-horizon problem as opposed to the in-fill problem. We also include a practical motivation for doing so, further below in this section, coming from an environmental statistics question.

For the former motivation, note that the AR(1) specification (1), with essentially any square-integrable i.i.d. noise distribution, is known to converge weakly, after appropriate aggregation and scaling, to the so-called Ornstein-Uhlenbeck process (also known occasionally as the Vasicek process), which solves the stochastic differential equation

    dX_t = α(m − X_t) dt + σ dW(t),                                             (2)
where W is a standard Wiener process (Gaussian Brownian motion), and the parameters α, m, and σ are explicitly related to a_0, a_1 and Var[ε_n]. See for instance [29, Chapter 2], which covers the case of all square-integrable innovations; this paper assumes a piecewise linear interpolation in the normalization, which could be eliminated by switching to convergence in the Skorohod J1 topology. A reference avoiding linear interpolation, with convergence in the Skorohod J1 topology, is [5], where innovations are assumed to have four moments. In any case, this central limit theorem constrains the modeling of stationary/ergodic processes via diffusive differential formulations: under in-fill asymptotics with weakly dependent noise, the AR(1) specification cannot preserve any non-normal noise distribution in the limit. It is of course possible to interpolate the above process Y in a number of ways, to result in a continuous-time process whose discrete-time marginals are those specified via (1).
However, we believe it is difficult or impossible to give a linear diffusion-type stochastic differential equation, akin to (2), whose fixed-time-step marginals are as in (1) for an arbitrary noise distribution, while simultaneously describing what second-chaos process differential would need to replace W in (2). The so-called Rosenblatt process (see [31]), the only known second-chaos continuous-time process with a stochastic calculus similar to W's, gives an example of a viable alternative to (2) living in the second chaos. But this process is known to have only a long-memory version. Thus it cannot be a proxy for any continuous-time analogue of (1), since the noise there has no memory. Similar issues would presumably exist for other non-Gaussian AR(1) and related auto-regressive processes. A few have been studied recently in the literature. We mention [11, 37], which cover various noise structures similar to second-chaos noises, and here again, no asymptotic or interpolation theory is provided to relate to continuous time. There does exist a general treatment in [8] of asymptotics for all AR(p) processes: the limit processes are the so-called Continuous-AR(p) processes, which are Gaussian, and have (p − 1)-differentiable paths (a form of very long memory for p > 1); that paper assumes normal innovations to keep technicalities to a minimum.

Another indication that finding such a proxy may fail comes in the specific case of the so-called Gumbel distribution for ε_n. This law is a popular distribution for extreme-value modeling. The fact that this law is in the second chaos is a classical result (as a weighted sum of exponentials, see [25]), though it does not appear to be widely known in the extreme-value community. The standard (mean-zero) Gumbel law can be represented as

    Σ_{j=1}^∞ j^{-1} (E_j − 1),

where E_j = (N_j² + N̄_j²)/2 is a standard exponential variable (chi-squared with two degrees of freedom, scaled by 1/2; N_j and N̄_j are i.i.d. standard normals). The Gumbel law is known to give rise to a second-chaos version of an isonormal Gaussian process, known as the Gumbel noise measure (or Gumbel process); that stochastic measure obeys the same laws as the white-noise measure (including independence of increments, which fails for the Rosenblatt noise), if one replaces the standard algebra of the reals by the max-plus algebra. This is explained in detail in the preprint [16]; also see references therein. By virtue of this change of algebra, stochastic differential specifications as in (2) cannot be defined using the Gumbel noise.
However, the discrete version of the Gumbel noise, an i.i.d. sequence (ε_n)_n with Gumbel marginals, is a good example of a noise type which can be used in the AR(1) process (1). This specific model, known as the AR(1) process with Gumbel noise (or innovations), presents a main motivation for our work. Recent references on this process, and on the closely related process where the marginals of Y are Gumbel-distributed, include [17] for a Bayesian study, [30] for applications to maxima in greenhouse gas concentration data, and [1] for AR processes in the broader extreme-value context. A survey on AR(1) models with different types of innovations and marginals, while not including the Gumbel, is in the unpublished manuscript [10]. The use of the Gumbel distribution for describing environmental time series, mainly when looking at extremes, is fairly widespread, but we do not cite this literature because it does not appear willing to acknowledge that time-series models driven by Gumbel innovations should be used, rather than tools for i.i.d. Gumbel data. This literature, which is easy to find, is also entirely unaware that the Gumbel distribution is in the second Wiener chaos.
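To make the series representation of the mean-zero Gumbel law concrete, here is a small simulation sketch we add for illustration (Python/NumPy; the truncation level J and the sample size are our arbitrary choices, not from the paper). It truncates Σ_{j≥1} j^{-1}(E_j − 1) and checks that the sample mean is near 0 and the sample variance is near Σ_j j^{-2} = π²/6, the variance of the Gumbel law.

```python
import numpy as np

def mean_zero_gumbel(n_samples, J=400, seed=None):
    """Approximate draws from the mean-zero Gumbel law via the truncated
    second-chaos series sum_{j<=J} (E_j - 1)/j, where the E_j are i.i.d.
    standard exponentials (in law, E_j = (N_j^2 + Nbar_j^2)/2 for i.i.d.
    standard normals, which is how this law sits in the second chaos)."""
    rng = np.random.default_rng(seed)
    E = rng.exponential(size=(n_samples, J))   # E_j ~ Exp(1)
    weights = 1.0 / np.arange(1.0, J + 1.0)    # 1/j
    return (E - 1.0) @ weights

samples = mean_zero_gumbel(20_000, J=400, seed=42)
print(samples.mean(), samples.var())  # approximately 0 and pi^2/6 ~ 1.645
```

The truncation error on the variance is Σ_{j>J} j^{-2} ≈ 1/J, so a few hundred terms already give a close approximation.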
All these reasons give us ample cause to investigate the basic method-of-moments building blocks for determining parameters in stationary time series with second-chaos innovations. For the sake of concentrating on the core mathematical analysis towards this end, we focus on the asymptotics of quadratic variations for models in the class (1). The methodology developed in [9] can then be adapted to handle any method-of-moments-based estimators, at the cost of some additional effort. We provide examples of this in the latter sections of this paper. Our main result is that, for any second-chaos innovations in (1), the quadratic variation of Y has explicit normal asymptotics, with a speed of convergence in total variation which matches the classical Berry-Esséen speed of n^{-1/2}.
The remainder of this paper is structured as follows. Section 2 provides elements from the analysis on Wiener space which will be used in the paper. Section 3 presents the details of the class of AR(1) models we will analyze. Section 4 computes the asymptotic variance of the AR(1)'s quadratic variation by looking separately at its second-chaos and fourth-chaos components, whose asymptotics are of the same order. Section 5 establishes our main result, the Berry-Esséen speed of convergence in total variation for the normal fluctuations of the AR(1)'s quadratic variation. Finally, Section 6 defines a method-of-moments estimator for the mean-reversion rate of this AR(1) process, and establishes its asymptotic properties; a numerical study is included to gauge the distance between this renormalized estimator and the normal law.
2 Preliminaries

In this first section, we recall some elements from stochastic analysis that we will need in the paper. See [20], [23], and [33] for details. Any real, separable Hilbert space H gives rise to an isonormal Gaussian process: a centered Gaussian family (G(ϕ), ϕ ∈ H) of random variables on a probability space (Ω, F, P) such that E(G(ϕ)G(ψ)) = ⟨ϕ, ψ⟩_H. In this paper, it is enough to use the classical Wiener space, where H = L²([0,1]), though any H will also work. In the case H = L²([0,1]), G can be identified with the stochastic differential of a Wiener process W, and one interprets G(ϕ) := ∫_0^1 ϕ(s) dW(s).

The Wiener chaos of order n is defined as the closure in L²(Ω) of the linear span of the random variables H_n(G(ϕ)), where ϕ ∈ H, ‖ϕ‖_H = 1, and H_n is the Hermite polynomial of degree n. The intuitive Riemann-sum-based notion of the multiple Wiener stochastic integral I_n with respect to G ≡ W, in the sense of limits in L²(Ω), turns out to be an isometry between the Hilbert space H^{⊙n} (symmetric tensor product) equipped with the scaled norm √(n!) ‖·‖_{H^{⊗n}}, and the Wiener chaos H_n of order n under L²(Ω)'s norm. In any case, we have the following fundamental decomposition of L²(Ω) as a direct sum of all Wiener chaoses.

• The Wiener chaos expansion. For any F ∈ L²(Ω), there exists a unique sequence of functions f_n ∈ H^{⊙n} such that

    F = E[F] + Σ_{n=1}^∞ I_n(f_n),

where the terms are all mutually orthogonal in L²(Ω), and

    E[I_n(f_n)²] = n! ‖f_n‖²_{H^{⊗n}}.                                          (3)

• Product formula and contractions. Since L²(Ω) is closed under multiplication, a special case of the above expansion exists for calculating products of Wiener integrals, and is explicit using contractions: for any integers p, q ≥ 1 and symmetric integrands f ∈ H^{⊙p} and g ∈ H^{⊙q},

    I_p(f) I_q(g) = Σ_{r=0}^{p∧q} r! C_p^r C_q^r I_{p+q−2r}(f ⊗̃_r g),          (4)

where f ⊗_r g is the contraction of order r of f and g, an element of H^{⊗(p+q−2r)} defined by

    (f ⊗_r g)(s_1, …, s_{p−r}, t_1, …, t_{q−r})
        := ∫_{[0,1]^r} f(s_1, …, s_{p−r}, u_1, …, u_r) g(t_1, …, t_{q−r}, u_1, …, u_r) du_1 ⋯ du_r,

while f ⊗̃_r g denotes its symmetrization. More generally, the symmetrization f̃ of a function f is defined by f̃(x_1, …, x_p) = (1/p!) Σ_σ f(x_{σ(1)}, …, x_{σ(p)}), where the sum runs over all permutations σ of {1, …, p}. The special case of (4) for p = q = 1 is particularly handy, and can be written in its symmetrized form:

    I_1(f) I_1(g) = 2^{−1} I_2(f ⊗ g + g ⊗ f) + ⟨f, g⟩_H.
• Hypercontractivity in Wiener chaos. For h ∈ H^{⊗q}, the multiple Wiener integrals I_q(h), which exhaust the set H_q, satisfy a hypercontractivity property (equivalence of all L^p norms for p ≥ 2), which implies that for any F ∈ ⊕_{l=1}^q H_l (i.e. in a fixed sum of Wiener chaoses), we have

    (E|F|^p)^{1/p} ≤ c_{p,q} (E|F|²)^{1/2}.                                     (5)

It should be noted that the constants c_{p,q} above are known with some precision when F ∈ H_q: by Corollary 2.8.14 in [20], c_{p,q} = (p − 1)^{q/2} for p ≥ 2.

• Malliavin derivative and other operators on Wiener space. The Malliavin derivative operator D, and other operators on Wiener space, are needed briefly in this paper, to provide an efficient proof of the first theorem in Section 5, and to interpret an observation of I. Nourdin and G. Peccati, given below in (7), for a bound on the total variation distance of any chaos law to the normal law. We do not provide any background on these operators, referring instead to Chapter 2 in [20], and briefly mentioning here the facts we will use in the proof of Section 5, without spelling out all assumptions. Strictly speaking, all the results in this paper can be obtained without the following facts, but this would be exceedingly tedious and wholly nontransparent.
• The operator D maps I_q(f) to t ↦ q I_{q−1}(f(·, t)) and is consistent with the ordinary chain rule. Its domain is denoted by D^{1,2}, and includes all chaos variables.

• The operator L, known as the generator of the Ornstein-Uhlenbeck semigroup on Wiener space, maps I_q(f) to −q I_q(f), and L^{−1} denotes its pseudo-inverse: L's kernel is the constants, and all other chaoses are its eigenspaces. Combining this with the previous point, we obtain −D_t L^{−1} I_q(f) = I_{q−1}(f(·, t)).

• D has an adjoint δ in L²(Ω), which by definition satisfies the duality relation E⟨DF, u⟩_H = E[F δ(u)], where u is any stochastic process for which the expressions are defined. The domain of δ is a non-trivial object of study, but it is known to contain all square-integrable W-adapted processes in the case G = W, the Wiener process, where H = L²([0,1]).

• We have the relation F = δ(−DL^{−1})F.
• Distances between random variables. The following is classical. If X, Y are two real-valued random variables, then the total variation distance between the law of X and the law of Y is given by

    d_TV(X, Y) := sup_{A ∈ B(R)} |P[X ∈ A] − P[Y ∈ A]|,

where the supremum is over all Borel sets. The Kolmogorov distance d_Kol(X, Y) is the same as d_TV except one takes the sup over sets A of the form (−∞, z] for real z. The Wasserstein distance uses Lipschitz rather than indicator functions:

    d_W(X, Y) := sup_{f ∈ Lip(1)} |E f(X) − E f(Y)|,

Lip(1) being the set of all Lipschitz functions with Lipschitz constant ≤ 1.

• The observation of Nourdin and Peccati. Let N denote the standard normal law. The following observation relates an integration-by-parts formula on Wiener space with a classical result of Ch. Stein. Let X ∈ D^{1,2} with E[X] = 0 and Var[X] = 1. Then (see [19, Proposition 2.4], or [20, Theorem 5.1.3]), for f ∈ C_b^1(R),

    E[X f(X)] = E[f′(X) ⟨DX, −DL^{−1}X⟩_H],

and by combining this with properties of solutions of Stein's equations, one gets

    d_TV(X, N) ≤ 2 E|1 − ⟨DX, −DL^{−1}X⟩_H|.                                    (6)

One notes in particular that when X ∈ H_q, since −L^{−1}X = q^{−1}X, we obtain conveniently

    d_TV(X, N) ≤ 2 E|1 − q^{−1} ‖DX‖²_H|.                                       (7)
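As a quick illustration we add (not from the paper): the right-hand side of (7) can be evaluated by Monte Carlo for the simplest normalized second-chaos variable X = (Z² − 1)/√2 with Z standard normal. There q = 2 and DX = √2 Z h for a unit vector h ∈ H, so q^{−1}‖DX‖²_H = Z² and (7) reads d_TV(X, N) ≤ 2E|1 − Z²|.

```python
import numpy as np

# Monte Carlo evaluation of the bound (7) for X = (Z^2 - 1)/sqrt(2):
# here q = 2 and q^{-1} ||DX||_H^2 = Z^2, so the bound is 2 E|1 - Z^2|.
rng = np.random.default_rng(1)
Z = rng.standard_normal(500_000)
bound = 2.0 * np.mean(np.abs(1.0 - Z**2))
print(bound)  # roughly 1.94
```

A single chi-squared-type variable is far from Gaussian, so the bound is of order 1 here; (7) becomes informative for normalized sums, such as the quadratic variations studied below, in which q^{−1}‖DX‖²_H concentrates around 1.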
• A convenient lemma. The following result is a direct consequence of the Borel-Cantelli Lemma (the proof is elementary; see e.g. [14]). It is convenient for establishing almost-sure convergences from L^p convergences.

Lemma 1 Let γ > 0. Let (Z_n)_{n∈N} be a sequence of random variables. If for every p ≥ 1 there exists a constant c_p > 0 such that for all n ∈ N,

    ‖Z_n‖_{L^p(Ω)} ≤ c_p · n^{−γ},

then for all ε > 0 there exists a random variable η_ε such that

    |Z_n| ≤ η_ε · n^{−γ+ε}   almost surely

for all n ∈ N. Moreover, E|η_ε|^p < ∞ for all p ≥ 1.

3 The model
3.1 Definition

We consider the following AR(1) model:

    Y_n = a_0 + a_1 Y_{n−1} + ε_n,   n ≥ 1,
    ε_n = Σ_{j=1}^∞ σ_j (Z_{n,j}² − 1),                                         (8)
    Y_0 = y_0 ∈ R,

where a_0, a_1 and {σ_j, j ≥ 1} are real constants. The sequence of innovations {ε_n, n ≥ 1} is i.i.d., with distribution in the second Wiener chaos. It turns out that this sequence can be represented as in the second line above in (8), where the family {Z_{n,j}, n ≥ 1, j ≥ 1} are i.i.d. standard Gaussian random variables defined on (Ω, F, P), and {σ_j ; j ≥ 1} is a sequence of reals satisfying

    Σ_{j=1}^∞ σ_j² < ∞.                                                         (9)

This is explained in [20, Section 2.7.4]. We assume that the mean-reversion parameter a_1 is such that |a_1| < 1. Under this condition, (8) also admits a stationary ergodic solution. Both the version above and the stationary version are linear functionals of elements of the form of ε_n, which are elements of the second Wiener chaos. Since this chaos is a vector space, both versions of Y take values in the second Wiener chaos.

By truncating the series in (8), one obtains a process which is a sum of chi-squared variables, converging to Y in L²(Ω). Special cases where the sum is finite can be considered. In the figures below, we simulate 500 observations from such cases, to show the variety of behaviors, even with a limited number of terms in the noise series.
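Simulations of this kind are straightforward to reproduce; the following sketch is our illustration (parameter values chosen to match one of the panels of Figure 1 below; any finite truncation of the σ-sequence works the same way). The final assertion checks that the generated path satisfies the recursion in (8) exactly.

```python
import numpy as np

def simulate_ar1_second_chaos(n, a0, a1, y0, sigmas, seed=None):
    """Simulate Y_0..Y_n from (8) with finitely many nonzero sigma_j:
    eps_n = sum_j sigmas[j] * (Z_{n,j}^2 - 1) with Z_{n,j} i.i.d. N(0,1)."""
    rng = np.random.default_rng(seed)
    sigmas = np.asarray(sigmas, dtype=float)
    Z = rng.standard_normal((n, sigmas.size))
    eps = (Z**2 - 1.0) @ sigmas               # second-chaos white noise
    Y = np.empty(n + 1)
    Y[0] = y0
    for k in range(1, n + 1):                 # AR(1) recursion
        Y[k] = a0 + a1 * Y[k - 1] + eps[k - 1]
    return Y, eps

Y, eps = simulate_ar1_second_chaos(500, a0=0.8, a1=0.5, y0=1.0,
                                   sigmas=[2.0, 2.0], seed=0)
assert np.allclose(Y[1:], 0.8 + 0.5 * Y[:-1] + eps)   # recursion holds exactly
```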
• When σ_1 = σ and σ_j = 0 for all j ≥ 2, ε corresponds to a scaled mean-zero chi-squared white noise with one degree of freedom: ε_n = σ(Z_{n,1}² − 1), with Z_{n,1}² ∼ χ²(1).

• When σ_1 = σ_2 = σ and σ_j = 0 for all j ≥ 3, ε is a (centered) exponential white noise with rate parameter 1/(2σ): indeed Z_{n,1}² + Z_{n,2}² ∼ E(1/2), so (Z_{n,1}² − 1) + (Z_{n,2}² − 1) is a centered E(1/2) variable.

• When σ_1 = −σ_2 and σ_j = 0 for all j ≥ 3, ε is a symmetric second-chaos white noise, whose law is that of a product of normals: if N, N′ are two i.i.d. standard normals, then 2NN′ ∼ (Z_{n,1}² − 1) − (Z_{n,2}² − 1) = Z_{n,1}² − Z_{n,2}².

[Figure 1 appears here: four simulated sample paths.]

Figure 1: 500 observations from (8) for different values of a_0, a_1, y_0 and σ_j, j = 1, 2. Panels: (a) a_0 = 0.8, a_1 = 0.5, y_0 = 1, σ_1 = σ_2 = 2; (b) a_0 = 1.2, a_1 = 0.4, y_0 = 2, σ = 4; (c) a_0 = 2, a_1 = 0.7, y_0 = 3, σ = 2; (d) a_0 = 1.2, a_1 = 0.4, y_0 = 0, σ_1 = −σ_2 = 1.
Remark 2 We can see from the figures above the asymmetry in figures (a), (b) and (c), due to the asymmetric nature of the noise; figure (d) shows more symmetry because of the choice σ_1 = −σ_2. We also notice that when the mean reversion is fairly strong and the noise is large, the shape of the observations is balanced (figure (a)), while when the noise is larger compared to the mean-reversion parameter, the observations look like an Ornstein-Uhlenbeck process with a noise larger than the drift (see figure (b)).

3.2 Quadratic variation
This paper's main goal is to determine the asymptotic distribution of the quadratic variation of the observations {Y_n, n ≥ 1} using analysis on Wiener space. This will be facilitated by the fact, mentioned above, that the sequence (Y_n)_{n≥1} lives in the second Wiener chaos with respect to the Wiener process W, by virtue of being the solution of a linear equation with noise in the second chaos. To be more specific, observe that {Y_n, n ≥ 1} in (8) can be expressed, by iterating the recursion, as follows:

    Y_i = d_i + Σ_{k=1}^i a_1^{i−k} Σ_{j=1}^∞ σ_j (Z_{k,j}² − 1),   i ≥ 1,      (10)

where

    d_i = a_1^i y_0 + a_0 Σ_{k=1}^i a_1^{i−k}.                                  (11)
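The closed form (10)-(11) can be cross-checked numerically against the recursion; the sketch below is our illustration, with a one-term noise series (a single σ_1) and arbitrary parameter values.

```python
import numpy as np

rng = np.random.default_rng(3)
n, a0, a1, y0, sigma = 200, 1.0, 0.6, 2.0, 1.5
eps = sigma * (rng.standard_normal(n)**2 - 1.0)   # eps_1..eps_n, one-term series

# recursive form (8)
Y = np.empty(n + 1)
Y[0] = y0
for k in range(1, n + 1):
    Y[k] = a0 + a1 * Y[k - 1] + eps[k - 1]

# closed form (10)-(11): Y_i = d_i + sum_{k=1}^i a1^{i-k} eps_k
closed = np.empty(n)
for m in range(1, n + 1):
    w = a1 ** (m - np.arange(1, m + 1))           # weights a1^{m-k}, k = 1..m
    d_m = a1**m * y0 + a0 * w.sum()               # deterministic part (11)
    closed[m - 1] = d_m + w @ eps[:m]             # stochastic part of (10)

assert np.allclose(Y[1:], closed)                 # the two forms agree
```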
k=1
For the sake of ease of computation in Wiener chaos, it will be convenient throughout this paper to refer to the Wiener integral representation of the noise terms
2 . For this, there exZk,j = W (hk,j ) = I1W (hk,j ) =
{hk,j , k > 1, j > 1} an orthonormal family L2 ([0, 1]) for which Zk,j from the most elementary 0 hk,j (r) dW (r). Hence, using the fact, which comes 2 W ⊗2 product formula (4), that W (ϕ) − 1 = I2 ϕ , we have for i > 1 : ists
R1
Yi = di +
i X
ai−k 1
j=1
k=1
= di +
i X
ai−k 1
Jou
= di +
∞ X j=1
k=1
i X
∞ X
ai−k 1
∞ X
2 σj Zk,j −1
σj W 2 (hk,j ) − 1 σj I2W (h⊗2 k,j ).
j=1
k=1
Therefore, using the linearity property of multiple integrals, we can write for
where
Y˜i := I2W (fi )
application of the
i > 1,
Yi = di + Y˜i ,
and
fi :=
i X k=1
11
a1i−k
∞ X j=1
σj h⊗2 k,j .
(12)
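The L² norm of the kernel f_i defined in (12) can be replayed in a finite-dimensional model (our illustration, not from the paper): representing the orthonormal family h_{k,j} by standard basis vectors, f_i becomes a diagonal matrix with entries a_1^{i−k} σ_j, and its squared Frobenius norm reproduces exactly the closed formula ‖f_i‖² = (Σ_j σ_j²)(1 − a_1^{2i})/(1 − a_1²) computed in the text just below.

```python
import numpy as np

a1, i = 0.45, 12
sigmas = np.array([1.0, -0.5, 0.25])   # finitely many nonzero sigma_j

# f_i = sum_{k,j} a1^{i-k} sigma_j h_{k,j} (x) h_{k,j}: with the h_{k,j}
# orthonormal, this is a diagonal operator with entries a1^{i-k} sigma_j,
# and ||f_i||^2 is the sum of the squared entries.
coeffs = np.array([a1 ** (i - k) * s for k in range(1, i + 1) for s in sigmas])
norm_sq = np.sum(coeffs**2)

closed = (sigmas**2).sum() * (1 - a1 ** (2 * i)) / (1 - a1**2)
assert np.isclose(norm_sq, closed)     # matches the closed formula exactly
```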
A straightforward computation shows that, under Assumption (9), the kernel f_i ∈ L²([0,1]²) for all i ≥ 1: indeed

    ‖f_i‖²_{L²([0,1]²)} = (Σ_{j=1}^∞ σ_j²) × (1 − a_1^{2i})/(1 − a_1²) ≤ (1/(1 − a_1²)) Σ_{j=1}^∞ σ_j² < ∞.
Our main object of study in the next two sections is the asymptotics of the quadratic variation defined as follows:
\[
Q_n := \frac{1}{n} \sum_{i=1}^{n} (Y_i - d_i)^2 = \frac{1}{n} \sum_{i=1}^{n} \tilde{Y}_i^2. \tag{13}
\]
Using product formula (4), we get
\begin{align}
Q_n - E[Q_n] &= \frac{1}{n} \sum_{i=1}^{n} \left( I_2^W(f_i)^2 - 2 \|f_i\|_{L^2([0,1]^2)}^2 \right) \nonumber \\
&= \frac{1}{n} \sum_{i=1}^{n} I_4^W(f_i \otimes f_i) + \frac{4}{n} \sum_{i=1}^{n} I_2^W(f_i \otimes_1 f_i) \nonumber \\
&= I_4^W\left( \frac{1}{n} \sum_{i=1}^{n} f_i \otimes f_i \right) + I_2^W\left( \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right) \nonumber \\
&=: T_{4,n} + T_{2,n}. \tag{14}
\end{align}
In the next section, we show that the asymptotic variance of $\sqrt{n}\,(Q_n - E[Q_n])$ exists and we compute its speed of convergence. Then we establish a CLT for $Q_n$, and compute its Berry-Esséen speed of convergence in total variation.
4 Asymptotic variance of the quadratic variation
Using the orthogonality of multiple integrals living in different chaoses, to calculate the limiting variance of $\sqrt{n}\,(Q_n - E[Q_n])$, we need only study separately the second moments of the terms $T_{2,n}$ and $T_{4,n}$ given in (14).
4.1 Scale constant for $T_{2,n}$

Proposition 3 Under Assumption (9), with $T_{2,n}$ as in (14), for large $n$,
\[
\left| E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^2)^2} \right| \leqslant \frac{C_1}{n}, \tag{15}
\]
where $C_1 := 32 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \dfrac{1 + a_1^2 (5 + 6 a_1^2)}{(1 - a_1^4)^2 (1 - a_1^2)}$.
In particular,
\[
l_1 := \lim_{n \to \infty} E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] = \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^2)^2}. \tag{16}
\]

Proof. We have $T_{2,n} = I_2^W\left( \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right)$; by the isometry property (3) of multiple integrals, we get
\[
E\left[ T_{2,n}^2 \right] = 2 \left\| \frac{4}{n} \sum_{i=1}^{n} f_i \otimes_1 f_i \right\|_{L^2([0,1]^2)}^2 = \frac{32}{n^2} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_i, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)}.
\]
Moreover, under Assumption (9), we have
\begin{align*}
(f_i \otimes_1 f_i)(x,y) &= \sum_{k_1,k_2=1}^{i} a_1^{i-k_1} a_1^{i-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2} \left( h_{k_1,j_1}^{\otimes 2} \otimes_1 h_{k_2,j_2}^{\otimes 2} \right)(x,y) \\
&= \sum_{k_1,k_2=1}^{i} a_1^{i-k_1} a_1^{i-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2} \left( h_{k_1,j_1} \otimes h_{k_2,j_2} \right)(x,y)\, \delta_{j_1,j_2} \delta_{k_1,k_2} \\
&= \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2 \left( h_{k,j}^{\otimes 2} \right)(x,y),
\end{align*}
where $\delta_{ij}$ denotes the Kronecker delta, defined as follows: $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if $i = j$.

Therefore, for $i, j \geqslant 1$ such that $j \geqslant i$, we get
\begin{align}
\langle f_i \otimes_1 f_i, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)} &= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2 \left\langle h_{k_1,j_1}^{\otimes 2}, h_{k_2,j_2}^{\otimes 2} \right\rangle_{L^2([0,1]^2)} \nonumber \\
&= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2 \left( \langle h_{k_1,j_1}, h_{k_2,j_2} \rangle_{L^2([0,1])} \right)^2 \nonumber \\
&= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2\, \delta_{j_1,j_2} \delta_{k_1,k_2} \nonumber \\
&= a_1^{2(j-i)} \sum_{j=1}^{\infty} \sigma_j^4 \times \frac{1 - a_1^{4i}}{1 - a_1^4}. \tag{17}
\end{align}
Therefore, by (17), we have
\[
E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] = \frac{32}{n} \sum_{i=1}^{n} \| f_i \otimes_1 f_i \|_{L^2([0,1]^2)}^2 + \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)}.
\]
Moreover,
\[
\left| \frac{32}{n} \sum_{i=1}^{n} \| f_i \otimes_1 f_i \|_{L^2([0,1]^2)}^2 - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{1 - a_1^4} \right| = \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{1 - a_1^4} \left| \frac{1}{n} \sum_{i=1}^{n} \left( 1 - a_1^{4i} \right) - 1 \right| \leqslant \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^4)^2} \frac{1}{n}.
\]
On the other hand,
\begin{align*}
&\left| \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)} - \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \right| \\
&\qquad = \frac{64 \sum_{j=1}^{\infty} \sigma_j^4}{1 - a_1^4} \left| \frac{1}{n} \sum_{i=1}^{n-1} \left( 1 - a_1^{4i} \right) \sum_{j=i+1}^{n} a_1^{2(j-i)} - \frac{a_1^2}{1 - a_1^2} \right| \\
&\qquad \leqslant \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \left| \frac{1}{n} \sum_{i=1}^{n-1} \left( 1 - a_1^{2(n-i)} \right)\left( 1 - a_1^{4i} \right) - 1 \right| \\
&\qquad \leqslant \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \frac{1}{n} \left( 3 \sum_{i=0}^{n} a_1^{2i} \right) \leqslant \frac{64 a_1^2 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)^2} \frac{3}{n}.
\end{align*}
Consequently, using the decomposition $\frac{32}{(1-a_1^2)^2} = \frac{32}{1-a_1^4} + \frac{64 a_1^2}{(1-a_1^4)(1-a_1^2)}$,
\begin{align*}
\left| E\left[ \left( \sqrt{n}\, T_{2,n} \right)^2 \right] - \frac{32 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^2)^2} \right| &\leqslant \left| \frac{32}{n} \sum_{i=1}^{n} \| f_i \otimes_1 f_i \|_{L^2([0,1]^2)}^2 - \frac{32 \sum_j \sigma_j^4}{1 - a_1^4} \right| \\
&\qquad + \left| \frac{64}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i \otimes_1 f_i, f_j \otimes_1 f_j \rangle_{L^2([0,1]^2)} - \frac{64 a_1^2 \sum_j \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \right| \\
&\leqslant \frac{32 \sum_j \sigma_j^4}{(1 - a_1^4)^2} \frac{1}{n} + \frac{64 a_1^2 \sum_j \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \frac{3}{n} \\
&\leqslant 32 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1 + a_1^2 (5 + 6 a_1^2)}{(1 - a_1^4)^2 (1 - a_1^2)} \frac{1}{n}.
\end{align*}
The desired result is therefore obtained. $\square$
4.2 Scale constant for $T_{4,n}$

Proposition 4 Under Assumption (9), with $T_{4,n}$ as in (14), for large $n$,
\[
\left| E\left[ \left( \sqrt{n}\, T_{4,n} \right)^2 \right] - \frac{4}{(1 - a_1^2)^2} \sum_{j=1}^{\infty} \sigma_j^4 - \frac{4 (1 + a_1^2)}{(1 - a_1^2)^3} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \right| \leqslant \frac{C_2}{n}, \tag{18}
\]
where $C_2 := C_{2,1} + C_{2,2}$, with
\[
C_{2,1} := 4 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \frac{3 + 17 a_1^2}{(1 - a_1^2)^4}, \qquad C_{2,2} := 4 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1 + a_1^2 (5 + 6 a_1^2)}{(1 - a_1^4)^2 (1 - a_1^2)}.
\]
In particular,
\[
l_2 := \lim_{n \to \infty} E\left[ \left( \sqrt{n}\, T_{4,n} \right)^2 \right] = \frac{4}{(1 - a_1^2)^2} \sum_{j=1}^{\infty} \sigma_j^4 + \frac{4 (1 + a_1^2)}{(1 - a_1^2)^3} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2. \tag{19}
\]

Proof. By definition of the term $T_{4,n}$, we have
\[
E\left[ \left( \sqrt{n}\, T_{4,n} \right)^2 \right] = 4! \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i \tilde{\otimes} f_i \right\|_{L^2([0,1]^4)}^2 = \frac{4!}{n} \sum_{i,j=1}^{n} \left\langle f_i \tilde{\otimes} f_i, f_j \tilde{\otimes} f_j \right\rangle_{L^2([0,1]^4)},
\]
where $f_i \tilde{\otimes} f_i$ denotes the symmetrization of $f_i \otimes f_i$, because the kernel $\sum_{i=1}^{n} f_i \otimes f_i \in L^2([0,1]^4)$ is no longer symmetric. We deal with symmetrization by using a combinatorial formula, obtaining
\[
4! \left\langle f_i \tilde{\otimes} f_i, f_j \tilde{\otimes} f_j \right\rangle_{L^2([0,1]^4)} = (2!)^2 \langle f_i \otimes f_i, f_j \otimes f_j \rangle_{L^2([0,1]^4)} + (2!)^2 \langle f_i \otimes_1 f_j, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)}.
\]
Therefore
\[
E\left[ \left( \sqrt{n}\, T_{4,n} \right)^2 \right] = \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes f_i, f_j \otimes f_j \rangle_{L^2([0,1]^4)} + \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_j, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)} =: T_{4,1,n} + T_{4,2,n}.
\]
Moreover,
\[
T_{4,1,n} = \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i, f_j \rangle_{L^2([0,1]^2)}^2 = \frac{4}{n} \sum_{i=1}^{n} \langle f_i, f_i \rangle_{L^2([0,1]^2)}^2 + \frac{8}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i, f_j \rangle_{L^2([0,1]^2)}^2.
\]
On the other hand, using (9), we have for $j \geqslant i$,
\[
\langle f_i, f_j \rangle_{L^2([0,1]^2)} = \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{i-k_1} a_1^{j-k_2} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1} \sigma_{j_2}\, \delta_{j_1,j_2} \delta_{k_1,k_2} = a_1^{j-i} \sum_{j=1}^{\infty} \sigma_j^2 \times \frac{1 - a_1^{2i}}{1 - a_1^2}. \tag{20}
\]
Therefore, by (20),
\begin{align*}
&\left| T_{4,1,n} - \frac{4 (1 + a_1^2)}{(1 - a_1^2)^3} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \right| \\
&\quad \leqslant \left| \frac{4}{n} \sum_{i=1}^{n} \| f_i \|_{L^2([0,1]^2)}^4 - \frac{4}{(1 - a_1^2)^2} \left( \sum_j \sigma_j^2 \right)^2 \right| + \left| \frac{8}{n} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle f_i, f_j \rangle_{L^2([0,1]^2)}^2 - \frac{8 a_1^2}{(1 - a_1^2)^3} \left( \sum_j \sigma_j^2 \right)^2 \right| \\
&\quad \leqslant \frac{4 \left( \sum_j \sigma_j^2 \right)^2}{(1 - a_1^2)^2} \frac{1}{n} \left| \sum_{i=1}^{n} \left( \left( 1 - a_1^{2i} \right)^2 - 1 \right) \right| + \frac{8 a_1^2 \left( \sum_j \sigma_j^2 \right)^2}{(1 - a_1^2)^3} \left| \frac{1}{n} \sum_{i=1}^{n-1} \left( 1 - a_1^{2(n-i)} \right) \left( 1 - a_1^{2i} \right)^2 - 1 \right| \\
&\quad \leqslant \frac{12 \left( \sum_j \sigma_j^2 \right)^2}{(1 - a_1^2)^3} \frac{1}{n} + \frac{80 a_1^2 \left( \sum_j \sigma_j^2 \right)^2}{(1 - a_1^2)^4} \frac{1}{n} = \frac{4 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 (3 + 17 a_1^2)}{(1 - a_1^2)^4} \frac{1}{n}.
\end{align*}
Now let us estimate
\[
T_{4,2,n} = \frac{4}{n} \sum_{i,j=1}^{n} \langle f_i \otimes_1 f_j, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)}.
\]
For $j \geqslant i$ and $x, y \in [0,1]$, we have
\[
(f_i \otimes_1 f_j)(x,y) = a_1^{j-i} \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2 \left( h_{k,j}^{\otimes 2} \right)(x,y).
\]
So, for $j \geqslant i$,
\[
\langle f_i \otimes_1 f_j, f_j \otimes_1 f_i \rangle_{L^2([0,1]^2)} = \int_{[0,1]^2} (f_i \otimes_1 f_j)^2(x,y)\, dx\, dy = a_1^{2(j-i)} \sum_{j=1}^{\infty} \sigma_j^4 \times \frac{1 - a_1^{4i}}{1 - a_1^4}. \tag{21}
\]
Therefore,
\begin{align*}
\left| T_{4,2,n} - \frac{4 \sum_{j=1}^{\infty} \sigma_j^4}{(1 - a_1^2)^2} \right| &= \left| \frac{4 \sum_j \sigma_j^4}{1 - a_1^4} \frac{1}{n} \sum_{i=1}^{n} \left( 1 - a_1^{4i} \right) + \frac{8 \sum_j \sigma_j^4}{1 - a_1^4} \frac{1}{n} \sum_{i=1}^{n-1} \left( 1 - a_1^{4i} \right) \sum_{j=i+1}^{n} a_1^{2(j-i)} - \frac{4 \sum_j \sigma_j^4}{(1 - a_1^2)^2} \right| \\
&\leqslant \frac{4 \sum_j \sigma_j^4}{1 - a_1^4} \frac{1}{n} \sum_{i=1}^{n} a_1^{4i} + \frac{8 a_1^2 \sum_j \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)} \left| \frac{1}{n} \sum_{i=1}^{n-1} \left( 1 - a_1^{4i} \right)\left( 1 - a_1^{2(n-i)} \right) - 1 \right| \\
&\leqslant \frac{4 \sum_j \sigma_j^4}{(1 - a_1^4)^2} \frac{1}{n} + \frac{24 a_1^2 \sum_j \sigma_j^4}{(1 - a_1^4)(1 - a_1^2)^2} \frac{1}{n} = \frac{4 \sum_{j=1}^{\infty} \sigma_j^4 \left( 1 + a_1^2 (5 + 6 a_1^2) \right)}{(1 - a_1^4)^2 (1 - a_1^2)} \frac{1}{n},
\end{align*}
which completes the proof. $\square$
To get a sense of how the two terms $T_{2,n}$ and $T_{4,n}$ compare to each other, we propose the following example, which shows that, despite one's best efforts, one should not expect either of these two terms to dominate the other.

Remark 5 In the AR(1) model (8) with chi-squared white noise, i.e. when $\sigma_1 = \sigma$ and $\sigma_j = 0$ for all $j \geqslant 2$, one can try to compare the two formulas for the asymptotic variances of $T_{2,n}$ and $T_{4,n}$. Avoiding the situation where $|a_1|$ is very close to 1, assuming for instance $|a_1| < 2^{-1/2}$, so that $1 - a_1^2 > 1/2$, when $n$ is large, we have
\[
\mathrm{Var}(T_{2,n}) \sim \frac{1}{n} \frac{32 \sigma^4}{(1 - a_1^2)^2} \sim 4 \times (1 - a_1^2) \times \mathrm{Var}(T_{4,n}) > 4 \times \frac{1}{2} \times \mathrm{Var}(T_{4,n}) = 2 \times \mathrm{Var}(T_{4,n}).
\]
Therefore the sequence $T_{4,n}$ can be made to have a variance which is significantly smaller than that of $T_{2,n}$ in this case, but both of them converge to zero at the same speed $n^{-1}$.
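The comparison in Remark 5 can be verified numerically from the limit formulas (16) and (19): in the single-$\sigma$ case, $l_2$ simplifies to $8\sigma^4/(1-a_1^2)^3$, so $l_1 = 4(1-a_1^2)\, l_2$ exactly. A small Python check of this identity (our own sketch, not from the paper):

```python
def l1(a1, sigma):
    # Limit of E[(sqrt(n) T_{2,n})^2] from (16), single nonzero sigma_1 = sigma
    return 32 * sigma**4 / (1 - a1**2) ** 2

def l2(a1, sigma):
    # Limit of E[(sqrt(n) T_{4,n})^2] from (19), single nonzero sigma_1 = sigma
    return (4 * sigma**4 / (1 - a1**2) ** 2
            + 4 * (1 + a1**2) * sigma**4 / (1 - a1**2) ** 3)

# The ratio of asymptotic variances equals 4*(1 - a1^2), as in Remark 5.
for a1 in (0.1, 0.3, 0.5, 0.7):
    assert abs(l1(a1, 1.0) / l2(a1, 1.0) - 4 * (1 - a1**2)) < 1e-12
```

In particular, for $|a_1| < 2^{-1/2}$ the ratio exceeds 2, matching the displayed inequality.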
Using the orthogonality between $T_{2,n}$ and $T_{4,n}$, Proposition 3 and Proposition 4, we conclude the following.

Theorem 6 Under Assumption (9), with $Q_n$ as in (13), for large $n$,
\[
\left| n\, \mathrm{Var}(Q_n) - (l_1 + l_2) \right| \leqslant \frac{C_1 + C_2}{n},
\]
and in particular the asymptotic variance of $Q_n$ is
\[
\lim_{n \to \infty} n\, \mathrm{Var}(Q_n) = l_1 + l_2 = \frac{36}{(1 - a_1^2)^2} \sum_{j=1}^{\infty} \sigma_j^4 + \frac{4 (1 + a_1^2)}{(1 - a_1^2)^3} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2,
\]
where $C_1$, $l_1$, and $C_2$, $l_2$ are given respectively in (15), (16), (18) and (19).
Remark 7
• From the previous theorem, we notice that for $n$ large, and fixed values of the noise scale parameter family $\{\sigma_j, j \geqslant 1\}$, the variance of $\sqrt{n}\, Q_n$ has high values when $|a_1|$ is close to 1, and approaches
\[
36 \sum_{j=1}^{\infty} \sigma_j^4 + 4 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2
\]
when $|a_1|$ is small.
• The previous theorem also shows that one can obtain other asymptotics depending on the relation between $a_1$ and the family $\{\sigma_j, j \geqslant 1\}$. For instance, when $|a_1|$ is close to 1, which is the limit of fast mean reversion, one can avoid an explosion of $Q_n$'s asymptotic variance by scaling the variance parameters appropriately, leading to a fast-mean-reversion and small-noise regime. Letting $1/\alpha := 1 - a_1^2$, where $\alpha$ is interpreted as a rate of mean reversion, one would only need to ensure that $\sum \sigma_j^4 = O\left( \alpha^{-2} \right)$ and $\sum \sigma_j^2 = O\left( \alpha^{-3/2} \right)$. In the example where there is a single non-zero value $\sigma$, for instance, we would obtain for large $\alpha$,
\[
n \times \mathrm{Var}(Q_n) \sim 36\, \alpha^2 \sigma^4 + 8\, \alpha^3 \sigma^4;
\]
here the second term dominates, and as $\alpha \to \infty$, assuming $\alpha^3 \sigma^4$ remains bounded, we would get an asymptotic variance of $8 \lim_{\alpha \to \infty} \alpha^3 \sigma^4$ if the limit exists.
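Theorem 6's limit $\lim_n n\,\mathrm{Var}(Q_n) = l_1 + l_2$ can also be probed by Monte Carlo. The sketch below is our own illustration, not the paper's: it uses the chi-squared case with $a_0 = y_0 = 0$, so that $d_i = 0$ and $Q_n = \frac{1}{n}\sum_i Y_i^2$; the empirical value of $n\,\mathrm{Var}(Q_n)$ should be of the same order as the theoretical limit (agreement is only approximate at finite $n$ and a finite number of replications).

```python
import random
import statistics

def Q_n(n, a1, sigma, rng):
    # One path of Y_i = a1*Y_{i-1} + sigma*(Z_i^2 - 1), started at Y_0 = 0 with
    # a0 = 0, so d_i = 0 and Q_n = (1/n) * sum_i Y_i^2 as in (13).
    Y, acc = 0.0, 0.0
    for _ in range(n):
        Z = rng.gauss(0.0, 1.0)
        Y = a1 * Y + sigma * (Z * Z - 1.0)
        acc += Y * Y
    return acc / n

def n_var_Qn(a1=0.5, sigma=1.0, n=2000, reps=400, seed=1):
    # Empirical n * Var(Q_n) over independent replications
    rng = random.Random(seed)
    qs = [Q_n(n, a1, sigma, rng) for _ in range(reps)]
    return n * statistics.variance(qs)

def theory(a1=0.5, sigma=1.0):
    # l1 + l2 from Theorem 6 with the single nonzero sigma_1 = sigma
    return (36 * sigma**4 / (1 - a1**2) ** 2
            + 4 * (1 + a1**2) * sigma**4 / (1 - a1**2) ** 3)
```

With $a_1 = 0.5$ and $\sigma = 1$, the theoretical limit is $64 + 32/2.7 \approx 75.85$, and the simulated value typically lands within a few percent of it.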
5 Berry-Esséen bound for the asymptotic normality of the quadratic variation

In this section, we prove that the quadratic variation defined in (13) is asymptotically normal and we estimate the speed of this convergence in total variation distance, showing it is of the Berry-Esséen-type order $n^{-1/2}$. For this aim, we will need the following theorem, which estimates the total variation distance to the normal of the standardized sum of variables in the 2nd and 4th chaos.
Theorem 8 Let $F = I_2(f) + I_4(g)$ where $f \in L_s^2([0,1]^2)$ and $g \in L_s^2([0,1]^4)$. Then
\begin{align}
d_{TV}\left( \frac{F}{\sqrt{EF^2}},\ N(0,1) \right) \leqslant \frac{4}{EF^2} \Big[ & \sqrt{2}\, \| f \otimes_1 f \|_{L^2([0,1]^2)} + 2\sqrt{6!}\, \| g \otimes_1 g \|_{L^2([0,1]^6)} + 18\sqrt{4!}\, \| g \otimes_2 g \|_{L^2([0,1]^4)} \nonumber \\
& + 36\sqrt{2}\, \| g \otimes_3 g \|_{L^2([0,1]^2)} + 9\sqrt{2}\, \sqrt{ \langle f \otimes f, g \otimes_2 g \rangle_{L^2([0,1]^4)} } \nonumber \\
& + 3\sqrt{4!}\, \sqrt{ \| f \otimes_1 f \|_{L^2([0,1]^2)} \| g \otimes_3 g \|_{L^2([0,1]^2)} } \Big]. \tag{22}
\end{align}
Moreover, letting $R_F$ be the bracketed term on the right-hand side of (22), for any constant $\sigma > 0$, we have
\[
d_{TV}\left( \frac{F}{\sigma},\ N(0,1) \right) \leqslant \frac{4}{\sigma^2} R_F + 2 \left| 1 - \frac{EF^2}{\sigma^2} \right|. \tag{23}
\]

Proof. We have $F = I_2(f) + I_4(g)$. Then
\[
-D_t L^{-1} F = I_1(f(\cdot,t)) + I_3(g(\cdot,t)) =: u(t).
\]
Thus, using $F = \delta(-DL^{-1}F)$, we can write $F = \delta(u)$. Now we use the result of a simple calculation, labeled as (9) in the preprint [21] (see also [24]), to obtain
\[
d_{TV}\left( \frac{F}{\sqrt{EF^2}},\ N(0,1) \right) \leqslant \frac{2}{EF^2} \sqrt{ \mathrm{Var}\left( \langle DF, u \rangle_{L^2([0,1])} \right) } = \frac{2}{EF^2} \sqrt{ E\left[ \left( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \right)^2 \right] }, \tag{24}
\]
where the last equality comes from the duality relation $E \langle DF, u \rangle_{L^2([0,1])} = E(F \delta(u)) = EF^2$. The prior inequality appears to be used in a more general context here than what is stated in [21, Eq. (9)], but an immediate inspection of its proof therein shows that it applies to any situation where $F = \delta(u)$, using only general results such as Stein's lemma, the chain rule for the Malliavin derivative $D$, and the duality between $D$ and $\delta$.

On the other hand, using the product formula (4),
\begin{align*}
\langle DF, u \rangle_{L^2([0,1])} &= \int_0^1 u(t)\, D_t F\, dt = \int_0^1 u(t) \left( 2 I_1(f(\cdot,t)) + 4 I_3(g(\cdot,t)) \right) dt \\
&= \int_0^1 \left( 2 \left[ I_1(f(\cdot,t)) \right]^2 + 4 \left[ I_3(g(\cdot,t)) \right]^2 + 6 I_1(f(\cdot,t)) I_3(g(\cdot,t)) \right) dt \\
&= \int_0^1 \Big[ 2 I_2\left( f(\cdot,t) \otimes f(\cdot,t) \right) + 2 \| f(\cdot,t) \|_{L^2([0,1])}^2 + 4 I_6\left( g(\cdot,t) \otimes g(\cdot,t) \right) \\
&\qquad + 36 I_4\left( g(\cdot,t) \otimes_1 g(\cdot,t) \right) + 72 I_2\left( g(\cdot,t) \otimes_2 g(\cdot,t) \right) + 24\, g(\cdot,t) \otimes_3 g(\cdot,t) \\
&\qquad + 6 I_4\left( f(\cdot,t) \otimes g(\cdot,t) \right) + 18 I_2\left( f(\cdot,t) \otimes_1 g(\cdot,t) \right) \Big] dt.
\end{align*}
Thus
\[
\langle DF, u \rangle_{L^2([0,1])} - EF^2 = \int_0^1 \left( I_2(h_1(\cdot,t)) + I_4(h_2(\cdot,t)) + I_6(h_3(\cdot,t)) \right) dt,
\]
where
\begin{align*}
h_1(\cdot,t) &= 2\, f(\cdot,t) \otimes f(\cdot,t) + 72\, g(\cdot,t) \otimes_2 g(\cdot,t) + 18\, f(\cdot,t) \otimes_1 g(\cdot,t), \\
h_2(\cdot,t) &= 36\, g(\cdot,t) \otimes_1 g(\cdot,t) + 6\, f(\cdot,t) \otimes g(\cdot,t), \\
h_3(\cdot,t) &= 4\, g(\cdot,t) \otimes g(\cdot,t).
\end{align*}
Therefore, using Minkowski's inequality,
\[
\sqrt{ E\left[ \left( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \right)^2 \right] } \leqslant \sqrt{2} \left\| \int_0^1 h_1(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} + \sqrt{4!} \left\| \int_0^1 h_2(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} + \sqrt{6!} \left\| \int_0^1 h_3(\cdot,t)\, dt \right\|_{L^2([0,1]^6)}.
\]
Furthermore,
\begin{align*}
\left\| \int_0^1 h_1(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} &\leqslant 2 \left\| \int_0^1 f(\cdot,t) \otimes f(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} + 72 \left\| \int_0^1 g(\cdot,t) \otimes_2 g(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} \\
&\qquad + 18 \left\| \int_0^1 f(\cdot,t) \otimes_1 g(\cdot,t)\, dt \right\|_{L^2([0,1]^2)} \\
&= 2 \| f \otimes_1 f \|_{L^2([0,1]^2)} + 72 \| g \otimes_3 g \|_{L^2([0,1]^2)} + 18 \sqrt{ \langle f \otimes f, g \otimes_2 g \rangle_{L^2([0,1]^4)} }.
\end{align*}
Also,
\begin{align*}
\left\| \int_0^1 h_2(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} &\leqslant 36 \left\| \int_0^1 g(\cdot,t) \otimes_1 g(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} + 6 \left\| \int_0^1 f(\cdot,t) \otimes g(\cdot,t)\, dt \right\|_{L^2([0,1]^4)} \\
&= 36 \| g \otimes_2 g \|_{L^2([0,1]^4)} + 6 \sqrt{ \langle f \otimes_1 f, g \otimes_3 g \rangle_{L^2([0,1]^2)} } \\
&\leqslant 36 \| g \otimes_2 g \|_{L^2([0,1]^4)} + 6 \sqrt{ \| f \otimes_1 f \|_{L^2([0,1]^2)} \| g \otimes_3 g \|_{L^2([0,1]^2)} },
\end{align*}
and
\[
\left\| \int_0^1 h_3(\cdot,t)\, dt \right\|_{L^2([0,1]^6)} = 4 \left\| \int_0^1 g(\cdot,t) \otimes g(\cdot,t)\, dt \right\|_{L^2([0,1]^6)} = 4 \| g \otimes_1 g \|_{L^2([0,1]^6)}.
\]
As a consequence,
\begin{align*}
\sqrt{ E\left[ \left( \langle DF, u \rangle_{L^2([0,1])} - EF^2 \right)^2 \right] } &\leqslant 2\sqrt{2}\, \| f \otimes_1 f \|_{L^2([0,1]^2)} + 72\sqrt{2}\, \| g \otimes_3 g \|_{L^2([0,1]^2)} \\
&\quad + 18\sqrt{2}\, \sqrt{ \langle f \otimes f, g \otimes_2 g \rangle_{L^2([0,1]^4)} } + 36\sqrt{4!}\, \| g \otimes_2 g \|_{L^2([0,1]^4)} \\
&\quad + 6\sqrt{4!}\, \sqrt{ \| f \otimes_1 f \|_{L^2([0,1]^2)} \| g \otimes_3 g \|_{L^2([0,1]^2)} } + 4\sqrt{6!}\, \| g \otimes_1 g \|_{L^2([0,1]^6)}.
\end{align*}
This, combined with (24), establishes inequality (22). For (23), we have by (6)
\begin{align}
d_{TV}\left( \frac{F}{\sigma},\ N(0,1) \right) &\leqslant \frac{2}{\sigma^2} E\left[ \left| \sigma^2 - \langle DF, u \rangle_{L^2([0,1])} \right| \right] \nonumber \\
&\leqslant \frac{2}{\sigma^2} E\left[ \left| E[F^2] - \langle DF, u \rangle_{L^2([0,1])} \right| \right] + 2 \left| 1 - \frac{E[F^2]}{\sigma^2} \right|. \tag{25}
\end{align}
Inequality (23) follows using inequalities (25) and (22). $\square$
We will need Theorem 8 along with the next Lemmas 9, 10 and 11 in order to prove that the quadratic variation $Q_n$ satisfies the Berry-Esséen Theorem 13 below.

Lemma 9 Let $g_{2,n} := \frac{4}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes_1 f_i$ where $f_i$ is defined in (12). If assumption (9) holds, then the kernel $g_{2,n}$ satisfies
\[
\| g_{2,n} \otimes_1 g_{2,n} \|_{L^2([0,1]^2)} \leqslant \frac{C_3}{\sqrt{n}}, \tag{26}
\]
where $C_3 := \sqrt{ 4^4 \cdot 4! \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \dfrac{1}{1 - a_1^8} \left( \dfrac{1}{1 - a_1^2} \right)^3 }$.

Proof. We have
\[
\| g_{2,n} \otimes_1 g_{2,n} \|_{L^2([0,1]^2)}^2 = \frac{4^4}{n^2} \sum_{i,j,k,l=1}^{n} \left\langle (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j),\ (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \right\rangle_{L^2([0,1]^2)}.
\]
Moreover, by the above calculations and (9),
\[
(f_i \otimes_1 f_i)(x,y) = \sum_{k=1}^{i} a_1^{2(i-k)} \sum_{j=1}^{\infty} \sigma_j^2\, h_{k,j}^{\otimes 2}(x,y).
\]
Hence
\begin{align*}
\left( (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j) \right)(x,y) &= \int_{[0,1]} (f_i \otimes_1 f_i)(x,t)\, (f_j \otimes_1 f_j)(y,t)\, dt \\
&= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j_1,j_2=1}^{\infty} \sigma_{j_1}^2 \sigma_{j_2}^2 \int_{[0,1]} h_{k_1,j_1}(x) h_{k_1,j_1}(t) h_{k_2,j_2}(y) h_{k_2,j_2}(t)\, dt \\
&= \sum_{k_1=1}^{i} \sum_{k_2=1}^{j} a_1^{2(i-k_1)} a_1^{2(j-k_2)} \sum_{j=1}^{\infty} \sigma_j^4\, (h_{k_1,j} \otimes h_{k_2,j})(x,y)\, \delta_{k_1,k_2} \\
&= \sum_{k_1=1}^{i \wedge j} a_1^{2(i+j-2k_1)} \sum_{j=1}^{\infty} \sigma_j^4\, (h_{k_1,j} \otimes h_{k_1,j})(x,y).
\end{align*}
Similarly,
\[
\left( (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \right)(x,y) = \sum_{k_2=1}^{k \wedge l} a_1^{2(k+l-2k_2)} \sum_{j=1}^{\infty} \sigma_j^4\, (h_{k_2,j} \otimes h_{k_2,j})(x,y).
\]
Therefore
\begin{align*}
&\left\langle (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j),\ (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \right\rangle_{L^2([0,1]^2)} \\
&\qquad = \int_{[0,1]^2} \left( (f_i \otimes_1 f_i) \otimes_1 (f_j \otimes_1 f_j) \right)(x,y) \left( (f_k \otimes_1 f_k) \otimes_1 (f_l \otimes_1 f_l) \right)(x,y)\, dx\, dy = \sum_{m=1}^{i \wedge j \wedge k \wedge l} a_1^{2(i+j+k+l-4m)} \sum_{j=1}^{\infty} \sigma_j^8.
\end{align*}
Consequently,
\begin{align*}
\| g_{2,n} \otimes_1 g_{2,n} \|_{L^2([0,1]^2)}^2 &\leqslant 4!\, \frac{4^4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n}\ \sum_{r=1}^{i} a_1^{2(i+j+k+l-4r)} \\
&= 4!\, \frac{4^4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(j+k+l-3i)}\, \frac{1 - a_1^{8i}}{1 - a_1^8} \\
&\leqslant 4!\, \frac{4^4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(j+k+l-3i)} \\
&\leqslant 4!\, \frac{4^4}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(k_1+k_2+k_3)} \\
&\leqslant 4!\, \frac{4^4}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \left( \sum_{k_1=0}^{n} a_1^{2k_1} \right)^3 \leqslant \frac{4^4 \cdot 4!}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \left( \frac{1}{1 - a_1^2} \right)^3,
\end{align*}
where we used the change of variables $k_1 = j - i$, $k_2 = k - i$, $k_3 = l - i$. The desired result therefore follows. $\square$
Lemma 10 Let $g_{4,n} := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes f_i$ where $f_i$ is defined in (12). If assumption (9) holds, then for every $r = 1, 2, 3$, the kernel $g_{4,n}$ satisfies
\[
\| g_{4,n} \otimes_r g_{4,n} \|_{L^2([0,1]^{8-2r})} \leqslant \frac{C_{4,r}}{\sqrt{n}}, \tag{27}
\]
where
\[
C_{4,r} := \begin{cases}
\sqrt{ 4! \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \left( \dfrac{1}{(1 - a_1^2)(1 - a_1^4)} \right)^3 } & \text{if } r = 1, \\[2mm]
\sqrt{ 4! \left( \dfrac{1}{1 - a_1^2} \right)^7 \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^4 } & \text{if } r = 2, \\[2mm]
\sqrt{ 4! \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \dfrac{1}{(1 - a_1^4)^2 (1 - a_1^2)^4} } & \text{if } r = 3.
\end{cases}
\]

Proof. For $r = 1, 2, 3$, we have
\[
\| g_{4,n} \otimes_r g_{4,n} \|_{L^2([0,1]^{2(4-r)})}^2 = \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \left\langle (f_i \otimes f_i) \otimes_r (f_j \otimes f_j),\ (f_k \otimes f_k) \otimes_r (f_l \otimes f_l) \right\rangle_{L^2([0,1]^{2(4-r)})}.
\]
For $r = 1$, we get
\begin{align}
\| g_{4,n} \otimes_1 g_{4,n} \|_{L^2([0,1]^6)}^2 &= \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \langle f_i, f_k \rangle_{L^2([0,1]^2)} \langle f_j, f_l \rangle_{L^2([0,1]^2)} \langle f_i \otimes_1 f_j, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)} \nonumber \\
&\leqslant \frac{4!}{n^2} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} \left| \langle f_i, f_k \rangle_{L^2([0,1]^2)} \langle f_j, f_l \rangle_{L^2([0,1]^2)} \langle f_i \otimes_1 f_j, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)} \right|. \tag{28}
\end{align}
By (9), for all $1 \leqslant i, k \leqslant n$,
\[
\langle f_i, f_k \rangle_{L^2([0,1]^2)} = \sum_{m=1}^{i \wedge k} a_1^{i+k-2m} \times \sum_{j=1}^{\infty} \sigma_j^2, \tag{29}
\]
and similarly for $\langle f_j, f_l \rangle_{L^2([0,1]^2)}$. On the other hand, for all $1 \leqslant i, j \leqslant n$,
\[
(f_i \otimes_1 f_j)(x,y) = \int_0^1 f_i(x,t) f_j(y,t)\, dt = \sum_{m=1}^{i \wedge j} a_1^{i+j-2m} \sum_{j=1}^{\infty} \sigma_j^2\, h_{m,j}^{\otimes 2}(x,y). \tag{30}
\]
Hence, by (9), for all $1 \leqslant i, j, k, l \leqslant n$,
\[
\langle f_i \otimes_1 f_j, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)} = \sum_{m=1}^{i \wedge j \wedge k \wedge l} a_1^{i+j+k+l-4m} \times \sum_{j=1}^{\infty} \sigma_j^4.
\]
Therefore, from (28) and the above calculations, we have
\begin{align*}
\| g_{4,n} \otimes_1 g_{4,n} \|_{L^2([0,1]^6)}^2 &\leqslant \frac{4!}{n^2}\, \frac{1}{(1 - a_1^2)^2} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \frac{1}{1 - a_1^4} \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(k-i)} a_1^{2(l-i)} \left( 1 - a_1^{2i} \right) \left( 1 - a_1^{2j} \right) \left( 1 - a_1^{4i} \right) \\
&\leqslant \frac{4!}{n}\, \frac{1}{(1 - a_1^2)^2 (1 - a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(2k_1 + 2k_2 + k_3)} \\
&= \frac{4!}{n}\, \frac{1}{(1 - a_1^2)^2 (1 - a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \left( \sum_{r_1=0}^{n} a_1^{4 r_1} \right)^2 \left( \sum_{r_2=0}^{n} a_1^{2 r_2} \right) \\
&\leqslant \frac{4!}{n} \left( \sum_j \sigma_j^4 \right) \left( \sum_j \sigma_j^2 \right)^2 \left( \frac{1}{(1 - a_1^2)(1 - a_1^4)} \right)^3,
\end{align*}
where we used the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$.

For $r = 2$, we have
\begin{align*}
\| g_{4,n} \otimes_2 g_{4,n} \|_{L^2([0,1]^4)}^2 &= \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \langle f_i, f_j \rangle_{L^2([0,1]^2)} \langle f_k, f_l \rangle_{L^2([0,1]^2)} \langle f_i, f_k \rangle_{L^2([0,1]^2)} \langle f_j, f_l \rangle_{L^2([0,1]^2)} \\
&\leqslant \frac{4!}{n^2} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} \left| \langle f_i, f_j \rangle \langle f_k, f_l \rangle \langle f_i, f_k \rangle \langle f_j, f_l \rangle \right|.
\end{align*}
Hence, by (29) and (9), we get
\begin{align*}
\| g_{4,n} \otimes_2 g_{4,n} \|_{L^2([0,1]^4)}^2 &\leqslant \frac{4!}{(1 - a_1^2)^4}\, \frac{\left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^4}{n^2} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(l-i)} \left( 1 - a_1^{2i} \right)^2 \left( 1 - a_1^{2j} \right) \left( 1 - a_1^{2k} \right) \\
&\leqslant \frac{4!}{(1 - a_1^2)^4}\, \frac{\left( \sum_j \sigma_j^2 \right)^4}{n} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(k_1+k_2+k_3)} \leqslant \frac{4!}{(1 - a_1^2)^7}\, \frac{\left( \sum_j \sigma_j^2 \right)^4}{n},
\end{align*}
where we used the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$.

For $r = 3$, we have
\begin{align*}
\| g_{4,n} \otimes_3 g_{4,n} \|_{L^2([0,1]^2)}^2 &= \frac{1}{n^2} \sum_{i,j,k,l=1}^{n} \langle f_i, f_j \rangle_{L^2([0,1]^2)} \langle f_k, f_l \rangle_{L^2([0,1]^2)} \langle f_i \otimes_1 f_j, f_k \otimes_1 f_l \rangle_{L^2([0,1]^2)} \\
&\leqslant \frac{4!}{n^2} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} \left| \langle f_i, f_j \rangle \langle f_k, f_l \rangle \langle f_i \otimes_1 f_j, f_k \otimes_1 f_l \rangle \right| \\
&\leqslant \frac{4!}{(1 - a_1^2)^2}\, \frac{1}{1 - a_1^4} \left( \sum_{j=1}^{\infty} \sigma_j^2 \right)^2 \left( \sum_{j=1}^{\infty} \sigma_j^4 \right) \frac{1}{n^2} \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(j-i)} a_1^{2(l-i)} \left( 1 - a_1^{2i} \right) \left( 1 - a_1^{4i} \right) \left( 1 - a_1^{2k} \right).
\end{align*}
Furthermore, using (29) and the change of variables $k_1 = j - i$, $k_2 = k - j$, $k_3 = l - k$, we obtain
\begin{align*}
\| g_{4,n} \otimes_3 g_{4,n} \|_{L^2([0,1]^2)}^2 &\leqslant \frac{4!}{n}\, \frac{1}{(1 - a_1^2)^2 (1 - a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(2k_1 + k_2 + k_3)} \\
&= \frac{4!}{n}\, \frac{1}{(1 - a_1^2)^2 (1 - a_1^4)} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \left( \sum_{r_1=0}^{n} a_1^{4 r_1} \right) \left( \sum_{r_2=0}^{n} a_1^{2 r_2} \right)^2 \\
&\leqslant \frac{4!}{n} \left( \sum_j \sigma_j^2 \right)^2 \left( \sum_j \sigma_j^4 \right) \frac{1}{(1 - a_1^4)^2 (1 - a_1^2)^4},
\end{align*}
which ends the proof. $\square$
Lemma 11 Suppose that assumption (9) holds. Consider the kernels $g_{2,n}$ and $g_{4,n}$ defined in Lemma 9 and Lemma 10 respectively; then we have
\[
\langle g_{2,n} \otimes g_{2,n},\ g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} \leqslant \frac{C_5}{n}, \tag{31}
\]
where $C_5 := 4 \cdot 4! \left( \sum_{j=1}^{\infty} \sigma_j^8 \right) \dfrac{1}{1 - a_1^8} \left( \dfrac{1}{1 - a_1^2} \right)^3$.

Proof. We have
\begin{align*}
\langle g_{2,n} \otimes g_{2,n},\ g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} &= \frac{4}{n^2} \sum_{i,j,k,l=1}^{n} \left\langle (f_i \otimes_1 f_i) \otimes (f_j \otimes_1 f_j),\ (f_k \otimes f_k) \otimes_2 (f_l \otimes f_l) \right\rangle_{L^2([0,1]^4)} \\
&= \frac{4}{n^2} \sum_{i,j,k,l=1}^{n} \int_{[0,1]^4} (f_i \otimes_1 f_k)(x_1,x_3)(f_i \otimes_1 f_k)(x_1,x_4)(f_j \otimes_1 f_l)(x_2,x_3)(f_j \otimes_1 f_l)(x_2,x_4)\, dx_1\, dx_2\, dx_3\, dx_4.
\end{align*}
Consequently, using (30), we get
\begin{align*}
\langle g_{2,n} \otimes g_{2,n},\ g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} &\leqslant \frac{4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{i,j,k,l=1}^{n}\ \sum_{r=1}^{i \wedge j \wedge k \wedge l} a_1^{2(i+j+k+l-4r)} \\
&\leqslant 4!\, \frac{4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n}\ \sum_{r=1}^{i} a_1^{2(i+j+k+l-4r)} \\
&= 4!\, \frac{4}{n^2} \sum_{j=1}^{\infty} \sigma_j^8 \sum_{1 \leqslant i \leqslant j \leqslant k \leqslant l \leqslant n} a_1^{2(j+k+l-3i)}\, \frac{1 - a_1^{8i}}{1 - a_1^8} \\
&\leqslant 4!\, \frac{4}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \sum_{k_1,k_2,k_3=0}^{n} a_1^{2(k_1+k_2+k_3)} \\
&\leqslant \frac{4 \cdot 4!}{n} \sum_{j=1}^{\infty} \sigma_j^8\, \frac{1}{1 - a_1^8} \left( \frac{1}{1 - a_1^2} \right)^3,
\end{align*}
where we used the change of variables $k_1 = j - i$, $k_2 = k - i$, $k_3 = l - i$. $\square$
Remark 12 It turns out that, when applying Theorem 8 to estimate the speed of convergence in the CLT for $Q_n$, the term $\sqrt{ \langle f \otimes f, g \otimes_2 g \rangle_{L^2([0,1]^4)} }$ cannot merely be bounded above via Schwarz's inequality. See Lemma 11 and its proof. This is the key element which allows us to obtain the Berry-Esséen speed $n^{-1/2}$ in the next theorem.
Theorem 13 With $Q_n$ defined in (13), under Assumption (9), we have for all $n \geqslant 1$,
\[
d_{TV}\left( \frac{\sqrt{n}(Q_n - E[Q_n])}{\sqrt{l_1 + l_2}},\ N(0,1) \right) \leqslant \frac{C_0}{\sqrt{n}}, \tag{32}
\]
where
\[
C_0 := \frac{4\sqrt{2}}{l_1 + l_2} \left( \frac{1}{2\sqrt{2}} (C_1 + C_2) + C_3 + 36\, C_{4,3} + 9\, C_5^{1/2} + 36\sqrt{3}\, C_{4,2} + 6\sqrt{3}\, C_3^{1/2} C_{4,3}^{1/2} + 12\sqrt{10}\, C_{4,1} \right),
\]
and where $l_1$, $l_2$ are defined in the previous section in (16), (19), and $C_1$, $C_2$, $C_3$, $C_{4,r}$, $r = 1, 2, 3$, and $C_5$ are given in the lemmas above, respectively in (15), (18), (26), (27) and (31). In particular $\sqrt{n}(Q_n - E[Q_n])$ is asymptotically Gaussian, namely
\[
\sqrt{n} \left( Q_n - E[Q_n] \right) \overset{law}{\longrightarrow} N(0,\ l_1 + l_2), \qquad n \to \infty.
\]

Proof. Based on the decomposition of $Q_n - E(Q_n)$ given in (14), we have
\[
\sqrt{n}(Q_n - E[Q_n]) = \sqrt{n}\, T_{2,n} + \sqrt{n}\, T_{4,n} = I_2^W(g_{2,n}) + I_4^W(g_{4,n}), \tag{33}
\]
where
\[
g_{2,n} := \frac{4}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes_1 f_i \qquad \text{and} \qquad g_{4,n} := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} f_i \otimes f_i.
\]
Applying Theorem 8 to $\sqrt{n}(Q_n - E[Q_n])$, we get
\begin{align}
d_{TV}&\left( \frac{\sqrt{n}(Q_n - E[Q_n])}{\sqrt{l_1 + l_2}},\ N(0,1) \right) \nonumber \\
&\leqslant \frac{4}{l_1 + l_2} \Big[ \sqrt{2}\, \| g_{2,n} \otimes_1 g_{2,n} \|_{L^2([0,1]^2)} + 36\sqrt{2}\, \| g_{4,n} \otimes_3 g_{4,n} \|_{L^2([0,1]^2)} \nonumber \\
&\qquad + 9\sqrt{2}\, \sqrt{ \langle g_{2,n} \otimes g_{2,n},\ g_{4,n} \otimes_2 g_{4,n} \rangle_{L^2([0,1]^4)} } + 18\sqrt{4!}\, \| g_{4,n} \otimes_2 g_{4,n} \|_{L^2([0,1]^4)} \nonumber \\
&\qquad + 3\sqrt{4!}\, \sqrt{ \| g_{2,n} \otimes_1 g_{2,n} \|_{L^2([0,1]^2)} \| g_{4,n} \otimes_3 g_{4,n} \|_{L^2([0,1]^2)} } + 2\sqrt{6!}\, \| g_{4,n} \otimes_1 g_{4,n} \|_{L^2([0,1]^6)} \Big] \nonumber \\
&\qquad + 2 \left| 1 - \frac{E\left[ \left( \sqrt{n}(Q_n - E[Q_n]) \right)^2 \right]}{l_1 + l_2} \right|. \tag{34}
\end{align}
The bound (32) is then a direct consequence of inequality (34) and the estimates given respectively in (26), (27), (31) and Theorem 6. $\square$
6 Application: estimation of the mean-reversion parameter

In this section, to illustrate the implications of Theorem 13 in parameter estimation in an easily tractable case, we consider that we have observations $Y_n$ coming from a specific version of our second-chaos AR(1) model (8), that which is driven by a chi-squared white noise with one degree of freedom:
\[
Y_n = a_0 + a_1 Y_{n-1} + \sigma \left( Z_n^2 - 1 \right), \qquad n \geqslant 1, \qquad Y_0 = y_0 \in \mathbb{R}, \tag{35}
\]
where $a_0$, $a_1$ and $\sigma$ are real constants, and $\{Z_n, n \geqslant 1\}$ are i.i.d. standard normal random variables. This is model (8) where all $\sigma_j$'s are zero except for the first one.
Proposition 14 The quadratic variation $Q_n$ defined in (13) for model (35) satisfies, for all $n \geqslant 1$,
\[
\left| E[Q_n] - \frac{2\sigma^2}{1 - a_1^2} \right| \leqslant \frac{C_6}{n}, \qquad \text{where } C_6 := \frac{2\sigma^2}{(1 - a_1^2)^2}.
\]

Proof. From the definition of $Q_n$ in (13), we have by the isometry property of multiple integrals
\begin{align*}
\left| E[Q_n] - \frac{2\sigma^2}{1 - a_1^2} \right| &= \left| \frac{2}{n} \sum_{i=1}^{n} \| f_i \|_{L^2([0,1]^2)}^2 - \frac{2\sigma^2}{1 - a_1^2} \right| = \left| \frac{2\sigma^2}{n} \sum_{i=1}^{n} \sum_{k=1}^{i} a_1^{2(i-k)} - \frac{2\sigma^2}{1 - a_1^2} \right| \\
&= \frac{2\sigma^2}{1 - a_1^2} \left| \frac{1}{n} \sum_{i=1}^{n} \left( 1 - a_1^{2i} \right) - 1 \right| = \frac{2\sigma^2}{1 - a_1^2} \frac{1}{n} \sum_{i=1}^{n} a_1^{2i} \leqslant \frac{2\sigma^2}{(1 - a_1^2)^2} \frac{1}{n}. \qquad \square
\end{align*}
Remark 15 Assuming that $\sigma$ is known, Proposition 14 shows that the quadratic variation $Q_n$ is an asymptotically unbiased estimator of $2\sigma^2/(1 - a_1^2)$, and thus, after a transformation, of $|a_1|$ as well:
\[
\lim_{n \to \infty} \sqrt{ 1 - \frac{2\sigma^2}{E[Q_n]} } = |a_1|.
\]
Therefore, using the fact that $E[Q_n]$ can be estimated via $Q_n$, we suggest the following moment estimator for the mean-reversion rate $|a_1|$:
\[
\hat{a}_n := f(Q_n), \tag{36}
\]
where
\[
f(x) := \sqrt{ 1 - \frac{2\sigma^2}{x} }, \qquad x > 2\sigma^2. \tag{37}
\]
Remark 16 Note that we chose to estimate the absolute value $|a_1|$ instead of $a_1$ because the quadratic variation only gives access to $a_1^2$. For an estimator that accesses the sign of $a_1$, we would presumably need to use a variation or another method of moments with an odd power. This is feasible, but it is beyond the scope of the paper.
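For concreteness, the transformation (36)-(37) is elementary to implement. The following hedged Python sketch (function names are ours) also checks that $f$ inverts $y \mapsto 2\sigma^2/(1-y^2)$, so the estimator recovers $|a_1|$ exactly at the limiting value of $Q_n$:

```python
import math

def a_hat(Q, sigma):
    """Moment estimator (36): a_hat = f(Q_n) with f(x) = sqrt(1 - 2*sigma^2/x),
    defined only for x > 2*sigma^2 (sigma assumed known)."""
    if Q <= 2 * sigma**2:
        raise ValueError("f is only defined for Q_n > 2*sigma^2")
    return math.sqrt(1.0 - 2.0 * sigma**2 / Q)

# Sanity check: at mu = 2*sigma^2/(1 - a1^2), the limit of Q_n, the estimator
# returns |a1| exactly, since f is the inverse of y -> 2*sigma^2/(1 - y^2).
for a1 in (0.1, 0.5, 0.7):
    mu = 2.0 / (1 - a1**2)   # sigma = 1
    assert abs(a_hat(mu, 1.0) - a1) < 1e-12
```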
6.1 Properties of the estimator $\hat{a}_n$

Proposition 17 The estimator $\hat{a}_n$ of the mean-reversion parameter $|a_1|$ defined in (36) is strongly consistent, namely almost surely
\[
\lim_{n \to \infty} \hat{a}_n = |a_1|.
\]

Proof. We write $Q_n = \frac{V_n}{\sqrt{n}} + E[Q_n]$, with $V_n = \sqrt{n}(Q_n - E[Q_n])$. According to Theorem 6, $E[V_n^2] \to l_1 + l_2$ as $n \to \infty$. Hence, there exists a constant $C > 0$ such that for every $n \geqslant 1$,
\[
\left( E\left[ \left( \frac{V_n}{\sqrt{n}} \right)^2 \right] \right)^{1/2} \leqslant \frac{C}{\sqrt{n}}.
\]
Hence, by Lemma 1, we have almost surely $\frac{V_n}{\sqrt{n}} \to 0$ as $n \to \infty$. On the other hand, by Proposition 14, $E[Q_n] \to \frac{2\sigma^2}{1 - a_1^2}$ as $n \to \infty$. Thus $Q_n \to \frac{2\sigma^2}{1 - a_1^2}$ almost surely as $n \to \infty$; by continuity of $f$ at this limit, $\hat{a}_n = f(Q_n) \to |a_1|$ almost surely, as announced. $\square$
Proposition 18 Under Assumption (9), the estimator $\hat{a}_n$ defined in (36) satisfies
\[
d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1 + l_2}} \left( \hat{a}_n - |a_1| \right),\ N(0,1) \right) \lesssim n^{-1/2},
\]
where $l_1$ and $l_2$ are given in Propositions 3 and 4 respectively and $\mu = \frac{2\sigma^2}{1 - a_1^2}$. In particular $\hat{a}_n$ is asymptotically Gaussian; more precisely we have as $n \to +\infty$,
\[
\sqrt{n} \left( \hat{a}_n - |a_1| \right) \longrightarrow N\left( 0,\ f'(\mu)^2 \times (l_1 + l_2) \right),
\]
where
\[
f'(\mu)^2 \times (l_1 + l_2) = \frac{(1 - a_1^2)(5 - 4 a_1^2)}{2 a_1^2}.
\]

Proof. For $x > 2\sigma^2$, the function $f$ defined in (37) has an inverse $f^{-1}(y) = \frac{2\sigma^2}{1 - y^2}$. Let $\mu := f^{-1}(|a_1|) = \frac{2\sigma^2}{1 - a_1^2} > 2\sigma^2$, for all $a_1 \in (-1,1)$. On the other hand, we can write
\[
\frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - \mu) = \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - E[Q_n]) + \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (E[Q_n] - \mu).
\]
Therefore, from the properties of the Wasserstein metric and denoting $N \sim N(0,1)$, we get
\begin{align}
d_W\left( \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - \mu),\ N \right) &\leqslant \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} \left| E[Q_n] - \mu \right| + d_W\left( \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - E[Q_n]),\ N \right) \nonumber \\
&\leqslant \frac{C_6\, n^{-1/2} + C_0\, n^{-1/2}}{\sqrt{l_1 + l_2}} \lesssim n^{-1/2}, \tag{38}
\end{align}
where we used the bound (32) and Proposition 14 for the above bounds. On the other hand, since the function $f$ defined in (37) is a diffeomorphism and since $\hat{a}_n = f(Q_n)$, then by the mean-value theorem there exists a random variable $\xi_n \in [|Q_n, \mu|]$ such that
\[
\hat{a}_n - |a_1| = f'(\xi_n)\, (Q_n - \mu),
\]
where the notation $[|a,b|]$ for two reals $a, b$ means $[a,b]$ if $a \leqslant b$ and $[b,a]$ if $b \leqslant a$. We have
\begin{align*}
d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1 + l_2}} (\hat{a}_n - |a_1|),\ N(0,1) \right) &\leqslant d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1 + l_2}} (\hat{a}_n - |a_1|),\ \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - \mu) \right) \\
&\qquad + d_W\left( \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - \mu),\ N(0,1) \right).
\end{align*}
According to (38), the last term is bounded by the speed $n^{-1/2}$. Moreover, applying the mean-value theorem again, since $f$ is twice continuously differentiable, there exists a random variable $\delta_n \in [|\xi_n, \mu|] \subset [|Q_n, \mu|]$ such that
\begin{align}
d_W&\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1 + l_2}} (\hat{a}_n - |a_1|),\ \frac{\sqrt{n}}{\sqrt{l_1 + l_2}} (Q_n - \mu) \right) \nonumber \\
&\leqslant \frac{\sqrt{n}}{|f'(\mu)|\sqrt{l_1 + l_2}}\, E\left[ \left| (Q_n - \mu)\left( f'(\xi_n) - f'(\mu) \right) \right| \right] = \frac{\sqrt{n}}{|f'(\mu)|\sqrt{l_1 + l_2}}\, E\left[ \left| (Q_n - \mu)\, f''(\delta_n)\, (\xi_n - \mu) \right| \right] \nonumber \\
&\leqslant \frac{\sqrt{n}}{|f'(\mu)|\sqrt{l_1 + l_2}}\, E\left[ (Q_n - \mu)^2 \left| f''(\delta_n) \right| \right] \leqslant \frac{\sqrt{n}}{|f'(\mu)|\sqrt{l_1 + l_2}} \left( E\left[ (Q_n - \mu)^{2p} \right] \right)^{1/p} \left( E\left[ \left| f''(\delta_n) \right|^{p'} \right] \right)^{1/p'}, \tag{39}
\end{align}
where we used Hölder's inequality with $p, p'$ two reals greater than 1 such that $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, by the hypercontractivity property for multiple integrals (5), there exists a constant $C(p)$ such that
\begin{align}
\sqrt{n} \left( E\left[ (Q_n - \mu)^{2p} \right] \right)^{1/p} &\leqslant C(p)\, \sqrt{n}\, E\left[ (Q_n - \mu)^2 \right] \nonumber \\
&\leqslant 2 C(p)\, \sqrt{n}\, E\left[ (Q_n - E[Q_n])^2 \right] + 2 C(p)\, \sqrt{n}\, \left( E[Q_n] - \mu \right)^2 \nonumber \\
&\leqslant C\, n^{-1/2} + C\, n^{-3/2} \lesssim n^{-1/2}, \tag{40}
\end{align}
where we used the inequality $(a+b)^2 \leqslant 2a^2 + 2b^2$ for all $a, b \in \mathbb{R}$, and the bounds of Proposition 14 and Theorem 6 respectively. On the other hand, for all $x > 2\sigma^2$,
\[
f''(x) = -\frac{2\sigma^2}{x^3} \left( 1 - \frac{3\sigma^2}{2x} \right) \left( 1 - \frac{2\sigma^2}{x} \right)^{-3/2} < 0.
\]
Therefore, using (39) and (40), to obtain a bound for the term $d_W\left( \frac{\sqrt{n}}{f'(\mu)\sqrt{l_1+l_2}} (\hat{a}_n - |a_1|),\ \frac{\sqrt{n}}{\sqrt{l_1+l_2}} (Q_n - \mu) \right)$, it remains to show that $E\left[ |f''(\delta_n)|^{p'} \right]$ is finite for some $p' > 1$; but, using the fact that $\delta_n \in [|Q_n, \mu|]$ and the monotonicity of $|f''|$, it is actually sufficient to show that for some $p' > 1$, we have
\[
\sup_{n \geqslant 1} E\left[ \left| f''(Q_n) \right|^{p'} \right] = \sup_{n \geqslant 1} E\left[ \left| 2\sigma^2 Q_n - 3\sigma^4 \right|^{p'} |Q_n|^{-5p'/2} \left| Q_n - 2\sigma^2 \right|^{-3p'/2} \right] < \infty.
\]
The function $|f''|$ has two singularities, at $0$ and at $2\sigma^2$, and thus is not bounded. But we can write
\[
E\left[ |f''(Q_n)|^{p'} \right] = E\left[ |f''(Q_n)|^{p'} \mathbf{1}_{\{ |Q_n - \mu| \geqslant \frac{1}{\sqrt{n}} \}} \right] + E\left[ |f''(Q_n)|^{p'} \mathbf{1}_{\{ |Q_n - \mu| < \frac{1}{\sqrt{n}} \}} \right].
\]
For the term $E\left[ |f''(Q_n)|^{p'} \mathbf{1}_{\{ |Q_n - \mu| < \frac{1}{\sqrt{n}} \}} \right]$, and since $\mu > 2\sigma^2$, we can pick $n$ such that $\frac{1}{\sqrt{n}} < \sigma^2$. Then $Q_n > 2\sigma^2 - \sigma^2 > 0$, so $Q_n$ is bounded away from $0$ and the term $|Q_n|^{-5p'/2}$ has no singularity for any $p' > 1$. For the term $|Q_n - 2\sigma^2|^{-3p'/2}$, we put $C := \frac{\mu - 2\sigma^2}{2}$; the constant $C \neq 0$ because $\mu \neq 2\sigma^2 \Leftrightarrow a_1 \neq 0$, and we can assume $a_1 \neq 0$ because there is no AR(1) process with $a_1 = 0$. Therefore we can pick $n$ such that $\frac{1}{\sqrt{n}} < C$. In this case $Q_n - 2\sigma^2 = Q_n - \mu + 2C > -\frac{1}{\sqrt{n}} + 2C > 2C - C > 0$, hence the term $|Q_n - 2\sigma^2|^{-3p'/2}$ has no singularity at $2\sigma^2$ for any $p' > 1$. In conclusion, to avoid the singularities at both $0$ and $2\sigma^2$, it is sufficient to pick $n$ such that
\[
\frac{1}{\sqrt{n}} < \sigma^2 \wedge C = \begin{cases} \sigma^2 & \text{if } |a_1| \geqslant \frac{1}{\sqrt{2}}, \\ C & \text{if } |a_1| \leqslant \frac{1}{\sqrt{2}}. \end{cases}
\]
For the other term, by the asymptotic normality of $\sqrt{n}(Q_n - \mu)$ and using a similar argument as in Section 5.2.2 of [9], there exists $p' > 1$ such that $E\left[ |f''(Q_n)|^{p'} \mathbf{1}_{\{ |Q_n - \mu| \geqslant \frac{1}{\sqrt{n}} \}} \right] < +\infty$, which gives the desired result. $\square$
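The closed form $(1-a_1^2)(5-4a_1^2)/(2a_1^2)$ for the limiting variance follows from $f'(\mu) = \sigma^2/(\mu^2 |a_1|)$ together with Theorem 6's expression for $l_1 + l_2$; note that $\sigma$ cancels. A deterministic Python check of this identity (our own sketch, not from the paper):

```python
def limit_variance_direct(a1, sigma):
    # f'(mu)^2 * (l1 + l2), with f(x) = sqrt(1 - 2*sigma^2/x), mu = 2*sigma^2/(1-a1^2),
    # and l1 + l2 as in Theorem 6 with the single nonzero sigma_1 = sigma.
    mu = 2 * sigma**2 / (1 - a1**2)
    fprime = sigma**2 / (mu**2 * abs(a1))   # f'(mu) = sigma^2 / (mu^2 * |a1|)
    l1l2 = (36 * sigma**4 / (1 - a1**2) ** 2
            + 4 * (1 + a1**2) * sigma**4 / (1 - a1**2) ** 3)
    return fprime**2 * l1l2

def limit_variance_closed_form(a1):
    # Closed form from Proposition 18 (independent of sigma)
    return (1 - a1**2) * (5 - 4 * a1**2) / (2 * a1**2)

for a1 in (0.2, 0.5, 0.9):
    assert abs(limit_variance_direct(a1, 1.3) - limit_variance_closed_form(a1)) < 1e-9
```

The cancellation of $\sigma$ is visible numerically: varying the second argument of `limit_variance_direct` leaves its value unchanged.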
6.2 Numerical Results

The table below reports the mean and standard deviation of the proposed estimator $\hat{a}_n$ defined in (36) of the true value of the mean-reversion parameter $|a_1|$.

            |a1| = 0.10         |a1| = 0.30         |a1| = 0.50         |a1| = 0.70
            Mean     Std dev    Mean     Std dev    Mean     Std dev    Mean     Std dev
n = 3000    0.2178   0.0901     0.2887   0.0969     0.4946   0.0520     0.6962   0.0234
n = 5000    0.1878   0.0866     0.2905   0.0616     0.4966   0.0413     0.6978   0.0270
n = 10000   0.1630   0.0692     0.2928   0.0852     0.4974   0.0315     0.6987   0.0215

We simulate the values of the estimator $\hat{a}_n$ from the quadratic variation $Q_n$ for different sample sizes $n$ and for fixed $\sigma$ chosen to be equal to 1. For each sample size $n$, the mean and the standard deviation are obtained from 500 replications. The table above confirms that the estimator is strongly consistent even for small values of $n$, and has small standard deviations for the different true values of $|a_1|$. Moreover, the estimator $\hat{a}_n$ is more efficient for values of $|a_1|$ greater than $0.5$; this could presumably be explained by the fact that the asymptotic variance $\frac{(1 - a_1^2)(5 - 4a_1^2)}{2a_1^2}$ of the limiting law of $|\hat{a}_n|$ is high for small values of $|a_1|$ and small for values of $|a_1|$ close to 1. Therefore, $\hat{a}_n$ is more accurate as an estimator when $|a_1|$ is closer to 1, e.g. greater than 0.5, as can be seen in the figure below.
[Figure 2: Asymptotic variance for values of $|a_1|$ between 0.09 and 0.99]

To investigate the asymptotic distribution of $\hat{a}_n$ empirically, we need to compare the distribution of the following statistic
\[
\varphi(n, a_1) := \sqrt{ \frac{2 a_1^2}{(1 - a_1^2)(5 - 4 a_1^2)} }\ \sqrt{n} \left( \hat{a}_n - |a_1| \right) \tag{41}
\]
with the standard normal distribution $N(0,1)$. For this aim, for parameter choices $n = 3000$, $\sigma = 1$, $|a_1| = 0.5$, and based on 3000 replications, we obtained the following histogram:

[Figure 3: Histogram of $\varphi(n, a_1)$ with $n = 3000$, $|a_1| = 0.5$, $\sigma = 1$, 3000 replications.]
This Figure 3 shows that the normal approximation of the distribution of the statistic $\varphi(n, a_1)$ is reasonable even if the sampling size $n$ is not very large. The table below compares statistics of $\varphi(n, a_1)$ and $N(0,1)$ based on 3000 replications, with $n = 3000$ and $\sigma = 1$. The empirical mean, median and standard deviation of $\varphi(n, a_1)$ match those of $N(0,1)$ very closely, corroborating our theoretical results.

Statistics     Mean        Median      Standard Deviation
N(0,1)         0           0           1
φ(n, a1)       0.0048515   0.0020649   1.0004691
φ(n, a1 ) converges φ(n, a1 ) and N (0, 1).
We can check more precisely how fast is the statistic We chose to compute the Kolmogorov distance between
in law to
N (0,1).
For this aim, we
Jou
approximate the cumulative distribution function using empirical cumulation distribution function based on 500 replications of the computation of
φ(n, a1 )
for
n = 3000.
empirical and standard normal cumulative distribution functions.
35
The next gure shows the
repro of
Journal Pre-proof
The Kolmogorov distance between the two laws, which equals the sup norm of the difference of these cumulative distribution functions, computes to approximately 0.052. On the other hand, we have (see for example Theorem 3.3 of [4] for a proof)

dKol(φ(n, a1), N(0,1)) ≤ 2 √(dW(φ(n, a1), N(0,1))),    (42)

so the distance on the left-hand side should be bounded above by approximately 2 × 3000^(-1/4) = 0.27 times any constant coming from the upper bound in Proposition 18. This is five times larger than our estimate 0.052 of the actual Kolmogorov distance, a reassuring practical confirmation of Proposition 18, and of our underlying results on normal asymptotics of 2nd-chaos AR(1) quadratic variations. If that proposition's upper bound with its rate n^(-1/2) applied directly to the Kolmogorov distance, as is known to be the case for the Berry-Esséen theorem in the classical CLT, the value 0.052 should be compared to 3000^(-1/2) = 0.018 approximately, which is arguably of the same order of magnitude. This is a motivation to investigate whether the so-called delta method, which we used here to prove Proposition 18 under the Wasserstein distance, could also apply to the total variation distance, since the latter is known to be an upper bound on the Kolmogorov distance without the need for the square root as in the comparison (42).
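The numbers in this comparison are easy to reproduce; in the sketch below, the unknown constant from Proposition 18 is set to 1 purely for illustration:

```python
# Numerical check of the comparison around (42), at n = 3000, with
# the (unknown) constant from Proposition 18 set to 1 for illustration.
n = 3000
wasserstein_rate = n ** -0.5                 # rate from Proposition 18
kolmogorov_bound = 2 * wasserstein_rate ** 0.5   # i.e. 2 * n**(-1/4)

observed = 0.052   # estimated Kolmogorov distance from the simulation

print(kolmogorov_bound)             # bound implied by (42)
print(kolmogorov_bound / observed)  # roughly five times the estimate
print(n ** -0.5)                    # hypothetical direct n**(-1/2) rate
```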
References
[1] Balakrishna, N. and Shiji, K. (2014). Extreme Value Autoregressive Model and Its Applications,
Journal of Statistical Theory and Practice, 8 (3), 460-481.
[2] Barboza, L.A. and Viens, F. (2017). Parameter estimation of Gaussian stationary processes using the generalized method of moments.
Electron. J. Statist. 11 (1), 401-439.
[3] Beneš, V.E., Shepp, L.A., and Witsenhausen, H.S. (1980). Some solvable stochastic control problems. Stochastics: An International Journal of Probability and Stochastic Processes 4, 39-83.
[4] Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2011). Normal Approximation by Stein's Method. Berlin: Springer-Verlag.
[5] Cumberland, W.G. and Sykes, Z.M. (1982). Weak Convergence of an Autoregressive Process Used in Modeling Population Growth.
Journal of Applied Probability, 19 (2), 450-455.
[6] Douissi, S., Es-Sebaiy, K., Viens, F. (2019). Berry-Esséen bounds for parameter estimation of general Gaussian processes.
ALEA, Lat. Am. J. Probab. Math. Stat., 16, 633-664.
[7] El Onsy, B., Es-Sebaiy, K. and Viens, F. (2017). Parameter estimation for Ornstein-Uhlenbeck processes driven by fractional Ornstein-Uhlenbeck processes.
Stochastics, 89 (2), 431-468.
[8] Ernst, Ph., Brown, L., Shepp, L., Wolpert, R. (2017). Stationary Gaussian Markov processes as limits of stationary autoregressive time series.
Journal of Multivariate Analysis, 155, 180-186.
[9] Es-Sebaiy, K. and Viens, F. (2019). Optimal rates for parameter estimation of stationary Gaussian processes.
Stochastic Processes and their Applications, 129 (9), 3018-3054.
[10] Grunwald, G.K., Hyndman, R.J., and Tedesco, L.M. (1996). A unified view of linear AR(1) models, preprint.
[11] Hürlimann, W. (2012). On Non-Gaussian AR(1) Inflation Modeling. Journal of Statistical and Econometric Methods, 1 (1), 93-109.
[12] Hu, Y. and Nualart, D. (2010). Parameter estimation for fractional Ornstein-Uhlenbeck processes.
Statist. Probab. Lett. 80, 1030-1038.
[13] Kleptsyna, M. and Le Breton, A. (2002). Statistical analysis of the fractional Ornstein-Uhlenbeck type process.
Statist. Infer. Stoch. Proc. 5, 229-241.
[14] Kloeden, P. and Neuenkirch, A. (2007). The pathwise convergence of approximation schemes for stochastic differential equations.
LMS J. Comp. Math. 10, 235-253.
[15] Liptser, R. S. and Shiryaev, A. N. (2001).
Statistics of Random Processes: II. Applications,
Springer-Verlag, New York.
[16] Maddison, C.J., Tarlow, D., Minka, T. (2015).
A* Sampling,
Preprint, available online at
https://arxiv.org/abs/1411.0030v2 . [17] Nakajima, J., Kunihama, T., Omori, Y., and Frühwirth-Schnatter, S. (2012). Generalized extreme value distribution with time-dependence using the AR and MA models in state space form.
Computational Statistics & Data Analysis, 56 (11), 3241-3259.
[18] Neufcourt, L. and Viens, F. (2016). A third-moment theorem and precise asymptotics for variations of stationary Gaussian sequences. ALEA, Lat. Am. J. Probab. Math. Stat., 13 (1), 239-264.
[19] Nourdin, I. and Peccati, G. (2015). The optimal fourth moment theorem. Proc. Amer. Math. Soc. 143, 3123-3133.
[20] Nourdin, I. and Peccati, G. (2012). Normal approximations with Malliavin calculus: from Stein's method to universality. Cambridge Tracts in Mathematics 192. Cambridge University Press, Cambridge.
[21] Nourdin, I., Peccati, G., and Yang, X. (2018). Berry-Esseen bounds in the Breuer-Major CLT and Gebelein's inequality. Preprint https://arxiv.org/abs/1812.08838.
[22] Novikov, A.A. (1971). Sequential estimation of the parameters of diffusion type processes. Teor. Veroyatn. Primen., 16 (2), 394-396.
[23] Nualart, D. (2006). The Malliavin calculus and related topics. Springer-Verlag, Berlin.
[24] Nualart, D. and Zhou, H. (2018). Total variation estimates in the Breuer-Major theorem. Preprint arXiv:1807.09707.
[25] Schoenberg, I.J. (1951). On Pólya frequency functions. Journal d'Analyse Mathématique, 1, 331-374.
[26] Shepp, L.A. (1969). Explicit solutions to some problems of optimal stopping. The Annals of Mathematical Statistics 40, 993-1010.
[27] Shepp, L.A. and Shiryaev, A.N. (1993). The Russian option: reduced regret. The Annals of Applied Probability 3, 631-640.
[28] Shepp, L.A. and Vardi, Y. (1982) Maximum likelihood reconstruction for emission tomography.
IEEE Transactions on Medical Imaging, 1, 113-122.
[29] Tanaka, K. (2017). Time Series Analysis: Nonstationary and Noninvertible Distribution Theory, volume 16. Wiley, New York.
[30] Toulemonde, G., Guillou, A., Naveau, P., Vrac, M. and Chevallier, F. (2010). Autoregressive models for maxima and their applications to CH4 and N2O. Environmetrics, 21 (2), 189-207.
[31] Tudor, C.A. (2008). Analysis of the Rosenblatt process. ESAIM: Probability and Statistics, 12, 230-257.
[32] Tudor, C. and Viens, F. (2007). Statistical aspects of the fractional stochastic calculus. Ann. Statist. 35, no. 3, 1183-1212.
[33] Üstünel, A. S. (1995). An Introduction to Analysis on Wiener Space. Lecture Notes in Math. 1610.
Springer, Berlin.
[34] Viitasaari, L. (2016). Representation of stationary and stationary increment processes via Langevin equation and self-similar processes. Statistics and Probability Letters, 115, 45-53.
[35] Voutilainen, M., Viitasaari, L., and Ilmonen, P. (2017). On model fitting and estimation of strictly stationary processes. Modern Stochastics: Theory and Applications, 4 (4), 381-406.
[36] Voutilainen, M., Viitasaari, L., and Ilmonen, P. (2019). Note on AR(1) characterization of stationary processes and model fitting. Modern Stochastics: Theory and Applications, 6 (2), 195-207.
[37] Yan, Y. and Genton, M. G. (2019). Non-Gaussian autoregressive processes with Tukey g-and-h transformations. Environmetrics 30 (2), e2503.