A bound on the Wasserstein-2 distance between linear combinations of independent random variables


Accepted Manuscript A bound on the Wasserstein-2 distance between linear combinations of independent random variables Benjamin Arras, Ehsan Azmoodeh, Guillaume Poly, Yvik Swan

PII: S0304-4149(18)30325-9
DOI: https://doi.org/10.1016/j.spa.2018.07.009
Reference: SPA 3350

To appear in: Stochastic Processes and their Applications

Received date: 11 September 2017
Revised date: 7 June 2018
Accepted date: 4 July 2018

Please cite this article as: B. Arras, E. Azmoodeh, G. Poly, Y. Swan, A bound on the Wasserstein-2 distance between linear combinations of independent random variables, Stochastic Processes and their Applications (2018), https://doi.org/10.1016/j.spa.2018.07.009

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Abstract

We provide a bound on a distance between finitely supported elements and general elements of the unit sphere of $\ell^2(\mathbb{N}^*)$. We use this bound to estimate the Wasserstein-2 distance between random variables represented by linear combinations of independent random variables. Our results are expressed in terms of a discrepancy measure related to Nourdin-Peccati's Malliavin-Stein method. The main application is the computation of quantitative rates of convergence to elements of the second Wiener chaos. In particular, we make these rates explicit for non-central asymptotics of sequences of quadratic forms and for the behavior of the generalized Rosenblatt process at extreme critical exponent.

Keywords: Second Wiener chaos, variance-gamma distribution, Wasserstein-2 distance, Malliavin Calculus, Stein discrepancy

MSC 2010: 60F05, 60G50, 60G15, 60H07

Contents

1 Introduction
2 Wasserstein-2 distance between linear combinations
  2.1 A general result on Hilbert spaces
  2.2 A probabilistic interpretation
  2.3 Specializing to the second Wiener chaos
  2.4 A lower bound on the Wasserstein-2 distance in the second Wiener chaos
  2.5 Comparison with the Malliavin-Stein method for the variance-Gamma
3 Applications
  3.1 An example from U-statistics
  3.2 Application to some quadratic forms
  3.3 The generalized Rosenblatt process at extreme critical exponent
References

1 Introduction

In this paper, we provide bounds on the Wasserstein-2 distance (see Definition 1.1) $W_2(F_n, F_\infty)$ between random variables $F_n$ and a target $F_\infty$ which satisfy the following assumption.

Assumption: There exist $q$ non-zero real numbers $\{\alpha_{\infty,k}\}_{1\le k\le q}$ as well as a sequence $(\alpha_n)_{n\ge 1}$ of elements of $\mathbb{R}^{\mathbb{N}}$ such that $\sum_{k=1}^{q}\alpha_{\infty,k}^2 = \sum_{k}\alpha_{n,k}^2 = 1$ for all $n \ge 1$ and

\[ F_n = \sum_{k\ge 1} \alpha_{n,k} W_k \ \text{ for all } n \ge 1 \quad\text{and}\quad F_\infty := \sum_{k=1}^{q} \alpha_{\infty,k} W_k \tag{1.1} \]

where $(W_k)_{k\ge 1}$ is a sequence of i.i.d. random variables with mean 0, variance 1, finite moments up to order $2q+2$ and non-zero $r$th cumulant for all $r = 2, \dots, 2q+2$.

In light of the structure imposed by our Assumption it seems intuitively evident that $W_2(F_n, F_\infty)$ ought to be governed solely by the convergence rate of the approximating sequence of coefficients $\{\alpha_{n,k}\}_{n,k\ge 1}$ towards $\{\alpha_{\infty,k}\}_{1\le k\le q}$. The main difficulty is to identify the correct measure of discrepancy for this convergence and, following [2], we consider the quantity

\[ \Delta(F_n, F_\infty) = \sum_{k\ge 1} \alpha_{n,k}^2 \prod_{r=1}^{q} (\alpha_{n,k} - \alpha_{\infty,r})^2, \qquad n \ge 1. \tag{1.2} \]

The main theoretical contribution of the paper is Theorem 2.4, where we prove, in essence, that under technical conditions on the limiting coefficients we have the bound

\[ W_2(F_n, F_\infty) \le C \sqrt{\Delta(F_n, F_\infty)}, \tag{1.3} \]

with $C > 0$ only depending on $F_\infty$. We comment briefly on the general strategy we adopt in order to obtain a bound such as (1.3). Due to the structure imposed by our Assumption, it is natural to bound the Wasserstein-2 distance by a quantity based on re-indexing couplings. This leads us to consider a tailor-made quantity $d_\sigma$ (see (2.1)) in a purely Hilbertian context. Then, based on a careful analysis of minimization problems associated with $d_\sigma$, we are able to identify bounding quantities which depend polynomially on the coordinates of the sequences we want to compare (see Theorems 2.1 and 2.3). Recasting these quantities in a probabilistic setting, we are able to link them to the cumulants of the random variables $F_n$ and $F_\infty$ and finally to obtain our main result.

The most important application of a bound such as (1.3) is that it provides quantitative rates of convergence towards elements of the second Wiener chaos. Indeed it is a classical result that all such random variables can be written as a linear combination of centered chi-squared random variables, i.e. satisfy (1.1) for $W_k = (Z_k^2 - 1)/\sqrt{2}$ and $\{Z_k\}_{k\ge 1}$ i.i.d. standard normal random variables. In Section 2.3, we specialize our general bounds to this setting and obtain the first rates of convergence in Wasserstein-2 distance of sequences of elements belonging to the second Wiener chaos, thereby complementing recent contributions [29, 2] (see also [20], whose results are posterior to a first version of this paper).

Close in spirit to our results, let us also mention the two recent articles [8] and [4]. First, in [8], the authors obtained general qualitative weak convergence results for sequences of critical polynomial chaoses (of fixed order $k$) defined thanks to an underlying array of independent random variables with zero mean, unit variance and satisfying some uniform integrability condition. In particular, the limiting random variable is represented by a sum of Gaussian chaotic random variables for all orders $1 \le m \le k$. Their convergence result is then used in the asymptotic study of the partition function of some pinning model. In [4], the authors obtained quantitative estimates in total variation distance between the laws of random variables that may be written as general polynomial mappings of sequences of independent random variables satisfying the Doeblin property. This property roughly asserts that some distribution "dominates" the Lebesgue measure on some open set. We stress that the philosophy of their approach is rather different, since their estimates involve the so-called influence factor while our results exploit the very specific structure provided by the second Wiener chaos to express the bounds in terms of cumulants.

Moreover, in Section 2.4, we obtain a general lower bound on the Wasserstein-2 distance between elements in the second Wiener chaos using the quantity $\Delta(F_n, F_\infty)$. The rate exponent for this lower bound is 1, so that the optimality of our upper bound remains open. More importantly, these results emphasize the fact that the quantity $\Delta(F_n, F_\infty)$ is the right one to study quantitative convergence results in Wasserstein-2 distance on the second


Wiener chaos. Since the intersection between the second Wiener chaos and the class of variance-gamma distributed random variables is not empty, it is also relevant to detail our bounds in these cases. We do this in Section 2.5; this allows direct comparison with [11], where a similar setting was tackled by entirely different means.

Finally, in Section 3, we apply our bounds to three illustrative and relevant examples. First we consider chi-squared approximation for second order U-statistics. We obtain, among other results, the bound

\[ W_2\big(nU_n, a(Z_1^2 - 1)\big) = O\Big(\frac{1}{\sqrt{n}}\Big), \]

for $U_n$ a second order U-statistic which has a degeneracy of order 1 (see Section 3.1). Next, we consider the problem of obtaining quantitative asymptotic results for sequences of quadratic forms. Letting

\[ \tilde{Q}_\infty = \sum_{m=1}^{q} \tilde{\lambda}_m (Z_m^2 - 1), \qquad \tilde{Q}_n(Z) = \sum_{i,j=1}^{n} \tilde{a}_{i,j}(n) Z_i Z_j, \qquad n \ge 1, \]

with $(Z_i)_{i\ge 1}$ a sequence of independent standard normal random variables and with $(\tilde{a}_{i,j}(n))_{1\le i,j\le n}$ and $(\tilde{\lambda}_m)_{1\le m\le q}$ satisfying appropriate conditions, we deduce a general bound for $W_2(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty)$. In particular, for specific instances of the $n \times n$ real-valued symmetric matrix $(\tilde{a}_{i,j}(n))_{1\le i,j\le n}$, we obtain explicit rates of convergence

\[ W_2\big(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty\big) = O\Big(\frac{1}{n^{\alpha/2}}\Big) \]

where $\alpha \in (0, 1]$ (see Section 3.2, Corollary 3.2). Moreover, combining Corollary 3.2 and an invariance principle in Kolmogorov distance (Corollary 3.3), we are able to derive a quantitative universality type result for quadratic forms defined by, for all $n \ge 1$,

\[ \tilde{Q}_n(X) = \sum_{i,j=1}^{n} \tilde{a}_{i,j}(n) X_i X_j \]

with $(X_i)_{i\ge 1}$ a sequence of i.i.d. centered random variables with unit variance and finite fourth moment (see Theorem 3.1). Finally, inspired by [5], we consider the generalized Rosenblatt process at extreme critical exponent. Letting

\[ Z_{\gamma_1,\gamma_2} = \int_{\mathbb{R}^2} \Big( \int_0^1 (s - x_1)_+^{\gamma_1} (s - x_2)_+^{\gamma_2} \, ds \Big) dB_{x_1} dB_{x_2}, \]

with $\gamma_i \in (-1, -1/2)$ and $\gamma_1 + \gamma_2 > -3/2$ and

\[ Y_\rho = \frac{a_\rho}{\sqrt{2}} (Z_1^2 - 1) + \frac{b_\rho}{\sqrt{2}} (Z_2^2 - 1), \qquad 0 < \rho < 1, \]

we prove that

\[ W_2(Z_{\gamma_1,\gamma_2}, Y_\rho) \le C_\rho \sqrt{-\gamma_1 - \frac{1}{2}}, \]

(see Lemma 3.2).

In order to understand the significance of our general bounds and also to contextualize the crucial quantity $\Delta(F_n, F_\infty)$, it is necessary at this stage to make a short digression into Malliavin-Stein (a.k.a. Nourdin-Peccati) analysis. Let $F_\infty$ be standard Gaussian and consider a sequence of normalized random variables $(F_n)_{n\ge 1}$ with sufficiently regular density with respect to the Lebesgue measure. The Stein kernel of $F_n$ is the random variable $\tau_n(F_n)$ uniquely defined through the probabilistic integration by parts formula

\[ \mathbb{E}[\tau_n(F_n)\phi'(F_n)] = \mathbb{E}[F_n \phi(F_n)], \qquad n \ge 1 \tag{1.4} \]

which is supposed to hold for all smooth test functions $\phi : \mathbb{R} \to \mathbb{R}$. The classical Stein identity, according to which $\mathbb{E}[\phi'(F_\infty)] = \mathbb{E}[F_\infty \phi(F_\infty)]$ for all smooth $\phi$, implies in particular that the standard Gaussian distribution has a Stein kernel which is constant and equal to 1. Hence

\[ S(F_n, F_\infty) := \mathbb{E}\big[(\tau_n(F_n) - 1)^2\big] = \mathbb{E}\big[\tau_n(F_n)^2\big] - 1 \tag{1.5} \]

necessarily captures some aspect of non-Gaussianity of $F_n$. As it turns out this quantity, called the Stein (kernel) discrepancy, plays a crucial role in Gaussian analysis. In particular, it has long been known that $S(F_n, F_\infty)$ measures non-Gaussianity quite precisely. First, see e.g. [35, Lesson VI] or [6, 19, 7], it is equal to zero if and only if $\mathcal{L}(F_n) = \mathcal{L}(F_\infty)$ (equality in distribution). Second, Stein's method also implies that $S(F_n, F_\infty)$ metrizes convergence in distribution, in the sense that

\[ d_{\mathcal{H}}(F_n, F_\infty) = \sup_{h\in\mathcal{H}} \mathbb{E}\,|h(F_n) - h(F_\infty)| \le \kappa_{\mathcal{H}} \sqrt{S(F_n, F_\infty)} \tag{1.6} \]

for $\mathcal{H}$ any class of sufficiently regular test functions and $\kappa_{\mathcal{H}}$ a finite constant depending only on $\mathcal{H}$; see [26, Chapter 3] or [22] for more detail. The breakthrough from [27] is the discovery that $S(F_n, F_\infty)$ is the linchpin of the entire theory of "fourth moment theorems" ensuing from the seminal paper [30]. More precisely, Nourdin and Peccati were the first to realize that the integration by parts formula for Malliavin calculus could be used to prove

\[ S(F_n, F_\infty) \le \frac{q-1}{3q} \big( \mathbb{E}[F_n^4] - 3 \big) \tag{1.7} \]

whenever $F_n$ is an element of the $q$th Wiener chaos. Combining (1.7) and (1.6) thus provides quantitative fourth moment theorems for chaotic random variables in integral probability metrics including Total Variation, Kolmogorov and Wasserstein-1. We refer to [27] and to the monograph [26] for a detailed account; see also [28] for an optimal-order bound (without a square root), and [21, 3] for a Markovian spectral point of view.

Stein kernels are not inherently Gaussian objects and are well identified and tractable for a wide family of target distributions, see e.g. [35, Lesson VI]. It is therefore not unreasonable to study, for $F_\infty$ having kernel $\tau_\infty(F_\infty)$ and satisfying general assumptions, the kernel discrepancy $S(F_n, F_\infty) := \mathbb{E}\big[(\tau_n(F_n) - \tau_\infty(F_\infty))^2\big]$ in order to reap the corresponding estimates from (1.6). This plan was already carried out in [27] for $F_\infty$ a centered gamma random variable and pursued in [10] and [36] for targets $F_\infty$ which were invariant distributions of diffusions.

Many useful target distributions do not, however, bear a tractable Stein kernel, and in this case the kernel discrepancy $S(F_n, F_\infty)$ no longer captures relevant information on the discrepancy between $\mathcal{L}(F_n)$ and $\mathcal{L}(F_\infty)$. There is, for instance, an enlightening discussion on this issue in [11, pp 8-9] about the "correct" identity for the Laplace distribution, which turns out to be

\[ \mathbb{E}[F_\infty \phi(F_\infty)] = \mathbb{E}\big[2\phi'(F_\infty) + F_\infty \phi''(F_\infty)\big] \tag{1.8} \]

for smooth $\phi$. Identities involving second (or higher) order derivatives of the test functions lead to considering higher order versions of the Stein kernel, namely $\Gamma_1(F_n)$ defined through $\mathbb{E}[F_n \phi(F_n)] = \mathbb{E}[\phi'(F_n)\Gamma_1(F_n)]$ and $\Gamma_2(F_n)$ defined through $\mathbb{E}[F_n \phi(F_n)] = \mathbb{E}[\phi'(F_n)]\,\mathbb{E}[\Gamma_1(F_n)] + \mathbb{E}[\phi''(F_n)\Gamma_2(F_n)]$, where both identities are expected to hold for all smooth test functions (higher order gammas are defined iteratively). Applying the intuition from Nourdin-Peccati analysis for Gaussian convergence then leads to a version of (1.6) of the form

\[ d_{\mathcal{H}}(F_n, F_\infty) = \sup_{h\in\mathcal{H}} \mathbb{E}\,|h(F_n) - h(F_\infty)| \le \kappa_{1,\mathcal{H}} S_1(F_n, F_\infty) + \kappa_{2,\mathcal{H}} S_2(F_n, F_\infty) \tag{1.9} \]

where the constants $\kappa_{i,\mathcal{H}}$, $i = 1, 2$, depend only on $\mathcal{H}$ and $S_i(F_n, F_\infty)$, $i = 1, 2$, provide a comparison of the $\Gamma_i$ with the coefficients of the derivatives appearing in the second order identities (e.g. (1.8) in the case of a Laplace target). Good bounds on the constants $\kappa_{i,\mathcal{H}}$, $i = 1, 2$, are crucial for (1.9) to be of use; such bounds require being able to solve specific (second order) differential equations (called Stein equations) and providing uniform bounds on these solutions and their derivatives. This is exactly the plan carried out in [11] for variance-gamma distributed random variables, and their approach rests on the preliminary work of [12], which provides unified bounds on the solutions to the variance-gamma Stein equations.

Aside from the variance-gamma case discussed in [15, 12], there are several other recent references where versions of (1.4) and (1.8) are proposed for complicated probability distributions such as the Kummer-U distribution [31], or the distribution of products of independent random variables [14, 13, 16]. The common trait of all these is that the resulting identities involve second or higher order derivatives of the test functions. In [1], which is essentially based on the first part of a previous version of this work, we use Fourier analysis to obtain identities for random variables of the form (1.1) when $\{W_k\}_k$ is a sequence of gamma distributed random variables. The resulting identities involve as many derivatives of the test functions as there are different coefficients in the decomposition (1.1).

Applying the intuition outlined in the previous paragraph leads to the realization that the quantity that shall play the role of a Stein discrepancy $S(F_n, F_\infty)$ in the context of random variables of the form (1.1) is exactly $\Delta(F_n, F_\infty)$ defined in (1.2). We are therefore, in principle, in a position to use a bound such as (1.6) or (1.9) to obtain rates of convergence in integral probability metrics $d_{\mathcal{H}}$. The problem with this roadmap for as general a family as that described by our Assumption is that the corresponding constants $\kappa_{\mathcal{H}}$ are elusive save on a case-by-case basis for specific choices of $F_\infty$. This means in particular that Nourdin and Peccati's version of Stein's method shall not provide relevant bounds, at least at the present state of our knowledge on the constants $\kappa_{\mathcal{H}}$, in one sweep for such a large family as that concerned by our assumption (1.1).

In this paper we propose to keep only the relevant quantity $\Delta(F_n, F_\infty)$, whose importance to the problem was identified thanks to the Nourdin-Peccati intuition, but then bypass the difficulties inherited from the Stein methodology entirely. To this end we choose to study the problem of providing bounds in terms of an important and natural distance which is moreover better adapted to our Assumption: the Wasserstein-2 distance, which we now define.

Definition 1.1. Fix $p \ge 1$. The Wasserstein-$p$ distance is defined by

\[ W_p(F_n, F_\infty) = \big( \inf \mathbb{E}\,\|X - Y\|_d^p \big)^{1/p} \]

where the infimum is taken over all joint distributions of the random variables $X$ and $Y$ with respective marginals $F_n$ and $F_\infty$, and $\|\cdot\|_d$ stands for the Euclidean norm on $\mathbb{R}^d$.

Relevant information about Wasserstein distances can be found, e.g., in [38]. We conclude this introduction by noting that, as is well-known, convergence with respect to $W_p$ is equivalent to the usual weak convergence of measures plus convergence of the moments up to order $p$. Also, a direct application of Hölder's inequality implies that if $1 \le p \le q$ then $W_p \le W_q$. Finally, we mention that the Wasserstein-2 distance is not of the family of integral probability metrics $d_{\mathcal{H}}$ (recall (1.6) for a definition).
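As a small illustration of Definition 1.1, the following sketch computes $W_p$ between two empirical measures on the real line. It is not taken from the paper: it assumes dimension $d = 1$ and two samples of equal size with uniform weights, in which case the monotone (sorted) coupling is optimal for every $p \ge 1$. The function name `empirical_wasserstein` and the sample values are hypothetical.

```python
def empirical_wasserstein(xs, ys, p=2.0):
    """W_p between the uniform empirical measures of two equal-size real samples.

    Assumption (not from the paper): one-dimensional data, so that pairing the
    sorted samples gives an optimal coupling for any p >= 1.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    pairs = zip(sorted(xs), sorted(ys))
    mean_cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return mean_cost ** (1.0 / p)


# Hypothetical samples, chosen only to illustrate W_1 <= W_2.
sample_f = [0.0, 1.0, 2.0]
sample_g = [4.0, 1.0, 2.0]
w1 = empirical_wasserstein(sample_f, sample_g, p=1)
w2 = empirical_wasserstein(sample_f, sample_g, p=2)
print(w1, w2)
```

On these samples the sorted pairing has gaps 1, 1 and 2, so the two values are consistent with the ordering $W_1 \le W_2$ noted above.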

2 Wasserstein-2 distance between linear combinations

2.1 A general result on Hilbert spaces

Let $\mathbb{N}$ be the set of non-negative integers ($\mathbb{N} = \{0, 1, 2, 3, \dots\}$) and $\mathbb{N}^*$ the set of positive integers ($\mathbb{N}^* = \{1, 2, 3, \dots\}$). Let $\ell^2(\mathbb{N}^*)$ be the space of real-valued sequences $u = (u_n)_{n\ge 1}$ such that $\sum_{n=1}^{\infty} u_n^2 < \infty$. It is a Hilbert space endowed with the natural inner product and induced Euclidean norm $\|\cdot\|_2$. We aim to measure distances between elements $x, y$ of the unit sphere of $\ell^2(\mathbb{N}^*)$ where $x$ is a finitely supported sequence $x = (x_1, \dots, x_q, 0, 0, \dots)$ ($q \ge 1$) and $y = (y_i)_{i\ge 1}$ is arbitrary. Denoting by $\sigma(\mathbb{N}^*)$ the set of permutations of $\mathbb{N}^*$, we introduce the following quantity between $x, y \in \ell^2(\mathbb{N}^*)$:

\[ d_\sigma(x, y) := \min_{\pi\in\sigma(\mathbb{N}^*)} \|x - y_\pi\|_2 = \min_{\pi\in\sigma(\mathbb{N}^*)} \Big( \sum_{i=1}^{\infty} (x_i - y_{\pi(i)})^2 \Big)^{1/2}. \tag{2.1} \]

Now, for any $x = (x_1, \dots, x_q, 0, 0, \dots)$, let $Q_x$ be the polynomial defined by, for all $t \in \mathbb{R}$,

\[ Q_x(t) := t^2 \prod_{i=1}^{q} (t - x_i)^2. \tag{2.2} \]
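To make (2.1) and (2.2) concrete, here is a minimal brute-force sketch for the special case where $y$ is also finitely supported. This is an assumption made purely for illustration, since the paper allows a general $y$. Padding $x$ with as many zeros as $y$ has listed entries (and vice versa) is enough to realize every possible matching of non-zero entries, so the minimum over permutations of $\mathbb{N}^*$ reduces to a finite search. The names `q_poly` and `d_sigma` are hypothetical.

```python
from itertools import permutations
from math import prod, sqrt


def q_poly(x, t):
    """Q_x(t) = t^2 * prod_i (t - x_i)^2, as in (2.2)."""
    return t ** 2 * prod((t - xi) ** 2 for xi in x)


def d_sigma(x, y):
    """d_sigma(x, y) from (2.1) when both sequences are finitely supported.

    Brute force over all permutations: only usable for very short inputs.
    """
    x_padded = list(x) + [0.0] * len(y)
    y_padded = list(y) + [0.0] * len(x)
    best = min(
        sum((a - b) ** 2 for a, b in zip(x_padded, perm))
        for perm in permutations(y_padded)
    )
    return sqrt(best)


# Hypothetical unit vectors: x = (0.6, 0.8) and a re-ordering of it.
print(d_sigma((0.6, 0.8), (0.8, 0.6)))  # re-indexing alone costs nothing
print(d_sigma((1.0,), (0.6, 0.8)))      # best match pairs 1.0 with 0.8
```

The search is factorial in the padded length; an assignment solver would be the natural replacement for longer sequences, but the brute-force version keeps the link with (2.1) transparent.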

The next theorem allows one to control the quantity $d_\sigma(x, y)$, for any $y$ in the unit sphere of $\ell^2(\mathbb{N}^*)$, thanks to the polynomial $Q_x$.

Theorem 2.1. Suppose that $\{x_1^2, \dots, x_q^2\}$ are rationally independent. Then there exists a constant $C_x$ which only depends on $x$ such that for any $y$ in the unit sphere of $\ell^2(\mathbb{N}^*)$ we get

\[ d_\sigma(x, y) \le C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)}. \tag{2.3} \]

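Proof. We first notice that

\[ \min_{t\in\mathbb{R}} \Big( \frac{Q_x(t)}{t^2} + \sum_{i=1}^{q} \frac{Q_x(t)}{(t - x_i)^2} \Big) =: \delta_x > 0. \]

As a result, for any real number $t$, at least one of the following inequalities is true:

\[
\begin{aligned}
\mathrm{ineq}_0 &: \quad t^2 \le \frac{q+1}{\delta_x} Q_x(t), \\
\mathrm{ineq}_1 &: \quad (t - x_1)^2 \le \frac{q+1}{\delta_x} Q_x(t), \\
&\ \ \vdots \\
\mathrm{ineq}_q &: \quad (t - x_q)^2 \le \frac{q+1}{\delta_x} Q_x(t).
\end{aligned}
\]

Although several of the aforementioned inequalities can hold simultaneously, one may always associate to any integer $i \ge 1$ some index $l$ in $\{0, 1, \dots, q\}$ such that $\mathrm{ineq}_l$ holds for $t = y_i$. Hence, one may build a partition $\mathbb{N}^* = I_0 \cup I_1 \cup \dots \cup I_q$ such that

\[
\begin{cases}
\forall i \in I_0, & y_i^2 \le \frac{q+1}{\delta_x} Q_x(y_i), \\
\forall j \in \{1, \dots, q\},\ \forall i \in I_j, & (y_i - x_j)^2 \le \frac{q+1}{\delta_x} Q_x(y_i).
\end{cases}
\]

Note that for $j \in \{1, \dots, q\}$ we have $\# I_j < \infty$. Indeed, if one assumes, for example, that $\# I_1 = +\infty$, then, since $y_i \to 0$ and $Q_x(y_i) \to 0$ as $i \to \infty$, one necessarily has that $x_1 = 0$ (which is a contradiction). This entails the following bound

\[ \sum_{i\in I_0} y_i^2 + \sum_{j=1}^{q} \sum_{i\in I_j} (y_i - x_j)^2 \le \frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i). \tag{2.4} \]

For any integer $i \ge 1$, we set $z_i = x_j$ if $i \in I_j$ for $j \in \{1, \dots, q\}$ and we set $z_i = 0$ when $i \in I_0$. Using the triangle inequality and (2.4) we may infer that

\[ \big|\, \|z\|_2 - 1 \big| = \Big| \sqrt{\sum_{j=1}^{q} \# I_j \, x_j^2} - \|y\|_2 \Big| \le \sqrt{\sum_{i\in I_0} y_i^2 + \sum_{j=1}^{q} \sum_{i\in I_j} (y_i - x_j)^2} \le \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)}. \tag{2.5} \]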


We need to introduce the following quantity

\[ \eta := \min \Big\{ \Big| \sqrt{\sum_{j=1}^{q} n_j x_j^2} - 1 \Big| \ ;\ (n_1, \dots, n_q) \in \mathbb{N}^q \setminus \{(1, 1, \dots, 1)\} \Big\}. \tag{2.6} \]

Since we do not let $(n_1, \dots, n_q) = (1, \dots, 1)$ in the above minimization, and owing to the assumption of rational independence of $(x_1^2, \dots, x_q^2)$, it follows that $\eta > 0$. Relying on the bound (2.5), one has the following implication

\[
\begin{aligned}
\sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)} < \eta \ &\Rightarrow\ \# I_1 = \# I_2 = \# I_3 = \dots = \# I_q = 1 \\
&\Rightarrow\ \|x - y_\pi\|_2 \le \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)},
\end{aligned}
\]

for $\pi$ being any permutation of $\mathbb{N}^*$ satisfying

\[ I_1 = \{\pi(1)\}, \ \dots, \ I_q = \{\pi(q)\}, \ I_0 = \pi\big(\{q+1, q+2, \dots\}\big). \]

Finally, it holds

\[ \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)} < \eta \ \Rightarrow\ d_\sigma(x, y) \le \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)}, \tag{2.7} \]

which implies that (given the trivial bound $d_\sigma(x, y) \le 2$)

\[ d_\sigma(x, y) \le \Big(1 + \frac{2}{\eta}\Big) \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)}. \tag{2.8} \]

The proof is then achieved with the constant $C_x = \big(1 + \frac{2}{\eta}\big) \sqrt{\frac{q+1}{\delta_x}}$.

Let us now deal with the case when $(x_1^2, \dots, x_q^2)$ are no longer rationally independent. In this situation, one might write $1 = \sum_{j=1}^{q} n_j x_j^2$ for several choices of vectors $(n_1, \dots, n_q) \in \mathbb{N}^q$. We must introduce the set of all these choices, namely:

\[ E = \Big\{ n := (n_1, \dots, n_q) \in \mathbb{N}^q \ \Big|\ \sum_{j=1}^{q} n_j x_j^2 = 1 \Big\}. \]

Besides, for any $n = (n_1, \dots, n_q) \in E$ we define the following element of the unit sphere of $\ell^2(\mathbb{N}^*)$:

\[ x_n = (\underbrace{x_1, \dots, x_1}_{n_1 \text{ times}}, \dots, \underbrace{x_q, \dots, x_q}_{n_q \text{ times}}, 0, 0, \dots). \]

We then have the following theorem.

Theorem 2.2. There exists a constant $C_x$ only depending on $x$ such that for any $y$ in the unit sphere of $\ell^2(\mathbb{N}^*)$ we get:

\[ \min \{ d_\sigma(x_n, y) \ ;\ n \in E \} \le C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)}. \]
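The set $E$ of admissible multiplicity vectors introduced before Theorem 2.2 can be enumerated explicitly when the squares $x_j^2$ are rational. The sketch below is only an illustration under that assumption (exact rational arithmetic avoids floating-point equality tests); the function name `admissible_multiplicities` is hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import product


def admissible_multiplicities(squares):
    """List all n in N^q with sum_j n_j * x_j^2 == 1, given the squares x_j^2.

    Assumption (illustration only): the x_j^2 are positive rationals summing
    to 1, so that exact arithmetic with Fraction applies.
    """
    squares = [Fraction(s) for s in squares]
    if any(s <= 0 for s in squares) or sum(squares) != 1:
        raise ValueError("squares must be positive and sum to 1")
    # Each n_j is at most floor(1 / x_j^2), otherwise the sum exceeds 1.
    ranges = [range(int(1 / s) + 1) for s in squares]
    return [
        n for n in product(*ranges)
        if sum(nj * s for nj, s in zip(n, squares)) == 1
    ]


# Hypothetical example with x_1^2 = x_2^2 = 1/2.
print(admissible_multiplicities([Fraction(1, 2), Fraction(1, 2)]))
```

In this example $E$ contains $(1, 1)$, which corresponds to $x$ itself, together with $(0, 2)$ and $(2, 0)$, which is exactly the ambiguity that Theorem 2.2 has to account for.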


Proof. We proceed as in the proof of Theorem 2.1, from its beginning up to (2.6). The only difference is that we must now consider

\[ \kappa := \min \Big\{ \Big| \sqrt{\sum_{j=1}^{q} n_j x_j^2} - 1 \Big| \ ;\ (n_1, \dots, n_q) \in \mathbb{N}^q \setminus E \Big\}. \tag{2.9} \]

Similarly, since we removed $E$ from the above minimization problem, it follows that $\kappa > 0$. Relying on the bound (2.5), one has the following implication

\[
\begin{aligned}
\sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)} < \kappa \ &\Rightarrow\ (\# I_1, \# I_2, \# I_3, \dots, \# I_q) \in E \\
&\Rightarrow\ \exists n \in E, \ \|x_n - y_\pi\|_2 \le \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)},
\end{aligned}
\]

for $\pi$ being any permutation of $\mathbb{N}^*$ satisfying

\[
\begin{cases}
I_1 = \pi(\{1, \dots, n_1\}), \\
I_2 = \pi(\{n_1 + 1, \dots, n_1 + n_2\}), \\
\ \ \vdots \\
I_q = \pi(\{n_1 + \dots + n_{q-1} + 1, \dots, n_1 + \dots + n_q\}), \\
I_0 = \pi(\{n_1 + \dots + n_q + 1, n_1 + \dots + n_q + 2, \dots\}).
\end{cases}
\]

Finally, it holds

\[ \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)} < \kappa \ \Rightarrow\ \min \{ d_\sigma(x_n, y) \ ;\ n \in E \} \le \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)}, \tag{2.10} \]

which can also be written

\[ \min \{ d_\sigma(x_n, y) \ ;\ n \in E \} \le \Big(1 + \frac{2}{\kappa}\Big) \sqrt{\frac{q+1}{\delta_x} \sum_{i=1}^{\infty} Q_x(y_i)}. \tag{2.11} \]

The proof is then achieved with the constant $C_x = \big(1 + \frac{2}{\kappa}\big) \sqrt{\frac{q+1}{\delta_x}}$.

In the above situation, the quantity $\sum_{i=1}^{\infty} Q_x(y_i)$ is no longer sufficient to ensure the uniqueness of the limit for convergence with respect to $d_\sigma(\cdot, \cdot)$. There may be several accumulation points and some additional information is then required. Set, for all $p \ge 2$,

\[ \Delta_{p,x}(y) = \Big| \sum_{i=1}^{\infty} (y_i^p - x_i^p) \Big|. \]

We have the following theorem.

Theorem 2.3. There exists a constant $\tilde{C}_x$ which only depends on $x$ such that, for any $y$ with $\|y\|_2 = 1$, we get

\[ d_\sigma(x, y) \le \tilde{C}_x \Big( \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)} + \max_{3\le p\le q+1} \Delta_{p,x}(y) \Big). \]

Proof. Relying on Theorem 2.2, it holds that

\[ \min \{ d_\sigma(x_n, y) \ ;\ n \in E \} \le C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)}. \]

Note that it is not assumed that the real numbers $(x_1, \dots, x_q)$ are pairwise distinct. We can extract a subsequence $(u_1, \dots, u_s)$ with $s \le q$ by removing the possible repetitions. For any $n \in E$, let us also denote by $m_i(n)$ the number of repetitions of $u_i$ in the sequence $x_n$ and by $m_i$ the number of repetitions of $u_i$ in the sequence $x$. Thus, we have

\[ \forall p \in \{3, \dots, q+1\}, \qquad \sum_{i=1}^{\infty} x_n(i)^p = \sum_{i=1}^{s} m_i(n) u_i^p. \]

Suppose that $n = \operatorname{argmin} \{ d_\sigma(x_n, y) \ ;\ n \in E \}$. By the triangle inequality we get, for all $p \in \{3, \dots, q+1\}$,

\[
\begin{aligned}
\Big| \sum_{i=1}^{s} (m_i(n) - m_i) u_i^p \Big|
&\le \Big| \sum_{i=1}^{s} m_i(n) u_i^p - \sum_{i=1}^{\infty} y_i^p \Big| + \Delta_{p,x}(y) \\
&= \Big| \sum_{i=1}^{\infty} (x_n(i) - y_i) \Big( \sum_{j=0}^{p-1} x_n(i)^j y_i^{p-1-j} \Big) \Big| + \Delta_{p,x}(y) \\
&\le p \sum_{i=1}^{\infty} |x_n(i) - y_i| \, (|x_n(i)| + |y_i|) + \Delta_{p,x}(y) \\
&\le 2p \, d_\sigma(x_n, y) + \Delta_{p,x}(y) \qquad \text{(Cauchy-Schwarz)} \\
&\le 2p \, C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)} + \Delta_{p,x}(y).
\end{aligned}
\]

Finally, set $V := V(u_1, \dots, u_s) = \operatorname{mat}\big(u_i^j\big)_{1\le i\le s,\, 0\le j\le q-1}$, the Vandermonde matrix associated to the pairwise distinct real numbers $(u_1, \dots, u_s)$, and $\vec{m} = \big( (m_i(n) - m_i) u_i^2 \big)_{1\le i\le s}$. The above inequality reads as

\[ \| {}^t\vec{m}\, V \|_\infty \le 2(q+1) C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)} + \sup_{3\le p\le q+1} \Delta_{p,x}(y). \]

Now, we set

\[ \alpha_x = \min \big\{ \| {}^t k\, V \|_\infty \ \big|\ k = (k_1 u_1^2, \dots, k_s u_s^2) \text{ and } (k_1, \dots, k_s) \in \mathbb{Z}^s \setminus \{(0, \dots, 0)\} \big\}; \]

since $V$ is invertible we must have $\alpha_x > 0$. That is why

\[ 2(q+1) C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)} + \sup_{3\le p\le q+1} \Delta_{p,x}(y) < \alpha_x \ \Rightarrow\ \vec{m} = 0. \]

In the latter situation we also get $m_i(n) = m_i$, $x_n = x$ and of course the desired bound

\[ 2(q+1) C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)} + \sup_{3\le p\le q+1} \Delta_{p,x}(y) < \alpha_x \ \Rightarrow\ d_\sigma(x, y) \le C_x \sqrt{\sum_{i=1}^{\infty} Q_x(y_i)}. \]

The proof is now achieved with $\tilde{C}_x = 2(q+1) C_x \big(1 + \frac{2}{\alpha_x}\big)$.

2.2 A probabilistic interpretation

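Let $(W_k)_{k\ge 1}$ be an i.i.d. sequence of random variables admitting moments of orders $r = 2, \dots, 2q+2$ and which satisfies $\mathbb{E}(W_1) = 0$, $\mathbb{E}(W_1^2) = 1$. We shall further assume that $\kappa_r(W_1) \ne 0$ for all $r = 2, \dots, 2q+2$, where $\kappa_r(W_1)$ denotes the $r$th cumulant of $W_1$. Let, for all $n \ge 1$, $(\alpha_{n,k})_{k\ge 1}$ and $(\alpha_{\infty,k})_{k\ge 1}$ be two sequences of real numbers such that

\[ \sum_{k=1}^{+\infty} \alpha_{n,k}^2 = \sum_{k=1}^{q} \alpha_{\infty,k}^2 = 1. \]

Then, introduce the following random variables,

\[ F_\infty \stackrel{d}{=} \sum_{k=1}^{q} \alpha_{\infty,k} W_k, \qquad F_n \stackrel{d}{=} \sum_{k=1}^{+\infty} \alpha_{n,k} W_k, \qquad n \ge 1 \tag{2.12} \]

where $\stackrel{d}{=}$ stands for equality in distribution. Using standard properties of cumulants one has, for any $r = 2, \dots, 2q+2$,

\[ \kappa_r(F_n) = \kappa_r(W_1) \sum_{k=1}^{+\infty} \alpha_{n,k}^r, \qquad \kappa_r(F_\infty) = \kappa_r(W_1) \sum_{k=1}^{q} \alpha_{\infty,k}^r. \]

The aim of this section is to obtain an upper bound on the Wasserstein-2 distance between the laws of $F_n$ and $F_\infty$ based on Theorems 2.1 and 2.3. Let $\pi$ be a permutation in $\sigma(\mathbb{N}^*)$. Then, for all $n \ge 1$,

\[ F_n \stackrel{d}{=} \sum_{k=1}^{+\infty} \alpha_{n,\pi(k)} W_k. \]

Choosing the specific coupling $\big( \sum_{k=1}^{q} \alpha_{\infty,k} W_k, \ \sum_{k=1}^{+\infty} \alpha_{n,\pi(k)} W_k \big)$, the following upper bound holds true, for all $n \ge 1$:

\[ W_2^2(F_n, F_\infty) \le \mathbb{E} \Big| \sum_{k=1}^{q} \alpha_{\infty,k} W_k - \sum_{k=1}^{+\infty} \alpha_{n,\pi(k)} W_k \Big|^2 = \sum_{k=1}^{+\infty} (\alpha_{\infty,k} - \alpha_{n,\pi(k)})^2 \]

with the convention that $\alpha_{\infty,k} = 0$ for all $k \ge q+1$. This implies, for all $n \ge 1$,

\[ W_2(F_n, F_\infty) \le d_\sigma(\alpha_n, \alpha_\infty). \tag{2.13} \]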
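The equality used in the coupling argument leading to (2.13) only relies on the $W_k$ being centered, uncorrelated and of unit variance. The sketch below checks it exactly in a toy case, under assumptions that are not those of the paper: the $W_k$ are taken to be Rademacher signs (which satisfy the mean and variance conditions but not the non-zero cumulant condition), and both coefficient sequences are finitely supported. The function names and the coefficient values are hypothetical.

```python
from fractions import Fraction
from itertools import product


def coupling_cost_rademacher(alpha_inf, alpha_n_permuted):
    """E|sum_k (a_k - b_k) W_k|^2 for i.i.d. Rademacher W_k, by enumeration.

    Illustration only: Rademacher signs have mean 0 and variance 1, which is
    all the L^2 computation behind (2.13) uses.
    """
    length = max(len(alpha_inf), len(alpha_n_permuted))
    a = list(alpha_inf) + [Fraction(0)] * (length - len(alpha_inf))
    b = list(alpha_n_permuted) + [Fraction(0)] * (length - len(alpha_n_permuted))
    diffs = [Fraction(ai) - Fraction(bi) for ai, bi in zip(a, b)]
    total = Fraction(0)
    for signs in product((-1, 1), repeat=length):
        total += sum(s * d for s, d in zip(signs, diffs)) ** 2
    return total / 2 ** length


def squared_l2_gap(alpha_inf, alpha_n_permuted):
    """sum_k (a_k - b_k)^2 with the shorter sequence padded by zeros."""
    length = max(len(alpha_inf), len(alpha_n_permuted))
    a = list(alpha_inf) + [Fraction(0)] * (length - len(alpha_inf))
    b = list(alpha_n_permuted) + [Fraction(0)] * (length - len(alpha_n_permuted))
    return sum((Fraction(ai) - Fraction(bi)) ** 2 for ai, bi in zip(a, b))


# Hypothetical coefficients, both with squares summing to 1.
target = [Fraction(3, 5), Fraction(4, 5)]
approx = [Fraction(1, 2)] * 4
print(coupling_cost_rademacher(target, approx), squared_l2_gap(target, approx))
```

Both quantities agree because the cross terms $\mathbb{E}[W_j W_k]$, $j \ne k$, vanish; minimizing the right-hand side over re-indexings of the second sequence is what produces $d_\sigma$ in (2.13).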

In the next lemma, we reinterpret in a probabilistic way the quantity appearing on the right-hand side of inequality (2.3).

Lemma 2.1. For all $n \ge 1$

\[ \Delta(F_n, F_\infty) = \Delta(F_n) := \sum_{k\ge 1} Q_{\alpha_\infty}(\alpha_{n,k}) = \sum_{r=2}^{2q+2} \Theta_r \sum_{k\ge 1} \alpha_{n,k}^r = \sum_{r=2}^{2q+2} \Theta_r \frac{\kappa_r(F_n)}{\kappa_r(W_1)}, \tag{2.14} \]

where the coefficients $\Theta_r$ are the coefficients of the polynomial

\[ Q_{\alpha_\infty}(x) = (P(x))^2 = \Big( x \prod_{i=1}^{q} (x - \alpha_{\infty,i}) \Big)^2. \tag{2.15} \]

Moreover, for all $n \ge 1$

\[ \Delta(F_n, F_\infty) = \sum_{r=2}^{2q+2} \Theta_r \frac{\kappa_r(F_n) - \kappa_r(F_\infty)}{\kappa_r(W_1)} \tag{2.16} \]

since $\Delta(F_\infty) = 0$.

From a probabilistic point of view, Theorems 2.1 and 2.3 take the following form.

Theorem 2.4. If the real numbers $\{\alpha_{\infty,r}^2\}_{1\le r\le q}$ are rationally independent then

\[ W_2(F_n, F_\infty) \le C \sqrt{\Delta(F_n)}, \qquad n \ge 1, \tag{2.17} \]

if they are not, one instead gets

\[ W_2(F_n, F_\infty) \le C \Big( \sqrt{\Delta(F_n)} + \sum_{r=2}^{q+1} \Big| \frac{\kappa_r(F_n) - \kappa_r(F_\infty)}{\kappa_r(W_1)} \Big| \Big), \qquad n \ge 1 \tag{2.18} \]

where the constant $C$ depends only, in both cases, on the target $F_\infty$.

Proof. The proof is a direct consequence of Theorems 2.1 and 2.3. Indeed, set $\alpha_n = \{\alpha_{n,k}\}_{k\ge 1}$ and $\alpha_\infty = \{\alpha_{\infty,k}\}_{k\ge 1}$; thanks to inequality (2.13), we get $W_2(F_n, F_\infty) \le d_\sigma(\alpha_n, \alpha_\infty)$. As before, we set $Q_{\alpha_\infty}(x) = x^2 \prod_{k=1}^{q} (x - \alpha_{\infty,k})^2$. Finally, recalling that $\sum_{k=1}^{+\infty} Q_{\alpha_\infty}(\alpha_{n,k}) = \Delta(F_n)$, the result follows.
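The two expressions for $\Delta(F_n)$ in Lemma 2.1 (a sum of values of $Q_{\alpha_\infty}$, and a combination of power sums weighted by the coefficients $\Theta_r$) can be compared on a toy example. The sketch below is an illustration under simplifying assumptions that are not in the paper: coefficients are rational and finitely supported, so that everything is exact. All function names and numerical values are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = degree)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def theta_coefficients(alpha_inf):
    """Coefficients Theta_r of Q(x) = (x * prod_i (x - alpha_inf_i))^2."""
    p = [Fraction(0), Fraction(1)]  # the polynomial x
    for a in alpha_inf:
        p = poly_mul(p, [-Fraction(a), Fraction(1)])  # times (x - a)
    return poly_mul(p, p)


def delta_direct(alpha_n, alpha_inf):
    """Delta(F_n) as the sum over k of Q(alpha_{n,k})."""
    total = Fraction(0)
    for t in alpha_n:
        term = Fraction(t) ** 2
        for a in alpha_inf:
            term *= (Fraction(t) - Fraction(a)) ** 2
        total += term
    return total


def delta_power_sums(alpha_n, alpha_inf):
    """Delta(F_n) as sum_{r=2}^{2q+2} Theta_r * sum_k alpha_{n,k}^r."""
    theta = theta_coefficients(alpha_inf)
    return sum(
        theta[r] * sum(Fraction(t) ** r for t in alpha_n)
        for r in range(2, len(theta))
    )


# Hypothetical coefficients, both with squares summing to 1.
alpha_inf = [Fraction(3, 5), Fraction(4, 5)]
alpha_n = [Fraction(1, 2)] * 4
print(delta_direct(alpha_n, alpha_inf), delta_power_sums(alpha_n, alpha_inf))
```

Dividing each power sum by nothing more than $\kappa_r(W_1)$ turns the second form into the cumulant expression of (2.14); the sum starts at $r = 2$ because $Q_{\alpha_\infty}$ has a double root at the origin.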

Remark 2.1. An important question concerning the sharpness of the estimate (2.17) was raised by referees on a previous version of this paper. We first notice that for some appropriate constant $C > 0$ and for all $x \in [-1, 1]$, one gets $Q_{\alpha_\infty}(x) \le C x^2$ and, for all $k = 1, \dots, q$, $Q_{\alpha_\infty}(x) \le C (x - \alpha_{\infty,k})^2$. Hence, we may deduce that

\[ \Delta(F_n) = \sum_{k=1}^{+\infty} Q_{\alpha_\infty}(\alpha_{n,k}) \le C \, d_\sigma(\alpha_n, \alpha_\infty)^2, \]

and the result follows since one gets, for appropriate constants $A, B > 0$, that

\[ A \, d_\sigma(\alpha_n, \alpha_\infty) \le \sqrt{\Delta(F_n)} \le B \, d_\sigma(\alpha_n, \alpha_\infty). \]

Unfortunately, at present, we are unable to say whether the quantity $d_\sigma$ is equivalent to the Wasserstein-2 distance. Nonetheless, in the context of the second Wiener chaos, we provide a general lower bound on the Wasserstein-2 distance in Section 2.4 as well as a simple example which refines this lower bound.

Remark 2.2. To end this section, note that when the $\{\alpha_{\infty,r}^2\}_{1\le r\le q}$ are rationally independent, the following bound is an immediate consequence of (2.17):

\[ W_2(F_n, F_\infty) \le C \sqrt{\sum_{r=2}^{2q+2} |\kappa_r(F_n) - \kappa_r(F_\infty)|}, \qquad n \ge 1 \]

with $C$ only depending on the target distribution $F_\infty$.

2.3 Specializing to the second Wiener chaos

In this section, we apply our main results in the setting where the approximating sequence $(F_n)_{n \ge 1}$ consists of elements of the second Wiener chaos of an isonormal process $X = \{X(h);\ h \in H\}$ over a separable Hilbert space $H$. We refer the reader to [26, Chapter 2] for a detailed discussion of this topic. Recall that the elements of the second Wiener chaos are random variables of the general form $F = I_2(f)$, with $f \in H^{\odot 2}$, the symmetric two-fold tensor product of $H$, where $I_2$ denotes the second order Wiener-Itô integral. Notice that, by Theorem 2.7.7 of [26], for any $f = h \otimes h$ with $h \in H$ such that $\|h\|_H = 1$, $I_2(f) = X(h)^2 - 1 \overset{d}{=} Z^2 - 1$, where $Z \sim N(0, 1)$. To any kernel $f \in H^{\odot 2}$, we associate the following Hilbert-Schmidt operator
$$A_f : H \to H; \quad g \mapsto f \otimes_1 g.$$
We also write $\{\alpha_{f,j}\}_{j \ge 1}$ and $\{e_{f,j}\}_{j \ge 1}$, respectively, to indicate the (not necessarily distinct) eigenvalues of $A_f$ and the corresponding eigenvectors. Let $q \ge 1$ and $F_\infty$ be defined by
$$F_\infty = \sum_{j=1}^{q} \frac{\alpha_{\infty,j}}{\sqrt{2}} (Z_j^2 - 1), \tag{2.19}$$

where $\{Z_j,\ j \in \{1, ..., q\}\}$ is a collection of i.i.d. standard normal random variables and $(\alpha_{\infty,j})_{1 \le j \le q}$ a collection of real numbers such that $\sum_{j=1}^{q} \alpha_{\infty,j}^2 = 1$. The next proposition gathers some relevant properties of the elements of the second Wiener chaos associated to $X$.

Proposition 2.1 (See Section 2.7.4 in [26] and Lemma 3.1 in [2]). Let $F = I_2(f)$, $f \in H^{\odot 2}$, be a generic element of the second Wiener chaos of $X$, and write $\{\alpha_{f,k}\}_{k \ge 1}$ for the set of the eigenvalues of the associated Hilbert-Schmidt operator $A_f$.

1. The following equality holds: $F \overset{d}{=} \sum_{k \ge 1} \alpha_{f,k} (Z_k^2 - 1)$, where $\{Z_k\}_{k \ge 1}$ is a sequence of i.i.d. $N(0, 1)$ random variables.

2. For all $r \ge 2$,
$$\kappa_r(F) = 2^{r-1} (r-1)! \sum_{k \ge 1} \alpha_{f,k}^r.$$
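The cumulant formula in item 2 can be sanity-checked on the simplest case $F = Z^2 - 1$ (a single eigenvalue equal to 1). The sketch below is only an illustration, and its function names are ours rather than the paper's. It computes the exact moments of $Z^2 - 1$ from the Gaussian moments $\mathbb{E}[Z^{2j}] = (2j-1)!!$, converts them to cumulants with the usual moment-cumulant recursion, and compares the result with $2^{r-1}(r-1)!$.

```python
from math import comb, factorial


def odd_double_factorial(j):
    """Return (2j - 1)!! = E[Z^(2j)] for Z ~ N(0, 1), with the convention (-1)!! = 1."""
    out = 1
    for k in range(1, 2 * j, 2):
        out *= k
    return out


def centered_chi2_moments(max_order):
    """Exact moments E[(Z^2 - 1)^n], n = 0..max_order, by binomial expansion."""
    return [
        sum(comb(n, j) * odd_double_factorial(j) * (-1) ** (n - j) for j in range(n + 1))
        for n in range(max_order + 1)
    ]


def cumulants_from_moments(m):
    """Moment-cumulant recursion: k_n = m_n - sum_{k<n} C(n-1, k-1) k_k m_{n-k}."""
    kappa = [0] * len(m)
    for n in range(1, len(m)):
        kappa[n] = m[n] - sum(
            comb(n - 1, k - 1) * kappa[k] * m[n - k] for k in range(1, n)
        )
    return kappa


def second_chaos_cumulant(alphas, r):
    """kappa_r of sum_k alpha_k (Z_k^2 - 1), as stated in Proposition 2.1, item 2."""
    return 2 ** (r - 1) * factorial(r - 1) * sum(a ** r for a in alphas)


kappa = cumulants_from_moments(centered_chi2_moments(6))
for r in range(2, 7):
    print(r, kappa[r], second_chaos_cumulant([1], r))
```

Both columns printed by the loop should agree for every $r$, since the recursion and the closed formula describe the same cumulants.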

The next corollary is a direct application of Theorem 2.4 and provides quantitative bounds for the main results in [29, 2].

Corollary 2.1. Let $(Z_k)_{k \ge 1}$ be a sequence of i.i.d. standard normal random variables. Let $q \ge 1$ and $(\alpha_{\infty,j})_{1 \le j \le q}$ be a collection of real numbers such that $\sum_{j=1}^{q} \alpha_{\infty,j}^2 = 1$. For all $n \ge 1$, let $(\alpha_{n,j})_{j \ge 1}$ be a sequence of real numbers such that $\sum_{j=1}^{+\infty} \alpha_{n,j}^2 = 1$. Let $(F_n)_{n \ge 1}$ be the sequence of random variables defined by
$$F_n = \sum_{k=1}^{\infty} \frac{\alpha_{n,k}}{\sqrt{2}} (Z_k^2 - 1), \quad n \ge 1.$$
Let $F_\infty$ be as in (2.19) and $\Delta(F_n)$ as in Lemma 2.1. Then,

(a) If $\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2$ are rationally independent, for all $n \ge 1$
$$W_2(F_n, F_\infty) \le C \sqrt{\Delta(F_n)}. \tag{2.20}$$
In particular, the convergence $\Delta(F_n) \to \Delta(F_\infty) = 0$, when $n \to +\infty$, implies that $(F_n)_{n \ge 1}$ converges in distribution to the random variable $F_\infty$.

(b) If $\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2$ are not rationally independent, for all $n \ge 1$
$$W_2(F_n, F_\infty) \le C \left( \sqrt{\Delta(F_n)} + \sum_{r=2}^{q+1} \left| \frac{\kappa_r(F_n) - \kappa_r(F_\infty)}{\kappa_r(W_1)} \right| \right) \tag{2.21}$$
with $W_1 = (Z_1^2 - 1)/\sqrt{2}$.

Remark 2.3. The upper bound in Corollary 2.1, part (b) requires the separate convergences of the first $q + 1$ cumulants for the convergence in distribution towards the target random variable $F_\infty$ as soon as $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2\} < q$. This is consistent with a quantitative result in [11], see also Section 2.5 below. In fact, when $q = 2$ and $\alpha_{\infty,1} = -\alpha_{\infty,2} = 1/2$, then the target random variable $F_\infty$ ($\overset{d}{=} Z_1 \times Z_2$, where $Z_1, Z_2 \sim N(0, 1)$ are independent) belongs to the class of Variance–Gamma distributions $VG_c(r, \theta, \sigma)$ with parameters $r = \sigma = 1$ and $\theta = 0$. Then, [11, Corollary 5.10, part (a)] reads
$$W_1(F_n, F_\infty) \le C \sqrt{\Delta(F_n) + \tfrac{1}{4} \kappa_3^2(F_n)}. \tag{2.22}$$
Therefore, for the convergence in distribution of the sequence $F_n$ towards the target random variable $F_\infty$, in addition to the convergence $\Delta(F_n) \to \Delta(F_\infty) = 0$, one also needs the convergence of the third cumulant $\kappa_3(F_n) \to \kappa_3(F_\infty) = 0$. Note that in this case we have $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \alpha_{\infty,2}^2\} = 1 < q = 2$.

Example 2.1. The aim of this simple example is to show that the requirement of separate convergences of the first $q + 1$ cumulants is essential in Theorem 2.4 as soon as $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2\} < q$. Assume that $q = 2$ and $\alpha_{\infty,1} = -\alpha_{\infty,2} = 1/\sqrt{2}$. Consider the fixed sequence
$$F_n = \frac{\alpha_{\infty,1}}{\sqrt{2}} (Z_1^2 - 1) - \frac{\alpha_{\infty,2}}{\sqrt{2}} (Z_2^2 - 1), \quad n \ge 1.$$
Then $\kappa_{2r}(F_n) = \kappa_{2r}(F_\infty)$ for all $r \ge 1$, in particular $\kappa_2(F_n) = \kappa_2(F_\infty) = 1$, and $\Delta(F_n) = \Delta(F_\infty) = 0$. However, it is easy to see that the sequence $F_n$ does not converge in distribution towards the target random variable $F_\infty$, because $2 = \kappa_3(F_n) \nrightarrow \kappa_3(F_\infty) = 0$. Note that in this example, we have $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \alpha_{\infty,2}^2\} = 1 < q = 2$.
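Example 2.1 can be checked numerically with the cumulant formula of Proposition 2.1. The following minimal sketch (function and variable names are illustrative) encodes $F_n$ and $F_\infty$ by their coefficients in front of $(Z_k^2 - 1)$, which are $(1/2, 1/2)$ and $(1/2, -1/2)$ respectively, and shows that the even cumulants coincide while the third cumulants differ.

```python
from math import factorial


def chaos_cumulant(coeffs, r):
    """kappa_r of sum_k c_k (Z_k^2 - 1), by item 2 of Proposition 2.1."""
    return 2 ** (r - 1) * factorial(r - 1) * sum(c ** r for c in coeffs)


# alpha_{inf,1} / sqrt(2) = 1/2 and alpha_{inf,2} / sqrt(2) = -1/2.
F_N = [0.5, 0.5]      # fixed sequence F_n of Example 2.1
F_INF = [0.5, -0.5]   # target F_infinity

for r in (2, 3, 4, 6):
    print(r, chaos_cumulant(F_N, r), chaos_cumulant(F_INF, r))
```

Only the odd orders separate the two laws here, which is why a bound involving $\Delta(F_n)$ alone cannot detect the lack of convergence.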

2.4 A lower bound on the Wasserstein-2 distance in the second Wiener chaos

The aim of this subsection is to obtain a relevant lower bound on the Wasserstein-2 distance in the framework of the second Wiener chaos. The method developed to obtain such a lower bound is based on characteristic functions, complex analysis and optimal transport in dimension one. The main point of our approach is that random variables belonging to the second Wiener chaos admit characteristic functions which extend to the complex plane. Let $(F_n)_{n \ge 1}$ and $F_\infty$ be defined by
$$F_\infty = \frac{1}{\sqrt{2}} \sum_{k=1}^{q} \alpha_{\infty,k} (Z_k^2 - 1), \qquad F_n = \frac{1}{\sqrt{2}} \sum_{k \ge 1} \alpha_{n,k} (Z_k^2 - 1), \quad n \ge 1, \tag{2.23}$$
where $(Z_k)_{k \ge 1}$ is a sequence of i.i.d. standard normal random variables, $(\alpha_{\infty,k})_{1 \le k \le q}$ a collection of non-zero real numbers and $(\alpha_{n,k})_{k \ge 1}$ a sequence of real numbers such that
$$\sum_{k=1}^{q} \alpha_{\infty,k}^2 = 1, \qquad \sum_{k \ge 1} \alpha_{n,k}^2 = 1, \quad n \ge 1. \tag{2.24}$$

From the previous assumptions, it is clear that $\kappa_2(F_n) = \kappa_2(F_\infty) = 1$, for all $n \ge 1$. First, recall the following tail behavior of elements in the second Wiener chaos.

Lemma 2.2. Let $X$ be a random variable belonging to the second Wiener chaos with unit variance. Then,
$$\mathbb{P}(|X| > x) \le \exp(-x/e), \quad \text{for all } x > e. \tag{2.25}$$

Proof. Since $X$ is in the second Wiener chaos, we have by hypercontractivity, for any $q > 2$,
$$\mathbb{E}[|X|^q]^{\frac{1}{q}} \le (q - 1). \tag{2.26}$$
Then, by Markov inequality, for all $x > 0$,
$$\mathbb{P}(|X| \ge x) \le \frac{1}{x^q} \mathbb{E}[|X|^q] \le \frac{1}{x^q} (q - 1)^q. \tag{2.27}$$
Choosing $q = 1 + x/e$ with $x > e$, so that $q > 2$ and $(q-1)^q / x^q = e^{-q} \le e^{-x/e}$, gives
$$\mathbb{P}(|X| \ge x) \le e^{-x/e} \tag{2.28}$$

which concludes the proof of the lemma.

Recall that thanks to the representations (2.23), Theorem 6.1 of [18] ensures that the characteristic functions of $F_n$ and $F_\infty$, denoted respectively by $\varphi_n$ and $\varphi_\infty$, admit complex extensions and are analytic in the strips of the complex plane $D_n := \{z \in \mathbb{C} : |\mathrm{Im}(z)| < 1/(2 \max_{k \ge 1} |\alpha_{n,k}|)\}$ and $D_\infty := \{z \in \mathbb{C} : |\mathrm{Im}(z)| < 1/(2 \max_{k \ge 1} |\alpha_{\infty,k}|)\}$. In particular, thanks to (2.24), $\varphi_n$ and $\varphi_\infty$ are analytic in the strip $\{z \in \mathbb{C} : |\mathrm{Im}(z)| < 1/2\}$. Moreover, thanks to Theorem 7.1.1 of [23], $\varphi_n$ and $\varphi_\infty$ admit the following integral representations, for all $z \in \{z \in \mathbb{C} : |\mathrm{Im}(z)| < 1/2\}$,
$$\varphi_n(z) := \int_{\mathbb{R}} e^{izx} \mu_n(dx), \qquad \varphi_\infty(z) := \int_{\mathbb{R}} e^{izx} \mu_\infty(dx),$$
where $\mu_n$ and $\mu_\infty$ are the probability distributions of $F_n$ and $F_\infty$ respectively. Finally, recall the following standard inequality, for all $x, y \in \mathbb{R}$ and $z \in \mathbb{C}$ such that $|z| = \rho > 0$,
$$|e^{izx} - e^{izy}| \le \rho |x - y| e^{\rho(|x| + |y|)}. \tag{2.29}$$
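Inequality (2.29) is easy to probe numerically. The sketch below, whose grid sizes and function names are arbitrary choices of ours, evaluates both sides on a grid of real $x, y$ and of complex $z$ on the circle $|z| = \rho$, and returns the largest observed ratio of the left-hand side to the right-hand side, which should not exceed 1.

```python
import cmath
import math


def lhs(z, x, y):
    """Left-hand side of (2.29)."""
    return abs(cmath.exp(1j * z * x) - cmath.exp(1j * z * y))


def rhs(rho, x, y):
    """Right-hand side of (2.29)."""
    return rho * abs(x - y) * math.exp(rho * (abs(x) + abs(y)))


def worst_ratio(rho, n_angles=16, half_width=4.0, steps=17):
    """Largest lhs / rhs over a grid of x != y and of z with |z| = rho."""
    grid = [-half_width + 2 * half_width * i / (steps - 1) for i in range(steps)]
    worst = 0.0
    for a in range(n_angles):
        z = rho * cmath.exp(2j * math.pi * a / n_angles)
        for x in grid:
            for y in grid:
                if x != y:
                    worst = max(worst, lhs(z, x, y) / rhs(rho, x, y))
    return worst


print(worst_ratio(1 / (8 * math.e)))
```

A grid check of this kind is of course not a proof; the inequality itself follows from writing $e^{izx} - e^{izy}$ as the integral of $iz\,e^{izs}$ between $y$ and $x$.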

We are now ready to state the proposition linking the pointwise difference of the characteristic functions and of their derivatives with the Wasserstein-2 distance of $F_n$ and $F_\infty$.

Proposition 2.2. For any $\rho \in (0, 1/(4e))$, there exists a strictly positive constant $C_{1,\rho}$ such that, for all $n \ge 1$ and for all $z \in \mathbb{C}$ with $|z| = \rho$,
$$|\varphi_n(z) - \varphi_\infty(z)| + |\varphi_n'(z) - \varphi_\infty'(z)| \le \rho\, C_{1,\rho}\, W_2(F_n, F_\infty). \tag{2.30}$$

Proof. By optimal transport on the real line (Brenier Theorem), for all $n \ge 1$, there exists a map $T_n$ from $\mathbb{R}$ to $\mathbb{R}$ such that
$$W_2(F_n, F_\infty) := \left( \int_{\mathbb{R}} |x - T_n(x)|^2 \, d\mu_\infty(x) \right)^{\frac{1}{2}}.$$
Moreover, the push forward measure $\mu_\infty \circ T_n^{-1}$ is equal to $\mu_n$ so that, for all $z \in \mathbb{C}$ with $|\mathrm{Im}(z)| < 1/2$,
$$\varphi_n(z) := \int_{\mathbb{R}} e^{izT_n(x)} \mu_\infty(dx).$$
Let $\rho \in (0, 1/(4e))$ and $z \in \mathbb{C}$ such that $|z| = \rho$. Then,
$$|\varphi_n(z) - \varphi_\infty(z)| \le \int_{\mathbb{R}} |e^{izx} - e^{izT_n(x)}| \mu_\infty(dx).$$
Moreover, by (2.29),
$$|\varphi_n(z) - \varphi_\infty(z)| \le \rho \int_{\mathbb{R}} |x - T_n(x)| e^{\rho(|x| + |T_n(x)|)} \mu_\infty(dx). \tag{2.31}$$

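Using Cauchy-Schwarz inequality,
$$\begin{aligned}
|\varphi_n(z) - \varphi_\infty(z)| &\le \rho \left( \int_{\mathbb{R}} |x - T_n(x)|^2 \mu_\infty(dx) \right)^{\frac{1}{2}} \left( \int_{\mathbb{R}} e^{2\rho(|x| + |T_n(x)|)} \mu_\infty(dx) \right)^{\frac{1}{2}}, \\
&\le \rho\, W_2(F_n, F_\infty) \left( \int_{\mathbb{R}} e^{2\rho(|x| + |T_n(x)|)} \mu_\infty(dx) \right)^{\frac{1}{2}}.
\end{aligned}$$
Next, we need to prove that
$$\sup_{n \ge 1} \left( \int_{\mathbb{R}} e^{2\rho(|x| + |T_n(x)|)} \mu_\infty(dx) \right) < +\infty. \tag{2.32}$$
By Lemma 2.2, for all $c < 1/e$,
$$\sup_{n \ge 1} \left( \mathbb{E}[e^{c|F_n|}] \right) < +\infty. \tag{2.33}$$
Since $\rho \in (0, 1/(4e))$, (2.32) follows. Indeed, by Cauchy-Schwarz inequality and since $\mu_\infty \circ T_n^{-1} = \mu_n$, the integral in (2.32) is at most $\mathbb{E}[e^{4\rho|F_\infty|}]^{1/2}\, \mathbb{E}[e^{4\rho|F_n|}]^{1/2}$, with $4\rho < 1/e$. To conclude the proof of the proposition, we need to bound similarly the pointwise difference of the derivatives of the characteristic functions. Since $F_n$ and $F_\infty$ are centered, for all $z \in \mathbb{C}$ with $|z| = \rho \in (0, 1/(4e))$,
$$|\varphi_n'(z) - \varphi_\infty'(z)| \le \int_{\mathbb{R}} |T_n(x)(e^{izT_n(x)} - 1) - x(e^{izx} - 1)| \, d\mu_\infty(x).$$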

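Then,
$$|\varphi_n'(z) - \varphi_\infty'(z)| \le (I) + (II) \tag{2.34}$$
with
$$(I) := \int_{\mathbb{R}} |x| |e^{izx} - e^{izT_n(x)}| \, d\mu_\infty(x), \qquad (II) := \int_{\mathbb{R}} |x - T_n(x)| |e^{izT_n(x)} - 1| \, d\mu_\infty(x).$$
For the first term $(I)$, using (2.29) and Cauchy-Schwarz inequality,
$$(I) \le \rho\, W_2(F_n, F_\infty) \left( \int_{\mathbb{R}} |x|^2 e^{2\rho(|x| + |T_n(x)|)} \, d\mu_\infty(x) \right)^{\frac{1}{2}}.$$
Moreover, as previously,
$$\sup_{n \ge 1} \left( \int_{\mathbb{R}} |x|^2 e^{2\rho(|x| + |T_n(x)|)} \, d\mu_\infty(x) \right) < +\infty$$
since $\rho \in (0, 1/(4e))$. For the second term, (2.29) with $y = 0$ and Cauchy-Schwarz inequality imply
$$\begin{aligned}
(II) &\le \int_{\mathbb{R}} |x - T_n(x)|\, \rho\, |T_n(x)| e^{\rho|T_n(x)|} \, d\mu_\infty(x), \\
&\le \rho\, W_2(F_n, F_\infty) \left( \int_{\mathbb{R}} |T_n(x)|^2 e^{2\rho|T_n(x)|} \, d\mu_\infty(x) \right)^{\frac{1}{2}}.
\end{aligned}$$
Finally, note that for $\rho \in (0, 1/(4e))$,
$$\sup_{n \ge 1} \left( \int_{\mathbb{R}} |T_n(x)|^2 e^{2\rho|T_n(x)|} \, d\mu_\infty(x) \right) < +\infty. \tag{2.35}$$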

Taking
$$C_{1,\rho} := \sup_{n \ge 1} \left( \int_{\mathbb{R}} |T_n(x)|^2 e^{2\rho|T_n(x)|} \, d\mu_\infty(x) \right)^{\frac{1}{2}} + \sup_{n \ge 1} \left( \int_{\mathbb{R}} |x|^2 e^{2\rho(|x| + |T_n(x)|)} \, d\mu_\infty(x) \right)^{\frac{1}{2}} + \sup_{n \ge 1} \left( \int_{\mathbb{R}} e^{2\rho(|x| + |T_n(x)|)} \mu_\infty(dx) \right)^{\frac{1}{2}},$$
we get
$$|\varphi_n(z) - \varphi_\infty(z)| + |\varphi_n'(z) - \varphi_\infty'(z)| \le \rho\, C_{1,\rho}\, W_2(F_n, F_\infty). \tag{2.36}$$

This concludes the proof of the proposition.

In order to upper bound the quantity $\Delta(F_n)$ with the Wasserstein-2 distance, we are going to use a bit of complex analysis together with Proposition 2.2. First of all, recall the following inequality for the cumulants of $F_n$, for all $r \ge 2$ and for all $n \ge 1$,
$$\begin{aligned}
|\kappa_r(F_n)| &\le 2^{r-1} (r-1)! \sum_{j=1}^{+\infty} |\alpha_{n,j}|^r, \\
&\le 2^{r-1} (r-1)! \max_{j \ge 1} |\alpha_{n,j}|^{r-2} \sum_{j=1}^{+\infty} \alpha_{n,j}^2, \\
&\le 2^{r-1} (r-1)!
\end{aligned}$$
and similarly for $|\kappa_r(F_\infty)|$. Therefore, the following series are convergent as soon as $|z| < 1/2$:
$$\sum_{r=2}^{\infty} \frac{\kappa_r(F_n)}{r!} (iz)^r, \qquad \sum_{r=2}^{\infty} \frac{\kappa_r(F_\infty)}{r!} (iz)^r. \tag{2.37}$$

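We are now ready to link the quantity $\Delta(F_n)$ with a certain functional on the difference of the characteristic functions.

Proposition 2.3. Let $\rho \in (0, 1/2)$. There exists a strictly positive constant, $C_{2,\rho} > 0$, such that, for all $n \ge 1$,
$$\int_0^{2\pi} \left| \frac{\varphi_\infty'(\rho e^{i\theta})}{\varphi_\infty(\rho e^{i\theta})} - \frac{\varphi_n'(\rho e^{i\theta})}{\varphi_n(\rho e^{i\theta})} \right|^2 \frac{d\theta}{2\pi} \ge C_{2,\rho}\, \Delta(F_n)^2. \tag{2.38}$$

Proof. Let us fix $\rho \in (0, 1/2)$. First of all, by definition of the cumulants, for all $z \in \mathbb{C}$ such that $|z| < 1/2$ and for all $n \ge 1$,
$$\frac{\varphi_\infty'(z)}{\varphi_\infty(z)} - \frac{\varphi_n'(z)}{\varphi_n(z)} = \sum_{r=2}^{\infty} \frac{\kappa_r(F_\infty) - \kappa_r(F_n)}{(r-1)!} (i)^r z^{r-1}.$$
By orthogonality, it follows
$$\int_0^{2\pi} \left| \frac{\varphi_\infty'(\rho e^{i\theta})}{\varphi_\infty(\rho e^{i\theta})} - \frac{\varphi_n'(\rho e^{i\theta})}{\varphi_n(\rho e^{i\theta})} \right|^2 \frac{d\theta}{2\pi} = \sum_{r=2}^{+\infty} \frac{|\kappa_r(F_\infty) - \kappa_r(F_n)|^2}{(r-1)!^2} \rho^{2(r-1)}. \tag{2.39}$$
Then, truncating the series appearing on the right-hand side of (2.39) and using the fact that, for all $2 \le r \le 2q + 2$, $0 \le |\Theta_r| / 2^{(r-1)} \le \max_{2 \le r \le 2q+2} |\Theta_r| / 2^{(r-1)}$,
$$\int_0^{2\pi} \left| \frac{\varphi_\infty'(\rho e^{i\theta})}{\varphi_\infty(\rho e^{i\theta})} - \frac{\varphi_n'(\rho e^{i\theta})}{\varphi_n(\rho e^{i\theta})} \right|^2 \frac{d\theta}{2\pi} \ge C \sum_{r=2}^{2q+2} |\kappa_r(F_\infty) - \kappa_r(F_n)|^2 \frac{|\Theta_r|^2}{2^{2(r-1)} (r-1)!^2}$$
for some $C > 0$ only depending on $q$, $\rho$ and on $(\alpha_{\infty,1}, ..., \alpha_{\infty,q})$. Equality (2.16) of Lemma 2.1 concludes the proof of the proposition.

We are now ready to state the main result of this subsection.

Proposition 2.4. There exists a strictly positive constant $C_3 > 0$ such that for all $n \ge 1$,
$$W_2(F_n, F_\infty) \ge C_3\, \Delta(F_n). \tag{2.40}$$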

Proof. Let $\rho = 1/(8e)$. First of all, we note that for any $z \in \mathbb{C}$ such that $|z| = \rho$,
$$\left| \frac{\varphi_\infty'(z)}{\varphi_\infty(z)} - \frac{\varphi_n'(z)}{\varphi_n(z)} \right| \le \frac{1}{|\varphi_\infty(z)|} |\varphi_\infty'(z) - \varphi_n'(z)| + |\varphi_n'(z)| \left| \frac{1}{\varphi_n(z)} - \frac{1}{\varphi_\infty(z)} \right|. \tag{2.41}$$
Moreover, it is clear that the function $\varphi_\infty(z)$ is bounded away from 0 on the disk centered at the origin and with radius $\rho$. Regarding the function $\varphi_n(z)$, the following uniform bound (with $z$ on the disk centered at the origin and with radius $\rho$) holds true:
$$\left| \frac{1}{\varphi_n(z)} \right| \le \exp\left( \sum_{k=2}^{+\infty} \frac{1}{k!} |\kappa_k(F_n)| |z|^k \right), \tag{2.42}$$
$$\le \exp\left( \sum_{k=2}^{+\infty} \frac{2^{k-1}}{k} \rho^k \right) := \frac{e^{-\rho}}{\sqrt{1 - 2\rho}}. \tag{2.43}$$
Therefore,
$$\left| \frac{\varphi_\infty'(z)}{\varphi_\infty(z)} - \frac{\varphi_n'(z)}{\varphi_n(z)} \right| \le C_{4,\rho} |\varphi_\infty'(z) - \varphi_n'(z)| + C_{5,\rho} |\varphi_\infty(z) - \varphi_n(z)|,$$
for some strictly positive constants $C_{4,\rho}$ and $C_{5,\rho}$ (independent of $n$). Thus, using Proposition 2.2,
$$\left| \frac{\varphi_\infty'(z)}{\varphi_\infty(z)} - \frac{\varphi_n'(z)}{\varphi_n(z)} \right| \le \rho\, C_{6,\rho}\, W_2(F_n, F_\infty). \tag{2.44}$$

Proposition 2.3 concludes the proof of the proposition.

Remark 2.4.
• Combining Proposition 2.4 together with part (a) of Corollary 2.1, we obtain the fact that the convergence of $\Delta(F_n)$ to 0 is equivalent to the convergence of $W_2(F_n, F_\infty)$ to 0 when $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2\} = q$. This complements the results contained in [29, 2] (see in particular Theorem 2 of [2]). Moreover, recall that convergence of $W_2(F_n, F_\infty)$ to 0 is equivalent to convergence in distribution and convergence of the second moments. Therefore, when $\dim_{\mathbb{Q}} \operatorname{span}\{\alpha_{\infty,1}^2, \cdots, \alpha_{\infty,q}^2\} = q$ and $F_n$ and $F_\infty$ have unit variances, convergence in distribution of $F_n$ towards $F_\infty$ is equivalent to convergence of $\Delta(F_n)$ to 0.
• This justifies why we choose to study quantitative convergence results with respect to the Wasserstein-2 distance instead of other probability metrics such as Kolmogorov distance or 1-Wasserstein distance.

In the sequel, we provide a simple example for which it is possible to refine the previous lower bound. Let $(a_n)_{n \ge 1}$ be a sequence of positive real numbers strictly less than 1 which converges to 0 when $n$ tends to infinity such that
$$0 < a = \sup_{n \ge 1}(a_n) < 1. \tag{2.45}$$
Then, consider the random variables defined by, for all $n \ge 1$,
$$F_n = \sqrt{\frac{1 - a_n}{2}} (Z_1^2 - 1) + \sqrt{\frac{a_n}{2}} (Z_2^2 - 1), \qquad F_\infty = \frac{1}{\sqrt{2}} (Z^2 - 1). \tag{2.46}$$

Note that $\kappa_2(F_n) = \kappa_2(F_\infty) = 1$. First of all, let us find an asymptotic equivalent for $\Delta(F_n, F_\infty)$. By definition, for all $n \ge 1$,
$$\Delta(F_n, F_\infty) = \sum_{k=3}^{4} \frac{\Theta_k}{\kappa_k((Z^2 - 1))} \left( \kappa_k(F_n) - \kappa_k(F_\infty) \right).$$
Since $\Theta_3 = -\sqrt{2}$ and $\Theta_4 = 1$, for all $n \ge 1$,
$$\Delta(F_n, F_\infty) = -\sqrt{2} \left( \left( \frac{1 - a_n}{2} \right)^{\frac{3}{2}} + \left( \frac{a_n}{2} \right)^{\frac{3}{2}} - \left( \frac{1}{\sqrt{2}} \right)^{3} \right) + \left( \left( \frac{1 - a_n}{2} \right)^{2} + \left( \frac{a_n}{2} \right)^{2} - \left( \frac{1}{2} \right)^{2} \right).$$
Then, it is readily seen that
$$\Delta(F_n, F_\infty) \sim \frac{a_n}{4} \tag{2.47}$$
when $n$ tends to $+\infty$. In order to find a comparable lower bound for the Wasserstein-2 distance, we need the following technical lemma.

Lemma 2.3. Let $(F_n)_{n \ge 1}$ and $F_\infty$ be defined by (2.46) with respective characteristic functions $(\varphi_n)_{n \ge 1}$ and $\varphi_\infty$. Then, for all $n \ge 1$,
$$\sup_{t \in \mathbb{R} \setminus \{0\}} \frac{|\varphi_n(t) - \varphi_\infty(t)|}{|t|} \le W_2(F_n, F_\infty). \tag{2.48}$$

Proof. Let $T_n$ be as in the proof of Proposition 2.2 (Brenier Theorem). For all $n \ge 1$ and for all $t \in \mathbb{R}$,
$$\begin{aligned}
|\varphi_n(t) - \varphi_\infty(t)| &:= \left| \int_{\mathbb{R}} \left( e^{itT_n(x)} - e^{itx} \right) d\mu_\infty(x) \right|, \\
&\le \int_{\mathbb{R}} |e^{it(T_n(x) - x)} - 1| \, d\mu_\infty(x), \\
&\le |t| \int_{\mathbb{R}} |T_n(x) - x| \, d\mu_\infty(x), \\
&\le |t|\, W_2(F_n, F_\infty),
\end{aligned}$$
where we have used Cauchy-Schwarz inequality in the last inequality and the definition of $T_n$. This concludes the proof of the lemma.

Therefore, we have the following lower bound.

Lemma 2.4. Let $(F_n)_{n \ge 1}$ and $F_\infty$ be defined by (2.46). There exists a strictly positive constant $c$ such that, for $n$ large enough,
$$W_2(F_n, F_\infty) \ge c\, (a_n)^{\frac{3}{4}}. \tag{2.49}$$

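Proof. By straightforward computations, for all $n \ge 1$ and for all $t \in \mathbb{R}$,
$$\varphi_n(t) = \frac{e^{-it\sqrt{\frac{1 - a_n}{2}}}\, e^{-it\sqrt{\frac{a_n}{2}}}}{\sqrt{1 - 2it\sqrt{\frac{1 - a_n}{2}}}\, \sqrt{1 - 2it\sqrt{\frac{a_n}{2}}}}, \qquad \varphi_\infty(t) = \frac{e^{-it\frac{1}{\sqrt{2}}}}{\sqrt{1 - \sqrt{2}\, it}}. \tag{2.50}$$
Therefore, for all $t \ne 0$,
$$\begin{aligned}
|\varphi_n(t) - \varphi_\infty(t)| \ge \big| |\varphi_n(t)| - |\varphi_\infty(t)| \big| &:= \left| \frac{1}{((1 + 2t^2(1 - a_n))(1 + 2t^2 a_n))^{\frac{1}{4}}} - \frac{1}{(1 + 2t^2)^{\frac{1}{4}}} \right| \\
&\ge \left| \frac{1}{(1 + 4t^4(1 - a_n)a_n + 2t^2)^{\frac{1}{4}}} - \frac{1}{(1 + 2t^2)^{\frac{1}{4}}} \right| \\
&\ge \frac{1}{(1 + 2t^2)^{\frac{1}{4}}} \left| \frac{1}{\left(1 + \frac{4t^4(1 - a_n)a_n}{1 + 2t^2}\right)^{\frac{1}{4}}} - 1 \right|.
\end{aligned}$$
Now, selecting $t_n := \frac{1}{\sqrt{a_n}}$,
$$|\varphi_n(t_n) - \varphi_\infty(t_n)| \ge \frac{(a_n)^{\frac{1}{4}}}{(2 + a_n)^{\frac{1}{4}}} \left| \frac{1}{\left(1 + \frac{4(1 - a_n)}{a_n + 2}\right)^{\frac{1}{4}}} - 1 \right|. \tag{2.51}$$
The previous lower bound implies
$$\frac{|\varphi_n(t_n) - \varphi_\infty(t_n)|}{|t_n|} \ge \frac{(a_n)^{\frac{3}{4}}}{(2 + a_n)^{\frac{1}{4}}} \left| \frac{1}{\left(1 + \frac{4(1 - a_n)}{a_n + 2}\right)^{\frac{1}{4}}} - 1 \right|. \tag{2.52}$$
Moreover, thanks to (2.45), there exists a strictly positive constant $c > 0$ (independent of $n$) such that
$$\frac{1}{(2 + a_n)^{\frac{1}{4}}} \left| \frac{1}{\left(1 + \frac{4(1 - a_n)}{a_n + 2}\right)^{\frac{1}{4}}} - 1 \right| \ge c. \tag{2.53}$$
Then,
$$\frac{|\varphi_n(t_n) - \varphi_\infty(t_n)|}{|t_n|} \ge c\, (a_n)^{\frac{3}{4}}. \tag{2.54}$$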

Lemma 2.3 concludes the proof of the lemma.

Remark 2.5. Modifying the proof of Lemma 2.4 by choosing $t_n = (1/a_n)^\beta$ for some $\beta > 0$ produces lower bounds with different rates of convergence to 0. Indeed, one can check that the exponent of the resulting lower bound (denoted by $\chi(\beta)$) is given by
$$\chi(\beta) = \begin{cases} 1 - \frac{\beta}{2}, & \beta \in (0, \frac{1}{2}], \\ \frac{3\beta}{2}, & \beta \in [\frac{1}{2}, +\infty). \end{cases} \tag{2.55}$$
Thus, $\beta = \frac{1}{2}$ corresponds to the scale which most reduces the gap between the lower and the upper scaling exponents.
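The two scalings of the example (2.46) can be observed numerically. The sketch below is an illustration with names of our choosing, not code from the paper. It evaluates the closed form of $\Delta(F_n, F_\infty)$, which should behave like $a_n/4$ by (2.47), and the ratio $|\varphi_n(t_n) - \varphi_\infty(t_n)| / (|t_n|\, a_n^{3/4})$ with $t_n = a_n^{-1/2}$, which should stay bounded away from 0 by (2.54). It relies on the characteristic function $e^{-itc}/\sqrt{1 - 2itc}$ of $c(Z^2 - 1)$.

```python
import cmath
import math


def delta(a):
    """Closed form of Delta(F_n, F_inf) for the example (2.46), with a = a_n."""
    cubic = ((1 - a) / 2) ** 1.5 + (a / 2) ** 1.5 - (1 / math.sqrt(2)) ** 3
    quartic = ((1 - a) / 2) ** 2 + (a / 2) ** 2 - 0.25
    return -math.sqrt(2) * cubic + quartic


def char_fn(coeffs, t):
    """Characteristic function of sum_k c_k (Z_k^2 - 1) at a real point t."""
    out = 1.0 + 0.0j
    for c in coeffs:
        # 1 - 2itc has positive real part, so the principal square root applies.
        out *= cmath.exp(-1j * t * c) / cmath.sqrt(1 - 2j * t * c)
    return out


def lower_bound_ratio(a):
    """|phi_n(t_n) - phi_inf(t_n)| / (t_n * a^(3/4)) with t_n = a^(-1/2)."""
    t = a ** -0.5
    phi_n = char_fn([math.sqrt((1 - a) / 2), math.sqrt(a / 2)], t)
    phi_inf = char_fn([1 / math.sqrt(2)], t)
    return abs(phi_n - phi_inf) / (t * a ** 0.75)


for a in (1e-2, 1e-3, 1e-4):
    print(a, delta(a) / (a / 4), lower_bound_ratio(a))
```

The second printed column should approach 1 as $a$ decreases, and the third should remain of order one, in line with the exponents $1$ (for $\Delta$) and $3/4$ (for the lower bound) discussed above.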

2.5 Comparison with the Malliavin–Stein method for the variance-Gamma

Recall that the target distributions we are interested in take the form
$$F_\infty = \sum_{i=1}^{q} \alpha_{\infty,i} (Z_i^2 - 1), \tag{2.56}$$
where $q \ge 1$, $\{Z_i\}_{i=1}^{q}$ are i.i.d. $N(0, 1)$ random variables, and the coefficients $\{\alpha_{\infty,i}\}_{i=1}^{q}$ are non-zero. We stress that $q$ in representation (2.56) cannot be infinity. The aim of this section is to study the connections between the class of our target distributions given as (2.56) and the so-called variance-gamma class of probability distributions, and to compare our quantitative bound in Corollary 2.1 with the bounds recently obtained in [11] using the Malliavin–Stein method. First, we recall some basic facts that we need on the variance-gamma probability distributions. For detailed information, we refer the reader to [12, 15] and references therein. The random variable $X$ is said to have a variance-gamma probability distribution with parameters $r > 0$, $\theta \in \mathbb{R}$, $\sigma > 0$, $\mu \in \mathbb{R}$ if and only if its probability density function is given by
$$p_{VG}(x; r, \theta, \sigma, \mu) = \frac{1}{\sigma \sqrt{\pi}\, \Gamma(\frac{r}{2})} e^{\frac{\theta}{\sigma^2}(x - \mu)} \left( \frac{|x - \mu|}{2\sqrt{\theta^2 + \sigma^2}} \right)^{\frac{r-1}{2}} K_{\frac{r-1}{2}}\left( \frac{\sqrt{\theta^2 + \sigma^2}}{\sigma^2} |x - \mu| \right),$$
where $x \in \mathbb{R}$, and $K_\nu(x)$ is a modified Bessel function of the second kind, and we write $X \sim VG(r, \theta, \sigma, \mu)$. Also, it is known that for $X \sim VG(r, \theta, \sigma, \mu)$ (see for example relation (2.3) in [15])
$$\mathbb{E}(X) = \mu + r\theta, \quad \text{and} \quad \mathrm{Var}(X) = r(\sigma^2 + 2\theta^2). \tag{2.57}$$

Lemma 2.5. (a) Let $Z_1, Z_2 \sim N(0, 1)$ be independent and let $\alpha_{\infty,1}, \alpha_{\infty,2} > 0$. Then,
$$F_\infty = \alpha_{\infty,1}(Z_1^2 - 1) - \alpha_{\infty,2}(Z_2^2 - 1) \overset{d}{=} VG(1, \alpha_{\infty,1} - \alpha_{\infty,2}, 2\sqrt{\alpha_{\infty,1}\alpha_{\infty,2}}, \alpha_{\infty,2} - \alpha_{\infty,1}).$$
(b) Let $q = 2$. Then, $F_\infty$ defined by (2.56) with $\alpha_{\infty,1}, \alpha_{\infty,2} > 0$ (or with $\alpha_{\infty,1}, \alpha_{\infty,2} < 0$) cannot belong to the variance-gamma class.
(c) Let $q \ge 3$. Then, $F_\infty$ defined by (2.56) cannot belong to the variance-gamma class.

Proof. (a) Set
$$X = \alpha_{\infty,1} Z_1^2 \overset{d}{=} \gamma\left( \frac{1}{2}, \frac{1}{2\alpha_{\infty,1}} \right), \quad \text{and} \quad Y = \alpha_{\infty,2} Z_2^2 \overset{d}{=} \gamma\left( \frac{1}{2}, \frac{1}{2\alpha_{\infty,2}} \right),$$
where $\gamma(\alpha, \beta)$ denotes a gamma random variable with parameters $\alpha, \beta > 0$. Then, the claim follows directly from part (v) in [12, Proposition 3.8]. (b) and (c) also follow directly using a straightforward comparison between the characteristic function of $F_\infty$ and the one of the variance-gamma random variable (see, e.g. [24, page 83]).

Next, we want to compare our bound in Corollary 2.1 with the bound in [11] obtained using the Malliavin–Stein method. A good starting point for such a comparison is the right hand side of equation (4.1) in [11, Theorem 4.1]. This is because the bound in [11, Corollary 5.10, part (a)] is obtained from the right hand side of equation (4.1) in [11, Theorem 4.1] by norms of contraction operators. By virtue of Lemma 2.5, in order for $F_\infty$ as in (2.56) to belong to the variance-gamma class, it is necessary to have $r = 1$ and $q = 2$. Letting $r = 1$ in the right hand side of equation (4.1) in [11, Theorem 4.1], and taking into account that $\kappa_2(F_\infty) = \sigma^2 + 2\theta^2$, and $\kappa_3(F_\infty) = 2\theta(3\sigma^2 + 4\theta^2)$, for an element $F$ in the second Wiener chaos associated to the underlying isonormal process $X$, we arrive at
$$\begin{aligned}
W_1(F, F_\infty) &\le C_1\, \mathbb{E}\left| \Gamma_2(F) - 2\theta\, \Gamma_1(F) - \sigma^2(F + \theta) \right| + C_2 |\kappa_2(F) - \kappa_2(F_\infty)| \\
&\le C_1\, \mathbb{E}\left| \sum_{r=1}^{3} \frac{P^{(r)}(0)}{r!\, 2^{r-1}} \left( \Gamma_{r-1}(F) - \mathbb{E}(\Gamma_{r-1}(F)) \right) \right| + C_2 |\kappa_3(F) - 4\theta \kappa_2(F) - 2\sigma^2\theta| + C_3 |\kappa_2(F) - \kappa_2(F_\infty)| \\
&\le C_1 \sqrt{\Delta(F)} + C_2 \sum_{r=2}^{3} |\kappa_r(F) - \kappa_r(F_\infty)|.
\end{aligned}$$

The last inequality is derived from Cauchy-Schwarz inequality together with [2, Lemma 3.1] where we used the fact that F belongs to the second Wiener chaos.
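The identification in Lemma 2.5 (a) can be cross-checked on low-order cumulants. The following sketch, with illustrative names and arbitrary test values for $\alpha_{\infty,1}, \alpha_{\infty,2}$, compares the mean and variance given by (2.57), and the third cumulant $2\theta(3\sigma^2 + 4\theta^2)$ quoted above for $r = 1$, with the cumulants $2^{r-1}(r-1)!\,(\alpha_{\infty,1}^r + (-\alpha_{\infty,2})^r)$ obtained from Proposition 2.1.

```python
from math import factorial, isclose, sqrt


def chaos_cumulant(coeffs, r):
    """kappa_r of sum_k c_k (Z_k^2 - 1), by item 2 of Proposition 2.1."""
    return 2 ** (r - 1) * factorial(r - 1) * sum(c ** r for c in coeffs)


def vg_parameters(alpha1, alpha2):
    """(r, theta, sigma, mu) of Lemma 2.5 (a), for alpha1, alpha2 > 0."""
    return 1, alpha1 - alpha2, 2 * sqrt(alpha1 * alpha2), alpha2 - alpha1


def vg_low_order(r, theta, sigma, mu):
    """Mean and variance from (2.57); third cumulant as quoted for r = 1."""
    mean = mu + r * theta
    variance = r * (sigma ** 2 + 2 * theta ** 2)
    kappa3 = 2 * theta * (3 * sigma ** 2 + 4 * theta ** 2)
    return mean, variance, kappa3


alpha1, alpha2 = 0.7, 0.3  # arbitrary positive test values
mean, variance, kappa3 = vg_low_order(*vg_parameters(alpha1, alpha2))
coeffs = [alpha1, -alpha2]  # F_inf = alpha1 (Z_1^2 - 1) - alpha2 (Z_2^2 - 1)
print(mean, variance, chaos_cumulant(coeffs, 2))
print(kappa3, chaos_cumulant(coeffs, 3))
```

Agreement of the first three cumulants does not by itself identify the law, but it is a quick way to catch a sign or scaling slip in the parameters.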

3 Applications

3.1 An example from U-statistics

Under some degeneracy conditions, it is possible to observe the appearance of limiting distributions of the form $\sum_{k \ge 1} \alpha_{\infty,k}(Z_k^2 - 1)$ in the context of $U$-statistics. In this example, we restrict our attention to second order $U$-statistics. We refer the reader to [33, Chapter 5.5 Section 5.5.2] or to [18, Chapter 11 Corollary 11.5] for full generality. Let $(Z_i = I_1(h_i))_{i \ge 1}$ be a sequence of i.i.d. standard normal random variables supported by the isonormal Gaussian process $X$, where $I_1$ is the Wiener-Itô integral of order 1 and $(h_i)_{i \ge 1}$ is an orthonormal basis of $H$. Let $a \ne 0$ be a real number. Let $(U_n)_{n \ge 1}$ be the second order $U$-statistic with degeneracy of order 1 such that, for all $n \ge 1$,
$$U_n = \frac{2a}{n(n-1)} \sum_{1 \le i < j \le n} Z_i Z_j = I_2\left( \frac{2a}{n(n-1)} \sum_{1 \le i < j \le n} h_i \hat{\otimes} h_j \right),$$
where $I_2$ denotes the second order Wiener-Itô integral and $\hat{\otimes}$ the symmetric tensor product. A direct application of Theorem 5.5.2 in [33] gives
$$n U_n \Rightarrow a(Z_1^2 - 1)$$
where $\Rightarrow$ stands for convergence in distribution. Using Corollary 2.1, the following result holds true.

Corollary 3.1. For all $n \ge 3$,
$$W_2(nU_n, a(Z_1^2 - 1)) \le C \left( a^2 \sqrt{\frac{n(n-3)}{(n-1)^3}} + \frac{2a^2}{n-1} \right).$$
For $n$ large enough,
$$W_2(nU_n, a(Z_1^2 - 1)) = O\left( \frac{1}{\sqrt{n}} \right).$$

Proof. By Corollary 2.1,
$$W_2(nU_n, a(Z_1^2 - 1)) \le C \left( \sqrt{\Delta(nU_n)} + |\kappa_2(nU_n) - \kappa_2(a(Z_1^2 - 1))| \right).$$
But,
$$\begin{aligned}
\kappa_2(a(Z_1^2 - 1)) &= 2a^2, \\
\kappa_2(nU_n) &= \frac{2a^2 n}{n-1}, \\
\Delta(nU_n) &= \sum_{r=2}^{4} \frac{\Theta_r}{\kappa_r(Z_1^2 - 1)} \kappa_r(nU_n), \\
\Delta(nU_n) &= \sum_{r=2}^{4} \frac{\Theta_r}{\kappa_r(Z_1^2 - 1)} \left( \kappa_r(nU_n) - \kappa_r(a(Z_1^2 - 1)) \right).
\end{aligned}$$

In order to obtain an explicit rate of convergence, we have to compute the cumulants of order 3 and 4 of the random variable $nU_n$. Since $nU_n$ is in the second order Wiener chaos, the following formula holds true, for all $r \ge 2$ and for any $f \in H^{\odot 2}$,
$$\kappa_r(I_2(f)) = 2^{r-1} (r-1)!\, \langle f \otimes_1 \cdots \otimes_1 f ; f \rangle, \tag{3.1}$$
where there are $r - 1$ copies of $f$ in $f \otimes_1 \cdots \otimes_1 f$ and where $\otimes_1$ is the contraction of order 1. Note that, for all $r \ge 2$,
$$\kappa_r(a(Z_1^2 - 1)) = a^r 2^{r-1} (r-1)!.$$
Let us compute the third and the fourth cumulants of $nU_n$. By formula (3.1),
$$\kappa_3(nU_n) = \frac{2^6 a^3}{(n-1)^3} \langle f_n \otimes_1 f_n ; f_n \rangle,$$
with $f_n = \sum_{1 \le i < j \le n} h_i \hat{\otimes} h_j$. By standard computations,
$$f_n \otimes_1 f_n = \frac{1}{16} \sum_{i \ne j} \sum_{k \ne l} \left( \delta_{ik}\, h_j \otimes h_l + \delta_{il}\, h_j \otimes h_k + \delta_{jk}\, h_i \otimes h_l + \delta_{jl}\, h_i \otimes h_k \right).$$
We denote by $(I)$, $(II)$, $(III)$ and $(IV)$ the four associated double sums. The scalar product of $(I)$ with $f_n$ gives
$$\begin{aligned}
\langle (I) ; f_n \rangle &= \frac{1}{32} \sum_{i \ne j} \sum_{k \ne l} \sum_{m \ne o} \delta_{ik} \langle h_j \otimes h_l ; h_m \hat{\otimes} h_o \rangle, \\
&= \frac{1}{32} \sum_{i \ne j} \sum_{k \ne l} \sum_{m \ne o} \delta_{ik} \left( \frac{1}{2} \langle h_j \otimes h_l ; h_m \otimes h_o \rangle + \frac{1}{2} \langle h_j \otimes h_l ; h_o \otimes h_m \rangle \right), \\
&= \frac{1}{64} \sum_{i \ne j} \sum_{k \ne l} \sum_{m \ne o} \delta_{ik} \left( \delta_{jm} \delta_{lo} + \delta_{jo} \delta_{lm} \right), \\
&= \frac{1}{32} n(n-1)(n-2).
\end{aligned}$$
The three other terms contribute in a similar way. Thus,
$$\kappa_3(nU_n) = \frac{2^3 a^3}{(n-1)^2} n(n-2). \tag{3.2}$$
Similar computations for the fourth cumulants of nUn lead to κ4 (nUn ) =

23 a4 3! n(n − 2)(n − 3). (n − 1)3

Using the facts that Θ2 = a2 , Θ3 = −2a and Θ4 = 1, ∆(nUn ) = =

4 X

r=2 a4

  Θr 2 κr (nUn ) − κr (a(Z1 − 1)) , κr (Z12 − 1) +

2a4 a4 + [1 + (3 − 2n)n], (n − 1)2 (n − 1)3

n−1 a4 = n(n − 3). (n − 1)3 The result then follows.

3.2 Application to some quadratic forms

In this example, we are interested in the asymptotic distribution of sequences of some specific quadratic forms. More precisely, for all $n \ge 1$,
$$Q_n(Z) = \sum_{i,j=1}^{n} a_{i,j}(n) Z_i Z_j$$
where $A_n = (a_{i,j}(n))_{1 \le i,j \le n}$ is an $n \times n$ real-valued symmetric matrix and $(Z_i)_{i \ge 1}$ an i.i.d. sequence of standard normal random variables. A full description of the limiting distributions for this type of sequences is contained in [34]. In particular, it is possible to observe the appearance of limiting distributions of the form $\sum_{k \ge 1} \alpha_{\infty,k}(Z_k^2 - 1)$. Sufficient conditions for such an appearance have been introduced in [37]. Let $(\lambda_m)_{1 \le m \le q}$ be $q$ distinct non-zero real numbers. The following assumptions hold throughout this subsection:

• Let $(b_i^m(n))_{1 \le i \le n, 1 \le m \le q}$ be a sequence of real numbers such that for all $1 \le m, k \le q$
$$\sum_{i=1}^{n} b_i^m(n)\, b_i^k(n) \underset{n \to +\infty}{\longrightarrow} \delta_{k,m}, \qquad \exists b > 0,\ \forall i, m, n, \quad \sqrt{n}\, |b_i^m(n)| \le b < +\infty.$$

• For all $1 \le m \le q$
$$\sum_{i,j=1}^{n} a_{i,j}(n)\, b_i^m(n)\, b_j^m(n) \underset{n \to +\infty}{\longrightarrow} \lambda_m.$$

• Finally,
$$\sum_{i,j=1}^{n} a_{i,j}(n)^2 \underset{n \to +\infty}{\longrightarrow} \sum_{m=1}^{q} \lambda_m^2.$$

Under these assumptions, it is natural to expect a non-standard asymptotic behavior for the sequence $(Q_n(Z) - \mathbb{E}Q_n(Z))_{n \ge 1}$. Indeed, since, for all $n \ge 1$, $A_n$ is a real valued symmetric matrix, it is diagonalizable with real eigenvalues $(\lambda_k(n))_{1 \le k \le n}$. The previous assumptions suggest that, asymptotically, only a finite number of eigenvalues $(\lambda_k(n))_{1 \le k \le n}$ are different from zero. Then, as $n$ tends to $+\infty$, only a finite number of coordinates of the Gaussian random vector $(Z_k)_{1 \le k \le n}$ has non-null contributions to the quadratic form $Q_n(Z)$. Such different asymptotic contributions rule out the possibility of a normal behavior at infinity.

In order to fit the assumptions of Corollary 2.1, we renormalize the quadratic form $Q_n$. We denote by $\tilde{Q}_n$ the quadratic form associated with the matrix $\tilde{A}_n$ defined by, for all $n \ge 1$ and for all $1 \le i, j \le n$,
$$\tilde{a}_{i,j}(n) = \frac{a_{i,j}(n)}{\left( \sum_{i,j=1}^{n} a_{i,j}(n)^2 \right)^{\frac{1}{2}}}.$$
In particular, for all $m \ge 1$,
$$\sum_{i,j=1}^{n} \tilde{a}_{i,j}(n)\, b_i^m(n)\, b_j^m(n) \underset{n \to +\infty}{\longrightarrow} \tilde{\lambda}_m = \frac{\lambda_m}{\left( \sum_{m=1}^{q} \lambda_m^2 \right)^{\frac{1}{2}}},$$
and $\sum_{i,j}^{n} \tilde{a}_{i,j}(n)^2 = \sum_{m=1}^{q} \tilde{\lambda}_m^2 = 1$. By Theorem 2 of [37], the following convergence in distribution holds true:
$$\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)] \Rightarrow \tilde{Q}_\infty = \sum_{m=1}^{q} \tilde{\lambda}_m (Z_m^2 - 1).$$
As in the previous example, we assume that the sequence $(Z_i)_{i \ge 1}$ is a sequence of independent standard normal random variables induced by a Gaussian isonormal process so that the following representation holds true, for all $n \ge 1$,
$$\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)] = I_2\left( \sum_{i,j=1}^{n} \tilde{a}_{i,j}(n)\, h_i \hat{\otimes} h_j \right),$$
with $Z_i = I_1(h_i)$, for all $i \ge 1$. Applying Corollary 2.1, we obtain below an explicit rate of convergence for the previous limit theorem in Wasserstein-2 distance. For this purpose we need to compute the cumulants of order $r$ of $\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)]$ for $r \in \{2, ..., 2q + 2\}$. Using the fact that the $Z_i$'s are i.i.d. standard normal, for all $r \in \{2, ..., 2q + 2\}$,
$$\kappa_r(\tilde{Q}_n(Z)) = 2^{r-1} (r-1)!\, \mathrm{Tr}(\tilde{A}_n^r).$$
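The trace formula makes the cumulants easy to evaluate numerically. As a hedged illustration, the sketch below builds a matrix of the kernel type $a_{i,j}(n) = \frac{1}{n} \sum_m \lambda_m e_m(i/n) e_m(j/n)$ for a hypothetical choice of two orthonormal functions on $(0, 1)$ and of $\lambda = (1, 1/2)$, normalizes it by its Hilbert-Schmidt norm as above, and measures the gap $|\mathrm{Tr}(\tilde{A}_n^r) - \sum_m \tilde{\lambda}_m^r|$, which is expected to shrink as $n$ grows. All names and numerical choices are assumptions made for the example.

```python
import math


def kernel_matrix(n, lambdas, basis):
    """a_{i,j}(n) = (1/n) sum_m lambda_m e_m(i/n) e_m(j/n), for 1 <= i, j <= n."""
    pts = [(i + 1) / n for i in range(n)]
    return [[sum(l * e(x) * e(y) for l, e in zip(lambdas, basis)) / n for y in pts]
            for x in pts]


def normalize(a):
    """Divide every entry by the Hilbert-Schmidt norm of the matrix."""
    norm = math.sqrt(sum(v * v for row in a for v in row))
    return [[v / norm for v in row] for row in a]


def trace_of_power(a, r):
    """Tr(a^r) for a square matrix given as a list of rows, r >= 1."""
    n = len(a)
    p = a
    for _ in range(r - 1):
        p = [[sum(p[i][k] * a[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(p[i][i] for i in range(n))


def trace_gap(n, lambdas, basis, r):
    """|Tr(A_tilde_n^r) - sum_m lambda_tilde_m^r| for the normalized matrix."""
    a_tilde = normalize(kernel_matrix(n, lambdas, basis))
    scale = math.sqrt(sum(l * l for l in lambdas))
    target = sum((l / scale) ** r for l in lambdas)
    return abs(trace_of_power(a_tilde, r) - target)


# Hypothetical orthonormal pair in L^2(0, 1): constant and shifted Legendre.
BASIS = [lambda x: 1.0, lambda x: math.sqrt(3) * (2 * x - 1)]
LAMBDAS = [1.0, 0.5]

for n in (10, 20, 40):
    print(n, trace_gap(n, LAMBDAS, BASIS, 3))
```

For $r = 2$ the gap vanishes up to rounding because of the normalization, so the first informative order is $r = 3$.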

Combining the previous formula together with Corollary 2.1, we obtain the following bound on the Wasserstein-2 distance between $\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)]$ and $\tilde{Q}_\infty$:
$$W_2(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty) \le C \left( \sqrt{ \sum_{r=2}^{2q+2} \Theta_r \left( \mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r \right) } + \sum_{r=2}^{q+1} \left| \mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r \right| \right). \tag{3.3}$$
Thanks to this bound, we can obtain explicit rates of convergence for some more specific examples. In the sequel, we denote by $C^\alpha([0, 1])$ the space of Hölder continuous real-valued functions of order $\alpha \in (0, 1]$ on $[0, 1]$. We have the following result.

Corollary 3.2. Let $q \ge 1$ and $(e_m)_{1 \le m \le q}$ be $q$ distinct orthonormal functions of $L^2(0, 1)$ such that $e_m \in C^\alpha([0, 1])$ for some $\alpha \in (0, 1]$, for all $m \ge 1$. Let $K_q(., .)$ be the square integrable kernel defined by, for all $(x, y) \in (0, 1) \times (0, 1)$,
$$K_q(x, y) = \sum_{m=1}^{q} \lambda_m e_m(x) e_m(y)$$
and let $A_n$ be the $n \times n$ matrix defined by, for all $1 \le i, j \le n$ and for all $n \ge 1$,
$$a_{i,j}(n) = \frac{1}{n} K_q\left( \frac{i}{n}, \frac{j}{n} \right).$$
Then, for $n$ large enough,
$$W_2(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty) = O\left( \frac{1}{n^{\frac{\alpha}{2}}} \right).$$
√ i Proof. First of all, choosing bm i (n) = em ( n )/ n, we note that the assumptions of the non-central limit are verified so that the corresponding quadratic form converges Ptheorem q 2 − 1). Let us work out the bound (3.3) in order to obtain an ˜ in law towards m=1 λm (Zm explicit rate of convergence. By standard computations, for all r ≥ 2 X Tr (Arn ) = ai1 ,i2 (n)...air ,i1 (n), i1 ,...,ir

X

=

λm1 ...λmr

m1 ,...,mr

=

X m

+

1 X i1 i2 ir i1 em1 ( )em1 ( )....emr ( )emr ( ), r n n n n n i1 ,...,ir

1 X 2 i1 ir λrm r em ( )...e2m ( ) n n n i1 ,...,ir

0 X

λm1 ...λmr

m1 ,...,mr

1 X i2 ir i1 i1 em1 ( )em1 ( )....emr ( )emr ( ), r n n n n n i1 ,...,ir

P0 where means that we have excluded the hyper diagonal ∆r = {(m1 , ..., mr ) ∈ {1, ..., q}r , m1 = ... = mr }. Thus, Tr (Arn )



q X

λrm

=

X

λrm

m

m=1

+

0 X

 X r  n 1 2 i em ( ) − 1 n n

m1 ,...,mr

i=1

λm1 ...λmr

1 X i1 i2 ir i1 em1 ( )em1 ( )....emr ( )emr ( ). nr n n n n i1 ,...,ir

24

Note that the second term tends to 0 as n tends to +∞ since we have excluded the hyper diagonal ∆r and that 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

i1 1 X i2 ir i1 em1 ( )em1 ( )....emr ( )emr ( ) → δm1 ,m2 ...δmr ,m1 . r n n n n n i1 ,...,ir

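Since $e_m \in C^\alpha([0, 1])$, for all $1 \le m \le q$,
$$\left| \left( \frac{1}{n} \sum_{i=1}^{n} e_m^2\left(\frac{i}{n}\right) \right)^r - 1 \right| = O\left( \frac{1}{n^\alpha} \right).$$
Similarly, using the fact that $e_m \in C^\alpha([0, 1])$, it is straightforward to see that the second term is $O(1/n^\alpha)$. Now, note that
$$\left| \mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r \right| = O\left( \frac{1}{n^\alpha} \right),$$
since for $r \ge 2$,
$$\mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r = \frac{\mathrm{Tr}(A_n^r) - \sum_{m=1}^{q} \lambda_m^r}{\left( \sum_{i,j} a_{i,j}^2 \right)^{\frac{r}{2}}} + \sum_{m=1}^{q} \lambda_m^r \left( -\frac{1}{\left( \sum_{m=1}^{q} \lambda_m^2 \right)^{\frac{r}{2}}} + \frac{1}{\left( \sum_{i,j} a_{i,j}^2 \right)^{\frac{r}{2}}} \right).$$
The result then follows.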

Remark 3.1. Theorem 2 of [37] is actually more general than the particular instance we have displayed since it holds for quadratic forms defined by, for all $n \ge 1$,
$$\tilde{Q}_n(X) = \sum_{i,j=1}^{n} \tilde{a}_{i,j}(n) X_i X_j$$
where $(X_i)_{i \ge 1}$ is an i.i.d. sequence of centered random variables such that $\mathbb{E}[X_1^2] = 1$ and $\mathbb{E}[X_1^4] < +\infty$. Furthermore, since the works of Rotar' [32], it is known that $\tilde{Q}_n(X)$ exhibits the same asymptotic behavior as $\tilde{Q}_n(Z)$, and explicit rates of approximation have been obtained in Kolmogorov distance (see e.g. [17] and more generally [25] Theorems 2.1 and 2.2).

We end this subsection with a universality result as announced in the previous remark. We assume that $(X_i)_{i \ge 1}$ is an i.i.d. sequence of centered random variables such that $\mathbb{E}[X_i^2] = 1$ and $\mathbb{E}[X_i^4] < +\infty$. First of all, as a direct application of Theorem 2.1 of [25], we obtain an explicit bound of approximation between $\tilde{Q}_n(X)$ and $\tilde{Q}_n(Z)$ in Kolmogorov distance.

Corollary 3.3. Under the previous assumptions, there exists $C > 0$ such that, for all $n \ge 1$,
$$d_{Kol}(\tilde{Q}_n(X), \tilde{Q}_n(Z)) \le \frac{C}{n^{\frac{1}{16}}}.$$

Proof. Since the sequence $(X_i)_{i \ge 1}$ is an i.i.d. sequence of centered random variables with unit variance and finite 4th moment, we have in particular that $\mathbb{E}[|X_i|^3] = \mathbb{E}[|X_1|^3] = \beta < +\infty$. Moreover,
$$\tilde{Q}_n(x_1, ..., x_n) = \sum_{i,j=1}^{n} \tilde{a}_{i,j}(n) x_i x_j = \sum_{S \subset [n]} c_S \prod_{i \in S} x_i,$$
with
$$c_S = \begin{cases} 0, & |S| \ne 2, \\ \tilde{a}_{i,j}(n), & S = \{i, j\}. \end{cases} \tag{3.4}$$
It is clear that $\sum_{S \subset [n]} c_S^2 = 1$. Moreover, for any $i \in \{1, ..., n\}$,
$$\begin{aligned}
\sum_{S \ni i} c_S^2 &= \sum_{j=1}^{n} \tilde{a}_{i,j}(n)^2, \\
&= \sum_{j=1}^{n} \frac{a_{i,j}^2(n)}{\sum_{i,j} a_{i,j}^2(n)}, \\
&= \frac{1}{\sum_{i,j} a_{i,j}^2(n)} \frac{1}{n^2} \sum_{j=1}^{n} \left( \sum_{m} \lambda_m e_m\left(\frac{i}{n}\right) e_m\left(\frac{j}{n}\right) \right)^2, \\
&\le \frac{q}{\sum_{i,j} a_{i,j}^2(n)} \frac{1}{n^2} \sum_{j=1}^{n} \sum_{m} \lambda_m^2 e_m^2\left(\frac{i}{n}\right) e_m^2\left(\frac{j}{n}\right), \\
&\le \frac{q\, C_1}{\sum_{i,j} a_{i,j}^2(n)} \frac{1}{n} \sum_{m} \lambda_m^2 \left( \frac{1}{n} \sum_{j=1}^{n} e_m^2\left(\frac{j}{n}\right) \right),
\end{aligned}$$

where we have used the fact that the $e_m$ are bounded on $[0, 1]$. Now, as $n$ tends to $+\infty$,
$$\sum_{i,j} a_{i,j}(n)^2 \longrightarrow \sum_{m} \lambda_m^2, \qquad \sum_{m} \lambda_m^2 \left( \frac{1}{n} \sum_{j=1}^{n} e_m^2\left(\frac{j}{n}\right) \right) \longrightarrow \sum_{m} \lambda_m^2 \int_0^1 e_m^2(x)\, dx = \sum_{m} \lambda_m^2.$$
Thus, for some $C_2 > 0$,
$$\sum_{S \ni i} c_S^2 \le \frac{C_2}{n}.$$
Applying Theorem 2.1 of [25],
$$d_{Kol}(\tilde{Q}_n(X), \tilde{Q}_n(Z)) \le \frac{C}{n^{\frac{1}{16}}}$$
which concludes the proof of the corollary.

In order to obtain a rate for the Kolmogorov distance, we combine the previous corollary with Corollary 3.2 and with the fact that the Kolmogorov distance admits the following bound when the density of the target law is bounded (see e.g. Theorem 3.3 of [9]):
$$d_{Kol}(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty) \le C \sqrt{W_2(\tilde{Q}_n(Z) - \mathbb{E}[\tilde{Q}_n(Z)], \tilde{Q}_\infty)}.$$
$\tilde{Q}_\infty$ admits a bounded density as soon as $q$ is large enough. In this regard, we have the following lemma.

Lemma 3.1. Let $q \ge 3$. Let $X$ be a random variable such that
$$X \overset{d}{=} \sum_{j=1}^{q} \lambda_j (Z_j^2 - 1), \tag{3.5}$$
with $\{\lambda_j\}_{1 \le j \le q}$ non-zero real numbers. Then, $X$ has a bounded density.

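Proof. The proof is standard, so we only sketch it. The characteristic function of $X$ is given, for all $\xi \in \mathbb{R}$, by
$$ \varphi_X(\xi) = \prod_{j=1}^{q} \frac{\exp(-i \xi \lambda_j)}{\sqrt{1 - 2 i \xi \lambda_j}}. \tag{3.6} $$
Let $\lambda_{\min} = \min_j |\lambda_j| \neq 0$. Each factor in (3.6) has modulus $(1 + 4 \xi^2 \lambda_j^2)^{-1/4} \le (1 + 4 \xi^2 \lambda_{\min}^2)^{-1/4}$, so that, for all $\xi \in \mathbb{R}$,
$$ |\varphi_X(\xi)| \le \frac{1}{(1 + 4 \xi^2 \lambda_{\min}^2)^{q/4}}. \tag{3.7} $$
Since $q \ge 3$, the right-hand side decays like $|\xi|^{-q/2}$ with $q/2 > 1$, and we deduce from the previous inequality that $\varphi_X$ is in $L^1(\mathbb{R})$. Thus, we can apply the Fourier inversion formula to obtain the following bound on the density function of $X$, denoted by $f_X$:
$$ \|f_X\|_\infty \le \|\varphi_X\|_{L^1(\mathbb{R})} < \infty. \tag{3.8} $$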

This concludes the proof of the lemma.

Therefore, we conclude this example with the following result.

Theorem 3.1. Under the previous assumptions, for all $n \ge 1$,
$$ d_{\mathrm{Kol}}\big(\tilde{Q}_n(X) - \mathbb{E}[\tilde{Q}_n(X)], \tilde{Q}_\infty\big) \le \frac{C_1}{n^{1/16}} + \frac{C_2}{n^{\alpha/4}}, $$
with $C_1, C_2 > 0$ not depending on $n$.

Remark 3.2. We would like to mention that it is possible to combine inequality (3.3) with Theorem 2.1 of [25] to obtain a general bound in Kolmogorov distance for $q \ge 3$. Let $\tau_n$ be the quantity defined, for all $n \ge 1$, by
$$ \tau_n = \max_{1 \le i \le n} \sum_{j=1}^{n} \tilde{a}_{i,j}(n)^2. $$

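Then, for all $n \ge 1$,
$$ d_{\mathrm{Kol}}\big(\tilde{Q}_n(X) - \mathbb{E}[\tilde{Q}_n(X)], \tilde{Q}_\infty\big) \le C_1 (\tau_n)^{1/16} + C_2 \Bigg( \sqrt{ \sum_{r=2}^{2q+2} \Theta_r \Big( \mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r \Big) } + \sum_{r=2}^{q+1} \Big| \mathrm{Tr}(\tilde{A}_n^r) - \sum_{m=1}^{q} \tilde{\lambda}_m^r \Big| \Bigg)^{1/2}. $$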
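The decay estimate (3.7) used in the proof of Lemma 3.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the coefficients in `lambdas` are hypothetical values chosen only for illustration, and the function names are ours.

```python
import cmath

def char_fn(xi, lambdas):
    """Characteristic function (3.6) of X = sum_j lambda_j (Z_j^2 - 1)."""
    phi = 1.0 + 0.0j
    for lam in lambdas:
        # Re(1 - 2i*xi*lam) = 1 > 0, so the principal square root is continuous in xi.
        phi *= cmath.exp(-1j * xi * lam) / cmath.sqrt(1 - 2j * xi * lam)
    return phi

def envelope(xi, lambdas):
    """Right-hand side of (3.7): (1 + 4 xi^2 lambda_min^2)^(-q/4)."""
    q = len(lambdas)
    lam_min = min(abs(lam) for lam in lambdas)
    return (1.0 + 4.0 * xi ** 2 * lam_min ** 2) ** (-q / 4.0)

# Hypothetical coefficients with q = 3 (illustration only).
lambdas = [1.0, -0.5, 0.25]
grid = [k / 10.0 for k in range(-500, 501)]
bound_holds = all(
    abs(char_fn(xi, lambdas)) <= envelope(xi, lambdas) + 1e-12 for xi in grid
)
print(bound_holds)
```

For $q = 3$ the envelope decays like $|\xi|^{-3/2}$, which is the integrability that the proof relies on.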

3.3 The generalized Rosenblatt process at extreme critical exponent

We conclude this section with a more ambitious example, providing rates of convergence in a recent result given by [5, Theorem 2.4]. Let $Z_{\gamma_1,\gamma_2}$ be the random variable defined by
$$ Z_{\gamma_1,\gamma_2} = \int_{\mathbb{R}^2} \Big( \int_0^1 (s - x_1)_+^{\gamma_1} (s - x_2)_+^{\gamma_2}\, ds \Big)\, dB_{x_1}\, dB_{x_2}, $$
with $\gamma_i \in (-1, -1/2)$, $\gamma_1 + \gamma_2 > -3/2$, and $\{B_x\}$ a two-sided Brownian motion on $\mathbb{R}$. By Proposition 3.1 of [5], the cumulants of $Z_{\gamma_1,\gamma_2}$ are given, for all $m \ge 2$, by
$$ \kappa_m(Z_{\gamma_1,\gamma_2}) = \frac{1}{2} (m-1)!\, A(\gamma_1, \gamma_2)^m\, C_m(\gamma_1, \gamma_2, 1, 1), $$

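where
$$
\begin{aligned}
A(\gamma_1, \gamma_2) = {} & \big[(\gamma_1 + \gamma_2 + 2)(2(\gamma_1 + \gamma_2) + 3)\big]^{1/2} \\
& \times \big[ B(\gamma_1 + 1, -\gamma_1 - \gamma_2 - 1)\, B(\gamma_2 + 1, -\gamma_1 - \gamma_2 - 1) \\
& \qquad + B(\gamma_1 + 1, -2\gamma_1 - 1)\, B(\gamma_2 + 1, -2\gamma_2 - 1) \big]^{-1/2},
\end{aligned}
$$
and
$$
\begin{aligned}
C_m(\gamma_1, \gamma_2, 1, 1) = \sum_{\sigma \in \{1,2\}^m} \int_{(0,1)^m} \prod_{j=1}^{m} \Big[ & (s_j - s_{j-1})_+^{\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1}\, B(\gamma_{\sigma'_{j-1}} + 1, -\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1) \\
& + (s_{j-1} - s_j)_+^{\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1}\, B(\gamma_{\sigma_j} + 1, -\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1) \Big]\, ds_1 \dots ds_m,
\end{aligned}
$$
$$ B(\alpha, \beta) = \int_0^1 u^{\alpha - 1} (1 - u)^{\beta - 1}\, du. $$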

Let $\rho \in (0, 1)$ and let $Y_\rho$ be the random variable defined by
$$ Y_\rho = \frac{a_\rho}{\sqrt{2}} (Z_1^2 - 1) + \frac{b_\rho}{\sqrt{2}} (Z_2^2 - 1), $$
with $Z_1, Z_2$ independent standard normal random variables and $a_\rho$ and $b_\rho$ two real numbers defined by
$$ a_\rho = \frac{(\rho + 1)^{-1} + (2\sqrt{\rho})^{-1}}{\sqrt{2(\rho + 1)^{-2} + (2\rho)^{-1}}}, \qquad b_\rho = \frac{(\rho + 1)^{-1} - (2\sqrt{\rho})^{-1}}{\sqrt{2(\rho + 1)^{-2} + (2\rho)^{-1}}}. $$

For simplicity, we assume that $\gamma_1 \ge \gamma_2$ and $\gamma_2 = (\gamma_1 + 1/2)/\rho - 1/2$. Then [5, Theorem 2.4] implies that, as $\gamma_1$ tends to $-1/2$,
$$ Z_{\gamma_1,\gamma_2} \Rightarrow Y_\rho, \tag{3.9} $$
with $\Rightarrow$ denoting convergence in distribution. Note that, in this case, $\gamma_2$ automatically tends to $-1/2$ as well. To prove the previous result, the authors of [5] prove the following convergence, for all $m \ge 2$:
$$ \kappa_m(Z_{\gamma_1,\gamma_2}) \to \kappa_m(Y_\rho) = 2^{\frac{m}{2} - 1} (a_\rho^m + b_\rho^m)(m - 1)! $$
as $\gamma_1$ tends to $-1/2$. Now, using Corollary 2.1, Lemma 2.1 and applying Lemma 3.2, we can present the following quantitative bound for the convergence (3.9): as $\gamma_1$ tends to $-1/2$,
$$ d_{W_2}(Z_{\gamma_1,\gamma_2}, Y_\rho) \le C_\rho \sqrt{-\gamma_1 - \frac{1}{2}}, $$
where $C_\rho$ is some strictly positive constant depending only on $\rho$. In order to apply Corollary 2.1 to obtain an explicit rate for the convergence (3.9), we need to know at which speed $\kappa_m(Z_{\gamma_1,\gamma_2})$ converges towards $\kappa_m(Y_\rho)$. For this purpose, we have the following lemma.

Lemma 3.2. Under the above assumptions, for any $m \ge 3$, as $\gamma_1$ tends to $-1/2$,
$$ \kappa_m(Z_{\gamma_1,\gamma_2}) = \kappa_m(Y_\rho) + O\Big(-\gamma_1 - \frac{1}{2}\Big). $$
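The closed form for $\kappa_m(Y_\rho)$ quoted above can be cross-checked against the classical cumulants of a centered chi-square variable, $\kappa_m(c(Z^2 - 1)) = 2^{m-1}(m-1)!\,c^m$ for $m \ge 2$. The sketch below does this numerically and also confirms that $a_\rho^2 + b_\rho^2 = 1$, so that $Y_\rho$ has unit variance. The function names are ours and the value of $\rho$ is an arbitrary illustration.

```python
import math

def coefficients(rho):
    """The weights a_rho and b_rho from the definition of Y_rho."""
    norm = math.sqrt(2.0 * (rho + 1.0) ** -2 + 1.0 / (2.0 * rho))
    a = (1.0 / (rho + 1.0) + 1.0 / (2.0 * math.sqrt(rho))) / norm
    b = (1.0 / (rho + 1.0) - 1.0 / (2.0 * math.sqrt(rho))) / norm
    return a, b

def cumulant_closed_form(m, rho):
    """kappa_m(Y_rho) = 2^(m/2 - 1) (a^m + b^m) (m - 1)!."""
    a, b = coefficients(rho)
    return 2.0 ** (m / 2.0 - 1.0) * (a ** m + b ** m) * math.factorial(m - 1)

def cumulant_from_chi_square(m, rho):
    """Sum of the cumulants of the two independent terms c (Z^2 - 1)."""
    a, b = coefficients(rho)
    scales = (a / math.sqrt(2.0), b / math.sqrt(2.0))
    return sum(2.0 ** (m - 1) * math.factorial(m - 1) * c ** m for c in scales)

rho = 0.3  # arbitrary value in (0, 1)
a, b = coefficients(rho)
print(a * a + b * b)                 # approximately 1
print(cumulant_closed_form(2, rho))  # variance of Y_rho, approximately 1
```

Both routes agree because cumulants are additive over independent summands and homogeneous of degree $m$ under scaling.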

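Proof. First of all, note that, as $\gamma_1$ tends to $-1/2$,
$$
\begin{aligned}
A(\gamma_1, \gamma_2) = {} & \Big[ \Big( \gamma_1 + \frac{1}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + \frac{3}{2} \Big) \Big( 2\gamma_1 + \frac{2}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + 2 \Big) \Big]^{1/2} \\
& \times \Big[ B\Big(\gamma_1 + 1, -\big(1 + \tfrac{1}{\rho}\big)\big(\gamma_1 + \tfrac{1}{2}\big)\Big)\, B\Big(\tfrac{1}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + \tfrac{1}{2}, -\big(1 + \tfrac{1}{\rho}\big)\big(\gamma_1 + \tfrac{1}{2}\big)\Big) \\
& \qquad + B(\gamma_1 + 1, -2\gamma_1 - 1)\, B\Big(\tfrac{1}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + \tfrac{1}{2}, -\tfrac{2}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big)\Big) \Big]^{-1/2} \\
\approx {} & \frac{(-\gamma_1 - 1/2)}{\sqrt{(1 + \frac{1}{\rho})^{-2} + (\frac{4}{\rho})^{-1}}} - C_\rho \Big(-3 + 2\gamma + 2\psi\big(\tfrac{1}{2}\big)\Big) \big(\gamma_1 + \tfrac{1}{2}\big)^2 + o\big((-\gamma_1 - 1/2)^2\big),
\end{aligned}
$$
where $\gamma$ is the Euler constant, $\psi(\cdot)$ is the digamma function and $C_\rho$ is some strictly positive constant depending only on $\rho$. Note that $-3 + 2\gamma + 2\psi(1/2) < 0$. Moreover, for all $m \ge 2$,
$$
\begin{aligned}
C_m(\gamma_1, \gamma_2, 1, 1) \approx {} & \sum_{\sigma \in \{1,2\}^m} \int_{(0,1)^m} \prod_{j=1}^{m} \Big[ \mathbb{I}_{s_j > s_{j-1}} \Big( (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} - \log(s_j - s_{j-1}) + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1) \Big) \\
& \qquad + \mathbb{I}_{s_j < s_{j-1}} \Big( (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} - \log(s_{j-1} - s_j) + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1) \Big) \Big]\, ds_1 \dots ds_m \\
\approx {} & \sum_{\sigma \in \{1,2\}^m} \int_{(0,1)^m} \prod_{j=1}^{m} \Big[ (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} + \mathbb{I}_{s_j > s_{j-1}} \log\big((s_j - s_{j-1})^{-1}\big) \\
& \qquad + \mathbb{I}_{s_j < s_{j-1}} \log\big((s_{j-1} - s_j)^{-1}\big) + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1) \Big]\, ds_1 \dots ds_m.
\end{aligned}
\tag{3.11}
$$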

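Note that $-\gamma - \psi(\tfrac{1}{2}) > 0$. The diverging terms in $C_m(\gamma_1, \gamma_2, 1, 1)$ are $B(\gamma_{\sigma'_{j-1}} + 1, -\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)$ and $B(\gamma_{\sigma_j} + 1, -\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)$. For $\sigma$ and $j$ fixed, the only possible values are
$$
\begin{aligned}
B(\gamma_1 + 1, -\gamma_1 - \gamma_2 - 1) &= B\Big(\gamma_1 + 1, -\big(\gamma_1 + \tfrac{1}{2}\big)\big(1 + \tfrac{1}{\rho}\big)\Big) \approx -\frac{1}{(1 + \frac{1}{\rho})(\gamma_1 + \frac{1}{2})} + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1), \\
B(\gamma_2 + 1, -\gamma_1 - \gamma_2 - 1) &= B\Big(\tfrac{1}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + \tfrac{1}{2}, -\big(\gamma_1 + \tfrac{1}{2}\big)\big(1 + \tfrac{1}{\rho}\big)\Big) \approx -\frac{1}{(1 + \frac{1}{\rho})(\gamma_1 + \frac{1}{2})} + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1), \\
B(\gamma_1 + 1, -2\gamma_1 - 1) &\approx -\frac{1}{2(\gamma_1 + \frac{1}{2})} + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1), \\
B(\gamma_2 + 1, -2\gamma_2 - 1) &= B\Big(\tfrac{1}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big) + \tfrac{1}{2}, -\tfrac{2}{\rho}\big(\gamma_1 + \tfrac{1}{2}\big)\Big) \approx -\frac{\rho}{2(\gamma_1 + \frac{1}{2})} + \big(-\gamma - \psi(\tfrac{1}{2})\big) + o(1).
\end{aligned}
$$
Moreover, for $j$ fixed,
$$
\begin{aligned}
(s_j - s_{j-1})_+^{\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1} &= \mathbb{I}_{s_j > s_{j-1}} (s_j - s_{j-1})^{\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1} \\
&\approx \mathbb{I}_{s_j > s_{j-1}} \big[ 1 + \log(s_j - s_{j-1}) (\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1) + o\big(\gamma_{\sigma_j} + \gamma_{\sigma'_{j-1}} + 1\big) \big].
\end{aligned}
$$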
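The Beta-function expansions above are instances of $B(a, b) \approx 1/b - \gamma - \psi(a)$ as $b \to 0^+$, evaluated at $a \to 1/2$. Using the classical value $\psi(1/2) = -\gamma - 2\log 2$, the constant $-\gamma - \psi(1/2)$ equals $2\log 2 > 0$. The following sketch compares the exact value of $B(\gamma_1 + 1, -2\gamma_1 - 1)$ with this two-term approximation; the function names and the test value of $\gamma_1$ are ours.

```python
import math

def beta(a, b):
    """Euler Beta function B(a, b) for a, b > 0, computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def two_term_expansion(b):
    """1/b + (-gamma - psi(1/2)), with -gamma - psi(1/2) = 2 log 2."""
    return 1.0 / b + 2.0 * math.log(2.0)

def compare(gamma1):
    """Exact value and approximation of B(gamma1 + 1, -2 gamma1 - 1)."""
    b = -2.0 * gamma1 - 1.0  # small and positive when gamma1 < -1/2
    return beta(gamma1 + 1.0, b), two_term_expansion(b)

exact, approx = compare(-0.5 - 1e-4)  # gamma1 slightly below -1/2
print(exact, approx)
```

The difference between the two printed values is the $o(1)$ remainder, which shrinks as $\gamma_1$ approaches $-1/2$.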

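Developing the product in the right hand side of (3.11),
$$
\begin{aligned}
C_m(\gamma_1, \gamma_2, 1, 1) \approx {} & \sum_{\sigma \in \{1,2\}^m} \prod_{j=1}^{m} (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} \\
& + \big(-\gamma - \psi(\tfrac{1}{2})\big) \sum_{\sigma \in \{1,2\}^m} \sum_{j=1}^{m} \prod_{k=1,\, k \neq j}^{m} (-\gamma_{\sigma_k} - \gamma_{\sigma'_{k-1}} - 1)^{-1} \\
& + \sum_{\sigma \in \{1,2\}^m} \sum_{j=1}^{m} \prod_{k=1,\, k \neq j}^{m} (-\gamma_{\sigma_k} - \gamma_{\sigma'_{k-1}} - 1)^{-1} \int_{(0,1)^m} \Big( \mathbb{I}_{s_j > s_{j-1}} \log\big((s_j - s_{j-1})^{-1}\big) \\
& \qquad + \mathbb{I}_{s_j < s_{j-1}} \log\big((s_{j-1} - s_j)^{-1}\big) \Big)\, ds_1 \dots ds_m + o\big((-\gamma_1 - \tfrac{1}{2})^{-m+1}\big) \\
\approx {} & \sum_{\sigma \in \{1,2\}^m} \prod_{j=1}^{m} (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} \\
& + \big(-\gamma - \psi(\tfrac{1}{2}) + \tfrac{3}{2}\big) \sum_{\sigma \in \{1,2\}^m} \sum_{j=1}^{m} \prod_{k=1,\, k \neq j}^{m} (-\gamma_{\sigma_k} - \gamma_{\sigma'_{k-1}} - 1)^{-1} + o\big((-\gamma_1 - \tfrac{1}{2})^{-m+1}\big).
\end{aligned}
$$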
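The constant $3/2$ in the last line comes from integrating the logarithmic terms: for fixed $j$ the integrand depends only on $(s_{j-1}, s_j)$, and $\int_{(0,1)^2} \log(1/|s - t|)\, ds\, dt = 3/2$. The sketch below checks this value numerically, assuming (as we read the notation of [5]) that $s_{j-1}$ and $s_j$ are distinct integration variables. It uses the reduction $\int_{(0,1)^2} f(|s - t|)\, ds\, dt = 2 \int_0^1 (1 - u) f(u)\, du$ and a plain midpoint rule; the function name and the number of nodes are ours.

```python
import math

def log_kernel_integral(n=200_000):
    """Midpoint-rule value of 2 * int_0^1 (1 - u) * log(1/u) du.

    This equals the double integral of log(1/|s - t|) over the unit square.
    Midpoints avoid evaluating the integrand at the singular point u = 0.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += (1.0 - u) * -math.log(u)
    return 2.0 * h * total

print(log_kernel_integral())  # close to 1.5
```

The exact value follows from $\int_0^1 \log(1/u)\, du = 1$ and $\int_0^1 u \log(1/u)\, du = 1/4$, which give $2(1 - 1/4) = 3/2$.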

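This leads to the following asymptotics for the cumulants of $Z_{\gamma_1,\gamma_2}$, for all $m \ge 2$:
$$
\begin{aligned}
\kappa_m(Z_{\gamma_1,\gamma_2}) \approx {} & \frac{(m-1)!}{2} \Bigg( \frac{(-\gamma_1 - 1/2)}{\sqrt{(1 + \frac{1}{\rho})^{-2} + (\frac{4}{\rho})^{-1}}} - C_\rho \Big(-3 + 2\gamma + 2\psi\big(\tfrac{1}{2}\big)\Big) \big(\gamma_1 + \tfrac{1}{2}\big)^2 + o\big((-\gamma_1 - 1/2)^2\big) \Bigg)^{m} \\
& \times \Bigg( \sum_{\sigma \in \{1,2\}^m} \prod_{j=1}^{m} (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} \\
& \qquad + \big(-\gamma - \psi(\tfrac{1}{2}) + \tfrac{3}{2}\big) \sum_{\sigma \in \{1,2\}^m} \sum_{j=1}^{m} \prod_{k=1,\, k \neq j}^{m} (-\gamma_{\sigma_k} - \gamma_{\sigma'_{k-1}} - 1)^{-1} + o\big((-\gamma_1 - \tfrac{1}{2})^{-m+1}\big) \Bigg) \\
\approx {} & \frac{(m-1)!}{2}\, \frac{(-\gamma_1 - 1/2)^m}{\Big(\sqrt{(1 + \frac{1}{\rho})^{-2} + (\frac{4}{\rho})^{-1}}\Big)^{m}} \sum_{\sigma \in \{1,2\}^m} \prod_{j=1}^{m} (-\gamma_{\sigma_j} - \gamma_{\sigma'_{j-1}} - 1)^{-1} + O\big(-\gamma_1 - \tfrac{1}{2}\big) \\
\approx {} & \kappa_m(Y_\rho) + O\big(-\gamma_1 - \tfrac{1}{2}\big),
\end{aligned}
$$
where we have used similar computations as in the proof of Theorem 2.4 of [5] for the last equality.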
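The leading-order behavior of $A(\gamma_1, \gamma_2)$ used at the start of the proof, namely $A(\gamma_1, \gamma_2) \approx (-\gamma_1 - 1/2) / \sqrt{(1 + 1/\rho)^{-2} + (4/\rho)^{-1}}$, can be checked numerically from the definition of $A$ under the parametrization $\gamma_2 = (\gamma_1 + 1/2)/\rho - 1/2$. This is only an illustrative sketch: the function names and the values of $\rho$ and $\gamma_1$ are ours, and only the first-order term is compared.

```python
import math

def beta(a, b):
    """Euler Beta function B(a, b) for a, b > 0, computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def normalizing_constant(g1, g2):
    """A(gamma_1, gamma_2) as defined in Section 3.3."""
    num = (g1 + g2 + 2.0) * (2.0 * (g1 + g2) + 3.0)
    den = (beta(g1 + 1.0, -g1 - g2 - 1.0) * beta(g2 + 1.0, -g1 - g2 - 1.0)
           + beta(g1 + 1.0, -2.0 * g1 - 1.0) * beta(g2 + 1.0, -2.0 * g2 - 1.0))
    return math.sqrt(num / den)

def leading_order(g1, rho):
    """First-order term (-gamma_1 - 1/2) / sqrt((1 + 1/rho)^-2 + (4/rho)^-1)."""
    eps = -g1 - 0.5
    return eps / math.sqrt((1.0 + 1.0 / rho) ** -2 + rho / 4.0)

rho = 0.5                       # arbitrary value in (0, 1)
g1 = -0.5 - 1e-4                # gamma_1 slightly below -1/2
g2 = (g1 + 0.5) / rho - 0.5     # parametrization assumed in the text
ratio = normalizing_constant(g1, g2) / leading_order(g1, rho)
print(ratio)  # close to 1
```

The deviation of the ratio from one is of order $-\gamma_1 - 1/2$, in line with the $O(-\gamma_1 - 1/2)$ remainder of Lemma 3.2.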

Acknowledgments

We would like to thank the referee for her/his very useful remarks and suggestions, that helped us to improve the presentation of the manuscript. BA's research was supported by a Welcome Grant from the Université de Liège. YS gratefully acknowledges support by the Fonds de la Recherche Scientifique - FNRS under Grant MIS F.4539.16.

References

[1] Arras, B., Azmoodeh, E., Poly, G. and Swan, Y. (2017) Stein characterizations for linear combinations of gamma random variables. arXiv preprint arXiv:1709.01161.
[2] Azmoodeh, E., Peccati, G. and Poly, G. (2014) Convergence towards linear combinations of chi-squared random variables: a Malliavin-based approach. Séminaire de Probabilités XLVII (Special volume in memory of Marc Yor), pp.339-367.
[3] Azmoodeh, E., Campese, S. and Poly, G. (2014) Fourth Moment Theorems for Markov Diffusion Generators. J. Funct. Anal., 266(4), pp.2341-2359.
[4] Bally, V. and Caramellino, L. (2017) Total variation distance between stochastic polynomials and invariance principles. arXiv preprint arXiv:1705.05194.
[5] Bai, S. and Taqqu, M. (2017) Behavior of the generalized Rosenblatt process at extreme critical exponent values. Ann. Probab. 45(2), pp.1278-1324.
[6] Borovkov, A.A. and Utev, S.A. (1984) On an inequality and a related characterization of the normal distribution. Theory Probab. Appl., 28(2), pp.219-228.
[7] Cacoullos, T. and Papathanasiou, V. (1989) Characterizations of distributions by variance bounds. Statistics & Probability Letters, 7(5), pp.351-356.
[8] Caravenna, F., Sun, R. and Zygouras, N. (2017) Universality in marginally relevant disordered systems. Ann. Appl. Probab., 27(5), pp.3050-3112.
[9] Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2010) Normal approximation by Stein's method. Springer Science Business Media.
[10] Eden, R. and Viquez, J. (2015) Nourdin-Peccati analysis on Wiener and Wiener-Poisson space for general distributions. Stoch. Proc. Appl. 125(1), pp.182-216.
[11] Eichelsbacher, P. and Thäle, C. (2015) Malliavin-Stein method for variance-gamma approximation on Wiener space. Electron. J. Probab. 20(123), pp.1-28.
[12] Gaunt, R. E. (2013) Rates of Convergence of variance-gamma approximations via Stein's method. Thesis at University of Oxford, the Queen's College.
[13] Gaunt, R. E. (2015) Products of normal, beta and gamma random variables: Stein characterisations and distributional theory. To appear in Braz. J. Probab. Stat.
[14] Gaunt, R. E. (2017) On Stein's method for products of normal random variables and zero bias couplings. Bernoulli, 23(4B), pp.3311-3345.
[15] Gaunt, R. E. (2014) Variance-gamma approximation via Stein's method. Electron. J. Probab. 19(38), pp.1-33.
[16] Gaunt, R., Mijoule, G. and Swan, Y. (2016) Stein operators for product distributions, with applications. arXiv preprint arXiv:1604.06819.
[17] Götze, F. and Tikhomirov, A. N. (2005) Asymptotic expansions in non-central limit theorems for quadratic forms. J. Theoret. Probab. 18(4), pp.757-811.
[18] Janson, S. (1997) Gaussian Hilbert spaces. Cambridge University Press: Volume 129.
[19] Klaassen, C. A. (1985) On an inequality of Chernoff. Ann. Probab. 13(3), pp.966-974.
[20] Krein, C. (2017) Weak convergence on Wiener space: targeting the first two chaoses. arXiv preprint arXiv:1701.06766.
[21] Ledoux, M. (2012) Chaos of a Markov operator and the fourth moment condition. Ann. Probab., 40(6), pp.2439-2459.
[22] Ley, C., Reinert, G. and Swan, Y. (2017) Stein's method for comparison of univariate distributions. Probability Surveys 14, pp.1-52.
[23] Lukacs, E. (1970) Characteristic functions. Griffin 1820 London, second edition.
[24] Madan, D. B., Carr, P. P. and Chang, E. C. (1998) The variance gamma process and option pricing. Europ. Finance. Rev, 2, pp.79-105.
[25] Mossel, E., O'Donnell, R. and Oleszkiewicz, K. (2010) Noise stability of functions with low influences: invariance and optimality. Ann. of Math. 171(1), pp.295-341.
[26] Nourdin, I. and Peccati, G. (2012) Normal Approximations Using Malliavin Calculus: from Stein's Method to Universality. Cambridge Tracts in Mathematics. Cambridge University.
[27] Nourdin, I. and Peccati, G. (2009) Stein's method on Wiener chaos. Probab. Theory Related Fields. 145(1), pp.75-118.
[28] Nourdin, I. and Peccati, G. (2015) The optimal fourth moment theorem. Proceedings of the American Mathematical Society, 143(7), pp.3123-3133.
[29] Nourdin, I. and Poly, G. (2012) Convergence in law in the second Wiener/Wigner chaos. Electron. Commun. Probab. 17(36), pp.1-12.
[30] Nualart, D. and Peccati, G. (2005) Central limit theorems for sequences of multiple stochastic integrals. The Annals of Probability, 33(1), pp.177-193.
[31] Peköz, E., Röllin, A. and Ross, N. (2013) Degree asymptotics with rates for preferential attachment random graphs. Ann. Appl. Prob. 23, pp.1188-1218.
[32] Rotar', V. I. (1973) Some limit theorems for polynomials of second degree. Theory Probab. Appl., 18, pp.499-507.
[33] Serfling, R. J. (1980) Approximation theorems of mathematical statistics. John Wiley & Sons.
[34] Sevast'yanov, B.A. (1961) A class of limit distribution for quadratic forms of normal stochastic variables. Theory Probab. Appl., 6, pp.337-340.
[35] Stein, C. (1986) Approximate computation of expectations. IMS, Lecture Notes-Monograph Series 7.
[36] Kusuoka, S. and Tudor, C. A. (2012) Stein's method for invariant measures of diffusions via Malliavin calculus. Stoch. Proc. Appl. 122(4), pp.1627-1651.
[37] Venter, J. H. and De Wet, T. (1973) Asymptotic distributions for quadratic forms with applications to tests of fit. Ann. Statist., 1(2), pp.380-387.
[38] Villani, C. (2009) Optimal transport. Old and new. Grundlehren der Mathematischen Wissenschaften 338. Springer.

(B. Arras) Laboratoire Jacques-Louis Lions, Sorbonne Universités, Paris, France
(E. Azmoodeh) Faculty of Mathematics, Ruhr University Bochum, Germany
(G. Poly) Institut de Recherche Mathématiques de Rennes, Université de Rennes 1, Rennes, France
(Y. Swan) Mathematics department, Université de Liège, Liège, Belgium

E-mail address, B. Arras: [email protected]
E-mail address, E. Azmoodeh: [email protected]
E-mail address, G. Poly: [email protected]
E-mail address, Y. Swan: [email protected]