Moderate deviations for Hawkes processes


Statistics and Probability Letters xx (xxxx) xxx–xxx

Contents lists available at SciVerse ScienceDirect

Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

Moderate deviations for Hawkes processes✩

Lingjiong Zhu ∗ Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY-10012, United States

Article info

Article history: Received 22 October 2012. Received in revised form 7 December 2012. Accepted 7 December 2012. Available online xxxx.

MSC: 60G55; 60F10

Keywords: Moderate deviations; Large deviations; Point processes; Hawkes processes; Self-exciting processes

Abstract

In this paper, we obtain a moderate deviation principle for a class of point processes, i.e. linear Hawkes processes. © 2012 Published by Elsevier B.V.

1. Introduction and main results


1.1. Introduction


The Hawkes process is a self-exciting simple point process first introduced by Hawkes (1971). The future evolution of a self-exciting point process is influenced by the timing of past events. The process is non-Markovian except for some very special cases. There are applications in neuroscience, e.g. Johnson (1995), DNA modeling, e.g. Gusto and Schbath (2005), finance, and many other fields. Applications of the Hawkes process in finance include market order modeling, e.g. Bauwens and Hautsch (2009), Bowsher (2007), and Large (2007), value-at-risk, e.g. Chavez-Demoulin et al. (2005), and credit risk, e.g. Errais et al. (2010).

Let $N$ be a simple point process on $\mathbb{R}$, and let $\mathcal{F}^{-\infty}_t := \sigma(N(C), C \in \mathcal{B}(\mathbb{R}), C \subset (-\infty, t])$ be an increasing family of $\sigma$-algebras. Any nonnegative $\mathcal{F}^{-\infty}_t$-progressively measurable process $\lambda_t$ with

$$E\left[N(a,b] \mid \mathcal{F}^{-\infty}_a\right] = E\left[\int_a^b \lambda_s\,ds \,\Big|\, \mathcal{F}^{-\infty}_a\right] \tag{1.1}$$

a.s. for all intervals $(a,b]$ is called an $\mathcal{F}^{-\infty}_t$-intensity of $N$. We use the notation $N_t := N(0,t]$ to denote the number of points in the interval $(0,t]$.

✩ This research was supported partially by a grant from the National Science Foundation: DMS-0904701, a DARPA grant, and a MacCracken Fellowship at New York University. ∗ Tel.: +1 212 998 3329. E-mail addresses: [email protected], [email protected].

0167-7152/$ – see front matter © 2012 Published by Elsevier B.V. doi:10.1016/j.spl.2012.12.011


L. Zhu / Statistics and Probability Letters xx (xxxx) xxx–xxx

A general Hawkes process is a simple point process $N$ admitting an $\mathcal{F}^{-\infty}_t$-intensity

$$\lambda_t := \lambda\left(\int_{-\infty}^{t} h(t-s)\,N(ds)\right), \tag{1.2}$$

where $\lambda(\cdot) : \mathbb{R}^+ \to \mathbb{R}^+$ is locally integrable and left continuous, $h(\cdot) : \mathbb{R}^+ \to \mathbb{R}^+$, and we always assume that $\|h\|_{L^1} = \int_0^\infty h(t)\,dt < \infty$. Here, $\int_{-\infty}^{t} h(t-s)\,N(ds)$ stands for $\int_{(-\infty,t)} h(t-s)\,N(ds)$. We always assume that $N(-\infty, 0] = 0$, i.e. the Hawkes process has empty history. In the literature, $h(\cdot)$ and $\lambda(\cdot)$ are usually referred to as the exciting function and the rate function, respectively. The Hawkes process is linear if $\lambda(\cdot)$ is linear and it is nonlinear otherwise.
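To make the definition (1.2) concrete, the following is a minimal simulation sketch, not part of the paper. It assumes the linear rate function λ(z) = ν + z and an exponential exciting function h(t) = α e^{−βt} (so ∥h∥L1 = α/β); the function name `simulate_linear_hawkes`, the parameter values, and the use of Ogata-style thinning are all illustrative choices. The empirical rate Nt/t is compared with ν/(1 − ∥h∥L1), the law-of-large-numbers limit reviewed in Section 1.2.

```python
import math
import random


def simulate_linear_hawkes(nu, alpha, beta, horizon, rng):
    """Return the event times of a linear Hawkes process on (0, horizon].

    Illustrative assumptions (not from the paper): lambda(z) = nu + z and
    h(t) = alpha * exp(-beta * t), so that ||h||_{L1} = alpha / beta.
    The process starts with empty history, N(-inf, 0] = 0.
    """
    events = []
    t = 0.0
    excitation = 0.0  # sum over past events of h(t - t_i), evaluated at time t
    while True:
        # Between events the intensity only decays, so its current value
        # is an upper bound until the next accepted point.
        bound = nu + excitation
        wait = rng.expovariate(bound)
        t += wait
        if t > horizon:
            return events
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= nu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event adds h(0) = alpha


rng = random.Random(2012)
nu, alpha, beta, horizon = 1.0, 1.0, 2.0, 2000.0
events = simulate_linear_hawkes(nu, alpha, beta, horizon, rng)
empirical_rate = len(events) / horizon
limit_rate = nu / (1.0 - alpha / beta)
print(empirical_rate, limit_rate)
```

With an exponential kernel the excitation term can be updated recursively, which is why the sketch never revisits the full event history; a general h(·) would require summing over all past events.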


1.2. Limit theorems for Hawkes processes


Let us review some results about the limit theorems for Hawkes processes in the literature. For a linear Hawkes process, say $\lambda(z) = \nu + z$, for some $\nu > 0$ and $\|h\|_{L^1} < 1$, the linear Hawkes process has a very nice immigration-birth representation; see, for example, Hawkes and Oakes (1974). There is the law of large numbers (see, for instance, Daley and Vere-Jones, 2003), i.e.

$$\frac{N_t}{t} \to \frac{\nu}{1 - \|h\|_{L^1}}, \quad \text{as } t \to \infty \text{ a.s.} \tag{1.3}$$

Moreover, Bordenave and Torrisi (2007) proved a large deviation principle for $\left(\frac{N_t}{t} \in \cdot\right)$ with the rate function

$$I(x) = \begin{cases} x \log\left(\dfrac{x}{\nu + x\|h\|_{L^1}}\right) - x + x\|h\|_{L^1} + \nu & \text{if } x \in [0, \infty), \\ +\infty & \text{otherwise.} \end{cases} \tag{1.4}$$

Recently, Bacry et al. (2011) proved a functional central limit theorem for a linear multivariate Hawkes process under certain assumptions. That includes the linear Hawkes process as a special case, and they proved under the assumption $\int_0^\infty t^{1/2} h(t)\,dt < \infty$ that

$$\frac{N_{\cdot t} - \cdot\,\mu t}{\sqrt{t}} \to \sigma B(\cdot), \quad \text{as } t \to \infty, \tag{1.5}$$

where $B(\cdot)$ is a standard Brownian motion. The convergence is weak convergence on $D[0,1]$, the space of càdlàg functions on $[0,1]$, equipped with the Skorokhod topology. In (1.5), $\mu$ and $\sigma^2$ are given by

$$\mu := \frac{\nu}{1 - \|h\|_{L^1}} \quad \text{and} \quad \sigma^2 := \frac{\nu}{(1 - \|h\|_{L^1})^3}. \tag{1.6}$$
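As a small numerical cross-check of the constants above (a sketch under illustrative parameter values, not part of the paper), the rate function in (1.4) should vanish at x = µ and have curvature 1/σ² there, which is what links the large deviation rate function to the Gaussian scale in (1.5) and (1.6). The helper names `mean_rate`, `clt_variance`, and `ldp_rate` are hypothetical.

```python
import math


def mean_rate(nu, h_norm):
    """mu in (1.6): the law-of-large-numbers limit of N_t / t."""
    return nu / (1.0 - h_norm)


def clt_variance(nu, h_norm):
    """sigma^2 in (1.6)."""
    return nu / (1.0 - h_norm) ** 3


def ldp_rate(x, nu, h_norm):
    """Rate function I(x) in (1.4) for the linear Hawkes process."""
    if x < 0.0:
        return math.inf
    if x == 0.0:
        return nu  # x * log(x / ...) tends to 0 as x decreases to 0
    return x * math.log(x / (nu + x * h_norm)) - x + x * h_norm + nu


nu, h_norm = 1.0, 0.5  # illustrative values with h_norm = ||h||_{L1} < 1
mu = mean_rate(nu, h_norm)
sigma2 = clt_variance(nu, h_norm)

# Central second difference approximating I''(mu).
d = 1e-3
curvature = (ldp_rate(mu + d, nu, h_norm) - 2.0 * ldp_rate(mu, nu, h_norm)
             + ldp_rate(mu - d, nu, h_norm)) / d ** 2
print(ldp_rate(mu, nu, h_norm), curvature, 1.0 / sigma2)
```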


For a nonlinear Hawkes process, Brémaud and Massoulié (1996) proved that there exists a unique stationary nonlinear Hawkes process under the assumptions that $\lambda(\cdot)$ is $\alpha$-Lipschitz and $\alpha\|h\|_{L^1} < 1$. By the ergodic theorem, this implies the law of large numbers for $N_t/t$. Zhu (in press-b) proved a functional central limit theorem and Strassen's invariance principle (i.e. a functional law of the iterated logarithm) under slightly stronger assumptions. Zhu (2011) proved a large deviation principle for $(N_t/t \in \cdot)$ for sublinear $\lambda(\cdot)$ when $h(\cdot)$ is exponential or a sum of exponentials, and Zhu (in press-a) proved a process-level, i.e. level-3, large deviation principle for sublinear $\lambda(\cdot)$ and general $h(\cdot)$.


1.3. Main results


We are interested in obtaining a moderate deviation principle for linear Hawkes processes in this paper. Let $X_1, \ldots, X_n$ be a sequence of $\mathbb{R}^d$-valued i.i.d. random vectors with mean $0$ and covariance matrix $C$ which is invertible. Assume that $E[e^{\langle\theta, X_1\rangle}] < \infty$ for $\theta$ in some ball around the origin. For any $\sqrt{n} \ll a_n \ll n$, a moderate deviation principle says that, for any Borel set $A$,

$$-\frac{1}{2}\inf_{x \in A^o} \langle x, C^{-1}x\rangle \le \liminf_{n\to\infty} \frac{n}{a_n^2} \log P\left(\frac{1}{a_n}\sum_{i=1}^n X_i \in A\right) \le \limsup_{n\to\infty} \frac{n}{a_n^2} \log P\left(\frac{1}{a_n}\sum_{i=1}^n X_i \in A\right) \le -\frac{1}{2}\inf_{x \in \overline{A}} \langle x, C^{-1}x\rangle. \tag{1.7}$$

In other words, $P\left(\frac{1}{a_n}\sum_{i=1}^n X_i \in \cdot\right)$ satisfies a large deviation principle with speed $\frac{a_n^2}{n}$. The above classical result can be found for example in Dembo and Zeitouni (1998). The moderate deviation principle fills in the gap between the central limit theorem and the large deviation principle.


It is in general difficult to obtain moderate deviations for dependent random variables. The study of moderate deviations has been an active research area, and over the years certain mixing processes, Markov processes, martingales, etc., have been studied. For example, Ghosh and Babu (1977), Babu and Singh (1978), Gao (1996), and others studied moderate deviations for mixing processes. de Acosta (1997), de Acosta and Chen (1998), Chen (2001), and others studied moderate deviations for Markov processes. Dembo (1996), Gao (1996), Grama (1997), and others studied moderate deviations for martingales. Let us assume the following for the Hawkes process.

• $N_t$ is a Hawkes process with empty history, i.e. $N(-\infty, 0] = 0$.
• $\lambda(z) = \nu + z$, $\nu > 0$.
• $\|h\|_{L^1} < 1$ and $\sup_{t>0} t^{3/2} h(t) = C < \infty$.

Let $\mu := \frac{\nu}{1 - \|h\|_{L^1}}$. We have the following result.

Theorem 1. For any Borel set $A$ and time sequence $a(t)$ such that $\sqrt{t} \ll a(t) \ll t$, we have the following moderate deviation principle.

$$-\inf_{x \in A^o} J(x) \le \liminf_{t\to\infty} \frac{t}{a(t)^2} \log P\left(\frac{N_t - \mu t}{a(t)} \in A\right) \le \limsup_{t\to\infty} \frac{t}{a(t)^2} \log P\left(\frac{N_t - \mu t}{a(t)} \in A\right) \le -\inf_{x \in \overline{A}} J(x), \tag{1.8}$$

where $J(x) = \frac{x^2 (1 - \|h\|_{L^1})^3}{2\nu}$.

Remark 2. Our assumption $\sup_{t>0} t^{3/2} h(t) \le C < \infty$ is comparable with the assumption $\int_0^\infty t^{1/2} h(t)\,dt < \infty$ used to prove the central limit theorem in Bacry et al. (2011).

Remark 3. It would be interesting if we could obtain a moderate deviation principle for a nonlinear Hawkes process. The challenge is that, for a nonlinear Hawkes process, there is no immigration-birth representation, and thus we would not have a nice formula as in Lemma 4. Furthermore, unlike the central limit theorem and the law of the iterated logarithm, there are not many good criteria for which a moderate deviation principle holds. At least to the author's knowledge, none of the criteria in the literature about moderate deviations could be used directly in the case of a nonlinear Hawkes process. This will have to be left for future investigations.

2. Proofs

Since a Hawkes process has long memory and is in general non-Markovian, there is no good criterion in the literature for moderate deviations that we can use directly. For example, Bacry et al. (2011) used a central limit theorem for martingales to obtain a central limit theorem for linear Hawkes processes. But there is no criterion for moderate deviations for martingales that can fit into the context of a Hawkes process. The strategy of our proof relies on the fact that for a linear Hawkes process there is a nice immigration-birth representation from which we obtain a semi-explicit formula for the moment-generating function of $N_t$ in Lemma 4. A careful asymptotic analysis of this formula leads to the proof of Theorem 1.

Proof of Theorem 1. Let us first prove that, for any $\theta \in \mathbb{R}$,

$$\lim_{t\to\infty} \frac{t}{a(t)^2} \log E\left[e^{\frac{a(t)}{t}\theta(N_t - \mu t)}\right] = \frac{\nu\theta^2}{2(1 - \|h\|_{L^1})^3}. \tag{2.1}$$

For any fixed $\theta \in \mathbb{R}$, for any $t$ sufficiently large, by Lemma 4, we have

$$E\left[e^{\frac{a(t)}{t}\theta N_t}\right] = e^{\nu \int_0^t G_t(s)\,ds}, \tag{2.2}$$

where $G_t(s) = e^{\frac{a(t)}{t}\theta + \int_0^s h(u) G_t(s-u)\,du} - 1$, $0 \le s \le t$. (Here, $G_t(s)$ is simply the $F(s) - 1$ in Lemma 4. Because $\frac{a(t)}{t}\theta$ depends on $t$, we write $G_t(s)$ instead of $G(s)$ to indicate its dependence on $t$.) Clearly, $G_t(s)$ is increasing in $s$, and, letting $s \uparrow \infty$, we get that $G_t(\infty)$ is the minimal solution to the equation $x_t = e^{\frac{a(t)}{t}\theta + \|h\|_{L^1} x_t} - 1$. (See the proof of Lemma 4 and the reference therein.) Since $\|h\|_{L^1} < 1$, it is easy to see that $x_t = O(a(t)/t)$, and hence $G_t(s) = O(a(t)/t)$ uniformly in $s$. By Taylor's expansion,

$$G_t(s) = \frac{a(t)\theta}{t} + \int_0^s h(u) G_t(s-u)\,du + \frac{1}{2}\left(\frac{a(t)\theta}{t}\right)^2 + \frac{1}{2}\left(\int_0^s h(u) G_t(s-u)\,du\right)^2 + \frac{a(t)\theta}{t} \int_0^s h(u) G_t(s-u)\,du + O\left((a(t)/t)^3\right). \tag{2.3}$$

Let

$$G_t(s) = \frac{a(t)\theta}{t} G_1(s) + \left(\frac{a(t)}{t}\right)^2 G_2(s) + \epsilon_t(s),$$

where $G_1(s)$ satisfies

$$G_1(s) = 1 + \int_0^s h(u) G_1(s-u)\,du, \tag{2.4}$$

and $G_2(s)$ satisfies

$$G_2(s) = \int_0^s h(u) G_2(s-u)\,du + \frac{\theta^2}{2} + \theta^2 (G_1(s) - 1) + \frac{\theta^2}{2} (G_1(s) - 1)^2. \tag{2.5}$$

Substituting (2.4) and (2.5) back into (2.3), and using the fact that $G_t(s) = O(a(t)/t)$ uniformly in $s$, we get $\epsilon_t(s) = O((a(t)/t)^3)$ uniformly in $s$. Moreover, we claim that

$$\lim_{t\to\infty} \frac{1}{a(t)} \left(\theta\nu \int_0^t G_1(s)\,ds - \theta\mu t\right) = 0, \tag{2.6}$$

$$\lim_{t\to\infty} \frac{1}{t} \int_0^t G_2(s)\,ds = \frac{\theta^2}{2(1 - \|h\|_{L^1})^3}. \tag{2.7}$$

To prove (2.6), notice first that

$$\begin{aligned}
\frac{1}{t}\int_0^t G_1(s)\,ds &= 1 + \frac{1}{t}\int_0^t \int_0^s h(u) G_1(s-u)\,du\,ds \\
&= 1 + \frac{1}{t}\int_0^t h(u) \int_u^t G_1(s-u)\,ds\,du \\
&= 1 + \frac{1}{t}\int_0^t h(u) \int_0^{t-u} G_1(s)\,ds\,du \\
&= 1 + \frac{1}{t}\int_0^t h(u) \int_0^t G_1(s)\,ds\,du - \frac{1}{t}\int_0^t h(u) \int_{t-u}^t G_1(s)\,ds\,du.
\end{aligned} \tag{2.8}$$

Therefore, we have

$$\frac{1}{t}\int_0^t G_1(s)\,ds = \frac{1 - \frac{1}{t}\int_0^t h(u) \int_{t-u}^t G_1(s)\,ds\,du}{1 - \int_0^t h(u)\,du}. \tag{2.9}$$

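Hence,

$$\begin{aligned}
\frac{1}{a(t)}\left(\theta\nu \int_0^t G_1(s)\,ds - \theta\mu t\right) &= \frac{\theta\nu}{a(t)} \int_0^t \left(G_1(s) - \frac{1}{1 - \|h\|_{L^1}}\right) ds \\
&= \frac{\theta\nu t}{a(t)} \left(\frac{1}{1 - \int_0^t h(u)\,du} - \frac{1}{1 - \int_0^\infty h(u)\,du}\right) - \frac{\theta\nu}{a(t)} \cdot \frac{\int_0^t h(u) \int_{t-u}^t G_1(s)\,ds\,du}{1 - \int_0^t h(u)\,du}.
\end{aligned} \tag{2.10}$$

For the first term in (2.10), we have

$$\left|\frac{\theta\nu t}{a(t)} \left(\frac{1}{1 - \int_0^t h(u)\,du} - \frac{1}{1 - \int_0^\infty h(u)\,du}\right)\right| \le \frac{|\theta|\nu t}{a(t)} \cdot \frac{\int_t^\infty h(u)\,du}{(1 - \|h\|_{L^1})^2} \to 0, \tag{2.11}$$

as $t \to \infty$, since, by our assumption, $\sup_{t>0} t^{3/2} h(t) \le C < \infty$, which implies that

$$\frac{t}{a(t)} \int_t^\infty h(u)\,du \le \frac{t}{a(t)} \int_t^\infty \frac{C}{u^{3/2}}\,du = \frac{2C\sqrt{t}}{a(t)} \to 0$$

as $t \to \infty$. For the second term in (2.10), we have

$$\limsup_{t\to\infty} \left|\frac{\theta\nu}{a(t)} \cdot \frac{\int_0^t h(u) \int_{t-u}^t G_1(s)\,ds\,du}{1 - \int_0^t h(u)\,du}\right| \le \lim_{t\to\infty} G_1(t) \cdot \limsup_{t\to\infty} \frac{|\theta|\nu}{a(t)} \cdot \frac{\int_0^t h(u)\,u\,du}{1 - \|h\|_{L^1}} = 0. \tag{2.12}$$

This is because (2.4) is a renewal equation and $\|h\|_{L^1} < 1$. By application of the Tauberian theorem to the renewal equation (see Chapters XIII and XIV of Feller, 1971), we have $\lim_{t\to\infty} G_1(t) = \frac{1}{1 - \|h\|_{L^1}}$. Moreover, our assumption that $\sup_{t>0} t^{3/2} h(t) \le C < \infty$ and $\|h\|_{L^1} < \infty$ implies that

$$\frac{1}{a(t)} \int_0^t h(u)\,u\,du \le \frac{1}{a(t)} \int_0^1 h(u)\,u\,du + \frac{1}{a(t)} \int_1^t \frac{C}{u^{1/2}}\,du \to 0 \quad \text{as } t \to \infty.$$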
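The limit of G1 used above can be checked numerically. The sketch below is illustrative only: it discretizes the renewal equation (2.4) with a simple right-endpoint Riemann sum, assumes the exponential kernel h(u) = α e^{−βu} (not specified in the paper), and compares G1 at a moderately large time with 1/(1 − ∥h∥L1). The function name `solve_renewal_g1` and the grid parameters are hypothetical.

```python
import math


def solve_renewal_g1(kernel, horizon, step):
    """Approximate G1 on the grid 0, step, 2*step, ..., horizon.

    Discretizes G1(s) = 1 + int_0^s h(u) G1(s - u) du with a right-endpoint
    Riemann sum in u, which keeps the scheme explicit.
    """
    n = int(round(horizon / step))
    weights = [step * kernel(j * step) for j in range(n + 1)]
    g1 = [1.0]  # G1(0) = 1
    for i in range(1, n + 1):
        conv = sum(weights[j] * g1[i - j] for j in range(1, i + 1))
        g1.append(1.0 + conv)
    return g1


alpha, beta = 1.0, 2.0  # assumed kernel parameters, ||h||_{L1} = alpha / beta
g1 = solve_renewal_g1(lambda u: alpha * math.exp(-beta * u), horizon=10.0, step=0.01)
limit = 1.0 / (1.0 - alpha / beta)
print(g1[-1], limit)
```

The discretization error is of the order of the step size, so the computed value only approximates the limit; the sketch also reproduces the monotonicity of G1 that the bound in (2.12) relies on.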


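To prove (2.7), notice that $\lim_{t\to\infty} G_1(t) = \frac{1}{1 - \|h\|_{L^1}}$, and, again by application of the Tauberian theorem to the renewal equation (see Chapters XIII and XIV of Feller, 1971), we have

$$\lim_{t\to\infty} \frac{1}{t}\int_0^t G_2(s)\,ds = \lim_{t\to\infty} G_2(t) = \frac{\theta^2}{2} \cdot \frac{1 + 2\left(\frac{1}{1 - \|h\|_{L^1}} - 1\right) + \left(\frac{1}{1 - \|h\|_{L^1}} - 1\right)^2}{1 - \|h\|_{L^1}} = \frac{\theta^2}{2(1 - \|h\|_{L^1})^3}. \tag{2.13}$$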

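Finally, from (2.2) and the definitions of $G_1(s)$, $G_2(s)$, and $\epsilon_t(s)$, we have

$$\begin{aligned}
\frac{t}{a(t)^2} \log E\left[e^{\frac{a(t)}{t}\theta(N_t - \mu t)}\right] &= \frac{t}{a(t)^2} \left(\nu \int_0^t G_t(s)\,ds - \theta\mu\,a(t)\right) \\
&= \frac{1}{a(t)} \left(\nu\theta \int_0^t G_1(s)\,ds - \theta\mu t\right) + \frac{\nu}{t} \int_0^t G_2(s)\,ds + \frac{\nu t}{a(t)^2} \int_0^t \epsilon_t(s)\,ds.
\end{aligned} \tag{2.14}$$

Hence, by (2.6), (2.7), and the fact that $\epsilon_t(s) = O((a(t)/t)^3)$ uniformly in $s$, we conclude that, for any $\theta \in \mathbb{R}$,

$$\lim_{t\to\infty} \frac{t}{a(t)^2} \log E\left[e^{\frac{a(t)}{t}\theta(N_t - \mu t)}\right] = \frac{\nu\theta^2}{2(1 - \|h\|_{L^1})^3}. \tag{2.15}$$

Applying the Gärtner–Ellis theorem (see for example Dembo and Zeitouni, 1998), we conclude that, for any Borel set $A$,

$$-\inf_{x \in A^o} J(x) \le \liminf_{t\to\infty} \frac{t}{a(t)^2} \log P\left(\frac{N_t - \mu t}{a(t)} \in A\right) \le \limsup_{t\to\infty} \frac{t}{a(t)^2} \log P\left(\frac{N_t - \mu t}{a(t)} \in A\right) \le -\inf_{x \in \overline{A}} J(x), \tag{2.16}$$

where

$$J(x) = \sup_{\theta \in \mathbb{R}} \left\{\theta x - \frac{\nu\theta^2}{2(1 - \|h\|_{L^1})^3}\right\} = \frac{x^2 (1 - \|h\|_{L^1})^3}{2\nu}. \qquad \square \tag{2.17}$$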
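The Legendre transform in (2.17) is elementary (the supremum is attained at θ = x(1 − ∥h∥L1)³/ν), but it can also be confirmed numerically. The sketch below is illustrative only; the grid search, the parameter values, and the function names `limiting_cgf`, `mdp_rate`, and `legendre_on_grid` are assumptions made for this example.

```python
def limiting_cgf(theta, nu, h_norm):
    """Right-hand side of (2.15): the limiting scaled log moment-generating function."""
    return nu * theta ** 2 / (2.0 * (1.0 - h_norm) ** 3)


def mdp_rate(x, nu, h_norm):
    """Closed-form rate function J(x) from (2.17)."""
    return x ** 2 * (1.0 - h_norm) ** 3 / (2.0 * nu)


def legendre_on_grid(x, nu, h_norm, theta_max=5.0, steps=20001):
    """Approximate sup over theta of theta * x - limiting_cgf(theta) on a uniform grid."""
    best = float("-inf")
    for i in range(steps):
        theta = -theta_max + 2.0 * theta_max * i / (steps - 1)
        best = max(best, theta * x - limiting_cgf(theta, nu, h_norm))
    return best


nu, h_norm = 1.0, 0.5  # illustrative values with h_norm = ||h||_{L1} < 1
for x in (-3.0, 0.0, 1.5):
    print(x, legendre_on_grid(x, nu, h_norm), mdp_rate(x, nu, h_norm))
```

The grid range must contain the maximizer, so `theta_max` would need to grow with |x| for larger arguments than those used here.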

Lemma 4. For $\theta < \|h\|_{L^1} - 1 - \log\|h\|_{L^1}$,

$$E[e^{\theta N_t}] = e^{\nu \int_0^t (F(s) - 1)\,ds}, \tag{2.18}$$

where $F(s) = e^{\theta + \int_0^s h(u)(F(s-u) - 1)\,du}$ for any $0 \le s \le t$.

Proof. The Hawkes process has a very nice immigration-birth representation; see for example Hawkes and Oakes (1974). The immigrants arrive according to a homogeneous Poisson process with constant rate $\nu$. Each immigrant produces children, and the number of children has a Poisson distribution with parameter $\|h\|_{L^1}$. Conditional on the number of children of an immigrant, the time at which a child is born has probability density function $\frac{h(t)}{\|h\|_{L^1}}$. Each child produces children according to the same laws, independently of the other children. All the immigrants produce children independently. Let $F(t) = E[e^{\theta S(t)}]$, where $S(t)$ is the number of descendants an immigrant generates up to time $t$. Hence, we have

$$E[e^{\theta N_t}] = \sum_{k=0}^{\infty} e^{-\nu t} \frac{(\nu t)^k}{k!} \cdot \frac{1}{t^k/k!} \int \cdots \int_{t_1 < t_2 < \cdots < t_k < t} F(t_1) \cdots F(t_k)\,dt_1 \cdots dt_k = e^{\nu \int_0^t (F(s) - 1)\,ds}. \tag{2.19}$$

It is well known that (see p. 39 of Jagers, 1975), for all $\theta \in (-\infty, \|h\|_{L^1} - 1 - \log\|h\|_{L^1})$, $E[e^{\theta S(\infty)}]$ is the minimal positive solution of

$$E[e^{\theta S(\infty)}] = e^{\theta} \exp\left\{\|h\|_{L^1} \left(E[e^{\theta S(\infty)}] - 1\right)\right\}. \tag{2.20}$$

Let $K$ be the number of children of an immigrant and, for $k = 1, \ldots, K$, let $S_t^{(k)}$ be the number of descendants of the immigrant's $k$th child born before time $t$ (including the $k$th child itself if and only if it was born before time $t$). Then

$$\begin{aligned}
F(t) &= \sum_{k=0}^{\infty} E\left[e^{\theta S(t)} \mid K = k\right] P(K = k) \\
&= e^{\theta} \sum_{k=0}^{\infty} \left(E\left[e^{\theta S_t^{(1)}}\right]\right)^k P(K = k) \\
&= e^{\theta} \sum_{k=0}^{\infty} \left(\int_0^t \frac{h(s)}{\|h\|_{L^1}} F(t-s)\,ds + \int_t^\infty \frac{h(s)}{\|h\|_{L^1}}\,ds\right)^k e^{-\|h\|_{L^1}} \frac{\|h\|_{L^1}^k}{k!} \\
&= e^{\theta + \int_0^t h(s)(F(t-s) - 1)\,ds},
\end{aligned} \tag{2.21}$$

where the second integral in the third line accounts for a child born after time $t$, which contributes no descendants by time $t$. $\square$
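The minimal positive solution in (2.20) can be approximated by monotone fixed-point iteration started from zero, a standard device for branching-process generating functions rather than a method taken from the paper. The sketch below is illustrative; the function names `theta_critical` and `cluster_size_mgf`, the iteration count, and the parameter values are assumptions. Two sanity checks are that the solution equals 1 at θ = 0 and that its derivative at θ = 0 is the mean cluster size 1/(1 − ∥h∥L1).

```python
import math


def theta_critical(h_norm):
    """Upper endpoint ||h|| - 1 - log ||h|| of the admissible range of theta in Lemma 4."""
    return h_norm - 1.0 - math.log(h_norm)


def cluster_size_mgf(theta, h_norm, iterations=500):
    """Approximate the minimal positive solution x of x = exp(theta + h_norm * (x - 1)).

    Starting from x = 0, the iterates increase towards the minimal fixed point.
    Convergence slows down as theta approaches theta_critical(h_norm).
    """
    if not 0.0 < h_norm < 1.0:
        raise ValueError("h_norm must lie in (0, 1)")
    if theta >= theta_critical(h_norm):
        raise ValueError("theta must be below theta_critical(h_norm)")
    x = 0.0
    for _ in range(iterations):
        x = math.exp(theta + h_norm * (x - 1.0))
    return x


h_norm = 0.5  # illustrative value of ||h||_{L1}
delta = 1e-4
mean_cluster_size = (cluster_size_mgf(delta, h_norm)
                     - cluster_size_mgf(-delta, h_norm)) / (2.0 * delta)
print(cluster_size_mgf(0.0, h_norm), mean_cluster_size, 1.0 / (1.0 - h_norm))
```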

Acknowledgments

The author wishes to thank an anonymous referee for helpful suggestions. The author is supported by NSF grant DMS-0904701, a DARPA grant, and a MacCracken Fellowship at New York University.

References

Babu, G.J., Singh, K., 1978. Probabilities of moderate deviations for some stationary strong-mixing processes. Sankhyā Ser. A 40, 38–43.
Bacry, E., Delattre, S., Hoffmann, M., Muzy, J.F., 2011. Scaling limits for Hawkes processes and application to financial statistics. Preprint.
Bauwens, L., Hautsch, N., 2009. Modelling financial high frequency data using point processes. In: Handbook of Financial Time Series. pp. 953–979.
Bordenave, C., Torrisi, G.L., 2007. Large deviations of Poisson cluster processes. Stoch. Models 23, 593–625.
Bowsher, C.G., 2007. Modelling security market events in continuous time: intensity based, multivariate point process models. J. Econometrics 141, 876–912.
Brémaud, P., Massoulié, L., 1996. Stability of nonlinear Hawkes processes. Ann. Probab. 24, 1563–1588.
Chavez-Demoulin, V., Davison, A., McNeil, A., 2005. Estimating value-at-risk: a point process approach. Quant. Finance 5, 227–234.
Chen, X., 2001. Moderate deviations for Markovian occupation times. Stochastic Process. Appl. 94, 51–70.
Daley, D.J., Vere-Jones, D., 2003. An Introduction to the Theory of Point Processes, Vols. I and II, second ed. Springer, New York.
de Acosta, A., 1997. Moderate deviations for empirical measures of Markov chains: lower bounds. Ann. Probab. 25, 259–284.
de Acosta, A., Chen, X., 1998. Moderate deviations for empirical measures of Markov chains: upper bounds. J. Theoret. Probab. 11, 1075–1110.
Dembo, A., 1996. Moderate deviations for martingales with bounded jumps. Electron. Commun. Probab. 1, 11–17.
Dembo, A., Zeitouni, O., 1998. Large Deviations Techniques and Applications, second ed. Springer, New York.
Errais, E., Giesecke, K., Goldberg, L., 2010. Affine point processes and portfolio credit risk. SIAM J. Financial Math. 1, 642–665.
Feller, W., 1971. An Introduction to Probability Theory and Its Applications, Vols. I and II, second ed. John Wiley & Sons, Inc., New York.
Gao, F.Q., 1996. Moderate deviations for martingales and mixing random processes. Stochastic Process. Appl. 61, 263–275.
Ghosh, M., Babu, G.J., 1977. Probabilities of moderate deviations for some stationary φ-mixing processes. Ann. Probab. 5, 222–234.
Grama, I.G., 1997. On moderate deviations for martingales. Ann. Probab. 25, 152–183.
Gusto, G., Schbath, S., 2005. F.a.d.o.: a statistical method to detect favored or avoided distances between occurrences of motifs using the Hawkes model. Stat. Appl. Genet. Mol. Biol. 4, Article 24.
Hawkes, A.G., 1971. Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90.
Hawkes, A.G., Oakes, D., 1974. A cluster process representation of a self-exciting process. J. Appl. Probab. 11, 493–503.
Jagers, P., 1975. Branching Processes with Biological Applications. John Wiley, London.
Johnson, D.H., 1995. Point process models of single-neuron discharges. J. Computational Neuroscience 3, 275–299.
Large, J., 2007. Measuring the resiliency of an electronic limit order book. J. Financ. Mark. 10, 1–25.
Zhu, L., 2011. Large deviations for Markovian nonlinear Hawkes processes. Preprint.
Zhu, L., 2011. Process-level large deviations for nonlinear Hawkes point processes. Ann. Inst. Henri Poincaré (in press-a).
Zhu, L., 2012. Central limit theorem for nonlinear Hawkes processes. J. Appl. Probab. (in press-b).