The rate of consistency of the quasi-maximum likelihood estimator

István Berkes (A. Rényi Institute of Mathematics, Hungarian Academy of Sciences, P.O. Box 127, H-1364 Budapest, Hungary) and Lajos Horváth (Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112-0090, USA)

Statistics & Probability Letters 61 (2003) 133–143. Received July 2002; received in revised form September 2002.

Supported by the Hungarian National Foundation for Scientific Research, Grants T 29621 and T 37886. Corresponding author: I. Berkes, fax +36-1-317-7166, e-mail [email protected].

Abstract

We connect the rate of consistency of the quasi-maximum likelihood estimator in GARCH(p, q) sequences with the number of moments of the innovations.

MSC: primary 62F12; secondary 62M10

Keywords: GARCH(p, q) sequence; Quasi-maximum likelihood; Rate of consistency

1. Introduction

Since its introduction by Bollerslev (1986), the generalized autoregressive conditionally heteroskedastic (GARCH) process has found many applications in the analysis of financial data, including exchange rates and stock prices. A GARCH(p, q) process is defined by the equations
$$y_k=\sigma_k\varepsilon_k \tag{1.1}$$
and
$$\sigma_k^2=\omega+\sum_{1\le i\le p}\alpha_iy_{k-i}^2+\sum_{1\le j\le q}\beta_j\sigma_{k-j}^2, \tag{1.2}$$
where
$$\omega>0,\qquad \alpha_i\ge 0,\ 1\le i\le p,\qquad \beta_j\ge 0,\ 1\le j\le q \tag{1.3}$$







are constants. We also assume that
$$\{\varepsilon_j,\ -\infty<j<\infty\}\ \text{are independent, identically distributed random variables.} \tag{1.4}$$
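To make (1.1)–(1.4) concrete, here is a minimal simulation sketch (our own illustration, not part of the paper); the standard normal innovations, the burn-in length, and the starting value for $\sigma_k^2$ are illustrative assumptions.

```python
import numpy as np

def simulate_garch(omega, alpha, beta, n, burn=500, rng=None):
    """Simulate a GARCH(p, q) path following (1.1)-(1.2).

    alpha, beta: coefficient sequences alpha_1..alpha_p, beta_1..beta_q;
    innovations are i.i.d. standard normal, so E eps_0^2 = 1 as in (1.11) below.
    """
    rng = np.random.default_rng(rng)
    p, q = len(alpha), len(beta)
    m = max(p, q)
    total = n + burn
    y = np.zeros(total + m)
    y2 = np.zeros(total + m)                              # squared observations y_k^2
    sig2 = np.full(total + m, omega / (1.0 - sum(beta)))  # rough starting value for sigma_k^2
    eps = rng.standard_normal(total + m)
    for k in range(m, total + m):
        sig2[k] = (omega
                   + sum(a * y2[k - i - 1] for i, a in enumerate(alpha))   # alpha_i * y_{k-i}^2
                   + sum(b * sig2[k - j - 1] for j, b in enumerate(beta))) # beta_j * sigma_{k-j}^2
        y[k] = np.sqrt(sig2[k]) * eps[k]                  # equation (1.1)
        y2[k] = y[k] ** 2
    return y[-n:]

# Example: a GARCH(1, 1) path with omega = 0.1, alpha_1 = 0.1, beta_1 = 0.8.
path = simulate_garch(0.1, [0.1], [0.8], n=1000, rng=0)
```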

Nelson (1990) showed that in the case of a GARCH(1, 1) sequence, (1.1) and (1.2) have a unique stationary solution if and only if $E\log(\beta_1+\alpha_1\varepsilon_0^2)<0$. Bougerol and Picard (1992a, b) found necessary and sufficient conditions for the existence of a unique stationary solution of (1.1) and (1.2) in the case of a general GARCH(p, q) model. To state their condition we introduce further notation. Let
$$\tau_n=(\beta_1+\alpha_1\varepsilon_n^2,\beta_2,\dots,\beta_{q-1})\in\mathbb{R}^{q-1},\qquad \xi_n=(\varepsilon_n^2,0,\dots,0)\in\mathbb{R}^{q-1}$$
and
$$\boldsymbol{\alpha}=(\alpha_2,\dots,\alpha_{p-1})\in\mathbb{R}^{p-2}.$$
(Clearly, without loss of generality we may and shall assume $\min(p,q)\ge 2$.) Define the $(p+q-1)\times(p+q-1)$ matrix $A_n$, written in block form, by
$$A_n=\begin{pmatrix} \tau_n & \beta_q & \boldsymbol{\alpha} & \alpha_p\\ I_{q-1} & 0 & 0 & 0\\ \xi_n & 0 & 0 & 0\\ 0 & 0 & I_{p-2} & 0 \end{pmatrix},$$
where $I_{q-1}$ and $I_{p-2}$ are the identity matrices of size $q-1$ and $p-2$, respectively. The norm of any $d\times d$ matrix $M$ is defined by
$$\|M\|=\sup\{\|Mx\|_d/\|x\|_d:\ x\in\mathbb{R}^d,\ x\ne 0\},$$
where $\|\cdot\|_d$ is the usual (Euclidean) norm in $\mathbb{R}^d$. The top Lyapunov exponent $L$ associated with the sequence $\{A_n,\ -\infty<n<\infty\}$ is
$$L=\inf_{1\le n<\infty}\frac{1}{n+1}\,E\log\|A_0A_1\cdots A_n\|, \tag{1.5}$$
assuming that
$$E(\log\|A_0\|)<\infty. \tag{1.6}$$

(We note that $\|A_0\|\ge 1$; cf. Berkes et al. (2003).) Bougerol and Picard (1992a, b) showed that if (1.6) holds, then (1.1) and (1.2) have a unique stationary solution if and only if
$$L<0. \tag{1.7}$$
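As a numerical aside (our own addition, not part of the paper), the stationarity condition can be checked by Monte Carlo. In the GARCH(1, 1) case, (1.7) reduces to Nelson's criterion $E\log(\beta_1+\alpha_1\varepsilon_0^2)<0$; the sketch below estimates this expectation under the illustrative assumption of standard normal innovations.

```python
import numpy as np

def nelson_criterion(alpha1, beta1, n_mc=1_000_000, rng=0):
    """Monte Carlo estimate of E log(beta_1 + alpha_1 * eps_0^2).

    A negative value indicates a unique stationary GARCH(1, 1) solution
    (Nelson, 1990); this is condition (1.7) specialized to p = q = 1.
    """
    eps2 = np.random.default_rng(rng).standard_normal(n_mc) ** 2
    return np.log(beta1 + alpha1 * eps2).mean()

print(nelson_criterion(0.1, 0.8))  # clearly negative: a stationary solution exists
print(nelson_criterion(0.5, 0.9))  # typically positive here: no stationary solution
```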

In the rest of the paper we will assume that (1.1)–(1.4), (1.6) and (1.7) hold. Based on the observations $y_1,y_2,\dots,y_n$ one wants to estimate $\theta=(\omega,\alpha_1,\alpha_2,\dots,\alpha_p,\beta_1,\beta_2,\dots,\beta_q)$, the parameter of the GARCH(p, q) process $\{y_i,\ -\infty<i<\infty\}$. Lee and Hansen (1994) and Lumsdaine (1996) proposed the quasi-maximum likelihood estimator and studied its asymptotic properties. They assumed very strict conditions on the distribution of $\varepsilon_0$. Also, the quasi-maximum likelihood estimator in Lee and Hansen (1994) and Lumsdaine (1996) is local, i.e. the likelihood function is maximized in a neighborhood of $\theta$. Berkes et al. (2003) maximized the likelihood function over an arbitrary compact set and proved the asymptotic normality of the quasi-maximum likelihood estimator of $\theta$ under very mild conditions. The main aim of this note is to show that the moment conditions in Berkes et al. (2003) are also necessary.

In order to define the likelihood function, we write $\sigma_k^2$ as a function of $y_{k-1},y_{k-2},\dots$. The coefficients in the representation of $\sigma_k^2$ will be defined by a recursion. Let $u=(x,\mathbf{s},\mathbf{t})\in\mathbb{R}^{p+q+1}$, $x\in\mathbb{R}$, $\mathbf{s}\in\mathbb{R}^p$ and $\mathbf{t}\in\mathbb{R}^q$. We start with the initial conditions. If $q\ge p$, then
$$c_0(u)=x/(1-(t_1+\cdots+t_q)),\qquad c_1(u)=s_1,\qquad c_2(u)=s_2+t_1c_1(u),$$
$$\vdots$$
$$c_p(u)=s_p+t_1c_{p-1}(u)+\cdots+t_{p-1}c_1(u),$$
$$c_{p+1}(u)=t_1c_p(u)+\cdots+t_pc_1(u),$$
$$\vdots$$
$$c_q(u)=t_1c_{q-1}(u)+\cdots+t_{q-1}c_1(u),$$
and if $q<p$, the equations above are replaced with
$$c_0(u)=x/(1-(t_1+\cdots+t_q)),\qquad c_1(u)=s_1,\qquad c_2(u)=s_2+t_1c_1(u),$$
$$\vdots$$
$$c_{q+1}(u)=s_{q+1}+t_1c_q(u)+\cdots+t_qc_1(u),$$
$$\vdots$$
$$c_p(u)=s_p+t_1c_{p-1}(u)+\cdots+t_qc_{p-q}(u).$$
In general, if $i>R=\max(p,q)$, then
$$c_i(u)=t_1c_{i-1}(u)+t_2c_{i-2}(u)+\cdots+t_qc_{i-q}(u).$$
Let $0<v_1<v_2$, $0<\rho_0<1$, $qv_1<\rho_0$, and define
$$U=\{u:\ t_1+t_2+\cdots+t_q\le\rho_0\ \text{and}\ v_1\le\min(x,s_1,\dots,s_p,t_1,\dots,t_q)\le\max(x,s_1,\dots,s_p,t_1,\dots,t_q)\le v_2\}. \tag{1.8}$$
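A small sketch of the recursion (our own illustration, not from the paper): the single loop below reproduces both the $q\ge p$ and the $q<p$ initial conditions by reading absent $s_i$ and $t_j$ as zero; the function name and the GARCH(1, 1) example are ours.

```python
import numpy as np

def garch_coeffs(x, s, t, n_coef):
    """Coefficients c_0(u), ..., c_{n_coef-1}(u) of the representation
    w_k(u) = c_0(u) + sum_i c_i(u) y_{k-i}^2.

    Implements the paper's recursion; one loop covers the q >= p and
    q < p initial conditions as well as the general step i > max(p, q).
    """
    s, t = np.asarray(s, float), np.asarray(t, float)
    p, q = len(s), len(t)
    c = np.zeros(n_coef)
    c[0] = x / (1.0 - t.sum())
    for i in range(1, n_coef):
        c[i] = s[i - 1] if i <= p else 0.0      # s_i term, absent once i > p
        for j in range(1, min(i - 1, q) + 1):   # t_j * c_{i-j} terms
            c[i] += t[j - 1] * c[i - j]
    return c

# GARCH(1, 1): c_1 = s_1 and c_i = t_1 * c_{i-1}, i.e. c_i = s_1 * t_1**(i-1).
print(garch_coeffs(0.1, [0.1], [0.8], 6))
```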


From now on we replace (1.3) with the somewhat stronger condition that
$$\theta\ \text{is inside}\ U. \tag{1.9}$$

The quasi-maximum likelihood estimator is defined by
$$\tilde\theta_n=\arg\max_{u\in U}\tilde L_n(u),$$
where
$$\tilde L_n(u)=-\frac{1}{n}\sum_{1\le k\le n}\frac{1}{2}\left(\log\tilde w_k(u)+\frac{y_k^2}{\tilde w_k(u)}\right)$$
is the likelihood function and
$$\tilde w_k(u)=c_0(u)+\sum_{1\le i\le k-1}c_i(u)y_{k-i}^2.$$
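Computing $\tilde\theta_n$ is a routine numerical optimization. The following sketch (our own, specialized to GARCH(1, 1), where $c_0(u)=x/(1-t_1)$ and $c_i(u)=s_1t_1^{i-1}$) evaluates $\tilde L_n(u)$ through the equivalent recursion $\tilde w_k=x+s_1y_{k-1}^2+t_1\tilde w_{k-1}$ and maximizes it over a box playing the role of $U$; the box bounds, starting point, L-BFGS-B optimizer, and the data file `returns.txt` are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def neg_qmle_garch11(u, y):
    """-L~_n(u) for GARCH(1, 1), u = (x, s1, t1).

    Uses w~_k(u) = c_0(u) + sum_{i < k} c_i(u) y_{k-i}^2 with
    c_0 = x / (1 - t1) and c_i = s1 * t1**(i-1).
    """
    x, s1, t1 = u
    n = len(y)
    w = np.empty(n)
    w[0] = x / (1.0 - t1)
    for k in range(1, n):
        # recursive form: w~_k = x + s1 * y_{k-1}^2 + t1 * w~_{k-1}
        w[k] = x + s1 * y[k - 1] ** 2 + t1 * w[k - 1]
    return 0.5 * np.mean(np.log(w) + y ** 2 / w)

y = np.loadtxt("returns.txt")   # hypothetical data file with the observations
bounds = [(1e-4, 10.0), (1e-4, 1.0), (1e-4, 0.999)]   # a box in the role of U, (1.8)
fit = minimize(neg_qmle_garch11, x0=[0.1, 0.1, 0.5], args=(y,),
               bounds=bounds, method="L-BFGS-B")
print(fit.x)   # quasi-maximum likelihood estimate of (omega, alpha_1, beta_1)
```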

The almost sure consistency of $\tilde\theta_n$ was established by Berkes et al. (2003).

Theorem 1.1. We assume that (1.9) holds,
$$\varepsilon_0^2\ \text{is a nondegenerate random variable}, \tag{1.10}$$
$$E\varepsilon_0^2=1, \tag{1.11}$$
$$E|\varepsilon_0^2|^{1+\delta}<\infty\quad\text{with some}\ \delta>0, \tag{1.12}$$
$$\lim_{t\to 0}t^{-\mu}P\{\varepsilon_0^2\le t\}=0\quad\text{with some}\ \mu>0 \tag{1.13}$$
and
$$\text{the polynomials}\ \alpha_1x+\alpha_2x^2+\cdots+\alpha_px^p\ \text{and}\ 1-\beta_1x-\beta_2x^2-\cdots-\beta_qx^q\ \text{are coprime}$$
$$\text{in the set of polynomials with real coefficients.} \tag{1.14}$$
Then
$$\tilde\theta_n\to\theta\quad\text{a.s.} \tag{1.15}$$

If (1.12) is replaced by a stronger condition, the asymptotic normality of $n^{1/2}(\tilde\theta_n-\theta)$ follows from Berkes et al. (2003).

Theorem 1.2. If (1.9)–(1.11), (1.13), (1.14) hold and
$$E|\varepsilon_0^2|^{2+\delta}<\infty\quad\text{with some}\ \delta>0, \tag{1.16}$$
then
$$n^{1/2}(\tilde\theta_n-\theta)\ \xrightarrow{\ \mathcal{D}\ }\ N(0,B_0) \tag{1.17}$$
with some positive definite matrix $B_0$, where $N(0,B_0)$ denotes a multivariate normal random variable with mean 0 and covariance matrix $B_0$.


Under the conditions of Theorem 1.2 we have that the exact order of $|\tilde\theta_n-\theta|$ is $O_P(n^{-1/2})$. The main result of this paper gives a necessary and sufficient condition for the rate of consistency of $\tilde\theta_n$. The maximum norm of vectors and matrices will be denoted by $|\cdot|$.

Theorem 1.3. We assume that (1.9)–(1.14) hold and let $1<\nu<2$. Then
$$n^{1-1/\nu}|\tilde\theta_n-\theta|\to 0\quad\text{a.s.} \tag{1.18}$$
if and only if
$$E|\varepsilon_0^2|^{\nu}<\infty. \tag{1.19}$$
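Theorem 1.3 lends itself to a rough Monte Carlo illustration (entirely our own; the Student-$t$ construction, parameter values, and sample sizes are illustrative assumptions). With normalized $t_3$ innovations, $E|\varepsilon_0^2|^{\nu}<\infty$ exactly for $\nu<1.5$, so the normalization $n^{1-1/\nu}$ with $\nu=1.4$ should keep the estimation error in check while $\nu=1.8$ should not.

```python
import numpy as np
from scipy.optimize import minimize

def t_innovations(df, n, rng):
    """i.i.d. innovations with E eps^2 = 1; eps^2 has nu moments iff nu < df/2."""
    return rng.standard_t(df, n) / np.sqrt(df / (df - 2.0))

def sim_garch11(theta, n, rng, burn=500):
    omega, a1, b1 = theta
    eps = t_innovations(3.0, n + burn, rng)   # df = 3: E|eps_0^2|^nu < inf iff nu < 1.5
    sig2 = omega / (1.0 - a1 - b1)            # start at the unconditional variance
    y = np.empty(n + burn)
    for k in range(n + burn):
        y[k] = np.sqrt(sig2) * eps[k]                  # (1.1)
        sig2 = omega + a1 * y[k] ** 2 + b1 * sig2      # (1.2)
    return y[burn:]

def qmle11(y):
    """Quasi-maximum likelihood estimate for GARCH(1, 1), cf. the sketch above."""
    def neg_ll(u):
        x, s1, t1 = u
        w = np.empty(len(y))
        w[0] = x / (1.0 - t1)
        for k in range(1, len(y)):
            w[k] = x + s1 * y[k - 1] ** 2 + t1 * w[k - 1]
        return 0.5 * np.mean(np.log(w) + y ** 2 / w)
    return minimize(neg_ll, [0.1, 0.1, 0.5],
                    bounds=[(1e-4, 10.0), (1e-4, 1.0), (1e-4, 0.999)]).x

theta = np.array([0.1, 0.1, 0.8])
rng = np.random.default_rng(0)
for n in (1000, 4000, 16000):
    err = np.abs(qmle11(sim_garch11(theta, n, rng)) - theta).max()
    # (1.18): n**(1 - 1/nu) * err -> 0 exactly when E|eps_0^2|^nu < inf, here nu < 1.5
    print(n, n ** (1 - 1 / 1.4) * err, n ** (1 - 1 / 1.8) * err)
```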

2. Preliminary lemmas

The following lemmas are taken from Berkes et al. (2003). Let
$$w_k(u)=c_0(u)+\sum_{1\le j<\infty}c_j(u)y_{k-j}^2.$$
Note that $w'_k(u)=\partial w_k(u)/\partial u$ is a $(p+q+1)$-dimensional vector and $w''_k(u)=\partial^2w_k(u)/\partial u^2$ is a $(p+q+1)\times(p+q+1)$ matrix.

Lemma 2.1. If the conditions of Theorem 1.1 are satisfied, then
$$E\sup_{u\in U}\left|\frac{w'_k(u)}{w_k(u)}\right|^{\kappa}<\infty \qquad\text{and}\qquad E\sup_{u\in U}\left|\frac{w''_k(u)}{w_k(u)}\right|^{\kappa}<\infty$$
for all $\kappa>0$.

Proof. It follows immediately from Lemmas 3.2, 3.3 and 5.2 of Berkes et al. (2003).

Lemma 2.2. If the conditions of Theorem 1.1 are satisfied, then
$$E\sup_{u\in U}\left(\frac{\sigma_k^2}{w_k(u)}\right)^{\kappa}<\infty\qquad\text{for all}\ 0<\kappa<1+\delta.$$

Proof. This is Lemma 5.1 in Berkes et al. (2003).

Let
$$L_n(u)=-\frac{1}{n}\sum_{1\le k\le n}\frac{1}{2}\left(\log w_k(u)+\frac{y_k^2}{w_k(u)}\right).$$
We note that, under the conditions of Theorem 1.1, $L(u)=EL_n(u)$ exists and is independent of $n$.


Lemma 2.3. If the conditions of Theorem 1.1 are satisfied, then
$$\sup_{u\in U}|L_n(u)-L(u)|\to 0\quad\text{a.s.} \tag{2.1}$$

Proof. Following the proof of Lemma 5.5 in Berkes et al. (2003), one can verify that Lemmas 2.1 and 2.2 with the ergodic theorem give (2.1).

Lemma 2.4. If the conditions of Theorem 1.1 are satisfied, then
$$n\sup_{u\in U}|L'_n(u)-\tilde L'_n(u)|=O(1)\quad\text{a.s.} \tag{2.2}$$

Proof. By Lemmas 3.1 and 3.2 of Berkes et al. (2003) there exist constants $d_1,d_2$ and $0<\lambda<1$ such that $|c_i(u)|\le d_1\lambda^i$ and $|c'_i(u)|\le d_2i\lambda^i$, $1\le i<\infty$. Let
$$\zeta_1=\sum_{0\le i<\infty}\lambda^iy_{-i}^2\qquad\text{and}\qquad \zeta_2=\sum_{1\le i<\infty}i\lambda^iy_{-i}^2.$$
We will see below that $\zeta_1$ and $\zeta_2$ are a.s. finite. It is easy to see that
$$\varphi_{k,1}:=\sup_{u\in U}|w_k(u)-\tilde w_k(u)|\le\sum_{k\le i<\infty}\sup_{u\in U}|c_i(u)|y_{k-i}^2\le d_1\sum_{k\le i<\infty}\lambda^iy_{k-i}^2=\lambda^kd_1\zeta_1 \tag{2.3}$$
and similarly
$$\varphi_{k,2}:=\sup_{u\in U}|w'_k(u)-\tilde w'_k(u)|\le\lambda^kd_2\zeta_2. \tag{2.4}$$
Clearly,
$$L'_n(u)=-\frac{1}{n}\sum_{1\le k\le n}\frac{1}{2}\,\frac{w'_k(u)}{w_k(u)}\left(1-\frac{y_k^2}{w_k(u)}\right)$$
and a similar expression holds for $\tilde L'_n(u)$. Let
$$\tilde\ell_k(u)=\frac{w'_k(u)}{w_k(u)}\left(1-\frac{y_k^2}{w_k(u)}\right)-\frac{\tilde w'_k(u)}{\tilde w_k(u)}\left(1-\frac{y_k^2}{\tilde w_k(u)}\right)$$
and
$$\varphi_{k,3}:=\sup_{u\in U}\left|\frac{w'_k(u)}{w_k(u)}\right|.$$
Since $w_k(u)\ge v_1$ and $\tilde w_k(u)\ge v_1$ for all $u\in U$, a simple calculation shows that
$$\sup_{u\in U}|\tilde\ell_k(u)|\le d_3(1+y_k^2)(\varphi_{k,1}+\varphi_{k,2}+\varphi_{k,1}\varphi_{k,3}). \tag{2.5}$$


By (2.3)–(2.5) there are constants $d_4$ and $d_5$ such that
$$n\sup_{u\in U}|L'_n(u)-\tilde L'_n(u)|\le\sum_{1\le k\le n}\sup_{u\in U}|\tilde\ell_k(u)|\le d_4(\zeta_1+\zeta_2)\sum_{1\le k<\infty}(y_k^2+1)\lambda^k+d_5(\zeta_1+\zeta_2)\sum_{1\le k<\infty}(y_k^2+1)\varphi_{k,3}\lambda^k.$$
Observe that $E|y_0^2|^{\kappa^*}<\infty$ with some $\kappa^*>0$ (cf. Davis et al., 1999) and $E\varphi_{k,3}^{\kappa}<\infty$ for all $\kappa>0$ (cf. Lemma 2.1). Since $\{(y_k,\varphi_{k,3}),\ -\infty<k<\infty\}$ is a stationary sequence, Lemma 2.1 in Berkes et al. (2003) yields that $\sum_{1\le k<\infty}(1+y_k^2+\varphi_{k,3}+y_k^2\varphi_{k,3})\lambda^k<\infty$ a.s. The same argument shows that $\zeta_1$ and $\zeta_2$ are a.s. finite, as claimed above. This completes the proof of Lemma 2.4.

3. Proof of Theorem 1.3

We start with two new technical lemmas.

Lemma 3.1. There is a constant $0<c^*<\infty$ such that
$$c^*c_i(u)\le\frac{\partial c_i(u)}{\partial s_1},\qquad 1\le i<\infty \tag{3.1}$$
and
$$c^*c_i(u)\le\frac{\partial c_i(u)}{\partial t_1},\qquad 2\le i<\infty \tag{3.2}$$
for all $u\in U$.

Proof. Clearly,
$$\frac{\partial c_0(u)}{\partial s_1}=0,\qquad \frac{\partial c_1(u)}{\partial s_1}=1\qquad\text{and}\qquad \frac{\partial c_2(u)}{\partial s_1}=t_1, \tag{3.3}$$
and elementary calculations show that with a suitable $0<c^*<\infty$ (3.1) holds if $1\le i\le\max(p,q)$. Next we use (1.8) and induction. Since
$$\frac{1}{c_i(u)}\frac{\partial c_i(u)}{\partial s_1}=t_1\frac{c_{i-1}(u)}{c_i(u)}\frac{1}{c_{i-1}(u)}\frac{\partial c_{i-1}(u)}{\partial s_1}+\cdots+t_q\frac{c_{i-q}(u)}{c_i(u)}\frac{1}{c_{i-q}(u)}\frac{\partial c_{i-q}(u)}{\partial s_1}$$
$$\ge\min_{1\le j\le q}\left(\frac{1}{c_{i-j}(u)}\frac{\partial c_{i-j}(u)}{\partial s_1}\right)\frac{1}{c_i(u)}\{t_1c_{i-1}(u)+\cdots+t_qc_{i-q}(u)\}$$
(and the expression in braces equals $c_i(u)$ for $i>\max(p,q)$), (3.1) is proven.


Next we note that
$$\frac{\partial c_0(u)}{\partial t_1}=\frac{x}{(1-(t_1+\cdots+t_q))^2},\qquad \frac{\partial c_1(u)}{\partial t_1}=0\qquad\text{and}\qquad \frac{\partial c_2(u)}{\partial t_1}=c_1(u), \tag{3.4}$$
and clearly $0<c^*<\infty$ can be chosen so that (3.2) holds for $2\le i\le\max(p,q)$. Using (1.8) and induction, one can verify (3.2) following the proof of (3.1).

Lemma 3.2. There is a constant $0<c<\infty$ such that
$$c\le\frac{|w'_k(u)|}{w_k(u)},\qquad -\infty<k<\infty \tag{3.5}$$
for all $u\in U$.

Proof. Assuming that $y_{k-1}^2\ge x/(1-(t_1+\cdots+t_q))$, we use first (3.3) and then (3.1) to conclude that
$$\frac{\partial w_k(u)}{\partial s_1}=\frac{\partial}{\partial s_1}\left(c_0(u)+\sum_{1\le j<\infty}c_j(u)y_{k-j}^2\right)=y_{k-1}^2+t_1y_{k-2}^2+\sum_{3\le j<\infty}\frac{\partial c_j(u)}{\partial s_1}y_{k-j}^2$$
$$\ge\min(t_1,c^*,1)\left(y_{k-1}^2+y_{k-2}^2+\sum_{3\le j<\infty}c_j(u)y_{k-j}^2\right)$$
$$\ge\min(t_1,c^*,1)\left(\frac{1}{2}\,\frac{x}{1-(t_1+\cdots+t_q)}+\frac{1}{2}y_{k-1}^2+y_{k-2}^2+\sum_{3\le j<\infty}c_j(u)y_{k-j}^2\right)$$
$$\ge\frac{1}{2}\min(t_1,c^*,1)\min\left(1,\frac{1}{c_1(u)},\frac{1}{c_2(u)}\right)\left(\frac{x}{1-(t_1+\cdots+t_q)}+\sum_{1\le j<\infty}c_j(u)y_{k-j}^2\right)$$
$$=\frac{1}{2}\min(t_1,c^*,1)\min\left(1,\frac{1}{c_1(u)},\frac{1}{c_2(u)}\right)w_k(u).$$


If $y_{k-1}^2\le x/(1-(t_1+\cdots+t_q))$, then by (3.4) and (3.2) we have
$$\frac{\partial w_k(u)}{\partial t_1}=\frac{x}{(1-(t_1+\cdots+t_q))^2}+c_1(u)y_{k-2}^2+\sum_{3\le j<\infty}\frac{\partial c_j(u)}{\partial t_1}y_{k-j}^2$$
$$\ge\frac{x}{(1-(t_1+\cdots+t_q))^2}+c_1(u)y_{k-2}^2+c^*\sum_{3\le j<\infty}c_j(u)y_{k-j}^2$$
$$\ge\frac{1}{1-(t_1+\cdots+t_q)}\left(\frac{1}{2}\,\frac{x}{1-(t_1+\cdots+t_q)}+\frac{1}{2}y_{k-1}^2\right)+c_1(u)y_{k-2}^2+c^*\sum_{3\le j<\infty}c_j(u)y_{k-j}^2$$
$$\ge\frac{1}{2}\min\left(\frac{1}{1-(t_1+\cdots+t_q)},\frac{1}{c_1(u)},\frac{c_1(u)}{c_2(u)},c^*\right)w_k(u).$$
Since
$$0<\min_{u\in U}\,\min\left(\frac{1}{c_1(u)},\frac{1}{c_2(u)},\frac{c_1(u)}{c_2(u)}\right),$$
Lemma 3.2 is proved.

Let
$$\ell_k=\frac{1}{2\sigma_k^2}\,w'_k(\theta).$$

Lemma 3.3. We assume that (1.9)–(1.14) hold and let $1<\nu<2$. Then
$$n^{-1/\nu}\sum_{1\le k\le n}(1-\varepsilon_k^2)\ell_k\to 0\quad\text{a.s.} \tag{3.6}$$
if and only if (1.19) holds.

Proof. Let us assume first that (1.19) holds. Let $\eta_k=1-\varepsilon_k^2$, $Y_k=\eta_k\ell_k$ and $Y_k^*=\ell_k\eta_kI\{|\eta_k|\le k^{1/\nu}\}$. By (1.19) we have that $\sum_{1\le k<\infty}P\{|\eta_k|>k^{1/\nu}\}<\infty$, and therefore by the Borel–Cantelli lemma we conclude
$$P\{Y_k\ne Y_k^*\ \text{i.o.}\}=0. \tag{3.7}$$
Let $\mathcal{F}_i$ be the $\sigma$-algebra generated by $\varepsilon_j$, $j\le i$. Since $\ell_k$ is $\mathcal{F}_{k-1}$-measurable, we have that $E(Y_k^*|\mathcal{F}_{k-1})=\ell_kf_k$, where $f_k=E\eta_kI\{|\eta_k|\le k^{1/\nu}\}$. Note that $|f_k|\le E|\eta_k|I\{|\eta_k|>k^{1/\nu}\}$ by (1.11).
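(For completeness, a remark of ours: the summability behind (3.7) is the standard moment–tail estimate. Since the $\eta_k$ are identically distributed,
$$\sum_{1\le k<\infty}P\{|\eta_k|>k^{1/\nu}\}=\sum_{1\le k<\infty}P\{|\eta_0|^{\nu}>k\}\le\int_0^{\infty}P\{|\eta_0|^{\nu}>t\}\,dt=E|\eta_0|^{\nu},$$
which is finite because $E|1-\varepsilon_0^2|^{\nu}<\infty$ follows from (1.19).)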


Hence
$$\sum_{1\le k<\infty}\frac{1}{k^{1/\nu}}E|E(Y_k^*|\mathcal{F}_{k-1})|=E|\ell_0|\sum_{1\le k<\infty}\frac{1}{k^{1/\nu}}|f_k|$$
$$\le E|\ell_0|\sum_{1\le k<\infty}\frac{1}{k^{1/\nu}}E(|\eta_k|I\{|\eta_k|>k^{1/\nu}\})$$
$$\le E|\ell_0|\sum_{1\le k<\infty}k^{-1/\nu}\sum_{k+1\le i<\infty}i^{1/\nu}P\{(i-1)^{1/\nu}\le|\eta_0|<i^{1/\nu}\}$$
$$=E|\ell_0|\sum_{1\le i<\infty}i^{1/\nu}P\{(i-1)^{1/\nu}\le|\eta_0|<i^{1/\nu}\}\sum_{1\le k\le i-1}k^{-1/\nu}$$
$$\le\text{const}\sum_{1\le i<\infty}iP\{(i-1)^{1/\nu}\le|\eta_0|<i^{1/\nu}\}<\infty$$
by (1.19). Hence $\sum_{1\le k<\infty}k^{-1/\nu}E(Y_k^*|\mathcal{F}_{k-1})$ converges a.s. and therefore by the Kronecker lemma we have
$$\frac{1}{n^{1/\nu}}\sum_{1\le k\le n}E(Y_k^*|\mathcal{F}_{k-1})\to 0\quad\text{a.s.} \tag{3.8}$$
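(The form of Kronecker's lemma used here, and again for (3.9) below, is the classical one; we record it for convenience: if $b_n\uparrow\infty$ and $\sum_{1\le k<\infty}x_k/b_k$ converges, then $b_n^{-1}\sum_{1\le k\le n}x_k\to 0$. It is applied coordinatewise with $b_k=k^{1/\nu}$ and $x_k=E(Y_k^*|\mathcal{F}_{k-1})$.)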

Observing that $Y_k^*-E(Y_k^*|\mathcal{F}_{k-1})=\ell_k(\eta_kI\{|\eta_k|\le k^{1/\nu}\}-f_k)$ are the terms of a martingale difference sequence and hence are orthogonal random vectors, Chow and Teicher (1988, p. 118) yields
$$E\left[\left(\sum_{1\le k\le n}\frac{1}{k^{1/\nu}}(Y_k^*-E(Y_k^*|\mathcal{F}_{k-1}))\right)^{T}\left(\sum_{1\le k\le n}\frac{1}{k^{1/\nu}}(Y_k^*-E(Y_k^*|\mathcal{F}_{k-1}))\right)\right]$$
$$=E(\ell_0^{T}\ell_0)\sum_{1\le k\le n}\frac{1}{k^{2/\nu}}E(\eta_kI\{|\eta_k|\le k^{1/\nu}\}-f_k)^2$$
$$=E(\ell_0^{T}\ell_0)\sum_{1\le k\le n}\frac{1}{k^{2/\nu}}E(\eta_0I\{|\eta_0|\le k^{1/\nu}\}-f_k)^2\le K$$
with some constant $K$. Using Corollary 2.2 in Hall and Heyde (1980, p. 18) we get that $\sum_{1\le k<\infty}k^{-1/\nu}(Y_k^*-E(Y_k^*|\mathcal{F}_{k-1}))$ converges a.s. By the Kronecker lemma we have
$$\frac{1}{n^{1/\nu}}\sum_{1\le k\le n}(Y_k^*-E(Y_k^*|\mathcal{F}_{k-1}))\to 0\quad\text{a.s.} \tag{3.9}$$
Now (3.6) follows from (3.7)–(3.9).


Let us assume now that (3.6) holds. Then
$$(1-\varepsilon_n^2)\ell_n=\sum_{1\le k\le n}(1-\varepsilon_k^2)\ell_k-\sum_{1\le k\le n-1}(1-\varepsilon_k^2)\ell_k=o(n^{1/\nu})\quad\text{a.s.},$$
and therefore we have
$$|(1-\varepsilon_n^2)\ell_n|=o(n^{1/\nu})\quad\text{a.s.} \tag{3.10}$$
In the light of Lemma 3.2, (3.10) implies that
$$|1-\varepsilon_n^2|=o(n^{1/\nu})\quad\text{a.s.},$$
and therefore (1.19) follows from (1.4) and Chow and Teicher (1988, p. 125).

Proof of Theorem 1.3. Since $\tilde L'_n(\tilde\theta_n)=0$, by Lemma 2.4 we have
$$L'_n(\tilde\theta_n)-L'_n(\theta)=-L'_n(\theta)+O\left(\frac{1}{n}\right)\quad\text{a.s.}$$
Berkes et al. (2003) showed that $L''(\theta)$ is a positive definite matrix, and therefore by Theorem 1.1, Lemma 2.3 and the mean value theorem we have
$$\tilde\theta_n-\theta=-\{(L''(\theta))^{-1}+o(1)\}L'_n(\theta)+O\left(\frac{1}{n}\right)\quad\text{a.s.}$$
Since $\sigma_k^2=w_k(\theta)$, (1.1), the expression for $L'_n(u)$ in the proof of Lemma 2.4 and the definition of $\ell_k$ yield that
$$L'_n(\theta)=-\frac{1}{n}\sum_{1\le k\le n}(1-\varepsilon_k^2)\ell_k. \tag{3.11}$$

Thus, Theorem 1.3 follows from (3.11) and Lemma 3.3.

References

Berkes, I., Horváth, L., Kokoszka, P., 2003. GARCH processes: structure and estimation. Bernoulli, to appear.
Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. J. Econometrics 31, 307–327.
Bougerol, P., Picard, N., 1992a. Strict stationarity of generalized autoregressive processes. Ann. Probab. 20, 1714–1730.
Bougerol, P., Picard, N., 1992b. Stationarity of GARCH processes and of some non-negative time series. J. Econometrics 52, 115–127.
Chow, Y.S., Teicher, H., 1988. Probability Theory, 2nd Edition. Springer, New York.
Davis, R.A., Mikosch, T., Basrak, B., 1999. Sample ACF of multivariate stochastic recurrence equations with applications to GARCH. Preprint.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and its Application. Academic Press, New York.
Lee, S.-W., Hansen, B.E., 1994. Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory 10, 29–52.
Lumsdaine, R.L., 1996. Consistency and asymptotic normality of the quasi-maximum likelihood estimator in IGARCH(1,1) and covariance stationary GARCH(1,1) models. Econometrica 64, 575–596.
Nelson, D.B., 1990. Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6, 318–334.