NM–QELE for ARMA–GARCH models with non-Gaussian innovations

Statistics and Probability Letters 81 (2011) 694–703
Jeongcheol Ha (a), Taewook Lee (b,*)

(a) Department of Statistics, Keimyung University, Republic of Korea
(b) Department of Statistics, Hankuk University of Foreign Studies, Republic of Korea

Article history: Received 2 March 2010; Received in revised form 1 February 2011; Accepted 2 February 2011; Available online 12 February 2011.

Keywords: ARMA–GARCH model; Consistency; Gaussian mixture model; QMLE; Quasi-maximum estimated-likelihood estimator

Abstract: Although the quasi-maximum likelihood estimator based on the Gaussian density (Gaussian–QMLE) is widely used to estimate the parameters of ARMA models with GARCH innovations (ARMA–GARCH models), it does not perform well when the error distribution of the ARMA–GARCH model is either skewed or leptokurtic. To circumvent these defects, Lee and Lee (submitted for publication) proposed the quasi-maximum estimated-likelihood estimator based on a Gaussian mixture likelihood (NM–QELE) for GARCH models. In this paper, we adopt the NM–QELE method for estimating the parameters of ARMA–GARCH models and demonstrate its validity by verifying its consistency.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Since the papers of Engle (1982) and Bollerslev (1986), generalized autoregressive conditional heteroscedastic (GARCH) models have been a useful tool for analyzing the volatility of financial time series. This is because GARCH models are capable of reflecting several prominent features of financial time series, such as leptokurtosis, conditional heteroscedasticity and volatility clustering. Although GARCH models perform comparatively well in many applications, it is more common to employ autoregressive moving average models with GARCH innovations (ARMA–GARCH models) to model the conditional mean and variance of financial time series simultaneously.

For the parameter estimation of both GARCH and ARMA–GARCH models, the quasi-maximum likelihood estimator based on the Gaussian density (Gaussian–QMLE) is widely used, due to its tractability and excellent performance. With respect to the asymptotic properties of the Gaussian–QMLE in GARCH models, a large number of outstanding works have been produced, including Berkes and Horváth (2004), Francq and Zakoïan (2004) and Jensen and Rahbek (2004), and the papers cited in those articles. On the other hand, only a few papers have studied the Gaussian–QMLE in ARMA–GARCH models. For instance, Francq and Zakoïan (2004) extended the asymptotic results of the Gaussian–QMLE in GARCH models to ARMA–GARCH models under mild conditions.

Although the Gaussian–QMLE behaves appropriately in financial applications, empirical studies have shown that it does not perform well when the error distribution of GARCH or ARMA–GARCH models is either skewed or leptokurtic compared to the Gaussian distribution. See, for instance, Hall and Yao (2003) and Haas et al. (2004), and the papers cited in those articles. As a remedy for the Gaussian–QMLE in GARCH models, there are several approaches using the two-sided exponential (cf. Berkes and Horváth, 2004) and Gaussian mixture distributions (cf. Ausín and Galeano, 2007; Lee and Lee, 2009). Among them, it is worth noting that Lee and Lee (2009) suggested the quasi-maximum likelihood



Corresponding author. E-mail addresses: [email protected] (J. Ha), [email protected] (T. Lee).

0167-7152/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2011.02.004


estimator based on a Gaussian mixture likelihood (NM–QMLE) and established its consistency and asymptotic normality. This is a natural choice, since Gaussian mixture models are widely used to model leptokurtic and skewed distributions (cf. McLachlan and Peel, 2000). Nevertheless, Lee and Lee (2009) also noted that the EM-like algorithm used to compute the NM–QMLE is highly time consuming and does not completely guarantee convergence. In order to circumvent these defects, Lee and Lee (submitted for publication) proposed the quasi-maximum estimated-likelihood estimator based on a Gaussian mixture likelihood (NM–QELE) for GARCH models, which is a step-by-step estimation algorithm, and found that the NM–QELE algorithm outperforms the EM-like algorithm used for the NM–QMLE in Lee and Lee (2009) in terms of both stability and speed.

It may seem obvious that all these newly developed approaches for GARCH models are also easily applicable to ARMA–GARCH models. However, their validity should be carefully investigated case by case before actual usage. The main objective of this paper is to adopt the NM–QELE method for estimating the parameters of ARMA–GARCH models. We then extend the asymptotic properties of the NM–QELE in GARCH models to ARMA–GARCH models and demonstrate the validity of the NM–QELE by verifying its consistency.

This paper is organized as follows. In Section 2, we establish the consistency of the NM–QELE. In Section 3, we provide the proofs of the theorems presented in Section 2.

2. Main results

2.1. NM–QELE of the ARMA–GARCH model

Let us consider the ARMA(P, Q)–GARCH(p, q) model:

$$X_t - c_0 = \sum_{i=1}^{P} a_{0i}(X_{t-i} - c_0) + e_t - \sum_{j=1}^{Q} b_{0j} e_{t-j}, \qquad e_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = \omega_0 + \sum_{i=1}^{q} \alpha_{0i} e_{t-i}^2 + \sum_{j=1}^{p} \beta_{0j} \sigma_{t-j}^2, \tag{1}$$
where $\{\epsilon_t\}$ is a sequence of independent and identically distributed random variables with a density function $g$ with respect to the Lebesgue measure, such that $E[\epsilon_t] = 0$ and $E[\epsilon_t^2] < \infty$, $c_0, a_{0i}, b_{0j} \in \mathbb{R}$ ($i = 1, \ldots, P$, $j = 1, \ldots, Q$), $\omega_0 > 0$, $\alpha_{0i} \ge 0$ ($i = 1, \ldots, q$) and $\beta_{0j} \ge 0$ ($j = 1, \ldots, p$). The parameter vector is denoted by $\phi = (\theta_1^T, \theta_2^T)^T$, where $\theta_1^T = (c, a_1, \ldots, a_P, b_1, \ldots, b_Q)$ and $\theta_2^T = (\omega, \alpha_1, \ldots, \alpha_q, \beta_1, \ldots, \beta_p)$; the true value is $\phi_0 = (\theta_{01}^T, \theta_{02}^T)^T$, where $\theta_{01}^T = (c_0, a_{01}, \ldots, a_{0P}, b_{01}, \ldots, b_{0Q})$ and $\theta_{02}^T = (\omega_0, \alpha_{01}, \ldots, \alpha_{0q}, \beta_{01}, \ldots, \beta_{0p})$; and the parameter space of $\phi$ is $\Phi \subset \mathbb{R}^{P+Q+1} \times (0, \infty) \times [0, \infty)^{p+q}$. Conditional on initial values for $X_0, \ldots, X_{1-(q-Q)-P}$, $\tilde{e}_{\min\{0,-q+Q\}}, \ldots, \tilde{e}_{1-\max\{q,Q\}}$ and $\tilde{\sigma}_0^2, \ldots, \tilde{\sigma}_{1-p}^2$, we can compute

$$\tilde{e}_t(\theta_1) := \tilde{e}_t = X_t - c - \sum_{i=1}^{P} a_i (X_{t-i} - c) + \sum_{j=1}^{Q} b_j \tilde{e}_{t-j} \quad \text{for } t = -q + Q + 1, \ldots, n, \tag{2}$$

$$\tilde{\sigma}_t^2(\phi) := \tilde{\sigma}_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \tilde{e}_{t-i}^2 + \sum_{j=1}^{p} \beta_j \tilde{\sigma}_{t-j}^2 \quad \text{for } t = 1, \ldots, n. \tag{3}$$
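As a concrete illustration of the data-generating mechanism in (1), the following sketch simulates an ARMA(1,1)–GARCH(1,1) path with standard normal innovations; the function name, the burn-in length and the choice of standard normal $g$ are illustrative assumptions, not part of the paper.

```python
import numpy as np

def simulate_arma_garch(n, c, a, b, omega, alpha, beta, burn=500, rng=None):
    """Simulate an ARMA(1,1)-GARCH(1,1) path following model (1), with
    i.i.d. standard normal innovations (an illustrative choice of g).
    Requires alpha + beta < 1, as in condition (A2)."""
    gen = np.random.default_rng(rng)
    eps = gen.standard_normal(n + burn)
    X = np.zeros(n + burn)
    e = np.zeros(n + burn)
    # start the volatility recursion at the unconditional variance
    sig2 = np.full(n + burn, omega / (1.0 - alpha - beta))
    for t in range(1, n + burn):
        sig2[t] = omega + alpha * e[t - 1] ** 2 + beta * sig2[t - 1]
        e[t] = np.sqrt(sig2[t]) * eps[t]
        X[t] = c + a * (X[t - 1] - c) + e[t] - b * e[t - 1]
    return X[burn:]  # discard the burn-in so start-up effects are negligible
```

As a quick sanity check, with $a = b = c = 0$ and $\alpha = \beta = 0$ the model collapses to $X_t = \sqrt{\omega}\,\epsilon_t$, whose sample variance should be close to $\omega$.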

These initial values will be taken to be fixed, neither random nor functions of the parameters (cf. Francq and Zakoïan, 2004, page 612). In order to obtain the NM–QELE, we introduce a family of normal mixture densities. Let $\mathcal{F} = \{f_\eta : \eta \in \Omega\}$ be the set of $K$-component normal mixture densities of the form

$$f_\eta(y) = \sum_{k=1}^{K} \pi_k f(y; \mu_k, \sigma_k),$$

where $f(y; \mu_k, \sigma_k) := \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left\{-\frac{1}{2}\left(\frac{y - \mu_k}{\sigma_k}\right)^2\right\}$. Here, the parameter for the normal mixture model is denoted by $\eta = (\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \sigma_1^2, \ldots, \sigma_K^2)^T$ and its parameter space is

$$\Omega \subset \Omega_0 = \left\{\eta \in [0, 1]^K \times \mathbb{R}^K \times (0, \infty)^K : \sum_{k=1}^{K} \pi_k = 1 \ \text{and} \ \sum_{i=1}^{K} \pi_i(\mu_i^2 + \sigma_i^2) = 1\right\}.$$


Here, the moment condition in $\Omega_0$ guarantees that the estimator considered in this paper is uniquely defined. Further, it is assumed that the model $\mathcal{F}$ is weakly identifiable, that is,

$$\sum_{k=1}^{K} \pi_k^{(1)} f(y; \mu_k^{(1)}, \sigma_k^{(1)}) = \sum_{k=1}^{K} \pi_k^{(2)} f(y; \mu_k^{(2)}, \sigma_k^{(2)}) \ \text{a.s.} \iff \sum_{k=1}^{K} \pi_k^{(1)} \delta_{(\mu_k^{(1)}, \sigma_k^{(1)})} = \sum_{k=1}^{K} \pi_k^{(2)} \delta_{(\mu_k^{(2)}, \sigma_k^{(2)})}, \tag{4}$$

where $\delta_{(\mu_k, \sigma_k)}(\cdot)$ is the function with $\delta_{(\mu_k, \sigma_k)}(\mu_k, \sigma_k) = 1$ and $\delta_{(\mu_k, \sigma_k)}(x, y) = 0$ for all $(x, y) \ne (\mu_k, \sigma_k)$. Now, we set

$$\eta_0 = (\pi_{01}, \ldots, \pi_{0K}, \mu_{01}, \ldots, \mu_{0K}, \sigma_{01}^2, \ldots, \sigma_{0K}^2)^T := \arg\min_{\eta \in \Omega} d(g, f_\eta), \tag{5}$$

where $d(\cdot, \cdot)$ denotes the Kullback–Leibler divergence defined by

$$d(g, f) := \int g(y)\,(\log g(y) - \log f(y))\,dy.$$

In our set-up, $\phi$ and $\eta$ are viewed as a structural parameter and a nuisance parameter, respectively. The NM–QELE is obtained as follows.

(Step 1) By employing the Gaussian–QMLE (cf. Francq and Zakoïan, 2004)

$$\hat{\phi}_n = (\hat{\theta}_1^T, \hat{\theta}_2^T)^T = (\hat{c}_0, \hat{a}_1, \ldots, \hat{a}_P, \hat{b}_1, \ldots, \hat{b}_Q, \hat{\omega}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_q, \hat{\beta}_1, \ldots, \hat{\beta}_p), \tag{6}$$

we obtain the residuals

$$\tilde{\epsilon}_t := \frac{\tilde{e}_t(\hat{\theta}_1)}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}}, \quad t = 1, \ldots, n. \tag{7}$$

Here, $\tilde{e}_t(\hat{\theta}_1)$ and $\tilde{\sigma}_t^2(\hat{\phi}_n)$ are computed from Eqs. (2) and (3), respectively.
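To make (Step 1) concrete, the recursions (2) and (3) and the residuals (7) can be sketched as below for the ARMA(1,1)–GARCH(1,1) case. The Gaussian–QMLE step that produces the parameter estimates is not shown, and fixing the initial values at $\tilde{e}_0 = 0$, $X_0 = 0$ and $\tilde{\sigma}_0^2 = \omega$ is an illustrative choice; the function name is ours.

```python
import numpy as np

def residuals_step1(X, c, a, b, omega, alpha, beta):
    """(Step 1) for ARMA(1,1)-GARCH(1,1): run recursions (2) and (3) and
    return the residuals (7). Initial values are fixed at e~_0 = 0, X_0 = 0
    and sigma~_0^2 = omega (illustrative choices)."""
    n = len(X)
    e = np.zeros(n)
    sig2 = np.zeros(n)
    e_prev, sig2_prev, x_prev = 0.0, omega, 0.0
    for t in range(n):
        e[t] = X[t] - c - a * (x_prev - c) + b * e_prev           # recursion (2)
        sig2[t] = omega + alpha * e_prev ** 2 + beta * sig2_prev  # recursion (3)
        e_prev, sig2_prev, x_prev = e[t], sig2[t], X[t]
    return e / np.sqrt(sig2)                                      # residuals (7)
```

With $b = 0$ and $\alpha = \beta = 0$ the recursions reduce to $\tilde{e}_t = X_t - c - a(X_{t-1} - c)$ and $\tilde{\sigma}_t^2 = \omega$, which gives a simple check of the implementation.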

(Step 2) Define the preliminary estimator for $\eta$ based on the residuals $\{\tilde{\epsilon}_t\}$ in (Step 1):

$$\tilde{\eta}_n^* := \arg\max_{\eta \in \Omega} \tilde{l}_n^*(\eta), \quad \text{where } \tilde{l}_n^*(\eta) := \frac{1}{n}\sum_{t=1}^{n} \log f_\eta(\tilde{\epsilon}_t).$$
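The maximization in (Step 2) can be approximated, for instance, by a plain EM iteration on the residuals. The sketch below (for a small $K$, with quantile-based initialization) is one illustrative optimizer among several possibilities; it does not impose the moment constraint appearing in the definition of $\Omega_0$, and the function name is ours.

```python
import numpy as np

def fit_mixture_em(r, K=2, iters=200):
    """Approximate the (Step 2) maximizer: fit a K-component normal mixture
    to the residuals r by EM. Quantile-based initialization; the moment
    constraint defining Omega_0 is not imposed (illustrative sketch)."""
    n = len(r)
    pi = np.full(K, 1.0 / K)
    mu = np.quantile(r, (np.arange(K) + 1.0) / (K + 1))  # spread initial means over the data
    sig2 = np.full(K, r.var())
    for _ in range(iters):
        # E-step: responsibilities w[t, k] = P(component k | r[t])
        dens = np.exp(-0.5 * (r[:, None] - mu) ** 2 / sig2) / np.sqrt(2.0 * np.pi * sig2)
        w = pi * dens
        w /= w.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of weights, means and variances
        nk = w.sum(axis=0)
        pi, mu = nk / n, (w * r[:, None]).sum(axis=0) / nk
        sig2 = (w * (r[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, sig2
```

On well-separated simulated data the iteration recovers the component locations; in practice the number of components $K$ and the stopping rule would have to be chosen by the user.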

(Step 3) Define the estimated quasi-likelihood

$$\tilde{L}_n(\phi, \tilde{\eta}_n^*) = \prod_{t=1}^{n} \frac{1}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\, f_{\tilde{\eta}_n^*}\!\left(\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\right).$$

Then, the NM–QELE $\hat{\phi}_n^e$ of $\phi_0$ is defined by

$$\hat{\phi}_n^e := \arg\max_{\phi \in \Phi} \tilde{l}_n^e(\phi), \quad \text{where } \tilde{l}_n^e(\phi) := \frac{1}{n}\log \tilde{L}_n(\phi, \tilde{\eta}_n^*) = \frac{1}{n}\sum_{t=1}^{n} \tilde{l}_t \ \text{ and } \ \tilde{l}_t = \tilde{l}_t(\phi, \tilde{\eta}_n^*) = \log\left\{\frac{1}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\, f_{\tilde{\eta}_n^*}\!\left(\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\right)\right\}.$$

Throughout this paper, it is assumed that all random variables are defined on a probability space $(\Lambda, \mathcal{F}, P)$. The spectral radius of a square matrix $A$ is denoted by $\rho(A)$, and the Kronecker product by $\otimes$. Let $A_{\theta_2}(z) = \sum_{i=1}^{q} \alpha_i z^i$ and $B_{\theta_2}(z) = 1 - \sum_{i=1}^{p} \beta_i z^i$. Conventionally, if $q = 0$, $A_{\theta_2}(z) = 0$, and if $p = 0$, $B_{\theta_2}(z) = 1$. In addition, let $A_{\theta_1}(z) = 1 - \sum_{i=1}^{P} a_i z^i$ and $B_{\theta_1}(z) = 1 - \sum_{i=1}^{Q} b_i z^i$. In order to obtain the asymptotic properties, we consider the following regularity conditions:

(A1) $(\phi_0, \eta_0) \in \Phi \times \Omega$ and $\Phi \times \Omega$ is compact.

(A2) $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1$ for each $\phi \in \Phi$.

(A3) $\epsilon_t^2$ has a non-degenerate distribution and $E\epsilon_t^2 = 1$.

(A4) For all $\phi \in \Phi$, $A_{\theta_1}(z) B_{\theta_1}(z) = 0$ implies $|z| > 1$.

(A5) $A_{\theta_{01}}(z)$ and $B_{\theta_{01}}(z)$ have no common roots, and $a_{0P} + b_{0Q} \ne 0$. Furthermore, if $p > 0$, $A_{\theta_{02}}(z)$ and $B_{\theta_{02}}(z)$ have no common roots, $A_{\theta_{02}}(1) \ne 0$ and $\alpha_{0q} + \beta_{0p} \ne 0$.


(A6) $\eta_0 \in \Omega$ is essentially unique, and for any $\eta \in \Omega$, $u_1 > 0$ and $u_2$,
$$E\log\{u_1 f_\eta(\epsilon_1 u_1 + u_2)\} \le E\log f_{\eta_0}(\epsilon_1),$$
where the equality holds only if $\eta = \eta_0$, $u_1 = 1$ and $u_2 = 0$.

(A7) $\rho\{E(A_{0t} \otimes A_{0t})\} < 1$, where $A_{0t}$ is the $(p+q) \times (p+q)$ matrix of the form

$$A_{0t} = \begin{pmatrix} \boldsymbol{\alpha}_t & \alpha_{0q}\epsilon_t^2 & \boldsymbol{\beta}_t & \beta_{0p}\epsilon_t^2 \\ I_{q-1} & 0 & 0 & 0 \\ \boldsymbol{\alpha} & \alpha_{0q} & \boldsymbol{\beta} & \beta_{0p} \\ 0 & 0 & I_{p-1} & 0 \end{pmatrix},$$

where $I_l$ denotes the identity matrix of size $l$, $\boldsymbol{\alpha}_t = (\alpha_{01}\epsilon_t^2, \ldots, \alpha_{0,q-1}\epsilon_t^2)$, $\boldsymbol{\beta}_t = (\beta_{01}\epsilon_t^2, \ldots, \beta_{0,p-1}\epsilon_t^2)$, $\boldsymbol{\alpha} = (\alpha_{01}, \ldots, \alpha_{0,q-1})$ and $\boldsymbol{\beta} = (\beta_{01}, \ldots, \beta_{0,p-1})$.

Given below are the asymptotic properties of $\tilde{\eta}_n^*$ and $\hat{\phi}_n^e$, defined in (Step 2) and (Step 3), respectively. The proofs are given in Section 3.

Theorem 1. Suppose that conditions (A1)–(A7) hold. Then $\tilde{\eta}_n^* \to \eta_0$ a.s. as $n \to \infty$.

Theorem 2. Suppose that conditions (A1)–(A7) hold. Then $\hat{\phi}_n^e \to \phi_0$ a.s. as $n \to \infty$.

Remark 2.1. In general, the consistency of the Gaussian–QMLE does not require the existence of any moment of the observed process. On the other hand, the consistency of the QELE requires moment conditions. Note that (A2) is the necessary and sufficient condition for the second-order stationarity of the process $\{e_t\}$, which, in turn, guarantees the second-order stationarity of the ARMA–GARCH process $\{X_t\}$. Furthermore, (A7) is the necessary and sufficient condition for the fourth-order stationarity of the processes $\{e_t\}$ and $\{X_t\}$. See Ling and McAleer (2002) for details.

Remark 2.2. We refer to Redner (1981), Redner and Walker (1984) and Leroux (1992) for more details on the first part of (A6).

Remark 2.3. In this paper, $\mathcal{F} = \{f_\eta\}$ is defined as a family of finite normal mixture densities. We can show that

$$-\infty < E\log f_{\eta_0}(\epsilon_1) < \infty \tag{8}$$

by using the inequality $\log x \le x - 1$ for all $x > 0$, Jensen's inequality and $E\epsilon_1^2 < \infty$. Now, suppose that for a set $\Omega$ the following equality holds:

$$\arg\min_{\eta \in \Omega} d(g, f_\eta) = \arg\min_{\eta \in \Omega_0} d(g, f_\eta). \tag{9}$$

Then, the second part of (A6) holds from (5), (8) and (9). In the subsequent section, we give an example of (9) under $E\epsilon_t^2 = 1$. Therefore, we may conclude that the two assumptions, $E\epsilon_t^2 = 1$ and the second part of (A6), are compatible, regardless of whether $g \in \mathcal{F}$ or $g \notin \mathcal{F}$.

2.2. Example

Let $\mathcal{F}^* = \{f_\eta : \eta \in \Omega\}$ be the set of $K$-component normal scale-mixture densities of the form

$$f_\eta(x) = \sum_{i=1}^{K} \pi_i f(x; \sigma_i),$$

where $f(x; \sigma_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left(-\frac{x^2}{2\sigma_i^2}\right)$. Here, $\eta = (\pi_1, \ldots, \pi_K, \sigma_1, \ldots, \sigma_K)$, and

$$\Omega^* \subset \Omega_0^* = \left\{\eta \in [0, 1]^K \times (0, \infty)^K : \sum_{i=1}^{K} \pi_i = 1 \ \text{and} \ \sum_{i=1}^{K} \pi_i \sigma_i^2 = 1\right\}.$$

The following two propositions indicate that Eq. (9) holds for $K = 1$ or $2$ when the error distribution is the double exponential with $E\epsilon_t^2 = 1$. In what follows, the symbol $C\,(> 0)$ denotes a generic constant that may take different values from line to line.

Proposition 1. Consider $\mathcal{F}^*$ with $K = 1$. Suppose that the true distribution $g$ of $\epsilon_t$ is the double exponential with $E\epsilon_t^2 = 1$. Then Eq. (9) holds.


Proof. Note that $\eta$ reduces to $\eta = \sigma^2$ for $K = 1$. Now, we get

$$d(g, f_\eta) = \int_{-\infty}^{\infty} g(x)\log\frac{g(x)}{f_\eta(x)}\,dx = -\int_{-\infty}^{\infty} g(x)\log f_\eta(x)\,dx + C = -\int_{-\infty}^{\infty} g(x)\left(-\log\sigma - \frac{x^2}{2\sigma^2}\right)dx + C = \log\sigma + \frac{1}{2\sigma^2} + C.$$

Therefore, $\eta_0 = \arg\min_{\eta} d(g, f_\eta) = 1$. Hence, for any compact set $\Omega^*$ with $1 \in \Omega^*$, Eq. (9) holds. $\square$
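Proposition 1 is also easy to check numerically: taking $g$ to be the double exponential density with $E\epsilon^2 = 1$ (scale $1/\sqrt{2}$), the map $\sigma^2 \mapsto d(g, f_{\sigma^2})$ attains its minimum at $\sigma^2 = 1$. The sketch below does this by direct numerical integration; the grid, the integration range and the function name are arbitrary illustrative choices.

```python
import numpy as np

# Double exponential (Laplace) density with unit variance: g(x) = exp(-sqrt(2)|x|)/sqrt(2)
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
log_g = -np.sqrt(2.0) * np.abs(x) - 0.5 * np.log(2.0)

def kl_to_normal(s2):
    """Numerical Kullback-Leibler divergence d(g, f) with f the N(0, s2) density."""
    log_f = -x ** 2 / (2.0 * s2) - 0.5 * np.log(2.0 * np.pi * s2)
    return np.sum(np.exp(log_g) * (log_g - log_f)) * dx

grid = np.arange(0.25, 4.0001, 0.05)   # candidate values of sigma^2
best = grid[np.argmin([kl_to_normal(s2) for s2 in grid])]
# best lands on the grid point closest to 1.0, in line with Proposition 1
```

The divergence itself stays strictly positive at the minimizer, reflecting the fact that the double exponential does not belong to $\mathcal{F}^*$ with $K = 1$.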

Proposition 2. Consider $\mathcal{F}^*$ with $K = 2$. Suppose that the true distribution $g$ of $\epsilon_t$ is the double exponential with $E\epsilon_t^2 = 1$. Then Eq. (9) holds.

Proof. Note that $\eta$ is expressed as $\eta = (\pi, \sigma_1^2, \sigma_2^2)'$ for $K = 2$. Now, we get

$$d(g, f_\eta) = -\int_{-\infty}^{\infty} g(x)\log\left\{\pi\frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left(-\frac{x^2}{2\sigma_1^2}\right) + (1 - \pi)\frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\left(-\frac{x^2}{2\sigma_2^2}\right)\right\}dx + C,$$

and thus, for any $\pi$ and $\sigma_1^2$,

$$\lim_{\sigma_2^2 \to \infty} d(g, f_\eta) = -\int_{-\infty}^{\infty} g(x)\log\left\{\pi\frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left(-\frac{x^2}{2\sigma_1^2}\right)\right\}dx + C = -\log\pi + \log\sigma_1 + \frac{1}{2\sigma_1^2} + C,$$

which takes its minimum value at $\pi = 1$ and $\sigma_1^2 = 1$. Since, for any $\pi$ and $\sigma_1^2$, $d(g, f_\eta)$ is an increasing and continuous function of $\sigma_2^2$, there exists a sufficiently large $C > 0$ such that for $\Omega^* = [0, 1] \times (0, \infty) \times (0, C]$,

$$\min_{\eta \in \Omega^*} d(g, f_\eta) = \min_{\eta \in \Omega_0^*} d(g, f_\eta).$$

We can obtain similar results as $\sigma_2^2 \to 0$, $\sigma_1^2 \to 0$ or $\sigma_1^2 \to \infty$. Hence, there exist some positive real numbers $C_1$ and $C_2$ ($C_1 \le C_2$) with $1 \in [C_1, C_2]$ such that for $\Omega = [0, 1] \times [C_1, C_2]^2$,

$$\arg\min_{\eta \in \Omega} d(g, f_\eta) = \arg\min_{\eta \in \Omega_0^*} d(g, f_\eta),$$

and thus Eq. (9) follows. $\square$
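Before turning to the proofs, condition (A7) can also be explored numerically. For a GARCH(1,1) with standard normal $\epsilon_t$, $A_{0t}$ is the $2 \times 2$ matrix $\begin{pmatrix} \alpha_{01}\epsilon_t^2 & \beta_{01}\epsilon_t^2 \\ \alpha_{01} & \beta_{01} \end{pmatrix}$, which factors as an outer product, so $\rho\{E(A_{0t} \otimes A_{0t})\}$ reduces to $E(\alpha_{01}\epsilon_t^2 + \beta_{01})^2 = 3\alpha_{01}^2 + 2\alpha_{01}\beta_{01} + \beta_{01}^2$, the familiar fourth-moment condition. The Monte Carlo sketch below (sample size, seed, parameter values and function name are our illustrative choices) estimates this spectral radius.

```python
import numpy as np

def spectral_radius_A7(alpha, beta, n=200_000, seed=0):
    """Monte Carlo estimate of rho{E(A_0t kron A_0t)} for a GARCH(1,1),
    the p = q = 1 case of (A7). Here A_0t = outer(v_t, w) with
    v_t = (eps_t^2, 1) and w = (alpha, beta), so that
    A_0t kron A_0t = outer(kron(v_t, v_t), kron(w, w))."""
    e2 = np.random.default_rng(seed).standard_normal(n) ** 2
    Evv = np.array([np.mean(e2 ** 2), np.mean(e2), np.mean(e2), 1.0])  # E[kron(v, v)]
    w = np.array([alpha, beta])
    m = np.outer(Evv, np.kron(w, w))   # Monte Carlo estimate of E(A_0t kron A_0t)
    return np.max(np.abs(np.linalg.eigvals(m)))
```

For $\alpha_{01} = 0.1$ and $\beta_{01} = 0.8$ this gives roughly $0.83 < 1$, so the fourth-order stationarity discussed in Remark 2.1 holds for those values, while larger ARCH coefficients can push the radius above one.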

3. Proofs

In this section, we provide the proofs of the theorems presented in Section 2.

Lemma 1. Under condition (A1),

$$\sup_{\eta\in\Omega}\left|\frac{\partial}{\partial y}\log f_\eta(y)\right| \le C(|y| + 1), \tag{10}$$
$$\sup_{\eta\in\Omega}\left|\frac{\partial}{\partial \eta}\log f_\eta(y)\right| \le C(|y| + 1). \tag{11}$$

Proof. By a simple calculation, we get

$$\sup_{\eta\in\Omega}\left|\frac{\partial}{\partial y}\log f_\eta(y)\right| = \sup_{\eta\in\Omega}\left|\frac{1}{f_\eta(y)}\frac{\partial}{\partial y}f_\eta(y)\right| \le \sup_{\eta\in\Omega}\left|\frac{\sum_{k=1}^{K}\pi_k f(y; \mu_k, \sigma_k)\left(-\frac{y - \mu_k}{\sigma_k^2}\right)}{\sum_{k=1}^{K}\pi_k f(y; \mu_k, \sigma_k)}\right|. \tag{12}$$

Then, due to (12) and (A1), (10) is established. Similarly, we can obtain (11). $\square$

Below, a series of lemmas is introduced. The first two lemmas are from Francq and Zakoïan (2004), and the third lemma is Theorem 3 of Leroux (1992).


Lemma 2. Let $e_t$, $\sigma_t^2$, $\tilde{e}_t$ and $\tilde{\sigma}_t^2$ be the symbols in (1), (2) and (3). Suppose that conditions (A1), (A2) and (A4) hold. Then, there exists $\rho \in (0, 1)$ such that, for all $t \ge 1$ and $0 \le i \le q$,

$$\sup_{\phi\in\Phi}|e_{t-i}(\theta_1) - \tilde{e}_{t-i}(\theta_1)| \le C\rho^t \quad \text{a.s.,} \tag{13}$$
$$\sup_{\phi\in\Phi}|\sigma_t^2(\phi) - \tilde{\sigma}_t^2(\phi)| \le C\rho^t \sum_{k=1-q}^{t-1}(|e_k| + 1) \quad \text{a.s.} \tag{14}$$

Lemma 3. Let $\hat{\phi}_n$ be the Gaussian–QMLE in (6). Suppose that conditions (A1)–(A5) hold. Then $\hat{\phi}_n \to \phi_0$ a.s. as $n \to \infty$.

Lemma 4. Let $\hat{\eta}_n := \arg\max_{\eta\in\Omega} l_n^*(\eta)$, where $l_n^*(\eta) := \frac{1}{n}\sum_{t=1}^{n}\log f_\eta(\epsilon_t)$. Suppose that condition (A6) holds. Then $\hat{\eta}_n \to \eta_0$ a.s. as $n \to \infty$.

Lemma 5. Let $\tilde{\epsilon}_t$ be the symbol in (7). Suppose that conditions (A1)–(A5) hold. Then, as $n \to \infty$,

$$\frac{1}{n}\sum_{t=1}^{n}|\tilde{\epsilon}_t - \epsilon_t| \to 0 \quad \text{a.s.,} \tag{15}$$
$$\frac{1}{n}\sum_{t=1}^{n}|\epsilon_t||\tilde{\epsilon}_t - \epsilon_t| \to 0 \quad \text{a.s.} \tag{16}$$

Proof. First, we prove (15). Using (1), (2) and (7), we obtain

$$\frac{1}{n}\sum_{t=1}^{n}|\tilde{\epsilon}_t - \epsilon_t| = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\tilde{e}_t(\hat{\theta}_1)}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}} - \epsilon_t\right| \le \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\tilde{e}_t(\hat{\theta}_1)}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}} - \frac{\tilde{e}_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}}\right| + \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\tilde{e}_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}}\right| + \frac{1}{n}\sum_{t=1}^{n}\left|\frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\phi_0)}}\right| + \frac{1}{n}\sum_{t=1}^{n}\left|\frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\phi_0)}} - \frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi_0)}}\right| =: I_1 + I_2 + I_3 + I_4.$$

First, we prove $I_1 \to 0$ a.s. Using (1), (2) and Lemma 3, we have

$$|\tilde{e}_1(\hat{\theta}_1) - \tilde{e}_1(\theta_{01})| = \left|\left(X_1 - \hat{c} - \sum_{i=1}^{P}\hat{a}_i(X_{1-i} - \hat{c}) + \sum_{j=1}^{Q}\hat{b}_j \tilde{e}_{1-j}\right) - \left(X_1 - c_0 - \sum_{i=1}^{P}a_{0i}(X_{1-i} - c_0) + \sum_{j=1}^{Q}b_{0j}\tilde{e}_{1-j}\right)\right|$$
$$\le |\hat{c} - c_0| + \sum_{i=1}^{P}|\hat{a}_i - a_{0i}||X_{1-i}| + \sum_{i=1}^{P}|\hat{a}_i - a_{0i}||c_0| + \sum_{i=1}^{P}|\hat{a}_i||\hat{c} - c_0| + \sum_{j=1}^{Q}|\hat{b}_j - b_{0j}||\tilde{e}_{1-j}| \to 0 \quad \text{a.s.} \tag{17}$$

Then, by recursively applying the same procedure, we have

$$\frac{1}{n}\sum_{t=1}^{n}|\tilde{e}_t(\hat{\theta}_1) - \tilde{e}_t(\theta_{01})| \le |\hat{c} - c_0| + \sum_{i=1}^{P}|\hat{a}_i - a_{0i}|\frac{1}{n}\sum_{t=1}^{n}|X_{t-i}| + \sum_{i=1}^{P}|\hat{a}_i - a_{0i}||c_0| + \sum_{i=1}^{P}|\hat{a}_i||\hat{c} - c_0| + \sum_{j=1}^{Q}|\hat{b}_j - b_{0j}|\frac{1}{n}\sum_{t=1}^{n}|\tilde{e}_{t-j}(\theta_{01})| + \sum_{j=1}^{Q}|\hat{b}_j|\frac{1}{n}\sum_{t=1+j}^{n}|\tilde{e}_{t-j}(\hat{\theta}_1) - \tilde{e}_{t-j}(\theta_{01})|$$
$$= R_{1n} + R_{2n} \times \frac{1}{n}\sum_{j=1}^{Q}\sum_{t=1+j}^{n}|\tilde{e}_{t-j}(\hat{\theta}_1) - \tilde{e}_{t-j}(\theta_{01})| = \cdots = R'_{1n} + R'_{2n} \times \frac{1}{n}|\tilde{e}_1(\hat{\theta}_1) - \tilde{e}_1(\theta_{01})| \to 0 \quad \text{a.s.,}$$

which follows from (17), Lemma 3, the second-order stationarity, ergodicity, and the facts that $R_{1n} \to 0$ a.s., $R'_{1n} \to 0$ a.s., $R_{2n} \to C$ a.s. and $R'_{2n} \to C$ a.s. Therefore, we have

$$I_1 \le C\,\frac{1}{n}\sum_{t=1}^{n}|\tilde{e}_t(\hat{\theta}_1) - \tilde{e}_t(\theta_{01})| \to 0 \quad \text{a.s.}$$

Similarly, we can show $I_3 \to 0$ a.s. by using (A2), Lemma 3 and the fact that

$$\left|\frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\tilde{\sigma}_t^2(\phi_0)}}\right| = |\epsilon_t|\left|\frac{\tilde{\sigma}_t^2(\phi_0) - \tilde{\sigma}_t^2(\hat{\phi}_n)}{\sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}\left(\sqrt{\tilde{\sigma}_t^2(\phi_0)} + \sqrt{\tilde{\sigma}_t^2(\hat{\phi}_n)}\right)}\right|.$$

Next, we deal with $I_2 \to 0$ a.s. Due to (13) and the fact that $\tilde{\sigma}_t^2(\hat{\phi}_n) \ge \hat{\omega} > 0$, we have

$$I_2 \le C\,\frac{1}{n}\sum_{t=1}^{n}|\tilde{e}_t(\theta_{01}) - e_t(\theta_{01})| \le C\,\frac{1}{n}\sum_{t=1}^{n}\rho^t \to 0 \quad \text{a.s.}$$

By using (A2) and (14), we can similarly show $I_4 \to 0$ a.s. The convergence (16) can be proved similarly to (15). Hence, the lemma is established. $\square$

Lemma 6. Let $\tilde{\epsilon}_t$ be the symbol in (7). Suppose that conditions (A1)–(A5) and (A7) hold. Then, as $n \to \infty$,

$$\frac{1}{n}\sum_{t=1}^{n}|\tilde{\epsilon}_t||\tilde{\epsilon}_t - \epsilon_t| \to 0 \quad \text{a.s.}$$

Proof. First, we show that

$$\frac{1}{n}\sum_{t=1}^{n}\left|\frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi_0)}}\right|^2 \to 0 \quad \text{a.s.}$$

Now, we have

$$\left|\frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi_0)}}\right|^2 = e_t^2(\theta_{01})\left|\frac{\sigma_t^2(\phi_0) - \sigma_t^2(\hat{\phi}_n)}{\sqrt{\sigma_t^2(\phi_0)}\sqrt{\sigma_t^2(\hat{\phi}_n)}\left(\sqrt{\sigma_t^2(\phi_0)} + \sqrt{\sigma_t^2(\hat{\phi}_n)}\right)}\right|^2 = \epsilon_t^2\left|\frac{\sigma_t^2(\phi_0) - \sigma_t^2(\hat{\phi}_n)}{\sqrt{\sigma_t^2(\hat{\phi}_n)}\left(\sqrt{\sigma_t^2(\phi_0)} + \sqrt{\sigma_t^2(\hat{\phi}_n)}\right)}\right|^2 \le C\epsilon_t^2\,|\sigma_t^2(\phi_0) - \sigma_t^2(\hat{\phi}_n)|^2,$$

and thus, using (A7) and Lemma 3, we obtain

$$\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\left|\frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\hat{\phi}_n)}} - \frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi_0)}}\right|^2 \le C\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\epsilon_t^2\,|\sigma_t^2(\phi_0) - \sigma_t^2(\hat{\phi}_n)|^2 = 0 \quad \text{a.s.}$$

The rest of the proof can be done similarly to the proof of Lemma 5. $\square$

Proof of Theorem 1. Due to Lemma 4, it suffices to show that $\tilde{\eta}_n^* - \hat{\eta}_n \to 0$ a.s. as $n \to \infty$, which can be verified by showing

$$\lim_{n\to\infty}\sup_{\eta\in\Omega}|\tilde{l}_n^*(\eta) - l_n^*(\eta)| = 0 \quad \text{a.s.}$$

According to (10) and the mean value theorem, we have

$$\sup_{\eta\in\Omega}|\log f_\eta(\tilde{\epsilon}_t) - \log f_\eta(\epsilon_t)| \le C(|\xi_t^*| + 1)|\tilde{\epsilon}_t - \epsilon_t| \le C(\max\{|\tilde{\epsilon}_t|, |\epsilon_t|\} + 1)|\tilde{\epsilon}_t - \epsilon_t|, \tag{18}$$

where $\xi_t^*$ is an intermediate point between $\tilde{\epsilon}_t$ and $\epsilon_t$. Then, due to Lemmas 5 and 6 and (18), we have

$$\lim_{n\to\infty}\sup_{\eta\in\Omega}|\tilde{l}_n^*(\eta) - l_n^*(\eta)| \le \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\sup_{\eta\in\Omega}|\log f_\eta(\tilde{\epsilon}_t) - \log f_\eta(\epsilon_t)| \le \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}C(\max\{|\tilde{\epsilon}_t|, |\epsilon_t|\} + 1)|\tilde{\epsilon}_t - \epsilon_t| = 0.$$

Therefore, the theorem is established. $\square$

Now, the following lemma is necessary to prove Theorem 2. Since it can be shown by using (A2) and Lemma 2, we omit the proof.

Lemma 7. Suppose that conditions (A1), (A2) and (A4) hold. Then, as $n \to \infty$,

$$n^{-1}\sum_{t=1}^{n}\sup_{\phi\in\Phi}\left|\log\frac{\sigma_t^2(\phi)}{\tilde{\sigma}_t^2(\phi)}\right| \to 0 \quad \text{a.s.,}$$
$$n^{-1}\sum_{t=1}^{n}\sup_{\phi\in\Phi}|\sigma_t^2(\phi) - \tilde{\sigma}_t^2(\phi)|\,(|e_t(\theta_1)| + 1)^2 \to 0 \quad \text{a.s.,}$$
$$n^{-1}\sum_{t=1}^{n}\sup_{\phi\in\Phi}|e_t(\theta_1) - \tilde{e}_t(\theta_1)|\,(|e_t(\theta_1)| + 1) \to 0 \quad \text{a.s.}$$

Let $\tilde{l}_n^e(\phi)$ be the symbol in (Step 3) and define

$$l_n^e(\phi) := \frac{1}{n}\sum_{t=1}^{n} l_t(\phi, \tilde{\eta}_n^*), \quad \text{where } l_t(\phi, \tilde{\eta}_n^*) = \log\left\{\frac{1}{\sqrt{\sigma_t^2(\phi)}}\, f_{\tilde{\eta}_n^*}\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right\}.$$

Now, we have the following result.

Lemma 8. Suppose that conditions (A1), (A2) and (A4) hold. Then

$$\lim_{n\to\infty}\sup_{\phi\in\Phi}|\tilde{l}_n^e(\phi) - l_n^e(\phi)| = 0 \quad \text{a.s.}$$

Proof. Note that we have

$$|\tilde{l}_n^e(\phi) - l_n^e(\phi)| \le \frac{1}{n}\sum_{t=1}^{n}|\tilde{l}_t(\phi, \tilde{\eta}_n^*) - l_t(\phi, \tilde{\eta}_n^*)| \le \frac{1}{n}\sum_{t=1}^{n}\frac{1}{2}\left|\log\frac{\sigma_t^2(\phi)}{\tilde{\sigma}_t^2(\phi)}\right| + \frac{1}{n}\sum_{t=1}^{n}\left|\log f_{\tilde{\eta}_n^*}\!\left(\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\right) - \log f_{\tilde{\eta}_n^*}\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right| \tag{19}$$

and

$$\left|\log f_{\tilde{\eta}_n^*}\!\left(\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\right) - \log f_{\tilde{\eta}_n^*}\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right| \le C\left|\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}} - \frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right|\left(\max\left\{\frac{|\tilde{e}_t(\theta_1)|}{\sqrt{\tilde{\sigma}_t^2(\phi)}}, \frac{|e_t(\theta_1)|}{\sqrt{\sigma_t^2(\phi)}}\right\} + 1\right)$$
$$\le C\left|\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}} - \frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right|(|e_t(\theta_1)| + 1) \le C\left(\left|\frac{\tilde{e}_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}} - \frac{e_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}}\right| + \left|\frac{e_t(\theta_1)}{\sqrt{\tilde{\sigma}_t^2(\phi)}} - \frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right|\right)(|e_t(\theta_1)| + 1)$$
$$\le C\left\{|e_t(\theta_1) - \tilde{e}_t(\theta_1)|\,(|e_t(\theta_1)| + 1) + |\sigma_t^2(\phi) - \tilde{\sigma}_t^2(\phi)|\,(|e_t(\theta_1)| + 1)^2\right\}, \tag{20}$$

where the first inequality follows from (10). Then, the lemma is established due to (19), (20) and Lemma 7. $\square$


Lemma 9. Suppose that (A2) and (A4)–(A6) hold. Writing $h_t(\phi) := \sigma_t^2(\phi)$, under (4), if

$$\frac{1}{\sqrt{h_t(\phi)}}\, f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{h_t(\phi)}}\right) = \frac{1}{\sqrt{h_t(\phi_0)}}\, f_{\eta_0}\!\left(\frac{e_t(\theta_{01})}{\sqrt{h_t(\phi_0)}}\right) \quad \text{a.s.}$$

for some $t$, then $\phi = \phi_0$ and $\eta = \eta_0$.

Proof. By using (4), (5) and (A6), we have $\eta = \eta_0$, $e_t(\theta_1) = e_t(\theta_{01})$ and $h_t(\phi) = h_t(\phi_0)$ a.s., which, together with (A2), (A4)–(A5) and the proof of Theorem 3.1 of Francq and Zakoïan (2004), implies $\phi = \phi_0$. $\square$

Lemma 10. Suppose that (A1)–(A6) hold. Let $l_t(\phi, \eta) = \log\left\{\frac{1}{\sqrt{\sigma_t^2(\phi)}}\, f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right\}$. Then,

$$E\left[\sup_{(\phi,\eta)\in\Phi\times\Omega} l_t(\phi, \eta)\right] < \infty \tag{21}$$

and for any $(\phi, \eta) \ne (\phi_0, \eta_0)$,

$$E[l_t(\phi, \eta)] < E[l_t(\phi_0, \eta_0)]. \tag{22}$$

Proof. By using (A1) and a simple calculation, we can show

$$\sup_{(\phi,\eta)\in\Phi\times\Omega} l_t(\phi, \eta) = \sup_{(\phi,\eta)\in\Phi\times\Omega}\log\left\{\frac{1}{\sqrt{\sigma_t^2(\phi)}}\, f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right\} \le -\frac{1}{2}\inf_{\phi\in\Phi}\log\sigma_t^2(\phi) + \sup_{(\phi,\eta)\in\Phi\times\Omega}\log f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right) \le -\frac{1}{2}\inf_{\phi\in\Phi}\log\omega + \sup_{(\phi,\eta)\in\Phi\times\Omega}\log f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right) < C,$$

which establishes (21). Next, by using (A1) and Lemma 9, we obtain

$$E[l_t(\phi, \eta)] - E[l_t(\phi_0, \eta_0)] = E\left[\log\left\{\sqrt{\frac{\sigma_t^2(\phi_0)}{\sigma_t^2(\phi)}}\, f_\eta\!\left(\frac{e_t(\theta_1)}{\sqrt{\sigma_t^2(\phi)}}\right)\right\}\right] - E[\log f_{\eta_0}(\epsilon_t)]$$
$$= E\left[\log\left\{\sqrt{\frac{\sigma_t^2(\phi_0)}{\sigma_t^2(\phi)}}\, f_\eta\!\left(\frac{e_t(\theta_1) - e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi)}} + \frac{e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi)}}\right)\right\}\right] - E[\log f_{\eta_0}(\epsilon_t)]$$
$$= E\left[\log\left\{\sqrt{\frac{\sigma_t^2(\phi_0)}{\sigma_t^2(\phi)}}\, f_\eta\!\left(\epsilon_t\sqrt{\frac{\sigma_t^2(\phi_0)}{\sigma_t^2(\phi)}} + \frac{e_t(\theta_1) - e_t(\theta_{01})}{\sqrt{\sigma_t^2(\phi)}}\right)\right\}\right] - E[\log f_{\eta_0}(\epsilon_t)] \le 0,$$

where the equality in the last inequality holds if and only if $\phi = \phi_0$ and $\eta = \eta_0$. Thus, (22) is established. $\square$

Lemma 11. Suppose that (A1)–(A7) hold. Then any $\phi \ne \phi_0$ has a neighborhood $N(\phi)$ such that

$$\lim_{n\to\infty}\sup_{\phi'\in N(\phi)}\tilde{l}_n^e(\phi') < E\,l_1(\phi_0, \eta_0) \quad \text{a.s.}$$

Proof. For any $\phi \in \Phi$ and any positive integer $k$, let $N_k(\phi)$ be the open ball with center $\phi$ and radius $1/k$, and let $V_k(\eta_0)$ be the open ball with center $\eta_0$ and radius $1/k$. Then, due to Theorem 1 and Lemma 8, and by using the argument in the proof of Theorem 2.1 of Francq and Zakoïan (2004, page 617), we have

$$\lim_{n\to\infty}\sup_{\phi'\in N_k(\phi)\cap\Phi}\tilde{l}_n^e(\phi') = \lim_{n\to\infty}\sup_{\phi'\in N_k(\phi)\cap\Phi} l_n^e(\phi') \le \sup_{(\phi',\eta)\in(N_k(\phi)\cap\Phi)\times V_k(\eta_0)} E\,l_1(\phi', \eta) \quad \text{a.s.}$$

By the Beppo Levi theorem, as $k$ tends to $\infty$, $\sup_{(\phi',\eta)\in(N_k(\phi)\cap\Phi)\times V_k(\eta_0)} E\,l_1(\phi', \eta)$ decreases to $E\,l_1(\phi, \eta_0)$, which, together with Lemma 10, establishes the lemma. $\square$

Proof of Theorem 2. As in the proof of Theorem 2.1 of Lee and Lee (2009), the consistency of the NM–QELE can be obtained by using Theorem 1 and Lemmas 7–11. The details are omitted for brevity. $\square$


Acknowledgements

We thank the referee for his/her valuable comments, which improved the quality of the paper. This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research Promotion Fund) (KRF-2008-331-C00059) and partially supported by the Hankuk University of Foreign Studies Research Fund of 2011.

References

Ausín, M.C., Galeano, P., 2007. Bayesian estimation of the Gaussian mixture GARCH model. Comput. Statist. Data Anal. 51, 2636–2652.
Berkes, I., Horváth, L., 2004. The efficiency of the estimators of the parameters in GARCH processes. Ann. Statist. 32, 633–655.
Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. J. Econometrics 31, 307–327.
Engle, R.F., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of UK inflation. Econometrica 50, 987–1008.
Francq, C., Zakoïan, J., 2004. Maximum likelihood estimation of pure GARCH and ARMA–GARCH processes. Bernoulli 10, 605–637.
Haas, M., Mittnik, S., Paolella, M.S., 2004. Mixed normal conditional heteroskedasticity. J. Financ. Econom. 2, 211–250.
Hall, P., Yao, Q., 2003. Inference in ARCH and GARCH models with heavy-tailed errors. Econometrica 71, 285–317.
Jensen, S.T., Rahbek, A., 2004. Asymptotic inference for nonstationary GARCH. Econometric Theory 20, 1203–1226.
Lee, T., Lee, S., 2009. Normal mixture quasi-maximum likelihood estimator for GARCH models. Scand. J. Statist. 36, 157–170.
Lee, S., Lee, T., 2011. Inference for Box–Cox transformed threshold GARCH models with nuisance parameters (submitted for publication).
Leroux, B.G., 1992. Consistent estimation of a mixing distribution. Ann. Statist. 20, 1350–1360.
Ling, S., McAleer, M., 2002. Stationarity and the existence of moments of a family of GARCH processes. J. Econometrics 106, 109–117.
McLachlan, G., Peel, D., 2000. Finite Mixture Models. Wiley, New York.
Redner, R., 1981. Note on the consistency of the maximum likelihood estimate for nonidentifiable distributions. Ann. Statist. 9, 225–228.
Redner, R., Walker, H.F., 1984. Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26, 195–239.