On functional limit theorems for multivariate linear processes with applications to sequential estimation


Journal of Statistical Planning and Inference 83 (2000) 11– 23


Issa Fakhre-Zakeri$^a$, Sangyeol Lee$^b$

$^a$ Department of Statistics, University of North Carolina, Chapel Hill, NC 27599-3260, USA
$^b$ Department of Statistics, Seoul National University, Seoul, 151-742, South Korea

Received 30 March 1998; accepted 6 November 1998

Abstract

For a multivariate linear process with martingale difference innovations, we prove a random functional limit theorem and propose an almost surely consistent estimator for the limiting covariance matrix. We also consider sequential estimation of the mean vector of the process. © 2000 Elsevier Science B.V. All rights reserved.

MSC: primary 60F17; 60G10; 60G42; secondary 60G40; 62L12

Keywords: Functional central limit theorem; Linear process; Martingale; Mixing in the sense of Renyi; Random indices; Sequential estimation

1. Introduction

Let $X_t$, $t = 0, \pm 1, \ldots$, be an $m$-dimensional linear process of the form

$$X_t - \mu = \sum_{u=0}^{\infty} A_u Z_{t-u}, \qquad (1.1)$$

defined on a probability space $(\Omega, \mathcal{F}, P)$, where $\mu$ is the unknown $m \times 1$ mean vector and $\{Z_t\}$ is an $m$-dimensional martingale difference sequence,

$$E(Z_t \mid \mathcal{F}_{t-1}) = 0 \quad \text{a.s.}, \qquad (1.2)$$

where $\mathcal{F}_t$ is the sub-$\sigma$-algebra generated by $Z_u$, $u \le t$. Throughout we shall assume that

$$\sum_{u=0}^{\infty} \|A_u\| < \infty \quad \text{and} \quad \sum_{u=0}^{\infty} A_u \ne 0_{m \times m}, \qquad (1.3)$$

where, for any $m \times m$ ($m \ge 1$) matrix $A = (a_{ij})$, $\|A\| := \sum_{i=1}^{m} \sum_{j=1}^{m} |a_{ij}|$, and $0_{m \times m}$ denotes the $m \times m$ zero matrix. Let $W^m$ denote Wiener measure on $C^m[0,1]$, the space of all continuous functions $f$ from $[0,1]$ into $R^m$, equipped with the norm $\|f\|_{\infty} = \max_{1 \le i \le m} \sup_{0 \le t \le 1} |f_i(t)|$.


Let $\Sigma_t$ denote the conditional covariance matrix of $Z_t$, $E(Z_t Z_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., such that

$$\frac{1}{n} \sum_{t=1}^{n} \Sigma_t \stackrel{p}{\to} \Sigma, \qquad (1.4)$$

where the prime denotes transpose and $\Sigma$ is a positive-definite (p.d.) nonrandom matrix. Further, let

$$T = \Bigl(\sum_{j=0}^{\infty} A_j\Bigr) \Sigma \Bigl(\sum_{j=0}^{\infty} A_j\Bigr)', \qquad (1.5)$$

let $S_n = \sum_{t=1}^{n} X_t$ for $n \ge 0$ (with $S_0 = 0$), and define for $n \ge 1$ the stochastic process $\xi_n$ by

$$\xi_n(u) = n^{-1/2} T^{-1/2} [(S_r - r\mu) + (nu - r)(X_{r+1} - \mu)], \quad r \le nu < r + 1, \qquad (1.6)$$

where $r = 0, 1, \ldots, n - 1$.
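To fix ideas, the following is a minimal simulation sketch of the model (1.1) and of the process (1.6). The bivariate design, the geometrically decaying coefficients $A_u$, the i.i.d. Gaussian innovations (a special case of martingale differences, here with $\Sigma = I$) and all variable names are our illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lags = 2, 500, 50                  # dimension, sample size, MA truncation
mu = np.array([1.0, -1.0])               # mean vector (unknown in practice)
A = [0.5 ** u * np.array([[1.0, 0.3], [0.0, 1.0]]) for u in range(lags)]

Z = rng.standard_normal((n + lags, m))   # innovations Z_t, here with Sigma = I
X = mu + np.array([sum(A[u] @ Z[t - u] for u in range(lags))
                   for t in range(lags, n + lags)])

# Limiting covariance T of (1.5): (sum_j A_j) Sigma (sum_j A_j)'.
A_sum = sum(A)
T = A_sum @ A_sum.T

# xi_n of (1.6) on the grid u = r/n: n^{-1/2} T^{-1/2} (S_r - r mu).
L = np.linalg.cholesky(T)                        # a matrix square root of T
S = np.vstack([np.zeros(m), np.cumsum(X - mu, axis=0)])
xi = np.linalg.solve(L, S.T).T / np.sqrt(n)      # rows approximate a Wiener path
```

Theorem 1 below asserts that the piecewise linear interpolation of these values converges weakly, indeed mixing in the sense of Renyi, to $W^m$.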

For $m = 1$, under certain conditions, Phillips and Solo (1992) showed that the process $\xi_n$ defined in (1.6) converges weakly (or in distribution) to $W^1$, Wiener measure on $C[0,1]$. They also gave a detailed discussion of various limit theorems for univariate linear processes using the so-called Beveridge-Nelson decomposition. We extend their invariance principle to the case of $m$-dimensional $X_t$, under slightly weaker conditions. More general results, however, are obtainable. As Theorem 1 below shows, the weak convergence of $\xi_n$ to $W^m$ is mixing in the sense of Renyi (see Definition 1). Furthermore, the convergence holds if the process $\xi_n$ is randomly indexed. There are, of course, many ways of stating an invariance principle, because there are alternative ways of norming or scaling the partial sum process $\xi_n$ (see, e.g., Hall and Heyde, 1980, p. 98). We will have more to say on this point in Remark 1 following the proof of Corollary 1.

As it stands, the invariance principle for the process $\{\xi_n\}$ cannot be used directly for inference on $\mu$, since it requires knowing $T$. In practice, however, $T$ is unknown and must be replaced by an appropriate estimator. One of our goals in this paper is to construct a consistent estimator for the limiting covariance matrix $T$, under the additional assumption that the sequence $\{Z_t\}$ is weakly stationary. Using this estimator we will be able to normalize or scale the partial sum process $\{\xi_n\}$ from the observed data and obtain the same limiting distribution $W^m$ (see Corollary 1 below).

Before introducing a consistent estimate of $T$, let us consider two additional applications where similar estimation problems arise. First, as an immediate consequence of Theorem 1 we have

$$\sqrt{n}\,(\bar{X}_n - \mu) \stackrel{L}{\to} N(0, T) \quad \text{as } n \to \infty, \qquad (1.7)$$

where $\bar{X}_n = \frac{1}{n} \sum_{t=1}^{n} X_t$, $N(0, T)$ denotes an $m$-dimensional normal random vector, and the symbol $\stackrel{L}{\to}$ indicates convergence in distribution. Based on (1.7), if $T$ is known,


the following ellipsoid is an asymptotic $(1 - \alpha)$ confidence region for the unknown mean vector $\mu$:

$$\{\mu \in R^m : n(\bar{X}_n - \mu)' T^{-1} (\bar{X}_n - \mu) \le \chi^2_{1-\alpha}(m)\}, \qquad (1.8)$$

where $\chi^2_{1-\alpha}(m)$ is the upper $1 - \alpha$ point of a $\chi^2$ distribution with $m$ degrees of freedom. However, when $T$ is unknown, the above confidence region cannot be utilized.

Another application where the issue of estimating $T$ arises is in the context of minimum risk estimation of $\mu$: given $n$ consecutive observations on the process $\{X_t\}$, it is desired to estimate $\mu$ by the sample mean $\bar{X}_n$ subject to a suitable loss function, for example the loss function

$$L_n = (\bar{X}_n - \mu)' Q (\bar{X}_n - \mu) + cn, \qquad (1.9)$$

where $Q$ is a known $m \times m$ p.d. matrix and $c > 0$ is the cost per observation (see Fakhre-Zakeri and Lee, 1993). In this article we develop sequential procedures to deal with both point estimation, under the loss function defined in (1.9), and a fixed-accuracy confidence region with prescribed coverage probability for the unknown mean vector $\mu$.

Let us now turn to the problem of estimating $T$, defined in (1.5). First observe that if $\{Z_t\}$ is weakly stationary, then

$$T = \Gamma(0) + \sum_{h=1}^{\infty} [\Gamma(h) + \Gamma'(h)], \qquad (1.10)$$

where $\Gamma(h) = E(X_{t+h} - \mu)(X_t - \mu)'$ is the autocovariance matrix of the process $\{X_t\}$ at lag $h$, and

$$n E(\bar{X}_n - \mu)(\bar{X}_n - \mu)' = \sum_{|h| < n} \Bigl(1 - \frac{|h|}{n}\Bigr) \Gamma(h) \to T \quad \text{as } n \to \infty. \qquad (1.11)$$

Let $\{h_n\}$ be a sequence of positive integers such that

$$h_n \to \infty \quad \text{and} \quad \frac{h_n}{n^{\delta}} \to 0 \quad \text{as } n \to \infty \text{ for all } \delta > 0. \qquad (1.12)$$

A typical example of such a sequence is $h_n = [(\log n)^q]$ for some $q < \infty$. From (1.5) or (1.10), it is clear that $T$ involves an infinite number of unknown parameters. In view of (1.10), given $n$ consecutive observations $X_1, \ldots, X_n$, we propose to estimate $T$ by

$$\hat{T}_n = \hat{\Gamma}_n(0) + \sum_{h=1}^{h_n} [\hat{\Gamma}_n(h) + \hat{\Gamma}_n'(h)], \qquad (1.13)$$

where

$$\hat{\Gamma}_n(h) = n^{-1} \sum_{t=1}^{n-h} (X_{t+h} - \bar{X}_n)(X_t - \bar{X}_n)', \quad 0 \le h \le h_n. \qquad (1.14)$$

Our motivation behind the estimator $\hat{T}_n$, adopted from Fakhre-Zakeri and Lee (1993), is that autocovariances at large lags, relative to the number of data points, are likely to be negligible compared with those at smaller lags. Andrews (1991) provides an extensive discussion of spectral density estimation at the origin.
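Computationally, (1.13)-(1.14) amount to a few matrix products. Below is a minimal sketch, assuming the observations are stacked row-wise in an $n \times m$ array; the bandwidth $h_n = [(\log n)^2]$ is one admissible instance of (1.12) (the case $q = 2$ of the example above), and the function name `T_hat` is ours.

```python
import numpy as np

def T_hat(X: np.ndarray) -> np.ndarray:
    """Truncated long-run covariance estimator (1.13), built from (1.14)."""
    n, m = X.shape
    h_n = int(np.log(n) ** 2)        # h_n -> infinity while h_n / n^delta -> 0
    Xc = X - X.mean(axis=0)          # center at the sample mean X_bar_n
    # Gamma_hat_n(h) = n^{-1} sum_{t=1}^{n-h} (X_{t+h} - X_bar)(X_t - X_bar)'
    gamma_hat = lambda h: Xc[h:].T @ Xc[:n - h] / n
    T = gamma_hat(0)
    for h in range(1, h_n + 1):
        G = gamma_hat(h)
        T += G + G.T
    return T
```

In finite samples $\hat{T}_n$ need not be positive definite; Remark 1 below notes that, since $T$ is p.d. and $\hat{T}_n \to T$ a.s., one may assume $\hat{T}_n$ is eventually p.d. as well.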


We now consider sequential point estimation of $\mu$. Motivated by the optimal fixed sample size (see Fakhre-Zakeri and Lee, 1993, (1.18)), it is natural to use the stopping rule

$$N_c = \inf\{n \ge m_0 : n \ge c^{-1/2} [(\operatorname{tr} Q\hat{T}_n)^{1/2} + n^{-b}]\}, \qquad (1.15)$$

in which $m_0$ is a fixed initial sample size and $b > 0$ remains to be chosen. The quantity $b$ is usually called the delay factor, in the sense that it prevents early stopping. Finally, in view of (1.8), to construct a confidence region with prescribed coverage probability $1 - \alpha$, we adopt the sequential procedure $(N_d, R_{N_d})$ in which the stopping rule $N_d$ is defined by

$$N_d = \inf\{n \ge m_0 : [\lambda_{\max}(\hat{T}_n) + n^{-b}] \le d^2 n / \chi^2_{1-\alpha}(m)\}, \qquad (1.16)$$

where $\lambda_{\max}(\hat{T}_n)$ is the largest eigenvalue of $\hat{T}_n$, and the terminal decision rule $R_{N_d}$ is defined by

$$R_{N_d} = \{\mu \in R^m : (\bar{X}_{N_d} - \mu)' H^{-1} (\bar{X}_{N_d} - \mu) \le d^2\}, \qquad (1.17)$$

where $H$ is a known p.d. matrix and $d^2 H$ determines the size and the shape of the confidence region. The main results are stated in Section 2 and the proofs are given in Section 3.
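Before turning to the main results, here is a minimal sketch of how the two procedures might be run, reusing `T_hat` from the sketch above. The data source `next_obs` (returning one observation vector per call), the defaults `m0 = 10` and `b = 0.5`, and all other names are our assumptions, not the authors' code.

```python
import numpy as np
from scipy.stats import chi2

def sequential_point_estimate(next_obs, Q, c, m0=10, b=0.5):
    """Sample until the stopping time N_c of (1.15); return (N_c, X_bar)."""
    X = [next_obs() for _ in range(m0)]
    while True:
        n = len(X)
        bound = c ** -0.5 * (np.trace(Q @ T_hat(np.asarray(X))) ** 0.5 + n ** -b)
        if n >= bound:
            return n, np.mean(X, axis=0)
        X.append(next_obs())

def sequential_confidence_region(next_obs, d, alpha, m0=10, b=0.5):
    """Sample until N_d of (1.16); R_{N_d} of (1.17) is then the ellipsoid
    {mu : (X_bar - mu)' H^{-1} (X_bar - mu) <= d^2} centered at X_bar."""
    X = [next_obs() for _ in range(m0)]
    crit = chi2.ppf(1 - alpha, df=len(X[0]))     # chi^2_{1-alpha}(m)
    while True:
        n = len(X)
        lam_max = np.linalg.eigvalsh(T_hat(np.asarray(X))).max()
        if lam_max + n ** -b <= d ** 2 * n / crit:
            return n, np.mean(X, axis=0)
        X.append(next_obs())
```

Theorem 4 below restricts the delay factor to $b \in (0, (r-2)/2)$, so the default $b = 0.5$ implicitly assumes $r > 3$.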

2. Main results

First, we recall the definition of mixing in the sense of Renyi (1958).

Definition 1. A sequence $\{Y_n, n \ge 1\}$ of random elements on the probability space $(\Omega, \mathcal{F}, P)$ with values in a metric space is Renyi-mixing with limiting distribution $F$ (notation: $Y_n \Rightarrow F$ (mixing)) if $P(\{Y_n \in A\} \cap B) \to F(A)P(B)$, as $n \to \infty$, for all $F$-continuity sets $A$ and all events $B$ with $P(B) > 0$.

Theorem 1. Let $\{\xi_n\}$ be as in (1.6), and assume (1.4) holds. Further, assume that

$$\sup_t E\|Z_t\|^2 < \infty \qquad (2.1)$$

and

$$\frac{1}{n} \sum_{t=1}^{n} E(Z_t' Z_t I(Z_t' Z_t > \epsilon n) \mid \mathcal{F}_{t-1}) \stackrel{p}{\to} 0 \quad \text{as } n \to \infty \qquad (2.2)$$

for every $\epsilon > 0$, where $I(\cdot)$ denotes the indicator function. Let $\{N_n, n \ge 1\}$ be a sequence of positive-integer-valued random variables defined on the probability space $(\Omega, \mathcal{F}, P)$ such that, as $n \to \infty$, $N_n/n \stackrel{p}{\to} N$ with $P(0 < N < \infty) = 1$. Then the following hold:
(i) $\xi_n \Rightarrow W^m$ (mixing).
(ii) $\xi_{N_n} \Rightarrow W^m$.
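An informal Monte Carlo illustration of the random-index part of Theorem 1, continuing the simulation design of Section 1 (so `rng`, `lags`, `m`, `mu`, `A` and `T` are assumed in scope): by the continuous mapping theorem, the quadratic form $N(\bar{X}_N - \mu)' T^{-1} (\bar{X}_N - \mu)$ at a random sample size $N$ should be approximately $\chi^2(m)$.

```python
from scipy.stats import chi2

T_inv = np.linalg.inv(T)
qforms = []
for _ in range(500):                      # Monte Carlo replications
    N = int(rng.integers(200, 400))       # random index, of order n
    Zs = rng.standard_normal((N + lags, m))
    Xs = mu + np.array([sum(A[u] @ Zs[t - u] for u in range(lags))
                        for t in range(lags, N + lags)])
    dev = Xs.mean(axis=0) - mu
    qforms.append(N * dev @ T_inv @ dev)
# Empirical coverage of the chi-square 0.95 quantile; should be near 0.95.
print(np.mean(np.array(qforms) <= chi2.ppf(0.95, df=m)))
```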


Corollary 1. Let $\{\tilde{T}_n\}$ be a sequence of $m \times m$ p.d. matrices such that $\tilde{T}_n \stackrel{a.s.}{\to} T$ as $n \to \infty$, and let $\tilde{\xi}_n$ be the same as $\xi_n$, defined in (1.6), with $T$ replaced by $\tilde{T}_n$. Then, under the assumptions of Theorem 1, we have
(i) $\tilde{\xi}_n \Rightarrow W^m$ (mixing).
(ii) $\tilde{\xi}_{N_n} \Rightarrow W^m$.

The next theorem gives information concerning the rate of convergence of $\hat{T}_n$ to $T$, which, in turn, implies that $\hat{T}_n \stackrel{a.s.}{\to} T$. Thus, $\hat{T}_n$ satisfies the condition of Corollary 1.

Theorem 2. Let $\hat{\tau}_{ij}$ and $\tau_{ij}$ denote the $(i,j)$th elements of $\hat{T}_n$ and $T$, respectively. Assume that the martingale difference sequence $\{Z_t\}$ is weakly stationary with $\sup_t E\|Z_t\|^{2r} < \infty$ for some $r > 2$. Then

$$\max_{1 \le i,j \le m} \|\hat{\tau}_{ij} - \tau_{ij}\|_r = O(n^{-1/2} h_n), \qquad (2.3)$$

where $\|\cdot\|_r$ denotes the $r$th norm in the space $L^r(\Omega, \mathcal{F}, P)$, and

$$\hat{T}_n \stackrel{a.s.}{\to} T \quad \text{as } n \to \infty. \qquad (2.4)$$
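As an informal numerical check of (2.3)-(2.4) (not a substitute for the proof), the simulation sketch of Section 1 can be rerun at increasing sample sizes; this continuation reuses `rng`, `lags`, `m`, `mu`, `A`, `T` and `T_hat` from the earlier sketches.

```python
# max_{i,j} |tau_hat_ij - tau_ij| should shrink roughly like n^{-1/2} h_n.
for n_try in (500, 5_000, 50_000):
    Zs = rng.standard_normal((n_try + lags, m))
    Xs = mu + np.array([sum(A[u] @ Zs[t - u] for u in range(lags))
                        for t in range(lags, n_try + lags)])
    print(n_try, np.max(np.abs(T_hat(Xs) - T)))
```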

The following theorems summarize the asymptotic properties of the proposed sequential procedures (1.15)-(1.17). Theorem 4 shows that the sequential point estimation is asymptotically risk efficient in the sense of Starr (1966), and Theorem 5 shows that the fixed-accuracy confidence set procedure is asymptotically efficient with prescribed coverage probability in the sense of Chow and Robbins (1965). Assume that the conditions of Theorems 1 and 2 hold. Then:

Theorem 3 (Asymptotic normality).

$$\sqrt{N_c}\,(\bar{X}_{N_c} - \mu) \stackrel{L}{\to} N(0, T) \quad \text{as } c \to 0, \qquad (2.5)$$

and

$$\sqrt{N_d}\,(\bar{X}_{N_d} - \mu) \stackrel{L}{\to} N(0, T) \quad \text{as } d \to 0. \qquad (2.6)$$

Theorem 4 (Point estimation). If the delay factor $b \in (0, (r-2)/2)$, then, as $c \to 0$,

$$N_c/n_0 \to 1 \quad \text{a.s.}, \qquad (2.7)$$

$$E|N_c/n_0 - 1| \to 0 \qquad (2.8)$$

and

$$R_{N_c}/R_{n_0} \to 1, \qquad (2.9)$$

where $n_0$ denotes the optimal fixed sample size and $R_n$ the corresponding risk (see Fakhre-Zakeri and Lee, 1993).

Theorem 5 (Confidence region).

$$\lim_{d \to 0} N_d/k_0 = 1 \quad \text{a.s.}, \qquad (2.10)$$

$$\lim_{d \to 0} P(\mu \in R_{N_d}) \ge 1 - \alpha \qquad (2.11)$$

and

$$\lim_{d \to 0} E(N_d/k_0) = 1, \qquad (2.12)$$

where $k_0 = [\lambda_{\max}(T) \cdot \chi^2_{1-\alpha}(m)\, d^{-2}] + 1$ and $\lambda_{\max}(T)$ is the largest eigenvalue of $T$.
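For concreteness, a worked instance of $k_0$ under the assumed values $\lambda_{\max}(T) = 2.5$, $m = 2$, $\alpha = 0.05$ and $d = 0.1$ (all purely illustrative):

```python
from scipy.stats import chi2

lam_max, m, alpha, d = 2.5, 2, 0.05, 0.1
k0 = int(lam_max * chi2.ppf(1 - alpha, df=m) / d ** 2) + 1
print(k0)   # chi^2_{0.95}(2) = 5.991..., so k0 = [2.5 * 5.991 / 0.01] + 1 = 1498
```

Theorem 5 then says that, for small $d$, the random sample size $N_d$ is close to this deterministic benchmark, while the coverage of $R_{N_d}$ is at least $1 - \alpha$ in the limit.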

3. Proofs

For simplicity of notation, set $\mu = 0$. To prove Theorem 1 it is useful to have the following lemmas.

Lemma 1. Let $\{Y_n, n \ge 1\}$ and $\{Y_n', n \ge 1\}$ be two sequences of random elements on the probability space $(\Omega, \mathcal{F}, P)$ with values in a metric space $(S, \rho)$. If $Y_n \Rightarrow F$ (mixing) and $\rho(Y_n, Y_n') \stackrel{p}{\to} 0$ as $n \to \infty$, then $Y_n' \Rightarrow F$ (mixing).

This is also Lemma 2.6 of Rootzen (1976) and its proof is immediate.

Lemma 2. Let $\{Y_n, n \ge 1\}$ and $\{Y_n', n \ge 1\}$ be two sequences of random elements on $(\Omega, \mathcal{F}, P)$ with values in a metric space. Further, let $g(x, y)$ be a continuous function of two variables. If $Y_n \Rightarrow F$ (mixing) and $Y_n' \stackrel{p}{\to} Y'$ as $n \to \infty$, then $g(Y_n, Y_n') \Rightarrow g(F, Y')$ (mixing).

Proof. See Theorem 10 of Aldous and Eagleson (1978).

Lemma 3. Let $X_t^* = (\sum_{j=0}^{\infty} A_j) Z_t$, $S_k^* = \sum_{t=1}^{k} X_t^*$, and assume that (1.4) and (2.1) hold. Let $p_n$ be a sequence of positive integers such that

$$p_n \to \infty \quad \text{and} \quad p_n/n \to 0 \quad \text{as } n \to \infty. \qquad (3.1)$$

Then
(i) $\max_{1 \le k \le n} \|n^{-1/2}(S_k^* - S_k)\| = o_p(1)$.
(ii) Define

$$\xi_{n,p_n}(u) = \begin{cases} 0, & un \le p_n, \\ n^{-1/2}\, T^{-1/2} (S_r - S_{p_n}), & p_n \le r \le un < r + 1. \end{cases} \qquad (3.2)$$

Then

$$\sup_{0 \le u \le 1} \|\xi_n(u) - \xi_{n,p_n}(u)\| = o_p(1). \qquad (3.3)$$

Proof. First observe that

$$S_k^* = \sum_{t=1}^{k} \Bigl(\sum_{j=0}^{k-t} A_j\Bigr) Z_t + \sum_{t=1}^{k} \Bigl(\sum_{j=k-t+1}^{\infty} A_j\Bigr) Z_t = \sum_{t=1}^{k} \Bigl(\sum_{j=0}^{t-1} A_j Z_{t-j}\Bigr) + \sum_{t=1}^{k} \Bigl(\sum_{j=k-t+1}^{\infty} A_j\Bigr) Z_t,$$

and thus

$$S_k^* - S_k = -\sum_{t=1}^{k} \sum_{j=t}^{\infty} A_j Z_{t-j} + \sum_{t=1}^{k} \Bigl(\sum_{j=k-t+1}^{\infty} A_j\Bigr) Z_t = I_1 + I_2 \quad \text{(say)}.$$

To prove

$$n^{-1/2} \max_{1 \le k \le n} \|I_1\| = o_p(1), \qquad (3.4)$$

note that

$$n^{-1} E \max_{1 \le k \le n} \Bigl\|\sum_{t=1}^{k} \sum_{j=t}^{\infty} A_j Z_{t-j}\Bigr\|^2 = n^{-1} E \max_{1 \le k \le n} \Bigl\|\sum_{j=1}^{\infty} \sum_{t=1}^{j \wedge k} A_j Z_{t-j}\Bigr\|^2$$

$$\le n^{-1} \Biggl[\sum_{j=1}^{\infty} \|A_j\| \Bigl(E \max_{1 \le k \le n} \Bigl\|\sum_{t=1}^{j \wedge k} Z_{t-j}\Bigr\|^2\Bigr)^{1/2}\Biggr]^2 \le K \bigl(\sup_t E\|Z_t\|^2\bigr) \Biggl[\sum_{j=1}^{\infty} \|A_j\| \Bigl(\frac{j \wedge n}{n}\Bigr)^{1/2}\Biggr]^2$$

for some constant $K$, where we have used Doob's maximal inequality. By the dominated convergence theorem the last term above tends to zero as $n \to \infty$, proving (3.4). Next, we show that

$$n^{-1/2} \max_{1 \le k \le n} \|I_2\| = o_p(1). \qquad (3.5)$$

Write $I_2 = II_1 + II_2$, where

$$II_1 = A_1 Z_k + A_2 (Z_k + Z_{k-1}) + \cdots + A_k (Z_k + \cdots + Z_1)$$

and

$$II_2 = (A_{k+1} + A_{k+2} + \cdots)(Z_k + \cdots + Z_1).$$

Note that

$$n^{-1/2} \max_{1 \le k \le n} \|II_2\| \le \sum_{i=0}^{\infty} \|A_i\| \, n^{-1/2} \max_{1 \le k \le p_n} \|Z_1 + \cdots + Z_k\| + \sum_{i > p_n} \|A_i\| \, n^{-1/2} \max_{1 \le k \le n} \|Z_1 + \cdots + Z_k\| = O_p\Bigl(\sqrt{\frac{p_n}{n}}\Bigr) + O_p\Bigl(\sum_{i > p_n} \|A_i\|\Bigr) = o_p(1),$$

by Doob's maximal inequality, (3.1) and (1.3). It remains to prove that

$$Y_n := n^{-1/2} \max_{1 \le k \le n} \|II_1\| = o_p(1).$$

To this end, define for each $l \ge 1$

$$II_{1,l} = B_1 Z_k + B_2 (Z_k + Z_{k-1}) + \cdots + B_k (Z_k + \cdots + Z_1),$$

where

$$B_k = \begin{cases} A_k, & k \le l, \\ 0_{m \times m}, & k > l. \end{cases}$$

Let $Y_{n,l} = n^{-1/2} \max_{1 \le k \le n} \|II_{1,l}\|$. Clearly, for each $l \ge 1$,

$$Y_{n,l} = o_p(1). \qquad (3.6)$$

On the other hand,

$$n(Y_{n,l} - Y_n)^2 \le \max_{1 \le k \le n} \Bigl\|\sum_{i=1}^{k} (A_i - B_i)(Z_k + \cdots + Z_{k-i+1})\Bigr\|^2 = \max_{l < k \le n} \Bigl\|\sum_{i=l+1}^{k} A_i (Z_k + \cdots + Z_{k-i+1})\Bigr\|^2$$

$$\le \Bigl(\sum_{i > l} \|A_i\|\Bigr)^2 \max_{l < k \le n} \max_{l < i \le k} \|Z_k + \cdots + Z_{k-i+1}\|^2 \le 4 \Bigl(\sum_{i > l} \|A_i\|\Bigr)^2 \max_{1 \le j \le n} \|Z_1 + \cdots + Z_j\|^2.$$

From this result and (1.3), for any $\epsilon > 0$,

$$\lim_{l \to \infty} \limsup_{n \to \infty} P(|Y_{n,l} - Y_n|^2 > \epsilon) \le \lim_{l \to \infty} \limsup_{n \to \infty} 4\epsilon^{-1} \Bigl(\sum_{i > l} \|A_i\|\Bigr)^2 n^{-1} E \max_{1 \le j \le n} \|Z_1 + \cdots + Z_j\|^2 \le 4\epsilon^{-1} \sup_t E\|Z_t\|^2 \cdot \lim_{l \to \infty} \Bigl(\sum_{i > l} \|A_i\|\Bigr)^2 = 0. \qquad (3.7)$$

In view of (3.6) and (3.7), it follows from Theorem 4.2 of Billingsley (1968, p. 25) that $Y_n = o_p(1)$. This completes the proof of Lemma 3(i).

To prove Lemma 3(ii), first observe that

$$\sup_{0 \le u \le 1} \|\xi_n(u) - \xi_{n,p_n}(u)\| \le 3 n^{-1/2} \|T^{-1/2}\| \max_{1 \le i \le p_n} \|S_i\| + n^{-1/2} \|T^{-1/2}\| \max_{1 \le i \le n} \|X_i\|. \qquad (3.8)$$

By Lemma 3(i),

$$n^{-1} \max_{1 \le i \le p_n} \|S_i\|^2 = n^{-1} \max_{1 \le i \le p_n} \|S_i^*\|^2 + o_p(1);$$

by Doob's maximal inequality,

$$n^{-1} \max_{1 \le i \le p_n} \|S_i^*\|^2 = O_p(p_n/n) = o_p(1);$$

and by assumption (2.1),

$$n^{-1/2} \max_{1 \le i \le n} \|X_i\| = o_p(1).$$

Combining these observations shows that the right-hand side of (3.8) converges in probability to zero, and hence the proof of (ii) is complete.

Proof of Theorem 1. Let $\xi_n^*$ be the same as $\xi_n$ defined in (1.6) with $S_r^*$ and $X_{r+1}^*$ in place of $S_r$ and $X_{r+1}$, respectively. It follows from the multivariate version of Theorem 1 of Babu and Ghosh (1976), or Theorem 2.4 of Rootzen (1976), or Theorem 2 of Aldous and Eagleson (1978), that $\xi_n^* \Rightarrow W^m$ (mixing). Applying Lemma 3(i), we obtain that $\xi_n \Rightarrow W^m$ (mixing), so Theorem 1(i) is proved.

To prove Theorem 1(ii), first note that

$$\xi_{n,p_n} \Rightarrow W^m \quad \text{(mixing)},$$

which follows directly from an application of Theorem 1(i) and Lemmas 3(ii) and 1. Combining this result and Lemma 3(ii), we arrive at (17.19) of Billingsley (1968, p. 147). From this point on, the proof of Theorem 1(ii) follows the same lines as that of his Theorem 17.2, and hence the details are omitted.

Proof of Corollary 1. Part (i) follows immediately from a joint application of Theorem 1(i) and Lemma 2. Because $\tilde{T}_n \stackrel{a.s.}{\to} T$ and $N_n \stackrel{p}{\to} +\infty$ as $n \to \infty$, it follows that $\tilde{T}_{N_n} \stackrel{p}{\to} T$ (cf. Gut, 1988, Theorem 2.2). Hence part (ii) follows from this fact, Theorem 1(ii) and Lemma 2.

Remark 1. As mentioned in the introduction, there are various ways of defining the partial sum process. Corollary 1 shows that we may normalize the process by any sequence which converges a.s. to $T$. In the univariate case, due to the nonhomogeneous nature of the variances, Brown (1971) used

$$\xi_n(u) = s_n^{-1} \Bigl[S_r + \frac{u s_n^2 - s_r^2}{s_{r+1}^2 - s_r^2}\, X_{r+1}\Bigr] \quad \text{if } s_r^2 \le u s_n^2 < s_{r+1}^2,$$

where $s_n^2 = E S_n^2$. A similar partial sum process can also be defined in our case. For instance, let

$$\xi_n(u) := n^{-1/2} \hat{T}_n^{-1/2} \Bigl[S_r + \frac{u s_n^2 - s_r^2}{s_{r+1}^2 - s_r^2}\, X_{r+1}\Bigr] \quad \text{if } s_r^2 \le u s_n^2 < s_{r+1}^2, \qquad (3.9)$$

where $s_n^2 = E\|\sum_{t=1}^{n} Z_t\|^2$, and replace the constant $n$ in (1.4) and (2.2) by $s_n^2$. Note that since $T$ is p.d. and $\hat{T}_n \stackrel{a.s.}{\to} T$, we may assume that $\hat{T}_n$ is also p.d.


An alternative choice of particular interest for the partial sum process is

$$\xi_n(u) = T_n^{-1/2} S_r, \quad s_r^2 \le u s_n^2 < s_{r+1}^2. \qquad (3.10)$$

This process belongs to $D^m[0,1]$, the space of all functions on $[0,1]$ into $R^m$ which are right continuous and have left-hand limits, usually equipped with the Skorohod topology and a compatible metric (see, e.g., Billingsley, 1968; Pollard, 1984). For simplicity and clarity of exposition, we have chosen the partial sum process $\xi_n$ introduced in (1.6). However, it is easy to see that the conclusions of Theorem 1 also hold for the partial sum processes given in (3.9) and (3.10) above, with suitable modifications.

Remark 2. An interesting consequence of Theorem 1(i) is that for all $\omega \in \Omega$ outside of a null set, the sequence $\{\xi_n(\cdot, \omega); n \in N'\}$ is dense in the set of all functions in $C^m[0,1]$ that vanish at zero, where $N'$ is any infinite sequence of integers. For a proof of this result see Rootzen (1976), Theorem 2.2.

The following lemmas are useful in the sequel.

Lemma 4. Let $\{Y_i, i \ge 1\}$ be a sequence of random variables, and let $S_n = Y_1 + \cdots + Y_n$, $S_{a,n} = Y_{a+1} + \cdots + Y_{a+n}$, $M_{a,n} = \max_{1 \le i \le n} |S_{a,i}|$. Suppose that for some $\nu > 2$,

$$E|S_{a,n}|^{\nu} \le A_{\nu} n^{\nu/2} \quad (\text{all } a \ge 0, \text{ all } n \ge 1) \qquad (3.11)$$

and

$$E|M_{a,n}|^{\nu} \le B_{\nu} n^{\nu/2} \quad (\text{all } a \ge 0, \text{ all } n \ge 1), \qquad (3.12)$$

where $A_{\nu}$ and $B_{\nu}$ are positive constants depending only on $\nu$. Then for any $r \in (0, \nu)$ and for all $n \ge 1$,

$$E\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r \le c_{r,\nu}\, n^{-r/2},$$

where

$$c_{r,\nu} = 1 + \frac{r}{\nu - r}\,(A_{\nu} + B_{\nu})\, 2^{\nu} (1 - 2^{-\nu/2})^{-1}.$$

Proof. By (3.11), (3.12) and Theorem 5.1 of Serfling (1970), for any $x > 0$,

$$P\Bigl(\sup_{k \ge n} |S_k/k| > x\Bigr) \le a_{\nu}\, x^{-\nu} n^{-\nu/2}, \qquad (3.13)$$

where $a_{\nu} = (A_{\nu} + B_{\nu})\, 2^{\nu} (1 - 2^{-\nu/2})^{-1}$. Note that

$$E\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r \le \epsilon + E\Bigl[\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r I\Bigl(\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r > \epsilon\Bigr)\Bigr] \le \epsilon + \int_{\epsilon}^{\infty} P\Bigl(\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r > x\Bigr)\, dx$$

$$= \epsilon + \int_{\epsilon^{1/r}}^{\infty} r y^{r-1} P\Bigl(\sup_{k \ge n} |S_k/k| > y\Bigr)\, dy \le \epsilon + r a_{\nu} n^{-\nu/2} \int_{\epsilon^{1/r}}^{\infty} y^{r-1-\nu}\, dy = \epsilon + r a_{\nu} n^{-\nu/2} (\nu - r)^{-1} \epsilon^{(r-\nu)/r},$$

where the last inequality follows from (3.13). Letting $\epsilon = n^{-r/2}$ we obtain the desired inequality.

Lemma 5. Let $\{Y_t\}$ be a sequence of martingale differences such that $\sup_t E|Y_t|^r < \infty$ for some $r \ge 2$, and let $\{c_t\}$ be a sequence of real numbers. Then there is a constant $B_r$ such that

$$E\Bigl|\sum_{t=1}^{n} c_t Y_t\Bigr|^r \le B_r \Bigl(\sup_t E|Y_t|^r\Bigr) \Bigl(\sum_{t=1}^{n} c_t^2\Bigr)^{r/2}.$$

Proof. The proof follows easily by an application of Burkholder's inequality and then Minkowski's inequality.

Lemma 6. Let $\{Y_t\}$ be a sequence of martingale differences such that $\sup_t E|Y_t|^{\nu} < \infty$ for some $\nu > 2$. Then for any $r \in (0, \nu)$ and for all $n \ge 1$,

$$E\Bigl(\sup_{k \ge n} |S_k/k|\Bigr)^r \le D_{r,\nu}\, n^{-r/2},$$

where

$$D_{r,\nu} = 1 + (\nu - r)^{-1} r c_{\nu} \bigl[1 + (\nu/(\nu - 1))^{\nu}\bigr]\, 2^{\nu} (1 - 2^{-\nu/2})^{-1} \sup_t E|Y_t|^{\nu},$$

$$c_{\nu} = (18 \nu \tilde{\nu}^{1/2})^{\nu}, \quad \tilde{\nu}^{-1} + \nu^{-1} = 1.$$

Proof. By Doob's maximal inequality and Lemma 5 we have, for all $a \ge 0$ and all $n \ge 1$,

$$E|M_{a,n}|^{\nu} \le (\nu/(\nu - 1))^{\nu} E|S_{a,n}|^{\nu} \le c_{\nu} (\nu/(\nu - 1))^{\nu} \Bigl(\sup_t E|Y_t|^{\nu}\Bigr) n^{\nu/2}.$$

Application of Lemma 4 with $A_{\nu} = c_{\nu} \sup_t E|Y_t|^{\nu}$ and $B_{\nu} = c_{\nu} (\nu/(\nu - 1))^{\nu} \sup_t E|Y_t|^{\nu}$ now completes the proof.

Proof of Theorem 2. We follow the pattern of the proof of Lemma 1 in Fakhre-Zakeri and Lee (1993). Denote the components of $Z_t$, $X_t$ and $\bar{X}_n$ by $Z_{t,j}$, $X_{t,j}$ and $\bar{X}_{n,j}$, $j = 1, \ldots, m$, and the $(i,j)$th entries of $\Gamma(h)$ and $\hat{\Gamma}(h)$ by $\gamma_{ij}(h)$ and $\hat{\gamma}_{ij}(h)$. In view of (1.1), the $j$th component $X_{t,j}$ of $X_t$ is

$$X_{t,j} = \sum_{k=1}^{m} \sum_{u=0}^{\infty} a^{u}_{jk} Z_{t-u,k}, \qquad (3.14)$$

where $a^{u}_{jk}$ denotes the $(j,k)$th entry of $A_u$.

Clearly,

$$\sup_t E|X_{t,j}|^{2r} < \infty \qquad (3.15)$$

and, by Minkowski's inequality and Lemma 6,

$$\Bigl\|n^{-1} \sum_{t=1}^{n} X_{t,j}\Bigr\|_{2r} \le \sum_{k=1}^{m} \sum_{u=0}^{\infty} |a^{u}_{jk}| \cdot \Bigl\|n^{-1} \sum_{t=1}^{n} Z_{t-u,k}\Bigr\|_{2r} = O(n^{-1/2}). \qquad (3.16)$$

Let

$$\tilde{\gamma}_{ij}(h) = n^{-1} \sum_{t=1}^{n} X_{t+h,i} X_{t,j}, \qquad (3.17)$$

$$A_1 = A_1(n,i,j,h) = -n^{-1} \sum_{t=n-h+1}^{n} X_{t+h,i} X_{t,j},$$

$$A_2 = A_2(n,i,j,h) = -n^{-1} \Bigl(\bar{X}_{n,i} \sum_{t=1}^{n-h} X_{t,j} + \bar{X}_{n,j} \sum_{t=1}^{n-h} X_{t+h,i}\Bigr),$$

$$A_3 = A_3(n,i,j,h) = n^{-1} (n - h)\, \bar{X}_{n,i} \bar{X}_{n,j}.$$

Then, in view of (1.14), we have

$$\hat{\gamma}_{ij}(h) = \tilde{\gamma}_{ij}(h) + A_1 + A_2 + A_3. \qquad (3.18)$$

Using the Schwarz inequality, (3.15) and (3.16), it follows immediately that

$$\max_{1 \le i,j \le m} \max_{0 \le h \le h_n} \sum_{k=1}^{3} \|A_k\|_r = O(n^{-1/2}). \qquad (3.19)$$

Note that from (1.10) and (1.13),

$$\hat{\tau}_{ij} - \tau_{ij} = \hat{\gamma}_{ij}(0) - \gamma_{ij}(0) + \sum_{h=1}^{h_n} [(\hat{\gamma}_{ij}(h) - \gamma_{ij}(h)) + (\hat{\gamma}_{ji}(h) - \gamma_{ji}(h))] - \sum_{h=h_n+1}^{\infty} [\gamma_{ij}(h) + \gamma_{ji}(h)].$$

Hence (2.3) holds if

$$\max_{1 \le i,j \le m} \max_{0 \le h \le h_n} \|\sqrt{n}\,(\hat{\gamma}_{ij}(h) - \gamma_{ij}(h))\|_r = O(1), \qquad (3.20)$$

and by virtue of (3.18) and (3.19) it suffices to show that

$$\max_{1 \le i,j \le m} \max_{0 \le h \le h_n} \|\sqrt{n}\,(\tilde{\gamma}_{ij}(h) - \gamma_{ij}(h))\|_r = O(1). \qquad (3.21)$$

Proceeding as in (3.12)-(3.14) of Fakhre-Zakeri and Lee (1993) and applying Lemma 5, we see that

$$\max_{1 \le i,j \le m} \sup_{h \ge 0} \|\sqrt{n}\,(\tilde{\gamma}_{ij}(h) - \gamma_{ij}(h))\|_r = O(1),$$

which proves (3.21). Now, using (2.3) and the Markov inequality, we obtain, for any $\epsilon > 0$,

$$\max_{1 \le i,j \le m} P(|\hat{\tau}_{ij} - \tau_{ij}| > \epsilon) = O(n^{-r/2} h_n^r),$$

which implies (2.4). This completes the proof of Theorem 2.

Theorem 3 is an easy consequence of Theorem 1(ii) together with (2.4), (2.7) and (2.10). Theorems 4 and 5 can be proved along the lines of Theorems 2 and 3, respectively, of Fakhre-Zakeri and Lee (1993). We omit the details for brevity.

References

Aldous, D.J., Eagleson, G.K., 1978. On mixing and stability of limit theorems. Ann. Probab. 6, 325-331.
Andrews, D.W.K., 1991. Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817-858.
Babu, G.J., Ghosh, M., 1976. A random functional central limit theorem for martingales. Acta Math. Acad. Sci. Hungar. 27, 301-306.
Billingsley, P., 1968. Convergence of Probability Measures. Wiley, New York.
Brown, B.M., 1971. Martingale central limit theorems. Ann. Math. Statist. 42, 59-66.
Chow, Y.S., Robbins, H.E., 1965. On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 457-462.
Fakhre-Zakeri, I., Lee, S., 1993. Sequential estimation of the mean vector of a multivariate linear process. J. Multivariate Anal. 47, 196-209.
Gut, A., 1988. Stopped Random Walks: Limit Theorems and Applications. Springer, New York.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and Its Application. Academic Press, New York.
Phillips, P.C.B., Solo, V., 1992. Asymptotics for linear processes. Ann. Statist. 20, 971-1001.
Pollard, D., 1984. Convergence of Stochastic Processes. Springer, New York.
Renyi, A., 1958. On mixing sequences of random variables. Acta Math. Acad. Sci. Hungar. 9, 389-393.
Rootzen, H., 1976. Fluctuations of sequences which converge in distribution. Ann. Probab. 4, 456-463.
Serfling, R.J., 1970. Convergence properties of $S_n$ under moment restrictions. Ann. Math. Statist. 41, 1235-1248.
Starr, N., 1966. On the asymptotic efficiency of a sequential procedure for estimating the mean. Ann. Math. Statist. 37, 1173-1185.