Nonparametric trend estimation in replicated time series


Journal of Statistical Planning and Inference 97 (2001) 263–274

www.elsevier.com/locate/jspi

Sucharita Ghosh∗

Swiss Federal Research Institute WSL, Zürcherstrasse 111, CH-8903 Birmensdorf, Switzerland

Received 26 September 1999; received in revised form 1 September 2000; accepted 15 September 2000

Abstract

Nonparametric estimation of a mean trend function for replicated time series is considered. The $i$th series is defined by $Y_i(j) = g(t_j) + g_i(t_j) + X_i(j)$ ($j = 1, 2, 3, \ldots$) with $\sum_{i=1}^{k} g_i = 0$. The error processes $X_i$ ($i = 1, \ldots, k$) for the individual series are assumed to be stationary with fractional differencing parameters $\delta_i \in (-\frac{1}{2}, \frac{1}{2})$, corresponding to antipersistence ($\delta_i < 0$), short-range dependence ($\delta_i = 0$) or long-range dependence ($\delta_i > 0$). A nonparametric estimate of $g$ is defined. The optimal bandwidth and optimal integrated mean squared error are derived under the assumption of fixed and randomly generated $\delta_i$'s, respectively. Simulations illustrate the results. © 2001 Elsevier Science B.V. All rights reserved.

MSC: 62G07; 62G20; 62M10

Keywords: Kernel estimation; Long-range dependence; Repeated measures; Smoothing; Time series

1. Introduction

Time series exhibiting smooth trends are well known in the literature. Often these trends are deterministic functions of time, but they may also be stochastic. In particular, the random fluctuations around a deterministic trend can consist of slowly decaying long-term correlations, which are also known to create spurious trend-like behavior. Estimating a smooth function in the presence of long-range dependent errors is therefore difficult in general. Mathematically, long-range dependence results in a slower rate of convergence of kernel estimates (see e.g. Hall and Hart, 1990; Csörgő and Mielniczuk, 1995). Stationary models for explaining spurious trend-like behavior in the data include fractional autoregressive processes (Granger and Joyeux, 1980; Hosking, 1980), with or without a drift. Such a model is given by
$$Y_i = g(t_i) + X_i, \qquad \phi_p(B)(1 - B)^{\delta} X_i = v_i.$$



∗ Tel.: +41-1-739-2431; fax: +41-1-739-2215. E-mail address: [email protected] (S. Ghosh).



Here $B$ denotes the backshift operator, $\phi_p(x)$ denotes a polynomial of degree $p$ in $x$ with its zeros outside the unit circle ($p = 0, 1, 2, \ldots$), $v_i \sim$ iid $N(0, \sigma^2)$, $i = 1, 2, \ldots, n$, and $t_i = i/n$. Moreover, $X_i$ is a stationary stochastic process, $-0.5 < \delta < 0.5$ is the fractional differencing parameter and $g$ is smooth. In particular, $p = 0$, $\delta = 0$ imply $X_i \sim$ iid $N(0, \sigma^2)$; $\delta > 0$ implies that $X_i$ has long memory, and $\delta < 0$ implies that $X_i$ is antipersistent.

In some situations, replicated time series are observed. Diggle and Wasel (1997) consider spectral estimation for repeated time series. Here, we consider the problem of nonparametric estimation of a common trend function. Typical examples where this is of interest are ring width data from many trees from a region with a specific ecological property, climate data from different geographic locations, etc. (see Cook and Kairiukstis, 1990 for references). The fact that $k$ replicated series are available can be used to obtain better estimates of the trend function. In this paper, kernel smoothing is applied to the averaged series. The errors $X_i(j)$ of the individual series $Y_i(j)$ ($i = 1, \ldots, k$; $j = 1, 2, \ldots, n$) are assumed to be stationary with short memory, long memory or antipersistence (for the definitions see below). In particular, some of the results in Beran and Ocker (1999) are generalized to the replicated time series situation for model (1) (see below). The asymptotically optimal bandwidth and integrated mean squared error are derived. Results are given for fixed $k$ as well as for $k$ tending to infinity. As it turns out, there is an essential difference between the results for these cases. The results are examined in specific randomized and nonrandomized settings.

2. The model

Let $Y_i(j)$ denote the $i$th time series, $i = 1, 2, \ldots, k$, observed at $j = 1, 2, \ldots, n$. The following model for $Y_i(j)$ is assumed:
$$Y_i(j) = g(t_j) + g_i(t_j) + X_i(j). \qquad (1)$$

Here $t_j = j/n$ ($j = 1, \ldots, n$). The functions $g, g_i$ are in $C^2[0,1]$ ($i = 1, \ldots, k$) and such that $\sum_{i=1}^{k} g_i(t) = 0$ ($0 \le t \le 1$). The $i$th error process $X_i(j)$ is a stationary zero mean process with existing second moments and spectral density $f_i(\lambda)$ such that, for some $-\frac{1}{2} < \delta_i < \frac{1}{2}$ and $C_i > 0$,
$$f_i(\lambda) \sim C_i\, |\lambda|^{-2\delta_i} \qquad (2)$$
as $\lambda \to 0$. This implies, for the covariances,
$$\mathrm{cov}(X_i(j), X_i(l)) = \gamma_i(u) \sim D_i\, |u|^{2\delta_i - 1} \qquad (3)$$
as $u = |j - l| \to \infty$, $\delta_i \ne 0$, where
$$D_i = \frac{\sin(\pi \delta_i)\, \Gamma(1 - 2\delta_i)}{\Gamma(1 + 2\delta_i)}\, C_i. \qquad (4)$$

Moreover, the different series $X_1(\cdot), \ldots, X_k(\cdot)$ are assumed to be independent of each other.
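To make the dependence structure in (2)–(4) concrete, the following is a minimal simulation sketch (my own illustration, not code from the paper): it generates an approximate FARIMA$(0, \delta, 0)$ path from a truncated MA$(\infty)$ filter, so that the sample autocorrelations decay roughly like $|u|^{2\delta - 1}$. The function name `farima_0d0` and the truncation length `m` are assumptions of this sketch.

```python
import numpy as np

def farima_0d0(n, delta, rng, m=5000):
    """Approximate FARIMA(0, delta, 0) path of length n.

    Uses the MA(infinity) coefficients psi_0 = 1,
    psi_j = psi_{j-1} * (j - 1 + delta) / j, truncated at lag m,
    so the output approximates the stationary process."""
    psi = np.ones(m)
    for j in range(1, m):
        psi[j] = psi[j - 1] * (j - 1 + delta) / j
    e = rng.standard_normal(n + m)
    return np.convolve(e, psi)[m:m + n]

rng = np.random.default_rng(1)
x = farima_0d0(10_000, 0.3, rng)   # delta = 0.3 > 0: long-range dependence
# Sample autocorrelations should decay roughly like |u|^(2*delta - 1) = u^(-0.4):
print([round(float(np.corrcoef(x[:-u], x[u:])[0, 1]), 3) for u in (1, 10, 100)])
```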


Remarks
• If $\delta_i > 0$, then $X_i$ exhibits long-range dependence (i.e. $f_i$ has a pole at zero and the sum of the autocorrelations diverges). If $\delta_i = 0$, then $X_i$ exhibits short-range dependence (i.e. $0 < f_i(0) < \infty$, and the sum of the autocorrelations is finite and nonzero). If $\delta_i < 0$, then $X_i$ exhibits antipersistence (i.e. $f_i(0) = 0$ and the sum of the autocorrelations is zero).
• The functions $g_i$ model the deviations of the individual trends from the overall trend $g$.

3. Estimation of g

Consider the average of the $k$ series, $\bar{Y}(j) = k^{-1} \sum_{i=1}^{k} Y_i(j)$. Then
$$\bar{Y}(j) = g(t_j) + \bar{X}(j), \qquad (5)$$
where $\bar{X}(j) = k^{-1} \sum_{i=1}^{k} X_i(j)$.

3.1. Definition

A simple estimate of $g$ can be obtained by applying kernel smoothing to the averaged series $\bar{Y}(j)$. Thus,
$$\hat{g}(t) = \frac{1}{nb} \sum_{j=1}^{n} K\!\left(\frac{t_j - t}{b}\right) \bar{Y}(j). \qquad (6)$$
Here the function $K$ is a kernel satisfying the conditions given in Gasser and Müller (1979) and $b > 0$ is the bandwidth. For simplicity of presentation, the rectangular kernel
$$K(u) = \tfrac{1}{2}\, 1\{-1 \le u \le 1\} \qquad (7)$$
will be used in the following. An extension of the results to more general kernels is straightforward. Also, boundary problems are not discussed here. Thus, the integrated mean square error is calculated by integrating over $[\varepsilon, 1 - \varepsilon]$, where $0 < \varepsilon < \frac{1}{2}$.

3.2. Conditional mean squared error – exact expression

Theorem 1. Let $K(u) = \frac{1}{2} 1\{-1 \le u \le 1\}$ and let $\hat{g}$ be defined by (6). Then, for fixed $n$ and $k$ and given $\delta_1, \delta_2, \ldots, \delta_k$:

(i) Bias:
$$B_{n,k}(t) = E(\hat{g}(t)) - g(t) = \frac{1}{2nb} \sum_{j=n(t-b)}^{n(t+b)} g(t_j) - g(t). \qquad (8)$$

(ii) Variance:
$$V_{n,k,\delta_1,\ldots,\delta_k} = \frac{1}{(2nb)^2} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{u=-2nb}^{2nb} (2nb + 1 - |u|)\, \gamma_i(u). \qquad (9)$$


(iii) Mean squared error at fixed $t$: $\mathrm{MSE}(t) = B_{n,k}^2(t) + V_{n,k,\delta_1,\ldots,\delta_k}$.

(iv) Integrated mean square error (IMSE) for $0 < \varepsilon < \frac{1}{2}$:
$$\mathrm{IMSE}_{\{\delta_1,\ldots,\delta_k\}} = \int_{\varepsilon}^{1-\varepsilon} (B_{n,k}(t))^2\, dt + (1 - 2\varepsilon)\, V_{n,k,\delta_1,\ldots,\delta_k}. \qquad (10)$$
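As an illustration of estimator (6) with the rectangular kernel (7), here is a short Python sketch (an assumed helper of mine, not code from the paper); `g_hat` evaluates $\hat{g}$ on a grid of interior points from the $k \times n$ data array.

```python
import numpy as np

def g_hat(Y, b, t_eval):
    """Kernel estimate (6) of g from the k x n array Y of replicated series,
    using the rectangular kernel (7): K(u) = 0.5 * 1{|u| <= 1}."""
    k, n = Y.shape
    ybar = Y.mean(axis=0)                        # averaged series, eq. (5)
    tj = np.arange(1, n + 1) / n                 # t_j = j/n
    est = np.empty(len(t_eval))
    for m, t in enumerate(t_eval):
        w = 0.5 * (np.abs((tj - t) / b) <= 1.0)  # kernel weights
        est[m] = w @ ybar / (n * b)
    return est

# Usage sketch: evaluate on the interior grid [eps, 1 - eps] with eps = 0.1.
# t_grid = np.linspace(0.1, 0.9, 81); g_est = g_hat(Y, b=0.1, t_eval=t_grid)
```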

3.3. Basic assumptions for asymptotic results

Let $n \to \infty$, $b \to 0$ and $nb \to \infty$. To obtain asymptotic results, four cases need to be distinguished.

Case 1 (Assumptions): $k$ is fixed and finite and $\delta_1, \delta_2, \ldots, \delta_k$ are non-random parameters. When $\delta_i = 0$, $\gamma_i$ satisfies $\sum_{u=-n}^{n} |u|\, \gamma_i(u) = O(n)$, $n \to \infty$.

Case 2 (Assumptions): $k$ is fixed and finite and $\delta_1, \delta_2, \ldots, \delta_k$ are independently and identically distributed random variables with a common distribution function $F$. The distribution function $F$ is assumed to have a moment generating function $m(\cdot)$ such that $m(-2\log(u)) = L(u)\, u^{-2\delta}$, where $L$ is slowly varying at zero in the sense of Zygmund (Feller, 1971) and $-\frac{1}{2} < \delta < \frac{1}{2}$. Also, if $f(\lambda) = E_F f_i(\lambda)$ and $\gamma(u) = (1/2\pi) \int_{-\pi}^{\pi} f(\lambda) e^{iu\lambda}\, d\lambda$, then, when $\delta = 0$, $\sum_{u=-n}^{n} |u|\, \gamma(u) = O(n)$, $n \to \infty$ (Priestley, 1989).

Case 3 (Assumptions): $k \to \infty$ and $\delta_1, \delta_2, \ldots, \delta_k$ are non-random parameters. Also, $\lim_{k\to\infty} k^{-1} \sum_{i=1}^{k} f_i(\lambda) = f(\lambda)$ uniformly in $\lambda$, $-\frac{1}{2} < \lambda < \frac{1}{2}$, and let $\gamma(u) = (1/2\pi) \int_{-\pi}^{\pi} f(\lambda) e^{iu\lambda}\, d\lambda$. We assume that
$$f(\lambda) \sim L(\lambda)\, |\lambda|^{-2\delta}, \qquad |\lambda| \to 0, \quad -1/2 < \delta < 1/2,$$
where $L$ is slowly varying at zero in the sense of Zygmund (Feller, 1971) and, when $\delta = 0$, $\sum_{u=-n}^{n} |u|\, \gamma(u) = O(n)$, $n \to \infty$ (Priestley, 1989).

Case 4 (Assumptions): $\delta_1, \delta_2, \ldots, \delta_k$ are independently and identically distributed random variables with a common distribution function $F$, and $k$ tends to infinity. The distribution function $F$, the average spectral density $f$ and the average covariance function $\gamma$ satisfy the conditions of Case 2.

3.4. Asymptotic IMSE and asymptotically optimal bandwidth – Case 1

Recall that $f_i$ denotes the spectral density of the stationary process $X_i$. The fractional differencing parameter $\delta_i$ of the $X_i$ process determines the behavior of its spectral density at zero via $f_i(\lambda) \sim C_i |\lambda|^{-2\delta_i}$. The autocovariances of $\bar{X}(j)$ are given by
$$\gamma(|j - l|) = \mathrm{cov}(\bar{X}(j), \bar{X}(l)) = \frac{1}{k^2} \sum_{i=1}^{k} \gamma_i(|j - l|),$$
giving rise to the spectral density
$$f_{\bar{X}}(\lambda) = \frac{1}{2\pi} \sum_{u=-\infty}^{\infty} \gamma(u)\, e^{iu\lambda} = \frac{1}{k^2} \sum_{i=1}^{k} f_i(\lambda).$$
It is then easy to see that if $\delta$ is the largest fractional differencing parameter, then it is also the fractional differencing parameter of the sample mean process. Thus, we obtain the following.


Lemma 1. Let $\delta = \max\{\delta_1, \delta_2, \ldots, \delta_k\}$. Then
$$f_{\bar{X}}(\lambda) \sim \frac{1}{k^2}\, C\, |\lambda|^{-2\delta} \quad \text{as } |\lambda| \to 0, \qquad (11)$$
where $C = \sum_{i:\, \delta_i = \delta} C_i$ and the $C_i$ are defined in (2).

In particular, for fixed $n$ and $k$, in Case 1 the IMSE is given by
$$\mathrm{IMSE}_{\{\delta_1,\ldots,\delta_k\}} = \int_{\varepsilon}^{1-\varepsilon} B_{n,k}^2(t)\, dt + (1 - 2\varepsilon)\, V_{n,k,\delta_1,\ldots,\delta_k}$$
$$= \int_{\varepsilon}^{1-\varepsilon} \left( \frac{1}{2nb} \sum_{j=n(t-b)}^{n(t+b)} g(t_j) - g(t) \right)^2 dt + (1 - 2\varepsilon)\, \frac{1}{(2nb)^2} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{u=-2nb}^{2nb} (2nb + 1 - |u|)\, \gamma_i(u). \qquad (12)$$

Define
$$\kappa(\delta) = \frac{2^{2\delta}\, \Gamma(1 - 2\delta) \sin(\pi\delta)}{\delta(2\delta + 1)} \qquad (13)$$
for $\delta \ne 0$, and $\kappa(0) = \lim_{\delta \to 0} \kappa(\delta) = \pi$ (see Beran and Ocker, 1999).

Theorem 2. Let $\delta = \max\{\delta_1, \delta_2, \ldots, \delta_k\}$ and $C = \sum_{i:\, \delta_i = \delta} C_i$, where the $C_i$ are defined in (2). If $k$ is fixed and $n$ and $nb$ tend to infinity, then under the conditions of Case 1 an asymptotic expression for the IMSE ($\mathrm{IMSE}_{\lim}$) is given by:

(i)
$$\mathrm{IMSE}_{\lim} = \frac{b^4}{36} \int_{\varepsilon}^{1-\varepsilon} [g''(t)]^2\, dt + (1 - 2\varepsilon)\, \frac{1}{k^2} \sum_{i=1}^{k} (nb)^{2\delta_i - 1} \kappa(\delta_i)\, C_i + o(b^4) + \frac{1}{k^2} \sum_{i=1}^{k} o((nb)^{2\delta_i - 1}) \qquad (14)$$
$$= \frac{b^4}{36} \int_{\varepsilon}^{1-\varepsilon} [g''(t)]^2\, dt + \frac{1}{k^2}\, (1 - 2\varepsilon)(nb)^{2\delta - 1} \kappa(\delta)\, C + o(\max(b^4, (nb)^{2\delta - 1})). \qquad (15)$$

(ii) The optimal bandwidth is equal to
$$b_{\mathrm{opt}} = \left( \frac{9(1 - 2\varepsilon)(1 - 2\delta)\, \kappa(\delta)\, C}{\int_{\varepsilon}^{1-\varepsilon} (g''(t))^2\, dt} \right)^{1/(5 - 2\delta)} n^{(2\delta - 1)/(5 - 2\delta)}\, k^{-2/(5 - 2\delta)}. \qquad (16)$$

(iii) The optimal IMSE is of the order $O(n^{(8\delta - 4)/(5 - 2\delta)}\, k^{-8/(5 - 2\delta)})$.

Theorem 2 implies that, under long-range dependence ($\delta > 0$), $\hat{g}$ has a slower rate of convergence than under independence.
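Under the reconstruction of (13) and (16) given above, the constants of Theorem 2 are easy to evaluate numerically. The following hedged sketch (the names `kappa` and `b_opt` are mine) computes $\kappa(\delta)$ and the Case 1 optimal bandwidth; `curv` stands for the curvature integral $\int_{\varepsilon}^{1-\varepsilon}(g''(t))^2\,dt$, which must be supplied by the user.

```python
import math

def kappa(delta):
    """kappa(delta) as in (13), with kappa(0) = pi as the limiting value."""
    if delta == 0.0:
        return math.pi
    return (2.0 ** (2 * delta) * math.gamma(1 - 2 * delta) * math.sin(math.pi * delta)
            / (delta * (2 * delta + 1)))

def b_opt(n, k, delta, C, curv, eps=0.1):
    """Case 1 optimal bandwidth (16); curv = int_eps^{1-eps} (g''(t))^2 dt."""
    const = 9 * (1 - 2 * eps) * (1 - 2 * delta) * kappa(delta) * C / curv
    return (const ** (1.0 / (5 - 2 * delta))
            * n ** ((2 * delta - 1.0) / (5 - 2 * delta))
            * k ** (-2.0 / (5 - 2 * delta)))

# For delta = 0.4 the rates are n^(-1/21) and k^(-10/21), so b_opt shrinks
# very slowly in n under strong long-range dependence.
```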


In contrast, antipersistence ($\delta < 0$) implies a faster rate of convergence, whereas under short memory ($\delta = 0$) the rate of convergence is the same as under independence. Analogous results for single time series are given in Chiu (1989), Altman (1990), Herrmann et al. (1992), Hall and Hart (1990), Csörgő and Mielniczuk (1995) and Beran and Ocker (1999). The reason for the difference between the three cases is essentially related to the fact that the rate at which the variance of the sample mean of a stationary process $X_i$ converges to zero is determined by the limiting behaviour of $S_n = \sum_{u=-n}^{n} \mathrm{cov}(X_i(j), X_i(j+u))$ (as $n$ tends to infinity).

3.5. Asymptotic IMSE and asymptotically optimal bandwidth — Case 2

For fixed $n$ and $k$, the IMSE given by formulas (10)–(12) is a conditional IMSE (i.e. given $\delta_1, \ldots, \delta_k$). Since the fractional differencing parameters $\delta_1, \ldots, \delta_k$ are assumed to be random variables, the unconditional IMSE, i.e. the expected value of the conditional IMSE, is considered for obtaining the limiting results. Essentially, taking the expected value in (10), (11) or (12), the index $i$ disappears, so that arguments similar to Theorem 2 can be used to obtain the following results.

Theorem 3. Define $\kappa(\delta)$ as in Theorem 2. Then, as $n$ and $nb$ tend to infinity:

(i) The unconditional IMSE tends to
$$\mathrm{IMSE}_{\lim} = \frac{b^4}{36} \int_{\varepsilon}^{1-\varepsilon} (g''(t))^2\, dt + (1 - 2\varepsilon)(nb)^{2\delta - 1} L\!\left(\frac{1}{nb}\right) \kappa(\delta)\, \frac{1}{k} + R_{n,b,k}, \qquad (17)$$
where $R_{n,b,k} = o(\max(b^4, (nb)^{2\delta - 1}))$.

(ii) The asymptotically optimal bandwidth is the solution of the equation
$$b_{\mathrm{opt}} = \left( \frac{9(1 - 2\varepsilon)(1 - 2\delta)\, L(1/(nb))\, \kappa(\delta)}{\int_{\varepsilon}^{1-\varepsilon} (g''(t))^2\, dt} \right)^{1/(5 - 2\delta)} n^{(2\delta - 1)/(5 - 2\delta)}\, k^{-1/(5 - 2\delta)}. \qquad (18)$$

Remarks
• If $L(0)$ is equal to a finite nonzero constant, then Theorem 3 shows that the optimal IMSE decreases with $k$ at the rate $k^{-4/(5 - 2\delta)}$.
• The assumption on the individual error processes $X_i$ was that the spectral density $f_i$ behaves at zero like a constant $C_i$ times $|\lambda|^{-2\delta_i}$. In the theorem above, however, the expected spectral density $f = E(f_i)$ is assumed to be equal to $L(\lambda)|\lambda|^{-2\delta}$, where $L(\cdot)$ is a not necessarily constant slowly varying function that may even diverge to infinity or converge to zero at the origin. This generality is needed because a nonconstant slowly varying function can be obtained for $f$ even if the individual $f_i$'s behave like a constant times $|\lambda|^{-2\delta_i}$. This can be illustrated by the following example: let $F$ be the uniform distribution on $[-\frac{1}{2} + \eta, \frac{1}{2} - \eta]$ for some $0 < \eta < 1/2$. Also, let $C_i = C^*$ with $C^*$ a fixed positive number. Then
$$f(\lambda) \sim C^* E_F(|\lambda|^{-2\delta}) = C^* E_F(e^{-2\delta \log|\lambda|}) \sim L(\lambda)\, |\lambda|^{2\eta - 1}$$
with $L(\lambda)$ proportional to $1/\log|\lambda|$.


• The behavior of the expected spectral density at the origin is directly related to the moment generating function $m$ of $F$ by $E_F(|\lambda|^{-2\delta}) = m(-2\log|\lambda|)$.
• If $L$ does not converge to a nonzero constant at zero, then in general there is no explicit formula for the asymptotically optimal bandwidth; (18) can, however, be solved numerically, as in the sketch below. This is in contrast to the usual situations in nonparametric regression.
• Theorem 2 shows that the optimal IMSE in Case 1 decreases with $k$ at the rate $k^{-8/(5 - 2\delta)}$ (obtained by substituting (16) in (15)). This seems faster than the rate given in Theorem 3. However, depending on the values of the $\delta_i$, $C$ has an influence on the IMSE (see (15)). For example, in the extreme case when $\delta_1 = \delta_2 = \cdots = \delta_k$, $C$ is proportional to $k$, so that the overall rate reduces to $k^{-4/(5 - 2\delta)}$, as in Theorem 3.

3.6. Asymptotic IMSE and asymptotically optimal bandwidth — Cases 3 and 4

As in Case 2, here also the index $i$ disappears, either under the assumption that $\lim_{k\to\infty} (1/k) \sum_{i=1}^{k} f_i(\lambda) = f(\lambda)$ uniformly in $\lambda$, $-\frac{1}{2} < \lambda < \frac{1}{2}$ (Case 3), or after taking the expected value of the second term on the right-hand side of (12) (Case 4). Thus, the asymptotic expression for the IMSE (unconditional for Case 4) and the optimal bandwidth are given by (17) and (18), respectively.
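Since (18) contains $L(1/(nb))$ on the right-hand side, the optimal bandwidth is defined only implicitly. A simple fixed-point iteration is one way to solve it; in the sketch below (my illustration, not a prescription from the paper), $L(u) = 1/\log(1/u)$ is borrowed from the uniform-$F$ example above purely as an assumption.

```python
import math

def solve_b_opt(n, k, delta, curv, L=lambda u: 1.0 / math.log(1.0 / u),
                eps=0.1, b0=0.1, n_iter=50):
    """Fixed-point iteration b <- RHS(b) for the implicit equation (18);
    kappa(delta) as defined in (13), curv = int (g''(t))^2 dt."""
    if delta == 0.0:
        kap = math.pi
    else:
        kap = (2.0 ** (2 * delta) * math.gamma(1 - 2 * delta)
               * math.sin(math.pi * delta) / (delta * (2 * delta + 1)))
    b = b0
    for _ in range(n_iter):
        const = 9 * (1 - 2 * eps) * (1 - 2 * delta) * L(1.0 / (n * b)) * kap / curv
        b = (const ** (1.0 / (5 - 2 * delta))
             * n ** ((2 * delta - 1.0) / (5 - 2 * delta))
             * k ** (-1.0 / (5 - 2 * delta)))
    return b

# e.g. solve_b_opt(1000, 10, 0.25, curv=50.0) converges in a few iterations.
```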

4. Simulations

In this section we illustrate some of the asymptotic results via simulations. For Cases 2–4, the optimal bandwidth would have to be obtained by solving an implicit equation involving the slowly varying function $L$, which is to be evaluated at $1/nb$. For the purpose of illustration, we consider only Case 1.

4.1. Simulation 1

ARIMA$(0, \delta, 0)$ processes $X_i$ ($i = 1, \ldots, k$) are simulated for $k = 1, 5, 10$ and $20$, and $n = 50, 200, 400, 1000$. For each combination of $k$ and $n$, the fractional differencing parameters $\delta_i$, $i = 1, 2, \ldots, k$, are selected from the interval $[-0.4, 0.4]$. Depending on the value of $k$, the chosen values are approximately equally spaced on this interval. The $k, \delta$ combinations are as follows:
• $k = 1$: $\delta = 0.4$;
• $k = 5$: $\delta = \pm 0.4, \pm 0.2, 0$;
• $k = 10$: $\delta = \pm 0.4, \pm 0.3, \pm 0.2, \pm 0.1, 0, 0$;
• $k = 20$: $\delta = \pm 0.4, \pm 0.35, \pm 0.3, \pm 0.25, \pm 0.2, \pm 0.15, \pm 0.1, \pm 0.05, \pm 0.025, 0, 0$.
The observed process is defined by
$$Y_i(j) = g(t_j) + X_i(j), \qquad j = 1, 2, \ldots, n, \quad t_j = j/n,$$
where $g(t) = 10e^t - 15t^2$.
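A self-contained sketch of Simulation 1 along these lines (my reconstruction; the paper does not publish code, and the Monte Carlo settings such as 200 replications, $\varepsilon = 0.1$ and the bandwidth value are assumptions) could look as follows.

```python
import numpy as np

def farima(n, d, rng, m=2000):
    """Truncated MA(inf) approximation to a FARIMA(0, d, 0) path."""
    psi = np.ones(m)
    for j in range(1, m):
        psi[j] = psi[j - 1] * (j - 1 + d) / j  # psi_j = Gamma(j+d)/(Gamma(j+1)Gamma(d))
    e = rng.standard_normal(n + m)
    return np.convolve(e, psi)[m:m + n]

def smooth(ybar, b, t_eval):
    """Rectangular-kernel estimate (6)-(7) of g at the points t_eval."""
    n = len(ybar)
    tj = np.arange(1, n + 1) / n
    w = 0.5 * (np.abs(tj[None, :] - t_eval[:, None]) <= b)
    return w @ ybar / (n * b)

rng = np.random.default_rng(0)
k, n, b, eps = 5, 200, 0.135, 0.1           # b roughly at the tabulated b_opt scale
deltas = [-0.4, -0.2, 0.0, 0.2, 0.4]        # Simulation 1, k = 5
t = np.arange(1, n + 1) / n
g = 10 * np.exp(t) - 15 * t**2
keep = (t >= eps) & (t <= 1 - eps)
imse = 0.0
for _ in range(200):                        # 200 Monte Carlo replications (assumed)
    ybar = np.mean([g + farima(n, d, rng) for d in deltas], axis=0)
    imse += np.mean((smooth(ybar, b, t[keep]) - g[keep]) ** 2) * (1 - 2 * eps)
print("simulated IMSE:", imse / 200)
```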


Table 1
Simulation 1: the asymptotically optimal bandwidth ($b_{\mathrm{opt}}$) and the ratio $\mathrm{IMSE}_{\mathrm{sim}}/\mathrm{IMSE}_{\mathrm{opt}}$, where $\mathrm{IMSE}_{\mathrm{sim}}$ denotes the simulated IMSE and $\mathrm{IMSE}_{\mathrm{opt}}$ the asymptotically optimal IMSE, when the long-memory parameters $\delta_i$, $i = 1, 2, \ldots, k$, are chosen to be equally spaced on the interval $-0.4 \le \delta_i \le 0.4$. Each cell shows $b_{\mathrm{opt}}$, $\mathrm{IMSE}_{\mathrm{sim}}/\mathrm{IMSE}_{\mathrm{opt}}$.

n       k=1             k=5             k=10            k=20
50      0.311, 0.747    0.145, 1.442    0.104, 2.200    0.075, 4.468
100     0.301, 0.947    0.140, 1.281    0.100, 1.757    0.072, 2.810
400     0.282, 0.974    0.131, 1.008    0.094, 1.399    0.068, 2.577
1000    0.270, 1.010    0.125, 0.944    0.090, 1.286    0.065, 2.037

Table 2
Simulation 2: the asymptotically optimal bandwidth ($b_{\mathrm{opt}}$) and the ratio $\mathrm{IMSE}_{\mathrm{sim}}/\mathrm{IMSE}_{\mathrm{opt}}$, where $\mathrm{IMSE}_{\mathrm{sim}}$ denotes the simulated IMSE and $\mathrm{IMSE}_{\mathrm{opt}}$ the asymptotically optimal IMSE, when the long-memory parameters $\delta_i$, $i = 1, 2, \ldots, k$, are all chosen equal to $0.4$. Each cell shows $b_{\mathrm{opt}}$, $\mathrm{IMSE}_{\mathrm{sim}}/\mathrm{IMSE}_{\mathrm{opt}}$.

n       k=1             k=5             k=10            k=20
50      0.311, 0.971    0.212, 0.916    0.180, 0.994    0.152, 1.025
100     0.301, 1.122    0.205, 1.033    0.174, 1.092    0.148, 0.944
400     0.282, 1.032    0.192, 0.921    0.163, 1.052    0.138, 1.086
1000    0.270, 1.053    0.184, 1.086    0.156, 1.033    0.132, 1.009

The purpose of the simulations is to show how the asymptotic first-order approximation to the IMSE and the optimal bandwidth depend on $k$ and $n$. For each combination of $k$ and $n$, the asymptotic expression for the IMSE and the optimal bandwidth values are tabulated (Table 1). Overall, the results show that as $n$ increases, the asymptotic formula for the IMSE in Theorem 2 approaches the actual value of the IMSE. The approximation is worse if the ratio $k/n$ is large. This is because in these simulations all values of $\delta_i$ are different, so that the asymptotic formula takes into account only the contribution of one of the $k$ series. This is a rather crude approximation when $k$ is relatively large compared to $n$.

4.2. Simulation 2

Here also the process $Y_i(j) = g(t_j) + X_i(j)$, $j = 1, 2, \ldots, n$, $t_j = j/n$, is simulated with $g(t) = 10e^t - 15t^2$, but with $\delta_i$ equal to $0.4$ for all $i = 1, 2, \ldots, k$. Thus, in this case the optimal IMSE decreases with $k$ at the rate $k^{-4/(5-2\delta)}$ (for $\delta = 0.4$ this is $k^{-4/4.2} \approx k^{-0.95}$, so doubling $k$ roughly halves the optimal IMSE). On the other hand, all terms in the sum over $i$ contribute to the asymptotic expression in Theorem 2. The asymptotic approximation to the IMSE is therefore accurate also for small sample sizes (Table 2).


5. Final remarks

The estimator $\hat{g}$ considered here was based on the averaged process $\bar{Y}(j)$. As an alternative, the average trend $g$ could be estimated by $\tilde{g}(t) = k^{-1} \sum_{i=1}^{k} \hat{h}_i(t)$, where $\hat{h}_i(t) = \hat{h}_i(t; b_i)$ is a kernel estimate of $h_i(t) = g(t) + g_i(t)$ with asymptotically optimal bandwidth $b_i$ (optimal for series number $i$). Depending on the functions $h_i$ and the correlation structure of the individual series, the asymptotic IMSE of $\tilde{g}$ can be smaller than, larger than or equal to the IMSE of $\hat{g}$. Consider, for instance, the case with $g_i \equiv 0$ for all $i$, $\delta_1 = \cdots = \delta_k = \delta$ and $C_1 = \cdots = C_k$. Let $B$ and $V$ be the asymptotic expressions for the integrated bias and variance of $\hat{g}$, respectively, when using the optimal bandwidth, as given in Theorem 2. Then we have $\mathrm{IMSE}(\hat{g}) \approx B^2 + V$. By arguments analogous to those of Theorem 2, one can show that $\mathrm{IMSE}(\tilde{g}) \approx B^2 k^{4/(5-2\delta)} + V k^{(2\delta-1)/(5-2\delta)}$. Since $4/(5-2\delta) > 0$ but $(2\delta-1)/(5-2\delta) < 0$, this means that for $k > 1$, $\tilde{g}$ has a larger bias but a smaller variance than $\hat{g}$. Thus, which IMSE is larger depends on the ratio $B^2/V$ and the values of $k$ and $\delta$.

Finally, note that assumption (2) on the spectral density functions $f_i$, $i = 1, \ldots, k$, also implies that when the parameters $\delta_1, \ldots, \delta_k$ are randomly distributed with the common distribution function $F$ specified in Cases 2 and 4, the mean spectral density $f$ may be estimated nonparametrically (at zero) via the moment generating function of $\delta_1, \ldots, \delta_k$. This approach will be considered elsewhere. Another problem that will be addressed in a forthcoming paper is data-driven estimation of the optimal bandwidth when the spectral densities $f_i$ and the distribution $F$ are unknown.

Acknowledgements

I would like to thank the referee for constructive suggestions that helped to improve the quality of the paper.

Appendix

Proof of Theorem 1. (i) The proof for the bias follows directly by substituting (5) in (6). (ii) The proof for the variance follows from noting that
$$\mathrm{cov}(\bar{X}(j), \bar{X}(l)) = \frac{1}{k^2} \sum_{i=1}^{k} \gamma_i(j - l).$$
Thus,
$$\mathrm{var}(\hat{g}(t)) = \frac{1}{(2nb)^2} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{j,l=n(t-b)}^{n(t+b)} \gamma_i(j - l) = \frac{1}{(2nb)^2} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{j',l'=1}^{2nb+1} \gamma_i(j' - l').$$


The latter is obtained by substituting $j' = j - n(t-b) + 1$ and $l' = l - n(t-b) + 1$. Since $\sum_{i,j=1}^{n} a(i-j) = \sum_{u=-(n-1)}^{n-1} (n - |u|)\, a(u)$ for any function $a(\cdot)$,
$$\mathrm{var}(\hat{g}(t)) = \frac{1}{(2nb)^2} \frac{1}{k^2} \sum_{i=1}^{k} \sum_{u=-2nb}^{2nb} (2nb + 1 - |u|)\, \gamma_i(u).$$
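The rearrangement used in the last step is elementary and can be checked numerically; the following short test (mine, not the paper's) confirms it for an arbitrary function $a$.

```python
n = 7
a = lambda u: 0.9 ** abs(u)                     # any function of the lag works
lhs = sum(a(i - j) for i in range(1, n + 1) for j in range(1, n + 1))
rhs = sum((n - abs(u)) * a(u) for u in range(-(n - 1), n))
assert abs(lhs - rhs) < 1e-9                    # identity holds exactly
print(lhs, rhs)
```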

Proof of Theorem 2. When $k$ is fixed, standard arguments can be used to show that the first term in (12) converges to
$$\frac{b^4}{36} \int_{\varepsilon}^{1-\varepsilon} [g''(t)]^2\, dt + o(b^4).$$
The second term equals
$$(1 - 2\varepsilon)\, \frac{1}{k^2} \sum_{i=1}^{k} [A_{n_i} + B_{n_i} - C_{n_i}],$$
where
$$A_{n_i} = \frac{1}{(2nb)^2} \sum_{u=-2nb}^{2nb} 2nb\, \gamma_i(u), \qquad B_{n_i} = \frac{1}{(2nb)^2} \sum_{u=-2nb}^{2nb} \gamma_i(u)$$
and
$$C_{n_i} = \frac{1}{(2nb)^2} \sum_{u=-2nb}^{2nb} |u|\, \gamma_i(u).$$

Three cases need to be distinguished.

Case 1: $1/2 > \delta_i > 0$. In this case, since $2\delta_i - 1 > -1$ and $nb \to \infty$,
$$\lim_{nb \to \infty} \sum_{u=-2nb}^{2nb} \gamma_i(u) = \infty,$$
so that
$$\sum_{u=-2nb}^{2nb} \gamma_i(u) \sim \gamma_i(0) + 2D_i \sum_{u=1}^{2nb} |u|^{2\delta_i - 1},$$
where
$$2D_i \sum_{u=1}^{2nb} |u|^{2\delta_i - 1} \sim (2nb)^{2\delta_i}\, 2D_i \int_0^1 x^{2\delta_i - 1}\, dx.$$
This implies
$$A_{n_i} = \frac{D_i}{\delta_i}\, (2nb)^{2\delta_i - 1} + o((nb)^{2\delta_i - 1}).$$
Clearly, $B_{n_i} = o(A_{n_i})$. Finally, since $\gamma_i(u) \sim D_i |u|^{2\delta_i - 1}$ as $u \to \infty$, the sum $\sum_{u=-2nb}^{2nb} |u|\, \gamma_i(u)$ diverges, so that
$$C_{n_i} \sim \frac{1}{(2nb)^2}\, 2D_i \sum_{u=1}^{2nb} u^{2\delta_i},$$


which leads to
$$C_{n_i} = \frac{2D_i}{2\delta_i + 1}\, (2nb)^{2\delta_i - 1} + R_{i,n},$$
where $R_{i,n} = o((nb)^{2\delta_i - 1})$.

Case 2: $-1/2 < \delta_i < 0$. To examine $A_{n_i}$ and $B_{n_i}$, note that $\sum_{u=-\infty}^{\infty} \gamma_i(u) = 0$, so that $\sum_{u=-2nb}^{2nb} \gamma_i(u) = -2 \sum_{u=2nb+1}^{\infty} \gamma_i(u)$. Moreover, in this case $\sum_{u=-\infty}^{\infty} |u|\, \gamma_i(u)$ diverges. Thus, using arguments as in Case 1, we obtain the same limits for $A_{n_i}$, $B_{n_i}$ and $C_{n_i}$.

Case 3: $\delta_i = 0$. In this case $\sum_{u=-\infty}^{\infty} \gamma_i(u) = 2\pi f_i(0)$, where $f_i(0) = C_i$. Thus, one has
$$A_{n_i} \sim \frac{2\pi C_i}{2nb}, \qquad B_{n_i} = o(A_{n_i}), \qquad C_{n_i} = o((nb)^{-1}).$$

The proofs of (14) and (15) are then immediate; (16) follows by differentiation.

Proof of Theorem 3. We start with Eq. (12) when $k$ and $n$ are fixed and $\delta_1, \ldots, \delta_k$ are fixed. For Case 2, this corresponds to the conditional IMSE given $\delta_1, \ldots, \delta_k$. Then the unconditional IMSE is given by
$$\mathrm{IMSE} = E_F(\mathrm{IMSE}_{\{\delta_1,\ldots,\delta_k\}}) = \int_{\varepsilon}^{1-\varepsilon} B_{n,k}^2(t)\, dt + (1 - 2\varepsilon)\, E_F(V_{n,k,\delta_1,\ldots,\delta_k}),$$
where
$$E_F(V_{n,k,\delta_1,\ldots,\delta_k}) = \frac{1}{(2nb)^2} \frac{1}{k} \sum_{u=-2nb}^{2nb} (2nb + 1 - |u|)\, \gamma(u).$$
Here $\gamma(u) = (1/2\pi) \int_{-\pi}^{\pi} f(\lambda) e^{iu\lambda}\, d\lambda$ and $f(\lambda) = (1/k) \sum_{i=1}^{k} E(f_i(\lambda))$. Since $f_i(\lambda)$ satisfies (2) and $E_F |\lambda|^{-2\delta_i} = E_F \exp(-2\delta_i \log|\lambda|) = m(-2\log|\lambda|)$, where $\lambda$ is fixed, under the conditions on $m(\cdot)$,
$$f(\lambda) \sim L(\lambda)\, |\lambda|^{-2\delta}, \qquad |\lambda| \to 0, \quad -\tfrac{1}{2} < \delta < \tfrac{1}{2},$$
where $L$ is slowly varying at zero in the sense of Zygmund. The result follows by the arguments of Theorem 2.

References

Altman, N.S., 1990. Kernel smoothing of data with correlated errors. J. Amer. Statist. Assoc. 85, 749–759.
Beran, J., Ocker, D., 1999. SEMIFAR forecasts, with applications to foreign exchange rates. J. Statist. Plann. Inference 80, 137–153.
Chiu, S.T., 1989. Bandwidth selection for kernel estimates with correlated noise. Statist. Probab. Lett. 8, 347–354.
Cook, E.R., Kairiukstis, L.A. (Eds.), 1990. Methods of Dendrochronology. Applications in the Environmental Sciences. Kluwer, London.


Csörgő, S., Mielniczuk, J., 1995. Nonparametric regression under long-range dependent normal errors. Ann. Statist. 23, 1000–1014.
Diggle, P.J., Wasel, I.A., 1997. Spectral analysis of replicated biomedical time series. Appl. Statist. 46, 31–60.
Feller, W., 1971. An Introduction to Probability Theory and its Applications, 2nd Edition. Wiley, New York.
Gasser, T., Müller, H.G., 1979. Kernel estimation of regression functions. In: Gasser, T., Rosenblatt, M. (Eds.), Smoothing Techniques for Curve Estimation. Springer, New York, pp. 23–68.
Granger, C.W.J., Joyeux, R., 1980. An introduction to long-memory time series models and fractional differencing. J. Time Ser. Anal. 1, 15–29.
Hall, P., Hart, J., 1990. Nonparametric regression with long-range dependence. Stochastic Process. Appl. 36, 339–351.
Herrmann, E., Gasser, T., Kneip, A., 1992. Choice of bandwidth for kernel regression when residuals are correlated. Biometrika 79, 783–795.
Hosking, J.R.M., 1980. Fractional differencing. Biometrika 68, 165–176.
Priestley, M.B., 1989. Spectral Analysis and Time Series. Wiley, New York.