Statistics & Probability Letters 58 (2002) 265–282
Estimation of frequencies in presence of heavy tail errors

Swagata Nandi, Srikanth K. Iyer, Debasis Kundu*
Department of Mathematics, Indian Institute of Technology Kanpur, P.O. Box IIT, Kanpur 208016, India

Received November 1998
Abstract

In this paper, we consider the problem of estimating the sinusoidal frequencies in the presence of additive white noise. The additive white noise has mean zero, but it may not have finite variance. We propose to use the least-squares estimators or the approximate least-squares estimators to estimate the unknown parameters. It is observed that the least-squares estimators and the approximate least-squares estimators are asymptotically equivalent, and both provide consistent estimators of the unknown parameters. We obtain the asymptotic distribution of the least-squares estimators under the assumption that the errors come from a symmetric stable distribution. We propose different methods of constructing confidence intervals and compare their performances through Monte Carlo simulations. We also discuss the properties of the estimators when the errors are correlated, and finally we discuss some open problems.

Keywords: Sinusoidal signals; Consistent estimators; Stable distributions; Confidence intervals
1. Introduction

One of the most important problems in time series analysis is the estimation of frequencies in the presence of additive noise. This problem occurs in several disciplines in a variety of ways. Suppose we observe a sequence of observations from the following time series model:

y(t) = Σ_{j=1}^{p} [Aj cos(ωj t) + Bj sin(ωj t)] + e(t).   (1.1)

Here the y(t)'s are the values observed at the equidistant time points t = 1, …, n; the ωj's are unknown frequencies lying in (0, π); and the Aj's and Bj's are unknown real-valued amplitudes. The additive errors {e(t)} are independent and identically distributed (i.i.d.) random variables with mean zero, but they may not have finite variance.
* Corresponding author. Tel.: +91-512-597-636; fax: +91-512-590-007. E-mail address: [email protected] (D. Kundu).
The problem is to estimate the unknown parameters, namely the Aj's, Bj's and ωj's, and to study the properties of the estimators, assuming p is known. The estimation of the parameters of the model (1.1) is a fundamental problem in signal processing (Kay, 1988) and time series analysis (Brillinger, 1987). A vast literature exists on estimation procedures, as well as on the theoretical behavior of the different estimators, when the error random variables have finite variance or when the errors come from a stationary sequence. The asymptotic theory of the least-squares estimators (LSEs) for this model has a long history. Whittle (1953) obtained some of the earliest results; more recent results are by Hannan (1971), Walker (1971), Rice and Rosenblatt (1988), Kundu (1993, 1997) and Kundu and Mitra (1996).

The main aim of this paper is to consider the case when the error random variables have heavier tails. A heavy tail distribution is one whose extreme probabilities approach zero relatively slowly, and the non-existence of a finite variance is an important criterion for heavy-tailedness, as noted by Mandelbrot (1963). In fact, Mandelbrot (1963) defined distributions as heavy tailed if and only if the variance is infinite, and we use the same definition here. It can be shown that, under the assumption E|e(t)|^{1+δ} < ∞ for some δ > 0 on the error random variables, the LSEs and the approximate least-squares estimators (ALSEs) both provide consistent estimators of the unknown parameters. Furthermore, if we assume that the e(t)'s come from a symmetric stable distribution, then the asymptotic distribution of the LSEs or ALSEs is multivariate stable. Using this asymptotic distribution, it is possible to construct asymptotic confidence intervals for the unknown parameters.

The rest of the paper is organized as follows. We introduce the LSEs and the ALSEs in Section 2 and prove their consistency there. The asymptotic distribution of the LSEs and the ALSEs is provided in Section 3. The construction of confidence intervals is discussed in Section 4. Some numerical results are presented in Section 5. Finally, we conclude the paper and discuss some open problems in Section 6.

2. Least-squares estimators and approximate least-squares estimators

In this section, we study the properties of the most intuitive estimators, namely the LSEs, and the most used estimators, namely the ALSEs. For brevity we assume p = 1, although the results can be established for any integer p along the same lines. We mainly consider the following model:

y(t) = A cos(ωt) + B sin(ωt) + e(t).   (2.1)
For the model (2.1), the LSE of θ = (A, B, ω), say θ̂ = (Â, B̂, ω̂), can be obtained by minimizing

Q(θ) = Σ_{t=1}^{n} (y(t) − A cos(ωt) − B sin(ωt))²   (2.2)

with respect to θ. We write θ0 = (A0, B0, ω0) for the true value of θ, and throughout we assume that θ0 is an interior point of the parameter space. The ALSE of ω for the model (2.1) can be obtained by maximizing the periodogram function

I(ω) = (2/n) |Σ_{t=1}^{n} y(t) e^{iωt}|²   (2.3)
with respect to ω. If ω̃ maximizes I(ω), then ω̃ is called the ALSE of ω. Following the approach of Walker (1971) or Hannan (1971), we define the ALSEs of A and B, say Ã and B̃, respectively, as

Ã = (2/n) Σ_{t=1}^{n} y(t) cos(ω̃t),   B̃ = (2/n) Σ_{t=1}^{n} y(t) sin(ω̃t).   (2.4)
For the motivation of using θ̃ = (Ã, B̃, ω̃) as an estimator of θ, see Hannan (1971) or Walker (1971). Although in the case of i.i.d. errors the LSEs are the most intuitive estimators, for the model (1.1) the most popular estimators of the unknown parameters are the ALSEs. Note that the LSEs can be defined for the model (1.1) in the obvious way, and the ALSEs of the ωj's can be defined as the local maxima of I(ω) instead of the global maximum. Once ωj is estimated, the corresponding ALSEs of the Aj's and Bj's can be obtained using (2.4). The following two theorems give strong consistency of both the LSE and the ALSE; the proofs are deferred to Appendix A at the end of the paper.

Theorem 1. If θ̂ is the LSE of θ for the model (2.1) and the e(t)'s are i.i.d. random variables with mean zero and E|e(t)|^{1+δ} < ∞ for some 0 < δ < 1, then θ̂ is a strongly consistent estimator of θ.

Theorem 2. If θ̃ is the ALSE of θ for the model (2.1) and the e(t)'s are as in Theorem 1, then θ̃ is a strongly consistent estimator of θ.
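The paper computes both estimators with standard optimization routines (Press et al., 1992, in Section 5). As an illustration only, here is a minimal Python sketch of the two estimators for model (2.1); the coarse-grid search followed by local refinement is our own assumption (the criteria have many local optima in ω), and the amplitudes are profiled out by linear least squares.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def lse(y):
    """LSE of (A, B, omega) in model (2.1): minimize Q of (2.2),
    profiling (A, B) out by linear least squares for each omega."""
    n, t = len(y), np.arange(1, len(y) + 1)

    def q(w):
        X = np.column_stack([np.cos(w * t), np.sin(w * t)])
        coef = np.linalg.lstsq(X, y, rcond=None)[0]
        return np.sum((y - X @ coef) ** 2)

    grid = np.linspace(0.01, np.pi - 0.01, 2000)   # coarse search in (0, pi)
    w0 = grid[np.argmin([q(w) for w in grid])]
    w = minimize_scalar(q, bounds=(w0 - 5e-3, w0 + 5e-3), method="bounded").x
    X = np.column_stack([np.cos(w * t), np.sin(w * t)])
    A, B = np.linalg.lstsq(X, y, rcond=None)[0]
    return A, B, w

def alse(y):
    """ALSE: maximize the periodogram I of (2.3), then use (2.4)."""
    n, t = len(y), np.arange(1, len(y) + 1)
    I = lambda w: (2.0 / n) * abs(np.sum(y * np.exp(1j * w * t))) ** 2
    grid = np.linspace(0.01, np.pi - 0.01, 2000)
    w0 = grid[np.argmax([I(w) for w in grid])]
    w = minimize_scalar(lambda u: -I(u), bounds=(w0 - 5e-3, w0 + 5e-3),
                        method="bounded").x
    A = (2.0 / n) * np.sum(y * np.cos(w * t))
    B = (2.0 / n) * np.sum(y * np.sin(w * t))
    return A, B, w
```

For the LSE, minimizing Q over (A, B) for fixed ω is an exact linear least-squares step, so searching over ω alone loses nothing.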
3. Asymptotic distributions of the LSEs and ALSEs

In this section, we obtain the asymptotic distributions of the LSEs and ALSEs under the assumption that the errors come from a symmetric stable distribution. Before progressing any further, we first define a symmetric α-stable (SαS) distribution as follows:

Definition 1. A symmetric (around 0) random variable X is said to have a SαS distribution, with scale parameter σ and stability index α, if the characteristic function of X is

E e^{itX} = e^{−σ^α |t|^α}.
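The paper does not say how SαS samples are generated for its simulations; the following sketch, an assumption of ours, uses the standard Chambers–Mallows–Stuck representation, which for the symmetric case produces exactly the law of Definition 1 (scipy.stats.levy_stable.rvs(α, 0, scale=σ) should be an equivalent library route).

```python
import numpy as np

def rvs_sas(alpha, sigma, size, seed=None):
    """Symmetric alpha-stable variates (1 < alpha < 2) with scale sigma,
    via the Chambers-Mallows-Stuck representation:
    E exp(itX) = exp(-sigma**alpha * |t|**alpha)."""
    rng = np.random.default_rng(seed)
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)   # V ~ U(-pi/2, pi/2)
    W = rng.exponential(1.0, size)                 # W ~ Exp(1)
    X = (np.sin(alpha * V) / np.cos(V) ** (1.0 / alpha)
         * (np.cos((1.0 - alpha) * V) / W) ** ((1.0 - alpha) / alpha))
    return sigma * X
```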
For detailed treatments of the different SαS distributions the reader is referred to the book of Samorodnitsky and Taqqu (1994). From now on we always take 1 + δ < α < 2. Considering (2.2), we use the following notation:

Q′(θ) = (∂Q(θ)/∂A, ∂Q(θ)/∂B, ∂Q(θ)/∂ω),

and Q″(θ) is the 3 × 3 matrix of second derivatives of Q(θ). Now, expanding Q′(θ̂) around θ0 by a multivariate Taylor series, we obtain

Q′(θ̂) − Q′(θ0) = (θ̂ − θ0) Q″(θ̄),   (3.1)
where θ̄ is a point on the line joining θ̂ and θ0. Suppose D1 and D2 are the two 3 × 3 diagonal matrices

D1 = diag{n^{−1/α}, n^{−1/α}, n^{−(1+α)/α}}  and  D2 = diag{n^{−(α−1)/α}, n^{−(α−1)/α}, n^{−(2α−1)/α}}.

Since Q′(θ̂) = 0, (3.1) can be written as

(θ̂ − θ0) D2^{−1} = −[Q′(θ0) D1][D2 Q″(θ̄) D1]^{−1},

as D2 Q″(θ̄) D1 is invertible almost surely for large n. It can be easily seen that

lim_{n→∞} [D2 Q″(θ̄) D1] = lim_{n→∞} [D2 Q″(θ0) D1] = [ 1, 0, B0/2 ; 0, 1, −A0/2 ; B0/2, −A0/2, (A0² + B0²)/3 ].   (3.2)

Therefore,

lim_{n→∞} [D2 Q″(θ̄) D1]^{−1} = lim_{n→∞} [D2 Q″(θ0) D1]^{−1}
  = (1/(A0² + B0²)) [ A0² + 4B0², −3A0B0, −6B0 ; −3A0B0, 4A0² + B0², 6A0 ; −6B0, 6A0, 12 ].   (3.3)
Now we first show that [Q′(θ0) D1] converges to a three-dimensional multivariate stable distribution. Let us consider

Q′(θ0) D1 = ( −(2/n^{1/α}) Σ_{t=1}^{n} e(t) cos(ω0 t),  −(2/n^{1/α}) Σ_{t=1}^{n} e(t) sin(ω0 t),
              (2/n^{(1+α)/α}) Σ_{t=1}^{n} t e(t)[A0 sin(ω0 t) − B0 cos(ω0 t)] )
           = (Xn, Yn, Zn)  (say).   (3.4)

Therefore, if t = (t1, t2, t3), then the joint characteristic function of (Xn, Yn, Zn) is

φn(t) = E e^{i(t1 Xn + t2 Yn + t3 Zn)} = e^{−2^α σ^α (1/n) Σ_{j=1}^{n} |K_t(j)|^α},   (3.5)

where

K_t(j) = −t1 cos(ω0 j) − t2 sin(ω0 j) + (j t3 / n)[A0 sin(ω0 j) − B0 cos(ω0 j)].

Although we could not prove it theoretically, extensive numerical computations indicate that (1/n) Σ_{j=1}^{n} |K_t(j)|^α converges as n tends to ∞. Assuming that it converges, it can be proved (see Appendix B) that it converges to a non-zero limit for t ≠ 0. Suppose

lim_{n→∞} (1/n) Σ_{j=1}^{n} |K_t(j)|^α = c_t(A0, B0, ω0, α).   (3.6)
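Since the convergence in (3.6) is supported numerically rather than proved, the check is easy to repeat; in the sketch below the value of t and the parameter values are arbitrary illustrations of ours.

```python
import numpy as np

def kbar(t, theta0, alpha, n):
    """(1/n) * sum_{j=1}^{n} |K_t(j)|**alpha for K_t as in (3.5)."""
    t1, t2, t3 = t
    A0, B0, w0 = theta0
    j = np.arange(1, n + 1)
    K = (-t1 * np.cos(w0 * j) - t2 * np.sin(w0 * j)
         + (j * t3 / n) * (A0 * np.sin(w0 * j) - B0 * np.cos(w0 * j)))
    return np.mean(np.abs(K) ** alpha)

# the average should stabilize as n grows (empirical version of (3.6))
for n in (100, 1_000, 10_000, 100_000):
    print(n, kbar((0.3, -0.2, 0.7), (1.5, 1.5, 2.5), 1.8, n))
```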
Then, from (3.5), it is clear that any linear combination of Xn, Yn and Zn has a SαS distribution. Also,

lim_{n→∞} φn(t) = e^{−2^α σ^α c_t(A0, B0, ω0, α)}   (3.7)

indicates that any linear combination of Xn, Yn and Zn, even as n → ∞, also has a SαS distribution. Now, using the result (Theorem 2.1.5) of Samorodnitsky and Taqqu (1994) that a random vector is symmetric stable in R³ if and only if every linear combination of its components is symmetric stable in R¹, it immediately follows that

[Q′(θ0) D1][D2 Q″(θ̄) D1]^{−1}   (3.8)

converges to a symmetric stable random vector in R³, which has the characteristic function

ψ(t) = e^{−2^α σ^α c_u(A0, B0, ω0, α)},   (3.9)

where c_u is defined through (3.6) with t replaced by u. Here u = (u1(t1, t2, t3, A0, B0), u2(t1, t2, t3, A0, B0), u3(t1, t2, t3, A0, B0)) and

u1(t1, t2, t3, A0, B0) = [(A0² + 4B0²) t1 − 3A0B0 t2 − 6B0 t3] / (A0² + B0²),
u2(t1, t2, t3, A0, B0) = [−3A0B0 t1 + (4A0² + B0²) t2 + 6A0 t3] / (A0² + B0²),
u3(t1, t2, t3, A0, B0) = [−6B0 t1 + 6A0 t2 + 12 t3] / (A0² + B0²).
Therefore, we can state the following theorem.

Theorem 3. For the model (2.1), if the e(t)'s are i.i.d. random variables with mean zero having the SαS distribution of Definition 1, then (θ̂ − θ0) D2^{−1} converges to a multivariate stable distribution, whose characteristic function is given by (3.9).

To see that the asymptotic distributions of the LSEs and the ALSEs are the same, observe the following facts. By simple calculations (similar to Hannan, 1971, or Walker, 1971) it can be shown that

Â(ω) = Ã(ω) + Op(n^{−1}),   B̂(ω) = B̃(ω) + Op(n^{−1}),   ω̂ = ω̃ + Op(n^{−2}).   (3.10)

Here Op(m) denotes a term that goes to zero in probability and remains bounded in probability after division by m. From (3.10) it follows immediately that the asymptotic distributions of (θ̂ − θ0) D2^{−1} and (θ̃ − θ0) D2^{−1} are the same. This result can be extended to the general model (1.1). Observe that for the general model, (Âi, B̂i, ω̂i) and (Âj, B̂j, ω̂j) are asymptotically independent for i ≠ j; therefore the joint characteristic function can be easily obtained.
4. Individual confidence intervals

In the previous section we saw that, although the characteristic function of the joint limiting distribution of θ̂ = (Â, B̂, ω̂) is known, it is not easy to obtain the joint distribution explicitly from it. In this section we obtain the marginal distributions of Â, B̂ and ω̂ by inverting the corresponding characteristic functions. The marginal characteristic functions of n^{(α−1)/α}(Â − A0), n^{(α−1)/α}(B̂ − B0) and n^{(2α−1)/α}(ω̂ − ω0) are

φ_A(u) = e^{−|u|^α 2^α σ^α (1/n) Σ_{t=1}^{n} |K_a(t)|^α},
φ_B(u) = e^{−|u|^α 2^α σ^α (1/n) Σ_{t=1}^{n} |K_b(t)|^α},
φ_ω(u) = e^{−|u|^α 2^α σ^α (1/n) Σ_{t=1}^{n} |K_w(t)|^α},

respectively, where

a = ( (A0² + 4B0²)/(A0² + B0²), −3A0B0/(A0² + B0²), −6B0/(A0² + B0²) ),
b = ( −3A0B0/(A0² + B0²), (4A0² + B0²)/(A0² + B0²), 6A0/(A0² + B0²) ),
w = ( −6B0/(A0² + B0²), 6A0/(A0² + B0²), 12/(A0² + B0²) ).
Therefore, the characteristic functions of the limiting distributions of n^{(α−1)/α}(Â − A0), n^{(α−1)/α}(B̂ − B0) and n^{(2α−1)/α}(ω̂ − ω0) are

lim_{n→∞} φ_A(u) = e^{−|u|^α 2^α σ^α c_a(A0, B0, ω0, α)},
lim_{n→∞} φ_B(u) = e^{−|u|^α 2^α σ^α c_b(A0, B0, ω0, α)},
lim_{n→∞} φ_ω(u) = e^{−|u|^α 2^α σ^α c_w(A0, B0, ω0, α)},

respectively, where c_a(A0, B0, ω0, α), c_b(A0, B0, ω0, α) and c_w(A0, B0, ω0, α) are defined as in (3.6), replacing t by a, b and w, respectively. Now, to construct an asymptotic 100(1 − γ)% confidence interval for an individual parameter (say ω0), we use the inversion formula for characteristic functions (see Chung, 1974). Find x_w such that

1 − γ = (2/π) ∫_0^∞ (sin(u x_w)/u) e^{−|u|^α 2^α σ^α c_w(A0, B0, ω0, α)} du = Π_w(−x_w, x_w)  (say),

where Π_w(−x, x) is the probability measure corresponding to the characteristic function e^{−|u|^α 2^α σ^α c_w(A0, B0, ω0, α)}. We do not have an explicit expression for c_w(A0, B0, ω0, α), but we can estimate c_w(Â, B̂, ω̂, α) numerically using (3.6), say ĉ_w(Â, B̂, ω̂, α), and then estimate x_w by minimizing |Π̂_w(−x, x) − (1 − γ)|, where Π̂_w(−x, x) is the probability measure corresponding to the characteristic function e^{−|u|^α 2^α σ^α ĉ_w(Â, B̂, ω̂, α)}.
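A sketch of this construction for ω follows; the 500-term truncation of the series matches Section 5, while the quadrature and the root-finding (bisection on Π̂_w(−x, x) − (1 − γ), which is monotone in x, in place of the stated minimization) are our own numerical choices.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def ci_omega(A, B, w, alpha, sigma, n, gamma=0.05, nterms=500):
    """Asymptotic 100(1-gamma)% confidence interval for omega (Section 4)."""
    d = A ** 2 + B ** 2
    c1, c2, c3 = -6 * B / d, 6 * A / d, 12 / d            # the vector w
    j = np.arange(1, nterms + 1)
    K = (-c1 * np.cos(w * j) - c2 * np.sin(w * j)
         + (j * c3 / nterms) * (A * np.sin(w * j) - B * np.cos(w * j)))
    c_w = np.mean(np.abs(K) ** alpha)                     # estimate of c_w

    def pi_w(x):  # P(-x < X < x) by the inversion formula (Chung, 1974);
        # the integrand decays like exp(-(2*sigma*u)**alpha), so plain
        # quadrature on (0, inf) is adequate for moderate x
        f = lambda u: np.sin(u * x) / u * np.exp(-(2 * sigma * u) ** alpha * c_w)
        return (2 / np.pi) * quad(f, 0, np.inf, limit=500)[0]

    x_w = brentq(lambda x: pi_w(x) - (1 - gamma), 1e-8, 1e3)
    half = x_w / n ** ((2 * alpha - 1) / alpha)
    return w - half, w + half
```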
Table 1
The average estimates and the mean absolute deviations of the LSEs and the ALSEs when the sample size is 20.

Para.   α = 1.2                        α = 1.4                        α = 1.6                        α = 1.8
        LSE            ALSE            LSE            ALSE            LSE            ALSE            LSE            ALSE
A       1.790 (0.884)  0.858 (1.115)   1.582 (0.421)  1.045 (0.624)   1.501 (0.249)  1.118 (0.446)   1.493 (0.192)  1.212 (0.399)
B       1.990 (0.974)  1.878 (0.858)   1.825 (0.618)  1.788 (0.469)   1.549 (0.235)  1.783 (0.361)   1.509 (0.154)  1.766 (0.293)
ω       2.416 (0.106)  2.459 (0.097)   2.488 (0.030)  2.517 (0.036)   2.501 (0.012)  2.517 (0.025)   2.499 (0.011)  2.521 (0.021)

Here A = B = 1.5 and ω = 2.5. In each cell, the first figure is the average estimate and the corresponding mean absolute deviation is reported in brackets.
If x̂_w minimizes |Π̂_w(−x, x) − (1 − γ)|, then the asymptotic 100(1 − γ)% confidence interval of ω is

( ω̂ − x̂_w / n^{(2α−1)/α},  ω̂ + x̂_w / n^{(2α−1)/α} ).

The confidence intervals of A and B can be obtained in exactly the same manner. We also consider two bootstrap confidence intervals, the percentile bootstrap (Boot-p) and the bootstrap-t (Boot-t), constructed as in Mitra and Kundu (1997). Their performances are compared in the next section.

5. Numerical experiments

In this section, we present some experimental results to see how the LSEs and the ALSEs behave for finite samples. We consider the following model:

y(t) = A cos(ωt) + B sin(ωt) + e(t).   (5.1)
We took A = B = 1.5 and ω = 2.5, and the e(t)'s to be i.i.d. SαS random variables with mean zero and 1 < α < 2. We want to see how the LSEs and the ALSEs behave for different values of α and different sample sizes. We consider α = 1.2, 1.4, 1.6, 1.8 and n = 20, 25 and 30. In all cases the scale parameter is σ = 0.25. For each combination of α and n, we compute the LSEs and ALSEs of the unknown parameters and obtain the average estimates and the mean absolute deviations (MADs) over 500 replications; the results are reported in Tables 1–3. In each table the first figure is the average estimate and the corresponding MAD is reported in brackets. For computing the LSEs and the ALSEs we used the optimization routines of Press et al. (1992). We also computed confidence intervals of the different parameters using the asymptotic distribution as well as the two bootstrap methods. The exact numerical procedure is as follows. For a given data set, first we estimate A, B and ω by Â, B̂ and ω̂, respectively. Then we compute ĉ_a(Â, B̂, ω̂, α), ĉ_b(Â, B̂, ω̂, α) and ĉ_w(Â, B̂, ω̂, α) using the first 500 terms of the corresponding series as defined in (3.6). Finally, we obtain the 95% confidence intervals of A, B and ω.
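A compressed version of this experiment is sketched below, reusing the lse (or alse) function from the Section 2 sketch and rvs_sas from Section 3; apart from the parameter values and the 500 replications, which match the text, the framing is ours.

```python
import numpy as np

A0, B0, w0, sigma, n = 1.5, 1.5, 2.5, 0.25, 25
t = np.arange(1, n + 1)
signal = A0 * np.cos(w0 * t) + B0 * np.sin(w0 * t)

for alpha in (1.2, 1.4, 1.6, 1.8):
    est = np.array([lse(signal + rvs_sas(alpha, sigma, n, seed=r))
                    for r in range(500)])
    avg = est.mean(axis=0)                          # average estimates
    mad = np.abs(est - [A0, B0, w0]).mean(axis=0)   # mean absolute deviations
    print(f"alpha={alpha}: avg={avg.round(3)}, MAD={mad.round(3)}")
```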
Table 2
The average estimates and the mean absolute deviations of the LSEs and the ALSEs when the sample size is 25.

Para.   α = 1.2                        α = 1.4                        α = 1.6                        α = 1.8
        LSE            ALSE            LSE            ALSE            LSE            ALSE            LSE            ALSE
A       1.555 (0.592)  1.164 (0.710)   1.459 (0.357)  1.205 (0.447)   1.498 (0.212)  1.216 (0.353)   1.499 (0.159)  1.217 (0.314)
B       1.265 (0.767)  1.712 (0.737)   1.523 (0.295)  1.682 (0.433)   1.505 (0.210)  1.690 (0.305)   1.500 (0.153)  1.691 (0.248)
ω       2.427 (0.096)  2.457 (0.080)   2.479 (0.033)  2.501 (0.031)   2.493 (0.016)  2.508 (0.020)   2.500 (0.007)  2.513 (0.014)

Here A = B = 1.5 and ω = 2.5. In each cell, the first figure is the average estimate and the corresponding mean absolute deviation is reported in brackets.
Table 3
The average estimates and the mean absolute deviations of the LSEs and the ALSEs when the sample size is 30.

Para.   α = 1.2                        α = 1.4                        α = 1.6                        α = 1.8
        LSE            ALSE            LSE            ALSE            LSE            ALSE            LSE            ALSE
A       1.457 (0.506)  1.217 (0.728)   1.476 (0.301)  1.217 (0.436)   1.481 (0.209)  1.246 (0.320)   1.485 (0.154)  1.254 (0.273)
B       1.502 (0.620)  1.624 (0.637)   1.497 (0.306)  1.634 (0.363)   1.491 (0.205)  1.636 (0.266)   1.500 (0.141)  1.646 (0.211)
ω       2.440 (0.080)  2.442 (0.087)   2.478 (0.033)  2.491 (0.033)   2.493 (0.014)  2.502 (0.022)   2.501 (0.006)  2.500 (0.019)

Here A = B = 1.5 and ω = 2.5. In each cell, the first figure is the average estimate and the corresponding mean absolute deviation is reported in brackets.
We also obtained the corresponding Boot-p and Boot-t confidence intervals for the different parameters. We present the results for α = 1.2, 1.4, 1.6 and 1.8, for n = 25, and for ω only; the results for A and B are quite similar in nature and are therefore not reported here. The results appear in Table 4. In each cell, the first figure is the average coverage percentage, and the average length of the confidence interval over 500 replications is reported in brackets. Several points emerge from the numerical results. From Tables 1–3 it is observed that, as the sample size n increases, the biases and the MADs generally decrease, which supports the consistency of both the LSEs and the ALSEs for all the parameters. It is also observed that the biases and the MADs decrease as α increases; this indicates that the heavier the tail, the more difficult it is to estimate the unknown parameters. In all cases, and for both methods, the MADs of the frequency are significantly smaller than the corresponding MADs of the amplitudes, which confirms that the rate of convergence of the frequency estimators is faster than that of the amplitude estimators.
Table 4
The coverage percentages and the average confidence lengths for the frequency obtained by the different methods when the sample size is 25.

Methods   α = 1.2        α = 1.4        α = 1.6        α = 1.8
Asymp.    0.90 (0.087)   0.93 (0.059)   0.95 (0.052)   0.97 (0.004)
Boot-p    0.93 (0.601)   0.91 (0.286)   0.90 (0.116)   0.90 (0.060)
Boot-t    0.94 (1.041)   0.92 (0.434)   0.90 (0.183)   0.91 (0.087)

In each cell, the first figure is the coverage percentage and the corresponding average confidence length is reported in brackets next to it.
Comparing the LSEs and the ALSEs: although they are asymptotically equivalent, the LSEs behave marginally better than the ALSEs in terms of MADs in most of the cases considered and for all the parameters. Computationally, however, the ALSEs are much easier to obtain than the LSEs, at least for large p. Therefore, if p is large we recommend the ALSEs, but if p is small the LSEs are preferable. From Table 4, several points are clear. For all the methods, the lengths of the confidence intervals decrease as α increases, and also as the sample size increases (not reported here). For both bootstrap methods the coverage percentages gradually decrease as α increases, whereas for the asymptotic method they gradually increase. Between the two bootstrap intervals, for fixed α, the Boot-t intervals have higher coverage probability than the Boot-p intervals, and the Boot-t intervals are also longer. For both bootstrap procedures the coverage percentages generally vary between 90% and 94%; for the asymptotic method they vary between 90% and 97%, although the asymptotic intervals are much shorter than the corresponding bootstrap intervals. Moreover, computing the asymptotic confidence intervals requires the value of α, which the bootstrap intervals do not. Weighing all these points, we recommend the Boot-p confidence bounds when α is not known; when α is known and close to 2, the asymptotic confidence bounds should be used. As one of the referees suggested, we also compare the simulation-based distribution with the bootstrap distributions through histograms, each based on one thousand replications. We report the results for α = 1.8; the others are quite similar in nature and are therefore omitted. We provide three histograms in Figs. 4–6: Fig. 4 shows the simulation-based histogram, and Figs. 5 and 6 show the histograms based on Boot-t and Boot-p, respectively. The shapes are quite similar, but the dispersions are smaller for the bootstrap samples. We also provide a plot (Fig. 1) of a particular realization of the model (5.1) with α = 1.5 and σ = 0.5. From this plot it may not be obvious that the data have infinite variance, but the plot of n versus Var{y(1), …, y(n)} (Fig. 3) clearly indicates that the variance is not finite. The periodogram function is plotted in Fig. 2, and from it it is clear that p = 1.
[Fig. 1. A realization of the model (5.1) with α = 1.5 and σ = 0.5.]
[Fig. 2. The periodogram function of the data in Fig. 1.]
[Fig. 3. n versus Var{y(1), …, y(n)} for the data in Fig. 1.]
We estimated the different parameters and obtained the 95% confidence bounds for all of them. The LSEs are Â = 1.73579, B̂ = 1.19572, ω̂ = 2.49620; the ALSEs are Ã = 1.70866, B̃ = 1.17192, ω̃ = 2.49605. Using the LSEs, the Boot-p confidence intervals for A, B and ω are (1.3296, 2.1733), (0.6515, 1.7162) and (2.4916, 2.5007), respectively. Similarly, using the ALSEs, the corresponding confidence intervals are (1.1880, 2.1385), (0.5437, 1.6998) and (2.4908, 2.5009).
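The bootstrap details are delegated to Mitra and Kundu (1997) and not restated in this paper, so the following Boot-p sketch is only one plausible residual-resampling implementation, not necessarily the authors' exact scheme.

```python
import numpy as np

def boot_p_ci(y, estimator, gamma=0.05, nboot=500, seed=None):
    """Percentile-bootstrap confidence intervals for (A, B, omega)."""
    rng = np.random.default_rng(seed)
    n, t = len(y), np.arange(1, len(y) + 1)
    A, B, w = estimator(y)                   # e.g. lse or alse from Section 2
    fit = A * np.cos(w * t) + B * np.sin(w * t)
    res = y - fit
    boots = np.array([estimator(fit + rng.choice(res - res.mean(), n))
                      for _ in range(nboot)])          # resample residuals
    lo, hi = np.quantile(boots, [gamma / 2, 1 - gamma / 2], axis=0)
    return list(zip(lo, hi))                 # [(A_lo, A_hi), (B_lo, B_hi), (w_lo, w_hi)]
```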
[Fig. 4. Simulation-based histogram (α = 1.8).]
[Fig. 5. Histogram based on Boot-t (α = 1.8).]
[Fig. 6. Histogram based on Boot-p (α = 1.8).]
6. Conclusions

In this paper, we considered the sum of sinusoids model under the assumption of additive heavy tail i.i.d. errors. Although we considered only i.i.d. random variables, the results can be extended to the case when the errors are of moving-average type.
One important question we did not address in this paper is the estimation of p, the number of sinusoidal components. One may need to use some information theoretic criterion to estimate p; more work is needed in this direction.
Acknowledgements

The authors would like to thank two referees for several constructive suggestions, and the editor, Professor Richard A. Johnson, for his encouragement.
Appendix A

To prove Theorem 1, we need the following lemmas.

Lemma 1. Let S_{c,M} = {θ : θ = (A, B, ω), |θ − θ0| ≥ c, |A| ≤ M, |B| ≤ M}. Suppose the e(t)'s are i.i.d. random variables with mean zero. If, for any c > 0 and for some M < ∞,

lim inf_{θ ∈ S_{c,M}} (1/n)[Q(θ) − Q(θ0)] > 0  a.s.,

then θ̂ is a strongly consistent estimator of θ0.

Proof. The proof is a simple extension of the result of Wu (1981), so it is omitted.

Lemma 2. If X1, X2, … are i.i.d. random variables with mean zero and E|Xi|^{1+δ} < ∞ for some 0 < δ < 1, then

lim_{n→∞} sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Xt cos(tλ)| = lim_{n→∞} sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Xt sin(tλ)| = 0  a.s.
Proof. We prove the result for cos(tλ); the result for sin(tλ) follows similarly. Let Zt = Xt I[|Xt| ≤ t^{1/(1+δ)}]. Then

Σ_{t=1}^{∞} P[Zt ≠ Xt] = Σ_{t=1}^{∞} P[|Xt| > t^{1/(1+δ)}]
  = Σ_{t=1}^{∞} Σ_{2^{t−1} ≤ n < 2^t} P[|X1| > n^{1/(1+δ)}]
  ≤ Σ_{t=1}^{∞} 2^t P[2^{(t−1)/(1+δ)} ≤ |X1|]
  ≤ Σ_{t=1}^{∞} 2^t Σ_{j=t}^{∞} P[2^{(j−1)/(1+δ)} ≤ |X1| < 2^{j/(1+δ)}]
  = Σ_{j=1}^{∞} P[2^{(j−1)/(1+δ)} ≤ |X1| < 2^{j/(1+δ)}] Σ_{t=1}^{j} 2^t
  ≤ C Σ_{j=1}^{∞} 2^{j−1} P[2^{(j−1)/(1+δ)} ≤ |X1| < 2^{j/(1+δ)}]
  ≤ C Σ_{j=1}^{∞} E|X1|^{1+δ} I[2^{(j−1)/(1+δ)} ≤ |X1| < 2^{j/(1+δ)}] ≤ C E|X1|^{1+δ} < ∞.
Therefore P[Zt ≠ Xt i.o.] = 0. Thus

sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Xt cos(tλ)| → 0  a.s.  ⇔  sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Zt cos(tλ)| → 0  a.s.

Let Ut = Zt − E(Zt), and note that

sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} E(Zt) cos(tλ)| ≤ (1/n) Σ_{t=1}^{n} |E(Zt)| = (1/n) Σ_{t=1}^{n} |∫_{−t^{1/(1+δ)}}^{t^{1/(1+δ)}} x dF(x)| → 0.

Thus, we only need to show that

sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Ut cos(tλ)| → 0  a.s.   (A.1)
For any fixed λ and ε > 0, let 0 ≤ h ≤ 1/(2n^{1/(1+δ)}). Then we have

P[|(1/n) Σ_{t=1}^{n} Ut cos(tλ)| > ε] ≤ 2 e^{−hnε} Π_{t=1}^{n} E e^{h Ut cos(tλ)} ≤ 2 e^{−hnε} Π_{t=1}^{n} (1 + 2C h^{1+δ}),

since |h Ut cos(tλ)| ≤ 1/2, e^x ≤ 1 + x + 2|x|^{1+δ} for |x| ≤ 1/2, and E|Ut|^{1+δ} ≤ C for some C > 0. Clearly,

2 e^{−hnε} Π_{t=1}^{n} (1 + 2C h^{1+δ}) ≤ 2 e^{−hnε + 2nC h^{1+δ}}.

Choose h = 1/(2n^{1/(1+δ)}); then, for large n,

P[|(1/n) Σ_{t=1}^{n} Ut cos(tλ)| > ε] ≤ 2 e^{−(1/2)ε n^{δ/(1+δ)} + C} ≤ C e^{−(1/2)ε n^{δ/(1+δ)}}.

Let K = n², and choose λ1, …, λK such that for each λ ∈ (0, 2π) there is a λj satisfying |λj − λ| ≤ 2π/n². Note that

|(1/n) Σ_{t=1}^{n} Ut (cos(tλ) − cos(tλj))| ≤ C (1/n) Σ_{t=1}^{n} t^{1/(1+δ)} t (1/n²) ≤ C n^{−δ/(1+δ)} → 0.

Therefore, for large n, we have

P[ sup_{0≤λ≤2π} |(1/n) Σ_{t=1}^{n} Ut cos(tλ)| > 2ε ] ≤ P[ max_{j≤n²} |(1/n) Σ_{t=1}^{n} Ut cos(tλj)| > ε ] ≤ C n² e^{−(1/2)ε n^{δ/(1+δ)}}.

Since Σ_{n=1}^{∞} n² e^{−(1/2)ε n^{δ/(1+δ)}} < ∞, (A.1) follows from the Borel–Cantelli lemma.
Now, with the help of Lemmas 1 and 2, we prove Theorem 1.

Proof of Theorem 1. In this proof only, we denote θ̂ by θ̂_n = (Â_n, B̂_n, ω̂_n) to emphasize that θ̂ depends on the sample size. If θ̂_n is not consistent for θ0, then either

Case I: for all subsequences {n_k} of {n}, |Â_{n_k}| + |B̂_{n_k}| → ∞. Then (1/n_k)[Q(θ̂_{n_k}) − Q(θ0)] → ∞. But as θ̂_{n_k} is the LSE of θ0 at n = n_k, Q(θ̂_{n_k}) − Q(θ0) < 0, which leads to a contradiction; so θ̂_n is a consistent estimator of θ0. Or

Case II: for at least one subsequence {n_k} of {n}, θ̂_{n_k} ∈ S_{c,M} for some c > 0 and 0 < M < ∞. Let us write S_{c,M}, as defined in Lemma 1, as

S_{c,M} = {θ : θ = (A, B, ω), |θ − θ0| ≥ 3c, |A| ≤ M, |B| ≤ M} = A_c ∪ B_c ∪ W_c,

where

A_c = {θ : θ = (A, B, ω), |A − A0| ≥ c, |A| ≤ M, |B| ≤ M},
B_c = {θ : θ = (A, B, ω), |B − B0| ≥ c, |A| ≤ M, |B| ≤ M},
W_c = {θ : θ = (A, B, ω), |ω − ω0| ≥ c, |A| ≤ M, |B| ≤ M}.

Consider

(1/n)[Q(θ) − Q(θ0)] = (1/n) Σ_{t=1}^{n} [(y(t) − A cos(ωt) − B sin(ωt))² − e(t)²]
  = (1/n) Σ_{t=1}^{n} {A0 cos(ω0 t) + B0 sin(ω0 t) − A cos(ωt) − B sin(ωt)}²
    + (2/n) Σ_{t=1}^{n} e(t){A0 cos(ω0 t) + B0 sin(ω0 t) − A cos(ωt) − B sin(ωt)}
  = f_n(θ) + g_n(θ)  (say).   (A.2)
Using Lemma 2, we get

lim_{n→∞} sup_{θ ∈ S_{c,M}} g_n(θ) = 0  a.s.   (A.3)

Now, for any c > 0 and a fixed 0 < M < ∞,

lim inf_{θ ∈ A_c} f_n(θ) = lim inf_{θ ∈ A_c} (1/n) Σ_{t=1}^{n} (A0 cos(ω0 t) + B0 sin(ω0 t) − A cos(ωt) − B sin(ωt))²
  = lim inf_{θ ∈ A_c} (1/n) Σ_{t=1}^{n} [(A0 cos(ω0 t) − A cos(ωt))² + (B0 sin(ω0 t) − B sin(ωt))²
      + 2(A0 cos(ω0 t) − A cos(ωt))(B0 sin(ω0 t) − B sin(ωt))]
  = lim inf_{|A−A0| ≥ c} (1/n) Σ_{t=1}^{n} (A0 cos(ω0 t) − A cos(ω0 t))²
  = inf_{|A−A0| ≥ c} (1/2)(A0 − A)² ≥ (1/2)c² > 0  a.s.   (A.4)

Similarly, it can be proved for B_c and W_c also. Thus, we have

lim inf_{θ ∈ S_{c,M}} f_n(θ) > 0  a.s.   (A.5)
Now using (A.3) and (A.5) in (A.2), together with Lemma 1, Theorem 1 follows immediately.

We need the following lemmas to prove Theorem 2.

Lemma 3. Let ω̃ be the estimator defined in Section 2 and, for any ε > 0, let S_ε = {ω : |ω − ω0| ≥ ε} for some fixed ω0 ∈ (0, 2π). If, for any ε > 0,

lim sup_{ω ∈ S_ε} (1/n)[I_n(ω) − I_n(ω0)] < 0  a.s.,   (A.6)

then ω̃ → ω0 a.s. as n → ∞.

Proof. Lemma 3 can be obtained similarly as Lemma 1.

Lemma 4. Let {e(t)} be i.i.d. random variables with mean zero and E|e(t)|^{1+δ} < ∞ for some 0 < δ < 1. Then the estimator ω̃ of ω0 obtained by maximizing (2.3) is a strongly consistent estimator of ω0.

Proof. Consider

lim sup_{ω ∈ S_ε} (1/n)[I_n(ω) − I_n(ω0)]
  = lim sup_{ω ∈ S_ε} (1/n)[(2/n)|Σ_{t=1}^{n} y(t) e^{−iωt}|² − (2/n)|Σ_{t=1}^{n} y(t) e^{−iω0 t}|²]
  = 2 lim sup_{ω ∈ S_ε} { [(A0/n) Σ_{t=1}^{n} cos(ω0 t) cos(ωt) + (B0/n) Σ_{t=1}^{n} sin(ω0 t) sin(ωt)]²
      − [(A0/n) Σ_{t=1}^{n} cos²(ω0 t)]² − [(B0/n) Σ_{t=1}^{n} sin²(ω0 t)]² }

[using the trigonometric identities, the components involving e(t) go to zero]

  = 2(0 − (1/4)A0² − (1/4)B0²) = −(1/2)(A0² + B0²) < 0  a.s.

Therefore, because of Lemma 3, the result follows.

Lemma 5. If X1, X2, … are i.i.d. random variables with mean zero and E|X1|^{1+δ} < ∞ for some 0 < δ < 1, then

lim_{n→∞} sup_{0≤λ≤2π} |(1/n^{k+1}) Σ_{t=1}^{n} t^k Xt cos(tλ)| = lim_{n→∞} sup_{0≤λ≤2π} |(1/n^{k+1}) Σ_{t=1}^{n} t^k Xt sin(tλ)| = 0  a.s.

for k = 1, 2, ….

Proof. Lemma 5 can be obtained similarly as Lemma 2.

Lemma 6. Under the same conditions as in Lemma 4, n(ω̃ − ω0) → 0 a.s.

Proof. Let I′_n(ω) and I″_n(ω) be the first and second derivatives of I_n(ω). Expanding I′_n(ω̃) around ω0 by a Taylor series,

I′_n(ω̃) − I′_n(ω0) = (ω̃ − ω0) I″_n(ω̄),

where ω̄ is a point between ω̃ and ω0. As ω̃ maximizes I_n(ω), I′_n(ω̃) = 0, which implies
(ω̃ − ω0) = −I′_n(ω0)/I″_n(ω̄)  ⇒  n(ω̃ − ω0) = −[(1/n²) I′_n(ω0)] / [(1/n³) I″_n(ω̄)].   (A.7)

Note that

lim_{n→∞} (1/n³) I″_n(ω0) < 0  a.s.  and  lim_{n→∞} (1/n²) I′_n(ω0) = 0  a.s.   (A.8)

Therefore, using (A.8) in (A.7), along with the fact that ω̃ → ω0 a.s., the lemma follows.

Lemma 7. Under the same conditions as in Lemma 4, Ã and B̃ are strongly consistent estimators of A0 and B0, respectively.

Proof. Let us expand cos(ω̃t) around ω0 by a Taylor series up to the first-order term. Suppose ω̄ is a point such that ω̄t lies between ω0 t and ω̃t; note that ω̄ may depend on t. Then Ã = (2/n) Σ_{t=1}^{n} y(t) cos(ω̃t)
can be written as

Ã = (2/n) Σ_{t=1}^{n} {A0 cos(ω0 t) + B0 sin(ω0 t) + e(t)}{cos(ω0 t) − t(ω̃ − ω0) sin(ω̄t)}
  = (2A0/n) Σ_{t=1}^{n} cos²(ω0 t) − 2A0 n(ω̃ − ω0) (1/n²) Σ_{t=1}^{n} t cos(ω0 t) sin(ω̄t)
    + (2B0/n) Σ_{t=1}^{n} sin(ω0 t) cos(ω0 t) − 2B0 n(ω̃ − ω0) (1/n²) Σ_{t=1}^{n} t sin(ω0 t) sin(ω̄t)
    + (2/n) Σ_{t=1}^{n} e(t) cos(ω0 t) − 2 n(ω̃ − ω0) (1/n²) Σ_{t=1}^{n} t e(t) sin(ω̄t)
  → A0  a.s.   (A.9)

Note that the second, fourth and sixth terms of (A.9) converge to zero by Lemmas 6 and 5, the third term vanishes because of the trigonometric identity, the fifth term vanishes because of Lemma 2, and the first term converges to A0. Similarly, it can be shown that B̃ is a consistent estimator of B0.

Proof of Theorem 2. Combining Lemmas 4 and 7, the result follows.
Appendix B

The proof that (1/n) Σ_{j=1}^{n} |K_t(j)|^α converges to a non-zero limit for t ≠ 0. Note that

|K_t(j)| ≤ |t1| + |t2| + |t3|(A0 + B0) = M  (say)

for all j and n, 1 ≤ j ≤ n, n = 1, 2, …. Thus |K_t(j)/M| ≤ 1, and hence |K_t(j)|^α ≥ (M^α/M²)|K_t(j)|² for 0 < α ≤ 2 and for all j = 1, 2, …. Therefore,

lim_{n→∞} (1/n) Σ_{j=1}^{n} |K_t(j)|^α ≥ M^{α−2} lim_{n→∞} (1/n) Σ_{j=1}^{n} |K_t(j)|².

Using

lim_{n→∞} (1/n) Σ_{j=1}^{n} cos²(jω0) = 1/2 > 0  and  lim_{n→∞} (1/n) Σ_{j=1}^{n} cos(jω0) = 0,

it easily follows that

lim_{n→∞} (1/n) Σ_{j=1}^{n} |K_t(j)|² > 0.

This proves the result.
References

Brillinger, D.R., 1987. Fitting cosines: some procedures and some physical examples. In: MacNeill, B., Umphrey, G.J. (Eds.), Applied Probability and Stochastic Process and Sampling Theory. D. Reidel Publishing Company, USA, pp. 75–100.
Chung, K.L., 1974. A Course in Probability Theory, 2nd Ed. Academic Press, New York.
Hannan, E.J., 1971. Non-linear time series regression. J. Appl. Probab. 8, 767–780.
Kay, S., 1988. Modern Spectral Estimation: Theory and Applications. Prentice-Hall, New York.
Kundu, D., 1993. Asymptotic theory of least-squares estimators of a particular non-linear regression model. Statist. Probab. Lett. 18, 13–17.
Kundu, D., 1997. Asymptotic theory of least-squares estimators of sinusoidal signals. Statistics 30 (3), 221–238.
Kundu, D., Mitra, A., 1996. Asymptotic theory of the least-squares estimators of a nonlinear time series regression model. Comm. Statist. Theory Methods 25, 133–141.
Mandelbrot, B., 1963. The variation of certain speculative prices. J. Business 36, 394–419.
Mitra, A., Kundu, D., 1997. Consistent method for estimating sinusoidal frequencies: a non-iterative approach. J. Statist. Comput. Simulation 58, 171–194.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd Ed. Cambridge University Press, Cambridge.
Rice, J.A., Rosenblatt, M., 1988. On frequency estimation. Biometrika 75, 477–484.
Samorodnitsky, G., Taqqu, M., 1994. Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman and Hall, New York.
Walker, A.M., 1971. On the estimation of the harmonic components in a time series with stationary residuals. Biometrika 58, 21–26.
Whittle, P., 1953. The simultaneous estimation of a time series harmonic component and covariance structure. Trabajos Estadist. 3, 43–57.
Wu, C.F.J., 1981. Asymptotic theory of the non-linear least-squares estimation. Ann. Statist. 9, 501–513.