Signal Processing 91 (2011) 2685–2689
Contents lists available at ScienceDirect
Signal Processing journal homepage: www.elsevier.com/locate/sigpro
Fast communication
Approximate variances for tapered spectral estimates ¨ Michael Amrein , Hans R. Kunsch Seminar f¨ ur Statistik, ETH Z¨ urich, R¨ amistrasse 101, 8092 Z¨ urich, Switzerland
a r t i c l e i n f o
abstract
Article history: Received 24 January 2011 Received in revised form 23 May 2011 Accepted 27 May 2011 Available online 14 June 2011
We propose an approximation of the asymptotic variance that removes a certain discontinuity in the usual formula for the raw and the smoothed periodogram in case a data taper is used. It is based on an approximation of the covariance of the (tapered) periodogram at two arbitrary frequencies. Exact computations of the variances for a Gaussian white noise and an AR(4) process show that the approximation is more accurate than the usual formula. & 2011 Elsevier B.V. All rights reserved.
Keywords: Asymptotic variance Data taper (smoothed) Periodogram
1. Introduction Spectral estimation is by now a standard topic in time series analysis, and many excellent books are available, e.g. Percival and Walden [1] or Bloomfield [2]. The purpose of this short note is to propose an approximation of the asymptotic variance that removes a certain discontinuity in the usual formula for the raw and the smoothed periodogram in case a taper is used. The standard asymptotic variance of the raw periodogram is independent of the taper chosen, see Formulae (222b) and (223c) in Percival and Walden [1]. However, this changes when the raw periodogram is smoothed over frequencies close by. Then a variance inflation factor Ch, see (4), appears which is equal to one if no taper is used and greater than one otherwise, compare Table 248 in Percival and Walden [1]. The reason for this is that tapering introduces correlations between the raw periodogram at different Fourier frequencies. Because of this, the variance reduction due to smoothing is smaller in the case of no tapering. The above variance inflation factor is justified asymptotically when the number of Fourier
Corresponding author. Tel.: þ41 44 632 3438; fax: þ 41 44 632 1228. E-mail addresses:
[email protected] (M. Amrein), ¨
[email protected] (H.R. Kunsch).
0165-1684/$ - see front matter & 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2011.05.018
frequencies that are involved in the smoothing tends to infinity (more slowly than the number of observations, otherwise we would have a bias). Hence, if only little smoothing is used, then we expect something in between: some increase in the variance, but less than the asymptotic variance inflation factor Ch. We give here a formula, see (5), which is almost as simple as the inflation factor, but which takes the amount of smoothing into account. 2. Notation and preliminaries Let fXt gt2Z be a real-valued stationary process with observation frequency 1=D, mean E½Xt ¼ m, autocovariances st :¼ CovðXt ,Xt þ t Þ and spectral density S(f). We assume that X1 ,X2 , . . . ,XN have been observed. The tapered periodogram (called direct spectral estimator in [1]) is 2 X N ðtpÞ D ht ðXt m~ Þei2ptf D S^ ðf Þ :¼ PN 2 h t ¼ 1 t¼1
t
for f 2 ½0,1=ð2DÞ. Here the estimator m~ is usually either the P arithmetic mean X or the weighted average ð N t ¼ 1 ht Xt Þ= PN ðtpÞ ^ ð t ¼ 1 ht Þ: The latter has the property that S ð0Þ ¼ 0. Since the choice is irrelevant for the asymptotics, we can use either version. The taper ðh1 , . . . ,hN Þ is chosen to reduce the discontinuities of the observation window at the edges t¼1
2686
M. Amrein, H.R. K¨ unsch / Signal Processing 91 (2011) 2685–2689
and t¼N. Usually, it has the form ht ¼ hðð2t1Þ=ð2NÞÞ with a function h that is independent of the sample size N. A popular choice is the split cosine taper 9 8 1 > > ð1cosð2px=pÞÞ, 0 rx r 2p > > = <2 p p p 1, ox o1 : ð1Þ h ðxÞ ¼ 2 2 > > > ; : 1ð1cosð2pð1xÞ=pÞÞ, 1p r x r1 > 2 2 The tapered periodogram has the approximate variance Var½S^
ðtpÞ
ðf Þ Sðf Þ2 ,
f2 = f0,1=ð2DÞg
ð2Þ
(see e.g. [1, Formula (222b)]). In particular, it does not converge to zero. Because of this, one usually smoothes the periodogram over a small band of neighboring frequencies. We smooth discretely over an equidistant grid of frequencies. Let fN0 ,k ¼ k=ðN0 DÞ ð0 r k rN 0 =2Þ for an integer N 0 of order O(N). Then the tapered and smoothed spectral estimate is ðtsÞ S^ ðfN0 ,k Þ ¼
M X
fast Fourier transform, exact computation of the latter is in most cases also possible. If not, then by the following lemma we can use Z pf D H2ðNÞ ðf Þ h2 ðuÞei2pNuf D du eipf D : sinðpf DÞ 0 Choosing a simple form for the function h, we can compute the integral on the right exactly. It is obvious that (5) agrees with (2) for M¼ 0. In the next section, we show that it also agrees with (3) for M large. 4. Justification of the approximation The idea is simple: We just plug in a suitable approximation for the relative covariances of the tapered periodogram values into the exact expression for the relative variance. ðtsÞ VarðS^ ðf 0 Þ=Sðf 0 ÞÞ is equal to N ,k
N ,k
! ðtpÞ ðtpÞ S^ ðfN0 ,kr Þ S^ ðfN0 ,ks Þ , gr gs Cov : SðfN0 ,kr Þ SðfN0 ,ks Þ r ¼ M s ¼ M M X
ðtpÞ gj S^ ðfN0 ,kj Þ,
j ¼ M
where the gj ’s are weights with the properties gj 40, PM gj ¼ gj ðM r j r MÞ and j ¼ M gj ¼ 1. If kr M, the ðtpÞ smoothing includes the value S^ ð0Þ which is equal or very close to zero if the mean m is estimated. In this case, we should exclude j¼k from the sum.
M X
The asymptotic behavior of these covariances is well known. Theorem 5.2.8 of Brillinger [4] shows that, under suitable conditions, we have for frequencies 0 of r g o 1=ð2DÞ that ! ðtpÞ ðtpÞ jHðNÞ ðf gÞj2 þ jH2ðNÞ ðf þ gÞj2 S^ ðf Þ S^ ðgÞ , Cov þOðN1 Þ: ¼ 2 Sðf Þ SðgÞ jHðNÞ ð0Þj2 2
ð7Þ
3. Approximations of the variance of spectral estimators The usual approximation for the relative variance of ðtsÞ S^ ðfN0 ,k Þ is ! ðtsÞ M S^ ðfN0 ,k Þ N0 X g2 ð3Þ Var Ch SðfN0 ,k Þ N r ¼ M r for ka0,N0 =2 where PN h4 =N Ch ¼ PNt ¼ 1 t : ð t ¼ 1 h2t =NÞ2
ð4Þ
This formula is given in Bloomfield [2, Eq. (9.12), p. 183], and it is implemented in the function ‘‘spec.pgram’’ in the language for statistical computing R [3]. In order to see that it is the same as Formula (248a) in Percival and Walden [1], one has to go back to the definition of Wm in terms of the weights gj which is given by the formulae (237c), (238d) and (238e). If we put M¼0, (3) is different from (2). The reason for this difference is that (3) is valid in the limit M-1 and M=N0 -0. But in applications M is often small, e.g. M ¼1, and one wonders how good the approximation is in such a case. We propose here as alternative the following approximation for the relative variance ! M 2M M l X X jH2ðNÞ ðfN0 ,l Þj2 X gr2 þ2 g g ð5Þ r r þ l ðNÞ 2 r ¼ M r ¼ M l ¼ 1 H2 ð0Þ P 2 i2ptf D . (again for ka0,N0 =2) where H2ðNÞ ðf Þ ¼ ð1=NÞ N t ¼ 1 ht e In order to compute this expression, we need to compute the convolution of the weights ðgj Þ and the discrete Fourier transform of the squared taper. The former is usually not a problem since M is substantially smaller than N0 . Using the
ð6Þ
The statement in Brillinger [4] is actually asymmetric in f and g since it has S(f) instead of S(g) on the left side in the equation above. Our statement can be proved by the same argument if we assume Sðf Þ SðgÞ when jf gj is small. When jf gj is big, i.e. not of order OðN1 Þ, the covariance is of the order OðN1 Þ anyhow. Using the approximation (7) directly would lead to an approximation which depends on k. Having to compute N 0 =2 different approximate variances is usually too complicated. However, the term jH2ðNÞ ðf þ gÞj2 is small unless Dðf þ gÞ is close to zero modulo one. This has been pointed out by Thomson [5], see also the discussion on pp. 230–231 of Percival and Walden [1]. If we omit this term, then we obtain our new approximation (5) by a simple change in the summation indices. We next give a simple lemma that justifies the omission of the second term in (7). In addition, it also shows how the usual approximation (3) follows from (5). Lemma 4.1. If c is once continuously differentiable on ½0,1 0 and c is Lipschitz continuous with constant L, then Z 1 N 1X 2t1 i2plt pl c ¼ cðuÞei2pNlu du eipl e þ R, Nt¼1 2N sinðplÞ 0 where jRjr const:=N uniformly for all l 2 ½0,0:5. Proof. Put E ¼ 1=ð2NÞ. By a Taylor expansion, we obtain for any x 2 ½0,1 Z E Z xþE cðuÞei2pNlu du ¼ cðxÞei2pNlx ei2pNlu du xE
E
þ c ðxÞei2pNlx 0
Z E E
uei2pNlu du þ R0 ,
M. Amrein, H.R. K¨ unsch / Signal Processing 91 (2011) 2685–2689
where the remainder satisfies jR0 j r 2E3 L=3 ¼ L=ð12N 3 Þ: Next, observe that Z E E
ei2pNlu du ¼
sinðplÞ , plN
Z E
uei2pNlu du ¼
E
2687
i 2p
lN 2
sinðplÞ cosðplÞ :
pl
From this the lemma follows by taking x ¼ ð2t1Þ=ð2NÞ for t ¼1,y,N and summing up all terms. & If cð0Þ ¼ cð1Þ ¼ 0, then by partial integration Z
1 0
cðuÞei2pNlu du r
0
supjc ðxÞj : 2plN
Hence by setting cðuÞ ¼ h2 ðuÞ, we obtain H2ðNÞ ðf Þ rconst:ðNf DÞ1 þconst:N 1 rconst:ðNf DÞ1
Fig. 1. Spectra of the Gaussian white noise and the AR(4) process in (9) in dB.
ð8Þ
for f r 1=D. Therefore the second term in (7) is negligible unless f þg is of the order OðN1 Þ.
Fig. 2. Exact relative variances of the Gaussian white noise (thick-dashed) and the AR(4) process (thick-solid) in comparison to the usual (thin-dashed) and the new (thin-solid) approximation for different choices of N 0 and M when p ¼0.2.
2688
M. Amrein, H.R. K¨ unsch / Signal Processing 91 (2011) 2685–2689
Finally, we derive the usual variance approximation (3) from (5) as follows. By Parseval’s theorem 0 =2 N X
jH2ðNÞ ðfN0 ,l Þj2
l ¼ N 0 =2
0
N X
N 1 ¼ h4 : N Nt¼1 t
Note that H2ðNÞ ð0Þ ¼ 1=N
PN
t¼1
h2t . Because of (8), we have
0 =2 N X
l ¼ 2M
Z
1 1
g 2 ðuÞ du
M X
gr2 ðM-1Þ
r ¼ M
5. Comparison with exact relative variances for Gaussian processes If we assume the process fXt gt2Z to be Gaussian, then it holds
for M-1 and N ¼ OðNÞ-1. Thus N 1X h4 -0 N Nt¼1 t
N jH2ðNÞ ðfN0 ,l Þj2
gr gr þ l M
and the desired result follows by dominated convergence.
0
2M X
M l X r ¼ M
const:N 0 2 jH2ðNÞ ðfN0 ,l Þj2 r -0 NlD l ¼ 2M þ 1 l ¼ 24Mþ1 0 =2 N X
fixed l
ðtpÞ ðtpÞ 1 S^ ðf Þ S^ ðgÞ , Þ¼ P 2 2 Sðf Þ SðgÞ Sðf ÞSðgÞð N t ¼ 1 ht Þ 0 2 2 1 N X C N B X i2pðfjgkÞD i2 p ðfj þ gkÞ D A hj hk sjk e þ h h s e @ j k jk j,k ¼ 1 j,k ¼ 1
Cov
0
also in the above limit. If the weights gj change smoothly as a function g of the lag j, i.e., gj ¼ gðj=MÞ, then for any
Fig. 3. As Fig. 2, but p ¼ 0.5.
M. Amrein, H.R. K¨ unsch / Signal Processing 91 (2011) 2685–2689
see p. 326 of Percival and Walden [1]. Plugging this into (6) yields thus an exact expression. Evaluation is of the order OðN 3 Þ, thus it is not practical to use it routinely. We now compare the two approximations to the exact relative variances for a Gaussian white noise Xt ¼ Et and the AR(4) process
2689
the data is strongly tapered (p¼0.5) or when we use a refined smoothing grid (N 0 ¼ 2N or N 0 ¼ 8N) we recommend to use the new approximation (5). When N 0 ¼ 8N, we observe an improvement over the usual version even when M¼ 8.
Xt ¼ 2:7607Xt1 3:8106Xt2 þ 2:6535Xt3 0:9238Xt4 þ Et ð9Þ used in Percival and Walden [1, p. 46] where Et i:i:d N ð0,1Þ. True spectra are shown in Fig. 1 in decibel (dB), i.e., the plot displays 10 log10 ðSðÞÞ. As we can see, the spectrum of the AR(4) process varies over a wide range and exhibits two sharp peaks. Further, we assume the observation frequency 1=D to be 1 and N ¼ 210 ¼ 1024. We compute the exact relative variance (6) at the frequencies fk,N0 , k ¼ 0, . . . ,N 0 =2, the split cosine taper (1) with p 2 f0:2,0:5g and weights gj ¼ 1=ð2M þ 1Þ, j ¼ M, . . . ,M, for different values of N0 and M. Comparison to the usual approximation (3) and to the new one (5) are shown in Fig. 2 (for p ¼0.2) and Fig. 3 (for p ¼0.5). The code in R is available under http://stat.ethz.ch/ kuensch/ papers/approximate_variances.R. We see that the new approximation fits the true relative variances clearly better when we smooth over few frequencies, i.e., M is small. Especially in the situations when
Acknowledgment ¨ We thank Don Percival and Martin Machler for helpful comments and suggestions on earlier versions. References [1] D.B. Percival, A.T. Walden, Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques, Cambridge University Press, 1993. [2] P. Bloomfield, Fourier Analysis of Time Series: An Introduction, second ed., Wiley, NY, 2000. [3] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2010. URL /http://www.R-project.orgS. [4] D.R. Brillinger, Time Series, Data Analysis and Theory. International Series in Decision Processes, Holt, Rinehart and Winston, Inc., New York, 1975. [5] D.J. Thomson, Spectrum estimation techniques for characterization and development of WT4 waveguide—I, Bell System Technical Journal 56 (1977) 1769–1815.