Signal Processing 10 (1986) 129-144 North-Holland
SEARCH FOR PERIODICITIES OF FILTERED TIME SERIES*
129
BY A X I S - C R O S S I N G S
Benjamin K E D E M Department of Mathematics, University of Maryland, College Park, MD 20742, U.S.A., and Armament Development Authority, Israel Received 11 March 1985 Revised 11 June 1985
Abstract. The number of axis-crossings properly normalized tends to land on or near the dominant frequency of a mixed spectrum. Such dominancy can be achieved by linear filtering. This observation is proved useful in the fast detection of isolated sinusoidal components. The method is applied successfully to some well-known data sets.
Zusammenfassung. In geeignet normierter Form ist die Zahl der Kreuzungspunkte eines Signals mit eitner Achse ein grobes Mal3 fiir die dominante Frequenz des zugehrrigen diskreten Spektrums. Diese Dominanz kann mit Hilfe linearer Filterung erzwungen werden. Derartige Beobachtungen erweisen sich bei der Detektion versteckter Periodizit~iten in einigen wohlbekannten Datens~itzen als niitzlich. R~sum& Le hombre de passage par un axe, normalis6 d'une mani~re approprire, tend b. &re sur ou dans le voisinage de la frrquence dominante d'un spectre mixte. De telles dominances peuvent ~tre obtenues par filtrage linraire. Cette observation s'est rrvrl6 ~tre utile dans la drtection rapide de composantes sinusoidales isolres. La mrthode est appliquire avec succ~s certains ensembles de donnres bien connus.
Keywords. Filter, sinusoidal limit, spectrum, stationary, periodogram, white noise, Gaussian, dominant frequency.
I. Introduction
derived ZI,...
Let {Z,} be a zero mean, Gaussian, stationary process of the form p
Z,= ~ (Ajcostojt+Bjsinto~t)+e,,
(1)
j-t
t =0, a : l , . . . , where {Aj} and {Bj} are normal random variables and (e,} is stationary normal noise independent of Aj and Bj. In this paper we shall be interested in the estimation of the angular frequencies toj. We would like to show that the toj can be detected very effectively by the number of axiscrossings and the number of axis-crossings by the * This research was supported by the AFOSR under Grant No. 82-0187.
filtered series when a time series is given. This is due to the observation
,Z N
that the number of axis-crossings properly normal-
ized tends to 'land' on the dominant angular frequency when it exists in the discrete spectrum. This is so precisely because the number of axiscrossings, D~, say, after normalization is a weighted average of the toj when the signal to noise ratio tends to increase. In general, ~r D 1 / ( N - 1 ) tends to admit values in Ihe dominant frequency band regardless of whether the spectrum is discrete, continuous, or of mixed type. Our method is very simple to apply and opens new channels for further research on the connection between filtering and axis-crossings analysis of time series. We find this connection fruitful both mathematically and practically.
0165-1684/86/$3.50 O 1986. Elsevier Science Publishers B.V. (North-Holland)
130
B. Kedem / Periodicitiesfrom axis-crossings
It is easy to see why the above observation is useful in discovering periodicities in time series, for, by linear filtering we can change the relative prominence of the frequencies so that the resulting D J ( N - 1) is forced to land at or near the resulting dominant angular frequency. This simple device is in fact independent of the Gaussian assumption and hence is of general apllicability. In the search for hidden periodicities in sinusoidal data plus noise, frequency and amplitude are nonseparable entities. In order to detect a hidden frequency one always considers the significance of the corresponding amplitude since a nonexisting frequency displays no amplitude. Such is the case with the Schuster and Whittaker periodograms or simply any line spectrum: the search for hidden periodicities (or frequencies) is done through amplitudes. Our approach, too, uses amplitudes, except that it is done indirectly by attenuating undesired frequencies and emphasing others through linear filtering. Once a frequency becomes dominant, it is quickly detected by D 1 / ( N - 1). In this connection, the periodogram may be used for a final verification that a true frequency or period has been detected. This amounts to zeroing in on the right frequency bands. In essence, the periodogram is evaluated only at a few selected promising frequencies which, furthermore, are not necessarily Fourier frequencies. This implies that a regular periodogram analysis may skip a frequency which may nonetheless be detected by our method. A further feature of this paper is a new test for white noise based on higher order crossings. Our method has been applied very successfully to some well-known data sets. Some of these results are presented below. The present paper builds on our previous work in [7]. However, the emphasis here is on the idea of putting together axis-crossings analysis and linear filtering. This serves as a remedy to the deficiency reported there concerning low frequencies. There we have introduced the higher order crossings spectral representation whose Signal Processing
generalization is discussed and applied in what follows.
2. A spectral representation for axis-crossings In order to formalize the above argument we shall assume that {Z,} in (1) is Gaussian with spectral distribution function F ( A ) , - w < A <~w, and (normalized) correlation function {Pk}. For this we let At, Bt, j = 1 , . . . , p be mutually independent N(0, tr}) random variables. Also, it is convenient to record the tot in their increasing order of magnitude so that O<~w~
Z, t>0, Zt<0,
t=0,+l .....
Then, D~ is given by N
N
D a = 2 Y~ S t - 2
Y~ X t X t _ l - ( S l - ~ - X N ) .
t=l
t=2
Clearly, by symmetry, E X , =½ and from [5, p. 34] the Gaussian assumption yields
E X , X , _ I = P ( X , = 1, X,_, = 1) 1
1
= - + - - sin -~ p~. 4 2~r Therefore,
EDa=(N-1)(~---~rlsin-~p~) ' and so, by solving for Pl and be recalling the celebrated relation Pk =
COS(tok)d F ( w
dF(w),
131
B. Kedem / Periodicitiesfrom axis-crossings
where F is symmetric, we obtain
• r E D H / ( N - 1) is again attracted to the dominant
( ~ E DI~ = fro COS(tO)dF(60)
frequency in the new spectrum [H(60)] 2 dF(60). To see this, let L,(. ) be a sequence of filters such that, as n increases,
cos\ N-1 /
fo
dF(60)
1Ho(60)1=,{ 0, 60~60. C,
60 = 60r,
0 -2 C 0 S ( 6 0 1 ) - { - " " ' + 0 .2 C 0 S ( 6 0 p ) O-2q-...
where C is a positive constant. Then, from (3),
q- O-2 -{- O--2 v
(2) and it is seen that as o-~, say, increases, then ~r E D 1 / ( N - 1 ) ~ tOt. The same effect is obviously achieved if o-2, is fixed while, for all j # r, cr2 ~ 0, and also cr~ ~ 0. Thus, a dominant frequency can easily be detected by simply counting the number of axis-crossings. The cr2, however, can be manipulated through linear filtering. Equation (2) is called the spectral representation o f the number o f axiscrossings. It is nothing but the first correlation of {z,}.
Operate on {Zt} with a linear filter L(.) with transfer function H(w). The output being again Gaussian yields the useful formula
/%r E Dn'~ c°sk~}
=
cos(60)lH<60)l = dE
fo
1H(60)12 dF(to)
~ E DH -~ " 60,,
N-1
60, c ( 0 , ~ ]
a s n --~ oo.
In general, L(.) can be designed so that its transfer function H(60) attenuates or emphasizes frequency bands at will, thus causing 7r E D , / ( N - 1 ) to admit values in the desired intervals. To estimate a frequency toj we render it dominant mainly by attenuating all other frequencies in the discrete spectrum via proper filtering. It will be seen that this procedure is easy to apply as the simplest possible filters yield remarkable results when applied reasonably. It should be noted that the new noise component due to filtering is no longer white when IH(60)I is not flat. Clearly, our procedure works just the same for filtered noise excluding pathological cases. In most cases it is advisable to first lowpass the series in order to reduce the power of the noise. Let Hi(60) correspond to the linear filter L j ( . ) and consider the sequential filter L I , . . . , Lk. Then, (3) generalizes to
+ ~r~ f= cos(60)lH(tO)l = d60 -1
/[p
-Jo
J
COS(rr EN_IDn,...nk~/
E ~lH(o~,)l ~
j=l
fo'COS(60)lnl(,O)l~..
Ink(to)l 2
+~r~
IH(60)I2d60 ,
(3)
7r
where DH is the corresponding number of axiscrossings in the filtered series of length N. This is a generalization (but also the implication!) of (2) with L ( . ) being the identity filter whose transfer function is H(60)= 1. The symbol U 1 n o w makes sense. Quite a few results can be obtained from (3) but of special importance is the fact that
dF(to)
o~ IH1<60)12., • IHk<60)l 2 dF(60) (4) Equation (4) relates the number of axis-crossings of a sequentially filtered series and the spectral content of the series and is a generalization of the higher order crossings spectral representation which was introduced in [7], and will be applied Vol. 10, No. 2, March 1986
B. Kedem / Periodicitiesfrom axis-crossings
132
below. In that representation the difference operator is applied repeatedly. At this point it is helpful to give an illustration of the main idea. The complete method, however, will be described in the following section. Consider a model of the form (1) with A1, B 1 - N ( 0 , 1 ) , AE, B 2 - N ( 0 , 2 . 5 6 ) . Assume that tr2~=0. The observed process is given by
t
Z, = 0.414 cos 0.75t + 0.488 sin 0.75t
t = 1, 2 , . . . , 4 5 0 .
':I, I (b)
0,75
t25
(b')
"trxl'/'9 ~9K9 449-"=~ •
r~f
0
(7)
otherwise,
and transfer function
IH(,o)12=(l+cos~,) '°, 0~,o<~,
i
I
i
t
t
o,7~
L25
i
ILl
(6)
with weights
aj (0,
i
/ IH(oJ)12dF(~)
dF(w)
j ~ --co
= ~'2-s(1°), j = O , 1 , . . . , 10,
(a')
(5)
Since the angular frequency 1.25 dominates the frequency 0.75, we expect D~ to detect it. Indeed, we found D~ = 179 and ~r D ] / 4 4 9 = 1.252. This takes care of the first frequency. After applying the linear filter
L(Z,)= ~ ajZ,_j
't
(a)
-0.249 cos 1.25t + 1.090 sin 1.25t,
(8)
the spectral function gives the weights 121.24 and 19.84 to the frequencies 0.75, 1.25, respectively. Now, 0.75 is dominating and D , = 107 yielding "rr D , / 4 4 9 = 0.749 as our estimate for the second frequency. When o-~ > 0 but the signal to noise ratio is moderate, we get similar results by applying proper filtering in order to reduce the power in the noise. Fig. 1 graphically displays the foregoing steps. This procedure generalizes to the case of more than just two sinusoids in the sum (5).
~'x107 -('174Q 449 7 ' ' ' ~ ~ OLI
I
I--
0
(c)
L/a)
(c')
Fig. 1. Spectral decomposition of a sum of two sinusoids using axis-crossings of the original and the filtered series. (a) Fifty observations from (5) with N = 4 5 0 , D~=179. (a') Fifty observations from the filtered series (6) with N = 450, DH = 107. (b) Discrete spectrum of (5). (b') Discrete spectrum of filtered series using (8). (c) Estimate of the dominant frequency using D 1. (c') Estimate of the dominant frequency using D , .
order crossings have been discussed in a number of papers, starting with those of Kedem and Slud [8, 9]. Let V be the difference operator and L a linear filter with transfer function H. Define DHk =-- ~ o f axis-crossings by v k - l L ( Z t ) ,
t=l,...,N. 3. Detection by higher order crossings In order to fully exploit the idea presented in the previous section we introduce the higher order crossings in conjunction with linear filters. Higher Signal Processing
co is called the sequence of The sequence {D Hk}k=l higher order crossings. When k = 1, D , I -= DH. When L is the identity filter so that H =- 1, we write Dk for Dnk. The higher order crossings spectral
133
B. Kedem / Periodicitiesfrom axis-crossings
representation in conjunction with linear filtering is given by
cos['rr E Dnf~ -i- ) w
f
~ cos(to)[sin ½to]2u-1)lH(to)12 dE(to) [sin ½to]~(J-')lH(to)l ~ dF(to)
(9) j = 1, 2 , . . . . From [6, 7] or directly from (9) we can deduce the following results. 3.1. Dominant frequency band. Assume 0 < a < b~<~r and
O,
to~[a,b].
Then a < ~r E D m / ( N - 1) < b.
3.2. Dominant frequency. Assume IH(tor)[>0. If or, -> oo, then 1)-> (.Or .
"reE D H t / ( N -
3.3. Sinnsoidal limit. Assume 0 < E D H 1 . If E D . 1 = E D H 2 , then L(Zt) is a pure sinusoid with period 2 ( N - 1 ) / E D m .
then EDHj
j
The message relayed by the {DHj} can be attributed in the main part to the first three properties. The Monotonicity Property 3.5 is a consequence of Properties 3.1 and 3.2 because the difference operator is a highpass filter and so are its iterates. Each time V acts on L(Zt), the spectral mass is pushed towards the higher frequencies. This means that higher and higher frequencies become dominant, thus attracting monotonically the ~ E D n j ( N - 1) until convergence to the highest frequency in the spectrum of L(Zt) is achieved as j--> oo. The third property is remarkable. When L(Zt) is a sinusoid, we have only one frequency in its spectrum, this being obviously dominant irrespective of W. Ir~ this case, the E DHj are the same for all j. But the third property claims the converse, namely, when the expected values of the first two higher crossings are equal, L(Z,) is a pure sinusoid. The equality between E DHI and E DH2 implies the equality of all the higher order crossings! This and Property 3.4 imply Property 3.7. Property 3.6 is immediate from representation (9). More properties can be obtained from (9) but we shall discuss them elsewhere as they do not exactly fall within the context of the present paper. 3.1. Strategies
3.4. Highest frequency. A s j -->oo, 7r E D H J ( N -- 1) converges to the highest frequency present in the spectrum of L(Z,). 3.5. Monotonicity E D m <-EDH2 <~ ' ' ' < ~ N - 1 .
3.6. Weighted average. Assume o'2E= 0. Then ~r E D m tol<~ - N- - 1< ~
-
~r E DH2 N< - 1~
. . . <~top.
3.7. Convergence to xt. Assume [H(to)[2>0, 0 < to ~<~. If ~ is included in the spectrum of L(Z,),
What emerges from this discussion is a new framework for discovering discrete frequencies in which the two key terms are frequency dominance and monotonicity of higher order crossings. We can think now of several strategies which utilize the DHj in our detection problem. For simplicity assume that there is no noise or that its power has been reduced sharply. Strategy I (a) Do not filter Zt. Get the highest frequency from xr E D J ( N - 1 ) -> top, j-> oc). Vol. 10, No. 2, M a r c h 1986
B. Kedem / Periodicitiesfrom axis-crossings
134
(b) Operate on Zt with an ideal lowpass filter L whose transfer function satisfies
2f>0,
0 < to ~< to~,
IH(o~)l ~=o, topknot=.
(c) Replace top by top_, and repeat steps (a), (b) for L(Z,). Here, use the sequence E D , j , j = 1,2,.... (d) Repeat for all frequencies, each time operating with an ideal lowpass filter which removes the respective highest frequency. In each step, the highest frequency present is detected by the convergence of the corresponding sequence of higher order crossings. As an example of this strategy consider model (1) with no noise component and five sinusoids with amplitudes and frequencies as given in Table 1.
Coefficients of five sinusoids with given frequencies to~
1 2 3 4 5
0.75 1.25 1.60 2.00 2.30
Ai
x/A~ + B/2
Bi
!.034 -0.250 - 1.462 0.817 0.873
1.220 1.090 2.266 -0.501 - 1.660
1.599 1.118 2.696 0.958 1.875
k
IH(to )12 = ( l + (sin ½w-/sin ½tob)26) ' 0~< to ~ , r , k = 1 0 .
(10)
This filter is not an ideal lowpass filter but resembles it; tob is the cutoff frequency. The filter coefficients were computed using program LPSB in [11, p. 421] by converting cycles per unit time Signal
Processing
N-I
7/"
.ORIGINAL SERIES
o/e~o~e--e--e--o--
%
o--o--o--e--e--e
. / " / r O b = 2.299 w5 w2 wI
o/o
~ o_
./
o../qb~O --o --o--
._/.
.~.~'-"
oF_ ~ . . o,-.- o _ o _ o _ o.......o ~ ' t ° ~
. / ~ b - _ I, 5
o--e
--e--
e--o--o--o-
• - o-
•
. . . . . . . . . . . . . . .
0
But the figure reveals an additional important feature: on its way to the highest frequency, the
~r DHj/ ( N - 1) tend to pass through small neighborhoods of the true frequencies. In other words, it
We see that 1.6 is the dominant frequency which means that it will be rather easy to detect this frequency. 2.00 is the least emphasized frequency so that its detection is the most difficult. In order to follow this strategy we use the sine-Butterworth lowpass filter with transfer function whose squared gain is given by 1
TrDHj
Fig. 2. Convergence of ~ D n J ( N - 1 ) before and after linear filtering with the sine-Butterworth Iowpass filter with N = 450.
Table 1
i
to radians per unit time. Given tob, this filter, ideally, removes from the spectrum the band above rob. The filter is applied each time a new frequency is detected. It is seen from Fig. 2 that convergence to the respective highest frequency is rather fast and in fact exponentially fast as is immediate from (9).
seems worthwhile to consider the entire sequence {,r DH~/(N - 1)}~=1, as it may contain values which estimate a subset of the true frequencies very well. Thus, for example, the graph of the Tr D J ( N - 1), j -- 1 , . . . , 16, of the original series (i.e., not filtered) already contains estimates of the frequencies 1.6, 2.0, 2.3. This gives rise to our next strategy. We remark that Strategy I is recommended only when the frequencies are well separated since otherwise more than one frequency (not just the highest) may be removed when the series is lowpass filtered. This is so since ideal lowpass filters are not within practical reach. The next strategy builds on the above observation that higher order crossings tend to 'visit' the true frequencies as they increase or converge to the highest frequency in the spectrum. As the higher order crossings increase, successive frequencies tend to be dominant one after the
B. Kedem / Periodicitiesfrom axis-crossings other, each time attracting the corresponding (normalized) axis-crossings. As an illustration consider a discrete spectrum as in Fig. 3. First, the series
135
rrDHj N-I 7/"
3
( I +B) 2 (a) o
II,
~I
ta2
• ~E']
gqfi] I;1"
i lld
"
. [] []E,]
('a3
[][] r;1[] • " " [] IIN,I*B) I0 (b) I
I
0
I,
I
.
:
""trDH~/(N-I)
:
-
:
5
.
•
:
:
:
:
j
I0
Fig. 4. Normalized higher order crossings visiting true frequencies following lowpass filtering. A visit is indicated by a square.
,ll
,,I
i
to
k'trDH2/(N-I)
(d)
i
I
N'rDHI/(N-I)
(c)
i
"tr
Fig. 3. Ideal behavior of higher order crossings following a lowpass filter. The sequential operations are L, VL, V2L. (a) Discrete spectrum of a sum of three sinusoids. (b) Effect of a lowpass filter, to t becomes dominant and is detected by D H. (c) Effect of a first difference after lowpassing, to2 becomes dominant and is detected by DH2. (d) Effect of second differencing following a lowpass filter, to3 becomes dominant and is detected by DH3,
is operated on with a lowpass filter, then DH1 , DH2 , and DH3 (normalized) visit the true frequencies. This illustration is ideal and, in most cases, in practice we cannot expect it to hold exactly. Nonetheless, this tendency of higher order crossings is exhibited in many cases provided the lowpass filter is chosen properly. Only extremely simple filters are needed. As an example of this tendency, again consider the five sinusoids treated above whose coefficients and frequencies are given in Table 1. The series was operated on with the simple summation filter L = (1-4- B) K, B being the shift Bx, = x,_~, and K = 2, 10. The results are given in Fig. 4. The figure shows that all frequencies were visited. Our problem then is to choose from among the ~ D H f l ( N - 1 ) those which visit true fre-
quencies. For this purpose we suggest the use of the periodogram except that now it is evaluated on a small set of points which are known to include good estimates of true frequencies. We can now formulate our next strategy whose use we recommend. lI (a) Apply one or more bandpass filters to the series. (b) Observe the sequence ~ t D H j / ( N - 1 ) , j = Strategy
1 ....
, K
(e.g.,
K = 10).
(c) Evaluate the periodogram at the above sequence. _i~D,j N-1
t
} 2
.
(11) Look for peaks in the periodogram. From a practical point of view, the evaluation of (11) for various filters or combination of filters has proven to be very useful as our examples with real data show. Actually, we use the normalized periodogram
,
k=l\N--1] Vol.
10, No.
2, March
1986
136
B. Kedem / Periodicitiesfrom axis-crossings
It should be noted that (12) should be used with caution because some or even all of the D,~ may be the same. In this case, corresponding ties contribute to the same frequency so that the relative intensity (12) should be multiplied accordingly. Such is the case, for example, when the process is very close to a sinusoid.
4. Bounding the lowest discrete frequency
oo
~ aJZt_j,
j=o
0
(13)
This can also be expressed as L(Z,) = (1 - a)Z, + aL(Z,_O.
-')1, _ I H ( t o ) l 2 dF(to)
a ~ 1,
(16)
so that ~r E D n , / ( N - 1 ) - - > 0. It follows that there exists an a close enough to unity such that (17)
Also, as a -~ 1,
cos ['rr , E Dn2~) Ol c o s to, + - . O.12..[_
+ •
c o s to, + •
•
cos(½
)
+O'p+0% 2
2
,t
(18)
the last expression being a weighted average of c o s ( t o , ) , . . . , cos(%), and cos(½~) (=0). The complete statement is that and if to1 ~
- - ~ < N-1
or E D . 2
tol<~ -
N-1
as a-> 1.
(19)
When tol~>½rr, formula (19) is no longer true but (17) obviously holds. For this case, however, ½~r replaces to~ in (19). It is also seen from (18) that ifa dominant frequency to, is present in the discrete spectrum, then, for a reasonable signal to noise ratio, ~ r D n 2 / ( N - 1 ) will be close to this frequency. The above argument can be demonstrated very effectively by viewing the graph in Fig. 5 of the squared gains of the first order lowpass recursive
(14) 1.60
The squared gain is (1-a)2 IH(to)12 - 1 - 2 a cos to + a 2"
)lH(to)l 2 d F ( t o )
"rr E D n l / ( N - 1) <~ 091.
A useful feature of {D.~} is that it enables us to bound the lowest frequency tol by using a proper lowpass filter. If such a filter is not used, then the (normalized) D . j start off, on their way to the highest frequency in the mixed spectrum, at the dominant spectral region, and as a consequence all information about to1 is lost. To overcome this difficulty, the series is lowpass filtered so that the spectral mass is pushed towards very low frequencies which then become dominant, attracting ~r E D m / ( N 1) to a region below or close to 601. Almost any lowpass filter will do. Thus, the lowpass filtering is used in order to distribute the normalized higher order crossings throughout the discrete frequency range. This point can easily be demonstrated by considering a simple first-order recursive filter defined by L(Z,)=(1-a)
cos(o --qT
(15)
As a ~ 1, H(to) approaches a narrow unit spike at the origin, and vanishes otherwise. Therefore, when cr2~> O,
1,20
1,00 0.80 0.40 I
cos
['tr E Dm'~
Signal Processing
]-
0,8
I~
2,4
3,2
Fig. 5. Squared gains of the first-orderlowpass recursivefilter.
137
B. Kedem / Periodicities fiom axis-crossings
filter (15) for various values of a. The same effect can be achieved with sequential summation and the sequential sine-Butterworth filter when the cutoff frequency approaches zero.
5.
"rr D Hj / 549 77"
o//ojO
/
Applications
I
5.1. Variable star series In order to illustrate the use of the Whittaker periodogram, Whittaker and Robinson [ 12] supply the now well-known series of the magnitude (i.e., measure of brightness) of a variable star on 600 successive midnights. This set of data has been analyzed by m a n y authors including Whittaker and Robinson and it is by now well established that the series displays two significant periods of approximately 24 and 29 days. It is interesting to see what a higher order crossings analysis yields. We have used only 550 data points (see Fig. 6).
0.2652 0,2174
•
o...- o - - ' ° ~ ° ~ ° - - 4
/
/ • • •
,
,
I
5
I0
, j 15
I* (-n-DHj/549)
(D)
I
0,6673 0,5 0.3326 0
32.000
(a)
I I
I
. . . .
5
,
. . . . . . . .
Io
_- j
15
Fig. 7, Normalized higher order crossings from the variable star data visit significantfrequencies. (a) Higher order crossings extracted after an application of simple summation followed by the first-order lowpass filter with a =0.999999. (b) Normalized periodogram ordinates evaluated at ~rDHJ549, j = l .... ,16.
24.000
16.000
The results in Fig. 7 very clearly show that the variable star series mainly consists of two sinusoids with periods
8.000
I00.0
200.0
500.0
400.0
500.0
600.0
Fig. 6. Variable star series. We have applied to the series simple summation followed by the first-order lowpass recursive filter with a very close to one in accordance with the foregoing discussion. The normalized higher crossings "rr DHJ549,j -- 1 , . . . , 16, with the corresponding normalized periodogram ordinates l*(~r D , J 5 4 9 ) are given in Fig. 7. For a rough significance test we use a modification of Fisher's test which with n -- 16 and at 0.05 significance level yields a critical value of 0.32 (see Appendix A for a discussion of this point).
2~r 0.2174
"= 28.9015,
2rr 0.2632
= 23.8723.
But the figure reveals more, namely, the ~r DHJ549 continue to increase towards n which is indicative of the presence of noise. Also, the two periods were detected very fast by DH2 and DH3 so that we did not have to consider very high Dnj. In many cases, it is sufficient to consider no more than ten higher order crossings. Thus, the higher order crossings did indeed visit the two significant frequencies and the periodogram was evaluated very economically only at 16 points instead of the 275 or so required by the usual harmonic analysis. Vol. 10, NO. 2, M a r c h 1986
B. Kedem / Periodicities from axis-crossings
138
In general, harmonic analysis supplies [½N] points at which the periodogram is evaluated while our procedure calls for, by far, fewer periodogram ordinates, perhaps sixteen, regardless of N. 5.2. Sunspot series
Wolfer's sunspot numbers from 1749 to 1924 ( N = 176) are given in [1, p. 660] (see also Fig. 8),
(a)
2(N-I)/Dj 10.00 8.158
\
3.78 2.00
°~o • •
i
160.000
~i
,j
I'0
I*('rr Dj/155)
([3)
120.000
0,524
80.000
0.450
40.000
. . . . . . . .
5
I
0.0
40.0
80.0
120.0
160.0
200.0
Fig. 8. Graph of the Wolfer's sunspot numbers from 1749 to 1924.
together with the usual periodogram analysis. It is generally agreed upon that the series exhibits a dominant periodicity of a little over eleven years in addition to some other less dominant periods. As with the yariable star data, a rough significance test applied to the periodogram ordinates in Figs. 9 and 10 reveals that the periods 8.158, 10.000, and 11.308 are significant. By applying simple summation followed by the first-order recursive lowpass filter with a = 0.999999 we also detected the period 11.071. The periods are computed from the formula 2 ( N - 1)/D. These periods very much resemble those obtained by Damsleth and Spj~tvoll [2], who used a stepwise least squares procedure combined with harmonic analysis (see also [1, p. 662]). 5.3. Detection o f periodic signal in noise
Hinich [4] suggests a method for detecting a periodic signal in additive noise when the Signal Processing
~ j
I0
Fig. 9. Normalized higher crossings visit significant frequencies in the raw (centered) sunspot series. Periods are found by the formula 2 ( N - 1 ) / D j . (a) Periods suggested by higher order crossings from raw data ( N = 1 5 6 ) . (b) Normalized periodogram ordinates evaluated at ~r Dj/155.
amplitudes of the fundamental and the first M - 1 harmonics are nonzero. Specifically, the usual periodogram analysis may not perform very well for a model of the form M
Z, = a ~, m -'/2 sin(~6"trrnt) + e,,
(20)
m=l
e, being white noise, when a is small. Hinich proposes a new measure, the harmogram, which outperforms the periodogram in detecting the fundamental frequency. It is interesting to see what a higher order crossings analysis can do for models such as (20). With a = 0.3, M = 5, Var(e,) = 1, and N = 550, the graph of (20) is given in Fig. 11. We have applied to this series sequential summation (1 + B) 3° followed by the first-order recursive lowpass filter with a = 0.75. From Fig. 12 we quickly estimate the fundamental frequency as 0.3948
B. Kedem / Periodicities from axis-crossings
Ca)
42,0'. 2(N-I)/DHj ,( ,(
12.25 11.508
\
139
(a)
"n-DHj / 549 77"
\° \°
\
",,. \
I •
•
o.~o
o/,O •/e~'° • • e~.,o~°~°-°~°~e-e'-°
0.3948 ,
2.00
j
,j
I*(rrDHj/549)
(b)
(b)
I t I" (rrDHj /148) 0.81
0.438 0.12
,I.,...,,I,l.J,, I
i
.
.
.
.
I
.
.
.
.
.
=j
I0
15
j
10
5
5
Fig. 10. Normalized higher crossings visit a significant frequency in the filtered sunspot series. (a) Periods suggested by higher crossings after an application of sequential summation (1+ B) t° and the first-order lowpass filter with a =0.9 (N = 148). (b) Detection of the period 11.308.
Fig. 12. Higher crossings analysis of the series in Fig. 11. (a) Normalized higher crossings extracted after applying the filters (1 + B) 3° and the first-order lowpass with a = 0.75. (b) Normalized periodogram ordinates evaluated at normalized higher order crossings.
6. A test for white noise
1 I
I
I
I
1
I
1
I
I
I
Fig. 11. A time series from (20) with a =0.3, M = 5 , and ez~ N(0, 1), the fundamental being iz~r, 2
which corresponds to the period 2~
0.3948
= 15.9148.
Observe that the true period is 16.
A useful way for testing for white noise is furnished by the D i themselves. It is clear that higher order crossings which are mainly due to periodic c o m p o n e n t s in the data will be quite different from those derived from white noise. The main obstacle, however, is the fact that there are no simple expressions for Var(Dj), except f o r j = 1. In this case, under white noise, Var(D1) = ( N -
1)(1 -A~I))A~ ~),
(21)
where A~j) = P ( V J - l e t >i 0J v J - l e , _ l I> 0).
(22)
This is J-5] because, u n d e r white noise, D I b ( N - 1, 1-A~I)). Interestingly enough, such VOI. 10, No. 2, M a r c h 1986
R Kedem/ Periodicitiesfrom axis-crossings
140
simple formulas are not available for Dj for small j~>2, but it was shown by Kedem and Reed [10] that, as j-* oo, white noise implies V a r ( D J N~/-NZT1) - 1.
POWER I
Q
0.9
o
(23)
Thus, owing to the fact that
o
• POWERPOINTBASEDON DI * POWERPOINTBASEDON (DI,...,DI6)
{d~J)} is j-dependent
for a n y j (see Appendix A) we have the asymptotic result that forj-= 1 and also, for sufficiently large j,
II
g
0.5
Dj - ( N - 1 ) ( ~ + 1 s i n - ' ( ~ . 1))
[ (N-1)(1-(l\4
\'tr s t n ' - l { j - l ' ' ~ 2 ) ]
0.05 0 O15 1
-" N(0, 1),
N-+ oo.
(24)
We obtain 95% probability limits for Di:
1,(r s,n (T)) 1
+1.96[(N_1)(1
1
•
-1
j-1
, s,n (T)))] (1
. _, j - 1
=
,/2
(25) j = 1, and large j. It is, however, useful to consider (25) also for small j i> 2. The sensitivity of a two-sided test based on (25) is studied via a power calculation in Fig. 13. The hypothesis is that of white noise versus a sinusoid plus noise as in (1) with p = 1. The signal to noise ratio (SNR) is defined as the ratio of the variance of the sinusoid to the variance of the noise. A test based on D1 alone is clearly less powerful under the alternative than one based on D ~ , . . . , D r , K = 16, say, but at the same time has a lower significance level and provides a fast decision rule. It is interesting to note that as the SNR increases, the difference in power between the two tests diminishes. This is obviously in agreement with the SLT. On the other hand, the probability of rejecting a true hypothesis in the test from D 1 , . . . , D16is quite large and is due to unconservative limits provided by (25) for small j t> 2. This is an indication that the binomial approximation for SignalProcessing
2
4'
3
SNR
Fig. 13. Power estimation of white noise test. The alternative is a sinusoid (to = 0.5) plus noise (~rz, = 1.0). The hypothesis is rejected when at least one of the Dj falls outside the limits in (25). Each entry is obtained from 100 independent realizations with the corresponding SNR (N = 450). Dj, small j/> 2, is not quite adequate. Still, we find these limits useful. The probability limits in (25) are applied to the real data series examined earlier (see Figs. 14, 15, and 16). For comparison we also apply these bounds to a white noise series. As can be seen, repeated linear filtering causes the Dj (j increasing) from the nonwhite noise series to behave as though they came from white noise and they seem to lose their discriminatory power. For each case we also record the value of @2 from D 1 , . . . , Ds. This quantity is discussed [8, 9]. Large values of @2 indicate deviation from white noise. The @2 are defined by the well-known quadratic form
@2=_ j=l where
ms E Aj -~
mj and
IDI,
Aj=
Dj-Oj_l ' [ N- 1-Dr_l,
j=l, j=2,...,K-1,
j=K.
Under the hypothesis of white Gaussian noise, the mj are obtained from the cosine or arcsine formula
B. Kedem / Periodicities from axis-crossings
141
520.000
Lower limit
D
Upper limit
480.000
251.538 344.351 381.688 403.373 418.013 428.765 437.101 443.814 449.374 454.079 458.129 461.664 464.787 467.571 470.075 472.343 474.409 476.304 478.048 479.662
279.000 371.000 401.000 425.000 437.000 444.000 451.000 461.000 465.000 469.000 475.000 475.000 479.000 487.000 493.000 493.000 497.000 499.000 499.000 499.000
297.452 387.649 422.355 442.029 455.081 461.538 471.782 477.560 482.303 486.285 489.689 492.641 495.232 497.529 499.584 501.436 503.116 504.649 506.054 507.349
440.000 400.000 560.000 520.000
280.000 240.000 I
I
4.0
I
I
8.0
[
I
12.0
[
]
16.0
I
I
20.0
Fig. 14. Probability limits (25) for twenty higher order crossings from white noise ( N = 550, ~b2= 5.53). The higher crossings are represented by the dotted line. Hypothesis of white noise is accepted.
800.000
600.000
400.000
200.000
0.000
I 4.0
I 8.0
I 12.0
I 16.0
I 20.0
Lower limit
D
Upper limit
251.538 344.351 381.688 403.373 418.013 428.765 437.101 443.814 449.374 454.079 458.129 461.664 464.787 467.571 470.075 472.343 474.409 476.304 478.048 479.662
38.000 60.000 253.000 390.000 436.000 463.000 479.000 495.000 501.000 503.000 505.000 507.000 507.000 509.000 509.000 509.000 509.000 511.000 511.000 511.000
297.462 387.649 422.355 442.029 455.081 464.536 471.782 477.560 482.303 486.285 489.689 492.641 495.232 497.529 499.584 501.436 503.116 504.649 506.054 507.349
Fig. 15. Probability limits (25) for twenty higher order crossings from the variable star series ( N = 550, 4,2 = 1779.72). Most higher crossings are outside the bounds, and the hypothesis of white noise is strongly rejected. Vol. 10, No. 2, March 1986
B. Kedem / Periodicitiesfrom axis-crossings
142 200.000
160.000
120.000
80.000
40.000
I 4.0
I 8.0
I 12.0
I 16.0
I 20.0
Lower limit
D
Upper limit
64.378 90.571 101.304 107.598 111.876 115.034 117.494 119.481 121.132 122.533 123.743 124.801 125.737 126.574 127.327 128.011 128.635 129.208 129.737 130.226
31.000 36.000 81.000 107.000 119.000 121.000 125.000 125.000 127.000 131.000 131.000 131.000 131.000 131.000 133.000 137.000 137.000 137.000 137.000 137.000
88.622 113.429 122.773 128.005 131.445 133.918 135.802 137.295 138.516 139.535 140.404 141.153 141.809 142.389 142.906 143.370 143.790 144d72 144.521 144.842
Fig. 16. Probability limits (25) for twenty higher order crossings from the sunspot series ( N = 155, I]/2 = 281.42). The first four higher crossings are not within the bounds and the hypothesis of white noise is rejected.
discussed in Section 2, namely,
/1
1
1
In this and previous work we let K = 9. Values of ~2 above 30 indicate significant deviation from white noise at level less than 0.05.
7. Afterthought We have shown that the axis-crossings of filtered time series are useful statistics in the search for hidden periodicities in periodic time series. The method outlined still makes use of the periodogram but only at, by far, fewer points than in the usual periodogram analysis. For long series this is an advantage. In fact, in order to have a quick idea about the frequency content of the time series at hand, it is sufficient to examine higher order crossings only without recourse to the periodogram at all. On the other hand, our method does not have a built-in decision rule which tells us when to stop the search other than a consistent search throughout successive frequency bands. Signal Processing
We note that the particular model (1) we chose and the normal assumption are not crucial. After all, a dominant frequency attracts the (normalized) number of axis-crossings regardless of the model a n d / o r the normal assumption.
Appendix A. Testing for periodogram ordinates at higher crossings Our procedure calls for a test based on periodogram ordinates evaluated at the normalized higher order crossings. This alteration means that the periodogram is evaluated at random points in [0, "tr] rather than at equispaced fixed points in this interval. It will be shown that under the assumption of white noise these periodogram ordinates are distributed very much as those evaluated at Fourier frequencies. Our main interest here is to obtain an approximation to the distribution of the maximum ordinate of
I*(~rDJ(N-1)), in (12).
j - - 1 , . . . , K,
B. Kedem / Periodicities from axis-crossings
So, let Z,=-e, be white Gaussian noise. It is convenient first to assume that the higher crossings can be obtained from an independent stretch of length N from e,. Denote these higher crossings by Dr. Then ~ ~ e, and Dr, any j, are independent. Define %, flj by ,:1-
Df "~ ,cos[rrv:Tt),
and so %, (9/k are asymptotically independent normal random variables with mean zero and variance o-2~. The same holds for (flj, Bk) and for (%,/3k). Because
I (rr \ ~ _D - ~; ~/ = aj2+ ~j,2 it follows that, under the hypothesis of white Gaussian noise,
For any fixed X as N increases, D f / ( N - 1 ) converges with probability one to a number in the open interval (0, ~r), where the limit is monotone in j. Assume N is large. By independence we have
E%=Efl;=O
for allj.
['trD;~ I\N_I],
Dj--d~J). + . . . + d ~
2 ~G +O(1/N),
O(1/N),
\N-1
d~J>= x~J) + x~~>_ 2x~J)x~i_),
j=k, j#k.
and
Likewise,
x~j)={
j=k, j#k,
= lo-2.+O(1/N), tO(l/N),
Cov(flj, ilk) and
Cov(aj, ~k) N
,=,
= O(1/N)
-
sin ~ , N 2 - - [
,
for all j, k
(see [1, p. 138]). More generally, since, for any two random variables X,Y, E X = E E ( X I Y ) , we have E ei(4"";+'b:"k)=EE
e i('¢'''j+4'2"D
N~I'
1 2 2 1 2 2 -~ exp{-~¢~b,} exp{-i¢r,&2},
N~oo,
),
where
2°'2~E ~ cos t N ,=, \ ~2--~
=
j=l,...,K,
are asymptotically independent cGX(2) 2 2 random variables (compare with [I, Theorem 8.4.4]). That the /97 may effectively be replaced by the regular Dj from the white noise series at hand may be seen from the fact that D; has the representation
Cov(aj, ak)
=
143
1,
vJ-lgt
~>0,
0,
otherwise.
Thus, d(,j) is a binary random variable dependent on ( e , , . . . , e,_j) only. It is seen that, for every fixed j, the sequence {d{j)} is j-dependent. Therefore, Y E,/x/-N and Dj/#--N are asymptotically jointly normal so that the quantities Y~e,/x/N and D i N are asymptotically independent. Thus, asymptotically in N and for sufficiently large K, K ~ N, ~rDj
1
'¢
- I - ( I - e-X/2) K,
(A.I)
where the denominator in (A.I) behaves roughly as 2 . Now, since
1 - ( 1 - e - X / 2 ) K ~ K ( 1 - x / ( 2 K ) ) K ', we finally arrive under white noise at the approxiVoL 10, No. 2, March 1986
144
B. Kedem / Periodicitiesfrom axis-crossings
References
mation for sufficiently large K < N, P t max I * ( ~r D J ~ > ~ K LI~j~K \ N - 1] ~K(I
--~-x~ r - ~ -2K/
}
(A.2)
This is precisely the first term in Fisher's exact test for p e r i o d o g r a m ordinates (see [1, p. 129]). Thus, with N = 550, K = 16, and x / ( 2 K ) =0.33, relation (A.2) gives 0.039 while its estimate obtained from 200 white noise series gave 0.04, a surprisingly g o o d approximation. For x / ( 2 K ) = 0 . 6 6 , relation (A.2) and its estimate from 200 independent white noise series were both practically 0. Relation (A.2) was used in testing the significance o f the perio d o g r a m ordinates in the examples above.
Appendix B. Consistency The main point put across in this p a p e r is that the normalized n u m b e r o f axis-crossings tends to a p p r o a c h the d o m i n a n t frequency when present in model (1). It can be s h o w n that the e n h a n c e m e n t o f such a frequency by a filter renders our procedure consistent (for a detailed proof, see [7]). A real application o f our m e t h o d to speech spectra can be f o u n d in [3].
Acknowledgment The a u t h o r is grateful to M. Berger and A. Tal for stimulating discussions, and to N. G i d r o n and N. Schal for their assistance in programming.
SignalProcessing
[1] T.W. Anderson, The Statistical Analysis of Time Series, Wiley, New York, 1971. [2] E. Damsleth and E. Spjetvoll, "Estimation of trigonometric components in time series", J. Amer. Statist. Assoc., Vol. 77, 1982, pp. 381-387. [3] O. Ghitza, "A measure of in-synchrony regions in the auditory nerve firing patterns as a basis for speech vocoding', Presented at: 1985 Tampa Internat. Conf. on Acoustics, Speech and Signal Processing, Tampa, FL, March 26-29, 1985. [4] M.J. Hinich, "Detecting a hidden periodic signal when its period is unknown", IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. 30, 1982, pp. 747-750. [5] B. Kedem, Binary Time Series, Dekker, New York, 1980. [6] B. Kedem, "On the sinusoidal limit of stationary time series", Ann. Statist., Vol. 12, 1984, pp. 665-674. [7] B. Kedem, "Detection of hidden periodicities by means of higher order crossings, Part I", Tech. Rept. TR85-55, Dept. of Mathematics, Univ. of Maryland, 1984. [8] B. Kedem and E. Slud, "On goodness of fit of time series models: An application of higher order crossings", Biometrika, Vol. 68, 1981, pp. 551-556. [9] B. Kedem and E. Slud, "Time series discrimination by higher order crossings", Ann. Statist., Vol. 10, 1982, pp. 786-794. [10] B. Kedem and G. Reed, "The asymptotic variance of higher order crossings", Tech. Rept. TR82-43, Dept. of Mathematics, Univ. of Maryland, 1982. [11] R.K. Otnes and L. Enochson, Digital Time Series Analysis, Wiley, New York, 1978. [12] E.T. Whittaker and G. Robinson, The Calculus of Observation, Blackie & Son, London, 1924.