Physics of the Earth and Planetary Interiors 113 Ž1999. 247–263
Multi-component autoregressive techniques for the analysis of seismograms M. Leonard b
a,)
, B.L.N. Kennett
b
a Australian Geological SurÕey Organisation, GPO Box 378, Canberra, ACT 2600, Australia Research School of Earth Sciences, The Australian National UniÕersity, Canberra, ACT 0200, Australia
Received 23 December 1997; accepted 10 June 1998
Abstract Autoregressive methods provide a very useful means of characterising a seismic record; calculating the power spectra of a seismic record and determining the onset time of different classes of arrivals. The representation of a time series with an autoregressive ŽAR. process of low order can be applied to both multi-component and single-component traces of broadband and short period seismograms. In three-component analysis the AR coefficients are represented as second-order tensors and include potential cross-coupling between the different components of the seismogram. Power spectrum estimation using autoregressive methods is demonstrated to be effective for both signal and noise and has the advantage over FFT methods in that it is smoother and less susceptible to statistical noise. The order of the AR process required to resolve the detail of the spectra is higher for a complex signal than for the preceding noise. This variation in the weighting of the AR coefficients provides an effective way to characterise data in a similar way to Spectragrams and Vespagrams and can be achieved with as few as five AR coefficients. For three-component analysis a display of the nine AR coefficients can be readily organised with three AR-grams for each of the original data components. The various elements of the AR tensor coefficients reflect different changes in the seismogram. The presence of secondary phases is often clearer on a cross-correlation AR-gram ŽNE or EN. than on the autocorrelation AR-gram ŽNN or NE.. The variations in the weighting of the AR coefficients can be exploited in two different styles of approach to onset time estimation Žphase picking.. In the first method, two different AR representations are constructed for different portions of the record and the onset time is estimated from the point of transition. In the second method, a single AR representation is constructed and the onset time estimation is based on the growth of a component which is not represented by the AR process. Both methods can be applied to both single and three-component data. For large impulsive P phases, both methods picked the onset time within two samples of the manually estimated onset time. For S phases, where the energy is present on all three components, three-component AR onset time estimation is preferred to that using a single component. The approach is very robust with the three-component method picking the onset time of a very small S phase on a broad-band record to within 0.5 s of the best manual estimate. q 1999 Elsevier Science B.V. All rights reserved. Keywords: Autoregressive methods; Seismograms; Time series
1. Introduction )
Corresponding author
A relatively simple means of characterising a time series is in terms of an autoregressive ŽAR. process
0031-9201r99r$ - see front matter q 1999 Elsevier Science B.V. All rights reserved. PII: S 0 0 3 1 - 9 2 0 1 Ž 9 9 . 0 0 0 5 4 - 0
248
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
in which the current value depends on a linear combination of past values. The order of the representation depends on the number of past values employed. Such representations have often been used for single component seismic data, but have particular value in three-component studies. Much of the ambient seismic noise is provided by microseisms which are dominantly Rayleigh waves with a strong link between vertical and horizontal components of motion. Although P waves are well represented on the vertical component, S related phases usually have their largest amplitude on horizontal components. Autoregressive procedures are of considerable value in the analysis of seismograms. We demonstrate the way in which such methods can be used to estimate a power spectrum with a relatively short data window. We show that typical seismic noise is well represented with a relatively low order autoregressive process, whereas seismic signals usually require higher order representations. The differences in the nature of the AR processes can be exploited by using a visual representation of the evolution of the autoregressive coefficients in time. The AR-gram can be calculated from a single component trace. It is of particular value when used with a three-component record to display the nine components of the tensor coefficients and indicate the interactions between the different components of the ground motion. The variations in the value of the AR coefficients which are exploited in the AR-gram can be quantified to provide estimation procedures for the onset times of both P and S phases using three-component data. High quality time picks can be obtained even where the signal-to-noise ratio is relatively low.
2. The autoregressive model A portion of a three-component seismogram may well contain both desired signal and other unwanted information. This unwanted information is classified as ‘noise’. This noise may be microseismic noise from distant storms or might be the unrecognised coda of an earlier event. In the analysis of seismic information we want to use as simple a description of the character of the seismic trace as possible. If
this description is sensitive to the subtle changes in the seismic trace it may indicate the presence of significant information. One simple description of a digital time series is in terms of an autoregressive ŽAR. model of order m in which the current value is predicted by a linear combination of a set of m previous values. For a vector seismogram u t the mth order AR model would be u AR t s A 1 u ty1 q A 2 u ty2 q PPP qA m u tym ,
Ž 1.
where the A i ’s are second-order tensor coefficients Žwhich can be represented as 3 = 3 matrices., e.g., a i11 A i s a i21 a i31
a i12 a i22 a i32
a i13 a i23 , a i33
0
Ž 2.
where, e.g., a i12 is the coefficient for a time delay of i samples relating component 1 at the current time to component 2 at the delayed time. Such an AR process will not normally provide a complete description of the seismogram and so the actual values of the seismogram can be represented in the form m
ut s
Ý A i u tyi q zt ,
Ž 3.
is1
with a ‘noise’ component zt in addition to the contribution from the mth order AR model. In an ideal environment the component zt will represent a vector of white noise elements, and in this case we can use a procedure based on the work of Robinson Ž1957, 1963. to derive an algorithm to determine the tensor AR coefficients A i . For a random white-noise sequence u t , the present value of zt will not depend on the prior values of u t , i.e., the tensor cross-correlation ² u t u s : s 0 for s - t. Thus, taking the product of Eq. Ž3. with u s and averaging over the stochastic sequence, m
² ut us: y
Ý A i ² u tyi u s : s 0, is1
s - t.
Ž 4.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
We introduce the tensor autocorrelation r t with the structure r 11 Ž s .
r 12 Ž s .
r 13 Ž s .
r s s r 21 Ž s . r 31 Ž s .
r 22 Ž s .
r 23 Ž s . , r 33 Ž s .
r 32 Ž s .
0
Ž 5.
where, e.g., r 12 Ž s . is the cross-correlation of component 1 with component 2 at an offset of s samples. Then, we can rewrite Eq. Ž4. as m
rk s
Ý A i r kyi ,
Ž k s 1,2, . . . ,m . .
Ž 6.
is1
This system of equations for the tensor coefficients A i can be represented schematically as
r0 r1 ... rmy 1
PPP
r1 r0
..
. PPP
rmy1 rmy2 ... r0
A1 A2 ...
r1 r2 s . .. rm
0 0 0 Am
Ž 7.
PPP
r1 r0
..
. PPP
rm rmy1 ... r0
< A1 ...
a 0 s .. .
Am
0
0 0 0
AIC
s y2 Ž maximum log likelihood. q2 Ž number of coefficients. ,
Ž 10 .
s y2 N log n q 2 m,
Ž 8.
where m
asr0 q
tensor forms for the AR coefficients A and the autocorrelation are replaced by scalar quantities. The resulting Toeplitz matrices can be solved using the recursive development of Levinson Ž1947.. Now, if we can develop a representation of a segment of a seismogram such as microseismic noise in terms of such an autoregressive model, we can then use this model as a filter to be applied to the seismogram. The part of the signal zt which is not represented by the AR process can thus be extracted from the original seismogram by the AR filter. In such a way we may, for example, recover the onset of a seismic phase from a noisy background. The Akaike Information Criterion ŽAIC. ŽAkaike, 1973. is the most widely accepted method for determining the length of an AR filter. For a scalar N-point time series:
2
The structure of Eq. Ž7. has the form of a block Toeplitz matrix acting on the set of tensor AR coefficients A and the equation system can be solved using a modified Levinson recursion described by Wiggins and Robinson Ž1965.. Eq. Ž7. can alternatively be rewritten as r0 r1 ... rm
249
where 2
n s
1 N
N
Ý is1
2
m
ž
Õi y
Ý a j Õiyj js1
/
,
represents the variance of the component not explained by the autoregressive process. The nature of the representation employed means that this form of the Akaike Information Criterion does not directly extend to vector time series. However, we have found that the AIC applied to a single component is normally sufficient to provide an effective estimate of the appropriate order for an AR filter applied to a vector segment.
Ý r iP is1
From the stochastic properties of white-noise we can demonstrate that
3. AR models to characterise a seismogram
a s ² u i zi : s ² zi zi : s s 2 ,
3.1. Data used
Ž 9.
2
where s is the diagonal variance matrix of the vector white noise. When attention is directed at a single component of the seismogram it can be convenient to use a simple scalar AR representation; in which case the
Table 1 lists the three events the data from which are used as examples in this paper. The Talaud Island and Fiji events were recorded at a temporary array deployment at Inveralochy, in New South
250
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Table 1 Location information of the three events are used as examples Event
Date
Hours
Minutes
Seconds
Latitude
Longitude
Depth
Magnitude
Recording station
Talaud Island Fiji Okinawa
1993304 1993312 1993219
17 19 0
2 15 0
53.6 50.3 6.4
4.6 y25.5 26.6
125.7 y176.5 y125.7
175 33 150
5.4 5.1 5.9
IVY deployment IVY deployment CTA
Wales, Australia. The deployment was around the Seismological Observatory IVY. The instrument consisted of a three-component extended period Guralp CMG-3 seismometer, buried in 60 cm deep pits. The Okinawa event data is from the seismological observatory CTA, Queensland, Australia. These records have been chosen because of their variety of noise characteristics to provide a realistic test of the performance of different AR methods. 3.2. Power spectrum estimation Estimating the power spectrum of broad band data, which typically has a frequency range of over three orders of magnitude Ži.e., 0.02 Hz–20 Hz., requires a data segment of 100 s, which at 50 samples per second is 5000 samples. A single estimate of the power spectrum has a standard deviation of 100% ŽPress et al., 1986.. Averaging the power spectrum of k 50% overlapping data segments reduces the variance by 9kr11 ŽWelch, 1967.. Averaging 11 estimates reduces the variance to 10%, but requires 550 s of data. This length of data is acceptable when calculating the power spectra of background noise but is unrealistic for estimating the power spectra of seismic phases which are often only a few cycles long. The autoregressive method of power spectrum estimation ŽBurg, 1967. produces a smooth power spectrum and uses only a single data segment which needs be no longer than the duration of the phase. Gutowski et al. Ž1978. showed that the AR method works very well for modelled data when that data has been generated by an AR process and the number of AR coefficients is the same or slightly higher than the AR model used to generate the data. When the data is generated by a Moving Average ŽMA. or an Autoregressive Moving Average ŽARMA. process or when the order of the AR model is significantly different to the actual model, the spectral
estimate is not as accurate. Typically it picks the position but not the width of the peaks. Fig. 1 shows the FFT and AR power spectrum of vertical component for two events, one large ŽFiji. and one small ŽTalaud Islands.. For each technique there are two power spectra, one for the pre-event noise and one for the P phase. The AR FFTs are displaced three orders of magnitude which helps in interpretation. The pre-event noise for the two events is quite different. Between 0.1 Hz and 2 Hz the noise power spectrum for the Talaud Islands event is one to two orders of magnitude higher than for the Fiji event. This reflects the different level of microseismic noise recorded on the seismogram. Above 2 Hz the noise is similar. Two 50% overlapping data segments were used to calculate the power spectrum, which reduces the variance to 60% ŽPress et al., 1986.. The very scattered power spectrum has been further smoothed by eight convolutions of a 3-pt triangular filter Žwhich is equivalent to a 17-pt filter.. This smoothing makes the difference between the power spectra of the signal and the noise clearer, but has detrimental effects such as broadening of peaks and troughs, and will not remove all statistical fluctuations. For example, the FFT power spectrum of Talaud Islands event ŽFig. 1b. showed two notches, one at 1.7 Hz and another at 3.7 Hz. An analysis of the SNR of the data after a series of narrow bandpass filters had been applied indicates that the notch at 1.7 Hz is not real but that the notch at 3.7 Hz is real. The AR power spectrum shows only one notch—at 3.7 Hz. For large events, the power of the signal is so high that there is no problem in clearly separating the power spectrum of the signal from the noise using an FFT. For very small events, separating signal from noise and separating statistical fluctuations from real features can be a problem. The order of the AR model Žthe number of AR coefficients. used to calculate the power spectrum
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
251
Fig. 1. Power spectrum estimates of the Fiji Ža. and Talaud Islands Žb. events. The power spectrum is calculated on the vertical channel only using two 50% overlapping 20-s data segments for the FFT estimate and a single 10-s data segment for the AR estimates.
changes the calculated FFT. If the order used is too low, detail in the FFT will be missed but if the order is too high false peaks might be introduced. We have found that an order of between 5 and 10 is adequate for describing noise. In the case of the noise prior to the Fiji event ŽFig. 2a. the model of order 5 describes the main feature of the power spectrum—its log linear decrease. The model of order 10 describes almost all the features of the power spectrum with the higher orders adding little. To adequately describe P phase the order of 10–20 is generally needed. For the Fiji event ŽFig. 2b. the order 5 model describes the major features of the power spectrum: the peak at 2.0 Hz and the log linear decrease. The order 10 model adds little. The order 15 and 25 models resolve the peak at 5 Hz. The order 25 model is possibly introducing false peaks above 10 Hz. 3.3. Single component AR-grams The power spectrum of a few tens of seconds of seismogram can be described in terms of 5–10 AR
coefficients. The changes in both the order and the value of the AR coefficients are diagnostic of the differences between noise and signal, and can be exploited to characterise seismic data. We have used the AR coefficients to characterise data by generating diagrams akin to Vespagrams and Spectragrams which we have called AR-grams. Whereas Vespagrams display slowness and Spectragrams display the power spectrum on the ordinate and time on the abscissa, AR-grams display the AR coefficients on the ordinate. We have developed AR-grams using the first 13 AR coefficients for single component seismograms Žsee Figs. 3 and 4.. The AR-grams are generated by calculating the 13 AR coefficients every 2 s using a 5-s window. Each pixel plotted is 2 s wide. The scale is white at zero, light grey for small absolute values and black for absolute values above 2. In the absence of a signal, we expect all but the first three coefficients to be light grey. When a signal is present, the values up to the order of the signal will darken.
252
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Fig. 2. Power spectrum of pre-event noise Ža. and the Fiji event P phase Žb. AR coefficients of orders 5, 10, 15 and 25 are used to calculate the AR power spectra.
The first AR coefficient of a single component AR-gram of broadband data always positive and is about one. In the presence of some microseisms ŽFig. 3. the absolute value of the second coefficient is less than 0.1, the third less than y0.15, the fourth about 0.1, and for all the others the absolute value is less than 0.1. The number at which all further AR coefficients have an absolute value less than 0.1 could be considered the order of the AR model. The AR order of the Fiji event noise is about 3 and the P phase about 10–12. The AR order varies with the type of microseisms. In Fig. 4 the AR order of the noise is only 2. 3.4. Three-component AR-gram The AR-gram process can also be applied to three-component data by providing a visual representation of the full set of tensor AR coefficients A. For each component, e.g., the vertical Ž Z ., we display the variation of the AR coefficients of order up to 5 for one row of the tensor A, so that there are three AR-grams associated with each component. For the
Z trace we display the variation of the ZZ, ZN and ZE AR coefficients of order up to 5 calculated every 2 s using a window of length 4 s. In Figs. 5 and 6 we display a set of examples of three-component AR-grams for events of different character and signal-to-noise ratio. The arrivals of major seismic phases are signalled by significant changes in the patterns of the weighting of the different orders in the AR process which are often seen rather clearly in the cross-components, e.g., ZE. Significant values in the off-diagonal elements Žcross-components. indicate that the prediction of the displacement value on one component is not only dependent on the previous value of that component but also on previous values of the cross-component. In the presence of rectilinear energy we would expect the cross-components to be significant. However, when more random noise is present the crosscomponents are insignificant. Except when noise is particularly high and is dominated by a Rayleigh wave from a single direction, we would not expect microseismic noise to have significant cross-components. Most seismic body-wave phases are close to
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Fig. 3. AR-gram of the Fiji event. The 13 AR coefficients are calculated every 2 s using a 5-s data segment.
253
254
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Fig. 4. AR-gram of the Talaud Islands.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Fig. 5. Three-component AR-gram of the Fiji event—broadband data.
255
256
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Fig. 6. Three-component AR-gram of the Okinawa event—broadband data.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
rectilinear, so we would expect significant crosscomponents when seismic phases are present. 3.5. AR-gram interpretation The AR-gram is a visual guide to the change of character of a seismogram, and often reflects changes which are not noticeable in the raw seismogram. Figs. 3 and 4 display single component AR grams of the Fiji and Talaud Islands events. For the preevent noise of the Fiji event, most of the AR coefficients above order 3 are insignificant. For the P phase, the order and the value of the coefficients both increase. Whilst after 200 s the seismogram has reduced to the pre-event noise level, its character is different due to the presence of phases and coda. This change of character is signalled in the AR-gram, where both the order and the values are higher. For the pre-event noise of the Talaud Islands event, again the coefficients above order 3 are insignificant for the noise. Whilst the P phase is barely discernible on the broadband seismogram, it is clear on the ARgram. There is no significant increase in order but there is an increase in the value of the first four coefficients. This subtle change in the AR-gram reflects the subtle change in the power spectrum. The S phase on the broadband seismogram appears as a change in amplitude with little change in spectral content. This might explain why it is not observable on the AR-gram. The three-component AR-gram often reflects the changing character of the seismogram in a different way to the single component AR-gram. For example, in Fig. 6 on the N component AR-gram of the P phase, the NE coefficients are larger than the NN coefficients, but the NE coefficients quickly become insignificant. Whereas the NN coefficients appear to be more sensitive to the coda, this is not the case for the single component AR-gram ŽFig. 3.. The S phase is small but clear on the N component and is quite clear on the NE AR-gram but is only just perceivable on the NN AR-gram. The cross-components ZN, ZE, and NE all have small changes at the time of the S phase. The Fiji event has several phases and the complexity of the AR-grams ŽFig. 5. indicates the presence of these phases. The first coefficient of the autocomponents ŽZZ, NN, EE. is always the largest
257
of the five coefficients. When noise is dominating and there is little correlation between components the cross-components ŽZN, ZE, NZ, NE, EZ, EN. are small. When a coherent signal is present on more than one component, the cross-components become significant. The clearest example of this is for the S phase in Fig. 6 where, for all three components, the cross-components are significant. The cross-components can be more sensitive to the presence of a phase than the autocomponent. For example, the P phase is most significant on the ZN, ZE and NE AR-grams, the pP phase on ZN AR-gram, and ScP on the ZN and EN AR-grams.
4. Onset time estimation 4.1. Introduction Current applications of AR filtering to seismology have been mostly in the area of automatic onset time picking of previously detected phases. Three different, but quite similar, approaches have been proposed ŽKushnir et al., 1990; GSErJAPANr40, 1992; Takanami and Kitagawa, 1993. for use with short period data, and mostly with local and regional events which have large SNR. Here we demonstrate that such AR procedures can be extended to work with three-component broad-band records with variable SNR. All the approaches involve calculating an AR model of the seismogram before and after the start of the phase. These two models will be most different, when one contains only seismic noise and the other contains mostly signal. Since the two models have a common point, when the models are most different the dividing point between the two models is considered the onset time. The joint Akaike Information Criteria ŽAICŽF [ B.. of the time series is a minima at this point. AICŽF [ B. is defined as: AIC Ž F [ B . s y Ž k y 1 . log n F2 y Ž n y k q 1 . log n B2 q 2Ž mF q mB . ,
Ž 11 .
where n F2 is the variance of the forward model, n B2 is the variance of the backward model, m F is the
258
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
number of AR coefficients used to calculate the forward model and m B is the number of AR coefficients used to calculate the backward model. Ideally, AICŽF. will be constant from 1 to k y 1 then increasing from k to n, and AICŽB. will be constant from n to k and increasing from k y 1 to 1. So the addition of the two AICs will result in a ‘V’ shaped function with the minima at km, where k is the onset time. 4.2. Summary of methods There are three methods which authors have used to calculate onset times using AR filtering. Fig. 7 schematically describes the three methods. In each case, they can be applied to both single-component and three-component records. 4.2.1. Model 1 This method describes the whole time interval by two AR models. One AR model ŽF model. is calculated for the interval 1 to k y 1 and another ŽB model. for the interval k to n. The joint Akaike Information Criteria ŽAICŽF [ B.. is then calculated. This process is then repeated for all k from 1 q m to n y m, where m is the order of the AR model. The value of k for which ŽAICŽF [ B. is a minimum is chosen as the onset time. This model, which divides the time series into two stationary time series, is conceptually the simplest of the three methods but is very computer intensive as n y 2 m AR models need to be calculated.
Fig. 7. The three models used in the AR onset time estimation methods. m1, m2, and m3 refer to the three models described in the text. F model refers to the forward model and B model refers to the backward model.
4.2.2. Model 2 In model 2 the time series is divided into two error prediction series ŽWF and WB .. The F model is calculated on the interval 1 to l and the B model on the interval n y l to n. The F model AR filters are then used to calculate the Wf error prediction series on the interval 1 q m to n y m where m is the order of the AR model Ž m F l .. The B model AR filters are then used to calculate WB on the interval n y m to 1 q m Ži.e., backwards.. The variance of the prediction errors is calculated at each time point and is used to calculate the AICŽF [ B.. The two AR models are still calculated only once. Whilst the time series of the seismic signal is regarded as stationary in a limited interval Žlocal stationarity., it is not a continuation of stationary series of the same nature, but changes slightly both before and after the merging of a signal. The division of the interval is made by inferring the covariance matrix of prediction errors from F model and B model. The use of only two models with an update of the variance at every time point corresponds to the assumption that the F and B model spectra stay almost the same but the amplitude changes with time. 4.2.3. Model 3 In this method the AR model ŽF model. is calculated once in the first part of the time series. The prediction error series of the F model is calculated in the forward direction. When a phase emerges, the prediction error of the F model becomes larger. The AICŽF. is calculated at every point and its minimum corresponds to the arrival time. The AR model is calculated only once. Model 3 assumes that the signal and noise have different spectra. 4.2.4. Model comparison GSErJAPANr40 Ž1992. found that the accuracy, stability and confidence of the time estimation using model 2 is similar to, or better than using model 1, and is much faster than model 1. For short impulsive phases, where the assumption of stationarity of the B model in model 2 does not hold, model 3 is more accurate than model 2. Model 2 is superior to model 3 when the power spectrum of the phase is similar to the noise.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
The work by Kushnir et al. Ž1990. is very similar to model 1 described in Section 4.2.1. There are two differences. One is that instead of calculating the AICŽF [ S., a closely related Maximum Likelihood parameter is constructed which has a maximum at the onset time. The second is that autocorrelations are calculated up to offset m Žthe order of the AR model. and then the Toeplitz matrix is solved via the Levinson–Durbin recursion ŽLevinson, 1947.. The approach of Takanami and Kitagawa Ž1988. is similar to model 2 described above, but instead of calculating the AIC from the AR filtered time series, they calculate it directly without needing to calculate the AR coefficients or AR filter the data. Takanami and Kitagawa Ž1988. applied a single component method to P phases and Takanami and Kitagawa Ž1991. expanded the method to three components and applied it to B phases with a large SNR. Takanami and Kitagawa Ž1993. found that summing the three single-component AIC time series performed nearly as well as the three-component
259
method, but they were concerned about the implied assumption that each component is independent of each other. They found that when using the multicomponent method for estimating the onset time of S phases, utilising the two horizontal components was slightly superior to using all three components. 4.3. AR onset time estimation of P and S phases Using the models m2 and m3 we have evaluated the accuracy of AR onset time picking for the P and S phases for two teleseismic events. Here we will discuss their effectiveness in accurately estimating the onset time of the Fiji event P and S phases, and the Okinawa event S phase. These three phases cover a range of phases and sizes—including a large impulsive P phase, a small S phase and a medium S phase—with which to evaluate the method’s effectiveness. Kværna Ž1995., in his study comparing the AR onset time estimates with the analyst picked arrival times for high SNR P phases reported by the
Fig. 8. Fiji event—P phase. The three-component broadband seismogram and the six AIC functions are shown. Ža. Shows the results of single component AR processing and Žb. the results of three-component AR processing. ar1 and ar3 refer to single and three-component AR filtering respectively. The m2 and m3 refer to whether model 2 or 3 has been used. The data has been filtered with a 0.6–4.0 Hz bandpass filter.
260
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
Table 2 Fiji event, comparison of the various onset time estimation methods for P and S phases Component
Code
Model 4
Z single component AR Z three-component AR P phase onset time Žmanual. N single component AR E single component AR NE single component AR N three-component AR E three-component AR NE three-component AR S phase onset time Žmanual.
Z-ar1 Z-ar3
51.75 51.85
N-ar1 E-ar1 NE-ar1 N-ar3 E-ar3 NE-ar3
330.1 330.9 330.2 330.0 329.1 330.0
Model 3 51.65 51.9 51.75 332.1 330.8 330.2 336.0 Ž330.0. 324.0 330.0 329.5
ar1 and ar3 refer to single- and three-component AR filtering, respectively. The m2 and m3 refer to the component and whether model 2 or 3 has been used.
IDC, used the GSErJAPANr40 Ž1992. methods with a window length of 12 s starting 7 s before the
detector’s arrival time. In this study, we have used the same window length and start time. To determine the single- and three-component AR coefficients, we used subroutines distributed by GSErJAPANr40 Ž1992.. For S phase onset time estimates we have used the suggestions of Takanami and Kitagawa Ž1993. in using the sum of the AIC functions of the two horizontal components using both single ŽNEar1. and three-component ŽNE-ar3. AR processing. For impulsive high SNR events, such as the P phase of the Fiji event, the technique performs well. In the single component case ŽFig. 8a., the six single-component AIC functions are all well behaved with a gentle smooth linear decrease down to the minima and then a steep increase after the minima. They all pick the onset time within a sample or two of the actual onset time of 51.75 s. For the threecomponent case ŽFig. 8b., all six models also work well. The m3 models are comparable with the single component P phase results. The m2 models are
Fig. 9. Fiji event—S phase. The three-component broadband seismogram and the eight AIC functions are shown. The labelling is the same as Fig. 8.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
slightly less accurate than the m3 models and have an untidier AIC function. The poorer performance of the m2 models is a result of the backward model being calculated on the coda where the amplitude of the horizontal components is larger than is the vertical component. Also, there is a change in the noise on the E component 1.5 s before the P phase and the model 2 three-component AICs are possibly reflecting this change. Table 2 lists the P onset times for the two methods using single- and three-component AR processing. Model 3 applied to the single component picks the onset within a sample with the three-component AR pick being 0.1 s late. Model 2 is slightly worse with the single component pick being 0.1 s early and the three-component pick being 0.15 s late. For low SNR phases, such as the S phase of the Fiji event, when the seismogram is filtered to simulate a SP seismogram, the technique still works. As would be expected for a low SNR teleseismic S phase, which has most of the energy on the horizontal components, applying the techniques to the two horizontal components and summing the AICs produces the best results. For the single component case ŽFig. 9a., all the horizontal component AICs produce acceptable results with the sum of the two horizontal giving the clearest minima. For the three-component case ŽFig. 9b., the N component model 3 ŽN-ar3-m3. AIC gives the best minima and is very close to the manually picked time but the other three are poor. The two summed AICs ŽNE-ar3-m2 and NE-ar3-m3. have clear minima within half a second of the manually picked time, which considering the very low SNR of this event is impressive. The superior performance Žin terms of how well defined the minima is. of NE-ar3-m3 compared to NE-ar3-m2 is consistent with the assumptions, mentioned above, behind those models. Model 2 assumed that the signal and noise have similar power spectra but with different amplitudes, whereas model 3 assumed different power spectra. Events where there is a low SNR phase just prior to a larger SNR phase can be difficult to pick correctly. The S phase of the Okinawa event when bandpass filtered Ž0.6–4.0 Hz. to simulate a SP seismogram ŽFig. 10. has this property. At first glance one would pick the start of the S phase at 446 s but on closer examination the S phase onset time is
261
picked at 444.35 s. Both NE-ar1-m2 and NE-ar1-m3 pick the onset time at 444.35 s, with NE-ar3-m2 and NE-ar3-m3 picking 444.15 and 444.0 s, respectively. Interestingly, all the AIC functions reflect the change in phase at 446 s but all have their minima close to the onset time of 444.35 s. In this case, where there is a significant change in frequency content, model 2 is superior to model 3. 4.4. Comparison of methods Here we have shown that the AR onset time estimator, as well as working well for large teleseismic P phases, works well for medium sized S phases and small S phases. The fact that the method works for small teleseismic S phases is significant for two reasons. Firstly, it means that automatic analysis, including accurate onset time estimation, of seismic data can be applied to secondary phases as well as P phases. Secondly, it shows that phases with a small SNR and very similar power spectra as the noise can be accurately picked by AR methods. This confirms the sensitivity of autoregressive filtering to subtle changes in the power spectrum. The advantage of an accurate automatic S phase onset time picker in event location is significant, particularly for regional events recorded on a sparse network.
Fig. 10. Okinawa event—S phase AIC functions. The data has been filtered with a 0.6–4.0 Hz bandpass filter.
262
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263
For P phases the best methods were the single component AR processing of the Z component using models 2 and 3. For S phases the best methods were the three-component AR processing models 2 and 3 where the AIC of the two horizontal components are summed ŽNE-ar3-m2 and NE-ar3-m3.. As a general rule, picking the earliest of the two times will give the best estimate of the onset time. Further work using the depth of the minima and the distance to the nearest other minima might produce a more quantitative measure of the quality of the onset time estimate.
5. Conclusion We have been able to demonstrate that methods based on an AR representation can be successfully applied to three-component broadband records for spectral estimation, record characterisation and onset time estimation. AR power spectrum estimation is a useful alternative to the normal FFT method, with the advantage that a smooth power spectrum is generated even when only relatively short data segments are available. Methods for characterising three-component broadband data, where small phases are not discernible on the unfiltered trace, can be simply done via the use of an AR-gram. As well as being sensitive to the presence of seismic phases they also register changes in data quality. Autoregressive techniques are a very robust and simple way of determining the onset time of seismic phases. For large events and events with their energy concentrated on a single component, such as teleseismic P, single component AR onset time picking is acceptable. For events which have their energy spread over two or more components, three-component AR onset time estimation has superior performance. The superior performance of the three-component analysis using AR can be explained by the fact that for both seismic phases and microseismic noise, the signal on one component aids in the prediction of the signal on the other components. For example, model 2 for onset time estimation performs best when the F model is perfectly predicting the noise and the B model is perfectly predicting the signal. Hence, any
improvement in the process of signal prediction will improve the onset time estimation. Prediction between the different components also explains why the AR-grams of some seismic phases are clearer on the AR-grams for the off-diagonal elements Žcrosscomponents. than they are on the AR-grams of the diagonal elements Žautocomponent. which apparently have the largest signal-to-noise ratio.
Acknowledgements This paper is published with permission of the Executive Director of the Australian Geological Survey Organisation.
References Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B., Csaki, F. ŽEds.., 2nd International Symposium on Information Theory. Budapest Akademiai Kiado, pp. 267–281. Burg, J., 1967. Maximum entropy spectral analysis. Proceedings of the 37th Meeting of the Society of Exploration Geophysicists ŽIn Childers 1978.. GSErJAPANr40, 1992. A fully automated method for determining the arrival times of seismic waves and its application to an on-line processing system. Paper tabled in the 34th GSE session in Geneva GSErRFr62, G.S.E. Gutowski, P., Robinson, E., Treitel, S., 1978. Spectral estimation: fact or fiction. IEEE Trans. Geosci. Electron. GE 16, 80–84, In Childers 1978. Kushnir, A., Lapshin, V., Pinsky, V., Fyen, J., 1990. Statistically optimal event detection using small array data. Bull. Seism. Soc. Am. 80 Ž6b., 1934–1950. Kværna, T., 1995. Automatic onset time estimation based on autoregressive processing. Norsar scientific report, NORSAR. Levinson, N., 1947. The Wiener RMS, root mean squared, error criterion in filter design and prediction. In: Wiener, N. ŽEd.., Extrapolation, Interpolation, and Smoothing of Stationary Time Series with Engineering Applications, Appendix B. Wiley. Press, W., Flannery, B., Teukolsky, S., Vetterling, W., 1986. Numerical Recipes: The Art of Scientific Computing. Cambridge Univ. Press, Cambridge. Robinson, E., 1957. Predictive decomposition of seismic traces. Geophysics 22, 767–778. Robinson, E., 1963. Mathematical development of discrete filters for the detection of nuclear explosions. J. Geophys. Res. 68, 5559–5568. Takanami, T., Kitagawa, G., 1988. A new efficient procedure for the estimation of onset times of seismic waves. J. Phys. Earth 36, 267–290.
M. Leonard, B.L.N. Kennettr Physics of the Earth and Planetary Interiors 113 (1999) 247–263 Takanami, T., Kitagawa, G., 1991. Estimation of the arrival times of seismic waves by multivariate time series models. Ann. Inst. Stat. Math. 43 Ž3., 407–433. Takanami, T., Kitagawa, G., 1993. Multivariate time-series models to estimate the arrival times of S waves. Comput. Geosci. 19 Ž2., 295–301.
263
Welch, P., 1967. The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans. Audio and Electroacoust. AU 15, 70–73, In Childers 1978. Wiggins, R., Robinson, E., 1965. Recursive solution to the multichannel filtering problem. J. Geophys. Res. 70 Ž8., 1885–1891.