Detection and identification of seismic signals recorded at Krakatau volcano (Indonesia) using artificial neural networks

Journal of Volcanology and Geothermal Research 176 (2008) 448–456 Contents lists available at ScienceDirect Journal of Volcanology and Geothermal Re...

Download PDF

2MB Sizes 0 Downloads 79 Views

Report

PDF Reader
Full Text

Journal of Volcanology and Geothermal Research 176 (2008) 448–456

Contents lists available at ScienceDirect

Journal of Volcanology and Geothermal Research j o u r n a l h o m e p a g e : w w w. e l s e v i e r. c o m / l o c a t e / j v o l g e o r e s

Detection and identiﬁcation of seismic signals recorded at Krakatau volcano (Indonesia) using artiﬁcial neural networks M. Ibs-von Seht Federal Institute for Geosciences and Natural Resources (BGR), Stilleweg 2, 30655 Hanover, Germany

A R T I C L E

I N F O

Article history: Received 19 November 2007 Accepted 21 April 2008 Available online 4 May 2008 Keywords: Krakatau neural network volcano seismicity detection classiﬁcation

A B S T R A C T The Anak Krakatau volcano (Indonesia) has been monitored by a multi-parametric system since 2005. A variety of signal types can be observed in the records of the seismic stations installed on the island volcano. These include volcano-induced signals such as LP, VT, and tremor-type events as well as signals not originating from the volcano such as regional tectonic earthquakes and transient noise signals. The work presented here aims at the realization of a system that automatically detects and identiﬁes the signals in order to estimate and monitor current activity states of the volcano. An artiﬁcial neural network approach was chosen for the identiﬁcation task. A set of parameters was deﬁned, describing waveform and spectrogram properties of events detected by an amplitude-ratio-based (STA/LTA) algorithm. The parameters are fed into a neural network which is, after a training phase, able to generalize input data and identify corresponding event types. The success of the identiﬁcation depends on the network architecture and training strategy. Several tests have been performed in order to determine appropriate network layout and training for the given problem. The performance of the ﬁnal system is found to be well suited to get an overview of the seismic activity recorded at the volcano. The reliability of the network classiﬁer, as well as general drawbacks of the methods used, are discussed. © 2008 Elsevier B.V. All rights reserved.

1. Introduction Volcanic seismicity is a key parameter to estimate the activity state of a volcano. One of the main tasks of a volcano observatory is therefore, to monitor seismicity related to a volcano and to derive information on its current state from seismic event statistics. This study is part of the Krakatau monitoring project (Krakmon) and aims at the realization of a system that automatically detects and identiﬁes seismic events recorded at Krakatau volcano (Indonesia). In order to estimate the volcano's state of activity using seismic records it is required to discriminate between events that occurred in the direct vicinity of the volcano and are related to it, tectonic earthquakes that happened further away and are not related to the volcano (regional and distant events) and seismic signals that can be classiﬁed as noise. Volcano-related signals can further be subdivided into groups such as volcano-tectonic events, long-period events, and volcanic tremor. The Krakatau multiparameter monitoring system is described in detail in Hoffmann-Rothe et al. (2006). Its main components are four seismic stations on the islands of Anak Krakatau and Sertung that are in operation since June 2005 (Fig. 1). Seismic data are available in nearreal time at an observatory on Java and at data centres in Jakarta and Bandung. At the present state of work, the automatic detection system works with the data of one station (KM01) only. Station KM01 is

E-mail address: [email protected]. 0377-0273/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jvolgeores.2008.04.015

located nearest to the centre of Anak Krakatau and proved to be most sensitive for volcano-related seismic events. An artiﬁcial neural network (ANN) approach was used for the classiﬁcation of seismic events detected by the system. Neural networks in computer applications attempt to imitate certain structures of the nervous system, in particular the web of interconnected neuron cells, in order to solve modelling or classiﬁcation problems. Basically, ANNs consist of input and output units being inter-connected by a net of additional units. A feature vector presented on the input side is fed through the network and returns a class membership vector on the output side. In the application presented here, the feature vector consists of certain seismic signal properties, and the class membership vector deﬁnes the corresponding signal type. Using supervised training strategies, the role of the ANN is to approximate a function that maps the two vectors on each other. This is, in principal, possible for cases of arbitrary degrees of complexity (Cybenko, 1989; Langer and Falsaperla, 2003). Ideally, after completing a training phase in which a sufﬁcient number of complete training data sets are processed, the trained network is able to compute the unknown outputs for new data sets correctly. A drawback of ANNs applied in physical research may be seen in its ‘black-box’ character: outputs of a well trained ANN might give right answers but will not give insight in the physical processes controlling the coupling between input and output parameters. Still, for practical applications, such as activity status estimations on the base of event statistics attempted in this study, ANNs can be useful. They have been

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

449

Fig. 1. (a) Location of the Krakatau volcano complex in the Sunda Strait and positions of temporary seismic stations (KT network, black triangles). (b) Satellite image (Ikonos) of Anak Krakatau and positions of permanent monitoring stations (KM network, white triangles).

successfully applied in seismology for automatic phase-picking (e.g. Jentili and Bragato 2006 and references therein) and, in particular, for the classiﬁcation of volcano-seismic events (e.g. Langer and Falsaperla, 2003; Langer et al., 2005). 2. Data processing For the development of the automatic detection and identiﬁcation scheme continuous seismic data of station KM01 for the period from June 2005 to March 2007 were used. Due to various technical failures, numerous data gaps occurred during that period, resulting in a total of approximately 16 months of data that could be used. The procedure of event detection and identiﬁcation is illustrated in Fig. 2. The seismic data of station KM01 are ﬁltered and then processed by an event detector based on amplitude-ratio calculations. During the initial development of the system, seismograms and spectrograms of the triggered time windows are inspected and, with the help of auxiliary information and a soniﬁcation tool, a manual identiﬁcation and classiﬁcation of event types is carried out. The results are used to train an artiﬁcial neural network that works with numeric parameters extracted from waveforms and spectrograms of the detected events. After a successful training, the whole procedure of event detection and identiﬁcation runs automatically. In a ﬁrst step, the broadband data are 1 Hz highpass-ﬁltered which strongly increases the signal to noise ratio for the short-period signals of interest. The event detector is based on a conventional STA/LTA trigger procedure. The algorithm continuously computes the ratio of average absolute seismogram amplitudes for a short-term (STA) and a long-term (LTA) time window. The parameters that control the sensitivity of the detector are dSTA and dLTA (durations of STA and LTA time windows), and ratON and ratOFF (ratio thresholds that switch trigger on and off). A number of tests revealed suitable values to be dSTA = 1 s, dLTA = 60 s, ratON = 5.0, and ratOFF = 1.0. Some additional routines are applied in order to minimize the number of false detections for data streams that may contain gaps or spikes and for events that produce multiple detections. The ﬁnal time windows that contain the detected seismic events are enlarged by 10 s at the start and the end (pre-event and post-event times).

Spectrograms are calculated for time windows detected using standard FFT routines. Power spectra are calculated for 50% overlapping, 64 samples wide data segments. The resulting time– frequency–amplitude matrix is smoothed by averaging neighbouring data points and displayed using a colour coding for the normalized logarithm of the power spectral amplitudes. For a digitizing frequency of 100 Hz the temporal and spectral resolutions achieved are 0.32 s and 1.5 Hz, respectively. In a subsequent step, a contouring algorithm is applied to the spectrograms. The aim of this procedure is to provide simpliﬁed representations of the general patterns visible in the spectrograms. The resulting contour polygons (spectral contours) deﬁne regions in the spectrograms that contain signiﬁcant spectral energy

Fig. 2. Sketch of processing scheme for automatic event detection and identiﬁcation. The tasks on the right side are only required during the development of the ANN.

450

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

with respect to the background noise. The contour level value is determined individually for each spectrogram. Its calculation is based on an empirical function that uses the mean STA/LTA ratio of the trigger time window as parameter. Thereby, the weighting between weak and strong signals becomes balanced and an over-weighting of background noise for low signal events is avoided. Soniﬁcation of the triggered waveforms was found to be a useful tool that helps to identify similarities in seismic signals. It is done by converting the seismogram data to an audio format such as PCM-WAV and selecting an increased playback speed by setting a suitable digitizing frequency (fdig) in the audio ﬁle header. Good results were obtained with fdig = 16 kHz which corresponds to a 160-fold faster playback. Audio ﬁles prepared this way enable to better perceive complex patterns in the spectral content of signals, compared to visualization methods. Particularly, harmonic components that play an important role in volcano-related seismic signals can be clearly perceived. 3. Event types, a-priori classiﬁcation The detection process created, after excluding very weak or very short-lasting signals, a data base of almost 10,000 usable events for the time period analyzed. After a thorough analysis of the data base, six relatively well distinguishable types of seismic events could be identiﬁed (Table 1). The types VT (volcano-tectonic), LP (long-period), and TRE (tremor) are volcano-related, whereas the types REG (regional), NOH (high-frequency noise), and NOL (low-frequency noise) have other sources. The task of manual event identiﬁcation, which is required before designing a system that automatically identiﬁes event types, is not trivial because assignments to speciﬁc event types are not always clear and sometimes impossible. The decision process is guided by information such as coda shape, spectrogram pattern, soniﬁcation, and, if possible, locating the event. Due to poor signal to noise ratio and an unfavourable distribution of the permanent seismic stations on Krakatau (KM network) it is mostly not possible to locate tectonic events with these stations only. However, using the data of a temporary seismic network that had been operated from June 2005 until January 2006 in the Sunda Strait region (KT network, see Fig. 1) it was possible to locate a part of the event database (see HoffmannRothe et al., 2006). Although a great part of events detected and located by the KT network is not visible in the data of the KM network, and, on the other hand, many (very local and mostly volcano-related) events are visible in the KM data only, the information derived from the KT network helped to identify event types. Fig. 3 displays example seismograms and spectrograms for each event class. 3.1. Non-volcano-related signals The Krakatau islands are located near the Sunda Arc subduction zone. Strong tectonic earthquakes that are not related to the volcano are therefore relatively frequent and are recorded by the Krakatau stations. These events were classiﬁed as regional events (REG). The

Table 1 Number n of manually identiﬁed data for different event types Type

n

REG NOH NOL VT LP TRE sum

74 144 166 252 190 98 924

class contains tectonic earthquakes that occurred at a certain distance from the volcano, so that a clear separation of P- and S-phase of more than about 3.5 s is observed. Regional events could be identiﬁed relatively easy by their coda and spectrogram shapes, by their sound (soniﬁcation), or, if possible, by locating them using the KT network data. A great number of events detected could be attributed to external sources and was therefore classiﬁed as noise signals. Consulting the data of electromagnetic and weather stations installed on Anak Krakatau (see Hoffmann-Rothe et al., 2006), as well as by visual and acoustic observations, one group of signals (NOH) was found to be caused by thunder sounds that couple into the solid ground. The sounds produce seismograms that sometimes resemble volcano-tectonic events, however, their frequency content is mostly restricted to high frequencies well above 20 Hz. Another source of transient noise signals is coastal surf. Sea-waves produce low-frequency signals (NOL) that may sometimes be confused with volcano-related LP events. The source of these signals could be veriﬁed consulting the data of a sea-level sensor and seismometers installed at Anak Krakatau's coast. A third group of transient seismic signals with external sources is produced by people passing the recording site of KM01. Since these disturbances also produce high-frequency seismograms they were assigned to the group of NOH signals. The occurrence of transient noise signals reveals no information on volcano-seismic activity. Yet experience shows that it is of high importance to identify these signals correctly in order not to misinterpret them as volcano-related. Indeed, the swarm of LP events mentioned in Hoffmann-Rothe et al. (2006) was later found to be completely produced by coastal surf. 3.2. Volcano-related signals Volcano-tectonic (VT) events are tectonic earthquakes that originate from brittle failure in the volcano system. Most of the volcano-related signals recorded at Krakatau were classiﬁed as VTs. Many of them show clear and impulsive onsets and stronger events could be localized. Since it is unknown up to which distance a tectonic event is still related to the volcano, there is a gradual transition between VT type and regional events at some distance to the volcano. In this study, an event was classiﬁed as VT if the separation between Pand S-wave onset was less than about 3.5 s. For many of the VTs, however, S-wave onsets are not visible and the seismograms suggest very small (b1 km) hypocentral distances. Generally, VT events recorded at Krakatau show a great variety in their coda shapes and frequency content which is attributed to the complex inﬂuence of hypocentral distance, source depth and magnitude of the events on the waveforms recorded relatively near to the source. At the present state of the work it was not attempted to subdivide the class of VT events into groups such as A-type and B-type signals as introduced by Minakami (1960). During the analysed time period, long-period events (LP) and volcanic tremor (TRE) were almost exclusively recorded in November and December 2006. Their occurrence is related to the volcano magmatic system and indicates stronger ﬂuid-driven ﬂuctuations in the system. Seismograms of LP events are mostly characterised by a high-frequency onset and a low-frequency coda. In the soniﬁcation, the signals often show clearly audible harmonic components. Tremor signals recorded at Krakatau often start with an impulsive event, others show no clear onsets. They produce a roaring sound, mostly occupy a frequency range up to 10 Hz and show no clear development in spectral content with time. The STA/LTA detector sometimes fails to detect tremor signals that start with slowly increasing amplitudes. It could not be attempted to localize LP and TRE events because, during the time of their appearance, only two seismic stations were in operation.

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

451

Fig. 3. Seismograms and spectrograms of different event types identiﬁed at Krakatau. For each event type, three examples are shown. Time axis in seconds, frequency axis 0…50 Hz. Open circles: power maxima, ﬁlled triangles: centroids of contour lines shown in spectrograms. Vertical lines in seismograms indicate trigger times.

Based on the above described guidelines, a set of 924 events was manually identiﬁed (Table 1). This data set was the basis for the application of an artiﬁcial neural network approach to identify event types automatically.

4. Data parameterization Application of artiﬁcial neural networks requires to deﬁne suitable numeric parameters that describe certain characteristics or properties

452

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

Fig. 4. Schematic layout of a neural network. In this study, the network input is represented by parameters extracted from the seismic data and the output represents the corresponding event types.

of the input data. For this study it was attempted to express the view of a human operator on the data to some extent. The most striking characteristics of the seismogram and corresponding spectrogram samples are straightforward properties such as the event duration and the mode of its onset, the signal's spectral range and prevailing frequencies, and the general pattern visible in the spectrogram. These properties were expressed in the form of six parameters extracted from the seismogram waveforms and spectrograms: (1) the duration of the trigger time window (dur); (2) the impulsiveness of the signal onset (imp) (see below); (3) the dominant frequency of the signal within the trigger time window (fdom), determined by counting zero crossings; (4) the frequency range deﬁned by the spectral contour (fr); (5) the frequency of the centre of gravity (centroid) of the spectral contour (fc); (6) the frequency of the energy maximum in the spectrogram (fe). The imp parameter was calculated as follows: Within the triggering dSTA window of n samples width, the cumulative sum of rectiﬁed amplitude samples σ(i = 1…n) is calculated. The impulsiveness of the signal is then deﬁned as ic/n, with ic being the index where σ(ic) becomes greater than σ(n) ⁎ 0.2. The imp value is a measure of how fast the signal amplitude had risen before the trigger detection occurred. The use of more sophisticated parameters such as the temporal development of dominant signal frequencies was examined but not found to be practical. In order to achieve a uniform inﬂuence of all parameters on the classiﬁcation process, the data were standardized by subtracting their mean and dividing them by their standard deviation. There is for sure some coupling between parameters. For example, the dominant frequency of a seismic signal (fdom) will, to a certain degree, be correlated to the frequency of the energy maximum (fe) and to the centroid frequency (fc). However, this coupling between parameters is nonlinear, meaning that each of the three frequency values can contribute additional information for a classiﬁer. The key power of ANN classiﬁers is their ability to learn nonlinearities of the inputs from training data (Duda et al., 2001).

feature vector presented at the input layer is processed by the network of inter-connected units and produces a speciﬁc signal output. The signal pattern at the output layer results from relatively simple calculations. At each hidden and output unit, the weighted sum of all input values, the net activation, is calculated and passed to a nonlinear activation function that determines the output emitted by the unit. In order to produce a desired output, appropriate weights assigned to each unit have to be set. In training mode, the weights are adjusted by the network, based on training patterns and their desired output. Most of the strategies for such supervised training use backpropagation methods (Werbos, 1974; Rumelhart et al., 1986). The method is based on an error measure (the difference between actual and desired network output), which is propagated backward from the output to the hidden layer units while adjusting the unit weights to reduce the error. During one training cycle (epoch), this adjusting process is performed for all patterns available in a training data set. Multiple training epochs are performed in order to gradually decrease the mean square error (MSE) value of the network and thereby improve its overall performance. MSE is deﬁned as the squared difference between the actual and the desired weight of a unit, averaged over all output units and all training pairs. From a theoretical point of view, Cybenko (1989) proved that a three-layer feedforward network is capable of approximating any continuous, multivariate function arbitrarily well, provided that the data sets used for training are properly chosen and the hidden layer dimension is sufﬁcient. In practice, the ﬁnal success of the automatic classiﬁcation depends decisively on the consistency of the a-priori

5. Artiﬁcial neural network design In this study, the simplest type of a feedforward artiﬁcial neural network, the three-layer case, is used. It consists of an input layer, a hidden layer, and an output layer (Fig. 4). Each layer consists of a number of units whose function is, to some extent, comparable to properties of biological neurons. In the so-called feedforward mode, a

Fig. 5. MSE curves calculated as average over 25 training/testing runs using variable parameters. a: nEPO ﬁxed to 100, nHID variable, b: nHID ﬁxed to 120, nEPO variable. Arrows indicate those values where MSE stabilizes and which were taken for the ﬁnal network layout and training.

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456 Table 2 Percentages (left part) and total numbers (right part) (with standard deviations) of event types identiﬁed by trained network, calculated as average from 50 training/ testing runs using random data sets Type Type out in REG NOH NOL

nin VT

LP

TRE

U

98.4 0.0 0.0 0.0 0.0 0.7

0.5 0.9 0.3 95.9 0.1 2.9

0.0 0.0 0.1 0.5 99.3 0.2

0.0 1.1 0.0 0.5 0.0 0.8 1.5 1.2 0.1 0.5 94.4 1.6 Mean/sum

nfail

hit

% REG NOH NOL VT LP TRE

ndet

# 0.0 98.6 0.0 0.5 0.0 0.2

0.0 0.0 99.0 0.5 0.0 0.0

98.4 ± 2.2 98.6 ± 2.1 99.0 ± 1.0 95.9 ± 1.7 99.3 ± 0.9 94.4 ± 4.1 97.6

37 72 83 126 95 49 462

36.4 ± 0.8 0.6 71.0 ± 1.5 1.0 82.2 ± 0.8 0.8 120.8 ± 2.2 5.2 94.3 ± 0.9 0.7 46.3 ± 2.0 2.7 451.0 11.0

Training and testing were done using 50% of available data each. nin is the number of data used for testing each type, ndet and nfail are the average numbers of correctly and incorrectly identiﬁed or undetected (type out = U) events for each type. hit is the average rate of correct identiﬁcation. Small discrepancies in percentages are due to rounding effects.

identiﬁcation of class membership. It is therefore crucial that the training data set covers a wide range of possible input patterns in order to enable the network to recognize similar patterns in feedforward mode when data previously unknown to the network are processed. Detailed description on the function of neural networks and related methods can be found in Duda et al. (2001). For this study, the C software library implementation fann (‘fast artiﬁcial neural networks’) was used. The library, described in detail in Nissen (2003), allows generation, training and execution of multilayer feedforward ANNs. While the dimensions of the input and output layers of the network are determined by the number of data parameters and the number of event types to be classiﬁed, the number of units in the hidden layer (nHID) is crucial for the performance of the trained network. An appropriate value has to be chosen according to the complexity of the classiﬁcation problem and can be determined by testing a range of values. A second parameter that affects the network learning quality is the number of epochs a network is trained (nEPO). This value has a strong inﬂuence on the generalizing abilities of the trained network. During the network design, the effect of nEPO on the network performance can be observed in order to ﬁnd an appropriate value. A common procedure to test and optimize a network is to divide a set of complete data with known outputs into training and test data sets, and to observe the training progress by means of the test data. The quality of ﬁt between input and output data of the trained network can be explored by means of the MSE value or by using an analyzing function to decide if the output matches the correct target type. An analyzing function used in this study was deﬁned as follows: The event type detected is indicated by the output unit of the highest weight wmax, if wmax is greater than a threshold thout and all other output units have weights less than a∙wmax. In all other cases the event was deﬁned as unidentiﬁed (type U). The values for thout = 0.2 and a = 0.9 were determined empirically by evaluating the outputs of numerous data sets. A number of tests was performed in order to estimate optimum values for nHID and nEPO. The tests were performed with data sets randomly selected from the 924 manually classiﬁed events. Training sets and test sets comprised 462 events each, and each group of event type was represented equally in either set (Table 1). Fig. 5a shows the results of a test to estimate an appropriate value for nHID. For this test, random data sets were processed using an increasing number of nHID. The MSE value of the test data was calculated after a ﬁxed number of 100 training cycles. The test was run 25 times and MSE values were averaged. It can be seen that the mean MSE strongly decreases at nHID = 50 units and stabilizes at around

453

120 units. This means that more than 120 hidden units will not improve the learning ability of the network signiﬁcantly. In a second test, a suitable value for nEPO was estimated. Fig. 5b displays the mean MSE value as function of nEPO for 25 runs with random data sets and nHID ﬁxed to 120. The curve stabilizes at around 120 training epochs. As result of the tests, a network with 120 hidden neurons being trained for 130 epochs was regarded as appropriate for the classiﬁcation problem in this study. In order to analyze the classiﬁcation quality, 50 random runs were performed with the ﬁnal network setup, again using only one half of the data for training. The results are shown in Table 2, where the performance statistics averaged over the 50 runs are listed. The mean hit rate of the network is approximately 97%, meaning that the network correctly identiﬁes 97% of the test data. The lowest hit rate of about 94% occurs for the group of tremor events for which on average 2.7 out of 49 detections failed. It has to be kept in mind that the network training for the test runs was done using only one half of the available training data in order to be able to observe the network behaviour by means of test data. A ﬁnal network should, however, be trained making use of all available training data. This complete training was done using all 924 data and with the ﬁnal network layout and training intensity that had proven to be appropriate. The training yielded 100% correct detections for all types, except VT where on average one detection failed. The training success is inﬂuenced by a random factor arising from randomly chosen initial weights of the hidden units. Fig. 6 illustrates this effect. Shown are the MSE value of the network and the number of false detections nfail for 50 complete training runs. nfail varies between 1 and 2 and MSE varies at a very low (b0.01) level. 6. Results and discussion Using the ﬁnally trained network, the procedure of automatic event detection and identiﬁcation was applied to the seismic data available for the period from June 2005 to March 2007. A total of 9868 events were detected of which 154 are unidentiﬁed. The results are illustrated in the form of event statistics plots of selected time periods. Fig. 7a covers half a year's section where the seismic activity at Krakatau is low, except for a short phase of high volcano-tectonic activity with almost 100 events on 24.09.2005. The average daily VT rate lies between 3 and 4, whereas long-period and tremor events are almost absent. The average daily rate of regional events is 0.3; lowfrequency and high-frequency noise signal rates reach values of more than 30 and more than 40, respectively.

Fig. 6. MSE values (line) and number of false identiﬁcations nfail (bars) for ﬁnal network trained using all available training data. Complete training was done 50 times (run number).

454

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

Fig. 7. Daily rates of events as determined by the automatic detection and identiﬁcation scheme (U = unidentiﬁed) for two selected periods (a and b). Shaded areas indicate seismic data gaps.

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

455

Table 3 Number of automatic (ndet) and false (nfail) event identiﬁcations (U = undetermined) for ﬁve selected days indicated in the top line

Table 4 Event statistics for 24.09.2005 Type

ndet

nfail

Type

REG NOH NOL VT LP TRE sum

2 7 1 97 2 3 112

1 0 0 1 1 3 6

REG NOL NOH LP VT TRE U sum

17.11.2005

22.6.2006

21.10.2006

16.11.2006

15.12.2006

ndet

nfail

ndet

nfail

ndet

nfail

ndet

nfail

ndet

nfail

1

0

7

0

2 1 6

0 0 0

10

0

11 1

1 1

6 9 22 14 2

2 1 0 3 2

1 19 95%

1 1

21 90%

2

53 85%

8

5 3 6 7 1 2 24 83%

1 0 0 1 0 2 4

10 8 27 29 9 2 85 87%

3 0 1 5 0 2 11

ndet = number of events detected, nfail = number of events incorrectly identiﬁed, as established by manual examination.

The bottom line gives percentages of correct automatic identiﬁcations.

The section shown in Fig. 7b, covering the period from October 2006 to March 2007, is characterised by higher daily counts for both volcanic and non-volcanic events. The daily rates of volcano-related events start increasing in the middle of October and reach peaks in the middle of December. The average rate of regional events is unchanged, whereas the daily noise signal counts are higher than in 2005. In the following, the results of the automatic event detection and identiﬁcation will be critically examined. 6.1. Event detection A general drawback of STA/LTA detectors is the fact that they can fail in case of signals of slowly growing amplitudes. This is sometimes the case for volcanic tremor signals. Furthermore, the concept of an event-based analysis bears the drawback, that mixed signals, such as a VT event occurring during a tremor phase, cannot be separated. The same problem arises when single events follow each other in rapid succession. Also in this case the detector may treat the succession of events as a single, longer event. 6.2. Event identiﬁcation Since the correct classiﬁcations of detected events are only known for the limited amount of data samples deﬁned in the training data set, the quality of the automatic event identiﬁcation has to be evaluated manually. Due to the great number of events recorded and detected this task can practically be merely performed for random samples. Such manual checks have been exemplarily carried out for a number of selected days. It has to be noted that, also for a human operator, a

correct type assignment is often difﬁcult. The results, shown in Table 3, suggest that the part of correct automatic identiﬁcations is generally greater than about 80%. The classiﬁer seems to perform better for days with lower total event counts. Another way to estimate the performance of the identiﬁcation scheme is based on the assumption that the daily counts of volcanorelated events are independent of those of noise and regional events. Fig. 8 reveals that there is, at least in general, no such correlation. This means that a capability of the network to discriminate between the two groups of event types exists. Nevertheless, a visual inspection of the statistics plots suggests that there might be some coupling between the rates of certain event types, especially for days with overall high event rates. For example, both the rates of VT and NOL events show local maxima on 24.12.2006 and 26.01.2007. On the other hand, this coincidence of high event rates is not observed on other days with either high VT (e.g. 24.09.2005) or high NOL counts (e.g. 15.02.2007). A careful examination of seismograms showed that the detected event rates are plausible: in December 2006 Krakatau's volcano-seismic activity is rather high, expressed by relatively high VT, LP and TRE rates. At the same time, coastal surf is strong which leads to the relatively high NOL rates. At the end of December 2006, a change in the local wind conditions increases the seismic background noise level and causes a drop in the seismic station's sensitivity and in overall event detection rates. In the middle of January 2007, the wind drops again while the costal surf is still strong. On 26.01.2007 a swarm of seismic events causes the observed peak in the VT event rate. Further ﬂuctuations in the wind and coastal surf conditions explain the observed NOL event rates after the end of January 2007. The swarm of VT events on 24.09.2005 occurred during a phase of generally low event counts. It is therefore suitable to examine its inﬂuence on the rates detected for the other types. The event detections were carefully examined in order to identify possible wrong identiﬁcations. The results, listed in Table 4, show that out of 112 detected events six automatic identiﬁcations failed. The worst performing class is TRE where three out of three identiﬁcations are wrong. These detections contain multiple VT events with only a few seconds inter-event time, and resemble TRE type events. The examination allows to estimate the part of false identiﬁcations to be roughly 5 per cent. The result cannot necessarily be generalized, but it gives an idea of the performance of the neural network classiﬁer. 7. Conclusion

Fig. 8. Cross-plot for daily rates of non-volcanic (REG, NOL, NOH) and volcanic (VT, LP, TRE) event types.

A system for an operator-independent determination of seismic event rates recorded at Krakatau volcano was implemented and tested. The two-step process includes a STA/LTA trigger detector and a neural network based classiﬁer for the identiﬁcation of six different seismic event types. Special attention was paid to the correct identiﬁcation of noise signals that are not volcano-related and that, due to the particular environment found at the Krakatau island volcano, sometimes dominate the seismic traces. The training of the

456

M. Ibs-von Seht / Journal of Volcanology and Geothermal Research 176 (2008) 448–456

neural network was done by means of manually identiﬁed events using six straightforward signal characteristics. A number of tests was performed in order to ﬁnd a network topology and a training strategy that ﬁt the given classiﬁcation problem best. The trained network is able to almost perfectly identify all training events. Applied to the entirety of all events detected during a period of almost two years, the part of correct identiﬁcations was estimated to lie between 80 and 95%. This discrepancy is due to the fact that the training data consist of those events only that could be more or less clearly identiﬁed. Events of unknown or unclear type cannot naturally be used for training. The apparent limited performance of the network classiﬁer can be caused by an imperfect set of training data that does not cover the complete range of possible signal signatures, and by possible ﬂaws in the a-priori classiﬁcation of signals. While the effect of the ﬁrst point can be reduced by extending the training data base, the second aspect points to the general difﬁculties in correctly identifying volcano-seismic signals. The difﬁculties in training a neural network, as well as in estimating its performance, are therefore similar to those when trying to assess classiﬁcations done by a human operator. Still, for the classiﬁer developed in this study it was shown that there is no evident coupling between the rates detected for volcanic and non-volcanic events. This is a further indication for a certain reliability of the classiﬁer which is therefore regarded as suitable to get a clear overview of the seismic activity recorded at the volcano during the time period analysed. The performance of the event trigger is satisfactory. The algorithm sometimes fails only in case of very high event rates and for some tremor signals. Also in this context, similarities to operator-based classiﬁcations are evident. It will be required to apply a more sophisticated detection algorithm in order to overcome these drawbacks. It should also be noted that in case of volcanic tremor events, the duration of the tremor signals may be a better measure for the activity status than is the pure event count. This is, however, only a question of how the results of the system are visualised. The current system works with the data of one station only, which is located nearest to the crater. A future system could be designed to include data of other stations as well, which might improve the reliability of event detections. This requires to explore how the signal signatures change at varying distances from the crater, and perhaps to set up individually trained neural network classiﬁers for each station. Similar considerations will have to be made when attempting

to employ a comparable system at other volcanoes. The development of the system presented here is not regarded to be complete. The data used for the study do not contain signals that occurred during eruptions, for which a massive change in the signal signatures or the occurrence of new signal types can be expected. In that case, the network would have to be re-trained. The results so far achieved with the ANN based approach are promising. Acknowledgements This study is part of the Krakmon project, funded by the German Federal Ministry of Education and Research (BMBF) within the framework of the GEOTECHNOLOGIEN program (FKZ03G0578A). Thanks to all who worked with me on Krakatau and in Pasauran, especially Rudolf Knieß and Arne Hoffmann-Rothe. The equipment of the KT network was kindly made available by the geophysical instrument pool of the GFZ Potsdam. I appreciate the work of two anonymous reviewers; their comments greatly improved the paper. References Cybenko, G.V., 1989. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst 2, 303–314. Duda, R.O., Hart, P.E., Stork, D.G., 2001. Pattern Classiﬁcation, 2nd edition. Wiley & Sons, New York. 654 pp. Hoffmann-Rothe, A., Ibs-von Seht, M., Knieß, R., Faber, E., Klinge, K., Reichert, C., Purbawinata, M.A., Patria, C., 2006. Monitoring Anak Krakatau Volcano in Indonesia. EOS 87, No. 51: 581, 585–586. Jentili, S., Bragato, P., 2006. A neural-tree-based system for automatic location of earthquakes in Northeastern Italy. J. Seismol. 10, 73–89. Langer, H., Falsaperla, S., 2003. Seismic monitoring at Stromboli volcano (Italy): a case study for data reduction and parameter extraction. J. Volcanol. Geotherm. Res. 128, 233–245. Langer, H., Falsaperla, S., Powell, T., Thompson, G., 2005. Automatic classiﬁcation and aposteriori analysis of seismic event identiﬁcation at Soufrie`re Hills volcano, Montserrat. J. Volcanol. Geotherm. Res. 153, 1–10. doi:10.1016/j.jvolgeores.2005.08.012. Minakami, T., 1960. Fundamental research for predicting volcanic eruptions, Part I. Bull. Earthq. Res. Inst. Univ. Tokyo 38, 497–544. Nissen, S., 2003. Implementation of a Fast Artiﬁcial Neural Network Library (fann). Technical report, Department of Computer Science University of Copenhagen (DIKU). WWW Page, http://fann.sf.net. Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by backpropagating errors. Nature 323, 533–536. doi:10.1038/323533a0. Werbos, P.J., 1974. Beyond regression: new tools for prediction and analysis in the behavioral sciences. Ph.D. thesis, Harvard University, Cambridge.

Detection and identification of seismic signals recorded at Krakatau volcano (Indonesia) using artificial neural networks

Detection and identification of seismic signals recorded at Krakatau volcano (Indonesia) using artificial neural networks

Recommend Documents