Nonlinear Analysis 63 (2005) e859 – e865 www.elsevier.com/locate/na
Discrimination of infrasound events using parallel neural network classification banks
Sungjin Park, Fredric M. Ham∗, Christopher G. Lowrie
Florida Institute of Technology, 150 W University Boulevard, Melbourne, Florida 32901, USA
Abstract
An integral part of the Comprehensive Nuclear-Test-Ban Treaty (CTBT) International Monitoring System is an infrasound-monitoring network. This network is capable of detecting and verifying infrasonic signals-of-interest (SOI), e.g., nuclear explosions, and of distinguishing them from other, unwanted infrasound noise sources. This paper presents classification results for infrasonic events using parallel neural network classification banks (PNNCB). The PNNCB are constructed by arranging classical neural networks in parallel. The PNNCB algorithm can classify multiple events and shows enhanced overall classification performance.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Infrasound; Neural network classifier; Feature extraction
1. Introduction
A major component of the Comprehensive Nuclear-Test-Ban Treaty (CTBT) [10] International Monitoring System (IMS) is an infrasound network. This infrasound monitoring system must be capable of detecting and verifying infrasonic signals-of-interest (SOI), e.g., nuclear explosions, and of distinguishing them from other, unwanted infrasound noise sources. Infrasonic waves are sub-audible acoustic waves, typically in the frequency range 0.01 < f < 10 Hz [2,11–13]. Fig. 1 shows the infrasonic frequency range. This paper presents classification results of infrasound events using parallel neural network classification banks (PNNCB). The generalization of the neural network classifier is a
∗ Corresponding author. E-mail address: fmh@ee.fit.edu (F.M. Ham).
0362-546X/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.na.2005.02.016
S. Park et al. / Nonlinear Analysis 63 (2005) e859 – e865
Fig. 1. Infrasonic frequency range.
[Figure: frequency axis from 0.01 to 10 Hz showing the bands occupied by gravity waves, mountain associated waves, microbaroms, volcano, bolide, impulsive events, and 1 megaton and 1 kiloton yield explosions.]
result not only of the independent validation approach taken to train and test the network, but also of the diversity of the data used. More specifically, the data used to train and test the network are taken from multiple infrasound arrays, in different geographical locations, with different geometries, and in some cases different data sampling rates. The diversity of the data contributed to the overall robustness of the neural network in reliably classifying infrasound events of interest.

The major contributor to the neural network robustness is the preprocessing of the input data. From raw time-domain infrasound data, a set of mel-frequency scaled cepstral coefficients and their associated derivatives for each signal is used to form a set of feature vectors [9]. These feature vectors, rather than the raw time-domain data, contain the salient characteristics of the data that can be used to classify the SOI. Previous analysis efforts have been performed to determine the “optimal” feature vector space, i.e., the best combination of mel-frequency scaled cepstral coefficients and associated derivatives [6,7]. This feature extraction process is invariant with respect to record length, array geometry, array location, system sampling frequency, signal amplitude, and time sequence length [3,8].

However, as the number of event categories grows, the overall performance of the classifier degrades because of the very limited bandwidth of infrasound signals. The PNNCB algorithm presented here is capable of classifying multiple events and also shows enhanced overall classification performance compared to the classical structures. In the PNNCB approach, each individual neural network is trained on only one SOI as a positive reinforcement signal; the rest of the input signals are used as negative reinforcement signals. Each individual neural network has its own set of weights and biases, trained on an individual SOI.
Each neural network module responds only if the input contains the SOI on which it was trained. Three different competing neural networks were used: (1) a multi-layer feedforward perceptron (MLP) trained by backpropagation (BP), (2) a radial basis-function (RBF) neural network, and (3) a partial least-squares (PLS) calibration model. Several simulations were
Table 1
Overall process
(1) Eleven categories of infrasound events are of interest in this study.
(2) The data were collected from 14 different infrasound sensor arrays with different geometries and different locations.
(3) Feature vectors, extracted from time-domain event signals, are based on cepstral coefficients (a similar process is used in robust speaker recognition [9]).
(4) For training, 1077 feature vectors are used and, for testing, 1007 feature vectors are used.
(5) Three different neural networks are used for classification, i.e., partial least-squares (PLS) [4], a radial basis function (RBF) neural network [5], and a multi-layer perceptron (MLP) neural network trained by backpropagation (BP) [5].
(6) Optimal weight and bias vectors are computed. The performance of each classifier is computed and compared.
Table 2
Infrasound data summary

Data (no. of sensor arrays)    Sampling frequency (Hz)
Data set 1 (1)                 1
Data set 2 (2)                 100
Data set 3 (4)                 100, 250
Data set 4 (5)                 250
Data set 5 (1)                 10–40
Data set 6 (1)                 10–40

Event type (no. of signals-of-interest), listed in table order:
Volcano^a (112) [14,15]
Mountain associated waves^b (96) [1]
Detonation (120), No-event (92)
Detonation (108), No-event (36), Periodic (816)
Detonation (102), High dB gain combustion (114)
Bi-plane (264), Helicopter (58), No-event (96)
Thunder (32), Howitzer (20)
Bolide (18)

^a Events from volcano eruptions in El Chichon, Mexico and Galunggung, Java.
^b Events from mountains in New Zealand.
run to discriminate between volcano, mountain associated waves, detonation, periodic signals, high dB gain combustion, bi-plane, helicopter, thunder, howitzer, bolide, and background noise (i.e., no-event) signals. The data sets were taken from 14 different sensor arrays in different geographical locations. Table 1 summarizes the overall process for training and testing the PNNCB.
2. Details of the infrasound data
A total of 2084 infrasound signals were used in this study. Table 2 gives the details of the data collected from the different locations. A few comments are in order. As seen in Table 2, there are 14 different array geometries and six different locations. The “No-event” signals are 20 s sequences of data taken from selected data records in which it was determined that no SOI existed; these signals therefore represent background noise.
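As a consistency check on the total quoted above, the per-event counts listed in Table 2 can be summed. This is only a worked sum; detonation and no-event signals occur in several data sets and are grouped in parentheses. The second line checks the same total against the training and test set sizes given in Table 1.

```latex
\begin{align*}
&\underbrace{112}_{\text{volcano}}
 + \underbrace{96}_{\text{mountain associated waves}}
 + \underbrace{(120 + 108 + 102)}_{\text{detonation}}
 + \underbrace{(92 + 36 + 96)}_{\text{no-event}}
 + \underbrace{816}_{\text{periodic}} \\
&\quad
 + \underbrace{114}_{\text{high dB gain combustion}}
 + \underbrace{264}_{\text{bi-plane}}
 + \underbrace{58}_{\text{helicopter}}
 + \underbrace{32}_{\text{thunder}}
 + \underbrace{20}_{\text{howitzer}}
 + \underbrace{18}_{\text{bolide}}
 = 2084, \\
&1077 + 1007 = 2084 .
\end{align*}
```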
3. Data preprocessing
3.1. Microbarom noise filtering and feature extraction steps
A persistent phenomenon known as microbaroms [15] is present in many infrasonic signals. Microbaroms are bothersome because their frequency content coincides with that of small-yield nuclear explosions, and they also create a problem when attempting to classify infrasonic signals. However, simple band-pass filtering can solve the problem. Microbarom signals usually have a period between 6 and 8 s, i.e., a frequency of roughly 0.125–0.17 Hz. Therefore, a band-pass filter with a pass-band between 1 and 50 Hz eliminates this component. The band-pass filter was not applied to the volcano and mountain associated wave signals, however, since their infrasound frequency range is below 1 Hz.

As mentioned above, the major contributor to the robustness of the neural network classifier for infrasound events is the preprocessing of the input data, more specifically, the generation of the set of feature vectors used to train and test the classifiers. The data preprocessing steps to extract feature vectors are given in Table 3. This process was applied to all of the infrasound data collected for this study to generate a set of 40-element

Table 3
Feature extraction steps
(1) Remove the mean value from the signal.
(2) Apply a Hamming window to the signal.
(3) Compute the power spectral density $S(k)$.
(4) Apply mel-frequency scaling to the PSD:
    $S_m(k) = \alpha \log_e[\beta S(k)]$,    (1)
    where $\alpha = 1125$, $\beta = 0.000003$ (determined experimentally).
(5) Take the inverse discrete cosine transform:
    $x_m(n) = \frac{1}{N} \sum_{k=0}^{N-1} S_m(k) \cos(2\pi k n / N)$, for $n = 0, 1, 2, \ldots, N-1$.    (2)
(6) Take the differenced sequence of the sequence $x_m(n)$, i.e., $x'_m(n)$.
(7) Concatenate the differenced sequence, $x'_m(n)$, with the cepstral sequence, $x_m(n)$, to form the augmented sequence:
    $x_m^a = [x'_m(n) \mid x_m(n)]$.    (3)
(8) Take the absolute value of the elements in the sequence $x_m^a(n)$ to give
    $x_{m,\mathrm{abs}}^a = |x_m^a|$.    (4)
(9) Take the $\log_e$ of $x_{m,\mathrm{abs}}^a$ from the previous step to give
    $x_{m,\mathrm{abs,log}}^a = \log_e[x_{m,\mathrm{abs}}^a]$.    (5)
(10) Remove the mean value from the feature vector.
(11) Scale the feature vectors to the range [−1, 1].
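The steps of Table 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation. The PSD is taken as a simple periodogram. The leading 15 differenced terms and leading 25 cepstral terms are kept to reach 40 elements. Step (11) is done by dividing by the peak absolute value. A tiny constant guards the logarithms. The function name `extract_features` and its parameters are hypothetical.

```python
import numpy as np

ALPHA = 1125.0                 # scaling constant quoted in step (4)
BETA = 3e-6                    # scaling constant quoted in step (4)
TINY = np.finfo(float).tiny    # guard against log(0); not part of Table 3


def extract_features(signal, n_delta=15, n_cepstral=25):
    """Return one feature vector of length n_delta + n_cepstral."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                             # (1) remove the mean
    x = x * np.hamming(x.size)                   # (2) Hamming window
    psd = np.abs(np.fft.fft(x)) ** 2 / x.size    # (3) periodogram PSD (assumed estimator)
    s_m = ALPHA * np.log(BETA * psd + TINY)      # (4) scaled log-PSD, Eq. (1)
    # (5) Eq. (2): for real s_m, the real part of the inverse FFT equals
    #     (1/N) * sum_k s_m[k] * cos(2*pi*k*n/N).
    x_m = np.real(np.fft.ifft(s_m))
    dx_m = np.diff(x_m)                          # (6) differenced sequence
    # (7) augmented sequence; which terms are kept is an assumption
    aug = np.concatenate([dx_m[:n_delta], x_m[:n_cepstral]])
    feat = np.log(np.abs(aug) + TINY)            # (8)-(9) absolute value, natural log
    feat = feat - feat.mean()                    # (10) remove the mean
    peak = np.max(np.abs(feat))                  # (11) peak normalisation (assumed)
    return feat / peak if peak > 0 else feat


# Example: a 20 s record sampled at 100 Hz (synthetic noise, for illustration only)
rng = np.random.default_rng(0)
features = extract_features(rng.standard_normal(2000))
```

Peak normalisation is chosen here because it keeps the zero mean produced by step (10); a min-max rescaling would also map the vector into [−1, 1] but would shift the mean.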
Fig. 2. Structure of parallel neural network classification banks.
[Figure: an infrasound event passes through a preprocessing block and is fed to n parallel neural networks (Infrasound #1 NN, Infrasound #2 NN, ..., Infrasound #n NN); each network outputs 1 or 0, and the outputs together form the classification code.]
feature vectors. The first 15 elements of each feature vector are the derivative terms and the remaining 25 elements are the cepstral coefficients.

3.2. Structure of parallel neural network classification banks
Fig. 2 shows the PNNCB structure. Each individual neural network is trained on only one SOI as a positive reinforcement signal; the rest of the input signals are used as negative reinforcement signals. The code ‘1’ is used for a positive reinforcement signal and the code ‘0’ for the negative reinforcement signals. Each individual neural network has its own set of weights and biases, trained on an individual SOI, and each neural network module responds only if the input contains the SOI on which it was trained. Infrasound input patterns are therefore discriminated when the resulting classification code matches the pre-defined event code. In addition, if an infrasound input pattern includes more than one infrasound event, the classification code indicates multiple events; this indication, however, depends on signal strength. The preprocessing block includes the microbarom noise filtering and feature extraction steps and, as mentioned above, three different competing neural networks were used for each individual neural network module.

4. Simulation results
4.1. Training and test data structure and training parameters
The three classifiers listed in Table 1 were trained and tested using the data described in Table 2. The data were preprocessed using the steps described above and then split into a training data set and a test data set. The dimension of the training matrix was 1077 × 40 and that of the test matrix was 1007 × 40. The “Thunder”, “Howitzer”, and “Bolide” events were used only for training and were not tested, since too few events of these types were available. The MATLAB Neural Network Toolbox was used for training the MLP using BP, and the RBF network.
The MATLAB chemometrics toolbox was used to develop the PLS statistical calibration model and to determine the optimal number of factors to retain in the model. Table 4 shows the training parameters for each of the classifiers.
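The bank structure of Section 3.2 can be illustrated with a small sketch: one binary detector per event type, each trained with target 1 for its own SOI and 0 for everything else, with the thresholded outputs forming the classification code. To keep the example short and dependency-free, a ridge-regularized least-squares detector stands in for the MLP, RBF, or PLS module; this substitution, the class name `ParallelBank`, the 0.5 threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


class ParallelBank:
    """One binary detector per event type; the outputs form the classification code."""

    def __init__(self, event_names, ridge=1e-3):
        self.event_names = list(event_names)
        self.ridge = ridge
        self.weights = {}

    @staticmethod
    def _augment(X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        return np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column

    def fit(self, X, labels):
        A = self._augment(X)
        labels = np.asarray(labels)
        gram = A.T @ A + self.ridge * np.eye(A.shape[1])
        for name in self.event_names:
            # 1 = positive reinforcement (this SOI), 0 = negative reinforcement (all others)
            target = (labels == name).astype(float)
            self.weights[name] = np.linalg.solve(gram, A.T @ target)
        return self

    def classification_code(self, x, threshold=0.5):
        a = self._augment(x)[0]
        return np.array([int(a @ self.weights[n] > threshold) for n in self.event_names])

    def decode(self, x):
        code = self.classification_code(x)
        return [n for n, bit in zip(self.event_names, code) if bit == 1]


# Synthetic 40-element "feature vectors": three well-separated clusters
rng = np.random.default_rng(0)
names = ["volcano", "detonation", "no-event"]
centers = 3.0 * np.eye(3, 40)
X = np.vstack([centers[i] + 0.1 * rng.standard_normal((50, 40)) for i in range(3)])
y = np.repeat(names, 50)

bank = ParallelBank(names).fit(X, y)
code = bank.classification_code(centers[1] + 0.1 * rng.standard_normal(40))
```

Because each module is trained independently, a new event type can be added by training one more detector without retraining the others, and more than one bit of the code may be set for the same input.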
Table 4
Training parameters for classifiers

Classifier           Training parameters
MLP trained by BP    • Activation function: hyperbolic tangent
                     • Initial learning rate parameter: 0.0001
                     • Momentum parameter: 0.9
                     • Error goal: SSE = 0.1
                     • Architecture: 40/20/1^a
RBF network          • Error goal: MSE = 0
                     • Spread factor for RBFs: 1.0
PLS                  • Number of factors: 24–27^b

^a 40 input neurons, 20 hidden-layer neurons, and 1 neuron at the output layer.
^b Between 24 and 27 PLS factors were required, depending on the event category.
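To make the MLP entries of Table 4 concrete, the following sketch builds a 40/20/1 network with hyperbolic tangent hidden units and trains it by batch gradient descent with momentum until the SSE goal is met or an epoch limit is reached. It is an illustration only: the linear output neuron, the weight initialization, the epoch limit, and the plain momentum update are assumptions, and the sketch does not reproduce the MATLAB toolbox routine used in the study.

```python
import numpy as np


def init_mlp(rng, n_in=40, n_hidden=20, n_out=1, scale=0.1):
    """Small random weights (assumed initialization) for a 40/20/1 network."""
    return {
        "W1": scale * rng.standard_normal((n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": scale * rng.standard_normal((n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])   # hyperbolic tangent hidden layer
    Y = H @ p["W2"] + p["b2"]            # linear output neuron (assumption)
    return H, Y


def sse(p, X, T):
    return float(np.sum((forward(p, X)[1] - T) ** 2))


def train(p, X, T, lr=1e-4, momentum=0.9, sse_goal=0.1, max_epochs=500):
    """Batch backpropagation with momentum; stops once the SSE goal is reached."""
    velocity = {k: np.zeros_like(v) for k, v in p.items()}
    for _ in range(max_epochs):
        H, Y = forward(p, X)
        E = Y - T
        if np.sum(E ** 2) <= sse_goal:
            break
        dH = (E @ p["W2"].T) * (1.0 - H ** 2)      # error backpropagated through tanh
        grads = {"W1": X.T @ dH, "b1": dH.sum(axis=0),
                 "W2": H.T @ E, "b2": E.sum(axis=0)}   # gradients of 0.5 * SSE
        for k in p:
            velocity[k] = momentum * velocity[k] - lr * grads[k]
            p[k] += velocity[k]
    return p


# Toy data: 30 vectors in [-1, 1]^40 with 0/1 targets (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (30, 40))
T = (X[:, :1] > 0).astype(float)
net = init_mlp(rng)
sse_before = sse(net, X, T)
train(net, X, T, max_epochs=300)
sse_after = sse(net, X, T)
```

In a PNNCB, one such network would be trained per event type, with targets of 1 for that event's feature vectors and 0 for all others.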
Table 5
Classification results (correctly classified/tested)

Classifier       Volcano  Mountain associated waves  Detonation  Periodic  High dB gain combustion  Biplane  Helicopter  No-event  Total     (%) correctly classified
PLS              39/56    35/48                      140/165     390/408   37/57                    101/132  17/29       5/112     764/1007  75.87
RBF              25/56    17/48                      125/165     353/408   51/57                    116/132  25/29       53/112    765/1007  75.97
MLP              47/56    36/48                      143/165     397/408   54/57                    125/132  16/29       16/112    834/1007  82.82
PNNCB using PLS  42/56    34/48                      140/165     389/408   33/57                    98/132   17/29       83/112    836/1007  83.01
PNNCB using RBF  49/56    33/48                      144/165     393/408   56/57                    129/132  27/29       73/112    904/1007  89.77
PNNCB using MLP  46/56    32/48                      130/165     387/408   54/57                    120/132  24/29       87/112    880/1007  87.38
4.2. Classification results
Table 5 shows the classification results obtained with the test data set. As can be seen from Table 5, all three PNNCB outperformed the classical neural network classifiers. The PNNCB structure using RBF correctly classified more than 89% of the test data.
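The overall figures in Table 5 follow directly from the per-event counts. The short sketch below recomputes them; the counts are copied from Table 5, while the dictionary layout and the helper name `overall_accuracy` are only illustrative choices.

```python
# Number of test signals per event type (denominators in Table 5)
TESTED = {
    "Volcano": 56, "Mountain associated waves": 48, "Detonation": 165,
    "Periodic": 408, "High dB gain combustion": 57, "Biplane": 132,
    "Helicopter": 29, "No-event": 112,
}

# Correctly classified signals per classifier, in the same column order
CORRECT = {
    "PLS":             [39, 35, 140, 390, 37, 101, 17, 5],
    "RBF":             [25, 17, 125, 353, 51, 116, 25, 53],
    "MLP":             [47, 36, 143, 397, 54, 125, 16, 16],
    "PNNCB using PLS": [42, 34, 140, 389, 33, 98, 17, 83],
    "PNNCB using RBF": [49, 33, 144, 393, 56, 129, 27, 73],
    "PNNCB using MLP": [46, 32, 130, 387, 54, 120, 24, 87],
}


def overall_accuracy(correct_counts):
    """Return (total correct, total tested, percentage correct)."""
    total_correct = sum(correct_counts)
    total_tested = sum(TESTED.values())
    return total_correct, total_tested, 100.0 * total_correct / total_tested


summary = {name: overall_accuracy(counts) for name, counts in CORRECT.items()}
best = max(summary, key=lambda name: summary[name][2])

for name, (correct, tested, pct) in summary.items():
    print(f"{name:16s} {correct}/{tested}  {pct:.2f}%")
```

The percentages printed here are rounded to two decimals and may differ from the tabulated values in the last digit. The per-event counts also show where the gain comes from: the largest change between each classical classifier and its PNNCB counterpart is in the No-event column.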
5. Conclusions
Representing infrasound data as a set of feature vectors consisting of cepstral coefficients and their derivatives dramatically improves the performance of the classifier. The
performances of three PNNCB were evaluated, built respectively on a partial least-squares statistical calibration model, a radial basis function network, and a multi-layer perceptron trained by backpropagation. All three PNNCB correctly classified more than 83% of all tested events, and the PNNCB using RBF showed the best performance, correctly classifying more than 89% of the test data. All three classifiers, especially the PNNCB using RBF, were robust in the sense that they achieved relatively high levels of performance using data taken from different geographical locations and different array geometries. In addition, the record lengths varied, the system sampling frequencies differed in some cases, and the time sequence lengths varied. This robustness can be attributed mainly to the preprocessing of the time-domain data that was used to generate the feature vectors. This feature extraction process is invariant with respect to record length, array geometry, array location, system sampling frequency, signal amplitude, and time sequence length. Future research will involve adding more signals to each class and adding more infrasound classes.

References
[1] A.J. Bedard, Infrasound originating near mountain regions in Colorado, J. Appl. Meteorol. 17 (1978) 1014.
[2] A.J. Bedard, T.M. Georges, Atmospheric infrasound, Phys. Today 53 (3) (2000) 32–37.
[3] F.M. Ham, Neural network classification of infrasound events using multiple array data, International Infrasound Workshop, Kailua-Kona, Hawaii, 2001.
[4] F.M. Ham, I. Kostanic, Partial least-squares: theoretical issues and engineering applications in signal processing, J. Math. Probl. Eng. 2 (1) (1996) 66–93.
[5] F.M. Ham, I. Kostanic, Principles of Neurocomputing for Science and Engineering, McGraw-Hill Higher Education, New York, 2001.
[6] F.M. Ham, T.A. Leeney, H.M. Canady, J.C. Wheeler, Discrimination of volcano activity using infrasonic data and a backpropagation neural network, in: K.L. Priddy, P.E. Keller, D.B. Fogel, J.C. Bezdek (Eds.), Proceedings of the SPIE Conference on Applications and Science of Computational Intelligence II, vol. 3722, Orlando, FL, USA, 1999, pp. 344–356.
[7] F.M. Ham, T.A. Leeney, H.M. Canady, J.C. Wheeler, An infrasonic event neural network classifier, 1999 International Joint Conference on Neural Networks, Session 10.7, Paper No. 77, Washington, DC, July 10–16, 1999.
[8] F.M. Ham, S. Park, A robust neural network classifier for infrasound events using multiple array data, in: The Proceedings of the 2002 World Congress on Computational Intelligence – International Joint Conference on Neural Networks, Honolulu, Hawaii, May 12–17, 2002, pp. 2615–2619.
[9] R.J. Mammone, X. Zhang, R.P. Ramachandran, Robust speaker recognition: a feature-based approach, IEEE Signal Process. Mag. 13 (5) (1996) 58–71.
[10] National Research Council, Comprehensive Nuclear Test Ban Treaty Monitoring, National Academy Press, Washington, DC, 1997.
[11] K.B. Payne, Silent Thunder: In the Presence of Elephants, Simon and Schuster, New York, 1998.
[12] A.D. Pierce, Acoustics: An Introduction to Its Physical Principles and Applications, Acoustical Society of American Publishers, Sewickley, PA, 1989.
[13] V.N. Valentina, Microseismic and Infrasound Waves, (Research Reports in Physics), Springer, New York, 1992.
[14] C.R. Wilson, R.B. Forbes, Infrasonic waves from Alaskan volcanic eruptions, J. Geophys. Res. 74 (1969) 1812–1836.
[15] C.R. Wilson, J.V. Olson, R. Richards, Library of typical infrasonic signals, Report Prepared for ENSCO (subcontract no. 269343-2360.009), vols. 1–4, 1996.