NEEWS: A novel earthquake early warning model using neural dynamic classification and neural dynamic optimization

NEEWS: A novel earthquake early warning model using neural dynamic classification and neural dynamic optimization

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427 Contents lists available at ScienceDirect Soil Dynamics and Earthquake Engineering journ...

871KB Sizes 140 Downloads 278 Views

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

Contents lists available at ScienceDirect

Soil Dynamics and Earthquake Engineering journal homepage: www.elsevier.com/locate/soildyn

NEEWS: A novel earthquake early warning model using neural dynamic classification and neural dynamic optimization Mohammad Hossein Rafiei, Hojjat Adeli

MARK



Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, 470 Hitchcock Hall, 2070, Neil Ave., Columbus, OH 43220, USA

A R T I C L E I N F O

A B S T R A C T

Keywords: Support vector machine Enhanced probabilistic neural networks Neural dynamic classification Neural dynamic model of Adeli and Park Earthquake early warning system Earthquake prediction

An Earthquake Early Warning System (EEWS) can save lives. It can also be used to manage the critical lifeline infrastructure and essential facilities. Recent research on earthquake prediction towards the development of an EEWS can be classified into two groups based on a) arrival of P waves and b) seismicity indicators. The first approach can provide warnings within a timeframe of seconds. A seismicity indicator-based EEWS is intended to forecast major earthquakes within a time frame of weeks. In this paper, a novel seismicity indicator-based EEWS model, called neural EEWS (NEEWS), is presented for forecasting the earthquake magnitude and its location weeks before occurrence using a combination of a classification algorithm (CA) based on machine learning concepts and a mathematical optimization algorithm (OA). The role of the CA is to find whether there is an earthquake in a given time period greater than a predefined magnitude threshold and the role of the OA is to find the location of that earthquake with the maximum probability of occurrence. The model is tested using earthquake data in southern California with a combination of four CAs and one OA to find the best EEWS model. The proposed model is capable of predicting strong disastrous events as long as sufficient data are available for such events. The paper provides a novel solution to the complex problem of earthquake prediction through adroit integration of a machine learning classification algorithm and the robust neural dynamics optimization algorithm of Adeli and Park.

1. Introduction

1.2. Review of the literature

1.1. Outline of the problem

Panakkat and Adeli [36] present a review of research efforts for predicting the time, epicentral location, and the magnitude of future earthquake up to 2007. Recent research on earthquake prediction towards the development of an EEWS can be classified into two groups. Two approaches have been pursued for the P wave-based EEWS, one is on-site warning and the other is regional warning. The former uses signal processing of the P-waves at a single site to forecast the magnitude and the arrival of the destructive S-waves with advance warning of only a few seconds. The latter uses P-waves from a network of seismometers to forecast the arrival of the destructive S-waves and the magnitude and the location of the earthquake. The regional EEWS is intended to provide alarms for sites far from the epicenter in relatively longer periods in the order of tens of seconds [11,27]. For regional warning systems in recent years a few researchers have used machine learning models such as the backpropagation neural networks (BPNN) [18] and support vector machine (SVM) [43].

U.S. Federal Emergency Management Agency (FEMA) employs the Integrated Public Alert & Warning System (IPAWS), an online software system, to issue alarms automatically before catastrophic events such as hurricanes and major earthquakes [26]. The recent enactment by the U.S. Congress on December 18, 2015 gives the Federal Communications Commission (FCC) nine months to report the necessary actions required to deliver IPAWS earthquake emergency alarms to the public rapidly in less than 3 s [25]. Such alarms require to be issued seconds or minutes before the earthquakes to help people take appropriate actions [46]. The U.S. Geological Survey (USGS) encourages researchers to develop an Earthquake Early Warning System (EEWS) for the purpose of saving lives and managing the critical lifeline infrastructure and essential facilities [47].



Corresponding author. E-mail addresses: rafi[email protected] (M.H. Rafiei), [email protected] (H. Adeli).

http://dx.doi.org/10.1016/j.soildyn.2017.05.013 Received 17 February 2017; Received in revised form 2 May 2017; Accepted 8 May 2017 0267-7261/ © 2017 Elsevier Ltd. All rights reserved.

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

over ∆T , (dE1/2 ), the mean time between characteristic events, (Tchar ), and the coefficient of variation of the mean time between characteristic events (C ). Characteristic events are groups of events based on their range of magnitude. For example, events with a magnitude between a predefined range, say 6–6.5 Richter, represent characteristic events for that range of magnitude in a given time period defined as multiple of time-resolution. Recently, in a similar work, Asim et al. [17] use these seismicity indicators to forecast earthquake magnitude in Hindukush region using PNN, recurrent neural network, random forest, and linear programming boost ensemble of decision trees. In contrast to P-wave based EEWS, a seismicity indicator-based EEWS can potentially forecast major earthquakes weeks within a time resolution of weeks with some level of certainty [35]. Seismicity indicator-based and P-wave based EEWSs can work in tandem to increase the probability of detection of a major earthquake and reduce the probability of a false alarm.

Machine learning and neural network algorithms can be categorized into two main categories: 1) Supervised learning and 2) Unsupervised learning [28,45]. In supervised learning, the learner is presented with a set of inputs and outputs. In contrast, in unsupervised learning, the learner has to learn from a set of inputs only, that is, to extract meaningful features from input data [31,33,34,38,19]. BPNN is a supervised learning approach which employs an intensive iterative algorithm to reduce the difference between the estimated and desired outputs [28,7]. Böse et al. [18] propose an EEWS using BPNN with two hidden layers that employs data obtained from sensors as inputs. BPNN is used to forecast the location of hypocenter and the magnitude of the coming earthquake. The inputs of the BPNN are the Pwave arrival time differences on seismometers and the cumulative absolute velocity computed by integration of the ground acceleration over time obtained from accelerometers. SVM is a supervised learning approach [23] that is, in general, faster than BPNN for complex pattern recognition problems. Reddy and Nair [43] propose a hybrid regional EEWS using wavelet decomposition method [14,24] and SVM. The model extracts useful data from seismograms (P-waves) or accelerograms. The inputs of SVM are the wavelet coefficients [15,16] and the output is a class, 1 if the magnitude of the earthquake is greater than a predefined threshold and 2 if not. The accuracy of SVM in classifying new data depends highly on the type and the parameters of the kernel, a usually nonlinear transformation function to classify the linearly inseparable clusters. Mu and Yuen [32] present a heterogeneous Bayesian learning approach for ground motion prediction. In a comprehensive EEWS, the probabilities of false and missed alarms need to be estimated and based on tradeoff between such probabilities and the associated costs, a decision needs to be made to issue a public alarm or not. One uncertainty about an EEWS alarm is the susceptibility to vibrations initiated by events other than an earthquake such as wind, hurricane, and floods. Hsu et al. [27] use singular spectrum analysis, fast Fourier transform, and SVM to distinguish the vibrations of an earthquake from a non-earthquake event so that the probability of a false alarm is reduced. The total number of SVM inputs is nine, features extracted from the first 3 s of P-wave: peak absolute value of velocity, peak absolute value of displacement, predominant period, integral of absolute acceleration, integral of absolute velocity, integral of absolute displacement, peak value of acceleration times velocity, peak value of acceleration times displacement, and peak value of velocity times displacement. The seismicity of a region can be described mathematically through seismicity indicators. These indicators are computed using the earthquake magnitude temporal distribution before an event. An event is defined as the maximum earthquake magnitude occurring in a time resolution. If the time resolution is one month up to 600 events will be available in 50 years (some time resolutions may not have any earthquake based on a preselected threshold). Panakkat and Adeli [35] employ three neural network models to forecast earthquake magnitudes in southern California and San Francisco bay region using a number of seismicity indicators: LevenbergMarquardt backpropagation network, recurrent network, and radial basis function. They report recurrent neural network yields the most accurate results. Panakkat and Adeli [37] and Adeli and Panakkat [1] present a recurrent neural network and a probabilistic neural networks (PNN) model to forecast the earthquake time, location, and magnitude in the future, respectively. Martínez-Álvarez et al. [29] perform a sensitivity analysis to identify the best combination of seismicity indicators to forecast the earthquake magnitude in Chile and the Iberian Peninsula using three classification algorithms: Naive Bayes, K-nearest neighbors, and SVM. They report the best combination consists of the time elapsed, ∆T , over a predefined number, N , of events, the mean magnitude of the last N events (Gmean ), the rate of the square root of seismic energy released

1.3. Parameters representing seismic potential of a region The seismic indicators used in this research are similar to those used in Panakkat and Adeli [35]: 1) ∆T , 2) Gmean , 3) dE1/2 , 4) the slope of the Gutenberg-Richter inverse power law, known as the b -value, over the last N events, 5) summation of the mean squared deviation from the regression line based on the Gutenberg-Richter inverse power law in the past N events (D ), 6) the difference between the maximum observed magnitude among the last N events and the largest expected through the Gutenberg-Richter relationship, (ΔG ), 7) Tchar , and 8) C . For a mathematical definition of these indicators, see Panakkat and Adeli [35]. Gutenberg-Richter inverse power low describes an approximate linear relation between the magnitude of an event and the logarithm of the frequency of events of magnitude equal or lower than that magnitude. 1.4. Objective of the research In this paper, a novel seismicity indicator-based EEWS model, dubbed neural EEWS (NEEWS), is presented for forecasting the earthquake magnitude and its location weeks before occurrence using a combination of a classification algorithm (CA) based on machine learning concepts [4] and a mathematical optimization algorithm (OA) [3,6]. The role of the CA is to find whether there is an earthquake in a given time period greater than a predefined magnitude threshold and the role of the OA is to find the location of that earthquake with the maximum probability of occurrence. The model is tested using a combination of four CAs and one OA to find the best EEWS model. The four CAs are the recently-developed Neural Dynamic Classification (NDC) [41], Probabilistic Neural Network (PNN), Enhanced Probabilistic Neural Network (EPNN) [10], and Support Vector Machine (SVM) [20,22]. The OA is the Neural Dynamic Optimization of Adeli and Park (NDAP) [39,40,8,9]. The selection of NDAP is based on the fact that it is known as a robust algorithm for solution of highly nonlinear optimization algorithms with discontinuous constraints [12,13]. 2. A novel methodology for forecasting the earthquake magnitude and its location The proposed model, NEEWS, employs a CA to forecast the magnitude ranges of the maximum earthquake and an optimization algorithm to estimate the longitude and latitude of the location of that earthquake in a given region and a given time lag number. The range of earthquake is in Richter magnitude increments such as 4.5–5.0. The region is defined with minimum and maximum latitudes and longitudes. The time lag number represents the time difference between now and a selected time in the future to predict the earthquake as a multiple of a preselected time resolution. Time resolution can be a month, a two-week period, or a week. 418

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

Start

2. Select a predefined magnitude threshold

3. Compute seismicity indicators for each event using a predefined to form event data

5. Divide event point data into training and testing data 6. Find the best combination of the seismicity indicators using NDC

4. Assign a category to each data point in the event data based on the time-lag number and the magnitude threshold

Computation of the seismicity indicators

Forecast the earthquake Determine the best combination of magnitude and location seismicity indicators in the future

1. Select a time in the future as a multiple of the time resolution when the earthquake needs to be forecasted and compute the time-lag number

7. Compute the class (1 or 2) of the earthquake in the future 8. Find the coordinates of the locations where the earthquake of class 2 is expected

Yes

Forecasted class is 2? No Yes

All predefined magnitude thresholds

End

No Fig. 1. the general flowchart of neural earthquake early warning system (NEEWS).

This is a complicated multi-dimensional pattern recognition problem with a large training database. CA is used to determine if a certain range of earthquake is probable in the selected future and the optimization algorithm (OA) is used to estimate the coordinates of the predicted magnitude range. The general flowchart of NEEWS is presented in Fig. 1. It consists of

NEEWS requires the CA to be trained using the best combination of seismicity indicators corresponding to the previously-collected records of earthquakes, referred to as training data, for the given region. From the user point of view, the model requires three inputs: 1) a time resolution 2) a time lag number or a time in the future for occurrence or non-occurrence of an earthquake, and 3) earthquake magnitude ranges.

Table 1 The seismicity indicators computed every half month in 2000 for southern California (Gthreshold = 4. 5, N = 100 ). Month

1st/2nd half

1) ΔT (days)

2) Gmean (Richter)

3) dE1/2 (Erg)

4) b-value

5) D

6) ΔG (Richter)

7) Tchar (days)

8) C

1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

7125 7125 7125 7125 7125 7125 7125 7271 7267 7267 7166 7001 7001 7001 7001 7001 7001 7001 7001 7001 7001 7001 7001 7001

4.02 4.01 4.01 4.01 4.00 4.00 4.01 4.01 4.03 4.03 4.03 4.04 4.03 4.04 4.04 4.04 4.04 4.04 4.04 4.03 4.03 4.03 4.04 4.04

1.91E+22 1.91E+22 1.91E+22 1.91E+22 1.91E+22 1.91E+22 1.91E+22 1.87E+22 1.87E+22 1.87E+22 1.90E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22 1.92E+22

0.63 0.63 0.63 0.63 0.64 0.64 0.64 0.63 0.64 0.64 0.63 0.63 0.63 0.63 0.64 0.64 0.64 0.64 0.64 0.64 0.64 0.64 0.64 0.64

0.0086 0.0083 0.0082 0.0081 0.0089 0.0089 0.0093 0.0095 0.0098 0.0098 0.0099 0.0099 0.0096 0.0098 0.0096 0.0095 0.0094 0.0095 0.0093 0.0089 0.0087 0.0086 0.0090 0.0091

0.52 0.53 0.52 0.51 0.57 0.57 0.56 0.55 0.54 0.54 0.53 0.52 0.52 0.53 0.53 0.54 0.54 0.55 0.56 0.56 0.55 0.55 0.55 0.54

17.81 17.92 17.98 17.62 17.61 17.81 17.75 17.74 116.33 17.98 18.06 18.06 18.02 17.99 18.00 17.93 18.00 17.83 17.77 17.93 17.86 18.00 17.90 17.88

0.49 0.49 0.49 0.51 0.50 0.50 0.50 0.50 0.99 0.50 0.50 0.50 0.50 0.50 0.50 0.51 0.51 0.51 0.52 0.51 0.51 0.51 0.52 0.52

Gthreshold : Earthquake magnitude threshold; N: Predefined number of events; Nlag : Lag number; ∆T : Time elapsed (days); Gmean : The mean magnitude; dE1/2 : the rate of the square root of seismic energy; b -value: The slope of the Gutenberg-Richter inverse power law; D : Summation of the mean squared deviation from the regression line based on the Gutenberg-Richter inverse power law; ΔG : The difference between the observed maximum magnitude among the last N events and the largest expected; Tchar : The mean time between characteristic events;

C : the coefficient of variation of the mean time between characteristic events.

419

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

2.2. Step 2: Magnitude threshold

Table 2 An example to show how to assign a class (1 or 2) based on the value of Nlag and Gthreshold ( Nlag = 4 and Gthreshold = 4.5). Month

May

June

July

Event number 1st or 2nd half month Magnitude Class

1 1

2 2

3 1

4 2

5 1

4.4 2

5.1 2

3.9 1

2.5 2

4.6 2

August

September

6 2

7 1

8 2

9 1

10 2

5.0 1

3.6 N/A

4.7 N/A

5.5 N/A

2.4 N/A

A magnitude threshold, Gthreshold , is selected out of all predefined magnitude thresholds in the vector Gthreshold . For example, Gthreshold = {4.5,5.0,5.5,6.0,6.5}. Each threshold is used to categorize the data into two categories: 1) earthquake magnitude is less than the threshold value and 2) not.

2.3. Step 3: Computation of seismicity indicators Gthreshold : Threshold magnitude; Nlag : Time-lag number.

The eight seismic indicators are computed using the records of the earthquake magnitudes for the given region and the period of the available instrumental data. For example, the daily records of earthquake magnitude in southern California from 1932 to 2016 is available at http://scedc.caltech.edu/ and can be used to compute the seismicity indicators. If instrumental data is available for a long period of time, say 50 years, and the time resolution is selected as half a month, there are 50 × 12 × 2 = 1200 available events. However, depending on the value of N and the temporal distribution of the earthquake magnitude in the instrumental data, some of the seismicity indicators cannot be computed for early events. If the total seismicity indicators are computed for say 1000 events, there are 1000 data points (feature vectors) in hand. Table 1 shows a sample of seismicity indicators calculated every half month for southern California in 2000 (the first 15 days of each month are considered the first half and the remaining days are considered the second half to take into account the non-uniform number of days in each month). Each event is represented by I seismicity indicators included in a row vector, X , called an event data point.

8 steps. The first step is to choose a time resolution along with a time in the future to forecast the earthquake. Steps 2–4 are to prepare input and output data for the training of the proposed model. Steps 5 and 6 are to determine the best combination of the seismicity indicators. Steps 7 and 8 are to compute the earthquake magnitude and coordinates of the future earthquake, respectively.

2.1. Step 1: Time resolution and time-lag number A time resolution is selected along with a time in the future when an earthquake needs to be estimated in a region as a multiple of the resolution period. For example, for a resolution of one week, one can choose the future prediction time of first week of September 2017. Based on this time, the time difference between now and the selected time in the future as a multiple of the resolution is called time-lag number and denoted as Nlag in this paper. For example, if the time resolution is one week, today’s time is the first week of August 2016, and the future time is the second week of October 2016, the time-lag number will be 8.

Fig. 2. The architecture of Neural Dynamic Classification Algorithm (NDC).

420

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

sign of Rm positive. This optimization problem is solved by NDAP. The two transformed form of X , Z1 and Z2 , are placed in a 1 by (2 × J ) matrix:

2.4. Step 4: Categorization based on time-lag number Each event is assigned one of two categories. If the magnitude of an event Nlag times resolution after the current event is more than Gthreshold , the current event is assigned category 2, otherwise 1. Table 2 shows an example of how to assign categories to 10 consecutive events having Nlag = 4 and Gthreshold = 4.5. For example, the category of the first event (1st half of May) is 2 because Nlag = 4 times resolution later (1st half of July) the earthquake magnitude is 4.6 which is greater than Gthreshold = 4. 5. Assigning a category based on the magnitude of an event Nlag times resolution after the current event is to investigate if a CA is able to forecast the magnitude of an event Nlag times resolution before the next earthquake.

Ztotal = [Z1, Z2]

(2)

total

Each row in Z is called a feature vector representer (FVR). In the pattern layer, Ztotal is compared with the FVR form of each training event data points in class m using the following probability density function (PDF) [44]:

ϕ(Ztotal ) =

⎛ total − Ztotal ⎜− U × exp I I ⎜ 2σ 2 ⎝ (2π ) 2 σ 1

2

⎞ ⎟ ⎟ ⎠

(3)

total

where U is the FVR form of one out of Nm training event data points in class m and σ is the spread of the Gaussian function (σ ∈ [01]). In the summation layer, the average likelihood of Ztotal belonging to the same class m is:

2.5. Step 5: Division of event point data into machine learning training and testing data In order to perform sensitivity analysis of CA to forecast earthquake ranges in the future, a number of recent event data points, say 10% of the total events, are selected among the events time-lag number before the last event to test the CA. For example, if time-lag number is 6, the time resolution is half month, the last event in the data point is the 2st half of May 2016, and the total number of testing data is 150, then 150 event data points before the 1nd half of March 2016, that is the 150 most recent data, are selected as the testing data. Each time a testing data is selected, all the event data points before that are used as the training data.

Pm(Ztotal ) =

Nm

∑ Φn(Ztotal ) n =1

(4)

In the decision layer, the estimated class of X is the one with the maximum average likelihood:

Class of X = arg max{Pm(Ztotal )} m

(5)

2.6.2. Sensitivity analysis using NDC A global sensitivity analysis is performed to find the best combination of seismicity indicators as inputs to a CA in order to maximize the average accuracy of the classification. All possible combinations of I indicators are evaluated. Hence, using the best combinations of inputs, the size of each event data point becomes Ir ≤I .

2.6. Step 6: A sensitivity analysis to find the best combination of seismicity indicators for magnitude prediction 2.6.1. NDC model for earthquake magnitude prediction Recently, Rafiei and Adeli [41] developed a new supervised classification algorithm, dubbed Neural Dynamic Classification (NDC), capable of solving highly complicated classification problems by employing the patented robust neural dynamic optimization model of Adeli and park (NDAP) (U.S. Patent Number: 5,815,394) [2]. NDC discovers a new feature space with large margins between clusters and close proximity of the classmates using a set of transformation functions. Fig. 2 presents the architecture of the NDC model for earthquake magnitude prediction. It consists of 5 layers: input layer, layer of feature vector representer, pattern layer, summation layer, and decision layer. In the first layer, an I-dimensional feature vector of seismicity indicators is used as input. In the next layer, the inputs are transformed into a J-dimensional feature space twice using 2 linear transformation functions, each corresponding to one of the two aforementioned classes. The linear transformation function corresponding to class m ∈ {1,2} is expressed as

Zm = XWm + Bm

1 Nm

2.7. Step 7: Prediction of the magnitude of the future earthquake The class of the earthquake Nlag times resolution after the last event in the available event data is estimated using the best combination of inputs found in step 6. All event data points except the last one are considered as the training data and the last one as the testing data of a CA. If the estimated class is 2, the coordinates of the last event is computed in the next step 8; otherwise, the algorithm stops. 2.8. Step 8: Prediction of the coordinates of the future earthquake It is expected that for X2 , an event data point with estimated class of 2 in step 7 (Pm =1 ≤ Pm =2 ), the larger the difference between Pm=1 and Pm=2 , the higher is the probability of occurrence of an earthquake greater than Gthreshold in X2 ’s corresponding location, because the average likelihood of that location being in the same class as earthquakes with class 2 is larger than that of class 1. Using this notion, in step 8, a nonlinear minimization problem with a fitness function defined as the difference between the likelihoods of X2 being in classes 1 and 2, and 5 constraints is defined as follows:

(1)

where Wm is an I × J matrix of weights and Bm is a J-dimensional row vector of biases [7], collectively referred to as the m th transformation parameters, and Zm is a 1 by J matrix of transformed X . The transformation parameters are optimized such that large margins between clusters and close proximity of the classmates in the Jdimensional feature space are achieved. The fitness function of the resulting optimization problem is the summation of Euclidean distances between transformed form of each training event data point in class m and centroid of the class in the Jdimensional feature space. This optimization problem has three sets of inequality constraints. The first set of inequality constraints is to push and keep the transformed training event data points in class m within a hypersphere with radius Rm . The second set of inequality constraints is to keep the transformed training event data points in the other class out of the hypersphere. The third set of inequality constraints is to keep the

f (H ) = P1(H ) − P2(H )

(6)

g1(H ) = P1(H ) − P2(H ) < 0

(7)

g2(H ) = L1 − L1max < 0

(8)

g3(H ) = L1min − L1 < 0

(9)

g4(H ) = L 2 − L 2 max < 0

(10)

g5(H ) = L 2 min − L 2 < 0

(11)

where H = [X2,L ], an Ir +2 dimensional row vector, L = [L1, L 2], L1 and L 2 are the longitude and latitude corresponding to X2 , respectively, L1min and L1max are the minimum and maximum longitudes in the region, 421

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

Table 3 The best combination of seismicity indicators (step 6, Fig. 1) corresponding to various magnitude thresholds-lag combinations using NDC (1: selected, 0: not selected).

Nlag − G

threshold

Nlag

Gthreshold

Combination Number

Tchar 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Seismicity indicators

∆T

Gmean

0 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0 1 0 1 1 0 0 1 1 0 1 0 0 0 0 1 0 0 0 1 1 0 1 1 1

0 1 1 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 1 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 1 0 1 1 1 1

Accuracy of NDC (%)

dE1/2

b --

value

D

ΔG

1 1 1 1 1 1 0 1 1 1 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 0 0 1 1 1 1 1 1

0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1 0 1 1

1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 0 1 1 0 1 0 1 1 0 1 0 0 1 0 1 1 1 1 0 0 0 0 1

1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 1 0 1 1 1 1 1 0 1 0 1 1 1 1 0 1 1 1 1 0 1 0 1 1

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 0 0 1 1 1

C 1 1 1 1 1 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10 10 10 10 10 11 11 11 11 11 12 12 12 12 12

4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5

0 1 1 1 1 1 0 1 1 1 0 0 1 1 1 1 0 0 1 1 0 1 0 1 1 0 0 1 1 1 1 1 1 0 1 1 0 1 1 1

100.0 100.0 100.0 100.0 100.0 97.3 99.3 100.0 99.3 99.3 95.3 100.0 100.0 100.0 99.3 99.3 100.0 100.0 99.3 100.0 96.7 99.3 100.0 100.0 100.0 99.3 98.7 100.0 100.0 100.0 97.3 98.7 98.7 100.0 99.3 100.0 99.3 99.3 99.3 100.0

Gthreshold : Earthquake magnitude threshold; Nlag : Lag number; ∆T : Time elapsed (days); Gmean : The mean magnitude; dE1/2 : the rate of the square root of seismic energy; b -value: The slope of the Gutenberg-Richter inverse power law; D : Summation of the mean squared deviation from the regression line based on the Gutenberg-Richter inverse power law; ΔG : The difference between the observed maximum magnitude among the last N events and the largest expected; Tchar : The mean time between characteristic events; C : the coefficient of variation of the mean time between characteristic events; NDC: Neural dynamic classification

Table 4 The estimated class of future earthquakes in 2016 and the actual and estimated coordinates of those with estimated class of 2 by at least one out of 4 CAs. Nlag−Gthreshold combination number in Table 3

1 2 6 7 27 38

Time-lag number

1 1 6 6 10 12

Magnitude threshold

4.5 5.0 4.5 5.0 5.0 5.5

Calendar month number

6 6 8 8 10 11

1st or 2nd half

1 1 2 2 2 2

Estimated class2-occurrence of an earthquake

Estimated coordinate

Actual coordinate

SVM

PNN

EPNN

NDC

Longitude

Latitude

Longitude

Latitude

2 2 2 2 2 2

2 2 1 1 1 1

2 2 1 1 1 1

2 2 1 1 1 1

32.7 37.0 32.0 37.0 32.4 34.6

−115.7 −115.5 −121.6 −118.4 −116.2 −120.9

33.4 33.4 N/A N/A N/A N/A

−116.4 −116.4 N/A N/A N/A N/A

Gthreshold : Magnitude threshold; Nlag : Lag number; CA: Classification algorithm; SVM: Support vector machine; PNN: Probabilistic neural network; EPNN: Enhanced probabilistic neural network; NDC: Neural dynamic classification algorithm.

422

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

Table 5 POD, FAR, R score, and accuracy (%) for different magnitude thresholds using the best combination of inputs from Table 4 and four CAs; SVM, PNN, EPNN, and NDC for southern California.

Nlag − G

Nlag

threshold

Gthreshold

Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 Average

1 1 1 1 1 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10 10 10 10 10 11 11 11 11 11 12 12 12 12 12

4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5

SVM

PNN

EPNN

NDC

POD

FAR

R

ACC

POD

FAR

R

ACC

POD

FAR

R

ACC

POD

FAR

R

ACC

0.82 0.80 0.33 0.00 0.00 0.70 0.50 0.75 0.00 0.00 0.30 0.00 0.25 0.00 0.00 0.00 0.50 0.50 0.00 0.00 0.27 0.40 0.75 0.00 0.00 0.37 0.40 0.75 0.00 0.00 0.10 0.09 0.00 1.00 0.00 0.59 0.00 0.25 0.00 0.00 0.26

0.00 0.47 0.50 1.00 1.00 0.79 0.90 0.63 1.00 1.00 0.82 1.00 0.89 1.00 1.00 1.00 0.83 0.97 1.00 1.00 0.72 0.93 0.88 1.00 1.00 0.74 0.88 0.85 1.00 1.00 0.25 0.97 1.00 0.92 N/A 0.53 1.00 0.96 1.00 N/A 0.85

0.82 0.33 −0.17 −1.00 −1.00 −0.09 −0.40 0.13 −1.00 −1.00 −0.52 −1.00 −0.64 −1.00 −1.00 −1.00 −0.33 −0.47 −1.00 −1.00 −0.46 −0.53 −0.13 −1.00 −1.00 −0.38 −0.48 −0.10 −1.00 −1.00 −0.15 −0.88 −1.00 0.08 N/A 0.07 −1.00 −0.71 −1.00 N/A −0.58

96.7 94.0 98.0 98.7 97.3 42.7 66.0 96.0 98.7 94.0 58.0 90.7 92.7 96.7 94.7 78.0 80.0 58.0 96.7 98.7 71.3 62.7 84.0 98.0 96.0 66.0 77.3 88.0 89.3 96.7 80.7 68.0 88.0 92.7 99.3 77.3 88.7 83.3 96.7 99.3 85.7

0.86 0.90 0.67 1.00 1.00 0.90 1.00 1.00 1.00 1.00 0.77 1.00 0.75 1.00 0.00 0.97 1.00 1.00 1.00 1.00 0.83 0.90 1.00 1.00 1.00 1.00 0.80 1.00 1.00 1.00 1.00 0.73 1.00 1.00 1.00 1.00 1.00 0.50 1.00 1.00 0.91

0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 0.50 0.50 0.04 0.52 0.40 0.00 1.00 0.00 0.55 0.43 0.00 0.50 0.04 0.10 0.56 0.00 0.00 0.40 0.11 0.64 0.67 0.00 0.45 0.00 0.33 0.50 0.00 0.26 0.33 0.00 0.00 0.00 0.23

0.86 0.90 0.67 0.50 1.00 0.90 1.00 1.00 0.50 0.50 0.73 0.48 0.35 1.00 −1.00 0.97 0.45 0.57 1.00 0.50 0.79 0.80 0.44 1.00 1.00 0.60 0.69 0.36 0.33 1.00 0.55 0.73 0.67 0.50 1.00 0.74 0.67 0.50 1.00 1.00 0.68

97.3 99.3 99.3 99.3 100.0 98.0 100.0 100.0 99.3 99.3 94.7 92.7 98.0 100.0 98.0 99.3 92.0 98.0 100.0 99.3 96.0 98.7 96.7 100.0 100.0 86.7 98.0 95.3 98.7 100.0 83.3 98.0 98.7 99.3 100.0 92.7 96.0 98.7 100.0 100.0 97.5

0.86 0.90 0.67 1.00 1.00 0.90 1.00 1.00 1.00 1.00 0.83 1.00 0.75 1.00 0.00 0.97 1.00 1.00 1.00 1.00 0.97 0.90 1.00 1.00 1.00 1.00 0.90 1.00 1.00 1.00 1.00 0.73 1.00 1.00 1.00 1.00 1.00 0.50 1.00 1.00 0.92

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.04 0.00 0.00 0.00 N/A 0.00 0.00 0.00 0.00 0.00 0.03 0.00 0.00 0.00 0.00 0.12 0.00 0.00 0.00 0.00 0.31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01

0.86 0.90 0.67 1.00 1.00 0.90 1.00 1.00 1.00 1.00 0.79 1.00 0.75 1.00 N/A 0.97 1.00 1.00 1.00 1.00 0.93 0.90 1.00 1.00 1.00 0.88 0.90 1.00 1.00 1.00 0.69 0.73 1.00 1.00 1.00 1.00 1.00 0.50 1.00 1.00 0.93

97.3 99.3 99.3 100.0 100.0 98.0 100.0 100.0 100.0 100.0 96.0 100.0 99.3 100.0 99.3 99.3 100.0 100.0 100.0 100.0 98.7 99.3 100.0 100.0 100.0 97.3 99.3 100.0 100.0 100.0 90.7 98.0 100.0 100.0 100.0 100.0 100.0 98.7 100.0 100.0 99.3

1.00 1.00 1.00 1.00 1.00 0.87 1.00 1.00 0.00 0.00 0.80 1.00 1.00 1.00 0.00 0.97 1.00 1.00 0.00 1.00 0.93 0.90 1.00 1.00 1.00 0.97 0.90 1.00 1.00 1.00 0.90 0.82 0.50 1.00 0.00 1.00 0.92 0.75 0.00 1.00 0.81

0.00 0.00 0.00 0.00 0.00 0.00 0.09 0.00 N/A N/A 0.04 0.00 0.00 0.00 N/A 0.00 0.00 0.00 N/A 0.00 0.10 0.00 0.00 0.00 0.00 0.00 0.10 0.00 0.00 0.00 0.03 0.00 0.00 0.00 N/A 0.00 0.00 0.00 N/A 0.00 0.01

1.00 1.00 1.00 1.00 1.00 0.87 0.91 1.00 N/A N/A 0.76 1.00 1.00 1.00 N/A 0.97 1.00 1.00 N/A 1.00 0.84 0.90 1.00 1.00 1.00 0.97 0.80 1.00 1.00 1.00 0.87 0.82 0.50 1.00 N/A 1.00 0.92 0.75 N/A 1.00 0.94

100.0 100.0 100.0 100.0 100.0 97.3 99.3 100.0 99.3 99.3 95.3 100.0 100.0 100.0 99.3 99.3 100.0 100.0 99.3 100.0 96.7 99.3 100.0 100.0 100.0 99.3 98.7 100.0 100.0 100.0 97.3 98.7 98.7 100.0 99.3 100.0 99.3 99.3 99.3 100.0 99.4

SVM: Support Vector Machine; PNN: Probabilistic Neural Network; EPNN: Enhanced Probabilistic Neural Network; NDC: Neural Dynamic Classification Algorithm; ACC: Accuracy (%); POD: Probability of Detection; FAR: False Alarm Ratio; R: R Value; CA: Classification algorithm; Gthreshold : Magnitude threshold; Tlag : Time-lag number; CA: Classification algorithm.

3. Reliability measures of NEEWS

respectively,L 2 min and L 2 max are the minimum and maximum latitudes in the region, respectively, andP( 1 H ) and P2(H ) are the average likelihoods of H being in class 1 and 2, respectively (Eq. (4)). The PDF is in the form of Eq. (3) as follow:

1

Φn(H ) = (2π )

Ir +2 (I +2) 2 σ r

⎛ H −H n, m × exp ⎜⎜ − 2σ 2 ⎝

2

⎞ ⎟⎟ ⎠

3.1. Statistical measures for evaluating the prediction power of the model Following Panakkat and Adeli [35], three statistical measures are computed to verify the reliability of the classification results: the probability of detection (POD) of earthquakes greater than Gthreshold , false alarm ratio (FAR) which shows the ratio of events that are not greater than Gthreshold but estimated as class 2, and Hanssen-Kuiper skill score or R score, which is close to 1 if the accuracy of classification is near 100% [42]:

(12)

where Hn, m = [Vn, m,L n, m], Vn, m is the n th training feature vector in class m , and Ln, m contains the longitude and latitude corresponding to Vn, m . The variables to be optimized are L1 and L 2 . The inequality g1 assures that H remains in class 2 during the optimization. The inequalities g2 to g5 assure the longitude L1 and the latitude L 2 are within the maximum and minimum longitudes and latitudes of the region. This optimization problem is solved by NDAP. In the previous step an earthquake of magnitude greater than a preselected threshold corresponding to X2 has been forecast in the region. The probability of location of that earthquake being in different longitudes and latitudes within the region is computed by investigating the fitness value in each iteration; the smaller is the fitness value, the higher is the probability of the corresponding coordinates to be close to the epicenter of an earthquake.

POD =

FAR =

Npc Npc + Nni

(13)

Npi Npc + Nnc

(14)

R = POD − FAR

(15)

where Npc (estimated-correct) is the number of data points an earthquake of threshold magnitude or greater occurred and was estimated correctly, Nnc (not estimated-correct) is the number of data points 423

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

where Ddistance is the distance between the estimated and actual coordinates and Ddiagonal is the diagonal length of the rectangular region. The closer the value of Rcloseness to 1, the more accurate will be the estimated coordinates.

Table 6 The magnitude distribution in the testing data points corresponding to each threshold magnitude and each lag number.

Nlag − G

threshold

Nlag

Gthreshold

Number

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

1 1 1 1 1 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10 10 10 10 10 11 11 11 11 11 12 12 12 12 12

4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5 4.5 5.0 5.5 6.0 6.5

Number of data points available Less than threshold (estimated class 1)

Greater than or equal to threshold (estimated class 2)

28 10 3 1 1 27 10 4 1 1 26 10 4 1 0 29 10 4 1 1 30 9 4 1 1 29 9 4 1 1 29 9 4 1 1 32 12 3 1 1

122 140 147 149 149 123 140 146 149 149 124 140 146 149 150 121 140 146 149 149 120 141 146 149 149 121 141 146 149 149 121 141 146 149 149 118 138 147 149 149

Average Rcloseness

4. Implementation The proposed algorithm has been developed in MATLAB version 8.6.0.267246 programming tool. The sensitivity analysis in step 6 of NEEWS requires running the algorithm for different combination of inputs. For example, in the example presented in the next section there are 1498 event data points, 150 testing data, I = 8, J = 3, and the number of iterations in NDC is chosen to be 20. It takes about 600 s to evaluate one combination of inputs on a CPU core with four 2.4-GHz processors (Intel® Core™ i7-3630QM). The number of possible combiQ2 ! nations of Q1 objects out of Q2 objects is equal to [30]. Hence,

0.71 0.64 0.58 0.73 0.64 0.47 0.43 0.21 0.13 0.39 0.53 0.59 0.49 0.80 N/A 0.49 0.57 0.41 0.20 0.25 0.63 0.49 0.56 0.95 0.86 0.71 0.79 0.92 0.70 0.62 0.59 0.37 0.83 0.86 0.64 0.75 0.63 0.16 0.86 0.96

the

8! 8! + 4!×4! 3!×5! 600 × 255 = 42.5 3600

of

8! 5!×3!

8! 6!×2!

+

+

combinations

+

8! 7!×1!

+

8! 8!×0!

Q1 ! (Q2 − Q1) ! 8! is 1!×7!

+

8! + 2!×6!

= 255. It takes about

5. Description of study region and data used In the proposed model training and testing are performed concurrently. The earthquake data in an area between longitudes -114 and -122 and latitudes 32 and 37 in southern California which includes the Los Angeles county are used to train and test the model. The database used is from the website of Southern California Earthquake Data Center, California Institute of Technology at http://service.scedc.caltech.edu with more than 700,000 records of earthquake with Richter magnitude in the range M = 0.1 - 7.3 collected during the period January 1, 1932 to May 31, 2016. In this research only damaging earthquakes defined as those with a magnitude equal or greater than 4.5 are used. 6. Results and discussion

during which an earthquake of threshold magnitude or greater did not occur and was not estimated, Npi (estimated-incorrect) is the number of data points during which an earthquake of threshold magnitude or greater did not occur but was estimated, and Nni (not estimatedincorrect) is the number of data points during which an earthquake of threshold magnitude or greater occurred but was not estimated. The values of the three statistical measures can also be used in the NEEWS decision making where the trade-off between the costs of a false alarm and a missed alarm is considered [21].

6.1. Future times to forecast earthquakes and time-lag numbers (Step 1) A time resolution of half month is chosen in this example. Eight different times in the future are chosen after May 2016 to forecast the magnitude of earthquakes: 1st half month of June, September, October, and November and 2nd half month of August, September, October, and November in 2016. The corresponding Nlag times resolution are 1, 7, 9, 11, 6, 8, 10, and 12, respectively (step 1). The proposed model is used to forecast occurrence/non-occurrence of a damaging earthquake in eight future times, each time using a different Nlag .

3.2. Accuracy of the estimated coordinates In order to verify the accuracy of the estimated coordinates in step 8, the coordinates of each testing data with estimated class of 2 using the best combination of seismicity indicators in step 6 are compared with its actual coordinates. The following closeness ratio is defined to evaluate the closeness of estimated and actual coordinates:

Ddistance Ddiagonal

number

CPU hours to compute the accuracy of all combinations for just one magnitude threshold. If there are 5 magnitude thresholds, it takes about 42.5 × 5 = 212.5 CPU hours to compute the best combination for a time-lag number. If 8 lag numbers need to be evaluated, 212.5 × 8 = 1700 CPU hours are required. As such, this research was performed on the Ruby cluster at the Ohio Supercomputer [5]. For the aforementioned example, it takes about 137 s to evaluate a combination of inputs on a CPU core with twelve processors (Intel® Xeon E5 2670 V2) on the Ruby cluster. The total CPU hours required for 8 time137 × 255 × 5 × 8 ≈ 390 CPU hours. lag numbers on the Ruby cluster is about 3600 Having 40 CPU cores on Ruby cluster, each for 10 h, the above problem is solved within 39 CPU hours.

Rcloseness : Closeness ratio; Gthreshold : Magnitude threshold; Tlag : Time-lag number.

Rcloseness = 1 −

total

6.2. Computation of the seismicity indicators (Step 2–4) Five magnitude thresholds are defined as Gthreshold = {4.5,5.0,5.5,6.0,6.5} (step 2). For each threshold magnitude, a different predefined N is selected, and 1498 eight-dimensional event data points (feature vectors) (62 years×12 months ×2 half months+10 half months in 2016= 1498 event data points) are computed for each half month from January 1954 to the end of May 2016 using instrumental

(16) 424

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

Fitness value

Fitness value

M.H. Rafiei, H. Adeli

Fig. 3. Left figures represent the actual and estimated coordinates corresponding to two magnitude thresholds and time-lag number with the maximum closeness ratio. The circular arches encompass the area with P1 < P2 in Eq. (6). (The background image is from google map). Right figures: the fitness value versus the number of iterations corresponding to the two magnitude thresholds and time-lag number in the left figures.

seismic data. Each event data point contains the values of eight seismicity indicators (I = 8) corresponding to that event. Based on the value of the magnitude threshold, and time-lag number, a category of 2 (occurrence of an earthquake of magnitude equal or greater than the threshold value) or 1 (nonoccurrence of an earthquake) is assigned to each event data point (step 4).

8 × 5 = 40 combinations of Nlag − Gthreshold (for 8 time lags and 5 magnitude thresholds). Each row of this table presents the best combination out of 255 possible combinations of the seismicity indicators corresponding to a particular combination of magnitude threshold and time lag number (1 means the corresponding seismicity indicator is selected and 0 means not). The most frequently selected seismicity indicator is C followed by b -value, and then Tchar . Similar results are reported by Martínez-Álvarez et al. [29]. The last column of this table shows the average accuracies of 150 tests for NDC using the corresponding best combination. This table shows high accuracy in the range of 95.3 – 100%. The low range of the NDC accuracy can be increased by using a number of iterations more than 20 used in this research.

6.3. Determining the best combination of seismicity indicators (Step 5 and 6) The 150 event data points time-lag number before the latest event (about 10% of the total number of data points) are selected among the events as testing data. For example, for Nlag = 1, 150 data points corresponding to the events between the 1nd half month of March 2010 and 2st half month of May 2016 are selected as the testing data (step 5). In the results presented in this section, a value of J = 3 is selected for the dimension of the new feature space in NDC, and the total number of iterations (epochs) is limited to just 20, a relatively small number. These numbers resulted in sufficiently accurate results. Larger numbers will increase the computational cost significantly without an appreciable increase in the accuracy. Table 3 presents the best combinations of 8 seismicity indicators resulting in the most accurate classification (step 6), each corresponding to 1 out of

6.4. Future earthquake magnitudes and coordinates (Step 7 and 8) The earthquake magnitudes for 40 Nlag − Gthreshold combinations are estimated (step 7) using the corresponding best combinations of seismicity indicators in Table 3 (Table 4) and four different CAs. Six out of 40 Nlag − Gthreshold combinations have an estimated class of 2 by at least one out of 4 CAs: 1) 4.5-0 (1st half month of June 2016), 2) 5.0-0 (1st half month of June 2016), 3) 4.5–6 (2nd half month of August 2016), 4) 5.0–6 (2nd half month of August 2016), 5) 5.0–10 (2nd half month of October 2016), and 6) 5.5–12 (2nd half month of November 2016). These are summarized in Table 4 which also includes the 425

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli

7. Conclusion

estimated and actual coordinates of these earthquakes (step 8). In this table, the last four earthquakes are only predictions in the future from now and therefore have no actual epicentral coordinates.

Earthquake prediction is a complex and seemingly intractable problem. Authors provided a solution to this problem through adroit integration of a machine learning classification algorithm and the robust neural dynamics optimization algorithm. In particular, a seismicity indicator-based EEWS was presented for forecasting the earthquake magnitude and its location weeks before occurrence using a combination of a CA and OA. The model was tested using earthquake data in southern California with a combination of four CAs and one OA. The four CAs are NDC, PNN, EPNN, and SVN. The OA is the patented neural dynamic optimization of Adeli and Park. The most accurate result was obtained using NDC followed by EPNN. The proposed model is capable of predicting strong disastrous events as long as sufficient data are available for such events. Table 3 presents the accuracy of NDC in predicting 40 Nlag − G combinathreshold tions and Table 6 presents how accurate the corresponding location predictions are by employing the concept of R closeness . These accuracies are computed by taking into account about 150 most recent events available in the repository as the testing data and the rest as training data. The reported accuracy is based on comparing the prediction of the model and the actual earthquake class in the testing database. Further, the model was able to predict the time, magnitude, and location of a future earthquake which happened after the last event in the repository, second half of May 2016) in southern California in the range 5.0–5.5 in the first half of June 2016 and no occurrence of earthquakes by the end of November 2016 (Table 4). The accuracy of these forecasts was verified because there was in fact an earthquake in the range 5.0–5.5 on June 6. The accuracy of the model depends on the size of the available recorded data in the region. With more data, the accuracy of prediction is expected to improve for finer time resolutions. The proposed model is general and not limited to any type of seismicity indicators or earthquake magnitude scale.

6.5. Evaluating the prediction power of the model using four different classification algorithms For each Nlag − Gthreshold combination in Table 3, POD, FAR, R score and the average accuracy of 150 tests are computed for each one of the four classification algorithms SVM, PNN, EPNN, and NDC and summarized in Table 5. The average values are in the last row of the table. The average accuracy and R-score for 40 different Nlag − Gthreshold combinations for NDC are 99.4% and 0.94, respectively, followed by 99.3% and 0.93 for EPNN much better than PNN and SVM. As noted by Panakkat and Adeli [35], R- score “is considered advantageous over POD and FAR because it includes an equal representation of both correct and incorrect predictions.” The average POD for EPNN is 0.92 which is greater than 0.81 for NDC. In terms of accuracy NDC and EPNN consistently provide the most accurate results.

6.6. Use of MEEWS to issue public alarm If an earthquake public alarm is issued for the 2nd half of November 2016 (Nlag − G combination number 38 in Table 4) based on SVM threshold results (a predicted earthquake of class 2 for Gthreshold = 5.5), the probability of false alarm is high because SVM did not show good FAR, R-value, and accuracy in Tables 5 (0.96, −0.71, and 83.3, respectively). In contrast, if an earthquake public alarm is issued for the 1st half of June 2016 (Nlag − G combination number 2 in threshold Table 4) based on NDC or EPNN results (a predicted earthquake of class 2 for Gthreshold = 5.0 ) the probability of detection (false alarm) is high (low) because the accuracies of NDC and EPNN are 100% and 99.3%, respectively and R values are 1.00 and 0.90; relatively high. The largest earthquake that occurred in the 1st half of June 2016 is available in the repository data with a magnitude of 5.2. Hence, the prediction by NDC and EPNN was accurate. Also, after 1st half of June 2016 and before 1st half of September 2016, no earthquake of magnitude greater than 5.5 was recorded in Southern California. Hence, according to Table 4, the magnitude predictions by NDC and EPNN are also accurate. The actual coordinates corresponding to the 5.2 magnitude earthquake are also available in the repository data and noted in Table 4. A relatively high closeness ratio of 0.9 is obtained. In order to evaluate the accuracy of the best combination of inputs in Table 3 to predict the coordinates of major earthquakes, for eachNlag − Gthreshold combination in Table 3, the average Rcloseness of testing data with estimated class 2 is computed and presented in Table 6. For each Nlag − Gthreshold combination, the number of test data points with estimated class 1 and 2 are also shown in this Table. For the magnitude thresholds of 4.5, 5.0, 5.5, 6.0, and 6.5, the best closeness ratio is computed in time-lag number 12, 10, 10, 9, and 12, respectively, with average closeness ratio of 0.75, 0.79, 0.92, 0.95, and 0.96. In order to illustrate the concept of closeness, the estimated and actual coordinates of earthquakes corresponding to Nlag − Gthreshold of 5.5-10, and 6.5-12 in Table 6 are plotted on the southern California map in Fig. 3 (left side). The circular arches encompass the area with P1 < P2 in Eq. (6). Figures on the right of Fig. 3 show plots of the fitness value versus NDAP iteration number corresponding to plots on the left side of Fig. 3. In NDAP, the optimization iterative process stops if the total number of iterations is 15,000 or when the fitness value does not change over two iterations within a tolerance. The smoothness of the convergence curves in Fig. 3 is a clear indication of the robustness of the NDAP optimization algorithm.

Acknowledgement CPU time for this research was provided by the Ohio Supercomputer Center. References [1] Adeli H, Panakkat A. A probabilistic neural network for earthquake magnitude prediction. Neural Netw 2009;22(7):1018–24. [2] Adeli H, Park HS. Neurocomputing for design automation. Boca Raton, Florida: CRC Press; 1998. [3] Adeli H, Sarma K. Cost optimization of structures – fuzzy logic, genetic algorithms, and parallel computing. West Sussex, United Kingdom: John Wiley and Sons; 2006. [4] Hung SL, Adeli H. Parallel backpropagation learning algorithms on Cray Y-MP8/ 864 supercomputer. Neurocomputing 1993;5(6):287–302. [5] OSC. Ohio Supercomputer Center. 〈https://www.osc.edu/〉; 1987. [6] Adeli H, editor. Advances in Design Optimization. London: Chapman and Hall; 1994. [7] Adeli H, Hung SL. An adaptive conjugate gradient learning algorithm for efficient training of neural networks. Appl Math Comput 1994;62(1):81–102. [8] Adeli H, Park HS. A neural dynamics model for structural optimization—theory. Comput Struct 1995;57(3):383–90. [9] Adeli H, Hyo, Park S. Optimization of space structures by neural dynamics. Neural Netw 1995;8:769–81. [10] Ahmadlou M, Adeli H. Enhanced probabilistic neural network with local decision circles: a robust classifier. Integr Comput-Aided Eng 2010;17(3):197–210. [11] Alcik H, Ozel O, Wu Y, Ozel NM, Erdik M. An alternative approach for the Istanbul earthquake early warning system. Soil Dyn Earthq Eng 2011;31(2):181–7. [12] Aldwaik M, Adeli H. Advances in optimization of highrise building structures. Struct Multidiscip Optim 2014;50(6):899–919. [13] Aldwaik M, Adeli H. Cost optimization of reinforced concrete flat slabs of arbitrary configuration in irregular highrise building Structures. Struct Multidiscip Optim 2016;54(1):151–64. http://dx.doi.org/10.1007/s00158-016-1483-5. [14] Alhasan A, White DJ, De Brabanter K. Wavelet filter design for pavement roughness analysis. Comput-Aided Civil Infrastruct Eng 2016;31(12):907–20. [15] Amezquita-Sanchez JP, Adeli A, Adeli H. A new methodology for automated diagnosis of mild cognitive impairment (MCI) using magnetoencephalography (MEG). Behav Brain Res 2016;305:174–80. [16] Amezquita-Sanchez JP, Adeli H. Signal processing techniques for vibration-based health monitoring of smart structures. Arch Comput Methods Eng 2016;23(1):1–15. [17] Asim KM, Mart/'inez-/'Alvarez F, Basit A, Iqbal T. Earthquake magnitude prediction

426

Soil Dynamics and Earthquake Engineering 100 (2017) 417–427

M.H. Rafiei, H. Adeli in Hindukush region using machine learning techniques. Nat Hazards 2016:1–16. [18] Böse M, Wenzel F, Erdik M. PreSEIS: a neural network-based approach to earthquake early warning for finite faults. Bull Seismol Soc Am 2008;98(1):366–82. [19] Bougoudis I, Demertzis K, Iliadis L. Fast and low cost prediction of extreme air pollution values with hybrid unsupervised learning. Integr Comput-Aided Eng 2016;23(2):115–27. [20] Castillo E, Peteiro-Barral D, Berdiñas BG, Fontenla-Romero O. Distributed one-class support vector machine. Int J Neural Syst 2015;25(07):1550029. [21] Cheng MH, Wu S, Heaton TH, Beck JL. Earthquake early warning application to buildings. Eng Struct 2014;60:155–64. [22] Chou J, Pham A. Smart artificial firefly colony algorithm-based support vector regression for enhanced forecasting in civil engineering. Comput Civil Infrastruct Eng 2015;30(9):715–32. [23] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20(3):273–97. [24] Dai H, Wang W, Zhang H. A multiwavelet neural network-based response surface method for structural reliability analysis. Comput-Aided Civil Infrastruct Eng 2015;30(2):151–62. [25] FCC. Public Safety and Homeland Security Bureau Seeks Comment on Ways to Facilitate Earthquake-Related Emergency Alerts. 〈https://apps.fcc.gov/edocs_ public/attachmatch/DA-16-380A1.pdf〉; (Aug/15, 2016); 2016. [26] FEMA. Integrated Public Alert & Warning System. 〈http://www.fema.gov/ integrated-public-alert-warning-system〉; (Aug/15, 2016); 2016. [27] Hsu T, Wu R, Chang K. Two novel approaches to reduce false alarm due to nonearthquake events for on-site earthquake early warning system. Comput-Aided Civil Infrastruct Eng 2016;31(7):535–49. [28] Hung SL, Adeli H. Object-oriented backpropagation and its application to structural design. Neurocomputing 1994;6(1):45–55. [29] Martínez-Álvarez F, Reyes J, Morales-Esteban A, Rubio-Escudero C. Determining the best set of seismicity indicators to predict earthquakes. Two case studies: Chile and the Iberian Peninsula. Knowl-Based Syst 2013;50:198–210. [30] Mathwords. Combination Formula. 〈http://www.mathwords.com/c/combination_ formula.htm〉 (September/07, 2016); 2012. [31] Mesejo P, Ibanez Enrique Fernandez-Blanco O, Cedron F, Pazos A, Porto-Pazos AB. Artificial neuron-glia networks learning paradigm based on cooperative coevolution. Int J Neural Syst 2015;25(4):1550012. [19 pages]. [32] Mu HQ, Yuen KV. Ground motion prediction equation development by hetero-

[33]

[34] [35] [36] [37]

[38] [39] [40] [41] [42] [43]

[44]

[45] [46] [47]

427

geneous Bayesian learning. Comput-Aided Civil Infrastruct Eng 2016;31(10):761–76. Ortiz-Garcia A, Munilla J, Gorriz JM, Ramirez J. Ensembles of deep learning architectures for the early diagnosis of Alzheimer’s disease. Int J Neural Syst 2016;26:7. [23 template pages]. Palomo EJ, Lopez-Rubio E. Learning topologies with the growing neural forest. Int J Neural Syst 2016;26(3):1650019. [21 page]. Panakkat A, Adeli H. Neural network models for earthquake magnitude prediction using multiple seismicity indicators. Int J Neural Syst 2007;17(01):13–33. Panakkat A, Adeli H. Recent efforts in earthquake prediction (1990–2007). Nat Hazards Rev 2008;9(2):70–80. Panakkat A, Adeli H. Recurrent neural network for approximate earthquake time and location prediction using multiple seismicity indicators. Comput Civil Infrastruct Eng 2009;24(4):280–92. Paris P, Pedrino EC, Nicoletti M. Automatic learning of image filters using Cartesian genetic programming. Integr Comput-Aided Eng 2015;22(2):135–51. Park H, Adeli H. Distributed neural dynamics algorithms for optimization of large steel structures. J Struct Eng 1997;123(7):880–8. Park HS, Adeli H. A neural dynamics model for structural optimization—application to plastic design of structures. Comput Struct 1995;57(3):391–9. Rafiei MH, Adeli H. A new neural dynamic classification algorithm. IEEE Trans. Neural Netw. Learn. Syst. (in press); 2017. Read TR, Cressie NA. Goodness-of-fit statistics for discrete multivariate data. Springer Science & Business Media; 2012. Reddy R, Nair R. The efficacy of support vector machines (SVM) in robust determination of earthquake early warning magnitudes in central Japan. J Earth Syst Sci J Earth Syst Sci 2013;122(5):1423–34. Sankari Z, Adeli H. Probabilistic neural networks for diagnosis of Alzheimer's disease using conventional and wavelet coherence. J Neurosci Methods 2011(1872–678). (JID - 7905558). Siddique N, Adeli H. Computational intelligence: synergies of fuzzy logic, neural networks and evolutionary computing. John Wiley/ & Sons; 2013. USGS. Earthquake early warning. 〈http://earthquake.usgs.gov/research/ earlywarning/〉; (Aug/15, 2016); 2016. USGS. US EEW development. 〈http://earthquake.usgs.gov/research/earlywarning/ currentstatus.php〉 (Aug/15, 2016); 2016.