Engineering Structures 165 (2018) 120–141
Engineering Structures journal homepage: www.elsevier.com/locate/engstruct
Approaches to the rapid seismic damage prediction of r/c buildings using artificial neural networks
Konstantinos Morfidis a,⁎, Konstantinos Kostinakis b
a Earthquake Planning and Protection Organization (EPPO-ITSAK), Terma Dasylliou, 55535 Thessaloniki, Greece
b Department of Civil Engineering, Aristotle University of Thessaloniki, Aristotle University Campus, 54124 Thessaloniki, Greece
ARTICLE INFO
Keywords: Seismic damage prediction; Artificial neural networks; Pattern recognition; R/C buildings; Seismic vulnerability assessment; Seismic response
ABSTRACT
The present paper investigates the ability of Artificial Neural Networks (ANNs) to reliably predict the seismic damage state of r/c buildings. The problem was formulated both as a function approximation problem and as a pattern recognition problem. In both cases, Multilayer Feedforward Perceptron networks were used. To create the ANNs’ training data set, 30 r/c buildings with different structural characteristics were selected and subjected to 65 actual ground motions by means of Nonlinear Time History Analyses. These analyses yielded the buildings’ damage indices, expressed in terms of the Maximum Interstorey Drift Ratio. The influence of several configuration parameters of the ANNs on the reliability of the predictions was also investigated. In order to investigate the generalization ability of the trained networks, three scenarios were considered. In these scenarios, the ANNs’ seismic damage state predictions were evaluated for buildings and/or earthquakes that are not included in the training data set. The most significant conclusion of the investigation is that ANNs can reliably approximate the seismic damage state of r/c buildings in real time after an earthquake.
1. Introduction
The seismic vulnerability assessment of existing reinforced concrete (r/c) buildings is one of the most significant problems of earthquake engineering. For this reason, it is the subject of continuous research globally. This extended research has produced methods for the assessment of the seismic vulnerability of existing buildings, as well as for the estimation of their seismic damage state due to future earthquakes. The available methods used for the solution of these two problems can be classified into two general categories: (a) methods that can estimate the seismic performance of individual buildings and (b) methods that can rapidly assess the seismic vulnerability of groups of buildings with common structural characteristics.
The methods of the first category concern linear and nonlinear analytical procedures suitable for individual buildings for which preliminary investigations confirm that a detailed evaluation of their seismic vulnerability and/or their pre-seismic strengthening or post-seismic retrofitting is required. Due to their inherent complexity, these methods are time consuming but absolutely necessary for buildings considered to be seismically vulnerable (e.g. buildings which have suffered seismic damage or old buildings designed without the provisions of seismic codes) or for buildings considered to be important (e.g. schools, hospitals, fire stations, etc.). The methods of the first category (which are mainly based on the Finite Element Method (FEM)) have been adopted and described in the modern seismic codes (e.g. [1,2]).
However, the fact that a large percentage of the existing r/c buildings located in high seismicity regions are old and/or have been designed on the basis of old and inadequate seismic codes (or without the provisions of any seismic code) led to the development of methods of the second category. These methods are based on procedures that can accomplish a rapid and approximate assessment of the seismic vulnerability of large groups of buildings with common structural characteristics (e.g. the seismic vulnerability curves, the damage probability matrices and procedures of rapid visual screening of structures: e.g. [3–6]). Thus, they can be used as decision-making tools, either in pre-seismic periods in order to help engineers decide whether a more detailed vulnerability assessment of individual r/c buildings is necessary (by the use of methods of the first category), or immediately after a strong earthquake in order to help the authorities detect the most damaged zones in the stricken area. The ability of the methods of the second category to extract
⁎ Corresponding author. E-mail addresses: [email protected] (K. Morfidis), [email protected] (K. Kostinakis).
https://doi.org/10.1016/j.engstruct.2018.03.028 Received 16 November 2017; Received in revised form 9 March 2018; Accepted 12 March 2018 0141-0296/ © 2018 Elsevier Ltd. All rights reserved.
approximate results about the seismic vulnerability of numerous buildings in a very short time led to research efforts to improve their reliability. In the context of these efforts, in the past 25 years many research studies have been conducted aiming to utilize the capacities of artificial intelligence, such as the Artificial Neural Networks (ANNs). The inherent ability of ANNs to embed and deploy results of problems which have known input data in order to instantly extract predictions for the solution of the same type of problems with unknown input data (e.g. [7,8]) led to the idea of utilizing them for the approximation of the seismic damage state of existing buildings in real time after an earthquake. An additional reason which led to this idea is the existence of available data for seismic damage of existing buildings caused by several earthquakes globally, as well as the fact that it is feasible to create the respective data using well-documented analytical methods such as, for example, the Nonlinear Time History Analysis (NTHA). Moreover, considering that the prediction of the damage state of buildings is a multi-parameter problem, the use of ANNs for its solution is very promising because their structure is capable of effectively handling such problems. Thus, when using ANNs there is neither a need to consider only one parameter for the quantification of the earthquakes’ intensity (seismic parameter), nor a need to consider a limited number of parameters which describe the seismic response of buildings (structural parameters). The ANNs make it possible to consider any number and combination of seismic and structural parameters for the study of the optimum correlation between them and the damage state of buildings, which is defined with the aid of various expressions of damage indices (e.g. [9,10]).
The first attempt to utilize ANNs as computational tools for the solution of civil engineering problems was made by Adeli and Yen [11], who examined their performance in the design procedure of steel beams. Since then, ANNs have been the subject of numerous research studies in the field of civil engineering problems such as structural health monitoring, damage identification, model updating, optimization of structural design, estimation of the characteristics of soil materials and the response of soil structures. The use of ANNs for the solution of these problems led to very interesting and promising results. A very detailed survey of the research studies about the use of ANNs in civil engineering problems can be found in [12,13]. The utilization of ANNs for the estimation of the seismic damage state of buildings was first studied by Stephens and VanLuchene [14], and Molas and Yamazaki [15]. After them, several research studies focused on the utilization of ANNs in the prediction of seismic damage on the basis of analytical or statistical data (e.g. [16–24]).
In the present paper, the results of a study on the ability of ANNs to rapidly and reliably predict the seismic damage state of r/c buildings are presented. This study was performed by taking into consideration two different formulations of the problem. Firstly, the problem was formulated and solved as a problem of approximation of the values of an unknown function (Function Approximation (FA) problem). More specifically, an attempt was made to approximate the relation between the values of the damage index of r/c buildings and parameters which describe the seismic response of structures (structural parameters), as well as parameters which evaluate the impact of seismic motions on structures (seismic parameters). Secondly, the problem was formulated and solved as a Pattern Recognition (PR) problem. More specifically, the ability of ANNs to correctly classify the r/c buildings in seismic damage categories which are defined by specific values of the damage index was investigated. In both cases, Multilayer Feedforward Perceptron (MFP) networks were utilized.
For the training of the networks, a data set which consists of 1950 input and target vectors was created. This data set was generated by means of NTHA. More specifically, 30 3-D r/c buildings were selected and analyzed by NTHA for 65 pairs of horizontal bidirectional actual ground motions. The selected structural parameters were the total height of buildings, their structural eccentricity and the ratio of base shear received by r/c walls (if they exist) along the two orthogonal construction axes. As seismic parameters, 14 parameters widely used in literature (e.g. [25]) were chosen. The 65 seismic excitations were selected in order to cover a wide range of values of these seismic parameters. The Maximum Interstorey Drift Ratio (MIDR) was utilized as the overall damage index of the selected r/c buildings (e.g. [26,27]). Three training algorithms were used for the training of networks, namely the Levenberg-Marquardt (LM) algorithm, the Scaled Conjugate Gradient (SCG) algorithm and the Resilient Backpropagation (RP) algorithm.
In both formulations of the problem (FA problem or PR problem), the influence of the parameters which are used for the configuration of networks on the reliability of their predictions was investigated. These parameters are the number of hidden layers, the number of neurons in the hidden layers, as well as the neurons’ activation functions. This investigation led to the optimum configured networks on the basis of the optimization of the utilized performance evaluation parameters (the correlation factor R and the Mean Square Error (MSE) in the case of the FA problem, and the percentage of correct classifications of buildings in the seismic damage state categories in the case of the PR problem). The generalization ability of the optimum configured networks (i.e. their ability to extract reliable predictions for r/c buildings and earthquakes that are unknown to them) was examined by means of three seismic scenarios. In these scenarios, r/c buildings and/or earthquakes which were not utilized in the creation of the training data set were used.
2. The Artificial Neural Networks (ANN)
It is well-known that the ANNs are complex computational structures which are able to solve problems using the general rules of the human brain functions (e.g. memory, training, etc.). Thus, the use of ANNs makes it feasible to approximate the solution of problems such as pattern recognition, classification and function approximation, with the aid of computers utilizing algorithms based on a different philosophy than the conventional ones. Various types of ANNs have been proposed (e.g. Radial Basis Function Networks [8], Counterpropagation Networks [28], Gradient-Based Networks [29]). However, in the present paper, Multilayer Feedforward Perceptron (MFP) networks were utilized.
Fig. 1(a) presents the model of a typical artificial neuron. The neuron receives the input signals (x1, x2, … , xm) through its synapses (connecting links) and transforms them to an output signal (yk) in two steps. First, an adder sums the products of the input signals and the respective synaptic weights (wk1, wk2, … , wkm) of the neuron’s synapses, together with the bias, which gives the value uk. Then, an activation function takes uk as its argument and transforms it to the output signal yk. More details are available in specialized references (e.g. [7,8]). The function of ANNs is based on the combined action of interconnected artificial neurons (Fig. 1(b)).
Since the function of ANNs is based on the general rules of the human brain functions such as memory and training, the necessary procedure for the successful solution of problems by them is the training. The training of an ANN consists of the detection of the values of the synaptic weights of neurons (vector w) which produce the minimum output error. This detection is achieved through the use of training algorithms (e.g. [8]). These algorithms require a set of n input vectors x and the corresponding n output vectors d that are called target vectors. The n pairs of vectors x and vectors d constitute the training data set. A trained ANN includes the optimum vector of synaptic weights which incorporate the “knowledge” acquired from the used training data set. Thus, a trained ANN is capable of extracting predictions about the solution of problems with input data that are not included in the training data set (generalization ability). The generalization ability can be constantly improved through the re-training of ANNs (i.e. the re-calculation of the values of the synaptic weights) using wider training data sets.
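The neuron model of Fig. 1(a) and the layer-by-layer signal flow of an MFP (Fig. 1(b)) can be illustrated with a minimal Python sketch. All function names are hypothetical, and the weights in the example are arbitrary illustrative numbers, not values from the study. The hyperbolic tangent and logistic functions are used here only as common examples of activation functions.

```python
import math

def tansig(u):
    """Hyperbolic tangent activation; output lies in (-1, 1)."""
    return math.tanh(u)

def logsig(u):
    """Logistic sigmoid activation; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def linear(u):
    """Identity activation; output can take any real value."""
    return u

def neuron_output(x, w, bias, activation):
    """Adder followed by activation: y_k = phi(sum_j w_kj * x_j + b_k)."""
    u = sum(w_j * x_j for w_j, x_j in zip(w, x)) + bias
    return activation(u)

def layer_output(x, weights, biases, activation):
    """Every neuron of a layer receives the full input vector x."""
    return [neuron_output(x, w, b, activation) for w, b in zip(weights, biases)]

def mfp_forward(x, layers):
    """Propagate x through a list of (weights, biases, activation) layers."""
    signal = x
    for weights, biases, activation in layers:
        signal = layer_output(signal, weights, biases, activation)
    return signal

# Illustrative (assumed) network: 2 inputs, 2 hidden neurons, 1 linear output.
hidden = ([[0.5, -0.25], [0.1, 0.2]], [0.0, 0.1], tansig)
output = ([[1.0, -1.0]], [0.0], linear)
y = mfp_forward([1.0, 2.0], [hidden, output])
```

Training, which is not shown here, would adjust the numbers stored in `weights` and `biases` so that the output error over the training data set is minimized.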
Fig. 1. The model of the artificial neuron (a), and the typical form of a MFP (b).
3. Modeling of the problem using ANNs
The subject of the current section is the presentation of the procedure which was used for the formulation of the problem of the prediction of the seismic damage state of r/c buildings in terms compatible with the structure of ANNs. Thus, the choices which were made in the present paper for the configuration parameters of ANNs in the case of the formulation of the FA problem, as well as in the case of the formulation of the PR problem, are presented. Fig. 2 briefly presents the steps of the two versions of the formulation of the problem, as well as the choices for the basic parameters. These steps are described in detail in the following subsections.
3.1. Selection of the type of ANNs
In the present study, as was mentioned in Section 2, MFP networks were utilized. In networks of this type, the neurons in any layer (input-hidden(s)-output) are connected to all neurons in the adjacent layer (Fig. 1(b)). This choice was made because it has been proved that this type of network is able to successfully approximate the solution of the FA problem (e.g. [30]), as well as the solution of the PR problem (e.g. [31]). This choice was also based on the fact that this type of ANN, as mentioned in the introduction, was successfully used in many published investigations which are related to the scientific field of the present study (e.g. [16,18,19,21,23]).
3.2. Selection of input parameters
The ANNs are computational structures which are capable of approximating the solution of multi-parametric problems. This feature gives the flexibility to select the number of the parameters (input parameters) through which a problem can be formulated. The parameters which describe the problem of the seismic damage prediction of r/c buildings are grouped in two general categories: the structural parameters and the ground motion parameters (seismic parameters). The structural parameters are used for the description of the seismic response (performance) of buildings. The most significant ones are the total height, the configuration (in plan and in elevation), the structural system (e.g. frame, wall or dual system), the structural eccentricity, the concrete and the reinforcing steel grade, the dimensions and the
Fig. 2. Procedure for the formulation of the problem in terms compatible to the structure of ANNs.
reinforcement of structural members, the foundation system and the soil category. In the present study, 4 structural parameters were selected. These parameters are widely utilized in well-known methods of the vulnerability assessment of existing r/c buildings (e.g. [3,5]) and they have also been identified by the modern seismic codes as the parameters which have a significant effect on the seismic response of r/c buildings (e.g. [2]). These parameters are the total height of the building Htot (= 3.2 ⋅ nst, where nst is the number of storeys), the structural eccentricity e0 (the distance between the mass center and the stiffness center of storeys), and the ratios of the base shear that is received by r/c walls (if they exist) along two perpendicular directions (axes x and y): nvx and nvy (for the calculation procedures of these parameters see e.g. [32]).
The seismic parameters are used for the evaluation of the impact of seismic motions on structures. The inherent difficulty of a reliable description of this impact led to the introduction of several expressions for the seismic parameters (e.g. [25]). This raises the question of which seismic parameter is best related to the seismic response of structures (e.g. [33,34]). Nevertheless, the capability of ANNs to model multi-parametric problems taking into consideration many input parameters makes it possible to overcome this problem. Thus, for the investigation conducted in the present study, the 14 seismic parameters given in Table 1 have been chosen in order to evaluate the effect of earthquakes on the structural damage state. The impact of the 14 seismic parameters given in Table 1 on the accuracy level of the predictions of ANNs was investigated by the authors [36]. This investigation regarded only the FA problem and led to the conclusion that, if more than 6 seismic parameters are used as input parameters, the accuracy of the predictions of the MFP networks is especially high. Considering that there is no such investigation for the case of the PR problem, it was decided in the present paper to use all 14 seismic parameters as input parameters for the networks. Thus, in the present study, 18 input parameters (4 structural and 14 seismic) were utilized. Therefore, the input vectors x of the ANNs (18 × 1) have the general form given by Eq. (1):

\[
\begin{aligned}
\mathbf{x} &= [\,\mathbf{x}_{seism} \mid \mathbf{x}_{struct}\,]^{T} \\
\mathbf{x}_{seism} &= [\,\mathrm{PGA} \mid \mathrm{PGV} \mid \mathrm{PGD} \mid I_a \mid \mathrm{SED} \mid \mathrm{CAV} \mid \mathrm{ASI} \mid \mathrm{HI} \mid \mathrm{EPA} \mid \mathrm{PGV}/\mathrm{PGA} \mid \mathrm{PP} \mid \mathrm{TUD} \mid \mathrm{TBD} \mid \mathrm{TSD}\,]^{T} \\
\mathbf{x}_{struct} &= [\,H_{tot} \mid e_0 \mid n_{vx} \mid n_{vy}\,]^{T}
\end{aligned}
\tag{1}
\]

3.3. Selection of the problem’s formulation – output parameters
The result of the solution of the problem which is examined in the present paper is the estimation of the seismic damage state of r/c buildings. This estimation is generally accomplished through the use of seismic damage indices (e.g. [9]). These indices are defined for the quantification of the seismic damage state, which is essential in order to formulate the problem of seismic vulnerability assessment. However, the selection of a seismic damage index which can adequately capture the overall seismic damage state of buildings is very difficult, since it depends on a large number of factors. Besides, in the past decades, many expressions for the seismic damage index have been proposed. These indices are classified into categories, based on whether they are deterministic or probabilistic, local or global, structural or financial (e.g. [10]).
In the present study, the seismic damage index was expressed by means of the Maximum Interstorey Drift Ratio (MIDR). The MIDR is a global, structural and deterministic seismic damage index which is generally considered a reliable indicator of global structural and nonstructural damage of r/c buildings (e.g. [27]), and has been used in many investigations for the assessment of the inelastic response of structures (e.g. [37]). It corresponds to the maximum drift among the perimeter frames over all storeys.
According to Fig. 2, the next step of the procedure after the selection of the damage index entails the choice of the formulation of the problem. This choice is necessary in order to define the shape of the output vectors. As regards the FA problem, the formulation is based on the approximation of the values of an unknown function f(x) for which a set of known pairs (x, f(x)) is available. More specifically, if x_i ∈ R^m (i = 1, … , N) are N vectors x = [x1, … , xm]^T and d_i ∈ R^1 are N real numbers, the solution of the FA problem leads to a function f(x): R^m → R^1 which approximately fulfills the conditions f(x_i) = d_i for the N pairs (x_i, d_i). Obviously, there is an analogy between this formulation and the basic principles of ANNs which were presented in Section 2. For this reason, the use of ANNs (e.g. the MFPs) is one of the effective methods for the solution of the FA problem [30]. The accomplished solution is generally approximate and its reliability is evaluated by the measurement of the error e (i.e. the vector which contains the differences between the ANN’s outputs o_i and the target values d_i). Therefore, in the case of the formulation of the FA problem, the output of the ANNs must be a real number, i.e. in the case of the present study, the value of the seismic damage index MIDR (o = f(x) = MIDR).
Pattern recognition is defined as the procedure of the detection and identification/classification of objects in certain categories (patterns). Among the methods which are used to approach the solution of the PR problem are methods which are based on ANNs [31,38,39]. In the present study, the investigated problem was formulated and solved using the supervised pattern recognition method. More particularly, the classes (damage states) of the objects (vectors x, Eq. (1)) were initially defined (Table 2). Additionally, a set of classified objects was available (i.e. a training data set with target vectors compatible with the formulation of the PR problem). The classes into which the input vectors x (Eq. (1)) can be classified were defined on the basis of seismic damage states of r/c buildings. To this end, in the present study five damage states were defined using specific limit values of MIDR. These damage states (valid for r/c buildings) are presented in Table 2 [40]. Thus, the corresponding output vectors o, as well as the target vectors d, must have dimensions (5 × 1). In other words, the networks have five outputs in this case. The general form of the output vectors is given (using an example) in Fig. 3. As emerges from this figure, each element of an output vector o (or of a target vector d) represents one of the five classes (damage states) of Table 2 and attains a value equal to 1 if the corresponding MIDR belongs to the interval of values which define the specific damage state. Otherwise, it attains a value equal to 0.
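The assembly of the 18 × 1 input vector of Eq. (1) can be sketched as follows. This is only an illustration: the dictionary keys, the function names and all numerical values in the example are assumptions made for this sketch, not data from the study. The storey height of 3.2 m follows from the relation Htot = 3.2 ⋅ nst given in Section 3.2.

```python
# Order of the 14 seismic and 4 structural parameters as they appear in Eq. (1).
SEISMIC_KEYS = ["PGA", "PGV", "PGD", "Ia", "SED", "CAV", "ASI",
                "HI", "EPA", "PGV/PGA", "PP", "TUD", "TBD", "TSD"]
STRUCT_KEYS = ["Htot", "e0", "nvx", "nvy"]

STOREY_HEIGHT_M = 3.2  # from Htot = 3.2 * nst

def total_height(n_storeys):
    """Total building height Htot (m) for nst storeys."""
    return STOREY_HEIGHT_M * n_storeys

def build_input_vector(seismic, structural):
    """Concatenate x_seism and x_struct into the 18-element input vector x."""
    x_seism = [seismic[key] for key in SEISMIC_KEYS]
    x_struct = [structural[key] for key in STRUCT_KEYS]
    return x_seism + x_struct

# Hypothetical record and building (placeholder numbers, for illustration only).
record = {key: 1.0 for key in SEISMIC_KEYS}
record["PGA"] = 0.30
building = {"Htot": total_height(5), "e0": 1.2, "nvx": 0.6, "nvy": 0.4}

x = build_input_vector(record, building)

# In the FA formulation the matching target is a single real number (the MIDR).
d_fa = 0.85  # assumed MIDR in %, illustrative only
```

Keeping the parameter order in one place (`SEISMIC_KEYS`, `STRUCT_KEYS`) is a design choice of this sketch: every training and prediction vector is then guaranteed to present the parameters to the network in the same positions.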
3.4. Training data set generation According to Fig. 2, the next step of the procedure for the formulation of the problem is the generation of the training data set, i.e. the generation of a set which consists of input and target vectors for the training of ANNs. The steps of this procedure (Fig. 4) will be described in detail in the current section.
Table 1 The selected seismic (ground motion) parameters [25,35].
No.   Parameter
1     Peak Ground Acceleration: PGA
2     Peak Ground Velocity: PGV
3     Peak Ground Displacement: PGD
4     Effective Peak Acceleration: EPA
5     Specific Energy Density: SED
6     Acceleration Spectrum Intensity: ASI
7     Cumulative Absolute Velocity: CAV
8     Housner Intensity: HI
9     Arias Intensity: Ia
10    Vmax/Amax (PGV/PGA)
11    Predominant Period: PP
12    Uniform Duration: UD
13    Bracketed Duration: BD
14    Significant Duration: SD
Table 2 Relation between MIDR and damage state.
MIDR (%)            < 0.25    0.25–0.50    0.50–1.00    1.00–1.50    > 1.50
Degree of damage    Null      Slight       Moderate     Heavy        Destruction
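The mapping from a MIDR value to a damage state of Table 2, and from there to a 5 × 1 target vector of the PR formulation, can be sketched as below. The function names are hypothetical. Table 2 does not state to which class a value lying exactly on a limit belongs, so this sketch assumes that such a value is assigned to the higher damage state.

```python
DAMAGE_STATES = ["Null", "Slight", "Moderate", "Heavy", "Destruction"]

# Limits between consecutive damage states, MIDR in %, taken from Table 2.
MIDR_LIMITS = [0.25, 0.50, 1.00, 1.50]

def damage_state_index(midr_percent):
    """Return 0..4 for Null..Destruction.

    Assumption of this sketch: a MIDR exactly on a limit falls in the
    higher damage state (e.g. 0.25 % is classified as Slight).
    """
    for index, limit in enumerate(MIDR_LIMITS):
        if midr_percent < limit:
            return index
    return len(MIDR_LIMITS)

def pr_target_vector(midr_percent):
    """5-element target vector d: 1 for the matching damage state, else 0."""
    state = damage_state_index(midr_percent)
    return [1 if i == state else 0 for i in range(len(DAMAGE_STATES))]

# Example with an assumed MIDR of 0.75 %, which lies in the Moderate range.
d_pr = pr_target_vector(0.75)
label = DAMAGE_STATES[damage_state_index(0.75)]
```

The same MIDR value therefore yields either a scalar target (FA formulation) or an indicator vector such as `d_pr` (PR formulation).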
Fig. 3. General form of output vectors o in the case of the formulation of the PR problem.
3.4.1. Selection of 30 r/c buildings
The selected r/c buildings differ in the total height Htot (different number of storeys nst), the structural eccentricity e0 (the distance between the mass center and the stiffness center of storeys), as well as the ratios of the base shear nvx and nvy that are received by r/c walls (if they exist) along two perpendicular structural axes (axes x and y). The values of the above structural parameters for the selected buildings (as well as their design parameters) are presented in Appendix A. All buildings are rectangular in plan (with dimensions Lx × Ly) and regular in elevation (according to the criteria set by EC8 [2]), and were chosen so as to represent typical (and actual) r/c buildings.
Table 3 Ranges of the values of the selected seismic parameters corresponding to the 65 earthquakes.
Ground motion parameter    Units      Minimum value    Maximum value
PGA                        %g         0.004            0.822
PGV                        cm/s       0.86             99.35
PGD                        cm         0.36             60.19
Ia                         m/s        ≈0.0             5.592
SED                        cm2/sec    1.24             16762.8
CAV                        cm/s       14.67            2684.1
ASI                        g·sec      0.003            0.633
HI                         cm         3.94             317.6
EPA                        %g         0.003            0.63
Vmax/Amax (PGV/PGA)        sec        0.036            0.336
PP                         sec        0.077            1.26
TUD                        sec        ≈0.0             17.68
TBD                        sec        ≈0.0             61.87
TSD                        sec        1.74             50.98
3.4.2. Selection of ground motions
A suite of 65 pairs of horizontal earthquake records obtained from the European strong motion database [41] and the PEER database [42] was selected. The main criterion used for the selection of these records was the coverage of a wide range of values for the 14 seismic parameters considered in the present study (see Tables 1 and 3). The seismic parameters for each ground motion were computed as the geometric mean values of the parameters corresponding to the two horizontal components of each earthquake record. The data of the selected earthquake records are given in Appendix B.
3.4.3. Modeling of linear behavior, analysis and design of the selected r/c buildings (see also Appendix A)
The selected r/c buildings were modeled, analyzed and designed utilizing the provisions of EC2 [43] and EC8 [2]. More specifically:
• The elastic modeling was carried out taking into consideration all basic recommendations of EC8.
• The buildings were considered to be fully fixed to the ground.
• The infill walls were considered only as vertical loads and not as seismic resistant structural elements.
• The buildings were designed as Medium Ductility Class (MDC) structures [2].
• The behavior factors q were determined according to the recommendations of EC8 [2].
• The buildings were analyzed using the modal response spectrum method.
• The structural materials were: steel S500B and concrete C20/25.
• For the design of r/c members, the load combinations 1.35G+1.50Q and G+0.3Q ± E were taken into consideration (G is the dead load, Q is the live load, and E is the seismic load expressed by the simultaneous application of the design spectrum of EC8 for seismic zone II and site class C along the axes x and y).
3.4.4. Nonlinear modeling and analysis (NTHA) – Data for the calculation of MIDR
The nonlinear behavior of r/c buildings was modeled by means of lumped plasticity (concentrated hinge) models at the column and beam ends, as well as at the base of the r/c walls. More specifically, the length of the plastic hinges was determined using Eqs. (2a) and (2b) [44]:

\[ l_p = 0.08\,l_0 + 0.022\,d_p f_y \quad \text{for beams and columns} \tag{2a} \]
\[ l_p = 0.2\,l_w + 0.044\,h_w < 0.8\,l_w \quad \text{for walls} \tag{2b} \]

where lp is the length of the plastic hinge, l0 is the distance of the critical section from the point of contraflexure, dp is the mean diameter of the longitudinal reinforcement, fy is the yield stress of the longitudinal reinforcement, lw is the length of the cross-section of the wall and hw is the total wall height. The material inelasticity of the structural members was modeled with the aid of the Modified Takeda hysteresis rule [45] (Fig. 5a). It should also be noted that the effects of the axial load-biaxial bending moments interaction at column and wall hinges were taken into account by using the N-M2-M3 interaction diagram, which is implemented in the software adopted for the application of the analyses [46] (Fig. 5b).
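A short numerical sketch of the plastic hinge length expressions (2a) and (2b) is given below. The units are an assumption of this sketch, since they are not stated in the text: all lengths are taken in metres and fy in MPa. The function names and the member dimensions in the example are hypothetical. The upper bound of Eq. (2b) is applied here by taking the smaller of the two values.

```python
def hinge_length_beam_column(l0, dp, fy):
    """Eq. (2a): lp = 0.08*l0 + 0.022*dp*fy.

    Assumed units: l0 and dp in metres, fy in MPa, result in metres.
    """
    return 0.08 * l0 + 0.022 * dp * fy

def hinge_length_wall(lw, hw):
    """Eq. (2b): lp = 0.2*lw + 0.044*hw, limited to 0.8*lw.

    Assumed units: lw and hw in metres, result in metres.
    """
    return min(0.2 * lw + 0.044 * hw, 0.8 * lw)

# Hypothetical column: contraflexure at 1.5 m, 16 mm bars, fy = 500 MPa.
lp_column = hinge_length_beam_column(l0=1.5, dp=0.016, fy=500.0)

# Hypothetical walls of a 16 m tall building.
lp_long_wall = hinge_length_wall(lw=2.0, hw=16.0)   # bound not governing
lp_short_wall = hinge_length_wall(lw=1.0, hw=16.0)  # limited by 0.8*lw
```

For the short wall the unbounded expression gives a value above 0.8 lw, so the limit controls; for the long wall it does not.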
Fig. 4. Procedure for the design and generation of the training data set.
Fig. 5. Moment (M) - Rotation (θ) relationship (a) and N-M2-M3 interaction diagram (b).
(b) Number of neurons in the hidden layers: The number of neurons of hidden layers which leads to the optimum performance of ANNs is not uniquely defined. It depends on the nature and on the formulation of the studied problem. It must also be stressed that there is no direct method for its determination. Thus, only the “trial and error” method can be adopted for both cases of the problem’s formulation (Section 4). (c) Activation functions of neurons: In the present study, two different types of nonlinear activation functions for the neurons of hidden layers were used (Fig. 1(a)): the hyperbolic tangent function (tansig) and the sigmoid function (logistic-logsig). This choice was made for the FA problem as well as for the PR approach. As regards the neurons of the output layer, the choice of the activation function was different for the two problems. In the case of the FA problem, the linear function was selected. This choice was based on the fact that, in this case, the output (MIDR values) attains any real value and not a value between [0, 1] or [−1, 1]. In the case of the PR problem, the nonlinear functions tansig and logsig were selected because the elements of the output vectors attain values 0 or 1 (see Fig. 3). The choice of using two activation functions (instead of using a single one) was made in order to investigate the optimum efficiency of the ANNs in the solution of the PR problem (more details are given in Section 4). (d) Performance evaluation parameters: In the case of the solution of the FA problem, the Mean Square Error (MSE), as well as the correlation coefficient (R factor), (e.g. [20]) were adopted. In the case of the solution of the PR problem, the most useful tools for the evaluation of ANNs are the Confusion Matrices - CM (e.g. [38,47]). The general form of a CM (for a three-class problem) is presented in Fig. 6. 
On the basis of CMs, three types of metrics for the prediction accuracy of ANNs are defined, namely the “Recall” index, the “Precision” index and the “Overall Accuracy” index (Fig. 6). In the present study, the “Overall Accuracy” or (OA) index was mainly used. However, for the evaluation of the several configurations of the examined ANNs, the corresponding whole CMs are also presented and evaluated (Section 4). As emerges from Fig. 6, the elements of CMs which are located in the main diagonal (i.e. the elements CFii) define the number of input vectors which are classified by the network to correct classes. Furthermore, valuable information about the quality of the predictions of ANNs is also given by the configuration of CMs. More specifically, when the vast majority of the non-zero elements of a CM are located about the main diagonal, this means that the ANN achieves an acceptable classification. For example, if all the non-zero elements are located in the cells of the main diagonal and in the adjacent cells, this means that the objects are classified into correct classes and into classes which are adjacent to them (i.e. if
After the linear and the nonlinear modeling, the 30 selected r/c buildings were analyzed by NTHA for each one of the 65 earthquake ground motion pairs. The design vertical (gravity) loads were also taken into consideration in these analyses. Thus, a total of 1950 NTHA (30 buildings × 65 ground motion records) were performed. For each one of the 1950 analyses, the data required for the MIDR calculation were exported.
3.4.5. Post-processing of the results of the NTHA – Calculation of the MIDR
The last step of the procedure for the training data set generation (Fig. 4) concerns the post-processing of the results of NTHA in order to calculate the MIDR values of the analyzed r/c buildings. To this end, a computer code in Visual Basic was developed. Thus, following the procedure described above, 1950 training vectors x, which are given in Eq. (1), were created. Also, the corresponding 1950 target vectors d were formed. The form of these vectors depends on the formulation of the problem. In particular: (i) In the case of the FA problem, the target vectors d are in fact scalar values (the values of MIDR). (ii) In the case of the PR problem, the target vectors d have the form which is presented in Fig. 3.
3.5. Selection of training algorithms – Configuration of ANNs
The last two steps of the procedure for the formulation of the investigated problem (Fig. 2) are the selection of training algorithms and the configuration of the utilized networks. More specifically, these steps concern the choice of the parameters which are required for the configuration of the ANNs. These parameters are: (a) the number of hidden layers; (b) the number of neurons in each hidden layer; (c) the activation functions of neurons; (d) the performance evaluation parameters; (e) the normalization functions of the input and output values and (f) the method for partitioning the data set into training, validation and testing subsets.
In addition to the above parameters, a “parameter” of great significance (which also influences the performance of ANNs) is the selected training algorithm. (a) Number of hidden layers: In the case of the solution of the FA problem, networks with a single hidden layer were chosen. This choice was based on the fact that, as has been proved [30], feedforward perceptron networks with a single hidden layer are able to accurately approximate functions f(x): R^m → R^1, as well as on the fact that their efficiency has been well documented in numerous relevant investigations (e.g. [18,20,22]). In the case of the solution of the PR problem, networks with one or two hidden layers were chosen in order to study the influence of the second hidden layer on the percentage of correct classifications of input vectors x in the damage state categories of Table 2.
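As a minimal sketch of the kind of network described in item (a), the following Python code evaluates a feedforward network with one hidden layer and a linear output neuron, i.e. a mapping f(x): R^m → R^1. The function names, the weights and the input values are hypothetical and serve only as an illustration; the study itself relied on the Matlab toolbox [47].

```python
import math


def logsig(x):
    """Logistic sigmoid: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig(x):
    """Hyperbolic tangent sigmoid: maps any real number into (-1, 1)."""
    return math.tanh(x)


def forward(x, w_hidden, b_hidden, w_out, b_out, hidden_act=logsig):
    """Output of a network with one hidden layer and one linear output neuron.

    x        : input vector (m values)
    w_hidden : one weight row (m values) per hidden neuron
    b_hidden : one bias per hidden neuron
    w_out    : one output weight per hidden neuron
    b_out    : bias of the linear output neuron
    """
    hidden = [
        hidden_act(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear output: the result is not confined to [0, 1] or [-1, 1].
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical example: m = 3 inputs, 2 hidden neurons.
x = [0.3, -0.8, 0.5]
w_hidden = [[0.4, -0.2, 0.1], [-0.7, 0.5, 0.9]]
b_hidden = [0.05, -0.10]
w_out = [1.5, -0.6]
b_out = 0.2
estimate = forward(x, w_hidden, b_hidden, w_out, b_out, hidden_act=logsig)
```

Passing `hidden_act=tansig` switches the hidden layer to the hyperbolic tangent function, which corresponds to the second class of networks examined in the parametric investigation.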
Fig. 6. General form of a CM and related metrics for a three-class problem.
the Resilient Backpropagation algorithm (“RP” algorithm, [52]). In the case of the FA problem, the ANNs were trained using the LM and SCG algorithms whereas, in the case of the PR problem, the RP and SCG algorithms were utilized. The LM algorithm is a “Quasi-Newton” algorithm (variation of Newton's method) and belongs to the general category of Back-Propagation algorithms. It is a fast and efficient algorithm and it is suggested for the solution of FA problems. The SCG algorithm belongs to the class of Conjugate Gradient algorithms. Significant features of this algorithm are its speed and its ability to handle large-scale problems effectively. The RP algorithm is a variation of the methods which are based on a variable learning rate and it is recommended for the quick and reliable solution of pattern recognition problems (e.g. [53]).
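To illustrate what a variable step size means in resilient backpropagation, the sketch below implements one update of a common textbook variant of the method (without weight backtracking). It is an assumption-laden illustration and not the toolbox routine used in the study: the function name, the default factors (1.2 and 0.5, values commonly quoted in the literature) and the one-dimensional test function are all chosen here for demonstration.

```python
def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One resilient-backpropagation update (variant without backtracking).

    Only the sign of each partial derivative is used. The step of a weight
    grows while the sign of its derivative stays the same and shrinks when
    the sign flips. Returns (new_weights, grads_to_remember, new_steps).
    """
    new_weights, kept_grads, new_steps = [], [], []
    for w, g, g_prev, step in zip(weights, grads, prev_grads, steps):
        if g * g_prev > 0:        # same sign as before: enlarge the step
            step = min(step * eta_plus, step_max)
        elif g * g_prev < 0:      # sign flipped: a minimum was overshot
            step = max(step * eta_minus, step_min)
            g = 0.0               # no update for this weight in this round
        if g > 0:
            w -= step
        elif g < 0:
            w += step
        new_weights.append(w)
        kept_grads.append(g)
        new_steps.append(step)
    return new_weights, kept_grads, new_steps


# Hypothetical demonstration: minimize f(w) = (w - 3)^2, starting from w = 0.
weights, prev_grads, steps = [0.0], [0.0], [0.1]
for _ in range(100):
    grads = [2.0 * (weights[0] - 3.0)]  # derivative of (w - 3)^2
    weights, prev_grads, steps = rprop_step(weights, grads, prev_grads, steps)
```

Because the magnitude of the gradient never enters the update, the method is insensitive to very small or very large derivative values, which is one reason it is often suggested for networks with saturating (sigmoid-type) output neurons.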
the correct class for an object is the class i, the ANN classifies this object in the class i − 1 or in the class i + 1). Therefore, all predictions/classifications of the expected damage states which are made by the ANN in this case are either correct or close to the correct ones.
(e) Normalization functions for the elements of the input and output vectors: Normalizing the values of the elements of the input vectors x before these vectors are introduced to the ANNs is considered necessary in order to optimize the training (e.g. [47,48]). The same transformation is also required for the elements of the target vectors d. The output vectors generated by the network must then be passed through the reverse transformation in order to attain their final values. In the present study, a function through which the elements of the input and target vectors of the data set attain values in the range [−1, 1] was selected [47].
(f) Method for partitioning the data set: The partition of the data set into three sub-sets, namely the training, the validation and the testing sub-set, is recommended in order to ensure good generalization of the networks and to avoid overfitting (e.g. [47,49]). In the present study, the data set was partitioned into training, validation and testing sub-sets using the ratio 70%/15%/15% respectively. The training and target vectors of the three sub-sets were chosen randomly. It is important to note that the validation sub-set is used internally by the training algorithm to check the criteria for the termination of the training, which concern the avoidance of overfitting [47]. Thus, the results arising from the use of the validation sub-set do not provide information that can lead to firm conclusions about the performance of ANNs. Therefore, in the following tables of the present research work, the results regarding the validation sub-set are omitted.
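Items (e) and (f) can be illustrated with the following Python sketch. It assumes a simple linear min-max mapping to [−1, 1] and a seeded random shuffle for the 70%/15%/15% partition; the function names and the numerical values are hypothetical, and the routines of the toolbox [47] may differ in their details.

```python
import random


def to_unit_range(v, vmin, vmax):
    """Map v linearly from [vmin, vmax] to [-1, 1]."""
    return 2.0 * (v - vmin) / (vmax - vmin) - 1.0


def from_unit_range(y, vmin, vmax):
    """Reverse transformation: map y from [-1, 1] back to [vmin, vmax]."""
    return (y + 1.0) * (vmax - vmin) / 2.0 + vmin


def split_indices(n_samples, ratios=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training/validation/testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(ratios[0] * n_samples)
    n_val = round(ratios[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to testing
    return train, val, test


# Hypothetical target values (MIDR, %), normalized and mapped back.
midr = [0.0, 0.8, 1.5, 4.0]
lo, hi = min(midr), max(midr)
normalized = [to_unit_range(v, lo, hi) for v in midr]
restored = [from_unit_range(y, lo, hi) for y in normalized]

# Partition of the 1950 samples of the data set.
train_idx, val_idx, test_idx = split_indices(1950)
```

The minimum and maximum used for the normalization have to be stored, since the same values are needed for the reverse transformation of the network outputs.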
4. Training of the selected ANNs – Parametric investigation of the optimum configurations
In this section, details of the training procedure, as well as the results of the parametric analyses which were conducted for the investigation of the seismic damage predictions of the optimum ANNs, are presented. The aim of the parametric analyses was to investigate the influence of: (a) the type of activation functions of the neurons; (b) the number of hidden layers (only in the case of the solution of the PR problem); (c) the number of neurons in the hidden layers and (d) the training algorithms on the performance of ANNs. In Figs. 7 and 9, the procedures of the parametric investigations for both formulations of the problem are briefly illustrated. It must be noted that the Neural Network Toolbox of Matlab [47] was used for the configuration and the training of the ANNs.
4.1. Parametric investigation in the case of the FA problem
As regards the training algorithms, three different algorithms were adopted: the Levenberg-Marquardt algorithm (“LM” algorithm, [50]), the Scaled Conjugate Gradient algorithm (“SCG” algorithm, [51]) and
In the case of the FA problem solution, networks with one hidden layer (henceforth “N1” networks) were utilized. As emerges from Fig. 7,
Fig. 7. Procedure of the parametric investigation for the optimum performance of ANNs in the solution of the FA problem.
the procedure of the parametric investigation was separated into two parts on the basis of the utilized training algorithm (LM or SCG). In each one of these parts, two classes of networks were configured. The networks of the first class have tansig activation functions for the neurons of the hidden layer, whereas the networks of the second class have logsig functions. In particular, 51 different networks as regards the number of neurons in the hidden layer (between 10 and 60) were configured for each one of the two network classes. Thus, 102 (= 2 × 51) different networks were created in each one of the two parts of the parametric investigation. Each one of these networks was trained 75 times (i.e. in total 102 × 75 = 7650 training procedures were performed in each part). This was done because the performance of ANNs is affected by the initial values of the synaptic weights (e.g. [18]) and also by the random composition of the three sub-sets of the training data set [47]. From the 75 training procedures of each one of the 102 configured networks, the optimum ones were detected, namely the trainings which yielded the optimum values of the utilized performance parameters (MSE or R-factor) on the basis of the testing sub-set, the training sub-set and the total data set. Thus, from the 7650 training procedures of each part, 612 (= 6 × 102) optimum trained networks emerged, one for each configured network and each of the following six criteria (i.e. 306 optimum trained networks for each one of the two network classes):
• min(MSE) for: (a) the testing sub-set, (b) the training sub-set and (c) the total data set,
• max(R) for: (a) the testing sub-set, (b) the training sub-set and (c) the total data set.

Then, from the 306 optimum trained networks of each class, the networks which yielded the best predictions on the basis of the aforementioned six criteria (i.e. minimum values of the MSE and maximum values of the R-factor) were extracted. More specifically, the optimum number of neurons in the hidden layer, i.e. the number which leads to the best predictions of the MIDR values under each one of the six adopted criteria, was detected. Thus, the 12 best trained networks from each one of the two parts of the investigation were extracted (i.e. 12 optimum configured networks trained using the LM algorithm and 12 optimum configured networks trained using the SCG algorithm). The results of the parametric investigation described above are summarized in Tables 4a and 4b. The basic conclusion which arises from Table 4a is that the LM training algorithm is more efficient than the SCG algorithm for every one of the utilized performance criteria. Furthermore, when the LM algorithm is used, the networks with the logsig function yield slightly better results than the networks with the tansig function. As regards the optimum number of neurons in the hidden layer (Table 4b), it generally depends on the utilized performance criterion (i.e. the performance parameter and the part of the data set for which this parameter is calculated). This conclusion does not fully apply in the case of the most efficient network (i.e. the network in which the logsig function is adopted and which is trained using the LM algorithm). In this case, the optimum number of neurons does not depend on the adopted performance parameter (MSE or R) but only on the part of the data set for which this parameter is calculated: it is equal to 18 when the performance parameter is calculated for the samples of the testing sub-set, to 60 when the training sub-set is used, and to 32 when the calculation is based on the total data set.

Table 4a. Optimum values of the performance parameters of the FA problem (columns: training algorithm/activation function of the hidden layer’s neurons).
Performance criterion | Data set | LM/logsig | LM/tansig | SCG/logsig | SCG/tansig
min(MSE) | Testing subset | 0.045 | 0.052 | 0.078 | 0.071
min(MSE) | Training subset | 0.010 | 0.010 | 0.077 | 0.065
min(MSE) | Total data set | 0.034 | 0.038 | 0.095 | 0.082
max(R) | Testing subset | 0.975 | 0.972 | 0.958 | 0.958
max(R) | Training subset | 0.995 | 0.995 | 0.958 | 0.967
max(R) | Total data set | 0.983 | 0.981 | 0.951 | 0.958

Table 4b. Optimum number of the neurons of the hidden layer, i.e. the number which leads to the optimum values of MSE and R-factor (columns: training algorithm/activation function of the hidden layer’s neurons).
Performance criterion | Data set | LM/logsig | LM/tansig | SCG/logsig | SCG/tansig
min(MSE) | Testing subset | 18 | 16 | 28 | 34
min(MSE) | Training subset | 60 | 54 | 14 | 54
min(MSE) | Total data set | 32 | 30 | 14 | 46
max(R) | Testing subset | 18 | 54 | 16 | 52
max(R) | Training subset | 60 | 54 | 14 | 54
max(R) | Total data set | 32 | 54 | 14 | 46

Fig. 8 illustrates the predictive ability of the optimum networks when the criterion min(MSE) for the testing sub-set is adopted. More specifically, the diagrams of this figure concern the four optimum networks (i.e. the networks with the optimum number of neurons in the hidden layer) which correspond to the four examined combinations of training algorithms and activation functions of the neurons of the hidden layer (Tables 4a and 4b). In these diagrams, the MIDR values which were calculated using NTHA (MIDR_NTHA) are plotted against the MIDR values predicted by the optimum networks (MIDR_ANN) for all samples of the total data set. The main conclusion that can be drawn from Fig. 8 is that the network which has 18 neurons with the logsig activation function in the hidden layer and was trained using the LM algorithm (henceforth “N1-LM-log/lin-18” network) yields the best predictions of the expected MIDR values (Fig. 8a). In particular, the MIDR_ANN values of the “N1-LM-log/lin-18” network are the best correlated with the corresponding MIDR_NTHA values (R = 0.9745). Another significant conclusion drawn from Fig. 8 is that the correlation between the MIDR_NTHA values and the MIDR_ANN values is better in the range MIDR = 0–1.5%. In this range, with a few exceptions, all points of the data set lie very close to the diagonal reference line (i.e. the line on which MIDR_NTHA = MIDR_ANN). For higher damage levels (MIDR > 1.5%), i.e. for damage levels which correspond to “Destruction” according to Table 2, the scatter increases in all diagrams of Fig. 8. Therefore, the predictive ability of the optimum networks decreases for MIDR values larger than 1.5%. Nevertheless, this weakness is not significant since, for MIDR values larger than 1.5%, the buildings suffer heavy (and practically non-repairable) damage. Thus, the precision of the predicted MIDR values in these cases is not critical. By contrast, the ability to reliably predict the order of magnitude of the MIDR values is more significant. As emerges from Fig. 8, this requirement is met. According to the conclusions presented above, the most efficient of the examined networks on the basis of the testing sub-set is the “N1-LM-log/lin-18” network (MSE = 0.045 and R = 0.975, Table 4a). The corresponding optimum network according to the total data set is the
Fig. 8. Comparison of damage predicted by NTHA and the best trained ANNs.
“N1-LM-log/lin-32” network (MSE = 0.034 and R = 0.983, Table 4a). The “N1-LM-log/lin-18” network is the one mainly utilized for the assessment of the ability of the trained networks to reliably predict the seismic damage level of r/c buildings in cases with data unknown to them (generalization ability), which is presented in Section 5. This choice was based on the fact that the testing sub-set is used to check the generalization ability of the networks during their training and not to optimize the values of the synaptic weights [47]. Therefore, the “N1-LM-log/lin-18” network is considered more efficient for predictions in cases of unseen input data (i.e. seismic damage predictions for future earthquakes). Nevertheless, the generalization ability of the “N1-LM-log/lin-32” network is also examined in Section 5 for research purposes.
4.2. Parametric investigation in the case of the PR problem
In the case of the PR problem, networks with one (“N1” networks) and two hidden layers (henceforth “N2” networks) were utilized. Just as in the case of the FA problem, the procedure of the parametric investigation was separated into two parts on the basis of the utilized training algorithm (RP or SCG). This procedure is briefly illustrated in Fig. 9. As emerges from this figure, the procedure is basically similar to the one which was followed in the case of the FA problem. The main difference between the two procedures is that, in the case of the PR problem, ANNs with one and two hidden layers were used. Thus, the parametric analyses were performed in two stages for each one of the two parts of the procedure: in stage one, the problem was solved using ANNs with one hidden layer whereas, in stage two, networks with two hidden layers were used. As regards the criterion for the performance assessment of ANNs, the percentage of correct classifications of r/c buildings to the damage classes of Table 2 (i.e. the Overall Accuracy (OA) index, Fig. 6) was adopted. This index was calculated on the basis of the testing sub-set, the training sub-set and the total data set. The performance of the examined networks was also assessed using the complete CMs (Fig. 6), which give an overall view of the predictive abilities of ANNs. A total of 204 (= 4 combinations of activation functions × 51 different numbers of neurons in the hidden layer) ANNs with one hidden layer and 20808 (= 8 combinations of activation functions × 51 different numbers of neurons in the 1st hidden layer × 51 different numbers of neurons in the 2nd hidden layer) ANNs with two hidden layers were configured. In Table 5, the results of the parametric investigation of the performance of networks with one hidden layer are presented. The main conclusion that can be drawn from Table 5 is that the activation function of the neurons of the output layer is the most important configuration parameter of networks with one hidden layer. More specifically, when the tansig function is adopted for the neurons of the output layer, the value of the OA index fluctuates between 81.9% and 88.9%, regardless of the choices for the activation function of the neurons of the hidden layer and for the training algorithm (the influence of these parameters on the OA index value is significantly smaller). By contrast, when the logsig function is adopted for the neurons of the output layer, the OA index is considerably lower. For example, the combination of activation functions tansig/logsig yields an OA index value equal to 61.8% for the classifications of all samples of the data set when the network is trained using the RP algorithm, and 59.7% when the SCG algorithm is used. When the combination tansig/tansig is adopted, the corresponding value of the OA index is 84.5%, irrespective of the training algorithm utilized. A similar conclusion is drawn when the results which are based on the samples of the testing and the training sub-sets are evaluated. Another significant conclusion drawn from Table 5 is the great importance of the optimum number of neurons of the hidden layer. This number depends on the other configuration parameters of the examined networks, as well as on the set (testing, training or total) for which the OA index is calculated. As emerges from Table 5, the relation of the optimum number of neurons of the hidden layer to the other configuration parameters does not follow any specific function; the optimum number varies without any apparent pattern. This conclusion is consistent with the absence of a direct method for its calculation and justifies the recourse to the “trial and error” procedure. Tables 6a and 6b illustrate the results of the parametric investigation of the performance of ANNs with two hidden layers. The main conclusion which can be drawn from the combined study of these
Fig. 9. Procedure of the parametric investigation for the optimum performance of ANNs in the solution of the PR problem.
tables is that, as in the case of networks with one hidden layer, the activation function of the neurons of the output layer is the most important configuration parameter. In particular, the use of the tansig function in the neurons of the output layer leads to values of the OA index between 84% and 90% (the corresponding range of OA index values in the case of the logsig function is 63.7–70%). The other conclusions which were drawn from the parametric investigation of ANNs with one hidden layer (i.e. the minor importance of the activation function of the neurons of the hidden layer and of the training algorithm in comparison to the activation function of the neurons of the output layer, as well as the great importance and the non-deterministic calculation of the optimum number of neurons in the hidden layer) are also valid in the case of networks with two hidden layers. Finally, the combined study of Tables 5, 6a and 6b leads to the significant conclusion that the improvement of the OA index is relatively minor when a second hidden layer is added. For example, the most efficient network with one hidden layer according to the total data set (combination “RP algorithm – activation functions in the hidden layer and the output layer: tansig/tansig – 56 neurons in the hidden layer”, henceforth “N1-RP-tan/tan-56” network) yields an OA index value equal to 84.5% (Table 5). The corresponding most efficient network with two hidden layers (combination “RP algorithm – activation functions in 1st/2nd hidden layers and output layer: tansig/tansig/tansig – 60/48 neurons in 1st/2nd hidden layers”, henceforth “N2-RP-tan/tan/tan-60/48” network) yields an OA index value equal to 89.9% (Table 6a). This increase of the OA index value is not negligible, but it cannot be considered significant when it is evaluated in conjunction with the corresponding increase of the number of neurons (the “N2-RP-tan/tan/tan-60/48” network has 108 neurons in its hidden layers whereas the “N1-RP-tan/tan-56” network has 56). In Fig. 10, the CMs of the most efficient networks with one hidden layer according to the testing sub-set and to the total data set (Table 5) are presented. The study of these CMs leads to the conclusion that, besides the significant percentage of correct classifications (i.e. the high values of the OA index given in Table 5), the vast majority of wrong classifications are classifications in classes adjacent to the correct ones. For example, in the CM of Fig. 10a, which corresponds to the most efficient network according to the testing sub-set (“N1-
Table 5. ANNs with one hidden layer – Best values of the OA index and the corresponding number of neurons of the hidden layer (columns: training algorithm and activation functions of the neurons of the hidden/output layers).
Performance criterion | Data set | RP logsig/logsig | RP logsig/tansig | RP tansig/tansig | RP tansig/logsig | SCG logsig/logsig | SCG logsig/tansig | SCG tansig/tansig | SCG tansig/logsig
max OA (%) | Testing sub-set | 64.8 | 81.9 | 83.6 | 64.5 | 54.9 | 81.9 | 81.9 | 61.4
max OA (%) | Training sub-set | 61.4 | 87.0 | 88.9 | 62.0 | 56.5 | 85.4 | 87.2 | 59.4
max OA (%) | Total data set | 62.4 | 83.9 | 84.5 | 61.8 | 55.5 | 83.0 | 84.5 | 59.7
Number of neurons in the hidden layer | Testing sub-set | 54 | 48 | 44 | 14 | 46 | 40 | 36 | 52
Number of neurons in the hidden layer | Training sub-set | 34 | 54 | 56 | 26 | 18 | 46 | 56 | 10
Number of neurons in the hidden layer | Total data set | 34 | 54 | 56 | 14 | 18 | 46 | 56 | 52
Table 6a. ANNs with two hidden layers – Best values of the OA index and the corresponding number of neurons of the hidden layers (RP algorithm; columns: activation functions of the neurons of the 1st hidden/2nd hidden/output layers).
Performance criterion | Data set | tansig/tansig/logsig | tansig/tansig/tansig | tansig/logsig/logsig | tansig/logsig/tansig | logsig/tansig/logsig | logsig/tansig/tansig | logsig/logsig/logsig | logsig/logsig/tansig
max OA (%) | Testing sub-set | 67.9 | 84.3 | 70.0 | 84.6 | 67.6 | 85.3 | 67.9 | 85.0
max OA (%) | Training sub-set | 64.7 | 94.1 | 69.6 | 93.8 | 64.7 | 92.4 | 68.4 | 92.4
max OA (%) | Total data set | 64.4 | 89.9 | 69.9 | 89.4 | 63.7 | 88.1 | 66.4 | 87.9
Number of neurons in 1st/2nd hidden layer | Testing sub-set | 24/56 | 20/32 | 44/28 | 36/24 | 46/14 | 34/52 | 30/52 | 44/12
Number of neurons in 1st/2nd hidden layer | Training sub-set | 46/12 | 54/58 | 44/28 | 52/58 | 46/50 | 60/56 | 42/50 | 44/54
Number of neurons in 1st/2nd hidden layer | Total data set | 46/12 | 60/48 | 44/28 | 56/40 | 50/54 | 60/56 | 42/50 | 44/54
RP-tan/tan-44” network), there are no classifications in classes not adjacent to the correct ones. The corresponding number in the case of the most efficient network according to the total data set (“N1-RP-tan/tan-56” network) is 8 in 1950 samples (Fig. 10b). Similar numbers are also observed in the cases of the most efficient networks which were trained using the SCG algorithm (Fig. 10c and d). Thus, even the wrong classifications do not lead to misleading conclusions about the expected damage state. A significant feature of CMs is their ability to provide information about the percentage of correct classifications in each class individually and not only about the overall percentage (Fig. 6). More specifically, the sum of the elements of each column corresponds to the total number of the target vectors which belong to each one of the five classes. Thus, as emerges, for example, from the study of the CM of Fig. 10b (“N1-RP-tan/tan-56” network), the total data set consists of: 292 (= 268 + 24) class 1 samples (null damage), 271 (= 12 + 225 + 34) class 2 samples (slight damage), 443 (= 1 + 34 + 349 + 56 + 3) class 3 samples (moderate damage), 371 (= 68 + 274 + 29) class 4 samples (heavy damage), and 573 (= 4 + 38 + 531) class 5 samples (destruction). Respectively, the sum of the elements of each row corresponds to the total number of samples which the “N1-RP-tan/tan-56” network classifies to each one of the five classes of the problem. According to the above clarifications, the “N1-RP-tan/tan-56” network correctly classifies 268 samples to the class 1, whereas the samples whose true class is the class 1 are 292 (Precision = 268/292 = 0.918 or 91.8%). Furthermore, the “N1-RP-tan/tan-56” network classifies 281 (= 268 + 12 + 1) samples to the class 1, but the number of these samples which indeed belong to the class 1 is 268 (Recall = 268/281 = 0.954 or 95.4%). These high values of the Precision and Recall indices indicate the great efficiency of the “N1-RP-tan/tan-56” network in correctly classifying the samples to the class 1. Similar values of the Precision and Recall indices are also observed for the network classifications to the class 5 (92.7% and 94.3% respectively). By contrast, the corresponding values of these indices for the classes 2, 3 and 4 are smaller (with few exceptions, they fluctuate between 74% and 80%). Thus, the efficiency of the “N1-RP-tan/tan-56” network in correctly classifying the samples to these classes is not equivalent to the corresponding efficiency for the classes 1 and 5. However, the percentages of the correct classifications to the classes 2, 3 and 4 cannot be rated as unacceptable. Similar conclusions are also drawn from the study of the other CMs of Fig. 10. Fig. 11 illustrates the CMs of the most efficient networks with two hidden layers according to the testing sub-set and to the total data set (Tables 6a and 6b). The conclusions which arise from the study of these CMs are generally similar to the corresponding conclusions for the CMs of ANNs with one hidden layer (Fig. 10). However, it must be noted that the number of classifications in classes which are not adjacent to the correct ones is smaller in the case of networks with two hidden layers. For example, no classifications in classes which are not adjacent to the correct ones are made by the “N2-SCG-tan/tan/tan-10/28” network (Fig. 11c). The corresponding number of classifications in the case of the “N1-SCG-tan/tan-36” network (with one hidden layer) is 2 (Fig. 10c). It must also be stressed that networks with two hidden layers classify the samples to the classes 2, 3 and 4 more effectively than networks with one hidden layer. More specifically, the percentages of correct classifications which are achieved by networks with two hidden layers in the classes 2, 3 and 4 are (with few exceptions) greater than 80%. Thus, despite the fact that the overall percentage of correct classifications (OA index) is not significantly increased when networks with two hidden layers are utilized, the general form of the corresponding CMs indicates that the improvement of the general quality of the classifications is considerable. Finally, it must be noted that, just as in the case of the FA problem (Section 4.1), the network which is mainly utilized for the assessment of the generalization ability of the ANNs (Section 5) is the most efficient network according to the testing sub-set, i.e. the “N2-SCG-tan/tan/tan-10/28” network (OA = 86%, Table 6b and Fig. 11c). Nevertheless, the generalization ability of the ANNs is also assessed in Section 5 using the most efficient network according to the total data set, i.e. the “N2-RP-tan/tan/tan-60/48” network (OA = 89.9%, Table 6a and Fig. 11b). In Table 7, the configuration parameters of the networks which are utilized for the study of the generalization ability of the ANNs are summarized.
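The figures quoted in this section for the CM of Fig. 10b (“N1-RP-tan/tan-56” network) can be cross-checked with a short Python sketch. The matrix below is rebuilt from the column sums given in the text, under the assumption that the terms of each sum are listed from the top row to the bottom row; the function names are hypothetical. The labels “Precision” and “Recall” follow the usage of the text; other references may assign the two names to the column-wise and row-wise ratios the other way round.

```python
# Rows: class assigned by the network; columns: true (target) class.
# Assumption: terms of each quoted column sum are ordered top to bottom.
CM = [
    [268,  12,   1,   0,   0],
    [ 24, 225,  34,   0,   0],
    [  0,  34, 349,  68,   4],
    [  0,   0,  56, 274,  38],
    [  0,   0,   3,  29, 531],
]


def overall_accuracy(cm):
    """Sum of the main diagonal divided by the total number of samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def column_ratio(cm, k):
    """Diagonal element over its column sum ("Precision" in the text)."""
    return cm[k][k] / sum(row[k] for row in cm)


def row_ratio(cm, k):
    """Diagonal element over its row sum ("Recall" in the text)."""
    return cm[k][k] / sum(cm[k])


def non_adjacent_count(cm):
    """Samples assigned to a class more than one step away from the true one."""
    n = len(cm)
    return sum(cm[i][j] for i in range(n) for j in range(n) if abs(i - j) > 1)


oa = overall_accuracy(CM)            # overall accuracy of the whole matrix
class1 = (column_ratio(CM, 0), row_ratio(CM, 0))
class5 = (column_ratio(CM, 4), row_ratio(CM, 4))
far_off = non_adjacent_count(CM)     # misclassifications far from the diagonal
```

With this matrix the sketch gives an OA of about 84.5%, ratios of about 91.8% and 95.4% for class 1 and about 92.7% and 94.3% for class 5, and 8 non-adjacent classifications, in agreement with the values quoted in the text.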
Table 6b. ANNs with two hidden layers – Best values of the OA index and the corresponding number of neurons of the hidden layers (SCG algorithm; columns: activation functions of the neurons of the 1st hidden/2nd hidden/output layers).
Performance criterion | Data set | tansig/tansig/logsig | tansig/tansig/tansig | tansig/logsig/logsig | tansig/logsig/tansig | logsig/tansig/logsig | logsig/tansig/tansig | logsig/logsig/logsig | logsig/logsig/tansig
max OA (%) | Testing sub-set | 68.3 | 86.0 | 66.9 | 84.6 | 65.2 | 85.3 | 65.5 | 84.0
max OA (%) | Training sub-set | 64.9 | 91.1 | 64.7 | 92.7 | 62.4 | 89.3 | 64.9 | 89.2
max OA (%) | Total data set | 63.7 | 87.9 | 64.0 | 88.4 | 61.9 | 85.8 | 63.1 | 86.4
Number of neurons in 1st/2nd hidden layer | Testing sub-set | 38/60 | 10/28 | 54/12 | 10/52 | 58/52 | 42/20 | 52/58 | 14/16
Number of neurons in 1st/2nd hidden layer | Training sub-set | 28/14 | 50/38 | 44/28 | 28/26 | 60/44 | 40/24 | 28/48 | 48/32
Number of neurons in 1st/2nd hidden layer | Total data set | 28/14 | 50/36 | 42/30 | 28/26 | 52/30 | 48/56 | 24/18 | 54/40
Fig. 10. CMs of the most efficient networks with one hidden layer.
5. Assessment of the predictive abilities of the optimum configured networks
In the current section, the results of the assessment of the generalization ability of the most efficient ANNs are presented. In particular, the section examines the ability of the optimum configured networks (Table 7) to reliably predict the seismic damage state in cases in which the r/c buildings and/or the earthquakes are unknown to them (i.e. they were not used in the generation of the training data set). For this purpose, three different scenarios were utilized, in which the predictions of the most efficient networks were compared to the results of NTHA. More specifically, in the context of these scenarios, input vectors x (and the corresponding target vectors d) were generated, in which the values of the structural parameters (elements of the sub-vector xstruct, Eq. (1)) or the values of the seismic parameters (elements of the sub-vector xseism) or the values of both of them (all elements of the vector x) are different from the corresponding values of the input vectors of the training data set.
5.1. Assessment of predictive ability of ANNs for known buildings subjected to unseen earthquakes
The aim of the first scenario is the assessment of the generalization ability of the most efficient networks in cases in which the testing data set consists of input vectors x with unknown values of seismic parameters (i.e. unknown sub-vector xseism) but known values of structural parameters (i.e. known sub-vector xstruct). To this end, the 30 r/c buildings which were used for the generation of the training data set (Appendix A) were utilized. These buildings were subjected to 4 testing earthquakes (Table 8) that were different from the 65 excitations (Appendix B) which were used for the generation of the training data set.
Fig. 11. CMs of the most efficient networks with two hidden layers.
Following the procedure illustrated in Fig. 4, the testing data set was generated. This data set consists of 120 samples (30 buildings × 4 earthquakes). Two types of target vectors were formed: one compatible with the formulation of the FA problem and one compatible with the formulation of the PR problem (Section 3.3). The input vectors were introduced to the most efficient networks of Table 7 in order to predict the seismic damage state for the 120 testing samples.

Figs. 12 and 13 illustrate the predictions of the “N1-LM-log/lin-18” network (optimum network on the basis of the testing sub-set) and of the “N1-LM-log/lin-32” network (optimum network on the basis of the total data set) respectively. These figures concern the predictions which arise from the FA problem solution. As emerges from Fig. 12, the generalization ability of the “N1-LM-log/lin-18” network is significant, mainly for the testing earthquakes E1, E2 and E3. The predicted MIDR values for these earthquakes are generally acceptable (R values between 0.63 and 0.72, and MSE values between 0.017 and 0.069). This conclusion is also supported by the fact that the vast majority of points in the corresponding diagrams (Fig. 12a–c) lie close to the diagonal reference line (i.e. the locus of points for which MIDR_NTHA = MIDR_ANN). By contrast, for the testing earthquake E4 the predictions of the MIDR values are less accurate. Although the corresponding correlation factor R (= 0.65) is of the same order of magnitude as the R factors of the testing excitations E1–E3, the MSE (= 0.62) is one order of magnitude higher. Furthermore, the vast majority of points in the corresponding diagram (Fig. 12d) lie far from, and above, the reference line. This means that the network mainly yields lower MIDR values than the NTHA.

The higher deviations of the MIDR values extracted from the “N1-LM-log/lin-18” network for the testing earthquake E4, in comparison to the corresponding deviations for the testing earthquakes E1–E3, could be partly attributed to the fact that, contrary to the testing earthquakes E1–E3, 2 seismic parameters (ASI and EPA) of the earthquake E4 have values outside the range of the corresponding values of the seismic parameters of the 65 seismic excitations which were used for the generation of the training data set (see Table 3 and the marked entries of Table 8). Thus, in the case of earthquake E4, the network approaches the solution of the problem by extrapolation. As a result, the generalization ability decreases (e.g. [54,55]).

The combined study of Figs. 12 and 13 leads to the conclusion that the “N1-LM-log/lin-18” network generalizes better than the “N1-LM-log/lin-32” network. Especially for the testing earthquakes E2 and E4, the “N1-LM-log/lin-32” network yields unacceptable MIDR values (R values equal to 0.28 and 0.36 respectively). By contrast, for the earthquake E1, and mainly for the earthquake E3, the difference between the generalization efficiency of the two examined networks is insignificant. Thus, Figs. 12 and 13 quantitatively confirm the theoretically expected conclusion that the “N1-LM-log/lin-18” network, which is the optimum network according to the testing sub-set, generalizes more efficiently than the “N1-LM-log/lin-32” network (optimum network according to the total data set).

Fig. 14a presents the predictions of the “N2-SCG-tan/tan/tan-10/28” network (optimum network on the basis of the testing sub-set) and the “N2-RP-tan/tan/tan-60/48” network (optimum network according to the total data set) for the damage state of the 30 r/c buildings due to the 4 testing earthquakes, on the basis of the solution of the PR problem. More specifically, the percentages of correct classifications (values of the OA index) to the damage states of Table 2 which result from these networks (Table 7) are illustrated. From Fig. 14a it is concluded that, with the exception of the earthquake E3, the “N2-SCG-tan/tan/tan-10/28” network achieves percentages of correct classifications greater than 73%. The corresponding percentages of the “N2-RP-tan/tan/tan-60/48” network are lower, with the exception of the earthquake E3 (OA = 46.7% for the “N2-SCG-tan/tan/tan-10/28” network and OA = 60% for the “N2-RP-tan/tan/tan-60/48” network). The low values of the OA index of the “N2-RP-tan/tan/tan-60/48” network can be explained by the fact that it is the most efficient network according to the total data set, and not according to the testing sub-set, which is used for the optimization of the generalization efficiency of the networks.

However, the low performance of the “N2-SCG-tan/tan/tan-10/28” network in the case of earthquake E3 demands further explanation. Because a direct explanation is not feasible, owing to the multi-parametric nature of the problem, a further analysis of the specific classifications is considered necessary. Thus, the corresponding CM is illustrated in Fig. 14b. As emerges from this figure, the wrong classifications concern the moderate and heavy damage classes (classes 3 and 4). More specifically, 18 (= 9 + 9) buildings have class 3 as their true class, and the network classifies 9 of them to this class (Precision = 50%). Furthermore, the network classifies a total of 18 (= 9 + 5 + 4) buildings to damage class 4, whereas only 5 of them indeed belong to this class (Recall = 27.8%). In any case, all wrong classifications fall in classes adjacent to the correct ones. Therefore, despite the low value of the OA index (46.7%), the overall picture of the damage state of the 30 r/c buildings due to the testing earthquake E3 is not misleading, given that such information will be used as a first estimation of the seismic damage right after a strong earthquake.

Table 7 Configuration parameters of the ANNs which were selected for the evaluation of their generalization ability.

| Formulation of the problem | Criterion: optimum performance for the | Number of hidden layers | Number of neurons in hidden layers | Training algorithm | Activation functions | Network name |
|---|---|---|---|---|---|---|
| FA problem | Testing sub-set | 1 | 18 | LM | logsig/linear | N1-LM-log/lin-18 |
| FA problem | Total data set | 1 | 32 | LM | logsig/linear | N1-LM-log/lin-32 |
| PR problem | Testing sub-set | 2 | 10/28 | SCG | tansig/tansig/tansig | N2-SCG-tan/tan/tan-10/28 |
| PR problem | Total data set | 2 | 60/48 | RP | tansig/tansig/tansig | N2-RP-tan/tan/tan-60/48 |

Table 8 The ground motion parameters of the 4 testing earthquakes. Entries marked with * are the values which, according to the text, lie outside the range of the training data set (bold in the original).

| Parameter | E1 | E2 | E3 | E4 |
|---|---|---|---|---|
| PGA | 0.296 | 0.11 | 0.321 | 0.68 |
| PGV | 16.083 | 6.865 | 28.621 | 44.719 |
| PGD | 2.33 | 2.726 | 6.921 | 17.687 |
| Ia | 0.725 | 0.216 | 0.973 | 4.915 |
| SED | 224.12 | 175.85 | 694.81 | 3006.3 |
| CAV | 538.73 | 490.24 | 592.87 | 1630.99 |
| ASI | 0.284 | 0.102 | 0.257 | 0.652* |
| HI | 46.817 | 37.156 | 107.024 | 181.352 |
| EPA | 0.28 | 0.101 | 0.258 | 0.646* |
| PGV/PGA | 0.055 | 0.064 | 0.091 | 0.067 |
| PP | 0.159 | 0.219 | 0.11 | 0.275 |
| TUD | 3.55 | 0.21 | 4.78 | 7.76 |
| TBD | 10.76 | 7.72 | 8.91 | 19.12 |
| TSD | 7.01 | 22.36 | 6.99 | 9.72 |
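The classification results of this scenario are read off confusion matrices (CMs): the OA index, the per-class rates and the observation that all wrong classifications fall in adjacent classes. The following minimal sketch shows how these quantities can be obtained from a CM. The 3-class matrix is invented for illustration and is not the CM of Fig. 14b; rows are assumed to hold true classes and columns predicted classes. Which of the two per-class rates the paper labels Precision and which Recall depends on the CM orientation defined earlier in the paper, so the sketch uses the neutral names row rate and column rate.

```python
def overall_accuracy(cm):
    """Share of samples on the main diagonal (OA index)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_rates(cm):
    """Diagonal entry divided by its row sum and by its column sum, per class."""
    n = len(cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]
    row_rates = [cm[i][i] / row_sums[i] if row_sums[i] else None for i in range(n)]
    col_rates = [cm[i][i] / col_sums[i] if col_sums[i] else None for i in range(n)]
    return row_rates, col_rates


def non_adjacent_errors(cm):
    """Number of samples assigned to a class more than one step from the true one."""
    n = len(cm)
    return sum(cm[i][j] for i in range(n) for j in range(n) if abs(i - j) > 1)


# ILLUSTRATIVE matrix only: rows = true class, columns = predicted class.
cm = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 4, 6],
]

print(overall_accuracy(cm))
print(per_class_rates(cm))
print(non_adjacent_errors(cm))
```

A low OA combined with zero non-adjacent errors corresponds to the situation described for earthquake E3: many samples are off by one class, but none is grossly misclassified.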
Fig. 12. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the four testing earthquakes: (a) E1, (b) E2, (c) E3 and (d) E4.
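The comparisons in Figs. 12 and 13 are summarized in the text by the correlation factor R, the MSE and the position of the points relative to the diagonal MIDR_NTHA = MIDR_ANN. A minimal sketch of these three measures follows. It assumes that R is the Pearson correlation coefficient between the NTHA values and the network outputs; the function names and the five value pairs are invented for illustration and are not results of the paper.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between reference and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (ASSUMED definition of the R factor)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def share_underpredicted(y_true, y_pred):
    """Fraction of samples for which the network value is below the NTHA value."""
    return sum(p < t for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative MIDR values in %, not taken from the paper.
midr_ntha = [0.2, 0.5, 0.9, 1.4, 2.1]
midr_ann = [0.25, 0.45, 0.8, 1.2, 1.6]

print(pearson_r(midr_ntha, midr_ann))
print(mse(midr_ntha, midr_ann))
print(share_underpredicted(midr_ntha, midr_ann))
```

R and MSE are complementary: a systematic underprediction can leave R largely unchanged while raising the MSE sharply, which is the pattern reported for earthquake E4.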
5.2. Assessment of predictive ability of ANNs for unseen buildings subjected to known earthquakes
The second scenario assesses the generalization ability of the most efficient networks when the testing data set consists of input vectors x with known values of seismic parameters (i.e. known sub-vector xseism) but unknown values of structural parameters (i.e. unknown sub-vector xstruct). Furthermore, this scenario is used to investigate whether ANNs can serve as tools for a rapid and reliable vulnerability assessment of individual r/c buildings using the FA problem solution. This use of ANNs is significantly advantageous because it yields, in real time, an estimation of the expected damage state of individual buildings due to numerous excitations which differ in their characteristics (seismic parameters). In order to fulfill the aims of this scenario, new r/c buildings, different from the 30 r/c buildings which were used for the training data set generation, were selected. The values of their structural parameters are given in Table 9. The selected buildings were designed following the procedure described in Section 3.4.

The 3 new testing r/c buildings were subjected to the 65 seismic excitations (Table 3 and Appendix B) which were used for the generation of the training data set (Section 3.4). Following the procedure of the flow chart presented in Fig. 4, a new testing data set was generated. This data set consists of 195 samples (3 buildings × 65 excitations). As in the first scenario, two types of target vectors were formed: one compatible with the formulation of the FA problem and one compatible with the formulation of the PR problem (Section 3.3). Figs. 15 and 16 present the predictions of the “N1-LM-log/lin-18” network and of the “N1-LM-log/lin-32” network respectively on the basis of the FA problem solution. The combined study of these figures leads to the conclusion that, as expected, the “N1-LM-log/lin-18” network yields more accurate predictions of the MIDR values than the “N1-LM-log/lin-32” network. However, these differences in accuracy are not significant, especially as regards the values of the R factor. More specifically, with the exception of the 8-storey building, for which the “N1-LM-log/lin-32” network yields results with an R factor equal to 0.88 (Fig. 16c), all the other values of the R factor are greater than 0.9. Another significant conclusion, which was also drawn from the study of Fig. 8, is the
Fig. 13. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the four testing earthquakes: (a) E1, (b) E2, (c) E3 and (d) E4.
Fig. 14. (a) Percentages of correct classifications to damage states extracted from the optimum trained networks for the 30 r/c buildings subjected to the 4 testing earthquakes; (b) CM of the damage classification of the 30 r/c buildings produced by the “N2-SCG-tan/tan/tan-10/28” network for the testing earthquake E3.
incapacity of both networks to predict MIDR values greater than 1.5% as accurately as MIDR values below this limit. This incapacity causes the relatively high MSE values in the case of the 3-storey building (Figs. 15a and 16a), as well as in the case of the 8-storey building (when the “N1-LM-log/lin-32” network is used for the predictions, Fig. 16c). As was mentioned in the remarks on Fig. 8, the incapacity of the networks to accurately predict MIDR values greater than 1.5% is not of great importance, since these values correspond to heavy (non-repairable) damage or collapse. Moreover, it must be stressed that for such MIDR values the results obtained by the NTHA are not especially reliable, since a large number of structural members exhibit extensive inelastic behavior, which is difficult to capture with analytical models.

Fig. 17 illustrates the CMs of the classifications to the damage classes of Table 2 made by the “N2-SCG-tan/tan/tan-10/28” and “N2-RP-tan/tan/tan-60/48” networks for the 3 testing r/c buildings. The main conclusion which can be drawn from this figure is that, in general, the two examined networks achieve high percentages of correct classifications (OA values greater than 70%, with the exception of the classifications made by the “N2-SCG-tan/tan/tan-10/28” network for the 3-storey building). Further evidence of the high performance of the networks is the fact that only one of the 195 samples was classified to a class not adjacent to the correct one (3-storey building in Fig. 17a). Therefore, if these networks were used as computational tools for a preliminary estimation of the seismic damage state of the 3 testing buildings due to the 65 seismic excitations, the extracted results would be similar or relatively close to the results of the NTHA. This leads to the conclusion that properly trained ANNs might be used as tools accompanying the rapid visual screening procedure of buildings.

Table 9 The values of the structural parameters of the 3 testing r/c buildings.

| No. | nst | Htot = 3.2 · nst (m) | Lx (m) | Ly (m) | eo (m) | nvx (%) | nvy (%) |
|---|---|---|---|---|---|---|---|
| 1 | 3 | 9.6 | 10.0 | 15.0 | 0.0 | 62.0 | 0.0 |
| 2 | 5 | 16.0 | 10.0 | 15.0 | 0.0 | 60.0 | 0.0 |
| 3 | 8 | 25.6 | 10.0 | 15.0 | 0.0 | 58.0 | 0.0 |

5.3. Assessment of predictive ability of ANNs for unseen buildings subjected to unseen earthquakes
The third scenario assesses the generalization ability of the optimum configured networks when the testing data set consists of input vectors x with unknown values for both seismic and structural parameters (i.e. totally unknown vectors x). To this end, the 3 testing r/c buildings which were used in the analyses of the second scenario (Table 9) were subjected to 15 new testing earthquakes (Table 10). Following once more the procedure illustrated in Fig. 4, a new testing data set was generated. This data set consists of 45 samples (3 buildings × 15 earthquakes). As in the two previous scenarios, two types of target vectors were formed: one compatible with the formulation of the FA problem and one compatible with the formulation of the PR problem (Section 3.3). Figs. 18 and 19 illustrate the predictions of the MIDR values made by the “N1-LM-log/lin-18” network (optimum network according to the testing sub-set) and the corresponding predictions made by the “N1-LM-log/lin-32” network (optimum network according to the total data set) respectively. As emerges from Figs. 18 and 19, both examined networks yield predictions of a similar level of accuracy for the 3-storey and the 5-storey buildings. More specifically, for these two buildings the results of the “N1-LM-log/lin-32” network are better correlated with the corresponding NTHA results than the results of the “N1-LM-log/lin-18” network. The opposite applies in the case of the 8-storey building. Although the values of the R factor can generally be considered acceptable (with the exception of the value obtained from the “N1-LM-log/lin-32” network for the 8-storey building), the corresponding values of MSE are relatively high. According to Figs. 18 and 19, these high values could be attributed
Fig. 15. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.
Fig. 16. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.
mainly to the aforesaid insufficiency of the networks in adequately approaching high MIDR values (higher than 1.5–2.0%) calculated by NTHA. However, as was already mentioned above, this insufficiency of the networks is not of great importance. Furthermore, it must be stressed that the majority of the samples of the training data set have MIDR values less than 2.0% (1589 of 1950 samples). Therefore, the training algorithms yield values of the synaptic weights which are better adapted to MIDR values less than 2.0%. Finally, the high values of MSE could also be attributed to the fact that 7 of the 15 testing earthquakes have 2 or more seismic parameters whose values are outside the range of the corresponding seismic parameters’ values of the 65 seismic excitations which were used for the generation of the training data set (see Table 3 and the bold entries of Table 10).

In Fig. 20, the CMs of the classifications made by the “N2-SCG-tan/tan/tan-10/28” and “N2-RP-tan/tan/tan-60/48” networks for the 3 testing r/c buildings are presented. The main conclusion drawn from this figure is the high quality of the classifications of both examined networks (OA values greater than 73%, with the exception of the classifications made by the “N2-RP-tan/tan/tan-60/48” network for the 8-storey building (Fig. 20b)). Especially the configuration of the CMs which correspond to the “N2-SCG-tan/tan/tan-10/28” network’s classifications (Fig. 20a) is indicative of its high efficiency. More specifically, besides the high values of the OA index (86.7% for the 3-storey building, 73.3% for the 5-storey building and 80.0% for the 8-storey building), the percentages of correct classifications to individual classes are greater than 67% (i.e. the Precision and Recall indices have values greater than 67%), with a few exceptions. Furthermore, it must be stressed that none of the examined samples is classified by the “N2-SCG-tan/tan/tan-10/28” network to classes not adjacent to the correct ones (Fig. 20a). Finally, a very significant conclusion, drawn from the combined study of Figs. 18–20, is that the predictions of the most efficient networks are more reliable when they are based on the PR problem solution.
6. Conclusions
The present paper examined the ability of the Multilayer Feedforward Perceptron (MFP) Artificial Neural Networks (ANN) to successfully approach the solution of the problem of the seismic damage prediction of reinforced concrete (r/c) buildings using two different methodologies. The first one concerns the estimation of the values of the buildings’ damage
Fig. 17. CMs of the damage state classifications of the 3 testing buildings made by the optimum trained networks (a) N2-SCG-tan/tan/tan-10/28 and (b) N2-RP-tan/tan/tan-60/48.
Table 10 The ground motion parameters of the 15 testing earthquakes.

| | PGA | PGV | PGD | Ia | SED | CAV | ASI | HI | EPA | PGV/PGA | PP | TUD | TBD | TSD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E1 | 0.131 | 19.02 | 17.26 | 0.248 | 720.0 | 559.7 | 0.107 | 52.81 | 0.106 | 0.148 | 0.525 | 1.18 | 6.66 | 29.08 |
| E2 | 0.296 | 16.08 | 2.330 | 0.725 | 224.1 | 538.7 | 0.284 | 46.82 | 0.280 | 0.055 | 0.159 | 3.55 | 10.76 | 7.01 |
| E3 | 0.037 | 1.505 | 0.125 | 0.016 | 2.35 | 113.8 | 0.026 | 4.099 | 0.027 | 0.041 | 0.118 | 0.00 | 0.00 | 17.33 |
| E4 | 0.110 | 6.865 | 2.726 | 0.216 | 175.8 | 490.2 | 0.102 | 37.16 | 0.101 | 0.064 | 0.219 | 0.21 | 7.72 | 22.36 |
| E5 | 0.014 | 1.144 | 0.596 | 0.003 | 1.95 | 34.78 | 0.015 | 3.982 | 0.014 | 0.085 | 0.410 | 0.00 | 0.00 | 9.81 |
| E6 | 0.030 | 1.826 | 1.902 | 0.010 | 2.41 | 68.47 | 0.031 | 3.466 | 0.031 | 0.062 | 0.369 | 0.00 | 0.00 | 9.89 |
| E7 | 0.091 | 6.181 | 1.176 | 0.070 | 31.72 | 191.6 | 0.082 | 24.55 | 0.082 | 0.069 | 0.167 | 0.37 | 2.76 | 7.51 |
| E8 | 0.166 | 8.641 | 1.196 | 0.117 | 70.58 | 216.2 | 0.122 | 31.37 | 0.120 | 0.053 | 0.319 | 0.30 | 1.02 | 8.39 |
| E9 | 0.272 | 20.77 | 2.893 | 0.648 | 233.6 | 497.9 | 0.258 | 60.22 | 0.256 | 0.078 | 0.233 | 2.04 | 7.14 | 6.09 |
| E10 | 0.321 | 28.62 | 6.921 | 0.973 | 694.8 | 592.9 | 0.257 | 107.0 | 0.258 | 0.091 | 0.110 | 4.78 | 8.91 | 6.99 |
| E11 | 0.714 | 58.4 | 22.89 | 5.453 | 4280 | 1452 | 0.474 | 210.5 | 0.479 | 0.083 | 0.195 | 7.70 | 10.96 | 6.75 |
| E12 | 0.681 | 44.72 | 17.69 | 4.915 | 3006 | 1631 | 0.652 | 181.3 | 0.646 | 0.067 | 0.275 | 7.76 | 19.13 | 9.72 |
| E13 | 0.809 | 82.55 | 20.94 | 9.394 | 6277 | 2529 | 0.682 | 300.8 | 0.683 | 0.104 | 0.574 | 12.76 | 34.43 | 9.59 |
| E14 | 0.891 | 53.29 | 10.02 | 4.412 | 1887 | 1262 | 0.749 | 178.2 | 0.746 | 0.061 | 0.297 | 6.42 | 14.71 | 6.59 |
| E15 | 1.198 | 75.37 | 28.17 | 11.07 | 7063 | 2404 | 0.983 | 293.2 | 0.969 | 0.064 | 0.259 | 10.35 | 21.97 | 7.88 |
index through the formulation (and solution) of the problem of approximating an unknown function (Function Approximation (FA) problem). The second one concerns the classification of buildings to specific damage classes through the formulation (and solution) of the Pattern Recognition (PR) problem. In order to investigate the performance of the optimum ANNs, several networks with different configuration parameters were examined, i.e. different numbers of hidden layers (1 or 2), different numbers of neurons in the hidden layers (between 10 and 60), and different activation functions of neurons (sigmoid function (logsig) or hyperbolic tangent function (tansig)). Three training algorithms, namely the Levenberg-Marquardt algorithm (LM), the Scaled Conjugate Gradient algorithm (SCG) and the Resilient Back-Propagation algorithm (RP), were adopted.

The seismic damage index of r/c buildings was expressed by means of the Maximum Interstorey Drift Ratio (MIDR). This choice was based on the fact that MIDR is a global damage index, which can be used for the description not only of all the structural but also of the non-structural elements. Moreover, the adoption of MIDR does not lead to the numerical problems that can arise in the attempt to analytically assess the global seismic damage using other indices. Also, in the case of instrumented buildings, the calculation of the MIDR can be made in real time, thus providing direct information that can be used either for the remote estimation of the damage level or for the direct generation of data in order to improve (through re-training) the predictions of ANNs.

In the framework of the investigation of the optimum performance of ANNs (best configured networks) in the solution of the FA problem, networks with one hidden layer were used. The main conclusion drawn from this investigation is that the networks which have the logsig activation function in the neurons of the hidden layer and were trained with the LM algorithm yield the optimum predictions of the damage index values. The number of neurons of the hidden layer which leads to the optimum predictions of ANNs (optimum number of neurons) depends on the data set which is used for the calculation of the performance parameters.

In the case of the investigation of the optimum performance of the ANNs in the solution of the PR problem, networks with one and two hidden layers were used. The main conclusion that emerged from this investigation is that the most important configuration parameter of the networks (with one or two hidden layers) is the activation function of the neurons of the output layer (more specifically, the tansig function). The addition of a second hidden layer improves the classification quality, primarily as regards the percentages of correct classification to individual classes and secondarily as regards the total percentages of correct classifications (values of the OA index). Therefore, the most efficient networks for the solution of the PR problem are the networks with two hidden layers.

The generalization ability of the best configured networks was then examined by means of three seismic scenarios. In the framework of the first scenario, the generalization ability of the most efficient networks in the case in which known r/c buildings were subjected to unknown seismic excitations was examined. The main conclusion drawn from this scenario is that the networks which generalize most efficiently (in the case of the FA problem as well as in the case of the PR problem) are the optimum networks according to the testing data set which is used for the assessment of their generalization during training. In the framework of the second scenario, the generalization ability of the most efficient networks in the case in which unknown r/c buildings were subjected to known seismic excitations was examined. The same conclusion as the main conclusion of the first scenario was drawn for the case of the FA problem. Another significant conclusion is the incapacity of the networks to accurately predict damage index values greater than the values which practically correspond to heavy (non-repairable) damage. Nevertheless, this incapacity is not of great importance since, in this range of values, the accuracy of the predicted damage index values is not critical. By contrast, the performance of the corresponding examined networks in the PR problem solution was
Fig. 18. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.
Fig. 19. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.
generally highly sufficient. The third scenario concerns the examination of the generalization ability of the most efficient networks in cases in which both the buildings and the seismic excitations are unknown to them. Just as in the previous scenarios, in the case of the FA problem, the network which is the optimum according to the testing data set yields damage index values which are highly correlated with the corresponding values calculated by non-linear time history analyses. The relatively high MSE values can be mainly attributed to the incapacity of the networks to accurately predict damage index values which practically correspond to heavy (non-repairable) damage. In the case of the solution of the PR problem, the examined networks yield highly sufficient classifications. In addition (just as in the second scenario), the configuration of the corresponding confusion matrices led to the conclusion that the networks are capable of yielding reliable predictions of the expected damage of unknown buildings due to future earthquakes.

The common and main conclusion of the three scenarios is that the best configured networks which were trained on the basis of the generated training data set are capable of yielding reliable estimations of the seismic damage state of known or unknown buildings which are subjected to future earthquakes, either in real time after the shock or in pre-seismic periods. The results of the PR approach in particular can be used both in pre-seismic periods for the vulnerability assessment of existing r/c buildings (accompanying rapid visual screening procedures in the framework of seismic microzonation studies) and as reliable instant decision-making tools (or in the production of preliminary reports) for the authorities after a strong earthquake. However, the predictions of the ANNs which were illustrated in the present paper can be further improved using additional training samples which cover more types of buildings and/or more seismic excitations.
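Sections 5.1 and 5.3 attribute part of the prediction error to testing earthquakes whose seismic parameters fall outside the range covered by the 65 training excitations, and additional training samples are suggested above as a remedy. A simple range check of the following kind could flag such extrapolation cases before a prediction is trusted. The function name and the training bounds are hypothetical (the actual ranges are those of Table 3, which is not reproduced in this section); the ASI and EPA values of E3 and E4 are taken from Table 8.

```python
def out_of_range(params, train_ranges):
    """Return the names of parameters lying outside the training interval [lo, hi]."""
    return [
        name
        for name, value in params.items()
        if not (train_ranges[name][0] <= value <= train_ranges[name][1])
    ]


# ASI and EPA of the testing earthquakes E3 and E4 (Table 8).
e3 = {"ASI": 0.257, "EPA": 0.258}
e4 = {"ASI": 0.652, "EPA": 0.646}

# HYPOTHETICAL training bounds, chosen only so that the example runs;
# the real minimum and maximum values belong to Table 3 of the paper.
train_ranges = {"ASI": (0.01, 0.60), "EPA": (0.01, 0.60)}

print(out_of_range(e3, train_ranges))  # []
print(out_of_range(e4, train_ranges))  # ['ASI', 'EPA']
```

Excitations flagged in this way are natural candidates for inclusion in an extended training data set.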
Fig. 20. CMs of the damage state classifications of the 3 testing buildings made by the optimum trained networks (a) “N2-SCG-tan/tan/tan-10/28” and (b) “N2-RP-tan/tan/tan-60/48”.
Appendix A. Design data of the 30 selected r/c buildings
Fig. A1. Design data of the 15 selected symmetric r/c buildings.
Fig. A2. Design data of the 15 selected nonsymmetric r/c buildings.
Appendix B. Data of the 65 selected ground motion records
Table B1 Data of the 65 selected seismic excitations.

| No. | Earthquake name | Date | Magnitude (Ms) | Distance to fault (km) | Component (deg) | PGA (g) |
|---|---|---|---|---|---|---|
| 1 | Imperial Valley | 15/10/1979 | 6.9 | 23.8 | 225/315 | 0.128/0.078 |
| 2 | Imperial Valley | 15/10/1979 | 6.9 | 28.7 | 012/282 | 0.27/0.254 |
| 3 | Kocaeli (Turkey) | 17/8/1999 | 7.8 | 144.6 | 090/180 | 0.06/0.049 |
| 4 | Landers | 28/6/1992 | 7.4 | 128.3 | 000/270 | 0.057/0.046 |
| 5 | Loma Prieta | 18/10/1989 | 7.1 | 28.2 | 090/180 | 0.247/0.215 |
| 6 | Whittier Narrows | 1/10/1987 | 5.7 | 25.2 | 000/090 | 0.221/0.124 |
| 7 | Northridge | 17/1/1994 | 6.7 | 25.4 | 177/267 | 0.357/0.206 |
| 8 | Northridge | 17/1/1994 | 6.7 | 30 | 020/110 | 0.474/0.439 |
| 9 | N. Palm Springs | 8/7/1986 | 6 | 43.3 | 270/360 | 0.144/0.132 |
| 10 | Northridge | 17/1/1994 | 6.7 | 13 | 000/270 | 0.41/0.482 |
| 11 | Northridge | 17/1/1994 | 6.7 | 6.4 | 090/360 | 0.604/0.843 |
| 12 | Northridge | 17/1/1994 | 6.7 | 12.3 | 000/090 | 0.303/0.443 |
| 13 | Whittier Narrows | 1/10/1987 | 5.7 | 10.8 | 048/318 | 0.426/0.443 |
| 14 | Cape Mendocino | 25/4/1992 | 7.1 | 9.5 | 000/090 | 0.59/0.662 |
| 15 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 2.94 | N/W | 0.251/0.202 |
| 16 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 10.04 | N/W | 0.393/0.742 |
| 17 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 4.01 | N/W | 0.162/0.134 |
| 18 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 7.31 | N/W | 0.821/0.653 |
| 19 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 11.14 | N/W | 0.44/0.353 |
| 20 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 10.33 | N/W | 0.13/0.147 |
| 21 | Chi-Chi (Taiwan) | 20/9/1999 | 7.6 | 5.92 | N/W | 0.188/0.148 |
| 22 | Erzincan (Turkey) | 13/3/1992 | – | 2.0 | NS/EW | 0.515/0.496 |
| 23 | Loma Prieta | 18/10/1989 | 7.1 | 12.7 | 000/090 | 0.367/0.322 |
| 24 | Loma Prieta | 18/10/1989 | 7.1 | 14.4 | 000/090 | 0.555/0.367 |
| 25 | Loma Prieta | 18/10/1989 | 7.1 | 14.5 | 000/090 | 0.529/0.443 |
| 26 | Northridge | 17/1/1994 | 6.7 | 7.1 | 090/360 | 0.583/0.59 |
| 27 | Northridge | 17/1/1994 | 6.7 | 8.9 | 270/360 | 0.753/0.939 |
| 28 | Northridge | 17/1/1994 | 6.7 | 14.6 | 000/090 | 0.877/0.64 |
| 29 | Northridge | 17/1/1994 | 6.7 | 6.2 | 052/142 | 0.612/0.897 |
| 30 | Campano Lucano (Italy) | 23/11/1980 | 6.9 | 39 | E-W/N-S | 0.047/0.048 |
| 31 | Spitak (Armenia) | 7/12/1988 | 6.7 | 20 | E-W/N-S | 0.183/0.183 |
| 32 | Izmit (Turkey) | 17/8/1999 | 7.6 | 29 | W-E/S-N | 0.129/0.091 |
| 33 | Duzce (Turkey) | 12/11/1999 | 7.2 | 18 | E-W/N-S | 0.8/0.745 |
| 34 | Duzce (Turkey) | 12/11/1999 | 7.2 | 113 | S-N/E-W | 0.022/0.021 |
| 35 | Duzce (Turkey) | 12/11/1999 | 7.2 | 98 | 030/120 | 0.018/0.016 |
| 36 | Duzce (Turkey) | 12/11/1999 | 7.2 | 94 | E-W/N-S | 0.042/0.041 |
| 37 | Izmit (Turkey) | 17/8/1999 | 7.6 | 80 | E-W/N-S | 0.114/0.11 |
| 38 | Duzce (Turkey) | 6/6/2000 | 6.1 | 158 | LONG/TRAN | 0.004/0.004 |
| 39 | Strofades (Greece) | 18/11/1997 | 6.6 | 54 | 261/351 | 0.053/0.054 |
| 40 | Aigion (Greece) | 15/6/1995 | 6.5 | 138 | 065/155 | 0.013/0.013 |
| 41 | Friuli (Italy) | 11/9/1976 | 5.5 | 7 | E-W/N-S | 0.105/0.23 |
| 42 | Volvi (Greece) | 4/7/1978 | 6.4 | 15 | E-W/N-S | 0.099/0.115 |
| 43 | Dinar (Turkey) | 1/10/1995 | – | 0 | W-E/S-N | 0.319/0.273 |
| 44 | Izmit (Turkey) | 17/8/1999 | 7.6 | 5 | E-W/N-S | 0.244/0.296 |
| 45 | Duzce (Turkey) | 12/11/1999 | 7.2 | 0 | W-E/S-N | 0.513/0.377 |
| 46 | Imperial Valley | 15/10/1979 | 6.9 | 43.6 | 262/352 | 0.238/0.351 |
| 47 | Loma Prieta | 18/10/1989 | 7.1 | 16.1 | 000/090 | 0.417/0.212 |
| 48 | Loma Prieta | 18/10/1989 | 7.1 | 77.4 | 180/270 | 0.195/0.244 |
| 49 | Northridge | 17/1/1994 | 6.7 | 30.9 | 155/245 | 0.465/0.322 |
| 50 | Northridge | 17/1/1994 | 6.7 | 36.9 | 090/180 | 0.29/0.264 |
| 51 | Duzce (Turkey) | 12/11/1999 | 7.3 | 17.6 | 000/090 | 0.728/0.822 |
| 52 | Northridge | 17/1/1994 | 6.7 | 32.7 | 090/180 | 0.103/0.186 |
| 53 | Imperial Valley | 15/10/1979 | 6.9 | 54.1 | 075/345 | 0.122/0.167 |
| 54 | Superstition Hills | 24/11/1987 | 6.6 | 18.2 | 225/315 | 0.156/0.116 |
| 55 | Duzce (Turkey) | 12/11/1999 | 7.3 | 8.2 | 180/270 | 0.348/0.535 |
| 56 | Imperial Valley | 15/10/1979 | 6.9 | 7.6 | 002/092 | 0.213/0.235 |
| 57 | Imperial Valley | 15/10/1979 | 6.9 | 4.2 | 140/230 | 0.485/0.36 |
| 58 | Imperial Valley | 15/10/1979 | 6.9 | 1 | 140/230 | 0.519/0.379 |
| 59 | Imperial Valley | 15/10/1979 | 6.9 | 1 | 140/230 | 0.41/0.439 |
| 60 | Livermore | 27/1/1980 | 5.5 | 3.6 | 270/360 | 0.258/0.233 |
| 61 | Superstition Hills | 24/11/1987 | 6.6 | 13.9 | 000/090 | 0.358/0.258 |
| 62 | Superstition Hills | 24/11/1987 | 6.6 | 13.3 | 090/180 | 0.172/0.211 |
| 63 | Morgan Hill | 24/4/1984 | 6.1 | 12.8 | 270/360 | 0.224/0.348 |
| 64 | Imperial Valley | 15/10/1979 | 6.9 | 12.6 | 140/230 | 0.364/0.38 |
| 65 | Morgan Hill | 24/4/1984 | 6.1 | 3.4 | 150/240 | 0.156/0.312 |
References [29]
[1] ASCE/SEI 41-06. Seismic rehabilitation of existing buildings. ASCE (VA): American Society of Civil Engineers; 2009.
[2] EC8 (Eurocode 8). Design of structures for earthquake resistance - part 1: general rules, seismic actions and rules for buildings. European Committee for Standardization; 2005.
[3] ATC. Earthquake damage evaluation data for California. Redwood City (CA): Applied Technology Council; 1985 [ATC-13 Report].
[4] Anagnos T, Rojahn C, Kiremidjian AS. NCEER-ATC joint study on fragility of buildings. Technical Report NCEER 95–0003, State Univ. of New York at Buffalo: National Center for Earth. Eng. Research; 1995.
[5] Kappos AJ, Panagopoulos G, Panagiotopoulos C, Penelis G. A hybrid method for the vulnerability assessment of R/C and URM buildings. Bull Earthq Eng 2006;4(4):391–413.
[6] Tsang H-H, Ray KLS, Nelson TKL, Lo SH. Rapid assessment of seismic demand in existing building structures. Struct Des Tall Special Build 2009;18(4):427–39.
[7] Fausett L. Fundamentals of neural networks: architectures, algorithms and applications. Pearson; 1994.
[8] Haykin S. Neural networks and learning machines. 3rd ed. Prentice Hall; 2009.
[9] Williams MS, Sexsmith RG. Seismic indices for concrete structures: a state of the art review. Earthq Spectra 1995;11(2):319–49.
[10] Kappos AJ. Seismic damage indices for RC buildings: evaluation of concepts and procedures. Construction Research Communications Limited; 1997. p. 78–87. [ISSN 1365-0556].
[11] Adeli H, Yeh C. Perceptron learning in engineering design. Microcomput Civ Eng 1989;4(4):247–56.
[12] Adeli H. Neural networks in civil engineering: 1989–2001. Comput Aid Civ Infrastruct Eng 2001;16:126–42.
[13] Jegadesh SJS, Jayalekshmi S. A review on artificial neural network concepts in structural engineering applications. Int J Appl Civ Env Eng 2015;1(4):6–11.
[14] Stephens JE, VanLuchene RD. Integrated assessment of seismic damage in structures. Microcomput Civ Eng 1994;9:119–28.
[15] Molas G, Yamazaki F. Neural networks for quick earthquake damage estimation. Earthq Eng Struct Dyn 1995;24:505–16.
[16] Erkus B. Utilization of artificial neural networks in building damage prediction. Ankara: Middle East Technical University; 1999 [MSc Thesis].
[17] Huang CS, Hung SL, Wen CM, Tu TT. A neural network approach for structural identification and diagnosis of a building from seismic response data. Earthq Eng Struct Dyn 2003;32:187–206.
[18] Lautour OR, Omenzetter P. Prediction of seismic-induced structural damage using artificial neural networks. Eng Struct 2009;31:600–6.
[19] Arslan MH. An evaluation of effective design parameters on earthquake performance of RC buildings using neural networks. Eng Struct 2010;32(7):1888–98.
[20] Rofooei FR, Kaveh A, Farahani FM. Estimating the vulnerability of the concrete moment resisting frame structures using artificial neural networks. Int J Optim Civ Eng 2011;3:433–48.
[21] Caglar N, Garip ZS. Neural network based model for seismic assessment of existing RC buildings. Comput Concr 2013;12(2):1–18.
[22] Šipoš TK, Sigmund V, Hadzima-Nyarko M. Earthquake performance of infilled frames using neural networks and experimental database. Eng Struct 2013;51:113–27.
[23] Vafaei M, Adnan AB, Rahman ABA. Real-time seismic damage detection of concrete shear walls using artificial neural networks. J Earthq Eng 2013;17:137–54.
[24] Arslan MH, Ceylan M, Koyuncu T. Determining earthquake performances of existing reinforced concrete buildings by using ANN. Int J Civ Env Struct Constr Archit Eng 2015;9(8):930–4.
[25] Kramer SL. Geotechnical earthquake engineering. Prentice-Hall; 1996.
[26] Gunturi SKV, Shah HC. Building specific damage estimation. In: Proceedings of 10th world conference on earthquake engineering. Madrid: Rotterdam: Balkema; 1992. p. 6001–6.
[27] Naeim F. The seismic design handbook. 2nd ed. Boston: Kluwer Academic; 2011.
[28] Kaveh A, Iranmanesh A. Comparative study of backpropagation and improved
[30] [31] [32]
[33]
[34] [35] [36]
[37] [38] [39] [40]
[41] [42] [43] [44] [45] [46]
[47] [48] [49]
[50] [51] [52]
[53]
[54] [55]
141
counterpropagation neural nets in structural analysis and optimization. Int J Space Struct 1998;13(4):177–85. Iranmanesh A, Kaveh A. Structural optimization by gradient based-neural networks. Int J Numer Meth Eng 1999;46:297–311. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw 1989;2(5):359–66. Ripley BD. Pattern recognition and neural networks. Cambridge University Press; 1996. Avramidis I, Athanatopoulou A, Morfidis K, Sextos A, Giaralis A. Eurocode-compliant seismic analysis and design of r/c buildings: concepts, commentary and worked examples with flowcharts. Geotechnical, geological and earthquake engineering. New York: Springer; 2016. Kostinakis K, Athanatopoulou A, Morfidis K. Correlation between ground motion intensity measures and seismic damage of 3D R/C buildings. Eng Struct 2015;82:151–67. Yakut A, Yilmaz H. Correlation of deformation demands with ground motion intensity. J Struct Eng 2008;134(12):1818–28. SeismoSoft. SeismoSignal v. 5.1.0; 2014 < http://www.seismosoft.com > . Morfidis K, Kostinakis K. Seismic parameters’ combinations for the optimum prediction of the damage state of R/C buildings using neural networks. Adv Eng Softw 2017;106:1–16. Elenas A, Meskouris K. Correlation study between seismic acceleration parameters and damage indices of structure. Eng Struct 2001;23:698–704. Theodoridis S, Koutroumbas K. Pattern recognition. 4th ed. Elsevier; 2008. Asht S, Dass R. Pattern recognition techniques: a review. Int J Comp Sci Telecommun 2012;3(8):25–9. Masi A, Vona M, Mucciarelli M. Selection of natural and synthetic accelerograms for seismic vulnerability studies on reinforced concrete frames. J Struct Eng 2011;137:367–78. European Strong-Motion Database; 2003 < http://www.isesd.hi.is/ESD_Local/ frameset.htm > . PEER (Pacific Earthquake Engineering Research Centre). Strong motion database; 2003 < http://peer.berkeley.edu/smcat/ > . EC2 (Eurocode 2). 
Design of concrete structures, Part 1–1: general rules and rules for buildings. European Committee for Standardization; 2005. Paulay T, Priestley MJN. Seismic design of reinforced concrete and masonry buildings. New York: John Wiley and Sons; 1992. Otani A. Inelastic analysis of RC frame structures. J Struct Div ASCE 1974;100(7):1433–49. Carr AJ. Ruaumoko – a program for inelastic time-history analysis: program manual. New Zealand: Department of Civil Engineering, University of Canterbury; 2006. Matlab, Neural networks toolbox user guide; 2013. Rafiq MY, Bugmann G, Easterbrook DJ. Neural network design for engineering applications. Comput Struct 2001;79:1541–52. Caruana R, Lawrence S, Giles L. Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping. In: Proceedings of neural information processing systems. Denver (CO, USA); 2000. p. 402–8. Marquardt DW. An algorithm for least squares estimation of non-linear parameters. J Soc Ind Appl Math 1963;11(2):431–41. Moller MF. A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw 1993;6:525–33. Riedmiller M, Braun H. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: Proceedings of IEEE. San Francisco; 1993. p. 586–91. Mushgil HM, Alani HA, George LE. Comparison between resilient and standard back propagation algorithms efficiency in pattern recognition. Int J Sci Eng Res 2015;6(3):773–8. Flood I, Kartam N. Neural networks in civil engineering. I: Principles and understanding. J Comput Civil Eng 1994;8(2):131–48. Shahin MA, Jaksa MB, Maier HR. Recent advances and future challenges for artificial neural systems in geotechnical engineering applications. Adv Artif Neural Syst 2009:308239, doi:http://dx.doi.org/10.1155/2009/308239.