
Engineering Structures 165 (2018) 120–141

Contents lists available at ScienceDirect

Engineering Structures journal homepage: www.elsevier.com/locate/engstruct

Approaches to the rapid seismic damage prediction of r/c buildings using artificial neural networks

Konstantinos Morfidis a,⁎, Konstantinos Kostinakis b

a Earthquake Planning and Protection Organization (EPPO-ITSAK), Terma Dasylliou, 55535 Thessaloniki, Greece
b Department of Civil Engineering, Aristotle University of Thessaloniki, Aristotle University Campus, 54124 Thessaloniki, Greece

ARTICLE INFO

Keywords: Seismic damage prediction; Artificial neural networks; Pattern recognition; R/C buildings; Seismic vulnerability assessment; Seismic response

ABSTRACT

The present paper investigates the ability of Artificial Neural Networks (ANNs) to reliably predict the seismic damage state of r/c buildings. The problem was formulated both as the approximation of an unknown function and as a pattern recognition problem. In both cases, Multilayer Feedforward Perceptron networks were used. To create the ANNs' training data set, 30 r/c buildings with different structural characteristics were selected and subjected to 65 actual ground motions by means of Nonlinear Time History Analyses. These analyses led to the calculation of the buildings' damage indices, expressed in terms of the Maximum Interstorey Drift Ratio. The influence of several configuration parameters of the ANNs on the reliability of the predictions was also investigated. In order to investigate the generalization ability of the trained networks, three scenarios were considered. In these scenarios, the ANNs' seismic damage state predictions were evaluated for buildings subjected to earthquakes, neither of which is included in the training data set. The most significant conclusion of the investigation is that ANNs can reliably approximate the seismic damage state of r/c buildings in real time after an earthquake.

⁎ Corresponding author.
E-mail addresses: [email protected] (K. Morfidis), [email protected] (K. Kostinakis).

https://doi.org/10.1016/j.engstruct.2018.03.028
Received 16 November 2017; Received in revised form 9 March 2018; Accepted 12 March 2018
0141-0296/ © 2018 Elsevier Ltd. All rights reserved.

1. Introduction

The seismic vulnerability assessment of existing reinforced concrete (r/c) buildings is one of the most significant problems of earthquake engineering and, for this reason, the subject of continuous research globally. This extended research has produced methods for the assessment of the seismic vulnerability of existing buildings, as well as for the estimation of their seismic damage state due to future earthquakes. The available methods can be classified into two general categories: (a) methods that can estimate the seismic performance of individual buildings and (b) methods that can rapidly assess the seismic vulnerability of groups of buildings with common structural characteristics.

The methods of the first category concern linear and nonlinear analytical procedures suitable for individual buildings for which preliminary investigations confirm that a detailed evaluation of their seismic vulnerability, and/or their pre-seismic strengthening or post-seismic retrofitting, is required. Due to their inherent complexity, these methods are time consuming but absolutely necessary for buildings considered to be seismically vulnerable (e.g. buildings which have suffered seismic damage or old buildings designed without the provisions of seismic codes) or for buildings considered to be important (e.g. schools, hospitals, fire stations, etc.). The methods of the first category (which are mainly based on the Finite Element Method (FEM)) have been adopted and described in the modern seismic codes (e.g. [1,2]).

However, the fact that a large percentage of the existing r/c buildings located in high seismicity regions are old and/or have been designed on the basis of old and inadequate seismic codes (or without the provisions of any seismic code) led to the development of methods of the second category. These methods are based on procedures that can accomplish a rapid and approximate assessment of the seismic vulnerability of large groups of buildings with common structural characteristics (e.g. the seismic vulnerability curves, the damage probability matrices and procedures of rapid visual screening of structures: e.g. [3–6]). Thus, they can be used as decision-making tools, either in pre-seismic periods in order to help engineers decide whether a more detailed vulnerability assessment of individual r/c buildings (by the use of methods of the first category) is necessary, or immediately after a strong earthquake in order to help the authorities detect the most damaged zones in the stricken area.

The ability of the methods of the second category to extract approximate results about the seismic vulnerability of numerous buildings in a very short time led to research efforts to improve their reliability. In the context of these efforts, in the past 25 years many research studies have been conducted aiming to utilize the capacities of artificial intelligence, such as the Artificial Neural Networks (ANNs). The inherent ability of ANNs to embed and deploy results of problems which have known input data in order to instantly extract predictions for the same type of problems with unknown input data (e.g. [7,8]) led to the idea of utilizing them for the approximation of the seismic damage state of existing buildings in real time after an earthquake. An additional reason which led to this idea is the availability of data on the seismic damage of existing buildings caused by several earthquakes globally, as well as the fact that it is feasible to create such data using well-documented analytical methods such as, for example, the Nonlinear Time History Analysis (NTHA). Moreover, since the prediction of the damage state of buildings is a multi-parametric problem, the use of ANNs for its solution can be considered very promising because their structure is capable of effectively handling such problems. Thus, when using ANNs there is neither a need to consider only one parameter for the quantification of the earthquakes' magnitude (seismic parameter), nor a need to consider a limited number of parameters which describe the seismic response of buildings (structural parameters). ANNs make it possible to consider any number and combination of seismic and structural parameters in the study of the optimum correlation between them and the damage state of buildings, which is defined with the aid of various expressions of damage indices (e.g. [9,10]).

The first attempt to utilize ANNs as computational tools for the solution of civil engineering problems was made by Adeli and Yen [11], who examined their performance in the design procedure of steel beams. Since then, ANNs have been the subject of numerous research studies in the field of civil engineering, on problems such as structural health monitoring, damage identification, model updating, the optimization of structural design, the estimation of the characteristics of soil materials and the response of soil structures. The use of ANNs for the solution of the aforementioned problems led to very interesting and promising results. A very detailed survey of the research studies about the use of ANNs in civil engineering problems can be found in [12,13]. The utilization of ANNs for the estimation of the seismic damage state of buildings was first studied by Stephens and VanLuchene [14], and Molas and Yamazaki [15]. After them, several research studies focused on the utilization of ANNs in the prediction of seismic damage on the basis of analytical or statistical data (e.g. [16–24]).

In the present paper, the results of a study of the ability of ANNs to rapidly and reliably predict the seismic damage state of r/c buildings are presented. This study was performed by taking into consideration two different formulations of the problem. Firstly, the problem was formulated and solved as a problem of approximation of the values of an unknown function (Function Approximation (FA) problem). More specifically, an attempt was made to approximate the relation between the values of the damage index of r/c buildings and parameters which describe the seismic response of structures (structural parameters), as well as parameters which evaluate the impact of seismic motions on structures (seismic parameters). Secondly, the problem was formulated and solved as a Pattern Recognition (PR) problem. More specifically, the ability of ANNs to correctly classify the r/c buildings in seismic damage categories which are defined by specific values of the damage index was investigated. In both cases, Multilayer Feedforward Perceptron (MFP) networks were utilized.

For the training of the networks, a data set which consists of 1950 input and target vectors was created. This data set was configured by means of NTHA. More specifically, 30 3-D r/c buildings were selected and analyzed performing NTHA for 65 pairs of horizontal bidirectional actual ground motions. The selected structural parameters were the total height of buildings, their structural eccentricity and the ratio of base shear received by r/c walls (if they exist) along the two orthogonal construction axes. As seismic parameters, 14 parameters widely used in literature (e.g. [25]) were chosen. The 65 seismic excitations were selected in order to cover a wide range of values of these seismic parameters. The Maximum Interstorey Drift Ratio (MIDR) was utilized as the overall damage index of the selected r/c buildings (e.g. [26,27]). Three training algorithms were used for the training of the networks, namely the Levenberg-Marquardt (LM) algorithm, the Scaled Conjugate Gradient (SCG) algorithm and the Resilient Backpropagation (RP) algorithm.

In both formulations of the problem (FA problem or PR problem), the influence of the parameters which are used for the configuration of networks on the reliability of their predictions was investigated. These parameters are the number of hidden layers, the number of neurons in the hidden layers, as well as the neurons' activation functions. This investigation led to the optimum configured networks on the basis of the optimization of the utilized performance evaluation parameters (the correlation factor R and the Mean Square Error (MSE) in the case of the FA problem, and the percentage of correct classifications of buildings in the seismic damage state categories in the case of the PR problem). The generalization ability of the optimum configured networks (i.e. the ability of the optimum configured ANNs to extract reliable predictions for r/c buildings subjected to earthquakes, both of which are unknown to them) was examined by means of three seismic scenarios. In these scenarios, r/c buildings and/or earthquakes which were not utilized in the creation of the training data set were used.

2. The Artificial Neural Networks (ANN)

It is well-known that ANNs are complex computational structures which are able to solve problems using the general rules of the human brain functions (e.g. memory, training, etc.). Thus, the use of ANNs makes it feasible to approximate the solution of problems such as pattern recognition, classification and function approximation with the aid of computers, utilizing algorithms based on a different philosophy than the conventional ones. Various types of ANNs have been proposed (e.g. Radial Basis Function Networks [8], Counterpropagation Networks [28], Gradient-Based Networks [29]). However, in the present paper, Multilayer Feedforward Perceptron (MFP) networks were utilized.

Fig. 1(a) presents the model of a typical artificial neuron. The neuron receives the input signals (x1, x2, … , xm) through its synapses (connecting links) and transforms them into an output signal (yk) in two steps. First, an adder sums the products of the input signals and the respective synaptic weights (wk1, wk2, … , wkm) of the neuron's synapses, together with the bias. Then, an activation function takes the result uk of the adder as its argument and transforms it into the output signal yk. More details are available in specialized references (e.g. [7,8]). The function of ANNs is based on the combined action of interconnected artificial neurons (Fig. 1(b)).

Because the function of ANNs is based on the general rules of the human brain functions, such as memory and training, training is the necessary procedure for the successful solution of problems by them. The training of an ANN consists of finding the values of the synaptic weights of the neurons (vector w) which produce the minimum output error. This is achieved through the use of training algorithms (e.g. [8]). These algorithms require a set of n input vectors x and the corresponding n output vectors d, which are called target vectors. The n pairs of vectors x and d constitute the training data set. A trained ANN includes the optimum vector of synaptic weights, which incorporates the "knowledge" acquired from the training data set. Thus, a trained ANN is capable of extracting predictions about the solution of problems with input data that are not included in the training data set (generalization ability). The generalization ability can be constantly improved through the re-training of ANNs (i.e. the re-calculation of the values of the synaptic weights) using wider training data sets.
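The neuron model of Fig. 1(a) can be sketched in a few lines of code. The following is a minimal illustration and not the authors' implementation: the function names `neuron_output` and `layer_output` and all numerical values are assumptions made for the example, while `tansig` and `logsig` follow the activation function names used later in the paper. The adder forms uk as the weighted sum of the inputs plus the bias, and the activation function maps uk to yk.

```python
import math


def tansig(u):
    """Hyperbolic tangent activation; output lies in (-1, 1)."""
    return math.tanh(u)


def logsig(u):
    """Logistic sigmoid activation; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron_output(x, w, b, activation=tansig):
    """Output y_k of a single artificial neuron.

    x: input signals (x1..xm), w: synaptic weights (wk1..wkm), b: bias.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    # Adder: weighted sum of the inputs plus the bias gives u_k.
    u = sum(wj * xj for wj, xj in zip(w, x)) + b
    # Activation function: transforms u_k into the output signal y_k.
    return activation(u)


def layer_output(x, weights, biases, activation=tansig):
    """Outputs of one fully connected layer (one neuron per row of weights)."""
    return [neuron_output(x, w, b, activation) for w, b in zip(weights, biases)]


# Hypothetical inputs, weights and bias, chosen so that u_k is (nearly) zero.
x = [0.5, -1.0, 2.0]
w = [0.2, 0.4, 0.1]
y_log = neuron_output(x, w, b=0.1, activation=logsig)
y_tan = neuron_output(x, w, b=0.1, activation=tansig)
```

A feedforward network is then a chain of `layer_output` calls, each layer taking the previous layer's outputs as its inputs; training adjusts the entries of `weights` and `biases`.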


Fig. 1. The model of the artiﬁcial neuron (a), and the typical form of a MFP (b).

3. Modeling of the problem using ANNs

The current section presents the procedure which was used to formulate the problem of the prediction of the seismic damage state of r/c buildings in terms compatible to the structure of ANNs. Thus, the choices which were made in the present paper for the configuration parameters of ANNs, in the case of the FA formulation as well as in the case of the PR formulation, are presented. Fig. 2 briefly presents the steps of the two versions of the formulation of the problem, as well as the choices for the basic parameters. These steps are described in detail in the following subsections.

Fig. 2. Procedure for the formulation of the problem in terms compatible to the structure of ANNs.

3.1. Selection of the type of ANNs

In the present study, as was mentioned in Section 2, MFP networks were utilized. In networks of this type, the neurons in any layer (input, hidden, output) are connected to all neurons in the adjacent layer (Fig. 1(b)). This choice was made because it has been proved that this type of network is able to successfully approximate the solution of the FA problem (e.g. [30]), as well as the solution of the PR problem (e.g. [31]). This choice was also based on the fact that this type of ANN, as mentioned in the introduction, was successfully used in many published investigations which are related to the scientific field of the present study (e.g. [16,18,19,21,23]).

3.2. Selection of input parameters

ANNs are computational structures which are capable of approximating the solution of multi-parametric problems. This feature gives the flexibility to select the number of the parameters (input parameters) through which a problem can be formulated. The parameters which describe the problem of the seismic damage prediction of r/c buildings are grouped in two general categories: the structural parameters and the ground motion parameters (seismic parameters).

The structural parameters are used for the description of the seismic response (performance) of buildings. The most significant ones are the total height, the configuration (in plan and in elevation), the structural system (e.g. frame, wall or dual system), the structural eccentricity, the concrete and the reinforcing steel grade, the dimensions and the reinforcement of structural members, the foundation system and the soil category. In the present study, 4 structural parameters were selected. These parameters are widely utilized in well-known methods of the vulnerability assessment of existing r/c buildings (e.g. [3,5]) and they have also been identified by the modern seismic codes as parameters which have a significant effect on the seismic response of r/c buildings (e.g. [2]). These parameters are the total height of the building Htot (= 3.2 · nst, where nst is the number of storeys), the structural eccentricity e0 (the distance between the mass center and the stiffness center of storeys), and the ratios of the base shear that is received by r/c walls (if they exist) along two perpendicular directions (axes x and y): nvx and nvy (for the calculation procedures of these parameters see e.g. [32]).

The seismic parameters are used for the evaluation of the impact of seismic motions on structures. The inherent difficulty of a reliable description of this impact led to the introduction of several expressions for the seismic parameters (e.g. [25]). This raises the question of which seismic parameter is best related to the seismic response of structures (e.g. [33,34]). Nevertheless, the capability of ANNs to model multi-parametric problems taking into consideration many input parameters makes it possible to overcome this problem. Thus, for the investigation conducted in the present study, the 14 seismic parameters given in Table 1 have been chosen in order to evaluate the effect of earthquakes on the structural damage state. The impact of these 14 seismic parameters on the accuracy of the predictions of ANNs was investigated by the authors [36]. This investigation regarded only the FA problem and led to the conclusion that, if more than 6 seismic parameters are used as input parameters, the accuracy of the predictions of the MFP networks is especially high. Considering that there is no such investigation for the case of the PR problem, it was decided in the present paper to use all 14 seismic parameters as input parameters for the networks. Thus, in the present study, 18 input parameters (4 structural and 14 seismic) were utilized. Therefore, the input vectors x (18 × 1) of the ANNs have the general form given by Eq. (1):

$$
\begin{aligned}
\mathbf{x} &= [\,\mathbf{x}_{seism} \;|\; \mathbf{x}_{struct}\,]^{T} \\
\mathbf{x}_{seism} &= [\,PGA \;|\; PGV \;|\; PGD \;|\; I_a \;|\; SED \;|\; CAV \;|\; ASI \;|\; HI \;|\; EPA \;|\; PGV/PGA \;|\; PP \;|\; T_{UD} \;|\; T_{BD} \;|\; T_{SD}\,]^{T} \\
\mathbf{x}_{struct} &= [\,H_{tot} \;|\; e_0 \;|\; n_{vx} \;|\; n_{vy}\,]^{T}
\end{aligned}
\tag{1}
$$

3.3. Selection of the problem's formulation – output parameters

The result of the solution of the problem which is examined in the present paper is the estimation of the seismic damage state of r/c buildings. This estimation is generally accomplished through the use of seismic damage indices (e.g. [9]). These indices quantify the seismic damage state, which is essential in order to formulate the problem of seismic vulnerability assessment. However, the selection of a seismic damage index which can adequately capture the overall seismic damage state of buildings is very difficult, since it depends on a large number of factors. Besides, in the past decades, many expressions for the seismic damage index have been proposed. These indices are classified into categories, based on whether they are deterministic or probabilistic, local or global, structural or financial (e.g. [10]).

In the present study, the seismic damage index was expressed by means of the Maximum Interstorey Drift Ratio (MIDR). The MIDR is a global, structural and deterministic seismic damage index which is generally considered a reliable indicator of global structural and nonstructural damage of r/c buildings (e.g. [27]), and has been used in many investigations for the assessment of the inelastic response of structures (e.g. [37]). It corresponds to the maximum drift among the perimeter frames over all storeys.

According to Fig. 2, the next step of the procedure after the selection of the damage index is the choice of the formulation of the problem. This choice is necessary in order to define the shape of the output vectors. As regards the FA problem, the formulation is based on the approximation of the values of an unknown function f(x) for which a set of known pairs (x, f(x)) is available. More specifically, if x_i ∈ R^m (i = 1, … , N) are N vectors x = [x1, … , xm]^T and d_i ∈ R^1 are N real numbers, the solution of the FA problem leads to the approximation of a function f(x): R^m → R^1 which approximately fulfills the conditions f(x_i) = d_i for the N pairs (x_i, d_i). Obviously, there is an analogy between this formulation and the basic principles of ANNs which were presented in Section 2. For this reason, the use of ANNs (e.g. the MFPs) is one of the effective methods for the solution of the FA problem [30]. The accomplished solution is generally approximate and its reliability is evaluated by the measurement of the error e (i.e. the vector which contains the differences between the ANN's outputs o_i and the target values d_i). Therefore, in the case of the FA formulation, the output of the ANNs must be a real number, i.e. in the present study the value of the seismic damage index MIDR (o = f(x) = MIDR).

Pattern recognition is defined as the procedure of the detection and identification/classification of objects in certain categories (patterns). Among the methods which are used to approach the solution of the PR problem are methods which are based on ANNs [31,38,39]. In the present study, the investigated problem was formulated and solved using the supervised pattern recognition method. More particularly, the classes (damage states) of the objects (vectors x, Eq. (1)) were initially defined (Table 2). Additionally, a set of classified objects was available (i.e. a training data set with target vectors compatible with the formulation of the PR problem). The classes into which the input vectors x (Eq. (1)) can be classified were defined on the basis of seismic damage states of r/c buildings. To this end, five damage states were defined using specific limit values of MIDR. These damage states (valid for r/c buildings) are presented in Table 2 [40]. Thus, the corresponding output vectors o, as well as the target vectors d, must have dimensions (5 × 1). In other words, the networks have five outputs in this case. The general form of the output vectors is given (using an example) in Fig. 3. As emerges from this figure, each element of an output vector o (or of a target vector d) represents one of the five classes (damage states) of Table 2 and attains a value equal to 1 if the corresponding MIDR belongs to the interval of values which defines the specific damage state. Otherwise, it attains a value equal to 0.
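To make the two formulations concrete, the sketch below assembles one 18-element input vector in the order of Eq. (1) and derives both kinds of target from a single MIDR value: the scalar target of the FA problem and the 5 × 1 target of the PR problem (Fig. 3), using the MIDR limits of Table 2. It is only an illustration under stated assumptions: the function names, the dictionary-based input and the treatment of a value lying exactly on a class limit (assigned here to the higher class) are not taken from the paper, which does not specify them.

```python
# Order of the entries follows Eq. (1): 14 seismic, then 4 structural parameters.
SEISMIC_KEYS = ["PGA", "PGV", "PGD", "Ia", "SED", "CAV", "ASI", "HI",
                "EPA", "PGV/PGA", "PP", "TUD", "TBD", "TSD"]
STRUCT_KEYS = ["Htot", "e0", "nvx", "nvy"]

# Upper MIDR limits (%) of the first four damage states of Table 2.
MIDR_LIMITS = [0.25, 0.50, 1.00, 1.50]
DAMAGE_STATES = ["Null", "Slight", "Moderate", "Heavy", "Destruction"]


def input_vector(seismic, structural):
    """Build the 18-element input vector x = [x_seism | x_struct]."""
    return ([seismic[k] for k in SEISMIC_KEYS]
            + [structural[k] for k in STRUCT_KEYS])


def damage_class(midr_percent):
    """Index (0..4) of the damage state that contains the given MIDR (%).

    Assumption: a value exactly equal to a limit falls in the higher class.
    """
    for idx, limit in enumerate(MIDR_LIMITS):
        if midr_percent < limit:
            return idx
    return len(MIDR_LIMITS)


def fa_target(midr_percent):
    """FA formulation: the target is the MIDR value itself (a scalar)."""
    return float(midr_percent)


def pr_target(midr_percent):
    """PR formulation: 5-element target with a single 1 at the damage class."""
    d = [0] * len(DAMAGE_STATES)
    d[damage_class(midr_percent)] = 1
    return d


# Hypothetical example: one building/earthquake pair with MIDR = 0.7 %.
seismic = {k: 1.0 for k in SEISMIC_KEYS}            # placeholder values
structural = {"Htot": 16.0, "e0": 1.2, "nvx": 0.6, "nvy": 0.4}
x = input_vector(seismic, structural)
d_fa = fa_target(0.7)
d_pr = pr_target(0.7)   # class "Moderate" (0.50-1.00 %)
```

The same input vector `x` is used in both formulations; only the shape of the target (and therefore the number of output neurons) changes.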

3.4. Training data set generation

According to Fig. 2, the next step of the procedure for the formulation of the problem is the generation of the training data set, i.e. the generation of a set which consists of input and target vectors for the training of ANNs. The steps of this procedure (Fig. 4) are described in detail in the current section.

Table 1
The selected seismic (ground motion) parameters [25,35].

No.  Parameter                          Symbol
1    Peak Ground Acceleration           PGA
2    Peak Ground Velocity               PGV
3    Peak Ground Displacement           PGD
4    Effective Peak Acceleration        EPA
5    Specific Energy Density            SED
6    Acceleration Spectrum Intensity    ASI
7    Cumulative Absolute Velocity       CAV
8    Housner Intensity                  HI
9    Arias Intensity                    Ia
10   Vmax/Amax                          PGV/PGA
11   Predominant Period                 PP
12   Uniform Duration                   UD
13   Bracketed Duration                 BD
14   Significant Duration               SD

Table 2
Relation between MIDR and damage state.

MIDR (%)            < 0.25    0.25–0.50    0.50–1.00    1.00–1.50    > 1.50
Degree of damage    Null      Slight       Moderate     Heavy        Destruction


Fig. 3. General form of output vectors o in the case of the formulation of the PR problem.

3.4.1. Selection of 30 r/c buildings

The selected r/c buildings differ in the total height Htot (different number of storeys nst), the structural eccentricity e0 (the distance between the mass center and the stiffness center of storeys), as well as the ratios of the base shear nvx and nvy that are received by r/c walls (if they exist) along two perpendicular structural axes (axes x and y). The values of the above structural parameters for the selected buildings (as well as their design parameters) are presented in Appendix A. All buildings are rectangular in plan (with dimensions Lx × Ly) and regular in elevation (according to the criteria set by EC8 [2]), and were chosen so as to represent typical (and actual) r/c buildings.

3.4.2. Selection of ground motions

A suite of 65 pairs of horizontal earthquake records obtained from the European strong motion database [41] and the PEER database [42] was selected. The main criterion used for the selection of these records was the coverage of a wide range of values for the 14 seismic parameters considered in the present study (see Tables 1 and 3). The seismic parameters for each ground motion were computed as the geometric mean values of the parameters corresponding to the two horizontal components of each earthquake record. The data of the selected earthquake records are given in Appendix B.

Table 3
Ranges of the values of the selected seismic parameters corresponding to the 65 earthquakes.

Ground motion parameter    Units      Minimum value    Maximum value
PGA                        %g         0.004            0.822
PGV                        cm/s       0.86             99.35
PGD                        cm         0.36             60.19
Ia                         m/s        ≈0.0             5.592
SED                        cm2/sec    1.24             16762.8
CAV                        cm/s       14.67            2684.1
ASI                        g·sec      0.003            0.633
HI                         cm         3.94             317.6
EPA                        %g         0.003            0.63
Vmax/Amax (PGV/PGA)        sec        0.036            0.336
PP                         sec        0.077            1.26
TUD                        sec        ≈0.0             17.68
TBD                        sec        ≈0.0             61.87
TSD                        sec        1.74             50.98

3.4.3. Modeling of linear behavior, analysis and design of the selected r/c buildings (see also Appendix A)

The selected r/c buildings were modeled, analyzed and designed utilizing the provisions of EC2 [43] and EC8 [2]. More specifically:

• The elastic modeling was carried out taking into consideration all basic recommendations of EC8.
• The buildings were considered to be fully fixed to the ground.
• The infill walls were considered only as vertical loads and not as seismic resistant structural elements.
• The buildings were designed as Medium Ductility Class (MDC) structures [2].
• The behavior factors q were determined according to the recommendations of EC8 [2].
• The buildings were analyzed using the modal response spectrum method.
• The structural materials were: steel S500B and concrete C20/25.
• For the design of r/c members, the load combinations 1.35G+1.50Q and G+0.3Q ± E were taken into consideration (G is the dead load, Q is the live load, and E is the seismic load expressed by the simultaneous application of the design spectrum of EC8 for seismic zone II and site class C along the axes x and y).

3.4.4. Nonlinear modeling and analysis (NTHA) – Data for the calculation of MIDR

The nonlinear behavior of the r/c buildings was modeled by means of lumped plasticity (concentrated hinge) models at the column and beam ends, as well as at the base of the r/c walls. More specifically, the length of the plastic hinges was determined using Eqs. (2a) and (2b) [44]:

$$ l_p = 0.08\, l_0 + 0.022\, d_p f_y \quad \text{for beams and columns} \tag{2a} $$

$$ l_p = 0.2\, l_w + 0.044\, h_w < 0.8\, l_w \quad \text{for walls} \tag{2b} $$

where lp is the length of the plastic hinge, l0 is the distance of the critical section from the point of contraflexure, dp is the mean diameter of the longitudinal reinforcement, fy is the yield stress of the longitudinal reinforcement, lw is the length of the cross-section of the wall and hw is the total wall height. The material inelasticity of the structural members was modeled with the aid of the Modified Takeda hysteresis rule [45] (Fig. 5a). It should also be noted that the effects of the interaction between axial load and biaxial bending moments at column and wall hinges were taken into account by using the N-M2-M3 interaction diagram, which is implemented in the software adopted for the analyses [46] (Fig. 5b).
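Eqs. (2a) and (2b) are simple enough to check numerically. The sketch below evaluates them under assumptions that the text does not state explicitly: lengths are taken in metres and fy in MPa (the units commonly paired with the 0.022 coefficient), and the inequality in Eq. (2b) is interpreted as an upper bound, so the wall hinge length is capped at 0.8 lw. The function names and the sample member dimensions are hypothetical.

```python
import math


def hinge_length_beam_column(l0, dp, fy):
    """Plastic hinge length of a beam or column, Eq. (2a).

    l0: distance of the critical section from the point of contraflexure (m)
    dp: mean diameter of the longitudinal reinforcement (m)
    fy: yield stress of the longitudinal reinforcement (MPa, assumed unit)
    """
    return 0.08 * l0 + 0.022 * dp * fy


def hinge_length_wall(lw, hw):
    """Plastic hinge length of a wall, Eq. (2b).

    lw: length of the wall cross-section (m), hw: total wall height (m).
    Assumption: the '< 0.8 lw' condition is applied as an upper limit.
    """
    return min(0.2 * lw + 0.044 * hw, 0.8 * lw)


# Hypothetical members: a column with l0 = 2.0 m and 16 mm bars of S500 steel,
# a 3.0 m long wall of 16 m height, and a short 1.0 m wall of 32 m height.
lp_column = hinge_length_beam_column(l0=2.0, dp=0.016, fy=500.0)
lp_wall = hinge_length_wall(lw=3.0, hw=16.0)
lp_wall_capped = hinge_length_wall(lw=1.0, hw=32.0)
```

In the third case the uncapped expression exceeds 0.8 lw, so the limit governs under the interpretation assumed here.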

Fig. 4. Procedure for the design and generation of the training data set.

124

Engineering Structures 165 (2018) 120–141

K. Morﬁdis, K. Kostinakis

Fig. 5. Moment (M) - Rotation (θ) relationship (a) and N-M2-M3 interaction diagram (b).

(b) Number of neurons in the hidden layers: The number of neurons of hidden layers which leads to the optimum performance of ANNs is not uniquely deﬁned. It depends on the nature and on the formulation of the studied problem. It must also be stressed that there is no direct method for its determination. Thus, only the “trial and error” method can be adopted for both cases of the problem’s formulation (Section 4). (c) Activation functions of neurons: In the present study, two diﬀerent types of nonlinear activation functions for the neurons of hidden layers were used (Fig. 1(a)): the hyperbolic tangent function (tansig) and the sigmoid function (logistic-logsig). This choice was made for the FA problem as well as for the PR approach. As regards the neurons of the output layer, the choice of the activation function was diﬀerent for the two problems. In the case of the FA problem, the linear function was selected. This choice was based on the fact that, in this case, the output (MIDR values) attains any real value and not a value between [0, 1] or [−1, 1]. In the case of the PR problem, the nonlinear functions tansig and logsig were selected because the elements of the output vectors attain values 0 or 1 (see Fig. 3). The choice of using two activation functions (instead of using a single one) was made in order to investigate the optimum eﬃciency of the ANNs in the solution of the PR problem (more details are given in Section 4). (d) Performance evaluation parameters: In the case of the solution of the FA problem, the Mean Square Error (MSE), as well as the correlation coeﬃcient (R factor), (e.g. [20]) were adopted. In the case of the solution of the PR problem, the most useful tools for the evaluation of ANNs are the Confusion Matrices - CM (e.g. [38,47]). The general form of a CM (for a three-class problem) is presented in Fig. 6. 
On the basis of CMs, three types of metrics for the prediction accuracy of ANNs are deﬁned, namely the “Recall” index, the “Precision” index and the “Overall Accuracy” index (Fig. 6). In the present study, the “Overall Accuracy” or (OA) index was mainly used. However, for the evaluation of the several conﬁgurations of the examined ANNs, the corresponding whole CMs are also presented and evaluated (Section 4). As emerges from Fig. 6, the elements of CMs which are located in the main diagonal (i.e. the elements CFii) deﬁne the number of input vectors which are classiﬁed by the network to correct classes. Furthermore, valuable information about the quality of the predictions of ANNs is also given by the conﬁguration of CMs. More speciﬁcally, when the vast majority of the non-zero elements of a CM are located about the main diagonal, this means that the ANN achieves an acceptable classiﬁcation. For example, if all the non-zero elements are located in the cells of the main diagonal and in the adjacent cells, this means that the objects are classiﬁed into correct classes and into classes which are adjacent to them (i.e. if

After the linear and the nonlinear modeling, the 30 selected r/c buildings were analyzed by NTHA for each one of the 65 earthquake ground motion pairs. The design vertical (gravity) loads were also taken into consideration in these analyses. Thus, a total of 1950 NTHA (30 buildings × 65 ground motion records) were performed. For each one of the 1950 analyses, the required data for the MIDR calculation were exported. 3.4.5. Post-processing of the results of the NTHA – Calculation of the MIDR The last step of the procedure for the training data set generation (Fig. 4) concerns the post-processing of the results of NTHA in order to calculate the MIDR values of the analyzed r/c buildings. To this end, a computer code in Visual Basic was developed. Thus, following the described procedure above, 1950 training vectors x, which are given in Eq. (1), were created. Also, the corresponding 1950 target vectors d were formed. The shape of these vectors depends on the formulation of problem. In particular: (i) In the case of the FA problem, the target vectors d are in fact scalar values (the values of MIDR). (ii) In the case of the PR problem, the target vectors d have the form which is presented in Fig. 3. 3.5. Selection of training algorithms – Conﬁguration of ANNs The last two steps of the procedure for the formulation of the investigated problem (Fig. 2) are the selection of training algorithms and the conﬁguration of the utilized networks. More speciﬁcally, these steps concern the choice of the parameters which are required for the conﬁguration of the used ANNs. In particular, these parameters are: (a) the number of hidden layers; (b) the number of neurons in each hidden layer; (c) the activation functions of neurons; (d) the performance evaluation parameters; (e) the normalization functions of the input and output values and (f) the method for partitioning the data set in training, validation and testing subsets. 
In addition to the above parameters, a “parameter” of great significance (which also influences the performance of ANNs) is the selected training algorithm.

(a) Number of hidden layers: For the FA problem, networks with a single hidden layer were chosen. This choice was based on the proof [30] that feedforward perceptron networks with a single hidden layer are able to precisely approximate functions f(x): R^m → R^1, as well as on the fact that their efficiency has been well documented in numerous relevant investigations (e.g. [18,20,22]). For the PR problem, networks with one or two hidden layers were chosen in order to study the influence of the second hidden layer on the percentage of correct classifications of input vectors x in the damage state categories of Table 2.
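A network of the kind chosen for the FA problem (one hidden layer with a tansig or logsig activation and a single linear output neuron) can be sketched as follows. This is only an illustrative forward pass, not the authors' Matlab implementation: the input size `m`, the random weights and the sample input `x` are hypothetical, and no training is performed.

```python
import math
import random


def logsig(z):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tansig(z):
    """Hyperbolic tangent sigmoid, output in (-1, 1)."""
    return math.tanh(z)


def forward_n1(x, w_hidden, b_hidden, w_out, b_out, hidden_act=logsig):
    """Forward pass: one nonlinear hidden layer, one linear output neuron."""
    hidden = [
        hidden_act(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear output, so the prediction is not confined to [0, 1] or [-1, 1].
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical sizes: m = 4 inputs, 18 hidden neurons; weights are random, not trained.
rng = random.Random(0)
m, n_hidden = 4, 18
w_hidden = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = rng.uniform(-1, 1)

x = [0.2, -0.5, 0.9, 0.0]  # hypothetical normalized input vector
y = forward_n1(x, w_hidden, b_hidden, w_out, b_out)
```

Passing `hidden_act=tansig` switches the hidden layer to the other activation function examined in the study.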


Fig. 6. General form of a CM and related metrics for a three-class problem.

the Resilient Backpropagation algorithm (“RP” algorithm, [52]). For the FA problem, the ANNs were trained with the LM and SCG algorithms whereas, for the PR problem, the RP and SCG algorithms were utilized. The LM algorithm is a “Quasi-Newton” algorithm (a variation of Newton's method) and belongs to the general category of Back-Propagation algorithms. It is fast and efficient, and it is suggested for the solution of FA problems. The SCG algorithm belongs to the class of Conjugate Gradient algorithms. Significant features of this algorithm are its speed and its ability to handle large-scale problems effectively. The RP algorithm is a variation of the methods which are based on a variable learning rate, and it is recommended for the quick and reliable solution of pattern recognition problems (e.g. [53]).

the correct class for an object is the class i, the ANN classiﬁes this object in the class i − 1 or in class i + 1). Therefore, all predictions/ classiﬁcations about the expected damage states which are made by the ANN in this case are correct or close to the correct ones. (e) Normalization functions for the elements of the input and output vectors: The utilization of functions which normalize the values of the elements of input vectors x before these vectors are introduced to ANNs is considered necessary in order to optimize the training (e.g. [47,48]). The same transformation is also required for the elements of the target vectors d. The network generates output vectors in which the reverse transformation is required in order to attain their ﬁnal values. A function, through which the elements of input and target vectors of the data set attain values in the range [−1, 1], was selected in the present study [47]. (f) Method for partitioning the data set: The partition of the data-set in three sub-sets, namely the training, the validation and the testing sub-set is recommended in order to ensure good generalization of networks and to avoid the overﬁtting (e.g. [47,49]). In the present study, the partition of the data set in training, validation and testing sub-sets was done using the ratio 70%/15%/15% respectively. The training and target vectors of the three sub-sets were chosen randomly. It is important to notice that the validation data set is used by the training algorithm internally for the check of the criteria of the training termination, which regard the avoidance of the overﬁtting [47]. Thus, the results arising from the use of the validation data set do not provide information that can lead to certain conclusions about the performance of ANNs. Therefore, in the following tables of the present research work, the results regarding the validation data set are omitted.
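The normalization to [−1, 1] and the random 70%/15%/15% partition described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the sample MIDR values and the rounding of the subset sizes are hypothetical, and the Matlab toolbox used in the study may handle these details differently.

```python
import random


def fit_minmax(values, lo=-1.0, hi=1.0):
    """Return (forward, inverse) maps between the data range and [lo, hi].

    Assumes max(values) > min(values); a constant series is not handled.
    """
    vmin, vmax = min(values), max(values)

    def forward(v):
        return lo + (hi - lo) * (v - vmin) / (vmax - vmin)

    def inverse(s):
        return vmin + (s - lo) * (vmax - vmin) / (hi - lo)

    return forward, inverse


def split_indices(n, ratios=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition range(n) into training, validation and testing indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


# Hypothetical target values (MIDR in %), normalized before training.
midr_values = [0.12, 0.45, 0.80, 1.60, 2.30]
to_scaled, to_original = fit_minmax(midr_values)
scaled = [to_scaled(v) for v in midr_values]

# The reverse transformation recovers the final values from network outputs.
restored = [to_original(s) for s in scaled]

# Partition of the 1950 samples of the data set.
train_idx, val_idx, test_idx = split_indices(1950)
```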

4. Training of the selected ANNs – Parametric investigation of the optimum conﬁgurations In this section, details for the procedure of trainings, as well as the results of the parametric analyses which were conducted for the investigation of the seismic damage predictions of the optimum ANNs, are exhibited. The aim of the parametric analyses was the investigation of the inﬂuence of: (a) the type of activation functions of the neurons; (b) the number of hidden layers (only in the case of the solution of the PR problem); (c) the number of neurons in hidden layers and (d) the training algorithms on the performance of ANNs. In Figs. 7 and 9, the procedures of parametric investigations for both cases of the problem formulation are brieﬂy illustrated. It must be noted that for the conﬁguration and the training of ANNs, the neural network tool box in Matlab [47] was used. 4.1. Parametric investigation in the case of the FA problem

As regards the training algorithms, three diﬀerent algorithms were adopted: the Levenberg-Marquardt algorithm (“LM” algorithm, [50]), the Scaled Conjugate Gradient algorithm (“SCG” algorithm, [51]) and

In the case of the FA problem solution, networks with one hidden layer (henceforth “N1” networks) were utilized. As emerges from Fig. 7,

Fig. 7. Procedure of the parametric investigation for the optimum performance of ANNs in the solution of the FA problem.


the procedure of the parametric investigation was separated into two parts on the basis of the utilized training algorithm (LM or SCG). In each of these parts, two classes of networks were configured. The networks of the first class have tansig activation functions for the neurons of the hidden layer, whereas the networks of the second class have logsig functions. For each of the two network classes, 51 networks were configured, which differ in the number of neurons in the hidden layer (between 10 and 60). Thus, 102 (= 2 × 51) different networks were created in each of the two parts of the parametric investigation. Each of these networks was trained 75 times (i.e. 102 × 75 = 7650 training procedures were performed in each part). This repetition is needed because the performance of ANNs varies with the initial values of the synaptic weights (e.g. [18]) and with the random composition of the three sub-sets of the training data set [47]. From the 75 training procedures of each of the 102 configured networks, the optimum ones were detected, i.e. the trainings which yielded the optimum values of the utilized performance parameters (MSE or R-factor) on the basis of the testing sub-set, the training sub-set and the total data set. Thus, from the 7650 training procedures of each part, 612 (= 6 × 102) optimum trained networks emerged, one for each of the following six criteria (i.e. 306 optimum trained networks for each of the two network classes):

Table 4a Optimum values of the performance parameters of the FA problem. Columns: training algorithm/activation function of the hidden layer’s neurons.

| Performance criterion | Data set | LM/logsig | LM/tansig | SCG/logsig | SCG/tansig |
|---|---|---|---|---|---|
| min(MSE) | Testing subset | 0.045 | 0.052 | 0.078 | 0.071 |
| min(MSE) | Training subset | 0.010 | 0.010 | 0.077 | 0.065 |
| min(MSE) | Total data set | 0.034 | 0.038 | 0.095 | 0.082 |
| max(R) | Testing subset | 0.975 | 0.972 | 0.958 | 0.958 |
| max(R) | Training subset | 0.995 | 0.995 | 0.958 | 0.967 |
| max(R) | Total data set | 0.983 | 0.981 | 0.951 | 0.958 |

Table 4b Optimum number of the neurons of the hidden layer (which leads to the optimum values of MSE and R-factor). Columns: training algorithm/activation function of the hidden layer’s neurons.

| Performance criterion | Data set | LM/logsig | LM/tansig | SCG/logsig | SCG/tansig |
|---|---|---|---|---|---|
| min(MSE) | Testing subset | 18 | 16 | 28 | 34 |
| min(MSE) | Training subset | 60 | 54 | 14 | 54 |
| min(MSE) | Total data set | 32 | 30 | 14 | 46 |
| max(R) | Testing subset | 18 | 54 | 16 | 52 |
| max(R) | Training subset | 60 | 54 | 14 | 54 |
| max(R) | Total data set | 32 | 54 | 14 | 46 |

Then, from the 306 optimum trained networks of each class, the networks which yielded the best predictions on the basis of the aforementioned six criteria (i.e. minimum values for the MSE and maximum values for the R-factor) were extracted. More specifically, the optimum number of neurons in the hidden layer, which leads to the best predictions for MIDR values using each one of the six adopted criteria, was detected. Thus, the 12 best trained networks from each one of the two parts of the investigation were extracted (i.e. 12 optimum configured networks trained using the LM algorithm and 12 optimum configured networks trained using the SCG algorithm). The results of the parametric investigation described above are summarized in Tables 4a and 4b. The basic conclusion which arises from Table 4a is that the training algorithm LM is more efficient than the algorithm SCG for any of the utilized performance criteria. Furthermore, in the case of the utilization of the LM algorithm, the networks in which the logsig function is used extract slightly better results than the results of the networks in which the tansig function is adopted. As regards the optimum number of neurons in the hidden layer (Table 4b), it generally depends on the utilized performance criterion (i.e. the performance parameter and the part of the data set for

which this parameter is calculated). This conclusion does not apply to the most efficient network (i.e. the network which uses the logsig function and is trained with the LM algorithm). In this case, the optimum number of neurons does not depend on the adopted performance parameter (MSE or R) but only on the part of the data set for which this parameter is calculated. Thus, the optimum number of neurons in the hidden layer is 18 when the performance parameter is calculated for the samples of the testing sub-set, 60 when the training sub-set is used, and 32 when the calculation is based on the total data set.

Fig. 8 illustrates the predictive ability of the optimum networks when the criterion of min(MSE) for the testing sub-set is adopted. More specifically, the diagrams of this figure concern the four optimum networks (i.e. the networks with the optimum number of neurons in the hidden layer) which correspond to the four examined combinations of training algorithms and activation functions of the neurons of the hidden layer (Tables 4a and 4b). In these diagrams, the MIDR values calculated using NTHA (MIDR_NTHA) are plotted against the MIDR values predicted by the optimum networks (MIDR_ANN) for all samples of the total data set. The main conclusion that can be drawn from Fig. 8 is that the network which has 18 neurons with the logsig activation function in the hidden layer and was trained using the LM algorithm (henceforth “N1-LM-log/lin-18” network) yields the best predictions of the expected MIDR values (Fig. 8a). In particular, its MIDR_ANN values are the most strongly correlated with the corresponding MIDR_NTHA values (R = 0.9745). Another significant conclusion from Fig. 8 is that the correlation between the MIDR_NTHA and the MIDR_ANN values is better in the range MIDR = 0–1.5%. In this range, with a few exceptions, all points of the data set lie very close to the diagonal reference line (i.e. the line on which MIDR_NTHA = MIDR_ANN). For higher damage levels (MIDR > 1.5%), i.e. for damage levels which correspond to “Destruction” according to Table 2, the degree of scatter increases in all diagrams of Fig. 8. Therefore, the predictive ability of the optimum networks decreases for MIDR values larger than 1.5%. Nevertheless, this weakness is not significant since, for MIDR values larger than 1.5%, the buildings suffer heavy (and practically non-repairable) damage. Thus, the precision of the predicted MIDR values in these cases is not critical. By contrast, the reliable prediction of the order of magnitude of MIDR values is more significant. As emerges from Fig. 8, this requirement is met.

According to the conclusions presented above, the most efficient of the examined networks on the basis of the testing sub-set is the “N1-LM-log/lin-18” network (MSE = 0.045 and R = 0.975, Table 4a). The corresponding optimum network according to the total data set is the

• min(MSE) for: (a) the testing sub-set, (b) the training sub-set and (c) the total data set,
• max(R) for: (a) the testing sub-set, (b) the training sub-set and (c) the total data set.
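The counts quoted for the parametric investigation of the FA problem can be checked with a short enumeration. The sketch below only reproduces the bookkeeping (it trains nothing); the variable names are illustrative, and the neuron range 10 to 60 inclusive is taken from the text.

```python
from itertools import product

hidden_activations = ["tansig", "logsig"]  # the two network classes
neuron_counts = range(10, 61)              # 10..60 inclusive, i.e. 51 values
repetitions = 75                           # trainings per configured network
criteria = [
    (metric, subset)
    for metric in ("min(MSE)", "max(R)")
    for subset in ("testing", "training", "total")
]

# One part of the investigation corresponds to one training algorithm (LM or SCG).
configs_per_part = list(product(hidden_activations, neuron_counts))
trainings_per_part = len(configs_per_part) * repetitions
optimum_per_part = len(configs_per_part) * len(criteria)
optimum_per_class = len(neuron_counts) * len(criteria)
best_per_part = len(hidden_activations) * len(criteria)

print(len(configs_per_part), trainings_per_part, optimum_per_part,
      optimum_per_class, best_per_part)
```

The five printed numbers correspond to the 102 networks, 7650 trainings, 612 and 306 optimum trained networks, and 12 best networks per part mentioned in the text.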


Fig. 8. Comparison of damage predicted by NTHA and the best trained ANNs.
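The agreement between MIDR_NTHA and MIDR_ANN shown in Fig. 8 is quantified in the text by the MSE and the R factor. A minimal sketch of both measures is given below, assuming that R denotes the Pearson correlation coefficient between targets and predictions; the sample values are hypothetical and are not taken from the paper.

```python
import math


def mse(targets, predictions):
    """Mean square error between target and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def pearson_r(targets, predictions):
    """Pearson correlation coefficient between targets and predictions."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical MIDR values in %, for illustration only.
midr_ntha = [0.20, 0.50, 0.90, 1.40, 2.10]
midr_ann = [0.25, 0.45, 1.00, 1.30, 1.90]

print("MSE =", mse(midr_ntha, midr_ann))
print("R   =", pearson_r(midr_ntha, midr_ann))
```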

diﬀerent numbers of neurons in the hidden layer) ΑΝΝs with one hidden layer and 20808(=8 combinations of activation functions × 51 diﬀerent numbers of neurons in the 1st hidden layer × 51 diﬀerent numbers of neurons in the 2nd hidden layer) ANNs with two hidden layers were conﬁgured. In Table 5, the results of the parametric investigation of the performance of networks with one hidden layer are presented. The main conclusion that can be drawn from Table 5 is that the activation function of the neurons of the output layer is the most important conﬁguration parameter of networks with one hidden layer. More speciﬁcally, when the function tansig is adopted for the neurons of the output layer, the value of the OA index ﬂuctuates between 81.9% and 88.9%, regardless of the choices for the activation function of the neurons of the hidden layer and for the training algorithm (the inﬂuence of these parameters on the OA index value is signiﬁcantly lesser). For example, the combined use of activation functions (tansig/ logsig) extracts a value for the OA index that is equal to 61.8% as regards the classiﬁcations of all samples of the data set when the network is trained using the RP algorithm. However, the corresponding value of the OA index in the case of the utilization of SCG is 59.7%. Respectively, when the combination (tansig/tansig) is adopted, the extracted value of the OA index is 84.5%, irrespective of the training algorithm utilized. A similar conclusion is drawn when the results which are based on the samples of testing and the training sub-sets are evaluated. Another signiﬁcant conclusion which is extracted from the study of Table 5 is the great importance of the optimum number of neurons of the hidden layer. This number depends on the other conﬁguration parameters of the examined networks, as well as on the set (testing, training, and total) for which the OA index is calculated. 
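The numbers of configured networks for the PR problem (204 with one hidden layer and 20808 with two) follow from the combinations listed above. The enumeration below is an illustrative check only; it assumes that the 51 neuron counts are the integers 10 to 60, as in the FA problem, and that the counts refer to one training algorithm.

```python
from itertools import product

activations = ("tansig", "logsig")
neuron_counts = range(10, 61)  # assumed 10..60 inclusive, i.e. 51 values

# One hidden layer: (hidden activation, output activation, neurons).
n1_configs = list(product(activations, activations, neuron_counts))

# Two hidden layers: (1st hidden, 2nd hidden, output activation, neurons 1st, neurons 2nd).
n2_configs = list(
    product(activations, activations, activations, neuron_counts, neuron_counts)
)

print(len(n1_configs), len(n2_configs))
```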
As emerges from Table 5, the relation of the optimum number of neurons of the hidden layer with the other conﬁguration parameters is not based on a certain function. It is obvious that the optimum number of neurons in the hidden layer is altered randomly. This conclusion substantiates the nonexistence of a direct method for its calculation, and the recourse to the “trial and error” procedure. Tables 6a and 6b illustrate the results of the parametric investigation of the performance of ANNs with two hidden layers. The main conclusion which can be extracted from the combined study of these

“N1-LM-log/lin-32” network (MSE = 0.034 and R = 0.983, Table 4a). The “N1-LM-log/lin-18” network is mainly utilized for the assessment of the ability of the trained networks to reliably predict the seismic damage level of r/c buildings from data unknown to them (generalization ability), which is presented in Section 5. This choice was based on the fact that this sub-set is used in order to control the generalization ability of networks during their training and not for the optimization of the values of the synaptic weights [47]. Therefore, the “N1-LM-log/lin-18” network is considered more efficient for predictions from unseen input data (i.e. seismic damage predictions for future earthquakes). Nevertheless, the generalization ability of the “N1-LM-log/lin-32” network is also examined in Section 5 for research purposes.

4.2. Parametric investigation in the case of the PR problem

In the case of the PR problem, networks with one (“N1” networks) and two hidden layers (henceforth “N2” networks) were utilized. Just as in the case of the FA problem, the procedure of the parametric investigation was separated into two parts on the basis of the utilized training algorithm (RP or SCG). This procedure is briefly illustrated in Fig. 9. As emerges from this figure, the procedure is basically similar to the one conducted for the FA problem. The main difference is that, for the PR problem, ANNs with one and two hidden layers were used. Thus, the parametric analyses were performed in two stages for each of the two parts of the procedure: in stage one, the problem was solved with ANNs with one hidden layer whereas, in stage two, networks with two hidden layers were used. As regards the criterion for the performance assessment of ANNs, the percentage of correct classifications of r/c buildings to the damage classes of Table 2 (i.e. the Overall Accuracy (OA) index, Fig. 6) was adopted. This index was calculated on the basis of the testing sub-set, the training sub-set and the total data set. The performance of the examined networks was also assessed using the whole CMs (Fig. 6), which give an overall view of the predictive abilities of ANNs. A total of 204 (= 4 combinations of activation functions × 51


Fig. 9. Procedure of the parametric investigation for the optimum performance of ANNs in the solution of the PR problem.

hidden layer”, henceforth “N1-RP-tan/tan-56” network) yields an OA index value equal to 84.5% (Table 5). The corresponding most efficient network with two hidden layers (combination “RP algorithm – activation functions in 1st/2nd hidden layers and output layer: tansig/tansig/tansig – 60/48 neurons in 1st/2nd hidden layers”, henceforth “N2-RP-tan/tan/tan-60/48” network) yields an OA index value equal to 89.9% (Table 6a). This increase of the OA index value is not negligible, but it cannot be considered significant when weighed against the corresponding increase in the number of neurons (the “N2-RP-tan/tan/tan-60/48” network has 108 neurons in its hidden layers whereas the “N1-RP-tan/tan-56” network has 56). In Fig. 10, the CMs of the most efficient networks with one hidden layer according to the testing sub-set and to the total data set (Table 5) are presented. The study of these CMs leads to the conclusion that, besides the significant percentage of correct classifications (i.e. the high values of the OA index given in Table 5), the vast majority of wrong classifications are classifications in classes adjacent to the correct ones. For example, in the CM of Fig. 10a, which corresponds to the most efficient network according to the testing sub-set (“N1-

tables is that, as in the case of networks with one hidden layer, the activation function of the neurons of the output layer is the most important conﬁguration parameter. In particular, the utilization of the tansig function in the neurons of the output layer leads to values of OA index between 84% and 90% (the corresponding range of OA index values in the case of the utilization of logsig function is 63.7–70%). The other conclusions (regarding the minor importance of the activation function of the neurons of the hidden layer and of the training algorithm in comparison to the importance of the activation function of the neurons of the output layer, as well as the great importance and the non-deterministic calculation of the optimum number of neurons in the hidden layer), which were extracted from the parametric investigation of ANNs with one hidden layer, are also valid in the case of networks with two hidden layers. Finally, the combined study of Tables 5, 6a and 6b leads to the signiﬁcant conclusion which regards the relatively minor improvement of the values of the OA index when a second hidden layer is added. For example, the most eﬃcient network with one hidden layer according to the total data set (combination “RP algorithm – activation functions in the hidden layer and the output layer: tansig/tansig – 56 neurons in the

Table 5 ANNs with one hidden layer – Best values of the OA index and the corresponding number of neurons of hidden layer. Columns: training algorithm, activation function of the neurons of the hidden/output layers.

| Performance criterion | Data set | RP, logsig/logsig | RP, logsig/tansig | RP, tansig/tansig | RP, tansig/logsig | SCG, logsig/logsig | SCG, logsig/tansig | SCG, tansig/tansig | SCG, tansig/logsig |
|---|---|---|---|---|---|---|---|---|---|
| max OA (%) | Testing sub-set | 64.8 | 81.9 | 83.6 | 64.5 | 54.9 | 81.9 | 81.9 | 61.4 |
| max OA (%) | Training sub-set | 61.4 | 87.0 | 88.9 | 62.0 | 56.5 | 85.4 | 87.2 | 59.4 |
| max OA (%) | Total data set | 62.4 | 83.9 | 84.5 | 61.8 | 55.5 | 83.0 | 84.5 | 59.7 |
| Number of neurons in the hidden layer | Testing sub-set | 54 | 48 | 44 | 14 | 46 | 40 | 36 | 52 |
| Number of neurons in the hidden layer | Training sub-set | 34 | 54 | 56 | 26 | 18 | 46 | 56 | 10 |
| Number of neurons in the hidden layer | Total data set | 34 | 54 | 56 | 14 | 18 | 46 | 56 | 52 |


Table 6a ANNs with two hidden layers - Best values of the OA index and the corresponding number of neurons of the hidden layers (RP algorithm). Columns: activation function of the 1st hidden/2nd hidden/output layers’ neurons.

| Performance criterion | Data set | tansig/tansig/logsig | tansig/tansig/tansig | tansig/logsig/logsig | tansig/logsig/tansig | logsig/tansig/logsig | logsig/tansig/tansig | logsig/logsig/logsig | logsig/logsig/tansig |
|---|---|---|---|---|---|---|---|---|---|
| max OA (%) | Testing sub-set | 67.9 | 84.3 | 70.0 | 84.6 | 67.6 | 85.3 | 67.9 | 85.0 |
| max OA (%) | Training sub-set | 64.7 | 94.1 | 69.6 | 93.8 | 64.7 | 92.4 | 68.4 | 92.4 |
| max OA (%) | Total data set | 64.4 | 89.9 | 69.9 | 89.4 | 63.7 | 88.1 | 66.4 | 87.9 |
| Number of neurons in 1st/2nd hidden layer | Testing sub-set | 24/56 | 20/32 | 44/28 | 36/24 | 46/14 | 34/52 | 30/52 | 44/12 |
| Number of neurons in 1st/2nd hidden layer | Training sub-set | 46/12 | 54/58 | 44/28 | 52/58 | 46/50 | 60/56 | 42/50 | 44/54 |
| Number of neurons in 1st/2nd hidden layer | Total data set | 46/12 | 60/48 | 44/28 | 56/40 | 50/54 | 60/56 | 42/50 | 44/54 |

corresponding eﬃciency for correct classiﬁcation to the classes 1 and 5. However, the percentages of the correct classiﬁcations to the classes 2, 3 and 4 can’t be rated as unacceptable. Similar conclusions are also extracted from the study of the other CMs of Fig. 10. Fig. 11 illustrates the CMs of the most eﬃcient networks with two hidden layers according to the testing sub-set and to the total data-set (Tables 6a and 6b). The conclusions which arise from the study of these CMs are generally similar to the corresponding conclusions which result from the study of CMs of ANNs with one hidden layer (Fig. 10). However, it must be noted that the number of classiﬁcations in classes which are not adjacent to the correct ones is the least in the case of networks with two hidden layers. For example, no classiﬁcations in classes which are not adjacent to the correct ones are performed by the “N2-SCG-tan/ tan/tan-10/28” network (Fig. 11c). The corresponding number of classiﬁcations in the case of the “N1-SCG-tan/tan-36” network (with one hidden layer) is 2 (Fig. 10c). It must also be stressed that the effectiveness of networks with two hidden layers to correctly classify the samples to the classes 2, 3 and 4 is greater than the corresponding effectiveness of networks with one hidden layer. More speciﬁcally, the percentages of correct classiﬁcations which are performed by networks with two hidden layers in the classes 2, 3 and 4 are (with few exceptions) greater than 80%. Thus, despite the fact that the overall percentage of correct classiﬁcations (OA index) is not signiﬁcantly increased when networks with two hidden layers are utilized, the general form of the corresponding CMs indicates that the improvement of the general quality of the classiﬁcations is considerable. 
Finally, it must be noted that, just as in the case of the FA problem (Section 4.1), the network which is mainly utilized for the assessment of the generalization ability of the ANNs (Section 5) is the most efficient network according to the testing sub-set, i.e. the “N2-SCG-tan/tan/tan-10/28” network (OA = 86%, Table 6b and Fig. 11c). Nevertheless, the generalization ability of the ANNs is also assessed in Section 5 using the most efficient network according to the total data set, i.e. the “N2-RP-tan/tan/tan-60/48” network (OA = 89.9%, Table 6a and Fig. 11b). Table 7 summarizes the configuration parameters of the networks which are utilized for the study of the generalization ability of the ANNs.

RP-tan/tan-44” network), there are no classiﬁcations in classes not adjacent to the correct ones. The corresponding number in the case of the most eﬃcient network according to the total data set (“N1-RP-tan/ tan-56” network) is 8 in 1950 samples (Fig. 10b). Similar numbers are observed also in the cases of the most eﬃcient networks which were trained using the SCG algorithm (Fig. 10c and d). Thus, even the wrong classiﬁcations do not lead to misleading conclusions about the expected damage state. A signiﬁcant feature of CMs is their ability to provide additional information about the percentage of correct classiﬁcations in each class individually and not only about their corresponding overall percentage (Fig. 6). More speciﬁcally, the sum of the elements of each column corresponds to the total number of the target vectors which belong to each one of the ﬁve classes. Thus, as emerges, for example, from the study of the CM of Fig. 10b (“N1-RP-tan/tan-56” network), the total data set consists of: 292(=268 + 24) class 1 samples (null damage), 271(=12 + 225 + 34) class 2 samples (slight damage), 443(=1 + 34 + 349 + 56 + 3) class 3 samples (moderate damage), 371(=68 + 274 + 29) class 4 samples (heavy damage), and 573(=4 + 38 + 531) class 5 samples (Destruction). Respectively, the sum of the elements of each row corresponds to the total number of samples which the “N1-RP-tan/tan-56” network classiﬁes to each one of the ﬁve classes of the problem. According to the above clariﬁcations, it is concluded that the “N1-RP-tan/tan-56” network classiﬁes 268 samples to the class 1, whereas the samples whose true class is the class 1 are 292 (Precision = 268/292 = 0.918 or 91.8%). Furthermore, it can be concluded that the “N1-RP-tan/tan-56” network classiﬁes 281(=268 + 12 + 1) samples to the class 1 but the number of these samples which belong indeed to the class 1 is 268 (Recall = 268/ 281 = 0.954 or 95.4%). 
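The indices quoted above for the “N1-RP-tan/tan-56” network can be recomputed from the column entries listed in the text. In the sketch below, the placement of each listed entry within its column (on the main diagonal and the cells around it) is an assumption, since Fig. 10b itself is not reproduced here; it is consistent with the quoted column sums, the first row sum of 281 and the 8 non-adjacent classifications. The two ratios are named after their denominators because the text's “Precision” divides by the number of samples of the true class, whereas many references use the two names the other way round.

```python
# Rows: class assigned by the network; columns: true (target) class.
# Entry placement within each column is assumed, see the note above.
cm = [
    [268,  12,   1,   0,   0],
    [ 24, 225,  34,   0,   0],
    [  0,  34, 349,  68,   4],
    [  0,   0,  56, 274,  38],
    [  0,   0,   3,  29, 531],
]
n = len(cm)

total = sum(sum(row) for row in cm)
col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # samples per true class
row_sums = [sum(row) for row in cm]                             # samples per assigned class

overall_accuracy = sum(cm[i][i] for i in range(n)) / total

# Ratio to the true-class count (called "Precision" in the text).
ratio_to_true = [cm[i][i] / col_sums[i] for i in range(n)]
# Ratio to the assigned-class count (called "Recall" in the text).
ratio_to_assigned = [cm[i][i] / row_sums[i] for i in range(n)]

# Classifications into classes that are not adjacent to the correct one.
non_adjacent = sum(
    cm[i][j] for i in range(n) for j in range(n) if abs(i - j) > 1
)

print(f"OA = {100 * overall_accuracy:.1f}%, non-adjacent = {non_adjacent}")
```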
These high values of the Precision and Recall indices indicate the great efficiency of the “N1-RP-tan/tan-56” network in correctly classifying the samples of class 1. Similar values of the Precision and Recall indices are also observed for the network classifications to class 5 (92.7% and 94.3% respectively). By contrast, the corresponding values of these indices for classes 2, 3 and 4 are lower (with few exceptions, they fluctuate between 74% and 80%). Thus, the efficiency of the “N1-RP-tan/tan-56” network to correctly classify the samples to these classes is not equivalent to the

Table 6b ANNs with two hidden layers - Best values of the OA index and the corresponding number of neurons of the hidden layers (SCG algorithm). Columns: activation function of the 1st hidden/2nd hidden/output layers’ neurons.

| Performance criterion | Data set | tansig/tansig/logsig | tansig/tansig/tansig | tansig/logsig/logsig | tansig/logsig/tansig | logsig/tansig/logsig | logsig/tansig/tansig | logsig/logsig/logsig | logsig/logsig/tansig |
|---|---|---|---|---|---|---|---|---|---|
| max OA (%) | Testing sub-set | 68.3 | 86.0 | 66.9 | 84.6 | 65.2 | 85.3 | 65.5 | 84.0 |
| max OA (%) | Training sub-set | 64.9 | 91.1 | 64.7 | 92.7 | 62.4 | 89.3 | 64.9 | 89.2 |
| max OA (%) | Total data set | 63.7 | 87.9 | 64.0 | 88.4 | 61.9 | 85.8 | 63.1 | 86.4 |
| Number of neurons in 1st/2nd hidden layer | Testing sub-set | 38/60 | 10/28 | 54/12 | 10/52 | 58/52 | 42/20 | 52/58 | 14/16 |
| Number of neurons in 1st/2nd hidden layer | Training sub-set | 28/14 | 50/38 | 44/28 | 28/26 | 60/44 | 40/24 | 28/48 | 48/32 |
| Number of neurons in 1st/2nd hidden layer | Total data set | 28/14 | 50/36 | 42/30 | 28/26 | 52/30 | 48/56 | 24/18 | 54/40 |


Fig. 10. CMs of the most eﬃcient networks with one hidden layer.

them (all elements of the vector x) are diﬀerent from the corresponding values of the input vectors of the training data set.

5. Assessment of the predictive abilities of the optimum conﬁgured networks In the current section, the results of the generalization ability assessment of the most eﬃcient ANNs are presented. In particular, the results of the study of the ability of the optimum conﬁgured networks (Table 7) to reliably predict the seismic damage state in cases in which the r/c buildings or/and the earthquakes are unknown to them (i.e. they were not used in the generation of the training data set) are presented. For this purpose, three diﬀerent scenarios were utilized. In the context of these scenarios, the predictions of the most eﬃcient networks were compared to results of NTHA. More speciﬁcally, in the context of these scenarios, input vectors x (and the corresponding target vectors d) were generated, in which the values of the structural parameters (elements of the sub-vector xstruct, Eq. (1)) or the values of the seismic parameters (elements of the sub-vector xseism) or the values of both of

5.1. Assessment of predictive ability of ANNs for known buildings subjected to unseen earthquakes The aim of the ﬁrst scenario is the assessment of generalization ability of the most eﬃcient networks in cases in which the testing data set consists of input vectors x with unknown values of seismic parameters (i.e. unknown sub-vector xseism) but known values of structural parameters (i.e. known sub-vector xstruct). To this end, the 30 r/c buildings which were used for the generation of the training data set (Appendix A) were utilized. These buildings were subjected to 4 testing earthquakes (Table 8) that were diﬀerent from the 65 excitations (Appendix B) which were used for the generation of the training data set.

Fig. 11. CMs of the most eﬃcient networks with two hidden layers.


values of the seismic parameters of the 65 seismic excitations which were used for the generation of the training data set (see Table 3 and the values in bold in Table 8). Thus, in the case of earthquake E4, the network approaches the solution of the problem by performing extrapolation. As a result, the generalization ability decreases (e.g. [54,55]).

The combined study of Figs. 12 and 13 leads to the conclusion that the “N1-LM-log/lin-18” network generalizes better than the “N1-LM-log/lin-32” network. Especially in the cases of the testing earthquakes E2 and E4, the “N1-LM-log/lin-32” network yields unacceptable MIDR predictions (R values equal to 0.28 and 0.36 respectively). By contrast, in the case of the earthquake E1, and mainly in the case of the earthquake E3, the difference between the generalization efficiency of the two examined networks is insignificant. Thus, Figs. 12 and 13 quantitatively confirm the conclusion expected from theory that the “N1-LM-log/lin-18” network, which is the optimum network according to the testing sub-set, generalizes more efficiently than the “N1-LM-log/lin-32” network (the optimum network according to the total data set).

Fig. 14a presents the predictions of the “N2-SCG-tan/tan/tan-10/28” network (optimum network on the basis of the testing sub-set) and the “N2-RP-tan/tan/tan-60/48” network (optimum network according to the total data set) for the damage state of the 30 r/c buildings due to the 4 testing earthquakes on the basis of the solution of the PR problem. More specifically, the figure illustrates the percentages of correct classifications (values of the OA index) to the damage states of Table 2 which result from the abovementioned networks (Table 7). From the study of Fig. 14a, it is concluded that, with the exception of the earthquake E3, the “N2-SCG-tan/tan/tan-10/28” network achieves percentages of correct classifications greater than 73%. The corresponding percentages achieved by the “N2-RP-tan/tan/tan-60/48” network are lower, with the exception of the earthquake E3 (OA = 46.7% for the “N2-SCG-tan/tan/tan-10/28” network and OA = 60% for the “N2-RP-tan/tan/tan-60/48” network). The low values of the OA index obtained from the “N2-RP-tan/tan/tan-60/48” network can be explained by the fact that it is the most efficient network according to the total data set, and not according to the testing sub-set which is used for the optimization of the generalization efficiency of the networks.

However, the low performance of the “N2-SCG-tan/tan/tan-10/28” network in the case of earthquake E3 demands further explanation. Because the multi-parametric nature of the problem makes a direct explanation infeasible, a further analysis of the specific classifications is considered necessary. Thus, the corresponding CM is illustrated in Fig. 14b. As emerges from this figure, the wrong classifications concern the moderate and heavy damage classes (classes 3 and 4). More specifically, there are 18 (= 9 + 9) buildings whose true class is class 3. The network classifies 9 of these buildings to this class (Precision = 50%). Furthermore, the network classifies a total of 18 (= 9 + 5 + 4) buildings to damage class 4, whereas only 5 of them indeed belong to this class (Recall = 27.8%). However, all wrong classifications are classifications to classes adjacent to the correct ones. Therefore, despite the low value of the OA index (46.7%), the overall picture of the damage state of the 30 r/c buildings due to the testing earthquake E3 is not misleading, provided that it is additionally taken into account that such information will be used as a first estimation of the seismic damage right after a strong earthquake.
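The quantities used in the discussion of Fig. 14b (the OA index, the per-class rates and the share of wrong classifications that fall in adjacent classes) can be computed from any confusion matrix. The following is a minimal Python sketch, assuming that rows index the true class and columns the predicted class. The matrix values are hypothetical and do not reproduce Fig. 14b. Since the labelling of the two per-class rates as Precision or Recall depends on the orientation convention of the CM, the function returns them under neutral names.

```python
import numpy as np

def cm_summary(cm):
    """Summarise a square confusion matrix (assumed: rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    correct = np.trace(cm)
    overall_accuracy = correct / total

    row_sums = cm.sum(axis=1)  # number of samples per true class
    col_sums = cm.sum(axis=0)  # number of samples per predicted class
    with np.errstate(divide="ignore", invalid="ignore"):
        rate_per_true_class = np.where(row_sums > 0, np.diag(cm) / row_sums, np.nan)
        rate_per_predicted_class = np.where(col_sums > 0, np.diag(cm) / col_sums, np.nan)

    # Share of the wrong classifications that land in a class adjacent to the correct one.
    i, j = np.indices(cm.shape)
    adjacent_errors = cm[np.abs(i - j) == 1].sum()
    errors = total - correct
    adjacent_share = adjacent_errors / errors if errors > 0 else float("nan")
    return overall_accuracy, rate_per_true_class, rate_per_predicted_class, adjacent_share

# Hypothetical 3-class example with 30 samples (illustrative values only).
example_cm = [[8, 2, 0],
              [1, 6, 3],
              [0, 1, 9]]
oa, per_true, per_pred, adjacent_share = cm_summary(example_cm)
```

In this hypothetical example every wrong classification is one class away from the correct one, which is the situation described above for earthquake E3: a modest OA value can coexist with an overall damage picture that is not misleading.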

Table 7 Configuration parameters of the ANNs which were selected for the evaluation of their generalization ability.

                                     Criterion: optimum       Formulation of the problem
Configuration parameter              performance for the      FA problem          PR problem
Number of hidden layers              Testing sub-set          1                   2
                                     Total data set           1                   2
Number of neurons in hidden layers   Testing sub-set          18                  10/28
                                     Total data set           32                  60/48
Training algorithm                   Testing sub-set          LM                  SCG
                                     Total data set           LM                  RP
Activation functions                 Testing sub-set          logsig/linear       tansig/tansig/tansig
                                     Total data set           logsig/linear       tansig/tansig/tansig
Network name                         Testing sub-set          N1-LM-log/lin-18    N2-SCG-tan/tan/tan-10/28
                                     Total data set           N1-LM-log/lin-32    N2-RP-tan/tan/tan-60/48
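To make the configurations of Table 7 concrete, the sketch below writes the forward pass of the two testing-sub-set optima in Python with NumPy: one hidden layer of 18 logsig neurons with a linear output for the FA problem, and two hidden layers of 10 and 28 tansig neurons with a tansig output layer for the PR problem. It is only an illustration under stated assumptions: the weights are random rather than trained, and the input size (18) and the number of damage classes (5) are assumptions, not values taken from this section. The tansig function is mathematically the hyperbolic tangent.

```python
import numpy as np

def logsig(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

def linear(z):
    """Identity activation for a linear output layer."""
    return z

def random_layers(sizes, activations, rng):
    """Build (W, b, activation) tuples with small random weights (untrained placeholders)."""
    return [(0.1 * rng.standard_normal((n_out, n_in)), np.zeros(n_out), act)
            for n_in, n_out, act in zip(sizes[:-1], sizes[1:], activations)]

def forward(x, layers):
    """Propagate one input vector through a multilayer feedforward network."""
    a = x
    for W, b, act in layers:
        a = act(W @ a + b)
    return a

rng = np.random.default_rng(0)
N_INPUTS = 18   # assumption: length of the input vector x
N_CLASSES = 5   # assumption: number of damage classes in the PR formulation

# FA problem, "N1-LM-log/lin-18": 18 logsig hidden neurons, 1 linear output (MIDR).
fa_net = random_layers([N_INPUTS, 18, 1], [logsig, linear], rng)
# PR problem, "N2-SCG-tan/tan/tan-10/28": 10 and 28 tansig hidden neurons, tansig outputs.
pr_net = random_layers([N_INPUTS, 10, 28, N_CLASSES], [np.tanh, np.tanh, np.tanh], rng)

x = rng.standard_normal(N_INPUTS)          # placeholder (normalised) input vector
midr_estimate = forward(x, fa_net)         # array of shape (1,)
class_scores = forward(x, pr_net)          # array of shape (N_CLASSES,), each in (-1, 1)
predicted_class = int(np.argmax(class_scores)) + 1
```

The training algorithm (LM, SCG or RP) only affects how the weights are obtained, so it does not appear in the forward pass.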

Following the procedure which is illustrated in Fig. 4, the testing data set was generated. This data set consists of 120 samples (30 buildings × 4 earthquakes). Two types of target vectors were formed: one type compatible with the formulation of the FA problem and one type compatible with the formulation of the PR problem (Section 3.3). The input vectors were introduced to the most efficient networks of Table 7 in order to predict the seismic damage state for the 120 testing samples.

Figs. 12 and 13 illustrate the predictions of the “N1-LM-log/lin-18” network (optimum network on the basis of the testing sub-set) and the predictions of the “N1-LM-log/lin-32” network (optimum network on the basis of the total data set) respectively. These figures concern the predictions which arise from the FA problem solution. As emerges from the study of Fig. 12, the generalization ability of the “N1-LM-log/lin-18” network is significant, mainly in the cases of testing earthquakes E1, E2 and E3. The predictions of the MIDR values for these earthquakes are generally acceptable (R values between 0.63 and 0.72, and MSE values between 0.017 and 0.069). This conclusion also arises from the fact that the vast majority of points in the corresponding diagrams (Fig. 12a–c) are close to the diagonal reference line (i.e. the locus of points for which the condition MIDR_NTHA = MIDR_ANN is fulfilled). By contrast, in the case of the testing earthquake E4, the predictions of the MIDR values are less accurate. Although the value of the corresponding correlation factor R (= 0.65) is comparable to the values of the R factors which correspond to the testing excitations E1–E3, the value of the MSE (= 0.62) is one order of magnitude higher. Furthermore, the vast majority of points in the corresponding diagram (Fig. 12d) lie far above the reference line. This means that the network mainly yields lower MIDR values than the NTHA.
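The observation that earthquake E4 combines a comparable R value with a much higher MSE can be illustrated numerically: a correlation factor is insensitive to a systematic offset, whereas the MSE is not. The sketch below is a minimal Python illustration, assuming that R is the Pearson correlation coefficient between the NTHA and ANN values (the exact definition used for R is not restated in this section). The MIDR values are hypothetical.

```python
import numpy as np

def r_mse_bias(midr_ntha, midr_ann):
    """Return Pearson R, mean squared error and mean signed error (ANN minus NTHA)."""
    target = np.asarray(midr_ntha, dtype=float)
    predicted = np.asarray(midr_ann, dtype=float)
    r = np.corrcoef(target, predicted)[0, 1]
    mse = np.mean((target - predicted) ** 2)
    bias = np.mean(predicted - target)
    return r, mse, bias

# Hypothetical MIDR values (%): the predictions follow the trend of the targets
# but are systematically 0.8 lower, i.e. the network underestimates the NTHA results.
ntha = np.array([1.0, 1.5, 2.0, 2.5])
ann = ntha - 0.8
r, mse, bias = r_mse_bias(ntha, ann)
```

Here R stays at 1 while the MSE equals 0.64, which is why both indices (and the position of the points relative to the reference line) are examined together.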
The higher deviations of MIDR values which are extracted from the “N1-LM-log/lin-18” network for the testing earthquake E4, in comparison to the corresponding deviations which are extracted for the testing earthquakes E1–E3, could be partly attributed to the fact that, contrary to the testing earthquakes E1–E3, 2 seismic parameters (ASI and EPA) of the earthquake E4 have values which are out of the range of the corresponding

Table 8 The ground motion parameters of the 4 testing earthquakes.

      PGA     PGV      PGD      Ia      SED      CAV       ASI     HI        EPA     PGV/PGA   PP      TUD    TBD     TSD
E1    0.296   16.083   2.33     0.725   224.12   538.73    0.284   46.817    0.28    0.055     0.159   3.55   10.76   7.01
E2    0.11    6.865    2.726    0.216   175.85   490.24    0.102   37.156    0.101   0.064     0.219   0.21   7.72    22.36
E3    0.321   28.621   6.921    0.973   694.81   592.87    0.257   107.024   0.258   0.091     0.11    4.78   8.91    6.99
E4    0.68    44.719   17.687   4.915   3006.3   1630.99   0.652   181.352   0.646   0.067     0.275   7.76   19.12   9.72
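Whether a testing earthquake forces a network to extrapolate can be checked by comparing each of its seismic parameters with the range covered by the training excitations. The Python sketch below illustrates such a check. The E4 values are copied from Table 8, whereas the training bounds are placeholders chosen only for illustration (the actual ranges are those of Table 3, which is not reproduced in this section).

```python
def out_of_range(sample, lower, upper):
    """Return the names of the parameters whose value lies outside [lower, upper]."""
    return [name for name, value in sample.items()
            if not (lower[name] <= value <= upper[name])]

# Three of the 14 parameters of testing earthquake E4 (values from Table 8).
E4 = {"PGA": 0.68, "ASI": 0.652, "EPA": 0.646}

# Placeholder training bounds (hypothetical, NOT the values of Table 3).
TRAIN_MIN = {"PGA": 0.0, "ASI": 0.0, "EPA": 0.0}
TRAIN_MAX = {"PGA": 1.0, "ASI": 0.6, "EPA": 0.6}

flagged = out_of_range(E4, TRAIN_MIN, TRAIN_MAX)
```

With these placeholder bounds the check flags ASI and EPA; for any flagged parameter, the prediction relies on extrapolation and should be treated with more caution.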


Fig. 12. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the four testing earthquakes: (a) E1, (b) E2, (c) E3 and (d) E4.

The 3 new testing r/c buildings were subjected to the 65 seismic excitations (Table 3 and Appendix B) which were used for the generation of the training data set (Section 3.4). Following the procedure of the flow chart which is presented in Fig. 4, a new testing data set was generated. This data set consists of 195 samples (3 buildings × 65 excitations). As in the case of the first scenario, two types of target vectors were formed: one type compatible with the formulation of the FA problem and one type compatible with the formulation of the PR problem (Section 3.3). Figs. 15 and 16 present the predictions of the “N1-LM-log/lin-18” network and the predictions of the “N1-LM-log/lin-32” network respectively on the basis of the FA problem solution. The combined study of these figures leads to the conclusion that, as expected, the “N1-LM-log/lin-18” network yields more accurate predictions of the MIDR values than the “N1-LM-log/lin-32” network. However, these differences in accuracy are not significant, especially as regards the values of the R factor. More specifically, with the exception of the 8-storey building, for which the “N1-LM-log/lin-32” network yields results with an R factor value equal to 0.88 (Fig. 16c), all the other values of the R factor are greater than 0.9. Another significant conclusion, which was also drawn from the study of Fig. 8, is the

5.2. Assessment of predictive ability of ANNs for unseen buildings subjected to known earthquakes

The aim of the second scenario is the assessment of the generalization ability of the most efficient networks in cases in which the testing data set consists of input vectors x with known values of seismic parameters (i.e. known sub-vector xseism) but unknown values of structural parameters (i.e. unknown sub-vector xstruct). Furthermore, this scenario is used to investigate whether ANNs can serve as tools for a rapid and reliable vulnerability assessment of individual r/c buildings using the FA problem solution. This use of ANNs is significantly advantageous because it provides, in real time, an estimation of the expected damage state of individual buildings due to numerous excitations with different characteristics (seismic parameters). In order to fulfill the aims of this scenario, new r/c buildings, different from the 30 r/c buildings which were used for the training data set generation, were selected. The values of their structural parameters are given in Table 9. The selected buildings were designed following the procedure which is described in Section 3.4.

Fig. 13. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the four testing earthquakes: (a) E1, (b) E2, (c) E3 and (d) E4.


Fig. 14. (a) Percentages of correct classifications to damage states obtained from the optimum trained networks for the 30 r/c buildings subjected to the 4 testing earthquakes; (b) CM of the damage classification of the 30 r/c buildings produced by the “N2-SCG-tan/tan/tan-10/28” network for the testing earthquake E3.

buildings due to the 65 seismic excitations, the extracted results would be similar or relatively close to the results of the NTHA. This fact leads to the conclusion that the properly trained ANNs might be used as tools accompanying the rapid visual screening procedure of buildings.

Table 9 The values of the structural parameters of the 3 testing r/c buildings.

No.   nst   Htot = 3.2 · nst (m)   Lx (m)   Ly (m)   eo (m)   nvx (%)   nvy (%)
1     3     9.6                    10.0     15.0     0.0      62.0      0.0
2     5     16.0                   10.0     15.0     0.0      60.0      0.0
3     8     25.6                   10.0     15.0     0.0      58.0      0.0
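Each testing data set pairs every building with every ground motion, so its size is the product of the two counts (30 × 4 = 120, 3 × 65 = 195 and 3 × 15 = 45 samples in the three scenarios). The Python sketch below illustrates this pairing for the buildings of Table 9. It is an illustration only: the composition and ordering of the input vector x are assumptions (Eq. (1) is not restated here), and only two of the 14 seismic parameters of two Table 8 earthquakes are used to keep the example short.

```python
from itertools import product

# Structural parameters of the 3 testing buildings, copied from Table 9.
BUILDINGS = [
    {"nst": 3, "Htot": 9.6, "Lx": 10.0, "Ly": 15.0, "eo": 0.0, "nvx": 62.0, "nvy": 0.0},
    {"nst": 5, "Htot": 16.0, "Lx": 10.0, "Ly": 15.0, "eo": 0.0, "nvx": 60.0, "nvy": 0.0},
    {"nst": 8, "Htot": 25.6, "Lx": 10.0, "Ly": 15.0, "eo": 0.0, "nvx": 58.0, "nvy": 0.0},
]

# PGA and PGV of earthquakes E1 and E2, copied from Table 8 (subset for brevity).
EARTHQUAKES = [
    {"PGA": 0.296, "PGV": 16.083},
    {"PGA": 0.11, "PGV": 6.865},
]

def build_inputs(buildings, earthquakes):
    """Concatenate structural and seismic values for every building/earthquake pair.

    The ordering [structural..., seismic...] is an assumption made for illustration.
    """
    return [list(b.values()) + list(e.values())
            for b, e in product(buildings, earthquakes)]

samples = build_inputs(BUILDINGS, EARTHQUAKES)

# Consistency check of the Table 9 heights against Htot = 3.2 * nst.
heights_consistent = all(abs(b["Htot"] - 3.2 * b["nst"]) < 1e-9 for b in BUILDINGS)
```

The same pairing with the full 14-parameter seismic vectors gives the 195 samples of the second scenario and the 45 samples of the third one.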

5.3. Assessment of predictive ability of ANNs for unseen buildings subjected to unseen earthquakes

incapacity of both networks to predict MIDR values greater than 1.5% as accurately as MIDR values lower than this value. This incapacity causes the relatively high MSE values in the case of the 3-storey building (Figs. 15a and 16a), as well as in the case of the 8-storey building (when the “N1-LM-log/lin-32” network is used for the predictions, Fig. 16c). As was mentioned in the remarks on Fig. 8, the incapacity of the networks to accurately predict MIDR values greater than 1.5% is not of great importance, since these values correspond to heavy (non-repairable) damage or collapse. Moreover, it must be stressed that for such values of MIDR, the results obtained by the NTHA are not especially reliable, since a large number of structural members are characterized by extensive inelastic behavior which is difficult to capture by analytical models.

Fig. 17 illustrates the CMs of classifications to the damage classes of Table 2 made by the “N2-SCG-tan/tan/tan-10/28” and “N2-RP-tan/tan/tan-60/48” networks for the 3 testing r/c buildings. The main conclusion which can be drawn from this figure is that, in general, the two examined networks achieve high percentages of correct classifications (OA values greater than 70%, with the exception of the classifications made by the “N2-SCG-tan/tan/tan-10/28” network for the 3-storey building). Further significant evidence of the high performance of the utilized networks is the fact that only one of the 195 samples was classified to a class not adjacent to the correct one (3-storey building in Fig. 17a). Therefore, if these networks were used as computational tools for a preliminary approach of the seismic damage state of the 3 testing

The aim of the third scenario is the assessment of the generalization ability of the optimum configured networks in cases in which the testing data set consists of input vectors x with unknown values for both seismic and structural parameters (i.e. totally unknown vectors x). To this end, the 3 testing r/c buildings which were used in the analyses of the second scenario (Table 9) were subjected to 15 new testing earthquakes (Table 10). Following once more the procedure which is illustrated in Fig. 4, a new testing data set was generated. This data set consists of 45 samples (3 buildings × 15 earthquakes). As in the cases of the two previous scenarios, two types of target vectors were formed: one type compatible with the formulation of the FA problem and one type compatible with the formulation of the PR problem (Section 3.3).

Figs. 18 and 19 illustrate the predictions of the MIDR values made by the “N1-LM-log/lin-18” network (optimum network according to the testing sub-set) and the corresponding predictions made by the “N1-LM-log/lin-32” network (optimum network according to the total data set) respectively. As emerges from the study of Figs. 18 and 19, both examined networks yield predictions of a similar level of accuracy in the cases of the 3-storey and the 5-storey buildings. More specifically, the “N1-LM-log/lin-32” network yields results which are better correlated with the corresponding NTHA results than the results of the “N1-LM-log/lin-18” network. The opposite applies in the case of the 8-storey building. Despite the fact that the obtained values of the R factor can generally be considered acceptable (with the exception of the value which is obtained from the “N1-LM-log/lin-32” network for the 8-storey building), the corresponding values of MSE are relatively high. According to Figs. 18 and 19, these high values could be attributed

Fig. 15. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.


Fig. 16. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.

classifications (Fig. 20a) is indicative of its high efficiency. More specifically, besides the high values of the OA index (86.7% for the 3-storey building, 73.3% for the 5-storey building and 80.0% for the 8-storey building), the percentages of correct classifications to individual classes are greater than 67% (i.e. the Precision and Recall indices have values greater than 67%), with a few exceptions. Furthermore, it must be stressed that none of the examined samples is classified by the “N2-SCG-tan/tan/tan-10/28” network to classes not adjacent to the correct ones (Fig. 20a). Finally, a very significant conclusion, which is drawn from the combined study of Figs. 18–20, is that the predictions of the most efficient networks are more reliable when they are obtained on the basis of the PR problem solution.

mainly to the aforesaid insufficiency of the networks in adequately approaching high MIDR values (higher than 1.5–2.0%) calculated by NTHA. However, as was already mentioned above, this insufficiency of the networks is not of great importance. Furthermore, it must be stressed that the majority of samples of the training data set have MIDR values less than 2.0% (1589 of 1950 samples). Therefore, the training algorithms yield values for the synaptic weights that are better adapted to MIDR values less than 2.0%. Finally, the high values of MSE could also be attributed to the fact that 7 of the 15 testing earthquakes have 2 or more seismic parameters whose values are out of the range of the corresponding seismic parameters’ values of the 65 seismic excitations which were used for the generation of the training data set (see Table 3 and the values in bold in Table 10).

In Fig. 20, the CMs of classifications made by the “N2-SCG-tan/tan/tan-10/28” and “N2-RP-tan/tan/tan-60/48” networks for the 3 testing r/c buildings are presented. The main conclusion which is drawn from this figure is the high quality of classifications of both examined networks (OA values greater than 73%, with the exception of the classifications made by the “N2-RP-tan/tan/tan-60/48” network for the 8-storey building (Fig. 20b)). In particular, the configuration of the CMs which correspond to the “N2-SCG-tan/tan/tan-10/28” network’s

6. Conclusions

The present paper examined the ability of the Multilayer Feedforward Perceptron (MFP) Artificial Neural Networks (ANN) to successfully approach the solution of the problem of the seismic damage prediction of reinforced concrete (r/c) buildings using two different methodologies. The first one concerns the estimation of the values of the buildings’ damage

Fig. 17. CMs of the damage state classiﬁcations of the 3 testing buildings made by the optimum trained networks (a) N2-SCG-tan/tan/tan-10/28 and (b) N2-RP-tan/tan/tan-60/48.


Table 10 The ground motion parameters of the 15 testing earthquakes.

       PGA     PGV     PGD     Ia      SED     CAV     ASI     HI      EPA     PGV/PGA   PP      TUD     TBD     TSD
E1     0.131   19.02   17.26   0.248   720.0   559.7   0.107   52.81   0.106   0.148     0.525   1.18    6.66    29.08
E2     0.296   16.08   2.330   0.725   224.1   538.7   0.284   46.82   0.280   0.055     0.159   3.55    10.76   7.01
E3     0.037   1.505   0.125   0.016   2.35    113.8   0.026   4.099   0.027   0.041     0.118   0.00    0.00    17.33
E4     0.110   6.865   2.726   0.216   175.8   490.2   0.102   37.16   0.101   0.064     0.219   0.21    7.72    22.36
E5     0.014   1.144   0.596   0.003   1.95    34.78   0.015   3.982   0.014   0.085     0.410   0.00    0.00    9.81
E6     0.030   1.826   1.902   0.010   2.41    68.47   0.031   3.466   0.031   0.062     0.369   0.00    0.00    9.89
E7     0.091   6.181   1.176   0.070   31.72   191.6   0.082   24.55   0.082   0.069     0.167   0.37    2.76    7.51
E8     0.166   8.641   1.196   0.117   70.58   216.2   0.122   31.37   0.120   0.053     0.319   0.30    1.02    8.39
E9     0.272   20.77   2.893   0.648   233.6   497.9   0.258   60.22   0.256   0.078     0.233   2.04    7.14    6.09
E10    0.321   28.62   6.921   0.973   694.8   592.9   0.257   107.0   0.258   0.091     0.110   4.78    8.91    6.99
E11    0.714   58.4    22.89   5.453   4280    1452    0.474   210.5   0.479   0.083     0.195   7.70    10.96   6.75
E12    0.681   44.72   17.69   4.915   3006    1631    0.652   181.3   0.646   0.067     0.275   7.76    19.13   9.72
E13    0.809   82.55   20.94   9.394   6277    2529    0.682   300.8   0.683   0.104     0.574   12.76   34.43   9.59
E14    0.891   53.29   10.02   4.412   1887    1262    0.749   178.2   0.746   0.061     0.297   6.42    14.71   6.59
E15    1.198   75.37   28.17   11.07   7063    2404    0.983   293.2   0.969   0.064     0.259   10.35   21.97   7.88

performance parameters. In the investigation of the optimum performance of the ANNs in the solution of the PR problem, networks with one and two hidden layers were used. The main conclusion that emerged from this investigation is that the most important configuration parameter of the networks (with one or two hidden layers) is the activation function of the neurons of the output layer (more specifically, the tansig function). The addition of a second hidden layer improves the classification quality, primarily as regards the percentages of correct classification to individual classes and secondarily as regards the total percentages of correct classifications (values of the OA index). Therefore, the most efficient networks for the solution of the PR problem are the networks with two hidden layers.

The generalization ability of the best configured networks was then examined by means of three seismic scenarios. In the framework of the first scenario, the generalization ability of the most efficient networks in the case in which known r/c buildings were subjected to unknown seismic excitations was examined. The main conclusion which was drawn from this scenario is that the networks which were the most efficient (in the case of the FA problem as well as in the case of the PR problem) are the optimum networks according to the testing data-set which is used for the assessment of their generalization during training. In the framework of the second scenario, the generalization ability of the most efficient networks in the case in which unknown r/c buildings were subjected to known seismic excitations was examined. The same conclusion as the main conclusion of the first scenario was drawn for the case of the FA problem. Another significant conclusion is the incapacity of the networks to accurately predict damage index values greater than the values which practically correspond to heavy (non-repairable) damage. Nevertheless, this incapacity is not of great importance since, in this range of values, the accuracy of the predicted damage index values is not critical. By contrast, the performance of the corresponding examined networks in the PR problem solution was

index through the formulation (and solution) of the problem of approximating an unknown function (Function Approximation (FA) problem). The second one concerns the classification of buildings to specific damage classes through the formulation (and solution) of the Pattern Recognition (PR) problem. In order to investigate the performance of the optimum ANNs, several networks with different configuration parameters were examined, i.e. different numbers of hidden layers (1 or 2), different numbers of neurons in the hidden layers (between 10 and 60), as well as different activation functions of neurons (sigmoid function (logsig) or hyperbolic tangent function (tansig)). Three training algorithms, namely the Levenberg-Marquardt algorithm (LM), the Scaled Conjugate Gradient algorithm (SCG) and the Resilient Back-Propagation algorithm (RP), were adopted.

The seismic damage index of r/c buildings was expressed by means of the Maximum Interstorey Drift Ratio (MIDR). This choice was based on the fact that MIDR is a global damage index, which can be used for the description of the damage not only of all the structural elements but also of the non-structural ones. Moreover, the adoption of MIDR does not lead to the numerical problems that can show up in the attempt to analytically assess the global seismic damage using other indices. Also, in the case of instrumented buildings, the calculation of the MIDR can be made in real time, thus providing direct information that can be used either for the remote estimation of the damage level or for the direct generation of data in order to improve (through re-training) the predictions of ANNs.

In the framework of the investigation of the optimum performance of ANNs (best configured networks) in the solution of the FA problem, networks with one hidden layer were used. The main conclusion which was drawn from this investigation is that the networks which have the logsig activation function in the neurons of the hidden layer and were trained with the LM algorithm yield the optimum predictions for the damage index values. The number of neurons of the hidden layer which leads to the optimum predictions of ANNs (optimum number of neurons) depends on the data set which is used for the calculation of the

Fig. 18. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-18” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.


Fig. 19. Comparison of MIDR values predicted by NTHA and the “N1-LM-log/lin-32” network for the 3 testing buildings: (a) 3-storey building, (b) 5-storey building and (c) 8-storey building.

The common and main conclusion of the three scenarios is that the best configured networks which were trained on the basis of the generated training data set are capable of providing reliable estimations of the seismic damage state of known or unknown buildings which are subjected to future earthquakes, either in real time after the shock or in pre-seismic periods. In particular, the results of the PR approach can be used both in pre-seismic periods for the vulnerability assessment of existing r/c buildings (accompanying rapid visual screening procedures in the framework of seismic microzonation studies) and as reliable instant decision-making tools (or in the production of preliminary reports) for the authorities after a strong earthquake. However, the predictions of the ANNs which were illustrated in the present paper can be further improved using additional training samples which cover more types of buildings and/or more seismic excitations.

generally highly satisfactory. The third scenario concerns the examination of the generalization ability of the most efficient networks in cases in which both the buildings and the seismic excitations are unknown to them. Just as in the previous scenarios, in the case of the FA problem, the network which is the optimum according to the testing data-set yields damage index values which are highly correlated with the corresponding values calculated by non-linear time history analyses. The relatively high MSE values can be mainly attributed to the incapacity of the networks to accurately predict damage index values which practically correspond to heavy (non-repairable) damage. In the case of the solution of the PR problem, the examined networks yield highly satisfactory classifications. In addition (just as in the second scenario), the configuration of the corresponding confusion matrices led to the conclusion that the networks are capable of providing reliable predictions for the expected damage of unknown buildings due to future earthquakes.

Fig. 20. CMs of the damage state classiﬁcations of the 3 testing buildings made by the optimum trained networks (a) “N2-SCG-tan/tan/tan-10/28” and (b) “N2-RP-tan/tan/tan-60/48”.


Appendix A. Design data of the 30 selected r/c buildings

Fig. A1. Design data of the 15 selected symmetric r/c buildings.


Fig. A2. Design data of the 15 selected nonsymmetric r/c buildings.


Appendix B. Data of the 65 selected ground motion records

Table B1 Data of the 65 selected seismic excitations.

No.   Earthquake name   Date   Magnitude (Ms)   Distance to fault (km)   Component (deg)   PGA (g)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Imperial Valley Imperial Valley Kocaeli (Turkey) Landers Loma Prieta Whittier Narrows Northridge Northridge N. Palm Springs Northridge Northridge Northridge Whittier Narrows Cape Mendocino Chi-Chi (Taiwan) Chi-Chi (Taiwan) Chi-Chi (Taiwan) Chi-Chi (Taiwan) Chi-Chi (Taiwan) Chi-Chi (Taiwan) Chi-Chi (Taiwan) Erzincan (Turkey) Loma Prieta Loma Prieta Loma Prieta Northridge Northridge Northridge Northridge Campano Lucano (Italy) Spitak (Armenia) Izmit (Turkey) Duzce (Turkey) Duzce (Turkey) Duzce (Turkey) Duzce (Turkey) Izmit (Turkey) Duzce (Turkey) Strofades (Greece) Aigion (Greece) Friuli (Italy) Volvi (Greece) Dinar (Turkey) Izmit (Turkey) Duzce (Turkey) Imperial Valley Loma Prieta Loma Prieta Northridge Northridge Duzce (Turkey) Northridge Imperial Valley Superstition Hills Duzce (Turkey) Imperial Valley Imperial Valley Imperial Valley Imperial Valley Livermore Superstition Hills Superstition Hills Morgan Hill Imperial Valley Morgan Hill

15/10/1979 15/10/1979 17/8/1999 28/6/1992 18/10/1989 1/10/1987 17/1/1994 17/1/1994 8/7/1986 17/1/1994 17/1/1994 17/1/1994 1/10/1987 25/4/1992 20/9/1999 20/9/1999 20/9/1999 20/9/1999 20/9/1999 20/9/1999 20/9/1999 13/3/1992 18/10/1989 18/10/1989 18/10/1989 17/1/1994 17/1/1994 17/1/1994 17/1/1994 23/11/1980 7/12/1988 17/8/1999 12/11/1999 12/11/1999 12/11/1999 12/11/1999 17/8/1999 6/6/2000 18/11/1997 15/6/1995 11/9/1976 4/7/1978 1/10/1995 17/8/1999 12/11/1999 15/10/1979 18/10/1989 18/10/1989 17/1/1994 17/1/1994 12/11/1999 17/1/1994 15/10/1979 24/11/1987 12/11/1999 15/10/1979 15/10/1979 15/10/1979 15/10/1979 27/1/1980 24/11/1987 24/11/1987 24/4/1984 15/10/1979 24/4/1984

6.9 6.9 7.8 7.4 7.1 5.7 6.7 6.7 6 6.7 6.7 6.7 5.7 7.1 7.6 7.6 7.6 7.6 7.6 7.6 7.6

23.8 28.7 144.6 128.3 28.2 25.2 25.4 30 43.3 13 6.4 12.3 10.8 9.5 2.94 10.04 4.01 7.31 11.14 10.33 5.92 2.0 12.7 14.4 14.5 7.1 8.9 14.6 6.2 39 20 29 18 113 98 94 80 158 54 138 7 15 0 5 0 43.6 16.1 77.4 30.9 36.9 17.6 32.7 54.1 18.2 8.2 7.6 4.2 1 1 3.6 13.9 13.3 12.8 12.6 3.4

225/315 012/282 090/180 000/270 090/180 000/090 177/267 020/110 270/360 000/270 090/360 000/090 048/318 000/090 N/W N/W N/W N/W N/W N/W N/W NS/EW 000/090 000/090 000/090 090/360 270/360 000/090 052/142 E-W/N-S E-W/N-S W-E/S-N E-W/N-S S-N/E-W 030/120 E-W/N-S E-W/N-S LONG/TRAN 261/351 065/155 E-W/N-S E-W/N-S W-E/S-N E-W/N-S W-E/S-N 262/352 000/090 180/270 155/245 090/180 000/090 090/180 075/345 225/315 180/270 002/092 140/230 140/230 140/230 270/360 000/090 090/180 270/360 140/230 150/240

0.128/0.078 0.27/0.254 0.06/0.049 0.057/0.046 0.247/0.215 0.221/0.124 0.357/0.206 0.474/0.439 0.144/0.132 0.41/0.482 0.604/0.843 0.303/0.443 0.426/0.443 0.59/0.662 0.251/0.202 0.393/0.742 0.162/0.134 0.821/0.653 0.44/0.353 0.13/0.147 0.188/0.148 0.515/0.496 0.367/0.322 0.555/0.367 0.529/0.443 0.583/0.59 0.753/0.939 0.877/0.64 0.612/0.897 0.047/0.048 0.183/0.183 0.129/0.091 0.8/0.745 0.022/0.021 0.018/0.016 0.042/0.041 0.114/0.11 0.004/0.004 0.053/0.054 0.013/0.013 0.105/0.23 0.099/0.115 0.319/0.273 0.244/0.296 0.513/0.377 0.238/0.351 0.417/0.212 0.195/0.244 0.465/0.322 0.29/0.264 0.728/0.822 0.103/0.186 0.122/0.167 0.156/0.116 0.348/0.535 0.213/0.235 0.485/0.36 0.519/0.379 0.41/0.439 0.258/0.233 0.358/0.258 0.172/0.211 0.224/0.348 0.364/0.38 0.156/0.312

7.1 7.1 7.1 6.7 6.7 6.7 6.7 6.9 6.7 7.6 7.2 7.2 7.2 7.2 7.6 6.1 6.6 6.5 5.5 6.4 7.6 7.2 6.9 7.1 7.1 6.7 6.7 7.3 6.7 6.9 6.6 7.3 6.9 6.9 6.9 6.9 5.5 6.6 6.6 6.1 6.9 6.1


References [29]

[1] ASCE/SEI 41-06. Seismic rehabilitation of existing buildings. ASCE (VA): American Society of Civil Engineers; 2009.
[2] EC8 (Eurocode 8). Design of structures for earthquake resistance - part 1: general rules, seismic actions and rules for buildings. European Committee for Standardization; 2005.
[3] ATC. Earthquake damage evaluation data for California. Redwood City (CA): Applied Technology Council; 1985 [ATC-13 Report].
[4] Anagnos T, Rojahn C, Kiremidjian AS. NCEER-ATC joint study on fragility of buildings. Technical Report NCEER 95–0003, State Univ. of New York at Buffalo: National Center for Earth. Eng. Research; 1995.
[5] Kappos AJ, Panagopoulos G, Panagiotopoulos C, Penelis G. A hybrid method for the vulnerability assessment of R/C and URM buildings. Bull Earthq Eng 2006;4(4):391–413.
[6] Tsang H-H, Ray KLS, Nelson TKL, Lo SH. Rapid assessment of seismic demand in existing building structures. Struct Des Tall Special Build 2009;18(4):427–39.
[7] Fausett L. Fundamentals of neural networks: architectures, algorithms and applications. Pearson; 1994.
[8] Haykin S. Neural networks and learning machines. 3rd ed. Prentice Hall; 2009.
[9] Williams MS, Sexsmith RG. Seismic indices for concrete structures: a state of the art review. Earthq Spectra 1995;11(2):319–49.
[10] Kappos AJ. Seismic damage indices for RC buildings: evaluation of concepts and procedures. Construction Research Communications Limited; 1997. p. 78–87. [ISSN 1365-0556].
[11] Adeli H, Yeh C. Perceptron learning in engineering design. Microcomput Civ Eng 1989;4(4):247–56.
[12] Adeli H. Neural networks in civil engineering: 1989–2001. Comput Aid Civ Infrastruct Eng 2001;16:126–42.
[13] Jegadesh SJS, Jayalekshmi S. A review on artificial neural network concepts in structural engineering applications. Int J Appl Civ Env Eng 2015;1(4):6–11.
[14] Stephens JE, VanLuchene RD. Integrated assessment of seismic damage in structures. Microcomput Civ Eng 1994;9:119–28.
[15] Molas G, Yamazaki F. Neural networks for quick earthquake damage estimation. Earthq Eng Struct Dyn 1995;24:505–16.
[16] Erkus B. Utilization of artificial neural networks in building damage prediction. Ankara: Middle East Technical University; 1999 [MSc Thesis].
[17] Huang CS, Hung SL, Wen CM, Tu TT. A neural network approach for structural identification and diagnosis of a building from seismic response data. Earthq Eng Struct Dyn 2003;32:187–206.
[18] Lautour OR, Omenzetter P. Prediction of seismic-induced structural damage using artificial neural networks. Eng Struct 2009;31:600–6.
[19] Arslan MH. An evaluation of effective design parameters on earthquake performance of RC buildings using neural networks. Eng Struct 2010;32(7):1888–98.
[20] Rofooei FR, Kaveh A, Farahani FM. Estimating the vulnerability of the concrete moment resisting frame structures using artificial neural networks. Int J Optim Civ Eng 2011;3:433–48.
[21] Caglar N, Garip ZS. Neural network based model for seismic assessment of existing RC buildings. Comput Concr 2013;12(2):1–18.
[22] Šipoš TK, Sigmund V, Hadzima-Nyarko M. Earthquake performance of infilled frames using neural networks and experimental database. Eng Struct 2013;51:113–27.
[23] Vafaei M, Adnan AB, Rahman ABA. Real-time seismic damage detection of concrete shear walls using artificial neural networks. J Earthq Eng 2013;17:137–54.
[24] Arslan MH, Ceylan M, Koyuncu T. Determining earthquake performances of existing reinforced concrete buildings by using ANN. Int J Civ Env Struct Constr Archit Eng 2015;9(8):930–4.
[25] Kramer SL. Geotechnical earthquake engineering. Prentice-Hall; 1996.
[26] Gunturi SKV, Shah HC. Building specific damage estimation. In: Proceedings of 10th world conference on earthquake engineering. Madrid: Rotterdam: Balkema; 1992. p. 6001–6.
[27] Naeim F. The seismic design handbook. 2nd ed. Boston: Kluwer Academic; 2011.
[28] Kaveh A, Iranmanesh A. Comparative study of backpropagation and improved

[30] [31] [32]

[33]

[34] [35] [36]

[37] [38] [39] [40]

[41] [42] [43] [44] [45] [46]

[47] [48] [49]

[50] [51] [52]

[53]

[54] [55]

141

counterpropagation neural nets in structural analysis and optimization. Int J Space Struct 1998;13(4):177–85. Iranmanesh A, Kaveh A. Structural optimization by gradient based-neural networks. Int J Numer Meth Eng 1999;46:297–311. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw 1989;2(5):359–66. Ripley BD. Pattern recognition and neural networks. Cambridge University Press; 1996. Avramidis I, Athanatopoulou A, Morﬁdis K, Sextos A, Giaralis A. Eurocode-compliant seismic analysis and design of r/c buildings: concepts, commentary and worked examples with ﬂowcharts. Geotechnical, geological and earthquake engineering. New York: Springer; 2016. Kostinakis K, Athanatopoulou A, Morﬁdis K. Correlation between ground motion intensity measures and seismic damage of 3D R/C buildings. Eng Struct 2015;82:151–67. Yakut A, Yilmaz H. Correlation of deformation demands with ground motion intensity. J Struct Eng 2008;134(12):1818–28. SeismoSoft. SeismoSignal v. 5.1.0; 2014 < http://www.seismosoft.com > . Morﬁdis K, Kostinakis K. Seismic parameters’ combinations for the optimum prediction of the damage state of R/C buildings using neural networks. Adv Eng Softw 2017;106:1–16. Elenas A, Meskouris K. Correlation study between seismic acceleration parameters and damage indices of structure. Eng Struct 2001;23:698–704. Theodoridis S, Koutroumbas K. Pattern recognition. 4th ed. Elsevier; 2008. Asht S, Dass R. Pattern recognition techniques: a review. Int J Comp Sci Telecommun 2012;3(8):25–9. Masi A, Vona M, Mucciarelli M. Selection of natural and synthetic accelerograms for seismic vulnerability studies on reinforced concrete frames. J Struct Eng 2011;137:367–78. European Strong-Motion Database; 2003 < http://www.isesd.hi.is/ESD_Local/ frameset.htm > . PEER (Paciﬁc Earthquake Engineering Research Centre). Strong motion database; 2003 < http://peer.berkeley.edu/smcat/ > . EC2 (Eurocode 2). 
Design of concrete structures, Part 1–1: general rules and rules for buildings. European Committee for Standardization; 2005. Paulay T, Priestley MJN. Seismic design of reinforced concrete and masonry buildings. New York: John Wiley and Sons; 1992. Otani A. Inelastic analysis of RC frame structures. J Struct Div ASCE 1974;100(7):1433–49. Carr AJ. Ruaumoko – a program for inelastic time-history analysis: program manual. New Zealand: Department of Civil Engineering, University of Canterbury; 2006. Matlab, Neural networks toolbox user guide; 2013. Raﬁq MY, Bugmann G, Easterbrook DJ. Neural network design for engineering applications. Comput Struct 2001;79:1541–52. Caruana R, Lawrence S, Giles L. Overﬁtting in neural nets: backpropagation, conjugate gradient, and early stopping. In: Proceedings of neural information processing systems. Denver (CO, USA); 2000. p. 402–8. Marquardt DW. An algorithm for least squares estimation of non-linear parameters. J Soc Ind Appl Math 1963;11(2):431–41. Moller MF. A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw 1993;6:525–33. Riedmiller M, Braun H. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: Proceedings of IEEE. San Francisco; 1993. p. 586–91. Mushgil HM, Alani HA, George LE. Comparison between resilient and standard back propagation algorithms eﬃciency in pattern recognition. Int J Sci Eng Res 2015;6(3):773–8. Flood I, Kartam N. Neural networks in civil engineering. I: Principles and understanding. J Comput Civil Eng 1994;8(2):131–48. Shahin MA, Jaksa MB, Maier HR. Recent advances and future challenges for artiﬁcial neural systems in geotechnical engineering applications. Adv Artif Neural Syst 2009:308239, doi:http://dx.doi.org/10.1155/2009/308239.

