Advances in Engineering Software 42 (2011) 85–93
A multi-output descriptive neural network for estimation of scour geometry downstream from hydraulic structures

Aytac Guven
University of Gaziantep, Department of Civil Engineering, 27310 Gaziantep, Turkey
Article info

Article history: Received 16 June 2010; Received in revised form 9 September 2010; Accepted 19 December 2010; Available online 15 January 2011

Keywords: Scour; Descriptive neural networks; Modelling; Regression; River hydraulics; Grade-control structures
Abstract

Several researchers have attempted to estimate the maximum depth and location of local scour, mostly based on conventional regression analysis. Many of the resulting equations in the literature fail to estimate scour depths satisfactorily. This study presents an explicit formulation extracted from a multi-output descriptive neural network (DNN), which estimates both the depth and the location of maximum scour. The DNN method extracts the rules (information) conveyed from the input layer to the output layer of a NN with two outputs. The DNN results are compared with non-linear and linear regression equations derived by the author and with selected empirical equations available in the literature. The results show that the proposed DNN estimates the maximum-scour depth and its location in close agreement with the measured values (R2 = 0.819 and 0.907, respectively), and clearly better than the other equations (R2 = 0.687 and 0.706 being the highest results for dm and xm, respectively). This study shows that the explicit formulation extracted from the DNN can replace the conventional regression equations with much better accuracy.

© 2010 Elsevier Ltd. All rights reserved.
1. Introduction

Local scour is the erosion of the bed surface due to the impact of water flowing over or through hydraulic structures in rivers. Grade-control structures are built to prevent excessive channel-bed degradation in alluvial channels. However, local scour occurs downstream from grade-control structures due to the erosive action of the weir overflow, and this additional turbulence may undermine the structures [13]. Thus, the structural design of grade-control structures must be based on a comprehensive understanding of the mechanics, location and extent of the downstream scour, and must include sufficient protective provisions to minimize local scour [18]. Local scour downstream from grade-control structures is the specific case considered in this study. Although these structures are built to prevent excessive bed degradation in alluvial channels, they also cause scouring due to the erosive action of the water flowing over the structure. Bormann and Julien [13] investigated scour-hole development downstream from sharp-crested grade-control structures having a weir width bw and height z, as shown in Fig. 1. They observed that as the water flows over the weir and enters the tailwater yt, the flow separates from the structure at the crest and a vortex forms in the separation zone. The diffused flow velocity in the vicinity of the impingement point exerts a shear stress on the bed-sediment particles. When the applied shear stress exceeds the critical shear stress, sediment
can be dislodged and transported beyond the impingement zone (Point B in Fig. 1) [13]. The process continues until the rate of scour becomes zero. This state is generally characterized by the maximum-scour depth dm and its location xm. In fact, almost all experimental studies on local scour aim to characterize the local scour, dm, and estimate it from the incoming flow parameters and the scour geometry characteristics [29]. Due to the complexity of the scouring process and the difficulties raised by the various turbulent-flow conditions, most studies dealing with local scour have been carried out in laboratory settings. A detailed literature review of these experimental studies can be found in Mason and Arumugam [28], Bormann and Julien [13], and Sarkar and Dey [30]. The majority of experimental studies have aimed to derive empirical relations among the scour parameters (depth and location of maximum scour) based on conventional regression analysis [11–14,16,17]. The existing formulae perform satisfactorily on the corresponding experimental data. However, their major drawback is that they involve assumptions of ideal conditions, rough approximation, and averaging of widely varying prototype conditions [8]. Another issue is the scale problem: almost all existing formulae have been derived by regression analysis on experimental data from small-scale (dm < 0.1 m) laboratory studies. The scale problem arises from the limitations of laboratory conditions, because large-scale (dm > 0.1 m) physical modelling of scouring around hydraulic structures requires a larger working space than is usually available, as do field studies of scour at hydraulic structures [13]. This means that the
Fig. 1. Sketch of local scour downstream of sharp-crested grade-control structure.
existing formulae are not capable of extrapolating to larger-scale scour parameters. Additionally, the scouring process is assumed to be two-dimensional or quasi two-dimensional in these laboratory studies, neglecting side-wall and secondary-flow effects [21]. However, the scour process in natural or constructed channels is much more complex and involves many hydraulic factors that are neglected or simplified in laboratory studies. Accordingly, the formulae derived from these experimental data suffer from poor generalization to larger-scale scour estimation, and they generally ignore some important hydraulic parameters influencing the scour process [18–20]. In the last two decades, new soft-computing techniques have been proposed as alternatives to conventional regression analysis and numerical (finite volume, finite element, etc.) methods. Soft-computing techniques have been widely used and well validated in the estimation of water-resources variables, primarily using laboratory data [2,11,17,26,29]. Neural networks (NNs), a branch of soft-computing techniques, have been successfully applied to the estimation of hydraulic processes. NNs are relatively stable with respect to noise in the data and have a good generalization potential for representing input-output relationships [16]. NNs involve a structure in which non-linear functions are present, and a parameter identification process based on techniques that search for global optima in the space of feasible parameter values. Thus, NNs can represent the non-linear effects associated with scour processes. These features make the NN an intelligent tool for formulating the maximum depth of scour downstream from hydraulic structures. On the other hand, NNs have a known weakness in extrapolation, but this can be overcome to a certain extent by increasing the generalization capacity of the model, as explained in later sections. Due to their black-box nature, neural networks are sometimes not classified as data-mining tools for discovering interesting and understandable patterns. However, once a NN model is trained for its generalization properties, it can be argued that the trained model represents the physical process of the system. The knowledge acquired about the problem domain during the training process is encoded within the NN in two forms: (a) in the network architecture itself (through the number of hidden units), and (b) in a set of constants or weights [34]. Lange [24] states that ANNs are black-box models that only develop the relation between input and output variables without modelling any physical processes. It must be realized, however, that the data employed in developing black-box models contain important information about the physical processes being modelled, and this information becomes embedded or captured inside the model [9]. Although several attempts in other scientific branches have shown that useful information can be obtained from trained neural networks (see Yao
[34] for relevant references), limited research has been done for water-engineering applications [3,9,16,18,19]. An increasing number of studies in the literature use NNs for the estimation of local scour downstream from hydraulic structures. Liriano and Day [25] estimated the scour depth at culvert outlets, and Azinfar et al. [5] estimated the scour depth downstream of a sluice gate. Azamathulla et al. [6–10] estimated the scour downstream of a ski jump and below spillways using NNs. Guven and Gunal [18] proposed a NN model for the estimation of local scour downstream of grade-control structures, based on a wide range of experimental data from other researchers. Guven and Gunal [19] studied the estimation of local scour downstream of grade-control structures with a new soft-computing technique: Gene-Expression Programming. This study focuses on modelling two important scour parameters in a single NN model; namely, a practical mathematical formulation is presented using a two-output NN model. Although Azamathulla et al. [6,7] described the inner structure of a NN and its basic model equations, they did not provide the explicit form of their proposed models. This study directly aims to derive the knowledge (explicit formulation) from a well-trained and tested NN model. In the present study, dimensional analysis and experimental scour data are used to establish a non-dimensional relationship describing the size and extent of the scour profile. The dependent quantities are the normalized maximum-scour depth, dm/z, and the normalized location of the maximum-scour depth, xm/z. The experimental measurements were taken from the large-scale model study carried out by Bormann and Julien [13]. The non-dimensional parameters derived by dimensional analysis, namely the densimetric Froude number (Frd), the normalized tailwater depth (yt/z) and the sediment uniformity ratio d90/d50 (d50 and d90 represent the bed grain sizes for which 50% and 90% of the sampled particles are finer (m), respectively), were used as inputs in the training phase of the proposed descriptive neural network (DNN) model, and the corresponding dm/z and xm/z were used as the outputs. The derivation of the DNN is also explained in this study.
2. Overview of descriptive neural networks

DNN techniques involve three basic stages. First, a neural-network forecasting model is built by training a neural network on the available data. This stage contains all the manual procedural components involved in the construction of neural networks, such as data preprocessing, input/output selection, sensitivity analysis, data organization, model construction, post-analysis and model recommendation. The second stage is to extract rules (a formulation) from the well-trained NN: the NN architecture (number of hidden neurons) and the model weights are decoded to obtain the rules that govern the
forecasting. By extracting the hidden information from the previously constructed NN, the mechanism of the forecasting NN model can be better defined. In the third stage, the formulation extracted in the previous stage is incorporated into the network generated in the first stage to form a DNN. Researchers generally use if-then type association rules [34]. In this study, the information hidden in a well-trained NN is extracted as linear regression equations embedded in the input layer and logistic functions embedded in the output layer of the NN architecture. In this way, the author aims to present the proposed DNN as a regression-based explicit formulation that is an alternative to conventional regression equations. Although the basics of NNs have been sufficiently described in previous studies, the author finds it necessary to remind the reader, especially the non-specialised reader, of the most important elements of NNs in order to make the explicit neural-network formulation comprehensible. A NN is a data-processing tool that mimics the function of the human brain and nerves, built on so-called neurons (processing elements) connected to each other. Artificial neurons are organized in such a way that the structure resembles a network. This technique differs from traditional data processing in that it learns the relationship between the input and output data [23]. A multilayer neural network model, which is the type considered in this study, usually consists of three layers: input, hidden, and output layers. The input layer constitutes the input nodes representing the input variables. The outputs of the input nodes are normalized and transferred to the hidden layer, in which they are processed through a transfer function. The output layer consists of the output variables. The basic element of a NN is the artificial neuron, which consists of three main components: weights, a bias, and an activation function. Each neuron receives inputs xi (i = 1, 2, ..., n), each attached to a weight wij that expresses the connection strength of that particular input. Every input is multiplied by the corresponding weight of the neuron connection and the products are summed as
Wi = Σ_{j=1}^{n} wij xj    (1)

A bias bi, a type of correction weight with a constant non-zero value, is added to the summation in Eq. (1) as

Ui = Wi + bi    (2)

In other words, Wi in Eq. (1) is the weighted sum of the ith neuron for the input received from the preceding layer with n neurons, wij is the weight between the ith neuron in the hidden layer and the jth neuron in the preceding (input) layer, and xj is the output of the jth neuron in the input layer. After being corrected by the bias (constant) bi as in Eq. (2), the summation is transferred through a scalar-to-scalar function called an "activation or transfer function", f(Ui), to yield a value called the unit's "activation":

yi = f(Ui)    (3)

Activation functions introduce nonlinearity into NNs, which makes them more powerful than standard linear transformations. In this study, the most commonly used sigmoid transfer function, y = 1/(1 + e^(-x)), is utilised, although other types are available, such as tangent-hyperbolic, Gaussian, and linear functions. Readers interested in more information can consult textbooks on neural computing, e.g. [22,23], or refer to previously published works in related journals [4,26].

3. Scour data measurements of Bormann and Julien [13]

Experimental measurements of Bormann and Julien [13] (82 data sets) are used as the Training, Cross-Validation, and Testing sets of the proposed multilayered NN model. They worked with a large-scale outdoor flume that had an overall depth of 3.5 m, an overall length of 27.4 m, and a width of 0.91 m. The structural slope was set at 3H:1V. The experimental discharges, Q, ranged between 0.3 and 2.5 m3/s and produced maximum-scour depths of up to 1.40 m. The tests of Bormann and Julien [13] are characterized by a sediment size range of 1.5 < d90/d50 < 5.3. Other details of the experimental study can be found in Bormann and Julien [13].

The physical relationship for scour downstream from a sharp-crested weir due to the erosive action of impinging water (see Fig. 1) is given in Eq. (4), and the functional relationship obtained by dimensional analysis of the physical parameters in Eq. (4) is given in Eq. (5). In the present study, the dimensionless parameters given in Eq. (5) were used as the input and output parameters in the multilayered NN modelling. Table 1 lists the ranges of the input and output parameters, i.e. the maximum and minimum values of the dimensionless groups used in this study.

dm, xm = f(z, bw, h, q, ρs, ρ, g, d50, d90)    (4)

dm/z, xm/z = f(Frd, yt/z, d90/d50)    (5)

Table 2 shows the correlation matrix obtained from multiple linear regression analysis over the possible pairs of groups, considering dm/z and xm/z as the dependent variables. The numbers given in the table show the degree of correlation between the input and dependent parameters. Table 2 shows that yt/z and d90/d50 have considerable correlation with dm/z and xm/z, whereas Frd (= Q/(bw z [((ρs - ρ)/ρs) g d50]^0.5)) has relatively poor correlation. However, when a NN model was developed with Frd as a single input, correlations of R2 = 0.641 and 0.691 were obtained for the dm/z and xm/z outputs, respectively.
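As a brief illustration of how the dimensionless groups of Eq. (5) and the correlation matrix of Table 2 could be assembled from the raw measurements, a minimal NumPy sketch is given below. The function name, the assumed water and sediment densities, and the use of NumPy are choices made for this sketch only; Frd is evaluated from the definition reproduced above.

```python
import numpy as np

g = 9.81                        # gravitational acceleration (m/s^2)
rho, rho_s = 1000.0, 2650.0     # water and sediment densities (assumed values)

def dimensionless_groups(Q, bw, z, yt, d50, d90, dm, xm):
    """Form the groups of Eq. (5) from raw measurements (arrays of equal length)."""
    # Densimetric Froude number, following the definition reproduced above
    frd = Q / (bw * z * (((rho_s - rho) / rho_s) * g * d50) ** 0.5)
    return np.column_stack([frd, yt / z, d90 / d50, dm / z, xm / z])

# With `data` holding the five columns (Frd, yt/z, d90/d50, dm/z, xm/z) for the
# 82 records, the correlation matrix of Table 2 follows directly:
#   table2 = np.corrcoef(data, rowvar=False)
```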
4. Sensitivity analysis on scour parameters

In order to ascertain how dm/z and xm/z depend upon the input parameters (Frd, yt/z, d90/d50), a sensitivity analysis on these parameters was carried out. First, a NN model was developed based on the three inputs; then, for the subsequent runs, each input was removed from the input set and new models were computed. The results of the sensitivity analysis are given in Table 3 (an illustrative code sketch of the procedure follows Table 3). It is clear that removing any of the three input parameters worsened the performance of the corresponding NN model in terms of the coefficient of determination (R2) and the mean squared error (MSE). In particular, the NN model without Frd gave the worst performance among the 2-input sets (R2 = 0.619, MSE = 10.858 for dm/z and R2 = 0.483, MSE = 385.534 for xm/z). The NN models with a single input failed to model the two outputs simultaneously, as seen from the table, yielding very high MSE values and also some negative estimations, which are not acceptable. It can be deduced from the overall performance of the developed NN models that each input parameter has a significant influence on the scour parameters dm/z and xm/z.

Table 1. Minimum and maximum values of the input and output variables of the Bormann and Julien [13] experiments (82 data).

Dimensionless variable   Minimum value   Maximum value
Frd                      1.026           1.921
yt/z                     2.395           25.60
d90/d50                  3.800           5.267
dm/z                     0.368           25.20
xm/z                     1.605           158.6
Table 2. Correlation matrix for the Bormann and Julien [13] experiments (82 data).

Parameter   Frd     yt/z    d90/d50   dm/z    xm/z
Frd         1
yt/z        0.381   1
d90/d50     0.259   0.470   1
dm/z        0.097   0.695   0.556     1
xm/z        0.048   0.765   0.581     0.931   1
Table 3. Results of the sensitivity analysis based on different input sets.

                         Output = dm/z         Output = xm/z
Input set                R2      MSE           R2      MSE
Frd, yt/z, d90/d50       0.964   0.637         0.974   11.724
Frd, yt/z                0.858   3.976         0.919   27.136
Frd, d90/d50             0.888   3.210         0.952   35.367
yt/z, d90/d50            0.619   10.858        0.483   385.534
Frd                      0.641   23.641        0.691   664.733
yt/z                     0.001   52.894        0.624   1668.49
d90/d50                  0.421   15.996        0.396   447.174
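For illustration, the input-subset sensitivity analysis summarized in Table 3 can be mimicked with a standard multi-layer perceptron. The sketch below uses scikit-learn's MLPRegressor as a stand-in for the Levenberg-Marquardt-trained network used in this study (which scikit-learn does not provide) and placeholder data drawn from the ranges of Table 1, so it illustrates the procedure only and will not reproduce the numbers in Table 3.

```python
import numpy as np
from itertools import combinations
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)
# Placeholder records sampled from the ranges in Table 1; replace with the
# 82 experimental records of Bormann and Julien [13] to repeat the analysis.
X = rng.uniform([1.026, 2.395, 3.800], [1.921, 25.60, 5.267], size=(82, 3))
Y = rng.uniform([0.368, 1.605], [25.20, 158.6], size=(82, 2))
names = ["Frd", "yt/z", "d90/d50"]

X_tr, X_cv, Y_tr, Y_cv = train_test_split(X, Y, test_size=21, random_state=0)

for r in (3, 2, 1):                              # 3-input, 2-input and 1-input sets
    for cols in combinations(range(3), r):
        cols = list(cols)
        net = MLPRegressor(hidden_layer_sizes=(7,), activation="logistic",
                           solver="lbfgs", max_iter=5000, random_state=0)
        net.fit(X_tr[:, cols], Y_tr)
        pred = net.predict(X_cv[:, cols])
        print([names[c] for c in cols],
              "R2:", np.round(r2_score(Y_cv, pred, multioutput="raw_values"), 3),
              "MSE:", np.round(mean_squared_error(Y_cv, pred,
                                                  multioutput="raw_values"), 3))
```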
Fig. 2. Optimal architecture of the proposed NN model.
5. Results of training, cross-validation, and testing of the NN model
Fig. 3. Estimated and experimental dm/z for the Training, Cross-Validation and Testing sets (Training R2 = 0.990; Cross-Validation R2 = 0.972; Testing R2 = 0.964).
The data of Bormann and Julien [13] were randomly split into Training, Cross-Validation, and Testing sets. First, different NN models were developed and their performance was evaluated based on their estimations for these data sets; the DNN modelling is described in the next section. As is widely known, one of the main issues in NN modelling is the generalization capacity of the model: increasing the number of hidden neurons leads to over-fitting on the Training set (loss of generalization) and poor performance on the Testing set. For this reason, a portion of the Training set was used for Cross-Validation in order to control and avoid over-fitting of the model. Thus, 21 of the 82 data sets were reserved for Cross-Validation and 40 sets were used for model training. The remaining 21 data sets were used for testing. Multi-layer feed-forward neural networks with the back-propagation learning algorithm were used in this study. The model parameters were optimised by the Levenberg-Marquardt algorithm, which is one of the most common and successful back-propagation algorithms. Another important issue is finding the optimal architecture of the NN model. Most studies in the literature use a trial-and-error approach, which generally leads to local optima. In this study, this issue was addressed by using a Genetic Algorithm to find the optimal architecture of the proposed NN model: a fitness function was defined based on the MSE of the Cross-Validation set, and the program searched for the architecture with the least Cross-Validation MSE. The optimal architecture of the proposed NN was found to be 3-7-2 (number of inputs, hidden neurons, and outputs), as shown in Fig. 2. The overall performance on the three sets was evaluated by the MSE and R2. The training results showed that the NN learned the highly non-linear relationship between the local scour parameters and the outputs with high correlations (R2 = 0.990 for dm/z and R2 = 0.993 for xm/z) and low errors (MSE = 0.327 for dm/z and MSE = 3.251 for xm/z). Validation of the trained model proved the high generalization capacity of the proposed model, with high correlation and low error (R2 = 0.964, MSE = 0.637 for dm/z and R2 = 0.974, MSE = 11.724 for xm/z). As seen in Figs. 3 and 4, almost all estimations of the NN fall on the line of perfect agreement, which shows very good agreement with the observed values. In particular, for high dm and xm values, R2 was observed to be 0.99 and MSE less than 0.001 for the NN estimations (see Figs. 3 and 4).
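The data split and the cross-validation-based selection of the network architecture described above can be sketched as follows. A plain search over the number of hidden neurons is used here as a simple stand-in for the Genetic Algorithm search, and scikit-learn's MLPRegressor replaces the Levenberg-Marquardt-trained network, so this is an outline of the procedure rather than a reproduction of the reported 3-7-2 model; the placeholder arrays are of the same kind used in the earlier sketch.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform([1.026, 2.395, 3.800], [1.921, 25.60, 5.267], size=(82, 3))  # placeholder
Y = rng.uniform([0.368, 1.605], [25.20, 158.6], size=(82, 2))                # placeholder

idx = rng.permutation(82)
tr, cv, te = idx[:40], idx[40:61], idx[61:]   # 40 training, 21 cross-validation, 21 testing

best = None
for n_hidden in range(2, 13):                 # candidate 3-n-2 architectures
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="logistic",
                       solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(X[tr], Y[tr])
    cv_mse = mean_squared_error(Y[cv], net.predict(X[cv]))
    if best is None or cv_mse < best[0]:      # keep the architecture with least CV MSE
        best = (cv_mse, n_hidden, net)

print("selected architecture: 3-%d-2, CV MSE = %.3f" % (best[1], best[0]))
print("test MSE = %.3f" % mean_squared_error(Y[te], best[2].predict(X[te])))
```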
Fig. 4. Estimated and experimental xm/z for the Training, Cross-Validation and Testing sets (Training R2 = 0.988; Cross-Validation R2 = 0.994; Testing R2 = 0.974).
The overall results show that the proposed NN architecture is well suited for the next steps of the DNN methodology.

6. Derivation of the DNN model

The model parameters of the DNN were obtained from the trained NN, and the explicit formulation is derived using the weights of the trained NN model. As mentioned earlier, the proposed NN model involves three input nodes, seven hidden nodes, and two output nodes. First, the inputs and outputs are normalized before the learning process of the NN. For the input layer, each input is multiplied by a connection weight, and the products and biases are summed (Eqs. (8a)-(8g)). In the hidden layer, these sums are transformed through a transfer function (sigmoid) to generate the hidden-node outputs. Finally, for the output layer, the two outputs are obtained by passing the weighted sums of the hidden-node outputs through the activation function (y = 1/(1 + e^(-x))), as given in the denominators of Eqs. (6) and (7), and de-normalising the result. As mentioned earlier, the goal is to obtain a single formulation that estimates both dm/z and xm/z using a single DNN architecture, in a functional form in terms of the measured dimensionless variables:

dm/z = 27.591 / (1 + e^(-S1)) - 1.012    (6)

xm/z = 174.439 / (1 + e^(-S2)) - 7.117    (7)

where

S1 = 1.605/(1 + e^(-U1)) + 42.203/(1 + e^(-U2)) + 18.907/(1 + e^(-U3)) + 21.019/(1 + e^(-U4)) + 2.233/(1 + e^(-U5)) + 40.714/(1 + e^(-U6)) + 2.957/(1 + e^(-U7)) - 2.843

S2 = 1.229/(1 + e^(-U1)) + 38.857/(1 + e^(-U2)) + 14.323/(1 + e^(-U3)) + 16.096/(1 + e^(-U4)) + 2.002/(1 + e^(-U5)) + 37.948/(1 + e^(-U6)) + 11.591/(1 + e^(-U7)) - 2.487

and the values of Ui (i = 1, ..., 7) are given as

U1 = 24.923 Frd - 1.757 yt/z - 12.620 d90/d50 + 91.155    (8a)

U2 = 3.231 Frd - 0.403 yt/z - 4.622 d90/d50 + 32.670    (8b)

U3 = 3.297 Frd - 0.805 yt/z - 4.053 d90/d50 + 24.711    (8c)

U4 = 2.537 Frd - 0.607 yt/z - 3.267 d90/d50 + 19.587    (8d)

U5 = 11.581 Frd + 1.386 yt/z - 3.006 d90/d50 + 27.594    (8e)

U6 = 5.278 Frd - 0.535 yt/z - 4.378 d90/d50 + 35.018    (8f)

U7 = 26.416 Frd + 1.346 yt/z + 12.622 d90/d50 - 125.676    (8g)
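For readers who wish to apply the extracted formulation directly, the sketch below evaluates Eqs. (6)-(8g) exactly as reproduced above. The function name and the use of NumPy are choices made for this illustration only, and the inputs should lie within the experimental ranges of Table 1.

```python
import numpy as np

def dnn_scour(frd, yt_z, d90_d50):
    """Explicit DNN formulation: returns (dm/z, xm/z) from Frd, yt/z and d90/d50."""
    # Hidden-node arguments, Eqs. (8a)-(8g)
    U = np.array([
        24.923 * frd - 1.757 * yt_z - 12.620 * d90_d50 + 91.155,    # (8a)
        3.231 * frd - 0.403 * yt_z - 4.622 * d90_d50 + 32.670,      # (8b)
        3.297 * frd - 0.805 * yt_z - 4.053 * d90_d50 + 24.711,      # (8c)
        2.537 * frd - 0.607 * yt_z - 3.267 * d90_d50 + 19.587,      # (8d)
        11.581 * frd + 1.386 * yt_z - 3.006 * d90_d50 + 27.594,     # (8e)
        5.278 * frd - 0.535 * yt_z - 4.378 * d90_d50 + 35.018,      # (8f)
        26.416 * frd + 1.346 * yt_z + 12.622 * d90_d50 - 125.676,   # (8g)
    ])
    h = 1.0 / (1.0 + np.exp(-U))                  # hidden-node outputs (sigmoid)

    # Output-layer weighted sums S1 and S2 of Eqs. (6) and (7)
    s1 = np.dot([1.605, 42.203, 18.907, 21.019, 2.233, 40.714, 2.957], h) - 2.843
    s2 = np.dot([1.229, 38.857, 14.323, 16.096, 2.002, 37.948, 11.591], h) - 2.487

    # Outer sigmoid and de-normalisation, Eqs. (6) and (7)
    dm_z = 27.591 / (1.0 + np.exp(-s1)) - 1.012
    xm_z = 174.439 / (1.0 + np.exp(-s2)) - 7.117
    return dm_z, xm_z

# Example call with illustrative values inside the ranges of Table 1
print(dnn_scour(1.5, 10.0, 4.5))
```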
7. Regression analysis on scour data

Most of the studies on the estimation of local scour parameters have applied non-linear regression analysis to experimental data and optimised the parameters of their pre-defined equations on the corresponding data [13,15,28,30]. These empirical equations are generally of power-relation form (y = ax^b). In this section, multiple non-linear regression (MNLR) and multiple linear regression (MLR) techniques are applied to the same data sets used in the NN modelling. First, a simple linear regression relation was developed between the input and output parameters using a least-squares method for model calibration. The fit of a linear regression equation to the experimental data set yielded the following equations:

dm/z = 3.929 Frd + 0.817 (yt/z) + 2.682 (d90/d50) - 15.031    (9)

xm/z = 14.945 Frd + 4.677 (yt/z) + 13.707 (d90/d50) - 71.56    (10)

Secondly, a power-type non-linear relation was assumed among the non-dimensional local scour parameters, and the model parameters were optimised using the Levenberg-Marquardt algorithm, which resulted in the following equations:

dm/z = 0.193 (Frd)^0.838 (yt/z)^0.931 (d90/d50)^1.217    (11)

xm/z = 0.78 (Frd)^0.658 (yt/z)^0.991 (d90/d50)^1.334    (12)

The statistical performance of Eqs. (9)-(12) is discussed in Section 9.
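The regression alternatives of Section 7 can be coded in the same way; the short sketch below simply evaluates Eqs. (9)-(12) (the function names are illustrative only), so that they can be compared with the DNN formulation above on the same inputs.

```python
def mlr_dm_z(frd, yt_z, d90_d50):
    """Multiple linear regression for dm/z, Eq. (9)."""
    return 3.929 * frd + 0.817 * yt_z + 2.682 * d90_d50 - 15.031

def mlr_xm_z(frd, yt_z, d90_d50):
    """Multiple linear regression for xm/z, Eq. (10)."""
    return 14.945 * frd + 4.677 * yt_z + 13.707 * d90_d50 - 71.56

def mnlr_dm_z(frd, yt_z, d90_d50):
    """Multiple non-linear (power-type) regression for dm/z, Eq. (11)."""
    return 0.193 * frd ** 0.838 * yt_z ** 0.931 * d90_d50 ** 1.217

def mnlr_xm_z(frd, yt_z, d90_d50):
    """Multiple non-linear (power-type) regression for xm/z, Eq. (12)."""
    return 0.78 * frd ** 0.658 * yt_z ** 0.991 * d90_d50 ** 1.334
```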
8. Other local scour equations

Mason and Arumugam [28] provide a detailed review of the empirical equations used for estimating the maximum-scour depth downstream from grade-control structures. They also studied the performance of past empirical studies and proposed a general power form for these scour equations:

dm = K q^a U0^b ΔH^c yt^d β0^e / (g^f ds^i) - z    (13)

where a, b, c, d, e, f and i are the exponents and K the coefficient of the scour equation, ΔH is the drop across the structure (m), U0 is the jet velocity impinging on the tailwater (m/s), g is the gravitational acceleration (m/s2), β0 is the jet angle near the bed (rad) and ds is a representative bed grain size (m). Based on the relationship in Eq. (13), Martins [27], Chee and Yuen [14], Bormann [12], and Mason and Arumugam [28] proposed Eqs. (14)-(17), respectively. They observed that their equations estimated the model data much better when ds was taken as the mean bed grain size rather than the median bed grain size.

dm = 1.5 q^0.6 ΔH^0.1 - z    (14)

dm = 0.6 q^0.45 U0^0.55 β0 - z    (15)

dm = 0.7 q^0.45 U0 yt^0.12 β0^0.66 / (g^0.73 ds^0.3) - z    (16)

dm = 3.27 q^0.6 ΔH^0.05 yt^0.15 / (g^0.3 ds^0.1) - z    (17)

Bormann and Julien [13] used their measured data to calibrate a semi-theoretical equation that estimates the maximum-scour depth under a wide range of conditions: wall jets, vertical jets, free jets, submerged jets and flow over large-scale grade-control structures.
Table 4. Experimental and estimated dm for Test Set data from Bormann and Julien [13].

Test set | Experimental dm (m) | Estimated dm (m): Martins [27] | Chee and Yuen [14] | Mason and Arumugam [28] | Bormann [12] | Bormann and Julien [13] (Eq. (18)) | D'Agostino and Ferro [15] (Eq. (20)) | MNLR (Eq. (11)) | MLR (Eq. (9)) | DNN (Eq. (6))
1   | 1.12 | 2.50 | 0.32 | 3.11 | 0.37 | 0.27 | 0.17 | 0.31 | 1.24 | 1.15
2   | 1.40 | 2.52 | 0.25 | 2.94 | 0.43 | 0.32 | 0.18 | 0.18 | 1.33 | 1.25
3   | 1.16 | 2.13 | 0.11 | 2.47 | 0.29 | 0.21 | 0.16 | 0.25 | 1.28 | 1.19
4   | 0.72 | 2.05 | 0.05 | 2.67 | 0.15 | 0.10 | 0.14 | 0.68 | 1.33 | 0.62
5   | 1.32 | 2.48 | 0.39 | 2.94 | 0.63 | 0.48 | 0.17 | 0.64 | 1.89 | 1.26
6   | 0.97 | 2.41 | 0.27 | 2.79 | 0.35 | 0.26 | 0.20 | 0.10 | 0.91 | 0.97
7   | 1.39 | 2.51 | 0.14 | 2.82 | 0.37 | 0.27 | 0.21 | 1.35 | 0.86 | 1.19
8   | 0.27 | 1.00 | 0.15 | 1.15 | 0.04 | 0.02 | 0.10 | 1.48 | 0.12 | 0.21
9   | 0.10 | 0.65 | 0.14 | 0.65 | 0.05 | 0.03 | 0.08 | 2.62 | 0.12 | 0.20
10  | 0.58 | 2.29 | 0.09 | 2.94 | 0.21 | 0.15 | 0.17 | 1.04 | 0.19 | 0.49
11  | 1.05 | 1.70 | 0.10 | 1.89 | 0.26 | 0.18 | 0.15 | 1.85 | 0.99 | 0.87
12  | 0.64 | 0.83 | 0.21 | 0.84 | 0.14 | 0.09 | 0.10 | 1.16 | 1.33 | 0.78
13  | 0.18 | 0.45 | 0.21 | 0.46 | 0.11 | 0.07 | 0.07 | 0.73 | 1.30 | 0.79
14  | 0.36 | 0.72 | 0.03 | 0.67 | 0.07 | 0.04 | 0.09 | 0.32 | 0.28 | 0.34
15  | 0.53 | 1.15 | 0.16 | 1.11 | 0.15 | 0.11 | 0.13 | 0.11 | 0.36 | 0.59
16  | 0.21 | 0.65 | 0.04 | 0.63 | 0.11 | 0.07 | 0.08 | 0.65 | 0.63 | 0.34
17  | 0.52 | 0.99 | 0.04 | 0.99 | 0.18 | 0.12 | 0.11 | 0.62 | 0.71 | 0.79
18  | 0.47 | 2.55 | 0.26 | 3.05 | 0.22 | 0.16 | 0.22 | 0.18 | 0.81 | 0.48
19  | 0.59 | 2.52 | 0.19 | 3.12 | 0.28 | 0.21 | 0.19 | 0.54 | 0.54 | 0.55
20  | 0.57 | 1.01 | 0.03 | 1.06 | 0.16 | 0.11 | 0.11 | 0.47 | 0.51 | 0.51
21  | 0.97 | 0.57 | 0.04 | 0.51 | 0.11 | 0.07 | 0.08 | 0.52 | 0.87 | 0.91
R2  |      | 0.456 | 0.350 | 0.400  | 0.687  | 0.674 | 0.388  | 0.677  | 0.415  | 0.819
MSE |      | 1.151 | 0.548 | 1.983  | 0.332  | 0.414 | 0.476  | 0.672  | 0.157  | 0.030
AIC |      | 8.900 | 1.58  | 26.326 | 11.228 | 6.634 | 3.657  | 47.987 | 25.694 | 61.921
MAE |      | 1.687 | 1.067 | 2.015  | 0.666  | 0.771 | 0.740  | 0.488  | 0.840  | 0.341
Their equation is remarkably similar to the regression-based equations proposed in the literature:

dm = [0.611 q^0.6 U0 sin β0 / ((sin(0.436 + β0))^0.8 g^0.8 d90^0.4)] - z    (18)

The angle β0 (Fig. 1) was experimentally inferred by Bormann and Julien [13] as:

β0 = 0.316 sin λ + 0.15 ln((z + y0)/y0) + 0.13 ln(yt/y0) - 0.05 ln(U0/(g y0)^0.5)    (19)
where λ is the downstream face angle of the grade-control structure (rad) and y0 is the water depth at the crest (m). It can be deduced from Eq. (18) that it is applicable to cases for which the drop height z is less than the term within the square brackets of Eq. (18).

D'Agostino and Ferro [15] updated the formulas for estimating the maximum-scour depth given in the literature and improved the estimation of the maximum-scour depth by introducing variables representative of both the jet and the contraction and of the bed-particle grain-size distribution. The authors used Incomplete Self-Similarity theory to deduce the dimensionless groups controlling the geometrical pattern of the scour hole:

dm/z = 0.54 (bw/z)^0.593 (yt/H)^0.126 (A50)^0.544 (d90/d50)^0.856 (bw/B)^0.751    (20)

xm/z = 1.616 (bw/z)^0.662 (yt/H)^0.117 (A90)^0.455 (bw/B)^0.478    (21)

where B is the channel width (m), and A50 and A90 are obtained by dimensional analysis:

A50 = Q / (bw z [g d50 (ρs - ρ)/ρ]^(1/2))    (22)

A90 = Q / (bw z [g d90 (ρs - ρ)/ρ]^(1/2))    (23)

9. Comparison of the DNN models with regression equation models

9.1. Performance criteria

It is important to define the criteria by which the performance of a model and its estimation accuracy are evaluated in the model-development process. Various statistical measures have been developed and used to assess a model's performance. In the present case, however, error measures were required that evaluate the performance of the compared models with respect to model size as well as accuracy in estimating the test data, because the number of fitted parameters in a NN model is naturally much larger than in a regression-based model. Thus, Akaike's Information Criterion (AIC), proposed by Akaike [1], the coefficient of determination (R2), the mean squared error (MSE), and the mean absolute error (MAE) were employed to compare the proposed DNN with the other regression equations on the Testing Set. The AIC (Eq. (24)) measures the trade-off between testing performance and model size (number of model weights), and the goal is to minimize the AIC to determine the model with the best generalization.

AIC = N ln(MSE) + 2k    (24)

MAE = (1/N) Σ |(ei - pi)/ei|    (25)
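The performance criteria of Eqs. (24) and (25) are straightforward to compute once the experimental and estimated series are available; the sketch below also returns the MSE and R2 (taken here as the squared correlation coefficient, one common convention, since the definition used in the study is not spelled out).

```python
import numpy as np

def performance(e, p, k):
    """Return (R2, MSE, AIC, MAE) for experimental values e, estimates p and
    k model weights, following Section 9.1."""
    e, p = np.asarray(e, dtype=float), np.asarray(p, dtype=float)
    n = e.size
    mse = np.mean((e - p) ** 2)
    aic = n * np.log(mse) + 2 * k                 # Eq. (24)
    mae = np.mean(np.abs((e - p) / e))            # Eq. (25)
    r2 = np.corrcoef(e, p)[0, 1] ** 2             # squared correlation coefficient
    return r2, mse, aic, mae
```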
Table 5. Experimental and estimated xm values for Test Set data from Bormann and Julien [13].

Test set | Experimental xm (m) | D'Agostino and Ferro [15] (Eq. (21)) | MNLR (Eq. (12)) | MLR (Eq. (10)) | DNN (Eq. (7))
1   | 6.10 | 1.40  | 3.86  | 7.30  | 5.64
2   | 7.93 | 1.47  | 4.43  | 7.04  | 6.37
3   | 5.49 | 1.32  | 4.12  | 6.75  | 6.01
4   | 4.27 | 1.15  | 3.19  | 8.45  | 4.47
5   | 6.71 | 1.41  | 4.90  | 9.62  | 6.56
6   | 4.88 | 1.58  | 4.29  | 5.00  | 4.88
7   | 7.93 | 1.40  | 3.07  | 4.42  | 6.75
8   | 2.44 | 0.76  | 1.08  | 0.77  | 1.76
9   | 1.83 | 0.64  | 1.04  | 0.06  | 1.69
10  | 4.27 | 1.19  | 2.76  | 2.91  | 3.70
11  | 5.49 | 1.09  | 2.83  | 4.58  | 5.49
12  | 3.05 | 0.77  | 2.04  | 5.41  | 3.54
13  | 3.05 | 0.57  | 2.14  | 5.36  | 3.59
14  | 1.83 | 0.72  | 0.99  | 1.42  | 1.99
15  | 3.05 | 0.96  | 1.29  | 1.85  | 3.18
16  | 1.83 | 0.66  | 1.46  | 2.76  | 1.99
17  | 3.66 | 0.83  | 1.74  | 3.18  | 4.12
18  | 3.05 | 1.46  | 2.05  | 5.03  | 3.00
19  | 4.27 | 1.31  | 2.59  | 4.11  | 3.60
20  | 2.44 | 0.81  | 1.91  | 2.60  | 2.30
21  | 3.66 | 0.64  | 1.23  | 3.52  | 2.57
R2  |      | 0.588  | 0.706  | 0.468  | 0.907
MSE |      | 12.145 | 3.756  | 3.252  | 0.364
AIC |      | 64.444 | 35.787 | 32.748 | 9.232
MAE |      | 0.724  | 0.387  | 0.387  | 0.109
Table 6. Large-scale field data and dm estimations.

Author(s)                   | z (m) | yt (m) | q (m2/s) | d50 (m) | bw = B (m) | dm (m) | Estimated dm (m): Mason and Arumugam [28] | D'Agostino and Ferro [15] | MNLR (Eq. (11)) | DNN (Eq. (6))
Veronese [32]               | 12.9  | 5      | 4.57     | 0.1     | 25         | 3      | 0.54  | 4.84  | 2.72  | 4.99
Scimemi [31]                | 57    | 40     | 275      | 2.1     | 58         | 28     | 45.42 | 23.56 | 18.86 | 28.95
Whittaker and Schleiss [33] | 19.2  | 7      | 38.97    | 0.75    | 11.6       | 6.2    | 12.16 | 4.59  | 3.53  | 9.48

where N is the number of input-output pairs in the testing set, k is the number of model weights, ei is the experimental value and pi is the estimated value.

9.2. Maximum-scour depth, dm

In this section, the performance of the DNN (Eq. (6)) and of the other equations is compared based on the Test Set data, as shown in Table 4 and Fig. 3. The table shows the overall best performance of the DNN, with the lowest values of the error measures (AIC = 61.921, MAE = 0.341, MSE = 0.030) and the highest correlation (R2 = 0.819). MNLR (Eq. (11)) and MLR (Eq. (9)) estimated dm much better than the other regression-based equations in terms of the AIC and MAE criteria, but Bormann [12] performed best in terms of R2 (0.687), which indicates only a relatively weak tendency of the MNLR and MLR estimations to co-vary linearly with the experimental values. Almost all estimations of Chee and Yuen [14] (Eq. (15)), MNLR (Eq. (11)) and D'Agostino and Ferro [15] (Eq. (20)) are observed to be less than the experimental values, and negative values of Chee and Yuen [14] are also seen in Table 4, which are physically impossible.

9.3. Location of maximum-scour depth, xm

Eqs. (13)-(18) do not consider xm, so the DNN model of xm (Eq. (7)) was compared with MNLR (Eq. (12)), MLR (Eq. (10)), and D'Agostino and Ferro's [15] empirical model (Eq. (21)).
Table 5 indicates a performance of the DNN similar to that obtained for dm: the DNN (Eq. (7)) estimated the xm values with the lowest errors (AIC = 9.232, MAE = 0.109, MSE = 0.364) and the highest correlation (R2 = 0.907). The MNLR and MLR equations derived by the author showed overall better performance than Eq. (21), except that the R2 of Eq. (21) is higher than that of MLR. It should be noted that MNLR, MLR, and Eq. (21) underestimated all xm values with considerable deviation, while the DNN estimates are very close to the experimental values (see Fig. 4).
10. Application to large-scale field data

In this section, the robustness of the proposed DNN (Eq. (6)), MNLR (Eq. (11)), MLR (Eq. (9)) and the other equations is evaluated against field data corresponding to erosion downstream from large dams, which were not used in the training stage of the proposed DNN model (see Table 6). The field data were taken from D'Agostino and Ferro [15], who collected three field measurements reported by Veronese [32], Scimemi [31] and Whittaker and Schleiss [33]. Admittedly, an evaluation of the DNN based on only three sites provides too little data to put much weight on the interpretations, but it gives an idea of the capability of the proposed DNN with the available out-of-range data. Veronese [32] presented field observations on the "Rocchetta" dam built on the Noce River, Trento, Italy. Scimemi [31] observed the bottom erosion downstream from the Conowingo dam on the Susquehanna River in Maryland, USA. Whittaker and Schleiss [33]
compared various formulas for the Cabora-Bassa dam in Mozambique, Africa. The statistical performance of the DNN estimations is very promising, because an almost perfect agreement is observed between the estimated and the field dm values (R2 = 0.997, MAE = 0.408), as shown in Table 6. Although the scale problem (from laboratory to field data) was raised in the early sections of this study, these results suggest that the DNN largely overcomes it. The estimations of MLR (Eq. (9)) and Eqs. (13)-(18) are not given because they produced negative values, which are physically impossible. Additionally, the xm values of these field studies, which are not available, were estimated with the developed DNN model. It should be noted that the dimensionless values of the field measurements are very near to those of Bormann and Julien [13], and the estimated xm values are observed to fall within the range of the experimental values of Bormann and Julien [13]. In this perspective, the estimated xm values can be considered realistic.

The underlying reason for the overall superiority of the proposed DNN model over the existing and the newly proposed regression-based empirical formulas is that regression analyses impose a pre-defined linear or non-linear model on the widely scattered relation among the scouring parameters, which leads to inappropriate estimations. In fact, the correlation matrix of the input and output parameters given in Table 2 indicates that linear regression fails to capture the relation among the input and output parameters, which is evidently highly non-linear and complex. The proposed DNN model, on the other hand, uses parallel processing incorporated with strong learning (back-propagation) and optimization tools that are able to capture this highly non-linear relation, even under additional constraints such as the multiple outputs of the present case.

11. Conclusions

The depth and location of the maximum scour downstream from grade-control structures were estimated by a two-output neural network model. The knowledge contained in the developed model was extracted in order to build the explicit formulation of the proposed descriptive NN model (DNN). The DNN estimated the depth and location of the maximum scour in very good agreement with laboratory and limited field data, and it is superior to the non-linear and linear equations developed by the author and to the other available regression equations given in Section 8. The significance of this study is that the proposed DNN model estimates both the depth and the location of the maximum scour (dm and xm), and, more significantly, that the explicit formulation is extracted from the developed DNN model. In this way, the proposed DNN model can be refined and improved using further experimental or field observations, and it can also be compared directly with conventional methods. The DNN was also compared with conventional empirical models in the literature, and the results for the Testing set exhibited overall superior performance of the DNN against those equations, with the lowest error (MAE = 0.341 and AIC = 61.921) and the highest correlation (R2 = 0.819). Eq. (15), proposed by Chee and Yuen [14], failed to estimate dm, giving negative values, which are not acceptable. The best performance among the regression equations belongs to Eqs. (11) and (12) proposed by the author, with R2 = 0.677 and MAE = 0.488 for dm/z and R2 = 0.706 and MAE = 0.307 for xm/z, respectively.
The robustness of the DNN was further assessed in the estimation of field data obtained from other studies, and the results are very promising. Initial testing with three field data sites shows promising results, but much more field (large-scale) data is needed to fully validate the DNN model. The DNN estimated the
observed dm/z with excellent accuracy (R2 = 0.997, MAE = 0.408). Although the scale problem (from laboratory to field data) was mentioned in the early sections of this study, these results exhibit that the DNN largely eliminated this problem, whereas MLR (Eq. (9)) and Eqs. (13)-(18) estimated negative values for almost all observed scour depths, which are physically impossible. These results suggest the efficiency of the DNN in estimating the complex local-scour problem downstream from grade-control structures, and provide additional support for the use of soft-computing techniques. The results of this study show that conventional regression-based equations for estimating the maximum-scour depth downstream of hydraulic structures can be replaced by the DNN with considerably better accuracy. The performance of the non-linear regression equations developed by the author (Eqs. (11) and (12)) also shows that they can be used as alternatives to the existing ones.

Acknowledgement

The author wishes to thank the Scientific Research Projects Unit of Gaziantep University for providing support during this study.

References

[1] Akaike H. Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F, editors. Second international symposium on information theory. Budapest: Academiai Kiado; 1973. p. 267–81. [2] Akdag U, Komur MA, Ozguc A. Estimation of heat transfer in oscillating annular flow using artificial neural networks. Adv Eng Software 2009;40(9):864–70. [3] Aytek A, Guven A, Yuce MI, Aksoy H. An explicit neural network formulation for evapotranspiration. Hydrol Sci J 2008;53(4):893–904. [4] American Society of Civil Engineers (ASCE) Task Committee. The ASCE Task Committee on application of artificial neural networks in hydrology. J Hydrol Eng 2000;5(2):115–37. [5] Azinfar H, Kells JA, Elshorbagy A. Use of artificial neural networks in prediction of local scour. In: Proc 32nd annual general conference of the Canadian society for civil engineers, GC-350; 2004. p. 1–10. [6] Azamathulla HMd, Deo MC, Deolalikar PB. Neural networks for estimation of scour downstream of a ski-jump bucket. J Hydraul Eng 2005;131(10):898–908. [7] Azamathulla HMd, Deo MC, Deolalikar PB. Estimation of scour below spillways using neural networks. J Hydraul Res 2006;44(1):61–9. [8] Azamathulla HMd, Deo MC, Deolalikar PB. Alternative neural networks to estimate the scour below spillways. Adv Eng Software 2008;39:689–98. [9] Azamathulla HMd, Ghani AAb, Zakaria NA, Kiat CC, Siang LC. Knowledge extraction from trained neural network scour models. Modern Appl Sci 2008;2(4):52–62. [10] Azamathulla HMd, Ghani AAb, Zakaria NA. Prediction of scour below flip bucket using soft computing techniques. In: AIP conf proc, vol. 1233; May 21, 2010. p. 1588–93. [11] Bateni SM, Jeng D-S, Melville BW. Bayesian neural networks for prediction of equilibrium and time-dependent scour depth around bridge piers. Adv Eng Software 2007;38(2):102–11. [12] Bormann NE. Equilibrium local scour downstream of grade-control structures. Ph.D. thesis, Colorado State University, Fort Collins, Colo; 1988. [13] Bormann NE, Julien PY. Scour downstream of grade-control structures. J Hydraul Eng 1991;117(5):579–94. [14] Chee SP, Yuen EM. Erosion of unconsolidated gravel beds. Can J Civ Eng 1985;12:559–66. [15] D'Agostino V, Ferro V. Scour on alluvial bed downstream of grade-control structures. J Hydraul Eng 2004;130(1):24–36. [16] Guven A, Gunal M, Cevik AK. Prediction of pressure fluctuations on stilling basins. Can J Civ Eng 2006;33(11):1379–88. [17] Guven A, Aytek A, Yuce MI, Aksoy H.
Genetic programming-based empirical model for daily reference evapotranspiration estimation. Clean-Soil Air Water 2007;36(10–11):905–12. [18] Guven A, Gunal M. Genetic programming approach for prediction of local Scour downstream of hydraulic structures. J Irrig Drain Eng 2008a;134(2):241–9. [19] Guven A, Gunal M. Prediction of scour downstream of grade-control structures using neural networks. J Hydraul Eng 2008;134(11):1656–60. [20] Guven A. Modeling local scour using k–e turbulence model and soft computing techniques. Ph.D. thesis submitted to University of Gaziantep; 2008. [21] Guven A, Gunal M. Hybrid numerical–mathematical modeling of local scour and flow patterns in laboratory flumes. Int J Numer Methods Fluids 2009;62:291–312. [22] Haykin S. Neural networks: a comprehensive foundation. Macmillan College Publications Cooperation; 2000. [23] Hecht-Nielsen R. Neurocomputing. Reading (MA): Addison-Wesley; 1990. [24] Lange NT. New mathematical approaches in hydrological modeling – an application of artificial neural networks. Phys Chem Earth 1999;24(1–2):31–5.
[25] Liriano SL, Day RA. Prediction of scour depth at culvert outlets using neural networks. J Hydroinformatics 2001;3(4):231–8. [26] Maier HR, Dandy GC. Neural networks for forecasting of water resources variables: a review of modeling issues and applications. Environ Modell Software 2000;15:101–24. [27] Martins R. Scouring of rocky riverbeds and free-jet spillways. Int Water Power Dam Constr 1975;27(5):152–3. [28] Mason PJ, Arumugam K. Free jet scour below dams and flip buckets. J Hydraul Eng 1985;111(2):220–35. [29] Pinar E, Paydas K, Seckin G, Akilli H, Sahin B, Cobaner M, et al. Artificial neural network approaches for prediction of backwater through arched bridge constrictions. Adv Eng Software 2010;41(4):627–35.
[30] Sarkar A, Dey S. Review on local scour due to jets. Int J Sediment Res 2004;19(3):210–39. [31] Scimemi E. Sulla relazione che intercede fra gli scavi osservati nelle opere idrauliche originali e nei modelli. Energy Elettr 1939;16(11):3–8 (in Italian). [32] Veronese A. Erosioni di fondo a valle di uno scarico. Annal Lavori Pubbl 1937;75(9):717–26 (in Italian). [33] Whittaker JG, Schleiss A. Scour related to energy dissipators for high head structures. Mitt Nr 73 VAW/ETH, Zurich; 1984. [34] Yao JT. Knowledge based descriptive neural networks. In: Proc 9th international conference on rough sets, fuzzy sets, data mining and granular computing, Chongqing, China. Lecture Notes in Computer Science, vol. 2639; 2003. p. 430–6.