Computers and Geotechnics 54 (2013) 125–132
Modeling tunneling-induced ground surface settlement development using a wavelet smooth relevance vector machine

Fan Wang a,b, Biancai Gou c, Yawei Qin a,⇑

a Department of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan 430074, PR China
b Hongshan Construction Bureau, Wuhan 430070, PR China
c Department of Civil Engineering, Wuhan University of Science and Technology–City College, Wuhan 430083, PR China
Article info

Article history:
Received 21 May 2012
Received in revised form 16 April 2013
Accepted 6 July 2013
Available online 31 July 2013

Keywords:
Ground surface settlement development
Smooth relevance vector machine
Wavelet kernel
Tunneling
Abstract

Accurate prediction of ground surface settlement is necessary for effectively controlling the settlement that develops during tunneling. Many models have been established for this purpose by extracting the relationship between the settlement and the factors that influence it. However, most of these models focus on the maximum ground surface settlement and do not involve dynamic and real-time predictions. This paper investigates how tunneling-induced ground surface settlement develops, using a smooth relevance vector machine with a wavelet kernel (wsRVM). Various factors that affect this settlement, including geometrical, geological and shield operational parameters, were considered. The model was applied to earth pressure balance (EPB) shield-driven tunnels. The results indicate that the prediction model performs well and that the distribution of the predictions can provide a measure of the prediction uncertainty. Unlike conventional methods that require additional effort to determine relevant model parameters, the proposed method can optimize the parameters in the training process. The results of the parametric study show that the model performance can be improved by this optimization and that the method can serve as a simple tool for practitioners to use in estimating ground surface settlement development during tunneling.

© 2013 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +86 27 87556946; fax: +86 27 87556945. E-mail address: [email protected] (Y. Qin).
0266-352X/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.compgeo.2013.07.004

1. Introduction

Ground surface settlement is an important field measurement for identifying the potential damage incurred to adjacent structures or facilities due to tunneling. Thus, analyzing and predicting settlement development are essential to avoid excessive settlement by taking appropriate countermeasures. Although empirical methods and analytical methods are available for settlement prediction, some researchers question the accuracy of these methods, pointing out that they fail to consider all the relevant factors that jointly affect the settlement [1–3].

During the past decade, artificial neural networks (ANNs) have been used as an alternative method for solving the problem. Most of the ANN-based analyses were implemented by extracting the relationships between influencing factors, such as the tunnel depth and soil properties, and the induced settlement. For example, Kim et al. [1] used artificial neural networks to predict the maximum settlement and the inflection point needed to generate the transverse settlement trough caused by tunneling. A total of 47 factors were considered as input variables for the network. Suwansawat and Einstein [4] established a neural network model to predict the maximum settlement induced by earth pressure balance (EPB) tunneling. Shield operational parameters, as well as tunnel geometries and geological conditions, were incorporated to establish the predictive relations. Santos and Celestino [5] also developed ANN-based models to analyze the influence of relevant factors on settlement and concluded that the complete adoption of factors would improve the prediction capacity of these models.

Support vector machines (SVMs), which are based on statistical learning [6], have also been successfully applied in highly nonlinear geotechnical areas. Samui [7] applied an SVM to the prediction of the settlement of shallow foundations on cohesionless soil and concluded that the use of SVMs could be very advantageous because the machines can perform nonlinear regression efficiently for high-dimensional datasets. Zhao and Yin [8] used an SVM in a back analysis to identify geomechanical parameters. Feng et al. [9] illustrated the potential of SVMs for modeling displacement time series. They proposed a model that incorporated an SVM to predict the deformations of high rock slopes and landslides and obtained satisfactory results. SVMs typically have good generalization abilities because they adopt the structural risk minimization (SRM) induction principle, which aims to minimize the error on both the training and the testing data, instead of the empirical risk minimization (ERM) induction principle.

However, when using ANNs, it is difficult to determine the network architecture because no direction or analytical method is available, and ANNs often suffer from poor generalization performance. SVMs also have some drawbacks, such as the determination of model parameters (e.g., the penalty weight $C$ and the insensitivity parameter $\varepsilon$), relatively high model complexity and kernel function restrictions (i.e., the kernel function must satisfy Mercer's condition) [10].

The relevance vector machine (RVM) has recently emerged as a viable SVM competitor, due to its model sparsity, good generalization performance, free choice of kernel function and distributive prediction [11]. Because of these advantages, the results obtained from an RVM are often superior to those obtained from an SVM for the same inputs [12,13]. RVMs have been successfully applied to fault diagnosis [14], canal flow prediction [15] and monitoring network analysis [16]. Smooth relevance vector machines (sRVMs) are an extension of RVMs [17]. To avoid overfitting or underfitting problems, sRVMs incorporate a sparsity controller called a "smoothness prior" to directly control the model complexity.

Because accurate real-time prediction capability is needed, this paper investigates the potential of sRVMs with wavelet kernel functions (wsRVMs) to model ground surface settlement development induced by EPB shield tunneling. The instrumentation data and continuous observation of shield operational factors in two tunnel sections of the Wuhan metro project provide a good opportunity to study how settlements develop during shield passing. To this end, the model was trained and validated using the collected data. The performance of the wsRVM model was compared to that of other models (e.g., RVM, SVM and ANN), and the results indicated that the wsRVM model has good predictive ability.

2. Ground surface settlement development

The International Association of Engineering Insurers (IMIA) has reported that most failures, including excessive deformation, in tunneling projects occur during the construction phase [18]. However, most of the aforementioned studies focused on the maximum ground surface settlement. These static results do not fulfill the dynamic and real-time requirements of predicting ground surface settlement during tunnel construction [19]. Fig. 1 shows a typical longitudinal settlement profile obtained by connecting the instrumentation readings. Appropriate preventive measures have to be designed and implemented before a large settlement occurs, so predicting the settlement at each excavation step is critical. Nevertheless, settlement data are typically nonlinear and noisy, and shield–ground interaction is complex, implying that the modeling of settlement development could be challenging.

Yeh [20] used the actual soil pressure, coupled with other factors, to predict the soil pressure of the next excavation step. In this paper, a similar method is adopted. The present settlement $s$ at a specific settlement marker and the affecting factors $F$ are used as inputs to the model to predict the next settlement $s'$, and the actual measurement of the next settlement is then taken as the present settlement for the next prediction (see Fig. 2). That is,

$$ s' = f(s, F) \tag{1} $$
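The one-step-ahead scheme described in this section (the present measured settlement s and the factors F give the next settlement s') can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `rolling_predictions`, `toy_model` and all numbers are hypothetical, and `predict_next` stands for any trained model f.

```python
def rolling_predictions(measured, factors, predict_next):
    """One-step-ahead predictions s' = f(s, F) along one settlement marker.

    measured[k]  : settlement measured at step k (mm)
    factors[k]   : influencing factors F for the step from k to k + 1
    predict_next : any trained model f(s, F) -> s' (assumed to exist)
    """
    predictions = []
    for k in range(len(measured) - 1):
        # Feed the *measured* present settlement, not the previous
        # prediction, as the input for the next step.
        predictions.append(predict_next(measured[k], factors[k]))
    return predictions


def toy_model(s, F):
    # Hypothetical stand-in for a trained model; not a physical law.
    return s - 0.5 * F[0]


measured = [0.0, -1.0, -3.0, -4.0]      # made-up readings (mm)
factors = [[2.0], [3.0], [1.0]]         # made-up single factor per step
predicted = rolling_predictions(measured, factors, toy_model)

# Each prediction is compared with the measurement taken one step later.
errors = [p - m for p, m in zip(predicted, measured[1:])]
```

Because every step starts from the measured settlement rather than from the previous prediction, an error made at one step is not carried into the next one.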
Fig. 1. A typical ground surface settlement development (settlement in mm versus distance to the excavation face in m, covering the approaching and receding stages, with the passing of the front edge and of the tail of the shield marked).

Fig. 2. Schematic diagram for predicting settlement development (elements shown: settlement markers on the ground surface, shield machine, launching station, completed lined tunnel, excavation direction, actual measurements, operational records, other information such as tunnel depth and geology at the instrumented section, data bank, prediction model, predictions of next settlement, comparison with actual measurements).

3. Factors affecting ground surface settlement

The factors that affect ground surface settlement can be classified into three groups: tunnel geometry (e.g., tunnel diameter, cover depth, excavation face height, etc.), geological conditions (e.g., Young's modulus, Poisson's ratio, permeability, shear strength parameters, etc.) and construction parameters (e.g., excavation method, support method, support time, etc.) [1,5]. Because the proposed method is applied to EPB shield-driven tunnels, the construction parameters are specifically the shield operational parameters.

3.1. Geometrical characteristics

Tunnel depth (Z) and tunnel diameter (2R) are usually considered important geometrical parameters that affect the settlements [21] and excavation face stability of shield-driven tunnels [22]. An analysis carried out by Norgrove et al. [23] indicated that the ratio of the depth to the diameter (Z/2R) should be taken as a combined factor of influence. However, because the diameter of the tunnels was designed as a constant of 6 m, the effect of tunnel diameter is negligible in the present model. Therefore, the first geometric factor is the tunnel depth.

Another important factor is the distance from the excavation face to the settlement markers. As summarized in the longitudinal development of settlement, the effect of tunneling increases as the shield approaches and decreases as the shield recedes [24]. To distinguish the directions, we define the distance value as negative in the case of approaching and positive in the case of receding.

3.2. Geological conditions

Some previous works have taken soil properties such as Young's modulus and shear strength as geological factors [25,26]. However, detailed geological investigation of the soil properties at each instrumentation section is practically impossible, making it difficult to obtain the values of the soil properties. Other researchers, such as Kim et al. [1] and Suwansawat and Einstein [4], used soil type to represent the soil properties because the differences in properties among soils of different types are generally greater than those among soils of the same type. Thus, the use of soil type can, to some extent, solve the problem of the values of soil properties being unavailable. In the model presented in this paper, the soil types at the tunnel crown and at the tunnel invert are considered as two geological factors. The groundwater level is also an important factor. We use the tunnel depth below the water table to reflect the variation in the groundwater level.

3.3. EPB shield operational parameters

The EPB shield tunneling method is often used for soft ground tunneling. This technique maintains the stability of the excavation face and minimizes the ground movement by balancing the pressure between the earth pressure chamber and the outside ground [27]. Therefore, the face pressure should be the primary control parameter during tunneling. The drilling velocity is an indicator of the volume of the excavated soils. If the drilling velocity is too high, overbreak may occur, leading to large ground loss. The pitching angle reflects the shield position, which has to be kept within the designed alignment. However, it is practically impossible to maintain an accurate orientation along the entire length of the tunnel. The mismatch between the actual position and the designed alignment may influence the settlement because it can create voids. Tail void grout filling and grout pressure also play important roles in controlling the extent of settlement. The simulation carried out by Kasper and Meschke [28] illustrated how tail void grouting affects settlements. In general, the values of these parameters should be high enough to resist the potential ground movement into the tail void and thus reduce the settlements after the shield passes.

In summary, five factors, namely, face pressure, drilling velocity, pitching angle, tail void grout pressure and grout filling, are considered as shield operational parameters in the model presented in this paper.

4. Wavelet smooth relevance vector machine

4.1. Relevance vector machine

The relevance vector machine introduced by Tipping [11] is actually a special case of a Gaussian process [29]. RVMs have the same functional form as SVMs but are trained within a Bayesian framework.
The compelling characteristic of RVMs is that they typically use fewer training vectors, termed "relevance vectors", associated with non-zero weights, while achieving generalization performance similar to that of SVMs. Given a dataset of $N$ input vectors with $N$ corresponding scalar-valued targets $\{x_n, t_n\}_{n=1}^{N}$, the output $t = (t_1, \ldots, t_N)^T$ can be expressed as the sum of an approximation vector $y = (y(x_1), \ldots, y(x_N))^T$ and a noise vector $\epsilon = (\epsilon_1, \ldots, \epsilon_N)^T$ as follows:

$$ t = y + \epsilon = \Phi w + \epsilon \tag{2} $$

where $w$ is the weight vector and $\Phi = [\phi(x_1), \ldots, \phi(x_N)]^T$ is the $N \times M$ "design" matrix, wherein $\phi(x_n) = [K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_M)]^T$ and $K(x_n, x_i)$ is a kernel function. Furthermore, the noise $\epsilon_n$ can typically be assumed to be independent and identically distributed, following a zero-mean Gaussian distribution with variance $\sigma^2$: $p(\epsilon_n \mid \sigma^2) = \mathcal{N}(0, \sigma^2)$. Due to the assumption of independence of the targets, the likelihood of the complete dataset can be written as follows:

$$ p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{ -\frac{1}{2\sigma^2} \| t - \Phi w \|^2 \right\} \tag{3} $$
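To make the design matrix and the likelihood of Eq. (3) concrete, the following sketch builds $\Phi$ from a kernel and evaluates the logarithm of Eq. (3). It is an illustration only: the function names and the toy data are hypothetical, and the kernel is assumed to be the Morlet wavelet kernel that the paper adopts later (Eq. (14)); any other kernel function could be substituted.

```python
import math


def morlet_kernel(x, z, a=1.0):
    """Morlet wavelet kernel as a product over input dimensions (cf. Eq. (14))."""
    k = 1.0
    for xi, zi in zip(x, z):
        u = (xi - zi) / a
        k *= math.cos(1.75 * u) * math.exp(-0.5 * u * u)
    return k


def design_matrix(X, centers, a=1.0):
    """N x M matrix Phi with Phi[n][m] = K(x_n, x_m)."""
    return [[morlet_kernel(x, c, a) for c in centers] for x in X]


def log_likelihood(t, Phi, w, sigma2):
    """Logarithm of the Gaussian likelihood in Eq. (3)."""
    n_samples = len(t)
    sq_err = 0.0
    for t_n, row in zip(t, Phi):
        y_n = sum(p * w_m for p, w_m in zip(row, w))   # y(x_n) = phi(x_n)^T w
        sq_err += (t_n - y_n) ** 2
    return (-0.5 * n_samples * math.log(2.0 * math.pi * sigma2)
            - sq_err / (2.0 * sigma2))


# Made-up, already normalised inputs and targets (three samples, two factors).
X = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.2]]
t = [0.1, -0.4, 0.3]

Phi = design_matrix(X, X, a=1.0)   # training points used as kernel centres, M = N
w = [0.0, 0.0, 0.0]                # all-zero weights, so y = 0
ll = log_likelihood(t, Phi, w, sigma2=0.25)
```

Using the training inputs themselves as kernel centres gives a square design matrix (M = N) whose diagonal entries equal 1 for this kernel, since K(x, x) = 1.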
The classical method for estimating $t$ is to maximize the likelihood (3), or equivalently to minimize the ordinary least squares (OLS) error on the measured training dataset, to estimate $w$ and $\sigma^2$; however, this procedure leads to severe overfitting [30]. To avoid overfitting and to control the complexity of the model, a zero-mean Gaussian prior with a different precision $\alpha_i$ for each weight $w_i$ is adopted as a constraint to "penalize" the likelihood:

$$ p(w \mid \alpha) = (2\pi)^{-M/2} \prod_{m=1}^{M} \alpha_m^{1/2} \exp\left( -\frac{\alpha_m w_m^2}{2} \right) \tag{4} $$
The hyperparameter vector $\alpha = [\alpha_1, \ldots, \alpha_M]^T$ controls how far from zero each weight is allowed to deviate. For completion of the hierarchical prior, hyperpriors over $\alpha$, $p(\alpha)$, and over the noise variance $\sigma^2$, $p(\sigma^2)$, are specified as Gamma distributions. Consequently, the posterior parameter distribution conditioned on the data can be obtained by combining the likelihood and prior within Bayes' rule as follows:

$$ p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} \tag{5} $$
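In Eq. (5), the likelihood $p(t \mid w, \sigma^2)$ and the prior $p(w \mid \alpha)$ are both Gaussian. Thus, the posterior over $w$ is also Gaussian and can be expressed as $p(w \mid t, \alpha, \sigma^2) = \mathcal{N}(\mu, \Sigma)$. The posterior covariance $\Sigma$ and mean $\mu$ of $w$ are, respectively, as follows:

$$ \Sigma = (A + \sigma^{-2} \Phi^T \Phi)^{-1} \tag{6} $$

$$ \mu = \sigma^{-2} \Sigma \Phi^T t \tag{7} $$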
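A small worked case may help to see what Eqs. (6) and (7) do. The sketch below assumes a model with a single basis function (M = 1), so that the matrix A reduces to one scalar precision and the matrix inverse in Eq. (6) becomes a division. The function name and the numbers are hypothetical and chosen only for illustration.

```python
def posterior_single_basis(phi, t, alpha, sigma2):
    """Eqs. (6)-(7) when Phi has a single column, i.e. A = alpha is a scalar."""
    phi_phi = sum(p * p for p in phi)                 # Phi^T Phi
    phi_t = sum(p * t_n for p, t_n in zip(phi, t))    # Phi^T t
    Sigma = 1.0 / (alpha + phi_phi / sigma2)          # Eq. (6)
    mu = Sigma * phi_t / sigma2                       # Eq. (7)
    return Sigma, mu


phi = [1.0, 2.0, 3.0]    # values of the basis function at three inputs
t = [2.0, 4.0, 6.0]      # targets, here exactly 2 * phi

# Posterior for a weak, a moderate and a very strong prior precision.
results = {
    alpha: posterior_single_basis(phi, t, alpha, sigma2=0.5)
    for alpha in (1e-6, 1.0, 1e6)
}
for alpha, (Sigma, mu) in results.items():
    print(f"alpha = {alpha:g}: Sigma = {Sigma:.4g}, mu = {mu:.4g}")
```

With a very small precision the posterior mean stays close to the least-squares weight (2 in this toy case), whereas a very large precision shrinks the mean towards zero. This is the mechanism by which basis functions are effectively switched off and the model becomes sparse.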
where $A = \mathrm{diag}(\alpha)$. Rather than extending the model to include Bayesian inference over those hyperparameters, the model only needs to compute the weight posterior $p(w \mid t, \alpha, \sigma^2)$. Then, maximum a posteriori (MAP) estimates for $\alpha$ and $\sigma^2$ are determined by searching for the most-probable mode $(\alpha_{MP}, \sigma^2_{MP})$ of the hyperparameter posterior $p(\alpha, \sigma^2 \mid t) \propto p(t \mid \alpha, \sigma^2)\, p(\alpha)\, p(\sigma^2)$. Therefore, the MAP estimate of the hyperparameters only needs to maximize the logarithm of the marginal likelihood $p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw$, which is known as the type-II maximum likelihood procedure. The RVM marginal likelihood, $L$, also known as the evidence [31], is given by the following equation:

$$ L = \log p(t \mid \alpha, \sigma^2) = -\frac{1}{2} \left[ N \log 2\pi + \log |C| + t^T C^{-1} t \right] \tag{8} $$
where $C = \sigma^2 I + \Phi A^{-1} \Phi^T$. A detailed summary of the RVM inference procedure is provided by Tipping [11,30]. In the inference procedure, the values of $\alpha_{MP}$ and $\sigma^2_{MP}$ replace $\alpha$ and $\sigma^2$, respectively. The posterior covariance $\Sigma$ and mean $\mu$ can then be computed, and the predictive mean $y_*$ and variance $\sigma_*^2$ at a new input $x_*$ can be obtained with:

$$ y_* = \mu^T \phi(x_*) \tag{9} $$

$$ \sigma_*^2 = \sigma^2_{MP} + \phi(x_*)^T \Sigma\, \phi(x_*) \tag{10} $$
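Eqs. (9) and (10) translate directly into a few lines of code. The sketch below assumes that the posterior mean and covariance of the retained weights are already available. The function name `predict` and all numerical values are hypothetical and serve only to show how a point prediction and its standard deviation (the error bar) are obtained.

```python
import math


def predict(phi_star, mu, Sigma, sigma2_mp):
    """Predictive mean (Eq. (9)) and variance (Eq. (10)) at a new input."""
    mean = sum(m * p for m, p in zip(mu, phi_star))
    # Sigma * phi_star, computed row by row
    Sigma_phi = [sum(s * p for s, p in zip(row, phi_star)) for row in Sigma]
    variance = sigma2_mp + sum(p * sp for p, sp in zip(phi_star, Sigma_phi))
    return mean, variance


# Made-up posterior for a model with two relevance vectors.
mu = [-12.0, 3.0]
Sigma = [[0.04, 0.01],
         [0.01, 0.09]]
phi_star = [0.8, 0.5]      # kernel values between x_* and the relevance vectors

mean, variance = predict(phi_star, mu, Sigma, sigma2_mp=0.25)
std = math.sqrt(variance)
lower, upper = mean - std, mean + std   # one-standard-deviation error bar
```

The variance has two parts: the estimated noise level, which is the same for every input, and a term that depends on the input through the remaining uncertainty in the weights.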
4.2. Smooth relevance vector machine

The smooth relevance vector machine [17] is an extension of the RVM that is more flexible in adjusting model sparsity. The sRVM defines a prior on $\alpha$ that directly penalizes models with large numbers of effective parameters to control the amount of sparsity. From Eqs. (6) and (7), the output of the model at the training points $y = (y(x_1), \ldots, y(x_N))^T$ can be obtained by

$$ y = \Phi \mu = (\sigma^{-2} \Phi \Sigma \Phi^T)\, t \equiv S t \tag{11} $$

$S$ is the smoothing matrix, and its trace measures the effective number of parameters in the model, leading to the following prior:

$$ p(\alpha \mid \sigma^2) \propto \exp\left( -c\, \mathrm{trace}(S) \right) \tag{12} $$
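As a rough illustration of Eqs. (11) and (12), the sketch below computes trace(S) without forming the N x N matrix S and evaluates the unnormalised log prior for a given c. The function names are hypothetical, and the example again assumes a single basis function so that the covariance of Eq. (6) can be written down by hand.

```python
def effective_parameters(Phi, Sigma, sigma2):
    """trace(S) for S = sigma^-2 * Phi * Sigma * Phi^T (Eq. (11)).

    The n-th diagonal entry of S is phi(x_n)^T Sigma phi(x_n) / sigma^2,
    so the trace is accumulated one training point at a time.
    """
    total = 0.0
    for row in Phi:                                   # row = phi(x_n)^T
        Sigma_row = [sum(s * p for s, p in zip(srow, row)) for srow in Sigma]
        total += sum(p * sp for p, sp in zip(row, Sigma_row))
    return total / sigma2


def log_smoothness_prior(Phi, Sigma, sigma2, c):
    """Unnormalised logarithm of Eq. (12): -c * trace(S)."""
    return -c * effective_parameters(Phi, Sigma, sigma2)


# Single basis function evaluated at three training inputs (made-up values).
alpha, sigma2 = 1.0, 0.5
Phi = [[1.0], [2.0], [3.0]]
phi_phi = sum(row[0] ** 2 for row in Phi)             # Phi^T Phi = 14
Sigma = [[1.0 / (alpha + phi_phi / sigma2)]]          # Eq. (6) for M = 1

dof = effective_parameters(Phi, Sigma, sigma2)
penalty = log_smoothness_prior(Phi, Sigma, sigma2, c=1.0)
```

For one basis function the trace lies between 0 and 1: it approaches 1 when the weight is essentially unconstrained and 0 when the precision is so large that the basis function is switched off. Larger values of c therefore penalize each retained basis function more heavily.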
Parameter $c$ directly controls the amount of sparsity. The specific values of this parameter are given by known model selection criteria: $c = 0$ for None (classic RVM); $c = 1$ for the Akaike information criterion (AIC); $c = \log(N)/2$ for the Bayesian information criterion (BIC); and $c = \log(N)$ for the risk inflation criterion (RIC). These range from least smoothing ($c = 0$) to most smoothing ($c = \log(N)$) [17,32]. Because a smoothness prior is incorporated, the marginal likelihood is modified as follows [33]:

$$ L_s = L - c \left( N - \sum_{i=1}^{N} \alpha_i \Sigma_{ii} \right) \tag{13} $$

where $\Sigma_{ii} = (\alpha_i + s_i)^{-1}$, and $s_i$ is a measure of the extent to which the basis vector overlaps with those already present in the model during model updating [34].

Fig. 3. Geological profiles of (a) the Q–D tunnel section and (b) the D–Y tunnel section.

4.3. Wavelet kernels

Smooth relevance vector machines use the kernel approach, which maps the original data into a high-dimensional feature space to increase the model's computational power. Commonly used kernel functions are polynomial, sigmoid and Gaussian kernels. Wavelet kernels, and the wavelet support vector machines (wavelet SVMs) built on them, have been adopted in recent studies to solve regression problems [35]. This kernel function can approximate almost any function in the space of square-integrable functions, enhancing the generalization ability of the SVM [36]. In this paper, the commonly used Morlet wavelet kernel function is adopted to train the model:

$$ K_{morlet}(x, x') = \prod_{i=1}^{d} \cos\left( 1.75\, \frac{x_i - x'_i}{a} \right) \exp\left( -\frac{\| x_i - x'_i \|^2}{2a^2} \right) \tag{14} $$

where $a$ is the kernel parameter and $x \in R^d$. Joaquin and Hansen [37] showed that the performance of RVM models can be enhanced by adapting the kernel parameter. This can be accomplished by maximizing the marginal likelihood (13) with respect to the kernel parameter. Here, we simply obtained the value of the kernel parameter that yields the maximum logarithm of the marginal likelihood via a one-dimensional direct search.

5. Application of wsRVM to real tunnels

5.1. Engineering background
The Qingyuzhui–Dongting (Q–D) tunnel section and the Dongting–Yuejiazhui (D–Y) tunnel section are located in the middle of Wuhan metro project Line 4, where the tunnels are constructed using the EPB shield tunneling method. The lengths of the Q–D and D–Y tunnel sections are 536 m and 779 m, respectively. The geological profiles of the tunnels are shown in Fig. 3. It can be observed that the tunnels are excavated mostly within the clay and silty clay layer 4–9 m below the ground surface. Settlement markers are installed along the tunnel alignment at intervals of 5–20 m. As a result, 43 and 57 settlement markers are installed in the Q–D tunnel and the D–Y tunnel, respectively. Because settlement readings are taken once a day on sections near the excavation face, whereas the shield operational parameters are recorded several times a day, the average value of each shield operational parameter between two neighboring measurements is taken as input data for the shield operational factors.

The settlement data from the first 22 settlement markers and the related data on the factors described in Section 3 (i.e., tunnel depth, excavation face to settlement markers, soil types at tunnel crown and invert, tunnel invert to water table, face pressure, drilling velocity, pitching angle, tail void grout pressure and grout filling) were collected initially in the first half of the Q–D tunnel section, forming a dataset that consisted of 147 data samples. As the shield machine moved forward, more data samples were added to the dataset. Consequently, a sizable dataset, comprising 661 data samples, was formed. The first 147 data samples collected were used as the training dataset, while the testing dataset consisted of the remaining 514 data samples. The factors considered in the application of wsRVM and the statistics of those factors are listed in Table 1.

Table 1. Statistics of input factors in wsRVM.

| Factors | Dataset | Maximum | Minimum | Mean | Standard deviation |
|---|---|---|---|---|---|
| Tunnel depth (m) | Training set | 8.64 | 6.70 | 7.90 | 0.57 |
| | Testing set | 9.06 | 3.99 | 6.33 | 1.59 |
| Excavation face to settlement markers (m) | Training set | 56.19 | -30.81 | 12.01 | 18.31 |
| | Testing set | 50.81 | -28.19 | 8.31 | 17.01 |
| Soil types at tunnel crown (a) | Training set | 1 | 0 | 0.48 | 0.50 |
| | Testing set | 1 | 0 | 0.67 | 0.47 |
| Soil types at tunnel invert (a) | Training set | 1 | 0 | 0.29 | 0.46 |
| | Testing set | 1 | 1 | 1 | 0 |
| Invert to water table (m) | Training set | 4.28 | 0.84 | 2.58 | 0.86 |
| | Testing set | 3.80 | 1.30 | 3.20 | 0.46 |
| Face pressure (MPa) | Training set | 0.1950 | 0.0001 | 0.1232 | 0.0740 |
| | Testing set | 0.1683 | 0.0617 | 0.1188 | 0.0207 |
| Drilling velocity (mm/min) | Training set | 80.00 | 20.00 | 51.35 | 13.22 |
| | Testing set | 81.25 | 14.17 | 50.90 | 12.83 |
| Pitching angle (°) | Training set | 4.00 | -4.20 | 0.71 | 1.58 |
| | Testing set | 19.50 | 1.00 | 1.39 | 3.11 |
| Grout pressure (MPa) | Training set | 0.3080 | 0.1333 | 0.2197 | 0.0442 |
| | Testing set | 0.8750 | 0.2740 | 0.3771 | 0.0680 |
| Grout filling (m3) | Training set | 7.00 | 5.25 | 6.91 | 0.35 |
| | Testing set | 8.15 | 3.67 | 5.28 | 0.56 |

(a) Soil types at tunnel crown and invert are binary data, i.e., 0 or 1.

Table 2. Summary of the wsRVM model performance.

| | AIC prior wsRVM | BIC prior wsRVM | RIC prior wsRVM | None prior wsRVM | RVM with RBF kernel |
|---|---|---|---|---|---|
| RMSE of testing dataset | 4.0915 | 4.9423 | 4.7907 | 6.3739 | 8.6710 |
| R of testing dataset | 0.7349 | 0.7264 | 0.6668 | 0.6648 | 0.5548 |

5.2. Model training and validation
The data were first normalized to values between 0 and 1 because data normalization can enhance model performance [38]. The smoothness prior is an important parameter that needs to be determined before the model is trained. Although Schmolck and Everson [17] suggested a BIC prior as the default choice, a thorough examination of the sRVM with different priors was carried out. SVMs and ANNs require an additional procedure to estimate the values of the model parameters (e.g., penalty weight $C$, insensitivity parameter $\varepsilon$) or to determine the architecture of the network, which wastes both data and computational resources. In contrast, RVMs and sRVMs effectively infer analogs of these parameters (i.e., hyperparameters $\alpha$ and $\sigma^2$) by maximizing the logarithm of the marginal likelihood. That is, the model parameters are estimated from the training dataset [11].

The coefficient of correlation (R) and the root mean square error (RMSE) were adopted to evaluate the performance of the model. As summarized in Table 2, the wsRVMs with AIC, BIC and RIC priors outperform the wsRVM with no prior (i.e., the classical wavelet RVM), indicating that the adoption of a smoothness prior can improve model performance. To illustrate the enhancement contributed by the wavelet kernel, the performance of a classical RVM with a Gaussian kernel is also presented in Table 2. It can be observed that the adoption of a wavelet kernel can also enhance the performance of the model. In particular, the AIC-prior wsRVM yields the best performance, as shown in Fig. 4. The predictions are in good agreement with the measured data. Note that only a small proportion of the data is used as the training dataset.

To evaluate the strength of the proposed model, a comparison between this method and other methods is presented in Table 3. The wavelet support vector machine and artificial neural network methods are considered the major competitors. The values of the model parameters (i.e., the penalty weight $C$ and the insensitivity parameter $\varepsilon$) for the wavelet SVM method are optimized using fivefold cross validation, and the architecture of the ANN is determined in a similar way. The AIC-prior wsRVM method performs the best, followed by the wavelet SVM and ANN methods. Furthermore, the wsRVM requires no cross validation, making it relatively simple for practitioners to use.

As mentioned above, the training of the wsRVM is implemented within a Bayesian framework. One benefit of Bayesian training is that it allows the calculation of error bars, i.e., the standard deviation of the predicted distribution of the output, rather than only a point estimate. The standard deviation can be determined from the variance formula given by Eq. (10). Thus, the prediction uncertainty is quantified in the form of a confidence interval for the prediction. For example, the predicted settlement development of settlement marker DK20742 is shown in Fig. 5. The error bars represent the standard deviation of the corresponding prediction at each point.

Fig. 4. AIC-prior wsRVM performance: (a) training and (b) testing. (Scatter plots of predictions (mm) against measurements (mm) with the line Predicted = Measured; panel (a): 147 training data with relevance vectors marked, RMSE = 6.1535, R = 0.8328; panel (b): 514 testing data, RMSE = 4.0915, R = 0.7349.)

Table 3. Comparison between the AIC-prior wsRVM model and other models.

| | AIC prior wsRVM | wSVM ($C = 20$, $\varepsilon = 0.02$) | ANN (11–16–1) |
|---|---|---|---|
| RMSE of testing dataset | 4.0915 | 4.5834 | 6.6942 |
| R of testing dataset | 0.7349 | 0.6586 | 0.5546 |
| RVs/SVs | 4 | 118 | / |

Fig. 5. Illustration of predicted settlement development with error bars. (Measured and predicted settlement (mm) against distance to excavation face (m).)

5.3. Discussion
The smoothness prior is a model parameter that needs to be determined before the training process. A BIC prior is recommended as the default choice [17]. However, as shown in Table 2, it is necessary to test the model performance with different priors. Since there are only four choices for the smoothness prior, a thorough examination of the sRVM with different priors is computationally acceptable.

The kernel parameter also affects the generalization performance of the proposed model. SVMs usually need additional cross validation to determine the kernel parameter. In comparison, the method described in this paper optimizes the kernel parameter of the wsRVM by maximizing the logarithm of the marginal likelihood (i.e., the evidence [31]) during the training process. An analysis was conducted to see how the kernel parameter influences the performance of the model. Fig. 6(a) is a plot of the kernel parameter versus the evidence. Fig. 6(b) and (c) show the effect of the kernel parameter on the RMSE and the coefficient of correlation, respectively. It can be observed that the effect of the kernel parameter on the performance of the model follows its effect on the evidence. The RMSE and the coefficient of correlation reach their minimum and maximum values, respectively, when the value of the kernel parameter is set (by the maximum evidence value) to 4. In addition, the optimized model uses fewer relevance vectors, as shown in Fig. 6(d). Therefore, the kernel parameter of the proposed model can be effectively optimized by maximizing the logarithm of the marginal likelihood, or the evidence, during the training process.

Fig. 6. Effect of the kernel parameter on (a) model training (evidence), (b) model performance (RMSE), (c) model performance (R), and (d) model sparsity (number of relevance vectors). (Each panel is plotted against kernel parameter values from 0 to 5.)

Because calculations are conducted for a large area and because there is a need for real-time prediction, obtaining abundant data from laboratory tests or numerical simulations is practically impossible. Thus, only in situ measurements were used in this paper. Nevertheless, if the in situ measurements are inadequate, it is difficult to validate the model. Using data generated from laboratory tests or numerical simulations can solve this problem. Thus, how to combine the proposed method with existing numerical models is an appropriate subject for future research.

6. Conclusions

Accurately predicting ground surface settlement is a precondition for effectively controlling the settlement that develops during tunneling. Many models have been established for this purpose by characterizing the relationship between the settlement and the factors that influence it. However, most of the existing models focus on maximum ground surface settlement and cannot be used to perform dynamic and real-time predictions. In this paper, the use of a wavelet smooth relevance vector machine for modeling the development of ground surface settlement in EPB shield-driven tunnels was investigated. The model combines the strength of the smooth relevance vector machine and a wavelet kernel to provide more accurate predictions. Unlike SVMs or ANNs, which need additional effort to determine the values of parameters such as the penalty weight or the number of neurons in a hidden layer, the proposed model optimizes the model parameter values in the training process. An additional parametric study showed that the optimized model yields the best predictions among the models compared. The model proposed in this paper is also relatively simple for practitioners to use. Moreover, the distributions of predicted settlement values can be obtained, which makes it possible to assess the confidence associated with the predictions. The results indicate good generalization performance of the prediction model. However, the proposed model requires abundant data for training. If the available in situ measurements are inadequate, data from other sources, such as laboratory tests or numerical simulations, are necessary.

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities under Grant No. HUST:2013QN028. The support is greatly appreciated.

References

[1] Kim CY, Bae GJ, Hong SW, Park CH, Moon HK, Shin HS. Neural network based prediction of ground surface settlements due to tunnelling. Comput Geotech 2001;28(6):517–47.
[2] Leu S, Lo H. Neural-network-based regression model of ground surface settlement induced by deep excavation. Automat Constr 2004;13(3):279–89.
[3] Jan JC, Hung S, Chi SY, Chern JC. Neural network forecast model in deep excavation. J Comput Civ Eng 2002;16(1):59–65.
[4] Suwansawat S, Einstein H. Artificial neural networks for predicting the maximum surface settlement caused by EPB shield tunnelling. Tunnel Undergr Space Technol 2006;21(2):133–50.
[5] Santos OJ, Celestino TB. Artificial neural networks analysis of Sao Paulo subway tunnel settlement data. Tunnel Undergr Space Technol 2008;23(5):481–91.
[6] Vapnik VN. Statistical learning theory. New York: Wiley; 1998.
[7] Samui P. Support vector machine applied to settlement of shallow foundations on cohesionless soils. Comput Geotech 2008;35(3):419–27.
[8] Zhao H, Yin S. Geomechanical parameters identification by particle swarm optimization and support vector machine. Appl Math Model 2009;33(10):3997–4012.
[9] Feng XT, Zhao H, Li S. Modeling non-linear displacement time series of geomaterials using evolutionary support vector machines. Int J Rock Mech Min Sci 2004;41(7):1087–107.
[10] Kecman V. Learning and soft computing: support vector machines, neural networks, and fuzzy logic models. Cambridge, Massachusetts, London, England: The MIT Press; 2001.
[11] Tipping ME. Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 2001;1(3):211–44.
[12] Ghosh S, Mujumdar PP. Statistical downscaling of GCM simulations to streamflow using relevance vector machine. Adv Water Resour 2008;31(1):132–46. [13] Yuan J, Wang K, Yu T, Fang M. Integrating relevance vector machines and genetic algorithms for optimization of seed-separating process. Eng Appl Artif Intell 2007;20(7):970–9. [14] Widodo A, Kim EY, Son J-D, Yang B-S, Tan ACC, Gu D-S, et al. Fault diagnosis of low speed bearing based on relevance vector machine and support vector machine. Expert Syst Appl 2009;36(3):7252–61. [15] Flake J, Moon TK, McKee M, Gunther JH. Application of the relevance vector machine to canal flow prediction in the Sevier River Basin. Agric Water Manage 2010;97(2):208–14. [16] Ammar K, McKee M, Kaluarachchi J. Bayesian method for groundwater quality monitoring network analysis. J Water Resour Plan Manage 2011;137(1):51–61.
[17] Schmolck A, Everson R. Smooth relevance vector machine: a smoothness prior extension of the RVM. Mach Learn 2007;68(2):107–35. [18] Landrin H, Blückert C, Perrin JP, Stacey S, Stolfa A. ALOP/DSU coverage for tunnelling risks? Boston: International Association of Engineering Insurers, 2006. 19 p. Report No.: IMIA WGP 48 (06). [19] Ding L, Ma L, Luo H, Yu M, Wu X. Wavelet analysis for tunneling-induced ground settlement based on a stochastic model. Tunnel Undergr Space Technol 2011;26(5):619–28. [20] Yeh I. Application of neural networks to automatic soil pressure balance control for shield tunneling. Automat Constr 1997;5(5):421–6. [21] Peck RB. Deep excavation and tunnelling in soft ground. In: Proc., 7th Int. Conf. Soil Mechanics and Foundation Engineering, Univ. Nacional Autonoma de Mexico Instituto de Ingenira, Mexico, City; 1969. p. 225–90. [22] Li Y, Emeriault F, Kastner R, Zhang ZX. Stability analysis of large slurry shielddriven tunnel in soft clay. Tunnel Undergr Space Technol 2009;24(2):472–81. [23] Norgrove WB, Cooper I, Attewell PB. Site investigation procedures adopted for the Northumbrian water authority’s tyneside sewerage scheme, with special reference to settlement prediction when tunneling through urban areas. Tunneling 1979:79–184. [24] Yoshikoshi W, Watanabe O, Takagi N. Prediction of ground settlements associated with shield tunnelling. Soils Found 1978;18(4):48–59. [25] Chua CG, Goh ATC. Estimating wall deflections in deep excavations using Bayesian neural networks. Tunnel Undergr Space Technol 2005;20(4):400–9. [26] Kung GTC, Hsiao ECL, Schuster M, Juang CH. A neural network approach to estimating deflection of diaphragm walls caused by excavation in clays. Comput Geotech 2007;34(5):385–96. [27] Finno RJ, Clough GW. Evaluation of soil response to EPB shield tunneling. J Geotech Eng 1985;111(2):155–73. [28] Kasper T, Meschke G. A numerical study of the effect of soil and grout material properties and cover depth in shield tunnelling. 
Comput Geotech 2006;33(4– 5):234–47. [29] Rasmussen CE, Williams CKI. Gaussian processes for machine learning. Cambridge, MA: The MIT press; 2006. [30] Tipping ME. Bayesian inference: an introduction to principles and practice in machine learning. Adv Lect Mach Learn 2004;3176:41–62. [31] MacKay D. The evidence framework applied to classification networks. Neural Comput 1992;4(5):720–36. [32] Holmes CC, Denison DGT. Bayesian wavelet analysis with a model complexity prior. Bayesian Stat 1999;6:972–8 [Oxford, UK: Oxford Univ. Press]. [33] Tzikas DG, Likas AC, Galatsanos NP. Sparse Bayesian modeling with adaptive kernel learning. IEEE Trans Neural Netw 2009;20(6):926–37. [34] Tipping ME, Faul AC. Fast marginal likelihood maximisation for sparse Bayesian models. In: Bishop CM, Frey BJ, editors. Proc., 9th Int. workshop on artificial intelligence and statistic, Key West, Fla.; 2003. [35] Widodo A, Yang BS. Wavelet support vector machine for induction machine fault diagnosis based on transient current signal. Expert Syst Appl 2008;35(1– 2):307–16. [36] Wu Q. The forecasting model based on wavelet v-support vector machine. Expert Syst Appl 2009;36(4):7604–10. [37] Joaquin QC, Hansen LK. Time series prediction based on the relevance vector machine with adaptive kernels. In: Proceedings of the IEEE international conference on acoustics, speech, and, signal processing; 2002. p. 985–8. [38] Flood I, Kartan N. Neural networks in civil engineering I: principles and understanding. J Comput Civ Eng 1994;8(2):131–48.