Accepted Manuscript
Modeling of shield-ground interaction using an adaptive relevance vector machine
Fan Wang, Biancai Gou, Xianquan Han, Qiling Zhang, Yawei Qin
PII: S0307-904X(15)00550-8
DOI: 10.1016/j.apm.2015.09.016
Reference: APM 10715
To appear in: Applied Mathematical Modelling
Received date: 11 July 2014
Revised date: 26 June 2015
Accepted date: 22 September 2015
Please cite this article as: Fan Wang, Biancai Gou, Xianquan Han, Qiling Zhang, Yawei Qin, Modeling of shield-ground interaction using an adaptive relevance vector machine, Applied Mathematical Modelling (2015), doi: 10.1016/j.apm.2015.09.016
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
The developed model considers soil behavior in response to tunnel excavation.
Adaptive feature scaling factors are adopted to improve model performance.
The relative importance of features related to ground settlement is prioritized.
The model parameters can be determined automatically in the training process.
Modeling of shield-ground interaction using an adaptive relevance vector machine
Fan Wang a,b,*, Biancai Gou c, Xianquan Han a, Qiling Zhang a, Yawei Qin b
a Changjiang River Scientific Research Institute, Wuhan 430010, P.R. China
b Department of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan 430074, P.R. China
c Department of Civil Engineering, Wuhan University of Science and Technology–City College, Wuhan 430083, P.R. China
ABSTRACT
The shield tunneling method is widely adopted in tunneling projects. Analysis of ground settlement is required as an effective way of minimizing the potential damage caused by tunneling. Many efforts have been devoted to this purpose using various methods such as empirical approaches and numerical modeling. However, multiple factors may influence the ground settlement, and the shield-ground relationship is highly non-linear and complex. To understand the complex soil behavior in response to shield penetration, a model that can establish the relationship and make accurate predictions of tunneling-induced ground settlement is needed. This paper proposes a model based on relevance vector machines (RVMs) to develop the predictive relations. Adaptive feature scaling factors are introduced as an inherent mechanism that enables RVMs to identify the relative importance of each input factor, and an optimization method for obtaining the appropriate values of the feature scaling factors is proposed. The potential of the proposed adaptive model was investigated by applying it to tunnels bored by an earth pressure balance (EPB) shield machine. Three categories of factors, namely tunnel geometry, geological conditions and shield operational parameters, were considered in the model. The results demonstrate that the proposed model has competitive predictive capacities and that the adoption of adaptive feature scaling factors can enhance the prediction accuracy and provide a measure of the relative importance of each input factor. Moreover, the implementation of the adaptive RVM model is relatively simple. There is no need to set model parameters because they can be automatically optimized during model training, which makes the method a practical tool for geotechnical engineers to evaluate ground reactions during tunnel excavation.
* Corresponding author. Tel.: +86 27 82829879; fax: +86 27 82820548. E-mail address: [email protected] (F. Wang).
KEY WORDS: ground settlement; shield-ground interaction; relevance vector machine; adaptive feature scaling factors; instrumentation
1. Introduction
In shield tunneling projects, excavation inevitably disturbs the original stress field and hence ground movement is induced. Ground settlement, as an important measurement item that indicates the vertical displacement of the ground surface, needs to
be closely monitored to assess construction safety, particularly in urban areas where excessive settlement may cause serious damage to adjacent structures or facilities. To avoid large settlement and potential damage, analyzing and predicting ground settlement development is required during tunneling. Generally, there are two major
classes of ground settlement analysis methods: empirical methods and numerical methods. The empirical methods are based mainly on the practice and experience of tunnel construction. Several empirical formulae have been proposed for predicting ground movement over the past decades [1, 2]. However, the accuracy of the empirical methods is questioned because the methods fail to take into account all the
relevant factors, including many shield operational parameters, which concurrently
influence the ground settlement [3]. On the other hand, numerical methods have been widely used in tunneling projects. The numerical approaches consider the
characteristics of ground and construction conditions with sophisticated constitutive models, which enable the calculation of deformations at each point within the ground
[4]. However, the implementation of a numerical model is relatively complex, particularly when mechanized processes of shield excavation are considered. The
detailed information on soil properties that is required for simulation is scarce or unavailable in many cases, and building a practical constitutive soil model for tunneling-induced settlement prediction is rather difficult [5]. With the rapid development of monitoring methods and hardware, a large amount of in situ measurement data is generated in practice. Kim et al. [3] suggested that the instrumentation data should be the best ‘textbook’ for understanding the tunneling-induced ground settlement. Therefore, a viable approach that fully utilizes the instrumentation data for ground settlement analysis and prediction is necessary. In
the past years, many researchers have applied artificial neural networks (ANNs) in geotechnical engineering problems. Several ANN-based ground settlement prediction models were proposed by building the relationships between multiple variables and the induced ground settlement [5-8]. The variables included geometrical characteristics (e.g., cover depth, tunnel diameter, cover-span ratio), geological parameters (e.g., Young’s modulus, friction angle, cohesion) and construction
conditions (e.g., support method, excavation speed, dewatering condition). It is found that the complete consideration of variables can enhance the predictive accuracy of the ANN models [8].
Some reports have shown that most failures, such as large ground deformations,
take place at the tunnel construction stage [9]. Nevertheless, many developed ANN models focus on the maximum ground settlement, which does not satisfy the requirements of real-time settlement prediction during tunnel construction. Moreover, determination of the network architecture is complex when using ANNs because trial-and-error is always needed, and ANNs suffer from overfitting in many cases [10].
Support vector machines (SVMs) [11], as a machine learning technique, are also used in geomechanical back analysis [12], prediction of settlement of shallow
foundation [13], tunnel deformation control [14] and other geotechnical areas. SVM-based models typically have good generalization capacities because a structural
risk minimization (SRM) induction principle that attempts to minimize the error both on the training data and testing data is adopted. However, the selection of model
parameters, such as the insensitivity parameter ε and the penalty weight C, is relatively complex, and these models do not allow the identification of the relative importance of different factors. Sensitivity analysis is usually needed to clarify the cause and effect of the input-output relationship. Recently, a new machine learning technique named the relevance vector machine (RVM) has been introduced for its high generalization performance, sparse model structure, distributive prediction, and free choice of kernel function [15]. Due to these advantages, when using the same inputs, RVM models generally outperform SVM
models for better prediction results [16-18]. This paper investigated the potential of RVMs to analyze tunneling-induced ground settlement. A Gaussian kernel with adaptive feature scaling factors was integrated as an inherent mechanism that can prioritize the relative importance of the input factors, and the optimization method for obtaining the kernel parameters was proposed. Field instrumentation data and EPB shield operational history are well recorded during the construction of the Wuhan
(China) metro project. These data provide a good opportunity for investigating how ground soils react to shield penetration. Using the collected data, the adaptive RVM model was trained and validated. The modeling results of the adaptive RVM model were then evaluated by comparing them to those obtained from other approaches. The results demonstrated the strength of the proposed model for accurate settlement prediction and identification of critical factors.
2. Adaptive relevance vector machines
2.1 Relevance vector machine
Tipping first proposed RVM to solve regression and classification problems [15].
The RVM learning algorithm within a Bayesian framework is in fact a specialization of Gaussian processes [19]. RVMs have a similar functional form to the popular
SVMs but a different basis function set is employed for prediction. The key feature of RVMs is that relatively fewer basis functions (termed ‘relevance vectors’ and ‘support
vectors’ in RVMs and SVMs, respectively), are adopted to offer generalization performance that is still comparable to that of SVMs. Fig. 1 illustrates the different regression results with a sinc toy example.
Fig. 1. Noisy sinc RVM regression versus SVM regression. Samples are generated by y = sin(x)/x with an additive noise component ε ~ N(0, 0.05). Both RVM and SVM adopt a Gaussian kernel with kernel width r = 1. The insensitivity parameter and penalty weight for the SVM model are set to 0.1 and 10, respectively. [Legend: noisy data, sinc function, RVM regression, SVM regression, relevance vectors, support vectors.]
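Given a training set $\{\mathbf{x}_n, t_n\}_{n=1}^{N}$, the targets $\mathbf{t} = (t_1, \ldots, t_N)^T$ can be approximated by the outputs of a function $\mathbf{y} = (y(\mathbf{x}_1), \ldots, y(\mathbf{x}_N))^T$ and an additive noise vector $\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_N)^T$:
$$\mathbf{t} = \mathbf{y} + \boldsymbol{\varepsilon} = \boldsymbol{\Phi}\mathbf{w} + \boldsymbol{\varepsilon} \tag{1}$$
where $\mathbf{w} = [w_1, \ldots, w_N]^T$ are the weights of the linear model and $\boldsymbol{\Phi}$ is the $N \times M$ ‘design’ matrix with $\phi_{nm} = K(\mathbf{x}_n, \mathbf{x}_m)$ as its elements. $K(\cdot)$ are basis functions that can be, for example, Gaussian kernels. The data noise $\varepsilon_n$ is independent and can be assumed to follow a Gaussian distribution with variance $\sigma^2$ and zero mean: $p(\varepsilon_n \mid \sigma^2) = N(0, \sigma^2)$.
Because the targets are assumed to be independent, the likelihood of the whole dataset can be expressed as follows:
$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\|\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\|^2\right\} \tag{2}$$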
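The conventional method that employs maximum-likelihood estimation of $\mathbf{w}$ and $\sigma^2$ from (2) would lead to severe overfitting [20]. To control the model complexity and avoid overfitting, a zero-mean Gaussian prior distribution over each weight $w_i$ with a different precision $\alpha_i$ is adopted, and the weight prior takes the form:
$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = (2\pi)^{-M/2} \prod_{m=1}^{M} \alpha_m^{1/2} \exp\left(-\frac{\alpha_m w_m^2}{2}\right) \tag{3}$$
The hyper-parameter $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_M)^T$ determines how far each weight can deviate from zero. Consequently, the posterior distribution for the weights conditioned on the training data can be calculated using Bayes’ law:
$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = \frac{p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})}{p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)} \tag{4}$$
Both factors in the numerator of (4) are Gaussian, so the posterior over the weights $\mathbf{w}$ also follows a Gaussian distribution and can be described as $p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ given by:
$$\boldsymbol{\mu} = \sigma^{-2}\boldsymbol{\Sigma}\boldsymbol{\Phi}^T\mathbf{t} \tag{5}$$
$$\boldsymbol{\Sigma} = (\mathbf{A} + \sigma^{-2}\boldsymbol{\Phi}^T\boldsymbol{\Phi})^{-1} \tag{6}$$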
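where $\mathbf{A} = \mathrm{diag}(\boldsymbol{\alpha})$. To calculate the weight posterior $p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2)$, maximum a posteriori (MAP) estimates are employed to determine $\boldsymbol{\alpha}$ and $\sigma^2$ by seeking the most-probable values $(\boldsymbol{\alpha}_{MP}, \sigma^2_{MP})$ of the posterior of hyper-parameters $p(\boldsymbol{\alpha}, \sigma^2 \mid \mathbf{t}) \propto p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)\, p(\boldsymbol{\alpha})\, p(\sigma^2)$. As a result, with uniform hyperpriors, the MAP estimates of the hyper-parameters simply need to maximize the marginal likelihood $p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)$. In particular, the logarithm of the RVM marginal likelihood (also known as evidence [21]) is written as:
$$L = \log p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = -\frac{1}{2}\left[N \log 2\pi + \mathbf{t}^T\mathbf{C}^{-1}\mathbf{t} + \log|\mathbf{C}|\right] \tag{7}$$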
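where $\mathbf{C} = \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^T + \sigma^2\mathbf{I}$. The values of $\boldsymbol{\alpha}_{MP}$ and $\sigma^2_{MP}$ cannot be obtained analytically, and the iterative re-estimation formulae are given as:
$$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2} \tag{8}$$
$$(\sigma^2)^{new} = \frac{\|\mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu}\|^2}{N - \sum_i \gamma_i} \tag{9}$$
where $\gamma_i = 1 - \alpha_i \Sigma_{ii}$. Each $\gamma_i \in [0, 1]$ is a measure of how ‘well-determined’ its corresponding parameter $w_i$ is by the training data [21], and the training vectors associated with the remaining non-zero weights after this optimization are the aforementioned ‘relevance vectors’ [15].
In the learning procedure, the values of $\boldsymbol{\alpha}$ and $\sigma^2$ are replaced by $\boldsymbol{\alpha}_{MP}$ and $\sigma^2_{MP}$, respectively. The posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ can then be obtained, and the prediction for a new datum $\mathbf{x}_*$ is given by
$$y_* = \boldsymbol{\mu}^T \boldsymbol{\phi}(\mathbf{x}_*) \tag{10}$$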
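The re-estimation loop of Eqs. (5), (6), (8) and (9) and the prediction rule of Eq. (10) can be illustrated with a minimal NumPy sketch on a sinc toy problem like the one in Fig. 1. This is not the authors' implementation. The function names, the initial values of α and σ², the pruning threshold `alpha_max`, the iteration count and the small numerical guards are all assumptions made for illustration, and the noise term N(0, 0.05) is read here as a standard deviation of 0.05.

```python
import numpy as np

def gaussian_kernel(X1, X2, r=1.0):
    """Gaussian kernel of Eq. (11): exp(-||x - x'||^2 / r^2)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / r**2)

def fit_rvm(Phi, t, n_iter=500, alpha_max=1e6):
    """Iterate Eqs. (5), (6), (8), (9); prune weights whose alpha diverges."""
    N, M = Phi.shape
    alpha = np.ones(M)            # assumed initial precisions
    sigma2 = 0.1 * np.var(t)      # assumed initial noise variance
    keep = np.arange(M)           # basis functions still in the model

    def posterior(keep, sigma2):
        P = Phi[:, keep]
        Sigma = np.linalg.inv(np.diag(alpha[keep]) + P.T @ P / sigma2)  # Eq. (6)
        mu = Sigma @ P.T @ t / sigma2                                    # Eq. (5)
        return P, mu, Sigma

    for _ in range(n_iter):
        P, mu, Sigma = posterior(keep, sigma2)
        gamma = np.clip(1.0 - alpha[keep] * np.diag(Sigma), 1e-12, 1.0)
        alpha[keep] = gamma / np.maximum(mu**2, 1e-200)                  # Eq. (8)
        sigma2 = np.sum((t - P @ mu) ** 2) / max(N - gamma.sum(), 1e-6)  # Eq. (9)
        keep = keep[alpha[keep] < alpha_max]   # drop effectively-zero weights
    _, mu, Sigma = posterior(keep, sigma2)
    return {"rv": keep, "mu": mu, "Sigma": Sigma, "sigma2": sigma2}

# Toy data: y = sin(x)/x plus Gaussian noise (np.sinc(z) is sin(pi z)/(pi z)).
rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 100)[:, None]
t = np.sinc(x[:, 0] / np.pi) + rng.normal(0.0, 0.05, size=100)

model = fit_rvm(gaussian_kernel(x, x), t)
x_new = np.linspace(-5, 5, 41)[:, None]
y_new = gaussian_kernel(x_new, x[model["rv"]]) @ model["mu"]   # Eq. (10)
```

The indices in `model["rv"]` are the training points that survive pruning, which play the role of the relevance vectors described above.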
2.2 Adapting feature scaling factors
RVMs use kernel mapping to improve the non-linear regression ability. Gaussian kernels are commonly used in machine learning practice [22]. Eq. (11) is a typical Gaussian kernel:
$$K_{Gaussian}(\mathbf{x}, \mathbf{x}') = \exp\left\{-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{r^2}\right\} \tag{11}$$
where $r$ is a unified kernel width. The input features are associated with a constant kernel width, which implies all the features are regarded as equally important. If multiple input scale parameters are considered, the classical Gaussian kernel is modified with extension of features:
$$K_{AdaptiveGaussian}(\mathbf{x}, \mathbf{x}') = \exp\left\{-\sum_{d=1}^{D}\frac{\|x_d - x'_d\|^2}{r_d^2}\right\} \tag{12}$$
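where $r_d$ is the adaptive kernel width, which is also called the ‘feature scaling factor’ [23], and $D$ is the input space dimension. The importance of each feature is now weighed by the feature scaling factor. A larger feature scaling factor indicates a more significant influence of the input feature on the output [23]. Consequently, the adoption of an adaptive Gaussian kernel enables the model to identify the relative importance of each factor. To obtain the values of the feature scaling factors, the gradient of the evidence $L$ (7) with respect to the $k$-th feature scaling factor is written in the form:
$$\frac{\partial L}{\partial r_k} = \sum_{n=1}^{N}\sum_{m=1}^{M} \frac{\partial L}{\partial \phi_{nm}} \frac{\partial \phi_{nm}}{\partial r_k} \tag{13}$$
where $\phi_{nm} = \phi_m(\mathbf{x}_n; \mathbf{r})$ denotes the elements of the design matrix $\boldsymbol{\Phi}$. The first term in (13) is independent of the kernel function parameters, and the derivative of $L$ with respect to $\boldsymbol{\Phi}$ is given as [15]:
$$\frac{\partial L}{\partial \boldsymbol{\Phi}} = (\mathbf{C}^{-1}\mathbf{t}\mathbf{t}^T\mathbf{C}^{-1} - \mathbf{C}^{-1})\boldsymbol{\Phi}\mathbf{A}^{-1} = \sigma^{-2}[(\mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu})\boldsymbol{\mu}^T - \boldsymbol{\Phi}\boldsymbol{\Sigma}] \tag{14}$$
Let $\eta_k = r_k^{-2}$; the derivative of $\phi_{nm}$, namely the adaptive Gaussian kernel described in (12), with respect to $\eta_k$ is:
$$\frac{\partial \phi_{nm}}{\partial \eta_k} = -\phi_{nm}(x_{mk} - x_{nk})^2 \tag{15}$$
By defining $\mathbf{D} = \partial L / \partial \boldsymbol{\Phi}$ and combining (14) and (15), the derivative (13) of the evidence $L$ with respect to the $k$-th feature scaling factor can be re-written as:
$$\frac{\partial L}{\partial \eta_k} = \sum_{n=1}^{N}\sum_{m=1}^{M} -D_{nm}\phi_{nm}(x_{mk} - x_{nk})^2 \tag{16}$$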
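As a sanity check on Eqs. (14) to (16), the analytic gradient can be compared with a central finite difference of the evidence in Eq. (7). The sketch below assumes the parameterization $\eta_k = r_k^{-2}$, so that the kernel in Eq. (12) becomes $\exp(-\sum_d \eta_d (x_d - x'_d)^2)$, and it places one basis function on every training point (M = N). The function names and the random test data are hypothetical.

```python
import numpy as np

def adaptive_kernel(X1, X2, eta):
    """Eq. (12) written with eta_d = r_d**-2; also returns squared differences."""
    diff2 = (X1[:, None, :] - X2[None, :, :]) ** 2      # shape (N, M, D)
    return np.exp(-(diff2 * eta).sum(axis=2)), diff2

def log_evidence(Phi, t, alpha, sigma2):
    """Eq. (7) with C = Phi A^-1 Phi^T + sigma^2 I."""
    N = len(t)
    C = Phi @ np.diag(1.0 / alpha) @ Phi.T + sigma2 * np.eye(N)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (N * np.log(2 * np.pi) + t @ np.linalg.solve(C, t) + logdet)

def evidence_gradient(X, t, eta, alpha, sigma2):
    """Eq. (16): dL/d eta_k = sum_nm -D_nm phi_nm (x_mk - x_nk)^2."""
    Phi, diff2 = adaptive_kernel(X, X, eta)
    Sigma = np.linalg.inv(np.diag(alpha) + Phi.T @ Phi / sigma2)   # Eq. (6)
    mu = Sigma @ Phi.T @ t / sigma2                                 # Eq. (5)
    D = (np.outer(t - Phi @ mu, mu) - Phi @ Sigma) / sigma2         # Eq. (14)
    return -np.einsum("nm,nm,nmk->k", D, Phi, diff2)

# Small random problem with alpha and sigma^2 held fixed.
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))
t = rng.normal(size=8)
eta = np.array([0.7, 1.3])
alpha = rng.uniform(0.5, 2.0, size=8)
sigma2 = 0.1

grad = evidence_gradient(X, t, eta, alpha, sigma2)

# Central finite differences of Eq. (7), one feature at a time.
h = 1e-5
fd = np.array([
    (log_evidence(adaptive_kernel(X, X, eta + h * e)[0], t, alpha, sigma2)
     - log_evidence(adaptive_kernel(X, X, eta - h * e)[0], t, alpha, sigma2)) / (2 * h)
    for e in np.eye(2)
])
```

In this parameterization a larger $\eta_k$ makes the kernel more sensitive to differences in feature $k$, which is why the sketch works with $\eta$ rather than $r$.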
where $D_{nm}$ are the elements of the matrix $\mathbf{D}$. The feature scaling factors can thus be optimized using a simple hill-climbing method. Specifically, a stage-wise optimization method is proposed by interleaving the updates of the hyper-parameters (i.e., $\boldsymbol{\alpha}$ and $\sigma^2$) and the feature scaling factors (i.e., $\eta_k$) with a few cycles. The procedure is described in Fig. 2. In the first stage, the hyper-parameters $\boldsymbol{\alpha}$ and $\sigma^2$ are optimized while the feature scaling factor $\eta_k$ is fixed. After iterating H1 cycles of the $\boldsymbol{\alpha}$ and $\sigma^2$ optimization, the feature scaling factor $\eta_k$ is optimized with the fixed hyper-parameters $\boldsymbol{\alpha}$ and $\sigma^2$ for H2 cycles in the second stage. Due to the change of feature scaling factors, the kernel functions $\boldsymbol{\Phi}$ are changed, and $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ need to be updated to reflect the current state of the model.
Fig. 2. Stage-wise optimization of RVM with adaptive feature scaling factors. [Flowchart: initialize α and σ²; Stage 1: compute μ and Σ and update α and σ² for H1 cycles; Stage 2: search ηk using the hill-climbing method and update μ, Σ and Φ for H2 cycles; repeat until the maximum iteration is reached or the procedure has converged; optimal model obtained.]
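The two-stage procedure of Fig. 2 can be sketched as follows. This is an illustrative reading of the flowchart rather than the authors' code. The hill-climbing step is implemented here as a gradient step along Eq. (16) that is accepted only when the evidence increases, with a step size that grows after an accepted move and shrinks after a rejected one. That acceptance rule, the cap `alpha_cap`, the initial values, the number of rounds and the synthetic data are assumptions.

```python
import numpy as np

def kernel(X, eta):
    """Adaptive Gaussian kernel, Eq. (12), with eta_d = r_d**-2."""
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2
    return np.exp(-(diff2 * eta).sum(axis=2)), diff2

def posterior(Phi, t, alpha, sigma2):
    Sigma = np.linalg.inv(np.diag(alpha) + Phi.T @ Phi / sigma2)   # Eq. (6)
    return Sigma @ Phi.T @ t / sigma2, Sigma                        # Eq. (5)

def log_evidence(Phi, t, alpha, sigma2):
    C = (Phi / alpha) @ Phi.T + sigma2 * np.eye(len(t))             # Phi A^-1 Phi^T + s2 I
    return -0.5 * (len(t) * np.log(2 * np.pi) + t @ np.linalg.solve(C, t)
                   + np.linalg.slogdet(C)[1])                       # Eq. (7)

def stage1(Phi, t, alpha, sigma2, cycles, alpha_cap=1e6):
    """Stage 1: re-estimate alpha and sigma^2 with eta fixed, Eqs. (8)-(9)."""
    for _ in range(cycles):
        mu, Sigma = posterior(Phi, t, alpha, sigma2)
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 1e-12, 1.0)
        alpha = np.minimum(gamma / np.maximum(mu**2, 1e-200), alpha_cap)
        resid = np.sum((t - Phi @ mu) ** 2)
        sigma2 = max(resid / max(len(t) - gamma.sum(), 1e-6), 1e-10)
    return alpha, sigma2

def stage2(X, t, eta, alpha, sigma2, cycles, step):
    """Stage 2: hill-climb eta with alpha and sigma^2 fixed, using Eq. (16)."""
    for _ in range(cycles):
        Phi, diff2 = kernel(X, eta)
        mu, Sigma = posterior(Phi, t, alpha, sigma2)
        D = (np.outer(t - Phi @ mu, mu) - Phi @ Sigma) / sigma2     # Eq. (14)
        grad = -np.einsum("nm,nm,nmk->k", D, Phi, diff2)            # Eq. (16)
        trial = np.maximum(eta + step * grad, 1e-6)                 # keep eta positive
        if log_evidence(kernel(X, trial)[0], t, alpha, sigma2) > log_evidence(Phi, t, alpha, sigma2):
            eta, step = trial, step * 1.5     # accept an uphill move
        else:
            step *= 0.5                       # reject and shorten the step
    return eta, step

def fit_adaptive_rvm(X, t, n_rounds=10, H1=10, H2=10, step=1e-3):
    eta, alpha, sigma2 = np.ones(X.shape[1]), np.ones(len(t)), 0.1 * np.var(t)
    history = []   # (evidence before stage 2, evidence after stage 2) per round
    for _ in range(n_rounds):
        alpha, sigma2 = stage1(kernel(X, eta)[0], t, alpha, sigma2, H1)
        before = log_evidence(kernel(X, eta)[0], t, alpha, sigma2)
        eta, step = stage2(X, t, eta, alpha, sigma2, H2, step)
        history.append((before, log_evidence(kernel(X, eta)[0], t, alpha, sigma2)))
    return eta, alpha, sigma2, history

# Synthetic example: the target depends on feature 0 only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(30, 2))
t = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0.0, 0.1, size=30)
eta, alpha, sigma2, history = fit_adaptive_rvm(X, t)
```

Because a trial value of η is kept only when it raises the evidence, the second stage in this sketch can never lower the evidence for the current α and σ². The paper's fixed learning rate would behave differently, so the sketch should be read as one possible hill-climbing variant.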
3. Representation of the shield-ground relationship
Shield-ground interaction is a complex process because of the involvement of multiple factors, such as ground condition and shield operation. Due to the complex shield-ground relationship, site instrumentation data are always non-linear and noisy. Fig. 3 is the schematic diagram describing shield tunneling and the induced ground settlement. The settlement develops continually in response to the shield penetration, and the effect of shield penetration on settlements increases as the shield machine approaches and decreases as it recedes. Therefore, the settlement is normally plotted against the distance between measurement devices and excavation face to demonstrate the longitudinal settlement profile. In practice, because the advance rate is not constant and the measurement readings are not taken at equal time intervals, the data points that form the settlement curve may not be equally spaced [4]. To model the settlement development, the present settlement at an instrumented section $s$ (indicating the current ground stress conditions), coupled with other influencing factors $F$ recorded between two consecutive settlement measurements (representing the loads applied by the shield penetration), is used to predict the settlement $s'$ for the next excavation step. As the shield machine advances, the measurement of the settlement at the next excavation step and the possibly changed factors, such as shield operational parameters, are taken again as the input for the next prediction [24]. This process is expressed as the following equation:
$$s' = f(s, F) \tag{17}$$
The non-linear relationship $f(\cdot)$ is learned from a known dataset using the
adaptive RVM, and the prediction is conducted based on the trained model. In addition, the feature scaling factors optimized in the training process can be used to
investigate the effect of each factor on ground settlement. This characteristic can be useful in engineering practice. For example, the established model has the potential to
provide directions about shield steering for settlement control because the ground
movement is sensitive to the critical shield operational parameter.
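The input-output pairing behind Eq. (17) can be made concrete with a short sketch. It assumes one list of settlement readings per instrumented section and one record of influencing factors for each interval between consecutive readings. The function names, the record layout and the numbers in the example are hypothetical.

```python
def make_samples(settlements, factors):
    """Pair each reading s with the factors F recorded before the next reading s'.

    settlements: readings s_0 .. s_T at one instrumented section (mm)
    factors:     T records of influencing factors, one per interval between readings
    Returns a list of (inputs, target) pairs with inputs = [s, F_1, ..., F_k].
    """
    if len(factors) != len(settlements) - 1:
        raise ValueError("need exactly one factor record per pair of consecutive readings")
    return [([s] + list(F), s_next)
            for s, F, s_next in zip(settlements[:-1], factors, settlements[1:])]

def predict_step(f, s_measured, F):
    """One-step prediction s' = f(s, F) from the latest measured settlement."""
    return f([s_measured] + list(F))

# Three readings at one section and two factor records (illustrative values only).
samples = make_samples([0.0, -1.2, -3.5], [[10.0, 0.20], [4.0, 0.25]])
```

Each new field measurement replaces the previous prediction as the input `s_measured`, which matches the step-by-step use of measured settlements described above.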
Fig. 3. Schematic diagram of shield tunneling. [Upper panel: a typical longitudinal development of ground settlement, plotting settlement (mm) against distance to excavation face (m), with the approaching and receding phases and the passing of the front edge and tail of the shield marked. Lower panel: ground surface, settlement marker, shield machine, launching station, excavation direction, completed tunnel lining and excavation face.]
4. Application
The Dongting–Yuejiazhui (D–Y) tunnel and the Qingyuzhui–Dongting (Q–D) tunnel constitute the middle section of Line 4 in the Wuhan (China) metro project. The tunnels were constructed by the EPB shield tunneling method. The D–Y tunnel and the Q–D tunnel are 779 m and 536 m in length, respectively. Fig. 4 shows the geological profiles of the two tunnels. The main geologic formations encountered by the tunnel sections are clay and silty clay, and the cover depth ranges from 4 m to 9 m. Ground settlement markers were deployed above the centre line of the tunnel at intervals of about 10 m. Measurements of ground settlement were taken once per day during shield passing.
Fig. 4. (a) Soil profile of the D–Y tunnel and (b) soil profile of the Q–D tunnel.
The affecting factors considered can be grouped into three categories:
geometrical characteristics, geological conditions and shield operational parameters [7, 24]. Two geometrical factors are considered: the distance between the excavation face
and the ground settlement markers (d) and the cover-span ratio (Z/2R), wherein Z represents the cover depth and 2R is the tunnel diameter. However, because the tunnel diameter was designed as a constant of 6 meters, the latter factor is replaced by the tunnel depth (Z). Three factors, namely the soil type at the tunnel crown (Soilcrown), the soil type at the tunnel invert (Soilinvert), and the tunnel depth below the groundwater table (WT), are considered as the geological factors. The soil types rather than detailed soil properties are adopted because the detailed information on
geotechnical parameters at most instrumented sections is unavailable. The shield operational parameters include penetration speed (v), face pressure (p1), tail void grouting pressure (p2), grouting volume (V) and pitching angle (θ). By collecting the instrumentation data, referring to the geological investigation documents and recording the shield operational parameters, a dataset consisting of 661 samples was created and they were sorted according to the measurement date. The dataset was then
divided into a training dataset, which contained the first 430 data samples, and a testing dataset, which consisted of the last 231 data samples. Fig. 5 gives the boxplot of the original data. The factors Soilcrown and Soilinvert are represented by dummy variables. Because the tunnels are drilled in two soil types (i.e., clay and silty clay) in this case, the values of the two factors are binary (i.e., 0 or 1) and are not shown in Fig. 5.
Fig. 5. Boxplot of the original dataset. [Variables plotted: d (m), Z ×10^-1 (m), WT ×10^-1 (m), v (mm/min), p1 ×10 (kPa), p2 ×10 (kPa), V ×10^-1 (m³), θ (°).]
5. Results and discussion
Data normalization to the range [0, 1] was carried out before the model training. When building ANN-based or SVM-based models, additional efforts are usually required to determine the architecture of the network for ANN models or to select appropriate model parameters (e.g., penalty weight C, insensitivity parameter ε, and kernel width r) that yield the best performance for SVM models. In contrast, the proposed method can infer analogues of those parameters (i.e., hyperparameters $\boldsymbol{\alpha}$ and $\sigma^2$, and feature scaling factors $r_d$) by searching for the maximum of the logarithm of the marginal likelihood. In other words, there is no need to set the model parameters because they can be obtained during model training.
Fig. 6 shows the inferred weights after model training. As mentioned above, how ‘well-determined’ the parameter wi is by the training data can be measured by the quantities γi. Fig. 7 demonstrates the values of the ‘well-determinedness’ of each
weight. Most values of γi tend to be 1, while the two relevance vectors with low values of γi are associated with very small weights. These results show that the entire model is largely determined by the ‘well-determined’ relevance vectors, while those relevance vectors that may have minor influence on model performance are not neglected. In particular, the non-linear shield-ground relationship is well represented by the trained model.
Fig. 6. Values of wi for the trained model. [Axes: wi versus number of data samples.]
Fig. 7. Values of γi for the trained model. [Axes: γi versus number of data samples.]
Fig. 8. Performance of the model for the training dataset and testing dataset. [Axes: predictions (mm) versus measurements (mm); legend: 430 training data, 231 testing data, relevance vectors, predicted = measured.]
The trained model was then used for ground settlement prediction. Fig. 8 illustrates the performance of the model for the training and testing datasets. As shown in Fig. 8, the predictions are in good agreement with the measured data. A comparison of the simulation results is presented in Table 2 to evaluate the strength of the adaptive RVM model. The root mean square error (RMSE) and the coefficient of correlation (R) were employed as performance metrics. The ANN and SVM models are considered as the major competitors. Specifically, the ANN model architecture is determined by a 5-fold cross validation method that sequentially uses a small proportion of the training set as validation data. The kernel width of the SVM model is also selected in the same way when the penalty weight and insensitivity parameter are arbitrarily set to 10, 20, 50, 100 and 0.01, 0.05, 0.1, 0.5, respectively. It is found that a 3-layer ANN architecture with 12 hidden neurons and an SVM with a penalty weight of 20, insensitivity parameter of 0.1, and kernel width of 0.2 yield the best prediction performance on the validation dataset. Moreover, to illustrate how the adoption of adaptive feature scaling factors can improve the prediction results, a classical RVM model with a standard Gaussian kernel is also established. Based on a direct search method, the unified kernel width of the classical RVM that corresponds to the maximum evidence is used. As shown in Fig. 9, the classical RVM with a kernel width of 0.5 has the maximum evidence and is applied for prediction.
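The two performance metrics named in this paragraph can be computed as in the sketch below. The paper does not print the formulas, so the usual definitions are assumed: RMSE as the square root of the mean squared difference, and R as the Pearson correlation between measured and predicted settlements. The sample values are illustrative only.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def correlation(measured, predicted):
    """Pearson coefficient of correlation R (assumed definition)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

# Settlements in mm; the prediction is offset from the measurement by 0.5 mm.
measured = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.5, -3.5, -4.5]
error = rmse(measured, predicted)
r_value = correlation(measured, predicted)
```

A constant offset leaves R at its maximum while RMSE stays non-zero, which is one reason the two metrics are reported together.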
Fig. 9. Classical RVM model evidence corresponding to different values of kernel width. [Axes: evidence versus kernel width.]
Based on Table 2, the proposed model outperforms the other models, which reveals its good prediction capacity. It is interesting to note that the adaptive feature scaling factors help to reduce both prediction errors and the number of relevance vectors. Thus, it is shown that the adoption of adaptive feature scaling factors is beneficial for increasing the prediction accuracy and reducing the model complexity. Moreover, the proposed model requires no further efforts to determine the model
parameters. All the parameters, including feature scaling factors, are automatically optimized during model training, which makes the model relatively simple for practitioners to use.
Table 2
Comparison between different models.
                          Adaptive RVM   Classical RVM   SVM      ANN
RMSE of testing dataset   4.4472         4.8454          4.7457   5.044
R of testing dataset      0.6714         0.6290          0.6285   0.5897
RVs/SVs*                  8              17              99       /
*RVs – Relevance Vectors, SVs – Support Vectors.
The feature scaling factors can weigh the relative importance of each input factor. Important factors are associated with larger feature scaling factors. Fig. 10 presents
the feature scaling factors, which are normalized to facilitate the comparison with the subsequent sensitivity analysis results. Based on Fig. 10, face pressure is the most
important factor in this case. The result is consistent with the analysis based on the finite element method [25, 26] and the empirical evidence [27]. In shield-driven
tunnels, soil is removed directly from the excavation face. To stabilize the excavation face, the pressure provided by the shield machine must be in equilibrium with the external ground and water pressure. If not, the excavation face can become unstable, causing excessive settlement or even a collapse, particularly in shallow buried tunnels and soft soil conditions. Thus, the face pressure is a critical shield operational factor that needs to be closely examined and tuned to stabilize the excavation face and minimize the ground deformation. The geological factors, namely soil type at tunnel
crown and tunnel invert, and the groundwater table, are also identified as important factors. This result is also supported by several models because the geological conditions determine the potential soil responses to shield machine penetration and subsequently impact ground movement [26, 28]. In comparison, as shown in Fig. 11, the sensitivity analysis has failed to identify the critical shield operational parameters in this case, and the importance of the soil type at tunnel invert is over-estimated because the disturbance imposed on the soils at tunnel crown can be propagated to the ground surface without significant loss, while the effects of soil disturbance at tunnel invert are partially neutralized by the shield weight and tough shield skin [29].
Fig. 10. Feature scaling factor as a measure of relative importance of each factor. [Axes: normalized feature scaling factor versus the factors d, Z, Soilcrown, Soilinvert, WT, v, p1, p2, V, θ.]
Fig. 11. Sensitivity analysis as a measure of relative importance of each factor. [Axes: normalized sensitivity versus the same ten factors.]
Fig. 12. The comparison of algorithm efficiency. [Axes: evidence versus number of iterations; legend: Adaptive, Classical.]
Regarding the convergence procedure, the comparison between the proposed method and the classical RVM approach is shown in Fig. 12. In this study, the number
of cycles H1 in the first stage and H2 in the second stage was set to 10, and the learning rate for the feature scaling factor optimization was set to $10^{-4}$ to prevent premature convergence. As a result, the evidence with respect to the feature scaling factors is maximized in the second stage optimization, shown as steps in blue in the figure, and the maximum evidence of the whole optimization and the convergence speed are improved compared to the classical RVM training method.
6. Conclusions
It is important to protect adjacent structures and facilities from tunneling-induced
damage. For this reason, accurate prediction of ground settlement is pivotal because preventive measures can be taken before the occurrence of large settlement that can interrupt the function of the existing structures or facilities. In this paper, an RVM model with adaptive feature scaling factors was proposed to explore the predictive relations between the settlement and the causal factors. The potential of the proposed
model was investigated in shield-driven tunnels. Although the shield-ground
interaction is complex and the settlement data is highly non-linear and noisy, the results demonstrate that the model has extracted the shield-ground relationship to
provide competitive generalization performance. The adoption of feature scaling factors is proven to be beneficial for reducing both the prediction errors and the model
complexity. Unlike ANN models or SVM models, which need an additional procedure to determine the model structure or set model parameters, the optimization method
can obtain the model parameters, including the feature scaling factors, in the training process. Moreover, the feature scaling factors can be used as an inherent mechanism to measure the relative importance of each input factor, and provide an insight into how the factors contribute to the settlement development. Some factors, such as face pressure of the shield machine and geological conditions, are correctly identified as important in the case study. It can be concluded that the proposed model is useful for its accurate prediction, simple implementation and the ability for feature importance identification. Therefore, the proposed model can be employed in tunneling projects
ACCEPTED MANUSCRIPT
for settlement prediction and control. However, the application of this model to other tunnels at different sites requires close examination, particularly to those constructed across soil types that are not included in the existing training dataset. A database that consisted of different site-specific information is needed to improve the generalization
CR IP T
ability of the proposed model.
Acknowledgements
This research was supported by the National Natural Science Foundation of China under Grant No. 41301434.
References
[1] R.B. Peck, Advantages and limitations of the observational method in applied soil mechanics, Geotechnique 19 (2) (1969) 171–187.
[2] C. Sagaseta, Analysis of undrained soil deformation due to ground loss, Geotechnique 37 (3) (1987) 301–320.
[3] C.Y. Kim, G.J. Bae, S.W. Hong, C.H. Park, H.K. Moon, H.S. Shin, Neural network based prediction of ground surface settlements due to tunneling, Comput. Geotech. 28 (6) (2001) 517–547.
[4] ITA, Settlements induced by tunneling in Soft Ground, Tunnel. Undergr. Space Technol. 22 (2) (2007) 119–149.
[5] A. Pourtaghi, M.A. Lotfollahi-Yaghin, Wavenet ability assessment in comparison to ANN for predicting the maximum surface settlement caused by tunneling, Tunnel. Undergr. Space Technol. 28 (2012) 257–271.
[6] S. Gholizadeh, O.A. Samavati, Structural optimization by wavelet transforms and neural networks, Appl. Math. Model. 35 (2) (2011) 915–929.
[7] S. Suwansawat, H.H. Einstein, Artificial neural networks for predicting the maximum surface settlement caused by EPB shield tunneling, Tunnel. Undergr. Space Technol. 21 (2) (2006) 133–150.
[8] O.J. Santos, T.B. Celestino, Artificial neural networks analysis of Sao Paulo subway tunnel settlement data, Tunnel. Undergr. Space Technol. 23 (5) (2008) 481–491.
[9] H. Landrin, C. Blückert, J.P. Perrin, S. Stacey, A. Stolfa, ALOP/DSU coverage for tunnelling risks? Boston: International Association of Engineering Insurers, 2006. 19 p. Report No.: IMIA WGP 48 (06).
[10] V. Kecman, Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models, MIT Press, Cambridge, MA, 2001.
[11] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[12] H. Zhao, S. Yin, Geomechanical parameters identification by particle swarm optimization and support vector machine, Appl. Math. Model. 33 (10) (2009) 3997–4012.
[13] P. Samui, T.G. Sitharam, Least-square support vector machine applied to settlement of shallow foundations on cohesionless soils, Int. J. Numer. Anal. Meth. Geomech. 32 (17) (2008) 2033–2043.
[14] A.N. Jiang, S.Y. Wang, S.L. Tang, Feedback analysis of tunnel construction using a hybrid arithmetic based on Support Vector Machine and Particle Swarm Optimisation, Automat. Constr. 20 (4) (2011) 482–489.
[15] M.E. Tipping, Sparse Bayesian learning and the relevance vector machine, J. Mach. Learn. Res. 1 (3) (2001) 211–244.
[16] J. Yuan, K. Wang, T. Yu, M. Fang, Integrating relevance vector machines and genetic algorithms for optimization of seed-separating process, Eng. Appl. Artif. Intell. 20 (7) (2007) 970–979.
[17] P. Samui, T.G. Sitharam, Site characterization model using least-square support vector machine and relevance vector machine based on corrected SPT data (Nc), Int. J. Numer. Anal. Meth. Geomech. 34 (7) (2010) 755–770.
[18] P. Samui, Application of statistical learning algorithms to ultimate bearing capacity of shallow foundation on cohesionless soil, Int. J. Numer. Anal. Meth. Geomech. 36 (1) (2012) 100–110.
[19] C.E. Rasmussen, C.K.I. Williams, Gaussian Processes for Machine Learning, MIT Press, Cambridge, MA, 2006.
[20] M.E. Tipping, Bayesian inference: an introduction to principles and practice in machine learning, in: O. Bousquet, U. von Luxburg, G. Rätsch (Eds.), Advanced Lectures on Machine Learning, Springer, Berlin, 2004, pp. 41–62.
[21] D. MacKay, The evidence framework applied to classification networks, Neural Comput. 4 (5) (1992) 720–736.
[22] A. Smola, B. Schölkopf, A tutorial on support vector regression, Stat. Comput. 14 (3) (2004) 199–222.
[23] J. Yuan, L. Bo, K. Wang, T. Yu, Adaptive spherical Gaussian kernel in sparse Bayesian learning framework for nonlinear regression, Expert Syst. Appl. 36 (2) (2009) 3982–3989.
[24] F. Wang, B. Gou, Y. Qin, Modeling tunneling-induced ground surface settlement development using a wavelet smooth relevance vector machine, Comput. Geotech. 54 (2013) 125–132.
[25] T. Kasper, G. Meschke, On the influence of face pressure, grouting pressure and TBM design in soft ground tunneling, Tunnel. Undergr. Space Technol. 21 (2) (2006) 160–171.
[26] Y. Li, F. Emeriault, R. Kastner, Z.X. Zhang, Stability analysis of large slurry shield-driven tunnel in soft clay, Tunnel. Undergr. Space Technol. 24 (4) (2009) 472–481.
[27] B. Maidl, M. Herrenknecht, U. Maidl, G. Wehrmeyer, Mechanised shield tunneling, 2nd ed., Ernst & Sohn, Berlin, Germany, 2012.
[28] T. Kasper, G. Meschke, A 3D finite element simulation model for TBM tunnelling in soft ground, Int. J. Numer. Anal. Meth. Geomech. 28 (14) (2004) 1441–1460.
[29] L. Ding, F. Wang, H. Luo, M. Yu, X. Wu, Feedforward analysis for shield-ground system, J. Comput. Civ. Eng., ASCE 27 (3) (2013) 231–242.