Modeling of shield-ground interaction using an adaptive relevance vector machine


Accepted Manuscript

Fan Wang, Biancai Gou, Xianquan Han, Qiling Zhang, Yawei Qin

PII: S0307-904X(15)00550-8
DOI: 10.1016/j.apm.2015.09.016
Reference: APM 10715

To appear in: Applied Mathematical Modelling

Received date: 11 July 2014
Revised date: 26 June 2015
Accepted date: 22 September 2015

Please cite this article as: Fan Wang, Biancai Gou, Xianquan Han, Qiling Zhang, Yawei Qin, Modeling of shield-ground interaction using an adaptive relevance vector machine, Applied Mathematical Modelling (2015), doi: 10.1016/j.apm.2015.09.016

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights

• The developed model considers soil behavior in response to tunnel excavation.
• Adaptive feature scaling factors are adopted to improve model performance.
• The relative importance of features related to ground settlement is prioritized.
• The model parameters can be determined automatically in the training process.


Modeling of shield-ground interaction using an adaptive relevance vector machine

Fan Wang a,b,*, Biancai Gou c, Xianquan Han a, Qiling Zhang a, Yawei Qin b

a Changjiang River Scientific Research Institute, Wuhan 430010, P.R. China
b Department of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan 430074, P.R. China
c Department of Civil Engineering, Wuhan University of Science and Technology–City College, Wuhan 430083, P.R. China

ABSTRACT

The shield tunneling method is widely adopted in tunneling projects. Analysis of ground settlement is required as an effective way of minimizing the potential damage caused by tunneling. Many efforts have been devoted to this purpose using various methods such as empirical approaches and numerical modeling. However, multiple factors may influence the ground settlement, and the shield-ground relationship is highly non-linear and complex. To understand the complex soil behavior in response to shield penetration, a model that can establish the relationship and make accurate predictions of tunneling-induced ground settlement is needed. This paper proposes a model based on relevance vector machines (RVMs) to develop the predictive relations. Adaptive feature scaling factors are introduced as an inherent mechanism that enables RVMs to identify the relative importance of each input factor, and an optimization method for obtaining appropriate values of the feature scaling factors is proposed. The potential of the proposed adaptive model was investigated by applying it to tunnels bored by an earth pressure balance (EPB) shield machine. Three categories of factors, namely tunnel geometry, geological conditions and shield operational parameters, were considered in the model. The results demonstrate that the proposed model has competitive predictive capacities and that the adoption of adaptive feature scaling factors can enhance the prediction accuracy and provide a measure of the relative importance of each input factor. Moreover, the implementation of the adaptive RVM model is relatively simple. There is no need to set model parameters because they can be automatically optimized during model training, which makes the method a practical tool for geotechnical engineers to evaluate ground reactions during tunnel excavation.

* Corresponding author. Tel.: +86 27 82829879; fax: +86 27 82820548. E-mail address: [email protected] (F. Wang).

KEY WORDS: ground settlement; shield-ground interaction; relevance vector machine; adaptive feature scaling factors; instrumentation

1. Introduction

In shield tunneling projects, excavation inevitably disturbs the original stress field and hence induces ground movement. Ground settlement, an important measurement item that indicates the vertical displacement of the ground surface, needs to be closely monitored to assess construction safety, particularly in urban areas where excessive settlement may cause serious damage to adjacent structures or facilities. To avoid large settlement and potential damage, the development of ground settlement must be analyzed and predicted during tunneling. Generally, there are two major classes of ground settlement analysis methods: empirical methods and numerical methods. The empirical methods are based mainly on the practice and experience of tunnel construction. Several empirical formulae have been proposed for predicting ground movement over the past decades [1, 2]. However, the accuracy of the empirical methods is questioned because they fail to take into account all the relevant factors, including many shield operational parameters, which concurrently influence the ground settlement [3]. On the other hand, numerical methods have been widely used in tunneling projects. The numerical approaches consider the characteristics of ground and construction conditions with sophisticated constitutive models, which enable the calculation of deformations at each point within the ground [4]. However, the implementation of a numerical model is relatively complex, particularly when mechanized processes of shield excavation are considered. The detailed information on soil properties that is required for simulation is scarce or unavailable in many cases, and building a practical constitutive soil model for tunneling-induced settlement prediction is rather difficult [5].

With the rapid development of monitoring methods and hardware, a large amount of in situ measurement data is generated in practice. Kim et al. [3] suggested that the instrumentation data should be the best ‘text book’ for understanding tunneling-induced ground settlement. Therefore, a viable approach that fully utilizes the instrumentation data for ground settlement analysis and prediction is necessary. In the past years, many researchers have applied artificial neural networks (ANNs) to geotechnical engineering problems. Several ANN-based ground settlement prediction models were proposed by building the relationships between multiple variables and the induced ground settlement [5-8]. The variables included geometrical characteristics (e.g., cover depth, tunnel diameter, cover-span ratio), geological parameters (e.g., Young’s modulus, friction angle, cohesion) and construction conditions (e.g., support method, excavation speed, dewatering condition). It was found that considering the full set of variables can enhance the predictive accuracy of the ANN models [8].

Some reports have shown that most failures, such as large ground deformations, take place at the tunnel construction stage [9]. Nevertheless, many developed ANN models focus on the maximum ground settlement and therefore do not satisfy the requirements of real-time settlement prediction during tunnel construction. Moreover, determining the network architecture is complex when using ANNs because trial-and-error is always needed, and ANNs suffer from overfitting in many cases [10].

Support vector machines (SVMs) [11], a machine learning technique, have also been used in geomechanical back analysis [12], prediction of the settlement of shallow foundations [13], tunnel deformation control [14] and other geotechnical areas. SVM-based models typically have good generalization capacities because they adopt a structural risk minimization (SRM) induction principle, which seeks to minimize an upper bound on the generalization error rather than the training error alone. However, the selection of model parameters, such as the insensitivity parameter ε and the penalty weight C, is relatively complex, and these models do not allow the identification of the relative importance of different factors. Sensitivity analysis is usually needed to clarify the cause and effect of the input-output relationship.

Recently, a machine learning technique named the relevance vector machine (RVM) was introduced for its high generalization performance, sparse model structure, probabilistic prediction, and free choice of kernel function [15]. Owing to these advantages, RVM models have been reported to give better prediction results than SVM models when the same inputs are used [16-18]. This paper investigates the potential of RVMs to analyze tunneling-induced ground settlement. A Gaussian kernel with adaptive feature scaling factors is integrated as an inherent mechanism that can prioritize the relative importance of the input factors, and an optimization method for obtaining the kernel parameters is proposed. Field instrumentation data and the EPB shield operational history were well recorded during the construction of the Wuhan (China) metro project. These data provide a good opportunity to investigate how ground soils react to shield penetration. Using the collected data, the adaptive RVM model was trained and validated. The modeling results of the adaptive RVM model were then evaluated by comparing them with those obtained from other approaches. The results demonstrated the strength of the proposed model for accurate settlement prediction and identification of critical factors.

2. Adaptive relevance vector machines

2.1 Relevance vector machine

Tipping first proposed the RVM to solve regression and classification problems [15]. The RVM learning algorithm, formulated within a Bayesian framework, is in fact a specialization of Gaussian processes [19]. RVMs have a functional form similar to that of the popular SVMs, but a different set of basis functions is employed for prediction. The key feature of RVMs is that relatively few basis functions are retained (termed ‘relevance vectors’ in RVMs and ‘support vectors’ in SVMs), while the generalization performance remains comparable to that of SVMs. Fig. 1 illustrates the different regression results with a sinc toy example.

[Figure 1: plot over x from -5 to 5 showing the noisy data, the sinc function, the RVM regression, the SVM regression, the relevance vectors and the support vectors.]

Fig. 1. Noisy sinc RVM regression versus SVM regression. Samples are generated by $y = \sin(x)/x$ with an additive noise component $\varepsilon \sim N(0, 0.05)$. Both RVM and SVM adopt a Gaussian kernel with kernel width $r = 1$. The insensitivity parameter and penalty weight for the SVM model are set to 0.1 and 10, respectively.

Given a training set $\{\mathbf{x}_n, t_n\}_{n=1}^{N}$, the targets $\mathbf{t} = (t_1, \ldots, t_N)^T$ can be approximated by the outputs of a function $\mathbf{y} = (y(\mathbf{x}_1), \ldots, y(\mathbf{x}_N))^T$ and an additive noise vector $\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_N)^T$:

$$\mathbf{t} = \mathbf{y} + \boldsymbol{\varepsilon} = \boldsymbol{\Phi}\mathbf{w} + \boldsymbol{\varepsilon} \tag{1}$$

where $\mathbf{w} = [w_1, \ldots, w_M]^T$ are the weights of the linear model and $\boldsymbol{\Phi}$ is the $N \times M$ ‘design’ matrix with $\phi_{nm} = K(\mathbf{x}_n, \mathbf{x}_m)$ as its elements. $K(\cdot)$ are basis functions that can be, for example, Gaussian kernels. The data noise $\varepsilon_n$ is independent and can be assumed to follow a Gaussian distribution with zero mean and variance $\sigma^2$: $p(\varepsilon_n \mid \sigma^2) = N(0, \sigma^2)$.

Because the targets are assumed to be independent, the likelihood of the whole dataset can be expressed as follows:

$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\|\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\|^2\right\} \tag{2}$$

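The conventional method that employs maximum-likelihood estimation of $\mathbf{w}$ and $\sigma^2$ from (2) would lead to severe overfitting [20]. To control the model complexity and avoid overfitting, a zero-mean Gaussian prior distribution over each weight $w_i$ with a separate precision $\alpha_i$ is adopted, and the weight prior takes the form:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = (2\pi)^{-M/2} \prod_{m=1}^{M} \alpha_m^{1/2} \exp\left(-\frac{\alpha_m w_m^2}{2}\right) \tag{3}$$

The hyper-parameter $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_M)^T$ determines how far each weight can deviate from zero. Consequently, the posterior distribution for the weights conditioned on the training data can be calculated using Bayes’ law:

$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = \frac{p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})}{p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)} \tag{4}$$

Both factors in the numerator of (4) are Gaussian, so the posterior over the weights $\mathbf{w}$ also follows a Gaussian distribution and can be described as $p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ given by:

$$\boldsymbol{\mu} = \sigma^{-2}\boldsymbol{\Sigma}\boldsymbol{\Phi}^T\mathbf{t} \tag{5}$$

$$\boldsymbol{\Sigma} = (\mathbf{A} + \sigma^{-2}\boldsymbol{\Phi}^T\boldsymbol{\Phi})^{-1} \tag{6}$$

where $\mathbf{A} = \mathrm{diag}(\boldsymbol{\alpha})$. To calculate the weight posterior $p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2)$, maximum a posteriori (MAP) estimates are employed to determine $\boldsymbol{\alpha}$ and $\sigma^2$ by seeking the most-probable values $(\boldsymbol{\alpha}_{MP}, \sigma^2_{MP})$ of the posterior of the hyper-parameters, $p(\boldsymbol{\alpha}, \sigma^2 \mid \mathbf{t}) \propto p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)\, p(\boldsymbol{\alpha})\, p(\sigma^2)$. With uniform hyper-priors, the MAP estimates of the hyper-parameters simply need to maximize the marginal likelihood $p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)$. In particular, the logarithm of the RVM marginal likelihood (also known as the evidence [21]) is written as:

$$L = \log p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = -\frac{1}{2}\left[N \log 2\pi + \mathbf{t}^T\mathbf{C}^{-1}\mathbf{t} + \log|\mathbf{C}|\right] \tag{7}$$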

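where $\mathbf{C} = \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^T + \sigma^2\mathbf{I}$. The values of $\boldsymbol{\alpha}_{MP}$ and $\sigma^2_{MP}$ cannot be obtained analytically, and the iterative re-estimation formulae are given as:

$$\alpha_i^{\mathrm{new}} = \frac{\gamma_i}{\mu_i^2} \tag{8}$$

$$(\sigma^2)^{\mathrm{new}} = \frac{\|\mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu}\|^2}{N - \sum_i \gamma_i} \tag{9}$$

where $\gamma_i = 1 - \alpha_i\Sigma_{ii}$. Each $\gamma_i \in [0, 1]$ is a measure of how ‘well-determined’ its corresponding parameter $w_i$ is by the training data [21], and the training vectors associated with the remaining non-zero weights after this optimization are the aforementioned ‘relevance vectors’ [15].

In the learning procedure, the values of $\boldsymbol{\alpha}$ and $\sigma^2$ are replaced by $\boldsymbol{\alpha}_{MP}$ and $\sigma^2_{MP}$, respectively. The posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ can then be obtained, and the prediction for a new datum $\mathbf{x}_*$ is given by

$$y_* = \boldsymbol{\mu}^T\boldsymbol{\phi}(\mathbf{x}_*) \tag{10}$$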
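The re-estimation loop of Eqs. (5), (6), (8) and (9) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `gaussian_kernel` and `rvm_fit`, the initial values of α and σ², the pruning threshold `alpha_max`, the small constant in the denominator of Eq. (8) and the treatment of numerically non-positive γ as "switched off" are all assumptions made here for the example.

```python
import numpy as np

def gaussian_kernel(X1, X2, r=1.0):
    """Gaussian kernel K(x, x') = exp(-||x - x'||^2 / r^2) for row-wise inputs."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / r**2)

def rvm_fit(Phi, t, n_iter=300, alpha_max=1e6):
    """Sketch of RVM regression training for a fixed design matrix Phi (N x M)."""
    N, M = Phi.shape
    active = np.arange(M)          # indices of basis functions still in the model
    alpha = np.ones(M)             # assumed initial precisions
    sigma2 = 0.1 * np.var(t)       # assumed initial noise variance
    for _ in range(n_iter):
        P = Phi[:, active]
        Sigma = np.linalg.inv(np.diag(alpha[active]) + P.T @ P / sigma2)  # Eq. (6)
        mu = Sigma @ P.T @ t / sigma2                                     # Eq. (5)
        gamma = 1.0 - alpha[active] * np.diag(Sigma)                      # gamma_i
        alpha[active] = gamma / (mu**2 + 1e-12)                           # Eq. (8)
        resid = t - P @ mu
        sigma2 = (resid @ resid) / max(N - gamma.sum(), 1e-6)             # Eq. (9)
        # Prune weights whose precision has diverged; a numerically
        # non-positive gamma is treated the same way (assumption).
        keep = (gamma > 0) & (alpha[active] < alpha_max)
        active = active[keep]
    # Posterior for the surviving basis functions (the relevance vectors).
    P = Phi[:, active]
    Sigma = np.linalg.inv(np.diag(alpha[active]) + P.T @ P / sigma2)
    mu = Sigma @ P.T @ t / sigma2
    return active, mu, Sigma, sigma2
```

With the returned quantities, the prediction of Eq. (10) for the training inputs is `Phi[:, active] @ mu`; for new inputs, the kernel is evaluated between the new points and the retained training points before the same product is taken.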

2.2 Adapting feature scaling factors

RVMs use kernel mapping to improve the non-linear regression ability. Gaussian kernels are commonly used in machine learning practice [22]. Eq. (11) is a typical Gaussian kernel:

$$K_{\mathrm{Gaussian}}(\mathbf{x}, \mathbf{x}') = \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{r^2}\right) \tag{11}$$

where $r$ is a unified kernel width. All input features are associated with this single kernel width, which implies that all the features are regarded as equally important. If multiple input scale parameters are considered, the classical Gaussian kernel is extended to one width per feature:

$$K_{\mathrm{AdaptiveGaussian}}(\mathbf{x}, \mathbf{x}') = \exp\left(-\sum_{d=1}^{D}\frac{\|x_d - x'_d\|^2}{r_d^2}\right) \tag{12}$$

where $r_d$ is the adaptive kernel width, which is also called the ‘feature scaling factor’ [23], and $D$ is the input space dimension. The importance of each feature is now weighed by the feature scaling factor. A larger feature scaling factor indicates a more significant influence of the input feature on the output [23]. Consequently, the adoption of an adaptive Gaussian kernel enables the model to identify the relative importance of each factor. To obtain the values of the feature scaling factors, the gradient of the evidence $L$ in (7) with respect to the k-th feature scaling factor is written in the form:

$$\frac{\partial L}{\partial r_k} = \sum_{n=1}^{N}\sum_{m=1}^{M}\frac{\partial L}{\partial \phi_{nm}}\frac{\partial \phi_{nm}}{\partial r_k} \tag{13}$$

where $\phi_{nm} = \phi_m(\mathbf{x}_n; \mathbf{r})$ denotes the elements of the design matrix $\boldsymbol{\Phi}$. The first term in (13) is independent of the kernel parameterization, and the derivative of $L$ with respect to $\boldsymbol{\Phi}$ is given as [15]:

$$\frac{\partial L}{\partial \boldsymbol{\Phi}} = \left[\mathbf{C}^{-1}\mathbf{t}\mathbf{t}^T\mathbf{C}^{-1} - \mathbf{C}^{-1}\right]\boldsymbol{\Phi}\mathbf{A}^{-1} = \sigma^{-2}\left[(\mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu})\boldsymbol{\mu}^T - \boldsymbol{\Phi}\boldsymbol{\Sigma}\right] \tag{14}$$

Let $\eta_k = r_k^{-2}$. The derivative of $\phi_{nm}$, namely the adaptive Gaussian kernel described in (12), with respect to $\eta_k$ is:

$$\frac{\partial \phi_{nm}}{\partial \eta_k} = -\phi_{nm}(x_{mk} - x_{nk})^2 \tag{15}$$

By defining $\mathbf{D} = \partial L / \partial \boldsymbol{\Phi}$ and combining (14) and (15), the derivative (13) of the evidence $L$ with respect to the k-th feature scaling factor can be re-written as:

$$\frac{\partial L}{\partial \eta_k} = -\sum_{n=1}^{N}\sum_{m=1}^{M} D_{nm}\,\phi_{nm}\,(x_{mk} - x_{nk})^2 \tag{16}$$

where $D_{nm}$ are the elements of the matrix $\mathbf{D}$. The feature scaling factors can thus be optimized using a simple hill-climbing method.

Specifically, a stage-wise optimization method is proposed that interleaves the updates of the hyper-parameters (i.e., $\boldsymbol{\alpha}$ and $\sigma^2$) and of the feature scaling factors (i.e., $\eta_k$), each for a few cycles. The procedure is described in Fig. 2. In the first stage, the hyper-parameters $\boldsymbol{\alpha}$ and $\sigma^2$ are optimized while the feature scaling factors $\eta_k$ are fixed. After $H_1$ cycles of the $\boldsymbol{\alpha}$ and $\sigma^2$ optimization, the feature scaling factors $\eta_k$ are optimized for $H_2$ cycles in the second stage with the hyper-parameters $\boldsymbol{\alpha}$ and $\sigma^2$ fixed. Because the feature scaling factors change, the kernel functions $\boldsymbol{\Phi}$ change as well, and $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ need to be updated to reflect the current state of the model.

[Figure 2: flowchart. Initialize α and σ². Stage 1: compute μ and Σ and update α and σ², repeating until updated for H1 cycles. Stage 2: search η_k using the hill-climbing method and update μ, Σ and Φ, repeating until updated for H2 cycles. If the maximum iteration is reached or the procedure has converged, the optimal model is obtained; otherwise return to Stage 1.]

Fig. 2. Stage-wise optimization of RVM with adaptive feature scaling factors.
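The second stage of the procedure can be illustrated with a short NumPy sketch that evaluates the adaptive kernel of Eq. (12), the evidence of Eq. (7) and the gradient of Eq. (16), and then takes a few hill-climbing steps on η with α and σ² held fixed. The function names, the positivity floor on η and the rule of stopping at the first step that fails to raise the evidence are assumptions made for this example; the paper does not spell out these details. The kernel is written in terms of $\eta_d = r_d^{-2}$, as in Eq. (15).

```python
import numpy as np

def adaptive_kernel(X1, X2, eta):
    """Adaptive Gaussian kernel exp(-sum_d eta_d (x_d - x'_d)^2), eta_d = r_d**-2.

    Also returns the per-feature squared differences, shape (N, M, D).
    """
    diff2 = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-(diff2 * eta).sum(axis=2)), diff2

def log_evidence(Phi, t, alpha, sigma2):
    """Log marginal likelihood, Eq. (7), with C = sigma2 * I + Phi A^-1 Phi^T."""
    N = t.size
    C = sigma2 * np.eye(N) + (Phi / alpha) @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (N * np.log(2.0 * np.pi) + logdet + t @ np.linalg.solve(C, t))

def evidence_grad_eta(Phi, diff2, t, alpha, sigma2):
    """Gradient of the evidence with respect to each eta_k, Eq. (16)."""
    Sigma = np.linalg.inv(np.diag(alpha) + Phi.T @ Phi / sigma2)    # Eq. (6)
    mu = Sigma @ Phi.T @ t / sigma2                                 # Eq. (5)
    D = (np.outer(t - Phi @ mu, mu) - Phi @ Sigma) / sigma2         # Eq. (14)
    return -np.einsum("nm,nm,nmk->k", D, Phi, diff2)

def hill_climb_eta(X, t, eta, alpha, sigma2, H2=10, lr=1e-4):
    """Stage 2 sketch: up to H2 gradient steps on eta, alpha and sigma2 fixed."""
    Phi, diff2 = adaptive_kernel(X, X, eta)
    best = log_evidence(Phi, t, alpha, sigma2)
    for _ in range(H2):
        grad = evidence_grad_eta(Phi, diff2, t, alpha, sigma2)
        trial = np.maximum(eta + lr * grad, 1e-8)    # keep eta positive (assumption)
        Phi_trial, _ = adaptive_kernel(X, X, trial)
        value = log_evidence(Phi_trial, t, alpha, sigma2)
        if value <= best:                            # stop when a step fails to climb
            break
        eta, Phi, best = trial, Phi_trial, value
    return eta, best
```

In a full stage-wise run, this routine would alternate with H1 cycles of the α and σ² re-estimation, with Φ, μ and Σ recomputed from the updated η before each return to the first stage.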

3. Representation of the shield-ground relationship

Shield-ground interaction is a complex process because of the involvement of multiple factors, such as ground condition and shield operation. Due to the complex shield-ground relationship, site instrumentation data are always non-linear and noisy. Fig. 3 is a schematic diagram describing shield tunneling and the induced ground settlement. The settlement develops continually in response to the shield penetration; the effect of shield penetration on settlement increases as the shield machine approaches and decreases as it recedes. Therefore, the settlement is normally plotted against the distance between the measurement devices and the excavation face to show the longitudinal settlement profile. In practice, because the advance rate is not constant and the measurement readings are not taken at equal time intervals, the data points that form the settlement curve may not be equally spaced [4]. To model the settlement development, the present settlement at an instrumented section $s$ (indicating the current ground stress conditions), coupled with the other influencing factors $F$ recorded between two consecutive settlement measurements (representing the loads applied by the shield penetration), is used to predict the settlement $s'$ for the next excavation step. As the shield machine advances, the measurement of the settlement at the next excavation step and the possibly changed factors, such as shield operational parameters, are taken again as the input for the next prediction [24]. This process is expressed as the following equation:

$$s' = f(s, F) \tag{17}$$

The non-linear relationship $f(\cdot)$ is learned from a known dataset using the adaptive RVM, and the prediction is conducted based on the trained model. In addition, the feature scaling factors optimized in the training process can be used to investigate the effect of each factor on ground settlement. This characteristic can be useful in engineering practice. For example, the established model has the potential to provide guidance on shield steering for settlement control because the ground movement is sensitive to the critical shield operational parameters.
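As an illustration of how Eq. (17) turns a monitoring record into training pairs, the sketch below builds one-step-ahead samples for a single instrumented section. The function names and the flat list layout of the factor records are hypothetical; the paper does not describe its data structures.

```python
def build_samples(settlements, factors):
    """Pair (s, F) with the next reading s' for one instrumented section.

    settlements: consecutive settlement readings [s_0, s_1, ..., s_K]
    factors:     K records, factors[k] holding the influencing factors
                 recorded between readings k and k + 1
    """
    if len(factors) != len(settlements) - 1:
        raise ValueError("need one factor record per interval between readings")
    inputs = [[s, *F] for s, F in zip(settlements[:-1], factors)]
    targets = list(settlements[1:])
    return inputs, targets

def rolling_predictions(f, settlements, factors):
    """One-step-ahead predictions s' = f(s, F), restarted from each new measurement."""
    return [f(s, F) for s, F in zip(settlements[:-1], factors)]
```

Each prediction starts from the most recent measured settlement rather than from the previous prediction, which mirrors the description above of taking the new measurement as the input for the next step.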

[Figure 3: schematic of shield tunneling showing the ground surface, settlement markers, launching station, shield machine, excavation direction, excavation face and completed tunnel lining, together with a typical longitudinal development of ground settlement: settlement (mm) plotted against distance to the excavation face (m), with the approaching and receding phases and the passing of the front edge and the tail of the shield marked.]

Fig. 3. Schematic diagram of shield tunneling.

4. Application

The Dongting–Yuejiazhui (D–Y) tunnel and the Qingyuzhui–Dongting (Q–D) tunnel constitute the middle section of Line 4 in the Wuhan (China) metro project. The tunnels were constructed by the EPB shield tunneling method. The D–Y tunnel and the Q–D tunnel are 779 m and 536 m in length, respectively. Fig. 4 shows the geological profiles of the two tunnels. The main geologic formations encountered by the tunnel sections are clay and silty clay, and the cover depth ranges from 4 m to 9 m. Ground settlement markers were deployed above the centre line of the tunnel at intervals of about 10 m. Measurements of ground settlement were taken once per day during shield passing.

[Figure 4: geological profiles of the two tunnels, panels (a) and (b).]

Fig. 4. (a) Soil profile of the D–Y tunnel and (b) Soil profile of the Q–D tunnel.

The influencing factors considered can be grouped into three categories: geometrical characteristics, geological conditions and shield operational parameters [7, 24]. Two geometrical factors are considered: the distance between the excavation face and the ground settlement markers ($d$) and the cover-span ratio ($Z/2R$), wherein $Z$ represents the cover depth and $2R$ is the tunnel diameter. However, because the tunnel diameter was designed as a constant of 6 m, the cover-span ratio is replaced by the tunnel depth ($Z$). Three factors, namely the soil type at the tunnel crown ($\mathrm{Soil_{crown}}$), the soil type at the tunnel invert ($\mathrm{Soil_{invert}}$), and the tunnel depth below the groundwater table ($WT$), are considered as the geological factors. Soil types rather than detailed soil properties are adopted because detailed information on geotechnical parameters is unavailable at most instrumented sections. The shield operational parameters include penetration speed ($v$), face pressure ($p_1$), tail void grouting pressure ($p_2$), grouting volume ($V$) and pitching angle ($\theta$). By collecting the instrumentation data, referring to the geological investigation documents and recording the shield operational parameters, a dataset consisting of 661 samples was created and sorted according to the measurement date. The dataset was then divided into a training dataset, which contained the first 430 data samples, and a testing dataset, which consisted of the last 231 data samples. Fig. 5 gives the boxplot of the original data. The factors $\mathrm{Soil_{crown}}$ and $\mathrm{Soil_{invert}}$ are represented by dummy variables. Because the tunnels are bored through two soil types (i.e., clay and silty clay) in this case, the values of the two factors are binary (i.e., 0 or 1) and are not shown in Fig. 5.

[Figure 5: boxplot, vertical axis from -40 to 100, for $d$ (m), $Z$ ×10⁻¹ (m), $WT$ ×10⁻¹ (m), $v$ (mm/min), $p_1$ ×10 (kPa), $p_2$ ×10 (kPa), $V$ ×10⁻¹ (m³) and $\theta$ (°).]

Fig. 5. Boxplot of the original dataset.

5. Results and discussion

The data were normalized to the range [0, 1] before model training. When building ANN-based or SVM-based models, additional effort is usually required to determine the architecture of the network for ANN models or to select the model parameters (e.g., penalty weight $C$, insensitivity parameter $\varepsilon$, and kernel width $r$) that yield the best performance for SVM models. In contrast, the proposed method can infer analogues of those parameters (i.e., the hyper-parameters $\boldsymbol{\alpha}$ and $\sigma^2$, and the feature scaling factors $r_d$) by searching for the maximum of the logarithm of the marginal likelihood. In other words, there is no need to set the model parameters because they can be obtained during model training.
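The data preparation described here and in Section 4 (a date-ordered split into the first 430 and last 231 samples, followed by scaling to [0, 1]) can be sketched as follows. The helper names are hypothetical, and fitting the scaling bounds on the training portion only is an assumption of this example; the paper does not state which samples were used to compute the bounds.

```python
def chronological_split(rows, n_train):
    """Split date-ordered samples into the first n_train rows and the rest."""
    return rows[:n_train], rows[n_train:]

def fit_min_max(rows):
    """Column-wise minimum and maximum of the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale each column with (v - low) / (high - low); constant columns map to 0."""
    scaled = []
    for row in rows:
        scaled.append([(v - lo) / (hi - lo) if hi > lo else 0.0
                       for v, lo, hi in zip(row, lows, highs)])
    return scaled
```

With bounds taken from the training rows, later samples can fall slightly outside [0, 1] when they exceed the training range. Computing the bounds from the full dataset would avoid this, at the cost of letting information from the testing period influence the scaling.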

Fig. 6 shows the inferred weights after model training. As mentioned above, how ‘well-determined’ the parameter $w_i$ is by the training data can be measured by the quantities $\gamma_i$. Fig. 7 shows the values of the ‘well-determinedness’ of each weight. Most values of $\gamma_i$ tend to 1, while the two relevance vectors with low values of $\gamma_i$ are associated with very small weights. These results show that the entire model is largely determined by the ‘well-determined’ relevance vectors, while those relevance vectors that may have a minor influence on model performance are not neglected. In particular, the non-linear shield-ground relationship is well represented by the trained model.

[Figure 6: weights $w_i$ (vertical axis, -5 to 5) plotted against the number of data samples (0 to 400).]

Fig. 6. Values of $w_i$ for the trained model.

[Figure 7: $\gamma_i$ (vertical axis, 0 to 1) plotted against the number of data samples (0 to 400).]

Fig. 7. Values of $\gamma_i$ for the trained model.

[Figure 8: predictions (mm) plotted against measurements (mm), both from -40 to 10, showing the 430 training data, the 231 testing data, the relevance vectors and the line Predicted = Measured.]

Fig. 8. Performance of the model for the training dataset and testing dataset.

The trained model was then used for ground settlement prediction. Fig. 8 illustrates the performance of the model for the training and testing datasets. As shown in Fig. 8, the predictions are in good agreement with the measured data. A comparison of the simulation results is presented in Table 2 to evaluate the strength of the adaptive RVM model. The root mean square error (RMSE) and the coefficient of correlation (R) were employed as performance metrics. The ANN and SVM models are considered as the major competitors. Specifically, the ANN model architecture is determined by a 5-fold cross validation method that sequentially uses a small proportion of the training set as validation data. The kernel width of the SVM model is selected in the same way, with the penalty weight taken from the candidate values 10, 20, 50 and 100 and the insensitivity parameter from the candidate values 0.01, 0.05, 0.1 and 0.5. It is found that a 3-layer ANN architecture with 12 hidden neurons and an SVM with a penalty weight of 20, an insensitivity parameter of 0.1 and a kernel width of 0.2 yield the best prediction performance on the validation dataset. Moreover, to illustrate how the adoption of adaptive feature scaling factors can improve the prediction results, a classical RVM model with a standard Gaussian kernel is also established. Based on a direct search method, the unified kernel width of the classical RVM that corresponds to the maximum evidence is used. As shown in Fig. 9, the classical RVM with a kernel width of 0.5 has the maximum evidence and is applied for prediction.
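For reference, the two performance metrics can be computed as in the sketch below. It assumes that R denotes the Pearson correlation coefficient between measured and predicted settlements, which the paper does not state explicitly; the function names are chosen here for illustration.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def correlation(measured, predicted):
    """Pearson correlation coefficient R (assumed definition)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)
```

RMSE carries the units of the settlement readings, whereas R is dimensionless and insensitive to a constant offset or scale error in the predictions, so the two metrics complement each other.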

[Figure 9: evidence (vertical axis, 650 to 750) plotted against kernel width (0.1 to 1).]

Fig. 9. Classical RVM model evidence for different values of the kernel width.

Based on Table 2, the proposed model outperforms the other models, which reveals its good prediction capacity. It is interesting to note that the adaptive feature scaling factors help to reduce both the prediction errors and the number of relevance vectors. Thus, the adoption of adaptive feature scaling factors is beneficial for increasing the prediction accuracy and reducing the model complexity. Moreover, the proposed model requires no further effort to determine the model parameters. All the parameters, including the feature scaling factors, are automatically optimized during model training, which makes the model relatively simple for practitioners to use.

Table 2
Comparison between different models.

                          Adaptive RVM   Classical RVM   SVM      ANN
RMSE of testing dataset   4.4472         4.8454          4.7457   5.044
R of testing dataset      0.6714         0.6290          0.6285   0.5897
RVs/SVs*                  8              17              99       /

*RVs – Relevance Vectors, SVs – Support Vectors.

The feature scaling factors can weigh the relative importance of each input factor. Important factors are associated with larger feature scaling factors. Fig. 10 presents the feature scaling factors, which are normalized to facilitate the comparison with the subsequent sensitivity analysis results. Based on Fig. 10, face pressure is the most important factor in this case. This result is consistent with analyses based on the finite element method [25, 26] and with empirical evidence [27]. In shield-driven tunnels, soil is removed directly from the excavation face. To stabilize the excavation face, the pressure provided by the shield machine must be in equilibrium with the external ground and water pressure. If it is not, the excavation face can become unstable, causing excessive settlement or even a collapse, particularly in shallow tunnels and soft soil conditions. Thus, the face pressure is a critical shield operational factor that needs to be closely examined and tuned to stabilize the excavation face and minimize the ground deformation. The geological factors, namely the soil types at the tunnel crown and tunnel invert and the groundwater table, are also identified as important factors. This result is also supported by several models, because the geological conditions determine the potential soil responses to shield machine penetration and subsequently affect ground movement [26, 28]. In comparison, as shown in Fig. 11, the sensitivity analysis fails to identify the critical shield operational parameters in this case and over-estimates the importance of the soil type at the tunnel invert: the disturbance imposed on the soils at the tunnel crown can be propagated to the ground surface without significant loss, while the effects of soil disturbance at the tunnel invert are partially neutralized by the shield weight and the tough shield skin [29].

[Figure 10: bar chart of the normalized feature scaling factor (0 to 1) for the factors $d$, $Z$, $\mathrm{Soil_{crown}}$, $\mathrm{Soil_{invert}}$, $WT$, $v$, $p_1$, $p_2$, $V$ and $\theta$.]

Fig. 10. Feature scaling factor as a measure of relative importance of each factor.

[Figure 11: bar chart of the normalized sensitivity (0 to 1) for the factors $d$, $Z$, $\mathrm{Soil_{crown}}$, $\mathrm{Soil_{invert}}$, $WT$, $v$, $p_1$, $p_2$, $V$ and $\theta$.]

Fig. 11. Sensitivity analysis as a measure of relative importance of each factor.

[Figure 12: evidence (0 to 900) plotted against the number of iterations (0 to 400) for the adaptive and classical RVM.]

Fig. 12. The comparison of algorithm efficiency.

Regarding the convergence procedure, the comparison between the proposed method and the classical RVM approach is shown in Fig. 12. In this study, the numbers of cycles $H_1$ in the first stage and $H_2$ in the second stage were both set to 10, and the learning rate for the feature scaling factor optimization was set to $10^{-4}$ to prevent premature convergence. As a result, the evidence with respect to the feature scaling factors is maximized in the second-stage optimization, shown as steps in blue in the figure, and both the maximum evidence of the whole optimization and the convergence speed are improved compared with the classical RVM training method.

6. Conclusions

It is important to protect adjacent structures and facilities from tunneling-induced damage. For this reason, accurate prediction of ground settlement is pivotal, because preventive measures can then be taken before the occurrence of large settlement that can interrupt the function of the existing structures or facilities. In this paper, an RVM model with adaptive feature scaling factors was proposed to explore the predictive relations between the settlement and its causal factors. The potential of the proposed model was investigated in shield-driven tunnels. Although the shield-ground interaction is complex and the settlement data are highly non-linear and noisy, the results demonstrate that the model has extracted the shield-ground relationship and provides competitive generalization performance. The adoption of feature scaling factors proved beneficial for reducing both the prediction errors and the model complexity. Unlike ANN models or SVM models, which need an additional procedure to determine the model structure or set the model parameters, the optimization method can obtain the model parameters, including the feature scaling factors, in the training process. Moreover, the feature scaling factors can be used as an inherent mechanism to measure the relative importance of each input factor and provide an insight into how the factors contribute to the settlement development. Some factors, such as the face pressure of the shield machine and the geological conditions, are correctly identified as important in the case study. It can be concluded that the proposed model is useful for its accurate prediction, simple implementation and ability to identify feature importance. Therefore, the proposed model can be employed in tunneling projects for settlement prediction and control. However, the application of this model to other tunnels at different sites requires close examination, particularly for tunnels constructed through soil types that are not included in the existing training dataset. A database consisting of different site-specific information is needed to improve the generalization ability of the proposed model.

Acknowledgements

This research was supported by the National Natural Science Foundation of China under Grant No. 41301434.

References

[1] R.B. Peck, Advantages and limitations of the observational method in applied soil mechanics, Geotechnique 19 (2) (1969) 171–187.

[2] C. Sagaseta, Analysis of undrained soil deformation due to ground loss, Geotechnique 37 (3) (1987) 301–320.

[3] C.Y. Kim, G.J. Bae, S.W. Hong, C.H. Park, H.K. Moon, H.S. Shin, Neural network based prediction of ground surface settlements due to tunneling, Comput. Geotech. 28 (6) (2001) 517–547.

[4] ITA, Settlements induced by tunneling in soft ground, Tunnel. Undergr. Space Technol. 22 (2) (2007) 119–149.

[5] A. Pourtaghi, M.A. Lotfollahi-Yaghin, Wavenet ability assessment in comparison to ANN for predicting the maximum surface settlement caused by tunneling, Tunnel. Undergr. Space Technol. 28 (2012) 257–271.

[6] S. Gholizadeh, O.A. Samavati, Structural optimization by wavelet transforms and neural networks, Appl. Math. Model. 35 (2) (2011) 915–929.

[7] S. Suwansawat, H.H. Einstein, Artificial neural networks for predicting the maximum surface settlement caused by EPB shield tunneling, Tunnel. Undergr. Space Technol. 21 (2) (2006) 133–150.

[8] O.J. Santos, T.B. Celestino, Artificial neural networks analysis of Sao Paulo subway tunnel settlement data, Tunnel. Undergr. Space Technol. 23 (5) (2008) 481–491.

[9] H. Landrin, C. Blückert, J.P. Perrin, S. Stacey, A. Stolfa, ALOP/DSU coverage for tunnelling risks?, Report No. IMIA WGP 48 (06), International Association of Engineering Insurers, Boston, 2006, 19 p.

[10] V. Kecman, Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models, MIT Press, Cambridge, MA, 2001.

[11] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.

[12] H. Zhao, S. Yin, Geomechanical parameters identification by particle swarm optimization and support vector machine, Appl. Math. Model. 33 (10) (2009) 3997–4012.

[13] P. Samui, T.G. Sitharam, Least-square support vector machine applied to settlement of shallow foundations on cohesionless soils, Int. J. Numer. Anal. Meth. Geomech. 32 (17) (2008) 2033–2043.

[14] A.N. Jiang, S.Y. Wang, S.L. Tang, Feedback analysis of tunnel construction using a hybrid arithmetic based on Support Vector Machine and Particle Swarm Optimisation, Automat. Constr. 20 (4) (2011) 482–489.

[15] M.E. Tipping, Sparse Bayesian learning and the relevance vector machine, J. Mach. Learn. Res. 1 (3) (2001) 211–244.

[16] J. Yuan, K. Wang, T. Yu, M. Fang, Integrating relevance vector machines and genetic algorithms for optimization of seed-separating process, Eng. Appl. Artif. Intell. 20 (7) (2007) 970–979.

[17] P. Samui, T.G. Sitharam, Site characterization model using least-square support vector machine and relevance vector machine based on corrected SPT data (Nc), Int. J. Numer. Anal. Meth. Geomech. 34 (7) (2010) 755–770.

[18] P. Samui, Application of statistical learning algorithms to ultimate bearing capacity of shallow foundation on cohesionless soil, Int. J. Numer. Anal. Meth. Geomech. 36 (1) (2012) 100–110.

[19] C.E. Rasmussen, C.K.I. Williams, Gaussian Processes for Machine Learning, MIT Press, Cambridge, MA, 2006.

[20] M.E. Tipping, Bayesian inference: an introduction to principles and practice in machine learning, in: O. Bousquet, U. von Luxburg, G. Rätsch (Eds.), Advanced Lectures on Machine Learning, Springer, Berlin, 2004, pp. 41–62.

[21] D. MacKay, The evidence framework applied to classification networks, Neural Comput. 4 (5) (1992) 720–736.

[22] A. Smola, B. Schölkopf, A tutorial on support vector regression, Stat. Comput. 14 (3) (2004) 199–222.

[23] J. Yuan, L. Bo, K. Wang, T. Yu, Adaptive spherical Gaussian kernel in sparse Bayesian learning framework for nonlinear regression, Expert Syst. Appl. 36 (2) (2009) 3982–3989.

[24] F. Wang, B. Gou, Y. Qin, Modeling tunneling-induced ground surface settlement development using a wavelet smooth relevance vector machine, Comput. Geotech. 54 (2013) 125–132.

[25] T. Kasper, G. Meschke, On the influence of face pressure, grouting pressure and TBM design in soft ground tunneling, Tunnel. Undergr. Space Technol. 21 (2) (2006) 160–171.

[26] Y. Li, F. Emeriault, R. Kastner, Z.X. Zhang, Stability analysis of large slurry shield-driven tunnel in soft clay, Tunnel. Undergr. Space Technol. 24 (4) (2009) 472–481.

[27] B. Maidl, M. Herrenknecht, U. Maidl, G. Wehrmeyer, Mechanised Shield Tunneling, second ed., Ernst & Sohn, Berlin, Germany, 2012.

[28] T. Kasper, G. Meschke, A 3D finite element simulation model for TBM tunnelling in soft ground, Int. J. Numer. Anal. Meth. Geomech. 28 (14) (2004) 1441–1460.

[29] L. Ding, F. Wang, H. Luo, M. Yu, X. Wu, Feedforward analysis for shield-ground system, J. Comput. Civ. Eng., ASCE 27 (3) (2013) 231–242.