Performance prediction of tunnel boring machine through developing high accuracy equations: A case study in adverse geological condition

Performance prediction of tunnel boring machine through developing high accuracy equations: A case study in adverse geological condition

Journal Pre-proofs Performance prediction of tunnel boring machine through developing high accuracy equations: a case study in adverse geological cond...

2MB Sizes 2 Downloads 78 Views

Journal Pre-proofs Performance prediction of tunnel boring machine through developing high accuracy equations: a case study in adverse geological condition Masoud Samaei, Masoud Ranjbarnia, Vahid Nourani, Masoud Zare Naghadehi PII: DOI: Reference:

S0263-2241(19)31109-1 https://doi.org/10.1016/j.measurement.2019.107244 MEASUR 107244

To appear in:

Measurement

Received Date: Revised Date: Accepted Date:

27 May 2019 27 October 2019 5 November 2019

Please cite this article as: M. Samaei, M. Ranjbarnia, V. Nourani, M.Z. Naghadehi, Performance prediction of tunnel boring machine through developing high accuracy equations: a case study in adverse geological condition, Measurement (2019), doi: https://doi.org/10.1016/j.measurement.2019.107244

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Elsevier Ltd. All rights reserved.

Performance prediction of tunnel boring machine through developing high accuracy equations: a case study in adverse geological condition Masoud Samaeia, Masoud Ranjbarniab*, Vahid Nouranic, Masoud Zare Naghadehid a

Engineer, Department of Geotechnical Engineering, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran Assistant Professor, Department of Geotechnical Engineering, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran. Corresponding author E-mail: [email protected] c Professor, Center of Excellence in Hydroinformatics, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran c Professor, Faculty of Civil and Environmental Engineering, Near East University, Nicosia, Turkey d Researcher, Department of Mining and Metallurgical Engineering, University of Nevada, Reno, USA b*

Abstract The aim of present study is to propose new superior equations and introduce novel techniques for TBM performance prediction. To this end, correlations between the Rate of Penetration (ROP) and rock mass properties are investigated using four simple regression analyses. Based on these analyses, two non-linear multivariable equations are introduced and optimized by the Imperialist Competitive Algorithm (ICA). Then, two other distinct models are examined by using the Classification and Regression Tree (CART) and Genetic Expression Programming (GEP) techniques. The aforementioned methods are applied on a dataset from the Queens Tunnel, in New-York City with complex geology conditions. It was found that the models proposed by ICA, CART and GEP techniques have determination coefficient of 0.76, 0.82 and 0.72 for training data, and 0.62, 0.69 and 0.65 for testing data, respectively. The results showed the noticeable improvement of the predictions compare to previous studies. Keywords: TBM Performance Prediction; Queens Tunnel; Genetic Expression Programming (GEP); Classification and Regression Tree (CART); Fourier Regression

1. Introduction The prosperous era of tunnel boring machine (TBM) started since its successful applications in the early 1950s. Then, companies introduced various types of TBMs, which were also applicable to adverse geological conditions and hard rocks. Considering TBM drilling in hard rocks, the most important aspect of its operation is the prediction of its rate of penetration (ROP). In fact, TBM performance prediction has become one of the most sophisticated problems in the field of geotechnical engineering in the past forty years, as mentioned by Robbins ‘’nothing

1

has been more difficult than evaluating the rock mass characteristics and applying the evaluations to a formula predicting penetration rate‘’ [1]. This challenge is directly related to cost estimation, scheduling, efficiency improvement, and other issues related to project management. A number of attempts have been made to propose some models to estimate TBM performance on difficult grounds since the 1970s [2-8]. These efforts, which neither have a specific conclusion nor a comprehensive formulation, can be categorized into theoretical and empirical models. Theoretical studies are usually based on laboratory testing results, which probe into the operation of TBM and the influence of cutter disk on a rock sample, and other related issues. However, due to some difficulties in providing field rock conditions in the laboratory and lack of available equipped laboratories, theoretical models have not been developed progressively [9-11]. On the other hand, empirical studies are conducted based on both field data of the rock mass properties (including the uniaxial compressive strength (UCS), the joint spacing (JS), the distance between planes of weaknesses (DPW), the brittleness index (BI), the Brazilian tensile strength (BTS), angle between planes of weakness and TBM driven direction (α), quartz content, drilling rate index (DRI), rock mass rating (RMR), rock quality designation (RQD), etc.) and the machine recorded parameters (e.g. thrust force, torque, cutter load, power and number of cutters, etc. [12-15]). These methods can be classified into four groups [16]: simple models (e.g. [17]), multiple parameters models (e.g. [12, 13]), probabilistic models (e.g. [18]) and computer aided models (e.g. [19-22]). In the past two decades, the Computer aided models were developed with various algorithms, such as Artificial Neural Network (ANN) (e.g. [23-25]), Fuzzy logic (FL) (e.g. [19, 26]), Support Vector Regression (SVR) (e.g. [27]) and other algorithms (e.g. [28-30]). These models, which were developed by considering various rock and machine parameters, have a better fit with actual measurements in comparison with the other types of models. In recent years, various types of linear and non-linear regression analyses have been combined with computer algorithms to improve the accuracy of models for manufacturer companies and site managers of tunneling projects [31]. Yagiz [32] analyzed the ROP and rock mass properties (including UCS, BTS, DPW, BI and alpha angle (α)), datasets of the Queens Tunnel project, and proposed a linear-multivariable equation for the ROP prediction. It was found that the BI, DPW and α were the most important parameters for the 2

TBM performance prediction. Yagiz et al [33] developed an artificial neural network (ANN) model and a new non-linear multivariable equation for the penetration prediction. At the same time, using the rock mass cut-ability index (RMCI), Hassanpour et al. [31]) proposed another multivariable equation based on several simple regression analyses on the Nowsood Tunnel datasets to estimate the TBM field penetration index (FPI). Khademi Hamidi et al [34], based on rock mass rating (RMR) system, proposed a non-linear equation on the Nowsood Tunnel datasets and improved the estimations. They found that the tunnel depth and groundwater level had a meaningless fit with the FPI. Yagiz and Karahan [35] employed the particle swarm optimization (PSO) algorithm to develop a non-linear multivariable model based on the Queens Tunnel dataset to improve the accuracy of prediction. Benato and Oreste [36] introduced a new equation based on Alpine tunnel datasets, using disk cutter's force (which applied to rock), the geological strength index (GSI), and uniaxial strength of intact rock. They concluded that the UCS is the most effective parameter on TBM’s advance rate. Jahed Armaghani et al [37] developed an ANN model and also introduced two other hybrid models, i.e. ANN-ICA and ANN-PSO. They found that although ANNs had a high degree of accuracy, the hybrid models had higher accuracy with better predictors. The literature review shows that there is no comprehensive agreement on the quantitative or qualitative influence of various variables on the TBM penetration rate, but the degree of accuracy in its prediction has been improved in recent years through using computer algorithms. Hence, in this study, novel models using computer algorithms are introduced for the ROP prediction. First, two non-linear multivariable equations with different mathematical functions are proposed and optimized by the imperialist competitive algorithm (ICA). Afterward, classification and regression tree (CART) is employed as a different technique to develop models for the ROP prediction. In continue following, Section 2 describes the Queens Tunnel project’s specifications and extracted datasets. Section 3 introduces an introduction of the algorithms which are used for to develop the models presented generation in this study and their flowcharts. The results of this study, the generated models and their performance are illustrated in Section 4. Then in Section 5, the proposed models verified are validated and compared with published previous models presented by other researchers. Finally, section 6 outlines remarkable conclusions of this study. 3

2. Site Description and Dataset The Queens Tunnel No. 3 was excavated in about 7.06 m diameter, at a depth of almost 200 m below sea level in Queens County in New York City. This tunnel will supply 90% of consumed water in the city in 2021 (see Figure 1). The second stage of the Tunnel No. 3 was excavated using a high-performance Robbins TBM. Table 1 shows specifications of the Queens’s TBM. The obtained 151 datasets from the second stage are used in this study, which were published by Yagiz [32]. The encountered rocks in the section No.3 are divided into 5 categories as follows: -

Rhyodacite dike rocks (2.6%),

-

Granitoid (felsic) gneiss and orthogneiss (29.2%),

-

Massive garnet amphibolite and larger mafic dikes (7.3%),

-

Mafic-to mesocratic orthogneiss (19.9%)

-

Mafic-to mesocratic gneiss, amphibolite and schist (41%).

The datasets include the ROP (m/h), three intact rock properties (i.e. the uniaxial compressive strength (UCS), the Brazilian tensile strength (BTS) and the brittleness index (BI)), and two rock mass properties (i.e. the angle between planes of weakness and tunnel direction (α) and the average distance between planes of weaknesses (DPW)). Table 2 presents basic descriptive statistics of all parameters. These parameters were used in previous models [32, 33, 35, 38, 39]. Since 2008, Many papers have been published using the Queens datasets [32, 33, 35, 38, 39]. All of the reported methods led to the determination coefficient (DC) of R2=0.67. This low determination coefficient between the actual measurements and the predictions is associated with great complexity of the ground condition and disorderliness of the rock parameters’ values encountered in each section of drilling. That is, as shown in Figure 2, the ROP does not have a meaningful relationship with the rock properties. None of the UCS, DPW and BI parameters have a simple ascending or descending trend respecting the ROP’s trend. For example, it is anticipated that for low UCS values, the ROP values increase, but as shown in Figure 2, the UCS values fluctuate sharply from a section to another section and do not have a descending trend when the ROP's graph shows ascending trend. Such fluctuation can be also observed for BI and DPW, indicating that 4

the prediction of ROP is not possible by using one or two independent parameters. All of the mentioned parameters influence the ROP in synchronization with each other. Therefore, it is essential to consider all of the effective parameters in prediction purposes. TABLE 1 Specifications of the Queens Tunnel Boring Machine [32] Machine diameter Diameter range Number of disc cutters Recommended disc cutters load Max. operational cutter head thrust Cutterhead power Cutterhead speed Cutterhead torque Thrust cylinder stroke Conveyor capacity (approx.) TBM weight (approx.)

7.06 m 6.5-8.5 m 50 35 short tons per each disk cutter 1750 tons 4220 hp 8.3 rpm 1335 short tons 1.83 m (6 ft.) 18.4 m3/min (650 ft3/min) 640 short tons

TABLE 2 The statistically description of rock mass parameters and penetration rate for the model development [32] Type of data Inputs

Intact rock properties Rock mass properties

Output

Symbol UCS (MPa) BTS (MPa) BI (kN/mm) DPW (m) α ( Degree) ROP (m/h)

Ave. 150.03 9.57 34.61 1.02 44.61 2.05

Standard.D. 22.26 0.84 8.48 0.65 23.32 0.36

Min. 118.3 7 25 0.05 2 1.27

Max. 199.7 11.4 58 2 89 3.07

3. Proposed methodologies 3.1 Imperialist Competitive Algorithm (ICA) One of the most popular applications of evolutionary algorithms is to solve and optimize the multivariable equations. Some of the evolutionary algorithms (e.g. the Genetic Algorithm (GA) [40], the Particle Swarm Optimization (PSO) [41], the Artificial Bee Colony algorithm (ABC) [42] and the Imperialist competitive algorithm (ICA), etc.) solve or optimize the problems by using an imitation of natural processes by computers and coding. The Imperialist competitive algorithm (ICA), with an inspiration from a social-political process, due to its greater capability of solving problems and shorter run time, has been used in different engineering problems in recent years [43, 44]. Similar to the other evolutionary methods, first of all, the ICA generates a random

5

initial population called “empires”, in which the individuals in each empire are called “country”. During this process, some of the best countries are considered as imperialist and the rest as colonies. Afterward, the imperialists start to assimilate the colonies in order to organize its emperorship. At the end of algorithm processing, one emperorship, which is the best solution for the problem, will exist. Figure 3 shows a flowchart of ICA. In each optimization problem with N number of variables (𝑁𝑣𝑎𝑟), each country is an 1 × 𝑁𝑣𝑎𝑟 array, Country = {w1,w2,w3,…,wNvar}

(1)

where wi is the weight of each variable. The cost or fitness of each country (amount of power) is assessed using the evaluation or fitness function (f) Cost = f (country) = f (w1,w2,w3,…,wNvar)

(2)

In this paper, the Root Mean Square Error (RMSE) is used as the fitness function, meanwhile, other functions such as the Mean Square Error (MSE), the Mean Absolute Error (MAE), etc. could be also used. In countries scoring process, some of them are chosen as imperialist (Nimp) and the rest as colonies (Ncol). After obtaining the normalized costs of each imperialist from existing costs and using Eq. (1), the colonies are divided by Eqs. (4) and (5): Cn = cn ― max {ci}

(3)

| |

(4)

pn =

Cn

Npop

∑i = 1 Ci

(5)

N.C.n = round {pn.Ncol}

where N.C.n shows the number of colonies assigned to imperialists, cn is the cost of nth imperialist, Cn is the normalized cost, Npop is the initial population size, and max {ci} is the maximum costs of imperialists. During the optimization process in the ICA, the colonies are propelled toward imperialists. Therefore, it is possible for a colony to be replaced as an imperialist and to generate a new emperorship (See Figure 4). The power of each emperorship can be calculated using Eq. (6) T.C.n = Cost(imperialistn) +ξ mean {Cost(colonies of the empiren)}

(6)

where 𝑇.𝐶𝑛 is the cost of nth emperorship and ξ is a digit between 0 and 1, which leads to the change in the amount of colonies’ role respecting the total power of imperialist.

6

In the ICA, the colonies transfer from a weak emperorship to a better one and after a while, the weak imperialists are eliminated from the population. Figure 5 shows the competitive imperialists, in which emperorship No.1, as a weak emperorship, loses its colonies and other emperorships compete to acquire those colonies. The ICA algorithm will be continued to acquire a convergence condition or a certain number of iterations and finally, only one emperorship as the best solution will exist [45]. 3.2 Classification and regression tree (CART) A classification and regression tree (CART) is a data mining technique. It has been considered as a powerful and popular technique in recent years, due to its simplicity, interpretability, low cost of implementation, and a high degree of accuracy [46]. In comparison with the other machine learning methods (e.g. the artificial neural networks (ANNs), the support vector machines (SVMs), the support vector regression (SVR), etc.), the CART is considered as a white box method because of its explicitness and interpretability in modeling. It is applicable to qualitative data, as a classification method, as well as to quantitative data, as a regression technique [47]. In this study, the regression tree (RT), a suitable alternative to the common multivariable regression analyses, is used. This tree consists of a root node, decision (or interior) nodes and branches and leaf (or terminal) nodes. The RT performs the regression by splitting the data of the root node (that includes all of the training data) into mutual sub-nodes, called decision nodes. Subsequently, by considering each decision node as a root node, the tree will be grown until the stopping criterion is met. Finally, the result is a tree-like chart that is fitted to each node to give a predicted value for observations (See Figure 6). One of the major performance parameters of CART is the tree depth, which is defined as a length of the longest path from the root to the leaf node. The tree depth plays an important role in preventing the tree from overtraining and overgrowing. However, in this paper, a Matlab [48] code is used, in which the tree depth is not directly changed, but the tree growth is controlled by the following criteria: the maximum number of splits (is 𝑛 ― 1 where 𝑛 is the training sample size), the minimum number of leaves and the minimum parent size.

7

The other important issues in the data mining are the detection of outlier data as well as identification of principal parameters. Some algorithms have been presented for these issues; however, the CART inherently uses the Principle Component Analysis (PCA) in modeling [49]. This powerful technique is able to detect the deviated data and eliminate them by changing the direction of the coordinate axis. Furthermore, the PCA identifies the principal parameters in a modeling procedure. 3.3 Genetic Expression Programming (GEP) Since the Genetic Algorithm was introduced by Holand, its different forms have been developed [50, 51]. This Algorithm has a symbolic string (chromosome) of fixed length, as unique solutions. One of the popular modification performed on the genetic algorithm was Genetic Programming (GP), in which the solution had a tree-shaped structure with different sizes [22, 52, 53]. The Genetic Expression Programming (GEP) is a newly developed algorithm form of GA and GP, which its major difference with the GP refers to the size of chromosome [22, 54]. The GP chromosomes have different sizes while those of GEP are encoded as simple strings of fixed length [55]. Among a large number of Evolutionary Algorithms (EAs), only the GEP and GP are able to produce a mathematical equation between some independent and dependent parameters of a certain problem. The GEP is applicable for Classification, Logistic regression, Function finding, Time series prediction and Logic synthesis issues. As well, by using several sub programmes, it develops more complex programmes which are associated with its multi-genic nature. The GEP chromosomes are able to contain one or more genes, where each gene contains two different types of individuals, called functions and terminals (see Figure 7). The Functions consist of mathematical operators (+, -, *, /, etc.) whereas the terminals can contain numerical constants, logical constants or variables. Figure 7 shows two different chromosome samples and an expression tree structure. where “L” represents the natural logarithm function and a, b, x, y, m and z are variables and constants. It should be noted that in any genome (chromosome), each gene starts with a zero. As shown in Figure 8, the process of analyzation using GEP includes some steps: 1- The GEP generates an initial population and evaluates each member of population (chromosome) using a previously selected 8

fitness function (there are several forms of fitness function, e.g. the root mean square error (RMSE), the mean square error (MSE), the Mean absolute percentage error (MAPE), etc.); 2-The algorithm sorts the solutions regarding their fitness from the best to the worst. If an anticipated fitness is not reached, the algorithm modifies the chromosomes using selection, mutation, inversion, reproduction and transposition operators. The process is continued until the anticipated fitness or a defined number of generations are reached. The most common operators are described below. Mutation: The mutation operator, the most significant operator of the GEP algorithm, is able to change the results of the program strickly. It converts terminal nodes to a functional node or vice versa. Using this operator, two different sub-trees from the different chromosomes are chosen and then the selected trees are swapped with each other. Inversion: This operator selects a specific range of genes in the head of the chromosome and then inverts the gene. Transposition: Using this operator, the selected part of a chromosome is jumped into another position in a chromosome. Three different types of transposition operators are: 1) Insertion sequence (IS) transposition 2) Root insertion sequence (RIS) transposition 3) gene transposition.

4. Results Before performing this study, for the Queens datasets, several different equations were developed to predict the ROP by employing different techniques [28, 32, 33, 35, 38]. These techniques did not improve the predictions progressively, but published due to the novelty of their algorithms used for the ROP prediction. They employed the linear multivariable regression (LMR) (to give Eq. (7)), the Non-linear multivariable regression (NLMR) (to give Eqs. (8) and (11)), particle swarm optimization (PSO) (to give Eq. (9)), harmony search (HS) algorithm, differential evolution algorithm (DEA) and grey wolf optimizer (GWO) (to give Eq. (10)) (details are available in Table 3). TABLE 3 Experimental models of penetration rate prediction based on Queens Tunnel datasets. Reference

Techniqu e

Input

Equation

R2

9

RMS E

MAP E

Eq. No.

Yagiz [32]

LMR

DPW, UCS, BI, α

0.665 𝑅𝑂𝑃 = 1.093 ― 0.003𝑈𝐶𝑆 + 0.029𝐵𝐼 ― 0.219𝐷𝑃𝑊 + 0.437log 𝛼

0.215

9.473

7

DPW, UCS, BI, α

0.649 𝑅𝑂𝑃 = 0.076 ― 0.139𝑈𝐶𝑆 + 0.524𝐵𝐼 ― 0.234𝐷𝑃𝑊 + 0.634𝛼0.205

0.348

15.88

8

UCS, BTS, BI, DPW, α

0.668 0.205 0.207 𝑅𝑂𝑃 = 2.827 ― 0.0041𝑈𝐶𝑆 + 0.029𝐵𝐼 ― 0.4016𝐷𝑃𝑊0.584 + 0.634𝛼

8.925

9

0.205 0.208 𝑅𝑂𝑃 = 2.6285 ― 0.0028𝑈𝐶𝑆 + 0.0248𝐵𝐼 ― 0.4016𝐷𝑃𝑊0.50110.661 + 0.634𝛼

8.790

10

8.708

11

Yagiz et al.[33]

NLMR

Yagiz and Karahan [35]

PSO

Yagiz and Karahan [38]

DE, HSBFGS, GWO

DPW, UCS, BI, α

Yagiz [39]

NLMR

Fn, UCS, BI, α, DPW

𝑅𝑂𝑃 =

𝐹𝑛0.21 𝑈𝐶𝑆0.11 × Exp ( ―0.012𝐵𝐼 + 0.118𝐹𝑠 ― 0.115ln (𝛼))

0.647

0.452

4.1 Generation of Empirical mathematical equation To evaluate the relationship between TBM penetration rate and rock mass properties, some single variable regressions including the Fourier, polynomial, power, and exponential regressions are developed between them. Table 4 shows that all rock mass properties have the greatest correlation when the Fourier regression is used. For the UCS, BTS, BI, DPW and α parameters, the determination coefficient (DC), R2=0.1, 0.025, 0.337, 0.228 and 0.2, are obtained respectively (See Figure 9). The BI and BTS have the highest and lowest R2, respectively. TABLE 4 The coefficients of determination (R2) of the simple regressions between the ROP and the independent variables. Exponential

Fourier

polynomial

Power

UCS

0.069

0.101

0.067

0.057

BTS

0.008

0.025

0.008

0.009

BI

0.335

0.337

0.337

0.335

DPW

0.219

0.228

0.216

0.188

α

0.045

0.2

0.048

0.118

The effect of Brazilian tensile strength (BTS) was considered in the previous studies [28, 32, 33, 35, 38]. In this study, when the BTS parameter is added to the predictor parameters' group, the models’ accuracy improves due to the use of the more developed techniques, in which the effect of BTS on the ROP can be considered. Therefore, this parameter is inserted into the group of model generator parameters. The equations developed for the relationships between the rock mass parameters and ROP are shown in Table 5.

10

The mathematical form of non-linear regressions is available in Table 5. For all parameters, a Fourier term is established (Eq. (12)-(15)), but not for the BI (Eq. (16)), which has the same R2 value for both Fourier and Polynomial regressions. Therefore, the equation is shortened with a polynomial term. TABLE 5 Equations between the ROP and rock mass properties parameter

ROP (m/h) R2

Regression

Equations

Eq. No.

𝑅𝑂𝑃 = 2.123 + 0.1107 × cos (𝑈𝐶𝑆 × 0.1616) ― 0.1729 × sin (𝑈𝐶𝑆 × 0.1616)

12

𝑅𝑂𝑃 = 2.043 + 0.044 × 𝑐𝑜𝑠 (𝐵𝑇𝑆 × 38.41) ― 0.0644 × 𝑠𝑖𝑛 (𝐵𝑇𝑆 × 38.41) 𝑅𝑂𝑃 = 4.18 ― 1.78 × 𝑐𝑜𝑠 (𝐷𝑃𝑊 × 0.3416) ― 1.509 × 𝑠𝑖𝑛 (𝐷𝑃𝑊 × 0.3416) 𝑅𝑂𝑃 = 𝑐𝑜𝑠 (𝛼 × ―5.26𝑒 ― 05) ― 518.8 × 𝑠𝑖𝑛 (𝛼 × ―5.264𝑒 ― 05) 𝑅𝑂𝑃 = 0.0246 × 𝐵𝐼 + 1.19

13 14 15 16

UCS

0.101

Fourier

BTS DPW α BI

0.025 0.228 0.2 0.337

Fourier Fourier Fourier polynomial

Although the Fourier regression gives the greater DC than the other regression types (See Table 5), the correlation between rock mass properties and the ROP is still low. As mentioned in Section 2, the prediction of the ROP using only one or two parameters is impossible because of the dependency of the ROP on all rock mass properties. Hence, combining the single variable equations between the ROP and rock mass properties, it is expected to reach a multivariable equation with a higher correlation. However, coefficients of this new equation need to be optimized again using a stochastic algorithm. As a result, a new empirical equation is obtained as Eq. (17): 𝑅𝑂𝑃 = 𝑤(1) × cos [𝑤(2) × 𝑈𝐶𝑆] + 𝑤(3) × sin [𝑤(4) × UCS] +𝑤(5) × cos [𝑤(6) × 𝐵𝑇𝑆] +𝑤(7) × sin [𝑤(8) × 𝐵𝑇𝑆] +𝑤(9) × 𝐵𝐼 +𝑤(10) × cos [𝑤(11) × 𝐷𝑃𝑊] +𝑤(12) × sin [𝑤(13) × 𝐷𝑃𝑊] +𝑤(14) × cos [𝑤(15) × 𝛼] +𝑤(16) × sin [𝑤(17) × 𝛼] +𝑤(18)

(17)

4.2 Non-linear multivariable regression analysis using ICA To optimize Eq. (17) and to approach the best solution, the imperialistic competitive algorithm (ICA) code is written in MATLAB 2012 software package [48]. The efficiency of models are evaluated by three indices: the determination coefficient (DC, Eq. (18)), the root mean square error (RMSE, Eq. (19)) and the mean absolute percentage error (MAPE, Eq. (20)). ∑(𝐲𝐚𝐜𝐭 ― 𝐲𝐩𝐫𝐞)𝟐

𝐑𝟐 = 𝟏 ― ∑(𝐲

𝐚𝐜𝐭

(18)

― 𝐲𝐚𝐜𝐭)𝟐

11

(19)

𝟐

𝐧

𝐑𝐌𝐒𝐄 =

∑𝐢 = 𝟏(𝐲𝐩𝐫𝐞 ― 𝐲𝐚𝐜𝐭) 𝐧 𝟏

𝐧

𝐌𝐀𝐏𝐄 = 𝐧∑𝐢 = 𝟏

|

𝐲𝐩𝐫𝐞 ― 𝐲𝐚𝐜𝐭 𝐲𝐚𝐜𝐭

| × 𝟏𝟎𝟎

(20)

where 𝑦𝑝𝑟𝑒 , 𝑦𝑎𝑐𝑡 and 𝑦𝑎𝑐𝑡 are the predicted values, the measured (actual) values and the mean of measured values, respectively. The Queens Tunnel datasets were randomly divided into the training (80% of all dataset) and testing (20% of all dataset) sub-sets. The generated code is tested about 100 times and finally, seven top models are selected. Model No. 7 is chosen as the best model for TBM performance prediction. It has the best training and testing results among the other developed models using the ICA. However, its prediction power is evaluated using all of 151 datasets. (Table 6). Moreover, each model is evaluated using DC, RMSE, MAPE indices. Table 7 shows the details of the models. TABLE 6 Experimental models of penetration rate prediction on the Queens Tunnel datasets Model D1 D2 D3 D4 D5 D6 D7

Data status

Training RMSE MAPE

R2

80% of all for train 20% of all for test 80% of all for train 20% of all for test 80% of all for train 20% of all for test 80% of all for train 20% of all for test 80% of all for train 20% of all for test 80% of all for train 20% of all for test 80% of all for train 20% of all for test

R2

Testing RMSE MAPE

0.715

0.195

7.96

0.460

0.514

30.29

0.727

0.191

7.68

0.556

0.493

28.31

0.75

0.183

7.25

0.623

0.472

26.43

0.75

0.183

7.39

0.465

0.492

28.08

0.762

0.178

7.09

0.364

0.504

28.00

0.764

0.177

6.92

0. 628

0.48

26.47

0.793

0.161

5.92

0.651

0.426

8.97

TABLE 7 Details of developed models (Coefficients of weighting). Model

D1

D2

D3

D4

D5

D6

D7

W1 (UCS)

0.120

0.086

-0.106

0.102

-0.126

-0.103

0.093

W2 ( UCS)

20.816

0.303

41.993

-76.284

-13.567

-49.265

41.312

W3 ( UCS)

-0.105

0.110

0.097

-0.139

-0.136

0.114

0.082

W4 ( UCS)

76.406

42.025

-58.65

-62.787

42.003

41.982

-1.997

W5 (BTS)

0.691

-0.310

0.070

-0.112

0.068

-0.059

-0.090

W6 ( BTS)

31.593

-64.200

-97.880

-25.231

-49.170

-4.482

53.483

W7 ( BTS)

-0.213

13.004

20.444

0.060

-0.092

0.097

0.086

W8 ( BTS)

-30.409

63.002

-94.248

71.838

24.743

38.088

44.625

12

W9 (BI)

0.024

0.021

0.020

0.019

0.019

0.016

0.022

W10 (DPW)

-0.183

-0.148

-57.384

-0.491

-0.111

0.153

-0.316

W11 ( DPW)

62.832

40.759

-41.887

73.228

84.933

26.415

28.527

W12 ( DPW)

-0.148

0.114

-0.275

-0.455

0.109

-0.338

0.329

W13 ( DPW)

-65.141

2.788

94.983

-77.935

-91.384

32.132

-88.054

W14 (α)

-0.248

0.817

1.002

-0.192

-0.230

-0.222

-0.152

W15 ( α)

-25.187

-94.284

-18.884

50.324

-0.056

-62.889

-81.764

W16 ( α)

0.097

-2.448

-2.974

0.140

-0.102

0.097

6.214

W17 ( α)

-69.251

37.682

31.401

-94.383

62.968

94.112

81.682

W18 (bias)

1.314

-12.971

-28.922

1.342

1.306

1.712

1.045

R2 (train)

0.715

0.727

0.75

0.75

0.762

0.764

0.768

DC, RMSE and MAPE values of testing datasets indicate the models which did not overtrain and are reliable. Hence, the TBM penetration rate (m/h) could be predicted by Eq. (21). 𝑅𝑂𝑃 = 0.092755 × cos [41.31271 × 𝑈𝐶𝑆] + 0.08167 × sin [ ―1.99695 × UCS] ―0.08966 × cos [53.48337 × 𝐵𝑇𝑆] + 0.086489 × sin [44.62475 × 𝐵𝑇𝑆] (21)

+ 0.022441 × 𝐵𝐼 ―0.31624 × cos [28.52688 × 𝐷𝑃𝑊] + 0.328816 × sin [ ―88.0544 × 𝑃𝑊] ―0.15166 × cos [ ―81.7641 × 𝛼] + 6.213674 × sin [81.6825 × 𝛼] + 1.044826

Khademi Hamidi [34] showed that the logarithmic regression generated a better relation between FPI and 𝛼. Yagiz [39] used a logarithmic term for 𝛼 for the ROP prediction (Eq. 11). However, he used individual cutter force (Fn) in his equation while this parameter was obtainable during the excavation process. Furthermore, he put 𝛼 in the dominator of his equation, and as seen in Fig. 9, there is a positive relationship between ROP and log (𝛼). Therefore, this equation is modified to obtain a new equation, which is optimized using ICA (i.e. (22))

𝑅𝑂𝑃 =

(

0.0469 × 𝐵𝐼0.7857 0.0366 × 𝑈𝐶𝑆0.4918 + 10.4229 × 𝐵𝑇𝑆 ―2.6374 + 0.1096 × 𝐷𝑃𝑊0.8381

13

)

+ 0.5683 × log 𝛼0.7804

(22)

Eq. (22) has a better performance compare to previously introduced models based on the Queens Tunnel datasets (see Table 12). It gives the α's logarithm, which has a direct relationship with the ROP. (Figure 9).

4.3 Regression tree (RT) model for TBM penetration rate The developer code of RT model is written in MATLAB software package [48]. The datasets are divided into training (80% of all) and testing (20% of all) sets and a separate code is written to calculate the RMSE and MAPE errors for each run. The RT tries to find a relationship between the dependent and independent parameters with the lowest error and the greatest accuracy for predictions. The growth rate of the RT is controlled by the maximum number of leaves and the minimum number of parents. The minimum parent size returns to the minimum number of observations per a branch, that is, with the fewer number of parents, the tree tends to grow as great as the maximum number of splits. Although there is no certain rule to obtain an optimal tree with an optimum number of nodes and an optimal tree depth, in this study, by choosing some limits for the maximum number of splits and minimum number of parents, some different trees are made to find the best model. In some of the previous studies, the tree depth was considered as a criterion to choose the best tree among the others. However, in this study, the tree depth, the number of nodes and the DC of the trees are considered as the tree performance criterion. In fact, these three parameters represent the model’s capability of prediction and complexity. Using trial and error procedures, four numbers from 10 to 13, as the maximum number of splits and five numbers, from -3 to +1, as the number of parents, are chosen to develop 20 different models (trees). Considering the efficiency of trees and their complexities, the model No. 17 was selected as the best model for TBM penetration prediction (See Table 8) with tree-depth=7, the minimum number of parents=10, the maximum number of splits=13, R2train= 0.82 and R2test= 0.69. Figure 10 shows the tree structure of the best model. TABLE 8 The best CART model’s characteristics for ROP prediction Max Number of splits

Min Number of parents

Number of nodes

Tree depth

R2

14

Training RMSE

MAPE

R2

Testing RMSE MAPE

13

10

27

7

0.82

0.15

5.91

0.693

0.1953

7.96

4.4 Penetration prediction using the Genetic Expression Programming (GEP) Armaghani et al. [37] developed a GEP model for TBM performance prediction, based on PSRWT project dataset in Malaysia, using the UCS, rock quality designation (RQD), rock mass rating (RMR), BTS, weathering zone (WZ), thrust force (TF), and cutter head revolution per minute (RPM) [56]. However, they used some independent parameters in their study (e.g. TF and RPM) which are not measurable before starting the project, and they are only obtainable during the TBM’s excavation progress. The use of applicable parameters in the generation of the GEP model are essential in each tunneling project, and it is necessary to measure them before starting the project [22]. As a result, the rock mass properties (as independent parameters) and the ROP (m/h) (as a dependent parameter) are used for model development. Therefore, a model with a form of 𝑅𝑂𝑃 = 𝑓(𝑈𝐶𝑆, 𝐵𝑇𝑆, 𝐵𝐼, 𝐷𝑃𝑊, 𝐴𝑙𝑝ℎ𝑎) is proposed to the GEP software. For GEP modeling, the GeneXpro Tools v.4.0 are used. In this way, 80% of the whole datasets are considered as the training set and the rest as the testing set to ensure that the algorithm does not overtrain during the development process. Although different parameters of a GEP model are regulable in this software (e.g. fitness function, linking function, number of chromosomes, number of genes, etc.), there is no specific criterion for their regulation. Hence, some trial and error procedures are applied to obtain the optimum values of these parameters (See Table 9). TABLE 9 The optimum values and regulations for the developed GEP model Furthermore, genetic operators (e.g. mutation rate, gene recombination, etc.) are chosen by Ferreira’s suggestions (See Table 10) [54]. Model setting Chromosome size Generation Number of genes Constants per gene Length of the head size Linking function Function set Fitness function

Regulation 1000 10000 5 2 10 Addition + - × / sin cos tan sqrt pow log RMSE

15

TABLE 10 Optimum values of the genetic operators in the GEP modeling [54] Genetic operator Mutation Inversion Recombination Permutation Tail inversion Tail mutation

value 0.00138 0.00546 0.00277 0.00546 0.00546 0.00546

To get the best models for TBM performance prediction, more than 20 models are established using the GEP software. At the end, the best model is chosen with R2train = 0.72, RMSE train = 0.1912, MAPE train = 5.96 (see Table 11). TABLE 11 Statistical indices of the best GEP model for TBM performance prediction Proposed model

R2 0.7263

#The best model

Training set RMSE MAPE 0.1912 5.96

R2 0.6567

Testing set RMSE 0.4241

MAPE 8.68

The model consists of five genes which link together using a linking function (additional operator, see Table 9) and generate the overall GEP model. The mathematical equation of each gene is written as Eqs. (23)-(27) and the final GEP model is obtained as Eq. (28). Figure 11 shows the expression tree (ET) of each gene. To interpret the ETs, read them from left to right and bottom to up.

𝑆𝑢𝑏𝐸𝑇1 = atan

𝑆𝑢𝑏𝐸𝑇2 =

(

(( (

―6.86 tan 𝐵𝐼

1 3

))

―6.86 × (1.4 ― 𝐷𝑃𝑊) 1.4 × 𝛼

(

―10.38 × 𝐵𝑇𝑆 ―10.38 × 𝐷𝑃𝑊 × 𝛼 + 𝑈𝐶𝑆 𝐵𝐼 1 3

𝑆𝑢𝑏𝐸𝑇3 = 𝐷𝑃𝑊 × cos

(

𝛼 ∗ sin (𝛼 ― 𝐵𝑇𝑆) 𝛼 + 𝐵𝐼

𝐵𝐼4 𝑆𝑢𝑏𝐸𝑇4 = 2.05 × 𝑈𝐶𝑆 × 𝑈𝐶𝑆

(

𝑆𝑢𝑏𝐸𝑇5 = 68.04 ― atan

𝑅𝑂𝑃

+

)

(

)

(23)

2.63 𝐵𝐼

)) ( ) ―

(24)

)

(25)

1 9

tan (𝑈𝐶𝑆) ― cos (𝐵𝐼) 𝛼 + 2.48

(26)

)

(27)

(𝑚 ℎ) = 𝑆𝑢𝑏𝐸𝑇1 + 𝑆𝑢𝑏𝐸𝑇2 + 𝑆𝑢𝑏𝐸𝑇3 + 𝑆𝑢𝑏𝐸𝑇4 + 𝑆𝑢𝑏𝐸𝑇5

16

(28)

5. Verification The efficiency of the models presented in this paper can be evaluated by comparing their results with those of old previous models developed using the same datasets (see Table 3). One of the defects of previous models is that their performances in the prediction were evaluated using their own calibration results rather than their test results. Hence, it is not appropriate to compare the performance of presented models with the previously published models (in which various datasets in quantity and quality had been used). Therefore, the prediction power of all previously published models and the currently proposed models are recalculated using all of the Queens Tunnel datasets. As observed in Table 12, the predictive proposed models in this study have the highest accuracy and the lowest error among all of the models proposed for the Queens datasets, indicating the superiority of the Fourier regression, ICA, CART and GEP algorithms over other previously proposed techniques. TABLE 12 The values of statistical indices for all predictive models Model ICA-1 CART GEP ICA-2 Yagiz and Karahan [38] Yagiz [39] Yagiz and Karahan [35] Yagiz et al. [33] Yagiz [32]

R2 0.76 0.75 0.72 0.681 0.668 0.665 0.661 0.649 0.647

Rank 1 2 3 4 5 6 7 8 9

RMSE 0.172 0.177 0.191 0.202 0.207 0.215 0.208 0.348 0.452

Rank 1 2 3 4 5 7 6 8 9

MAPE 6.76 7.15 5.96 8.58 8.925 9.473 8.790 15.88 8.708

Rank 2 3 1 4 7 8 6 9 5

Note that there are some other studies, which employed the black-box techniques, for the penetration rate prediction (See e.g., [26, 27]). However, as these studies neither present an equation or practical solution nor have a potential for future applications, they are not considered as a predictive model to be compared with the proposed models.

6. Conclusions After the publication of the Queens Tunnel datasets in 2008, different analyses were performed on them and several models were developed for TBM penetration rate prediction. This new study tried to analyze these datasets more deeply and introduce new models with higher accuracy. To this end, in initial steps, the

17

effects of rock properties and their relationships with the ROP were investigated. As well, for the first time, the Fourier Transform was used as a regression technique, which showed impressive results compare to the other conventional regression techniques. It was concluded that the BI and BTS have the greatest and the least effect on the ROP. However, predicting ROP by using only one or two parameters is not possible because of the complexity of the problem and meaningless trend of the ROP graph respecting the graphs of rock properties. In the next steps, by using computer algorithms, four different models were developed for the ROP (m/h) prediction. Therefore, two non-linear multivariable equations were proposed using the Fourier Transform and conventional regression techniques.Then, they were optimized by using the imperialist competitive algorithm (ICA). Another model was also developed by using the genetic expression programming (GEP), which is interpretable as an equation due to the use of mathematical functions. Finally, a model was developed by using classification and regression tree (CART), due to its robustness in problem-solving in different scientific issues. Comparing the developed models and previously published models showed the following distinguished results: -

the The CART has the lowest errors in the ROP prediction;

-

The NLMR equation, which was developed using the Fourier transform, outperform the conventional NLMR and GEP models; and

-

Although the GEP model which is introduced in this study had lower accuracy compare to the CART and the NLMR with Fourier terms and its

accuracy is much higher than the models

introduced for the ROP prediction.

References

[1] R. Robbins, Present trends and future directions in tunnelling, The Yugoslav Symp. on Rock Mechanics and Underground Actions, 1980, pp. 11 [2] P.J. Tarkoy, A. Hendron, Rock hardness index properties and geotechnical parameters for predicting tunnel boring machine performance, 1975.https://doi.org/10.1016/0148-9062(77)90787-2. [3] G.E. Korbin, Factors influencing the performance of full face hard rock tunnel boring machines, 1979, 18

[4] P.J. Tarkoy, Predicting raise and tunnel boring machine performance: state of the art, Proceedings 4th RETC, (1979) 333-352 [5] A. Ingraffea, K. Gunsallus, J. Beech, P. Nelson, A fracture toughness testing system for prediction of tunnel boring machine performance, The 23rd US Symposium on Rock Mechanics (USRMS), American Rock Mechanics Association, 1982, [6] P. Nelson, T.D. O'Rourke, F.H. Kulhawy, Factors affecting TBM penetration rates in sedimentary rocks, The 24th US Symposium on Rock Mechanics (USRMS), American Rock Mechanics Association, 1983, [7] P. Nelson, A. Ingraffea, T. O'rourke, TBM performance prediction using rock fracture parameters, Intl J of Rock Mech & Mining Sci & Geomechanic Abs, 22 (1985),https://doi.org/10.1016/01489062(85)92079-0. [8] H. Sanio, Prediction of the performance of disc cutters in anisotropic rock, International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts, Elsevier, 1985, pp. 153-161 [9] F.F. Roxborough, H.R. Phillips, Rock excavation by disc cutter, International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts, Elsevier, 1975, pp. 361366.https://doi.org/10.1016/0148-9062(76)90564-7. [10] L. Ozdemir, Development of theoretical equations for predicting tunnel boreability, Colorado School of Mines. Arthur Lakes Library, 1977, [11] J. Rostami, L. Ozdemir, Selection, Design Optimization and Performance Prediction of Tunnel Boring Machines for Mining Operations, SME 126th Annual Meeting & Exhibit,, Denver, Colorado., Society of Mining Metallurgy and Exploration Inc, 1997, [12] N.R. Barton, TBM tunnelling in jointed and faulted rock, Crc Press, 2000 [13] A. Bruland, Hard rock tunnel boring, Fakultet for ingeniørvitenskap og teknologi, 2000 [14] Q. Gong, J. Zhao, Development of a rock mass characteristics model for TBM penetration rate prediction, International journal of Rock mechanics and mining sciences, 46 (2009) 818,https://doi.org/10.1016/j.ijrmms.2008.03.003. [15] M. Zare Naghadehi, A. Ramezanzadeh, Models for estimation of TBM performance in granitic and mica gneiss hard rocks in a hydropower tunnel, Bulletin of Engineering Geology and the Environment, 76 (2017) 1627-1641,https://doi.org/10.1007/s10064-016-0950-y. [16] E. Farrokh, J. Rostami, C. Laughton, Study of various models for estimation of penetration rate of hard rock TBMs, Tunnelling and Underground Space Technology, 30 (2012) 110123,https://doi.org/10.1016/j.tust.2012.02.012. [17] P. Graham, Rock exploration for machine manufacturers, Exploration for rock engineering, 1 (1976) 173-180 [18] C. Laughton, Evaluation and prediction of tunnel boring machine performance in variable rock masses, (1999) [19] M.A. Grima, P. Bruines, P. Verhoef, Modeling tunnel boring machine performance by neuro-fuzzy methods, Tunnelling and underground space technology, 15 (2000) 259269,https://doi.org/10.1016/s0886-7798(01)00022-0.

19

[20] O. Acaroglu, L. Ozdemir, B. Asbury, A fuzzy logic model to predict specific energy requirement for TBM performance prediction, Tunnelling and Underground Space Technology, 23 (2008) 600608,https://doi.org/10.1016/j.tust.2007.11.003. [21] R. Mikaeil, M.Z. Naghadehi, F. Sereshki, Multifactorial fuzzy approach to the penetrability classification of TBM in hard rock conditions, Tunnelling and Underground Space Technology, 24 (2009) 500-505,https://doi.org/10.1016/j.tust.2008.12.007. [22] M. Zare Naghadehi, M. Samaei, M. Ranjbarnia, V. Nourani, State-of-the-art predictive modeling of TBM performance in changing geological conditions through gene expression programming, Measurement, 126 (2018) 46-57,https://doi.org/10.1016/j.measurement.2018.05.049. [23] A. Benardos, D. Kaliampakos, Modelling TBM performance with artificial neural networks, Tunnelling and Underground Space Technology, 19 (2004) 597605,https://doi.org/10.1016/j.tust.2004.02.128. [24] Z. Zhao, Q. Gong, Y. Zhang, J. Zhao, Prediction model of tunnel boring machine performance by ensemble neural networks, Geomechanics and Geoengineering: An International Journal, 2 (2007) 123128,https://doi.org/10.1080/17486020701377140. [25] G. JAVAD, T. NARGES, Application of artificial neural networks to the prediction of tunnel boring machine penetration rate, Mining Science and Technology (China), 20 (2010) 727733,https://doi.org/10.1016/s1674-5264(09)60271-4. [26] E. Ghasemi, S. Yagiz, M. Ataei, Predicting penetration rate of hard rock tunnel boring machine using fuzzy logic, Bulletin of Engineering Geology and the Environment, 73 (2014) 2335,https://doi.org/10.1007/s10064-013-0497-0. [27] S. Mahdevari, K. Shahriar, S. Yagiz, M.A. Shirazi, A support vector regression model for predicting tunnel boring machine penetration rates, International Journal of Rock Mechanics and Mining Sciences, 72 (2014) 214-229,https://doi.org/10.1016/j.ijrmms.2014.09.012. [28] A.C. Adoko, C. Gokceoglu, S. Yagiz, Bayesian prediction of TBM penetration rate in rock mass, Engineering geology, 226 (2017) 245-256,https://doi.org/10.1016/j.enggeo.2017.06.014. [29] A. Salimi, R.S. Faradonbeh, M. Monjezi, C. Moormann, TBM performance estimation using a classification and regression tree (CART) technique, Bulletin of Engineering Geology and the Environment, 77 (2018) 429-440,https://doi.org/10.1007/s10064-016-0969-0. [30] M. Samaei, M. Ranjbarnia, M. Zare Naghadehi, Prediction of the Rock Brittleness Index Using Nonlinear Multivariable Regression and the CART Regression Tree, Journal of Civil and Environmental Engineering, 48.3 (2018) 33-40 [31] J. Hassanpour, J. Rostami, M. Khamehchiyan, A. Bruland, Developing new equations for TBM performance prediction in carbonate-argillaceous rocks: a case history of Nowsood water conveyance tunnel, Geomechanics and Geoengineering: An International Journal, 4 (2009) 287297,https://doi.org/10.1080/17486020903174303. [32] S. Yagiz, Utilizing rock mass properties for predicting TBM performance in hard rock condition, Tunnelling and Underground Space Technology, 23 (2008) 326339,https://doi.org/10.1016/j.tust.2007.04.011.

20

[33] S. Yagiz, C. Gokceoglu, E. Sezer, S. Iplikci, Application of two non-linear prediction tools to the estimation of tunnel boring machine performance, Engineering Applications of Artificial Intelligence, 22 (2009) 808-814,https://doi.org/10.1016/j.engappai.2009.03.007. [34] J. Khademi Hamidi, K. Shahriar, B. Rezai, J. Rostami, Performance prediction of hard rock TBM using Rock Mass Rating (RMR) system, Tunnelling and Underground Space Technology, 25 (2010) 333345,https://doi.org/10.1016/j.tust.2010.01.008. [35] S. Yagiz, H. Karahan, Prediction of hard rock TBM penetration rate using particle swarm optimization, International Journal of Rock Mechanics and Mining Sciences, 48 (2011) 427433,https://doi.org/10.1016/j.ijrmms.2011.02.013. [36] A. Benato, P. Oreste, Prediction of penetration per revolution in TBM tunneling as a function of intact rock and rock mass characteristics, International Journal of Rock Mechanics and Mining Sciences, 74 (2015) 119-127,https://doi.org/10.1016/j.ijrmms.2014.12.007. [37] D.J. Armaghani, R.S. Faradonbeh, E. Momeni, A. Fahimifar, M. Tahir, Performance prediction of tunnel boring machine through developing a gene expression programming equation, Engineering with Computers, 34 (2018) 129-141,https://doi.org/10.1007/s00366-017-0526-x. [38] S. Yagiz, H. Karahan, Application of various optimization techniques and comparison of their performances for predicting TBM penetration rate in rock mass, International Journal of Rock Mechanics and Mining Sciences, 80 (2015) 308-315,https://doi.org/10.1016/j.ijrmms.2015.09.019. [39] S. Yagiz, New equations for predicting the field penetration index of tunnel boring machines in fractured rock mass, Arabian Journal of Geosciences, 10 (2017) 33,https://doi.org/10.1007/s12517-0172843-1. [40] J.H. Holland, Genetic algorithms, Scientific american, 267 (1992) 66-73 [41] J. Kennedy, R. Eberhart, Particle swarm optimization (PSO), Proc. IEEE International Conference on Neural Networks, Perth, Australia, 1995, pp. 1942-1948.https://doi.org/10.1109/icnn.1995.488968. [42] D. Karaboga, An idea based on honey bee swarm for numerical optimization, Technical report-tr06, Erciyes university, engineering faculty, computer engineering department, 2005, [43] K. Nemati, S. Shamsuddin, M. Darus, An optimization technique based on imperialist competition algorithm to measurement of error for solving initial and boundary value problems, Measurement, 48 (2014) 96-108,https://doi.org/10.1016/j.measurement.2013.10.043. [44] D. Yang, L. Wang, W. Hu, C. Ding, W. Gan, F. Liu, Trajectory optimization by using EMD-and ICAbased processing method, Measurement, 140 (2019) 334341,https://doi.org/10.1016/j.measurement.2019.03.063. [45] E. Atashpaz-Gargari, C. Lucas, Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition, 2007 IEEE congress on evolutionary computation, IEEE, 2007, pp. 4661-4667.https://doi.org/10.1109/cec.2007.4425083. [46] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and regression trees. Belmont, CA: Wadsworth, International Group, 432 (1984) 151-166,https://doi.org/10.1201/9781315139470-8. [47] H. Li, H. Dong, L. Jia, M. Ren, Vehicle classification with single multi-functional magnetic sensor and optimal MNS-based CART, Measurement, 55 (2014) 142152,https://doi.org/10.1016/j.measurement.2014.04.028. 21

[48] I. MathWorks, Statistics Toolbox: For Use with Matlab: User's Guide, Version 5, MathWorks, 2012 [49] S. Wold, K. Esbensen, P. Geladi, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987) 37-52,https://doi.org/10.1016/0169-7439(87)80084-9. [50] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley Longman Publishing Co., Inc., 1989 [51] J.H. Holland, Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, MIT press, 1992 [52] J.R. Koza, J.R. Koza, Genetic programming: on the programming of computers by means of natural selection, MIT press, 1992.https://doi.org/10.1007/bf00175355. [53] V. Nourani, B. Pradhan, H. Ghaffari, S.S. Sharifi, Landslide susceptibility mapping at Zonouz Plain, Iran using genetic programming and comparison with frequency ratio, logistic regression, and artificial neural network models, Natural hazards, 71 (2014) 523-547,https://doi.org/10.1007/s11069-013-0932-3. [54] C. Ferreira, Gene expression programming: a new adaptive algorithm for solving problems, arXiv preprint cs/0102027, (2001) [55] A.H. Gandomi, A.H. Alavi, M. Gandomi, S. Kazemi, Formulation of shear strength of slender RC beams using gene expression programming, part II: With shear reinforcement, Measurement, 95 (2017) 367-376,https://doi.org/10.1016/j.measurement.2016.10.024. [56] D.J. Armaghani, E.T. Mohamad, M.S. Narayanasamy, N. Narita, S. Yagiz, Development of hybrid intelligent models for predicting TBM penetration rate in hard rock condition, Tunnelling and Underground Space Technology, 63 (2017) 29-43,https://doi.org/10.1016/j.tust.2016.12.009.

22

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

23

Highlights    

Fourier transform used for modeling the parameters relation with penetration rate Four novel models and their methodologies proposed for performance prediction of TBM Models' performance indices give satisfying results for all of the proposed models Proposed models were verified by using the models from literatures

24

Fig. 1 Location of Queens tunnel stages in New York City, the second stage between Brooklyn and Queens is studied in this paper with 7.5 km long [27]

1

Fig. 2 Variation of Queens’s rock parameters in the stations (sorted according to the rising trend in ROP measure)

2

Fig. 3 Flowchart of the ICA process [45]

3

Fig. 4 a) Exchanging the positions of a colony and the imperialist b) The entire emperorship after position exchange [45]

4

Fig. 5 Imperialistic competition. The more powerful an empire is, the more likely it will possess the weakest colony of the weakest empire [45]

5

Fig. 6 A simple CART including Root, interior (decision), and the terminal node [46].

6

Fig. 7 Schematic indication of a chromosome a) consists of a gene b) consists of two genes c) a GEP tree structure [54]

7

Fig. 8 The algorithm of GEP [54]

8

Fig. 9 Linear regression analysis between the ROP and 𝒍𝒐𝒈 (𝜶)

9

Fig. 10 The best regression tree for prediction of TBM penetration rate

10

11

Fig. 11 Expression Tree of each gene of the best model for penetration rate prediction

12