Identifying groundwater contaminant sources based on a KELM surrogate model together with four heuristic optimization algorithms

Advances in Water Resources 138 (2020) 103540 Contents lists available at ScienceDirect Advances in Water Resources journal homepage: www.elsevier.c...

Download PDF

5MB Sizes 0 Downloads 16 Views

Report

Full Text

Advances in Water Resources 138 (2020) 103540

Contents lists available at ScienceDirect

Advances in Water Resources journal homepage: www.elsevier.com/locate/advwatres

Identifying groundwater contaminant sources based on a KELM surrogate model together with four heuristic optimization algorithms Ying Zhao a,b, Ruizhuo Qu c, Zhenxiang Xing a,b,∗, Wenxi Lu d a

School of Water Conservancy and Civil Engineering, Northeast Agricultural University, Harbin 150030, China Key Laboratory of High Eﬃciency Utilization of Agricultural Water-Soil Resources of Ministry of Agriculture of China, Harbin 150030, China c School of Environment, Harbin Institute of Technology, Harbin, 150090, China d College of New Energy and Environment, Jilin University, Changchun 130021, China b

a r t i c l e

i n f o

Keywords: Groundwater contaminant Heuristic search algorithms Identifying contaminant sources Kernel-based extreme learning machine

a b s t r a c t Identifying groundwater contaminant sources involves a reverse determination of the source characteristics by monitoring contaminant concentrations in a few observation wells. However, due to the ill-posed nature and high time consumption of identiﬁcation, an eﬃcient identiﬁcation process with accurate estimated results is particularly important. To improve the eﬃciency of identifying groundwater contaminant sources, a kernel-based extreme learning machine was used as a surrogate for the time-consuming simulation model. Four heuristic search algorithms were used to improve the accuracy of the identiﬁcation results. The proposed approach was tested in both hypothetical and actual cases. The conclusions are: 1. By forward and backward calculation of the surrogate model, the time cost of identifying groundwater sources can be reduced signiﬁcantly; 2. When a traditional genetic algorithm and a particle swarm optimization algorithm are combined with quantum computing, computational eﬃciency and accuracy are both improved; and 3. By using various search algorithms to identify unknown contaminant sources in the actual case, the range of release histories of each contaminant source can be obtained, decreasing the ill-posed nature of the identiﬁcation result obtained by a single algorithm and improving the reliability of the identiﬁcation results.

1. Introduction Unlike surface water contaminant, groundwater contamination is concealed underground and is often not quickly discovered, meaning that information on contaminant sources is diﬃcult to obtain. To provide a reliable basis for treatment and restoration of groundwater contaminant sources, identiﬁcation of the groundwater contaminant source release history has become a key point. However, current research on identifying groundwater contaminant sources suﬀers from three main problems. First, most existing research studied only hypothetical cases (Ayvaz, 2010; Liu et al., 2010; Wang and Jin, 2013; Gurarslan and Karahan, 2015). Although hypothetical cases have many advantages, they also have limitations. For example, in hypothetical cases, it can be assumed that the aquifer parameters are known and that the groundwater model is completely correct. Hence, only information about the groundwater source is unknown. For real-world cases, not only is information about groundwater contaminant sources unknown, but also the aquifer parameters. Aquifer parameters can be obtained through experiments and

∗

calculations, but there are also uncertainties in the constructed aquifer. In addition, a large amount of uncertainty exists in groundwater models, including the uncertainty of the observed data and the model structure (Singh et al., 2010). The uncertainty of real-world cases seriously affects the identiﬁcation of groundwater contaminant sources. However, unless the validity of the identiﬁcation method is veriﬁed by a hypothetical case, there are also instability problems when the model is applied to actual cases. Therefore, established identiﬁcation models should be applied to both hypothetical and actual cases so that the validity and practicability of the identiﬁcation model can both be veriﬁed. Second, there are eﬃciency and accuracy issues in identifying the release histories of contaminant sources. Rapid and stable identiﬁcation of these release histories is the key to this research. The main method used has become the identiﬁcation of groundwater contaminant sources by simulation and optimization. However, if the simulation model is repeatedly used for iterative solution of the optimization model, it will result in a huge computational load, which seriously restricts the feasibility of the simulation-optimization method in practical inversion identiﬁcation. Therefore, constructing surrogates for simulation models has

Corresponding author. E-mail address: [email protected] (Z. Xing).

https://doi.org/10.1016/j.advwatres.2020.103540 Received 27 November 2019; Received in revised form 2 February 2020; Accepted 17 February 2020 Available online 19 February 2020 0309-1708/© 2020 Elsevier Ltd. All rights reserved.

Y. Zhao, R. Qu and Z. Xing et al.

become a leading issue in recent research (Forrester and Keane, 2009; Srivastava and Singh, 2014; Zhao et al., 2016). Qin et al. (2007) used polynomial regression to create surrogate models. The most widely used approach is the Kriging surrogate model (Simpson et al., 2001; Luo and Lu, 2014; Wan et al., 2005). Kriging is used because it can spatially model and predict ﬁelds and processes. At the same time, it can approximate and map complex nonlinear functions. However, this method easily falls into the local optimal solution, especially when the distance constraint is too small, which will aﬀect the robustness of the algorithm (Zhou et al., 2015). There is also a least-squares support vector machine that captures the complex relationship between inputs and outputs and eﬀectively extracts structural information from noisy data (Zhang et al., 2009). The disadvantage is that it is diﬃcult to determine a reasonable parameter range, which aﬀects the speed and accuracy of fault diagnoses to a certain extent (Zhang et al., 2008). In addition, surrogate models include support vector machines (Jin et al., 2001), multivariate adaptive regression splines (Hou and Lu, 2018), ensemble surrogate model (Xing et al., 2019) and others. In this study, a kernel-based extreme learning machine (KELM) was used as a surrogate for the simulation model when running the optimization model. The KELM model learns quickly (Huang et al., 2012), but during its operation, multiple types of problems must be converted into sub-problems of the two classiﬁcations. Due to the introduction of the kernel function, the kernel limit learning machine can be simpliﬁed to the execution of a least-squares support vector machine. For a traditional feed forward neural network, artiﬁcial adjustment parameters are needed, resulting in slow operation and a tendency to fall into local optima, whereas the KELM model has strong generalization performance (Jain et al.,1996; Li et al., 2016). Use of surrogate models to save time in identifying the release histories of contaminant sources should also consider the accuracy of the identiﬁcation results (Razavi et al., 2012). The accuracy of the surrogate model in approximating the simulation model depends on the coverage of the sample space by the training sample, but the number of training samples used to establish the surrogate model is limited, and there is no guarantee that the training sample set can cover the sample space well. Therefore, it is diﬃcult to establish a suﬃciently reliable surrogate model using only the training samples obtained by one-time sampling. Relevant research currently available is based on one-oﬀ sampling (Rao, 2006; Mirghani et al., 2012; Srivastava and Singh, 2015) and has not given suﬃcient attention to this issue. Latin hypercube sampling (LHS) is a multidimensional stratiﬁed sampling method proposed by McKay et al., 2000 for computer simulation experiments . The LHS method is a mathematical method based on stochastic mathematical theory. Compared with the traditional random sampling method, LHS introduces the concept of “layering”, which randomly samples the sample values throughout the sample space. At the same time, it can also ensure that data points do not agglomerate, or in other words, it ensures good coverage of the entire sampling space. In addition, the surrogate model is used to determine the optimization model identiﬁcation space from both forward and backward directions, and the training sample of the surrogate model is improved based on the groundwater contaminant source inversion identiﬁcation results as feedback information. This method not only enables the training samples to cover the neighborhood of the optimal solution eﬀectively, but also improves identiﬁcation eﬃciency. The optimization model coupled with the surrogate model is generally a heuristic optimization algorithm. Singh and Datta (2006) used a genetic algorithm to simultaneously identify the locations and release histories of groundwater contaminant sources and compared the identiﬁcation results with observations with errors. Ayvaz (2010) used the harmonic search algorithm to solve an optimization model for groundwater contaminant source identiﬁcation and compared the observations with diﬀerent degrees of error on the identiﬁcation results. Zhao et al. (2016) used the simulation-optimization method based on the Kriging surrogate model to invert the identiﬁcation of contaminant sources in a two-dimensional heterogeneous aquifer. This study combined tradi-

Advances in Water Resources 138 (2020) 103540

tional optimization algorithms with quantum computation to improve computing power. The concept of quantum computation was ﬁrst proposed by Benioﬀ (1980) and has the characteristics of exponential storage capacity, parallelism, and exponential acceleration. Its theory of entanglement, superposition, and interference of quantum states may also contribute to our understanding of intelligence. This may be a solution to calculation problems. Narayanan and Moore (1996) ﬁrst proposed a good combination of evolutionary algorithms with quantum computation and successfully solved the traveling salesman problem perfectly. Mirghani et al. (2009) organically integrated quantum computation and genetic algorithms, creating a solution that had both the traditional advantages of genetic algorithms and the excellent characteristics of quantum computation, including the advantage of seeking the best solution, faster calculation rate, and smaller overall calculation scale. Mikki and Kishk, (2006) proposed a quantum particle swarm algorithm and used it to analyze and calculate the antenna layout problem. Comparing the two algorithms reveals that the optimization results and convergence speed were better than those of the classical particle swarm optimization algorithm (Han et al., 2001; Sun et al., 2004). Therefore, using a genetic algorithm (GA) and a particle swarm optimization algorithm (PSO), two traditional heuristic algorithms were combined with quantum computing, and the stability of the identiﬁcation of groundwater contaminant source release history in the two cases was compared with the four optimization methods with regard to the time cost and accuracy of the identiﬁcation results. Third, identifying groundwater contaminant source release histories is a typical inverse problem, and in some cases, the identiﬁcation result is ill-posed. Neuman (1973) ﬁrst introduced inverse problem-solving theory into the ﬁeld of hydrology and water resources and divided inversion methods into direct and indirect methods. For the inverse problem, if the solution satisﬁes the three conditions of existence, uniqueness, and stability, the problem is said to be well-posed. For an hypothetical model, the solution to the inverse problem is still very complex. In practical applications, it is diﬃcult to obtain the distribution of contaminant at any point in time at each node, and the observed data generally contain errors. Therefore, ill-posed problems arise and have always been diﬃcult to solve. Therefore, to avoid ill-posed problems, the four optimization models are coupled with the KELM surrogate model to identify groundwater contaminant sources and release histories and obtain four sets of identiﬁcation results. The results of the four groups are not completely similar, and there are great diﬀerences in identifying the individual contaminant sources. Therefore, for the unknown contaminant source release histories, it is not reliable to use only one model to obtain results. Based on the four sets of results, a historical value range of source release histories is provided instead of a single identiﬁcation result. To address the challenges of ill-posed nature and high time consumption, both the surrogate model and the optimization algorithm have been improved and combined in this study to improve the accuracy and eﬃciency of solving the inverse problem. To improve the eﬃciency of identifying groundwater contaminant sources, the KELM is used as a surrogate for the time-consuming simulation model. Forward and backward surrogate models are used for real-world cases to decrease the number of surrogate samples. Furthermore, four heuristic search algorithms are used to improve the accuracy of the identiﬁcation results. The above approach has been tested in both hypothetical and actual cases (Fig. 1). 2. Methodology 2.1. Simulation model The simulation–optimization method contains a simulation model that makes up the principal section as an equality constraint in an optimization formulation (Datta et al., 2011). The governing partial diﬀerential equation for steady-state ﬂow in a two-dimensional aquifer system

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 1. Flow diagram illustrating the study process.

can be stated as follows: ( ) 𝜕 𝜕ℎ 𝑇𝑖𝑗 = 𝑄 𝑖, 𝑗 = 1, 2 𝜕 𝑥𝑖 𝜕 𝑥𝑗

(1)

𝜕 where 𝜕𝑥 is the gradient operator vector, Tij is the transmissivity tensor, and Q is the volumetric ﬂux per unit area. The partial diﬀerential equation that describes the transport of a contaminant in a two-dimensional aquifer system can be written as follows (Pinder and Bredehoeft, 1968): ( ) ) 𝑐𝑄 𝜕 (𝜃𝑐 ) 𝜕 𝜕𝑐 𝜕 ( = 𝜃𝐷𝑖𝑗 − (2) 𝜃𝑐 𝑣𝑖 + 𝑠 𝑖, 𝑗 = 1, 2 𝜕𝑡 𝜕 𝑥𝑖 𝜕 𝑥𝑖 𝜕 𝑥𝑖 𝑏

where b is the thickness of the aquifer, C the dissolved solids concentration, Dij the hydrodynamic dispersion tensor, vi the average linear seepage velocity, Cs the dissolved solids concentration in a source or sink ﬂux, and 𝜃 the eﬀective porosity. Note that Eq. (2) is related to Eq. (1) through Darcy’s Law as follows: 𝑣𝑖 = −

𝐾𝑖𝑗 𝜕ℎ 𝑖, 𝑗 = 1, 2 𝜃 𝜕 𝑥𝑗

(3)

where Kij is the hydraulic conductivity and h is the hydraulic head. 2.2. Surrogate model The surrogate model replaces the simulation model coupling with the optimization model to identify groundwater contaminant. The input data to the simulation model and the surrogate model were the same and were obtained by the Latin hypercube sampling method, that is, the inputs were the source concentration released history. The outputs were also same, that is, the observation value of the observation wells. In the actual case, to improve identiﬁcation eﬃciency for appropriate samples, the surrogate model was used from both forward and backward directions. The inputs of the forward surrogate model were the samples of contaminant source release histories, and the outputs were the observed well values obtained through the simulation model. In contrast, in the input and output data of the backward surrogate model, the contaminant source release histories were derived from the known actual well observations. In addition, the selected surrogate model was KELM, which has fast and stable convergence. The extreme learning machine (ELM) is a new machine learning algorithm proposed by Huang et al. (2006). The output weights of the learning network can be solved for analytically through one-step calculation. The generalization ability and learning speed are greatly improved over the traditional neural network (Shi et al., 2014; Wong et al.,

2013). However, the ELM regression model has some problems, such as unsatisfactory generalization ability and instability. To solve these problems, Huang et al. (2012) proposed the KELM and introduced a kernel function to replace the hidden-layer feature mapping process that contained the activation function in the original ELM. This version can produce stable output, and its classiﬁcation and ﬁtting abilities are better than the non-core ELM method (Chen et al., 2014). The main calculation steps are as follows: For N training samples (xk ,pk ), k = 1, . . . , N, the output of ELM can be expressed as: 𝐿 ∑ 𝑧=1

( ) ( ) 𝛽𝑧 𝑔 𝜔 𝑧 ⋅ 𝑥 𝑘 + 𝑚 𝑧 = 𝑓 𝑥 𝑘 = 𝑦 𝑘

(4)

where g(𝜔z · xk + Mz ) is the output function of the z-th hidden neuron; g(·) is the excitation function; 𝜔z is the weight vector connecting the input node and the z-th hidden neuron; mz is the threshold of the z-th hidden neuron; 𝛽 z is the weight vector connecting the i th hidden neuron to the output neuron; yk is the output of the limit learning machine, and L is the number of hidden-layer neurons (Wang and Han, 2014). Eq. (4) can be expressed as follows: 𝐻𝛽 = 𝑌

(5)

where 𝛽 = [𝛽 1 . . . 𝛽 L ]T , y = [y1 . . . yN ]T , and H is the output matrix of the ELM hidden layer: ( ) ( ) 𝑔 𝜔𝐿 ⋅ 𝑥1 + 𝑚𝐿 ⎤ ⋯ ⎡ 𝑔 𝜔1 ⋅ 𝑥1 + 𝑚1 ⎥ 𝐻 =⎢ . (6) ⋱ ⋮ ⋮ ) ( )⎥ ⎢ ( ⋯ 𝑔 𝜔𝐿 ⋅ 𝑥𝑁 + 𝑚𝐿 ⎦𝑁×𝐿 ⎣𝑔 𝜔1 ⋅ 𝑥𝑁 + 𝑚1 If the ultimate learning machine model with L hidden nodes can learn N training samples unbiased, then the following equation exists (Wang and Han, 2014): 𝐿 ∑ 𝑧=1

( ) ( ) 𝛽𝑧 𝑔 𝜔 𝑧 ⋅ 𝑥 𝑘 + 𝑀 𝑧 = 𝑓 𝑥 𝑘 = 𝑡 𝑘 , 𝑘 = 1 , … , 𝑁

(7)

where tp is the target value. Eq. (7) can be expressed as follows: 𝐻𝛽 = 𝑃

(8)

Using the least-squares methods to solve Eq. (8): 𝛽 = 𝐻 +𝑃

(9)

H+

where is the Moore–Penrose generalized inverse matrix of H. Then the KELM original optimization problem can be expressed as: 𝑚𝑖𝑛

{

} ( )𝑇 1 ‖ 2 ‖ 𝑊 ∑𝑁 2 𝛿𝑘 𝑠.𝑡. ℎ 𝑥𝑘 ⋅ 𝛽 = 𝑝𝑘 − 𝛿𝑘 ‖𝛽 ‖ + 𝑘 =1 2‖ ‖ 2

(10)

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

where 𝛽 is a vector in the eigenspace F; W is the regularized parameter; h(xk ) is the mapping of the input variable x in the F space, and 𝛿 k is the error. According to the Karush–Kuhn–Tucker (KKT) theory, ELM training is equivalent to solving the following dual optimization problem: ( ( ) ) 𝑇 1 𝑊 ∑𝑁 2 ∑𝑁 𝐿𝐷 = ‖𝛽‖2 + 𝛿𝑘 − 𝜀 𝑘 ℎ 𝑥 𝑘 𝛽 − 𝑝 𝑘 + 𝛿𝑘 (11) 𝑘=1 𝑘=1 2 2 where 𝜀Z represents a Lagrangian operator. According to the KKT optimization condition, Eq. (11) can be obtained as follows: 𝑁 ( ) ∑ ⎧ 𝜕 𝐿𝐷 𝜀𝑘 ℎ 𝑥𝑘 ⎪ 𝜕𝛽 = 0 → 𝛽 = 𝑘 =1 ⎪ ⎨ 𝜕 𝐿 𝐷 = 0 → 𝜃 𝑍 = 𝑊 𝛿𝑘 ⎪ 𝜕 𝛿𝑘 ⎪ 𝜕 𝐿𝐷 = 0 → ℎ(𝑥 )𝑇 𝛽 − 𝑡 + 𝛿 = 0 ⎩ 𝜕 𝛿𝑘 𝑘 𝑘 𝑘

(12)

𝐾𝐸𝐿𝑀 = 𝐻 𝐻 𝑇

(13)

( ) ( ) ( ) 𝐾𝐸𝐿𝑀 (𝑧,𝑝) = ℎ 𝑥𝑘 ⋅ ℎ 𝑥𝑧 = 𝐾 𝑥𝑘 , 𝑥𝑧 .

(14)

10

10

2

5

5

2.3.1.3. Entanglement of quantum states. Entanglement of quantum states occurs in quantum systems that contain multiple subsystems. If certain states exist in the two subsystems where they occur, if they cannot be represented as the tensor product of the two subsystems, then this state can be called an entangled state (Nielsen and Chuang, 2007). The entanglement of quanta is an important feature distinguishing quantum mechanics from classical physics. Almost all quantum information processing is related to quantum entanglement, and therefore quantum entanglement properties occupy an important position in quantum information science. 3. Parallelism in quantum computation

(15)

where P = [p1 , . . . , pN ]T . 2.3. Optimization model Genetic Algorithm, Particle Swarm Optimization were chosen, and combines with quantum computation. The Quantum Genetic Algorithm uses the quantum chromosome coding method, and uses the eﬀect of quantum gates to update the parent individuals, thereby generating new oﬀspring individuals. Compared with traditional genetic algorithms, QGA can ﬁnd the optimal solution to a problem in a shorter time with a smaller population size. Quantum particle swarm optimization uses information sharing strategy, which has better aggregation degree and higher search accuracy. So QPSO has a simple structure, good versatility, and fast convergence. The design variables and objective functions of the four optimization models are the same, namely observation values of observation wells and contaminant sources release history. The constraints of diﬀerent models are slightly diﬀerent, but the minimum value must be greater than zero. 2.3.1. Quantum computation The essence of quantum computation is to reduce the complexity of large-scale problems by using the properties of quantum theory (the superposition, coherence, entanglement, and quantum computational parallelism of quantum computation). 2.3.1.1. Superposition of quantum states. In computer languages, information is described as binary, either 0 or 1. However, the existence of qubits is unique, and they can exist in an intermediate state other than 0 and 1. This state is derived from the superposition of states of 0 and 1. This property is a quantum state in quantum information and a classic diﬀerence in physical state used in classical information. This property can set the speciﬁc state n to a common register. According to the quantum mechanics correlation principle, it is derived from the speciﬁc state of the 2n ground states superposed onto each other: the coherent superposition state |𝜙⟩. Among them, the relationship between the ground state and the superposition state can be described as: ⟩ ∑ |𝜑 ⟩ = 𝛾𝑒 ||𝜑𝑒 . (16) 𝑒

10

interact with it, the eﬀect between the two can be expressed as: |𝜑⟩ = 2 1 √ |0⟩ + √ |1⟩. From Eq. (16), when the probability of the ground state |0⟩ decreases, |1⟩ increases instead.

The kernel matrix of ELM is:

KELM’s output function can be expressed as: ( ) ⎡𝐾 𝑥, 𝑥1 ⎤( ( )−1 ) 1 1 −1 ⎥ 𝐾 𝑓 (𝑥) = ℎ(𝑥)𝐻 𝑇 𝐻 𝐻 𝑇 + 𝑃 =⎢ + 𝑃 ⋮ 𝐸𝐿𝑀 )⎥ ⎢ ( 𝑉 𝑊 ⎣𝐾 𝑥, 𝑥𝑘 ⎦

2.3.1.2. Coherence of quantum states. The speciﬁc concept of quantum state coherence is the following: if the ground state of a quantum system is in a superposition state, then the quantum system is considered to be coherent. In quantum computing, it is important to make the different ground states constituting the superposition state interfere with the interaction of quantum gates, so that the phase can be changed. Suppose that a superposition of a quantum can be expressed as |𝜑⟩ = 3 1 1 1 3 1 1 √ |0⟩ + √ |1⟩ = √ [ ]. Using the quantum gate 𝑈 = √ [ ] to 1 1 −1

The parallel capability of quantum computation is actually an extension of the superposition of quantum states. Under certain ground conditions, the quantum states are used to evolve quantum states in quantum registers, and the vector matrix A in Hilbert space is used to represent quantum gates. Quantum gates are constrained, and therefore the eﬀect on the quantum state in Hilbert space is manifested as wholeness, that is, the representation of all the base points, and the above characteristics are what we call quantum parallelism. This property makes it possible to eﬀectively improve calculation speed, making it possible to complete work that is diﬃcult for classical computers. 3.1. Criteria for surrogate evaluation Some common statistical indicators, speciﬁcally relative error, and coeﬃcient of determination, were used to test model performance. (1) The relative error (RE) for a particular parameter is deﬁned as follows and is expressed as a percentage: 𝑅𝐸 =

𝑍 (𝑡) − 𝑍̂ (𝑡) × 100% 𝑍 (𝑡 )

(17)

where t is the number of data points, Z(t) is an observed value, and 𝑍̂ (𝑡) is a predicted value. (2) The coeﬃcient of determination (R2 ) reﬂects the agreement between the observed and predicted values and ranges from 0 to 1. R2 can be expressed as follows: )2 ∑𝑛 ( 𝑍 (𝑡) − 𝑍̂ (𝑡) 𝑅2 = 1 − ∑𝑡=1 ( (18) )2 𝑛 ̄ 𝑡=1 𝑍 (𝑡) − 𝑍 (𝑡) where 𝑍̄ (𝑡) is the mean of the observations. 4. Basic introduction to the aquifer Fig. 2 shows the plan view for a hypothetical case. The aquifer has speciﬁed head boundary conditions on the upper left (AB) and lower right (CD) sides, and there is no ﬂow on the other sides. The head values on both sides AB and CD are 100.0 m and 80.0 m, respectively. The aquifer is divided into ﬁve hydraulic conductivity zones, and the hydraulic conductivity values in each zone are constant. The hydraulic conductivities are K1 = 34.56 m/d, K2 = 17.28 m/d, K3 = 8.64 m/d,

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Table 1 Basic parameters of the aquifer. Parameter

Value

Effective porosity, 𝜃 Longitudinal dispersivity, 𝛼 L (m) Transverse dispersivity, 𝛼 T (m) Saturated thickness, b(m) Grid spacing in x-direction, Δx(m) Grid spacing in y-direction, Δy(m) Volume ﬂux per unit area, Q(m/d) Initial concentration (g/L)

0.25 40 9.6 30.5 100 100 8.64 × 10−5 0

Table 2 Actual values of the source ﬂuxes (q) in the ﬁrst four stress periods for case one. Contaminant source

S1 S2

Source ﬂux at each stress period (SP) (g/d × 105 ) SP1

SP2

SP3

SP4

54.44 58.14

7.5 74.56

29.65 72.38

33.64 21.12

Fig. 2. Hypothetical case.

K4 = 25.92 m/d, and K5 = 60.48 m/d. Table 1 shows the hydrogeological parameters of the aquifer. There are two contaminant sources and seven sampling locations, and one pumping well is present at the center of this aquifer. Table 2 lists the actual values for the source ﬂuxes. The actual case is the case studied by Gzyl et al. (2014); Fig. 3 shows its plan. Its location is in the city of Jaworzno in the Wąwolnica valley in southern Poland. There are ﬁve main areal sources of contaminant: 1 CSO Main, 4 Internal, 5 CSO North, and LS (divided into LSN 6a and LSS 6b). The exact value of the source contaminant concentration is unknown. However, it is known that the contaminant released by the contaminant source are constant during the simulation. Fifteen observation wells were arranged along the direction perpendicular to the water ﬂow. The simulation period was 120 months. Detailed information on the hydrological and geological parameters of the study area can be found in that paper. The GMS model was used to simulate contaminant migration. Fig. 4 shows the distribution of the contaminant plumes. For the actual case, the actual observed value at the observation well and the values simulated by the four identiﬁcation models are shown in Fig. 5.

5. Results and discussion 5.1. The surrogate model 5.1.1. hypothetical case The surrogate model was trained diﬀerently in the hypothetical and actual cases. In the hypothetical case, 500 sets of input values were sampled. The input data were entered into the simulation model to obtain the corresponding output data, which were the observed concentrations at the observation well. 450 groups were used as training data, and 50 groups were used as test data. R2 was used to characterize the training results. The results showed that R2 was as high as 0.9990 in the hypothetical case; Fig. 6 shows the training results for seven observation wells. The simulation accuracy for each observation well was greater than 0.99, indicating that the KELM surrogate model had higher simulation accuracy for extreme values. Moreover, the KELM surrogate model was stable and had high simulation accuracy for observation wells at diﬀerent positions. Fig. 3. Actual case.

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 4. (a) and (b) are the distribution of contaminant plume in the ﬁnal period of the hypothetical case and the actual case, respectively.

Fig. 5. Actual observed values and simulation values corresponding to the optimization results of four optimization models.

5.1.2. Actual case In the actual case, 50 sets of input data were sampled ﬁrst. Forty data sets were used to train the simulation model, and the other ten data sets were used to test the accuracy of the surrogate model. The results show that in the actual case, R2 was as high as 0.9948. Fig. 7 shows the training results for the 15 observation wells. The accuracies at wells 1, 2, 3, and 10 were relatively low, with R2 greater than 0.82, but the prediction accuracy was still acceptable. The reason for this is that the values required for the surrogate model simulation were too small compared with the observed values at the other observation wells. The simulated values of R2 were all greater than 0.92 at the other observation wells. Nine of the ﬁfteen observation wells had R2 greater than 0.99, indicating that the KELM surrogate model also had high accuracy in actual cases. After training with 50 data sets, the KELM surrogate model could be coupled with the optimization model. The calculation search range was provided before optimization, which improved calculation eﬃciency. The reverse KELM surrogate model could quickly determine the search range. The input of the forward surrogate model is the contaminant source release history, and the output is the observation values. However, the input of the backward surrogate model is the observation values, and the output value is the contaminant source release history to

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 6. Plots indicating the ﬁtness of the hypothetical case: (a)–(g) are from the ﬁrst observation well to the 7th observation well respectively.

be sought. Fig. 8 shows the predicted results. An R2 of 0.9605 was obtained, indicating that the range of values obtained using the reverse KELM surrogate model could be applied to the optimization model. The trained surrogate model was coupled with the four optimization algorithms, and the identiﬁcation results were obtained. If the error between the simulated observed contaminant concentrations at the observation wells corresponding to the identiﬁcation results and the actual observed contaminant concentrations was within a reasonable range, the result was considered as the ﬁnal contaminant concentration of the source. Otherwise, training data were added within a certain numerical range according to the identiﬁcation result, and the above steps were repeated until the error was reduced to an acceptable range. In this study, the accuracy of contaminant identiﬁcation results based on 50 sets of training data was low. Therefore, to improve identiﬁcation accuracy, the num-

ber of initial input and output data sets was increased by 50 groups. The value range of the additional 50 sets of training data is appropriately adjusted according to the previous identiﬁcation results. The resulting 100 sets of input and output data for each contaminant source conformed to the normal distribution (Fig. 9), which shows that 100 data sets were suﬃcient. According to the identiﬁcation results of the four models, the next sampling value range was determined, as shown in Table 3. The Table 3 The second sampling interval of ﬁve contaminant sources.

Minimum Maximum

Internal

CSO North

CSO Main

LSN

LSS

0.18 0.77

4.02 16.88

0.65 3.54

0.00 0.05

1.11 5.66

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 6. Continued

upper limit of each interval is the maximum of the four optimization model identiﬁcation results. The lower limit of the interval is to choose the minimum value.

stable after 500 iterations and that the optimization result was the ﬁnal identiﬁcation result. The accuracy of the identiﬁcation was reﬂected by the R2 and RE values, which are shown in Table 4. Comparing the results of the four optimization algorithms shows that the quantum genetic al-

5.2. Optimization results 5.2.1. Hypothetical case In the hypothetical case, the surrogate model was trained and coupled with the optimization model to identify the contaminant concentrations of the groundwater contaminant sources. For the four optimization models, the initial population was 50, and the number of iterations was 500. After 500 iterations, the iterative process of each optimization model and the ﬁnal identiﬁcation result were recorded, as shown in Fig. 10. It is apparent that the error of all the optimization models was

Table 4 Accuracy of identiﬁcation results in hypothetical case. Optimization model

R2

RE (%)

GA PSO QGA QPSO

0.9424 0.9619 0.9836 0.9717

9.77 6.93 5.68 5.65

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 7. Plots indicating the ﬁtness of the actual case: (a)–(o) are from the ﬁrst observation well to the 15th observation well respectively.

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 7. Continued

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 7. Continued

gorithm (QGA) and quantum particle swarm optimization (GPSO) algorithms achieved higher accuracy than the other two optimization algorithms and were very eﬀective in identifying groundwater contaminant sources. In comparison, the accuracy of the GA algorithm was lower, but still within a reasonable range. Taking an overall view, the R2 of the four optimization algorithms were all greater than 0.94, and RE was within 10%, indicating that the four identiﬁcation models had high accuracy for identiﬁcation and could be applied to the actual case. 5.2.2. Actual case In the actual case, after the optimization interval was determined, 50 data sets were used to identify the groundwater contaminant sources. Fig. 11 shows the results for the four identiﬁcation models. Diﬀerent identiﬁcation models had diﬀerent identiﬁcation results for the contami-

nant sources, especially for contaminant sources LSN and LSS. However, the results obtained by the four identiﬁcation models were very close to each other at ﬁve of the contaminant sources. Among the observed contaminant concentrations at the 15 observation wells, contaminant source CSO Main had the greatest impact and was therefore the easiest to identify. Contaminant sources Internal and CSO North had the second-greatest eﬀect. In contrast, sources LSN and LSS had the least impact on observed well values. The four groups of identiﬁcation results were input into the simulation model to obtain the observed well values and compared with the known observed values; the results are shown in Table 5. Clearly, the R2 of the observed values corresponding to the identiﬁcation results are all between 0.93 and 0.94, indicating that the overall trend of the observed values is consistent with the trend of the actual observed values.

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 8. Obtained the contaminant source concentration values according to the backward surrogate model.

Fig. 9. The normal curve of the input data for ﬁve contaminant sources: (a) CSO Main; (b) Internal; (c) CSO North; (d) LSN; (e) LSS.

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 10. Iterative convergence process and identiﬁcation results: (a) iterative convergence process for GA; (b) iterative convergence process for PSO; (c) iterative convergence process for QGA; (d) iterative convergence process for QPSO; (e) identiﬁcation results of four optimization models.

Table 5 Accuracy of identiﬁcation results in actual case at ﬁrst sampling. Optimization model

R2

RE (%)

GA PSO QGA QPSO

0.9331 0.9356 0.9387 0.9430

28.11 24.70 23.26 21.48

Table 6 Accuracy of identiﬁcation results in actual case at second sampling.

Fig. 11. Identiﬁcation results based on 50 sets of training data.

However, the RE of the simulated observed values is larger. For example, the RE of the GA algorithm is close to 30%. The RE values of the other simulated observations all exceed 20%, and there are large numerical errors for individual observation wells. Fig. 12 shows the concentration of contaminant sources identiﬁed by the 100 groups of training values. It is evident that diﬀerent identiﬁcation models still have diﬀerent results for the same contaminant source identiﬁcation. For example, at contaminant source LSS, the maximum value calculated by the GA was 2.4580 mg/L, but the QPSO algorithm calculated that no contaminant was discharged there. Compared with Table 5, the discharge contaminant concentration of each contaminant

Optimization model

R2

RE (%)

GA PSO QGA QPSO

0.9455 0.9469 0.9473 0.9482

18.58 19.45 18.55 19.08

source also changed. For example, at contaminant source Internal the concentration decreased from 0.5–0.3 mg/L to 0.3–0.2 mg/L. For the other three contaminant sources, the discharge concentration change was relatively small. By analyzing the accuracy of the simulated observation values corresponding to the identiﬁcation result, it is possible to judge whether the identiﬁcation result is acceptable. Table 6 shows the accuracy of the second set of identiﬁcation results corresponding to the simulated observation values. It is apparent that R2 changed less than 1%. The reason for this is that even when the number of training groups was increased, there was no signiﬁcant change in the overall prediction

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Fig. 14. The time consumption to obtain the ﬁnal identiﬁcation results. Fig. 12. Identiﬁcation results based on 100 sets of training data.

trend of the surrogate model. However, RE showed a relatively obvious change. In the case with fewer training data points, the RE of the GA algorithm was close to 30%. After the number of training data points was increased, RE decreased to less than 20%. The RE of the other three optimization algorithms all remained within 20%, indicating that the simulated observation values conformed to the overall trend of the actual observation values and that the accuracy of a single value was also within a reasonable range. To judge whether identiﬁcation results are reasonable and reliable, it is necessary to test not only the accuracy of the simulated observations, but also the convergence of the four identiﬁcation models. Fig. 13 shows convergence plots for the four models after 200 iterations; the convergence curves of all the optimization algorithms are stable. Among these, after 200 iterations of the PSO algorithm, although the iteration curve was stable, it did not last for a long time. However, the ﬁtted value was less than 0.4, indicating that the identiﬁcation result can be accepted. In identifying the release histories of groundwater contaminant sources, not only must the accuracy and reliability of the identiﬁcation results be considered, but also the time cost of the identiﬁcation process. Fig. 14 presents a direct comparison of the time taken by the four optimization models to perform 200 iterations after the same initial population numbers were set and based on the same 100 groups of training data. The time taken by the four identiﬁcation models was very short, but the QPSO model had the shortest running time. The calculations of

0.8

Mean fitness

Fitness value

Fitness value

2 1.5 1

Best fitness=0.35

0.5 0

50

100

Generation

150

200

0.7 0.6 Best fitness=0.45

0.5 0

50

Best fitness=0.38

50

100

150

Generation (c)

200

0.5

Fitness value

Fitness value

0.45

0

100

Generation

150

200

(b)

Mean fitness

0.4

Fig. 13. Iterative convergence curve of the four optimization models of the second time: (a) GA (b) PSO (c) QGA (d) QPSO.

Mean fitness

(a)

0.5

all models were done a PC with an Intel 3.4 GHz processor and 16 GB RAM. Compared with the performance of the four identiﬁcation models in the two cases, the identiﬁcation accuracy in the hypothetical case was better than that in the actual case. The main reason was that it was more diﬃcult to reconstruct the actual hydrologic and geological information in the actual case. However, a comprehensive analysis of the performance of the four identiﬁcation models in both cases showed that each model performed very well. In the case with less training data, KELM surrogate simulation accuracy could still be maintained at a high level. GA and PSO, which are two traditional optimization models, have good simulation foundation and high identiﬁcation accuracy. However, the combination of these two models with quantum computing not only improved their computational eﬃciency, but also their computational accuracy. The PSO algorithm was still unstable after 200 iterations, but the QPSO algorithm became stable after about 50 iterations, and the calculation time was shortened. Therefore, the combination of a traditional optimization model with quantum computing can improve the model. Nevertheless, this study still needs to integrate the identiﬁcation results of the four identiﬁcation models and to achieve identiﬁcation of the release histories of unknown contaminant sources. Otherwise, the results obtained by the algorithm will be invalid, the solution will not be unique, or the solution will be discontinuous under the boundary. To avoid this problem, the identiﬁcation results of the four models should be considered comprehensively, and the ﬁnal identiﬁcation re-

Mean fitness

0.45 0.4 Best fitness=0.35

0.35 0

50

100

150

Generation (d)

200 r

Y. Zhao, R. Qu and Z. Xing et al.

Advances in Water Resources 138 (2020) 103540

Table 7 The interval of the contaminant sources release history. Contaminant source

Minimum in Gzyl (mg/d)

Maximum in Gzyl (mg/d)

Identiﬁcation results Minimum (mg/d)

Identiﬁcation results Maximum (mg/d)

Internal CSO North CSO main LSN LSS

0 0 0 0 0

1.1 0.9 10.8 5.2 14

0.09 8.95 1.60 0.00 0.00

0.22 9.67 1.98 0.03 2.46

sults should not be a set of data, but an interval. Table 7 shows the ﬁnal results. In the research of Gzyl et al. (2014), the accurate release histories of the actual contaminant sources were the parameters to be solved for, and the intervals for these release histories were ﬁnally obtained, as shown in Table 7. By comparing the intervals of the contaminant source release histories, the value range obtained through synthesis of the four optimization models in this study could be made narrower. Among the ﬁve contaminant sources, the identiﬁcation result interval of contaminant source CSO North was signiﬁcantly diﬀerent from that of Gzyl et al. (2014). The main reason was that in both studies, the release histories of the contaminant sources were the parameter to be solved for. Moreover, there were diﬀerences among the various models for simulating groundwater contaminant transport, which may have had a great inﬂuence on the results. In addition, the numerical solutions for groundwater ﬂow and transport were also diﬀerent, which could have had a slight impact on the results. 6. Conclusions The contaminant identiﬁcation model used in this study was applied to two cases, one hypothetical and another actual. The kernel-based extreme learning machine-coupled optimization model was used to obtain the identiﬁcation model. The contaminant release history identiﬁcation model was used in the hypothetical case and the actual case, respectively. The results show that the four identiﬁcation models not only have high accuracy but also fast identiﬁcation speed. The shape, area size, and hydrological and geological parameters are set in the hypothetical case. In contrast, it was more diﬃcult to generalize the actual case than to set up the hypothetical case. Moreover, the release histories of the contaminant sources were unknown. By identifying the contaminant concentrations of the sources in the actual case and comparing them with the hypothetical case, it could be concluded that: (1) KELM has faster running speed and higher precision than the simulation model. Coupled with the four optimization models, it performed well in all cases. In the actual case, after using the forward surrogate model, the surrogate model is used to reversely determine the source’s contaminant concentration range, and the surrogate model is retrained. By using this method to identify the release histories of unknown contaminant sources, not only can suﬃciently training data be ensured, but also the concentration range of contaminant sources can be determined more quickly, and the calculation eﬃciency can be improved. (2) Through comparative analysis, diﬀerences were found in the performance of the four optimization models. With fewer input and output data points, the identiﬁcation results of the combination of quantum computing with GA or PSO were better than those of the uncombined models. Moreover, by comparing the time required for contaminant source identiﬁcation, the optimization algorithm combined with quantum computing was faster. The combination of quantum computing characteristics with traditional heuristic optimization algorithms not only can achieve a stable identiﬁcation process, but also has high identiﬁcation accuracy and fast identiﬁcation speed.

(3) The release histories of groundwater contaminant sources were identiﬁed by means of four optimization methods. More importantly, four diﬀerent identiﬁcation results were obtained by the four identiﬁcation models, which together provided a range of concentrations for the contaminant sources. This avoids the problem of ill-posed identiﬁcations from a single optimization model, improves the stability of the identiﬁcation result, and provides a reliable basis for contaminant control. Declaration of Competing Interest The authors declare that they have no known competing ﬁnancial interests or personal relationships that could have appeared to inﬂuence the work reported in this paper. CRediT authorship contribution statement Ying Zhao: Conceptualization, Writing - review & editing. Ruizhuo Qu: Writing - original draft, Methodology, Software. Zhenxiang Xing: Funding acquisition, Visualization. Wenxi Lu: Methodology, Software. Acknowledgments This research was supported by the National Natural Science Foundation of China (No. 41807196, No. 51979038, No. 41672232, No. 41972252), the National Key R&D Program of China (No: 2017YFC0406004), the Postdoctoral Science Foundation of China (No. 2018M641793), the Postdoctoral Science Foundation of Heilongjiang Province of China (No. LBH-Z19002), the Natural Science Foundation of Heilongjiang Province of China (No. E2015024), and the Academic Backbones Foundation of Northeast Agricultural University (No. 16XG11). Supplementary materials Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.advwatres.2020.103540. References Ayvaz, M.T., 2010. A linked simulation-optimization model for solving the unknown groundwater pollution source identiﬁcation problems [J]. J. Contam. Hydrol. 117 (1–4), 46–59. Benioﬀ, P., 1980. The computer as a physical system: a microscopic quantum mechanical hamiltonian model of computers as represented by Turing machines [J]. J. Stat. Phys. 22 (5), 563–591. Chen, C., Wei, L., Hongjun, S., Kui, L, 2014. Spectral-Spatial classiﬁcation of hyperspectral image based on Kernel extreme learning machine [J]. Remote Sens. 6 (6), 5795–5814. Datta, B., Chakrabarty, D., Dhar, A, 2011. Identiﬁcation of unknown groundwater pollution sources using classical optimization with linked simulation [J]. J. Hydro-Environ. Res. 5 (1), 25–36. Forrester, A.I.J., Keane, A.J., 2009. Recent advances in surrogate-based optimization [J]. Prog. Aerosp. Sci. 45 (1), 50–79. Gurarslan, G., Karahan, H., 2015. Solving inverse problems of groundwater pollution source identiﬁcation using a diﬀerential evolution algorithm [J]. Hydrogeol J. 23 (6), 1109–1119. Gzyl, G., Zanini, A., Fraczek, R., Kura, K., 2014. Contaminant source and release history identiﬁcation in groundwater: A multi-step approach. J. Contam. Hydrol. 157 (2014), 59–72.

Y. Zhao, R. Qu and Z. Xing et al. Han, K.H., Park, K.H., Lee, C.H, 2001. Parallel quantum-inspired genetic algorithm for combinatorial optimization problem: proceeding of the 2001. IEEE Congr. on Evolut. Comput.[C] 1429–1442. Hou, Z.Y., Lu, W.X., 2018. Comparative study of surrogate models for groundwater contaminationt source identiﬁcation at DNAPL-contaminated sites [J]. Hydrogeol. J. 26 (3), 923–932. Huang, G.B., Zhou, H., Ding, X., Zhang, R, 2012. Extreme learning machine for regression and multiclass classiﬁcation [J]. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 42 (2), 513–529. Huang, G.B., Zhu, Q.Y., Siew, C.K, 2006. Extreme learning machine: theory and applications [J]. Neurocomputing 70 (1–3), 489–501. Jain, A.K., Mao, J., Mohiuddin, K.M, 1996. Artiﬁcial neural networks: a tutorial [J]. Computer 29 (3), 31–44. Jin, R., Chen, W., Simpson, T.W, 2001. Comparative studies of metamodelling techniques under multiple modelling criteria [J]. Struct. Opt. 23 (1), 1–13. Li, X., Mao, W., Wei, J, 2016. Extreme learning machine based transfer learning for data classiﬁcation [J]. Neurocomputing 174, 203–210. Liu, X., Cardiﬀ, M.A., Kitanidis, P.K, 2010. Parameter estimation in nonlinear environmental problems [J]. Stochastic Environ. Res. Risk Assess. 24 (7), 1003–1022. Luo, J.N., Lu, W.X., 2014. A mixed-integer non-linear programming with surrogate model for optimal remediation design of NAPLs contaminated aquifer [J]. Int. J. Environ. Pollut. 54 (1), 1–16. McKay, M., Beckman, Conover, W, Mckay, M.D., Beckman, R.J., Conover, W.J, 2000. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code [J]. Technimetrics 42 (1), 55–61. Mikki, S.M., Kishk, A.A., 2006. Quantum particle swarm optimization for electromagnetic[J]. Antennas Propag. IEEE Trans. 54 (10), 2764–2775. Mirghani, B.Y., Mahinthakumar, K.G., Tryby, M.E., Ranjithan, R.S., Zechman, E.M, 2009. A parallel evolutionary strategy-based simulation-optimization approach for solving groundwater source identiﬁcation problems [J]. Adv. Water Res. 32 (9), 1373–1385. Mirghani, B.Y., Zechman, E.M., Ranjithan, R, 2012. Enhanced simulation-optimization approach using surrogate modeling for solving inverse problems [J]. Environ. Forensics 13 (4), 348–363. Narayanan, A., Moore, M, 1996. Quantum-inspired genetic algorithms[C]. In: IEEE International Conference on Evolutionary Computation. IEEE, pp. 61–66. Neuman, S.P., 1973. Calibration of distributed parameter groundwater ﬂow models viewed as a multiple-objective decision process under uncertainty [J]. Water Resour. Res. 9 (4), 1006–1021. Nielsen, M.A., Chuang, I.L, 2007. Quantum computation and quantum information [J]. Math. Struct. Comput. Sci. 17 (6), 1115. Pinder, G.F., Bredehoeft, J.D., 1968. Application of the digital computer for aquifer evaluation [J]. Water Resour. Res. 4 (5), 1069–1093. Qin, X.S., Huang, G.H., Chakma, A., Chen, B., Zeng, G.M, 2007. Simulation-based process optimization for surfactant-enhanced aquifer remediation at heterogeneous DNAPL– contaminated sites [J]. Sci. Total Environ. 381 (1–3), 17–37. Rao, S.V.N., 2006. A computationally eﬃcient technique for source identiﬁcation problems in three-dimensional aquifer systems using neural networks and simulated annealing [J]. Environ. Forensics 7 (3), 233–240.

Advances in Water Resources 138 (2020) 103540 Razavi, S., Tolson, B.A., Burn, D.H, 2012. Review of surrogate modeling in water resources [J]. Water Resour. Res. 48 (7), W07401. Shi, Y., Zhao, L., Tang, Jian, 2014. Recognition model based feature extraction and kernel extreme learning machine for high dimensional data [J]. Adv. Mater. Res. 875, 2020–2024. Simpson, T.W., Mauery, T.M., Korte J J Mistree, F, 2001. Kriging models for global approximation in simulation-based multidisciplinary design optimization [J]. AIAA J. 39 (12), 2233–2241. Singh, A., Mishra, S., Ruskauﬀ, G, 2010. Model averaging techniques for quantifying conceptual model uncertainty [J]. Groundwater 48 (5), 701–715. Singh, R.M., Datta, B., 2006. Identiﬁcation of groundwater contaminant sources using GA-based linked simulation optimization model [J]. J. Hydrol. Eng. 11 (2), 101–109. Srivastava, D., Singh, R.M., 2014. Breakthrough curves characterization and identiﬁcation of an unknown pollution source in groundwater system using an artiﬁcial neural network (ANN) [J]. Environ. Forensics 15 (2), 175–189. Srivastava, D., Singh, R.M., 2015. Groundwater system modeling for simultaneous identiﬁcation of pollution sources and parameters with uncertainty characterization [J]. Water Resour. Manage. 29 (13), 4607–4627. Sun, J., Feng, B., Xu, W.B., 2004. Particle swarm optimization with particles having quantum behavior [C]. In: Proceedings of the 2004 Congress on Evolutionary Computation. IEEE Press, Portland, pp. 325–331. Wan, X., Pekny, J.F., Reklaitis, G.V, 2005. Simulation-based optimization with surrogate models-Application to supply chain management. Comput. Chem. Eng. 29 (6), 1317–1328. Wang, H., Jin, X., 2013. Characterization of groundwater contaminant source using Bayesian method [J]. Stoch. Environ. Res. Risk Ass. 27 (4), 867–876. Wang, X., Han, M., 2014. Online sequential extreme learning machine with kernels for nonstationary time series prediction [J]. Neurocomputing 145, 90–97. Wong, K.I., Wong, P.K., Cheung, C.S., Vong, C.M, 2013. Modelling of diesel engine performance using advanced machine learning methods under scarce and exponential data set [J]. Appl. Soft Comput. 13 (11), 4428–4441. Xing, Z.X., Qu, R.Z., Zhao, Y., Fu, Q., Ji, Y., Lu, W.X, 2019. Identifying the release history of a groundwater pollution source based on an ensemble surrogate model [J]. J. Hydrol. 572, 501–576. Zhang, X., Srinivasan, R., Liew, M.V, 2009. Approximating swat model using artiﬁcial neural network and support vector machine1 [J]. J. Am. Water Resour. Assoc. 45 (2), 460–474. Zhang, S., Zhang, X., Lei, T, 2008. Research on fault diagnosis based on PSO and SVM [J]. Comput. Meas. Control 16 (11) 1573-1570. Zhao, Y., Lu, W.X., Xiao, C.N, 2016. A kriging surrogate model coupled in simulation-optimization approach for identifying release history of groundwater sources [J]. J. Contam. Hydrol. 185, 51–60. Zhou, Y.M., Zhang, J.R., Cheng, G.D, 2015. Comparison for two global optimization algorithms based on Kriging surrogate model [J]. Chin. J. Comput. Mech. 32 (4), 451–456.

Identifying groundwater contaminant sources based on a KELM surrogate model together with four heuristic optimization algorithms

Identifying groundwater contaminant sources based on a KELM surrogate model together with four heuristic optimization algorithms

Recommend Documents