Advances in Water Resources 138 (2020) 103540
Contents lists available at ScienceDirect
Advances in Water Resources journal homepage: www.elsevier.com/locate/advwatres
Identifying groundwater contaminant sources based on a KELM surrogate model together with four heuristic optimization algorithms Ying Zhao a,b, Ruizhuo Qu c, Zhenxiang Xing a,b,∗, Wenxi Lu d a
School of Water Conservancy and Civil Engineering, Northeast Agricultural University, Harbin 150030, China Key Laboratory of High Efficiency Utilization of Agricultural Water-Soil Resources of Ministry of Agriculture of China, Harbin 150030, China c School of Environment, Harbin Institute of Technology, Harbin, 150090, China d College of New Energy and Environment, Jilin University, Changchun 130021, China b
a r t i c l e
i n f o
Keywords: Groundwater contaminant Heuristic search algorithms Identifying contaminant sources Kernel-based extreme learning machine
a b s t r a c t Identifying groundwater contaminant sources involves a reverse determination of the source characteristics by monitoring contaminant concentrations in a few observation wells. However, due to the ill-posed nature and high time consumption of identification, an efficient identification process with accurate estimated results is particularly important. To improve the efficiency of identifying groundwater contaminant sources, a kernel-based extreme learning machine was used as a surrogate for the time-consuming simulation model. Four heuristic search algorithms were used to improve the accuracy of the identification results. The proposed approach was tested in both hypothetical and actual cases. The conclusions are: 1. By forward and backward calculation of the surrogate model, the time cost of identifying groundwater sources can be reduced significantly; 2. When a traditional genetic algorithm and a particle swarm optimization algorithm are combined with quantum computing, computational efficiency and accuracy are both improved; and 3. By using various search algorithms to identify unknown contaminant sources in the actual case, the range of release histories of each contaminant source can be obtained, decreasing the ill-posed nature of the identification result obtained by a single algorithm and improving the reliability of the identification results.
1. Introduction Unlike surface water contaminant, groundwater contamination is concealed underground and is often not quickly discovered, meaning that information on contaminant sources is difficult to obtain. To provide a reliable basis for treatment and restoration of groundwater contaminant sources, identification of the groundwater contaminant source release history has become a key point. However, current research on identifying groundwater contaminant sources suffers from three main problems. First, most existing research studied only hypothetical cases (Ayvaz, 2010; Liu et al., 2010; Wang and Jin, 2013; Gurarslan and Karahan, 2015). Although hypothetical cases have many advantages, they also have limitations. For example, in hypothetical cases, it can be assumed that the aquifer parameters are known and that the groundwater model is completely correct. Hence, only information about the groundwater source is unknown. For real-world cases, not only is information about groundwater contaminant sources unknown, but also the aquifer parameters. Aquifer parameters can be obtained through experiments and
∗
calculations, but there are also uncertainties in the constructed aquifer. In addition, a large amount of uncertainty exists in groundwater models, including the uncertainty of the observed data and the model structure (Singh et al., 2010). The uncertainty of real-world cases seriously affects the identification of groundwater contaminant sources. However, unless the validity of the identification method is verified by a hypothetical case, there are also instability problems when the model is applied to actual cases. Therefore, established identification models should be applied to both hypothetical and actual cases so that the validity and practicability of the identification model can both be verified. Second, there are efficiency and accuracy issues in identifying the release histories of contaminant sources. Rapid and stable identification of these release histories is the key to this research. The main method used has become the identification of groundwater contaminant sources by simulation and optimization. However, if the simulation model is repeatedly used for iterative solution of the optimization model, it will result in a huge computational load, which seriously restricts the feasibility of the simulation-optimization method in practical inversion identification. Therefore, constructing surrogates for simulation models has
Corresponding author. E-mail address:
[email protected] (Z. Xing).
https://doi.org/10.1016/j.advwatres.2020.103540 Received 27 November 2019; Received in revised form 2 February 2020; Accepted 17 February 2020 Available online 19 February 2020 0309-1708/© 2020 Elsevier Ltd. All rights reserved.
Y. Zhao, R. Qu and Z. Xing et al.
become a leading issue in recent research (Forrester and Keane, 2009; Srivastava and Singh, 2014; Zhao et al., 2016). Qin et al. (2007) used polynomial regression to create surrogate models. The most widely used approach is the Kriging surrogate model (Simpson et al., 2001; Luo and Lu, 2014; Wan et al., 2005). Kriging is used because it can spatially model and predict fields and processes. At the same time, it can approximate and map complex nonlinear functions. However, this method easily falls into the local optimal solution, especially when the distance constraint is too small, which will affect the robustness of the algorithm (Zhou et al., 2015). There is also a least-squares support vector machine that captures the complex relationship between inputs and outputs and effectively extracts structural information from noisy data (Zhang et al., 2009). The disadvantage is that it is difficult to determine a reasonable parameter range, which affects the speed and accuracy of fault diagnoses to a certain extent (Zhang et al., 2008). In addition, surrogate models include support vector machines (Jin et al., 2001), multivariate adaptive regression splines (Hou and Lu, 2018), ensemble surrogate model (Xing et al., 2019) and others. In this study, a kernel-based extreme learning machine (KELM) was used as a surrogate for the simulation model when running the optimization model. The KELM model learns quickly (Huang et al., 2012), but during its operation, multiple types of problems must be converted into sub-problems of the two classifications. Due to the introduction of the kernel function, the kernel limit learning machine can be simplified to the execution of a least-squares support vector machine. For a traditional feed forward neural network, artificial adjustment parameters are needed, resulting in slow operation and a tendency to fall into local optima, whereas the KELM model has strong generalization performance (Jain et al.,1996; Li et al., 2016). Use of surrogate models to save time in identifying the release histories of contaminant sources should also consider the accuracy of the identification results (Razavi et al., 2012). The accuracy of the surrogate model in approximating the simulation model depends on the coverage of the sample space by the training sample, but the number of training samples used to establish the surrogate model is limited, and there is no guarantee that the training sample set can cover the sample space well. Therefore, it is difficult to establish a sufficiently reliable surrogate model using only the training samples obtained by one-time sampling. Relevant research currently available is based on one-off sampling (Rao, 2006; Mirghani et al., 2012; Srivastava and Singh, 2015) and has not given sufficient attention to this issue. Latin hypercube sampling (LHS) is a multidimensional stratified sampling method proposed by McKay et al., 2000 for computer simulation experiments . The LHS method is a mathematical method based on stochastic mathematical theory. Compared with the traditional random sampling method, LHS introduces the concept of “layering”, which randomly samples the sample values throughout the sample space. At the same time, it can also ensure that data points do not agglomerate, or in other words, it ensures good coverage of the entire sampling space. In addition, the surrogate model is used to determine the optimization model identification space from both forward and backward directions, and the training sample of the surrogate model is improved based on the groundwater contaminant source inversion identification results as feedback information. This method not only enables the training samples to cover the neighborhood of the optimal solution effectively, but also improves identification efficiency. The optimization model coupled with the surrogate model is generally a heuristic optimization algorithm. Singh and Datta (2006) used a genetic algorithm to simultaneously identify the locations and release histories of groundwater contaminant sources and compared the identification results with observations with errors. Ayvaz (2010) used the harmonic search algorithm to solve an optimization model for groundwater contaminant source identification and compared the observations with different degrees of error on the identification results. Zhao et al. (2016) used the simulation-optimization method based on the Kriging surrogate model to invert the identification of contaminant sources in a two-dimensional heterogeneous aquifer. This study combined tradi-
Advances in Water Resources 138 (2020) 103540
tional optimization algorithms with quantum computation to improve computing power. The concept of quantum computation was first proposed by Benioff (1980) and has the characteristics of exponential storage capacity, parallelism, and exponential acceleration. Its theory of entanglement, superposition, and interference of quantum states may also contribute to our understanding of intelligence. This may be a solution to calculation problems. Narayanan and Moore (1996) first proposed a good combination of evolutionary algorithms with quantum computation and successfully solved the traveling salesman problem perfectly. Mirghani et al. (2009) organically integrated quantum computation and genetic algorithms, creating a solution that had both the traditional advantages of genetic algorithms and the excellent characteristics of quantum computation, including the advantage of seeking the best solution, faster calculation rate, and smaller overall calculation scale. Mikki and Kishk, (2006) proposed a quantum particle swarm algorithm and used it to analyze and calculate the antenna layout problem. Comparing the two algorithms reveals that the optimization results and convergence speed were better than those of the classical particle swarm optimization algorithm (Han et al., 2001; Sun et al., 2004). Therefore, using a genetic algorithm (GA) and a particle swarm optimization algorithm (PSO), two traditional heuristic algorithms were combined with quantum computing, and the stability of the identification of groundwater contaminant source release history in the two cases was compared with the four optimization methods with regard to the time cost and accuracy of the identification results. Third, identifying groundwater contaminant source release histories is a typical inverse problem, and in some cases, the identification result is ill-posed. Neuman (1973) first introduced inverse problem-solving theory into the field of hydrology and water resources and divided inversion methods into direct and indirect methods. For the inverse problem, if the solution satisfies the three conditions of existence, uniqueness, and stability, the problem is said to be well-posed. For an hypothetical model, the solution to the inverse problem is still very complex. In practical applications, it is difficult to obtain the distribution of contaminant at any point in time at each node, and the observed data generally contain errors. Therefore, ill-posed problems arise and have always been difficult to solve. Therefore, to avoid ill-posed problems, the four optimization models are coupled with the KELM surrogate model to identify groundwater contaminant sources and release histories and obtain four sets of identification results. The results of the four groups are not completely similar, and there are great differences in identifying the individual contaminant sources. Therefore, for the unknown contaminant source release histories, it is not reliable to use only one model to obtain results. Based on the four sets of results, a historical value range of source release histories is provided instead of a single identification result. To address the challenges of ill-posed nature and high time consumption, both the surrogate model and the optimization algorithm have been improved and combined in this study to improve the accuracy and efficiency of solving the inverse problem. To improve the efficiency of identifying groundwater contaminant sources, the KELM is used as a surrogate for the time-consuming simulation model. Forward and backward surrogate models are used for real-world cases to decrease the number of surrogate samples. Furthermore, four heuristic search algorithms are used to improve the accuracy of the identification results. The above approach has been tested in both hypothetical and actual cases (Fig. 1). 2. Methodology 2.1. Simulation model The simulation–optimization method contains a simulation model that makes up the principal section as an equality constraint in an optimization formulation (Datta et al., 2011). The governing partial differential equation for steady-state flow in a two-dimensional aquifer system
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 1. Flow diagram illustrating the study process.
can be stated as follows: ( ) 𝜕 𝜕ℎ 𝑇𝑖𝑗 = 𝑄 𝑖, 𝑗 = 1, 2 𝜕 𝑥𝑖 𝜕 𝑥𝑗
(1)
𝜕 where 𝜕𝑥 is the gradient operator vector, Tij is the transmissivity tensor, and Q is the volumetric flux per unit area. The partial differential equation that describes the transport of a contaminant in a two-dimensional aquifer system can be written as follows (Pinder and Bredehoeft, 1968): ( ) ) 𝑐𝑄 𝜕 (𝜃𝑐 ) 𝜕 𝜕𝑐 𝜕 ( = 𝜃𝐷𝑖𝑗 − (2) 𝜃𝑐 𝑣𝑖 + 𝑠 𝑖, 𝑗 = 1, 2 𝜕𝑡 𝜕 𝑥𝑖 𝜕 𝑥𝑖 𝜕 𝑥𝑖 𝑏
where b is the thickness of the aquifer, C the dissolved solids concentration, Dij the hydrodynamic dispersion tensor, vi the average linear seepage velocity, Cs the dissolved solids concentration in a source or sink flux, and 𝜃 the effective porosity. Note that Eq. (2) is related to Eq. (1) through Darcy’s Law as follows: 𝑣𝑖 = −
𝐾𝑖𝑗 𝜕ℎ 𝑖, 𝑗 = 1, 2 𝜃 𝜕 𝑥𝑗
(3)
where Kij is the hydraulic conductivity and h is the hydraulic head. 2.2. Surrogate model The surrogate model replaces the simulation model coupling with the optimization model to identify groundwater contaminant. The input data to the simulation model and the surrogate model were the same and were obtained by the Latin hypercube sampling method, that is, the inputs were the source concentration released history. The outputs were also same, that is, the observation value of the observation wells. In the actual case, to improve identification efficiency for appropriate samples, the surrogate model was used from both forward and backward directions. The inputs of the forward surrogate model were the samples of contaminant source release histories, and the outputs were the observed well values obtained through the simulation model. In contrast, in the input and output data of the backward surrogate model, the contaminant source release histories were derived from the known actual well observations. In addition, the selected surrogate model was KELM, which has fast and stable convergence. The extreme learning machine (ELM) is a new machine learning algorithm proposed by Huang et al. (2006). The output weights of the learning network can be solved for analytically through one-step calculation. The generalization ability and learning speed are greatly improved over the traditional neural network (Shi et al., 2014; Wong et al.,
2013). However, the ELM regression model has some problems, such as unsatisfactory generalization ability and instability. To solve these problems, Huang et al. (2012) proposed the KELM and introduced a kernel function to replace the hidden-layer feature mapping process that contained the activation function in the original ELM. This version can produce stable output, and its classification and fitting abilities are better than the non-core ELM method (Chen et al., 2014). The main calculation steps are as follows: For N training samples (xk ,pk ), k = 1, . . . , N, the output of ELM can be expressed as: 𝐿 ∑ 𝑧=1
( ) ( ) 𝛽𝑧 𝑔 𝜔 𝑧 ⋅ 𝑥 𝑘 + 𝑚 𝑧 = 𝑓 𝑥 𝑘 = 𝑦 𝑘
(4)
where g(𝜔z · xk + Mz ) is the output function of the z-th hidden neuron; g(·) is the excitation function; 𝜔z is the weight vector connecting the input node and the z-th hidden neuron; mz is the threshold of the z-th hidden neuron; 𝛽 z is the weight vector connecting the i th hidden neuron to the output neuron; yk is the output of the limit learning machine, and L is the number of hidden-layer neurons (Wang and Han, 2014). Eq. (4) can be expressed as follows: 𝐻𝛽 = 𝑌
(5)
where 𝛽 = [𝛽 1 . . . 𝛽 L ]T , y = [y1 . . . yN ]T , and H is the output matrix of the ELM hidden layer: ( ) ( ) 𝑔 𝜔𝐿 ⋅ 𝑥1 + 𝑚𝐿 ⎤ ⋯ ⎡ 𝑔 𝜔1 ⋅ 𝑥1 + 𝑚1 ⎥ 𝐻 =⎢ . (6) ⋱ ⋮ ⋮ ) ( )⎥ ⎢ ( ⋯ 𝑔 𝜔𝐿 ⋅ 𝑥𝑁 + 𝑚𝐿 ⎦𝑁×𝐿 ⎣𝑔 𝜔1 ⋅ 𝑥𝑁 + 𝑚1 If the ultimate learning machine model with L hidden nodes can learn N training samples unbiased, then the following equation exists (Wang and Han, 2014): 𝐿 ∑ 𝑧=1
( ) ( ) 𝛽𝑧 𝑔 𝜔 𝑧 ⋅ 𝑥 𝑘 + 𝑀 𝑧 = 𝑓 𝑥 𝑘 = 𝑡 𝑘 , 𝑘 = 1 , … , 𝑁
(7)
where tp is the target value. Eq. (7) can be expressed as follows: 𝐻𝛽 = 𝑃
(8)
Using the least-squares methods to solve Eq. (8): 𝛽 = 𝐻 +𝑃
(9)
H+
where is the Moore–Penrose generalized inverse matrix of H. Then the KELM original optimization problem can be expressed as: 𝑚𝑖𝑛
{
} ( )𝑇 1 ‖ 2 ‖ 𝑊 ∑𝑁 2 𝛿𝑘 𝑠.𝑡. ℎ 𝑥𝑘 ⋅ 𝛽 = 𝑝𝑘 − 𝛿𝑘 ‖𝛽 ‖ + 𝑘 =1 2‖ ‖ 2
(10)
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
where 𝛽 is a vector in the eigenspace F; W is the regularized parameter; h(xk ) is the mapping of the input variable x in the F space, and 𝛿 k is the error. According to the Karush–Kuhn–Tucker (KKT) theory, ELM training is equivalent to solving the following dual optimization problem: ( ( ) ) 𝑇 1 𝑊 ∑𝑁 2 ∑𝑁 𝐿𝐷 = ‖𝛽‖2 + 𝛿𝑘 − 𝜀 𝑘 ℎ 𝑥 𝑘 𝛽 − 𝑝 𝑘 + 𝛿𝑘 (11) 𝑘=1 𝑘=1 2 2 where 𝜀Z represents a Lagrangian operator. According to the KKT optimization condition, Eq. (11) can be obtained as follows: 𝑁 ( ) ∑ ⎧ 𝜕 𝐿𝐷 𝜀𝑘 ℎ 𝑥𝑘 ⎪ 𝜕𝛽 = 0 → 𝛽 = 𝑘 =1 ⎪ ⎨ 𝜕 𝐿 𝐷 = 0 → 𝜃 𝑍 = 𝑊 𝛿𝑘 ⎪ 𝜕 𝛿𝑘 ⎪ 𝜕 𝐿𝐷 = 0 → ℎ(𝑥 )𝑇 𝛽 − 𝑡 + 𝛿 = 0 ⎩ 𝜕 𝛿𝑘 𝑘 𝑘 𝑘
(12)
𝐾𝐸𝐿𝑀 = 𝐻 𝐻 𝑇
(13)
( ) ( ) ( ) 𝐾𝐸𝐿𝑀 (𝑧,𝑝) = ℎ 𝑥𝑘 ⋅ ℎ 𝑥𝑧 = 𝐾 𝑥𝑘 , 𝑥𝑧 .
(14)
10
10
2
5
5
2.3.1.3. Entanglement of quantum states. Entanglement of quantum states occurs in quantum systems that contain multiple subsystems. If certain states exist in the two subsystems where they occur, if they cannot be represented as the tensor product of the two subsystems, then this state can be called an entangled state (Nielsen and Chuang, 2007). The entanglement of quanta is an important feature distinguishing quantum mechanics from classical physics. Almost all quantum information processing is related to quantum entanglement, and therefore quantum entanglement properties occupy an important position in quantum information science. 3. Parallelism in quantum computation
(15)
where P = [p1 , . . . , pN ]T . 2.3. Optimization model Genetic Algorithm, Particle Swarm Optimization were chosen, and combines with quantum computation. The Quantum Genetic Algorithm uses the quantum chromosome coding method, and uses the effect of quantum gates to update the parent individuals, thereby generating new offspring individuals. Compared with traditional genetic algorithms, QGA can find the optimal solution to a problem in a shorter time with a smaller population size. Quantum particle swarm optimization uses information sharing strategy, which has better aggregation degree and higher search accuracy. So QPSO has a simple structure, good versatility, and fast convergence. The design variables and objective functions of the four optimization models are the same, namely observation values of observation wells and contaminant sources release history. The constraints of different models are slightly different, but the minimum value must be greater than zero. 2.3.1. Quantum computation The essence of quantum computation is to reduce the complexity of large-scale problems by using the properties of quantum theory (the superposition, coherence, entanglement, and quantum computational parallelism of quantum computation). 2.3.1.1. Superposition of quantum states. In computer languages, information is described as binary, either 0 or 1. However, the existence of qubits is unique, and they can exist in an intermediate state other than 0 and 1. This state is derived from the superposition of states of 0 and 1. This property is a quantum state in quantum information and a classic difference in physical state used in classical information. This property can set the specific state n to a common register. According to the quantum mechanics correlation principle, it is derived from the specific state of the 2n ground states superposed onto each other: the coherent superposition state |𝜙⟩. Among them, the relationship between the ground state and the superposition state can be described as: ⟩ ∑ |𝜑 ⟩ = 𝛾𝑒 ||𝜑𝑒 . (16) 𝑒
10
interact with it, the effect between the two can be expressed as: |𝜑⟩ = 2 1 √ |0⟩ + √ |1⟩. From Eq. (16), when the probability of the ground state |0⟩ decreases, |1⟩ increases instead.
The kernel matrix of ELM is:
KELM’s output function can be expressed as: ( ) ⎡𝐾 𝑥, 𝑥1 ⎤( ( )−1 ) 1 1 −1 ⎥ 𝐾 𝑓 (𝑥) = ℎ(𝑥)𝐻 𝑇 𝐻 𝐻 𝑇 + 𝑃 =⎢ + 𝑃 ⋮ 𝐸𝐿𝑀 )⎥ ⎢ ( 𝑉 𝑊 ⎣𝐾 𝑥, 𝑥𝑘 ⎦
2.3.1.2. Coherence of quantum states. The specific concept of quantum state coherence is the following: if the ground state of a quantum system is in a superposition state, then the quantum system is considered to be coherent. In quantum computing, it is important to make the different ground states constituting the superposition state interfere with the interaction of quantum gates, so that the phase can be changed. Suppose that a superposition of a quantum can be expressed as |𝜑⟩ = 3 1 1 1 3 1 1 √ |0⟩ + √ |1⟩ = √ [ ]. Using the quantum gate 𝑈 = √ [ ] to 1 1 −1
The parallel capability of quantum computation is actually an extension of the superposition of quantum states. Under certain ground conditions, the quantum states are used to evolve quantum states in quantum registers, and the vector matrix A in Hilbert space is used to represent quantum gates. Quantum gates are constrained, and therefore the effect on the quantum state in Hilbert space is manifested as wholeness, that is, the representation of all the base points, and the above characteristics are what we call quantum parallelism. This property makes it possible to effectively improve calculation speed, making it possible to complete work that is difficult for classical computers. 3.1. Criteria for surrogate evaluation Some common statistical indicators, specifically relative error, and coefficient of determination, were used to test model performance. (1) The relative error (RE) for a particular parameter is defined as follows and is expressed as a percentage: 𝑅𝐸 =
𝑍 (𝑡) − 𝑍̂ (𝑡) × 100% 𝑍 (𝑡 )
(17)
where t is the number of data points, Z(t) is an observed value, and 𝑍̂ (𝑡) is a predicted value. (2) The coefficient of determination (R2 ) reflects the agreement between the observed and predicted values and ranges from 0 to 1. R2 can be expressed as follows: )2 ∑𝑛 ( 𝑍 (𝑡) − 𝑍̂ (𝑡) 𝑅2 = 1 − ∑𝑡=1 ( (18) )2 𝑛 ̄ 𝑡=1 𝑍 (𝑡) − 𝑍 (𝑡) where 𝑍̄ (𝑡) is the mean of the observations. 4. Basic introduction to the aquifer Fig. 2 shows the plan view for a hypothetical case. The aquifer has specified head boundary conditions on the upper left (AB) and lower right (CD) sides, and there is no flow on the other sides. The head values on both sides AB and CD are 100.0 m and 80.0 m, respectively. The aquifer is divided into five hydraulic conductivity zones, and the hydraulic conductivity values in each zone are constant. The hydraulic conductivities are K1 = 34.56 m/d, K2 = 17.28 m/d, K3 = 8.64 m/d,
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Table 1 Basic parameters of the aquifer. Parameter
Value
Effective porosity, 𝜃 Longitudinal dispersivity, 𝛼 L (m) Transverse dispersivity, 𝛼 T (m) Saturated thickness, b(m) Grid spacing in x-direction, Δx(m) Grid spacing in y-direction, Δy(m) Volume flux per unit area, Q(m/d) Initial concentration (g/L)
0.25 40 9.6 30.5 100 100 8.64 × 10−5 0
Table 2 Actual values of the source fluxes (q) in the first four stress periods for case one. Contaminant source
S1 S2
Source flux at each stress period (SP) (g/d × 105 ) SP1
SP2
SP3
SP4
54.44 58.14
7.5 74.56
29.65 72.38
33.64 21.12
Fig. 2. Hypothetical case.
K4 = 25.92 m/d, and K5 = 60.48 m/d. Table 1 shows the hydrogeological parameters of the aquifer. There are two contaminant sources and seven sampling locations, and one pumping well is present at the center of this aquifer. Table 2 lists the actual values for the source fluxes. The actual case is the case studied by Gzyl et al. (2014); Fig. 3 shows its plan. Its location is in the city of Jaworzno in the Wąwolnica valley in southern Poland. There are five main areal sources of contaminant: 1 CSO Main, 4 Internal, 5 CSO North, and LS (divided into LSN 6a and LSS 6b). The exact value of the source contaminant concentration is unknown. However, it is known that the contaminant released by the contaminant source are constant during the simulation. Fifteen observation wells were arranged along the direction perpendicular to the water flow. The simulation period was 120 months. Detailed information on the hydrological and geological parameters of the study area can be found in that paper. The GMS model was used to simulate contaminant migration. Fig. 4 shows the distribution of the contaminant plumes. For the actual case, the actual observed value at the observation well and the values simulated by the four identification models are shown in Fig. 5.
5. Results and discussion 5.1. The surrogate model 5.1.1. hypothetical case The surrogate model was trained differently in the hypothetical and actual cases. In the hypothetical case, 500 sets of input values were sampled. The input data were entered into the simulation model to obtain the corresponding output data, which were the observed concentrations at the observation well. 450 groups were used as training data, and 50 groups were used as test data. R2 was used to characterize the training results. The results showed that R2 was as high as 0.9990 in the hypothetical case; Fig. 6 shows the training results for seven observation wells. The simulation accuracy for each observation well was greater than 0.99, indicating that the KELM surrogate model had higher simulation accuracy for extreme values. Moreover, the KELM surrogate model was stable and had high simulation accuracy for observation wells at different positions. Fig. 3. Actual case.
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 4. (a) and (b) are the distribution of contaminant plume in the final period of the hypothetical case and the actual case, respectively.
Fig. 5. Actual observed values and simulation values corresponding to the optimization results of four optimization models.
5.1.2. Actual case In the actual case, 50 sets of input data were sampled first. Forty data sets were used to train the simulation model, and the other ten data sets were used to test the accuracy of the surrogate model. The results show that in the actual case, R2 was as high as 0.9948. Fig. 7 shows the training results for the 15 observation wells. The accuracies at wells 1, 2, 3, and 10 were relatively low, with R2 greater than 0.82, but the prediction accuracy was still acceptable. The reason for this is that the values required for the surrogate model simulation were too small compared with the observed values at the other observation wells. The simulated values of R2 were all greater than 0.92 at the other observation wells. Nine of the fifteen observation wells had R2 greater than 0.99, indicating that the KELM surrogate model also had high accuracy in actual cases. After training with 50 data sets, the KELM surrogate model could be coupled with the optimization model. The calculation search range was provided before optimization, which improved calculation efficiency. The reverse KELM surrogate model could quickly determine the search range. The input of the forward surrogate model is the contaminant source release history, and the output is the observation values. However, the input of the backward surrogate model is the observation values, and the output value is the contaminant source release history to
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 6. Plots indicating the fitness of the hypothetical case: (a)–(g) are from the first observation well to the 7th observation well respectively.
be sought. Fig. 8 shows the predicted results. An R2 of 0.9605 was obtained, indicating that the range of values obtained using the reverse KELM surrogate model could be applied to the optimization model. The trained surrogate model was coupled with the four optimization algorithms, and the identification results were obtained. If the error between the simulated observed contaminant concentrations at the observation wells corresponding to the identification results and the actual observed contaminant concentrations was within a reasonable range, the result was considered as the final contaminant concentration of the source. Otherwise, training data were added within a certain numerical range according to the identification result, and the above steps were repeated until the error was reduced to an acceptable range. In this study, the accuracy of contaminant identification results based on 50 sets of training data was low. Therefore, to improve identification accuracy, the num-
ber of initial input and output data sets was increased by 50 groups. The value range of the additional 50 sets of training data is appropriately adjusted according to the previous identification results. The resulting 100 sets of input and output data for each contaminant source conformed to the normal distribution (Fig. 9), which shows that 100 data sets were sufficient. According to the identification results of the four models, the next sampling value range was determined, as shown in Table 3. The Table 3 The second sampling interval of five contaminant sources.
Minimum Maximum
Internal
CSO North
CSO Main
LSN
LSS
0.18 0.77
4.02 16.88
0.65 3.54
0.00 0.05
1.11 5.66
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 6. Continued
upper limit of each interval is the maximum of the four optimization model identification results. The lower limit of the interval is to choose the minimum value.
stable after 500 iterations and that the optimization result was the final identification result. The accuracy of the identification was reflected by the R2 and RE values, which are shown in Table 4. Comparing the results of the four optimization algorithms shows that the quantum genetic al-
5.2. Optimization results 5.2.1. Hypothetical case In the hypothetical case, the surrogate model was trained and coupled with the optimization model to identify the contaminant concentrations of the groundwater contaminant sources. For the four optimization models, the initial population was 50, and the number of iterations was 500. After 500 iterations, the iterative process of each optimization model and the final identification result were recorded, as shown in Fig. 10. It is apparent that the error of all the optimization models was
Table 4 Accuracy of identification results in hypothetical case. Optimization model
R2
RE (%)
GA PSO QGA QPSO
0.9424 0.9619 0.9836 0.9717
9.77 6.93 5.68 5.65
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 7. Plots indicating the fitness of the actual case: (a)–(o) are from the first observation well to the 15th observation well respectively.
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 7. Continued
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 7. Continued
gorithm (QGA) and quantum particle swarm optimization (GPSO) algorithms achieved higher accuracy than the other two optimization algorithms and were very effective in identifying groundwater contaminant sources. In comparison, the accuracy of the GA algorithm was lower, but still within a reasonable range. Taking an overall view, the R2 of the four optimization algorithms were all greater than 0.94, and RE was within 10%, indicating that the four identification models had high accuracy for identification and could be applied to the actual case. 5.2.2. Actual case In the actual case, after the optimization interval was determined, 50 data sets were used to identify the groundwater contaminant sources. Fig. 11 shows the results for the four identification models. Different identification models had different identification results for the contami-
nant sources, especially for contaminant sources LSN and LSS. However, the results obtained by the four identification models were very close to each other at five of the contaminant sources. Among the observed contaminant concentrations at the 15 observation wells, contaminant source CSO Main had the greatest impact and was therefore the easiest to identify. Contaminant sources Internal and CSO North had the second-greatest effect. In contrast, sources LSN and LSS had the least impact on observed well values. The four groups of identification results were input into the simulation model to obtain the observed well values and compared with the known observed values; the results are shown in Table 5. Clearly, the R2 of the observed values corresponding to the identification results are all between 0.93 and 0.94, indicating that the overall trend of the observed values is consistent with the trend of the actual observed values.
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 8. Obtained the contaminant source concentration values according to the backward surrogate model.
Fig. 9. The normal curve of the input data for five contaminant sources: (a) CSO Main; (b) Internal; (c) CSO North; (d) LSN; (e) LSS.
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 10. Iterative convergence process and identification results: (a) iterative convergence process for GA; (b) iterative convergence process for PSO; (c) iterative convergence process for QGA; (d) iterative convergence process for QPSO; (e) identification results of four optimization models.
Table 5 Accuracy of identification results in actual case at first sampling. Optimization model
R2
RE (%)
GA PSO QGA QPSO
0.9331 0.9356 0.9387 0.9430
28.11 24.70 23.26 21.48
Table 6 Accuracy of identification results in actual case at second sampling.
Fig. 11. Identification results based on 50 sets of training data.
However, the RE of the simulated observed values is larger. For example, the RE of the GA algorithm is close to 30%. The RE values of the other simulated observations all exceed 20%, and there are large numerical errors for individual observation wells. Fig. 12 shows the concentration of contaminant sources identified by the 100 groups of training values. It is evident that different identification models still have different results for the same contaminant source identification. For example, at contaminant source LSS, the maximum value calculated by the GA was 2.4580 mg/L, but the QPSO algorithm calculated that no contaminant was discharged there. Compared with Table 5, the discharge contaminant concentration of each contaminant
Optimization model
R2
RE (%)
GA PSO QGA QPSO
0.9455 0.9469 0.9473 0.9482
18.58 19.45 18.55 19.08
source also changed. For example, at contaminant source Internal the concentration decreased from 0.5–0.3 mg/L to 0.3–0.2 mg/L. For the other three contaminant sources, the discharge concentration change was relatively small. By analyzing the accuracy of the simulated observation values corresponding to the identification result, it is possible to judge whether the identification result is acceptable. Table 6 shows the accuracy of the second set of identification results corresponding to the simulated observation values. It is apparent that R2 changed less than 1%. The reason for this is that even when the number of training groups was increased, there was no significant change in the overall prediction
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Fig. 14. The time consumption to obtain the final identification results. Fig. 12. Identification results based on 100 sets of training data.
trend of the surrogate model. However, RE showed a relatively obvious change. In the case with fewer training data points, the RE of the GA algorithm was close to 30%. After the number of training data points was increased, RE decreased to less than 20%. The RE of the other three optimization algorithms all remained within 20%, indicating that the simulated observation values conformed to the overall trend of the actual observation values and that the accuracy of a single value was also within a reasonable range. To judge whether identification results are reasonable and reliable, it is necessary to test not only the accuracy of the simulated observations, but also the convergence of the four identification models. Fig. 13 shows convergence plots for the four models after 200 iterations; the convergence curves of all the optimization algorithms are stable. Among these, after 200 iterations of the PSO algorithm, although the iteration curve was stable, it did not last for a long time. However, the fitted value was less than 0.4, indicating that the identification result can be accepted. In identifying the release histories of groundwater contaminant sources, not only must the accuracy and reliability of the identification results be considered, but also the time cost of the identification process. Fig. 14 presents a direct comparison of the time taken by the four optimization models to perform 200 iterations after the same initial population numbers were set and based on the same 100 groups of training data. The time taken by the four identification models was very short, but the QPSO model had the shortest running time. The calculations of
0.8
Mean fitness
Fitness value
Fitness value
2 1.5 1
Best fitness=0.35
0.5 0
50
100
Generation
150
200
0.7 0.6 Best fitness=0.45
0.5 0
50
Best fitness=0.38
50
100
150
Generation (c)
200
0.5
Fitness value
Fitness value
0.45
0
100
Generation
150
200
(b)
Mean fitness
0.4
Fig. 13. Iterative convergence curve of the four optimization models of the second time: (a) GA (b) PSO (c) QGA (d) QPSO.
Mean fitness
(a)
0.5
all models were done a PC with an Intel 3.4 GHz processor and 16 GB RAM. Compared with the performance of the four identification models in the two cases, the identification accuracy in the hypothetical case was better than that in the actual case. The main reason was that it was more difficult to reconstruct the actual hydrologic and geological information in the actual case. However, a comprehensive analysis of the performance of the four identification models in both cases showed that each model performed very well. In the case with less training data, KELM surrogate simulation accuracy could still be maintained at a high level. GA and PSO, which are two traditional optimization models, have good simulation foundation and high identification accuracy. However, the combination of these two models with quantum computing not only improved their computational efficiency, but also their computational accuracy. The PSO algorithm was still unstable after 200 iterations, but the QPSO algorithm became stable after about 50 iterations, and the calculation time was shortened. Therefore, the combination of a traditional optimization model with quantum computing can improve the model. Nevertheless, this study still needs to integrate the identification results of the four identification models and to achieve identification of the release histories of unknown contaminant sources. Otherwise, the results obtained by the algorithm will be invalid, the solution will not be unique, or the solution will be discontinuous under the boundary. To avoid this problem, the identification results of the four models should be considered comprehensively, and the final identification re-
Mean fitness
0.45 0.4 Best fitness=0.35
0.35 0
50
100
150
Generation (d)
200 r
Y. Zhao, R. Qu and Z. Xing et al.
Advances in Water Resources 138 (2020) 103540
Table 7 The interval of the contaminant sources release history. Contaminant source
Minimum in Gzyl (mg/d)
Maximum in Gzyl (mg/d)
Identification results Minimum (mg/d)
Identification results Maximum (mg/d)
Internal CSO North CSO main LSN LSS
0 0 0 0 0
1.1 0.9 10.8 5.2 14
0.09 8.95 1.60 0.00 0.00
0.22 9.67 1.98 0.03 2.46
sults should not be a set of data, but an interval. Table 7 shows the final results. In the research of Gzyl et al. (2014), the accurate release histories of the actual contaminant sources were the parameters to be solved for, and the intervals for these release histories were finally obtained, as shown in Table 7. By comparing the intervals of the contaminant source release histories, the value range obtained through synthesis of the four optimization models in this study could be made narrower. Among the five contaminant sources, the identification result interval of contaminant source CSO North was significantly different from that of Gzyl et al. (2014). The main reason was that in both studies, the release histories of the contaminant sources were the parameter to be solved for. Moreover, there were differences among the various models for simulating groundwater contaminant transport, which may have had a great influence on the results. In addition, the numerical solutions for groundwater flow and transport were also different, which could have had a slight impact on the results. 6. Conclusions The contaminant identification model used in this study was applied to two cases, one hypothetical and another actual. The kernel-based extreme learning machine-coupled optimization model was used to obtain the identification model. The contaminant release history identification model was used in the hypothetical case and the actual case, respectively. The results show that the four identification models not only have high accuracy but also fast identification speed. The shape, area size, and hydrological and geological parameters are set in the hypothetical case. In contrast, it was more difficult to generalize the actual case than to set up the hypothetical case. Moreover, the release histories of the contaminant sources were unknown. By identifying the contaminant concentrations of the sources in the actual case and comparing them with the hypothetical case, it could be concluded that: (1) KELM has faster running speed and higher precision than the simulation model. Coupled with the four optimization models, it performed well in all cases. In the actual case, after using the forward surrogate model, the surrogate model is used to reversely determine the source’s contaminant concentration range, and the surrogate model is retrained. By using this method to identify the release histories of unknown contaminant sources, not only can sufficiently training data be ensured, but also the concentration range of contaminant sources can be determined more quickly, and the calculation efficiency can be improved. (2) Through comparative analysis, differences were found in the performance of the four optimization models. With fewer input and output data points, the identification results of the combination of quantum computing with GA or PSO were better than those of the uncombined models. Moreover, by comparing the time required for contaminant source identification, the optimization algorithm combined with quantum computing was faster. The combination of quantum computing characteristics with traditional heuristic optimization algorithms not only can achieve a stable identification process, but also has high identification accuracy and fast identification speed.
(3) The release histories of groundwater contaminant sources were identified by means of four optimization methods. More importantly, four different identification results were obtained by the four identification models, which together provided a range of concentrations for the contaminant sources. This avoids the problem of ill-posed identifications from a single optimization model, improves the stability of the identification result, and provides a reliable basis for contaminant control. Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. CRediT authorship contribution statement Ying Zhao: Conceptualization, Writing - review & editing. Ruizhuo Qu: Writing - original draft, Methodology, Software. Zhenxiang Xing: Funding acquisition, Visualization. Wenxi Lu: Methodology, Software. Acknowledgments This research was supported by the National Natural Science Foundation of China (No. 41807196, No. 51979038, No. 41672232, No. 41972252), the National Key R&D Program of China (No: 2017YFC0406004), the Postdoctoral Science Foundation of China (No. 2018M641793), the Postdoctoral Science Foundation of Heilongjiang Province of China (No. LBH-Z19002), the Natural Science Foundation of Heilongjiang Province of China (No. E2015024), and the Academic Backbones Foundation of Northeast Agricultural University (No. 16XG11). Supplementary materials Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.advwatres.2020.103540. References Ayvaz, M.T., 2010. A linked simulation-optimization model for solving the unknown groundwater pollution source identification problems [J]. J. Contam. Hydrol. 117 (1–4), 46–59. Benioff, P., 1980. The computer as a physical system: a microscopic quantum mechanical hamiltonian model of computers as represented by Turing machines [J]. J. Stat. Phys. 22 (5), 563–591. Chen, C., Wei, L., Hongjun, S., Kui, L, 2014. Spectral-Spatial classification of hyperspectral image based on Kernel extreme learning machine [J]. Remote Sens. 6 (6), 5795–5814. Datta, B., Chakrabarty, D., Dhar, A, 2011. Identification of unknown groundwater pollution sources using classical optimization with linked simulation [J]. J. Hydro-Environ. Res. 5 (1), 25–36. Forrester, A.I.J., Keane, A.J., 2009. Recent advances in surrogate-based optimization [J]. Prog. Aerosp. Sci. 45 (1), 50–79. Gurarslan, G., Karahan, H., 2015. Solving inverse problems of groundwater pollution source identification using a differential evolution algorithm [J]. Hydrogeol J. 23 (6), 1109–1119. Gzyl, G., Zanini, A., Fraczek, R., Kura, K., 2014. Contaminant source and release history identification in groundwater: A multi-step approach. J. Contam. Hydrol. 157 (2014), 59–72.
Y. Zhao, R. Qu and Z. Xing et al. Han, K.H., Park, K.H., Lee, C.H, 2001. Parallel quantum-inspired genetic algorithm for combinatorial optimization problem: proceeding of the 2001. IEEE Congr. on Evolut. Comput.[C] 1429–1442. Hou, Z.Y., Lu, W.X., 2018. Comparative study of surrogate models for groundwater contaminationt source identification at DNAPL-contaminated sites [J]. Hydrogeol. J. 26 (3), 923–932. Huang, G.B., Zhou, H., Ding, X., Zhang, R, 2012. Extreme learning machine for regression and multiclass classification [J]. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 42 (2), 513–529. Huang, G.B., Zhu, Q.Y., Siew, C.K, 2006. Extreme learning machine: theory and applications [J]. Neurocomputing 70 (1–3), 489–501. Jain, A.K., Mao, J., Mohiuddin, K.M, 1996. Artificial neural networks: a tutorial [J]. Computer 29 (3), 31–44. Jin, R., Chen, W., Simpson, T.W, 2001. Comparative studies of metamodelling techniques under multiple modelling criteria [J]. Struct. Opt. 23 (1), 1–13. Li, X., Mao, W., Wei, J, 2016. Extreme learning machine based transfer learning for data classification [J]. Neurocomputing 174, 203–210. Liu, X., Cardiff, M.A., Kitanidis, P.K, 2010. Parameter estimation in nonlinear environmental problems [J]. Stochastic Environ. Res. Risk Assess. 24 (7), 1003–1022. Luo, J.N., Lu, W.X., 2014. A mixed-integer non-linear programming with surrogate model for optimal remediation design of NAPLs contaminated aquifer [J]. Int. J. Environ. Pollut. 54 (1), 1–16. McKay, M., Beckman, Conover, W, Mckay, M.D., Beckman, R.J., Conover, W.J, 2000. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code [J]. Technimetrics 42 (1), 55–61. Mikki, S.M., Kishk, A.A., 2006. Quantum particle swarm optimization for electromagnetic[J]. Antennas Propag. IEEE Trans. 54 (10), 2764–2775. Mirghani, B.Y., Mahinthakumar, K.G., Tryby, M.E., Ranjithan, R.S., Zechman, E.M, 2009. A parallel evolutionary strategy-based simulation-optimization approach for solving groundwater source identification problems [J]. Adv. Water Res. 32 (9), 1373–1385. Mirghani, B.Y., Zechman, E.M., Ranjithan, R, 2012. Enhanced simulation-optimization approach using surrogate modeling for solving inverse problems [J]. Environ. Forensics 13 (4), 348–363. Narayanan, A., Moore, M, 1996. Quantum-inspired genetic algorithms[C]. In: IEEE International Conference on Evolutionary Computation. IEEE, pp. 61–66. Neuman, S.P., 1973. Calibration of distributed parameter groundwater flow models viewed as a multiple-objective decision process under uncertainty [J]. Water Resour. Res. 9 (4), 1006–1021. Nielsen, M.A., Chuang, I.L, 2007. Quantum computation and quantum information [J]. Math. Struct. Comput. Sci. 17 (6), 1115. Pinder, G.F., Bredehoeft, J.D., 1968. Application of the digital computer for aquifer evaluation [J]. Water Resour. Res. 4 (5), 1069–1093. Qin, X.S., Huang, G.H., Chakma, A., Chen, B., Zeng, G.M, 2007. Simulation-based process optimization for surfactant-enhanced aquifer remediation at heterogeneous DNAPL– contaminated sites [J]. Sci. Total Environ. 381 (1–3), 17–37. Rao, S.V.N., 2006. A computationally efficient technique for source identification problems in three-dimensional aquifer systems using neural networks and simulated annealing [J]. Environ. Forensics 7 (3), 233–240.
Advances in Water Resources 138 (2020) 103540 Razavi, S., Tolson, B.A., Burn, D.H, 2012. Review of surrogate modeling in water resources [J]. Water Resour. Res. 48 (7), W07401. Shi, Y., Zhao, L., Tang, Jian, 2014. Recognition model based feature extraction and kernel extreme learning machine for high dimensional data [J]. Adv. Mater. Res. 875, 2020–2024. Simpson, T.W., Mauery, T.M., Korte J J Mistree, F, 2001. Kriging models for global approximation in simulation-based multidisciplinary design optimization [J]. AIAA J. 39 (12), 2233–2241. Singh, A., Mishra, S., Ruskauff, G, 2010. Model averaging techniques for quantifying conceptual model uncertainty [J]. Groundwater 48 (5), 701–715. Singh, R.M., Datta, B., 2006. Identification of groundwater contaminant sources using GA-based linked simulation optimization model [J]. J. Hydrol. Eng. 11 (2), 101–109. Srivastava, D., Singh, R.M., 2014. Breakthrough curves characterization and identification of an unknown pollution source in groundwater system using an artificial neural network (ANN) [J]. Environ. Forensics 15 (2), 175–189. Srivastava, D., Singh, R.M., 2015. Groundwater system modeling for simultaneous identification of pollution sources and parameters with uncertainty characterization [J]. Water Resour. Manage. 29 (13), 4607–4627. Sun, J., Feng, B., Xu, W.B., 2004. Particle swarm optimization with particles having quantum behavior [C]. In: Proceedings of the 2004 Congress on Evolutionary Computation. IEEE Press, Portland, pp. 325–331. Wan, X., Pekny, J.F., Reklaitis, G.V, 2005. Simulation-based optimization with surrogate models-Application to supply chain management. Comput. Chem. Eng. 29 (6), 1317–1328. Wang, H., Jin, X., 2013. Characterization of groundwater contaminant source using Bayesian method [J]. Stoch. Environ. Res. Risk Ass. 27 (4), 867–876. Wang, X., Han, M., 2014. Online sequential extreme learning machine with kernels for nonstationary time series prediction [J]. Neurocomputing 145, 90–97. Wong, K.I., Wong, P.K., Cheung, C.S., Vong, C.M, 2013. Modelling of diesel engine performance using advanced machine learning methods under scarce and exponential data set [J]. Appl. Soft Comput. 13 (11), 4428–4441. Xing, Z.X., Qu, R.Z., Zhao, Y., Fu, Q., Ji, Y., Lu, W.X, 2019. Identifying the release history of a groundwater pollution source based on an ensemble surrogate model [J]. J. Hydrol. 572, 501–576. Zhang, X., Srinivasan, R., Liew, M.V, 2009. Approximating swat model using artificial neural network and support vector machine1 [J]. J. Am. Water Resour. Assoc. 45 (2), 460–474. Zhang, S., Zhang, X., Lei, T, 2008. Research on fault diagnosis based on PSO and SVM [J]. Comput. Meas. Control 16 (11) 1573-1570. Zhao, Y., Lu, W.X., Xiao, C.N, 2016. A kriging surrogate model coupled in simulation-optimization approach for identifying release history of groundwater sources [J]. J. Contam. Hydrol. 185, 51–60. Zhou, Y.M., Zhang, J.R., Cheng, G.D, 2015. Comparison for two global optimization algorithms based on Kriging surrogate model [J]. Chin. J. Comput. Mech. 32 (4), 451–456.