Advances in Water Resources 56 (2013) 14–26
Contents lists available at SciVerse ScienceDirect
Advances in Water Resources journal homepage: www.elsevier.com/locate/advwatres
Sparse calibration of subsurface flow models using nonlinear orthogonal matching pursuit and an iterative stochastic ensemble method Ahmed H. Elsheikh a,b,c,⇑, Mary F. Wheeler a, Ibrahim Hoteit b,c a
Center for Subsurface Modeling (CSM), Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX, USA Dept. of Earth Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia c Dept. of Applied Mathematics and Computational Sciences, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia b
a r t i c l e
i n f o
Article history: Received 9 October 2012 Received in revised form 5 February 2013 Accepted 11 February 2013 Available online 19 February 2013 Keywords: Parameter estimation Subsurface flow models Sparse regularization Orthogonal matching pursuit Iterative stochastic ensemble method
a b s t r a c t We introduce a nonlinear orthogonal matching pursuit (NOMP) for sparse calibration of subsurface flow models. Sparse calibration is a challenging problem as the unknowns are both the non-zero components of the solution and their associated weights. NOMP is a greedy algorithm that discovers at each iteration the most correlated basis function with the residual from a large pool of basis functions. The discovered basis (aka support) is augmented across the nonlinear iterations. Once a set of basis functions are selected, the solution is obtained by applying Tikhonov regularization. The proposed algorithm relies on stochastically approximated gradient using an iterative stochastic ensemble method (ISEM). In the current study, the search space is parameterized using an overcomplete dictionary of basis functions built using the K-SVD algorithm. The proposed algorithm is the first ensemble based algorithm that tackels the sparse nonlinear parameter estimation problem. Ó 2013 Elsevier Ltd. All rights reserved.
1. Introduction Subsurface flow models relies on many parameters that cannot be measured directly. Instead, a sparse set of measurements may be available at the location of wells. The complete distributions of these unknown fields are commonly inferred by a model calibration process that takes into account historical records of the input– output of the model. During model calibration, the parameters are adjusted to minimize the difference between the model output from measured time dependent data. Automated calibration is essential for accurate prediction of groundwater flow models to understand the fate of subsurface contaminants [8,43]. The multiphase flow of hydrocarbons in an oil reservoir is another example where model calibration is needed for accurate predictions [45]. The amount of available data to constrain the models is usually limited in both quantity and quality. This results in an ill-posed inverse problem that might admit many different solutions. Different parameter estimation techniques can be applied to tackle this problem. These techniques can be classified into Bayesian methods based on Markov Chain Monte Carlo (MCMC) methods [14,16,48], gradient based optimization methods [8,43], stochastic search ⇑ Corresponding author. Address: Center for Subsurface Modeling, Institute for Computational Engineering and Sciences, The University of Texas at Austin, 201 East 24th Street, Campus Mail C0200, Austin, TX 78712, USA. E-mail addresses:
[email protected] (A.H. Elsheikh),
[email protected] (M.F. Wheeler),
[email protected] (I. Hoteit). 0309-1708/$ - see front matter Ó 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.advwatres.2013.02.002
algorithms [2–4,37], and ensemble Kalman filter methods [17,44,45]. Subsurface domains are generally heterogeneous and show a wide range of heterogeneities in many physical attributes such as permeability and porosity fields. An important step in the automatic calibration process is to define a proper parameterization of these unknown fields. Most of the parameterization methods depend on a prior model assumptions that implicitly defines the spatial correlations of the unknown fields using a parameter covariance matrix. Karhunen–Loève expansion (KLE) [30,31,39], also known as proper orthogonal decomposition (POD) or principal component analysis (PCA) in the finite dimensional case [5], is widely used for parameterizing the permeability field in subsurface flow models [13,14,38,53]. KLE is an effective method that is simple to implement, however it only preserves the second order moments of the distribution. Sarma et al. [57] utilized kernel based KLE to preserve the higher-order statistics of complex geological structures. The kernel trick relies on transforming high dimensional problem data into the feature space by defining a nonlinear kernel function. This method has a conceptual advantage over KLE, however a major complication arise when trying to define a reverse mapping from the feature space to the problem space. This results in a nonlinear optimization problem known as the pre-image problem. Also, the choice of the kernel is not unique as many different kernels with different parameters can be defined. Jafarpour and McLaughlin
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
[29] utilized discrete cosine transformations (DCT) to parameterize the permeability without specification of the covariances or other statistics of the unknown fields. DCT has a maximal compression power and is commonly used in image compression. However, these methods might perform poorly if all scales are adjusted simultaneously [6]. Sparse calibration and the compressed sensing (CS) [7,12] are very active research areas in the signal processing community. Standard reconstruction methods relies on defining a set of basis functions which are orthogonal as in KLE methods and then one tries to find the optimal set of weights to reconstruct the measurements. This reconstruction problem is an ill-posed problem and regularization techniques (i.e. Tikhonov regularization) that constrain the ‘2 -norm of the solution are commonly applied. The quality of the solution depends on the class of basis functions that are used to parameterize the search space. In sparse calibration methods, a large collection of basis functions are included in a dictionary and the solution process consists of picking the best basis functions for accurate reconstruction of the unknown field as well as finding the associated weights. In the current paper, we present a sparse calibration framework that has two components. The first component is concerned with the parameterization of the unknown fields. The second component is a search space exploration algorithm to solve an inverse problem over the parameterized search space. For the parameterization step, we follow Khaninezhad et al. [32,33] in utilizing a special parameterization based on sparse dictionary learning. Given a set of realizations of the unknown field (i.e. permeability field), the dictionary learning problem is formulated as an optimization problem to find the best basis functions such that each realization can be represented as a linear combination of only few basis functions. These dictionaries are over-complete and have a certain amount of redundancy. This redundancy is desirable as it gives robustness to the representation. Complex signals or fields might include structures that are not well represented by few vectors in any single basis. If large dictionaries are used, efficient representation can be found as a linear combination of only few basis functions. Building the optimal dictionary that approximates a signal with a minimum error is NP-hard (non-deterministic polynomial-time hard) problem and approximate algorithms can be used. Here, we utilized the K-SVD algorithm introduced by Aharon et al. [1] and used by Khaninezhad et al. [32] for parameterizing subsurface flow models. We refere the reader to the good introduction to the topic presented by Khaninezhad et al. [32] as it is straightforward application of well developed image processing techniques [15]. Once the dictionary is defined, the sparse calibration can proceed in two different directions. The first direction is to solve an optimization problem that penalizes the solution in the ‘1 -norm and minimizes the reconstruction error. The second class of algorithms are greedy algorithms, that iteratively find and remove elements from the dictionary that are maximally correlated with the residuals. Khaninezhad et al. [32,33] followed the first direction by utilizing the iteratively reweighted least-squares (IRLS) [9] algorithm to identify the important dictionary elements (solution support) and its associated weights by minimizing a sparsityregularized objective function. However, the main challenge with the IRLS is in the specification of a reasonable value for the regularization parameter. To avoid that, Khaninezhad et al. [33] modified the IRLS to include the ‘1 regularization term as a multiplicative term instead of an additive term. The use of multiplicative regularization terms is not common and increases the nonlinearity of the problem. In addition, the minimization algorithm might be attracted to minimizing the data misfit term only and a reduction of the total objective function may be obtained because of the multiplicative effect. In [33], the nonlinear objective function was minimized using an adjoint code to estimate the sensitivities.
15
In the current paper, an ensemble based algorithm is proposed to solve nonlinear sparse calibration problem given a dictionary built using the K-SVD algorithm. Ensemble based methods have proven to be an effective tool for subsurface model calibration [21,44,45]. The proposed algorithm enables the use of sparse calibration techniques when the adjoint code is not available. This is a major advantage over the earlier work of [32,33] where an adjoint code was needed. For the sparse calibration problem, we propose a new algorithm based on the orthogonal matching pursuit (OMP) algorithm [59,60]. The proposed algorithm falls in the class of greedy algorithms for sparse recovery and extends the standard OMP algorithm, which is designed for linear reconstruction problems, to nonlinear parameter estimation problems. The organization of this paper is as follows: Section 2 presents an introduction to sparse parameterization techniques. In Section 3, we present the derivation of the iterative stochastic ensemble method (ISEM) for parameter estimation. In Section 4, we present a novel sparse nonlinear parameter estimation algorithm based on the ISEM and the OMP algorithm. The proposed algorithm is a nonlinear extension to the orthogonal matching pursuit algorithm with an ensemble based approximate sensitivities. Section 5, starts by presenting a brief formulation of the subsurface flow problem followed by several test problems for evaluating the proposed algorithm. In Section 6, we present a discussion of the numerical results followed by the current work conclusions.
2. Sparse parameterization Spatially distributed parameter fields (e.g. permeability or porosity) are commonly parameterized to enhance the efficiency of calibration methods [13,14,38,53]. Here, we adopt a novel parameterization that builds a large dictionary of basis functions. These large dictionaries can reconstruct non-gaussian models and/or mixture of different models [32]. Linear sparse reconstruction is concerned with building the set of basis functions of the search space (dictionary building) as well as finding the weights of these basis functions to reconstruct a certain field. We start by presenting reconstruction methods assuming the set of basis functions (dictionary) is already known. Following that, we present a method for building large collection of basis functions in what is denoted as dictionary learning. In this section, we limit the presentation to linear cases.
2.1. Linear sparse reconstruction Given a dictionary of basis functions W 2 Rmn of n basis functions, each of size m, the sparse reconstruction problem is concerned with representing the unknown field as a linear combination of dictionary elements via a vector of the unknown weights x = ½x1 ; x2 ; . . . ; xn > 2 Rn . Let S = fi :j xi j – 0g be the support of x and let WðSÞ be the set of basis (atoms) of W corresponding to the support S. The vector x is said to be k-sparse if the cardinality of the set S is no more than k (i.e., j S j6 k). Recovery of a high-dimensional sparse signal from a small number of noisy linear measurements is a fundamental problem in the field of compressed sensing (CS) [7,12]. For a linear sparse reconstruction problem, one is interested in finding a sparse weight vector x to reconstruct the signal y 2 Rm1 such that y ¼ Wx, where W is the dictionary defined as W ¼ ½w1 ; w2 ; . . . ; wn and wi denotes the ith column of W. Throughout the paper the matrix W and its ith column are called dictionary and the ith atom of W, respectively.
16
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
Two classes of sparse reconstruction methods have been extensively used [15]. One class is based on solving an optimization problem that penalizes the solution in ‘1 -norm and minimizes the reconstruction error. In the presence of noise, the recovery proceeds by minimizing the ‘1 -norm in addition to penalizing the ‘2 -norm of the measurement constraint [7]. The second class of methods are greedy algorithms, that iteratively finds elements from the dictionary that are maximally correlated with the residuals. The orthogonal matching pursuit (OMP) [60], utilized in the current paper, falls in the class of greedy algorithms for sparse recovery. Methods that relies on the ‘1 -minimization are based on linear programming, which has its advantages and disadvantages. Utilizing linear programming algorithms as a black box can be considered as an advantage as any development in the linear programming algorithms will reduce the running time of the sparse recovery method. On the other hand, there is no strongly polynomial time algorithm in linear programming yet [46]. OMP is quite fast, both theoretically and experimentally. Each iteration of the OMP algorithm amounts to one matrix multiplication and solving a least squares problem of limited size in comparison to the dictionary size. This yields strongly polynomial running time [46]. In practice, OMP is observed to perform faster and is relatively easy to implement [46,60]. In addition, OMP is quite transparent [46]: at each iteration, it selects a new atom (basis) from the dictionary in a very specific and natural way. The algorithm transparency enabled us to extend the algorithm to deal with the nonlinear reconstruction problem in conceptually straightforward way.
Orthogonal matching pursuit (OMP) [59,60] is an iterative greedy algorithm that tackels the sparse reconstruction problem. The algorithm is an extension of the basis pursuit algorithms [41,42,51]. The OMP algorithm has been applied to sparse signal recovery in many studies [12,28,60]. The algorithm tries to solve the problem x
Learning or building a dictionary aims to provide a pool of basis functions in which a few basis functions can be linearly combined to approximate a novel signal or field. Assuming a set of signals, denoted by Y ¼ ½y1 ; . . . ; yi ; . . . ; yN , where yi is the ith signal, are available. Dictionary learning methods try to solve the following optimization problem N X fX; Wg ¼ argmin kyi Wxi k22 þ kkxi k1
2.2. The OMP algorithm
minkxk0 subject to : ky Wxk2 6 g;
2.3. Dictionary learning
ð1Þ
where kxk0 is the ‘0 norm that counts the number of non-zero components of a vector x and g is an error tolerance for the least squares solution. The reconstructed field (signal) y is approximated iteratively by a linear combination of a few atoms in the dictionary W. These few atoms (basis functions) are included in the active set which is built column by column, in a greedy fashion. At each iteration, the most correlated column of dictionary with the current residual vector is added to the active set. The OMP algorithm pseudocode is detailed in Algorithm 1. We assume that the atoms are normalized (i.e., kwi k2 ¼ 1, for i ¼ 1; 2; . . . ; n). We denote the support of x by S # f1; 2; . . . ; ng, which is defined as the set of indices corresponding to the nonzero components of x. WðSÞ denotes the matrix formed by picking the atoms of W corresponding to indices in S. The OMP algorithm starts with x ¼ 0 and iteratively constructs a k-term approximation to y by maintaining a set of active atoms (initially empty), and expanding the set by one additional atom at each iteration. The atom chosen at each stage maximally reduces the residual ‘2 error in approximating y from the currently active atoms. At each iteration, the ‘2 norm of the residual is evaluated and if it falls below a specified threshold, the algorithm terminates. Updating the provisional solution relies on the Moore–Penrose pseudoinverse of a matrix M using SVD, denoted as Mþ . The OMP can be considered as a stepwise forward selection algorithm [21].
X;W
s:t: kwi k22 6 1;
i¼1
ð2Þ
8i ¼ 1; . . . ; N;
where X ¼ ½x1 ; . . . ; xN is the coefficient matrix and k is a regularization parameter. Constraining the ‘1 norm in the dictionary learning optimization problem is equivalent to obtaining sparse solution [7,12,15]. The optimization problem formulated in Eq. (2) is nonconvex and NP-hard. Popular dictionary learning algorithm, namely the K-SVD [1] and MOD (Method of Optimal Directions) [18,19], attempt to approximate the solution using a relaxation technique that fixes all the parameters but one at each iteration and then optimizes the objective function. Both the K-SVD and MOD methods converge to a local minimum that is strongly dependent on the initial dictionary. Dictionary learning algorithms iterate between solving a set of sparse reconstruction problems and updating the dictionary. Thus, the first step in each iteration n of a dictionary learning algorithm optimizes the parameters X given a fixed dictionary Wn1 as N X Xn ¼ argmin kyi Wn1 xi k22 þ kkxi k1 X
ð3Þ
i¼1
This problem can be solved approximately using sparse reconstruction algorithm. K-SVD algorithm utilizes the OMP algorithm in this step. Next, Xn is kept fixed and the representation error is minimized over the dictionary by solving
W ¼ argmin W
s:t:kwi k22
N X kyi Wxni k22 i¼1
6 1;
ð4Þ
8i ¼ 1; . . . ; N;
where xni are elements of the coefficient matrix Xn . The difference between MOD and K-SVD lies in the choice of optimization method for
17
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
the second step. In this study, we utilize the K-SVD algorithm for dictionary learning [1] and its efficient implementation developed by Rubinstein et al. [55]. For a detailed description of the K-SVD algorithm, interested readers are referred to the orignal work by Aharon et al. [1] or a reproduction of the K-SVD algorithm by Khaninezhad et al. [32] within the context of subsurface flow model calibration.
Calibration of subsurface flow models given a dictionary of basis functions for the unknown fields is a nonlinear parameter estimation problem. The proposed parameter estimation method relies on the Gauss–Newton method and stochastic estimation of the derivatives using an ensemble of directional derivatives. Assuming the numerical simulator as a multi-input multi-output nonlinear function, the simulator output for a given set of input parameters x is defined as y ¼ HðxÞ. Given a set of observations yobs , one is interested in finding a set of parameters xest that minimizes the squared error function
ð5Þ
ð6Þ
The Jacobian rFðxÞ of FðxÞ has the components @ j Fi ðxÞ and the gradient vector GðxÞ ¼ r 12 kFðxÞk2 ¼ rFðxÞ> FðxÞ. The general strategy when solving non-linear optimization problems is to solve a sequence of approximations to the original problem [44]. At each iteration, a correction Dx to the vector x is estimated. For non-linear least squares, an approximation can be constructed by using the linearization Fðx þ DxÞ FðxÞ þ rFðxÞDx, which leads to the following linear least squares problem
1 min krFðxÞDx þ FðxÞk2 : Dx 2
ð7Þ
Further, it is easy to see that solving Eq. (7) is equivalent to solving the normal equation
rFðxk Þ> rFðxk Þ ðxkþ1 xk Þ ¼ rFðxk Þ> Fðxk Þ;
ð8Þ
where xk is the current value at iteration k. A Newton like iterative update equation easily follows as
1 xkþ1 ¼ xk rFðxk Þ> rFðxk Þ rFðxk Þ> Fðxk Þ:
Fðx þ huÞ FðxÞ ; h
ðrU FðxÞÞU> ¼ rFðxÞ UU> ;
ð13Þ
from which, the standard derivative can be evaluated as
rFðxÞ ¼ ðrU FðxÞÞU> UU>
1
:
ð14Þ
For each ensemble member i, the directional derivative around xk is
ð15Þ
where ui is a zero mean random perturbation in all components of x. For the ensemble of directional derivatives, we can re-write the directional derivative in a matrix format as
rU Fðxk Þ ¼ R1=2 Y;
ð16Þ
where Y is of size no ne and each column i corresponds to ðHðxk þ ui Þ Hðxk ÞÞ. The matrix form of Eq. (14) is then
rFðxÞ ¼ R1=2 YU> UU>
1
:
ð17Þ
Substituting this ensemble based approximate derivative in Eq. (9), an iterative parameter estimation equation is obtained as
xkþ1 ¼ xk þ
1 > 1=2 > > 1 1 R1=2 YU> UU> R YU UU
1 > R1=2 YU> UU> Fðxk Þ:
ð18Þ
Further simplification results in
1=2 > > 1=2 > 1 xkþ1 ¼ xk þ UU> YU R YU R > R1=2 YU> Fðxk Þ:
ð19Þ
This formula is the main update equation of the proposed iterative stochastic ensemble method (ISEM). This update equation will be utilized within the nonlinear orthogonal matching pursuit algorithm presented hereafter. 4. Sparse nonlinear parameter estimation algorithm
ð10Þ
where h is the step size. The directional derivative is related to the standard derivative by the following relation
ru FðxÞ ¼ rFðxÞ u;
where rU FðxÞ is an ensemble of directional derivatives of size no ne where ne is the ensemble size and U is the perturbation matrix of size nx ne used in estimating the directional derivatives. Multiply both sides from the right side with U> one gets
ð9Þ
For high dimensional search spaces, the evaluation of the gradient rFðxk Þ with simple differencing methods is not computationally feasible. The gradient can be evaluated efficiently using adjoint code but it is not always available for many numerical simulators. Here, we utilize directional derivatives in random directions u, defined as
ru FðxÞ ¼
ð12Þ
ðrU Fðxk ÞÞi ¼ R1=2 ðHðxk þ ui Þ Hðxk ÞÞ;
where O is the objective function and R is the output error covariance matrix. The least squares nature of the objective function enables the definition of the missmatch function FðxÞ ¼ R1=2 ðyobs HðxÞÞ. The function F can be thought of as a multiple output function with no outputs, where n0 is the number of observations. With this formalization, one is interested in solving the following optimization problem
1 argmin kFðxÞk2 : 2 x
Directional derivatives are utilized within a stochastic ensemble method for parameter estimation. We use an ensemble of perturbations to approximate the standard derivative (gradient) from an ensemble of directional derivatives as
rU FðxÞ ¼ rFðxÞU;
3. Parameter estimation algorithm
1 OðxÞ ¼ ðyobs HðxÞÞ> R1 ðyobs HðxÞÞ; 2
3.1. Iterative stochastic ensemble method (ISEM)
ð11Þ
where, ru FðxÞ is of size no 1; rFðxÞ is of size no nx ; nx is the size of the search space and u is of size nx 1. In all subsequent formulations we assume a unit step size h.
First, we want to simplify the update Eq. (19) by studying the properties of the ðUU> Þ term. In case of generating ui ¼ k wi , where wi Nð0; 1Þ and k is a constant, the covariance of the perturbation matrix ðUU> Þ asymptotically equals 2k ðne 1ÞI, where I is the identity matrix and ne is the ensemble size. Fig. 1 illustrates the convergence of the covariance of random perturbations to the identity matrix for different ensemble sizes. Once this is realized, the update equation can be simplified to
xkþ1 ¼ xk þ
R1=2 YU>
> 1 R1=2 YU>
> R1=2 YU> 2k ðne 1ÞFðxk Þ :
ð20Þ
Here, we want to highlight the difference between the nonlinear parameter estimation problem and the linear reconstruction
18
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
−1
(a)
(b)
(c)
Fig. 1. The estimated covariance matrix ðUU> Þ multiplied by ne 1, where ne is the ensemble size. Part (a) ensemble of 10 members. Part (b) ensemble of 20 members. Part (c) ensemble of 40 members.
problem presented in Section 2. For the nonlinear case, we express the unknown fields in terms of the basis functions included in the dictionary W and the corresponding weights vector x. The calibration process tries to minimize the mismatch function between the observations yobs and the simulator output in the ‘2 norm as kyobs HðWxÞk2 . This minimization problem is solved iteratively by the update Eq. (20). Adding a sparsity constrain on the solution vector x can be implemented by viewing the update Eq. (20) as the normal equation of the following system
Ak Dxk ¼ bk ;
1=2
>
ð21Þ
where Ak ¼ R YU ; bk ¼ 1ÞFðxk Þ and Dxk ¼ ðxkþ1 xk Þ. A sparse solution of this normal equation can be found by the OMP algorithm. However, in contrast to the linear reconstruction problem, the problem is nonlinear and the matrix A is the sensitivity of the solution to the different dictionary atoms. At each nonlinear iteration of the parameter estimation algorithm, a new sensitivity matrix Ak is estimated. A direct application of the OMP algorithm at the linearized iteration level will produce sparse updates of the parameters. However, a number of sparse updates might not produce a sparse solution after few nonlinear iterations. This is attributed to the lack of any link (in the solution support sense) between the different updates across the nonlinear iterations. In order to solve this problem, we propose a natural extension of the OMP algorithm as greedy algorithm to nonlinear problems by storing the discovered solution support between the subsequent nonlinear iterations. This is consistent with the logic of the OMP as a greedy algorithm. Once an atom of the dictionary is included in the support it is then carried over all subsequent update iterations. The pseudocode of the nonlinear OMP for sparse calibration combined with the ISEM is listed in Algorithm 2. We note two major changes of NOMP from the standard OMP algorithm. First, the solution support is carried between the nonlinear iterations. Second, once the solution support is identified, we use ‘2 regularization for calculating the residuals but we limit the solution space to the identified support. The ‘2 regularization is needed as the estimated sensitivity matrix Ak is rank deficient and might contain sampling errors. Tikhonov regularization [24] is applied to the update Eq. (21) as 2 k ðne
1 Dxk ¼ Ak ðSk Þ> Ak ðSk Þ þ kI Ak ðSk Þ> bk ;
ð22Þ
where Ak ðSk Þ is a restriction of matrix A to the selected support Sk and k is the regularization parameter. We utilize the L-curve method [17,22,23] for automatic selection of the regularization parameter. Tikhonov regularization replaces the Moore–Penrose pseudoinverse in the OMP algorithm. The proposed NOMP combines ISEM for estimating the sensitivities based on an ensemble methods with the OMP at each iteration. It is relatively simple to implement and requires a limited number of input constants that need to be adjusted. We also note that the method can handle different types of observation data.
In the current algorithm, at each iteration the ensemble members are generated by adding random perturbations. These perturbations mimic a random stencil for stochastic estimation of the gradient direction. The magnitude of the perturbations is decreased as we approach the solution via a decaying function. In all our numerical testing, the random perturbations are drawn from the Gaussian distribution Nð0; Inx Þ and are scaled by a scaler k defined by logarithmic rule proposed by Kushner et al. [32] as c= logðk þ 1Þ, where c is a user input and k is the iteration number. However, other forms of decaying sequences [19,20] can be used. In order to ensure error reduction, the iterative update Eq. (20) is modified by introducing a step size a which takes an initial value of 1 and is adjusted to ensure error reduction. The modified update equation is
xkþ1 ¼ xk þ aDxk
ð23Þ
In the numerical testing, a is multiplied by one half if no error reduction is achieved. This is repeated for up to 5 times and if no error reduction is achieved, the current iteration of the stochastic algorithm is skipped and another ensemble is generated starting
19
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
from the parameter values in the previous iteration. A more sophisticated step size selection using Wolfe or Goldstein condition could be applied within a line search strategy [47]. 5. Problem formulation and numerical evaluation A two-phase immiscible flow in a heterogeneous porous subsurface region is considered. For clarity of exposition, gravity and capillary effects are neglected. However, the proposed model calibration algorithm is independent of the selected physical mechanisms. The two phases will be referred to as water with the subscript w for the aqueous phase and oil with the subscript o for the non-aqueous phase. This subsurface flow problem is described by the mass conservation equation and Darcy’s law
r v t ¼ q;
v t ¼ Kkt ðSw Þrp
over X;
ð24Þ
where v t is the total Darcy velocity of the engaging fluids, q ¼ Q o =qo þ Q w =qw is the normalized source or sink term, K is the absolute permeability tensor, Sw is the water saturation, kt ðSw Þ ¼ kw ðSw Þ þ ko ðSw Þ is the total mobility and p ¼ po ¼ pw is the pressure. In which, qw ; qo are the water and oil fluid densities, respectively. These equations can be combined to produce the pressure equation
r ðKkt ðSw ÞrpÞ ¼ q:
ð25Þ
The pore space is assumed to be filled with fluids and thus the sum of the fluid saturations should add up to one (i.e., So þ Sw ¼ 1). Then, only the water saturation equations is solved
/
@Sw Q þ r ðf ðSw Þv t Þ ¼ w ; @t qw
ð26Þ
where / is the porosity, f ðSw Þ ¼ kw =kt is the fractional flow function. The relative mobilities are modeled using polynomial equations of the form
kw ðSw Þ ¼
ðSnw Þ2
lw
; ko ðSw Þ ¼
ð1 Snw Þ2
lo
; Snw ¼
Sw Swc ; 1 Sor Swc
The reference permeability fields for test Problem 1 and test problem 2 are shown in Fig. 2b and Fig. 2c, respectively, where both fields represent channelized models. Different realizations of channelized models are generated using the Stanford Geostatistical Modeling Software, S-GeMS [49] based on the training image shown in Fig. 2a. The training image shown in Fig. 2a is based on a similar example published in [55]. Fig. 3 shows ten different unconstrained realizations generated by S-GeMS. A total of one thousand different realizations were generated and used as an input to the K-SVD algorithm to produce a sparse parameterization of the search space. Fig. 4 shows 15 basis functions randomly selected from a dictionary of 500 basis functions built using the KSVD algorithm with target sparsity of 20 elements. In the numerical testing, the discretized model uses a 2D regular grid of 50 50 blocks in the x and y directions, respectively. The size of each grid block is 10 m in each direction and a unit thickness in the z direction. The porosity is assumed to be constant in all grid blocks and equals 0:2. The water viscosity lw is set to 0:3 cp and the oil viscosity lo is set to 3 cp. The irreducible water saturation and irreducible oil saturation are set as Sor ¼ Swc ¼ 0:2 and the simulations are run until 1 pore volume is injected. For each test problem, two injection/production patterns are used. Fig. 5 shows the location of injection wells as a black dots and the production wells as white dots. The first pattern has one injection well and four production wells arranged in the inverted five spot pattern and shown in Fig. 5a. For pattern 2, shown in Fig. 5b, 9 production wells are distributed around 4 injection wells. For the parameter estimation problem, the production curves at the production wells are used to define the misfit function and guide the inverse problem solution. Each water-cut curve was sampled at 50 points and these samples were used for calculating the errors and the update equation. The observation data (water-cut values) is perturbed with uncorrelated white noise with a small standard deviation of 106 to be able to perform a convergence study for different ensemble sizes.
ð27Þ 5.2. Water flooding test case 1
where Swc is the connate or irreducible water saturation, Sor is the irreducible oil saturation and lw ; lo are the water and oil fluid viscosities, respectively. The pressure Eq. (25) is discretized using the standard two-point flux approximation (TPFA) method and the saturation Eq. (26) is discretized using a finite-volume scheme and solved implicitly by a standard Newton–Raphson iteration [11]. For simplicity, we limit the parameter estimation to the subsurface permeability map K. We also assume this permeability field as a lognormal random variable as it is usually heterogeneous and shows a high range of variability.
(a)
5.1. K-SVD parameterization
This test case utilizes the reference permeability field shown in Fig. 2b. For comparison purposes, all different runs were initialized using uninformed prior of uniform permeability field with logðKÞ ¼ 0. For injection/production pattern 1, the optimized permeability fields are shown in Fig. 6 as they result from different ensemble sizes of 5; 10 and 20 members. The smallest ensemble of 5 members managed to reproduce the locations of the two channels running along the model. However, this is not to be expected from every run of the algorithm because of the approximate nature
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
−1
(b)
(c)
Fig. 2. Details of the reference permeability fields for the first and second test problems. Part (a) shows the log-permeability field (in Darcy) for the training image. Part (b) shows the reference log-permeability field for test Problem 1. Part (c) shows the reference log-permeability field for test Problem 2.
20
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
Fig. 3. Different realizations of the unconditioned log-permeability field obtained by the SNESIM algorithm as implemented in S-GeMS [49].
Fig. 4. Few basis functions from an overcomplete dictionary generated using K-SVD algorithm.
of the estimated derivatives. Also, the problem is ill-posed and might admit different solutions. Fig. 7 shows the corresponding weights of the basis functions selected from the dictionary for ensembles of 5; 10 and 20 members. For an ensemble of 5 members, the inferred support have 73 non-zero bases as shown in Fig. 7a. The initial RMSE for the initial permeability field of logðKÞ ¼ 0 is 0:0691. Table 1, shows the final RMSE values for different ensemble sizes. The number of nonlinear iterations is set to 40; 20 and 10 for the ensemble size of 5; 10 and 20, respectively. This corresponds to the same number of 200 forward runs. It is observed that smaller ensembles produced solutions with larger support because of the increased number of nonlinear iterations. This is reflected in the final RMSE as it is smaller for smaller ensembles in comparison to larger ensembles after 200 forward runs. For injection pattern 2, the optimized permeability fields are shown in Fig. 8a–c, for ensembles of 5; 10 and 20 members,
respectively. The stem plot of the discovered weights is shown in Fig. 9. The initial RMSE for the initial permeability field of logðKÞ ¼ 0 is 0:0729. Table 2, shows the final RMSE values for different ensemble sizes. The number of nonlinear iterations is set to 40; 20 and 10 for the ensemble size of 5; 10 and 20, respectively. Again, it is observed that smaller ensembles are more effective in matching the data as the number of nonlinear iterations are larger for the same number of forward runs. 5.3. Water flooding test case 2 Fig. 2c shows the reference permeability field for this test problem. All other constants are similar to test case 1. For injection pattern 1, Fig. 10a–c, shows the optimized log permeability fields for an ensemble of 5, 10 and 20 members, respectively. The calibrated field obtained by an ensemble of 5 members clearly shows the
21
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
1
1
0.5
0.5
0
0
−0.5
−0.5
−1
−1
(a)
(b)
Fig. 5. Injection production patterns (injector highlighted as black dot and producer highlighted as white dot). Part (a) pattern 1 and Part (b) pattern 2
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
(a)
−1
(b)
(c)
Fig. 6. Calibrated log-permeability field for different ensemble sizes for test case 1 under injection/production pattern 1. (a) n = 5, (b) n = 10 and (c) n = 20.
5
3
4
2
3
0 −1 −2
Weight
0
1
Weight
Weight
0.5
1
2
−1 −2
0 −0.5
−3
−3
−1 −4
−4 −5 0
1
100
200
300
400
500
−5 0
100
200
300
Basis Index
Basis Index
(a)
(b)
400
500
−1.5 0
100
200
300
400
500
Basis Index
(c)
Fig. 7. Stem plot of the weights of the calibrated permeability fields for test case 1 under injection/production pattern 1. (a) n = 5, (b) n = 10 and (c) n = 20.
Table 1 Results of NOMP for Problem 1 with injection/production pattern 1. Ensemble size
Solution cardinality
Final RMSE
5 10 20
60 58 31
0.0408 0.0514 0.0667
channelized features as in the reference field. The corresponding stem plots of the basis functions weights are shown in Fig. 11. The initial RMSE for the initial permeability field of logðKÞ ¼ 0 is 0.0695. Table 3, shows the RMSE values after 200 forward runs for different ensemble sizes. The number of nonlinear iterations is set to 40, 20 and 10 for the ensemble size of 5, 10 and 20 members, respectively. The results are consistent with those from the previous example. Smaller ensembles produce a less accurate
gradient estimation but updates the solution support more frequently and thus succeeded in producing smaller mismatch errors in comparison to larger ensembles for the same number of forward runs. The recovered permeability field shows channelized features as the basis functions included in the dictionary are channelized. For injection pattern 2, Fig. 12a–c, show the optimized log permeability fields for an ensemble of 5, 10 and 20 members, respectively. The corresponding stem plots of the basis functions weights are shown in Fig. 13. The initial RMSE for the initial permeability field of logðKÞ ¼ 0 is 0.0678. Table 4, shows the final RMSE values for different ensemble sizes. The number of nonlinear iterations is set to 40, 20 and 10 for the ensemble size of 5, 10 and 20 members, respectively. The outperformance of small ensembles evident in these results in terms of final RMSE after 200 forward runs.
22
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
−1
(a)
(b)
(c)
Fig. 8. Calibrated log-permeability field for different ensemble sizes for test case 1 under injection/production pattern 2. (a) n = 5, (b) n = 10 and (c) n = 20.
6
5
3
4 4
2 3
0 −2
1
Weight
2
Weight
Weight
2
1 0 −1
0 −1
−2 −4
−2 −3
−6 0
100
200
300
400
500
−4 0
100
Basis Index
200
300
400
500
−3 0
Basis Index
(a)
100
200
300
400
500
Basis Index
(b)
(c)
Fig. 9. Stem plot of the weights of the calibrated permeability fields for test case 1 under injection/production pattern 2. (a) n = 5, (b) n = 10 and (c) n = 20.
5.4. Convergence study Table 2 Results of NOMP for Problem 1 with injection/production pattern 2. Ensemble size
Solution cardinality
Final RMSE
5 10 20
48 36 33
0.0372 0.0528 0.0579
In all cases, we observe that the optimized field does not show extreme values in contrast with the results in [29,30]. This is attributed to applying ‘2 regularization once the solution support is discovered. The ‘2 regularization has the advantage of penalizing large weights that produces realizations with extreme permeability values.
(a)
In this section we perform a complete convergence study for the two test cases presented earlier under two injection/production patterns. The stochastic nature of the estimated gradients results in different solution path for each different run. The average of 50 different runs is presented to compare the ensemble size effect on the error reduction rates. Fig. 14 shows the average RMSE in water cut versus the total number of forward runs for test case 1 (left) and test case 2 (right), under injection/production pattern 1 (top) and pattern 2 (bottom). It is evident that smaller ensembles are more effective for sparse calibration in terms of error reduction and support detection for the same number of forward runs. These results are valid for the two test cases under the utilized two injection/production patterns. Smaller ensembles outperformed larger ensembles on average in all the numerical test cases because of the increased number of nonlinear iterations for the same number
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
−1
(b)
(c)
Fig. 10. Calibrated log-permeability field for different ensemble sizes for test case 2 under injection/production pattern 1. (a) n = 5, (b) n = 10 and (c) n = 20.
23
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
4
4
2
3
1.5
3 2
1
0 −1 −2
Weight
2
Weight
Weight
1
1
0.5 0
0 −0.5
−3 −1
−1
−4 −5 0
100
200
300
400
−2 0
500
100
200
300
400
−1.5 0
500
100
200
300
Basis Index
Basis Index
Basis Index
(a)
(b)
(c)
400
500
Fig. 11. Stem plot of the weights of the calibrated permeability fields for test case 2 under injection/production pattern 1. (a) n = 5, (b) n = 10 and (c) n = 20.
of forward runs. At each nonlinear iteration, the support is updated as well as the major search directions are detected. Thus, applying more nonlinear iterations have a positive effect on the error reduction. In the current study, we did not utilize any convergence criteria to terminate the algorithm. We did run algorithm with different ensemble sizes using the same number of forward runs to compare the performance in terms of error reduction. However, any convergence criteria can be easily set as a function of the error reduction percentage or as an error threshold. Defining these criteria is a standard procedure for any nonlinear solver.
Table 3 Results of NOMP for Problem 2 with injection/production pattern 1. Ensemble size
Solution cardinality
Final RMSE
5 10 20
67 45 32
0.0456 0.0614 0.0645
1
1
1
0.5
0.5
0.5
0
0
0
−0.5
−0.5
−0.5
−1
−1
(a)
−1
(b)
(c)
Fig. 12. Calibrated log-permeability field for different ensemble sizes for test case 2 under injection/production pattern 2. (a) n = 5, (b) n = 10 and (c) n = 20.
6
2
4
1
0 −2
Weight
0.5
Weight
Weight
1
0
2
−1 −2
−1
−4 100
200
300
400
500
−5
0 −0.5
−3
−4 −6 0
1.5
0
100
200
300
400
500
−1.5
0
100
200
300
Basis Index
Basis Index
Basis Index
(a)
(b)
(c)
400
500
Fig. 13. Stem plot of the weights of the calibrated permeability fields for test case 2 under injection/production pattern 2. (a) n = 5, (b) n = 10 and (c) n = 20.
24
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
Table 4 Results of NOMP for Problem 2 with injection/production pattern 2. Ensemble size
Solution cardinality
Final RMSE
5 10 20
52 40 31
0.0508 0.0589 0.0631
6. Discussion and conclusions
−1.3
10
Ensemble size = 5 Ensemble size = 10 Ensemble size = 20
0
Error in Water Fractional Flow curve (RMSE)
Error in Water Fractional Flow curve (RMSE)
−1.2
10
50
100
150
0
50
100
150
Number of forward runs
(b)
−1.3
10
−1.4
10
Ensemble size = 5 Ensemble size = 10 Ensemble size = 20
0
Ensemble size = 5 Ensemble size = 10 Ensemble size = 20
(a)
−1.2
−1.5
−1.3
10
Number of forward runs
10
10
−1.2
10
200
Error in Water Fractional Flow curve (RMSE)
Error in Water Fractional Flow curve (RMSE)
The solution of the nonlinear sparse calibration problem is challenging. Not only, the algorithm has to find the optimal weights to reproduce the measured values, it has to select the basis functions that are included in the solution support as well. A complete combinatorial exploration by running standard parameter estimation algorithms on a subset of the basis functions leads to a combinatorial problem of huge size that is impossible to solve. In the linear setting, different algorithms for sparse reconstruction can be used.
These algorithms can be simply classified as forward stage wise selection algorithms such as the OMP algorithm and optimization based algorithms such as the iterative re-weighted least square algorithm (IRLS). In the current paper, we utilized the OMP algorithm for solving the sparse calibration problem. OMP has the advantage of low computational complexity and conceptual clarity in comparison to other sparse signal recovery methods [10,28]. In contrast to the IRLS, the OMP algorithm depends on a tolerance parameter of the solution residual that is conceptually easy to specify. However, a direct application of OMP will result in a collection of sparse updates that does not guarantee a final sparse solution. We modified the OMP in a logically consistent way with the greedy nature of the algorithm by carrying over the discovered solution support across the nonlinear iterations. The transparency of the OMP algorithm enabled an easy extension to developed NOMP algorithm. In terms of results, the calibrated models using the NOMP algorithm did not show extreme values as in the results presented by
50
100
150
200
200
−1.2
10
−1.3
10
Ensemble size = 5 Ensemble size = 10 Ensemble size = 20
−1.4
10
0
50
100
150
Number of forward runs
Number of forward runs
(c)
(d)
200
Fig. 14. Average RMSE in water cut curve versus the total number of forward runs for different ensemble sizes with different injection/production patterns. (a) test case 1, injection/production pattern 1, (b) test case 2, injection/production pattern 1 (c) test case 1, injection/production pattern 2 and (d) test case 2, injection/production pattern 2.
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
Khaninezhad et al. [32,33]. This is attributed to applying ‘2 regularization at each iteration once the solution support is discovered. The ‘2 regularization has the advantage of penalizing large weights that produces realizations with extreme permeability values. This is also evident from the stem plots showing the weights of the different dictionary atoms. This is a clear advantage of NOMP over other sparse reconstruction algorithms that only penalize the ‘1 norm of the solution. Another advantage of the proposed algorithm is in the efficient use of an ensemble based approximate derivative using the ISEM. The ISEM is an iterative Newton like algorithm for nonlinear parameter estimation. The proposed algorithm combining the ISEM and NOMP facilitates sparse calibration as a blackbox for numerical simulation packages when the adjoint code is not available. The proposed algorithm is a parameter estimation algorithm and does not estimate the parameters uncertainties. It is an iterative ensemble based method that is adapted for sparse calibration. It shares many commonalities with iterative EnKF methods [17,34,36,40,56]. All these iterative schemes have a common challenge of updating uncertainties at each iteration. In cases of large parameters space, the curse of dimensionality implies that the support of the posterior PDF is likely to be much smaller than the prior. This increases the probability of the case where none of the sampled parameters lies in that region. This might result in over estimation of the posterior PDF spread. On the other hand, Palmer and Hagedorn [50] highlighted that repeated use of the data generally results in an underestimation of uncertainty or what is called data overfitting. The proposed algorithm is a parameter estimation algorithm and does not try to update the error covariance. Instead, we utilized a standard technique commonly used in stochastic optimization methods where a gain sequence is formulated with a decaying magnitude to describe the random perturbations [35]. If one is interested in evaluating the parameters uncertainties, Eq. (21) in [61] can be utilized at the converged solution. Another way to estimate uncertainties is to embed the proposed algorithm based on NOMP and ISEM within the randomized maximum likelihood (RML) [49,54], to solve a set of inverse problems with perturbed objective functions [37]. In the RML method, different realizations can be obtained as the maximum a posteriori estimator (MAP) for each of the perturbed objective functions. For the linear cases, the ensemble of MAP estimators will provide an ensemble of equiprobable realizations of the unknown parameter fields and can be used as a discrete representation of the posterior PDF. However, for nonlinear cases, the theoretical justification is missing and the only formal methods are based on Bayesian algorithms. References [1] Aharon M, Elad M, Bruckstein A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 2006;54(11):4311–22. http://dx.doi.org/10.1109/TSP.2006.881199. [2] Ahmed HE, Mary FW, Ibrahim H. Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models. Journal of Hydrology, 0022–1694 2013, http://dx.doi.org/10.1016/j.jhydrol.2013.03.037. [3] Ahmed HE, Mary FW, Ibrahim H. An iterative stochastic ensemble method for parameter estimation of subsurface flow models. J Comput Phys, 0021–9991 2013, http://dx.doi.org/10.1016/j.jcp.2013.01.047. [4] Ahmed HE, Reza Tavakoli, Mary FW, Ibrahim H. Boosting iterative stochastic ensemble method for nonlinear calibration of subsurface flow models. Comput Meth Appl Mech Eng, 0045–7825 2013, http://dx.doi.org/10.1016/ j.cma.2013.02.012. [5] Berkooz G, Holmes P, Lumley JL. The proper orthogonal decomposition in the analysis of turbulent flows. Ann Rev Fluid Mech 1993;25(1):539–75. http:// dx.doi.org/10.1146/annurev.fl.25.010193.002543. http:// www.annualreviews.org/doi/abs/10.1146/annurev.fl.25.010193.002543. [6] Bhark Eric W, Jafarpour Behnam, Datta-Gupta Akhil. A generalized grid connectivity-based parameterization for subsurface flow model calibration. Water Resour Res 06 2011;47(6):W06517. http://dx.doi.org/10.1029/ 2010WR009982.
25
[7] Candes EJ, Wakin MB. An introduction to compressive sampling. IEEE Signal Process Mag 2008;25(2):21–30. http://dx.doi.org/10.1109/MSP.2007.914731. ISSN 1053-5888. [8] Carrera Jesus, Alcolea Andres, Medina Agustin, Hidalgo Juan, Slooten Luit J. Inverse problem in hydrogeology. Hydrogeol J 2005;13(1):206–22. http:// dx.doi.org/10.1007/s10040-004-0404-7. [9] Chartrand R, Yin Wotao. Iteratively reweighted algorithms for compressive sensing. In: IEEE international conference on acoustics, speech and signal processing, ICASSP 2008, 2008;31:3869–3872. http://dx.doi.org/10.1109/ ICASSP.2008.4518498. [10] Chen S, Donoho D, Saunders M. Atomic decomposition by basis pursuit. SIAM J Sci Comput 1998;20(1):33–61. http://dx.doi.org/10.1137/ S1064827596304010. [11] Chen Zhangxin. Reservoir simulation: mathematical techniques in oil recovery. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics; 2007. http://dx.doi.org/10.1137/1.9780898717075. [12] Donoho DL. Compressed sensing. IEEE Trans Inf Theory, 0018-9448 2006;52(4):1289–306. http://dx.doi.org/10.1109/TIT.2006.871582. [13] Dostert P, Efendiev Y, Mohanty B. Efficient uncertainty quantification techniques in inverse problems for Richards’ equation using coarse-scale simulation models. Adv Water Resour 2009;32(3):329–39. http://dx.doi.org/ 10.1016/j.advwatres.2008.11.009. [14] Efendiev Y, Datta-Gupta A, Ginting V, Ma X, Mallick B. An efficient two-stage Markov chain Monte Carlo method for dynamic data integration. Water Resour Res 2005;41(12):W12423. http://dx.doi.org/10.1029/2004WR003764. [15] Elad Michael. Sparse and redundant representations – from theory to applications in signal and image processing. Springer; 2010, ISBN 978-14419-7010-7. http://dx.doi.org/10.1007/978-1-4419-7011-4. [16] Elsheikh Ahmed H, Jackson Matt D, Laforce Tara C. Bayesian reservoir history matching considering model and parameter uncertainties. Math Geosci 2012;44(5):515–43. http://dx.doi.org/10.1007/s11004-012-9397-2. [17] Elsheikh Ahmed H, Pain CC, Fang F, Gomes JLMA, Navon IM. Parameter estimation of subsurface flow models using iterative regularized ensemble Kalman filter. Stochastic Environ Res Risk Assess., 1436-3240 2012:1–21. http://dx.doi.org/10.1007/s00477-012-0613-x. [18] Engan K., Aase S.O., Hakon Husoy J. Method of optimal directions for frame design. In: Proceedings of IEEE international conference on acoustics, speech, and signal processing 1999;5:2443–2446. http://dx.doi.org/10.1109/ ICASSP.1999.760624. [19] Engan Kjersti, Skretting Karl, Husøy John Håkon. Family of iterative ls-based dictionary learning algorithms, ils-dla, for sparse signal representation. Digital Signal Processing 2007;17(1):32–49. ISSN 1051-2004. http://dx.doi.org/ 10.1016/j.dsp.2006.02.002.
. [20] Engl Heinz W, Grever Wilhelm. Using the L-curve for determining optimal regularization parameters. Numer Math 1994;69(1):25–31. http://dx.doi.org/ 10.1007/s002110050078. [21] Evensen Geir. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research A: Space Physics 1994;99(C5):10143–10162. http:// dx.doi.org/10.1029/94JC00572. [22] Fang Haitao, Gong Guanglu, Qian Minping. Annealing of iterative stochastic schemes. SIAM J Control Optim 1997;35(6):1886–907. http://dx.doi.org/ 10.1137/S0363012995293670. [23] Gelfand Saul, Mitter Sanjoy. Simulated annealing type algorithms for multivariate optimization. Algorithmica 1991;6(1):419–36. http://dx.doi.org/ 10.1007/BF01759052. [24] Guyon Isabelle, Elisseeff André. An introduction to variable and feature selection. J Mach Learn Res 2003;3:1157–82. http://jmlr.csail.mit.edu/papers/ v3/guyon03a.html. [25] Hansen PC, O’Leary DP. The use of the L-curve in the regularization of discrete ill-posed problems. SIAM J Sci Comput 1993;14:1487–503. http://dx.doi.org/ 10.1137/0914086. [26] Hansen Per Christian. Analysis of ill-posed problems by means of the L-curve. SIAM Rev 1992;34:561–80. http://dx.doi.org/10.1137/1034115. [27] Hansen Per Christian. Rank-deficient and discrete ill-posed problems: numerical aspects of linear inversion. Philadelphia: SIAM; 1998. http:// dx.doi.org/10.1137/1.9780898719697. [28] Huang Shisheng, Zhu Jubo. Recovery of sparse signals using OMP and its variants: convergence analysis based on rip. Inverse Prob 2011;27(3):035003. http://dx.doi.org/10.1088/0266-5611/27/3/035003. http://stacks.iop.org/ 0266-5611/27/i=3/a=035003. [29] Jafarpour Behnam, McLaughlin Dennis B. Reservoir characterization with the discrete cosine transform. SPE J 2009;14(1):182–201. http://dx.doi.org/ 10.2118/106453-PA. [30] Kac M, Siegert AJF. An explicit representation of a stationary Gaussian process. Ann Math Statist 1947;18:438–42. http://www.jstor.org/stable/2235740. [31] Karhunen Kari. Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Ann Acad Sci Fennicae Ser A.I. Math-Phys 1947;37:79. http://dx.doi.org/ 10.1088/0266-5611/27/3/035003. [32] Khaninezhad Mohammadreza Mohammad, Jafarpour Behnam, Li Lianlin. Sparse geologic dictionaries for subsurface flow model calibration: Part I. Inversion formulation. Adv Water Resour, 0309-1708 2012;39(0):106–21. http://dx.doi.org/10.1016/j.advwatres.2011.09.002. http:// www.sciencedirect.com/science/article/pii/S0309170811001692.
26
A.H. Elsheikh et al. / Advances in Water Resources 56 (2013) 14–26
[33] Khaninezhad Mohammadreza Mohammad, Jafarpour Behnam, Li Lianlin. Sparse geologic dictionaries for subsurface flow model calibration: Part II. Robustness to uncertainty. Adv Water Resour, 0309-1708 2012;39(0):122–36. http://dx.doi.org/10.1016/j.advwatres.2011.10.005. . [34] Krymskaya MV, Hanea RG, Verlaan M. An iterative ensemble Kalman filter for reservoir engineering applications. Comput Geosci 2009;13(2):235–44. http:// dx.doi.org/10.1007/s10596-008-9087-9. [35] Kushner HJ. Asymptotic global behavior for stochastic approximation and diffusions with slowly decreasing noise effects: global minimization via Monte Carlo. SIAM J Appl Math 1987;47(1):169–85. http://dx.doi.org/10.1137/ 0147010. http://epubs.siam.org/doi/abs/10.1137/0147010. [36] Li Gaoming, Reynolds Albert C. Iterative ensemble Kalman filters for data assimilation. SPE J 2009;14(3):496–505. http://dx.doi.org/10.2118/109808-PA. [37] Li Gaoming, Reynolds Albert C. Uncertainty quantification of reservoir performance predictions using a stochastic optimization algorithm. Comput Geosci, 1420-0597 2011;15:451–62. http://dx.doi.org/10.1007/s10596-0109214-2. [38] Li Wei, Cirpka Olaf A. Efficient geostatistical inverse methods for structured and unstructured grids. Water Resour Res 2006;42(6):W06402. http:// dx.doi.org/10.1029/2005WR004668. [39] Loève M. Fonctions Aléatoires de second order. Supplement to P. Levy, Processus Stochastiques et Mouvement Brownien. Gauthier-Villars: Paris; 1948. [40] Lorentzen Rolf J, Nævdal Geir. An iterative ensemble Kalman filter. IEEE Trans Autom Control 2011;56(8):1990–5. http://dx.doi.org/10.1109/ TAC.2011.2154430. [41] Mallat S., Zhang Z. Adaptive time-frequency decomposition with matching pursuits. In: Proceedings of the IEEE-SP international symposium time– frequency and time-scale analysis, Oct 1992, p. 7–10. http://dx.doi.org/ 10.1109/TFTSA.1992.274245. [42] Mallat SG, Zhang Zhifeng. Matching pursuits with time-frequency dictionaries. IEEE Trans Signal Process Dec 1993;41(12):3397–415. http://dx.doi.org/ 10.1109/78.258082. [43] McLaughlin Dennis, Townley Lloyd R. A reassessment of the groundwater inverse problem. Water Resour Res 1996;32(5):1131–61. http://dx.doi.org/ 10.1029/96WR00160. [44] Moradkhani Hamid, Sorooshian Soroosh, Gupta Hoshin V, Houser Paul R. Dual state-parameter estimation of hydrological models using ensemble Kalman filter. Adv Water Resour, 0309-1708 2005;28(2):135–47. http://dx.doi.org/ 10.1016/j.advwatres.2004.09.002. http://www.sciencedirect.com/science/ article/pii/S0309170804001605. [45] Nævdal Geir, Johnsen Liv Merete, Aanonsen Sigurd I, Vefring Erlend H. Reservoir monitoring and continuous model updating using ensemble Kalman filter. SPE J 2005;10(1):66–74. http://dx.doi.org/10.2118/84372-PA. [46] Needell Deanna, Vershynin Roman. Uniform uncertainty principle and signal recovery via Regularized Orthogonal Matching Pursuit. Found Comput Math, 1615-3375 April 2009;9(3):317–34. http://dx.doi.org/10.1007/s10208-0089031-3.
[47] Nocedal Jorge, Wright Stephen J. Numerical optimization. Springer Verlag; 2006. http://dx.doi.org/10.1007/978-0-387-40065-5. [48] Oliver Dean S, Cunha Luciane, Reynolds Albert C. Markov chain Monte Carlo methods for conditioning a permeability field to pressure data. Math Geol 1997;29(1):61–91. http://dx.doi.org/10.1007/BF02769620. [49] Oliver D.S., He N., Reynolds A.C. Conditioning permeability fields to pressure data. In: 5th European conference on the mathematics of oil recovery, Sep 1996, p. 259–269. [50] Palmer Tim, Hagedorn Renate. Model error in weather and climate forecasting. Cambridge University Press; 2006. http://dx.doi.org/10.1017/ CBO9780511617652.016. [51] Pati Y.C., Rezaiifar R., Krishnaprasad P.S. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: 1993 Conference record of the 27th asilomar conference on signals, systems and computers, vol. 1, Nov 1993, p. 40–44. http:// dx.doi.org/10.1109/ACSSC.1993.342465. [52] Remy Nicolas. S-GeMS: the stanford geostatistical modeling software: a tool for new algorithms development. In: Leuangthong Oy, Deutsch Clayton V, editors. Geostatistics banff. Quantitative geology and geostatistics, 14. Netherlands: Springer; 2005. p. 865–71. [53] Reynolds AC, Nanqun He, Lifu Chu, Oliver DS. Reparameterization techniques for generating reservoir descriptions conditioned to variograms and well-test pressure data. SPE J 1996;1(4):413–26. http://dx.doi.org/10.2118/30588-PA. [54] Reynolds Albert C, Nanqun He, Oliver Dean S. Reducing uncertainty in geostatistical description with well-testing pressure data. AAPG Mem, 02718529 1999;71:149–62. http://archives.datapages.com/data/specpubs/ memoir71/m71ch10/m71ch10.htm. [55] Rubinstein R, Zibulevsky M, Elad M. Double sparsity: learning sparse dictionaries for sparse signal approximation. IEEE Trans Signal Process, 1053587X 2010;58(3):1553–64. http://dx.doi.org/10.1109/TSP.2009.2036477. [56] Sakov Pavel, Oliver Dean S, Bertino Laurent. An iterative EnKF for strongly nonlinear systems. Mon Weather Rev 2012;140(6):1988–2004. http:// dx.doi.org/10.1175/MWR-D-11-00176.1. [57] Sarma Pallav, Durlofsky Louis, Aziz Khalid. Kernel principal component analysis for efficient, differentiable parameterization of multipoint geostatistics. Math Geosci 2008;40(1):3–32. http://dx.doi.org/10.1007/ s11004-007-9131-7. [58] Strebelle Sebastien. Conditional simulation of complex geological structures using multiple-point statistics. Mathe Geol 2002;34(1):1–21. http:// dx.doi.org/10.1023/A:1014009426274. [59] Tropp JA. Greed is good: algorithmic results for sparse approximation. IEEE Trans Inf Theory, 0018-9448 2004;50(10):2231–42. http://dx.doi.org/10.1109/ TIT.2004.834793. [60] Tropp JA, Gilbert AC. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans Inf Theory, 0018-9448 Dec 2007;53(12):4655–66. http://dx.doi.org/10.1109/TIT.2007.909108. [61] Zupanski Milija, Navon I Michael, Zupanski Dusanka. The maximum likelihood ensemble filter as a non-differentiable minimization algorithm. Q J Roy Meteorol Soc 2008;134(633):1039–50. http://dx.doi.org/10.1002/qj.251.