Field-scale history matching with sparse geologic dictionaries

M. Reza M. Khaninezhad, Behnam Jafarpour

PII: S0920-4105(18)30512-6
DOI: 10.1016/j.petrol.2018.06.024
Reference: PETROL 5031
To appear in: Journal of Petroleum Science and Engineering
Received Date: 19 July 2017
Revised Date: 4 May 2018
Accepted Date: 9 June 2018

Please cite this article as: Khaninezhad, M.R.M., Jafarpour, B., Field-scale history matching with sparse geologic dictionaries, Journal of Petroleum Science and Engineering (2018), doi: 10.1016/j.petrol.2018.06.024.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Field-Scale History Matching with Sparse Geologic Dictionaries

M. Reza M. Khaninezhad, Behnam Jafarpour*

University of Southern California
* Corresponding author

Keywords: Model Calibration, Inverse Modeling, Sparse Reconstruction, Compressed Sensing, Geologic Dictionaries, K-SVD Algorithm, L1-Norm Minimization

Running Title: Field-Scale Sparse History Matching

Corresponding Author (Contact Information):
Behnam Jafarpour
925 Bloom Walk, HED 313
Mork Family Department of Chemical Engineering and Material Science
University of Southern California (USC), Los Angeles, CA 90089
E-mail: [email protected]
Phone: (213) 740-2228


Abstract

Inference of reservoir flow properties from scattered production data poses a poorly constrained inverse problem with non-unique solutions. Parsimonious description of reservoir properties can alleviate problem ill-posedness and help preserve the expected geologic continuity of the solution. Low-rank representations that exploit the prevalent correlations in spatially distributed rock properties provide compact descriptions for reservoir property distributions. Recent advances in sparse reconstruction, known as compressed sensing, have inspired efficient and flexible model calibration techniques that offer a departure from classical reduced-order parameterization techniques, where only prior knowledge is used to select the parameterization subspace. Sparse model calibration allows for dynamic subspace identification by adaptively searching through a large combination of possible relevant subspaces to identify an appropriate one that best explains the observed data. We recently presented sparse geologic dictionaries as an effective and flexible method for representation of complex geologic connectivity patterns to facilitate the search for plausible solutions. In this paper, we demonstrate the field-scale application of learned sparse geologic dictionaries for history matching. We use a gradient-based implementation of the history matching problem to investigate the feasibility of adopting the developed method for field-scale application. The paper presents practical considerations, important properties, and alternative implementations of the method for practitioners. The benefits and limitations of applying the method to reservoir history matching are evaluated and discussed. The performance of the sparse history matching approach with geologic dictionaries is also compared with the classical truncated singular value decomposition parameterization in three field-scale case studies. The results show that sparse model calibration with geologic dictionaries offers superior history matching performance in estimating reservoir properties in realistic settings, where the trade-offs between problem dimensionality, model resolution, computation speed, geologic uncertainty, and solution plausibility must be reconciled effectively.

1. Introduction


Reservoir history matching involves solving an inverse problem in which a plausible predictive model is constructed by minimizing the mismatch between observed and model-predicted data. The unknown parameters can include physical rock and fluid properties, initial and boundary conditions, source/sink terms, or a combination of them (Sagar et al., 1975). Most documented inverse problems in reservoir modeling fall into the first category, where the primary focus is on identifying the heterogeneous distribution of important reservoir properties (such as log-permeability or porosity distribution) that control the fluid flow and production behavior in the reservoir. The formulation and solution of the resulting inverse problem have been widely discussed in both the groundwater and petroleum reservoir modeling literature (McLaughlin and Townley, 1996; Yeh, 1986; Carrera, 2005; Oliver and Chen, 2011). While the history matching process is used to ensure that a given model reproduces the historical production measurements in the field, the main objective is to improve the predictive power of the model by updating its underlying rock properties. A major difficulty in solving subsurface characterization inverse problems is data limitation (in both quantity and content) as well as the nonlinear and complex relation between production data and the unknown parameters of interest. In general, the number of parameters used to describe distributed subsurface properties with a high enough resolution is overwhelmingly larger than the number of available measurements. Even for linear problems, this situation leads to underdetermined inverse problems with a non-empty null space and infinitely many solutions. Moreover, the flow and pressure data provide an averaged-out (lumped) nonlinear response of the reservoir over the support of these data (e.g., regions between injection/production wells). Consequently, many solutions can be constructed to reproduce the measurements and yet provide distinct predictions, resulting in significant uncertainty in the predicted future performance of the field (Liu and Jafarpour, 2013; Zhou et al., 2014).


To constrain the set of plausible solutions, the dynamic well/field production data are typically augmented with implicit or explicit prior knowledge about the acceptable spatial variability of the solutions. Parameterization (Tavakoli and Reynolds, 2010; Khaninezhad and Jafarpour, 2013; Bhark et al., 2011; Jacquard and Jain, 1965; Jahns, 1966; Bissell, 1994; Brun et al., 2001) and regularization (Tikhonov and Arsenin, 1977; Khaninezhad and Jafarpour, 2013, 2014; Neuman, 1973) are commonly adopted to incorporate prior knowledge into the inverse modeling formulation to make the problem better-posed. The former explicitly reduces the number of parameters while the latter constrains the variability of the parameters, which amounts to reducing the number of independently varying parameters. In some cases, as in this paper, one can combine parameterization and regularization methods to take advantage of specialized regularization forms that can only be invoked when specific representations of model parameters (reservoir properties) are used. Such is the case in transform-domain sparse representation of geologic formations, which is discussed in this paper.

1.1. Linear Transforms for Sparse Parameterization


A general approach to parameterization of spatially correlated reservoir properties is linear transformation with specialized expansion functions (e.g., those with a decorrelating property). In this case, the transformation results in a significantly smaller set of often uncorrelated parameters that closely approximate the original properties and form a small search space for model calibration (Khaninezhad et al., 2012; Khaninezhad and Jafarpour, 2014). Several low-rank parameterization approaches have been developed and applied in hydrogeology and petroleum engineering. An important geostatistical parameterization method for calibration of groundwater models is the pilot point method (LaVenue and Pickens, 1992; Alcolea et al., 2006). Zonation and its adaptive multi-resolution variants (Jacquard and Jain, 1965; Jahns, 1966; Bissell, 1994; Brun et al., 2001; Grimstad et al., 2003) have also been applied to calibrate both groundwater aquifers and petroleum reservoirs against dynamic data. An important class of parameterization techniques comprises the transform-domain methods, most notably the truncated singular value decomposition (TSVD) and principal component analysis (PCA) (Gavalas et al., 1976; Jafarpour and McLaughlin, 2009), the discrete cosine transform (DCT) (Ahmed et al., 1974; Britanak et al., 2007; Jafarpour and McLaughlin, 2009; Bhark et al., 2011a) and its generalized form, the grid-connectivity transform (GCT) (Bhark et al., 2011b), and the discrete wavelet transform (DWT) basis (Mallat, 2018; Lu and Horne, 2000; Jafarpour, 2010). The general form of a linear transform parameterization can be expressed as

u = u0 + Φv    (1)


where u0 ∈ ℝ^N is the vector corresponding to an average (constant) model, and the term Φv represents variations around the average model using a weighted combination of a small set of basis elements to approximately represent the original model. The columns of Φ ∈ ℝ^(N×K) consist of basis vectors, and the entries of v contain the weights given to each basis vector.


Since the hydraulic properties of geologic formations tend to exhibit piecewise smooth and continuous features, appropriate mathematical or geological transforms Φ can be adopted to compactly represent them. In general, each column of Φ corresponds to a basis element, typically encoding a distinct continuity pattern (feature). Hence, the history matching solution can be expressed as a sparse linear combination of a small number of relevant basis elements (features). The choice of Φ is closely related to the availability of prior knowledge about the reservoir property of interest (Jafarpour and McLaughlin, 2009; Bhark et al., 2011). Linear transformation bases can either be learned from explicit prior geologic model realizations (when available) or they can be borrowed from a host of pre-constructed basis functions that have desirable compression properties (e.g., the Fourier and wavelet bases). For instance, the PCA basis functions are constructed using the eigenvectors of the parameter (sample) covariance matrix, whereas the DCT parameterization technique uses low-frequency features, corresponding to large-scale spatial continuity patterns.
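As a concrete illustration of the PCA construction just described (and of the expansion u = u0 + Φv in Eq. (1)), the sketch below builds a basis from synthetic prior realizations. The realization generator and all dimensions (N, L, K) are illustrative stand-ins, not the paper's field models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for L prior realizations of an N-cell property field
# (smooth 1-D profiles built from a few random sine modes).
N, L = 200, 50
x = np.linspace(0, 1, N)
U = np.stack([sum(rng.normal() * np.sin((k + 1) * np.pi * x) for k in range(4))
              for _ in range(L)], axis=1)          # N x L training matrix

# PCA basis: eigenvectors of the sample covariance = left singular vectors
# of the mean-removed training matrix.
u0 = U.mean(axis=1, keepdims=True)                 # average model u0
Phi, svals, _ = np.linalg.svd(U - u0, full_matrices=False)

# Low-rank representation u ~ u0 + Phi_K v  (Eq. 1) with K << N.
K = 4
u = U[:, 0:1]                                      # a model to approximate
v = Phi[:, :K].T @ (u - u0)                        # projection coefficients
u_hat = u0 + Phi[:, :K] @ v

rel_err = np.linalg.norm(u - u_hat) / np.linalg.norm(u)
print(K, round(float(rel_err), 6))
```

Because the synthetic realizations span only four modes, the K = 4 reconstruction is essentially exact; for real geologic models the truncation level controls the approximation error.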

1.2. Generic Transforms for Data Integration


Pre-constructed (i.e. generic) basis functions that are used in image compression, such as the DCT (Ahmed et al., 1974; Britanak et al., 2007; Jafarpour and McLaughlin, 2009; Bhark et al., 2011a), the GCT (Bhark et al., 2011) and the DWT (Mallat, 2018; Lu and Horne, 2000; Jafarpour, 2010), achieve excellent compression performance in representing smooth and piecewise smooth images. Hence, for most natural images, only a small fraction of the transformed coefficients is sufficient to capture the most salient features of the original image. If the image to be compressed is known, the subset of significant basis elements and their corresponding weights can be easily identified. However, in history matching applications, where the goal is to reconstruct unknown spatial property distributions from limited production data, the significant basis functions and their corresponding coefficients are generally unknown.
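The compression behavior described above can be made concrete with a small numpy sketch. The DCT-II matrix is built by hand so the example is self-contained, and the smooth synthetic "image" is only a stand-in for a property map:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    D = np.cos(np.pi * (2 * np.arange(n)[None, :] + 1) * k / (2 * n))
    D[0] *= np.sqrt(1.0 / n)
    D[1:] *= np.sqrt(2.0 / n)
    return D

# Smooth synthetic "image": most DCT energy sits in a few coefficients.
n = 64
x = np.linspace(0, 1, n)
img = np.outer(np.sin(2 * np.pi * x), np.cos(np.pi * x))

D = dct_matrix(n)
coef = D @ img @ D.T                       # 2-D DCT coefficients

# Keep only the largest ~2% of coefficients, zero the rest, and invert.
keep = int(0.02 * coef.size)
thresh = np.sort(np.abs(coef).ravel())[-keep]
coef_sparse = np.where(np.abs(coef) >= thresh, coef, 0.0)
img_hat = D.T @ coef_sparse @ D            # inverse DCT (orthonormal basis)

rel_err = np.linalg.norm(img - img_hat) / np.linalg.norm(img)
print(keep, round(float(rel_err), 4))
```

The small relative error with only ~2% of the coefficients retained is exactly the compression property exploited by DCT-based parameterization.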


To accurately represent complex geometric features (for example, meandering fluvial deposits) with generic transforms, many high- and low-resolution basis components must be included in the linear expansion. However, data inadequacy limits our ability to resolve such high-resolution details in an inverse problem. To compensate for the lack of sufficient data, it is common to resort to prior knowledge/assumptions about the relevant connectivity patterns in the formation (e.g., a training dataset) to develop more specialized basis functions. Such a training dataset can often be derived from extensive data collection efforts, site surveys, and geologic modeling. Geostatistical simulation techniques can then be used to generate a large collection of conditional realizations that quantify the associated uncertainty (Journel and Deutsch, 1995; Strebelle, 2002; Caers and Zhang, 2004; Arpat and Caers, 2005; Hu and Chugunova, 2008; Mahmud and Mariethoz, 2014a, 2014b). The resulting stochastic prior models and their statistical properties provide important information about the expected connectivity patterns in the field. Hence, they can guide the solution of the history matching problem. As is common in machine learning, these prior models can be used as training data to learn effective geologically-inspired basis elements. A classical example of learned basis functions, which has received widespread application, is the PCA basis (Gavalas et al., 1976; Jafarpour and McLaughlin, 2009), which is derived from eigendecomposition of the sample covariance matrix of the prior dataset.

Recently, we introduced sparse geologic dictionaries for flexible and compact parametric representation of prior models for model calibration (Khaninezhad et al., 2012), and presented preliminary results on their application to the history matching problem (Khaninezhad and Jafarpour, 2015). Inspired by recent advances in sparse signal processing and compressed sensing (Donoho, 2006; Candes and Wakin, 2008; Chen et al., 1998; Tropp and Gilbert, 2007; Rudelson and Vershynin, 2008), we also formulated a novel sparse model calibration framework that takes advantage of sparse geologic dictionaries to advance the state of the art in low-rank parameterization, from searching for a solution in a pre-selected subspace to an effective formulation that allows for dynamic subspace selection during model calibration (Khaninezhad et al., 2012). Furthermore, the flexibility in subspace selection of sparse reconstruction algorithms enables automatic identification of the optimal parameterization level. This is achieved by automatically selecting the number of active dictionary elements based on the available dynamic production data during inversion. In a recent conference paper (Khaninezhad and Jafarpour, 2015), we presented initial results on the use of sparse geologic dictionaries for field-scale history matching. In this paper, we demonstrate the application of the proposed sparse history matching formulation to realistic three-dimensional field cases with complex geologic structures. The focus of the paper is, hence, on a more intuitive presentation of the formulation and its practical implementation to help proper application by practitioners. Additionally, the paper highlights and discusses the main improvements and possible extensions that the proposed approach offers relative to existing parameterized history matching techniques. In particular, the dynamic basis selection property of the proposed formulation can be exploited to formulate robust history matching under uncertainty in the prior geologic scenario and to enable prior model identification. Finally, comparisons with the well-established PCA and truncated SVD parameterization techniques are presented to illustrate the superior performance of the proposed framework.

2. Methodology

We begin by discussing the main idea behind our formulation. Consider the estimation of a grid-based representation of an unknown spatial property u (e.g., permeability) with N gridblocks. In our history matching formulation, we aim to retrieve the important connectivity features that can be resolved from the limited available production data. To this end, we define a geologic feature space of dimension K << N, in which a solution will be sought. This reduced representation is possible and important because a grid-based description is overly redundant, and the data resolution and content do not warrant a fine-scale model description (Jafarpour and McLaughlin, 2009). In practice, N could be of order 10^6-10^7 while K would be of order 10^3. That is, we anticipate that a few thousand geologic features should adequately capture the range of variability in the existing connectivity patterns of the given formation. In addition, out of the large number K of diverse geologic features, only a small number S << K of them are needed to represent a single model realization (the value of S depends on model complexity, but is expected to be less than 10% of K). In other words, each model realization resides in a very small S-dimensional subspace within the geologic feature space of dimension K. This motivates the search for a sparse approximation of the solution in the geologic feature space.


Mathematically, we can now write a general S-term approximation of the N-dimensional vector u (i.e., the desired spatial property) in terms of the K geologic feature elements as follows:

û = Σ_{j=1}^{K} v_j φ_j = Φv    (2)

where Φ_{N×K} = [φ_1 | φ_2 | ⋯ | φ_K] and the v_j represent the geologic features and their corresponding expansion coefficients, respectively, and û is the approximation of the property of interest u that compactly represents its salient features. If û has only S << K active expansion terms, then only S out of the K coefficients v_j will be non-zero. Furthermore, the number of grid blocks N is overwhelmingly larger than the dimension of the search space K. The first step, before we proceed to the history matching formulation, is to learn from available training data geologic features with these specified properties, mainly ones that afford a sparse representation of the solution with the parameters discussed above; this procedure is known as sparse dictionary learning (discussed in the next section).
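One standard way to obtain such an S-term approximation is a greedy solver such as Orthogonal Matching Pursuit (OMP). The sketch below uses a random orthonormal dictionary as a hypothetical stand-in (learned geologic dictionaries are redundant rather than orthonormal, but the S-term selection works the same way):

```python
import numpy as np

def omp(Phi, u, S):
    """Greedy S-term approximation u ~ Phi v (Orthogonal Matching Pursuit)."""
    residual, support = u.copy(), []
    for _ in range(S):
        # Pick the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], u, rcond=None)
        residual = u - Phi[:, support] @ coef
    v = np.zeros(Phi.shape[1])
    v[support] = coef
    return v

rng = np.random.default_rng(1)
N, K, S = 100, 40, 3                  # grid cells, features, active terms

# Orthonormal toy dictionary and a model u built from S of its K atoms.
Phi = np.linalg.qr(rng.normal(size=(N, K)))[0]
v_true = np.zeros(K)
v_true[rng.choice(K, S, replace=False)] = [5.0, -4.0, 3.0]
u = Phi @ v_true

v_hat = omp(Phi, u, S)
print(int(np.count_nonzero(v_hat)))   # only S of the K coefficients are active
```

The recovered coefficient vector has exactly S non-zero entries, i.e., the solution lives in an S-dimensional subspace of the K-dimensional feature space.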


Once the choice of the geologic basis (or dictionary) Φ_{N×K} = [φ_1 | φ_2 | ⋯ | φ_K] is made, two questions must be answered to find a solution: (1) how to identify the subset of S significant (out of K) components, and (2) what are the corresponding coefficients (weights) v_{j=1:S} that provide the "best" solution (data match). The first question translates to identifying the best approximation subspace (typically a combinatorial optimization problem), while the second question is related to finding the weights or expansion coefficients in the identified subspace. In classical parameterization of inverse problems, the answers to these questions are found independently. That is, the S basis elements that define the solution subspace Φ_S = [φ_1 | φ_2 | ⋯ | φ_S] are first determined from available training data prior to model calibration. The model calibration problem is, therefore, reduced to using the production data to find the best weights v_{j=1:S} for the selected solution subspace. A popular example is principal component analysis, in which the first S leading eigenvectors of the parameter sample covariance matrix are selected as expansion functions φ_{j=1:S} for model calibration. The production data are then used to find the corresponding coefficients v_{j=1:S}. This traditional approach is illustrated in Figure 1a.


In the proposed sparse model calibration (Figure 1b), the two questions discussed above (subspace identification and coefficient estimation) are addressed simultaneously during model calibration. That is, the production data are used to (i) identify the S-dimensional subspace from a set of K >> S dictionary elements and (ii) simultaneously estimate the corresponding coefficients v_{j=1:S} (see the workflow in Figure 1b). This is achieved by including a sparsity-promoting regularization term in the history matching objective function to zero out many irrelevant coefficients (and hence remove their corresponding geologic features from the linear expansion). In the following subsections, we briefly describe the classical SVD parameterization and its extension to the K-SVD formulation for constructing sparse geologic dictionaries, followed by an overview of our sparse model calibration framework.

2.1 Parameterization with Singular Value Decomposition (SVD)


To implement the proposed sparse history matching approach, the first part of the formulation requires constructing a sparse geologic dictionary from prior training data. Given a set of L prior model realizations of an N-dimensional model u, i.e., U_{N×L} = [u_1 | u_2 | ... | u_L], an effective approximation is obtained by projecting a model onto the left singular vectors of the matrix U. The SVD decomposition of the matrix U_{N×L} can be expressed as U_{N×L} = Φ_{N×N} Σ_{N×L} V^T_{L×L}. It is easy to show that the left singular vectors of U are equivalent to the eigenvectors of the sample covariance matrix and that they can be derived from the following minimization problem:

Φ̂_S = arg min_{Φ_S, V} ‖U − Φ_S V‖²_F,   s.t.   Φ_S^T Φ_S = diag(s),   V^T V = I    (3)

where ‖·‖_F denotes the Frobenius norm and V_{S×L} = [v_1 | v_2 | ⋯ | v_L] is the matrix containing the right singular vectors (i.e., the loadings matrix) (Golub and Reinsch, 1970). In this notation, the vector s contains the singular values of U and Φ_S represents the sorted left singular vectors of U. The minimization problem in Eq. (3) searches for rank-S representations of the matrix U with minimum mean-squared error. The leading S elements contain a significant portion of the information (energy) in U. On the other hand, the remaining (N−S) left singular vectors tend to contain small-scale (higher-frequency) details that are truncated with minimal impact on the approximation quality. Note that, in this case, the S-dimensional approximation subspace is determined purely by using the prior training data, i.e., without including the dynamic data in the process. In the Results section, we compare the performance of our sparse geologic dictionary formulation with this truncated SVD parameterization. Furthermore, in the limit as K = S, a rank-K approximation is achieved to include smaller-scale details. Next, we consider the same training data U_{N×L} = [u_1 | u_2 | ... | u_L] and use an alternative formulation to construct a sparse geologic dictionary.
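A minimal numpy sketch of this truncated SVD approximation on a synthetic training matrix (the matrix sizes and the magnitude of the small-scale "detail" term are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic N x L training matrix whose columns share S dominant patterns.
N, L, S = 300, 60, 5
U = rng.normal(size=(N, S)) @ rng.normal(size=(S, L))
U += 0.01 * rng.normal(size=(N, L))            # small-scale "detail"

Phi, s, Vt = np.linalg.svd(U, full_matrices=False)

# Rank-S reconstruction from the S leading left singular vectors.
U_S = (Phi[:, :S] * s[:S]) @ Vt[:S]
energy = np.sum(s[:S] ** 2) / np.sum(s ** 2)   # retained "energy" fraction
rel_err = np.linalg.norm(U - U_S) / np.linalg.norm(U)
print(round(float(energy), 4), round(float(rel_err), 4))
```

Truncating to the leading S singular vectors retains nearly all of the energy in U; only the small-scale detail term is discarded.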


2.2 Sparse Dictionary Learning with the K-SVD Algorithm


Following the introduction of sparse reconstruction, formalized as compressed sensing, recent years have witnessed a growing interest in searching for sparse representations of various types of signals. Signal representations with the smallest support (fewest non-zero entries) are appealing as they tend to offer simpler interpretations and superior approximation quality. For these reasons, sparse representations have found application in a wide range of fields, including signal processing, data compression, imaging sciences, inverse modeling, feature extraction, and machine learning (Aharon et al., 2006; Elad and Aharon, 2006; Petra et al., 2008; Bach et al., 2011). In data-deficient geoscience inverse problems, learning sparse dictionaries from a prior training dataset offers a particularly useful way of incorporating prior information. Sparse dictionary learning from a training dataset U_{N×L} = [u_1 | u_2 | ... | u_L] can be formulated using either of the following minimization problems:

{Φ̂, V̂} = arg min_{Φ,V} ‖U − ΦV‖²_F   s.t.   ‖v_i‖₀ ≤ S, ∀i ∈ {1, ..., L}    (4)

{Φ̂, V̂} = arg min_{Φ,V} Σ_i ‖v_i‖₀   s.t.   ‖U − ΦV‖²_F ≤ ε    (5)

where ‖·‖₀ is the l0 semi-norm of a vector and ‖·‖²_F represents the squared Frobenius norm of its argument. Intuitively, Eq. (4) minimizes the training data mismatch (i.e., approximation error) subject to each column of V having a support size of at most S. On the other hand, Eq. (5) provides a least-squares solution to the system of equations U = ΦV while minimizing the support of each column of the matrix V. Although there is no known practical optimization algorithm that simultaneously solves for the dictionary Φ and the corresponding transform-domain representations of the prior models, i.e., V, approximate heuristic methods, such as the Method of Optimal Directions (MOD) and the K-SVD algorithm (the latter using greedy sparse solvers such as Orthogonal Matching Pursuit (OMP) for its sparse coding step), are extensively applied to practical problems in the literature (Aharon et al., 2006; Lebrun and Leclaire, 2012; Khaninezhad et al., 2012). In this paper, we use the K-SVD algorithm to construct sparse geologic dictionaries from a training dataset of prior model realizations. The K-SVD algorithm can be viewed as an extension of the K-means clustering algorithm (MacQueen, 1967; Jain et al., 1999), which is formulated as

{Φ̂, V̂} = arg min_{Φ,V} ‖U − ΦV‖²_F   s.t.   ‖v_i‖₀ = 1, ∀i ∈ {1, ..., L}    (6)

It is easy to verify that the K-means formulation is a special case of the sparse learning algorithm in which S = 1. The K-SVD algorithm is an approximate solution method in which the optimization problem is decoupled into two individual problems that are solved iteratively, one for finding the collection of sparse vectors V when the dictionary is fixed (sparse coding) and another for estimating the dictionary Φ when V is fixed (dictionary update). For completeness, a summary of these steps is provided in Appendix A.
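These two alternating steps can be sketched on synthetic, exactly S-sparse training data as follows. This toy version (OMP-based sparse coding plus per-atom rank-1 SVD updates) is a simplification for illustration, not the paper's implementation:

```python
import numpy as np

def ksvd(U, K, S, n_iter=15, seed=0):
    """Toy K-SVD: alternate OMP sparse coding and per-atom rank-1 SVD updates."""
    rng = np.random.default_rng(seed)
    N, L = U.shape
    Phi = rng.normal(size=(N, K))
    Phi /= np.linalg.norm(Phi, axis=0)               # normalized atoms
    V = np.zeros((K, L))
    for _ in range(n_iter):
        # Sparse coding: S-term OMP for each training column, Phi fixed.
        for i in range(L):
            residual, support = U[:, i].copy(), []
            for _ in range(S):
                support.append(int(np.argmax(np.abs(Phi.T @ residual))))
                coef, *_ = np.linalg.lstsq(Phi[:, support], U[:, i], rcond=None)
                residual = U[:, i] - Phi[:, support] @ coef
            V[:, i] = 0.0
            V[support, i] = coef
        # Dictionary update: rank-1 SVD of each atom's restricted residual.
        for k in range(K):
            users = np.nonzero(V[k])[0]              # samples using atom k
            if users.size == 0:
                continue
            E = U[:, users] - Phi @ V[:, users] + np.outer(Phi[:, k], V[k, users])
            w, sv, zt = np.linalg.svd(E, full_matrices=False)
            Phi[:, k], V[k, users] = w[:, 0], sv[0] * zt[0]
    return Phi, V

# Synthetic training set: every column is an S-term mix of K hidden atoms.
rng = np.random.default_rng(3)
N, L, K, S = 50, 200, 8, 2
atoms = np.linalg.qr(rng.normal(size=(N, K)))[0]
codes = np.zeros((K, L))
for i in range(L):
    codes[rng.choice(K, S, replace=False), i] = rng.normal(size=S)
U = atoms @ codes

Phi, V = ksvd(U, K, S)
rel_err = np.linalg.norm(U - Phi @ V) / np.linalg.norm(U)
print(round(float(rel_err), 4))
```

Every column of the learned V stays at most S-sparse by construction, and the alternating updates drive the training mismatch down over the iterations.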


A few remarks about the properties of dictionary learning methods are in order:

(1) Many learning algorithms, including the K-SVD, are heuristic in nature and do not have a convergence proof.

(2) In many cases the sparse representations may be non-unique; that is, one may find different S-term expansions of the dictionary elements that provide similar approximations to a given property image.

(3) Learning in high-dimensional spaces is computationally demanding. Hence, for large-scale problems it is not recommended to perform the learning in the space domain (large N); instead, the learning can be performed on a dimension-reduced representation of the training data, e.g., in a reduced DCT domain (Liu and Jafarpour, 2013). Appendix B provides a brief discussion on learning sparse representations in a reduced DCT-domain description of the training data to improve the computational complexity of the algorithm without compromising important large-scale reservoir connectivity features.

(4) In some imaging applications, where measurements are not scarce, the K-SVD implementation is performed on small image patches (8×8 or 16×16 segments) to reduce the computational complexity. However, image segmentation can introduce boundary artifacts that degrade the reconstruction quality, especially when limited measurements are available.

(5) In general, the dimensions K, S, and L are problem dependent and their specification may require trial and error. However, K represents the search subspace dimension and depends on the diversity and complexity of the underlying geologic features in the prior training data. On the other hand, S controls the sparsity level and affects the approximation quality. In general, the S/K ratio should be kept below 10% for a sufficiently sparse representation; however, cross-validation is needed to examine the approximation quality of the learned dictionary before proceeding to solve the history matching problem.
Cross-validation can easily be done by excluding some samples from the training data and approximating them using the resulting K-SVD dictionary. Finally, sparse dictionary learning is fundamentally different from traditional low-rank representations, such as the SVD or PCA, in that sparse dictionaries do not pre-select or sort the dictionary elements and instead allow the inversion (production) data to select the significant elements in representing the solution. This is an extremely important property as it enables dynamic or adaptive solution-subspace identification during history matching iterations. Furthermore, the selection property of sparse reconstruction methods offers robustness against uncertainty in the prior training data, even when a very diverse set of geologic features is included; see (Khaninezhad et al., 2012) for details.
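The hold-out check described above can be sketched as follows; for brevity, a TSVD basis stands in for a learned K-SVD dictionary, and the best S-term approximation keeps the largest projection coefficients (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic realizations living in a K-dimensional feature space.
N, L_train, L_test, K, S = 120, 60, 20, 10, 3
features = np.linalg.qr(rng.normal(size=(N, K)))[0]
U_train = features @ rng.normal(size=(K, L_train))
U_test = features @ rng.normal(size=(K, L_test))    # held-out samples

# Stand-in "dictionary": left singular vectors of the training matrix.
Phi = np.linalg.svd(U_train, full_matrices=False)[0][:, :K]

# Cross-validation: best S-term approximation of each excluded sample.
errs = []
for u in U_test.T:
    v = Phi.T @ u                            # all K projection coefficients
    v[np.argsort(np.abs(v))[:-S]] = 0.0      # keep only the S largest
    errs.append(np.linalg.norm(u - Phi @ v) / np.linalg.norm(u))
print(round(float(np.mean(errs)), 3))
```

If the mean hold-out error is unacceptably large, the dictionary size K or sparsity level S should be revisited before history matching.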

2.3. Sparse Model Calibration Formulation


Of interest in this paper is the subsurface flow model calibration problem. Given a forward flow and transport model of the form g (u) = d , which relates the vector containing the unknown parameters u (i.e., reservoir properties) of the inverse problem to the vector of production data d (e.g., well pressure and flowrate responses), the simplest formulation of the model calibration problem seeks to find a solution that minimizes the data mismatch function

J(u) = ‖g(u) − d_obs‖₂²    (7)

where d_obs is the observed data and g(u) = d represents the simulated (predicted) observations. In this notation, the p-norm of a vector x is defined as ‖x‖_p = (Σ_{i=1}^{K} |x_i|^p)^{1/p} for 0 < p < +∞. For p < 1, the above definition does not constitute a norm; however, we loosely refer to it as the lp-norm for all p values. When data is not abundant, the problem is ill-posed, and the solutions are non-unique. A regularization term can be added to the data mismatch function to impart specific attributes (e.g., smoothness) on the solution. In that case, the regularized objective function takes the general form:

J(u) = ‖g(u) − d_obs‖₂² + λ²‖Wu‖₂²    (8)


where λ² is the regularization parameter that controls the trade-off between minimizing the data mismatch and regularization terms, and W is the regularization operator that quantifies a minimization measure (e.g., roughness) to achieve a desired solution property (e.g., smoothness). In classical Tikhonov regularization (Tikhonov and Arsenin, 1977; Leo et al., 1986), W denotes zeroth-, first-, or second-order spatial derivatives, to promote minimum-length, smooth, or flat solutions, respectively. In our sparse model calibration inverse problem, we take advantage of the sparse linear expansion (approximation) u ≈ Φv, and formulate the inverse problem in terms of v (as the new unknown). Note that after finding v, u can be readily obtained through a matrix product. Assuming v is sparse, i.e., it has very few non-zero coefficients, following the sparse reconstruction formulations (Elad, 2010), one could find a sparse solution that provides a good match to the production data by solving the following optimization problem:


min_v J(v) = ‖g(Φv) − d_obs‖²_{C_n} + λ²‖v‖₀    (9)

Here, C_n is a diagonal matrix that contains the weights (typically assigned as data noise variance) for the weighted least-squares function, i.e., ‖x‖²_{C_n} = xᵀC_n x. A difficulty in solving the above optimization problem is the discontinuous and non-smooth (non-differentiable) nature of the l0-norm regularization term ‖v‖₀. Hence, optimization algorithms for smooth functions cannot be used to solve the problem.

SC

To circumvent the complexity of minimizing the discontinuous l0-norm, one could replace it with other sparsity-promoting regularization functions. A special case is when the l0-norm is approximated with its convex l1-norm approximation, which is widely known to have a sparsity-promoting property. For linear problems, that is when g(u) = Gu, this substitution leads to the compressed sensing formulation of Donoho (2006), where the problem reduces to linear programming with a global solution. For nonlinear problems, the l1-norm approximation of the l0-norm leads to a better-behaved optimization problem with similar sparsity-promoting behavior. In general, one could arrive at a sparsity-promoting formulation by replacing the l0-norm with an lp-norm (for 0 < p ≤ 1). This substitution leads to the better-behaved nonlinear optimization problem

min_v J(v) = ||g(Φv) − d_obs||²_Cn + λ2 ||v||_1    (10)
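To make these penalty terms concrete, the following minimal numerical illustration (our own sketch, not part of the paper) evaluates the l0-, l1-, and an lp-penalty for a small coefficient vector:

```python
import numpy as np

v = np.array([0.0, 3.0, 0.0, -1.0, 0.0, 0.5])   # a sparse coefficient vector
l0 = int(np.count_nonzero(v))         # ||v||_0: number of active coefficients
l1 = float(np.sum(np.abs(v)))         # ||v||_1: convex surrogate used in Eq. (10)
lp = float(np.sum(np.abs(v) ** 0.5))  # sum |v_i|^p with p = 0.5 (0 < p <= 1)
print(l0, l1)   # 3 4.5
```

The l1-norm is the tightest convex member of this family, which is why it is the standard surrogate for the combinatorial l0 count.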


However, the l1-norm in the objective function is non-differentiable at the origin, which can complicate gradient-based minimization of the above objective function. In the next section, we present an effective solution strategy to circumvent this issue.

2.4. Iteratively Reweighted Least-Squares (IRLS) Solution Algorithm

We apply the IRLS algorithm to solve the l1-norm regularized least-squares problem. The use of


the IRLS algorithm is primarily motivated by the need to handle the non-differentiable l1-norm term in the objective function. To this end, the l1-norm in Eq. (10) is substituted with a weighted l2-norm (a smooth function) constructed to resemble the behavior of the l1-norm; at convergence, the weighted l2-norm reduces to the l1-norm. With the weighted l2-norm penalty function, the least-squares problem in Eq. (10) is expressed as:

min_v J(v) = ||g(Φv) − d_obs||²_Cn + λ2 ||v||²_W    (11)


in which ||v||²_W = vᵀWv, where W is a diagonal weighting matrix whose diagonal entries at iteration j are given by w_{i,i}^j = 1/√((v_i^{j−1})² + ε²). Notice that the weight at iteration j is the inverse of the (smoothed) sparse coefficient magnitudes at iteration j−1. Inspection of the weight terms shows that, at convergence, the weighted l2-norm is the same as the l1-norm. With this definition, the term ||v||²_W is a differentiable representation of the l1-norm of v, where |v_i| is approximated by √(v_i² + ε²).

To minimize the nonlinear objective function in Eq. (11) with the IRLS algorithm, we apply the Gauss-Newton method. The iterations take the form:

v^{j+1} = (Φᵀ Gᵀ Cn G Φ + λ2 W)⁻¹ (Φᵀ Gᵀ Cn G Φ v^j − Φᵀ Gᵀ Cn Δd^j)    (12)

where Δd^j = g(Φv^j) − d_obs is the data misfit vector at iteration j and G denotes the Jacobian matrix with entries G_ij = ∂g_i/∂u_j. In our field-scale examples, these Jacobian entries are computed as part of the forward model simulation using an efficient adjoint model. The IRLS implementation of the above iterative form uses the weight matrix that was defined above. Table 1 provides a summary of the IRLS implementation that was used in this paper. Additional details about the algorithm can be found in (Li and Jafarpour, 2010).
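For illustration, the IRLS/Gauss-Newton iteration of Eq. (12) can be sketched as below for a linear(ized) forward model g(u) = Gu with a fixed Jacobian; in the field examples the Jacobian entries come from the adjoint simulation instead, and all names and settings here are our own assumptions rather than the authors' implementation:

```python
import numpy as np

def irls_gauss_newton(G, Phi, d_obs, Cn, lam2=1e-3, eps=1e-4, n_iter=30):
    """Sketch of Eq. (12) with IRLS weights w_ii = 1/sqrt(v_i^2 + eps^2)."""
    K = Phi.shape[1]
    v = np.zeros(K)
    A = Phi.T @ G.T @ Cn @ G @ Phi     # fixed for a linear forward model
    B = Phi.T @ G.T @ Cn
    for _ in range(n_iter):
        dd = G @ (Phi @ v) - d_obs     # data misfit at the current iterate
        W = np.diag(1.0 / np.sqrt(v ** 2 + eps ** 2))
        v = np.linalg.solve(A + lam2 * W, A @ v - B @ dd)
    return v

# Tiny synthetic check: recover a 2-sparse coefficient vector.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 20))            # "dictionary" (n x K)
G = np.eye(50)                                 # trivial stand-in Jacobian
v_true = np.zeros(20); v_true[[3, 11]] = [1.0, -2.0]
d_obs = G @ Phi @ v_true
v_est = irls_gauss_newton(G, Phi, d_obs, np.eye(50))
```

Note that for a linear forward model the solve reduces to the regularized normal equations, so successive iterations only update the IRLS weight matrix W.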


In the next section, we apply the above dictionary learning and sparse model calibration formulations to large-scale field case studies to evaluate their practical application and compare their performance to the classical SVD parameterization approach. The examples are based on the Brugge and Norne benchmark case studies, which involve complex three-dimensional reservoirs with multiphase flow systems.


In the examples that follow, we have implemented several history matching algorithms for comparison purposes. The abbreviated forms that we use to refer to these methods are as follows. For parameterization, truncated SVD (TSVD) refers to the S leading left singular vectors of the training data, SVD refers to the rank-K SVD parameterization (all left singular vectors), and K-SVD refers to the sparse dictionary. For the history matching objective function, we use either the non-regularized objective function (only with TSVD) or the sparsity-promoting regularized objective function (with the full SVD or KSVD). For regularized inversions we use the IRLS algorithm, while for the TSVD parameterization (without regularization) we use the standard Gauss-Newton method.

3. Results and Discussion

We present three field-scale history matching applications of sparse geologic dictionaries to demonstrate the feasibility of applying them in realistic problems and to highlight their main properties and advantages. The first case is based on the Norne model (Rwechungura et al., 2013), consisting of 113,344 total grid blocks of which 44,431 are active. Our second and third


case studies are based on the Brugge benchmark model with a more complex structure (Peters et al., 2010) and 60,000 grid blocks, of which approximately 45,000 are active for flow and transport processes. In these experiments, we consider the permeability field as the unknown of interest and generate a large number (2000) of log-permeability model realizations using appropriate geostatistical modeling techniques such as sequential Gaussian simulation with prespecified variogram (spatial correlation) parameters. The prior model realizations are used to construct the corresponding sparse K-SVD dictionaries and the SVD basis. For the Brugge model, we consider two sets of experiments, one consisting of a multi-Gaussian log-permeability distribution and another with more complex non-multi-Gaussian fluvial patterns. In the first case, we use the sequential Gaussian simulation (sgsim) algorithm to generate the prior realizations, while in the second case we use the training-image-based single normal equation (snesim) algorithm to simulate the prior models. These simulations were performed using the Stanford Geostatistical Modeling Software (SGeMS) (Remy et al., 2009). In all three examples, hard data from well locations are used to condition the prior model realizations. This conditioning ensures that the variability in the prior training data is constrained and that the training dataset includes plausible connectivity patterns. It is important to recognize that the approximate nature of the parametric representations implies that at well locations the hard data are honored in a least-squares sense. The degree to which the hard data are honored could be specified with an additional hard-data constraint term in the history matching objective function.
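As a minimal sketch of the basis-construction step (with synthetic stand-ins for the geostatistical realizations; a full K-SVD learning loop, which alternates sparse coding with rank-one dictionary updates, is omitted for brevity), the SVD and TSVD parameterizations can be built from a matrix of vectorized prior models as follows:

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_real = 400, 200                         # stand-ins for grid size / prior count
U_prior = rng.standard_normal((n_cells, n_real))   # columns: vectorized log-perm maps

# Rank-K SVD basis: leading left singular vectors of the realization matrix.
U, s, _ = np.linalg.svd(U_prior, full_matrices=False)
K, S = 100, 30
Phi_svd = U[:, :K]    # "full" SVD basis used with the sparse inversion
Phi_tsvd = U[:, :S]   # truncated SVD (TSVD) parameterization

# Best S-term approximation of one realization in the TSVD subspace.
u = U_prior[:, 0]
u_tsvd = Phi_tsvd @ (Phi_tsvd.T @ u)
```

Because the left singular vectors are orthonormal, the TSVD approximation is simply an orthogonal projection onto the leading S directions.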

3.1 Norne Case Study


We first present a series of approximation examples to demonstrate the effectiveness of sparse representation with geologic dictionaries and compare it with conventional reduced-order parameterization. In these examples, unlike in model calibration, it is assumed that the reference model is known, and the objective is to provide a low-rank approximation for it with different methods using the prior model realizations. Figure 2 shows 18 (out of 2000) sample prior log-permeability models for the three-dimensional Norne field case. These realizations are generated based on the geologic description of the field that was provided in (Rwechungura et al., 2013). Figure 3 summarizes the approximation results. Figure 3a displays the reference model chosen for our study while Figures 3b-3d depict the approximation results using the KSVD dictionary with the OMP algorithm (hereafter referred to as KSVD-OMP), the SVD dictionary with the OMP algorithm (SVD-OMP), and the truncated SVD (TSVD) method, respectively. For the first two methods the dictionary size is K=1000 while the sparsity level is S=30; for the TSVD parametrization, the first S=30 leading left singular vectors of the same prior models are used. We note that the SVD is implemented on a matrix that has as its columns the vectorized versions of the three-dimensional prior models. Figures 3e-3g show the transform-domain representations of the reference model in the KSVD dictionary, the full SVD basis, and the truncated SVD basis, respectively. The larger boxes in these columns are used to zoom in on areas with a cluster of significant coefficients. Note that while the SVD basis elements are sorted from largest to smallest singular values, the K-SVD dictionary elements are not ordered, and can have similar significance and contribution in reconstructing a model. This is a major difference between the traditional SVD basis and our KSVD sparse dictionary.
That is, in representations with a sparse dictionary, different (S = 30-term) combinations of the elements in the dictionary, irrespective of their order, can be used to approximate a given model. When the model changes, a completely different subset of elements, depending on the geologic features in


the new model, is used for approximation. This is not the case in traditional reduced-order parametrization with TSVD, where only the first S leading singular vectors are used for approximation (see Figure 3g). It is also noteworthy that although the OMP algorithm with the full SVD is used in Figure 3f, the significant basis elements are selected from within the leading 120 singular vectors, highlighting the insignificant nature of the non-leading elements in the SVD basis. As shown in Figure 3e, no such order exists in the KSVD dictionary. This signifies an important departure from traditional parametrization methods and offers several advantages that we will explore in our discussion.
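A minimal orthogonal matching pursuit (OMP) sketch (our own illustration, assuming a column-normalized dictionary) shows how a different S-term subset of atoms is selected for each model:

```python
import numpy as np

def omp(Phi, u, S):
    """Greedy OMP: pick S atoms of Phi that best approximate u."""
    r, support = u.copy(), []
    for _ in range(S):
        j = int(np.argmax(np.abs(Phi.T @ r)))    # most correlated atom
        support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], u, rcond=None)
        r = u - Phi[:, support] @ coef           # residual orthogonal to support
    v = np.zeros(Phi.shape[1])
    v[support] = coef
    return v

rng = np.random.default_rng(2)
Phi = rng.standard_normal((100, 40))
Phi /= np.linalg.norm(Phi, axis=0)               # normalize the atoms
v_true = np.zeros(40); v_true[[5, 17, 33]] = [2.0, -1.5, 1.0]
u = Phi @ v_true
v_omp = omp(Phi, u, S=3)
```

The selected support depends entirely on the signal being approximated, which is the "selection" behavior contrasted with the fixed leading subspace of TSVD.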


As is typically the case, the TSVD representation is very smooth. A comparison of the reported Root-Mean-Squared-Error (RMSE) values shows a better performance for the KSVD dictionary. It is important to emphasize that this example represents an extremely overdetermined (data-dominant) problem, for which the choice of the approximation basis is not very important. The difference between these bases and their behavior becomes evident in an underdetermined problem (such as the model calibration problem that we consider next). The behavior of the two algorithms, K-SVD and SVD, has important implications for solving ill-posed inverse problems. First, in data-deficient problems where small-scale details cannot be resolved, the non-leading SVD basis elements, containing details, are not useful. Second, the solution search space in the TSVD parameterization is limited to a small number (in this case S=30) of the leading basis elements while the search space for the sparse reconstruction case is K >> S. This property is incorporated purposefully to introduce additional flexibility in searching for a solution and allow for diversity in the geologic features that can be used to reconstruct the solution. Most importantly, unlike TSVD, where the basis components are selected a priori and their coefficients are estimated from data during model calibration, in sparse model calibration the data are used to select the relevant components from a larger set and estimate their corresponding coefficients simultaneously. This leads to a more challenging optimization problem, which, as discussed in the previous section, can be handled using sparsity-promoting regularization. Figure 4 presents the reference log-permeability model and the complex well trajectories, including both vertical and horizontal wells. The wells named B-1BH, B-4DH and B-2H are used later in this section for evaluating the data match and predictive capability of the calibrated models.
The Norne model contains 46×112×22 grid cells and represents a complex nonlinear three-phase flow system (oil/water/gas). The original model configuration and the related constraints are slightly adjusted in our example so that the adjoint model can be used to efficiently compute the necessary gradient information for our minimization problem. In this paper, we characterize the subsurface hydraulic property map (log-permeability) based on 6 years of available phase flow rate data at well locations and forecast 12 years of fluid extraction rates afterwards. In addition, to better evaluate the predictive power of the estimated model, we forecast the fluid flow rates in newly introduced wells at locations different from the existing wells. The quality of the forecasted data can be used to evaluate reconstruction quality in real-world problems, in which the reference models are not available for comparison purposes. Figures 5a and 5e show, respectively, the reference and initial log-permeability models that were used for calibration. Figures 5b-5d (and 5f-5h) display the transform-domain representations of the reference (and initial) model using the KSVD-OMP, SVD-OMP, and TSVD, respectively. The discrepancy between the initial and reference model, both in the spatial and transform domains, is quite apparent. During model calibration, the three methods use the


flow and pressure data to update the initial model in the transform domains. This update is performed automatically using the optimization formulations discussed above. In the case of the sparsity-promoting methods, the update is performed by selecting other dictionary elements and updating the coefficients (weights), while in the TSVD approach, the calibration only changes the weights of the leading components to match the dynamic data. We also note that the sparse model calibration is performed using the IRLS algorithm that was described in the previous section (in this paper, OMP is only used for approximation when the reference model is known). A main difference between the two algorithms is that OMP uses a fixed number of elements, S=30, and is only tractable for linear problems with fast forward models. On the other hand, IRLS is a computationally efficient algorithm that minimizes the solution support (S) without constraining it to a specific value.


Figure 6 shows the updated log-permeability maps from the different methods. It is clear from the results that the traditional TSVD approach provides inferior results to the sparsity-promoting formulations. The solution obtained from the KSVD-IRLS approach is superior to the alternative methods. This is confirmed both visually and by the mean-square error in log-permeability reported in Figure 6. It is noteworthy that the transform-domain solutions are not required to be similar to the reference case (Figure 5f). This is because of the non-unique form of the KSVD representation and the type/nature of the data used to obtain the solution. The changes applied to the transform-domain representations are constrained to be sparse and to result in a good match to the data. These requirements can be met using alternative combinations of dictionary elements. Another contributing factor is that, unlike OMP (used in Figure 5f), the IRLS algorithm does not fix the number of unknowns to S=30; it simply minimizes the support of the solution subject to the data mismatch constraint. However, since the prior information is used to construct the geologic dictionary, the IRLS algorithm selects the smallest subset of geologic features that minimizes the data mismatch term. It is also important to note that the solution from SVD-IRLS is better than the one obtained from TSVD. An important interpretation of this approach is that implementing the IRLS algorithm for this case amounts to searching for an optimal level of parameterization with the full SVD basis, whereas in TSVD the parametrization level is fixed at S=30. Nonetheless, even in this case, the estimated log-permeability map is not as good as the solution obtained with the K-SVD dictionary, which is constructed by taking full advantage of sparsity in the prior models.


Figure 7 displays the estimation results for selected layers of the Norne model. Examination of the results for different layers shows that the KSVD-IRLS algorithm is better able to capture the heterogeneity in the reference model. It is evident from Figure 7 that the solution obtained from the TSVD parameterization provides a substantially smoothed solution that fails to capture some of the variability in the reference model. This behavior is expected, as a significant amount of the variability in the spatially correlated prior log-permeability fields is described by the large-scale features, which are incorporated into the leading SVD basis elements. The construction used in the KSVD algorithm takes advantage of sparsity and is better able to preserve the heterogeneity represented by the prior models. The better performance of the KSVD-IRLS method becomes more noticeable by comparing the estimation error maps (difference between the reference model and the estimated models) shown in Figure 8. Figure 8a depicts the difference between the reference and initial model, which is the exact change needed to obtain the reference model from the initial solution. Figures 8b-8d show the difference


between the reference model and estimated maps using KSVD-IRLS, SVD-IRLS, and TSVD, respectively. These maps show the remaining error after data integration with each method (both dark red and dark blue regions identify areas with significant discrepancies). While all three cases provide consistent updates, the KSVD-IRLS updates resemble the geologic heterogeneity present in the subsurface formation more closely.


Figure 9 shows the evolution of the objective functions from their initial to converged values for the different methods. To better understand the behavior of the algorithms, when applicable, the data mismatch and regularization terms (and their combination) are plotted separately. Figures 9a and 9b pertain to the methods that use the IRLS algorithm and contain a (sparsity) regularization term in their objective function, whereas Figure 9c is related to TSVD and does not include a regularization term. It is clear from Figures 9a and 9b that the sparsity term initially drops at a higher rate than the data mismatch term. Figure 10 shows the iterates in both the spatial and transform domains at selected iterations of the minimization. The results indicate that after 15 iterations of the classical Gauss-Newton updates, no significant model updates are executed by the algorithm.


The flow data matches and predictions for sample wells are shown in Figure 11. While in all three cases the data matches for the first six years are improved compared to the initial model, due to differences in the estimated models some discrepancies are seen for the following 12 years of forecast. Overall, the KSVD-IRLS algorithm shows a superior performance in forecasting the behavior of the flow responses. A numerical comparison of the well data matches is also provided in Table 2, where the mean square errors for the flow forecasts are reported. The KSVD-IRLS (and SVD-IRLS) model calibration methods have superior performance to the traditional TSVD parameterization.

3.2. Brugge Model 1: Multi-Gaussian Spatial Variability


We apply our proposed formulation to the Brugge benchmark model, which is a synthetic hydrocarbon reservoir consisting of an east-west elongated half-dome bounded by a fault at its northern edge. The model also contains an internal fault and 139×48×9 (60048) grid cells. The subsurface formation otherwise shares clear similarities with typical North Sea Brent-type fields and has a size of roughly 10 km × 3 km. The original model has four main geological layers representing fluvial, lower shore-face, upper shore-face, and sandy shelf deposits from top to bottom. The simulation models are divided into 9 sub-layers (Lorentzen et al., 2009). More details about the model can be found in (Peters et al., 2010). The original model has 20 hydrocarbon extraction wells, mostly located on the crest along the north-south axis, and 10 injection wells at the periphery. The reference pressure is 170 bar at a depth of 1700 meters and the free water level is located at 1678 m. In our case study, we reduce the number of wells to only 9, with 3 injectors and 6 producers, and simulate a two-phase flow system. The flow data for model calibration is available for the first 6 years. After calibrating the initial model, three wells (two production and one injection well) are introduced at the locations shown in Figure 12 (left) for evaluating the forecast performance of the updated models. Figure 12 (left) shows the top view of the three-dimensional structure with the reference log-permeability distribution. The right panel in Figure 12 displays 15 (out of 2000)


initial realizations of the prior log-permeability field, generated using the sgsim algorithm. A sparse dictionary of size K = 1000 with S = 30 was generated using the KSVD algorithm. The same prior information was used to generate the leading K = 1000 SVD basis. For the TSVD basis, we use only the first S = 30 leading singular vectors. Figure 13 summarizes the model calibration results for this case for all nine layers of the model. Figures 13a and 13b display the reference and initial log-permeability distributions while Figures 13c-13e depict the model calibration results for the KSVD-IRLS, SVD-IRLS, and TSVD, respectively. The TSVD results in Figure 13e show a clear deficiency of the method in capturing the large-scale trend in the field. A contributing element to this poor performance could be the number of leading elements that are included in the parameterization, i.e., S = 30. The SVD-IRLS solution can adjust the parameterization level adaptively based on the data content. While the results for SVD-IRLS (Figure 13d) are clearly superior to those obtained from the TSVD method, the updated log-permeability map is too smooth and, in some regions, averages out the high and low permeability values, losing the detailed heterogeneity in the reference model. This is, however, not the case for the KSVD-IRLS inversion approach, which not only captures the general high and low permeability trends in the field, but also reflects the heterogeneity that exists in both the prior and reference models.


Figure 14 presents selected iterations of the Gauss-Newton minimization problem for the KSVD-IRLS method and the evolution of the log-permeability field from the initial model to the converged solution. The results are shown for both the spatial (top) and transform (bottom) domains. The transform-domain representations indicate that the algorithm first eliminates most of the initially non-zero coefficients and then activates the significant elements from the dictionary to minimize the mismatch with the observations while maintaining solution sparsity. An important aspect of all the formulations studied in this paper is the global nature of the updates applied during the iterations. Such updates are particularly useful and effective when the initial model is far from the solution and major changes must be introduced to match the data, as is the case in this problem.


Figure 15 shows the data match and forecast results for this case study. After the first 6 years of model calibration, a set of three infill wells is introduced to quantify the flow rate forecast quality of the developed models for the next 12 years. The results in each column refer to selected wells, with the first column belonging to extraction well #6 and the second and third columns to two infill wells. While for all methods the forecasts with the calibrated models are closer to the reference model than those generated with the initial model, the KSVD-IRLS results are the closest to the reference case. This can be confirmed by referring to the quantitative results in Table 2. In the next section, we use the Brugge reservoir architecture to investigate the performance of the proposed methods on estimating complex non-Gaussian fluvial channel models.

3.3. Brugge Model 2: Complex Non-Gaussian Spatial Variability

As our last example, we evaluate the performance of the proposed method for calibrating non-Gaussian log-permeability models. The reference map for this model is shown in Figure 16 (top panel). The model contains 139×48×9 (60048) grids and nine layers. Every three layers constitute a single geologic zone with a different channel architecture. Zone 1 consists of straight fluvial channels that pass through the entire formation at a right angle to the main fault (see Figure 16).


Zone 2 is characterized by more complex intersecting fluvial channel deposits in a similar direction to Zone 1. Zone 3 contains meandering channels, which are known for their complex geometric shapes. The reference model (and 2000 realizations) are generated using the snesim algorithm in SGeMS (Remy et al., 2009). The realizations are conditioned on hard data at 9 well locations. The bottom panel in Figure 16 shows six prior realizations of the log-permeability map. After data conditioning, Zones 2 and 3 appear to have more variability and complexity than Zone 1.


To ensure that the generated KSVD dictionary and SVD basis can approximate these complex non-Gaussian maps, it is first assumed that the reference model is known and only a low-rank representation of it is desired. Given the more complex nature of the problem, the approximation rank (S) was increased to 50 for this set of examples. Figure 17 depicts the approximation results for all three methods, KSVD-OMP (Figure 17b), SVD-OMP (Figure 17c), and TSVD (Figure 17d). Of the three methods, the KSVD-OMP is better able to describe the complex connectivity, in particular for Zone 2. For model calibration, the well configuration remains similar to that in Section 3.2, with 9 wells for model calibration and 3 additional infill wells for forecasting.


The model calibration results are shown in Figure 18. Figures 18a and 18b display the reference and initial log-permeability models, respectively. The final log-permeability solutions after 20 iterations for the KSVD-IRLS, SVD-IRLS and TSVD are depicted in Figures 18c through 18e, respectively. The TSVD method seems to have had the most difficulty in preserving the channel connectivity of the solution and retrieving the exact locations of the channels. While the performance of the SVD-IRLS is superior to that of the TSVD, the reconstructed solution does not capture the features in Zones 2 and 3 of the reference model. On the other hand, the KSVD-IRLS approach provides the most accurate description of the complex channels in the reference model, in particular for Zones 2 and 3. This is mainly attributed to the flexibility (diversity) of the KSVD dictionary elements as well as their large-scale connectivity patterns, properties that are important for capturing complex global connectivity patterns. On the other hand, the SVD basis is known to have difficulty capturing higher-order statistical patterns such as intersecting and meandering fluvial systems. We also performed (results not shown) model calibration using the TSVD method with S=100 and S=200, without observing a noticeable improvement in the results. As discussed earlier, the SVD-IRLS method can be viewed as an adaptive TSVD approach where the level of parameterization is adjusted based on the flow data content. Sample iteration results for the KSVD-IRLS model calibration minimization are shown in Figure 19. It is important to note that, in addition to quantified estimation errors that show the improvement obtained from data integration, one should also examine statistical measures such as histograms and statistical moments of the solution in comparison with the initial model.
This is particularly important for non-Gaussian models (i.e., those that have non-zero higher-order statistical moments) and is discussed in this example. In our example, since the reference model has the same statistics as the initial model, the statistical quality of the update is evaluated by how closely the resulting solutions reproduce the statistical information in the initial model. The histograms of the results for the non-Gaussian Brugge case are shown in Figure 20. The plots show that the histograms correctly capture the bimodal shape of the distributions (e.g., showing the two facies types that exist in the model). The results are also consistent with visual inspection, confirming that the solutions in each case tend to maintain the two facies in the model.


This is partly because all the methods use connectivity-preserving parameterization (K-SVD and SVD), which helps to preserve the bimodal nature of the distributions.


A final remark is that while the use of histogram and error metrics (e.g., MSE) is important in evaluating the estimation quality, in practice these statistical measures cannot be computed, as the reference model is unknown. Hence, a combination of the data match quality (including cross-validation) and the statistical consistency of the solution with the initial model may be used to assess the updated models. For this example, the data match and flow forecast results for all three methods are plotted (for sample wells) in Figure 21. While the data match portion of the plots in Figure 21 shows consistent results for all three methods, the forecast portions based on the KSVD-IRLS updated models are closer to the predictions from the reference model. Table 3 summarizes the permeability estimation errors for the three methods, indicating a better performance for the KSVD-IRLS algorithm.
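The evaluation measures discussed above (an error metric plus histogram consistency with the initial model) can be sketched as follows; the data here are synthetic stand-ins and the function names are our own:

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two property vectors."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def histogram_distance(a, b, bins=20):
    """L1 distance between normalized histograms, as a crude bimodality check."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi), density=True)
    return float(np.abs(ha - hb).sum())

rng = np.random.default_rng(3)
prior = np.concatenate([rng.normal(2, 0.3, 500), rng.normal(5, 0.3, 500)])  # bimodal facies
solution = prior + rng.normal(0, 0.1, 1000)      # hypothetical calibrated model
print(rmse(solution, prior), histogram_distance(solution, prior))
```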

3.4. Additional Considerations


A fundamental premise for adopting sparse model calibration methods is the prevailing spatial correlation in complex geologic formations. Large-scale complex connectivity patterns are amenable to sparse representations in a properly designed and learned sparse geologic dictionary. Such sparse representations can be effectively exploited in formulating model calibration techniques that are inspired by recent developments in sparse approximation and signal processing. A particularly useful property of the sparse model calibration framework is dynamic (basis or subspace) selection during the inversion process, which allows the inversion algorithm to select significant geologic features from a diverse dictionary to match the dynamic flow data. When sparse dictionaries are learned from reliable prior geologic models, they are quite efficient in representing the salient information in the prior models. On the other hand, when prior knowledge is uncertain or multiple geologic scenarios may be plausible, the selection property of sparse model calibration offers robustness against geologic uncertainty, which is an important advantage over classical parameterization techniques. The selection property of sparsity-promoting regularization implies that the method does more than simply find a calibrated model. In fact, a main advantage of the proposed framework is its ability to disregard any irrelevant features that are included in the prior model, by simply assigning zero weights to them. Several interesting formulations can be devised to exploit this selection property. For instance, one could devise a post-processing step in which the selected basis elements are used to adjust the prior model. When alternative geologic scenarios are proposed, one could take advantage of this selection property to identify which of the proposed prior models are consistent with the dynamic data.
The sparse model calibration approach with the KSVD dictionary was applied to both multi-Gaussian models and complex fluvial channel formations that exhibit multimodal behavior. In the latter case, a particularly challenging aspect is reproducing the complex geologic patterns during model calibration. While low-rank approximation with the TSVD method also takes advantage of prior geologic models to construct a compact basis, it tends to break down the existing connectivity features of the prior models into a few large-scale (leading) and many small-scale (detail) basis elements. During model calibration these elements must be linearly combined, based on limited flow and pressure data, to form the solution. Correctly combining many small-scale elements, however, requires high-resolution data, which is practically non-existent in


subsurface model calibration. Hence, the SVD parameterization is not suitable for preserving complex geologic connectivity patterns during inversion with limited data. The KSVD dictionary behaves rather differently; when initialized with the prior models, instead of decomposing the complex patterns in the models, it attempts to maintain the large-scale connectivity features of the prior models while providing a sparse representation. In addition, the dictionary construction approach is not very sensitive to the diversity in the prior models and hence offers robustness against the initial geologic uncertainty. As a result, sparse dictionaries can preserve geologic connectivity and handle geologic uncertainty more properly than the SVD method.

4. Conclusions

In this paper, we presented a novel history matching method and its application to the estimation of heterogeneous reservoir properties in subsurface flow systems using three large-scale problems. The method is based on sparse geologic dictionaries that are learned from uncertain prior model realizations. For dictionary learning we used the K-SVD algorithm, a recently developed method in the sparse reconstruction literature. Using the large-scale field application examples, the sparse (low-rank) approximation and model calibration performance of the K-SVD dictionary was compared with the traditional truncated SVD (TSVD) basis. The OMP algorithm was implemented for sparse approximation (compression), while IRLS inversion was performed for sparse history matching. As a classical alternative, the TSVD parameterization was also applied to the same examples. The results show that the KSVD-IRLS implementation provides superior results to the two alternative methods in estimating heterogeneous rock properties and in the resulting production forecasts. While the K-SVD algorithm has shown superior performance in our experiments, it also comes with its own limitations. The learning stage of the algorithm can be computationally demanding for large-scale problems. Although this step is performed off-line and is not included during history matching, strategies to reduce its computational demand can make the overall workflow more efficient. Furthermore, history matching with the K-SVD dictionary leads to a higher-dimensional optimization (of size K) compared to the lower-dimensional optimization (of size S) with TSVD. This property may result in additional computational complexity and/or increase the chance of non-unique representations. Finally, it is important to note that the convergence proofs of the original compressed sensing paradigm are offered for linear measurements and under certain assumptions.
These conditions do not hold in nonlinear problems, such as multi-phase flow in porous media. Hence, the results presented in this work are based on the premise that sparsity-promoting regularization functions can be used to augment least-squares problems and constrain the solution, without relying on any convergence proof. In summary, the results in this paper suggest that sparse model calibration with learned geologic dictionaries provides a promising feature-based estimation framework for solving history matching problems. This paper illustrates the feasibility of applying the proposed K-SVD learning technique and the IRLS inversion algorithm to large-scale and geologically complex subsurface flow model calibration problems. While this study only included deterministic model calibration formulations, probabilistic formulations can be developed for uncertainty quantification. However, sparsity-based probabilistic formulations for model calibration depart from the classical Gaussian prior models and require more complex priors, such as the Laplace distribution, that promote sparsity. Research is underway to develop ensemble-based sparse formulations for uncertainty quantification in practical applications.


Appendix A: K-SVD Implementation

The general formulation for sparse dictionary learning is expressed as the following optimization problem (following the notation described in the main text):

$\{\hat{\Phi}, \hat{V}\} = \arg\min_{\Phi, V} \|U - \Phi V\|_F^2 \quad \text{s.t.} \quad \|v_i\|_0 \le S \ \text{and} \ \|\phi_i\|_2 = c,$   (A1)

where c is a normalizing constant for the columns of the dictionary matrix. The K-SVD algorithm implements a heuristic solution to this problem by dividing (A1) into two sub-problems that consist of (1) a sparse coding step based on Orthogonal Matching Pursuit (see Table A1 for a pseudo code) and (2) a dictionary update step (see Table A2 for a pseudo code). These two steps are repeated until convergence or until a specified number of iterations is reached. The sparse coding step is obtained by fixing the dictionary $\Phi$ in (A1) and solving:

$f_a(V) = \|U - \Phi V\|_F^2 \quad \text{s.t.} \quad \|v_i\|_0 \le s$   (A2)

The OMP algorithm is one of the earliest approximate (greedy) methods for solving this sparse coding problem and is summarized in Table A1. A detailed description of the algorithm and a comparison of various sparse coding algorithms can be found in (Tropp and Wright, 2010). Provided with a set of sparse representation vectors V (from the sparse coding step), the optimization problem for the dictionary update step can be expressed as:

$f_a(\Phi) = \|U - \Phi V\|_F^2$   (A3)

The MOD algorithm presents a tractable solution to this least-squares problem by setting the derivative of the objective function with respect to $\Phi$ equal to zero. However, the algorithm requires inverting large matrices, which is impractical for realistic problems. Instead, the dictionary update step in the K-SVD algorithm implements K rank-1 SVD operations (hence the name K-SVD) while ensuring that the sparsity of the representation is preserved. Each element $\phi_k$ of the dictionary is substituted with the leading left singular vector of the residual matrix restricted to the training data that use $\phi_k$ in their sparse representation. Further details about the K-SVD implementation can be found in (Aharon et al., 2006).
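As a side note on the dictionary-update subproblem (A3), the MOD closed form obtained by zeroing the gradient is $\Phi = U V^T (V V^T)^{-1}$. The following NumPy sketch, with small hypothetical dimensions and dense (rather than sparse) codes for simplicity, checks that this closed form satisfies the normal equations and cannot be improved by a perturbation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L = 20, 8, 50          # hypothetical dimensions: signal size, atoms, training samples
U = rng.normal(size=(N, L))  # training signals as columns
V = rng.normal(size=(K, L))  # fixed representation coefficients (dense here for simplicity)

# MOD closed-form dictionary update: zero the gradient of ||U - Phi V||_F^2 w.r.t. Phi
Phi_mod = U @ V.T @ np.linalg.inv(V @ V.T)

# At the minimizer the residual is orthogonal to the row space of V, so any
# perturbation of Phi_mod cannot decrease the Frobenius residual
base = np.linalg.norm(U - Phi_mod @ V)
worse = np.linalg.norm(U - (Phi_mod + 1e-3 * rng.normal(size=(N, K))) @ V)
assert worse >= base
```

Because the dictionary update is a convex quadratic in $\Phi$, the perturbed residual is always at least as large as the residual at the closed-form solution; K-SVD avoids the matrix inversion by updating one atom at a time instead.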

Appendix B: Learning K-SVD from Low-Rank Prior Representations

The computational complexity of each iteration of the K-SVD algorithm is estimated as $O\left(L\,(2NK + s^2K + 7sK + s^3 + 4sN + 5NK^2)\right)$ (Liu and Jafarpour, 2013). In this paper, we studied the computation time of the K-SVD algorithm and compared it with the total time spent on model calibration. Table B1 provides a detailed comparison between the computation time for dictionary learning and for adjoint-based flow data simulation. While dictionary learning is a computationally demanding process, it is important to note that it is performed off-line and can be completed prior to model calibration. Furthermore, low-rank representations of the full-rank prior models of the property map may be adopted to significantly speed up the computations involved in the K-SVD algorithm, as discussed in (Liu and Jafarpour, 2013).

Acknowledgments

This work is partially supported by research grants from the Energi Simulation Foundation and the Society of Petroleum Engineers.

References

Aharon, M., Elad, M., Bruckstein, A., 2006. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation. IEEE Trans. Signal Process. 54, 4311–4322. doi:10.1109/TSP.2006.881199
Alcolea, A., Carrera, J., Medina, A., 2006. Pilot points method incorporating prior information for solving the groundwater flow inverse problem. Adv. Water Resour. 29, 1678–1689. doi:10.1016/j.advwatres.2005.12.009
Ahmed, N., Natarajan, T., Rao, K.R., 1974. Discrete cosine transform. IEEE Trans. Comput. 100(1), 90–93.
Arpat, G.B., 2005. Sequential simulation with patterns. Ph.D. dissertation, Stanford University, Stanford, CA, USA (advisor: J. Caers).
Bach, F., 2011. Optimization with Sparsity-Inducing Penalties. Found. Trends Mach. Learn. 4, 1–106. doi:10.1561/2200000015
Bhark, E.W., Jafarpour, B., Datta-Gupta, A., 2011. A generalized grid connectivity-based parameterization for subsurface flow model calibration. Water Resour. Res. 47, W06517. doi:10.1029/2010WR009982
Bissell, R., 1994. Calculating optimal parameters for history matching. In: ECMOR IV - 4th European Conference on the Mathematics of Oil Recovery, pp. 1–15. doi:10.3997/2214-4609.201411181
Britanak, V., Yip, P.C., Rao, K.R., 2006. Discrete Cosine Transform: General Properties, Fast Algorithms, and Integer Approximations. Academic Press.
Brun, B., Gosselin, O., Barker, J.W., 2004. Use of Prior Information in Gradient-Based History Matching. SPE J. 9, 67–78. doi:10.2118/87680-PA
Caers, J., Zhang, T., 2004. Multiple-point geostatistics: a quantitative vehicle for integrating geologic analogs into multiple reservoir models, 383–394.
Candes, E.J., Wakin, M.B., 2008. An Introduction to Compressive Sampling. IEEE Signal Process. Mag. 25, 21–30. doi:10.1109/MSP.2007.914731
Carrera, J., Neuman, S.P., 1986. Estimation of aquifer parameters under transient and steady state conditions: 1. Maximum likelihood method incorporating prior information. Water Resour. Res. 22, 199–210.
Carrera, J., Alcolea, A., Medina, A., Hidalgo, J., Slooten, L.J., 2005. Inverse problem in hydrogeology. Hydrogeol. J. 13, 206–222. doi:10.1007/s10040-004-0404-7
Chen, S.S., Donoho, D.L., Saunders, M.A., 1998. Atomic Decomposition by Basis Pursuit. SIAM J. Sci. Comput. 20, 33–61. doi:10.1137/S1064827596304010
Elad, M., Aharon, M., 2006. Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries. IEEE Trans. Image Process. 15, 3736–3745. doi:10.1109/TIP.2006.881969
Fonseca, R.A., Silva, L.O., Tsung, F.S., Decyk, V.K., Lu, W., Ren, C., Mori, W.B., Deng, S., Lee, S., Katsouleas, T., Adam, J.C., 2002. Information processing for medical imaging.

Gavalas, G.R., Shah, P.C., Seinfeld, J.H., 1976. Reservoir history matching by Bayesian estimation. Soc. Pet. Eng. J. 16(6), 337–350.
Golub, G.H., Reinsch, C., 1970. Singular value decomposition and least squares solutions. Numer. Math. 14, 403–420. doi:10.1007/BF02163027
Grimstad, A.-A., Mannseth, T., Nævdal, G., Urkedal, H., 2003. Adaptive multiscale permeability estimation. Comput. Geosci. 7(1), 1–25.
Hu, L.Y., Chugunova, T., 2008. Multiple-point geostatistics for modeling subsurface heterogeneity: A comprehensive review. Water Resour. Res. 44. doi:10.1029/2008WR006993
Jacquard, P., Jain, C., 1965. Permeability Distribution From Field Pressure Data. Soc. Pet. Eng. J. 5, 281–294. doi:10.2118/1307-PA
Jafarpour, B., Khodabakhshi, M., 2011. A Probability Conditioning Method (PCM) for Nonlinear Flow Data Integration into Multipoint Statistical Facies Simulation. Math. Geosci. 43, 133–164. doi:10.1007/s11004-011-9316-y
Jahns, H.O., 1966. A Rapid Method for Obtaining a Two-Dimensional Reservoir Description From Well Pressure Response Data. Soc. Pet. Eng. J. 6, 315–327. doi:10.2118/1473-PA
Journel, A., Deutsch, C., 1995. GSLIB: Geostatistical Software Library and User's Guide. Oxford University Press.
Khaninezhad, M., Jafarpour, B., 2014. Prior model identification during subsurface flow data integration with adaptive sparse representation techniques. Comput. Geosci., 3–16. doi:10.1007/s10596-013-9378-7
Khaninezhad, M.M., Jafarpour, B., Li, L., 2012. Sparse geologic dictionaries for subsurface flow model calibration: Part I. Inversion formulation. Adv. Water Resour. 39, 106–121.
Khaninezhad, M.R.M., Jafarpour, B., 2013. Hybrid Parameterization for Robust History Matching. doi:10.2118/146934-PA
Khaninezhad, M.-R., Jafarpour, B., 2015. Sparse geologic dictionaries for field-scale history matching application. Society of Petroleum Engineers. doi:10.2118/173275-MS
Khodabakhshi, M., Jafarpour, B., 2014. Adaptive Conditioning of Multiple-Point Statistical Facies Simulation to Flow Data with Probability Maps. Math. Geosci. 46, 573–595. doi:10.1007/s11004-014-9526-1
Khodabakhshi, M., Jafarpour, B., 2013. A Bayesian mixture-modeling approach for flow-conditioned multiple-point statistical facies simulation from uncertain training images. Water Resour. Res. 49, 328–342. doi:10.1029/2011WR010787
Kitanidis, P.K., 2012. Generalized priors in Bayesian inversion problems. Adv. Water Resour. 36, 3–10. doi:10.1016/j.advwatres.2011.05.005
LaVenue, A.M., Pickens, J.F., 1992. Application of a coupled adjoint sensitivity and kriging approach to calibrate a groundwater flow model. Water Resour. Res. 28, 1543–1569. doi:10.1029/92WR00208
Lebrun, M., Leclaire, A., 2012. An Implementation and Detailed Analysis of the K-SVD Image Denoising Algorithm. Image Process. On Line 2, 96–133. doi:10.5201/ipol.2012.llm-ksvd
Lee, T., Kravaris, C., Seinfeld, J.H., 1986. History Matching by Spline Approximation and Regularization in Single-Phase Areal Reservoirs. SPE Reserv. Eng. 1, 521–534. doi:10.2118/13931-PA
Li, L., Jafarpour, B., 2010. Effective solution of nonlinear subsurface flow inverse problems in sparse bases. Inverse Problems 26(10), 105016.
Liu, E., Jafarpour, B., 2013. Learning sparse geologic dictionaries from low-rank representations of facies connectivity for flow model calibration. Water Resour. Res. 49, 7088–7101. doi:10.1002/wrcr.20545
Lu, P., Horne, R.N., 2000. A multiresolution approach to reservoir parameter estimation using wavelet analysis. SPE Annual Technical Conference and Exhibition, Dallas, TX, 1-4 October.
Mahmud, K., Mariethoz, G., Caers, J., Tahmasebi, P., Baker, A., 2014. Simulation of Earth textures by conditional image quilting. Water Resour. Res. 50, 3088–3107. doi:10.1002/2013WR015069
McLaughlin, D., Townley, L.R., 1996. A Reassessment of the Groundwater Inverse Problem. Water Resour. Res. 32, 1131–1161. doi:10.1029/96wr00160
Neuman, S.P., 1973. Calibration of distributed parameter groundwater flow models viewed as a multiple-objective decision process under uncertainty. Water Resour. Res. 9, 1006–1021. doi:10.1029/WR009i004p01006
Oliver, D.S., Chen, Y., 2011. Recent progress on reservoir history matching: a review. Comput. Geosci. 15, 185–221. doi:10.1007/s10596-010-9194-2
Peters, L., Arts, R., Brouwer, G., Geel, C., Cullick, S., Lorentzen, R.J., Chen, Y., Dunlop, N., Vossepoel, F.C., Xu, R., Sarma, P., Alhuthali, A.H.H., Reynolds, A., 2010. Results of the Brugge Benchmark Study for Flooding Optimization and History Matching. SPE Reserv. Eval. Eng. 13, 391–405. doi:10.2118/119094-PA
Petra, S., Schröder, A., Wieneke, B., Schnörr, C., 2008. On Sparsity Maximization in Tomographic Particle Image Reconstruction. In: Pattern Recognition. Springer, Berlin, Heidelberg, pp. 294–303. doi:10.1007/978-3-540-69321-5_30
Remy, N., Boucher, A., Wu, J., 2009. Applied Geostatistics with SGeMS: A User's Guide. Cambridge University Press.
Rudelson, M., Vershynin, R., 2008. On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61, 1025–1045. doi:10.1002/cpa.20227
Rwechungura, R.W., Bhark, E.W., Miljeteig, O.T., Suman, A., Kourounis, D., Foss, B.A., Hoier, L., Kleppe, J., 2012. Results of the First Norne Field Case on History Matching and Recovery Optimization Using Production and 4D Seismic Data. In: SPE Annual Technical Conference and Exhibition. Society of Petroleum Engineers. doi:10.2118/157112-MS
Sagar, B., Yakowitz, S., Duckstein, L., 1975. A direct method for the identification of the parameters of dynamic nonhomogeneous aquifers. Water Resour. Res. 11, 563–570. doi:10.1029/WR011i004p00563
Strebelle, S., 2002. Conditional Simulation of Complex Geological Structures Using Multiple-Point Statistics. Math. Geol. 34, 1–21. doi:10.1023/A:1014009426274
Tarantola, A., 2005. Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and Applied Mathematics. doi:10.1137/1.9780898717921
Tavakoli, R., Reynolds, A.C., 2010. History Matching With Parametrization Based on the SVD of a Dimensionless Sensitivity Matrix. SPE J. 15, 495–508. doi:10.2118/118952-PA
Tikhonov, A.N., Arsenin, V.Y., 1977. Solutions of Ill-Posed Problems. Winston.
Tropp, J.A., Gilbert, A.C., 2007. Signal Recovery From Random Measurements Via Orthogonal Matching Pursuit. IEEE Trans. Inf. Theory 53, 4655–4666. doi:10.1109/TIT.2007.909108

Tropp, J.A., Wright, S.J., 2010. Computational Methods for Sparse Solution of Linear Inverse Problems. Proc. IEEE 98, 948–958. doi:10.1109/JPROC.2010.2044010
Yeh, W.W.G., 1986. Review of parameter identification procedures in groundwater hydrology: The inverse problem. Water Resour. Res. 22, 95–108.
Zhou, H., Gómez-Hernández, J.J., Li, L., 2014. Inverse methods in hydrogeology: Evolution and recent trends. Adv. Water Resour. 63, 22–37. doi:10.1016/j.advwatres.2013.10.014

Tables




Table 1: Pseudo code for IRLS sparse model calibration

• Input: geologic dictionary $\Phi \in \mathbb{R}^{N \times K}$, exponent p for the $\ell_p$-norm, observed data vector $d_{obs} \in \mathbb{R}^m$
• Output: sparse coefficient vector $v \in \mathbb{R}^K$, calibrated physical property $u \in \mathbb{R}^N$

1) Initialize: set the initial sparse coefficient vector $v^0 \in \mathbb{R}^K$ as the sparse projection of the initial physical property $u^0 \in \mathbb{R}^N$ using OMP (or any other sparse coding scheme).

Repeat the following until convergence {

2) Intermediate step: compute the following:
 • simulated/predicted data using the last updated coefficients, $g(\Phi v^j)$, and the data residual $\Delta d^j = g(\Phi v^j) - d_{obs}$ with $\epsilon = \|\Delta d^j\|_{C_n}^2$;
 • the Jacobian matrix of the predicted data w.r.t. the property (log-permeability) $u \in \mathbb{R}^N$, $G = \partial g(u)/\partial u \in \mathbb{R}^{m \times N}$;
 • the reweighting parameters $w_{i,i} = 1 / \left((v_i^j)^2 + \epsilon\right)^{1 - p/2}$, $0 < p \le 1$.

3) Estimation update step: update the coefficient vector as
 $v^{j+1} = \left(\Phi^T G^T C_n G \Phi + \lambda^2 W\right)^{-1} \left(\Phi^T G^T C_n G \Phi\, v^j - \Phi^T G^T C_n \Delta d^j\right)$

}

4) Outputs: the final estimate is $v = v^{j+1}$ and the calibrated model is calculated as $u = \Phi v$.
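To make the IRLS iteration of Table 1 concrete, the following NumPy sketch applies the same update to a toy linear forward model g(u) = Gu with an identity noise-weighting matrix, so that the Jacobian G is constant. All sizes, the damping parameter λ, and the small floor used in place of ε are illustrative assumptions, not values from the field cases.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, m = 40, 25, 40          # hypothetical sizes: grid cells, dictionary atoms, observations
p, lam = 1.0, 0.1             # l1 reweighting (p=1) and damping weight lambda

Phi = rng.normal(size=(N, K))                 # stand-in geologic dictionary
G = rng.normal(size=(m, N)) / np.sqrt(N)      # Jacobian of the linear forward model g(u) = G u
Cn = np.eye(m)                                # noise-weighting matrix (identity here)

v_true = np.zeros(K)
v_true[[3, 11, 19]] = [2.0, -1.5, 1.0]        # sparse reference coefficients
d_obs = G @ Phi @ v_true                      # noise-free observations

A = Phi.T @ G.T @ Cn @ G @ Phi                # Gauss-Newton Hessian in coefficient space
v = np.zeros(K)
for _ in range(50):                           # IRLS loop (steps 2-3 of Table 1)
    W = np.diag(1.0 / (v**2 + 1e-8) ** (1 - p / 2))   # reweighting, fixed floor instead of eps
    dd = G @ Phi @ v - d_obs                           # data residual Delta d
    v = np.linalg.solve(A + lam**2 * W, A @ v - Phi.T @ G.T @ Cn @ dd)

# the damped l1 reweighting singles out the three significant coefficients
assert np.sum(np.abs(v) > 0.1) == 3
```

In the paper's setting G comes from adjoint-based sensitivities of a nonlinear flow simulator and changes every iteration; only the algebra of the update is reproduced here.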


Table 2: Comparison of flow data forecasts (for 6 years of data match and 12 years of forecast): WOPR, WOPT, and WWCT forecast errors for the Norne, Brugge 1, and Brugge 2 cases under the Initial, KSVD-IRLS (S=30, K=1000), SVD-IRLS (S=30, K=1000), and TSVD (S=30) methods.

Table 3: Estimation errors (MSE) for the Norne, Brugge 1, and Brugge 2 cases with the KSVD-IRLS (S=30, K=1000), SVD-IRLS, and TSVD (S=30) methods.

Table A1: Pseudo code for Orthogonal Matching Pursuit

• Input: a signal $u \in \mathbb{R}^N$, a dictionary matrix $\Phi \in \mathbb{R}^{N \times K}$
• Output: a sparse coefficient vector $v \in \mathbb{R}^K$

1) Initialize: set the index set $\Omega_0 = \{\emptyset\}$, the residual $r_0 = u$, and the counter k = 1.
2) Identify: calculate the correlation of the residual $r_{k-1}$ with the columns of $\Phi$, $c = \Phi^T r_{k-1}$, and find the index $n_k$ of the largest correlation, $n_k = \arg\max_n |c_n| = \arg\max_n |\langle r_{k-1}, \phi_n \rangle|$; update the index set $\Omega_k = \Omega_{k-1} \cup \{n_k\}$.
3) Estimate: find the best (in the least-squares sense) coefficients/weights $v_k$ for approximating the input signal u with the columns chosen so far, $v_k = \arg\min_x \|u - \Phi_{\Omega_k} x\|_2^2$.
4) Iterate: update the residual vector $r_k = u - \Phi_{\Omega_k} v_k$, increment k, and repeat (2)-(4) until $\|r_k\|_2$ is sufficiently small or $\|v_k\|_0 = s$.
5) Output: return the vector $v = v_k$.
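The steps of Table A1 can be sketched in a few lines of NumPy. The dictionary and signal below are synthetic (sizes are illustrative), and a couple of greedy iterations beyond the true sparsity are allowed so the residual-based stopping criterion can terminate the loop.

```python
import numpy as np

def omp(u, Phi, s, tol=1e-10):
    """Orthogonal Matching Pursuit (Table A1): greedy sparse coding of u in Phi."""
    v = np.zeros(Phi.shape[1])
    support, coef = [], np.zeros(0)
    r = u.copy()
    for _ in range(s):
        if np.linalg.norm(r) <= tol:            # residual sufficiently small
            break
        c = Phi.T @ r                           # correlations with all atoms (step 2)
        support.append(int(np.argmax(np.abs(c))))
        # best least-squares fit of u on the columns chosen so far (step 3)
        coef, *_ = np.linalg.lstsq(Phi[:, support], u, rcond=None)
        r = u - Phi[:, support] @ coef          # updated residual (step 4)
    v[support] = coef
    return v

rng = np.random.default_rng(0)
Phi = rng.normal(size=(64, 128))
Phi /= np.linalg.norm(Phi, axis=0)              # unit-norm atoms
v_true = np.zeros(128)
v_true[[5, 40, 90]] = [1.5, -2.0, 1.0]          # a 3-sparse reference code
u = Phi @ v_true

v = omp(u, Phi, s=6)                            # a few spare iterations beyond sparsity 3
assert np.linalg.norm(u - Phi @ v) < 1e-6 * np.linalg.norm(u)
```

For exactly sparse signals in a well-conditioned random dictionary the greedy selection typically recovers the support, after which the least-squares step makes the residual vanish.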


Table A2: Pseudo code for the K-SVD dictionary learning algorithm

• Input: a set of prior signals as columns of $U \in \mathbb{R}^{N \times L}$, sparsity level s
• Output: sparse coefficient vector set $V \in \mathbb{R}^{K \times L}$, learned dictionary $\Phi \in \mathbb{R}^{N \times K}$

1) Initialize: set the initial dictionary $\Phi_0 = \left[\, u_1/\|u_1\|_2 \;|\; u_2/\|u_2\|_2 \;|\; \ldots \;|\; u_K/\|u_K\|_2 \,\right] \in \mathbb{R}^{N \times K}$ as a random set of normalized realizations, and set the counter j = 1.
2) Sparse coding step: using a pursuit algorithm (Table A1), compute $v_i$, $i \in \{1, 2, \ldots, L\}$, as the solution of $\arg\min_{v_i} \|u_i - \Phi_{j-1} v_i\|_2^2$ s.t. $\|v_i\|_0 \le s$.
3) Dictionary update step: for k = 1, ..., K:
 • define the group of examples $u_i$ that use $\phi_k$ (the k-th column of the dictionary) as $\omega_k = \{\, i \;|\; 1 \le i \le L,\; v_i(k) \ne 0 \,\}$ (i.e., the indices of the non-zero elements in the k-th row of the sparse representation matrix $V_j \in \mathbb{R}^{K \times L}$);
 • compute the residual matrix $E_k = U - \left(\Phi_{j-1} V - \phi_k v^k\right)$, and restrict $E_k$ to only the $\omega_k$ columns, i.e., obtain $E_k^{\omega_k}$;
 • apply a rank-1 Singular Value Decomposition (SVD), $E_k^{\omega_k} = \tilde{u}_1 \tilde{\Sigma} \tilde{v}_1^T$;
 • update $\phi_k = \tilde{u}_1$ and $v^k = \tilde{\Sigma} \tilde{v}_1^T$.
Increase j = j + 1 and iterate on (2)-(3) until convergence.

Nomenclature:
N: number of model grid cells
L: number of samples in the prior library
K: size of the coefficient vector (i.e., number of dictionary atoms)
s: number of nonzero (significant) elements used for sparse approximation
$V$: matrix of transform coefficients
$v^k$: vector containing the k-th row of the coefficient matrix
$v_i$: vector containing the i-th column of the coefficient matrix, representing the i-th prior model
$\phi_k$: k-th element (column) of the dictionary matrix $\Phi \in \mathbb{R}^{N \times K}$
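For illustration, the loop of Table A2 can be condensed into the following NumPy sketch. The training data here are synthetic sparse combinations of a hidden random dictionary, all sizes are illustrative (not the field-case dimensions), and the coding routine is the simple greedy OMP of Table A1.

```python
import numpy as np

def sparse_code(U, Phi, s):
    """OMP coding of each column of U (sparse coding step of Table A2)."""
    V = np.zeros((Phi.shape[1], U.shape[1]))
    for i in range(U.shape[1]):
        r, idx = U[:, i].copy(), []
        for _ in range(s):
            idx.append(int(np.argmax(np.abs(Phi.T @ r))))
            coef, *_ = np.linalg.lstsq(Phi[:, idx], U[:, i], rcond=None)
            r = U[:, i] - Phi[:, idx] @ coef
        V[idx, i] = coef
    return V

def ksvd(U, K, s, iters=10, seed=0):
    """Minimal K-SVD sketch: alternate OMP coding and rank-1 atom updates."""
    rng = np.random.default_rng(seed)
    Phi = U[:, rng.choice(U.shape[1], K, replace=False)].astype(float)
    Phi /= np.linalg.norm(Phi, axis=0)          # normalized random realizations (step 1)
    for _ in range(iters):
        V = sparse_code(U, Phi, s)              # step 2
        for k in range(K):                      # step 3: one atom at a time
            omega = np.nonzero(V[k, :])[0]      # examples that use atom k
            if omega.size == 0:
                continue
            # residual without atom k, restricted to the omega columns
            E = U[:, omega] - Phi @ V[:, omega] + np.outer(Phi[:, k], V[k, omega])
            u1, s1, v1t = np.linalg.svd(E, full_matrices=False)
            Phi[:, k] = u1[:, 0]                # leading left singular vector
            V[k, omega] = s1[0] * v1t[0, :]     # matching coefficients
    return Phi, V

rng = np.random.default_rng(4)
# synthetic "prior library": 3-sparse combinations of a hidden 20-atom dictionary
D = rng.normal(size=(30, 20))
D /= np.linalg.norm(D, axis=0)
C = np.zeros((20, 200))
for i in range(200):
    C[rng.choice(20, 3, replace=False), i] = rng.normal(size=3)
U = D @ C

Phi, V = ksvd(U, K=20, s=3, iters=15)
rel_err = np.linalg.norm(U - Phi @ V) / np.linalg.norm(U)
assert rel_err < 0.5
```

Note that each atom update keeps the support of the corresponding coefficient row fixed, which is what preserves the sparsity of the representation during the dictionary update.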

Table B1: Comparison of computation times for different components of the algorithm [in seconds]

Component                 Norne (Gaussian), S=30   Brugge (Gaussian), S=30   Brugge (Fluvial), S=50
KSVD                      1963                     2404                      5250
SVD                       58.3                     43.2                      66.7
SVD                       2.1                      1.2                       1.9
Forward/Adjoint Model     171.3                    4.5                       5.6
Update time               0.05                     0.05                      0.05
Total computation time    3373                     71                        84.8


Figures

Figure 1. (a) Traditional history matching workflow with reduced-order parameterization (TSVD or PCA); (b) proposed history matching workflow with learned sparse geologic dictionaries (e.g., KSVD).

Figure 2. Prior three-dimensional log-permeability model realizations for the Norne case study (each image shows one conditional realization).

Figure 3. Compressed approximation of the reference model (a) using the K-SVD dictionary (OMP algorithm) with K=1000 and S=30 (b, MSE=0.504), the SVD dictionary (OMP algorithm) with K=1000 and S=30 (c, MSE=0.515), and the truncated SVD parameterization with S=30, using orthogonal projection (d, MSE=0.527); (e)-(g) display the transform-domain representations of (b)-(d), respectively.

Figure 4. (a) Reference log-permeability model for the Norne case; (b) horizontal and vertical well configurations (from the original Norne model), including wells B-1BH, B-2H, and B-4DH.

Figure 5. Reference (a) and initial (e) log-permeability models and their corresponding transform-domain representations with the K-SVD-OMP sparse coding (b)/(f), SVD-OMP sparse coding (c)/(g), and TSVD with orthogonal projection (d)/(h) methods. The results correspond to K=1000 and S=30.

Figure 6. Calibrated log-permeability models using sparse model calibration with the (a) KSVD-IRLS (MSE=0.628), (b) SVD-IRLS (MSE=0.720), and (c) TSVD (MSE=0.745) methods; top and bottom rows show the spatial and transform-domain representations, respectively.

Figure 7. Sample layers of the reference (a) and initial (b) log-permeability models; also shown are the calibrated log-permeability models for the same layers using the KSVD-IRLS (c), SVD-IRLS (d), and TSVD (e) methods.

Figure 8. Difference between the reference log-permeability model and (a) the initial model, and the updated models obtained using the (b) KSVD-IRLS, (c) SVD-IRLS, and (d) TSVD methods.

Figure 9. Evolution of the overall model calibration objective function, the regularization term, and the data misfit term for the (a) KSVD-IRLS, (b) SVD-IRLS, and (c) TSVD methods.

Figure 10. Evolution of the updated log-permeability model from the initial model to the last update (iteration 15) for the Norne case study (top row) using the KSVD-IRLS method; the bottom row shows the evolution of the transform-domain representation (coefficients).

Figure 11. Well response data match (first 6 years) and forecasts (for 12 years afterwards) for wells B-1BH (a), B-2H (b), and B-4DH (c). The top row shows the watercut (ratio of water to total fluid produced) and the bottom row shows the oil rate.

Figure 12. Well configuration (3 injectors and 3 infill wells used for future forecasting) and reference log-permeability model (left); 15 out of 2000 sample prior log-permeability model realizations (right) that are used as training data for KSVD and SVD learning.

Figure 13. Log-permeability model calibration results: (a) reference map, (b) initial model, (c)-(e) updated models using the KSVD-IRLS algorithm, the SVD-IRLS method, and the TSVD approach, respectively. The columns show the layers (1-9) of the models.

Figure 14. Evolution of the updated log-permeability model for the Brugge case study from the initial map to the final solution for the KSVD-IRLS method; the top row shows the spatial log-permeability distribution and the bottom row shows the transform-domain representations.

Figure 15. Well response data match (for 6 years) and forecasts (for 12 years afterwards) for Well #6 (a), Infill Well #1 (b), and Infill Well #2 (c); the first and second rows show the watercut (ratio of water to total liquid produced) and the well oil production rate, respectively.

Figure 16. Well configurations (3 injectors and infill drilling wells) and the reference log-permeability map for the Brugge case study with non-Gaussian fluvial channels (top); five sample prior log-permeability model realizations (bottom). Each row of the bottom plots shows one realization; the columns display the three zones (each having three layers) in the model.

Figure 17. Approximation quality for the non-Gaussian models: (a) reference log-permeability map; (b)-(d) approximate (low-rank) representations with KSVD-OMP (K=1000, S=50), SVD-OMP (K=1000, S=50), and TSVD (S=50), respectively. The three columns show the results for the three zones (Layers 1-3, 4-6, and 7-9) in the model.

Figure 18. Model calibration results for non-Gaussian fluvial channels: (a) reference log-permeability model, (b) initial model, (c)-(e) calibrated models using the KSVD-IRLS, SVD-IRLS, and TSVD methods, respectively.

Figure 19. Updated log-permeability maps for selected iterations (rows) of the KSVD-IRLS algorithm for different zones (columns) for the Brugge case study with non-Gaussian fluvial channels.

Figure 20. Histograms of the reference, initial, and updated log-permeability fields in layers 1, 3, 6, and 9 (rows 1-4, respectively) for the KSVD-IRLS (column 1), SVD-IRLS (column 2), and TSVD (column 3) methods for the Brugge case study with non-Gaussian fluvial channels.

Figure 21. Flow response data match (first 6 years) and forecasts (12 years afterwards) for the Brugge case study with non-Gaussian fluvial channels: (a) Well #2, (b) Well #5, and (c) Infill Drilled Well #1; the top row shows the watercut (ratio of water to total fluid produced) and the bottom row displays the oil production rate.

Highlights:

- Field-scale demonstration of sparse geologic dictionaries for history matching
- Effective and flexible low-rank parameter representations in field-scale history matching
- Low-dimensional subspace selection with L1-norm minimization
- Efficient learning of large-scale sparse geologic dictionaries using low-dimensional training data