Journal of Petroleum Science and Engineering 145 (2016) 689–703
Contents lists available at ScienceDirect
Journal of Petroleum Science and Engineering journal homepage: www.elsevier.com/locate/petrol
Localized model order reduction in porous media flow simulation Mohamadreza Ghasemi a, Eduardo Gildin b,n a b
Reservoir Engineer, Stone Ridge Technology, Houston, USA Petroleum Engineering Department, Texas A&M University, College Station, USA
art ic l e i nf o
a b s t r a c t
Article history: Received 5 May 2015 Received in revised form 17 January 2016 Accepted 20 June 2016 Available online 30 June 2016
This paper introduces a new approach to construct an efficient reduced order model for fluid flow simulation and optimization in porous media. For nonlinear systems, one of the most common methodology used is the proper orthogonal decomposition (POD) combined with discrete empirical interpolation method (DEIM) due to its computational efficiency and good approximation. Whereas regular POD-DEIM approximates the fine scale model with just one single reduced subspace, the localized POD (LPOD) and localized DEIM (LDEIM) are constructed by computing several local subspaces. Each subspace characterize a period of solutions in time and all together they not only can approximate the high fidelity model better, but also can reduce the computational cost of simulation. LPOD and LDEIM use classification approach to find these regions in the offline computational phase. After obtaining each class, POD and DEIM is applied to construct the basis vectors of the reduced subspace. In the online phase, at each time step, the reduced states and nonlinear functions will be used to find the most representative basis vectors for POD and DEIM without requiring fine scale information. The advantages of LPOD and LDEIM are shown in a numerical example of two phase flow in porous media. & 2016 Elsevier B.V. All rights reserved.
Keywords: Model order reduction Localized POD Localized DEIM Online bases update Classification Fluid dynamics Reservoir simulation
1. Introduction Reservoir management enables one to obtain the most favorable production scenario given the current information of the reservoir, such as reservoir characterization parameters (permeability, porosity) and production data. Very often, one is directed to computational methods (reservoir simulation) to develop new techniques or to optimize the existing fields. There are two main challenges in this realm, the large amount of computational infrastructure necessary to perform such calculations; and the fact that there is no certainty in the range of parameters. In this paper our primary focus is to tackle the first problem by means of reduced order models. Note that the latter issue can be investigated by uncertainty quantification and parameter estimation. In many cases, reduced-order modeling techniques have shown to be a viable way of mitigating computational complexity in simulation of large-scale reservoirs, while maintaining high level of accuracy. The options range from non-intrusive methods that do not depend on modifications of a reservoir simulation code (Cardoso and Durlofsky, 2010; Lerlertpakdee et al., 2014; Ghommem et al., 2015), to more intrusive and sophisticated methods that depend on several modifications of legacy code or the development of new simulator codes (Heijn et al., 2004; Gildin et al., 2013; Gildin and Ghasemi, 2014). n
Corresponding author.
http://dx.doi.org/10.1016/j.petrol.2016.06.030 0920-4105/& 2016 Elsevier B.V. All rights reserved.
POD is one of the efficient intrusive methodology used in model order reduction context due to its computational simplicity and good approximation (Van Doren et al., 2004). However, computational savings are not always attained because in order to evaluate nonlinear terms, the reduced state needs to be projected back to fine scale state yielding similar computational cost as the original system. There are different techniques to alleviate this problem. One approach is to linearize these nonlinear functions (trajectories) around several known states and use these piecewise linear solutions, see Cardoso and Durlofsky (2010). Ghasemi and Gildin (2015) used quadratic bilinear formulation to reformulate nonlinear saturation function into bilinear form and then applied model order reduction. Here, we use discrete empirical interpolation method (DEIM), where one constructs another subspace for reducing the nonlinear function evaluations (Chaturantabut and Sorensen, 2010). In reservoir simulation and optimization, POD-DEIM can yield several orders of magnitude reduction in the size of models and reduce the computational cost significantly (Ghasemi et al., 2015). However, if the system exhibits a very dynamic state with a wide range of changes, many POD and DEIM basis vectors are required to accurately approximate the states of a system as well as the nonlinear terms. A remedy to this problem is to search for not just a single global reduced subspace, but rather multiple local subspaces in time in order to update the bases online, i.e during the reconstruction of the results under different boundary conditions by running a reduced order model. Thus, in this work we refer to
690
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
this procedure as localized model order reduction. In this technique rather than just one single subspace, multiple local subspaces are constructed for the reduced system. In the online phase, based on the state of the system at each time step, the proper subspace that has the best approximation will be selected. By applying the localization technique, the dimension of the reduced subspace at each time step is decreased more (compared to having a single set of global bases for the entire simulation time) and consequently the computational cost is reduced even further. The localization idea was introduced for POD in Amsallem et al. (2012) and was verified in reducing the pressure state in dynamical equations describing a MEMS switch device. They suggested using auxiliary quantities to ensure an online selection procedure is independent of the original system dimension. However, the number of auxiliary quantities, which are computed in the offline phase, scales cubically with the number of clusters. This can be an issue especially in a optimization workflow, whereby one updates bases of reduced model periodically to obtain more stable and accurate solution. Following localized POD (LPOD), localized DEIM (LDEIM) was proposed in Peherstorfer et al. (2014) based on machine learning techniques and using efficient classification algorithm. They discussed both parameter and state based LDEIM approaches and applied it to a steady reacting flow simulation. In this paper, we extend LPOD and LDEIM to reduce the computational cost of porous media fluid flow simulation. We present a new approach to resolve the existing issues in LPOD by reducing the number of required auxiliary parameters. Also, an efficient reduced order model for the fluid transport equation is presented by applying LDEIM on the nonlinear function of fractional flow.
fw (s ) =
k rw(s ) λ w(s ) = , μ λ(s ) k rw(s ) + w k ro(s ) μo
and v is the total velocity defined as,
v = vw + vo
(6)
In this paper, we follow a sequential formulation; at each time step one solves for pressure and flux first and then use these results to solve for saturation. We employ mixed finite element methods (Brezzi and Fortin, 2012) to discretize the pressure equation in order to use the velocity field and preserve the mass conservation, see Ghasemi et al. (2015) for more details and examples. One can write the pressure equation after spatial discretization of the problem as the following system of equations
⎛ B λ s −CT ⎞⎛ v ⎞ ⎛ ⎞ ⎜⎜ ( ( )) ⎟⎟⎜⎜ ⎟⎟ = ⎜⎜ 0 ⎟⎟ p ⎝ C 0 ⎠⎝ ⎠ ⎝ g ⎠
(7)
where matrix B and C are derived from FEM discretization and g results from the sink/source terms, see Appendix A for derivation. Generally, an implicit time (backward Euler) discretization will be applied to solve for saturation profile, while a mass conservative finite volume is used for the spatial derivative discretization in saturation equation. Consider a cell Ωi with interfaces γij and associated normal vectors nij pointing out of Ωi, the saturation Eq. (3) will be discretized as,
⎛ ⎞ ϕi ⎜ k 1 k − 1⎟ − s s i i ⎟ + |Ω | Δt ⎜ i ⎝ ⎠
⎛
⎞
qw sik
⎝
⎠
ρ
∑ Fij⎜⎜ sik⎟⎟ = j≠i
In this section, we summarize the underlying partial differential equations related to porous media flow simulation. In particular, we briefly discuss two-phase oil-water systems, see Aarnes et al. (2009) for more details. We consider two-phase flow in a reservoir domain under the assumption that the displacement is dominated by viscous effects; i.e., we neglect the effects of gravity, compressibility, and capillary pressure. The two phases are water and oil, and they are assumed to be immiscible. The Darcy's law for each phase is as follows,
(1)
where vl is the phase velocity, K is the permeability tensor, krl is the relative permeability to phase l ( l = o, w ), s is the water saturation (we use s instead of sw for simplicity) and p is pressure. Throughout the paper, we will assume that a single set of relative permeability curves is used. Combining Darcy's law with conservation of mass allows us to express the governing equations in terms of the so-called pressure and saturation equations as,
−∇· λ s K ∇p = qw + qo,
)
(2)
⎛ ⎛ ⎞ ⎞ q ∂s + ∇·⎜⎜ fw ⎜⎜ s⎟⎟v⎟⎟ = w , ∂t ρw ⎝ ⎝ ⎠ ⎠
(3)
ϕ
where
ϕ is the porosity, λ is the total mobility defined as,
k (s ) k (s ) λ(s ) = λ w(s ) + λ o(s ) = rw + ro , μw μo fw(s) is the fractional flow function,
(8)
where si is the cell-average of the water saturation at time t , and Fij is the numerical approximation of the flux over interface γij,
2. Two-phase flow model
(()
( ) k
k
k (s ) vl = − rl K ∇p, μl
(5)
Fij ≈
⎡ ⎤ fw ( s )ij ⎢ vij s ⎥. nij dv ⎣ ⎦ ij
()
∫γ
(9)
Note that to find fw ( s )ij over each interface, there are different schemes. It is common to use first order approximation, known as upwinding method, defined as follows (Aarnes et al., 2009),
⎧f s ⎪ w i fw ( s )ij = ⎨ ⎪ ⎩ fw s j
( ) ( )
if v. nij ≥ 0 if v. nij < 0
(10)
Thus, neglecting capillary pressure and gravity, one can derive the following differential equation to solve for saturation,
( )
s k = s k − 1 + A(v k )f s k + q+
(11)
where
qi+ =
⎞ ⎛ 1 max⎜⎜ qi , 0⎟⎟ ϕiΩi ⎠ ⎝
(12)
Note that the pressure and velocity equation in Eq. (7) is a linear system, whereas the saturation equation in Eq. (11) is nonlinear system and can be solved for sk by iterative methods such as Newton-Raphson efficiently. To do so, the residual as a function of saturation is defined as follows,
()
R(s ) = s − s k − 1 − A(v k )f s + q+,
(13)
and the Jacobian as,
(4)
J (s ) =
∂R = I − A f ′(s ). ∂s
(14)
where f ′(s ) is the first derivative of the fractional flow with respect
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
to saturation. Finally, at each iteration the current state is updated as,
s k + 1 = s k − αJ (s k )−1 R(s k )
(15)
The iteration is stopped at each time step when either of the update norm or the Jacobian norm is smaller than a tolerance. At each time step the problem is solved sequentially, meaning that first the pressure equation is solved, and next the saturation equation is solved based on Newton-Raphson method. In the following section, our goal is to present a reduced-order model for both equations and reduce the computation runtime.
3. Model order reduction In the subsequent analysis, we describe different approaches to reduce the computational complexity associated with simulation of two-phase flows. Throughout the paper, we refer to fully-resolved solution as the high fidelity results. The primary approach in this paper is to use POD-Galerkin method to project the fine scale states to a reduced subspace and to solve for fewer number of unknowns. 3.1. Proper orthogonal decomposition (POD) As already stated, POD is typically performed on the premise that the state solutions belong to a subspace with much smaller dimension than a large scale model. Assume there are ns time steps, and the solution is stored as,
x = ⎡⎣ x1, x2, …, x ns⎤⎦ ∈ n × ns
(16)
where n is the total number of grid-blocks, ns is the number of snapshots, x i can be a pressure, saturation, or nonlinear function snapshot that is vectorized to be stacked in the snapshots matrix . These snapshots are assembled into different matrices, p , v , s respectively. After applying a singular value decomposition on these matrices, we select the projection matrix, Φ, by indicating the fraction of total energy to be captured (Volkwein and Kunisch, 2008). After obtaining the projection matrix, the pressure, velocity and saturation are re-parametrized as,
p(t ) = Φppr (t ),
v(t ) = Φvvr (t ),
s(t ) = Φssr (t )
(17)
The pressure and saturation equations are now solved in the reduced space by substituting variables in Eq. (17) into Eqs. (7) and (8). The reduced pressure equation is,
⎛ Φ T BΦ −Φ T CT Φ ⎞⎛ ⎞ ⎛ ⎞ 0 v v p ⎟⎜ vr ⎟ ⎜ v =⎜ T ⎟ ⎜ Φ T CΦ ⎟ ⎜ ⎟ ⎜ p Φ g⎟ 0 ⎝ p v ⎠⎝ r ⎠ ⎝ p ⎠
3.2. Discrete empirical interpolation method (DEIM) We briefly review the DEIM as presented in Chaturantabut and Sorensen (2010). Let f (τ ) ∈ n denotes a nonlinear function where τ refers to time or a parameter. We approximate the function f by projecting it into a subspace spanned by the basis functions Ψ = (ψ1, … , ψm) ∈ n × m as
f (τ ) ≈ Ψc (τ )
(20)
The projection basis vectors Ψ is determined very similar to the POD basis vectors for states. The nonlinear functions of saturation are evaluated at each time step and are stored in a matrix as, f ∈ n × ns and svd is employed to compute the m modes. These modes are used as the projection basis vectors in the approximation given by Eq. (20). Note that Eq. (20) is an over-determined system of equations. The coefficient vector c is calculated based on the evaluation of the nonlinear function at few points such that the resulting system will be nonsingular. Thus, the first step is to construct a matrix as,
P = [e℘1, …, e℘m] ∈ n × m
(21)
where e℘i = [0, … , 0, 1, 0, … , 0]T ∈ n is the ℘i th column of the
identity matrix In ∈ n × n for i = 1, … , m. Multiplying both sides of Eq. (20) by PT and assuming that the matrix PT Ψ is nonsingular, we obtain
f (τ ) ≈ Ψc (τ ) = Ψ (PT Ψ )−1PT f (τ )
(22)
As for the interpolation indices {℘1, … , ℘m}, they are selected using greedy algorithm as given in Chaturantabut and Sorensen (2010) to have a nonsingular matrix. A greedy algorithm is an algorithm that solves the optimization problem heuristically by making the locally optimal choice at each stage. Such algorithms are called greedy because while the optimal solution to each smaller instance will provide an immediate output, the algorithm does not consider the larger problem as a whole. Greedy algorithm works recursively. In this study, this algorithm is used to select the rows of a solution matrix that have largest distance from each other. At each step a column of the matrix is selected, and the row that has the largest Euclidian distance from others will be selected. Although POD-DEIM will reduce the computational runtime, for some problems large number of basis vectors and interpolation points are required to reduce the error and have a reliable approximation. To improve bases selection, one can compute several local subspaces with similar dynamical features and find their corresponding basis vectors. We will elaborate more on this idea in the next section. 3.3. Localized POD
(18)
Thus, the original system with large degrees of freedom (number of interfaces þnumber of grid blocks) is reduced to a dynamical system with few basis vectors for velocity and pressure combined. Furthermore, the nonlinear saturation equation is solved iteratively in reduced space as follows,
( ( )( )
691
)
srk = srk − 1 + ΦsT A v k f Φssrk + q+
(19)
Note that due to the presence of a nonlinear function f in Eq. (19), at each iteration the reduced saturation should be projected back to fine scale for nonlinear function evaluation. This issue increases computational cost, but can be mitigated by applying DEIM, as it is explained in the next section.
The main idea is to divide the snapshots into different subgroups and apply POD to each domain to obtain local POD bases. This will require less number of basis vectors for each region compared to global POD approach. Also, it will give us a better approximation of the solution trajectory in each subdomain, because more representative bases will be used. Two main questions are how to cluster the snapshots into subsets based on a feature and how to efficiently select the proper set in the online phase. There are different techniques to split the solution snapshots space into different regions. In most of these methods, the domain is split into subdomains recursively (Drohmann et al., 2011), but this method might results in a large number of subdomains in practice. Here, we classify the snapshots into different clusters with machine learning techniques and for each cluster a local
692
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
reduced-order model is constructed as suggested in (Peherstorfer et al., 2014). A local reduced-order model is then selected with respect to the current state of the system. This localization approach was introduced for POD method in (Amsallem et al., 2012). However, they proposed unsupervised learning methods that can results in unstable clustering behavior if the clustering method and its parameters are not carefully chosen (Von Luxburg, 2007). Furthermore, the given procedure in Amsallem et al. (2012) requires precomputing auxiliary quantities to ensure an online selection procedure is independent of the large scale system; however, the number of auxiliary quantities scales cubically with the number of clusters. This can be an issue especially in optimization workflows, where one updates basis vectors of reduced model periodically to obtain more stable and accurate solution. Thus, in this work we propose a modified algorithm for localized POD that only uses few number of indices for classification in the offline phase and also using these few indices in the online phase to find the corresponding cluster. This also reduces the storage required for the auxiliary parameters. In the offline (preprocessing) phase of this approach, Algorithm 1 is applied to cluster the snapshot matrix. Here LPOD is used only in saturation equation but it will be extended to pressure equation in future work. The number of clusters k and the number of POD basis vectors in each cluster r are also provided for this algorithm. The subscript s refers to saturation. In the first step of this algorithm deim function is used in order to select grid points that are most representative in the solution space based on the greedy algorithm. On line 2 and 3 of this algorithm, a small sets of indices, instead of fine scale dimension, will be used for clustering in the k-means algorithm. This not only accelerates clustering, but also helps finding the proper set of basis vectors in the online phase to become independent of fine scale solution. Also, the auxiliary variables are smaller and need less storage. It should be clear that this will not introduce any extra error in the LPOD, if enough indices are selected based on the greedy algorithm, because the selected grid points are only used for clustering and classification. The output of the k-means algorithm is the centroid of each cluster and the corresponding index, see Appendix B and references there for more details. These centroids and the indices are used to define a classifier function as explained in Line 4. This classifier will be used in the online phase to distinguish the proper index of a cluster where the current state belongs to. After clustering all the snapshots one can find the POD basis vectors for each group as explained in Lines 5–8. Note that in the Line 7 of this algorithm again a few rows of these basis vectors are selected to be used in the classifier in the online phase. The POD basis vectors for each group, the classifier and the small subset of these bases are returned as the output of this algorithm.
Algorithm 1. Classification procedure of LPOD. 1: Procedure LPOD s , r , k 2:
Pg = deim(s )
3:
( centroids, {S , …, S }) ← kmeans( P , k)
4: 5: 6:
c s(x ) = knnclassify(x , centroids , 1: k ) for i = 1: k do Φi ← svd(Si, r )
1
k
T s g
wis ← PTg Φi
7:
8: end for 9: return c s(x ) , Φi, wis i = 1, … , k 10: End procedure If LPOD is applied to the saturation equation, in the online phase at each Newton-Raphson iteration, the index of a cluster that current state is most likely belongs to can be found as follows,
i ← c s(wis sr ) s
(23) s
where c (x ) and w are the output of the Algorithm 1, and sr is the reduced saturation state. After obtaining the cluster index, one can choose the corresponding POD basis vectors and solve the problem in the most suitable reduced subspace. 3.4. Localized DEIM In this section we discuss the general description of LDEIM and an efficient method to select each subspace in the online phase without fine scale calculation. In LDEIM several local DEIM subspaces are constructed as, (Ψ1, P1) , … , (Ψk, Pk) in the offline phase and then select the best one in the online phase. We follow the approach suggested in Peherstorfer et al. (2014) regarding clustering the snapshots and selecting the best basis vectors in the online phase as explained in Algorithm 2. The inputs are a snapshot matrix of nonlinear functions f , the number of clusters k, and the number of DEIM basis vectors in each cluster m. At the first step, a subset of all snapshots, consisting a few gridblocks, is selected to be used in k-means clusterings as shown in Line 2 and 3 of this algorithm. These steps, similar to LPOD, accelerate the clustering and also make the online classification independent of fine scale calculations. After clustering all the snapshots one can find the DEIM basis vectors for each group as explained in Lines 5–9. Note that in the Line 8 of this algorithm a small subset of these bases is selected to be used in the classifier in the online phase Peherstorfer et al. (2014). The DEIM basis vectors for each group, the classifier and the small subset of these bases is returned from this algorithm.
Fig. 1. 45 45 Homogeneous reservoir. (a) reservoir production pattern. (b) relative perm.
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Algorithm 2. LDEIM procedure Peherstorfer et al. (2014).
8: 9:
1: Procedure LDEIM f , m , k
return c f (x ) , Ψi , wif i = 1, … , k 11: End procedure
Pg = deim(f )
3:
( centroids, {S , …, S }) ← kmeans(
4: 5: 6:
c f (x ) = knnclassify(x , centroids , 1: k ) for i = 1: k do Ψi ← svd(Si, m)
7:
Pi ← deim(Ψi , m)
k
wif ← PTg Ψi (PTi Ψi )−1 end for
10:
2:
1
693
T f
Pg , k )
If LDEIM is applied to the saturation equation, in the online phase at each Newton-Raphson iteration, the index of a cluster that current state is most likely belongs to can be found as follows,
Fig. 2. Illustration of global POD on all snapshots and the first four dominant basis. (a) mean of all snapshots. (b) POD on all snapshots.
Fig. 3. Illustration of localized POD on saturation snapshots mean of each cluster and the first four basis of each cluster is shown. (a) mean 1st cluster (b) mean 2nd cluster (c) mean 3rd cluster (d) LPOD on 1st cluster (e) LPOD on 2nd cluster (f) LPOD on 3rd cluster.
694
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Fig. 4. Illustration of global DEIM on all snapshots and selected grid blocks in a homogeneous model. (a) mean of all snapshots (b) DEIM on all snapshots.
Fig. 5. Illustration of localized DEIM on each cluster and selected grid blocks in a homogeneous model. (a) mean 1st cluster (b) mean 2nd cluster (c) mean 3rd cluster (d) LDEIM on 1st cluster (e) LDEIM on 2nd cluster (f) LDEIM on 3rd cluster.
Fig. 6. Average error with respect to increasing localized DEIM basis.
Fig. 7. Average saturation error with respect to increasing the number of localized DEIM clusters.
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
i ← c f (wi f f˜ )
(24)
where c f (x ) and wf are the output of the Algorithm 2, and f˜ is the approximated nonlinear function based on DEIM approach. Here we select following point-based feature extraction,
f˜ = f (PiT s )
∈ m
(25)
After obtaining the cluster index, one can choose the corresponding DEIM basis vectors and solve the problem in the reduced subspace.
4. Numerical results We will investigate the idea of LPOD/LDEIM by means of two examples. The first one consists of a simple example that we consider for depicting the advantages of localized model order reductions in terms of basis selection. We can also study the selection of the grid points in LDEIM. The second example investigates the application of localized methods in a more complex reservoir model, whereby one can look at the full benefits of using LPOD/LDEIM.
Fig. 8. Localized DEIM cluster analysis based on silhouette plot.
695
Note that as pressure snapshots contain global information about the field, and saturation snapshots contain local information, we applied localized model order reduction to only saturation equation and use the regular techniques for pressure equation. 4.1. Application of LPOD-LDEIM on homogeneous model In this example we will apply LPOD and LDEIM on a simple model and compare the results with the regular POD and DEIM. The reservoir model is a two-phase flow (oil-water) model under the water flooding recovery process with the structure of a 5-spot as shown in Fig. 1(a). The reservoir model is discretized using Cartesian grid of size 10 ft × 10 ft × 10 ft . Overall the reservoir model has 45 × 45 × 1 = 2025 active cells. The permeability of the reservoir is 10 (md) homogeneous and the porosity is 0.2. The relative permeability curves is quadratic as shown in Fig. 1(b). The reservoir model was simulated for 1000 days under constant injection rate of one pore volume and constant BHP at all the producers. The snapshots of pressure, velocity, water saturation, and fractional flow was saved every 10 days, resulting in 100 snapshots. Here we selected the snapshots uniformly, however one can see Kunisch and Volkwein (2010) for other approaches. Primarily, the POD basis for pressure and velocity were found, by applying svd on the corresponding snapshots and selecting the dominant basis. Next, LPOD was applied as instructed in Algorithm 1 on the 100 snapshots of saturation to cluster them into 3 subsets with corresponding basis. Note that in the online phase, the index corresponding to each cluster is found based on the Eq. (23). The mean of all the snapshots of saturation and the first 4 dominant POD basis are shown in Fig. 2. Also, the mean of three clusters resulted from kmean and the first 4 dominant POD basis for each cluster is demonstrated in Fig. 3. comparison of first basis in all the clusters reveals that they are similar to the mean of each cluster, whereas the fourth basis are dynamic and have high frequency. Since the global POD average out all the snapshots, there is a chance that part of a local dynamic might be ignored in the global dominant basis that are selected. Thus, LPOD is expected to give better result as it is shown at the end of this example. Next, one can apply DEIM on all of the fractional flow snapshots and obtain the global DEIM grid points as shown in Fig. 4. LDEIM was also applied on the fractional flow snapshots as explained in Algorithm 2, and 3 clusters were attained. 10 grid points were selected per cluster based on DEIM. The mean of each cluster and the selected grid points are shown in Fig. 5.
Fig. 9. Training schedule of the homogeneous model. (a) injection rate (b) producers BHP.
696
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Fig. 10. Final water saturation and water cut at producers by applying POD and LPOD on the homogeneous model under the training schedule. (a) POD (5 basis) (b) LPOD (5 basis).
Fig. 11. Final water saturation error by applying POD and LPOD on the homogeneous model with the training schedule. (a) POD (5 basis) (b) LPOD (5 basis).
Fig. 12. Final water saturation and water cut at producers by applying DEIM and LDEIM on the homogeneous model under the training schedule. (a) DEIM (10 basis) (b) LDEIM (10 basis).
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
697
Fig. 13. Final water saturation error by applying DEIM and LDEIM on the homogeneous model with the training schedule. (a) DEIM (10 basis) (b) LDEIM (10 basis).
Fig. 14. Temporal saturation error for the homogeneous model under the training schedule. (a) POD and LPOD. (b) DEIM and LDEIM.
Fig. 15. Test schedule: different inputs by varying training input 10%. (a) injection rate (b) producers BHP.
698
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Fig. 16. Final water saturation and water cut at producers by applying POD and LPOD on the homogeneous model under the training schedule. (a) POD (5 basis) (b) LPOD (5 basis).
Fig. 17. Final water saturation and water cut at producers by applying DEIM and LDEIM on the homogeneous model under the training schedule. (a) DEIM (10 basis) (b) LDEIM (10 basis).
Fig. 18. Temporal saturation error for the homogeneous model under the test schedule. (a) POD and LPOD (b) DEIM and LDEIM.
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Comparing the global selected grid blocks and the local ones in each cluster reveals that the clustering allows LDEIM to concentrate the interpolation points in only a certain parts of the reservoir, i.e. close to the front of water saturation. This is one of the reasons that LDEIM can give us better results compared to regular DEIM, because the selected points are more into dynamic part of the model. Fig. 6 indicates the average error in saturation by increasing the number of basis in both DEIM and LDEIM. This figure reveals that for few number of basis LDEIM results in a smaller error compared to regular DEIM. However, for high number of basis the error from LDEIM approach is slightly higher than DEIM. For a fixed number of basis Fig. 7 illustrate the effect of increasing the number of clusters on the average saturation error. This figure indicates that 4 cluster gives us the minimum error. Silhouette plot is another validation method for clustering of data. This technique provides a concise graphical representation of how well each snapshot lies within its cluster Rousseeuw (1987). The silhouette plot of the snapshots corresponding to this example
699
with four clusters is shown in Fig. 8. The squared Euclidean distance were used as a metric to generate this plot. Silhouette value of 0.7–1.0 indicates that strong structure has been found as it is the case for this example. Note that finding the optimal value of the number of clustering is still a very active research topic in machine learning. Here we performed an exhaustive search on a simple model and found that 4 or 5 clusters give us good results for sample sizes of 100–200. Now to verify the localized model order reduction on this simple model, a training schedule is used for the injector (rate controlled) and all the producers (BHP controlled) as shown in Fig. 9(a) and (b), respectively. Note that this amount of injection was selected to assure one pore volume is injected throughout the simulation time. The initial water saturation and pressure are assumed to be 0.0 and 2500 psia, respectively. The final water saturation and the water cut at all the producers are shown in Fig. 10(a) and (b) for POD and LPOD, respectively. As illustrated in these figures the LPOD outperforms the regular POD, yielding better approximation of the high fidelity model with
Fig. 19. 5 layers of SPE10 (Layer 10–14th). (a) permeability (b) relative perm.
Fig. 20. Training schedule (solid line) and Test schedule (dashed line) with ±5% variation. (a) injection rate (b) producers BHP.
700
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
almost exact water-cuts Fig. 11. Fig. 12(a) and (b) compare the same results for DEIM and LDEIM. LDEIM reveals accurate results, whereas regular DEIM is unstable. The final water saturation error is also compared between POD and LPOD in Fig. 10 and between DEIM and LDEIM in Fig. 13. The relative error in saturation is calculated based on Eq. (26) and is compared for POD and LPOD in Fig. 14(a) and for DEIM and LDEIM in Fig. 14(b). All of these results confirm the superiority of local model reduction over global one.
serr (ti ) = ∥ sred(ti ) − sref (ti )∥2 /∥ sref (ti )∥2 ,
ti ∈ [0 Tf ]
(26)
One needs to test the localized model reduction with a different input to verify its robustness we used the test schedule in Fig. 15, which is 10% variation with respect to the training schedule. The final water saturation and the water cut at all the producers are shown in Fig. 16 for POD and LPOD, respectively. As illustrated in these figures the LPOD outperforms the regular POD, yielding better approximation of the high fidelity model with almost exact watercuts. Fig. 17 compares the same results for DEIM and LDEIM, whereas LDEIM outperforms regular DEIM in accuracy and stability. The temporal saturation error is calculated based on Eq. (26) and is compared for POD and LPOD in Fig. 18(a) and for DEIM and LDEIM in Fig. 18(b). All of these results also confirm the superiority of localized model reduction over global method with test schedule.
5. Case study In this example the localized model reduction is applied to a two-phase flow (oil-water) reservoir model under the water flooding recovery process with the structure of a 5-spot as shown Table 1 Compare fine and reduced model. Num # of
Fine scale
POD/LPOD
DEIM/LDEIM
Pressure basis Velocity basis Saturation basis Fractional basis Iterations Elapsed time (s)
66,000 183,400 66,000 66,000 869 7371.3
3 10 6/6 – 257/273 177.7/177.1
– – – 7/7 306/312 163.7/160
in Fig. 19(a). The permeability of the reservoir is taken from SPE10 comparative model (Christie and Blunt, 2001) (5 layers, layer 10– 14th). The reservoir model is discretized using Cartesian grid of size 20 ft × 10 ft × 2 ft . Overall the reservoir model has 60 × 220 × 5 = 66000 active cells. The fluid viscosity ratio is μw /μo = 0.1. The relative permeability curves is depicted in Fig. 19 (b). We assumed a constant porosity of 0.2 for entire model. The producers are controlled by bottom hole pressure and the injector by rate. The input schedule is changed every 200 days as shown in Fig. 20. This figure includes both the training and the test schedule. Note that injection volume is at least one pore volume throughout simulation time (1000 days). The initial water saturation and pressure are assumed to be 0.0 and 2500 psia, respectively. In order to apply the POD-DEIM methods, the reservoir is simulated with training schedule for 1000 days and the snapshots of pressures, velocity, water saturations and the nonlinear fractional function are saved every 10 days. Thus, we have 100 snapshots for each variables. One can find the basis vectors and construct the reduced model as explained in previous sections. Next, the reduced model is verified with the test schedule. Regular (global) POD is applied to velocity and pressure, and POD/LPOD and DEIM/LDEIM to saturation equation as it is a nonlinear equation. The selection criteria for pressure and velocity basis vectors was to capture at least 99% of the energy of snapshots. Throughout this example LPOD and POD as well as DEIM and LDEIM are compared in the saturation equation. The number of basis vectors is compared between different reduced order models and the original fine scale model in Table 1. The snapshots were classified into 4 clusters. The overall speedup is between 40 and 45 times. Note that even though the localized method has more overhead during the online phase, it does not have higher computational time compared to regular POD-DEIM. Thus, this overhead is small as only few grid blocks information are needed to select the right basis vectors. Note that, when we apply DEIM or LDEIM, the saturation equation is assumed to be projected onto the reduced subspace by regular (global) POD basis vectors. Combining LPOD-LDEIM will be considered in future work. The final water saturation and the water cut at all the producers are shown in Fig. 21(a) and (b) for POD and LPOD, respectively. Note that the water cut results from high fidelity model is shown as dashed line in these figures. As illustrated in these figures the LPOD approximates the original fine scale model better, with almost
Fig. 21. Final water saturation and water cut at producers. (a) POD (7 basis) (b) LPOD (7 basis).
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Fig. 22. Final water saturation error. (a) POD (7 basis) (b) LPOD (7 basis).
Fig. 23. Final water saturation and water cut at producers. (a) DEIM (9 basis) (b) LDEIM (9 basis).
Fig. 24. Final water saturation error. (a) DEIM (9 basis) (b) LDEIM (9 basis).
701
702
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
Fig. 25. Temporal saturation error. (a) POD and LPOD (b) DEIM and LDEIM.
identical water cut. Fig. 23(a) and (b) compare the same results for DEIM and LDEIM, whereas LDEIM outperforms regular DEIM. The final water saturation error is also compared between POD and LPOD in Fig. 22 and between DEIM and LDEIM in Fig. 24. The temporal saturation error is compared for POD and LPOD in Fig. 25(a) and for DEIM and LDEIM in Fig. 25(b). All of these results confirm the superiority of local model order reduction over the global approach.
6. Conclusion A localized POD and DEIM model reduction scheme has been introduced for the solution of the two-phase flow in heterogeneous porous media. It has been our experience that performing localization with multiple snapshot yield a more accurate and stable reduced order models than a single reduced subspace. More work still need to be done in selecting the appropriate number of clusters and the specific clustering technique used. Moreover, applications of the proposed method in reservoir production optimization should be investigated.
Acknowledgment This publication also was made possible by a National Priorities Research Program grant NPRP grant 7-1482-1278 from the Qatar National Research Fund (a member of The Qatar Foundation). The authors would like to acknowledge partial support for this research from the Army Research Office under grant W911NF-12-1-0206 and the Foundation CMG through the FCMG Research Chair at Texas A&M.
( (λK )
) (
−1
)
v, u + ∇p, u = 0
∀ l ∈ L (Ω)
The pressure equation is solved based on mixed finite element method (FEM) Brezzi and Fortin (2012). By introducing the flux variable v = − λ(s )K ∇p, we get the following first-order system −1
(λ( s )K ) v + ∇p = 0, ∇·v = q.
(A.1)
To drive the mixed FEM, we primarily define the following Sobolev space with imposing no flow at the boundary,
The mixed formulation of (p, v ) ∈ L2(Ω ) × H01, div(Ω ) such that,
Eq.
(A.1)
(A.3)
2
where ( ·, ·) is the L inner product. To make this system of equation well posed, one should add an extra constraint, e.g. ∫Ω p dx = 0. In a mixed FEM discretization we replace the pressure and velocity solution space ( L2(Ω ) and H01, div ) by finite dimensional subspaces that typically consists of low order piecewise polynomials (triangles, quadrilateral, etc). For further discussion one can refer to Aarnes et al. (2007). Defining the finite element spaces Q and U,
U = {p ∈ L2(Ω): p| Ωi
is constant
∀ Ωi ∈ Ω}V = {v ∈ H01,div(Ω): v| Ωi have linear components ∀ γij ∈ Ω,
and
v·nij
∀ Ωi ∈ Ω, (v·nij )| γij
is constant
is continuous across γij}
(A.4)
where nij is the unit normal to γij pointing from Ωi to Ωj. The corresponding Raviart-Thomas mixed FEM is to find (p, v ) ∈ U × V such that
∫Ω (λ(s)K )−1v·u
dx −
∫Ω p∇·u
∫Ω (∇·v)l
∫Ω q l
dx,
dx =
dx = 0,
(A.5)
(A.6)
for all u ∈ V and l ∈ U . After discretization and writing p = ∑Ω pm ϕm and v = ∑γ uij ψij , where ϕm ∈ U , ψij ∈ V are the basis ij
functions; the system of equations (A.5) takes the form,
Appendix A. Pressure equation discretization
v · n = 0} .
)
2
m
H01, div = {v ∈ (L2(Ω)): ∇·v ∈ L2(Ω) and
(
∀ u ∈ H01, div(Ω) ∇·v, l = (q, l)
(A.2) is
to
⎛ B(λ(s )) −CT ⎞⎛ v ⎞ ⎛ 0 ⎞ ⎜ ⎟⎜ ⎟ = ⎜ ⎟ ⎝ C 0 ⎠⎝ p⎠ ⎝ g ⎠
(A.7)
where B = [bij, kl ] and C = [cm, ij ] defined as,
bij, kl =
∫Ω (λ(s)K )−1ψij·ψkl dx,
cm, ij =
∫Ω
m
∇·(ψij )dx
(A.8)
and g results from the sink/source terms. Here, we keep the same notations for discrete and continuous variables for simplicity. Appendix B. Clustering
find Clustering or classification is the task of grouping a set of
M. Ghasemi, E. Gildin / Journal of Petroleum Science and Engineering 145 (2016) 689–703
objects such that the ones with a similar feature are in the same group, called cluster. Cluster analysis is one of the main techniques in statistical data analysis, used in many fields such as machine learning and pattern recognition among others. There are mainly four types of clustering, namely hierarchical, centroid based, distribution based, and density based clustering. Centroid based clustering represents a cluster by a central vector, which may not be necessarily a member of the data set. If the number of clusters is fixed, then this clustering can be formulated as an optimization problem, to find the required number of cluster centers (centroids) and assigning the objects to the nearest cluster. This optimization problem is NP-hard and the common approach is to find the approximate suboptimal solution. One of the popular approximative method is k-means Hartigan and Wong (1979). This classification method partitions the data space into a structure known as a Voronoi diagram, and it is close to nearest neighbor classification. In this paper, k-means with squared Euclidean distance is used to cluster the snapshots. This algorithm is easy to implement and has a good performance Jain (2010). In this method, given a set of data points in n-dimentional space and an integer k corresponding to the number of clusters, the problem is to determine k points, called centroids, so as to minimize the distance from each data point to its nearest center. There are iterative scheme to solve this problem efficiently Kanungo et al. (2002).
References Aarnes, J.E., Gimse, T., Lie, K.-A., 2007. An introduction to the numerics of flow in porous media using matlab. In: Hasle, G., Lie, K.-A., Quak, E. (Eds.), Geometrical Modeling, Numerical Simulation, and Optimization: Industrial Mathematics at SINTEF. Springer Verlag, Berlin, pp. 265–306. Aarnes, J.E., Lie, K.-A., Kippe, V., Krogstad, S., 2009. Multiscale methods for subsurface flow. In: Engquist, Björn, Lötstedt, Per, Runborg, Olof (Eds.), Multiscale Modeling and Simulation in Science vol. 66. Springer Verlag, Berlin, pp. 3–48. Amsallem, D., Zahr, M.J., Farhat, C., 2012. Nonlinear model order reduction based on local reduced-order bases. Int. J. Numer. Methods Eng. 92 (10), 891–916. Brezzi, F. and Fortin, M., Mixed and Hybrid Finite Element Methods; ISBN¼ 9781461231721, In Springer Series in Computational Mathematics, 2012, vol. 15, Springer, New York.
703
Cardoso, M., Durlofsky, L., 2010. Use of reduced-order modeling procedures for production optimization. SPE J. 15 (2), 426–435. Chaturantabut, S., Sorensen, D.C., 2010. Discrete empirical interpolation for nonlinear model reduction. SIAM J. Sci. Comput. 32 (5), 2737–2764. Christie, M., Blunt, M., 2001. Tenth SPE Comparative Solution Project a Comparison of Upscaling Techniques. SPE-72469. SPEREE 4, pp. 308–317. Drohmann, M., Haasdonk, B., Ohlberger, M., 2011. Adaptive reduced basis methods for nonlinear convection–diffusion equations. In: Fořt, J., Fürst, J., Halama, J., Herbin, R., Hubert, F. (Eds.), Finite Volumes for Complex Applications VI Problems & Perspectives. Springer, Berlin, pp. 369–377. Ghasemi, M., Gildin, E., 2015. Model order reduction in porous media flow simulation using quadratic bilinear formulation. Comput. Geosci., 1–13. http://dx. doi.org/10.1007/s10596-015-9529-0. Ghasemi, M., Gildin, E., Yang, Y., Efendiev, Y., Calo, V., 2015. Fast multiscale reservoir simulations using pod-deim model reduction. In: Proceedings of SPE Reservoir Simulation Symposium. Houston, Texas, SPE 173271. February. Ghommem, M., Gildin, E., Ghasemi, M., 2015. Complexity reduction of multi-phase flows in heterogeneous porous media. SPE J., 167295. Gildin, E., Ghasemi, M., 2014. A new model reduction technique applied to reservoir simulation. In: Proceedings of the 14th European Conference on the Mathematics of Oil Recovery. Sicily, Italy. Gildin, E., Ghasemi, M., Romanovskay, A., Efendiev, Y., 2013. Nonlinear complexity reduction for fast simulation of flow in heterogeneous porous media. In: Proceedings of SPE Reservoir Simulation Symposium. The Woodlands, Texas, SPE 163618. February. Hartigan, J.A., Wong, M.A., 1979. Algorithm as 136: a k-means clustering algorithm. Appl. Stat., 100–108. Heijn, T., Markovinovic, R., Jansen, J., 2004. Generation of low-order reservoir models using system-theoretical concepts. SPE J. 9 (June), 2. Jain, A.K., 2010. Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 31 (8), 651–666. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y., 2002. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24 (7), 881–892. Kunisch, K., Volkwein, S., 2010. Optimal snapshot location for computing pod basis functions. ESAIM: Math. Model. Numer. Anal. 44 (03), 509–529. Lerlertpakdee, P., Jafarpour, B., Gildin, E., 2014. Efficient production optimization with flow-network models. SPE J. Peherstorfer, B., Butnaru, D., Willcox, K., Bungartz, H.-J., 2014. Localized discrete empirical interpolation method. SIAM J. Sci. Comput. 36 (1), A168–A192. Rousseeuw, P.J., 1987. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20 (0), 53–65. Van Doren, J., Markovinovic, R., Jansen, J.D., 2004. Reduced-order optimal control of waterflooding using pod. In: Proceedings of the 9th European Conference on the Mathematics of Oil Recovery. Volkwein, S., Kunisch, K., 2008. Proper orthogonal decomposition for optimality systems. ESAIM: Math. Model. Numer. Anal. 42, 1–23. Von Luxburg, U., 2007. A tutorial on spectral clustering. Stat. Comput. 17 (4), 395–416.