Reparameterization for statistical state estimation applied to differential equations

Journal of Computational Physics 231 (2012) 2641–2654. doi:10.1016/j.jcp.2011.12.019

T. Butler (a), M. Juntunen (b, corresponding author)

(a) Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, Austin, TX 78712, United States. E-mail: [email protected]
(b) Department of Mathematics and Systems Analysis, Aalto University School of Science, Finland. E-mail: [email protected]

Article history: Received 7 April 2011; Received in revised form 6 November 2011; Accepted 15 December 2011; Available online 27 December 2011.

Keywords: Ensemble Kalman filter; EnKF; State estimation; Parameter estimation; Data assimilation; Partial differential equations

Abstract

The ensemble Kalman filter is a widely applied data assimilation technique useful for improving the forecast of computational models. The main computational cost of the ensemble Kalman filter comes from the numerical integration of each ensemble member forward in time. When the computational model involves a partial differential equation, the degrees of freedom of the solution in the discretization of the spatial domain are oftentimes used for the representation of the state of the system, and the filter is applied to this state vector. We propose a method of approximating the state of a partial differential equation in a representation space developed separately from the numerical method. This representation space represents a reparameterization of the state vector and can be chosen to retain desirable physical features of the solutions. We apply the ensemble Kalman filter to this representation of the state, and numerically demonstrate that acceptable results are obtained with substantially smaller ensemble sizes.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

In the last two decades, data assimilation (DA) methods have proven useful in improving the operational capabilities of numerical weather prediction [4], oil reservoir simulation [5], and global positioning system devices [6], to name a few. We focus on the use of sequential DA methods for the purpose of improving forecasts for computational models involving evolutionary partial differential equations.

The Kalman filter (KF) serves as the basis for most of the widely used data assimilation schemes; see [7] for a comprehensive review. The KF provides the optimal (minimum variance) state estimate for linear models with Gaussian errors [4]. It alternates a forecast step, which integrates the most recent estimate forward in time, with an analysis step, which updates (and corrects) the forecast with new observations. However, the KF is not designed to be optimal for nonlinear models. Ensemble Kalman filter (EnKF) methods [7,8] address several difficulties with applying the KF to nonlinear problems via the extended KF, e.g. storing and propagating in time the error covariance matrix. These EnKF methods represent the uncertainties around the KF state estimates by an ensemble of state vectors, in place of computing an error covariance matrix. The time update, or forecast, of the uncertainties is then carried out through the integration of each ensemble member with the full nonlinear forward model, avoiding some of the problems that arise from the linearization of the system [7]. The analysis step is still based on the linear KF step. Ensemble methods can be either stochastic or deterministic [9]. In the stochastic approach, perturbed observations are used to correct each ensemble member. In the deterministic approach, what is known as an ensemble square-root filter (EnSRF) is used, and the analysis step is applied only to the mean of the ensemble members. In an EnSRF, the ensemble members are re-sampled after every analysis using the analyzed error covariance.


Due to the slow convergence rate of Monte Carlo sampling, numerical experiments using an EnKF or EnSRF often require O(100) ensemble members to provide accurate estimates of uncertainties in the state space. Since each ensemble member requires forward integration with the model to update uncertainties in time, the computational cost of implementing an ensemble method can become prohibitive. We describe below some recently proposed variants of EnKF and EnSRF methods that address issues related to the size of the ensemble required to capture uncertainties in the state space.

Several dimension reduction techniques have been proposed over the years that decrease the cost of various KF methodologies by projecting the high dimensional state space onto a low dimensional subspace (see e.g. [15,16]). For example, the singular evolutive extended Kalman (SEEK) filter [17] approximates the error covariance matrix by a singular matrix of low rank. Approximating the error covariance in this way results in corrections only in the directions spanned by the low dimensional subspace. This led [17,19] to choose as the basis vectors for this low dimensional subspace only the directions whose errors are not sufficiently attenuated by the system. An EnSRF variant of the SEEK filter, called the singular evolutive interpolated Kalman (SEIK) filter, was introduced in [18]; there, an ensemble spanning the subspace replaces the linearization of the system required in the SEEK filter. Various degraded forms of these filters have been proposed, e.g. see [19], that further reduce implementation cost by limiting the evolution of the ensemble to only a few members or to some chosen forecast steps. The methods above do not address the loss of desirable features in the solution, as they are applied directly to the state derived from the numerical method.

The method we propose does not compete with the various ensemble methods that have been studied; in fact, these methods serve different purposes. The dimension reduction techniques mentioned above use an analysis similar to principal component analysis to determine a subspace of the state space defined by the numerical method. The result is that a smaller ensemble is required to capture the uncertainties in the state space. However, the dimension of the state vectors remains unchanged even though the effective dimension of the state space has been reduced, i.e. the state is still defined by the number of degrees of freedom in the spatial discretization even though the spatial variability is spanned sufficiently well by a number of directions significantly smaller than the number of degrees of freedom.

The primary assumption in our proposed method is that we have some a priori knowledge of the state space of the system. We propose that the state be defined in a representation (or constrained) space determined prior to, and independently of, the numerical method/integrator being applied to the computational model. We construct such a space to retain desirable features of the solution to a differential equation; as a result, the number of degrees of freedom in the numerical method does not influence the dimension of the state vector, which we show has the desirable effect of reducing the number of ensemble members required to obtain accurate solutions. Since we are interested in solutions to physical processes modeled as partial differential equations, we assume that it is known, at least in a weak sense, that the solution exists in a specific infinite dimensional function space.
We use the structure of the function space, or actual physical constraints on the solution, to construct a finite-dimensional representation space spanned by a finite set of basis functions, and it is to the coefficients of these representative basis functions that we apply the EnKF method.

Other methods exist for expanding the state in terms of a particular basis. For example, in [2], an efficient algorithm based on the generalized polynomial chaos (gPC) expansion and an EnSRF scheme was presented. In this method, stochastic quantities are expressed as convergent polynomial series of input random variables. This requires solving the system of stochastic state equations via gPC-based numerical methods, e.g. using either a stochastic Galerkin or stochastic collocation approach to forecast the state. The stochastic Galerkin approach is an intrusive method that results in a set of deterministic, and often coupled, equations that must be solved numerically. A non-intrusive pseudo-spectral stochastic collocation approach was also explored that approximated the exact gPC expansion. However, while the state is defined in terms of the coefficients of a polynomial basis of the stochastic quantities, each of these coefficients has dimension equal to the number of degrees of freedom in the numerical method. Furthermore, this gPC method represents a reparameterization in terms of the aleatory uncertainty, which is distinct from our proposed method.

An alternative approach that is perhaps most closely related to our proposed method is the two-stage EnKF. In [3], a Bayesian framework is used to describe the problem of data assimilation, and the Kalman gain used to compute the analysis state is determined by maximizing a likelihood function. The first stage uses the standard EnKF to update the state according to the observable data. A penalty term accounting for nonphysical solutions is then added to the likelihood function to form a generalized least squares problem, i.e. Tikhonov regularization is used. This is equivalent to a hierarchical Bayesian update. This is distinct from our proposed method in that we map directly to the representation basis and compute the analysis on this basis, whereas the two-stage EnKF seeks, in some sense, to fit the analysis state to both numerical and representation basis functions simultaneously.

The method we propose does not require the use of the traditional EnKF. Any of the above ensemble methods may be used with our approach, where a reparameterization of the state occurs before the analysis step. We restrict attention to the traditional EnKF to focus the results on the effect of the reparameterization of the state. Therefore, we hereafter refer to our method simply as the reparameterized state EnKF, or RS-EnKF, for brevity.

This paper is organized as follows. In Section 2, we review the KF and EnKF. In Section 3, we describe the problem of defining the state of a differential equation in terms of the numerical basis and describe the proposed RS-EnKF method. A motivating example is used throughout Section 3 to illustrate the conceptual themes. In Section 4, we provide additional numerical results, and concluding remarks follow.


2. Sequential data assimilation and the Kalman filter

We first present the equations for the basic KF designed for state estimation problems using linear models. We then present the EnKF as a means of extending the KF to nonlinear problems that does not require any explicit computation of model sensitivities as in the extended KF.

2.1. The Kalman filter

The KF produces the "best" estimate of a state (for a linear model) in the sense that it combines measurements with a model forecast to produce a minimum-variance estimate. The KF algorithm assumes Gaussian distributions for the noise in the model and measurement. The basic model and measurement equations are:

u_{k+1} = A_{k+1} u_k + w_{k+1},
d_{k+1} = H_{k+1} u_{k+1} + ε_{k+1}.    (1)

Here, w_{k+1} ~ N(0, Q_{k+1}) and ε_{k+1} ~ N(0, R_{k+1}) are the model and measurement noises, respectively. Given a forecast of the state u^f_{k+1} and error covariance P^f_{k+1} at time t_{k+1}, the analyzed state u^a_{k+1} is determined from u^f_{k+1} and the measurement data d_{k+1} using:

u^a_{k+1} = u^f_{k+1} + K (d_{k+1} − H_{k+1} u^f_{k+1}),    (2)

where the Kalman gain matrix, K, is defined as:

K = P^f_{k+1} H^T_{k+1} (H_{k+1} P^f_{k+1} H^T_{k+1} + R_{k+1})^{−1}.    (3)

Here, the superscript T denotes the matrix transpose. The analyzed estimate of the error covariance is

P^a_{k+1} = (I − K H_{k+1}) P^f_{k+1}.    (4)
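In code, the analysis equations (2)-(4) reduce to a few dense linear algebra operations. The following minimal NumPy sketch mirrors the symbols above; it is an illustration, not an optimized (or square-root) implementation:

```python
import numpy as np

def kf_analysis(u_f, P_f, H, R, d):
    """One Kalman analysis step, Eqs. (2)-(4).

    u_f: forecast state (n,); P_f: forecast error covariance (n, n);
    H: observation matrix (m, n); R: measurement covariance (m, m);
    d: measurement data (m,). Returns the analyzed state and covariance.
    """
    S = H @ P_f @ H.T + R                    # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)         # Kalman gain, Eq. (3)
    u_a = u_f + K @ (d - H @ u_f)            # analyzed state, Eq. (2)
    P_a = (np.eye(len(u_f)) - K @ H) @ P_f   # analyzed covariance, Eq. (4)
    return u_a, P_a
```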

The analyzed state u^a_{k+1} is then forecast to time t_{k+2} using (1), and since the system is linear, equations for the forecast of P^a_{k+1} to time t_{k+2} can be derived.

2.2. The ensemble Kalman filter

Now consider a general nonlinear model with additive noise defined by

u_{k+1} = g(u_k) + w_{k+1},
d_{k+1} = h(u_{k+1}) + ε_{k+1}.    (5)

In the EnKF, we use an ensemble of states to deal with the technical difficulty that nonlinear state transitions introduce in the prediction stage (e.g. the mean of the forecast may not be the forecast of the mean) by using the propagation of the ensemble to compute the forecasts and error statistics used in the Kalman gain (3). We summarize the EnKF presented in [8,7] in Algorithm 1 below, where for the sake of simplicity we assume h = H is a linear observation operator. If this is not the case, the same algorithm applies, but H is interpreted as the linearization of the nonlinear observation operator h, or we may use the standard technique where the state is expanded by concatenating the data, which makes the measurement operator linear.

Algorithm 1: Ensemble Kalman filter

1. Suppose we have N ensemble members, {u_1, ..., u_N} ⊂ R^n, at time t_{k+1}, and an observed data vector d ∈ R^m.
2. Form A ∈ R^{n×N} whose jth column is u_j.
3. Compute the ensemble mean matrix Ā := (1/N) A 1_{N×N}, where 1_{N×N} denotes the N × N matrix of ones.
4. Compute the anomalies A′ := A − Ā.
5. Simulate measurement errors ε_j for each ensemble member to form the perturbed observation vectors d_j = d + ε_j.
6. Form D = (d_1, ..., d_N) ∈ R^{m×N}.
7. The analyzed ensemble is given by

A^a = A + A′A′^T H^T (H A′A′^T H^T + R)^{−1} (D − H A).    (6)
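A minimal NumPy sketch of this analysis step follows. It uses the sample covariance A′A′^T/(N − 1), whose normalization Eq. (6) leaves implicit; the sketch illustrates the stochastic (perturbed-observation) update and is not the authors' implementation:

```python
import numpy as np

def enkf_analysis(A, H, R, d, rng):
    """Stochastic EnKF analysis step, Algorithm 1 / Eq. (6).

    A: ensemble matrix (n, N), one state per column; H: linear observation
    operator (m, n); R: measurement error covariance (m, m); d: data (m,).
    """
    n, N = A.shape
    A_prime = A - A.mean(axis=1, keepdims=True)        # anomalies A' = A - Abar
    eps = rng.multivariate_normal(np.zeros(d.size), R, size=N).T
    D = d[:, None] + eps                               # perturbed observations d_j
    S = H @ A_prime                                    # observed anomalies
    C = S @ S.T / (N - 1) + R                          # H P H^T + R with sample P
    K = (A_prime @ S.T / (N - 1)) @ np.linalg.inv(C)   # ensemble Kalman gain
    return A + K @ (D - H @ A)                         # analyzed ensemble A^a

# usage: rng = np.random.default_rng(0); A_a = enkf_analysis(A, H, R, d, rng)
```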

The past decade has seen an increased interest in minimizing the ensemble size N in Algorithm 1. The computational cost increases linearly with the number of ensemble members, since each ensemble member requires a forward integration in time using


the computational model. A good description of the numerical properties of the EnKF is given by Li and Xiu [10]. The method we describe below may allow for fewer ensemble members in Algorithm 1 while maintaining acceptable accuracy through a possible reduction of the degrees of freedom in the states u_i. This is validated in Section 4.

3. Sequential data assimilation and differential equations

We are primarily interested in the application of the ensemble Kalman filter (EnKF) to partial differential equations (PDEs). We consider the general evolutionary PDE defined by

D(x, t; λ; u) = f(x, t; λ),    (x, t) ∈ Q = S × [0, T],    (7)

where D is a (nonlinear) differential operator including boundary and initial conditions, u := u(x, t; λ) is the solution, f(x, t; λ) is the source term, and λ represents any (model/source) parameters to be estimated along with the state. We let S denote the spatial domain, [0, T] the time interval of interest over which we collect data and make forecasts, and Q = S × [0, T] the space-time domain. Using the language of DA, we define the state of (7) as the solution u (ignoring for now any estimable parameters λ), which exists in an infinite-dimensional function space. This poses a problem for DA methods, which require the state space to be finite-dimensional, but one circumvented using numerical methods; see Section 3.1 below for more details.

We propose an alternative approximation to the state of the system, interpreted as a reparameterization of the state. The method we propose is non-intrusive in the sense that it does not require developing new, and possibly complicated, numerical methods for solving the computational model. The proposed method is described in the following section. We note that the method we propose does not require the use of any particular data assimilation scheme. We choose the widely known EnKF, summarized in Section 2.2, as it is relatively straightforward and allows us to focus the presentation and numerical results on the effect that reparameterization of the state has on the method.

To motivate the proposed method below, we consider the problem of a vibrating string modeled by the wave equation:

m(x) ∂²u/∂t²(x, t) = ∂²u/∂x²(x, t)  for x ∈ (0, 1), t ≥ 0,    (8)
u(0, t) = u(1, t) = 0,    (9)
u(x, 0) = u_0(x),  v(x, 0) = v_0(x),    (10)

where u(x, t) is the displacement, v(x, t) = ∂u(x, t)/∂t is the velocity, and m(x) is the mass of the string. This system differs from the types of systems often considered in data assimilation problems, where applications are often derived from oceanography or meteorology and the models themselves are usually dissipative in nature, allowing perturbations to "die out" in time as the forcing term drives the system to a stable state. Here, we have no forcing terms to drive the system to a stable state, and perturbations added by data assimilation are preserved by the system, a property we also preserve in the numerical technique described below; i.e. the system is driven essentially by the observable data. However, we have a priori knowledge of the problem, e.g. we know the mass and initial conditions are smooth and that the solution at each time step should be smooth as well. Thus, applying the EnKF to the numerical state will produce (as seen below) very poor results compared to applying the EnKF to the reparameterized state, which we define in the following sections.

3.1. Defining the numerical state

In practice, we typically do not have access to u, so a discretization of (7) is used to compute a numerical approximation u_h to u. Oftentimes the discretization of (7) results in approximating u(x, t_k) by a finite-dimensional vector in R^n at a given (discrete) time t_k, where n indicates the degrees of freedom in the spatial discretization, i.e. u(x, t_k) is approximated by u_h(x, t_k) in S. Thus, the state of the system at time t_k is approximated by the numerical approximation and is of dimension n, and DA methods are applicable. The dimension of the state defined in this way becomes a function of the discretization, and the performance of the DA method may have unnecessary dependence on the discretization, as shown in Section 4 and in the motivating example below.

We now assume time t = t_k is fixed and focus on the spatial representation of the state. In general terms, the numerical technique and associated discretization are chosen to approximate a solution u ∈ V by a numerical approximation u_h ∈ V_h. Here, V is the function space where solutions to (7) exist at time t = t_k, and V_h is an approximation space consistent with the numerical technique. There are many factors that may motivate or restrict the choice of V_h, e.g. numerical stability or accuracy in a given norm. Additionally, legacy code or available software may restrict the type of input data, so that V_h is restricted before other factors are even considered.

Example 1. Continuing the motivating example, we choose a discretization of Q to mimic a "legacy code" situation where the discretization is limited to simple input data, namely displacement and velocity values at equally distributed nodes. This discretization, coupled with the solver, defines the space V_h. The discretization scheme is validated below.


Fig. 1. The manufactured solution (blue) with 15 terms in the sine series and the underlying approximated function (red).

Fig. 2. Time-averaged L²-error with Δt = 0.02 (blue) and Δt = 0.002 (black) using 50 (dash-dotted), 75 (dashed), and 100 (continuous) grid points.

Spatial derivatives are approximated using centered finite differences with Np equally distributed spatial grid points. We evolve the spatially discretized system in time using the classic midpoint rule (i.e. Crank–Nicolson method). The midpoint rule preserves non-smooth solutions. The method is validated against the manufactured solution:

u(x, t) = Σ_{n=1}^{N_s} c(n) cos(nπt) sin(nπx),    c(n) = 2(cos(nπ/5) − cos(2nπ/5)) / (nπ),    (11)

where it is assumed that m(x) ≡ 1, the initial velocity is zero, and the initial displacement is the sine series approximation to a function that is 1 in x ∈ (0.2, 0.4) (and zero elsewhere). Fig. 1 shows the underlying function and the sine series approximation with 15 terms. We stress that we use the smooth truncated sine series function as our solution, not the function that the series approximates. Unless otherwise stated, we use the series with N_s = 15 terms. The goal is to numerically capture the low frequency, high amplitude waves traveling in the string. At time t we measure the instantaneous L²-error in the numerical solution, ‖u − u_h‖_{L²(0,1)}. Due to the oscillating solution, the instantaneous L²-error also oscillates in time. Therefore, we use as a metric for the quality of a numerical solution the time-averaged L²-error, defined by

E_{L²}(t; u_h, δt) := (1/δt) ∫_{t−δt/2}^{t+δt/2} ‖u(x, s) − u_h(x, s)‖_{L²(0,1)} ds.    (12)

We say that at time t the solution is "good enough" or "has maintained important features" if the time-averaged L²-error of the displacement with time window size δt = 2 is less than a pre-set tolerance of 0.2, i.e. E_{L²}(t; u_h, 2) < 0.2. This tolerance was empirically chosen and provides a measure of the quality of forecasted states in the results below. To validate the numerical method, we test the ability to track the solution over the time interval [0, 40] given the correct initial conditions. Fig. 2 shows the time-averaged L²-error of the system with Δt = 0.02 and Δt = 0.002 using N_p = 50, 75, and 100. We observe that E_{L²}(t; u_h, δt) < 0.2 holds over the full interval only for Δt = 0.002 and N_p = 100. We note that if Δt = 0.02 and N_p = 100, then E_{L²}(t; u_h, δt) < 0.2 for less than 10 s.
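For concreteness, a minimal sketch of this type of discretization follows: centered second differences in space with homogeneous Dirichlet boundary conditions and the midpoint rule in time applied to the first-order system u_t = v, m v_t = u_xx. Details such as the treatment of the boundary nodes are our assumptions, not the authors' code:

```python
import numpy as np

def wave_midpoint(u0, v0, m, dt, nsteps):
    """Midpoint (Crank-Nicolson) time stepping for m(x) u_tt = u_xx on (0, 1)
    with u(0, t) = u(1, t) = 0, written as the system u_t = v, m v_t = u_xx.
    u0, v0, m hold values at Np interior, equally spaced grid points.
    """
    Np = u0.size
    h = 1.0 / (Np + 1)
    # second-difference matrix with the Dirichlet boundary values eliminated
    L = (np.diag(-2.0 * np.ones(Np)) + np.diag(np.ones(Np - 1), 1)
         + np.diag(np.ones(Np - 1), -1)) / h**2
    I, M = np.eye(Np), np.diag(m)
    # midpoint rule: left @ w^{k+1} = right @ w^k with w = [u; v]
    left = np.block([[I, -0.5 * dt * I], [-0.5 * dt * L, M]])
    right = np.block([[I, 0.5 * dt * I], [0.5 * dt * L, M]])
    step = np.linalg.solve(left, right)   # form the one-step map once
    w = np.concatenate([u0, v0])
    for _ in range(nsteps):
        w = step @ w
    return w[:Np], w[Np:]
```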


If we are not restricted by a "legacy code" scenario, then oftentimes the structure of V directly influences the choice of V_h. For example, if solutions to (7) are in L²(0, T; H¹(S)), then we may choose to numerically solve a variational form of (7) using a first-order continuous Galerkin method in space and a discontinuous first-order Galerkin method in time. This results in piecewise-linear numerical approximations to the solution, i.e. at a given time t, V = H¹(S) and V_h = { v ∈ C(S̄) ∩ H¹(S) : v|_E ∈ P₁(E) for all E ∈ T_h }, where T_h is a quasi-uniform triangulation of the spatial domain containing simplicial elements E, and P₁(E) denotes the space of linear functions on element E. We consider a similar situation in the higher dimensional example of Section 4.

3.2. The RS-EnKF method

We propose that the structure of either a particular solution u ∈ V or the entire solution space V be exploited by the DA technique. Some desirable features (e.g. smoothness) may not be preserved by u_h ∈ V_h. Thus, we seek to define the state as an element of another space V_r that is more representative of V, i.e. the state should not be defined by u_h ∈ V_h, which is closest to u ∈ V in some numerical sense, but should be defined by a member of V_r. We note that V_r may be chosen independently of the discretization and numerical technique.

Informally, the main idea is that if at the data measurement times it is known that the solution u in the infinite-dimensional space V is approximated well, in a sense defined by the user, by elements in the finite-dimensional space V_r, then the state should be defined by its representation in V_r. Formally, we denote by u_r(x, t) the approximation to u(x, t) at time t in V_r. If V is an inner product space and V_r is chosen as a subspace, the natural choice is for u_r(x, t) to be the orthogonal projection of u(x, t) ∈ V onto the space V_r. In general, we do not require one space to be a subspace of another; the single requirement is that there are maps between the spaces. Let π : V → V_r denote the map to the representation space, i.e. u_r(x, t) = π u(x, t). Thus, we propose that the state of the system be defined explicitly by the coefficients of the basis functions determined from mapping u(x, t) into V_r. In practice, we do not know u(x, t), and we use a numerical approximation u_h ∈ V_h. We let P : V_h → V_r denote the map between the two approximation spaces, so that u_r(x, t) = P u_h(x, t).

The crucial difference between V_h and V_r is the choice of basis. The basis for V_r is based solely on knowledge of either a specific solution to (7) or of V. The choice of basis for V_h takes into account many other practical considerations, as described above. Furthermore, the basis for V_r may remain fixed, whereas the basis for V_h changes with the discretization or numerical scheme. Thus, the dimension of the state may remain constant as we either refine or coarsen the discretization or alter the numerical technique. This is a key point of our proposed method and has immediate consequences in terms of the number of ensemble members necessary to obtain reasonable DA results, as seen below and in Section 4. However, some numerical changes, such as refining the discretization, may require the construction of new projection operators P : V_h → V_r. We also define a pullback operator P⁻¹ : V_r → V_h that is needed in the evolution of elements of V_r from time t_k to t_{k+1}.
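When V_r is spanned by a known basis and the spaces carry an (approximate) inner product, the projection onto V_r amounts to solving a Gram system for the coefficients of u_r. A minimal sketch using a quadrature-discretized L² inner product; the interface is our assumption:

```python
import numpy as np

def l2_project(u_vals, basis_vals, w):
    """Coefficients c of the L2 orthogonal projection of u onto
    span{phi_1, ..., phi_r}: solve G c = b with G_ij = <phi_i, phi_j> and
    b_i = <phi_i, u>, the inner products approximated by quadrature.

    u_vals: (q,) values of u at quadrature points; basis_vals: (r, q) values
    of the basis functions at the same points; w: (q,) quadrature weights.
    """
    G = basis_vals @ (w[:, None] * basis_vals.T)   # Gram matrix (r, r)
    b = basis_vals @ (w * u_vals)                  # moments of u (r,)
    return np.linalg.solve(G, b)                   # coefficients of the projection
```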
We observe that this combination of projection and pullback operators allows us to circumvent the problem of restrictive legacy code, or of having to code complicated numerical solvers for V_r, i.e. the proposed method is designed to be non-intrusive. We outline the reparameterized state EnKF, or RS-EnKF, below. We evolve the state u_r ∈ V_r from t_k to t_{k+1} by setting u^h_k := P⁻¹ u_r and evolving to u^h_{k+1} in the space V_h. We define the evolved state u_{r,k+1} := P u^h_{k+1}. The model/measurement equations for (7) become:

u_{r,k+1} = P(G_h(P⁻¹ u_{r,k})) + w_{k+1},
d_{k+1} = h(u_{r,k+1}) + ε_{k+1}.    (13)
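A forecast step of (13) is thus the composition of the pullback, the numerical time integrator G_h (defined just below), and the projection. A minimal sketch, with pullback, integrate, and project as user-supplied callables standing in for P⁻¹, G_h, and P; model noise is handled separately by the filter:

```python
def rs_forecast(u_r, pullback, integrate, project):
    """One RS-EnKF forecast step for a single ensemble member, Eq. (13):
    only the maps P and P^{-1} touch the solver, so the method is
    non-intrusive with respect to the numerical integrator.
    """
    u_h = pullback(u_r)          # P^{-1} u_r: back to the numerical basis
    u_h_next = integrate(u_h)    # G_h: existing (possibly legacy) solver
    return project(u_h_next)     # P u_h: reparameterized forecast state
```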

Here G_h(u^h_k) is the numerical integrator evolving the numerical solution forward in time. This is a key difference from standard descriptions of state estimation problems, where the evolution operator acts explicitly on the state space.

Example 2. We now test the EnKF and RS-EnKF on the motivating example. There are two goals: (1) find the correct displacement given incorrect initial conditions, and (2) once the correct displacement is found, track the solution on the time interval [0, 40] within the prescribed tolerance. We set N_p = 100 and use Δt = 0.02. From Example 1, we see that this time step is too large to track the solution accurately given the spatial discretization. Ideally, using DA should overcome this numerical deficiency by correcting the errors. Spatial measurements of the displacement are taken at 80 equally distributed points in [0, 1], and the measurement variance is 6 × 10⁻⁴. To accomplish goal (1) we assimilate data into the state at 18 equally distributed points in time for the first 10 s. Then, to accomplish goal (2) we assimilate data into the state at 16 equally distributed points in time over the last 30 s. Thus, we seek to use frequent measurements in time to correct towards the true displacement, and then numerically track the true displacement using less frequent measurements. The state vector consists of both the displacement and the velocity at the nodes, giving 200 degrees of freedom in the state vector. For the ensemble members, both the displacement and velocity are initialized around zero with a variance of 0.1. The model error is prescribed a variance of 10⁻³. The initialization of the ensemble is computed using an empirical orthogonal function analysis of the initial error covariance, as is becoming standard in practice [8]. In Fig. 3, we see a solution using the EnKF with 40 ensemble members at time 20. We notice that the solution has high frequency noise in the spatial direction. We know a priori that the solution is smooth with low frequency components. Thus, we post-process the solution with a spatial moving average (with window size 0.1), i.e. we apply a low-pass filter. Clearly this improves the


Fig. 3. Solution of the EnKF process at time 20 using 40 ensemble members (cyan), moving average (window size 0.1) of the solution (black), projection of the solution onto the constrained space (blue), and the true solution (red).

Fig. 4. Time-averaged L²-error of the displacement as a function of time; on the left using the EnKF (5, 20, and 40 ensemble members) and on the right using the RS-EnKF (3, 6, and 12 ensemble members). The continuous line is the numerical result and the dashed line is the moving average post-processed solution.

solution, see Fig. 3. However, the idea is not to use a priori knowledge of u ∈ V for post-processing, but instead to use such knowledge for constructing V_r. Nonetheless, post-processing is an option, and is present to improve the results of the EnKF and provide a fairer comparison to the results of the RS-EnKF.

We now define V_r. Given the smoothness of the true solution, it is natural to approximate using piecewise cubic, C¹-continuous elements, e.g. splines or Hermite polynomials. In the interest of space, we limit ourselves to Hermite polynomials. To capture the low frequency components of the solution, we use only 4 equal length elements. Thus, V_r is an 8-dimensional space, i.e. a function in V_r is defined by 8 degrees of freedom (including boundary conditions). The map P : V_h → V_r is defined by a generalized least squares fit on each element. The map P⁻¹ : V_r → V_h is defined by interpolation of the polynomial basis functions at the spatial grid points. Fig. 3 shows the mapping of the numerical solution onto V_r. We use this reparameterization for both displacement and velocity. Also, the maps between numerical and reparameterized spaces for measurements and mass are defined by the same procedure as used above for displacement and velocity; see Examples 3 and 4.

In Fig. 4, we plot E_{L²}(t; u_h, 2) versus time for the EnKF and RS-EnKF. Both methods are able to achieve the two goals given enough ensemble members, i.e. both correct the state to within our prescribed tolerance within the first 10 s and can then track the state within this tolerance for the next 30 s. However, the EnKF requires at least three times the number of ensemble members (and the low-pass filter) to obtain the same quality of approximations as the RS-EnKF.

It may be the case that V_r ⊂ V_h, in which case we interpret the problem as having state equality constraints. Recent interest in this particular problem has resulted in two proposed methods: the projected Kalman predictor [11], and the constrained Kalman filter by projected system [12]. As demonstrated in our motivating example, and in Section 4, we consider cases where neither V_r nor V_h is a subspace/subset of the other.

3.3. The EnKF-RM and RS-EnKF-RM

In certain cases measurements may be considered redundant, i.e. linearly dependent, or there may simply be too much data. We can apply a similar process in defining a representative basis approximating the observation space. We consider this a reparameterization of the observation, or measurement, space. For example, in physical situations where the data


Fig. 5. Time-averaged L²-error of the displacement as a function of time; on the left using the EnKF-RM (5, 20, and 40 ensemble members) and on the right using the RS-EnKF-RM (3, 6, and 12 ensemble members). The continuous line is the numerical result and the dashed line is the moving average post-processed solution.

are measured point-wise from the solution u, it is natural to apply a technique similar to that described above, using the data and, for example, a least squares technique to generate an approximate data field. There are parallels to localization [13,14] in that there are cases where this is identified as filtering the data, which we demonstrate below. We distinguish the use of this technique of reparameterizing the measurement space on top of the EnKF or RS-EnKF by denoting the methods as EnKF-RM or RS-EnKF-RM, respectively. We observe that, unlike the reparameterization of the state space, we do not require a "pullback" map from the reparameterized measurement space to the original measurement space. The only change to (13) and (5) is to the second equation in each, where h is replaced by P_m(h), with P_m denoting the map to the reparameterized measurement space. In the case where P_m is mapping between different scales, this is similar to the case studied in [1].

Example 3. Continuing the motivating example, recall that there are 80 uniformly distributed measurement points in space. The displacement is known to be a low frequency wave, so we reparameterize the measurement space to reflect this. We use 15 piecewise linear, continuous elements to represent the measurements, and P_m is defined by a generalized least squares fit. This has two immediate consequences. First, the dimension of the observation space is reduced from 80 to 14. Second, this reparameterization acts as a low-pass filter for the measurement operator. Fig. 5 shows the results using reparameterized measurements. The only difference from Example 2 is in how the measurements are used. We see that for the RS-EnKF-RM the results are similar to the RS-EnKF of Example 2. We note that for the EnKF-RM, the need for post-processing is even more pronounced. This is due to the low-pass filter nature of P_m, which attenuates high frequency components, so that the EnKF update is incapable of correcting these components that arise from model noise.
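Both the state map P of Example 2 and the measurement map P_m here are generalized least squares fits onto a fixed basis. A minimal sketch for the piecewise linear case follows; the hat-function evaluation via np.interp and the optional weighting are our illustrative choices, not the authors' code:

```python
import numpy as np

def pw_linear_lsq(x_meas, d_meas, nodes, w=None):
    """Generalized least squares fit of point data by a continuous piecewise
    linear function with the given nodes; returns the fitted nodal values.
    An optional weight vector w (e.g. inverse measurement variances) turns
    the ordinary fit into a generalized one.
    """
    r = len(nodes)
    # column j of B is the j-th hat function evaluated at the data locations
    B = np.column_stack([np.interp(x_meas, nodes, np.eye(r)[j]) for j in range(r)])
    W = np.diag(w) if w is not None else np.eye(len(x_meas))
    return np.linalg.solve(B.T @ W @ B, B.T @ W @ d_meas)

# e.g. 15 elements on [0, 1]: nodes = np.linspace(0.0, 1.0, 16); with the two
# boundary values pinned by the boundary conditions, 14 free coefficients
# remain, matching the reduction of the observation space from 80 to 14.
```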

3.4. Extending the RS-EnKF to estimable parameters

In many applications, the goal of DA is to improve estimates of a model or source parameter(s). It is common to consider the parameter as an indirectly observed state variable, i.e. state vectors are defined by the concatenation of the parameter(s) with the numerical solution. Many cases exist where a priori knowledge of the structure of the model parameter is applied to constrain the parameter space, see e.g. [5]. The method we propose may be thought of as a generalization of this approach to constraining model parameters: we use the structure of the solution space of a differential equation to reparameterize (or "constrain") the state in a finite dimensional space that provides reasonable approximations and/or retains desirable features of the solution. We demonstrate below that the ability to estimate model parameters in differential equations using DA techniques may be directly influenced by the approximation of the state.

Example 4. We now consider parameter estimation on the motivating problem with a spatially varying mass function. The idea is to find the unknown mass of the string by using measurements of displacement and velocity to correct estimates of the mass. The mass is discretized on the same spatial mesh as the displacement and velocity. The correct mass is (see the leftmost plot of Fig. 6):

m(x) = max(1, 1 − sin(5πx)).

We initialize the mass components of the ensemble members around a mean value of 2 with a variance of 10⁻², and the displacement and velocity components of the ensemble members are initialized around zero with variance 10⁻³. We make the following technical changes from the previous examples: the time interval is [0, 10], and data from the 80 spatial measurement points are assimilated at 25 points equally distributed in time. The "exact" solution is computed using 500 grid points and time step Δt = 10⁻³. As before, the measurements are taken from the exact solution.
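A sketch of the augmented state layout used in this type of parameter estimation, with the reparameterization maps applied blockwise; the function names and layout are illustrative assumptions, not the authors' code:

```python
import numpy as np

def augment(u, v, mass):
    """Concatenate displacement, velocity, and mass values into one state."""
    return np.concatenate([u, v, mass])

def project_blocks(state, n, P_state, P_mass):
    """Apply reparameterization maps blockwise to an augmented state:
    P_state to the displacement and velocity blocks and P_mass to the mass
    block, as when the RS-EnKF reparameterizes state and parameter together.
    """
    u, v, mass = state[:n], state[n:2 * n], state[2 * n:]
    return np.concatenate([P_state(u), P_state(v), P_mass(mass)])
```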


Fig. 6. Mass as a function of time; on the left: the correct mass, in the middle: EnKF estimate using constrained state and 60 ensemble members, and on the right: same estimate after moving average post-processing (window size 0.1).

In the interest of space we present only the results for the estimated mass. Fig. 6 shows the exact mass and the EnKF estimate of the mass. Similar to the displacement, the estimated mass may have a high frequency component that we know a priori to be incorrect. Thus, we use the spatial moving average post-processing, with window size 0.1, to filter the estimated mass; see the right-most plot of Fig. 6. There are several spaces we might reparameterize: the physical state (i.e. displacement and velocity), the model parameter (i.e. mass), and/or the measurements. The a priori information on the displacement, velocity, and measurement spaces is assumed the same as before, so we reparameterize these spaces as in the previous examples. We assume the mass is known a priori to be smooth. Using this information, we reparameterize the mass in a space defined by 10 piecewise linear elements (i.e. with 11 degrees of freedom).

The top row of Fig. 7 shows the error in the averaged estimate of the mass using the EnKF. We see that the EnKF is not able to estimate the mass even using 180 ensemble members. The middle row of Fig. 7 shows the results of the EnKF using the


Fig. 7. Absolute error in estimated mass; on the top row: EnKF with moving average post-processed mass, on the middle: EnKF with reparameterized mass, and on the bottom: EnKF-RM with reparameterized mass. The number of ensemble members is, from left to right, 60, 120, and 180.


representation space for the mass. The reparameterization of the mass seems to improve the results, but the error is still of the same order as in the previous case with 180 ensemble members. The bottom row of Fig. 7 shows the error when we reparameterize both the mass and the measurements. Now the result with 180 ensemble members gives an idea of how the mass is distributed, but the results are relatively poor.

Next we study the effect of reparameterizing the state, i.e. applying the RS-EnKF, on estimating the mass. The top row of Fig. 8 shows the averaged error in the estimated mass using the RS-EnKF (note: the mass is not reparameterized here; Fig. 6 shows the estimated mass corresponding to the error shown in the top right corner of Fig. 8). We see that using the RS-EnKF improves the results significantly: the RS-EnKF requires only 60 ensemble members to produce a fair approximation of the mass. The middle row of Fig. 8 shows the error in the estimated mass when we reparameterize both the state and the mass. Observe that the mass is accurately estimated with only 60 ensemble members. The bottom row of Fig. 8 shows the error in the estimated mass when we reparameterize the state, mass, and measurements. This method achieves reasonable estimates of the mass using only 40 ensemble members.

4. Extending the RS-EnKF to higher dimensions

Designing reparameterized spaces in higher dimensions is not necessarily a daunting task, since these spaces are not required in the forecast step; thus, there is no need for the reparameterized space to satisfy requirements set by the numerical schemes. In this section we describe how to construct a representation space over an arbitrary spatial domain using piecewise multivariate polynomial expansions of the solution over a partition of the spatial domain. By partitioning the spatial domain into user-defined subsets with separate basis functions over each subset, we define a method that extends to non-convex and higher dimensional domains.

We consider a new physical model problem. Specifically, we consider a contaminant source problem in a shallow square aquifer modeled as transient diffusive transport:


Fig. 8. Absolute error in estimated mass; on the top row: RS-EnKF with moving average post-processed mass, on the middle: RS-EnKF with reparameterized mass, and on the bottom: RS-EnKF-RM with reparameterized mass. The number of ensemble members is, from left to right, 20, 40, and 60.


Fig. 9. The initial condition of the exact solution.

Fig. 10. RMS error; on the left using the EnKF and on the right using the RS-EnKF, with 5, 20, and 40 ensemble members.

∂u/∂t − ∇·(A∇u) = f  in S = (0, 1) × (0, 1), t ∈ [0, T],    (14)
∇u · ν = 0  on ∂S × (0, T).    (15)

The source is located at x̂ = (0.6, 0.65), has a Gaussian profile, and is temporally periodic:

f(x, t; x̂) = (100/2π) exp( −50 [ (x̂₁ − x₁)² + (x̂₂ − x₂)² ] )    (16)

for 0.05(n − 1) < t < 0.05n, n = 1, 3, 5, ..., and zero otherwise. The permeability is:

A(x, y) = (3/2) exp(2 sin(πx) cos(πy)),

and the initial condition at time t = 0 is shown in Fig. 9. The spatial discretization is defined by a standard finite element method using piecewise linear, continuous elements, where the spatial domain S is partitioned into triangles using the Delaunay algorithm. In the temporal domain, we use the discontinuous Galerkin (dG) method with a fixed time step. To generate the correct solution, and the data, we use 20000 elements (10201 nodes) and second order dG in time with a time step of 0.00125. For the forecast we use 800 elements (441 nodes) and linear dG in time with a maximum time step of 0.0025. Given this discretization and the same initial condition, the forecast setting produces similar yet visibly different results from the exact solution.

For the representation space V_r we first partition the domain into four rectangles: S₁ = (0, 1/2) × (0, 1/2), S₂ = (1/2, 1) × (0, 1/2), S₃ = (0, 1/2) × (1/2, 1), and S₄ = (1/2, 1) × (1/2, 1). In each of the subdomains we approximate the solution using a basis of multivariate Legendre polynomials (orthonormal with respect to the L² inner product restricted to each subdomain) up to 4th order (additional details below), i.e. in each subdomain the solution has at most 15 degrees of freedom, resulting in at most 60 degrees of freedom in total. In general, this illustrates how to approach higher dimensional problems or domains of increasing geometric complexity, e.g. non-convex domains. We partition the full spatial domain into subdomains for which there is some knowledge of the structure of the solution. We are not restricted to using the same type of


Fig. 11. Adaptively chosen polynomial order in the RS-EnKF for each subdomain S1-S4, using 5, 20, and 40 ensemble members.

Fig. 12. RMS error and the source distance from the correct location; on the left using the EnKF and on the right using the RS-EnKF, with 20, 40, and 60 ensemble members.

basis in each subdomain. For example, we might continue to use the numerical basis in subdomains for which we have no a priori information regarding the behavior of the solution.

The map from V_r to V_h is defined by interpolation of the Legendre polynomials at the nodes in each subdomain. The map from the linear triangles to the representation basis is done via L²-projection on each subdomain, but with a slight variation: the projection is not to the full 4th order polynomials but to a maximum of 4th order polynomials. Specifically, the map takes the lowest order reparameterization that fulfills a preset projection tolerance ε, i.e. the expansion of u_r in the basis of S_i is computed for increasing fixed polynomial order until ‖u_h − u_r‖_{L²(S_i)} < ε is reached. This way the representation basis is adaptively modified at each RS-EnKF update for each ensemble member. We pad the coefficients of the higher-order polynomials in each subdomain with zeros, which allows the assimilation of data (i.e. the update) to correct up to a higher-order representation in each subdomain. We empirically chose a tolerance of ε = 0.015.

The ensemble members are initialized around zero with variance 0.15, and the model noise has a variance of 0.5. The simulation time is from 0 to 0.5, and data are measured every 0.05 time units. The data are the point values at 23 nodal points that are roughly equally distributed in the domain S, and the measurement noise is 0.0001. For simplicity, in these examples we only reparameterize the state, not the measurements.
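A minimal sketch of this adaptive projection on one subdomain follows. We assume hierarchical bases, i.e. the order-p basis functions form the leading block of the order-4 basis, and quadrature-discretized L² inner products; both are our interface assumptions:

```python
import numpy as np

def adaptive_project(uh_vals, basis_by_order, w, tol=0.015):
    """Lowest-order reparameterization of u_h on a subdomain S_i meeting
    ||u_h - u_r||_{L2(S_i)} < tol, with the coefficient vector zero-padded
    to the maximal order so the EnKF update may correct higher-order modes.

    basis_by_order[p]: (r_p, q) values of the L2-orthonormal basis of total
    order p at quadrature points; w: (q,) quadrature weights.
    """
    max_p = len(basis_by_order) - 1
    full_dim = basis_by_order[max_p].shape[0]
    for p in range(max_p + 1):
        B = basis_by_order[p]
        c = B @ (w * uh_vals)          # orthonormal basis: c_i = <phi_i, u_h>
        err = np.sqrt(np.sum(w * (uh_vals - c @ B) ** 2))
        if err < tol or p == max_p:
            return np.pad(c, (0, full_dim - c.size))
```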


Fig. 13. Adaptively chosen polynomial order in the RS-EnKF for each subdomain S1-S4, using 20, 40, and 60 ensemble members.

The first test is whether the EnKF can correct the state when the source is known but the initial state is incorrect. Fig. 10 shows the root mean square (RMS) error of the state for different numbers of ensemble members. Since the model noise is significant with respect to the values of the state, the traditional EnKF struggles to correct the state. However, the RS-EnKF filters the model noise and captures the correct state faster and more accurately. Fig. 11 shows the order of the polynomial approximation used in the representation basis in each subdomain as a function of the assimilation cycle. We observe that the RS-EnKF adaptively changes the order of approximation required to satisfy the projection tolerance.

The next test is whether the EnKF can locate the center of the source term, that is, x̂ is unknown in Eq. (16). We assume the time behavior and the physical profile of the source are still known. In this test, the state vector is concatenated with the coordinates x̂ of the source, but we only project the components of the state vector corresponding to the level of contaminant at each node. In the ensemble members, the components associated with the location are initialized around (0.5, 0.5) with initial variance 0.1, and assigned a model variance of 0.4. The other components of the ensemble members are initialized as before. Fig. 12 shows the RMS error for the EnKF and RS-EnKF. Again the RS-EnKF is able to correct both the state and the location faster and more accurately than the EnKF. Fig. 12 also shows the location of the source for both methods. Here, the difference between the methods is more pronounced; the RS-EnKF demonstrates convergence with as few as 20 ensemble members, whereas the EnKF requires an ensemble size of at least 60 to produce similar results. Again, in Fig. 13 we show the order of the reparameterized basis in each subdomain as a function of the assimilation cycle, which again demonstrates the adaptive size of this basis.

5. Conclusions

We have developed an abstract approach to representing the state space of a partial differential equation separately from the numerical scheme for the purpose of efficient application of an ensemble Kalman filter. Specifically, we have shown that by using a different basis to approximate solutions than the one determined by the numerical method, we attain the desired accuracy with reduced computational cost. This requires constructing maps between the numerical basis and the representation basis, which we interpreted as a reparameterization of the state. We showed that this reparameterization extends to higher dimensions and that the reparameterized basis may even be adaptively determined in time.

References

[1] S. Akella, A. Datta-Gupta, Y. Efendiev, Assimilation of coarse-scale data using the ensemble Kalman filter, Int. J. Uncertain. Quantif. 1 (1) (2011) 49–76.
[2] J. Li, D. Xiu, A generalized polynomial chaos based ensemble Kalman filter with high accuracy, J. Comput. Phys. 228 (2009) 5454–5469.
[3] C. Johns, J. Mandel, A two-stage ensemble Kalman filter for smooth data assimilation, Environ. Ecol. Stat. 15 (2008) 101–110.
[4] M. Ghil, P. Malanotte-Rizzoli, Data assimilation in meteorology and oceanography, Adv. Geophys. 33 (1991) 141–266.
[5] D. Zhang, Z. Lu, Y. Chen, Dynamic reservoir data assimilation with an efficient, dimension-reduced Kalman filter, SPE J. 12 (1) (2007) 108–117.
[6] E. Krakiwsky, C. Harris, R. Wong, A Kalman filter for integrating dead reckoning, map matching and GPS positioning, in: Position Location and Navigation Symposium, Record: Navigation into the 21st Century (IEEE PLANS '88), IEEE, 1988, pp. 39–46. doi:10.1109/PLANS.1988.195464.
[7] G. Evensen, The ensemble Kalman filter: theoretical formulation and practical implementation, Ocean Dynam. 53 (2003) 343–367.
[8] G. Evensen, Data Assimilation: The Ensemble Kalman Filter, Springer, Berlin, Heidelberg, 2009.
[9] M. Tippett, J. Anderson, C. Bishop, T. Hamill, J. Whitaker, Ensemble square root filters, Mon. Weather Rev. 131 (2003) 1485–1490.
[10] J. Li, D. Xiu, On numerical properties of the ensemble Kalman filter for data assimilation, Comput. Methods Appl. Mech. Eng. 197 (2008) 3574–3583.
[11] D. Simon, T. Chia, Kalman filtering with state equality constraints, IEEE Trans. Aerosp. Electron. Syst. 39 (2002) 128–136.
[12] S. Ko, R.R. Bitmead, State estimation for linear systems with state equality constraints, Automatica 43 (8) (2007) 1363–1368.
[13] P.L. Houtekamer, H.L. Mitchell, Data assimilation using an ensemble Kalman filter technique, Mon. Weather Rev. 126 (1998) 796–811.
[14] H.L. Mitchell, P.L. Houtekamer, Ensemble size, balance, and model-error representation in an ensemble Kalman filter, Mon. Weather Rev. 130 (2002) 2791–2808.
[15] M. Cane, A. Kaplan, R. Miller, B. Tang, E.C. Hackert, A. Busalacchi, Mapping tropical Pacific sea level: data assimilation via a reduced state space Kalman filter, J. Geophys. Res. 101 (C10) (1996) 599–617.
[16] D. Dee, Simplification of the Kalman filter for meteorological data assimilation, Q. J. R. Meteorol. Soc. 117 (1991) 365–384.
[17] D. Pham, J. Verron, M. Roubaud, A singular evolutive extended Kalman filter for data assimilation in oceanography, J. Marine Syst. 16 (1996) 323–340.


[18] D. Pham, J. Verron, L. Gourdeau, Filtres de Kalman singuliers évolutifs pour l'assimilation de données en océanographie, C. R. Acad. Sci. Paris Ser. II A 326 (1998) 255–260.
[19] I. Hoteit, D. Pham, J. Blum, A simplified reduced order Kalman filtering and application to altimetric data assimilation in Tropical Pacific, J. Marine Syst. 36 (2002) 101–127.