Ocean Modelling 12 (2006) 348–377 www.elsevier.com/locate/ocemod
Euler–Lagrange equations for the spectral element shallow water system J.C. Levin a
a,*
, D.B. Haidvogel a, B. Chua b, A.F. Bennett b, M. Iskandarani
c
Institute of Marine and Coastal Sciences, Rutgers University, 71 Dudley Rd., New Brunswick, NJ 08901, USA b College of Oceanic and Atmospheric Sciences, Oregon State University, Corvallis, OR 97331, USA c Rosenstiel School of Marine and Atmospheric Science, University of Miami, 4600 Rickenbacker Causeway, Miami, FL 33149, USA Received 4 August 2004; received in revised form 20 June 2005; accepted 20 June 2005 Available online 22 July 2005
Abstract We present the derivation of the discrete Euler–Lagrange equations for an inverse spectral element ocean model based on the shallow water equations. We show that the discrete Euler–Lagrange equations can be obtained from the continuous Euler–Lagrange equations by using a correct combination of the weak and the strong forms of derivatives in the Galerkin integrals, and by changing the order with which elemental assembly and mass averaging are applied in the forward and in the adjoint systems. Our derivation can be extended to obtain an adjoint for any Galerkin finite element and spectral element system. We begin the derivations using a linear wave equation in one dimension. We then apply our technique to a two-dimensional shallow water ocean model and test it on a classic double-gyre problem. The spectral element forward and adjoint ocean models can be used in a variety of inverse applications, ranging from traditional data assimilation and parameter estimation, to the less traditional model sensitivity and stability analyses, and ensemble prediction. Here the Euler–Lagrange equations are solved by an indirect representer algorithm. 2005 Elsevier Ltd. All rights reserved.
*
Corresponding author. Tel.: +1 732 932 6555. E-mail addresses:
[email protected] (J.C. Levin),
[email protected] (D.B. Haidvogel), chua@ coas.oregonstate.edu (B. Chua),
[email protected] (A.F. Bennett),
[email protected] (M. Iskandarani). 1463-5003/$ - see front matter 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.ocemod.2005.06.002
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
349
Keywords: Spectral element method; Euler–Lagrange equations; Inverse ocean model; 4DVAR variational data assimilation; Twin experiment
1. Introduction Data assimilation has recently emerged in oceanography as a valuable tool that allows the combination of model simulations and observations to gain better understanding of dynamical processes. Assimilated data can compensate for model errors that arise from imperfect interpretation of underlying physical processes, from errors in the input fields, and from discretization errors (Robinson et al., 1998). One of the approaches in data assimilation is to use inverse models that take into consideration model and data errors to obtain a state estimate that minimizes a cost function subject to the chosen constraints (Bennett, 2002; Wunsch, 1996). Inverse models may also be used to estimate model parameters such as eddy viscosities and diffusivities, heat and fresh water fluxes, open boundary conditions, etc. (Robinson et al., 1998). Lastly, they are powerful tools with which to study model sensitivity and stability (Moore et al., 2004), and thereby to improve observational networks by identifying places which require denser sampling (Palmer et al., 1998). Most existing inverse ocean models are based on a finite difference discretization of the dynamical model (see Bennett et al., 1998; Egbert et al., 1994; Moore, 1991; Moore et al., 2004 to name just a few). Although other numerical approaches to inverse modeling have been investigated in meteorology—including an operational spectral four-dimensional variational (4DVAR) model at European Center for Medium-Range Weather Forecast (ECMWF) (Buizza, 1997; Mahfouf and Rabier, 2000; Cardinali et al., 2003), and a 4DVAR assimilation model of a finite element shallow water model (Zhu et al., 1994)—their application in oceanography has been rare. Lynch and Hannah (2001) have recently designed an inverse scheme for the finite element QUODDY model (Lynch et al., 1996) and derived open boundary conditions by assimilating velocity observations in a limited-area hindcast. In Lynch et al. (1998) the inverse QUODDY model was also used for de-tiding ADCP data from Georges Bank. Finally, Dobrindt and Schroter (2003) have developed an inverse three-dimensional finite element model with application to the large-scale ocean circulation. In this paper we present for the first time the derivation of an inverse spectral element ocean circulation model. The spectral element method is an h-p-type finite element method that approximates the solution within each quadrilateral element with a high-order polynomial (Maday and Patera, 1988; Patera, 1984). The unstructured nature of its elemental grid permits a better geometrical description of complex geometries (e.g., ocean basins), easily accommodates spatially variable resolution to resolve regionally localized, fine-scale processes (such as western boundary currents like the Gulf Stream), and therefore enables multi-scale simulations within the framework of a single model. The spectral element method also offers a dual approach to convergence: algebraic (via elemental grid refinement) and exponential (via increase in the order of intra-element interpolation). The combination of spatially variable resolution, rapid convergence to the true solution, and excellent scalability on parallel systems are especially advantageous for high-accuracy simulation of finescale geophysical processes.
350
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
For simplicity, we begin by describing the derivations using a linear wave equation in one dimension. Euler–Lagrange equations (ELE) are obtained by setting to zero the derivatives of the cost function with respect to all free variables. We obtain the discrete ELE by minimizing a discrete penalty function. The discrete adjoint operator is an exact transpose of the forward operator resulting from the Galerkin discretization of the tangent linear system. We then show how the discrete ELE can be obtained from the continuous ELE. The continuous ELE are much easier to derive from the penalty function, than the discrete ELE. Building the discrete ELE of any Galerkin finite element or spectral element system can be substantially simplified by applying our discretization strategy to the continuous ELE. We then apply the technique to the shallow water system. To take advantage of linear estimation theory, tho nonlinear shallow water system is linearized to obtain a so-called tangent linear model (Bennett, 2002). The inverse solution to the nonlinear problem is obtained by finding an inverse solution to the tangent linear problem and iterating on nonlinearity. The solutions of the linear ELE converge to the solution of the nonlinear ELE if an appropriate linearization around judiciously chosen background fields is used. To test our theoretical derivations, we construct a tangent linear and an adjoint system for a two-dimensional (shallow water) spectral element ocean model (SEOM) which has been developed by Iskandarani et al. (1995) and applied in a variety of settings (see Haidvogel et al., 1997; Levin et al., 1997; Curchitser et al., 1998, 1999, 2001; Wunsch et al., 1997). The new inverse model is tested using a twin experiment of a wind-driven barotropic double-gyre circulation. The shallow water model is used as the barotropic engine in our three-dimensional spectral element ocean model (Iskandarani et al., 2003), and therefore its adjoint can be incorporated directly into a future three-dimensional inverse model. The paper is organized as follows. In Section 2 we derive the Euler–Lagrange equations for a spectral element discretization of a one-dimensional linear wave equation. In Section 3 the Euler– Lagrange equations are derived for a spectral element shallow water system. We present numerical tests in Section 4 and a concluding discussion in Section 5.
2. The ELE for a wave equation in one dimension 2.1. Continuous ELE The wave equation in one dimension has the form: ou ou þ c ¼ F ðx; tÞ þ f ðx; tÞ; ot ox uðx; 0Þ ¼ IðxÞ þ iðxÞ;
ð1Þ
uð0; tÞ ¼ BðtÞ þ bðtÞ; uðxm ; tm Þ ¼ d m þ em ; where 0 6 x 6 L and 0 6 t 6 T is the domain; I(x) are initial conditions; B(t) are boundary conditions; F is a forcing function; c is a constant velocity, c > 0; dm, m = 1, . . ., M are the data
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
351
available for the field u at points (xm, tm); f, i, b are errors in the equation and its initial and boundary conditions respectively; and em is the error between the data and the model estimate. Following Bennett (2002) in order to obtain the Euler–Lagrange equations we need to minimize the following penally functional: Z Z Z L Z T M T L X f 2 dx dt þ W i i2 dx þ W b b2 dt þ W d e2m ; ð2Þ J ½u ¼ W f 0
0
0
0
m¼1
where Wf, Wi, Wb, and Wd are positive weights. Loosely speaking, to find a local extremum of J[u] we consider a variation dJ = J [u + du] J [u] and find a function u such that dJ = 0 "du. This happens if the coefficients of du(x, t) in dJ vanish for all x and t. Denoting ou ou þc F ; ð3Þ kðx; tÞ ¼ W f f ðx; tÞ ¼ W f ot ox and using the relation M X
em duðxm ; tm Þ ¼
Z Z T 0
m¼1
L 0
M X
em duðx; tÞdðx xm Þdðt tm Þ dx dt;
where d(x xm) is the Dirac delta-function, we obtain the following formula for dJ: Z L Z T Z Z T L odu odu þc k iduðx; 0Þ dx þ 2W b bduð0; tÞ dt dJ ¼ 2 dx dt þ 2W i ot ox 0 0 0 0 Z Z M T L X em dudðx xm Þdðt tm Þ dx dt. þ 2W d 0
0
ð4Þ
m¼1
ð5Þ
m¼1
The continuous ELE are obtained by integrating the first integral in (5) by parts and setting coefficients of du(x, t) to zero. The ELE become: 8 M X > < ok c ok ¼ W ðd m uðxm ; tm ÞÞdðx xm Þdðt tm Þ; d ot ox ð6Þ m¼1 > : kðx; T Þ ¼ 0; kðL; tÞ ¼ 0; 8 ou < ou þ c ¼ F þ C f k; ð7Þ ot ox : uðx; 0Þ ¼ IðxÞ þ C i kðx; 0Þ; uð0; tÞ ¼ BðtÞ þ cC b kð0; tÞ; 1 1 where C f ¼ W 1 f ; Ci ¼ W i ; Cb ¼ W b .
2.2. Spectral element formulation Consider a one-dimensional spectral element formulation (Maday and Patera, 1988; Patera, 1984) wherein domain A = [0, L] consists of Q elements. Let N be the order of the interpolating polynomials in each element, and hl (n), l = 0, . . ., N a set of Legendre Cardinal functions of order N that form a set of basis functions within an element (Boyd, 2001). The basis functions are
352
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
chosen in such a way that they ensure the continuity of all variables across element edges. All variables are represented in an element as uðn; tk Þ ¼
N X
uki hi ðnÞ;
i¼0
L0N ðnÞð1 n2 Þ hi ðnÞ ¼ . N ðN þ 1ÞLN ðni Þðn ni Þ
ð8Þ
Here n 2 [1; 1] is a computational coordinate, obtained from the physical space coordinate x by a transformation n = n(x) that maps an element into an interval [1, 1]; L0N ðnÞ is the derivative of a Legendre polynomial of order N; collocation points ni, i = 0, . . ., N are the roots of L0N ðnÞð1 n2 Þ. By definition, hi (nj) = dij, where dij is the Kronecker delta-function. Thus the coefficients uki are the values of the function at the collocation points: uki ¼ uðxi ; tk Þ, where xi = x(ni), i = 0, . . ., N is a set of collocation points inside an element in physical space (Boyd, 2001). For the purposes of this analysis, it is more convenient to define a ‘‘global’’ set of collocation points that are numbered consecutively in all elements xl, l = 1, . . ., L, where L = QN + 1. It is also convenient to define a set of basis functions hl (x), such that they are defined on the whole domain A, and are zero everywhere except in some element, where they are the same as the Legendre Cardinal functions defined in (8). Then the Galerkin formulation of Eq. (1) yields: Z Z Z Z ou ou hl dx þ c hl dx ¼ F hl dx þ f hl dx 8l ¼ 0; . . . ; L; A ot A ox A A uðx; 0Þ ¼ IðxÞ þ iðxÞ; ð9Þ uð0; tÞ ¼ BðtÞ þ bðtÞ; uðxm ; tm Þ ¼ d m þ em . For simplicity, we assume that the data xm, tm is given at several grid points i.e., that xm ¼ xim , tm ¼ tk m so that no interpolation between data points and grid points is necessary. The formulation proceeds by computing the integrals in the first equation of (9) on each element separately and then combining the elemental contributions to obtain global integrals. This procedure is called an elemental assembly. Substituting definitions (8) into the first equation of (9) and assembling the elemental contributions together we get the following equation: * + X Z Z N N X dui ohi hi hl dx þ c ui ð10Þ F i fi hl dx ¼ 0 ; dt E E ox i¼0 i¼0 where hGi denotes the sum of all elemental contributions of the form G; E is an integral P computed over an element E; ui (t) = u (xi, t) are coefficients in the spatial expansion of u: u(x, t) = i ui (t)hi (x); and Fi, fi are coefficients in the spatial expansions of F and f respectively. Applying Gauss–Lobatto quadrature of order N to the integrals in (10), and the backward Euler scheme to the time derivative, we obtain a discrete equation: * + N k X ukþ1 u l l e e l ¼ fkM e l; Ml þ c uki h0i ðnl Þnx M l F kl M ð11Þ l Dt i¼0
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
353
where F kl ¼ F ðxl ; tk Þ;
flk ¼ f ðxl ; tk Þ
ð12Þ
are coefficients in the spectral element expansions for F and f respectively; Ml = jxnjl xl is an elemental mass matrix that depends on the Jacobian of the coordinate transformation jxnj and on the e l is the sum of all elemental contributions to weights in the Gauss–Lobatto quadrature xl ; and M e the mass matrix: M l ¼ hM l i. The mass matrix is diagonal, the so-called mass lumping is obtained by applying Gauss–Lobatto quadrature of order N and using the property hi (nj) = dij (Boyd, 2001; Maday and Patera, 1988; Patera, 1984). Note, that since the functions u, F and f are piecewise continuous by definition, their projections onto a basis function hl result in an assembly of mass matrices: * + Z N N X X e l. uhl dx ¼ ui hi ðnp Þhl ðnp ÞM p ¼ hul M l i ¼ ul M ð13Þ A
i¼0
p¼0
Since the derivative of u is discontinuous across elementedges, we need to assemble different esti PN k 0 mates of it from the neighboring elements into a term c i¼0 ui hi ðnl Þnx M l . This operation can be viewed as a mass averaging of elemental contributions on an edge. e l and adding initial and boundary conditions we obtain a final form of Dividing Eq. (11) by M the discretized system: * + N k X 1 ukþ1 u l l e c uki h0i ðnl Þnx M l F kl ¼ flk 8l ¼ 1; . . . ; L; þM l Dt i¼0 ð14Þ u0l ¼ I l þ il ; uk0 ¼ Bk þ bk ; ukimm ¼ d m þ em ; where Il, il, Bk, bk are coefficients in the spectral element expansions for I, i, B, b respectively. 2.3. Discrete penalty function Following Bennett (2002) we write the discrete penalty function as: K X L L K M X X X X 2 k 2 k 2 J ½u ¼ W f ðfl Þ Dt þ W i ðil Þ þ W b ðb Þ Dt þ W d e2m ; k¼0
l¼0
l¼0
ð15Þ
m
k¼1
where flk , l = 1, . . ., L, and il, bk, em are defined in (14); and f0k ¼ uk0 Bk is the model error on the boundary. Substituting the discretized system (14) into (15) we obtain the following discrete penalty function: ( * + )2 L X K 1 N X X 1 ukþ1 ukl 0 l k e c ui hi ðnl Þnx M l F l Dt þM J ½u ¼ W f l Dt l¼1 k¼0 i¼0 þWf
K L K M X X X X 2 2 2 ðuk0 Bk Þ Dt þ W i ðil Þ þ W b ðbk Þ Dt þ W d e2m . k¼1
l¼0
k¼1
m
ð16Þ
354
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Let kkþ1 ¼ W f flk ¼ W f l
* + ! N k X 1 ukþ1 u l l e þM c uki h0i ðnl Þnx M l F l l Dt i¼0
for l = 1, . . ., L be weighted residuals. Using the relation X XX em dukimm ¼ 2W d em duki diim dkkm ; 2W d m
ik
ð17Þ
ð18Þ
m
and denoting the weighted model boundary error by kk0 ¼ W f ðBk uk0 Þ we obtain variation of the penalty function ( * +) L X K1 N K kþ1 X X X 1 dukl kþ1 dul 0 e þM dJ ¼ 2 Dt 2 kl c duki hi ðnl Þnx M l kk0 duk0 Dt l Dt l¼1 k¼0 i¼0 k¼1 þ 2W b
K X
bk duk0 Dt þ 2W i
k¼1
L X XX ðil Þdu0l þ 2W d em duki diim dkkm . l¼0
ik
ð19Þ
m
2.4. The discrete ELE Setting the coefficients of duki in (19) to zero and using (17) we obtain the following ELE: * + 8 N kþ1 k X > k k M Wd X l > 0 i < i þ ¼ c kkþ1 h ðn Þn ðd m ukimm Þdiim dkkm ; l x i l e Dt Dt M l m l¼0 > > : K ki ¼ 0; ð20Þ * + 8 N kþ1 k X > 1 u u > l e < l c uki h0i ðnl Þnx M l F l ¼ C f kkþ1 þM l l ; Dt i¼0 > > : u0l ¼ I l þ C i k0l ; uk0 ¼ Bk þ C b kk0 . We note that the advection operator in the forward system is obtained in two steps: first is the e l . The elemental assembly of the derivative; second is division by the diagonal mass matrix M order of these operations is reversed in the computation of the advection operator in the adjoint system: division by the mass matrix occurs first, followed by the assembly of the weak form of the derivative. Strong and weak forms are two different ways to discretize a derivative in a Galerkin formulation: in the former the derivative of a field is projected onto a basis function, while in the latter the derivative is applied to a basis function itself. The two forms are related through an integration by parts, with a boundary integral connecting the two forms. The discrete derivation above which is equivalent to taking a transpose of a discrete forward operator, did not produce any boundary terms. It means that in the corresponding Galerkin formulation of the adjoint operator the boundary conditions are imposed weakly, i.e. the field is assumed to be zero at the boundary, so that the boundary integral vanishes.
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
355
Comparing the continuous adjoint equation (6) with the discrete Galerkin adjoint operator we notice that the discrete adjoint operator can be obtained from the continuous adjoint operator, if the spatial derivative is discretized using a weak Galerkin form, and boundary conditions are imposed weakly. It can also be shown that if the weak form of the derivative is used in the discrete forward operator, the discrete adjoint operator should have the corresponding term in a strong form. The time difference in Eq. (20) is analogous to applying the backward Euler time-stepping that is initialized with zero at the last time-level k = K and computed backward in time to obtain an update to the initial conditions for the forward variable u. In Appendix A, we show the equivalent ELE for a multi-level time stepping scheme. 2.5. Non-diagonal weight functions The use of constant weights Wf, Wi, and Wb in the definition of the penalty functional (2) leads to a singular solution (Bennett, 2002). Nonsingular solutions can only be obtained when the weights take into account correlations of errors at neighboring places and times. A generalized definition of the penalty functional is ZZ Z Z ZZ dx0 dt0 dx dt W f ðx; t; x0 ; t0 Þf ðx; tÞf ðx0 ; t0 Þ þ dx0 dxW i ðx; x0 ÞiðxÞiðx0 Þ J ½u ¼ T A T A A A Z Z X W mn ð21Þ þ dt0 dtW b ðt; t0 ÞbðtÞbðt0 Þ þ d em en . T
T
mn
If we define weighted residuals as: ZZ dx0 dt0 W f ðx; t; x0 ; t0 Þf ðx0 ; t0 Þ; kðx; tÞ ¼ W f f ¼
ð22Þ
T A
then the variation of the first term in (21) becomes similar to the first term in (5). A forward equation can be obtained from (22) by using a convolution of weighted residuals k with a correlation function Cf (x, t, x 0 , t 0 ) that is an ‘‘inverse’’ of Wf ; i.e., Cf satisfies: ZZ dx0 dt0 C f ðy; s; x0 ; t0 ÞW f ðx0 ; t0 ; x; tÞ ¼ dðy xÞdðs tÞ. ð23Þ T A
The correlation functions Ci and Cb are obtained similarly. Then the ELE for this system become (Bennett, 2002): 8 ok ok X mn > < c ¼ W d ðd m uðxm ; tm ÞÞdðx xm Þdðt tm Þ; ot ox m;n ð24Þ > : kðx; T Þ ¼ 0; kðL; tÞ ¼ 0; 8 < ou þ c ou ¼ F þ C k; f ot ox ð25Þ : uðx; 0Þ ¼ IðxÞ þ C i kðx; 0Þ; uð0; tÞ ¼ BðtÞ þ cC b H kð0; tÞ;
356
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
where Cf k ¼
ZZ
dx0 dt0 C f ðx; t; x0 ; t0 Þkðx0 ; t0 Þ;
T A
C i kðx; 0Þ ¼
Z
dx0 C i ðx; x0 Þkðx0 ; 0Þ;
ð26Þ
A
C b H kð0; tÞ ¼
Z
dt0 C b ðt; t0 Þkð0; t0 Þ. T
A discrete penalty function is XX X X X k n k n J ½u ¼ ðDtÞ2 W kn W ilm il im þ ðDtÞ2 W kn W mn flm fl fm þ b b b þ d em en ; m;n
l;k
m;l
n;k
ð27Þ
mn
kn k n k n where W kn flm ¼ W f ðxl ; t ; xm ; t Þ, W b ¼ W b ðt ; t Þ and Wilm = Wi (xl, xm) are the weight functions at the collocation points. Then substituting the discretized system (14) into (27) and defining as weighted residuals kkþ1 l X n ¼ DtW kn ð28Þ kkþ1 flm fm ; l m;n
we obtain the penalty variation dJ which is similar to the one in Section 2.2 (Eq. (19)). Using a discrete convolution of (28) with the correlation function Cf such that X rk DtW nr fmq C fql ¼ dlm dkn ;
ð29Þ
q;r
we obtain the following equation: X r Dt C rk Dtf kl ¼ fql kq
ð30Þ
q;r
from which the forward equation is easily obtained. Initial and boundary conditions are obtained by constructing discrete convolutions in a similar fashion. The discrete ELE thus yield: * + 8 N kþ1 k X > k k M 1 X mn l > kþ1 0 i i < c kl hi ðnl Þnx ¼ W d ðd n ukinn Þdiim dkkm ; þ e Dt Dt M l mn l¼0 > > : K ki ¼ 0; * + 8 kþ1 ð31Þ N X X ul ukl 1 > rk rþ1 k 0 > > þ c ui hi ðnl Þnx M l F l ¼ C fql kq ; < el Dt M q;r i¼0 > P P > q > : u0l ¼ I l þ C iql k0q ; uk0 ¼ Bk þ C qk b k0 . q
q
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
357
2.6. Continuous Galerkin formulation of the Euler–Lagrange equations Define a function ~ kðx; tÞ such that ~ kðx; tk Þ ¼
L X
k ~ kl hl ðxÞ;
ð32Þ
l¼0
where k ~ e 1 kk . kl ¼ M l l
ð33Þ
By construction, Eq. (31) can be rewritten as 8 * + kþ1 k N > ~ ~ X kþ1 > k 1 X mn k i e > < i Mi þ cM l ~ W d ðd n ukinn Þdiim dkkm ; kl h0i ðnl Þnx ¼ Dt Dt mn l¼0 > > > : ~K ki ¼ 0; * + 8 N kþ1 k X X > u u > 0 l k l elþ c el ¼ M e q C rk ~krþ1 ; el > M M ui hi ðnl Þnx M l F l M > fql q < Dt q;r i¼0 > P P > 0 > e q C iql ~ e ~k > kq ; uk0 ¼ Bk þ C qk : u0l ¼ I l þ M b M 0 k0 . q
ð34Þ
q
These discrete ELE equations can be obtained from the following continuous Galerkin problem: 8 Z Z ~ X ok ohi > < hi dx þ c~ dx ¼ k W mn d ðd n uin ðtn ÞÞdiim dðt tm Þ; ot ox A A mn > :~ kðx; T Þ ¼ 0; ð35Þ 8Z Z ou ou > < kÞhl dx; þ c F hl dx ¼ ðC f ~ ox A ot A > : kðx; 0Þ; uð0; tÞ ¼ BðtÞ þ C b H ~k; uðx; 0Þ ¼ IðxÞ þ C i ~ if a backward Euler scheme is applied to discretize the time derivatives; Gauss–Lobatto integration and elemental assembly are applied to all Galerkin integrals in (35) and to the convolution operators Cf • k, Cb H k and Ci k defined in (26); and the time variable is treated in a continuous fashion. We note again that a strong form of the advection operator with Dirichlet boundary conditions in the forward equation corresponds to a weak form of the advection operator with weak boundary conditions in the adjoint equation. It should also be noted that straightforward application of Galerkin discretization to the continuous ELE (6) and (7) yields incorrect results, but the correct results can be obtained by judiciously choosing strong and weak forms in the Galerkin discretization of (6) and (7) with appropriate boundary conditions. Later we will see that the same holds for the advection operator in the shallow water system.
358
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
3. Euler–Lagrange equations for the shallow water system 3.1. Tangent linear system Consider a system of nonlinear shallow water equations ou ou ou 1 sx of r½mDru þ u þ v fv þ cu ¼ f u; þg ot ox oy D ox D q ov ov ov 1 sy of r½mDrv ð36Þ þ u þ v þ fu þ cv ¼ f v; þg ot ox oy D oy D q of o o þ ½Du þ ½Dv ¼ f f ; ot ox oy where u = (u, v) is the horizontal velocity vector; H the resting depth of the fluid: f, the free surface elevation; D = H + f, the height of the water column; f, the Coriolis force: g, the gravitational acceleration; c, the bottom drag coefficient; m, the lateral viscosity coefficient; q the constant density of the fluid; (sx, sy) the wind stress vector acting on the surface of the fluid; and f u, f v, and f f are the errors associated with the momentum and continuity equations. There are methods that invert a nonlinear system of equations. But they are prohibitively expensive for typical geophysical systems. Our solution is based on linearization of the system (36). We solve the ELE equations for the linearized system and iterate on the nonlinearity. The series of solutions of the linear ELE converges to a solution of the nonlinear ELE for a large class of problems, but convergence is not guaranteed. There are a number of ways to linearize Eq. (36). Convergence to the nonlinear solution depends on a linearization algorithm. It may depend also on a choice of background fields. Let ~ ¼ ðU ; V Þ, and surface elevation H be the background fields. They are such that velocity U ~ þ d~ ~ u¼U u ¼ ðU þ du; V þ dvÞ; f ¼ H þ df; where d~ u and df are small. e ¼ H þ H. We also rewrite the diffusion term as Let D rðmDruÞ mrurD ¼ rðmruÞ þ . D D 1 Applying the Taylor series approximation H1þf H þH ðHfH and linearizations þHÞ2 ~ ~ ÞrU , we obtain the following tangent linear fu Hu + (f H)U and ~ uru ¼ Uru þ ð~ uU (TL) system ou ~ ~ ÞrU fv þ Cu þ g of rðmruÞ u ¼ f u ; þ U ru þ ð~ uU ot ox ov ~ of ~ ÞrV þ fu þ Cv þ g rðmrvÞ v ¼ f v ; þ U rv þ ð~ uU ð37Þ ot oy n o of e u þ r ðf HÞU ~ ¼ f f; þ r D~ ot ~ ru ¼ U ou þ V ou; where r~ u ¼ ux þ vy ; U ox oy
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
359
1 sx fH sx Cu ¼ cU cu ; e q q e2 D D 1 sy fH sy cv cV Cv ¼ e q q e2 D D are linearizations of the bottom drag and wind stress terms and e rur D ðf HÞ þ mr rU; e e D D e rvr D ðf HÞ þ mr rV v ¼ m e e D D u ¼ m
are linearizations of m rurD and m rvrD respectively. D D 3.2. Continuous ELE for the tangent linear shallow water system Assume that we have measurements dr, r = 1, . . ., M of the free surface elevation at a number of grid points: fðxr ; y r ; tr Þ ¼ d r þ er ;
r ¼ 1; . . . ; M;
where er is an error between the data and the model estimate at data point r. It is easy to show (Bennett, 2002) that the continuous ELE are ! ~ e ok o U ck oq r D e ~Þ þ ~ ¼ 0; rðkU k þ fl þ D rðmrkÞ þ r mk e e ot ox ox D D ! ~ e ol o U cl oq r D e ~Þ þ ~ ¼ 0; rðlU k fk þ D rðmrlÞ þ r ml e e ot oy oy D D ~ oq k m s ~ ~ ¼ Q; ~ mrU ~ rq ~rD e þr krU gr~ cU kU e ot q e2 D D ou ~ of ~ þ U ru þ ð~ u UÞrU fv þ Cu þ g rðmruÞ u ¼ C u k; ot ox ov ~ of ~ ÞrV þ fu þ Cv þ g rðmrvÞ v ¼ C v l; þ U rv þ ð~ uU ot oy of e ug þ rfðf HÞU ~ g ¼ C f q; þ rf D~ ot where X Qðx; y; tÞ ¼ W rs d ðd s fðxs ; y s ; ts ÞÞdðx xr Þdðy y r Þdðt t r Þ; Ck¼
ZZ T S
ð38Þ
ð39Þ
ð40Þ
rs
CðS; t; S 0 ; t0 ÞkðS 0 ; t0 Þ dS 0 dt0 ;
ð41Þ
360
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
~ ~ k ooxU ¼ kU x þ k ¼ ðk; lÞ and q are the adjoint variables; S is a two-dimensional domain;~ s ¼ ðsx ; sy Þ; ~
~ ~ ~ ¼ kU þ lV ; ~ ~¼ ~ ~rD e ¼ ðrU r D; e rV r DÞ. e lV x , r~ k ¼ ok þ ol ;~ kU krU k ooxU ; ~ k ooyU ; and rU ox oy For simplicity we omit treatment of initial and boundary conditions here, since it is similar to that in the previous sections.
3.3. Galerkin formulation of the forward system The Galerkin approximation of a two-dimensional system of the form (37) is given in greater detail in Iskandarani et al. (1995). Here we outline the basic steps only. The Galerkin formulation proceeds by choosing a set of basis functions for each element that are tensor products of Legendre Cardinal functions of order N and Np for velocity and free surface respectively /lm ¼ hl ðnÞhm ðgÞ; l ¼ 0; . . . ; N; m ¼ 0; . . . N; /plm ¼ hpl ðnÞhpm ðgÞ;
l ¼ 0; . . . ; N p ; m ¼ 0; . . . ; N p ;
such that u¼
N X i;j¼0
uij /ij ;
v¼
N X i;j¼0
vij /ij ;
f¼
Np X
fij /pij .
ð42Þ
i;j¼0
To form a global solution, we require C 0 continuity of the fields across element edges. This is obtained by defining a global set of basis functions in the same way as is done in Section 2.2. Then, the formulation proceeds by multiplying the momentum equations in (37) by each of the basis functions /lm the continuity equation in (37) by /plm , and then integrating the resulting equations over the full domain. Note that the spectral expansions are different for velocity and the free surface. We use the Babuska–Brezzi condition Np = N 2. This is done to suppress spurious pressure modes (Iskandarani et al., 1995). The third-order Adams Bashforth scheme is used for time stepping. In our Galerkin formulation, we use the weak form of the diffusion operator in the momentum equation, and the weak form of the divergence in the continuity equation. The strong form of the diffusion operator is invalid for this formulation, as it requires C1 continuity of the velocity field across element edges, while the weak form needs C 0 only. The strong form of the divergence operator is valid, but leads to a weak form of the gradient operator in the adjoint system that is more tedious to compute. The Galerkin formulation of the system (37) has the form: Z ou ~ of ~ þ U ru þ ð~ u UÞrU fv þ Cu þ g u /lm dS otZ ox Z þ mrur/lm dS ¼ f u /lm dS; Z ov ~ of ~ þ U rv þ ð~ u U ÞrV þ fu þ Cv þ g v /lm dS otZ oy Z
mrvr/lm dS ¼ f v /lm dS; Z n Z Z o of p p e ur/plm þ ðf HÞUr/ ~ D~ /lm dS dS ¼ f f /plm dS; lm ot
ð43Þ
þ
where vector notation $u$/ = ux/x + uy/y is used for compactness.
ð44Þ
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
361
3.4. Spectral element ELE for the shallow water system To obtain the discrete Galerkin ELE, we write a penalty functional that combines weighted errors in the momentum and the continuity equations. We obtain the adjoint system by applying the techniques described in Section 2 to the penalty functional. Derivations of the major terms in the adjoint operator are shown in Appendix B. The discrete Galerkin ELE formulation for the tangent linear system becomes: ) Z Z ( ~ ok ~ oU ck e oq ~ r/ij dS / dS þ kU þ fl þ D þk e ot ox ox ij D Z mk e þ mrkr/ij r Dr/ij dS ¼ 0; e D ) Z ( Z ~ ol ~ oU cl e oq ~ r/ij dS þk fk þ D /ij dS þ lU e ot oy oy D Z ml e þ mrlr/ij r Dr/ij dS ¼ 0; ð45Þ e D ) Z Z ( ~ ~ s oq ~ k ~rD e ~ mrU /pij dS þ g ~ kr/pij dS U rq 2 cU q ot e D Z m~ ~ krUr/pij dS ¼ Qij ; e D Z ou ~ of ~ þ U ru þ ð~ u U ÞrU fv þ Cu þ g u /lm dS ot ox Z Z þ mrur/lm dS ¼ ðC u kÞ/lm dS; Z ov ~ of ~ ð46Þ þ Urv þ ð~ u U ÞrV þ fu þ Cv þ g v /lm dS ot oy Z Z þ mrvr/lm dS ¼ ðC v lÞ/lm dS; Z Z Z of p p p e ur/lm þ ðf HÞU ~ r/lm g dS ¼ ðC f qÞ/plm dS; / dS f D~ ot lm where Qij ðtÞ ¼
X
W rs d ðd s fðxs ; y s ; ts ÞÞdijr djjr dðt tr Þ.
ð47Þ
rs
The same time stepping scheme and the same quadrature rules should be used in the adjoint Eqs. (45) and in the forward system (46) to obtain consistent systems of discrete ELE. The discrete right hand side terms in the system (46) are obtained by applying the same quadrature rules to convolutions (41) to yield
362
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Z
e lm ðC kÞ/lm dS ¼ M
N X qr
e qr M
Z
CðS lm ; t; S qr ; t0 ÞkðS qr ; t0 Þ dt0 .
ð48Þ
T
Note that if we apply a Galerkin formulation to the continuous ELE (38) and (39), and then discretize the resulting equations, we will not get the correct discrete ELE, unless we use a correct Galerkin form for the advection, diffusion, gradient and divergence operators. If we use a weak form of the divergence operator in the Galerkin formulations of the forward system (39), then we need to use a strong form of the gradient operator in the Galerkin formulation of the adjoint system (38), and vice versa: the strong form of the divergence operator in the forward system leads to a weak form of the gradient operator in the adjoint system. Similarly, the strong form of the gradient operator in the forward system leads to a weak form of the divergence in the adjoint system. And finally, the strong form of the advection operator in the forward system leads to a weak form of the advection operator in the adjoint system.
4. Numerical testing: the double-gyre problem The formulation is tested using the canonical problem of a wind-driven, double-gyre circulation in a mid-latitude ocean (see Haidvogel et al., 1992; Holland, 1978; McCalpin and Haidvogel, 1996). The problem is solved in a basin of 3600 · 2800 km and uses no-slip sidewall boundary conditions. The resting depth of the upper active layer is h = 600 m and the reduced gravity g 0 = 0.02 m/s2. The Coriolis parameter in the mid-latitude b-plane approximation has the form f ¼ f0 þ bðy y 0 Þ; 5 1
ð49Þ 11
1
where f0 = 7.27 · 10 s , b = 1.97 · 10 (ms) and y0 = 1400 km. The flow starts from rest. The circulation is forced by a steady zonal wind, similar to that of McCalpin and Haidvogel (1996), and has the form: 4a sinð2py=LÞ ; ð50Þ sx ¼ s0 ð1 þ 4aðy=L 1=2ÞÞ cosð2py=LÞ 2p where s0 = 5 · 105 (m/s)2, a = 0.13 and L = 2800 km. The wind stress component in the meridional direction sy is set to zero. The wind forcing is chosen in order to produce a free jet analogous to the Gulf stream (McCalpin and Haidvogel, 1996). The wind stress (50) has zero curl at the northern and southern boundaries. A linear bottom drag c = 1 · 107 s1 and viscosity m are used to balance the vorticity input by the wind stress forcing, and to prevent the small scales generated by eddies frompbeing ffiffiffiffiffiffi aliased into low-frequency waves. With this set of parameters, the Rossby radius is R ¼ g0 h=f0 ¼ 48 km at the central latitude, the Stommel boundary layer thickness is dS = c/b = 5 km. We run three different experiments: in Experiment 1 viscosity is 1200 m2/s, in Experiment 2 it is 600 m2/s, and in Experiment 3 it is 200 m2/s. The Munk layer thicknesses are dM = (m/b)1/3 = 39, 31, and 22 km respectively. We use the Galerkin spectral element model, described previously. The grid consists of 24 · 18 elements of order 6 for velocity, and order 4 for pressure. The average grid spacing is thus 26 km; the time step is 900 or 450 s depending on viscosity. The grid resolves the scales of interest only marginally, and viscosity is needed to damp small-scale noise resulting from nonlinear cascading
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
363
of energy (Levin et al., 1997). In Experiments 3, the viscosity is too low to suppress nonlinear instability, so we also use a vorticity-divergence filter (Levin et al., 1997) every 12 time steps. 4.1. Convergence of the tangent linear solution to the nonlinear solution To study convergence of the solutions of the linear ELE to the solution of the nonlinear ELE, we need to have a reference nonlinear solution. In most cases, the solution of the nonlinear ELE is not known. We study convergence in a particular case where no observational data is available (the so-called dataless iteration). In this case, the adjoint variables are identically zero, and the optimal solution is just the solution of the forward system for both the tangent linear and the nonlinear systems. Convergence of the solutions of the tangent linear forward system (37) to the solution of the nonlinear system (36) is investigated using the following procedure. First, we run a nonlinear (NL) simulation for two years starting from rest. The resulting interface displacement f and velocity (u, v) are used as initial conditions in the convergence experiments. The interface displacement for Experiments 1–3 is shown in Fig. 1. All convergence experiments are run for an additional 180 days, using the same viscosity/filtering as used in the spin up stage. We compare the results to the nonlinear simulation. We denote by u(x, y, t), v(x, y, t), f(x, y, t) the fields obtained in the nonlinear simulation. Series of simulations using the tangent linear (TL) system (37) are compared to the nonlinear run. In the first TL iteration U(x, y, t), V(x, y, t), H(x, y, t) are set to zero. We denote the resulting fields as u1(x, y, t), v1(x, y, t), f1(x, y, t). The second TL simulation uses the linearization around the new solution u1, v1, f1, so that U = u1, V = v1, H = f1 is used in the system (37). All other iterations are computed similarly. We thus obtain a series of fields uk(x, y, t), vk(x, y, t), fk(x, y, t), k = 1, . . . The rms error between uk and u is computed as: sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi R R ðuk uÞ2 dS dt T SR R . e¼ dS dt T S The error between fk and f is computed similarly. For problems (such as this one) of any appreciable size, it is impractical to store U, V, H at every time step. Instead, they are saved every few hundred time steps, and then interpolated linearly in time. The results of Experiment 1 are shown in Fig. 2. We tested three different sampling strategies. In the first, we sampled the fields every 480 time steps, in the second every 240 time steps, and in the third every 120 time steps. Convergence to the NL solution was obtained for all three sampling strategies. The error reduces faster than exponentially. At this level of nonlinearity, all three sampling strategies give similar convergence rates. Fig. 3 shows time evolution of the error, where error is computed as sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi R k 2 ðf fÞ dS S R ; eðtÞ ¼ dS S and the background fields are sampled every 480 time steps. In Experiment 1 (upper left panel of the figure), the results after five iterations are within 107 m from the nonlinear solution during the first 60 days after initialization. After that the error of the linearization increases. A better
364
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Fig. 1. Contour plot of interface displacement after 2 years of nonlinear simulation for three different experiments. Black contours show positive values of the interface displacement, red contours—negative values. This field is used as an initial condition for the convergence experiments.
linearization strategy is to use the results of nonlinear simulation as the background field in the linearization. The error in such linearization is shown in Fig. 3 (thick solid line). Here, the linearization is done around the true solution, thus persistently small errors are achieved throughout the simulation. When data is assimilated, one does not know the optimal nonlinear solution around which to linearize. Still, in most cases it is better to use the result of the nonlinear solution where no data is assimilated (the so-called prior solution) as a background state than to start from zero background state.
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
–1
u–velocity: convergence of TL to NL solution
free surface: convergence of TL to NL solution
1
10
365
10 every 480 every 240 every 120
every 480 every 240 every 120
–2
0
rms error
10
rms error
10
–3
–1
10
10
–4
10
1
–2
2
3
Iteration
4
5
10
1
2
3
4
5
Iteration
Fig. 2. RMS error between the NL solution and five iterations of the TL solution. Viscosity is 1200 m2/s. Solid line corresponds to the TL solution where U, V, H are sampled every 480 time steps, dashed line—every 240 time steps, and dashdot line—every 120 time steps. Left panel shows an error in u—velocity, right panel—in interface displacement.
Using smaller values of viscosity leads to a more energetic, less linear solution. The error between the results of the NL and TL simulations for Experiment 2 is shown in upper right panel of Fig. 3. The nonlinear model is stable over the whole integration, but the TL iterations are stable and convergent to the NL simulation only within the first 100 days of simulation. During the latter stages, the simulation becomes dynamically unstable. Convergence to the NL solution in Experiment 3 (lower panel of Fig. 3) is obtained only during the first 40 days of simulation. All TL iterations except the first become unstable after that. The model can not handle the unphysical phenomena and unbalanced fields introduced by linearization. As expected, in a mildly nonlinear simulation convergence of the TL to the NL solution is much faster. The more nonlinear the solution is, the larger the errors are. Linearizing around the prior solution achieves better and more stable results. 4.2. Testing of the adjoint operator The discrete system that is obtained from the Galerkin formulation of the forward Eq. (46) is a linear operator. The discrete system resulting from the adjoint operator (45) should be a transpose of the forward linear operator. We verify this numerically by applying adjoint and tangent linear operators to unit state vectors. The result of an application of a matrix to a unit vector gives a corresponding entry in the matrix. By applying the forward and adjoint operators to a number of unit vectors, we are able to obtain and compare individual entries in the corresponding linear operators. Our tests show that within machine accuracy, the adjoint operator is a transpose of the forward operator. 4.3. Twin experiment The discrete ELE are incorporated into a modular Inverse Ocean Modeling System (IOM) developed by Bennett and Chua (Chua and Bennett, 2001). The optimal solution to the nonlinear
366
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377 Viscosity is 1200
2
10
0
0
10
10
rms error
rms error
–2
10
It. 1 It. 2 It. 3 It. 5 Around nl
–4
10
–6
–4
10
10
–8
0
It. 1 It. 2 It. 3 It. 5 Around nl
–2
10
–6
10
10
Viscosity is 600
2
10
–8
50
100 Time(days)
150
200
10
0
50
100 Time(days)
150
200
Viscosity is 200
2
10
0
10
rms error
–2
10
–4
It. 1 It. 2 It. 3 It. 5 Around nl
10
–6
10
–8
10
0
50
100
150
200
Time(days)
Fig. 3. Time evolution of the interface displacement error between the NL solution and the TL solution for different iterations. Dotted line corresponds to the first iteration (U, V, H = 0), Dashdot line—second iteration, dashed line—third iteration, thick solid line—fifth iteration, thin solid line—linearization around nonlinear (prior) simulation. Different panels correspond to different values of viscosity used in the simulation. The fields are sampled every 480 time steps.
problem is obtained iteratively by finding an optimal solution to the tangent linear problem (45) and (46) and iterating on nonlinearity. The indirect representer method (Bennett, 2002) is used to solve the linear ELE. The indirect representer method consists of finding a decomposition of the forward and adjoint solutions in terms of the so-called representer functions and their adjoints. The coefficients in the decomposition are found iteratively by a preconditioned conjugate gradient method, where on each iteration the decoupled ELE system is solved. This is accomplished by integrating the adjoint system followed by the forward system. Inversion is tested by running a double-gyre twin experiment. The setup of the twin experiment is similar to that suggested in Muccino and Bennett (2002). A schematic diagram of the twin experiment is shown in Fig. 4. Initial conditions for the test run are obtained by advancing the nonlinear problem for two years starting from rest, similar to
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377 Test run:
errors
assimilation
Optimal solution
Init. Cond.,
Synthetic data:
NL simulation
Forcing
True + noise
Prior
NL simulation
solution
IC + noise, Forcing + noise True run: NL simulation
Difference is prior error
Prior run : Introduce observational error
Difference is model error
2 year run
Difference is residual error
Statistics of
367
True solution
Fig. 4. A schematic diagram of the twin experiment. Initial conditions for the twin experiment are obtained by running the nonlinear model for two years starting from rest. Synthetic data is obtained from the results of the true run by adding noise. Initial conditions and forcing in the test run are different from that in the true run. Statistics of all errors are known. The synthetic data is assimilated in the test run. The resulting solution is compared to the model simulation without assimilation (the prior run) and to the results of the true run.
Section 4.1. Initial conditions for the true run are obtained by adding some noise to the test run initial conditions. Similarly, some noise is added to the wind forcing in the true run. The duration of the assimilation experiment is 10 h (40 time steps). The noise in the initial conditions and wind forcing has Gaussian correlation in space; the decorrelation scale is L = 40 km. For simplicity we assume that the noise is uncorrelated in time. Variance of the noise is 10% of the signal for initial conditions, and 20% of the signal for wind forcing. Gaussian correlation in space is obtained by applying the pseudo-diffusion operator (see Egbert et al., 1994) to the uncorrelated noise. The uncorrelated (white) noise is obtained from a normal random noise generator. The pseudo-timestep Dt = 107 is dictated by stability of the diffusion operator. The integration is performed for twenty five timesteps. Fig. 5 shows the difference between uncorrelated and correlated noise for the velocity and pressure grids. Note that the velocity grid has more collocation points than the pressure grid; therefore, the uncorrelated noise on the velocity grid shows more spatial features than on the pressure grid. Colored noise has similar spatial scales on both grids. The pseudo-diffusion operator is used also to compute the convolutions of the covariance with the adjoint fields in (46). The operator is initialized with the normalized adjoint variable. The diffusion operator is integrated for fifty timesteps with Dt = 107. The normalization coefficient for each grid point is obtained by applying the diffusion operator in a similar fashion to a Kronecker Delta function at that grid point. Application of the pseudo-diffusion operator gives tremendous computational savings compared to the direct computation of convolutions given in (48). A result of a nonlinear run is chosen as the background state in the tangent linearization. We use a viscosity of 600 m2/s in all the experiments. The rms error between the interface displacement in the nonlinear and the tangent linear models is of the order of 106 m (see Fig. 3).
368
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377 Noise on velocity grid (slice at x=3500 km) 3
3
Uncorrelated Correlated
Uncorrelated Correlated
2
2
1
1
0
0
–1
–1
–2
–2
–3 0
500
1000
1500 y (km)
2000
2500
3000
–3 0
500
1000
1500 y (km)
2000
2500
3000
Fig. 5. Uncorrelated and correlated noise on a slice across velocity and pressure grids. Thin line corresponds to noise, that is obtained by a random noise generator (normally distributed noise with variance of 1). Thick line corresponds to specially correlated noise, obtained by applying a pseudo-diffusion equation.
Considering that the interface displacement itself is of the order of 100 m, and the variance of error in interface displacement is of the order of 10 m, we consider the linearization error small, and do not iterate on nonlinearity. In Fig. 6 we present results of three assimilation experiments. The experiments differ from each other by the amount and quality of assimilated data. The data is synthetic interface displacement Error in u velocity
Error in interface displacement 0.11
0.16
0.1
rms error (m/s)
rms error/(100 m)
0.14
0.12
0.1
0.08
0.06
0.04 0
prior Experiment 1 Experiment 2 Experiment 3
0.09
0.08 prior Experiment 1 Experiment 2 Experiment 3 2
0.07
4
6
time (hours)
8
10
0
2
4
6
8
10
time (hours)
Fig. 6. Prior and residual error in three assimilation experiments. Thin solid line corresponds to the prior error; dashed line—the residual error in Experiment 1 (observational data are interface displacement on all grid points at one time level t = 9.5 h; variance in observational error is 10 m); dot dashed line—residual error in Experiment 2 (observational data is interface displacement on all grid points at two time levels t = 4.5 h and t = 9.5 h; variance in observational error is 10 m); thick solid line—residual error in Experiment 3 (observational data is interface displacement on all grid point at one time level t = 9.5 h; variance in observational error is 5 m).
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
369
obtained by sampling the interface displacement in the true run and adding some noise to it. Observational noise is uncorrelated. For simplicity, the data are located on grid points, so that no interpolation is needed either in space or in time. In Experiments 1 and 2 variance of the observational error is 10 m (10% of the signal), while in Experiment 3 it is 5 m. In Experiments 1 and 3 the data is located on all grid points at time t = 9.5 h. In Experiment 2, the data is inserted at time 4.5 and 9.5 h on all grid points. In all the experiments, no iteration on the nonlinearity is used, and the conjugate gradient converges to our threshold tolerance of 104 in nine iterations in Experiments 1 and 2, and in eleven iterations in Experiment 3; each iteration requires one adjoint and one forward run.
Interface Displacement along x=18 km slice 80 true prior optimal data
60 40
(m)
20 0 –20 –40 –60
0
500
1000
1500 y (km)
2000
2500
3000
Interface Displacement along x=3500 km slice 50 40 30
(m)
20 10 0 true prior optimal data
–10 –20 –30
0
500
1000
1500
2000
2500
3000
y (km)
Fig. 7. Comparison of interface displacement in prior and optimal solutions to the true solution and data along two y-slices at time 9.5 h for Experiment 3. Red line—true, blue line—prior, black line—optimal solution; stars with error bars—observational data.
370
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Fig. 6 compares the prior error (rms difference between the prior and the true solution) with the residual error (the difference between the optimal and the true solution). In all three experiments, the interface displacement and velocity fields are significantly closer to the true solution after insertion of data than before. We also note that a better fit is obtained either when there is more data (Experiment 2), or if the quality of data is better (Experiment 3), as compared to Experiment 1. There is also a noticeable decrease in interface displacement error at and near the time level for which the observational data is used. The velocity error, on the contrary displays some increase at the respective time levels. This may be explained by noticing that the cost function does not include the velocity error explicitly. We minimize the error between observations and model estimates in interface displacement. The velocity field is adjusted through the model. Thus the increase in velocity error at the time for which the interface displacement data exists can be attributed to the adjustment of the velocity field towards the data. The adjustment is successful: insertion of more data, or better quality data, results in making the residual error smaller, both in interface displacement and velocity. Fig. 7 compares the interface displacement in prior and optimal solutions to the true solution and to the data. The fields are shown along two slices taken at time 9.5 h in Experiment 3. These plots show that the optimal solution is within error margins of the data and is significantly closer to the true solution than the prior solution. Similar results were, obtained for Experiments 1 and 2. Two different sampling strategies have been tested. In the first one, the background fields have been sampled at every time step. In the second, the background fields have been sampled at time steps 0, 20 and 40. Results are practically the same.
5. Summary and discussion We have presented the derivation of the discrete Euler–Lagrange equations for a Galerkin spectral element shallow water ocean model. We find in particular that the discrete Euler–Lagrange equations can be obtained from the continuous Euler–Lagrange equations by using a correct combination of weak and strong forms of derivatives in the Galerkin integrals. The strong form of a derivative in the forward system corresponds to a weak derivative in the adjoint system. Boundary conditions in the adjoint system are imposed weakly by assuming that the boundary integrals that relate the strong and weak forms vanish. Similarly, a weak form of a derivative in the forward system gives rise to a strong form of the derivative in the adjoint system. The adjoint system also has a reverse order with which the element assembly and mass averaging are performed. The results can be applied to obtain an adjoint to any Galerkin finite element and spectral element system. We have applied our technique to the two-dimensional shallow water ocean model developed by Iskandarani et al. (1995). We developed the tangent linear shallow water model and tested its convergence to the nonlinear shallow water model using a classic double-gyre problem. Due to the high-order discretization scheme, our tangent linear model converges to the nonlinear model in these highly nonlinear regimes. The tangent linear and adjoint systems described herein can be used in a variety of algorithms for data assimilation and parameter estimation, e.g., the strong constraint adjoint method, generalized inverse problems, and the representer method (Bennett, 2002; Wunsch, 1996). Here we have used the 4DVAR representer algorithm developed by Bennett et al. (Bennett, 2002; Bennett
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
371
and Thornburn, 1992; Bennett et al., 1998) which solves the Euler–Lagrange equations by decoupling the forward and adjoint system for each observation, and then finds the solution to the full Euler–Lagrange system as a linear combination of the representer functions. The algorithm has been utilized in a modular Inverse Ocean Modeling System (IOM) (see Chua and Bennett, 2001) which can be easily configured to work with any tangent linear and adjoint model as modules (see e.g., Bennett and Thornburn, 1992; Bennett et al., 1998, 2000; Egbert et al., 1994). An implementation of the shallow water SEOM model within the IOM framework has been tested on a double-gyre twin experiment. Although the two-dimensional inverse model described here can be used in its own right to study barotropic processes and other single-layer phenomena (e.g., the oceanic tide and the abyssal circulation), the shallow water system also represents an important precursor to the development of a fully three-dimensional assimilation capability based upon the spectral finite element method. For example, the shallow water system is the building block from which a multi-layered spectral element ocean model has been constructed. It also serves as the barotropic engine in our three-dimensional, split-explicit, spectral element model (Iskandarani et al., 2003). Development of a three-dimensional inverse spectral element ocean model is underway. Acknowledgements This research was supported by the National Science Foundation under grant NSF OCE 0121506. The development of the Spectral Element Ocean Model has been supported by the Office of Naval Research under grants N00014-00-1-0230 and N00014-03-l-0255. The development of the Inverse Ocean Model has been supported by the National Science Foundation under grant NSF OCE 0121542. The authors are grateful to Dr. Julia Muccino for her valuable comments and suggestions. Appendix A. The discrete ELE for a multi-level time stepping scheme If in system (14) we apply third-order Adams Bashforth time stepping instead of the backward Euler for k = 2, . . ., K 1, we get the following time stepping of the forward equation: ukþ1 ukl l þ Akl ðR½uÞ ¼ flk ; Dt u0l ¼ I l þ il ; uk0 ¼ 0; where operators A and R are defined as follows 8 k > < Rl ½u; k ¼ 0; 1; k 2 Al ðR½uÞ ¼ P kq > : aq Rl ½u; k ¼ 2; . . . ; k 1; q¼0
Rkl ½u
¼
e 1 M l
*
ckl
N X i¼0
+
uki h0i ðnl Þnx M l
F kl .
372
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Then using the same strategy as before we obtain the following discrete ELE: 8 kþ1 k X
> < ki ki þ Ryk ðAy ½kÞ ¼ W d d m ukimm diim dkkm ; i Dt Dt m > : K ki ¼ 0; kk0 ¼ 0; 8 kþ1 k < ul ul þ Akl ðR½uÞ ¼ C f kkþ1 l ; Dt : 0 0 uk0 ¼ 0; ul ¼ I l þ W 1 i ki ; where Ryki ðAy ½kÞ ¼
* N X l¼0
+ M l ckl h0i ðnl Þnx Ayk ½k ; el l M
8 > a0 kKl ; > > > > > a0 kK1 þ a1 kKl ; > l > >
q¼0 > > > > > > k2l þ a1 k3l þ a2 k4l ; > > : 1 kl þ a2 k3l ;
k ¼ K 1; k ¼ K 2; k ¼ 2; . . . ; K 3;
ðA:1Þ
k ¼ 1; k ¼ 0.
We notice that the order of operations is reversed in the adjoint equation, and AB3 time stepping is done backward in time.
Appendix B. Derivation of some adjoint operators for the shallow water equations Here we derive some of the terms in the adjoint shallow water system. The derivations are shown for one element only, an elemental assembly is implied by making a distinction between e ij . M e ij contains an elemental mass matrix Mij and the mass matrix after elemental assembly, M contributions to the mass matrix from neighboring elements. For collocation points that are e ij is equal to the sum of Mij computed in those elements that share located on element edges, M e this edge. M ij is equivalent to Mij for interior collocation points. B.1. Time derivative Consider the part of the penalty function that corresponds to the time derivative. Using upwind Euler time stepping for simplicity, and summation over all time and space indices, we get Z N X kþ1 dukþ1 duk XX kkþ1 dukþ1 duk lm lm /lm dS ¼ 2 . Dt klm Dt lm 2 e Dt Dt M lm k klm lm
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
373
The coefficient of dukij Dt in dJ has the form: kkþ1 kkij ij ¼ Dt
Z ~kþ1 ~k k k /ij dS. Dt
ðB:1Þ
As before, the right hand side of (B.1) is equal to the left hand side, if Gauss–Lobatto quadrature e ij ¼ kij . of order N is used to compute the integral, and ~kij M In the rest of this section, we will deal with an Upwind Euler time stepping scheme. For compactness, in the following formulas we omit the superscripts in duk, dvk, dfk, kk + 1, lk + 1 and qk + 1. B.2. Advection operator Here we separate the part of the penalty function that corresponds to the advection operator. It has the form: Z N XX klm odu odu 2 Dt U þV /lm dS. e ox oy k lm M lm The integral above can be rewritten as Z Z odu odu odu odu þV /lm dS ¼ ðU nx þ V ny Þ þ ðUgx þ V gy Þ /lm jJ j dn dg . U ox oy on og After applying GL quadrature it becomes * + N X 0 0 fduij hi hi ðnl Þhj ðgm ÞðU nx þ V ny Þlm M lm þ duij hi ðnl Þhj ðgm ÞðU gx þ V gy Þlm M lm g . ij
Collecting terms for duij and taking into account that hi (nl) = dil "i,l, where dil is a Kronocker Delta function (Iskandarani et al., 1995), we get the following advection term in the ELE * + N N X X M M lj im klj h0i ðni ÞðU nx þ V ny Þlj þ kim h0j ðgm ÞðUgx þ V gy Þim e e M M lj im m l Z Z Z o/ij o/ij o/ij o/ij ~ e ~ dS þ kðUgx þ V gy Þ dS ¼ þV ¼ kðUnx þ V ny Þ k U dS on og ox oy Z ~ ðB:2Þ ¼ e k Ur/ ij dS. As before, the discrete advection operator is equal to the integral operator, if an appropriate quadrature is used. B.3. Bottom drag and wind stress The bottom drag terms give rise to diagonal operators in a discrete Galerkin formulation, so that in the ELE those terms are the same diagonal operators as well. In the x-momentum equation they have the form:
374
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
ckij ¼ H
Z
ce k / dS. H ij
ðB:3Þ
The wind stress does not depend on any of the unknown variables, and thus is not represented in the penalty function. This term does not give rise to any term in the adjoint equations. B.4. Coriolis terms The coefficient for du has a contribution from the Coriolis term Z N N XX XX llm f du/lm dS ¼ llm f dulm e k k lm M lm lm from which we obtain a Coriolis term in the ELE of the form: Z lij f ¼ f e l /ij dS;
ðB:4Þ
where e l is defined similarly to ~ k. B.5. Divergence term The divergence operator in the continuity equation gives rise to the following terms in the penalty function Z Np XX qlm H fduð/pn nx þ /pg gx Þ þ dvð/pn ny þ /pg gy ÞgdS. p e k lm M lm In the Galerkin formulation of the continuity equation (44), we apply higher-order quadrature to compute integrals in order to achieve a desired accuracy in integrating the terms Hu and Hv (Iskandarani et al., 1995). Here we will show the coefficient of du; the coefficient of dv is done similarly. * + Np N X XX 0p p qlm H ij duij ðhl ðni Þhpm ðgj Þnxij þ hl ðni Þh0pm ðgj Þgxij ÞM ij . k
lm
ij
From there we get the following term in the ELE * p + N X qlm 0p p p 0p H ðh ðn Þh ðg Þn þ hl ðni Þhm ðgj Þgxij ÞM ij e p ij l i m j xij lm M lm Z Z o~ q o~ q o~q ¼ H nx þ gx /ij dS ¼ H /ij dS. on og ox
ðB:5Þ
Proceeding as before, we can prove that if we use the strong form of the divergence operator in (44) Z Hðux þ vy Þ/pij dS;
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
375
then the corresponding term in the ELE has the form Z o/ij dS. He q ox B.6. Gradient operator Gradient operators in the momentum equation give rise to the following term in the penalty function Z Z N XX klm odf odf / dS þ llm / dS g e lm ox lm oy lm M k lm from which we obtain the following contribution to the coefficient for dfij * N
X klm 0p g hi ðnl Þhpj ðgm Þnxlm þ hpi ðnl Þh0pj ðgm Þgxlm e lm M lm + Z
o/pij o/pij 0p p p 0p ~k ~ þl þllm hi ðnl Þhj ðgm Þnylm þ hi ðnl Þhj ðgm Þgylm M lm ¼ g dS. ox oy
ðB:6Þ
B.7. Diffusion operator It is easy to show that a weak form of the diffusion operator is self-adjoint; i.e. the term Z o/lm o/lm þ uy m ux dS ox oy in the forward Galerkin formulation has a corresponding term Z o/ij ~ o/ij ~ þ ky dS m kx ox oy
ðB:7Þ
in the ELE. All other terms are done similarly.
References Bennett, A.F., 2002. Inverse Modeling of the Ocean and Atmosphere. Cambridge University Press. Bennett, A.F., Thornburn, M.A., 1992. The generalilzed inverse of a nonlinear quasigeostrophic ocean circulation model. Journal of Physical Oceanography 22, 213–230. Bennett, A.F., Chua, B.S., Harrison, D.E., McPhaden, M.J., 1998. Generalilzed inversion of a tropical atmosphere– ocean (TAO) data and a coupled model of the tropical Pacific. Journal of Climate 11, 1768–1792. Bennett, A.F., Chua, B.S., Harrison, D.E., McPhaden, M.J., 2000. Generalilzed inversion of a tropical atmosphere– ocean (TAO) data and a coupled model of the tropical Pacific. Part II: The 1995–96 La Nina and 1997–98 El Nino. Journal of Climate 13, 2770–2785.
376
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
Boyd, J.P., 2001. Chebyshev and Fourier Spectral Methods, second ed. Dover Publications. Buizza, R., 1997. Potential forecast skill of ensemble prediction and spread and skill distributions of the ECMWF ensemble prediction system. Monthly Weather Review 125, 99–119. Cardinali, C., Isaksen, L., Andersson, E., 2003. Use and impact of automated aircraft data in a global 4DVAR data assimilation system. Monthly Weather Review 131, 1865–1877. Chua, B.S., Bennett, A.F., 2001. An inverse ocean modeling system. Ocean Modeling 3, 137–165. Curchitser, E.N., Iskandarani, M., Haidvogel, D.B., 1998. A spectral element solution of the shallow water equations on multiprocessor computers. Journal of Atmospheric and Oceanographic Technology 15 (2), 510–521. Curchitser, E.N., Haidvogel, D., Iskandarani, M., 1999. On the transient adjustment of a mid-latitude abyssal ocean basin with realistic geometry: the constant depth limit. Dynamics of Atmosphere and Oceans 29 (2–4), 147–188. Curchitser, E.N., Haidvogel, D., Iskandarani, M., 2001. Transient adjustment of circulation in a midlatitude abyssal ocean basin with realistic geometry and bathymetry. Journal of Physical Oceanography 31 (3), 725–745. Dobrindt, U., Schroter, J., 2003. An adjoint ocean model using finite elements: an application to the South Atlantic. Journal of Atmospheric and Oceanic Technology 20, 392–407. Egbert, G.D., Bennett, A.F., Foreman, M.G.G., 1994. TOPEX/POSEIDON tides estimated using a global inverse method. Journal of Geophysical Research 99, 24281–24852. Haidvogel, D.B., McWilliams, J.C., Gent, R.R., 1992. Boundary current separation in a quasigeostrophic, eddyresolving ocean circulation model. Journal of Physical Oceanography 22, 882–902. Haidvogel, D.B., Curchitser, E.N., Iskandarani, M., Hughes, R., Taylor, M., 1997. Global modeling of the ocean and atmosphere using the spectral element method. Atmosphere–Ocean 35, 505–531. Holland, W.R., 1978. The role of mesoscale eddies in the general circulation of the ocean—numerical experiments using a wind-driven quasi-geostrophic model. Journal of Physical Oceanography 8, 363–392. Iskandarani, M., Haidvogel, D.B., Boyd, J.P., 1995. A staggered spectral element model with application to the oceanic shallow water equations. International Journal for Numerical Methods in Fluids 20, 393–414. Iskandarani, M., Haidvogel, D.B., Levin, J.G., 2003. A three-dimensional spectral element model for the solution of hydrostatic primitive equations. Journal of Computational Physics 186 (2), 397–425. Levin, J., Iskandarani, M., Haidvogel, D.B., 1997. A spectral filtering procedure for eddy-resolving simulations with a spectral element ocean model. Journal of Computational Physics 137 (1), 130–154. Lynch, D.R., Hannah, C.G., 2001. Inverse model for limited-area hindcasts on the continental shelf. Journal of Atmospheric and Oceanic Technology 18, 962–981. Lynch, D.R., Ip, J.T.C., Naimie, C.E., Werner, F.E., 1996. Comprehensive coastal circulation model with application to the Gulf of Maine. Continental Shelf Research 16, 875–906. Lynch, D.R., Naimie, C.E., Hannah, C.G., 1998. Hindcasting the Georges Bank circulation: Part I. Detiding. Continental Shelf Research 18, 607–639. Maday, Y., Patera, A.T., 1988. Spectral element methods for the incompressible Navier–Stokes equations. In: Noor, A. (Ed.), State of the Art Surveys in Computational Mechanics. ASME, pp. 71–143. Mahfouf, J.F., Rabier, F., 2000. The ECMWF operational implementation of four-dimensional variational assimilation. II: Experimental results with improved physics. Quarterly Journal of the Royal Meteorological Society 126, 1171–1190. McCalpin, J.D., Haidvogel, D.B., 1996. Phenomenology of the low-frequency variability in a reduced-gravity, quasigeostrophic double-gyre model. Journal of Physical Oceanography 26 (5), 739–752. Moore, A.M., 1991. Data assimilation in a quasigeostrophic open ocean model of the Gulf Stream region using the adjoint method. Journal of Physical Oceanography 21, 398–427. Moore, A.M., Arango, H.G., Lorenzo, E.D., Cornuelle, B.D., Miller, A.J., Neilson, D.J., 2004. A comprehensive ocean prediction and analysis system based on the tangent linear and adjoint of a regional ocean model. Ocean Modeling 7, 227–258. Muccino, J.C., Bennett, A.F., 2002. Generalilzed inversion of the Korteweg–de Vries equation. Dynamics of Atmospheres and Oceans 35, 227–263. Palmer, T.N., Gelaro, R., Barkmeijer, J., Buizza, R., 1998. Singular vectors, metrics, and adaptive observations. Journal of Atmospheric Sciences 55, 633–653.
J.C. Levin et al. / Ocean Modelling 12 (2006) 348–377
377
Patera, A.T., 1984. A spectral element method for fluid dynamics: laminar flow in a channel expansion. Journal of Computational Physics 54, 468–488. Robinson, A.R., Lermusiaux, P.F.J., Sloan, N.Q., 1998. Data assimilation. In The Sea, vol. 10. John Wiley & Sons, pp. 541–594. Wunsch, C., 1996. The Ocean Circulation Inverse Problem. Cambridge University Press. Wunsch, C., Haidvogel, D.B., Iskandarani, M., Hughes, R., 1997. Dynamics of the long-period tides. Progress in Oceanography 40, 81–108. Zhu, K., Navon, I.M., Zou, X., 1994. Variational data assimilation with a variable resolution finite-element shallowwater equations model. Monthly Weather Review 122, 946–965.