Accepted Manuscript Title: Adjoint method for parameter identification problems in models of stirred tank chemical reactors Author: M. Ben´ıtez A. Berm´udez J.F. Rodr´ıguez-Calo PII: DOI: Reference:
S0263-8762(17)30264-2 http://dx.doi.org/doi:10.1016/j.cherd.2017.04.028 CHERD 2665
To appear in: Received date: Revised date: Accepted date:
3-1-2017 28-3-2017 21-4-2017
Please cite this article as: M. Ben´itez, A. Berm´udez, J.F. Rodr´iguez-Calo, Adjoint method for parameter identification problems in models of stirred tank chemical reactors, (2017), http://dx.doi.org/10.1016/j.cherd.2017.04.028 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Adjoint method for parameter identification problems in models of stirred tank chemical reactors M. Beníteza,∗, A. Bermúdezb , J. F. Rodríguez-Caloc a Departamento
de Matemáticas, Universidade da Coruña, c/ Mendizábal s/n, 15403 Ferrol, Spain de Matemática Aplicada, Universidade de Santiago de Compostela, c/ Lope Gómez de Marzoa s/n, 15786 Santiago de Compostela, Spain c Centro de Tecnología de Repsol, Autovía de Extremadura s/n, 28935 Móstoles, Madrid, Spain
ip t
b Departamento
cr
Abstract
an
us
In this paper an adjoint-based algorithm for parameter identification problems in systems of ordinary differential equations (ODEs) is presented. This is done by solving a minimization problem in which the cost function is defined in order to quantify the mismatch between the observed data and the numerical solution of the ODEs. Most of existing local minimizers need the derivatives of both the cost function and the constraints that are usually calculated by finite-difference formulas. In this paper we show how they can be computed much more efficiently and accurately by the so-called adjoint method. We apply this method to the problem of estimating a set of unknown parameters appearing in chemical reaction models. Numerical results showing the efficiency of the adjoint method are included.
M
Keywords: Chemical kinetics, adjoint method, inverse problem, ODE-constrained optimization
1. Introduction
Ac ce p
te
d
Mathematical modelling and numerical simulation are fundamental tools in most areas of applied science and engineering. A few examples are biological systems, chemical reactions, fluid mechanics, etc. In general, the models involve unknown parameters which must be identified to define completely the system. Usually, these parameters are estimated by solving an inverse problem which consists in a minimization problem where the cost function depends on the solution of the parameterized model. Depending on the nature of the problem the models can be different mathematical objects as systems of numerical equations, ordinary differential equations or partial differential equations. In this paper we focus on systems of (usually nonlinear) ordinary differential equations. In order to solve the above mentioned minimization problem numerical algorithms have to be used. Most of local minimizers are gradient-based iterative algorithms requiring, at each iteration, the computation of the derivatives of the objective function, and possibly of the constraints, with respect to the parameters to be identified. In practice, several classical difficulties have to be faced: first, the number of variables and parameters of the model can be large, second the objective function can have several local minima and third the governing equations usually involve nonlinear functions of variables and parameters so their solution have to be computed by numerical discretization methods. Consequently, the original minimization problem is first approximated by a family of minimization problems associated with a discretization parameter involved in the approximation procedure (typically, the time-step of the numerical scheme used to solve numerically the ODEs). Thus, a fully discrete minimization problem is finally solved instead of the original continuous one: both the cost function and the constraints depend on the numerical solution of the parameterized model. ∗ Corresponding
author Email addresses:
[email protected] (M. Benítez),
[email protected] (A. Bermúdez),
[email protected] (J. F. Rodríguez-Calo) Abbreviations: ODE, ordinary differential equation; PDE, partial differential equation; STR, stirred-tank reactors; CSTR, continuous stirred tank reactors
Preprint submitted to Elsevier
March 28, 2017
Page 1 of 23
Ac ce p
te
d
M
an
us
cr
ip t
In this paper, a quite general identification problem for systems described by ODEs is considered. The main goal is to introduce an efficient strategy for computing the derivatives of both the discrete objective and the constraint functions with respect to the variables to be identified. The proposed algorithm is based on the so-called adjoint method. It is an advantageous alternative to the standard finite-difference method of computing the derivatives. Indeed, the latter is very sensitive to the discretization step, leading to round-off errors when it is too small and to large truncation errors when it is too large. Moreover, it takes more computing time than the adjoint method, which is particularly true for problems with a large number of optimization variables. The adjoint approach has been introduced in optimal control theory (see [12], [13], [8]) and then successfully used in different fields for different problems (see [14], [26], [25], [10], [15], [21], [11]). In particular, it is extensively used to obtain the sensitivity of some variables with respect to related parameters. Let us mention, for instance, aeroelastic systems [14], atmospheric chemistry [26], chemical kinetic systems (see [25], [10]), thermoelasticity (see [15], [16], [17]) and so on. In the field of inverse problems, the adjoint method has been introduced in the 1970s to efficiently compute the gradient of a functional [9] and for design optimization in fluid dynamics (see [21], [11]). In this context, it has been also applied, for instance, to an inverse thermo-solutal convection problem [28], to inverse modelling in atmospheric chemistry [26], to tumor invasion PDE-constrained optimization problem [23], to geophysics [22] and so on. When the adjoint method is applied to the system of continuous equations (respectively, discrete equations) it is called continuous adjoint approach (respectively, discrete adjoint approach). In this paper, we present both approaches. We notice that from the computational point of view a discrete adjoint approach is the one needed to accurately compute the gradient. Moreover, we apply this strategy to parameter identification of chemical reaction systems. Reaction system models are obtained from conservation equations and depend on several issues such as the underlying reactions (stoichiometry and kinetics) and the reactor operation conditions (initial conditions, inlet and outlet flows, operational constraints). Different methods can be considered to identify the kinetics from experimental data (see [7], [19], [5], [3], [24]). When they are performed in one step (respectively, several steps) they are called simultaneous approach (respectively, incremental approach). More precisely, in the simultaneous approach the parameters of all reactions are identified simultaneously. However, in the incremental approach parameter identification of each reaction is done individually. In [5] an incremental approach that uses the concept of extents of reaction and mass transfer has been proposed. In [6] the simultaneous approach and two incremental approaches are compared with respect to several issues. In [3] an incremental approach which extends the so-called differential method is introduced. The performance of the incremental strategy is compared to that of the simultaneous approach which exhibits optimal statistical properties. In the present paper, we consider models of reaction systems where the physico-chemical magnitudes do not depend on the particular position in the reactor. They correspond to the so-called stirredtank reactors (STR); the case of plug-flow reactors (PFR) has been addressed in [4]. We assume that stoichiometry matrix, empirical measurements for variables, and functional expressions for the reaction velocities are given, but the latter depend on a set of unknown parameters. The paper is organized as follows. In Section 2 , a general constrained optimization problem governed by an ODE system is posed and the continuous adjoint method for computing the gradient of the cost function with respect to the optimization variables is presented. In Section 3, we present a discrete adjoint approach associate to a second-order finite-difference scheme for solving the system of ODEs. In Section 4, we apply the proposed discrete adjoint method to inverse problems of chemical reaction systems in stirred tank reactors. First, the mathematical models for stirred tank reactors and the associated inverse problem are posed. Then, the discrete adjoint method is presented. Besides, the whole algorithm used for solving these inverse problems is introduced. For the sake of completeness, we consider the nonisothermal case so temperature is also calculated from the energy conservation equation. Finally, in Section 5 numerical examples are included showing the performance of the adjoint method. 2. Continuous adjoint method for ODE-constrained optimization problems. In this section we briefly describe the essence of the adjoint approach in general terms, before dealing with the specific application to parameter identification problems in chemical kinetics models. Moreover, 2
Page 2 of 23
while the adjoint method will be applied to a discretized version of the problem due to numerical reasons, with the aim of more easily understanding it will be first introduced in continuous rather than discrete time. 2.1. Optimization problem In this section, we consider the following problem: Continuous Optimization Problem (COP). (1)
ip t
min J(p), p
dxp (t) = F(t, xp (t), p) dt xp (0) = x0 .
t ∈ (0, T ),
cr
where J(p) = Φ(xp , p) being xp the solution of the following initial-value problem:
(2)
M
an
us
In the above equations, T > 0 is the final time, p ∈ RM is a (column) vector containing the optimization variables, xp is a vector-valued function giving the time evolution of a set of N variables, F : (0, T ) × RN +M −→ RN is the source term, Φ : U × RM −→ R being U ⊂ C([0, T ]; RN ) a Banach space endowed with the usual L2 ([0, T ]; RN ) norm, J : RM −→ R is the objective function, and x0 is a (column) vector containing the initial conditions. In the particular application addressed in this paper, the parameter identification in chemical kinetic models, p, represents the parameters to be identified and the objective function J is defined in order to quantify the mismatch between the experimentally observed data and the numerical solution of the ODEs. In order to solve this optimization problem we usually need to make two consecutive steps:
te
d
1. To approximate the problem by introducing a numerical discretization scheme to solve the ordinary differential system (2) and, subsequently, an approximation of the objective function. Both lead to the discrete optimization problem. Notice that in this step a given parameter arises: the discretization step to be called h. 2. To choose an optimisation algorithm to solve the discrete optimization problem. 2.2. Continuous adjoint method.
Ac ce p
Most of optimisation algorithms include local minimizers that uses the gradient of the discrete version of J with respect to its independent variables, p. In principle, this gradient can be approximated by using quotient differences but, in practice, this procedure has two major drawbacks: it is computationally expensive (the objective function has to be evaluated M +1 times) and, more importantly, is very sensitive to the increment of the variables, leading to round-off errors when it is too small and to large truncation errors when it is too large. Alternatively, the so-called adjoint method provides a way to compute exactly this gradient for either the continuous version or the discrete one. For the sake of simplicity in the exposition and notation we first recall the former. We first notice that, since J(p) = Φ(xp , p), by using the chain rule we get DJ(p) = Dp Φ(xp , p) + (Dp xp )T Dx Φ(xp , p),
(3)
where the superscript T denotes the transposed of an operator or a matrix and, in general, Df (y) is the Fréchet derivative of mapping f at point y (let us recall that if f : Rq −→ Rr then Df (y) is defined by the Jacobian matrix of f , which is a r × q matrix). In particular, Dp xp is the Fréchet derivative of the mapping that gives the solution of the initial-value problem (2) for a particular choice of the set of parameters p. It is a linear mapping from RM into the function space U: Dp xp : δp ∈ RM −→ δxp := Dp xp δp ∈ U. (4) 3
Page 3 of 23
It is easy to prove that (3) is equivalent to the vector equality gradJ(p) = gradp Φ(xp , p) +
Z
T
T gradp xp (t) gradx Φ(xp , p)(t) dt,
0
(5)
ip t
where gradp xp (t) denotes the Jacobian matrix of the mapping p ∈ RM −→ xp (t) ∈ RN . In order to compute DJ(p), the first step is to obtain the linear mapping Dp xp (or, equivalently, the Jacobian matrix gradp xp (t)). However, looking at (3), our interest is actually the term (Dp xp )T Dx Φ(xp , p), that is, the image of Dx Φ(xp , p) by the linear mapping (Dp xp )T which, as we will see next, is much less expensive than computing the full Dp xp . This is the main idea underlying the adjoint method. Let us first introduce the mathematical problem needed to compute the above Fréchet derivative Dp xp . The Implicit Function Theorem tells us that, for given δp ∈ RM , function δxp in (4) is the solution of the linear initial-value problem
cr
dδxp (t) = gradx F(t, xp (t), p)δxp (t) + gradp F(t, xp (t), p)δp, dt δxp (0) = 0.
(6)
us
(7)
Now let us introduce the adjoint state q ∈ U as the solution of the linear final-value problem T dq (t) = gradx F(t, xp (t), p) q(t) + gradx Φ(xp , p)(t), dt q(T ) = 0.
(8)
an
−
(9)
Z
0
T
M
Let us make the scalar product of (6) by q(t) and of (8) by δxp (t) := Dp xp δp (t), then integrating in [0, T ], subtracting the resulting equalities and using the integration by parts formula Z
dδxp (t) · q(t) dt = δxp (T ) · q(T ) − δxp (0) · q(0) − dt
0
Z
=
T
T
gradp F(t, xp (t), p)δp · q(t) dt −
te
0=
Z
gradp F(t, xp (t), p)δp · q(t) dt −
Z
T
0
Ac ce p
0
and also
0 = δp ·
Z
T
0
0
d
we get
T
Z
T gradp F(t, xp (t), p) q(t) dt −
Z
dq (t) · δxp (t) dt = − dt
Z
0
T
dq (t) · δxp (t) dt, dt
T
gradx Φ(xp , p)(t) · δxp (t) dt 0
gradx Φ(xp , p)(t) · Dp xp δp (t) dt,
T
T gradp xp (t) gradx Φ(xp , p)(t) dt
0
(10)
!
· δp.
Hence, we have proved the equality Z
0
T
T gradp xp (t) gradx Φ(xp , p)(t) dt =
Z
0
T
T gradp F(t, xp (t), p) q(t) dt,
which can be used in (5) to finally get
gradJ(p) = gradp Φ(xp , p) +
Z
T
0
T gradp F(t, xp (t), p) q(t) dt.
(11)
Let us emphasize that the computational effort to compute the gradient of the objective function for a given set of parameters p by using the adjoint method consists in firstly solving the initial-value problem (2) for p, to compute the state xp , secondly solving the final-value problem (8)-(9) to compute the adjoint state q, and finally making some additional algebraic operations to compute (11).
4
Page 4 of 23
3. Discrete adjoint method for ODE-constrained optimization problems.
ip t
At first view, in order to compute the gradient by making the above steps we could discretize the adjoint state problem (8)-(9). However, this procedure does not give the exact gradient of the discretized objective function which is really the one we need when solving the problem in a computer by using an optimisation algorithm. This fact deteriorates the performance of the latter leading to a higher computational cost and, in many cases, to a lack of convergence. The goal of this section is to introduce a discrete adjoint approach allowing to compute the exact gradient of the discretized objective functional. Of course, the underlying ideas are similar to those in the continuous approach described in the previous section.
us
cr
3.1. Discrete optimization problem. As previously said, in most cases Cauchy problem (2) cannot be exactly solved. Instead, a numerical method has to be used to obtain an approximate solution. As a consequence, the objective function J must be also approximated by using this solution. In this paper, we choose a particular second order implicit finite-difference scheme to solve problem (2). Let us first introduce some notation. Let L be the number of time steps, h = T /L the time step, tn = nh, n = 0, . . . , L, the mesh-points and T the set of these points. Let us denote by xnp,h the approximation of xp (tn ) obtained with the discretized scheme below. In order to approximate the time derivative in (2), we propose the following formulas:
an
• First-order formula (two-points):
• Second-order formula (three-points):
M
x(t) − x(t − h) dx (t) = + O(h). dt h
dx 3x(t) − 4x(t − h) + x(t − 2h) (t) = + O(h2 ). dt 2h
(12)
(13)
te
d
Then, by evaluating (2) at time t1 (respectively, tn with n > 1) and then using (12) (respectively, (13)) to approximate the first time derivative, we get the following discretized problem: find vectors x1p,h , · · · , xL p,h such that
Ac ce p
1 1 (xp,h − x0p,h ) = F(t1 , x1p,h , p), h 1 3 n 1 n−2 xp,h − 2xn−1 + x = F(tn , xnp,h , p) n ∈ {2, . . . , L}, p,h h 2 2 p,h x0p,h = x0 .
(14) (15) (16)
Accordingly, we introduce the following discrete optimization problem: Discrete Optimization Problem (DOP). It consists in finding
min Jh (p),
(17)
p
0 L d with Jh (p) = Φh (xd p,h , p), where x p,h = (xp,h , . . . , xp,h ) is the solution of problem (14)-(16) corresponding to p, and Φh : R(L+1)N +M −→ R is an approximation of Φ, i.e., Φh (xd p,h , p) ≃ Φ(xp , p) being xp the solution of (2) and xd p,h the solution of (14)-(16); for example, sometimes Φ involves an integral between 0 and T while this integral is replaced by a quadrature formula in Φh .
3.2. Discrete adjoint method Many optimization algorithms are gradient-based iterative methods. In this case it is necessary to compute the derivatives of the objective function with respect to the optimization variables at each iteration. While finite difference formulas are the most popular numerical methods to approximate derivatives 5
Page 5 of 23
of functions, the computational cost of these methods may become large because many evaluations of the objective function can be required and, moreover, they produce approximations whose quality depends on the size of the discretization step which cannot be a priori determined in an optimal way. As an alternative, in this section we present an efficient technique to obtain the exact value of these derivatives: the discrete adjoint method. Differentiation of the discrete objective function with respect to p and the chain rule yield the following discrete version of (5): T d grad p Jh (p) = grad p Φh (xd Φh (xd p,h , p) + ( grad p x p,h ) grad x p,h , p), c h
(18)
ip t
M (L+1)N where grad p xd −→ xd which can p,h denotes the Jacobian matrix of the mapping p ∈ R p,h ∈ R be obtained from the Implicit Function Theorem applied to system (14)-(16). More precisely, it can be computed by solving the following equations for grad p x0p,h , · · · , grad p xL p,h :
= 0.
an
grad p x0p,h
us
cr
1 ( grad p x1p,h − grad p x0p,h ) = grad x F(t1 , x1p,h , p) grad p x1p,h h + grad p F(t1 , x1p,h , p), 1 3 1 grad p xnp,h − 2 grad p xn−1 grad p xn−2 = grad x F(tn , xnp,h , p) grad p xnp,h p,h + p,h h 2 2 + grad p F(tn , xnp,h , p), n = 2, · · · , L,
(19)
(20) (21)
Ac ce p
te
d
M
This first approach to obtain grad p xd p,h is known as the direct method and involves the resolution of (L + 1)M linear systems whose associated matrices are of size N × N . Therefore, the computational cost is very high when the number of optimization variables becomes large. Alternatively, an efficient approach trying to mitigate this cost is the discrete adjoint method. It can be introduced in a easy way by using the following recipe: Firstly, we build a Lagrangian function by considering the discrete state equation as a set of constraints, i.e. 1 ch ) :=Φh (c L(c xh , p, q xh , p) + (x1h − x0 ) − F(t1 , x1h , p) · q0h h 1 3 1 2 + xh − 2x1h + x0 − F(t2 , x2h , p) · q1h h 2 2 L X 1 3 1 n−2 n + xnh − 2xn−1 + x − F(t , x , p) · qn−1 + x0h − x0 · qL n h h, h h h h 2 2 n=3 ch , q ch ∈ R(L+1)N are the block column where x 0 xh . ch = x . . xL h
vectors ,
ch = q
q0h . . . qL h
.
Then, one can can prove that if p is a solution of the (DOP) then the following equalities hold: ch ) = 0, • gradqch L(c xh , p, q
(22)
ch ), • gradp Jh (p) = gradp L(c xh , p, q
(24)
ch ) = 0, • gradxch L(c xh , p, q
(23)
ch , q ch ∈ R(L+1)N . We notice that the first equation is exactly equivalent to the for some vectors x ch = xd discrete state equations (14)-(16) and hence x p,h . The second equation allows us to compute the 6
Page 6 of 23
ch . It is equivalent to the following linear system: discrete-adjoint state q ch = grad xch Φh (xd BT q p,h , p),
(25)
where B is a matrix with (L + 1) × (L + 1) blocks of size N × N . Among them, the only non-null blocks are 1 1 I − grad x F(t1 , x1p,h , p), B n,n−2 = I n ∈ {3, . . . , L}, h 2h 2 3 = − I, B n,n = I − grad x F(tn , xnp,h , p) n ∈ {2, . . . , L}, h 2h B L+1,L+1 = I,
B n,n−1
ip t
B 1,1 =
cr
where I denotes the N × N identity matrix. Because of the triangular block structure of matrix B, the linear system (25) can be recursively solved backward in time. More precisely, we solve the following L + 1 linear systems of size N starting from null value at final time: qL h = 0, T 3 I − grad x F(tL , xL qL−1 = grad xLh Φh (xd p,h , p), p,h , p) h 2h T 3 I − grad x F(tn , xnp,h , p) qn−1 = grad xnh Φh (xd p,h , p) + h 2h T 1 1 I − grad x F(t1 , xp,h , p) q0h = grad x1h Φh (xd p,h , p) + h
an
us
(26) (27)
2 n 1 n+1 q − q , h h 2h h
(28)
2 1 1 2 qh − q , h 2h h
(29)
M
where n = L − 1, . . . , 2. Notice that this discrete problem is a discrete version of the linear final-value problem (8)-(9). Finally, the third equality, i.e. (24), yields
d
grad p Jh (p) = grad p Φh (xd p,h , p) +
L X
n=1
T grad p F(tn , xnp,h , p) qn−1 , h
(30)
te
which can be considered as a discrete version of (11).
Ac ce p
4. Inverse problems and adjoint method in chemical reaction models In this section we apply the above described adjoint method to parameter identification problems in chemical reactions models. We consider models where the physico-chemical magnitudes do not depend on the particular position in the reactor, that is, so-called stirred-tank reactors (STR). In chemical reaction models, vector p may consist of reaction rate coefficients, initial conditions (p = x0 ), additional source terms, etc. For identification purposes, we assume that a set of experimental data is provided and try to identify the values of p that best explain the experimental data. The method consists in minimizing the deviation between the observed data and the numerical solution of the parametrized model. When this problem is solved by iterative methods, the computation of both the cost function and its gradient is required at each step. This gradient could be approximated by using a finite difference formula but, in general, it would require a large computational effort. Moreover, the quality of the approximate derivatives is very sensitive to the discretization step. On the contrary, with the adjoint method these derivatives are efficiently and exactly calculated although in the context of chemical systems it has not been used as extensively as the finite difference approach. The goal of this section is to apply the adjoint method to compute the derivatives of the cost function in parameter identification problems related to chemical reactions. For this purpose we first introduce mathematical models for stirred tank reactors. Next, the optimization problem to identify the values of the unknown parameters appearing in the mathematical models is stated. Finally, the discrete adjoint method is used to compute the gradient of the cost function of the discrete identification problem.
7
Page 7 of 23
4.1. Mathematical models for Stirred Tank Reactors Let us consider a set of reacting chemical species S: S = {E1 , . . . , ES }, in a stirred tank reactor (STR) in presence of Nc catalysts the concentrations of which are known along the process. These species are involved in a set of R chemical reactions 1 ≤ r ≤ R,
ip t
ν1r E1 + · · · + νSr ES → λr1 E1 + · · · + λrS
S Y
αr
yj j
j=1
Nc Y
j=1
(zj (t))
µrj
,
(31)
us
δr (t, y, θ, p) = kr
cr
where νir and λri are called the stoichiometric coefficients, 1 ≤ i ≤ S, 1 ≤ r ≤ R. We consider reaction kinetic models for which reaction rates depend on a set of unknown parameters p. For elementary reactions, the law of mass action (C.M. Guldberg and P. Waage (1864-67)) yields the following expression for the velocity of the r-th reaction:
an
being zj (t) the concentration of the j-th catalyst at time t and kr = b kr (θ) the rate “constant” of the r-th reaction which is a function of the reactor temperature through the Arrhenius law Ear r b , kr (θ) = B exp − Rθ
Ac ce p
te
d
M
where B r is the pre-exponential factor, Ear is the activation energy (J/mol) and αrj , j = 1, . . . , S are coefficients to be determined (all of them are associated to the r-th reaction), and R is the universal gas constant (J/(molK)). Other kinetics different from the mass action law can be also considered, for instance, the Hill or the Michaelis-Menten kinetics (see [18]). In any case we denote by p ∈ RM the (column) vector containing all the unknown parameters. In the present case these parameters can be some of the following ones: B r , Ear , αrj , µrj , j = 1, · · · , S, r = 1, · · · , R. We assume that the mixture inside the reactor is homogeneous due to stirring so the physico-chemical magnitudes do not depend on position. Reactors are not supposed to be either adiabatic or isotherm so temperature as well as species concentrations have to be computed by the models that are obtained from the energy and the mass conservation equations, respectively. Moreover, we consider the transient case. Thus, the general mathematical model of these chemical reaction mechanisms is a particular case of (2) being y(t) x(t) = , (32) θ(t) where y is the (column) vector function containing the time evolution of the chemical species concentrations in the reactor and θ represents the time evolution of the reactor temperature. Notice that N = S +1. The functional form of source term F in (2) depends on the reactor. We have considered two cases: batch stirred tank reactors and continuous stirred tank reactors . From now on the following notations will be used: X • the dot (·) denotes the scalar product of two vectors: a · b = ai b i , i
• A: stoichiometric matrix, • δ: column vector containing the reaction velocities (mol/(m3 s)), • g: heat transfer coefficient (W/K) between the reactor and its exterior, • θext the temperature of the exterior (K),
8
Page 8 of 23
• wi (θ): the i-th component of the column vector w(θ). It is given by wi (θ) = Mi (e∗i + ci (θ − θ∗ )) (J/mol), being – Mi (kg/mol) the molecular mass of the i-th species, – e∗i (J/kg) the internal energy of formation of the i-th species at temperature θ∗ (K), – ci (J/(kgK)) the specific heat of the i-th species, • ∆H(θ) = AT w(θ) (J/mol): the components of this vector are the molar heats of reaction at temperature θ,
ip t
• P : number of inlet streams in a CSTR,
• ui : volumetric flow rate of the i-th inlet stream in a CSTR (ui is the i-th component of column vector u) (m3 /s),
cr
• uout : output flow rate in a CSTR (m3 /s),
us
• Wij : concentration of the i-th species in the j-th inlet stream in a CSTR (mol/m3 ), • θp : temperature of the p-th inlet stream in a CSTR (K),
an
• V : volume of the mixture (m3 ).
M
4.1.1. Batch Stirred Tank Reactors (BSTR) In BSTRs, reactants are introduced into the reactor at the initial time only so there are no input/output flows along the process. We may assume that the volume of the BSTR is constant and then the mixture density is also constant along the time. The mathematical model consists of the following ordinary differential equations: 1. Mass conservation equations
g(t) ∆H(θ(t)) · δ(t, y(t), θ(t), p) − (θext (t) − θ(t)) . V
te
2. Energy conservation equation
(33)
d
dy (t) = Aδ(t, y(t), θ(t), p). dt
dθ 1 (t) = − ′ dt w (θ(t)) · y(t)
Ac ce p
Notice that this model is a particular case of the ODE system (2) where y x= θ
(34)
(35)
and the source term F is defined by
with and
ϕ(t, y, θ, p) := −
f (t, y, θ, p) ϕ(t, y, θ, p)
,
(36)
f (t, y, θ, p) := Aδ(t, y, θ, p),
(37)
F(t, x, p) =
1 w′ (θ) · y
g(t) ∆H(θ) · δ(t, y, θ, p) − (θext (t) − θ) . V
(38)
9
Page 9 of 23
4.1.2. Continuous Stirred Tank Reactors (CSTR) In CSTRs, reactants are continuously introduced and there is also an output flow along the process (otherwise, the reactor is called semi-batch STR). Therefore, in a CSTR, the volume may change due to possible non-equilibrium between input and output flows. The equation giving the volume along the time is the following: P X dV (t) = up (t) − uout (t) t ∈ (0, T ), (39) dt p=1 V (0) = V0 .
ip t
1. Mass conservation equations
2. Energy conservation equation
g(t) ∆H(θ(t)) · δ(t, y(t), θ(t), p) − (θext (t) − θ(t)) V (t) ! ! P S 1 X X p + Wip (t) [wi (θ(t)) − wi (θ (t))] up (t) . V (t) p=1 i=1
an
us
dθ 1 (t) = − ′ dt w (θ(t)) · y(t)
M
This model is also a particular case of (2) for the choices, h(t, y, θ, p) F(t, x, p) = , φ(t, y, θ, p) where h is defined by
d
h(t, y, θ, p) := Aδ(t, y, θ, p) +
(42)
(43)
g(t) ∆H(θ) · δ(t, y, θ, p) − (θext (t) − θ) V (t) ! ! P S X X p Wip (t) [wi (θ) − wi (θ (t))] up (t) .
1 w′ (θ) · y
Ac ce p
φ(t, y, θ, p) := −
P X 1 1 W (t)u(t) − y up (t). V (t) V (t) p=1
(41)
te
and
(40)
cr
P X dy 1 1 (t) = Aδ(t, y(t), θ(t), p) + W (t)u(t) − y(t) up (t). dt V (t) V (t) p=1
+
1 V (t)
p=1
(44)
i=1
4.2. Identification problem Let us assume that a set E of experimental setups for species concentrations and temperature is provided. More precisely, for each experimental setup e ∈ E, measured values for variables are given at a e }. These empirical data will be denoted with a tilde. More precisely, set of time instants Te = {tee0 , · · · , tg Ke e,k yg g θe,k
e,k = xg
!
(45)
g e,k and θ e,k denote, respectively, the (column) vector of measured concentrations of species and where yg the measured temperature at time teek ∈ Te , under the experimental setup e .
4.2.1. Continuous identification problem Given one of the chemical reaction models of the previous section and a set of R mathematical expressions for the reaction velocities (as, for instance, (31)), the goal is to find the values for the unknown parameters such that the model best explain the above experimental data. The method proposed to compute these optimal values consists in minimizing the deviation between the observed data and the 10
Page 10 of 23
solution of the parametrized model. More precisely, the optimal values of the parameters are defined as the solution of the following least-square problem: Continuous Identification Problem (CIP). min J c (p) with J c (p) := p
X
Υe (xep ) + Ψ(p),
(46)
e∈E
X S+1 X
teek ∈Te i=1
where
g 2 wie,k (xi (teek ) − xe,k i ) ,
Ψ(p) =
M X
γj (pj − βj )2 ,
j=1
(47)
cr
Υe (x) =
ip t
e,0 , and Υe : U −→ R and where xep is the solution of the parametrized model (2) with initial value xg M Ψ : R −→ R are defined by
us
• wie,k are weights reflecting the importance of the errors corresponding to the different species, experimental setups and time instants, g e,k , • xe,k is the i-th component of vector xg i
an
• xi (teek ) is the i-th component of x at the observation instant teek ∈ Te , • γj ≥ 0 j = 1, . . . , M , are regularization parameters, and
• βj is an a priori estimate of pj .
d
M
Inverse problems are often ill-posed and then regularization methods are used. In particular, the so-called Tikhonov regularization (see [27]) consists in adding to the cost function the term Ψ(p). Finally, we notice that the above optimization problem is a particular case of the one stated in Section 2 for the choice X Φ({xe : e ∈ E}, p) := Υe (xe ) + Ψ(p).
te
e∈E
Ac ce p
4.2.2. Discrete identification problem As we already said, in practice the objective function J c is approximated by using a numerical solution of problem (2). For the sake of simplicity, we shall assume that the set of observation times is a subset of T (see Section 3.1), that is, Te ⊂ T for e ∈ E. In this case, the following approximation of J c is considered X e ) + Ψ(p), Jhc (p) = Υeh (xd (48) p,h e∈E
e,0 e,L e g e,0 , and the mapping where xd = x , . . . , x p,h p,h p,h is the solution of problem (14)-(16) with initial value x Υeh : R(L+1)(S+1) −→ R is defined by
Υeh (c xh ) =
X S+1 X
ne (k)
wie,k (xh,i
teek ∈Te i=1
g 2 − xe,k i ) ,
(49)
where ne (k) is such that tne (k) = teek . Then, the optimal values of the parameters are defined as the solution of the following least-square discrete problem: Discrete Identification Problem (DIP). min Jhc (p)
(50)
p
11
Page 11 of 23
where Jhc (p) =
X e∈E
e,ne (k)
being xp,h,i
e ) + Ψ(p) = Υeh (xd p,h
X X S+1 X
e,ne (k)
wie,k xp,h,i
e∈E tee ∈Te i=1 k
e,ne (k)
the i-th component of xp,h
M
g 2 X − xe,k + γj (pj − βj )2 , i
(51)
j=1
.
ip t
4.3. Discrete adjoint method In this section, we apply the adjoint method introduced in Section 3 to obtain the exact value of the derivatives of the discrete cost function Jhc with respect to p. Let us notice that this technique applied to optimization problem (50)–(51) allows us to compute these derivatives from the adjoint state. The latter is defined as the solution of the backward discrete problem (26)–(29) which, in this case, becomes T 3 e ) I − grad x F(tL , xe,L , p) qe,L−1 = grad xLh Υeh (xd p,h h p,h 2h T 3 e )+ I − grad x F(tn , xe,n qe,n−1 = grad xnh Υeh (xd p,h , p) h p,h 2h T 1 e d e I − grad x F(t1 , xe,1 qe,0 p,h ) h = grad x1h Υh (xp,h ) + h =
(
ne (k)
2wie,k (xh,i 0
an
( grad xnh Υeh (c xh ))i
(52) (53)
2 e,n 1 e,n+1 qh − q , h 2h h
(54)
2 e,1 1 e,2 q − q , h h 2h h
(55)
g ee − xe,k i ) if tn = tne (k) = tk ∈ Te , otherwise,
M
with
us
cr
qe,L h = 0,
for i = 1, · · · , S + 1, n = 1, · · · , L, e ∈ E. Then, from 30 we get L XX
d
te
grad p Jhc (p) = grad p Ψ(p) + with
e∈E
n=1
T e,n−1 grad p F(tn , xe,n qh , p,h , p)
Ac ce p
( grad p Ψ(p))j = 2γj (pj − βj ), j = 1, · · · , M.
(56)
(57)
Moreover, the gradients appearing in (52)–(55) and (56) are obtained analytically as follows: • Batch stirred-tank reactor
∂δ A (t, x, p) ∂F ∂xi (t, x, p) = ∂ϕ , ∂xi (t, x, p) ∂xi
being
∂δ A (t, x, p) ∂pj ∂F , (t, x, p) = ∂ϕ ∂pj (t, x, p) ∂pj
(58)
∂ϕ M k ck g(t) (t, y, θ, p) = ∆H(θ) · δ(t, y, θ, p) − (θ (t) − θ) ext ∂yk (w′ (θ) · y)2 V ∂δ 1 − ′ ∆H(θ) · (t, y, θ, p), (59) w (θ) · y ∂yk ∂ϕ 1 ∂δ g(t) (t, y, θ, p) = − ′ AT w′ (θ) · δ(t, y, θ, p) + ∆H(θ) · (t, y, θ, p) + , (60) ∂θ w (θ) · y ∂θ V ∂ϕ 1 ∂δ (t, y, θ, p) = − ′ ∆H(θ) · (t, y, θ, p). (61) ∂pj w (θ) · y ∂pj
12
Page 12 of 23
• Continuous stirred-tank reactor
∂F (t, x, p) = ∂yk
∂δ A (t, x, p) ∂F (t, x, p) = ∂θ , ∂φ ∂θ (t, x, p), ∂θ
P ∂δ 1 X A (t, x, p) − up (t)ek ∂yk V (t) p=1 ∂φ (t, x, p), ∂yk ∂δ A ∂pj (t, x, p) ∂F (t, x, p) = ∂φ ∂pj (t, x, p) ∂pj
being
(62)
(63)
, ,
g(t) ∆H(θ) · δ(t, y, θ, p) − (θext (t) − θ(t)) V (t) ! ! P S X X p Wip (t) [wi (θ) − wi (θ (t))] up (t)
us
1 V (t) p=1
cr
∂φ M k ck (t, y, θ, p) = ∂yk (w′ (θ) · y)2 +
ip t
i=1
M
an
∂δ 1 ∆H(θ) · (t, y, θ, p), − ′ w (θ) · y ∂yk ∂φ 1 ∂δ (t, y, θ, p) = − ′ AT w′ (θ) · δ(t, y, θ, p) + ∆H(θ) · (t, y, θ, p) ∂θ w (θ) · y ∂θ g(t) 1 + w′ (θ) · W (t)u(t) + , V (t) V (t) ∂φ 1 ∂δ (t, y, θ, p) = − ′ ∆H(θ) · (t, y, θ, p). ∂pj w (θ) · y ∂pj
(64)
(65) (66)
Ac ce p
te
d
In the above equations ek is the k-th vector of the canonical basis of RS . We also notice that the derivatives of the reaction rates with respect to parameters and state variables can be obtained analytically. Then, the proposed algorithm to obtain grad p Jhc (p) consists of the following steps (index n and e denote, respectively, the n-th time step and the experimental setup e ∈ E).
1. Compute the solution xe,n p,h for n = 0, . . . , L of the discrete problem (14)-(16) with parameters g e,0 p and initial value x . 2. Obtain grad x F(tn , xe,n p,h , p) for n = 1, . . . , L by analytical calculus. e e d 3. Obtain grad xn Υh (x ) for n = 1, . . . , L by analytical calculus. p,h
h
4. Compute the adjoint state qe,n for n = 0, . . . , L by solving the linear system (52)–(55). h e,n 5. Obtain grad p F(tn , xp,h , p) for n = 1, . . . , L by analytical calculus. 6. Compute de :=
L X
n=1
T e,n−1 grad p F(tn , xe,n qh . p,h , p)
7. Repeat stages 1 to 6 for each experimental setup e ∈ E. 8. Obtain b := grad p Ψ(p) by analytical calculus. 9. Compute X grad p Jhc (p) = b + de . e∈E
13
Page 13 of 23
4.4. Some details about practical implementation In practice, we consider parameter identification problems in stirred-tank reactors for which the unknown parameters p are reaction rate coefficients. More precisely, the following expression for reaction rates which extends the law of mass action is considered !αrj Y Mr S+N S r X Xc E Grjk yk + Grjk zk (t) + brj , δr (t, y, θ, p) = B r exp − a Rθ j=1 k=1
(67)
k=S+1
M
an
us
cr
ip t
being zk (t) the concentration of the k-th catalyst at time t. This expression for kinetics was provided by engineers of Repsol (an integrated oil and gas Company) who have observed that this generalized form can represent all the cases they are considering. Notice that the above reaction rates depend on coefficients, B r , Ear , Grjk , brj and αrj . In practice, some of them are known, while some others are not and will be the variables in the optimization process. Different methods can be used to solve these inverse problems, such as integral and incremental methods (see, for instance, [6]). In the incremental approach the global identification process is decoupled so that each reaction is dealt with individually (see, for instance, [24]). The integral method is based on a direct comparison of the observed data and the numerical solution of the parameterized model. We notice that in Section 4.2 an integral method has been described. We have used a combination of the incremental and the integral methods for solving the parameter identification problems considered in this paper. Regarding the latter, we have used the method proposed in Section 4.2. The optimization problem is solved by using a heuristic method based on the global optimization technique called variable neighbourhood search (VNS) (see, for instance, [20]). At each iteration of this global optimization algorithm, a gradient-based local minimizer is used. The derivatives of the discrete cost function with respect to the parameters are computed exactly by using the adjoint method. We notice that the heuristic method uses initial values for the parameters which have been obtained by using an incremental method.
d
5. Illustrative Examples
Ac ce p
te
The goal of this section is to assess the performance of the adjoint method by making some numerical tests. More precisely, we solve a parameter identification problem as the one stated in Section 4.2 for the reaction system considered in [2] and [1] in two reactor configurations. The first one corresponds to a non-isothermal BSTR and the second one to a non-isothermal CSTR. In both cases, we build synthetic “experimental” data by solving the model of the reactor with small time step. Elementary reactions are considered for which the mass action law is assumed. However, we notice that in the computer program the more general expressions for reaction rates given by (67) are considered. Firstly, we introduce the reaction system and the numerical parameters given in reference [1]. Then, we solve the model and show the obtained numerical results. Next, we calculate the gradient of the discrete fitting function with respect to all parameters appearing in (67), by using both the adjoint method and a finite-difference approximation. We compare the obtained results with both methods, showing the error between the exact and approximate gradient and the computational times required for such calculations. Finally, we introduce and solve the identification problem and show the obtained numerical results. 5.1. Reaction system and conditions We consider the following non-isothermal homogeneous reaction system with S = 4 reacting species and R = 2 elementary and independent reactions (see [2] and [1]) R1 :
δ
1 2E1 −→ E2 ,
δ
2 R2 : E2 + E3 −→ 2E3 + 2E4 .
In addition to the first reaction where E1 is converted to the desired product E2 , there is a consecutive reaction auto-catalyzed by E3 in which E2 yields the waste products E3 and E4 . The stoichiometric
14
Page 14 of 23
matrix and the reaction rates of this auto-catalyzed system are as follows −2 0 1 −1 , δ1 (y1 , θ) = k1 y12 , δ2 (y2 , y3 , θ) = k2 y2 y3 , A= 0 1 0 2
ip t
br (θ) obeys the Arrhenius law: kbr (θ) = B r exp (−E r /(Rθ)). The numerical values of the where kr = k a parameters are taken from [1]. More precisely, the values of the pre-exponential factors, activation energies and molecular weights are given in Table 1. Moreover, we notice that the above reaction rates Value
Unit
{Mi }4i=1
{50, 100, 75, 12.5}
gmol−1
B1, B2
0.028, 0.278
Ea1 , Ea2
0.0125, 0.0166
cr
Parameter
M −1 h−1
us
Jmol−1
Table 1: Numerical values for the reaction system.
Mr
Br
Ear
R1
1
0.028
0.0125
2
br
0.278
(1 0 0 0) 0 1 0 0 0 0 1 0
0.0166
αr
0
0 0
2
1 1
te
d
R2
Gr
M
Reaction
an
are particular cases of the general formulation given in (67). More precisely, the values of the kinetic parameters appearing in (67) for the above reactions are given in Table 2. Two different reactors are
Table 2: Values of kinetic parameters appearing in (67) for the reaction system.
Ac ce p
studied, namely a batch stirred tank reactor and a continuous stirred tank reactor. In both cases, the final time is T = 1 h and the initial conditions are the following: V (0) = 1.5 l,
y(0) = (19, 0, 0, 0.505, 0)T mol/l,
θ(0) = 393 K.
In the CSTR, the output flow rate is 0.22 l/h and reactant E1 is added continuously at 373 K with constant flow rate u = 1.2 l/h and composition W = (16.15 0 0 0)T mol/l. The density of the reaction mixture is assumed to be constant along the course of the reaction. We solve the mathematical model of these chemical reaction mechanisms by using (14)-(16) with the time step h = 1/1000. The obtained results are depicted in Figures 1 and 2. More precisely, in Figure 1 we represent the time history of the numerical volume and temperature. In Figure 2 we show the time profiles of the numerical concentrations in both the BSTR and the CSTR. 5.2. Adjoint method versus finite difference method In the above chemical reaction mechanisms, we analyze the adjoint method and the following finitedifference approximation: ∂Jhc J c (p + εei ) − Jhc (p) (p) ≈ h ∂pi ε
i ∈ {1, . . . , M },
(68)
being ei the i-th vector of the canonical basis of RM and ε a discretization parameter. The computational cost of the latter might become large because it requires M +1 evaluations of the cost function. Moreover, 15
Page 15 of 23
2.5
395
BSTR CSTR
BSTR CSTR 390
Volume (l)
Temperature (K)
2
1.5
385
380
375
1
0
0.2
0.4
0.6
0.8
365
1
0
0.2
Time (hours)
0.4
ip t
370
0.6
0.8
1
Time (hours)
20
E1
18
us
20
cr
Figure 1: Volume (left) and temperature (right) for the BSTR and the CSTR, computed with the numerical method (14)-(16) for h = 1/1000.
18
E
16
E
14
E4
16
3
Concentration (mol/l)
12 10 8 6
12
E
2
E
3
E4
10
8 6 4
2 0
0.2
0.4
0.6
0.8
1
Time (hours)
M
4
0
14
E1
an
Concentration (mol/l)
2
2 0
0
0.2
0.4
0.6
0.8
1
Time (hours)
te
d
Figure 2: Concentrations for the BSTR (left) and the CSTR (right), computed with the numerical method (14)-(16) for h = 1/1000.
Ac ce p
it produces an approximation whose quality depends on ε and it is not easy to determine a priori suitable values. Specifically, in the above chemical reaction mechanisms, we apply the adjoint method and the finite difference method (68) to obtain the exact and approximate gradient with respect to all parameters of discrete fitting function (48) where δ is given by (67), being M1 = 1 and M2 = 2, and p contains all kinetic parameters of all reactions. Thus, in this case the number of derivatives is M = 22. For each reactor, one synthetic experiment is built. More precisely, they correspond to the values of temperature and concentrations shown in Figures 1 and 2 at 11 equally spaced time instants (K = 10), until reaching end time T = 1 h. For simplicity, all weights wie,k in (48) are chosen to be equal to unity and all regularization parameters γj are chosen to be zero. We calculate, for different time step h and for the values of the kinetic parameters given in Table 2, the exact and approximate derivatives of the discrete fitting function with respect to all kinetic parameters of all reactions, for both the BSTR and the CSTR. The exact values are obtained by using the adjoint method introduced in this paper and the approximate ones are calculated by (68) for different values of discretization parameter ε. In Figure 3 we represent, for the BSTR and the CSTR, the Euclidean norm of the error between the exact and the approximate gradient versus 1/ε, for the fixed time step h = 1/40. As expected, these results show that formula (68) possesses first-order accuracy in terms of ε. Notice that we can observe an increasing error as the discretization parameter ε decreases below a threshold. This is due to rounding errors due to the finite precision of the computer. Notice that the ODE model is converted to a system of nonlinear algebraic equations by performing time discretization. In Tables 3 and 4 we compare for the BSTR and the CSTR, respectively, and for different time steps h, the computing times for the calculation of the derivatives of fitting function Jhc with respect to all kinetic parameters of all reactions, with the adjoint method and with the finite difference scheme (68). The CPU time for the 16
Page 16 of 23
l2 error curve BSTR CSTR 1st order curve
5
Relative error (%)
10
0
10
ip t
−5
10
2
4
10
6
10
8
10
10
10
10
12
10
cr
1/ε
us
Figure 3: Euclidean norm of the difference between the gradients computed by the adjoint method and the finite difference method, in log-log scale versus 1/ε, in the BSTR and the CSTR (h = 1/40).
finite difference code is around fifteen times greater than for the adjoint code, in both the BSTR and the CSTR. Adjoint
Finite difference with ε = 10−8
40
0.138
2.182
80
0.256
4.177
M
an
Number of times steps (L)
400
20.448
2.618
38.259
d
800
1.268
te
Table 3: CPU time (in seconds) for the calculation of the gradient of Jhc with the adjoint method and the finite difference formula (68), for the BSTR.
Adjoint
Finite difference with ε = 10−8
40
0.188
3.349
80
0.364
6.121
400
1.793
29.888
800
3.651
57.757
Ac ce p
Number of times steps (L)
Table 4: CPU time (in seconds) for the calculation of the gradient of Jhc with the adjoint method and the finite difference formula (68), for the CSTR.
5.3. Parameter identification problem We solve a parameter identification problem as the one stated in Section 4.2 for the above chemical reaction mechanisms. The experimental data, the expressions for reaction rates, the weights wie,k and the regularization parameters γj , are the same as in the previous section. In reaction rates, we assume that the coefficients Grjk , brj and αrj are known for the two reactions and their values are the ones given in Table (2). Therefore, the parameters to be identified are the pre-exponential factors and the activation 17
Page 17 of 23
Parameter
Range
Unit
B1, B2
[0, 10]
M −1 h−1
Ea1 , Ea2
[0, 5]
Jmol−1
ip t
energies. We solve this identification problem for the BSTR and the CSTR by using the procedure stated in Section 4.4, and for h = 1/127. Random values of the unknown parameters are used which are chosen in the intervals detailed in Table 5. In Table 6 and Table 7 we show the obtained numerical solution for the BSTR and the CSTR, respectively. For these numerical values, the value of the discrete fitting
cr
Table 5: Ranges of unknown parameters.
Numerical Value
Unit
B1, B2
0.0279994, 0.2783981
M −1 h−1
Ea1 , Ea2
0.0002604, 4.5143297
Jmol−1
an
us
Parameter
Table 6: Numerical values obtained by solving the parameter identification problem for the BSTR.
Numerical Value
Unit
B1, B2
0.0277102, 0.2824067
M −1 h−1
Ea1 , Ea2
0.0005463, 4.9330343
Jmol−1
d
M
Parameter
te
Table 7: Numerical values obtained by solving the parameter identification problem for the CSTR.
392 390
Temperature (K)
388 386 384
395
Numerical Experimental
Numerical Experimental 390
Temperature (K)
394
Ac ce p
function is equal to 9.54 · 10−6 for the BSTR and 7.76 · 10−5 for the CSTR. Moreover, we also compare graphically the numerical temperature and concentrations corresponding to these values of the unknown parameters, with the experimental ones. More precisely, in Figure 4 we show the time profiles of the numerical and experimental temperature, in both the BSTR and the CSTR. The time histories of the numerical and experimental concentrations are depicted in Figure 5.
382 380 378
385
380
375
370
376 374
0
0.2
0.4
0.6
0.8
365
1
Time (hours)
0
0.2
0.4
0.6
0.8
1
Time (hours)
Figure 4: Numerical and experimental temperature for the BSTR (left) and the CSTR (right), computed with the numerical values given in Table 6 and in Table 7, respectively.
18
Page 18 of 23
20
20
18
18 16
Num., E1
12
Exp., E1 Num., E2
10
Exp., E
2
8
Num., E
6
Exp., E
3
3
Num., E
4
4
14
Num., E
12
Exp., E
1
1
Num., E
2
10
Exp., E
2
8
Num., E3
6
Exp., E3
4
Num., E4
Exp., E
Exp., E4
4
2 0
2 0
0.2
0.4
0.6
0.8
0
1
0
0.2
Time (hours)
0.4
ip t
14
Concentration (mol/l)
Concentration (mol/l)
16
0.6
0.8
1
Time (hours)
cr
Figure 5: Numerical and experimental concentrations for the BSTR (left) and the CSTR (right), computed with the numerical values given in Table 6 and in Table 7, respectively.
an
us
Notice that the values obtained of the pre-exponential factors practically coincide with the ones considered to build the experimental data given in Table 1. Moreover, although this accuracy is not observed for the activation energies, the obtained solution reproduces very well the experimental data (see Figures 4 and 5). Therefore, we observe numerically that this identification problem has not a unique solution. 6. Conclusions
Ac ce p
te
d
M
We have presented an adjoint-based algorithm to compute exactly the gradient of the cost function in optimization problems involving systems of ordinary differential equations. A quite general optimization problem has been considered and a second-order finite-difference scheme for solving the system of ODEs suitable for stiff problems, has been used. We have also considered a discrete version of the problem where the cost function depends on the numerical solution of the parameterized model. The proposed algorithm calculates exactly and efficiently the gradient of the discrete cost function with respect to the optimization variables. Then the adjoint method is applied to the problem of estimating a set of unknown parameters appearing in chemical reaction models of stirred-tank reactors. This is done by minimizing the deviation between the observed data and the numerical solution of the parametrized model. Numerical tests have been presented to assess the efficiency of the adjoint method. In particular, we have considered a parameter identification problem for a chemical reaction system in two reactor configurations. The first one corresponds to a non-isothermal BSTR and the second one to a non-isothermal CSTR. Synthetic data sets for species concentrations and temperature have been built. At present, the algorithm is being successfully used for stirred-tank reactors with real data by the Spanish company Repsol in its Technology Center. Unfortunately, confidentiality issues prevent the experiments from being shown in this document. Acknowledgements
This research was developed as an activity in the Joint Research Unit Repsol-ITMATI (code file: IN853A 2014/03) which is funded by the Galician Agency for Innovation and the Ministry of Economy and Competitiveness in the framework of the Spanish Strategy for Innovation in Galicia. The first author has been also financed by FEDER and Spanish Ministry of Science and Innovation, MTM2008-02483, CGL2011-28499-C03-01, FPU13/00279 and MTM2013-43745-R, and by FEDER and Xunta de Galicia, and the second author has been also financed by FEDER and Spanish Ministry of Science and Innovation under research project ENE2013-47867-C2-1-R and by FEDER and Xunta de Galicia under research project GRC2013/014. We would also like to thank the reviewers for their remarks which have allowed us to improve the paper.
19
Page 19 of 23
References [1] M. Amrhein, B. Srinivasan, D. Bonvin, Target factor analysis of reaction data: use of data pretreatment and reaction-invariant relationships, Chem. Eng. Sci. 54 (1999) 579–591. [2] M. Amrhein, B. Srinivasan, D. Bonvin, M.M. Schumacher, On the rank deficiency and rank augmentation of the spectral measurement matrix, Chemometr. Intel. Laboratory Systems 33 (1996) 17–33.
ip t
[3] A. Bardow, W. Marquardt, Incremental and simultaneous identification of reaction kinetics: methods and comparison, Chem. Eng. Sci 59 (2004) 2673–2684.
cr
[4] A. Bermúdez, N. Esteban, J. Ferrín, J. Rodríguez-Calo, M. Sillero-Denamiel, Identification problem in plug-flow chemical reactors using the adjoint method, Comput. Chem. Eng., http://dx.doi.org/10.1016/j.compchemeng.2016.11.029 (2016). [5] N. Bhatt, M. Amrhein, D. Bonvin, Incremental identification of reaction and mass-transfer kinetics using the concept of extents, Ind. Eng. Chem. Res. 50 (2011) 12960–12974.
us
[6] N. Bhatt, N. Kerimoglu, M. Amrhein, W. Marquardt, D. Bonvin, Incremental identification of reaction systems-A comparison between rate-based and extent-based approaches, Chem. Eng. Sci. 83 (2012) 24–38.
an
[7] M. Brendel, D. Bonvin, W. Marquardt, Incremental identification of kinetic models for homogeneous reaction systems, Chem. Eng. Sci 61 (2006) 5404–5420.
M
[8] E. Casas, Pontryagin’s principle for state-constrained boundary control problems of semilinear parabolic equations, SIAM J. Control Optim. 35 (1997) 1297–1327. [9] G. Chavent, Identification of function parameters in partial differential equations, in identification of parameter distributed systems, eds Goodson, R.E. & Polis, New-York, ASME (1974).
d
[10] D.N. Daescu, A. Sandu, G.R. Carmichael, Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: II-numerical validation and applications, Atmos. Environ. 37 (2003) 5097–5114.
te
[11] A. Jameson, Aerodynamic design via control theory, J. Sci. Comput. 3 (1988) 233–260.
Ac ce p
[12] J.L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer Verlag, Berlin (1971). [13] J.L. Lions, Nonhomogeneous boundary value problems and applications, Springer Verlag, Berlin (1972). [14] K. Maute, M. Nikbay, C. Farhat, Sensitivity analysis and design optimization of three-dimensional non-linear aeroelastic systems by the adjoint model, Int. J. Numer. Meth. Engng. 56 (2003) 911–933. [15] R. Meric, Coupled optimization in steady-state thermoelasticity, J. Thermal Stresses 8 (1985) 333– 347. [16] R. Meric, Material and load optimization of thermoelastic solids. Part I: sensitivity analysis, J. Thermal Stresses 9 (1986) 359–372. [17] R. Meric, Material and load optimization of thermoelastic solids. Part II: numerical results, J. Thermal Stresses 9 (1986) 373–388. [18] L. Michaelis, M.L. Menten, Die kinetik der invertinwirkung, Biochemische Zeitschrift 49 (1913) 333–369. [19] C. Michalik, M. Brendel, W. Marquardt, Incremental identification of fluid multi-phase reaction systems, AIChE J. 55 (2009) 1009–1022. [20] N. Mladenović, P. Hansen, Variable neighborhood search, Comput. Oper. Res. 24 (1997) 1097–1100. 20
Page 20 of 23
[21] O. Pironneau, On Optimum Design in Fluid Mechanics, J. Fluid Mech. 64 (1974) 97–110. [22] R.E. Plessix, Y.H. de Roech, G. Chavent, Waveform inversion of reflection seismic data for kinematic parameters by local optimization, SIAM J. Sci. Comp. 20 (1999) 1033–1052. [23] A.A.I. Quiroga, D. Fernández, G.A. Torres, C.V. Turner, Adjoint method for a tumor invasion PDEconstrained optimization problem in 2D using adaptive finite element method, Appl. Math. Comput. 270 (2015) 358–368.
ip t
[24] D. Rodrigues, S. Srinivasan, J. Billeter, D. Bonvin, Variant and invariant states for chemical reaction systems, Comput. Chem. Eng. 73 (2015) 23–33. [25] A. Sandu, D.N. Daescu, G.R. Carmichael, Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: Part I-theory and software tools, Atmos. Environ. 37 (2003) 5083–5096.
cr
[26] R. Vautard, M. Beekmann, L. Menut, Applications of adjoint modelling in atmospheric chemistry: sensitivity and inverse modelling, Environ. Modell. Softw. 15 (2000) 703–709.
us
[27] C.R. Vogel, Computational methods for Inverse Problems, SIAM (2002).
Ac ce p
te
d
M
an
[28] G.Z. Yang, N. Zabaras, The Adjoint Method for an Inverse Design Problem in the Directional Solidification of Binary Alloys, J. Comput. Phys. 140 (1998) 432–452.
21
Page 21 of 23
Graphical Abstract (for review)
Experimental data
20
394
E1
392
E2
16
E
14
E4
390
3
388
θ
12
386
10
384
δ (p) 1 2E → E 1 2
8
R1:
6
δ2(p) R2: E +E → 2E +2E 2
4
3
382
3
380
4
378
2
376
0
374
0
0.2
Temperature (K)
Concentration (mol/l)
18
0.4
0.6
0.8
1
cr
ip
t
Time (hours)
l2 error curve
10
us
10
Numerical results 394
E1
392
ep t
E2
16
E3
390
14
E4
388
θ
12
386
10
384
8
382
6
380
4
378
2
376
0
374
0
0.2
0.4
0.6
Time (hours)
0.8
1
Temperature (K)
Ac c
Concentration (mol/l)
18
ed
20
Relative error (%)
M
an
BSTR CSTR 1st order curve
5
10
0
10
CPU time (22 parameters): Finite Difference=15⋅Adjoint −5
10
2
10
4
6
8
10
12
10 10 10 10 10 1/ε; ε: discretization parameter of finite difference method
Page 22 of 23
*Research Highlights
The discrete adjoint method for ODE-constrained optimization problems is presented
This method is efficient to obtain the exact derivatives of a discrete functional
The adjoint method is applied to inverse problems of chemical reaction systems
Chemical reaction systems in stirred tank reactors are considered
Numerical results showing the good performance of the methodology are presented
Ac
ce pt
ed
M
an
us
cr
ip t
Page 23 of 23