Optimal input design for model discrimination using Pontryagin’s maximum principle: Application to kinetic model structures

Optimal input design for model discrimination using Pontryagin’s maximum principle: Application to kinetic model structures

Automatica 50 (2014) 1535–1538 Contents lists available at ScienceDirect Automatica journal homepage: www.elsevier.com/locate/automatica Technical ...

370KB Sizes 0 Downloads 240 Views

Automatica 50 (2014) 1535–1538

Contents lists available at ScienceDirect

Automatica journal homepage: www.elsevier.com/locate/automatica

Technical communique

Optimal input design for model discrimination using Pontryagin’s maximum principle: Application to kinetic model structures✩ Karel J. Keesman a,1 , Eric Walter b a

Systems and Control Group, Wageningen University, P.O. Box 17, 6700 AA Wageningen, The Netherlands

b

Laboratoire des Signaux et Systèmes, CNRS, SUPELEC, Univ Paris-Sud, 91192 Gif-sur-Yvette, France

article

info

Article history: Received 20 November 2013 Accepted 20 March 2014 Available online 13 April 2014 Keywords: Optimal input design Model discrimination Non-linear systems Optimal control Fed-batch reactor

abstract The paper presents a methodology for an optimal input design for model discrimination. To allow analytical solutions, the method, using Pontryagin’s maximum principle, is developed for non-linear single-state systems that are affine in their joint input. The method is demonstrated on a fed-batch reactor case study with first-order and Monod kinetics. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction Before attempting to estimate the parameters of a given model, one may have to choose the proper model structure among a set of candidates, which may correspond, for instance, to competing scientific hypotheses about the description of some phenomenon (see, e.g., some of the practical examples in Keesman (2011)). The procedure of choosing between model structures is called model discrimination. For it to be possible on the basis of (future) experimental data, these models must be distinguishable. The distinguishability property can be tested by techniques similar to those used to test models for identifiability (although identifiability of two structures is neither necessary nor sufficient for their distinguishability); see Walter, Lecourtier, and Happel (1984). In practice, of course, the ability to discriminate distinguishable model structures depends on the informational content of the data collected. Optimal experiment design for model discrimination has received some attention in the statistical literature (see, e.g., Atkinson & Cox, 1974; Box & Hill, 1967; Dette & Titoff,

✩ The material in this paper was presented at the 16th IFAC Symposium on System Identification (SYSID 2012), July 11–13, 2012, Brussels, Belgium. This paper was recommended for publication in revised form by Associate Editor Brett Ninness under the direction of Editor André L. Tits. E-mail addresses: [email protected] (K.J. Keesman), [email protected] (E. Walter). 1 Tel.: +31 317 483780; fax: +31 317 484957.

http://dx.doi.org/10.1016/j.automatica.2014.03.022 0005-1098/© 2014 Elsevier Ltd. All rights reserved.

2009), although less than optimal experiment design for parameter estimation did. Applications of experiment design for discrimination can be found in domains as diverse as chemistry Schwaab et al. (2006), machine learning Rajamoney (1993), system biology Kreutz and Timmer (2009), Skanda and Lebiedz (2010) and psychology Myung and Pitt (2009). For optimal design to be possible, some performance index is needed. T-optimal design, as in Atkinson and Fedorov (1975), aims at maximizing some measure of the lack of fit between the output of some model assumed to be true and that of an alternative structure. Ponce de Leon and Atkinson (1991) extend T-optimality to the case where prior probabilities are specified for each of the models to be true, as well as prior distributions for the parameters of these models. KL-optimal design, see Lopez-Fidalgo, Tommasi, and Trandafir (2007) and Skanda and Lebiedz (2010), based on the Kullback–Leibler divergence, can be seen as a generalization of T-optimal design of interest under non-normal assumptions. Ds -optimal design, as in Studden (1980), applies to the discrimination between two structures with a common part, by attempting to maximize some measure of the precision with which structurespecific parameters are estimated. For an extension to more than two rival structures, see Atkinson and Cox (1974). Entropy-based design, as in Box and Hill (1967) and Reilly (1970), assumes that each of the competing structures is given a prior probability and updates the probabilities of the structures after each measurement. The new experiment is then designed to maximize some measure of the decrease in Shannon’s entropy to be expected from the next measurement (entropy is minimal when a single model structure has gained all the probability mass).

1536

K.J. Keesman, E. Walter / Automatica 50 (2014) 1535–1538

Although the proposed method, based on Pontryagin’s maximum principle (PMP), may seem to be rather limited at first sight, we think it is nevertheless valuable to study closed form analytical solutions, as these solutions provide direct insight in terms of possible singularities, steady states, parameter sensitivities, etc. We demonstrate the method in a case study using kinetic model structures. Typically, in bio-reactor modeling, a specific kinetic model structure will appear in different equations, as in e.g. oxygen, substrate and biomass balances. In each of the balances the kinetic model will only be corrected by a so-called yield coefficient, expressing the stoichiometric relationship between the components (see e.g. Bastin & Dochain, 1990). Also, in basic physical modeling with its mass balances in the PDE form, we distinguish storage and transport terms from the reaction term. Under the assumption that the reaction term does not depend on the other terms, we can investigate all kinds of kinetic models separately. Thus, instead of discriminating potential multi-state models for the whole reactor system, we can focus on a single-state system that only contains the kinetic model. 2. Preliminaries In the paper, as already motivated in the previous paragraph, and for the purpose of introducing the idea of optimal input design for model discrimination using analytical solutions, we limit ourselves to the case of two non-linear scalar state equations that are affine in their joint input. Hence, without explicitly referring to the parameter vectors to simplify notation, two competitive model structures are defined by the following state equations x˙1 (t ) = f1 (x1 (t )) + b1 u(t )

(1)

x˙2 (t ) = f2 (x2 (t )) + b2 u(t )

(2)

with x1 (t ) and x2 (t ) being scalar states, u(t ) the joint control input, and f1 (·), f2 (·) non-linear functions. In what follows, it is assumed that f1 (x1 (t )), f2 (x2 (t )) ∈ R2 [(−∞, ∞)]. For the readability of the solutions only, it is further assumed that the states are directly observed so that yi (t ) = xi (t ) for i = 1, 2. Hence, it suffices to derive a control law in terms of xi (t ). Since our aim is to optimally discriminate between models from new experiments, it is intuitively appealing to focus on the norm of the difference of the predicted output signals. Here, our concern is to find a control law that optimally discriminates between competitive, single-input— single-state models. A similar approach to the OID problem, but then for parameter estimation using PMP, is presented by Keesman and Stigter (2002) and Stigter and Keesman (2004). 3. Optimal input design

For finding analytical solutions to the problem, define the Hamiltonian

H (x(t ), λ(t )) , − (x1 (t ) − x2 (t ))2 + ρ u(t ) +

Let us start from the general system description given in (1)–(2). As a measure of optimality, the following utility function, which includes the linear costs of the control input, is introduced:



tf

J =

(x1 (τ ) − x2 (τ ))2 − ρ u(τ )dτ

(3)

t =0

λi (t )˙xi (t )

(5)

i=1

where λi (t ), i = 1, 2, are the co-states. The last term on the right-hand side of (5) is included to fulfil the dynamic constraints, given by (1)–(2) (see e.g. Stengel (1994) for details). Let U be the admissible set of input trajectories defined by (4). Pontryagin’s maximum principle states that the input u(t ) ∈ U that minimizes H is optimal and thus in this case maximizes J. The following holds (see, e.g., Bryson, 1999):

∂H = 2x1 (t ) − 2x2 (t ) − f˙1 λ1 (t ) ∂ x1 ∂H λ˙ 2 (t ) = − = −2x1 (t ) + 2x2 (t ) − f˙2 λ2 (t ) ∂ x2

λ˙ 1 (t ) = −

where, for simplicity of notation, fi (xi (t )) and its derivatives

(6) (7) dfi (xi ) dxi

are denoted as fi , f˙i , etc. Since H does not explicitly depend on time, a first integral of the problem is H = c with c being a real constant. Also, since the final time tf is assumed to be unknown and no terminal conditions are specified (determining the value of the co-states at tf ), this constant can be assumed equal to zero. Furthermore, since the problem is affine in the bounded control variable u(t ) (4), three types of constraints on the optimal control can be found u(t ) = umax

if sf < 0

0 ≤ u(t ) ≤ umax u( t ) = 0

if sf = 0

if sf > 0

(8)

where sf = ∂ H /∂ u is the so-called switching function. The case sf = 0 corresponds to a singular arc. A singular controller that minimizes the Hamiltonian H over all possible input sequences u(t ) can be derived by setting

∀i ∈ {0, 1, 2, . . .} :

di ∂ H dt i ∂ u

= 0.

(9)

Hence, the singular control law is derived by solving a set of algebraic equations, generated through repeated differentiation of the Pontryagin optimality condition ∂∂H ≡ 0 on the compact u interval [t1 , t2 ]. In order to determine the optimal input u(t ) in (1)–(2) explicitly two differentiations are needed, presuming that u appears in one of the equations of (9). For i = 0, and thus for ∂ H /∂ u = sf = 0, we get sf = ρ + b1 λ1 (t ) + b2 λ2 (t ) = 0. Hence, λ1 (t ) = −

3.1. Singular control using squared 2-norm of the differences and linear input costs

2 

ρ+b2 λ2 (t ) b1

, and the condition H ≡ 0 gives λ2 (t ) =

(ρ f1 + b1 (x1 − x2 ) )/(−b2 f1 + b1 f2 ). Subsequent differentiation of = 0, assuming that the condition holds on an the condition ∂∂H u interval, yields for i = 1 and using (6)–(7) the so-called singularity 2

condition:

φ(t ) = [b1 (f2 (2(b1 − b2 )x1 − 2(b1 − b2 )x2 + ρ f˙1 ) + b2(x1 − x2 )2 (f˙1 − f˙2 )) − b2 f1 (2(b1 − b2 )x1 − 2(b1 − b2 )x2 + ρ f˙2 )]/[−b2 f1 + b1 f2 ] = 0

(10)

under the dynamic constraints given by (1)–(2) and the (practical) input constraints:

with −b2 f1 + b1 f2 ̸= 0. The corresponding singular control law, found from (9) for i = 2, is given by

umin ≤ u(t ) ≤ umax ,

u∗ ( t ) =

∀t ≥ 0

(4)

with, in what follows, umin = 0 and umax being the maximum allowed value, frequently determined by the equipment. Consequently, the OID problem leads to a constrained optimization problem, where our goal is to maximize (3).



2b2 (−b1 + b2 )f12 f˙2 − b1 f˙1 (2(b1 − b2 )f22

+ b2 (x1 − x2 )2 (f˙1 − f˙2 )f˙2 + b2 f2 (x1 − x2 ) ∗ (2f˙1 − 2f˙2 + (x1 − x2 )f¨2 )) + f1 (b1 b2 (x1 − x2 ) ∗ f˙2 (2f˙1 − 2f˙2 + (x1 − x2 )f¨1 ) + 2(b1 − b2 )

K.J. Keesman, E. Walter / Automatica 50 (2014) 1535–1538

1537

 ˙ ˙ ¨ ¨ ∗ f2 (b1 f1 + b2 f2 − (x1 − x2 )(b1 f1 − b2 f2 )))  −2b1 (b1 − b2 )f2 ((b1 − b2 )f˙1 + b1 (−x1 + x2 )f¨1 ) + b2 (2(b1 − b2 )f1 ((b1 − b2 )f˙2 + b2 (−x1 + x2 )f¨2 ) − b1 (x1 − x2 )2  ∗ (b1 f˙2 f¨1 − b2 f˙1 f¨2 ))

(11)

with the denominator being unequal to zero. The corresponding states and co-states are denoted by x∗i (t ) and λ∗i (t ), respectively. The control law of (11) is only valid on the singular arc φ = 0, as given by (10). This result is summarized in Proposition 1. Proposition 1. Given the model structures (1)–(2), a necessary condition for u(t ) to be optimal, in the sense of maximizing the squared 2-norm of the difference between real output signals of two competitive model structures and including linear input costs (3), is given by (11) under the singular arc condition given by (10). For the case with ρ = 0, and thus without input costs, we refer to Keesman and Walter (2012) for details. In the next section, we will discuss closed form analytical results for the discrimination between two common kinetic models and will support this with numerical simulations. 4. Optimal input design for discriminating kinetic model structures 4.1. Fed-batch reactor models Consider two potential single-state, fed-batch reactor models with first-order and with Monod kinetics, frequently applied in chemical and bio-reactor modelings, see Van ’t Riet and Tramper (1991), First-order kinetics: x˙ 1 = −µx1 + u x2 Monod kinetics: x˙ 2 = −µ + bu KS + x 2

(12) (13)

where xi , for i = 1, 2, is the substrate concentration in the (bio)reactor, µ the (bio)degradation rate and KS the half-saturation constant. In practice, and this is also used in what follows, each of these variables is positive. The reactor is assumed to be completely mixed and, as said before, the substrate concentration xi is assumed to be directly observed. Hence, (12)–(13) are specific examples from the class of systems defined by (1)–(2). Notice that in (12)–(13): b1 = 1 and b2 = b, to allow identical initial conditions, as we will show later. 4.2. Optimal input design for distinguishing between first-order and Monod kinetics Let two model structures, describing a bio-reactor system with either first-order or Monod kinetics, be given by (12)–(13), respectively. To reduce the complexity of the analytical solutions, it is assumed that (x1 − x2 ) is either positive or negative on the time interval [0, tf ]. Then, the optimal input, in the sense of maximizing the squared differences between x1 and x2 while minimizing linear input costs, and corresponding singularity condition are obtained from (11) and (10), respectively, after assigning: f1 ≡ −µx1 , f2 ≡ −µ KSx+2 x2 , b1 ≡ 1 and b2 ≡ b. Because of space limitations, the analytical expressions for u∗ and φ are not presented here, except that ρ always appears in the product ρµ. It can be shown that, for   a given initial condition x2 (0) and choosing b = KS + x2 (0) /KS : x1 (0) = x2 (0). These choices for b and the corresponding initial conditions directly put the

Fig. 1. Co-states λ1 , λ2 (−·) and H (upper panel); states x1 , x2 (−·) and φ (middle panel); optimal control input u∗ (bottom panel).

system on the singular arc and keep it on the singular arc by the corresponding control law. Fig. 1 shows the states, co-states and optimal singular control input for µ = 1, KS = 0.9, x1 (0) = x2 (0) = 0.3, b = 1.333 and ρ = 1 on the time interval [0, 5]. The initial condition, x2 (0) = 0.334, is optimal in the sense that x1 = 0.0374 and x2 = 0.0487 for u(t ) ≥ 0, ∀t. 5. Concluding remarks A methodology for the optimal input design for model discrimination, based on Pontryagin’s maximum principle (PMP), has been successfully applied to a fed-batch reactor case study. The study could be relatively easily extended to cases that include soft state constraints, following the methodology in Keesman and Stigter (2002). To tackle the problem of (partially) unknown system parameters, as presented in Section 4.2, can in principle be cast in an adaptive framework (see Stigter, Vries, & Keesman, 2006). Discrimination between other kinetic model structures, such as Haldane, Hill, etc., has also been successfully explored, but not shown here because of the complexity of the analytical solutions. Also, the framework provided by the PMP does directly allow numerical solutions, as presented in e.g. Bryson (1999), Stengel (1994), and many others. References Atkinson, A. C., & Cox, D. R. (1974). Planning experiments for discriminating between models (with discussion). Journal of the Royal Statistical Society Series B, 36, 321–348. Atkinson, A. C., & Fedorov, V. V. (1975). The design of experiments for discriminating between two rival models. Biometrika, 62, 57–70. Bastin, G., & Dochain, D. (1990). On-line estimation and adaptive control of bioreactors. Amsterdam: Elsevier. Box, G. E. P., & Hill, W. J. (1967). Discrimination among mechanistic models. Technometrics, 9(1), 57–71. Bryson, A. E., Jr. (1999). Dynamic optimization. Addison Wesley. Dette, H., & Titoff, S. (2009). Optimal discrimination designs. The Annals of Statistics, 37(4), 2056–2082. Keesman, K. J. (2011). System identification: an introduction. London: SpringerVerlag. Keesman, K. J., & Stigter, J. D. (2002). Optimal parametric sensitivity control for the estimation of kinetic parameters in bioreactors. Mathematical Biosciences, 179, 95–111. Keesman, K.J., & Walter, E. (2012). Optimal input design for model discrimination. In Proc. IFAC system identification, Volume 16, Part 1 (pp. 668–673). Brussels, Belgium. Kreutz, C., & Timmer, J. (2009). Systems biology: experimental design. FEBS Journal, 276, 923–942. Lopez-Fidalgo, J., Tommasi, C., & Trandafir, P. C. (2007). An optimal experimental design criterion for discriminating between non-normal models. Journal of the Royal Statistical Society Series B, 69, 231–242. Myung, J. I., & Pitt, M. A. (2009). Optimal experimental design for model discrimination. Psychological Review, 116(3), 499–518.

1538

K.J. Keesman, E. Walter / Automatica 50 (2014) 1535–1538

Ponce de Leon, A. C., & Atkinson, A. C. (1991). Optimum experimental design for discriminating between two rival models in the presence of prior information. Biometrika, 78(3), 601–608. Rajamoney, S. A. (1993). The design of discrimination experiments. Machine Learning, 12, 185–203. Reilly, P. M. (1970). Statistical methods in model discrimination. The Canadian Journal of Chemical Engineering, 48, 168–173. Schwaab, M., Silva, F. M., Queipo, C. A., Barreto, A. G., Jr., Nele, M., & Pinto, J. C. (2006). A new approach for sequential experiment design for model discrimination. Chemical Engineering Science, 61, 5791–5806. Skanda, D., & Lebiedz, D. (2010). An optimal experimental design approach to model discrimination in dynamic biochemical systems. Bioinformatics, 26(7), 939–945.

Stengel, R. (1994). Optimal control and estimation. New York: Dover Publications. Stigter, J. D., & Keesman, K. J. (2004). Optimal parametric sensitivity control of a fed-batch reactor. Automatica, 40(8), 1459–1464. Stigter, J. D., Vries, D., & Keesman, K. J. (2006). On adaptive optimal input design: a bioreactor case study. AIChE Journal, 52(9), 3290–3296. Studden, W. J. (1980). Ds -optimal designs for polynomial regression using continued fractions. The Annals of Statistics, 8(5), 1132–1141. Van ’t Riet, K., & Tramper, J. (1991). Basic bioreactor design. New York: Marcel Dekker Inc.. Walter, E., Lecourtier, Y., & Happel, J. (1984). On the structural output distinguishability of parametric models, and its relation with structural identifiability. IEEE Transactions on Automatic Control, AC-29(1), 56–57.