Mathematics and Computers in Simulation 42 (1996) 125-134
On the identifiability and distinguishability of nonlinear parametric models

Eric Walter a,*, Luc Pronzato b

a L2S, CNRS-SUPELEC, Plateau de Moulon, 91192 Gif-sur-Yvette, France
b I3S, CNRS-URA-1376, Sophia-Antipolis, 06560 Valbonne, France
Abstract

Testing parametric models for identifiability and distinguishability is important when the parameters to be estimated have a physical meaning or when the model is to be used to reconstruct physically meaningful state variables that cannot be measured directly. Examples are used to explain why and indicate briefly how, with special emphasis on nonlinear models.
1. Why

Building mathematical models from observations is a fundamental step in many domains of pure and applied sciences. These models may be used to understand, predict or control the behavior of a system, to estimate quantities that cannot be measured directly or to detect faults. Most often, the equations of the models to be considered involve unknown parameters that must be estimated from experimental data.

Assume that a finite number of model structures $M_i(\cdot)$ compete for the description of the experimental data. It is then natural to ask whether the experimental set-up makes it possible to select the best structure and estimate its parameters. These questions find answers in the idealized framework of identifiability and distinguishability studies, where noise-free data are assumed to have been generated by a model with one of the structures considered (see, e.g., [1,2] and the references therein).

Let $M_i(p_i) \equiv M_j(p_j)$ mean that the model with structure $M_i(\cdot)$ and parameters $p_i$ has the same input-output behavior as the model with structure $M_j(\cdot)$ and parameters $p_j$, for any input and time. $M(\cdot)$ will then be uniquely identifiable (or globally identifiable) if for almost any $p^*$ the only solution for $\hat{p}$ of $M(\hat{p}) \equiv M(p^*)$ is $\hat{p} = p^*$. In other words, if $M(\cdot)$ is not uniquely identifiable, then there are several parameter vectors that correspond to exactly the same input-output behavior, and it is impossible to eliminate any of them from the data alone. Moreover, each of them corresponds to a different evolution of some state variables that are not measured directly.

* Corresponding author.
0378-4754/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0378-4754(95)00123-9
Similarly, $M_i(\cdot)$ will be distinguishable from $M_j(\cdot)$ if and only if for almost any $p_j$ there is no $p_i$ such that $M_i(p_i) \equiv M_j(p_j)$. In other words, if $M_i(\cdot)$ is not distinguishable from $M_j(\cdot)$, then there is no way to discover from experimental data that the structure $M_i(\cdot)$ of the model does not correspond to the structure $M_j(\cdot)$ of the process that has generated the data.

To make these notions more concrete, consider an experiment performed in the Department of Chemical Engineering of Columbia University to study a catalytic reaction producing methane from carbon monoxide. Under suitable experimental conditions, $^{13}$CO is substituted for $^{12}$CO at the inlet of the reactor in stationary state [3]. The data collected are the evolution with time of the percentage of marked carbon atoms in CO and CH$_4$. Assuming that $^{13}$C and $^{12}$C behave identically with respect to the reaction, one can use a linear time-invariant model to describe the dependency of the percentage of $^{13}$C in CH$_4$ (output $y(t)$) on that in CO (input $u(t)$).

Chemical prior information suggests two compartmental model structures for the description of these data (Figs. 1 and 2). Compartments, represented by circles, correspond to the various chemical species that can be labeled by the tracer, and arrows materialize the flows of labelable atoms. In both structures, the $i$th component of the state vector $x$ is the specific activity (percentage of $^{13}$C) in the $i$th compartment, and initial conditions can be taken as $x(0) = 0$. State equations are obtained by material balances. For the first structure
$$
M(p):\quad
\begin{cases}
C_1 \dfrac{d}{dt} x_1 = -(V + v_e)\, x_1 + v_e x_2 + V u, \\[4pt]
C_2 \dfrac{d}{dt} x_2 = v_e x_1 - v_e x_2, \\[4pt]
C_3 \dfrac{d}{dt} x_3 = V x_1 - V x_3, \\[4pt]
C_4 \dfrac{d}{dt} x_4 = V x_3 - V x_4, \\[4pt]
y = x_4,
\end{cases}
$$

the parameters to be estimated are $p = (C_1, C_2, C_3, v_e)^T$. $C_i$ ($i = 1, \ldots, 3$) are surface concentrations, $v_e$ and $V$ are velocities of transfer of carbon atoms. $V$ and $C_4$ are known from independent measurements.
Fig. 1. First possible model structure.
Fig. 2. Second possible model structure.
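For the second structure

$$
\hat{M}(\hat{p}):\quad
\begin{cases}
C_1 \dfrac{d}{dt} x_1 = -V x_1 + V u, \\[4pt]
C_2 \dfrac{d}{dt} x_2 = v_1 x_1 - v_1 x_2, \\[4pt]
C_3 \dfrac{d}{dt} x_3 = (V - v_1)\, x_1 + v_1 x_2 - V x_3, \\[4pt]
C_4 \dfrac{d}{dt} x_4 = V x_3 - V x_4, \\[4pt]
y = x_4,
\end{cases}
$$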
the parameters are $\hat{p} = (C_1, C_2, C_3, v_1)^T$. $C_i$ ($i = 1, \ldots, 4$), $V$ and $u$ are as previously; $v_1$ is a velocity of transfer of carbon atoms.

All state variables and parameters of these two model structures have a concrete meaning, and the aim of the modeling is not a mere reproduction of an experimentally observed input-output behavior. One would like to select the best structure and to estimate the associated parameters on the basis of experimental data, the final goal being the improvement of the performance of the catalyst.

Studying the identifiability and distinguishability of these two structures, one reaches the following conclusions [4]. $M(\cdot)$ and $\hat{M}(\cdot)$ are not distinguishable. If the data have been generated by a model $M(p^*)$, there are six models with the structure $\hat{M}(\cdot)$ and the same input-output behavior. Conversely, if the data have been generated by a model $\hat{M}(\hat{p})$, there are three models with the structure $M(\cdot)$ and the same input-output behavior. Neither $M(\cdot)$ nor $\hat{M}(\cdot)$ is uniquely identifiable. Before collecting any data on the reactor, we therefore know that there is no hope of selecting the best model structure and uniquely estimating its parameters from the experiment considered, whatever the quality of the experimental data.
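One way to see where such counts can come from is to take Laplace transforms of the two sets of state equations with $x(0) = 0$. The block below is only a sketch of that computation, not the derivation of [4]; the Laplace variable $s$ and the names $H$ and $\hat{H}$ for the two transfer functions from $u$ to $y$ are introduced here for illustration.

```latex
% First structure: eliminate x_2 from the first two equations, then cascade x_3 and x_4.
H(s) = \frac{V^{3}\,(C_2 s + v_e)}
            {\bigl[\,C_1 C_2 s^{2} + (C_1 v_e + C_2 V + C_2 v_e)\,s + V v_e\,\bigr]
             (C_3 s + V)(C_4 s + V)}

% Second structure: x_1 and x_2 are first-order lags, x_3 collects both paths.
\hat{H}(s) = \frac{V^{2}\,\bigl[(V - v_1)\,C_2 s + V v_1\bigr]}
                  {(C_1 s + V)(C_2 s + v_1)(C_3 s + V)(C_4 s + V)}
```

Both transfer functions have unit static gain, one zero and four poles, one of which, $-V/C_4$, is known. In $\hat{H}$ the three remaining poles are $-V/C_1$, $-v_1/C_2$ and $-V/C_3$: matching a given set of poles and a given zero leaves three choices for the pole assigned to $-v_1/C_2$ and two ways of assigning the other two, which is consistent with the six models quoted above. In $H$, only the pole assigned to $-V/C_3$ can be chosen freely, which is consistent with three models. This counting ignores sign constraints on the parameters.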
2. How
Because of space limitations, only a small part of the state of the art can be presented. The outputs of the structures to be considered are all nonlinear in their parameters (the linear case reduces to testing the nonsingularity of a matrix). We shall concentrate on identifiability, but distinguishability problems are treated with similar methods [5], although identifiability of two model structures is neither necessary nor sufficient for their distinguishability [6]. Methods dealing with models whose outputs are nonlinear in their inputs will be favored, but their relation to methods available for linear models will be mentioned. For a more detailed presentation, see [7].

2.1. Taylor series
Consider the model

$$
M(p):\quad
\begin{cases}
\dfrac{d}{dt} x(t) = f(x(t), u(t), t, p), \quad x(0) = x_0(p), \\[4pt]
y(t, p) = h(x(t), p).
\end{cases}
$$

If

$$
a_k(p) = \lim_{t \to 0^+} \frac{d^k}{dt^k}\, y(t, p),
$$

then a sufficient condition for $M(\cdot)$ to be uniquely identifiable is [8]

$$
a_k(\hat{p}) = a_k(p^*), \quad k = 0, 1, \ldots, k_{\max} \;\Longrightarrow\; \hat{p} = p^*,
$$

where $k_{\max}$ is a positive integer, small enough for the computations to remain tractable.

Example 1. For the nonlinear model
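$$
M(p):\quad
\begin{cases}
\dfrac{d}{dt}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix}
-p_1 x_1 - p_2 (1 - p_3 x_2)\, x_1 \\
p_2 (1 - p_3 x_2)\, x_1 - p_4 x_2
\end{bmatrix},
\quad
x(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \\[8pt]
y(t, p) = x_1,
\end{cases}
$$

$$
\begin{aligned}
a_0(p) &= 1, \\
a_1(p) &= -(p_1 + p_2), \\
a_2(p) &= (p_1 + p_2)^2 + p_2^2 p_3, \\
a_3(p) &= -p_2^3 p_3^2 - 4 p_2^2 p_3 (p_1 + p_2) - p_2^2 p_3 p_4 - (p_1 + p_2)^3,
\end{aligned}
$$

and it can be shown that

$$
a_k(\hat{p}) = a_k(p^*), \quad k = 1, 2, \ldots, 5 \;\Longrightarrow\; \hat{p} = p^*.
$$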
$M(\cdot)$ is therefore uniquely identifiable. Note that if $p_3 = 0$, the model then becomes linear and unidentifiable. This illustrates the tendency of nonlinear models to be more identifiable than linear models. For linear state-space models, the Taylor series approach amounts to testing identifiability from the identity of the Markov parameters [9,10].
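The coefficients of Example 1 can be checked mechanically. The sketch below computes $a_k$ by a power-series recursion in exact rational arithmetic. It takes the state equations as $\dot{x}_1 = -p_1 x_1 - p_2(1 - p_3 x_2)x_1$, $\dot{x}_2 = p_2(1 - p_3 x_2)x_1 - p_4 x_2$ with $x(0) = (1, 0)^T$; this reading of the example is an assumption of the sketch, chosen because it reproduces $a_0, \ldots, a_3$ as listed. The function names and the numerical parameter values are hypothetical.

```python
from fractions import Fraction
from math import factorial


def taylor_coefficients(p, kmax):
    """Return [a_0, ..., a_kmax], a_k being the k-th time derivative of y = x1 at t = 0+."""
    p1, p2, p3, p4 = (Fraction(v) for v in p)
    c = [Fraction(1)]  # power-series coefficients of x1 (assumed x1(0) = 1)
    d = [Fraction(0)]  # power-series coefficients of x2 (assumed x2(0) = 0)
    for k in range(kmax):
        # k-th coefficient of the product x1 * x2 (Cauchy product)
        w = sum(c[j] * d[k - j] for j in range(k + 1))
        # Match powers of t in the assumed state equations
        c.append((-(p1 + p2) * c[k] + p2 * p3 * w) / (k + 1))
        d.append((p2 * c[k] - p2 * p3 * w - p4 * d[k]) / (k + 1))
    return [factorial(k) * c[k] for k in range(kmax + 1)]


def closed_form(p):
    """Closed-form a_0, ..., a_3 as given in Example 1."""
    p1, p2, p3, p4 = (Fraction(v) for v in p)
    s = p1 + p2
    return [
        Fraction(1),
        -s,
        s**2 + p2**2 * p3,
        -p2**3 * p3**2 - 4 * p2**2 * p3 * s - p2**2 * p3 * p4 - s**3,
    ]


p = (2, 3, 5, 7)  # arbitrary illustrative values
same_as_closed_form = taylor_coefficients(p, 3) == closed_form(p)

# With p3 = 0 the output only depends on p1 + p2: two distinct parameter
# vectors then share every Taylor coefficient computed here.
linear_case_collides = (
    taylor_coefficients((1, 2, 0, 3), 5) == taylor_coefficients((2, 1, 0, 9), 5)
)
```

The second check illustrates the remark on the linear case: when $p_3 = 0$, parameter vectors with the same sum $p_1 + p_2$ cannot be told apart from these coefficients.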
2.2. Generating series

Consider the model

$$
M(p):\quad
\begin{cases}
\dfrac{d}{dt} x(t) = f_0(x(t), p) + \displaystyle\sum_{i=1}^{m} u_i(t)\, f_i(x(t), p), \quad x(0) = x_0(p), \\[4pt]
y(t, p) = h(x(t), p),
\end{cases}
$$

where $f_i$ ($i = 0, \ldots, m$) and $h$ are analytic on the manifold over which $x$ evolves, so that the model output can be expanded in series with respect to time and inputs [11]. The coefficients of this series are $h(x(0), p)$ and

$$
L_{f_{j_0}} \cdots L_{f_{j_k}} h(x(t), p)\big|_0,
$$

where $|_0$ indicates evaluation at $x(0)$ and $L_f h$ is the Lie derivative of $h$ along the vector field $f$, given by

$$
L_f h(x(t), p) = \sum_{j=1}^{n} f_j(x(t), p)\, \frac{\partial}{\partial x_j} h(x(t), p),
$$

with $f_j$ the $j$th component of $f$. Let $s(p)$ be the vector of all coefficients of the series; $M(\hat{p}) \equiv M(p^*)$ translates into $s(\hat{p}) = s(p^*)$. One can therefore test the identifiability of $M(\cdot)$ by calculating the number of solutions for $\hat{p}$ of the set of equations $s(\hat{p}) = s(p^*)$ [1,12].

Example 2. For the model
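$$
M(p):\quad
\begin{cases}
\dfrac{d}{dt} x =
\begin{bmatrix}
-\left(p_1 + \dfrac{p_2}{p_3 + x_1}\right) x_1 + p_4 x_2 \\[6pt]
p_1 x_1 - p_4 x_2
\end{bmatrix}
+ \begin{bmatrix} 1 \\ 0 \end{bmatrix} u, \quad x(0) = 0, \\[8pt]
y(t, p) = x_1,
\end{cases}
$$

$$
h(x, p) = x_1,
$$

$$
f_0 =
\begin{bmatrix}
-\left(p_1 + \dfrac{p_2}{p_3 + x_1}\right) x_1 + p_4 x_2 \\[6pt]
p_1 x_1 - p_4 x_2
\end{bmatrix}
\;\Longrightarrow\;
L_{f_0} = \left[-\left(p_1 + \frac{p_2}{p_3 + x_1}\right) x_1 + p_4 x_2\right] \frac{\partial}{\partial x_1}
+ \left[p_1 x_1 - p_4 x_2\right] \frac{\partial}{\partial x_2},
$$

$$
f_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\;\Longrightarrow\;
L_{f_1} = \frac{\partial}{\partial x_1},
$$

$$
\begin{aligned}
L_{f_1} L_{f_0} h\big|_0 &= -\left(p_1 + \frac{p_2}{p_3}\right), \\
L_{f_1} L_{f_1} L_{f_0} h\big|_0 &= 2\,\frac{p_2}{p_3^2}, \\
L_{f_1} L_{f_1} L_{f_1} L_{f_0} h\big|_0 &= -6\,\frac{p_2}{p_3^3}, \\
L_{f_1} L_{f_0} L_{f_0} h\big|_0 &= p_1 p_4 + \left(p_1 + \frac{p_2}{p_3}\right)^2,
\end{aligned}
$$

which implies that

$$
s(\hat{p}) = s(p^*) \;\Longrightarrow\; \hat{p} = p^* \;\Longrightarrow\; M(\cdot) \text{ is uniquely identifiable.}
$$

Computations would have been much more complex with the Taylor series approach. For bilinear models, a systematic method is available for generating all nonzero coefficients of the series, which can be used to prove that a given model structure is not identifiable [12]. For linear state-space models, the generating series approach amounts to the Laplace transform approach [13], up to a change of variable.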
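The implication $s(\hat{p}) = s(p^*) \Rightarrow \hat{p} = p^*$ in Example 2 can be made explicit by solving the four listed coefficients for the parameters one at a time. The following is a minimal sketch of that inversion, assuming generic parameter values ($p_1$, $p_2$ and $p_3$ nonzero); the function names and the test values are hypothetical.

```python
from fractions import Fraction


def series_coefficients(p):
    """The four generating-series coefficients of Example 2 listed in the text."""
    p1, p2, p3, p4 = (Fraction(v) for v in p)
    a = -(p1 + p2 / p3)                 # L_f1 L_f0 h at x = 0
    b = 2 * p2 / p3**2                  # L_f1 L_f1 L_f0 h at x = 0
    c = -6 * p2 / p3**3                 # L_f1 L_f1 L_f1 L_f0 h at x = 0
    d = p1 * p4 + (p1 + p2 / p3) ** 2   # L_f1 L_f0 L_f0 h at x = 0
    return a, b, c, d


def recover_parameters(s):
    """Solve the four coefficient equations for (p1, p2, p3, p4)."""
    a, b, c, d = s
    p3 = -3 * b / c          # ratio of the two pure-input coefficients
    p2 = b * p3**2 / 2       # from the second coefficient
    p1 = -a - p2 / p3        # from the first coefficient
    p4 = (d - a**2) / p1     # from the last coefficient
    return p1, p2, p3, p4


p_star = (2, 3, 5, 7)  # arbitrary illustrative values
p_hat = recover_parameters(series_coefficients(p_star))
```

Because each step has a single solution, any $\hat{p}$ sharing these four coefficients with $p^*$ must coincide with it, which is the uniqueness claimed in the example.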
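2.3. Local state isomorphism

Consider the $n$-dimensional state-space model

$$
M(p):\quad
\begin{cases}
\dfrac{d}{dt} x(t) = f(x(t), p) + u(t)\, g(x(t), p), \quad x(0) = x_0(p), \\[4pt]
y_m(t, p) = h(x(t), p),
\end{cases}
$$

where $f$, $g$ and $h$ are analytic, $u$ is a measurable bounded function and $M(p)$ is locally reduced at $x_0(p)$ for almost any $p$ (which corresponds to a notion of structural observability and structural controllability [14,15]). Let $x^*$ be the state of $M(p^*)$ and $\hat{x}$ be the state of $M(\hat{p})$. $M(p^*)$ and $M(\hat{p})$ will have the same input-output behavior for any $u$ up to a time $t_1 > 0$ if and only if there exists a local state isomorphism $\varphi: \nu(x_0^*) \to \mathbb{R}^n$, $x^* \mapsto \hat{x} = \varphi(x^*)$ such that for any $x^*$ in the neighborhood $\nu(x_0^*)$ the following conditions are satisfied, which express that (i) $\varphi$ is a diffeomorphism, (ii) the initial states correspond, (iii) the drift terms correspond, (iv) the control terms correspond, (v) the observations correspond. Here $\hat{f}(\cdot) = f(\cdot, \hat{p})$ and $f^*(\cdot) = f(\cdot, p^*)$, and similarly for $g$, $h$ and $x_0$.

$$
\begin{aligned}
&\operatorname{rank} \frac{\partial}{\partial x^T} \varphi(x)\Big|_{x = x^*} = n, && \text{(i)} \\
&\varphi(x_0^*) = \hat{x}_0, && \text{(ii)} \\
&\hat{f}(\hat{x}) = \hat{f}(\varphi(x^*)) = \frac{\partial}{\partial x^T} \varphi(x)\Big|_{x = x^*} f^*(x^*), && \text{(iii)} \\
&\hat{g}(\hat{x}) = \hat{g}(\varphi(x^*)) = \frac{\partial}{\partial x^T} \varphi(x)\Big|_{x = x^*} g^*(x^*), && \text{(iv)} \\
&\hat{h}(\hat{x}) = \hat{h}(\varphi(x^*)) = h^*(x^*). && \text{(v)}
\end{aligned}
$$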
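After checking that $M(p)$ is locally reduced at $x_0(p)$, one can look for all solutions for $\hat{p}$ and $\varphi$ of (i)-(v). If for almost any $p^*$ the only possible solution is $\hat{p} = p^*$, $\varphi(x^*) = x^*$, then $M(\cdot)$ is uniquely identifiable [16,17].

Example 3. Consider the (locally reduced) model [16]

$$
f(x, p) = \begin{bmatrix} p_1 x_1^2 + p_2 x_1 x_2 \\ p_3 x_1^2 + p_4 x_1 x_2 \end{bmatrix},
\quad
g(x, p) = \begin{bmatrix} 1 \\ 0 \end{bmatrix},
\quad
h(x, p) = x_1
\quad \text{and} \quad
x_0 = 0.
$$

$$
\text{(v)} \;\Longrightarrow\; \hat{x}_1 = \varphi_1(x^*) = x_1^*
\;\Longrightarrow\;
\frac{\partial}{\partial x_1} \varphi_1(x)\Big|_{x = x^*} = 1,
\quad
\frac{\partial}{\partial x_2} \varphi_1(x)\Big|_{x = x^*} = 0,
$$

$$
\text{(iv)} \;\Longrightarrow\;
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
\dfrac{\partial}{\partial x_1} \varphi_2(x)\Big|_{x = x^*} & \dfrac{\partial}{\partial x_2} \varphi_2(x)\Big|_{x = x^*}
\end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
\;\Longrightarrow\;
\frac{\partial}{\partial x_1} \varphi_2(x)\Big|_{x = x^*} = 0,
$$

so that (iii) translates into

$$
\begin{bmatrix}
\hat{p}_1 \hat{x}_1^2 + \hat{p}_2 \hat{x}_1 \hat{x}_2 \\
\hat{p}_3 \hat{x}_1^2 + \hat{p}_4 \hat{x}_1 \hat{x}_2
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
0 & \dfrac{\partial}{\partial x_2} \varphi_2(x)\Big|_{x = x^*}
\end{bmatrix}
\begin{bmatrix}
p_1^* x_1^{*2} + p_2^* x_1^* x_2^* \\
p_3^* x_1^{*2} + p_4^* x_1^* x_2^*
\end{bmatrix}.
$$

Since $\hat{x}_1 = x_1^*$ and $\hat{x}_2 = \varphi_2(x^*)$, the first row of (iii) implies

$$
\hat{p}_1 x_1^{*2} + \hat{p}_2 x_1^* \varphi_2(x^*) = p_1^* x_1^{*2} + p_2^* x_1^* x_2^*,
\quad \forall x^* \in \nu(x_0^*).
$$

Since $(\partial / \partial x_1) \varphi_2(x)|_{x = x^*} = 0$, $\varphi_2(x^*)$ does not depend on $x_1^*$, and the identity of the terms in $x_1^{*2}$ implies

$$
\hat{p}_1 = p_1^*
\quad \text{and} \quad
\varphi_2(x^*) = \frac{p_2^*}{\hat{p}_2}\, x_2^*
\;\Longrightarrow\;
\frac{\partial}{\partial x_2} \varphi_2(x)\Big|_{x = x^*} = \frac{p_2^*}{\hat{p}_2}.
$$

The second row of (iii) then becomes

$$
\hat{p}_3 x_1^{*2} + \hat{p}_4 x_1^* \frac{p_2^*}{\hat{p}_2} x_2^*
= \frac{p_2^*}{\hat{p}_2}\, p_3^* x_1^{*2} + \frac{p_2^*}{\hat{p}_2}\, p_4^* x_1^* x_2^*,
\quad \forall x^* \in \nu(x_0^*),
$$

which implies

$$
\hat{p}_3 = \frac{p_2^*}{\hat{p}_2}\, p_3^* \;\Longleftrightarrow\; \hat{p}_2 \hat{p}_3 = p_2^* p_3^*,
\qquad
\hat{p}_4 \frac{p_2^*}{\hat{p}_2} = \frac{p_2^*}{\hat{p}_2}\, p_4^* \;\Longleftrightarrow\; \hat{p}_4 = p_4^*.
$$

Condition (ii) amounts to $\varphi(0) = 0$, and (i) can now be written as

$$
\operatorname{rank}
\begin{bmatrix}
1 & 0 \\
0 & \dfrac{p_2^*}{\hat{p}_2}
\end{bmatrix}
= 2
\;\Longleftrightarrow\;
p_2^* \neq 0.
$$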
The parameters $p_1$ and $p_4$ are therefore uniquely identifiable, while only the product of $p_2$ and $p_3$ is identifiable. For polynomial models with linear observations, i.e. models where the components of $f$ and $g$ are polynomials in $x$ parametrized by $p$ and where $h(x(t, p), p) = C(p)\, x(t, p)$, $\varphi$ can directly be written as a linear transformation $\hat{x} = T x^*$, which simplifies the calculations notably [18]. In [19], this approach has been applied to a nonlinear model of methane pyrolysis. For linear models, the local isomorphism is also a linear transformation, and the method becomes the similarity transformation approach [20].
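The conclusion of Example 3 can be illustrated numerically: two parameter vectors with the same $p_1$, $p_4$ and product $p_2 p_3$ should produce the same output, while changing the product should not. The sketch below integrates the model with a classical fourth-order Runge-Kutta scheme. The control vector field $g = (1, 0)^T$, the constant input, the parameter values and the integration settings are all assumptions made for illustration.

```python
def simulate_output(p, u=lambda t: 1.0, t_end=1.0, steps=1000):
    """Integrate the model of Example 3 from x(0) = 0 and return samples of y = x1."""
    p1, p2, p3, p4 = p

    def rhs(t, x):
        x1, x2 = x
        # The input is assumed to enter the first state equation only, g = (1, 0)^T.
        return (p1 * x1**2 + p2 * x1 * x2 + u(t),
                p3 * x1**2 + p4 * x1 * x2)

    h = t_end / steps
    x = (0.0, 0.0)
    ys = [x[0]]
    for i in range(steps):
        t = i * h
        # Classical RK4 step
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)))
        k3 = rhs(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)))
        k4 = rhs(t + h, tuple(xi + h * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
        ys.append(x[0])
    return ys


p_star = (-1.0, 0.5, 2.0, -0.3)
p_hat = (-1.0, 2.0, 0.5, -0.3)    # same p1, p4 and product p2 * p3
p_other = (-1.0, 0.5, 3.0, -0.3)  # different product p2 * p3

y_star = simulate_output(p_star)
gap_same = max(abs(a - b) for a, b in zip(y_star, simulate_output(p_hat)))
gap_diff = max(abs(a - b) for a, b in zip(y_star, simulate_output(p_other)))
```

Under these assumptions `gap_same` stays at rounding level while `gap_diff` is clearly nonzero. A simulation of this kind can only suggest the result for one input and one pair of parameter vectors; the state-isomorphism argument is what establishes it.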
2.4. Differential algebra

All methods considered so far usually produce a system of algebraic equations to be solved for the parameters. In identifiability studies, one hopes to prove that this system has a unique solution, while in distinguishability studies one hopes to prove that it has none. Elimination theory and computer algebra can be used to obtain these equations and to simplify their solution by transforming them into sets of triangular equations [21,22]. Differential algebra (see, e.g., [23]), in which differentiation is added to the classical axioms of algebra, makes it possible to use a similar approach to eliminate state variables so as to get differential input-output relations, only involving known variables and their derivatives and the parameters to be estimated, from which identifiability can be studied [24].

Example 4. We shall illustrate this notion with the Volmer-Heyrovski mechanism, used in electrochemistry to describe the production of gas or the dissolution of metals [25]. The state and observation equations can be written as

$$
\frac{d}{dt} x(t) = k_1(t)\,[1 - x(t)] - k_2(t)\, x(t),
\quad
x(0) = \frac{k_1(0)}{k_1(0) + k_2(0)},
$$

$$
k_1(t) = p_1 e^{p_2 u(t)},
\quad
k_2(t) = p_3 e^{p_4 u(t)},
$$

$$
y(t) = k_1(t)\,[1 - x(t)] + k_2(t)\, x(t).
$$

Differentiating the observation equation with respect to time, one obtains

$$
\frac{d}{dt} y = \frac{d}{dt}(k_1)\,(1 - x) + \frac{d}{dt}(k_2)\, x + (k_2 - k_1)\, \frac{d}{dt}(x).
$$

Replacing $(d/dt)(x)$ by its value as given by the state equation, multiplying the result by $(k_2 - k_1)$ so as to use the observation equation to eliminate $x$, one gets the input-output equation

$$
(k_2 - k_1)\, \frac{d}{dt}(y) + \left[\frac{d}{dt}(k_1) - \frac{d}{dt}(k_2) + k_2^2 - k_1^2\right] y
= \frac{d}{dt}(k_1)\, k_2 - k_1 \frac{d}{dt}(k_2) + 2 k_1 k_2 (k_2 - k_1),
$$

with the initial condition

$$
y(0) = k_1(0)\,[1 - x(0)] + k_2(0)\, x(0) = 2\, \frac{k_1(0)\, k_2(0)}{k_1(0) + k_2(0)}.
$$

Exchanging $k_1$ and $k_2$ (i.e. $(p_1, p_2)$ and $(p_3, p_4)$) leaves $y(0)$ unchanged and multiplies the input-output equation by $(-1)$; the structure is therefore not uniquely identifiable.
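The symmetry found in Example 4 can also be observed by simulation: exchanging $(p_1, p_2)$ and $(p_3, p_4)$ replaces $x$ by $1 - x$ and leaves $y$ unchanged. The sketch below integrates the state equation with a classical fourth-order Runge-Kutta scheme for both parameter vectors; the sinusoidal input, the parameter values and the step size are hypothetical choices made for illustration.

```python
import math


def simulate_output(p, u, t_end=2.0, steps=2000):
    """Integrate the state equation of Example 4 and return samples of y(t)."""
    p1, p2, p3, p4 = p

    def k1(t):
        return p1 * math.exp(p2 * u(t))

    def k2(t):
        return p3 * math.exp(p4 * u(t))

    def rhs(t, x):
        return k1(t) * (1.0 - x) - k2(t) * x

    def output(t, x):
        return k1(t) * (1.0 - x) + k2(t) * x

    h = t_end / steps
    x = k1(0.0) / (k1(0.0) + k2(0.0))  # initial condition of the example
    ys = [output(0.0, x)]
    for i in range(steps):
        t = i * h
        # Classical RK4 step
        a = rhs(t, x)
        b = rhs(t + h / 2, x + h / 2 * a)
        c = rhs(t + h / 2, x + h / 2 * b)
        d = rhs(t + h, x + h * c)
        x += h / 6 * (a + 2 * b + 2 * c + d)
        ys.append(output(t + h, x))
    return ys


def u(t):
    return math.sin(3.0 * t)  # hypothetical input signal


p_star = (1.0, 2.0, 0.5, -1.0)
p_swapped = (0.5, -1.0, 1.0, 2.0)  # (p1, p2) and (p3, p4) exchanged

y_star = simulate_output(p_star, u)
y_swapped = simulate_output(p_swapped, u)
gap = max(abs(a - b) for a, b in zip(y_star, y_swapped))
```

With these settings `gap` remains at rounding level although the two parameter vectors differ, in line with the conclusion that the structure is not uniquely identifiable.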
Finally, consider a model defined by a set of relations

$$
r_k(y(t), u(t), x(t), p) = 0, \quad k = 1, \ldots, n_r,
$$

where the $r_k$'s are polynomial functions of $u$, $y$ and $x$ and their derivatives with respect to time and polynomial functions of $p$. In the idealized context of identifiability studies, any uniquely identifiable parameter $p_k$ can be computed [26] as the solution of a linear equation

$$
a_k(y(t), u(t))\, p_k = b_k(y(t), u(t)),
$$

where $a_k$ and $b_k$ are polynomial functions of the inputs, outputs and their derivatives with respect to time. Computing $a_k$ and $b_k$ can therefore be used to prove the unique identifiability of $p_k$.
3. Conclusions

Testing nonlinear parametric model structures for identifiability and distinguishability should be a prerequisite before attempting to decide which model structure is the most appropriate and what is the best value for its parameters on the basis of experimental data. Otherwise, one may fail to realize that no conclusion can be drawn as to the value of some physically meaningful parameter or state variable. The methods presented here can be used to that effect, even before collecting any data.

If some model structures turn out not to be distinguishable, or if any of them turns out not to be uniquely identifiable, then one may use the same methods to select additional experiments or hypotheses that may remove the structural ambiguity. This may be seen as qualitative experiment design. When all structures to be considered have been made identifiable and distinguishable, one should select the actual experiment (input shape, measurement schedule, ...) that can be expected to yield the most relevant information, for example to achieve a good precision on parameters to be estimated. This corresponds to quantitative experiment design [27].
References

[1] E. Walter, Identifiability of State Space Models (Springer, Berlin, 1982).
[2] E. Walter, ed., Identifiability of Parametric Models (Pergamon, Oxford, 1987).
[3] J. Happel, Isotopic Assessment of Heterogeneous Catalysis (Academic Press, Orlando, FL, 1986).
[4] E. Walter, Y. Lecourtier, J. Happel and J.-Y. Kao, Identifiability and distinguishability of fundamental parameters in catalytic methanation, AIChE J. 32 (1986) 1360-1366.
[5] E. Walter, Y. Lecourtier, A. Raksanyi and J. Happel, On the distinguishability of parametric models with different structures, in: J. Eisenfeld and C. De Lisi, eds., Mathematics and Computers in Biomedical Applications (North-Holland, Amsterdam, 1985) 145-160.
[6] E. Walter, Y. Lecourtier and J. Happel, On the structural output distinguishability of parametric models, and its relations with structural identifiability, IEEE Trans. Automat. Control AC-29 (1984) 56-57.
[7] E. Walter and L. Pronzato, Identifiabilities and nonlinearities, in: A. Fossard and D. Normand-Cyrot, eds., Nonlinear Systems, Vol. 1, Modeling and Estimation (Chapman & Hall, London, 1995) 111-143.
[8] H. Pohjanpalo, System identifiability based on the power series expansion of the solution, Math. Biosci. 41 (1978) 21-33.
[9] F. M. Fisher, The Identification Problem in Economics (McGraw-Hill, New York, 1966).
[10] M. S. Grewal and K. Glover, Identifiability of linear and nonlinear dynamical systems, IEEE Trans. Automat. Control AC-21 (1976) 833-837.
[11] M. Fliess, Fonctionnelles causales non linéaires et indéterminées non commutatives, Bull. Soc. Math. France 109 (1981) 3-40.
[12] Y. Lecourtier, F. Lamnabhi-Lagarrigue and E. Walter, Volterra and generating power series approaches to identifiability testing, in: [2], 50-66.
[13] R. Bellman and K. J. Åström, On structural identifiability, Math. Biosci. 7 (1970) 329-339.
[14] R. Hermann and A. J. Krener, Nonlinear controllability and observability, IEEE Trans. Automat. Control AC-22 (1977) 728-740.
[15] H. J. Sussmann, Existence and uniqueness of minimal realizations of nonlinear systems, Math. Systems Theory 10 (1977) 263-284.
[16] S. Vajda, K. R. Godfrey and H. Rabitz, Similarity transformation approach to identifiability analysis of nonlinear compartmental models, Math. Biosci. 93 (1989) 217-248.
[17] S. Vajda and H. Rabitz, State isomorphism approach to global identifiability of nonlinear systems, IEEE Trans. Automat. Control AC-34 (1989) 220-223.
[18] M. J. Chappell, K. R. Godfrey and S. Vajda, Global identifiability of the parameters of nonlinear systems with specified inputs: A comparison of methods, Math. Biosci. 102 (1990) 41-73.
[19] S. Vajda, H. Rabitz, E. Walter and Y. Lecourtier, Qualitative and quantitative identifiability analysis of nonlinear chemical models, Chem. Engrg. Comm. 83 (1989) 191-219.
[20] E. Walter and Y. Lecourtier, Unidentifiable compartmental models: What to do?, Math. Biosci. 56 (1981) 1-25.
[21] E. Walter and Y. Lecourtier, Global approaches to identifiability testing for linear and nonlinear state space models, Mathematics and Computers in Simulation 24 (1982) 472-482.
[22] Y. Lecourtier and A. Raksanyi, The testing of structural properties through symbolic computation, in: [2], 75-84.
[23] M. Fliess, Automatique et corps différentiels, Forum Math. 1 (1989) 227-238.
[24] F. Ollivier, Le problème de l'identifiabilité structurelle globale: Approche théorique, méthodes effectives et bornes de complexité, Thèse de Doctorat en Sciences, École Polytechnique, Palaiseau (1990).
[25] F. Berthier, J.-P. Diard, L. Pronzato and E. Walter (1996), Identifiability and distinguishability concepts in electrochemistry, Automatica, to appear.
[26] L. Ljung and T. Glad, On global identifiability for arbitrary model parametrizations, Automatica 30 (1994) 265-276.
[27] E. Walter and L. Pronzato, Qualitative and quantitative experiment design for phenomenological models - A survey, Automatica 26 (1990) 195-213.