8th IFAC Symposium on Advanced Control of Chemical Processes The International Federation of Automatic Control Singapore, July 10-13, 2012
A New Approach for Practical Identifiability Analysis Applied to Dynamic Phenomenological Models Viviane Rodrigues Botelho, Luciane Ferreira Trierweiler, and Jorge Otávio Trierweiler GIMSCOP – Group of Intensification, Modeling, Simulation, Control and Optimization of Process Federal University of Rio Grande do Sul, Chemical Engineering Department R. Eng. Luis Englert, s/n. Campus Central. CEP: 90040-040 - Porto Alegre - RS - BRAZIL, E-MAIL: {vivirb,luciane, jorge}@enq.ufrgs.br Abstract: Identifiability analysis is an essential tool for parameter estimation of dynamic models. The objective of this technique is to verify which parameters of this model should be identified. This paper proposes a new methodology, based in control system theory applied to the local sensitivity analysis of the model. First, the sensitivity matrix is scaled by applying an optimization problem that minimizes the condition number of the system. Then, analyzing the NSRGA matrix of the system, the most appropriated subset of parameters to be estimated is selected. To assess the proposed methodology, a case study is presented. Keywords: Identifiability, Parameter Estimation, Dynamic Model, Sensitivity Analysis. do not have a direct relation, they are structurally identifiable. However, if in the experimental data, the states x1 and x2 are proportional, the parameters still could not be individually estimated (DOCHAIN, 2001). In other words, in those conditions, the parameters are structurally identifiable, but they are not practically identifiable.
1. INTRODUCTION In general, technological advances enabled a better understanding of physical phenomena associated with different kinds of process. Consequently, mathematical models that characterize this phenomena became more detailed and complex, with a high degree of nonlinearity and involving a large quantity of parameters and state variables (DOCHAIN, 2001), making essential the development of tools capable to help the modeling of systems. In this context, arises the importance of identifiability analysis, which is a primordial step, before any work of parameter estimation. The objective of that analysis is to verify which parameters of a given model should be estimated to guarantee the suitable representation of a real system.
Several techniques are available to identifiability analysis. The most traditional techniques for structural analysis are well developed for linear models; however, there is not a generic method suitable for any non-linear case (BOUBAKER, 2004). In general, the techniques used for practical identifiability analysis do not present restrictions related with the degree of nonlinearity. The most relevant limitation of these methodologies are related to the requirement of previous knowledge about parameters variability and the high quantity of analysis necessary to define the best subset of parameters to be estimated,
Identifiability analysis could be structural or practical. Structural identifiability considers only the model structure, not taking into account restrictions related with operational conditions. Practical identifiability evaluates the information content in the available experimental data, verifying if they are appropriated and enough for reliably parameter estimation.
This paper presents a new methodology of practical identifiability analysis, based on concepts of control system theory, which is capable to deal with the problems previously described. In Section 2 an overview of the most important concepts and existing techniques of practical identifiability analysis are presented. The proposed methodology is explained in Section 3. Finally, in Section 4, two case studies are analysed, comparing the proposed methodology with the currently most used method, proposed by BRUN et al. (2001) and also some practical details related with the new technique are presented.
To illustrate those concepts, the following simple models can be compared: =( + )
(1)
=
+
(2)
Analysing (1), is possible verify the direct relation between parameters and , which could not be individually estimated. Therefore, in this model and are not structurally identifiable. On the other hand, in (2), as c and d
978-3-902823-05-2/12/$20.00 © 2012 IFAC
2. PRACTICAL IDENTIFIABILITY A parameter subset is considered non-identifiable when a strong correlation among the parameters exists, leading to a 691
10.3182/20120710-4-SG-2026.00150
8th IFAC Symposium on Advanced Control of Chemical Processes Singapore, July 10-13, 2012
high degree of uncertainty in the identified model, once different combinations of parameter values leads to the same response, as shown in Fig. 1. Estimation of hypothetical parameters P1 e P2: (a) Identifiable pair and (b) non-identifiable pair., considering the hypothesis of an identifiable (Fig. 1a) and nonidentifiable (Fig. 1b) subsets. The parameters are represented by the axis P1 and P2, while the axis J(P1,P2) represents the objective function (error minimization).
where corresponds to initial value for the parameter and is a scale factor with the same physical dimension as the corresponding output. BRUN et al. (2001) show some practical considerations for the selection of those coefficients, but all of them need previous information about the system, which, in most of the cases cannot be available. From the sensitivity functions, the sensitivity matrix is calculated. For dynamical systems, it can be obtained applying the chain rule in (5): (!) = ( !
⁄
). (!) + (
⁄
)
(7)
where (!) represents the dynamic sensitivity matrix. From (7), it is finally defined the averaged sensitivity matrix ( ̅): '&
(a) (b) Fig. 1. Estimation of hypothetical parameters P1 e P2: (a) Identifiable pair and (b) non-identifiable pair.
̅ = $ (%. !& )*)& (
Fig. 1a shows that there is only one optimal combination of parameters values (P1 and P2), but in Fig. 1b, several combinations of parameters leads to the same objective function, and it is not possible to know which one of these combinations in fact represent the optimal parameters.
(8)
where !& is the sampling time and )& is the number of samples. 2.2. Methods of Practical Identifiability Analysis
Practical identifiability analysis aims to avoid of the problem described above, indicating what parameters should be estimated, considering the available experimental data and the correlation among the parameters.
After discussing the concepts related to sensitivity analysis, the most import methods of practical identifiability can be presented.
2.1. Sensitivity analysis
2.2.1. Covariance Matrix Analysis
All techniques of identifiability analysis are based on local sensitivity functions (REICHER, 2001). Those functions quantify the relation between the outputs and parameters of the model. Consider the generic system below:
It is the simplest method for identifiability analysis. According to BALSA-CANTO (2007) it is done calculating first the Fisher Information Matrix (+,-), as described in (9). Then, Covariance (./), and each index of the Correlation Matrix (.0) are calculated by (10) and (11), respectively.
= ( , , )
= ( , )
(3)
(4)
where is the vector of outputs, is the vector of inputs, is the vector of states and is the vector of parameters. The sensitivity function is mathematically defined as follow, where represents an initial parameter estimative, and subindexes i and j are, respectively, the output and the parameter. ⁄
=
+,- = ̅ 1 ̅
./ = +,- 2
⁄
(5)
⁄
3 = .4 ⁄5.4 .4
(11)
The elements of .0 can vary from 0 to ±1.Values of 3 = ±1 indicates that the parameters i and j are highly correlated. On the other hand, if 3 = 0 , parameters i and j do not have any correlation. With this method is only possible to analyse pairs of parameters and not all of them together; this is its main limitation.
In general, parameters and states of a model present distinct magnitudes. Therefore, scaling is a key issue that has a direct influence on the analysis result. Traditionally it is done according to (6): =
(10)
(9)
2.2.2. Method WEIJERS & VANROLLEGHEM (1997)
From +,-, described in (9) is possible to calculate the Condition Number (.9) and Determinant (:;<).
(6)
692
8th IFAC Symposium on Advanced Control of Chemical Processes Singapore, July 10-13, 2012 ABC .9 = =>ABC (?@ ⁄>( '
(12)
:;< =
5det (+,-)
GH
defined as the ratio between the largest and smallest singular value (BOYD et al. ., 1994):
(13)
V(-) = 5>(?@ (-1 -)⁄>( ' (-1 -)
ABC where >ABC (?@ and >( ' are, respectively, the largest and the smallest eigenvalue of Fisher Information Matrix and m is the number of parameters of the subset.
A matrix with large condition number is said to be ill conditioned, indicating great sensitivity to uncertainties. Additionally, is important to notice that conditioning is strongly dependent on inputs and outputs scaling (SKOGESTAD and POSTLETHWAITE, 2005). Therefore, scaling is a critical step in order to avoid wrong results. One way to calculate the scaling which leads to less conditioning is the minimum use of LMI (Linear Matrix Inequality), which leads to the following optimization problem (BOYD et al., 1994):
The Condition Number gives a measurement of the shape and the Determinant the size of the trust region. Therefore, the best subset of parameters is defined by those with the biggest DET and the smallest CN values. In the literature, there is no information about suitable limits for DET and CN. FRENI (2009) suggested that the best subset can be chosen as the n more frequent parameters in the 5% best evaluated subsets. 2.2.3. Method BRUN et al. (2001)
min V([-0)
First, the parameter importance index (δ) is calculated, according to (14). It indicates the influence of each parameter in the outputs, where )K is the number of outputs.
Y,Z
. !. [ ∈ ℜ^ _ ^ , ` a) b ) ca `!`4d d `)`!d 0 ∈ ℜe _ e , ` a) b ) ca `!`4d d `)`!d
M = N)K 2 $ ̅
(14)
PQ = 1⁄5>L( '
With the scaled sensitivity matrix, the best subset of parameters to be estimated can be selected. In this work, we are proposing to use the non-square Relative Gain Array Matrix (NSRGA). In control theory, the RGA (Relative Gain Array) is an indicator of the degree of interaction between variables, i.e., the influence that each manipulated variable (or parameter in this case) has on each output. Usually it is assigned for square systems, where the degree of interaction is indicated by the values of its main diagonal. If the RGA is close to 1, the manipulated variable will have little effect on the other outputs. Numbers close to 0.5 or very high indicate interaction (LUYBEN and LUYBEN, 1997).
(15)
High collinearity index indicates correlation among the parameters. According to BRUN et al. (2001) parameters subsets with collinearity index smaller than 5 are considered as good and values above 20 are critical.
For the study of identifiability, the use of square systems is not convenient, since in general, a model has more parameters than measured states. Therefore, we consider the Non-Square Relative Gain Array Matrix (NSRGA). The concept of NSRGA was proposed by (CHANG and YU, 1990), whose objective was to establish a criterion to minimize the sum of squared errors (stationary error or offset) for systems whose number of entries is lower than outputs, given that the perfect control of such systems is not possible due to the absence of degrees of freedom. Subsequently (ROSSITER and CAO, 1997) proposed the use of NSRGA to systems with more outputs than inputs as a form of pre-selection of the manipulated variables of the control system. The authors establish the following theorem: "The sum of each column of the matrix NSRGA is equal to the projection of a corresponding entry in the direction of the input space of the effective gain matrix." This means in practice that the value of the sum of each column is proportional to its influence and degree of interaction in the process. One of the main drawbacks of NSRGA when
3. PROPOSED METHODOLOGY In this paper a methodology based on control theory concepts is proposed. Usually these concepts are applied to analyse the level of interaction between inputs and outputs of a given system. Instead, the proposed methodology adapts those concepts to evaluate parameter-output correlation. First, it is proposed a scaling method of sensitivity matrix. As mentioned in subsection 2.1, practical identifiability studies are based on sensitivity matrix, which must be adequately scaled to avoid the effect of the discrepancy on the magnitude of the variables. The proposed method is capable to scaling the system without the need of previous knowledge about the the parameters ( ), as well as reference values for the outputs ( ), presented in (6). This new proposal is based on the minimum conditioning of sensitivity matrix. The condition number of a matrix RSTU of full rank is
(17)
where [ and 0 are the scaling matrix of -. Thus the concept described can be applied to the sensitivity matrix ( ), allowing it to be appropriately scaled.
Once the most important parameters are defined, the columns of sensitivity matrix should be normalized and a subset of parameters (L) selected, giving priority to those with greater importance index. Afterwards, the normalized Sensitivity Matrix is restructured, keeping only the columns of the selected subset. Finally, the subset collinearity index is calculated by (15), where >L( ' is the smallest eigenvalue of sensitivity matrix formed by the parameter subset. 'O
(16)
693
8th IFAC Symposium on Advanced Control of Chemical Processes Singapore, July 10-13, 2012
compared to the traditional RGA is that it depends on the scaling of the gain matrix.
4. RESULTS AND DISCUSSION 4.1. Case Study 1 – Continuous Alcoholic Fermentation
It can be adapted to the identifiability analysis, using the sensitivity matrix instead of gain matrix. In this case, each column of NSRGA matrix corresponds to a parameter of the model. It can be mathematically defined as: fghij = g k gl
m
4.1.1. Model Description The case study is a continuous fermentation producing ethanol, using the yeast Saccharomyces cerevisiae, conducted in a CSTR. The mass balance is described by (19), (20) and (21), where D is the dilution rate; Cp, Cs e Cx are the concentrations of product, substrate and cells; YP/S and YX/S are ethanol and cells yield; and µ is the specific growth rate. For this type of process, there are a lot of kinetic models available in the literature. BORGES (2008) presents more than 35 kinetic models; the model of AIBA (1968) coupled to EDWARD (1987) was used in this paper. Table 1 shows the parameters original values for this model.
(18)
where gl is the pseudo-inverse of sensitivity matrix and k indicates matrix multiplication. In this case, the interaction among parameters is calculated summing the lines for each column of NSRGA matrix, whose values can vary between 0 and 1. Considering that each column represents a parameter, when its sum is close to 1, the corresponding parameter has a small correlation with the other ones.
. ⁄ ! = (. n . ). : n . . o ⁄p_/r .c⁄ ! .c n .c . : ps/r ⁄p_/r . t. o . ⁄ ! . n . . : . . o o o(?@ u. ⁄ v& . wd 2 xyz&lxGz^
From the presented concepts, the following procedure is proposed to select the subset of parameters that should be estimated: a.
Scaling the sensitivity matrix S by the minimizing condition number procedure or by the Brun et al. (2001) scaling recommendations,
b.
Calculate the NSRGA matrix, according to (18).
c.
Sum the elements of each column.
d.
Select the parameters whose columns present values bigger or smaller than a chosen cut factor (ξ), previously defined for the system.
(19) (20) (21) (22)
Table 1. Original parameters of Kinetic model. µmax 0,3052 h-1
A fundamental issue of this method is the correct selection of the cut factor (ξ). Based on our experience, a practical rule is to consider the sum elements of each column of NSRGA matrix for a hypothetical model, as show Fig. 2. In general, is possible to verify that the parameters are grouped in regions, according to this sum. So, the rule proposed in this work establishes that the parameters contained on the first and second region should be estimated. In some cases, the third region could be included, depending on the discrepancy between it and the previous region.
K2 2,316 g/L
K1 0,008 L/g
YP/S YX/S KS 0,028 0,445 0,026 L/g ----SOURCE: Borges(2008).
The perturbations were criteriously applied in the dilution rate. The other input variables were not changed: Cs0 = 100 g/L, Cp0 = 0 g/L, and Cx0 = 0 g/L. All states were considered as being measured. The initial condition corresponds to the stationary solution at t=0. Data were generated from the simulation using the parameter values presented in Table 1. Subsequently, a variation of -5% in the original parameters value was performed, and these values were used as the initial value (θ0) in further analysis. 4.1.2. Sensitivity Analysis Matlab 5.3® was used to calculate the scaling through minimum conditioning. All other studies were made in Python, using the Jmodelica package, developed by Åkesson et al. (2010). The solver IDAS, available in this package solve the differential-algebraic system and provides, automatically, the sensitivities of the outputs related with the parameters. After determining the sensitivities, the scaling is done. For the methodology proposed by BRUN et al. (2001) was considered equal to the initial parameter values and equal to the average of simulated outputs. Figure 3 depicts the result of dynamic sensitivities, , !), for the state P, considering different methods for scaling.
region 1 region 2 region 3 region 4
Fig. 2. Sum of NSRGA matrix columns Different from the other methods, which require the evaluation of a large number of combinations of parameters subset to define the best one, the main advantage associated with our approach is that NSRGA matrix is computed only once, allowing the selection of the best set of parameters in a single step, avoiding combinatorial problems.
Comparing Fig. 3(a) and Fig. 3(b) is possible to verify a considerable reduction in the variability of sensitivities after scaling the variables. Comparing now Fig. 3(b) FIG. 3(c) one can verify only a small improvement in the scaling. 694
8th IFAC Symposium on Advanced Control of Chemical Processes Singapore, July 10-13, 2012
Ž€ †‡ˆ‰Š‹Œ
| = }~• $ $ (€• n €ƒ‚ )…„ R
(a)
(b)
• •
„ •
R• = •• ⁄•••
(23) (24)
Table 3. Results using the Collinearity Index Criterion proposed by Brun et al. (2001) Scaling of the Sensitivity Matrix No Scaling BRUN et al. (2001) Minimum Conditioning
(c) Fig. 3. Dynamic sensitivities for state P: (a) no scaling, (b) scaling suggested by Brun et al. (2001) and (c) scaling through minimum conditioning.
Best Subset
Collinearity Index
K1, YP/S µmax, YP/S, YX/S YP/S,YX/S, K1
5.14
79.58
4.22
3.60
3.57
2.70
Square Error(J)
The determination of the best parameters subset was performed by the partial NSRGA method. Fig. 4 shows the result of the sum of columns of this matrix.
4.1.3. Identifiability analysis For comparative purposes, this step was performed in two ways: applying the new methodology proposed in this work (using NSRGA matrix) and using the method proposed by BRUN et al. (2001), since it is the most applied method and the one considered more efficient in the literature. First, applying BRUN’s method, the parameters were ranked according to their importance in the model (14), as shown in Table 2. Then, the optimal subset of parameters was selected.
(a)
No scaling θj K1 K2 YX/S µmax YP/S KS
{j 5375.5 1569.1 447.1 310.95 78.73 1.53
θj µmax K2 K1 YP/S YX/S KS
{j 3.01 1.39 1.36 0.97 0.41 0.11
(c)
Fig. 4. Result of Sum of NSRGA matrix columns to Continuous Alcoholic Fermentation case: (a) No Scaling, (b) Scaling by Brun et al. (2001) and (c) scaling by Minimum conditioning.
Table 2. Parameter Importance Index to Continuous Alcoholic Fermentation case Using the Scaling suggested by BRUN et al.
(b)
Using the Scaling through Minimum Conditioning θj {j YP/S 2.02 YX/S 1.63 K1 1.39 K2 1.10 µmax 1.00 KS 0.75
After the sum of NSRGA columns, the best subset of parameters was selected, according with the rule described in Section 3. The parameters of these subsets were estimated according to (23) and (24). Table 4 summarizes the results obtained with the parameters sorted by the NSRGA Criterion proposed in this paper. Table 4. Results using the NSRGA Criterion
From Table 2 is possible to see that the importance of each parameter is significantly different in each one of the scaling method, emphasizing the influence that such mathematical treatment provides on the results.
Scaling of the Sensitivity Matrix No Scaling
Best Subset K1, YP/S ,YX/S
Square Error (J) 2.70
BRUN et al. (2001)
µmax, YP/S, YX/S
3.60
Minimum Conditioning
K1, YP/S ,YX/S
2.70
Comparing the results in Table 3 and Table 4, regardless the methodology used, the adherence (or quality) of the model improved considerably after the estimation. The Collinearity Index Criterion without scaling showed the worst result, whereas NSRGA criterion was less sensitive to scaling. Considering the scaled systems by minimizing condition number procedure the same result is achieved, but it is worth to mention that NSRGA matrix analysis is simple and quick to be calculated. On the other hand, the Collinearity
After the determination of the best subset, the parameters were estimated minimizing the least square function, according to (23) and (24). Table 3 summarizes the results obtained with the parameters selected using the Collinearity Index suggested by Brun et al. (2001). It is important to mention that, before the estimation, i.e., when all the parameters are equal to θ0, the square error (J) was 3363.67. 695
8th IFAC Symposium on Advanced Control of Chemical Processes Singapore, July 10-13, 2012
Criterion, needs combinations.
to
be
calculated
for
all
possible
condition number procedure the same result is achieved, but it is worth to mention that NSRGA matrix analysis is simple and quick to be calculated. On the other hand, the Collinearity Criterion needs to be calculated for all possible combinations. Based on these results we recommend to scale the sensitivity matrix using the minimizing condition number procedure and to select the best parameters to be estimate based on the NSRGA criterion.
The scaling through minimum conditioning gave the best results, with the advantage that there is no need of prior information about the system, such as the variability of parameters (∆•„ ) and reference values for the outputs (€‡• ).
In order to check the reliability of the obtained results, the estimation of all possible subsets of three parameters was also performed, as shown in Fig. 5.
REFERENCES Åkesson, J., Årkezén, K., Gäfvert, M., Bergdahl, T., Tummescheit, H. (2010). Modeling and Optimization with Optimica and JModelica.org-Languages and Tools for Solving Large-Scale Dynamic Optimization Problem. Computers and Chemical Engineering, 34, 1737-1749. Balsa-Canto, E., Rodriguez-Fernandez, M., Banga, J. R. (2007). Optimal design of dynamic experiments for improved estimation of kinetic parameters of thermal degradation. Journal of Food Engineering, 82, 178-188. Borges, P. (2008). Otimização dinâmica da fermentação alcoólica no processo em batelada alimentada, Masters Thesis, Federal University of Uberlândia, Uberlândia. Boyd, S. P., Ghaoui, L., Feron, E., Balakrishnan. V. (1994). Linear Matrix Inequalities in System and Control Theory. SIAM, Philadelphia. Brun, R., Reichert, P., Künsch, H. (2001). Practical identifiability analysis of large environmental simulation models. Water Resources Research, 37, 1015-1030. Cao, Y., Rossiter, D. (1997). An input pre-screening technique for control structure selection. Computers and Chemical Engineering, 21, 563-569. Chang, J., Yu, C. (1990). The relative gain for non-square multivariable systems. Chemical Engineering Science, 45, 1309-1323. Dochain, D., Vanrolleghem, P. (2001). Dynamical modelling and estimation in wastewater treatment processes, IWA Publishing, London. Freni, G., Mannina ,G., Viviani , G. (2008). Identifiability analysis for receiving water body quality modelling. Environmental Modelling & Software, 24, 54-62 . Luyben, M., Luyben, W. (1997). Essentials of process control, McGraw-Hill, New York. Reichert. P., Vanrolleghem. P. (2001). Identifiability and uncertainty analysis of the river water quality model no. 1 (RWQM1), Water Science and Technology, 43, 329– 338. Skogestad, S., Postlethwaite, I. (2005). Multivariable Feedback Control: Analysis and Design, Wiley, New York.
Fig. 5. Error Map: estimation of all subsets of 3 parameters. Fig. 5 shows that the best subset was K1, YP/S, YX/S. Note that it is the same proposed by both methods, when using scaling through minimum conditioning procedure. Additionally, all subsets containing both YP/S and YX/S are the ones with the smallest error, which denotes that these parameters have a strong influence in the model. This fact is consistent with the importance index obtained when scaling through minimum conditioning (Table 2) as well as the highest summation values presented in Fig. 4 . 4. CONCLUSIONS The identifiability analysis is a fundamental tool in the identification of dynamic models of high complexity, since it provides an overview of possible combinations of parameters that can be estimated. Taking into account practical identifiability, several techniques are available. All of them are based on the sensitivity matrix and, therefore, such matrix should adequately reflect the influence of each parameter on the outputs. Hence, it is essential to properly choose the scaling method. In this work, the proposed methodology, using the minimum conditioning to scale the sensitivity matrix was more efficient than that most used in the literature, suggested by Brun et al. (2001). The main advantage is that it does not require prior knowledge about the variability of system parameters. The Collinearity Index and NRGA Criteria have been analysed in the paper to select the parameters to be estimated. The Collinearity Criterion without scaling showed the worst result, whereas NSRGA criterion was less sensitive to scaling. Considering the scaled systems by minimizing
696