Computers and Chemical Engineering 31 (2007) 307–317
New approach for microkinetic mean-field modelling using latent variables

Jonas Sjöblom, Derek Creaser

Department of Chemical and Biological Engineering, Chemical Reaction Engineering, Chalmers University of Technology, S-412 96 Göteborg, Sweden

Received 1 February 2006; received in revised form 22 June 2006; accepted 19 July 2006. Available online 7 September 2006.
Abstract

A detailed microkinetic model for the storage and reduction steps in a lab-scale NOX storage reactor was taken from the literature, and a large number of parameters were subject to fitting to a new set of experimental transient data. The parameters chosen for fitting were selected by sensitivity analysis of the Jacobian matrix of all adjustable parameters using latent variable (LV) models (partial least squares, PLS). The analysis of the LV structure of the Jacobian matrix is important because it indicates the experimental rank, i.e. how many parameters are relevant to fit. Two slightly different methods, both using LV models, are presented: one to select good candidate parameters for fitting, and one to fit linear combinations of many parameters that span only the experimental space. By using this methodology, the risk of overfitting is reduced and the chances of successful fitting are increased, because an appropriate number of parameters is supplied that are also uncorrelated (independent) for the given experiment. This methodology contributes to a better understanding of the catalytic process as well as a potential for more efficient parameter fitting for complex transient heterogeneous catalytic systems. The application of this kind of sensitivity analysis offers a new approach to microkinetic modelling.

Keywords: Microkinetic modelling; Parameter fitting; NOX storage and reduction; Sensitivity analysis; Latent variables; PLS
1. Introduction

1.1. Background
With the increasing capacity of modern computers, together with increasing knowledge about complex systems such as heterogeneous catalysis, the microkinetic models for these systems tend to become more and more complex. Very often these models serve their purpose well: they are good for predictions, they increase our understanding of the catalytic mechanism and they pinpoint interesting subjects for future research. Even though these models are very complex, their fit to experimental data is often not excellent. This "fact" seems to be accepted by the modelling community and excused by the argument that parameter fitting of complex models is very difficult. Even when the number of adjustable parameters in the model is governed by the amount of uncertainty, and the model mechanism is assumed to be accurate enough to reproduce the experimental data, it is still very difficult to obtain a close fit. All scientists doing modelling are of course aware of the model assumptions and limitations. However, these limitations are assumed to be compensated for by adjusting some of the adjustable parameters in order to get the desired overall fit. We feel it necessary to work on this parameter fitting issue and to try to develop methods and strategies for a better fit to experimental data, without spending inordinate amounts of time using software to produce a reasonable fit. Model parameter fitting in this study is applied to the transient analysis of a Pt–Ba–Al2O3 NOX storage and reduction (NSR) catalyst using propene as a reducing agent. In this paper we outline some new strategies for parameter fitting for complex models of catalytic systems. The aim of this work is to demonstrate the importance of correlated parameters and to present some new techniques to overcome the problems that parameter correlation presents.

2. Theory

2.1. Microkinetic modelling
Microkinetic modelling (MKM) is a well-established and powerful modelling technique for catalytic reactions (Dumesic et al., 1991).
Nomenclature

A         scalar indicating the number of components (also pre-exponential factor)
C (c)     loading matrix (vector) used for the approximation of Y
Ea        activation energy in the Arrhenius expression
f         optimizing function (e.g. square of the difference experiment − simulation)
H         Hessian matrix
J, J      Jacobian, Jacobian matrix
k         scalar indicating the number of fitted parameters (also rate constant)
m         scalar indicating the number of measured signals
n         scalar indicating the number of experimental observations (time points)
P         loading matrix used for the approximation of J
T         score matrix, approximation of the J matrix
T         temperature
U         score matrix, approximation of the Y matrix
W         loading matrix used for the approximation of J (reflecting the covariance with Y)
W* (w*)   loading matrix (vector) related to J, used to produce the PLS regression matrix
Y         un-squared optimizing function matrix (experiment − simulation)

Greek letters
βPLS      regression matrix for a PLS model
θ, θ      model parameter, parameter vector
The method of MKM consists of modelling all reactions as elementary steps. This enables a deeper analysis of the involved steps, and a deeper understanding of the overall process becomes possible. However, it assumes that the majority of the kinetic parameters are known or can be estimated a priori, and that only a few parameters are unknown but can be determined by some fitting procedure. To obtain the theoretical parameter values one can use a range of techniques, e.g. kinetic gas theory, transition state theory, group contribution methods and density functional theory, as well as the output from other, independent experiments (published values). In addition, other known relations, such as thermodynamic consistency for reversible reactions, are used in order to decrease the number of unknown parameters. In MKM all parameters are "physical", i.e. they should have the same value irrespective of how they were determined, e.g. from theory or experiment. However, all parameters (whether from theory or from experiment) are subject to errors. Therefore, we believe that the notion of a "known parameter" should be relaxed and that actually all kinetic parameters could be subject to fitting. The MKM method has proven useful many times, but in all successful applications at least one of the following factors was involved:
• Steady state experiments where many of the included reactions are never rate determining. The actual values of many kinetic parameters therefore do not need to be known, merely not unrealistic (see e.g. Aghalayam et al., 2000).
• A mechanism developed in a sequential manner, including parameter fitting in each step. This ensures that each "subset" of reactions works reasonably well before they are put together in a larger mechanism (see e.g. Olsson et al., 2001, 2002).
• The fitted parameters are carefully chosen to be the most important ones, in order to compensate for possible deviations in the theoretical parameters.

Nevertheless, MKM is an attractive method that one would wish to succeed with, due to the potential for increased understanding of the catalytic mechanism and due to the fact that a microkinetic model should be better at making accurate predictions beyond the experimental conditions at which it was developed.

2.2. Sensitivity analysis using latent variable methods

2.2.1. Sensitivity analysis
Sensitivity analysis is a measure of how sensitive some variable or function is to small changes in some of the model parameters (Turanyi, 1997). The variables could be the optimizing function (the residual in the case of curve fitting) or any other observable or unobservable variable, such as gas phase concentrations, reaction rates or surface coverages. The model parameter could be any parameter, such as rate constants (pre-exponential factors, activation energies) or other types of parameters (number of active sites, catalyst properties, experimental parameters). For parameter fitting of heterogeneous catalytic reactor experiments, the optimization function is often the squared residual, (experimental − simulated)², and an analytical expression for the sensitivity is most often not available. The sensitivity, or derivative, is then calculated using finite differences. The derivative of the residual with respect to some parameter is also known as the Jacobian. The Jacobian is a local sensitivity matrix of the optimizing function with respect to a set of parameters:

J ≡ ∂f/∂θ    (1)

The optimizing function f has size n × m, where n is the number of observations and m is the number of measured signals used for the residual calculation. θ is the parameter vector of size k, and J is the Jacobian of size nm × k, due to prior "unfolding" of f.
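As an illustration of Eq. (1), the following is a minimal sketch (in Python; the actual work used Matlab) of a forward-difference Jacobian of the unfolded residual vector. The function simulate(theta) is a hypothetical helper returning the residual vector for a given parameter vector, and the relative step size is an assumption of the sketch, not a value taken from this study.

import numpy as np

def finite_difference_jacobian(simulate, theta, rel_step=0.01):
    """Forward-difference Jacobian of the unfolded residual vector.

    simulate(theta) -> residual vector of length n*m (experiment - simulation,
    unfolded over time points and measured signals); theta is a 1-D array of length k.
    """
    r0 = simulate(theta)                                 # baseline residuals
    J = np.zeros((r0.size, theta.size))
    for j in range(theta.size):
        dtheta = rel_step * max(abs(theta[j]), 1e-8)     # relative perturbation of one parameter
        theta_p = theta.copy()
        theta_p[j] += dtheta
        J[:, j] = (simulate(theta_p) - r0) / dtheta      # one column per parameter
    return J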
2.2.2. Latent variable methods
Latent variable (LV) methods are a range of projection methods which all include the approximation of the original data matrix by a matrix of lower rank. The earliest example of an LV model was a special case of linear regression (Pearson, 1901), where an LV approach was used as a countermeasure when the independent variable is subject to error (it is commonly assumed to be exactly known). Singular value decomposition (SVD), with subsequent replacement of small singular values by zeros to produce an approximation of the original matrix that can be inverted, is another mathematical application of the LV approach.
Principal component analysis (PCA) and LV regression methods such as partial least squares (PLS) also use the LV approach, and often operate on "deflated" matrices (one component/dimension is extracted and the residual matrix is used for the next component). The application areas of LV methods range from psychometrics (e.g. using many questions from a questionnaire to predict different behaviour of the respondents), econometrics (e.g. trying to use available financial information to predict stock value evolution) and chemometrics (e.g. using many process variables from a chemical plant to predict product quality and thus enabling better quality control) to sensitivity analysis for kinetic modelling (Turányi et al., 1989; Vajda et al., 1985). A common property of the successful systems within these application areas is that they are linearizable, i.e. despite the apparent and inherent non-linearities, a linearization can be obtained (e.g. by intelligently chosen variable sets, by variable transformation or by the use of a sufficiently small experimental space). This feature is also one cornerstone of optimization theory, which suggests that LV models should potentially benefit an optimization method.

Partial least squares (also referred to as projections to latent structures (Eriksson et al., 2001)) was used in this study to find a regression model that predicts the residual (experimental − simulated, denoted Y) from the Jacobian matrix J. The model can be described as follows (all matrices being centred):

1. The Jacobian is approximated by a score matrix T (of size N × A) and a loading matrix P (of size k × A). The number of components, A, is determined in step 5 by the method of cross validation, as described e.g. in (Eriksson et al., 2001):

   J ≈ T Pᵀ    (2)
2. The optimizing function (in this case with more than one column) is also approximated, by a y-score matrix U and a y-loading matrix C:

   Y ≈ U Cᵀ    (3)
3. The Jacobian (X) scores are then regressed against the residual (Y) scores:

   Û = 1 · T    (4)
4. where the regression coefficient is set to 1.
5. A loading matrix W (very similar to the loading matrix P) is chosen so that the covariance between Y and J is maximized.
6. The objective function (residual = experimental − simulated) is now predicted as

   Ŷ = J βPLS,   βPLS = W(PᵀW)⁻¹Cᵀ = W*Cᵀ    (5)

For more details about the algorithm see e.g. (Martens & Naes, 1989). The main advantage of LV methods is that they use projections to a smaller dimension than the original one. In the case of sensitivity analysis where one has k parameters, the smaller number of dimensions A represents the number of independent phenomena that are taking place, given the reaction mechanism, the actual set of parameters (and their values) and the performed experiment under study. This technique has proven beneficial for a number of applications, especially for determining the rate limiting step (model reduction), e.g. in combustion applications (Turanyi, 1997).

2.2.3. Correlation among fitted parameters
When applying the MKM methodology, one can often end up with several parameters that should be obtained by fitting the model to experimental data. This can be due to a lack of suitable theoretical estimation methods or a lack of success in previous fitting attempts. If the fitting "fails" (the minimization finds local minima far from satisfactory), correlation among the parameters may be the reason; this has been investigated for example by Jansson (2002). In fact, even if one has fitted as many parameters as the rank indicated by the sensitivity analysis, it may happen that the fitted parameters were not those that properly spanned the experimental space. One of the main messages of this work is that it is important to assess the model and data for such correlations and to be prepared to take action if necessary, or at least to understand why the fitting result may be unsatisfactory.

2.3. New fitting strategies using latent variable analysis
The main idea of this work is to take the different methods for sensitivity analysis, which are typically carried out on a "final" fitted model for the purpose of assessment (e.g. Deshmukh et al., 2004), and to apply them also during the fitting process. By repeatedly analysing the Jacobian, new sets of suitable fitting parameters can be selected to advance the fitting process. We propose two alternatives for this technique:

• Method I: parameter selection for reducing parameter correlations.
• Method II: fitting latent variables instead of original parameters to ensure orthogonality.

The details are given below. The central idea is to use the latent variable structure to improve the chances of success of the fitting algorithm by providing only independent parameters for fitting. It will turn out that this methodology offers an alternative to the traditional microkinetic modelling maxim, "Estimate what you can and fit the rest", replacing it with "Fit only parameters that span the experimental space".
As a result, a key difference is that with the traditional application of MKM many parameters are estimated and only a few are fitted to the experimental data, whereas in the method demonstrated here all parameters are initially estimated and considered as candidates for fitting, but only a few are selected based on how well they span the experimental space.
2.3.1. Example of sensitivity analysis
For the studied model (details are given below in the experimental section) we have 75 parameters and three measured signals, and thus three residuals. We apply local sensitivity analysis on the squared residuals by finite differencing, obtain the Jacobian matrix J [N × k] and use the un-squared residuals as Y. The resulting regression model includes a loading matrix W*, which consists of coefficients that project the original data onto a low-dimensional hyper-plane whose directions are denoted "components". For further detail see e.g. (Eriksson et al., 2001; Martens & Naes, 1989). By studying plots of different components, one can analyse the correlation structure, as exemplified in Fig. 1. Different groups of parameters can be seen in Fig. 1. Parameters within a group are similar or correlated, i.e. they have the same effect on the residual. Groups that are found on opposite sides of the origin are also similar but negatively correlated. This analysis is performed for different component combinations for a particular model. If one were to try to fit two or more parameters within a group (as indicated by the circles in Fig. 1), there is a substantial risk that the fitting procedure would not be effective and the accuracy of the fit would be low.
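The following is a minimal sketch of such a regression model, using scikit-learn's PLSRegression in place of the Simca-P software used in this study. J and Y are assumed to be arrays of shape (N, k) and (N, m), the number of components is fixed here rather than chosen by cross validation, and the scikit-learn attribute names are assumptions of the sketch (x_rotations_ plays the role of W*).

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# J: (N, k) Jacobian of the residuals, Y: (N, m) residuals (experiment - simulation)
def fit_lv_model(J, Y, n_components=9):
    pls = PLSRegression(n_components=n_components, scale=True)
    pls.fit(J, Y)
    W_star = pls.x_rotations_      # (k, A), corresponds to W*: projects parameters onto components
    scores = pls.x_scores_         # (N, A), corresponds to the score matrix T
    return pls, W_star, scores

# A loading/weight plot analogous to Fig. 1 could then be drawn by plotting
# W_star[:, 0] against W_star[:, 1] and looking for groups of parameters.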
If the "missing" parameters (i.e. the parameters for which we lack a reliable estimate) tend to group together, we have a "problem" with different options:

1. Accept the current situation and realize that there is correlation between the parameters. The fit may be good, but its validity still remains to be proven.
2. Perform new experiments that expand the experimental space and allow independent estimation of the unknown parameters.

2.3.2. Method I: selection of parameters to fit
In this study we had estimates (theoretical and published values) for all parameters to be used as a first attempt. The "unknown" parameters were to be chosen so that the fit would be as good as possible. Note, however, that we consider ALL parameters to be subject to fitting (Frame 1). Given that several parameters within a group (as shown in Fig. 1) have the same effect on the Jacobian, and that we also have prior knowledge about the uncertainty and confidence of the different parameter values, we chose the most uncertain ones to be fitted (a sketch of this selection is given after Frame 1).
Fig. 1. Loading plot of the first two components in the regression model between the Jacobian matrix (parameters indicated by triangles) and the residuals (indicated by squares). Different groups are indicated by circles.
Frame 1. Method I: how to choose parameters by using a LV model.
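Frame 1 summarizes Method I in words. Below is a sketch of how the VIP values (see Section 4.1) and the resulting parameter selection could be computed from the PLS model of the earlier sketch; the standard VIP formula is used, and the manual swaps of already-represented forward/reverse parameters described in Section 4.1 are not automated here.

import numpy as np

def vip_scores(pls):
    """Variable importance in the projections (VIP) for a fitted PLSRegression model."""
    W = pls.x_weights_             # (k, A) weight matrix W
    T = pls.x_scores_              # (N, A) score matrix
    Q = pls.y_loadings_            # (m, A) y-loadings
    k, A = W.shape
    # explained sum of squares of Y per component
    ssy = np.array([(T[:, a] @ T[:, a]) * (Q[:, a] @ Q[:, a]) for a in range(A)])
    Wn = W / np.linalg.norm(W, axis=0)                 # normalise each component's weights
    return np.sqrt(k * (Wn**2 @ ssy) / ssy.sum())

def select_parameters(pls, parameter_names, n_select):
    """Method I: pick the n_select parameters with the highest VIP."""
    order = np.argsort(vip_scores(pls))[::-1]
    return [parameter_names[i] for i in order[:n_select]]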
Frame 2. Method II: how to choose parameter combinations (loadings) to use for fitting.
2.3.3. Method II: fitting in LV space
Instead of selecting a few kinetic parameters to fit and letting them account for the model deficiency (as discussed above), we can choose to fit many kinetic parameters but restrict the search to the latent variable space determined by the regression model described in Section 2.3.1. The search algorithm (in our case the "lsqnonlin" Matlab function) fits the scores of the different components, as described in Frame 2. This methodology searches over all parameters (or over any large number of parameters), but the search is limited to span the experimental space. As a result, the fitting algorithm should have fewer problems due to collinearity among the fitted parameters, because they are by definition independent.
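Frame 2 summarizes Method II. A minimal sketch, with scipy's least_squares standing in for Matlab's lsqnonlin and with simulate(theta) again a hypothetical residual function, could look as follows; theta0 is the vector of (scaled) starting values and W_star the loading matrix from the LV model.

import numpy as np
from scipy.optimize import least_squares

def fit_in_lv_space(simulate, theta0, W_star, n_components=None):
    """Method II: fit score values t instead of the original (scaled) parameters.

    The search is restricted to the latent variable space: theta = theta0 + W_star @ t,
    so the fitted parameter combinations are orthogonal by construction.
    """
    A = W_star.shape[1] if n_components is None else n_components

    def residuals(t):
        theta = theta0 + W_star[:, :A] @ t
        return simulate(theta)

    result = least_squares(residuals, x0=np.zeros(A))
    theta_fitted = theta0 + W_star[:, :A] @ result.x     # transform scores back to parameters
    return theta_fitted, result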
2.3.4. Comments on the proposed methods
The developed regression models are actually never used for any prediction of the objective function. The only purpose of making such regression models, instead of a simpler LV model (e.g. PCA as in (Vajda et al., 1985)), is to force the latent variables to focus on the residual, which is what we try to minimize. PCA models maximize the variance of TPᵀ, while PLS models maximize the covariance between "X" and "Y". The difference in performance between these two LV model types (PCA or PLS) should not be great, since the major advantage is the use of the lower-dimensional space with orthogonal components. The use of PLS, however, produces fewer components than a PCA model, which makes it more attractive from our point of view. To the best of our knowledge, this is the first time PLS has been used for this type of "model reduction". For other problems, PLS has been used to reduce, for example, the number of grid points in spatially resolved problems (Sun & Hahn, 2005). In this paper, the LV methods have been used only for the initial selection of parameters. After this "selection", an "ordinary" parameter-fitting algorithm ("lsqnonlin" in our case) was used. In a future study we intend to integrate this approach into each iteration step.

2.4. Theoretical aspects of the proposed procedures (motivation)
The optimization algorithm used ("lsqnonlin" in Matlab) has the same features as any nonlinear programming algorithm. It assumes that the objective function can be approximated by a second-order function. One of the prerequisites for this to be effective is that the current parameters are "close enough" to the optimal point. Another condition for this algorithm to be effective is that the approximation of the Hessian by JᵀJ is reasonable:

H = ∂²f/∂θ² = 2JᵀJ + 2(Ŷ − Yexp) ∂²Ŷ/∂θ² ≈ 2JᵀJ    (6)

The use of uncorrelated parameters will actually improve this approximation, because the off-diagonal elements in the Hessian matrix will then be zero.
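As a small illustration of Eq. (6), the Gauss–Newton Hessian approximation and a parameter correlation matrix can be computed directly from a Jacobian. This is only an illustration of the argument above; the study itself works with the PLS analysis of J rather than with JᵀJ directly.

import numpy as np

def gauss_newton_hessian(J):
    """Approximate Hessian 2*J'J of the squared-residual objective, as in Eq. (6)."""
    return 2.0 * J.T @ J

def parameter_correlation(J):
    """Correlation matrix derived from J'J; large off-diagonal entries indicate
    parameter combinations that the experiment cannot separate.
    Assumes every parameter has some sensitivity (non-zero diagonal)."""
    H = J.T @ J
    d = np.sqrt(np.diag(H))
    return H / np.outer(d, d)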
With complex systems containing many unknown parameters that are badly estimated, we have (at least) two kinds of problems:

1. Too many parameters.
2. Parameter values "far away" from the optimal values.

The proposed methodology addresses the first kind of problem: knowing how many parameters to fit and which parameters to choose. The second kind of problem will only be solved if a reasonably "smooth" gradient towards the optimum is found, without ending up in a suboptimal point (local optimum). However, we believe that the chances of finding such a parameter pathway increase by choosing the appropriate parameters to fit.

3. Experimental

3.1. Experimental data
The experimental data was taken from previous work; all details can be found in (Abdulhamid et al., 2004). The experiments were performed in a lab-scale flow reactor on a cordierite monolith (15 mm long, 69 channels, 64 channels/cm2) loaded with Pt and BaO on an alumina washcoat. The experiments were performed at a space velocity of 52 000 h−1 (1730 ml/min at room temperature). For 40 min the catalyst was exposed to a gas stream containing 500 ppm NO2 (in argon). During this period the NO2 was stored on the catalyst as barium nitrate (Ba(NO3)2); the long storage time was chosen to ensure a saturated catalyst. Then, for 5 min, 222 ppm of propene was added to the gas stream. During this relatively short period, the propene reduced
the catalyst and eventually produced N2. This cycle (storage and reduction) was repeated for each experiment. NO, NO2, NOX (= NO + NO2), CO and CO2 were analyzed, and the storage and reduction experiments were performed at two different temperatures (250 and 350 °C). The experimental data used is displayed in Fig. 2. The signals for CO are not displayed in Fig. 2 because its concentration was very low at all times. As can be seen in Fig. 2, the reducing period shows substantial NOX slip (ideally the NOX level should be zero), indicating that the catalyst was not optimized for maximum NOX reduction. The long storage period also contributes to a lower NOX reduction performance than would be desirable for real applications. However, the objective of the study by Abdulhamid et al. was to investigate different reducing agents (H2, CO, propene and propane) and their influence on NOX reduction. Only the experimental data using propene was used in this modelling study.

Fig. 2. Experimental data extracted from (Abdulhamid et al., 2004).

3.2. Reactor model
The reactor model was adopted from (Westerberg et al., 2003). The monolithic catalyst was modelled as a series of continuously stirred tanks. The number of tanks was low, but increasing the number of tanks did not yield visible changes in the concentration profiles. The model assumes a uniform radial flow distribution and neglects radial temperature and concentration gradients, gas phase accumulation and diffusion resistance in the washcoat. The gas phase transport resistance was accounted for by a film model using the concentration gradient and a mass transfer coefficient.

3.3. Kinetic mechanism
The kinetic model was taken from (Olsson et al., 2001, 2002) and is outlined in Table 1. Fig. 3 depicts the scheme of the kinetic mechanism. (A small illustrative sketch of how a few of these steps translate into mean-field rate expressions follows the table.)

Table 1
Reaction mechanism

No.      Reaction                                                      Rate constants
(i)      NO(g) + Pt ↔ NO–Pt                                            r1, r2
(ii)     O2(g) + 2Pt ↔ 2O–Pt                                           r3, r4
(iii)    NO–Pt + O–Pt ↔ NO2–Pt + Pt                                    r5, r6
(iv)     NO2–Pt ↔ NO2(g) + Pt                                          r7, r8
(v)      NO2(g) + BaO ↔ NO2–BaO                                        r9, r10
(vi)     NO2–BaO ↔ NO(g) + BaO2                                        r11, r12
(vii)    NO2(g) + BaO2 ↔ BaO–NO3                                       r13, r14
(viii)   NO2(g) + BaO–NO3 ↔ Ba(NO3)2                                   r15, r16
(ix)     2O–BaO ↔ 2BaO + O2                                            r17, r18
(x)      NO2–Pt + BaO–NO3 ↔ Ba(NO3)2 + Pt                              r19, r20
(xi)     C3H6(g) + Pt ↔ C3H6–Pt                                        r21, r22
(xii)    C3H6–Pt + 2Pt → 3CH2–Pt                                       r23
(xiii)   CH2–Pt + 3O–Pt → CO2(g) + H2O(g) + 4Pt                        r24
(xiv)    CH2–Pt + 3NO2–BaO → CO2(g) + 3NO(g) + H2O(g) + Pt + 3BaO      r25
(xv)     3NO–Pt + CH2–Pt → CO2(g) + H2O(g) + 1.5N2 + 4Pt               r26
(xvi)    O–Pt + C3H6(g) → C3H6O–Pt                                     r27
(xvii)   8O–Pt + C3H6O–Pt → 3CO2(g) + 9Pt + 3H2O                       r28
(xviii)  CH2–Pt + 3Ba(NO3)2 → 3NO(g) + CO2(g) + 3BaO–NO3 + H2O + Pt    r29
(xix)    CO2(g) + BaO ↔ BaCO3                                          r30, r31
(xx)     CH2–Pt + S2 ↔ Pt + CH2–S2                                     r32, r33
(xxi)    3NO2–Pt + CH2–S2 → 3NO(g) + 3Pt + CO2(g) + H2O(g) + S2        r34
(xxii)   3NO–Pt + CH2–S2 → CO2(g) + 3Pt + H2O(g) + S2 + 1.5N2(g)       r35
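To make the mean-field formulation concrete, the sketch below writes out rate expressions and coverage balances for reactions (i)–(iii) of Table 1 only, with placeholder rate constants and constant gas-phase mole fractions; these values are assumptions of the sketch, not the fitted ones, and the full model additionally contains the storage and reduction steps, the site balances for the storage phases and the reactor model of Section 3.2.

import numpy as np
from scipy.integrate import solve_ivp

# Placeholder rate constants (illustrative values only)
k1, k2 = 1.0e3, 1.0e1     # NO adsorption / desorption, reaction (i)
k3, k4 = 5.0e2, 1.0e0     # O2 dissociative adsorption / desorption, reaction (ii)
k5, k6 = 1.0e2, 1.0e1     # NO-Pt + O-Pt <-> NO2-Pt + Pt, reaction (iii)

p_no, p_o2 = 5.0e-4, 8.0e-2      # gas mole fractions, held constant in this sketch

def coverage_odes(t, theta):
    th_no, th_o, th_no2 = theta
    th_vac = 1.0 - th_no - th_o - th_no2              # free Pt sites (mean-field site balance)
    r_i   = k1 * p_no * th_vac    - k2 * th_no
    r_ii  = k3 * p_o2 * th_vac**2 - k4 * th_o**2      # per O2 molecule
    r_iii = k5 * th_no * th_o     - k6 * th_no2 * th_vac
    return [r_i - r_iii,           # d(theta_NO)/dt
            2.0 * r_ii - r_iii,    # d(theta_O)/dt
            r_iii]                 # d(theta_NO2)/dt

sol = solve_ivp(coverage_odes, (0.0, 10.0), [0.0, 0.0, 0.0], method="LSODA")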
3.4. Numerical methods
Matlab 6.5 from Mathworks (using the functions "ode15s" and "lsqnonlin") was used for parameter fitting. Thermodynamic consistency was preserved during all parameter fitting. For the PLS analysis, Simca-P 10.0 from Umetrics was used. Starting from the values of the pre-exponential factors and activation energies published in (Olsson et al., 2001, 2002), some "manual tuning" was found necessary in order to get any improvement at all (e.g. to get the ODE solver to converge). Then the fitting procedures were applied, as described in Sections 2.3 and 3.5. It should be noted, however, that the Jacobian used was actually not the one described in Eq. (1). Each residual was squared and summed row-wise, thus producing an n × k matrix instead of nm × k. This has the benefit of easier interpretation of the time dependence, but may suffer from the lumping of the different residuals.
The Jacobian was calculated using finite differences with a perturbation of 1–5%, where a change of 100% corresponds to a doubling or halving of the reaction rate. All 75 adjustable parameters were included in the sensitivity analysis (35 pre-exponential factors, 35 activation energies, 3 types of sites and 2 coefficients (α) for coverage-dependent activation energies). For the subsequent LV modelling, some of the parameters were excluded as a consequence of exterior knowledge (described in the next section) as well as for other reasons (e.g. small activation energies were not included in the fitting procedure).

Fig. 3. Kinetic mechanism for the NOX storage and reduction process.

3.4.1. Use of "exterior" knowledge
In order to put as much information as possible into the model (without making further assumptions or additions to the mechanism, etc.), the following data could be retrieved from the experimental curves:

• Integration of the first NOX storage signal gave the number of storage sites in the washcoat (a minimal sketch of this integration is given after the list). (Note: we know that NOX can be stored on alumina, especially at low temperatures, but since the mechanism does not contain NOX adsorption on alumina, we must force the number of available sites to be included in the existing mechanism.)
• The first storage was carried out with a fresh catalyst and the second storage was carried out after regeneration. Because the regeneration was not complete, a comparison of the first and second storage periods yielded an estimate of the surface coverage of NOX storage sites at the end of the regeneration.
• A carbon balance gave the amount of reacted propene during the regeneration and subsequent storage. Only 58% of the incoming propene reacted on the catalyst, assuming 100% conversion to CO and CO2 of the propene that adsorbs on the surface.
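For the first item in the list, a minimal sketch of the integration of the storage signal is given below; the variable names, the ideal-gas conversion of the volumetric flow and the trapezoidal integration are assumptions of the sketch, not details given in the paper.

import numpy as np

def stored_sites_from_breakthrough(time_s, c_in_ppm, c_out_ppm, flow_ml_min, T_K=298.15):
    """Moles of NOX stored = integral of (c_in - c_out) * molar flow over the storage period."""
    t = np.asarray(time_s, dtype=float)
    total_flow_mol_s = (flow_ml_min / 60.0) * 1e-6 * 101325.0 / (8.314 * T_K)  # ideal gas at 1 atm
    stored_flow = (np.asarray(c_in_ppm) - np.asarray(c_out_ppm)) * 1e-6 * total_flow_mol_s
    # trapezoidal integration of the stored NOX molar flow
    return float(np.sum(0.5 * (stored_flow[1:] + stored_flow[:-1]) * np.diff(t)))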
3.4.2. Pre-treatment of parameters
Scaling of parameters is an important issue when dealing with parameter fitting. Since we want to find the values of different kinds of parameters, we need to convert all parameters to the same "scale". We also want to break up the inherent correlation structure of the Arrhenius expression. In the fitting procedures, the following scaling was performed:

• "Centring" of the pre-exponential factors was performed according to

  k = A e^(−Ea/RT) = A e^(−Ea/RT0) e^(−(Ea/R)(1/T − 1/T0)) = kref e^(−(Ea/R)(1/T − 1/T0))    (7)

  and thus kref was fitted instead of A.
• All kref were logarithm transformed in order to obtain a linear relationship for the desired reaction-rate change after subsequent scaling.
• All parameters were finally scaled according to

  θscaled = (θun-scaled − θmean)/weight    (8)

  where θmean was the nominal value or the original starting-guess value. For the weight, the following rule was applied: a scaled parameter value of +1 doubled the reaction rate and a value of −1 halved it. For kref this was enabled by the preceding logarithmic transformation.
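The centring and scaling of Eqs. (7) and (8) can be sketched as follows; the use of ln(2) as the weight for ln(kref) is one way of realizing the stated rule that ±1 should double or halve the reaction rate, and is an assumption of the sketch.

import numpy as np

R = 8.314  # J/(mol K)

def centre_rate_constant(A, Ea, T, T0):
    """Eq. (7): k = A*exp(-Ea/(R*T)) rewritten around a reference temperature T0."""
    k_ref = A * np.exp(-Ea / (R * T0))
    return k_ref * np.exp(-(Ea / R) * (1.0 / T - 1.0 / T0))

def scale_parameter(theta, theta_mean, weight):
    """Eq. (8): scaled parameter; for ln(k_ref) a weight of ln(2) makes
    +1 correspond to a doubled and -1 to a halved reaction rate."""
    return (theta - theta_mean) / weight

# example for a pre-exponential factor: work with ln(k_ref) and weight ln(2)
# theta_scaled = scale_parameter(np.log(k_ref), np.log(k_ref_0), np.log(2.0))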
3.5. Iterative parameter fitting procedure
The parameter fitting was not performed in one step. First, a number of manual adjustments were necessary in order to get close at all. Next, a number of iterations were performed, where each iteration consisted of:

1. A sensitivity analysis of all parameters.
2. LV modelling, in which the parameters to fit (method I) or the combinations of parameters to fit (method II) were selected.
3. Fitting using the optimisation algorithm.
4. Evaluation of the fit and further manual adjustments in order to "move away" from local minima.
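The iteration described above can be summarized in a schematic driver loop. The sketch below reuses the hypothetical helpers from the earlier sketches (simulate, finite_difference_jacobian, fit_lv_model, select_parameters) and is therefore not self-contained; for simplicity the unfolded residual is used as a single y-variable, whereas the study lumps the residuals into an n × k Jacobian and uses the three residuals as Y, and the manual adjustments of step 4 are of course not automated.

import numpy as np
from scipy.optimize import least_squares

def iterative_fit(simulate, theta, parameter_names, n_iterations=9, n_components=9):
    theta = np.array(theta, dtype=float)
    for _ in range(n_iterations):
        J = finite_difference_jacobian(simulate, theta)      # step 1: sensitivity analysis
        Y = simulate(theta).reshape(-1, 1)                   # residuals as a single y-variable
        pls, W_star, _ = fit_lv_model(J, Y, n_components)    # step 2: LV model of J vs Y
        selected = select_parameters(pls, parameter_names, n_components)
        idx = [parameter_names.index(p) for p in selected]

        def residuals(sub):                                  # step 3: fit only the selected parameters
            full = theta.copy()
            full[idx] = sub
            return simulate(full)

        theta[idx] = least_squares(residuals, x0=theta[idx]).x
        # step 4: evaluation and manual adjustments are done outside this loop
    return theta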
4. Results and discussion
Below, the typical output from an LV model (PLS) is described, where J is regressed against the objective function. The displayed model uses 12,447 observations (time points), 62 x-variables (J) and 3 y-variables (the residual: experimental result − simulated result). The number of components was estimated to be 9, using the method of cross validation. This LV model explained 70% of the variance in X and 75% of the variance in Y. The loading plot for the first two components was already shown in Fig. 1.

4.1. LV modelling of sensitivity analysis data using method I
Method I consists of picking A parameters (in this case A = 9) that span the experimental space. The loadings are the linear combinations of the original variables that do precisely that. The variables that are far from the origin have the most influence and hence span most of the space. However, since there are several components to account for, a weighted average is computed for each variable from the loadings and the corresponding SSY (the sum of squares of Y explained by the model) for each component. This is referred to as the VIP (variable importance in the projections). More details can be found in (Eriksson et al., 2001). A VIP plot is shown in Fig. 4.

Fig. 4. VIP plot for a LV model with nine components.
The variables with the highest VIP were chosen as the combination that spans the experimental space. In this case the nine variables to the left were chosen, with the exception that k13 and Ea29 were replaced by k31 and k07 because their reverse reactions were already picked. The corresponding loading plots in Fig. 5 confirm that the selected variables span each plot well. We observe from the pairwise plots that the selected variables are well distributed. We can also observe that some variables
Fig. 5. Loading plots of the first eight components for a LV model with nine components, with the nine selected variables marked with rectangles. Unselected variables are marked with triangles and the y-variables with stars.
Fig. 6. Model predictions for the different residuals NO, NO2 and CO2 (predictions in black solid lines and experiments in grey dashed lines). The first half of the time points is at 250 °C and the second half at 350 °C. The total explained variance in Y was 74%, which can be seen in the relatively close fit of the model.
separate out (form groups in a corner) in a loading plot but were not selected. This is because they contribute to the explanation of X but not so much to the explanation of Y. The nine selected variables (parameters) were then fitted using the optimisation algorithm.

4.2. LV modelling of sensitivity analysis data using method II
The same model could be used for method II. Fig. 6 shows the prediction performance for the different Y-variables.
As seen in Fig. 6, the predictions are good for CO2 (which is only produced during the regeneration periods) and for NO and NO2 during the first storage at low temperature. Using method II, the loading matrix (W*) was exported to Matlab and fitting with the optimization algorithm was performed on the scores instead of the kinetic parameters. The scores were transformed back to original parameter values using W* as the loading matrix, as described in Section 2.3.3. The matrix W* is very similar to P and W, but W* is more related to Y than P, which models the variance in X only, and W is related to the deflated X-matrix. (For further details, see e.g. (Martens & Naes, 1989).)
It should be noted that the regression model fails to predict accurately during certain intervals because no sensitivity was present in the data set, i.e. changing the parameters induced very little or no change in the outlet concentrations. This will of course affect the fitting results, and the effect is also visible in the final fit displayed in Fig. 9.

4.3. Fitting results
The results from the fitting using methods I and II are compared in Fig. 7. The comparison is displayed for the changes in the objective function (SSresidual) using the models described in Sections 4.1 and 4.2. The displayed SSresidual in Fig. 7 comprises 9 iterations, where each iteration comprises 10 function calls. A comparison of the methods (considering both the SSresidual and visual inspection of the outlet concentration plots) shows that the final achieved fit was very similar. It was expected that the progress for method II would be faster during the first iteration, since the LV structure was assumed for the remaining iterations. The results show, however, that this effect was not evident in this case.

Fig. 7. Comparison between fit results using method I (left) and method II (right).
Fig. 8. Residual evolution for fitting of all 62 parameters.
4.4. Fitting results using very many parameters
The result of fitting all parameters included in the LV models (62 parameters) is displayed in Fig. 8. The residual evolution is again very similar. Visual inspection of the outlet concentration plots indicated that the results were similar to those reached using methods I and II above. This demonstrates that a fit of similar quality could be achieved in this case without using LV models, but it required many more function calls.

4.5. Final fit
The result after a number of fitting sequences is depicted in Fig. 9.
Fig. 9. Experimental data (dotted lines) and simulation results (solid lines).
As can be seen from the simulation results, the model captures the NSR phenomena reasonably well. The general dynamics and concentration levels during the storage periods were reproduced better than those during the regeneration periods. However, it is our view that the fit could be much closer to the observed values. One major candidate for explaining the lack of fit is that the model is incorrect; for example, the mean-field approximation, the reaction mechanism or the modelling of transport phenomena could be improved, and a better fit would then be possible. Again, we want to stress that we assume this model to be correct enough to reproduce the experimental data, and that the objective was not to demonstrate a perfect fit but to demonstrate an alternative method for parameter fitting.
4.6. Concluding remarks
The optimization function "lsqnonlin" (large-scale version) uses an algorithm that assumes that the (non-linear) objective function can be approximated by a quadratic curve. If that is not the case, only local minima will be found. This may indeed be the case for the parameter fitting carried out here. A better choice could be a global search algorithm, e.g. simulated annealing (Eftaxias et al., 2002). However, given the large number of parameters subject to fitting and the time required for each function call (ODE solver, approximately 3 min), the estimated time for a complete search would be several hundred years. One can also argue that more efficient experimentation would be the solution to the lack of fit. Indeed, by studying specific experiments such as TPD (temperature programmed desorption) and TPR (temperature programmed reaction/reduction), by applying more step changes in the different gas phase species (preferably using a statistical experimental design), and by varying flow conditions and even reactor design, one could probably estimate a larger number of kinetic parameters more accurately. Subsequent parameter fitting could then be less complicated and the chances of achieving a better fit would definitely increase. The experimental data in this study was limited by the aims of the experimenter, which differed from those of modelling. The data at hand is nevertheless very interesting since:

(a) It represents a common situation where we want to model data that was not originally intended for modelling, or that was not optimally designed but was found to be worth a modelling attempt.
(b) It has a low experimental rank (relatively simple experiments), and therefore it should in theory be easy to fit the model parameters to these data, even if extrapolation to other experimental conditions would show the model to be invalid.

Again, we would like to stress that the objective was not to find the "true" values of the kinetic parameters, but to demonstrate various important aspects of parameter fitting.

5. Conclusions
A new approach to parameter fitting for transient modelling of a catalytic reactor was presented. The first general conclusion is that awareness of the correlation among kinetic parameters in microkinetic models is important. Secondly, analysis of the correlation structure gives valuable information about the experimental rank and thus the number of parameters suitable for fitting. Thirdly, deeper analysis of the correlation structure (by means of latent variable methods such as PLS) also directs the selection of fitting parameters when many parameters are subject to uncertainty. In this study, the use of LV models produced fitting results similar to traditional fitting of all parameters, but using substantially less computational time.

Acknowledgements
Hussam Abdulhamid at the Competence Center for Catalysis (KCK) is gratefully acknowledged for providing the experimental data. The Swedish Research Council is acknowledged for financial support.
References

Abdulhamid, H., et al. (2004). Influence of the type of reducing agent (H2, CO, C3H6 and C3H8) on the reduction of stored NOX in a Pt/BaO/Al2O3 model catalyst. Topics in Catalysis, 30/31, 161–168.
Aghalayam, P., et al. (2000). Reactors, kinetics, and catalysis; construction and optimization of complex surface-reaction mechanisms. AIChE Journal, 46, 2017–2029.
Deshmukh, S. R., et al. (2004). Microreactor modeling for hydrogen production from ammonia decomposition on ruthenium. Industrial & Engineering Chemistry Research, 43, 2986–2999.
Dumesic, J. A., et al. (1991). The microkinetics of heterogeneous catalysis. Washington, DC: American Chemical Society.
Eftaxias, A., et al. (2002). Non-linear kinetic parameter estimation using simulated annealing. Computers & Chemical Engineering, 26, 1725–1733.
Eriksson, L., et al. (2001). Multi- and megavariate data analysis, principles and applications. Umeå: Umetrics AB.
Jansson, J. (2002). Studies of catalytic low-temperature CO oxidation over cobalt oxide and related transition metal oxides. PhD Thesis, Department of Chemical Reaction Engineering. Gothenburg: Chalmers University of Technology.
Martens, H., & Naes, T. (1989). Multivariate calibration. New York: Wiley.
Olsson, L., et al. (2001). A kinetic study of NO oxidation and NOX storage on Pt/Al2O3 and Pt/BaO/Al2O3. Journal of Physical Chemistry B, 105, 6895–6906.
Olsson, L., et al. (2002). Mean field modelling of NOX storage on Pt/BaO/Al2O3. Catalysis Today, 73, 263–270.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 6, 559–572.
Sun, C., & Hahn, J. (2005). On the use of partial least squares (PLS) and balancing for non-linear model reduction. In American Control Conference. Portland, OR, United States: IEEE.
Turanyi, T. (1997). Applications of sensitivity analysis to combustion chemistry. Reliability Engineering & System Safety, 57, 41–48.
Turányi, T., et al. (1989). Reaction rate analysis of complex kinetic systems. International Journal of Chemical Kinetics, 21, 83–99.
Vajda, S., et al. (1985). Principal component analysis of kinetic models. International Journal of Chemical Kinetics, 17, 55–81.
Westerberg, B., et al. (2003). Transient modelling of a HC-SCR catalyst for diesel exhaust aftertreatment. Chemical Engineering Journal, 92, 27–39.