Copyright © IFAC Advanced Control of Chemical Processes, Pisa, Italy, 2000
A NOVEL ALGORITHM TO LOCAL MODEL NETWORK GENERATION
M. S. Posser¹, J. O. Trierweiler²* and A. R. Secchi³
Department of Chemical Engineering, Federal University of Rio Grande do Sul (UFRGS), Rua Marechal Floriano, 501, CEP: 90020-061 - Porto Alegre - RS - BRAZIL, E-MAIL: {posser¹, jorge², arge³}@enq.ufrgs.br
Abstract: A novel algorithm for local model network (LMN) generation is proposed. The algorithm allows prior knowledge about the process to be used at many different levels. The Van de Vusse benchmark problem is used as an example. This system exhibits a change in gain at the peak of the reactor yield, displays non-minimum phase behavior (inverse response) for operating points to the left of this peak, and minimum-phase behavior with overshoot for operating points to the right. To deal with these problems, the operating space is divided into sub-regions, and for each one a model is identified around its center. The whole space is described by combining the local models with a new weighting function based on the generalized Gauss function. Copyright © 2000 IFAC
Keywords: Local Model Network, van de Vusse reaction, Identification process.
1. INTRODUCTION
Modeling nonlinear dynamic systems from observed data and prior knowledge is an important area of science and engineering (Smith, 1994). Local Model Networks (LMN) address this task. The main idea is that any model or controller has a limited interval of operation or validity, i.e., a region where it represents the system's behavior and works well (Trierweiler and Secchi, 1999). A model (or controller) that is valid in a given operating condition is called a local model (or local controller). The aim of this work is to apply the LMN concept to the control system area. The benchmark problem based on the Van de Vusse reaction (Chen et al., 1995) is used as an example.
The LMN can be viewed as an RBF (Radial Basis Function) network in which the basis function coefficients have been generalized, so that each basis function is associated not just with a constant parameter but with a more powerful function $f_i(x_l)$. This means that a smaller number of local models can cover larger areas of the input domain. Building the net is one of the most important tasks. The first step is the selection and range specification of the variables that characterize the operating space and the local models, i.e., $x_r$ and $x_l$, respectively. The variable sets $x_r$ and $x_l$ can be the same. Usually, $x_r$ is only a subset of $x_l$ defined by two or three variables. After that, the local model structure is chosen and the parameters of the local models are identified. Finally, a method is chosen to combine the local models into a global process model.
2. LOCAL MODEL NETWORK
The LMN is a useful hybrid method that makes it possible to combine prior knowledge and observed data when modeling a process. The LMN can be described by

$$y = f(x_l, x_r) = \sum_{i=1}^{n} f_i(x_l)\,\rho_i(x_r) \qquad (1)$$

where
$f_i(x_l)$: local models
$\rho_i(x_r)$: weighting function
$x_r$: variables that define the operating point of the system
$x_l$: variables used in the local models

The local models are usually combined by weighting functions. The weighting functions provide a smooth transition between the local models. To ensure the homogeneity of the space, the weighting function must satisfy the normalization condition. If a function does not satisfy this condition, it can be normalized by
* Author to whom correspondence should be addressed.
top. This flat surface can considerably reduce the second side effect. Nevertheless, the first side effect can only be reduced, not completely eliminated. These effects are shown in Fig. 1 (a.3 and b.3).
$$\rho_i(x_r) = \frac{\mu_i(x_r)}{\sum_{k=1}^{m} \mu_k(x_r)} \qquad (2)$$
where $\mu_i(x_r)$ represents the function being normalized, the sub-index $k$ denotes a specific region and $m$ is the total number of centers (local models). A usual choice for $\mu_i(x_r)$ is the Gaussian function (Trierweiler and Neumann, 1998)
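$$\mu_i(x_r) = \exp\left(-\frac{1}{2}\left\|\frac{x_r - c_i}{\sigma_i}\right\|^{2}\right) = \exp\left[-\frac{1}{2}\sum_{r}\left(\frac{x_r - c_{i,r}}{\sigma_{i,r}}\right)^{2}\right] \qquad (3)$$

where $\|\cdot\|$ denotes the vector norm (taken over the coordinates $r$, with the division applied coordinate by coordinate), $c_{i,r}$ is the location of center $i$ in the $r$-th coordinate and $\sigma_{i,r}$ is the corresponding standard deviation, which quantifies the degree of expansion (i.e., width) of center $i$ along each dimension defined by the variables that characterize the operating space.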
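As a minimal sketch of how Eqs. (1) to (3) fit together, the Python fragment below evaluates Gaussian functions, normalizes them, and blends the local model outputs. The three affine local models, the centers, and the widths are hypothetical values chosen only for illustration, and the function names (`gaussian_validity`, `normalized_weights`, `lmn_output`) are not taken from the paper.

```python
import numpy as np

def gaussian_validity(x_r, centers, sigmas):
    """Non-normalized Gaussian functions mu_i(x_r), one per center (Eq. 3)."""
    z = (np.asarray(x_r, dtype=float) - centers) / sigmas   # shape (m, n_r)
    return np.exp(-0.5 * np.sum(z ** 2, axis=1))

def normalized_weights(mu):
    """Normalization of Eq. (2): rho_i = mu_i / sum_k mu_k."""
    return mu / np.sum(mu)

def lmn_output(x_l, x_r, local_models, centers, sigmas):
    """Global model of Eq. (1): y = sum_i f_i(x_l) * rho_i(x_r)."""
    rho = normalized_weights(gaussian_validity(x_r, centers, sigmas))
    return float(sum(r * f(x_l) for r, f in zip(rho, local_models)))

# Hypothetical setup: three affine local models scheduled on one variable.
centers = np.array([[0.0], [1.0], [2.0]])
sigmas = np.array([[0.4], [0.4], [0.4]])
local_models = [
    lambda x: 1.0 + 2.0 * x[0],
    lambda x: 3.0 + 0.5 * x[0],
    lambda x: 5.0 - 1.0 * x[0],
]

x = np.array([1.0])   # here x_l and x_r are taken to be the same variable
rho = normalized_weights(gaussian_validity(x, centers, sigmas))
y = lmn_output(x, x, local_models, centers, sigmas)
```

At the middle center the two outer weights are equal by symmetry, so the blended output coincides with the output of the middle local model in this particular setup.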
Fig. 1: Non-normalized and normalized Gauss functions, and the normalized GGF for three centers uniformly spaced with distinct widths (a.1, a.2, and a.3, respectively) and for three centers non-uniformly spaced with distinct widths (b.1, b.2, and b.3).
Depending on the location of the centers and on their widths, the normalization leads to a number of side effects that can have important consequences for the resulting local model network. For example, when Gaussians are used as basis functions, the main side effects are (Shorten and Murray-Smith, 1994):

1. Loss of independence and change of the basis function shape. The shape of the normalized basis functions is usually quite different from that of the non-normalized basis functions. In the extreme case, reactivation of the center occurs. Ideally, the basis functions must decrease monotonically with increasing distance from the center. Reactivation happens when the basis function increases again from some point outside the region where the local model should be active.
2. Shift and reduction of the basis function maximum. The maximum of a basis function may no longer be at its center, and it can be much smaller than that of the non-normalized basis function.

The above-mentioned side effects happen mainly when the centers are not uniformly spaced, or when the basis functions have distinct widths, as can be seen in Fig. 1. In this work, in order to minimize the side effects, the generalized Gauss function (GGF) is used, which is defined by

(4)

where $P$ represents the flat parameter. One of the advantages of the GGF over the Gaussian function is the higher weight for a larger region around the centers, produced by its flat surface on the top.

3. LOCAL MODEL NETWORK GENERATION PROCEDURE

The local model network structure is defined by three parameters, $c$, $\sigma$ and $P$, when the GGF is chosen as weighting function. The vector parameter $c$ represents the centers of the subspaces, while $\sigma$ is analogous to the standard deviation and $P$ is responsible for the flat form around the center. For more details see Posser (2000).

Basically, there are two ways to build the network structure: using algorithms like LOLIMOT (Nelles, 1997) or using prior knowledge. Based on the network structure, a new collection of signals is generated, one for each identified sub-region. This step is very important because the new signals are specific to each sub-region, so that the model identified for a specific sub-region will be better than a model identified from a signal that includes information about other operating subspaces.

After this, a model, which can be linear or nonlinear, is identified for each subspace. These models are weighted by a weighting function for each operating point and, through that, a global model is obtained. Fig. 2 shows the basic steps of the LMN identification procedure.

Fig. 2: Basic steps of LMN generation procedure.

If no information is available, the generation procedure can start with the LOLIMOT algorithm, which is used to generate a network structure. If there is some information about the process, however, it can be introduced at many levels to obtain a model in an easier and quicker way.
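To illustrate the flat top attributed to the GGF in Section 2, the sketch below compares normalized weights for P = 2 (the ordinary Gaussian) and P = 4. The functional form `exp(-0.5 * |(x - c)/sigma|**P)` is an assumption: it is a common one-dimensional generalized Gaussian and may differ from the authors' Eq. (4). The centers and widths are hypothetical.

```python
import numpy as np

def ggf(x, c, sigma, P):
    """Assumed 1-D generalized Gauss function; P = 2 gives the Gaussian."""
    return np.exp(-0.5 * np.abs((x - c) / sigma) ** P)

def normalized(x, centers, sigmas, P):
    """Normalized weights rho_i(x) for all centers (rows) at the points x."""
    mu = np.array([ggf(x, c, s, P) for c, s in zip(centers, sigmas)])
    return mu / mu.sum(axis=0)

# Hypothetical centers, non-uniformly spaced and with distinct widths.
centers = [0.0, 1.0, 3.0]
sigmas = [0.3, 0.3, 1.0]

# Weight of the second local model evaluated at its own center.
rho_gauss = normalized(1.0, centers, sigmas, P=2)[1]
rho_ggf = normalized(1.0, centers, sigmas, P=4)[1]

# Non-normalized functions half a width away from the center.
mu_gauss_half = ggf(0.5, 0.0, 1.0, P=2)
mu_ggf_half = ggf(0.5, 0.0, 1.0, P=4)
```

With these numbers the wide neighboring function takes a noticeable share of the weight at the second center when P = 2, while for P = 4 the weight at the center stays close to one, which is the reduction of the maximum that the flat top is meant to counteract.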
where $N$ is the number of data points used and $Q$ is the weighting matrix, calculated for each output variable $y$ with the normalized weighting function (2). The solution of (8) is given by

(9)

The main advantage of using the ARX model as local model is the above analytical solution for the local model parameters $\theta_i$, which makes the parameter estimation easier and faster.

3.1 Data scaling
Before the identification, the plant data must be normalized to avoid scaling problems. The normalization is carried out by the following formula (Posser, 2000):

$$z_{norm} = \frac{z - z_{bias}}{z_{max} - z_{min}} \qquad (5)$$

where $z_{min}$ and $z_{max}$ are the minimum and maximum values of each dimension of the data $z$, and $z_{bias}$ is the bias value.
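A minimal sketch of this scaling step is given below. It assumes the formula has the form (z - z_bias) / (z_max - z_min), applied column by column, and it takes the bias equal to the column minimum when none is supplied; both the default and the function name `scale` are assumptions made for the example. The sample values are the f and Qk coordinates listed for the centers in Table 1.

```python
import numpy as np

def scale(z, z_bias=None):
    """Scale each column of z by its range: (z - z_bias) / (z_max - z_min)."""
    z = np.asarray(z, dtype=float)
    z_min, z_max = z.min(axis=0), z.max(axis=0)
    if z_bias is None:
        z_bias = z_min            # assumption: bias taken as the minimum
    return (z - z_bias) / (z_max - z_min)

# Columns: f [1/h] and Qk [kJ/h], values taken from the centers of Table 1.
data = np.array([[9.1, -2432.0],
                 [9.1, -811.0],
                 [18.5, -1621.0]])
scaled = scale(data)
```

With the minimum as bias, every column ends up in the interval [0, 1], so variables with very different magnitudes, such as f and Qk, contribute comparably to the distance used in the weighting functions.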
3.2 Using LOLIMOT to build the net

One of the possibilities for placing the centers of the LMN is to apply the LOLIMOT algorithm (Nelles, 1997), with the following steps:

1. Take the whole space, with all experimental data.
2. For each dimension of the sub-region i, do the following steps, i.e., for j = 1, ..., D:
   a) divide the sub-region into two equal parts along dimension j;
   b) put a normalized weighting function in the center of this region;
   c) find the respective local model parameters;
   d) calculate the error produced by the net.
3. Find the best division produced in step 2, i.e., the one that produces the smallest error.
4. Perform the best division using the results from 2b and 2c.
5. Calculate the local errors produced by each local model.
6. Choose the sub-region with the biggest relative error and go to the next division.
7. If the stop criterion is not satisfied yet, go to step 2.

4. EXAMPLE: CSTR WITH VAN DE VUSSE REACTION SCHEME

The van de Vusse reaction has been used as a benchmark problem for nonlinear process control algorithms (Chen et al., 1995). This example was chosen because it has basically three operating points: one with positive gain in a non-minimum phase region, one with negative gain in a minimum phase region, and one with null gain at the maximum concentration of CB, thus exhibiting some nonlinearity.
4.1 Description of the Process
It consists of the following reaction scheme:

$$A \xrightarrow{k_1} B \xrightarrow{k_2} C, \qquad 2A \xrightarrow{k_3} D$$

where B is the wanted product and C and D are the undesired byproducts.
Fig. 3: Schematic representation of the CSTR.

Fig. 3 shows the reactor schematically. Here, f and Qk are the manipulated variables, CAin and Tin are the disturbances, and T and CB are the measured variables. More details about this system can be found in Engell and Klatt (1993) and Trierweiler (1997).

4.2 System's Dynamics

The plot of the steady-state concentration CB over f (Fig. 4) reveals an interesting behavior of the system.

The LOLIMOT algorithm of Section 3.2 should only be used when no information about the operating points is available. It is used to place the centers in the operating space and to identify the relevant variables necessary to define the centers. The ARX model is used as local model in the LOLIMOT algorithm. The ARX model is defined by

$$y(t) + a_1 y(t-1) + \dots + a_{nd}\, y(t-nd) = b_1 u(t-1) + \dots + b_{nn}\, u(t-nn) + e(t) \qquad (6)$$

where $y$ and $u$ represent the output and input sampled data, respectively, at the current time $t$, and at $nd$ past sampling times for $y$ and $nn$ for $u$.
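The LOLIMOT steps of Section 3.2 can be sketched in a few lines of Python. This is a simplified illustration rather than the implementation used in the paper: it uses static affine local models instead of ARX models, normalized Gaussian weights with widths set to one third of each region's extension (an assumed proportionality factor), a fixed number of models as stop criterion, and the weighted squared error as local error measure. The test data are hypothetical.

```python
import numpy as np

def weights(X, regions, k_sigma=1.0 / 3.0):
    """Normalized Gaussian weights, one column per hyper-rectangular region."""
    mu = []
    for lo, hi in regions:
        center, sigma = (lo + hi) / 2.0, k_sigma * (hi - lo)
        mu.append(np.exp(-0.5 * np.sum(((X - center) / sigma) ** 2, axis=1)))
    mu = np.array(mu).T                               # shape (N, m)
    return mu / mu.sum(axis=1, keepdims=True)

def fit_net(X, y, regions):
    """Fit one affine local model per region by weighted least squares."""
    rho = weights(X, regions)
    Phi = np.hstack([np.ones((len(X), 1)), X])        # regressors [1, x]
    thetas = []
    for i in range(len(regions)):
        q = rho[:, i]
        A = Phi.T @ (q[:, None] * Phi)
        b = Phi.T @ (q * y)
        thetas.append(np.linalg.lstsq(A, b, rcond=None)[0])
    y_hat = np.sum(rho * (Phi @ np.array(thetas).T), axis=1)
    local_err = rho.T @ (y - y_hat) ** 2              # one value per region
    return y_hat, local_err

def lolimot(X, y, max_models=4):
    regions = [(X.min(axis=0), X.max(axis=0))]        # step 1: whole space
    _, local_err = fit_net(X, y, regions)
    while len(regions) < max_models:                  # step 7: stop criterion
        worst = int(np.argmax(local_err))             # step 6: worst region
        lo, hi = regions[worst]
        best = None
        for j in range(X.shape[1]):                   # step 2: each dimension
            hi_a, lo_b = hi.copy(), lo.copy()
            hi_a[j] = lo_b[j] = (lo[j] + hi[j]) / 2.0 # 2a: two equal parts
            trial = (regions[:worst] + [(lo, hi_a), (lo_b, hi)]
                     + regions[worst + 1:])
            y_hat, err = fit_net(X, y, trial)         # 2b to 2d
            sse = float(np.sum((y - y_hat) ** 2))
            if best is None or sse < best[0]:         # step 3: best division
                best = (sse, trial, err)
        _, regions, local_err = best                  # steps 4 and 5
    return regions

# Hypothetical static data: the gain changes sign along x0 only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 2))
y = np.sin(np.pi * X[:, 0])
regions = lolimot(X, y, max_models=4)
```

Each returned region is a pair of lower and upper corner vectors; the region centers are the candidate centers of the local models, and the regions always tile the original data range because every division halves an existing region.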
Fig. 4: CB vs. f for five different temperatures (°C) and CAin = 5.1 mol/L.

Usually, Eqn. (6) is written in the vectorial form

$$y(t) = \varphi^{T}(t)\,\theta_i + e(t) \qquad (7)$$

where $\varphi(t) = [-y(t-1), \dots, -y(t-nd),\ u(t-1), \dots, u(t-nn)]^{T}$ and $\theta_i = [a_1, \dots, a_{nd},\ b_1, \dots, b_{nn}]^{T}$. The parameter vector $\theta_i$ is calculated by minimizing the following cost function:

(8)
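The regression vector of Eq. (7) and a weighted least-squares estimate of the ARX parameters can be sketched as follows. The closed form used here, theta = (Phi^T Q Phi)^(-1) Phi^T Q Y with a diagonal Q, is the standard weighted least-squares solution and is assumed to correspond to Eqs. (8) and (9); the matrix names `Phi` and `Y`, the helper names, and the first-order test system are illustrative choices, not taken from the paper.

```python
import numpy as np

def arx_regressors(y, u, nd, nn):
    """Stack phi(t) = [-y(t-1..t-nd), u(t-1..t-nn)] for all valid t."""
    k0 = max(nd, nn)
    rows = []
    for t in range(k0, len(y)):
        past_y = [-y[t - k] for k in range(1, nd + 1)]
        past_u = [u[t - k] for k in range(1, nn + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), np.asarray(y[k0:])

def weighted_ls(Phi, Y, q):
    """Solve (Phi^T Q Phi) theta = Phi^T Q Y with Q = diag(q)."""
    A = Phi.T @ (q[:, None] * Phi)
    b = Phi.T @ (q * Y)
    return np.linalg.solve(A, b)

# Hypothetical noise-free system: y(t) - 0.8 y(t-1) = 0.5 u(t-1),
# i.e. a1 = -0.8 and b1 = 0.5 in the notation of Eq. (6).
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1]

Phi, Y = arx_regressors(y, u, nd=1, nn=1)
q = np.ones(len(Y))              # single region: all samples fully weighted
theta = weighted_ls(Phi, Y, q)   # estimate of [a1, b1]
```

In a network with several local models, `q` would hold the normalized weight of the corresponding local model at each sample, so that every local model is fitted mainly to the data of its own sub-region.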
The reactor exhibits a change in gain at the peak of the reactor yield (i.e., where the concentration CB achieves its maximum value), and displays non-minimum phase behavior for operating points to the left of this peak and minimum-phase behavior for operating points to the right. The curves shown in Fig. 4 are obtained for different reactor temperatures, T, and CAin = 5.1 mol/L.

4.3 Using LOLIMOT to build the LMN structure
For the van de Vusse example, the space divisions were made in the f and Qk dimensions, as shown in Tab. 1.
Table 1: Centers distribution by LOLIMOT.

Region  trans. zero  CB [mol/L]  T [°C]  f [h⁻¹]  Qk [kJ/h]  CAin [mol/L]  Tin [°C]
1        0.86        1.012       106.3    9.1     -2432      5.105         107.5
2        0.86        1.012       106.3    9.1      -811      5.105         107.5
3       -16.6        1.012       106.3   18.5     -1621      5.105         107.5

The next step is to generate signals for each sub-region, with the purpose of stimulating specific regions of the space to obtain more accurate local models. These identified local models can be linear or nonlinear. For the Van de Vusse example, three ARX-221 models were identified. This structure was chosen mainly because of its simplicity, but any other model structure (e.g., OE, BJ, ARMAX, subspace state-space system identification) could be used. If a nonlinear model is available, another alternative is to use as local models the models linearized at the operating points defined by the centers of the sub-regions.

The normalized GGFs for P = 4 used as weighting functions are shown in Fig. 5. Note that the space was divided in the f and Qk dimensions only.

Fig. 5: Normalized GGF with P=4 for f and Qk dimensions.

Fig. 6 compares the predictions of two LMNs, where the network structure was determined by the LOLIMOT algorithm and the corresponding local models were either identified (LMNid) or linearized at the operating points defined by the centers (LMNlin). It shows the predicted outputs for steps in f, with the other inputs kept constant. These perturbations were made to verify the transitions among the three operating regions (OR), where OR1, OR2 and OR3 represent a non-minimum phase sub-region, a null gain sub-region, and a minimum phase sub-region, respectively.

Fig. 6: Comparison between Local Models Network using identified local models (LMNid) and linearized local models (LMNlin), using the LOLIMOT algorithm to build the net.

Note that the transition between OR1 and OR3 was not well captured. This happened because the f value used to identify the OR changes instantaneously, while the actual system dynamics do not. That is, the initial dynamic response of the system after the second step is closer to the OR1 than to the OR3 dynamic model. An alternative to solve this problem is to delay f only for the calculation of the operating region. This approach is explored in the next section, where the same problem arises when prior knowledge is used to build the net structure.

4.4 Applying prior knowledge to build the LMN structure
Applying prior knowledge to build the net means, in this case, using the information about the three different operating regions. Analyzing the linearized model, one can easily conclude that the left side of the peak is more nonlinear than the right side; therefore, the space must have more centers in the non-minimum phase region (i.e., on the left side of the peak). Tab. 2 illustrates this kind of space division.

Table 2: Centers distribution using prior knowledge.

Region  trans. zero  CB [mol/L]  T [°C]  f [h⁻¹]  Qk [kJ/h]  CAin [mol/L]  Tin [°C]
1        20.6        0.981       111.0    6.0     -1690      5.100         107.5
2        16.6        1.062       118.0   13.0      -770      5.100         107.5
3        -7.8        1.066       110.0   15.0     -3761      5.100         107.5
The results obtained by simulating with this distribution of centers based on prior knowledge are shown in Fig. 7. By applying a first-order transfer function to delay the f value used in the calculation of the weighting function, the problem of the transition between OR1 and OR3 is solved, as can be seen by comparing Figs. 7 and 8. This solution was also proposed by Gatzke and Doyle (1999).
Another simple way to solve the transition problem is to simply delay the f value used in the calculation of the weighting function. Fig. 9 shows the results of this approach. Note that the results are similar to those of the first alternative.

A third option to delay f is shown in Fig. 10 and consists of calculating the f value corresponding to the current process outputs (i.e., CB and T) with the steady-state model. This approach is not as efficient and simple as the other two.
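The first two options, a first-order lag and a pure delay applied only to the f value that enters the weighting function, can be sketched as below. The sampling time, time constant, and delay length are illustrative values, not the ones used in the paper, and the function names are hypothetical; the step from 6 to 15 h⁻¹ mirrors the f coordinates of two centers in Table 2.

```python
import numpy as np

def first_order_lag(f, dt, tau):
    """Filter f through 1/(tau*s + 1), discretized with a zero-order hold."""
    f = np.asarray(f, dtype=float)
    alpha = 1.0 - np.exp(-dt / tau)
    out = np.empty_like(f)
    out[0] = f[0]
    for k in range(1, len(f)):
        out[k] = out[k - 1] + alpha * (f[k - 1] - out[k - 1])
    return out

def pure_delay(f, n_delay):
    """Delay f by n_delay samples, holding the first value at the start."""
    f = np.asarray(f, dtype=float)
    if n_delay <= 0:
        return f.copy()
    return np.concatenate([np.full(n_delay, f[0]), f[:-n_delay]])

# Hypothetical step in the scheduling variable f, from 6 to 15 1/h.
dt, tau, n_delay = 0.01, 0.05, 5        # h, h, samples (illustrative values)
f = np.where(np.arange(100) < 10, 6.0, 15.0)
f_lag = first_order_lag(f, dt, tau)     # smooth transition of the weights
f_del = pure_delay(f, n_delay)          # same step, shifted in time
```

Only the copy of f used to evaluate the weighting functions is filtered or delayed; the local models themselves would still receive the actual input, so the weights change region at a pace closer to that of the process.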
Fig. 7: Comparison between Local Models Network using identified local models (LMNid) and linearized local models (LMNlin), using prior knowledge.

Fig. 8: Simulation of the LMNlin of Fig. 7 applying a first order transfer function to delay the f value used in the calculation of the weighting function.

Fig. 9: Simulation of the LMNlin of Fig. 7 simply delaying the f value used in the calculation of the weighting function.

Fig. 10: Simulation of the Local Models Network using linearized local models (LMNlin), prior knowledge to build the net and the delayed f information calculated by the steady state equations.
5. CONTROL SYSTEM IMPLEMENTATION
The LMN has several properties that make its use in process control possible. Here, local decentralized PI controllers are designed and their parameters are weighted by the same weighting function used in the LMN. To solve the weighting problem caused by the instantaneous value of f, the f value corresponding to the current process outputs is used, as described in the last section.

For the controller design, a methodology based on a frequency-domain approach is used, in which a desired closed-loop performance is established for the system and the controller parameters are calculated by minimizing the difference between the real controller (obtained by direct synthesis in the closed loop) and the ideal controller (Trierweiler et al., 2000). For the desired closed-loop performance, the time constants are 0.01 h and 0.0234 h for CB, while for T the rise time is 0.03 with 5% overshoot (Trierweiler, 1997).

The results obtained with this controller, using the LMN built through prior knowledge and with linearized local models for the controller design, are presented in Figs. 11, 12, 13 and 14.

Observe that the controller performs well as long as the process does not change operating region. Fig. 11 shows the simulation results when the process is perturbed in the non-minimum phase region only. Similar results are obtained in the minimum phase region (not shown here). Fig. 12 explores the transition between different ORs. Here, one can see that the system initially moves in the direction opposite to the expected one. This happens because of the large difference between the controller gains, which is 10 times bigger for OR1 than for OR3. Although the weighting function gives approximately 80% of the weight to the controller of OR3, it cannot avoid the sign change of the controller gain, which is responsible for the apparent inverse response in the first set-point change of Fig. 12.

Fig. 13 was obtained using the LMN as a controller selector to eliminate the gain sign problems. In this figure, a problem caused by the choice of another root of f arises. One can see that initially the system tends to go in an unexpected direction but, due to the choice of another root, the weights make the system track the set-point. This explains the apparent inverse response in the second set-point change. If the controller had not chosen the other possible solution, the system would saturate, as can be seen in Fig. 14.
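The sign problem of the interpolated controller can be reproduced with a small worked example. The gain values below are hypothetical, normalized numbers chosen only to match the description in this section (a gain for OR1 that is 10 times larger in magnitude and of opposite sign to that of OR3, and about 80% of the weight on OR3); the function names are illustrative.

```python
import numpy as np

def interpolated_gain(rho, gains):
    """Controller gain obtained by weighting the local controller gains."""
    return float(np.dot(rho, gains))

def selected_gain(rho, gains):
    """Controller gain of the local controller with the largest weight."""
    return float(gains[int(np.argmax(rho))])

# Hypothetical normalized gains for [OR1, OR3]: opposite signs, ratio of 10.
gains = np.array([10.0, -1.0])
rho = np.array([0.2, 0.8])                 # about 80 % of the weight on OR3

k_interp = interpolated_gain(rho, gains)   # 0.2 * 10 + 0.8 * (-1) = 1.2
k_select = selected_gain(rho, gains)       # gain of OR3 only
```

Even with most of the weight on OR3, the interpolated gain keeps the sign of the OR1 controller, whereas using the LMN as a selector returns the gain of the dominant region with the correct sign.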
Fig. 11: Step response for set-point changes in the non-minimum phase region.

Fig. 12: Step response for set-point changes from minimum phase to non-minimum phase region with an interpolated controller.
Fig. 13: Step response for set-point changes from minimum phase to non-minimum phase region with a selected controller.

Fig. 14: Step response for set-point changes from minimum phase to non-minimum phase region with a selected controller and with closed solution to the actual f.

6. CONCLUSIONS
This work presented a novel algorithm to develop a local model network, in which it is possible to take different levels of knowledge about the system's dynamics into account. If nothing is known about the system, much more experimental data are necessary to produce a consistent dynamic model.

To deal with the side effects produced by the normalization of the weighting functions, the generalized Gaussian function (GGF) was introduced. The GGF produces much better results than the traditional Gaussian function because it has a flat surface on the top, giving a maximum weight around the centers.

Three different solutions for sudden changes in the weighting variables were analyzed and compared. Applying a first-order transfer function to damp these variables, or just delaying them, produces the best results. Finally, the paper showed how the LMN can easily be applied to control a nonlinear system.

7. REFERENCES
Chen, H., A. Kremling, F. Allgöwer, "Nonlinear Predictive Control of a Benchmark CSTR", Proc. of 3rd ECC, Rome, pp. 3247-3252, 1995.
Engell, S. and Klatt, K. U., "Nonlinear Control of a Non-Minimum-Phase CSTR", Proc. of American Control Conference, Los Angeles, pp. 2041-2045, 1993.
Gatzke, E. P. and Doyle, F. J., "Multiple Model Approach for CSTR Control", 14th World Congress of IFAC, Beijing, China, N-7a-II-5, 1999.
Nelles, O., "LOLIMOT - Lokale, lineare Modelle zur Identifikation nichtlinearer, dynamischer Systeme", Automatisierungstechnik, pp. 163-174, 1997.
Posser, M. S., "Rede de Modelos Locais", Master of Science Thesis, Federal University of Rio Grande do Sul, 2000.
Shorten, R. and Murray-Smith, R., "On normalising radial basis functions networks", in Irish Neural Networks Conf., 1994.
Smith, R. M., "Local Model Networks and Local Learning", Berlin, Germany, 1994.
Trierweiler, J. O., "A Systematic Approach to Control Structure Design", Ph.D. Thesis, Univ. of Dortmund, 1997.
Trierweiler, J. O. and U. Neumann, "Rede de Modelos Locais: Uma solução simples para problemas complexos", COBEQ 98, Trab338, Porto Alegre, Brazil, 1998.
Trierweiler, J. O. and A. R. Secchi, "Exploring the Potentiality of Using Multiple Model Approach in Nonlinear Model Predictive Control", in "Progress in Systems and Control Theory", Ed. F. Allgöwer and A. Zheng, 1999.
Trierweiler, J. O., Müller, R. and Engell, S., "Multivariable Low Order Structured-Controller Design by Frequency Response Approximation", Submitted to Brazilian Journal of Chemical Engineering, 2000.