Copyright © IFAC System Identification, Kitakyushu, Fukuoka, Japan, 1997
IDENTIFICATION OF NONLINEAR DYNAMIC SYSTEMS AS A COMPOSITION OF LOCAL LINEAR PARAMETRIC OR STATE-SPACE MODELS

R. Babuška, J. Keizer, M. Verhaegen
Delft University of Technology, Control Laboratory, Mekelweg 4, 2628 CD Delft, The Netherlands, tel: +31 15 2786152, fax: +31 15 2786679, e-mail:
[email protected]
Abstract. A technique for the identification of nonlinear systems as a composition of multivariable (MIMO) state-space local models is presented. First, fuzzy clustering with an adaptive distance measure is applied to the Hankel matrix in order to obtain a partition of the data into fuzzy subsets which can be accurately approximated by local linear models. Then, after weighting the data by the membership degrees computed by fuzzy clustering, standard subspace identification algorithms can be applied to obtain the parameterization in terms of system matrices. The developed technique is applied to the identification of nonlinear pressure dynamics.

Keywords. Identification, subspace methods, fuzzy subsets, clustering, local structures.
1. INTRODUCTION

A promising approach to identify (and control) nonlinear system dynamics is via composite local modeling. In this paper, the nonlinear input-output behavior is represented by the composition of local (linear) models along the system's operation envelope. In order to ensure a smooth transition between the local models, a fuzzy model due to Takagi and Sugeno (1985) is used (abbreviated as TS fuzzy model). The construction of the TS model is based on clustering the input-output data (Babuska and Verbruggen, 1995; Babuska, 1996).

The contribution of this paper is twofold. First, we show that the use of fuzzy clustering with an adaptive distance measure corresponds to the solution of a set of total least-squares problems. Second, it is shown how to integrate the subspace model identification approach of Moonen et al. (1989) into the product-space clustering framework. This opens a new avenue to identify MIMO state-space models as local models, adding also the possibility to consider more refined subspace identification solutions for coping with special noise circumstances. The developed technique has been applied to the identification of highly nonlinear pressure dynamics in a laboratory fermenter.

2. IDENTIFICATION OF LOCAL LINEAR MODELS BY FUZZY CLUSTERING

Fuzzy clustering can be used as a tool to obtain partitionings of data into subsets which can be approximated by local linear models. The transitions between the subsets are gradual rather than abrupt, which has advantages for constructing models from data corrupted by noise and for smooth interpolation between the individual models. This section deals with the decomposition of input-output data, observed on a nonlinear system, into fuzzy subsets which can be approximated by local ARX models. Fuzzy clustering is used to obtain this decomposition (partition). Consider the regression problem $y_k = f(x_k) + \epsilon_k$, where $f(\cdot)$ captures the dependence of $y$ on the regression vector $x_k$, and the additive component $\epsilon_k$ reflects the fact that $y_k$ will not be an exact function of $x_k$. An input-output NARX structure (Leontaritis and Billings, 1985) gives the regressor vector as a collection of a finite number of past outputs and inputs, $x_k = [y_{k-1}, \ldots, y_{k-n_y}, u_{k-1}, \ldots, u_{k-n_u}]$, where $n_u$, $n_y$ are integers related to the system's order. Let $p = n_u + n_y$ denote the dimension of $x_k$. From a set of observed inputs and outputs of an unknown dynamic system, $\{u_k, y_k\}_{k=1}^{N}$, the matrix $X \in \mathbb{R}^{N \times p}$ and the vector $y \in \mathbb{R}^{N}$ are constructed. $X$ has the regressors $x_k$ in its rows, and $y$ contains the regressands $y_k$. The data matrix denoted $Z$ is formed by appending $y$ to $X$: $Z^T = [X\ y]$. The objective of fuzzy clustering is to partition $Z$ into $c$ fuzzy subsets (clusters). In this paper, it is assumed that $c$ is known, based on prior knowledge, for instance. Methods are also available to determine $c$ by cluster validity measures (Gath and Geva, 1989; Backer, 1995; Pal and Bezdek, 1995), or by cluster merging techniques (Krishnapuram and Freg, 1992; Kaymak and Babuska, 1995).
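To make this data arrangement concrete, the following NumPy sketch (ours, not from the paper; the function name `build_regression_data` is chosen for illustration) assembles $X$, $y$ and $Z$ from measured input-output sequences for given orders $n_y$ and $n_u$:

```python
import numpy as np

def build_regression_data(u, y, ny, nu):
    """NARX regressor matrix X, regressand vector y, and the product-space
    data matrix Z with the points z_k as columns (Z^T = [X y])."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(ny, nu)                                  # first index with a complete regressor
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - j] for j in range(1, ny + 1)]    # y_{k-1}, ..., y_{k-ny}
        past_u = [u[k - j] for j in range(1, nu + 1)]    # u_{k-1}, ..., u_{k-nu}
        rows.append(past_y + past_u)
    X = np.array(rows)                                   # regressors x_k in rows
    y_reg = y[start:]                                    # regressands y_k
    Z = np.column_stack([X, y_reg]).T                    # data points z_k = [x_k; y_k] as columns
    return X, y_reg, Z
```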
2.1 Fuzzy partition

Let $Z = \{z_k \mid k = 1, 2, \ldots, N\}$ be a finite set, and let $2 \le c \ll N$ be an integer. Further, let $U = [\mu_{i,k}] \in [0,1]^{c \times N}$ denote the fuzzy partition matrix. Row $i$ of this matrix contains the membership degrees of the data points $z_k$ in the $i$th fuzzy subset. The fuzzy partitioning space for $Z$ is the set

$$M_{fc} = \left\{ U \in \mathbb{R}^{c \times N} \,\middle|\, \mu_{ik} \in [0,1],\ \forall i,k;\ \ \sum_{i=1}^{c} \mu_{ik} = 1,\ \forall k;\ \ 0 < \sum_{k=1}^{N} \mu_{ik} < N,\ \forall i \right\} \quad (1)$$

Note that the total membership of each $z_k$ equals one, but the distribution of memberships among the $c$ fuzzy subsets is not constrained. Further, no subset may be empty or contain all data points.

2.2 Fuzzy clustering

The aim of fuzzy clustering for the identification of local linear models is to partition the data set $Z$ into fuzzy subsets which can be approximated by local linear models (Babuska and Verbruggen, 1995). This can be achieved, for instance, by using the adaptive-distance clustering algorithm due to Gustafson and Kessel (1979), which is reviewed in the Appendix. Given $Z$ and the desired number of clusters $c$, the Gustafson-Kessel (GK) algorithm computes the fuzzy partition matrix $U$, the matrix of cluster prototypes $V = [v_1, v_2, \ldots, v_c]$, $v_i \in \mathbb{R}^n$, and a set of cluster covariance matrices $F = [F_1, \ldots, F_c]$, where the $F_i$ are positive definite matrices in $\mathbb{R}^{n \times n}$ given by:

$$F_i = \frac{\sum_{k=1}^{N} (\mu_{i,k})^m (z_k - v_i)(z_k - v_i)^T}{\sum_{k=1}^{N} (\mu_{i,k})^m} \quad (2)$$

where $m$ is a weighting exponent which determines the overlap of the fuzzy sets. For $m$ approaching one from above, the partition becomes crisp. The covariance matrices describe the local distribution of data in the individual clusters. Note that the relations revealed by clustering are just associations among the data vectors and as such do not yet constitute a prediction model of the given system. To obtain such a model, additional steps are needed, as described below.
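For reference, eq. (2) can be written compactly in NumPy (our own sketch; the function name is not from the paper):

```python
import numpy as np

def cluster_covariances(Z, U, V, m=2.0):
    """Fuzzy covariance matrices F_i of eq. (2).
    Z: data points z_k as columns (n x N), U: partition matrix (c x N),
    V: cluster prototypes v_i as columns (n x c)."""
    c = U.shape[0]
    F = []
    for i in range(c):
        w = U[i, :] ** m                     # weights (mu_{i,k})^m
        D = Z - V[:, [i]]                    # columns z_k - v_i
        F.append((D * w) @ D.T / w.sum())    # weighted average of outer products
    return F
```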
2.3 Generating a prediction model

Each cluster obtained by fuzzy clustering is regarded as a local linear approximation of the nonlinear system. The global model can be conveniently represented using affine TS rules (Takagi and Sugeno, 1985):

$$R_i:\ \text{If } x \text{ is } A_i \text{ then } y_i = a_i^T x + b_i, \qquad i = 1, \ldots, K \quad (3)$$

The output of the model is computed by:

$$y = \frac{\sum_{i=1}^{c} \mu_i(x)\, y_i}{\sum_{i=1}^{c} \mu_i(x)} \quad (4)$$
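For illustration, a minimal sketch (ours) of evaluating the rule base (3) with the weighted mean (4), given the membership degrees $\mu_i(x)$ of a regressor $x$:

```python
import numpy as np

def ts_output(x, a, b, mu):
    """Takagi-Sugeno model output, eqs. (3)-(4).
    x: regressor (p,), a: consequent gains (c x p),
    b: offsets (c,), mu: membership degrees mu_i(x) (c,)."""
    y_local = a @ x + b                              # local rule outputs y_i = a_i^T x + b_i
    return float(np.dot(mu, y_local) / np.sum(mu))   # fuzzy-mean interpolation
```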
Given the result of clustering, $(U, V, F)$, two sets of parameters have to be obtained: the antecedent membership functions $A_i$ and the consequent parameters $a_i$, $b_i$.

Computing antecedent membership functions. There are several possibilities to compute the membership functions. The most straightforward one is to represent them analytically in the antecedent space, computing first the distance of the data point from the cluster prototype by using the distance norm (30). The membership degree is then inversely proportional to the distance. Denote $F_i^x = [f_{jl}]$, $1 \le j, l \le p$, the submatrix of the cluster covariance matrix which includes all but the last row and column. The corresponding norm-inducing matrix is given by:

$$A_i^x = \left[\det(F_i^x)\right]^{1/p} (F_i^x)^{-1} \quad (5)$$

Let $v_i^x = [v_{1i}, \ldots, v_{pi}]^T$ denote the projection of the cluster center onto $X$. The inner-product norm

$$d_{A_i}^2(x_k, v_i^x) = (x_k - v_i^x)^T A_i^x\, (x_k - v_i^x) \quad (6)$$

measures the distance of the regressor vector $x_k$ from the projection $v_i^x$ of the cluster center. To transform the distance into a membership degree, the formula of eq. (31) is employed:

$$\mu_i(x) = \frac{1}{\sum_{j=1}^{c} \left( d_{A_i}(x, v_i^x) / d_{A_j}(x, v_j^x) \right)^{2/(m-1)}} \quad (7)$$
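A sketch of this computation under our reading of eqs. (5)-(7) (the function and argument names are ours):

```python
import numpy as np

def antecedent_memberships(x, Fx_list, vx_list, m=2.0):
    """Membership degrees mu_i(x) from the antecedent distance norm.
    Fx_list: p x p blocks F_i^x of the cluster covariances,
    vx_list: projected prototypes v_i^x (length-p vectors)."""
    x = np.asarray(x, dtype=float)
    p = x.size
    d2 = []
    for Fx, vx in zip(Fx_list, vx_list):
        Ax = np.linalg.det(Fx) ** (1.0 / p) * np.linalg.inv(Fx)  # norm-inducing matrix, eq. (5)
        e = x - np.asarray(vx, dtype=float)
        d2.append(float(e @ Ax @ e))                             # squared distance, eq. (6)
    d2 = np.asarray(d2)
    if np.any(d2 == 0.0):                       # x coincides with a projected prototype
        mu = (d2 == 0.0).astype(float)
    else:
        # inverse-distance formula of eqs. (7)/(31); exponent 1/(m-1) on ratios of
        # squared distances equals 2/(m-1) on ratios of the distances themselves
        mu = 1.0 / np.sum((d2[:, None] / d2[None, :]) ** (1.0 / (m - 1.0)), axis=1)
    return mu
```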
Computing consequent parameters. Expressions for computing the consequent parameters $a_i$ and $b_i$ of the model (3) can be derived from the geometrical structure of the clusters. Assume that the collection of $c$ clusters approximates the regression surface. These clusters can be approximately regarded as $p$-dimensional linear subspaces of the regression space. In this case, the smallest eigenvalue $\lambda_{i,p+1}$ of the cluster covariance matrix $F_i$ is typically an order of magnitude smaller than the remaining eigenvalues. The corresponding eigenvector, which for brevity will be denoted $\phi_i$, determines the normal vector to the hyperplane spanned by the remaining eigenvectors of that cluster:

$$\phi_i^T (z - v_i) = 0 \quad (8)$$

Let the prototypical vector $v_i$ and the smallest eigenvector $\phi_i$ be partitioned as:

$$v_i = \left[ (v_i^x)^T;\ v_i^y \right] = \left[ (v_{i1}, \ldots, v_{ip});\ v_{i,p+1} \right] \quad (9)$$

$$\phi_i = \left[ (\phi_i^x)^T;\ \phi_i^y \right] = \left[ (\phi_{i1}, \ldots, \phi_{ip});\ \phi_{i,p+1} \right] \quad (10)$$

Carrying out the inner product (8) leads to the following equality:

$$(\phi_i^x)^T x + \phi_i^y\, y = (\phi_i^x)^T v_i^x + \phi_i^y\, v_i^y \quad (11)$$

from which, by a simple algebraic manipulation, an explicit equation for the hyperplane is obtained:

$$y = \underbrace{\frac{-1}{\phi_i^y}\, (\phi_i^x)^T}_{a_i^T}\, x + \underbrace{\frac{1}{\phi_i^y}\, \phi_i^T v_i}_{b_i} \quad (12)$$

By comparing the above expression with the affine consequent of the TS rule (3), the equations for $a_i$ and $b_i$ directly follow:

$$a_i = \frac{-1}{\phi_i^y}\, \phi_i^x \quad (13)$$

$$b_i = \frac{1}{\phi_i^y}\, \phi_i^T v_i \quad (14)$$
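A direct NumPy sketch of eqs. (13)-(14) (ours; `numpy.linalg.eigh` returns eigenvalues in ascending order, so the first eigenvector belongs to the smallest eigenvalue):

```python
import numpy as np

def consequent_parameters(Fi, vi):
    """Consequent parameters a_i, b_i of rule (3) from the cluster
    covariance F_i ((p+1) x (p+1)) and prototype v_i (length p+1)."""
    _, eigvecs = np.linalg.eigh(Fi)          # eigenvalues in ascending order
    phi = eigvecs[:, 0]                      # smallest eigenvector phi_i
    phi_x, phi_y = phi[:-1], phi[-1]         # partition as in eqs. (9)-(10)
    a = -phi_x / phi_y                       # eq. (13)
    b = float(phi @ vi) / phi_y              # eq. (14)
    return a, b
```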
Although these equations for the consequent parameters have been derived from the geometrical interpretation of the clusters in an intuitive way, it can be shown that $a_i$ given by eq. (13) is obtained as the solution of a weighted total least-squares (TLS) problem defined locally around the cluster center $v_i$. The weights are the membership degrees contained in the $i$th row of the fuzzy partition matrix. Hence $v_i$ is seen as a local operating point for the model, following the usual linearization approach. To obtain the "global" affine linear form used in the TS rules (3), the offset parameters $b_i$ are calculated using the estimates $a_i$ and the cluster center $v_i$. To show that an expression equivalent to (14) is obtained, transform the data by subtracting the cluster's center:

$$\Delta x_k^i = x_k - v_i^x, \qquad \Delta y_k^i = y_k - v_i^y \quad (15)$$

The cluster's center $v_i$ now becomes the origin and the local model describing the $i$th cluster can be given in the linear form $\Delta y^i = a_i^T \Delta x^i$, instead of the affine form (3). To express the varying relevance of the different data samples $(\Delta x_k^i, \Delta y_k^i)$ to the $i$th local linear model, the data is weighted by $w_{i,k} = \sqrt{(\mu_{i,k})^m}$:

$$\tilde{x}_k^i = w_{i,k}\, \Delta x_k^i, \qquad \tilde{y}_k^i = w_{i,k}\, \Delta y_k^i \quad (16)$$

Define the matrix $X^i \in \mathbb{R}^{N \times p}$, having the vectors $\tilde{x}_k^i$ in its rows, and the column vector $y^i \in \mathbb{R}^{N}$, containing the scalars $\tilde{y}_k^i$.

Lemma 1 (Total least-squares solution) Let the $i$th cluster be approximated by a local linear model in the transformed coordinates: $\Delta y^i = a_i^T \Delta x^i$. Let $\phi_i$ be the smallest eigenvector of the $i$th cluster's covariance matrix $F_i$. The parameter vector $a_i$, given by eq. (13), is the unique solution of the linear system $y^i \approx X^i a_i$ in the total least-squares sense.

The proof is straightforward and is based on Theorem 2.6 in (van Huffel and Vandewalle, 1991). Now, the offset parameter $b_i$ of the affine consequent (3) is derived. In the incremental coordinates (16), the $i$th local linear model is given by

$$\Delta y^i = a_i^T \Delta x^i \quad (17)$$

from which $b_i = v_i^y - a_i^T v_i^x$. By substituting for $a_i$ from (13), the following expression is obtained:

$$b_i = v_i^y + \frac{1}{\phi_i^y}\, (\phi_i^x)^T v_i^x = \frac{1}{\phi_i^y}\, \phi_i^T v_i \quad (18)$$

which is equivalent to (14). This close relationship between linear total least squares and fuzzy clustering with adaptive distance allows many results known for linear TLS (van Huffel and Vandewalle, 1991) to be transferred or extended to the identification by fuzzy clustering.
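The lemma can be checked numerically; the sketch below (ours) solves $y^i \approx X^i a_i$ in the TLS sense from the SVD of the weighted, centered data matrix and, under the stated assumptions, reproduces $a_i$ of eq. (13):

```python
import numpy as np

def weighted_tls(X, y, vi, mu_i, m=2.0):
    """TLS estimate of a_i around the cluster centre v_i.
    X: N x p regressors, y: length-N regressands,
    vi: prototype (length p+1), mu_i: i-th row of U (length N)."""
    w = np.sqrt(mu_i ** m)                    # weights of eq. (16)
    dX = (X - vi[:-1]) * w[:, None]           # centred, weighted regressors
    dy = (y - vi[-1]) * w                     # centred, weighted regressands
    M = np.column_stack([dX, dy])             # augmented matrix [X^i  y^i]
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    v_min = Vt[-1]                            # right singular vector of the smallest singular value
    return -v_min[:-1] / v_min[-1]            # classical TLS solution, cf. eq. (13)
```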
3. FUZZY CLUSTERING IN SUBSPACE IDENTIFICATION

In this section, we describe the integration of the fuzzy clustering technique described in the previous section and Subspace Model Identification (SMI). For that purpose, let us first recall some notation and insights related to SMI of linear time-invariant (LTI) systems. In SMI (Verhaegen and Dewilde, 1992a; Verhaegen and Dewilde, 1992b; Verhaegen, 1993b; Verhaegen, 1994; van Overschee and de Moor, 1994), the goal is to identify a state-space model of the form:

$$x_{k+1} = A_i x_k + B_i u_k$$
$$y_k = C_i x_k + D_i u_k$$

from measured data sequences $\{u_k, y_k\}_{k=1}^{N}$. The system dimensions considered are $x_k \in \mathbb{R}^n$, $y_k \in \mathbb{R}^l$ and $u_k \in \mathbb{R}^m$. The local state-space models are assumed to be minimal.

One of the attributes in solving this problem in an SMI context is the definition of the relationship between (block) Hankel matrices constructed from the available i/o data.
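As a sketch of how such a block Hankel matrix can be formed (our own helper, not taken from the paper), with $s$ block rows stacked from an i/o signal:

```python
import numpy as np

def block_hankel(w, s):
    """Block Hankel matrix with s block rows built from a signal w (N x d):
    column j stacks the samples w_j, w_{j+1}, ..., w_{j+s-1}."""
    w = np.asarray(w, dtype=float)
    if w.ndim == 1:
        w = w[:, None]                          # treat a scalar signal as N x 1
    N, d = w.shape
    cols = N - s + 1
    H = np.empty((s * d, cols))
    for j in range(cols):
        H[:, j] = w[j:j + s, :].reshape(-1)     # stack s consecutive samples
    return H
```

In the composite-model setting of this paper, the columns of such Hankel matrices would additionally be weighted by the membership degrees delivered by the fuzzy clustering before a standard SMI algorithm is applied, as stated in the abstract.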
REFERENCES

Babuska, R. (1996). Fuzzy Modeling and Identification. PhD dissertation. Delft University of Technology. Delft, The Netherlands.
Babuska, R. and H.B. Verbruggen (1995). Identification of composite linear models via fuzzy clustering. In: Proceedings European Control Conference. Rome, Italy. pp. 1207-1212.
Backer, E. (1995). Computer-assisted Reasoning in Cluster Analysis. Prentice Hall. New York.
Gath, I. and A.B. Geva (1989). Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern Analysis and Machine Intelligence 11(7), 773-781.
Gustafson, D.E. and W.C. Kessel (1979). Fuzzy clustering with a fuzzy covariance matrix. In: Proc. IEEE CDC. San Diego, CA, USA. pp. 761-766.
Kaymak, U. and R. Babuska (1995). Compatible cluster merging for fuzzy modelling. In: Proceedings FUZZ-IEEE/IFES'95. Yokohama, Japan. pp. 897-904.
Krishnapuram, R. and Chin-Pin Freg (1992). Fitting an unknown number of lines and planes to image data through compatible cluster merging. Pattern Recognition 25(4), 385-400.
Leontaritis, I.J. and S.A. Billings (1985). Input-output parametric models for non-linear systems. International Journal of Control 41, 303-344.
Moonen, M., B. De Moor, L. Vandenberghe and J. Vandewalle (1989). On- and off-line identification of linear state-space models. International Journal of Control 49(1), 219-232.
Pal, N.R. and J.C. Bezdek (1995). On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Systems 3(3), 370-379.
Takagi, T. and M. Sugeno (1985). Fuzzy identification of systems and its application to modeling and control. IEEE Trans. Systems, Man and Cybernetics 15(1), 116-132.
van Can, H.J.L., H.A.B. te Braake, C. Hellinga, A. Krijgsman, H.B. Verbruggen, K.Ch.A.M. Luyben and J.J. Heijnen (1995). Design and real-time testing of a neural model predictive controller for a nonlinear system. Chemical Engineering Science 50(15), 2419-2430.
van Huffel, S. and J. Vandewalle (1991). The Total Least Squares Problem: Computational Aspects and Analysis. Frontiers in Applied Mathematics, SIAM. Philadelphia, USA.
van Overschee, P. and B. de Moor (1994). Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 30(1), 75-93.
Verhaegen, M. (1993a). Application of a subspace model identification technique to identify LTI systems operating in closed-loop. Automatica 29, 1027-1040.
Verhaegen, M. (1993b). Subspace model identification. Part III: analysis of the ordinary output-error state space model identification algorithm. International Journal of Control 58, 555-586.
Verhaegen, M. (1994). Identification of the deterministic part of MIMO state space models given in innovation form from input-output data. Automatica 30(1), 61-74.
Verhaegen, M. and P. Dewilde (1992a). Subspace model identification. Part I: the output-error state space model identification class of algorithms. International Journal of Control 56, 1187-1210.
Verhaegen, M. and P. Dewilde (1992b). Subspace model identification. Part II: analysis of the elementary output-error state space model identification algorithm. International Journal of Control 56, 1211-1241.
APPENDIX: GUSTAFSON-KESSEL ALGORITHM

Given the data set $Z$, choose the number of clusters $1 < c < N$, the weighting exponent $m > 1$ (e.g. $m = 2$) and the termination tolerance $\epsilon > 0$ (e.g. $\epsilon = 0.01$). Initialize the partition matrix randomly, such that $U^{(0)} \in M_{fc}$.

Repeat for $l = 1, 2, \ldots$

Step 1: Compute cluster prototypes (means):

$$v_i^{(l)} = \frac{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m z_k}{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m}, \qquad 1 \le i \le c. \quad (28)$$

Step 2: Compute the cluster covariance matrices:

$$F_i = \frac{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m (z_k - v_i^{(l)})(z_k - v_i^{(l)})^T}{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m}, \qquad 1 \le i \le c. \quad (29)$$

Step 3: Compute the distances:

$$d_{ikA_i}^2 = (z_k - v_i^{(l)})^T \left[ \left(\rho_i \det(F_i)\right)^{1/n} F_i^{-1} \right] (z_k - v_i^{(l)}), \qquad 1 \le i \le c,\ 1 \le k \le N. \quad (30)$$

Step 4: Update the partition matrix: if $d_{ikA_i} > 0$ for $1 \le i \le c$, $1 \le k \le N$,

$$\mu_{ik}^{(l)} = \frac{1}{\sum_{j=1}^{c} \left( d_{ikA_i} / d_{jkA_j} \right)^{2/(m-1)}}, \quad (31)$$

otherwise

$$\mu_{ik}^{(l)} = 0 \ \text{ if } d_{ikA_i} > 0, \quad \text{and} \quad \mu_{ik}^{(l)} \in [0, 1] \ \text{ with } \sum_{i=1}^{c} \mu_{ik}^{(l)} = 1. \quad (32)$$

until $\|U^{(l)} - U^{(l-1)}\| < \epsilon$.
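A compact NumPy transcription of this iteration (ours; it assumes unit cluster volumes $\rho_i = 1$, well-conditioned covariance matrices, and replaces the singular case of eq. (32) by a small numerical guard):

```python
import numpy as np

def gustafson_kessel(Z, c, m=2.0, eps=0.01, max_iter=100, seed=0):
    """Gustafson-Kessel fuzzy clustering.
    Z: data points z_k in rows (N x n), i.e. the transpose of Z as used in the text.
    Returns U (c x N), V (prototypes v_i in columns, n x c) and the list F of
    fuzzy covariance matrices F_i (n x n)."""
    Z = np.asarray(Z, dtype=float)
    N, n = Z.shape
    rng = np.random.default_rng(seed)
    U = rng.random((c, N))
    U /= U.sum(axis=0)                                  # random U(0) in the partitioning space (1)
    for _ in range(max_iter):
        U_old = U.copy()
        W = U ** m                                      # (mu_{i,k})^m
        # Step 1: cluster prototypes, eq. (28)
        V = (W @ Z) / W.sum(axis=1, keepdims=True)      # c x n, row i is v_i
        F, d2 = [], np.empty((c, N))
        for i in range(c):
            D = Z - V[i]                                # rows z_k - v_i
            # Step 2: fuzzy covariance matrix, eq. (29)
            Fi = (D * W[i][:, None]).T @ D / W[i].sum()
            F.append(Fi)
            # Step 3: adaptive distances, eq. (30), with rho_i = 1
            Ai = np.linalg.det(Fi) ** (1.0 / n) * np.linalg.inv(Fi)
            d2[i] = np.einsum('kj,jl,kl->k', D, Ai, D)
        # Step 4: update the partition matrix, eq. (31); exponent 1/(m-1) on the
        # squared distances equals 2/(m-1) on the distances themselves
        d2 = np.fmax(d2, np.finfo(float).tiny)          # numerical guard replacing eq. (32)
        U = 1.0 / np.sum((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0)), axis=1)
        if np.linalg.norm(U - U_old) < eps:
            break
    return U, V.T, F
```

For the modelling scheme of Section 2, Z holds the rows of [X y], and the returned (U, V, F) can be passed directly to the antecedent and consequent computations sketched earlier.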