Fuzzy Sets and Systems 113 (2000) 351-365
ELSEVIER
www.elsevier.com/locate/fss
Fuzzy modelling and identification with genetic algorithm based learning
Baolin Wu, Xinghuo Yu*
Faculty of Information and Communication, Central Queensland University, Rockhampton QLD 4702, Australia
Received June 1997; received in revised form December 1997
Abstract
A genetic algorithm based learning (GABL) algorithm is proposed in this paper for the identification of TSK models. The algorithm consists of four blocks: Partition Block, GA Block, Tuning Block and Termination Block. The Partition Block determines an estimated partition of the input variables. The GA Block optimises the structure of a TSK model. The Tuning Block fine tunes the parameters of the TSK model using a gradient descent based approach, and the Termination Block checks whether the resultant TSK model is satisfactory. The proposed GABL algorithm has the advantages of simplicity, flexibility, high accuracy and automation. The presented numerical examples indicate that the GABL algorithm is effective in constructing a good TSK model for complex nonlinear systems. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: System identification; Genetic algorithm; Artificial intelligence; Modelling
1. Introduction
Fuzzy systems have been used in many applications. A central characteristic of fuzzy systems is that they are based on the concept of fuzzy coding of information and operate with fuzzy sets instead of numbers. In many real problems, imprecision is admissible, even useful, because the categories of human thinking are vague ideas which are very hard to quantify. In essence, the representation of information in fuzzy systems imitates the mechanism of approximate reasoning performed in the human mind.
*Corresponding author. Tel.: +61 79 309777; fax: +61 79 361361. E-mail address: [email protected] (X. Yu)
0165-0114/00/$ - see front matter. PII: S0165-0114(97)00408-9
One important class of fuzzy systems is fuzzy mathematical models, which are characterised by fuzzy parameters or variables related to the functioning of operators and connectives. The so-called TSK model, developed by Takagi, Sugeno and Kang [26,28,33], is one such model; it is formed by logical rules that have a fuzzy antecedent part and a functional consequent. A common practice in modelling a complex nonlinear system is to linearise the system around given operating points. Such local linearisation may be good for local analysis but lacks a global perspective of the system. Numerical analysis techniques such as interpolation provide a way out but suffer from sheer complexity. The TSK model allows us to aggregate the set of linearised models into a global model to approximate
the complex nonlinear system with less complexity. The TSK models have been studied extensively by many researchers. The relevant literature can be found in [28,29,31,33].
As with many problems in systems theory, identification of the TSK models is the bottleneck in their development and application. The proposers of the TSK models developed a systematic algorithm for modelling and identification [32]. Other algorithms and approaches have been proposed [1,8,17,29,37]. Generally speaking, these methods for identification of the TSK models rely on the acquisition of expert knowledge. Recently, artificial intelligence techniques have been used where expert knowledge is lacking, by means of artificial neural networks, genetic algorithms (GAs) and many other learning strategies [7,8,10,11,15,18,19,21-24,29,30,34]. These algorithms, especially GAs, were mostly used in dealing with logic models. Little research has been done on using GAs for identification of the TSK models.
Building a TSK fuzzy model involves two tasks: structure identification and parameter identification. Two important steps in structure identification are the determination of the number of if-then rules and the partition of the input space, which gives rise to a set of fuzzy sets associated with membership functions. Parameter identification involves identifying the parameters of the membership functions and the parameters of the functional consequents.
In this paper, we propose an approach for identification of the multi-input and single-output (MISO) TSK models by means of genetic algorithm based learning. Investigations along this direction have been done [6,7,11,15,18,19], but most of these studies focus on one aspect or another, for example, tuning membership functions or optimising logic models. Our modelling algorithm contains four blocks: Partition Block, GA Block, Tuning Block and Termination Block. These blocks enable partitioning of fuzzy variables and optimisation of membership functions in a unified frame. Novel procedures are implemented in these blocks to enhance the efficiency of the modelling process.
This paper is organised as follows. Section 2 presents the basics of TSK models and genetic algorithms. Section 3 proposes and discusses the GA based learning algorithm for TSK modelling. Two case studies are presented in Section 4 to show the effectiveness of the algorithm. Discussion and conclusions are given in Section 5.
2. Basics of TSK models and genetic algorithms
2.1. TSK models
Consider the nonlinear multi-input and single-output system
$$y = f(x_1, x_2, \ldots, x_n) \tag{2.1}$$
which has $m$ known operating points $(x_1^j, x_2^j, \ldots, x_n^j)$, $j = 1, \ldots, m$, around which the system can be decomposed into $m$ linearised subsystems $y^j = f^j(x_1, x_2, \ldots, x_n)$. The TSK model for the system can be represented as
$$R^j:\ \text{If } x_1 \text{ is } A_1^j \text{ and } x_2 \text{ is } A_2^j \text{ and } \ldots \text{ and } x_n \text{ is } A_n^j \ \text{ Then } y^j = f^j(x_1, x_2, \ldots, x_n). \tag{2.2}$$
The output of the model is
$$f_{Fuzzy}(x) = y = \sum_{j=1}^{m} B^j y^j \tag{2.3}$$
with
$$B^j = \frac{\tau^j}{\sum_{l=1}^{m} \tau^l}, \qquad \tau^j = \prod_{i=1}^{n} A_i^j(x_i) \qquad \text{and} \qquad A_i^j(x_i) = \exp\left(-\left(\frac{x_i - \alpha_i^j}{\beta_i^j}\right)^2\right), \tag{2.4}$$
where $\alpha_i^j$ and $\beta_i^j$ are constant parameters. Here $R^j$, $j = 1, \ldots, m$, denotes the $j$th rule in the rule-base, $x_i$, $i = 1, \ldots, n$, the input state variable, $A_i^j$ the fuzzy set, $A_i^j(x_i)$ the membership function for the $i$th input variable around the $j$th operating point, $\tau^j$ the degree of firing of the $j$th rule, and $y^j$ the linear function around the $j$th operating point.
By partitioning the state space into regions according to the operating points, and connecting the linearised models together to form a TSK model, we arrive at a simple model which is easy to handle. The power of the TSK model is that a complex system can be decomposed into simple subsystems, each of which can be treated by applying linear system theory. The complex system can therefore be represented by aggregating simple subsystems into a global model.
2.2. Genetic algorithms
GAs are an exploratory search and optimisation procedure based on the principle of natural evolution and population genetics. The basic concepts of GAs were proposed by Holland [5]. De Jong laid the foundation of GAs [9]. Goldberg, Davis and Michalewicz provided recent comprehensive overviews and introductions to GAs [4,20]. Typically, there are five basic components in a GA:
1. A genetic representation or encoding of chromosomes for potential solutions to a problem;
2. A method to create an initial population;
3. A fitness (or objective) function to evaluate each chromosome;
4. Operators such as crossover and mutation to perform an evolution process;
5. Working parameters such as population size, probabilities of applying genetic operators, and termination criterion.
GAs have an advantage over traditional searching techniques in that the search is not biased towards a locally optimal solution. The robustness of GAs is due to their capacity to locate the global optimum in a large and complex landscape. GAs have been given great attention for engineering problems where large and complex search spaces exist, and have been used successfully in a wide range of applications such as function optimisation [4,20,25], neural network training, and finding the parameters of control systems [6,7,11,15,18-20].
3. The genetic algorithm based learning algorithm
In this section, the genetic algorithm based learning (GABL) algorithm is proposed and discussed in detail for the TSK model identification.
3.1. The structure of GABL algorithm
Assume we are given pairs of input-output data $(x^k, y^k)$, $x^k \in X \subset R^n$, $y^k \in Y \subset R$, where the data are randomly generated from an unknown system $g(x)$, $x = (x_1, x_2, \ldots, x_n)^T \in X \subset R^n$. Our task is to find a TSK fuzzy model $f_{Fuzzy}$ in the form of (2.3) such that, for a given error tolerance $\varepsilon > 0$,
$$\left|g(x^k) - f_{Fuzzy}(x^k)\right| = \left|g(x^k) - \sum_{j=1}^{m} B^j(x^k) f^j(x^k)\right| < \varepsilon. \tag{3.1}$$
The GABL algorithm is proposed as follows. It contains four different functional blocks, namely Partition Block, GA Block, Tuning Block and Termination Block. The GABL algorithm is a recursive procedure: it focuses on an objective TSK model and gradually improves the model performance to match the given data until the inequality (3.1) is satisfied. Fig. 1 shows the structure of the GABL algorithm.
The data set comes from an unknown system by random sampling. We assume that the sampled data contain some important features of the unknown nonlinear system, and that noise in the data is persistent. Otherwise, we need to generate more data in order to build a proper TSK model.
The Partition Block is used to determine an estimated partition along one input variable, namely the one that affects the output most significantly among all variables. It also identifies an initial structure of the TSK model based on the estimated partition.
The GA Block searches and optimises the structure of the TSK model, and produces initial settings of parameters for the TSK model. Because of the trade-off between speed and accuracy, the GA Block acts as a coarse searching process in order to quickly locate a near optimal solution, and interacts with the Tuning Block toward a possibly good TSK model.
The Tuning Block is a fine tuning process. It accepts the initial parameters from the GA Block, and optimises the model by a gradient descent based learning algorithm.
The Termination Block checks whether the estimated TSK model satisfies condition (3.1). If the
condition is satisfied, then the TSK model becomes the solution. Otherwise, the model is improved in two steps based on this criterion. If the parameters in an estimated TSK model do not improve the performance after the Tuning Block, then we continue partitioning the input space by going back to the Partition Block, followed by the GA Block and Tuning Block, until the TSK model approximates the given data within a satisfactory tolerance error. Note that in the GABL algorithm, the searching method is objective because the TSK model is built from data, and no a priori knowledge is needed. In the following, details of the function of each block are given.
Fig. 1. The structure of the GABL algorithm.
3.2. Partition Block
In the Partition Block, a procedure is proposed to search for a better partition from a given set of data according to the significance of input variables. The procedure contains the following steps:
(a) Obtain a linear function approximation. Denote the given set of data, $(x^k, y^k)$, $x^k \in X \subset R^n$, $y^k \in Y \subset R$, $k = 1, \ldots, m$, as $G^0$. Use the least squares method to establish a linear function approximation $f_0$ for the group of data such that
$$f_0(x) = b^T x, \tag{3.2}$$
$$b = (X^T X)^{-1} X^T Y, \tag{3.3}$$
where $X$ is an $m \times n$ matrix, $Y$ is an $m \times 1$ vector, and $b$ is an $n \times 1$ vector representing the parameter estimation of $f_0(x)$. Here we assume the number of sets of data, $m$, is much greater than the dimension, $n$. This is a necessary condition to guarantee that the estimation in (3.3) is nonsingular. (Note that other learning schemes can be used as well, such as the Kalman filter [37], depending upon the data dealt with.)
(b) Find the maximum error point. Find the maximum error point $x^*$ for the group $G^0$, which maximises
$$\left|f_0(x^k) - y^k\right| \tag{3.4}$$
among all $x^k$, $k = 1, \ldots, m$. The purpose of this step is to capture the nonlinearity of the given data. The function $f_0$ is an estimate based on the average of the data. If we accept that the data have persistent noise and contain some important features of the unknown nonlinear system, then we assume that the nonlinearity of the system may exist at the point $x^*$.
(c) Divide the group of data into subgroups. After the maximum error point $x^* = (x_1^*, x_2^*, \ldots, x_n^*)^T$ is found, divide the data set into two groups for each input variable, such as
$$G_{i1}^1:\ (x^{k_{i1}}, y^{k_{i1}}),\ k_{i1} = 1, \ldots, c_{i1};\ \text{when } x_i \le x_i^*; \qquad G_{i2}^1:\ (x^{k_{i2}}, y^{k_{i2}}),\ k_{i2} = 1, \ldots, c_{i2};\ \text{when } x_i > x_i^*, \tag{3.5}$$
where $n + 1 < c_{i1}, c_{i2} < m$ are the sizes of the groups, and $i = 1, \ldots, n$. Note that the number of pairs of data in each group must be larger than the number of input variables. The superscript of $G$ indicates the first partition. Thus we have $n$ partitions along the input variables and $2n$ subgroups of data $G_{ij}^1$: $(x^{k_{ij}}, y^{k_{ij}})$, $i = 1, \ldots, n$, $j = 1, 2$, and $k_{ij} = 1, \ldots, c_{ij}$.
(d) Determine the significance of input variables. To determine the significance of input variables, the linear approximation function $f_{ij}$, $i = 1, \ldots, n$ and $j = 1, 2$, for each group of data in (3.5) is first established by using (3.2) and (3.3); then the performance of each estimated partition for each input variable is calculated as
$$p_i = \sqrt{\frac{\sum_{k=1}^{m}\left(\sum_{j=1}^{2} f_{ij}(x^k)\, S_{ij}(x^k) - y^k\right)^2}{m}}, \tag{3.6}$$
where
$$S_{ij}(x^k) = \begin{cases} 1, & x^k = x^{k_{ij}}, \\ 0, & x^k \ne x^{k_{ij}} \end{cases} \tag{3.7}$$
is a switching function. We refer to expression (3.6) as a performance index function. The input significance is determined by the performance value $p_i$ defined by (3.6). An input variable $x_i$ is a significant input if $p_i$ is smaller than those of the others. The choice of the significant input variable depends on the improvement of performance due to the partition at that input variable. If partitioning along one variable and aggregating the associated linear functions gives a better performance, then this input variable is the most significant one among the variables.
(e) Partition the input space. Partition the input space into two subspaces along the significant input variable, and construct an initial TSK model, which has two rules and one premise variable in the form of
$$R^1:\ \text{If } x_i \text{ is } A_i^1 \ \text{ Then } y^1 = a_0^1 + a_1^1 x_1 + \cdots + a_n^1 x_n,$$
$$R^2:\ \text{If } x_i \text{ is } A_i^2 \ \text{ Then } y^2 = a_0^2 + a_1^2 x_1 + \cdots + a_n^2 x_n. \tag{3.8}$$
Expression (3.8) is an estimated model based on the partition along the most significant input variable $x_i$. To see if the model (3.8) is sufficient to approximate the input-output data, we use the GA Block together with the Tuning Block to search and optimise the structure and the parameters of the model. Further partitioning stops when expression (3.1) is satisfied. If (3.1) is not satisfied, then we return to the Partition Block to further partition the input space, i.e., further partition the two groups of data $G_{ij}^1$: $(x^{k_{ij}}, y^{k_{ij}})$, $i = 1, \ldots, n$, $j = 1, 2$, and $k_{ij} = 1, \ldots, c_{ij}$. It is not necessary to partition both groups of data, because the nonlinearity appears significantly in one of them. Thus further partitioning only needs to be done in one group of the data. Which group is partitioned is determined by calculating the performance of the individual approximate linear function $f_{ij}$ associated with the group of data $G_{ij}$ using a modified form of (3.6), which can be written as
$$p_{ij} = \sqrt{\frac{\sum_{k_{ij}=1}^{c_{ij}}\left(f_{ij}(x^{k_{ij}}) - y^{k_{ij}}\right)^2}{c_{ij}}}, \tag{3.9}$$
where $k_{ij} = 1, \ldots, c_{ij}$, and $c_{ij}$ is the size of $G_{ij}$. If the approximate linear function $f_{ij}$ associated with $G_{ij}$ has a larger value of $p_{ij}$, then $G_{ij}$ should be further partitioned.
3.3. The GA Block
The optimisation of the initially constructed TSK model is done by the GA Block together with the Tuning Block. We use the GA as a coarse searching process. How to encode the parameters of the TSK model into a chromosome without using a binary representation is a problem. For higher order TSK models, coding in binary fashion is not an efficient method. In addition, the uncertain boundaries (or domains) specified in the partition will affect the length of chromosomes if binary coding is used. We use real values instead and adaptively adjust the domains of the parameters.
Suppose that there are $m - 1$ points denoted $x_i^{*p}$, $p = 1, \ldots, m - 1$, $i = 1, \ldots, n$, along the input variable $x_i$ within the region $\underline{x}_i < x_i < \bar{x}_i$, where the underline means the lower bound and the overline means the upper bound. Thus the region $(\underline{x}_i, \bar{x}_i)$ can be divided into $m$ subregions $(\underline{x}_i^j, \bar{x}_i^j)$, $j = 1, \ldots, m$, based on the neighbouring points $x_i^{*p}$. Each subregion is associated with an estimated membership function and a linear function. The domains for all parameters are determined as follows:
The domain for the central point $\alpha_i^j$ of the membership function $A_i^j(x_i)$, $j = 1, \ldots, m$, $i = 1, \ldots, n$, is denoted by $(\underline{x}_i^j, \bar{x}_i^j)$. The domain is chosen so that the central point remains in the partitioned subregion.
The domain for the $\beta_i^j$ of the membership function $A_i^j(x_i)$, $j = 1, \ldots, m$, $i = 1, \ldots, n$, is calculated based on the size of the domain. Let the length of $(\underline{x}_i^j, \bar{x}_i^j)$ be $\delta_i^j$; then the domain of $\beta_i^j$ is denoted by $(\delta_i^j/2, 2\delta_i^j)$. The choice of the domain for $\beta_i^j$ is based on each fuzzy set being partially overlapped with neighbouring fuzzy sets.
The domain of the parameters in each linear function does not have an explicit form. However, the estimated domain can be implicitly denoted by the region $(\underline{x}^j, \bar{x}^j)$, $j = 1, \ldots, m$, with associated grouped data $G_{ij}$: $(x^{k_{ij}}, y^{k_{ij}})$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, $k_{ij} = 1, \ldots, c_{ij}$, where $\underline{x}^j, \bar{x}^j$ are $n \times 1$ vectors. Let the estimated linear function in the region be $y^j = a_0^j + a_1^j x_1 + \cdots + a_n^j x_n$. The parameters of the linear function can be estimated by sampling $n + 1$ different points from the group data $G_{ij}$: $(x^{k_{ij}}, y^{k_{ij}})$. The expressions (3.2) and (3.3) can be used to calculate the estimated parameters. Thus searching the parameters of the linear function becomes searching points from the given grouped data $G_{ij}$: $(x^{k_{ij}}, y^{k_{ij}})$ which are in the region $(\underline{x}^j, \bar{x}^j)$, $j = 1, \ldots, m$. Note that the number of pairs of data in the group $G_{ij}$ must be larger than the number of variables, $c_{ij} > n + 1$. For simplification, we use the notation $\cong$ as an implicit belonging for the parameters $a_l^j$, $l = 0, \ldots, n$, of the linear function, so that $a_l^j \cong (\underline{x}^j, \bar{x}^j)$; that is, each of the parameters is calculated implicitly from the region $(\underline{x}^j, \bar{x}^j)$ with the associated grouped data $G_{ij}$. The domains of the parameters $(\alpha_i^1, \beta_i^1)$ and $(\alpha_i^m, \beta_i^m)$ will be adjusted, as there may exist a potential solution outside the region.
(a) Coding. The type of coding used in the GA is in segment fashion with a length of $m$. This coding joins together the segment codes of all parameters into one chromosome. Let the $j$th segment be the $j$th rule in the TSK model (2.3); the $j$th segment is illustrated in Fig. 2. The chromosome consists of $m$ segments to represent a potential TSK model. The chromosome is initialised by randomly generating real values from the given searching domains.
Fig. 2. The jth segment coding.
(b) Fitness function and selection. Each chromosome is ranked by a performance index, which is a mean squared error,
$$E_I = \frac{\sum_{k=1}^{m}\left(y^k - f_{Fuzzy}(x^k)\right)^2}{m}, \tag{3.10}$$
where $(x^k, y^k)$, $k = 1, \ldots, m$, are the pairs of the given input and output data, and $f_{Fuzzy}(x)$ is the TSK model (2.3). Note that the performance index (3.10) is similar to (3.6) except that it is for the TSK model. The selection is based on the chromosome performance index (3.10). The population is selected by so-called steady-state reproduction, in which the half of the population's chromosomes with better fitness is kept, and new offspring are introduced into the population to replace the less fit chromosomes [20].
(c) Crossover and mutation. In order to bring new solutions into the next trial, crossover and mutation are applied. Crossover is applied twice. First, a number $j$ is randomly picked among the segments in the chromosome, and the segments after number $j$ in the two chromosomes are swapped. This crossover exchanges partial rules between two models with better ranks. Second, information about individual parameters is exchanged. We use a one point exchange in two selected chromosomes, that is, we randomly generate a number $l$ and use it to exchange the $l$th parameter of the two selected parents. Mutation updates the parameter values within the chromosome by randomly sampling a value from the domain of that parameter.
(d) Adjust $(\underline{x}_i^1, \bar{x}_i^1)$ and $(\underline{x}_i^m, \bar{x}_i^m)$ for central points $\alpha_i^1$ and $\alpha_i^m$, and termination condition. The regions $(\underline{x}_i^1, \bar{x}_i^1)$ and $(\underline{x}_i^m, \bar{x}_i^m)$ could be very large, e.g., if the location of the central point $\alpha_i^1$ is estimated on the left side of $x_i^*$, then the region $(\underline{x}_i^1, \bar{x}_i^1)$ could be $(-\infty, x_i^*)$. When coding the chromosome, this set cannot be coded. We can restrict the domain to the region $(d, x_i^*)$ where $d$ is an appropriate value. But such a restriction is not good because there may exist a good solution outside this region. Thus the domain has to be adjusted in order to explore a possibly good model. To do this, we let the GA run for a specified number of iterations, take the best chromosome as an initially estimated TSK model, and use the Tuning Block to train the model. If the parameters are close to or beyond the predefined domain, then we enlarge the regions to $(\underline{x}_i^1 - \delta_i, \bar{x}_i^1)$ and $(\underline{x}_i^m, \bar{x}_i^m + \sigma_i)$, where $\delta_i$ and $\sigma_i$, $i = 1, \ldots, n$, are some positive real values. The GA is used for further search until enlarging the domains no longer improves the performance much and the number of iterations runs out.
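The coding, selection, crossover and mutation steps of the GA Block can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions that are not in the paper: the model has a single input, each segment is the list `[alpha, beta, a0, a1]`, and the consequent coefficients are drawn from explicit ranges rather than implicitly from sampled data points. The names `tsk_output`, `mse`, `run_ga` and the example domains are hypothetical.

```python
import math
import random

def tsk_output(chrom, x):
    """Single-input TSK output; chrom is a list of [alpha, beta, a0, a1] segments."""
    taus = [math.exp(-((x - al) / be) ** 2) for al, be, _, _ in chrom]
    total = sum(taus) or 1e-12  # guard against all memberships underflowing to 0
    return sum(t * (a0 + a1 * x) for t, (_, _, a0, a1) in zip(taus, chrom)) / total

def mse(chrom, data):
    """Mean squared error performance index, in the spirit of (3.10)."""
    return sum((y - tsk_output(chrom, x)) ** 2 for x, y in data) / len(data)

def run_ga(data, domains, pop_size=30, iters=100, p_mut=0.05, seed=0):
    """domains[j][l] = (lo, hi) is the search domain of parameter l in segment j."""
    rng = random.Random(seed)
    n_seg, n_par = len(domains), len(domains[0])
    # Initialise chromosomes with real values drawn from the search domains.
    pop = [[[rng.uniform(lo, hi) for lo, hi in dom] for dom in domains]
           for _ in range(pop_size)]
    for _ in range(iters):
        pop.sort(key=lambda c: mse(c, data))
        parents = pop[: pop_size // 2]  # steady state: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            c1 = [seg[:] for seg in p1]
            c2 = [seg[:] for seg in p2]
            # First crossover: swap all segments (rules) after a random segment j.
            j = rng.randrange(n_seg)
            c1[j + 1:], c2[j + 1:] = c2[j + 1:], c1[j + 1:]
            # Second crossover: exchange the l-th parameter of the two chromosomes.
            l = rng.randrange(n_seg * n_par)
            s, q = divmod(l, n_par)
            c1[s][q], c2[s][q] = c2[s][q], c1[s][q]
            # Mutation: resample a parameter from its own domain.
            for child in (c1, c2):
                for s_idx, dom in enumerate(domains):
                    for q_idx, (lo, hi) in enumerate(dom):
                        if rng.random() < p_mut:
                            child[s_idx][q_idx] = rng.uniform(lo, hi)
            children.extend([c1, c2])
        pop = parents + children[: pop_size - len(parents)]
    return min(pop, key=lambda c: mse(c, data))

# Hypothetical example: two rules on [0, 2], split at x = 1.
data = [(0.05 * k, math.sin(math.pi * 0.05 * k)) for k in range(41)]
domains = [
    [(0.0, 1.0), (0.5, 2.0), (-5.0, 5.0), (-5.0, 5.0)],
    [(1.0, 2.0), (0.5, 2.0), (-5.0, 5.0), (-5.0, 5.0)],
]
best = run_ga(data, domains, iters=20)
print(mse(best, data))
```

Because both crossovers only exchange values that sit at the same position in two chromosomes, and mutation resamples from the parameter's own domain, every parameter stays inside its domain. Enlarging a domain, as in step (d), would then amount to editing the corresponding `(lo, hi)` pair before the next GA run.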
3.4. Tuning Block
In this block, we develop a gradient based learning algorithm for tuning the parameters of the TSK model $f_{Fuzzy}$ in the form of (2.3). This is needed because GAs are not good at searching locally for an optimal solution. We wish to optimise the parameters of the model $f_{Fuzzy}(x)$ such that
$$E^k = \tfrac{1}{2}\left(f_{Fuzzy}(x^k) - y^k\right)^2 = \tfrac{1}{2}(e^k)^2 \tag{3.11}$$
is minimised. Note that (3.11) is the same as (3.1) but in a slightly different form. For convenience, we denote $f_{Fuzzy}(x)$ as $\hat{y}$, and the consequent linear function for each rule as $y^j$, $j = 1, \ldots, m$. The expression (3.11) is rewritten as
$$E^k = \tfrac{1}{2}\left(\hat{y}^k - y^k\right)^2 = \tfrac{1}{2}(e^k)^2. \tag{3.12}$$
We formulate this problem as minimising the square of the instantaneous error between the output of the fuzzy model $\hat{y}$ and the current output reading $y^k$ with respect to the parameters $\alpha_i^j$, $\beta_i^j$, $a_0^j$ and $a_i^j$, $j = 1, \ldots, m$, $i = 1, \ldots, n$. Therefore, the problem becomes training the parameters such that (3.12) is minimised. Using the gradient descent method, we adjust the parameters in the direction of greatest decrease of the squared instantaneous error between the output of the TSK model $\hat{y}$ and the output of the actual system $y$ at each reading point $k$. The parameter training laws can then be written as
$$\alpha_i^j(k+1) = \alpha_i^j(k) - \lambda \left.\frac{\partial E^k}{\partial \alpha_i^j}\right|_k, \tag{3.13}$$
$$\beta_i^j(k+1) = \beta_i^j(k) - \lambda \left.\frac{\partial E^k}{\partial \beta_i^j}\right|_k, \tag{3.14}$$
$$a_0^j(k+1) = a_0^j(k) - \lambda \left.\frac{\partial E^k}{\partial a_0^j}\right|_k, \tag{3.15}$$
$$a_i^j(k+1) = a_i^j(k) - \lambda \left.\frac{\partial E^k}{\partial a_i^j}\right|_k, \tag{3.16}$$
where $\lambda > 0$ is the learning rate. In reality, $\lambda$ should be different for each of the parameters because of the different scale of each parameter, and adaptive adjustment of $\lambda$ should be considered. In this study, we keep it simple and use a smaller $\lambda$, which means that the training time will be longer compared with parameter-specific settings and adaptive adjustment. Calculating the partial derivatives in (3.13)-(3.16) yields:
$$\frac{\partial E^k}{\partial \alpha_i^j} = (\hat{y} - y^k)(1 - B^j)\, y^j\, \frac{2(x_i - \alpha_i^j)}{(\beta_i^j)^2} = 2e^k(1 - B^j)\, y^j\, \frac{x_i - \alpha_i^j}{(\beta_i^j)^2}, \tag{3.17}$$
$$\frac{\partial E^k}{\partial \beta_i^j} = (\hat{y} - y^k)(1 - B^j)\, y^j\, \frac{2(x_i - \alpha_i^j)^2}{(\beta_i^j)^3} = 2e^k(1 - B^j)\, y^j\, \frac{(x_i - \alpha_i^j)^2}{(\beta_i^j)^3}, \tag{3.18}$$
$$\frac{\partial E^k}{\partial a_0^j} = (\hat{y} - y^k) B^j = e^k B^j, \tag{3.19}$$
$$\frac{\partial E^k}{\partial a_i^j} = (\hat{y} - y^k) B^j x_i = e^k B^j x_i. \tag{3.20}$$
Substituting (3.17)-(3.20) into (3.13)-(3.16), we obtain the learning laws for the parameters:
$$\alpha_i^j(k+1) = \alpha_i^j(k) - 2\lambda e^k(1 - B^j)\, y^j\, \frac{x_i - \alpha_i^j}{(\beta_i^j)^2}, \tag{3.21}$$
$$\beta_i^j(k+1) = \beta_i^j(k) - 2\lambda e^k(1 - B^j)\, y^j\, \frac{(x_i - \alpha_i^j)^2}{(\beta_i^j)^3}, \tag{3.22}$$
$$a_0^j(k+1) = a_0^j(k) - \lambda e^k B^j, \tag{3.23}$$
$$a_i^j(k+1) = a_i^j(k) - \lambda e^k B^j x_i, \tag{3.24}$$
where $e^k = \hat{y} - y^k$, $x_i$, $i = 1, \ldots, n$, is an input variable and $B^j$, $j = 1, \ldots, m$, is a fuzzy basis function in the form of (2.4).
The learning procedure for the TSK model in Fig. 1 is a two-pass procedure. In the forward pass, for a given input $x^k$, the current $B^j$, $y^j$, $\hat{y}$ and $e^k$ are calculated with the current estimated parameters $\alpha_i^j(k)$, $\beta_i^j(k)$, $a_0^j(k)$ and $a_i^j(k)$. In the backward pass, the current parameters are updated according to the learning laws (3.21)-(3.24). We train the TSK model with the input-output data until the parameters stabilise with no significant improvement.
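The two-pass procedure can be illustrated with a short Python sketch. It assumes Gaussian memberships and a normalised firing strength $B^j = \tau^j / \sum_l \tau^l$, and it uses the exact chain-rule gradient of $E^k = \frac{1}{2}(e^k)^2$ for the membership parameters, which is a swapped-in form and is not claimed to match the printed laws for $\alpha_i^j$ and $\beta_i^j$ term by term; the consequent updates follow (3.23) and (3.24). The function names `forward` and `train_step` and the rule layout are hypothetical.

```python
import math

def forward(rules, x):
    """Forward pass: return basis values B^j, rule outputs y^j and model output.

    Each rule is a dict with 'alpha' and 'beta' (length n) and 'a' (length n + 1).
    """
    taus = [math.prod(math.exp(-((xi - al) / be) ** 2)
                      for xi, al, be in zip(x, r["alpha"], r["beta"]))
            for r in rules]
    total = sum(taus) or 1e-300  # avoid division by zero far from all centres
    B = [t / total for t in taus]
    ys = [r["a"][0] + sum(ai * xi for ai, xi in zip(r["a"][1:], x)) for r in rules]
    y_hat = sum(b * yj for b, yj in zip(B, ys))
    return B, ys, y_hat

def train_step(rules, x, y, lr=0.02):
    """One gradient descent step on E = 0.5 * e**2; returns E before the update."""
    B, ys, y_hat = forward(rules, x)           # forward pass
    e = y_hat - y
    for r, b, yj in zip(rules, B, ys):         # backward pass
        common = e * b * (yj - y_hat)          # e * d(y_hat)/d(log tau^j)
        for i, xi in enumerate(x):
            d = xi - r["alpha"][i]
            be = r["beta"][i]
            g_alpha = common * 2.0 * d / be ** 2
            g_beta = common * 2.0 * d ** 2 / be ** 3
            r["alpha"][i] -= lr * g_alpha
            r["beta"][i] -= lr * g_beta
        r["a"][0] -= lr * e * b                # constant term, as in (3.23)
        for i, xi in enumerate(x):
            r["a"][i + 1] -= lr * e * b * xi   # slope terms, as in (3.24)
    return 0.5 * e * e

# Hypothetical two-rule, single-input model trained on one reading.
rules = [
    {"alpha": [0.0], "beta": [1.0], "a": [0.0, 1.0]},
    {"alpha": [2.0], "beta": [1.0], "a": [1.0, 0.0]},
]
for _ in range(5):
    print(train_step(rules, [0.5], 2.0, lr=0.01))
```

A single shared learning rate is used here, as in the text; per-parameter or adaptive rates would only change the `lr` factor applied to each gradient.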
3.5. Termination Block
In this block, the next step of the process is determined based on the performance of the estimated TSK model. When the Tuning Block completes its task, the Termination Block checks whether the model satisfies expression (3.1) for the tolerance error. If (3.1) is satisfied, then we stop the modelling and accept the TSK model as our solution. Otherwise, the modelling process continues along one of two paths: one goes to further partitioning, the other goes back to the GA Block. The path is chosen by examining whether the parameters are close to or outside the predefined domains after the Tuning Block. If they are, then we return to the GA Block. Otherwise, we go to the next cycle, that is, we further partition the input space, and search and optimise the partition and parameters until (3.1) is satisfied.
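The decision made by the Termination Block can be written as a small Python function. This is a sketch only: the text does not quantify "close to" a domain bound, so the `margin` parameter (a fraction of the domain width) is an assumption, and the names `near_or_outside` and `next_action` are hypothetical.

```python
def near_or_outside(value, lo, hi, margin=0.05):
    """True if value is outside (lo, hi) or within margin * width of a bound."""
    width = hi - lo
    return value <= lo + margin * width or value >= hi - margin * width

def next_action(errors, eps, params, domains, margin=0.05):
    """Return 'accept', 'ga' or 'partition' for the tuned model.

    errors  : model errors at the data points, checked against (3.1)
    params  : tuned parameter values
    domains : matching list of (lo, hi) search domains
    """
    if max(abs(e) for e in errors) < eps:
        return "accept"          # (3.1) holds at every data point
    if any(near_or_outside(p, lo, hi, margin)
           for p, (lo, hi) in zip(params, domains)):
        return "ga"              # enlarge the domains and search again
    return "partition"           # otherwise partition the input space further

print(next_action([0.1, -0.2], 0.5, [1.0], [(0.0, 2.0)]))
```

Returning a label rather than calling the next block directly keeps the decision separate from the blocks themselves, so the same check can be reused after every pass through the Tuning Block.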
4. Case studies
In this section two case studies are presented: one is a nonlinear function approximation and the other a human operation to control a chemical plant. These two systems are modelled by the TSK model using the proposed GABL approach.
4.1. A nonlinear function approximation
Consider the nonlinear function in the form of
$$y = 1 + 0.5x_1 + 5\sin(\pi x_2), \qquad 0 \le x_1, x_2 \le 2. \tag{4.1}$$
The system output is illustrated in Fig. 3. (Note that in Fig. 3 and the following three dimensional figures, the scale of each coordinate is enlarged by the Matlab software for visualisation purposes.)
Fig. 3. The output of nonlinear system (4.1).
To show the use of the Partition Block, we randomly generate data from the system (4.1) as $(x^k, y^k)$, $x^k = (x_1^k, x_2^k)^T$ and $k = 1, \ldots, 100$, and we set the error tolerance $\varepsilon = 0.5$. Fig. 4 shows the sampled data distribution in the input space, and Fig. 5 shows the output using sampled data from the system (4.1).
Fig. 4. The sample points over the input space.
Fig. 5. The output of sampled data from (4.1).
We first establish a linear function approximation using expressions (3.2) and (3.3), and have
$$f_0(x_1, x_2) = 5.525 + 0.2344x_1 - 4.4021x_2,$$
where the subscript 0 means zero partition. Fig. 6 shows the output of the linear function approximation. Because only one group of data is available, no more linear functions need to be established. For later comparison, we measure the performance of the linear function approximation using expressions (3.6) and (3.7), and find that $p_0 = 2.107$.
Fig. 6. The output of the linear function $f_0$.
We now search for the maximum error point between $f_0(x_1^k, x_2^k)$ and $y^k$, $k = 1, \ldots, 100$; the resulting maximum error point between $f_0$ and $y$ is located at $(x_1^*, x_2^*) = (0.6737, 1.355)$. Based on this maximum error point, we divide the data into four subgroups (there are two inputs), two for each input variable, such that
$G_{11}^1$: $(x^{k_{11}}, y^{k_{11}})$, $k_{11} = 1, \ldots, 28$; where $x_1 \le 0.6737$;
$G_{12}^1$: $(x^{k_{12}}, y^{k_{12}})$, $k_{12} = 1, \ldots, 72$; where $x_1 > 0.6737$;
$G_{21}^1$: $(x^{k_{21}}, y^{k_{21}})$, $k_{21} = 1, \ldots, 62$; where $x_2 \le 1.355$;
$G_{22}^1$: $(x^{k_{22}}, y^{k_{22}})$, $k_{22} = 1, \ldots, 38$; where $x_2 > 1.355$,
where $G_{ij}^l$ represents the partitioned group of data, in which the superscript $l$ denotes the $l$th partition and the subscripts $ij$ denote the $j$th group of data obtained by partitioning at the $i$th input variable. For instance, $G_{21}^1$ stands for group 1 of input variable 2 from the first partition. Figs. 7 and 8 illustrate the distribution of the groups in the input space. In Fig. 7, the input space is divided into two subspaces along input variable $x_1$. The left hand side of $x_1^*$ is the subspace of $G_{11}^1$, and the right hand side is that of $G_{12}^1$. Similarly, the locations of $G_{21}^1$ and $G_{22}^1$ are shown in Fig. 8.
Fig. 7. The division of input space along $x_1$.
Fig. 8. The division of input space along $x_2$.
In order to determine the input variable significance, we establish a linear function estimation for each subgroup of data using expressions (3.2) and (3.3), and aggregate the two functions for each input variable into one. For instance, for $G_{11}^1$ and $G_{12}^1$, which are divided along the input variable $x_1$, we have the following two linear functions:
$f_{11}(x_1, x_2) = 6.2834 - 2.1707x_1 - 4.542x_2$, $0 < x_1 \le x_1^*$, $0 < x_2 < 2$;
$f_{12}(x_1, x_2) = 5.9862 - 0.1460x_1 - 4.2753x_2$, $x_1^* \le x_1 \le 2$, $0 < x_2 < 2$;
and $f_1$ is a function such that $f_1 = f_{11} + f_{12}$. Fig. 9 shows the output of $f_1$, $0 \le x_1, x_2 < 2$. Similarly, $f_2 = f_{21} + f_{22}$, where $f_{21}$ and $f_{22}$ have the following forms:
$f_{21}(x_1, x_2) = 6.3940 + 0.0147x_1 - 5.2845x_2$,
$f_{22}(x_1, x_2) = -16.4006 + 0.5087x_1 + 8.2118x_2$, $0 \le x_1 \le 2$, $x_2^* < x_2 < 2$.
The output of $f_2$ is shown in Fig. 10. We calculate the performances of $f_1$ and $f_2$ using expressions (3.6) and (3.7). The performance of $f_1$ is $p_1 = 2.066$ and the performance of $f_2$ is $p_2 = 1.6299$.
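The Partition Block steps used in this example (linear fit, maximum error point, split along each input, performance index) can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: it fits an intercept together with the slopes, solves the normal equations with a small hand-written elimination routine, and draws its own random sample from (4.1), so its numbers will differ from those reported above. All function names are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fit_linear(data):
    """Least squares fit y ~ b0 + b1*x1 + ... + bn*xn via the normal equations."""
    rows = [[1.0] + list(x) for x, _ in data]
    ys = [y for _, y in data]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve(A, b)
    return lambda x: coef[0] + sum(c * xi for c, xi in zip(coef[1:], x))

def rmse(pairs):
    """Root mean squared error over (prediction, target) pairs, as in (3.6)."""
    return math.sqrt(sum((f - y) ** 2 for f, y in pairs) / len(pairs))

def partition_scores(data):
    n = len(data[0][0])
    f0 = fit_linear(data)                                       # step (a)
    p0 = rmse([(f0(x), y) for x, y in data])
    x_star = max(data, key=lambda d: abs(f0(d[0]) - d[1]))[0]   # step (b)
    scores = []
    for i in range(n):                                          # steps (c), (d)
        groups = ([d for d in data if d[0][i] <= x_star[i]],
                  [d for d in data if d[0][i] > x_star[i]])
        if min(len(g) for g in groups) <= n + 1:
            scores.append(math.inf)  # a group is too small to fit reliably
            continue
        pairs = []
        for g in groups:
            f = fit_linear(g)
            pairs += [(f(x), y) for x, y in g]
        scores.append(rmse(pairs))
    return x_star, p0, scores

rng = random.Random(1)
data = []
for _ in range(100):
    x = (rng.uniform(0, 2), rng.uniform(0, 2))
    data.append((x, 1 + 0.5 * x[0] + 5 * math.sin(math.pi * x[1])))
x_star, p0, scores = partition_scores(data)
print(x_star, p0, scores)
```

The input with the smallest entry in `scores` would be taken as the most significant one. A split can never score worse than the single linear fit, because each group's least squares fit is at least as good on that group as the global one.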
Fig. 9. The output of $f_1$.
Fig. 10. The output of $f_2$.
Note that the performances of both $f_1$ and $f_2$ are better than that of $f_0$, and the performance of $f_2$ is better than that of $f_1$. This result implies that partitioning the input space along $x_2$ by using linear approximation would lead to a better solution. As we can see from Figs. 3, 9 and 10, the output of $f_2$ is closer to that of the original system (4.1). We conclude that $x_2$ is more significant than $x_1$ in terms of influencing the output because changing $x_2$ will significantly change the output. Thus we can initialise the TSK model as having two rules and one premise variable with a possible boundary at $x_2^*$ such that
$$R^1:\ \text{If } x_2 \text{ is } A_2^1 \ \text{ Then } y^1 = a_0^1 + a_1^1 x_1 + a_2^1 x_2,$$
$$R^2:\ \text{If } x_2 \text{ is } A_2^2 \ \text{ Then } y^2 = a_0^2 + a_1^2 x_1 + a_2^2 x_2, \tag{4.2}$$
where $A_2^1$ is a fuzzy set which contains data on the left hand side of $x_2^*$, and similarly, $A_2^2$ contains data on the right hand side of $x_2^*$. In order to be able to use the GA Block, we estimate the domains for the parameters as follows:
$\alpha_2^1 \in (0, 1.335)$, $\alpha_2^2 \in (1.335, 2)$, $\beta_2^1 \in (0.775, 2.710)$, $\beta_2^2 \in (0.323, 1.290)$,
$a_l^1 \cong (\underline{x}^1, \bar{x}^1)$, where $\underline{x}_1^1 = 0$, $\bar{x}_1^1 = 2$, $\underline{x}_2^1 = 0$, $\bar{x}_2^1 = 1.335$,
$a_l^2 \cong (\underline{x}^2, \bar{x}^2)$, where $\underline{x}_1^2 = 0$, $\bar{x}_1^2 = 2$, $\underline{x}_2^2 = 1.335$, $\bar{x}_2^2 = 2$,
where $l = 0, 1, 2$. The chromosome structure is organised as shown in Fig. 11.
Fig. 11. The chromosome structure.
The parameters for the GA are: the population size is 30, the number of iterations is 100, $p_c = 1$ for the two crossovers, and $p_m = 0.05$. After the GA is run, we pick the chromosome with the best performance in the population to form the TSK model:
$$R^1:\ \text{If } x_2 \text{ is } A_2^1(0.1224, 0.9512) \ \text{ Then } y^1 = 1.0426 + 0.5264x_1 + 14.2314x_2,$$
$$R^2:\ \text{If } x_2 \text{ is } A_2^2(1.920, 0.9882) \ \text{ Then } y^2 = -28.9547 + 0.4839x_1 + 14.8643x_2, \tag{4.3}$$
where $A_2^1(0.1224, 0.9512)$ has two meanings: the fuzzy set $A_2^1$ and its Gaussian membership function with $\alpha_2^1 = 0.1224$ and $\beta_2^1 = 0.9512$. The output of the model is shown in Fig. 12. The error between the models (4.1) and (4.3) is shown in Fig. 13. We see that the output of the model has almost the same shape as the output of the system (4.1) shown in Fig. 3. However, Eq. (3.1) is not satisfied for the assumed tolerance $\varepsilon = 0.5$.
Fig. 12. The output of the fuzzy model (4.3) in the domain $0 \le x_1, x_2 \le 2$.
There are three options that we may consider in order to satisfy Eq. (3.1). The first option is that we increase the population
361
B. Wu, X. Yu / Fuzzy Sets and Systems 113 (2000) 351-365
-2' 0
20
40
60
80
/ 100
-5’ 0
20
Fig. 13. The error between the model (4.1) and (4.3).
size and the number of iterations, which is time consuming. The second option is to partition the input space further, to establish a higher-order model with more premise variables. The third option is to use another optimisation algorithm to speed up the search and the fine tuning. We choose the third option, which is implemented in the Tuning Block, because GAs are not good at searching locally for optimal solutions. We set the learning rate to $\lambda = 0.02$ and the training cycle to 50 iterations. We get the model in the form of

$R^1$: If $x_2$ is $A_2^1(0.3639, 0.7782)$ Then $y^1 = 1.704 + 0.5444 x_1 + 14.2198 x_2$,
$R^2$: If $x_2$ is $A_2^2(1.6368, 0.7686)$ Then $y^2 = -28.9615 + 0.5342 x_1 + 14.4457 x_2$.   (4.4)

The output of the model is shown in Fig. 14. Comparing the model (4.1) with the model (4.4) at each reading point $k$ using (3.1), we conclude that the model (4.4) is much better than (4.3): as shown in Fig. 15, the error between (4.1) and (4.4) stays within 0.5. From the 3D plot of the model (4.4) in Fig. 16 we can see that the overall approximation of (4.4) is acceptable, and the mapping error is contained within $\varepsilon = 0.5$. This result shows the effectiveness of the GABL algorithm. The error between the models (4.1) and (4.4) over the whole domain is shown in Fig. 17. Obviously, if we want higher precision, more rules will be generated. For example, given an error tolerance $\varepsilon = 0.2$, the following model of three rules and one premise variable can be derived:

$R^1$: If $x_2$ is $A_2^1(0.0816, 0.7024)$ Then $y^1 = 0.6885 + 0.4534 x_1 + 14.1537 x_2$,
$R^2$: If $x_2$ is $A_2^2(1.0084, 0.6119)$ Then $y^2 = 7.8332 + 0.5038 x_1 - 6.4367 x_2$,
$R^3$: If $x_2$ is $A_2^3(1.8497, 0.6690)$ Then $y^3 = -28.6610 + 0.5199 x_1 + 15.015 x_2$.   (4.5)

Fig. 14. The output of the TSK model (4.4).
Fig. 15. The error between (4.1) and (4.4).
Fig. 16. The output of model (4.4) in the domain $0 \le x_1, x_2 \le 2$.
Fig. 17. The error between model (4.4) and (4.1) in the domain $0 \le x_1, x_2 \le 2$.
Fig. 18. The output of (4.5) in $0 \le x_1, x_2 \le 2$.
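
To make the rule tables above concrete, the following minimal sketch evaluates a two-rule TSK model numerically, using the coefficients of model (4.4). Two things are assumptions rather than statements from the paper: the Gaussian membership function is taken as $\exp(-((x - \alpha)/\beta)^2)$, and the rule outputs are combined by the usual firing-strength weighted average. The function names are illustrative only.

```python
import math

# Premise parameters (alpha, beta) on x2 and consequent coefficients
# (a0, a1, a2), copied from the two rules of model (4.4).
RULES_4_4 = [
    ((0.3639, 0.7782), (1.704, 0.5444, 14.2198)),
    ((1.6368, 0.7686), (-28.9615, 0.5342, 14.4457)),
]


def gaussian(x, alpha, beta):
    """Membership grade of x.

    ASSUMPTION: the form exp(-((x - alpha) / beta)^2); the exact
    definition used in the paper may differ by a constant factor.
    """
    return math.exp(-(((x - alpha) / beta) ** 2))


def rule_outputs(x1, x2, rules=RULES_4_4):
    """Local linear outputs y^i = a0 + a1*x1 + a2*x2 of every rule."""
    return [a0 + a1 * x1 + a2 * x2 for _, (a0, a1, a2) in rules]


def tsk_output(x1, x2, rules=RULES_4_4):
    """Firing-strength weighted average of the rule outputs.

    ASSUMPTION: standard TSK weighted-average inference, with x2 as the
    only premise variable.
    """
    weights = [gaussian(x2, alpha, beta) for (alpha, beta), _ in rules]
    outputs = rule_outputs(x1, x2, rules)
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


if __name__ == "__main__":
    # Sample the model on a coarse grid over the domain 0 <= x1, x2 <= 2.
    for x2 in (0.0, 1.0, 2.0):
        print(x2, [round(tsk_output(x1, x2), 3) for x1 in (0.0, 1.0, 2.0)])
```

Because the weights are normalised, the model output at any point is a convex combination of the local linear outputs, which is what allows a few linear pieces to blend into a smooth global approximation.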
Figs. 18 and 19 illustrate the overall approximation of the model and the error in the region $0 \le x_1, x_2 \le 2$. The error is well within the 0.2 range.

Fig. 19. The error between (4.1) and (4.5) in $0 \le x_1, x_2 \le 2$.

4.2. Human operation at a chemical plant

This example shows how to build a fuzzy model of a human operator's control actions in a chemical plant. The plant produces a polymer by the polymerisation of monomers. Since the start-up of the plant is very complicated, a human operator operates the system manually in this circumstance. The structure of the human operation is given by Sugeno and Yasukawa [27]. Five candidate inputs are available to which the human operator may refer: $u_1$, the monomer concentration; $u_2$, the change of monomer concentration; $u_3$, the monomer flow rate; and $u_4$ and $u_5$, the local temperatures inside the plant. The output $y$ is the set point for the monomer flow rate. In this plant, an operator determines the set point for the monomer flow rate, and the actual value of the monomer flow rate put into the plant is controlled by a PID controller. There are 70 data points for each of the six variables (five inputs and one output) from the actual plant operation, as given in Appendix A, and the output of the plant is shown in Fig. 20.

Fig. 20. Output of actual plant.

There are (at least) three different fuzzy models available for modelling this plant. In [27], a six-rule qualitative fuzzy model was given by identifying $u_1$, $u_2$ and $u_3$ as input variables, which then form the three premise variables of the qualitative fuzzy model. In [29], $u_1$, $u_2$ and $u_3$ were selected as the input variables, and six clusters from the given data were taken to give six rules and three premise variables of the TSK model. In [16], a fuzzy-neural system
was used for modelling the plant. It also took $u_1$, $u_2$ and $u_3$ as inputs and produced a seven-rule model. The performance measurement was not stated in [27,29], but it was given in [16] as 0.002245 by defining a performance index (4.6), where $o_k^d$, $k = 1, \ldots, m$, are the actual or desired output values and $o_k$, $k = 1, \ldots, m$, are the outputs from the model. It is very interesting to note that the three models in [16], [27] and [29] all identified the premise variables as $u_1$, $u_2$ and $u_3$. In our approach, the TSK model is identified, and $u_3$ and $u_4$ are identified as significant input variables. The resulting fuzzy model consists of four rules and two premise variables as follows:

$R^1$: If $u_3$ is $A_3^1(401.001, 578.003)$ and $u_4$ is $A_4^1(-0.246, 0.1014)$
Then $y^1 = 1396.5 - 200.6 u_1 + 222.6 u_2 + 1.05 u_3 - 254 u_4 - 87.1 u_5$,
$R^2$: If $u_3$ is $A_3^2(6998.0012, 2719.9)$ and $u_4$ is $A_4^2(-0.2887, 0.0513)$
Then $y^2 = 3896.8 - 467.56 u_1 + 68.4 u_2 + 0.8 u_3 - 682.7 u_4 - 608.9 u_5$,
$R^3$: If $u_3$ is $A_3^3(400.999, 2500.056)$ and $u_4$ is $A_4^3(0.1056, 0.212)$
Then $y^3 = 1401.4 - 211 u_1 + 22.8 u_2 + 1.0 u_3 - 166 u_4 + 26.71 u_5$,
$R^4$: If $u_3$ is $A_3^4(6997.98, 800.0012)$ and $u_4$ is $A_4^4(0.1292, 0.1509)$
Then $y^4 = 3163.3 - 309.4 u_1 - 200.8 u_2 + 0.75 u_3 + 78.5 u_4 - 134.61 u_5$.   (4.7)

Fig. 21 shows the performance of the model. We calculate the performance measure using (4.6), which gives PI = 0.0020. The TSK model identified improves not only the model performance but also the structure of the model.

Fig. 21. The outputs of the TSK model (4.7) and the actual plant. The dotted line is the output of the TSK model (4.7), and the solid line is the output of the actual plant.

5. Summary

A GA based learning algorithm has been proposed in this paper for the identification of TSK models. The algorithm consists of four blocks: Partition Block, GA Block, Tuning Block and Termination Block. The Partition Block determines an estimated partition of the input variables. The GA Block optimises the structure of a TSK model. The Tuning Block fine tunes the parameters of the TSK model using a gradient descent based approach, and the Termination Block checks that the resultant TSK model is satisfactory. The proposed GABL algorithm has the advantages of simplicity, flexibility, high accuracy and automation. The presented numerical examples have illustrated that the GABL algorithm is useful in constructing a good TSK model for complex nonlinear systems.

Appendix A

Input-output data for human operation at a chemical plant are listed below.
u1      u2       u3        u4      u5      y
6.80    -0.05     401.00   -0.20   -0.10    500.00
6.59    -0.21     646.00   -0.10    0.10    700.00
6.59     0.00     703.00   -0.10    0.10    900.00
6.50    -0.09     797.00    0.10    0.10    700.00
6.48    -0.02     717.00   -0.10    0.10    700.00
6.54     0.06     706.00   -0.20    0.10    800.00
6.54    -0.09     784.00    0.00    0.10    800.00
6.54     0.00     794.00   -0.20    0.10    800.00
6.20    -0.25     792.00    0.00    0.00   1000.00
6.02    -0.18    1211.00    0.00    0.10   1400.00
5.80     0.22    1557.00   -0.20    0.00   1600.00
5.51    -0.29    1782.00   -0.10    0.00   1900.00
5.43    -0.08    2206.00   -0.10    0.10   2300.00
5.44     0.01    2404.00   -0.10   -0.10   2500.00
5.51     0.07    2685.00   -0.10    0.00   2800.00
5.62     0.11    3562.00   -0.40    0.10   3700.00
5.77     0.15    3629.00   -0.10    0.00   3800.00
5.94     0.17    3701.00   -0.20    0.10   3800.00
5.97     0.03    3775.00   -0.10    0.00   3800.00
4.48 4.50 4.50 4.48 -
0.00 0.02 0.00 0.02
6973.00 7006.00 7027.00 7032.00
0.00
6.02 5.99 5.82 5.79 5.65 5.48 5.24 5.04 4.81 4.62 4.61 4.54 4.71 4.72 4.58 4.55 4.59 4.65 4.70 4.81 4.84 4.83 4.76 -
0.05 0.03 0.17 0.03 0.14 0.17 0.24 0.20 0.23 0.19 0.01 0.07 0.17 0.01 0.14 0.03 0.04 0.06 0.05 0.11 0.03 0.01 0.07
3829.00 - 0.10 3896.00 0.20 3920.00 0.20 3895.00 0.20 3887.00 - 0.10 3930.00 0.20 4048.00 0.10 4448.00 0.00 4462.00 0.00 5078.00 - 0.30 5284.00 - 0.10 5225.00 - 0.30 5391.00 - 0.10 5668.00 0.00 5844.00 - 0.20 6068.00 - 0.20 6250.00 - 0.20 6358.00 - 0.10 6368.00 - 0.10 6379.00 - 0.30 6412.00 - 0.10 6416.00 0.10 6514.00 0.00
0.103900.00 0.103900.00 0.103900.00 0.103900.00 0.003900.00 0.004000.00 0.004400.00 0.004700.00 0.104900.00 0.305200.00 0.205400.00 0.105600.00 0.006000.00 0.106000.00 0.106100.00 0.006400.00 0.106400.00 0.106400.00 0.006400.00 0.006400.00 0.106400.00 0.106500.00 0.006600.00
4.54 4.57 4.56 4.56 4.57
0.06 0.03 0.01 0.00 0.01
6995.00 6986.00 7009.00 7022.00 6998.00 -
0.00
4.77 4.77 4.77 4.73 4.73 4.74 4.77 4.71 4.66 4.70 4.63 4.61 4.57 4.56 4.54 4.51 4.47 4.47 4.48
0.01 0.00 0.00 0.04 0.00 0.01 0.03 0.06 0.05 0.04 0.07 0.02 0.04 0.01 0.02 0.03 0.04 0.00 0.01
6587.00 - 0.10 6569.00 0.00 6559.00 0.00 6672.00 0.00 6844.00 - 0.10 6775.00 - 0.20 6779.00 0.00 6783.00 0.00 6816.00 0.00 6812.00 0.00 6849.00 0.00 6803.00 0.00 6832.00 0.00 6832.00 - 0.10 6862.00 - 0.10 6958.00 0.10 6998.00 0.00 6986.00 - 0.10 6975.00 0.00
0.106600.00 0.106600.00 0.006700.00 0.006700.00 0.006800.00 0.006800.00 0.106800.00 0.006800.00 0.006800.00 0.006800.00 0.006800.00 0.006800.00 0.106800.00 0.106900.00 0.107000.00 0.107000.00 0.107000.00 0.107000.00 0.007000.00
0.007000.00 0.00 0.107000.00 0.00 0.007000.00 0.00 0.007000.00
0.10 0.10 0.00 0.10
0.007000.00 0.107000.00 0.107000.00 0.007000.00 0.007000.00
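
As a rough illustration of how a performance index such as the one used in Section 4.2 is evaluated on data like those listed above, the sketch below assumes the form $PI = \sqrt{\sum_{k=1}^{m} (o_k^d - o_k)^2} \, / \sum_{k=1}^{m} |o_k^d|$, which is how the index of [16] is commonly stated. That form is an assumption here, the function name is illustrative, and the model outputs in the usage example are hypothetical numbers rather than results from the paper.

```python
import math


def performance_index(desired, predicted):
    """Normalised root-sum-of-squares error between two output sequences.

    ASSUMPTION: PI = sqrt(sum((d - p)^2)) / sum(|d|), with `desired` the
    actual plant outputs and `predicted` the model outputs.
    """
    if len(desired) != len(predicted) or not desired:
        raise ValueError("sequences must be non-empty and of equal length")
    denominator = sum(abs(d) for d in desired)
    if denominator == 0:
        raise ValueError("desired outputs must not all be zero")
    squared_error = sum((d - p) ** 2 for d, p in zip(desired, predicted))
    return math.sqrt(squared_error) / denominator


if __name__ == "__main__":
    # First three set points from the appendix data; the model outputs
    # are hypothetical values chosen only to exercise the function.
    desired = [500.0, 700.0, 900.0]
    predicted = [520.0, 690.0, 880.0]
    print(performance_index(desired, predicted))
```

A smaller index means the model output tracks the recorded set points more closely; an index of zero corresponds to a perfect fit.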
References
[1] J.L. Castro, Fuzzy logic controllers are universal approximators, IEEE Trans. Systems Man Cybernet. 25 (4) (1995) 629-635.
[2] J. Chen, J. Lu, L. Chen, An on-line identification algorithm for fuzzy systems, Fuzzy Sets and Systems 64 (1994) 63-72.
[3] C. Chou, H. Lu, A heuristic self-tuning fuzzy controller, Fuzzy Sets and Systems 61 (1994) 249-264.
[4] D.E. Goldberg, Genetic Algorithms in Search, Optimisation and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[5] J.H. Holland, Genetic algorithms and the optimal allocations of trials, SIAM J. Comput. 2 (2) (1973) 88-105.
[6] A. Homaifar, E. McCormick, Simultaneous design of membership functions and rule sets for fuzzy controllers using genetic algorithms, IEEE Trans. Fuzzy Systems 3 (2) (1995) 129-139.
[7] H. Ishigami, T. Fukuda, T. Shibata, F. Arai, Structure optimization of fuzzy neural network by genetic algorithms, Fuzzy Sets and Systems 71 (1995) 257-264.
[8] Y. Jin, J. Jiang, J. Zhu, Adaptive fuzzy modelling and identification with its applications, Int. J. Systems Sci. 26 (2) (1995) 197-212.
[9] D. Jong, Adaptive system design: A genetic approach, IEEE Trans. System Man Cybernet. 10 (9) (1980) 566-574.
[10] H.M. Kim, J.M. Mendel, Fuzzy basis functions: comparisons with other basis functions, IEEE Trans. Fuzzy Systems 3 (2) (1995) 158-167.
[11] J. Kim, Y. Moon, B.P. Zeigler, Designing fuzzy net controllers using genetic algorithms, IEEE Control Systems Mag. 15 (3) (1995) 66-72.
[12] R. Krishnapuram, H. Frigui, O. Nasraoui, Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation - Part I, IEEE Trans. Fuzzy Systems 3 (1) (1995) 29-43.
[13] R. Krishnapuram, H. Frigui, O. Nasraoui, Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation - Part II, IEEE Trans. Fuzzy Systems 3 (1) (1995) 44-60.
[14] B. Kosko, Fuzzy Engineering, Prentice-Hall, Englewood Cliffs, NJ, 1997.
[15] K. Krishnakumar, D.E. Goldberg, Control system optimization using genetic algorithms, J. Guidance Control Dynamics 15 (3) (1992) 735-740.
[16] Y. Lin, G.A. Cunningham, A new approach to fuzzy-neural system modeling, IEEE Trans. Fuzzy Systems 3 (1) (1995) 190-197.
[17] D.A. Linkens, J. Nie, Back-propagation neural-network based fuzzy controller with self-learning teach, Int. J. Control 60 (1) (1994) 17-39.
[18] D.A. Linkens, H.O. Nyongesa, Genetic algorithms for fuzzy control, Part 1: Offline system development and application, IEE Proc. Control Theory Appl. 142 (3) (1995) 161-175.
[19] D.A. Linkens, H.O. Nyongesa, Genetic algorithms for fuzzy control, Part 2: Online system development and application, IEE Proc. Control Theory Appl. 142 (3) (1995) 177-185.
[20] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer, Berlin, 1992.
[21] S. Mitra, S.K. Pal, Logical operation based fuzzy MLP for classification and rule generation, Neural Networks 7 (2) (1994) 353-373.
[22] J. Nie, Constructing fuzzy model by self-organizing counter propagation network, IEEE Trans. Systems Man Cybernet. 25 (6) (1995) 960-970.
[23] G. Romer, A. Kandel, E. Backer, Fuzzy partitions of the sample space and fuzzy parameter hypotheses, IEEE Trans. Systems Man Cybernet. 25 (9) (1995) 1314-1322.
[24] K. Shimojima, T. Fukuda, H. Hasegawa, Self-tuning fuzzy modeling with adaptive membership function, rules, and hierarchical structure based on genetic algorithm, Fuzzy Sets and Systems 71 (1995) 259-309.
[25] M. Srinivas, L.M. Patnaik, Adaptive probabilities of crossover and mutation in genetic algorithms, IEEE Trans. Systems Man Cybernet. 24 (4) (1994) 657-666.
[26] M. Sugeno, G.T. Kang, Fuzzy modelling and control of multilayer incinerator, Fuzzy Sets and Systems 18 (1986) 329-346.
[27] M. Sugeno, T. Yasukawa, A fuzzy-logic-based approach to qualitative modeling, IEEE Trans. Fuzzy Systems 1 (1) (1993) 7-31.
[28] T. Takagi, M. Sugeno, Fuzzy identification of systems and its application to modeling and control, IEEE Trans. Systems Man Cybernet. 15 (1) (1985) 116-132.
[29] L. Wang, R. Langari, Complex systems modeling via fuzzy logic, IEEE Trans. System Man Cybernet. - Part B: Cybernetics 26 (1) (1996) 100-106.
[30] B. Wu, X. Yu, Fuzzy modelling using genetic algorithms, Proceedings of FLAMOC'96, vol. 3, Math., Soft Comp. and Hardware, 1996, pp. 273-277.
[31] R.R. Yager, Modeling and formulation fuzzy knowledge bases using neural networks, Neural Networks 7 (8) (1994) 1273-1283.
[32] R.R. Yager, D.P. Filev, Unified structure and parameter identification of fuzzy models, IEEE Trans. Systems Man Cybernet. 23 (4) (1993) 1198-1205.
[33] R.R. Yager, D.P. Filev, Essentials of Fuzzy Modeling and Control, Wiley, New York, 1994.
[34] P.I. Yeliseyev, Interpretation of fuzzy subsets in modeling and control problems, J. Comput. System Sci. International 31 (3) (1993) 158-160.
[35] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338-353.
[36] X. Zeng, M.G. Singh, Approximation theory of fuzzy systems - SISO case, IEEE Trans. Fuzzy Systems 2 (2) (1994) 162-176.
[37] X. Zeng, M.G. Singh, Approximation theory of fuzzy systems - MIMO case, IEEE Trans. Fuzzy Systems 3 (2) (1995) 219-235.