Automatica, Vol. 8, pp. 101-104. Pergamon Press, 1972. Printed in Great Britain.
Brief Paper
Optimizing Control Using Fuzzy Automata*
K. ASAI† and S. KITAJIMA‡

In the following, the method for the optimizing control is outlined, the method for determining the convergent coefficient is described, and simulation results showing the convergent characteristics are presented and discussed.
Summary--A method for optimizing control using stochastic or fuzzy automata has convergent characteristics which are superior to those of the random search method. In this paper, the convergent characteristics of the optimizing control using the fuzzy automata previously proposed by the authors are investigated with regard to the convergent coefficient of the reinforcement algorithm for the learning operation of the automata. It is shown that the convergent characteristics can be improved if the convergent coefficient is varied with respect to the objective functions.
Outline of optimizing control method
The block diagram of an optimizing control system using the fuzzy automaton is shown in Fig. 1.
Introduction
The transition of fuzzy automata is executed on the basis of the membership function between two states. The idea of the membership function was introduced by ZADEH [1] as the basic concept of fuzzy sets. The relation between two states in fuzzy automata can be represented by a membership function whose value is a real number in the closed interval [0, 1]. WEE and Fu [2] have proposed a fuzzy automaton in which the transition from one state to another is executed on the basis of the membership function between the two states, and they have proposed a learning system using this automaton. The authors [3] have also formulated a kind of fuzzy automaton that has many outputs in a branch or a state, and have shown the learning behavior of new optimizing control systems using these automata. The membership functions of the automata are adjusted on the basis of the objective function, and an operation of self-organization is performed. In this case, the convergent characteristics depend on the convergent coefficient of the reinforcement algorithm for the learning operation of the automata. The smaller the value of the coefficient, the faster the membership functions converge to 1 or 0 for success or failure, respectively. In the optimizing control system, however, the hunting loss in the steady state will be large if a small value of the coefficient is selected to realize a rapid response. To realize both rapid response and small hunting loss, it is necessary to vary the value of the coefficient with the progress of the learning.
FIG. 1. Block diagram of optimizing control system. (Blocks: controlled system with unknown characteristics, performance evaluator, learning fuzzy automaton; input x.)

This system consists of the controlled system, whose characteristics are unknown; the performance evaluator, in which the objective functions are calculated and evaluated; and a Moore-type learning fuzzy automaton with $v$ states and $(v \times v)$ outputs. In this automaton, the transition from state $s_i$ to state $s_j$ is executed on the basis of the membership function given as

$$f_{ij}^{1} = \max\{\min(f_{ij}, g_{jh})\}$$

($f_{ij}$: membership function for the transition from $s_i$ to $s_j$), and the output at the state $s_j$ is selected by the membership function $g_{jh}$ ($h$: output number, 1, 2, ..., $v$). These membership functions are modified by the outputs sent from the performance evaluator. The outline of the operation of the system is shown in Fig. 2. This operation is executed by an algorithm based on the ideas of higher order transition characteristics of the fuzzy automaton and the partition of the domain of the objective function.

To investigate the behavior of this optimizing control system, a simulation study has been carried out. In this simulation study, the objective function is a function of two variables and includes an optimum point, a local optimum point and a saddle point [4].
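As an illustration of the higher order transition characteristics mentioned above, the following minimal sketch uses the standard max-min composition of fuzzy relations to obtain k-th order transition memberships and then reads off the largest loop membership on the diagonal. The matrix values and the function names `max_min_compose` and `kth_order` are hypothetical, and the sketch is not claimed to reproduce the authors' algorithm, whose details are not given here.

```python
import numpy as np

def max_min_compose(a, b):
    """Max-min composition of two fuzzy relations stored as square matrices."""
    # result[i, j] = max over k of min(a[i, k], b[k, j])
    return np.max(np.minimum(a[:, :, None], b[None, :, :]), axis=1)

def kth_order(f, k):
    """k-th order transition memberships: f composed with itself (k - 1) times."""
    result = f
    for _ in range(k - 1):
        result = max_min_compose(result, f)
    return result

# Made-up first-order transition memberships for a three-state automaton.
F = np.array([[0.2, 0.9, 0.4],
              [0.5, 0.1, 0.8],
              [0.3, 0.7, 0.6]])

F2 = kth_order(F, 2)                    # memberships of two-step transitions
best_loop = float(np.max(np.diag(F2)))  # strongest loop path returning to its start state
```

The diagonal of the k-th order matrix holds the memberships of loop paths that leave a state and return to it after k transitions, which is one plausible reading of the loop-path search in the flow chart.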
* Received 20 January 1971; revised 29 July 1971. The original version of this paper was presented at the IFAC Symposium on Systems Engineering Approach to Computer Control, which was held in Kyoto, Japan, during August 1970. It was recommended for publication in revised form by Associate Editor Peter Dorato.
† Department of Industrial Engineering, University of Osaka Prefecture, Sakai, Osaka, Japan.
‡ Department of Electrical Engineering, Osaka City University, Sugimoto-cho, Osaka, Japan.
FIG. 2(a). Flow chart of performance of optimizing control system (I). (n): trial number; i, j, h = 1, 2, ..., v; k = 1, 2, ..., K.
From the simulation results, it has been shown that optimizing control is achieved as follows. A global search is executed at the early stage of learning, and the domain of search is reduced as the learning progresses. Accordingly, owing to the initial global search, the optimizing control system is able to avoid continued trials in the vicinity of a local optimum.
Convergent characteristics of optimizing control
In the algorithm described above, the membership functions are modified by a reinforcement algorithm with a variable coefficient $\alpha$ in order to obtain the optimal behavior of the control system. That is, the membership functions are modified by the equations

$$f_{ij}(n+1) = \alpha_n f_{ij}(n) + (1 - \alpha_n)\lambda$$

$$g_{jh}(n+1) = \alpha_n g_{jh}(n) + (1 - \alpha_n)\lambda$$

where

$$\lambda = \begin{cases} 1 & \text{if } I > \bar{I} \text{ (success)} \\ 0 & \text{if } I < \bar{I} \text{ (failure)} \end{cases}$$

$$\alpha_n = 1 - [\ldots] \qquad (0.5 \le \alpha_n < 1)$$

$I_n$: value of the objective function; $\bar{I}$: mean value of the objective functions so far obtained; $n$: trial number (1, 2, ...).
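A minimal sketch of one reinforcement step follows. The update itself follows the equations above; the expression used for the variable coefficient in `alpha_from_objective` (a normalisation by twice the mean) is purely an assumption, chosen so that the coefficient stays between 0.5 and 1 and decreases as the difference between the objective value and its mean grows. The function names are hypothetical.

```python
def alpha_from_objective(i_n, i_mean):
    """Hypothetical variable coefficient: falls from 1 toward 0.5 as |I_n - mean| grows."""
    if i_mean <= 0:
        raise ValueError("this sketch assumes a positive mean objective value")
    ratio = abs(i_n - i_mean) / (2.0 * i_mean)  # assumed normalisation, not taken from the paper
    return min(1.0, max(0.5, 1.0 - ratio))

def reinforce(membership, i_n, i_mean):
    """One step of f(n+1) = alpha_n * f(n) + (1 - alpha_n) * lam."""
    if i_n == i_mean:
        return membership  # neither success nor failure: leave the value unchanged
    lam = 1.0 if i_n > i_mean else 0.0  # 1 for success, 0 for failure
    alpha = alpha_from_objective(i_n, i_mean)
    return alpha * membership + (1.0 - alpha) * lam

after_success = reinforce(0.5, i_n=15.0, i_mean=10.0)
after_failure = reinforce(0.5, i_n=5.0, i_mean=10.0)
```

With this assumed coefficient, a trial far from the mean moves the membership value further in a single step than a trial close to the mean, which is the qualitative behavior the text attributes to a variable coefficient.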
FIG. 2(b). Flow chart of performance of optimizing control system (II).

In these equations, $f_{ij}(n+1)$ or $g_{jh}(n+1)$ converges rapidly to $\lambda$, because the larger the difference $(I - \bar{I})$ becomes, the smaller the value of $\alpha$ that is obtained. This is shown below only for $f_{ij}(n+1)$, since $f_{ij}(n+1)$ and $g_{jh}(n+1)$ show the same convergent behavior. The equation described above may be rewritten as follows.
$$f_{ij}(n+1) = \alpha_n \alpha_{n-1} \cdots \alpha_1 f_{ij}(0) + \{1 - \alpha_n \alpha_{n-1} \cdots \alpha_1\}\lambda$$

It is clear that $f_{ij}(n+1)$ will converge to $\lambda$ as $n \to \infty$ if $0 < \alpha_n < 1$ ($n$ = 1, 2, ...). Accordingly, when $0 < I < \bar{I}$ (case of failure) or $\bar{I} < I < 2\bar{I}$ (case of success), $f_{ij}(n+1)$ will approach 0 or 1, respectively. When $I = \bar{I}$, $f_{ij}(n+1)$ is not varied, because success cannot be distinguished from failure. The difference $(I - \bar{I})$ is large in the early stage of learning but becomes small as the learning progresses. Consequently, $f_{ij}(n+1)$ varies widely in the early stages but only slightly in the final stages, and it converges rapidly to $\lambda$.

In order to show the convergent characteristics of the optimizing control, the convergent behaviors of the membership functions $f_{ij}$ and $g_{jh}$ of the learning fuzzy automaton in the simulation study are shown in Figs. 3 and 4. In Fig. 3, $f_{ij}$ is the membership function of the branch arriving at the state that includes the optimum point, and $g_{jh}$ is the membership function for the choice of the output in that state. In Fig. 4, the two membership functions shown are those of the branches corresponding to the optimum and a quasi-optimum, respectively, in the Mealy type of fuzzy automaton. The case of a constant $\alpha$ is also shown for comparison with the case of a variable $\alpha$. As is evident from the figures, when $\alpha$ is constant, the smaller its value, the faster the membership functions converge to $\lambda$. In the case of the variable $\alpha$, on the other hand, the membership functions converge to $\lambda$ more rapidly, because the value of $\alpha$ is varied with the progress of the learning.
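The equivalence between the step-by-step update and the product form can be checked numerically. The sketch below assumes that $\lambda$ stays the same over the trials considered; the coefficient values and the function names `iterate` and `closed_form` are made up for illustration.

```python
from math import prod

def iterate(f0, alphas, lam):
    """Apply f <- a * f + (1 - a) * lam once for each coefficient in order."""
    f = f0
    for a in alphas:
        f = a * f + (1.0 - a) * lam
    return f

def closed_form(f0, alphas, lam):
    """Product form: (a_n ... a_1) * f0 + (1 - a_n ... a_1) * lam."""
    p = prod(alphas)
    return p * f0 + (1.0 - p) * lam

alphas = [0.6, 0.7, 0.8, 0.9, 0.95]  # illustrative coefficients, all within [0.5, 1)
step_by_step = iterate(0.5, alphas, 1.0)
direct = closed_form(0.5, alphas, 1.0)

# Constant coefficients: a smaller value approaches lam in fewer trials.
fast = iterate(0.5, [0.5] * 5, 1.0)
slow = iterate(0.5, [0.9] * 5, 1.0)
```

Both routes give the same value because each step multiplies the remaining distance to $\lambda$ by the current coefficient, so the distance after n trials is the initial distance times the product of the coefficients.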
$\bar{I}_h$: mean value of the objective functions corresponding to the outputs at state $s_j$; $\bar{I}_{\max}$: maximum value of $\bar{I}_h$; $m$: trial number (1, 2, ...). The output probability $q_{jh}$ can be modified in the same way as $p_{ij}$. An example of simulation results is shown in Fig. 5.
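For the stochastic automaton used in the comparison, a reinforcement step on one row of transition probabilities can be sketched as below. It assumes that the chosen entry $p_{ij}$ is moved toward $\lambda$ with coefficient $\alpha$ and that the remaining $v - 1$ entries share the change equally, so that the row still sums to 1. The function name `update_row` and the numbers are hypothetical.

```python
def update_row(row, j, alpha, lam):
    """Reinforce entry j of a transition-probability row and rebalance the other entries."""
    v = len(row)
    new_pj = alpha * row[j] + (1.0 - alpha) * lam
    # Each of the other (v - 1) entries absorbs an equal share of the change in entry j.
    share = (row[j] - new_pj) / (v - 1)
    return [new_pj if k == j else row[k] + share for k in range(v)]

row = [0.25, 0.25, 0.25, 0.25]
rewarded = update_row(row, 1, alpha=0.8, lam=1.0)   # success: entry 1 grows
penalized = update_row(row, 1, alpha=0.8, lam=0.0)  # failure: entry 1 shrinks
```

The equal-share rule keeps the row normalized by construction. This sketch does not guard against an entry being pushed below zero when a small probability is reduced, which a fuller implementation would need to handle.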
FIG. 3. An example of convergent characteristics of optimizing control (case of Moore type). Horizontal axis: trial number.

FIG. 5. Convergent behaviors of probabilities $p_{ij}$, $q_{jh}$ and membership functions $f_{ij}$, $g_{jh}$. Horizontal axis: trial number.
FIG. 4. An example of convergent characteristics of optimizing control (case of Mealy type). Horizontal axis: trial number.
From the figure, it may be seen that the convergent behaviors of the stochastic automaton and the fuzzy automaton show similar properties, but the membership functions of the fuzzy automaton converge to 1 more rapidly than the probabilities of the stochastic automaton. This difference may be attributed to the difference in the transition characteristics of the two automata: the transition characteristics of the fuzzy automata gradually become deterministic in the process of learning, while those of the stochastic automata remain random during the entire learning process. Therefore, it may be observed that self-organization is performed more efficiently with the fuzzy automata than with the stochastic automata.
It is also clear from these figures that both $f_{ij}$ and $g_{jh}$ vary mostly in the early stages and only slightly in the final stages, and that they rapidly approach their final values. Therefore, the optimizing control systems are able to obtain a desired performance with small hunting loss in the steady state.

To compare these convergent characteristics with those of optimizing control using a stochastic automaton, a simulation study has been carried out as discussed in the following. In this simulation study, the stochastic automaton has been used in a similar fashion to the fuzzy automaton. The transition probability $p_{ij}$ is modified by the equations

$$p_{ij}(m+1) = \alpha_m p_{ij}(m) + (1 - \alpha_m)\lambda$$

$$p_{ik}(m+1) = p_{ik}(m) + \frac{p_{ij}(m) - p_{ij}(m+1)}{v - 1} \qquad (k \ne j)$$

where

$$\lambda = \begin{cases} 1 & \text{if } I > \bar{I} \text{ (success)} \\ 0 & \text{if } I < \bar{I} \text{ (failure)} \end{cases}$$

$$\alpha_m = 1 - [\ldots] \qquad (0.5 \le \alpha_m < 1)$$

Conclusions
In the preceding paragraphs, optimizing control systems using fuzzy automata and their convergent characteristics have been outlined. From the simulation results, it has been shown that the convergent characteristics of the optimizing control can be improved if the coefficient $\alpha$ for modification of the membership functions is varied with respect to the objective functions. It has also been seen that the convergent characteristics of the fuzzy automata are superior to those of the stochastic automata, and that the operation of self-organization is performed more efficiently than in the case of the stochastic automata.

Acknowledgement--The authors would like to express their appreciation to Prof. L. A. Zadeh and Prof. K. S. Fu for valuable discussions.
References
[1] L. A. ZADEH: Fuzzy sets. Inf. Control 8, 338-353 (1965).
[2] W. G. WEE and K. S. Fu: A formulation of fuzzy automata and its application as a model of learning systems. IEEE Trans. Syst. Sci. & Cybern. 215-223 (1969).
[3] K. ASAI and S. KITAJIMA: A method for optimizing control of multimodal systems using fuzzy automata. Inf. Sci. (in press).
[4] H. E. ZELLNIK, N. E. SONDAK and R. S. DAVIS: Gradient search optimization. Chem. Engng Prog. 58, 35-41 (1962).
Résumé--A method for optimizing control using stochastic or fuzzy automata has convergence characteristics superior to those of the random search method. The present article examines the convergence characteristics of optimizing control using fuzzy automata with respect to the convergence coefficient of the reinforcement algorithm for the learning operation of the automaton. It is shown that the convergence characteristics can be improved if the convergence coefficient is varied with respect to the objective functions.

Zusammenfassung--A method for optimal control using a stochastic automaton has convergence characteristics that are superior to those of the random search method. In this work, the convergence characteristics of the optimizing control are considered with regard to the convergence coefficient of the reinforcement algorithm for the learning operation of the automaton. It is shown that the convergence characteristics can be improved if the convergence coefficient is varied with respect to the objective function.

Резюме--A method of optimizing control using stochastic or fuzzy automata has convergence characteristics superior to those of the random search method. This article considers the convergence characteristics of optimizing control using fuzzy automata in relation to the convergence coefficient of the reinforcement algorithm for the learning operation of the automaton. It is shown that the convergence characteristics can be improved if the convergence coefficient is varied with respect to the objective functions.