ELSEVIER
Information Sciences 110 (1998) 41-50
Efficient search for fuzzy models using genetic algorithm
S. Matsushita a,*, T. Furuhashi b, H. Tsutsui c, Y. Uchikawa b
a Nagoya Municipal Industrial Research Institute, 3-4-41 Rokuban, Atsuta-ku, Nagoya 456, Japan
b Dept. of Information Electronics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-01, Japan
c Yamatake Honeywell Co. Ltd., YBP West Tower, 134 Kobe-cho, Hodogaya-ku, Yokohama 240, Japan
Received 28 May 1996; accepted 10 October 1997
Abstract
Fuzzy modeling is one of the promising methods for describing nonlinear systems. Determination of the antecedent structure of a fuzzy model, i.e. the input variables and the number of membership functions for the inputs, has been one of the most important problems in fuzzy modeling. The authors have proposed a hierarchical fuzzy modeling method using Fuzzy Neural Networks (FNNs) and the Genetic Algorithm (GA). This method can identify fuzzy models of objects with strong nonlinearities. Its disadvantage is that the training of the FNNs is time consuming. This paper presents a quick method for a rough search for proper structures in the antecedent of fuzzy models. The fine tuning of the acquired rough model is done by the FNNs. This modeling method efficiently identifies precise fuzzy models of systems with strong nonlinearities. A simulation is done to show the effectiveness of the proposed method.
© 1998 Published by Elsevier Science Inc. All rights reserved.
Keywords: Fuzzy modeling; Genetic algorithm; Fuzzy neural networks
1. Introduction
Fuzzy modeling is a method of describing the characteristics of complex nonlinear systems using fuzzy inference. The authors have proposed fuzzy modeling methods using Fuzzy Neural Networks (FNNs) [1-4], including a hierarchical fuzzy modeling method [3,4] that enhances the modeling capability of the FNN. When a fuzzy model is built hierarchically from many submodels, each submodel has a small number of inputs and its input space needs only a small number of divisions. The precision and the generality of a hierarchical model can therefore be made high even with a small number of input-output data. Each submodel is identified by an FNN, and fuzzy rules are easy to extract from the identified model.
Nagasaka and Ichihashi [5] and Tanaka et al. [6] studied the fuzzy Group Method of Data Handling (GMDH). Although the methods in [5,6] are based on fuzzy inference and are effective in modeling complex nonlinear objects, they pay no attention to identifying the fuzzy rules, so extraction of fuzzy rules from the identified model is difficult.
A remaining problem of fuzzy modeling has been the difficulty of determining the proper set of input variables and the numbers of membership functions for each submodel. The parameter increasing method in [1-4], which searches for better models by increasing parameters, i.e. input variables and numbers of membership functions for the input variables, is not effective when the nonlinearity of the modeling object is strong. Evolutionary approaches to fuzzy modeling have been studied to overcome this difficulty. Shimojima et al. [7] applied the Genetic Algorithm (GA) to the determination of the structure of the fuzzy model. Although this method is effective in finding appropriate structures of fuzzy models, it does not address the selection of input variables. The authors [8] have presented a combination of the hierarchical fuzzy modeling method and the GA which can find appropriate structures, including sets of input variables and sets of membership functions, and which also enables easy extraction of fuzzy rules from the identified model.
The disadvantage of this method is that the learning of the FNN for the evaluation of individuals in the GA process is time consuming. This paper presents a quick method for a rough search for proper structures in the antecedent of fuzzy models. Simple fuzzy inference is used for the efficient search. The fine tuning of the acquired rough model is done by replacing the simple fuzzy inference with the FNNs. This modeling method efficiently identifies precise fuzzy models of systems with strong nonlinearities. A simulation is done to show the effectiveness of the proposed method.
* Corresponding author. Tel.: +81 52 661 3161; fax: +81 52 652 6776; e-mail: matsushita@nmi.city.nagoya.jp.
0020-0255/98/$19.00 © 1998 Published by Elsevier Science Inc. All rights reserved. PII: S0020-0255(97)10084-6
2. Hierarchical fuzzy modeling
2.1. Quick fuzzy modeling
This paper presents a quick fuzzy modeling method using a simple fuzzy inference that borrows its idea from TCBM [9]. Using the simple fuzzy inference and the GA, the input variables and the numbers of membership functions for the selected inputs are searched quickly. This efficient method is applied to the search for a rough model for each submodel of the hierarchical fuzzy model. After the input variables for the submodel are selected, fine tuning using the FNN is done. The procedure of quick fuzzy modeling is as follows:
(1) The input space is divided equally into crisp regions, as illustrated in Fig. 1(a). This is the case where there are two inputs $x_1$ and $x_2$, and each is divided into three subspaces. These divisions are decided by the chromosome described in Section 2.3.
(2) The singletons in the consequents are set to 0 in the initial state and are determined in the next step.
Fig. 1. Simple fuzzy inference: (a) rule identification; (b) inference.
(3) If input data exist in a subspace, the consequent singleton is obtained as the mean value of the output data, given by Eq. (1):
$$\bar{y}_{ij} = \frac{1}{N_{ij}} \sum_{k=1}^{N_{ij}} y_k, \qquad \{i, j \mid x_{1i}, x_{2j}\}, \tag{1}$$
where $(x_{1i}, x_{2j})$ is the region where $x_1$ is included in the $i$th divided part and $x_2$ is included in the $j$th divided part; $\bar{y}_{ij}$ is the singleton in the consequent, $N_{ij}$ the number of data in the region $(x_{1i}, x_{2j})$, and $y_k$ the $k$th output data in the region $(x_{1i}, x_{2j})$. The rules are made only in the regions where input data exist.
(4) The fuzzy inference is done by Eq. (2). Fig. 1(b) shows this inference. Rules neighboring the new input contribute to the inferred value in proportion to the inverse of the $n$th power of the distance between the input data and the centers of the rules.
$$y^{*} = \frac{\sum_{l=1}^{N_l} \bar{y}_{ij} / r_{ij}^{\,n}}{\sum_{l=1}^{N_l} 1 / r_{ij}^{\,n}}, \qquad \{i, j \mid R_{ij} \le L\}, \tag{2}$$
where $y^{*}$ is the inferred value, $N_l$ the number of neighboring rules, $r_{ij}$ the distance from the input data to the center of the rule in the region $(x_{1i}, x_{2j})$, $R_{ij}$ the number of regions between the input data and the rule, $L$ the number that defines the neighbors, and $n$ a constant. In this paper, $L = 1$ and $n = 2$. Since there is no learning stage in the rule identification, the above procedure of fuzzy modeling is very quick.
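The four steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes inputs normalized to [0, 1), Euclidean distance to the cell centers, and it takes the region count $R_{ij}$ to be the largest per-axis offset between cells. The function names `cell_index`, `identify_rules` and `infer` are hypothetical.

```python
import math
from collections import defaultdict


def cell_index(x, divisions, lo=0.0, hi=1.0):
    """Map each input to the index of its equal-width crisp interval."""
    idx = []
    for value, n_div in zip(x, divisions):
        i = int((value - lo) / (hi - lo) * n_div)
        idx.append(min(max(i, 0), n_div - 1))  # clamp values on the boundary
    return tuple(idx)


def identify_rules(inputs, outputs, divisions):
    """Eq. (1): the singleton of a cell is the mean output of the data in it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(inputs, outputs):
        cell = cell_index(x, divisions)
        sums[cell] += y
        counts[cell] += 1
    # Rules exist only for cells that contain data.
    return {cell: sums[cell] / counts[cell] for cell in sums}


def infer(x, rules, divisions, L=1, n=2, lo=0.0, hi=1.0):
    """Eq. (2): inverse-distance weighting over the neighboring rules."""
    here = cell_index(x, divisions, lo, hi)
    num = den = 0.0
    for cell, singleton in rules.items():
        # Assumption: R_ij is the largest per-axis cell offset.
        if max(abs(a - b) for a, b in zip(cell, here)) > L:
            continue
        center = [lo + (c + 0.5) * (hi - lo) / d
                  for c, d in zip(cell, divisions)]
        r = math.dist(x, center)
        if r == 0.0:
            return singleton  # the input sits exactly on a rule center
        weight = 1.0 / r ** n
        num += weight * singleton
        den += weight
    return num / den if den > 0.0 else None  # None: no rule within L cells


rules = identify_rules([(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)],
                       [1.0, 3.0, 10.0], divisions=(3, 3))
print(rules)
print(infer((0.5, 0.5), rules, divisions=(3, 3)))
```

Because rule identification is a single pass over the data with no iterative learning, evaluating one candidate structure costs only one scan of group A and one of group B, which is what makes this inference attractive inside a GA loop.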
Fig. 2. Fuzzy neural network.
2.2. Fuzzy neural network
The FNN presented by the authors is a multilayered back-propagation (BP) model with a specially designed structure for easy extraction of fuzzy rules from the trained network. This paper uses Type I of the FNNs in [1,2]. Fig. 2 shows an example of the FNN. The FNN has two inputs $x_1$ and $x_2$, one output $y$, and three membership functions for each input. BP learning can be applied to modify the connection weights $w_c$, $w_g$ and $w_f$. From the trained FNN, fuzzy rules can be extracted easily. Since the center-of-gravity method is used in the (E)-layer, the updating method of the connection weights, i.e. the BP algorithm, needs some modifications. The learning algorithm for the FNN is described in detail in [1,2].
2.3. Hierarchical fuzzy modeling
Fig. 3 shows an example of a hierarchical fuzzy model [3,4]. The figure shows a case where the object has five inputs ($x_1$, $x_2$, $x_3$, $x_4$, $x_5$) and one output ($y$), and the fuzzy model has a three-level hierarchical structure. In Fig. 3, $y^{1-1*}$, $y^{1-2*}$, $y^{2-1*}$ and $y^{*}$ are the inferred values of the fuzzy submodels. In the first layer, the fuzzy model with the inputs $x_1$ and $x_2$ and the model with $x_3$ and $x_4$ are arranged in parallel. The outputs of these models are $y^{1-1*}$ and $y^{1-2*}$, respectively. These two fuzzy submodels in the first layer contribute greatly to the input-output relationships of the object system. The input variable $x_5$ is used for a small adjustment of the model. This fuzzy model reduces the number of divisions of the input space by constructing submodels in a hierarchical manner, and thus makes the total number of fuzzy rules small. The obtained fuzzy model therefore generalizes well, and its rules are easy to extract.
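The reduction in the number of rules can be checked with a short calculation. The sketch below is only illustrative and rests on two assumptions not stated in the text: every input has three membership functions, and the hierarchy consists of four two-input submodels (1-1 with $x_1$, $x_2$; 1-2 with $x_3$, $x_4$; 2-1 with the two first-layer outputs; 3-1 with the second-layer output and $x_5$).

```python
from math import prod


def n_rules(mf_counts):
    """Rules in one grid-partitioned model: product of its membership-function counts."""
    return prod(mf_counts)


mf = 3  # assumed number of membership functions per input

# One flat model that takes all five inputs at once.
flat = n_rules([mf] * 5)

# Four two-input submodels: 1-1, 1-2, 2-1 and 3-1 (assumed structure).
hierarchical = sum(n_rules([mf, mf]) for _ in range(4))

print(flat, hierarchical)  # 243 36
```

Under these assumptions the flat model needs 243 rules while the hierarchy needs 36, which is why far fewer data are required to cover every rule.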
Fig. 3. An example of hierarchical fuzzy model using FNNs.
The authors have proposed a hierarchical fuzzy modeling method using the FNNs and the GA. Each submodel was built by the FNN, and a proper set of input variables and sets of membership functions for the submodel were sought by the GA. A hierarchical structure was constructed by finding proper submodels one by one. The disadvantage of this method was that the learning of the FNNs defined by the chromosomes was time consuming. This paper presents an efficient method for hierarchical fuzzy modeling using the simple fuzzy inference described in Section 2.1. The procedure of the hierarchical fuzzy modeling using the simple fuzzy inference and the GA is as follows:
(a) The input-output data are divided into two groups, A and B, whose statistical characteristics are nearly the same. The models identified from the data of groups A and B are called model A and model B, respectively. The layer number $h$ is set to 1.
(b) Using the GA and the simple fuzzy inference, the input variables and the number of membership functions for each input are decided. Individuals carrying the codes of the antecedent structure are evaluated by the performance index $F$ given by
$$F = N \log(S_e) + 2P, \tag{3}$$
$$P = \prod_{k=1}^{m} n_{Dk} + \sum_{k=1}^{m} n_{Dk}, \tag{4}$$
where $N$ is the number of data, $S_e$ the mean square error of the model identified with data group A and tested with data group B, $P$ the number of parameters, $n_{Dk}$ the number of membership functions of the $k$th input variable, and $m$ the number of selected input variables. The first term of Eq. (4) is the number of rules and the second term is the number of membership functions. This criterion evaluates both the precision and the complexity of the model. The chromosome used in this paper is shown in Fig. 4. The number of loci is $l_x$, the same as the number of all candidate input variables. The integer in each locus is the number of membership functions for the corresponding input variable. When the number is 0, the input variable is not used.
The next step is fine tuning of the acquired rough model. The best individual found in the above process determines the set of input variables and the sets of membership functions in the antecedent. The FNN replaces the simple fuzzy inference, and the learning of the FNN is done. For the stopping condition, the following criterion is used:
Fig. 4. An example of chromosome.
$$C = \left[ \sum_{i=1}^{n_A} \left( y_i^{A} - y_i^{AA} \right)^2 + \sum_{i=1}^{n_B} \left( y_i^{B} - y_i^{BB} \right)^2 \right] + \left[ \sum_{i=1}^{n_A} \left( y_i^{AB} - y_i^{AA} \right)^2 + \sum_{i=1}^{n_B} \left( y_i^{BA} - y_i^{BB} \right)^2 \right], \tag{5}$$
where $n_A$ and $n_B$ are the numbers of data in groups A and B, respectively; $y_i^{A}$ and $y_i^{B}$ the outputs of data A and data B, respectively; $y_i^{AA}$ and $y_i^{BB}$ the inferred value of model A with data A and that of model B with data B, respectively; $y_i^{AB}$ and $y_i^{BA}$ the inferred value of model B with group A and that of model A with group B, respectively. The first term on the right-hand side of Eq. (5) is the precision of the model, and the second term is the criterion for evaluating the generality of the model. The identified model is called model $h$-1 and its output is denoted by $y^{h-1*}$. Model $h$-$k$ is the $k$th model in the $h$th layer.
(c) If the number of remaining input variables not used in model $h$-1 is more than two, another model is identified using the GA and the simple fuzzy inference. Some of the remaining input variables are selected. The rough model found is tuned by the FNN. This model is denoted by model $h$-2. If more variables remain, model $h$-3 is then identified. This modeling is repeated until the number of remaining input variables becomes less than two. Model $h$-1 is used for the identification of the models in the succeeding layer. The outputs of models $h$-2, $h$-3, ... are the candidates for the input variables in the next layer.
(d) Fuzzy modeling in the next layer is done using the submodels identified in the previous layer. The output of model $h$-1 is always used. A combination of this output $y^{h-1*}$ and some of the outputs of models $h$-2, $h$-3, ..., as well as the input variables not used in model $h$-1, is chosen by the GA and the simple fuzzy inference. The FNN is used to tune the model found. The acquired model is denoted by model $(h+1)$-1. The evaluation criteria $C$ of model $h$-1 and model $(h+1)$-1 are compared. If the latter value is smaller, the procedure from (c) is repeated. If not, the fuzzy modeling is stopped.
Since the input variables which are used in the previous layer are not used in the current layer, the acquired structure is simple.
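The performance index used in step (b) can be sketched in Python as follows. This is an illustrative reading of Eqs. (3) and (4), not the authors' code: the natural logarithm is assumed, the mean square error is passed in by the caller, and the names `parameter_count` and `performance_index` are hypothetical. The chromosome is a list of integers as in Fig. 4, where 0 means the input is not used.

```python
import math


def parameter_count(chromosome):
    """Eq. (4): number of rules plus number of membership functions."""
    selected = [n for n in chromosome if n > 0]  # a 0 locus means "input unused"
    if not selected:
        return 0
    return math.prod(selected) + sum(selected)


def performance_index(chromosome, mse, n_data):
    """Eq. (3): F = N log(Se) + 2P; a smaller value is better.

    Assumes the natural logarithm and mse > 0.
    """
    return n_data * math.log(mse) + 2 * parameter_count(chromosome)


# Two inputs selected, with 2 and 3 membership functions: 6 rules + 5 functions.
print(parameter_count([0, 2, 3, 0]))
print(performance_index([0, 2, 3, 0], mse=0.01, n_data=80))
```

The second term penalizes complexity, so at equal error a chromosome that selects fewer inputs or fewer membership functions receives the smaller index.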
2.4. Genetic algorithm
The GA is an efficient tool for searching for solutions in a vast search space. With the GA, an appropriate set of inputs and sets of membership functions are likely to be found even when the system has strong nonlinearity and the multiple input variables have complex correlations.
The GA used in this paper is as follows:
(a) The chromosomes (see Fig. 4) are initialized. The number of chromosomes is $N_g$.
(b) The chromosomes are evaluated. The model represented by each chromosome is made using the simple fuzzy inference. The evaluated value of the model is calculated using the fitness function of Eq. (3).
(c) Genetic operations are applied to the chromosomes. The $N_r$ chromosomes which have the larger evaluation criteria (poorer performance) are deleted, and $N_r$ chromosomes are reproduced. A chromosome with a smaller evaluation criterion is more likely to be used for the reproduction than one with a larger evaluation criterion. One-point crossover is applied at the rate $p_c$. Mutation is applied to each locus at the rate $p_m$: mutation to a random number is applied at the rate $p_m/2$, and increment or decrement mutation is applied at the rate $p_m/2$.
(d) The calculation is stopped when the evaluation criterion of the fittest chromosome does not improve during $m_{end}$ generations. Otherwise, the procedure (b)-(d) is repeated.
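Steps (a)-(d) can be sketched as the following Python loop. Several details are assumptions made for illustration, since the text does not fix them: the upper bound `max_mf` on a locus value, rank-proportional parent selection, keeping the surviving chromosomes unchanged, and applying mutation only to the reproduced chromosomes. The function `run_ga` and its parameter names are hypothetical; `evaluate` stands for the performance index of Eq. (3).

```python
import random


def run_ga(evaluate, n_loci, max_mf=5, n_g=30, n_r=9, p_c=0.6, p_m=0.1,
           m_end=10, seed=0):
    """Minimize evaluate(chromosome) over integer chromosomes with loci in 0..max_mf."""
    rng = random.Random(seed)
    # (a) Random initial population of n_g chromosomes.
    population = [[rng.randint(0, max_mf) for _ in range(n_loci)]
                  for _ in range(n_g)]
    best, best_score, stale = None, float("inf"), 0
    while stale < m_end:  # (d) stop after m_end generations without improvement
        ranked = sorted(population, key=evaluate)  # (b) best (smallest) first
        score = evaluate(ranked[0])
        if score < best_score:
            best, best_score, stale = list(ranked[0]), score, 0
        else:
            stale += 1
        survivors = ranked[:n_g - n_r]  # (c) delete the n_r worst chromosomes
        # Assumed rank-based weights: better chromosomes are likelier parents.
        weights = [len(survivors) - i for i in range(len(survivors))]
        children = []
        while len(children) < n_r:
            a, b = rng.choices(survivors, weights=weights, k=2)
            child = list(a)
            if rng.random() < p_c and n_loci > 1:  # one-point crossover
                cut = rng.randint(1, n_loci - 1)
                child = a[:cut] + b[cut:]
            for i in range(n_loci):
                u = rng.random()
                if u < p_m / 2:  # mutation to a random number
                    child[i] = rng.randint(0, max_mf)
                elif u < p_m:  # increment or decrement mutation
                    step = rng.choice((-1, 1))
                    child[i] = min(max(child[i] + step, 0), max_mf)
            children.append(child)
        population = survivors + children
    return best, best_score


# Toy fitness for demonstration only: squared distance to a target chromosome.
target = [2, 0, 3, 1]
toy = lambda c: sum((g - t) ** 2 for g, t in zip(c, target))
print(run_ga(toy, n_loci=4))
```

Because the survivors are carried over unchanged, the best chromosome is never lost, and the loop terminates once `m_end` consecutive generations bring no improvement.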
3. Simulations
Fuzzy modeling of the nonlinear system given by the following equations was done:
$$y = \left(-4 + x_0^{0.5} + x_1^{-1}\right)^2 + 5 \sin(x_2 + x_3) + \exp(1 + x_4 + x_5) = \left(-4 + x_{10}\right)^2 + 5 \sin(x_{11}) + \exp(1 + x_{12}), \tag{6}$$
$$x_6 = x_0^{0.5}, \quad x_7 = x_1^{-1}, \quad x_8 = x_1^{2}, \quad x_9 = x_0^{0.5} x_1^{-1}, \quad x_{10} = x_0^{0.5} + x_1^{-1}, \quad x_{11} = x_2 + x_3, \quad x_{12} = x_4 + x_5, \tag{7}$$
where $x_0$ to $x_5$ were the input variables, $x_{13}$, which had no relationship with $y$, was also used as a dummy variable, and $x_6$ to $x_{12}$ were the intermediate input variables expressed by Eq. (7). The number of input candidates $l_x$ was 14. The range of each input was $x_0 = \{0, 1, \ldots, 20\}$, $x_1 = \{0.5, 0.7, \ldots, 1.5\}$, $x_2, x_3 = \{-1.0, -0.9, \ldots, 5.0\}$, $x_4 = \{0.0, 0.1, \ldots, 0.5\}$, $x_5 = \{-1.5, -1.4, \ldots, 0.5\}$, and $x_{13} = \{0, 1, \ldots, 17\}$. These ranges were chosen so that the input variables influence the output almost equally. Two sets of 80 input-output data pairs, one for group A and one for group B, were generated. The input-output data were normalized within the range [0,1). The parameters of the GA were set as follows: $N_g = 30$, $N_r = 9$, $p_c = 60\%$, $p_m = 10\%$, $m_{end} = 10$. The maximum number of submodels allowed in one layer was 5.
The model acquired by the proposed method is shown in Table 1 and Fig. 5. For comparison, the model identified by the method in [8], i.e. the FNN + GA modeling method, is shown in Table 2 and Fig. 6. The computation time was drastically decreased by the proposed method: it took 19 min to obtain the result, whereas the FNN + GA modeling method took 7 h. The results show that the proposed rough search could find proper sets of input variables and an appropriate structure. The evaluation criterion was not seriously degraded by the proposed method relative to the previous method (21.5 against 17.2).

Table 1
Fuzzy models identified by the proposed method
Model    Selected variables (no. of div.)    Criterion C
1-1      x11 (4), x12 (5)                    33.1
1-2      x10 (5)
1-3      x1 (2), x9 (3)
1-4      x6 (4), x8 (2)
1-5      x0 (2), x7 (2)
2-1      Model 1-1 (5), 1-2 (2)              21.5
Calculation time: 19 min

Fig. 5. Identified hierarchical fuzzy model by the proposed method.
Table 2
Fuzzy models identified by FNN + GA [8]
Model    Selected variables (no. of div.)    Criterion C
1-1      x11 (4), x12 (2)                    28.9
1-2      x10 (3)
1-3      x6 (2), x9 (2)
1-4      x0 (3)
1-5      x8 (2)
2-1      x10 (3), model 1-1 (2)              17.2
Calculation time: 7 h
Fig. 6. Identified hierarchical fuzzy model by FNN + GA [8].
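The test system of this section can be sketched in Python to show why the intermediate variables $x_{10}$, $x_{11}$ and $x_{12}$ are natural candidates for selection. The sketch assumes that the first form of Eq. (6) reads $y = (-4 + x_0^{0.5} + x_1^{-1})^2 + 5\sin(x_2 + x_3) + \exp(1 + x_4 + x_5)$; the function names and the sample point are hypothetical, and normalization to [0,1) is omitted.

```python
import math


def intermediates(x0, x1, x2, x3, x4, x5):
    """The three intermediate variables used in the second form of Eq. (6)."""
    x10 = math.sqrt(x0) + 1.0 / x1
    x11 = x2 + x3
    x12 = x4 + x5
    return x10, x11, x12


def target_direct(x0, x1, x2, x3, x4, x5):
    """Output y computed from the six original inputs (assumed reading of Eq. (6))."""
    return ((-4.0 + math.sqrt(x0) + 1.0 / x1) ** 2
            + 5.0 * math.sin(x2 + x3)
            + math.exp(1.0 + x4 + x5))


def target_from_intermediates(x10, x11, x12):
    """The same output y expressed with only three intermediate variables."""
    return (-4.0 + x10) ** 2 + 5.0 * math.sin(x11) + math.exp(1.0 + x12)


# An arbitrary sample point taken inside the stated input ranges.
sample = (4.0, 0.5, 1.0, 2.0, 0.1, -0.4)
print(target_direct(*sample))
print(target_from_intermediates(*intermediates(*sample)))
```

Both expressions give the same output, so a model built on $x_{10}$, $x_{11}$ and $x_{12}$ needs three inputs instead of six. This is consistent with Tables 1 and 2, where both methods select $x_{11}$ and $x_{12}$ for model 1-1 and $x_{10}$ for model 1-2.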
4. Conclusions
This paper presented a quick search method for hierarchical fuzzy models. The new method uses the simple fuzzy inference for the efficient search. The results showed that the computation time was remarkably decreased by the proposed method. This method can be combined with the FNN + GA method in a different manner for further improvement of the hierarchical modeling technique.
References
[1] S. Horikawa, T. Furuhashi, Y. Uchikawa, T. Tagawa, A study on fuzzy modeling using fuzzy neural networks, Proc. Internat. Fuzzy Eng. Symp. (IFES'91), 1991, pp. 562-573.
[2] S. Horikawa, T. Furuhashi, Y. Uchikawa, On fuzzy modeling using fuzzy neural networks with the back-propagation algorithm, IEEE Trans. on Neural Networks 3 (5) (1992) 801-806.
[3] S. Nakayama, T. Furuhashi, Y. Uchikawa, A proposal of hierarchical fuzzy modeling method, J. Japan Soc. for Fuzzy Theory and Systems 5 (5) (1993) 1155-1168.
[4] S. Matsushita, A. Kuromiya, M. Yamaoka, T. Furuhashi, Y. Uchikawa, A study on fuzzy GMDH with comprehensible fuzzy rules, Proceedings of Seiken/IEEE Symposium on Emerging Technologies and Factory Automation (ETFA '94), 1994, pp. 192-198.
[5] K. Nagasaka, H. Ichihashi, Neuro-fuzzy GMDH and its application to modeling of grinding characteristics, Ninth Fuzzy System Symposium (in Japan), 1993, pp. 449-452.
[6] H. Tanaka, K. Yokode, H. Ishibuchi, GMDH by Fuzzy If-Then rules, Ninth Fuzzy System Symposium (in Japan), 1993, pp. 237-240.
[7] K. Shimojima, T. Fukuda, Y. Hasegawa, Self-tuning fuzzy modeling with adaptive membership function, rules, and hierarchical structure based on Genetic Algorithm, Fuzzy Sets and Systems 71 (3) (1995) 295-309.
[8] S. Matsushita, A. Kuromiya, M. Yamaoka, T. Furuhashi, Y. Uchikawa, Determination of Antecedent Structure for Fuzzy Modeling Using Genetic Algorithm (ICEC'96), 1996, pp. 235-238.
[9] H. Tsutsui, Kurosaki, Fuzzy topological case based modeling and applications, Ninth Fuzzy System Symposium (in Japan), 1993, pp. 249-252.