GA-TSKfnn: Parameters tuning of fuzzy neural network using genetic algorithms




Expert Systems with Applications 29 (2005) 769–781 www.elsevier.com/locate/eswa

GA-TSKfnn: Parameters tuning of fuzzy neural network using genetic algorithms A.M. Tang, C. Quek, G.S. Ng* Centre for Computational Intelligence, formerly known as the Intelligent Systems Laboratory, School of Computer Engineering, Nanyang Technological University, Blk N4 #2A-32, Nanyang Avenue, Singapore 639798, Singapore

Abstract

Fuzzy logic allows the mapping of an input space to an output space. The mechanism for doing this is a set of IF-THEN statements, commonly known as fuzzy rules. For a fuzzy rule to perform well, the fuzzy sets must be carefully designed. A major problem plaguing the effective use of this approach is the difficulty of automatically and accurately constructing the membership functions. Genetic Algorithms (GAs) are a technique that emulates biological evolutionary theory to solve complex optimization problems. They provide an alternative to traditional optimization techniques by using directed random searches to derive a set of optimal solutions in complex landscapes. Populations of candidate solutions are evaluated in parallel to determine the best solution. In this paper, a hybrid system combining a Fuzzy Inference System and Genetic Algorithms, a Genetic Algorithms based Takagi-Sugeno-Kang Fuzzy Neural Network (GA-TSKfnn), is proposed to tune the parameters in the Takagi-Sugeno-Kang fuzzy neural network. The aim is to reduce the unnecessary steps in preparing the parameter sets before they can be fed into the network. Modifications are made to various layers of the network to enhance performance. The proposed GA-TSKfnn achieves a higher classification rate than traditional neuro-fuzzy classifiers.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Genetic algorithms; Takagi-Sugeno-Kang fuzzy neural networks; Fuzzy inference systems; Simultaneous and sequential parameter tuning; Extensive benchmarking

1. Introduction

The Self-Organizing Fuzzy Neural Network (FNN) combines the advantages of both fuzzy inference systems and neural networks by generating a knowledge base of fuzzy rules without the need for expert human knowledge. It also allows the tuning of the membership functions through training, to minimize output error and improve prediction accuracy. Fuzzy systems are based on how the human brain deals with inexact information, while neural networks are based on the brain's architecture and learning capabilities. The neuro-fuzzy system is the fusion of these two different technologies (Sun & Jang, 1993). There is a growing number of neuro-fuzzy systems, and they can be broadly grouped into two categories depending on the fuzzy IF-THEN rule model they employ, namely:

* Corresponding author. Tel.: +65 679 05043; fax: +65 679 26559. E-mail address: [email protected] (G.S. Ng).

0957-4174/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2005.06.001

the Mamdani (Mamdani, 1974) model and the Takagi-Sugeno-Kang (TSK) (Takagi, 1985) fuzzy model. It is commonly accepted that the Mamdani model possesses poor representation, meaning that a greater number of fuzzy rules is required to model a real-world system when compared against the TSK model. Genetic algorithms (Holland, 1975) have been widely used for generating fuzzy if-then rules and tuning membership functions (Carse, Fogarty, & Munro, 1996). For example, genetic algorithms were used for generating fuzzy if-then rules in (Feldman, 1993; Thrift, 1991), for tuning membership functions in (Herrera, Lozano, & Verdegay, 1995; Janikow, 1996; Karr, 1991; Karr & Gentry, 1993), and for both rule generation and membership function tuning in (Homaifar & McCormick, 1995; Ishigami, 1995; Lee & Takagi, 1993; Nomura, Hayashi, & Wakami, 1992; Park, Kandel, & Langholz, 1994). Hierarchical structures of fuzzy rule-based systems were also determined by genetic algorithms in (Shimojima, Fukuda, & Hasegawa, 1995; Matsushita, 1996). While various methods have been proposed for generating fuzzy


if-then rules and tuning membership functions, only a few methods are applicable to pattern classification problems. This is because the above-mentioned methods (Valenzuela-Rendon, 1991; Parodi & Bonelli, 1993; Furuhashi, Nakaoka, & Uchikawa, 1994; Nakaoka, Furuhashi, & Uchikawa, 1994) are mainly for control and function approximation problems. In this research work, the pattern classification problem is dealt with using the proposed method. Ishibuchi and Murata (1996) proposed a method to determine the number of fuzzy rules and the membership function of each antecedent fuzzy set. Cordon, Herrera, and Villar (2001) also proposed a method to automatically learn the knowledge base of a Mamdani-type fuzzy system by finding an appropriate data base by means of a genetic algorithm. The genetic process learns the number of linguistic terms per variable and the membership function parameters that define their semantics, while a rule base generation method learns the number of rules and their composition. These two research works basically tried to find the number of fuzzy rules and membership functions. In this work, however, the number of rules and membership functions is predetermined, as the shape of the membership function is more important in affecting the classification accuracy of the system. Furthermore, this research work deals with a Takagi-Sugeno-Kang (TSK)-type fuzzy system instead of a Mamdani type. Ishibuchi, Nakashima, and Murata (1999) used a genetics-based machine learning method that automatically generates fuzzy if-then rules for pattern classification problems from numerical data. They used linguistic values with fixed membership functions, which also lead to a simple implementation as a computer program. In this research work, fixed membership functions are not used: the parameters of the (trapezoidal) membership function are dynamically tuned during training. Cordon and Herrera (1999) presented a two-stage learning process to automatically learn a TSK-type Knowledge Base (KB) from examples. The two stages are the generation and refinement stages: in the generation stage, a preliminary TSK-type KB is obtained; the refinement stage then obtains the optimal TSK-type KB. Casillas (2001) also proposed a two-stage tuning of the membership function using a Genetic Algorithm, basic and extended tuning, using triangular membership functions. In this research work, however, neither the learning process nor the tuning process is divided into two stages. Our emphasis is on the simultaneous tuning of the membership function parameters and the TSK parameters. The Self-Organizing Takagi-Sugeno-Kang Fuzzy Neural Network (S-TSKfnn) proposed by Gao (2002) is a novel realization of the TSK inference system. It has the advantage of expressing a clearer concept of the inference process. The network is also able to learn the presented data faster with less memory space. Besides, a lesser number of

fuzzy rules is required in this network. It has been demonstrated to produce better performance compared against traditional neuro-fuzzy models. The network requires two independent tuning phases, namely Virus Infection Clustering and Plane Tuning, to tune the parameters in the Condition and Consequence Layers. On the other hand, Genetic Algorithms (GAs) are global search and optimization techniques that explore the search space by maintaining a set of candidate solutions in parallel. They emulate biological evolutionary theories to solve complex optimization problems. GAs perform the search on a coding of the data instead of directly on the raw data. This feature allows GAs to be independent of the continuity of the data and of the existence of derivatives of the functions, as required by some conventional optimization algorithms. The coding method in GAs allows them to easily handle optimization problems with multiple parameters or multiple models, which are rather difficult or impossible to treat with classical optimization methods. The objectives of this paper can be divided into two major parts. The first part is to apply Genetic Algorithms to tune the parameters of the membership functions as well as the Takagi-Sugeno-Kang parameters. The second part is to further improve the performance of the network; necessary modifications are made to certain layers of the existing S-TSKfnn. The proposed GA-fuzzy architecture, that is, GA-TSKfnn, is benchmarked against an extensive class of fuzzy neural systems using the Iris data set (Fisher, 1936). The results are highly promising.

2. Architecture of GA-TSKfnn

GA-TSKfnn is a modification adapted from S-TSKfnn (Gao, 2002). It is a six-layer feed-forward network. The layers are characterized by the fuzzy operations they perform. Starting from left to right in Fig. 1, the layers are: Layer I (Input Layer), Layer II (Condition Layer), Layer III (Rule-base Layer), Layer IV (Selection Layer), Layer V (Consequence Layer) and Layer VI (Output Layer). Each layer has a number of neurons, and the neurons in each layer are ordered. The number of neurons in layer $k$ is labeled $N_k$, where $k \in \{1, \ldots, 6\}$. Neurons in adjacent layers are connected using links $E$. The label $E_{i,j}^k$ denotes a link originating from the $i$th neuron in layer $k-1$ and terminating at the $j$th neuron in layer $k$, where $k \in \{2, \ldots, 6\}$, $i \in \{1, \ldots, N_{k-1}\}$ and $j \in \{1, \ldots, N_k\}$. There are no links between neurons in the same layer. Links in a conventional neural network carry weights $w$. In GA-TSKfnn, all the weight parameters of the links are embedded in the neurons they feed into. Hence, the weight of the link $E_{i,j}^k$ is denoted by the label $w_{i,j}^k$, and is included in the $j$th neuron of layer $k$.


Fig. 1. Structure of GA-TSKfnn. [Figure: inputs $x_1, \ldots, x_{N_1}$ feed linguistic nodes $L_1, \ldots, L_{N_1}$, whose input label nodes $IL_{i,1}, \ldots, IL_{i,T_i}$ connect to rule nodes $R_1, \ldots, R_{N_3}$, selection nodes $S_1, \ldots, S_{N_4}$ and consequence nodes $C_1, \ldots, C_{N_5}$, which are summed ($\Sigma$) into the single output node $f$.]

2.1. Layer I: the input layer

The input to GA-TSKfnn is a non-fuzzy data vector, represented by $\vec{x} = [x_1, x_2, \ldots, x_i, \ldots, x_{N_1}]^T$. The function of this layer is to receive the input, translate it into singleton fuzzy sets, and output the result to the next layer. Nodes in this layer are termed linguistic nodes; they represent input linguistic variables like 'height', 'weight', 'size', etc. However, in the actual implementation, all the inputs used are non-fuzzy input vectors. Therefore, no fuzzification is performed; the linguistic nodes directly transmit the non-fuzzy inputs to the next layer. Each node receives only one input, one dimension of the data vector presented to the whole network, and outputs to several nodes of the next layer. The net input and output functions are defined in Eqs. (1) and (2), respectively:

$$\text{net input:} \quad f_i^I = x_i \quad \text{for } i = 1, \ldots, N_1 \tag{1}$$

$$\text{net output:} \quad o_i^I = f_i^I \quad \text{for } i = 1, \ldots, N_1 \tag{2}$$

where $f_i^I$ is the net input of node $i$ in Layer I, $o_i^I$ is the net output of node $i$ in Layer I, and $x_i$ is the $i$th element of the input vector $\vec{x}$.

2.2. Layer II: the condition layer

Neurons in the Condition Layer are termed input label nodes. They represent labels such as 'small', 'medium', or 'large' of the corresponding input linguistic variable. Fig. 2 illustrates a simple input label concept.

Fig. 2. Illustration of a simple example of fuzzy rule inference. [Figure: the linguistic variables 'height' and 'weight' connect to the label nodes 'short', 'tall', 'heavy' and 'light', which feed the rule nodes 'fat' and 'thin'.]

As shown in Fig. 2, 'height' and 'weight' represent two linguistic variables corresponding to the two linguistic nodes in the Input Layer. There are four labeled nodes in the Condition Layer, which correspond to 'short', 'tall', 'heavy' and 'light'. The first two labels describe the linguistic variable 'height' while the last two refer to the linguistic variable 'weight'. The input label nodes are denoted using IL. The label $IL_{i,j}$ denotes a node representing the $j$th label of the $i$th linguistic variable input. Referring to Fig. 2, 'heavy' is the first label of the second linguistic variable input, so it is denoted as $IL_{2,1}$. Each input label node has only one input, but can output to one or more nodes in the next layer. Each node has its own membership function, whose argument comes from the connected linguistic node in the first layer. The output of the input label node is the membership value. Commonly used membership functions are triangular, trapezoidal, Gaussian and bell-shaped membership functions. The horizontal axis of the membership function represents the input linguistic value, and the vertical axis represents the membership value $\mu(x_i)$, where $x_i$ is the output of the $i$th linguistic node and $\mu(x_i) \in [0, 1]$. Fig. 3 shows a typical trapezoidal membership function.

Fig. 3. Trapezoidal membership function of input label nodes. [Figure: a trapezoid defined by the parameters $l$, $u$, $v$ and $r$ on the input axis.]

The input label node $IL_{i,j}$ represents the $j$th label of the $i$th linguistic variable, and the net input and output functions


are defined in Eqs. (3) and (4) respectively:

$$\text{net input:} \quad f_{ij}^{II} = \begin{cases} 0, & o_i^I \le l_{ij}^{II} \\ \dfrac{o_i^I - l_{ij}^{II}}{u_{ij}^{II} - l_{ij}^{II}}, & l_{ij}^{II} < o_i^I < u_{ij}^{II} \\ 1, & u_{ij}^{II} \le o_i^I \le v_{ij}^{II} \\ \dfrac{o_i^I - r_{ij}^{II}}{v_{ij}^{II} - r_{ij}^{II}}, & v_{ij}^{II} < o_i^I < r_{ij}^{II} \\ 0, & o_i^I \ge r_{ij}^{II} \end{cases} \quad \text{for } i = 1, \ldots, N_1 \text{ and } j = 1, \ldots, N_2 \tag{3}$$

$$\text{net output:} \quad o_{ij}^{II} = f_{ij}^{II} \quad \text{for } i = 1, \ldots, N_1 \text{ and } j = 1, \ldots, N_2 \tag{4}$$

where $o_i^I$ is the output of the $i$th linguistic node in Layer I and $\{l_{ij}^{II}, u_{ij}^{II}, v_{ij}^{II}, r_{ij}^{II}\}$ is the parameter set of the trapezoidal membership function (see Fig. 3).

2.3. Layer III: the rule-base layer

The Rule-base Layer defines the fuzzy rules. Each neuron in this layer represents a fuzzy rule and is termed a rule node. The main function of the rule node is illustrated in Fig. 2. Two fuzzy rules can be observed from the figure:

Rule 1: 'IF the height is short and the weight is heavy, THEN the person is fat'
Rule 2: 'IF the height is tall and the weight is light, THEN the person is thin'

The words 'fat' and 'thin' are the corresponding rule nodes. The number of outputs of each rule node is fixed at one, but the number of inputs of each rule node is not fixed and can range from 1 to $N_1$ (the number of input dimensions), depending on the fuzzy rule it represents. The net input and net output functions of the $k$th rule node in this layer are defined in Eqs. (5) and (6) respectively:

$$\text{net input:} \quad f_k^{III} = \prod_i o_{ij}^{II} \quad \text{for } k = 1, \ldots, N_3, \; i = 1, \ldots, N_1, \text{ and } j = 1, \ldots, N_2 \tag{5}$$

$$\text{net output:} \quad o_k^{III} = f_k^{III} \quad \text{for } k = 1, \ldots, N_3 \tag{6}$$

where $f_k^{III}$ is the net input of the $k$th rule node in Layer III, and $o_{ij}^{II}$ is the net output of the $j$th label node for the $i$th input dimension in Layer II.

2.4. Layer IV: the selection layer

This layer determines the maximum degree of matching for each rule. It has the same number of nodes as the previous layer. If the degree of matching is the greatest among all rules, its value in Layer IV is set to 1; otherwise it is set to 0. The net input and net output functions are given in Eqs. (7) and (8) respectively:

$$\text{net input:} \quad f_l^{IV} = o_k^{III} \quad \text{for } l = 1, \ldots, N_4 \text{ and } k = 1, \ldots, N_3 \tag{7}$$

$$\text{net output:} \quad o_l^{IV} = \begin{cases} 1, & \text{when } f_l^{IV} \text{ is maximum} \\ 0, & \text{otherwise} \end{cases} \quad \text{for } l = 1, \ldots, N_4 \tag{8}$$

where $o_k^{III}$ is the net output of the $k$th rule node in Layer III, $f_l^{IV}$ is the net input of the $l$th selection node in Layer IV, and $o_l^{IV}$ is the net output of the $l$th selection node in Layer IV.

2.5. Layer V: the consequence layer

Nodes in the Consequence Layer are termed consequence nodes, and each node corresponds to one fuzzy rule. Each node in this layer has $N_1 + 1$ inputs, which can be separated into two groups: an input from the Selection Layer and $N_1$ data inputs. The input from the Selection Layer provides the firing strength of the associated fuzzy rule. The net input and net output functions are defined in Eqs. (9) and (10) respectively:

$$\text{net input:} \quad f_m^V = c_0 x_0 + c_1 x_1 + c_2 x_2 + \cdots + c_{N_1} x_{N_1} \quad \text{for } m = 1, \ldots, N_5 \tag{9}$$

$$\text{net output:} \quad o_m^V = o_m^{IV} f_m^V \quad \text{for } m = 1, \ldots, N_5 \tag{10}$$

where $f_m^V$ is the net input of the $m$th consequence node in Layer V, $c_i$ is the coefficient of the $i$th input variable, $o_m^{IV}$ is the net output of the $m$th selection node in Layer IV, and $o_m^V$ is the net output of the $m$th consequence node in Layer V. Eq. (9) is based on the Takagi-Sugeno-Kang (TSK)-type fuzzy rule-base model. The set $\{c_0, c_1, \ldots, c_i, \ldots, c_{N_1}\}$ is the parameter set; parameters in this layer are referred to as the consequence parameters.

2.6. Layer VI: the output layer

This layer contains only one node, termed the output node; hence $N_6$ equals 1. It sums up all the outputs from the consequence nodes. The function of the output node is defined in Eq. (11):

$$\text{net output:} \quad o^{VI} = f^{VI} = \sum_{m=1}^{N_5} o_m^V \tag{11}$$

where $f^{VI}$ is the net input of the output node, $o^{VI}$ is the net output of the output node, and $o_m^V$ is the output of the $m$th neuron in Layer V.
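To make the layer equations concrete, the following is a minimal sketch in Python of a single forward pass through the six layers. It is not code from the paper: the function names (trapezoid, forward) and the array layout of the parameters (mf, tsk) are illustrative assumptions; only the computations follow Eqs. (1)-(11).

```python
import numpy as np

def trapezoid(x, l, u, v, r):
    """Trapezoidal membership function of Eq. (3), with l <= u <= v <= r."""
    if x <= l or x >= r:
        return 0.0
    if u <= x <= v:
        return 1.0
    if x < u:                        # rising edge
        return (x - l) / (u - l)
    return (x - r) / (v - r)         # falling edge

def forward(x, mf, tsk):
    """One forward pass through GA-TSKfnn.

    x   : numpy input vector of length D (Layer I passes it through, Eqs. (1)-(2)).
    mf  : array of shape (K, D, 4) -- trapezoid parameters (l, u, v, r)
          per rule and input dimension (Layer II, Eqs. (3)-(4)).
    tsk : array of shape (K, D + 1) -- TSK coefficients c_k0..c_kD (Layer V).
    """
    K, D, _ = mf.shape
    # Layers II/III: membership values multiplied into rule firing strengths, Eq. (5)
    firing = np.array([
        np.prod([trapezoid(x[d], *mf[k, d]) for d in range(D)])
        for k in range(K)
    ])
    # Layer IV: winner-take-all selection, Eq. (8)
    sel = np.zeros(K)
    sel[np.argmax(firing)] = 1.0
    # Layer V: first-order TSK consequent gated by the selection output, Eqs. (9)-(10)
    y = tsk[:, 0] + tsk[:, 1:] @ x
    # Layer VI: sum of consequence outputs, Eq. (11)
    return float(np.sum(sel * y))
```

With K = 3 rules and D = 4 Iris features, the output lands near the winning rule's number, which the check in Step 7 of Section 3.2 turns into a class decision.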


3. Algorithms of GA-TSKfnn

The GA technique is employed to tune the parameters in Layer II and Layer V of GA-TSKfnn. Sections 3.1 and 3.2 outline the GA algorithms used to tune the membership functions and the TSK parameters independently. Section 3.3 describes the simultaneous tuning of the membership functions and the TSK parameters.

3.1. Tuning the membership function parameters

The membership function used in GA-TSKfnn is the trapezoidal membership function. A detailed description and the GA algorithm are presented below. The following assumptions are made in the algorithm:

† the total number of rules is K;
† the total number of input patterns is P;
† the total number of input dimensions is D;
† the output of the kth rule corresponds to the rule number, that is, the output of the kth rule is k; and
† the higher the fitness value, the stronger the chromosome in the population.

The rule takes the form of

R^k: IF x_{p1} is A_{p1} AND x_{p2} is A_{p2} AND … AND x_{pd} is A_{pd} AND … AND x_{pD} is A_{pD} THEN y_{pk} is k

for p = 1,…,P, d = 1,…,D and k = 1,…,K, where R^k is the kth rule, x_{pd} is the dth dimension of the pth input pattern, A_{pd} is the value of the dth dimension of the pth input pattern, y_{pk} is the output of the kth rule for the pth input pattern, and k is the rule number.

Since the trapezoidal membership function is used, there are a total of 4 parameters to be tuned for each membership function. The number of parameters to be tuned for each rule is 4D. This yields a total of 4DK parameters to be tuned for all K rules. Let B_A denote the total number of parameters to be tuned, that is, B_A = 4DK. To represent a parameter's value, L bits are required. This gives a total of L·B_A bits to represent one chromosome. Let M_A denote the total number of bits for each chromosome, that is, M_A = L·B_A. A total of C_A chromosomes are needed to form the initial population. The following procedure illustrates in detail the steps used to tune the membership function parameters.

3.1.1. Initialization

Step 1 Read in all the training data and store them in a 2-dimensional array of P rows (patterns) and D columns (dimensions), that is, TD[P][D].
Step 2 Randomly initialize another 2-dimensional array of size C_A × M_A, Bit[C_A][M_A], with either a '0' or a '1' under equal probability.

3.1.2. Encoding

Step 3 For all C_A chromosomes and all B_A parameters, decode the actual value of each parameter using the following formulas and store the values in a 2-dimensional array of size C_A × B_A, that is, Value[C_A][B_A]:

$$\mathrm{Value}[c_A][b_A] = \mathrm{BIT}_A \times \mathrm{FACTOR} \quad \text{for } c_A = 0, 1, \ldots, C_A - 1 \text{ and } b_A = 0, 1, \ldots, B_A - 1 \tag{12}$$

$$\mathrm{Value}[c_A][b_A] = \frac{\mathrm{Value}[c_A][b_A]}{\lambda_A} \quad \text{for } c_A = 0, 1, \ldots, C_A - 1 \text{ and } b_A = 0, 1, \ldots, B_A - 1 \tag{13}$$

where Value[c_A][b_A] is the actual value of the b_Ath parameter of the c_Ath chromosome, BIT_A = [Bit[c_A][b_A × L] Bit[c_A][b_A × L + 1] … Bit[c_A][b_A × L + (L − 1)]], FACTOR = [2^{L−1} 2^{L−2} … 2^1 2^0]^T, and λ_A is the scaling factor.

Step 4 Initialize a 1-dimensional array of size C_A to store the fitness values of all chromosomes. Set the initial fitness of all chromosomes to 0, that is, fitness A[c_A] = 0 for c_A = 0,…,C_A − 1.
Step 5 For all C_A chromosomes, and for each set K of four parameters, K = {Value[c_A][p], Value[c_A][q], Value[c_A][r], Value[c_A][s]}, perform the following operations to get a set K′ = {p, q, r, s} such that p, q, r and s are sorted in ascending order:

$$p = \min[\mathrm{Value}[c_A][p], \mathrm{Value}[c_A][q], \mathrm{Value}[c_A][r], \mathrm{Value}[c_A][s]] \tag{14}$$

$$s = \max[\mathrm{Value}[c_A][p], \mathrm{Value}[c_A][q], \mathrm{Value}[c_A][r], \mathrm{Value}[c_A][s]] \tag{15}$$

$$q = \min[K \setminus \{p, s\}] \tag{16}$$

$$r = \max[K \setminus \{p, s\}] \tag{17}$$

for c_A = 0, 1,…,C_A − 1; p = 0, 4, 8,…,B_A − 8, B_A − 4; q = 1, 5, 9,…,B_A − 7, B_A − 3; r = 2, 6, 10,…,B_A − 6, B_A − 2; and s = 3, 7, 11,…,B_A − 5, B_A − 1.
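As a rough illustration of Steps 2-5, the sketch below decodes a randomly initialized bit string into sorted trapezoid parameter sets. The helper name decode_chromosome and the scaling value lam_a = 25.5 are hypothetical choices; the paper fixes only B_A = 4DK, M_A = L·B_A, the binary decoding of Eqs. (12)-(13) and the ascending sort of Eqs. (14)-(17).

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_chromosome(bits, L, lam_a):
    """Decode one chromosome of M_A = L * B_A bits into B_A parameter values.

    Each group of L bits is read as an unsigned integer weighted by FACTOR
    and divided by the scaling factor lam_a (Eqs. (12)-(13)); each
    consecutive quadruple (l, u, v, r) is then sorted ascending (Eqs. (14)-(17)).
    """
    factor = 2 ** np.arange(L - 1, -1, -1)          # [2^(L-1), ..., 2, 1]
    values = bits.reshape(-1, L) @ factor / lam_a   # Eqs. (12)-(13)
    quads = values.reshape(-1, 4)                   # one trapezoid per row
    return np.sort(quads, axis=1)                   # Eqs. (14)-(17)

# Example: K = 3 rules, D = 4 dimensions, L = 8 bits per parameter
K, D, L = 3, 4, 8
B_A = 4 * D * K                                     # total parameters
bits = rng.integers(0, 2, size=L * B_A)             # Step 2: random initialization
mf = decode_chromosome(bits, L, lam_a=25.5).reshape(K, D, 4)
print(mf[0, 0])  # sorted (l, u, v, r) for rule 1, dimension 1
```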

3.1.3. Evaluation

Step 6 For the 1st chromosome and the 1st set of training data, find the matching degree of the input data to all the rules defined, using the multiplication operator:

$$\mu^k(\vec{x}) = \mu_1^k(x_1) \cdot \mu_2^k(x_2) \cdots \mu_d^k(x_d) \cdots \mu_D^k(x_D) \tag{18}$$

where $\vec{x}$ is the input vector, $\mu_d^k(x_d)$ is the membership value of the $d$th dimension of the $k$th rule, and $\mu^k(\vec{x})$ is the matching degree of the $k$th rule with input $\vec{x}$.

Step 7 Find the strongest matching degree (SMD) among all rules, that is,

$$\mathrm{SMD} = \arg\max_k \big(\mu^k(\vec{x})\big) \tag{19}$$

Step 8 If the value of SMD equals the rule number k, increase the fitness value of the 1st chromosome by 1.
Step 9 Repeat Step 6 to Step 8 for the remaining (P − 1) data sets with the same chromosome.
Step 10 Repeat Step 6 to Step 9 for the remaining (C_A − 1) chromosomes.
Step 11 Starting with the first chromosome (that is, c_A = 0), sort the fitness values of all chromosomes in descending order, and swap the bit values of each chromosome accordingly. The fitness value of the c_Ath chromosome is thus given by:

$$\mathrm{fitness\,A}[c_A] = \max\Big\{\mathrm{fitness\,A}[c_A],\; \max_{i = c_A + 1}^{C_A - 1} \mathrm{fitness\,A}[i]\Big\} \tag{20}$$

Step 12 If the termination criterion is met, that is, fitness A[0] is equal to the desired fitness value OR the maximum iteration of 5000 loops is reached, go to Step 19; otherwise continue with the following steps.

3.1.4. GA operations

Step 13 Perform genetic operations and create new generations. The top 10% of the best chromosomes (that is, a total of 0.1·C_A chromosomes) are carried over to the new generation without any change. The remaining chromosomes undergo the crossover and mutation processes.
Step 14 Find the cumulative fitness value of the c_Ath chromosome, wheel[c_A], by summing the fitness value of the c_Ath chromosome and wheel[c_A − 1]. At the same time, find the total fitness value of all chromosomes:

$$\mathrm{wheel}[c_A] = \begin{cases} \mathrm{fitness\,A}[c_A], & c_A = 0 \\ \mathrm{fitness\,A}[c_A] + \mathrm{wheel}[c_A - 1], & c_A \ne 0 \end{cases} \tag{21}$$

$$\mathrm{total\_fitness} = \sum_{i=0}^{C_A - 1} \mathrm{fitness\,A}[i] \tag{22}$$

Step 15 Randomly generate two numbers, CR1 and CR2, between 0 and total_fitness. If CR1 falls in (or is less than) wheel[i], chromosome i is selected. If CR2 falls in (or is less than) wheel[j], chromosome j is selected.
Step 16 Randomly generate a Crossover Point (CRP) between 2 and M_A − 1. Perform a one-point crossover operation on chromosome i and chromosome j at bit position CRP. Repeat Steps 15 and 16 until all crossovers are done.
Step 17 Apply the mutation operator with probability p_m. A total of p_m·0.9·C_A·M_A bits will undergo the mutation process. Randomly generate a number between 0.1·C_A and C_A − 1 (Mut_Index) to select a chromosome, and another number between 0 and M_A − 1 (Mut_Position) for the bit position. Let the selected bit be mut, that is, mut = Bit[Mut_Index][Mut_Position]. The value of mut is then given by:

$$\mathrm{mut} = N(\mathrm{mut}) \tag{23}$$

where N(·) is the complement (NOT) function.
Step 18 Repeat Step 3 to Step 17.
Step 19 Decode the actual values of the parameters of the fittest chromosome. Save them to a text file to be fed into GA-TSKfnn as the membership function parameters.

In the above algorithm, Steps 1 and 2 are termed the Initialization method; Steps 3 and 4 the Encoding method; Steps 6 to 8 the Evaluation method; and Steps 13 to 17 the Genetic Algorithms method.
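The Evaluation and GA-operations steps can be sketched as follows, reusing the hypothetical trapezoid and decode_chromosome helpers from the earlier sketches. The fitness counts training patterns whose strongest-matching rule (Eqs. (18)-(19)) equals the class label (Steps 6-9); selection, crossover and mutation follow Steps 13-17. The population layout and mutation rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(bits, data, labels, K, D, L, lam_a):
    """Steps 6-9: +1 for each pattern whose best-matching rule is its class.

    labels are assumed to be 1-based rule numbers, as in Section 3.1.
    """
    mf = decode_chromosome(bits, L, lam_a).reshape(K, D, 4)
    score = 0
    for x, k_true in zip(data, labels):
        match = [np.prod([trapezoid(x[d], *mf[k, d]) for d in range(D)])
                 for k in range(K)]                       # Eq. (18)
        score += int(np.argmax(match) + 1 == k_true)      # Eq. (19), Step 8
    return score

def roulette(fit, rng):
    """Steps 14-15: cumulative wheel (Eqs. (21)-(22)) sampled uniformly."""
    wheel = np.cumsum(fit)
    return int(np.searchsorted(wheel, rng.uniform(0, wheel[-1])))

def next_generation(pop, fit, pm, rng):
    """Steps 13, 16-17: elitism, one-point crossover, bit-flip mutation."""
    C, M = pop.shape
    order = np.argsort(fit)[::-1]                         # Step 11: sort by fitness
    new = [pop[i].copy() for i in order[: C // 10]]       # top 10% carried over
    while len(new) < C:
        i, j = roulette(fit, rng), roulette(fit, rng)
        crp = rng.integers(2, M - 1)                      # Step 16: crossover point
        child = np.concatenate([pop[i][:crp], pop[j][crp:]])
        flips = rng.random(M) < pm                        # Step 17: mutation
        child[flips] ^= 1
        new.append(child)
    return np.stack(new)

# One generation over a population `pop` of C chromosomes (Steps 6-18):
# fit = np.array([fitness(ch, data, labels, K, D, L, lam_a) for ch in pop])
# pop = next_generation(pop, fit, pm=0.01, rng=rng)
```

Iterating these two calls until fitness A[0] reaches the desired value, or 5000 loops have elapsed, reproduces the termination rule of Step 12.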

3.2. Tuning the TSK parameter sets

In Takagi-Sugeno-Kang fuzzy IF-THEN rules, the output of each rule is a linear combination of the input variables plus a constant. Therefore, the output equation can be written as

$$y_k = c_{k0} + c_{k1} x_1 + \cdots + c_{kd} x_d + \cdots + c_{kD} x_D \quad \text{for } d = 1, \ldots, D \text{ and } k = 1, \ldots, K \tag{24}$$

where $y_k$ is the output of the $k$th rule, $c_{kd}$ is the parameter of the $d$th dimension of the $k$th rule, and $c_{k0}$ is the constant term of the $k$th rule.

There are a total of D + 1 constants to be tuned for each rule output. This yields a total of K(D + 1) parameters to be tuned for all K rules. Let B_B denote the total number of TSK parameters to be tuned, that is, B_B = K(D + 1). To represent one parameter's value, L bits are required. This gives a total of L·B_B bits to represent a chromosome. Let M_B denote the total number of bits for each chromosome, that is, M_B = L·B_B. For simplicity, we assume that the output of the kth rule corresponds to the rule number, that is, the output of rule k equals k. The procedure used to tune the TSK parameter sets is similar to the steps discussed in Section 3.1, except:

† Replace all subscripts A with subscript B and fitness A[·] with fitness B[·].
† Remove Step 5 in Section 3.1.
† Replace the Evaluation method (Step 6 to Step 8 in Section 3.1) with Step 6 and Step 7 listed below.

Step 6 For the 1st chromosome and the 1st set of training data, determine the output value $y_k$ using Eq. (25):

$$y_k = \mathrm{VALUE} \cdot \mathrm{TDATA} \quad \text{for } k = 1, \ldots, K \tag{25}$$

where $y_k$ is the output of the $k$th rule, VALUE = [Value[c_B][k·D] Value[c_B][k·D + 1] … Value[c_B][k·D + (D − 1)]], TDATA = [1 TD[p][1] TD[p][2] … TD[p][D]]^T, Value[c_B][l] is the lth parameter of the c_Bth chromosome, TD[p][d] is the dth dimension of the pth training pattern, p = 1, 2,…,P, d = 1, 2,…,D, l = 0, 1,…,B_B − 1 and c_B = 0, 1,…,C_B − 1. Please note that p = 1 for the first set of training data.

Step 7 The output value $y_k$ should be the integer value k according to Eq. (24). However, sometimes it is not possible to obtain the exact integer value k, so if k < $y_k$ < k + 0.5, increase the fitness value of the 1st chromosome by 1.

3.3. Tuning the membership function parameters and TSK parameter sets simultaneously

Sections 3.1 and 3.2 describe the algorithms to tune the membership function parameters and the TSK parameter sets independently. Although Genetic Algorithms can be used to tune the two parameter sets independently, the drawback is similar to that of S-TSKfnn: two independent tuning processes are required before the parameters can be fed into the network. Therefore, this section introduces a method that integrates the two proposed algorithms into one. It aims to:

† Reduce the number of tuning steps to be done before the parameters can be fed into the network.
† Compare the result obtained by tuning both layers' parameters independently against tuning both layers' parameters simultaneously.

The procedure listed in this section follows the notations employed in Sections 3.1 and 3.2; further notations are introduced and indicated where necessary. The procedure used to tune the membership function parameters is termed Part A in the algorithm, while the procedure used to tune the TSK parameters is termed Part B. The algorithm is listed in the following.

3.3.1. Initialization

Step 1 Perform the Initialization method as described in Section 3.1 by dropping the subscript A, such that C = C_A = C_B and M = M_A + M_B.

3.3.2. Encoding

Step 2 For all C chromosomes, decode the actual value of each parameter using Eqs. (26)–(29) and store the values in a 2-dimensional array of size C × B, that is, Value[C][B], where C = C_A = C_B and B = B_A + B_B.

For Part A (the first B_A values):

$$\mathrm{Value}[c][b_A] = \mathrm{BIT} \times \mathrm{FACTOR} \tag{26}$$

$$\mathrm{Value}[c][b_A] = \frac{\mathrm{Value}[c][b_A]}{\lambda_A} \tag{27}$$

For Part B (the next B_B values):

$$\mathrm{Value}[c][b_B] = \mathrm{BIT} \times \mathrm{FACTOR} \tag{28}$$

$$\mathrm{Value}[c][b_B] = \frac{\mathrm{Value}[c][b_B]}{\lambda_B} \tag{29}$$

for c = 0,…,C − 1, b_A = 0,…,B_A − 1, and b_B = B_A, B_A + 1,…,B_A + B_B − 1, where Value[c][b] is the actual value of the bth parameter of the cth chromosome, BIT = [Bit[c][b × L] Bit[c][b × L + 1] … Bit[c][b × L + (L − 1)]], FACTOR = [2^{L−1} 2^{L−2} … 2^1 2^0]^T, λ_A is the scaling factor for Part A, and λ_B is the scaling factor for Part B.

Step 3 For Part A (that is, for the first B_A parameters only), for all C chromosomes, and for each set K of four parameters, K = {Value[c][p], Value[c][q], Value[c][r], Value[c][s]}, perform the following


operations to derive a set K′ = {p, q, r, s} such that p, q, r and s are sorted in ascending order:

$$p = \min[\mathrm{Value}[c][p], \mathrm{Value}[c][q], \mathrm{Value}[c][r], \mathrm{Value}[c][s]] \tag{30}$$

$$s = \max[\mathrm{Value}[c][p], \mathrm{Value}[c][q], \mathrm{Value}[c][r], \mathrm{Value}[c][s]] \tag{31}$$

$$q = \min[K \setminus \{p, s\}] \tag{32}$$

$$r = \max[K \setminus \{p, s\}] \tag{33}$$

for c = 0,…,C − 1; p = 0, 4, 8,…,B_A − 8, B_A − 4; q = 1, 5, 9,…,B_A − 7, B_A − 3; r = 2, 6, 10,…,B_A − 6, B_A − 2; and s = 3, 7, 11,…,B_A − 5, B_A − 1.

3.3.3. Evaluation

Step 4 Initialize two 1-dimensional arrays of size C to store the fitness values of all chromosomes for Parts A and B. Set the initial fitness of all chromosomes to 0, that is, fitness A[c] = 0 and fitness B[c] = 0 for c = 0,…,C − 1.

Step 5 For the 1st chromosome and the 1st set of training data:

For Part A:
(a) Determine the matching degree of the input data to all the rules defined, using the multiplication operator in Eq. (34):

$$\mu^k(\vec{x}) = \mu_1^k(x_1) \cdot \mu_2^k(x_2) \cdots \mu_d^k(x_d) \cdots \mu_D^k(x_D) \tag{34}$$

where $\vec{x}$ is the input vector, $\mu_d^k(x_d)$ is the membership value of the $d$th dimension of the $k$th rule, and $\mu^k(\vec{x})$ is the matching degree of the $k$th rule with input $\vec{x}$.

(b) Identify the strongest matching degree (SMD) among all rules, that is,

$$\mathrm{SMD} = \arg\max_k \big(\mu^k(\vec{x})\big) \tag{35}$$

(c) If the value of SMD equals the rule number k, increase the fitness value of the 1st chromosome in Part A, fitness A[c], by 1.

For Part B:
(a) Determine the output value $y_k$ using Eq. (36):

$$y_k = \mathrm{VALUE} \cdot \mathrm{TDATA} \quad \text{for } k = 1, \ldots, K \tag{36}$$

where $y_k$ is the output of the $k$th rule, VALUE = [Value[c][B_A + k·D] Value[c][B_A + k·D + 1] … Value[c][B_A + k·D + (D − 1)]], TDATA = [1 TD[p][1] TD[p][2] … TD[p][D]]^T, Value[c][l] is the lth parameter of the cth chromosome, TD[p][d] is the dth dimension of the pth training pattern, p = 1, 2,…,P, d = 1, 2,…,D, l = B_A, B_A + 1,…,B_A + B_B − 1 and c = 0, 1,…,C − 1. Please note that p = 1 for the first set of training data.

(b) If k < $y_k$ < k + 0.5, increase the fitness value of the 1st chromosome, fitness B[c], by 1.

Step 6 Repeat Step 5 for the remaining (P − 1) data sets with the same chromosome.
Step 7 Repeat Steps 5 and 6 for the remaining (C − 1) chromosomes.
Step 8 Starting with the first chromosome (that is, c_A = c_B = 0), sort the fitness values of all chromosomes in Part A and Part B in descending order, and swap the bit values of each chromosome accordingly. The fitness values of the c_Ath and c_Bth chromosomes are described by Eqs. (37) and (38) respectively:

$$\mathrm{fitness\,A}[c_A] = \max\Big\{\mathrm{fitness\,A}[c_A],\; \max_{i = c_A + 1}^{C_A - 1} \mathrm{fitness\,A}[i]\Big\} \tag{37}$$

$$\mathrm{fitness\,B}[c_B] = \max\Big\{\mathrm{fitness\,B}[c_B],\; \max_{i = c_B + 1}^{C_B - 1} \mathrm{fitness\,B}[i]\Big\} \tag{38}$$

Step 9 If the termination criterion is met, that is, (fitness A[0] equals the desired fitness value AND fitness B[0] equals the desired fitness value) OR the maximum iteration of 5000 loops is reached, go to Step 13. Otherwise proceed with the following steps.

3.3.4. GA operations

Step 10 Initialize one 1-dimensional array of size C to store the weighted fitness value. Sum up the fitness value of each chromosome by assigning different weights to the fitness functions of Parts A and B:

$$\mathrm{fitness\,AB}[c] = \alpha \cdot \mathrm{fitness\,A}[c] + \beta \cdot \mathrm{fitness\,B}[c] \quad \text{for } c = 0, 1, \ldots, C - 1 \tag{39}$$

where α is the weight assigned to the fitness function for Part A (α = 0.1, 0.2,…,1.0) and β is the weight assigned to the fitness function for Part B (β = 0.1, 0.2,…,1.0).

Step 11 Perform the Genetic Algorithms method as described in Section 3.1 by dropping the subscript A and replacing fitness A[·] with fitness AB[·].
Step 12 Repeat Step 2 to Step 11.
Step 13 Decode the actual values of the parameters of the fittest chromosome. Save them to a text file to be fed into GA-TSKfnn as the membership function parameters and TSK parameter sets.
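A minimal sketch of the combined evaluation of Step 5 and the weighted fitness of Eq. (39) is given below, again reusing the hypothetical helpers from the earlier sketches. The split of one chromosome into a membership part (first B_A parameters, scaled by λ_A) and a TSK part (remaining B_B, scaled by λ_B) follows Step 2; class labels are assumed to be the 1-based rule numbers of Section 3.1.

```python
import numpy as np

def fitness_ab(bits, data, labels, K, D, L, lam_a, lam_b, alpha, beta):
    """Step 5 and Eq. (39): weighted sum of Part A and Part B fitness."""
    B_A = 4 * D * K
    # Part A: the first B_A parameters are trapezoids (Eqs. (26)-(27))
    mf = decode_chromosome(bits[: B_A * L], L, lam_a).reshape(K, D, 4)
    # Part B: the remaining parameters are TSK coefficients (Eqs. (28)-(29))
    factor = 2 ** np.arange(L - 1, -1, -1)
    tsk = (bits[B_A * L:].reshape(-1, L) @ factor / lam_b).reshape(K, -1)

    fit_a = fit_b = 0
    for x, k_true in zip(data, labels):                   # k_true in 1..K
        match = [np.prod([trapezoid(x[d], *mf[k, d]) for d in range(D)])
                 for k in range(K)]
        fit_a += int(np.argmax(match) + 1 == k_true)      # Part A, Eqs. (34)-(35)
        y = tsk[k_true - 1, 0] + tsk[k_true - 1, 1:] @ x  # Part B, Eq. (36)
        fit_b += int(k_true < y < k_true + 0.5)           # Part B(b)
    return alpha * fit_a + beta * fit_b                   # Eq. (39)
```

Plugging fitness_ab in place of the earlier fitness in the GA loop, and sweeping (alpha, beta) over the ten combinations of Section 4.2, reproduces the simultaneous-tuning experiment.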

4. Experimental results

The experiments were carried out using the two proposed methods. The first method tunes the membership function parameters in the Condition Layer and the TSK parameter sets in the Consequence Layer independently. The second method allows simultaneous tuning of both layers (that is, the membership function parameters in the Condition Layer and the TSK parameter sets in the Consequence Layer) by assigning different weights to the fitness functions of the two layers. Section 4.1 reports on the results of independent tuning, while Section 4.2 reports on the results of simultaneous tuning of both the fuzzy sets and the TSK parameters. Section 4.3 discusses the experimental findings of the two approaches (that is, independent and simultaneous) for the tuning of the TSK network by the GA technique.

4.1. Independent GA tuning

A total of 100 executions of the proposed method were conducted to tune the membership function parameters in the Condition Layer and the TSK parameter sets in the Consequence Layer independently. This gives a total of 10000 (100 × 100) combinations of parameters to be fed into the network. The average classification rate achieved is shown in Table 1. Table 2 presents the highest classification rate achieved, defined as the highest achievable classification rate among the 100 executions.

Table 1 Classification under independent GA-tuning

Iris       | Number of Data | Correct Classified | Misclassified | Classification Rate (%)
Setosa     | 50             | 49.35              | 0.65          | 98.70
Virginica  | 50             | 48.81              | 1.19          | 97.62
Versicolor | 50             | 47.74              | 2.26          | 95.48
Total      | 150            | 145.90             | 4.10          | 97.27

Table 2 Highest classification rate achieved under independent GA-tuning

Iris       | Number of Data | Correct Classified | Misclassified | Classification Rate (%)
Setosa     | 50             | 50                 | 0             | 100.00
Virginica  | 50             | 50 or 49           | 0 or 1        | 100.00 or 98.00
Versicolor | 50             | 49 or 50           | 1 or 0        | 98.00 or 100.00
Total      | 150            | 149                | 1             | 99.33

4.2. Simultaneous GA tuning

The second proposed method simultaneously tunes the membership function parameters in the Condition Layer and the TSK parameter sets in the Consequence Layer. Weights are assigned to both fitness functions. The experiments start by assuming both the membership function parameters and the TSK parameter sets have the same weight, that is, the weight of the fitness function in the Condition Layer is the same as the weight of the TSK parameter sets (α = β = 1). Subsequently, the weight of the membership function parameters was decreased from 1.0 to 0.1 while the weight of the TSK parameter sets was increased from 0.1 to 0.9. A total of 10 combinations of weight values were tested. Each combination was executed 100 times and the average correct classification rate computed. For comparison purposes, the highest classification rate achieved was also identified. Table 3 presents the average classification rate of 100 executions for each combination of weights in the fitness function, while Table 4 presents the highest classification rate achieved.

Table 3 Classification under simultaneous GA-tuning (correct classifications out of 50 per class, averaged over 100 executions)

Condition Layer, α | TSK Parameter set, β | Iris Setosa | Iris Virginica | Iris Versicolor | Total  | Classification Rate (%)
1.0                | 1.0                  | 49.55       | 48.54          | 48.50           | 146.59 | 97.73
0.9                | 0.1                  | 48.14       | 48.59          | 47.43           | 144.16 | 96.11
0.8                | 0.2                  | 47.76       | 48.65          | 48.43           | 144.84 | 96.56
0.7                | 0.3                  | 48.22       | 48.54          | 48.62           | 145.38 | 96.92
0.6                | 0.4                  | 48.90       | 48.56          | 48.36           | 145.82 | 97.21
0.5                | 0.5                  | 49.02       | 48.13          | 48.31           | 145.46 | 96.97
0.4                | 0.6                  | 48.43       | 48.65          | 48.31           | 145.39 | 96.93
0.3                | 0.7                  | 46.58       | 48.68          | 48.19           | 143.45 | 95.63
0.2                | 0.8                  | 48.79       | 48.45          | 48.45           | 145.69 | 97.13
0.1                | 0.9                  | 47.23       | 48.51          | 48.56           | 144.30 | 96.20


Table 4 Highest classification rate achieved under simultaneous GA-tuning

Iris       | Number of Data | Correct Classified | Misclassified | Classification Rate (%)
Setosa     | 50             | 50                 | 0             | 100.00
Virginica  | 50             | 50 or 49           | 0 or 1        | 100.00 or 98.00
Versicolor | 50             | 49 or 50           | 1 or 0        | 98.00 or 100.00
Total      | 150            | 149                | 1             | 99.33

At this point, all the experiments on the proposed architecture have been conducted. The following sections discuss the performance of the proposed architecture, GA-TSKfnn.

4.3. Discussion on Iris data classification

The results show that the proposed architecture, GA-TSKfnn, achieves the highest classification rate of 99.33%, with only 1 Iris Virginica misclassified as Iris Versicolor, or 1 Iris Versicolor misclassified as Iris Virginica. Although the average classification rate of Iris Setosa did not reach 100% (which was expected, since the dimensions of Iris Setosa do not overlap with those of Iris Virginica or Iris Versicolor), most of the combinations had a 100% correct classification rate for Iris Setosa. For the other two classes, a 100% correct classification rate is achieved for Iris Versicolor, while some of the executions give 100% correct classification for Iris Virginica. The proposed architecture is able to classify the Iris data with satisfactory accuracy.

5. Benchmark against other neuro-fuzzy models

This section benchmarks GA-TSKfnn against other neuro-fuzzy classifiers, i.e. S-TSKfnn (Gao, 2002), MS-TSKfnn (Wang, 2003), POPFNN-CRI(s) with FKP (Ang, Quek, & Wahab, 2003), POPFNN-CRI(s) with PFKP (Ang et al., 2003), modified POPFNN-TVR(s) (Ang et al., 2003), NEFCLASS (Nauck & Kruse, 1997), improved LQV (Ang & Quek, 2005), and Fuzzy C-Means (Ang & Quek, 2005). The results are shown in Table 5. The proposed algorithms, whether tuned independently or simultaneously, are superior to the other networks. Although the average classification rate is less than the classification rate of MS-TSKfnn, there are several differences that make GA-TSKfnn superior to the rest. The discussion in the following sections is based on the number of rules generated, the number of iterations, and the classification rate. Note that, due to the nature of Genetic Algorithms, which randomize the initial population, the discussion is based on both the average and the highest classification rate achieved.

Table 5 Benchmark of GA-TSKfnn

Classifiers                  | Tuning of parameters in Layer II and Layer V | Number of Rules | Iterations | Classification Rate (%)
GA-TSKfnn (Average, α=1, β=1)| Simultaneously                               | 3               | 1          | 97.73
GA-TSKfnn (Highest)          | Simultaneously                               | 3               | 1          | 99.33
GA-TSKfnn (Average)          | Independently                                | 3               | 1          | 97.27
GA-TSKfnn (Highest)          | Independently                                | 3               | 1          | 99.33
MS-TSKfnn                    | Not Applicable                               | 5               | 1          | 98.00
S-TSKfnn                     | Not Applicable                               | 4               | 1          | 96.00
POPFNN-CRI(s)-FKP            | Not Applicable                               | 42              | 15         | 80.00
POPFNN-CRI(s)-PFKP           | Not Applicable                               | 47              | 15         | 98.00
Modified POPFNN-TVR(s)       | Not Applicable                               | Not Available   | 15         | 74.60
NEFCLASS                     | Not Applicable                               | Not Available   | 126        | 96.60
Improved LQV                 | Not Applicable                               | Not Available   | 16         | 89.30
Fuzzy C-Means                | Not Applicable                               | Not Available   | 24         | 88.70

5.1. Number of fuzzy rules and tuning of TSK parameter sets

For the Iris data, there are 3 original clusters: Iris Setosa, Iris Versicolor, and Iris Virginica. Hence, theoretically, the number of fuzzy rules should be equal to or greater than 3. Since there is an overlapping space, this classification problem is difficult for traditional classifiers to resolve using only 3 fuzzy rules (Gao, 2002). As can be seen from Table 5, POPFNN-CRI(s)-FKP and POPFNN-CRI(s)-PFKP require a total of 42 to 47 rules to classify the Iris data. Their classification rates are acceptable, but the number of rules used far exceeds the ideal. Although the number of rules is not available in Table 5 for the other four classifiers, the three fuzzy rules used by GA-TSKfnn are treated as the best solution for the derivation of fuzzy rules.

Firstly, GA-TSKfnn requires only one rule for each class. Fig. 4 shows the rules generated by GA-TSKfnn, whereas Figs. 5 and 6 show the rules generated by MS-TSKfnn (Wang, 2003) and the original S-TSKfnn (Gao, 2002) respectively. Compared to the five rules generated by MS-TSKfnn and the four rules generated by S-TSKfnn, it is clear that Genetic Algorithms are able to classify the Iris data with fewer rules. It is argued that in MS-TSKfnn, an additional rule is required to separate Iris Virginica and Iris Versicolor.

IF Sepal Length is Large and Sepal Width is Small and Petal Length is Medium and Petal Width is Medium THEN It is Iris Setosa
IF Sepal Length is Small and Sepal Width is Large and Petal Length is Small and Petal Width is Small THEN It is Iris Virginica
IF Sepal Length is Medium and Sepal Width is Medium and Petal Length is Large and Petal Width is Large THEN It is Iris Versicolor

Fig. 4. Three rules generated by GA-TSKfnn.

IF Sepal Length is Medium and Sepal Width is Small and Petal Length is Medium and Petal Width is Medium THEN It is Iris Setosa
IF Sepal Length is Small and Sepal Width is Very Small and Petal Length is Fair Medium and Petal Width is Fair Medium THEN It is Iris Setosa
IF Sepal Length is Very Small and Sepal Width is Large and Petal Length is Small and Petal Width is Small THEN It is Iris Virginica
IF Sepal Length is Fair Medium and Sepal Width is Fair Medium and Petal Length is Large and Petal Width is Large THEN It is Iris Versicolor
IF Sepal Length is Large and Sepal Width is Medium and Petal Length is Very Large and Petal Width is Large THEN It is Iris Versicolor

Fig. 5. Five rules generated by MS-TSKfnn.

This leads to a higher classification rate in MS-TSKfnn. However, it can be clearly shown that Genetic Algorithms are able to achieve the highest classification rate (99.33%) among all classifiers with only three rules.

Secondly, in both S-TSKfnn and MS-TSKfnn, although the TSK parameter sets in Layer V are tuned using Plane Tuning, the constants $c_{k1}$, $c_{k2}$, $c_{k3}$ and $c_{k4}$ are set to 0 for all K rules. This adopts a zero-order TSK model, in which the constant $c_{k0}$ equals 1 for Iris Setosa, 2 for Iris Virginica, and 3 for Iris Versicolor. In the proposed GA-TSKfnn, all the constant values are tuned using Genetic Algorithms, either together with the parameters in the Condition Layer or independently, hence resulting in a first-order TSK model.
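The distinction can be written out directly from Eq. (24); the zero-order form is the fixed consequent used by S-TSKfnn and MS-TSKfnn, while GA-TSKfnn tunes every coefficient of the first-order form:

$$\text{zero order:} \quad y_k = c_{k0}, \qquad c_{k1} = c_{k2} = \cdots = c_{kD} = 0$$

$$\text{first order:} \quad y_k = c_{k0} + c_{k1} x_1 + \cdots + c_{kD} x_D$$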

5.2. Number of iterations

The comparison of the number of iterations involves only the whole network and excludes the tuning of parameters using the Genetic Algorithms. Similar to S-TSKfnn, GA-TSKfnn uses only one iteration to obtain the highest classification result of 99.33%. This is a significant achievement compared to the other classifiers. Although the traditional classifier POPFNN-CRI(s)-PFKP is able to achieve a 98% classification rate, it uses 15 iterations. This demonstrates the superiority of GA-TSKfnn over POPFNN-CRI(s)-PFKP in terms of iterations.

5.3. Classification rate

GA-TSKfnn achieved an average classification rate of 97.73% and a highest classification rate of 99.33%. This classification rate is the highest among all the classifiers and is highly encouraging. As discussed in Section 4.3, there is an overlapping space between the Iris Versicolor and Iris Virginica data. The total number of overlapping data is 27 from both clusters.

IF Sepal Length is Medium and Sepal Width is Very Small and Petal Length is Medium and Petal Width is Medium THEN It is Iris Setosa
IF Sepal Length is Small and Sepal Width is Small and Petal Length is Fair Medium and Petal Width is Fair Medium THEN It is Iris Setosa
IF Sepal Length is Small and Sepal Width is Large and Petal Length is Small and Petal Width is Small THEN It is Iris Virginica
IF Sepal Length is Large and Sepal Width is Medium and Petal Length is Large and Petal Width is Large THEN It is Iris Versicolor

Fig. 6. Four rules generated by S-TSKfnn.


If we treat these data as misclassified or unclassified, the highest achievable classification rate would be ((150 − 27)/150) × 100% = 82%. The highest classification rate actually achieved, 99.33%, is much higher than 82%.

6. Conclusions

The proposed GA-TSKfnn is able to classify the Iris data with a highest classification rate of 99.33%, misclassifying only one Iris datum. The drawback of Genetic Algorithms is that the initial bits are randomly generated; hence many executions of the algorithm were conducted. Although the average result (97.73%) is slightly lower than the highest classification rate of 98.00% among the other neuro-fuzzy models, the Genetic Algorithms demonstrated the capability of achieving a highest classification rate of 99.33% in certain executions. The experiments also show that tuning the parameters simultaneously yields superior performance. Compared to S-TSKfnn, the method of tuning the parameters simultaneously greatly reduces the number of iterations needed. For the simultaneous tuning method, among all 10 combinations of weights (α and β), the highest average classification rate (97.73%) is achieved when α = β = 1, reaffirming that both sets and rules are equally important for an effective rule-based system. This shows that the proposed GA-TSKfnn is promising for classification problems.

Extensive research effort has been devoted at the Centre for Computational Intelligence, formerly known as the Intelligent Systems Laboratory, of the School of Computer Engineering, Nanyang Technological University, Singapore, to the formulation of novel fuzzy neural architectures (Ang et al., 2003; Ang & Quek, 2000; Ang & Quek; Tung & Quek, 2002; Quek & Zhou, 1996; Quek & Zhou, 1999; Quek & Tung, 2000; Tung & Quek, 2004) that exhibit human-like, logically based reasoning processes on the basis of a formal fuzzy logical foundation. These techniques have been extensively and successfully applied to numerous novel applications such as automated driving (Ang et al., 2001; Pasquier, Quek, & Toh, 2001), biometrics (Quek, Tan, & Sagar, 2001; Quek & Zhou, 2002), medical decision support (Tung, 2004; Tan, 2004), and banking decision support (Ang, 2004; Tung, Quek, & Cheng, 2004).

References

Ang, K. K., & Quek, C. RSPOP: Rough Set based Pseudo Outer-Product fuzzy rule identification algorithm. To appear in Neural Computation.
Ang, K. K., & Quek, C. (2000). Improved MCMAC with momentum, neighborhood, and averaged trapezoidal output. IEEE Transactions on Systems, Man and Cybernetics: Part B, 30(3), 491–500.
Ang, K. K., & Quek, C. (2005). Self-Evolving-Cerebellar (SEC) fuzzy membership identification algorithm. Under preparation.

Ang, K. K., Quek, C., & Wahab, A. (2001). MCMAC-CVT: a novel on-line associative memory based CVT transmission control system. Neural Networks, 15(2), 219–236.
Ang, K. K., Quek, C., & Pasquier, M. (2003). POPFNN-CRI(S): Pseudo Outer Product based Fuzzy Neural Network using the Compositional Rule of Inference and Singleton Fuzzifier. IEEE Transactions on Systems, Man and Cybernetics, Part B, 33(6), 838–849.
Ang, K. K., & Quek, C. (2004). Stock trading using RSPOP: A novel rough set neuro-fuzzy approach. Submitted to IEEE Transactions on Neural Networks.
Carse, B., Fogarty, T. C., & Munro, A. (1996). Evolving fuzzy rule based controllers using genetic algorithms. Fuzzy Sets and Systems, 80, 273–293.
Casillas, J., et al. (2001). Genetic tuning of fuzzy rule-based systems integrating linguistic hedges. IFSA World Congress and 20th NAFIPS International Conference.
Cordon, O., & Herrera, F. (1999). A two-stage evolutionary process for designing TSK fuzzy rule-based systems. IEEE Transactions on Systems, Man and Cybernetics, Part B, 29(6), 703–715.
Cordon, O., Herrera, F., & Villar, P. (2001). Generating the knowledge base of a fuzzy rule-based system by the genetic learning of the data base. IEEE Transactions on Fuzzy Systems, 9(4), 667–674.
Feldman, D. S. (1993). Fuzzy network synthesis with genetic algorithms. Fifth International Conference on Genetic Algorithms. Urbana-Champaign: University of Illinois.
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179–188.
Furuhashi, T., Nakaoka, K., & Uchikawa, Y. (1994). Suppression of excessive fuzziness using multiple fuzzy classifier systems. Third IEEE International Conference on Fuzzy Systems. Orlando, FL.
Gao, S. Y. (2002). S-TSKfnn: A novel self-organizing fuzzy neural network based on the TSK fuzzy rule model. School of Computer Engineering, Intelligent Systems Laboratory. Singapore: Nanyang Technological University.
Herrera, F., Lozano, M., & Verdegay, J. L. (1995). Tuning fuzzy logic controllers by genetic algorithms. International Journal of Approximate Reasoning, 12, 299–315.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.
Homaifar, A., & McCormick, E. (1995). Simultaneous design of membership functions and rule sets for fuzzy controllers using genetic algorithms. IEEE Transactions on Fuzzy Systems, 3, 129–139.
Ishibuchi, H., & Murata, T. (1996). A genetic-algorithm-based fuzzy partition method for pattern classification problems. In F. Herrera, & J. L. Verdegay (Eds.), Studies in fuzziness and soft computing (pp. 555–578). Physica-Verlag.
Ishibuchi, H., Nakashima, T., & Murata, T. (1999). Performance evaluation of fuzzy classifier systems for multidimensional pattern classification problems. IEEE Transactions on Systems, Man and Cybernetics, Part B, 29(5), 601–618.
Ishigami, H., et al. (1995). Structure optimization of fuzzy neural network by genetic algorithm. Fuzzy Sets and Systems, 71, 257–264.
Janikow, C. Z. (1996). A genetic algorithm method for optimizing the fuzzy components of a fuzzy decision tree. In S. K. Pal, & P. P. Wang (Eds.), Genetic algorithms for pattern recognition (pp. 253–281). Boca Raton, FL: CRC Press.
Karr, C. L. (1991). Design of an adaptive fuzzy logic controller using a genetic algorithm. Fourth International Conference on Genetic Algorithms. San Diego: University of California.
Karr, C. L., & Gentry, J. E. (1993). Fuzzy control of pH using genetic algorithms. IEEE Transactions on Fuzzy Systems, 1, 46–53.
Lee, M. A., & Takagi, H. (1993). Integrating design stages of fuzzy systems using genetic algorithms. Second IEEE International Conference on Fuzzy Systems. San Francisco, CA.
Mamdani, E. H. (1974). Application of fuzzy algorithms for control of simple dynamic plant. Proceedings of the Institution of Electrical Engineers, 121, 1585–1588.

Matsushita, S., et al. (1996). Determination of antecedent structure for fuzzy modeling using genetic algorithm. Third IEEE International Conference on Evolutionary Computation. Nagoya: Nagoya University.
Nakaoka, K., Furuhashi, T., & Uchikawa, Y. (1994). A study on apportionment of credits of fuzzy classifier system for knowledge acquisition of large scale systems. Third IEEE International Conference on Fuzzy Systems. Orlando, FL.
Nauck, D., & Kruse, R. (1997). A neuro-fuzzy method to learn fuzzy classification rules from data. Fuzzy Sets and Systems, 89(3), 277–288.
Nomura, H., Hayashi, I., & Wakami, N. (1992). A self-tuning method of fuzzy reasoning by genetic algorithm. International Fuzzy Systems Intelligent Control Conference. Louisville, KY.
Park, D., Kandel, A., & Langholz, G. (1994). Genetic-based new fuzzy reasoning models with application to fuzzy control. IEEE Transactions on Systems, Man, and Cybernetics, 24, 39–47.
Parodi, A., & Bonelli, P. (1993). A new approach to fuzzy classifier systems. Fifth International Conference on Genetic Algorithms. Urbana-Champaign: University of Illinois.
Pasquier, M., Quek, C., & Toh, M. (2001). Fuzzylot: a novel self-organizing fuzzy-neural rule-based pilot system for automated vehicles. Neural Networks, 14(8), 1099–1112.
Quek, C., & Zhou, R. W. (2002). Antiforgery: a novel pseudo-outer product based fuzzy neural network driven signature verification system. Pattern Recognition Letters, 23(14), 1795–1816.
Quek, C., & Tung, W. L. (2000). A novel approach to the derivation of fuzzy membership functions using the Falcon-MART architecture. Pattern Recognition Letters, 22(9), 941–958.
Quek, C., & Zhou, R. W. (1996). POPFNN: A pseudo outer-product based fuzzy neural network. Neural Networks, 9(9), 1569–1581.
Quek, C., & Zhou, R. W. (1999). POPFNN-AARS(S): A Pseudo Outer-Product Based Fuzzy Neural Network. IEEE Transactions on Systems, Man and Cybernetics: Part B, 29(6), 859–870.
Quek, C., Tan, B., & Sagar, V. (2001). POPFNN-based fingerprint verification system. Neural Networks, 14, 305–323.
Shimojima, K., Fukuda, T., & Hasegawa, Y. (1995). Self-tuning fuzzy modeling with adaptive membership function, rules, and hierarchical structure based on genetic algorithm. Fuzzy Sets and Systems, 71, 295–309.


Sun, C. T., & Jang, J. S. (1993). A neuro-fuzzy classifier and its applications. Second IEEE International Conference on Fuzzy Systems, 1, 94–98.
Takagi, T. (1985). Fuzzy identification of systems and its applications to modeling and control. IEEE Transactions on Systems, Man, and Cybernetics, 15(1), 116–132.
Tan, T. Z., Quek, C., & Ng, G. S. (2004). FALCON-AART: A clinical decision support system based on complementary learning approach. Submitted to Neural Computation.
Thrift, P. (1991). Fuzzy logic synthesis with genetic algorithms. Fourth International Conference on Genetic Algorithms. San Diego: University of California.
Tung, W. L., & Quek, C. (2002). PACL-FNNS: A novel class of Falcon-like fuzzy neural networks based on positive and negative exemplars. In C. T. Leondes (Ed.), Intelligent systems: Technology and applications: Fuzzy systems, neural networks and expert systems (pp. 257–320). Boca Raton: CRC Press.
Tung, W. L., & Quek, C. (2002). GenSoFNN: a generic self-organizing fuzzy neural network. IEEE Transactions on Neural Networks, 13(5), 1075–1086.
Tung, W. L., & Quek, C. (2004). Falcon: neural fuzzy control and decision systems using FKP and PFKP clustering algorithms. IEEE Transactions on Systems, Man and Cybernetics, 34(1), 686–694.
Tung, W. L., & Quek, C. (2004). GenSo-FDSS: A neural-fuzzy decision support system for pediatric ALL cancer subtype identification using gene expression data. To appear in Artificial Intelligence in Medicine.
Tung, W. L., Quek, C., & Cheng, P. (2004). GenSo-EWS: a novel neural-fuzzy based early warning system for predicting bank failures. Neural Networks, 17, 567–587.
Valenzuela-Rendon, M. (1991). The fuzzy classifier system: A classifier system for continuously varying variables. Fourth International Conference on Genetic Algorithms. San Diego: University of California.
Wang, D. (2003). MS-TSKfnn: Modified self-organizing fuzzy neural network based on TSK fuzzy rule model. School of Computer Engineering, Intelligent Systems Laboratory. Singapore: Nanyang Technological University.