Modelling fuzzy production rules with fuzzy expert networks

Expert Systems With Applications, Vol. 13, No. 3, pp. 169-178, 1997. © 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0957-4174/97 $17.00+0.00

Pergamon PII: S0957-4174(97)00029-8

Modelling Fuzzy Production Rules with Fuzzy Expert Networks

E. C. C. TSANG AND D. S. YEUNG

Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

Abstract--The strength of a fuzzy expert system comes from its ability to handle the imprecise, uncertain and vague information used by human experts, while the power of neural networks lies in their learning, generalization and fault-tolerance capabilities. There have been many attempts to model and formulate fuzzy production rules (FPRs) using neural networks, so that a hybrid system having the advantages of both can be developed. The modelling or formulating process, however, is not an easy task. Several problems must be resolved before such a hybrid system can achieve its goal: how to model FPRs using a neural network, what modifications the learning algorithm of the neural network needs if it is to have the same inference mechanism as a fuzzy expert system, and where such a hybrid system could be applied. In this paper, the network structure and the forward and backward processes of a fuzzy expert network (FEN) used to solve these problems are presented. This FEN was proposed in Tsang and Yeung (1996, World Congress on Neural Networks, pp. 500-503), but its details in terms of network structure and forward and backward reasoning mechanisms were not covered there. This paper therefore introduces how an FEN can formulate and model FPRs. One of its applications is to help knowledge engineers fine-tune knowledge representation parameters such as the certainty factor of a rule, the threshold value of a proposition and the membership values of a fuzzy set. An experiment is also performed to demonstrate the tuning capability of this FEN. © 1997 Elsevier Science Ltd. All rights reserved

1. INTRODUCTION

Multilayer perceptron neural networks have been used to solve many domain-specific problems such as medical diagnosis, pattern recognition, character recognition and image processing, and have been applied in various areas: industry, business and science (Widrow et al., 1994). Fuzzy expert systems (FES), on the other hand, have been investigated since the mid 1970s, but actual implementations began emerging only in the 1980s (Scharwartz et al., 1994). They have been successfully applied to many problems such as control (Lee, 1990; Sugeno, 1985), fuzzy decision support systems (Chen, 1988, 1994), damage assessment (Ogawa et al., 1985), medical diagnosis (Adlassnig & Kolarz, 1982), information retrieval (Tong, 1986) and financial applications (Whalen, 1986). There has been a rapidly growing body of research on intelligent hybrid systems composed of expert systems and neural networks (Fu & Fu, 1990; Gallant, 1988; Hall & Romaniuk, 1990; Handelman et al., 1989; Murphy, 1990; Touretzky & Hinton, 1988; Towell et al., 1990). The two technologies offer complementary strengths. While neural networks offer capabilities such as automatic rule extraction and refinement in a noisy environment, expert systems provide features such as programmability, conceptual capabilities and structured schemes of knowledge representation. Hollatz (1992) showed how symbolic (expert system) and sub-symbolic (neural network) knowledge can be combined. Sestito and Dillon (1993) present a method to extract rules using neural networks. FPRs have been modelled and formulated by fuzzy Petri nets (Chen et al., 1990; Garg et al., 1991; Yeung & Tsang, 1992) for reasoning and consistency checking, and by neural networks for adjusting parameters in triangular fuzzy weights (Ishibuchi et al., 1995), learning membership grades (Yager, 1994), and approximating continuous fuzzy functions (Buckley & Hayshi, 1994). In this paper we present the network structure, forward chaining and backward error-correcting method of our FEN, which extends the work done in Tsang and Yeung (1996). This FEN extends the expert network proposed by Lacher et al. (1992) so that it can statically represent FPRs which handle


vague, incomplete, uncertain and fuzzy knowledge of domain experts. The node functionality of this FEN is equivalent to the reasoning method used in a FES, and the modified learning algorithm derived for this FEN is based on the inference mechanism of a FES. One of the applications of our proposed FENs is to help knowledge engineers fine-tune the knowledge representation parameters in FPRs and the membership values of fuzzy sets. Specifically, the parameters to be tuned are: the certainty factor of a FPR, the threshold value of each proposition in the antecedent part of a FPR, and the membership values of a fuzzy set. An experiment is provided to demonstrate the capability of the proposed network to fine-tune a fuzzy knowledge base. This paper is organized as follows. Section 2 briefly introduces the expert network proposed by Lacher et al. (1992), while Section 3 presents a general FPR together with a fuzzy reasoning method used in a FES. Section 4 defines a mapping of this FPR into a FEN and identifies the different functionalities, nodes and weights of the FEN. Section 5 presents a modified backpropagation which accommodates learning for the FEN, while Section 6 compares our FEN with the one proposed by Lacher et al. (1992) and the one proposed by Yager (1994). Section 7 reports an experiment showing one application of the proposed network. The last two sections discuss some possible future research problems and give concluding remarks.

2. AN EXPERT NETWORK

According to Lacher et al. (1992), an expert network is an acyclic neural network with high-level node functionality. The functionality of this network comes from the inference engine of the expert system of interest (Hruska et al., 1991). Instead of simple summation functions, the rules employed for combining evidence in the expert system are used. No layer structure is identifiable in this network, and the network is not fully connected. The nodes in the network are categorized as (i) nodes with only out-connections (input nodes), (ii) nodes with only in-connections (output nodes), and (iii) nodes with both in- and out-connections (interior nodes). Furthermore, according to node functionality, nodes are divided into regular, operation, negation and conjunction nodes. Connection weights of the directed arcs are divided into hard weights, whose values do not change, and soft weights, whose values may change during learning. The proposed expert network has been used to tune the certainty factors of production rules used in EMYCIN.

3. FUZZY PRODUCTION RULE AND REASONING METHOD

According to Buchanan (Buchanan & Duda, 1982) and Yager (1984), propositional statements are the fundamental building blocks of a rule-based system. A propositional statement is usually represented in the form: The (attribute) of (an object) is (value), e.g. 'The temperature of this room is high.' One can combine and replace 'the (attribute) of (an object)' by a concept called a variable. Thus the above example can be expressed in the canonical form: V is A, where V is a variable which stands for 'the (attribute) of (an object)', and A is the value of the variable V. A is not confined to a fuzzy value. Unlike conventional production rules, a FPR can have fuzzy terms like 'tall' or 'high' in the antecedent and the consequent part. Furthermore, a threshold value can be assigned to each proposition in the antecedent, a certainty factor can be assigned to each observed fact, and a certainty factor can also be assigned to the entire rule. In what follows, a generic FPR will be considered. A conjunctive FPR has the form:

R: IF V1 is A1 AND V2 is A2 AND ... AND Vn is An THEN U is B (CF_R, λ_A1, λ_A2, ..., λ_An)
Fact 1: V1 is A'1 (CF_F1)
Fact 2: V2 is A'2 (CF_F2)
...
Fact n: Vn is A'n (CF_Fn)
Conclusion: U is B' (CF_B')

where V1, V2, ..., Vn and U are variables; A1, A2, ..., An and B are the fuzzy values of the variables V1, V2, ..., Vn and U respectively; CF_R, CF_F1, CF_F2, ..., CF_Fn and CF_B' are the certainty factors of the rule, the observed facts and the deduced consequent respectively; and λ_Ai is the threshold value for the similarity measure between Ai and the observed fact A'i. When n = 1 this FPR reduces to a simple FPR. It is also possible to form a disjunctive FPR whose propositions in the antecedent are connected by 'OR'. In order to draw a reasonable conclusion, the similarity-based fuzzy reasoning method proposed in Yeung and Tsang (1992) is used in this FES. A detailed comparison of similarity-based fuzzy reasoning methods can be found in Yeung and Tsang (1997). The degree of subsethood (DS) between two fuzzy sets proposed by Kosko (1992) is used to compare the similarity between two fuzzy sets. Recall that the degree of subsethood is defined as:

S_DS(A'i, Ai) = M(A'i ∩ Ai) / M(A'i),    (1)

where M(A'i) is the sigma-count of A'i, i.e. the size or cardinality of A'i (Zadeh, 1975), and M(A'i ∩ Ai) is the sigma-count of the intersection of A'i and Ai. Note that S_DS(A'i, Ai), the degree of subsethood of A'i in Ai, differs from S_DS(Ai, A'i), the degree of subsethood of Ai in A'i, and that 0 ≤ S_DS ≤ 1.

Yeung et al.'s DS method adopts the 'more-or-less form' proposed in Turksen and Zhong (1988), and the deduced consequent B' is given by:

If S_DS(A'i, Ai) ≥ λ_Ai, then B' = min[1, B/S_O],    (2)

where, for a conjunctive FPR, the overall similarity S_O is the average of the individual degrees of subsethood:

S_O = (1/n) Σ_{i=1}^{n} S_DS(A'i, Ai).

The certainty factor of the drawn consequent B' is given by:

CF_B' = CF_R × min[CF_F1, CF_F2, ..., CF_Fn].

4. FUZZY EXPERT NETWORK

In this section we present the structure of our proposed FEN, the mapping of the generic FPRs into this network, and its forward propagation process. The mathematical derivation of its backward error-correction method is presented in Section 4.4, and a modified backpropagation based on the forward and backward processes is given in Section 5.
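As a concrete illustration, the similarity-based reasoning method of Section 3 can be sketched in Python. All fuzzy sets and certainty factors below are illustrative values, not taken from the paper; fuzzy sets are represented as vectors of membership values over a sampled domain.

```python
# A sketch of the similarity-based reasoning method for a conjunctive FPR,
# using the degree of subsethood (eq. 1) and the more-or-less form (eq. 2).

def degree_of_subsethood(a_obs, a):
    # S_DS(A', A) = M(A' intersect A) / M(A'), sigma-counts over the samples
    inter = sum(min(x, y) for x, y in zip(a_obs, a))
    return inter / sum(a_obs)

def conjunctive_fpr(facts, antecedents, b, cf_r, cf_facts, thresholds):
    s = [degree_of_subsethood(a_obs, a) for a_obs, a in zip(facts, antecedents)]
    if any(s_i < lam for s_i, lam in zip(s, thresholds)):
        return None, 0.0                          # rule not fired
    s_o = sum(s) / len(s)                         # overall similarity S_O
    b_prime = [min(1.0, mu / s_o) for mu in b]    # B' = min[1, B/S_O]
    cf_b = cf_r * min(cf_facts)                   # CF_B' = CF_R * min(CF_Fi)
    return b_prime, cf_b

b_prime, cf_b = conjunctive_fpr(
    facts=[[0.3, 0.7, 0.9], [0.9, 0.8, 0.7]],
    antecedents=[[0.1, 0.5, 0.9], [0.95, 0.7, 0.5]],
    b=[0.1, 0.4, 0.6],
    cf_r=0.8, cf_facts=[0.8, 0.9], thresholds=[0.0, 0.0])
assert abs(cf_b - 0.64) < 1e-9  # 0.8 * min(0.8, 0.9)
```

Note that CF_B' depends only on the rule and fact certainty factors, while B' depends on how closely the observed facts resemble the antecedent fuzzy sets.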

4.1. Mapping Fuzzy Production Rule into FEN

The mapping of a conjunctive FPR presented in Section 3 with two propositions in the antecedent part is shown in Fig. 1. From this figure a number of different nodes and weights can be identified and categorized. Note that there is a connection with value 1 from node t to node w in Fig. 1. This connection provides a signal to its subsequent node (node w in Fig. 1) indicating whether the certainty factor of the given facts is allowed to propagate to the next layer. If the signal is zero, no certainty factor of the given facts will be propagated.

[Figure: layer I input nodes carry the membership values μ_A1(x_1), ..., μ_A1(x_n) and μ_A2(y_1), ..., μ_A2(y_m) of the given facts; layer K nodes carry the fact certainty factors CF_F1 and CF_F2; layer M output nodes carry μ_B'(z_1), ..., μ_B'(z_k) and CF_B'; CF_R appears as a tunable weight.]

FIGURE 1. An expert network representing a conjunctive FPR.


4.2. Functionalities of Nodes and Weights

In what follows, nodes and weights are identified and categorized according to their different functionalities.

4.2.1. Input Nodes.
(1) Membership value node: a node whose input is a membership value of a given fact. For example, Tall = 0.2/5'5" + 0.4/5'7" + 0.6/5'9" + 0.8/5'11" + 0.9/6' + 1/6'2" is represented by six nodes, each representing one membership value, e.g. layer I nodes 1 to n in Fig. 1.
(2) Threshold value node: a node whose input value is 1, e.g. layer I node 0 in Fig. 1.
(3) Certainty factor of a given fact node: a node whose input value is the certainty factor of a given fact, e.g. layer K nodes u and v in Fig. 1.

4.2.2. Hidden Nodes.
(1) Degree of similarity node: a node which computes the degree of similarity between the fuzzy set of a proposition and its given fact, e.g. layer J nodes p and r in Fig. 1.
(2) Conjunctive node: a node which computes the overall similarity of the propositions in the antecedent, e.g. layer K node t in Fig. 1.
(3) Minimum certainty factor node: a node which outputs the minimum of the input certainty factors of the given facts if a certain condition is fulfilled, e.g. layer L node w in Fig. 1.

4.2.3. Output Nodes.
(1) Membership value node: a node whose output value is a membership value of the deduced consequent, e.g. layer L nodes 1, 2, ..., k in Fig. 1.
(2) Simple product node: a node whose output value is the product of its incoming weight and input value, e.g. layer M node z in Fig. 1.

4.2.4. Weights.
(1) Hard weight: a connection weight whose value does not change during learning and is always equal to 1, e.g. weight w_pt in Fig. 1.
(2) Antecedent membership value weight: a tunable weight representing a membership value of a fuzzy set in the antecedent, with value in the interval [0,1], e.g. weights μ_A(x_1), μ_A(x_2), ..., μ_A(x_n) in Fig. 1.
(3) Consequent membership value weight: a tunable weight representing a membership value of a fuzzy set in the consequent, with value in the interval [0,1], e.g. weights μ_B(z_1), μ_B(z_2), ..., μ_B(z_k) in Fig. 1.
(4) Certainty factor weight: a tunable weight which represents the certainty factor of a rule, e.g. weight CF_R in Fig. 1.

4.3. Feed Forward Propagation

Since different types of rule have different methods of computing the fuzzy value and the certainty factor of the consequent, only the conjunctive FPR is presented below. For the feed forward propagation presented in this section, please refer to Fig. 1. Net_p stands for the combining function of node p, while Out_p stands for the output of node p.

4.3.1. A Conjunctive FPR.

Net_p = S_DS(A'1, A1) − λ_A1
Out_p = S_DS(A'1, A1) if Net_p ≥ 0; 0 otherwise

Net_r = S_DS(A'2, A2) − λ_A2
Out_r = S_DS(A'2, A2) if Net_r ≥ 0; 0 otherwise

Net_t = (Out_p + Out_r)/2 if Out_p > 0 and Out_r > 0; 0 otherwise
Out_t = Net_t

Net_q = min[1, μ_B(z_q)/Out_t] if Out_t > 0, for q = 1, 2, ..., k; 0 otherwise
Out_q = Net_q
μ_B'(z_q) = Out_q for q = 1, 2, ..., k

Net_w = min[CF_F1, CF_F2] if Out_t > 0; 0 otherwise
Out_w = Net_w
Net_z = Out_w × CF_R
Out_z = Net_z
CF_B' = Out_z    (3)
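The node equations above can be sketched as a single forward pass. The code below mirrors the Net/Out decomposition for the two-proposition network of Fig. 1; the input vectors, thresholds and certainty factors are illustrative, not the paper's experimental data.

```python
# A node-level sketch of the FEN feed forward propagation (Section 4.3.1)
# for a conjunctive FPR with two propositions in the antecedent.

def s_ds(a_obs, a):
    # degree of subsethood, eq. (1)
    return sum(min(x, y) for x, y in zip(a_obs, a)) / sum(a_obs)

def fen_forward(a1_obs, a1, a2_obs, a2, b, lam1, lam2, cf_f1, cf_f2, cf_r):
    # degree-of-similarity nodes p and r, gated by thresholds lam1, lam2
    net_p = s_ds(a1_obs, a1) - lam1
    out_p = s_ds(a1_obs, a1) if net_p >= 0 else 0.0
    net_r = s_ds(a2_obs, a2) - lam2
    out_r = s_ds(a2_obs, a2) if net_r >= 0 else 0.0

    # conjunctive node t: average of the fired similarities
    out_t = (out_p + out_r) / 2 if out_p > 0 and out_r > 0 else 0.0

    # output membership nodes q = 1..k
    mu_b_prime = [min(1.0, mu / out_t) if out_t > 0 else 0.0 for mu in b]

    # minimum-CF node w and simple product node z, eq. (3)
    out_w = min(cf_f1, cf_f2) if out_t > 0 else 0.0
    cf_b_prime = out_w * cf_r
    return mu_b_prime, cf_b_prime

mu_b_prime, cf_b_prime = fen_forward(
    [0.3, 0.7, 0.9], [0.1, 0.5, 0.9],
    [0.9, 0.8, 0.7], [0.95, 0.7, 0.5],
    b=[0.1, 0.4, 0.6], lam1=0.5, lam2=0.5,
    cf_f1=0.8, cf_f2=0.9, cf_r=0.8)
assert abs(cf_b_prime - 0.64) < 1e-9
```

When either similarity falls below its threshold, node t outputs zero and both the deduced memberships and the certainty factor are suppressed, which is exactly the gating role of the t-to-w connection described in Section 4.1.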

4.4. Error Convergence Derivation

The following notation is adopted: E, Tar_i and Out_i represent an error function, a target output at node i and an actual output at node i for a pattern p respectively, while η represents a learning rate. Only the conjunctive FPR is treated; other types of rule can be derived by a similar approach.

4.4.1. A Conjunctive FPR. For this type of rule, please refer to Fig. 1 and equation (3) for the convergence derivation presented below.

For the certainty factor CF_R of the rule, the error at the output node z is

E = (1/2)(Tar_z − Out_z)².

Since Net_z = Out_w × CF_R and Out_w = min[CF_F1, CF_F2] when Out_t > 0,

∂E/∂CF_R = (∂E/∂Out_z)(∂Out_z/∂Net_z)(∂Net_z/∂CF_R)
         = −(Tar_z − Out_z) × CF_F1 if CF_F1 ≤ CF_F2 and Out_t > 0
         = −(Tar_z − Out_z) × CF_F2 if CF_F1 > CF_F2 and Out_t > 0
         = 0 otherwise.

For the error E to converge, it is necessary that ΔCF_R ∝ −∂E/∂CF_R, i.e.

ΔCF_R = η(Tar_z − Out_z) × CF_F1 if CF_F1 ≤ CF_F2 and Out_t > 0
      = η(Tar_z − Out_z) × CF_F2 if CF_F1 > CF_F2 and Out_t > 0
      = 0 otherwise.

For the consequent membership values μ_B(z_q), q = 1, 2, ..., k, the error is

E = (1/2) Σ_{q=1}^{k} (Tar_q − Out_q)².

Since Net_q = min[1, μ_B(z_q)/Out_t],

∂Net_q/∂μ_B(z_q) = 1/Out_t if Out_t > μ_B(z_q); 0 otherwise,

so that

∂E/∂μ_B(z_q) = (∂E/∂Out_q)(∂Out_q/∂Net_q)(∂Net_q/∂μ_B(z_q))
             = −(Tar_q − Out_q) × (1/Out_t) if Out_t > μ_B(z_q)
             = 0 otherwise.

For the error E to converge, it is necessary that Δμ_B(z_q) ∝ −∂E/∂μ_B(z_q), i.e.

Δμ_B(z_q) = η(Tar_q − Out_q) × (1/Out_t) if Out_t > μ_B(z_q); 0 otherwise.    (4)

For the antecedent membership values μ_A1(x_j), j = 1, 2, ..., n, and μ_A2(y_i), i = 1, 2, ..., m,

∂E/∂μ_A1(x_j) = (∂E/∂Net_p)(∂Net_p/∂μ_A1(x_j)) for j = 1, 2, ..., n.

Since Net_p = S_DS(A'1, A1) − λ_A1 and S_DS(A'1, A1) = Σ_j min[μ_A'1(x_j), μ_A1(x_j)] / Σ_j μ_A'1(x_j),

∂Net_p/∂μ_A1(x_j) = 1 / Σ_{j=1}^{n} μ_A'1(x_j) if μ_A1(x_j) < μ_A'1(x_j); 0 otherwise.

Chaining through node t, where ∂Net_t/∂Out_p = 1/2, and through the output nodes, where ∂Net_q/∂Out_t = −μ_B(z_q)/Out_t² when Out_t > μ_B(z_q),

∂E/∂Net_p = Σ_{q=1}^{k} (Tar_q − Out_q) × (μ_B(z_q)/Out_t²) × (1/2).

For the error E to converge, it is necessary that Δμ_A1(x_j) ∝ −∂E/∂μ_A1(x_j), i.e.

Δμ_A1(x_j) = −η Σ_{q=1}^{k} (Tar_q − Out_q) × (μ_B(z_q)/Out_t²) × (1/2) × (1/Σ_{j=1}^{n} μ_A'1(x_j)) if μ_A1(x_j) < μ_A'1(x_j) and Out_t > μ_B(z_q); 0 otherwise.    (5)

Similarly, for error convergence it is necessary that Δμ_A2(y_i) ∝ −∂E/∂μ_A2(y_i):

Δμ_A2(y_i) = −η Σ_{q=1}^{k} (Tar_q − Out_q) × (μ_B(z_q)/Out_t²) × (1/2) × (1/Σ_{i=1}^{m} μ_A'2(y_i)) if μ_A2(y_i) < μ_A'2(y_i) and Out_t > μ_B(z_q); 0 otherwise.
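The gradient for the consequent membership weights can be checked numerically. The sketch below compares the analytic derivative behind update (4) with a central finite difference of E; the node values are illustrative and chosen away from the kink of the min operator.

```python
# Numerical check of the analytic gradient dE/dmu_B(z_q) used in update (4).
import numpy as np

def forward_B(mu_B, out_t):
    # output membership nodes: Out_q = min(1, mu_B(z_q)/Out_t)
    return np.minimum(1.0, mu_B / out_t)

def loss(mu_B, out_t, target):
    return 0.5 * np.sum((target - forward_B(mu_B, out_t)) ** 2)

mu_B = np.array([0.1, 0.4, 0.6])   # illustrative consequent weights
out_t = 0.75                        # illustrative overall similarity, > mu_B
target = np.array([0.2, 0.5, 0.7])

# analytic: dE/dmu_B(z_q) = -(Tar_q - Out_q)/Out_t when Out_t > mu_B(z_q)
out = forward_B(mu_B, out_t)
analytic = np.where(out_t > mu_B, -(target - out) / out_t, 0.0)

# central finite differences of E with respect to each mu_B(z_q)
eps = 1e-6
numeric = np.array([
    (loss(mu_B + eps * np.eye(3)[q], out_t, target)
     - loss(mu_B - eps * np.eye(3)[q], out_t, target)) / (2 * eps)
    for q in range(3)
])
assert np.allclose(analytic, numeric, atol=1e-5)
```

The same check can be repeated for ΔCF_R and, with a full forward pass, for the antecedent updates (5).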

5. AN ENHANCED BACKPROPAGATION

The backpropagation presented in this section is based on the feed forward propagation method of Section 4.3 and the backward weight-change method of Section 4.4.

(1) Choose a small positive learning rate η, and assign initial weights to all soft weights, which include the antecedent membership value weights, the consequent membership value weights, and the certainty factor weight.
(2) Set all hard weights and threshold values to 1.
(3) Repeat until the algorithm converges, i.e. the squared error E is less than a preset epsilon ε.
(3.1) Take the next training example TE, a vector comprising the membership values and certainty factor(s) of the given facts, and the corresponding correct output CO, a vector of the membership values and certainty factor of the goal.
(3.2) Forward propagation step: use the feed forward method presented in Section 4.3.
(3.2.1) If a similarity measure S_DS(A'i, Ai) of the rule is smaller than the threshold value λ_Ai, assign S_DS(A'i, Ai) to the threshold value and repeat step 3.1.
(3.3) Backward propagation step: starting from the outputs, identify each weight visited.
(3.3.1) If it is a hard weight, skip it.
(3.3.2) If it is an antecedent membership value weight:
(3.3.2.1) Compute Δμ_Ai(x_j).
(3.3.2.2) Update the weight by μ_Ai*(x_j) = μ_Ai(x_j) + Δμ_Ai(x_j); if the updated weight is out of the range [0,1], adjust it by a shrinking factor.
(3.3.3) If it is a consequent membership value weight:
(3.3.3.1) Compute Δμ_B(z_q).
(3.3.3.2) Update the weight by μ_B*(z_q) = μ_B(z_q) + Δμ_B(z_q); if the updated weight is out of the range [0,1], adjust it by a shrinking factor.
(3.3.4) If it is a certainty factor weight:
(3.3.4.1) Compute ΔCF_R.
(3.3.4.2) Update the weight by CF_R* = CF_R + ΔCF_R; if the updated weight is out of the range [0,1], adjust it by a shrinking factor.

Note that μ_Ai*(x_j), μ_B*(z_q) and CF_R* are the new values of μ_Ai(x_j), μ_B(z_q) and CF_R at time (t+1) respectively. The new weights updated by the steps above are subject to the constraint that they must lie within the interval [0,1]; to guarantee this, the shrinking-and-retest process proposed in Lacher et al. (1992) is used. The error convergence derivation of this algorithm has been presented in Section 4.4. Only the conjunctive FPR is shown there; the other types of FPR can be derived similarly. Furthermore, the threshold value λ_Ai is tuned by first assigning it the value 1. In each forward iteration, if the similarity measure S_DS(A'i, Ai) is smaller than the threshold value λ_Ai, the threshold value is decreased. After learning has been completed, the threshold reaches a value that allows all the rules (training samples) to be fired.

6. DIFFERENCES BETWEEN OTHERS' APPROACHES AND OUR APPROACH

Although we have extended the expert networks proposed by Lacher et al. (1992) to acquire, refine and tune parameters and membership values in FPRs, there are many differences between our approach and Lacher et al.'s approach. Furthermore, Yager has proposed a neural
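A compact sketch of the training loop for a single conjunctive FPR is given below. To stay short it tunes only the consequent memberships μ_B(z_q) via update (4) and CF_R via ΔCF_R, elides the antecedent update (5), and replaces the shrinking-and-retest step with simple clipping; the training data are invented for illustration.

```python
# Illustrative sketch of the enhanced backpropagation of Section 5,
# restricted to the consequent memberships mu_B(z_q) and CF_R.

def clip01(w):
    return min(1.0, max(0.0, w))  # stand-in for the shrink-and-retest step

def train(samples, mu_b, cf_r, eta=0.5, epochs=200):
    for _ in range(epochs):
        for out_t, cf_f, tar_mu, tar_cf in samples:
            # forward pass for the output nodes (Section 4.3.1)
            out_q = [min(1.0, w / out_t) for w in mu_b]
            out_z = min(cf_f) * cf_r
            # update (4): delta mu_B(z_q) = eta*(Tar_q - Out_q)/Out_t
            # when Out_t > mu_B(z_q)
            mu_b = [clip01(w + eta * (t - o) / out_t) if out_t > w else w
                    for w, o, t in zip(mu_b, out_q, tar_mu)]
            # delta CF_R = eta*(Tar_z - Out_z)*min(CF_F1, CF_F2)
            cf_r = clip01(cf_r + eta * (tar_cf - out_z) * min(cf_f))
    return mu_b, cf_r

# one training pattern: overall similarity Out_t, the facts' certainty
# factors, target memberships of B' and target CF_B' (all invented)
samples = [(0.8, (0.8, 0.9), [0.1, 0.5, 0.7], 0.64)]
mu_b, cf_r = train(samples, mu_b=[0.3, 0.3, 0.3], cf_r=0.2)
assert abs(cf_r - 0.8) < 1e-3  # fixed point: 0.64 = min(0.8, 0.9) * CF_R
```

At convergence, each μ_B(z_q) settles at Out_t × Tar_q, so the deduced memberships min[1, μ_B(z_q)/Out_t] reproduce the targets, which is the behaviour the experiment in Section 7 demonstrates on the full network.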

network approach to model and formulate FPRs (Yager, 1994) which, to some extent, is similar to our approach. The following compares the three approaches in terms of network architecture, connection weights, type of production rule, acquisition and refinement of knowledge representation parameters, and learning algorithm.

6.1. Network Architecture

6.1.1. Lacher et al.'s Approach. A node of a different shape in an expert network represents a different connective, such as 'NOT' or 'AND', used in a production rule. An expert network is not partitioned into different layers. Four types of nodes are identified: regular node, operation node, conjunctive node and negation node.

6.1.2. Yager's Approach. A node with a different activation function represents a different propositional statement in a fuzzy production rule. The neural network model is partitioned into different layers. Three types of nodes are identified: antecedent matching node, rule firing node and rules aggregation node.

6.1.3. Our Approach. A set of nodes is used to represent a propositional statement in a fuzzy production rule. An FEN is partitioned into different layers. Three types of input nodes, four types of hidden nodes and three types of output nodes are identified.

6.2. Connection Weights

6.2.1. Lacher et al.'s Approach. Two types of weights, soft and hard, are identified. The soft weight only represents the certainty factor of a production rule used in EMYCIN, while the hard weight is set to 1 and does not change during training. The soft weight is allowed to take a value in the interval [−1,1].

6.2.2. Yager's Approach. Only soft weights are identified, representing a membership value in the antecedent, the degree of importance of each antecedent, or a consequent. The soft weight lies in the interval [0,1].

6.2.3. Our Approach. Two types of weights, soft and hard, are also identified. The soft weight can represent a membership value of a fuzzy set, a certainty factor of a rule, or a threshold value of an antecedent of a conjunctive FPR. The hard weight is fixed and does not change during training. The soft weight is allowed to take a value in the interval [0,1].

6.3. Type of Production Rule

6.3.1. Lacher et al.'s Approach. The types of production rules used in EMYCIN are considered. The negation of a propositional statement is considered.

6.3.2. Yager's Approach. The FPRs used in fuzzy controllers, i.e. parallel rules, are considered. Negation of the consequent is considered.

6.3.3. Our Approach. The three types of FPR mentioned in Section 3 are considered. Negation of a fuzzy production rule is not considered.

6.4. Acquisition and Refinement of Knowledge Representation Parameters

6.4.1. Lacher et al.'s Approach. Only the certainty factors of production rules are tuned or acquired.

6.4.2. Yager's Approach. Parameters such as the membership grades of a linguistic variable and the degree of importance of each antecedent can be tuned.

6.4.3. Our Approach. Many parameters can be tuned, including the threshold value of an antecedent, the certainty factor of a rule and the membership values of a fuzzy set.

6.5. Learning Algorithm

6.5.1. Lacher et al.'s Approach. An enhanced backpropagation called Expert System Back Prop (ESBP) is derived from the inference mechanism used in EMYCIN. Various Monte Carlo methods have been used in conjunction with ESBP to overcome the problem of getting stuck in local minima.

6.5.2. Yager's Approach. Traditional backpropagation is used, with two different activation functions.


6.5.3. Our Approach. An enhanced backpropagation based on a similarity-based fuzzy reasoning method and the degree of subsethood is used. The threshold value is tuned by first setting it to 1 and then decreasing it to a value which allows all the training samples to be fired.

7. AN EXPERIMENT

In this section an experiment is used to demonstrate that our proposed FEN can fine-tune knowledge representation parameters such as the threshold value, the certainty factor of a rule and the membership values of fuzzy sets in a single-level reasoning system such as a fuzzy controller. The capability of fine-tuning these parameters and membership values in a multi-level reasoning system such as a FES can be found in Tsang and Yeung (1996).

7.1. A Conjunctive FPR

e.g. IF x is middle-aged AND x is bald THEN x is rich. The fuzzy sets middle-aged, bald and rich are represented as:

Middle-aged = 0.1/27 years old + 0.5/31 years old + 0.9/35 years old + 0.6/39 years old + 0.3/43 years old,
Bald = 0.95/3000 hairs + 0.7/4000 hairs + 0.5/5000 hairs + 0.3/6000 hairs + 0.1/7000 hairs,
Rich = 0.1/1 Mil + 0.4/5 Mil + 0.6/10 Mil + 0.8/15 Mil + 0.95/20 Mil, where Mil = million.

The FEN for this type of rule is the one shown in Fig. 1 with n = 5, m = 5 and k = 5. The inputs (given facts) to each of the propositions in the antecedent are sampled over their domains of discourse, creating vectors of membership values as shown in Tables 1 and 2. Five sets of training data are given in Table 3. In this test a set of initial weights is assigned to the FEN of Fig. 1. The initial weights for propositions A1 and A2 are quite similar to the target weights, while the initial weights for the consequent B and the certainty factor of the rule are randomly assigned. The network is expected to refine the membership values Ai of the antecedent, the threshold values λ_Ai, the membership values B of the consequent, and the certainty factor CF_R of the rule. The initial weights (IW), target weights (TW) and actual weights (AW) are shown in Table 4. The results of the deduced consequent B' using the initial weights before training and the actual weights after training, for the same training sets, are shown in Table 5.

7.2. Analysis of Results

In this test, the initially assigned weights of the membership values have a similar shape, i.e. bell shape. The actual (final) weights shown in Table 4 are promising for all the weights, and the results in Table 5 show that the expected and actual results are nearly identical. From this test it is observed that, to get a better result for the weights far away from the output nodes, a set of representative training samples and initial weights should be used. For weights close to the output nodes, i.e. the membership values of the fuzzy set B, the initial weights do not affect the final weights. The certainty factor CF_R of the rule and the

TABLE 1
Fuzzy Sets of Antecedent A1' for the Conjunctive FPR

                      Membership values (A1')
Label                 27 years old  31 years old  35 years old  39 years old  43 years old
Middle-aged           0.1           0.5           0.9           0.6           0.3
Quite middle-aged     0.3162        0.7071        0.9487        0.7746        0.5477
Very middle-aged      0.01          0.25          0.81          0.36          0.09
Nearly middle-aged    0.3           0.6           0.95          0.5           0.2
Quite old             0.1           0.4           0.6           0.8           0.9

TABLE 2
Fuzzy Sets of Antecedent A2' for the Conjunctive FPR

              Membership values (A2')
Label         3000 hairs  4000 hairs  5000 hairs  6000 hairs  7000 hairs
Bald          0.95        0.7         0.5         0.3         0.1
Quite bald    0.9746      0.8367      0.7071      0.5477      0.3162
Very bald     0.9025      0.49        0.25        0.09        0.01
Nearly bald   0.8         0.7         0.5         0.4         0.2

TABLE 3
Five Sets of Training Data for the Conjunctive FPR

Input: λ_Ai = 1 for both propositions in every set; membership values (A1' upper row, A2' lower row); certainty factors CF_F1 (upper) and CF_F2 (lower). Output: membership values (B') and CF_B'.

Set        Membership values (Ai')                  CF_Fi  Membership values (B')               CF_B'
1st  A1':  0.3162 0.7071 0.9487 0.7746 0.5477      0.8    0.1345 0.5378 0.8067 1 1             0.64
     A2':  0.9746 0.8367 0.7071 0.5477 0.3162      0.9
2nd  A1':  0.01 0.25 0.81 0.36 0.09                0.7    0.1 0.4 0.6 0.8 0.95                 0.56
     A2':  0.9025 0.49 0.25 0.09 0.01              0.8
3rd  A1':  0.3 0.6 0.95 0.5 0.2                    0.9    0.1112 0.445 0.6675 0.9800 1         0.6
     A2':  0.8 0.7 0.5 0.4 0.2                     0.75
4th  A1':  0.3162 0.7071 0.9487 0.7746 0.5477      0.8    0.1123 0.4487 0.6731 0.8975 1        0.52
     A2':  0.9025 0.49 0.25 0.09 0.01              0.65
5th  A1':  0.1 0.4 0.6 0.8 0.9                     0.85   0.1355 0.5420 0.8129 1 1             0.68
     A2':  0.9746 0.8367 0.7071 0.5477 0.3162      0.9

consequent B are easy to refine. This experiment shows that our FENs can be used to help knowledge engineers fine-tune knowledge representation parameters and the membership values of fuzzy sets.

8. FUTURE WORK

Although the FENs have been used to fine-tune the knowledge representation parameters and the membership values of the FPRs, there are open problems which need further investigation. They include the following. Our present FENs are not capable of handling the new certainty factor computing method proposed in Yeung and Tsang (1997), and the method of computing the certainty factor of the deduced consequent discussed in Section 3 is only the limiting case of the one proposed in Yeung and

TABLE 4
The Initial Weights (IW), Target Weights (TW) and Actual Weights (AW) in the Conjunctive FPR

Each row pair gives λ_A1, A1, B and CF_R (upper) and λ_A2, A2 (lower).

     λ_Ai      Membership values (Ai)                        Membership values (B)                        CF_R
IW   1         A1: 0.2343 0.5322 0.4902 0.3939 0.1222        0.2163 0.5343 0.9273 0.6323 0.3125           0.32
     1         A2: 0.4345 0.5442 0.1213 0.2312 0.1342
TW   N/A       A1: 0.1 0.5 0.9 0.6 0.3                       0.1 0.4 0.6 0.8 0.95                         0.8
     N/A       A2: 0.95 0.7 0.5 0.3 0.1
AW   0.307212  A1: 0.192551 0.491780 0.824275 0.607236 0.359135   0.099610 0.398551 0.597720 0.806203 0.950000   0.800000
     0.149883  A2: 0.983121 0.593963 0.314523 0.280964 0.317128

TABLE 5
Results of the Deduced Consequent B' Using Weights Before and After Training in the Conjunctive FPR

Set   Results    Membership values (B')                          CF_B'
1st   Initial    0.460889 1.000000 1.000000 1.000000 0.665871   0.256000
      Expected   0.1345 0.5378 0.8067 1 1                       0.64
      Actual     0.134430 0.537872 0.806664 1.000000 1.000000   0.640000
2nd   Initial    0.307703 0.760081 1.000000 0.899494 0.444564   0.224000
      Expected   0.1 0.4 0.6 0.8 0.95                           0.56
      Actual     0.099610 0.398551 0.597720 0.806203 0.950000   0.560000
3rd   Initial    0.355258 0.877552 1.000000 1.000000 0.513260   0.240000
      Expected   0.1112 0.445 0.6675 0.9800 1                   0.6
      Actual     0.116866 0.467593 0.701264 0.945863 1.000000   0.600000
4th   Initial    0.350846 0.866652 1.000000 1.000000 0.506885   0.208000
      Expected   0.1123 0.4487 0.6731 0.8975 1                  0.52
      Actual     0.111435 0.445865 0.668679 0.901912 1.000000   0.520000
5th   Initial    0.460949 1.000000 1.000000 1.000000 0.665958   0.272000
      Expected   0.1355 0.5420 0.8129 1 1                       0.68
      Actual     0.135468 0.542023 0.812890 1.000000 1.000000   0.680000


Tsang (1997). Changing the computation of the certainty factor would require restructuring the FENs and modifying the learning algorithm. To overcome the problem of easily getting stuck in a local minimum, other techniques such as genetic algorithms, simulated annealing, Monte Carlo methods or radial basis functions could be considered.

9. CONCLUSION

In this paper a FEN which is capable of handling fuzziness has been proposed. The FEN has the advantages of both the FES and the neural network: it not only has the learning capability of neural networks but also the inference mechanism of FESs. An enhanced learning algorithm derived from the inference mechanism of a FES is also proposed. One application of this FEN is to help knowledge engineers fine-tune knowledge representation parameters such as threshold values, the certainty factor of a rule, and the membership values of fuzzy sets. The proposed networks can thus help knowledge engineers solve knowledge refinement problems.

REFERENCES

Adlassnig, K. P., & Kolarz, G. (1982). CADIAG-2: Computer-assisted medical diagnosis using fuzzy subsets. In M. M. Gupta & E. Sanchez (Eds), Approximate reasoning in decision analysis (pp. 219-247). New York: North-Holland.
Buchanan, B. G., & Duda, R. O. (1982). Principles of rule-based expert systems. Lab for AI research, Fairchild Camera, Palo Alto, CA, Fairchild Tech. Rep. no. 626.
Buckley, J. J., & Hayshi, Y. (1994). Can fuzzy neural nets approximate continuous fuzzy functions? Fuzzy Sets and Systems, 61, 43-51.
Chen, S. M. (1988). A new approach to handling fuzzy decision making problems. IEEE Transactions on Systems, Man and Cybernetics, 18(6), 1012-1016.
Chen, S. M. (1994). A weighted fuzzy reasoning algorithm for medical diagnosis. Decision Support Systems, 11, 37-43.
Chen, S. M., Ke, J. S., & Chang, J. F. (1990). Knowledge representation using fuzzy Petri nets. IEEE Transactions on Knowledge and Data Engineering, 2(2), 311-319.
Fu, L. M., & Fu, L. C. (1990). Mapping rule-based systems into neural architecture. Knowledge-Based Systems, 3, 48-56.
Gallant, S. I. (1988). Connectionist expert systems. Communications of the ACM, 31, 152-169.
Garg, M. L., Ahson, S. I., & Gupta, P. V. (1991). A fuzzy Petri net for knowledge representation and reasoning. Information Processing Letters, 39, 165-171.
Hall, L., & Romaniuk, S. (1990). A hybrid connectionist symbolic learning system. In Proceedings of the AAAI-90, pp. 783-788.
Handelman, D. A., Land, S. H., & Gelfand, J. J. (1989). Integration of knowledge-based and neural network techniques for autonomous learning machines. Proceedings of the International Joint Conference on Neural Networks, 1, 683-688.
Hollatz, J. (1992). Supplementing neural network learning with rule-based knowledge. IJCNN, Beijing, China, 3-6 November, pp. 595-601.
Hruska, S. I., Kuncicky, D. C., & Lacher, R. C. (1991). Hybrid learning in expert networks. IJCNN, 91(II), 117-120.
Ishibuchi, H., Kwon, K., & Tanaka, H. (1995). A learning algorithm of fuzzy neural networks with triangular fuzzy weights. Fuzzy Sets and Systems, 71, 277-293.
Kosko, B. (1992). Neural networks and fuzzy systems: a dynamical systems approach to machine intelligence. Englewood Cliffs: Prentice Hall.
Lacher, R. C., Hruska, S. I., & Kuncicky, D. C. (1992). Back-propagation learning in expert networks. IEEE Transactions on Neural Networks, 3(1), 62-72.
Lee, C. C. (1990). Fuzzy logic in control systems: fuzzy logic controller - part I. IEEE Transactions on Systems, Man, and Cybernetics, 20(2), 404-418.
Murphy, J. H. (1990). Probability-based neural networks. Proceedings of the IJCNN, 90(1), 451-454.
Ogawa, H., Fu, K. S., & Yao, J. T. P. (1985). An expert system for damage assessment of existing structures. In M. M. Gupta (Ed.), Approximate reasoning in expert systems (pp. 731-744). New York: North-Holland.
Scharwartz, D. G., Klir, G. J., Lewis, H. W., & Ezawa, Y. (1994). Applications of fuzzy sets and approximate reasoning. Proceedings of the IEEE, 82(4), 482-498.
Sestito, S., & Dillon, T. (1993). Knowledge acquisition of conjunctive rules using multilayered neural networks. International Journal of Intelligent Systems, 8, 779-805.
Sugeno, M. (1985). An introductory survey of fuzzy control. Information Sciences, 36, 59-83.
Tong, R. M. (1986). The representation of uncertainty in an expert system for information retrieval. In Fuzzy logic in knowledge engineering. Germany: Verlag TUV Rheinland.
Touretzky, D., & Hinton, G. (1988). A distributed connectionist production system. Cognitive Science, 12, 423-466.
Towell, G. G., Shavlik, J. W., & Noordewier, M. O. (1990). Refinement of approximate domain theories by knowledge-based neural networks. Proceedings of the AAAI-90, 861-866.
Tsang, E. C. C., & Yeung, D. S. (1996). Learning capability of fuzzy expert networks. In World Congress on Neural Networks (pp. 500-503).
World Congress on Neural Networks, 15-18 Sept, San Diego, California, pp. 500-503. Turksen, I. B., & Zhong, Z. (1988). An approximate analogical reasoning approach based on similarity measures. IEEE Transactions on Systems, Man and Cybernetics, 18(6), 1049-1056. Whalen, T. S. (1986). Financial ratio analysis, fuzzy logic in knowledge engineering. Germany: Verlag TUV Rheinland. Yager, R. (1984). Approximate reasoning as a basis for rule-based expert systems. IEEE Transactions on Systems Man, and Cybernetics, 14, 636--643. Yager, R. (1994). Modeling and formulating fuzzy knowledge bases using neural networks. Neural Networks, 7(8), 1273-1283. Widrow, B., Rumelhart, D. E., & Lehr, M. A. (1994). Neural networks: applications in industry, business and science. Communications of the ACM, 37(3), 93-105. Yeung, D. S., & Tsang, E. C. C. (1992). Fuzzy production rules evaluation and fuzzy Petri nets. In Advances in computer cybernetics and information engineering. 1993 Collection of Selected Papers presented at the 6th International Conference on Systems Research, Informatics and Cybernetics, 17-23, August, Baden-Baden, Germany, pp. 137-142. Yeung, D. S., & Tsang, E. C. C. (1994). Fuzzy knowledge representation and reasoning using Petri nets. Expert Systems with Applications, 7(2), 281-290. Yeung, D. S., & Tsang, E. C. C. (1997). Weighted fuzzy production rules. Fuzzy Sets and Systems, 8, 299-313. Yeung, D. S., & Tsang, E. C. C. (1997). A Comparative study on similarity-based fuzzy reasoning methods. IEEE Transactions on Systems Man, and Cybernetics, 27, 216--227. Zadeh, L. A. (1975). The concept of a linguistic variable and its application to approximate reasoning - - I, II, Ili. Information Sciences, 8 and 9, 199-249, 301-357, 43-80.