Automatic refinement of knowledge bases with fuzzy rules K S Leung and M L Wong
The knowledge-acquisition bottleneck obstructs the development of expert systems. Refinement of existing knowledge bases is a subproblem of the knowledge-acquisition problem. The paper presents a HEuristic REfinement System (HERES), which automatically refines rules with mixed fuzzy and nonfuzzy concepts represented in a variant of the rule representation language Z-II. HERES employs heuristics and analytical methods to guide its generation of plausible refinements. The functionality and effectiveness of HERES are verified through various case studies, which show that HERES can successfully refine knowledge bases. The refinement methods can handle imprecise and uncertain examples and generate approximate rules. In this respect, they are better than other well-known learning algorithms such as ID3 [15-18], AQ11, and INDUCE [14,19,20], because HERES' methods are currently unique in processing inexact examples and creating approximate rules.

Keywords: knowledge bases, knowledge acquisition, refinement

Department of Computer Science, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Paper received 2 January 1991. Revised paper received 10 September 1991. Vol 4 No 4 December 1991

The knowledge-acquisition problem consists of the subproblems of creating an initial knowledge base and refining an existing one. This paper discusses a HEuristic REfinement System (HERES), which refines an existing knowledge base using refinement examples. HERES progressively improves an existing knowledge base by the addition, deletion, and alteration of components of rules in the knowledge base. A good knowledge base should diagnose refinement examples with a small number of mistakes, i.e., misdiagnosed examples. Thus the number of mistakes is an important indicator of the quality of a knowledge base. HERES should search for a good refinement to decrease the number of mistakes and increase the overall performance of a knowledge base. To avoid the combinatorial explosion of a refinement problem, heuristics are employed to guide the search for good refinements. Ginsberg et al. have explored the feasibility of a heuristic approach for knowledge-base refinement [1-5].

HERES uses refinement examples to evaluate the behaviour and performance of the knowledge base being refined. HERES activates the Simple Expert System Shell (SESS), described in Appendix I, to solve refinement examples using rules with mixed fuzzy and nonfuzzy concepts in the knowledge base, and compares the conclusions (KB-FC) reached by the shell with the conclusions (Expert-FC) given by domain experts. HERES produces statistics about the performance of the knowledge base. It also analyses the statistics collected and employs some domain-independent knowledge engineering concepts (heuristics) to recommend good refinements for the knowledge base.

HERES is designed to refine knowledge bases developed for the SESS. The refined knowledge base can be executed by SESS. Since SESS is a subset of a comprehensive shell Z-II [6-8], the knowledge base obtained can also be executed by Z-II as an expert system.

In this paper, first the refinement concepts and design philosophy of HERES are presented. The logical structure of HERES is then outlined. The analytical refinement method (rule analysis) and the method of modification of rule statistics are discussed after that. A discussion of the cases studied is then given, followed by the conclusion.
REFINEMENT CONCEPTS OF HERES

This section presents the concepts of refinement operations (generalization and specialization), refinement phases (first-order generalization-specialization (G-S) refinement and higher-order G-S refinement), refinement strategy, refinement examples, performance statistics, and rule statistics of HERES.
0950-7051/91/040231-16 © 1991 Butterworth-Heinemann Ltd
Refinement operations

HERES transforms rules through generalization and specialization to improve rules in a knowledge base. HERES generalizes a rule by making antecedent conditions of a rule more easily satisfied (i.e., by transforming an antecedent condition into a more general description) or by reaching a conclusion with a higher certainty factor (e.g., by increasing the certainty factor of a rule). For example, the following rules:

    If Sex = Male and Age = 18 to 30 Then Weight is heavy With CF = 0.8;
    If Age = 20 to 30 Then Weight is heavy With CF = 0.8; and
    If Sex = Male and Age = 20 to 30 Then Weight is heavy With CF = 1.0

are generalizations of the rule:

    If Sex = Male and Age = 20 to 30 Then Weight is heavy With CF = 0.8.

HERES specializes a rule by making antecedent conditions of a rule more difficult to satisfy (i.e., by transforming an antecedent condition into a less general description) or by reaching a conclusion with a smaller certainty factor (e.g., by decreasing the certainty factor of a rule). For example, the following rules:

    If Sex = Male and Age = 25 to 30 Then Weight is heavy With CF = 0.8;
    If Sex = Male and Age = 20 to 30 and Height = tall Then Weight is heavy With CF = 0.8; and
    If Sex = Male and Age = 20 to 30 Then Weight is heavy With CF = 0.6

are specializations of the rule:

    If Sex = Male and Age = 20 to 30 Then Weight is heavy With CF = 0.8.

Refinement phases of HERES

HERES is composed of two refinement phases: the first-order G-S refinement phase and the higher-order G-S refinement phase. A single generalization/specialization is performed on a rule each time in the first phase. Since the effect of each refinement is limited and the operation is simple, a first-order G-S refinement can be performed in a short period. Refinements that contain more than one generalization and/or specialization are carried out in the higher-order G-S refinement phase. Learning a new rule can also be viewed as a higher-order G-S refinement because it refines an empty rule to an actual one. A single generalization/specialization of a fuzzy antecedent of a rule with a fuzzy conclusion is a kind of higher-order G-S refinement because of the features of fuzzy reasoning.

Strategy for knowledge-base refinement

A real-world knowledge base usually has thousands of rules that form a complicated reasoning network, as depicted in Figure 1. fc1, fc2, ..., and fcn in Figure 1 are possible conclusions reached by the knowledge base. The number of rules, the number of refinement examples, and the complicated interactions between rules make the problem of refining a knowledge base difficult. Thus a divide and conquer strategy is employed in the design of HERES to reduce the complexity of this problem. Applying this strategy, the whole problem is divided initially into two simpler refinement phases (the first-order G-S refinement and higher-order G-S refinement phases).

Rules in a knowledge base can be classified into different sets based on their final conclusions. From Figure 1, rules R1, R2, ..., and R11 are involved only in the reasoning of the final conclusion fc1, so they should be in the same set. This set of rules is denoted as Rules(fc1). On the other hand, rules R13, R14, ..., R21 of the knowledge base in Figure 1 should be in another set, which is denoted as Rules(fc2), because they are involved in the reasoning of the final conclusion fc2 only. Since R12 in Figure 1 is involved in the reasoning of both final conclusions, i.e., fc1 and fc2, it should be classified into a special set named Global-rules. The set of rules of a knowledge base can thus be partitioned into smaller and disjoint subsets, namely, Rules(fc1), Rules(fc2), ..., Rules(fcn) and Global-rules. Refinements of rules in these subsets can be performed individually.

Thus the first-order G-S refinement phase is divided into several simpler local level refinement steps and a global level refinement step. A local level refinement step refines rules of Rules(fci) by examining a subset of refinement examples (RE), which is denoted as TargetRE(fci) and discussed in the next section. By examining all refinement examples, the global level refinement step refines a portion of intermediate rules involved in the reasoning of more than one final conclusion. Here Global-rules denotes the set of these intermediate rules.

Refinement examples
HERES evaluates the performance of a knowledge base and individual rules using the refinement examples. Usually, these refinement examples are obtained from knowledge sources such as human experts, databases, and books. In this paper, the refinement examples are assumed to come from human experts. Suppose that a knowledge base, say KB1, is being refined by HERES. HERES combines SESS and KB1 to form a complete expert system that diagnoses the refinement examples. Information on the number of misdiagnosed examples, the reasons for the mistakes, and the performance of each rule is collected by HERES when the expert system classifies the refinement examples. Thus the refinement examples are employed to evaluate the adequacy of KB1.

Knowledge-Based Systems

Figure 1. Reasoning network of a knowledge base (levels, from top to bottom: final conclusions fc1, fc2, fc3, ..., fcn; top level rules; intermediate level rules; facts)

For the knowledge base KB1 being refined by HERES, the refinement examples can be divided into two disjoint subsets, namely, IE (incorrect examples) and CE (correct examples). IE contains those refinement examples that are misdiagnosed by the expert system using KB1, while CE contains those refinement examples that are diagnosed correctly. Since the modification of a rule in a knowledge base may mean that some misdiagnosed examples are diagnosed correctly afterwards, these examples provide evidence for the refinement of this rule. In fact, the same misdiagnosed example may provide evidence for refinements of more than one rule. However, the refinement of a rule in a knowledge base may also mean that some correct examples are misdiagnosed afterwards, i.e., they contribute disproofs for the refinement of this rule. Sometimes, the same correct example may supply disproofs for refinements of more than one rule. IE provides information on the imperfection of a knowledge base and evidence for refinements of a knowledge base. CE supplies information about where a knowledge base performs adequately and disproofs for refinements of a knowledge base.

Recall that the first-order G-S refinement phase comprises several local level refinement steps and a global level refinement step. A local level refinement step refines a subset of rules (denoted as Rules(fc1)) that conclude with the same final conclusion, say fc1. Most refinement examples do not provide any information on how to refine rules in Rules(fc1). Thus it is not necessary to consider all refinement examples during a local level refinement step. The subset of refinement examples that does affect Rules(fc1) is named TargetRE(fc1). TargetRE(fc1) is composed of GenIE(fc1), SpecIE(fc1), GenCE(fc1), and SpecCE(fc1).
GenIE(fc1) is the set of refinement examples that provide evidence for generalization of rules in Rules(fc1). An example belongs to GenIE(fc1) if the final conclusion suggested by human experts is fc1, which is one of the possible solutions deduced by the knowledge base, but the knowledge base does not infer fc1 with the highest certainty factor. Thus this example is a misdiagnosed example. The knowledge base does not infer fc1 with the highest certainty factor because some rules in Rules(fc1) are too strict. If these rules are generalized, this example will then be corrected. For this reason, this example suggests that some rules in Rules(fc1) should be generalized and thus belongs to GenIE(fc1).

SpecIE(fc1) is the set of refinement examples that supply evidence for specialization of rules in Rules(fc1). If the final conclusion indicated by human experts for a refinement example is not fc1, but the knowledge base infers the final conclusion fc1, this implies that some rules in Rules(fc1) are too general. They should be specialized so that this example can be diagnosed correctly. Since this example provides evidence for specialization of rules in Rules(fc1), it is a member of SpecIE(fc1).

GenCE(fc1) is the set of refinement examples providing disproofs for generalization of rules in Rules(fc1). Suppose that there is an example that is diagnosed by the knowledge base and human experts identically and the diagnosis is not fc1. However, fc1 is one of the possible solutions inferred by the knowledge base. If some rules in Rules(fc1) are generalized, the generalized knowledge base may misdiagnose this example as fc1 with the highest certainty factor. Thus this example suggests that HERES should not generalize rules in Rules(fc1) and it belongs to GenCE(fc1).

SpecCE(fc1) is the set of refinement examples that are disproofs for specialization of rules in Rules(fc1). Suppose that there is a correct example that is diagnosed by the knowledge base and human experts identically as fc1. If some rules in Rules(fc1) are specialized, the specialized knowledge base may no longer generate the final conclusion fc1 for this example, i.e., this example will be misdiagnosed by the specialized knowledge base. Because of this, this example suggests that HERES should not specialize rules in Rules(fc1) and thus belongs to SpecCE(fc1).

For a final conclusion, say fci, there is a third type of misdiagnosed example (IE): the subset of misdiagnosed examples that supply no information on how to modify rules in Rules(fci), which is denoted as OtherIE(fci). Similarly, for a final conclusion fci, the correct examples (CE) can be partitioned into three subsets, which are GenCE(fci), SpecCE(fci), and OtherCE(fci). OtherCE(fci) is the set of correct examples that provide no information about how to refine rules in Rules(fci).
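The six subsets defined above can be illustrated with a small sketch. It is not the HERES implementation: the `Diagnosis` type and the `classify` function are hypothetical names introduced here, each example is assumed to carry every conclusion the knowledge base inferred together with its certainty factor, and the last example is invented for illustration. The other three use the values listed for e1, e9 and e11 in Tables 3 and 5.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    """A refinement example after the shell has diagnosed it (hypothetical type)."""
    expert_fc: str         # Expert-FC: conclusion given by the domain expert
    kb_conclusions: dict   # every conclusion the KB inferred -> its certainty factor

    @property
    def kb_fc(self) -> str:
        # KB-FC: the inferred conclusion with the highest certainty factor.
        # Assumes the knowledge base inferred at least one conclusion.
        return max(self.kb_conclusions, key=self.kb_conclusions.get)


def classify(example: Diagnosis, fc: str) -> str:
    """Name the subset of refinement examples that `example` falls into for `fc`."""
    correct = example.kb_fc == example.expert_fc
    inferred = fc in example.kb_conclusions
    if not correct:
        if example.expert_fc == fc and inferred:
            return "GenIE"    # fc is right but did not win: rules too strict
        if example.kb_fc == fc:
            return "SpecIE"   # fc won but is wrong: rules too general
        return "OtherIE"
    if example.expert_fc == fc:
        return "SpecCE"       # correctly fc: a specialization could break it
    if inferred:
        return "GenCE"        # fc is a losing candidate: a generalization could break it
    return "OtherCE"


e1 = Diagnosis("heavy", {"medium": 0.9, "heavy": 0.0})    # values from Table 3
e9 = Diagnosis("medium", {"heavy": 1.0, "medium": 0.9})   # values from Table 5
e11 = Diagnosis("heavy", {"heavy": 1.0})                  # values from Table 5
ex = Diagnosis("medium", {"medium": 0.9, "heavy": 0.4})   # hypothetical example

labels = [classify(d, "heavy") for d in (e1, e9, e11, ex)]
print(labels)
```

TargetRE(fc) is then the union of the examples labelled GenIE, SpecIE, GenCE and SpecCE; the two "Other" labels are the examples a local level refinement step can skip.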
Performance statistics

The performance statistics indicate the fraction of refinement examples that are diagnosed correctly by the knowledge base. These performance statistics can be further broken down into individual categories. The breakdown statistics illustrate the performance of rules that reach the same final conclusion.

Rule statistics

Rule statistics indicate the performance of individual rules in the knowledge base. They also suggest how to modify rules in the knowledge base. The rule statistics of a rule are:
• GenEvidNo indicates the amount of gain if the rule is generalized by first-order refinement.
• PgenEvidNo indicates the amount of gain if the rule is generalized by higher-order refinement.
• GenDisNo indicates the amount of loss if the rule is generalized.
• SpecEvidNo indicates the amount of gain if the rule is specialized.
• SpecDisNo indicates the amount of loss if the rule is specialized.
• GenDiff is the minimum performance gain if the rule is generalized.
• GenGain is the maximum performance gain if the rule is generalized.
• SpecDiff is the minimum performance gain if the rule is specialized.
• SpecGain is the maximum performance gain if the rule is specialized.
These statistics are detailed in later sections.

Figure 2. Operations of HERES (flowchart: calculate breakdown statistics; select TargetFC; local level refinement analysis; first-order G-S refinement phase with refinements (*); higher-order refinement phase)

LOGICAL STRUCTURE OF HERES

The logical operations and structure of HERES are now described.
Operations of HERES

The operations performed by HERES are outlined in Figure 2. The first step solves the refinement examples using the knowledge base being refined. The conclusion reached by the inference engine (KB-FC) and the conclusion indicated by experts (Expert-FC) are compared. HERES partitions the refinement examples into correct examples (CE) and incorrect examples (IE) and calculates the overall performance and breakdown statistics of the knowledge base. The breakdown statistics are employed by HERES to determine the ordering of the local level refinements for the final conclusions. During a local level refinement cycle for a certain final conclusion, say fci, only a portion of the rules and refinement examples are used. These rules and refinement examples are Rules(fci) and TargetRE(fci), respectively. Thus this cycle sets up TargetRE(fci) and Rules(fci), which are used during local level refinements for each final conclusion fci. This cycle also sorts out the Global-Rules, which are refined in the global level refinement step.

The local level refinement cycle consists of three steps in a loop as shown in Figure 2. Each loop refines rules in Rules(fci) using TargetRE(fci) for each fci. In this refinement loop, the first step selects an fci, called TargetFC. HERES then performs rule analysis of rules in Rules(TargetFC) using TargetRE(TargetFC). Rule analysis collects rule statistics about the performance of rules in the knowledge base in order to modify these rules. The rule statistics being collected have already been discussed. A number of first-order G-S refinements are then performed in the third step. In the refinement process, each time the knowledge base is modified, previous rule statistics will be revised. Rule analysis and modification of rule statistics are discussed later.

Global level refinement refines the intermediate rules in Global-Rules. HERES first performs rule analysis of rules in Global-Rules using all the refinement examples. Some first-order G-S refinements are then generated as in the local level refinement cycle. The last step, the higher-order refinement phase, refines the whole knowledge base to rectify the remaining misdiagnosed examples. The functions of this step are detailed shortly.
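The separation of the rule base into Rules(fci) and Global-Rules, on which the local and global level refinement steps rely, can be sketched as a reachability computation over the reasoning network. This is only an illustration under stated assumptions: the function name `partition_rules`, the encoding of a rule as a pair (antecedent propositions, concluded proposition), and the toy network are all hypothetical and not taken from HERES.

```python
from collections import defaultdict


def partition_rules(rules, final_conclusions):
    """Split rules into Rules(fc) sets and Global-Rules.

    rules: dict mapping a rule name to (set of antecedent propositions,
           concluded proposition). This encoding is an assumption.
    """
    # proposition -> names of the rules that use it as an antecedent
    consumers = defaultdict(list)
    for name, (antecedents, _) in rules.items():
        for proposition in antecedents:
            consumers[proposition].append(name)

    def reachable(name, seen):
        # Final conclusions whose reasoning rule `name` takes part in.
        if name in seen:
            return set()
        seen.add(name)
        conclusion = rules[name][1]
        found = {conclusion} if conclusion in final_conclusions else set()
        for successor in consumers[conclusion]:
            found |= reachable(successor, seen)
        return found

    local, global_rules = defaultdict(set), set()
    for name in rules:
        fcs = reachable(name, set())
        if len(fcs) == 1:
            local[next(iter(fcs))].add(name)   # involved in one final conclusion
        elif len(fcs) > 1:
            global_rules.add(name)             # shared intermediate rule
    return dict(local), global_rules


# Toy network: R12 feeds rules that conclude both fc1 and fc2.
toy = {
    "R1": ({"fact_a"}, "x"),
    "R2": ({"x"}, "fc1"),
    "R12": ({"fact_b"}, "y"),
    "R3": ({"y"}, "fc1"),
    "R13": ({"y"}, "fc2"),
}
local_sets, global_set = partition_rules(toy, {"fc1", "fc2"})
print(local_sets, global_set)
```

In this toy network R12 ends up in Global-Rules because its conclusion is consumed by rules for both fc1 and fc2, mirroring the role of R12 in Figure 1.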
First-order G-S refinement

During a local level refinement cycle, many first-order G-S refinements will be generated after rule analysis. The refinement block marked with (*) in Figure 2 is elaborated in Figure 3. The first operation evaluates Gen-set and Spec-set. Gen-set is a subset of Rules(TargetFC) that is worth consideration for generalization. Spec-set is a subset of Rules(TargetFC) that is worth consideration for specialization. The second step (Loop A) finds the 'best' (or close to best) generalization of a rule in Gen-set. The heuristics BETTER-HEURISTICS check whether one refinement is better than another. The iteration must determine whether it is possible to find a better refinement. If it is impossible to find such a refinement, the iteration is terminated by the heuristics IMP-HEURISTICS. HERES employs the heuristics LITF-CHANCE-HEURISTICS to check whether it has little chance of finding a better refinement. If this situation occurs, HERES terminates Loop A. During the iterations, refinements are suggested by refinement heuristics. The third step (Loop B) searches in turn for a good specialization of rules in Spec-set. Operations in the final step select and refine the rules and modify the rule statistics. HERES then starts another local level refinement cycle if the termination criteria are not satisfied. The termination criteria are expressed in the heuristics STOP-HEURISTICS. The heuristics of HERES are described in Appendix B.
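The control flow of Loop A described above can be sketched as follows. The heuristics themselves are given in Appendix B and are not reproduced here, so they are passed in as callbacks; the function name `best_refinement`, the callback signatures, and the candidate list with its gain values are all hypothetical.

```python
def best_refinement(rule, propose, better, impossible, little_chance):
    """Search for the best (or close to best) refinement of `rule`.

    propose(rule, tried)        -> next candidate refinement, or None
    better(a, b)                -> True if a is better than b (BETTER-style check)
    impossible(rule, best, tried)    -> True if no better refinement can exist
    little_chance(rule, best, tried) -> True if a better one is unlikely
    """
    best = None
    tried = []
    while True:
        # Stop as soon as either termination heuristic fires.
        if impossible(rule, best, tried) or little_chance(rule, best, tried):
            return best
        current = propose(rule, tried)
        if current is None:          # the refinement heuristics have run dry
            return best
        tried.append(current)
        if best is None or better(current, best):
            best = current


# Hypothetical candidates for r1, each paired with an invented performance gain.
candidates = [("widen Age to 17 to 30", 1), ("drop Sex = Male", 2), ("raise CF", 0)]


def propose(rule, tried):
    return candidates[len(tried)] if len(tried) < len(candidates) else None


best = best_refinement(
    "r1",
    propose,
    better=lambda a, b: a[1] > b[1],
    impossible=lambda rule, b, tried: b is not None and b[1] >= 2,  # assumed upper bound
    little_chance=lambda rule, b, tried: len(tried) >= 3,
)
print(best)
```

Loop B would follow the same pattern with specializations in place of generalizations.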
Higher-order G-S refinement

The higher-order G-S refinement phase attempts to tackle misdiagnosed examples that cannot be corrected by simple first-order G-S refinement. This refinement phase includes three steps. The first step is the generalization of rules with nonfuzzy conclusions that are supported by partial generalization evidence (see the next section). The second step tries to generalize/specialize some fuzzy antecedent conditions of rules with fuzzy final conclusions. HERES employs heuristics described in Appendix B to generate this kind of refinement. In the last step, HERES calls the learning component AKA-2 of AKARS-1 to learn new rules that correct the remaining misdiagnosed examples.

RULE ANALYSIS AND MODIFICATION OF RULE STATISTICS

Rule analysis attempts to assess why a refinement example is misdiagnosed. By tracing reasoning chains of misdiagnosed examples, HERES can determine which rules should be generalized and/or specialized. Because a poor generalization/specialization will cause some correct examples to be misdiagnosed later, the consequence of a refinement on correct examples must be considered too.
Rule analysis for generalization

Let the current knowledge base (CKB) have two nonfuzzy final conclusions, 'weight is heavy' (fc1) and 'weight is medium' (fc2), and two rules (r1 and r2). If the target final conclusion (TargetFC) is 'weight is heavy' (fc1), GenIE(fc1) is {e1, e2, e3, e4}, GenCE(fc1) is {e5, e6, e7}, and Rules(fc1) is {r1}. The attributes and their corresponding values are listed in Table 1. The rules (r1 and r2) and refinement examples (e1, e2, ..., e7) are given in Tables 2 and 3, respectively.

Table 1. Attributes and their value sets
Attribute | Type | Attribute values
Sex | Nominal | {Male, Female}
Age | Linear | 0-120
Height | Fuzzy | {Tall, (x Tall), Medium, (x Medium), Short, (x Short)} for all x in Hedge = {very, rather, quite}
Weight | Nominal | {Heavy, Medium}

Table 2. Rules of CKB
r1: If Sex = Male and Age = 18 to 30 and height = very tall then weight = heavy with CF = 1.0
r2: If height = tall then weight = medium with CF = 0.9

Table 3. Some refinement examples
Case | Sex/CF* | Age/CF† | Height/CF‡ | Expert-FC/CF§ | KB-FC/CF¶
e1 | female/1.0 | 20/1.0 | very tall/1.0 | heavy/1.0 | medium/0.9, heavy/0.0
e2 | male/1.0 | 17/1.0 | very tall/1.0 | heavy/1.0 | medium/0.9, heavy/0.0
e3 | male/1.0 | 21/1.0 | tall/1.0 | heavy/1.0 | medium/0.9, heavy/0.8
e4 | female/1.0 | 16/1.0 | very tall/1.0 | heavy/1.0 | medium/0.9, heavy/0.0
e5 | female/1.0 | 17/1.0 | very tall/1.0 | medium/0.9 | medium/0.9
e6 | male/1.0 | 15/1.0 | very tall/1.0 | medium/0.9 | medium/0.9
e7 | male/1.0 | 22/1.0 | quite tall/1.0 | medium/0.9 | medium/0.9
* Sex/CF describes the value of Sex and the associated certainty factor (CF).
† Age/CF describes the value of Age and the associated certainty factor.
‡ Height/CF describes the value of Height and the associated certainty factor.
§ Expert-FC/CF is the conclusion and the associated certainty factor suggested by human experts.
¶ This column includes all possible final conclusions inferred by the inference engine and the corresponding certainty factors.

Figure 3. Elaboration of refinement block marked * in Figure 2 (key: B = Best, Cur = Current, Gen = Generalization, Spec = Specialization)

The inference engine concludes 'weight is heavy' for e1 with a certainty factor of 0.0 because r1 is the only rule that concludes 'weight is heavy'. r2 matches e1 and concludes 'weight = medium' with a certainty factor of 0.9. The certainty factors are evaluated from the following equations:

\[
\begin{aligned}
\mathrm{CF}(e_1, r_1) &= \min[\,\text{Match(Sex = Male | Sex = Female)} \times \text{fact-CF(Sex = Female)},\\
&\qquad\quad \text{Match(Age = 18 to 30 | Age = 20)} \times \text{fact-CF(Age = 20)},\\
&\qquad\quad \text{Match(height = very tall | height = very tall)} \times \text{fact-CF(height = very tall)}\,] \times \text{rule-CF}(r_1)\\
&= \min[\,0.0 \times 1.0,\; 1.0 \times 1.0,\; 1.0 \times 1.0\,] \times 1.0\\
&= 0.0\\
\mathrm{CF}(e_1, r_2) &= \min[\,\text{Match(height = tall | height = very tall)} \times \text{fact-CF(height = very tall)}\,] \times \text{rule-CF}(r_2)\\
&= \min[\,1.0 \times 1.0\,] \times 0.9\\
&= 0.9
\end{aligned}
\]

Here CF(ej, ri) is the CF of the conclusion of ri for ej. Match(antecedent condition of ri | fact in ej) determines whether an antecedent condition of a rule is satisfied by a fact of an example; the method of evaluation is described in Appendix A. fact-CF(fact in ej) returns the CF of a fact of an example, and rule-CF(ri) returns the CF of rule ri.

The method for calculating a certainty factor is detailed in Appendix A. Since r2 concludes 'weight is medium' with the highest certainty factor, KB-FC(e1) (the best final conclusion reached by the knowledge base for the refinement example e1) is 'weight is medium'. Because KB-FC(e1) is not equal to Expert-FC(e1), the final conclusion suggested by human experts for the refinement example e1, e1 is an incorrect example that provides evidence that r1 requires generalization. r1 reaches Expert-FC(e1) with a small certainty factor (0.0) because of the result of Match(Sex = Male | Sex = Female). Thus the antecedent condition [Sex = Male] should be generalized to a more general condition, say Y, such that Match(Y | Sex = Female) returns a large value (e.g., 1.0). This analysis concludes that e1 is an evidence for generalization of the antecedent condition [Sex = Male] of r1. Similarly, e2 is an evidence for generalization of the condition [Age = 18 to 30].

For e3, the inference engine concludes Expert-FC(e3), 'weight is heavy', with a certainty factor of 0.8 as shown below, because CF(e3, r1) is 0.8 and r1 is the only rule that concludes Expert-FC(e3).

\[
\begin{aligned}
\mathrm{CF}(e_3, r_1) &= \min[\,\text{Match(Sex = Male | Sex = Male)} \times \text{fact-CF(Sex = Male)},\\
&\qquad\quad \text{Match(Age = 18 to 30 | Age = 21)} \times \text{fact-CF(Age = 21)},\\
&\qquad\quad \text{Match(height = very tall | height = tall)} \times \text{fact-CF(height = tall)}\,] \times \text{rule-CF}(r_1)\\
&= \min[\,1.0 \times 1.0,\; 1.0 \times 1.0,\; \text{Match(height = very tall | height = tall)} \times 1.0\,] \times 1.0\\
&= \text{Match(height = very tall | height = tall)}
\end{aligned}
\]

Since [height = very tall] and [height = tall] are fuzzy expressions, Match(height = very tall | height = tall) returns the similarity [6-8] between the fuzzy sets associated with them. If the similarity is 0.8, CF(e3, r1) will also be 0.8 according to the above equation. Because r2 concludes 'weight is medium' with a certainty factor of 0.9, which is higher than CF(e3, r1), e3 is an incorrect example. If [height = very tall] of r1 is generalized to a more general condition, say Y, such that Match(Y | height = tall) returns a large value (e.g., 1.0), e3 can be corrected. Thus e3 is an evidence for generalization of the antecedent condition [height = very tall] of r1.

e4 is misdiagnosed because r1 reaches Expert-FC(e4), 'weight is heavy', with a certainty factor CF(e4, r1) of 0.0 as follows:

\[
\begin{aligned}
\mathrm{CF}(e_4, r_1) &= \min[\,\text{Match(Sex = Male | Sex = Female)} \times \text{fact-CF(Sex = Female)},\\
&\qquad\quad \text{Match(Age = 18 to 30 | Age = 16)} \times \text{fact-CF(Age = 16)},\\
&\qquad\quad \text{Match(height = very tall | height = very tall)} \times \text{fact-CF(height = very tall)}\,] \times \text{rule-CF}(r_1)\\
&= \min[\,0.0 \times 1.0,\; 0.0 \times 1.0,\; 1.0 \times 1.0\,] \times 1.0\\
&= 0.0
\end{aligned}
\]
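The certainty-factor computation used in the derivations above, CF(e, r) = Min[Match × fact-CF over all antecedent conditions] × rule-CF(r), can be sketched in a few lines. The encoding is hypothetical: each antecedent condition is a function returning a degree of match in [0, 1], and the fuzzy Match for height is a fixed lookup that uses the similarity of 0.8 assumed in the text, standing in for the method of Appendix A.

```python
def conclusion_cf(antecedents, rule_certainty, example):
    """CF of a rule's conclusion for one example.

    antecedents: attribute -> function(value) giving the degree of match in [0, 1]
    example:     attribute -> (value, fact CF)
    """
    degrees = [
        match(example[attribute][0]) * example[attribute][1]
        for attribute, match in antecedents.items()
    ]
    return min(degrees) * rule_certainty


# r1 from Table 2; the height lookup is a stand-in for fuzzy similarity.
r1 = {
    "sex": lambda v: 1.0 if v == "male" else 0.0,
    "age": lambda v: 1.0 if 18 <= v <= 30 else 0.0,
    "height": lambda v: {"very tall": 1.0, "tall": 0.8}.get(v, 0.0),
}

e1 = {"sex": ("female", 1.0), "age": (20, 1.0), "height": ("very tall", 1.0)}
e3 = {"sex": ("male", 1.0), "age": (21, 1.0), "height": ("tall", 1.0)}
e4 = {"sex": ("female", 1.0), "age": (16, 1.0), "height": ("very tall", 1.0)}

cf_e1 = conclusion_cf(r1, 1.0, e1)
cf_e3 = conclusion_cf(r1, 1.0, e3)
cf_e4 = conclusion_cf(r1, 1.0, e4)

# Generalizing only [Sex = Male] so that it matches any sex:
r1_any_sex = dict(r1, sex=lambda v: 1.0)
cf_e1_general = conclusion_cf(r1_any_sex, 1.0, e1)
cf_e4_general = conclusion_cf(r1_any_sex, 1.0, e4)
print(cf_e1, cf_e3, cf_e4, cf_e1_general, cf_e4_general)
```

With the Sex condition generalized, the sketch gives e1 a certainty factor of 1.0 for 'weight is heavy', while e4 stays at 0.0 because its Age condition still fails.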
A single generalization of [Sex = Male] or [Age = 18 to 30] will not correct e4, but will make e4 closer to being right. Thus e4 is said to be a partial evidence for generalizations of these conditions. If a rule is generalized, however, some correct examples may then be misdiagnosed by the system. For example, consider the correct example e5. If [Sex = Male] is generalized to a condition Z such that Match(Z | Sex = Female) returns 1.0, r1 concludes 'weight is heavy' with a certainty factor CF(e5, r1) of 1.0 as follows:

\[
\begin{aligned}
\mathrm{CF}(e_5, r_1) &= \min[\,\text{Match(Z | Sex = Female)} \times \text{fact-CF(Sex = Female)},\\
&\qquad\quad \text{Match(Age = 18 to 30 | Age = 18)} \times \text{fact-CF(Age = 18)},\\
&\qquad\quad \text{Match(height = very tall | height = very tall)} \times \text{fact-CF(height = very tall)}\,] \times \text{rule-CF}(r_1)\\
&= 1.0
\end{aligned}
\]

The certainty factor CF(e5, r1) of 1.0 for the conclusion 'weight is heavy' is larger than the certainty factor (0.9) of the correct final conclusion 'weight is medium', i.e., e5 will be misdiagnosed. Thus e5 is a disproof for generalization of the antecedent condition [Sex = Male]. Similarly, e6 and e7 are disproofs for generalizations of [Age = 18 to 30] and [Height = very tall], respectively. A table is created for r1 to record the number of generalization evidences (GenEvidNo), the generalization evidences (GenEvid), the number of partial generalization evidences (PgenEvidNo), the partial generalization evidences (PgenEvid), the number of generalization disproofs (GenDisNo), and the generalization disproofs (GenDis).
Condition | GenEvidNo | GenEvid | PgenEvidNo | PgenEvid | GenDisNo | GenDis
Sex* | 1 | {e1} | 1 | {e4} | 1 | {e5}
Age† | 1 | {e2} | 1 | {e4} | 1 | {e6}
Height‡ | 1 | {e3} | 0 | {} | 1 | {e7}
CFr§ | 0 | {} | 0 | {} | 0 | {}
* The number of evidences (GenEvidNo), generalization evidences (GenEvid), the number of partial evidences (PgenEvidNo) and partial evidences (PgenEvid) that support generalization of the Sex attribute, and the number of disproofs (GenDisNo) and generalization disproofs (GenDis) that disprove generalization of the Sex attribute.
† GenEvidNo, GenEvid, PgenEvidNo, and PgenEvid that support generalization of the Age attribute. GenDisNo and GenDis that disprove generalization of the Age attribute.
‡ GenEvidNo, GenEvid, PgenEvidNo, and PgenEvid that support generalization of the Height attribute. GenDisNo and GenDis that disprove generalization of the Height attribute.
§ GenEvidNo, GenEvid, PgenEvidNo, and PgenEvid that support generalization of the certainty factor of the rule. GenDisNo and GenDis that disprove generalization of the certainty factor of the rule.

HERES attempts to minimize the number of misdiagnosed examples, so an estimate of the performance gain of a refinement is defined as the decrease in the number of misdiagnosed examples. Related to the performance gain, two statistics, GenDiff and GenGain, are maintained for every rule ri. GenDiff(ri) and GenGain(ri) are, respectively, the minimum and the maximum decrease in the number of misdiagnosed examples if rule ri is generalized. Since a generalization of a rule with fuzzy antecedent conditions and a fuzzy final conclusion is a higher-order G-S refinement, the rule analysis in the first-order G-S refinement phase will not consider these fuzzy conditions. The analytical method developed here, however, can be extended to hierarchical rules and used in the higher-order G-S refinement phase.

Rule analysis for specialization

Let the current knowledge base (CKB) have two nonfuzzy final conclusions, 'weight is heavy' (fc1) and 'weight is medium' (fc2), and two rules (r6 and r7). If the TargetFC is fc1, then SpecIE(TargetFC) is {e9, e10}, SpecCE(TargetFC) is {e11}, and Rules(TargetFC) is {r6}. The attributes and their corresponding values are listed in Table 1. The rules and refinement examples are presented in Tables 4 and 5, respectively. e9 is misdiagnosed because r6 concludes 'weight is heavy' with a high certainty factor of 1.0. To correct e9, HERES should reduce CF(e9, r6) to a value that is smaller than the certainty factor, CF(e9, r7), of the correct conclusion 'weight is medium'. HERES has many methods to reduce CF(e9, r6): add an additional antecedent condition [Sex = male]; modify an existing condition, e.g., modify [Age = 18 to 30] to [Age = 21 to 30] or [height = tall] to [height = quite tall]; or reduce the certainty factor of r6. If HERES specializes r6 to reduce the value of CF(e9, r6), e9 will be diagnosed correctly. Thus e9 is an evidence for specialization of r6.

Table 4. Rules of CKB
r6: If Age = 18 to 30 and height = tall then weight is heavy with CF = 1.0
r7: If height = tall then weight is medium with CF = 0.9
Table 5. Some refinement examples
Case | Sex/CF* | Age/CF† | Height/CF‡ | Expert-FC/CF§ | KB-FC/CF¶
e9 | female/1.0 | 20/1.0 | tall/1.0 | medium/1.0 | heavy/1.0, medium/0.9
e10 | male/1.0 | 18/1.0 | tall/1.0 | medium/1.0 | heavy/1.0, medium/0.9
e11 | male/1.0 | 20/1.0 | very tall/1.0 | heavy/1.0 | heavy/1.0
* Sex/CF describes the value of Sex and the associated certainty factor (CF).
† Age/CF describes the value of Age and the associated certainty factor.
‡ Height/CF describes the value of Height and the associated certainty factor.
§ Expert-FC/CF is the conclusion and the associated certainty factor suggested by human experts.
¶ This column includes all possible final conclusions inferred by the inference engine and the corresponding certainty factors.
HERES observes that e11, a correct example, may be misdiagnosed by the system after the specialization of r6; thus e11 is a disproof for specialization of r6. For each rule, HERES maintains a statistical table that summarizes the number of evidences (SpecEvidNo) and the evidences (SpecEvid) for specialization, the number of disproofs (SpecDisNo) and the disproofs (SpecDis) for specialization, and the maximum (SpecGain) and minimum (SpecDiff) of the performance gains obtainable by a specialization of that rule. For example, the statistical table for r6 is:

SpecEvidNo*   SpecEvid†   SpecDisNo‡   SpecDis§   SpecGain¶   SpecDiff||
2             {e9, e10}   1            {e11}      2           1

* SpecEvidNo is the number of examples that support specialization of the rule. It is also an upper bound of the profit gained by a specialization.
† SpecEvid is the set of examples that support specialization.
‡ SpecDisNo is the number of examples that disprove specialization of the rule.
§ SpecDis is the set of examples that disprove specialization.
¶ SpecGain is the maximum of the profit gained by a specialization. SpecGain is equal to SpecEvidNo.
|| SpecDiff is the minimum of the profit gained by a specialization. SpecDiff = SpecEvidNo - SpecDisNo.
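The relationships among these statistics can be written down directly. The following is a minimal sketch, not the HERES implementation: the class name `SpecStats` and its field names are hypothetical, while the two formulas (SpecGain = SpecEvidNo and SpecDiff = SpecEvidNo - SpecDisNo) come from the table footnotes.

```python
from dataclasses import dataclass, field


@dataclass
class SpecStats:
    """Specialization statistics kept for one rule (hypothetical container)."""

    spec_evid: set = field(default_factory=set)  # examples supporting specialization
    spec_dis: set = field(default_factory=set)   # examples disproving specialization

    @property
    def spec_evid_no(self):
        return len(self.spec_evid)

    @property
    def spec_dis_no(self):
        return len(self.spec_dis)

    @property
    def spec_gain(self):
        # Maximum profit: every supporting example is corrected.
        return self.spec_evid_no

    @property
    def spec_diff(self):
        # Minimum profit: every disproving example becomes misdiagnosed.
        return self.spec_evid_no - self.spec_dis_no


r6_stats = SpecStats(spec_evid={"e9", "e10"}, spec_dis={"e11"})
print(r6_stats.spec_gain, r6_stats.spec_diff)  # 2 1
```

The printed values reproduce the SpecGain and SpecDiff entries of the table for r6.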
Though a specialization to a fuzzy condition of a rule with a fuzzy final conclusion is a higher-order G-S refinement, the rule analysis is similar to that above.

Modification of rule statistics
After a refinement is made to a rule, the rule statistics collected earlier are outdated because they no longer describe the behaviour and performance of the rules exactly. Usually, the rule analysis has to be executed again to gather new rule statistics for the modified knowledge base. SEEK and SEEK2 [1-5] employ this method to collect the updated rule statistics. However, this method is too time-consuming and expensive, so HERES employs a more efficient method to modify rule statistics. A refinement modifies a single rule only, so its effect is confined to a small range: only the evidences and disproofs for refinements of that rule are influenced by the modification. For this reason, only these refinement examples have to be diagnosed again by the modified knowledge base to collect the updated rule statistics. In fact, there is a method for modifying the rule statistics directly without diagnosing any refinement examples again. The idea is that the rule statistics to be modified can be deduced directly from the kind of refinement performed and the refinement examples influenced. An experiment demonstrated that the method described in this section is a significant improvement over the old method for modifying rule statistics.
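The first of the two ideas above, re-diagnosing only the examples recorded as evidences or disproofs for the refined rule, might look like the following sketch. The dictionary keys, the function names, and the `diagnose` callback are all hypothetical; the direct deduction method that avoids re-diagnosis altogether is not detailed in this section and is not attempted here.

```python
def examples_to_rediagnose(rule_stats):
    """Collect every example recorded as evidence or disproof for the refined rule."""
    affected = set()
    for key in ("gen_evid", "gen_dis", "spec_evid", "spec_dis"):
        affected |= set(rule_stats.get(key, ()))
    return affected


def rediagnose_affected(rule_stats, diagnose):
    """Run the (hypothetical) diagnose callback on the affected examples only."""
    return {name: diagnose(name) for name in sorted(examples_to_rediagnose(rule_stats))}


# Statistics of r6 from the earlier example; r6 has no generalization entries here.
r6 = {"spec_evid": {"e9", "e10"}, "spec_dis": {"e11"}}
print(sorted(examples_to_rediagnose(r6)))  # ['e10', 'e11', 'e9']
```

Examples that never interact with the refined rule are skipped, which is where the saving over a full rule analysis would come from.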
CASE STUDIES

Two case studies are explored to verify the efficacy of HERES in learning and refining knowledge bases for the development of expert systems. All refinement and testing examples used are mutually independent.
Methods for evaluating performance of knowledge base

HERES succeeds in refining a knowledge base if the knowledge base obtained performs well with the testing examples. There are two methods for measuring the performance of a knowledge base. The first method calculates the fraction of testing examples that are diagnosed correctly by a knowledge base (successful rate). The second method calculates the mean of the differences between the certainty factors suggested by the expert system and by human experts (MDCF: mean of differences of certainty factors). (Note that if the expert system misdiagnoses an example, the certainty factor suggested by the expert system is assumed to be zero.) If the mean is small, the expert system is good, which implies that the knowledge base of the expert system is appropriate.
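Both measures are simple to compute. The sketch below is illustrative only: the record layout and function names are hypothetical, and the use of the absolute difference in the MDCF is an assumption, since the text does not say whether differences are signed.

```python
def successful_rate(cases):
    """Fraction of testing examples whose KB conclusion equals the expert's."""
    correct = sum(1 for c in cases if c["kb_fc"] == c["expert_fc"])
    return correct / len(cases)


def mdcf(cases):
    """Mean of differences of certainty factors (absolute difference assumed)."""
    total = 0.0
    for c in cases:
        # A misdiagnosed example contributes a KB certainty factor of zero.
        kb_cf = c["kb_cf"] if c["kb_fc"] == c["expert_fc"] else 0.0
        total += abs(c["expert_cf"] - kb_cf)
    return total / len(cases)


# Two made-up testing examples, for illustration only.
testing = [
    {"expert_fc": "setosa", "expert_cf": 1.0, "kb_fc": "setosa", "kb_cf": 0.9},
    {"expert_fc": "virginica", "expert_cf": 1.0, "kb_fc": "versicolour", "kb_cf": 0.8},
]
print(successful_rate(testing))  # 0.5
print(round(mdcf(testing), 2))   # 0.55
```

The second example is misdiagnosed, so it contributes a difference of 1.0 to the MDCF regardless of the certainty factor the knowledge base reported.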
Case 1

The first case study verifies the performance of HERES in refining a knowledge base used to build an expert system for classifying iris flowers into three varieties, namely, setosa, versicolour, and virginica. The expert system examines four attributes of iris flowers and applies rules in the knowledge base acquired to classify them. These attributes are sepal length, sepal width, petal length, and petal width. The training, refinement, and testing examples are taken from Fisher [9]. Two experiments are attempted in this case study. For the first experiment, examples are divided into refinement and testing sets containing 45 examples each. These example sets are all composed of equal proportions of setosa, versicolour, and virginica flowers. The initial knowledge base is progressively refined to a refined knowledge base by HERES using refinement examples provided by the refinement set. Examples of the testing set are then classified by expert systems using the initial and refined knowledge bases, respectively. Different performance indices are collected during this classification procedure. The second experiment is basically similar to the first one. The only difference is the sizes of the two example sets. For the second experiment, the refinement and testing sets have 30 and 90 examples, respectively. These example sets are all composed of equal proportions of setosa, versicolour, and virginica flowers. Each experiment is repeated four times. The results are summarized in Figures 4 and 5. For the first experiment, HERES refines the initial knowledge base successfully. The successful rate increases from 87% to 98%. In addition, the MDCF decreases from about 0.12 to about 0.033. This indicates that the certainty factors of rules in the refined knowledge base are nearly optimized. For the second experiment, the successful rate increases from about 80% to 90%. Thus the initial knowledge base is refined successfully. The same conclusion can be obtained by examining the MDCFs. From Figures 4 and 5, the refined knowledge base obtained in the first experiment is better than that obtained in the second experiment. The reason is that the number of examples in the refinement set is smaller in the second experiment, so there are not enough examples to provide sufficient knowledge for HERES to build an appropriate knowledge base. Therefore, adequate examples should be available so that a good knowledge base can be obtained by refinement.
Figure 4. Summary of successful rates in case 1 (x-axis: experiment number; key: initial knowledge base, refined knowledge base)

Figure 5. Summary of MDCFs in case 1 (x-axis: experiment number; key: initial knowledge base, refined knowledge base)

Case 2

In the second case study, a complicated real-life knowledge base incorporating fuzzy concepts is refined by HERES. This knowledge base is applied in a medical expert system that deals with the problems of rupture of membranes. The system has four goals at the top or intermediate levels:

• Diagnosis: decides whether the membranes are ruptured or unruptured.
• Complication (CX): decides whether infection exists.
• Management: decides whether a foetus should be delivered.
• Mode: determines how to deliver the foetus when it has been decided that the foetus should be delivered.

All four goals are nominal and their order in consultation is diagnosis, complication, management, and finally mode. This knowledge base has a multiple level structure and 42 attributes. Because of the complexity of this knowledge base, the whole refining process is divided into a number of easier phases. Each phase refines a portion of the rules of the whole knowledge base. These portions of rules are called KBs1, KBs2, KBs3, and KBs4. There are four experiments in this case study.

The first experiment refines the knowledge base KBs1. An expert system employs knowledge in KBs1 to decide whether the membranes are ruptured or unruptured. The number of examples in the refinement set is 150. One fifth of the examples of this set belong to the category of 'diagnosis is unrupt', while the remaining examples are taken from the class of 'diagnosis is membrupt'. The testing set is composed of 200 examples and comprises equal portions of examples taken from the two classes.

In the second experiment, HERES refines the knowledge base KBs2 for deciding whether infection exists. The refinement and testing sets have 200 and 400 examples, respectively. These sets all comprise equal portions of examples obtained from the two classes.

The third experiment verifies the functionality and effectiveness of HERES in refining KBs3. The example sets have 100 examples each, and each set contains equal numbers of examples obtained from the classes 'management is delivery' and 'management is observation'. An expert system using KBs3 determines whether the foetus should be delivered or observed.

KBs4 is refined by HERES in the last experiment. An expert system using KBs4 diagnoses the 'modes' of different testing examples. The refinement and testing sets have 150 examples each, and each set is composed of equal numbers of examples taken from the three different categories.

Every experiment is repeated five times (i.e., five attempts) and the outcomes are summarized in Figures 6 and 7. For the first experiment, the successful rates of the initial and refined KBs1 are nearly the same. Nevertheless, the refined KBs1 is better than the initial KBs1 because the behaviour of the former is closer to that of human experts, the MDCF of the refined KBs1 being smaller. This implies that HERES mainly improves the certainty factors of rules in the knowledge base. This experiment suggests that if the distribution of the refinement examples is uneven, it is difficult to achieve a large improvement for the refined knowledge base. For the second experiment, the successful rate of the refined knowledge base (refined KBs2) increases from about 77% to about 84%. The same fact is indicated by the MDCF, which is reduced from about 0.16 to about 0.05. In the third experiment, HERES refines the initial knowledge base substantially and the successful rate increases from about 70% to about 85%. The MDCFs displayed in Figure 7 also show that the initial KBs3 is suboptimal: its MDCF is 0.19, while that of the refined KBs3 is reduced to 0.037. This illustrates that the refined KBs3 has better performance. In the fourth experiment, the successful rate of the refined KBs4 increases from 70% to 77%. The MDCF is also reduced from 0.23 to 0.1, i.e., the refined KBs4 is better than the initial KBs4 because the refined KBs4 can infer more accurate certainty factors for conclusions.

CONCLUSIONS
HERES is a component of the Automatic Knowledge Acquisition and Refinement System (AKARS-1). HERES refines rules expressed in the rule representation language of SESS. The rules being refined can be organized in a hierarchical structure and can also contain approximate attributes and certainty factors. HERES is composed of two refinement phases. In the first-order G-S refinement phase, a single generalization/specialization is performed on a rule each time. Refinements containing more than one generalization and/or specialization are carried out in the higher-order G-S refinement phase. Since the problem of refining a knowledge base is difficult, a divide and conquer strategy is used in the design of HERES. Applying this strategy, the first-order G-S refinement phase is divided into several simpler local level refinement steps and a global level refinement step. HERES tackles each local level refinement step by using an analytical method (rule analysis) to summarize the performances of a subset of rules. Based on these analytical data, HERES deduces why some refinement examples are misdiagnosed and uses heuristics to suggest plausible refinements. HERES evaluates their abilities to improve the knowledge base and only executes the plausible refinements with good performances. An improved method is introduced for more efficient updating of performance statistics. The intermediate rules involved in the reasonings of more than one final conclusion are refined in the global level refinement step. Many misdiagnosed examples will be corrected in the first-order G-S refinement phase. The remaining misdiagnosed examples will be corrected in the higher-order refinement phase. Because of the appropriate combination of a divide and conquer strategy, analytical methods, and heuristics, effective refinements have been achieved. The refinement capability of HERES has been verified by case studies.

Figure 6. Summary of successful rates in case 2 (x-axis: experiment number; key: initial knowledge base, refined knowledge base)

Figure 7. Summary of MDCFs in case 2 (x-axis: experiment number; key: initial knowledge base, refined knowledge base)

REFERENCES

1 Ginsberg, A 'A meta-linguistic approach to the construction of knowledge base refinement systems' in Proc. AAAI-86 Philadelphia, PA, USA (1986) pp 436-441
2 Ginsberg, A Automatic Refinement of Expert System Knowledge Bases Pitman, UK (1988)
3 Ginsberg, A, Weiss, S M and Politakis, P 'Automatic knowledge base refinement for classification systems' Artif. Intell. Vol 35 (1988) pp 227-241
4 Politakis, P and Weiss, S M 'Using empirical analysis to refine expert system knowledge bases' Artif. Intell. Vol 22 (1984) pp 23-48
5 Politakis, P Empirical Analysis for Expert Systems Pitman, UK (1985)
6 Leung, K S and Lam, W 'A fuzzy knowledge-based system shell' in Proc. Second Int. Symp. Methodologies for Intelligent Systems (1987) pp 321-331
7 Leung, K S and Lam, W 'The implementation of fuzzy knowledge-based system shells' in Proc. TENCON 87 1987 IEEE Region 10 Conf. (1987) pp 650-654
8 Leung, K S and Lam, W 'Fuzzy concepts in expert systems' IEEE Computer Vol 21 (1988) pp 43-56
9 Fisher, R A 'The use of multiple measurements in taxonomic problems' Ann. Eugen. Vol 7 (1936) p 179
10 Buchanan, B G and Shortliffe, E H (eds) Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project Addison-Wesley, USA (1984)
11 Mizumoto, M, Fukami, S and Tanaka, K 'Some methods of fuzzy reasoning' in Gupta, M M, Ragade, R K and Yager, R R (eds) Advances in Fuzzy Set Theory and Applications North-Holland, The Netherlands (1979) pp 117-136
12 Cayrol, M, Farreny, H and Prade, H 'Fuzzy pattern matching' Kybernetes Vol 11 (1982) pp 103-116
13 Zimmermann, H J Fuzzy Set Theory and its Applications Kluwer-Nijhoff Publishing (1986)
14 Michalski, R S 'A theory and methodology of inductive learning' in Michalski, R S, Carbonell, J G and Mitchell, T M (eds) Machine Learning: An Artificial Intelligence Approach I Morgan Kaufmann, USA (1983) pp 83-134
15 Quinlan, J R 'Learning efficient classification procedures and their application to chess end games' in Michalski, R S, Carbonell, J G and Mitchell, T M (eds) Machine Learning: An Artificial Intelligence Approach I Morgan Kaufmann, USA (1983) pp 463-482
16 Quinlan, J R 'The effect of noise on concept learning' in Michalski, R S, Carbonell, J G and Mitchell, T M (eds) Machine Learning: An Artificial Intelligence Approach II Morgan Kaufmann, USA (1986) pp 149-166
17 Quinlan, J R 'Simplifying decision trees' Int. J. Man-Mach. Stud. Vol 27 (1987) pp 221-234
18 Quinlan, J R 'Probabilistic decision trees' in Kodratoff, Y and Michalski, R S (eds) Machine Learning: An Artificial Intelligence Approach III Morgan Kaufmann, USA (1990) pp 140-152
19 Michalski, R S 'Discovering classification rules using variable-valued logic system VL1' in Proc. Third IJCAI Stanford, CA, USA (1973) pp 162-172
20 Michalski, R S 'Pattern recognition as rule-guided inductive inference' IEEE Trans. Patt. Anal. Mach. Intell. Vol 2 No 4 (1980) pp 349-361
APPENDIX A: SIMPLE EXPERT SYSTEM SHELL

This appendix describes a Simple Expert System Shell (SESS) that can deal with exact and inexact reasoning. SESS is a rule-based system that employs fuzzy logic and backward chaining for its reasoning. HERES refines the knowledge base, which will be embedded in SESS to form a complete expert system. SESS is in fact a variation of Z-II, a comprehensive expert system shell [6-8].

Knowledge representation

Fuzzy concepts are allowed in the facts and rules of the knowledge bases of SESS. In SESS, fuzzy concepts are denoted by fuzzy expressions, which are modelled by fuzzy sets. Each knowledge base has some basic fuzzy concepts, which are used in presenting the rules and facts of the knowledge base. A fuzzy expression can be formally defined by using the BNF grammar in Figure 8. A knowledge base of SESS is a collection of rules and facts that describe knowledge of a specific domain. There is a certainty factor attached to each rule for describing the degree of confidence in the rule. The format of a rule is shown in Figure 9.

Figure 8. BNF grammar of a fuzzy expression (hedges: very | rather | quite)

Figure 9. BNF grammar of a rule (general form: If antecedent conditions Then consequent conclusion With CF = certainty factor)

A single antecedent condition consists of an attribute and its attribute value. The attribute is a linguistic term specifying the attribute of an object. The attribute value specifies the values in the antecedent condition. The antecedent condition will be analysed to determine whether there is a fact satisfying the antecedent condition of a rule. This fact and its certainty factor are considered when the corresponding rule is evaluated. The antecedent part of a rule consists of one or more fuzzy/nonfuzzy condition(s) connected by a logical AND operation. The consequent conclusion <conq> consists of an attribute and its attribute value.

Reasoning in SESS

SESS adopts backward reasoning [10] to build the appropriate reasoning trees. Then SESS employs a forward method to calculate the values of the fuzzy terms; the details of rule evaluations are given in the following subsections.

Rule evaluation

Consider a rule r1 and a fact e1.

r1:         If A = F1 Then C = Fc With CF = cfr
e1:         A = F1' With CF = cf1'
conclusion: C = Fc' With CF = cfc

where
A     antecedent attribute
C     consequent attribute
cfr   certainty factor denoting the uncertainty of r1
cf1'  certainty factor denoting the uncertainty of e1
cfc   certainty factor denoting the uncertainty of the conclusion
F1    antecedent attribute value
Fc    consequent attribute value
F1'   attribute value of the fact e1
Fc'   attribute value of the conclusion

If the attribute A in the antecedent is a nominal, linear, or structural attribute and the rule is satisfied, the value Fc' in the conclusion is equal to the value Fc in the rule. When the attribute A is nominal, F1 and F1' must be the same symbolic term for this rule to apply. For a linear or structural attribute, the rule is satisfied if F1' is a subrange or a specialization of F1, respectively. The certainty factor cfc is calculated using the formula:

$$cf_c = cf_r * cf_1'$$

where * denotes multiplication.

If both A and C are fuzzy attributes, F1, Fc, and F1' are fuzzy values represented by the fuzzy sets FZ1, FZc, and FZ1', respectively. A fuzzy relation R can be formed by taking some fuzzy operations on FZ1 and FZc [11]. The fuzzy set FZc' of Fc' in the conclusion is obtained by applying a fuzzy composition operation (denoted by $\circ$) on FZ1' and R [11], i.e.:

$$FZ_c' = FZ_1' \circ R$$

The certainty factor cfc is calculated from the above formula for cfc.

If A is fuzzy and C is nonfuzzy, Fc' in the conclusion must be equal to Fc. However, the certainty factor cfc is obtained by the multiplication of cfr, cf1', and the similarity [12] 'Similarity(FZ1, FZ1')' between FZ1 and FZ1', which are the fuzzy sets of F1 and F1', respectively. The calculation method is depicted in the following formula:

$$cf_c = cf_r * cf_1' * \mathrm{Similarity}(FZ_1, FZ_1')$$
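For the case in which both A and C are fuzzy, the composition FZc' = FZ1' o R can be illustrated on a small discrete universe. The sketch below is only one possible instantiation: it assumes that R is built with the minimum operator and that the composition is max-min, which is one of the methods surveyed in [11]; the appendix itself does not state which operations SESS uses. Fuzzy sets are represented as lists of membership grades, and the function names are hypothetical.

```python
def relation(fz_antecedent, fz_consequent):
    """Fuzzy relation R with R[i][j] = min(FZ1[i], FZc[j]) (assumed operator)."""
    return [[min(a, c) for c in fz_consequent] for a in fz_antecedent]


def compose(fz_fact, rel):
    """Max-min composition: FZc'[j] = max over i of min(FZ1'[i], R[i][j])."""
    n_cols = len(rel[0])
    return [
        max(min(fz_fact[i], rel[i][j]) for i in range(len(rel)))
        for j in range(n_cols)
    ]


fz1 = [0.0, 0.5, 1.0]       # fuzzy set of the antecedent value F1
fzc = [0.2, 1.0]            # fuzzy set of the consequent value Fc
fz1_fact = [0.0, 1.0, 0.5]  # fuzzy set of the fact value F1'

r = relation(fz1, fzc)
print(compose(fz1_fact, r))  # [0.2, 0.5]
```

The memberships here are made up for illustration; with a different choice of relation operator the resulting FZc' would differ.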
Rules with multiple antecedent conditions

In SESS, the antecedent part of a rule can contain multiple conditions only with AND conjunctions between them. If the attribute in a condition is nonfuzzy, no special treatment is needed. However, if the condition is fuzzy, the fuzzy set of the value in the conclusion is calculated using the following algorithm. Consider a rule r2 and two facts e2 and e3.

r2:         If A1 is F1 and A2 is F2 then C is Fc With CF = cfr
e2:         A1 is F1' With CF = cf1
e3:         A2 is F2' With CF = cf2
conclusion: C is Fc' With CF = cfc

The algorithm will first evaluate the fuzzy sets FZ1 and FZ2. The fuzzy set FZ1 is obtained from the composition operation on the rule 'If A1 is F1 then C is Fc With CF = cfr' and the fact e2, while the fuzzy set FZ2 is obtained from the composition operation on the rule 'If A2 is F2 then C is Fc With CF = cfr' and the fact e3. Then the fuzzy set representing Fc' in the conclusion is evaluated by taking the fuzzy union [13] of the fuzzy sets FZ1 and FZ2. If FZu is the fuzzy union of FZ1 and FZ2, then FZu is defined as follows:

$$\mu_{FZ_u}(x) = \mathrm{MAX}(\mu_{FZ_1}(x), \mu_{FZ_2}(x))$$

Calculation of certainty factor

This section discusses the certainty factor evaluation method. Some evaluation methods presented earlier are also summarized here. Suppose that there is a rule R and some facts E1, E2, ..., Ei:

R:  If (A1 = F1) and (A2 = F2) and ... and (Ai = Fi) Then (C = Fc) With CF = cfr
E1: A1 is F1' with CF = cf1'
E2: A2 is F2' with CF = cf2'
...
Ei: Ai is Fi' with CF = cfi'

The rule R and these facts will reach a conclusion with certainty factor cfc, which is calculated from the following equation:

$$cf_c = \mathrm{Min}\left[\mathrm{Match}(A_1 = F_1 \mid A_1 = F_1') * cf_1',\ \mathrm{Match}(A_2 = F_2 \mid A_2 = F_2') * cf_2',\ \ldots,\ \mathrm{Match}(A_i = F_i \mid A_i = F_i') * cf_i'\right] * cf_r$$

The function 'Match(A | F)' determines whether an antecedent condition A is satisfied by a fact F. If A is a nonfuzzy antecedent that is satisfied by F, 'Match(A | F)' returns 1.0 (true); if it is not satisfied by F, the function returns 0.0 (false). If both the antecedent condition and the consequent conclusion are fuzzy, 'Match(A | F)' produces 1.0 (true) because such a rule is always satisfied in SESS. If only the antecedent is fuzzy, 'Match(A | F)' outputs the similarity between the two fuzzy sets FZA and FZF, which are the fuzzy sets of A and F, respectively, because the degree of matching between A and F is considered to be equal to the degree of similarity between FZA and FZF. The evaluation method of 'Match(A | F)' is detailed in the following equation:

$$\mathrm{Match}(A \mid F) = \begin{cases} 1.0 & \text{if } A \text{ and } F \text{ are nonfuzzy and } A \text{ is satisfied by } F \\ 0.0 & \text{if } A \text{ and } F \text{ are nonfuzzy and } A \text{ is not satisfied by } F \\ 1.0 & \text{if } A \text{ and } F \text{ are fuzzy and } C \text{ is fuzzy} \\ \mathrm{Similarity}(FZ_A, FZ_F) & \text{if } A \text{ and } F \text{ are fuzzy and } C \text{ is nonfuzzy} \end{cases}$$

where FZA and FZF are the fuzzy sets of A and F, respectively.

The method for evaluating the similarity between two fuzzy concepts is discussed by Cayrol et al. [12].
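The certainty factor equation and the four cases of Match can be combined in a short sketch. This is illustrative only: the function signatures are hypothetical, and the similarity value is passed in as a number because the similarity measure of Cayrol et al. [12] is not reproduced in this appendix.

```python
def match(antecedent_is_fuzzy, consequent_is_fuzzy, satisfied=True, similarity=1.0):
    """Return Match(A | F) following the four cases given in the text."""
    if not antecedent_is_fuzzy:
        return 1.0 if satisfied else 0.0
    if consequent_is_fuzzy:
        return 1.0  # fuzzy antecedent with fuzzy conclusion: always satisfied
    return similarity  # fuzzy antecedent with nonfuzzy conclusion


def conclusion_cf(cf_rule, conditions):
    """cfc = Min[Match_k * cf_k'] * cfr, with conditions as (match, fact_cf) pairs."""
    return min(m * cf for m, cf in conditions) * cf_rule


# A rule with CF 0.9, one satisfied nonfuzzy condition and one fuzzy condition
# whose fact has similarity 0.8 and certainty factor 0.9 (made-up numbers).
conditions = [
    (match(False, False, satisfied=True), 1.0),
    (match(True, False, similarity=0.8), 0.9),
]
print(round(conclusion_cf(0.9, conditions), 3))  # 0.648
```

Because of the Min operator, the weakest condition alone determines the certainty factor of the conclusion; an unsatisfied nonfuzzy condition forces it to zero.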
APPENDIX B: HEURISTICS OF HERES

Heuristics of HERES can be classified into three categories. The first category contains control heuristics employed by HERES to determine the control flow of HERES. These heuristics decide when to terminate a local level refinement cycle and the first-order G-S refinement phase. They are used to control the refinement level of HERES. The second category contains strategic heuristics to determine whether a rule should be generalized or specialized. The third category is composed of many heuristics that propose the detailed refinement operations to be performed on a rule.

Notations

Some notations used in the following sections are presented here in alphabetical order:

• Gain(r, A) is a function that returns the actual performance gain obtained if the rule r is refined using the refinement A.
• GenDiff(r) is the minimum of the performance gain if the rule r is generalized.
• GenDisNo(A, r) yields the number of refinement examples that provide disproofs for generalization of the condition about the attribute A in the rule r.
• GenEvidNo(A, r) returns the number of refinement examples that provide evidences for generalization of the condition about the attribute A in the rule r.
• GenGain(r) is a function that evaluates the maximum of the performance gain if the rule r is generalized.
• GenIE(TargetFC) is the set of incorrect examples that provide evidences for generalizations of rules in Rules(TargetFC).
• Old-GenIE(TargetFC) is the set of incorrect examples that provide evidences for generalizations of rules in Rules(TargetFC) before these rules are modified.
• Old-SpecIE(TargetFC) is the set of incorrect examples that provide evidences for specializations of rules in Rules(TargetFC) before these rules are refined.
• SpecDiff(r) is the minimum of the performance gain if the rule r is specialized.
• SpecGain(r) is a function that yields the maximum of the performance gain if the rule r is specialized.
• SpecIE(TargetFC) is the set of incorrect examples that provide evidences for specializations of rules in Rules(TargetFC).
• TargetFC is the final conclusion whose local rules (i.e., Rules(TargetFC)) are being refined by the current local level refinement cycle.
Control heuristics

During a local level refinement cycle, it is required to calculate the best generalization and specialization. Better-Heuristics are used by HERES to determine whether one refinement is better than another. Better-Heuristics are as follows:

If    gain of refinement A to rule ri, Gain(ri, A), is greater than gain of refinement B to rule rj, Gain(rj, B)
Then  refinement A to rule ri is better than refinement B to rule rj

If    gain of refinement A to rule ri, Gain(ri, A), is equal to gain of refinement B to rule rj, Gain(rj, B) and refinement A to rule ri is simpler than refinement B to rule rj
Then  refinement A to rule ri is better than refinement B to rule rj

These heuristics use the gains and simplicity of refinements to make decisions. The first heuristic concludes that refinement A is better than refinement B if the performance gain of refinement A is larger than that of refinement B. The second heuristic says that if refinement A and refinement B have the same refinement gain, the simpler refinement will be selected. The following heuristics determine whether a refinement is simpler than the others.

If    refinement A modifies a nominal attribute
Then  simplicity of refinement A is 1

If    refinement A modifies a linear attribute
Then  simplicity of refinement A is 2

If    refinement A modifies a structural attribute
Then  simplicity of refinement A is 3

If    refinement A modifies a fuzzy attribute
Then  simplicity of refinement A is 4

If    simplicity of refinement A is smaller than simplicity of refinement B
Then  refinement A is simpler than refinement B

It is necessary for HERES to determine whether it is possible to find a better refinement. If it is impossible to find a better refinement, HERES should not attempt to discover a superior one. Imp-Heuristics are applied by HERES to make this decision. The rules in Gen-set and Spec-set are sorted in descending order using their GenGain and SpecGain, respectively. Imp-Heuristics are listed as follows:

If    for all k >= 1, Gain(rj, A) >= GenGain(rj+k)
Then  it is impossible to find a better generalization

If    for all k >= 1, Gain(rj, A) >= SpecGain(rj+k)
Then  it is impossible to find a better specialization

HERES employs Litt-Chance-Heuristics to determine whether it has little chance of finding a better refinement. If so, HERES should stop searching for a better refinement. Litt-Chance-Heuristics are:

If    Gain(rj, A) >= GenGain(rj) * 95%
Then  it has little chance of finding a better generalization

If    Gain(rj, B) >= SpecGain(rj) * 95%
Then  it has little chance of finding a better specialization

HERES terminates a local level refinement if Stop-Heuristics are satisfied. These heuristics are:

If    |GenIE(TargetFC)| + |SpecIE(TargetFC)| < 10% * (|Old-GenIE(TargetFC)| + |Old-SpecIE(TargetFC)|)
Then  stop finding more local level refinements

where |GenIE(TargetFC)| + |SpecIE(TargetFC)| is the residual number of refinement examples that still support refinements of rules in Rules(TargetFC), and |Old-GenIE(TargetFC)| + |Old-SpecIE(TargetFC)| is the original number of refinement examples that support refinements of rules in Rules(TargetFC).

If    Gain(rj, best-refinement) <= 0
Then  stop finding more local level refinements

The first heuristic declares that if more than 90% of the examples that support refinements of rules in Rules(TargetFC) are corrected, the search for more local level refinements should be terminated. The second heuristic specifies that if the best refinement has no positive refinement gain, the search for more local level refinements of rules in Rules(TargetFC) should be terminated.

Strategic heuristics

The strategic heuristics suggest that a rule should be generalized if the minimum performance gain of the rule for generalization is greater than zero. These heuristics are:

If    GenDiff(ri) > 0
Then  generalize rule ri

If    a fuzzy condition of a rule with fuzzy conclusion is generalized
Then  generalize the fuzzy conclusion too

The strategic heuristics suggest that a rule should be specialized if the minimum performance gain of the rule for specialization is greater than zero. They are:

If    SpecDiff(ri) > 0
Then  specialize rule ri

If    a fuzzy condition of a rule with fuzzy conclusion is specialized
Then  specialize the fuzzy conclusion too
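The control and strategic heuristics above are simple predicates over gains and example counts, so they can be sketched compactly. The function names and argument layouts below are hypothetical; the thresholds (95%, 10%) and the simplicity ranks are those stated in the heuristics.

```python
# Simplicity ranks from the simplicity heuristics: smaller means simpler.
SIMPLICITY = {"nominal": 1, "linear": 2, "structural": 3, "fuzzy": 4}


def is_better(gain_a, kind_a, gain_b, kind_b):
    """Better-Heuristics: larger gain wins; ties go to the simpler refinement."""
    if gain_a > gain_b:
        return True
    return gain_a == gain_b and SIMPLICITY[kind_a] < SIMPLICITY[kind_b]


def impossible_to_improve(best_gain, max_gains_of_later_rules):
    """Imp-Heuristics: no later rule in the sorted set can beat the gain found."""
    return all(best_gain >= g for g in max_gains_of_later_rules)


def little_chance(best_gain, max_gain, threshold=0.95):
    """Litt-Chance-Heuristics: the gain found is within 95% of the maximum."""
    return best_gain >= max_gain * threshold


def should_stop(residual_examples, original_examples, best_gain):
    """Stop-Heuristics: under 10% of supporting examples remain, or no positive gain."""
    return residual_examples < 0.10 * original_examples or best_gain <= 0


def strategy(gen_diff, spec_diff):
    """Strategic heuristics: act only when the minimum gain is positive."""
    actions = []
    if gen_diff > 0:
        actions.append("generalize")
    if spec_diff > 0:
        actions.append("specialize")
    return actions


print(is_better(1, "nominal", 1, "linear"))  # True
print(should_stop(1, 20, 3))                 # True
print(strategy(0, 1))                        # ['specialize']
```

For r6 in the earlier example SpecDiff is 1, so the strategic heuristics would recommend specializing it.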
Refmement heuristics H E R E S e m p l o y s various generalization techniques to create generalizations. T h e refinement heuristics for generalization are listed as follows:
1. If
Then
r~is generalized and X is the condition (A = val) that maximizes [ GenEvidNo(A, ri) - GenDisNo(A, ri) ] and A is a nominal attribute delete (A = val) from ri
This heuristic e m p l o y s the d r o p p i n g conditions to suggest that a condition ' A = val' o f attribute should be deleted if it maximizes the o f refinement gain, [ G e n E v i d N o ( A , ri) D i s N o ( A , r~) ].
2. If
Then
of rule 14 nominal estimate - Gen-
r~is generalized and X is the condition (A = vail to val2) that maximizes [ GenEvidNo(A, r~)- GenDisNo(A, ri) ] and A is a linear attribute and Gen-X is a set of examples that support generalization of X and Dis-X is a set of examples that disprove generalization of X generalize (A = vail to val2) to (A = extendrange(vail, val2, Gen-X, Dis-X, A) )
This heuristics e m p l o y s extending range o f linear attribute 14 to suggest that a condition [A = val to val2] of a linear attribute should be generalized if it maximizes the estimate of refinement gain. T h e function (extend-range) considers evidence and disproof in generalization o f the condition to find the best range for the condition.
3. If
Then
rl is generalized and X is the condition (A = literal) that maximizes [ GenEvidNo(A, r~) - GenDisNo(A, rg) ] and A is a structural attribute and Gen-X is a set of examples that support generalization of X and Dis-X is a set of examples that disprove generalization of X generalize (A = literal) to (A = Parent(literal, Gen-X, Dis-X, A))
This heuristics e m p l o y s climbing generalization tree 14 to suggest that a condition [ A = literal] of structural attribute should be generalized if it maximizes the estimate o f refinement gain. T h e function (Parent) considers evidences and disproofs in generalization of the condition to find a new value for it.
4. If    ri is generalized and
         X is the condition (A = val) that maximizes [GenEvidNo(A, ri) - GenDisNo(A, ri)] and
         A is a nominal attribute and
         Gen-X is a set of examples that support generalization of X and
         Dis-X is a set of examples that disprove generalization of X
   Then  generate a new rule which is nearly the same as ri, except that (A = val) is replaced by (A = Plausible-nominal(val, Gen-X, Dis-X, A))
This heuristic uses the technique of generating complementary rules 14 to suggest that a condition [A = val] of a nominal attribute should be generalized if it maximizes the estimate of refinement gain. The condition [A = val] should be generalized to [(A = val) or (A = Plausible-nominal(val, Gen-X, Dis-X, A))], where the function Plausible-nominal evaluates a new value for the nominal attribute A. However, internal disjunction is invalid in SESS. Thus a new rule is created which is nearly the same as ri, except that [A = val] is replaced by [A = Plausible-nominal(val, Gen-X, Dis-X, A)].
5. If    ri is generalized and
         X is the condition (A = val) that maximizes [GenEvidNo(A, ri) - GenDisNo(A, ri)] and
         (A = val) is a fuzzy condition
   Then  generalize (A = val) to (A = rather val) or (A = quite val)
This heuristic suggests that a fuzzy condition [A = val] should be generalized to [A = rather val] or [A = quite val] if it maximizes the estimate of refinement gain.
6. If    ri is generalized and
         X is the condition (A = very val) that maximizes [GenEvidNo(A, ri) - GenDisNo(A, ri)] and
         (A = very val) is a fuzzy condition
   Then  generalize (A = very val) to (A = rather val), (A = quite val) or (A = val)
This heuristic suggests that a fuzzy condition [A = very val] should be generalized to [A = rather val], [A = quite val] or [A = val] if it maximizes the estimate of refinement gain.
7. If    ri is generalized and
         X is the condition (A = quite val) that maximizes [GenEvidNo(A, ri) - GenDisNo(A, ri)] and
         (A = quite val) is a fuzzy condition
   Then  delete (A = quite val) of ri
This heuristic suggests that a fuzzy condition [A = quite val] should be deleted if it maximizes the estimate of refinement gain.
Similarly, HERES has various refinement heuristics for specialization. These heuristics are listed as follows:
8. If    ri is specialized and
         the best condition X to be modified is (A = val1 to val2) and
         A is a linear attribute and
         Spec-X is a set of examples that support specialization of X and
         Dis-X is a set of examples that disprove specialization of X
   Then  modify (A = val1 to val2) to (A = new-range(val1, val2, Spec-X, Dis-X, A))
This heuristic suggests that if a condition [A = val1 to val2] of a linear attribute is the best candidate for specialization, a more restricted range should be used to form a new condition. The function new-range evaluates the new range.
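A possible reading of the range restriction is sketched below. The behaviour of new-range is not defined in this section, so everything here is an assumption: the name `new_range`, the treatment of Spec-X and Dis-X as lists of values of A, and the policy of shrinking the range to the span of the values that must stay covered.

```python
def new_range(val1, val2, spec_x, dis_x):
    """Hypothetical stand-in for new-range.

    spec_x: values of A in examples that support specialization (assumed to be
            cases the condition should stop covering).
    dis_x:  values of A in examples that disprove specialization (assumed to be
            cases the condition must keep covering).
    """
    keep = [v for v in dis_x if val1 <= v <= val2]
    if not keep:
        return val1, val2  # nothing anchors a narrower range; leave it unchanged
    low, high = min(keep), max(keep)
    # Shrink only if that actually excludes at least one Spec-X value.
    excluded = [v for v in spec_x if val1 <= v <= val2 and not (low <= v <= high)]
    return (low, high) if excluded else (val1, val2)


# Condition (A = 10 to 20) with made-up example values.
print(new_range(10, 20, spec_x=[11, 19.5], dis_x=[13, 15, 18]))  # (13, 18)
```

A less aggressive policy, for instance cutting just past the offending Spec-X values, would also fit the description in the text.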
9. If    ri is specialized and
         the best condition X to be modified is (A = literal) and
         A is a structural attribute and
         Spec-X is a set of examples that support specialization of X and
         Dis-X is a set of examples that disprove specialization of X
   Then  change (A = literal) to (A = children(literal, Spec-X, Dis-X, A))
This heuristic suggests that if a condition [A = literal] of a structural attribute is the best candidate for specialization, it should be changed to a new value. The function children finds a less general value for this structural attribute.
10. If   ri is specialized and
         Spec-X is a set of examples that support specialization of ri and
         Dis-X is a set of examples that disprove specialization of ri
    Then add a condition obtained by find-condition(Spec-X, Dis-X) to ri
This heuristic advises specialization by adding a new condition to ri. A plausible condition to be appended to ri is evaluated by the function find-condition.
11. If   ri is specialized and
         X is the best condition (A = val) for specialization and
         (A = val) is a fuzzy condition
    Then specialize (A = val) to (A = very val)
This heuristic suggests that a fuzzy condition [A = val] should be specialized to [A = very val] if it is the best condition to be specialized.
12. If   ri is specialized and
         X is the best condition (A = quite val) for specialization and
         (A = quite val) is a fuzzy condition
    Then specialize (A = quite val) to (A = val) or (A = very val)
This heuristic suggests that a fuzzy condition [A = quite val] should be specialized to [A = val] or [A = very val] if it is the best condition to be specialized.
13. If   ri is specialized and
         X is the best condition (A = rather val) for specialization and
         (A = rather val) is a fuzzy condition
    Then change (A = rather val) to (A = val) or (A = very val)
This heuristic suggests that a fuzzy condition [A = rather val] should be specialized to [A = val] or [A = very val] if it is the best condition to be specialized.
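The hedge moves prescribed by heuristics 5 to 7 and 11 to 13 can be collected into two lookup tables, as in the sketch below. The tables restate the heuristics as given; the function name `refine_fuzzy`, the string form of the conditions and the attribute used in the example are hypothetical.

```python
# Hedge moves restated from heuristics 5-7 (generalization) and 11-13
# (specialization). "" stands for the bare value; None means "delete the condition".
GENERALIZE = {
    "": ["rather", "quite"],          # heuristic 5
    "very": ["rather", "quite", ""],  # heuristic 6
    "quite": [None],                  # heuristic 7
}
SPECIALIZE = {
    "": ["very"],                     # heuristic 11
    "quite": ["", "very"],            # heuristic 12
    "rather": ["", "very"],           # heuristic 13
}


def refine_fuzzy(attribute, hedge, value, direction):
    """List candidate replacement conditions as strings; None marks a deletion."""
    table = GENERALIZE if direction == "generalize" else SPECIALIZE
    candidates = []
    for new_hedge in table.get(hedge, []):
        if new_hedge is None:
            candidates.append(None)
        else:
            candidates.append(f"{attribute} = {(new_hedge + ' ' + value).strip()}")
    return candidates


print(refine_fuzzy("temperature", "very", "high", "generalize"))
print(refine_fuzzy("temperature", "rather", "high", "specialize"))
```

A hedge with no listed move in a given direction, such as generalizing [A = rather val], yields an empty candidate list in this sketch.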
Knowledge-Based Systems