Diagnostic expert system inference engine based on the certainty factors model S Tzafestas, L Palios* and F Cholin*
An expert system inference engine is described which is based on the utilization of certainty factors, and has a structure similar to Naylor's probabilistic inference engine. After a brief description of the latter, a modified probabilistic criterion is developed which naturally leads to the certainty factors question selection criterion employed in the inference engine. Also, the case in which there is user reporting bias is considered, and a method for dealing with it is presented. The paper discusses the required aspects of the model of certainty factors, and presents the major implementation issues relating to the new inference engine, which is written in PASCAL.
Keywords: diagnostic expert systems, probabilistic models, certainty factors models, inference engines, reporting bias, combining function tables
Recently, the artificial intelligence field has arrived at the point where some of its applications have seen important practical results 1-3. The majority of these results are due to the development and use of knowledge-based systems 4. Presently, expert systems are being applied in a diversity of areas, from medical diagnosis and DNA experiments to weather forecasting and system supervision and control 5,6. The development of expert systems is greatly facilitated by the use of expert system shells (tools or cores), i.e. expert systems from which the knowledge base has been removed. An expert system tool consists of an inference engine, an explanation component, and a
Department of Electrical and Computer Engineering, National Technical University of Athens, Zografou, Athens, Greece
*Currently with the Department of Computer Science, Princeton University, Princeton, NJ 08544-2087, USA
*The contribution of F Cholin (IDN, France) was made during his stay at the NTUA under the framework of the ERASMUS EEC programme
Paper received July 1988. Revised paper received 20 August 1992. Accepted 2 September 1992
users' interaction component. Currently available expert system tools and languages include AGE, ARS, EMYCIN, EXPERT, KMS, OPS, KES, KEE etc. 6. Our purpose in this paper is to describe an expert system inference engine which works with uncertain data. An event (fact, proposition, rule etc.) is said to be uncertain if its validity cannot be determined with full certainty. Actually, there exist many models that characterize uncertainty, the main ones being the probabilistic model (Bayes), the relevance model (Kadesch), the fuzzy sets model (Zadeh), the evidence model (Dempster and Shafer) and the certainty factors model (Shortliffe) 2,7-10. All these models provide suitable schemes for combining pieces of evidence about a certain hypothesis (disease, fault, system model etc.) gathered from different observations, and thus update the knowledge about this hypothesis. The second section provides a short review of Naylor's rule-value-based inference engine, and the third section outlines our modification of this inference engine 11,12. Then, the fourth section discusses the model of certainty factors which is employed in our system. The inference engine is described in the fifth section. The sixth discusses the situation in which the user's answers contain some bias and provides a method for dealing with it. Finally, the seventh section presents a brief outline of the inference engine's PASCAL implementation, and the eighth section contains some concluding remarks and issues for future work.
REVIEW OF PROBABILISTIC INFERENCE ENGINE
The probabilistic inference engine of Naylor 11 is neither of the forward chaining nor the backward chaining type. Rather, it concentrates on the evidence and gives to each evidence element a value, the rule value (RV), which drives the inference process. At each step, the system asks the question which is associated with the highest rule value.
Thus, in a sense, the inference mechanism has a two-way (bidirectional) chaining nature. The value RV of a question is defined as the total sum of the maximum probability changes occurring in all the hypotheses of the database as a result of the question, i.e.

RV(E) = Σi max{ |P(Hi|E) - P(Hi)|, |P(Hi|not E) - P(Hi)| }

where E is the evidence element (question) at hand, and Hi is the ith hypothesis. Some alternative ways of computing RV can be found in Reference 11. In this way, the system finds a value RV for each evidence element of the database, and searches for the one that has the maximum RV in order to ask about it. Clearly, the above values are not static. As the a posteriori probabilities P(Hi|E) are updated, in the course of reasoning, the rule values change, modifying accordingly the importance of the evidence elements. The inference is based on the following classification of the hypotheses:

• Most probable hypothesis: This is the hypothesis of which the minimum possible probability is greater than the maximum possible probabilities of all other hypotheses. Once we have identified the most probable hypothesis, the question-answering session with the user can be ended.
• Probable hypotheses: These are hypotheses with minimum possible probabilities that are higher than the confirmation threshold, i.e. higher than the probability value over which a hypothesis is accepted.
• Uncertain hypotheses: These are hypotheses with minimum possible probabilities lower than the confirmation threshold, but with maximum probabilities over the rejection threshold, i.e. over the probability value below which a hypothesis is not accepted.
• Impossible hypotheses: These are the remaining hypotheses (i.e. those not falling into the three categories above) which have maximum possible probabilities below the rejection threshold. These hypotheses must be rejected as soon as possible in order to reduce the search space.
From a practical point of view the expert system will give as the result (solution) the most probable hypothesis if it can find it. Otherwise, the second category may satisfy our requirements (to a smaller extent, of course). The principal advantages of the rule value method are as follows:

• It can be used easily.
• It has a relatively good criterion for choosing the next question (the rule value criterion).
Some disadvantages of the method are as follows:

• The order of the items of evidence (questions/answers) cannot be interchanged. We have verified experimentally that in most cases such an interchange leads to significantly different probabilities (although the initial probabilities and answers were kept the same).
• The probability of an evidence item is not constant if we consult the various hypotheses in which it is observed.
• The rule value technique assumes the independence of the evidence elements, something that is very common in the inference engines that are based on the probabilistic model, although it is far from being true in practice. Furthermore, Johnson 13 has recently shown that this assumption, in conjunction with other typical assumptions of the probabilistic model (exhaustiveness and mutual exclusion of the hypotheses), precludes multiple updating of any of the hypotheses.

It is useful to mention at this point the work of Pearl 14 which is concerned with so-called belief networks, and studies the fusion and propagation of new information through such networks, also providing a tree-structured representation for a collection of probabilistically coupled propositions using dummy variables. An inference engine based on these and other ideas is under development.
A more detailed discussion of the above aspects can be found in Reference 12.
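To make the rule value criterion concrete, a minimal Pascal sketch is given below; the constant, type and field names are illustrative assumptions and do not come from Naylor's implementation or ours.

const
  MaxHyp = 50;
type
  HypProbs = record
    prior, pGivenE, pGivenNotE : real   { P(H), P(H|E), P(H|not E) }
  end;
  HypArray = array[1..MaxHyp] of HypProbs;

{ Rule value of a question E: the sum, over all hypotheses, of the }
{ largest probability change that an answer about E could cause.   }
function RuleValue(var h : HypArray; n : integer) : real;
var
  i : integer;
  up, down, rv : real;
begin
  rv := 0;
  for i := 1 to n do
  begin
    up := abs(h[i].pGivenE - h[i].prior);
    down := abs(h[i].pGivenNotE - h[i].prior);
    if up > down then rv := rv + up else rv := rv + down
  end;
  RuleValue := rv
end;

The engine would evaluate RuleValue for every evidence element that has not yet been asked about and then select the question with the largest value.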
MODIFICATION OF RULE VALUE INFERENCE TECHNIQUE
The purpose of this section is to outline a modified form of the rule value technique 12 which has inspired the certainty-value-based inference technique of the paper. The main modification concerns the criterion of selecting the next question. Suppose that some hypothesis appears, at some instant of time, to have a probability that approaches 1. Then a reasonable action would be to try to ensure the required supporting information that will lead to the confirmation of the hypothesis. However, even if no hypothesis possesses a relatively large probability, the attempt to confirm the most probable of them is, no doubt, a good practice. A drawback of this criterion, which is common to all methods that search for more than one (candidate) cause of a fault, is the fact that it is possible that, even if all existing causes have been detected, the system will continue to ask questions, since the confirmation of some hypotheses is not sufficient to terminate the diagnosis procedure. On the contrary, it is required to reject all the other hypotheses. The drastic increase of the probabilities of the final results, in combination with the criterion of rejecting the non-probable hypotheses (as described below), leads to a substantial reduction in the indifferent (neutral) hypotheses, and thus to a faster arrival at a conclusion. This criterion is implemented in two stages, as follows.
Selection of hypothesis
We select the hypothesis which contains items of evidence that have not yet been requested and also maximizes the value of the numerical criterion, i.e. of the current probability P(·) or the combination P(·) + k max P(·) if the maximum possible probability is also weighted (k is an arbitrarily defined positive constant).
Selection of corresponding item of evidence
Suppose that H is the hypothesis selected at the previous stage. The greatest value that the current probability of H can assume is max{P(H|Ei), P(H|not Ei)} when evidence about Ei is accumulated. As our attempt is to increase P(H) as much as possible, we will select, among all the evidence elements that affect H, the one that maximizes the above quantity. Simple mathematical manipulation leads to the following statement of our criterion. Select the piece of evidence that maximizes the ratio

A(E) = P(E|H)/P(E|not H)                     if P(E|H) >= P(E|not H)
A(E) = [1 - P(E|H)]/[1 - P(E|not H)]         if P(E|H) <= P(E|not H)

In this way we favour the large probability values if P(E|H) > P(E|not H) and the small ones in the opposite case. A better criterion, however, could be obtained by including the probability of the corresponding evidence element. This can be done by defining the quantity to be maximized as

A(E)·{1 + a·P(~E)}

where a is a suitable positive coefficient and

P(~E) = P(E)         if P(E|H) >= P(E|not H)
P(~E) = P(not E)     if P(E|H) <= P(E|not H)

MODEL OF CERTAINTY FACTORS
This model was developed by Shortliffe and it was used for the first time in the medical expert system MYCIN 10. In this model, to each hypothesis H and given item of evidence E we assign a measure of belief MB(H|E) and a measure of disbelief MD(H|E), which are then used to compute the certainty factor

CF(H|E) = MB(H|E) - MD(H|E)                                        (1)

of the hypothesis H given the evidence E. Since 0 <= MB(H|E) <= 1 and 0 <= MD(H|E) <= 1, the certainty factor CF varies from -1 (full certainty that H is not valid) up to +1 (full certainty that H is true). The value CF = 0 expresses full ignorance about H. To compute the certainty factor of the 'conjunction' and 'disjunction' of two hypotheses H1 and H2 we use the formulas

MB(H1 AND H2|E) = min{MB(H1|E), MB(H2|E)}
MB(H1 OR H2|E)  = max{MB(H1|E), MB(H2|E)}
MD(H1 AND H2|E) = max{MD(H1|E), MD(H2|E)}
MD(H1 OR H2|E)  = min{MD(H1|E), MD(H2|E)}

If the total certainty factor of the premises (assumptions) of a rule is CFp, and the certainty factor of the rule itself is CFr, then the certainty factor CFc of the conclusion drawn from this rule is given by the product

CFc = CFr·max{0, CFp}                                              (2)

Now, let CF be the certainty factor of a hypothesis, computed on the basis of all previously available information, and CFc be the certainty factor of the same hypothesis resulting from the current evidence. Then the certainty factor CF' that combines the information of both CF and CFc is found to be

CF' = CF + CFc(1 - |CF|)                      if CF·CFc >= 0
CF' = (CF + CFc)/(1 - min{|CF|, |CFc|})       if -1 < CF·CFc < 0    (3)
CF' = undefined                               if CF·CFc = -1

The certainty factor updating scheme is shown in Figure 1, and it has the following important properties:

• Its branches are continuous and smooth at the matching points (i.e. for CF = 0 and CFc = 0).
• For a given CF, the updating function is monotonically increasing with respect to the variable CFc.
• It is commutative with respect to the evidence elements, i.e. the resulting value of the certainty factor is independent of the order in which the contribution of each evidence element has been taken into account 10.

INFERENCE ENGINE MODEL
Our inference engine model is the result of combining the question selection philosophy of the third section with the certainty factors model. This model, which does not possess the drawbacks of the rule value inference model, has been implemented in PASCAL.
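Before turning to the engine's components, the updating relations just given can be illustrated with a minimal Pascal sketch of Equations 2 and 3; the function names are ours and are not taken from the actual implementation.

{ Equation 2: certainty factor of the conclusion of a rule }
function ConcludeCF(cfRule, cfPremise : real) : real;
begin
  if cfPremise > 0 then
    ConcludeCF := cfRule * cfPremise   { CFc = CFr * max(0, CFp) }
  else
    ConcludeCF := 0
end;

{ Equation 3: combine the previous CF of a hypothesis with the }
{ contribution CFc of the current evidence.                    }
function CombineCF(cf, cfc : real) : real;
var
  smaller, denom : real;
begin
  if cf * cfc >= 0 then
    CombineCF := cf + cfc * (1 - abs(cf))
  else
  begin
    if abs(cf) < abs(cfc) then smaller := abs(cf) else smaller := abs(cfc);
    denom := 1 - smaller;              { zero only in the undefined case CF·CFc = -1 }
    CombineCF := (cf + cfc) / denom
  end
end;

For example, CombineCF(0.3, 0.56) = 0.3 + 0.56(1 - 0.3) = 0.692, in agreement with the updating scheme of Figure 1.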
Knowledge representation
For convenience, in our PASCAL implementation, instead of a unique data file for the illnesses and the symptoms, we used two separate files: ILLNESSES and SYMPTOMS, which contain information exclusively for the hypotheses (illnesses) and the evidence (symptoms), respectively. The data record for the evidence has the form (RN of evidence), (name of evidence), #, (question). The character # is used to separate the name of the evidence element from the question, since PASCAL is not particularly effective in the treatment of symbol strings. The reason for using this character is to enable the system to print the names of the evidence items, whenever required by the user. The data record of the hypothesis has the form (description of hypothesis)
(CF of hypothesis), ((RN of evidence), (CF of rule))
i.e. it occupies two lines. The first line contains only the description of the hypothesis. The second line contains the a priori certainty factor of the hypothesis, and pairs of numbers, each pair consisting of the running index of the evidence element and a certainty factor which expresses our belief that the hypothesis is valid given that the corresponding evidence element (rule) has been observed.

Figure 1 Updating scheme of certainty factors: (a) CF > 0; (b) CF < 0
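For illustration only, the two record forms just described might be declared roughly as follows in Pascal; the field names and string lengths are assumptions and not the declarations actually used.

const
  MaxRules = 20;
type
  EvidenceRecord = record
    rn       : integer;      { running number (RN) of the evidence element }
    name     : string[40];   { name of the evidence element }
    question : string[80]    { question put to the user }
  end;
  RulePair = record
    evidenceRN : integer;    { RN of the evidence element observed }
    cfRule     : real        { CF of the rule linking it to the hypothesis }
  end;
  HypothesisRecord = record
    description : string[60];                     { first line of the record }
    cfPrior     : real;                           { a priori CF of the hypothesis }
    nRules      : integer;
    rules       : array[1..MaxRules] of RulePair  { second line: (RN, CF) pairs }
  end;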
Criterion for selecting next question
By analogy with the third section, we again distinguish two stages:

• Selection of hypothesis: From the set of hypotheses which contain items of evidence that have not yet been consulted, we select the one that has the maximum certainty factor CF(·) or the one that maximizes the combination CF(·) + k max CF(·), if we wish to take into account the effect of the maximum possible certainty factor max CF.
• Selection of item of evidence: We wish to select the item of evidence which, in the case of supporting information, will produce the maximum increase of our certainty about the hypothesis selected above (in the first stage). Now, since the certainty factor CFp = CR of the user response lies in the interval -1 <= CR <= +1, we conclude from Equation 2 that

-|CFr| <= CFc <= |CFr|                                             (4)

Then, for some preselected hypothesis, it follows from the monotonous increasing property of the updating function that, for this particular response of the user, a greater increase in the hypothesis certainty will be incurred by the evidence element that corresponds to the absolutely greater certainty factor. This is shown in Figure 2.

As one can see from Figure 2, the above choice implies that evidence against a hypothesis will lead to the greatest possible reduction of the certainty factor of this hypothesis. This results in a faster rejection of the less probable hypotheses and an increase of 'jitters' in the system. Of course our criterion for selecting the evidence element (question) which will be asked next is static (since the static knowledge of the certainty factors of the rules is dominating), but the selection of the hypothesis is dynamic, i.e. it depends on the results of only the current session. Thus it is believed that the combined criterion has satisfactory applicability.

Computation of maximum and minimum certainty factors
From Figure 1 it is seen that the updating relationships are monotonically increasing functions of the certainty factor of the current evidence element (CFc). This implies that the estimation of the maximum and minimum certainty factors that are possible for a hypothesis must be computed by assuming that the contribution CFc of the corresponding evidence element is totally supporting or not. However, from Equation 2, we conclude that CFc ranges between 0 and CFr for the various answers of the user. As a result,

CFc = max{0, CFr}

and this is obtained when the certainty factor CR of the user's reply is equal to

CR = CFp = sgn(CFr)

where sgn(·) is the signum function. Similarly, evidence totally against a hypothesis corresponds to

CFc = min{0, CFr}

and is obtained for

CR = CFp = -sgn(CFr)
Thus, to compute the maximum and minimum possible certainty factors of a hypothesis we start from the most recently updated value of its current certainty factor and we take into account the positive or negative contribution, respectively, of the evidence elements that have not yet been consulted. We can see that the updating of the certainty factors, and the above computation of the maximum and minimum certainty factors, cannot be executed in parallel in only one run (consultation) of the hypothesis record. To avoid the double reading of the hypothesis database, and given that each one of its records is independent of the previous and the next ones, only one solution seems feasible: to store temporarily those parts of the current records that we need, and to reprocess them when the value of the current certainty factor for this phase has been determined. This storage can be done in an array (table) or in a list. We preferred to use the list, both for its dynamic nature and its efficient handling in PASCAL. Of course, the utilization of the list structure makes the software more complex and thus leads to increased execution time. However, this is very small in comparison with the gain obtained by not reading the hypothesis file twice. This gain is particularly important in large files that we usually have in practical problems.

Figure 2 Increase in hypothesis certainty: (a) CF > 0; (b) CF < 0
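The computation just described can be sketched in Pascal as follows, reusing the CombineCF function sketched earlier for Equation 3; the list handling of the actual implementation is replaced here, purely for illustration, by a fixed array holding the rule certainty factors of the evidence elements not yet consulted.

const
  MaxPending = 50;
type
  PendingCFs = array[1..MaxPending] of real;   { CFr of unconsulted evidence }

procedure MaxMinCF(currentCF : real; var pending : PendingCFs; nPending : integer;
                   var maxCF, minCF : real);
var
  i : integer;
begin
  maxCF := currentCF;
  minCF := currentCF;
  for i := 1 to nPending do
    if pending[i] > 0 then
      maxCF := CombineCF(maxCF, pending[i])    { totally supporting: CFc = max(0, CFr) }
    else
      minCF := CombineCF(minCF, pending[i])    { totally against: CFc = min(0, CFr) }
end;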
Normalization of certainty factors
To treat all the hypotheses on an equal basis, we require the following conditions to be met for each one of them at the beginning of the inference procedure:

• The maximum possible certainty factor is +1.
• The minimum possible certainty factor is -1.

Thus, if the initial estimates of the maximum and minimum certainty factors (possible) of a hypothesis with a running index h are max CF{h} and min CF{h}, respectively, we must use the following linear normalizing transformation:

CF'{h} = [2CF{h} - D1{h}]/D2{h}

where

D1{h} = max CF{h} + min CF{h}
D2{h} = max CF{h} - min CF{h}

Clearly, for CF = max CF (CF = min CF) we obtain CF' = +1 (CF' = -1) as required. The inverse transformation is

CF{h} = [D2{h}·CF'{h} + D1{h}]/2

Thus, in order to be able to execute this transformation, we keep two arrays with dimensionalities of the maximum number of hypotheses, namely the arrays base and diff:

base{h} = max CF{h} + min CF{h}
diff{h} = max CF{h} - min CF{h}

where max CF and min CF are the initial values of the maximum and minimum certainty factors of h (computed or estimated using the information of the knowledge base). Three more arrays are needed, namely cf, maxcf and mincf for the respective normalized values. Finally, it is noted that CF'{h} needs the division by the quantity D2{h}, for which

D2{h} = max CF{h} - min CF{h} >= 0

The equality holds if and only if the certainty factors CFr for all the items of evidence in the hypothesis record are zero. This case is handled by setting CF'{h} = min CF{h} and neutralizing the hypothesis (STATUS{h} = -5) since any relevant information will have no effect on the value of the certainty factor.
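As an illustration, the normalizing transformation and its inverse might be coded as follows; only the array names base and diff follow the text, the rest is assumed, and the caller is expected to have handled the case diff{h} = 0 as described above.

const
  MaxHyp = 100;
type
  HypRange = 1..MaxHyp;
var
  base, diff : array[HypRange] of real;   { base = maxCF + minCF, diff = maxCF - minCF }

function NormalizeCF(h : HypRange; cf : real) : real;
begin
  NormalizeCF := (2 * cf - base[h]) / diff[h]        { CF' = (2CF - D1)/D2 }
end;

function DenormalizeCF(h : HypRange; cfNorm : real) : real;
begin
  DenormalizeCF := (diff[h] * cfNorm + base[h]) / 2  { CF = (D2·CF' + D1)/2 }
end;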
TREATMENT OF USER REPORTING BIAS
Actually, we are not sure that the user's answers are correct, and the certainty factors model allows us to deal
with uncertainty, but not with reporting bias. The user is a human and his (her) replies may contain some unavoidable bias. Our purpose in this section is to show how the reporting bias leads to a wrong diagnosis, and how our inference engine can be enhanced to take this fact into account. Suppose again that:

• CF is the certainty factor of some hypothesis.
• CFp is the certainty factor of the user's reply about the symptom (premise).
• CFr is the certainty factor of the rule.
When updating the database, all hypotheses dealing with the given symptom are updated. Their new certainty factors CF' are computed using CF, CFp and CFr with the aid of Equations 2 and 3. The bias occurs in the user's answers. For example, if the user reports CFp = 0.7 instead of CFp = 0.2, the inferred certainty factor CF' of the hypothesis is increased erroneously. In many situations, this bias may lead to wrong diagnosis or decision, and so CFp has to be appropriately corrected.
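For instance, assuming illustrative values CF = 0.3 for the hypothesis and CFr = 0.8 for the rule, Equations 2 and 3 give CFc = 0.8·0.7 = 0.56 and CF' = 0.3 + 0.56(1 - 0.3) = 0.69 for the biased reply, but only CFc = 0.8·0.2 = 0.16 and CF' = 0.3 + 0.16(1 - 0.3) = 0.41 for the correct one; the hypothesis is strengthened considerably more than it should be.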
Theoretical background
Consider the case where the reply scale of the user is from -5 (no) to +5 (yes). Each user has an individual interpretation of the scale [-5, 5], and many users say more easily a 'little yes' (e.g. a value of 2 in the above scale) than 'I don't know' (i.e. the value 0). Thus it can be said that the user's scale is nonlinear. Therefore, when the correct answer is CFt, the user gives CFt + b, where the real quantity b represents the bias (positive or negative). In general, denoting the function CFt + b as f(CFt), we have

CFb = f(CFt)                                                       (5)

where CFt is the true (correct) certainty factor and CFb is the biased certainty factor, i.e. the certainty factor that the user gives instead of CFt. Clearly, if the function* f(·) is known, then one can correct the input as

CFt = g(CFb)                                                       (6)

where g(·) is the inverse of f(·). This is equivalent to building an anticipatory system about the user's behaviour. Actually, the function f(·) can be regarded as a membership function characterizing a fuzzy event with a given CFt. A comprehensive discussion of anticipatory systems can be found in Reference 15. Our problem here is to build the user's model. To determine the function f one needs to know CFb and CFt. However, in actual practice, CFt is not known (even after the diagnosis is completed). Therefore, at the first stage, we have to collect information about the user to determine the function, and then enhance the inference engine so as to use this knowledge.

*Here the function f(·) is just an additive bias function f(·) = (·) + b, but our technique is also valid when the bias depends on CFt nonlinearly.

Combining function table
When accumulating knowledge one needs two inputs, CFb and CFt, and, as output, the confidence in the assertion 'CFb means CFt'. Then, one has to use this knowledge to determine the most likely output CFt when the biased input CFb is given. This way of combining evidence is represented as a table which specifies degrees of belief for each combination (pair) of evidence items. A tabular function that combines two pieces of evidence E1 and E2 for the conclusion C (E1 = CFb, E2 = CFt, and C = degree of belief in the rule E1 → E2) is as follows, where each column corresponds to a value of E1 = CFb and lists the degrees of belief over the values of E2 = CFt:

E1 = CFb:            -2               -1          0                1
degrees of belief:   0.5, 0.4, 0.1    0.2, 0.8    0.1, 0.2, 0.7    0.1, 0.9

Here, the degree of belief in the couple (CFb, CFt) = (-2, 1) is 0.4, the degree of belief in (CFb, CFt) = (-1, -1) is 0.8, and so on. For a detailed study of combining function tables the reader is referred to Reference 16. We shall now give a way of building this table, which will give the membership function for each event 'CFb = ...'.

Algorithm for building combining function table
The goal is to create a table that shows how the bias occurs. To this end, one needs (a) the user's answers CFb, and (b) the right answers CFt. Thus, to build the table we have written a small program that asks the user questions for which the right answer can be (independently) computed. The table is built by entering the degrees of belief (or probability) that each CFb represents CFt. Schematically, the construction of the combining function table for the assertion 'there is a piece of evidence that CFb means CFt' is shown in Figure 3.

Figure 3 Construction of combining function table

The PASCAL-like definition

type
  couple = record number, prob : real end;
  column = array of couple;
  table  = array of column;

was used, so that, when updating the table with a couple (CFb, CFt), we have only to increase the corresponding number of evidence Ni (table[no of column][no of couple]). By computing the total number N of pieces of evidence for each column, we compute the probabilities (degrees of belief) Pi as Pi = Ni/N. Let us consider an example of mapping the user's behaviour when the system computes a temperature in
the interval 0° < T < 100° and asks the user about it. The user knows that the desired output is 50° and replies with a scale -5 (too low) to +5 (too high). In this simple test an 11 × 11 combining function table was obtained, relating the user's reply CFb = -5, ..., +5 to the correct reply CFt = -5, ..., +5 (for example, the CFb = -5 column contains the degrees of belief 0.57 and 0.43).
It is easy to see in this table that the bias occurs when the user does not know. The PASCAL procedure for updating the combining function table is given in the Appendix. One may improve this program by adding new questions to explore a wider knowledge field of the user. O f course many other updating schemes can be employed. The present scheme is fairly simple to interpret and takes into account all previous knowledge.
Figure 4 Updating combining function table
How to use combining function table
When using the table one only has available CFb. The table provides the most likely value CFc for CFt, i.e. the CFt value that corresponds to the highest probability C. When there are several maxima, the latest one is taken. The PASCAL procedure for obtaining this is given in the Appendix. As already mentioned, the purpose of this method is to collect knowledge about the user's behaviour. The following procedure is proposed when asking the user for information:

• Obtain the biased answer CFb.
• Use the table to obtain the corrected certainty factor CFc.
• Compute the hypotheses certainty factors or perform the required calculations using, separately, CFb and CFc.
• If CFb and CFc give different results Rb and Rc, ask the user to choose between Rb and Rc.
• Update the table using the user's choice between Rb and Rc. This means that, if the user has selected Rc, there is a new piece of evidence that 'CFb means CFc'; otherwise, there is a new piece of evidence that 'CFb means CFb' (i.e. no bias).

Figure 5 Structure of overall inference engine enhanced with bias treatment facility

The above scheme has the flow diagram shown in Figure 4. From a structural point of view, our certainty-factors-based inference engine is enhanced with a module (here called the blackboard module) for the treatment of the user's biased answers (see Figure 5). The blackboard is
filled in with the hypotheses that correspond to the current symptom. For each hypothesis we compute its new CF with either CFb or CFc. Then the user can choose between CFb and CFc by observing their effect on the database. Of course he (she) can also give another value for CFb, and the blackboard is updated. Actually, the user uses the board as a tool to see the effect of his (her) response on the database, and learn about his (her) previous behaviour by seeing both CFb and CFc. The behaviour table is loaded from the memory.
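A minimal Pascal sketch of this step is given below; it relies on the Appendix routines Update_Table and Get_Table_Answer, while UserChoosesCorrected and the display of the two effects are illustrative placeholders rather than parts of the actual implementation.

{ Treat one (possibly biased) answer cf_b about the current symptom. }
procedure TreatBiasedAnswer(var behaviour : table; cf_b : certainty_factor);
var
  cf_c : certainty_factor;
begin
  cf_c := cf_b;
  Get_Table_Answer(behaviour, cf_c);      { most likely CFt for this CFb }
  { ... compute and show the hypothesis updates obtained with cf_b and with cf_c ... }
  if UserChoosesCorrected(cf_b, cf_c) then
    Update_Table(behaviour, cf_b, cf_c)   { new evidence: 'CFb means CFc' }
  else
    Update_Table(behaviour, cf_b, cf_b)   { new evidence: no bias }
end;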
PASCAL IMPLEMENTATION
In order to study the behaviour of the present expert system kernel and see how well it matches the theoretically expected behaviour, we implemented it in PASCAL and compared it with Naylor's expert system. To this end, we used Naylor's data 11 appropriately converted to suit our certainty factors model. In particular, owing to the lack of the relevant transformation, we (arbitrarily) used as certainty factors of the hypotheses the corresponding probabilities of Naylor's database. Note that the trivial transformation CF = 2P - 1 would lead to negative certainty factors and thus to a wrong interpretation of the data. Actually all hypotheses are probable, even with small probability, and thus they must be assigned positive (although small) certainty factors. Another possibility would be to omit this field from the hypotheses records, and allocate to all hypotheses an initial certainty factor of zero, expressing in this way our ignorance. However, such a case means that we do not take into account some information quantity which could be useful for our decision. Regarding the certainty factors that express the dependence of the hypothesis on the evidence element, we used standard probability expressions for MB(H|E), MD(H|E) and CF(H|E), together with the conditional probabilities contained in Naylor's probabilistic database. Two small data files for the hypotheses (ILLNESSES) and the items of evidence (SYMPTOMS) are shown in Figures 6 and 7. Because of space limitations it is not possible to include here the full PASCAL program of our expert system tool. The empirical results verified our expectations about the superiority of our inference engine. Not only did it always correctly deduce the underlying illness (unlike Naylor's system), thanks to the normalization technique, but we have also observed:

• an up to 40% saving in the number of questions,
• a 2.5-23.0% loss in the number of executed instructions,
• a 25.5-44.2% gain in the execution time.

Several tests with user bias were also performed with success.

Figure 6 Hypotheses file ILLNESSES
CO~ON COLD 0.02 I 0.25373 2 0.4382 5 0.54128 6 0.66443 7 0.27536 8 0.49495 15 0.6124 34 -I, 999 ALLERGIC R H I N ] T I S 0.01 1 0.49749 2 0.49749 6 0.4709 I0 0.40828 ii 0.40828 12 0 . 3 7 1 0 7 20 0.4709 999 PHARYNGITIS 0.02 3 0,66443 16 0,64029 8 0.49495 ii 0,64029 37 0.03226 38 0.4382 999 TONSILLITIS 0.001 3 0.09008 7 0.08173 15 0.09008 19 -l. 8 0.07322 38 0 . 0 7 3 2 2 999 INFLUENZA 0.01 3 0.4709 i 0.4709 6 0.32886 7 0.40828 8 0.49749 15 0 . 4 9 7 4 9 17 0 . 4 4 1 3 4 18 0.37107 34 -I. 999 LARYNGITIS 0.0l 4 0.49749 8 0.37107 15 0.03846 16 0.40828 37 0.01639 999 T U M O U R OF THE L A R Y N X 0.00004 4 0.00394 34 0.0039 37 0.00007 999 ACLrfE BRONCHITIS 0.005 5 0.04306 8 0.3311 12 0,3311 15 0.3311 18 0.19679 21 0.3311 31 0.30796 34 -i. 22 0.30796 999 CHRONIC BRONCHITIS 0.005 5 0.3311 12 0.30796 14 0.19679 21 0,3311 22 0.28315 34 0 . 3 3 1 1 36 0 . 3 0 7 9 6 37 0 . 0 0 8 2 6 999 ASTHMA 0.02 12 0 . 6 1 2 4 22 0 . 6 6 4 4 3 23 0 . 4 9 4 9 5 24 0 . 4 9 4 9 5 25 0.49495 26 0,49495 31 0.6124 999 EMPHYSEMA 0.01 22 0.49749 5 -0.89909 26 0.44134 12 -0.89909 21 -0.89909 37 0.01639 999 PNEUMONIA 0.003 8 0.22899 15 0.22899 18 0.19159 22 0.22899 23 0.12816 26 0.12816 28 0.02629 29 0,00299 27 0.05393 31 0.21073 36 - 0 . 8 8 8 5 9 7 0.21073 17 0.21073 32 0.59952 999 PLEURISY 0.001 31 0.07322 32 0.07322 22 0.04671 5 0.07322 8 0.08173 15 0 . 0 9 0 0 8 34 -I, 999 PNEUMONOTHORAX 0.0002 18 0.01555 22 0.01555 32 0.01555 999 BRONCHIECTASIS 0.00001 21 0.00099 27 0.00049 5 0,00099 14 0.00049 999 LUNG ABSCESS 0.00001 33 0.00089 18 0.00049 21 0.00049 27 0.00049 999 PNEUMONOCONIOSIS 0.O01 22 0 . 0 9 0 0 8 36 0.09008 21 0.07322 9 049975 999 LUNG CANCER 0.001 5 0.09008 21 0.07322 27 0.04671 22 0.04671 18 0.07322 12 0.04671 37 0.00229 999 INTERSTITIAL FIBROSIS 0.GO001 22 0.00079 35 0.00079 21 0.00059 999 PULMONARY OEDEMA 0.001 22 0 . 0 8 1 7 3 25 0.08173 30 0.04671 27 0.04671 26 0.04671 12 0.04671 999 PULMONARY EMBOLISM
Figure 7 Evidence file SYMPTOMS
1 S N E E Z I N G • ARE YOU S N E E Z I N G A LOT ? 2 E Y E S P A I N F U L or W A T E R I N G • APE Y O U R EYES P A I N F U L OR WATERING A LOT ? 3 S O R E T H R O A T • DO Y O U H A V E A SORE T H R O A T ? 4 VOICE HOARSE • IS YOUR VOICE HOARSE ? 5 COUGH • A R E YOU COUGHING A LOT ? 6 RUNNY N O S E • DO Y O U H A V E A RUNNY N O S E ? 7 H E A D A C H E S • DO Y O U H A V E A H E A D A C H E O R IN G E N E R A L DO Y O U S U F F E R F R O M H E A D A C H E S AT A L L ? 8 HIGH T E M P E R A T U R E • DO Y O U HAVE A HIGH T E M P E R A T U R E ? ( O V E R 37.7 C ) 9 D U S T Y E N V I R O N M E N T • DO YOU SPEND A LOT OF Y O U R T I M E IN A VERY D U S T Y A T M O S P H E R E ? I 0 ITCH • D O E S Y O U R S K I N ITCH ? 11 DRY T H R O A T # DO YOU HAVE A D R Y T H R O A T ? 12 B R E A T H ' W I ~ E Z Y ' # IS Y O U R BREAq'rl 'WHEEZY' ? 13 N O S E ' B L O C K E D UP" • IS Y O U R N O S E VERY 'BLOCKED UP, ? 14 R E C E N T COLD or S I M I L A R I N F E C T I O N • HAVE YOU H A D A COLD OR S I M I L A R I N F E C T I O N R E C E N T L Y ? 15 G E N E R A L S I C K N E S S • DO Y O U F E E L G E N E R A L L Y ILL ? 16 S W A L L O W I N G T R O U B L E S • DO Y O U H A V E T R O U B L E S W A L L O W I N G ? 17 M U S C U L A R A C R E S • DO Y O U R M U S C L E S A C H E ? 18 P A I N IN T H ~ CHEST # DO YOU H A V E ANY PAIN AT A L L IN Y O U R CHEST ? 19 T O N S I L S R E M O V E D • H A V E YOU HAD YOUR T O N S I L S R E M O V E D ? 20 S Y M P T O M S O C C U R I N G IN 'A~'fAC~S' • DO YOU H A V E ANY S Y M P T O M S W H I C H T E N D TO O C C U R IN 'ATTACKS' ? 21 ' P R O D U C T I V E ' C O U G H • D O Y O U HAVE A "PRODUCTIVE' C O U G H ? 22 B R E A T H L E S S N E S S • ARE Y O U RATH'~'R B R E A T H L E S S ? 23 E X C E S S I V E S W E A T I N G • D O Y O U S W E A T A LOT ? (EVEN RELAXING) 24 H I G H P U L S E RATE • IS Y O U R P U L S E RATE HIGH ? 25 S E R I O U S A T T A C K S OF B R E A T H L E S S N E S S ~ DO YOU HAVE S E R I O U S A T T A C K S OF B R E A T H L E S S N E S S - E N O U G H TO S E R I O U S L Y W O R R Y Y O U ? 26 B L U I S H T I N G E • DOES Y O U R S K I N HAVE A B L U I S H T I N G E ? 27 P H L E G M S T A I N E D W I T H B L O O D • W H E N Y O U COUGH IS Y O U R P H L E G M STAINED WITH BLOOD ? 28 C O N F U S I O N # ARE YOU C O N F U S E D - M U D D L E D A B O U T W H A T , S G O I N G ON A R O U N D Y O U ? 29 D E L I R I U M • ARE Y O U fOR THE PATIENT) D E L I R I O U S ? 30 D R Y C O U G H • DO YOU H A V E A D R Y fNON-PRODUCTIVE) C O U G H ? 31 P A I N F U L L B R E A T H I N G or C O U G H I N G # IS IT P A I N F U L L W H E N Y O U B R E A T H OR C O U G H ? 32 SEVEP-E P A I N IN THE C H E S T o DO YOU EVER H A V E ANY R E A L L Y S E V E R E P A I N IN Y O U R CHEST ? 33 S W I N G I N G B E T W E E N F E E L I N G C H I L L E D and F E V E R I S H # DO Y O U SWING B E T W E E N F E E L I N G C H I L L E D AND F E E L I N G F E V E R I S H ? 34 P R O L O N G E D P R E S E N C E OF S Y M P T O M S • DO Y O U HAVE ANY S Y M P T O M S PRESENT F O R SOME T I M E - SIX W E E K S OR M O R E ? 35 'CLUBBED' F I N G E R S # DO Y O U HAVE 'CLUBBED' F I N G E R S ? 36 S Y M P T O M S O C C U R I N G AFTER E X E R T I O N ~ OO YOU H A V E ANY S Y M P T O M S W H I C H M A I N L Y O C C U R W H E N YOU E X E R T Y O U R S E L F ? 9 7 E X C E S S I V E S M O K I N G # DO YOU SMOKE ? ( -5 M E A N S YOU DO NOT SMOKE ) 38 S W E L L I N G S UNDER THE S K I N # DO YOU HAVE ANY S W E L L I N G S UNDER THE SKIN ?
CONCLUDING REMARKS
We have presented a relatively simple inference engine under uncertainty which can effectively be used in diagnostic applications. Unfortunately there does not exist a unique and generally acceptable criterion for evaluating expert systems. The theoretical superiority (and generality) of our certainty-factor-based inference engine over the probabilistic inference engine is obvious. Regarding its effectiveness, in a number of examples, we have found a substantial improvement. Although our examples were drawn from the medical field, we wish to make it clear that there is no reason why the inference engine cannot be applied to general system fault diagnosis, subject of course to appropriate coding of the knowledge. Work in this area is under development 17-20. There remain many aspects for further investigation and inclusion in our inference engine. Some of these are as follows:

• The development and introduction of an appropriate hierarchical structure to the database to classify the hypotheses (diseases etc.) in hierarchical levels. This would not only help in a better understanding and treatment of the knowledge base, but would lead to increased efficiency, since the hypotheses would be handled in groups and possibly rejected also in groups.
• The complementing of the diagnostic inference system with a repair section. The diagnostic process alone, although very important for the treatment of anomalous situations that occur in system operation, cannot be considered sufficient without the ability to take over the necessary repair action. This requires the development of another expert system which, on the basis of the results of the diagnosis expert system, would decide about the repair actions. The development of a unique inference engine for both tasks would be more challenging.
• The collection of statistical data during the operation of the expert system. This information will be related to the behaviour of the system (successful or not successful diagnosis) and to the number and type of faults. In this way, we will be able to fully evaluate the expert system and decide whether the database needs improvements.
• Further work on the study of users' behaviour, and, in particular, the treatment of cases in which the user's model (bias) function f(·) is nonlinear.
REFERENCES
1 Gerener, M. and Smetek, R. 'Artificial Intelligence: Technology and Applications' Military Technology No 6 (1985) 67
2 Rich, E. Artificial Intelligence McGraw-Hill, Singapore (1983)
3 Tzafestas, S. 'AI Techniques in Control: An Overview' in Kulikowski, C. and Ferrate, G. (Eds) AI, Expert Systems and Languages in Modelling and Simulation North-Holland, Netherlands (1988) 55
4 Nau, D. S. 'Expert Computer Systems' IEEE Computer (1983) 63
5 Tzafestas, S., Singh, M. and Schmidt, G. (Eds) Systems Fault Diagnostics, Reliability and Related Knowledge-Based Approaches, Vols 1 and 2, D Reidel, Netherlands (1987)
6 Hayes-Roth, F., Waterman, D. and Lenat, D. Building Expert Systems Academic Press (1983)
7 Kadesch, R. R. 'Subjective Inference with Multiple Evidence' Artificial Intelligence (1986) 333
8 Zadeh, L. A. 'Outline of a New Approach to the Analysis of Complex Systems and Decision Processes' IEEE Trans. Systems, Man and Cybernetics (1973)
9 Shafer, G. A Mathematical Theory of Evidence Princeton University Press, USA (1987)
10 Shortliffe, E. H. Computer-Based Medical Consultations: MYCIN Elsevier, USA (1976)
11 Naylor, C. 'How to Build an Inferencing Engine' in Forsyth, R. (Ed.) Expert Systems Chapman and Hall, UK (1984)
12 Tzafestas, S. and Palios, L. 'Improved Diagnostic Expert System Based on Bayesian Inference' Proc. 12th IMACS World Congress on Scientific Computation Paris, France (July 1988)
13 Johnson, R. W. 'Independence and Bayesian Updating Methods' Artificial Intelligence Vol 29 No 2 (1986) 217
14 Pearl, J. 'Fusion, Propagation and Structuring in Belief Networks' Artificial Intelligence Vol 29 No 3 (1986) 241-288
15 Tsoukalas, L. 'Anticipatory Systems Using a Probabilistic-Possibilistic Formalism' PhD Thesis, Dept of Nuclear Engineering, University of Illinois at Urbana-Champaign, USA (1989)
16 Cohen, D., Shafer, A. and Shenoi, M. 'Modifiable Combining Functions' AI EDAM Vol 1 No 1 (1987) 47-57
17 Tzafestas, S. 'Knowledge Engineering Approach to System Modelling, Diagnosis, Supervision and Control' in Troch, I., Kopacek, P. and Breitenecker (Eds) Simulation of Control Systems Pergamon (1987) 17
18 Tzafestas, S. and Ligeza, A. 'Expert Control Through Decision Making' in Kulikowski, C. and Ferrate, G. (Eds) AI, Expert Systems and Languages in Modelling and Simulation North-Holland, Netherlands (1988)
19 Tzafestas, S. and Tsichritzis, G. 'ROBBAS: An Expert System for Choice of Robots' in Singh, M. and Salassa, D. (Eds) Managerial Decision Support Systems and Knowledge-Based Systems North-Holland, Netherlands (1988)
20 Tzafestas, S. 'System Fault Diagnosis Using the Knowledge-Based Methodology' in Patton, R. J., Frank, P. M. and Clark, R. N. (Eds) Fault Diagnosis in Dynamic Systems Prentice-Hall, UK
APPENDIX
For the convenience of the reader we give in this appendix the PASCAL routines for updating the combining function table and determining the most likely value CFc for the true certainty factor CFt (see the sixth section).
PASCAL routine for updating

procedure Update_Table(var m : table; cf, cfp : certainty_factor);
var
  col_index, lin_index, i : integer;
  col : column;
  total : real;
begin
  col_index := Get_Num(cf);    {column number}
  lin_index := Get_Num(cfp);
  col := m[col_index];
  total := 0;
  for i := 1 to 11 do          {take into account the new couple}
  begin
    if i = lin_index then
      col[i].number := col[i].number + 1;
    total := total + col[i].number;
  end;
  for i := 1 to 11 do          {compute the probabilities}
    col[i].prob := col[i].number/total;
  m[col_index] := col;
end; {Update_Table}
PASCAL routine for getting CFc

procedure Get_Table_Answer(var c_table : table; var u_cf : certainty_factor);
var
  col_index, l, l_index : integer;
  l_value : real;
begin
  {we only look for the highest value in the column}
  {problem if there are many outcomes}
  col_index := Get_Num(u_cf);
  l_value := 0;
  for l := 1 to 11 do
    if c_table[col_index][l].prob >= l_value then
    begin
      l_value := c_table[col_index][l].prob;
      l_index := l;
    end;
  u_cf := Get_cf(l_index);     {that is the table answer}
end; {Get_Table_Answer}