INFORMATION SCIENCES 38, 193-203 (1986)

A Note on Rule Representation in Expert Systems*

ANCA L. RALESCU
Department of Computer Science, University of Cincinnati, Cincinnati, Ohio 45221
ABSTRACT We consider the problem of rule representation in expert systems when quantifiers are present. The approach is based on the concept of possibility distribution. Using this concept we are able to derive a formula for the possibility induced by a statement containing imprecise quantifiers.
1. INTRODUCTION
Expert systems, also known as knowledge engineering, is the fastest-growing branch of artificial intelligence. The goal of an expert system is to capture the knowledge of an expert in a particular field, represent this knowledge in a modular expandable structure, and transfer it to the user by allowing the user to obtain answers to questions related to the knowledge base (KB) of the system. The private knowledge of an expert in a particular field consists largely of heuristic rules. Consequently, the KB of an expert system may be incomplete, imprecise, or not totally reliable. So we are led to the issue of the management of uncertainty in expert systems. In the expert systems developed so far there are different ad hoc approaches to this issue, most of which consist of predicate logic and probability-based methods. However, these approaches can be challenged in that the assumptions under which they work are usually not satisfied. An alternative method for dealing with uncertainty and imprecision was proposed by Zadeh [4] and is based on the use of approximate (or, equivalently, fuzzy) logic. The theory of possibility developed in [3] (based on the concept of fuzzy set) seems to deal very naturally with the uncertainty and imprecision in an expert system. In
*This work was partially supported by the Summer Faculty Research Fellowship of the University of Cincinnati. © Elsevier Science Publishing Co., Inc. 1986, 52 Vanderbilt Ave., New York, NY 10017
particular, a nice feature of this approach is that it is able to deal with imprecise quantifiers (such as most, many, almost all, about 50%, etc.). In this note, by using the basic concepts of possibility theory, we will derive a formula for the possibility associated with a statement containing imprecise quantifiers. We also relate our result to another formula, given by Yager [2] in a heuristic way.

We recall briefly the main concepts of possibility theory. Let $U$ be a universal set, $X$ a variable taking values in $U$, and $G$ a fuzzy subset of $U$. The sentence "$X$ is $G$" induces a possibility distribution of $X$, usually denoted by $\Pi_X$. According to Zadeh [3], in the absence of any other information concerning $X$, we can assert that $\Pi_X = G$. With the possibility distribution $\Pi_X$ of $X$ we associate a possibility distribution function (poss. d.f.)

$$\pi_X(u) = \mathrm{Poss}(X = u) = \mu_G(u), \qquad u \in U, \tag{1.1}$$

where $\mu_G(\cdot)$ denotes the membership function corresponding to $G$. The possibility measure associated with $X$ is

$$\Pi(F) = \sup(F \cap G)$$

or equivalently,

$$\Pi(F) = \sup_{u \in U}\,[\mu_F(u) \wedge \mu_G(u)],$$
where $F$ is a fuzzy subset of $U$ with membership function $\mu_F$, and $\wedge$ denotes the minimum operator. The concepts of joint, marginal, and conditional possibility distribution have also been defined in [3] as follows:

(1) Let $X_1$ and $X_2$ be variables taking values in $U$, and let $F$ be a fuzzy subset of $U \times U$. The statement "$(X_1, X_2)$ is $F$" induces a joint possibility distribution of $(X_1, X_2)$, $\Pi_{(X_1,X_2)} = F$, with the poss. d.f. given by

$$\pi_{(X_1,X_2)}(u_1,u_2) = \mathrm{Poss}(X_1 = u_1,\, X_2 = u_2) = \mu_F(u_1,u_2), \tag{1.2}$$

where $u_i \in U$, $i = 1,2$.
(2) The marginal possibility distribution of $X_i$ given the joint possibility distribution $\Pi_{(X_1,X_2)}$ can be obtained by the so-called projection principle:

$$\pi_{X_i}(u_i) = \sup_{u_j \in U} \pi_{(X_1,X_2)}(u_1,u_2), \qquad i,j = 1,2, \quad j \ne i. \tag{1.3}$$
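(3) The statement "if $X$ is $A$ then $Y$ is $B$," where $A$, $B$ are fuzzy subsets of $U$ and $X$, $Y$ are variables taking values in $U$, induces a conditional possibility distribution of $Y$ given $X$. One way of defining the poss. d.f. in this case is [3]

$$\pi_{(Y|X)}(y \mid x) = 1 \wedge [1 - \mu_A(x) + \mu_B(y)], \qquad x, y \in U. \tag{1.4}$$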
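The three definitions above can be sketched on a small finite universe. This is a minimal illustration, not part of the original development: the universe `U`, the membership values, and all function names are hypothetical, the joint distribution is built from $A$ and $B$ by a min-combination chosen only for the example, and the conditional uses the form $1 \wedge [1 - \mu_A(x) + \mu_B(y)]$ as an assumed reading of (1.4).

```python
from itertools import product

# Hypothetical finite universe and fuzzy sets (illustrative values only).
U = ["low", "mid", "high"]
mu_A = {"low": 1.0, "mid": 0.5, "high": 0.0}
mu_B = {"low": 0.0, "mid": 0.5, "high": 1.0}


def possibility_measure(mu_F, mu_G, universe):
    """Pi(F) = sup over u of min(mu_F(u), mu_G(u)), given 'X is G'."""
    return max(min(mu_F[u], mu_G[u]) for u in universe)


def marginal(joint, universe, axis):
    """Projection principle (1.3): sup of the joint d.f. over the other variable."""
    if axis == 0:
        return {u: max(joint[(u, v)] for v in universe) for u in universe}
    return {u: max(joint[(v, u)] for v in universe) for u in universe}


def conditional(mu_a, mu_b, universe):
    """Assumed form of (1.4): pi(y | x) = min(1, 1 - mu_A(x) + mu_B(y)), keyed by (x, y)."""
    return {
        (x, y): min(1.0, 1.0 - mu_a[x] + mu_b[y])
        for x, y in product(universe, repeat=2)
    }


# Joint d.f. of (X1, X2) for a fuzzy relation F; here F combines A and B by min,
# which is an assumption made only to have concrete numbers.
joint = {(u1, u2): min(mu_A[u1], mu_B[u2]) for u1, u2 in product(U, repeat=2)}

print(marginal(joint, U, 0))
print(marginal(joint, U, 1))
print(possibility_measure(mu_A, mu_B, U))
print(conditional(mu_A, mu_B, U)[("low", "low")])
```

Because the max of `mu_A` and of `mu_B` is 1 in this example, each marginal recovers the fuzzy set it was built from, which is the behaviour the projection principle is meant to capture.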
Among the principles governing the calculus of possibility distributions is the fuzzy compositional inference rule, according to which

$$\pi_Y(y) = \sup_{x \in U}\,[\pi_X(x) \wedge \pi_{(Y|X)}(y \mid x)]. \tag{1.5}$$

It is important to note that while the concepts of possibility and probability distributions exhibit certain similarities, the possibilistic approach represents a substantial departure from the probabilistic one. It should be mentioned, for instance, that while the possibilistic approach is also concerned with studying the uncertainty associated with various situations, the uncertainty in question is not necessarily of a random nature. Moreover, because the basic operations in manipulating possibility distribution functions are maximum and minimum, possibility measures have properties different from those of probability measures. For instance, the concept of independence associated with possibility measures is quite different from the corresponding concept associated with probability measures. For an extensive discussion of the above issues the reader is referred to [1], [3], and [5].

2. RULE REPRESENTATION
A typical rule in an expert system is of the form

IF antecedent THEN consequent with CF = a,

where CF stands for a certainty factor whose default value is 1. In the following we will consider a = 1, that is, rules of the form

IF antecedent THEN consequent.

Depending on the form of the antecedent and consequent, this rule takes on forms ranging from very simple to complex. Some of the possible forms are:
(I) If $X$ is $A$, then $Y$ is $B$.
(II) If $X_i$ is $A_i$, $i = 1,2,\ldots,n$, then $Y$ is $B$.
(III) If $Q$ of $X_i$ is $A_i$, $i = 1,2,\ldots,n$, then $Y$ is $B$.

Here $X$, $Y$, $X_i$, $i = 1,2,\ldots,n$, are variables with values in $U$; $A$, $B$, $A_i$, $i = 1,2,\ldots,n$, are fuzzy subsets of $U$; and $Q$ is an imprecise quantifier. Note that (II) can be viewed as a special case of (III) with $Q$ having the value "all." It has been proposed [2] to represent these rules in terms of conditional possibility distributions. Therefore, (I) will be represented by $\Pi_{(Y|X)}$, (II) by $\Pi_{(Y|X_1,\ldots,X_n)}$, and (III) by $\Pi_{(Y|Q(X_1,\ldots,X_n))}$. Here $\Pi_{(Y|Q(X_1,\ldots,X_n))}$ denotes the conditional possibility distribution of $Y$ given $Q$ of the $X_i$'s, $i = 1,2,\ldots,n$. In each of these cases the poss. d.f. can be computed according to (1.4). For rule (II) we have

$$\pi_{(Y|X_1,\ldots,X_n)}(y \mid x_1,\ldots,x_n) = 1 \wedge [1 - \pi_{(X_1,\ldots,X_n)}(x_1,\ldots,x_n) + \pi_Y(y)],$$
where $\pi_{(X_1,\ldots,X_n)}(x_1,\ldots,x_n) = \mu_{A_1}(x_1) \wedge \cdots \wedge \mu_{A_n}(x_n)$ is the distribution function of the joint possibility distribution of $(X_1,\ldots,X_n)$, and $\pi_Y(y) = \mu_B(y)$ is the poss. d.f. of $Y$, with $x_i$ $(i = 1,\ldots,n)$, $y \in U$. Similarly, for rule (III) we have

$$\pi_{(Y|Q(X_1,\ldots,X_n))}(y \mid x_1,\ldots,x_n) = 1 \wedge [1 - \pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n) + \pi_Y(y)], \tag{2.1}$$
where $\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n)$ is the poss. d.f. corresponding to $\Pi_{Q(X_1,\ldots,X_n)}$. That is, $\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n) = \mathrm{Poss}(Q \text{ of } X_i \text{ is } A_i,\ i = 1,2,\ldots,n, \text{ at } \{x_1,\ldots,x_n\})$.

The problem is now to compute $\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n)$. In order to do that we need to decide how to represent the quantifier $Q$. One way, suggested in [6], is to view a quantifier as a fuzzy subset of either the real line (quantifiers of the first kind, such as many, few, etc.) or of the unit interval (quantifiers of the second kind, such as most, etc.). Let then $\mu_Q$ denote the membership function associated with $Q$. Let $N$ be a variable defined as the number of premises that hold in the antecedent of rule (III). Then to say that $Q$ premises hold is equivalent to saying that $N$ premises hold given that $N$ is $Q$. Note that $N$ can take the values $0,1,2,\ldots,n$. The statement "$N$ is $Q$" induces a possibility distribution for $N$, $\Pi_N$, whose poss. d.f. is given by $\mu_Q$:

$$\pi_N(k) = \mathrm{Poss}(N = k) = \mu_Q(k), \qquad k = 0,1,2,\ldots,n. \tag{2.2}$$
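As a small illustration of (2.2), the poss. d.f. of $N$ is simply the quantifier's membership function evaluated at each possible count. The sketch below is hedged: the helper names `trapezoid`, `ramp`, and `pi_N`, the two quantifiers `about_three` and `most`, and their breakpoints are all hypothetical, and evaluating a second-kind quantifier at the proportion $k/n$ is an assumption of this sketch rather than something stated in the text.

```python
def trapezoid(a, b, c, d):
    """Membership function rising on [a, b], equal to 1 on [b, c], falling on [c, d]."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)
    return mu


def ramp(lo, hi):
    """Membership function that is 0 up to lo, 1 from hi on, and linear in between."""
    def mu(p):
        return max(0.0, min(1.0, (p - lo) / (hi - lo)))
    return mu


def pi_N(mu_Q, n, relative=False):
    """Poss. d.f. of N as in (2.2): pi_N(k) = mu_Q(k) for k = 0, ..., n.

    With relative=True the quantifier is evaluated at the proportion k / n,
    an assumed convention for quantifiers defined on the unit interval.
    """
    return [mu_Q(k / n if relative else k) for k in range(n + 1)]


about_three = trapezoid(0, 2, 4, 6)   # hypothetical first-kind quantifier (on counts)
most = ramp(0.5, 0.8)                 # hypothetical second-kind quantifier (on [0, 1])

print(pi_N(about_three, 6))
print(pi_N(most, 10, relative=True))
```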
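The fuzzy compositional inference rule implies that

$$\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n) = \max_{k = 0,1,\ldots,n}\,[\pi_N(k) \wedge \pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)], \tag{2.3}$$

where $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) = \mathrm{Poss}(N \text{ of } X_i \text{ is } A_i,\ i = 1,\ldots,n, \text{ at } \{x_1,\ldots,x_n\} \mid N = k) = \mathrm{Poss}(\text{exactly } k \text{ of the premises "} X_i \text{ is } A_i \text{" hold},\ i = 1,\ldots,n, \text{ at } \{x_1,\ldots,x_n\})$. Note that $\mu_{A_i}(x_i) = \mathrm{Poss}(X_i = x_i \mid X_i \text{ is } A_i)$ can be viewed as the degree to which the premise "$X_i$ is $A_i$" holds at $x_i$. We need to compute $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$. This is done by observing that $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ is the joint possibility distribution function of the $k$ $X_i$'s corresponding to the premises which hold at $\{x_1,\ldots,x_n\}$ and the $n - k$ $X_i$'s corresponding to the premises which do not hold at $\{x_1,\ldots,x_n\}$. Following this reasoning, we will show that

$$\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) = \min\bigl(\mu_{(k)}(x_1,\ldots,x_n),\ 1 - \mu_{(k+1)}(x_1,\ldots,x_n)\bigr), \tag{2.4}$$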
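where $\mu_{(k)}(x_1,\ldots,x_n)$ is the $k$th element of the set $\{\mu_{A_1}(x_1),\ldots,\mu_{A_n}(x_n)\}$ arranged in nonincreasing order of magnitude, i.e., $\mu_{(1)}(x_1,\ldots,x_n) \ge \mu_{(2)}(x_1,\ldots,x_n) \ge \cdots \ge \mu_{(n)}(x_1,\ldots,x_n)$. We will extend this set by letting $\mu_{(0)}(x_1,\ldots,x_n) = 1$ and $\mu_{(n+1)}(x_1,\ldots,x_n) = 0$. In the following we will denote $\mu_{A_i}(x_i) = \mu_i$. In order to derive (2.4) let $\{i_1,\ldots,i_n\}$ be a permutation of $\{1,2,\ldots,n\}$, where $i_1,\ldots,i_k$ index the premises taken to hold and $i_{k+1},\ldots,i_n$ those taken not to hold. Then

$$\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) = \max_{\{i_1,\ldots,i_k\} \subseteq \{1,\ldots,n\}} \Bigl( \min_{j=1,\ldots,k} \mu_{i_j} \wedge \min_{j=k+1,\ldots,n} [1 - \mu_{i_j}] \Bigr)$$

$$= \max_{\{i_1,\ldots,i_k\} \subseteq \{1,2,\ldots,n\}} \Bigl( \min_{j=1,\ldots,k} \mu_{(i_j)}(x_1,\ldots,x_n) \wedge \min_{j=k+1,\ldots,n} [1 - \mu_{(i_j)}(x_1,\ldots,x_n)] \Bigr),$$

since the max and min operations will yield the same result when applied on all permutations of the ordered $\mu$'s.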
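We note now that the largest term on the right-hand side corresponds to the permutation $i_j = j$, $j = 1,2,\ldots,k$, that is, to letting the $k$ premises with the largest degrees hold. So the maximum over all permutations $\{i_1,\ldots,i_k\}$ will be equal to

$$\min_{j=1,2,\ldots,k} \mu_{(j)}(x_1,\ldots,x_n) \wedge \min_{j=k+1,\ldots,n} [1 - \mu_{(j)}(x_1,\ldots,x_n)] = \mu_{(k)}(x_1,\ldots,x_n) \wedge [1 - \mu_{(k+1)}(x_1,\ldots,x_n)]$$

$$= \min\bigl(\mu_{(k)}(x_1,\ldots,x_n),\ 1 - \mu_{(k+1)}(x_1,\ldots,x_n)\bigr).$$

This proves (2.4). Note that for $k = 0$ we have

$$\pi_{(X_1,\ldots,X_n|0)}(x_1,\ldots,x_n) = \min\bigl(\mu_{(0)}(x_1,\ldots,x_n),\ 1 - \mu_{(1)}(x_1,\ldots,x_n)\bigr) = \min\bigl(1,\ 1 - \mu_{(1)}(x_1,\ldots,x_n)\bigr) = 1 - \mu_{(1)}(x_1,\ldots,x_n).$$

Also, for $k = n$ we have

$$\pi_{(X_1,\ldots,X_n|n)}(x_1,\ldots,x_n) = \min\bigl(\mu_{(n)}(x_1,\ldots,x_n),\ 1 - \mu_{(n+1)}(x_1,\ldots,x_n)\bigr) = \mu_{(n)}(x_1,\ldots,x_n) = \mu_{A_1}(x_1) \wedge \cdots \wedge \mu_{A_n}(x_n),$$

which is the joint poss. d.f. of $(X_1,\ldots,X_n)$ induced by the antecedent of rule (II). Substituting (2.2) and (2.4) into (2.3), we obtain

$$\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n) = \max_{k = 0,1,\ldots,n}\,\bigl[\mu_Q(k) \wedge \mu_{(k)}(x_1,\ldots,x_n) \wedge \bigl(1 - \mu_{(k+1)}(x_1,\ldots,x_n)\bigr)\bigr]. \tag{2.5}$$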
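The closed form (2.4) can be checked numerically against the definition it was derived from. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the brute-force routine enumerates every choice of which $k$ premises hold (which is only feasible for small $n$), and `pi_quantified` evaluates the max-min combination of $\mu_Q(k)$ with (2.4) obtained by the substitution described above.

```python
from itertools import combinations


def pi_exactly_k(mu, k):
    """Closed form (2.4): min(mu_(k), 1 - mu_(k+1)), with mu_(0) = 1 and mu_(n+1) = 0."""
    ordered = [1.0] + sorted(mu, reverse=True) + [0.0]
    return min(ordered[k], 1.0 - ordered[k + 1])


def pi_exactly_k_bruteforce(mu, k):
    """Max over all choices of the k premises that hold; the other n - k do not hold."""
    n = len(mu)
    best = 0.0
    for held in combinations(range(n), k):
        held = set(held)
        # Degree of each premise holding (if chosen) or not holding (otherwise).
        degrees = [mu[i] if i in held else 1.0 - mu[i] for i in range(n)]
        best = max(best, min(degrees))
    return best


def pi_quantified(mu, mu_Q):
    """Max over k of min(mu_Q(k), pi_exactly_k(mu, k)), for k = 0, ..., n."""
    return max(min(mu_Q(k), pi_exactly_k(mu, k)) for k in range(len(mu) + 1))


# Hypothetical degrees to which four premises hold.
mu = [0.75, 0.25, 1.0, 0.5]
for k in range(len(mu) + 1):
    print(k, pi_exactly_k(mu, k), pi_exactly_k_bruteforce(mu, k))
```

Sorting once and reading off two neighbouring order statistics replaces a search over all subsets of size $k$, which is the practical benefit of (2.4).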
3. NUMERICAL EXAMPLES
A formula for computing $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ was suggested in [2]. Let $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ stand for $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ when computed as in [2], that is,

$$\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) = \mu_{(k)}(x_1,\ldots,x_n), \qquad k = 1,\ldots,n. \tag{3.1}$$
First we note that the case $k = 0$ is not included here. Of course one may define

$$\pi^{*}_{(X_1,\ldots,X_n|0)}(x_1,\ldots,x_n) = 1 - \mu_{(1)}(x_1,\ldots,x_n).$$

We note now that $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ and $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ agree when $\mu_{(k)}(x_1,\ldots,x_n) + \mu_{(k+1)}(x_1,\ldots,x_n) \le 1$. In particular, for $k = n$ they agree. If $\mu_{(k)} + \mu_{(k+1)} > 1$ for some $k$, then $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ and $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ give quite different results. In particular, if $\mu_{(k+1)} > \tfrac{1}{2}$ (and hence $\mu_{(k)} > \tfrac{1}{2}$), then $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) = \mu_{(k)}(x_1,\ldots,x_n) > \tfrac{1}{2}$, while $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n) < \tfrac{1}{2}$. The reason for this discrepancy between the two results is that $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ takes into account not only the degrees to which $k$ premises hold but also the degrees to which the remaining $n - k$ premises do not hold. In particular, if the $(k+1)$st premise holds with a degree greater than $\tfrac{1}{2}$, then $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ (which is really the possibility that exactly $k$ premises hold) must be at most $1 - \mu_{(k+1)}(x_1,\ldots,x_n)$.

To better illustrate the difference between the results given by $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ and $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$, we will consider the following example. Let $Q$ = "A FEW" be a fuzzy subset of $[1, n]$, $n = 20$, where the membership function $\mu_{\text{A FEW}}$ can be chosen as follows (see Figure 1):

$$\mu_{\text{A FEW}}(i) = \begin{cases} \dfrac{i}{2}, & 0 \le i \le 2, \\[4pt] 1, & 2 \le i \le 3, \\[4pt] \dfrac{-i + 5}{2}, & 3 \le i \le 5, \\[4pt] 0, & i \ge 5. \end{cases} \tag{3.2}$$

Let $S = \{\mu_{A_i}(x_i),\ i = 1,2,\ldots,20\}$. Recall that by convention, $\mu_{(0)}(x_1,\ldots,x_{20}) = 1$ and $\mu_{(21)}(x_1,\ldots,x_{20}) = 0$. We will compute $\pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ [according to (2.4)] and $\pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ [according to (3.1)] for different choices of $S$. In each case we will also compute the possibility that "A FEW of $X_i$ is $A_i$," $i = 1,\ldots,20$, hold at some $\{x_1,\ldots,x_{20}\}$.
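The agreement condition stated above can be explored with a short sketch. The names `mu_a_few`, `pi_k`, and `pi_star_k` are hypothetical; `mu_a_few` encodes (3.2) as a trapezoid on counts, `pi_k` follows (2.4), and `pi_star_k` follows (3.1) together with the optional $k = 0$ extension $1 - \mu_{(1)}$. The degrees in `mu` are illustrative values, not data from the examples.

```python
def mu_a_few(i):
    """Membership function of 'A FEW' as in (3.2): a trapezoid peaking at 2 and 3."""
    if i <= 0 or i >= 5:
        return 0.0
    if i < 2:
        return i / 2
    if i <= 3:
        return 1.0
    return (5 - i) / 2


def ordered_with_sentinels(mu):
    """Degrees in nonincreasing order, extended by mu_(0) = 1 and mu_(n+1) = 0."""
    return [1.0] + sorted(mu, reverse=True) + [0.0]


def pi_k(mu, k):
    """Formula (2.4): min(mu_(k), 1 - mu_(k+1))."""
    m = ordered_with_sentinels(mu)
    return min(m[k], 1.0 - m[k + 1])


def pi_star_k(mu, k):
    """Formula (3.1): mu_(k) for k >= 1, with the extension 1 - mu_(1) for k = 0."""
    m = ordered_with_sentinels(mu)
    return 1.0 - m[1] if k == 0 else m[k]


# Hypothetical degrees for three premises.
mu = [0.75, 0.5, 0.25]
m = ordered_with_sentinels(mu)
for k in range(1, len(mu) + 1):
    agree = pi_k(mu, k) == pi_star_k(mu, k)
    print(k, pi_k(mu, k), pi_star_k(mu, k), m[k] + m[k + 1] <= 1.0, agree)
```

In this run the two formulas differ only at $k = 1$, the single index for which $\mu_{(k)} + \mu_{(k+1)}$ exceeds 1.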
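(a) $S = \{1, 0.9, 0, 0.8, 0, 0.5, 0.8, 1, 1, 0.8, 0, 0.9, 1, 0, 0, 0, 0, 0, 0\}$. The results are summarized in Table 1. Now, using (2.3), we have

$$\mathrm{Poss}(\text{"A FEW of } X_i \text{ is } A_i\text{,"}\ i = 1,\ldots,20, \text{ hold at } \{x_1,\ldots,x_{20}\}) = \begin{cases} 1 & \text{if } \pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20}) \text{ is used in (2.3)}, \\ 0.1 & \text{if } \pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20}) \text{ is used in (2.3)}. \end{cases}$$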
[Fig. 1. Membership function of "A FEW".]
TABLE 1

  k    μ_A FEW(k)   μ_(k)   π*_k   π_k
  0    0            1       0      0
  1    0.5          1       1      0
  2    1            1       1      0
  3    1            1       1      0
  4    0.5          1       1      0.1
  5    0            0.9     0.9    0.1
  6    0            0.9     0.9    0.2
  7    0            0.8     0.8    0.2
  8    0            0.8     0.8    0.2
  9    0            0.8     0.8    0.5
 10    0            0.5     0.5    0.5
 11    0            0       0      0
 12    0            0       0      0
 13    0            0       0      0
 14    0            0       0      0
 15    0            0       0      0
 16    0            0       0      0
 17    0            0       0      0
 18    0            0       0      0
 19    0            0       0      0
 20    0            0       0      0
 21    -            0       -      -
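The entries of Table 1 follow mechanically from the ordered degrees. The sketch below recomputes them; it is an illustration only, with hypothetical variable names, and it takes the ordered degrees $\mu_{(1)},\ldots,\mu_{(20)}$ directly from the $\mu_{(k)}$ column of the table rather than from the set $S$. Values of $\pi_k$ are rounded to ten decimals to hide floating-point noise such as `1 - 0.9`.

```python
def mu_a_few(k):
    """Membership function of 'A FEW' as in (3.2)."""
    if k <= 0 or k >= 5:
        return 0.0
    if k < 2:
        return k / 2
    if k <= 3:
        return 1.0
    return (5 - k) / 2


# Ordered degrees mu_(1), ..., mu_(20) as listed in the mu_(k) column of Table 1.
ordered = [1, 1, 1, 1, 0.9, 0.9, 0.8, 0.8, 0.8, 0.5] + [0] * 10
n = len(ordered)
m = [1.0] + [float(v) for v in ordered] + [0.0]   # add mu_(0) = 1 and mu_(21) = 0

rows = []
for k in range(n + 1):
    pi_star = 1.0 - m[1] if k == 0 else m[k]       # (3.1), with the k = 0 extension
    pi = round(min(m[k], 1.0 - m[k + 1]), 10)      # (2.4)
    rows.append((k, mu_a_few(k), m[k], pi_star, pi))

# Max-min combination over k, as in (2.3), for each of the two formulas.
poss_star = max(min(q, ps) for _, q, _, ps, _ in rows)
poss = max(min(q, p) for _, q, _, _, p in rows)

for row in rows:
    print(*row)
print(poss_star, poss)
```

The last line reproduces the two values quoted for example (a): 1 when (3.1) is used and 0.1 when (2.4) is used.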
We can explain the above results as follows: Recall that $\pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ and $\pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ both represent Poss(exactly $k$ of the premises "$X_i$ is $A_i$," $i = 1,2,\ldots,20$, hold at $\{x_1,\ldots,x_{20}\}$). According to Table 1 we see that, for example, $\pi^{*}_{(X_1,\ldots,X_{20}|2)}(x_1,\ldots,x_{20}) = 1$, that is, according to (3.1) the possibility that exactly 2 premises hold out of the 20 premises considered is equal to 1. However, by looking back at the data ($\{\mu_{(k)}(x_1,\ldots,x_{20})\}_k$) we see that while there are 10 premises which do not hold (the corresponding $\mu_{(k)}$ is 0), for the remaining ones the degree with which each holds is greater than or equal to 0.5. Therefore, no two premises will hold exclusively. The same point is illustrated even better by the next example:
(b) $S = \{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1\}$. The results are shown in Table 2. Then

$$\mathrm{Poss}(\text{"A FEW of } X_i \text{ are } A_i\text{,"}\ i = 1,2,\ldots,20, \text{ hold at } \{x_1,\ldots,x_{20}\}) = \begin{cases} 1 & \text{if } \pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20}) \text{ is used in (2.3)}, \\ 0 & \text{if } \pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20}) \text{ is used in (2.3)}. \end{cases}$$

TABLE 2

  k    μ_A FEW(k)   μ_(k)   π*_k   π_k
  0    0            1       1      0
  1    0.5          1       1      0
  2    1            1       1      0
  3    1            1       1      0
  4    0.5          1       1      0
  5    0            1       1      0
  6    0            1       1      0
  7    0            1       1      0
  8    0            1       1      0
  9    0            1       1      0
 10    0            1       1      0
 11    0            1       1      0
 12    0            1       1      0
 13    0            1       1      0
 14    0            1       1      0
 15    0            1       1      0
 16    0            1       1      0
 17    0            1       1      0
 18    0            1       1      0
 19    0            1       1      0
 20    0            1       1      1
 21    -            0       -      -
The two results are again different here. Looking at the data, we see that since all premises hold with degree 1, no group of $k$ of them will hold exclusively unless $k = 20$. This situation is captured in $\pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$, but not in $\pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$. Moreover, we see from (a) and (b) that although the data ($S$) used were quite different, the use of $\pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ results in the same numerical value for Poss("A FEW of $X_i$ are $A_i$," $i = 1,2,\ldots,20$, hold at $\{x_1,\ldots,x_{20}\}$). This example suggests that $\pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ is more sensitive to variations in the data considered.

Finally, here is an example in which the formulae agree:
(c) $S = \{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\}$ (see Table 3). In this case, since $\mu_{(1)}(x_1,\ldots,x_{20}) + \mu_{(2)}(x_1,\ldots,x_{20}) \le 1$, we see that $\pi_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$ agrees with $\pi^{*}_{(X_1,\ldots,X_{20}|k)}(x_1,\ldots,x_{20})$. Hence, using either one of them in (2.5) we obtain

$$\mathrm{Poss}(\text{"A FEW } X_i \text{ is } A_i\text{,"}\ i = 1,2,\ldots,20, \text{ hold at } \{x_1,\ldots,x_{20}\}) = 0.5.$$
TABLE 3

  k    μ_A FEW(k)   μ_(k)   π*_k   π_k
  0    0            1       0      0
  1    0.5          1       1      1
  2    1            0       0      0
  3    1            0       0      0
  4    0.5          0       0      0
  5    0            0       0      0
  6    0            0       0      0
  7    0            0       0      0
  8    0            0       0      0
  9    0            0       0      0
 10    0            0       0      0
 11    0            0       0      0
 12    0            0       0      0
 13    0            0       0      0
 14    0            0       0      0
 15    0            0       0      0
 16    0            0       0      0
 17    0            0       0      0
 18    0            0       0      0
 19    0            0       0      0
 20    0            0       0      0
 21    -            0       -      -
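Examples (b) and (c) can be rerun with one small routine that switches between the two formulas. This is a hedged sketch: the function name `poss_q` and its flag `use_31` are hypothetical, and for $k = 0$ the routine applies the extension $1 - \mu_{(1)}$ to (3.1), which does not affect the final values here because $\mu_{\text{A FEW}}(0) = 0$.

```python
def mu_a_few(k):
    """Membership function of 'A FEW' as in (3.2)."""
    if k <= 0 or k >= 5:
        return 0.0
    if k < 2:
        return k / 2
    if k <= 3:
        return 1.0
    return (5 - k) / 2


def poss_q(mu, mu_Q, use_31=False):
    """Max over k of min(mu_Q(k), p_k), where p_k comes from (2.4) or, if use_31, from (3.1)."""
    m = [1.0] + sorted(mu, reverse=True) + [0.0]
    best = 0.0
    for k in range(len(mu) + 1):
        if use_31:
            p = 1.0 - m[1] if k == 0 else m[k]
        else:
            p = min(m[k], 1.0 - m[k + 1])
        best = max(best, min(mu_Q(k), p))
    return best


S_b = [1.0] * 20                 # example (b): every premise holds with degree 1
S_c = [1.0] + [0.0] * 19         # example (c): exactly one premise holds

print(poss_q(S_b, mu_a_few, use_31=True), poss_q(S_b, mu_a_few))
print(poss_q(S_c, mu_a_few, use_31=True), poss_q(S_c, mu_a_few))
```

For (b) the two formulas give 1 and 0, and for (c) both give 0.5, matching the values reported in the text.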
It is interesting to realize that since $\pi^{*}_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ as well as $\pi_{(X_1,\ldots,X_n|k)}(x_1,\ldots,x_n)$ stands for the possibility that exactly $k$ premises hold, $\pi_{Q(X_1,\ldots,X_n)}(x_1,\ldots,x_n)$ will actually represent the possibility that "exactly $Q$" of the premises hold. Obviously the term "exactly $Q$" is rather inappropriate here, since $Q$ will usually be a fuzzy quantifier. For our example $Q$ = A FEW, $\pi_{\text{A FEW}(X_1,\ldots,X_{20})}(x_1,\ldots,x_{20})$ should be viewed as the possibility that A FEW ONLY of the premises hold.

I would like to thank Lorenza Saitta for many helpful comments on an earlier version of this paper.
REFERENCES

1. E. Hisdal, Conditional possibilities, independence and noninteraction, Fuzzy Sets and Systems 1:285-297 (1978).
2. R. R. Yager, Approximate reasoning as a basis for rule based expert systems, Technical Report #MII-31A, Machine Intelligence Inst., New Rochelle, NY 10801; also IEEE Trans. Systems Man Cybernet. 14:636-643 (1984).
3. L. A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1:3-28 (1978).
4. L. A. Zadeh, The role of fuzzy logic in the management of uncertainty in expert systems, Memo. No. UCB/ERL M83/41, July 1983.
5. L. A. Zadeh, Possibility theory and soft data analysis, in Mathematical Frontiers of the Social and Policy Sciences, AAAS Selected Symposium (Loren Cobb and Robert M. Thrall, Eds.).
6. L. A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. and Math. 9:149-184 (1983).

Received 10 August 1985; revised 17 September 1985