Information Sciences 181 (2011) 3199–3209
On the fusion of imprecise uncertainty measures using belief structures
Ronald R. Yager
Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, United States
King Saud University, Riyadh, Saudi Arabia¹
Article info
Article history: Received 15 September 2010; received in revised form 28 December 2010; accepted 19 February 2011; available online 10 March 2011.
Keywords: Uncertainty; Multi-source fusion; Dempster–Shafer; Normalization
Abstract
Our interest is in the fusion of information from multiple sources when the information provided by the individual sources is expressed in terms of an imprecise uncertainty measure. We observe that the Dempster–Shafer belief structure provides a framework for the representation of a wide class of imprecise uncertainty measures. We then discuss the fusion of multiple Dempster–Shafer belief structures using the Dempster rule and note the problems that can arise with this fusion method because of the normalization required when focal elements conflict. We then suggest some alternative approaches to fusing multiple belief structures that avoid the need for normalization.
© 2011 Elsevier Inc. All rights reserved.
1. Introduction
The representation and fusion of uncertain information from multiple sources is an important task [1,8,12,14,16,20,21,31,32]. A set measure provides a useful general framework for representing information about an uncertain variable. Using these measures we are able to express the confidence of finding the value of a variable in a set. Probability provides an important example of these measures. Recent interest has focused on imprecise uncertainty measures. In the case of the probability measure this corresponds to the situation in which, rather than knowing the precise probability of an event, we only know an interval in which the probability lies. The Dempster–Shafer belief structure [4,5,18,19,27] provides a framework for the representation of a wide class of imprecise uncertainty measures. When using these structures to represent imprecise uncertain information in the multi-source environment we are faced with the problem of fusing multiple Dempster–Shafer belief structures. The earliest prescribed approach for fusing Dempster–Shafer belief structures was the Dempster rule [4,5]. Zadeh [29,30] and others [6,23,21,28] have raised concerns about the use of normalization to handle the weight arising from the intersection of conflicting focal elements. These concerns have prompted interest in alternatives to Dempster's rule. In this work we introduce some alternative approaches to the fusion of Dempster–Shafer belief structures and other representations of imprecise information about uncertain variables.

2. Imprecise uncertainty measures
Assume V is a variable whose value is uncertain but known to lie in the set X. We can represent our knowledge about the value of V using a set function $\mu$ such that for any subset $A \subseteq X$, $\mu(A) \in [0, 1]$ indicates the confidence we have that the value of V lies in A. It is natural to require that $\mu(X) = 1$, $\mu(\emptyset) = 0$ and $\mu(A) \geq \mu(B)$ if $B \subseteq A$. A prototypical example of this is a probability measure. Here we associate with each $x_i$ a value $p_i$ and let $\mu(\{x_i\}) = p_i$. In this case
$$\mu(A) = \sum_{x_i \in A} \mu(\{x_i\}) = \sum_{x_i \in A} p_i$$
In the probability environment we generally refer to $\mu$ as a probability measure.
¹ Visiting Distinguished Scientist. E-mail address: [email protected]
0020-0255/$ - see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.ins.2011.02.010
Recent interest has focused on imprecise uncertainty measures. In this situation, rather than having a precise value for the confidence $\mu(A)$, we have an interval in which the value of $\mu(A)$ lies [10]. Thus for an imprecise uncertainty measure (IPUM) g we have for each $A \subseteq X$ an interval $R(A) = [L(A), H(A)]$ such that L(A) and H(A) satisfy the following conditions:
1. $L(A) \in [0, 1]$ and $H(A) \in [0, 1]$
2. $L(A) \leq H(A)$
3. $H(\emptyset) = L(\emptyset) = 0$
4. $H(X) = L(X) = 1$
5. If $B \subseteq A$ then $L(A) \geq L(B)$ and $H(A) \geq H(B)$
We note that if $L(A) = H(A)$ then we know that $\mu(A) = L(A) = H(A)$. Here L and H are short for lower and higher.
A fundamental idea associated with imprecise uncertainty measures is the idea of entailment. Assume g is an IPUM such that $R(A) = [L(A), H(A)]$. This tells us that our degree of confidence that the variable V lies in the set A, $\mu(A)$, is a value in the interval R(A). We observe that if $a \leq L(A)$ and $b \geq H(A)$ then we are also sure that $\mu(A) \in [a, b]$. Here we see that $R(A) \subseteq [a, b]$. Thus knowing that $\mu(A) \in R(A)$ allows us to infer that $\mu(A) \in [a, b]$. More generally, assume we have as our knowledge about V the IPUM $g_1$, where for each $A \subseteq X$ we have $R_1(A) = [L_1(A), H_1(A)]$. Let $g_2$ be another IPUM where for each A we have $R_2(A) = [L_2(A), H_2(A)]$ such that $L_2(A) \leq L_1(A)$ and $H_2(A) \geq H_1(A)$, that is, $R_1(A) \subseteq R_2(A)$. From the preceding observation we conclude that knowing $g_1$ is true allows us to infer that $g_2$ is true. This process is called entailment and is related to logical deduction. However, we emphasize that in this case $g_1$ has more information than $g_2$; thus entailment goes from more to less information.
A very special and simple structure for representing a wide class of imprecise uncertainty measures is the Dempster–Shafer belief structure [27]. Let V be a variable taking its value in the space X. A D–S belief structure m associated with V consists of the following components:
(1) A collection of q non-empty subsets $F_j \subseteq X$, j = 1 to q. These subsets are referred to as the focal elements.
(2) A mapping m that associates with each focal element a value $m(F_j) \in [0, 1]$ such that $\sum_{j=1}^{q} m(F_j) = 1$.
Associated with a D–S belief structure are two notable measures. The first, called the plausibility measure, is defined by
$$Pl(A) = \sum_{F_j \cap A \neq \emptyset} m(F_j)$$
The second, called the belief measure, is defined by
$$Bel(A) = \sum_{F_j \subseteq A} m(F_j)$$
A unique relationship of duality exists between these two measures,
$$Bel(A) = 1 - Pl(\bar{A})$$
where $\bar{A}$ denotes the complement of A. When viewing the D–S belief structure as an imprecise uncertainty measure, Pl(A) and Bel(A) provide the bounds of the imprecise interval R(A). Here $L(A) = Bel(A)$ and $H(A) = Pl(A)$, thus $R(A) = [Bel(A), Pl(A)]$. We observe that while a D–S belief structure is an example of an imprecise uncertainty measure (IPUM), not all IPUMs are D–S belief structures. A very desirable aspect of the D–S belief structure is the simple, well-behaved way in which, for any A, the interval R(A) is determined from the mapping m.
A very important requirement of the D–S belief structure is that the null set is not a focal element: we must have $m(\emptyset) = 0$. The imposition of this requirement guarantees a number of important features of the D–S framework. The condition $m(\emptyset) = 0$ assures us that $Bel(X) = Pl(X) = 1$. We see this as follows. Since all $F_j \subseteq X$ we always have $Bel(X) = 1$. However, if $m(F_1) \neq 0$ and $F_1 = \emptyset$ then we have $X \cap F_1 = \emptyset$ and $Pl(X) = \sum_{F_j \cap X \neq \emptyset} m(F_j) = \sum_{j=2}^{q} m(F_j) = 1 - m(F_1) \neq 1$. In addition it guarantees that $Bel(\emptyset) = Pl(\emptyset) = 0$. While $Pl(\emptyset)$ will always equal 0 even if $m(\emptyset) \neq 0$, we see that if $m(F_1) \neq 0$ with $F_1 = \emptyset$ then $F_1 \subseteq \emptyset$ and $Bel(\emptyset) = \sum_{F_j \subseteq \emptyset} m(F_j) = m(F_1) \neq 0$. Additionally, the fact that $F_j \neq \emptyset$ assures us that $Bel(A) \leq Pl(A)$ for all A. We see that if $F_k = \emptyset$ then $A \cap F_k = \emptyset$ while $F_k \subseteq A$, and hence for some A we get $Bel(A) > Pl(A)$.
We now describe some notable examples of Dempster–Shafer belief structures. One case is when $F_1 = B$ and $m(F_1) = 1$. In this case
$R(A) = [1, 1]$ if $B \subseteq A$
$R(A) = [0, 1]$ if $B \cap A \neq \emptyset$ and $B \not\subseteq A$
$R(A) = [0, 0]$ if $B \cap A = \emptyset$
Another case is where $F_i = \{x_i\}$ and $m(F_i) = p_i$. In this case $L(A) = H(A) = \sum_{x_i \in A} p_i = Prob(A)$.
Consider the case where we have focal elements $F_j$ with weights $m(F_j)$. One view of this is as a probability type of uncertainty in which the probability $m(F_j)$, instead of being allocated to some specific element in X, is distributed in some unknown fashion among the elements in $F_j$. Here for each A we get $R(A) = [L(A), H(A)]$ where $L(A) = \sum_{F_j \subseteq A} m(F_j)$ and $H(A) = \sum_{F_j \cap A \neq \emptyset} m(F_j)$. In this case we have an imprecise probability for any subset A.
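The definitions of Bel and Pl in this section can be illustrated with a minimal Python sketch. The dictionary representation (focal element as a `frozenset`, mapped to its weight) and the example structure on X = {1, 2, 3, 4} are assumptions made for illustration; they do not come from the paper.

```python
def bel(m, A):
    """Belief: total weight of the focal elements contained in A."""
    A = frozenset(A)
    return sum(w for F, w in m.items() if F <= A)


def pl(m, A):
    """Plausibility: total weight of the focal elements that intersect A."""
    A = frozenset(A)
    return sum(w for F, w in m.items() if F & A)


# Hypothetical belief structure on X = {1, 2, 3, 4}; the weights sum to one.
X = frozenset({1, 2, 3, 4})
m = {frozenset({1}): 0.5, frozenset({2, 3}): 0.3, X: 0.2}

A = frozenset({1, 2})
interval = (bel(m, A), pl(m, A))  # the imprecise interval R(A) = [Bel(A), Pl(A)]
```

For this structure only the focal element {1} lies inside A, while all three focal elements intersect A, so R(A) is wide. The duality Bel(A) = 1 - Pl(complement of A) can be checked with `1 - pl(m, X - A)`.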
3. Fusing multiple D–S belief structures using Dempster's rule
A problem of central importance in our modern information-rich environment is the fusion of multiple sources of information [9,15]. Assume V is a variable taking its values in the space X. Let $S = \{S_1, \ldots, S_r\}$ be a collection of sources of information. The information fusion problem involves the aggregation of the information provided by these individual sources to obtain a unified expression of the value of V. Here we shall concern ourselves with the situation in which the information provided by each individual source is expressed in terms of a D–S belief structure, $m_j$. Thus our task is to combine these multiple D–S belief structures. The pioneering approach to the aggregation of multiple D–S structures was suggested by Dempster [3–5] and has come to be known as Dempster's rule. In the following we first describe this rule for the case of two belief structures to better capture the essential idea involved.
Assume $m_1$ and $m_2$ are two belief structures with focal elements $A_i$ and $B_j$ respectively. We let $n_1$ and $n_2$ be the number of focal elements in the respective belief structures. Under Dempster's combination rule we obtain a new belief structure m whose focal elements are $F_{ij} = A_i \cap B_j \neq \emptyset$ for i = 1 to $n_1$ and j = 1 to $n_2$. Thus the focal elements of m are the non-null intersections of the focal elements of the two contributing belief structures. The weights associated with these focal elements are
$$m(F_{ij}) = \frac{m_1(A_i)\, m_2(B_j)}{1 - K}$$
Here K, called the degree of conflict, is defined as
$$K = \sum_{A_i \cap B_j = \emptyset} m_1(A_i)\, m_2(B_j)$$
We also note that $1 - K = \sum_{A_i \cap B_j \neq \emptyset} m_1(A_i)\, m_2(B_j)$; using this we can express
$$m(F_{ij}) = \frac{m_1(A_i)\, m_2(B_j)}{\sum_{A_i \cap B_j \neq \emptyset} m_1(A_i)\, m_2(B_j)}$$
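The two-structure combination rule above can be sketched in Python as follows. This is only an illustrative sketch: the function name `dempster`, the dictionary representation (`frozenset` focal element mapped to weight) and the example structures on X = {a, b, c} are assumptions, not taken from the paper.

```python
from itertools import product


def dempster(m1, m2):
    """Combine two belief structures with Dempster's rule.

    Returns (m, K): the fused structure and the degree of conflict K.
    """
    raw = {}
    K = 0.0
    for (A, wa), (B, wb) in product(m1.items(), m2.items()):
        F = A & B
        if F:
            # Non-null intersection: accumulate the product of the weights.
            raw[F] = raw.get(F, 0.0) + wa * wb
        else:
            # Null intersection: the product contributes to the conflict.
            K += wa * wb
    if not raw:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Dividing by 1 - K makes the remaining weights sum to one.
    return {F: w / (1.0 - K) for F, w in raw.items()}, K


# Hypothetical structures on X = {a, b, c}.
X = frozenset("abc")
m1 = {frozenset("a"): 0.6, X: 0.4}
m2 = {frozenset("b"): 0.5, X: 0.5}
m, K = dempster(m1, m2)
```

In this example only the pair ({a}, {b}) has a null intersection, so K = 0.6 × 0.5 = 0.3 and the three surviving products are each divided by 0.7.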
The process of dividing $m_1(A_i)\, m_2(B_j)$ by $1 - K$ is called normalization. It is introduced to ensure that, while restricting the focal elements of m to be non-null, the weights still sum to one. In the case of multiple belief structures $m_i$ for i = 1 to q, each having focal elements $E_{ij}$ for j = 1 to $n_i$, we proceed in a similar manner. Each focal element F of the aggregated structure m is obtained as the non-null intersection of one focal element from each of the contributing belief structures, $F = \bigcap_{i=1}^{q} E_{i j_i} \neq \emptyset$, and
$$m(F) = \frac{\prod_{i=1}^{q} m_i(E_{i j_i})}{1 - K}$$
where K is the sum of the weights associated with intersections that are null, $K = \sum_{\bigcap_{i=1}^{q} E_{i j_i} = \emptyset} \prod_{i=1}^{q} m_i(E_{i j_i})$.
An important property of the Dempster rule is closure: combining the individual belief structures using this rule results in a D–S belief structure. In the following, for simplicity and where it causes no loss in generality, we shall focus on the fusion of two belief structures.
As noted by Zadeh [30], the type of proportional normalization used by Dempster can lead to questionable aggregations, and this has inspired alternative methods of normalization. Yamada [28] provides a comprehensive discussion of these issues. Characteristic of many of these alternative approaches is an intersection of the focal elements from the different belief structures and a multiplication of the weights associated with the focal elements; the major difference is the handling of the weights associated with null intersections. We can refer to these as Dempster-like rules. One alternative approach to normalization was suggested by Yager [23]. In this approach, rather than proportionally allocating the conflict weight K among the focal elements, it was suggested to assign K to a focal element equal to X, the whole universe. An interesting comparison can be made between the aggregation obtained using Dempster's rule and the one suggested by Yager [23].
First we note that if there are no conflicts in the contributing belief structures both will give the same result. Consider now the case where the contributing belief structures have some conflict. Let $m_1$ be the belief structure obtained by the Dempster rule and assume it has focal elements $F_j$ for j = 1 to q. If $m_2$ is the resulting belief structure obtained using the method suggested by Yager, it also has the focal elements $F_j$ for j = 1 to q plus the focal element $F_{q+1} = X$. If we let $m_2(F_j) = a_j$ for j = 1 to q + 1 be the weights obtained using the method suggested by Yager, then the weights associated with $m_1$ are
$$m_1(F_j) = \frac{m_2(F_j)}{1 - m_2(F_{q+1})} = \frac{m_2(F_j)}{1 - a_{q+1}} \quad \text{for } j = 1 \text{ to } q$$
Let us now calculate the plausibility in the respective cases:
$$Pl_1(A) = \sum_{j,\, F_j \cap A \neq \emptyset} m_1(F_j)$$
$$Pl_2(A) = \sum_{j,\, F_j \cap A \neq \emptyset} m_2(F_j)$$
Without loss of generality let us assume the indexing is such that $F_j$ for j = 1 to r are the focal elements with $F_j \cap A \neq \emptyset$. In this case
$$Pl_1(A) = \sum_{j=1}^{r} m_1(F_j) = \frac{1}{1 - a_{q+1}} \sum_{j=1}^{r} m_2(F_j)$$
In the case of $m_2$ these same focal elements contribute, plus $F_{q+1} = X$, hence
$$Pl_2(A) = \sum_{j=1}^{r} m_2(F_j) + m_2(F_{q+1}) = \sum_{j=1}^{r} m_2(F_j) + a_{q+1}$$
We see from the above that
$$Pl_2(A) = (1 - a_{q+1})\, Pl_1(A) + a_{q+1}$$
Furthermore, since $Pl_1(A) \leq 1$ we see that
$$Pl_2(A) \geq (1 - a_{q+1})\, Pl_1(A) + a_{q+1}\, Pl_1(A) = Pl_1(A)$$
Thus $Pl_2(A) \geq Pl_1(A)$. Consider now the calculation of $Bel_2(A)$ and $Bel_1(A)$ for an A. As we indicated
$$Bel_1(A) = 1 - Pl_1(\bar{A})$$
$$Bel_2(A) = 1 - Pl_2(\bar{A})$$
Since we have just shown that $Pl_2(\bar{A}) \geq Pl_1(\bar{A})$, we have $Bel_2(A) \leq Bel_1(A)$. We now see that $[Bel_1(A), Pl_1(A)] \subseteq [Bel_2(A), Pl_2(A)]$. Thus the range provided by the Dempster rule is narrower than that provided by $m_2$. On one hand this says that the use of Dempster's rule provides a more informative type of aggregation. On the other hand the choice of this rule is arbitrary, and it may be that this additional information is unjustified. Furthermore the validity of $m_2$ can be inferred from $m_1$. Thus in using $m_2$ we are not conflicting with the use of Dempster's rule; we are just not using all the information available, in the sense that providing the wider range $[Bel_2(A), Pl_2(A)]$ is less informative than providing the narrower range $[Bel_1(A), Pl_1(A)]$.

4. Aggregation of numeric focal elements
A notable feature of the Dempster-rule-like combination of belief structures is that it requires no special properties of the underlying space X. The operation of intersection needs no special properties. Thus we see that while the use of intersections of the focal elements has the benefit of working for any domain X, it has the drawback of possibly resulting in null sets and hence requiring normalization. Here we shall present an alternative to the pure Dempster aggregation rule that does not produce conflicts but requires a richer universe X in which the value of the variable V lies. Here we shall assume the domain X is numeric, so that V is a random-like variable.
In anticipation of suggesting our approach we introduce some ideas from set arithmetic. Assume A and B are two sets defined on the real line R. We define C = A + B as the sum of A and B, where C is the subset of R such that
$$C = \{x + y \mid x \in A \text{ and } y \in B\}$$
Example. If A = {1, 2, 3} and B = {10, 11, 12} then C = A + B = {11, 12, 13, 14, 15}.
We note that in the special case where A and B are intervals, $A = [a_1, a_2]$ and $B = [b_1, b_2]$, then $A + B = [a_1 + b_1, a_2 + b_2]$.
In the more general situation where $C = \sum_{i=1}^{n} A_i$ we have $C = \{\sum_{i=1}^{n} x_i \mid x_i \in A_i\}$. In the case where the $A_i$ are intervals, $A_i = [a_{i1}, a_{i2}]$, then $C = [\sum_{i=1}^{n} a_{i1}, \sum_{i=1}^{n} a_{i2}]$.
We now define a second arithmetic operation on subsets of the real line R. Let C be a subset of R and let K > 0. We define $D = \frac{1}{K} C$ as the subset of R such that $D = \{x / K \mid x \in C\}$.
We now define the average of a collection of subsets of R. Let $A_i$, for i = 1 to q, be a collection of subsets of R. We define $Ave(A_1, \ldots, A_q) = \frac{1}{q} C$ where $C = A_1 + A_2 + \cdots + A_q$.
We note in passing that if the $A_i$ are any collection of non-null subsets of R and $D = Ave(A_1, \ldots, A_q)$, then D is a non-null subset.
Using these operations we can define a new form of fusion for Dempster–Shafer belief structures in the case where the underlying domain is the real line R. Let $m_1$ and $m_2$ be two D–S belief structures on R with focal elements $A_i$, i = 1 to $n_1$, and $B_j$, j = 1 to $n_2$. We define a new belief structure m on R such that the focal elements of m are $F_{ij} = Ave(A_i, B_j)$ for all i = 1 to $n_1$ and j = 1 to $n_2$, with $m(F_{ij}) = m_1(A_i)\, m_2(B_j)$. We see that this is a closed operation, as m is always a D–S belief structure. Also, no null sets are produced by the operation $Ave(A_i, B_j)$, so no normalization is necessary. Here we have replaced the intersection of the focal elements by the average but have retained the multiplication of the weights from the Dempster rule.
Example. Assume V is a variable on R = [0, 20]. Let $m_1$ and $m_2$ be belief structures such that
$A_1 = [4, 8]$ with $m_1(A_1) = 0.8$ and $A_2 = [0, 20]$ with $m_1(A_2) = 0.2$
$B_1 = [10, 12]$ with $m_2(B_1) = 0.6$ and $B_2 = [0, 20]$ with $m_2(B_2) = 0.4$
Using the average type aggregation we get a new belief structure m with focal elements
$F_{11} = [7, 10]$ with $m(F_{11}) = 0.48$
$F_{12} = [2, 14]$ with $m(F_{12}) = 0.32$
$F_{21} = [5, 16]$ with $m(F_{21}) = 0.12$
$F_{22} = [0, 20]$ with $m(F_{22}) = 0.08$
If we used the original Dempster's rule we would get a belief structure $\hat{m}$ with
$E_{12} = [4, 8]$ with $\hat{m}(E_{12}) = 0.615$
$E_{21} = [10, 12]$ with $\hat{m}(E_{21}) = 0.23$
$E_{22} = [0, 20]$ with $\hat{m}(E_{22}) = 0.15$
The extension of this approach to more than two belief structures is straightforward.
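The average-type fusion and the numeric example above can be reproduced with a short Python sketch. It assumes every focal element is a closed interval stored as a `(low, high)` tuple, which covers the example but not arbitrary subsets of R; the names `ave` and `average_fusion` are chosen here for illustration.

```python
from itertools import product


def ave(*intervals):
    """Average of closed intervals: sum the endpoints, then divide by the count."""
    q = len(intervals)
    low = sum(lo for lo, _ in intervals) / q
    high = sum(hi for _, hi in intervals) / q
    return (low, high)


def average_fusion(m1, m2):
    """Fuse two interval-valued belief structures.

    Focal elements are pairwise averages; weights are products, so no
    null set can arise and no normalization is needed.
    """
    fused = {}
    for (A, wa), (B, wb) in product(m1.items(), m2.items()):
        F = ave(A, B)
        fused[F] = fused.get(F, 0.0) + wa * wb
    return fused


# The belief structures from the example in the text.
m1 = {(4, 8): 0.8, (0, 20): 0.2}
m2 = {(10, 12): 0.6, (0, 20): 0.4}
m = average_fusion(m1, m2)
```

If two pairs of intervals happen to produce the same average, the sketch adds their weights together, which keeps the total weight equal to one.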
5. Union average fusion of D–S belief structures
Let V be a variable with the domain $X = \{x_1, \ldots, x_n\}$. Assume we have two sources of information providing information about the variable in terms of probability distributions. In this case source one provides $p_i$ as the probability of $x_i$ and source two provides $q_i$ as the probability of $x_i$. Our objective is to fuse these two pieces of information. Since each of these can be viewed as a Bayesian D–S belief structure we can use the Dempster rule to provide the fusion.
Assume $m_1$ and $m_2$ are the Bayesian D–S belief structures corresponding to the two pieces of information. In this case the focal elements of both these belief structures are singletons. Let us denote these $E_j = \{x_j\}$ for j = 1 to n. Furthermore we denote $m_1(E_j) = p_j$ and $m_2(E_j) = q_j$. If we combine these using Dempster's rule we get a new belief structure m with focal elements $E_j = \{x_j\}$ and
$$m(E_j) = \frac{p_j q_j}{\sum_{j=1}^{n} p_j q_j}$$
Here the conflict is $K = 1 - \sum_{j=1}^{n} p_j q_j$.
As noted by Zadeh [30], the use of the Dempster rule can at times lead to very unintuitive results, as illustrated in the following.
Example.
$m_1$: $m_1(\{x_1\}) = 0.95$, $m_1(\{x_2\}) = 0.05$, $m_1(\{x_3\}) = 0$
$m_2$: $m_2(\{x_1\}) = 0$, $m_2(\{x_2\}) = 0.05$, $m_2(\{x_3\}) = 0.95$
In this case using the Dempster rule we get the belief structure m with
$m(x_1) = 0$, $m(x_2) = 1$ and $m(x_3) = 0$
Here the fusion gives the troubling result that, with no uncertainty, $V = x_2$.
There exists another, less confrontational method for combining probability distributions [7,11]. Again let P and Q be two probability distributions on X with $p_j$ and $q_j$ being their respective probabilities of $x_j$. In this method we obtain a fused probability distribution R such that $r_j$ is the probability of $x_j$, where
$$r_j = \frac{1}{2} p_j + \frac{1}{2} q_j$$
More generally, if $P_i$ are a collection of S probability distributions with $p_{ij}$ being the probability of $x_j$ in the distribution $P_i$, then we can obtain a fused distribution P with $p_j$ denoting the associated probability of $x_j$ and $p_j = \sum_{i=1}^{S} a_i p_{ij}$, where $a_i \in [0, 1]$ and $\sum_{i=1}^{S} a_i = 1$. Here $a_i$ is an importance weight assigned to $P_i$. If all the $P_i$ have the same importance then $a_i = 1/S$. We note that for any subset A of X we have $P(A) = \sum_{i=1}^{S} a_i P_i(A)$.
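The contrast between the two ways of fusing probability distributions can be seen in a small Python sketch applied to the distributions of Zadeh's example above. The function names `dempster_bayesian` and `weighted_average` are illustrative, and distributions are assumed to be lists aligned on the same ordering of X.

```python
def dempster_bayesian(p, q):
    """Dempster's rule for two Bayesian structures: normalized product."""
    joint = [pj * qj for pj, qj in zip(p, q)]
    total = sum(joint)  # this is 1 - K
    if total == 0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return [v / total for v in joint]


def weighted_average(dists, weights=None):
    """Weighted average of distributions; equal weights 1/S by default."""
    S = len(dists)
    if weights is None:
        weights = [1.0 / S] * S
    n = len(dists[0])
    return [sum(a * d[j] for a, d in zip(weights, dists)) for j in range(n)]


# Distributions from Zadeh's example in the text.
p = [0.95, 0.05, 0.0]
q = [0.0, 0.05, 0.95]

by_dempster = dempster_bayesian(p, q)
by_average = weighted_average([p, q])
```

With these inputs the product rule places all weight on $x_2$, while the average keeps most of the weight on $x_1$ and $x_3$, the values each source individually favored.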
Example. Consider the preceding example with
$p_1 = 0.95$, $p_2 = 0.05$, $p_3 = 0$
$q_1 = 0$, $q_2 = 0.05$, $q_3 = 0.95$
In this case we obtain a new distribution with
$r_1 = 0.475$, $r_2 = 0.05$ and $r_3 = 0.475$
We can see that from the D–S perspective what we have done is take the two Bayesian belief structures
$m_1$: $E_1 = \{x_1\}$ with $m_1(E_1) = 0.95$; $E_2 = \{x_2\}$ with $m_1(E_2) = 0.05$
$m_2$: $E_2 = \{x_2\}$ with $m_2(E_2) = 0.05$; $E_3 = \{x_3\}$ with $m_2(E_3) = 0.95$
and combined them to get a new belief structure
m: $E_1 = \{x_1\}$ with $m(E_1) = 0.475$; $E_2 = \{x_2\}$ with $m(E_2) = 0.05$; $E_3 = \{x_3\}$ with $m(E_3) = 0.475$
We see that the focal elements of m are the union of the focal elements of $m_1$ and $m_2$. Furthermore $m(E_1) = \frac{1}{2} m_1(E_1)$, $m(E_2) = \frac{1}{2} m_1(E_2) + \frac{1}{2} m_2(E_2)$ and $m(E_3) = \frac{1}{2} m_2(E_3)$.
Let us see how we can extend this approach to the aggregation of general D–S belief structures. Assume $m_i$ for i = 1 to M are a collection of D–S belief structures. Here we let $F_{ij}$ for j = 1 to $n_i$ be the focal elements of $m_i$. Let $a_i \in [0, 1]$ and $\sum_{i=1}^{M} a_i = 1$. We now form a new belief structure m such that the focal elements of m are $\bigcup_{i=1}^{M} \left[ \bigcup_{j=1}^{n_i} F_{ij} \right]$. That is, the focal elements of m are the union of all focal elements in the contributing belief structures. In addition, for any focal element $F_{ij}$ we define $m(F_{ij}) = a_i m_i(F_{ij})$. It is clear that we have
$$\sum_{i=1}^{M} \sum_{j=1}^{n_i} a_i m_i(F_{ij}) = \sum_{i=1}^{M} a_i \sum_{j=1}^{n_i} m_i(F_{ij}) = \sum_{i=1}^{M} a_i = 1$$
We note that if some subset F appears in two or more belief structures then its effective weight becomes the sum of the contributions from the individual belief structures. We shall refer to this as the union-average (U-A) aggregation rule of Dempster–Shafer belief structures. We note the default values are $a_i = 1/M$.
One important feature of this approach is that we never induce any null sets as potential focal elements: no conflicts are introduced, and hence there is no need for normalization. We also emphasize that the U-A aggregation of belief structures is closed in that it always gives a D–S belief structure. In addition the underlying space X needs no special structure.
In contrast to this new union-average aggregation, the Dempster rule can be seen as a kind of intersection-product type aggregation.
We now note an important feature of the U-A aggregation method. We recall that for any subset A we have $Pl(A) = \sum_{F_{ij} \cap A \neq \emptyset} m(F_{ij})$. We can express this as
$$Pl(A) = \sum_{i=1}^{M} \left( \sum_{\substack{j = 1 \text{ to } n_i \\ F_{ij} \cap A \neq \emptyset}} m(F_{ij}) \right) = \sum_{i=1}^{M} \left( \sum_{\substack{j = 1 \text{ to } n_i \\ F_{ij} \cap A \neq \emptyset}} a_i m_i(F_{ij}) \right) = \sum_{i=1}^{M} a_i Pl_i(A)$$
Thus Pl(A) is the weighted average of the plausibilities of the individual contributing belief structures. Since $Bel(A) = 1 - Pl(\bar{A})$ we also have
$$Bel(A) = 1 - \sum_{i=1}^{M} a_i Pl_i(\bar{A}) = \sum_{i=1}^{M} a_i \left(1 - Pl_i(\bar{A})\right) = \sum_{i=1}^{M} a_i Bel_i(A)$$
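A minimal Python sketch of the U-A rule, under the assumption that each belief structure is a dictionary from `frozenset` focal elements to weights, can be used to check the averaging property numerically. The function name `union_average` is chosen for illustration; the two structures are the Bayesian ones from the example earlier in this section.

```python
def pl(m, A):
    """Plausibility: total weight of focal elements that intersect A."""
    A = frozenset(A)
    return sum(w for F, w in m.items() if F & A)


def bel(m, A):
    """Belief: total weight of focal elements contained in A."""
    A = frozenset(A)
    return sum(w for F, w in m.items() if F <= A)


def union_average(structures, weights=None):
    """Union-average fusion: pool all focal elements, scale weights by a_i."""
    M = len(structures)
    if weights is None:
        weights = [1.0 / M] * M  # default a_i = 1/M
    fused = {}
    for a, m in zip(weights, structures):
        for F, w in m.items():
            # A focal element shared by several sources accumulates weight.
            fused[F] = fused.get(F, 0.0) + a * w
    return fused


m1 = {frozenset({"x1"}): 0.95, frozenset({"x2"}): 0.05}
m2 = {frozenset({"x2"}): 0.05, frozenset({"x3"}): 0.95}
m = union_average([m1, m2])

A = {"x1", "x2"}
pl_fused = pl(m, A)
pl_mean = 0.5 * pl(m1, A) + 0.5 * pl(m2, A)
```

Here `pl_fused` and `pl_mean` agree, as do the corresponding belief values, which is the averaging property derived above.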
Thus Bel(A) is the weighted average of the individual beliefs.
One view of this U-A approach is the following. We look at each belief structure $m_i$ as a collection of $Q_i$ experiments whose outcomes are the focal elements. Here $m_i(F_{ij}) Q_i$ is the number of times that $F_{ij}$ has occurred in these $Q_i$ experiments. We then fuse these multiple belief structures by viewing them as a combined experiment with $Q = \sum_{i=1}^{M} Q_i$ outcomes. In this case $m(F_{ij}) = \frac{Q_i}{Q} m_i(F_{ij})$ is the proportion of the total experiments in which $F_{ij}$ has occurred. In this way we see that $\frac{Q_i}{Q} = a_i$. Thus each source of information is saying: my belief about the value of V is the same as if I had made $Q_i$ experiments and obtained the results presented in $m_i$. The Dempster rule, in contrast, views the situation as a kind of single experiment in which all sources must agree on the same outcome.
Let us look at the U-A aggregation of D–S belief structures for some special cases. Consider the case where all the contributing belief structures are Bayesian. In this case all the contributing belief structures have the same focal elements, $E_j = \{x_j\}$, and hence the union of these is still $E_j = \{x_j\}$ for j = 1 to n. However here $m(\{x_j\}) = m(E_j) = \sum_{i=1}^{M} a_i m_i(E_j)$. Thus using this rule the aggregation of a collection of Bayesian belief structures is also a Bayesian belief structure whose probabilities have been averaged. In this case $Pl(A) = Bel(A) = Prob(A) = \sum_{i=1}^{M} a_i Prob_i(A)$.
We now consider the case where $m_1$ is a Bayesian belief structure having focal elements $E_j = \{x_j\}$ with $m_1(E_j) = p_j$, and $m_2$ is an ordinary belief structure with focal elements $F_k$ and associated weights $m_2(F_k)$. In this case if we apply the U-A aggregation rule we get a belief structure m with focal elements $F_k$ and $E_j = \{x_j\}$. If we use the default values for the $a_i$, $a_1 = a_2 = 0.5$, then the new belief structure m has $m(F_k) = 0.5\, m_2(F_k)$ and $m(E_j) = 0.5\, p_j$.
Example. Assume $X = \{x_1, x_2, x_3, x_4\}$ and the Bayesian belief structure $m_1$ is such that
$p_1 = 0.5$, $p_2 = 0.3$, $p_3 = 0.1$ and $p_4 = 0.1$
and for $m_2$ we have focal elements
$F_1 = \{x_2, x_3\}$ with $m_2(F_1) = 0.6$ and $F_2 = \{x_1, x_2, x_4\}$ with $m_2(F_2) = 0.4$
In this case m has focal elements
$F_1 = \{x_1\}$ with $m(F_1) = 0.25$
$F_2 = \{x_2\}$ with $m(F_2) = 0.15$
$F_3 = \{x_3\}$ with $m(F_3) = 0.05$
$F_4 = \{x_4\}$ with $m(F_4) = 0.05$
$F_5 = \{x_2, x_3\}$ with $m(F_5) = 0.3$
$F_6 = \{x_1, x_2, x_4\}$ with $m(F_6) = 0.2$
If we apply the Dempster rule in this case we get a belief structure $\hat{m}$ with
$F_1 = \{x_1\}$ with $\hat{m}(F_1) = 0.333$
$F_2 = \{x_2\}$ with $\hat{m}(F_2) = 0.5$
$F_3 = \{x_3\}$ with $\hat{m}(F_3) = 0.1$
$F_4 = \{x_4\}$ with $\hat{m}(F_4) = 0.066$
One notable difference is that the Dempster rule results in a Bayesian belief structure, as it always does if any of the contributing belief structures is Bayesian.
Let us now consider the case where one of the belief structures has a single focal element, the whole universe X, with weight one. In the case of the Dempster rule this type of belief structure acts as a neutral element. Thus if $m_1$ has focal element E = X with $m_1(X) = 1$ and $m_2$ has focal elements $F_j$ with $m_2(F_j)$ being the associated weight, then using the Dempster rule gives as the fusion of these two the belief structure $m_3 = m_2$. It is treating the belief structure $m_1$ as if it provides no information. It is giving no weight to this source. In the case of the U-A rule we would get as a fused belief structure $m_4$ with focal elements X and the $F_j$, where $m_4(X) = a$ and $m_4(F_j) = (1 - a)\, m_2(F_j)$. If we used the default value for a, 0.5, then we have $m_4(X) = 0.5$ and $m_4(F_j) = 0.5\, m_2(F_j)$. Here we are treating $m_1$ as a valid source of information, but one where the information provided is not very good.
This inspires us to consider an alternative method to obtain the $a_i$ weights that can be used in the U-A aggregation. This method is based on the specificity of the information provided by the source. We recall that the concept of specificity has been introduced by Klir [10] and Yager [22,24].
First we note that for a set F its specificity is related to the clarity with which it provides a unique value. Assume $X = \{x_1, \ldots, x_n\}$. For a subset F its specificity assumes the highest value, one, if F is a singleton, that is, contains exactly one element. The specificity of F assumes its minimal value of zero if F = X. More generally we define the specificity of any subset F as
$$Sp(F) = \frac{Card(X) - Card(F)}{Card(X) - 1} = \frac{n - Card(F)}{n - 1}$$
We see that if $F = \{x_j\}$ then Sp(F) = 1 and if F = X then Sp(F) = 0. We note that in the case where X is a finite interval, $Sp(F) = \frac{Len(X) - Len(F)}{Len(X)}$.
Using this idea we can associate with a Dempster–Shafer belief structure m a measure of its specificity as follows. If m has focal elements $F_j$ then $Sp(m) = \sum_j Sp(F_j)\, m(F_j)$. We note that for a belief structure consisting of just the space X, m(X) = 1, we have Sp(m) = 0. At the other extreme, if m is a Bayesian belief structure then, since each $F_j$ is a singleton, $F_j = \{x_j\}$, each $Sp(F_j) = 1$ and therefore Sp(m) = 1.
Assume we have a collection $m_i$ of D–S belief structures, each having a specificity $Sp(m_i)$. From this we can obtain the weights used in the U-A aggregation as
$$a_i = \frac{Sp(m_i)}{\sum_i Sp(m_i)}$$
We note that in this case any belief structure having just X as its focal element will have $a_i = 0$. Here then a belief structure with just X as its focal element will be neutral and will not affect the aggregation. The one exception to this is if all the contributing belief structures have just X as their focal element. In this case all the contributing belief structures get the same weight and the fused belief structure is also one with just X as its focal element. We also note that if we are combining only Bayesian belief structures then, since all have specificity one, they will all have the same weight. In addition, in a mixed type aggregation the Bayesian ones will all have the same $a_i$ value, which will be more than that associated with the non-Bayesian belief structures.
We note that this specificity-based importance weight can be viewed as a type of weight based on the quality of the information provided. We can also associate with each source a measure of confidence. In this case we have two measures, one based on the confidence in or expertise of the source and the other based on the quality of the information provided, and we must combine them to obtain $a_i$. Let $k_i \in [0, 1]$ be the specificity-based measure associated with the quality of the information in $m_i$ and let $w_i$ be the measure of confidence in the source $S_i$. We can combine these to give us $a_i$ in two ways:
1. $a_i = \frac{k_i w_i}{\sum_{i=1}^{n} k_i w_i}$
2. $a_i = \frac{b k_i + \bar{b} w_i}{\sum_{i=1}^{n} (b k_i + \bar{b} w_i)}$
Here $b \in [0, 1]$ and $\bar{b} = 1 - b$. If b = 0.5 then we get $a_i = \frac{k_i + w_i}{\sum_{i=1}^{n} (k_i + w_i)}$.
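The specificity-based weights described above can be sketched in Python for a finite X. The function names and the dictionary representation are assumptions for illustration, and the equal-weight fallback for the all-vacuous case follows the exception noted in the text. The first two structures reuse the four-element example from earlier in this section; the third is a hypothetical vacuous source.

```python
def sp_set(F, n):
    """Specificity of a subset F of a finite space with n > 1 elements."""
    return (n - len(F)) / (n - 1)


def sp(m, n):
    """Specificity of a belief structure: weighted specificity of its focal elements."""
    return sum(sp_set(F, n) * w for F, w in m.items())


def specificity_weights(structures, n):
    """Weights a_i proportional to Sp(m_i)."""
    scores = [sp(m, n) for m in structures]
    total = sum(scores)
    if total == 0:
        # Every structure has only X as focal element: equal weights.
        return [1.0 / len(structures)] * len(structures)
    return [s / total for s in scores]


X = frozenset({"x1", "x2", "x3", "x4"})
m1 = {frozenset({"x1"}): 0.5, frozenset({"x2"}): 0.3,
      frozenset({"x3"}): 0.1, frozenset({"x4"}): 0.1}   # Bayesian
m2 = {frozenset({"x2", "x3"}): 0.6, frozenset({"x1", "x2", "x4"}): 0.4}
m3 = {X: 1.0}                                           # hypothetical vacuous source

a = specificity_weights([m1, m2, m3], n=len(X))
```

The Bayesian structure receives the largest weight and the vacuous one receives zero, so under these weights the U-A rule treats a source that only says "V is in X" as neutral.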
6. General approach to aggregation of imprecise uncertainty measures As we initially indicated our interest is in the representation and fusion of imprecise uncertain information about a variable V whose domain is X. We recall that an imprecise uncertainty measure is one in which for any subset A our confidence that the value of V lies in A is expressed by R(A) = [L(A), U(A)], a subset of the unit interval. Furthermore our focus on the D–S belief structure was because it provided a useful and simple representation for a wide class of imprecise uncertainty measures. We again emphasize that not all imprecise uncertainty measures are of the D–S class. In the preceding we have introduced some procedures for aggregating multiple Dempster–Shafer belief structures. An important feature of these procedures was their double closed nature. In particular in addition to be closed with respect to resulting in an imprecise uncertainty measure they also were closed with respect to the fact they resulted in D–S belief structures. Here we shall consider alternative methods for fusing D–S belief structures that while resulting in imprecise uncertainty measures they are not necessarily D–S belief structures. However here the problem of conflict introduced by null sets is avoided. As we shall see these methods can be seen as a generalization of the U-A method of aggregation where we recall that this P U-A aggregation can be viewed as a one in which the Pl(A) and Bel(A) of the fused measure satisfy PlðAÞ ¼ i ai Pli ðAÞ and P BelðAÞ ¼ i ai Beli ðAÞ. Let S = {S1, . . . , Sr} be a collection of sources providing information about a variable V whose domain is X. Assume each source provides information about the variable V in terms of D–S belief structure, mi. Thus for each source we have for any subset A of X a range
Ri ðAÞ ¼ ½Beli ðAÞ; Pli ðAÞ ¼ ½Li ðAÞ; Hi ðAÞ Here we shall indicating the confidence we have that the value of V lies in the set A. We recall for that Beli ðAÞ ¼ 1 Pli ðAÞ. investigate a general approach to fusing the information provided by the individual sources by essentially fusing the ranges provided by the individual sources. Here we associate with the collection of sources S a function Cred : 2S ? [0, 1] such that for any subset Z # S the value Cred(Z) indicates the credibility of any fusion based on the subset Z of sources. We see some required properties that any such credibility function should have. The first property is that if Z1 Z2 then Cred(Z2) P Cred(Z1), using more sources cannot reduce the credibility. At the extremes we have Cred(£) = 0 and Cred(S) = 1. Here we have that

$$0 = Cred(\emptyset) \le Cred(Z) \le Cred(S) = 1$$

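For a small collection of sources, the requirements just stated (monotonicity under inclusion together with the two boundary values) can be checked by brute force over all subsets. The following is a minimal sketch and not part of the original development; the function names and the three-source table are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def all_subsets(sources):
    """Yield every subset of `sources` as a frozenset."""
    items = list(sources)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_credibility_function(cred, sources, tol=1e-12):
    """Check Cred(empty) = 0, Cred(S) = 1 and monotonicity under inclusion.

    `cred` is any callable mapping a frozenset of sources to a number.
    """
    full = frozenset(sources)
    if abs(cred(frozenset())) > tol or abs(cred(full) - 1.0) > tol:
        return False
    subsets = list(all_subsets(full))
    for z1 in subsets:
        for z2 in subsets:
            # z1 <= z2 is the subset test for frozensets.
            if z1 <= z2 and cred(z1) > cred(z2) + tol:
                return False
    return True


# Hypothetical example: credibility of each subset given by an explicit table.
SOURCES = ("S1", "S2", "S3")
TABLE = {
    frozenset(): 0.0,
    frozenset({"S1"}): 0.2,
    frozenset({"S2"}): 0.3,
    frozenset({"S3"}): 0.1,
    frozenset({"S1", "S2"}): 0.7,
    frozenset({"S1", "S3"}): 0.4,
    frozenset({"S2", "S3"}): 0.5,
    frozenset({"S1", "S2", "S3"}): 1.0,
}


def table_cred(z):
    """Look up the credibility of the subset z in the table above."""
    return TABLE[frozenset(z)]


def bad_cred(z):
    """Violates monotonicity: single sources are credible, pairs are not."""
    return 1.0 if len(z) in (1, 3) else 0.0
```

Here `is_credibility_function(table_cred, SOURCES)` accepts the table, while `bad_cred` is rejected because adding a second source lowers the credibility. The exhaustive check grows as $4^r$ pairs of subsets, so it is only meant for small $r$.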
Let us look at some particular examples of the credibility function. A basic type of credibility function is one in which we associate with each $S_i$ a value $Cred(\{S_i\}) = a_i$ and then obtain $Cred(Z) = \sum_{S_i \in Z} a_i$. Here we require that $\sum_{i=1}^{r} a_i = 1$. This is an additive credibility function. Another notable credibility function is one in which the credibility depends only on the number of sources used. Thus in the case $Cred(Z) = b_{|Z|}$, where $|Z|$ is the cardinality of $Z$, we call these cardinality-based measures. Here we require $b_r = 1$, $b_0 = 0$ and $b_i \ge b_k$ if $i > k$. We note that in this case if we let $w_j = b_j - b_{j-1}$ then $w_j \ge 0$. A special case of this is where $b_r = 1$ and $b_j = 0$
for $j \ne r$. In this case we always require all sources to be used. At the other extreme, if $b_j = 1$ for $j \ne 0$ then the use of any source is completely credible. We note that the cardinality-based credibility function with $b_j = 0$ for all $j < r$ can be seen as an "anding" or conjunction of sources. On the other hand, the one with $b_j = 1$ for all $j \ne 0$ provides for an "oring" or disjunction of all sources. A more general example of this cardinality-based credibility function, which includes the two preceding, is one in which $b_j = 0$ for $j < d$ and $b_j = 1$ for $j \ge d$.

It is important to point out that the structure of a credibility function can be described in many different ways. We can express the credibility linguistically, such as by saying "most of the sources must be satisfied". We can also express the credibility function using a fuzzy model [17,26]. However, our point of departure will be the availability of a credibility function.

In the following we shall have need for the concept of the dual of a credibility function. Assume Cred is a credibility relation on $S$; we denote its dual $\widehat{Cred}$ and define it as

$$\widehat{Cred}(Z) = 1 - Cred(\bar{Z})$$

We see that $\widehat{Cred}$ is itself a credibility measure. In particular

$$\widehat{Cred}(\emptyset) = 1 - Cred(S) = 0$$

$$\widehat{Cred}(S) = 1 - Cred(\emptyset) = 1$$

If $Z_1 \subseteq Z_2$ then $\bar{Z}_1 \supseteq \bar{Z}_2$ and hence

$$\widehat{Cred}(Z_2) = 1 - Cred(\bar{Z}_2) \ge 1 - Cred(\bar{Z}_1) = \widehat{Cred}(Z_1)$$

We also see that the dual of the dual of Cred is Cred.

Let us look at examples of duals of some credibility functions. If Cred is an additive credibility function, $Cred(Z) = \sum_{S_i \in Z} a_i$, then $\widehat{Cred} = Cred$. We see this as follows: $\widehat{Cred}(Z) = 1 - Cred(\bar{Z})$, and with $Cred(\bar{Z}) = \sum_{S_i \in \bar{Z}} a_i = 1 - \sum_{S_i \in Z} a_i$ we obtain $\widehat{Cred}(Z) = Cred(Z)$.

Another interesting example is the case where Cred is a cardinality-based function. In this case we see that since $|\bar{Z}| = r - |Z|$ then $\widehat{Cred}(Z) = 1 - Cred(\bar{Z}) = 1 - b_{r-|Z|}$. In the special case where $b_j = 0$ for $j < d$ and $b_j = 1$ for $j \ge d$ we see that $\widehat{Cred}(Z) = 0$ if $|Z| \le r - d$ and $\widehat{Cred}(Z) = 1$ if $|Z| > r - d$.

We now recall the concept of the Choquet integral [2], which provides a tool for aggregation with set functions. Let $Y$ be a set of $n$ objects and let $\mu$ be a set measure on $Y$. Assume we associate with each $y_i \in Y$ a value $a_i$. The Choquet integral provides for an aggregation of the $a_i$ with respect to the measure $\mu$. In particular

$$Choq_\mu(a_1, \dots, a_n) = \sum_{j=1}^{n} w_j \, a_{ind(j)}$$

where $ind(j)$ is the index of the $j$th largest of the $a_i$ and $w_j = \mu(G_j) - \mu(G_{j-1})$ with

$$G_j = \{\, y_{ind(k)} \mid k = 1 \text{ to } j \,\}$$

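The definition above translates directly into a short routine: sort the objects by decreasing value, grow the set $G_j$ one object at a time, and weight each value by the increment of the measure. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the measure is passed as a callable on frozensets, and $G_0$ is taken to be the empty set, so that $w_1 = \mu(G_1) - \mu(\emptyset)$.

```python
def choquet(values, measure):
    """Discrete Choquet integral of `values` with respect to `measure`.

    values  : dict mapping each object y to its value a_y
    measure : callable taking a frozenset of objects and returning mu(.)
    """
    # Order the objects so that the largest value comes first: ind(1), ind(2), ...
    order = sorted(values, key=lambda y: values[y], reverse=True)
    total = 0.0
    prev_mu = measure(frozenset())  # mu(G_0), with G_0 assumed empty
    for j in range(1, len(order) + 1):
        g_j = frozenset(order[:j])  # G_j: objects holding the j largest values
        mu_j = measure(g_j)
        total += (mu_j - prev_mu) * values[order[j - 1]]  # w_j * a_ind(j)
        prev_mu = mu_j
    return total


def additive_measure(weights):
    """Build mu(G) = sum of the weights of the objects in G."""
    return lambda g: sum(weights[y] for y in g)


# Illustrative values only.
values = {"y1": 0.2, "y2": 0.9, "y3": 0.4}

# Additive measure: the integral is the ordinary weighted average.
weighted = choquet(values, additive_measure({"y1": 0.5, "y2": 0.3, "y3": 0.2}))

# mu(G) = 1 for every nonempty G: the integral returns the maximum.
largest = choquet(values, lambda g: 1.0 if len(g) >= 1 else 0.0)

# mu(G) = 1 only for the full set: the integral returns the minimum.
smallest = choquet(values, lambda g: 1.0 if len(g) == len(values) else 0.0)
```

With an additive measure the ordering step has no effect on the result, which is why that case collapses to a weighted average; for non-additive measures the ordering determines which increments of $\mu$ are used.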
While we shall not prove it here, the Choquet integral is an averaging operator. In particular:

(1) $Choq_\mu(a_1, \dots, a_n) \ge Choq_\mu(b_1, \dots, b_n)$ if all $a_i \ge b_i$ (Monotonic)
(2) $Choq_\mu(a_1, \dots, a_n) = a$ if all $a_i = a$ (Idempotent)
(3) $\min_i(a_i) \le Choq_\mu(a_1, \dots, a_n) \le \max_i(a_i)$ (Bounded)

We now describe our procedure for fusing the information provided by the sources. We have a collection of sources of information $S = \{S_1, \dots, S_r\}$. Each source provides information about the value of the variable $V$ in terms of a Dempster–Shafer structure $m_i$. For each source we can calculate, for any subset $A$, the value $R_i(A) = [Bel_i(A), Pl_i(A)]$. In addition we have a credibility mapping $Cred: 2^S \to [0, 1]$ describing the imperative for fusing the information provided by the sources. We now express our fused information as a general imprecise uncertainty measure (IPUM) $g$, which is defined in terms of $R_g$, where for each $A \subseteq X$ the interval $R_g(A) = [L_g(A), H_g(A)]$ indicates the range of our confidence that $V$ is in $A$. The intervals $R_g(A)$ are obtained by fusing the individual $R_i(A)$ using the Cred function; here then

$$R_g(A) = Fuse_{Cred}(R_1(A), \dots, R_r(A)) = [L(A), H(A)]$$

where $H(A) = Choq_{Cred}(Pl_1(A), \dots, Pl_r(A))$ and $L(A) = Choq_{Cred}(Bel_1(A), \dots, Bel_r(A))$. Since $0 \le Pl_i(A) \le 1$ and $0 \le Bel_i(A) \le 1$, we see that $0 \le L(A) \le 1$ and $0 \le H(A) \le 1$. Since $Pl_i(\emptyset) = Bel_i(\emptyset) = 0$ then $H(\emptyset) = L(\emptyset) = 0$, and since $Pl_i(X) = Bel_i(X) = 1$ then $H(X) = L(X) = 1$. If $A \subseteq B$ then $Pl_i(A) \le Pl_i(B)$ and $Bel_i(A) \le Bel_i(B)$, and because of the monotonicity of the Choquet integral $H(A) \le H(B)$ and $L(A) \le L(B)$. Finally we must look at the relationship between $H(A)$ and $L(A)$. Since $Pl_i(A) \ge Bel_i(A)$, the monotonicity of the Choquet integral assures us that $H(A) \ge L(A)$.

We now consider some special cases of Cred. First we note that if Cred is an additive type credibility function with $Cred(\{S_i\}) = a_i$ then this reduces to the U-A aggregation where

$$H(A) = \sum_{i=1}^{r} a_i \, Pl_i(A) \quad \text{and} \quad L(A) = \sum_{i=1}^{r} a_i \, Bel_i(A)$$

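To make the additive case concrete, the sketch below computes $Bel_i(A)$ and $Pl_i(A)$ from each source's mass assignment in the standard way (mass of focal elements contained in $A$, and mass of focal elements intersecting $A$) and then forms the weighted averages. It is an illustration under stated assumptions, not the paper's implementation: the function names, the two belief structures and the equal weights are hypothetical. The two sources are deliberately in conflict, with one favouring x1 and the other x2.

```python
def bel(mass, a):
    """Belief: total mass of the focal elements contained in `a`."""
    a = frozenset(a)
    return sum(m for focal, m in mass.items() if focal <= a)


def pl(mass, a):
    """Plausibility: total mass of the focal elements intersecting `a`."""
    a = frozenset(a)
    return sum(m for focal, m in mass.items() if focal & a)


def fuse_additive(masses, weights, a):
    """Return (L(A), H(A)) as weighted averages of Bel_i(A) and Pl_i(A)."""
    if len(masses) != len(weights):
        raise ValueError("one weight per source is required")
    low = sum(w * bel(m, a) for m, w in zip(masses, weights))
    high = sum(w * pl(m, a) for m, w in zip(masses, weights))
    return low, high


# Hypothetical frame and belief structures (focal element -> mass).
X = frozenset({"x1", "x2", "x3"})
m1 = {frozenset({"x1"}): 0.7, X: 0.3}
m2 = {frozenset({"x2"}): 0.8, X: 0.2}

# Equal credibility weights a_1 = a_2 = 0.5 (they must sum to one).
low, high = fuse_additive([m1, m2], [0.5, 0.5], {"x1"})
```

For $A = \{x_1\}$ the first source gives $[0.7, 1.0]$ and the second $[0, 0.2]$, so the fused range is approximately $[0.35, 0.6]$. No intersection of focal elements is ever formed, so the disjoint focal elements $\{x_1\}$ and $\{x_2\}$ cause no difficulty.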
Consider now the case where Cred is a cardinality type credibility function with weights $0 = b_0 \le b_1 \le \cdots \le b_r = 1$. Let $pind$ be a function such that $pind(j)$ is the index of the $j$th largest of the $Pl_i(A)$ and let $bind$ be a function such that $bind(j)$ is the index of the $j$th largest of the $Bel_i(A)$. In this case

$$H(A) = \sum_{j=1}^{r} (b_j - b_{j-1}) \, Pl_{pind(j)}(A) \quad \text{and} \quad L(A) = \sum_{j=1}^{r} (b_j - b_{j-1}) \, Bel_{bind(j)}(A)$$

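In the cardinality-based case the fusion only needs the per-source values sorted in decreasing order and the increments $b_j - b_{j-1}$. The sketch below illustrates this for the threshold family $b_j = 0$ for $j < d$ and $b_j = 1$ for $j \ge d$ introduced earlier. The function names and the numerical inputs are hypothetical and serve only as an illustration.

```python
def fuse_cardinality(bels, pls, b):
    """Fuse Bel_i(A) and Pl_i(A) with a cardinality-based credibility function.

    bels, pls : lists holding Bel_i(A) and Pl_i(A) for the r sources
    b         : list [b_0, b_1, ..., b_r], nondecreasing, b_0 = 0 and b_r = 1
    Returns the pair (L(A), H(A)).
    """
    r = len(bels)
    if len(pls) != r or len(b) != r + 1:
        raise ValueError("need r Bel values, r Pl values and r + 1 values b_j")
    weights = [b[j] - b[j - 1] for j in range(1, r + 1)]
    # The j-th weight multiplies the j-th largest value; Bel and Pl are
    # sorted separately, matching the separate index functions bind and pind.
    pls_sorted = sorted(pls, reverse=True)
    bels_sorted = sorted(bels, reverse=True)
    high = sum(w * p for w, p in zip(weights, pls_sorted))
    low = sum(w * v for w, v in zip(weights, bels_sorted))
    return low, high


def threshold_b(r, d):
    """b_j = 0 for j < d and b_j = 1 for j >= d, with 1 <= d <= r."""
    if not 1 <= d <= r:
        raise ValueError("d must lie between 1 and r")
    return [0.0 if j < d else 1.0 for j in range(r + 1)]


# Hypothetical Bel_i(A) and Pl_i(A) reported by three sources for one set A.
bels = [0.7, 0.0, 0.3]
pls = [1.0, 0.2, 0.6]

all_sources = fuse_cardinality(bels, pls, threshold_b(3, 3))  # d = r
any_source = fuse_cardinality(bels, pls, threshold_b(3, 1))   # d = 1
two_sources = fuse_cardinality(bels, pls, threshold_b(3, 2))  # d = 2
```

With these inputs, $d = r$ selects the smallest Bel and Pl values, $d = 1$ the largest, and $d = 2$ the second largest of each.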
We consider now some special cases of the $b_j$. In the case where we have an "anding", $b_j = 0$ for $j \ne r$ and $b_r = 1$, then $H(A) = \min_i[Pl_i(A)]$ and $L(A) = \min_i[Bel_i(A)]$. At the other extreme, if $b_j = 1$ for all $j \ge 1$ and $b_0 = 0$ then $H(A) = \max_i[Pl_i(A)]$ and $L(A) = \max_i[Bel_i(A)]$.

It is interesting to investigate the nature of this approach. Here we use the Dempster–Shafer structure to represent the input information provided by the sources. In this regard we emphasize that the use of the Dempster–Shafer representation greatly reduces the burden on the sources providing the information. In addition, with the use of the Dempster–Shafer representation we can easily and algorithmically generate any $Pl_i(A)$ and $Bel_i(A)$ we need. From a D–S representation of the source information we easily obtain $H(A) = Choq_{Cred}(Pl_1(A), \dots, Pl_r(A))$ and $L(A) = Choq_{Cred}(Bel_1(A), \dots, Bel_r(A))$. One advantage of this approach is that the issue of conflicts and null intersections does not arise. In addition it allows for the use of sophisticated rules for the aggregation of the information provided by the sources using the Cred function.

Since the Dempster–Shafer belief structure provides a compact formulation for imprecise uncertainty measures that is easy to comprehend and use, in many cases it may be desirable to convert the IPUM $g$ into a D–S belief structure. With this in mind one can consider the introduction of a process of retranslation [25,13]. Here, starting with the aggregated ranges $R_g(A) = [L(A), H(A)]$, we desire to obtain an appropriate D–S belief structure $m$. As we earlier indicated, any D–S belief structure $m$ whose range satisfies $R_m(A) = [Bel_m(A), Pl_m(A)] \supseteq [L(A), H(A)]$ for all $A$ is inferable from the fused values $R_g(A)$. Ideally, in order to retain as much of the information contained in $g$ as possible, our objective then becomes to obtain a "minimal" type inferable D–S belief structure.
In particular we desire a belief structure $m$ such that:

(1) $Bel_m(A) \le L(A)$ for all $A$
(2) $Pl_m(A) \ge H(A)$ for all $A$
(3) $Pl_m(A) - Bel_m(A)$ is minimal for any $A$

We note that there always exists at least one solution to the first two conditions, which is to let $m$ have one focal element $X$ with $m(X) = 1$; however, this has the maximum possible value for $Pl_m(A) - Bel_m(A)$. We see that finding a solution to these three conditions is an optimization problem. Actually solving this optimization can be very difficult and can also result in a very complex formulation of the resulting optimal D–S belief structure. One way around this problem is to relax the third requirement. One procedure that can be used is a type of intelligent conjecturing. Here, based on the information contained in the $R_g(A)$, one can conjecture some formulations for the D–S belief structure that meet the first two conditions and select between these based on the compactness of the D–S formulation and the information retained as measured by $Pl_m(A) - Bel_m(A)$. While perhaps more controversial, the possibility of some "minor" infringement of conditions one and two can be considered in tradeoff for some elegant formulation of the resulting Dempster–Shafer belief structure.

7. Conclusion

Our interest was in the fusion of information from multiple sources when the information provided by the individual sources is expressed in terms of an imprecise uncertainty measure. We observed that the Dempster–Shafer belief structure provided a framework for the representation of a wide class of imprecise uncertainty measures. We then discussed the fusion of multiple Dempster–Shafer belief structures using the Dempster rule and noted the problems that can arise when using this fusion method because of the required normalization in the face of conflicting focal elements. We then suggested some alternative approaches to fusing multiple belief structures that avoid the need for normalization.

References

[1] M. Casanovas, J.M. Merigó, Decision making with Dempster–Shafer theory and uncertain induced aggregation operators, Journal of International Business Disciplines 3 (2008) 13–26.
[2] G. Choquet, Theory of capacities, Annales de l’Institut Fourier 5 (1953) 131–295.
[3] A.P. Dempster, New methods of reasoning toward posterior distributions based on sample data, Annals of Mathematical Statistics 37 (1966) 355–374.
[4] A.P. Dempster, Upper and lower probabilities induced by a multi-valued mapping, Annals of Mathematical Statistics 38 (1967) 325–339.
[5] A.P. Dempster, A generalization of Bayesian inference, Journal of the Royal Statistical Society (1968) 205–247.
[6] D. Dubois, H. Prade, On the unicity of Dempster rule of combination, International Journal of Intelligent Systems 1 (1986) 133–142.
[7] D. Dubois, H. Prade, Formal representation of uncertainty, in: D. Bouyssou, D. Dubois, M. Pirlot, H. Prade (Eds.), Decision-Making Process, John Wiley & Sons, Hoboken, 2009.
[8] C. Fu, S.L. Yang, Analyzing the applicability of Dempster’s rule to the combination of interval-valued belief structures, Expert Systems with Applications 38 (2011) 4291–4301.
[9] D. Hall, J. Llinas, An introduction to multisensor data fusion, Proceedings of IEEE 85 (1997) 6–24.
[10] G.J. Klir, Uncertainty and Information, John Wiley & Sons, New York, 2006.
[11] K. Lehrer, C. Wagner, Rational Consensus in Science and Society, Reidel, Dordrecht, 1981.
[12] T.C. Lin, Switching-based filter based on Dempster’s combination rule for image processing, Information Sciences 180 (2010) 4892–4908.
[13] O. Martin, G.J. Klir, On the problem of retranslation in computing with perceptions, International Journal of General Systems 35 (2006) 655–674.
[14] E. Miranda, I. Couso, P. Gil, Approximations of upper and lower probabilities by measurable selections, Information Sciences 180 (2010) 1407–1417.
[15] H.B. Mitchell, Multi-Sensor Data Fusion: An Introduction, Springer-Verlag, Heidelberg, 2007.
[16] J. Montero, D. Ruan, Modeling uncertainty, Information Sciences 180 (2010) 799–802.
[17] W. Pedrycz, F. Gomide, Fuzzy Systems Engineering: Toward Human-Centric Computing, John Wiley & Sons, New York, 2007.
[18] G.A. Shafer, Mathematical Theory of Evidence, Princeton University Press, Princeton, N.J., 1976.
[19] P. Smets, R. Kennes, The transferable belief model, Artificial Intelligence 66 (1994) 191–234.
[20] H. Wang, J. Liu, J.C. Augusto, Mass function derivation and combination in multivariate data spaces, Information Sciences 180 (2010) 813–819.
[21] Y.M. Wang, J.B. Yang, D.L. Xu, K.S. Chin, On the combination and normalization of interval-valued belief structures, Information Sciences 177 (2007) 1230–1247.
[22] R.R. Yager, Entropy and specificity in a mathematical theory of evidence, International Journal of General Systems 9 (1983) 249–260.
[23] R.R. Yager, On the Dempster–Shafer framework and new combination rules, Information Sciences 41 (1987) 93–137.
[24] R.R. Yager, On measures of specificity, in: O. Kaynak, L.A. Zadeh, B. Turksen, I.J. Rudas (Eds.), Computational Intelligence: Soft Computing and Fuzzy-Neuro Integration with Applications, Springer-Verlag, Berlin, 1998, pp. 94–113.
[25] R.R. Yager, On the retranslation process in Zadeh’s paradigm of computing with words, IEEE Transactions on Systems, Man and Cybernetics: Part B 34 (2004) 1184–1195.
[26] R.R. Yager, D.P. Filev, Essentials of Fuzzy Modeling and Control, John Wiley, New York, 1994.
[27] R.R. Yager, L. Liu, Classic Works of the Dempster–Shafer Theory of Belief Functions, Springer, Heidelberg, 2008. A.P. Dempster, G. Shafer, Advisory Editors.
[28] K. Yamada, A new combination of evidence based on compromise, Fuzzy Sets and Systems 159 (2008) 1689–1708.
[29] L.A. Zadeh, On the validity of Dempster’s rule of combination of evidence, Memo# UCB/ERL, M79/32, University of California, Berkeley, 1979.
[30] L.A. Zadeh, A simple view of the Dempster–Shafer theory of evidence and its implication for the rule of combination, AI Magazine Summer (1986) 85–90.
[31] L.A. Zadeh, Toward a generalized theory of uncertainty (GTU)-an outline, Information Sciences 172 (2005) 1–40.
[32] L.A. Zadeh, Is there a need for fuzzy logic?, Information Sciences 13 (2008) 2751–2779.