Fuzzy Sets and Systems 126 (2002) 177–190
www.elsevier.com/locate/fss
A probabilistic definition of a nonconvex fuzzy cardinality

Miguel Delgado∗, Daniel Sánchez, María J. Martín-Bautista, María Amparo Vila

Department of Computer Science and Artificial Intelligence, University of Granada, Avda. Andalucia 38, 18071 Granada, Spain

Received 24 August 1999; received in revised form 30 November 2000; accepted 8 January 2001
Abstract

The existing methods to assess the cardinality of a fuzzy set with finite support are intended to preserve the properties of classical cardinality. In particular, the main objective of researchers in this area has been to ensure the convexity of fuzzy cardinalities, in order to preserve some properties based on the addition of cardinalities, such as the additivity property. We have found that in order to solve many real-world problems, such as the induction of fuzzy rules in Data Mining, convex cardinalities are not always appropriate. In this paper, we propose a possibilistic and a probabilistic cardinality of a fuzzy set with finite support. These cardinalities are not convex in general, but they are better suited to solving such problems and, contrary to the prevailing opinion, they are found to be more intuitive for humans. Their suitability relies mainly on the fact that they assume dependency among objects with respect to the property “to be in a fuzzy set”. The cardinality measures are generalized to relative ones among pairs of fuzzy sets. We also introduce a definition of the entropy of a fuzzy set by using one of our probabilistic measures. Finally, a fuzzy ranking of the cardinality of fuzzy sets is proposed, and a definition of graded equipotency is introduced. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Fuzzy cardinality; Fuzzy relative cardinality; Fuzzy entropy; Equipotency of fuzzy sets
∗ Corresponding author. Tel.: +34-958-244018; fax: +34-958243317.

1. Introduction

Measuring the cardinality of a fuzzy set with finite support (fuzzy set from now on) is a necessary task in many approximate reasoning problems, such as fuzzy querying in databases, expert systems, evaluation of natural language statements, aggregation, decision-making in fuzzy environments, etc. (see [14,1,3,2]), where natural language sentences are modeled by using a fuzzy representation of imprecise terms.

In the field of Data Mining, one of the most important tasks is to discover statistically significant associations among the occurrence of data items in the records of a database. To do this, counting items is necessary, i.e., we must find the cardinality of the set of records containing each item, together with the relative count of one item with respect to other items. On many occasions, the number of different items in the database is so high that it is very difficult to find statistically significant associations among them.
0165-0114/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0165-0114(01)00039-2
M. Delgado et al. / Fuzzy Sets and Systems 126 (2002) 177–190
The usual solution is to cluster the items and then find associations among the clusters. Each cluster is a set of items that is considered as a “higher-order item” with a clear semantic interpretation. Then, for each cluster we must perform a count of the number of records containing an item of the cluster, in order to obtain the cardinality of the cluster. It is widely accepted that on many occasions fuzzy clustering is the most natural way to cluster domains. A typical example is the fuzzy partitioning of the domain “Age” by the set of fuzzy sets “Very young”, “Young”, “Medium”, “Old”, “Very old”. In such cases, the clusters are fuzzy sets of items, and we have to find statistical associations among “fuzzy higher-order items”. Therefore, measuring the cardinality of such fuzzy sets is crucial in order to solve the problem.

Once the cardinality is measured, we are able to obtain a solution by calculating the accomplishment degree among the cardinalities and fuzzy quantifiers. This final step is known as the evaluation of quantified sentences of type I and II. These quantified sentences are assertions about the number or percentage of objects that verify a certain property. We study two kinds of sentences, called type I and type II sentences, respectively. Examples of such sentences are “Most of the students are tall” or “Almost all the intelligent students are tall”. In the first case, the set of objects in which we evaluate the fuzzy property is crisp. In the second case, the set is a fuzzy set. The quantifiers are fuzzy quantities (fuzzy sets over the non-negative integers) or fuzzy percentages (fuzzy sets over the real interval [0, 1], such as “Most” and “Almost all” in the previous examples).

The existing methods for evaluating quantified sentences have some drawbacks that could be avoided with what we have called the cardinality approach. We evaluate the degree of fulfillment of type I (resp. type II) sentences as the compatibility degree between the fuzzy quantifier and the fuzzy cardinality (resp. fuzzy relative cardinality) of the fuzzy set that represents the property.

Several authors have proposed ways to measure the cardinality of a fuzzy set, extending the classic one in different ways (see [13]). The most common approaches are the scalar cardinality and the fuzzy cardinality of a fuzzy set. The first approach claims that the cardinality of a fuzzy set is measured by means of a scalar value, either integer or real, whereas the second approach assumes the cardinality of a fuzzy set is just another fuzzy set over the non-negative integers. The definitions by De Luca and Termini [6], Ralescu [9], Dubois and Prade [1] and Wygralak [14,16,17] are to be found in the first category. Many authors have suggested, however, that the second approach is more appropriate (see [1,14]). In this category are the definitions by Zadeh [21,22], Dubois and Prade [1] and Wygralak [14,18].

These extensions of the classical cardinality theory are intended to preserve as many classical properties as possible. Two of the main properties of the classical cardinality of a set are the valuation property, i.e.

$$\mathrm{Card}(A) + \mathrm{Card}(B) = \mathrm{Card}(A \cup B) + \mathrm{Card}(A \cap B) \tag{1}$$

and the additivity property, defined as

$$A \cap B = \emptyset \Rightarrow \mathrm{Card}(A) + \mathrm{Card}(B) = \mathrm{Card}(A \cup B). \tag{2}$$

Wygralak and Pilarski have shown that additivity, based on the extension principle, is verified by a t-norm-based generalized fuzzy cardinality if it is convex [18]. Convexity of a fuzzy measure M can be defined as

$$M \text{ is convex} \Leftrightarrow \bigl(x \le y \le z \Rightarrow M(y) \ge \min(M(x), M(z))\bigr). \tag{3}$$

Also in [18] it is shown that t-norm-based generalized fuzzy cardinalities verify the valuation property (based on the extension principle) if and only if standard intersection and union are used.

One of the claims we make in this paper is that in general convexity is not a natural property for the fuzzy cardinality of a fuzzy set. We noticed this for the first time when we were using convex cardinalities to evaluate quantified sentences on real cases. In many experiments we realized that the results obtained using methods based on convex cardinalities were not always coherent with the intuitively expected correct results. These conclusions motivated us to search for cardinalities that were better suited to real-life problems.

A very simple example can help us to clarify our claims. Let $X = \{\text{John}, \text{Mike}, \text{Peter}\}$ be a set of
friends, and let us suppose that John is blonde, and Mike and Peter are fairly blonde. We could say that John is in the set of blonde people with degree 1 and Mike and Peter are in the same set with degree 0.5. Suppose we want to buy a cap for every blonde person in X. Deciding how many caps we shall buy is equivalent to obtaining the cardinality of the fuzzy set $X_{\mathrm{Blonde}} = 1/\text{John} + 0.5/\text{Mike} + 0.5/\text{Peter}$. If we are strict with the concept of blonde, we shall buy only one cap (for John). But if we relax our criterion we could think of buying three caps, one for each friend in X. What should be clear is that buying two caps is not a solution for the problem, because as John is certainly blonde one of the caps should be for him. What can we do then with the other cap, given that Mike and Peter are equally blonde? In other words, the cardinality of $X_{\mathrm{Blonde}}$ can be one or three, but not two, because Mike is in $X_{\mathrm{Blonde}}$ if and only if Peter also is. Hence, the possibility that the cardinality of $X_{\mathrm{Blonde}}$ is two is 0.

But convex cardinalities do not agree with this. Any convex fuzzy cardinality will give a possibility greater than 0 that “two” is the cardinality of $X_{\mathrm{Blonde}}$, because clearly the possibilities that the cardinality is “one” or “three” are both greater than 0, and a convex fuzzy cardinality verifies (3). For instance, the convex cardinality FECount gives $\mathrm{FECount}(X_{\mathrm{Blonde}}) = 0/0 + 0.5/1 + 0.5/2 + 0.5/3$. In this sense, we find convexity to be counterintuitive in some cases.

This problem affects the evaluation of quantified sentences by means of the cardinality approach. For instance, the evaluation of the sentence “$Q_{2/3}$ of people in X are blonde” (where $Q_{2/3} = \{1/\tfrac{2}{3}\}$ is the quantifier “exactly 2/3”) should be 0, as the cardinality of $X_{\mathrm{Blonde}}$ cannot be two.
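The cap example can be checked numerically. The following Python sketch is illustrative only: the helper names `is_convex` and `evaluate_exactly` are hypothetical, a max-min evaluation of the crisp quantifier “exactly k” is assumed, and `cut_based` is an assumed nonconvex distribution that puts possibility only on the cardinalities one and three.

```python
def is_convex(card):
    """Check property (3): M(y) >= min(M(x), M(z)) whenever x <= y <= z."""
    ks = sorted(card)
    return all(
        card[y] >= min(card[x], card[z])
        for x in ks for y in ks for z in ks
        if x <= y <= z
    )


def evaluate_exactly(card, k):
    """Max-min evaluation of "exactly k" against a fuzzy cardinality (assumed operators)."""
    quantifier = {i: 1.0 if i == k else 0.0 for i in card}
    return max(min(card[i], quantifier[i]) for i in card)


# FECount(X_Blonde) as stated in the text: 0/0 + 0.5/1 + 0.5/2 + 0.5/3.
fecount = {0: 0.0, 1: 0.5, 2: 0.5, 3: 0.5}

# Assumed nonconvex alternative: only "one" and "three" are possible.
cut_based = {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.5}

print(is_convex(fecount), evaluate_exactly(fecount, 2))      # True 0.5
print(is_convex(cut_based), evaluate_exactly(cut_based, 2))  # False 0.0
```

Under these assumptions the convex cardinality gives the sentence “exactly two are blonde” a degree of 0.5, while the nonconvex distribution gives the intuitively expected 0.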
But the result of the evaluation of this sentence by using the cardinality approach with any fuzzy cardinality C is

$$\bigoplus \{C(0) \otimes Q_{2/3}(0),\; C(1) \otimes Q_{2/3}(1/3),\; C(2) \otimes Q_{2/3}(2/3),\; C(3) \otimes Q_{2/3}(1)\} = C(2),$$

where $\oplus$ and $\otimes$ are a t-conorm and a t-norm, respectively (usually, maximum and minimum for possibilistic cardinalities). Hence, if the fuzzy cardinality C is convex, the evaluation of the sentence will not be the intuitively expected zero value.

Related to the problem of the cardinality of a fuzzy set is that of the relative cardinality of a fuzzy set F with respect to a fuzzy set G (i.e. the percentage of objects in the fuzzy set G that are also in the fuzzy set F). A scalar measure was described in [20]. Fuzzy measures are defined in [22,1]. Similar examples of non-intuitive results can be detailed for type II sentences with relative cardinalities if convexity is required.

In the first part of the paper (Sections 2 and 3), we briefly review the existing cardinality measures. We also discuss convexity and its relation to the additivity and valuation properties. In the second part, we propose both a probabilistic and a possibilistic measure of the relative cardinality of fuzzy sets, and we tackle two related problems: the entropy of a fuzzy set and the equipotency of fuzzy sets.

2. Some existing measures of the absolute and relative cardinality of a fuzzy set

Let F and G be two fuzzy sets defined over $X = \{x_1, \ldots, x_n\}$. Let $f_j$ and $g_j$ be the jth greatest value of the multisets $\{F(x_i) \mid x_i \in X\}$ and $\{G(x_i) \mid x_i \in X\}$, respectively, and let $f_0 = g_0 = 1$. Also let $f_m = g_m = 0$ for all $m > n$.

2.1. Scalar measures

2.1.1. Power of a fuzzy set

This measure was introduced by De Luca and Termini in [6] to be

$$P(F) = \sum_{i=1}^{n} F(x_i). \tag{4}$$
The power is also called the Σ-count, and it is an example of an energy measure of a fuzzy set (see [7]). The authors propose to use P(F) as a scalar cardinal for fuzzy subsets. The main drawback of this use is that for fuzzy sets with a large support whose values $F(x_i)$ are all very low, P(F) may be large enough to suggest that F is “big”, which is somewhat counter-intuitive.

2.1.2. Ralescu’s cardinality measure

The following method was proposed by Ralescu in [9]:

$$\mathrm{ncard}\, F = \begin{cases} 0 & F = \emptyset, \\ j & F \neq \emptyset \text{ and } f_j \ge 0.5, \\ j - 1 & F \neq \emptyset \text{ and } f_j < 0.5, \end{cases} \tag{5}$$
where $j = \max\{1 \le s \le n \mid f_{s-1} + f_s \ge 1\}$. Ralescu’s idea is to provide an integer value as the scalar cardinality of a fuzzy set (the power is in general a real number), and to avoid the problem of accumulating low values to give a high cardinality. Ralescu also shows that

$$\mathrm{ncard}\, F = |F_{0.5}|, \tag{6}$$
where $F_\alpha$ is the α-cut of F.

2.1.3. Wygralak’s cardinality measure

This cardinality measure is described in [14]. The motivation for this definition is the same as for Ralescu’s method. Wygralak proposes one of the values of the following integer interval as the scalar cardinality:

$$W_S(F) = \bigl[\,|F^{0.5}|,\; |F_{0.5}|\,\bigr], \tag{7}$$
where $F^{0.5} = \{x \in X \mid F(x) > 0.5\}$ is the strong α-cut of F at level 0.5, and $F_{0.5} = \{x \in X \mid F(x) \ge 0.5\}$ is the α-cut of F at level 0.5. Wygralak also gives mathematical and intuitive reasons why the values $|F^{0.5}|$ and $|F_{0.5}|$ are preferred as the best integer approximations to the cardinality of F.

Recently, Wygralak has proposed an axiomatic approach for the definition of scalar measures [16,17]. Following this approach, a function sc is a scalar cardinality iff

$$\mathrm{sc}(F) = \sum_{x \in X} f(F(x))$$
with $f : [0, 1] \to [0, 1]$ such that $f(0) = 0$, $f(1) = 1$ and $f(a) \le f(b)$ if $a \le b$. Some examples of scalar cardinalities of this kind are provided in [16,17]. Particular cases are $|F^{0.5}|$ and $|F_{0.5}|$, and in general the cardinality of any α-cut or strong α-cut of F.

2.1.4. Dubois and Prade’s cardinality measure

Dubois and Prade [1] propose a real interval of values as possible scalar cardinalities of F, the bounds of the interval being the lower and upper expectation for the cardinality. The proposed interval is $\mathrm{DPs}(F) = [\,|F_1|,\, P(F)\,]$.
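The scalar measures of this subsection can be compared side by side. The following Python sketch is a minimal illustration, not a reference implementation: fuzzy sets are assumed to be stored as dictionaries from element names to membership degrees, all function names are hypothetical, and Ralescu's measure is computed through identity (6) rather than through the case analysis (5).

```python
import math


def power(F):
    """Sigma-count (4): sum of all membership degrees."""
    return math.fsum(F.values())


def cut_size(F, alpha, strong=False):
    """Size of the alpha-cut (F(x) >= alpha) or strong alpha-cut (F(x) > alpha)."""
    if strong:
        return sum(1 for v in F.values() if v > alpha)
    return sum(1 for v in F.values() if v >= alpha)


def ncard(F):
    """Ralescu's integer cardinality, via identity (6): |F_0.5|."""
    return cut_size(F, 0.5)


def wygralak_interval(F):
    """Integer interval (7): [|strong 0.5-cut|, |0.5-cut|]."""
    return cut_size(F, 0.5, strong=True), cut_size(F, 0.5)


def dubois_prade_interval(F):
    """Real interval DPs(F) = [|F_1|, P(F)]."""
    return cut_size(F, 1.0), power(F)


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
# Hypothetical set with large support and uniformly low degrees.
many_low = {f"x{i}": 0.01 for i in range(200)}

print(power(x_blonde), ncard(x_blonde))            # 2.0 3
print(wygralak_interval(x_blonde))                 # (1, 3)
print(dubois_prade_interval(x_blonde))             # (1, 2.0)
print(round(power(many_low), 6), ncard(many_low))  # 2.0 0
```

The last line illustrates the drawback of the power noted in Section 2.1.1: two hundred elements with degree 0.01 accumulate the same power as $X_{\mathrm{Blonde}}$, although no element reaches the 0.5 level.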
2.2. Fuzzy cardinality measures

2.2.1. Zadeh’s first definition

In [21], Zadeh introduces a fuzzy measure for the cardinality of a fuzzy set F to be

$$Z(F, k) = \sup\{\alpha \mid |F_\alpha| = k\}. \tag{8}$$
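As a hedged illustration of (8), the sketch below computes Z(F, k) from the sorted membership degrees. It assumes that α ranges over (0, 1] and that the supremum of an empty set is taken as 0; the names `sorted_degrees` and `zadeh_first` are hypothetical.

```python
def sorted_degrees(F):
    """Return [f_0, f_1, ..., f_n, f_{n+1}] with f_0 = 1 and f_{n+1} = 0."""
    return [1.0] + sorted(F.values(), reverse=True) + [0.0]


def zadeh_first(F):
    """Z(F, k) for k = 0..n, assuming alpha in (0, 1] and sup(empty set) = 0."""
    f = sorted_degrees(F)
    # |F_alpha| = k exactly when f_{k+1} < alpha <= f_k, so the supremum is f_k
    # whenever that interval is non-empty, and 0 otherwise.
    return {k: (f[k] if f[k] > f[k + 1] else 0.0) for k in range(len(F) + 1)}


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
print(zadeh_first(x_blonde))  # {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.5}
```

For $X_{\mathrm{Blonde}}$ only the cardinalities of the two distinct α-cuts, 1 and 3, receive a nonzero degree.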
This definition is based on the representation theorem of fuzzy sets by means of α-cuts. The main problem that has been attributed to this method is that the additivity property of classical cardinality (based on the addition by means of the extension principle) is lost, as Z(F) is not a convex fuzzy set.

2.2.2. Zadeh’s FECount(F)

To avoid the nonconvexity of Z(·), Zadeh proposed the fuzzy cardinality measure FECount(F) in [22], defined as

$$\mathrm{FECount}(F) = \mathrm{FGCount}(F) \cap \mathrm{FLCount}(F), \tag{9}$$
where $\mathrm{FGCount}(F, k) = \sup\{\alpha \mid |F_\alpha| \ge k\}$ can be interpreted as the possibility that the cardinality of F is at least k. Also, $\mathrm{FLCount}(F) = \overline{\mathrm{FGCount}(F)} - 1$, where the bar stands for the standard complement and the subtraction shifts the index by one, so that $\mathrm{FLCount}(F, k) = 1 - \mathrm{FGCount}(F, k+1)$; it can be interpreted as the possibility that the cardinality of F is at most k. It is easy to see that the possibility of the cardinality of F being at least k is $\mathrm{FGCount}(F, k) = f_k$ with $f_0 = 1$. An equivalent expression for the FECount measure was introduced by Wygralak in [11,12,15]. It is easy to show that FECount(F) can be expressed as

$$\mathrm{FECount}(F, k) = \min(f_k, \overline{f_{k+1}}), \tag{10}$$
where $f_k$ is the kth greatest value of $F(x_i)$ and the bar represents the standard fuzzy negation. Starting from a different approach, Ralescu [9] has reached the same definition.

2.2.3. Dubois and Prade’s fuzzy cardinality

This measure is defined in [1] as

$$\mathrm{DP}(F, k) = \sup\{\Pi_F(S) \mid S \subseteq X \text{ and } |S| = k\}, \tag{11}$$
where

$$\Pi_F(S) = \begin{cases} \inf\{F(x) \mid x \in S\} & \text{if } S \in \Omega(F), \\ 0 & \text{if } S \notin \Omega(F) \end{cases} \tag{12}$$

and $\Omega(F) = \{S \mid F_1 \subseteq S \subseteq \mathrm{Support}(F)\}$. It is easy to see that

$$\mathrm{DP}(F, k) = \begin{cases} 0 & \text{if } k < |F_1|, \\ f_k & \text{otherwise.} \end{cases} \tag{13}$$

So DP(F) can be obtained from FGCount(F) by setting to 0 the values that are below the lower expectation for the cardinality (in the sense of Section 2.1.4). The motivation behind this method was that FLCount(F) was obtained from FGCount(F) in a way that is not coherent with possibility theory. This cardinality is also convex.

2.3. Scalar relative cardinality

The relative cardinality of a set F with respect to a set G is defined as the percentage of elements in G that are also in F. When F and G are fuzzy sets, both scalar and fuzzy measures for the relative cardinality have been introduced, although the most used measure is the scalar measure defined in the next subsection.

2.3.1. Zadeh’s scalar relative cardinality

The relative cardinality of F with respect to G is defined in [20] as

$$\mathrm{Card}(F/G) = \frac{P(F \cap G)}{P(G)}. \tag{14}$$

2.4. Fuzzy relative cardinality

As is the case for the cardinality of a fuzzy set, the relative cardinality of fuzzy sets is considered by most authors as being a fuzzy set over [0, 1] rather than a scalar value.

2.4.1. Zadeh’s FGCount(F/G)

The fuzzy measure called FGCount(F/G) was defined by Zadeh in [22] as a fuzzy multiset over [0, 1] as follows:

$$\mathrm{FGCount}(F/G) = \sum_{\alpha \in \Lambda(F) \cup \Lambda(G)} \alpha \Big/ \frac{P(F_\alpha \cap G_\alpha)}{P(G_\alpha)}, \tag{15}$$

where $\Lambda(F)$ and $\Lambda(G)$ are the level sets of F and G, respectively.

2.4.2. Dubois and Prade’s fuzzy relative cardinality

This measure was proposed in [1] following the approach of (11). It is defined for every $q \in [0, 1] \cap \mathbb{Q}$ to be

$$\mathrm{DP}(F/G, q) = \sup\left\{ \min(\Pi_F(S), \Pi_G(T)) \;\middle|\; \frac{|S \cap T|}{|T|} = q \right\} \tag{16}$$

with $S \in \Omega(F)$, $T \in \Omega(G)$.

3. Convexity, additivity and valuation

3.1. The convexity problem
We agree with the idea that the cardinality of a fuzzy set is also a fuzzy set, but as we explained in the introduction, we think that this fuzzy set ought not to be convex in the general case. The example described in Section 1 illustrates our claim. The assertion that the cardinality of $X_{\mathrm{Blonde}}$ can be one or three but not two may seem strange at first sight. But all the people we asked “how many caps would you buy?” answered “one or three”. All of them found “two” not to be a valid answer. In this sense, and at least from a practical point of view, it is actually convexity that seems to be counterintuitive.

When extending scalar operations on classical sets to fuzzy operations on fuzzy sets, we usually represent the fuzzy set by means of a weighted set of classical sets. Once this step has been made, we perform the operations over the elements of the representation and we obtain a weighted collection of results, which can be interpreted as a fuzzy set. One example of this kind of procedure is the well-known extension principle (see [22], appendix). A further step can be to obtain a scalar value that summarizes the information of the fuzzy result, following some criterion (see [2] for example). In this procedure, one of the most important issues is the choice of the classical sets that will represent the original fuzzy set, together with their weight in the representation. There are several representation methods, the best known being Zadeh’s representation theorem.
This is a possibilistic representation where the fuzzy set F is represented as the set $\{F_\alpha\}$ of α-cuts of F, α being the weight of $F_\alpha$ in the representation. Zadeh’s first method is based on this representation. Dubois and Prade [1] define another possibilistic representation of F as the set $\Omega(F)$ with $\Pi_F(S)$ (see (12)) being the weight for every $S \in \Omega(F)$. They use this representation to obtain their fuzzy cardinality measure described in Section 2.2.3. A probabilistic representation is also proposed by Dubois and Prade in [2] using the set $\Omega(F)$ and assuming F is normalized. Finally, Wygralak uses the set $\mathcal{P}(X)$ of subsets of X as the representation, $\Pi_F(A)$ again being the weight for every $A \in \mathcal{P}(X)$. Following this representation, the possible cardinalities for $X_{\mathrm{Blonde}}$ are $W_S(X_{\mathrm{Blonde}}) \in [1, 3]$.

In our opinion, the representations must take into account the dependency among objects with respect to the property “to be in F”. Regarding the example in Section 1 and following both the $\Omega(X_{\mathrm{Blonde}})$ and $\mathcal{P}(X)$ approaches, the set {John, Mike} is a valid representative of $X_{\mathrm{Blonde}}$, but with respect to the property “x belongs to $X_{\mathrm{Blonde}}$” we think it is not, because if Mike $\in X_{\mathrm{Blonde}}$ then Peter $\in X_{\mathrm{Blonde}}$. We can see that the α-cut representation of $X_{\mathrm{Blonde}}$ is {{John}, {John, Mike, Peter}}. We think that only representations based on α-cuts preserve the dependency among objects from the point of view of cardinality. Different weightings of the α-cuts are possible. As a consequence, the only valid values for the cardinality of a fuzzy set are the cardinalities of its α-cuts. However, this approach makes convexity unnecessary.

3.2. Convexity and the extension principle

Given two fuzzy sets F and G over $\mathbb{N}$, F + G can be obtained by using the well-known extension principle as

$$(F + G)(i) = \max_{i = a + b} \min(F(a), G(b)). \tag{17}$$
The extension principle ensures that the addition of fuzzy subsets of nonnegative integers, whether convex or not, is a convex fuzzy integer. Hence, adding nonconvex fuzzy cardinalities by means of the extension principle, we obtain a convex cardinality, and therefore it is not possible in general for a nonconvex fuzzy cardinality to verify the additivity property.

Let us consider the following simple example. Let $Y = \{\text{Mary}, \text{Susan}, \text{Katy}\}$ and let $Y_{\mathrm{Blonde}} = 1/\text{Mary} + 0.5/\text{Susan} + 0.5/\text{Katy}$. Obviously, $X_{\mathrm{Blonde}} \cap Y_{\mathrm{Blonde}} = \emptyset$, so for any fuzzy cardinality C, if C verifies the additivity then $C(X_{\mathrm{Blonde}}) + C(Y_{\mathrm{Blonde}}) = C(X_{\mathrm{Blonde}} \cup Y_{\mathrm{Blonde}})$. If C were not convex (in the sense that only cardinalities of α-cuts were allowed) then $C(X_{\mathrm{Blonde}})$, $C(Y_{\mathrm{Blonde}})$ and $C(X_{\mathrm{Blonde}} \cup Y_{\mathrm{Blonde}})$ should not be convex (the only possible values for the cardinality are 1 and 3 for $X_{\mathrm{Blonde}}$ and $Y_{\mathrm{Blonde}}$, and 2 and 6 for $X_{\mathrm{Blonde}} \cup Y_{\mathrm{Blonde}}$). But $C(X_{\mathrm{Blonde}}) + C(Y_{\mathrm{Blonde}})$ obtained by means of the extension principle is convex, and hence $C(X_{\mathrm{Blonde}}) + C(Y_{\mathrm{Blonde}}) \neq C(X_{\mathrm{Blonde}} \cup Y_{\mathrm{Blonde}})$. Therefore, if C is not convex then the additivity (and hence the valuation property), as based on the extension principle, does not hold. Moreover, it could happen that $C(X_{\mathrm{Blonde}}) + C(Y_{\mathrm{Blonde}})$ does not represent the cardinality C(H) for any fuzzy set H, so that it could seem that C is not well-defined from the operational point of view.

However, all these assertions are based on the use of the extension principle for adding fuzzy cardinalities. If we consider the extension principle to be the axiomatic reference for the addition, we must agree that C neither verifies additivity and valuation nor is well formed.

Even for convex cardinalities, using the extension principle under Zadeh’s original formulation (17) can offer counterintuitive results. For example, the cardinality $\mathrm{FECount}(X_{\mathrm{Blonde}}) = 0/0 + 0.5/1 + 0.5/2 + 0.5/3$, and obviously $\mathrm{FECount}(X_{\mathrm{Blonde}}) = \mathrm{FECount}(Y_{\mathrm{Blonde}})$. Hence, one would expect the following: $\mathrm{FECount}(X_{\mathrm{Blonde}}) + \mathrm{FECount}(Y_{\mathrm{Blonde}}) = 2\,\mathrm{FECount}(X_{\mathrm{Blonde}}) = 2\,\mathrm{FECount}(Y_{\mathrm{Blonde}})$ (here $2\,\mathrm{FECount}(F) = 1_{\{2\}} \times \mathrm{FECount}(F)$, where $1_{\{i\}}$, $i \in \mathbb{N}$, is a fuzzy set over $\mathbb{N}$ such that $1_{\{i\}}(x) = 1$ iff $x = i$, and 0 otherwise).
But by using the extension principle for both the addition and the product (the latter can be performed by replacing + with × in (17)), we obtain

$$\mathrm{FECount}(X_{\mathrm{Blonde}}) + \mathrm{FECount}(Y_{\mathrm{Blonde}}) = 0/0 + 0/1 + 0.5/2 + 0.5/3 + 0.5/4 + 0.5/5 + 0.5/6$$

and

$$2\,\mathrm{FECount}(X_{\mathrm{Blonde}}) = 2\,\mathrm{FECount}(Y_{\mathrm{Blonde}}) = 0/0 + 0/1 + 0.5/2 + 0/3 + 0.5/4 + 0/5 + 0.5/6$$

and they are different!
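The two results above can be reproduced with a short Python sketch. It assumes the max-min form of the extension principle given in (17) and the expression (10) for FECount; the function names `fecount` and `extend` are hypothetical.

```python
from itertools import product


def fecount(F):
    """FECount(F, k) = min(f_k, 1 - f_{k+1}) for k = 0..n, as in (10)."""
    f = [1.0] + sorted(F.values(), reverse=True) + [0.0]
    return {k: min(f[k], 1.0 - f[k + 1]) for k in range(len(F) + 1)}


def extend(op, A, B):
    """Max-min extension principle (17) for a binary operation on fuzzy integers."""
    result = {}
    for (a, mu_a), (b, mu_b) in product(A.items(), B.items()):
        i = op(a, b)
        result[i] = max(result.get(i, 0.0), min(mu_a, mu_b))
    return result


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
y_blonde = {"Mary": 1.0, "Susan": 0.5, "Katy": 0.5}

cx, cy = fecount(x_blonde), fecount(y_blonde)
total = extend(lambda a, b: a + b, cx, cy)
double = extend(lambda a, b: a * b, {2: 1.0}, cx)  # 1_{2} x FECount(X_Blonde)

print([total.get(i, 0.0) for i in range(7)])   # [0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.5]
print([double.get(i, 0.0) for i in range(7)])  # [0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.5]
```

Integers that never arise as a sum or product are simply absent from the result dictionaries and are read as degree 0.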
With an alternative and suitable definition of addition for integer fuzzy cardinalities, this problem can be solved. Wygralak [15] has proposed a modified extension principle that works for the convex cardinality FECount, ensuring $\sum_{i=1}^{n} \mathrm{FECount}(F) = n\,\mathrm{FECount}(F)$ for any F. In our example and following [15],

$$2\,\mathrm{FECount}(X_{\mathrm{Blonde}}) = \mathrm{FECount}(X_{\mathrm{Blonde}}) + \mathrm{FECount}(X_{\mathrm{Blonde}}) = 0/0 + 0/1 + 0.5/2 + 0.5/3 + 0.5/4 + 0.5/5 + 0.5/6. \tag{18}$$

However, we think that the addition of a (fuzzy) cardinal number with itself should have no odd natural number in its support. But regarding (18), there is a nonzero possibility that $2\,\mathrm{FECount}(X_{\mathrm{Blonde}})$ is 3 or 5. In our opinion, that is counterintuitive, and it is another reason why nonconvex cardinalities make sense. We are currently studying the addition of nonconvex cardinalities. Some proposals to perform a suitable addition will be dealt with in future papers.

4. A new definition for a cardinality measure

We start this section by defining a family of fuzzy cardinality measures based on the evaluation of fuzzy logic sentences. This approach is followed, for example, by Wygralak [13–15].

4.1. The family E of measures

Definition 4.1. The possibility that at least k elements of X belong to F, L(F, k), is the evaluation of the logical sentence “$\exists X_1 \subseteq X \mid |X_1| = k$ and $X_1 \subseteq F$”, with $X_1$ a crisp set:

$$L(F, k) = \begin{cases} 1 & \text{if } k = 0, \\ 0 & \text{if } k > n, \\ \bigoplus_{I_k} \bigl\{ \bigotimes \{F(x_{i_1}), \ldots, F(x_{i_k})\} \bigr\} & \text{if } 1 \le k \le n, \end{cases} \tag{19}$$

where $I_k$ is the set of k-tuples of indexes $i_j$ with $i_j \in \{1, \ldots, n\}$ $\forall j \in \{1, \ldots, k\}$ defined by

$$I_k = \{(i_1, \ldots, i_k) \mid i_1 < i_2 < \cdots < i_k\} \tag{20}$$

and $\oplus$ and $\otimes$ are a t-conorm and a t-norm, respectively. We denote by L the family of functions L(F, k) for each pair of t-conorm and t-norm.

Proposition 4.2. Let $\oplus$ be the maximum and $\otimes$ be the minimum. Then $L(F, k) = f_k$ and therefore L(F) = FGCount(F).

Proposition 4.3. Every measure of the family L is well-defined in the crisp case; i.e. if F is crisp then L(F, k) = 1 iff $|F| \ge k$.
Definition 4.4. The possibility that exactly k elements of X belong to F, E(F, k), is

$$E(F, k) = L(F, k) \otimes \overline{L(F, k+1)}, \tag{21}$$
where $\otimes$ is any t-norm (not necessarily the same one used in L) and the bar stands for a fuzzy complement. For each choice of t-conorm, t-norms and complement, expression (21) defines a fuzzy cardinality; we denote the resulting family by E. A direct interpretation of (21) is that the cardinality of F is k if it is “at least k” and it is not “at least k + 1”. Expression (21) is also the evaluation of the logic sentence “$\exists X_1 \subseteq X \mid |X_1| = k$ and $X_1 \subseteq F$ and $F \subseteq X_1$”, i.e. the evaluation of the sentence “$\exists X_1 \subseteq X \mid |X_1| = k$ and $X_1 = F$”. It can be seen that in the evaluation of these sentences, every crisp subset $X_1 \subseteq X$ with cardinality k is considered. We might find that some of these subsets are not α-cuts of F, and this can be a problem on account of what we discussed in Section 3. As we shall see in Section 4.2, at least one measure in the family E exists such that $E(F, k) \neq 0$ if and only if there exists $\alpha \in [0, 1]$ such that $|F_\alpha| = k$.

Proposition 4.5. Every measure of the family E is well-defined in the crisp case; i.e. if F is crisp then E(F, k) = 1 if |F| = k.

Proposition 4.6. The measure FECount(F) defined by (9) is a member of the family E.

Proof. We only need to define L as in Proposition 4.2 by using max and min, and using the standard
complement (n(x) = 1 − x) and the t-norm min in the expression (21). We then obtain $E(F, k) = f_k \wedge \overline{f_{k+1}}$.

Proposition 4.7. The measure DP(F) defined by (11) is a member of the family E.

Proof. Let us define L as in Proposition 4.2 by using max and min, and use the fuzzy complement

$$c(x) = \begin{cases} 1 & \text{if } x < 1, \\ 0 & \text{if } x = 1 \end{cases}$$

and the t-norm min in expression (21). Let $|F_1| = h$. We shall consider two cases:

(1) Let $k < h$. Then by (13), we have DP(F, k) = 0. We also have L(F, k) = L(F, k + 1) = 1 and then $\overline{L(F, k+1)} = 0$, and hence E(F, k) = 0 = DP(F, k).
(2) Let $k \ge h$. Then $\mathrm{DP}(F, k) = f_k$ and $L(F, k + 1) < 1$, so $\overline{L(F, k+1)} = 1$ and hence $E(F, k) = L(F, k) = \mathrm{DP}(F, k) = f_k$.

Let us remark that our objective is not to study in depth the properties of every cardinality in the family, but to define a general framework in which we could integrate new and existing definitions. In Section 3 we claimed that fuzzy cardinalities should not be convex in general, and that it is intuitively true that the only possible values for the cardinality of a fuzzy set are the cardinality values of its α-cuts, from which we construct the fuzzy cardinal. The study of the cardinalities of the family E that verify these requirements, together with the discussion of their properties, will be an object of future research.

4.2. The measure ED

Definition 4.8. We introduce the fuzzy cardinality measure ED to be

$$\mathrm{ED}(F, k) = f_k - f_{k+1}. \tag{22}$$
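Definition (22) translates directly into code. The sketch below is a minimal illustration with the hypothetical function name `ed`; it uses the conventions $f_0 = 1$ and $f_{n+1} = 0$ from Section 2.

```python
def ed(F):
    """ED(F, k) = f_k - f_{k+1} for k = 0..n, with f_0 = 1 and f_{n+1} = 0."""
    f = [1.0] + sorted(F.values(), reverse=True) + [0.0]
    return {k: f[k] - f[k + 1] for k in range(len(F) + 1)}


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
not_normalized = {"a": 0.75, "b": 0.25}  # hypothetical non-normalized set

print(ed(x_blonde))        # {0: 0.0, 1: 0.5, 2: 0.0, 3: 0.5}
print(ed(not_normalized))  # {0: 0.25, 1: 0.5, 2: 0.25}
```

In the cap example ED assigns weight only to the cardinalities one and three, and in both examples the weights add up to 1. For the non-normalized set, the cardinality 0 receives the weight $1 - f_1$.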
Proposition 4.9. The measure ED(F) is the basic probability assignment of Z(F).

Proof. Let us consider the set of α-cuts of F plus ∅. The possibility that a given $F_\alpha$ is a subset of F is α, and an integer k exists with $1 \le k \le n$ such that $\alpha = f_k$. Obviously, the possibility that ∅ is a subset of F is $1 = f_0$. If we consider this representation as a possibility distribution over a collection of nested sets, then the differences $f_k - f_{k+1}$ can be interpreted as the basic probability assignment to every α-cut of F, in the sense of Dempster–Shafer’s theory of evidence. Hence, ED can be obtained from a representation of F as the collection of its α-cuts, weighted with the basic probability. It is only necessary to assign the probability that it is a valid representation of F to the cardinality of every α-cut.

Corollary 4.10. $\mathrm{ED}(F, k) > 0$ if and only if $\exists \alpha \in [0, 1]$ such that $|F_\alpha| = k$.

Corollary 4.11. ED(F) is not a convex fuzzy set in general.

A very similar measure is defined by Dubois and Prade [3,2]. The main difference is that Dubois and Prade’s measure requires the set F to be normalized. We do not impose this requirement because if F is not normalized, the possibility that the set $F_{f_0} = F_1 = \emptyset$ is a subset of F is known to be 1, so its basic probability assignment is $f_0 - f_1 = 1 - f_1$. Another difference is that this representation only considers α-cuts as the set of crisp representatives of F, while Dubois and Prade’s representation considers that the set of representatives of F is $\Omega(F)$. In any case, the basic probability assignment of a set $S \in \Omega(F)$ such that S is not an α-cut is 0.

Proposition 4.12. The measure ED is a member of the family E.

Proof. It suffices to use max and min in L, and the standard negation and Łukasiewicz’s t-norm $t(a, b) = \max\{0, a + b - 1\}$ in (21).

Proposition 4.13. Let $F^c$ be the standard complement of the set F with respect to X. The measure ED verifies

$$\mathrm{ED}(F^c, k) = \mathrm{ED}(F, n - k). \tag{23}$$
Proof. It is easy to see that $f^c_k = 1 - f_{n-k+1}$ $\forall k \in \{0, \ldots, n + 1\}$. Then $\mathrm{ED}(F^c, k) = f^c_k - f^c_{k+1} = (1 - f_{n-k+1}) - (1 - f_{n-k}) = f_{n-k} - f_{n-k+1} = \mathrm{ED}(F, n - k)$.
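Propositions 4.6, 4.7, 4.12 and 4.13 can be checked on the cap example with a brute-force sketch of the family E. The code below is illustrative only: it enumerates all k-subsets as in (19), so it is practical only for small X; L is built with max and min, and the names `at_least`, `exactly`, `standard`, `threshold_complement` and `lukasiewicz` are hypothetical.

```python
from functools import reduce
from itertools import combinations


def at_least(F, k, tconorm=max, tnorm=min):
    """L(F, k) from (19): t-conorm over all k-subsets of the t-norm of their degrees."""
    values = list(F.values())
    if k == 0:
        return 1.0
    if k > len(values):
        return 0.0
    return reduce(tconorm, (reduce(tnorm, subset) for subset in combinations(values, k)))


def exactly(F, tnorm, complement):
    """[E(F, 0), ..., E(F, n)] from (21), with L built from max and min."""
    return [tnorm(at_least(F, k), complement(at_least(F, k + 1)))
            for k in range(len(F) + 1)]


def standard(x):
    return 1.0 - x


def threshold_complement(x):
    """Complement c(x) used in the proof of Proposition 4.7."""
    return 0.0 if x == 1.0 else 1.0


def lukasiewicz(a, b):
    return max(0.0, a + b - 1.0)


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
fecount = exactly(x_blonde, min, standard)             # [0.0, 0.5, 0.5, 0.5]
dp = exactly(x_blonde, min, threshold_complement)      # [0.0, 1.0, 0.5, 0.5]
ed = exactly(x_blonde, lukasiewicz, standard)          # [0.0, 0.5, 0.0, 0.5]

# Property (23): ED of the standard complement is ED of F read backwards.
complement_set = {x: 1.0 - v for x, v in x_blonde.items()}
ed_c = exactly(complement_set, lukasiewicz, standard)
print(ed_c == ed[::-1])  # True
```

Only the final t-norm and the complement change between the three members of the family; the underlying “at least k” degrees are the same.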
A scalar value for the cardinality of a fuzzy set can be obtained as a summary of the information contained in a fuzzy measure. In the case of the probabilistic measure ED, the expected value for the cardinality can be defined, as in the usual centroid defuzzification approach of fuzzy control, by

$$\mathrm{Ex}(F) = \sum_{i=0}^{n} i \cdot \mathrm{ED}(F, i). \tag{24}$$
A formula similar to (24) can be found in [4]. Also, the same formula was proposed by Dubois and Prade [2] with the sum starting at i = 1. This is due to the fact that they give a probability 0 that the cardinality of F could be 0, because they force F to be normalized. In any case, the term for i = 0 is 0 and does not affect the final result. Moreover, for every subset $A \in \Omega(F)$ that is not an α-cut of F, it has been shown in [2] that the probability of A is 0, so the expected result is the same as Ex(F).

Proposition 4.14. The expected scalar value for the cardinality of F, Ex(F), is the power of F defined in (4).
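Proposition 4.14 follows from a telescoping sum, since $\sum_{i=0}^{n} i\,(f_i - f_{i+1}) = \sum_{i=1}^{n} f_i$. The short Python sketch below checks the identity numerically on two examples; the function names are hypothetical and the second fuzzy set is made-up data.

```python
import math


def ed(F):
    """ED(F, k) = f_k - f_{k+1} for k = 0..n, as in (22)."""
    f = [1.0] + sorted(F.values(), reverse=True) + [0.0]
    return {k: f[k] - f[k + 1] for k in range(len(F) + 1)}


def expected_cardinality(F):
    """Ex(F) from (24): sum of k * ED(F, k)."""
    return math.fsum(k * p for k, p in ed(F).items())


def power(F):
    """P(F) from (4): sum of the membership degrees."""
    return math.fsum(F.values())


x_blonde = {"John": 1.0, "Mike": 0.5, "Peter": 0.5}
other = {"a": 0.75, "b": 0.25}  # hypothetical non-normalized set

print(expected_cardinality(x_blonde), power(x_blonde))  # 2.0 2.0
print(expected_cardinality(other), power(other))        # 1.0 1.0
```

Note that the expected value 2.0 for $X_{\mathrm{Blonde}}$ is a summary only: ED itself gives the cardinality two a weight of 0.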
Notice that although Z(F) and ED(F) are not convex in general, the summary of ED(F) is P(F), an additive measure, and ED(F) can be interpreted as the basic probability assignment of Z(F).

5. New definitions for the relative cardinality

Following the representation approach by means of α-cuts, either possibilistic or probabilistic, we are going to propose both a possibilistic and a probabilistic measure of the relative cardinality of a fuzzy set F with respect to a fuzzy set G.

5.1. Preliminaries

We require the fuzzy set G to be normalized, because otherwise there is a non-zero probability that the cardinality of G is 0, and then the relative cardinality of F with respect to G is not defined. If G is not normalized, then we can normalize it in the usual way, by multiplying by a factor $1/\max\{G(x)\}$. We must apply the same factor to the set $F \cap G$ to maintain the relative cardinality of F with respect to G.

The following are some preliminary definitions:

Definition 5.1. We introduce M(F) to be the following subset of [0, 1]:

$$M(F) = \{F(x_i) \mid x_i \in X\}. \tag{25}$$

Definition 5.2. We introduce M(F/G) to be the following subset of [0, 1]:

$$M(F/G) = M(F \cap G) \cup M(G). \tag{26}$$

We will use the following notation: $M(F/G) = \{\alpha_1, \ldots, \alpha_m\}$, where we assume $1 = \alpha_1 > \alpha_2 > \cdots > \alpha_m > \alpha_{m+1} = 0$.

Definition 5.3. We introduce $C(F/G, \alpha_i)$ to be the rational value

$$C(F/G, \alpha_i) = \frac{|(F \cap G)_{\alpha_i}|}{|G_{\alpha_i}|}. \tag{27}$$

Definition 5.4. We introduce CR(F/G) to be the following subset of $[0, 1] \cap \mathbb{Q}$:

$$\mathrm{CR}(F/G) = \{C(F/G, \alpha_i) \mid \alpha_i \in M(F/G)\}. \tag{28}$$

5.2. A new possibilistic measure of the relative cardinality

Definition 5.5. We introduce the fuzzy relative cardinality ES(F/G) for all $q \in \mathbb{Q}$ to be

$$\mathrm{ES}(F/G, q) = \max\{\alpha_i \in M(F/G) \mid q = C(F/G, \alpha_i)\}. \tag{29}$$

Proposition 5.6. The measure ES(F/G) is well-defined in the crisp case; i.e. if F and G are crisp then $\mathrm{ES}(F/G) = \{1/q\}$ where $q = C(F/G, 1) = |F \cap G| / |G|$.

Proposition 5.7. Let G = X. Then

$$\mathrm{ES}(F/X, k/n) = Z(F, k). \tag{30}$$
Proof. First, we have $M(F/X) = M(F) \cup \{1\}$. As the α-cut with α = 1 is always used in the representation of F, $\{F_\alpha \mid \alpha \in M(F/X)\}$ is the representation of F in terms of its α-cuts. On the other hand, $|X_\alpha| = n$ for all $\alpha \in M(F/X)$. Hence $CR(F/X) = \{|F_\alpha|/n \mid \alpha \in M(F/X)\}$, and from (29) it follows that $ES(F/X, k/n) = \max\{\alpha \in M(F/X) \mid |F_\alpha| = k\} = Z(F, k)$.
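As an illustration, Definition 5.5 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: fuzzy sets are dictionaries from elements to membership degrees, the intersection is taken with the minimum, only positive degrees are kept as levels, and the names `intersect`, `levels`, `cut_size` and `es` are hypothetical.

```python
from fractions import Fraction


def intersect(F, G):
    """F ∩ G, using the minimum as intersection operator (an assumption)."""
    return {x: min(F.get(x, 0.0), G.get(x, 0.0)) for x in set(F) | set(G)}


def levels(F, G):
    """M(F/G) = M(F ∩ G) ∪ M(G): positive degrees only, in decreasing order."""
    degrees = set(intersect(F, G).values()) | set(G.values())
    return sorted((a for a in degrees if a > 0), reverse=True)


def cut_size(F, alpha):
    """Cardinality of the alpha-cut of F."""
    return sum(1 for degree in F.values() if degree >= alpha)


def es(F, G):
    """ES(F/G) as a dict q -> max{alpha_i | C(F/G, alpha_i) = q}, following (29)."""
    if max(G.values()) != 1:
        raise ValueError("G must be normalized")
    FG = intersect(F, G)
    result = {}
    for alpha in levels(F, G):
        # C(F/G, alpha) from (27), kept as an exact rational.
        q = Fraction(cut_size(FG, alpha), cut_size(G, alpha))
        result[q] = max(result.get(q, 0.0), alpha)
    return result


# Illustrative data (not taken from the paper).
X = ["x1", "x2", "x3", "x4"]
F = {"x1": 1.0, "x2": 0.5, "x3": 0.2, "x4": 0.0}
G = {"x1": 1.0, "x2": 0.5, "x3": 0.5, "x4": 0.5}
relative = es(F, G)

# With G = X the keys have the form k/n, as in Proposition 5.7.
whole = es(F, {x: 1.0 for x in X})
```

For these illustrative sets, `relative` assigns degree 1.0 to q = 1, degree 0.5 to q = 1/2 and degree 0.2 to q = 3/4, while `whole` assigns the same degrees to 1/4, 1/2 and 3/4, which are the values Z(F, 1), Z(F, 2) and Z(F, 3).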
5.3. A new probabilistic measure of the relative cardinality

Definition 5.8. We introduce the fuzzy relative cardinality ER(F/G), $q \in Q$, to be the random set

$$ER(F/G, q) = \sum_{\alpha_i \,\mid\, C(F/G, \alpha_i) = q} (\alpha_i - \alpha_{i+1}). \tag{31}$$

Proposition 5.9. The measure ER(F/G) is well defined in the crisp case; i.e., if F and G are crisp then $ER(F/G) = \{1/q\}$, where $q = C(F/G, 1) = |F \cap G| / |G|$.

Proposition 5.10. Let G = X. Then

$$ER(F/X, k/n) = ED(F, k). \tag{32}$$
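Definition 5.8 can be sketched in the same way. The code below is a hedged illustration rather than the authors' implementation; it assumes dictionaries as fuzzy sets, the minimum as intersection, and positive degrees as levels, and the function names are hypothetical.

```python
from fractions import Fraction


def intersect(F, G):
    """F ∩ G, using the minimum as intersection operator (an assumption)."""
    return {x: min(F.get(x, 0.0), G.get(x, 0.0)) for x in set(F) | set(G)}


def cut_size(F, alpha):
    """Cardinality of the alpha-cut of F."""
    return sum(1 for degree in F.values() if degree >= alpha)


def er(F, G):
    """ER(F/G) as a dict q -> sum of (alpha_i - alpha_{i+1}), following (31)."""
    if max(G.values()) != 1:
        raise ValueError("G must be normalized")
    FG = intersect(F, G)
    degrees = set(FG.values()) | set(G.values())
    # alpha_1 > ... > alpha_m, followed by alpha_{m+1} = 0.
    alphas = sorted((a for a in degrees if a > 0), reverse=True) + [0.0]
    result = {}
    for alpha, next_alpha in zip(alphas, alphas[1:]):
        q = Fraction(cut_size(FG, alpha), cut_size(G, alpha))
        result[q] = result.get(q, 0.0) + (alpha - next_alpha)
    return result


# Illustrative data (not taken from the paper).
F = {"x1": 1.0, "x2": 0.5, "x3": 0.2, "x4": 0.0}
G = {"x1": 1.0, "x2": 0.5, "x3": 0.5, "x4": 0.5}
mass = er(F, G)
total = sum(mass.values())
```

Because G is normalized, the differences telescope from 1 down to 0, so the values in `mass` sum to 1 up to floating-point rounding; here the mass is roughly 0.5 on q = 1, 0.3 on q = 1/2 and 0.2 on q = 3/4.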
Proof. Analogous to the proof of Proposition 5.7.

6. Related problems

6.1. Entropy of a fuzzy set

The concept of entropy of a fuzzy set was first defined by De Luca and Termini in [6] as a measure of the degree of "fuzziness" of a fuzzy set. This definition makes use of the function S(x) defined by Shannon for every x ∈ [0, 1] as

$$S(x) = -x \ln(x) - (1 - x) \ln(1 - x), \tag{33}$$

assuming 0 ln(0) = 0. The entropy of a fuzzy set F is defined in [6] as

$$d(F) = \sum_{i=1}^{n} S(F(x_i)). \tag{34}$$
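Equations (33) and (34) translate directly into code. The following is a small sketch, assuming a fuzzy set is given as a dictionary of membership degrees; the names `shannon` and `fuzziness` are hypothetical.

```python
import math


def shannon(x):
    """S(x) = -x ln x - (1 - x) ln(1 - x), with 0 ln 0 = 0, as in (33)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("membership degrees must lie in [0, 1]")
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def fuzziness(F):
    """d(F): the sum of S over all membership degrees, as in (34)."""
    return sum(shannon(degree) for degree in F.values())


# Illustrative data (not taken from the paper).
crisp = {"x1": 1.0, "x2": 0.0, "x3": 1.0}
half = {"x1": 0.5, "x2": 0.5, "x3": 0.5, "x4": 0.5}
d_crisp = fuzziness(crisp)  # 0 for a crisp set
d_half = fuzziness(half)    # 4 ln 2, the maximum over four objects
```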
Since then, other definitions of the entropy of a fuzzy set have appeared in the literature, such as the measures proposed by Yager [19], Kosko [5], Shang and Jiang [10], and Pal and Pal [8], among others. All these measures verify the following properties, proposed in [6], for an entropy measure:

(1) The entropy of a fuzzy set F is 0 if and only if F is a crisp set.
(2) The maximum value of the entropy is reached if and only if F(x) = 0.5 ∀x ∈ X.
(3) For every sharpened version G of F (i.e., G is a fuzzy set verifying G(x) ≤ F(x) if F(x) ≤ 0.5 and G(x) ≥ F(x) if F(x) ≥ 0.5), the entropy of F is greater than the entropy of G.

A reasonable relation between fuzzy cardinality and entropy of a fuzzy set F is that "the crisper the fuzzy set, the crisper the fuzzy cardinality", i.e., "the lower the entropy, the crisper (the clearer) the fuzzy cardinality". Obviously, the reverse relation "the crisper the fuzzy cardinality, the lower the entropy" must also hold. This could allow us to measure the entropy of a fuzzy set by measuring the entropy of its fuzzy cardinality. The measures FECount(F) and DP(F) seem to verify this relation, since if F is crisp then both FECount(F) and DP(F) are crisp, and if F(x) = 0.5 ∀x ∈ X then FECount(F, k) = DP(F, k) = 0.5 ∀k ∈ {0, ..., n}. We think that the discussion in Section 3 can be extended to the relation between fuzzy cardinality measures and entropy measures. It is easy to see that the relation just described between entropy and cardinality is not verified by the measures Z(F) and ED(F).

First we propose the following definition of the entropy of a fuzzy set.

Definition 6.1. We introduce the entropy of a fuzzy set F to be

$$Ent(F) = -\sum_{k=0}^{n} ED(F, k) \ln(ED(F, k)). \tag{35}$$
That is to say, the well-known functional H (similar to Shannon entropy, see [6]) with K = 1 over the probability distribution ED(F). As pointed out in [6], this definition does not verify all the properties proposed for the entropy. But we think that some discussion could arise over the interpretation of these properties. The entropy of a fuzzy set has also been interpreted by De Luca and Termini in [6,7] as a measure of information in the sense of "the amount of information we need to turn the fuzzy set into a crisp one", i.e., the amount of information we need to decide, for every x ∈ X, whether x ∈ F or not. This is clearly related to the problem of the fuzzy cardinality of a fuzzy set, and the dependency among objects. Let us consider the following example:

Example 6.2. Let $F = 0.5/x_1 + 0.5/x_2 + 0.5/x_3 + 0.5/x_4$ be a fuzzy set and let $G = 1/x_1 + 0.75/x_2 + 0.5/x_3 + 0.25/x_4$. Fig. 1 shows graphical representations of the sets F and G.

Fig. 1. Fuzzy sets F and G over X.

Fig. 2. Fuzzy cardinalities ED(F) and ED(G).

Following property (2), F is the fuzziest set over four objects. But the question is, how much information do we need to turn F into a crisp set? We only need to decide whether an object with a degree 0.5 is in F or not (we can apply this information to every object with a degree 0.5). Although we agree that the information needed is the maximum for one object (any other degree is nearer to 1 or 0 than 0.5), we consider that in the case of G we need the same information (to decide for x3) and some additional information (to decide for x4, etc.). We think that the problem with property (2) is that we must assume independence among the objects with respect to the problem of deciding if one object is in F or not. Regarding the fuzzy sets F and G in Fig. 1 we could ask, which set makes us feel more comfortable when deciding its cardinality? And hence, which set seems to be "crisper" in that sense? In our opinion, "F" is the answer. In fact, the cardinalities ED(F) and ED(G) of Fig. 2 tell us that there are fewer integer candidates with a high probability of being the cardinality of F.
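The comparison in Example 6.2 can be reproduced numerically. The sketch below assumes that $ED(F, k) = b_k - b_{k+1}$, where $b_1 \geq \cdots \geq b_n$ are the membership degrees of F in decreasing order, $b_0 = 1$ and $b_{n+1} = 0$; this is the form used in the proof of Proposition 6.6. The function names `ed` and `ent` are hypothetical.

```python
import math


def ed(F):
    """[ED(F, 0), ..., ED(F, n)], assuming ED(F, k) = b_k - b_{k+1}."""
    b = [1.0] + sorted(F.values(), reverse=True) + [0.0]
    return [b[k] - b[k + 1] for k in range(len(b) - 1)]


def ent(F):
    """Ent(F) from (35); terms with zero probability contribute 0."""
    return sum(p * math.log(1.0 / p) for p in ed(F) if p > 0)


# The fuzzy sets of Example 6.2.
F = {"x1": 0.5, "x2": 0.5, "x3": 0.5, "x4": 0.5}
G = {"x1": 1.0, "x2": 0.75, "x3": 0.5, "x4": 0.25}

ed_F = ed(F)    # [0.5, 0.0, 0.0, 0.0, 0.5]
ed_G = ed(G)    # [0.0, 0.25, 0.25, 0.25, 0.25]
ent_F = ent(F)  # ln 2
ent_G = ent(G)  # ln 4
```

Under this assumption ED(F) concentrates on the two candidates 0 and 4, whereas ED(G) spreads evenly over 1 to 4, so Ent(F) is lower than Ent(G), in line with the discussion of the example.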
Following this approach, which is the "fuzziest" fuzzy set? By the properties of the function H, the highest entropy value is reached when the probability is equidistributed. Therefore, if we have n elements, the highest entropy is reached when ED(F, k) = 1/(n + 1) for every k (taking into account the probability ED(F, 0)). The following proposition defines the fuzziest fuzzy set over n objects.

Proposition 6.3. The fuzziest set over n objects, $A^{(n)}$, can be defined as

$$A^{(n)}(x_i) = i/(n + 1). \tag{36}$$
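Proposition 6.3 can be checked numerically for small n. The sketch below uses exact fractions and again assumes $ED(F, k) = b_k - b_{k+1}$ over the decreasingly sorted membership degrees, with $b_0 = 1$ and $b_{n+1} = 0$; the names `fuzziest_set` and `ed` are hypothetical.

```python
import math
from fractions import Fraction


def fuzziest_set(n):
    """A^(n)(x_i) = i / (n + 1), as in (36)."""
    return {f"x{i}": Fraction(i, n + 1) for i in range(1, n + 1)}


def ed(F):
    """[ED(F, 0), ..., ED(F, n)], assuming ED(F, k) = b_k - b_{k+1}."""
    b = [1] + sorted(F.values(), reverse=True) + [0]
    return [b[k] - b[k + 1] for k in range(len(b) - 1)]


A4 = fuzziest_set(4)
dist = ed(A4)  # five entries, each equal to 1/5
entropy = sum(float(p) * math.log(1.0 / float(p)) for p in dist if p > 0)  # ln 5
```

Every cardinality from 0 to n receives probability 1/(n + 1), so the entropy of the distribution is ln(n + 1), the maximum for n + 1 outcomes.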
As an example, if n = 4 then $A^{(4)} = 0.2/x_1 + 0.4/x_2 + 0.6/x_3 + 0.8/x_4$.

Proposition 6.4. The entropy Ent(F) is 0 if and only if F is a crisp set.

Proof. It is trivial, given that if F is crisp then ED(F) is also crisp.

Proposition 6.5. The entropy Ent(F) verifies $Ent(\bar{F}) = Ent(F)$.
Proof. Trivial regarding (35) and (23).

Proposition 6.6. Let F be a fuzzy set. We define n fuzzy sets $\{F^1, \ldots, F^n\}$ as $F^i = F(x_i)/x_i$. Hence $F = \bigcup_i F^i$ and $i \neq j \Rightarrow F^i \cap F^j = \emptyset$. Then

$$d(F) = \sum_{i=1}^{n} Ent(F^i). \tag{37}$$

Proof. We only need to show that $Ent(F^i) = S(F(x_i))$. The entropy of $F^i$ is

$$Ent(F^i) = -ED(F^i, 0) \ln(ED(F^i, 0)) - ED(F^i, 1) \ln(ED(F^i, 1)),$$

where $ED(F^i, 0) = b_0 - b_1 = 1 - F(x_i)$ and $ED(F^i, 1) = b_1 - b_2 = b_1 = F(x_i)$, so $Ent(F^i) = S(F(x_i))$ and hence $d(F) = \sum_{i=1}^{n} Ent(F^i)$.

We think that entropy measures which agree with property (2) assume independence among objects, and therefore they obtain the total entropy of the set as the sum of the entropy of every set $F^i$. De Luca and Termini agree in [7] that additivity is a strong requirement which is only valid when independence among objects holds. They also agree in the same paper that some interesting classes of entropy measures do not verify property (2). However, we agree with them that for some problems additivity can hold or can be a reasonable approximation.

6.2. A cardinality-based ranking for fuzzy sets

One of the problems related to the cardinality is that of the equipotency of fuzzy sets, i.e., deciding whether two given fuzzy sets F and G have the same cardinality. We can extend this problem to that of ranking the cardinality of two fuzzy sets using the representation of a fuzzy set F as a set of α-cuts. We think that the ranking of the cardinalities of fuzzy sets should be fuzzy, since the cardinality is widely considered to be fuzzy. In particular, the same argument can be claimed with respect to the equipotency of fuzzy sets, so we should give a fuzzy degree of equipotency of two fuzzy sets. We will define the fuzzy ranking of the cardinalities of two fuzzy sets as a fuzzy set over $R = \{<, =, >\}$. Following our general criterion, we should defuzzify the fuzzy ranking when providing it as a final result. Nevertheless, the size of the set R (|R| = 3) allows us to give it as a result to the user.

Definition 6.7. We introduce the possibilistic ranking of the cardinalities of F and G for every ∗ ∈ R to be

$$RPoss(|F| \ast |G|) = \max\{\alpha_i \in M(F/G) \mid |F_{\alpha_i}| \ast |G_{\alpha_i}|\}. \tag{38}$$

Definition 6.8. We introduce the probabilistic ranking of the cardinalities of F and G for every ∗ ∈ R to be

$$RProb(|F| \ast |G|) = \sum_{\alpha_i \,:\, |F_{\alpha_i}| \ast |G_{\alpha_i}|} (\alpha_i - \alpha_{i+1}). \tag{39}$$
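The two rankings can be sketched as follows. This is an illustration under explicit assumptions, not the authors' implementation: fuzzy sets are dictionaries of membership degrees, and the level set is taken as all positive degrees of F and of G together with 1, so that every α-cut of both sets is compared and the probabilistic weights sum to 1. The definitions index the levels over M(F/G), so this choice of level set is an assumption of the sketch, as is returning 0 when no level satisfies a relation. All function names are hypothetical.

```python
import operator

RELATIONS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}


def cut_size(F, alpha):
    """Cardinality of the alpha-cut of F."""
    return sum(1 for degree in F.values() if degree >= alpha)


def ranking_levels(F, G):
    """Assumed level set: positive degrees of F and G plus 1, decreasing."""
    degrees = set(F.values()) | set(G.values()) | {1.0}
    return sorted((a for a in degrees if a > 0), reverse=True)


def r_poss(F, G):
    """Possibilistic ranking: the highest level at which each relation holds."""
    alphas = ranking_levels(F, G)
    result = {}
    for symbol, holds in RELATIONS.items():
        matching = [a for a in alphas if holds(cut_size(F, a), cut_size(G, a))]
        result[symbol] = max(matching, default=0.0)
    return result


def r_prob(F, G):
    """Probabilistic ranking: total weight alpha_i - alpha_{i+1} per relation."""
    alphas = ranking_levels(F, G) + [0.0]
    result = {symbol: 0.0 for symbol in RELATIONS}
    for alpha, next_alpha in zip(alphas, alphas[1:]):
        for symbol, holds in RELATIONS.items():
            if holds(cut_size(F, alpha), cut_size(G, alpha)):
                result[symbol] += alpha - next_alpha
    return result


# The fuzzy sets of Example 6.2.
F = {"x1": 0.5, "x2": 0.5, "x3": 0.5, "x4": 0.5}
G = {"x1": 1.0, "x2": 0.75, "x3": 0.5, "x4": 0.25}
poss = r_poss(F, G)  # {"<": 1.0, "=": 0.25, ">": 0.5}
prob = r_prob(F, G)  # {"<": 0.5, "=": 0.25, ">": 0.25}
```

For the sets of Example 6.2, the cuts of F are smaller than those of G at levels 1 and 0.75, larger at level 0.5 and equal at level 0.25, which gives the values shown in the comments.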
The following definitions of the degree of equipotency of two fuzzy sets follow from Definitions 6.7 and 6.8.

Definition 6.9. We introduce the possibilistic degree of equipotency between two fuzzy sets F and G to be

$$EqPoss(F, G) = RPoss(|F| = |G|). \tag{40}$$

Definition 6.10. We introduce the probabilistic degree of equipotency between two fuzzy sets F and G to be

$$EqProb(F, G) = RProb(|F| = |G|). \tag{41}$$

Proposition 6.11. The possibilistic equipotency EqPoss verifies

$$EqPoss(F, G) = 1 \Leftrightarrow |F_1| = |G_1|. \tag{42}$$

Proof. $EqPoss(F, G) = 1 \Leftrightarrow \max\{\alpha_i \in M(F/G) \mid |F_{\alpha_i}| = |G_{\alpha_i}|\} = 1 \Leftrightarrow |F_1| = |G_1|$.

Proposition 6.12. The probabilistic equipotency EqProb verifies

$$EqProb(F, G) = 1 \Leftrightarrow |F_{\alpha_i}| = |G_{\alpha_i}| \;\; \forall \alpha_i \in M(F/G). \tag{43}$$
Proof. $EqProb(F, G) = 1 \Leftrightarrow \sum_{\alpha_i \in M(F/G) \,\mid\, |F_{\alpha_i}| = |G_{\alpha_i}|} (\alpha_i - \alpha_{i+1}) = 1 \Leftrightarrow |F_{\alpha_i}| = |G_{\alpha_i}| \;\; \forall \alpha_i \in M(F/G)$.

Proposition 6.13. The probabilistic equipotency EqProb verifies

$$EqProb(F, G) = 1 \Leftrightarrow Z(F) = Z(G) \Leftrightarrow ED(F) = ED(G). \tag{44}$$

Proof. The cardinality Z(F) is based on the cardinality of the α-cuts of F, and there is a one-to-one relation between a set of cardinalities of α-cuts and a fuzzy cardinality Z, so the cardinalities of the α-cuts of G are equal to those of the α-cuts of F if and only if Z(F) = Z(G). By Proposition 4.9, Z(F) = Z(G) if and only if ED(F) = ED(G).

This last property is not verified by EqPoss, and hence we prefer EqProb for measuring the degree of equipotency of fuzzy sets.
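The difference between the two degrees of equipotency can be seen on a small example. The sketch below makes the same kind of assumptions as a direct reading of Definitions 6.9 and 6.10 requires: fuzzy sets are dictionaries, and the levels are all positive degrees of F and G together with 1, which is an assumption of this sketch rather than the level set M(F/G) of the definitions. The names `eq_poss` and `eq_prob` are hypothetical.

```python
def cut_size(F, alpha):
    """Cardinality of the alpha-cut of F."""
    return sum(1 for degree in F.values() if degree >= alpha)


def levels(F, G):
    """Assumed level set: positive degrees of F and G plus 1, decreasing."""
    degrees = set(F.values()) | set(G.values()) | {1.0}
    return sorted((a for a in degrees if a > 0), reverse=True)


def eq_poss(F, G):
    """Highest level at which the alpha-cuts have equal cardinality."""
    matching = [a for a in levels(F, G) if cut_size(F, a) == cut_size(G, a)]
    return max(matching, default=0.0)


def eq_prob(F, G):
    """Total weight of the levels at which the alpha-cuts have equal cardinality."""
    alphas = levels(F, G) + [0.0]
    return sum(
        alpha - next_alpha
        for alpha, next_alpha in zip(alphas, alphas[1:])
        if cut_size(F, alpha) == cut_size(G, alpha)
    )


# Illustrative data (not taken from the paper).
F = {"x1": 1.0, "x2": 0.5}
G = {"x1": 1.0, "x2": 0.2}
H = {"x1": 0.5, "x2": 1.0}  # same degrees as F, on different objects

poss_FG = eq_poss(F, G)  # 1.0, since |F_1| = |G_1|
prob_FG = eq_prob(F, G)  # about 0.7: the cuts differ at level 0.5
prob_FH = eq_prob(F, H)  # 1.0: all cuts have equal cardinality
```

Here the possibilistic degree is 1 for F and G although their α-cuts differ in size at level 0.5, while the probabilistic degree drops below 1 and reaches 1 only for F and H, whose α-cuts have the same cardinality at every level.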
7. Concluding remarks

We have discussed the fuzzy cardinality and fuzzy relative cardinality of fuzzy sets. We have justified why only the representation of a fuzzy set by means of its α-cuts is intuitively well-suited for the purposes of measuring cardinality, if we take into account the dependence among objects. We have proposed a probabilistic and nonconvex measure ED for the fuzzy cardinality. Also we have introduced a family E of cardinalities where ED is included, as well as other existing fuzzy cardinalities. The study of the cardinalities of the family E that verify the dependence among objects, together with the discussion of their properties, will be an object of future research. We have also proposed both a probabilistic (ER(F/G)) and a possibilistic (ES(F/G)) fuzzy measure of the relative cardinality of fuzzy sets that generalize the measures of cardinality ED(F) and Z(F), respectively, in the case G = X. We have applied some of these concepts to the problem of evaluating quantified sentences, and we have obtained methods with better properties than existing ones. We have proposed a definition of the entropy of a fuzzy set based on our probabilistic measure ED, relating the development of entropy measures to the independence problem. Finally, we have proposed a fuzzy ranking of fuzzy sets based on the fuzzy cardinality measures ED and Z. We expect to apply the method ER to the evaluation of type II quantified sentences to obtain a probabilistic method in further research.
References

[1] D. Dubois, H. Prade, Fuzzy cardinality and the modeling of imprecise quantification, Fuzzy Sets and Systems 16 (1985) 199–230.
[2] D. Dubois, H. Prade, Measuring properties of fuzzy sets: a general technique and its use in fuzzy query evaluation, Fuzzy Sets and Systems 38 (1990) 137–152.
[3] D. Dubois, H. Prade, Scalar evaluations of fuzzy sets: overview and applications, Appl. Math. Lett. 3 (2) (1990) 37–42.
[4] S. Gottwald, A note on fuzzy cardinals, Kybernetika 16 (1980) 156–158.
[5] B. Kosko, Fuzzy entropy and conditioning, Inform. Sci. 40 (1986) 165–174.
[6] A. De Luca, S. Termini, A definition of a nonprobabilistic entropy in the setting of fuzzy sets theory, Inform. and Control 20 (1972) 301–312.
[7] A. De Luca, S. Termini, Entropy and energy measures of a fuzzy set, in: M.M. Gupta, R.K. Ragade, R.R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, Vol. 20, 1979, pp. 321–338.
[8] N.R. Pal, S.K. Pal, Higher order fuzzy entropy and hybrid entropy of a set, Inform. Sci. 61 (1992) 211–231.
[9] D. Ralescu, Cardinality, quantifiers and the aggregation of fuzzy criteria, Fuzzy Sets and Systems 69 (1995) 355–365.
[10] Xiu-Gang Shang, Wei-Sun Jiang, A note on fuzzy information measures, Pattern Recognition Lett. 18 (1997) 425–432.
[11] M. Wygralak, A new approach to the fuzzy cardinality of finite fuzzy subsets, BUSEFAL 15 (1983) 72–75.
[12] M. Wygralak, Fuzzy cardinals based on the generalized equality of fuzzy subsets, Fuzzy Sets and Systems 18 (1986) 143–158.
[13] M. Wygralak, Vaguely Defined Objects. Representations, Fuzzy Sets and Nonclassical Cardinality Theory, Kluwer Academic Press, Dordrecht, Boston, London, 1996.
[14] M. Wygralak, On the best scalar approximation of cardinality of a fuzzy set, Int. J. Uncertainty, Fuzziness and Knowledge-Based Systems 5 (6) (1997) 681–687.
[15] M. Wygralak, Questions of cardinality of finite fuzzy sets, Fuzzy Sets and Systems 102 (6) (1999) 185–210.
[16] M. Wygralak, Triangular operations, negations, and scalar cardinality of a fuzzy set, in: L.A. Zadeh, J. Kacprzyk (Eds.), Computing With Words in Information/Intelligent Systems, 1. Foundations, Physica-Verlag, Heidelberg-New York, 1999, pp. 326–341.
[17] M. Wygralak, An axiomatic approach to scalar cardinalities of fuzzy sets, Fuzzy Sets and Systems 110 (2000) 175–179.
[18] M. Wygralak, D. Pilarski, Triangular norm-based generalized cardinals for fuzzy sets, in: Proceedings of the Seventh Zittau Fuzzy Colloquium, 1999, pp. 234–239.
[19] R.R. Yager, On the measure of fuzziness and negation, Part I: Membership in the unit interval, Int. J. General Systems 5 (1979) 221–229.
[20] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inform. Sci. 9 (1975) 43–80.
[21] L.A. Zadeh, A theory of approximate reasoning, Machine Intell. 9 (1979) 149–194.
[22] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. Math. Appl. 9 (1) (1983) 149–184.