On fuzzy bags and their application to flexible querying


Fuzzy Sets and Systems 140 (2003) 93 – 110 www.elsevier.com/locate/fss

Daniel Rocacher

IRISA/ENSSAT, BP 447, 22305 Lannion Cedex, France

Abstract

An issue in extending database management functionalities is to increase the expressiveness of query languages. Flexible querying enables users to express preferences inside requirements and priorities inside compound queries. The answers are then qualified and rank-ordered. Fuzzy set theory offers a general framework for dealing with flexible queries. Moreover, the bag type plays an important role in databases. Systems taking into account both flexible queries and bags define fuzzy bags. A new approach to this concept, based on the notion of fuzzy cardinalities, is presented in this paper. Fuzzy cardinalities provide a general framework in which different collection types (set, fuzzy set, bag and fuzzy bag) are treated in a uniform way and can then be composed. We show how they can be used to build flexible queries and to manipulate both quantitative and qualitative information on the elements of collections. © 2003 Elsevier B.V. All rights reserved.

Keywords: Fuzzy integer; Fuzzy cardinality; Fuzzy bags; Bags; Fuzzy sets; Flexible querying

1. Introduction

In ordinary database management systems user needs are modelled by Boolean conditions. However, certain requirements are more flexible and involve selecting more or less acceptable elements. For instance, in the query asking for the names of both young and well-paid employees in a company, the criteria young and well-paid are more or less satisfied (or gradual). Fuzzy set theory provides a general framework in which flexible querying can be expressed. Atomic gradual criteria (young, well-paid, etc.) can be combined with conjunction or disjunction, weighted conjunction or disjunction when predicates have different degrees of importance, and averaging operators for the expression of compromises. Flexible queries are therefore able to take users' preferences and degrees of importance into account, and answers can be differentiated and rank-ordered in accordance with gradual predicates [14].

E-mail address: [email protected] (D. Rocacher). 0165-0114/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0165-0114(03)00029-0


However, databases and their query languages must be adapted if this graduality is to be exploited. Studies have been thoroughly conducted in the relational database context [5,21]. The introduction of flexibility in query languages applied to object-oriented databases has also been the subject of research. These studies focus on the definition of elementary algebraic operators such as selection, Cartesian product, image, flatten, nest, ..., integrating fuzzy sets and gradual predicates. Using these elementary operators, some aspects of a high level query language, such as OQL specified by the ODMG, have been designed in order to deal with gradual queries [4,8]. As bags are an important type for object-oriented databases, these works have led us to analyse how to deal with both bags and flexible querying [6,9,25,26] and to introduce the concept of fuzzy bags. Even though we suggest here a role for fuzzy bags in the context of databases, there are of course many other potential application domains for the theory of bags.

A bag (or multiset) is a collection, like a set, but in which repeated elements are significant. Many systems have been designed to support them in their data model (whether relational or object-oriented). The use of this type of data is motivated by its ability to manage quantities. Moreover, since the elimination of duplicates is time-consuming in certain set operations, bags can also be useful for optimising computations. The algebraic properties of bags have been studied in [1,16,17,20,30] where sets have been presented as special cases of bags.

A fuzzy bag is a collection which simultaneously deals with quantities and degrees of membership of the elements it contains. In a database, a fuzzy bag can be obtained by a selection on a crisp bag using a gradual predicate, as illustrated by the query: find the expensive products stored in a given store. The selected products more or less satisfy the gradual predicate expensive and are associated with their quantities. The query: find the young employees' salaries is a projection on the attribute 'salary' from a fuzzy set of persons (the young employees) and also delivers a fuzzy bag. As several employees may have the same salary, the returned collection of salaries may contain duplicates. Moreover, a given salary occurrence coincides with a more or less young employee and is associated with a degree of membership expressing the extent to which the criterion to be the salary of a young person is satisfied. Consequently, the different elements returned by the query have to be managed both quantitatively and qualitatively thanks to a fuzzy bag which represents the distribution of the young employees' salaries.

The fuzzy bag concept has been introduced by Yager [30,31]. But, as explained later in this article, his propositions do not make fuzzy sets and fuzzy bags compatible: fuzzy sets cannot be presented as special cases of fuzzy bags. In [4,7,18,22,23] approaches for building fuzzy bags so as to introduce operators compatible with both bags and fuzzy sets have been proposed. A new approach to building fuzzy bags, based on fuzzy integers, is presented here.

An interesting objective for manipulating quantities inside queries is to deal with quantified statements. For example, we may be interested in expressing requests on quantities, as in: find the expensive products stored in large quantities or find the company in which the number of young employees is equal to or greater than the number of well-paid employees. The study of cardinalities, quantifiers and aggregations is a large and important subject which has been investigated by many authors [13,24,29,31,33] but is beyond the scope of this paper. In this paper, we show how to specify the basic operators on fuzzy bags so as to introduce new definitions compatible with both bags and fuzzy sets.
The usual basic operators on ordinary bags are briefly recalled in Section 2. In Section 3, fuzzy bags are characterized following different approaches based on α-cuts, ω-cuts and a general framework using fuzzy numbers. This allows us to develop, in Section 4, extensions of bag operators to fuzzy bags thanks to fuzzy cardinalities. In the field of databases, bags and fuzzy bags offer interesting perspectives; we present, in Section 5, how they can be used to construct flexible queries.

2. Bags

In this section, the main operators on bags are briefly presented. We sometimes have to deal with collections of elements in which duplicates are significant. For example, when dealing with the collection of employees' ages in a company, we may be interested in the distribution of ages. In such a situation, the set is not the proper mathematical abstraction to use. A collection of elements which may contain duplicates is called a bag (also called multiset) [1,16]. If X is a set of elements, a bag A defined on the universe X is characterized by a function $\omega_A$ as follows:

$$\omega_A : X \to \mathbb{N}.$$

For each x in X, $\omega_A(x)$ is the characteristic value of x in A and indicates the number of occurrences of the element x in A. The following representation will be used to define a bag A on the set $X = \{x_1, \dots, x_n\}$:

$$A = \{a_1 * x_1, \dots, a_n * x_n\} \quad \text{where } a_i = \omega_A(x_i).$$

We write $x \in A$ when the bag A contains x (i.e. $\omega_A(x) > 0$).

Example 2.1. $A = \{3*a, 2*b, 1*c\}$. A could also be denoted as $\langle a, a, a, b, b, c \rangle$.

The cardinality of a bag A drawn from X, denoted |A|, is defined as

$$|A| = \sum_{x \in X} \omega_A(x). \tag{1}$$

Usual operations on bags are:

$$\omega_{A \cap B}(x) = \min(\omega_A(x), \omega_B(x)), \tag{2}$$

$$\omega_{A \cup B}(x) = \max(\omega_A(x), \omega_B(x)), \tag{3}$$

$$\omega_{A + B}(x) = \omega_A(x) + \omega_B(x), \tag{4}$$

$$A \subseteq B \ \text{ if } \ \omega_A(x) \le \omega_B(x) \quad \forall x \in X, \tag{5}$$

$$\omega_{A - B}(x) = \max(0, \omega_A(x) - \omega_B(x)), \tag{6}$$

$$\omega_{A \times B}(x, y) = \omega_A(x) \times \omega_B(y). \tag{7}$$

In A + B, + denotes the additive union.
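As an illustration of operations (1)-(7), the following minimal sketch uses Python's `collections.Counter` as a crisp bag. The bag `B` and the helper names `cardinality`, `is_subbag` and `cartesian_product` are hypothetical and chosen only for this example; `A` is the bag of Example 2.1.

```python
from collections import Counter

# Crisp bags as element -> number of occurrences.
A = Counter({"a": 3, "b": 2, "c": 1})   # Example 2.1
B = Counter({"a": 1, "b": 4})           # hypothetical second bag


def cardinality(bag):
    """Equation (1): sum of the numbers of occurrences."""
    return sum(bag.values())


def is_subbag(a, b):
    """Equation (5): every element occurs at most as often in a as in b."""
    return all(n <= b[x] for x, n in a.items())


def cartesian_product(a, b):
    """Equation (7): occurrences of a pair multiply."""
    return Counter({(x, y): n * m for x, n in a.items() for y, m in b.items()})


intersection = A & B   # equation (2): minimum of the counts
union = A | B          # equation (3): maximum of the counts
additive_union = A + B # equation (4): sum of the counts
difference = A - B     # equation (6): max(0, difference of the counts)
```

When every count is 0 or 1, these operators behave like the usual set operations, which is the compatibility property discussed in the text.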

A bag without duplicates is a set and, in this particular case, the bag operations lead to the same results as the set operations, thus achieving compatibility. Bags are often used to implement relations in database systems, and algebras for manipulating them have been developed [1,16,17]. Knuth [20] introduced multisets into algorithms. See also [3] for a complete survey.

3. Fuzzy bags

A fuzzy bag is a bag in which each occurrence of each element is associated with a grade of membership. One way to describe a fuzzy bag is to enumerate its elements.

Example 3.1. $A = \langle 1/a, 0.1/a, 0.1/a, 0.5/b \rangle$.

A fuzzy bag A, on a universe X, can also be defined by a characteristic function $\Delta_A$ from X to Q, where Q is the set of all crisp bags defined on [0, 1]:

$$\Delta_A : X \to Q.$$

Example 3.2. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\} = \{\{1*1, 2*0.1\}/a, \{1*0.5\}/b\}$; $\Delta_A(a) = \langle 1, 0.1, 0.1 \rangle = \{1*1, 2*0.1\}$.

Operators on fuzzy bags have been defined by Yager in [30,31] and a complementary study has been carried out in [7]. In these propositions, the intersection of two fuzzy bags A and B is based on the intersection of the crisp bags of degrees associated with each element x, denoted below $\Delta_A(x)$ and $\Delta_B(x)$, so

$$\forall x \in X, \quad \Delta_{A \cap B}(x) = \Delta_A(x) \cap \Delta_B(x).$$

However, this definition is not compatible with the intersection of fuzzy sets. For example, if A = {0.1/a} and B = {0.9/a} then A ∩ B = ∅, which is not the expected result when A and B are considered as fuzzy sets (the fuzzy set intersection is {0.1/a}). Although Yager's propositions allow crisp bags to be considered as a special case of fuzzy bags, they do not always allow fuzzy sets to be considered as a special case of fuzzy bags. Similar difficulties appear with the definitions of the union and inclusion operators. Therefore, new definitions taking into consideration compatibility between fuzzy sets and fuzzy bags have to be developed.

A first approach is based on the α-cut concept, similarly to the extension of sets to fuzzy sets. This concept can be viewed as a bridge connecting fuzzy sets and crisp sets, since a fuzzy set can be represented by the family of all its α-cuts [19]. This representation allows operations on crisp sets to be extended to their fuzzy counterparts thanks to the following properties:

$$(A \cap B)_\alpha = A_\alpha \cap B_\alpha, \qquad (A \cup B)_\alpha = A_\alpha \cup B_\alpha. \tag{8}$$

Such an approach will be applied to link fuzzy bags to bags and to achieve compatibility with fuzzy sets [9,18,22,23]. The α-cut of a fuzzy bag A is defined as the crisp bag $A_\alpha$ which contains all the occurrences of the elements of a universe X whose grade of membership in A is greater than or equal to the degree α (α ∈ ]0, 1]). The number of occurrences $x_i$ of the element x in $A_\alpha$ is denoted $\omega_{A_\alpha}(x)$. Let $\omega_A(x, d)$ be the number of occurrences of the element x in A associated with the grade of membership d. Each fuzzy bag is represented by its α-cuts via the formula:

$$\omega_{A_\alpha}(x) = \sum_{d \ge \alpha} \omega_A(x, d).$$

Example 3.3. $A = \{\{1*1, 2*0.1\}/a, \{1*0.5\}/b\}$;

$\omega_A(a, 1) = 1$, $\omega_A(a, 0.1) = 2$, $\omega_{A_{0.5}}(a) = 1$, $\omega_{A_{0.1}}(a) = 3$;

$A_1 = \{1*a\}$, $A_{0.5} = \{1*a, 1*b\}$, $A_{0.1} = \{3*a, 1*b\}$.

Given a fuzzy bag A, the following property holds:

$$\forall \alpha, \beta \in \, ]0, 1], \quad \alpha \le \beta \Rightarrow A_\beta \subseteq A_\alpha.$$

Consequently the α-cuts of a fuzzy bag are nested crisp bags and a fuzzy bag can be represented by the family of all its α-cuts. Defining the intersection and the union of two fuzzy bags so that they satisfy (8) preserves the compatibility between bags and fuzzy bags. So, if A and B are two fuzzy bags, from (2), (3) and (8) it can be deduced:

$$\omega_{(A \cap B)_\alpha}(x) = \min(\omega_{A_\alpha}(x), \omega_{B_\alpha}(x)), \tag{9}$$

$$\omega_{(A \cup B)_\alpha}(x) = \max(\omega_{A_\alpha}(x), \omega_{B_\alpha}(x)). \tag{10}$$
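The α-cut construction and equation (9) can be sketched as follows. This is only an illustration under stated assumptions: a fuzzy bag is stored as a dictionary mapping each element to the list of degrees of its occurrences, and the function names (`alpha_cut`, `levels`, `intersect_by_cuts`) are hypothetical. The bag `A` is the one of Example 3.3; `B` is an additional bag chosen for the illustration.

```python
from collections import Counter

# Fuzzy bag: element -> list of membership degrees, one per occurrence.
A = {"a": [1.0, 0.1, 0.1], "b": [0.5]}
B = {"a": [0.9, 0.5]}


def alpha_cut(fbag, alpha):
    """Crisp bag of the occurrences whose degree is >= alpha."""
    cut = Counter()
    for x, degrees in fbag.items():
        n = sum(1 for d in degrees if d >= alpha)
        if n:
            cut[x] = n
    return cut


def levels(*fbags):
    """All degrees used in the given fuzzy bags, from highest to lowest."""
    return sorted({d for fb in fbags for ds in fb.values() for d in ds},
                  reverse=True)


def intersect_by_cuts(fa, fb):
    """Apply (9) at every level, then rebuild the degrees of the occurrences."""
    result, seen = {}, Counter()
    for alpha in levels(fa, fb):
        cut = alpha_cut(fa, alpha) & alpha_cut(fb, alpha)
        for x, n in cut.items():
            # Occurrences that appear for the first time at this level
            # receive the degree alpha (the cuts are nested).
            new = n - seen[x]
            if new > 0:
                result.setdefault(x, []).extend([alpha] * new)
                seen[x] = n
    return result
```

Only the finitely many degrees that actually occur need to be inspected, because the cuts do not change between two consecutive levels.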


Example 3.4. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\}$, $B = \{\langle 0.9, 0.5 \rangle / a\}$;

$(A \cap B)_1 = \{1*a\} \cap \{\} = \{\}$,
$(A \cap B)_{0.9} = \{1*a\} \cap \{1*a\} = \{1*a\}$,
$(A \cap B)_{0.5} = \{1*a, 1*b\} \cap \{2*a\} = \{1*a\}$,
$(A \cap B)_{0.1} = \{3*a, 1*b\} \cap \{2*a\} = \{2*a\}$.

We then deduce: $A \cap B = \{\langle 0.9, 0.1 \rangle / a\}$. A ∩ B is the largest collection which is contained in both A and B.

In a second approach, we put forward a link between fuzzy bag and fuzzy set structures by introducing the concept of ω-cut. The ω-cut of a fuzzy bag A is the fuzzy set $A^\omega$ such that the grade of membership of the element x in $A^\omega$, denoted $\mu_{A^\omega}(x)$, defines the extent to which A contains at least ω (with ω ∈ N+) occurrences of x:

$$\mu_{A^\omega}(x) = \sup\{\alpha : \omega_{A_\alpha}(x) \ge \omega\}.$$

$\mu_{A^\omega}(x)$ is the best α such that $A_\alpha$ contains at least ω occurrences of x.

Example 3.5. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\}$; $A^1 = \{1/a, 0.5/b\}$, $\mu_{A^1}(a) = 1$; $A^2 = \{0.1/a\}$; $A^3 = \{0.1/a\}$.

A fuzzy bag A can be represented by the family of the nested fuzzy sets $A^\omega$ (with ω ∈ N+). A is the additive union of all its ω-cuts $A^\omega$. The ω-cuts define homomorphic links between fuzzy sets and fuzzy bags, and operations on fuzzy sets can be extended to their counterparts on fuzzy bags thanks to the following properties:

$$(A \cap B)^\omega = A^\omega \cap B^\omega, \qquad (A \cup B)^\omega = A^\omega \cup B^\omega. \tag{11}$$
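The ω-cut view can be sketched in the same spirit. The code below assumes, as a representation choice made for this example only, that a fuzzy bag maps each element to the list of degrees of its occurrences; the ω-cut degree of x is then the ω-th largest degree. The function names are hypothetical.

```python
# Fuzzy bag: element -> list of membership degrees, one per occurrence.
A = {"a": [1.0, 0.1, 0.1], "b": [0.5]}
B = {"a": [0.9, 0.5]}


def omega_cut(fbag, omega):
    """Fuzzy set of the elements having at least omega occurrences.

    The degree of x is the omega-th largest degree among its occurrences,
    i.e. the best alpha whose alpha-cut still holds omega occurrences of x.
    """
    cut = {}
    for x, degrees in fbag.items():
        ranked = sorted(degrees, reverse=True)
        if len(ranked) >= omega:
            cut[x] = ranked[omega - 1]
    return cut


def fuzzy_set_intersection(f, g):
    """Standard fuzzy set intersection (minimum of the degrees)."""
    return {x: min(f[x], g[x]) for x in f.keys() & g.keys()}


def intersect_by_omega_cuts(fa, fb):
    """Intersect the omega-cuts level by level and collect the degrees."""
    result, omega = {}, 1
    while True:
        cut = fuzzy_set_intersection(omega_cut(fa, omega), omega_cut(fb, omega))
        if not cut:          # the cuts are nested: once empty, always empty
            break
        for x, d in cut.items():
            result.setdefault(x, []).append(d)
        omega += 1
    return result
```

On these two bags the level-by-level intersection yields the degrees 0.9 and 0.1 for the element a, the same collection as the one obtained with α-cuts in Example 3.4.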

Example 3.6. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\}$, $B = \{\langle 0.9, 0.5 \rangle / a\}$;

$(A \cap B)^1 = \{1/a, 0.5/b\} \cap \{0.9/a\} = \{0.9/a\}$,
$(A \cap B)^2 = \{0.1/a\} \cap \{0.5/a\} = \{0.1/a\}$,
$(A \cap B)^3 = \{0.1/a\} \cap \{\} = \{\}$.

We then deduce: $A \cap B = \{\langle 0.9, 0.1 \rangle / a\}$.

The ω-cut concept can also be used to link crisp bags and crisp sets. The relationships between the set, bag, fuzzy set and fuzzy bag structures are then described in Fig. 1.

Fig. 1. Links between different collection types.

These first two approaches, focused on degrees of membership (α-cuts) and on quantities (ω-cuts), can be handled jointly by a single approach based on fuzzy cardinalities. The concept of fuzzy cardinality of a fuzzy set has been proposed by Zadeh [33]. The cardinality $|\hat{A}|$ of a fuzzy set A, called FGCount(A), is defined by

$$\forall n \in \mathbb{N}, \quad \mu_{|\hat{A}|}(n) = \sup\{\alpha : |A_\alpha| \ge n\}.$$

Example 3.7. If $A = \{1/x_1, 0.1/x_2, 0.1/x_3\}$ then $|\hat{A}| = \{1/0, 1/1, 0.1/2, 0.1/3\}$.

The degree α associated with a number ω in the fuzzy cardinality of a fuzzy set A is interpreted as the extent to which A has at least ω elements. The fuzzy cardinality $|\hat{A}|$ has been defined as the convex hull of the fuzzy set of the cardinalities of all the α-cuts of A. $|\hat{A}|$ is a normalized and convex fuzzy set of integers. Properties of fuzzy cardinalities are presented, for instance, in [2,10,13,29,33].

Considering this fuzzy cardinality notion, the occurrences of an element x in a fuzzy bag A can be characterized as a fuzzy integer denoted $\Omega_A(x)$. This fuzzy number is the fuzzy cardinality of the fuzzy set of the occurrences of x in A. This approach with fuzzy cardinalities integrates both of the previous approaches, with α-cuts and ω-cuts. So, a fuzzy bag A, on a universe X, can be defined by a characteristic function $\Omega_A$ from X to $\hat{C}$, where $\hat{C}$ is the subset of the fuzzy integers related to all the fuzzy cardinalities:

$$\Omega_A : X \to \hat{C}.$$
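A minimal sketch of this characterization follows. It assumes a list encoding in which position n holds the degree of "at least n occurrences"; this encoding and the names `fgcount`, `at_least` and `alpha_count` are choices made for the illustration, not notation from the paper.

```python
def fgcount(degrees):
    """FGCount of a fuzzy set given by the list of its (non-zero) degrees.

    Returns a list mu where mu[n] is the extent to which the set has at
    least n elements: mu[0] = 1 and mu[n] is the n-th largest degree.
    """
    return [1.0] + sorted(degrees, reverse=True)


# Fuzzy bag as element -> degrees of its occurrences, then as fuzzy integers.
A = {"a": [1.0, 0.1, 0.1], "b": [0.5]}
omega_A = {x: fgcount(ds) for x, ds in A.items()}


def at_least(mu, n):
    """omega-cut reading: degree to which there are at least n occurrences."""
    return mu[n] if n < len(mu) else 0.0


def alpha_count(mu, alpha):
    """alpha-cut reading: number of occurrences with degree >= alpha."""
    return sum(1 for d in mu[1:] if d >= alpha)
```

Both earlier views can be read off the same object: `at_least` gives the ω-cut degree and `alpha_count` the α-cut multiplicity.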


Example 3.8. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\}$; $\Omega_A(a) = \{1/0, 1/1, 0.1/2, 0.1/3\}$. Using fuzzy numbers of occurrences, A is represented by: $A = \{\{1/0, 1/1, 0.1/2, 0.1/3\} * a, \{1/0, 0.5/1\} * b\}$.

4. Operators on fuzzy bags using fuzzy numbers

The operations on crisp bags are based on arithmetic operations on the numbers of occurrences of their elements. Through the application of the extension principle, these crisp bag operations can be extended to fuzzy bags thanks to the concept of fuzzy numbers of occurrences. Let us recall that a binary operation # (+, min, max, etc.) on fuzzy quantities Q and Q' [19,32] is defined by

$$\mu_{Q \# Q'}(z) = \sup_{(x, y) \,:\, x \# y = z} \min(\mu_Q(x), \mu_{Q'}(y)).$$
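The extension principle recalled above can be sketched for fuzzy integers with finite support. In this illustration a fuzzy integer is a dictionary from integers to degrees; the function name `extend` and the two sample quantities are hypothetical.

```python
from itertools import product


def extend(op, q1, q2):
    """Sup-min extension of an integer operation `op` to fuzzy integers.

    q1 and q2 map integers to membership degrees (finite support). For each
    reachable z = op(x, y), keep the best min(q1[x], q2[y]).
    """
    result = {}
    for (x, mx), (y, my) in product(q1.items(), q2.items()):
        z = op(x, y)
        result[z] = max(result.get(z, 0.0), min(mx, my))
    return result


qa = {0: 1.0, 1: 1.0, 2: 0.1, 3: 0.1}
qb = {0: 1.0, 1: 0.5}

total = extend(lambda x, y: x + y, qa, qb)   # fuzzy addition
lower = extend(min, qa, qb)                  # fuzzy minimum
upper = extend(max, qa, qb)                  # fuzzy maximum
```

The supremum reduces to a maximum here because the supports are finite.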

In the following, operators on bags (1)–(7) are extended.

4.1. Fuzzy cardinality

The cardinality of a bag is the arithmetic sum of the occurrences of its elements, so through the extension principle the cardinality of a fuzzy bag A is

$$|\hat{A}| = \sum_{x \in X} \Omega_A(x). \tag{12}$$

Example 4.1. $A = \{\langle 1, 0.1, 0.1 \rangle / a, \langle 0.5 \rangle / b\}$; $|\hat{A}| = \{1/0, 1/1, 0.1/2, 0.1/3\} + \{1/0, 0.5/1\} = \{1/0, 1/1, 0.5/2, 0.1/3, 0.1/4\}$.

We can notice that the degree of membership $\mu_A(x)$ of an element x in a fuzzy set A can be viewed as a special case of a fuzzy number:

$$\mu_A(x) \cong \{1/0, \mu_A(x)/1\} = \Omega_A(x)$$

and thus the fuzzy cardinality FGCount(A) can also be defined as

$$\mathrm{FGCount}(A) = \sum_{x \in X} \Omega_A(x).$$

Example 4.2. $A = \{1/x_1, 0.1/x_2, 0.1/x_3\}$. A can be represented with fuzzy numbers: $A = \{\{1/0, 1/1\} * x_1, \{1/0, 0.1/1\} * x_2, \{1/0, 0.1/1\} * x_3\}$; FGCount(A) = {1/0, 1/1} + {1/0, 0.1/1} + {1/0, 0.1/1} = {1/0, 1/1, 0.1/2, 0.1/3}.

In the same way, the number of occurrences $\omega_A(x)$ of an element x in a crisp bag A can be viewed as a special case of a fuzzy number:

$$\omega_A(x) \cong \{1/0, \dots, 1/\omega_A(x)\} = \Omega_A(x) \quad \text{and} \quad |A| = \sum_{x \in X} \Omega_A(x).$$

4.2. Inclusion

Given two fuzzy bags A and B, A is said to be a strict subbag of B iff:

$$\forall x \in X: \quad \Omega_A(x) \le \Omega_B(x).$$

This definition is a simple extension of (5) and of the usual definition of strict inclusion in fuzzy set theory, originally proposed by Zadeh. However, ≤ is only a partial order relation over $\hat{C}$ because there exist fuzzy numbers of occurrences which are incomparable.

Example 4.3. x = {1/0, 1/1, 0.9/2, 0.4/3}; y = {1/0, 1/1, 0.8/2, 0.5/3}.

This leads us to define a fuzzy inclusion which evaluates the extent to which a fuzzy bag A is a subbag of a fuzzy bag B. A is a subbag of B if, whenever there are at least ω occurrences of x in A, there are at least ω occurrences of x in B, for any x ∈ X and ω ∈ N+. Formally:

$$A \subseteq B \ \text{ iff } \ \forall x \in X, \forall \omega \in \mathbb{N}^+, \quad \mu_{\Omega_A(x)}(\omega) \Rightarrow_f \mu_{\Omega_B(x)}(\omega), \tag{13}$$

where $\Rightarrow_f$ is a fuzzy implication operator.


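When the standard conjunction and the Gödel implication are chosen, we get:

$$\mu_{A \subseteq B} = \min_{x \in X} \, \min_{\omega \in \mathbb{N}^+} \left( \mu_{\Omega_A(x)}(\omega) \Rightarrow_{\mathrm{G\ddot{o}}} \mu_{\Omega_B(x)}(\omega) \right)$$

with

$$\left( \mu_{\Omega_A(x)}(\omega) \Rightarrow_{\mathrm{G\ddot{o}}} \mu_{\Omega_B(x)}(\omega) \right) = \begin{cases} 1 & \text{when } \mu_{\Omega_A(x)}(\omega) \le \mu_{\Omega_B(x)}(\omega), \\ \mu_{\Omega_B(x)}(\omega) & \text{otherwise.} \end{cases}$$

Example 4.4. A = {{1/0, 1/1, 0.9/2, 0.4/3} * a}; B = {{1/0, 1/1, 0.8/2, 0.5/3} * a};

$$\mu_{A \subseteq B} = \min(1 \Rightarrow_{\mathrm{G\ddot{o}}} 1, \ 1 \Rightarrow_{\mathrm{G\ddot{o}}} 1, \ 0.9 \Rightarrow_{\mathrm{G\ddot{o}}} 0.8, \ 0.4 \Rightarrow_{\mathrm{G\ddot{o}}} 0.5) = \min(1, 1, 0.8, 1) = 0.8.$$

The degree 0.8 is the threshold t such that $A_\alpha \subseteq B_\alpha$, ∀α ∈ ]0, t].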
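The graded inclusion with the Gödel implication can be sketched as follows, assuming a list encoding in which position n of a fuzzy number holds the degree of "at least n occurrences". The encoding and the names `goedel` and `inclusion_degree` are assumptions made for this example.

```python
def goedel(p, q):
    """Goedel implication: 1 if p <= q, else q."""
    return 1.0 if p <= q else q


def inclusion_degree(fa, fb):
    """Degree to which fuzzy bag fa is a subbag of fuzzy bag fb.

    Each bag maps an element to a list mu, mu[n] being the degree of
    'at least n occurrences'. Missing positions count as degree 0.
    """
    degree = 1.0
    for x, mu_a in fa.items():
        mu_b = fb.get(x, [1.0])
        for n in range(1, len(mu_a)):
            b = mu_b[n] if n < len(mu_b) else 0.0
            degree = min(degree, goedel(mu_a[n], b))
    return degree


# The two fuzzy bags of Example 4.4.
A = {"a": [1.0, 1.0, 0.9, 0.4]}
B = {"a": [1.0, 1.0, 0.8, 0.5]}
```

Elements that occur only in the second bag need no test, since an antecedent of degree 0 always yields an implication of degree 1.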

4.3. Intersection, union

From the extension principle and (2) and (3) we deduce

$$\Omega_{A \cap B}(x) = \min(\Omega_A(x), \Omega_B(x)), \tag{14}$$

$$\Omega_{A \cup B}(x) = \max(\Omega_A(x), \Omega_B(x)), \tag{15}$$

where $\Omega_A(x)$ and $\Omega_B(x)$ are the fuzzy numbers of occurrences of x in A and B. The set of fuzzy cardinalities $\hat{C}$ is a lattice, so any two fuzzy cardinalities $i_1$ and $i_2$ have a greatest lower bound (min) and a least upper bound (max), and the characteristic functions of the intersection and union are defined by:

$$\mu_{\Omega_{A \cap B}(x)}(z) = \sup_{z = \min(t, v)} \min(\mu_{\Omega_A(x)}(t), \mu_{\Omega_B(x)}(v)),$$

$$\mu_{\Omega_{A \cup B}(x)}(z) = \sup_{z = \max(t, v)} \min(\mu_{\Omega_A(x)}(t), \mu_{\Omega_B(x)}(v))$$

for all z in N.


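Example 4.5. A = {{1/0, 1/1, 0.1/2, 0.1/3} * a}; B = {{1/0, 0.9/1, 0.5/2} * a};

$\Omega_{A \cap B}(a) = \min(\{1/0, 1/1, 0.1/2, 0.1/3\}, \{1/0, 0.9/1, 0.5/2\})$.

In the table below, columns range over the terms of $\Omega_A(a)$ and rows over the terms of $\Omega_B(a)$; each cell combines the minimum of the two degrees with the minimum of the two integers.

    Ω_{A∩B}(a)    1/0      1/1      0.1/2    0.1/3
    1/0           1/0      1/0      0.1/0    0.1/0
    0.9/1         0.9/0    0.9/1    0.1/1    0.1/1
    0.5/2         0.5/0    0.5/1    0.1/2    0.1/2

Hence, $\Omega_{A \cap B}(a) = \{1/0, 0.9/1, 0.1/2\}$. The intersection A ∩ B is the largest collection such that:

$$A \cap B \subseteq A \quad \text{and} \quad A \cap B \subseteq B.$$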
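For fuzzy numbers of occurrences, the sup-min formulas for min and max simplify. The sketch below relies on one assumption that should be stated plainly: the membership functions are nonincreasing with degree 1 at 0 (which holds for FGCount-style cardinalities), so the extended min and max reduce to pointwise min and max of the degrees. The list encoding (position n = degree of "at least n occurrences") and the names `pointwise` and `combine` are hypothetical.

```python
from itertools import zip_longest


def pointwise(op, mu_a, mu_b):
    """Pointwise min or max of two fuzzy numbers of occurrences.

    Positions missing from the shorter list count as degree 0; trailing
    zero degrees are dropped from the result.
    """
    merged = [op(p, q) for p, q in zip_longest(mu_a, mu_b, fillvalue=0.0)]
    while merged and merged[-1] == 0.0:
        merged.pop()
    return merged


def combine(op, fa, fb):
    """Apply (14) with op=min or (15) with op=max to two fuzzy bags."""
    zero = [1.0]   # an absent element has zero occurrences for certain
    return {x: pointwise(op, fa.get(x, zero), fb.get(x, zero))
            for x in fa.keys() | fb.keys()}


# The fuzzy bags of Example 4.5.
A = {"a": [1.0, 1.0, 0.1, 0.1]}
B = {"a": [1.0, 0.9, 0.5]}

intersection = combine(min, A, B)
union = combine(max, A, B)
```

With crisp bags (all degrees equal to 1) this reduces to taking the smaller or larger count, and with fuzzy sets (lists of length two) to the usual min and max of membership degrees.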

This result is similar to the result obtained via the previous approaches using α-cuts or ω-cuts. When A and B are two crisp bags or two fuzzy sets, they can be viewed as particular cases of fuzzy bags. The extended operations (14) and (15) lead to results in accordance with the expected results obtained with the standard bag and fuzzy set intersection and union.

4.4. Additive union

The additive union of two fuzzy bags is a concatenation operation. Based on the concept of fuzzy numbers of occurrences and through the application of the extension principle, this operation can be deduced from (4):

$$\Omega_{A + B}(x) = \Omega_A(x) + \Omega_B(x), \tag{16}$$

where $\Omega_A(x)$ and $\Omega_B(x)$ are the fuzzy numbers of occurrences of x in A and B.

Example 4.6. A = {{1/0, 1/1, 0.1/2, 0.1/3} * a}; B = {{1/0, 0.9/1, 0.5/2} * a}; $\Omega_{A + B}(a)$ = {1/0, 1/1, 0.1/2, 0.1/3} + {1/0, 0.9/1, 0.5/2} = {1/0, 1/1, 0.9/2, 0.5/3, 0.1/4, 0.1/5}.

4.5. Difference

The difference between two fuzzy bags defined on a universe must be specified carefully because this operator does not induce a complement operator relative to this universe. In the classical theory, the set difference can be defined without complementation by

$$A - B = S \Leftrightarrow A = (A \cap B) \cup S.$$


In bag theory, this definition has an additive semantics and expresses that A − B is associated with the elements which have to be 'added' to A ∩ B so that it equals A:

$$A - B = S \Leftrightarrow A = (A \cap B) + S. \tag{17}$$

Using fuzzy numbers of occurrences, this definition involves:

$$\forall x \in X, \quad \Omega_A(x) = \Omega_{A \cap B}(x) + \Omega_S(x). \tag{18}$$

Unfortunately, it is known [10,11,12,27] that X = N − M is not the solution of the equation M + X = N when M and N are fuzzy numbers and + and − are the usual operations on fuzzy numbers defined with the extension principle. The solution X of the equation M + X = N, when it exists, is defined by the optimistic difference (denoted N )−( M):

$$\mu_{N \mathbin{)-(} M}(y) = \inf_{(x, z) \,:\, x + y = z} \mu_M(x) \wedge^{-1} \mu_N(z),$$

where $x_1 \wedge^{-1} x_2$ is defined as the greatest element t in [0, 1] such that $\min(x_1, t) \le x_2$, i.e.: if $x_1 \le x_2$ then $x_1 \wedge^{-1} x_2 = 1$, else $x_1 \wedge^{-1} x_2 = x_2$.

However, the solution of M + X = N does not always exist. If this equation is weakened to M + X ≤ N, then there is always a set of solutions and the greatest solution is given by X = N )−( M [27]. Consequently, we propose to define an 'optimistic' difference between two fuzzy bags (denoted )−( ) by relaxing (17):

$$A \mathbin{)-(} B = \cup \{S : (A \cap B) + S \subseteq A\}. \tag{19}$$
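The optimistic difference on fuzzy numbers can be sketched directly from the inf formula above. The list encoding (position n = degree of the integer n, degree 0 outside the list) and the names `residuum`, `optimistic_difference` and `fuzzy_sum` are assumptions of this illustration; the sample values are those used in Example 4.7.

```python
def residuum(p, q):
    """Greatest t in [0, 1] with min(p, t) <= q."""
    return 1.0 if p <= q else q


def optimistic_difference(n, m):
    """Greatest X with M + X <= N, for fuzzy integers given as lists."""
    def degree_of(mu, k):
        return mu[k] if k < len(mu) else 0.0

    result = [min(residuum(m[x], degree_of(n, x + y)) for x in range(len(m)))
              for y in range(len(n))]
    while result and result[-1] == 0.0:   # drop trailing zero degrees
        result.pop()
    return result


def fuzzy_sum(mu_a, mu_b):
    """Sup-min extension of + on fuzzy integers given as lists."""
    total = [0.0] * (len(mu_a) + len(mu_b) - 1)
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            total[x + y] = max(total[x + y], min(p, q))
    return total


N = [1.0, 1.0, 0.8, 0.5, 0.2]
M = [1.0, 1.0, 0.3]
X = optimistic_difference(N, M)
```

Adding X back to M gives a fuzzy number that stays below N at every position without necessarily reaching it, which is why the equation has to be relaxed to an inequality.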

In other words, A )−( B is the greatest bag S we can 'add' to A ∩ B so that A contains (A ∩ B) + S. This definition is not counterintuitive and is compatible with the usual definition of set difference. Expression (19) therefore leads to the greatest solution $\Omega_S(x)$ of the following equation on fuzzy numbers:

$$\Omega_{A \cap B}(x) + \Omega_S(x) = \Omega_A(x). \tag{20}$$

The solution, related to the optimistic difference and the extension of (6), is given by

$$\Omega_S(x) = \max(0, \Omega_A(x) \mathbin{)-(} \Omega_{A \cap B}(x)).$$

Example 4.7. A = {{1/0, 1/1, 0.8/2, 0.5/3, 0.2/4} * a}; B = {{1/0, 1/1, 0.3/2} * a}. In this simple example A ∩ B = B, so

$$A \mathbin{)-(} B = \cup \{S : B + S \subseteq A\}. \tag{21}$$

From (21) we deduce

A )−( B = {[{1/0, 1/1, 0.8/2, 0.5/3, 0.2/4} )−( {1/0, 1/1, 0.3/2}] * a} = {{1/0, 0.8/1, 0.2/2} * a},

B + [A )−( B] = {{1/0, 1/1, 0.8/2, 0.3/3, 0.2/4} * a}.

It is the greatest bag of the form B + S which is contained in A (Fig. 2).

Fig. 2. $[\Omega_A(a) \mathbin{)-(} \Omega_B(a)] + \Omega_B(a) = \Omega_A(a)$.

4.6. Cartesian product

The Cartesian product A × B of two fuzzy bags can be easily defined using α-cuts and the Cartesian product of bags. However, the extension of (7) with fuzzy numbers of occurrences does not lead directly to the expected result. As a matter of fact, the product of two fuzzy numbers (derived from the extension principle) generally produces a fuzzy set of integers with 'holes' and is not exactly a fuzzy cardinality.

Example 4.8. A = {{1/0, 1/1, 1/2} * a}; B = {{1/0, 1/1, 0.8/2} * b}; $\Omega_A(a) \times \Omega_B(b)$ = {1/0, 1/1, 1/2, ..., 0.8/4}.

In fact, due to the semantics associated with the degree of a number ω in a fuzzy number of occurrences of an element x in a fuzzy bag (the extent to which a fuzzy bag has at least ω occurrences of x), we have to create the convex hull of the product and 'fill in the holes'. A similar result has been established in [13]. In the following, this convex hull of the product operator is denoted ⊗. So:

$$\Omega_{A \times B}(x, y) = \Omega_A(x) \otimes \Omega_B(y). \tag{22}$$
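The operator ⊗ can be sketched as a sup-min product followed by hole filling. The sketch assumes the list encoding used for illustration here (position n = degree of "at least n occurrences", with degree 1 at position 0), under which the convex hull amounts to a running maximum taken from the right. The name `product_hull` is hypothetical.

```python
def product_hull(mu_a, mu_b):
    """Convex hull of the sup-min product of two fuzzy numbers of occurrences."""
    size = (len(mu_a) - 1) * (len(mu_b) - 1) + 1
    raw = [0.0] * size
    # Sup-min extension of the integer product.
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            raw[x * y] = max(raw[x * y], min(p, q))
    # Fill the holes: 'at least n' is at least as true as 'at least n + 1'.
    for n in range(size - 2, -1, -1):
        raw[n] = max(raw[n], raw[n + 1])
    return raw


# The fuzzy numbers of Example 4.8.
omega_a = [1.0, 1.0, 1.0]
omega_b = [1.0, 1.0, 0.8]
filled = product_hull(omega_a, omega_b)
```

Before the filling step the product has degree 0 at the integer 3, because 3 is not a product of two integers taken from the supports; the running maximum assigns it the degree 0.8 of the integer 4.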


Example 4.9. $\Omega_A(a) \otimes \Omega_B(b)$ = {1/0, 1/1, 1/2, 0.8/3, 0.8/4}. If 0.8 is the extent to which A × B has at least 4 occurrences of (a, b), then the extent to which A × B has at least 3 occurrences of (a, b) is 0.8.

5. Fuzzy bags and flexible querying

In [4] we have extended an object-oriented algebra [28] in order to take gradual predicates and fuzzy sets into account. Moreover, fuzzy bags generalize sets, bags and fuzzy sets. As a matter of fact, a degree of membership in a fuzzy set (or a set) and a number of occurrences in a bag can be viewed as specific fuzzy numbers of occurrences. Consequently all these structures and their operators are compatible and can be simultaneously manipulated in a query addressed to an information system. So, using the most general framework based on fuzzy bags, we now show how fuzzy extensions can be applied to three basic querying operators (select, image and flatten) from which complex flexible queries can be built.

5.1. Select

The operator select(C, p) is an iterator. Each element of the collection C is in the result R if it satisfies p. Because the query operators can be nested, C can be the result of a query, and thus a fuzzy bag. In the most general case, C is a fuzzy bag and p is a fuzzy predicate. The selection on crisp bags is defined by the product:

$$\omega_R(x) = \omega_C(x) \times \mathrm{ord}(p(x)),$$

where ord(p(x)) is the number 1 if x satisfies p, and 0 otherwise.

Let us extend this definition. C is now a fuzzy bag and the ord function translates the fuzzy degree of satisfaction $\mu_p(x)$ of the fuzzy predicate p into a fuzzy cardinal value such that:

$$\mathrm{ord}(p(x)) = \{1/0, \mu_p(x)/1\}.$$

The previous expression becomes:

$$\Omega_R(x) = \Omega_C(x) \otimes \mathrm{ord}(p(x)). \tag{23}$$

Example 5.1. If $\mu_p(a) = 0.4$, $\mu_p(b) = 0.3$ and $C = \{\langle 0.9, 0.8 \rangle / a, \langle 1 \rangle / b\}$, we obtain from (23):

$\Omega_R(a) = \Omega_C(a) \otimes \mathrm{ord}(p(a))$ = {1/0, 0.9/1, 0.8/2} ⊗ {1/0, 0.4/1} = {1/0, 0.4/1, 0.4/2};

$\Omega_R(b) = \Omega_C(b) \otimes \mathrm{ord}(p(b))$ = {1/0, 1/1} ⊗ {1/0, 0.3/1} = {1/0, 0.3/1};

select(C, p) = {{1/0, 0.4/1, 0.4/2} * a, {1/0, 0.3/1} * b}.

We can also express that each occurrence $x_i$ of an element x of R is in C (with the degree of membership $\mu_C(x_i)$) and satisfies p (at the level $\mu_p(x)$). The conjunction of these two degrees characterizes $x_i$ in R: select(C, p) = {⟨min(0.9, 0.4), min(0.8, 0.4)⟩/a, ⟨min(1, 0.3)⟩/b} = {⟨0.4, 0.4⟩/a, ⟨0.3⟩/b}. This result is consistent with the above result.

5.2. Image

The image operator applies a function f to each x of a collection C. This operator is a projection when f is an access function to an attribute of the object x. In the crisp set framework this operator is formally defined by

$$\{y \mid \exists x \, (x \in S) \wedge y = f(x)\}.$$

In the most general case C is interpreted as a fuzzy bag, the generalized disjunction (∃) is interpreted as an addition (Σ) and the conjunction as a product (⊗). So the characteristic of y in the result R is given by

$$\Omega_R(y) = \sum_{x \text{ in } C} [\Omega_C(x) \otimes \mathrm{ord}(y = f(x))]. \tag{24}$$
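The select and image operators of Sections 5.1 and 5.2 can be sketched together. This is an illustration under assumptions: fuzzy numbers are lists whose position n holds the degree of "at least n occurrences", and the function names and the `salary` table are hypothetical. The inputs reproduce Example 5.1 and the 'young employees' projection discussed in the text.

```python
def fuzzy_sum(mu_a, mu_b):
    """Sup-min extension of + on fuzzy numbers of occurrences."""
    total = [0.0] * (len(mu_a) + len(mu_b) - 1)
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            total[x + y] = max(total[x + y], min(p, q))
    return total


def product_hull(mu_a, mu_b):
    """Sup-min product followed by hole filling (the operator of (22))."""
    raw = [0.0] * ((len(mu_a) - 1) * (len(mu_b) - 1) + 1)
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            raw[x * y] = max(raw[x * y], min(p, q))
    for n in range(len(raw) - 2, -1, -1):
        raw[n] = max(raw[n], raw[n + 1])
    return raw


def select(bag, degree_of):
    """Equation (23): combine each fuzzy number with ord(p(x)) = [1, mu_p(x)]."""
    return {x: product_hull(mu, [1.0, degree_of(x)]) for x, mu in bag.items()}


def image(bag, f):
    """Equation (24): sum the fuzzy numbers of all x mapped to the same y."""
    result = {}
    for x, mu in bag.items():
        y = f(x)
        result[y] = fuzzy_sum(result.get(y, [1.0]), mu)
    return result


C = {"a": [1.0, 0.9, 0.8], "b": [1.0, 1.0]}
selected = select(C, {"a": 0.4, "b": 0.3}.get)

young = {"Alain": [1.0, 1.0], "Philippe": [1.0, 0.5], "Henry": [1.0, 0.3]}
salary = {"Alain": 2000, "Philippe": 2500, "Henry": 2000}
salaries = image(young, salary.get)
```

In `image`, the crisp test y = f(x) is handled by grouping rather than by an explicit ⊗ with {1/0, 1/1}, since that product leaves the fuzzy number unchanged.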

This operator does not eliminate duplicates generated by the function f.

Example 5.2. Let the fuzzy set of 'young employees' be {1/Alain, 0.5/Philippe, 0.3/Henry} and their respective salaries 2000, 2500 and 2000 euros. Each employee is an object characterized by attributes such as 'age' or 'salary'. The query 'find the salary of young employees' is a projection on the attribute 'salary' and defines a fuzzy bag:

{{1/0, 1/1, 0.3/2} * 2000, {1/0, 0.5/1} * 2500}.

The fuzzy number {1/0, 1/1, 0.3/2} is the sum of {1/0, 1/1} and {1/0, 0.3/1}, which are the degrees of membership of Alain and Henry in the fuzzy set of 'young employees' (1 and 0.3) represented as fuzzy numbers. This fuzzy bag is representative of the distribution of the young employees' salaries. It takes into account two pieces of information:

(i) qualitative: to what extent a given salary is associated with a young person;
(ii) quantitative: a given salary can correspond to several employees who are more or less young.

For example, the fuzzy number of occurrences associated with 2000 means: 2000 is the salary of two employees and one of them is younger than the other.

5.3. Flatten

This operator flattens a collection C of collections x. For example, applied to the set of sets {{1, 2, 3}, {2, 4}}, it returns the set {1, 2, 3, 4}. In the set framework its formal definition is

$$\{y \mid \exists x \, (x \in C) \wedge (y \in x)\}.$$

In the most general case the flatten operator can be applied to a fuzzy bag of fuzzy bags, the generalized disjunction (∃) is interpreted as an addition (Σ) and the conjunction as a product (⊗). So, the characteristic of an element y in the result R is

$$\Omega_R(y) = \sum_{x \text{ in } C} [\Omega_C(x) \otimes \Omega_x(y)]. \tag{25}$$
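Equation (25) can be sketched as follows. The encoding is an assumption made for the illustration: the outer collection is a list of pairs (fuzzy number of the inner collection, inner fuzzy bag), and fuzzy numbers are lists whose position n holds the degree of "at least n occurrences". The data (two images with keyword relevance degrees) is invented for the example.

```python
def fuzzy_sum(mu_a, mu_b):
    """Sup-min extension of + on fuzzy numbers of occurrences."""
    total = [0.0] * (len(mu_a) + len(mu_b) - 1)
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            total[x + y] = max(total[x + y], min(p, q))
    return total


def product_hull(mu_a, mu_b):
    """Sup-min product followed by hole filling (the operator of (22))."""
    raw = [0.0] * ((len(mu_a) - 1) * (len(mu_b) - 1) + 1)
    for x, p in enumerate(mu_a):
        for y, q in enumerate(mu_b):
            raw[x * y] = max(raw[x * y], min(p, q))
    for n in range(len(raw) - 2, -1, -1):
        raw[n] = max(raw[n], raw[n + 1])
    return raw


def flatten(collection):
    """Equation (25) over a list of (outer fuzzy number, inner fuzzy bag)."""
    result = {}
    for mu_outer, inner in collection:
        for y, mu_inner in inner.items():
            term = product_hull(mu_outer, mu_inner)
            result[y] = fuzzy_sum(result.get(y, [1.0]), term)
    return result


# Hypothetical data: two images, reddish to degrees 0.8 and 0.4, each with
# a fuzzy set of keywords (degree of relevance).
keyword_sets = [
    ([1.0, 0.8], {"sunset": [1.0, 0.9], "sea": [1.0, 0.5]}),
    ([1.0, 0.4], {"sunset": [1.0, 0.6]}),
]
keywords = flatten(keyword_sets)
```

Here the keyword 'sunset' ends up with two occurrences of degrees 0.8 and 0.4: it is attached to two images, each occurrence being limited by how reddish the image is and how relevant the keyword is.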

Of course, this operator does not eliminate duplicates.

Example 5.3. The composition of the extended operators allows queries such as find keywords associated with reddish images:

    flatten(image(select(images, !ii.reddish), !rr.keys))

The select operator returns a fuzzy set of images. The image operator returns a fuzzy set of fuzzy sets of keywords (we assume that keywords of an image are associated with a degree of relevance). Finally the flatten operator, which does not eliminate duplicates, returns a fuzzy bag of keywords. In this result, the fuzzy number of each keyword characterizes it:

(i) quantitatively: a given keyword can correspond to several images;
(ii) qualitatively: each occurrence of a keyword more or less satisfies the criterion reddish image.

Now we have the information to evaluate queries such as: how many times a keyword is used significantly, or find the keywords associated with many reddish images.

5.4. Other operators

One motivation for extending an object-oriented algebra with fuzzy bags is to provide the precise semantics of queries when they bear upon multiple collections (set, bag, fuzzy set, fuzzy bag). In order to deal with this problem our proposition is based on the theories of fuzzy integers and fuzzy bags. The structure of fuzzy bags generalizes, in a natural way, the set, bag and fuzzy set structures. These collections are treated uniformly and, consequently, the extension of an object-oriented algebra is still reduced to a small number of simple and generic operators. These basic operators can then be easily composed in order to define complex queries.


The operators Cartesian-product, select, image and %atten can be viewed as basic querying operators from which more complex operators, such as the projection, join, nest, unnest, group, : : : operators [4], can be built. Finally, these operators and their combinations allow to specify OQL expressions thanks to their equivalent algebraic forms. In so far, the extensions of OQL expressions in the %exible framework can be well de,ned. Example 5.4. A query by image content asking “the round objects above a green square object in an image I ”, is based on a fuzzy relation (above) which joins two fuzzy sets of objects. It could be expressed in an OQL like query by: select x from x in I.objets, y in I.objects where x.sketch.is round and y.sketch.is square and x:is above(y) A round object x can be more or less above several square objects, so the result is a fuzzy bag of round objects. 6. Conclusion This study takes place in a framework de,ning a query language which admits the formulation of imprecise queries addressed to an object-oriented database. The interpretation of users’ queries is founded on the fuzzy set theory. The problem implies handling both fuzzy collections and fuzzy predicates inside querying operators. These extended primitive operators are then used to build up a high level query language, such as OQL, which admits %exible queries [8]. In this paper a generic collection, the fuzzy bags, is speci,cally studied. The bag concept, implemented in a lot of data models, and the %exible querying principle lead to the concept of fuzzy bags. Sets, fuzzy sets and bags are particular cases of fuzzy bags. Thus, all these structures are homogenous and compatible. The operations bearing on multiple collections are treated in a similar way because they can be de,ned through a common mechanism: fuzzy cardinalities. Consequently fuzzy bags and their operators provide a general framework in which di-erent structures can be composed. 
Fuzzy bags thereby improve the expressiveness of the algebra by managing both quantities and qualities. In a further step, the quantities carried by the results will make it possible to deal with quantified statements. Fuzzy bags can be viewed as a generalisation of fuzzy sets thanks to the consideration of an order structure over the unit interval. Their characteristic function is defined as a mapping from a set to a partially ordered set (the set of fuzzy cardinalities). So fuzzy bags can be classified as L-fuzzy sets as defined by Goguen [15].

References

[1] J. Albert, Algebraic properties of bag data types, Proc. 17th Internat. Conf. on Very Large Data Bases, Barcelona, 1991, pp. 211–219.
[2] N. Blanchard, Théories cardinales et ordinales des ensembles flous, Ph.D. Thesis, University of Lyon I, 1981.
[3] W.D. Blizard, The development of multiset theory, Modern Logic 1 (1991) 319–352.

[4] P. Bosc, F. Connan, D. Rocacher, Flexible querying in multimedia databases with an Object Query Language, Proc. 7th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’98), Anchorage, 1998, pp. 1308–1313.
[5] P. Bosc, O. Pivert, SQLf: a relational database language for fuzzy querying, IEEE Trans. Fuzzy Systems 3 (1) (1995) 1–17.
[6] P. Bosc, D. Rocacher, About difference operation on fuzzy bags, Proc. 9th Internat. Conf. on Information Processing Management of Uncertainty in Knowledge-based Systems (IPMU’02), Annecy, France, 2002, pp. 1541–1546.
[7] K. Chakrabarty, R. Biswas, S. Nanda, On Yager’s theory of bags and fuzzy bags, Comput. Artif. Intell. 18 (1) (1999) 1–17.
[8] F. Connan, Interrogation flexible de bases de données multimédias, Ph.D. Thesis, University of Rennes I, 1999.
[9] F. Connan, D. Rocacher, Flexible queries in object-oriented databases: on the study of bags, Proc. 8th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’99), Seoul, Korea, 1999, pp. 615–620.
[10] D. Dubois, H. Prade, Fuzzy sets and systems, theory and applications, Mathematics in Science and Engineering, vol. 144, Academic Press, London, 1980.
[11] D. Dubois, H. Prade, Inverse operations for fuzzy numbers, Proc. IFAC Symp. on Fuzzy Information, Knowledge Representation and Decision Analysis, Pergamon Press, Oxford, 1983, pp. 391–396.
[12] D. Dubois, H. Prade, Fuzzy set theoretic differences and inclusions and their use in the analysis of fuzzy equations, Control Cybernet. 13 (3) (1984) 129–145.
[13] D. Dubois, H. Prade, Fuzzy cardinality and modeling of imprecise quantification, Fuzzy Sets and Systems 16 (3) (1985) 199–230.
[14] D. Dubois, H. Prade, F. Sèdes, Some uses of fuzzy logic in multimedia database querying, Proc. 8th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’99), Seoul, 1999, pp. 586–591.
[15] J.A. Goguen, L-fuzzy sets, J. Math. Anal. Appl. 18 (1967) 145–174.
[16] P.W.P.J. Grefen, R.A. DeBy, A multi-set extended relational algebra: a formal approach to a practical issue, Proc. Internat. Conf. on Data Engineering, Houston, TX, USA, 1994, pp. 80–88.
[17] S. Grumbach, T.M. Milo, Towards tractable algebra for Bags, Proc. 12th ACM Symp. on Principles of Database Systems, Washington, DC, 1993, pp. 49–58.
[18] K.S. Kim, S. Miyamoto, Application of fuzzy multisets to fuzzy database systems, Proc. 1996 Asian Fuzzy Systems Symp., Taiwan, 1996, pp. 115–120.
[19] G.J. Klir, B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1995.
[20] D.E. Knuth, The Art of Computer Programming, vol. 2, Addison-Wesley, Reading, MA, 1981.
[21] J.M. Medina, O. Pons, M.A. Vila, GEFRED: a Generalized model of fuzzy relational databases, Inform. Sci. 76 (1–2) (1994) 87–109.
[22] S. Miyamoto, Fuzzy multisets and fuzzy clustering of documents, Proc. 10th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’01), Melbourne, Australia, 2001, pp. 1539–1542.
[23] S. Miyamoto, Generalized multisets and rough approximations, Proc. 11th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’02), Honolulu, Hawaii, 2002.
[24] D. Ralescu, Cardinality, quantifiers and aggregation of fuzzy criteria, Fuzzy Sets and Systems 69 (1995) 355–365.
[25] D. Rocacher, On the use of fuzzy numbers in flexible querying, Proc. 9th IFSA World Congr. and 20th NAFIPS International, Vancouver, Canada, 2001, pp. 2440–2445.
[26] D. Rocacher, About division operation on fuzzy bags, Proc. 11th Internat. Conf. on Fuzzy Systems (FUZZ IEEE’02), Honolulu, Hawaii, 2002.
[27] E. Sanchez, Solutions of fuzzy equations with extended operations, Fuzzy Sets and Systems 12 (1984) 237–248.
[28] G.M. Shaw, S.B. Zdonik, A query algebra for object-oriented databases, Proc. 6th Internat. Conf. on Data Engineering IEEE, 1990, pp. 154–162.
[29] M. Wygralak, Fuzzy cardinals based on the generalized equality of fuzzy subsets, Fuzzy Sets and Systems 18 (1986) 143–158.
[30] R. Yager, On the theory of bags, Internat. J. Gen. Sys. 13 (1986) 23–27.
[31] R. Yager, Cardinality of fuzzy sets via bags, Math. Modelling 9 (6) (1987) 441–446.
[32] L.A. Zadeh, The concept of linguistic variable and its application to approximate reasoning, Inform. Sci. 8 (1975) 199–249.
[33] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. Math. Appl. 9 (1983) 149–184.