Available online at www.sciencedirect.com
Fuzzy Sets and Systems 202 (2012) 42 – 60 www.elsevier.com/locate/fss
On four noncommutative fuzzy connectives and their axiomatization
Patrick Bosc, Olivier Pivert∗
IRISA/ENSSAT, University of Rennes 1, 6, Rue de Kerampont, BP 80518, 22305 Lannion Cedex, France
Received 28 October 2010; received in revised form 7 November 2011; accepted 8 November 2011
Available online 17 November 2011
Abstract
In most query languages, conjunctive and disjunctive combinations of conditions remain the usual way of aggregating conditions. Fuzzy query languages also offer trade-off operators, such as means, in order to compensate between elementary conditions. In this paper, we introduce a new type of condition founded on the interaction between two predicates, thus enriching the panoply of tools the user is provided with and the power of query languages.
© 2011 Elsevier B.V. All rights reserved.
Keywords: Database querying; Fuzzy connectives; Noncommutative operators; Preferences
1. Introduction
Query languages based on fuzzy set theory provide a rich range of connectives, among which conjunction/disjunction and their weighted versions, mean operators, and quantified statements such as “most of $P_1, \ldots, P_n$” where each $P_i$ is a fuzzy condition. In this way, preferences are introduced at two levels: (i) atomic fuzzy predicates, which allow for stating that some values are preferred to others, and (ii) connectors, which make it possible to specify that some terms are more important than others or can be compensated by others. Here, we focus on four operators that have a counterpart in natural language, for which an interpretation is proposed in the context of information selection/retrieval. We consider the following four types of conditions:
• $P_1$ and if possible $P_2$, which is related to conjunction,
• $P_1$ or else $P_2$, which is connected with disjunction,
• $P_1$ all the more as $P_2$, which expresses a reinforcement of $P_1$ when $P_2$ is more and more satisfied,
• $P_1$ all the less as $P_2$, which expresses a weakening of $P_1$ when $P_2$ is more and more satisfied.
They can be seen as a way to define a sophisticated interaction between two predicates in order to build new types of conditions. These mechanisms naturally extend to n > 2 predicates (to consider for instance statements like
∗ Corresponding author. Tel.: +33 2 96 46 90 31; fax: +33 2 96 37 0199.
E-mail addresses:
[email protected] (P. Bosc),
[email protected] (O. Pivert). 0165-0114/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.fss.2011.11.005
P. Bosc, O. Pivert / Fuzzy Sets and Systems 202 (2012) 42 – 60
$P_1$ and if possible ($P_2$ and if possible $P_3$)), although we limit the scope of this paper to the binary case. In a context where $P_1$ and $P_2$ take only Boolean values, the first two operators should behave as follows:
• $P_1$ and if possible $P_2$: (true, true) $\succ$ (true, false) $\succ$ (false, −),
• $P_1$ or else $P_2$: (true, −) $\succ$ (false, true) $\succ$ (false, false),
where $a \succ b$ means that a is preferred to b. As to the third and fourth ones, there is no notion of graduality of the satisfaction for a Boolean predicate (here $P_2$), so “$P_1$ all the more as $P_2$” and “$P_1$ all the less as $P_2$” do not make sense in this context.
We situate the proposal advocated here as an enrichment of the SQLf query language [1]. As a consequence, each new condition is associated with a single truth value taken in the unit interval, whereas a pair of values is returned in the bipolar interpretation suggested in [2]. So, it is legal to compose the new conditions with usual ones (atomic or not), but also to define complex conditions using recursion and/or cross-use, e.g., $P_1$ and if possible ($P_2$ or else $P_3$), $P_1$ or else ($P_2$ all the more as $P_3$).
The rest of the paper is structured as follows. Some basic notions about fuzzy predicates and queries are first recalled in Section 2. Then, Sections 3, 4, 5, and 6 are devoted respectively to the operators “and if possible”, “or else”, “all the more as”, and “all the less as”. The same line is adopted for each of them, based on some desirable properties which serve as a starting point for “rational” definitions of the operators. Section 7 is devoted to a detailed example. Implementation aspects are dealt with in Section 8. Section 9 introduces a new type of fuzzy implication that can be derived from the definition of the connective “all the more as” and shows how it can be used in the context of database querying. Section 10 summarizes the contributions of the paper and outlines some directions for future work.
2. Reminders about fuzzy predicates and queries
Regular sets allow for the definition of Boolean predicates. In an analogous way, gradual predicates (or conditions) can be associated with fuzzy sets [3] aimed at describing classes of objects with vague boundaries. Often, elementary fuzzy predicates correspond to adjectives of the natural language, such as “young”, “tall”, “cheap” or “well-paid”. A fuzzy predicate P can be modeled as a function $\mu_P$ (usually of triangular or trapezoidal shape) from one (or several) domain(s) X to the unit interval [0, 1]. The degree $\mu_P(x)$ represents the extent to which element x satisfies the vague predicate P (or equivalently the extent to which x belongs to the fuzzy set of objects which match the fuzzy concept P). An elementary fuzzy predicate can also compare two attributes using a gradual comparison operator such as “more or less equal”.
It is possible to alter the meaning of a given predicate using a modifier which is generally associated with an adverb (e.g., “very”, “more or less”, “relatively”). For instance, “very cheap” is more restrictive than “cheap” and “fairly high” is less demanding than “high”. The meaning of the predicate “mod P” (where “mod” is a modifier) may be defined in a compositional way and different approaches have been advocated, among which: $\mu_{mod\,P}(x) = (\mu_P(x))^n$ (see [4]).
Atomic and modified predicates can be involved in compound predicates which go far beyond those used in regular queries. Conjunction (resp. disjunction) is interpreted by means of a triangular norm $\top$ (resp. co-norm $\bot$), for instance the minimum (resp. the maximum). As to negation, it is interpreted as: $\forall x,\ \mu_{\neg P}(x) = 1 - \mu_P(x)$. As mentioned earlier, weighted conjunction and disjunction as well as weighted mean or OWA can be used to assign a different importance to each of the predicates (see [5,6] for more details).
The operations of the relational algebra can be straightforwardly extended to fuzzy relations by considering fuzzy relations as fuzzy sets on the one hand and by introducing fuzzy predicates in the appropriate operations on the other hand. Indeed, a fuzzy relation is defined as a fuzzy subset of a Cartesian product of domains. Thus, any such fuzzy relation r can be seen as made of weighted tuples, denoted by $\mu/t$, where $\mu$ expresses the extent to which tuple t belongs to the relation, i.e., is compatible with the concept conveyed by r. As an illustration, we give the definition of the fuzzy selection hereafter, where “cond” denotes a fuzzy predicate built according to what was said above:
$\mu_{sel(r,cond)}(t) = \top(\mu_r(t), \mu_{cond}(t))$.
The language called SQLf described in [1] extends SQL so as to support fuzzy queries.
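The fuzzy selection recalled above can be illustrated with a minimal sketch. It assumes that a fuzzy relation is stored as a list of (degree, tuple) pairs, that the triangular norm is the minimum, and that tuples whose resulting degree is 0 are dropped; the helper names `trapezoid` and `fuzzy_select` are hypothetical and do not come from SQLf.

```python
def trapezoid(a, b, c, d):
    """Build a trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)   # increasing slope
        return (d - x) / (d - c)       # decreasing slope
    return mu


def fuzzy_select(relation, cond, tnorm=min):
    """Fuzzy selection: combine each tuple's degree with the degree of cond.

    `relation` is a list of (degree, row) pairs (an assumed representation);
    tuples whose resulting degree is 0 are left out of the result.
    """
    result = []
    for degree, row in relation:
        new_degree = tnorm(degree, cond(row))
        if new_degree > 0:
            result.append((new_degree, row))
    return result


if __name__ == "__main__":
    young = trapezoid(0, 0, 25, 40)   # illustrative definition of "young"
    people = [(1.0, {"name": "a", "age": 30}),
              (0.5, {"name": "b", "age": 20}),
              (1.0, {"name": "c", "age": 50})]
    print(fuzzy_select(people, lambda row: young(row["age"])))
```

Any other triangular norm (e.g., the product) can be passed through the `tnorm` parameter.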
3. “And if possible”
The expression “$P_1$ and if possible $P_2$”, where $P_1$ and $P_2$ are two (possibly complex) predicates, expresses a conjunction that is both weak and nonsymmetric, in the sense that $P_2$ is less important than $P_1$. One might think of modelling it using a mean operator, but this would not work: when $P_1$ is totally unsatisfied, the result must be 0 (or false), which cannot be guaranteed by a mean operator. An example of use is the query “find the houses with 4 rooms, a price around 300K$ and if possible a small garden and a garage”.
3.1. Axioms
In order to formally define this operator, denoted by $\alpha$, some “reasonable” axioms are listed (and commented) below, where $\mu_i$ stands for the degree of satisfaction of predicate $P_i$ for a given element onto which “$P_1$ and if possible $P_2$” applies:
A1: “$P_1$ and if possible $P_2$” is less drastic than “$P_1$ and $P_2$”: $\alpha(\mu_1, \mu_2) \geq \min(\mu_1, \mu_2)$.
A2: “$P_1$ and if possible $P_2$” is more drastic than $P_1$: $\alpha(\mu_1, \mu_2) \leq \mu_1$.
A3: $\alpha$ must have an asymmetric behavior, i.e., it is noncommutative: $\exists x, y : \alpha(x, y) \neq \alpha(y, x)$.
A4: $\alpha$ is increasing in its first argument: $x \geq y \Rightarrow \alpha(x, z) \geq \alpha(y, z)$.
A5: $\alpha$ is increasing in its second argument: $y \geq z \Rightarrow \alpha(x, y) \geq \alpha(x, z)$.
A6: “$P_1$ and if possible $P_2$” is equivalent to “$P_1$ and if possible ($P_1$ and $P_2$)”: $\alpha(\mu_1, \mu_2) = \alpha(\mu_1, \min(\mu_1, \mu_2))$.
From Axiom A2, it follows that: $\forall \mu_2,\ \alpha(0, \mu_2) = 0$. From Axioms A1 and A2, one gets $\mu_1 \leq \mu_2 \Rightarrow \alpha(\mu_1, \mu_2) = \mu_1$, thus: $\forall \mu_1,\ \alpha(\mu_1, \mu_1) = \mu_1$. In other words, “$P_1$ and if possible $P_1$” is equivalent to $P_1$, which makes sense. As mentioned earlier, one may also notice that $\alpha(0, \mu_2) = 0$ makes it impossible to call on a mean operator to model operator $\alpha$.
3.2. A modeling framework
Indeed, when “$P_1$ and if possible $P_2$” applies to an element t, the effect of “if possible” intervenes when $\mu_{P_2}(t) = \mu_2$ is less than $\mu_{P_1}(t) = \mu_1$. In that situation, one wants the value returned to be over $\mu_2$, which is the result in the presence of a regular conjunction. In other words, the result must be upgraded with respect to $\mu_2$ without going beyond $\mu_1$.
There are obviously many ways to perform this upgrade, among which those captured by an expression such as
$\alpha(\mu_1, \mu_2) = \min(\mu_1, h(\mu_1, \mu_2))$. (1)
Moreover, such a format guarantees that Axioms A1–A5 hold, as well as Axiom A6 when $\mu_2 \leq \mu_1$, provided that: (i) h is increasing in both $\mu_1$ and $\mu_2$, and (ii) $h(\mu_1, \mu_2) \geq \min(\mu_1, \mu_2)$.
Table 1
Behavior of $\alpha_1(\mu_1, \mu_2)$ (Boolean case).
$\mu_1$    $\mu_2$    $\alpha_1(\mu_1, \mu_2)$
1          1          1
1          0          k
0          1          0
0          0          0
Proof.
Axiom A1: if $\mu_1 \leq \mu_2$, $h(\mu_1, \mu_2) \geq \min(\mu_1, \mu_2)\ (= \mu_1)$ and $\alpha(\mu_1, \mu_2) = \mu_1\ (= \min(\mu_1, \mu_2))$. If $\mu_2 < \mu_1$, $h(\mu_1, \mu_2) \geq \min(\mu_1, \mu_2)\ (= \mu_2)$ and either $\alpha(\mu_1, \mu_2) = h(\mu_1, \mu_2) \geq \mu_2\ (= \min(\mu_1, \mu_2))$ or $\alpha(\mu_1, \mu_2) = \mu_1 > \mu_2\ (= \min(\mu_1, \mu_2))$. Hence, in any case Axiom A1 holds.
Axiom A2: By its very nature, $\alpha(\mu_1, \mu_2)$ as defined by Formula (1) cannot deliver a result over $\mu_1$.
Axiom A3: Obvious provided that $h(\mu_1, \mu_2) \neq \mu_2$.
Axioms A4–A5: Obvious with the (monotonicity) assumption about h.
Axiom A6: If $\mu_2 \leq \mu_1$, $\min(\mu_1, h(\mu_1, \min(\mu_1, \mu_2)))$ rewrites $\min(\mu_1, h(\mu_1, \mu_2))$.
3.3. Diverse proposals
As mentioned before, many possibilities may be thought of so that the condition “$P_1$ and if possible $P_2$” applied to an element t delivers a result better than $\mu_2$ when $\mu_2\ (= \mu_{P_2}(t)) < \mu_1\ (= \mu_{P_1}(t))$. We first give our definition, then two others stemming from the notion of a weighted conjunction initially introduced by Dubois and Prade [5].
3.3.1. Our definition
One may envisage:
$\alpha_1(\mu_1, \mu_2) = \min(\mu_1, k \cdot \mu_1 + (1 - k) \cdot \mu_2)$ (2)
with $k \in [0, 1]$. For instance, if k = 0.5 is chosen, one gets
$\alpha_1(\mu_1, \mu_2) = \min\left(\mu_1, \frac{\mu_1 + \mu_2}{2}\right)$.
Remark 1. It turns out that this formula also appears in [7] in the context of bipolar querying where it is “disqualified” by the authors.
Remark 2. When k = 1, the resulting grade is $\mu_1$, and when k = 0, it is $\min(\mu_1, \mu_2)$.
Remark 3. In the case where $P_1$ and $P_2$ are Boolean predicates, one gets the truth values represented in Table 1.
Proof of validity of Axioms A1–A6. Formula (2) complies with the format of Expression (1) since (i) $h(\mu_1, \mu_2) = k \cdot \mu_1 + (1 - k) \cdot \mu_2$ is increasing in both $\mu_1$ and $\mu_2$ and (ii) $h(\mu_1, \mu_2) \geq \min(\mu_1, \mu_2)$ since h is a mean operator. Consequently, the only remaining question is about the validity of Axiom A6 when $\mu_1 < \mu_2$. In this case, $\alpha_1(\mu_1, \min(\mu_1, \mu_2)) = \alpha_1(\mu_1, \mu_1) = \mu_1$ and this is also the case for $\alpha_1(\mu_1, \mu_2)$, hence Axiom A6 holds for $\alpha_1$.
3.3.2. Definitions based on a weighted conjunction
To the best of our knowledge, two proposals have been made so far with the purpose of prioritizing between two (or more) predicates without calling on a mean operator. These works do not explicitly start with the expression “$P_1$ and if possible $P_2$”, but rather with the idea of a hierarchical operator of the type “$P_1$ then $P_2$” (which extends in a straightforward manner to n > 2 predicates). In the approach proposed by Yager in [8], a tuple t of the set T of tuples to be compared is assigned the following grade:
$\alpha_2(t) = \min(\mu_{P_1}(t), \max(f(P_1, P_2), \mu_{P_2}(t)))$, where $f(P_1, P_2) = 1 - \max_{s \in T} \min(\mu_{P_1}(s), \mu_{P_2}(s))$. (3)
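As a complement to the proof of validity given for Formula (2), the following sketch evaluates the operator of Formula (2) and checks Axioms A1, A2, A4, A5 and A6 numerically on a regular grid of degrees. The function names, the grid size and the tolerance `eps` are illustrative assumptions; a grid check is only a sanity test, not a substitute for the proof.

```python
import itertools


def and_if_possible(mu1, mu2, k=0.5):
    """Formula (2): min(mu1, k*mu1 + (1-k)*mu2), with k in [0, 1]."""
    return min(mu1, k * mu1 + (1 - k) * mu2)


def check_axioms(k, steps=10, eps=1e-12):
    """Check A1, A2, A4, A5, A6 on a grid; raises AssertionError on failure."""
    grid = [i / steps for i in range(steps + 1)]
    for m1, m2 in itertools.product(grid, repeat=2):
        v = and_if_possible(m1, m2, k)
        assert v >= min(m1, m2) - eps                                  # A1
        assert v <= m1 + eps                                           # A2
        assert abs(v - and_if_possible(m1, min(m1, m2), k)) <= eps     # A6
    for x, y, z in itertools.product(grid, repeat=3):
        if x >= y:  # A4: increasing in the first argument
            assert and_if_possible(x, z, k) >= and_if_possible(y, z, k) - eps
        if y >= z:  # A5: increasing in the second argument
            assert and_if_possible(x, y, k) >= and_if_possible(x, z, k) - eps
    return True


if __name__ == "__main__":
    for k in (0.0, 0.3, 0.5, 1.0):
        print(k, check_axioms(k))
    # A3 (noncommutativity) is witnessed by a single pair of arguments.
    print(and_if_possible(1.0, 0.0), and_if_possible(0.0, 1.0))
```

With Boolean arguments the function reproduces Table 1, e.g., the pair (1, 0) yields k.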
This approach, which was proposed in the context of multicriteria decision-making, refers to the concept of the conditional possibility of satisfying a fuzzy predicate when another one is satisfied. The value $f(P_1, P_2)$ can be seen as a level of incompatibility between $P_1$ and $P_2$. Formula (3) fits the format of Expression (1) with $h(\mu_1, \mu_2) = \max(f(P_1, P_2), \mu_2)$. Since h does not depend on $\mu_1$, is increasing in $\mu_2$, and satisfies $h(\mu_1, \mu_2) \geq \mu_2 \geq \min(\mu_1, \mu_2)$, Axioms A1–A5 are obviously satisfied. As to Axiom A6 for the case $\mu_1 < \mu_2$, one has
$\alpha_2(\mu_1, \min(\mu_1, \mu_2)) = \min(\mu_1, \max(f(P_1, P_2), \min(\mu_1, \mu_2))) = \min(\mu_1, \max(f(P_1, P_2), \mu_1)) = \mu_1 = \min(\mu_1, \max(f(P_1, P_2), \mu_2)) = \alpha_2(\mu_1, \mu_2)$.
Example 1. Let us consider a set of tuples T such that either $\mu_{P_1}(t) = 0$ or $\mu_{P_2}(t) = 0$ or both, for any tuple t of T. Then $f(P_1, P_2) = 1$, which means that $P_1$ and $P_2$ are totally incompatible. So, for any tuple t, $\alpha_2(t) = \mu_{P_1}(t)$, i.e., $P_2$ is ignored. On the contrary, if $\exists t_0 \in T$ such that $\mu_{P_1}(t_0) = \mu_{P_2}(t_0) = 1$, the compatibility between $P_1$ and $P_2$ is maximal and for any tuple t, $\alpha_2(t) = \min(\mu_{P_1}(t), \mu_{P_2}(t))$, i.e., a pure conjunction applies. In between (when $0 < (w = f(P_1, P_2)) < 1$), $\alpha_2(t) = \min(\mu_{P_1}(t), \max(1 - w, \mu_{P_2}(t)))$, which is nothing but a weighted conjunction between $P_1$ and $P_2$ associated with the importance 1 − w. For instance, let us take the tuples $t_1$ and $t_2$ such that: $\mu_{P_1}(t_1) = 0.8$, $\mu_{P_2}(t_1) = 0.3$, $\mu_{P_1}(t_2) = 0.4$, $\mu_{P_2}(t_2) = 0.1$.
With w = 0.8 (fairly low compatibility), we get:
$\alpha_2(t_1) = \min(0.8, \max(0.2, 0.3)) = 0.3$ ($\mu_2 = 0.3$ is not upgraded),
$\alpha_2(t_2) = \min(0.4, \max(0.2, 0.1)) = 0.2$ ($\mu_2 = 0.1$ is slightly upgraded).
With w = 0.3 (fairly high compatibility), we get:
$\alpha_2(t_1) = \min(0.8, \max(0.7, 0.3)) = 0.7$ ($\mu_2 = 0.3$ is upgraded to 0.7),
$\alpha_2(t_2) = \min(0.4, \max(0.7, 0.1)) = 0.4$ ($\mu_2 = 0.1$ is (significantly) upgraded but the result is bounded by 0.4).
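The dependence of Formula (3) on the whole set of tuples can be made concrete with a short sketch. It assumes that T is given as a non-empty list of pairs of degrees; the function names are hypothetical. The last loop simply replays the arithmetic of Example 1 by passing the thresholds 0.2 and 0.7 that appear inside the max as the third argument.

```python
def incompatibility(degrees):
    """f(P1, P2) = 1 - max over the tuples of min(mu_P1, mu_P2) (Formula (3)).

    `degrees` is a non-empty list of (mu_P1, mu_P2) pairs, one per tuple of T.
    """
    return 1.0 - max(min(m1, m2) for m1, m2 in degrees)


def hierarchical_and(mu1, mu2, threshold):
    """min(mu_P1(t), max(threshold, mu_P2(t))), the shape of Formula (3)."""
    return min(mu1, max(threshold, mu2))


if __name__ == "__main__":
    # No tuple satisfies both predicates: total incompatibility, P2 is ignored.
    disjoint = [(0.8, 0.0), (0.0, 0.6), (0.0, 0.0)]
    f = incompatibility(disjoint)
    print(f, [hierarchical_and(m1, m2, f) for m1, m2 in disjoint])

    # One tuple fully satisfies both: f = 0 and a pure conjunction applies.
    compatible = [(1.0, 1.0), (0.8, 0.3)]
    f = incompatibility(compatible)
    print(f, [hierarchical_and(m1, m2, f) for m1, m2 in compatible])

    # Arithmetic of Example 1 for t1 = (0.8, 0.3) and t2 = (0.4, 0.1).
    for threshold in (0.2, 0.7):
        print(threshold,
              hierarchical_and(0.8, 0.3, threshold),
              hierarchical_and(0.4, 0.1, threshold))
```

Because `incompatibility` scans every tuple, the grade of a single tuple cannot be computed without a first pass over the data, which is the “context” effect discussed for this approach.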
In [9], the authors have taken this type of aggregation as a basis for introducing a connector named “and possibly” in an information retrieval query language, which establishes a connection between the hierarchical conjunction of predicates and the connector “and if possible”.
In [10], the authors deal with the modeling of fuzzy hierarchical preference conditions. The type of condition considered is: “$P_1$ then $P_2$ (with priority level w < 1)”. To interpret such a statement, the authors extend the definition of their initial weighted minimum operation. The following formula shows how the degree associated with an element t, onto which the condition “$P_1$ then $P_2$” applies, is computed:
$\alpha_3(t) = \min(\mu_{P_1}(t), \max(\mu_{P_2}(t), 1 - \min(\mu_{P_1}(t), w)))$. (4)
In this expression, the final importance level of $P_2$ for tuple t depends on both $\mu_{P_1}(t)$ and w. In particular, it is zero if $P_1$ is not at all satisfied, whatever the satisfaction of $P_2$. It turns out that this view does not meet all of our requirements since Axiom A4 does not hold (basically since $h(\mu_1, \mu_2) = \max(\mu_2, 1 - \min(\mu_1, w))$ is decreasing in $\mu_1$). Let us consider w = 0.8 and two tuples $t_1$ and $t_2$ such that: $\mu_{P_1}(t_1) = 0.6$, $\mu_{P_2}(t_1) = 0.2$, $\mu_{P_1}(t_2) = 0.7$, $\mu_{P_2}(t_2) = 0.2$. We get:
$\alpha_3(t_1) = \min(0.6, \max(0.2, 1 - \min(0.6, 0.8))) = 0.4$,
$\alpha_3(t_2) = \min(0.7, \max(0.2, 1 - \min(0.7, 0.8))) = 0.3$,
so the tuple that better satisfies $P_1$ gets the lower grade.
3.4. Approach based on a conjunctive partial absorption
Conjunctive partial absorption (CPA for short) [11] is an asymmetric compound operator that combines a mandatory input and a desired (optional) one. In that sense, it has something to do with the “and if possible” operator we propose.
However, their behaviors are different and CPA does not comply with all of the axioms given in Section 3.1. The behavior of CPA is as follows:
• CPA(0, y) = 0 whatever the value of y. Operator $\alpha$ acts the same.
• CPA(x, 0) = x − x P when x > 0 and y = 0 (a penalty is applied).
• CPA(x, 1) = x + x R when 0 < x < 1 (a reward is applied). This violates Axiom A2, which says that $\alpha(x, y)$ is always ≤ x.
• CPA(1, y) = z such that y < z < 1 when 0 < y < 1, where z is computed by means of a weighted power mean [12].
Axiom A6 (which says that “$P_1$ and if possible $P_2$” is equivalent to “$P_1$ and if possible ($P_1$ and $P_2$)”) is also violated. Like CPA, operator $\alpha$ may apply a penalty (with $\alpha$, the “penalty” is applied when y is smaller than x, not when y is zero), but the main difference is that $\alpha(x, y)$ takes a value between x and min(x, y) depending on the satisfaction of y with respect to that of x. This is a different type of interpretation of the interaction between a mandatory criterion and a desired one.
3.5. Approach based on bipolar view
As mentioned in Remark 1, our definition of “and if possible” also appears in a paper by Dubois and Prade [7] in the context of bipolar querying, where it is discarded by the authors. Indeed, they do not judge this formula suitable for capturing heterogeneous bipolarity, and we share this point of view. Our goal, with operator $\alpha$, is not to model a bipolar conjunction since we do not keep the two conditions separate ($\alpha$ performs an aggregation).
In [13], the authors deal with similarity and dissimilarity degrees in a bipolar framework which is based on Atanassov’s intuitionistic fuzzy sets. However, dealing with positive and negative requirements is not the purpose of operator $\alpha$.
In [14–16], the authors propose an operator inspired by an aggregation technique, aimed at modeling prioritized constraints, initially introduced by Lacroix and Lavency in their system Preferences [17].
They define a fuzzy version of it which represents an interpretation of the concept of a bipolar query with an “and possibly” operator. They also show its basic relation with a fuzzy version of the operator winnow, whose crisp version was first defined by Chomicki [18]. This fuzzy “and possibly” operator (whose suitability for representing bipolar preferences is criticized in [19]) corresponds to the approach proposed by Yager, which is discussed above, cf. Formula (3). The interpretation of “and possibly” is then based on the content of the whole database.
3.6. Discussion and conclusion
Clearly, the approach suggested by Yager and expressed by Formula (3) is compatible with our expectations. That would be the case of any interpretation based on a weighted conjunction, i.e.:
$\mu_{P_1\ and\ (w, P_2)}(t) = \min(\mu_{P_1}(t), \max(\mu_{P_2}(t), 1 - w))$,
where w is the importance assigned to $P_2$. However, in such a context, the upgrade mechanism does not depend on $\mu_{P_1}$ at all. In Yager’s specific proposal, the final degree assigned to an element is conditioned by w, which is a parameter global to the whole dataset and reflects a notion of global possibility of having $P_1$ and $P_2$ simultaneously. In this regard, it can be said that Yager’s approach takes into account a certain form of “context”: the term “possible” (in “and if possible”) then means “feasible considering the data present in the database”. On the other hand, in the definition we propose, “$P_1$ and if possible $P_2$” means “$P_1$ or even better ($P_1$ and $P_2$)”. We do not claim that one interpretation is better than the other. We rather think that they both make sense, and that the user will favor one or the other depending on what he/she means by “possible”. In other terms, our proposal provides a sound basis for an acceptable behavior of the operator “and if possible”, but it is not the only reasonable one. As to the approach suggested by Dubois and Prade (cf. Formula (4)), it turns out that it does not obey monotonicity, which we consider a legitimate property.
4. “Or else”
The expression “$P_1$ or else $P_2$”, where $P_1$ and $P_2$ are two predicates, expresses a disjunction that is both strong and nonsymmetric, in the sense that $P_2$ is not considered at the same level as $P_1$ and is therefore not a “true alternative” for $P_1$.
4.1. Axioms
This operator, denoted by $\beta$, can be formally defined on the basis of the following axioms:
B1: “$P_1$ or else $P_2$” is more drastic than “$P_1$ or $P_2$” since, as said above, $P_2$ is not as good as $P_1$: $\beta(\mu_1, \mu_2) \leq \max(\mu_1, \mu_2)$.
B2: “$P_1$ or else $P_2$” is softer than $P_1$ since $P_2$ opens a choice: $\beta(\mu_1, \mu_2) \geq \mu_1$.
B3: $\beta$ must have an asymmetric behavior as $\alpha$, i.e., it is noncommutative (cf. Axiom A3).
B4: $\beta$ is increasing in its first argument: $x \geq y \Rightarrow \beta(x, z) \geq \beta(y, z)$.
B5: $\beta$ is increasing in its second argument: $y \geq z \Rightarrow \beta(x, y) \geq \beta(x, z)$.
B6: “$P_1$ or else $P_2$” is equivalent to “$P_1$ or else ($P_1$ or $P_2$)”, i.e.: $\beta(\mu_1, \mu_2) = \beta(\mu_1, \max(\mu_1, \mu_2))$.
From Axiom B2, it follows that: $\forall \mu_2,\ \beta(1, \mu_2) = 1$. From Axioms B1 and B2, one gets $\mu_2 \leq \mu_1 \Rightarrow \beta(\mu_1, \mu_2) = \mu_1$, thus: $\forall \mu_1,\ \beta(\mu_1, \mu_1) = \mu_1$. In other words, “$P_1$ or else $P_1$” is equivalent to $P_1$, which makes sense.
4.2. The proposed definition
One may envisage:
$\beta_1(\mu_1, \mu_2) = \max(\mu_1, k \cdot \mu_1 + (1 - k) \cdot \mu_2)$ (5)
with $k \in [0, 1]$. For instance, if k = 0.5 is taken:
$\beta_1(\mu_1, \mu_2) = \max\left(\mu_1, \frac{\mu_1 + \mu_2}{2}\right)$.
Remark 4. When k = 1, the resulting grade is $\mu_1$, and when k = 0, one gets $\max(\mu_1, \mu_2)$.
Remark 5. In the case where $P_1$ and $P_2$ are Boolean predicates, one gets the truth values represented in Table 2.
Proof of validity of Axioms B1–B6. If $\mu_1 \leq k \cdot \mu_1 + (1 - k) \cdot \mu_2$, then $\beta_1(\mu_1, \mu_2) = k \cdot \mu_1 + (1 - k) \cdot \mu_2$, which is upper bounded by $\max(\mu_1, \mu_2)$ as any mean is. Otherwise, $\beta_1(\mu_1, \mu_2) = \mu_1 \leq \max(\mu_1, \mu_2)$, hence B1 holds. By its very nature, $\beta_1(\mu_1, \mu_2) \geq \mu_1$ is guaranteed and Axiom B2 holds. Axioms B3–B5 are trivially satisfied by Expression (5). Last, if $\mu_1 \leq \mu_2$, $\beta_1(\mu_1, \max(\mu_1, \mu_2))$ rewrites $\beta_1(\mu_1, \mu_2)$. Otherwise, $\beta_1(\mu_1, \max(\mu_1, \mu_2)) = \beta_1(\mu_1, \mu_1) = \mu_1$, while $\beta_1(\mu_1, \mu_2) = \mu_1$ since $k \cdot \mu_1 + (1 - k) \cdot \mu_2 \leq \max(\mu_1, \mu_2)\ (= \mu_1)$.
Interestingly enough, de Morgan’s laws hold $\forall k \in [0, 1]$ between operators $\alpha_1$ and $\beta_1$, which makes them dual in the same way as conjunctions and disjunctions based on norms and co-norms:
$1 - \alpha_1(1 - \mu_1, 1 - \mu_2) = \beta_1(\mu_1, \mu_2)$,
$1 - \beta_1(1 - \mu_1, 1 - \mu_2) = \alpha_1(\mu_1, \mu_2)$.
Table 2
Behavior of $\beta_1(\mu_1, \mu_2)$ (Boolean case).
$\mu_1$    $\mu_2$    $\beta_1(\mu_1, \mu_2)$
1          1          1
1          0          1
0          1          1 − k
0          0          0
Proof. From Eq. (2), one has
$1 - \alpha_1(1 - x, 1 - y) = 1 - \min(1 - x, k \cdot (1 - x) + (1 - k) \cdot (1 - y))$
$= \max(x, 1 - (k \cdot (1 - x) + (1 - k) \cdot (1 - y)))$
$= \max(x, k \cdot x + (1 - k) \cdot y) = \beta_1(x, y)$.
Similarly, from Eq. (5): $1 - \beta_1(1 - x, 1 - y) = \alpha_1(x, y)$.
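The duality established in this proof can also be checked numerically. The sketch below restates Formulas (2) and (5) as two functions (the names are illustrative) and verifies both de Morgan identities on a grid of degrees, up to a floating-point tolerance that is an assumption of the sketch.

```python
import itertools


def and_if_possible(mu1, mu2, k):
    """Formula (2): min(mu1, k*mu1 + (1-k)*mu2)."""
    return min(mu1, k * mu1 + (1 - k) * mu2)


def or_else(mu1, mu2, k):
    """Formula (5): max(mu1, k*mu1 + (1-k)*mu2)."""
    return max(mu1, k * mu1 + (1 - k) * mu2)


def duality_holds(k, steps=20, eps=1e-9):
    """Check both de Morgan identities on a (steps+1) x (steps+1) grid."""
    grid = [i / steps for i in range(steps + 1)]
    for x, y in itertools.product(grid, repeat=2):
        if abs((1 - and_if_possible(1 - x, 1 - y, k)) - or_else(x, y, k)) > eps:
            return False
        if abs((1 - or_else(1 - x, 1 - y, k)) - and_if_possible(x, y, k)) > eps:
            return False
    return True


if __name__ == "__main__":
    for k in (0.0, 0.25, 0.5, 1.0):
        print(k, duality_holds(k))
```

The same value of k is passed to both operators, in line with the requirement that the parameter be shared for the duality to hold.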
Remark 6. The value of parameter k (the relative importance of $P_1$ with respect to $P_2$ or vice versa) has to be the same for both operators.
It is worth noticing that an interpretation of “$P_1$ or else $P_2$” could also be derived by duality from the interpretation of “$P_1$ and if possible $P_2$” suggested by Yager, i.e.:
$\beta_2(\mu_1, \mu_2) = \max(\mu_1, \min(1 - f(P_1, P_2), \mu_2))$,
where $f(P_1, P_2)$ is the incompatibility function given in Section 3.3.2. Operator $\beta_2$ is just a special case of a weighted disjunction defined as the dual of the weighted conjunction.
Remark 7. From a semantic point of view, the connective “or else” can be related to Qualitative Choice Logic (QCL) [20], which is a propositional logic for representing alternative, ranked options for problem solutions. Indeed, this logic adds to classical propositional logic a new connective called ordered disjunction: A × B intuitively means: if possible A, but if A is not possible then at least B. In other words, it models statements of the form “A or else B”. The semantics of qualitative choice logic is based on a preference relation among models. However, the authors of [20] do not deal with fuzzy statements.
5. “All the more as”
As mentioned previously, the basis for interpreting the condition “$P_1$ all the more as $P_2$” is to strengthen predicate $P_1$ depending on the satisfaction of predicate $P_2$. In particular, when the latter predicate is true (degree of satisfaction equal to 1), the final satisfaction will be that of very $P_1$. So, we first look at various ways of moving from P to very P, along with their compliance with some common sense properties. Let us mention that in [21], Bouchon-Meunier et al. describe a somewhat similar reinforcement effect (also expressed by the linguistic expression “all the more as”) in a context of fuzzy rules, but, contrary to us, these authors do not aim to define a connective.
5.1. About strengthening operators
5.1.1. Axioms
It seems reasonable to expect a strengthening modifier to satisfy the following three properties:
S1: $\forall (x, y)$, if $\mu_P(x) \geq \mu_P(y)$ then $\mu_{very\,P}(x) \geq \mu_{very\,P}(y)$,
S2: core(very P) ⊆ core(P),
S3: $\forall x$, if $\mu_P(x) \in\ ]0, 1[$ then $\mu_{very\,P}(x) < \mu_P(x)$; if $\mu_P(x) = 0$ or 1 then $\mu_{very\,P}(x) \leq \mu_P(x)$.
In other words, property S1 states that the strengthening operator “very” does not change the ordering of elements, property S2 states that “very P” is at least as demanding as P for the elements that are completely in agreement with predicate P, and property S3 imposes that for any element which somewhat (but not fully) satisfies predicate P, predicate “very P” is strictly less satisfied (there is a true strengthening effect).
5.1.2. Different strengthening approaches
Use of a triangular norm: A well-known way of reinforcing a predicate P is to define:
$\mu_{very\,P}(x) = \mu_{\top^n(P)}(x)$,
where $\top$ stands for a nonidempotent triangular norm. For instance, with the product and Lukasiewicz’ norm, one gets respectively:
$\mu_{very\,P}(x) = \mu_{P^n}(x) = (\mu_P(x))^n$,
$\mu_{very\,P}(x) = \mu_{P^n}(x) = \max(n \cdot \mu_P(x) - n + 1, 0)$.
In such a framework, the interpretation of the statement “X is A all the more as Y is B” applying to an element t, denoted by $\gamma(A, B)(t)$, will be such that
$\gamma(A, B)(t) = f_1(A, B, t, n) = \mu_{\top^n(A)}(t)$ if $\mu_B(t) = 1$,
with n a given powering coefficient. The satisfaction of properties S1–S3 by this strengthening method is now examined:
S1: if $\mu_P(x) \geq \mu_P(y)$ then $\mu_{\top^n(P)}(x) \geq \mu_{\top^n(P)}(y)$ due to the monotonicity of norms with respect to both arguments, hence $\mu_{very\,P}(x) \geq \mu_{very\,P}(y)$,
S2: if $\mu_P(x) = 1$ then $\mu_{\top^n(P)}(x) = \top(1, \ldots, 1) = 1$ due to the fact that 1 is a neutral element of any norm, and we have core(P) = core(very P),
S3: when $\mu_P(x) = 0$, $\mu_{\top^n(P)}(x) = \mu_P(x) = 0$; when $\mu_P(x) = 1$, $\mu_{\top^n(P)}(x) = \mu_P(x) = 1$; when $\mu_P(x) \in\ ]0, 1[$, the strict inferiority of $\mu_{\top^n(P)}(x)$ with respect to $\mu_P(x)$ is not necessarily satisfied by a triangular norm $\top$, except if $\top$ is Archimedean; in such a case, by definition it is ensured that $\top(y, y) < y$ for any $y \in\ ]0, 1[$ (see [6]). Then, we have: $\top(y, \ldots, y) = \top^n(y) < y$ and $\mu_{very\,P}(x) = \mu_{\top^n(P)}(x) < \mu_P(x)$ when $\mu_P(x) \in\ ]0, 1[$.
So, we can conclude that if an Archimedean norm ($\top$) is used, the strengthening mechanism defined as $\mu_{very\,P}(x) = \mu_{\top^n(P)}(x)$ complies with all three desired properties.
Translation: A second approach to the reinforcement of a predicate is to translate its membership function. If P is an increasing (resp. a decreasing) predicate, the translation is to the right (resp. left). In the more general case of a predicate P with a trapezoidal membership function, two translations take place, to the right for its increasing part and to the left for its decreasing part. For an increasing predicate P (situation considered later on), one has the following formula:
$\mu_{very\,P}(x) = \mu_P(x - \delta)$,
where $\delta$ is a positive value (shift). Then, the interpretation of the statement “X is A all the more as Y is B” applying to an element t will be such that:
$\gamma(A, B)(t) = f_2(A, B, t, \delta) = \mu_A(t - \delta)$ if $\mu_B(t) = 1$,
with $\delta$ a given magnitude of the translation. Such a mechanism satisfies properties S1–S3 due to the monotonicity of P and the very nature of a translation.
Erosion: The last strengthening technique considered here to build “very P” from “P” is called erosion [22] and it relies on the idea of a semantic proximity between the initial and the modified (reinforced) predicates. It makes use of a
Fig. 1. Principle of the erosion operation.
proximity relation E between two (numeric) values x and y which is reflexive and symmetric, and the quantity $\mu_E(x, y)$ can be viewed as a degree of approximate equality of x and y. This quantity can be defined as $\mu_E(x, y) = \mu_Z(x - y)$, which depends only on the difference between x and y. The parameter Z, called a tolerance indicator, is a fuzzy subset of the real line centered in 0 which has a trapezoidal membership function whose core is $[-z, z]$ and whose support is $[-z - \varepsilon, z + \varepsilon]$, denoted by $(-z, z, -\varepsilon, \varepsilon)$. In this way, the degree produced by $\mu_Z(x - y)$ evaluates the extent to which (x − y) is close to 0 (according to Z). In other words, z is the maximal difference for which x and y are considered completely “approximately equal” and $z + \varepsilon$ is the difference over which x and y are “not at all approximately equal”. Using this proximity relation E parameterized by Z, a fuzzy set (or a predicate) P may be eroded into “very P” as follows:
$\mu_{very\,P}(x) = \mu_{P \ominus Z}(x)$,
where $\ominus$ denotes Minkowski’s difference defined as [23]
$\mu_{F \ominus Z}(r) = \inf_s \left( \mu_Z(r - s) \rightarrow_{Gd} \mu_F(s) \right)$,
where $\rightarrow_{Gd}$ is the Gödel implication ($p \rightarrow_{Gd} q = 1$ if $p \leq q$, q otherwise). Then, the interpretation of the statement “X is A all the more as Y is B” applying to an element t will be such that:
$\gamma(A, B)(t) = f_3(A, B, t, Z) = \mu_{A \ominus Z}(t)$ if $\mu_B(t) = 1$,
with Z a given tolerance indicator for the erosion. Fig. 1 illustrates the behavior of the erosion operator acting on a fuzzy predicate P whose trapezoidal membership function is (C, D, c, d). Of course, one must have $z \leq (D - C)/2$, $\varepsilon \leq c$ and $\varepsilon \leq d$. If z (resp. $\varepsilon$) equals 0, the core (resp. support) of the initial predicate remains unchanged. Such a strengthening operator satisfies properties S1–S3 due to the nature of the erosion operator, which corresponds to a difference in terms of membership functions (cf. Fig. 1).
Let us notice that erosion may modify the core and/or the support of the initial predicate, while translation reduces both, and the composition using any triangular norm leads to a reinforced predicate (very P) with the same core and support as the initial one (P).
5.2. Axioms of the connective “all the more as”
We now review a set of axioms that can serve as a reasonable basis for defining the predicate “X is A all the more as Y is B”. The following properties act as mandatory requirements for any definition of the connector “all the more as”, where t and t′ denote two tuples:
C1: decreasing monotonicity in the second argument “Y is B”, i.e.: if $\mu_B(t.Y) > \mu_B(t'.Y)$ and $\mu_A(t.X) = \mu_A(t'.X)$ then $\gamma(A, B)(t) \leq \gamma(A, B)(t')$,
C2: increasing monotonicity in the first argument “X is A”, i.e.: if $\mu_A(t.X) > \mu_A(t'.X)$ and $\mu_B(t.Y) = \mu_B(t'.Y)$ then $\gamma(A, B)(t) \geq \gamma(A, B)(t')$,
C3: when X is not at all A, “Y is B” has no effect, i.e.: if $\mu_A(t.X) = 0$ then $\forall \mu_B(t.Y),\ \gamma(A, B)(t) = 0$,
C4: when Y is not at all B, the value of “X is A all the more as Y is B” is that of “X is A”, i.e.: if $\mu_B(t.Y) = 0$ then $\forall \mu_A(t.X),\ \gamma(A, B)(t) = \mu_A(t.X)$,
C5: when “Y is B” is somewhat satisfied, the value of the statement “X is A all the more as Y is B” must result in an effective strengthening of “X is A” (except for the extreme truth values 0 and 1), i.e.: if $\mu_A(t.X) \in\ ]0, 1[$ and $\mu_B(t.Y) > 0$ then $\gamma(A, B)(t) < \mu_A(t.X)$.
5.3. A plausible definition
As mentioned before, the interpretation of the statement “X is A all the more as Y is B” has two extreme points: $\mu_A(t.X)$ when $\mu_B(t.Y)$ equals 0 and $\mu_{very\,A}(t.X)$ when $\mu_B(t.Y)$ equals 1. We suggest a linear transition between these two states, which has the advantage of being general for the three families of strengthening operators evoked before. This choice leads to the following definition of the connector “all the more as”, denoted by $\gamma$:
$\gamma(A, B)(t) = (\mu_{very\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$. (6)
Let us remark that, in the particular case of the use of the product norm (strengthening of type 1), it would also be possible to have the following definition:
$\gamma(A, B)(t) = (\mu_A(t.X))^{(n - 1) \cdot \mu_B(t.Y) + 1} = (\mu_A(t.X))^{\mu_B(t.Y) + 1}$ for n = 2.
Example 2. Let us consider the following relation secondhandcars(#c, brand, name, year, price, mileage, horsepower):
t1 = ⟨13, Ford, Focus, 2005, 6000, 40,000, 75⟩,
t2 = ⟨264, Renault, Clio, 2002, 4000, 130,000, 75⟩,
t3 = ⟨59, Toyota, Prius, 2006, 15,000, 90,000, 115⟩,
t4 = ⟨508, Peugeot, 307, 2004, 5500, 100,000, 110⟩,
t5 = ⟨4, Ford, Mondeo, 2005, 6000, 125,000, 110⟩,
t6 = ⟨78, Renault, Megane, 2006, 7000, 110,000, 110⟩,
t7 = ⟨112, Nissan, Primera, 2008, 16,000, 30,000, 130⟩
and the query looking for cars whose horsepower is less than 120 and such that the price is all the cheaper as the mileage is high. The predicates “cheap” and “high” are assumed to be defined as follows: cheap ( p) = 1 if p ≤ 5000, 0 if p ≥ 7500, linear in-between, high (m) = 0 if m ≤ 80,000, 1 if m ≥ 120,000, linear in-between. Clearly, the seventh car does not qualify for the query since its horsepower exceeds the limit imposed by the user. With the first type of strengthening, the grades obtained by the first six cars are: 0.6/t1 , 1/t2 , 0/t3 , 0.72/t4 , 0.36/t5 , 0.08/t6 with the product and n = 2, 0.6/t1 , 1/t2 , 0/t3 , 0.7/t4 , 0.2/t5 , 0.05/t6 with Lukasiewicz’ norm and n = 2, 0.6/t1 , 1/t2 , 0/t3 , 0.656/t4 , 0.216/t5 , 0.056/t6 with the product and n = 3, 0.6/t1 , 1/t2 , 0/t3 , 0.6/t4 , 0/t5 , 0.05/t6 with Lukasiewicz’ norm and n = 3. If a translation (to the left since the predicate is decreasing) is used to define “very cheap”, the grades are: 0.6/t1 , 1/t2 , 0/t3 , 0.6/t4 , 0.2/t5 , 0.05/t6 with = 100, 0.6/t1 , 0.6/t2 , 0/t3 , 0.4/t4 , 0/t5 , 0.05/t6 with = 200. Last, with an erosion to reinforce “cheap”, the grades obtained are: 0.6/t1 , 1/t2 , 0/t3 , 0.73/t4 , 0.33/t5 , 0.05/t6 with Z = (0, 0, 0, 100), 3 0.6/t1 , 1/t2 , 0/t3 , 0.68/t4 , 0.42/t5 , 0.05/t6 with Z = (0, 100, 0, 0).
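The norm-based grades of Example 2 can be recomputed with a few lines of Python. This is only a sketch under stated assumptions: the function names (`cheap`, `high`, `very_product`, `very_lukasiewicz`, `atma`, `grades`) are illustrative and not part of the paper, and "very" is taken to be the n-fold composition of the predicate with itself under the chosen norm.

```python
def cheap(price):
    """'cheap' from Example 2: 1 up to 5000, 0 from 7500, linear in between."""
    if price <= 5000:
        return 1.0
    if price >= 7500:
        return 0.0
    return (7500 - price) / 2500


def high(mileage):
    """'high' from Example 2: 0 up to 80,000, 1 from 120,000, linear in between."""
    if mileage <= 80_000:
        return 0.0
    if mileage >= 120_000:
        return 1.0
    return (mileage - 80_000) / 40_000


def very_product(mu, n=2):
    """n-fold product norm of mu with itself."""
    return mu ** n


def very_lukasiewicz(mu, n=2):
    """n-fold Lukasiewicz norm of mu with itself: max(n*mu - (n - 1), 0)."""
    return max(n * mu - (n - 1), 0.0)


def atma(mu_a, mu_b, very):
    """Expression (6): linear transition from mu_a to very(mu_a), driven by mu_b."""
    return (very(mu_a) - mu_a) * mu_b + mu_a


# (price, mileage) of the six cars that pass the horsepower filter.
CARS = {
    "t1": (6000, 40_000), "t2": (4000, 130_000), "t3": (15_000, 90_000),
    "t4": (5500, 100_000), "t5": (6000, 125_000), "t6": (7000, 110_000),
}


def grades(very):
    """Grade of every car for 'cheap all the more as high', rounded to 3 decimals."""
    return {tid: round(atma(cheap(p), high(m), very), 3)
            for tid, (p, m) in CARS.items()}


if __name__ == "__main__":
    print("product, n = 2:    ", grades(very_product))
    print("Lukasiewicz, n = 2:", grades(very_lukasiewicz))
    print("product, n = 3:    ", grades(lambda mu: very_product(mu, 3)))
    print("Lukasiewicz, n = 3:", grades(lambda mu: very_lukasiewicz(mu, 3)))
```

Passing the strengthening operator as a parameter mirrors the fact that Expression (6) is independent of the family of modifiers chosen.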
As can be seen, the results may significantly depend on the type of strengthening used as well as on the parameters chosen for a given type.

5.4. Compliance with the axioms

In this subsection, we show that, if the strengthening mechanism used to reinforce “X is A” in Expression (6) complies with properties S1–S3 (see Section 5.1), then Axioms C1–C5 are satisfied. As in Expression (6), atma abbreviates “all the more as”.

C1: Let us take two tuples t and t′ such that: (i) $\mu_A(t.X) = \mu_A(t'.X)$ and (ii) $\mu_B(t.Y) > \mu_B(t'.Y)$. We have

$$\mu_{\mathrm{atma}(A,B)}(t) - \mu_{\mathrm{atma}(A,B)}(t') = (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X) - (\mu_{\mathit{very}\,A}(t'.X) - \mu_A(t'.X)) \cdot \mu_B(t'.Y) - \mu_A(t'.X)$$
$$= (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot (\mu_B(t.Y) - \mu_B(t'.Y)).$$

The first factor is negative or equal to 0 due to the validity of property S3 and the second one is positive according to the hypotheses stated above. So, we have: $\mu_{\mathrm{atma}(A,B)}(t) - \mu_{\mathrm{atma}(A,B)}(t') \le 0$, i.e.: $\mu_{\mathrm{atma}(A,B)}(t) \le \mu_{\mathrm{atma}(A,B)}(t')$ and property C1 holds.

C2: Let us take two tuples t and t′ such that: (i) $\mu_A(t.X) > \mu_A(t'.X)$ and (ii) $\mu_B(t.Y) = \mu_B(t'.Y)$. We have:

$$\mu_{\mathrm{atma}(A,B)}(t) - \mu_{\mathrm{atma}(A,B)}(t') = (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X) - (\mu_{\mathit{very}\,A}(t'.X) - \mu_A(t'.X)) \cdot \mu_B(t'.Y) - \mu_A(t'.X)$$
$$= (\mu_{\mathit{very}\,A}(t.X) - \mu_{\mathit{very}\,A}(t'.X)) \cdot \mu_B(t.Y) + (\mu_A(t.X) - \mu_A(t'.X)) \cdot (1 - \mu_B(t.Y)).$$

From S1: $\mu_A(t.X) > \mu_A(t'.X) \Rightarrow \mu_{\mathit{very}\,A}(t.X) \ge \mu_{\mathit{very}\,A}(t'.X)$, so the first term of the sum is positive or equal to 0. In the second term, the first factor is positive according to the hypotheses made above and the second factor, $1 - \mu_B(t.Y)$, is positive or equal to 0. It follows that the previous expression yields a result that is positive or equal to 0, so: $\mu_{\mathrm{atma}(A,B)}(t) - \mu_{\mathrm{atma}(A,B)}(t') \ge 0$, i.e.: $\mu_{\mathrm{atma}(A,B)}(t) \ge \mu_{\mathrm{atma}(A,B)}(t')$ and property C2 holds.

C3: Let us consider a tuple t such that $\mu_A(t.X) = 0$. Due to the validity of S3, one has $\mu_{\mathit{very}\,A}(t.X) = 0$. Then: $\mu_{\mathrm{atma}(A,B)}(t) = (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X) = 0$.

C4: Let us now take a tuple t such that $\mu_B(t.Y) = 0$. We have: $\mu_{\mathrm{atma}(A,B)}(t) = (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X) = \mu_A(t.X)$.

C5: Let us consider a tuple t such that: (i) $\mu_A(t.X) \in\, ]0, 1[$ and (ii) $\mu_B(t.Y) > 0$. We have: $\mu_{\mathrm{atma}(A,B)}(t) = (\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$ where $(\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X))$ is strictly negative, and so is $(\mu_{\mathit{very}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y)$. Consequently: $\mu_{\mathrm{atma}(A,B)}(t) < \mu_A(t.X)$ and axiom C5 holds.

Finally, it turns out that the strengthening approaches previously proposed (composition with an Archimedean norm, translation, erosion) are suitable for the interpretation of the statement “X is A all the more as Y is B”. When applied to a tuple t of a relation r, the preceding predicate is defined as follows:
• with an Archimedean norm $\top$: $\mu_{\mathrm{atma}(A,B)}(t) = f_1(A, B, n, t) = (\mu_{\top^n(A)}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$,
• with a translation of magnitude $\theta > 0$ to the right applying to an increasing initial predicate A: $\mu_{\mathrm{atma}(A,B)}(t) = f_2(A, B, \theta, t) = (\mu_A(t.X - \theta) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$,
• with the erosion operator whose indicator is $Z = (-z, z, -\delta, \delta)$: $\mu_{\mathrm{atma}(A,B)}(t) = f_3(A, B, Z, t) = (\mu_{A \ominus Z}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$.
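As a complement to the proofs above, Axioms C1–C5 can be checked numerically on a grid of truth values. The sketch below is only a sanity check, not a proof; it assumes the product norm with n = 2 as the strengthening operator, and the names `atma`, `check_axioms`, `GRID` and `EPS` are hypothetical. The identity modifier is included to show what goes wrong when "very" does not actually strengthen the predicate.

```python
GRID = [i / 20 for i in range(21)]   # truth values 0, 0.05, ..., 1
EPS = 1e-12                          # tolerance for floating-point comparisons


def atma(mu_a, mu_b, very):
    """Expression (6) with a pluggable strengthening operator."""
    return (very(mu_a) - mu_a) * mu_b + mu_a


def check_axioms(very):
    """Return, for each axiom C1..C5, whether it holds on every grid point."""
    return {
        # C1: non-increasing in the degree of "Y is B".
        "C1": all(atma(a, b1, very) <= atma(a, b2, very) + EPS
                  for a in GRID for b1 in GRID for b2 in GRID if b1 > b2),
        # C2: non-decreasing in the degree of "X is A".
        "C2": all(atma(a1, b, very) >= atma(a2, b, very) - EPS
                  for b in GRID for a1 in GRID for a2 in GRID if a1 > a2),
        # C3: a zero degree for "X is A" yields zero.
        "C3": all(atma(0.0, b, very) == 0.0 for b in GRID),
        # C4: a zero degree for "Y is B" leaves the degree of "X is A" unchanged.
        "C4": all(atma(a, 0.0, very) == a for a in GRID),
        # C5: effective strengthening away from the extreme truth values.
        "C5": all(atma(a, b, very) < a
                  for a in GRID for b in GRID if 0 < a < 1 and b > 0),
    }


if __name__ == "__main__":
    print(check_axioms(lambda mu: mu ** 2))  # product norm, n = 2
    print(check_axioms(lambda mu: mu))       # identity: no strengthening at all
```

With the identity modifier, C1 to C4 still hold trivially but C5 fails, which illustrates why the modifier must yield a strict reinforcement on ]0, 1[.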
6. “All the less as”

By analogy with the operator “all the more as”, one may think of a predicate “X is A all the less as Y is B”. In such a case, one will not reinforce but weaken condition A (from “A” to “more or less A”). The interpretation is based on the fact that one is all the less demanding on the fact that X is A as the condition “Y is B” is more and more satisfied, similarly to the interpretation of “X is A all the more as Y is B” where one is all the more demanding on the fact that X is A as the condition “Y is B” is more and more satisfied. The generic formulation of the degree of satisfaction of the condition “X is A all the less as Y is B” (abbreviated as atla) is then given by

$$\mu_{\mathrm{atla}(A,B)}(t) = (\mu_{\mathit{more\ or\ less}\,A}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X). \tag{7}$$
By analogy with the strengthening case, this expression will have three instantiations (atla abbreviating “all the less as”), namely:
• with an Archimedean co-norm $\bot$:

$$\mu_{\mathrm{atla}(A,B)}(t) = f_1(A, B, n, t) = (\mu_{\bot^n(A)}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X), \tag{8}$$

• with a translation of magnitude $\theta > 0$ to the left applying to an increasing initial predicate A: $\mu_{\mathrm{atla}(A,B)}(t) = f_2(A, B, \theta, t) = (\mu_A(t.X + \theta) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$,
• with the dilation operator whose indicator is $Z = (-z, z, -\delta, \delta)$: $\mu_{\mathrm{atla}(A,B)}(t) = f_3(A, B, Z, t) = (\mu_{A \oplus Z}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X)$, where $\oplus$ denotes the addition of fuzzy numbers.

Example 3. Let us consider the relation secondhandcars introduced in Example 2 with the following extension:
  t8 = ⟨19, Ford, Focus, 2005, 6000, 80,000, 75⟩,
  t9 = ⟨21, Renault, Clio, 2002, 6000, 60,000, 75⟩,
  t10 = ⟨132, Citroen, C1, 2006, 4000, 50,000, 60⟩,
  t11 = ⟨75, Peugeot, 307, 2004, 7500, 50,000, 90⟩,
  t12 = ⟨57, Opel, Astra, 2008, 10,000, 55,000, 115⟩
and the query looking for cars whose horsepower is more than 55 and such that the price is all the less cheap as the mileage is low. The predicates “cheap” and “low” are assumed to be defined as follows:
  $\mu_{cheap}(p)$ = 1 if p ≤ 5000, 0 if p ≥ 7500, linear in-between,
  $\mu_{low}(m)$ = 1 if m ≤ 50,000, 0 if m ≥ 70,000, linear in-between.
If we assume that the weakening is done using a translation (to the right in this case) whose magnitude is $\theta$ = 1000, so that $\mu_{\mathit{more\ or\ less}\,cheap}(p) = \mu_{cheap}(p - \theta)$, we have to compute for each tuple t:

$$(\mu_{cheap}(t.price - \theta) - \mu_{cheap}(t.price)) \cdot \mu_{low}(t.mileage) + \mu_{cheap}(t.price)$$
which leads to:
  0.6/t8 since $\mu_{low}(80,000) = 0$, $\mu_{cheap}(6000) = 0.6$, $\mu_{\mathit{more\ or\ less}\,cheap}(6000) = 1$,
  0.8/t9 since $\mu_{low}(60,000) = 0.5$, $\mu_{cheap}(6000) = 0.6$, $\mu_{\mathit{more\ or\ less}\,cheap}(6000) = 1$,
  1/t10 since $\mu_{low}(50,000) = 1$, $\mu_{cheap}(4000) = 1$, $\mu_{\mathit{more\ or\ less}\,cheap}(4000) = 1$,
  0.4/t11 since $\mu_{low}(50,000) = 1$, $\mu_{cheap}(7500) = 0$, $\mu_{\mathit{more\ or\ less}\,cheap}(7500) = 0.4$,
  0/t12 since $\mu_{low}(55,000) = 0.75$, $\mu_{cheap}(10,000) = 0$, $\mu_{\mathit{more\ or\ less}\,cheap}(10,000) = 0$.
It is worth noticing that the approach chosen for the interpretation of “price is cheap all the less as mileage is low” is not equivalent to “price is all the cheaper as mileage is not low”, which is not very surprising. For tuple t9, this second interpretation would lead to the grade 0.4 ((0.2 − 0.6) ∗ 0.5 + 0.6) instead of 0.8.

7. Detailed example

Let us consider a relation secondhandcars of schema (#id, brand, model, type, year, mileage, horsePw, price) describing used cars (cf. Table 3). Now let us consider the fuzzy predicates P1 = “mileage is low” and P2 = “horsePw is around_100” defined hereafter (mileage being expressed in thousands, as in Table 3):

$$\mu_{low}(x) = \begin{cases} 1 & \text{if } x \le 40, \\ 0 & \text{if } x \ge 100, \\ \text{linear} & \text{in-between,} \end{cases}$$

$$\mu_{around\_100}(x) = \begin{cases} 1 & \text{if } x = 100, \\ 0 & \text{if } x \le 70 \text{ or } x \ge 130, \\ \dfrac{x - 70}{30} & \text{if } x \in [70, 100], \\ \dfrac{130 - x}{30} & \text{if } x \in [100, 130] \end{cases}$$

and the queries:
  (Q1) P1 and P2,
  (Q2) P1 and if possible P2,
  (Q3) P1 or P2,
  (Q4) P1 or else P2.
The results of these queries appear in Table 4 (k = 0.5 is used in the interpretation of “and if possible” and “or else”, and the degrees are rounded).

Table 3. Extension of relation secondHandCars.
#id   Brand   Model     Type     Year   Mileage   Horsepw   Price
t1    vw      golf      sedan    2006   95K       90        12K
t2    seat    ibiza     sport    2001   150K      80        8K
t3    audi    A3        sport    2008   22K       120       22K
t4    seat    cordoba   sedan    2004   220K      100       4K
t5    ford    focus     estate   2005   80K       70        9K
t6    vw      polo      city     1998   120K      50        3K
t7    vw      golf      estate   2007   40K       80        14K
t8    ford    ka        city     2003   240K      50        2K
t9    kia     rio       city     2009   10K       60        10K
t10   seat    leon      sport    2009   25K       115       21K
t11   ford    focus     sedan    2009   53K       90        14K
t12   rover   223       sedan    1997   100K      120       4K
Table 4. Results of queries (Q1)–(Q4).
#id   P1    P2    P1 ∧ P2   P1 aip P2   P1 ∨ P2   P1 oe P2
t1    0.1   0.7   0.1       0.1         0.7       0.35
t2    0     0.3   0         0           0.3       0.15
t3    1     0.3   0.3       0.65        1         1
t4    0     1     0         0           1         0.5
t5    0.3   0     0         0.15        0.3       0.3
t6    0     0     0         0           0         0
t7    1     0.3   0.3       0.65        1         1
t8    0     0     0         0           0         0
t9    1     0     0         0.5         1         1
t10   1     0.5   0.5       0.75        1         1
t11   0.8   0.7   0.7       0.75        0.8       0.8
t12   0     0.3   0         0           0.3       0.15
The ordered results are:
  for Q1: 0.7/t11, 0.5/t10, 0.3/t3, 0.3/t7, 0.1/t1;
  for Q2: 0.75/t10, 0.75/t11, 0.65/t3, 0.65/t7, 0.5/t9, 0.15/t5, 0.1/t1;
  for Q3: 1/t3, 1/t4, 1/t7, 1/t9, 1/t10, 0.8/t11, 0.7/t1, 0.3/t2, 0.3/t5, 0.3/t12;
  for Q4: 1/t3, 1/t7, 1/t9, 1/t10, 0.8/t11, 0.5/t4, 0.35/t1, 0.3/t5, 0.15/t2, 0.15/t12.
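The rankings for Q1 to Q3 can be recomputed from the mileage and horsepower columns of Table 3. The sketch below makes two assumptions that should be read as such: “and if possible” is interpreted as min(μP1, k·μP1 + (1 − k)·μP2) with k = 0.5, and the elementary degrees are rounded to one decimal before being combined, which is what the values of Table 4 suggest. All function and variable names are illustrative.

```python
def low(mileage_k):
    """'mileage is low', mileage in thousands: 1 up to 40, 0 from 100."""
    if mileage_k <= 40:
        return 1.0
    if mileage_k >= 100:
        return 0.0
    return (100 - mileage_k) / 60


def around_100(hp):
    """'horsePw is around_100': triangular, peak at 100, support ]70, 130[."""
    if hp <= 70 or hp >= 130:
        return 0.0
    if hp <= 100:
        return (hp - 70) / 30
    return (130 - hp) / 30


def and_if_possible(p1, p2, k=0.5):
    """Assumed interpretation of 'P1 and if possible P2'."""
    return min(p1, k * p1 + (1 - k) * p2)


# (mileage in thousands, horsepower), copied from Table 3.
CARS = {
    "t1": (95, 90), "t2": (150, 80), "t3": (22, 120), "t4": (220, 100),
    "t5": (80, 70), "t6": (120, 50), "t7": (40, 80), "t8": (240, 50),
    "t9": (10, 60), "t10": (25, 115), "t11": (53, 90), "t12": (100, 120),
}


def evaluate(connective):
    """Combine the rounded degrees of P1 and P2 for every tuple."""
    result = {}
    for tid, (mileage_k, hp) in CARS.items():
        p1 = round(low(mileage_k), 1)
        p2 = round(around_100(hp), 1)
        result[tid] = round(connective(p1, p2), 2)
    return result


def ranked(result):
    """Tuples with a non-zero degree, best first (ties keep the tuple order)."""
    ordered = sorted(result.items(), key=lambda item: -item[1])
    return [(degree, tid) for tid, degree in ordered if degree > 0]


if __name__ == "__main__":
    print("Q1:", ranked(evaluate(min)))
    print("Q2:", ranked(evaluate(and_if_possible)))
    print("Q3:", ranked(evaluate(max)))
```

The built-in `min` and `max` are passed directly as the conjunction and disjunction, so that only the noncommutative connective needs a dedicated function.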
Among the effects that can be observed, let us mention the following ones:
• tuples t10 and t11 get the same degree in the result of query Q2 whereas t11 is better than t10 when a strict conjunction (as in Q1) is used. This illustrates the upgrading effect tied to the operator “and if possible”.
• tuple t9 belongs to the result of Q2 (with degree 0.5) whereas it was discarded by Q1, due to its zero degree related to P2 (which illustrates the optional nature of the second predicate in a condition involving the connective “and if possible”).
• tuple t4 gets the degree 1 in the result of Q3, but only 0.5 in that of Q4. This illustrates the fact that the connective “or else” is more demanding than a strict disjunction (in this case, tuple t4 is severely downgraded since it does not satisfy P1 at all).

Let us now consider the queries:
  (Q5) (price is low) all the more as (mileage is high),
  (Q6) (price is low) all the less as (year is recent),
where the fuzzy predicates low, high, and recent are defined as follows:

$$\mu_{low}(x) = \begin{cases} 0 & \text{if } x \ge 15{,}000, \\ 1 & \text{if } x \le 7000, \\ \dfrac{15{,}000 - x}{8000} & \text{otherwise.} \end{cases}$$

$$\mu_{high}(x) = \begin{cases} 0 & \text{if } x \le 80{,}000, \\ 1 & \text{if } x \ge 150{,}000, \\ \dfrac{x - 80{,}000}{70{,}000} & \text{otherwise.} \end{cases}$$

$$\mu_{recent}(x) = \begin{cases} 0 & \text{if } x \le 2001, \\ 1 & \text{if } x \ge 2009, \\ \dfrac{x - 2001}{8} & \text{otherwise.} \end{cases}$$
Table 5. Results of queries (Q5)–(Q6).
#id   μ_low(price)   μ_high(mileage)   Q5     μ_recent(year)   Q6
t1    0.37           0.21              0.32   0.62             0.52
t2    0.87           1                 0.76   0                0.87
t3    0              0                 0      0.87             0
t4    1              1                 1      0.37             1
t5    0.75           0                 0.75   0.5              0.81
t6    1              0.57              1      0                1
t7    0.12           0                 0.12   0.75             0.29
t8    1              1                 1      0.25             1
t9    0.62           0                 0.62   1                0.79
t10   0              0                 0      1                0
t11   0.12           0                 0.12   1                0.35
t12   1              0.29              1      0                1
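The Q5 and Q6 columns of Table 5 can be cross-checked with the following sketch. It assumes that "very" is the square and "more or less" the square root of the membership degree, and it compares the exact values with the printed ones up to 0.01, since the table entries appear to be rounded or truncated to two decimals. The names `CARS`, `TABLE5`, `atma`, `atla` and `compute` are illustrative.

```python
import math


def low(price):
    """'price is low': 1 up to 7000, 0 from 15,000, linear in between."""
    return min(1.0, max(0.0, (15_000 - price) / 8000))


def high(mileage):
    """'mileage is high': 0 up to 80,000, 1 from 150,000, linear in between."""
    return min(1.0, max(0.0, (mileage - 80_000) / 70_000))


def recent(year):
    """'year is recent': 0 up to 2001, 1 from 2009, linear in between."""
    return min(1.0, max(0.0, (year - 2001) / 8))


def atma(mu_a, mu_b):
    """Formula (6) with very(mu) = mu ** 2 (assumed modifier)."""
    return (mu_a ** 2 - mu_a) * mu_b + mu_a


def atla(mu_a, mu_b):
    """Formula (7) with more_or_less(mu) = sqrt(mu) (assumed modifier)."""
    return (math.sqrt(mu_a) - mu_a) * mu_b + mu_a


# (price, mileage, year) copied from Table 3.
CARS = {
    "t1": (12_000, 95_000, 2006), "t2": (8000, 150_000, 2001),
    "t3": (22_000, 22_000, 2008), "t4": (4000, 220_000, 2004),
    "t5": (9000, 80_000, 2005), "t6": (3000, 120_000, 1998),
    "t7": (14_000, 40_000, 2007), "t8": (2000, 240_000, 2003),
    "t9": (10_000, 10_000, 2009), "t10": (21_000, 25_000, 2009),
    "t11": (14_000, 53_000, 2009), "t12": (4000, 100_000, 1997),
}

# (Q5, Q6) as printed in Table 5.
TABLE5 = {
    "t1": (0.32, 0.52), "t2": (0.76, 0.87), "t3": (0, 0), "t4": (1, 1),
    "t5": (0.75, 0.81), "t6": (1, 1), "t7": (0.12, 0.29), "t8": (1, 1),
    "t9": (0.62, 0.79), "t10": (0, 0), "t11": (0.12, 0.35), "t12": (1, 1),
}


def compute():
    """Exact (unrounded) degrees of Q5 and Q6 for every tuple."""
    return {tid: (atma(low(p), high(m)), atla(low(p), recent(y)))
            for tid, (p, m, y) in CARS.items()}


if __name__ == "__main__":
    for tid, (q5, q6) in compute().items():
        print(f"{tid}: Q5 = {q5:.4f} (table {TABLE5[tid][0]}), "
              f"Q6 = {q6:.4f} (table {TABLE5[tid][1]})")
```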
Using formulas (6) and (7) for interpreting “all the more as” and “all the less as” respectively, and the modifiers very and more or less defined as

$$\mu_{\mathit{very}\,P}(t) = (\mu_P(t))^2, \qquad \mu_{\mathit{more\ or\ less}\,P}(t) = \sqrt{\mu_P(t)},$$

we get the results represented in Table 5. One may notice that, for query Q5, tuples t2 and t5 get almost the same degree, whereas t2 was significantly better than t5 considering only the predicate “price is low” (which shows the impact of the operator “all the more as”). This makes sense since t2 has a mileage which is totally high (thus the constraint on the price is strengthened). As for query Q6, even though tuples t7 and t11 are equivalent with respect to the criterion on price, t11 finally gets a better degree than t7 since it is more recent (which has the effect of softening the price requirement).

8. Implementation aspects

We now deal with the processing of fuzzy queries involving conditions of the types introduced in the previous sections. A first remark is that these queries are “regular” fuzzy selection queries, in the sense that the only novelty concerns the way atomic predicates can be combined inside a compound filtering condition. Therefore, a naïve evaluation algorithm would have a linear data complexity inasmuch as an exhaustive scan of the relation concerned may be used to assess the tuples and build the result. This is not worse than, e.g., the evaluation of a selection query involving atomic conditions combined by a mean or a fuzzy quantifier. However, one can also take advantage of the connections which exist between properties tied to regular (Boolean) conditions and fuzzy ones, so that fuzzy query processing can come down to Boolean query processing (at least partly).
An evaluation method, called derivation, exploiting such properties is described in [24,25], where the applicability of this method to the evaluation of different types of SQLf queries is discussed, as well as the integration of a derivation-based SQLf query interface on top of a regular relational DBMS. This strategy assumes that a threshold $\alpha$ is associated with an SQLf query in order to retrieve the $\alpha$-level cut of its answer set. The idea advocated is to use an existing database management system which will process regular Boolean queries. An SQL query is derived from an SQLf expression in order to retrieve a superset of the $\alpha$-level cut of its answer set. Then, the fuzzy query can be processed on this superset, thus avoiding the exhaustive scan of the whole database. The principle is to express the $\alpha$-level cut in terms of a query involving only regular operators and expressions. The problem is mainly to distribute the $\alpha$-level cut operation applying to a selection expression onto its constitutive elements. Hereafter, we show how this can be done for fuzzy queries involving the new constructs introduced before.
• “and if possible” (denoted by aip hereafter):

$$\mu_{(P_1\ aip\ P_2)}(t) \ge \alpha \Leftrightarrow \min(\mu_{P_1}(t),\ k \cdot \mu_{P_1}(t) + (1 - k) \cdot \mu_{P_2}(t)) \ge \alpha \Rightarrow \mu_{P_1}(t) \ge \alpha.$$
Thus, one may use a Boolean query $Q_1$ resulting from the derivation of the condition $(\mu_{P_1}(t) \ge \alpha)$, and perform the computation of the actual degrees only over the subset of tuples returned by $Q_1$. Notice that some of the elements in the result of $Q_1$ may get a final degree smaller than $\alpha$ (then they should be discarded) since the derivation rule is weak (an implication is used, not an equivalence).
• “or else” (denoted by oe hereafter):

$$\mu_{(P_1\ oe\ P_2)}(t) \ge \alpha \Leftrightarrow \max(\mu_{P_1}(t),\ k \cdot \mu_{P_1}(t) + (1 - k) \cdot \mu_{P_2}(t)) \ge \alpha$$
$$\Rightarrow (\mu_{P_1}(t) \ge \alpha) \text{ or } (k \cdot \mu_{P_1}(t) + (1 - k) \cdot \mu_{P_2}(t) \ge \alpha)$$
$$\Rightarrow (\mu_{P_1}(t) \ge \alpha) \text{ or } \left(\mu_{P_1}(t) \ge \max\left(\frac{\alpha + k - 1}{k},\ 0\right) \text{ and } \mu_{P_2}(t) \ge \max\left(\frac{\alpha - k}{1 - k},\ 0\right)\right).$$

Again, one may take advantage of a derived Boolean query in order to avoid the exhaustive scan of the relation concerned (provided that an index over the attribute concerned by P1 and/or that associated with P2 is available).
• “all the more as” (denoted by atma hereafter). Using Formula (6), and the fact that

$$\mu_{\mathit{very}\,P_1}(t) \le \mu_{(P_1\ atma\ P_2)}(t) \le \mu_{P_1}(t),$$

we get

$$\mu_{(P_1\ atma\ P_2)}(t) \ge \alpha \Rightarrow \mu_{P_1}(t) \ge \alpha,$$

and the same derived condition as in the case “and if possible” may be used.
• “all the less as” (denoted by atla hereafter). Using Formula (7), and the fact that

$$\mu_{P_1}(t) \le \mu_{(P_1\ atla\ P_2)}(t) \le \mu_{\mathit{more\ or\ less}\,P_1}(t),$$

we get

$$\mu_{(P_1\ atla\ P_2)}(t) \ge \alpha \Rightarrow \mu_{\mathit{more\ or\ less}\,P_1}(t) \ge \alpha.$$

If the fuzzy modifier “more or less” is interpreted as

$$\mu_{\mathit{more\ or\ less}\,P_1}(t) = (\mu_{P_1}(t))^{1/n}$$

with n denoting an integer ≥ 2, it becomes

$$\mu_{(P_1\ atla\ P_2)}(t) \ge \alpha \Rightarrow \mu_{P_1}(t) \ge \alpha^n.$$
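The preselection principle can be illustrated on the degrees of Table 4. The sketch below is not the derivation mechanism of [24,25]: the Boolean query is merely simulated by a Python filter over an in-memory dictionary, and the names `RELATION`, `derived_aip`, `derived_oe` and `alpha_cut` are hypothetical. The connectives are written exactly as in the derivation rules above, with k = 0.5.

```python
K = 0.5

# (degree of P1, degree of P2) for each tuple, copied from Table 4.
RELATION = {
    "t1": (0.1, 0.7), "t2": (0.0, 0.3), "t3": (1.0, 0.3), "t4": (0.0, 1.0),
    "t5": (0.3, 0.0), "t6": (0.0, 0.0), "t7": (1.0, 0.3), "t8": (0.0, 0.0),
    "t9": (1.0, 0.0), "t10": (1.0, 0.5), "t11": (0.8, 0.7), "t12": (0.0, 0.3),
}


def aip(p1, p2, k=K):
    """'and if possible' as written in the derivation rule."""
    return min(p1, k * p1 + (1 - k) * p2)


def oe(p1, p2, k=K):
    """'or else' as written in the derivation rule."""
    return max(p1, k * p1 + (1 - k) * p2)


def derived_aip(alpha):
    """Necessary Boolean condition for (P1 aip P2) >= alpha."""
    return lambda p1, p2: p1 >= alpha


def derived_oe(alpha, k=K):
    """Necessary Boolean condition for (P1 oe P2) >= alpha."""
    bound1 = max((alpha + k - 1) / k, 0.0)
    bound2 = max((alpha - k) / (1 - k), 0.0)
    return lambda p1, p2: p1 >= alpha or (p1 >= bound1 and p2 >= bound2)


def alpha_cut(relation, connective, alpha, derived=None):
    """Return (candidates, alpha-cut); without `derived`, scan every tuple."""
    candidates = [tid for tid, (p1, p2) in relation.items()
                  if derived is None or derived(p1, p2)]
    degrees = {tid: round(connective(*relation[tid]), 2) for tid in candidates}
    # The derived condition is only necessary: drop candidates below alpha.
    return candidates, {tid: d for tid, d in degrees.items() if d >= alpha}


if __name__ == "__main__":
    candidates, result = alpha_cut(RELATION, aip, 0.6, derived_aip(0.6))
    print("candidates:", candidates)
    print("0.6-cut of 'P1 aip P2':", result)
```

With α = 0.6, only five of the twelve tuples are assessed for “and if possible”, and t9 is preselected but then discarded, which is the situation described above for a weak derivation rule.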
Finally, it appears that for each type of condition, the derivation method can be used. This makes it possible to perform a preselection aimed at reducing the set of tuples for which a final degree has to be computed. Consequently |r| is only an upper bound of the data complexity attached to such queries (where r denotes the relation concerned).

9. A derived implication and its application to database querying

9.1. Deriving an implication from “all the more as”

Let us come back to the expression of the predicate “X is A all the more as Y is B” and the reinforcement of predicate A using an Archimedean norm $\top$:

$$\mu_{\mathrm{atma}(A,B)}(t) = f_1(A, B, n, t) = (\mu_{\top^n(A)}(t.X) - \mu_A(t.X)) \cdot \mu_B(t.Y) + \mu_A(t.X).$$

One may observe that this formula rewrites: $(\top^n(q) - q) \cdot p + q$
letting $p = \mu_B(t.Y)$, $q = \mu_A(t.X)$. This latter formula may be calculated for any two quantities p and q of the unit interval. Let us now define:

$$I_{(\top,n)}(p, q) = \begin{cases} 1 & \text{if } p = 0, \\ (\top^n(q) - q) \cdot p + q & \text{otherwise,} \end{cases}$$

where $\top$ denotes a nonidempotent norm. Operator $I_{(\top,n)}$ complies with the following four properties:
• $\forall q$, if $p_1 < p_2$, $I_{(\top,n)}(p_1, q) \ge I_{(\top,n)}(p_2, q)$, since the term $(\top^n(q) - q)$ is negative,
• $\forall p$, if $q_1 < q_2$, $I_{(\top,n)}(p, q_1) \le I_{(\top,n)}(p, q_2)$, since $I_{(\top,n)}(p, q_2) - I_{(\top,n)}(p, q_1) = (q_2 - q_1) \cdot (1 - p) + (\top^n(q_2) - \top^n(q_1)) \cdot p$, where $(q_2 - q_1) > 0$ and $(\top^n(q_2) - \top^n(q_1)) \ge 0$ from S1,
• $\forall q$, $I_{(\top,n)}(0, q) = 1$, by definition,
• $\forall p$, $I_{(\top,n)}(p, 1) = 1$, since $I_{(\top,n)}(p, 1) = (\top^n(1) - 1) \cdot p + 1 = 1$.
So, this operator satisfies four of the most cited (see e.g., [6]) properties of fuzzy implications. The fact that $1 \Rightarrow_f q = q$ does not hold, but instead $1 \Rightarrow_f q = \top^n(q)$ (for instance $q^n$ with the product), may be considered acceptable in view of a “reinforcement-based implication”, which could be seen as a new type of fuzzy implication, in addition to R-implications and S-implications in particular (see [6]). This is all the more defensible as (i) our objective is not to build a logic or a deductive system and (ii) this is coherent with the idea of strengthening captured by operator $I_{(\top,n)}$.
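The four properties can also be checked numerically. The sketch below assumes the product norm, so that the n-fold composition of q with itself is q ** n; the function name `reinforcement_implication` and the grid-based check `check_properties` are illustrative, and a check on a finite grid is of course no substitute for the arguments given above.

```python
def reinforcement_implication(p, q, n=2):
    """Derived implication with the product norm: 1 if p = 0, else (q^n - q) * p + q."""
    if p == 0:
        return 1.0
    return (q ** n - q) * p + q


GRID = [i / 20 for i in range(21)]   # 0, 0.05, ..., 1
EPS = 1e-12                          # tolerance for floating-point comparisons


def check_properties(n=2):
    """Evaluate the four implication properties on every grid point."""
    imp = lambda p, q: reinforcement_implication(p, q, n)
    return {
        "decreasing in p": all(imp(p1, q) >= imp(p2, q) - EPS
                               for q in GRID for p1 in GRID for p2 in GRID
                               if p1 < p2),
        "increasing in q": all(imp(p, q1) <= imp(p, q2) + EPS
                               for p in GRID for q1 in GRID for q2 in GRID
                               if q1 < q2),
        "I(0, q) = 1": all(imp(0.0, q) == 1.0 for q in GRID),
        "I(p, 1) = 1": all(imp(p, 1.0) == 1.0 for p in GRID),
    }


if __name__ == "__main__":
    print(check_properties(n=2))
    # With p = 1 the operator returns the strengthened value q ** n, not q.
    print(reinforcement_implication(1.0, 0.3))
```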
9.2. Application to inclusion queries

Until now, we have considered queries in which the selection part involved a condition of the type “X is A all the more as Y is B”. Now we tackle another type of queries calling on an inclusion. For instance, let us consider the fuzzy relation Profile whose schema is (emp, skill) describing the various skills possessed by the employees of a company along with a given level (between 0 and 1). The query looking for the employees whose set of skills is included in that associated with John writes:

  select emp from Profile
  where emp <> “John”
  group by emp
  having set(skill) included-in (select skill from Profile where emp = “John”)

in an SQL-like language (e.g., SQLf). Usually, the interpretation of the inclusion operator is founded on an implication:

$$E \subseteq F \Leftrightarrow \forall x \in X,\ x \in E \Rightarrow x \in F$$

which, for the query above, should be extended with a fuzzy implication (denoted by $\rightarrow_f$) as:

$$\deg(E \subseteq F) = \min_{x \in X}\ \mu_E(x) \rightarrow_f \mu_F(x).$$

Different semantics of the inclusion are obtained depending on the family of fuzzy implications taken. An R-implication naturally extends the usual implication in the sense that if E is included in F according to Zadeh ($\forall x \in X$, $\mu_E(x) \le \mu_F(x)$) the maximal degree (1) is returned, while an S-implication leads to a more demanding type of inclusion where full satisfaction is reached when the support of E is included in the core of F. The use of $I_{(\top,n)}$ gives rise to a third kind of semantics, more demanding than S-implications. Here, an element x of X must be in F all the more as it is in E, according to the behavior of the connective “all the more as” at the heart of implication $I_{(\top,n)}$.

Example 4. With the following extension of relation Profile: 1/⟨John, A⟩, 0.7/⟨John, B⟩, 0.3/⟨John, C⟩, 0.6/⟨Peter, A⟩, 1/⟨Peter, B⟩, 0.1/⟨Peter, D⟩, 0.1/⟨Mary, A⟩, 0.2/⟨Mary, B⟩, 0.4/⟨Mary, C⟩, 0.9/⟨Jebediah, A⟩, 1/⟨Jebediah, C⟩
the results of the previous query are:
  {0.3/Mary, 0.3/Jebediah} with Gödel implication ($p \rightarrow_{Gd} q = 1$ if $p \le q$, $q$ otherwise),
  {0.7/Peter, 0.6/Mary, 0.3/Jebediah} with Kleene-Dienes implication ($p \rightarrow_{K-D} q = \max(1 - p, q)$),
  {0.216/Mary, 0.09/Jebediah} using $I_{(\top,n)}$ with $\top(x, y) = x \cdot y$ and n = 2.

10. Conclusion

In this paper, we have proposed four new types of conditions involving a pair of fuzzy predicates (atomic or compound) P1 and P2, namely “P1 and if possible P2”, “P1 or else P2”, “P1 all the more as P2”, and “P1 all the less as P2”. An axiomatic approach has been followed to propose rational definitions. They have been positioned with respect to existing connectors, especially conjunction, disjunction and implication. These conditions are of interest for retrieval when interactions more sophisticated than conjunction and disjunction are desired. Among perspectives for future work, let us mention:
• the extension of the first two conditions to other norms and co-norms than min and max, for instance product and probabilistic sum,
• the implementation of these operators inside the SQLf language.

References

[1] P. Bosc, O. Pivert, SQLf: a relational database language for fuzzy querying, IEEE Trans. Fuzzy Syst. 3 (1995) 1–17.
[2] P. Bosc, O. Pivert, L. Liétard, A. Mokhtari, Extending relational algebra to handle bipolarity, in: Proceedings of the 25th ACM Symposium on Applied Computing (SAC 2010), Sierre, Switzerland, 2010, pp. 1718–1722.
[3] L. Zadeh, Fuzzy sets, Inf. Control 8 (1965) 338–353.
[4] B. Bouchon-Meunier, J. Yao, Linguistic modifiers and imprecise categories, Int. J. Intell. Syst. 7 (1992) 25–36.
[5] D. Dubois, H. Prade, Weighted minimum and maximum operations in fuzzy set theory, Inf. Sci. 39 (1986) 205–210.
[6] J. Fodor, R. Yager, Fuzzy-set theoretic operators and quantifiers, in: D. Dubois, H. Prade (Eds.), The Handbooks of Fuzzy Sets Series, vol. 1: Fundamentals of Fuzzy Sets, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000, pp. 125–193.
[7] D. Dubois, H. Prade, Bipolarity in flexible querying, in: Proceedings of FQAS’02, 2002, pp. 174–182.
[8] R. Yager, Fuzzy sets and approximate reasoning in decision and control, in: Proceedings of FUZZ-IEEE’92, 1992, pp. 415–428.
[9] G. Bordogna, G. Pasi, A fuzzy query language with a linguistic hierarchical aggregator, in: Proceedings of ACM Symposium on Applied Computing (SAC’96), 1994, pp. 184–187.
[10] D. Dubois, H. Prade, Using fuzzy sets in flexible querying: why and how, in: Proceedings of the International Conference on Flexible Query Answering Systems (FQAS’96), 1996, pp. 89–103.
[11] J.J. Dujmovic, H. Nagashima, LSP method and its use for evaluation of java IDEs, Int. J. Approx. Reason. 41 (1) (2006) 3–22.
[12] J.J. Dujmovic, Characteristic forms of generalized conjunction/disjunction, in: Proceedings of FUZZ-IEEE’08, IEEE, 2008, pp. 1075–1080.
[13] T. Matthé, G. De Tré, S. Zadrożny, J. Kacprzyk, A. Bronselaer, Bipolar database querying using bipolar satisfaction degrees, Int. J. Intell. Syst. 26 (10) (2011) 890–910.
[14] S. Zadrożny, J. Kacprzyk, Bipolar queries and queries with preferences, in: Proceedings of DEXA’06 Workshops, IEEE Computer Society, 2006, pp. 415–419.
[15] S. Zadrożny, J. Kacprzyk, Bipolar queries using various interpretations of logical connectives, in: P. Melin, O. Castillo, L.T. Aguilar, J. Kacprzyk, W. Pedrycz (Eds.), Proceedings of IFSA, Lecture Notes in Computer Science, vol. 4529, Springer, 2007, pp. 181–190.
[16] J. Kacprzyk, S. Zadrożny, Bipolar queries, and intention and preference modeling: synergy and cross-fertilization, in: Proceedings of the World Conference on Soft Computing (WCSC’11), San Francisco, CA, USA, 2011.
[17] M. Lacroix, P. Lavency, Preferences: putting more knowledge into queries, in: Proceedings of the 13th VLDB Conference, 1987, pp. 217–225.
[18] J. Chomicki, Preference formulas in relational queries, ACM Trans. Database Syst. 28 (2003) 1–40.
[19] D. Dubois, H. Prade, Handling bipolar queries in fuzzy information processing, in: J. Galindo (Ed.), Handbook of Research on Fuzzy Information Processing in Databases, IGI Global, 2008, pp. 97–114.
[20] G. Brewka, S. Benferhat, D.L. Berre, Qualitative choice logic, Artif. Intell. 157 (1–2) (2004) 203–237.
[21] B. Bouchon-Meunier, A. Laurent, M.-J. Lesot, M. Rifqi, Strengthening fuzzy gradual rules through “all the more” clauses, in: Proceedings of FUZZ-IEEE’10, IEEE, 2010, pp. 1–7.
[22] P. Bosc, A. Hadjali, O. Pivert, Empty versus overabundant answers to flexible relational queries, J. Fuzzy Sets Syst. 159 (2008) 1450–1467.
[23] D. Dubois, H. Prade, Inverse operations for fuzzy numbers, in: Proceedings of IFAC Symposium on Fuzzy Information, Knowledge Representation and Decision Analysis, 1983, pp. 391–395.
[24] P. Bosc, O. Pivert, On the evaluation of simple fuzzy relational queries: principles and measures, in: R. Lowen, M. Roubens (Eds.), Fuzzy Logic: State of the Art, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1993, pp. 355–364.
[25] P. Bosc, O. Pivert, SQLf query functionality on top of a regular relational database management system, in: O. Pons, M. Vila, J. Kacprzyk (Eds.), Knowledge Management in Fuzzy Databases, Physica-Verlag, Heidelberg, Germany, 2000, pp. 171–190.