Dependencies among attributes given by fuzzy confirmation measures

Expert Systems with Applications 39 (2012) 7591–7599 Contents lists available at SciVerse ScienceDirect Expert Systems with Applications journal hom...


Expert Systems with Applications 39 (2012) 7591–7599

Contents lists available at SciVerse ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Dependencies among attributes given by fuzzy confirmation measures

Jiří Kupka ⇑, Iva Tomanová

Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, 30. dubna 22, 701 03 Ostrava 1, Czech Republic

Article info

Keywords: Data mining; Linguistic association; Confirmation measures; Association rule; Reduction rule

Abstract

This paper contributes to the theoretical background of data mining, more precisely to fuzzy association analysis. We consider the three most commonly used confirmation measures and study the relations among found and known associations given by them. A good understanding of such relationships is necessary for creating more efficient algorithms, for subsequent work with found associations, and for cooperation with the consumer of the data mining process. Although our motivation arose from the mining of linguistic associations, the properties we found, which coincide with the semantics of mined associations, are valid in general. Some examples showing how to use the obtained properties are also included. © 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Data mining is known as a method for searching for unknown and valid knowledge in large-scale datasets. One of the first data mining methods was the GUHA method, introduced, e.g., in Hájek (1968) (for a comprehensive survey we refer to Hájek and Havránek (1978) and Rauch (2005) and references therein). This work is devoted to so-called fuzzy association analysis, which is one of the most important directions of data mining (e.g. Chiang, 2011; Delgado, Marín, Sánchez, & Vila, 2003, etc.). Recently, Novák et al. (2008) presented a theoretical background for, and the mining of, associations expressible in natural language (i.e. associations of the form "IF the area of the base of a cylinder is big AND the height of this cylinder is also big THEN the volume of this cylinder is big.") via the GUHA method. In Novák et al. (2008) the model of evaluative linguistic expressions is used for mining so-called linguistic associations (see also Lin, Hong, & Lu, 2010 for a parallel approach).

Then, in Kupka and Tomanová (2010), another mathematical model extending the model from Novák et al. (2008) was introduced. This model is based on fuzzy partitions or fuzzy coverings, respectively. The main advantage of this approach lies in the interpretability of found associations: associations obtained as results of the data mining process can be interpreted in natural language, and therefore they are understandable not only to experts in data mining methods but also, e.g., to the end users who set up the data mining task. Our motivation was to go further in this direction and, e.g., to allow direct cooperation with the end user in the data mining process, or to use certain expert knowledge automatically, and so on. That is one of the reasons leading us to study relations among various associations. Additionally, a good understanding of the inner structure of a given dataset is also necessary for creating efficient algorithms searching the dataset, for responsible subsequent work with found associations (as suggested in Novák et al. (2008) and Kupka and Tomanová (2010)), as well as for potentially creating a sound and complete system of associations (e.g. Bělohlávek & Vychodil, 2006), etc. Similar tasks were already studied for "crisp" methods; the GUHA method can serve as an example (see Hájek & Havránek, 1978). However, except for Glass (2008), there is no survey paper devoted to this task when fuzzy confirmation measures are taken into consideration.

Within this work we consider the three most commonly used confirmation measures, the use of which was mathematically justified in Dubois, Hüllermeier, and Prade (2006) (see also Glass, 2008 for further information). For each of the considered confirmation measures we study nine properties (see the beginning of Section 4). Six of them are motivated by the so-called Armstrong axioms that, among other things, are used for database design (see, e.g. Armstrong, 1974) and are also valid in the fuzzy attribute logic developed, e.g., in Bělohlávek and Vychodil (2006). This logic can be applied to similar data sets and, under some assumptions, establishes a complete and sound system of associations. Thus it was natural to ask under what conditions we can obtain similar relations in ordinary fuzzy association analysis. The remaining properties are motivated by properties that are used, for example, in the GUHA method (Hájek & Havránek, 1978) or in the well-known Apriori algorithm (Agrawal & Srikant, 1994 and also Delgado et al., 2003). Our results are summarized at the end of Section 4. Some properties remain valid when we use standard fuzzy confirmation measures. But we have also obtained some negative results, and we demonstrate that the situation becomes better when we apply some additional (expert, resp. background) knowledge to our properties.

This paper consists of six sections. Section 2 introduces basic terminology. Section 3 gives some preliminary considerations. The chosen properties are studied and the obtained results are provided in Section 4. As usual, some illustrative examples (Section 5) and concluding comments (Section 6) follow.

⇑ Corresponding author. E-mail addresses: [email protected] (J. Kupka), [email protected] (I. Tomanová).
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.11.125

2. Preliminaries

In this section we introduce some basic notions that are necessary for the mining of linguistic associations (e.g. Novák, Perfilieva, & Močkoř, 1999). Let $U$ be a nonempty set called the universe and let $\mathbb{R}$ denote the set of real numbers. A fuzzy set $A$ in $U$ is a function $A: U \to I$, where $I := [0, 1]$. The function $A$ is sometimes called a membership function and, for $x \in U$, the value $A(x)$ represents the membership degree of the element $x$ in $A$. Further, a t-norm is a binary function $\otimes: I \times I \to I$ that satisfies the following conditions for all $x, y, z \in I$:

1. $\otimes(x, y) = \otimes(y, x)$,
2. $\otimes(x, \otimes(y, z)) = \otimes(\otimes(x, y), z)$,
3. if $x \le y$ then $\otimes(x, z) \le \otimes(y, z)$,
4. $\otimes(0, x) = 0$ and $\otimes(1, x) = x$.
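The four t-norm conditions can be checked numerically. The following is a minimal Python sketch, not part of the original paper: the function names (`t_min`, `t_prod`, `t_luk`, `dual_conorm`, `satisfies_tnorm_axioms`) and the test grid `GRID` are illustrative assumptions. It tests commutativity, associativity, monotonicity and the boundary conditions on a finite grid of $[0, 1]$, and builds the dual t-conorm $1 - \otimes(1 - x, 1 - y)$.

```python
import itertools

def t_min(x, y):
    """Minimum t-norm."""
    return min(x, y)

def t_prod(x, y):
    """Product t-norm."""
    return x * y

def t_luk(x, y):
    """Lukasiewicz t-norm max(x + y - 1, 0)."""
    return max(x + y - 1.0, 0.0)

def dual_conorm(tnorm):
    """Return the t-conorm (x, y) -> 1 - tnorm(1 - x, 1 - y)."""
    def conorm(x, y):
        return 1.0 - tnorm(1.0 - x, 1.0 - y)
    return conorm

def satisfies_tnorm_axioms(op, grid, tol=1e-9):
    """Check conditions 1-4 on a finite grid (a sanity check, not a proof)."""
    for x, y, z in itertools.product(grid, repeat=3):
        if abs(op(x, y) - op(y, x)) > tol:                # 1. commutativity
            return False
        if abs(op(x, op(y, z)) - op(op(x, y), z)) > tol:  # 2. associativity
            return False
        if x <= y and op(x, z) > op(y, z) + tol:          # 3. monotonicity
            return False
    # 4. boundary conditions
    return all(abs(op(0.0, x)) <= tol and abs(op(1.0, x) - x) <= tol for x in grid)

# Hypothetical test grid 0.0, 0.1, ..., 1.0
GRID = [i / 10 for i in range(11)]
```

A grid check of this kind can only refute the conditions, never prove them; for instance, the arithmetic mean fails the boundary conditions and is rejected.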

For a given t-norm $\otimes$ we can establish its corresponding t-conorm $\oplus: I \times I \to I$ given by

$$\oplus(x, y) = 1 - \otimes(1 - x, 1 - y).$$

Example 1. For instance, $\otimes(x, y) = \min\{x, y\}$ is a t-norm and $\oplus(x, y) = \max\{x, y\}$ is its t-conorm. Further, the ordinary product $\otimes(x, y) = x \cdot y$ is a t-norm and $\oplus(x, y) = x + y - xy$ is its t-conorm. Finally, $\otimes(x, y) = \max\{x + y - 1, 0\}$ is the Łukasiewicz t-norm.

2.1. The original way of mining of linguistic associations

2.1.1. Evaluative linguistic expressions

In this subsection we briefly present some notions we will use in Section 5. For a given interval $[a, b] \subseteq \mathbb{R}$, called a context, we can use evaluative linguistic expressions, which allow us to use several natural language expressions. For instance, the atomic linguistic expressions Big (Bi), Medium (Me) or Small (Sm) can be used, or we can compose atomic linguistic expressions with various linguistic hedges (introduced by Zadeh): Extremely (Ex), Significantly (Si), Very (Ve), More or Less (ML), Roughly (Ro), Quite Roughly (QR) and Very Roughly (VR). For their precise mathematical representation via fuzzy sets (Model I) we refer to Novák et al. (2008). Very recently (Kupka & Tomanová, 2010) another mathematical model (Model II) of evaluative linguistic expressions was established. This mathematical model is based on fuzzy coverings and partitions and, occasionally, it is necessary to use another form of linguistic expressions, namely so-called specifying linguistic expressions of the form "Si Sm but not Ex Sm". We would like to emphasize that all evaluative linguistic expressions are represented by fuzzy sets. For more details we again refer to Kupka and Tomanová (2010).

2.2. The original approach

Now we briefly recall how we can mine for (linguistic) associations by using the method from Novák et al. (2008). Consider a real-valued two-dimensional table $D$ of the form

$$D = \begin{array}{c|ccc} & X_1 & \cdots & X_k \\ \hline o_1 & a_{11} & \cdots & a_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ o_m & a_{m1} & \cdots & a_{mk} \end{array}$$

where any real number $a_{ij} \in \mathbb{R}$ is the value of the $j$th attribute (property) $X_j$ on the $i$th object (observation, transaction) $o_i$. Let $D_o$ denote the set of rows of $D$. Now we can look for dependencies between given disjoint sets of attributes $\{X_i\}_{i=1}^{p}, \{Y_j\}_{j=1}^{q} \subseteq \{X_i\}_{i=1}^{k}$. For each attribute $X_i$ its context $[a_i, b_i]$ can be defined, and for each context we can define relevant evaluative linguistic expressions represented by fuzzy sets (see footnote 1). Then we can look for linguistic associations of the form

$$E(\{X_i\}_{i=1}^{p}) \Rightarrow F(\{Y_j\}_{j=1}^{q}), \quad (1)$$

where $E$, $F$ represent evaluative expressions containing only the connective AND. Let us recall that AND can be represented by a chosen t-norm. The left and right sides of (1) are called the antecedent and the succedent, respectively. The symbol $\Rightarrow$ (called a quantifier) can be represented as an implication, resp. another relationship between an antecedent and a succedent. Before using GUHA quantifiers (see the next example) it is necessary to apply the process of discretization (this also applies to the method from Kupka and Tomanová (2010)). This allows us to construct the standard contingency table for $F$ and $E$:

$$\begin{array}{c|cc} & F & \text{not } F \\ \hline E & a & b \\ \text{not } E & c & d \end{array}$$

Example 2. Here two examples of quantifiers are presented:

$\Rightarrow := \Rightarrow_{x}$, a symmetric associational quantifier. This quantifier is valid if $ad > bc$.

$\Rightarrow := \Rightarrow^{\gamma}_{\alpha}$, a binary multitudinal quantifier. This quantifier is taken as true if $a > \gamma(a + b)$ and $\frac{a}{m} > \alpha$, where $\gamma \in [0, 1]$ is a given confidence degree and $\alpha \in [0, 1]$ is a support degree.

As above, for details we refer to Novák et al. (2008) and Kupka and Tomanová (2010).

2.3. Fuzzy approach

There are also other ways to search for linguistic associations (1) and also for more sophisticated associations. While using GUHA quantifiers (and computing fourfold tables) requires crisp partitions (induced by the respective fuzzy sets) of the contexts of the considered attributes, there are methods that allow us to work directly with fuzzy sets carrying linguistic labels. Having such fuzzy sets defined for each attribute (e.g. as in Section 2.1.1), we can consider associations of the form

$$E(\{X_i\}_{i=1}^{p}) \Rightarrow F(\{Y_j\}_{j=1}^{q}). \quad (2)$$

Footnote 1. For completeness, in Novák et al. (2008), evaluative linguistic expressions considered in a given context are called evaluative linguistic predications. We omit this notation in order to simplify this paper.

For simplicity we will write only $E \Rightarrow F$ instead of (2). In (2), $F$, resp. $E$, can represent various conjunctions (given by t-norms) and disjunctions (given by t-conorms) of the respective fuzzy sets, and $\Rightarrow$ represents a relationship between $E$ and $F$ given by chosen confirmation measures (see below). There are several ways to choose them. In Dubois et al. (2006), this problem was studied systematically and the choices of various confirmation measures were justified, especially in connection with a certain and very natural partition of the row set $D_o$. The partition considered therein is given by the condition

$$S^{+}(o_i) + S^{-}(o_i) + S^{\pm}(o_i) = 1 \quad \text{for any } o_i \in D_o, \quad (3)$$

where $S^{+}(o_i)$, $S^{-}(o_i)$, $S^{\pm}(o_i)$ denote a positive, negative and irrelevant evaluation, respectively, of each row $o_i \in D_o$ and a given rule (2). The question in Dubois et al. (2006) was how to choose a t-norm $\otimes$ (a t-conorm is then given by $\otimes$) and an implication $\to$ in order to guarantee (3) for any possible rule (2). Let us summarize the possibilities we have. This problem has a positive solution (Dubois et al., 2006): when $\otimes$ is a so-called copula (e.g. the Łukasiewicz t-norm is the smallest copula), the expression

$$a \to b = (1 - a) + (a \otimes b)$$

defines an implication operator $\to$. In this case, the partition of $D_o$ is given by

$$S^{+}(o_i) := E(o_i) \otimes F(o_i), \qquad S^{-}(o_i) := 1 - (E(o_i) \to F(o_i)), \qquad S^{\pm}(o_i) := 1 - E(o_i). \quad (4)$$

For this partition, the following (t-norm-based) support measure of (2) is considered in the data set $D$:

$$\mathrm{supp}_t(E \Rightarrow F) := \sum_{o_i \in D_o} E(o_i) \otimes F(o_i). \quad (5)$$
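The partition (4) and the support measure (5) can be illustrated with a small computation. The sketch below is only an illustration under stated assumptions: the function names and the membership columns `E_COL` and `F_COL` are hypothetical and not taken from the paper. For a copula-type t-norm it evaluates the implication $a \to b = (1 - a) + (a \otimes b)$, the three parts of the partition, and the t-norm-based support.

```python
def t_min(x, y):
    return min(x, y)

def t_prod(x, y):
    return x * y

def t_luk(x, y):
    return max(x + y - 1.0, 0.0)

def copula_implication(tnorm, a, b):
    """Implication a -> b = (1 - a) + tnorm(a, b) used with the partition (4)."""
    return (1.0 - a) + tnorm(a, b)

def partition(tnorm, e, f):
    """Positive, negative and irrelevant evaluation of one row, as in (4)."""
    s_plus = tnorm(e, f)
    s_minus = 1.0 - copula_implication(tnorm, e, f)
    s_irrelevant = 1.0 - e
    return s_plus, s_minus, s_irrelevant

def supp_t(tnorm, E, F):
    """t-norm-based support (5): sum of tnorm(E(o_i), F(o_i)) over all rows."""
    return sum(tnorm(e, f) for e, f in zip(E, F))

# Hypothetical membership degrees of antecedent E and succedent F on three rows
E_COL = [0.9, 0.4, 0.0]
F_COL = [0.8, 0.7, 1.0]

for name, tnorm in [("min", t_min), ("product", t_prod), ("Lukasiewicz", t_luk)]:
    row_sums = [sum(partition(tnorm, e, f)) for e, f in zip(E_COL, F_COL)]
    print(name, row_sums, supp_t(tnorm, E_COL, F_COL))
```

With this implication the three parts sum to one by construction, since $S^{-}(o_i) = E(o_i) - E(o_i) \otimes F(o_i)$; the printed row sums only confirm this up to floating-point rounding.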

However, the problem in Dubois et al. (2006) can be further specified. For instance, when the implication operator $\to$ is required to be a self implication (i.e. $a \to a = 1$), $\otimes$ has to be the minimum t-norm; when $\to$ is required to be a strong implication (i.e. $a \to b = n(a) \oplus b$ for a strong negation $n$), the only solution is the product t-norm. For completeness, the partition is again given by (4). The authors of Dubois et al. (2006) discussed the same problem for so-called gradual rules $E \Rightarrow F$. It should be emphasized that gradual rules $E \Rightarrow F$ are usually interpreted as "The more the property E is true, the more the conclusion F is true" and are given by a t-norm and its residuated implication, i.e.

$$a \to b := \sup\{c \mid a \otimes c \le b\}.$$

Example 3. The product t-norm $\otimes(x, y) = xy$ has the following residuated implication (called the product implication): $x \to y = y/x$ for $y < x$, otherwise $x \to y = 1$.

When we consider gradual (resp. certainty) rules, another definition of the partition of $D_o$ has to be considered for $E \Rightarrow F$ (motivated by Hüllermeier's approach; Hüllermeier, 2001):

$$S^{+}(o_i) := E(o_i) \otimes (E(o_i) \to F(o_i)), \qquad S^{-}(o_i) := E(o_i) \otimes (1 - (E(o_i) \to F(o_i))), \qquad S^{\pm}(o_i) := 1 - E(o_i). \quad (6)$$

Then the problem is solvable only for the product t-norm. Consequently, the following (implication-based) support measure is taken for (2):

$$\mathrm{supp}_c(E \Rightarrow F) := \sum_{o_i \in D_o} E(o_i) \otimes (E(o_i) \to F(o_i)), \quad (7)$$

where $\to$ represents any generalized implication. Recall that it has been shown in Dubois and Prade (1996) that so-called gradual and certainty rules (i.e. rules checked by (7) and (9)) can be modeled by generalized implication operators. More precisely, an implication operator $\mathcal{I}: I \times I \to I$ is a generalization of the material implication when it satisfies, for $x, y, x', y' \in I$,

(I1) $\mathcal{I}(x, y) \le \mathcal{I}(x', y)$ for $x' \le x$,
(I2) $\mathcal{I}(x, y) \le \mathcal{I}(x, y')$ for $y \le y'$,
(I3) $\mathcal{I}(1, y) = y$, and
(I4) $\mathcal{I}(1, 0) = 0$, $\mathcal{I}(0, 0) = 1$.

In order to keep the preceding notation we put $\to := \mathcal{I}$. Finally, if $\otimes$ is a continuous t-norm and $\to$ is derived from that t-norm through residuation, the (minimum-based) support measure is computed as

$$\mathrm{supp}_m(E \Rightarrow F) := \sum_{o_i \in D_o} \min\{E(o_i), F(o_i)\}. \quad (8)$$

Having the support measures (5), (7) and (8) in mind, we can define the respective confidence measures by

$$\mathrm{conf}_j(E \Rightarrow F) := \frac{\mathrm{supp}_j(E \Rightarrow F)}{\sum_{o_i \in D_o} E(o_i)} \quad (9)$$

for $j \in \{t, c, m\}$. Note that (9) cannot be strictly greater than 1 for any association $E \Rightarrow F$. When a confirmation measure is fixed, or equivalently, when we fix both the support and the confidence measure, we say that a given rule $E \Rightarrow F$ is valid if its support and confidence degrees are greater than or equal to the given support and confidence thresholds. For given rules $E_1 \Rightarrow F_1$ and $E_2 \Rightarrow F_2$, $E_1 \Rightarrow F_1 \vdash_s E_2 \Rightarrow F_2$ denotes the fact that $\mathrm{supp}(E_1 \Rightarrow F_1) \le \mathrm{supp}(E_2 \Rightarrow F_2)$; similarly, $E_1 \Rightarrow F_1 \vdash_c E_2 \Rightarrow F_2$ denotes the fact that $\mathrm{conf}(E_1 \Rightarrow F_1) \le \mathrm{conf}(E_2 \Rightarrow F_2)$; and finally $E_1 \Rightarrow F_1 \vdash E_2 \Rightarrow F_2$ means that both $E_1 \Rightarrow F_1 \vdash_s E_2 \Rightarrow F_2$ and $E_1 \Rightarrow F_1 \vdash_c E_2 \Rightarrow F_2$ hold. We use analogous notation also for sets of associations:

$$A \Rightarrow B,\; C \Rightarrow D \vdash E \Rightarrow F$$

means that $E \Rightarrow F$ is valid, i.e. it has higher support and confidence degrees than both $A \Rightarrow B$ and $C \Rightarrow D$, etc.

At the very end of this section we would like to stress that the linguistic connective AND is represented by the t-norm used in the support measures (5), (7) and (8), and OR is represented by its t-conorm.

3. Preliminary results

3.1. Expert knowledge

Sometimes we will work with a set $\mathcal{E}$ of associations that represent some additional knowledge (i.e. the expert knowledge) provided to the data mining process. In this work we study how we can use this knowledge in the association mining process and also, e.g., for the presentation of found associations, since it is useless to present previously known associations as a result of our effort. Let us stress that we do not specify the inner structure of such expert associations (i.e. associations from $\mathcal{E}$) in advance. For given confirmation measures $\mathrm{supp}_i$ and $\mathrm{conf}_i$, for some $i \in \{m, c, t\}$, and an unknown association $E \Rightarrow F$ we would like to test, associations from $\mathcal{E}$ (notation $A \Rightarrow^{*} B$) can describe information only within the antecedent and the succedent part of $E \Rightarrow F$, respectively, as well as some additional knowledge between the antecedent and the succedent part of the rule simultaneously. We would like to emphasize that, within this paper, associations from $\mathcal{E}$ are those that are fully valid in the dataset $D$, i.e. we assume

$$\mathrm{conf}_i(A \Rightarrow^{*} B) = 1.$$

Depending on the choice of the confidence measure, we obtain some additional information. For a t-norm-based confidence measure (and hence for the minimum-based one), it is easy to see that the last expression is satisfied only if $A(o_i) \le B(o_i)$ for each $o_i \in D_o$. When an implication-based confidence measure is considered, we can obtain the same condition provided $\to$ is a residuated implication of some t-norm. But if $\to$ is a generalized implication, then only $B(o_i) = 1$ for any $o_i \in D_o$ can be assumed.

We can also use associations of the form $C \Rightarrow C$. Considering such associations is very natural, since their confidence degree is always equal to 1. Consequently, the validity of this association implies that the linguistic expression represented by $C$ has a sufficiently large support.

3.2. Further results

The following lemma will be used several times in this paper. It describes a rather natural property, but we include it for the sake of completeness. We say that data sets $D_l$, $l = 1, 2, \ldots, r$, are of the same type if they have the same attributes $X_i$, $i = 1, 2, \ldots, k$, possessing the same contexts $w_i$, $i = 1, 2, \ldots, k$, and the fuzzy sets evaluating the various linguistic terms of the attributes $X_i$, $i = 1, 2, \ldots, k$, are also the same. Let $m_l$ denote the number of objects in each $D_l$, $l = 1, 2, \ldots, r$. For such data sets we can define another data set $D := \bigoplus_{l=1}^{r} D_l$, called the direct join, by joining the data tables $D_l$ into a single one. Then, for example, for $D_1$ and $D_2$, $D$ has $m = m_1 + m_2$ objects, the first $m_1$ objects of $D$ come from $D_1$, the following $m_2$ objects of $D$ come from $D_2$, etc. The following lemma claims that the validity of a given rule in each particular data set ensures the validity of the rule in the direct join of such data sets. Let us stress that this lemma is independent of the choice of support measure.

Lemma 1. Let $D_l$, $l = 1, 2, \ldots, r$, be data sets of the same type and let a rule $A \Rightarrow B$ be valid in each $D_l$. Then $A \Rightarrow B$ is also valid in $D = \bigoplus_{l=1}^{r} D_l$.

Proof. Consider any support measure $\mathrm{supp}(A \Rightarrow B)$ and the confidence measure (9). By $(D_o)_l$ we denote the rows of $D$ coming from $D_l$. Then we use $\mathrm{supp}^{l}(A \Rightarrow B)$ and $\mathrm{conf}^{l}(A \Rightarrow B)$ to denote that the support and confidence measures are computed using only the rows $(D_o)_l$. According to our assumptions, for a given confidence degree $\gamma$ and support degree $\sigma$, we have

$$\mathrm{supp}^{l}(A \Rightarrow B) = \sum_{o_i \in (D_o)_l} A(o_i) \otimes B(o_i) = \sigma_l \ge \sigma \quad (10)$$

and

$$\mathrm{conf}^{l}(A \Rightarrow B) = \frac{\sum_{o_i \in (D_o)_l} A(o_i) \otimes B(o_i)}{\sum_{o_i \in (D_o)_l} A(o_i)} = \frac{\sigma_l}{\sum_{o_i \in (D_o)_l} A(o_i)} = \gamma_l \ge \gamma \quad (11)$$

for any $l = 1, 2, \ldots, r$. To simplify the proof put $A_l = \sum_{o_i \in (D_o)_l} A(o_i)$. By the definition of $D$ and (10) we immediately have

$$\mathrm{supp}(A \Rightarrow B) = \sum_{l=1}^{r} \mathrm{supp}^{l}(A \Rightarrow B) = \sum_{l=1}^{r} \sigma_l \ge \sigma_l \ge \sigma. \quad (12)$$

As regards the confidence degree, (11) implies that

$$\gamma A_l \le \sigma_l \quad \text{for any } l = 1, 2, \ldots, r,$$

and hence

$$\sum_{l=1}^{r} \gamma A_l = \gamma \sum_{l=1}^{r} A_l \le \sum_{l=1}^{r} \sigma_l.$$

Consequently, by the choice of $D$, $\sigma_l$ and $A_l$, resp.,

$$\mathrm{conf}(A \Rightarrow B) = \frac{\sum_{l=1}^{r} \sigma_l}{\sum_{l=1}^{r} A_l} \ge \gamma.$$

This and (12) finish the proof. □

Remark 1. We can also use the last lemma in the following way: in order to check the validity of the rule $A \Rightarrow B$ in $D$, it is sufficient to decompose the data set $D$ into smaller data sets $D_i$ and to check the validity of $A \Rightarrow B$ in each particular $D_i$.

4. Properties

In this section we study the following properties:

P1 $(A \text{ OR } B) \Rightarrow A$,
P2 $A \Rightarrow B,\; (B \text{ OR } C) \Rightarrow D \vdash (A \text{ OR } C) \Rightarrow D$,
P3 $A \Rightarrow B \vdash (C \text{ AND } A) \Rightarrow (C \text{ AND } B)$,
P4 $(A \Rightarrow B),\; (A \Rightarrow C) \vdash (A \Rightarrow (B \text{ OR } C))$,
P5 $A \Rightarrow (B \text{ OR } C) \vdash A \Rightarrow B$,
P6 $A \Rightarrow B,\; B \Rightarrow C \vdash A \Rightarrow C$,
P7 $A \Rightarrow B,\; C \Rightarrow C \vdash (C \text{ OR } A) \Rightarrow (C \text{ OR } B)$,
P8 $A \Rightarrow B,\; C \Rightarrow D \vdash (A \text{ OR } C) \Rightarrow (B \text{ OR } D)$,
P9 $(A \text{ AND } B) \Rightarrow (C \text{ AND } D) \vdash (A \text{ AND } B \text{ AND } D) \Rightarrow C$.

Properties P1–P6 are motivated by axioms and inference rules used in database design (see, e.g. Armstrong, 1974). It should also be mentioned that the same rules are valid in the fuzzy attribute logic elaborated, e.g., in Bělohlávek and Vychodil (2006). In this logic, the above-mentioned properties can define a sound and complete system of associations. Since the authors of Bělohlávek and Vychodil (2006) deal with a very similar dataset, and since there could be an easy way to transform a data set considered here into a dataset satisfying their assumptions, it was natural to ask whether similar properties can be valid when we use other confirmation measures. Further, Properties P7–P9 are motivated by analogous properties that are used, e.g., in the GUHA method or in the classic Apriori algorithm (see Agrawal & Srikant, 1994 and references therein). So, let us continue our study property by property.

4.1. Property P1

As regards the property $(A \text{ OR } B) \Rightarrow A$ for the considered confidence measures, it is easy to check that this property need not be satisfied in general for given confirmation thresholds, even if the support of the expression $A \text{ OR } B$ is sufficiently large (e.g. when $(A \text{ OR } B) \Rightarrow (A \text{ OR } B)$ holds). However, for minimum-based confirmation measures we can easily obtain the following lemma, which claims that at least one of the associations $(A \text{ OR } B) \Rightarrow A$, $(A \text{ OR } B) \Rightarrow B$ is valid for a reasonably large confidence degree.

Lemma 2.

$$\mathrm{conf}_m((A \text{ OR } B) \Rightarrow B) + \mathrm{conf}_m((A \text{ OR } B) \Rightarrow A) \ge 1. \quad (13)$$

Proof. According to Lemma 1 we may assume $A(o_i) \ne B(o_i)$ for all $o_i \in D_o$. We further decompose $D_o$ into $D_1 = \{o_i \in D_o \mid A(o_i) < B(o_i)\}$ and $D_2 = D_o \setminus D_1$. Clearly, for any $o_i \in D_o$,

$$\min\{\max\{A(o_i), B(o_i)\}, B(o_i)\} = B(o_i) \quad (14)$$

and

$$\min\{\max\{A(o_i), B(o_i)\}, A(o_i)\} = A(o_i). \quad (15)$$

Then, by (14),

$$\mathrm{conf}_m((A \text{ OR } B) \Rightarrow B) = \frac{\sum_{o_i \in D_o} B(o_i)}{\sum_{o_i \in D_1} B(o_i) + \sum_{o_i \in D_2} A(o_i)}$$

and from (15) we get

$$\mathrm{conf}_m((A \text{ OR } B) \Rightarrow A) = \frac{\sum_{o_i \in D_o} A(o_i)}{\sum_{o_i \in D_1} B(o_i) + \sum_{o_i \in D_2} A(o_i)}.$$

Then clearly,

$$(13) = \frac{\sum_{o_i \in D_o} B(o_i) + \sum_{o_i \in D_o} A(o_i)}{\sum_{o_i \in D_1} B(o_i) + \sum_{o_i \in D_2} A(o_i)} \ge 1,$$

since $D_o = D_1 \cup D_2$. □
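Lemma 1 and Lemma 2 can be sanity-checked on random data. The following Python sketch is a numerical illustration under stated assumptions, not a proof: it uses the minimum-based measures (8) and (9), represents OR by the maximum, models the direct join of two data sets as concatenation of their membership columns, and all function names are hypothetical.

```python
import random

def supp_m(E, F):
    """Minimum-based support (8)."""
    return sum(min(e, f) for e, f in zip(E, F))

def conf_m(E, F):
    """Minimum-based confidence (9); assumes the antecedent column is not all zero."""
    return supp_m(E, F) / sum(E)

def fuzzy_or(A, B):
    """Row-wise OR represented by the maximum t-conorm."""
    return [max(a, b) for a, b in zip(A, B)]

def lemma2_sum(A, B):
    """Left-hand side of (13)."""
    a_or_b = fuzzy_or(A, B)
    return conf_m(a_or_b, A) + conf_m(a_or_b, B)

def random_column(rng, n):
    """n random membership degrees, kept away from 0 to avoid division by zero."""
    return [rng.uniform(0.01, 1.0) for _ in range(n)]

def check_lemmas(trials=200, seed=0, tol=1e-12):
    rng = random.Random(seed)
    for _ in range(trials):
        n1, n2 = rng.randint(1, 6), rng.randint(1, 6)
        A1, B1 = random_column(rng, n1), random_column(rng, n1)
        A2, B2 = random_column(rng, n2), random_column(rng, n2)
        # Lemma 2: the two confidences sum to at least 1
        if lemma2_sum(A1, B1) < 1.0 - tol:
            return False
        # Lemma 1: support and confidence on the direct join are not worse
        # than the weaker of the two parts
        joined_A, joined_B = A1 + A2, B1 + B2
        if supp_m(joined_A, joined_B) < min(supp_m(A1, B1), supp_m(A2, B2)) - tol:
            return False
        if conf_m(joined_A, joined_B) < min(conf_m(A1, B1), conf_m(A2, B2)) - tol:
            return False
    return True
```

For example, with the hypothetical columns `A = [0.2, 0.9]` and `B = [0.6, 0.3]` the left-hand side of (13) is $(1.1 + 0.9)/1.5 = 4/3$.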

4.2. Property P2

Let us study the property

$$A \Rightarrow B,\; (B \text{ OR } C) \Rightarrow D \vdash (A \text{ OR } C) \Rightarrow D.$$

The following two examples demonstrate that this property does not hold for all t-norms. Our counterexamples demonstrate that the validity of this property can be violated by both confidence and support thresholds.

Example 4. Consider the product t-norm $\otimes(x, y) = xy$ and its t-conorm $\oplus(x, y) = 1 - (1 - x)(1 - y)$. Then, for all $o_i \in D_o$, we can have $A(o_i) = 0.2$, $B(o_i) = 0.7$, $C(o_i) = 0.1$, $D(o_i) = 0.4$. For rows $o_i$ we have

$$(A(o_i) \oplus C(o_i)) \otimes D(o_i) = 0.112 < A(o_i) \otimes B(o_i) = 0.14$$

and

$$(A(o_i) \oplus C(o_i)) \otimes D(o_i) = 0.112 < (B(o_i) \oplus C(o_i)) \otimes D(o_i) = 0.292.$$

Since an arbitrarily large dataset possessing such rows can be constructed, we obtain

$$A \Rightarrow B,\; (B \text{ OR } C) \Rightarrow D \nvdash_s (A \text{ OR } C) \Rightarrow D.$$

Example 5. In this example we demonstrate that

$$A \Rightarrow B,\; (B \text{ OR } C) \Rightarrow D \nvdash_c (A \text{ OR } C) \Rightarrow D$$

for $\otimes(x, y) = x \cdot y$. We construct a table $D$ with two rows $o_1$, $o_2$ and fuzzy sets $A$, $B$, $C$, $D$ for which $A(o_1) = 7/8$, $A(o_2) = 1/8$, $B(o_1) = 5/6$, $B(o_2) = 1/6$, $C(o_1) = C(o_2) = 1/2$ and $D(o_1) = 1/4$, $D(o_2) = 3/4$. Let $\gamma = 0.37$ be a confidence degree. Then, by direct calculation,

$$\mathrm{conf}((A \text{ OR } C) \Rightarrow D) = \frac{13}{33} < \gamma,$$

but the confidence degrees of $A \Rightarrow A$, $A \Rightarrow B$, $(B \text{ OR } C) \Rightarrow D$, $C \Rightarrow D$ and $D \Rightarrow D$ are greater than $\gamma$. It is easy to construct an arbitrarily large data set possessing the same properties.

4.3. Property P3

Let us study the rule

$$A \Rightarrow B \vdash (C \text{ AND } A) \Rightarrow (C \text{ AND } B). \quad (16)$$

It is easy to see that $A \text{ AND } C$ could represent an empty set, and this would violate the validity of (16). Consequently, it could be useful to add some additional assumptions (e.g. $(A \text{ AND } C) \Rightarrow (A \text{ AND } C)$) in order to guarantee that the support measure is sufficiently large. But the following example demonstrates that this attempt need not be successful.

Example 6. Consider a dataset consisting of two rows $o_1$, $o_2$ and the minimum-based support measure. For attributes represented by fuzzy sets $A$, $B$, $C$ with values $A(o_1) = A(o_2) = 0.6$, $B(o_1) = C(o_2) = 0.1$ and $B(o_2) = C(o_1) = 0.9$, we obtain that $\mathrm{supp}_m(A \Rightarrow B) = \mathrm{supp}_m((A \text{ AND } C) \Rightarrow (A \text{ AND } C)) = 0.6$ and $\mathrm{supp}_m((C \text{ AND } A) \Rightarrow (C \text{ AND } B)) = 0.12$. Thus

$$(A \Rightarrow B),\; (A \text{ AND } C) \Rightarrow (A \text{ AND } C) \nvdash_s (C \text{ AND } A) \Rightarrow (C \text{ AND } B).$$

Further, we need to study whether the confidence degree of the rule P3 is sufficiently large. We clarify that this rule is valid under some assumptions for the minimum-based confidence measure. Namely, one can easily check that

$$\mathrm{conf}_m(A \Rightarrow B) > \mathrm{conf}_m((C \text{ AND } A) \Rightarrow (C \text{ AND } B))$$

whenever the inequality $B(o_i) < C(o_i) < A(o_i)$ is satisfied in each row $o_i$ of a given dataset. However, without this inequality we obtain a reasonable result.

Lemma 3. Consider a dataset $D$ such that $B(o_i) < C(o_i) < A(o_i)$ is satisfied for no row $o_i$. Then, for the minimum-based confidence measure, we have

$$A \Rightarrow B \vdash_c (C \text{ AND } A) \Rightarrow (C \text{ AND } B).$$

Proof. By our assumptions, for a given confidence threshold $\gamma$, we have that

$$\mathrm{conf}_m(A \Rightarrow B) = \frac{\sum_{o_i \in D_o} \min\{A(o_i), B(o_i)\}}{\sum_{o_i \in D_o} A(o_i)} \quad (17)$$

is greater than or equal to $\gamma$. We want to prove that

$$\mathrm{conf}_m((A \text{ AND } C) \Rightarrow (B \text{ AND } C)) = \frac{\sum_{o_i \in D_o} \min\{A(o_i), B(o_i), C(o_i)\}}{\sum_{o_i \in D_o} \min\{A(o_i), C(o_i)\}} \quad (18)$$

cannot be smaller than (17). According to Lemma 1 (see also Remark 1), we may decompose $D$ into four disjoint subdatasets according to the following row inequalities: (i) $C(o_i) < A(o_i), B(o_i)$, (ii) $A(o_i) \le B(o_i) \le C(o_i)$, (iii) $B(o_i) < A(o_i) \le C(o_i)$ and (iv) $A(o_i) \le C(o_i) < B(o_i)$. Because the cases (i), (ii) and (iv) lead to (18) = 1, we obtain that (17) is smaller than or equal to (18) on these subdatasets. It remains to explain the case (iii). But then (18) $= \mathrm{conf}_m(A \Rightarrow B)$, and this concludes the proof. □

According to Section 3.1 we immediately obtain the following corollary.

Corollary 1. For the minimum-based confidence degree we have

$$A \Rightarrow^{*} B \vdash_c (C \text{ AND } A) \Rightarrow (C \text{ AND } B).$$

4.4. Property P4

Let us study the rule

$$A \Rightarrow B,\; A \Rightarrow C \vdash A \Rightarrow (B \text{ OR } C).$$

As we can see from the following lemma, the validity of this property is straightforward.

Lemma 4 (P4). Let us consider confirmation measures given by (5), (7), (8) and (9). Then

$$A \Rightarrow B,\; A \Rightarrow C \vdash A \Rightarrow (B \text{ OR } C).$$

Proof. Consider first the t-norm-based confirmation measures given by (9) and (5). Let the assumptions be valid, i.e. the associations $A \Rightarrow B$ and $A \Rightarrow C$ are valid for given confidence and support degrees $\gamma$ and $\sigma$. In particular, for $A \Rightarrow B$ we have

$$\mathrm{supp}_t(A \Rightarrow B) := \sum_{o_i \in D_o} A(o_i) \otimes B(o_i) \ge \sigma \quad (19)$$

and

$$\mathrm{conf}_t(A \Rightarrow B) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes B(o_i)}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma. \quad (20)$$

Analogously, for $A \Rightarrow C$ we have

$$\mathrm{supp}_t(A \Rightarrow C) := \sum_{o_i \in D_o} A(o_i) \otimes C(o_i) \ge \sigma \quad (21)$$

and

$$\mathrm{conf}_t(A \Rightarrow C) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes C(o_i)}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma. \quad (22)$$

We want to prove

$$\mathrm{supp}_t(A \Rightarrow (B \text{ OR } C)) := \sum_{o_i \in D_o} A(o_i) \otimes (B(o_i) \oplus C(o_i)) \ge \sigma \quad (23)$$

and

$$\mathrm{conf}_t(A \Rightarrow (B \text{ OR } C)) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes (B(o_i) \oplus C(o_i))}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma, \quad (24)$$

where $\oplus$ means the corresponding t-conorm. It is easy to see that $B(o_i) \le B(o_i) \oplus C(o_i)$ (resp. $C(o_i) \le B(o_i) \oplus C(o_i)$), for any $o_i \in D_o$, implies that (19) (resp. (21)) is smaller than or equal to (23). Consequently, (20) (resp. (22)) is smaller than or equal to (24), which concludes this part of the proof.

Now we consider the minimum-based confirmation measures given by (9) and (8). The proof follows from the previous part, since the t-norm $\otimes$ above can be replaced by the t-norm $\min\{x, y\}$.

Finally we consider the implication-based confirmation measures given by (9) and (7). This means that the following expressions are assumed:

$$\mathrm{supp}_c(A \Rightarrow B) := \sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to B(o_i)) \ge \sigma \quad (25)$$

and

$$\mathrm{conf}_c(A \Rightarrow B) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to B(o_i))}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma. \quad (26)$$

Additionally, also

$$\mathrm{supp}_c(A \Rightarrow C) := \sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to C(o_i)) \ge \sigma, \quad (27)$$

$$\mathrm{conf}_c(A \Rightarrow C) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to C(o_i))}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma. \quad (28)$$

We want to prove

$$\mathrm{supp}_c(A \Rightarrow (B \text{ OR } C)) := \sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to (B(o_i) \oplus C(o_i))) \ge \sigma \quad (29)$$

and

$$\mathrm{conf}_c(A \Rightarrow (B \text{ OR } C)) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes (A(o_i) \to (B(o_i) \oplus C(o_i)))}{\sum_{o_i \in D_o} A(o_i)} \ge \gamma, \quad (30)$$

where $\oplus$ is the corresponding t-conorm. The inequality $B(o_i) \le B(o_i) \oplus C(o_i)$ (resp. $C(o_i) \le B(o_i) \oplus C(o_i)$), for any $o_i \in D_o$, easily follows from basic properties of t-conorms. Thus, by the monotonicity (I2) of the implication in its second argument and the monotonicity of $\otimes$, (25) (resp. (27)) is smaller than or equal to (29). Similarly we obtain that (26) (resp. (28)) is smaller than or equal to (30), i.e. the whole proof is finished. □
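The monotonicity argument behind Property P4 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses the minimum t-norm with maximum as OR for the minimum-based measure, the product t-norm with its dual t-conorm for the t-norm-based measure, and the product t-norm with the product implication of Example 3 for the implication-based measure. All function names are hypothetical. Because the three rules share the antecedent $A$, comparing supports also settles the comparison of confidences.

```python
import random

def t_min(x, y):
    return min(x, y)

def t_prod(x, y):
    return x * y

def product_implication(x, y):
    """Residuated implication of the product t-norm (Example 3)."""
    return 1.0 if x <= y else y / x

def supp_t(tnorm, E, F):
    """t-norm-based support (5); with t_min it equals the minimum-based support (8)."""
    return sum(tnorm(e, f) for e, f in zip(E, F))

def supp_c(tnorm, impl, E, F):
    """Implication-based support (7)."""
    return sum(tnorm(e, impl(e, f)) for e, f in zip(E, F))

def p4_supports(A, B, C):
    """For each measure return (supp(A=>B), supp(A=>C), supp(A=>(B OR C)))."""
    or_max = [max(b, c) for b, c in zip(B, C)]         # t-conorm of min
    or_prod = [b + c - b * c for b, c in zip(B, C)]    # t-conorm of product
    return {
        "m": (supp_t(t_min, A, B), supp_t(t_min, A, C), supp_t(t_min, A, or_max)),
        "t": (supp_t(t_prod, A, B), supp_t(t_prod, A, C), supp_t(t_prod, A, or_prod)),
        "c": (supp_c(t_prod, product_implication, A, B),
              supp_c(t_prod, product_implication, A, C),
              supp_c(t_prod, product_implication, A, or_prod)),
    }

def p4_holds(A, B, C, tol=1e-12):
    """True if supp(A=>(B OR C)) is at least both premise supports for every measure."""
    return all(joint >= max(sb, sc) - tol for sb, sc, joint in p4_supports(A, B, C).values())

def p4_random_check(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 6)
        A, B, C = ([rng.random() for _ in range(n)] for _ in range(3))
        if not p4_holds(A, B, C):
            return False
    return True
```

Applied to the two-row data `A = [0.1, 0.5]`, `B = [0.5, 0.1]`, `C = [0.9, 0.1]`, the product-based entry evaluates to approximately `(0.1, 0.14, 0.19)`.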

The following examples demonstrate that this property does not hold in general even if we deal with some additional assumptions. However, in the subsequent lemma we can specify some sufﬁcient conditions. Within the following examples we illustrate that Property P5 cannot be proved with stronger assumptions (namely, when we require the validity of A ) A, B ) B and C ) C, respectively) and cannot be changed to a rule A ) (B OR C) ‘ A ) B or A ) C. Example 7. We construct a simple data set consisting of just three rows o1, o2 and o3 and we deﬁne fuzzy sets A, B and C as follows – A(o1) = 0.1, A(o2) = 0.5, A(o3) = 0.9, B(o1) = 0.5, B(o2) = 0.1, B(o3) = 0.5, C(o1) = C(o2) = 0.9 and C(o3) = 0.1. Then, for minimum-based support measures, suppm (A ) (B OR C)) = 1.1, suppm (A ) A) = 1.5, suppm (B ) B) = 1.1, suppm (C ) C) = 1.9 and they are greater than both suppm (A ) B) = suppm (A ) C) = 0.7. Example 8. We construct a simple data set consisting of two rows o1, o2 and we deﬁne fuzzy sets A, B and C as follows – A(o1) = 0.1, A(o2) = 0.5, B(o1) = 0.5, B(o2) = 0.1, C(o1) = 0.9 and C(o2) = 0.1. Then suppt (A ) (B OR C)) = 0.19, suppt (A ) A) = suppt (B ) B) = 0.26, suppt (C ) C) = 0.82 and they are greater than suppt (A ) B) = 0.1 and suppt (A ) C) = 0.14 where we consider the product t-norm and its t-conorm is a b = a + b ab. Remark 3. In the last two examples we have clariﬁed some properties for various support measures. Since antecedents of A ) (B OR C) and A ) B are equal to each other we immediately obtain the same results for relevant conﬁdence measures. At the end of this subsection we present a counterexample for implication-based conﬁrmation measures. Example 9. Consider implication-based conﬁrmation measures and fuzzy sets A, B, C deﬁned by A(o1) = 0.99, B(o1) = 0.9, C(o1) = 0.89. Then suppc (A ) B) = 0.9, suppc (A ) C) = 0.89, suppc (A ) B OR C) = 0.989, suppc (B ) C) = 0.89, suppc (C ) B) = 0.89. 
Additionally, confc (A ) B) = 0.909, confc (A ) C) = 0.8989, confc (A ) BORC) = 0.9989, confc (B ) C) = 0.988, and ﬁnally confc (C ) B) = 1. It is clear that arbitrarily large dataset possessing this property can be easily constructed. According to previous examples we can claim, for considered conﬁrmation measures, that

A ) ðB OR CÞ

A)B

and

A ) ðB OR CÞ

A ) B:

4.6. Property P6 In this subsection we consider the property

A ) B; B ) C ‘ A ) C:

ð31Þ

Remark 2. Clearly since Property P4 is valid in general, it can be used also in connection with expert knowledge we consider in our task.

At ﬁrst we demonstrate that this property is not valid in general in the set of mined associations. The next example demonstrates

4.5. Property P5

for all support measures (5), (7) and (8), respectively.

Let us study the property

A ) ðB OR CÞ ‘ A ) B:

A ) B; B ) C

A)C

Example 10. Consider a dataset consisting of two rows o_1, o_2. Let fuzzy sets A, B, C be defined as follows: A(o_1) = 0.2, A(o_2) = 0.1, B(o_1) = 0.5, B(o_2) = 0.3, C(o_1) = 0.1 and C(o_2) = 0.3. Then supp_m(A ⇒ C) = 0.2, supp_m(A ⇒ B) = 0.3 and supp_m(B ⇒ C) = 0.4. Further, supp_t(A ⇒ C) = 0.05, supp_t(A ⇒ B) = 0.13 and supp_t(B ⇒ C) = 0.14, and finally supp_c(A ⇒ C) = 0.04, supp_c(A ⇒ B) = 0.13 and supp_c(B ⇒ C) = 0.1.

Further, the next example shows

$$A \Rightarrow B,\; B \Rightarrow C \nvdash A \Rightarrow C \tag{32}$$

for minimum-based and t-norm-based confidence measures. For the implication-based confidence measure, (32) is justified by Example 14.

Example 11. Consider the dataset from Example 10. Then conf_m(A ⇒ C) = 2/3, which is smaller than conf_m(A ⇒ B) = 1 although not smaller than conf_m(B ⇒ C) = 1/2, and conf_t(A ⇒ C) = 1/6, which is less than both conf_t(A ⇒ B) = 13/30 and conf_t(B ⇒ C) = 7/40.

The following three examples demonstrate that requiring some additional assumptions need not lead to the validity of Property P6.

Example 12. Consider minimum-based confirmation measures and a dataset consisting of two rows. In the dataset, consider fuzzy sets A, B, C having values A(o_1) = A(o_2) = 0.9, B(o_1) = B(o_2) = 0.5 and C(o_1) = C(o_2) = 0.1. Then supp_m(A ⇒ B) = 1, supp_m(B ⇒ C) = 0.2, supp_m(A ⇒ A) = 1.8, supp_m(B ⇒ B) = 1, and supp_m(C ⇒ C) = 0.2. Then supp_m(A ⇒ C) is equal to supp_m(C ⇒ C) and supp_m(B ⇒ C). As regards the confidence degree, conf_m(A ⇒ C) = 0.11 is strictly smaller than conf_m(C ⇒ C) = 1, conf_m(A ⇒ B) = 0.56 and also than conf_m(B ⇒ C) = 0.2.

Example 13. We consider t-norm-based confirmation measures where ⊗(x, y) = xy and a dataset consisting of two rows. Let fuzzy sets A, B, C be defined by A(o_1) = 0.1, A(o_2) = 0.9, B(o_1) = B(o_2) = 0.5, C(o_1) = 0.1 and C(o_2) = 0.9. Then supp_t(A ⇒ B) = 0.5, supp_t(B ⇒ C) = 0.5, supp_t(A ⇒ A) = 0.82, supp_t(B ⇒ B) = 0.5, and supp_t(C ⇒ C) = 0.82. Obviously, supp_t(A ⇒ C) = 0.18 is strictly smaller than the previous support degrees. Hence it would be superfluous to compute the relevant confidence measures.

Example 14. Consider the implication-based support measure with the product implication and take a dataset consisting of two rows. Let fuzzy sets A, B, C be defined by A(o_1) = 0.1, A(o_2) = 0.9, B(o_1) = B(o_2) = 0.5, C(o_1) = 0.1 and C(o_2) = 0.9. Then supp_c(A ⇒ B) = 1, supp_c(B ⇒ C) = 0.2, supp_c(A ⇒ A) = 1.8, supp_c(B ⇒ B) = 1, supp_c(C ⇒ C) = 0.2 and supp_c(A ⇒ C) is equal to supp_c(C ⇒ C) and supp_c(B ⇒ C). For confidence measures we get conf_c(A ⇒ C) = 0.11 and this is smaller than conf_c(A ⇒ B) = 0.56, conf_c(B ⇒ C) = 0.2 and also than conf_c(A ⇒ A) = conf_c(B ⇒ B) = conf_c(C ⇒ C) = 1.

Now we prove a lemma claiming that by using some expert knowledge we can reasonably use Property P6.

Lemma 5. Let us consider confirmation measures given by (9) and (5), (7), (8). Then

$$A \Rightarrow B,\; B \Rightarrow^{*} C \vdash A \Rightarrow C.$$

Proof. First we consider t-norm-based confirmation measures. By our assumptions we have, for support and confidence thresholds r and c,

$$\mathrm{supp}_t(A \Rightarrow B) := \sum_{o_i \in D_o} A(o_i) \otimes B(o_i) \ge r \tag{33}$$

and

$$\mathrm{conf}_t(A \Rightarrow B) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes B(o_i)}{\sum_{o_i \in D_o} A(o_i)} \ge c. \tag{34}$$

We want to prove

$$\mathrm{supp}_t(A \Rightarrow C) := \sum_{o_i \in D_o} A(o_i) \otimes C(o_i) \ge r \tag{35}$$

and

$$\mathrm{conf}_t(A \Rightarrow C) := \frac{\sum_{o_i \in D_o} A(o_i) \otimes C(o_i)}{\sum_{o_i \in D_o} A(o_i)} \ge c. \tag{36}$$

According to Section 3.1, B ⇒* C implies that B(o_i) ≤ C(o_i) for any o_i ∈ D_o. Having this in mind, the monotonicity of ⊗ gives A(o_i) ⊗ B(o_i) ≤ A(o_i) ⊗ C(o_i) for every o_i, so the left-hand side of (33) is smaller than or equal to that of (35) and also the left-hand side of (34) is smaller than or equal to that of (36). This finishes the proof for t-norm-based confirmation measures. For minimum-based confirmation measures the proof would be analogous. When we consider implication-based confirmation measures (9) and (7), we can analogously use the conditions B(o_i) ≤ C(o_i) or B(o_i) = 1 for any o_i ∈ D_o. By using the monotonicity of the ordinary product and of a chosen implication operator, we immediately obtain the required ordering for the support measure. Finally, it is trivial to finish the proof for the implication-based confidence measures. □

Remark 4. As an easy corollary of Lemma 5 we can use an ordinary transitivity (A ⇒* B, B ⇒* C ⊢ A ⇒* C) in the set E. On the other hand, the property (A ⇒* B, B ⇒ C ⊢ A ⇒ C) is not valid in general (see Example 10, where A(o_i) ≤ B(o_i) for any o_i).

4.7. Properties P7 and P8

In this subsection we can return to the original motivation of establishing confirmation measures (9), (5), (7) and (8). Namely, each rule E ⇒ F and given measures of support and confidence define a fuzzy partition on D_o (see again (3)). Consequently, we can speak about a positive, negative and irrelevant part of the rule E ⇒ F (notation S_+(E ⇒ F), S_−(E ⇒ F) and S_±(E ⇒ F)), respectively. Note that each S_i(E ⇒ F), i ∈ {+, −, ±}, is a fuzzy set on D_o. In Dubois et al. (2006) confirmation measures (9), (5), (7) and (8) were established in order to satisfy

$$\mathrm{supp}(E \Rightarrow F) = \sum_{o_i \in D_o} S_{+}(E \Rightarrow F)(o_i)$$

and

$$\mathrm{conf}(E \Rightarrow F) = \frac{\sum_{o_i \in D_o} S_{+}(E \Rightarrow F)(o_i)}{\sum_{o_i \in D_o} \bigl(S_{+}(E \Rightarrow F)(o_i) + S_{-}(E \Rightarrow F)(o_i)\bigr)}.$$

It is easy to see from the last two expressions that having two valid associations E_1 ⇒ F_1, E_2 ⇒ F_2 with "disjoint" positive parts ensures the validity of (E_1 OR E_2) ⇒ (F_1 OR F_2) whenever the linguistic OR is represented by the pointwise maximum. The proof of this fact would be analogous to the proof of Lemma 1. Therefore, we immediately have two rules

$$A \Rightarrow B,\; C \Rightarrow C \vdash (C \text{ OR } A) \Rightarrow (C \text{ OR } B)$$

and

$$A \Rightarrow B,\; C \Rightarrow D \vdash (A \text{ OR } C) \Rightarrow (B \text{ OR } D)$$

for fuzzy sets A, C with disjoint supports.

4.8. Property P9

Let us study the condition

$$(A \text{ AND } B) \Rightarrow (C \text{ AND } D) \vdash (A \text{ AND } B \text{ AND } D) \Rightarrow C.$$

It can be easily proven that this property is valid for the t-norm-based confirmation measures in our considerations.

Lemma 6 (P9). Let us consider the t-norm-based confidence measures given by (5) and (9). Then

$$(A \text{ AND } B) \Rightarrow (C \text{ AND } D) \vdash (A \text{ AND } B \text{ AND } D) \Rightarrow C.$$

Proof. Since the linguistic AND is represented by a given t-norm ⊗, it follows directly from the associativity and commutativity of ⊗ that

$$(A(o_i) \otimes B(o_i)) \otimes (C(o_i) \otimes D(o_i)) = (A(o_i) \otimes B(o_i) \otimes D(o_i)) \otimes C(o_i)$$

for any o_i ∈ D_o. Hence, by the choice of supp_t, we immediately obtain

$$\mathrm{supp}_t((A \text{ AND } B) \Rightarrow (C \text{ AND } D)) = \mathrm{supp}_t((A \text{ AND } B \text{ AND } D) \Rightarrow C).$$

Consequently also

$$\mathrm{conf}_t((A \text{ AND } B) \Rightarrow (C \text{ AND } D)) \le \mathrm{conf}_t((A \text{ AND } B \text{ AND } D) \Rightarrow C)$$

since A(o_i) ⊗ B(o_i) ≥ A(o_i) ⊗ B(o_i) ⊗ D(o_i) for each o_i ∈ D_o. □
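The statement of Lemma 6 can be checked numerically on arbitrary data. The sketch below does so for the product t-norm. It assumes the forms of supp_t and conf_t used in the proof of Lemma 5 (a sum of row-wise products, and that sum divided by the sum of the antecedent); the randomly generated membership degrees and the helper names `conj`, `supp_t` and `conf_t` are illustrative only.

```python
import math
import random
from typing import List, Sequence


def conj(*fuzzy_sets: Sequence[float]) -> List[float]:
    """Row-wise linguistic AND under the product t-norm."""
    result = [1.0] * len(fuzzy_sets[0])
    for fuzzy_set in fuzzy_sets:
        result = [r * v for r, v in zip(result, fuzzy_set)]
    return result


def supp_t(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Assumed t-norm-based support: sum of row-wise products."""
    return sum(a * c for a, c in zip(antecedent, consequent))


def conf_t(antecedent: Sequence[float], consequent: Sequence[float]) -> float:
    """Assumed t-norm-based confidence: support over the sum of the antecedent."""
    return supp_t(antecedent, consequent) / sum(antecedent)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows = 50
# Degrees are kept away from 0 so that the denominators stay positive.
A, B, C, D = ([rng.uniform(0.05, 1.0) for _ in range(n_rows)] for _ in range(4))

supp_before = supp_t(conj(A, B), conj(C, D))    # (A AND B) => (C AND D)
supp_after = supp_t(conj(A, B, D), C)           # (A AND B AND D) => C
conf_before = conf_t(conj(A, B), conj(C, D))
conf_after = conf_t(conj(A, B, D), C)

print("supports equal:", math.isclose(supp_before, supp_after))
print("confidence does not decrease:", conf_before <= conf_after)
```

The supports agree up to floating-point rounding, and the confidence can only grow because the antecedent sum in the denominator shrinks when D is moved to the left-hand side.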

Lemma 7 (P9). Let us consider the minimum-based confidence measures given by (8) and (9). Then

$$(A \text{ AND } B) \Rightarrow (C \text{ AND } D) \vdash (A \text{ AND } B \text{ AND } D) \Rightarrow C.$$

Proof. This result is an immediate consequence of Lemma 6 since it is its special case. □

Remark 5. Since Property P9 is valid in general, it would be superfluous to study how to use associations from E. However, for implication-based confirmation measures, the result is not valid, as demonstrated by the next example.

Example 15. Consider data with attributes represented by fuzzy sets A, B, C, D having values A(o_i) = B(o_i) = 0.1 and C(o_i) = D(o_i) = 0.2. Then for the product t-norm, its t-conorm ⊕(a, b) = a + b − ab and the product implication we obtain supp_c((A AND B) ⇒ (C AND D)) = 0.01. This is greater than supp_c((A AND B AND D) ⇒ C) = 0.002. The last example also demonstrates that using the expert knowledge (i.e. A AND B ⇒* C AND B) need not be useful for us.

4.9. Summary

In this subsection we would like to summarize the obtained results for particular confirmation measures.

For minimum-based confirmation measures we have demonstrated that P1, P3, P5, P6, P7 and P8 are not valid in general. However, we can specify some conditions (for P3, P7 and P8) or expert knowledge (for P3, P4 and P6) in order to guarantee the validity of the considered rule. Finally, P4 and P9 are always valid.

For t-norm-based confirmation measures we have found that P2, P5, P6, P7 and P8 are not valid in general. Similarly as above, we can specify some conditions (for P7 and P8) or some expert knowledge (for P4 and P6) in order to get their validity. And as above, P4 and P9 are valid.

Finally we consider implication-based confirmation measures. For these confirmation measures, Properties P1, P5, P6, P7 and P8 cannot be used in general. On the other hand, P4 and P9 are always valid and for some rules some additional knowledge (for P4 and P6) or other requirements (for P7 and P8) can guarantee their validity.

5. Examples

In this section we present two simple demonstrations showing how the obtained results can be used in the data mining process. For our purposes we work with GUHA quantifiers in order to mine for linguistic associations. We work with a dataset entitled NO2 downloaded from the web page http://lib.stat.cmu.edu/modules.php. We chose a model of evaluative linguistic expressions (see Model I) and the implicational quantifier with parameters r ≥ 0.005 and c ≥ 0.2. Further, A represents an evaluative linguistic expression of an attribute NoCar, B represents an evaluative linguistic expression of an attribute Hour and C is an evaluative linguistic expression of an attribute Y_NO2. Then we obtained the following linguistic associations (among others):

| NoCar ⇒ Hour | Hour ⇒ Y_NO2 | NoCar ⇒ Y_NO2 |
|---|---|---|
| IF NoCar is Bi THEN Hour is QR Bi | IF Hour is QR Bi THEN Y_NO2 is QR Bi | IF NoCar is Bi THEN Y_NO2 is QR Bi |
| IF NoCar is Bi THEN Hour is QR Bi | IF Hour is QR Bi THEN Y_NO2 is VR Bi | IF NoCar is Bi THEN Y_NO2 is VR Bi |
| IF NoCar is Bi THEN Hour is Me | IF Hour is Me THEN Y_NO2 is QR Bi | IF NoCar is Bi THEN Y_NO2 is QR Bi |
| IF NoCar is Bi THEN Hour is Me | IF Hour is Me THEN Y_NO2 is VR Bi | IF NoCar is Bi THEN Y_NO2 is VR Bi |
| IF NoCar is QR Sm THEN Hour is ML Sm | IF Hour is ML Sm THEN Y_NO2 is VR Bi | IF NoCar is QR Sm THEN Y_NO2 is VR Bi |
| IF NoCar is QR Sm THEN Hour is ML Sm | IF Hour is ML Sm THEN Y_NO2 is Me | IF NoCar is QR Sm THEN Y_NO2 is Me |

This table demonstrates that Property P6 is reasonable. We can simplify the data mining process when we obtain a suitable set E possessing associations from the middle column. Then it would be sufficient to mine only for associations from the first column. For example (see the first row), we can mine for "IF NoCar is Bi THEN Hour is QR Bi". If the association "IF Hour is QR Bi THEN Y_NO2 is QR Bi" is in E, we immediately obtain by P6 "IF NoCar is Bi THEN Y_NO2 is QR Bi".

Analogously we demonstrate the validity of Property P9. Now let A be an evaluative linguistic expression of an attribute Hour, B be an evaluative linguistic expression of an attribute DayNumb, C be an evaluative linguistic expression of an attribute Y_NO2 and D be an evaluative linguistic expression of an attribute NoCar. By using the same procedure as above with the same confirmation degrees, we obtain the following associations of the form (A AND B) ⇒ (C AND D):

| IF Hour is | AND DayNumb is | THEN Y_NO2 is | AND NoCar is |
|---|---|---|---|
| Me | Ro Bi | QR Bi | Bi |
| Bi | Sm | VR Bi | Bi |
| Ro Bi | Bi | QR Bi | Ex Bi |
| Ro Sm | Ro Bi | VR Bi | QR Bi |
| Ro Bi | Ro Bi | QR Bi | Ex Bi |

Now, we can either use Property P9 or repeat the mining procedure to get associations of the form (A AND B AND D) ⇒ C, as described in this table:

| IF Hour is | AND DayNumb is | AND NoCar is | THEN Y_NO2 is |
|---|---|---|---|
| Me | Ro Bi | Bi | QR Bi |
| Bi | Sm | Bi | VR Bi |
| Ro Bi | Bi | Ex Bi | QR Bi |
| Ro Sm | Ro Bi | QR Bi | VR Bi |
| Ro Bi | Ro Bi | Ex Bi | QR Bi |
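Deriving the second table from the first one by Property P9 is a purely syntactic step, which can be sketched in a few lines of Python. The `Rule` structure and the functions `apply_p9` and `render` are hypothetical names introduced only for this illustration. The rewrite is justified only for the t-norm-based and minimum-based measures (Lemmas 6 and 7), not for the implication-based ones (Example 15).

```python
from typing import NamedTuple, Tuple

Atom = Tuple[str, str]  # (attribute, evaluative linguistic expression)


class Rule(NamedTuple):
    antecedent: Tuple[Atom, ...]
    consequent: Tuple[Atom, ...]


def apply_p9(rule: Rule) -> Rule:
    """Rewrite (A AND B) => (C AND D) into (A AND B AND D) => C."""
    if len(rule.consequent) != 2:
        raise ValueError("P9 as stated needs a consequent of the form C AND D")
    c, d = rule.consequent
    return Rule(rule.antecedent + (d,), (c,))


def render(rule: Rule) -> str:
    """Format a rule as an IF ... THEN ... linguistic association."""
    lhs = " AND ".join(f"{attr} is {expr}" for attr, expr in rule.antecedent)
    rhs = " AND ".join(f"{attr} is {expr}" for attr, expr in rule.consequent)
    return f"IF {lhs} THEN {rhs}"


# First row of the table of associations of the form (A AND B) => (C AND D).
mined = Rule(
    antecedent=(("Hour", "Me"), ("DayNumb", "Ro Bi")),
    consequent=(("Y_NO2", "QR Bi"), ("NoCar", "Bi")),
)
derived = apply_p9(mined)

print(render(mined))
print(render(derived))
```

Applied to every mined rule of this form, the same function yields the rows of the second table without a further pass over the data.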

For completeness, the first rows of the mentioned tables provide these associations: "IF Hour is Me AND DayNumb is Ro Bi THEN Y_NO2 is QR Bi AND NoCar is Bi" and "IF Hour is Me AND DayNumb is Ro Bi AND NoCar is Bi THEN Y_NO2 is QR Bi".

6. Conclusions

In this paper we studied some relations in a given data set that are given by chosen fuzzy confirmation measures. As we pointed out in the introductory section of this paper, this task is very important. Understanding relationships among attributes of given data sets allows us to create more efficient algorithms and, especially, to work reasonably with the found associations afterwards. We obtained the most promising and detailed results for minimum-based confirmation measures. This (together with the lower computational complexity of such confirmation measures) gives another argument for their use. According to our experience, the remaining confirmation measures are quite restrictive, especially when we use more attributes in the considered associations.

Concerning our future work, our results are promising. However, there are still some open tasks devoted even to Properties P1–P9 considered in this paper; some of them immediately follow from this paper (see also Kupka & Tomanová, 2011). On the other hand, this is the first step of our research. We intend to extend our research, e.g. to study relations given by other confirmation measures, for instance by those admitting various dependencies as suggested in Glass (2008), and also to create novel methods for mining of linguistic associations using the knowledge we have discovered.

Acknowledgements

The first author is supported by the research plan MSM 6198898701 of the Ministry of Education of the Czech Republic. Both authors are supported by Research center 1M0572 "Data – Algorithms – Decision making" (2005–2009) of the Ministry of Education of the Czech Republic.

References

Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Proceedings of the 20th international conference on very large databases, Santiago, Chile (pp. 487–499). Morgan Kaufmann.
Armstrong, W. W. (1974). Dependency structures of database relationships. In Proc. IFIP 74 (pp. 580–583). Amsterdam: North-Holland Pub. Co.
Bělohlávek, R., & Vychodil, V. (2006). Fuzzy attribute logic over complete residuated lattices. Journal of Experimental and Theoretical Artificial Intelligence, 18(4), 471–480.
Chiang, W.-Y. (2011). To mine association rules of customer values via a data mining procedure with improved model: An empirical case study. Expert Systems with Applications, 38(3), 1716–1722.
Delgado, M., Marín, N., Sánchez, D., & Vila, M.-A. (2003). Fuzzy association rules: General model and applications. IEEE Transactions on Fuzzy Systems, 11, 214–225.
Dubois, D., Hüllermeier, E., & Prade, H. (2006). A systematic approach to the assessment of fuzzy association rules. Data Mining and Knowledge Discovery, 13, 167–192.
Dubois, D., & Prade, H. (1996). What are fuzzy rules and how to use them. Fuzzy Sets and Systems, 84(2), 169–186.
Glass, D. H. (2008). Fuzzy confirmation measures. Fuzzy Sets and Systems, 159, 475–490.
Hájek, P. (1968). The question of a general concept of the GUHA method. Kybernetika, 505–515.
Hájek, P., & Havránek, T. (1978). Mechanizing hypothesis formation. Mathematical foundations for a general theory. Berlin/Heidelberg/New York: Springer-Verlag.
Hüllermeier, E. (2001). Implication based fuzzy association rules. In Proceedings of the 5th European conference on principles and practice of knowledge discovery in databases, PKDD-01 (pp. 241–252). Freiburg, Germany: Springer-Verlag.
Kupka, J., & Tomanová, I. (2010). Some extensions of mining of linguistic associations. Neural Network World, 20, 27–44.
Kupka, J., & Tomanová, I. (2011). Some dependencies among attributes given by fuzzy confirmation measures. In Proc. EUSFLAT-LFA2011 (pp. 498–505). Atlantis Press.
Lin, C.-W., Hong, T.-P., & Lu, W.-H. (2010). Linguistic data mining with fuzzy FP-trees. Expert Systems with Applications, 37(6), 4560–4567.
Novák, V., Perfilieva, I., Dvořák, A., Chen, Q., Wei, Q., & Yan, P. (2008). Mining pure linguistic associations from numerical data. International Journal of Approximate Reasoning, 48, 4–22.
Novák, V., Perfilieva, I., & Močkoř, J. (1999). Mathematical principles of fuzzy logic. Boston: Kluwer.
Rauch, J. (2005). Logic of association rules. Applied Intelligence, 22, 9–28.
