JID:FSS AID:7073 /FLA
[m3SC+; v1.234; Prn:1/08/2016; 12:27] P.1 (1-12)
Available online at www.sciencedirect.com
ScienceDirect Fuzzy Sets and Systems ••• (••••) •••–••• www.elsevier.com/locate/fss
An efficient method to factorize fuzzy attribute-oriented concept lattices
Gabriel Ciobanu a,∗, Cristian Văideanu b,∗
a Romanian Academy, Institute of Computer Science, Romania
b A.I. Cuza University of Iași, Faculty of Mathematics, Blvd Carol I, no. 11, 700506 Iași, Romania
Received 19 March 2015; received in revised form 14 July 2016; accepted 18 July 2016
Abstract
Factorization by similarity is a mathematical technique used in formal concept analysis with fuzzy attributes for reducing the complexity of different types of fuzzy concept lattices. In this paper we find the structure of the factor lattice of a fuzzy attribute-oriented concept lattice, namely the intervals representing the blocks of this lattice. We provide a procedure for generating the infimum and the supremum concepts of these intervals as fixpoints of a fuzzy closure operator. This theoretical result allows us to develop a more efficient algorithm for building the factor lattice of the fuzzy attribute-oriented concept lattice. A comparison between our approach and the existing algorithms is presented.
© 2016 Elsevier B.V. All rights reserved.
Keywords: Similarity relation; fuzzy attribute-oriented concept lattice
* Corresponding authors. E-mail addresses: [email protected] (G. Ciobanu), [email protected] (C. Văideanu).
http://dx.doi.org/10.1016/j.fss.2016.07.004
0165-0114/© 2016 Elsevier B.V. All rights reserved.

1. Introduction
Over the last decades, researchers have faced new challenges in dealing with the huge amount of information available in databases. Developed as a branch of applied lattice theory, Formal Concept Analysis (FCA) is a mathematical framework which provides techniques to extract, organize and represent the relevant data hidden in large data tables. In the classical setting of FCA, originally proposed by R. Wille in [38], the information is described as an ordered triple called formal context, consisting of a set of objects, a set of attributes (or properties) of objects, and a binary relation between the objects and their attributes. Using a Galois connection, the FCA methods process the data and generate a collection of concepts, which are complete clusters of objects having a conjunctive description in terms of attributes. The set of these classes of objects, equipped with the subconcept–superconcept partial order, forms a complete lattice called concept (or Galois) lattice.
In their traditional setting, FCA formulas and theoretical results are based on classical logic. By considering contexts with attributes taking values in a set of truth degrees endowed with appropriate operations, researchers have
developed several extensions of Wille's approach to the fuzzy setting. Thus, conceptual scaling was proposed in [20] as a method to deal with the so-called multi-valued contexts, while fuzzy logic was first integrated into FCA in [11]. A feasible approach to FCA with graded attributes, which takes residuated lattices as structures of truth degrees, was developed in [3] and, independently, in [31]. The authors used an antitone Galois connection to define the notions of formal fuzzy concept and fuzzy concept lattice. Using two different types of isotone derivation operators and non-commutative logic, the classical approach presented in [16] is extended in [21] by introducing two types of fuzzy concept lattices, namely the fuzzy object-oriented and the fuzzy attribute-oriented concept lattices. In [9], FCA with fuzzy attributes is extended to the case when the partial order on the set of fuzzy sets is replaced with a fuzzy order. Further generalizations of the notion of concept lattice can be found in [18,24,26,30].
Both the accessibility and the generality of FCA make the theory suitable for application in various fields. Thus, FCA has provided a mathematical basis for conceptual knowledge processing [39], data mining and design in software engineering [35], gene expression data analysis [22], hierarchical classification of web search results [14,28], analysis of software code [33] and information retrieval [12].
Generating the concept lattice and its associated Hasse diagram is one of the major issues in FCA. This process can be computationally expensive, particularly when dealing with real-world data tables. Various techniques have been developed over the last years in order to reduce the size of the Galois lattice. The research efforts were first directed towards more efficient algorithms for generating and representing the set of concepts [19,25,27,34].
A method to control the size of the concept lattice through two unary functions (called hedges) defined on the residuated lattice was proposed in [7]. A method to simplify the information provided by a fuzzy biconcept lattice, taking into account the membership degree of objects or attributes, was proposed in [1]. A different way to obtain a less complex procedure is to reduce the size of the attribute set. Thus, some authors have recently studied the effect of attribute reduction for fuzzy oriented concept lattices [37,29].
In order to reduce the complexity of the conceptual structure, methods to decompose and to structure the lattice into smaller parts have also been developed. FCA researchers have investigated the conditions under which the Galois lattice can be factorized. As a general framework, they used similarity relations to measure the degree of indistinguishability between concepts, and so to group similar classes of objects. In [4], a fuzzy similarity between concepts is used to factorize the fuzzy concept lattice, and so to generate the factor lattice. These ideas were extended in [10], which provides a general method for factorizing systems of fuzzy sets by similarity. In [17], a similarity for the semantic web is proposed by following an information content approach, so that the similarity of concept intents is computed independently of the related extents. In [23], it is shown that the factor lattice of a fuzzy concept lattice can be computed from a special kind of super-relations of the incidence relation, called fuzzy block relations. A faster algorithm for computing the factor lattice of an antitone fuzzy concept lattice was developed in [5]. Based on a fuzzy similarity relation, we defined in [13] a compatible tolerance relation on the set of fuzzy attribute-oriented concepts, which is used to factorize the corresponding fuzzy concept lattice, and so to reduce the complexity of the conceptual structure.
The aim of this paper is to develop an efficient method to generate the factor lattice of a fuzzy attribute-oriented concept lattice. Following the results of [13], we present here an algorithm to build the factor lattice. Since such a procedure is usually costly, we investigate a simpler way to build the blocks, and so the factor lattice. In order to get a faster algorithm for computing the reduced conceptual structure, we closely analyze the structure of the factor lattice in the isotone case. We find that the structure of the blocks differs from the one corresponding to the case of antitone fuzzy Galois lattices [5]. Then we show that the computation of the blocks reduces to the problem of finding the fixpoints of a fuzzy closure operator. Using this result, we develop an efficient method to compute the blocks of the resulting factor lattice directly from the input data. The paper concludes with a set of experiments on real data tables which illustrate the efficiency of the method.
The remainder of the paper is organized as follows. We briefly overview the basic notions and results of FCA with fuzzy attributes in Section 2. In Section 3 we first present the main issues involved in reducing the size of fuzzy attribute-oriented Galois lattices by similarity which are particularly connected with our approach. Then we find the structure of the factor lattice obtained by factorizing a fuzzy attribute-oriented Galois lattice via a compatible tolerance relation. Finally, we provide a theoretical method which allows us to develop a more efficient algorithm for generating the blocks of the factor lattice. Section 4 presents some experimental results in order to evaluate the efficiency of the presented method.
2. Preliminaries
In this section we briefly recall some basic notions of fuzzy sets and fuzzy logic used in the following. We refer to [2,36] for further details.

Definition 1. A residuated lattice is an algebra $\mathbf{L} = (L, \wedge, \vee, \otimes, \to, 0, 1)$ satisfying:
(i) $(L, \wedge, \vee, 0, 1)$ is a bounded lattice with bottom element 0 and top element 1;
(ii) $(L, \otimes, 1)$ is a commutative monoid;
(iii) $\otimes$ and $\to$ form an adjoint pair (isotone Galois connection), i.e., for all $x, y, z \in L$ we have the equivalence $x \le y \to z$ iff $x \otimes y \le z$.

Residuated lattices are general structures of truth degrees used in fuzzy logic. The operations $\otimes$ and $\to$, called multiplication and residuation, generalize the logical conjunction and implication of classical logic, respectively. We also use the biresiduum operator defined for all $x, y \in L$ by $x \leftrightarrow y = (x \to y) \wedge (y \to x)$, which is the fuzzy equivalence logical connective. In the following we consider $\mathbf{L}$ to be a complete residuated lattice, i.e., $\mathbf{L}$ is a residuated lattice with $(L, \wedge, \vee, 0, 1)$ being complete. In this situation, arbitrary infima ($\bigwedge$) and suprema ($\bigvee$) model the universal and the existential quantifier, respectively.
Heyting algebras, which are algebraic models for intuitionistic logic, provide an important example of residuated lattices. In Heyting algebras the multiplication $\otimes$ is the infimum operation ($x \otimes y = x \wedge y$), while the residuation is defined by $x \to y = \bigvee \{ z \mid z \wedge x \le y \}$. Other examples of complete residuated lattices can be defined by considering $L$ to be either the unit interval $L = [0, 1]$ or the chain $L = \{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, 1\}$ with $n \in \mathbb{N}^+$, where the infimum and supremum operations are given by $x \wedge y = \min(x, y)$ and $x \vee y = \max(x, y)$, respectively, and the multiplication $\otimes$ is a left-continuous t-norm with the corresponding residuation operation. A particular example is the Łukasiewicz lattice, obtained when $x \otimes y = \max(x + y - 1, 0)$ and $x \to y = \min(1 - x + y, 1)$.
Considering $x \otimes y = \min(x, y)$ and $x \to y = 1$ if $x \le y$, $x \to y = y$ otherwise, we obtain the Gödel lattice. Considering $x \otimes y = x \cdot y$, as well as $x \to y = 1$ whenever $x \le y$ and $x \to y = y/x$ otherwise, we get the product lattice. The following properties hold in any complete residuated lattice:

Proposition 2. [2] Let $\mathbf{L}$ be a complete residuated lattice. Then:
(i) the multiplication operation $\otimes$ is isotone in every argument, i.e., for all $x_1, x_2, y_1, y_2 \in L$, if $x_1 \le x_2$ and $y_1 \le y_2$ then $x_1 \otimes y_1 \le x_2 \otimes y_2$;
(ii) the operation $\to$ is antitone in its first argument and isotone in its second argument, i.e., for all $x, x_1, x_2, y, y_1, y_2 \in L$, if $x_1 \le x_2$ then $x_2 \to y \le x_1 \to y$, and if $y_1 \le y_2$ then $x \to y_1 \le x \to y_2$;
(iii) $\otimes$ distributes over arbitrary joins, i.e., $x \otimes \big(\bigvee_{i \in I} y_i\big) = \bigvee_{i \in I} (x \otimes y_i)$;
(iv) $x \otimes (x \to y) \le y$;
(v) $(x \otimes y) \to z = x \to (y \to z) = y \to (x \to z)$;
(vi) $\bigwedge_{i \in I} (x \to y_i) = x \to \big(\bigwedge_{i \in I} y_i\big)$
for all $x, y, z \in L$ and $\{y_i\}_{i \in I} \subseteq L$.

Following [40], we say that a fuzzy set (or an $\mathbf{L}$-set) in the universe $X$ with truth degrees from $L$ is a mapping $A : X \to L$. For every $x \in X$, $A(x)$ is interpreted as the truth degree to which "$x$ belongs to $A$". We denote by $L^X$ the collection of fuzzy sets in $X$, and we order this set with the usual pointwise ordering: $A \le B$ if $A(x) \le B(x)$ for every $x \in X$. The pair $(L^X, \le)$ is a complete lattice in which infima and suprema are also defined pointwise: $(A \wedge B)(x) = A(x) \wedge B(x)$ and $(A \vee B)(x) = A(x) \vee B(x)$, for every $x \in X$. If $x \in X$ and $a \in L$, an important example of fuzzy set is $\{a/x\} : X \to L$, defined by $\{a/x\}(x) = a$ and $\{a/x\}(x') = 0$ for all $x' \in X$, $x' \ne x$. Given a fuzzy set $A : X \to L$ and $a \in L$, we define the fuzzy sets $a \otimes A$ and $a \to A$ by $(a \otimes A)(x) = a \otimes A(x)$ and $(a \to A)(x) = a \to A(x)$, for every $x \in X$.
A fuzzy relation (or $\mathbf{L}$-relation) between two universes $X$ and $Y$ is a mapping $R : X \times Y \to L$. For $(x, y) \in X \times Y$, $R(x, y) \in L$ can be interpreted as the truth value of the proposition "$x$ and $y$ are in relation $R$".
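As a small illustration of Definition 1, the following sketch encodes the Łukasiewicz and Gödel connectives on a finite chain and checks the adjointness property by brute force. The function names (`chain`, `luk_mul`, `is_adjoint`, and so on) are chosen here for illustration only, and exact fractions are used as an assumption to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import product


def chain(n):
    """Truth degrees {0, 1/n, ..., (n-1)/n, 1} as exact fractions."""
    return [Fraction(k, n) for k in range(n + 1)]


def luk_mul(x, y):
    """Lukasiewicz multiplication: max(x + y - 1, 0)."""
    return max(x + y - 1, Fraction(0))


def luk_res(x, y):
    """Lukasiewicz residuum: min(1 - x + y, 1)."""
    return min(1 - x + y, Fraction(1))


def godel_mul(x, y):
    """Goedel multiplication: min(x, y)."""
    return min(x, y)


def godel_res(x, y):
    """Goedel residuum: 1 if x <= y, otherwise y."""
    return Fraction(1) if x <= y else y


def is_adjoint(L, mul, res):
    """Check x <= (y -> z) iff x (*) y <= z for all x, y, z in L."""
    return all((x <= res(y, z)) == (mul(x, y) <= z)
               for x, y, z in product(L, repeat=3))


L = chain(4)
print(is_adjoint(L, luk_mul, luk_res))      # matching pair
print(is_adjoint(L, godel_mul, godel_res))  # matching pair
print(is_adjoint(L, luk_mul, godel_res))    # mismatched pair
```

The first two checks use a multiplication together with its own residuum, while the third deliberately mixes the two structures to show that adjointness ties each residuum to its multiplication. The product lattice is omitted because a finite chain is not closed under ordinary multiplication.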
Given $A, B \in L^X$, we define the subsethood degree of $A$ in $B$ by
$$S(A, B) = \bigwedge_{x \in X} \big(A(x) \to B(x)\big),$$
which generalizes the classical relation $\subseteq$ to the fuzzy setting. The element $S(A, B) \in L$ represents the degree of $A$ being a subset of $B$. We have $S(A, B) = 1$ if and only if $A(x) \le B(x)$ for all $x \in X$ (i.e., $A \le B$).
We also recall that a mapping $cl : L^X \to L^X$ satisfying $A \le cl(A)$, $S(A_1, A_2) \le S(cl(A_1), cl(A_2))$ and $cl(A) = cl(cl(A))$ for all $A, A_1, A_2 \in L^X$ is called a fuzzy closure operator on the set $L^X$.

2.1. Formal concept analysis with fuzzy attributes
We present a basic overview of the fuzzy oriented concept lattices needed for our purposes. Let $X$ be a set of objects, $Y$ the set of their attributes (properties), and $R$ a fuzzy relation between $X$ and $Y$. The triple $(X, Y, R)$, called formal fuzzy context, formalizes a data table which assigns to each $x$ and $y$ the element $R(x, y)$, namely the truth value of the statement "$y$ is an attribute of $x$". We call $R$ the incidence fuzzy relation of the context $(X, Y, R)$.
In [21] and [32], a model presented in [16] is generalized, and two new types of conceptual structures with fuzzy attributes are defined. The first one is obtained by defining the fuzzy attribute-oriented derivation operators as
– the map $^\Diamond : L^X \to L^Y$ which associates to every fuzzy set of objects $A : X \to L$ the fuzzy set of attributes $A^\Diamond : Y \to L$ defined by
$$A^\Diamond(y) = \bigvee_{x \in X} \big(A(x) \otimes R(x, y)\big), \quad y \in Y;$$
– the map $^\Box : L^Y \to L^X$ which associates to every fuzzy set of attributes $B : Y \to L$ the fuzzy set of objects $B^\Box : X \to L$ defined by
$$B^\Box(x) = \bigwedge_{y \in Y} \big(R(x, y) \to B(y)\big), \quad x \in X.$$
The element $A^\Diamond(y)$ measures the truth value of the statement "there exists $x \in A$ such that $y$ is related to $x$", while $B^\Box(x)$ can be interpreted as the truth value of the statement "for every $y$ in relation with $x$, $y \in B$". The pair $(^\Diamond, ^\Box)$ is an isotone Galois connection between $(L^X, \le)$ and $(L^Y, \le)$. A pair $(A, B) \in L^X \times L^Y$ such that $A^\Diamond = B$ and $B^\Box = A$ is called fuzzy attribute (or property)-oriented formal concept. The sets $A$ and $B$ are called the extent and the intent of $(A, B)$, and we sometimes denote them by $ext((A, B))$ and $int((A, B))$, respectively. A partial order $\le$ can be defined on the set $\mathcal{B}_p(X, Y, R)$ of fuzzy attribute-oriented concepts by $(A_1, B_1) \le (A_2, B_2)$ if $A_1 \le A_2$ (equivalently, $B_1 \le B_2$). The structure of the set $\mathcal{B}_p(X, Y, R)$ is described by the so-called Basic Theorem of fuzzy attribute-oriented FCA [21]:

Theorem 3. The set $\mathcal{B}_p(X, Y, R)$ equipped with the subconcept–superconcept order $\le$ is a complete lattice called fuzzy attribute-oriented concept lattice, in which the infimum and the supremum are given by:
$$\bigwedge_{i \in I} (A_i, B_i) = \Big(\bigwedge_{i \in I} A_i, \big(\bigwedge_{i \in I} A_i\big)^{\Diamond}\Big) = \Big(\bigwedge_{i \in I} A_i, \big(\bigwedge_{i \in I} B_i\big)^{\Box\Diamond}\Big), \quad (1)$$
$$\bigvee_{i \in I} (A_i, B_i) = \Big(\big(\bigvee_{i \in I} B_i\big)^{\Box}, \bigvee_{i \in I} B_i\Big) = \Big(\big(\bigvee_{i \in I} A_i\big)^{\Diamond\Box}, \bigvee_{i \in I} B_i\Big). \quad (2)$$
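The derivation operators and the concept condition above can be sketched in a few lines. The example below assumes the Łukasiewicz connectives on exact fractions and uses a hypothetical two-object, two-attribute context that is not taken from the paper; the names `diamond`, `box`, `mul` and `res` are illustrative.

```python
from fractions import Fraction as F


def mul(x, y):
    """Lukasiewicz multiplication."""
    return max(x + y - 1, F(0))


def res(x, y):
    """Lukasiewicz residuum."""
    return min(1 - x + y, F(1))


def diamond(A, X, Y, R):
    """A^diamond(y) = sup over x of A(x) (*) R(x, y)."""
    return {y: max(mul(A[x], R[x, y]) for x in X) for y in Y}


def box(B, X, Y, R):
    """B^box(x) = inf over y of R(x, y) -> B(y)."""
    return {x: min(res(R[x, y], B[y]) for y in Y) for x in X}


# Hypothetical toy context (an assumption for illustration only).
X = ["x1", "x2"]
Y = ["y1", "y2"]
R = {("x1", "y1"): F(1), ("x1", "y2"): F(1, 2),
     ("x2", "y1"): F(0), ("x2", "y2"): F(1)}

A = {"x1": F(1, 2), "x2": F(0)}
B = diamond(A, X, Y, R)        # candidate intent A^diamond
closure = box(B, X, Y, R)      # A^{diamond box}
is_concept = closure == A      # (A, B) is a concept iff B^box = A
print(B, is_concept)
```

Here `(A, B)` is a fuzzy attribute-oriented concept exactly when applying `box` to `diamond(A)` returns `A` again, which is the fixpoint view of extents used later in Section 3.2.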
Changing the role of attributes and objects, the notion of fuzzy object-oriented concept lattice is introduced in a similar way (see [21,32]). In order to generalize some results from FCA to FCA with fuzzy attributes, the notion of Galois connection is extended to the fuzzy setting [8]:
Definition 4. Let $X$, $Y$ be two non-empty sets, $\mathbf{L}$ a complete residuated lattice, and $f : L^X \to L^Y$, $g : L^Y \to L^X$ two mappings. The pair $(f, g)$ is said to be an isotone fuzzy Galois connection between the ordered sets $(L^X, \le)$ and $(L^Y, \le)$ if, for all $A, A_1, A_2 \in L^X$ and $B, B_1, B_2 \in L^Y$, we have
(i) $S(A_1, A_2) \le S(f(A_1), f(A_2))$;
(ii) $S(B_1, B_2) \le S(g(B_1), g(B_2))$;
(iii) $A \le g(f(A))$ and $f(g(B)) \le B$.

The pair $(^\Diamond, ^\Box)$ is described as an isotone fuzzy Galois connection in [21], while a proof of this is in [2].

3. Factorization of a fuzzy attribute-oriented concept lattice
One of the major issues in FCA with fuzzy attributes is how to reduce the size of the fuzzy Galois lattices. We briefly review some methods to factorize this type of lattices.

3.1. Factorizing lattices by similarity
A tolerance relation is a reflexive and symmetric relation $\theta$ on a non-empty set $V$. If, in addition, $V$ is a complete lattice and $\theta$ is compatible with suprema and infima, i.e., for every index set $I$ and all $x_i, y_i \in V$ with $x_i \,\theta\, y_i$ ($i \in I$) we have $\big(\bigvee_{i \in I} x_i\big) \,\theta\, \big(\bigvee_{i \in I} y_i\big)$ and $\big(\bigwedge_{i \in I} x_i\big) \,\theta\, \big(\bigwedge_{i \in I} y_i\big)$, then $\theta$ is said to be a complete tolerance relation. For $v \in V$, let us denote $v_\theta = \bigwedge \{x \in V \mid v \,\theta\, x\}$ and $v^\theta = \bigvee \{x \in V \mid v \,\theta\, x\}$. A subset $S$ of $V$, maximal such that $x$ and $y$ are related by $\theta$ for all $x, y \in S$, is called a block of the tolerance relation $\theta$. In [15] it is proved that the blocks of $\theta$ are the intervals $[v_\theta, (v_\theta)^\theta] = \{x \in V \mid v_\theta \le x \le (v_\theta)^\theta\}$, where $v \in V$. The blocks are obviously determined by their infima $v_\theta$. Since
$$\big((v_\theta)^\theta\big)_\theta = v_\theta \quad \text{and} \quad \Big(\big((v_\theta)^\theta\big)_\theta\Big)^\theta = (v_\theta)^\theta, \quad (3)$$
they are also determined by their suprema $(v_\theta)^\theta$. A partial order can be defined on the set $V/\theta$ of blocks of $\theta$ by saying that a block $A_1 \in V/\theta$ is less than a block $A_2 \in V/\theta$ if $\bigwedge A_1 \le \bigwedge A_2$. The set of blocks $V/\theta$ together with this partial order is a complete lattice called the factor lattice of $V$ by $\theta$.
We also recall a key similarity relation between two fuzzy sets.
Let $U$ be a universe and $(L, \wedge, \vee, \otimes, \to, 0, 1)$ a complete residuated lattice. Following [2], the fuzzy relation $E : L^U \times L^U \to L$ defined by
$$E(A_1, A_2) = \bigwedge_{x \in U} \big(A_1(x) \leftrightarrow A_2(x)\big), \quad A_1, A_2 \in L^U,$$
is a fuzzy similarity (or fuzzy equivalence) relation on $L^U$, i.e., the following properties hold for all $A_1, A_2, A_3 \in L^U$: $E(A_1, A_1) = 1$ (reflexivity), $E(A_1, A_2) = E(A_2, A_1)$ (symmetry), and $E(A_1, A_2) \otimes E(A_2, A_3) \le E(A_1, A_3)$ (transitivity). The element $E(A_1, A_2)$ can be interpreted as the equality degree of the fuzzy sets $A_1$, $A_2$.
The factorization method described above is applied in [4] in order to reduce the size of antitone fuzzy concept lattices by similarity. Another important example of compatible tolerance relation used to factorize a system of fuzzy sets is presented in [10].
We now present the technique for factorizing a fuzzy attribute-oriented concept lattice that we developed in [13]. We first defined a similarity relation between fuzzy attribute-oriented concepts. Let $(X, Y, R)$ be a formal fuzzy context and $\mathcal{B}_p(X, Y, R)$ the corresponding fuzzy attribute-oriented Galois lattice. The relation $E_p$ defined in [13] on $\mathcal{B}_p(X, Y, R)$ by
$$E_p\big((A_1, B_1), (A_2, B_2)\big) = E(A_1, A_2) = E(B_1, B_2)$$
for all $(A_1, B_1), (A_2, B_2) \in \mathcal{B}_p(X, Y, R)$ is called fuzzy attribute-oriented similarity relation between fuzzy attribute-oriented concepts, induced by extents (or by intents) of concepts. We considered a truth value $t \in L$, and defined the relation $E^t$ on the lattice $\mathcal{B}_p(X, Y, R)$ by $(A_1, B_1) \, E^t \, (A_2, B_2)$ if $t \le E_p((A_1, B_1), (A_2, B_2))$, i.e., two fuzzy attribute-oriented concepts are in relation if their degree of fuzzy similarity is at least $t$. Since $E^t$ is proved in [13] to be a complete tolerance relation on $\mathcal{B}_p(X, Y, R)$, we used this relation to factorize the fuzzy attribute-oriented concept lattice, and so to reduce its complexity.
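The similarity degree $E$ and the thresholded relation can be illustrated on plain fuzzy sets. The sketch below assumes the Łukasiewicz residuum and three hypothetical fuzzy sets on a two-element universe; the names `similarity` and `related` are illustrative, and the sets are arbitrary fuzzy sets rather than extents of a particular context.

```python
from fractions import Fraction as F


def res(x, y):
    """Lukasiewicz residuum."""
    return min(1 - x + y, F(1))


def bires(x, y):
    """Biresiduum x <-> y = (x -> y) /\\ (y -> x)."""
    return min(res(x, y), res(y, x))


def similarity(A1, A2):
    """E(A1, A2) = inf over the universe of A1(u) <-> A2(u)."""
    return min(bires(A1[u], A2[u]) for u in A1)


def related(A1, A2, t):
    """Thresholded relation: holds when t <= E(A1, A2)."""
    return t <= similarity(A1, A2)


# Hypothetical fuzzy sets on the universe {u1, u2}.
A1 = {"u1": F(1), "u2": F(1, 2)}
A2 = {"u1": F(3, 4), "u2": F(1, 2)}
A3 = {"u1": F(1, 2), "u2": F(1, 2)}

t = F(3, 4)
print(similarity(A1, A2), similarity(A2, A3), similarity(A1, A3))
print(related(A1, A2, t), related(A2, A3, t), related(A1, A3, t))
```

With this data, `A1` is related to `A2` and `A2` to `A3` at threshold 3/4, but `A1` is not related to `A3`. This is why the thresholded relation is treated as a tolerance (reflexive and symmetric) rather than an equivalence, and why its blocks may overlap.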
3.2. Construction of the factor lattice
In order to find the factor lattice corresponding to the complete tolerance relation $E^t$, we can use the method described by the above theoretical results: firstly, we generate the fuzzy attribute-oriented concept lattice; secondly, we compute the similarities between any two concepts; and finally, we find the block associated to each concept. Since this simple algorithm is rather costly, we present another method to build the blocks and to get the factor lattice in a more efficient way.
We have already mentioned in this section that the blocks of the factor lattice are determined by their suprema. Since a fuzzy attribute-oriented concept is uniquely given by its extent, to find the blocks it is enough to generate the extents of the suprema of the blocks. To find the structure of the factor lattice blocks, we prove some preliminary results.

Proposition 5. Let $(X, Y, R)$ be a fuzzy formal context, and $t \in L$. Then we have:
(i) $(t \to B)^\Box = t \to B^\Box$, for every $B \in L^Y$;
(ii) $t \le E(A, t \to A)$, for every $A \in L^X$;
(iii) $t \le E(A, t \otimes A)$, for every $A \in L^X$;
(iv) if $(A, B) \in \mathcal{B}_p(X, Y, R)$, then $t \to A$ is an extent of a fuzzy attribute-oriented concept.
Proof. (i) According to Proposition 2, equalities (v) and (vi), we can write for every $x \in X$:
$$(t \to B)^\Box(x) = \bigwedge_{y \in Y} \Big(R(x, y) \to \big(t \to B(y)\big)\Big) = \bigwedge_{y \in Y} \Big(t \to \big(R(x, y) \to B(y)\big)\Big) = t \to \bigwedge_{y \in Y} \big(R(x, y) \to B(y)\big) = t \to B^\Box(x).$$
We now sketch the proofs of (ii) and (iii), which are also presented in [5].
(ii) The relation $t \le E(t \to A, A)$ is true if and only if $t \le (t \to A(x)) \leftrightarrow A(x)$ for every $x \in X$, or equivalently, $t \le (t \to A(x)) \to A(x)$ and $t \le A(x) \to (t \to A(x))$ for every $x \in X$. By adjunction, the first inequality reduces to $t \otimes (t \to A(x)) \le A(x)$ for every $x \in X$, which is true because of Proposition 2, statement (iv). The second inequality leads to $t \otimes A(x) \le t \to A(x)$, which is equivalent to the true relation $t \otimes t \otimes A(x) \le A(x)$, for every $x \in X$.
(iii) We have to prove that $t \le \bigwedge_{x \in X} \big(A(x) \leftrightarrow (t \otimes A)(x)\big)$, i.e., $t \le A(x) \to (t \otimes A(x))$ and $t \le (t \otimes A(x)) \to A(x)$, for every $x \in X$. These inequalities reduce to the true relations $t \otimes A(x) \le t \otimes A(x)$ and $t \otimes t \otimes A(x) \le A(x)$, for every $x \in X$.
(iv) The set $t \to A$ is an extent of some fuzzy attribute-oriented concept if and only if $t \to A = (t \to A)^{\Diamond\Box}$. Since $(^\Diamond, ^\Box)$ is an isotone Galois connection, we have $t \to A \le (t \to A)^{\Diamond\Box}$. Thus, we should prove the inequality $(t \to A)^{\Diamond\Box} \le t \to A$. By adjunction, this relation is equivalent to $t \otimes (t \to A)^{\Diamond\Box} \le A$, or $t \le (t \to A)^{\Diamond\Box}(x) \to A(x)$ for every $x \in X$. The fuzzy set $A$ is an extent, i.e., $A = A^{\Diamond\Box}$, and so the last inequality can be rewritten as $t \le (t \to A)^{\Diamond\Box}(x) \to A^{\Diamond\Box}(x)$ for every $x \in X$. We have $E\big((t \to A)^{\Diamond\Box}, A^{\Diamond\Box}\big) \le (t \to A)^{\Diamond\Box}(x) \to A^{\Diamond\Box}(x)$ for every $x \in X$. Consequently, it is enough to prove
$$t \le E\big((t \to A)^{\Diamond\Box}, A^{\Diamond\Box}\big). \quad (4)$$
Since $(^\Diamond, ^\Box)$ is a fuzzy isotone Galois connection, it follows that $E(t \to A, A) \le E\big((t \to A)^{\Diamond\Box}, A^{\Diamond\Box}\big)$, which together with property (ii), namely $t \le E(t \to A, A)$, implies relation (4). ∎
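The statements of Proposition 5 can also be checked exhaustively on a small example. The following sketch is a numerical sanity check under stated assumptions (Łukasiewicz connectives on the chain {0, 1/4, 1/2, 3/4, 1} and a hypothetical toy context), not a substitute for the proof; all function names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

L = [F(k, 4) for k in range(5)]      # chain {0, 1/4, 1/2, 3/4, 1}


def mul(x, y):
    """Lukasiewicz multiplication."""
    return max(x + y - 1, F(0))


def res(x, y):
    """Lukasiewicz residuum."""
    return min(1 - x + y, F(1))


# Hypothetical toy context (not from the paper).
X, Y = ["x1", "x2"], ["y1", "y2"]
R = {("x1", "y1"): F(1), ("x1", "y2"): F(1, 2),
     ("x2", "y1"): F(0), ("x2", "y2"): F(1)}


def diamond(A):
    return {y: max(mul(A[x], R[x, y]) for x in X) for y in Y}


def box(B):
    return {x: min(res(R[x, y], B[y]) for y in Y) for x in X}


def E(A1, A2):
    """Similarity degree of two fuzzy sets on the same universe."""
    return min(min(res(A1[u], A2[u]), res(A2[u], A1[u])) for u in A1)


def fuzzy_sets(U):
    """Enumerate every fuzzy set on the finite universe U."""
    for degrees in product(L, repeat=len(U)):
        yield dict(zip(U, degrees))


def check_proposition_5():
    for t in L:
        for B in fuzzy_sets(Y):
            shifted = {y: res(t, B[y]) for y in Y}              # t -> B
            if box(shifted) != {x: res(t, v) for x, v in box(B).items()}:
                return False                                    # (i) fails
        for A in fuzzy_sets(X):
            up = {x: res(t, A[x]) for x in X}                   # t -> A
            down = {x: mul(t, A[x]) for x in X}                 # t (*) A
            if not (t <= E(A, up) and t <= E(A, down)):
                return False                                    # (ii) or (iii) fails
            if box(diamond(A)) == A and box(diamond(up)) != up:
                return False                                    # (iv) fails
    return True


print(check_proposition_5())
```

The check enumerates all thresholds and all fuzzy sets over the chosen chain, so it only scales to very small universes.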
In the following theorem we determine the infimum and the supremum concepts of a block. Let $(X, Y, R)$ be a fuzzy formal context, $t \in L$, $(A, B) \in \mathcal{B}_p(X, Y, R)$ a fuzzy attribute-oriented concept, and $\big[(A, B)_{E^t}, ((A, B)_{E^t})^{E^t}\big]$ its corresponding block. We denote $(A, B)_{E^t}$ by $(A, B)_t$, and $(A, B)^{E^t}$ by $(A, B)^t$.
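Theorem 6. We have:
(i) the infimum concept $(A, B)_t$ is given by $\big((t \otimes A)^{\Diamond\Box}, (t \otimes B)^{\Box\Diamond}\big)$;
(ii) the concept $(A, B)^t$ is given by $\big(t \to A, (t \to B)^{\Box\Diamond}\big)$;
(iii) the supremum concept $((A, B)_t)^t$ is given by $\big(t \to (t \otimes A)^{\Diamond\Box}, (t \to (t \otimes B)^{\Box\Diamond})^{\Box\Diamond}\big)$.

Proof. (i) The set $ext((A, B)_t)$ can be found by using Lemma 3 from [10]. In [10], the authors considered a fuzzy closure operator $C : L^X \to L^X$ and a truth value $a \in L$ to define a binary relation ${}^a{\approx}$ on the set $fix(C)$ of fixpoints of $C$ by $A_1 \, {}^a{\approx} \, A_2$ if $E(A_1, A_2) \ge a$. They proved that ${}^a{\approx}$ is a compatible tolerance relation on $fix(C)$, the corresponding factor lattice being $fix(C)/{}^a{\approx} = \big\{[A_{{}^a\approx}, (A_{{}^a\approx})^{{}^a\approx}] \mid A \in fix(C)\big\}$, where
$$A_{{}^a\approx} = C(a \otimes A) \quad \text{and} \quad A^{{}^a\approx} = a \to A. \quad (5)$$
We now consider the fuzzy closure operator $C : L^X \to L^X$ defined by $C(A) = A^{\Diamond\Box}$, and get $fix(C) = \{A \in L^X \mid (A, B) \in \mathcal{B}_p(X, Y, R)\}$. By relation (5), it follows that $ext((A, B)_t) = C(t \otimes A) = (t \otimes A)^{\Diamond\Box}$.
The next step is to find the intent of $(A, B)_t$. The concept $(A, B)_t$ is the infimum of the set of fuzzy attribute-oriented concepts $(A', B') \in \mathcal{B}_p(X, Y, R)$ which are similar to $(A, B)$ to degree at least $t$; we denote this set by $S$. Applying Theorem 3, we get
$$(A, B)_t = \bigwedge S = \bigwedge \big\{(A', B') \in \mathcal{B}_p(X, Y, R) \mid (A, B) \, E^t \, (A', B')\big\} = \Big(\bigwedge_{(A', B') \in S} A', \big(\bigwedge_{(A', B') \in S} A'\big)^{\Diamond}\Big).$$
The intent of the concept $(A, B)_t$ is
$$int\big((A, B)_t\big) = \Big(\bigwedge_{(A', B') \in S} A'\Big)^{\Diamond} = \Big(\bigwedge_{(A', B') \in S} B'^{\,\Box}\Big)^{\Diamond} = \Big(\bigwedge_{(A', B') \in S} B'\Big)^{\Box\Diamond}.$$
We also have
$$S = \big\{(A', B') \in \mathcal{B}_p(X, Y, R) \mid E_p\big((A, B), (A', B')\big) \ge t\big\} = \big\{(A', B') \in \mathcal{B}_p(X, Y, R) \mid E(B, B') \ge t\big\} = \Big\{(A', B') \in \mathcal{B}_p(X, Y, R) \;\Big|\; \bigwedge_{y \in Y} \big(B(y) \leftrightarrow B'(y)\big) \ge t\Big\}.$$
The inequality $\bigwedge_{y \in Y} \big(B(y) \leftrightarrow B'(y)\big) \ge t$ is equivalent to $B(y) \to B'(y) \ge t$ and $B'(y) \to B(y) \ge t$, for all $y \in Y$. By adjunction, these relations can be written $t \otimes B(y) \le B'(y)$ and $t \otimes B'(y) \le B(y)$ for every $y \in Y$, or $t \otimes B \le B'$ and $t \otimes B' \le B$ for all $(A', B') \in S$, respectively. The first relation implies that $t \otimes B \le \bigwedge_{(A', B') \in S} B'$, which leads to
$$(t \otimes B)^{\Box\Diamond} \le \Big(\bigwedge_{(A', B') \in S} B'\Big)^{\Box\Diamond} = int\big((A, B)_t\big). \quad (6)$$
In fact, we prove that $(t \otimes B)^{\Box\Diamond} = int((A, B)_t)$. Let $(F, (t \otimes B)^{\Box\Diamond})$ be the fuzzy attribute-oriented concept whose intent is $(t \otimes B)^{\Box\Diamond}$. We have to prove that
$$\big(F, (t \otimes B)^{\Box\Diamond}\big) = (A, B)_t = \bigwedge S. \quad (7)$$
By relation (6), it follows that $(F, (t \otimes B)^{\Box\Diamond}) \le (A, B)_t = \bigwedge S$, which shows that $(F, (t \otimes B)^{\Box\Diamond})$ is a lower bound of the set $S$. We prove that $(F, (t \otimes B)^{\Box\Diamond})$ is an element of $S$, i.e., $E((t \otimes B)^{\Box\Diamond}, B) \ge t$, or $E((t \otimes B)^{\Box\Diamond}, B^{\Box\Diamond}) \ge t$. Since the pair $(^\Diamond, ^\Box)$ is a fuzzy isotone Galois connection, we have
$$E(t \otimes B, B) \le E\big((t \otimes B)^{\Box\Diamond}, B^{\Box\Diamond}\big),$$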
which together with $t \le E(t \otimes B, B)$ ends the reasoning. Therefore, relation (7) is true. Consequently, we have $int((A, B)_t) = (t \otimes B)^{\Box\Diamond}$.
(ii) Next, we find the element $(A, B)^t$. By using Theorem 3, we get:
$$(A, B)^t = \bigvee S = \bigvee \big\{(A', B') \in \mathcal{B}_p(X, Y, R) \mid (A, B) \, E^t \, (A', B')\big\} = \Big(\big(\bigvee_{(A', B') \in S} B'\big)^{\Box}, \bigvee_{(A', B') \in S} B'\Big) = \Big(\big(\bigvee_{(A', B') \in S} A'\big)^{\Diamond\Box}, \bigvee_{(A', B') \in S} B'\Big).$$
Since $(A, B) \, E^t \, (A', B')$, it follows that $E(A, A') \ge t$, which leads to $t \otimes A \le A'$ and $t \otimes A' \le A$, for all $(A', B') \in S$. These relations are equivalent to $A \le t \to A'$ and $A' \le t \to A$, respectively. Since $A' \le t \to A$ for all $(A', B') \in S$, it follows that $\bigvee_{(A', B') \in S} A' \le t \to A$, which implies
$$\Big(\bigvee_{(A', B') \in S} A'\Big)^{\Diamond\Box} \le (t \to A)^{\Diamond\Box} = t \to A.$$
The latter equality is true according to Proposition 5, statement (iv). In fact, we prove that we have
$$ext\big((A, B)^t\big) = \Big(\bigvee_{(A', B') \in S} A'\Big)^{\Diamond\Box} = t \to A.$$
Let $(t \to A, D)$ be the concept whose extent is $t \to A$. The inclusion $A' \le t \to A$ implies $(A', B') \le (t \to A, D)$ for all $(A', B') \in S$, i.e., $(t \to A, D)$ is an upper bound for $S$. By Proposition 5 (ii), we have $t \le E(A, t \to A)$, which shows that $(t \to A, D) \in S$. The concept $(t \to A, D)$ is an upper bound of $S$ and $(t \to A, D) \in S$, and so we have $(t \to A, D) = \bigvee S = (A, B)^t$.
We now look for $int((A, B)^t)$. Since $t \to A$ is the extent of $(A, B)^t$, it follows that $int((A, B)^t) = (t \to A)^{\Diamond} = (t \to B^{\Box})^{\Diamond} = (t \to B)^{\Box\Diamond}$, the last equality being true due to Proposition 5 (i). We conclude that $(A, B)^t = \big(t \to A, (t \to B)^{\Box\Diamond}\big)$.
(iii) The proof results immediately from (i) and (ii). ∎

As we have already noticed, the blocks $[(A, B)_t, ((A, B)_t)^t]$ of the factor lattice $\mathcal{B}_p(X, Y, R)/E^t$ are uniquely generated by the extents of their suprema, namely by the sets $ext\big(((A, B)_t)^t\big) = t \to (t \otimes A)^{\Diamond\Box}$, $(A, B) \in \mathcal{B}_p(X, Y, R)$. We show that these extents can be interpreted as fixpoints of a fuzzy closure operator. We consider the mapping $\sigma : L^X \to L^X$ defined by $\sigma(A) = t \to (t \otimes A)^{\Diamond\Box}$. Using a similar reasoning as in [5] for antitone fuzzy concept lattices, it can be shown that the set $\{t \to (t \otimes A)^{\Diamond\Box} \mid (A, B) \in \mathcal{B}_p(X, Y, R)\}$ of extents of suprema equals the set $fix(\sigma) = \{A \in L^X \mid A = \sigma(A)\}$ of fixpoints of the operator $\sigma$.
Therefore, in order to build the factor lattice, we need to compute the set of fixpoints of the fuzzy closure operator $\sigma$. We use the algorithm presented in [6], which adapts the "next closure algorithm" from [19] to the lattice-valued case. This algorithm generates the fixpoints of a fuzzy closure operator in a lexicographic order.

4. Experimental results
We present how our method performs when computing the blocks of the factor lattice. We have implemented an extension of Ganter's algorithm for a fuzzy setting (see [6]), and tested the Java software we have developed on various formal fuzzy contexts of different sizes.
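To make the computation behind such an implementation concrete, the sketch below evaluates the operator $\sigma(A) = t \to (t \otimes A)^{\Diamond\Box}$ from Section 3.2 on a hypothetical toy context and lists its fixpoints. It deliberately uses naive enumeration of all fuzzy sets instead of the lexicographic algorithm of [6], and Python instead of the Java implementation mentioned above; both choices, as well as the context and the function names, are assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import product

L = [F(0), F(1, 2), F(1)]            # chain {0, 1/2, 1}


def mul(x, y):
    """Lukasiewicz multiplication."""
    return max(x + y - 1, F(0))


def res(x, y):
    """Lukasiewicz residuum."""
    return min(1 - x + y, F(1))


# Hypothetical toy context (not from the paper).
X, Y = ["x1", "x2"], ["y1", "y2"]
R = {("x1", "y1"): F(1), ("x1", "y2"): F(1, 2),
     ("x2", "y1"): F(0), ("x2", "y2"): F(1)}


def closure(A):
    """C(A) = A^{diamond box}; its fixpoints are the extents."""
    B = {y: max(mul(A[x], R[x, y]) for x in X) for y in Y}
    return {x: min(res(R[x, y], B[y]) for y in Y) for x in X}


def shrink(A, t):
    """Extent of the infimum of the block of A: (t (*) A)^{diamond box}."""
    return closure({x: mul(t, A[x]) for x in X})


def sigma(A, t):
    """sigma(A) = t -> (t (*) A)^{diamond box}."""
    lower = shrink(A, t)
    return {x: res(t, lower[x]) for x in X}


def all_fuzzy_sets():
    return [dict(zip(X, degrees)) for degrees in product(L, repeat=len(X))]


def as_key(A):
    """Hashable representation of a fuzzy set of objects."""
    return tuple(A[x] for x in X)


t = F(1, 2)
extents = [A for A in all_fuzzy_sets() if closure(A) == A]
suprema = [A for A in all_fuzzy_sets() if sigma(A, t) == A]
images = {as_key(sigma(A, t)) for A in extents}

# Each block is described by the extents of its infimum and supremum.
blocks = sorted({(as_key(shrink(A, t)), as_key(sigma(A, t))) for A in extents})
print(len(extents), len(suprema), len(blocks))
```

For this toy context the enumeration finds 8 extents but only 4 fixpoints of `sigma` at `t = 1/2`, and the images of the extents under `sigma` coincide with those fixpoints. The point of the method is that the fixpoints of `sigma` can be generated directly, without first building all extents and comparing them pairwise.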
We conducted a series of experiments which evaluate how the runtime for building the factor lattice of a fuzzy attribute-oriented concept lattice is reduced when using the algorithm described in Section 3.2. All these tests were executed on a 1.8 GHz AMD Turion 64 X2 TL-60 with 1 GB RAM, running the Windows 7 operating system. We computed the speed-up of our method, i.e., the ratio between the runtime for computing the factor lattice by the simple algorithm and the runtime for generating the same lattice by using our approach. We also calculated the size reduction ratio for the factor lattice Bp (X, Y, R)/E t , i.e., the ratio between the cardinality of the fuzzy attribute-oriented concept lattice (denoted by |Bp (X, Y, R)|) and the cardinality of the factor lattice (denoted by |Bp (X, Y, R)/E t |), where t is a given threshold.
Table 1
Data table with 600 objects and 5 attributes.

Threshold t                     0.25      0.50      0.75
Time of simple algorithm (ms)   873.40    1041.17   1252.72
Time of our algorithm (ms)      24.03     135.17    417.54
Speed-up                        36.08     7.70      2.12
|Bp (X, Y, R)/E t |             27        143       413
Reduction ratio                 32.51     6.13      2.12

|Bp (X, Y, R)| = 878.

Table 2
Data table with 400 objects and 6 attributes.

Threshold t                     0.25      0.50      0.75
Time of simple algorithm (ms)   637.01    882.99    733.62
Time of our algorithm (ms)      8.48      53.17     191.03
Speed-up                        75.07     16.60     3.84
|Bp (X, Y, R)/E t |             27        143       472
Reduction ratio                 44.81     8.46      2.56

|Bp (X, Y, R)| = 1210.
Example 7. We test our approach on two data tables extracted from "the Lahman Baseball Database 2012" found at www.seanlahman.com/baseball-archive/statistics/, a collection of statistics for Major League Baseball teams and players. The objects are baseball teams and seasons from 1871 to present, while the attributes are some indicators for teams and players.
The first data table has 600 objects, 5 attributes, and a density of 71% (the density is the ratio between the number of nonzero elements and the number of elements in the table). Table 1 provides the runtime (in milliseconds) and the speed-up for the two algorithms. It also shows the number of fuzzy concepts of the factor lattice and the size reduction ratio corresponding to thresholds t = 0.25, 0.50, 0.75 and to the Łukasiewicz logical connectives.
The second data table has 400 objects, 6 attributes, and 74% density. The results for the same thresholds t = 0.25, 0.50, 0.75 and the Łukasiewicz logical connectives are presented in Table 2.
The graphs of runtimes for generating the factor lattice Bp (X, Y, R)/E t by using the simple method and our algorithm are depicted in Fig. 1 with ∗-line and ◇-line, respectively. They show that the algorithm proposed in this paper outperforms the simple procedure. We also provide the speed-up (left) and the reduction ratio (right) for these two data tables in Fig. 2 and Fig. 3, respectively. It is worth noting the similarity of the two graphs in Fig. 2 and in Fig. 3, which suggests that speed-up and size reduction ratio are closely related.
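The size reduction ratio defined in Section 4 is a plain quotient of cardinalities, which the following sketch recomputes from the concept counts reported in Table 1. The helper names are illustrative, and the use of truncation to two decimals is an assumption made here because it happens to reproduce the tabulated figures; the paper does not state how the values were rounded.

```python
import math

# Cardinalities reported for the first data table (Table 1).
n_concepts = 878                               # |Bp(X, Y, R)|
n_blocks = {0.25: 27, 0.50: 143, 0.75: 413}    # |Bp(X, Y, R)/E^t| per threshold t


def truncate(value, digits=2):
    """Cut a positive number to the given number of decimals (assumed convention)."""
    factor = 10 ** digits
    return math.floor(value * factor) / factor


def reduction_ratio(concepts, blocks):
    """Ratio between the size of the concept lattice and of the factor lattice."""
    return concepts / blocks


ratios = {t: truncate(reduction_ratio(n_concepts, b)) for t, b in n_blocks.items()}
print(ratios)
```

The same function applied to the counts of Table 2 (1210 concepts and 27, 143, 472 blocks) gives the reduction ratios of the second data table.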
Fig. 1. Runtimes for the data tables from Table 1 and Table 2
Fig. 2. Speed-up and reduction ratio for the data table from Table 1.
Fig. 3. Speed-up and reduction ratio for the data table from Table 2.
We emphasize that, due to the high density of the fuzzy contexts, we obtained large fuzzy attribute-oriented concept lattices, and thus our efficient factorization method becomes a very useful option for reducing the size of the conceptual structure. According to the practical requirements, a user can choose a threshold t that yields a factor lattice with fewer blocks while keeping the amount of lost information minimal.

5. Concluding remarks
Factorization by similarity is a technique developed over the last years for reducing the size of different types of fuzzy concept lattices. In this paper we provided a framework for efficiently computing the factor lattice corresponding to a fuzzy attribute-oriented concept lattice. Using a tolerance relation, we factorized a fuzzy attribute-oriented concept lattice in order to reduce its complexity. We presented some properties of the derivation operators of a fuzzy context related to fuzzy similarity relations and to the logical connectives of the residuated lattice. Based on these results, we described the structure of the blocks of the factor lattice, and provided methods to find the infimum and the supremum concepts of an interval representing a block. We showed that these intervals are determined by the extents of their supremum concepts, and we computed these extents as fixpoints of a fuzzy closure operator. This approach allowed us to develop a faster method to build the fuzzy attribute-oriented factor lattice directly from the input data.
The paper also provided an experimental evaluation of the proposed method using a Java implementation of an extension of Ganter's algorithm to the fuzzy setting. In these experiments, the runtime of the proposed method for building the blocks of the factor lattice was significantly lower than the execution time of the simple algorithm.
Acknowledgements

We thank the anonymous reviewers for their helpful comments and suggestions. The work was supported by a grant of the Romanian Ministry of Education and Scientific Research.