Information Sciences 325 (2015) 87–97
The relationship between attribute reducts in rough sets and minimal vertex covers of graphs

Jinkun Chen a,∗, Yaojin Lin b,∗∗, Guoping Lin a, Jinjin Li a, Zhouming Ma a

a School of Mathematics and Statistics, Minnan Normal University, Zhangzhou 363000, PR China
b School of Computer Science, Minnan Normal University, Zhangzhou 363000, PR China
Article history: Received 15 October 2013; revised 20 May 2015; accepted 4 July 2015; available online 9 July 2015.
Keywords: Attribute reduction; Vertex covers; Graph theory; Rough sets
Abstract

The problems of finding an attribute reduct in rough sets and of finding a minimal vertex cover of a graph are both NP-hard. This paper studies the relationship between the two problems. The vertex cover problem is first investigated from the perspective of rough sets. Attribute reduction of an information system is then studied in the framework of graph theory. The results show that finding a minimal vertex cover of a graph is equivalent to finding a reduct of an information system induced from the graph; conversely, the attribute reduction computation can be translated into the calculation of a minimal vertex cover of a derivative graph. Finally, a new algorithm for the vertex cover problem based on rough sets is presented, and experiments are conducted to verify the effectiveness of the proposed method.

© 2015 Elsevier Inc. All rights reserved.
1. Introduction

Rough set theory, as proposed by Pawlak [35,36], has been proven to be an effective tool for managing uncertainty in information systems [38,39]. Attribute reduction plays an important role in rough set theory. In the framework of rough sets, the main aim of attribute reduction is to find a minimal subset of attributes that preserves the same classification ability as the original attribute set [35]. Over the past ten years, attribute reduction has been successfully applied in many fields, such as pattern recognition [19,20,48,63,67], machine learning [8,27,47,65] and data mining [11,42,56,68]. Many approaches for finding attribute reducts have been proposed in the literature [1,15,21,24,25,28,30,31,37,44–46,55,57,61,63,64,66]. An elegant theoretical result is based on the notion of a discernibility matrix [45]: Skowron and Rauszer [45] showed that the set of all reducts is in fact the set of prime implicants of a discernibility function. A representation of the useful information on the set of reducts in a simple graphical form was given in [32,33]; however, this requires knowing all the reducts of a given information system, which is computationally expensive. In fact, as was shown by Wong and Ziarko [59], finding the set of all reducts, or an optimal reduct (a reduct with the minimum number of attributes), is NP-hard. Therefore, heuristic methods such as positive-region methods [13,18,41], information entropy methods [26,40,46,62] and discernibility matrix methods [6,7,51–54] have been developed.

The vertex cover problem, a classical problem in graph theory, is that of finding a minimum vertex cover, i.e., a vertex cover with the least number of vertices, in a given graph [3]. Beyond graph theory itself, the vertex cover problem arises in a wide variety of real-world applications such as crew scheduling [43], VLSI design [2], nurse rostering [5], and industrial machine
∗ Corresponding author. Tel.: +13 615090602.
∗∗ Corresponding author.
E-mail addresses: [email protected] (J. Chen), [email protected] (Y. Lin).
http://dx.doi.org/10.1016/j.ins.2015.07.008 0020-0255/© 2015 Elsevier Inc. All rights reserved.
assignments [60]. Similar to the discernibility function method used for attribute reduction, the minimal vertex cover computation can also be translated into the calculation of the prime implicants of a Boolean function [3,10,29]. Although one can generate all the minimal vertex covers of a graph by the Boolean function method, doing so is a well-known NP-hard optimization problem [10,22]. Several approximation algorithms with various performance guarantees have been proposed for this problem [9,12,16,17,34]; an overview of these methods is given in [14,49]. However, the best known bound on the approximation ratio is two [14]. It is therefore desirable to develop techniques that improve this bound.

As stated above, the attribute reduction and vertex cover problems are both NP-hard, and both can be solved via Boolean logical operations. It thus seems that there is some natural connection between them. The purpose of this paper is to establish the relationship between the two problems. We show that the two problems can be transformed into each other. This study may open new research directions and provide new methods for both problems.

The rest of this paper is organized as follows. In Section 2, some basic notions of rough sets and graph theory are reviewed. In Section 3, a new information system induced from a given graph is introduced, and the relationship between attribute reduction of the derivative information system and the minimal vertex covers of the graph is established. In Section 4, we investigate attribute reduction of a given information system from the viewpoint of the vertex cover problem. In Section 5, a new approximation algorithm for the vertex cover problem based on rough sets is presented, together with experiments showing the effectiveness of the proposed method. Finally, conclusions are drawn in Section 6.

2. Preliminaries

In this section, we recall some basic notions of rough sets and graph theory [3,4,35,48,50].

2.1. Attribute reduction with rough sets

The starting point of rough set theory is an information system. Formally, an information system (IS for short) is a pair S = (U, A), where U and A are finite, non-empty sets of objects and attributes, respectively. With each attribute a ∈ A we associate an information function a: U → Va, where Va, the set of values of a, is called the domain of a. Each non-empty subset B ⊆ A determines an indiscernibility relation:
RB = {(x, y) ∈ U × U | a(x) = a(y), ∀a ∈ B}.

Obviously, RB is an equivalence relation on U; it forms a partition U/RB = {[x]B | x ∈ U}, where [x]B denotes the equivalence class containing x w.r.t. B, i.e., [x]B = {y ∈ U | (x, y) ∈ RB}. In the theory of rough sets, attribute reduction is one of the key processes for knowledge discovery. Given an IS S = (U, A), a reduct of S is a minimal subset of attributes B ⊆ A such that RB = RA. Many approaches to attribute reduction have been proposed; for our purpose, we introduce the following method based on the discernibility matrix and logical operations [45], by which one can obtain all the reducts of an IS. Let S = (U, A) be an IS with n objects and let (x, y) ∈ U × U. We define
M(x, y) = {a ∈ A | a(x) ≠ a(y)}.

M(x, y) is referred to as the discernibility attribute set of x and y in S, and M = {M(x, y) | (x, y) ∈ U × U} is called the discernibility set of S. Note that the discernibility set can also be stored in matrix form, as a symmetric n × n matrix with entries M(x, y). A discernibility function fS for an IS is a Boolean function of m Boolean variables a∗1, a∗2, ..., a∗m corresponding to the attributes a1, a2, ..., am, respectively, defined as follows:

fS(a∗1, a∗2, ..., a∗m) = ∧{∨M(x, y) | M(x, y) ∈ M, M(x, y) ≠ ∅},

where ∨M(x, y) is the disjunction of all variables a∗ such that a ∈ M(x, y). By means of the operations of disjunction (∨) and conjunction (∧), Skowron and Rauszer [45] showed that the attribute reduction computation can be translated into the calculation of the prime implicants of a Boolean function.

Lemma 1 ([45]). Let S = (U, A) be an IS. An attribute subset B ⊆ A is a reduct of S iff ∧_{ai ∈ B} a∗i is a prime implicant of the discernibility function fS.

From Lemma 1, we can see that if

fS(a∗1, a∗2, ..., a∗m) = ∧{∨M(x, y) | M(x, y) ∈ M, M(x, y) ≠ ∅} = ∨_{i=1}^{t} (∧_{j=1}^{si} a∗j),

where ∧_{j=1}^{si} a∗j, i ≤ t, are all the prime implicants of the discernibility function fS, then Bi = {aj | j ≤ si}, i ≤ t, are all the reducts of S. Without any confusion, we will write ai instead of a∗i in the sequel.
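As a concrete illustration, the reduct definition above can be checked by exhaustive search on a small IS. The following sketch is our own Python code, not part of the paper; it uses the data of Example 1 below and recovers its three reducts:

```python
from itertools import combinations

# The IS of Example 1: objects x1..x5, attributes a1..a4.
U = {"x1": (2, 2, 0, 1), "x2": (1, 2, 0, 0), "x3": (1, 0, 1, 1),
     "x4": (1, 2, 1, 1), "x5": (1, 0, 1, 1)}
A = ("a1", "a2", "a3", "a4")

def partition(attrs):
    """The partition U/R_B induced by the indiscernibility relation R_B."""
    idx = [A.index(a) for a in attrs]
    blocks = {}
    for x, row in U.items():
        blocks.setdefault(tuple(row[i] for i in idx), set()).add(x)
    return frozenset(frozenset(b) for b in blocks.values())

full = partition(A)
# B is a reduct iff R_B = R_A and no proper subset of B has that property.
keeps = [set(c) for k in range(1, len(A) + 1)
         for c in combinations(A, k) if partition(c) == full]
reducts = [B for B in keeps if not any(C < B for C in keeps)]
print([sorted(B) for B in reducts])
# [['a1', 'a2', 'a3'], ['a1', 'a2', 'a4'], ['a2', 'a3', 'a4']]
```

This brute-force check is exponential in |A| and is only meant to make the definitions concrete; the discernibility-function route of Lemma 1 is the method the paper actually builds on.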
Table 1. An IS of Example 1.

U    a1   a2   a3   a4
x1   2    2    0    1
x2   1    2    0    0
x3   1    0    1    1
x4   1    2    1    1
x5   1    0    1    1
Table 2. The discernibility matrix of S.

      x1        x2        x3        x4      x5
x1    ∅         a1a4      a1a2a3    a1a3    a1a2a3
x2    a1a4      ∅         a2a3a4    a3a4    a2a3a4
x3    a1a2a3    a2a3a4    ∅         a2      ∅
x4    a1a3      a3a4      a2        ∅       a2
x5    a1a2a3    a2a3a4    ∅         a2      ∅
Example 1. Let S = (U, A) be the IS with U = {x1, x2, x3, x4, x5} and A = {a1, a2, a3, a4} shown in Table 1. From Table 1, the discernibility matrix of S can be represented as Table 2. For simplicity, we use a separator-free form for sets, e.g., a1a4 stands for {a1, a4}. From Table 2, we obtain the following discernibility function of S:

fS(a1, a2, a3, a4) = a2 ∧ (a1 ∨ a3) ∧ (a1 ∨ a4) ∧ (a3 ∨ a4) ∧ (a1 ∨ a2 ∨ a3) ∧ (a2 ∨ a3 ∨ a4)
                  = (a1 ∧ a2 ∧ a3) ∨ (a1 ∧ a2 ∧ a4) ∨ (a2 ∧ a3 ∧ a4).

Accordingly, there are three reducts of S: B1 = {a1, a2, a3}, B2 = {a1, a2, a4} and B3 = {a2, a3, a4}.

2.2. Vertex covers in graph theory

A graph is a pair G = (V, E) consisting of a set V of vertices and a set E of edges such that E ⊆ V × V. Two vertices are adjacent if there is an edge connecting them. The ends of an edge are said to be incident with the edge. An isolated vertex is a vertex adjacent to no other vertex. A loop is an edge whose two ends coincide. Two or more edges that link the same pair of vertices are said to be parallel. The edges of a graph may be directed (asymmetric) or undirected (symmetric); an undirected graph is one in which all edges are undirected. A hypergraph is a generalization of the traditional graph in which an edge can connect any number of vertices. Formally, a hypergraph can also be written as a pair H = (V, E), where V is a set of elements called nodes or vertices, and E is a family of non-empty subsets of V called hyperedges or edges. A vertex cover of a graph G is a subset K ⊆ V such that every edge of G has at least one end in K. A vertex cover is minimal if none of its proper subsets is itself a vertex cover; a minimum vertex cover is a vertex cover with the least number of vertices. Note that a minimum vertex cover is always minimal but not necessarily vice versa. A minimal vertex cover is not necessarily unique, and the same is true of minimum vertex covers. Finding a minimal vertex cover of a graph not only plays an important role in applications of graph theory, but is also important in theoretical considerations. Similar to the attribute reduction method in rough sets, all the minimal vertex covers of a graph can also be obtained via Boolean formulas.
Given a graph G = (V, E) and e ∈ E, let N(e) be the set of vertices connected by the edge e, and denote N = {N(e) | e ∈ E}. We now define a function fG for G, which is a Boolean function of m Boolean variables v∗1, v∗2, ..., v∗m corresponding to the vertices v1, v2, ..., vm, respectively:

fG(v∗1, v∗2, ..., v∗m) = ∧{∨N(e) | N(e) ∈ N},

where ∨N(e) is the disjunction of all variables v∗ such that v ∈ N(e). The following lemma gives a method for computing all the minimal vertex covers of a given graph.

Lemma 2 ([10,29]). Let G = (V, E) be a graph. Then a vertex subset K ⊆ V is a minimal vertex cover of G iff ∧_{vi ∈ K} v∗i is a prime implicant of the Boolean function fG.

Lemma 2 shows that if

fG(v∗1, v∗2, ..., v∗m) = ∧{∨N(e) | N(e) ∈ N} = ∨_{i=1}^{t} (∧_{j=1}^{si} v∗j),
where ∧_{j=1}^{si} v∗j, i ≤ t, are all the prime implicants of the Boolean function fG, then Ki = {vj | j ≤ si}, i ≤ t, are all the minimal vertex covers of G. We will also write vi instead of v∗i in the discussion to follow.

Fig. 1. The graph of Example 2.

Table 3. The incidence matrix of the graph G in Example 2.

      v1   v2   v3   v4
e1    1    1    0    0
e2    0    1    1    0
e3    1    0    1    0
e4    1    0    0    1
e5    0    0    1    0

Example 2. Let G = (V, E) be the graph shown in Fig. 1, with V = {v1, v2, v3, v4} and E = {e1, e2, e3, e4, e5}; its incidence matrix is given in Table 3. We have the Boolean function:
fG (v1 , v2 , v3 , v4 ) = (v1 ∨ v2 ) ∧ (v2 ∨ v3 ) ∧ (v1 ∨ v3 ) ∧ (v1 ∨ v4 ) ∧ v3 . After simplification, we obtain fG in prime implicants as:
fG(v1, v2, v3, v4) = (v1 ∧ v3) ∨ (v2 ∧ v3 ∧ v4).

Hence, G has two minimal vertex covers: K1 = {v1, v3} and K2 = {v2, v3, v4}, and K1 = {v1, v3} is the unique minimum vertex cover of G (see Fig. 1).

3. Induction of information systems from a graph

According to Lemmas 1 and 2, one can see that there may exist some connection between attribute reduction and the minimal vertex cover. In this section, we first introduce an IS induced from a graph and then discuss the relationship between attribute reduction of the derivative IS and the minimal vertex covers of the given graph. Throughout this section, we assume that the graph of discourse is finite, undirected and without isolated vertices.

We first introduce a simple representation of a graph called the incidence matrix. Given a graph G = (V, E) with V = {v1, v2, ..., vn} and E = {e1, e2, ..., em}, the incidence matrix of G is the m × n matrix MG = (mij)m×n, where mij = 1 if the edge ei is incident to the vertex vj, and mij = 0 otherwise. For example, the incidence matrix of the graph shown in Example 2 consists of 5 rows (corresponding to the five edges e1–e5) and 4 columns (corresponding to the four vertices v1–v4); see Table 3.

Definition 1. Let MG = (mij)m×n be the incidence matrix of a graph G = (V, E) with V = {v1, v2, ..., vn} and E = {e1, e2, ..., em}. Denote U = {e1, e2, ..., em, em+1}. We call the pair S = (U, V) an induced information system from G (IIS for short), where the information functions of the IIS are defined as:
• vi(ej) = j × mji, 1 ≤ i ≤ n, 1 ≤ j ≤ m;
• vi(em+1) = 0, 1 ≤ i ≤ n.

Example 3. For the graph G of Example 2, the IIS of G is shown in Table 4.

Proposition 1. Let S = (U, V) be the IIS of a graph G = (V, E). For ei, ej ∈ E with i ≠ j, M(ei, ej) = N(ei) ∪ N(ej).

Proof. For ei, ej ∈ E with i ≠ j, by the definition of M(ei, ej), we have M(ei, ej) = {v ∈ V | v(ei) ≠ v(ej)}.
By Definition 1, for any ei ∈ E, it is easy to see that

v(ei) = i, if v ∈ N(ei);   v(ei) = 0, if v ∉ N(ei).

Since i ≠ j, we have v(ei) ≠ v(ej) ⇔ v ∈ (N(ei) ∪ N(ej)). Therefore, M(ei, ej) = N(ei) ∪ N(ej) holds.
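Proposition 1 is easy to check mechanically. The sketch below is our own Python rendering (the names are ours, not the paper's); it builds the information functions of Definition 1 for the graph of Example 2 and verifies M(ei, ej) = N(ei) ∪ N(ej):

```python
# The graph of Example 2; e5 is the loop at v3.
V = ["v1", "v2", "v3", "v4"]
E = {"e1": {"v1", "v2"}, "e2": {"v2", "v3"}, "e3": {"v1", "v3"},
     "e4": {"v1", "v4"}, "e5": {"v3"}}
edges = list(E)                      # e1..e5
U = edges + ["e6"]                   # the extra object e_{m+1}

def value(v, e):
    """Information function of the IIS: v(e_j) = j if v ∈ N(e_j), else 0."""
    if e == "e6":
        return 0
    return (edges.index(e) + 1) if v in E[e] else 0

def M(ei, ej):
    """Discernibility attribute set of two objects of the IIS."""
    return {v for v in V if value(v, ei) != value(v, ej)}

# Proposition 1: for distinct edges ei, ej, M(ei, ej) = N(ei) ∪ N(ej).
for ei in edges:
    for ej in edges:
        if ei != ej:
            assert M(ei, ej) == E[ei] | E[ej]
    # and M(ei, e_{m+1}) = N(ei), as used in the proof of Theorem 1
    assert M(ei, "e6") == E[ei]
print("Proposition 1 verified on Example 2")
```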
Table 4. The IIS of the graph G in Example 2.

      v1   v2   v3   v4
e1    1    1    0    0
e2    0    2    2    0
e3    3    0    3    0
e4    4    0    0    4
e5    0    0    5    0
e6    0    0    0    0
Table 5. The discernibility matrix of the IIS in Example 3.

      e1        e2          e3        e4          e5        e6
e1    ∅         v1v2v3      v1v2v3    v1v2v4      v1v2v3    v1v2
e2    v1v2v3    ∅           v1v2v3    v1v2v3v4    v2v3      v2v3
e3    v1v2v3    v1v2v3      ∅         v1v3v4      v1v3      v1v3
e4    v1v2v4    v1v2v3v4    v1v3v4    ∅           v1v3v4    v1v4
e5    v1v2v3    v2v3        v1v3      v1v3v4      ∅         v3
e6    v1v2      v2v3        v1v3      v1v4        v3        ∅
Now we give the main result of this section. The set of all reducts of an IS S and the set of all minimal vertex covers of a graph G are denoted by R(S) and C(G), respectively.

Theorem 1. Let S = (U, V) be the IIS of a graph G = (V, E) with V = {v1, v2, ..., vn} and E = {e1, e2, ..., em}. Then C(G) = R(S).

Proof. By Lemmas 1 and 2, to prove the result we only need to show that fS(v1, v2, ..., vn) = fG(v1, v2, ..., vn). Let M be the discernibility set of the IIS, and N = {N(e) | e ∈ E}. Denote M∗ = {M ∈ M | M ≠ ∅}. We first show that N ⊆ M∗. In fact, by the definition of the IIS, for any ei (1 ≤ i ≤ m), we have M(ei, em+1) = {v ∈ V | v(ei) ≠ v(em+1)} = {v ∈ V | v(ei) ≠ 0} = N(ei). This implies that N ⊆ M∗. Thus, there are two possibilities: either (i) N = M∗ or (ii) N ⊂ M∗ (where "⊂" denotes proper inclusion). (i) In case N = M∗, by the definitions of fS(v1, v2, ..., vn) and fG(v1, v2, ..., vn), it is easy to see that fS(v1, v2, ..., vn) = fG(v1, v2, ..., vn). (ii) In case N ⊂ M∗, we have H = M∗ \ N ≠ ∅. We next prove that for any H ∈ H, there exists N(e) ∈ N such that N(e) ⊂ H. In fact, for any H ∈ H, by the definition, there are ei and ej such that 1 ≤ i, j ≤ m, i ≠ j and H = M(ei, ej). By Proposition 1, N(ei) ⊂ H holds. According to the definition of fS(v1, v2, ..., vn), we have
fS (v1 , v2 , . . . , vn ) = ∧{∨M|M ∈ M∗ } = ∧{∨M|M ∈ (N ∪ H)} = ( ∧ {∨N|N ∈ N }) ∧ ( ∧ {∨H |H ∈ H}). Note that for any H ∈ H, there exists NH ∈ N such that NH ⊂ H. According to the absorption law of Boolean algebras, we have
( ∧ {∨N|N ∈ N }) ∧ ( ∧ {∨H |H ∈ H}) = ( ∨ H1 ) ∧ ( ∨ NH1 ) ∧ · · · ∧ ( ∨ H|H| ) ∧ ( ∨ NH|H| ) ∧( ∧ {∨N|N ∈ (N \ {NH1 , . . . , NH|H| })}) = ( ∨ NH1 ) ∧ · · · ∧ ( ∨ NH|H| ) ∧ ( ∧ {∨N|N ∈ (N \ {NH1 , . . . , NH|H| })}) = ∧{∨N|N ∈ N }. Thus, fS (v1 , v2 , . . . , vn ) = fG (v1 , v2 , . . . , vn ) holds as well. Combining (i) and (ii), we have fS (v1 , v2 , . . . , vn ) = fG (v1 , v2 , . . . , vn ). Therefore, C (G) = R(S).
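Theorem 1 can also be confirmed by brute force on a small instance. The following sketch (our own code, not part of the paper) compares the minimal vertex covers of the graph of Example 2 with the reducts of its IIS:

```python
from itertools import combinations

# A brute-force check of Theorem 1 on the graph of Example 2: the minimal
# vertex covers of G coincide with the reducts of its induced IS.
V = ("v1", "v2", "v3", "v4")
E = [{"v1", "v2"}, {"v2", "v3"}, {"v1", "v3"}, {"v1", "v4"}, {"v3"}]

def minimal(family):
    """Keep only the inclusion-minimal members of a family of sets."""
    return {frozenset(B) for B in family
            if not any(set(C) < set(B) for C in family)}

# All vertex covers, then the minimal ones.
covers = [c for k in range(1, 5) for c in combinations(V, k)
          if all(e & set(c) for e in E)]
min_covers = minimal(covers)

# Rows of the IIS: object e_j takes value j on its incident vertices,
# 0 elsewhere; e_{m+1} is the extra all-zero object of Definition 1.
rows = [tuple((j + 1) if v in e else 0 for v in V)
        for j, e in enumerate(E)] + [(0, 0, 0, 0)]

def same_partition(B):
    """Does R_B give the same partition of the objects as R_V?"""
    proj = lambda r: tuple(x for v, x in zip(V, r) if v in B)
    blocks = lambda f: {tuple(sorted(i for i, r in enumerate(rows)
                                     if f(r) == f(s))) for s in rows}
    return blocks(proj) == blocks(lambda r: r)

reducts = minimal([c for k in range(1, 5) for c in combinations(V, k)
                   if same_partition(c)])
assert reducts == min_covers        # C(G) = R(S), as Theorem 1 asserts
print(sorted(sorted(K) for K in min_covers))
# [['v1', 'v3'], ['v2', 'v3', 'v4']]
```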
Theorem 1 shows that the problem of finding a minimal vertex cover of a graph can be translated into the problem of finding a reduct of an IS. At the same time, finding a minimum vertex cover of a graph is equivalent to finding an optimal reduct of the IIS. This may provide a new method for the vertex cover problem.

Example 4. Continuing Example 3, from Table 4 the discernibility matrix can be represented as Table 5. In Table 5, we again use a separator-free form for sets, e.g., v1v2v3 stands for {v1, v2, v3}. From Table 5, we obtain the discernibility function of the IIS:
fS(v1, v2, v3, v4) = v3 ∧ (v1 ∨ v2) ∧ (v2 ∨ v3) ∧ (v1 ∨ v3) ∧ (v1 ∨ v4) ∧ (v1 ∨ v2 ∨ v3)
                  = v3 ∧ (v1 ∨ v2) ∧ (v2 ∨ v3) ∧ (v1 ∨ v3) ∧ (v1 ∨ v4).
Table 6. The IIS of H in Example 5.

      v1   v2   v3   v4   v5   v6
e1    1    1    1    0    0    0
e2    0    2    2    0    0    0
e3    0    0    3    0    3    3
e4    0    0    0    4    0    0
e5    0    0    0    0    0    0
Fig. 2. The hypergraph of Example 5.
According to Example 2, we have fS(v1, v2, v3, v4) = fG(v1, v2, v3, v4). Hence, C(G) = R(S).

Remark 1. The result of Theorem 1 also holds for hypergraphs.

Example 5. Let H = (V, E) be the hypergraph with V = {v1, v2, v3, v4, v5, v6} and E = {e1, e2, e3, e4} shown in Fig. 2, which is adopted from Wikipedia [58]. The IIS of the hypergraph H is shown in Table 6. It can be calculated that C(H) = R(S) = {{v3, v4}, {v2, v4, v5}, {v2, v4, v6}}.

The intersection of all reducts of an IS is the so-called core. Core attributes play an important role in decision making. The following proposition characterizes the core attributes of the IIS from the viewpoint of graphs.

Proposition 2. Let S = (U, V) be the IIS of a graph G = (V, E). Then v ∈ V is a core attribute of S iff v is a vertex with a loop in G.

Proof. "⇒" Suppose v ∈ V is a core attribute of S, and assume that v is a vertex of G without loops. Note that every core attribute is included in every reduct; thus, by Theorem 1, v is included in each minimal vertex cover of G. Since v has no loops, it is easy to see that V − {v} is a vertex cover of G. Hence, there is a minimal vertex cover K such that K ⊆ V − {v}. This implies v ∉ K, which contradicts the fact that v is included in each minimal vertex cover of G. Thus, v has a loop in G.

"⇐" Assume v is a vertex with a loop in G. If v is not a core attribute of S, then by Theorem 1 and the definition of the core, there exists a minimal vertex cover K ⊆ V such that v ∉ K. Since v ∉ K and v has a loop, the ends of this loop are not contained in K, which contradicts the fact that K is a vertex cover of G. Thus, v is a core attribute of S.

From Theorem 1 and Proposition 2, for a given graph G, v ∈ ∩C(G) iff v is a vertex with a loop in G; we call such a vertex a core vertex of G.

Example 6. In Example 2, it can easily be seen that v3 is the unique core vertex of G.
4. The graph induced from an information system

In this section, we are interested in whether, for a given IS S, there exists a corresponding graph G such that R(S) = C(G). We first recall the concept of the simplified discernibility set presented by Yao and Zhao [63].

Definition 2 ([63]). Let M be the discernibility set of an IS. An element M(x′, y′) ∈ M absorbs another element M(x, y) ∈ M if the following condition holds: ∅ ≠ M(x′, y′) ⊂ M(x, y). The set obtained by applying this element absorption operation to all element pairs of M is called the simplified discernibility set.

From Definition 2, it is easy to see that no element of the simplified discernibility set is a proper subset of another element. As pointed out in [63], the simplified discernibility set has exactly the same set of reducts as the original discernibility set M.
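The absorption step of Definition 2 is straightforward to implement. A small sketch (our own code), applied to the distinct entries of the discernibility matrix of Example 1, yields the simplified set used below; ∅ is dropped, since Definition 3 only uses the non-empty elements M∗:

```python
# Element absorption (Definition 2) applied to the distinct entries of
# Table 2 (Example 1); a sketch under our own naming.
M = [set(), {"a2"}, {"a1", "a4"}, {"a1", "a2", "a3"},
     {"a1", "a3"}, {"a2", "a3", "a4"}, {"a3", "a4"}]

def simplify(M):
    """Remove ∅ and every element that properly contains another non-empty element."""
    nonempty = [m for m in M if m]
    return {frozenset(m) for m in nonempty
            if not any(o < m for o in nonempty)}

print(sorted(sorted(m) for m in simplify(M)))
# [['a1', 'a3'], ['a1', 'a4'], ['a2'], ['a3', 'a4']]
```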
Fig. 3. The induced graph from S of Example 1.

Table 7. An IS of Example 9.

U    a1   a2   a3   a4   a5
x1   2    1    3    0    4
x2   3    2    1    0    4
x3   2    1    3    0    4
x4   2    2    3    1    4
x5   1    0    0    1    4
Fig. 4. The induced graph from S of Example 9.
Example 7. The simplified discernibility set of Example 1 is M = {∅, {a2}, {a1, a3}, {a1, a4}, {a3, a4}}.

Inspired by the results in [23,63], we have the following definition.

Definition 3. Let M be the simplified discernibility set of an IS S = (U, A) and M∗ = {M ∈ M | M ≠ ∅}. Denote V = A and E = M∗. We call the pair G = (V, E) an induced graph from S.

Example 8. In Example 1, it is easy to see that:
M∗ = {{a2 }, {a1 , a3 }, {a1 , a4 }, {a3 , a4 }}. The induced graph G = (V, E ) from S is as follows (Fig. 3).
V = {a1, a2, a3, a4}, E = {e1, e2, e3, e4} = {{a2}, {a1, a3}, {a1, a4}, {a3, a4}}.

In most cases, however, the induced graph from a given IS is a hypergraph.

Example 9. Let S = (U, A) be the IS shown in Table 7. It can be easily calculated that:
M∗ = {{a2, a4}, {a1, a2, a3}, {a1, a3, a4}}.

The induced graph H = (V, E) from S, shown in Fig. 4, is as follows:

V = {a1, a2, a3, a4, a5}, E = {e1, e2, e3} = {{a2, a4}, {a1, a2, a3}, {a1, a3, a4}}.

Similar to Theorem 1, we have the following result.

Theorem 2. Let G be an induced graph from an IS S. Then C(G) = R(S).

Proof. It follows immediately from Lemmas 1 and 2 and Definition 3.

Remark 2. From Theorem 2, we can see that finding the set of all reducts of an information system can be viewed as finding the set of all minimal vertex covers of a graph. This provides a new model for obtaining attribute reduction of an information system.
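Theorem 2 can likewise be confirmed by brute force. The sketch below (our own code) enumerates the minimal vertex covers of the induced graph of Example 8 and recovers exactly the three reducts of Example 1:

```python
from itertools import combinations

# Minimal vertex covers of the induced graph of Example 8.
V = ("a1", "a2", "a3", "a4")
E = [{"a2"}, {"a1", "a3"}, {"a1", "a4"}, {"a3", "a4"}]

covers = [set(c) for k in range(1, 5) for c in combinations(V, k)
          if all(e & set(c) for e in E)]        # every edge is hit
min_covers = {frozenset(c) for c in covers
              if not any(d < c for d in covers)}
print(sorted(sorted(K) for K in min_covers))
# prints [['a1', 'a2', 'a3'], ['a1', 'a2', 'a4'], ['a2', 'a3', 'a4']],
# i.e., exactly R(S) from Example 1, as Theorem 2 predicts
```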
Example 10. In Example 1, we know that
R(S) = {{a1 , a2 , a3 }, {a1 , a2 , a4 }, {a2 , a3 , a4 }}. From Example 8, it can be easily calculated that
C(G) = {{a1, a2, a3}, {a1, a2, a4}, {a2, a3, a4}}.

Thus, C(G) = R(S) holds.

An attribute is called an unnecessary attribute if it is not included in any reduct. The following proposition characterizes core attributes and unnecessary attributes by means of graph theory.

Proposition 3. Let G = (V, E) be the induced graph from a given IS S = (U, A). For any a ∈ A:
1. a is a core attribute of S iff a has a loop in G;
2. a is an unnecessary attribute of S iff a is an isolated vertex of G.

Proof. (1) It is similar to the proof of Proposition 2.
(2) "⇒" Suppose a is an unnecessary attribute of S, and suppose that a is not an isolated vertex of G. Then there is a vertex b ≠ a such that a and b are adjacent. Since A − {b} is a vertex cover of G, there exists a minimal vertex cover K ⊆ A − {b}; as the edge joining a and b must be covered while b ∉ K, we have a ∈ K. By Theorem 2, a then belongs to some reduct, which contradicts the definition of an unnecessary attribute. Hence, a is an isolated vertex of G.
"⇐" It is immediate from the definition and Theorem 2.

5. An algorithm for MVCP based on rough sets

In this section, we are interested in the minimum vertex cover problem (MVCP). We give an approximate algorithm for MVCP based on rough sets and conduct experiments to show the effectiveness of the proposed method. In [46], Slezak introduced Shannon's information entropy to search for reducts in classical rough sets. We review this method here. Given an IS S = (U, A), for any B ⊆ A, its entropy is defined as:

H(B) = −Σ_{i=1}^{r} (|Xi|/|U|) log2 (|Xi|/|U|),
where U/RB = {X1, X2, ..., Xr} and |·| denotes the cardinality of a set. The above entropy can be used to measure the significance of attributes, and thus one can construct a heuristic algorithm for attribute reduction. For an attribute a ∈ A − B, the significance of a w.r.t. B is defined as:
Sig(a, B) = H(B ∪ {a}) − H(B).

By Theorem 1 and the method proposed in [26,46], we can now design an algorithm for MVCP based on rough sets.

Algorithm 1. An approximate algorithm for MVCP
Input: A graph G = (V, E);
Output: A vertex cover of G.
Step 1: Compute the set of core vertices of G; denote it by C;
Step 2: Compute the induced information system S = (U, V) from G;
Step 3: While H(C) ≠ H(V) Do
        C ← C ∪ {a0}, where a0 is an attribute satisfying Sig(a0, C) = max{Sig(ak, C) | ak ∈ V − C};
Step 4: Output the vertex cover C.

Step 1 can be done in O(|V|) time and Step 2 in O(|E|) time. Step 3 is the so-called forward reduction procedure, whose time complexity is O(|U|^2 |A|) [41]; by Definition 1, Step 3 therefore needs O(|E|^2 |V|) time. Thus, the time complexity of Algorithm 1 is O(|E|^2 |V|).

To further illustrate the effectiveness of the newly proposed algorithm, we compare it with some existing algorithms for MVCP, which have been summarized by Gomes et al. [14]. For convenience, the new algorithm is written as VCRS. The algorithms were tested on randomly generated graphs: each algorithm was executed 8 times on random graphs with the same number of vertices and edges, and 24 random graphs were tested in total. The experiments were performed on a personal computer with Windows XP, an Intel Pentium Dual (E2140) 1.6 GHz processor and 1.94 GB of memory. The algorithms were implemented in the same language (Matlab), and the LINGO 11.0 software was used to solve the linear and integer programs. Let n and m denote the number of vertices and the number of edges, respectively, of the input graph. Tables 8–10 list the experimental results for MVCP. The column Optimum is the optimum value for MVCP of the input graph, solved by the LINGO software; the column Value is the value of the solution found by an algorithm; the column Time is the running time (in seconds) of the algorithm.
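For concreteness, Algorithm 1 can be rendered in a few lines of Python (a sketch under our own naming; the paper's experiments used a Matlab implementation):

```python
import math

def vcrs(V, E):
    """A sketch of Algorithm 1. V: list of vertices; E: list of edge
    vertex-sets, with loops given as singletons."""
    # Step 2: objects of the induced IS; object e_j takes value j on the
    # vertices of N(e_j), plus the all-zero object e_{m+1} of Definition 1.
    rows = [tuple((j + 1) if v in e else 0 for v in V)
            for j, e in enumerate(E)] + [(0,) * len(V)]

    def H(B):
        """Shannon entropy of the partition U/R_B."""
        idx = [V.index(v) for v in B]
        counts = {}
        for r in rows:
            k = tuple(r[i] for i in idx)
            counts[k] = counts.get(k, 0) + 1
        n = len(rows)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    # Step 1: the core vertices are exactly the vertices carrying a loop.
    C = {v for e in E if len(e) == 1 for v in e}
    # Step 3: add the vertex of maximal significance Sig(a, C); since H(C)
    # is fixed within one iteration, maximizing H(C ∪ {a}) is equivalent.
    while H(C) != H(V):
        C.add(max((v for v in V if v not in C), key=lambda v: H(C | {v})))
    return C                                # Step 4: a vertex cover of G

# The graph of Example 2: the minimum cover {v1, v3} is found.
print(sorted(vcrs(["v1", "v2", "v3", "v4"],
                  [{"v1", "v2"}, {"v2", "v3"}, {"v1", "v3"},
                   {"v1", "v4"}, {"v3"}])))   # ['v1', 'v3']
```

Note that the sketch compares floating-point entropies for the loop test; an implementation intended for large graphs would compare the partitions themselves to avoid rounding issues.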
Table 8. Results of the Greedy, Round, Dual-LP and VCRS algorithms on graphs with n = 100 and m = 200 for MVCP.

Optimum        Greedy         Round          Dual-LP        VCRS
Value  Time    Value  Time    Value  Time    Value  Time    Value  Time
54     1.00    82     0.02    98     0.01    98     1.01    57     0.20
53     1.00    76     0.01    98     0.01    99     1.00    55     0.17
55     1.00    77     0.01    96     0.01    99     1.00    58     0.19
54     1.00    81     0.01    94     0.01    99     1.01    59     0.19
51     1.00    73     0.01    95     0.01    96     1.01    51     0.16
51     1.00    71     0.01    84     0.01    99     1.00    52     0.16
55     1.00    83     0.02    97     0.01    98     1.01    57     0.17
52     1.00    74     0.02    94     0.01    98     1.01    54     0.17
Table 9. Results of the Greedy, Round, Dual-LP and VCRS algorithms on graphs with n = 150 and m = 350 for MVCP.

Optimum        Greedy         Round          Dual-LP        VCRS
Value  Time    Value  Time    Value  Time    Value  Time    Value  Time
80     1.00    114    0.01    143    1.00    149    3.01    85     0.70
86     7.00    129    0.01    146    2.01    148    3.01    91     0.76
85     4.00    132    0.01    147    1.00    149    3.00    89     0.72
82     2.00    124    0.01    146    1.00    147    3.02    87     0.78
86     5.00    127    0.04    148    2.01    148    3.03    90     0.78
79     2.00    122    0.01    148    1.00    149    3.01    82     0.70
83     8.00    120    0.02    144    1.00    146    4.01    90     0.76
83     6.00    118    0.01    142    1.00    146    3.01    87     0.73
Table 10. Results of the Greedy, Round, Dual-LP and VCRS algorithms on graphs with n = 300 and m = 500 for MVCP.

Optimum        Greedy         Round          Dual-LP         VCRS
Value  Time    Value  Time    Value  Time    Value  Time     Value  Time
147    9.00    235    0.02    266    8.00    284    21.06    153    6.86
149    9.00    239    0.02    259    8.00    293    22.04    153    6.89
146    9.00    218    0.02    257    8.00    289    21.05    152    6.92
147    9.00    228    0.04    273    8.01    287    21.12    157    7.59
146    9.00    236    0.02    258    8.00    289    21.06    156    7.26
148    9.00    229    0.03    261    8.00    290    21.06    154    7.11
146    9.00    230    0.03    261    8.00    286    21.05    155    7.21
147    9.00    227    0.03    259    8.00    292    21.04    152    7.50
Table 11. Averages and standard deviations of the ratios for graphs with n = 100 and m = 200 for MVCP.

Algorithm    Min     Average    Max     Std. dev.
Greedy       1.39    1.45       1.52    0.05
Round        1.65    1.78       1.86    0.07
Dual-LP      1.78    1.85       1.94    0.05
VCRS         1.00    1.04       1.09    0.03
From Tables 8–10, one can see that VCRS is faster than the Round and Dual-LP algorithms but slower than the Greedy algorithm. The reason is that the time complexity of Greedy is O(|E||V|), whereas VCRS needs O(|E|^2 |V|); the Round and Dual-LP algorithms both need to solve a linear program, which is time consuming. In the following, we introduce another important index to measure the quality of a solution produced by an algorithm for MVCP [14]. The Ratio is defined as Value/Optimum, where Value is the value of the solution found by an algorithm and Optimum is the optimal solution value. Note that for an approximation algorithm for MVCP, the smaller the ratio, the better the algorithm performs. Tables 11–13 summarize the ratios for the results reported in Tables 8–10. In these tables, the columns Min, Max and Std. dev. give, respectively, the minimum value, the maximum value and the standard deviation of the ratios.
Table 12. Averages and standard deviations of the ratios for graphs with n = 150 and m = 350 for MVCP.

Algorithm    Min     Average    Max     Std. dev.
Greedy       1.42    1.48       1.55    0.05
Round        1.70    1.75       1.87    0.06
Dual-LP      1.72    1.78       1.89    0.06
VCRS         1.04    1.06       1.08    0.01
Table 13. Averages and standard deviations of the ratios for graphs with n = 300 and m = 500 for MVCP.

Algorithm    Min     Average    Max     Std. dev.
Greedy       1.49    1.57       1.62    0.04
Round        1.74    1.78       1.86    0.04
Dual-LP      1.93    1.96       1.99    0.02
VCRS         1.03    1.05       1.07    0.02
From Tables 11–13, we can see that VCRS outperforms the other methods on the ratios: the value found by VCRS is quite close to the optimal solution value. For example, on the graphs of Table 11 the Greedy, Round, Dual-LP and VCRS algorithms deviated, on average, by 45%, 78%, 85% and 4% from the optimal solution value. Furthermore, the VCRS algorithm may even find an optimal solution; see graph number 5 in Table 8. This means that the VCRS algorithm has a superior performance on the ratios compared to the Greedy, Round and Dual-LP algorithms. Considering both the computational time and the ratios, we can conclude that the newly presented algorithm is an effective method for MVCP.

6. Conclusions

We have discussed the attribute reduction and vertex cover problems in this paper and established the relationship between them. The results show that finding a minimal vertex cover of a graph is equivalent to finding an attribute reduct of an information system, and that finding a minimum vertex cover of a graph is equivalent to finding an optimal reduct of its information system. Finally, as an application of the theoretical results, a new algorithm for MVCP based on rough sets has been presented; the experimental results demonstrate that it performs well on the ratio values. This study may provide new methods for both problems. In future work, we plan to investigate attribute reduction of decision systems based on graph theory.

Acknowledgements

We would like to thank the referees for their valuable comments and suggestions. This work was supported by grants from the National Natural Science Foundation of China (nos. 61379021, 61303131, 11301367 and 11061004), the Department of Education of Fujian Province (nos. JK2013027, JK2011031, JA13202, JA11171 and JA14192), the Natural Science Foundation of Fujian Province (nos.
2013J01028, 2013J01029, 2012D141 and 2013J01265) and the Science Foundation of Minnan Normal University in China (no. SJ1015). References [1] A. Bargiela, W. Pedrycz, Granular Computing: An Introduction, Kluwer Academic Publishers, Dordrecht, 2003. [2] S.S. Bhattacharyya, S. Sriram, E.A. Lee, Resynchronization for multiprocessor DSP systems, IEEE Trans. Circuits Syst I: Fundam. Theory Appl. 47 (11) (2000) 1597–1609. [3] J.A. Bondy, U.S.R. Murty, Graph Theory with Applications, Elsevier Science Publishing Co., Inc., 1976. [4] J.A. Bondy, U.S.R. Murty, Graph Theory, Springer, Berlin, 2008. [5] A. Caprara, P. Toth, D. Vigo, M. Fischetti, Modeling and solving the crew rostering problem, Oper. Res. 46 (6) (1998) 820–830. [6] D.G. Chen, C.Z. Wang, Q.H. Hu, A new approach to attribute reduction of consistent and inconsistent covering decision systems with covering rough sets, Inf. Sci. 177 (2007) 3500–3518. [7] D.G. Chen, S.Y. Zhao, L. Zhang, et al., Sample pair selection for attribute reduction with rough set, IEEE Trans. Knowledge Data Eng. 24 (11) (2012) 2080–2093. [8] D.G. Chen, Y.Y. Yang, Attribute reduction for heterogeneous data based on the combination of classical and Fuzzy rough set models, IEEE Trans. Fuzzy Syst. 22 (5) (2014) 1325–1334. [9] V. Chvatal, A greedy heuristic for the set-covering problem, Math. Oper. Res. 4 (3) (1979) 233–235. [10] T. Eiter, G. Gottlob, Identifying the minimal transversals of a hypergraph and related problems, SIAM J. Comput. 24 (6) (1995) 1278–1304. [11] F. Feng, X.Y. Liu, V. Leoreanu-Fotea, Y.B. Jun, Soft sets and soft rough sets, Inf. Sci. 181 (2011) 1125–1137. [12] F. Gavril, Private Communication cited in [14] (1974). [13] J.W. Grzymala-Busse, LERS-a system for learning from examples based on rough sets, in: R. Slowinski (Ed.), Intelligent Decision Support: Handbook of Applications and Advances of the Rough Set Theory, Kluwer Academic Publishers, 1992, pp. 3–18. [14] F.C. Gomes, C.N. Meneses, P.M. Pardalos, G.V.R. 
Vianaa, Experimental analysis of approximation algorithms for the vertex cover and set covering problems, Comput. Oper. Res. 33 (12) (2006) 3520–3534.
[15] J.W. Guan, D.A. Bell, Rough computational methods for information systems, Artif. Intell. 105 (1998) 77–103.
[16] D.S. Hochbaum, Efficient bounds for the stable set, vertex cover and set packing problems, Discrete Appl. Math. 6 (3) (1983) 243–254.
[17] D.S. Hochbaum, Approximation algorithms for the set covering and vertex cover problems, SIAM J. Comput. 11 (3) (1982) 555–556.
[18] X.H. Hu, N. Cercone, Learning in relational databases: a rough set approach, Int. J. Comput. Intell. 11 (2) (1995) 323–338.
[19] Q.H. Hu, Z.X. Xie, D.R. Yu, Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation, Pattern Recognit. 40 (12) (2007) 3509–3521.
[20] Q.H. Hu, D.R. Yu, M.Z. Guo, Fuzzy preference based rough sets, Inf. Sci. 180 (2010) 2003–2022.
[21] Q.H. Hu, D.R. Yu, W. Pedrycz, D.G. Chen, Kernelized fuzzy rough sets and their applications, IEEE Trans. Knowl. Data Eng. 23 (11) (2011) 1649–1667.
[22] R.M. Karp, Reducibility Among Combinatorial Problems, Springer, US, 1972.
[23] P. Kulaga, P. Sapiecha, S. Krzysztof, Approximation algorithm for the argument reduction problem, in: Computer Recognition Systems, Springer, Berlin, 2005, pp. 243–248.
[24] M. Li, C. Shang, S. Feng, J. Fan, Quick attribute reduction in inconsistent decision tables, Inf. Sci. 254 (2014) 155–180.
[25] J.Y. Liang, Z.B. Xu, The algorithm on knowledge reduction in incomplete information systems, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 10 (1) (2002) 95–103.
[26] J.Y. Liang, Z.Z. Shi, The information entropy, rough entropy and knowledge granulation in rough set theory, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 12 (2004) 37–46.
[27] P. Lingras, C. Butz, Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification, Inf. Sci. 177 (18) (2007) 3782–3798.
[28] Y. Lin, J. Li, P. Lin, G. Lin, J. Chen, Feature selection via neighborhood multi-granulation fusion, Knowl.-Based Syst. 67 (2014) 162–168.
[29] S. Listrovoy, S. Minukhin, The solution algorithms for problems on the minimal vertex cover in networks and the minimal cover in Boolean matrixes, Int. J. Comput. Sci. Issues 9 (5) (2012) 8–15.
[30] Z.Q. Meng, Z.Z. Shi, Extended rough set-based attribute reduction in inconsistent incomplete decision systems, Inf. Sci. 204 (2012) 44–69.
[31] D.Q. Miao, Y. Zhao, Y.Y. Yao, et al., Relative reducts in consistent and inconsistent decision tables of the Pawlak rough set model, Inf. Sci. 179 (24) (2009) 4140–4150.
[32] M. Moshkov, M. Piliszczuk, Graphical representation of information on the set of reducts, in: Rough Sets and Knowledge Technology, Lect. Notes Comput. Sci. 4481 (2007) 372–378.
[33] M. Moshkov, A. Skowron, Z. Suraj, On covering attribute sets by reducts, in: Rough Sets and Intelligent Systems Paradigms, Lect. Notes Comput. Sci. 4585 (2007) 175–180.
[34] G.L. Nemhauser, L.E. Trotter Jr., Vertex packings: structural properties and algorithms, Math. Program. 8 (1) (1975) 232–248.
[35] Z. Pawlak, Rough sets, Int. J. Comput. Inf. Sci. 11 (5) (1982) 341–356.
[36] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers, Dordrecht, 1991.
[37] W. Pedrycz, Granular Computing: Analysis and Design of Intelligent Systems, CRC Press/Francis Taylor, Boca Raton, 2013.
[38] W. Pedrycz, W. Homenda, Building the fundamentals of granular computing: a principle of justifiable granularity, Appl. Soft Comput. 13 (10) (2013) 4209–4218.
[39] W. Pedrycz, Allocation of information granularity in optimization and decision-making models: towards building the foundations of granular computing, Eur. J. Oper. Res. 232 (1) (2014) 137–145.
[40] Y.H. Qian, J.Y. Liang, Combination entropy and combination granulation in rough set theory, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 16 (2) (2008) 179–193.
[41] Y.H. Qian, J.Y. Liang, W. Pedrycz, C.Y. Dang, Positive approximation: an accelerator for attribute reduction in rough set theory, Artif. Intell. 174 (9) (2010) 597–618.
[42] Y.H. Qian, S.Y. Li, J.Y. Liang, et al., Pessimistic rough set based decisions: a multigranulation fusion strategy, Inf. Sci. 264 (2014) 196–210.
[43] H.D. Sherali, M. Rios, An air force crew allocation and scheduling problem, J. Oper. Res. Soc. 35 (2) (1984) 91–103.
[44] W. Shu, W. Qian, A fast approach to attribute reduction from perspective of attribute measures in incomplete decision systems, Knowl.-Based Syst. 72 (2014) 60–71.
[45] A. Skowron, C. Rauszer, The discernibility matrices and functions in information systems, in: R. Slowinski (Ed.), Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, Kluwer, Dordrecht, 1992.
[46] D. Slezak, Approximate entropy reducts, Fundam. Inf. 53 (3–4) (2002) 365–390.
[47] B. Sun, W. Ma, H. Zhao, Decision-theoretic rough fuzzy set model and application, Inf. Sci. 283 (2014) 180–196.
[48] R.W. Swiniarski, A. Skowron, Rough set methods in feature selection and recognition, Pattern Recognit. Lett. 24 (6) (2003) 833–849.
[49] V.V. Vazirani, Approximation Algorithms, Springer, Berlin, 2001.
[50] B. Walczak, D.L. Massart, Rough sets theory, Chemometr. Intell. Lab. Syst. 47 (1) (1999) 1–16.
[51] J. Wang, J. Wang, Reduction algorithms based on discernibility matrix: the ordered attributes method, J. Comput. Sci. Technol. 16 (2001) 489–504.
[52] G.Y. Wang, J. Zhao, J.J. An, A comparative study of algebra viewpoint and information viewpoint in attribute reduction, Fundam. Inf. 68 (3) (2005) 289–301.
[53] F. Wang, J.Y. Liang, C.Y. Dang, Attribute reduction for dynamic data sets, Appl. Soft Comput. 13 (1) (2013) 676–689.
[54] C. Wang, Q. He, D. Chen, Q. Hu, A novel method for attribute reduction of covering decision systems, Inf. Sci. 254 (2014) 181–196.
[55] C. Wang, M. Shao, B. Sun, Q. Hu, An improved attribute reduction scheme with covering based rough sets, Appl. Soft Comput. 26 (2015) 235–243.
[56] G. Wang, H. Yu, T. Li, Decision region distribution preservation reduction in decision-theoretic rough set model, Inf. Sci. 278 (2014) 614–640.
[57] G. Wang, X. Ma, H. Yu, Monotonic uncertainty measures for attribute reduction in probabilistic rough set model, Int. J. Approx. Reason. 59 (2015) 41–67.
[58] Wikipedia, http://en.wikipedia.org/wiki/Hypergraph, 2015 (last accessed 10.06.15).
[59] S.K.M. Wong, W. Ziarko, On optimal decision rules in decision tables, Bull. Pol. Acad. Sci. 33 (1985) 693–696.
[60] L.R. Woodyatt, K.L. Stott, F.E. Wolf, F.J. Vasko, An application combining set covering and fuzzy sets to optimally assign metallurgical grades to customer orders, Fuzzy Sets Syst. 53 (1) (1993) 15–25.
[61] W.Z. Wu, M. Zhang, H.Z. Li, J.S. Mi, Knowledge reduction in random information systems via Dempster–Shafer theory of evidence, Inf. Sci. 174 (2005) 143–164.
[62] S.X. Wu, M.Q. Li, W.T. Huang, S.F. Liu, An improved heuristic algorithm of attribute reduction in rough set, J. Syst. Sci. Inf. 2 (3) (2004) 557–562.
[63] Y.Y. Yao, Y. Zhao, Discernibility matrix simplification for constructing attribute reducts, Inf. Sci. 179 (7) (2009) 867–882.
[64] X. Zhang, D. Miao, Quantitative information architecture, granular computing and rough set models in the double-quantitative approximation space of precision and grade, Inf. Sci. 268 (2014) 147–168.
[65] X. Zhang, D. Miao, Reduction target structure-based hierarchical attribute reduction for two-category decision-theoretic rough sets, Inf. Sci. 277 (2014) 755–776.
[66] S.Y. Zhao, H. Chen, C.P. Li, M.Y. Zhai, X.Y. Du, RFRR: robust fuzzy rough reduction, IEEE Trans. Fuzzy Syst. 21 (5) (2013) 825–841.
[67] K. Zheng, J. Hu, Z. Zhan, J. Ma, J. Qi, An enhancement for heuristic attribute reduction algorithm in rough set, Expert Syst. Appl. 41 (15) (2014) 6748–6754.
[68] W. Ziarko, Introduction to the special issue on rough sets and knowledge discovery, Comput. Intell. 11 (2) (1995) 223–226.