International Journal of Approximate Reasoning 76 (2016) 80–95
An incremental attribute reduction approach based on knowledge granularity under the attribute generalization

Yunge Jing a,b, Tianrui Li a,*, Junfu Huang a, Yingying Zhang a

a School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
b Department of Public Computer Teaching, Yuncheng University, Yuncheng 044000, China

* Corresponding author. E-mail addresses: [email protected] (Y. Jing), [email protected] (T. Li), [email protected] (J. Huang), [email protected] (Y. Zhang).
Article history: Received 17 November 2015; received in revised form 26 April 2016; accepted 2 May 2016; available online 6 May 2016.

Keywords: Decision system; Knowledge granularity; Attribute reduction; Incremental learning; Rough set theory

Abstract: Attribute reduction is a key step in discovering interesting patterns in decision systems with large numbers of attributes. In recent years, with the fast development of data processing tools, information systems may grow quickly in attributes over time. Since the result of attribute reduction may change as attributes are added, updating attribute reducts efficiently under attribute generalization has become an important task in knowledge discovery. This paper investigates incremental attribute reduction algorithms based on knowledge granularity in decision systems under the variation of attributes. Incremental mechanisms to calculate the new knowledge granularity are first introduced. Then, the corresponding incremental algorithms for attribute reduction based on the calculated knowledge granularity are presented for the case where multiple attributes are added to the decision system. Finally, a complexity analysis and experiments on UCI data sets show that the proposed incremental methods are effective and efficient at updating attribute reducts as attributes increase.
1. Introduction

Attribute reduction has attracted much attention recently as an important data preprocessing tool to improve recognition accuracy and discover potentially useful knowledge in research areas such as knowledge discovery, pattern recognition, expert systems, data mining, decision support and machine learning [1–9]. In practice, many real data sets now grow dynamically in attributes. Non-incremental approaches are often infeasible for such data since they must repeat the whole computation and consume a large amount of computational time, while incremental approaches are effective because they can reuse the results already obtained on the original decision system.

Attribute reduction based on Rough Set Theory (RST) has many successful applications since it preserves the information of the original decision system. Many heuristic attribute reduction approaches have been developed based on information entropy, positive region, discernibility matrix, decision cost and knowledge granularity [10–15]. However, these methods apply effectively only to static decision systems and are very inefficient on dynamic ones. In practice, dynamic variation of the attribute set arises in many settings such as risk prediction and image processing. For example, in a distributed decision system, we need to centralize all data from different
locations, which may result in the expansion of attributes in the decision system. Therefore, an incremental attribute reduction algorithm is needed to obtain valuable knowledge efficiently from dynamically varying data sets.

In view of dynamically increasing information systems, many incremental updating algorithms for the variation of the attribute set have been proposed [16–20,33]. Based on upper and lower boundary sets, Chan proposed an incremental algorithm to calculate approximations in RST when one attribute is added to or deleted from the information system [21]. Li et al. developed an incremental approach to compute approximations under attribute generalization in incomplete information systems [22]. For the incomplete decision system, Shu et al. proposed a positive region-based method for updating the attribute reduct efficiently with a dynamically varying attribute set [23]. Cheng constructed two incremental methods for fast updating of approximations based on rough fuzzy sets, through boundary sets and cut sets, respectively [25]. Wang et al. proposed a dimension incremental strategy for updating attribute reducts based on information entropy, together with an efficient algorithm that finds a new reduct with dynamically increasing attributes in decision systems [26]. Zeng et al. presented incremental approaches based on fuzzy rough sets to update the attribute reduct of hybrid information systems when attributes are added to or deleted from the system [27]. Li et al. constructed a dominance matrix to compute dominating and dominated sets when the attribute set varies, and proposed an incremental method for computing approximations [28]. In set-valued information systems, Luo et al. introduced an incremental mechanism for updating relevant matrices and developed incremental algorithms for calculating approximations [29].

The matrix is a very useful computing tool for dealing with decision systems; matrix theory and methods are indispensable in modern physical science, economics, biology and computer science. Many matrix-based incremental learning algorithms have been proposed to deal with dynamic data sets [24,34,35,39–41], most of which focus on updating approximations when the decision system varies dynamically. To fully explore the properties of reduct updating, this paper first proposes a matrix-based incremental reduction algorithm for dynamic data sets based on knowledge granularity. The matrix-based algorithm updates the reduct efficiently when the data sets are small; for large data sets, however, it is inefficient because it needs much memory and computational time. To overcome this deficiency, an efficient non-matrix incremental reduction algorithm is also developed. A series of experiments on 6 UCI data sets shows that the proposed non-matrix incremental method is faster than the matrix-based incremental method, and that its computation time is much smaller than those of the non-incremental counterpart and of incremental algorithms based on entropy and positive region.

The remainder of this paper is arranged as follows. Section 2 briefly reviews basic concepts of RST and knowledge granularity. Section 3 introduces a matrix presentation of knowledge granularity and a general heuristic reduction algorithm based on knowledge granularity for the decision system.
In Section 4, incremental reduction algorithms based on matrix and non-matrix representations for adding multiple attributes are presented. In Section 5, experiments are performed to verify the efficiency and effectiveness of the proposed algorithms. The paper ends with conclusions and future research in Section 6.

2. Preliminaries

In this section, we review several basic definitions of knowledge granularity in RST [30,31].

Definition 1. [31] Given a decision system $S = (U, C \cup D, V, f)$ and $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$, the knowledge granularity of $C$ is defined as

$$GP_U(C) = \sum_{i=1}^{m} \frac{|X_i|^2}{|U|^2}, \tag{1}$$
where the equivalence relation $IND(C)$ induced by the nonempty subset $C$ is $IND(C) = \{(x, y) \in U \times U \mid \forall a \in C, f(x, a) = f(y, a)\}$.

Definition 2. [30] Given a decision system $S = (U, C \cup D, V, f)$, $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$ and $U/IND(C \cup D) = \{Y_1, Y_2, \cdots, Y_n\}$, the knowledge granularity of $C$ relative to $D$ is defined as

$$GP_U(D|C) = GP_U(C) - GP_U(C \cup D). \tag{2}$$
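Definitions 1 and 2 reduce to counting block sizes in the partitions $U/IND(C)$ and $U/IND(C \cup D)$, so they are easy to compute directly. The following Python sketch is our own illustration (names such as `partition` and `gp` are hypothetical, not from the paper); it computes $GP_U(C)$ and $GP_U(D|C)$ for a table stored as a dictionary of attribute columns, and the printed values can be checked against Table 1 below.

```python
from collections import defaultdict
from fractions import Fraction

def partition(table, attrs):
    """Equivalence classes of IND(attrs); `table` maps each attribute
    name to its column, and objects are the row indices 0..|U|-1."""
    blocks = defaultdict(list)
    n = len(next(iter(table.values())))
    for i in range(n):
        blocks[tuple(table[a][i] for a in attrs)].append(i)
    return list(blocks.values())

def gp(table, attrs):
    """Knowledge granularity GP_U(attrs) = sum_i |X_i|^2 / |U|^2 (Eq. (1))."""
    n = len(next(iter(table.values())))
    return Fraction(sum(len(b) ** 2 for b in partition(table, attrs)), n ** 2)

def gp_rel(table, cond, dec):
    """Relative granularity GP_U(D|C) = GP_U(C) - GP_U(C ∪ D) (Eq. (2))."""
    return gp(table, cond) - gp(table, list(cond) + list(dec))

# Table 1 below: C = {a, b, c, e, f}, D = {d}.
table1 = {'a': [1, 0, 0, 0, 0, 1, 1, 0, 0], 'b': [0, 0, 1, 0, 1, 0, 0, 1, 1],
          'c': [0, 1, 1, 1, 1, 0, 0, 1, 1], 'e': [0, 0, 1, 0, 1, 0, 0, 0, 0],
          'f': [1, 1, 0, 1, 0, 0, 0, 1, 1], 'd': [0, 0, 1, 0, 1, 1, 1, 0, 1]}
C, D = ['a', 'b', 'c', 'e', 'f'], ['d']
print(gp(table1, C), gp_rel(table1, C, D))   # 17/81 2/81
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests in the later algorithms are not disturbed by floating-point rounding.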
The relative knowledge granularity was used to construct the heuristic attribute reduction algorithm in [30]. This reduction algorithm generates a feature subset with the same discernibility ability as the original attribute set.

Definition 3. [30] Given a decision system $S = (U, C \cup D, V, f)$, $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$ and $B \subseteq C$. $\forall a \in B$, the significance measure (inner significance) of $a$ in $B$ is defined as

$$Sig^{inner}(a, B, D) = GP_U(D|(B - \{a\})) - GP_U(D|B). \tag{3}$$
Table 1
A decision system.

u    a  b  c  e  f  d
u1   1  0  0  0  1  0
u2   0  0  1  0  1  0
u3   0  1  1  1  0  1
u4   0  0  1  0  1  0
u5   0  1  1  1  0  1
u6   1  0  0  0  0  1
u7   1  0  0  0  0  1
u8   0  1  1  0  1  0
u9   0  1  1  0  1  1
Definition 4. [30] Given a decision system $S = (U, C \cup D, V, f)$ and $B \subseteq C$. $\forall a \in C - B$, the significance measure (outer significance) of $a$ in $B$ is defined as

$$Sig^{outer}(a, B, D) = GP_U(D|B) - GP_U(D|(B \cup \{a\})). \tag{4}$$
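Equations (3) and (4) then become one-line differences of `gp_rel` from the sketch above (again our own illustrative code, not the authors'):

```python
def inner_sig(table, B, D, a):
    """Sig^inner(a, B, D) = GP_U(D|(B - {a})) - GP_U(D|B)  (Eq. (3))."""
    return gp_rel(table, [x for x in B if x != a], D) - gp_rel(table, B, D)

def outer_sig(table, B, D, a):
    """Sig^outer(a, B, D) = GP_U(D|B) - GP_U(D|(B ∪ {a}))  (Eq. (4))."""
    return gp_rel(table, B, D) - gp_rel(table, list(B) + [a], D)

# With Table 1: inner_sig(table1, C, D, 'b') == 4/81 > 0, so b is
# indispensable, while inner_sig(table1, C, D, 'a') == 0.
```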
Definition 5. [30] Given a decision system $S = (U, C \cup D, V, f)$ and $B \subseteq C$. Then $B$ is a relative reduct of $S$ based on knowledge granularity if $B$ satisfies:
(1) $GP_U(D|B) = GP_U(D|C)$;
(2) $\forall a \in B$, $GP_U(D|(B - \{a\})) \neq GP_U(D|B)$.

The first condition of Definition 5 ensures that the reduct has the same discernibility ability as the original attribute set, and the second condition guarantees that the reduct is minimal [16,36].

3. Attribute reduction based on knowledge granularity

This section develops a heuristic attribute reduction algorithm based on knowledge granularity for the decision system. The algorithm keeps the knowledge granularity of the target decision unchanged.

3.1. The matrix presentation of the knowledge granularity

Definition 6. [38] Given a decision system $S = (U, C \cup D, V, f)$ with $U = \{u_1, u_2, \cdots, u_{|U|}\}$, let $R_C$ be the equivalence relation induced on $U$ by $C$. The relation matrix $(M^{R_C}_U)_{|U| \times |U|} = (w_{ij})_{|U| \times |U|}$ is defined by

$$w_{ij} = \begin{cases} 1, & (u_i, u_j) \in R_C \\ 0, & (u_i, u_j) \notin R_C \end{cases} \qquad 1 \le i, j \le |U|, \tag{5}$$

where $|U|$ is the cardinality of $U$. For convenience, $(M^{R_C}_U)_{|U| \times |U|}$ is written $M^{R_C}_U$ in the following.
Definition 7. [34] Given a decision system $S = (U, C \cup D, V, f)$, let $M^{R_C}_U = (w_{ij})_{|U| \times |U|}$ be the relation matrix. Then the knowledge granularity of $C$ can be written as

$$GP_U(C) = \sum_{i=1}^{|U|} \sum_{j=1}^{|U|} \frac{w_{ij}}{|U|^2} = \frac{sum(M^{R_C}_U)}{|U|^2} = \overline{M^{R_C}_U}, \tag{6}$$

where $sum(M^{R_C}_U)$ is the sum of all elements of the matrix $M^{R_C}_U$, and $\overline{M^{R_C}_U}$ is the average of the matrix $M^{R_C}_U$.

The following example shows how to compute the knowledge granularity based on the relation matrix.

Example 1. Table 1 is a decision system with $U = \{u_1, u_2, \cdots, u_9\}$, $C = \{a, b, c, e, f\}$ and $D = \{d\}$. Computing $GP_U(C)$ by Definition 7, we have
$$GP_U(C) = \overline{M^{R_C}_U} = \frac{1}{81} \times sum\begin{pmatrix}
1&0&0&0&0&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&1&1\\
0&0&0&0&0&0&0&1&1
\end{pmatrix} = \frac{17}{81}.$$
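As a cross-check of Definition 7, the relation matrix and its average are direct to express with NumPy; the sketch below is ours (hypothetical names) and rebuilds $M^{R_C}_U$ for Table 1, reproducing $17/81$ up to floating point.

```python
import numpy as np

def relation_matrix(table, attrs):
    """M^R_U of Definition 6: w_ij = 1 iff u_i and u_j agree on all attrs."""
    cols = np.array([table[a] for a in attrs]).T        # shape |U| x |attrs|
    return (cols[:, None, :] == cols[None, :, :]).all(-1).astype(int)

def gp_matrix(table, attrs):
    """GP_U(attrs) = sum(M) / |U|^2, the average of M (Eq. (6))."""
    M = relation_matrix(table, attrs)
    return M.sum() / M.shape[0] ** 2

# With table1, C and D from the earlier sketch:
#   gp_matrix(table1, C) == 17/81 ≈ 0.2099, and by Eq. (7) below
#   gp_matrix(table1, C) - gp_matrix(table1, C + D) == 2/81 ≈ 0.0247.
```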
Definition 8. Given a decision system $S = (U, C \cup D, V, f)$, let $M^{R_C}_U$ and $M^{R_{C \cup D}}_U$ be the corresponding relation matrices. The knowledge granularity of $D$ with respect to $C$ is defined as

$$GP_U(D|C) = \overline{M^{R_C}_U} - \overline{M^{R_{C \cup D}}_U}. \tag{7}$$
Example 2 (continued from Example 1). According to Definition 8, we have

$$GP_U(D|C) = \overline{M^{R_C}_U} - \overline{M^{R_{C \cup D}}_U} = \frac{1}{81} \times \left( sum\begin{pmatrix}
1&0&0&0&0&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&1&1\\
0&0&0&0&0&0&0&1&1
\end{pmatrix} - sum\begin{pmatrix}
1&0&0&0&0&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&1&0\\
0&0&0&0&0&0&0&0&1
\end{pmatrix} \right) = \frac{17}{81} - \frac{15}{81} = \frac{2}{81}.$$
Definition 9. Given a decision system $S = (U, C \cup D, V, f)$, let $M^{R_C}_U$, $M^{R_{C-\{a\}}}_U$, $M^{R_{C \cup D}}_U$ and $M^{R_{(C-\{a\}) \cup D}}_U$ be the corresponding relation matrices. $\forall a \in C$, the significance measure (inner significance) of $a$ in $C$ is defined as

$$Sig^{inner}(a, C, D) = \overline{M^{R_{C-\{a\}}}_U} - \overline{M^{R_{(C-\{a\}) \cup D}}_U} - \overline{M^{R_C}_U} + \overline{M^{R_{C \cup D}}_U}. \tag{8}$$
Definition 10. [16,30] Given a decision system $S = (U, C \cup D, V, f)$ and $a \in C$, the core of $S$ is defined as

$$Core_C = \{a \in C \mid Sig^{inner}(a, C, D) > 0\}. \tag{9}$$

Definition 11. Given a decision system $S = (U, C \cup D, V, f)$ and $C_0 \subseteq C$, let $M^{R_{C_0}}_U$, $M^{R_{C_0 \cup D}}_U$, $M^{R_{C_0 \cup \{a\}}}_U$ and $M^{R_{(C_0 \cup \{a\}) \cup D}}_U$ be the corresponding relation matrices. $\forall a \in C - C_0$, the significance measure (outer significance) of $a$ in $C_0$ is defined as

$$Sig^{outer}(a, C_0, D) = \overline{M^{R_{C_0}}_U} - \overline{M^{R_{C_0 \cup D}}_U} - \overline{M^{R_{C_0 \cup \{a\}}}_U} + \overline{M^{R_{(C_0 \cup \{a\}) \cup D}}_U}. \tag{10}$$
A heuristic attribute reduction algorithm obtains a reduct by gradually adding the attributes with the highest significance to the core [16,30,36]. Since the non-matrix heuristic reduction strategy is similar to the matrix-based one, Algorithm 1 states a general heuristic attribute reduction algorithm under knowledge granularity [16,25,30]; a runnable sketch of it follows the pseudocode below.

4. Incremental attribute reduction algorithm when adding multiple attributes

In practice, data processing tools have developed rapidly in recent years, so a decision system may gain attributes quickly over time in real-life applications. When many attributes are added to the decision system, Algorithm 1 must recompute on the whole decision system, which is generally inefficient, so updating attribute reducts efficiently becomes a vital task. Hence, two efficient incremental reduction algorithms for the addition of attributes are developed in this section.
Algorithm 1: A General Heuristic Attribute Reduction Algorithm (GHARA) based on knowledge granularity.
Input: A decision table S = (U, C ∪ D, V, f)
Output: An attribute reduct RED_C on C
1  begin
2    RED_C ← ∅
3    for 1 ≤ i ≤ |C| do
4      Calculate Sig^{inner}(a_i, C, D)
5      if Sig^{inner}(a_i, C, D) > 0 then
6        RED_C ← RED_C ∪ {a_i}
7      end
8    end
9    Let B ← RED_C
10   while GP_U(D|B) ≠ GP_U(D|C) do
11     for each a_i ∈ (C − B) do
12       Calculate Sig^{outer}(a_i, B, D)
13     end
14     a_0 = max{Sig^{outer}(a_i, B, D), a_i ∈ (C − B)}
15     B ← B ∪ {a_0}
16   end
17   for each a_i ∈ B do
18     if GP_U(D|(B − {a_i})) = GP_U(D|C) then
19       B ← B − {a_i}
20     end
21   end
22   RED_C ← B
23   return RED_C
24 end
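Reusing `gp_rel`, `inner_sig` and `outer_sig` from the sketches in Sections 2–3, the control flow of Algorithm 1 can be rendered in Python roughly as follows (a sketch of the procedure, not the authors' implementation):

```python
def ghara(table, C, D):
    """Algorithm 1 (GHARA): heuristic reduct under knowledge granularity."""
    target = gp_rel(table, C, D)
    # Steps 2-8: start from the core (attributes with positive inner significance).
    B = [a for a in C if inner_sig(table, C, D, a) > 0]
    # Steps 9-16: greedily add the attribute with the highest outer
    # significance until B discerns as much as C does.
    while gp_rel(table, B, D) != target:
        a0 = max((a for a in C if a not in B),
                 key=lambda a: outer_sig(table, B, D, a))
        B.append(a0)
    # Steps 17-21: drop attributes that turned out to be redundant.
    for a in list(B):
        if gp_rel(table, [x for x in B if x != a], D) == target:
            B.remove(a)
    return B

# ghara(table1, C, D) returns ['b', 'f'] for Table 1.
```

On Table 1 the core is {b, f} and already satisfies GP_U(D|{b, f}) = GP_U(D|C) = 2/81, so the greedy loop never runs.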
Table 2
An information system.

u    g  h
u1   1  0
u2   0  1
u3   1  0
u4   1  1
u5   1  1
u6   1  0
u7   0  1
u8   1  0
u9   1  0
4.1. Matrix-based incremental mechanisms to compute knowledge granularity on the variation of attributes

Given a dynamic decision system, the equivalence classes may be refined when an attribute set $P$ is added to the decision system, and the knowledge granularity of the new decision system may become smaller than that of the original one. In other words, some entries $w_{ij}$ in Definition 6 may change from 1 to 0. Definition 12 and Theorem 1 give the matrix-based incremental mechanisms to calculate the new relation matrices.

Definition 12. Given a decision system $S = (U, C \cup D, V, f)$ with relation matrix $M^{R_C}_U = (w_{ij})_{|U| \times |U|}$, suppose that $P$ is the newly added attribute set, $R_P$ is the equivalence relation induced by $P$ on $U$, and $M^{R_P}_U = (w'_{ij})_{|U| \times |U|}$ is its relation matrix. Then the incremental relation matrix $\Delta M^{R_C}_U = (q_{ij})_{|U| \times |U|}$ is defined as

$$q_{ij} = \begin{cases} 1, & w_{ij} = 1 \wedge w'_{ij} = 0 \\ 0, & \text{otherwise} \end{cases} \qquad 1 \le i, j \le |U|. \tag{11}$$

The following example illustrates the computation of the incremental relation matrix.

Example 3 (continued from Example 1). Table 2 is an information system; suppose that $P = \{g, h\}$ is the newly added attribute set. According to Definition 6, we have
$$M^{R_C}_U = \begin{pmatrix}
1&0&0&0&0&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&1&0&1&0&0&0&0&0\\
0&0&1&0&1&0&0&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&1&1&0&0\\
0&0&0&0&0&0&0&1&1\\
0&0&0&0&0&0&0&1&1
\end{pmatrix}, \qquad
M^{R_P}_U = \begin{pmatrix}
1&0&1&0&0&1&0&1&1\\
0&1&0&0&0&0&1&0&0\\
1&0&1&0&0&1&0&1&1\\
0&0&0&1&1&0&0&0&0\\
0&0&0&1&1&0&0&0&0\\
1&0&1&0&0&1&0&1&1\\
0&1&0&0&0&0&1&0&0\\
1&0&1&0&0&1&0&1&1\\
1&0&1&0&0&1&0&1&1
\end{pmatrix}.$$

Then, the incremental relation matrix is

$$\Delta M^{R_C}_U = \begin{pmatrix}
0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0\\
0&1&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0\\
0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0
\end{pmatrix}.$$
Theorem 1. [38] Given a decision system $S = (U, C \cup D, V, f)$, $B \subseteq C$, $B \neq \emptyset$ and $e_k \in C - B$, let $M^{R_B}_U = (b_{ij})_{|U| \times |U|}$ and $M^{R_{e_k}}_U = (a_{ij})_{|U| \times |U|}$ be the corresponding relation matrices. Suppose the attribute $e_k$ is added to $B$. Then the new relation matrix $(M^{R_{B \cup \{e_k\}}}_U)_{|U| \times |U|} = (w_{ij})_{|U| \times |U|}$ is given by

$$w_{ij} = AND(a_{ij}, b_{ij}), \tag{12}$$

where $AND$ is the logical conjunction operation.
Based on Definition 12 and Theorem 1, the matrix-based incremental mechanisms to update the knowledge granularity are given in Theorems 2–3.

Theorem 2. Given a decision system $S = (U, C \cup D, V, f)$, let $GP_U(C)$ be the existing knowledge granularity of $C$. Suppose that $P$ is the newly added attribute set and $\Delta M^{R_C}_U$ is the incremental relation matrix. Then the new knowledge granularity becomes

$$GP_U(C \cup P) = GP_U(C) - \frac{1}{|U|^2}\, sum(\Delta M^{R_C}_U). \tag{13}$$

Theorem 3. Given a decision system $S = (U, C \cup D, V, f)$ with relation matrices $M^{R_C}_U$ and $M^{R_{C \cup D}}_U$, let $GP_U(D|C)$ be the existing knowledge granularity of $C$ with respect to $D$. Suppose that $P$ is the newly added attribute set, with incremental relation matrices $\Delta M^{R_C}_U$ and $\Delta M^{R_{C \cup D}}_U$. Then the new relative knowledge granularity becomes

$$GP_U(D|C \cup P) = GP_U(D|C) - \frac{1}{|U|^2}\left(sum(\Delta M^{R_C}_U) - sum(\Delta M^{R_{C \cup D}}_U)\right). \tag{14}$$

Proof. From Definition 2, we have

$$GP_U(D|C \cup P) = GP_U(C \cup P) - GP_U(C \cup P \cup D)$$
$$= \left(GP_U(C) - \frac{1}{|U|^2}\, sum(\Delta M^{R_C}_U)\right) - \left(GP_U(C \cup D) - \frac{1}{|U|^2}\, sum(\Delta M^{R_{C \cup D}}_U)\right).$$

Because $GP_U(D|C) = GP_U(C) - GP_U(C \cup D)$, we have

$$GP_U(D|C \cup P) = GP_U(D|C) - \frac{1}{|U|^2}\left(sum(\Delta M^{R_C}_U) - sum(\Delta M^{R_{C \cup D}}_U)\right). \qquad \Box$$
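Definition 12 and Theorems 1–3 are elementwise matrix operations and translate directly to NumPy. The sketch below is ours (it reuses `relation_matrix` from the Section 3 sketch); on Tables 1–2 it gives $sum(\Delta M^{R_C}_U) = sum(\Delta M^{R_{C \cup D}}_U) = 6$ and hence $GP_U(D|C \cup P) = 2/81$, as in Example 4 below.

```python
def delta_matrix(M_old, M_P):
    """Incremental relation matrix of Definition 12:
    q_ij = 1 iff w_ij = 1 in M_old and w'_ij = 0 in M^{R_P}_U."""
    return ((M_old == 1) & (M_P == 0)).astype(int)

def new_relation_matrix(M_old, M_P):
    """Theorem 1: adding attributes intersects the relations, i.e. an
    elementwise AND of the two relation matrices."""
    return M_old & M_P

def gp_rel_after_adding(gp_d_c, M_C, M_CD, M_P):
    """Theorem 3: GP_U(D|C ∪ P) = GP_U(D|C)
       - (sum(ΔM^{R_C}) - sum(ΔM^{R_{C∪D}})) / |U|^2."""
    n = M_C.shape[0]
    d1 = delta_matrix(M_C, M_P).sum()
    d2 = delta_matrix(M_CD, M_P).sum()
    return gp_d_c - (d1 - d2) / n ** 2

# With table1 extended by Table 2 (P = {g, h}):
#   table1.update({'g': [1,0,1,1,1,1,0,1,1], 'h': [0,1,0,1,1,0,1,0,0]})
#   M_C, M_CD = relation_matrix(table1, C), relation_matrix(table1, C + D)
#   M_P = relation_matrix(table1, ['g', 'h'])
#   delta_matrix(M_C, M_P).sum() == 6, and the Theorem 3 update is 2/81.
```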
Example 4 (continued from Example 3). From Definition 12 and Theorem 3, we have

$$\Delta M^{R_C}_U = \Delta M^{R_{C \cup D}}_U = \begin{pmatrix}
0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0\\
0&1&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0\\
0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0
\end{pmatrix}.$$

Because $GP_U(D|C) = \frac{2}{81}$, we have

$$GP_U(D|C \cup P) = GP_U(D|C) - \frac{1}{|U|^2}\left(sum(\Delta M^{R_C}_U) - sum(\Delta M^{R_{C \cup D}}_U)\right) = \frac{2}{81} - \frac{1}{81}(6 - 6) = \frac{2}{81}.$$
4.2. A matrix-based incremental reduction algorithm when adding multiple attributes

In this section, a matrix-based incremental reduction algorithm under knowledge granularity is introduced for the variation of the attribute set. Suppose that $P$ is the attribute set added to the decision system. We first compute the new knowledge granularity by the incremental method. Then we gradually select the attribute with the highest outer significance from $(C \cup P) - RED_C$ and add it to the reduct. Finally, we delete the redundant attributes from the reduct. The detailed procedure is listed in Algorithm 2.
Algorithm 2: A Matrix-based Incremental Reduction Algorithm (MIRA) with the variation of the attribute set under knowledge granularity.
Input: A decision system S = (U, C ∪ D, V, f), the reduct RED_C on C and the newly added attribute set P
Output: A new attribute reduct RED_{C∪P} on C ∪ P
1  begin
2    B ← RED_C. Compute the relation matrices M^{R_P}_U, M^{R_C}_U and M^{R_{C∪D}}_U
3    Compute the new knowledge granularity GP_U(D|C ∪ P) (according to Theorem 3)
4    if GP_U(D|B) = GP_U(D|C ∪ P) then
5      go to Step 21
6    else
7      go to Step 9
8    end
9    while GP_U(D|B) ≠ GP_U(D|C ∪ P) do
10     for a_i ∈ ((C − B) ∪ P) do
11       Calculate the outer significance Sig^{outer}(a_i, B, D)
12     end
13     a_0 = max{Sig^{outer}(a_i, B, D), a_i ∈ ((C − B) ∪ P)}
14     B ← B ∪ {a_0}
15   end
16   for each a ∈ B do
17     if GP_U(D|B − {a}) = GP_U(D|C ∪ P) then
18       B ← B − {a}
19     end
20   end
21   RED_{C∪P} ← B
22   return RED_{C∪P}
23 end
Table 3 A comparison of time and memory complexities of attribute reduction based on algorithms GHARA and MIRA. Reduction algorithms
GHARA
MIRA
Time complexity Memory complexity
O ((|C | + | P |) |U | + (|C | + | P |)|U |) O (|U |2 + (|C | + | P |)|U |) 2
O ((|C | + | P |)2 |U | + | P ||U |) O (|U |2 + | P ||U |)
The time and memory complexities of Algorithm MIRA are as follows. When the attribute set $P$ is added to the decision system, the new knowledge granularity is obtained by the incremental mechanism; the time complexity of this computation is $O((|C| + |P|)|U| + |U|)$, and the time complexity of Steps 4–20 is $O((|C| + |P|)^2|U| + |P||U|)$. Hence the total time complexity of the proposed incremental method is $O((|C| + |P|)^2|U| + |P||U|)$. In addition, the memory complexities of MIRA and GHARA are $O(|U|^2 + |P||U|)$ and $O(|U|^2 + (|C| + |P|)|U|)$, respectively. A comparison of the time and memory complexities of computing the reduct by GHARA and MIRA is shown in Table 3. Since $(|C| + |P|)|U|$ is usually much greater than $|P||U|$, the time and memory complexities of the non-incremental algorithm are usually much greater than those of the incremental algorithm; in other words, the proposed matrix-based incremental method is more efficient than its non-incremental counterpart.

4.3. Incremental mechanisms based on non-matrix to calculate knowledge granularity on the variation of attributes

Matrix-based computation has many advantages, but matrix-based incremental reduction under knowledge granularity needs a large amount of memory and computational time on large data sets. To improve efficiency, we develop a non-matrix incremental attribute reduction method for large data sets. First, non-matrix incremental mechanisms for computing knowledge granularity are introduced in this section.

For convenience, we fix some notation used in the following theorems. Let $S = (U, C \cup D, V, f)$ be a decision system and $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$. Suppose that $P$ is the newly added attribute set. We may assume that $U/IND(C \cup P) = \{Z_1, Z_2, \cdots, Z_k, Z_1^{k+1}, Z_2^{k+1}, \cdots, Z_{l_{k+1}}^{k+1}, Z_1^{k+2}, Z_2^{k+2}, \cdots, Z_{l_{k+2}}^{k+2}, \cdots, Z_1^m, Z_2^m, \cdots, Z_{l_m}^m\}$, where $Z_i$ $(i = 1, 2, \cdots, k)$ are the conditional classes of $U/IND(C)$ that remain unchanged, and $\cup_{j=1}^{l_e} Z_j^e$ $(e = k+1, k+2, \cdots, m)$ are the conditional classes of $U/IND(C)$ that are split [26].

Example 5. In Table 1, $U = \{u_1, u_2, \cdots, u_9\}$ and $U/IND(C) = \{\{u_1\}, \{u_2, u_4\}, \{u_3, u_5\}, \{u_6, u_7\}, \{u_8, u_9\}\}$. Suppose that $P = \{g, h\}$ is the newly added attribute set, where $g = (1, 0, 1, 1, 1, 1, 0, 1, 1)$ and $h = (0, 1, 0, 1, 1, 0, 1, 0, 0)$; then $U/IND(C \cup P) = \{\{u_1\}, \{u_2\}, \{u_4\}, \{u_3\}, \{u_5\}, \{u_6\}, \{u_7\}, \{u_8, u_9\}\}$.
Hence, we have $Z_1 = \{u_1\}$, $Z_2 = \{u_8, u_9\}$ (so $k = 2$); $Z_1^3 = \{u_2\}$, $Z_2^3 = \{u_4\}$, $l_3 = 2$, $X_3 = Z_1^3 \cup Z_2^3$; $Z_1^4 = \{u_3\}$, $Z_2^4 = \{u_5\}$, $l_4 = 2$, $X_4 = Z_1^4 \cup Z_2^4$; and $Z_1^5 = \{u_6\}$, $Z_2^5 = \{u_7\}$, $l_5 = 2$, $X_5 = Z_1^5 \cup Z_2^5$.

Theorems 4–5 present the non-matrix incremental mechanisms under knowledge granularity.

Theorem 4. Given a decision system $S = (U, C \cup D, V, f)$ and $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$, let $GP_U(C)$ be the existing knowledge granularity of $C$. Suppose that $P$ is the newly added attribute set and write $U/IND(C \cup P) = \{Z_1, Z_2, \cdots, Z_k, Z_1^{k+1}, \cdots, Z_{l_{k+1}}^{k+1}, \cdots, Z_1^m, \cdots, Z_{l_m}^m\}$. Then the new knowledge granularity becomes

$$GP_U(C \cup P) = GP_U(C) - \frac{2}{|U|^2} \sum_{e=k+1}^{m} \sum_{i=1}^{l_e} \sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e|. \tag{15}$$
Proof. From Definition 1, we have

$$GP_U(C) = \sum_{i=1}^{m} \frac{|X_i|^2}{|U|^2} = \frac{1}{|U|^2}\left(\sum_{i=1}^{k} |Z_i|^2 + \sum_{e=k+1}^{m}\Big(\sum_{i=1}^{l_e} |Z_i^e|\Big)^2\right)$$
$$= \frac{1}{|U|^2}\left(\sum_{i=1}^{k} |Z_i|^2 + \sum_{e=k+1}^{m}\sum_{i=1}^{l_e} |Z_i^e|^2 + 2\sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e|\right).$$

Because

$$GP_U(C \cup P) = \frac{1}{|U|^2}\left(\sum_{i=1}^{k} |Z_i|^2 + \sum_{e=k+1}^{m}\sum_{i=1}^{l_e} |Z_i^e|^2\right),$$

we have

$$GP_U(C) = GP_U(C \cup P) + \frac{2}{|U|^2} \sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e|.$$

Hence,

$$GP_U(C \cup P) = GP_U(C) - \frac{2}{|U|^2} \sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e|. \qquad \Box$$
Example 6 (continued from Example 5). From Definition 1 we have $GP_U(C) = \frac{17}{81}$, and by Theorem 4,

$$2\sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e| = 2 + 2 + 2 = 6.$$

Hence,

$$GP_U(C \cup P) = \frac{17}{81} - \frac{6}{81} = \frac{11}{81}.$$
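The correction term in Theorem 4 needs only the sizes of the pieces into which each old class splits, so it can be computed without matrices. Here is a sketch in the same style as before (ours; `partition` and `gp` are reused); on Tables 1–2 the correction is $6/81$ and $GP_U(C \cup P) = 11/81$, matching Example 6.

```python
from fractions import Fraction

def gp_after_adding(gp_c, old_blocks, table, attrs_with_p):
    """Theorem 4: GP_U(C ∪ P) = GP_U(C)
       - (2/|U|^2) Σ_e Σ_{i<j} |Z_i^e| |Z_j^e|."""
    n = len(next(iter(table.values())))
    correction = 0
    for block in old_blocks:                      # each X_e in U/IND(C)
        sub = {}
        for i in block:                           # split X_e by C ∪ P
            sub.setdefault(tuple(table[a][i] for a in attrs_with_p),
                           []).append(i)
        sizes = [len(s) for s in sub.values()]    # |Z_1^e|, ..., |Z_{l_e}^e|
        # Only classes that actually split (l_e > 1) contribute.
        correction += sum(sizes[i] * sizes[j] for i in range(len(sizes))
                          for j in range(i + 1, len(sizes)))
    return gp_c - Fraction(2 * correction, n ** 2)

# After table1 has been extended with the g, h columns of Table 2:
# gp_after_adding(gp(table1, C), partition(table1, C), table1,
#                 C + ['g', 'h']) == 11/81
```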
Theorem 5. Given a decision system $S = (U, C \cup D, V, f)$ with $U/IND(C) = \{X_1, X_2, \cdots, X_m\}$ and $U/IND(C \cup D) = \{Y_1, Y_2, \cdots, Y_n\}$, let $GP_U(D|C)$ be the existing knowledge granularity of $C$ with respect to $D$. Suppose that $P$ is the newly added attribute set. Write $U/IND(C \cup P) = \{Z_1, \cdots, Z_k, Z_1^{k+1}, \cdots, Z_{l_{k+1}}^{k+1}, Z_1^{k+2}, \cdots, Z_{l_{k+2}}^{k+2}, \cdots, Z_1^m, \cdots, Z_{l_m}^m\}$ and $U/IND(C \cup P \cup D) = \{H_1, \cdots, H_k, H_1^{k+1}, \cdots, H_{l_{k+1}}^{k+1}, H_1^{k+2}, \cdots, H_{l_{k+2}}^{k+2}, \cdots, H_1^n, \cdots, H_{l_n}^n\}$. Then the new relative knowledge granularity becomes

$$GP_U(D|C \cup P) = GP_U(D|C) - \frac{2}{|U|^2}\left(\sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e| - \sum_{e=k+1}^{n}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |H_i^e||H_j^e|\right). \tag{16}$$
Proof. From Definition 2, $GP_U(D|C \cup P) = GP_U(C \cup P) - GP_U(C \cup P \cup D)$. Because

$$GP_U(C \cup P) = GP_U(C) - \frac{2}{|U|^2}\sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e|,$$
$$GP_U(C \cup P \cup D) = GP_U(C \cup D) - \frac{2}{|U|^2}\sum_{e=k+1}^{n}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |H_i^e||H_j^e|,$$

we obtain

$$GP_U(D|C \cup P) = GP_U(C) - GP_U(C \cup D) - \frac{2}{|U|^2}\left(\sum_{e=k+1}^{m}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |Z_i^e||Z_j^e| - \sum_{e=k+1}^{n}\sum_{i=1}^{l_e}\sum_{j=i+1}^{l_e} |H_i^e||H_j^e|\right).$$

Since $GP_U(D|C) = GP_U(C) - GP_U(C \cup D)$, Eq. (16) follows. $\qquad \Box$
4.4. An incremental reduction algorithm based on non-matrix when adding multiple attributes

In this section, a non-matrix incremental reduction algorithm under knowledge granularity is developed for the variation of the attribute set. When some attributes are added to the decision system, we first update the previous knowledge granularity by the incremental mechanisms. If the knowledge granularity under the new attribute set equals the previous one, the previous reduct remains unchanged; otherwise it is updated. The detailed procedure is listed in Algorithm 3 (denoted IARC).

The time and memory complexities of Algorithm IARC are as follows. When the new attribute set $P$ is added to the decision system, the time complexity of computing the new knowledge granularity is $O(|C||U| + |P||w|)$, where $|w|$ denotes the number of conditional classes changed by adding $P$. The time complexity of Steps 3–20 is $O((|C| + |P|)^2|U| + |P||w|)$, so the total time complexity of the proposed incremental method is $O((|C| + |P|)^2|U| + |P||w|)$. In addition, the memory complexity is $O(|U||P|)$. A comparison of the time and memory complexities of computing the reduct by IARC and MIRA is shown in Table 4. Since $|P||U|$ is usually much greater than $|P||w|$, and $|U||P|$ is much smaller than $|U|^2 + |U||P|$, the time and memory complexities of Algorithm IARC are lower than those of Algorithm MIRA when multiple attributes are added to the decision system.
Algorithm 3: An Incremental Algorithm for Reduct Computation (IARC) based on knowledge granularity.
Input: A decision system S = (U, C ∪ D, V, f), the reduct RED_C on C and the newly added attribute set P
Output: A new reduct RED_{C∪P} on C ∪ P
1  begin
2    B ← RED_C. Calculate the new partitions U/IND(C ∪ P) and U/IND(C ∪ P ∪ D)
3    Calculate the new knowledge granularity GP_U(D|C ∪ P) (according to Theorem 5)
4    if GP_U(D|B) = GP_U(D|C ∪ P) then
5      go to Step 21
6    else
7      go to Step 9
8    end
9    while GP_U(D|B) ≠ GP_U(D|C ∪ P) do
10     for each a_i ∈ ((C − B) ∪ P) do
11       Calculate Sig^{outer}(a_i, B, D)
12     end
13     a_0 = max{Sig^{outer}(a_i, B, D), a_i ∈ ((C − B) ∪ P)}
14     B ← B ∪ {a_0}
15   end
16   for each a ∈ B do
17     if GP_U(D|B − {a}) = GP_U(D|C ∪ P) then
18       B ← B − {a}
19     end
20   end
21   RED_{C∪P} ← B
22   return RED_{C∪P}
23 end
Table 4
A comparison of the time and memory complexities of algorithms MIRA and IARC.

Reduction algorithm   Time complexity                     Memory complexity
MIRA                  O((|C| + |P|)²|U| + |P||U|)         O(|U|² + |U||P|)
IARC                  O((|C| + |P|)²|U| + |P||w|)         O(|U||P|)
Table 5
A description of the data sets.

ID  Data sets                     Rows    Attributes  Classes
1   Lung cancer                   32      56          3
2   Dermatology                   366     34          6
3   Kr-vs-kp                      3196    36          2
4   Ticdata2000                   5822    85          2
5   Mushroom                      5644    22          2
6   Letter-recognition (Letter)   20000   16          26
Table 6
A description of the personal computer configuration.

ID  Name                  Model                         Parameters
1   System                Windows 7                     32 bit
2   Memory                Samsung DDR3 SDRAM            4.0 GB
3   CPU                   Pentium(R) Dual-Core E5800    3.20 GHz
4   Software environment  Eclipse 3.7                   32-bit (JDK 1.6.0_20)
5   Hard disk             SATA                          500 GB
5. Experimental analysis

5.1. A description of datasets and experimental environment

We executed a series of experiments to verify the efficiency and effectiveness of the two proposed incremental methods for attribute reduction when multiple attributes are added to the decision system. In the experiments, we downloaded 6 data sets from the UCI machine learning repository, shown in Table 5. All experiments were executed on a computer whose software and hardware configuration is listed in Table 6.
Table 7
A comparison of GHARA, MIRA and IARC on experimental results (NSF = number of selected features; times in seconds).

Lung cancer
  GHARA: NSF 8; reduct 6, 13, 3, 14, 7, 15, 20, 26; time 2.723
  MIRA:  NSF 8; reduct 6, 13, 3, 14, 7, 15, 20, 26; time 0.056
  IARC:  time 0.001
Dermatology
  GHARA: NSF 7; reduct 34, 16, 4, 19, 3, 28, 21; time 1.395
  MIRA:  NSF 12; reduct 16, 4, 3, 2, 17, 5, 1, 14, 12, 6, 9, 18; time 0.162
  IARC:  time 0.026
Kr-vs-kp
  GHARA: NSF 35; reduct 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 33, 24, 21, 36, 26, 27, 34, 30, 35, 23, 20, 31, 32, 22, 19, 25, 29, 28; time 5694
  MIRA:  NSF 30; reduct 1, 3, 4, 5, 6, 7, 10, 12, 13, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 27, 28, 30, 31, 33, 34, 35, 36, 11, 32, 22; time 957.4
  IARC:  time 1.513
Ticdata2000
  GHARA: NSF 72; reduct 1, 2, 44, 47, 55, 59, 68, 80, 83, 18, 31, 30, 28, 15, 38, 9, 23, 17, 37, 7, 39, 24, 36, 35, 22, 19, 14, 32, 27, 25, 10, 13, 34, 12, 26, 16, 33, 42, 8, 40, 29, 11, 6, 3, 4, 20, 21, 41, 54, 75, 49, 70, 61, 82, 64, 85, 43, 48, 69, 72, 51, 73, 52, 57, 78, 84, 63, 45, 66, 56, 77, 79; time 9198
  MIRA:  NSF 47; reduct 1, 2, 31, 30, 18, 15, 28, 38, 17, 7, 23, 9, 39, 37, 59, 47, 68, 44, 65, 80, 55, 76, 54, 75, 70, 49, 64, 85, 83, 62, 61, 82, 51, 72, 84, 63, 45, 66, 52, 46, 73, 67, 57, 78, 69, 48, 79; time 1037
  IARC:  time 13.58
Mushroom
  GHARA: NSF 4; reduct 5, 20, 9, 3; time 1328
  MIRA:  NSF 4; reduct 5, 20, 9, 3; time 840.4
  IARC:  time 0.890
Letter
  GHARA: NSF 13; reduct 4, 8, 15, 9, 11, 13, 10, 7, 6, 12, 14, 3, 5; time 219758
  MIRA:  NSF 13; reduct 4, 8, 15, 9, 11, 13, 10, 7, 6, 12, 14, 3, 5; time 106141
  IARC:  time 3.395
5.2. A comparison of running time on different data sets

To verify that the incremental algorithms can obtain a feasible reduct in a much shorter time, we run the non-incremental algorithm GHARA and the incremental algorithms MIRA and IARC on the 6 data sets of Table 5. To test the efficiency and effectiveness of the proposed methods, each data set is divided into a basic data set and an incremental data set: the basic data set consists of 50% of the condition attributes together with the decision attribute, and the remaining 50% of the condition attributes are taken as the newly added attribute set. After enlarging the basic data set with the added attributes, we use GHARA, MIRA and IARC to update the reduct of each data set; the protocol is sketched in code below. The reduct, the number of selected features (NSF) and the computational time for each data set are shown in Table 7. Since the reduct and NSF obtained by MIRA are identical to those of IARC, only the computational time of IARC is shown in Table 7 for simplicity.

From Table 7, the reducts and NSF found by the incremental algorithms MIRA and IARC are relatively close to those of the non-incremental algorithm GHARA, yet MIRA and IARC obtain a feasible reduct in a much shorter time, especially on the larger data sets such as Ticdata2000, Mushroom and Letter. The reason is that the incremental algorithms avoid recomputation: they calculate the new knowledge granularity and reduct from the previous results, and obtain the new reduct by gradually adding the remaining attributes with the highest significance to the original reduct. The non-incremental algorithm GHARA, by contrast, must recompute the knowledge granularity and the core attributes, and then build a new reduct by gradually adding the attributes with the highest significance to the core. This explains the differences between the reducts produced by the non-incremental and incremental approaches.

In addition, the reduct selected by IARC is identical to that of MIRA, and the computation time of IARC is lower than that of MIRA except on Lung cancer. This is because the knowledge granularity obtained by MIRA coincides with that obtained by IARC, so the reducts agree, while MIRA needs more memory and computational time than IARC. The experimental results verify that, compared with the non-incremental algorithm GHARA and the incremental algorithm MIRA, the incremental algorithm IARC deals efficiently and effectively with data sets whose attributes grow dynamically.
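The splitting protocol described above is easy to state in code; a minimal sketch (the helper name is ours), reusing `ghara` and `iarc` from the earlier sketches:

```python
def split_attributes(cond_attrs, frac=0.5):
    """First `frac` of the condition attributes form the basic decision
    system; the rest play the role of the newly added attribute set P."""
    k = int(len(cond_attrs) * frac)
    return cond_attrs[:k], cond_attrs[k:]

# basic, added = split_attributes(C)
# red0 = ghara(table1, basic, D)              # reduct of the basic system
# red1 = iarc(table1, basic, added, D, red0)  # reduct updated incrementally
```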
We select 50% condition attributes and the decision attribute of each data set in Table 5 as the basic data sets, 20%, 40%, · · · , 100% of the remaining 50% condition attributes are taken out as the new added attribute set, respectively. The combination of each adding attribute set and the basic decision system is viewed as the new data set. We employ these new data sets to calculate the reduct by GHARA, MIRA and IARC. By comparing GHARA, MIRA and IARC, we demonstrate the efficiency and effectiveness of MIRA, IARC. Experimental results of these data sets are displayed in Fig. 1. In each sub-figure in Fig. 1, x-axis pertains to different attribute numbers of data set and y-axis is the common logarithms of the computational time for updating reducts. The computational time of GHARA is shown with dot lines, that of algorithm MIRA is depicted with square lines and that of algorithm IARC is depicted with asterisk lines. In
Fig. 1. A comparison of GHARA, MIRA and IARC on computation time.

Table 8
A comparison of GHARA, MIRA and IARC on evaluation measures.

Data sets     GHARA AQ   GHARA AP   MIRA/IARC AQ   MIRA/IARC AP
Lung cancer   1          1          1              1
Dermatology   0.9946     0.9892     0.9792         0.9678
Kr-vs-kp      1          1          1              0.9999
Ticdata2000   0.980      0.961      0.9985         0.9789
Mushroom      1          1          1              1
Letter        0.9998     0.9997     0.9998         0.9997
To better reflect the running times of GHARA, MIRA and IARC, a logarithmic scale is used on the vertical axis of the charts in Fig. 1 so that the low computing times of IARC remain visible. Each sub-figure of Fig. 1 shows that the computation time of MIRA, IARC and GHARA increases with the growing size of the attribute set. Clearly, the proposed incremental algorithms MIRA and IARC are much faster than the non-incremental algorithm GHARA at updating reducts when different attribute sets are added to the decision system; the percentage improvement in computational time illustrates this best. In addition, the computational time of IARC is much smaller than that of MIRA. Therefore, the incremental algorithm IARC efficiently and effectively improves performance under dynamically varying attributes.

5.4. Efficiency analysis

Since there is a small difference between the reducts found by the incremental algorithms MIRA, IARC and the non-incremental algorithm GHARA, we employ two common evaluation measures of RST to evaluate the decision performance of the reducts generated by MIRA and IARC: the approximate classified quality (AQ) and the approximate classified precision (AP), defined by Pawlak to describe the precision of approximate classification in RST [32,37]. Experimental results based on the two measures are given in Table 8.

Definition 13. Let $S = (U, C \cup D, V, f)$ be a decision system and $U/IND(D) = \{X_1, X_2, \cdots, X_m\}$. The approximate classified precision of $C$ with respect to $D$ is defined as

$$AP_C(D) = \frac{|POS_C(D)|}{\sum_{i=1}^{m} |\overline{C}X_i|}. \tag{17}$$
Table 9
A comparison of GHARA, MIRA and IARC on classification accuracy.

Data sets     GHARA CA   GHARA SD   MIRA CA    MIRA SD    IARC CA    IARC SD
Lung cancer   0.785714   0.01491    0.785714   0.01491    0.785714   0.01491
Dermatology   0.877049   0.01135    0.881026   0.01132    0.881026   0.01132
Kr-vs-kp      0.901439   0.01619    0.884543   0.01598    0.884543   0.01598
Ticdata2000   0.730849   0.00675    0.812405   0.00623    0.812405   0.00623
Mushroom      0.997638   0.00483    0.997638   0.00483    0.997638   0.00483
Letter        0.759412   0.01195    0.759412   0.01195    0.759412   0.01195
Table 10
A comparison of IARC, DIA_RED and UARA on classification accuracy and computational time.

              IARC                            DIA_RED                         UARA
Data sets     Time/s  CA        SD            Time/s  CA        SD            Time/s  CA        SD
Lung cancer   0.001   0.785714  0.01491       0.045   0.774682  0.01514       1.325   0.839345  0.01396
Dermatology   0.011   0.881026  0.01132       0.083   0.838797  0.01148       2.878   0.820138  0.01148
Kr-vs-kp      1.513   0.884543  0.01598       18.70   0.881101  0.01603       6.565   0.859474  0.01658
Ticdata2000   13.58   0.812405  0.00623       213.90  0.733768  0.00768       25.62   0.808467  0.00645
Mushroom      0.890   0.997638  0.00483       10.07   0.997165  0.00512       15.23   1.000000  0.00000
Letter        3.395   0.759412  0.01195       307.8   0.758921  0.01216       148.6   0.759652  0.01198
Definition 14. Let $S = (U, C \cup D, V, f)$ be a decision system. The approximate classified quality of $C$ with respect to $D$ is defined as

$$AQ_C(D) = \frac{|POS_C(D)|}{|U|}. \tag{18}$$
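Both measures depend only on the positive region $POS_C(D)$ (the union of the lower approximations of the decision classes) together with the upper approximations in the denominator of Eq. (17). A sketch in the same style as the earlier code (ours; `partition` is reused):

```python
def ap_aq(table, C, D):
    """Approximate classified precision AP (Eq. (17)) and quality AQ (Eq. (18))."""
    n = len(next(iter(table.values())))
    d_blocks = [set(b) for b in partition(table, D)]
    c_blocks = [set(b) for b in partition(table, C)]
    # |POS_C(D)|: C-blocks fully contained in some decision class.
    pos = sum(len(cb) for cb in c_blocks if any(cb <= xb for xb in d_blocks))
    # Σ_i |upper approximation of X_i|: C-blocks intersecting each class.
    upper = sum(len(cb) for xb in d_blocks for cb in c_blocks if cb & xb)
    return pos / upper, pos / n   # (AP, AQ)

# For a consistent decision table both values equal 1.
```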
From Table 8, the AP and AQ of the reducts generated by the incremental algorithms IARC and MIRA are very close, and on some data sets identical, to those of algorithm GHARA. Hence the reduct found by the developed incremental method IARC is feasible.

5.5. Classification accuracy

We performed a series of experiments to compare the classification precision of algorithms GHARA, MIRA and IARC. Classification accuracies and standard deviations were computed on the reducts generated by GHARA, MIRA and IARC under the variation of the attribute set. The experimental results listed in Table 9 were acquired using a Bayesian classifier with 10-fold cross-validation. In 10-fold cross-validation, each of the 6 data sets in Table 5 is divided into ten parts of equal size; nine parts are used as the training set and the remaining part is retained as the testing set. The Bayesian classifier is built from the training set and then used to classify the testing data. Classification accuracy and standard deviation are computed on the reduced data found by the developed reduction algorithms, where MIRA and IARC compute the reducts with 50% of the condition attributes plus the decision attribute of each data set in Table 5 as the basic decision system and the other 50% as the added attribute set. For convenience, classification accuracy and standard deviation are written CA and SD, respectively.

From Table 9, the average classification accuracies and standard deviations of the non-incremental algorithm GHARA are very close to, or even the same as, those of the incremental algorithms MIRA and IARC on the six data sets when the decision system is renewed by the added attribute set, and the values for MIRA are identical to those for IARC. However, the computational time of the incremental algorithm IARC is much shorter than that of GHARA and MIRA. Hence the experimental results verify that IARC deals efficiently with decision systems under the variation of the attribute set.

5.6. A comparison with other incremental algorithms

To further illustrate the performance of the incremental algorithm IARC, we compare it with the dimension incremental algorithm based on entropy (DIA_RED) of [26] and the incremental algorithm based on positive region (UARA) of [23] on the six data sets with the added attribute sets. Likewise, the average classification accuracy and standard deviation for each data set are obtained using a Bayesian classifier with 10-fold cross-validation. The experimental results are shown in Table 10.
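As a sketch of the evaluation protocol (not the authors' code — we assume scikit-learn's `CategoricalNB` as a stand-in for the Bayesian classifier described above, and its `cross_val_score` for the 10-fold scheme):

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import cross_val_score

def cv_accuracy(table, reduct, D, folds=10):
    """Mean and standard deviation of 10-fold cross-validated accuracy of a
    naive Bayes classifier trained on the attributes in `reduct` only."""
    X = np.array([table[a] for a in reduct]).T   # objects x selected attrs
    y = np.array(table[D[0]])
    scores = cross_val_score(CategoricalNB(), X, y, cv=folds)
    return scores.mean(), scores.std()

# e.g. cv_accuracy(uci_table, reduct, D) on a UCI data set; a toy table
# like Table 1 has too few objects for a meaningful 10-fold estimate.
```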
It is clear that the computation time of IARC is much smaller than that of algorithms DIA_RED and UARA. The CA of DIA_RED is lower than that of IARC, and the SD of IARC is smaller than that of DIA_RED. The CA of UARA is smaller than that of IARC on the data sets Dermatology, Kr-vs-kp and Ticdata2000, and the SD of IARC is lower than that of UARA on Dermatology, Kr-vs-kp, Ticdata2000 and Letter. In other words, the reduct generated by the incremental algorithm IARC is more effective than those of DIA_RED and UARA, and its classification accuracy and standard deviation are better. Hence it is validated that IARC achieves better performance than the incremental algorithms DIA_RED and UARA.

6. Conclusions

A decision system may gain attributes quickly over time in real-life applications, so updating attribute reducts efficiently becomes an important task in knowledge discovery and related fields. In this paper, we first proposed incremental mechanisms to compute knowledge granularity under the variation of attributes. Then the corresponding incremental reduction algorithms based on knowledge granularity were developed. Finally, we carried out a series of experiments to evaluate the effectiveness of the developed algorithms. The experimental results showed that the presented algorithms can effectively update attribute reducts when the attribute set of a decision system changes. This research may shed light on dealing with complicated and large-scale data sets in practical applications.

The proposed incremental reduction methods consider only the variation of the attribute set in the decision system. In further work, we will study updating attribute reducts from a multi-granulation view based on knowledge granularity and extend our algorithms to other general information systems. Studying how to further improve our algorithms with cloud computing techniques to deal with big data is another piece of future work.

Acknowledgements

This work is supported by the National Science Foundation of China (Nos. 61573292, 61572406).

References

[1] R.W. Swiniarski, A. Skowron, Rough set methods in feature selection and recognition, Pattern Recognit. Lett. 24 (6) (2003) 833–849.
[2] I. Düntsch, G. Gediga, Uncertainty measures of rough set prediction, Artif. Intell. 106 (1) (1998) 109–137.
[3] I. Guyon, A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res. 3 (2003) 1157–1182.
[4] Y.Y. Yao, Three-way decisions with probabilistic rough sets, Inf. Sci. 180 (2010) 341–353.
[5] R. Kohavi, G.H. John, Wrappers for feature subset selection, Artif. Intell. 97 (1–2) (1997) 273–324.
[6] L. Sun, J.C. Xu, Y. Tian, Feature selection using rough entropy-based uncertainty measures in incomplete decision systems, Knowl.-Based Syst. 36 (2012) 206–216.
[7] H. Liu, L. Yu, Toward integrating feature selection algorithms for classification and clustering, IEEE Trans. Knowl. Data Eng. 17 (4) (2005) 491–502.
[8] Y.Y. Yao, N. Zhong, Potential applications of granular computing in knowledge discovery and data mining, in: Proceedings of the World Multiconference on Systemics, Cybernetics and Informatics, 1999, pp. 573–580.
[9] W.X. Zhang, J.S. Mi, W.Z. Wu, Approaches to knowledge reducts in inconsistent systems, Int. J. Intell. Syst. 21 (9) (2003) 989–1000.
[10] D. Dubois, H. Prade, Rough fuzzy sets and fuzzy rough sets, Int. J. Gen. Syst. 17 (1990) 191–209.
[11] X.H. Hu, N. Cercone, Learning in relational databases: a rough set approach, Int. J. Comput. Intell. 11 (2) (1995) 323–338.
[12] W. Ziarko, Variable precision rough set model, J. Comput. Syst. Sci. 46 (1) (1993) 39–59.
[13] D. Slezak, Approximate entropy reducts, Fundam. Inform. 53 (3–4) (2002) 365–390.
[14] J.S. Mi, W.Z. Wu, X.W. Zhang, Approaches to knowledge reduction based on variable precision rough set model, Inf. Sci. 159 (3–4) (2004) 255–272.
[15] Y.Y. Yao, Y. Zhao, Attribute reduction in decision-theoretic rough set models, Inf. Sci. 178 (2008) 356–373.
[16] J.Y. Liang, F. Wang, C.Y. Dang, Y.H. Qian, A group incremental approach to feature selection applying rough set technique, IEEE Trans. Knowl. Data Eng. 9 (2012) 1–31.
[17] M. Kryszkiewicz, P. Lasek, FUN: fast discovery of minimal sets of attributes functionally determining a decision attribute, Trans. Rough Sets 9 (2008) 76–95.
[18] Z.Y. Xu, Z.P. Liu, B.R. Yang, W. Song, A quick attribute reduction algorithm with complexity of max(O(|C||U|), O(|C|²|U/C|)), Chin. J. Comput. 29 (3) (2006) 391–398 (in Chinese).
[19] R. Susmaga, Experiments in incremental computation of reducts, in: A. Skowron, L. Polkowski (Eds.), Rough Sets in Data Mining and Knowledge Discovery, Springer-Verlag, Berlin, 1998.
[20] M. Orlowska, M. Orlowski, Maintenance of knowledge in dynamic information systems, in: R. Slowinski (Ed.), Intelligent Decision Support, Handbook of Applications and Advances of the Rough Set Theory, Kluwer Academic Publishers, Dordrecht, 1992, pp. 315–330.
[21] C. Chan, A rough set approach to attribute generalization in data mining, Inf. Sci. 107 (1–4) (1998) 169–176.
[22] T. Li, D. Ruan, W. Geert, J. Song, Y. Xu, A rough sets based characteristic relation approach for dynamic attribute generalization in data mining, Knowl.-Based Syst. 20 (5) (2007) 485–494.
[23] W.H. Shu, H. Shen, Updating attribute reduction in incomplete decision systems with the variation of attribute set, Int. J. Approx. Reason. 55 (2014) 867–884.
[24] J.B. Zhang, T.R. Li, D. Ruan, Z.Z. Gao, C.B. Zhao, Rough sets based matrix approaches with dynamic attribute variation in set-valued information systems, Int. J. Approx. Reason. 53 (4) (2012) 620–635.
[25] Y. Cheng, The incremental method for fast computing the rough fuzzy approximations, Data Knowl. Eng. 70 (2011) 84–100.
[26] F. Wang, J.Y. Liang, Y.H. Qian, Attribute reduction: a dimension incremental strategy, Knowl.-Based Syst. 39 (2013) 95–108.
[27] A.P. Zeng, T.R. Li, D. Liu, J.B. Zhang, H.M. Chen, A fuzzy rough set approach for incremental feature selection on hybrid information systems, Fuzzy Sets Syst. 258 (2015) 39–60.
[28] S.Y. Li, T.R. Li, D. Liu, Incremental updating approximations in dominance-based rough sets approach under the variation of the attribute set, Knowl.-Based Syst. 40 (2013) 17–26.
[29] C. Luo, T.R. Li, H.M. Chen, D. Liu, Incremental approaches for updating approximations in set-valued ordered information systems, Knowl.-Based Syst. 50 (2013) 218–233.
[30] J.C. Xu, J.C. Shi, L. Sun, Attribute reduction algorithm based on relative granularity in decision tables, Comput. Sci. 36 (3) (2002) 205–207 (in Chinese).
[31] D.Q. Miao, S.D. Fan, The calculation of knowledge granulation and its application, Syst. Eng. Theory Pract. 1 (2002) 48–56 (in Chinese).
[32] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Boston, 1991.
[33] H.M. Chen, T.R. Li, C. Luo, S.J. Horng, G.Y. Wang, A rough set-based method for updating decision rules on attribute values' coarsening and refining, IEEE Trans. Knowl. Data Eng. 26 (12) (2014) 2886–2899.
[34] L. Wang, T.R. Li, A matrix-based approach for calculation of knowledge granulation, Pattern Recognit. Artif. Intell. 26 (5) (2013) 448–453 (in Chinese).
[35] L. Wang, T.R. Li, Q. Liu, M. Li, A matrix-based approach for maintenance of approximations under the variation of object set, J. Comput. Res. Dev. 50 (9) (2013) 1992–2004 (in Chinese).
[36] G.Y. Wang, H. Yu, D.C. Yang, Decision table reduction based on conditional information entropy, Chin. J. Comput. 25 (7) (2002) 759–766 (in Chinese).
[37] Z. Pawlak, A. Skowron, Rudiments of rough sets, Inf. Sci. 177 (1) (2007) 3–27.
[38] J.B. Zhang, T.R. Li, D. Ruan, D. Liu, Rough sets based matrix approaches with dynamic attribute variation in set-valued information systems, Int. J. Approx. Reason. 53 (2012) 620–635.
[39] C. Luo, T.R. Li, H.M. Chen, L.X. Lu, Matrix approach to decision-theoretic rough sets for evolving data, Knowl.-Based Syst. 99 (2016) 123–134.
[40] J.B. Zhang, J.S. Wong, Y. Pan, T.R. Li, A parallel matrix-based method for computing approximations in incomplete information systems, IEEE Trans. Knowl. Data Eng. 27 (2) (2015) 326–339.
[41] J.B. Zhang, Y. Zhu, Y. Pan, T.R. Li, Efficient parallel boolean matrix based algorithms for computing composite rough set approximations, Inf. Sci. 329 (2016) 287–302.