Applied Soft Computing Journal 84 (2019) 105699
A parallel approach to calculate lower and upper approximations in dominance based rough set theory

Muhammad Summair Raza, Usman Qamar*

Computer Engineering Department, College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology (NUST), Islamabad, Pakistan

* Corresponding author.
Article info

Article history: Received 20 October 2018; Received in revised form 30 July 2019; Accepted 7 August 2019; Available online 12 August 2019

Keywords: Dominance-based rough set approach; Lower approximation; Upper approximation; Incremental approximation calculation
Abstract

Feature selection plays an important role in data mining and machine learning tasks. Rough set theory has been a prominent tool for this purpose. It characterizes a dataset by using two important measures called the lower and upper approximation. The dominance-based rough set approach (DRSA) is an extension of conventional rough set theory. It is based on the persistence of the preference order while extracting knowledge from datasets. The dominance principle states that objects belonging to a certain decision class should follow the preference order; the preference order states that an object having higher values of the conditional attributes should belong to a higher decision class. However, some of the basic concepts, such as checking the preference-order consistency of a dataset and the dominance-based lower and upper approximations, are computationally too expensive to be used for large datasets. In this paper, we propose a parallel incremental approach, called Parallel Incremental Approximation Calculation (PIAC for short), for calculating the lower and upper approximations. The proposed approach incrementally calculates the lower and upper approximations using parallel threads. We compare our method with the conventional approach using ten widely used datasets. Whilst achieving the same accuracy levels as the conventional approach, our approach significantly reduces the average computation time, i.e., 71% for the lower approximation and 70% for the upper approximation. Over all datasets, the decrease in memory usage achieved was 99%. © 2019 Elsevier B.V. All rights reserved.
1. Introduction

In the field of machine learning, a typical classification task requires (i) a set of cases having a certain number of conditional attributes, (ii) a known outcome, often termed the decision attribute, and (iii) the domains of the attributes, describing the possible values each attribute can take [1]. However, the preference order in the domain values of the attributes is often overlooked in classification-related tasks. In practice, we often come across situations where attributes have a preference order, that is, certain values are preferred over others. For example, consider a school or university course assessment where the overall grade (the decision attribute) is calculated on the basis of the marks obtained in various individual subjects (conditional attributes) such as Chemistry and Physics. Here, the decision attribute has a preference order, which means that the grade of "Excellent" is preferred over the grade of "Very Good", and so on. If two students get the same score in a subject but one of them gets a higher score in the other subjects, then the better performing student is expected to have a higher
(preferred) overall grade. However, if this logical expectation is not met, then we consider this to be an inconsistent outcome. This may occur for several different reasons, for example an incorrect entry or missing values, to name a few. Handling these inconsistencies in preferences is a core issue in knowledge discovery, and in real-life applications the consideration of the preference order and the handling of such inconsistencies become critical.

Rough Set Theory (RST) [1] is an important tool for the analysis of "messy" datasets containing inconsistencies, incomplete fields, or imprecise data. However, RST fails to consider the preference ordering in the domain values of the attributes. The Dominance-based Rough Set Approach (DRSA), proposed by Greco et al. [2-4], is an extension of conventional RST which generalizes the theory by substituting the indiscernibility relation with dominance relations. DRSA has a number of advantages over its conventional counterpart; however, its two core concepts of checking the consistency of a dataset and calculating the approximations are computationally too expensive to be used for anything beyond small datasets.

In this paper, we propose a Parallel Incremental Approximation Calculation (PIAC) approach for calculating the lower and upper approximations. Initially, we calculate the approximation incrementally by traversing each object in the dataset. Each object
is traversed and a decision is made as to whether it belongs to the approximation or not. The procedure is then further optimized using a parallel approach, which calculates the approximations in parallel: instead of traversing one object at a time, we traverse the dataset in parallel and thus calculate the approximations more efficiently.

The rest of the paper is organized as follows. Section 2 discusses the core preliminary concepts of DRSA. Section 3 describes the challenges of the conventional approach. Section 4 presents related work. The proposed approach is presented in Section 5. Section 6 gives the results and analysis. Finally, the summary and future work are discussed in Section 7.

2. Preliminaries

In conventional RST, a decision system is a finite set of items, called the universe U, where each item is characterized by a set of conditional attributes C and decision attributes D. Mathematically:

alpha = (U, C ∪ D)

In the context of DRSA, a decision table is a four-tuple, mathematically represented as:

alpha = (U, Q, V, f)

Here U represents the finite set of items and Q is the finite set of criteria, i.e. the attributes having an ordinal-scale-based domain. V = ⋃_{q∈Q} V_q, where V_q is the value set of criterion q. f is a function of the form f(x, q) which assigns a particular value from V_q to an item x for criterion q. Here Q = (C ∪ D), i.e. both the conditional attributes and the decision attribute(s) are included. Table 1 shows a sample decision system: the universe comprises seven items, i.e. U = {X1, X2, X3, ..., X7}, the conditional criteria are {Physics, Chemistry} and the decision criterion is {Final-Grade}. DRSA is an extension of conventional RST that can be used for the analysis of non-ordinal problems as well [5]. Right from its emergence it has been used in many domains, e.g. in the manufacturing industry [6], finance [7,8], project selection [9] and data mining [10-12]. Below, we discuss some core preliminaries of DRSA in order to discuss its benefits and limitations and the need for increasing its computational performance.

Table 1. Decision system.

U | Physics | Chemistry | Final-grade
X1 | A | B | Very Good
X2 | A | A | Excellent
X3 | B | C | Good
X4 | A | B | Good
X5 | B | A | Very Good
X6 | A | B | Very Good
X7 | C | B | Good

2.1. Dominance

For a set of criteria P ⊆ C, an item x dominates an item y if x is at least as good as y on every criterion in P, i.e. {∀q ∈ P, x ≽_q y}. This is written as "x dominates y" (x D_P y). D_P^-(x) denotes the set of items dominated by x considering the information in P ⊆ C, i.e. D_P^-(x) = {y ∈ U: x D_P y}. Similarly, the set of items dominating x is denoted by D_P^+(x), defined as D_P^+(x) = {y ∈ U: y D_P x}.

If we consider item X3 in Table 1 as our origin and P = {Physics}, then:

D_P^-(X3) = {X4, X5, X7}

Similarly,

D_P^+(X3) = {X1, X2, X5, X6}

2.2. Decision classes and class unions

In conventional RST, the decision attribute partitions the universe into a finite number of decision classes. Similarly, in DRSA the decision attribute divides the universe into finite decision classes Cl = {Cl_1, Cl_2, Cl_3, ..., Cl_m}. Note that each item belongs to one and only one decision class. Unlike in RST, the decision classes in DRSA are assumed to be preference ordered, so for r, s ∈ {1, 2, 3, ..., m}, an item belonging to Cl_r is preferred over an item belonging to Cl_s for r > s. Hence, instead of the simple approximations of RST, the approximations in DRSA are defined over upward and downward unions of classes:

Cl_t^≥ = ⋃_{s≥t} Cl_s,  t = 1, ..., n.
Cl_t^≤ = ⋃_{s≤t} Cl_s,  t = 1, ..., n.

Here, Cl_t^≥ specifies the set of items belonging to Cl_t or a more preferred class, while Cl_t^≤ specifies the set of items belonging to Cl_t or a less preferred class. In Table 1, the decision attribute "Final-Grade" has three decision classes: "Excellent", "Very Good" and "Good". "Excellent" is preferred over "Very Good", which is preferred over "Good". Taking their indexes as Excellent = 3, Very Good = 2 and Good = 1, then for t = 2:

Cl_t^≥ = {X1, X2, X5, X6}

i.e. Cl_t^≥ specifies the set of items belonging either to the class "Very Good" or to a more preferred class, i.e. "Excellent". Similarly,

Cl_t^≤ = {X3, X4, X7}

where Cl_t^≤ specifies the set of items belonging to the class "Very Good" or less. Likewise, for t = 1:

Cl_t^≥ = {X1, X2, X3, X4, X5, X6, X7}
Cl_t^≤ = {X3, X4, X7}
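To make these preliminaries concrete, the following Python sketch computes the dominating set D_P^+(x), the dominated set D_P^-(x) and the upward/downward unions of classes for the decision system of Table 1. It is an illustrative sketch, not the authors' implementation; the integer encoding of the grades (A > B > C, Excellent > Very Good > Good) is an assumption made only for this example.

```python
# Illustrative sketch (not the authors' code): dominance sets and class unions
# for the decision system of Table 1, under the assumed preference orders
# A > B > C and Excellent > Very Good > Good.

GRADE = {'A': 3, 'B': 2, 'C': 1}
CLASS = {'Excellent': 3, 'Very Good': 2, 'Good': 1}

# universe: item -> (conditional criteria values, decision class index)
U = {
    'X1': ({'Physics': 'A', 'Chemistry': 'B'}, CLASS['Very Good']),
    'X2': ({'Physics': 'A', 'Chemistry': 'A'}, CLASS['Excellent']),
    'X3': ({'Physics': 'B', 'Chemistry': 'C'}, CLASS['Good']),
    'X4': ({'Physics': 'A', 'Chemistry': 'B'}, CLASS['Good']),
    'X5': ({'Physics': 'B', 'Chemistry': 'A'}, CLASS['Very Good']),
    'X6': ({'Physics': 'A', 'Chemistry': 'B'}, CLASS['Very Good']),
    'X7': ({'Physics': 'C', 'Chemistry': 'B'}, CLASS['Good']),
}

def dominates(x, y, P):
    """True if x is at least as good as y on every criterion in P."""
    return all(GRADE[U[x][0][q]] >= GRADE[U[y][0][q]] for q in P)

def D_plus(x, P):
    """D_P^+(x): the set of items dominating x with respect to P."""
    return {y for y in U if dominates(y, x, P)}

def D_minus(x, P):
    """D_P^-(x): the set of items dominated by x with respect to P."""
    return {y for y in U if dominates(x, y, P)}

def upward_union(t):
    """Cl_t^>=: items belonging to class t or a more preferred class."""
    return {x for x, (_, cl) in U.items() if cl >= t}

def downward_union(t):
    """Cl_t^<=: items belonging to class t or a less preferred class."""
    return {x for x, (_, cl) in U.items() if cl <= t}

if __name__ == '__main__':
    P = ['Physics', 'Chemistry']
    print(sorted(D_plus('X1', P)))   # -> ['X1', 'X2', 'X4', 'X6']
    print(sorted(upward_union(2)))   # Cl_2^>= -> ['X1', 'X2', 'X5', 'X6']
```

These helper functions are reused by the sketches in the later sections.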
2.3. Lower approximations

The lower approximation in RST defines the set of items that certainly belong to a decision class with respect to the given attributes. In DRSA, given P ⊆ C, the P-lower approximation of Cl_t^≥ specifies all the items that, with certainty, belong to Cl_t^≥. Similarly, the P-lower approximation of Cl_t^≤ contains all the items that, with certainty, belong to Cl_t^≤. Mathematically:

P(Cl_t^≥) = {x ∈ U: D_P^+(x) ⊆ Cl_t^≥}
P(Cl_t^≤) = {x ∈ U: D_P^-(x) ⊆ Cl_t^≤}

Calculating the P-lower approximation, both for Cl_t^≥ and Cl_t^≤, comprises the three steps explained below with the help of the example dataset of Table 1.

Step-1: In the first step we calculate the items belonging to the union of classes Cl_t^≥. This is just like calculating the equivalence class
structure using the decision attribute in conventional RST. In our example we will calculate the lower approximation P(Cl_t^≥) for t = 2:

Cl_t^≥ = {X1, X2, X5, X6}

Step-2: In the second step we calculate D_P^+(x) for each item identified in Step-1. In our case:

For X1: D_P^+(X1) = {X1, X2, X4, X6}
For X2: D_P^+(X2) = {X2}
For X5: D_P^+(X5) = {X2, X5}
For X6: D_P^+(X6) = {X1, X2, X4, X6}

It is evident from the above example that we have to calculate D_P^+(x) for each individual item, which is a computationally expensive step: it requires multiple passes over the dataset, which becomes prohibitive for anything beyond small datasets.

Step-3: In the third step we actually calculate the lower approximation. The sets identified in Step-2 that are subsets of the set identified in Step-1 become part of the lower approximation. These are the items about which we can conclude, with certainty, that they belong to the union of class t or a more preferred class. So,

P(Cl_t^≥) = {X2, X5}
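For reference, the three steps just described can be written as a direct Python transcription, reusing the D_plus and upward_union helpers from the earlier sketch. This is only an illustration of the conventional procedure, not the authors' code; note that D_P^+(x) is recomputed for every item of the union, which is exactly the repeated dataset traversal criticized in the following sections.

```python
def lower_approx_geq(t, P):
    """Conventional P-lower approximation of Cl_t^>= (three-step procedure)."""
    # Step 1: items belonging to the upward union of classes Cl_t^>=.
    cl_geq = upward_union(t)
    # Step 2: compute D_P^+(x) for every item of the union
    # (one full pass over the dataset per item).
    dom_sets = {x: D_plus(x, P) for x in cl_geq}
    # Step 3: keep the items whose dominating set is contained in Cl_t^>=.
    return {x for x, dset in dom_sets.items() if dset <= cl_geq}

# lower_approx_geq(2, ['Physics', 'Chemistry']) -> {'X2', 'X5'}
```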
2.4. Upper approximations

In conventional RST the upper approximation defines the set of items that may possibly belong to the concept X. In DRSA, for P ⊆ C, the P-upper approximation of Cl_t^≥ defines the set of items that may possibly belong to the union of classes Cl_t^≥. Similarly, the P-upper approximation of Cl_t^≤ defines the set of items that may possibly belong to the union of classes Cl_t^≤. Mathematically:

P̄(Cl_t^≥) = {x ∈ U: D_P^-(x) ∩ Cl_t^≥ ≠ ∅}

For Cl_t^≤:

P̄(Cl_t^≤) = {x ∈ U: D_P^+(x) ∩ Cl_t^≤ ≠ ∅}
i.e. we cannot conclude with certainty that such an item belongs to the union Cl_t^≥; the same holds for P̄(Cl_t^≤). Calculating the upper approximation is also computationally expensive and comprises three steps. Performing these three steps in the conventional way results in serious performance bottlenecks for the algorithms using these measures. We now explain, with the help of an example, how to calculate the upper approximation. We will use Table 1 given above and calculate the P-upper approximation P̄(Cl_t^≥) for t = 2.

Step-1: Just like the P-lower approximation, the first step in calculating the P-upper approximation is to find all the items belonging to the union of classes Cl_t^≥. In our case:

Cl_t^≥ = {X1, X2, X5, X6}

Step-2: In the second step we calculate D_P^-(x) for each item belonging to the set identified in Step-1. In our case:

For X1: D_P^-(X1) = {X1, X2, X4, X6}
For X2: D_P^-(X2) = {X2}
For X5: D_P^-(X5) = {X2, X5}
For X6: D_P^-(X6) = {X1, X2, X4, X6}

It should be noted that this step significantly degrades performance, as for each item we need to find the items dominating it, which requires a complete traversal of the dataset for each item. So, as we have four items in Cl_t^≥, we will have to perform four traversals of the dataset.

Step-3: Finally, in Step-3 we determine the items belonging to the P-upper approximation. This requires identifying the items in the subsets identified in Step-2 that have a non-empty intersection with the set identified in Step-1. In our case all of D_P^-(X1), D_P^-(X2), D_P^-(X5) and D_P^-(X6) have a non-empty intersection, so P̄(Cl_t^≥) for t = 2 will be:

P̄(Cl_t^≥) = {X1, X2, X4, X5, X6}

The same three steps are performed for P̄(Cl_t^≤). All of the DRSA based algorithms use this approach, which affects their performance; consequently, these algorithms cannot be used for datasets beyond a small size without significant degradation in performance.

3. Challenges in calculating approximations

Calculating the lower and upper approximations using the above three steps of the conventional method is computationally so expensive that it affects the performance of any algorithm using these measures. The first step requires the calculation of the Cl_t^≥ or Cl_t^≤ structure, depending on which approximation we need to calculate. For example, Fig. 1 below shows the pseudocode to calculate Cl_t^≥ in the case of P(Cl_t^≥).

Fig. 1. Pseudo code of Step-1 for calculating lower approximation.

In the provided pseudocode, Cl_i represents the decision class of the ith object in the dataset, while Cl_t represents the decision class for which we need to calculate P(Cl_t^≥). Here we need to traverse the complete dataset to calculate P(Cl_t^≥).

In the second step, we calculate D_P^+(x), which comprises all the objects greater than or equal to each object in Cl_t^≥. This means that if Cl_t^≥ comprises five objects, we need to traverse the dataset five times to calculate D_P^+(x) for each object. Larger datasets mean more objects in Cl_t^≥ and thus more dataset traversals, which significantly affects the performance of the algorithm. The pseudo code for this step is given in Fig. 2 below, where X_j represents the jth object in the dataset and X_i^t represents the ith object in Cl_t^≥.

Fig. 2. Pseudo code of Step-2 for calculating lower approximation.

Finally, we calculate P(Cl_t^≥), which comprises the objects (identified in the second step) whose dominating sets are subsets of the set identified in the first step. Fig. 3 below shows the pseudo code of this step. Here D_P^+(X_ji) represents the jth object in the set comprising all objects greater than X_i in Cl_t^≥, and Cl_kt^≥ represents the kth object in Cl_t^≥.
Fig. 3. Pseudo code of Step-3 for calculating lower approximation.
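Analogously, the conventional upper approximation of Cl_t^≥ keeps every object whose dominated set D_P^-(x) has a non-empty intersection with the union, following the set-theoretic definition in Section 2.4. A minimal sketch, again reusing the helpers from the earlier snippets and not taken from the authors' implementation:

```python
def upper_approx_geq(t, P):
    """Conventional P-upper approximation of Cl_t^>=."""
    # Step 1: items belonging to the upward union of classes Cl_t^>=.
    cl_geq = upward_union(t)
    # Steps 2-3: an item may belong to Cl_t^>= whenever the set of items it
    # dominates, D_P^-(x), overlaps the union (one dataset pass per item).
    return {x for x in U if D_minus(x, P) & cl_geq}

# upper_approx_geq(2, ['Physics', 'Chemistry']) -> {'X1', 'X2', 'X4', 'X5', 'X6'}
```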
It is clear that the conventional approach poses serious performance challenges for the algorithms using these measures when it comes to larger datasets. We therefore need a more efficient method to calculate both the lower and the upper approximations.

4. Related work

Right from its inception, RST has been a prominent tool for data mining and knowledge discovery, and various approaches have been proposed in the literature for different tasks. First, we present a few state-of-the-art approaches using RST for different purposes.

In [13], the authors discuss the minimum weight vertex cover problem based on test-cost-sensitive rough sets. They convert the Minimum Weighted Vertex Cover Problem (MWVCP) to the Minimal Test Cost Attribute Reduction problem (MTRP) in test-cost-sensitive rough sets and then propose an improved heuristic algorithm for MWVCP based on test-cost-sensitive rough sets. The improved heuristic algorithm, with a new significance measure, avoids a large number of repeated calculations.

In [14], the authors propose an integrated prediction method based on Rough Sets (RS) and a Wavelet Neural Network (WNN) to improve the prediction of stock price trends. RST is first used to reduce the feature dimensions of the stock price trend; on this basis, RS is used again to determine the structure of the WNN and to obtain the prediction model of the stock price trend. Finally, the model is applied to the prediction of stock price trends. Their simulation results indicate that, through RS attribute reduction, the structure of the WNN prediction model can be simplified significantly while improving model performance.

In [15], several proposals for decision making are provided based on hybrid soft sets, including fuzzy soft sets and rough soft sets. For fuzzy soft sets, a computational tool called the D-score table is introduced to improve the decision process of a classical approach, and a novel adjustable approach based on decision rules is introduced. Regarding rough soft sets, several new decision algorithms meeting different decision makers' requirements are introduced, together with a multi-criteria group decision making approach. Finally, several practical examples are developed to show the validity of these proposals.

In [16], the authors propose a group incremental feature selection algorithm using a rough set theory based genetic algorithm for selecting an optimized and relevant feature subset, called a reduct. The method can be applied on a regular basis in a dynamic environment after a small to moderate volume of data has been added to the system, so the computational time, the major issue of the genetic algorithm, does not affect the proposed method. Experimental results on benchmark datasets demonstrate that the proposed method provides satisfactory results in terms of the number of selected features, computation time and the classification accuracies of various classifiers.

In [17], the authors combine rough sets and a fuzzy Bandelet neural network to construct a prediction model for the service life of a large centrifugal compressor impeller.
An attribute reduction algorithm based on rough sets and a clustering method is first designed to optimize the input variables of the fuzzy Bandelet neural network, and then a prediction model based on the fuzzy Bandelet neural network is proposed. Results show that the fuzzy Bandelet neural network optimized by an improved genetic algorithm has the highest prediction precision and efficiency and can correctly predict the service life of a remanufactured impeller.

However, RST does not take the preference ordering into account. DRSA is an extension of RST for multi-criteria decision analysis, based on the persistence of the preference order while extracting knowledge from datasets. All of the following approaches use the conventional approximation calculation methods, which degrades their performance. Here we present a few representative approaches that use the approximations for different tasks.

In [18], the authors use an improved DRSA for the classification of medical data. Whereas DRSA is used for ordinal attributes, the proposed technique is used for nominal ones. It builds a decision table to determine the dominance relation, and the improved DRSA is applied to determine the lower and upper approximations over the entire dataset. Finally, an attribute reduction technique is applied to find a reduced set of attributes for classification. The proposed technique comprises five steps. In the first step, the decision table is constructed to apply DRSA; based on this decision table, the lower and upper approximations are calculated using the conventional methods. In the third step the boundary values and dependency are calculated. Fourthly, the reduct and core are found in order to apply feature selection, and finally there is a rule generation step for classification. The proposed approach uses the conventional lower and upper approximations, which affects its performance.

In [19], the authors present a classification technique based on DRSA for spare parts management. The proposed approach uses the conventional lower and upper approximation measures. It is a three-step framework: in the first step, "if...then" rules are generated from historic data using DRSA; in the second step the proposed rules are validated both manually and automatically using cross validation techniques; finally, in the third step an unseen set of spare parts is classified in a real setting. The approach has been tested on real world data and its accuracy was found to be 96%. However, it uses the conventional approach for calculating the lower and upper approximations, which, we believe, can be significantly improved by applying the proposed incremental approximation calculation technique.

In [20], the authors use DRSA to predict the number of students likely to drop out from a Massive Open Online Course (MOOC) in the coming week using the historic data of the previous week. The approach defines two classes of students: Cl1, the "At-risk Learners", and Cl2, the "Active Learners". It is a two-step method: the first step infers a preference model, while the second step classifies the students into the above-mentioned classes. The first step itself consists of three sub-steps: identifying the learning examples of learners, constructing a coherent criteria family for characterizing learner profiles, and inferring a preference model resulting in a set of decision rules. The approach uses the conventional DRSA-based lower and upper approximation definitions and may thus suffer a significant decrease in performance.

In [21], the authors propose a DRSA-based approach for predicting customer behavior in airline companies, which can help managers attract new customers and retain high-valued customers. A set of rules is derived from a large sample of international airline customers, and its predictive ability is evaluated. Results have shown the effectiveness of the approach.
Table 2. Summary of the related work.

Algorithm | Technique used | Advantages | Disadvantages
Improved dominance rough set-based classification system [18] | DRSA based classification | Improved results | Conventional lower and upper approximation techniques affect the performance
Spare parts classification in industrial manufacturing using the dominance-based rough set approach [19] | DRSA based approach for spare parts classification | High accuracy | Conventional approximation based approaches are used
Weekly predicting the At-Risk MOOC learners using Dominance-Based rough set approach [20] | DRSA based approach for classifying students | Prediction of "At-risk" students | Conventional approach used
A dominance-based rough set approach to customer behavior in the airline market [21] | DRSA based prediction approach | Prediction and retention of high-valued customers | Conventional approach affects performance
A unified approach to reducts in dominance-based rough set approach [22] | DRSA based approach for finding reducts | New types of reducts are introduced | Conventional approximation approach affects performance of the algorithm
Variable Consistency Dominance-based Rough Set Approach to formulate airline service strategies [23] | Variable Consistency Dominance-based Rough Set Approach (VC-DRSA) to formulate airline service strategies | Use of flow graphs to visualize rules makes them more reasonable and understandable than traditional methods | Conventional approximation based approaches are used
The approach uses the conventional DRSA-based lower and upper approximations, which results in the inherent performance drawbacks discussed above.

In [22], the authors propose a new approach for finding reducts in DRSA. They investigate attribute reduction in DRSA and introduce class-based reducts and their relations with previous reducts. Class-based reducts are of three kinds: the first kind, called L-reducts, preserves the lower approximations of the decision classes; the second kind, called U-reducts, preserves the upper approximations of the decision classes; and the third kind, called B-reducts, preserves the boundary regions of the decision classes. They also show that all kinds of reducts can be enumerated comprehensively based on two discernibility matrices associated with generalized decisions.

In [23], the authors use the Variable Consistency Dominance-based Rough Set Approach (VC-DRSA) to formulate airline service strategies by generating airline service decision rules that model passenger preferences for airline service quality. Flow graphs are applied to infer decision rules and variables. This combined method considers decision-maker inconsistency, and the use of flow graphs to visualize rules makes them more reasonable and understandable than traditional methods. Although the results were impressive, the use of the conventional definitions may still be a bottleneck for performance.

Table 2 summarizes the approaches discussed above.

5. The proposed approach

Using the conventional approach for calculating the lower and upper approximations results in a significant decrease in the performance of the algorithms using these measures. The reason is that calculating the lower and upper approximations requires performing the three computationally expensive steps discussed in Section 2; in particular, calculating D_P^+(x) or D_P^-(x) and then calculating the approximations in the third step substantially affect performance, because the dataset must be traversed for each item present in the upward or downward union of classes. This poses significant bottlenecks for datasets of larger sizes. The overall time complexity of the conventional approach is O(|Cl|^2 + |N||C||Cl|), where |Cl| represents the cardinality of Cl_t^≥ or Cl_t^≤, |C| represents the total number of conditional attributes and |N| represents the total universe size. It is clear that for large datasets the conventional approach suffers serious performance issues due to the large number of objects belonging to Cl_t^≥ or Cl_t^≤. This will ultimately affect the
performance of the algorithms using these measures, so we need an alternative approach to enhance the performance level. To overcome this issue, we propose Parallel Incremental Approximation Calculation (PIAC for short), which calculates the approximations in parallel. PIAC operates in two steps. In the first step, we calculate Cl_t^≥ or Cl_t^≤, while the approximations are calculated in the second step. The second step is performed in parallel, which is what actually reduces the execution time. This is accomplished using parallel threads, where each thread belongs to one decision class in Cl_t^≥ or Cl_t^≤. Following is the description of each of the approximations using PIAC.
5.1. PIAC for lower approximation

In order to calculate the lower approximation of Cl_t^≥, we initially calculate all the items belonging to Cl_t^≥. Note that this step is the same as the first step of the conventional approach. Initially, all of these items are considered to belong to P(Cl_t^≥). Then parallel threads are started, where one thread belongs to one dominance class in Cl_t^≥. Each thread picks an object s ∉ Cl_t^≥ and checks its dominance against each k ∈ P(Cl_ti^≥), where P(Cl_ti^≥) represents the set of objects that belong to a particular dominance class Cl_ti^≥. If s dominates k, then the object k is removed from P(Cl_t^≥). This process is carried out by each thread and P(Cl_t^≥) is updated globally. Once the last thread completes, P(Cl_t^≥) is returned. Fig. 4 below shows the algorithm of the proposed method.

Fig. 4. Calculating lower approximation P(Cl_t^≥) using PIAC.

Similarly, the lower approximation of Cl_t^≤, i.e. P(Cl_t^≤), is calculated. Fig. 5 shows the pseudo code of the algorithm.
Fig. 5. Calculating lower approximation P(Cl_t^≤) using PIAC.
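The following is a hedged Python sketch of the PIAC idea for the lower approximation P(Cl_t^≥), written from the textual description of Fig. 4 above; it is not the authors' implementation. The grouping of one worker per dominance class and the use of ThreadPoolExecutor are assumptions of this illustration, and the helpers dominates() and upward_union() are the ones sketched in Section 2.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def piac_lower_geq(t, P):
    cl_geq = upward_union(t)                      # step 1: Cl_t^>=
    outside = [x for x in U if x not in cl_geq]   # objects not in the union
    approx = set(cl_geq)                          # initially every item is kept
    lock = threading.Lock()

    # group the items of the union by their decision class (one worker per class)
    by_class = {}
    for x in cl_geq:
        by_class.setdefault(U[x][1], set()).add(x)

    def worker(members):
        # drop every member dominated by some object outside the union:
        # such a member cannot belong to Cl_t^>= with certainty
        for s in outside:
            for k in list(members):
                if dominates(s, k, P):
                    with lock:
                        approx.discard(k)

    with ThreadPoolExecutor(max_workers=max(1, len(by_class))) as pool:
        list(pool.map(worker, by_class.values()))
    return approx

# piac_lower_geq(2, ['Physics', 'Chemistry']) -> {'X2', 'X5'}
```

Note that in CPython the global interpreter lock prevents a true parallel speedup with threads; the sketch only illustrates the structure of the algorithm, not a tuned parallel implementation.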
Table 3. Summary of datasets.

Dataset | Instances | Attributes | Dataset/attribute characteristics
Chess | 3196 | 37 | Multivariate/Integer
DeliciousMIL | 12234 | 8519 | Integer
p53 Mutants | 16772 | 5409 | Real
Gas Sensor Array | 18000 | 1950000 | Real
Musk2 | 6598 | 168 | Multivariate/Integer
URL Reputation | 2396130 | 3231961 | Integer, Real
Gisette | 135000 | 5000 | Integer
Isolet | 7797 | 617 | Real
Internet advertisement | 3279 | 1558 | Categorical, Integer, Real
EEg-Eye-state | 14980 | 15 | Integer, Real
5.2. PIAC for upper approximation

Our proposed approach incrementally calculates the upper approximation in parallel, which (as already stated) allows us to avoid multiple traversals of the dataset. We first find all the items belonging to the upward or downward union of classes. Then parallel threads calculate the upper approximation for each dominance class belonging to Cl_t^≥ or Cl_t^≤; one thread belongs to one dominance class. Multiple threads work at the same time and update the upper approximation set. When the last thread completes, the upper approximation set contains the final approximation. Fig. 6 shows the pseudo code of the algorithm. Here P̄(Cl_ti^≥) represents the set of objects belonging to one dominance class in Cl_t^≥. The upper approximation P̄(Cl_t^≤) is calculated on the same basis; Fig. 7 below shows the pseudo code of the algorithm.

Fig. 6. Calculating upper approximation P̄(Cl_t^≥) using PIAC.

Fig. 7. Calculating upper approximation P̄(Cl_t^≤) using PIAC.
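A corresponding hedged sketch of the PIAC idea for the upper approximation P̄(Cl_t^≥), written from the textual description of Fig. 6 and not taken from the authors' code: every item of the union is kept, and each worker additionally adds the outside objects that dominate at least one member of its dominance class. It reuses the imports and helpers of the previous sketches.

```python
def piac_upper_geq(t, P):
    cl_geq = upward_union(t)
    outside = [x for x in U if x not in cl_geq]
    approx = set(cl_geq)            # the union itself is always included
    lock = threading.Lock()

    # one worker per dominance class inside the union
    by_class = {}
    for x in cl_geq:
        by_class.setdefault(U[x][1], set()).add(x)

    def worker(members):
        # an outside object possibly belongs to Cl_t^>= if it dominates
        # at least one member of this dominance class
        for s in outside:
            if any(dominates(s, k, P) for k in members):
                with lock:
                    approx.add(s)

    with ThreadPoolExecutor(max_workers=max(1, len(by_class))) as pool:
        list(pool.map(worker, by_class.values()))
    return approx

# piac_upper_geq(2, ['Physics', 'Chemistry']) -> {'X1', 'X2', 'X4', 'X5', 'X6'}
```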
5.3. Case study

As a case study, we calculate the upper and lower approximations for Table 1 for t = 2 using PIAC. For the lower approximation of Cl_t^≥, as discussed earlier, the first step is to calculate Cl_t^≥; in our dataset, for t = 2:

Cl_t^≥ = {X1, X2, X5, X6}

This becomes our initial P(Cl_t^≥). Now we find the set of objects s ∉ Cl_t^≥; in our case this set contains the items {X3, X4, X7}. Note that Cl_t^≥ contains two decision classes, which means we start two threads, one for each. Each thread takes one object at a time from this set and checks its dominance against its corresponding P(Cl_ti^≥). Thread T1 checks the dominance of these objects against the objects {X1, X5, X6}, while thread T2 checks their dominance against the object set {X2}. T1 removes X1 and X6, while T2 removes no object. So, the final set P(Cl_t^≥) becomes:

P(Cl_t^≥) = {X2, X5}

Clearly, we have avoided the computationally expensive steps of the conventional approach; furthermore, the parallel approach contributes significantly to the performance enhancement, and the results produced are the same, which shows that the approach can successfully be used as an alternative to the conventional method.

Now we calculate P̄(Cl_t^≥) for t = 2 by considering the dataset given in Table 1. As discussed, we first calculate Cl_t^≥, which in our case is:

Cl_t^≥ = {X1, X2, X5, X6}

As the objects here belong to two dominance classes, again two threads are started, one belonging to each class. One thread updates the upper approximation for t = 2 while the other updates it for t = 3. The thread belonging to t = 2 adds X4, while the thread belonging to t = 3 adds nothing. So the final P̄(Cl_t^≥) becomes:

P̄(Cl_t^≥) = {X1, X2, X4, X5, X6}

6. Results and analysis

To justify the proposed approach, it was compared with the conventional one using ten publicly available datasets from the UCI repository [24]. Table 3 shows the description of the datasets used.

6.1. Comparison measures

Three measures were used as criteria to compare both approaches. Following is the description of these measures.

6.1.1. Percentage decrease in execution time
The percentage decrease in execution time specifies the total decrease in execution time achieved by one algorithm as compared to another one given the same input. It was calculated by the following formula, where E2 is the execution time of the first algorithm and E1 that of the second:

Percentage decrease in execution time = 100 - (E1 / E2) * 100

For example, if there are two algorithms, Algorithm-A and Algorithm-B, and Algorithm-A executes a task in ten seconds while Algorithm-B executes the same task in five seconds, then the
percentage decrease in execution time achieved by Algorithm-B will be:

Percentage decrease in execution time = 100 - (E1 / E2) * 100 = 100 - (5 / 10) * 100 = 50%

Table 4. Comparison of conventional and PIAC based approach for calculating P(Cl_t^≥).

Dataset | Conventional P(Cl_t^≥) time (s) | PIAC based P(Cl_t^≥) time (s) | T | % dec. in time
Chess | 15.04 | 4.13 | 1 | 72.54%
Gas Sensor Array | 15.6 | 3.1 | 1 | 80.13%
Musk 2 | 12.3 | 3.2 | 1 | 73.98%
URL Reputation | 75.6 | 19.6 | 1 | 74.07%
Gisette | 20.4 | 7.1 | 1 | 65.2%
Isolet | 11.29 | 4.5 | 1 | 60.14%
Internet Advertisement | 14.67 | 3.9 | 1 | 73.14%
EEg-Eye-state | 10.9 | 3.1 | 1 | 71.56%
DeliciousMIL | 18.38 | 5.3 | 1 | 71.16%
p53 Mutants | 19.09 | 5.2 | 1 | 72.76%
So, the percentage decrease in execution time achieved by Algorithm-B is 50%. The system stopwatch was used to measure the execution time of each algorithm: the stopwatch was started after giving the input and stopped once the output was produced, and the execution time was taken as the difference between the start and completion times.

6.1.2. Accuracy
Accuracy specifies the correctness of the results produced. The accuracy of an algorithm is absolute if, compared to another algorithm, it produces the same output as that algorithm for the same input. For accuracy, the outputs of the conventional approximation calculation technique and the PIAC based approach were compared on the same datasets.

6.1.3. Memory
Memory represents the runtime memory required by an algorithm. For this purpose, only the major data structures were considered, both in the conventional and in the proposed approach; the individual variables and the data structures common to both approaches were ignored.

In the conventional approach we need two data structures. The first is a one-dimensional array to store the set Cl_t^≥ or Cl_t^≤. The second is a two-dimensional array to store the sets D_P^+(x) or D_P^-(x) for each object in Cl_t^≥ or Cl_t^≤; each row of the two-dimensional array stores D_P^+(x) or D_P^-(x) for one object in Cl_t^≥ or Cl_t^≤. The total memory size is calculated by the formula:

Memory = (size of datatype) * (total number of records) + (size of datatype) * (total number of records / 2)^2

In our proposed approach, we need two one-dimensional arrays of size equal to the total number of records in the dataset. The first array stores the set Cl_t^≥ or Cl_t^≤, while the second array stores the status of each object, i.e. whether it has already been considered in the Cl_t^≥ or Cl_t^≤ set or not. This helps us avoid comparing the same object again and again. The memory size is calculated by the formula:

Memory = 2 * (size of datatype) * (total number of records)

6.1.4. Runtime context
The runtime context specifies the runtime environment of the system on which the algorithms were executed. During the experiments, special attention was given to ensuring that the runtime context remained the same throughout.

6.2. Parameter settings
The same value of T was used while testing the lower and upper approximations for a given dataset to ensure an unbiased analysis. While calculating the execution time, the time to load the dataset into the local data structure was ignored in both cases, as this time was constant and equal; furthermore, the memory used to store the dataset was also ignored, as the same data structure was used to store the dataset in both cases.
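Before moving to the results, the memory formulas of Section 6.1.3 can be illustrated with a small Python sketch. This is only a rough estimate under the stated assumptions: the 4-byte element size is an assumption of this illustration, not a figure given in the paper.

```python
# Rough illustration of the memory formulas of Section 6.1.3 (estimate only;
# the 4-byte element size is an assumption, not a value from the paper).
def memory_conventional(n_records, elem_size=4):
    # one linear array for Cl_t^>= (or Cl_t^<=) plus a two-dimensional array
    # holding D_P^+(x) / D_P^-(x) for each object of the union
    return elem_size * n_records + elem_size * (n_records / 2) ** 2

def memory_piac(n_records, elem_size=4):
    # two linear arrays: the union itself and a per-object "already considered" flag
    return 2 * elem_size * n_records

for n in (10_000, 100_000):
    print(n, memory_conventional(n) / 2**20, memory_piac(n) / 2**20)  # MiB
```

The quadratic term in the conventional formula is what drives the large memory gap reported later in Table 13.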
Table 5. Comparison of conventional and PIAC based approach for calculating P(Cl_t^≤).

Dataset | Conventional P(Cl_t^≤) time (s) | PIAC based P(Cl_t^≤) time (s) | T | % dec. in time
Chess | 11.12 | 3.39 | 1 | 69.51%
Gas Sensor Array | 18.4 | 4.7 | 1 | 74.46%
Musk 2 | 16.21 | 5.1 | 1 | 68.54%
URL Reputation | 91.2 | 29.2 | 1 | 67.98%
Gisette | 25.4 | 6.9 | 1 | 72.83%
Isolet | 16.09 | 4.1 | 1 | 74.52%
Internet Advertisement | 14.67 | 3.8 | 1 | 74.1%
EEg-Eye-state | 15.3 | 4.6 | 1 | 69.93%
DeliciousMIL | 21.89 | 5.8 | 1 | 73.5%
p53 Mutants | 24.5 | 8.6 | 1 | 64.9%
Table 6. Comparison of conventional and PIAC based approach for calculating the upper approximation P̄(Cl_t^≥).

Dataset | Conventional P̄(Cl_t^≥) time (s) | PIAC based P̄(Cl_t^≥) time (s) | T | % dec. in time
Chess | 13.84 | 4.31 | 1 | 68.57%
Gas Sensor Array | 12.45 | 3.6 | 1 | 71.08%
Musk 2 | 15.89 | 3.7 | 1 | 76.71%
URL Reputation | 93.2 | 30.1 | 1 | 67.7%
Gisette | 45.6 | 12.8 | 1 | 72.8%
Isolet | 17.5 | 4.7 | 1 | 73.14%
Internet Advertisement | 20.67 | 6.1 | 1 | 70.49%
EEg-Eye-state | 15.3 | 4.9 | 1 | 67.97%
DeliciousMIL | 23.23 | 7.3 | 1 | 68.58%
p53 Mutants | 25.14 | 7.9 | 1 | 68.58%
6.3. Discussion

Tables 4-7 show the results of the experiments using PIAC. The first column specifies the dataset used; the second and third columns give the execution times of the conventional and of the proposed PIAC based approach, respectively; the fourth column specifies the value of T; and the last column specifies the percentage decrease in execution time achieved by the proposed PIAC based approach. The results show that PIAC is more effective and efficient than the conventional approach.

For the lower approximations, an average decrease of 70.8% was found for P(Cl_t^≥) and 71.2% for P(Cl_t^≤); for the upper approximations, the average decrease was 70.2% for P̄(Cl_t^≤) and 71.09% for P̄(Cl_t^≥). The reason behind this decrease is that, instead of calculating D_P^+(x) or D_P^-(x), which substantially affects performance, the proposed PIAC based approach directly calculates the approximations by checking the preference order of the objects. The parallel nature of the proposed approach contributes a further reduction in execution time.

To make a detailed comparison, we divide the proposed algorithm into three steps, just like the conventional approach.
Table 7. Comparison of conventional and PIAC based approach for calculating the upper approximation P̄(Cl_t^≤).

Dataset | Conventional P̄(Cl_t^≤) time (s) | PIAC based P̄(Cl_t^≤) time (s) | T | % dec. in time
Chess | 13.84 | 3.52 | 1 | 74.57%
Gas Sensor Array | 14.5 | 4.9 | 1 | 66.21%
Musk 2 | 20.45 | 4.1 | 1 | 79.95%
URL Reputation | 101.89 | 30.2 | 1 | 70.36%
Gisette | 50.65 | 16.7 | 1 | 67.03%
Isolet | 19.2 | 5.2 | 1 | 72.92%
Internet Advertisement | 23.78 | 7.9 | 1 | 66.78%
EEg-Eye-state | 17.5 | 4.1 | 1 | 76.57%
DeliciousMIL | 29.13 | 9.2 | 1 | 68.42%
p53 Mutants | 32.3 | 9.5 | 1 | 70.59%

We now make a step-by-step comparison for the chess dataset while calculating P(Cl_t^≥) for t = 1. Table 8 below shows Step-1 of both approaches.

Table 8. Comparison of conventional and proposed approaches for Step-1.

Conventional approach:
(1) For i = 1 to |U|
(2)   If Cl_i ≥ Cl_t
(3)     Cl_t^≥ = Cl_t^≥ ∪ X_i
(4)   End-if
(5) End-For

Proposed approach:
(1) For i = 1 to |U|
(2)   If Cl_i ≥ Cl_t
(3)     K_t = K_t ∪ X_i
(4)   Else
(5)     s = s ∪ X_i
(6)   End-if
(7) End-For

As can be seen, Step-1 is almost the same in both cases, as both approaches calculate Cl_t^≥ or Cl_t^≤, and the execution time of this step was almost equal in both cases. Note, however, that instead of constructing one complete set for Cl_t^≥ or Cl_t^≤, as in the conventional approach, this step of the proposed approach also constructs the set s of objects that do not belong to Cl_t^≥ or Cl_t^≤. Furthermore, it constructs subsets, each belonging to one class Cl_t in Cl_t^≥ or Cl_t^≤; these subsets become the input of each thread in Step-2 of the proposed approach. Table 9 below compares both approaches for Step-2.

Table 9. Comparison of conventional and proposed approaches for Step-2.

Conventional approach:
(1) For i = 1 to |Cl_t^≥|
(2)   For j = 1 to |U|
(3)     If X_j ≥ X_i^t
(4)       D_P^+(X_i) = D_P^+(X_i) ∪ X_j
(5)     End-if
(6)   End-For
(7) End-For

Proposed approach:
(1) For i = 1 to |s|
(2)   For j = 1 to |K_t|
(3)     If (s_i ≥ K_tj)
(4)       K_t = K_t - K_tj
(5)     End-if
(6)   End-For
(7) End-For

In this step the conventional approach calculates D_P^+(x) or D_P^-(x), depending on whether we are calculating P(Cl_t^≥) or P(Cl_t^≤). One D_P^+(x) or D_P^-(x) set is calculated for each object in Cl_t^≥ or Cl_t^≤; so, if Cl_t^≥ or Cl_t^≤ comprises fifty objects, fifty subsets are generated. This is the step that produces the substantial decrease in execution time for the proposed approach. In the conventional approach, the complete dataset is traversed for each element of Cl_t^≥ or Cl_t^≤. In the proposed approach, on the other hand, we do not need to traverse the complete dataset again and again; instead, the set s is compared with the set K_t and values are removed from K_t on the basis of the condition in statement (3). It should be noted that the set s comprises the objects that do not belong to Cl_t^≥ or Cl_t^≤, whereas K_t comprises the objects that do belong to Cl_t^≥ or Cl_t^≤, so |s| and |K_t| are much smaller than |U|. This results in a substantial decrease in execution time. The execution time is further reduced by the parallel calculation of this step by threads: one thread is started for each decision class Cl_t in Cl_t^≥ or Cl_t^≤. So, if Cl_t^≥ or Cl_t^≤ comprises two decision classes, then two threads are started, and each thread calculates its K_t as discussed above. Note that K_t here comprises the objects that belong to a particular class Cl_t in Cl_t^≥ or Cl_t^≤. For the chess dataset, the execution time of the proposed approach was almost 70% less than the time taken by the conventional approach for this step.

Finally, we compare Step-3. Table 10 below shows the pseudo code of both approaches for this step.

Table 10. Comparison of conventional and proposed approaches for Step-3.

Conventional approach:
(1) For i = 1 to |Cl_t^≥|
(2)   For j = 1 to |D_P^+(X_i)|
(3)     For k = 1 to |Cl_t^≥|
(4)       pl = Calculate(D_P^+(X_ji) ⊆ Cl_kt^≥)
(5)     End-For
(6)   End-For
(7) End-For

Proposed approach:
(1) For i = 1 to |t|
(2)   Pl = CalculateUnion(K_tj)
(3) End-For

This step makes a substantial difference in execution time. In the conventional approach this step checks (D_P^+(x) ⊆ Cl_t^≥) or (D_P^-(x) ⊆ Cl_t^≤), depending on whether we are calculating P(Cl_t^≥) or P(Cl_t^≤); to do so, Cl_t^≥ or Cl_t^≤ must be traversed again and again for each object in each set D_P^+(x) or D_P^-(x) in order to determine whether a certain D_P^+(X_i) or D_P^-(X_i) is a subset of Cl_t^≥ or Cl_t^≤. In the proposed approach this step only computes the union of the sets K_t; so, if there were two threads, this step would compute the union of two sets, which simply means serially copying the elements of these subsets into a third set. For the chess dataset, the execution time of the proposed approach was almost 94% less than the time taken by the conventional approach for this step.

The procedure for calculating the upper approximation is the same, with some minor changes in Step-3. We can also conclude that the greater the number of classes in Cl_t^≥ or Cl_t^≤, the more efficient the proposed approach is: a larger number of dominance classes in Cl_t^≥ or Cl_t^≤ results in a larger number of threads and fewer objects in each dominance class, which enhances the performance of the proposed approach. This also justifies the significance of the proposed approach in terms of scalability: as large datasets are likely to contain a larger number of dominance classes in Cl_t^≥ or Cl_t^≤, the algorithm becomes more efficient in terms of execution time due to its parallel nature. This decrease in execution time consequently enhances the performance of the underlying algorithms using these approximations and enables them to be used for datasets beyond a small size. Table 11 below summarizes the decrease in execution time achieved by the PIAC based approach over the ten datasets.

Table 11. Summary of percentage decrease in execution time.

Approximation | % decrease in execution time
P(Cl_t^≥) | 71.47%
P(Cl_t^≤) | 71.03%
P̄(Cl_t^≥) | 70.56%
P̄(Cl_t^≤) | 71.34%

As far as accuracy is concerned, the results show that the proposed PIAC based approach produces the same results as the conventional approach. For example, for the "Chess" dataset, the cardinalities of the approximations observed (and manually verified for both approaches) are shown in Table 12 below. The reason is that the PIAC based approach correctly identifies the objects that make up a given approximation. This means that the proposed approach can accurately and effectively replace the conventional method and can enhance the performance of the algorithms using these measures without affecting their accuracy.
Table 12. Cardinality of approximations in conventional and proposed approaches.

Approximation | Cardinality (conventional approach) | Cardinality (proposed approach)
P(Cl_t^≥) | 813 | 813
P(Cl_t^≤) | 654 | 654
P̄(Cl_t^≥) | 2542 | 2542
P̄(Cl_t^≤) | 2383 | 2383
Table 13. Comparison of memory taken by the proposed approach and the conventional one.

Dataset | Memory taken by conventional approach (Mb) | Memory taken by proposed approach (Mb) | % decrease in memory
YouTube Multiview Video Games Dataset | 13733.4 | 0.915527 | 99%
DeliciousMIL | 142.784 | 0.093338 | 99%
p53 Mutants | 268.333 | 0.12796 | 99%
Gas Sensor Array | 309.059 | 0.137329 | 99%
Musk2 | 41.542 | 0.050339 | 99%
Isolet | 58.0067 | 0.059486 | 99%
Phishing | 116.594 | 0.084343 | 99%
EEg-Eye-state | 214.062 | 0.114288 | 99%
URL Reputation | 5475472 | 18.28102 | 99%
Gisette | 17381.2 | 1.029968 | 99%
Internet Advertisement | 10.2663 | 0.025017 | 99%
Table 14. Percentage increase in execution time with increase in instances for P(Cl_t^≤).

Instances | Execution time (conventional approach) | % increase after instances are doubled | Execution time (proposed approach) | % increase after instances are doubled
3000 | 1.50 | - | 1.01 | -
6000 | 1.95 | 30% | 1.21 | 20%
9000 | 3.27 | 50% | 1.70 | 41%
12000 | 8.02 | 145% | 3.12 | 83%
15000 | 31.78 | 296% | 8.16 | 161%

Table 15. Percentage increase in execution time with increase in instances for P̄(Cl_t^≤).

Instances | Execution time (conventional approach) | % increase after instances are doubled | Execution time (proposed approach) | % increase after instances are doubled
3000 | 1.10 | - | 0.90 | -
6000 | 1.44 | 30% | 1.08 | 20%
9000 | 2.42 | 50% | 1.52 | 41%
12000 | 5.93 | 145% | 2.78 | 83%
15000 | 23.48 | 296% | 7.27 | 161%
In terms of memory usage, our approach is significantly more efficient than the conventional approach: over all the datasets, the decrease in memory usage achieved was 99%, as shown in Table 13. The reason is that in the proposed approach we only need a linear array as the major data structure, storing Cl_t^≥ or Cl_t^≤ depending on the approximation being calculated. It should be noted that the individual variables and the data structures common to both approaches, e.g. the array storing the original dataset, were ignored.

6.3.1. Complexity analysis
The empirical evaluation shows that the proposed approach reduces the execution time by almost 70%, a significant contribution that ultimately leads to higher performance of the algorithms using these measures. The time complexity of the conventional approach is O(|Cl|^2 + |N||C||Cl|), whereas the time complexity of the proposed approach is O(|Cl||N||C|), where |Cl| represents the cardinality of Cl_t^≥ or Cl_t^≤, |C| represents the total number of conditional attributes and |N| represents the universe size, i.e. the total number of objects in the dataset. It can clearly be seen that the conventional approach is quadratic in |Cl| whereas the proposed approach has a linear complexity level, which justifies the scalability of the proposed approach with increasing data size. For large datasets it is obvious that |Cl| ∝ |N|, i.e. the total number of objects in Cl_t^≥ or Cl_t^≤ increases with the size of the dataset. So, large datasets significantly affect the scalability of the conventional approach, whereas the proposed approach shows a linear growth rate with increasing data size.

6.3.2. Scalability of the proposed algorithm
To further justify the scalability, an experiment was performed with the "p53 Mutants" dataset. Starting with 3000 data items, the execution time was calculated using both the conventional and the proposed approaches for P(Cl_t^≤) and P̄(Cl_t^≤). Tables 14 and 15 show the results. In these tables, the first column shows the number of instances, the second and third columns show the execution time and the percentage increase in execution time of the conventional approach as the number of instances grows, and the fourth and fifth columns show the same for the proposed approach. It can clearly be seen that the growth rate of the proposed approach is almost linear, in contrast with the conventional approach, where it is almost quadratic.

7. Future work and conclusion
We have proposed a new parallel incremental technique, called Parallel Incremental Approximation Calculation (PIAC), for calculating the lower and upper approximations in the dominance-based rough set approach. The conventional approach for calculating these approximations comprises three computationally expensive steps, which not only affect the performance of the algorithms using these measures but also make them inappropriate for datasets beyond a modest size. The proposed approach incrementally calculates the lower and upper approximations using parallel threads. It significantly enhances performance while producing the same results, which means that it can successfully replace the conventional method. Results have shown that the proposed approach significantly outperforms the conventional one. We compared our method with the conventional approach using ten widely used datasets. Whilst achieving the same accuracy levels as the conventional approach, our approach significantly reduces the average computation time, i.e. 71% for the lower approximation and 70% for the upper approximation. Over all datasets, the decrease in memory usage achieved was 99%.

The practical implication of this work is that, by using PIAC, the dominance-based rough set approach may be applied to datasets regardless of their size. However, it should be noted that, although the proposed approach has linear time complexity, compared to the almost quadratic complexity of the conventional approach, it may suffer some performance issues when the total number of attributes becomes very large, approaching the total number of objects in the dataset.

The proposed approach is appropriate for supervised datasets; however, we may come across situations where class labels are not available, so a future direction is to study how to use the proposed approach for unsupervised datasets as well. Efforts will also be made to use it for incomplete datasets having missing values; in its current state the proposed approach requires complete datasets, and datasets with missing values should be preprocessed to fill the missing values before applying it. The proposed approach can also be tested with many other algorithms for classification, rule extraction, outlier detection, etc.
Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2019.105699.

References

[1] Roman Słowiński, Salvatore Greco, Benedetto Matarazzo, Dominance-based rough set approach to multiple criteria decision support, Multiple Criteria Decision Making / University of Economics in Katowice 2 (2007) 9-56.
[2] S. Greco, B. Matarazzo, R. Słowiński, Rough approximation of a preference relation by dominance relations, European J. Oper. Res. 117 (1999) 63-83.
[3] S. Greco, B. Matarazzo, R. Słowiński, Rough sets theory for multicriteria decision analysis, European J. Oper. Res. 129 (2001) 1-47.
[4] S. Greco, B. Matarazzo, R. Słowiński, Multicriteria classification, in: W. Kloesgen, J. Zytkow (Eds.), Handbook of Data Mining and Knowledge Discovery, Oxford University Press, 2002, pp. 318-328 (Chapter 16.1.9).
[5] J. Błaszczyński, S. Greco, R. Słowiński, Inductive discovery of laws using monotonic rules, Eng. Appl. Artif. Intell. (2011) http://dx.doi.org/10.1016/j.engappai.2011.09.003.
[6] Qiwei Hu, et al., Spare parts classification in industrial manufacturing using the dominance-based rough set approach, Eur. J. Oper. Res. 262 (3) (2017) 1136-1163.
[7] Masurah Mohamad, Ali Selamat, Analysis on hybrid dominance-based rough set parameterization using private financial initiative unitary charges data, in: Asian Conference on Intelligent Information and Database Systems, Springer, Cham, 2018.
[8] M.G. Augeri, et al., Dominance-based rough set approach to budget allocation in highway maintenance activities, J. Infrastruct. Syst. 17 (2) (2010) 75-85.
[9] Jean-Charles Marin, Kazimierz Zaras, Bryan Boudreau-Trudel, Use of the dominance-based rough set approach as a decision aid tool for the selection of development projects in Northern Quebec, Mod. Econ. 5 (07) (2014) 723.
[10] Krzysztof Pancerz, Dominance-based rough set approach for decision systems over ontological graphs, in: Computer Science and Information Systems (FedCSIS), 2012 Federated Conference on, IEEE, 2012.
[11] Robert Susmaga, Reducts and constructs in classic and dominance-based rough sets approach, Inform. Sci. 271 (2014) 45-64.
[12] Krzysztof Pancerz, Dominance-based rough set approach for decision systems over ontological graphs, in: Computer Science and Information Systems (FedCSIS), 2012 Federated Conference on, IEEE, 2012.
[13] Xiaojun Xie, Xiaolin Qin, Chunqiang Yu, Xingye Xu, Test-cost-sensitive rough set based approach for minimum weight vertex cover problem, Appl. Soft Comput. 64 (2017) 423-435.
[14] Lei Lei, Wavelet neural network prediction method of stock price trend based on rough set attribute reduction, Appl. Soft Comput. 62 (2018) 923-932.
[15] Yaya Liu, Keyun Qin, Luis Martínez, Improving decision making approaches based on fuzzy soft sets and rough soft sets, Appl. Soft Comput. 65 (2018) 320-332.
[16] Asit K. Das, Shampa Sengupta, Siddhartha Bhattacharyya, A group incremental feature selection for classification using rough set theory based genetic algorithm, Appl. Soft Comput. 65 (2018) 400-411.
[17] Yi Ren, Diankui Gao, Lizhi Xu, Prediction of service life of large centrifugal compressor remanufactured impeller based on clustering rough set and fuzzy Bandelet neural network, Appl. Soft Comput. 78 (2019) 132-140.
[18] Ahmad Taher Azar, H. Hannah Inbarani, K. Renuga Devi, Improved dominance rough set-based classification system, Neural Comput. Appl. 28 (8) (2017) 2231-2246.
[19] Qiwei Hu, et al., Spare parts classification in industrial manufacturing using the dominance-based rough set approach, Eur. J. Oper. Res. 262 (3) (2017) 1136-1163.
[20] Sarra Bouzayane, Inès Saad, Weekly predicting the At-Risk MOOC learners using dominance-based rough set approach, in: European Conference on Massive Open Online Courses, Springer, Cham, 2017.
[21] James J.H. Liou, Gwo-Hshiung Tzeng, A dominance-based rough set approach to customer behavior in the airline market, Inform. Sci. 180 (11) (2010) 2230-2238.
[22] Yoshifumi Kusunoki, Masahiro Inuiguchi, A unified approach to reducts in dominance-based rough set approach, Soft Comput. 14 (5) (2010) 507-515.
[23] James J.H. Liou, Variable consistency dominance-based rough set approach to formulate airline service strategies, Appl. Soft Comput. 11 (5) (2011) 4011-4020.
[24] UCI Machine Learning Repository, archive.ics.uci.edu/ml/. (Accessed 19 April 2019).