European Journal of Operational Research 197 (2009) 214–224
European Journal of Operational Research journal homepage: www.elsevier.com/locate/ejor
Decision Support
Hybridizing principles of the Electre method with case-based reasoning for data mining: Electre-CBR-I and Electre-CBR-II
Hui Li *, Jie Sun
School of Business Administration, Zhejiang Normal University, 91 Subbox in P.O. Box 62 at YingBinDaDao 688, Jinhua City 321004, Zhejiang Province, PR China
Article info
Article history: Received 18 May 2007; Accepted 30 May 2008; Available online 10 June 2008
Keywords: Data mining; Electre; Case-based reasoning; 30-times hold-out method; Financial distress prediction
Abstract
Electre is an important outranking method developed in the area of decision-aiding. Data mining is a vital and developing technique that receives contributions from many disciplines, such as databases, machine learning, information retrieval, and statistics. Techniques from outranking approaches, e.g. Electre, could also contribute to the development of data mining. In this research, we address the following two issues: (a) why and how to combine Electre with case-based reasoning (CBR) to generate hybrid models by extending the fundamental principles of Electre into CBR; (b) how predictive performance is affected when evidence vetoing the assertion is employed in addition to evidence supporting the assertion. The similarity measure of CBR is implemented by revising and fulfilling three basic ideas of Electre, i.e. the assertion that two cases are indifferent, evidence supporting the assertion, and evidence vetoing the assertion. Two CBR models are constructed by combining principles of the Electre decision-aiding method with CBR. The first one, named Electre-CBR-I, derives from evidence supporting the assertion. The other one, named Electre-CBR-II, derives from both evidence supporting and evidence vetoing the assertion. Leave-one-out cross-validation and the hold-out method are integrated to form the 30-times hold-out method. In financial distress mining, data were collected from the Shanghai and Shenzhen Stock Exchanges, ANOVA was employed to select features that differ significantly between companies in distress and companies in health, the 30-times hold-out method was used to assess predictive performance, and a grid-search technique was used to search for optimal parameters. Original data distributions were kept in the experiment. Empirical results of long-term financial distress prediction with 30 initial financial ratios and 135 initial pairs of samples indicate that Electre-CBR-I outperforms Electre-CBR-II and the other comparative CBR models, and that Electre-CBR-II outperforms the other comparative CBR models.
© 2008 Elsevier B.V. All rights reserved.
1. Introduction
Data mining is a popular tool for mining knowledge from data. It draws on many disciplines, such as databases, machine learning, information retrieval, statistics, data visualization, and parallel and distributed computing (Han and Kamber, 2001; Zhou, 2003). Data mining offers systematic approaches to classification and prediction tasks, supported by mature theories of data preprocessing, performance assessment, and so on, which assist the modeling process effectively. Electre is a chief outranking approach that has been developed by Roy (1971, 1991, 1996), Roy and Vincke (1984), and Roy and Słowiński (2008). So far, Electre has mainly been applied in the area of decision-aiding, and few contributions have been made from decision-aiding methods to the area of data mining.
* Corresponding author. Tel.: +86 158 8899 3616.
E-mail addresses: [email protected] (H. Li), [email protected] (J. Sun).
0377-2217/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2008.05.024
CBR is a body of concepts and techniques that touches on knowledge representation, reasoning, and learning from experience (Schank and Abelson, 1977; Schank, 1982; Pal and Shiu, 2004; Shiu and Pal, 2004). It is also a methodology in data mining (Han and Kamber, 2001). CBR is a comprehensible approach to prediction because it resembles the human reasoning process: people search their memories for similar experiences when they encounter a new problem. CBR can learn over time, reason in domains with incomplete data and with concepts that have not been fully defined or modeled, and provide a means of explanation. Financial distress prediction, which could be regarded as a data mining task, has been an important research area since 1966. In the beginning, statistical techniques such as multivariate discriminant analysis (MDA) (Altman, 1968) and logistic regression (Logit) (Martin, 1977) were employed in financial distress prediction. Then, with the development of machine learning, neural networks (NN) (Odom and Sharda, 1990), case-based reasoning (CBR) (Jo and Han, 1996), and support vector machines (SVM) (Shin et al., 2005; Hui and Sun, 2006) were applied, and they often outperformed
statistical ones. Data mining, which refers to mining knowledge from large amounts of data, has made great strides during the last two decades. It has revived interest in research on financial distress prediction. Lin and McClean (2001) and Sun and Li (2008a), respectively, used systematic data mining approaches with hybrid models to find hidden patterns, trends and relationships in financial data. Clustering analysis, a data mining technique, was also used to predict financial distress (Smet and Guzman, 2004). Meanwhile, group decision making began to be employed to build an early warning system of financial distress (Sun and Li, 2007). According to the review by Kumar and Ravi (2007), in the period 1968–2005 more than 67% of published papers on financial distress prediction collected samples containing fewer than one thousand records, and no study involved more than ten thousand records. In financial distress prediction of Chinese listed companies, there are no more than 2000 listed companies in the Shanghai and Shenzhen Stock Exchanges. As a result, the number of samples that could be used is limited. Thus, it is a suitable area for the application of CBR with the k-nearest-neighbor (KNN) algorithm at its heart. Electre and CBR are both developed on the basis of a distance measure between two objects. They are derived from human behaviors or developed to assist human behaviors in daily life. Thus, it is reasonable to combine Electre with CBR to fulfill the similarity measure. In fact, Slowinski and Stefanowski (1994, 1996) were the first authors who applied outranking relations to define different similarity functions for the data features. Li et al. (2007) also investigated the possibility of defining a similarity function by using outranking relations. Li and Sun (2008) attempted to use ranking-order information in defining a similarity measure in CBR.
However, there is a lack of research on how and why to combine outranking approaches with CBR, and there has been no discussion of the effects of evidence supporting and evidence vetoing the assertion that two objects are similar. In this research, we address why and how to combine the Electre decision-aiding method with CBR to form a data mining method, and we investigate whether predictive performance could be improved by employing evidence vetoing the assertion in addition to evidence supporting the assertion. Five topics are mainly presented: data mining by CBR, fundamental principles of Electre, similarity measures that employ principles of Electre in two different ways, reduction of the computational complexity of Electre when hybridizing, and detailed methodology. Two hybrid CBR models are constructed, both of which are obtained by extending the basic ideas of Electre and using them in CBR. The assertion of the Electre decision-aiding method is revised as ‘case a and case b are indifferent’ in order to be consistent with the idea of similarity in CBR. In the hybridization process, we generate two corresponding models. The first one, named Electre-CBR-I, derives from evidence supporting the assertion. The other one, named Electre-CBR-II, derives from both evidence supporting and evidence vetoing the assertion. The application area is financial distress prediction. The performance of the two hybrid models is compared by the 30-times hold-out method, which combines leave-one-out cross-validation and the hold-out method. The classical CBR models based on Euclidean distance and Manhattan distance and the one based on grey correlation degree are also employed for comparison. This paper is organized as follows. Section 2 makes a brief review of data mining processes, commonly used feature selection methods, and CBR-based financial distress prediction.
The contribution of this paper and why this kind of combination is selected are presented in Section 3. Section 4 addresses how to combine the basic principles of Electre with CBR to generate two hybrid models for data mining purpose. Section 5 designs the experiment of financial distress prediction, with empirical
results and analysis presented in Section 6. Section 7 draws some conclusions.

2. Literature review
2.1. Data mining processes
Data mining is composed of several systematic processes. They aim to obtain more suitable data and models for prediction or classification problems. Chief processes of data mining include data collection, data cleaning, data integration, data reduction by feature selection and/or construction, data verification, modeling, assessment and prediction (Han and Kamber, 2001; Hand et al., 2001; Witten and Frank, 2000; Lin and McClean, 2001). Data cleaning prepares the data for a specific application. Data integration merges data for specific applications into a data cube. Such data in financial distress prediction include publicly revealed symptoms of listed companies, class labels describing whether or not the company runs into financial distress, and related operating strategies that help a company in distress get out of it. Data reduction leads to a reduced representation of the data set which is much smaller in volume, yet produces the same or even better predictive accuracy. Whether or not the processed data set is suitable for financial distress prediction should be verified in the process of data verification. There are two main techniques for data verification, i.e. a multi-collinearity test among features and a significance test on the difference between samples in distress and samples in health. In the modeling process, specific models are developed and applied for data mining tasks. Predictive performance is usually assessed through the hold-out method or cross-validation.

2.2. CBR-based financial distress mining
CBR was born in 1980 or so (Schank and Abelson, 1977; Schank, 1982) and began to be used in financial distress prediction in 1996.
Guided by the demonstration that CBR systems could be effectively developed for a variety of accounting tasks (Brown and Gupta, 1994), several studies have investigated financial distress prediction using CBR. Jo and Han (1996) first introduced CBR into financial distress prediction to form a case-based forecasting system. Jo et al. (1997) compared case-based forecasting, MDA, and NN. The average accuracy of MDA, NN and CBR was 82.22%, 83.79% and 81.52%, respectively. By employing the CBR shell ReMind, Bryant (1997) also tried to find out whether or not CBR could be used successfully to predict corporate bankruptcy. Experimental results indicated that Logit had better predictive accuracy than CBR. Feature weighting is an important problem in feature-weighted similarity measures, e.g. the Euclidean distance metric. Park and Han (2002) proposed the AHP-weighted KNN algorithm for bankruptcy prediction by introducing the analytic hierarchy process (AHP), a commonly used weighting mechanism, into CBR. Experimental results indicated that it outperformed all the other comparative CBR models. In the following two years, Yip and Deng (2003) and Yip (2004) used CBR with the KNN algorithm to predict business failure of Australian firms. Experimental results indicated that CBR with weighted KNN outperformed MDA, and MDA outperformed CBR with pure KNN. More recently, Sun and Hui (2006) proposed a financial distress prediction model based on a grey correlation metric and weighted KNN. It was concluded that the new CBR model had a good predictive ability for those companies that would probably run into financial distress in less than two years. Lin et al. (2007) integrated rough set theory with grey relational analysis and CBR to implement financial distress prediction, with the conclusion that the proposed
model was the most accurate and effective among predictive models based on rough set theory. Research on CBR-based financial distress prediction could be summarized as follows: (a) studies of CBR-based financial distress prediction are mainly distributed across Korea, the USA, Australia, China’s mainland, and Taiwan; (b) current research on CBR-based financial distress prediction provides evidence of both its effectiveness and its ineffectiveness; (c) the majority of the foregoing studies use equal weight on each feature; (d) the majority of the foregoing studies employ the Euclidean metric as the foundation of the similarity measure; (e) in the majority of the foregoing studies, the principle of KNN, rather than nearest neighbor, is utilized; and (f) the MDA stepwise method and the two-tailed t-test have been used to select features in CBR-based financial distress prediction.

2.3. Commonly used methods of feature selection in financial distress mining
Financial distress prediction could be regarded as a specific form of classification problem. For such problems, two families of feature selection methods have been developed so far (Guyon and Elisseeff, 2003). One family is the filters and the other is the wrappers. Both families are data oriented. The grouping principle for feature selection methods is whether or not a classifier or predictor is taken into consideration when carrying out the process of feature selection. Filters select optimal feature subsets as a pre-processing step. This step is independent of the classifier. Several specific filter approaches are commonly used in financial distress prediction. For example, Chen and Hsiao (2008), Min et al. (2006), Lin and McClean (2001), Jo et al. (1997), Jo and Han (1996), and Back et al. (1996), respectively, employed stepwise methods of MDA, Logit, and ANOVA in their applications of financial distress prediction.
Wrappers take the classifier into consideration and use it as a black box to select feature subsets according to their predictive power. When wrappers are included in the training process, they are often called embedded methods. A commonly used wrapper in financial distress prediction is the genetic algorithm (GA), which is employed to search for optimal feature subsets under the assessment of the classifier’s predictive accuracy. For example, Min et al. (2006) and Kim (2004), respectively, used GA to select optimal feature subsets for the classifiers they employed. One of the main advantages of filters is that they could be used as a preprocessing step to reduce space dimensionality and overcome over-fitting. Thus, it is reasonable to use a filter or wrapper with a linear predictor in the process of feature selection, and then train a more complex non-linear predictor on the resulting features (Guyon and Elisseeff, 2003). In fact, Bi et al. (2003) have already carried out this type of research. They used a linear SVM in the process of feature selection, but a non-linear SVM for prediction using the resulting features. Chen and Hsiao (2008) also tried to use a wrapper with a linear predictor, i.e. the MDA stepwise method, to select optimal features for a non-linear SVM. In the area of financial distress prediction, there is another family of feature selection methods. When Beaver (1966) and Altman (1968) started the area of financial distress prediction by applying statistical models, experience-based feature selection was used. Lin and McClean (2001) compared predictive accuracy using two types of feature subsets, generated by human judgment and by ANOVA, respectively. Their conclusion is that the filter approach outperformed the feature selection method based on human experience. Expert experience is hard to obtain, and it does not take into consideration the characteristics of a specific application problem.
Nowadays, a relatively complete initial feature subset built from experts’ experience needs to be employed first. Then, data-oriented feature selection methods are supposed to be used.
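As an illustration of the filter idea described above, a one-way ANOVA filter for the two-class case can be sketched as follows. This is a minimal sketch and not the procedure of any cited study: the function names and the parameter `f_critical` are hypothetical, and the critical value is assumed to be looked up in an F table for the chosen significance level and degrees of freedom.

```python
from statistics import mean


def anova_f(group_a, group_b):
    """One-way ANOVA F statistic for one feature observed in two groups."""
    groups = [group_a, group_b]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares (degrees of freedom: groups - 1).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (degrees of freedom: n_total - groups).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def select_features(distress, healthy, f_critical):
    """Indices of features whose F statistic exceeds f_critical.

    distress, healthy: lists of samples, each a list of feature values.
    f_critical: hypothetical parameter, assumed to come from an F table.
    """
    kept = []
    for k in range(len(distress[0])):
        f_value = anova_f([row[k] for row in distress],
                          [row[k] for row in healthy])
        if f_value > f_critical:
            kept.append(k)
    return kept
```

The filter is independent of the classifier, so the retained feature indices can be passed to any predictor afterwards.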
3. Contribution of this work
In this research, we would like to draw the attention of researchers in the decision-aiding area, e.g. outranking approaches, to the application area of data mining. The contribution of this paper is twofold. On the one hand, we try to describe why and how to apply some key techniques of Electre to data mining. This is done by analyzing the fundamental ideas of Electre. In pursuing this motivation, we choose CBR, a classification method in data mining, as the medium. The reason why we choose CBR is not only its merits mentioned in Section 1, but also that both CBR and Electre are developed on the basis of a distance measure between pairs of objects on each feature. In CBR with KNN at its heart, Euclidean distance is traditionally used. In Electre, the determination of an outranking relation is also rooted in a distance measure on each feature. At the same time, CBR is believed to be a methodology, not a technology (Kolodner, 1993; Watson, 1999; Pal and Shiu, 2004). Hence, any technique or approach could be absorbed into CBR. Thus, it is reasonable to combine principles of the Electre decision-aiding method with CBR and form a hybrid model, i.e. Electre-CBR. We generate two Electre-CBR models. The first one, named Electre-CBR-I, is developed on the basis of evidence supporting the assertion that two objects are indifferent. The other one, named Electre-CBR-II, is developed on the basis of both evidence supporting and evidence vetoing the assertion that two objects are indifferent. We attempt to investigate the effect on predictive performance of employing evidence vetoing the assertion in addition to evidence supporting the assertion. On the other hand, we try to investigate the performance of the two Electre-CBR models in long-term financial distress prediction. Predictive performance of the two derived models is planned to be compared. Classical CBR models are planned to be employed as baseline models.
The initial data are collected from the Shanghai and Shenzhen Stock Exchanges in China. In the empirical research, we employ ANOVA to find features that differ significantly between samples in health and samples in distress, as Lin and McClean (2001) and Min et al. (2006) did. In the assessment of predictive performance, we employ the 30-times hold-out strategy, which combines the hold-out method and leave-one-out cross-validation (LOO-CV). The corresponding parameters are optimized by a grid-search technique in LOO-CV. Predictive performance is evaluated by the predictive results produced on hold-out data. Statistical analysis is employed to find whether or not there are significant differences among comparative models on the basis of the 30 predictive results.
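The assessment strategy just described can be sketched as follows. This is a minimal sketch under stated assumptions, not the experimental code: the hold-out fraction, the random seed, the parameter grid and the illustrative KNN predictor with Manhattan distance are all hypothetical choices, and `predict` stands for any case-based predictor with the signature shown.

```python
import random


def loo_accuracy(cases, labels, predict, params):
    """Leave-one-out accuracy on the training cases for one parameter setting."""
    hits = 0
    for i in range(len(cases)):
        library = cases[:i] + cases[i + 1:]
        library_labels = labels[:i] + labels[i + 1:]
        if predict(library, library_labels, cases[i], params) == labels[i]:
            hits += 1
    return hits / len(cases)


def thirty_times_holdout(cases, labels, predict, grid,
                         n_repeats=30, holdout_fraction=0.3, seed=0):
    """Repeat: split off hold-out data, tune by LOO-CV grid search, test once."""
    rng = random.Random(seed)
    n_holdout = max(1, int(round(holdout_fraction * len(cases))))
    accuracies = []
    for _ in range(n_repeats):
        order = list(range(len(cases)))
        rng.shuffle(order)
        test_idx, train_idx = order[:n_holdout], order[n_holdout:]
        train_x = [cases[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        # Grid search: keep the setting with the best LOO-CV accuracy.
        best = max(grid, key=lambda p: loo_accuracy(train_x, train_y, predict, p))
        # Performance is measured only on the hold-out cases.
        hits = sum(predict(train_x, train_y, cases[i], best) == labels[i]
                   for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return accuracies


def knn_predict(library, library_labels, query, params):
    """Illustrative predictor: majority vote of the k nearest cases."""
    ranked = sorted(range(len(library)),
                    key=lambda i: sum(abs(u - v) for u, v in zip(library[i], query)))
    votes = [library_labels[i] for i in ranked[:params["k"]]]
    return max(sorted(set(votes)), key=votes.count)
```

The function returns one accuracy per repetition, so the 30 values can be fed directly into a significance test across comparative models.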
4. Hybridizing fundamental principles of Electre into CBR
4.1. Data mining by CBR
The case library is critical in CBR. When CBR is employed in financial distress prediction, the data set of useful samples is used as the case library, and each sample is taken as a case. Judgment on a current sample’s financial state is generated by integrating the financial labels of similar cases in the case library. These cases have symptoms similar to those of the current one. The internal structure of a CBR-based financial distress prediction system could be divided into two major parts, i.e. the company retriever and the company label classifier. It is shown in Fig. 1. The task of the company retriever is to find appropriate companies in the case library. The task of the company label classifier is to use similar companies to label the current one. When distance-based data mining methods are utilized, company retriever
is on the basis of similarity measure. Company label classifier builds on the principle of majority voting in KNN.

Fig. 1. Structure of CBR for financial distress mining.

4.2. Fundamental principles of Electre
To our understanding, the classical Electre method roots on three fundamental principles. The first is the assertion used to describe the preference of the decision maker. In classical Electre methods, ‘object a is at least as good as object b’ is used. The second principle is the evidence supporting the assertion. Concordance is used in the classical Electre method to indicate the evidence. In order to calculate an index reflecting evidence supporting the assertion, indifference relation, weak preference relation, and strong preference relation are employed in the classical Electre method. The third principle relates to the evidence vetoing the assertion. Discordance is used in the classical Electre method to reflect this evidence. There are no detailed descriptions of veto relations in the classical Electre method, though such relations are indeed used to calculate discordance.

4.3. Similarity measure based on principles of Electre
On the basis of the three fundamental principles, we attempt to combine principles of the Electre decision-aiding method with CBR. In order to be consistent with the basic concept of similarity in CBR, we use some terminologies different from the classical Electre. The model for similarity measure fulfilling the three principles is shown as Fig. 2. The hybrid model based on evidence supporting the assertion that two objects are indifferent is named Electre-CBR-I. The hybrid model based on evidence supporting and vetoing the assertion that two objects are indifferent is called Electre-CBR-II.

4.4. Reduction on computation complexity of Electre when hybridizing
A big challenge in outranking approaches is to determine threshold values. Electre was developed in the area of decision-aiding, where it is generally accepted that there are m groups of parameters $(q_k, p_k, v_k)$ if there are m features/criteria. This is workable because decision-aiding problems seldom involve more than 10 features and 10 objects. Since decision-aiding problems are human oriented, these parameters are often determined by experience. Data mining, by contrast, is data oriented: the fewer parameters a data mining model employs, the lower its computational complexity. When principles of the Electre decision-aiding method are introduced into CBR to carry out data mining, the number of parameters in the hybrid models is determined directly by the number of parameters of Electre. With ten features, for example, there would be 3 × 10 = 30 parameters to be optimized in classical Electre. Thirty parameters would surely incur computational problems in data mining. In order to handle this problem, we employ the concept of range threshold. Range threshold refers to a parameter which lies in the range [0, 1] and is used to tune the three
Fig. 2. Concept model for similarity measure by fulfilling fundamental principles of Electre.
thresholds, i.e. indifference threshold, difference threshold and veto threshold. The three relations will be described in next section. The three corresponding range thresholds, i.e. q, p, v, (q < p < v) are defined in the following way:
\[
q_k = q \cdot \mathrm{range}(k), \tag{1}
\]
\[
p_k = p \cdot \mathrm{range}(k), \tag{2}
\]
\[
v_k = v \cdot \mathrm{range}(k), \tag{3}
\]
where range(k) represents the range of the kth feature, and q, p, v are called, respectively, the range thresholds of the indifference threshold, the difference threshold and the veto threshold.

4.5. Model specification
4.5.1. Frame model
The frame model of Electre-CBR for data mining is shown in Fig. 3. We could find from Fig. 3 that key problems for combining
principles of the Electre decision-aiding method with CBR are assertion definition, outranking relation, indifference indicator, veto relation, veto indicator, and parameter number reduction. At the same time, the key problem of employing CBR in data mining is the similarity measure. Key problems for building up the case library include data collection and preprocessing. One of the key problems in data preprocessing is feature selection. The initial aim of developing outranking approaches, e.g. Electre, is to help human beings make a decision by ranking alternatives. Thus, the research objective in Electre is to find which object is preferred to the other. When trying to combine some key techniques of Electre with CBR, some changes to the initial definitions of Electre have to be made. CBR is developed to help human beings make a decision by providing experiences similar to the encountered problem. Thus, the research objective in CBR is to find which object is more similar (or indifferent) to the target object. This is completely different from the initial research objective of Electre. Hence, we have to revise original definitions of
Fig. 3. Structure of the hybrid data mining model by combining Electre into CBR.
Electre to be consistent with the research objective of CBR, when combining principles of the Electre decision-aiding method with CBR to carry out data mining tasks.

4.5.2. Revised assertion
Concepts of outranking relations derive from the research of the European school of multi-criteria decision aiding (Roy, 1971, 1991, 1996; Brans and Vincke, 1985; Vincke, 1986; Bouyssou and Pirlot, 2005; Roy and Słowiński, 2008). Specific definitions of outranking relations in Electre are assertion oriented. In the classical decision-aiding domain, the objective of developing Electre is to find a preferred sequence of objects. Thus, the assertion ‘object a outranks object b’, meaning ‘object a is at least as good as object b’, is employed in the classical decision-aiding domain. For example, in the summary of the basic theory of Electre in Takeda (2001), it is the assertion ‘object a outranks object b’ or ‘object b is outranked by object a’ that is used, on the assumption that object a is at least as preferred as object b if object a outranks object b. Thus, there are three types of relations generated between the two objects, i.e. strict preference, weak preference, and indifference. When the principles of Electre are employed in CBR to carry out data mining, the situation is different. CBR has been developed on the assumption that similar experience could guide further problem solving (Schank, 1982; Pal and Shiu, 2004). It is the concept of indifference, rather than preference, which is employed in CBR. In the hybridization of Electre and CBR, it is not applicable to directly use the classical definitions of Electre to solve data mining problems. In this view, we employ the assertion ‘object a and object b are indifferent’ when developing a hybrid model for CBR-based financial distress prediction. As a result, there are three corresponding types of relations generated between the two objects, i.e. indifference, weak difference, and strong difference.

4.5.3. Revised definitions of outranking relation
Let the distance between case a and case b be denoted by $d_{ab}$. Let $x_{ak}$ and $x_{bk}$ be the values of the kth feature of case a and case b, respectively. Let $w_k$ express the weight of the kth feature. The distance between two cases in financial distress prediction on the kth feature could be denoted as follows:
\[
d_{abk} = |x_{ak} - x_{bk}|. \tag{4}
\]
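Formulas (1)–(4) can be sketched together in a few lines. This is a minimal sketch under stated assumptions: range(k) is taken to be the maximum minus the minimum of the kth feature over the case library, and the function names are hypothetical.

```python
def feature_ranges(case_library):
    """range(k), assumed here to be max minus min of feature k in the library."""
    columns = list(zip(*case_library))
    return [max(col) - min(col) for col in columns]


def scaled_thresholds(case_library, q, p, v):
    """Formulas (1)-(3): per-feature thresholds from three range thresholds."""
    if not 0 <= q < p < v <= 1:
        raise ValueError("expected 0 <= q < p < v <= 1")
    ranges = feature_ranges(case_library)
    q_ks = [q * r for r in ranges]  # indifference thresholds
    p_ks = [p * r for r in ranges]  # difference thresholds
    v_ks = [v * r for r in ranges]  # veto thresholds
    return q_ks, p_ks, v_ks


def feature_distances(case_a, case_b):
    """Formula (4): absolute difference between two cases on every feature."""
    return [abs(xa - xb) for xa, xb in zip(case_a, case_b)]
```

Whatever the number of features, only the three values q, p and v remain to be tuned, which is what keeps a grid search tractable.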
Indifference relation, weak difference relation, and strong difference relation on each feature are defined as follows, respectively.

Definition 1. Case a and case b are considered as indifferent when the kth feature is taken into account, which could be denoted as $I_k(a, b)$, if the condition $d_{abk} \le q_k$ is met. $q_k$ is the indifference threshold.

Definition 2. Case a and case b are considered as strongly different when the kth feature is taken into account, which could be denoted as $SD_k(a, b)$, if the condition $d_{abk} > p_k$ is met. $p_k$ is the difference threshold.

Definition 3. Case a and case b are considered as weakly different when the kth feature is taken into account, which could be denoted as $WD_k(a, b)$, if the condition $q_k < d_{abk} \le p_k$ is met.

4.5.4. Revised definitions of veto relation

Definition 4. There is a non-veto relation on the assertion that case a and case b are considered as indifferent when the kth feature is taken into account, which could be denoted as $NV_k(a, b)$, if the condition $d_{abk} \le p_k$ is met.

Definition 5. There is a strong veto relation on the assertion that case a and case b are considered as indifferent when the kth feature is taken into account, which could be denoted as $SV_k(a, b)$, if the condition $d_{abk} > v_k$ is met. Here, $v_k$ is the veto threshold.

Definition 6. There is a weak veto relation on the assertion that case a and case b are considered as indifferent when the kth feature is taken into account, which could be denoted as $WV_k(a, b)$, if the condition $p_k < d_{abk} \le v_k$ is met.

It is to partition the space of strong difference more particularly. Strong veto relation will only be concluded if a case is too strongly dominating or dominated by the case it is compared with on some feature. Thus, the assertion that case a and case b are indifferent is forbidden. While weak veto relation represents a certain lack of conviction. Non-veto relation means that relation between two cases could be measured by indifference indicator effectively. The three veto relations compose the space of strongly different relations between two cases, which could be expressed as Vr.

4.5.5. Electre-CBR-I: using revised similarity measure based on evidence supporting the assertion
The similarity measure of CBR based on evidence supporting the assertion that two cases are indifferent is defined as follows:

\[
SIM'_{ab} = \frac{\sum_{k=1}^{m} w_k \, c'_k(a, b)}{\sum_{k=1}^{m} w_k}, \tag{5}
\]

where $c'_k(a, b)$ is called the concordance index in classical Electre. In order to represent basic ideas of similarity in CBR, we rename concordance index as indifference indicator between case a and case b on the kth feature. The indifference indicator expresses the evidence which supports the assertion that case a and case b are indifferent. $c'_k(a, b)$ is determined by

\[
c'_k(a, b) = f(Or) =
\begin{cases}
0 & \text{if } SD_k(a, b), \\
\dfrac{p_k - d_{abk}}{p_k - q_k} & \text{if } WD_k(a, b), \\
1 & \text{if } I_k(a, b).
\end{cases} \tag{6}
\]

4.5.6. Electre-CBR-II: using revised similarity measure based on evidence supporting and vetoing the assertion
Similarity between two cases based on evidence supporting and vetoing the assertion could be computed by

\[
SIM_{ab} =
\begin{cases}
SIM'_{ab} & \text{if } d'_k(a, b) \le SIM'_{ab} \text{ for every feature}, \\
SIM'_{ab} \times \prod\limits_{k \in \{k:\, d'_k(a, b) > SIM'_{ab}\}} \dfrac{1 - d'_k(a, b)}{1 - SIM'_{ab}} & \text{if } d'_k(a, b) > SIM'_{ab} \text{ for at least one feature},
\end{cases} \tag{7}
\]

where $d'_k(a, b)$ is called as the discordance index in the classical Electre. In order to represent basic ideas of similarity in CBR, we rename discordance index as veto indicator between case a and case b on the kth feature. Veto indicator expresses the evidence which vetoes the assertion that case a and case b are indifferent. $d'_k(a, b)$ could be computed in the following way:

\[
d'_k(a, b) = g(Vr) =
\begin{cases}
0 & \text{if } NV_k(a, b), \\
\dfrac{v_k - d_{abk}}{v_k - p_k} & \text{if } WV_k(a, b), \\
1 & \text{if } SV_k(a, b).
\end{cases} \tag{8}
\]
H. Li, J. Sun / European Journal of Operational Research 197 (2009) 214–224
Note that Formula (7) is directly inspired by a similar formula used to define the concordance–discordance index of the classical Electre method, an index that is usually called the degree of credibility of outranking (Roy, 1971, 1991, 1996; Roy and Słowiński, 2008).
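Formulas (7) and (8) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the names are hypothetical, the similarity from Formula (5) is passed in as `sim0`, and the weak-veto branch is coded exactly as Formula (8) reads in this text. For comparison, the discordance index of classical Electre III rises linearly from 0 at $p_k$ to 1 at $v_k$, i.e. $(d - p)/(v - p)$; switching to that form is a one-line change in `veto_indicator`.

```python
from math import prod


def veto_indicator(d, p, v):
    """Formula (8) for one feature: d is the distance, p <= v are the thresholds."""
    if d <= p:                  # non-veto relation NV_k (Definition 4)
        return 0.0
    if d > v:                   # strong veto relation SV_k (Definition 5)
        return 1.0
    return (v - d) / (v - p)    # weak veto relation WV_k, as printed in Formula (8)


def sim_with_veto(sim0, veto_indicators):
    """Formula (7): discount sim0 by every veto indicator that exceeds it."""
    exceeding = [dk for dk in veto_indicators if dk > sim0]
    if not exceeding:           # no feature vetoes: similarity is unchanged
        return sim0
    # sim0 < 1 here, because an indicator in [0, 1] can only exceed a value below 1.
    return sim0 * prod((1.0 - dk) / (1.0 - sim0) for dk in exceeding)
```

A veto indicator of 1 on any feature drives the similarity to 0, whereas indicators that do not exceed `sim0` leave it untouched.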
5. Financial distress mining

The objective of the application is to investigate the predictive performance of the two Electre-CBR models. For comparison, three CBR models developed earlier are also considered. Two of them are based on the Euclidean distance metric and the Manhattan distance metric, and the third is based on the grey coefficient metric. The predictive performance of these five CBR models is compared in this research. The general design of financial distress mining is shown in Fig. 4.

5.1. Initial data and variables

Listed companies that have had negative net profits in two consecutive years are specially treated (ST) by the China Securities Supervision and Management Committee (CSSMC). They are considered as companies in financial distress. Companies that have never been specially treated are regarded as healthy ones. A pairing strategy is employed to collect the initial data in order to provide information on two distinct classes for learning machines. We collected 135 companies in distress from the Shenzhen Stock Exchange and the Shanghai Stock Exchange over the period 2000–2005. Generally, a machine can learn from data effectively if the classes are of comparable size. Hence, we paired each company in distress with a corresponding healthy company from the same branch of industry and of the same asset scale. If there is no corresponding healthy company for a specific company in distress, we use a healthy company that is similar to it in type of industry and asset scale.

Let the year in which a company runs into financial distress be the benchmark year t-0; then t-1, t-2, and t-3 denote one, two, and three years before distress, respectively. If a listed company has negative net profits in two consecutive years, it will surely be regarded as a company in financial distress. If a listed company has a negative net profit in one year, this will surely draw the attention of the people concerned with the company. In order to generate an early warning of financial distress using data from before net profit gives any signal, it is necessary to employ financial data from year t-3.

The initial financial feature set is composed of 35 financial ratios, which cover activity ratios, long-term debt ratios, short-term debt ratios, profitability ratios, growth ratios, and structural ratios. Five of them could easily be calculated from the others and were deleted. The remaining 30 initial ratios are listed in Table 1; they cover the features most commonly employed in financial distress prediction in China (Liang and Wu, 2005; Hua et al., 2007; Ding et al., 2008; Sun and Li, 2008b).

5.2. Data preprocessing

Initial data collected for financial distress prediction tend to be incomplete, noisy, and inconsistent. In the process of outlier elimination, sample companies with at least one financial ratio value missing were dropped. Companies with financial ratios deviating from the mean value by as much as three standard deviations were also excluded. Obvious outliers were also eliminated on the basis of expert judgment. The final sample consists of 81 healthy companies and 81 companies in distress at year t-3.

Dimension reduction was carried out by removing irrelevant, weakly relevant, or redundant features through the ANOVA stepwise method. Each selected feature was considered to be of equal importance, since feature selection is a specific form of feature weighting. There are several reasons for employing the ANOVA stepwise method for feature selection in this research.

(a) Although some studies use feature subsets picked out by domain experts, this is a time-consuming task. Besides, it is also difficult to carry out effectively because the behavior of the data for financial distress prediction is not well known.

(b) Studies in financial distress prediction that have employed the ANOVA stepwise method for feature selection, e.g. Lin and McClean (2001) and Min et al. (2006), guide this research.
Fig. 4. Experimental design.
Table 1
Initial feature set

| Category | Variable | Feature |
|---|---|---|
| Profitability | X1 | Gross income to sales |
| Profitability | X2 | Net income to sales |
| Profitability | X3 | Earning before interest and tax to total asset |
| Profitability | X4 | Net profit to total assets |
| Profitability | X5 | Net profit to current assets |
| Profitability | X6 | Net profit to fixed assets |
| Profitability | X7 | Profit margin |
| Profitability | X8 | Net profit to equity |
| Activity | X9 | Account receivable turnover |
| Activity | X10 | Inventory turnover |
| Activity | X11 | Account payable turnover |
| Activity | X12 | Total assets turnover |
| Activity | X13 | Current assets turnover |
| Activity | X14 | Fixed assets turnover |
| Short-time liability | X15 | Current ratio |
| Short-time liability | X16 | The ratio of cash to current liability |
| Long-time liability | X17 | Asset-liability ratios |
| Long-time liability | X18 | Equity to debt ratio |
| Long-time liability | X19 | Ratio of liability to tangible net asset |
| Long-time liability | X20 | Ratio of liability to market value of equity |
| Long-time liability | X21 | Interest coverage ratio |
| Growth | X22 | Growth rate of primary business |
| Growth | X23 | Growth rate of total assets |
| Structural ratios | X24 | The proportion of current assets (over total assets) |
| Structural ratios | X25 | The proportion of fixed assets (over total assets) |
| Structural ratios | X26 | The ratio of equity to fixed assets |
| Structural ratios | X27 | The proportion of current liability (over total liability) |
| Per share items and yields | X28 | Earning per share |
| Per share items and yields | X29 | Net assets per share |
| Per share items and yields | X30 | Cash flow per share |
(c) Lin and McClean (2001) have provided some evidence that a data-oriented filter approach to feature selection, e.g. the ANOVA stepwise method, outperforms feature selection based on human judgment.

Features that the ANOVA stepwise method finds significantly different between samples in distress and in health are shown in Table 2. The table shows that the optimal feature subset is composed of X3, X4, X5, X8, X9, X10, X11, X12, X13, X16, X17, X18, X19, X20, X21, X22, X24, X25, X27, X28, X29, and X30. In order to eliminate impacts on predictive performance from different normalization processes, no normalization process was employed; that is, the experiment in this research was carried out on the basis of the original data distribution.

5.3. Assessing predictive accuracy

Using all data to build a classifier and then to estimate its accuracy could result in an overly optimistic estimate. In the hold-out method, the available data set is divided randomly into a training set and a testing set. The former is used to derive a classifier, whose predictive accuracy is estimated with the latter. In V-fold cross-validation, the available data are partitioned randomly into V mutually exclusive subsets of approximately equal size. By holding out one subset of the partition as the testing set and using the remainder as the training set each time, V values of predictive accuracy are obtained. The average of those values is taken as the final predictive accuracy. LOO-CV is the special case of V-fold cross-validation in which V equals the number of cases.

One would always like to learn how a classifier performs when it is used for generalization, using data that did not take part in learning. Meanwhile, it is believed that the hold-out method is biased in assessing predictive accuracy. Considering this, we used a hybrid assessment strategy that combines the hold-out method and LOO-CV to estimate the predictive accuracy of the various models. We split the whole data set randomly into two parts. One part, 30% of the data, is used as testing data that never takes part in the learning and validating processes. The other part, 70% of the data, is used as training and validating data. LOO-CV is carried out in the training and validating processes to obtain optimal parameters. This split was carried out 30 times to generate 30 predictive results. On this basis, a paired-samples t-test was employed to determine whether significant differences exist among the models. As in the majority of studies in financial distress prediction, we did not distinguish between Type I and Type II errors. Total predictive accuracy was used as the optimization objective in training and validating, and total predictive accuracy on the hold-out data was used to assess the predictive models.

5.4. Comparative CBR models employed

5.4.1. Euclidean CBR and Manhattan CBR

In Euclidean CBR (ECBR), the similarity measure is computed by the following formula:

$$\mathrm{SIM}_{ab} = \frac{1}{1 + \sqrt{\sum_{k=1}^{m} \left( \dfrac{w_k\, d_{abk}}{\sum_{k=1}^{m} w_k} \right)^{2}}}. \tag{9}$$

In Manhattan CBR (MCBR), the similarity measure is computed in the following way:

$$\mathrm{SIM}_{ab} = \frac{1}{1 + \sum_{k=1}^{m} \dfrac{w_k\, d_{abk}}{\sum_{k=1}^{m} w_k}}. \tag{10}$$
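The two distance-based similarity measures of ECBR and MCBR, Formulas (9) and (10), can be sketched as below. The sketch assumes that the per-feature distances $d_{abk}$ are supplied by the caller and reads Formula (9) as the square root of the summed squares of the weight-normalized distances; the function names are hypothetical.

```python
from math import sqrt


def sim_euclidean(distances, weights=None):
    """ECBR similarity, Formula (9): 1 / (1 + weighted Euclidean distance)."""
    if weights is None:
        weights = [1.0] * len(distances)    # equal feature weights (assumption)
    total_w = sum(weights)
    dist = sqrt(sum((w * d / total_w) ** 2 for w, d in zip(weights, distances)))
    return 1.0 / (1.0 + dist)


def sim_manhattan(distances, weights=None):
    """MCBR similarity, Formula (10): 1 / (1 + weighted Manhattan distance)."""
    if weights is None:
        weights = [1.0] * len(distances)    # equal feature weights (assumption)
    total_w = sum(weights)
    dist = sum(w * d for w, d in zip(weights, distances)) / total_w
    return 1.0 / (1.0 + dist)
```

Both measures equal 1 for identical cases and decrease toward 0 as the per-feature distances grow.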
5.4.2. Grey CBR

Let the grey correlation degree between case a and case b be denoted as $y_{abk}$. Let $x_{ak}$ and $x_{bk}$ express the values of the kth feature of case a and case b, respectively. Let h(a) express the case library without case a, and $b, c \in h(a)$. The grey correlation degree between two cases is computed as follows:

$$y_{abk} = \frac{\inf_k |x_{ak} - x_{ck}| + 0.5\, \sup_k |x_{ak} - x_{ck}|}{d_{abk} + 0.5\, \sup_k |x_{ak} - x_{ck}|}, \tag{11}$$

where $\inf_k |x_{ak} - x_{ck}|$ and $\sup_k |x_{ak} - x_{ck}|$ represent the minimum and maximum distance between case a and case b ($\forall b \in h(a)$) on the kth feature, respectively, and $d_{abk}$ represents the distance between case a and case b on the kth feature. Once the grey correlation degree between two cases has been measured, the similarity between the two cases can be computed. The grey correlation degree is transformed into similarity in the following way:

$$\mathrm{SIM}_{ab} = \sqrt{\sum_{k=1}^{m} \frac{w_k\, y_{abk}^{2}}{\sum_{k=1}^{m} w_k}}. \tag{12}$$
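A possible reading of Formulas (11) and (12) is sketched below. Several points are assumptions rather than statements from the text: $d_{abk}$ is taken to be the absolute difference $|x_{ak} - x_{bk}|$, the minimum and maximum are taken per feature over all cases in the library h(a), Formula (12) is read as the square root of the weighted mean of the squared grey correlation degrees, and a feature on which every case coincides with case a is given degree 1 to avoid dividing by zero.

```python
from math import sqrt


def grey_similarities(x_a, library, weights=None):
    """Similarity of case x_a to every case in library (which excludes x_a)."""
    m = len(x_a)
    if weights is None:
        weights = [1.0] * m                 # equal feature weights (assumption)
    total_w = sum(weights)
    # dist[b][k] = |x_ak - x_bk| (assumed definition of d_abk)
    dist = [[abs(xa - xb) for xa, xb in zip(x_a, x_b)] for x_b in library]
    lo = [min(row[k] for row in dist) for k in range(m)]   # per-feature minimum
    hi = [max(row[k] for row in dist) for k in range(m)]   # per-feature maximum
    sims = []
    for row in dist:
        acc = 0.0
        for k in range(m):
            denom = row[k] + 0.5 * hi[k]
            # Formula (11); degenerate feature handled by assumption
            y = 1.0 if denom == 0 else (lo[k] + 0.5 * hi[k]) / denom
            acc += weights[k] * y * y
        sims.append(sqrt(acc / total_w))    # Formula (12)
    return sims


# Hypothetical two-feature case compared with a two-case library.
sims = grey_similarities([0.0, 0.0], [[1.0, 2.0], [3.0, 2.0]])
```

In this hypothetical example the first library case is the closest on every feature and therefore receives similarity 1, while the second receives a lower value.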
5.5. Model implementation and parameters searching

The predictive models were implemented using Matlab. The dimension reduction process of ANOVA was realized using SPSS. All features employed were given equal importance since feature selection is a specific form of feature weighting. To search for optimal parameters of the corresponding models in the process of LOO-CV, the standard grid-search technique was employed. For the
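Table 2
ANOVA output

| Category | Variable | Mean (health) | SD (health) | Mean (distress) | SD (distress) | F | Sig. |
|---|---|---|---|---|---|---|---|
| Profitability | X1 | 0.2516 | 0.1373 | 0.2559 | 0.1516 | 0.036 | 0.849 |
| Profitability | X2 | 0.0791 | 0.1054 | 0.0564 | 0.7611 | 0.071 | 0.791 |
| Profitability | X3 | 0.0577 | 0.0487 | 0.0395 | 0.0468 | 5.893 | 0.016** |
| Profitability | X4 | 0.0357 | 0.0410 | 0.0188 | 0.0450 | 6.238 | 0.014** |
| Profitability | X5 | 0.0765 | 0.0971 | 0.0345 | 0.0684 | 10.10 | 0.002*** |
| Profitability | X6 | 0.2204 | 0.4054 | 0.1158 | 0.4993 | 2.140 | 0.145 |
| Profitability | X7 | 0.8158 | 0.1342 | 0.7868 | 0.8155 | 0.099 | 0.753 |
| Profitability | X8 | 0.0644 | 0.0775 | 0.0361 | 0.1023 | 3.939 | 0.049** |
| Activity | X9 | 15.602 | 47.324 | 5.5249 | 10.184 | 3.510 | 0.063* |
| Activity | X10 | 4.6350 | 4.3574 | 2.7678 | 2.5661 | 11.04 | 0.001*** |
| Activity | X11 | 11.2399 | 10.6976 | 6.1088 | 5.5320 | 14.70 | 0.000*** |
| Activity | X12 | 0.5465 | 0.4129 | 0.3496 | 0.2304 | 14.04 | 0.000*** |
| Activity | X13 | 1.0715 | 0.6644 | 0.6039 | 0.4170 | 28.78 | 0.000*** |
| Activity | X14 | 3.5699 | 6.3878 | 2.4497 | 3.7608 | 1.850 | 0.176 |
| Short-time liability | X15 | 1.6576 | 0.6513 | 1.5533 | 0.7460 | 0.899 | 0.345 |
| Short-time liability | X16 | 0.2140 | 0.3057 | 0.0147 | 0.1730 | 26.08 | 0.000*** |
| Long-time liability | X17 | 0.3934 | 0.1358 | 0.4820 | 0.1442 | 16.17 | 0.000*** |
| Long-time liability | X18 | 1.8476 | 1.2786 | 1.3369 | 1.2103 | 6.815 | 0.10* |
| Long-time liability | X19 | 0.8635 | 0.5927 | 1.3371 | 0.8666 | 16.48 | 0.000*** |
| Long-time liability | X20 | 183.47 | 144.56 | 243.88 | 170.51 | 5.915 | 0.016** |
| Long-time liability | X21 | 8.5323 | 12.662 | 3.5785 | 4.2432 | 11.15 | 0.001*** |
| Growth | X22 | 0.3354 | 1.4424 | 0.0055 | 0.6732 | 3.481 | 0.064* |
| Growth | X23 | 0.1383 | 0.4194 | 0.1909 | 0.3389 | 0.771 | 0.381 |
| Structural ratios | X24 | 0.5320 | 0.1926 | 0.6129 | 0.1516 | 8.844 | 0.003*** |
| Structural ratios | X25 | 0.3468 | 0.1877 | 0.2613 | 0.1354 | 11.06 | 0.001*** |
| Structural ratios | X26 | 3.5026 | 6.1163 | 3.1899 | 4.6148 | 0.135 | 0.714 |
| Structural ratios | X27 | 0.8825 | 0.1515 | 0.9217 | 0.1101 | 3.558 | 0.061* |
| Per share items and yields | X28 | 0.1631 | 0.1860 | 0.0810 | 0.1873 | 7.821 | 0.006*** |
| Per share items and yields | X29 | 2.6917 | 0.9426 | 2.4290 | 0.9930 | 2.982 | 0.086* |
| Per share items and yields | X30 | 0.2481 | 0.3479 | 0.0341 | 0.4135 | 12.70 | 0.000*** |

* Significant at the level of 10%.
** Significant at the level of 5%.
*** Significant at the level of 1%.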
hybrid model of Electre-CBR, the parameters $q_k$, $p_k$, and $v_k$ were determined by the three range thresholds q, p, and v, respectively, as follows from the definition of range threshold. Thus, the grid-search technique was employed to search for optimal values of the three parameters. q was searched in the range [0, 0.1] in steps of 0.02, p in the range [0.3, 1] in steps of 0.1, and v in the range [p, 1] in steps of 0.1, since v > p in the definition. To determine the number of nearest neighbors, we followed the approach of Pan et al. (2007). For all the CBR models, we tried 1-NN, 3-NN, 5-NN, 7-NN, 9-NN, 11-NN, 13-NN, and 15-NN to find the optimal number of neighbors for financial distress prediction.
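The parameter grid and the repeated 70/30 split described in this section can be enumerated as in the following sketch. The names are illustrative; the grid keeps v = p as its lowest value because the stated range is [p, 1]; the split is a plain random shuffle, since stratification is not described; and rounding 30% of the cases down is an assumption. For the 162 sample companies that rounding gives 48 testing cases, which is consistent with the accuracies in Table 3 all being multiples of 1/48.

```python
import random

NEIGHBOR_COUNTS = [1, 3, 5, 7, 9, 11, 13, 15]   # k values tried for k-NN


def parameter_grid():
    """All (q, p, v) triples on the grid described above, with v >= p."""
    grid = []
    for i in range(6):                  # q = 0.00, 0.02, ..., 0.10
        for j in range(3, 11):          # p = 0.3, 0.4, ..., 1.0
            for l in range(j, 11):      # v = p, p + 0.1, ..., 1.0
                grid.append((round(0.02 * i, 2), j / 10, l / 10))
    return grid


def holdout_splits(n_cases, n_repeats=30, test_share=0.3, seed=0):
    """Random (train-and-validate, test) index splits, repeated n_repeats times."""
    rng = random.Random(seed)           # seed is a hypothetical reproducibility aid
    n_test = int(n_cases * test_share)  # rounding down is an assumption
    splits = []
    for _ in range(n_repeats):
        order = list(range(n_cases))
        rng.shuffle(order)
        splits.append((order[n_test:], order[:n_test]))
    return splits


grid = parameter_grid()
splits = holdout_splits(162)
```

Under these assumptions the grid holds 6 × 36 = 216 threshold triples, so combining it with the eight neighbor counts gives 1728 configurations to evaluate by LOO-CV on each training-and-validating set.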
Table 4
t-Values of pairwise comparative models by total accuracy

| | ECBR | MCBR | GCBR | Electre-CBR-I |
|---|---|---|---|---|
| MCBR | 1.642 (a) | 1 | – | – |
| GCBR | 2.076** | 1.511 | 1 | – |
| Electre-CBR-I | 4.785*** | 4.090*** | 2.957*** | 1 |
| Electre-CBR-II | 3.662*** | 2.814*** | 1.775* | 2.811*** |

(a) A t-value.
* Significant at the level of 10%.
** Significant at the level of 5%.
*** Significant at the level of 1%.
6. Empirical results and analysis

In this section, the predictive abilities of the five CBR models are compared. Table 3 lists the total predictive accuracies of each model. A paired-samples t-test was used to examine whether the models differ significantly in predictive performance. Table 4 shows the results of the significance tests for pairwise comparisons of predictive performance between models by total accuracy.
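For reference, the statistic of a paired-samples t-test can be computed as in the short sketch below. It uses only the Python standard library, the function name is hypothetical, and the numbers in the example are toy values, not the per-split accuracies of the experiment, which are not reported here. With 30 paired hold-out results the statistic has 29 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(acc_a, acc_b):
    """t statistic of a paired-samples t-test on two equally long accuracy lists."""
    if len(acc_a) != len(acc_b):
        raise ValueError("both models must be assessed on the same splits")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    # mean difference divided by its standard error (sample standard deviation)
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Toy accuracies of two hypothetical models on three splits.
t_value = paired_t([3.0, 5.0, 7.0], [2.0, 3.0, 4.0])
```

The resulting t-value is compared with the critical value of Student's t distribution for n - 1 degrees of freedom, where n is the number of paired results.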
Table 3
Means of total accuracy (%) obtained with comparative models on 30 data sets

| Models | ECBR | MCBR | GCBR | Electre-CBR-I | Electre-CBR-II |
|---|---|---|---|---|---|
| Minimum | 58.33 | 60.42 | 56.25 | 58.33 | 56.25 |
| Maximum | 72.92 | 75.00 | 81.25 | 83.33 | 83.33 |
| Mean | 66.60 | 67.30 | 69.24 | 71.81 | 70.63 |
Table 3 shows that Electre-CBR-I produces a higher mean total predictive accuracy than Electre-CBR-II, GCBR, MCBR, and ECBR by 1.18, 2.57, 4.51, and 5.21 percentage points, respectively. The highest predictive accuracies of the two Electre-CBR models are both 83.33%. As Table 4 shows, the predictive accuracy of Electre-CBR-I is statistically better than those of Electre-CBR-II, GCBR, MCBR, and ECBR at the 1% significance level. The predictive accuracy of Electre-CBR-II is significantly better than that of GCBR at the 10% level, and significantly better than those of MCBR and ECBR at the 1% level. There are no significant differences between the predictive accuracies of ECBR and MCBR, or of MCBR and GCBR. Both Electre-CBR-I and Electre-CBR-II produce higher predictive performance than the other three CBR models in the experiment. This attests to the effectiveness of the hybrid models and their applicability in financial distress prediction. In the comparison between Electre-CBR-I and Electre-CBR-II, the former outperforms the latter on predictive accuracy. The main difference between Electre-CBR-I and Electre-CBR-II is that veto relations are
introduced in the latter to partition the space of strong difference in detail. This process may reduce total predictive accuracy by over-fitting the training and validating data. To examine this assumption, we employed all data and used LOO-CV to find whether or not Electre-CBR-II outperforms Electre-CBR-I by over-fitting the training and validating data. The optimal KNN for Electre-CBR-I is 9-NN, and the corresponding predictive accuracy is 72.84%. The predictive accuracy of Electre-CBR-II, which introduces evidence vetoing the assertion that two objects are indifferent, is 73.46%. The improved predictive accuracy may result from over-fitting the training and validating data.

7. Conclusion and remarks

The conclusion of this study is that the two hybrid data mining models for financial distress prediction, i.e. Electre-CBR-I and Electre-CBR-II, could produce acceptable predictive performance. The predictive performance of the CBR system in long-term financial distress prediction has been improved significantly, under the assessment of the 30-times hold-out method. This attests to the possibility of applying some techniques of the decision-aiding approach of Electre in the area of data mining.

Fundamental principles of Electre and how to extend them into CBR are presented in detail. We generate two hybrid models of Electre-CBR. The first one, named Electre-CBR-I, is developed on the basis of evidence supporting the assertion that two objects are indifferent. The other one, named Electre-CBR-II, is developed on the basis of evidence supporting and vetoing the assertion that two objects are indifferent. In this research, we use outranking relations for CBR in three ways. First, we adopt an indifference indicator deriving from indifference relations, weak difference relations, and strong difference relations as the similarity measure mechanism. Second, we use a veto indicator deriving from non-veto relations, weak veto relations, and strong veto relations to enhance the mechanism of similarity measure between two cases. Third, we utilize the principle of k-nearest neighbor to generate a prediction for financial distress based on outranking indicators.

The 30-times hold-out method is employed to assess the predictive performance of the various CBR models. This assessment combines the hold-out method and LOO-CV. The data set is divided into a training dataset, a validating dataset, and a testing dataset. Training data and validating data are employed in LOO-CV to obtain optimal parameters of the models. The testing dataset is used to assess the predictive performance of the corresponding models.

From the results of the experiment, it could be found that the two hybrid CBRs, i.e. Electre-CBR-I and Electre-CBR-II, offer a viable approach for financial distress prediction. Empirical results show that they offer significantly better predictive performance than the three CBR models respectively derived from the Euclidean metric, the Manhattan metric, and the grey coefficient metric.

Note that we carried out the experiment on the basis of the initial data distributions, with the optimal feature subset produced by the filter approach of one-way ANOVA, and with data of year t-3. All conclusions drawn should be considered under the circumstances of the experimental design. At the same time, the data we collected are intended to identify long-term financial distress, namely three years ahead of distress, not bankruptcy. Thus, there may be fewer signals that could be used to distinguish companies in distress from healthy companies than to distinguish bankrupt companies from healthy companies. This may be the main reason that the average predictive accuracy of each CBR model is around 70%.

This research also has some limitations. There are some other factors that could enhance the predictive ability of a CBR system.
Predictive performance of CBR may be improved if some optimization algorithms are employed for feature weighting, though it is not always the case. Of course, the generalization of the enhanced model
of CBR by outranking relations should be tested further by applying it to other problem domains or by using data for financial distress or bankruptcy prediction from other countries.

Acknowledgements

This research is partially supported by the National Natural Science Foundation of China (#70801055) and the Zhejiang Provincial Natural Science Foundation of China (No. Y607011). The authors gratefully thank anonymous referees for their useful comments. Meanwhile, we thank Ms. Zhe Li and Ms. Ying Li for their help in checking the manuscript for us.

References

Altman, E., 1968. Financial ratios discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance 23, 589–609.
Back, B., Laitinen, T., Sere, K., 1996. Neural network and genetic algorithm for bankruptcy prediction. Expert Systems with Applications 11, 407–413.
Beaver, W., 1966. Financial ratios as predictors of failure. Journal of Accounting Research 4, 71–111.
Bi, J., Bennett, K., Embrechts, M., et al., 2003. Dimensionality reduction via sparse support vector machines. Journal of Machine Learning Research 3, 1229–1243.
Bouyssou, D., Pirlot, M., 2005. A characterization of concordance relations. European Journal of Operational Research 167, 427–443.
Brans, J.P., Vincke, Ph., 1985. A preference ranking organization method: The Promethee method for multiple criteria decision making. Management Science 31, 647–656.
Brown, C.E., Gupta, U., 1994. Applying case-based reasoning to the accounting domain. Intelligent Systems in Accounting, Finance and Management 3, 205–221.
Bryant, S.M., 1997. A case-based reasoning approach to bankruptcy prediction modeling. Intelligent Systems in Accounting, Finance and Management 6, 195–214.
Chen, L.-H., Hsiao, H.-D., 2008. Feature selection to diagnose a business crisis by using a real GA-based support vector machine: An empirical study. Expert Systems with Applications 35 (3), 1145–1155.
Ding, Y.-S., Song, X.-P., Zen, Y.-M., 2008. Forecasting financial condition of Chinese listed companies based on support vector machine. Expert Systems with Applications 34, 3081–3089.
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. Journal of Machine Learning Research 3, 1157–1182.
Han, J.-W., Kamber, M., 2001. Data Mining Concepts and Techniques. Morgan Kaufmann, San Mateo.
Hand, D., Mannila, H., Smyth, P., 2001. Principles of Data Mining. MIT Press, Cambridge.
Hua, Z.-S., Wang, Y., Xu, X.-Y., et al., 2007. Predicting corporate financial distress based on integration of support vector machine and logistic regression. Expert Systems with Applications 33, 434–440.
Hui, X., Sun, J., 2006. An application of support vector machine to companies’ financial distress prediction. In: Torra, V., Narukawa, Y., Valls, A., et al. (Eds.), Modeling Decisions for Artificial Intelligence. Springer-Verlag, Berlin, pp. 274–282.
Jo, H., Han, I., 1996. Integration of case-based forecasting, neural network, and discriminant analysis for bankruptcy prediction. Expert Systems with Applications 11, 415–422.
Jo, H., Han, I., Lee, H., 1997. Bankruptcy prediction using case-based reasoning, neural network and discriminant analysis for bankruptcy prediction. Expert Systems with Applications 13, 97–108.
Kim, K.-J., 2004. Toward global optimization of case-based reasoning systems for financial forecasting. Applied Intelligence 21, 239–249.
Kolodner, J.L., 1993. Case-Based Reasoning. Morgan Kaufmann, San Francisco.
Kumar, P.R., Ravi, V., 2007. Bankruptcy prediction in banks and firms via statistical and intelligent techniques – A review. European Journal of Operational Research 180, 1–28.
Li, H., Sun, J., 2008. Ranking-order case-based reasoning for financial distress prediction. Knowledge-Based Systems. doi:10.1016/j.knosys.2008.03.047.
Liang, L., Wu, D., 2005. An application of pattern recognition on scoring Chinese corporations financial conditions based on back propagation neural network. Computers & Operations Research 32, 1115–1129.
Li, H., Sun, J., Sun, B., 2007. Financial distress prediction based on OR-CBR in the principle of k-nearest neighbors. Expert Systems with Applications. doi:10.1016/j.eswa.2007.09.038.
Lin, F.-Y., McClean, S., 2001. A data mining approach to the prediction of corporate failure. Knowledge-Based Systems 14, 189–195.
Lin, R.-H., Wang, Y.-T., Wu, C.-H., et al., 2007. Developing a business failure prediction model via RST, GRA and CBR. Expert Systems with Applications. doi:10.1016/j.eswa.2007.11.068.
Martin, D., 1977. Early warning of bank failure: A Logit regression approach. Journal of Banking and Finance 1, 249–276.
Min, S.-H., Lee, J.-M., Han, I., 2006. Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Systems with Applications 31, 652–660.
Odom, M., Sharda, R., 1990. A neural networks model for bankruptcy prediction. In: Proceedings of International Joint Conference on Neural Networks, San Diego, CA, pp. 163–168.
Pal, S.K., Shiu, S., 2004. Foundations of Soft Case-Based Reasoning. Wiley, New Jersey.
Pan, R., Yang, Q., Pan, J., 2007. Mining competent case bases for case-based reasoning. Artificial Intelligence 171, 1039–1068.
Park, C.-S., Han, I., 2002. A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications 23, 255–264.
Roy, B., 1971. Problems and methods with multiple objective functions. Mathematical Programming 1, 239–266.
Roy, B., 1991. The outranking approach and the foundations of ELECTRE methods. Theory and Decision 31, 49–73.
Roy, B., 1996. Multicriteria Methodology for Decision Aiding. Kluwer Academic Publishers, Dordrecht.
Roy, B., Słowiński, R., 2008. Handling effects of reinforced preference and counterveto in credibility of outranking. European Journal of Operational Research 188, 185–190.
Roy, B., Vincke, P., 1984. Relational systems of preference with one or more pseudo-criteria: Some new concepts and results. Management Science 30, 1323–1335.
Schank, R., 1982. Dynamic Memory. Cambridge University Press, New York.
Schank, R., Abelson, R., 1977. Scripts, Plans, Goals and Understanding. Erlbaum, Hillsdale, New Jersey.
Shin, K.-S., Lee, T.-S., Kim, H.-J., 2005. An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications 28, 127–135.
Shiu, S.C.K., Pal, S.K., 2004. Case-based reasoning: Concepts, features and soft computing. Applied Intelligence 21, 233–238.
Slowinski, R., Stefanowski, J., 1994. Rough classification with valued closeness relation. In: Diday, E., Lechevallier, Y., Schrader, M., et al. (Eds.), New Approaches in Classification and Data Analysis. Springer-Verlag, Berlin, pp. 482–489.
Slowinski, R., Stefanowski, J., 1996. Rough set reasoning about uncertain data. Fundamenta Informaticae 27, 229–243.
Smet, Y.D., Guzman, L.M., 2004. Towards multicriteria clustering: An extension of the k-means algorithm. European Journal of Operational Research 158, 390–398.
Sun, J., Hui, X.-F., 2006. Financial distress prediction based on similarity weighted voting CBR. In: Li, X., Zaiane, R., Li, Z. (Eds.), Advanced Data Mining and Applications. Springer-Verlag, Berlin, pp. 947–958.
Sun, J., Li, H., 2007. Financial distress early warning based on group decision making. Computers & Operations Research. doi:10.1016/j.cor.2007.11.005.
Sun, J., Li, H., 2008a. Data mining method for listed companies’ financial distress prediction. Knowledge-Based Systems 21, 1–5.
Sun, J., Li, H., 2008b. Listed companies’ financial distress prediction based on weighted majority voting combination of multiple classifiers. Expert Systems with Applications 35 (3), 818–827.
Takeda, E., 2001. A method for multiple pseudo-criteria decision problems. Computers & Operations Research 28, 1427–1439.
Vincke, Ph., 1986. Analysis of MCDA in Europe. European Journal of Operational Research 25, 160–168.
Watson, I., 1999. Case-based reasoning is methodology, not a technology. Knowledge-Based Systems 12, 303–308.
Witten, I.H., Frank, E., 2000. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA.
Yip, A., Deng, H., 2003. A case-based reasoning approach to business failure prediction. In: Palade, V., Howlett, R.J., Jain, L.C. (Eds.), Knowledge-Based Intelligent Information and Engineering Systems. Springer-Verlag, Berlin, pp. 1075–1080.
Yip, A., 2004. Predicting business failure with a case-based reasoning approach. In: Negoita, M., Howlett, R., Jain, L., et al. (Eds.), Knowledge-Based Intelligent Information and Engineering Systems. Springer-Verlag, Berlin, pp. 665–671.
Zhou, Z.-H., 2003. Three perspectives of data mining. Artificial Intelligence 143, 139–146.