A case-based reasoning system with the two-dimensional reduction technique for customer classification

Hyunchul Ahn a, Kyoung-jae Kim b,*, Ingoo Han a

a Graduate School of Management, Korea Advanced Institute of Science and Technology, 207-43 Cheongrangri-Dong, Dongdaemun-Gu, Seoul 130-722, Korea
b Department of Management Information Systems, Dongguk University, 3-26 Pil-Dong, Chung-Gu, Seoul 100-715, Korea
* Corresponding author (K.-j. Kim). Tel.: +82 2 2260 3324; fax: +82 2 2260 3684.
Abstract

Many studies have tried to optimize the parameters of case-based reasoning (CBR) systems. Among them, the selection of appropriate features, to measure the similarity between the input and stored cases more precisely, and the selection of appropriate instances, to eliminate the noise that distorts prediction, have been popular. However, these approaches have been applied independently, although their simultaneous optimization may improve prediction performance synergistically. This study proposes a case-based reasoning system with a two-dimensional reduction technique. In this study, the vertical and horizontal dimensions of the research data are reduced through our research model, a hybrid feature and instance selection process using genetic algorithms. We apply the proposed model to a real-world customer classification case which predicts customers' buying behavior for a specific product using their demographic characteristics. Experimental results show that the proposed technique may improve the classification accuracy and outperform various optimized models of the typical CBR system.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Case-based reasoning; Genetic algorithms; Feature selection; Instance selection; Customer relationship management
1. Introduction

Case-based reasoning (CBR) is a problem-solving technique that reuses past experiences to find a solution. It is generally quite simple to implement, yet it often handles complex and unstructured decision-making problems very effectively. Moreover, its prediction model is kept up to date because the case-base is revised in real time, which is a very important feature for real-world applications. Thus, it has been applied to various problem-solving areas including manufacturing, finance and marketing (see Chiu, 2002; Chiu, Chang, & Chiu, 2003; Kim & Han, 2001; Shin & Han, 1999; Yin, Liu, & Wu, 2002).
This study applies CBR to a customer classification model, by which a company classifies its customers into predefined groups with similar behavior patterns. Companies usually build a customer classification model to find the prospects for a specific product. In this case, they classify the prospects into either purchasing or non-purchasing groups. It is essential to find proper candidate buyers for the target products to obtain successful results from one-to-one marketing. Consequently, it is important to build an effective customer classification model for one-to-one marketing that can predict a candidate customer's buying behavior precisely. It is, however, not easy to obtain highly accurate classification results by applying CBR because there is no general mechanism for designing effective CBR systems. In particular, it is very important to design effective case indexing and retrieval, in which the input case is matched against the stored cases. There have been three research issues regarding effective case indexing and retrieval – feature
selection, feature weighting and instance selection. Feature selection means the choice of an appropriate feature subset. Feature weighting means the optimization of the weight of each feature. Instance selection chooses a small subset of training instances that represents the genuine pattern of the accumulated cases. Prior studies on optimizing CBR have pursued these three methods using various techniques, including statistical and artificial intelligence techniques (see Babu & Murty, 2001; Cardie, 1993; Domingos, 1997; Huang, Chiang, Shieh, & Grimson, 2002; Liao, Zhang, & Mount, 2000; Lipowezky, 1998; Sanchez, Pla, & Ferri, 1997; Shin & Han, 1999; Siedlecki & Sklanski, 1989; Skalak, 1994; Wettschereck, Aha, & Mohri, 1997; Yan, 1993). However, most studies have applied these methods independently, or sometimes combined them sequentially, although simultaneous optimization of the factors may lead to better results. For this reason, simultaneous optimization of several components of CBR has recently attracted researchers.

In this article, we propose a two-dimensional reduction technique using a genetic algorithm (GA) to optimize feature and instance selection simultaneously. In addition, we apply the proposed model to a real-world customer classification case and present experimental results from the application.

We organize this article as follows. In Section 2, we provide a brief review of prior studies, and we describe our proposed model – the two-dimensional reduction technique – in the next section. Section 4 presents the research design and experiments. In Section 5, the empirical results are presented and discussed. The final section suggests the contributions and limitations of this study.

2. Literature review

Here, we first review the basic concepts of CBR and GA. After that, we introduce prior studies that attempt to optimize components of CBR such as feature and instance selection.

2.1. Case-based reasoning and genetic algorithms

While other major artificial intelligence techniques depend on generalized relationships between problem descriptors and conclusions, CBR utilizes specific knowledge of previously experienced, concrete problem situations, so it is effective for complex and unstructured problems and easy to update (Shin & Han, 1999). Due to these strengths, CBR has been applied to various problem-solving areas, e.g. intelligent product catalogs for Internet shopping malls, conflict resolution in air traffic control, medical diagnosis and even the design of semiconductors (Turban & Aronson, 2001). The process of CBR is shown graphically in Fig. 1 (Kolodner, 1993).
[Fig. 1. General process of case-based reasoning: an input problem passes through (1) case representation, (2) case indexing, (3) case retrieval and (4) case adaptation, drawing on the case-base.]
Among the steps of the above flowchart, case indexing and retrieval are the most important because the performance of a CBR system usually depends on them. In these steps, the CBR system retrieves the cases most similar to the input from the case memory. The retrieved cases significantly affect the quality of the solution, so it is very important to design an effective indexing and retrieval method. In particular, feature and instance selection for measuring similarity have been controversial issues in designing CBR systems.

In addition, the genetic algorithm (GA) has been widely applied as a tool for the optimization of CBR. GA is an optimization method that attempts to incorporate ideas of natural evolution. Its procedure improves the search results by constantly trying various possible solutions with genetic operations. As such, it represents an intelligent exploitation of a random search within a defined search space to solve a problem.

In general, the process of GA is as follows. First of all, GA randomly generates a set of solutions, called the initial population. Each solution is called a chromosome, and it is usually in the form of a binary string. After generating the initial population, GA computes the fitness function of each chromosome. The fitness function is a user-defined function that returns the evaluation result of each chromosome, so a chromosome with a higher fitness value is more likely to survive into the next generation. Typically, classification accuracy (performance) is used as the fitness function for classification problems. After that, offspring are generated by applying genetic operators. Among the various genetic operators, selection, crossover and mutation are the most fundamental and popular. The selection operator determines which chromosomes will survive. In crossover, substrings from pairs of chromosomes are exchanged to form new pairs of chromosomes. In mutation, with a very small mutation rate, arbitrarily selected bits in a chromosome are inverted. These steps of evolution continue until the stopping conditions are satisfied. In most cases, the stopping criterion is
set to the maximum number of generations (Chiu, 2002; Fu & Shen, 2004; Han & Kamber, 2001).

2.2. Feature selection approaches

Feature selection is the process of picking a subset of features that are relevant to the target concept and removing irrelevant or redundant features. It is an important factor in the performance of a classification model, so it has been one of the most popular research issues in designing most classification models, including CBR. Stearns (1976) proposed the sequential forward selection (SFS) method, which finds optimal feature subsets with the highest accuracy by varying the number of features. Siedlecki and Sklanski (1989) proposed a feature selection algorithm based on genetic search, and Cardie (1993) presented a decision tree approach to feature subset selection. Skalak (1994) and Domingos (1997) proposed a hill climbing algorithm and a clustering method for feature subset selection, respectively. In addition, Cardie and Howe (1997) selected relevant features using a decision tree, and Jarmulak, Craw, and Rowe (2000) selected relevant features using a decision tree algorithm such as C4.5.

2.3. Instance selection approaches

While feature selection enhances the measurement of similarity between cases, instance selection is a technique to reduce the noise contained in a case-base. It is also called 'editing' or 'prototype selection'. Instance selection chooses an appropriately reduced subset of a case-base and applies the nearest-neighbor rule to the selected subset. It may increase the performance of CBR systems significantly when a case-base contains many distorted cases that disturb the prediction of the CBR model. So, it has long been another popular research issue in CBR systems.

There are many different approaches to selecting appropriate instances. First of all, Hart (1968) suggested a condensed nearest neighbor algorithm, and Wilson
(1972) proposed Wilson's method. These early algorithms were based on simple information gain theory, so they are easy to apply. More recently, mathematical tools and AI techniques have been widely applied to instance selection. For example, Sanchez et al. (1997) proposed a proximity graph approach, and Lipowezky (1998) suggested linear programming methods for instance selection. Yan (1993) and Huang et al. (2002) presented neural network-based instance selection methods. In addition, a GA approach was proposed by Babu and Murty (2001).

2.4. Simultaneous optimization approaches

The main idea of the simultaneous optimization approach is to reduce the dimensions of the features and the instances at the same time. By doing so, conventional CBR may find more relevant features for more representative instances, so it is possible to obtain synergetic effects. Unfortunately, however, there are few studies on this technique because of its short history. The first two-dimensional reduction approach was proposed by Kuncheva and Jain (1999). They proposed the simultaneous optimization of feature and instance selection using GA, and they compared their model to the sequential combination of traditional feature selection and instance selection algorithms. After the pioneering work of Kuncheva and Jain (1999), Rozsypal and Kubat (2003) proposed a similar model. However, they pointed out that the model of Kuncheva and Jain (1999) has defects when there are many training examples. Thus, they used different designs for the chromosome and the fitness function of the GA search, and they showed empirically that their model outperforms the model of Kuncheva and Jain. Fig. 2 gives an example of the two-dimensional reduction process.

However, these studies applied the model only to artificially generated datasets, so the usefulness of the model has not been validated in a real-world environment. In this study, we apply the two-dimensional reduction technique to a real-world customer classification case in order to validate the applicability of the proposed model in the real world.
Fig. 2. Two-dimensional reduction process.
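As a minimal illustration of the reduction sketched in Fig. 2 (our own sketch, not code from the studies above), selecting features and instances simultaneously amounts to masking the columns and the rows of the data matrix in one step:

```python
import numpy as np

# Toy case-base: 6 instances (rows) x 4 features (columns).
X = np.arange(24, dtype=float).reshape(6, 4)

# A candidate solution is a pair of binary masks, one per dimension.
feature_mask = np.array([1, 0, 1, 1], dtype=bool)          # keep features 0, 2, 3
instance_mask = np.array([1, 1, 0, 1, 0, 1], dtype=bool)   # keep instances 0, 1, 3, 5

# Two-dimensional reduction: both masks are applied at once.
X_reduced = X[np.ix_(instance_mask, feature_mask)]
print(X_reduced.shape)  # (4, 3) -- rows and columns shrink simultaneously
```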
3. The two-dimensional reduction technique using genetic algorithms

In order to enhance the performance of typical CBR systems, this study uses a two-dimensional reduction technique via feature selection and instance selection processes using genetic algorithms. We call this model TRCBR (Two-dimensional Reduction using feature selection and instance selection for CBR). The framework of TRCBR is shown in Fig. 3. As shown in Fig. 3, the process consists of seven steps in total. A detailed explanation of each step of the proposed model follows.

Step 1. Design the structure of chromosomes. To apply GA to search for the optimal parameters, they have to be coded on a chromosome in the form of a binary string. The structure of the chromosomes for the model is presented in Fig. 4. Each code for feature selection and instance selection is a 1-bit digit – '0' or '1' – where '0' means the corresponding item is not selected and '1' means it is selected. Consequently, the length of each chromosome is k + n bits, where k is the number of features and n is the number of instances, as shown in Fig. 4.

Step 2. Generate the initial population and define the fitness function. The population (a set of seed chromosomes for finding the optimal parameters) is initiated with random values before the search process. The population is
searched for the encoded chromosome that maximizes the specific fitness function. The objective of this study is to determine the appropriate features and relevant instances for CBR systems that produce the highest classification accuracy for the test data (Kim, 2004; Shin & Han, 1999). Thus, we set the classification accuracy on the test data as the fitness function for the GA. The fitness function (f_T) for the test set T can be expressed as Eq. (1):

  Maximize  f_T = \sum_{k=1}^{n} hit_k    (1)
where n is the size of the test dataset T and hit_k is the matched result between the expected outcome (EO_k) and the actual outcome (AO_k), i.e. if EO_k = AO_k then hit_k is 1; otherwise hit_k is 0.

Step 3. Perform the CBR process. In this step, the parameters set in the previous step are applied to the CBR system and the general reasoning process of CBR proceeds. We use the weighted average of the Euclidean distance on each feature as the similarity measure, and 3-NN (three-nearest-neighbor) matching as the method of case retrieval. That is, the system searches for the three nearest neighbors of an input case and produces the final classification result by majority voting among them. In this paper, we set the k parameter of k-NN to 3 because 3-NN showed the best prediction accuracy among 1-NN, 3-NN, 5-NN, 7-NN and 9-NN.
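A minimal sketch of steps 1–3 in Python may clarify how a chromosome is decoded and scored. This is our illustration, not the authors' code: the function names are ours, and we use a plain (unweighted) Euclidean distance, whereas the paper uses a weighted average of the per-feature distances.

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_ref, y_ref, feat_mask, inst_mask, k=3):
    """Classify one input case by k-NN over the reduced case-base."""
    Xr = X_ref[inst_mask][:, feat_mask]         # instance and feature selection
    dists = np.sqrt(((Xr - x[feat_mask]) ** 2).sum(axis=1))  # Euclidean distance
    nearest = np.argsort(dists)[:k]             # indices of the k nearest cases
    votes = Counter(y_ref[inst_mask][nearest])  # majority voting among neighbors
    return votes.most_common(1)[0][0]

def fitness(chrom, X_ref, y_ref, X_test, y_test, n_feat):
    """Eq. (1): the number of correct classifications on the test set."""
    feat_mask = chrom[:n_feat].astype(bool)     # first k bits: feature selection
    inst_mask = chrom[n_feat:].astype(bool)     # remaining n bits: instance selection
    if not feat_mask.any() or inst_mask.sum() < 3:  # guard degenerate chromosomes
        return 0
    return sum(knn_classify(x, X_ref, y_ref, feat_mask, inst_mask) == y
               for x, y in zip(X_test, y_test))
```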
[Fig. 3. Steps of TRCBR: (1) design the structure of chromosomes; (2) generate the initial population and define the fitness function; (3) perform the CBR process, using the test and reference case-bases; (4) calculate the fitness value for each chromosome; (5) apply genetic operators and produce a new generation; (6) if the stopping criteria are not satisfied, return to step 3; (7) apply the selected parameters to the hold-out case-base.]
[Fig. 4. Gene structure for TRCBR: each chromosome in the population (a generation) is the concatenation of k feature-selection bits (F1, F2, F3, ..., Fk) and n instance-selection bits (I1, I2, I3, ..., In); e.g. chromosome 1 = (1, 0, 1, ..., 1, 1, 0 | 1, 0, 1, ..., 0, 0, 1).]
Step 4. Calculate the fitness value for each chromosome. After the reasoning process has been applied to all test cases, the values of the fitness function (f_T) for the test set T are updated. They are used as indicators of the quality of each chromosome.

Step 5. Apply genetic operators and produce a new generation. In step 5, the GA's evolution proceeds in the direction that maximizes the value of the fitness function. It includes selection of the fittest, crossover and mutation. By applying these operators, a new generation of the population is produced.

Step 6. Iterate steps 3–5 until the stopping criteria are satisfied. Steps 3–5 are repeated until the stopping conditions are satisfied. When the evolution process finishes according to the stopping criteria, the best parameters – the information about the finally selected features and instances – are stored in memory.

Step 7. Apply the selected features and instances to the hold-out cases. In the last stage, the system takes the parameters – the optimal selections of features and instances – whose performance on the test data is the best, and applies them to the hold-out data to check the generalizability of the selected parameters. Parameters optimized by GA sometimes do not fit unknown data, so this phase is required to check for overfitting.
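The evolution loop of steps 4–6 can be sketched as follows. This is our illustration under stated assumptions: the paper uses the commercial Evolver tool, so the roulette-wheel selection, the one-point crossover, and the mapping of the chromosome-level mutation rate to a per-bit flip probability are our choices; the control-parameter values follow Section 4.2, and `fitness` refers to the earlier sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(pop, fitness_fn, n_gen=20, pc=0.7, pm=0.1):
    """Evolve a population of binary chromosomes (one per row of pop)."""
    for _ in range(n_gen):
        fit = np.array([fitness_fn(c) for c in pop], dtype=float)
        # Step 5a: fitness-proportionate (roulette-wheel) selection.
        p = fit / fit.sum() if fit.sum() > 0 else np.full(len(pop), 1 / len(pop))
        parents = pop[rng.choice(len(pop), size=len(pop), p=p)]
        # Step 5b: one-point crossover on consecutive pairs with probability pc.
        children = parents.copy()
        for i in range(0, len(pop) - 1, 2):
            if rng.random() < pc:
                cut = rng.integers(1, pop.shape[1])
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        # Step 5c: mutation -- invert arbitrarily selected bits at a small rate
        # (spreading the chromosome-level rate pm over the bits is our assumption).
        flips = rng.random(children.shape) < pm / pop.shape[1]
        pop = np.where(flips, 1 - children, children)
    return pop

# Usage (Section 4.2 settings): 200 organisms, chromosomes of length k + n.
# pop = rng.integers(0, 2, size=(200, k + n)); final = evolve(pop, my_fitness)
```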
4. The research design and experiments

4.1. Application data

The research data is collected from an online diet portal site in Korea which provides a full range of online diet services, such as diet information, community services and a shopping mall. In the case of a diet shopping mall, customers generally have a clear objective for the shopping service they want, and they are very sensitive to personalized care and services. Thus, they do not hesitate to provide their personal information. As a result, the target company of our research possesses detailed and accurate customer
information, and it has a desire to use this information as a source of direct marketing for selling its products. Based on this motivation, we build a customer classification model to predict the potential buyers of the company's products among the users of its web site.

The web site deals with four kinds of products: (1) a total diet counseling service, (2) functional meals and recipes which are helpful for dieting, (3) fitness equipment, and (4) various accessories. Among them, functional meals and recipes are the most strategic product group for the mall because its parent company produces these products in an offline environment. So, the mall is interested in identifying the potential buyers of the functional meals and recipes, and we develop a model to classify the customers who are expected to purchase them.

The experimental data includes 980 cases consisting of purchasing and non-purchasing customers. It contains demographic variables and the purchase status of the corresponding user. The purchase status of each user is categorized as '0' or '1' and is used as the dependent variable: '0' means that the user has not purchased the company's products and '1' means he or she made a purchase. We collect a total of 46 independent variables including demographic and other personal information. To eliminate irrelevant variables and make the reasoning process more efficient, we adopt two statistical methods: two-sample t-tests for ratio variables and chi-square tests for nominal variables. Finally, we select only the 14 factors which prove to be the most influential in the purchase of the company's product. Table 1 presents detailed information about the selected features.

Table 1
Selected features and their descriptions

Feature name   Description                                                          Range
AGE            Age                                                                  Continuous (years)
ADD0           Residence is located in Seoul (the capital of South Korea)           0: False, 1: True
ADD1           Residence is located in a big city                                   0: False, 1: True
OCCU0          Customer works for a company                                         0: False, 1: True
OCCU2          Customer is a student                                                0: False, 1: True
OCCU4          Customer runs his or her own business                                0: False, 1: True
SEX            Gender                                                               0: Male, 1: Female
LOSS4          Customer needs to lose weight around the legs and thighs             0: Not exist, 1: Exist
PUR0           Customer diets for beauty                                            0: False, 1: True
HEIGHT         Height                                                               Continuous (m)
BMI            Body mass index, the measure of body fat based on height and         Continuous (kg/m^2)
               weight that applies to both adult men and women, calculated as
               BMI (kg/m^2) = weight (kg) / (height (m))^2
E01            Prior experience with 'functional diet food'                         0: Not exist, 1: Exist
E02            Prior experience with 'diet drugs'                                   0: Not exist, 1: Exist
E05            Prior experience with 'one food diet'                                0: Not exist, 1: Exist
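The variable-screening step described above can be sketched as follows. This is our illustration: the paper does not publish its screening code, and the 5% significance cut-off, the rule for telling ratio from nominal variables, and the column name PURCHASE are our assumptions.

```python
import pandas as pd
from scipy import stats

def screen_features(df, target='PURCHASE', alpha=0.05):
    """Two-sample t-tests for ratio variables and chi-square tests for
    nominal variables against the purchase flag; keep significant ones."""
    buyers, others = df[df[target] == 1], df[df[target] == 0]
    selected = []
    for col in df.columns.drop(target):
        if df[col].nunique() > 2:          # ratio variable (e.g. AGE, BMI)
            _, p = stats.ttest_ind(buyers[col], others[col])
        else:                              # nominal 0/1 variable (e.g. SEX)
            table = pd.crosstab(df[col], df[target])
            _, p, _, _ = stats.chi2_contingency(table)  # (chi2, p, dof, expected)
        if p < alpha:
            selected.append(col)
    return selected                        # e.g. the 14 features of Table 1
```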
Table 2
Classification accuracy of k-NN for CCBR

k                                       1         3         5         7         9
Performance for hold-out case-base      52.04%    56.12%    55.61%    54.08%    53.57%

Table 3
The feature and instance selection of the optimized CBR models

Feature name       CCBR        FCBR         ICBR        TRCBR
Feature selections
AGE                Selected    Unselected   Selected    Selected
ADD0               Selected    Selected     Selected    Unselected
ADD1               Selected    Selected     Selected    Selected
OCCU0              Selected    Unselected   Selected    Unselected
OCCU2              Selected    Unselected   Selected    Selected
OCCU4              Selected    Unselected   Selected    Selected
SEX                Selected    Selected     Selected    Unselected
LOSS4              Selected    Selected     Selected    Selected
PUR0               Selected    Unselected   Selected    Selected
HEIGHT             Selected    Unselected   Selected    Selected
BMI                Selected    Selected     Selected    Unselected
E01                Selected    Selected     Selected    Selected
E02                Selected    Selected     Selected    Selected
E05                Selected    Selected     Selected    Selected
Instance selections
# of selections    588         588          524         478
Ratio (%)          100%        100%         89.12%      81.29%
We split the data into three groups: reference, test and hold-out case-bases. The proportions of these three case-bases are 60% (588 cases), 20% (196 cases) and 20% (196 cases), respectively.

4.2. Research design and system development

For the control parameters of the GA search in our experiments, we use 200 organisms in the population and set the crossover rate to 0.7 and the mutation rate to 0.1. As a stopping condition, we use 4000 trials (20 generations). To test the effectiveness of TRCBR, we also apply three different CBR models to the same dataset. The first model, labeled CCBR (Conventional CBR), uses a conventional approach for the reasoning process of CBR. This model considers all initially available features as the feature subset.
That is to say, there is no special process of feature subset selection. In addition, instance selection is not considered here, so all instances are used in this model. The second model selects relevant features using genetic algorithms and is called FCBR (Feature selection for CBR). Here, we optimize feature selection by GA but leave instance selection unaddressed. Siedlecki and Sklanski (1989) and Kim (2004) proposed similar models. The third model uses GA to select a relevant instance subset and is called ICBR (Instance selection for CBR). In this model, we optimize instance selection by GA but leave feature selection unaddressed. Babu and Murty (2001) proposed a similar model.

The experiments are conducted with our own prototype software, developed using Microsoft Excel 2003
and Evolver Industrial Version 4.08 (Palisade Software, www.palisade.com), a commercial GA tool. The 3-NN algorithm is implemented in VBA (Visual Basic for Applications) in Microsoft Excel 2003, and the VBA code is integrated with Evolver to optimize the parameters by GA. Fig. 5 shows the working screen of the developed experimental system.

[Fig. 5. Working screen of the experimental system.]

5. Experimental results
For CCBR, we apply the k-NN algorithm, varying the parameter k over the odd numbers from 1 to 9. We find that 3-NN shows the best performance, as illustrated in Table 2. So, as indicated earlier, we apply 3-NN to all the CBR models: CCBR, FCBR, ICBR and our proposed model.

Table 3 presents the results of CCBR, FCBR, ICBR and TRCBR. We can see that ICBR and TRCBR use only a portion of the training case-base: these techniques select about 81–89% of the total training samples. This can be interpreted as a sign that the original dataset contains some noisy cases. Fig. 6 compares the scattergrams of CCBR, ICBR and TRCBR, whose x-axis represents 'BMI (body mass index, kg/m^2)' and whose y-axis indicates 'AGE (years)'. Although the difference is not clear-cut, in Fig. 6 TRCBR shows the difference in patterns between the purchasing group and the non-purchasing group slightly more precisely than CCBR and ICBR do.

[Fig. 6. Scattergrams of (a) CCBR, (b) ICBR and (c) TRCBR.]

In addition, we compare the prediction performance of the proposed model and the alternative models. Table 4 reports the average prediction accuracy of each model. Among the models, TRCBR shows the highest accuracy (64.29%) on the hold-out dataset, followed by ICBR (63.27%) and FCBR (60.20%). On this dataset, the CBR algorithms that contain instance selection procedures show better results, which again suggests that the dataset used in the study contains considerable noise. In addition, the results show that TRCBR improves the prediction accuracy of the typical CBR system by about 8.17 percentage points (from 56.12% to 64.29%).

We use the two-sample test for proportions to examine whether the differences in predictive accuracy between the proposed model and the other comparative algorithms are statistically significant. This test checks whether there is a difference between two probabilities when the prediction accuracy of one method is compared with that of another (Harnett & Soni, 1991). The null hypothesis is H0: p_i − p_j = 0 and the alternative hypothesis is Ha: p_i − p_j ≠ 0, where i = 1, ..., 3, j = 2, ..., 4 and p_k denotes the classification performance of the kth method. Table 5 shows the results of the test.
Table 4
Average prediction accuracy of the models

Model    Test dataset (%)    Hold-out dataset (%)    Remarks
CCBR     62.24               56.12                   k of k-NN = 3
FCBR     62.76               60.20
ICBR     66.33               63.27
TRCBR    70.92               64.29
Table 5
The results of the two-sample test for proportions

Pair of the comparative models    Z-score    p-Value    Decision
CCBR vs. FCBR                     0.8191     0.2064     Accept H0
CCBR vs. ICBR                     1.4416     0.0747     Reject H0*
FCBR vs. ICBR                     0.6235     0.2665     Accept H0
TRCBR vs. CCBR                    1.6510     0.0494     Reject H0**
TRCBR vs. FCBR                    0.8335     0.2023     Accept H0
TRCBR vs. ICBR                    0.2102     0.4168     Accept H0

* Significant at the 10% level.
** Significant at the 5% level.
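The z-scores and one-tailed p-values in Table 5 can be reproduced with the standard pooled two-proportion z-test, a minimal sketch of which follows. Assumptions on our part: that Harnett and Soni (1991) use the pooled variant, and the example hit counts, which are rounded back from the reported accuracies.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_z(hits_i, hits_j, n=196):
    """Pooled z-test for the difference between two classification
    accuracies, each measured on a hold-out set of the same size n."""
    p_i, p_j = hits_i / n, hits_j / n
    p_pool = (hits_i + hits_j) / (2 * n)            # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (2 / n))      # standard error of p_i - p_j
    z = abs(p_i - p_j) / se
    return z, 1 - norm.cdf(z)                       # z-score, one-tailed p-value

# TRCBR (64.29% ~ 126/196) vs. CCBR (56.12% ~ 110/196):
z, p = two_proportion_z(126, 110)                   # z ~ 1.651, p ~ 0.049
```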
(Harnett & Soni, 1991). In the test, the null hypothesis is that H0: pi pj = 0 where i = 1, . . . , 3 and j = 2, . . . , 4, while the alternative hypothesis is that Ha: pi pj 5 0 where i = 1, . . . , 3 and j = 2, . . . , 4. pk means the classification performance of the kth method. Table 5 shows the results for the test. As shown in Table 5, TRCBR outperforms CCBR at the 5% statistical significance level, but it does not outperform other two models with statistical significance. 6. Conclusions In this paper, we have suggested a novel model, TRCBR, to improve the performance of the typical CBR system. This paper uses GA as a tool to optimize the feature selection and instance selection simultaneously. From the results of the experiment, we show that our proposed model may outperform other comparative algorithms such as CCBR, FCBR, and ICBR in the case of customer classification. This study has some limitations. First of all, it takes too much computational time to obtain optimal parameters for the proposed model. So, the efforts to make the proposed model more efficient should be followed in the future to apply our model to real-world general cases. Second, there are other factors which enhance the performance of the CBR system that may be incorporated with the simultaneous optimization model. For example, feature weighting is another factor to be optimized in the CBR system (Liao et al., 2000; Shin & Han, 1999; Wettschereck et al., 1997). In addition, k parameter of k-NN, the number of cases that combine, may be another parameter to be optimized (Ahn, Kim, & Han, 2003; Lee & Park, 1999). Building a universal simultaneous optimization model for CBR including feature and instance selection as well as other factors like k parameter and feature weights may improve the overall performance of the CBR system. Finally, the results of this study may depend on the experimental dataset. In particular, the research dataset is relatively small and seems to be noisy, so it is difficult to distinguish the prediction ability of each model. Consequently, the generalizability of the proposed model should be tested further by applying it to other problem domains in the future.
References

Ahn, H., Kim, K.-j., & Han, I. (2003). Determining the optimal number of cases to combine in an effective case-based reasoning system using genetic algorithms. In Proceedings of the international conference of the Korea intelligent information systems society 2003 (ICKIISS 2003), Seoul, Korea (pp. 178–184).
Babu, T. R., & Murty, M. N. (2001). Comparison of genetic algorithm based prototype selection schemes. Pattern Recognition, 34(2), 523–525.
Cardie, C. (1993). Using decision trees to improve case-based learning. In Proceedings of the tenth international conference on machine learning (pp. 25–32). San Francisco, CA: Morgan Kaufmann.
Cardie, C., & Howe, N. (1997). Improving minority class prediction using case-specific feature weights. In Proceedings of the fourteenth international conference on machine learning (pp. 57–65). San Francisco, CA: Morgan Kaufmann.
Chiu, C. (2002). A case-based customer classification approach for direct marketing. Expert Systems with Applications, 22(2), 163–168.
Chiu, C., Chang, P. C., & Chiu, N. H. (2003). A case-based expert support system for due-date assignment in a wafer fabrication factory. Journal of Intelligent Manufacturing, 14(3–4), 287–296.
Domingos, P. (1997). Context-sensitive feature selection for lazy learners. Artificial Intelligence Review, 11(1–5), 227–253.
Fu, Y., & Shen, R. (2004). GA based CBR approach in Q&A system. Expert Systems with Applications, 26(2), 167–170.
Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. San Francisco, CA: Morgan Kaufmann.
Harnett, D. L., & Soni, A. K. (1991). Statistical methods for business and economics. Massachusetts: Addison-Wesley.
Hart, P. E. (1968). The condensed nearest neighbor rule. IEEE Transactions on Information Theory, 14(3), 515–516.
Huang, Y. S., Chiang, C. C., Shieh, J. W., & Grimson, E. (2002). Prototype optimization for nearest-neighbor classification. Pattern Recognition, 35(6), 1237–1245.
Jarmulak, J., Craw, S., & Rowe, R. (2000). Self-optimizing CBR retrieval. In Proceedings of the twelfth IEEE international conference on tools with artificial intelligence, Vancouver, Canada (pp. 376–383).
Kim, K. (2004). Toward global optimization of case-based reasoning systems for financial forecasting. Applied Intelligence, 21(3), 239–249.
Kim, K., & Han, I. (2001). Maintaining case-based reasoning systems using a genetic algorithms approach. Expert Systems with Applications, 21(3), 139–145.
Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
Kuncheva, L. I., & Jain, L. C. (1999). Nearest neighbor classifier: Simultaneous editing and feature selection. Pattern Recognition Letters, 20(11–13), 1149–1156.
Lee, H. Y., & Park, K. N. (1999). Methods for determining the optimal number of cases to combine in an effective case based forecasting system. Korean Journal of Management Research, 27(5), 1239–1252.
Liao, T. W., Zhang, Z. M., & Mount, C. R. (2000). A case-based reasoning system for identifying failure mechanisms. Engineering Applications of Artificial Intelligence, 13(2), 199–213.
Lipowezky, U. (1998). Selection of the optimal prototype subset for 1-NN classification. Pattern Recognition Letters, 19(10), 907–918.
Rozsypal, A., & Kubat, M. (2003). Selecting representative examples and attributes by a genetic algorithm. Intelligent Data Analysis, 7(4), 291–304.
Sanchez, J. S., Pla, F., & Ferri, F. J. (1997). Prototype selection for the nearest neighbour rule through proximity graphs. Pattern Recognition Letters, 18(6), 507–513.
Shin, K. S., & Han, I. (1999). Case-based reasoning supported by genetic algorithms for corporate bond rating. Expert Systems with Applications, 16(2), 85–95.
Siedlecki, W., & Sklanski, J. (1989). A note on genetic algorithms for large-scale feature selection. Pattern Recognition Letters, 10(5), 335–347.
Skalak, D. B. (1994). Prototype and feature selection by sampling and random mutation hill climbing algorithms. In Proceedings of the eleventh international conference on machine learning (pp. 293–301). New Brunswick, NJ: Morgan Kaufmann.
Stearns, S. (1976). On selecting features for pattern classifiers. In Proceedings of the third international conference on pattern recognition, Coronado, CA (pp. 71–75).
Turban, E., & Aronson, J. E. (2001). Decision support systems and intelligent systems (6th ed.). Upper Saddle River, NJ: Prentice-Hall.
Wettschereck, D., Aha, D. W., & Mohri, T. (1997). A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artificial Intelligence Review, 11(1–5), 273–314.
Wilson, D. L. (1972). Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, 2(3), 408–421.
Yan, H. (1993). Prototype optimization for nearest neighbor classifier using a two-layer perceptron. Pattern Recognition, 26(2), 317–324.
Yin, W. J., Liu, M., & Wu, C. (2002). A genetic learning approach with case-based memory for job-shop scheduling problems. In Proceedings of the first international conference on machine learning and cybernetics, Beijing, China (pp. 1683–1687).