A case-based reasoning system with the two-dimensional reduction technique for customer classification

Expert Systems with Applications 32 (2007) 1011–1019

Hyunchul Ahn a, Kyoung-jae Kim b,*, Ingoo Han a

a Graduate School of Management, Korea Advanced Institute of Science and Technology, 207-43 Cheongrangri-Dong, Dongdaemun-Gu, Seoul 130-722, Korea
b Department of Management Information Systems, Dongguk University, 3-26 Pil-Dong, Chung-Gu, Seoul 100-715, Korea

Abstract

Many studies have tried to optimize the parameters of case-based reasoning (CBR) systems. Among them, selecting appropriate features, so that the similarity between input and stored cases can be measured more precisely, and selecting appropriate instances, so that the noise that distorts prediction can be eliminated, have been popular. However, these approaches have been applied independently, although optimizing them simultaneously may improve prediction performance synergistically. This study proposes a case-based reasoning system with a two-dimensional reduction technique. In this study, the vertical and horizontal dimensions of the research data are reduced through our research model, a hybrid feature and instance selection process using genetic algorithms. We apply the proposed model to a real-world customer classification case which predicts customers' buying behavior for a specific product from their demographic characteristics. Experimental results show that the proposed technique may improve classification accuracy and outperform various optimized models of the typical CBR system.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Case-based reasoning; Genetic algorithms; Feature selection; Instance selection; Customer relationship management

1. Introduction

Case-based reasoning (CBR) is a problem-solving technique that reuses past experiences to find a solution. It is generally quite simple to implement, yet it often handles complex and unstructured decision-making problems very effectively. Moreover, its prediction model is maintained in an up-to-date state because the case-base is revised in real time, which is a very important feature for real-world applications. Thus, it has been applied to various problem-solving areas including manufacturing, finance and marketing (see Chiu, 2002; Chiu, Chang, & Chiu, 2003; Kim & Han, 2001; Shin & Han, 1999; Yin, Liu, & Wu, 2002).

* Corresponding author. Tel.: +82 2 2260 3324; fax: +82 2 2260 3684. E-mail addresses: [email protected] (H. Ahn), [email protected] (K.-j. Kim), [email protected] (I. Han). doi:10.1016/j.eswa.2006.02.021

This study applies CBR to a customer classification model, by which a company classifies its customers into predefined groups with similar behavior patterns. Companies usually build a customer classification model to find the prospects for a specific product. In this case, they classify the prospects into either purchasing or non-purchasing groups. It is essential to find proper candidate buyers for the target products to obtain successful results from one-to-one marketing. Consequently, it is important to build an effective customer classification model for one-to-one marketing that can predict a candidate customer's buying behavior precisely. It is, however, not easy to obtain results with high classification accuracy by applying CBR, because there is no general mechanism for designing effective CBR systems. In particular, it is very important to design effective case indexing and retrieval, in which the match between input and stored cases is processed. There have been three research issues regarding effective case indexing and retrieval: feature selection, feature weighting and instance selection.


Feature selection means choosing an appropriate feature subset. Feature weighting means optimizing the weight of each feature. Instance selection means choosing a small subset of training instances that represents the genuine pattern of the accumulated cases. Prior studies on optimizing CBR have implemented these three methods using various techniques, including statistical and artificial intelligence techniques (see Babu & Murty, 2001; Cardie, 1993; Domingos, 1997; Huang, Chiang, Shieh, & Grimson, 2002; Liao, Zhang, & Mount, 2000; Lipowezky, 1998; Sanchez, Pla, & Ferri, 1997; Shin & Han, 1999; Siedlecki & Sklanski, 1989; Skalak, 1994; Wettschereck, Aha, & Mohri, 1997; Yan, 1993). However, most studies have applied these methods independently, or at best combined them sequentially, although simultaneous optimization of the factors may lead to better results. For this reason, simultaneous optimization of several components of CBR has recently attracted researchers.

In this article, we propose a two-dimensional reduction technique using a genetic algorithm (GA) to optimize feature and instance selection simultaneously. In addition, we apply the proposed model to a real-world case involving customer classification and present experimental results from the application.

We organize this article as follows. In Section 2, we provide a brief review of prior studies. Section 3 describes our proposed model, the two-dimensional reduction technique. Section 4 presents the research design and experiments. In Section 5, the empirical results are presented and discussed. The final section suggests the contributions and limitations of this study.

2. Literature review

Here, we first review the basic concepts of CBR and GA. After that, we introduce prior studies that attempt to optimize components of CBR such as feature and instance selection.

2.1. Case-based reasoning and genetic algorithms

While other major artificial intelligence techniques depend on generalized relationships between problem descriptors and conclusions, CBR utilizes specific knowledge of previously experienced, concrete problem situations, so it is effective for complex and unstructured problems and is easy to update (Shin & Han, 1999). Due to these strengths, CBR has been widely applied to various areas of problem solving, e.g. intelligent product catalogs for Internet shopping malls, conflict resolution in air traffic control, medical diagnosis and even the design of semiconductors (Turban & Aronson, 2001). The process of CBR is shown graphically in Fig. 1 (Kolodner, 1993).

[Fig. 1. General process of case-based reasoning: an input problem passes through (1) case representation, (2) case indexing, (3) case retrieval from the case-base, and (4) case adaptation.]

Among the steps of the above flowchart, case indexing and retrieval are the most important because the performance of a CBR system usually depends on them. In these steps, the CBR system retrieves the most similar cases from the case memory. The retrieved cases significantly affect the quality of the solution, so it is very important to design an effective indexing and retrieval method. In particular, feature and instance selection for measuring similarity have been controversial issues in designing CBR systems.

In addition, the genetic algorithm (GA) has been widely applied as a tool for optimizing CBR. GA is an optimization method that attempts to incorporate ideas of natural evolution. Its procedure improves the search results by constantly trying various possible solutions with genetic operations. As such, it represents an intelligent exploitation of a random search within a defined search space to solve a problem.

In general, the process of GA is as follows. First of all, GA randomly generates a set of solutions, called the initial population. Each solution is called a chromosome and is usually in the form of a binary string. After generating the initial population, GA computes the fitness function of each chromosome. The fitness function is a user-defined function which returns the evaluation result for each chromosome, so a higher fitness value indicates a better chromosome. Typically, classification accuracy (performance) is used as the fitness function for classification problems. After that, offspring are generated by applying genetic operators. Among the various genetic operators, selection, crossover and mutation are the most fundamental and popular. The selection operator determines which chromosomes will survive. In crossover, substrings from pairs of chromosomes are exchanged to form new pairs of chromosomes. In mutation, with a very small mutation rate, arbitrarily selected bits in a chromosome are inverted. These steps of evolution continue until the stopping conditions are satisfied. In most cases, the stopping criterion is set to a maximum number of generations (Chiu, 2002; Fu & Shen, 2004; Han & Kamber, 2001).
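To make this loop concrete, the following is a minimal sketch of such a binary-string GA in Python. It is illustrative only, not the authors' implementation: the truncation selection used here is one simple choice among many possible selection operators, and all names are ours.

```python
import random

def evolve(fitness, chrom_len, pop_size=50, generations=20,
           crossover_rate=0.7, mutation_rate=0.1):
    """Minimal binary-string GA: selection, one-point crossover, mutation."""
    pop = [[random.randint(0, 1) for _ in range(chrom_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            if random.random() < crossover_rate:  # one-point crossover
                cut = random.randrange(1, chrom_len)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a, b]
        # bit-flip mutation produces the next generation
        pop = [[1 - g if random.random() < mutation_rate else g
                for g in c] for c in children[:pop_size]]
    return max(pop, key=fitness)
```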


2.2. Feature selection approaches

Feature selection is the process of picking a subset of features that are relevant to the target concept and removing irrelevant or redundant features. It is an important factor in the performance of a classification model, so it has been one of the most popular research issues in designing most classification models, including CBR. Stearns (1976) proposed the sequential forward selection (SFS) method, which finds optimal feature subsets with the highest accuracy by varying the number of features. Siedlecki and Sklanski (1989) proposed a feature selection algorithm based on genetic search, and Cardie (1993) presented a decision tree approach to feature subset selection. Skalak (1994) and Domingos (1997) proposed a hill-climbing algorithm and a clustering method for feature subset selection, respectively. In addition, Cardie and Howe (1997) selected relevant features using a decision tree, and Jarmulak, Craw, and Rowe (2000) did so using a decision tree algorithm such as C4.5.

2.3. Instance selection approaches

While feature selection enhances the measurement of similarity between cases, instance selection is a technique for reducing the noise contained in a case-base. It is also called 'editing' or 'prototype selection'. Instance selection chooses an appropriately reduced subset of a case-base and applies the nearest-neighbor rule to the selected subset. It may increase the performance of CBR systems significantly when a case-base contains many distorted cases which disturb the prediction of the CBR model. Thus, it has long been another popular research issue in CBR systems.

There are many different approaches for selecting appropriate instances. First of all, Hart (1968) suggested a condensed nearest neighbor algorithm, and Wilson (1972) proposed Wilson's editing method.


Their early algorithms were based on simple information gain theory, so they are easy to apply. More recently, mathematical tools and AI techniques have been widely applied to instance selection. For example, Sanchez et al. (1997) proposed a proximity graph approach, and Lipowezky (1998) suggested linear programming methods for instance selection. Yan (1993) and Huang et al. (2002) presented neural network-based instance selection methods. In addition, a GA approach was proposed by Babu and Murty (2001).

2.4. Simultaneous optimization approaches

The main idea of the simultaneous optimization approach is to reduce the dimensions of the features and the instances at the same time. By using it, conventional CBR may find more relevant features for more representative instances, so it is possible to obtain synergistic effects. Unfortunately, however, there are few studies on this technique because of its short history.

The first two-dimensional reduction approach was proposed by Kuncheva and Jain (1999). They proposed simultaneous optimization of feature and instance selection using GA, and they compared their model to a sequential combination of traditional feature selection and instance selection algorithms. After the pioneering work by Kuncheva and Jain (1999), Rozsypal and Kubat (2003) proposed a similar model. However, they pointed out that the model by Kuncheva and Jain (1999) has defects when there are many training examples. Thus, they used different designs for the chromosome and the fitness function of the GA search. They also showed empirically that their model outperforms the model of Kuncheva and Jain. Fig. 2 gives an example of the two-dimensional reduction process.

However, these studies applied the model only to artificially generated datasets, so the usefulness of the model has not been verified in a real-world environment. In this study, we apply the two-dimensional reduction technique to a real-world customer classification case in order to validate the applicability of the proposed model in the real world.

[Fig. 2. Two-dimensional reduction process.]


3. The two-dimensional reduction technique using genetic algorithms

In order to enhance the performance of typical CBR systems, this study uses a two-dimensional reduction technique in which feature selection and instance selection are performed together using genetic algorithms. We call this model TRCBR (Two-dimensional Reduction using feature selection and instance selection for CBR). The framework of TRCBR is shown in Fig. 3. As shown in Fig. 3, the process consists of seven steps in total. A detailed explanation of each step of the proposed model follows.

Step 1. Design the structure of chromosomes. To apply GA to search for the optimal parameters, they have to be coded on a chromosome in the form of a binary string. The structure of the chromosomes for the model is presented in Fig. 4. Each code for feature selection and instance selection is a single bit, '0' or '1': '0' means the corresponding item is not selected and '1' means it is selected. Consequently, the length of each chromosome is k + n bits, where k is the number of features and n is the number of instances, as shown in Fig. 4.
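A minimal sketch of this encoding in Python (our illustrative names, not the authors' code): a chromosome is simply a list of k + n bits, and decoding it means splitting off the feature mask and the instance mask.

```python
def decode(chromosome, k):
    """Split a k + n bit chromosome into its two selection masks."""
    feature_mask = chromosome[:k]    # first k bits: which features to use
    instance_mask = chromosome[k:]   # remaining n bits: which instances to keep
    return feature_mask, instance_mask
```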

Step 2. Generate the initial population and define the fitness function. The population (a set of seed chromosomes for finding the optimal parameters) is initialized with random values before the search process. The population is then searched for the encoded chromosome that maximizes the specified fitness function. The objective of this study is to determine appropriate features and relevant instances for CBR systems that produce the highest classification accuracy for the test data (Kim, 2004; Shin & Han, 1999). Thus, we set the classification accuracy on the test data as the fitness function for GA. The fitness function (f_T) for the test set T can be expressed as Eq. (1):

Maximize  f_T = \sum_{k=1}^{n} hit_k    (1)

where n is the size of the test dataset T and hit_k is the matched result between the expected outcome (EO_k) and the actual outcome (AO_k); i.e., if EO_k = AO_k then hit_k = 1, otherwise hit_k = 0.

Step 3. Perform the CBR process. In this step, the parameters set in the previous step are applied to the CBR system, and the general reasoning process of CBR proceeds. We use the weighted average of the Euclidean distance for each feature as the similarity measure, and 3-NN (three-nearest-neighbor) matching as the method of case retrieval. That is, the system searches for the three nearest neighbors of an input case and produces the final classification result by a vote among them. In this paper, we set the k parameter of k-NN to 3 because 3-NN showed the best prediction accuracy among 1-NN, 3-NN, 5-NN, 7-NN and 9-NN. A sketch of this retrieval step and the fitness computation of Eq. (1) is given below.
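The following Python sketch shows one way to implement the weighted-Euclidean 3-NN retrieval and the Eq. (1) fitness over a chromosome. It is our illustrative reconstruction, not the authors' VBA implementation; equal feature weights are assumed by default, and degenerate all-zero masks are not handled for brevity.

```python
import numpy as np

def knn_predict(x, X_ref, y_ref, feature_mask, weights=None, k=3):
    """Classify x by majority vote of its k nearest reference cases,
    using weighted Euclidean distance over the selected features only."""
    m = np.asarray(feature_mask, dtype=bool)
    w = np.ones(m.sum()) if weights is None else np.asarray(weights)[m]
    d = np.sqrt((w * (X_ref[:, m] - x[m]) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    votes = y_ref[nearest].astype(int)
    return np.bincount(votes).argmax()   # majority class among the k votes

def fitness(chromosome, X_ref, y_ref, X_test, y_test, n_features):
    """Eq. (1): the number of test cases whose 3-NN prediction matches."""
    f_mask = np.asarray(chromosome[:n_features], dtype=bool)
    i_mask = np.asarray(chromosome[n_features:], dtype=bool)
    X_sel, y_sel = X_ref[i_mask], y_ref[i_mask]   # instance selection
    return sum(knn_predict(x, X_sel, y_sel, f_mask) == y
               for x, y in zip(X_test, y_test))
```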

[Fig. 3. Steps of TRCBR: (1) design the structure of chromosomes; (2) generate the initial population and define the fitness function; (3) perform the CBR process against the test and reference case-bases; (4) calculate the fitness value for each chromosome; (5) apply genetic operators and produce a new generation; (6) if the stopping criteria are not satisfied, return to step 3; (7) apply the selected parameters to the hold-out case-base.]


[Fig. 4. Gene structure for TRCBR: each chromosome in the population (a generation) is a binary string whose first k bits (F1, F2, ..., Fk) encode feature selection and whose remaining n bits (I1, I2, ..., In) encode instance selection.]

Step 4. Calculate the fitness value for each chromosome. After the reasoning process has been applied to all test cases, the values of the fitness function (f_T) over the test set T are updated. They are used as the indicator of the excellence of each chromosome.

Step 5. Apply genetic operators and produce a new generation. In step 5, the GA evolution process proceeds toward maximizing the value of the fitness function. It includes selection of the fittest, crossover and mutation. By applying these operators, a new generation of the population is produced.

Step 6. Iterate steps 3–5 until the stopping criteria are satisfied. Steps 3–5 are repeated until the stopping conditions are satisfied. When the evolution process finishes according to the stopping criteria, the best parameters – the information about the finally selected features and instances – are stored in memory.

Step 7. Apply the selected features and instances to the hold-out cases. In the last stage, the system takes the parameters – the selections of features and instances – whose performance on the test data is the best, and applies them to the hold-out data to check the generalizability of the selected parameters. Parameters optimized by GA sometimes do not fit unknown data, so this phase is required to check for the possibility of overfitting. A sketch tying steps 3–7 to the helper functions above is given below.
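Putting the pieces together, the following sketch runs the whole TRCBR loop on synthetic stand-in data shaped like the paper's split (588 reference, 196 test and 196 hold-out cases, 14 features; see Section 4). It reuses the hypothetical `evolve`, `fitness` and `knn_predict` helpers sketched above and uses deliberately small GA settings for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_ref, y_ref = rng.random((588, 14)), rng.integers(0, 2, 588)    # reference
X_test, y_test = rng.random((196, 14)), rng.integers(0, 2, 196)  # test
X_hold, y_hold = rng.random((196, 14)), rng.integers(0, 2, 196)  # hold-out

k = X_ref.shape[1]   # number of candidate features

# Steps 2-6: evolve a (k + n)-bit chromosome against the test-set fitness.
best = evolve(lambda c: fitness(c, X_ref, y_ref, X_test, y_test, k),
              chrom_len=k + len(X_ref), pop_size=20, generations=5)
f_mask, i_mask = np.asarray(best[:k], bool), np.asarray(best[k:], bool)

# Step 7: check generalizability on the hold-out case-base.
hits = sum(knn_predict(x, X_ref[i_mask], y_ref[i_mask], f_mask) == y
           for x, y in zip(X_hold, y_hold))
print(f"hold-out accuracy: {hits / len(y_hold):.2%}")
```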


4. The research design and experiments

Table 1
Selected features and their descriptions

Feature name   Description                                                    Range
AGE            Age                                                            Continuous (years)
ADD0           Residence is located in Seoul (the capital of South Korea)     0: False, 1: True
ADD1           Residence is located in a big city                             0: False, 1: True
OCCU0          Customer works for a company                                   0: False, 1: True
OCCU2          Customer is a student                                          0: False, 1: True
OCCU4          Customer runs his or her own business                          0: False, 1: True
SEX            Gender                                                         0: Male, 1: Female
LOSS4          Customer needs to lose weight around the legs and thighs       0: Not exist, 1: Exist
PUR0           Customer diets for beauty                                      0: False, 1: True
HEIGHT         Height                                                         Continuous (m)
BMI            Body mass index, a measure of body fat based on height and     Continuous (kg/m²)
               weight that applies to both adult men and women:
               BMI (kg/m²) = weight (kg) / (height (m))²
E01            Prior experience with 'functional diet food'                   0: Not exist, 1: Exist
E02            Prior experience with 'diet drugs'                             0: Not exist, 1: Exist
E05            Prior experience with 'one food diet'                          0: Not exist, 1: Exist

4.1. Application data

The research data are collected from an online diet portal site in Korea which provides all kinds of services for online dieting, such as information, community services and a shopping mall. In the case of a diet shopping mall, customers generally have a clear objective for the shopping service they want, and they are very sensitive to personalized care and services. Thus, they do not hesitate to provide their personal information. As a result, the target company of our research possesses detailed and accurate customer information, and it wants to use this information as a source of direct marketing for selling its products.


[Fig. 5. Working screen of the experimental system.]

Based on this motivation, we build a customer classification model to predict the potential buyers of the company's products among the users of its web site. The web site deals with four kinds of products: (1) a total diet counseling service, (2) functional meals and recipes which are helpful for dieting, (3) fitness equipment, and (4) various accessories. Among them, functional meals and recipes are the most strategic product group for the mall because its parent company produces these products in an offline environment. Thus, the mall is interested in finding the potential buyers of the functional meals and recipes, and we develop a model to classify the customers who are expected to purchase them.

The experimental data include 980 cases consisting of purchasing and non-purchasing customers. They contain demographic variables and the status of purchase or non-purchase for the corresponding user. The purchase status of each user is categorized as '0' or '1' and is used as the dependent variable: '0' means that the user has not purchased the company's products and '1' means that he or she made a purchase. We collect a total of 46 independent variables including demographic and other personal information. To eliminate irrelevant variables and make the reasoning process more efficient, we adopt two statistical methods: two-sample t-tests for ratio variables and chi-square tests for nominal variables, as sketched below. Finally, we select the 14 factors which prove to be the most influential in the purchase of the company's product.
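A minimal sketch of this screening step, assuming SciPy and our own illustrative names; the paper does not give the exact test settings (e.g. the significance level or variance assumption), so those below are assumptions.

```python
import numpy as np
from scipy import stats

def screen_features(X, y, nominal_cols, alpha=0.05):
    """Keep features that differ significantly between purchasers (y == 1)
    and non-purchasers (y == 0): two-sample t-tests for ratio variables,
    chi-square tests of independence for nominal variables."""
    keep = []
    for j in range(X.shape[1]):
        if j in nominal_cols:
            levels = np.unique(X[:, j])
            table = np.array([[np.sum((X[:, j] == v) & (y == c))
                               for v in levels] for c in (0, 1)])
            p = stats.chi2_contingency(table)[1]
        else:
            p = stats.ttest_ind(X[y == 1, j], X[y == 0, j])[1]
        if p < alpha:
            keep.append(j)
    return keep
```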

Table 1 presents detailed information about the selected features.

Table 2
Classification accuracy of k-NN for CCBR

k                                      1        3        5        7        9
Performance for hold-out case-base     52.04%   56.12%   55.61%   54.08%   53.57%

Table 3
The feature and instance selections of the optimized CBR models

                 CCBR       FCBR         ICBR       TRCBR
Feature selections
AGE              Selected   Unselected   Selected   Selected
ADD0             Selected   Selected     Selected   Unselected
ADD1             Selected   Selected     Selected   Selected
OCCU0            Selected   Unselected   Selected   Unselected
OCCU2            Selected   Unselected   Selected   Selected
OCCU4            Selected   Unselected   Selected   Selected
SEX              Selected   Selected     Selected   Unselected
LOSS4            Selected   Selected     Selected   Selected
PUR0             Selected   Unselected   Selected   Selected
HEIGHT           Selected   Unselected   Selected   Selected
BMI              Selected   Selected     Selected   Unselected
E01              Selected   Selected     Selected   Selected
E02              Selected   Selected     Selected   Selected
E05              Selected   Selected     Selected   Selected
Instance selections
# of selections  588        588          524        478
Ratio (%)        100%       100%         89.12%     81.29%


We split the data into three groups: reference, test and hold-out case-bases. The proportions of these three case-bases are 60% (588 cases), 20% (196 cases) and 20% (196 cases), respectively.

4.2. Research design and system development

As the controlling parameters of the GA search in our experiments, we use 200 organisms in the population and set the crossover rate to 0.7 and the mutation rate to 0.1. As the stopping condition, we use 4000 trials (20 generations). To test the effectiveness of TRCBR, we also apply three different CBR models to the same dataset. The first model, labeled CCBR (Conventional CBR), uses a conventional approach for the reasoning process of CBR. This model considers all initially available features as the feature subset.


That is to say, there is no special process of feature subset selection. In addition, instance selection is not considered here, so all instances are used in this model. The second model selects relevant features using genetic algorithms; it is called FCBR (Feature selection for CBR). Here, we optimize feature selection by GA but remain unconcerned with instance selection. Siedlecki and Sklanski (1989) and Kim (2004) proposed similar models. The third model uses GA to select a relevant instance subset; it is called ICBR (Instance selection for CBR). In this model, we optimize instance selection by GA but remain unconcerned with feature selection. Babu and Murty (2001) proposed a similar model.

These experiments are performed with our own prototype software, which is developed using Microsoft Excel 2003

[Fig. 6. Scattergrams of (a) CCBR, (b) ICBR and (c) TRCBR.]


and Evolver Industrial Version 4.08 (Palisade Software, www.palisade.com), a commercial GA tool. The 3-NN algorithm is implemented in VBA (Visual Basic for Applications) in Microsoft Excel 2003, and the VBA code is integrated with Evolver to optimize the parameters by GA. Fig. 5 shows the working screen of the developed experimental system.

5. Experimental results

Table 5
The results of the two-sample test for proportions

Pair of the comparative models   Z-score   p-Value   Decision
CCBR vs. FCBR                    0.8191    0.2064    Accept H0
CCBR vs. ICBR                    1.4416    0.0747    Reject H0*
FCBR vs. ICBR                    0.6235    0.2665    Accept H0
TRCBR vs. CCBR                   1.6510    0.0494    Reject H0**
TRCBR vs. FCBR                   0.8335    0.2023    Accept H0
TRCBR vs. ICBR                   0.2102    0.4168    Accept H0

* Significant at the 10% level.
** Significant at the 5% level.

For CCBR, we apply the k-NN algorithm varying the parameter k over the odd numbers from 1 to 9. As a result, we find that 3-NN shows the best performance, as illustrated in Table 2. So, as indicated earlier, we apply 3-NN to all the CBR models: CCBR, FCBR, ICBR and the proposed model.

Table 3 presents the results of CCBR, FCBR, ICBR and TRCBR. We can see that ICBR and TRCBR use only a portion of the training case-base, selecting about 81–89% of the total training samples. This can be interpreted as evidence that the original dataset contains some noisy cases. Fig. 6 compares the scattergrams of CCBR, ICBR and TRCBR, whose x-axis represents 'BMI (body mass index, kg/m²)' and whose y-axis indicates 'AGE (years)'. Although the groups are not clearly separated, in Fig. 6 TRCBR shows the difference in patterns between the purchasing group and the non-purchasing group slightly more precisely than CCBR and ICBR do.

In addition, we compare the prediction performance of the proposed model and the other alternative models. Table 4 reports the average prediction accuracy of each model. Among the models, TRCBR shows the highest accuracy (64.29%) on the given hold-out dataset, followed by ICBR (63.27%) and FCBR (60.20%). On this dataset, the CBR algorithms which contain instance selection procedures show better results, which again suggests that the dataset used in the study may contain much noise. In addition, the results show that TRCBR improves the prediction accuracy of the typical CBR system by about 8.17 percentage points (from 56.12% to 64.29%).

Table 4
Average prediction accuracy of the models

Model   Test dataset (%)   Hold-out dataset (%)   Remarks
CCBR    62.24              56.12                  k of k-NN = 3
FCBR    62.76              60.20
ICBR    66.33              63.27
TRCBR   70.92              64.29

We use the two-sample test for proportions to examine whether the differences in predictive accuracy between the proposed model and the other comparative algorithms are statistically significant. This test checks whether there is a difference between two proportions when the prediction accuracy of one method is compared with that of another (Harnett & Soni, 1991). In the test, the null hypothesis is H0: p_i − p_j = 0, where i = 1, ..., 3 and j = 2, ..., 4, while the alternative hypothesis is Ha: p_i − p_j ≠ 0, where p_k denotes the classification performance of the kth method. A sketch of the test follows.
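Below is a minimal Python sketch of this test under our assumptions: a pooled variance estimate and a one-tailed p-value, since with these choices the sketch reproduces the TRCBR vs. CCBR row of Table 5 (z ≈ 1.651, p ≈ 0.0494). The helper name and the hit counts reconstructed from the reported hold-out accuracies are ours.

```python
import math
from scipy.stats import norm

def two_proportion_z(hits_i, hits_j, n=196):
    """Two-sample test for proportions with a pooled variance estimate;
    returns the z-score and a one-tailed p-value for p_i > p_j."""
    p_i, p_j = hits_i / n, hits_j / n
    pooled = (hits_i + hits_j) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_i - p_j) / se
    return z, 1 - norm.cdf(z)

# TRCBR (64.29%) vs. CCBR (56.12%) on the 196-case hold-out set:
# 126 and 110 hits, giving z ~= 1.651 and p ~= 0.0494 as in Table 5.
print(two_proportion_z(126, 110))
```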

Table 5 shows the results of the test. As shown in Table 5, TRCBR outperforms CCBR at the 5% statistical significance level, but it does not outperform the other two models with statistical significance.

6. Conclusions

In this paper, we have suggested a novel model, TRCBR, to improve the performance of the typical CBR system. The model uses GA as a tool to optimize feature selection and instance selection simultaneously. The experimental results show that our proposed model may outperform comparative algorithms such as CCBR, FCBR and ICBR in the case of customer classification.

This study has some limitations. First of all, it takes much computational time to obtain the optimal parameters of the proposed model, so efforts to make the model more efficient are needed before it can be applied to general real-world cases. Second, there are other factors which enhance the performance of a CBR system that might be incorporated into the simultaneous optimization model. For example, feature weighting is another factor to be optimized in a CBR system (Liao et al., 2000; Shin & Han, 1999; Wettschereck et al., 1997). In addition, the k parameter of k-NN, the number of cases to combine, may be another parameter to be optimized (Ahn, Kim, & Han, 2003; Lee & Park, 1999). Building a universal simultaneous optimization model for CBR, covering feature and instance selection as well as other factors like the k parameter and feature weights, may improve the overall performance of CBR systems. Finally, the results of this study may depend on the experimental dataset. In particular, the research dataset is relatively small and seems to be noisy, so it is difficult to distinguish the prediction ability of each model. Consequently, the generalizability of the proposed model should be tested further by applying it to other problem domains in the future.


References

Ahn, H., Kim, K.-j., & Han, I. (2003). Determining the optimal number of cases to combine in an effective case-based reasoning system using genetic algorithms. In Proceedings of the international conference of Korea intelligent information systems society 2003 (ICKIISS 2003), Seoul, Korea (pp. 178–184).
Babu, T. R., & Murty, M. N. (2001). Comparison of genetic algorithm based prototype selection schemes. Pattern Recognition, 34(2), 523–525.
Cardie, C. (1993). Using decision trees to improve case-based learning. In Proceedings of the tenth international conference on machine learning (pp. 25–32). San Francisco, CA: Morgan Kaufmann.
Cardie, C., & Howe, N. (1997). Improving minority class prediction using case-specific feature weights. In Proceedings of the fourteenth international conference on machine learning (pp. 57–65). San Francisco, CA: Morgan Kaufmann.
Chiu, C. (2002). A case-based customer classification approach for direct marketing. Expert Systems with Applications, 22(2), 163–168.
Chiu, C., Chang, P. C., & Chiu, N. H. (2003). A case-based expert support system for due-date assignment in a wafer fabrication factory. Journal of Intelligent Manufacturing, 14(3–4), 287–296.
Domingos, P. (1997). Context-sensitive feature selection for lazy learners. Artificial Intelligence Review, 11(1–5), 227–253.
Fu, Y., & Shen, R. (2004). GA based CBR approach in Q&A system. Expert Systems with Applications, 26(2), 167–170.
Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. San Francisco, CA: Morgan Kaufmann Publishers.
Harnett, D. L., & Soni, A. K. (1991). Statistical methods for business and economics. Massachusetts, MA: Addison-Wesley.
Hart, P. E. (1968). The condensed nearest neighbor rule. IEEE Transactions on Information Theory, 14(3), 515–516.
Huang, Y. S., Chiang, C. C., Shieh, J. W., & Grimson, E. (2002). Prototype optimization for nearest-neighbor classification. Pattern Recognition, 35(6), 1237–1245.
Jarmulak, J., Craw, S., & Rowe, R. (2000). Self-optimizing CBR retrieval. In Proceedings of the twelfth IEEE international conference on tools with artificial intelligence (pp. 376–383). Vancouver, Canada.
Kim, K. (2004). Toward global optimization of case-based reasoning systems for financial forecasting. Applied Intelligence, 21(3), 239–249.
Kim, K., & Han, I. (2001). Maintaining case-based reasoning systems using a genetic algorithms approach. Expert Systems with Applications, 21(3), 139–145.
Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
Kuncheva, L. I., & Jain, L. C. (1999). Nearest neighbor classifier: Simultaneous editing and feature selection. Pattern Recognition Letters, 20(11–13), 1149–1156.
Lee, H. Y., & Park, K. N. (1999). Methods for determining the optimal number of cases to combine in an effective case based forecasting system. Korean Journal of Management Research, 27(5), 1239–1252.
Liao, T. W., Zhang, Z. M., & Mount, C. R. (2000). A case-based reasoning system for identifying failure mechanisms. Engineering Applications of Artificial Intelligence, 13(2), 199–213.
Lipowezky, U. (1998). Selection of the optimal prototype subset for 1-NN classification. Pattern Recognition Letters, 19(10), 907–918.
Rozsypal, A., & Kubat, M. (2003). Selecting representative examples and attributes by a genetic algorithm. Intelligent Data Analysis, 7(4), 291–304.
Sanchez, J. S., Pla, F., & Ferri, F. J. (1997). Prototype selection for the nearest neighbour rule through proximity graphs. Pattern Recognition Letters, 18(6), 507–513.
Shin, K. S., & Han, I. (1999). Case-based reasoning supported by genetic algorithms for corporate bond rating. Expert Systems with Applications, 16(2), 85–95.
Siedlecki, W., & Sklanski, J. (1989). A note on genetic algorithms for large-scale feature selection. Pattern Recognition Letters, 10(5), 335–347.
Skalak, D. B. (1994). Prototype and feature selection by sampling and random mutation hill climbing algorithms. In Proceedings of the eleventh international conference on machine learning (pp. 293–301). New Brunswick, NJ: Morgan Kaufmann.
Stearns, S. (1976). On selecting features for pattern classifiers. In Proceedings of the third international conference on pattern recognition, Coronado, CA (pp. 71–75).
Turban, E., & Aronson, J. E. (2001). Decision support systems and intelligent systems (6th ed.). Upper Saddle River, NJ: Prentice-Hall.
Wettschereck, D., Aha, D. W., & Mohri, T. (1997). A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artificial Intelligence Review, 11(1–5), 273–314.
Wilson, D. L. (1972). Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, 2(3), 408–421.
Yan, H. (1993). Prototype optimization for nearest neighbor classifier using a two-layer perceptron. Pattern Recognition, 26(2), 317–324.
Yin, W. J., Liu, M., & Wu, C. (2002). A genetic learning approach with case-based memory for job-shop scheduling problems. In Proceedings of the first international conference on machine learning and cybernetics, Beijing, China (pp. 1683–1687).