Expert Systems with Applications 38 (2011) 13702–13710
doi:10.1016/j.eswa.2011.04.162

Constructing intelligent model for acceptability evaluation of a product

Shu-Ting Luo*, Chwen-Tzeng Su, Wen-Chen Lee
Graduate School of Industry Engineering and Management, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin 64002, Taiwan

* Corresponding author. E-mail address: [email protected] (S.-T. Luo).


Keywords: Consumer acceptability; Ensemble classifiers; Feature ranking; Acceptability evaluation; Non-parametric statistical test

Abstract: In the highly competitive marketplace, consumer acceptability has become an important factor in the product design process. However, manufacturers and designers often misunderstand what consumers really want. Thus, acceptability evaluation and prediction are important in product development. This study developed an intelligent model to solve the consumer acceptability problem in an attempt to evaluate consumer acceptability with better performance. The model adopted three well-known feature ranking methods to rank features by importance. In addition, it employed the Bayesian Network (BN), Radial Basis Function (RBF) Network, Support Vector Machine–Sequential Minimal Optimization (SVM–SMO) and their ensembles to build prediction models. This study also focused on the use of a non-parametric statistical test to compare the classification performance of the algorithms. To demonstrate the applicability of the proposed model, we adopted a real case, car evaluation, to show that the consumer acceptability problem can be easily evaluated and predicted using the proposed model. The results show that the model can improve performance on consumer acceptability problems and can be easily extended to other industries. © 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Businesses should develop products that satisfy customer demands, because doing so increases an enterprise's competitiveness (Liao, Hsieh, & Huang, 2008). During the last few decades, the evaluation of consumer acceptability of products has been a topic of continuous and extensive research. The choice of any product involves complex human behavior, influenced by many interrelated factors concerning the product. These include the consumer making a choice as well as external characteristics such as price, brand and capability. In addition, the consumer's response must be taken into consideration, not only to evaluate the acceptance of the final product, but also from the beginning of the process and throughout its development. By then, however, the consumer will already have formed a first impression of product quality. For the manufacturer, product quality can be viewed as minimizing deviations from specifications of the product's objectively measured attributes. In contrast, consumers view product quality subjectively, often recognizing it only through a combination of their experience, judgments based on first impressions, the product's price and value, and the extent to which it satisfies their needs. Therefore, how to evaluate consumer acceptability is one of the key problems of product development. Effective evaluation methods should be developed to help manufacturers predict consumer acceptability more accurately. Thus, nowadays, how to develop more accurate evaluation and prediction methods has become an important research topic.

Lolas and Olatunbosun (2008) pointed out that product development is an important but also dynamic, lengthy and risky phase in the life of a new product. It is critical to take advantage not only of designers' experience and creativity but also of the implicit affective needs of consumers. Therefore, this study aims to provide effective methods for consumer acceptability evaluation that can be used in manufacturing, as well as in many other areas. A car case study is presented in detail in order to demonstrate the capability of the prediction methods in the field of car manufacturing. Every day, hundreds of millions of people rely on cars for transportation. Aside from buying a house, a car is perhaps the largest purchase that most people make. Buying a car reflects consumers' needs and is a decision-making problem. When consumers consider buying a car, many factors can influence their purchase decision, such as the price, the cost of regular maintenance, comfort, and safety. Byun (2001) found that most people who purchase a car rank safety high among their purchase considerations. As the car market becomes more and more competitive, acceptability is a critical aspect of evaluating that market. Estimating the acceptability of products requires knowledge of their strengths and weaknesses with respect to consumer needs. Thus, the acceptability estimation process has to be based on experience and judgment and involves consideration of the many factors that characterize product performance. The acceptability concept was developed by Bretz, Milkovich, and Read (1992) as being increasingly important in relation to the validity and usefulness of appraisal processes in general.


Assessing the cost and quality of a new product in the marketing phase of development enables a more accurate prediction of consumer acceptability of the product or service. Many acceptability trials have used different measuring methods to compare results across industries. In the past, researchers have developed a variety of statistical methods for measuring acceptability. Pettijohn, Pettijohn, and Taylor (2002) adopted a Likert mean score, where a higher score indicates greater acceptability of the fabric and design used in making a bedcover and throw pillows. McEwan (1997) used the coefficient of variation to measure the repeatability of consumer acceptability measures. Rejeb, Boly, and Guimaraes (2008) provided a new methodology based on the Kano model for evaluating the acceptability of a new product; this method can be used as a decision aid for selecting consumer needs. In addition to statistical methods, Jaakkola (2007) adopted a qualitative research approach to evaluate the acceptability of a new product format in the pharmaceutical market. Recently, artificial intelligence techniques have been adopted to predict acceptability. Chao and Skibniewski (1995) used a neural network (NN) based approach for predicting the acceptability of a new construction technology. Elazouni, Ali, and Abdel-Razek (2005) used the analytic hierarchy process (AHP) to evaluate weights of acceptability values and developed an NN model to predict the acceptability of new formwork systems. However, these are no longer state-of-the-art techniques. Over the last few years, the support vector machine (SVM), a newer technique introduced to deal with classification problems, has been successfully applied to a wide range of domains (Berry & Linoff, 1997; Rüping, 2000). In addition, ensemble classifiers are now an active area of research in machine learning and pattern recognition (Rodriguez, Kuncheva, & Alonso, 2006). An ensemble is a meta-classifier that combines the predictions of single classifiers, called base classifiers, with equal weights or with weights based on estimated prediction accuracy. Therefore, this study planned to employ such classifiers and ensemble methods for acceptability evaluation problems. Evaluating the importance of relevant features is another important problem requiring extensive research. Han, Kim, Yun, Hong, and Kim (2004) provided an approach to identify design features of a mobile phone critical to user satisfaction. Hsiao and Huang (2002) proposed a neural network based method to examine the relationships between product form parameters and adjective image words. Lai, Lin, and Yeh (2005) employed grey prediction and neural network models to find the best combination of product form elements for matching a given product image represented by a word pair. Poirson, Dépincé, and Petiot (2007) adopted genetic algorithms to determine the influential objective variables of brass musical instruments. Akay and Kurt (2009) presented a neuro-fuzzy method to convert affective consumer needs into explicit form features of products. However, no published research has applied feature ranking, a pre-processing step in the data mining process, to explore and rank the importance of related features in product development.
In this study, we propose a model to calculate importance ratings for the features of a product and to estimate consumer acceptability. The model applies Relief, Information Gain (IG) and Chi-squared (CHI) to rank the importance of product features. In addition, it adopts the Bayesian Network (BN), Radial Basis Function (RBF) Network, Support Vector Machine–Sequential Minimal Optimization (SVM–SMO) and their ensemble classifiers to solve the problem in an attempt to obtain better predictive performance. Classifier performance estimation has previously been applied in many areas; a commonly used performance measure is the area under the receiver operating characteristic curve, referred to below as the AUC.


In this study, we used a non-parametric approach, the Wilcoxon statistic (DeLong, DeLong, & Clarke-Pearson, 1988; Hanley & McNeil, 1982), to compare the average AUC among the classifiers. The model was developed to assist manufacturers in evaluating the importance of product features and in predicting consumer acceptability.

This study is organized as follows. Section 2 illustrates the BN, RBF Network and SVM–SMO; the data analysis and experimental results are shown in Section 3 and discussed in Section 4. Finally, conclusions are presented in Section 5.

2. Methodologies

2.1. Bayesian Network (BN)

A BN is a graphical representation of the probabilistic relationships between multiple variables (Klopotek, 2002). The probability distributions of known variables can be entered as inputs, and the network provides the probability distributions of unknown variables as outputs. A conditional probability table gives the strength of each relationship, and variables that are not connected by an arc can be considered as having no direct influence on one another. In addition, the BN incorporates probabilistic inference engines that support reasoning under uncertainty (Hruschka & Ebecken, 2007). It is the outcome of a machine learning process that finds a network's structure and its associated parameters, and it can provide both diagnostic and predictive reasoning (Lauria & Duchessi, 2007). In this study, the BN learning algorithm used hill climbing restricted by an ordering on the variables. Once the structure has been learned, the conditional probability tables of the BN are filled with probabilities estimated directly from the data.

2.2. Radial basis function (RBF) network

RBF Networks were first introduced in 1988 (Broomhead & Lowe, 1988). In the last two decades, RBF Networks have been widely used for function approximation and pattern classification (Moody & Darken, 1989; Nabney, 1999; Poechmueller, Hagamuge, Glesner, Schweikert, & Pfeffermann, 1994). An RBF Network is an artificial neural network that uses radial basis functions as activation functions; its output is a linear combination of radial basis functions. An RBF Network has three layers: an input layer, a hidden layer with Gaussian functions, and an output layer (Bhatt & Gopal, 2004). This three-layer feed-forward network can be represented graphically as shown in Fig. 1. The RBF Network is defined by:

$$y_m(X) = \sum_{j=1}^{M} w_{mj}\,\phi_j(X) + w_{m0}\,b_m \qquad (1)$$

In this equation, X is the n-dimensional input pattern vector and m = 1, 2, ..., K, where K is the number of outputs and M is the number of hidden neurons. $w_{mj}$ is the weight connecting the jth hidden unit to the mth output node, and $w_{m0}$ is the weight connecting the bias to the mth output node. The activation functions $\phi_j(X)$ of the hidden layer are chosen to be Gaussian. The activation function of the jth hidden unit is given by:

$$\phi_j(X) = e^{-\frac{\|x - u_j\|^2}{2\sigma_j^2}} \qquad (2)$$

where $u_j$ and $\sigma_j$ are the center and the width of the jth hidden unit, respectively, which are adjusted during learning. Training the RBF Network includes three phases. After the number of neurons is determined, parameters such as the centers, widths and weights are tuned using the training samples. Usually, the centers are decided first. The next phase is the selection of the width for each center. The final phase is to find the weights. The performance of an RBF Network depends mostly on the selection of the centers and the widths.
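As a concrete illustration of Eqs. (1) and (2), the following NumPy sketch computes the forward pass of an already-trained RBF Network; the array names (centers, widths, W, bias_weights) and the toy parameter values are ours, not the authors', and a real model would obtain them from the three-phase training described above.

import numpy as np

def rbf_forward(X, centers, widths, W, bias_weights):
    """Forward pass of an RBF Network, Eqs. (1)-(2).

    X:            (n_samples, n_features) input patterns
    centers:      (M, n_features) Gaussian centers u_j
    widths:       (M,) widths sigma_j
    W:            (K, M) weights w_mj from hidden unit j to output m
    bias_weights: (K,) bias weights w_m0 (bias term b_m taken as 1)
    """
    # Squared distances ||x - u_j||^2 for every sample and hidden unit
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Gaussian activations phi_j(X), Eq. (2)
    phi = np.exp(-d2 / (2.0 * widths ** 2))
    # Linear combination plus bias, Eq. (1)
    return phi @ W.T + bias_weights

# Toy usage with illustrative parameter values
X = np.array([[0.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.5]])
widths = np.array([0.7, 0.9])
W = np.array([[0.4, -0.2]])          # one output node (K = 1)
bias_weights = np.array([0.1])
print(rbf_forward(X, centers, widths, W, bias_weights))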

[Fig. 1. The architecture of an RBF Network: input layer, hidden layer of Gaussian units, and output layer connected by weights w_mj.]

2.3. Support vector machine (SVM)

The SVM was proposed by Vapnik (1995), building on the statistical learning theory he began developing in the late 1960s. Recently, SVMs have been widely used for solving problems in pattern recognition, bioinformatics, and other fields. In a binary classification context, the SVM tries to find an optimal linear hyperplane such that the margin of separation between the positive and the negative examples is maximized. The objective of the classifier is to define a boundary between the examples of the positive class and those of the negative class. A linear machine learns to separate two classes via a linear decision function defined by a weight vector w and a bias b, $w^T x + b = 0$. Given a set of labeled training examples $(x_i, y_i)$ with $i = 1, 2, \ldots, N$ and $x_i \in \mathbb{R}^n$, each $x_i$ is an input instance and $y_i \in \{+1, -1\}$ indicates one of the two possible classes, positive or negative. The problem of finding the best boundary is better formulated not with one but with three parallel hyperplanes such that

$$H:\; w \cdot x + b = 0, \qquad H_1:\; w \cdot x + b = 1, \qquad H_2:\; w \cdot x + b = -1 \qquad (3)$$

The distance between $H_1$ and $H_2$ in Fig. 2 is called the margin and has a magnitude of $2/\|w\|$. The classifier with the largest margin will show the best generalization for data points that were not in the example set. Thus, the problem of finding the best hyperplane is transformed into a linearly constrained optimization problem, namely, to maximize the margin between $H_1$ and $H_2$ subject to the constraint

$$y_i (x_i \cdot w + b) - 1 \geq 0 \quad \forall i = 1, \ldots, n \qquad (4)$$

In primal weight space, the classifier takes the decision function form of Eq. (5). Cristianini and Taylor (2000) indicated that the SVM is trained by solving the optimization problem of Eq. (6).

$$f(x) = \operatorname{sign}(w \cdot x + b) \qquad (5)$$

$$\text{Minimize} \quad \frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i \qquad \text{subject to} \quad y_i (w \cdot x_i + b) \geq 1 - \xi_i, \;\; \xi_i \geq 0, \;\; i = 1, \ldots, n \qquad (6)$$

where C is the parameter that controls the tradeoff between the training error $\sum_{i=1}^{N} \xi_i$ and the margin, and $K(\cdot,\cdot)$ is a kernel function that corresponds to an inner product in a higher-dimensional space. The classifier represented in Eq. (6) is still restricted by the fact that it performs only a linear separation of the data. This can be overcome by mapping the input examples to a high-dimensional space, where they can be efficiently separated by a linear SVM. This mapping is performed with kernel functions, which allow access to high-dimensional spaces without knowing the mapping function explicitly, which is usually very complex. Kernel functions compute dot products between any pair of patterns in the new space. Thus, the only modification necessary to deal with non-linearity is to substitute every dot product between patterns by the kernel product. Kernel functions implicitly map data to new feature spaces and take the form

$$K(x_i, x_j) \in \mathbb{R} \qquad (7)$$

Different kernel functions can be used in an SVM. The selection of an appropriate kernel function is very important, since the kernel defines the feature space in which the training examples will be classified, and different kernel functions lead to different performance results. In this study, linear, polynomial and radial basis function (RBF) kernels were evaluated; their formulations are given in Table 1. We mainly employ the most commonly used type, the RBF kernel of Eq. (8):

$$K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2) \qquad (8)$$

It should be noted that SVMs were originally proposed to solve binary classification problems; nevertheless, several strategies can be employed to extend this learning technique to multi-class problems. Sequential Minimal Optimization (SMO) is a fast method to train an SVM, and we used an SMO implementation of the SVM. The setting of parameters can improve classification accuracy; this work tried to find the best accuracy among the linear, polynomial, and radial kernel functions. In this study, the best result was obtained using the radial kernel function.

[Fig. 2. Separation of two classes by an SVM.]

Table 1. Formulation of kernel functions.

Kernel type          Kernel function K(x_i, x_j)
Linear               x_i^T · x_j
Polynomial           (x_i · x_j)^d
RBF (or Gaussian)    exp(−γ ||x_i − x_j||^2)
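For readers who want to reproduce the classifier part of this section, the sketch below trains an RBF-kernel SVM with scikit-learn's SVC, whose underlying libsvm solver is an SMO-type algorithm; this is only an assumed stand-in for the SVM–SMO implementation used by the authors, and the toy data, C and gamma values are illustrative.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Toy two-class data standing in for the encoded car-evaluation records;
# in practice X would hold the numerically encoded attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 3] + X[:, 5] > 0).astype(int)   # illustrative labels

# RBF-kernel SVM; libsvm's solver is an SMO-type algorithm.
clf = SVC(kernel="rbf", C=1.0, gamma=0.5)
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross validation
print("mean accuracy: %.3f" % scores.mean())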


2.4. Ensemble methods

An ensemble is a machine learning technique that combines the outputs of several different classifiers with the goal of improving classification performance. It is a meta-classifier that combines the predictions of single classifiers. The goal of ensemble methods is to generate, from a given training dataset, a collection of diverse predictors whose errors are uncorrelated. Ensemble methods in machine learning aim to achieve higher classification accuracy and efficiency. An ensemble works better when each learner is trained well but different learners generalize in different ways, i.e., when there is diversity in the ensemble (Ghosh, 2002). Ensembles can be built using different base classifiers: decision stumps (Schapire, Freund, Bartlett, & Lee, 1998), decision trees (Breiman, 2001), support vector machines (Kim, Pang, Je, Kim, & Bang, 2003; Wang et al., 2009), etc. In addition, many methods for constructing ensembles have been developed (Dietterich, 1997), with Bagging, Boosting and AdaBoost being the most popular ensemble learning algorithms (Oza & Russell, 2001). In this study, we employed ensemble methods that combine the BN, RBF Network and SVM classifiers to form different ensemble classifiers.

2.4.1. Bagging

Bagging was proposed by Breiman (1996) and is based on bootstrap aggregating. It is a meta-algorithm for improving classification and regression models in terms of stability and classification accuracy, and it is a device for improving the accuracy of unstable classifiers. If a base classifier is unstable, bagging helps to reduce the errors associated with random fluctuations in the training data. If a base classifier is stable, then the error of the ensemble is primarily caused by bias in the base classifier. In general, a combined classifier gives better results than single classifiers because it combines the advantages of the single classifiers in the final solution. Therefore, bagging can be helpful for building a better classifier model.

2.4.2. AdaBoost and MultiBoosting

Boosting is a method that makes the most of a classifier by improving its accuracy. Freund and Schapire (1997) formulated AdaBoost, short for Adaptive Boosting. It is a well-known, effective technique for increasing the accuracy of learning algorithms. The training and validation sets are switched, and a second pass is performed. Re-weighting and re-sampling are the two methods implemented in AdaBoost. There are many variants on the idea of boosting; we use a widely applied method called AdaBoostM1 that is designed specifically for classification. MultiBoosting is an extension of the highly successful AdaBoost technique for forming decision committees and can be viewed as combining AdaBoost with wagging. It is able to harness AdaBoost's high bias and variance reduction together with wagging's superior variance reduction (Webb, 2000).

2.5. Feature ranking

Ranking of relevant features is an important problem that has been extensively studied, and a number of different measures and feature ranking methods have been developed (Kononenko & Kukar, 2006). Focusing on feature ranking does not mean that we have ignored methods that evaluate attributes; on the contrary, ranked lists of attributes can be obtained from such methods. Ranking methods are based on statistics, information theory, or on some function of a classifier's outputs (Duch, Winiarski, Biesiada, & Kachel, 2003). In this study we used three well-known attribute ranking techniques: Relief, Information Gain and Chi-squared. The three methods use their rankings to measure the importance of features.
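Before turning to the ranking methods, the following sketch illustrates how the Section 2.4 ensembles might be assembled in scikit-learn; it covers only Bagging and AdaBoost (MultiBoosting has no standard scikit-learn implementation), uses GaussianNB as an assumed stand-in for a Bayesian base learner, and assumes a recent scikit-learn version in which the constructor parameter is named estimator.

from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Stand-in data; the real study uses the encoded car-evaluation records.
X, y = make_classification(n_samples=300, n_features=6, random_state=1)

# Stand-in base learner (the paper's base classifiers are BN, RBF Network and SVM-SMO).
base = GaussianNB()

ensembles = {
    "Bagging": BaggingClassifier(estimator=base, n_estimators=10, random_state=1),
    "AdaBoost": AdaBoostClassifier(estimator=base, n_estimators=10, random_state=1),
}
for name, model in ensembles.items():
    # 10-fold cross-validated accuracy of each ensemble
    acc = cross_val_score(model, X, y, cv=10).mean()
    print(f"{name}: {acc:.3f}")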


2.5.1. Relief

Relief, developed by Kira and Rendell (1992), is a feature ranking method that is able to estimate the importance of attributes. Relief assigns a grade of relevance to each feature, and those features valued over a user-given threshold are selected. Relief is limited to two-class problems. In this work, we rank the importance of the features according to the principle that the larger the weight, the more important the feature. The basic algorithm of Relief (Arauzo-Azofra, Benitez, & Castro, 2004) is described as follows:

    RELIEF(dataset, j, ...):
        for 1 to j:
            K1 = random example from dataset
            Neighbors = find some of the nearest examples to K1
            for K2 in Neighbors:
                perform some evaluation between K1 and K2
        return the evaluation

2.5.2. Information Gain (IG)

IG is a measure based on entropy, where entropy measures the impurity of a set of data. The information gain of X with respect to the class attribute Y is the reduction in uncertainty about the value of Y when the value of X is known, written IG(Y;X). IG looks at each feature independently and measures the importance of a feature with respect to the class. IG is obtained as follows:

$$IG(Y;X) = H(Y) - H(Y|X) \qquad (9)$$

where Y and X are discrete features. The entropy of Y is given by:

$$H(Y) = -\sum_{y \in Y} p(y)\,\log_2 p(y) \qquad (10)$$

The conditional entropy of Y given X is:

$$H(Y|X) = -\sum_{x \in X} p(x) \sum_{y \in Y} p(y|x)\,\log_2 p(y|x) \qquad (11)$$
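The sketch below is a direct transcription of Eqs. (9)-(11) for discrete features; the function names and the toy safety/label lists are ours, not the authors'.

from collections import Counter
from math import log2

def entropy(values):
    """H(Y) of Eq. (10) for a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def information_gain(y, x):
    """IG(Y;X) = H(Y) - H(Y|X), Eqs. (9)-(11)."""
    n = len(y)
    conditional = 0.0
    for value in set(x):
        subset = [yi for yi, xi in zip(y, x) if xi == value]
        conditional += (len(subset) / n) * entropy(subset)   # p(x) * H(Y | X = x)
    return entropy(y) - conditional

# Toy example: a safety-like attribute against an accept/reject label
safety = ["low", "low", "med", "high", "high", "med"]
label  = ["unacc", "unacc", "acc", "acc", "acc", "unacc"]
print(information_gain(label, safety))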

2.5.3. Chi-squared (CHI)

CHI is one of the most widely used statistical tests (Cantu-Paz, Newsam, & Kamath, 2004). This method ranks features individually by measuring their chi-squared statistics with respect to the classes (Hawley, 1996). The chi-squared statistic can be calculated from a contingency table (Forman, 2003; Yang & Pedersen, 1997) as follows:

$$\chi^2_d = \sum_{i=1}^{m}\sum_{j=1}^{k} \frac{(f_{ij} - F_{ij})^2}{F_{ij}} \qquad (12)$$

where m is the number of values of the variable, k is the number of classes of the target variable, $f_{ij}$ is the observed frequency, $F_{ij}$ is the expected frequency, and d = (m − 1) × (k − 1) is the degree of freedom of the statistic. After calculating the $\chi^2$ value of all considered features, we can rank the features based on these values: the larger the $\chi^2$ value, the more important the feature.

3. Experimental results

3.1. Database overview

In this study, we adopted a real-world car evaluation database from the UCI repository of machine learning databases (Bohanec & Zupan, 1997); the attributes of the database are described in Table 2. The car evaluation database contains examples described by six attributes of a car: Buying, Maint, Doors, Persons, Boot and Safety. The database has 1728 instances, each classified into one of four classes (Very-good, Good, Acceptable, and Unacceptable).
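As an illustration of how Eq. (12) could be used to rank the car-evaluation attributes described above, the sketch below builds a contingency table per attribute and scores it with scipy's chi2_contingency; it assumes the UCI car.data file is available locally with its standard column order and label strings (unacc, acc, good, vgood), whose spellings differ slightly from the names used in Table 2.

import pandas as pd
from scipy.stats import chi2_contingency

cols = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
# Assumes the UCI "car.data" file has been downloaded locally.
df = pd.read_csv("car.data", names=cols)

# Collapse the four classes into two, as described in Section 3.1.
df["accept"] = (df["class"] != "unacc").astype(int)

scores = {}
for attr in cols[:-1]:
    table = pd.crosstab(df[attr], df["accept"])        # observed frequencies f_ij
    chi2, p, dof, expected = chi2_contingency(table)   # Eq. (12)-style statistic
    scores[attr] = chi2

# Larger chi-squared value = more important attribute
for attr, chi2 in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{attr:10s} {chi2:10.2f}")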

Table 2. The attributes description and their values of the car evaluation dataset.

Characteristic  Attribute  Attribute description                   Nominal values
Price           Buying     Buying price                            Very-high, high, median, low
Price           Maint      Price of the maintenance                Very-high, high, median, low
Tech            Doors      Number of doors                         2, 3, 4, 5, more
Tech            Persons    Capacity in terms of persons to carry   2, 4, more
Tech            Boot       The size of the luggage boot            Small, median, big
Tech            Safety     Estimated safety of the car             Low, median, high

[Fig. 3. The work flow of the proposed model: (i) collection and input of the raw dataset; (ii) data preprocessing, transforming the raw data from nominal to numeric form, ranking the features with Relief, IG and CHI, and combining similar classes; (iii) building the intelligent model with the single classifiers (BN, RBF Network, SVM–SMO) and their Bagging, AdaBoost and MultiBoosting ensembles; (iv) model evaluation with k-fold cross validation (accuracy, sensitivity, specificity), the area under the curve (AUC), and comparison of the methods' efficiency with the Wilcoxon statistic.]

Table 3. The attributes obtained by the attribute ranking methods (average weights).

Method    Buying          Maint           Doors          Persons          Boot            Safety
Relief    0.045 ± 0.003   0.034 ± 0.003   0.012 ± 0.002  0.204 ± 0.002    0.017 ± 0.002   0.236 ± 0.004
IG        0.033 ± 0.002   0.028 ± 0.004   0.002 ± 0.002  0.219 ± 0.002    0.012 ± 0.001   0.229 ± 0.003
CHI       68.130 ± 3.661  58.052 ± 6.963  3.396 ± 4.163  332.902 ± 3.503  24.406 ± 2.972  356.826 ± 4.536
Ranking   3               4               6              2                5               1

Table 4. The accuracy, sensitivity and specificity results of using single classifiers and ensemble classifiers.

Classifier    Measure       Single classifier  Bagging        AdaBoostM1     MultiBoosting
BN            Accuracy      0.923 (0.004)      0.941 (0.004)  0.942 (0.004)  0.942 (0.003)
BN            Sensitivity   0.933 (0.041)      0.946 (0.036)  0.941 (0.038)  0.932 (0.021)
BN            Specificity   0.889 (0.038)      0.906 (0.033)  0.952 (0.026)  0.921 (0.025)
RBF Network   Accuracy      0.915 (0.008)      0.932 (0.003)  0.954 (0.005)  0.947 (0.003)
RBF Network   Sensitivity   0.921 (0.038)      0.941 (0.038)  0.958 (0.019)  0.952 (0.026)
RBF Network   Specificity   0.874 (0.040)      0.891 (0.035)  0.932 (0.021)  0.921 (0.025)
SVM–SMO       Accuracy      0.956 (0.001)      0.958 (0.003)  0.961 (0.002)  0.960 (0.002)
SVM–SMO       Sensitivity   0.963 (0.024)      0.960 (0.027)  0.965 (0.023)  0.964 (0.024)
SVM–SMO       Specificity   0.929 (0.026)      0.931 (0.025)  0.937 (0.024)  0.935 (0.023)

Note: The numbers in parentheses are the standard errors.

In this work, the first three classes were viewed as one class; therefore, the four classes were reduced to two classes (Unacceptable and Acceptable).

3.2. The architecture of the model

An intelligent model was developed in this work to solve the real-world problem of evaluating and predicting consumer acceptability. The work flow of the proposed model is shown in Fig. 3, and each phase is introduced below:

Stage 1: Collection and input of the raw dataset. This includes collecting the raw data, selecting the data and focusing on the features that influence the car evaluation.
Stage 2: Pre-processing the dataset. This stage includes two steps. First, the data are transformed into forms appropriate for computation. Second, we adopt three feature ranking methods (Relief, IG and CHI) to rank the features of the car evaluation dataset.
Stage 3: Building the intelligent model. The experiment employs the single classifiers (BN, RBF Network and SVM–SMO) and the ensemble classifiers (Bagging, AdaBoost and MultiBoosting) to build the models.
Stage 4: Model evaluation. This stage adopts 10-fold cross validation to evaluate the accuracy, sensitivity, specificity and AUC of the results, and lists comparisons of the calculated values (illustrated in the sketch below). Finally, we compare the efficiency of the methods using the Wilcoxon statistic.

3.3. Ranked attributes using feature ranking methods

Determining the importance of features through ranking is the problem we attempt to solve here. In this work, we employed three feature ranking methods to rank the features of the car evaluation dataset. Table 3 shows the features of the car evaluation dataset ranked using Relief, IG and CHI. As seen in Table 3, each feature is ranked by its average weight under each of the three feature ranking methods, and the rank number in the last column represents its degree of importance.
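A minimal sketch of the Stage 4 evaluation referred to above: accuracy, sensitivity and specificity derived from 10-fold cross-validated predictions. The classifier, the random data and the scikit-learn helpers are illustrative stand-ins for the actual experiment.

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
from sklearn.datasets import make_classification

# Stand-in binary data in place of the encoded two-class car records
X, y = make_classification(n_samples=400, n_features=6, random_state=2)

# Out-of-fold predictions from 10-fold cross validation
pred = cross_val_predict(SVC(kernel="rbf"), X, y, cv=10)
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true-positive rate
specificity = tn / (tn + fp)   # true-negative rate
print(accuracy, sensitivity, specificity)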

[Fig. 4. Comparison of the results using single classifiers and ensemble classifiers.]

Table 5. Comparison of the AUC using the different classifiers.

Classifier          Estimated AUC    95% Confidence interval
BN                  0.973 (0.016)    0.967–0.979
B + BN              0.983 (0.001)    0.983–0.984
A + BN              0.986 (0.009)    0.982–0.989
M + BN              0.988 (0.001)    0.988–0.989
RBF Network         0.973 (0.004)    0.972–0.975
B + RBF Network     0.982 (0.001)    0.981–0.982
A + RBF Network     0.990 (0.005)    0.989–0.992
M + RBF Network     0.986 (0.016)    0.980–0.991
SVM–SMO             0.953 (0.005)    0.952–0.955
B + SVM–SMO         0.973 (0.002)    0.972–0.974
A + SVM–SMO         0.993 (0.003)    0.992–0.994
M + SVM–SMO         0.994 (0.001)    0.993–0.994

Note 1: The numbers in parentheses are the standard errors. Note 2: B: Bagging, A: AdaBoostM1, M: MultiBoosting.


Table 6. Comparison of classifier efficiency under BN, RBF Network and their ensembles (p-values).

                  BN    B+BN   A+BN   M+BN   RBF    B+RBF   A+RBF   M+RBF
BN                NA    0.533  0.479  0.349  1.000  0.575   0.311   0.566
B + BN            –     NA     0.740  0.001  0.002  0.480   0.170   0.852
A + BN            –     –      NA     0.825  0.187  0.659   0.698   1.000
M + BN            –     –      –      NA     0.001  <0.001  0.695   0.901
RBF Network       –     –      –      –      NA     0.030   0.008   0.431
B + RBF Network   –     –      –      –      –      NA      0.117   0.803
A + RBF Network   –     –      –      –      –      –       NA      0.811
M + RBF Network   –     –      –      –      –      –       –       NA

B: Bagging, A: AdaBoostM1, M: MultiBoosting; RBF: RBF Network.

Table 7. Comparison of classifier efficiency under BN, SVM–SMO and their ensembles (p-values).

                BN    B+BN   A+BN   M+BN   SVM     B+SVM   A+SVM   M+SVM
BN              NA    0.533  0.479  0.349  0.233   1.000   0.220   0.191
B + BN          –     NA     0.740  0.001  <0.001  <0.001  0.002   <0.001
A + BN          –     –      NA     0.825  0.001   0.159   0.461   0.377
M + BN          –     –      –      NA     <0.001  <0.001  0.114   <0.001
SVM–SMO         –     –      –      –      NA      0.001   <0.001  <0.001
B + SVM–SMO     –     –      –      –      –       NA      <0.001  <0.001
A + SVM–SMO     –     –      –      –      –       –       NA      0.752
M + SVM–SMO     –     –      –      –      –       –       –       NA

B: Bagging, A: AdaBoostM1, M: MultiBoosting; SVM: SVM–SMO.

Table 8. Comparison of classifier efficiency under RBF Network, SVM–SMO and their ensembles (p-values).

                 RBF   B+RBF  A+RBF  M+RBF  SVM     B+SVM   A+SVM   M+SVM
RBF Network      NA    0.030  0.008  0.431  0.002   1.000   0.001   <0.001
B + RBF Network  –     NA     0.117  0.803  <0.001  0.001   <0.001  <0.001
A + RBF Network  –     –      NA     0.811  <0.001  0.002   0.607   0.433
M + RBF Network  –     –      –      NA     0.049   0.420   0.667   0.618
SVM–SMO          –     –      –      –      NA      0.001   <0.001  <0.001
B + SVM–SMO      –     –      –      –      –       NA      <0.001  <0.001
A + SVM–SMO      –     –      –      –      –       –       NA      0.751
M + SVM–SMO      –     –      –      –      –       –       –       NA

B: Bagging, A: AdaBoostM1, M: MultiBoosting; RBF: RBF Network; SVM: SVM–SMO.

The results show that Safety ranked first and is the most important feature, while the importance of Doors is the lowest.

3.4. Model performance

An experiment was set up to compare the BN, RBF Network and SVM–SMO with Bagging, AdaBoost and MultiBoosting. In all ensemble methods, the BN, RBF Network and SVM–SMO were used as the base classifiers. The comparison of experiments is based on 10-fold cross validation. In this study, we performed experiments using the car evaluation dataset and compared the results of the single BN, RBF Network and SVM–SMO classifiers and their ensembles. The experimental results are listed by accuracy, sensitivity and specificity in Table 4, and the accuracy comparisons of these methods are illustrated in Fig. 4. As seen in Fig. 4, the ensembles are better than the single classifiers. The AdaBoostM1 ensemble classifiers had the best accuracy among all these approaches (BN: 0.942, RBF Network: 0.954 and SVM–SMO: 0.961).

3.5. Model evaluation

The usual method of comparing classification algorithms is to perform k-fold cross validation experiments to estimate the accuracy of the algorithms. In this study, the statistical results for each classification model were based on 30 runs of 10-fold cross validation, and the following statistics were calculated: sensitivity, specificity and accuracy. The AUC is the evaluation criterion for the classifier and can be statistically interpreted as the probability that the classifier ranks a randomly chosen positive case above a randomly chosen negative case. In this work, the AUC is obtained by a non-parametric method based on the Wilcoxon statistic to approximate the area, and it can be used to compare two different ROC curves derived from the same samples of cases. Table 5 shows the average AUC, the corresponding standard error (SE, derived from the 30 AUC values), and the 95% confidence interval (CI) for the different classifiers (BN, RBF Network, SVM–SMO and their ensembles). The results demonstrate that the AUCs of the ensemble classifiers are better than those of the single classifiers. Tables 6–8 show the pairwise comparisons and significance levels of the eight methods in each group. Significant or near-significant p-values (from pairwise two-tailed z-tests) indicate that the two classifiers have significantly different prediction rates. From Tables 6 and 7, the results show that there is no significant difference between BN and the other classifiers (BN ensembles, RBF Network, SVM–SMO) and their ensembles. However, Tables 7 and 8 reveal that SVM–SMO is significantly different from the other single classifiers (BN ensembles, RBF Network) and their ensemble classifiers.
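The non-parametric AUC referred to here can be computed directly as the normalised Mann–Whitney (Wilcoxon) statistic of Hanley and McNeil (1982); the sketch below shows that estimator for a single classifier's scores (the full DeLong comparison of correlated AUCs is more involved and is not reproduced). The toy score lists are ours.

import numpy as np

def auc_mann_whitney(scores_pos, scores_neg):
    """Non-parametric AUC estimate: the Mann-Whitney / Wilcoxon statistic
    normalised by the number of positive-negative pairs (Hanley & McNeil, 1982)."""
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_neg = np.asarray(scores_neg, dtype=float)
    # Count pairs where the positive case is scored above the negative case;
    # ties contribute 1/2.
    greater = (scores_pos[:, None] > scores_neg[None, :]).sum()
    ties = (scores_pos[:, None] == scores_neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(scores_pos) * len(scores_neg))

# Toy classifier scores for acceptable (positive) and unacceptable (negative) cases
pos = [0.9, 0.8, 0.75, 0.6]
neg = [0.7, 0.4, 0.3]
print(auc_mann_whitney(pos, neg))   # matches sklearn.metrics.roc_auc_score on the same data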


4. Discussion

Consumer acceptability has become a very important issue, as manufacturers have to decide which product factors influence consumer acceptability. In a highly competitive market, meeting consumers' needs is a critical factor for product success. However, consumers' needs are difficult to grasp, and product manufacturers often misunderstand which product attributes consumers really need or pay attention to. Therefore, evaluating the importance of product features and predicting consumer acceptability is an important problem for production planners.

Feature ranking is a problem that has to be addressed in many areas, and more and more researchers are seeking better strategies to determine the importance of features. In this study, we employed three feature ranking methods, Relief, IG and CHI, to rank the features of a car evaluation dataset. The results demonstrate that Safety is the most important feature, which supports the finding of Byun (2001).

Effective prediction methods should be developed to help manufacturers predict consumer acceptability more accurately, and more and more researchers are seeking better strategies through the help of prediction models. In previous work, several modeling alternatives, both traditional statistical methods and artificial intelligence methods, have been developed to handle the task of predicting consumer acceptability. This study adopted two classical prediction methods (BN and RBF Network) and one state-of-the-art method (SVM–SMO) for the problem. The results reveal that the newer technique, SVM–SMO, outperforms the others.

Recently, a special type of predictive model, the ensemble method, has been well studied in the data mining community. Ensemble methods in machine learning aim to induce a collection of diverse predictors that are both accurate and complementary. Generally, an ensemble built from different base classifiers is more accurate than a single classifier and improves the performance of predictions. In this study, the accuracy comparisons show that the ensemble classifiers are better than the single classifiers, and similar comparison results were observed for the AUC.

The AUC is frequently used as a measure of the effectiveness of a decision maker, and different estimation methods will of course provide different estimated AUC values. In this work, the AUC is computed by a non-parametric method based on the Wilcoxon statistic so that classifiers can be compared on the same samples of cases. As seen in Tables 6 and 7, the AUC of BN is not significantly different from those of the BN ensembles, RBF Network, SVM–SMO and their ensembles; the classification results are therefore not significantly improved by using ensembles of the BN classifier. However, as shown in Tables 7 and 8, the AUC of SVM–SMO is significantly different from those of the other classifiers and their ensembles; therefore, the classification results are significantly improved by using the SVM–SMO ensemble classifiers.

This study developed an intelligent model to evaluate and predict consumer acceptability. The analysis using the intelligent model helps provide industry with a basis for evaluating the importance of product features and for making more accurate predictions of consumer acceptability. Generally, other datasets from different industry sectors, such as the health care, services and financial industries, can be assessed for consumer acceptability using this intelligent model, and other attributes influencing the importance of a product to consumers can be used in feature ranking to objectively rank the importance of attributes.

5. Conclusions

Evaluation and prediction of consumer acceptability is one of the important issues in production planning. A meaningful evaluation method is necessary, and building an effective prediction model has been an important task. This study developed an intelligent model to deal with consumer acceptability problems. The model adopted three well-known feature ranking methods (Relief, IG and CHI) to evaluate the importance of features. In addition, BN, RBF Network, SVM–SMO and their ensembles were employed to build models in an attempt to achieve better predictive performance. The results show that the ensemble classifiers are more accurate than a single classifier. This intelligent model not only helps manufacturers evaluate the importance of product features but also predicts consumer acceptability. It was demonstrated that the consumer acceptability problem can be efficiently solved by the proposed model. The proposed intelligent model is simple to implement and can be easily extended to other fields to improve performance on consumer acceptability problems.

References

Akay, D., & Kurt, M. (2009). A neuro-fuzzy based approach to affective design. International Journal of Advanced Manufacturing Technology, 40, 425–437.
Arauzo-Azofra, A., Benitez, J. M., & Castro, J. L. (2004). A feature set measure based on relief. In Proceedings of the 5th international conference on recent advances in soft computing (pp. 104–109).
Berry, M., & Linoff, G. (1997). Data mining techniques for marketing sales and customer support. New York: Wiley.
Bhatt, R. B., & Gopal, M. (2004). On the structure and initial parameter identification of Gaussian RBF Networks. International Journal of Neural Systems, 14(6), 373–380.
Bohanec, M., & Zupan, B. (1997). UCI repository of machine learning database. Irvine, CA: University of California, Department of Information and Computer Science.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Bretz, R. D., Jr., Milkovich, G. T., & Read, W. (1992). The current state of performance appraisal research and practice: Concerns, directions, and implications. Journal of Management, 18, 321–352.
Broomhead, D. S., & Lowe, D. (1988). Multivariable functional interpolation and adaptive networks. Complex Systems, 2, 321–355.
Byun, D. H. (2001). The AHP approach for selecting an automobile purchase model. Information & Management, 38(5), 289–297.
Cantu-Paz, E., Newsam, S., & Kamath, C. (2004). Feature selection in scientific applications. In Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, Seattle, Washington (pp. 22–25).
Chao, L. C., & Skibniewski, M. J. (1995). Neural network method of estimating construction technology acceptability. Journal of Construction Engineering and Management, 121(1), 130–142.
Cristianini, N., & Taylor, J. S. (2000). An introduction to support vector machines. Cambridge, MA: Cambridge University Press.
DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics, 44(3), 837–845.
Dietterich, T. G. (1997). Machine learning research: Four current directions. AI Magazine, 18(4), 97–136.
Duch, W., Winiarski, T., Biesiada, J., & Kachel, A. (2003). Feature ranking, selection and discretization. In International conference on artificial neural networks (ICANN) and international conference on neural information processing (ICONIP), Istanbul (pp. 251–254).
Elazouni, A. M., Ali, E. A., & Abdel-Razek, R. H. (2005). Estimating the acceptability of new formwork systems using neural networks. Journal of Construction Engineering and Management, 131(1), 33–41.
Forman, G. (2003). An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, 3, 1289–1305.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Ghosh, J. (2002). Multiclassifier systems: Back to the future. In Proceedings of the 3rd international workshop on multiple classifier systems, Cagliari, Italy (pp. 1–15).
Han, S. H., Kim, K. J., Yun, M. H., Hong, S. W., & Kim, J. (2004). Identifying mobile phone design features critical to user satisfaction. Human Factors and Ergonomics in Manufacturing, 14(1), 15–29.
Hanley, J. A., & McNeil, B. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36.
Hawley, W. (1996). Foundations of statistics. New York: Harcourt Brace & Company.
Hruschka, E. R., Jr., & Ebecken, N. F. F. (2007). Towards efficient variables ordering for Bayesian networks classifier. Data & Knowledge Engineering, 63(2), 258–269.
Hsiao, S. W., & Huang, H. C. (2002). A neural network based approach for product form design. Design Studies, 23, 67–84.
Jaakkola, E. (2007). Critical innovation characteristics influencing the acceptability of a new pharmaceutical product format. Journal of Marketing Management, 23(3–4), 327–346.


Kim, H. C., Pang, S., Je, H. M., Kim, D., & Bang, S. Y. (2003). Constructing support vector machine ensemble. Pattern Recognition, 36(12), 2757–2767.
Kira, K., & Rendell, L. A. (1992). A practical approach to feature selection. In Proceedings of the 9th international workshop on machine learning (pp. 249–256).
Klopotek, M. A. (2002). A new Bayesian tree learning method with reduced time and space complexity. Fundamenta Informaticae, 49(4), 349–367.
Kononenko, I., & Kukar, M. (2006). Machine learning and data mining: Introduction to principles and algorithms. UK: Horwood Publishing Limited.
Lai, H. H., Lin, Y. C., & Yeh, C. H. (2005). Form design of product image using grey relational analysis and neural network models. Computers & Operations Research, 32, 2689–2711.
Lauria, E. J. M., & Duchessi, P. J. (2007). A methodology for developing Bayesian networks: An application to information technology (IT) implementation. European Journal of Operational Research, 179(1), 234–252.
Liao, S. H., Hsieh, C. L., & Huang, S. P. (2008). Mining product maps for new product development. Expert Systems with Applications, 34, 50–62.
Lolas, S., & Olatunbosun, O. A. (2008). Prediction of vehicle reliability performance using artificial neural networks. Expert Systems with Applications, 34, 2360–2369.
McEwan, J. A. (1997). A comparative study of three product acceptability trials. Food Quality and Preference, 8(3), 183–190.
Moody, J., & Darken, C. J. (1989). Fast learning in networks of locally-tuned processing units. Neural Computation, 1(2), 281–294.
Nabney, I. T. (1999). Efficient training of RBF networks for classification. In Proceedings of the 9th international conference on artificial neural networks (pp. 210–215).
Oza, N. C., & Russell, S. (2001). Experimental comparisons of online and batch versions of bagging and boosting. In Proceedings of the 7th ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, California (pp. 359–364).
Pettijohn, C. E., Pettijohn, L. S., & Taylor, A. J. (2002). The influence of salesperson skill, motivation, and training on the practice of customer-oriented selling. Psychology and Marketing, 19(9), 743–757.
Poechmueller, W., Hagamuge, S. K., Glesner, M., Schweikert, P., & Pfeffermann, A. (1994). RBF and CBF neural network learning procedures. In IEEE world congress on computational intelligence (Vol. 1, pp. 407–412).
Poirson, E., Dépincé, P., & Petiot, J. F. (2007). User-centered design by genetic algorithms: Application to brass musical instrument optimization. Engineering Applications of Artificial Intelligence, 20, 511–518.
Rejeb, H. B., Boly, V., & Guimaraes, L. M. (2008). A new methodology based on Kano model for the evaluation of a new product acceptability during the front-end phases. In Proceedings of the 32nd annual IEEE international computer software and applications conference (pp. 619–624).
Rodriguez, J. J., Kuncheva, L. I., & Alonso, C. J. (2006). Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1619–1630.
Rüping, S. (2000). mySVM-manual. University of Dortmund: AI Unit, Computer Science Department.
Schapire, R. E., Freund, Y., Bartlett, P. L., & Lee, W. S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 26(5), 1651–1686.
Vapnik, V. N. (1995). The nature of statistical learning theory. Berlin, New York: Springer.
Wang, S. J., Mathew, A., Chen, Y., Xi, L. F., Ma, L., & Lee, J. (2009). Empirical analysis of support vector machine ensemble classifiers. Expert Systems with Applications, 36(3), 6466–6476.
Webb, G. I. (2000). MultiBoosting: A technique for combining boosting and wagging. Machine Learning, 40(2), 159–197.
Yang, Y., & Pedersen, J. O. (1997). A comparative study on feature selection in text categorization. In Proceedings of the 14th international conference on machine learning (pp. 412–420).