Hybrid intelligent modeling schemes for heart disease classification



Applied Soft Computing 14 (2014) 47–52

Contents lists available at ScienceDirect

Applied Soft Computing journal homepage: www.elsevier.com/locate/asoc

Yuehjen E. Shao a, Chia-Ding Hou a,∗, Chih-Chou Chiu b
a Department of Statistics and Information Science, Fu Jen Catholic University, 510, Chung-Cheng Road, Xinzhuang District, New Taipei City 24205, Taiwan, ROC
b Department of Business Management, National Taipei University of Technology, Taipei City 106, Taiwan, ROC

Article info

Article history: Received 13 March 2013; received in revised form 17 August 2013; accepted 24 September 2013; available online 9 October 2013.

Keywords: Hybrid; Logistic regression; MARS; Artificial neural network; Rough sets; Heart disease

Abstract

Heart disease is the leading cause of death among both men and women in most countries in the world. Thus, people must be mindful of heart disease risk factors. Although genetics play a role, certain lifestyle factors are crucial contributors to heart disease. Traditional approaches use thirteen risk factors or explanatory variables to classify heart disease. Diverging from existing approaches, the present study proposes a new hybrid intelligent modeling scheme to obtain different sets of explanatory variables, and the proposed hybrid models effectively classify heart disease. The proposed hybrid models consist of logistic regression (LR), multivariate adaptive regression splines (MARS), artificial neural network (ANN), and rough set (RS) techniques. The initial stage of the proposed process includes the use of LR, MARS, and RS techniques to reduce the set of explanatory variables. The remaining variables are subsequently used as inputs for the ANN method employed in the second stage. A real heart disease data set was used to demonstrate the development of the proposed hybrid models. The modeling results revealed that the proposed hybrid schemes effectively classify heart disease and outperform the typical, single-stage ANN method. © 2013 Elsevier B.V. All rights reserved.

1. Introduction The heart can be viewed as the body’s engine: it is responsible for pumping life-sustaining blood via a network of vessels. Although most people know that the heart must be properly cared for, heart disease has risen steadily over the last century and has become the leading cause of death for people in the United States [1]. In addition, due to the necessity for the prevention of heart disease, the UK government provides 169 million European dollars to fund research on coronary heart disease [2]. Several studies have been devoted to using some single classification algorithm for the classification of eye diseases [3] and of brain diseases [4,5]. Different from most studies, this paper aims to propose a novel hybrid scheme for the classification of heart diseases. The heart disease data sets used in the present study were real data obtained from a UCI machine learning benchmark repository [6]. Due to its importance to mankind, many studies [7–9] on modeling procedures for heart disease classification have been conducted. Although heart disease data sets have 75 explanatory variables and one dependent variable (i.e., the presence or absence of heart disease), almost every study uses 13 explanatory variables

∗ Corresponding author. E-mail address: [email protected] (C.-D. Hou). 1568-4946/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.asoc.2013.09.020

to predict or classify the dependent variable. Although one can use the aforementioned variables to classify heart disease through logistic regression (LR) techniques or machine learning approaches, the true relationship between these measurements and heart disease is not easy to determine. Moreover, a single technique may not be able to address all classification problems [10–13]. To overcome the aforementioned limitations while maintaining the classification accuracy of existing approaches for heart disease, the present study examines the classification performance of hybrid modeling schemes that integrate logistic regression, multivariate adaptive regression splines (MARS) and rough sets (RS) with artificial neural networks (ANN). The LR model is a forecasting or classification technique that is widely used in many practical applications. However, the LR model is sometimes criticized for its strong assumptions, such as variation homogeneity, which limit its applicability. MARS is typically able to reveal important data patterns and relationships hidden within complex, high-dimensional data structures. In addition to LR and MARS techniques, ANN has become a fruitful alternative for modeling classification problems due to its ability to capture complex nonlinear relationships among variables. Consequently, ANN has often shown superior classification capabilities compared to regression techniques [14–23]. However, ANN is criticized for its long training


process in the design of optimal network topologies and difficulties in identifying the relative importance of potential input variables [24,25]. Rough set theory is a new effective classification tool for dealing with vagueness and uncertainty information [26]. Attribute reduction is one of the most important concepts of rough set theory. Irrelevant and redundant attributes are removed from the decision without any a priori information. Meanwhile, the knowledge mined by rough set theory can be expressed and saved as a rule. Due to these advantages, rough set theory has been widely used in many fields [26]. Although rough set theory has been successfully used to find data dependencies and feature subsets in various works, its running time, which is polynomial, is relatively long. The procedure for hybrid modeling is to initially use LR, MARS and RS techniques to model heart disease data sets. Because there is no theoretical approach for determining the best input variables for an ANN model, LR, MARS and RS can be implemented to determine a good subset of input variables when many potential variables are considered as input vectors of the designed ANN model. The resulting fewer but more significant explanatory variables are used as inputs to the ANN model. In terms of classification capability, the present study compares the traditional single stage of the MARS model, RS model, ANN model, and the proposed hybrid model for heart disease application. The superior classification capability of the proposed hybrid approach is addressed. The remainder of the study is organized as follows: the methodologies of LR, MARS, RS and ANN are discussed in Section 2, and a literature review is provided, as well. The designed LR, MARS, RS and ANN models are presented in Section 3, and a real heart disease data set is used to verify the proposed and typical models. The final section addresses the research findings and concludes the present study. 2. 
Methodologies

Heart disease is one of the most serious diseases for human beings; thus, heart disease classification is an important issue. In the present study, we employed the techniques of LR, MARS, RS, and ANN and the proposed hybrid LR–ANN, MARS–ANN and RS–ANN models to classify heart disease. These methodologies are addressed in the following sections.

2.1. Logistic regression

LR is one of the most common statistical methods for modeling real applications. The modeling process involves the setup of relationships between one dependent or response variable and several independent or explanatory variables. The performance of LR is typically acceptable, as long as the required assumptions have been met. However, the assumptions of the LR model (for example, variation homogeneity) often confine its application. The framework of LR can be simply described as follows. Let $Y_i$ represent the dependent variable in case $i$ and let $Y_i = 0$ or $Y_i = 1$, where 0 denotes the absence of heart disease and 1 denotes the presence of heart disease. Let

$$\Pr[Y_i = 1 \mid X_{1i}, X_{2i}, X_{3i}, \ldots, X_{ni}] = \pi_i, \qquad \Pr[Y_i = 0 \mid X_{1i}, X_{2i}, X_{3i}, \ldots, X_{ni}] = 1 - \pi_i \tag{1}$$

be the probabilities of $Y_i = 1$ and $Y_i = 0$ under a set of given independent variables $(X_{1i}, X_{2i}, \ldots, X_{ni})$. Therefore, the logistic regression model has the following form:

$$\ln\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_n X_{ni} \tag{2}$$
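As an illustration of Eq. (2), the following minimal Python sketch maps a linear predictor to the probability of heart disease and back to the log-odds. The coefficient values and the example case are hypothetical and are not fitted values from this study; the function names are likewise illustrative.

```python
import math


def linear_predictor(beta0, betas, x):
    """Right-hand side of Eq. (2): beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * xj for b, xj in zip(betas, x))


def disease_probability(beta0, betas, x):
    """Solve Eq. (2) for pi: pi = 1 / (1 + exp(-eta))."""
    eta = linear_predictor(beta0, betas, x)
    return 1.0 / (1.0 + math.exp(-eta))


def log_odds(pi):
    """Left-hand side of Eq. (2): ln(pi / (1 - pi))."""
    return math.log(pi / (1.0 - pi))


# Hypothetical coefficients and one hypothetical case (assumptions, not study results).
beta0 = -1.0
betas = [0.8, 0.5]
x = [1.0, 2.0]

pi = disease_probability(beta0, betas, x)
print(f"estimated probability of heart disease: {pi:.4f}")
print(f"log-odds recovered from pi: {log_odds(pi):.4f}")
```

The round trip shows that the log-odds of the estimated probability equals the linear predictor, which is the statement made by Eq. (2).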

A collinearity diagnosis should be initially implemented to exclude variables exhibiting high collinearity. Consequently, the remaining variables can be employed for LR modeling. In addition, the Wald forward method was used to recognize explanatory variables with significant influence.

2.2. Multivariate adaptive regression splines

MARS has been generally applied in many fields [25,27–34]. The general MARS function can be represented as follows [34]:

$$\hat{f}(x) = b_0 + \sum_{m=1}^{M} b_m \prod_{k=1}^{K_m} \left[ S_{km} \left( x_{\nu(k,m)} - t_{km} \right) \right] \tag{3}$$

where $b_0$ and $b_m$ are parameters, $M$ is the number of basis functions (BF), $K_m$ is the number of knots, $S_{km}$ takes on values of either 1 or −1 and indicates the right or left sense of the associated step function, $\nu(k, m)$ is the label of the independent variable, and $t_{km}$ is the knot location. The optimal MARS model was chosen using a two-stage process. First, we set up a large number of basis functions to fit the data. Second, the basis functions with the least contributions were deleted using the generalized cross-validation (GCV) criterion. A measure of variable importance can be obtained by observing the decrease in the calculated GCV values when a variable is removed from the model. The GCV can be expressed as follows:

$$\mathrm{LOF}(\hat{f}_M) = \mathrm{GCV}(M) = \frac{\frac{1}{N} \sum_{i=1}^{N} \left[ y_i - \hat{f}_M(x_i) \right]^2}{\left[ 1 - C(M)/N \right]^2} \tag{4}$$

where $N$ is the number of observations and $C(M)$ is the complexity penalty associated with a model containing $M$ basis functions.
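The GCV criterion in Eq. (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observed values, the fitted values and the complexity penalty `c_m` below are hypothetical, and how $C(M)$ is derived from the number of basis functions is left to the caller because the text does not specify it.

```python
def gcv(y, y_hat, c_m):
    """Generalized cross-validation score of Eq. (4).

    y      -- observed responses y_i
    y_hat  -- fitted values f_M(x_i) from a candidate model
    c_m    -- complexity penalty C(M), supplied by the caller (assumption)
    """
    n = len(y)
    if n == 0 or len(y_hat) != n:
        raise ValueError("y and y_hat must be non-empty and of equal length")
    if c_m >= n:
        raise ValueError("C(M) must be smaller than the number of observations")
    mean_squared_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / n
    return mean_squared_residual / (1.0 - c_m / n) ** 2


# Hypothetical data: four cases and the fitted values of one candidate model.
y = [0.0, 1.0, 1.0, 0.0]
y_hat = [0.1, 0.8, 0.6, 0.3]

score_small_model = gcv(y, y_hat, c_m=1.0)
score_large_model = gcv(y, y_hat, c_m=2.0)
print(score_small_model, score_large_model)
```

With identical residuals, the model with the larger penalty receives the larger GCV score, which is why basis functions that contribute little to the fit are pruned in the second stage.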

2.3. Artificial neural network

In recent years, ANN has been widely applied in engineering, education, social science, medical research, business and forecasting [35–41]. ANN nodes are divided into three layers: the input, hidden and output layers. The structure of ANN can be briefly described as follows. For each neuron $j$ in the hidden layer and neuron $k$ in the output layer, the net inputs are given as:

$$\mathrm{net}_j = \sum_{i} w_{ji} \, o_i, \quad \text{and} \quad \mathrm{net}_k = \sum_{j} w_{kj} \, o_j, \tag{5}$$

where $i$ ($j$) is a neuron in the previous layer, $o_i$ ($o_j$) is the output of node $i$ ($j$) and $w_{ji}$ ($w_{kj}$) is the connection weight from neuron $i$ ($j$) to neuron $j$ ($k$). The neuron outputs can be described as:

$$o_i = \mathrm{net}_i \tag{6}$$

$$o_j = \frac{1}{1 + \exp\left[-(\mathrm{net}_j + \theta_j)\right]} = f_j(\mathrm{net}_j, \theta_j) \tag{7}$$

$$o_k = \frac{1}{1 + \exp\left[-(\mathrm{net}_k + \theta_k)\right]} = f_k(\mathrm{net}_k, \theta_k) \tag{8}$$

where $\mathrm{net}_i$ is the input signal from the external source to node $i$ in the input layer and $\theta_j$ ($\theta_k$) is the bias of neuron $j$ ($k$). The transformation function shown in Eqs. (7) and (8) is a sigmoid function and is the most commonly utilized function to date. Thus, the aforementioned sigmoid function was used in the present study.

2.4. Rough set

The concept of rough set was introduced in the early 1980s [26,42]. Rough set theory is an extension of the set theory used for the study of intelligent systems characterized by inexact, uncertain or vague information and can serve as a new mathematical tool for soft computing [43]. General elements engaged in rough set theory can be described as outlined in the following section.
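As a concrete preview of the constructs defined in Sections 2.4.1 to 2.4.3 (information system, indiscernibility relation, lower and upper approximations), the following Python sketch works through a toy example. The four-object table, the attribute names and the target set are hypothetical and serve only to illustrate the definitions; they are not taken from the heart disease data set.

```python
from collections import defaultdict


def equivalence_classes(table, attrs):
    """Partition the objects of U by their values on the attribute subset B."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return list(groups.values())


def approximations(table, attrs, target):
    """Return the B-lower and B-upper approximations of the target set X."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:   # [x]_B is contained in X
            lower |= cls
        if cls & target:    # [x]_B intersects X
            upper |= cls
    return lower, upper


# Hypothetical information system: objects p1..p4 with two condition attributes.
table = {
    "p1": {"sex": 1, "angina": 1},
    "p2": {"sex": 1, "angina": 1},
    "p3": {"sex": 0, "angina": 0},
    "p4": {"sex": 1, "angina": 0},
}
diseased = {"p1", "p4"}  # hypothetical target set X

lower, upper = approximations(table, ["sex", "angina"], diseased)
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
```

Objects p1 and p2 are indiscernible on the chosen attributes but only p1 belongs to the target set, so both fall in the boundary region and the target set is rough with respect to these attributes.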


2.4.1. Information systems

We can define an information system as a set of objects represented in a data table, where the rows in the table are considered the objects of analysis, and the columns represent measurable attributes of each object. Thus, an information system, $I = (U, A)$, can be treated as a system, where $U$ is a finite set of objects, and $A$ is a finite set of attributes. Moreover, the attributes in $A$ can be further classified into disjoint condition attributes $C$ and decision attributes $D$, such that $A = C \cup D$ and $C \cap D = \emptyset$.

2.4.2. B-Indiscernibility relation

Let $I = (U, A)$ be an information system. For every subset of attributes $B$ of $A$, an indiscernibility relation on $U$ is denoted and defined as:

$$IND_A(B) = \{(x, y) \in U \times U : \text{for all } a \in B,\ a(x) = a(y)\},$$

where $a \in A$ and $B \subseteq A$. If $(x, y) \in IND_A(B)$, then objects $x$ and $y$ are indiscernible from each other when considering subset $B$ of attributes [44]. This is known as an equivalence relation, and every subset of $A$ induces a unique indiscernibility relation.

2.4.3. Lower and upper approximation

By giving $I = (U, A)$ and placing $IND_A(B)$ on universe $U$ with respect to attribute set $B \subseteq A$, a B-lower approximation of $X$ ($\underline{B}X$) and a B-upper approximation of $X$ ($\overline{B}X$) were defined as follows for a given set $X \subseteq U$:

$$\underline{B}X = \{x : [x]_B \subseteq X\} \quad \text{and} \quad \overline{B}X = \{x : [x]_B \cap X \neq \emptyset\}$$

where $[x]_B$ denotes the equivalence class of $B$ containing $x$ for any element $x$ of $U$. The lower approximation consists of objects that definitely belong to $X$, and the upper approximation contains objects that possibly belong to $X$. Consequently, $X$ is classified as a rough set if its B boundary region, $BN_B(X) = \overline{B}X - \underline{B}X$, is non-empty. In other words, there is a region of uncertainty regarding set membership. This uncertainty may be quantified for individual points $x$ by assessing the degree of overlap between the indiscernibility class $[x]_B$ and the rough set $X$. In this manner, classifications maintain a global sense of knowledge.

3. Actual application

To demonstrate the effectiveness of the proposed approach, a real heart disease data set was used to verify the proposed and typical models. The data set consisted of the records of 899 people. Each record consisted of 14 variables, and the data are summarized in Table 1 (readers can refer to the web site of University of California, Irvine [6] for more details on the data set).

Table 1 Definition of explanatory variables.

| Variable | Meaning |
|---|---|
| Y | Heart disease (0 = absence; 1 = presence) |
| X1 | Age (years) |
| X2 | Sex (0 = female; 1 = male) |
| X3 | Chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic) |
| X4 | Resting blood pressure (in mmHg on admission to the hospital) |
| X5 | Serum cholesterol (in mg/dl) |
| X6 | Fasting blood sugar >120 mg/dl (0 = false, 1 = true) |
| X7 | Resting electrocardiographic results (0 = normal, 1 = having ST–T wave abnormality, 2 = showing probable or definite left ventricular hypertrophy according to Estes’ criteria) |
| X8 | Maximum heart rate achieved |
| X9 | Exercise induced angina (0 = no, 1 = yes) |
| X10 | Oldpeak = ST depression induced by exercise relative to rest |
| X11 | Slope of the peak exercise ST segment (1: upsloping, 2: flat, 3: downsloping) |
| X12 | Number of major vessels (0–3) colored by fluoroscopy |
| X13 | Thal (3 = normal, 6 = fixed defect, 7 = reversible defect) |

Table 2 Selection of explanatory variables for the LR model.

| Approach | Maximum VIF | Significant variables |
|---|---|---|
| LR | 1.79 | X1, X2, X3, X4, X5, X6, X7, X9, X10, X11, X12, X13 |

The response variable of the LR model was defined as Y. The first step in processing the raw data was to clean the data, and cases with missing measurements were deleted. As a result, the sample size became 280 cases. Among the 280 cases used in the present study, the first 168 cases (approximately 60% of the total cases) were selected as the model building set (training sample), while the remaining 112 cases (approximately 40% of the total cases) were retained as the validation set (testing sample). The neural networks simulator Qnet97, which was developed by Vesta Services Inc., was used to develop the ANN models and the two-stage hybrid classification models. Qnet97 is a C-based simulator that provides a system for developing backpropagation neural network (BPN) configurations using a generalized delta learning algorithm. The LR model was implemented using SPSS. The MARS model was constructed using MARS, which was developed by Salford Systems.
The RSES (Rough Set Exploration System) software, created by the research team supervised by Professor Andrzej Skowron, was applied to construct the rough set model used in the present study.

This software system implements the basic elements of rough set theory and rule discovery techniques. The detailed modeling and forecasting results using the aforementioned techniques are described in the following section.

3.1. Single-stage modeling

In this study, the single-stage modeling involves LR, MARS, RS and ANN. For the LR model, the Wald forward method was used to select significant explanatory variables, and the variance inflation factor (VIF) was computed to check for collinearity. A VIF value greater than 10 was taken to indicate serious multicollinearity in the model [45]. In the present study, the maximum VIF for the LR model was 1.79, which is well below 10. The selected variables are listed in Table 2. The explanatory variables chosen in this stage were used as inputs for the ANN employed in the hybrid model. The MARS selection results are shown in Table 3. In the selection process, six significant explanatory variables were chosen, and the corresponding relative importance indicators are listed in the last column of Table 3.

Table 3 Basis functions and important explanatory variables for the MARS model.

| Function | Std. dev. | Cost of omission | Number of BF | Variable | Relative importance (%) |
|---|---|---|---|---|---|
| 1 | 0.149 | 0.130 | 1 | X13 | 100.000 |
| 2 | 0.141 | 0.131 | 1 | X3 | 98.294 |
| 3 | 0.138 | 0.131 | 1 | X12 | 95.379 |
| 4 | 0.098 | 0.122 | 1 | X10 | 62.804 |
| 5 | 0.066 | 0.117 | 1 | X4 | 32.011 |
| 6 | 0.059 | 0.116 | 1 | X2 | 12.703 |

The structure of the proposed ANN is described in the current section. More than 75% of neural network applications use the backpropagation neural network (BPN) structure; thus, the BPN was used in the present study to construct the ANN forecasting model. When using ANN in the single stage, we employed 13 input nodes and one output node in the ANN structure. The hidden nodes

were set up to range from n − 2 to n + 2, where n is the number of input variables. Thus, in the initial phase, the hidden nodes were chosen as 11, 12, 13, 14 and 15. According to previous findings [46], the learning rates were set as 0.01, 0.005 and 0.001. After performing ANN modeling, we found that the {13-13-1} topology with a learning rate of 0.01 provided the best results and a minimal testing RMSE. Here, {ni–nh–no} stands for the number of neurons in the input layer, hidden layer and output layer, respectively. Variable selection was also achieved by applying rough set theory, which compares equivalence relations generated by sets of variables. Variables were removed so that the reduced set provided the same quality of classification as the original set. Using the greedy heuristics algorithm in the RSES software, we performed variable selection on the heart disease data set. The results showed that 10 out of 13 variables were selected. The selected variables were X2, X3, X4, X6, X7, X9, X10, X11, X12 and X13. All of these selected variables were used as inputs to ANN in the hybrid model.

3.2. Hybrid modeling

While most studies use all 13 variables (refer to Table 1 for more details) as explanatory variables to classify heart disease, the aim of the present study was to develop hybrid models with fewer input variables that still maintain good classification capability. Because ANN is not suitable for initial variable selection, this study considers the nine possible hybrid modeling schemes in which ANN is not the first stage, namely LR–ANN, MARS–ANN, RS–ANN, LR–MARS, RS–MARS, LR–RS, MARS–RS, MARS–LR and RS–LR. The LR, MARS and RS models yielded 12, 6 and 10 important explanatory variables, respectively, for classifying heart disease. These 12, 6 and 10 explanatory variables were then used as inputs to the hybrid classifiers. For the hybrid LR–ANN model, we set up 12 input nodes in the input layer, and the number of hidden nodes was set to 10, 11, 12, 13, and 14. 
The learning rates were identical to those used in the single-stage ANN model. That is, the learning rates were 0.01, 0.005, and 0.001, respectively. Again, the network topology with the lowest testing RMSE was considered the optimal network topology. The {12-12-1} topology with a learning rate of 0.01 provided the best results for the hybrid LR–ANN model. For the MARS–ANN hybrid model, we used 6 input nodes in the input layer. The number of hidden nodes was set to 4, 5, 6, 7, and 8. Accordingly, the {6-5-1} topology with a learning rate of 0.01 provided the best results. For the RS–ANN hybrid model, we used 10 input nodes in the input layer. The number of hidden nodes was set to 8, 9, 10, 11 and 12. The {10-10-1} topology with a learning rate of 0.01 provided the best results.
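The topology search described above (hidden nodes from n − 2 to n + 2, three learning rates, selection by lowest testing RMSE) can be sketched in Python as follows. This is a hedged illustration, not the Qnet97 procedure: the function names are hypothetical, the forward pass simply applies the sigmoid of Eqs. (7) and (8) to a {ni–nh–1} network, and backpropagation training is omitted, so the testing RMSE is assumed to be supplied by whatever training routine is used.

```python
import math
from itertools import product


def candidate_topologies(n_inputs, learning_rates=(0.01, 0.005, 0.001), spread=2):
    """Enumerate ({n_i, n_h, 1}, learning rate) pairs with n_h in [n - spread, n + spread]."""
    hidden_sizes = range(n_inputs - spread, n_inputs + spread + 1)
    return [((n_inputs, h, 1), lr) for h, lr in product(hidden_sizes, learning_rates)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer network with sigmoid activations."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def select_best(candidates, testing_rmse):
    """Pick the candidate with the lowest testing RMSE (RMSE function supplied by caller)."""
    return min(candidates, key=testing_rmse)


# Candidate grid for the MARS-ANN hybrid, which has 6 input nodes.
grid = candidate_topologies(6)
print(len(grid), sorted({topology[1] for topology, _ in grid}))

# Forward pass with all-zero (hypothetical) weights: every sigmoid sees 0.
n_i, n_h = 6, 5
output = forward([0.0] * n_i, [[0.0] * n_i for _ in range(n_h)], [0.0] * n_h, [0.0] * n_h, 0.0)
print(output)
```

For six inputs the grid contains the hidden-layer sizes 4 to 8 combined with the three learning rates, which matches the search space stated for the MARS–ANN model.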

Table 4 AIR comparisons for single-stage and hybrid models.

| Models | Explanatory variables | AIR (%) |
|---|---|---|
| ANN alone | X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13 | 76.79 |
| RS alone | X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13 | 78.60 |
| MARS alone | X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13 | 78.57 |
| LR–ANN | X1, X2, X3, X4, X5, X6, X7, X9, X10, X11, X12, X13 | 78.57 |
| MARS–ANN | X2, X3, X4, X10, X12, X13 | 82.14 |
| RS–ANN | X2, X3, X4, X6, X7, X9, X10, X11, X12, X13 | 79.50 |
| LR–MARS | X1, X2, X3, X4, X5, X6, X7, X9, X10, X11, X12, X13 | 78.57 |
| RS–MARS | X2, X3, X4, X6, X7, X9, X10, X11, X12, X13 | 80.36 |
| LR–RS | X1, X2, X3, X4, X5, X6, X7, X9, X10, X11, X12, X13 | 81.25 |
| MARS–RS | X2, X3, X4, X10, X12, X13 | 76.79 |
| MARS–LR | X2, X3, X4, X10, X12, X13 | 83.93 |
| RS–LR | X2, X3, X4, X6, X7, X9, X10, X11, X12, X13 | 83.93 |

Table 5 Type I and Type II errors of the twelve models.

| Models | Type I errors | Type II errors | Total (Type I + Type II) |
|---|---|---|---|
| ANN | 0.27 | 0.18 | 0.45 |
| LR–ANN | 0.15 | 0.30 | 0.45 |
| MARS–ANN | 0.16 | 0.20 | 0.36 |
| LR–MARS | 0.10 | 0.36 | 0.46 |
| RS–MARS | 0.08 | 0.34 | 0.42 |
| MARS–LR | 0.11 | 0.22 | 0.33 |
| RS–LR | 0.13 | 0.20 | 0.33 |
| MARS | 0.29 | 0.12 | 0.41 |
| RS–ANN | 0.13 | 0.30 | 0.43 |
| RS | 0.11 | 0.34 | 0.45 |
| MARS–RS | 0.18 | 0.30 | 0.48 |
| LR–RS | 0.11 | 0.28 | 0.39 |
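The quantities reported in Tables 4 and 5 can be computed from true and predicted labels as in the following Python sketch. It assumes that a Type I error is a patient without heart disease classified as having it, that a Type II error is the reverse, and that each error rate is normalized within its own class; the paper does not state the normalization used for Table 5, so that choice, the function name and the toy labels are all assumptions.

```python
def classification_rates(y_true, y_pred):
    """Return (AIR in %, Type I error rate, Type II error rate).

    Labels: 0 = absence of heart disease, 1 = presence.
    Type I  = false positives / all true negatives (assumed normalization)
    Type II = false negatives / all true positives (assumed normalization)
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    if fp + tn == 0 or fn + tp == 0:
        raise ValueError("both classes must be present in y_true")
    air = 100.0 * (tp + tn) / len(y_true)
    return air, fp / (fp + tn), fn / (fn + tp)


# Hypothetical labels for ten test cases (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1]

air, type_i, type_ii = classification_rates(y_true, y_pred)
print(f"AIR = {air:.2f}%, Type I = {type_i:.2f}, Type II = {type_ii:.2f}")
```

Reporting the two error rates separately matters here because the cost of missing a patient with heart disease is generally regarded as higher than the cost of a false alarm.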

3.3. Experimental results

The single-stage classifiers and the hybrid models were developed in the present study for the classification of heart disease. The classification results are displayed in Table 4. As shown in Table 4, the accurate identification rate (AIR) of the proposed hybrid models was, in most cases, higher than that of the single-stage models. In addition, the CPU processing time to implement the classification tasks on the computer may be another important factor. For these reasons, the proposed hybrid models were considered the best classification forecasting models for heart disease. In addition to the overall classification accuracies, we investigated the performance of the twelve classification models in terms of their Type I and Type II errors. It is well known that, to justify the overall classification capability of the designed classification models, the prior probability of sample data, the misclassification probability, and misclassification costs have to be taken into account in order to obtain a model with the smallest expected misclassification costs [47]. Hence, special attention is also paid to misclassification cost when evaluating the classification accuracy of the twelve models. The costs associated with Type I errors (a patient without heart disease is misclassified as having heart disease) and Type II errors (a patient with heart disease is misclassified as without heart disease) are significantly different. In general, the misclassification costs associated with Type II errors are much higher than those associated with Type I errors. Therefore, both Type I and Type II errors of the twelve models need to be compared in order to justify the overall classification capability. According to the results in Table 5, RS–LR and MARS–LR have the lowest combined Type I and Type II errors (0.33 each) among all models. Accordingly, we can conclude that both hybrid models, RS–LR and MARS–LR, are the best models for heart disease classification.

4. Conclusions

Heart disease classification is an important issue for human health. The purpose of the present study was to propose a hybrid classification model for heart disease. The rationale of the proposed scheme was initially to obtain fewer significant explanatory variables by performing LR and MARS modeling. The resulting significant independent variables were used as inputs for the designed ANN model. According to the experimental results shown in Table 4, the proposed two-stage hybrid approaches were most appropriate for classifying heart disease. Moreover, the proposed hybrid MARS–ANN model was the best alternative because it contained the fewest explanatory variables and provided the best classification accuracy. Artificial intelligence methodology is very useful in many application areas, such as engineering [48–51], credit risk modeling [52], medicine [25] and behavior modeling [53–57]. Among the established artificial intelligence techniques, the multi-stage hybrid procedure is a commonly used method. This study develops


several two-stage hybrid approaches for heart disease classification. To achieve higher classification accuracy for heart disease, one may collect additional important variables. In addition, the proposed hybrid model is not the only classification technique that can be employed. For instance, one can combine other artificial intelligence techniques, such as decision trees or genetic algorithms, with neural networks or support vector machines to refine the structure further and improve the classification accuracy. The possibility of applying the same procedure to combine other methods, such as the evolving systems of [58–63], deserves further research.

Acknowledgement

This work was partially supported by the National Science Council of the Republic of China (Grant nos. NSC 102-2221-E-030-019 and NSC 102-2118-M-030-001).

References

[1] M. Heron, National Vital Statistics Reports, vol. 59, 2011, pp. 9. [2] R. Luengo-Fernandez, J. Leal, A.M. Gray, UK research expenditure on dementia, heart disease, stroke and cancer: are levels of spending related to disease burden? European Journal of Neurology 19 (2012) 149–154. [3] J.J. Rubio, F. Ortiz, C.R. Mariaca, J.C. Tovar, A method for online pattern recognition for abnormal eye movements, Neural Computing and Applications 22 (2013) 597–605. [4] D.M. Vazquez, J.J. Rubio, J. Pacheco, A characterization framework for epileptic signals, IET Image Processing 6 (2013) 1227–1235. [5] J.J. Rubio, D.M. Vazquez, D. Mujica-Vargas, Acquisition system and approximation of brain signals, IET Science, Measurement and Technology 7 (2013) 232–239. [6] D.J. Newman, S. Hettich, C.L. Blake, C.J. Merz, http://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/heart-disease.names (1998). [7] R. Detrano, A. Janosi, W. Steinbrunn, M. Pfisterer, J. Schmid, S. Sandhu, K. Guppy, S. Lee, V. 
Froelicher, International application of a new probability algorithm for the diagnosis of coronary artery disease, American Journal of Cardiology 64 (1989) 304–310. [8] J.H. Gennari, P. Langley, D. Fisher, Models of incremental concept formation, Artificial Intelligence 40 (1989) 11–61. [9] H. Kahramanli, N. Allahverdi, Design of a hybrid system for the diabetes and heart diseases, Expert Systems with Applications 35 (2008) 82–89. [10] S. Goonatilake, S. Khebbal (Eds.), Intelligent Hybrid Systems, John Wiley and Sons, 1995. [11] C.L. Tan, T.S. Quah, H.J. Teh, An artificial neural network that models human decision making, IEEE Computer 29 (3) (1996) 64–71. [12] K. Xu, A.R. Luxmoore, L.M. Jones, F. Deravi, Integration of neural networks and expert systems for microscopic wear particles analysis, Knowledge-Based Systems 11 (1998) 213–227. [13] F. Zahedi, Intelligent Systems for Business: Expert Systems with Neural Networks, Wadsworth Publishing Co., California, 1993. [14] S. Desai, J.N. Crook, G.A. Overstreet, A comparison of neural networks and linear scoring models in the credit union environment, European Journal of Operational Research 95 (1996) 24–37. [15] H.L. Jensen, Using neural networks for credit scoring, Managerial Finance 18 (1992) 15–26. [16] A.M.S. Muniz, H. Liu, K.E. Lyons, R. Pahwac, W. Liu, F.F. Nobre, J. Nadal, Comparison among probabilistic neural network, support vector machine and logistic regression for evaluating the effect of subthalamic stimulation in Parkinson disease on ground reaction force during gait, Journal of Biomechanics 43 (2010) 720–726. [17] M. Caselli, L. Trizio, G. de Gennaro, P. Ielpo, A simple feedforward neural network for the PM10 forecasting: comparison with a radial basis function network and a multivariate linear regression model, Water, Air, & Soil Pollution 201 (2009) 365–377. [18] A. Minbashian, J.E.H. Bright, K.D. 
Bird, A comparison of artificial neural networks and multiple regression in the context of research on personality and work performance, Organizational Research Methods 12 (2009) 1–24. [19] K. Subramanian, V.M. Periasamy, M. Pushpavanam, K. Ramasamy, Predictive modeling of copper in electro-deposition of Bronze using regression and neural networks, Portugaliae Electrochimica Acta 27 (2009) 47–55. [20] X. Ren, X. Lv, Identification of extended Hammerstein systems using dynamic self-optimizing neural networks, IEEE Transactions on Neural Networks 22 (2011) 1169–1179. [21] J.H. Perez-Cruz, J.J. Rubio, E. Ruiz-Velzquez, G. Solis-Perales, Tracking control based on recurrent neural networks for nonlinear systems with multiple inputs and unknown deadzone, Abstract and Applied Analysis 2012 (2012) 1–18. [22] J.J. Rubio, Modified optimal control with a backpropagation network for robotic arms, IET Control Theory and Applications 6 (2012) 2216–2225.


[23] J. Peralta, X. Li, G. Gutierrez, A. Sanchis, Time series forecasting by evolving artificial neural networks with genetic algorithms, differential evolution and estimation of distribution algorithm, Neural Computing and Applications 22 (2013) 11–20. [24] C.J. Lu, Y.E. Shao, P.H. Li, Mixture control chart patterns recognition using independent component analysis and support vector machine, Neurocomputing 74 (2011) 1908–1914. [25] S.M. Chou, T.S. Lee, Y.E. Shao, I.F. Chen, Mining the breast cancer pattern using artificial neural networks and multivariate adaptive regression splines, Expert Systems with Applications 27 (2004) 133–142. [26] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers, Dordrecht, 1991. [27] F. Vidoli, Evaluating the water sector in Italy through a two stage method using the conditional robust nonparametric frontier and multivariate adaptive regression splines, European Journal of Operational Research 212 (2011) 583–595. [28] L.P. Venkata, M.R. Jay, V. Chen, N. Engsuwan, S. Siddappa, A multivariate adaptive regression splines cutting plane approach for solving a two-stage stochastic programming fleet assignment model, European Journal of Operational Research 216 (2012) 162–171. [29] Y. Zhou, H. Leung, Predicting object-oriented software maintainability using multivariate adaptive regression splines, Journal of Systems and Software 80 (2007) 1349–1361. [30] J. de Andrés, P. Lorca, F.J. de Cos Juez, F. Sánchez-Lasheras, Bankruptcy forecasting: a hybrid approach using Fuzzy c-means clustering and multivariate adaptive regression splines (MARS), Expert Systems with Applications 38 (2011) 1866–1875. [31] Q. Xu, D.L. Massart, Y. Liang, K. Fang, Two-step multivariate adaptive regression splines for modeling a quantitative relationship between gas chromatography retention indices and molecular descriptors, Journal of Chromatography A 998 (2003) 155–167.
[32] J.R. Leathwick, J. Elith, T. Hastie, Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions, Ecological Modelling 199 (2006) 188–196. [33] M. Jalali-Heravi, M. Asadollahi-Baboli, A. Mani-Varnosfaderani, Shuffling multivariate adaptive regression splines and adaptive neuro-fuzzy inference system as tools for QSAR study of SARS inhibitors, Journal of Pharmaceutical and Biomedical Analysis 50 (2009) 853–860. [34] H.M. Azamathulla, F.C. Wu, Support vector machine approach for longitudinal dispersion coefficients in natural streams, Applied Soft Computing 11 (2011) 2902–2905. [35] J.H. Friedman, Multivariate adaptive regression splines (with discussion), Annals of Statistics 19 (1991) 1–141. [36] C.C. Chiu, Y.E. Shao, T.S. Lee, Identification of process disturbance using SPC/EPC and neural networks, Journal of Intelligent Manufacturing 14 (2003) 379–388. [37] B.D. Ripley, Neural networks and related methods for classification (with discussion), Journal of the Royal Statistical Society, Series B 56 (1994) 409–456. [38] A. Vellido, P.J.G. Lisboa, J. Vaughan, Neural networks in business: a survey of applications (1992–1998), Expert Systems with Applications 17 (1999) 51–70. [39] G. Zhang, B.E. Patuwo, M.Y. Hu, Forecasting with artificial neural networks: the state of the art, International Journal of Forecasting 14 (1998) 35–62. [40] Y.E. Shao, B.S. Hsu, Determining the contributors for a multivariate SPC chart signal using artificial neural networks and support vector machine, International Journal of Innovative Computing, Information and Control 5 (2009) 4899–4906. [41] Y.E. Shao, C.J. Lu, C.C. Chiu, A fault detection system for an autocorrelated process using SPC/EPC/ANN and SPC/EPC/SVM schemes, International Journal of Innovative Computing, Information and Control 7 (2011) 5417–5428. [42] Z. Pawlak, J.W. Grzymala-Busse, R. Słowiński, W. Ziarko, Rough sets, Communications of the ACM 38 (1995) 88–95.
[43] W.H. Xu, W.X. Zhang, Knowledge reduction in consistent information system based on dominance relations, in: Y. Liu, G. Chen, M. Ying (Eds.), Beijing 2006, vol. 3, LNCS, Springer, Tsinghua University Press, 2006, pp. 1493–1496. [44] Z. Yan, Y. Yiyu, L. Feng, Data analysis based on discernibility and indiscernibility, Information Sciences 177 (22) (2007) 4959–4976. [45] J.F. Hair, R.E. Anderson, R.L. Tatham, W.C. Black, Multivariate Data Analysis, 5th ed., Prentice-Hall, NJ, 1998. [46] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning internal representations by error propagation, in: Parallel Distributed Processing, MIT Press, Cambridge, MA, 1986. [47] R.A. Johnson, D.W. Wichern, Applied Multivariate Statistical Analysis, 6th ed., Prentice-Hall, Upper Saddle River, NJ, 2007. [48] C.W. Chen, W.L. Chiang, F.H. Hsiao, Stability analysis of T–S fuzzy models for nonlinear multiple time-delay interconnected systems, Mathematics and Computers in Simulation 66 (2004) 523–537. [49] F.H. Hsiao, J.D. Hwang, C.W. Chen, Z.R. Tsai, Robust stabilization of nonlinear multiple time-delay large-scale systems via decentralized fuzzy control, IEEE Transactions on Fuzzy Systems 13 (2005) 152–163. [50] F.H. Hsiao, C.W. Chen, Y.W. Liang, S.D. Xu, W.L. Chiang, T–S fuzzy controllers for nonlinear interconnected systems with multiple time delays, IEEE Transactions on Circuits & Systems-I: Regular Papers 52 (2005) 1883–1893. [51] C.W. Chen, Stability conditions of fuzzy systems and its application to structural and mechanical systems, Advances in Engineering Software 37 (2006) 624–629. [52] S.L. Lin, A new two-stage hybrid approach of credit risk in banking industry, Expert Systems with Applications 36 (2009) 8333–8341.



[53] C.W. Chen, The relationship between personality traits and sales force automation usage: a review of methodology, Human Factors and Ergonomics in Manufacturing & Service Industries (2013), http://dx.doi.org/10.1002/hfm.20311. [54] C.W. Chen, The relationship between personality traits and sales force automation usage: a preliminary study, Human Factors and Ergonomics in Manufacturing & Service Industries 23 (2013) 243–253. [55] C.W. Chen, Critical human factor evaluation of knowledge sharing intention in Taiwanese enterprises, Human Factors and Ergonomics in Manufacturing & Service Industries 23 (2) (2013) 95–106. [56] C.W. Chen, Human factors of knowledge-sharing intention among Taiwanese enterprises: a model of hypotheses, Human Factors and Ergonomics in Manufacturing & Service Industries 22 (2012) 362–371. [57] H.M. Kuo, A study of a B2C supporting interface design system for the elderly, Human Factors and Ergonomics in Manufacturing & Service Industries 22 (2012) 528–540.

[58] J.J. Rubio, SOFMLS: online self-organizing fuzzy modified least-squares network, IEEE Transactions on Fuzzy Systems 17 (2009) 1296–1309. [59] D. Leite, R. Ballini, P. Costa, F. Gomide, Evolving fuzzy granular modeling from non-stationary fuzzy data streams, Evolving Systems 3 (2012) 65–79. [60] E. Lughofer, Single pass active learning with conflict and ignorance, Evolving Systems 3 (2012) 251–271. [61] E. Lughofer, A dynamic split-and-merge approach for evolving cluster models, Evolving Systems 3 (2012) 135–151. [62] L. Maciel, A. Lemos, F. Gomide, R. Ballini, Evolving fuzzy systems for pricing fixed income options, Evolving Systems 3 (2012) 5–18. [63] J.J. Rubio, J.H. Perez Cruz, Evolving intelligent system for the modelling of nonlinear systems with dead-zone input, Applied Soft Computing (2013), http://dx.doi.org/10.1016/j.asoc.2013.03.018, in press.