An integrated approach for operational knowledge acquisition of refuse incinerators

Expert Systems with Applications 33 (2007) 413–419
www.elsevier.com/locate/eswa

Yang-Chi Chang *, Po-Chuan Lai, Meng-Tsung Lee
Department of Marine Environment and Engineering, National Sun Yat-sen University, 70 Lien-Hae Road, Kaohsiung 804, Taiwan, ROC

Abstract

Refuse incinerator operation poses a tremendous challenge for efficient supervision due to the high complexity of the physical and chemical mechanisms inside the system. It is difficult to comprehend operational knowledge without thorough study and long-term on-site experiments. Fortunately, many sensors are installed in incineration plants, and tremendous amounts of raw data about daily practices and system states are recorded to assist operations. Without proper analysis, however, these data are of no benefit to operators. An integrated approach is adopted in the current study using feature selection and data mining techniques. Feature selection was initially applied to cope with the heavy computational burden imposed by the huge data set: the data dimension can be reduced by discarding redundant information and keeping only the relevant features for further analysis. Data mining analysis was then used to build two decision tree models based on the steam production and NOx emission target attributes. Implicit incinerator system relations, represented by production rules with associated prediction accuracies, can be acquired from the decision tree models. Such rule-based knowledge is expected to facilitate on-site operations and enhance refuse incinerator efficiency.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Rule-based knowledge; Refuse incinerator; Data mining; Feature selection; Decision tree classification

1. Introduction

Although trash incineration is an efficient and primary method for dealing with daily trash in Taiwan, some critical problems still exist. For example, most incinerators are equipped with heat recovery systems consisting of a boiler unit and a cogeneration process, and cogeneration has become the most important source of income for refuse incinerator systems (Chung & Poon, 1996; Sonesson, Bjorklund, Carlsson, & Dalemo, 2000). Managers must therefore seek ways to maintain highly efficient power generation while keeping air pollution under control.

* Corresponding author. Tel.: +886 7 525 2000x5176; fax: +886 7 525 5060. E-mail address: [email protected] (Y.-C. Chang).
doi:10.1016/j.eswa.2006.05.008

Although some physical and chemical mechanisms within the incinerator have been studied, the research results may be of little help to the incineration manager owing to the lack of long-term on-site experiments. Moreover, even though standard controls can handle ordinary operations, some system parameter adjustments are still needed in a variety of situations. However, those knowledge patterns or experiential rules are not easily discovered without proper analyses. Many sensors are installed in a refuse incineration plant to support highly efficient operation through system monitoring. The rationale is that implicit system characteristics of plant operation are embedded in the collected raw data. However, it is infeasible for a manager to identify the desired information by either data observation or simple calculations. Therefore, without proper analyses, system monitoring and data collection are of little use. Fortunately, the difficulty of retrieving useful information from huge data sets can be overcome by advanced computing power and information techniques. An integrated approach was adopted by the current study using feature selection and data mining techniques.


Feature selection was initially applied to relieve the heavy computational burden caused by the numerous data attributes in the raw data set. The data can be reduced by discarding redundant information and keeping only the relevant attributes, or features, for further analysis. The data mining (DM) technique was then utilized to build two decision tree models based on the steam production and NOx emission target attributes. The implicit relations between the incinerator sub-systems, represented by production rules with associated prediction accuracies, can be acquired from the decision tree models. Such rule-based knowledge is expected to facilitate on-site operations and enhance refuse incinerator efficiency.

2. Methodologies


2.1. Feature selection techniques

Real-world data are often larger and more complex in structure than experimental data. Thus, analyzing real-world data can take a tremendous amount of computational effort, and computation time is one of the prime factors affecting real-world applications. With the rapid growth of data collection by automatic sensors and recorders, the required computational effort can exceed the computer capacity available for such analysis. Several approaches have been developed to address this problem, such as principal component analysis (PCA), the discrete wavelet transform (DWT), and feature selection. Among these approaches, feature selection is considered to perform better in real-world applications (Liu & Setiono, 1998). Feature selection is also referred to as "feature subset selection" in some of the literature. It is important to pattern classification and data mining in the machine learning domain because it speeds up the learning process. In addition to finding relevant features, feature selection can also greatly improve classification performance by removing redundant or irrelevant information (Blum & Langley, 1997; Pudil, Novovičová, & Somol, 2002). A feature selection algorithm is thus valuable for improving prediction accuracy while reducing the number of attributes (John, Kohavi, & Pfleger, 1994; Kohavi, 1995). The Sequential Forward Floating Search (SFFS) algorithm proposed by Pudil, Novovičová, and Kittler (1994) is a revised algorithm for "sub-optimal" feature subset selection, and it has been shown to outperform other feature selection algorithms (Jain & Zongker, 1997). Three major steps make up the SFFS algorithm: "Inclusion", "Test", and "Exclusion". SFFS begins with the inclusion step, selecting the feature that most improves performance. The search then conducts a test on every feature selected in the same iteration to identify any feature that degrades the overall performance. If such a feature exists, SFFS commences the exclusion step to discard it. The algorithm continues looking for better features until all features have been examined; a minimal sketch of this loop is given below.
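To make the Inclusion-Test-Exclusion loop concrete, here is a minimal Python sketch of the idea. The criterion function `score` (for example, cross-validated 1-nearest-neighbor accuracy, as used by PRTools) is supplied by the caller; the simplification noted in the docstring is an assumption of this sketch, not part of the original algorithm statement.

```python
def sffs(features, score, target_size):
    """Sequential Forward Floating Search (simplified sketch).

    features    : list of candidate feature indices
    score       : callable mapping a feature subset (list) to a quality
                  value, e.g. cross-validated 1-NN accuracy as in PRTools
    target_size : stop once the selected subset reaches this size

    Note: the full algorithm also tracks the best subset found for each
    size to guarantee termination; that bookkeeping is omitted here.
    """
    selected = []
    while len(selected) < target_size:
        # Inclusion: add the single feature that improves the score most.
        remaining = [f for f in features if f not in selected]
        newest = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(newest)

        # Test + Exclusion ("floating" step): while dropping some earlier
        # feature yields a better subset, drop it and keep backtracking.
        while len(selected) > 2:
            candidates = [f for f in selected if f != newest]
            drop = max(candidates,
                       key=lambda f: score([g for g in selected if g != f]))
            if score([g for g in selected if g != drop]) > score(selected):
                selected.remove(drop)
            else:
                break
    return selected
```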

Fig. 1. Simplified flow chart of the SFFS algorithm.

The simplified SFFS flow chart is shown in Fig. 1 (Pudil et al., 1994; Somol, Pudil, Novovičová, & Paclik, 1999).

2.2. Classification analysis in DM

DM, an emerging information technique, effectively deals with the discovery of hidden knowledge, unexpected patterns, and implicit rules in databases (Adriaans & Zantinge, 1998; Han & Fu, 1999; Tung, Huang, Chen, & Shih, 2005). Raw data are seldom of direct benefit to people. The true value of data, namely useful information for decision support and for interpreting the phenomena represented by the data sources, can be revealed by DM techniques. Classification analysis is a DM technique often used to establish relationships between classes and attribute values via a set of training instances. Given a finite set of pre-determined class values, the classes in a classification problem are the ultimate objectives to be determined, while the attribute values of each instance represent the instance's characteristics, which affect the potential classification outcome. Among the available classification methods, decision tree analyses have been used extensively because of their predictive accuracy, computational speed, robustness, scalability, and interpretability (Elmasri & Navathe, 2000; Giordana & Neri, 1995). Therefore, the current study applied decision tree analysis for acquiring operational knowledge in the designated incinerator plant. A decision tree is a flow-chart-like tree structure in which each internal node represents an attribute test, each branch represents a test outcome, and each leaf node represents a class. The basic decision tree induction concept is based on a greedy algorithm that constructs the tree in a top-down, recursive, divide-and-conquer manner (Greene & Smith, 1993).


Many decision tree induction algorithms have been developed, including ID3 (Quinlan, 1986), C4.5 (Quinlan, 1993), CART (Breiman, Friedman, Olshen, & Stone, 1984), CHAID (Kass, 1980), and OC1 (Murthy, Kasif, & Salzberg, 1994). Among these techniques, C4.5 is the most widely used in classification applications. The C4.5 algorithm is a descendant of ID3 that employs an entropy-based measure as a heuristic for selecting the attribute that best separates the samples into individual classes; this measure is referred to as the information gain. The attribute with the highest information gain is chosen as the branching attribute for each node. If attribute $A_i$ is chosen as the branching attribute for the set of training instances $T$, the information gain of $A_i$ is written as

$$\mathrm{gain}(A_i) = E(T) - E(A_i), \quad \text{where } E(T) = -\sum_j p_j \log_2(p_j) \text{ and } E(A_i) = \sum_k (n_k/n)\,E(T_k).$$

Here $E(T)$ is the entropy of the set of instances $T$, and $p_j$ is the probability that an instance belongs to class $j$. $E(A_i)$ is the resultant entropy of $T$ if $T$ is partitioned according to $A_i$, in which $T_k$ is a disjoint subset of $T$ based on $A_i$'s values, $n$ is the total number of instances in $T$, and $n_k$ is the number of instances in $T_k$. With this algorithm, the information gain of each variable describing the samples in $T$ can be obtained, and the variable with the highest information gain is considered the most discriminating variable of the given set. An illustrative computation of this gain is sketched below.
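A minimal Python illustration of these formulas; the example data are hypothetical and not taken from the plant records.

```python
import math
from collections import Counter

def entropy(classes):
    """E(T) = -sum_j p_j * log2(p_j) over the class labels in T."""
    n = len(classes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(classes).values())

def information_gain(values, classes):
    """gain(A) = E(T) - E(A), where E(A) is the size-weighted entropy
    of the partition of T induced by attribute A's values."""
    n = len(classes)
    partition = {}
    for v, c in zip(values, classes):
        partition.setdefault(v, []).append(c)
    e_a = sum(len(subset) / n * entropy(subset) for subset in partition.values())
    return entropy(classes) - e_a

# Hypothetical example: does the chamber temperature level separate
# the steam output level?
temp = ["high", "high", "low", "medium", "low", "high"]
steam = ["high", "high", "low", "high", "low", "high"]
print(information_gain(temp, steam))  # ~0.918: a perfectly separating attribute here
```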

3. Data collection and cleaning process

3.1. Data collection

The data set was acquired from the designated incineration plant in southern Taiwan. There are 10,755 records in the data set, and each record has 509 attributes. According to the incineration plant functionalities, the attributes are grouped into three sub-systems: the refuse incineration sub-system, the cogeneration sub-system, and the air pollution control sub-system. A simplified flow chart of the plant's refuse process is shown in Fig. 2. The interaction mechanisms between these sub-systems are highly complex and non-linear; by merely observing the data or performing simple statistical analysis, it is impossible for managers and operators to grasp such relations. The objective of this study is to discover the inherent mechanisms embedded in the original data set. The focus is on the interactions between the refuse incineration sub-system and two main system output variables, steam production and NOx emission. The rationale is that the steam and the waste gas (NOx), both originating from the refuse incineration sub-system, are critical elements of incinerator management. Therefore, only the refuse incineration sub-system attributes are taken into consideration as parameters for the current knowledge extraction modeling.


Fig. 2. Simplified flow chart for refuse processing system.

3.2. Data cleaning process

Data preprocessing is required before any data-driven analysis. The preprocessing steps are often referred to as "data cleaning": removing outliers and inconsistent records, as well as pruning redundancies and abnormalities in the data set. Data cleaning avoids the "garbage in, garbage out" problem, meaning that if invalid data enter the analysis process, the resulting output will also be invalid and useless. The raw data acquired from the incineration plant contain many abnormal values that could derail the data analysis process. Therefore, a two-stage cleaning process, consisting of vertical cleaning and lateral cleaning, was used in this study to ensure data integrity for the subsequent analyses; a sketch of both stages follows the list below.

• Stage 1, Vertical Cleaning: Vertical cleaning removes abnormal attributes, namely those with a missing-data rate larger than 10% across the vertically aligned records. The results also imply that the sensors associated with abnormal data attributes should be checked and repaired if necessary.
• Stage 2, Lateral Cleaning: Lateral cleaning focuses on eliminating redundancies on the temporal scale. To correct such errors, records duplicated horizontally in the data set are removed. In addition, some sensor malfunctions were detected in the lateral cleaning process, and records with many missing values are also removed.

After the data cleaning process, 3427 records remain for the feature selection and decision tree analyses, as demonstrated in the following sections.
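A minimal pandas sketch of the two-stage cleaning, assuming records are rows and sensor attributes are columns; the function name and the completeness cutoff in stage 2 are illustrative assumptions.

```python
import pandas as pd

def clean(df: pd.DataFrame, max_missing_rate: float = 0.10) -> pd.DataFrame:
    # Stage 1, vertical cleaning: drop attributes (columns) whose
    # missing-data rate exceeds the threshold (10% in this study).
    df = df.loc[:, df.isna().mean() <= max_missing_rate]

    # Stage 2, lateral cleaning: drop records duplicated on the temporal
    # scale, then drop records that still have many missing values
    # (the 90% completeness cutoff below is an assumption).
    df = df.drop_duplicates()
    df = df.dropna(thresh=int(0.9 * df.shape[1]))
    return df
```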


4. Feature selection and decision tree model

In many cases, previous experience, when applied to decisions, is transformed into a lexical representation in a classification application (Moshkovich, Mechitov, & Olson, 2002). Classification can predict a categorical value of a specific data item, usually represented by a small number of ordinal levels, for example, high, medium, and low. The advantages of partitioning continuous values into ordinal values are stressed below (Moshkovich et al., 2002):

• Ordinal values may improve analysis results in classification applications.
• Ordinal information is simpler than continuous information in the traditional decision tree model. It leads to a reduced number of rules and nodes with a simultaneous reduction in error rate.
• An ordinal problem can also lead to a much richer set of conclusions and provide managers or operators with a quick understanding of the phenomena.

For better classification, all attributes in this study were partitioned into three categories, high, medium, and low, with each level covering a specific numerical data range (a sketch of this discretization is given after Table 1). The refined attributes in the refuse incineration sub-system were then submitted to feature selection after the data cleaning and categorizing process. The SFFS algorithm implemented by Robert P.W. Duin in his toolbox "PRTools" was used to conduct the feature selection. PRTools is a Matlab-based toolbox for pattern recognition and is freely available for academic research. Several feature selection algorithms are supported by PRTools; among them, SFFS has been shown to perform well in terms of computational efficiency when 1-nearest-neighbor classification is used as the criterion function (Zongker & Jain, 1996). The results generated by the SFFS algorithm for the two target attributes are shown in Table 1. Among the numerous attributes in the refuse incineration sub-system, the SFFS algorithm selected the five attributes most related to steam production and six attributes for NOx emission.

Table 1
Summary of the SFFS algorithm results

Target attribute     Attributes selected for classification
Steam production     Combustion chamber temperature; Pre-warming air temperature; Quantity of primary air of roller #6; Quantity of secondary air; Refuse heating value
NOx emission         The rolling speed of roller #2; The rolling speed of roller #5; Quantity of secondary air; Quantity of primary air of roller #6; Combustion chamber temperature; Quantity of steam production
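The discretization step can be reproduced with, for example, pandas. In the following hedged sketch the cut points are placeholders inspired by the thresholds visible in Fig. 3, not the actual ranges used in the study.

```python
import pandas as pd

# Hypothetical illustration: partition combustion chamber temperature (°C)
# into the three ordinal levels used in the study. The cut points are
# assumptions for illustration; the paper does not list its exact ranges.
cct = pd.Series([955.0, 1012.3, 987.5, 1031.8, 968.2], name="CCT")
cct_level = pd.cut(cct,
                   bins=[-float("inf"), 980.0, 1020.1, float("inf")],
                   labels=["low", "medium", "high"])
print(cct_level.tolist())  # ['low', 'medium', 'medium', 'high', 'low']
```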

After the data cleaning and feature selection processes, the selected attributes were analyzed using the decision tree induction technique. The software Weka was applied to perform the decision tree analysis in this study. Weka is a collection of machine learning algorithms for data mining implemented in Java; it contains various tools for data preprocessing, classification (decision trees), regression, clustering, association rules, and visualization (Witten & Frank, 2005). It also provides many widely adopted decision tree induction algorithms for building a classification model, such as C4.5 and ID3. The current study preferred the C4.5 algorithm for conducting the decision tree analysis because of its better computational efficiency.

Before commencing the decision tree analysis, some parameters must be assigned for the C4.5 algorithm. The "confidence factor" parameter determines the confidence value used when pruning the tree, that is, when removing branches that provide little or no gain in the statistical accuracy of the model. The default value of 0.25 has been shown to work reasonably well in most cases, and this parameter was therefore left unchanged in this study (Murthy et al., 1994; Winston, 1992). Another parameter needed for decision tree building is the minimum number of instances that must be present for a new leaf to be expanded in the decision tree. A more generalized or specialized tree can be obtained by adjusting this parameter: a higher value creates a more generalized tree, and a lower value a more specialized one. It is normally suggested that this number be set to twenty instances per class for better performance (Kirchner, Tölle, & Krieter, 2004). The decision tree model for steam production uses the parameter of 20 instances per class, as shown in Fig. 3.

A ten-fold cross-validation algorithm was applied to estimate the accuracy rates of the decision tree models. Cross-validation is a method for estimating generalization accuracy based on resampling (Weiss & Kulikowski, 1991). When conducting k-fold cross-validation, the entire data set is randomly divided into k subsets of equal size. The training process is then performed k times, where each time one of the k subsets is reserved for testing the accuracy of the model trained on the remaining k-1 subsets. The resulting prediction accuracies are often used to evaluate decision tree performance. Another method for accuracy estimation is the "split-sample" or "hold-out" method, in which only one validation subset is used to estimate generalization accuracy, whereas k different subsets are used in turn in k-fold cross-validation. The cross-validation method therefore provides less biased information about the models' accuracy for data mining applications (Elsner & Schmertmann, 1994; Rivals & Personnaz, 1999; Schaffer, 1993; Shao & Tu, 1995). Using 10-fold cross-validation, the current study found 86% prediction accuracy for the steam production model and 80% prediction accuracy for the air pollution (NOx) model, both within the acceptable range; a sketch of this evaluation procedure is given below.
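A sketch of the evaluation in Python with scikit-learn, as a stand-in for the Weka/C4.5 setup actually used (scikit-learn's DecisionTreeClassifier is a CART-style learner, and the data below are random placeholders, not the plant records).

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: 3427 records, 5 ordinal-encoded attributes (0/1/2 for
# low/medium/high) and a 3-level target class such as the QSP level.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(3427, 5))
y = rng.integers(0, 3, size=3427)

# min_samples_leaf=20 mirrors the paper's "20 instances per class" setting.
clf = DecisionTreeClassifier(min_samples_leaf=20)
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
print(f"mean accuracy over folds: {scores.mean():.2%}")
```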

Fig. 3. Decision tree for the target attribute of steam production. (Node abbreviations: CCT = combustion chamber temperature; PAT = pre-warming air temperature; QPA6 = quantity of primary air of roller #6; QSA = quantity of secondary air; QSP = quantity of steam production; RHV = refuse heating value.)

5. Rules generation

The primary advantage of a decision tree model for incinerator operation, compared with other black-box algorithms, is its natural knowledge representation. A decision tree model can easily be transformed into a production rule base using the form "IF antecedence {x1, x2, x3, ..., xn}, THEN consequence {y}".

Fig. 4. An example of rule generation from one branch of the decision tree.

In this form, the antecedence set contains the branching attributes x1, x2, x3, ..., xn, and the consequence y is the target attribute of the original decision tree. Each branch in the decision tree model is equivalent to an IF-THEN rule with a corresponding accuracy rate, as shown in Fig. 4. There are 22 and 41 rules generated from the steam production and air pollution decision trees, respectively. Since IF-THEN rule bases are intuitive to humans, they powerfully facilitate on-site decision making; a sketch of how such rules can be read off a fitted tree is given below. An incineration plant operator can use the heuristic rules to assess target attribute outcomes under the various operational scenarios represented by the branching attribute combinations in the antecedence set. Some of these attributes cannot be adjusted manually in the central control room, since they indicate the on-site initial status; for example, the refuse heating value is prescribed by the characteristics of the incoming refuse material. Nevertheless, operators can manipulate the controllable attributes to achieve desired outcomes. The IF-THEN rule taken from Fig. 3 is given below, and its associated knowledge in terms of incinerator operation is explained in the following paragraph.

IF (CCT ≥ 1020.10) AND (6970.00 ≤ QSA ≤ 9462.00) AND (QPA6 ≥ 2530.00) AND (RHV ≥ 2187.00) THEN (QSP ≥ 36.40)
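A sketch of this branch-to-rule transformation using a fitted scikit-learn tree (again a stand-in for Weka's C4.5); each root-to-leaf path becomes one rule, and the leaf's class purity plays the role of the reported accuracy rate.

```python
from sklearn.tree import DecisionTreeClassifier

def tree_to_rules(clf: DecisionTreeClassifier, feature_names, class_names):
    """Yield one 'IF ... THEN ...' string per root-to-leaf path."""
    t = clf.tree_

    def walk(node, conditions):
        if t.children_left[node] == -1:            # leaf node
            counts = t.value[node][0]
            label = class_names[int(counts.argmax())]
            purity = counts.max() / counts.sum()   # stands in for accuracy
            cond = " AND ".join(conditions) if conditions else "TRUE"
            yield f"IF {cond} THEN class = {label} (accuracy {purity:.0%})"
        else:
            name = feature_names[t.feature[node]]
            thr = t.threshold[node]
            yield from walk(t.children_left[node],
                            conditions + [f"({name} <= {thr:.1f})"])
            yield from walk(t.children_right[node],
                            conditions + [f"({name} > {thr:.1f})"])

    yield from walk(0, [])

# Usage (with a fitted classifier `clf`, e.g. from the earlier sketch):
# for rule in tree_to_rules(clf, ["CCT", "PAT", "QPA6", "QSA", "RHV"],
#                           ["low", "medium", "high"]):
#     print(rule)
```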

The rule demonstrated above is based on the steam production target attribute and is interpreted in Fig. 5. Two initial conditions, the uncontrollable attributes of combustion chamber temperature (CCT) and refuse heating value (RHV), have been detected in advance either by the sensors or by human observation. If CCT is larger than or equal to 1020.1 °C and RHV is larger than or equal to 2187 cal/kg, the operators may need to maintain two specific attributes within certain ranges to achieve a quantity of steam production (QSP) larger than 36.4 tons/h. Here the two controllable attributes are the quantity of secondary air (QSA) supply and the quantity of primary air supply for roller #6 (QPA6). The operators in the central control room must adjust the associated controllers to keep the QSA value between 6970 and 9462 cubic meters per minute (cmm) and the QPA6 value above 2530 cmm. When applying this rule, the operators must also be aware of the uncertainty embedded in it: with a predictive accuracy of about 80%, the rule will not be correct all the time. A hypothetical encoding of this rule as an operator aid is sketched after Fig. 5.

Fig. 5. An example showing the application of the rule in incinerator operation.
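As an illustration only, such a rule could be encoded in an operator-support script as follows; the helper name and messages are hypothetical, while the thresholds mirror the rule above.

```python
def check_steam_rule(cct: float, rhv: float, qsa: float, qpa6: float) -> str:
    """Encode the Fig. 4 rule for QSP >= 36.4 ton/h (hypothetical helper).

    cct  : combustion chamber temperature (deg C)   - uncontrollable
    rhv  : refuse heating value (cal/kg)            - uncontrollable
    qsa  : quantity of secondary air (cmm)          - controllable
    qpa6 : quantity of primary air, roller #6 (cmm) - controllable
    """
    if cct < 1020.1 or rhv < 2187.0:
        return "initial conditions not met; this rule does not apply"
    if 6970.0 <= qsa <= 9462.0 and qpa6 >= 2530.0:
        return "QSP >= 36.4 ton/h expected (rule accuracy ~80%)"
    return "adjust QSA into [6970, 9462] cmm and raise QPA6 above 2530 cmm"

print(check_steam_rule(cct=1025.0, rhv=2200.0, qsa=8000.0, qpa6=2600.0))
```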

In summary, the rule-based knowledge acquired from the decision tree models should be of great help to operators. Since only a few crucial attributes, rather than numerous ones, deserve the operators' attention, explicit results can be expected for better refuse incinerator operation.

6. Conclusions

This paper proposes an integrated approach that uses two data-driven analyses for operational knowledge acquisition for refuse incinerators. Tremendous amounts of data are generated and recorded through real-time incineration plant monitoring, providing a sound basis for advanced analysis. However, hidden information in huge data sets cannot easily be identified as knowledge to support operations or decision making unless proper and trustworthy methods are applied. In light of this need, the feature selection technique and decision tree analysis were utilized to address the issues of practicable computational effort and explicit knowledge representation. The feature selection technique is capable of detecting the most relevant features, so a more concise and precise data set can be acquired for better subsequent analysis. Decision tree induction analysis, one of the data mining techniques, is well known for its outstanding rule-extraction and prediction abilities.


The current study successfully created two sets of rule-based knowledge, focusing on steam production and NOx emission. Following the data cleaning and feature selection processes, two decision tree models were formulated from the huge historical data set. Decision tree performance, evaluated by the accuracy rate obtained with the 10-fold cross-validation algorithm, was acceptable. Finally, a more intuitive expression of the domain know-how, presented as production rules, was derived from the decision tree models. These rules can greatly help incinerator plant on-site operators, so more efficient refuse incineration can be expected. Future issues beyond those described here include: (1) a finer division of continuous values into ordinal values, beyond high, medium, and low; (2) other related target attributes besides steam production and NOx emission; and (3) a broader model formulation that covers all of the cogeneration sub-system and air pollution control sub-system attributes, if the computational capacity problem is resolved.

References

Adriaans, P., & Zantinge, D. (1998). Data mining. Harlow: Addison-Wesley.
Blum, A., & Langley, P. (1997). Selection of relevant features and examples in machine learning. Artificial Intelligence, 97, 245–271.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Pacific Grove: Wadsworth.
Chung, S. S., & Poon, C. S. (1996). Evaluating waste management alternatives by the multiple criteria approach. Resources, Conservation and Recycling, 17, 189–210.
Elmasri, R., & Navathe, S. B. (2000). Data warehousing and data mining. In Fundamentals of database systems (3rd ed., chap. 26). MA: Addison-Wesley.
Elsner, J. B., & Schmertmann, C. P. (1994). Assessing forecast skill through cross validation. Weather and Forecasting, 9, 619–624.
Giordana, A., & Neri, F. (1995). Search-intensive concept induction. Evolutionary Computation, 3(4), 375–416.
Greene, D. P., & Smith, S. F. (1993). Competition-based induction of decision models from examples. Machine Learning, 13, 229–257.
Jain, A., & Zongker, D. (1997). Feature selection: evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(2).


Han, J., & Fu, Y. (1999). Mining multiple-level association rules in large databases. IEEE Transactions on Knowledge and Data Engineering, 11(5), 798–805.
John, G. H., Kohavi, R., & Pfleger, K. (1994). Irrelevant features and the subset selection problem. In Machine learning: Proceedings of the eleventh international conference (pp. 121–129).
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Applied Statistics, 29, 119–127.
Kirchner, K., Tölle, K. H., & Krieter, J. (2004). The analysis of simulated sow herd datasets using decision tree technique. Computers and Electronics in Agriculture, 42, 111–127.
Kohavi, R. (1995). Wrappers for performance enhancement and oblivious decision graphs. Dissertation thesis, Department of Computer Science, Stanford University, USA.
Liu, H., & Setiono, R. (1998). Some issues on scalable feature selection. Expert Systems with Applications, 15, 333–339.
Moshkovich, H. M., Mechitov, A. I., & Olson, D. L. (2002). Rule induction in data mining: effect of ordinal scales. Expert Systems with Applications, 22, 303–311.
Murthy, S. K., Kasif, S., & Salzberg, S. (1994). A system for induction of oblique decision trees. Journal of Artificial Intelligence Research, 2, 1–32.
Pudil, P., Novovičová, J., & Kittler, J. (1994). Floating search methods in feature selection. Pattern Recognition Letters, 15(11), 1119–1125.
Pudil, P., Novovičová, J., & Somol, P. (2002). Feature selection toolbox software package. Pattern Recognition Letters, 23, 487–492.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
Rivals, I., & Personnaz, L. (1999). On cross validation for model selection. Neural Computation, 11, 863–870.
Schaffer, C. (1993). Selecting a classification method by cross validation. Machine Learning, 13, 135–143.
Shao, J., & Tu, D. (1995). The jackknife and bootstrap. Springer-Verlag.
Somol, P., Pudil, P., Novovičová, J., & Paclik, P. (1999). Adaptive floating search methods in feature selection. Pattern Recognition Letters, 20, 1157–1163.
Sonesson, U., Bjorklund, A., Carlsson, M., & Dalemo, M. (2000). Environmental and economic analysis of management systems for biodegradable waste. Resources, Conservation and Recycling, 28, 29–53.
Tung, K. Y., Huang, I. C., Chen, S. L., & Shih, C. T. (2005). Mining the Generation Xers' job attitudes by artificial neural network and decision tree: empirical evidence in Taiwan. Expert Systems with Applications, 29, 783–794.
Weiss, S. M., & Kulikowski, C. A. (1991). Computer systems that learn. Morgan Kaufmann.
Winston, P. (1992). Learning by building identification trees. In Artificial intelligence (p. 423). Addison-Wesley.
Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). San Francisco: Morgan Kaufmann.
Zongker, D., & Jain, A. (1996). Algorithms for feature selection: an evaluation. In Proceedings of the 13th International Conference on Pattern Recognition.