A novel classification method based on the ensemble learning and feature selection for aluminophosphate structural prediction

Microporous and Mesoporous Materials 186 (2014) 201–206

Minghai Yao a,b, Miao Qi a,*, Jinsong Li a, Jun Kong b,c,*

a School of Computer Science and Information Technology, Northeast Normal University, Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun 130117, China
b School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China
c Key Laboratory for Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China

Article history: Received 2 May 2013; Received in revised form 30 September 2013; Accepted 9 December 2013; Available online 13 December 2013

Keywords: Zeolites; Polymer processing; Feature selection; Ensemble learning; Simulation

Abstract

In this paper, a novel classification algorithm based on ensemble learning and feature selection is proposed for predicting specific microporous aluminophosphate ring structures. The proposed method can select the synthetic factors most significant for the generation of (6,12)-ring-containing structures. First, a clustering method is employed so that each training subset contains all the structural characteristics of the samples. Then, the method takes full account of the discrimination and class information of each feature by calculating scores. In particular, the scores are fused to obtain a weight for each feature. Finally, we select the significant features according to the weights. The result of feature selection helps to predict the (6,12)-ring-containing AlPO structures well. Moreover, we compare our method with several classical feature selection methods and classification methods by theoretical analysis and extensive experiments. Experimental results show that our method can achieve higher predictive accuracy with fewer synthetic factors.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Zeolite materials are an important class of crystalline inorganic microporous solids formed by TO4 tetrahedra (T refers to Si, P, Al, Ge, Ga, etc.) with a well-defined regular pore system. The most interesting features of zeolites lie in the variable chemical compositions of the pore wall, as well as the tunable pore diameters and pore shapes. These excellent characteristics endow zeolites with wide applications in catalysis, adsorption, separation, ion exchange and other fields [1–3]. According to the number of T-atoms in the pore ring, zeolites are classified as small-, medium-, large-, and extra-large-pore structures, with pore windows delimited by 8, 10, 12 and more than 12 T-atoms, respectively. Extra-large-pore zeolites are drawing more and more attention because they can accommodate larger molecules, as desired in the fields mentioned above. In recent years, Corma, Baumes and co-workers have been engaged in research on the synthesis of microporous materials and have identified many factors that affect it [4–7]. In literature [8], Corma and co-workers reviewed the state of research on extra-large-pore molecular sieve materials in terms of structure, stability,

* Corresponding authors at: School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China. Tel.: +86 431 84536326. E-mail addresses: [email protected] (M. Qi), [email protected] (J. Kong).

catalysis and so on. Microporous aluminophosphates, as an important branch of molecular sieve materials, have attracted widespread attention from researchers worldwide. At present, about 60 kinds of microporous aluminophosphate structures are known, among which the twelve-ring structure with a pore size of 0.73 nm is a typical representative and has important applications in adsorption and catalysis. However, the crystallization kinetics of such materials is rather complicated. In general, many factors influence the crystallization kinetics and the final crystalline phases, such as the reaction raw materials, the gel composition, the reaction pH, the organic template agent, the solvent, etc. Therefore, the rational synthesis of new microporous materials remains a significant challenge in the field of inorganic chemistry.

In order to mine the relationships between the synthetic factors and the resulting structures, and to further guide the rational synthesis of AlPO materials, Yu and co-workers have built an AlPO synthesis database including about 1600 items [9,10]. Each reaction record contains the synthesis conditions, including the gel molar composition, temperature and time, solvent and template type, as well as the structural characteristics of the product. This database provides a research platform for the rational design and synthesis of microporous materials [11]. In the past few years, the AlPO molecular sieve has been used as a target to probe the relationships between synthetic factors and the resulting framework structures [12–16]. Li and co-workers adopted the Support Vector Machine (SVM) to predict (6,12)-ring-containing microporous AlPOs, which gave the best combination


of synthetic factors based on a brute-force method [12]. In literature [13], Partial Least Squares and Logistic Discrimination were used to predict the formation of the microporous aluminophosphate AlPO4-5, and four re-sampling methods were proposed to deal with the problem of class imbalance. Li and co-workers proposed an AlPO4-5 prediction system based on C5.0 combined with the Fisher score [14]. As mentioned above, feature selection methods and data mining techniques can help uncover the interaction between the synthetic conditions and the specified product.

In this paper, a classification algorithm based on ensemble learning and feature selection is proposed, which can provide helpful analysis for the rational synthesis of microporous aluminophosphates. In our method, cluster analysis is used to cluster the training samples, and a multi-classifier fusion mechanism is employed to improve classification performance. In particular, a new feature selection method is proposed to explore the significant factors for a specific structure. The cluster-analysis-based generation of training and testing sets ensures the diversification of the training sets and alleviates the sample imbalance problem. Through the improved feature selection method, we explore the main factors that affect the outcome of the AlPO synthesis process, and we thereby obtain an AlPO synthesis prediction model with higher prediction accuracy. In order to demonstrate the effectiveness and superiority of the proposed method, we compare it with several classical feature selection methods and classification methods in terms of prediction accuracy through extensive experiments.

From the viewpoint of data processing, the proposed method ensures the richness of the data-structure information when sampling training samples and considers both the discrimination and class information of features during feature selection. Moreover, it can deal with class imbalance and redundancy among features. The proposed method adopts the idea of ensemble learning, which yields a prediction model with higher prediction accuracy.

This paper is organized as follows. Section 2 introduces feature selection methods, the FCM clustering algorithm and ensemble learning. Section 3 describes the proposed classification and feature selection methods. Section 4 presents the comparison experiments. Section 5 gives results and discussions. Finally, conclusions are drawn in Section 6.

2. Related works

2.1. Feature selection

The need for feature selection (FS) often arises in machine learning and pattern recognition problems. FS has been one of the key steps in mining high-dimensional data for decades. The idealized definition of feature selection is to find the minimally sized feature subset that is necessary and sufficient for a specific task. FS has several potential benefits, such as improving classification accuracy, avoiding the well-known "curse of dimensionality", speeding up the training process and reducing storage demands. In particular, it can provide a better understanding and interpretability for a domain expert [17]. Generally, FS techniques are classically grouped into two classes: filter-based methods and wrapper-based methods [18,19]. A filter method assesses the quality of a given feature subset using solely the characteristics of that subset, without any learning algorithm. In contrast, a wrapper method evaluates the adequacy of a feature subset based on the performance of some classifier operating with the selected features. Wrapper-based methods are more expensive computationally. In this study, we are particularly interested in filter methods and propose a novel feature selection method that considers both discrimination and class information.

2.2. FCM clustering algorithm

The fuzzy c-means (FCM) algorithm was proposed by Bezdek as an improvement of the hard k-means algorithm [20]. It assigns a class membership to a data point depending on the similarity of the data point to a particular class relative to all other classes. The FCM objective function for partitioning a data set into c clusters is:

$$J_m(\mu, v) = \sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^{m}\, d^{2}(x_j, v_i), \quad \text{s.t.} \quad \sum_{i=1}^{c} \mu_{ij} = 1, \qquad (1)$$

where $X = (x_1, x_2, \ldots, x_j, \ldots, x_n)$ is a data matrix of size $p \times n$, $p$ represents the dimension of each feature vector, $n$ represents the number of data points, $m$ is the fuzziness index, and $v_i$ is the fuzzy cluster centroid of the $i$th cluster. Using the Euclidean norm, the distance metric $d$ measures the similarity between a data point $x_j$ and a cluster centroid $v_i$ in the feature space:

$$d^{2}(x_j, v_i) = \|x_j - v_i\|^{2}. \qquad (2)$$

The objective function is minimized when data points close to the centroids of their clusters are assigned high membership values, and data points far from the centroids are assigned low membership values. Setting the first derivatives of $J_m$ with respect to $\mu$ and $v$ to zero yields the two necessary conditions for minimizing $J_m$:

$$\mu_{ij} = \left( \sum_{k=1}^{c} \left( \frac{d(x_j, v_i)}{d(x_j, v_k)} \right)^{2/(m-1)} \right)^{-1} \qquad (3)$$

and

$$v_i = \frac{\sum_{j=1}^{n} \mu_{ij}^{m}\, x_j}{\sum_{j=1}^{n} \mu_{ij}^{m}}. \qquad (4)$$

The FCM algorithm proceeds by iterating the two necessary conditions until a solution is reached. After FCM clustering, each data point is associated with a membership value for each class. By assigning each data point to the class with the highest membership value, a partition of the data can be obtained.
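For concreteness, the following is a minimal numpy sketch of the FCM iteration described by Eqs. (1)–(4); the function name, the random initialization and the convergence tolerance are our own assumptions, not details given in the paper.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means sketch: X is p x n (features x samples), c clusters."""
    p, n = X.shape
    rng = np.random.default_rng(seed)
    # random membership matrix; columns normalized so memberships sum to 1
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Eq. (4): centroids are membership-weighted means
        V = (Um @ X.T) / Um.sum(axis=1, keepdims=True)            # c x p
        # Eq. (2): squared Euclidean distances d^2(x_j, v_i)
        d2 = ((X.T[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # c x n
        d = np.sqrt(np.maximum(d2, 1e-12))
        # Eq. (3): membership update
        U_new = 1.0 / ((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))).sum(axis=1)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    # hard labels (highest membership) and centroids
    return U.argmax(axis=0), V
```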


2.3. Ensemble learning

Ensemble learning improves the generalization performance of individual learners by combining the outputs of a set of diverse base classifiers. Previous theoretical and empirical research has shown that an ensemble is more accurate than its individual components if and only if the individual members are both accurate and diverse [21]. Many methods have been developed for constructing classification ensembles. The most popular techniques include the random subspace method [22], Bagging [23] and Boosting [24]. The random subspace method was first introduced in [25]; it randomly samples the original feature components to obtain different feature subsets, and in recent years it has been applied to feature selection, clustering and other areas. Both Bagging and Boosting train the base classifiers by resampling the training set, and the resulting classifiers are usually combined by simple majority voting in the final decision rule. One difference between Bagging and Boosting is that the former obtains a bootstrap sample by uniformly sampling with replacement from the original training set, while the latter resamples the training data by placing more emphasis on samples misclassified by previous classifiers. Recently, besides classification ensembles, there have also appeared clustering ensembles, which combine a diverse set of individual clusterings for better consensus solutions [26].
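As a concrete illustration of bootstrap resampling plus majority voting, here is a minimal Bagging sketch; it is our own example, and the decision-tree base learner is an arbitrary choice, not the paper's.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X_train, y_train, X_test, n_estimators=10, seed=0):
    """Bagging: train one base classifier per bootstrap replicate of the
    training set, then combine predictions by simple majority voting."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    votes = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)   # uniform sampling with replacement
        clf = DecisionTreeClassifier(random_state=0)
        votes.append(clf.fit(X_train[idx], y_train[idx]).predict(X_test))
    votes = np.asarray(votes, dtype=int)   # assumes integer class labels
    # majority vote over the ensemble, per test sample
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```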

[Fig. 1. The flowchart of sample subsets generation: FCM clusters the negative samples into clusters 1…k; samples extracted from each cluster form sample subsets 1…n.]

3. Our proposed method

In this section, the classification algorithm based on ensemble learning and feature selection is proposed. The Bagging algorithm is one of the ensemble learning algorithms. The process of Bagging can be described as follows: independently and randomly extract data of size n′ (n′ < n) from the original data set of size n to form a subset of samples; this process is repeated until many independent subsets are generated. Then, each sample subset is used to train a classifier, and the final classification result is determined by fusing the individual classification results. Nevertheless, this sample extraction is random, so it cannot guarantee that samples of all structural types are included in each sample subset. In order to improve the performance of each classifier, the training samples are first clustered. Then a sample subset is built by selecting a part of the samples from each cluster, which guarantees the diversity of each sample subset. Fig. 1 shows the flowchart of sample subsets generation.
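A minimal sketch of this cluster-then-sample subset generation follows; the helper name and the half-per-cluster sampling rate reflect our reading of Fig. 1 and Algorithm 1 below, and both are assumptions.

```python
import numpy as np

def make_subsets(X_neg, cluster_labels, n_subsets, seed=0):
    """Fig. 1 sketch: build diverse negative-sample subsets by drawing half
    of each FCM cluster, so every structural cluster is represented."""
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(n_subsets):
        picked = np.concatenate([
            rng.choice(np.flatnonzero(cluster_labels == c),
                       size=max(1, (cluster_labels == c).sum() // 2),
                       replace=False)
            for c in np.unique(cluster_labels)])
        subsets.append(X_neg[picked])
    return subsets
```

Here `cluster_labels` would come from the FCM sketch of Section 2.2.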

3.1. Feature selection

The Fisher score [27] is one of the most widely used supervised feature selection methods. It scores each feature independently according to the Fisher criterion:

$$F_j = \frac{\sum_{k=1}^{c} n_k (\bar{x}_j^{k} - \bar{x}_j)^2}{\sum_{k=1}^{c} n_k (\sigma_j^{k})^2}, \qquad (5)$$

where $k$ indicates the class label, $c$ represents the number of classes, $n_k$ represents the number of samples in the $k$th class, $\bar{x}_j^{k}$ represents the mean of the $j$th feature over the $k$th class, $\bar{x}_j$ represents the mean of the $j$th feature over all samples, and $\sigma_j^{k}$ indicates the variance of the $j$th feature in the $k$th class. The larger $F_j$ is, the more discriminative the feature is. The Fisher score considers the divergence between classes and the divergence within classes: large between-class divergence and small within-class divergence are desirable. However, the Fisher score does not consider the correlation between a feature and the class label. When a feature separates the classes well but its correlation with the class label is particularly low, it is not a good feature for classification. Therefore, the best features cannot be found by the Fisher score alone.

Mutual information [28] is an information measure from information theory. It represents the correlation between two events and is defined as:

$$I(X; Y) = H(X) + H(Y) - H(X, Y), \qquad (6)$$

where $I(X; Y)$ represents the mutual information between events $X$ and $Y$, and $H(X)$ and $H(Y)$ are their entropies. The entropy is defined as:

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i), \qquad (7)$$

where $x_i$ represents a value of $X$, $n$ is the number of values, and $p(x_i)$ represents the probability of $x_i$. $H(X, Y)$ represents the joint entropy, which is defined as:

$$H(X, Y) = -\sum_{j=1}^{h} p(y_j) \sum_{i=1}^{n} p(x_i \mid y_j) \log_2 p(x_i \mid y_j), \qquad (8)$$

where $x_i$ and $y_j$ represent values of $X$ and $Y$, $p(y_j)$ represents the probability of $y_j$, $p(x_i \mid y_j)$ represents the conditional probability, and $n$ and $h$ are the numbers of distinct values in $X$ and $Y$. The correlation of $X$ and $Y$ can then be defined as:

$$D(X, Y) = \frac{1}{|X|} \sum_{x_i \in X} I(x_i; Y), \qquad (9)$$

where $X$ represents the feature set, $Y$ represents the target class labels, and $I(x_i; Y)$ is the mutual information between feature $i$ and the target category $Y$. The larger $D$ is, the more discriminative the feature is.

Taking into account the characteristics of the different sample subsets, $F_i$ and $D_i$ are assigned different weights. For each feature, the fused weight is defined as:

$$S_i = \alpha \cdot F_i + (1 - \alpha) \cdot D_i, \quad \alpha \in [0, 1], \qquad (10)$$

where $\alpha$ is the tradeoff factor. The larger $S_i$ is, the more discriminative the feature is.
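A minimal sketch of the scores of Eqs. (5)–(10) for a single feature follows. It assumes discrete feature values for the mutual information (continuous features would need binning first, a detail the paper does not specify), and the function names are our own.

```python
import numpy as np

def fisher_score(x, y):
    """Eq. (5) for one feature vector x over class labels y."""
    classes, overall_mean = np.unique(y), x.mean()
    num = sum((y == k).sum() * (x[y == k].mean() - overall_mean) ** 2 for k in classes)
    den = sum((y == k).sum() * x[y == k].var() for k in classes)
    return num / den

def mutual_information(x, y):
    """Eq. (6): I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete x and y."""
    def entropy(v):                      # Eq. (7) from empirical frequencies
        p = np.unique(v, return_counts=True)[1] / len(v)
        return -(p * np.log2(p)).sum()
    joint = np.array([f"{a}|{b}" for a, b in zip(x, y)])  # joint outcomes
    return entropy(x) + entropy(y) - entropy(joint)

def fused_score(x, y, alpha=0.3):
    """Eq. (10): weighted fusion of Fisher score and mutual information.
    Note: F and D live on different scales; rescaling to comparable ranges
    may be needed in practice (an assumption; the paper does not detail it)."""
    return alpha * fisher_score(x, y) + (1 - alpha) * mutual_information(x, y)
```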

3.2. Algorithm framework

Given a training set X (m × d), a class label set Y (m × 1), the number of clusters t, the tradeoff factor α and the number of classifiers n, the algorithm is described as follows (a Python sketch follows the listing):

Algorithm 1
1. begin; initialize i ← 0, j ← 0, k ← 0
2. do i ← i + 1
3.   if y_i == 0, add x_i to X− (negative set)
4.   else add x_i to X+ (positive set)
5. until i = m
6. C ← FCM(X−)                  // cluster X− by the FCM algorithm
7. do j ← j + 1
8.   do k ← k + 1
9.     N ← N ∪ random(C_k / 2)  // randomly draw half of cluster C_k
10.  until k = t
11.  train ← {X+, N}
12.  train_FSR ← FS(train)      // select features by Eq. (10)
13.  Model(j) ← bayes(train_FSR)
14. until j = n
15. return Model
16. end
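For concreteness, here is a runnable sketch of our reading of Algorithm 1. It reuses the fcm, make_subsets and fused_score sketches given earlier; GaussianNB stands in for the paper's (unspecified) Bayesian classifier, and all names and defaults are our assumptions.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def train_ensemble(X, y, t=3, alpha=0.3, n_models=13, n_features=7):
    """Algorithm 1 sketch: split samples by class, FCM-cluster the negatives,
    and train one Bayes classifier per balanced, feature-selected subset."""
    X_pos, X_neg = X[y == 1], X[y == 0]
    labels, _ = fcm(X_neg.T, c=t)                            # fcm sketch, Section 2.2
    models = []
    for X_negsub in make_subsets(X_neg, labels, n_models):   # Fig. 1 sketch
        X_sub = np.vstack([X_pos, X_negsub])
        y_sub = np.r_[np.ones(len(X_pos)), np.zeros(len(X_negsub))]
        scores = [fused_score(X_sub[:, j], y_sub, alpha)     # Eq. (10)
                  for j in range(X_sub.shape[1])]
        feats = np.argsort(scores)[::-1][:n_features]        # top-ranked features
        models.append((feats, GaussianNB().fit(X_sub[:, feats], y_sub)))
    return models

def predict_ensemble(models, X):
    """Fuse the individual predictions by simple majority voting."""
    votes = np.array([clf.predict(X[:, feats]) for feats, clf in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```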


4. Experimental results

The proposed classification method is evaluated on AlPO data from the AlPO database (http://zeobank.jlu.edu.cn/). We evaluate its performance in terms of prediction accuracy and the number of selected features. The database contains 1576 reaction data for ca. 230 AlPO structures. Four groups of synthesis information are used in the experiments: the source materials, the template, the synthesis conditions and the structural characteristics. Because the database has missing values, we select the 21 synthetic factors shown in Table 1 for the experiments. Considering that the (6,12)-ring-containing structures are of instructive significance [29], 398 positive samples [containing (6,12)-rings] and 884 negative samples [containing no (6,12)-rings] are used to distinguish the (6,12)-ring-containing AlPOs from the other AlPOs. The four important synthetic factors related to the molar concentrations (F1–F4) are always selected, and the remaining factors (F5–F21) are candidates for selection. The experimental procedure is as follows (a code sketch of one round follows the list):

Step 1: Build the sample set containing 398 positive samples and 884 negative samples.
Step 2: Cluster the negative samples using the FCM algorithm.
Step 3: Randomly select 199 of the 398 positive samples and randomly select 199 negative samples from the clustering subsets to form the test set.
Step 4: According to the class labels, select 199 negative samples from the remaining negative samples; together with the remaining positive samples, these form the training set.
Step 5: Apply the proposed feature selection method.
Step 6: Train the Bayesian classifier on the selected features of the training samples.
Step 7: Repeat Steps 3–6 n times.
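Here is a minimal sketch of one round of Steps 3–6. The 199/199 split sizes come from the text; fused_score is the earlier sketch, and the remaining scaffolding is our own assumption.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def one_round(X_pos, X_neg, alpha=0.3, n_features=7, seed=0):
    """Steps 3-6: balanced 199/199 test split, balanced training set,
    feature selection by Eq. (10), then Bayes training and scoring."""
    rng = np.random.default_rng(seed)
    p_te = rng.choice(len(X_pos), 199, replace=False)
    n_te = rng.choice(len(X_neg), 199, replace=False)
    p_tr = np.setdiff1d(np.arange(len(X_pos)), p_te)
    n_tr = rng.choice(np.setdiff1d(np.arange(len(X_neg)), n_te), 199, replace=False)
    X_tr = np.vstack([X_pos[p_tr], X_neg[n_tr]])
    y_tr = np.r_[np.ones(len(p_tr)), np.zeros(199)]
    scores = [fused_score(X_tr[:, j], y_tr, alpha) for j in range(X_tr.shape[1])]
    feats = np.argsort(scores)[::-1][:n_features]     # top-ranked features
    clf = GaussianNB().fit(X_tr[:, feats], y_tr)
    X_te = np.vstack([X_pos[p_te], X_neg[n_te]])
    y_te = np.r_[np.ones(199), np.zeros(199)]
    return clf.score(X_te[:, feats], y_te)            # prediction accuracy
```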

Here, n is set to 13, because more than 99% of the data is sampled through 13 repetitions. Among the classifiers compared, the Bayesian classifier is faster and more stable, so it is adopted in this paper. Experimental statistics show that when the tradeoff factor α is 0.3, the prediction accuracy reaches 86.98%. The statistical result is shown in Fig. 2.

In order to demonstrate the effectiveness and superiority of our method, five comparison experiments are carried out. The clustering and non-clustering algorithms are compared in Experiment 1. In Experiment 2, feature selection and no feature selection are compared on the basis of clustering. In Experiment 3, our proposed feature selection method is compared with several classical methods, including Fisher score (Fisher), t-test [30], Information Gain (InfoGain) [31], Laplacian score (LS), ChiSquare (CS) [32], Gini [33] and Kruskal–Wallis (KW) [34]. Experiment 4 compares the selected feature dimensions of the different feature selection methods. In Experiment 5, our proposed method is compared with other classification methods, including SVM, KNN, SRC [35] and C4.5.

Experiment 1 compares the prediction accuracy of the clustering and non-clustering algorithms. The number of clusters t is set to 3, and the tradeoff factor α is set to 0.3 in our study. The comparison results are shown in Fig. 3.

Table 1
Description of the synthetic factors.

Code  Description of parameters

Gel composition
F1    Molar ratio of Al2O3/Al2O3 in the gel composition
F2    Molar ratio of P2O5/Al2O3 in the gel composition
F3    Molar ratio of solvent/Al2O3 in the gel composition
F4    Molar ratio of template/Al2O3 in the gel composition

Solvent
F5    Density
F6    Melting point
F7    Boiling point
F8    Dielectric constant
F9    Dipole moment
F10   Polarity

Organic template
F11   Longest distance of organic template
F12   Second longest distance of organic template
F13   Shortest distance of organic template
F14   Van der Waals volume
F15   Dipole moment
F16   Ratio of C/N
F17   Ratio of N/(C + N)
F18   Ratio of N/van der Waals volume
F19   Sanderson electronegativity
F20   Number of freely rotatable single bonds
F21   Maximal number of protonated H atoms

Fig. 2. The performance comparison with different tradeoff factors.

Fig. 3. The performance comparison of non-clustering and clustering methods.


If we want to train classifiers with strong performance, the training samples should carry rich information. Clustering the training samples makes the samples within each subset share similar structural characteristics, so selecting training samples from the different subsets ensures that the training set contains more information. We can see from Fig. 3 that the pre-clustering method keeps the structural information of the sample set in the training samples and improves the classification accuracy.

Experiment 2 compares the prediction accuracy with and without feature selection (FS); the FS method used is the one we propose. The experimental results for different numbers of clusters are shown in Fig. 4. The prediction accuracy with feature selection is about 2% higher than without it. This means that not all features play significant roles in the prediction task; that is, the features of the AlPO synthesis database are redundant. Experiment 2 demonstrates that the feature selection method can remove redundant features and improve the prediction accuracy.

Experiment 3 compares the prediction accuracy of our method with the several classical methods, as shown in Fig. 5. The Fisher score method gives the worst results, while the prediction curve of the proposed method is always on top. When the number of clusters is 3, the predictive accuracy of our method reaches 86.98%. Compared with the Fisher score method, our method considers the correlation between features and the class label, so the prediction accuracy is greatly improved. The other methods achieve different prediction accuracies because they consider different aspects.


Fig. 6. The predictive accuracy of different feature selection method with different dimensions.

In general, the method proposed in this paper is better than the other methods for the same number of clusters.

Experiment 4 compares the prediction accuracy of the different methods as the feature dimension d varies. We can see from Fig. 6 that when the feature dimension is greater than 6, the prediction accuracy of our method is higher than that of the other methods. When the feature dimension is less than 6, the predictive accuracies of the various methods differ. From Fig. 6, when predicting the reaction results, the number of selected features should not be less than 6; otherwise, the predictive results would be meaningless. However, even when different feature dimensions are selected, the proposed method is almost always better than the others.

Experiment 5 compares the prediction accuracy of our method with the other classification methods, as shown in Fig. 7.

5. Results and discussions

Fig. 4. The prediction accuracy of the feature selection method and the non-feature selection method.

Table 2 displays the feature selection results and the prediction accuracies of the different methods. Obviously, the selected features and rank orders differ between methods because their evaluation criteria differ. Visually, we can find that F12, F16 and F18 almost always appear in the results.

Fig. 5. The prediction accuracy of different FS methods with different clusters.

Fig. 7. The prediction accuracy of our method and other method.


Table 2
Comparisons among different FS methods.

Methods           Selection and rank              Predictive accuracy (%)
Non-FS            –                               83.80
Fisher score      F16 F12 F18 F17 F14 F19 F7      84.54
t-Test            F16 F12 F14 F20 F10 F13 F8      84.74
InfoGain          F14 F15 F18 F13 F12 F11 F7      85.98
Laplacian score   F14 F20 F11 F12 F13 F16 F7      82.78
ChiSquare         F18 F14 F11 F17 F15 F13 F5      85.98
Gini              F12 F18 F16 F17 F14 F19 F7      85.69
Kruskal–Wallis    F21 F15 F11 F12 F13 F14 F7      85.55
Our               F12 F16 F18 F14 F17 F13 F7      86.98

This means that F12, F16 and F18 are crucial factors for distinguishing between structures that contain (6,12)-rings and those that do not. The same conclusion was obtained when the SVM was used to construct the product structure prediction model [14]. From the prediction results of the model, we can see that the geometric parameters of the organic template (F12 and F13), the electronic parameters (F14, F16, F17, F18) and the solvent parameter (F7) have a large influence on the products. In particular, F12 (second longest distance of organic template), F16 (ratio of C/N) and F18 (ratio of N/van der Waals volume) play an important role in synthesizing AlPOs. As is well known, the geometrical characteristics of the organic template play a very important role in the shape of the AlPO framework and the size of the pore channel. The longest distance of the organic template is aligned with the pore channel, while the second longest distance becomes the main factor determining the pore diameter. These results also agree with the fact that the geometric and electronic parameters of an organic template have a vital effect on the pore size and shape of an AlPO structure.

6. Conclusions

In this work, a novel classification algorithm based on ensemble learning and feature selection is proposed. The idea of ensemble learning is used in the classification algorithm. The cluster-analysis-based method for generating training and test sets ensures the diversification of the training sets and alleviates the class imbalance common in chemical data. The proposed feature selection method considers both the discriminative abilities of features and the correlations between features and the class label, and it explores the main factors that affect the outcome of the aluminophosphate synthesis process. Using the classification algorithm based on ensemble learning and feature selection, we construct an aluminophosphate synthesis prediction model with higher prediction accuracy. Extensive experimental results show that the proposed method is efficient and improves predictive performance; it also displays better predictive performance than some existing methods applied to the same problem. In particular, using only 11 of the synthetic factors achieves good performance for predicting the formation of (6,12)-ring-containing AlPOs, with an accuracy of 86.98%. Simultaneously, the prediction results can provide prior knowledge and a better understanding for rational synthesis experiments.

Acknowledgment

This work is supported by the Young Scientific Research Foundation of Jilin Province Science and Technology Development Project (No. 201201070), the Jilin Provincial Natural Science Foundation (No. 201115003), the Fund of Jilin Provincial Science & Technology Department (No. 20111804), the Science Foundation for Post-doctor of Jilin Province (No. 2011274), and the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT).

References

[1] B. Smit, T.L.M. Maesen, Nature 451 (2008) 671–678.
[2] C.F. Chang, C.Y. Chang, K.H. Chen, W.T. Tsai, J.L. Shie, Y.H. Chen, Colloid Interface Sci. 277 (2004) 29–34.
[3] X.E. Shi, S.R. Zhai, L.Y. Dai, Y.K. Shan, M.Y. He, W. Wei, D. Wu, Y.H. Sun, Acta Phys.-Chim. Sin. 20 (2004) 265–270.
[4] J.A. Stoeger, C.M. Veziri, M. Palomino, et al., Microporous Mesoporous Mater. 147 (2012) 286–294.
[5] M. Moliner, A. Corma, Microporous Mesoporous Mater. 164 (2012) 44–48.
[6] A. Tisler, K.K. Unger, F.F. Schueth, Zeolites 18 (1997) 232.
[7] L.A. Baumes, M. Moliner, A. Corma, QSAR Comb. Sci. 26 (2007) 255–272.
[8] J.X. Jiang, J.H. Yu, A. Corma, Angew. Chem. Int. Ed. 49 (2010) 3120–3145.
[9] J.Y. Li, J.H. Yu, J.G. Sun, X.C. Dong, Y. Li, Z.P. Wang, S.X. Wang, R.R. Xu, Stud. Surf. Sci. Catal. 170 (2007) 168–176.
[10] J.Y. Li, J.H. Yu, R.R. Xu, http://zeobank.jlu.edu.cn/.
[11] Y. Yan, J. Li, M. Qi, X. Zhang, J.H. Yu, R.R. Xu, Sci. China, Ser. B: Chem. 52 (2009) 1734–1738.
[12] J. Yu, R. Xu, Stud. Surf. Sci. Catal. 154 (2004) 1–13.
[13] J.Y. Li, L. Li, J. Liang, P. Chen, J.H. Yu, Y. Xu, R.R. Xu, Cryst. Growth Des. 8 (2008) 2318–2323.
[14] J.Y. Li, M. Qi, J. Kong, J.Z. Wang, W.F. Hou, J.H. Yu, R.R. Xu, Y. Xu, Microporous Mesoporous Mater. 129 (2010) 251–255.
[15] M. Qi, Y.H. Lu, J.Z. Wang, J. Kong, Mol. Inform. 29 (2010) 203–210.
[16] J.S. Li, Y.H. Lu, J. Kong, N. Gao, J.H. Yu, R.R. Xu, J.Z. Wang, M. Qi, J.Y. Li, Microporous Mesoporous Mater. 173 (2013) 197–206.
[17] J.P. Hua, D. Waibhav, Pattern Recogn. 42 (2009) 409–424.
[18] R. Kohavi, G.H. John, Artif. Intell. 97 (1997) 273–324.
[19] H. Liu, L. Yu, IEEE Trans. Knowl. Data Eng. 17 (2005) 491–502.
[20] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[21] K.H. Liu, B. Li, J. Zhang, J.X. Du, Pattern Recogn. 42 (2009) 1274–1283.
[22] T.K. Ho, IEEE Trans. Pattern Anal. Mach. Intell. 20 (1998) 832–844.
[23] L. Breiman, Mach. Learn. 24 (1996) 123–140.
[24] Y. Freund, R.E. Schapire, Experiments with a new boosting algorithm, in: Proceedings of the 13th International Conference on Machine Learning, 1996, pp. 148–156.
[25] T.K. Ho, Random decision forests, in: Proceedings of the 3rd International Conference on Document Analysis and Recognition, vol. 1, 1995, pp. 278–282.
[26] A. Topchy, A. Jain, W. Punch, IEEE Trans. Pattern Anal. Mach. Intell. 27 (2005) 1866–1881.
[27] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley, New York, 2001.
[28] J.R. Ding, J.H.J. Huang, F. Liu, Y.T. Zhang, J. Harbin Inst. Tech. 44 (2012) 81–85.
[29] C. Baerlocher, L.B. McCusker, Database of Zeolite Structures, 2009.
[30] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, second ed., Cambridge University Press, New York, 1992.
[31] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley, New York, 2006.
[32] H. Liu, R. Setiono, Artif. Intell. (1995) 388–391.
[33] L. Breiman, J. Friedman, R. Olshen, C. Stone, Classification and Regression Trees, Chapman & Hall, London, 1984.
[34] G. Ruxton, G. Beauchamp, Anim. Behav. 76 (2008) 1083–1087.
[35] J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, Y. Ma, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2009) 210–227.