Powder Technology 356 (2019) 1024–1028
Powder Technology journal homepage: www.elsevier.com/locate/powtec
Classification of coal and gangue under multiple surface conditions via machine vision and relief-SVM
Dongyang Dou a,b,*, Wenze Wu a,b, Jianguo Yang c, Yong Zhang a,b
a Key Laboratory of Coal Processing and Efficient Utilization of Ministry of Education, China University of Mining and Technology, Xuzhou 221116, PR China
b School of Chemical Engineering and Technology, China University of Mining and Technology, Xuzhou 221116, PR China
c National Engineering Research Center of Coal Preparation and Purification, China University of Mining and Technology, Xuzhou 221116, PR China
Article info
Article history: Received 25 September 2018; received in revised form 19 February 2019; accepted 6 September 2019; available online 7 September 2019
Abstract
Coal and gangue classification is a key problem in automating gangue picking during coal preparation. Four coal properties, namely raw coal with a dry clean surface, a wet clean surface, a dry surface covered by slime, and a wet surface covered by slime, are frequently encountered in real situations. Typically, these conditions occur simultaneously, yielding multiple surface conditions. In this study, a relief-Support Vector Machine (relief-SVM) method is proposed to recognize coal and gangue based on image analysis. First, 19 features of coal and gangue pictures, comprising color and textural features, were extracted. Subsequently, the relief-SVM method was employed to identify optimal features and construct optimal classifiers. The classifiers were then applied to coal samples from the Dafeng and Baijigou coal mines to validate their efficacy in recognizing coal and gangue. The average accuracy was 92.57% for Dafeng coal and 92% for Baijigou coal. The experimental results indicated that reducing the feature set to the optimal features increased the classification accuracy and decreased the training and classification time; thus, the proposed method is suitable for complex conditions. © 2019 Elsevier B.V. All rights reserved.
Keywords: Coal; Gangue recognition; Image analysis; Relief algorithm; SVM
* Corresponding author at: Key Laboratory of Coal Processing and Efficient Utilization of Ministry of Education, China University of Mining and Technology, Xuzhou 221116, PR China. E-mail address: [email protected] (D. Dou).
https://doi.org/10.1016/j.powtec.2019.09.007
0032-5910/© 2019 Elsevier B.V. All rights reserved.
1. Introduction
It is necessary to remove gangue exceeding 50 mm in size prior to coal preparation to reduce both production load and equipment wear. Typically, the task is undertaken by human workers. However, the task can be hazardous for workers owing to the heavy workload and dusty working environment. To overcome this issue, automation has been introduced in coal preparation plants, and several studies have focused on the automatic analysis and classification of coal quality [1–3]. An intelligent dry sorting system, type GDRT, which uses a γ-ray gangue detection sensor and a high-pressure air gun as an actuator, was successfully applied in six coal mining companies in China [4,5]. However, machine vision methods have gained increasing attention as alternatives to radioactive methods owing to restrictions on the use of radioactive sources. Given advancements in modern two-dimensional and three-dimensional imaging technologies, Liu et al. developed an image analysis method for digital CT-scanned images to predict coal washability. A comparison with the float and sink test indicated a satisfactory match in terms of both particle size distribution and coal washability [6].
Machine vision methods based on visible light are widely used in the coal preparation field. Zhang et al. proposed an image analysis-based method for ash content prediction [7] and density fraction prediction [8] of coarse coal. Aldrich used machine vision systems to measure particle size distributions of coal on a conveyor belt [9]. Perez et al. presented a new method to improve rock classification via digital image analysis, mutual information-based feature selection, and a voting process to consider boundary information [10]. Singh et al. discussed a case study of ferruginous Indian manganese ore using red, green, and blue color space, histogram analysis, textural analysis, and edge detection techniques [11].
However, all the aforementioned studies were performed under ideal laboratory conditions in which the photographed and analyzed surface of the coal was clean, without slime or dust. In real conditions, coal is typically covered by slime and is occasionally wet. Therefore, field-like coal properties should be considered when performing image analysis of coal. Dou et al. detailed coal and gangue recognition under four operating conditions [5], although they ignored mixed conditions.
The method adopted in the study is as follows. Raw coal is
screened via a 50-mm sieve, and the oversize pieces are channeled for gangue picking. The pieces are processed on a queuing device. The belt is subsequently divided into several channels through which the coarse pieces pass into an imaging area. Motion detection technology is employed to track coarse coal pieces, and a fixed square area of each coal image is cropped for further image analysis. The image analysis identifies gangue, which is subsequently blown away by a high-pressure air gun as it drops from one belt to another. Coal and gangue recognition is the key challenge in the method.
The study focuses on the classification of coal and gangue via image analysis under multiple mixed conditions in which four coal properties, namely coarse coal with a dry clean surface, a wet clean surface, a dry surface covered by slime, and a wet surface covered by slime, are present together. The aim of the study is to propose a method that accurately classifies coal and gangue under these complex field conditions.
2. Theory
2.1. Image feature extraction
We extract the color and texture features of coal via image analysis. The color distribution is described via color moments. Color moments are measures that characterize the color distribution in an image in the same manner in which central moments describe a probability distribution. Furthermore, color distribution information is concentrated mainly in the low-order moments. The first moment (mean), second moment (variance), and third moment (skewness) of color adequately express the image color distribution. The red-green-blue (RGB) and hue-saturation-value (HSV) color spaces are typically employed in color feature extraction. A color in the RGB color model is described by indicating the individual amounts of red, green, and blue that are included. The color is expressed as an RGB triplet (r, g, b) in which each component can vary from zero to a defined maximum value. The RGB color space includes the R, G, and B vectors, and the HSV color space includes the H, S, and V vectors. Furthermore, the gray vector given in Eq. (1) also corresponds to a special color vector. Three moments are extracted from each vector as given in Eqs. (2)–(4) [7]. Zhang [12] indicated that the R, G, B, and gray values (grayscale) of each pixel in the segmented region are almost identical in coal images. Grayscale denotes the range of monochromic (gray) shades from pure white at the lightest end to pure black at the opposite end. Hence, the H, S, V, and gray vectors are all extracted as color features. The expressions are as follows:
$$\mathrm{Gray} = 0.2989R + 0.5870G + 0.1141B \tag{1}$$
$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} P_{ij} \tag{2}$$
$$v_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(P_{ij}-\mu_i\right)^{2}\right)^{1/2} \tag{3}$$
$$s_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(P_{ij}-\mu_i\right)^{3}\right)^{1/3} \tag{4}$$
where $P_{ij}$ denotes the $j$-th pixel value of the $i$-th color vector of the coal picture, $N$ denotes the number of pixels, $\mu_i$ denotes the first moment, $v_i$ denotes the second moment, and $s_i$ denotes the third moment.
In addition to the luminance difference, the texture difference between coal and gangue is also evident. The gangue surface typically exhibits fewer edges and corners, i.e., its surface is smoother than the surface of coal. Furthermore, the local variation in surface luminance of coal is more intense than that of gangue. All the texture changes are expressed and quantified via typical texture feature methods such as the gray-level co-occurrence matrix features (GLCMFs) [10] and Tamura textural features (TTFs) [13]. The GLCMFs describe the gray spatial correlation and luminance variation of the image. In contrast, TTFs are more intuitive and thus closer to human visual perception. Finally, all 19 features (12 color features and 7 textural features) are used in the study (see Table 1).
2.2. Relief-SVM recognition method
The relief-SVM consists of two parts, namely feature selection based on the relief algorithm and the SVM classification method.
2.2.1. Feature selection based on relief algorithm
The hypothesis-margin concept is employed to assess the classification ability of each feature dimension in the relief algorithm. The hypothesis-margin denotes the maximum distance by which the decision boundary can move while the classification of the samples remains unchanged. It is calculated as follows [14]:
$$\theta = \frac{1}{2}\left(\lVert x - M(x)\rVert - \lVert x - H(x)\rVert\right) \tag{5}$$
where H(x) and M(x) denote the nearest-neighbor sample points of x with the same class and a different class, respectively. The core of the relief algorithm is to appraise the quality of each feature from the difference in feature values between a given sample and its nearest samples, and to retain the features whose weights exceed a threshold. The weights are defined as follows [15]:
$$W_f^{i} = W_f^{i-1} + \frac{\mathrm{diff}_f\left(x, M(x)\right)}{m} - \frac{\mathrm{diff}_f\left(x, H(x)\right)}{m} \tag{6}$$
where f denotes the feature, i indexes the randomly selected instance x, m denotes the sample size (the number of randomly selected instances), and diff() denotes the distance between the samples in feature f. The weights of all 19 features listed in Table 1 are calculated using Eq. (6). The larger the weight of a feature, the greater its contribution to the classification. The features are removed one at a time in ascending order of weight, and the classification accuracy on the test dataset is used to determine the number of features that are retained.
2.2.2. SVM method
The SVM method maps the dataset into a high-dimensional, kernel-induced feature space and subsequently determines the separating hyperplane with the maximum distance to the closest points of the training set. It has been used successfully for classification in several industrial applications [16–18]. A detailed description of the SVM algorithm is given in Ref. [19]. For the radial basis function (RBF) kernel, the penalty parameter c and the width parameter g can take a wide range of values. The parameter set (c, g) shapes the feature space and the corresponding classification accuracy, and thus optimization of both parameters is important. Cross-validation and grid search [20] are employed to identify a good (c, g) such that the classifier accurately predicts unknown data.
2.2.3. Process of coal and gangue recognition using relief-SVM
The features of coal and gangue pictures photographed under the multiple conditions are extracted to form the original dataset.
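As an illustration of how the color part of this dataset could be computed, the following minimal sketch evaluates Eqs. (1)–(4) for the H, S, V, and gray vectors of one image region. It is not the authors' implementation: the function names (`gray_value`, `color_moments`, `color_features`) are hypothetical, pixel values are assumed to be scaled to [0, 1], the RGB-to-HSV conversion relies on Python's standard `colorsys` module, and the third moment is taken as a signed real cube root, which is an assumption about how negative sums are handled.

```python
import colorsys
import math


def gray_value(r, g, b):
    """Eq. (1): weighted sum of the R, G and B components."""
    return 0.2989 * r + 0.5870 * g + 0.1141 * b


def color_moments(channel):
    """Eqs. (2)-(4): first, second and third moments of one color vector."""
    n = len(channel)
    mean = sum(channel) / n
    second = math.sqrt(sum((p - mean) ** 2 for p in channel) / n)
    third_power = sum((p - mean) ** 3 for p in channel) / n
    # Assumption: take the real cube root and keep the sign of the sum.
    third = math.copysign(abs(third_power) ** (1.0 / 3.0), third_power)
    return mean, second, third


def color_features(pixels):
    """Twelve color features (H, S, V, gray; three moments each) from
    a list of (r, g, b) pixels with components in [0, 1]."""
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    channels = [
        [h for h, _, _ in hsv],
        [s for _, s, _ in hsv],
        [v for _, _, v in hsv],
        [gray_value(r, g, b) for r, g, b in pixels],
    ]
    return [moment for channel in channels for moment in color_moments(channel)]


# Toy region of three pixels (hypothetical values, not data from the study).
region = [(0.10, 0.10, 0.12), (0.30, 0.28, 0.25), (0.05, 0.05, 0.05)]
features = color_features(region)
print(len(features))
```

The returned list follows the order of the color features in Table 1 (Fe1 to Fe12); the seven textural features would have to be appended separately to obtain one 19-element record.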
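Table 1 Coal image features.
Group | Subgroup | Feature | Description
Color features | HSV space | Fe1 | H first moment
Color features | HSV space | Fe2 | H second moment
Color features | HSV space | Fe3 | H third moment
Color features | HSV space | Fe4 | S first moment
Color features | HSV space | Fe5 | S second moment
Color features | HSV space | Fe6 | S third moment
Color features | HSV space | Fe7 | V first moment
Color features | HSV space | Fe8 | V second moment
Color features | HSV space | Fe9 | V third moment
Color features | Grayscale space | Fe10 | Gray first moment
Color features | Grayscale space | Fe11 | Gray second moment
Color features | Grayscale space | Fe12 | Gray third moment
Textural features | GLCMFs | Fe13 | energy
Textural features | GLCMFs | Fe14 | homogeneity
Textural features | GLCMFs | Fe15 | correlation
Textural features | GLCMFs | Fe16 | entropy
Textural features | TTFs | Fe17 | roughness
Textural features | TTFs | Fe18 | contrast
Textural features | TTFs | Fe19 | orientation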
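The relief weighting of the features in Table 1 (Eq. (6)) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the helper names (`nearest`, `relief_weights`) are hypothetical, features are min-max scaled before distances are taken, the nearest same-class and different-class neighbors H(x) and M(x) are found with the Manhattan distance, and diff is taken as the absolute difference in one scaled feature. Each class must contain at least two samples so that a same-class neighbor exists.

```python
import random


def nearest(index, samples, labels, same_class):
    """Index of the nearest neighbor with the same label (hit) or another label (miss)."""
    best_index, best_distance = None, float("inf")
    for j, other in enumerate(samples):
        if j == index or (labels[j] == labels[index]) != same_class:
            continue
        distance = sum(abs(a - b) for a, b in zip(samples[index], other))
        if distance < best_distance:
            best_index, best_distance = j, distance
    return best_index


def relief_weights(samples, labels, m, seed=0):
    """Accumulate the update of Eq. (6) over m randomly selected instances."""
    rng = random.Random(seed)
    columns = list(zip(*samples))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    # Assumption: min-max scaling so that every feature contributes comparably.
    scaled = [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in samples]
    weights = [0.0] * len(columns)
    for _ in range(m):
        i = rng.randrange(len(scaled))
        hit = scaled[nearest(i, scaled, labels, same_class=True)]
        miss = scaled[nearest(i, scaled, labels, same_class=False)]
        for f in range(len(weights)):
            weights[f] += abs(scaled[i][f] - miss[f]) / m - abs(scaled[i][f] - hit[f]) / m
    return weights


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
toy_samples = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [1.0, 0.8], [0.8, 0.4]]
toy_labels = [0, 0, 0, 1, 1, 1]
print(relief_weights(toy_samples, toy_labels, m=20))
```

On the toy data the discriminative feature receives the larger weight, which mirrors the statement that a larger weight indicates a greater contribution to the classification.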
The dataset is subsequently divided randomly at a given ratio into a training dataset and a test dataset. The training dataset is used to construct the candidate classification models via relief-based feature selection and SVM (relief-SVM). The test dataset is employed to evaluate the classification results and determine the optimal model. The process of the coal and gangue recognition method used in the study is shown in Fig. 1.
Fig. 1. Process of the relief-SVM recognition method.
3. Experiments
The experiments on coal and gangue classification under multiple conditions were performed using the test rig depicted in Fig. 2, where 1 denotes a computer, 2 denotes an LED (four lights in total), 3 denotes a camera, and 4 denotes a sunshade. Fig. 3 shows examples of coal and gangue under the four coal properties, where (a) denotes the dry clean surface, (b) denotes the wet clean surface, (c) denotes the dry surface covered by slime, and (d) denotes the wet surface covered by slime. In each group, the left picture shows coal and the right one shows gangue. As shown in Fig. 3 (a), the coal surface exhibits specific texture changes and is darker than the gangue. The luminance and texture differences are evident.
Fig. 2. Schematic diagram of the test rig.
Fig. 3. Coal and gangue showing four characteristic coal surfaces.
Two sources of raw coal were used, one from the Dafeng coal mine and the other from the Baijigou coal mine. For each coal surface from each coal mine, 70 samples each of coal and gangue were prepared and photographed. Feature extraction was then performed, and two 70 × 19 datasets (one each for coal and gangue) were obtained. For each type of coal, the eight datasets from the four coal properties were mixed to simulate multiple conditions. A total of two such datasets (560 × 19) were obtained in the experiment. Each dataset was randomly partitioned into a training dataset and a test dataset at a ratio of 1:1, i.e., the training dataset and the test dataset each consisted of 280 records.
4. Results and discussion
4.1. Dafeng coal
The 19 features were sorted based on their relief weights. The features are listed as follows:
Fe5 (0.0039) < Fe18 (0.0054) < Fe8 (0.0070) < Fe11 (0.0077) < Fe17 (0.0085) < Fe15 (0.0101) < Fe14 (0.0128) < Fe6 (0.0144) < Fe13 (0.0149) < Fe16 (0.0183) < Fe3 (0.0222) < Fe12 (0.0237) < Fe9 (0.0237) < Fe7 (0.0331) < Fe10 (0.0340) < Fe4 (0.0354) < Fe2 (0.0414) < Fe19 (0.0618) < Fe1 (0.0798)
First, Fe5 was removed from the original feature set. The relief-SVM method was employed to determine the optimal features and construct the optimal classifier for coal and gangue recognition. The iteration process is presented in Table 2. As shown in Table 2, the accuracy on the test dataset reached 93.57% when five features were removed from the original feature set. The accuracy decreased if more features were removed. Therefore, 14 features, namely Fe15, Fe14, Fe6, Fe13, Fe16, Fe3, Fe12, Fe9, Fe7, Fe10, Fe4, Fe2, Fe19, and Fe1, were selected to construct the SVM classifier, and the optimal parameter set (c, g) of the SVM was (1, 5.6569). To test the stability of the proposed relief-SVM method, we randomly re-sampled the training dataset and test dataset four more times at a ratio of 1:1, using the optimal features determined above to train the SVM classifiers and evaluate their performance on the test datasets. The results obtained are listed in Table 3. As shown in Table 3, the classification performance of the five tests is relatively stable. The average accuracy exceeded 92.5% on both the training dataset and the test dataset.
4.2. Baijigou coal
Baijigou coal was employed and the experiments performed on Dafeng coal were repeated to validate the applicability of the relief-
Table 2 Relief-SVM recognition results for Dafeng coal.
No. of feature(s) removed | Training dataset accuracy (%) | Optimal c of SVM | Best g of SVM | Test dataset accuracy (%)
0 | 92.14 | 2.82843 | 2.8284 | 91.43
1 | 92.14 | 2.82843 | 2.8284 | 91.43
2 | 91.9 | 5.65685 | 2 | 92.14
3 | 91.67 | 1 | 2.8284 | 92.86
4 | 91.67 | 1 | 2.8284 | 92.14
5 | 90.95 | 1 | 5.6569 | 93.57
6 | 90.48 | 4 | 2 | 92.86
7 | 90.24 | 4 | 2 | 92.14
8 | 91.19 | 32 | 1 | 91.43
9 | 92.14 | 362.03867 | 0.3536 | 91.43
10 | 86.19 | 2.82843 | 11.3137 | 88.57
11 | 86.43 | 1 | 8 | 90
12 | 85.95 | 2 | 16 | 85
13 | 83.57 | 181.01934 | 0.7071 | 82.14
14 | 83.57 | 512 | 0.5 | 80.71
15 | 81.43 | 724.07734 | 0.7071 | 80
16 | 80.95 | 2.82843 | 2 | 79.29
17 | 80.48 | 2 | 64 | 75.71
18 | 79.29 | 0.00098 | 0.001 | 77.86
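The selection step summarized in Table 2 can be reproduced with a short sketch: features are ranked by relief weight in ascending order, and the number of removed features is the one that gives the highest test accuracy. The weights and accuracies below are copied from Section 4.1 and Table 2; the function name `select_features` and the tie-breaking rule (the smallest number of removed features wins) are assumptions made for illustration.

```python
# Relief weights reported for Dafeng coal, in the order given in Section 4.1.
dafeng_weights = {
    "Fe5": 0.0039, "Fe18": 0.0054, "Fe8": 0.0070, "Fe11": 0.0077,
    "Fe17": 0.0085, "Fe15": 0.0101, "Fe14": 0.0128, "Fe6": 0.0144,
    "Fe13": 0.0149, "Fe16": 0.0183, "Fe3": 0.0222, "Fe12": 0.0237,
    "Fe9": 0.0237, "Fe7": 0.0331, "Fe10": 0.0340, "Fe4": 0.0354,
    "Fe2": 0.0414, "Fe19": 0.0618, "Fe1": 0.0798,
}

# Test dataset accuracy (%) from Table 2; the list index is the number of removed features.
dafeng_test_accuracy = [
    91.43, 91.43, 92.14, 92.86, 92.14, 93.57, 92.86, 92.14, 91.43, 91.43,
    88.57, 90.0, 85.0, 82.14, 80.71, 80.0, 79.29, 75.71, 77.86,
]


def select_features(weights, test_accuracy):
    """Return the best number of removed features and the retained feature names."""
    # sorted() is stable, so equal weights keep their reported order.
    ranked = sorted(weights, key=weights.get)
    # max() returns the first maximum, i.e. the fewest removals on a tie (assumption).
    n_removed = max(range(len(test_accuracy)), key=test_accuracy.__getitem__)
    return n_removed, ranked[n_removed:]


n_removed, retained = select_features(dafeng_weights, dafeng_test_accuracy)
print(n_removed, len(retained))
print(retained)
```

With the reported values, the sketch removes five features and retains the 14 features from Fe15 to Fe1.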
Table 3 Re-sampling results of Dafeng coal.
No. of features | Accuracy on training dataset (%) | Accuracy on test dataset (%)
14 | 90.95 | 93.57
 | 94.29 | 92.14
 | 92.86 | 93.57
 | 93.10 | 91.43
 | 91.43 | 92.14
Average | 92.52 | 92.57
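The averages in Table 3 can be checked, and the spread across the five runs quantified, with the short sketch below. It uses only the accuracies listed in Table 3; the function name `summarize` and the use of the sample standard deviation as a stability measure are assumptions for illustration, since the study reports averages only.

```python
from statistics import mean, stdev

# Accuracies (%) of the five runs in Table 3 (Dafeng coal, 14 features).
train_accuracy = [90.95, 94.29, 92.86, 93.10, 91.43]
test_accuracy = [93.57, 92.14, 93.57, 91.43, 92.14]


def summarize(values):
    """Return the mean and the sample standard deviation, rounded to 2 decimals."""
    return round(mean(values), 2), round(stdev(values), 2)


for name, values in (("training", train_accuracy), ("test", test_accuracy)):
    average, spread = summarize(values)
    print(f"{name}: mean = {average:.2f} %, std = {spread:.2f} %")
```

The unrounded training mean is 92.526%, so the tabulated 92.52 is consistent with truncation rather than rounding; the test mean reproduces the tabulated 92.57.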
SVM method. Specifically, the 19 features were sorted based on their relief weights. The features are listed as follows:
Fe5 (0.0051) < Fe11 (0.0051) < Fe8 (0.0055) < Fe18 (0.0062) < Fe9 (0.0112) < Fe12 (0.0112) < Fe14 (0.0140) < Fe6 (0.0190) < Fe15 (0.0197) < Fe16 (0.0204) < Fe13 (0.0276) < Fe2 (0.0299) < Fe17 (0.0326) < Fe4 (0.0329) < Fe19 (0.0350) < Fe3 (0.0360) < Fe7 (0.0435) < Fe10 (0.0453) < Fe1 (0.1001)
First, Fe5 was removed from the original feature set. The relief-SVM method was employed to determine the optimal features and construct the optimal classifier for coal and gangue recognition. The iteration process is detailed in Table 4. As shown in Table 4, the accuracy on the test dataset reached 93.57% when nine features were removed from the original feature set. The accuracy decreased if more features were removed. Therefore, 10 features, namely Fe16, Fe13, Fe2, Fe17, Fe4, Fe19, Fe3, Fe7, Fe10, and Fe1, were selected to construct the SVM classifier, and the optimal parameter set (c, g) of the SVM was (45.25483, 0.3536). To test the stability of the proposed relief-SVM method, we randomly re-sampled the training dataset and test dataset four more times at a ratio of 1:1, using the optimal features determined above to train SVM classifiers and evaluate the performance on
the test datasets. The results are given in Table 5. As shown in Table 5, the classification performance of the five tests was relatively stable. The average accuracy was 90.9% on the training dataset and 92% on the test dataset.
Table 4 Relief-SVM recognition results for Baijigou coal.
No. of feature(s) removed | Training dataset accuracy (%) | Optimal c of SVM | Best g of SVM | Test dataset accuracy (%)
0 | 91.43 | 64 | 0.5 | 90.71
1 | 91.9 | 64 | 0.5 | 90.71
2 | 92.14 | 181.01934 | 0.25 | 90
3 | 91.67 | 181.01934 | 0.25 | 90
4 | 92.38 | 256 | 0.25 | 90
5 | 92.38 | 256 | 0.25 | 90
6 | 90.95 | 128 | 0.1768 | 91.43
7 | 90 | 45.25483 | 0.25 | 92.14
8 | 90.95 | 45.25483 | 0.3536 | 92.14
9 | 90.95 | 45.25483 | 0.3536 | 93.57
10 | 90.95 | 22.62742 | 0.7071 | 91.43
11 | 86.9 | 1.41421 | 1.4142 | 90.71
12 | 85.48 | 0.5 | 2 | 85.71
13 | 84.52 | 16 | 0.25 | 86.43
14 | 85.24 | 1.41421 | 1.4142 | 85
15 | 84.76 | 0.70711 | 16 | 85
16 | 85 | 45.25483 | 0.011 | 85
17 | 84.29 | 0.04419 | 1 | 83.57
18 | 83.1 | 724.07734 | 0.25 | 80.71
Table 5 Re-sampling results for Baijigou coal.
No. of features | Accuracy on training dataset (%) | Accuracy on test dataset (%)
11 | 90.95 | 93.57
 | 91.19 | 90.00
 | 91.43 | 91.43
 | 90.95 | 92.14
 | 90.00 | 92.86
Average | 90.90 | 92.00
5. Conclusions
Coal and gangue recognition is a key problem in automating gangue separation during coal preparation. Four coal properties, namely raw coal with a dry clean surface, a wet clean surface, a dry surface covered by slime, and a wet surface covered by slime, are frequently encountered simultaneously in real situations. The recognition results for both Dafeng coal and Baijigou coal show that the relief-SVM method presented in the study is effective in such complex situations. Reducing the feature set to the optimal features increased the classification accuracy and decreased the training and recognition time of the classifier. The feature Fe1 affected the classification process the most, while Fe5 affected it the least. The average accuracy was 92.57% for Dafeng coal and 92% for Baijigou coal. This shows that the proposed method is effective under multiple surface conditions.
Acknowledgments
This study was supported by the Fundamental Research Funds for the Central Universities (No. 2019XKQYMS43) and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, China.
References
[1] D. Dou, J. Yang, J. Liu, Z. Zhang, H. Zhang, Soft-sensor modeling for separation performance of dense-medium cyclone by field data, Int. J. Coal Prep. Util. 35 (2015) 155–164.
[2] D. Dou, J. Yang, J. Liu, H. Zhang, A novel distribution rate predicting method of dense medium cyclone in the Taixi coal preparation plant, Int. J. Miner. Process. 142 (2015) 51–55.
[3] D. Dou, D. Zhou, J. Yang, A new partition curve model of dense-medium cyclone based on process parameters, Int. J. Coal Prep. Util. (2018), https://doi.org/10.1080/19392699.2018.1515075.
[4] Li Kang, Jinhui Huang, Chang Liu, Applications of GDRT intelligent dry sorting system using γ ray in six coal mining companies, Coal Process. Compr. Util. (3) (2017) 22–24 (in Chinese).
[5] D. Dou, D. Zhou, J. Yang, Y. Zhang, Coal and gangue recognition under four operating conditions by using image analysis and relief-SVM, Int. J. Coal Prep. Util. (2018), https://doi.org/10.1080/19392699.2018.1540416.
[6] H. Liu, S. Rodrigues, F. Shi, et al., Coal washability analysis using X-ray tomographic images for different lithotypes, Fuel 209 (2017) 162–171.
[7] Z. Zhang, J. Yang, Y. Wang, D. Dou, W. Xia, Ash content prediction of coarse coal by image analysis and GA-SVM, Powder Technol. 268 (2014) 429–435.
[8] Z. Zhang, J. Yang, The density fraction estimation of coarse coal by use of the kernel method and machine vision, Energy Sources A 37 (2015) 181–191.
[9] C. Aldrich, G. Jemwa, J. van Dyk, et al., Online analysis of coal on a conveyor belt by use of machine vision and kernel methods, Int. J. Coal Prep. Util. 30 (2010) 331–348.
[10] C.A. Perez, Pablo A. Estévez, P.A. Vera, L.E. Castillo, C.M. Aravena, D.A. Schulz, et al., Ore grade estimation by feature selection and voting using boundary detection in digital image analysis, Int. J. Miner. Process. 101 (2011) 28–36.
[11] V. Singh, S.M. Rao, Application of image processing in mineral industry: a case study of ferruginous manganese ores, Miner. Process. Extr. Metall. 115 (2006) 155–160.
[12] Z. Zhang, J. Yang, Narrow density fraction prediction of coarse coal by image analysis and MIV-SVM, Int. J. Oil Gas Coal Technol. 11 (2016) 279–289.
[13] X. Zhang, P. Shen, J. Gao, D. Qi, L. Zhang, A. Xue, L. Xi, X. Chen, A license plate recognition system based on Tamura texture in complex conditions, in: Proceedings of the 2010 IEEE International Conference on Information and Automation, Harbin, China, 2010, pp. 1947–1952.
[14] J. Jia, N. Yang, C. Zhang, A. Yue, J. Yang, D. Zhu, Object-oriented feature selection of high spatial resolution images using an improved relief algorithm, Math. Comput. Model. 58 (2013) 619–626.
[15] P. Smyth, R.M. Goodman, Rule induction using information theory, in: G. Piatetsky-Shapiro, W. Frawley (Eds.), Knowledge Discovery in Databases, MIT Press, 1990.
[16] D. Dou, S. Zhou, Comparison of four direct classification methods for intelligent fault diagnosis of rotating machinery, Appl. Soft Comput. 46 (2016) 459–468.
[17] D. Dou, J. Jiang, Y. Wang, Y. Zhang, A rule-based classifier ensemble for fault diagnosis of rotating machinery, J. Mech. Sci. Technol. 32 (6) (2018) 2509–2515.
[18] D. Dou, J. Yang, J. Liu, Y. Zhao, A rule-based intelligent method for fault diagnosis of rotating machinery, Knowl.-Based Syst. 36 (2012) 1–8.
[19] D.J. Bordoloi, R. Tiwari, Support vector machine based optimization of multifault classification of gears with evolutionary algorithms from time-frequency vibration data, Measurement 55 (2014) 1–14.
[20] C. Chang, C. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (3) (2011) 1–27.