An experimental modeling of cyclone separator efficiency with PCA-PSO-SVR algorithm


Powder Technology 347 (2019) 114–124

Contents lists available at ScienceDirect

Powder Technology journal homepage: www.elsevier.com/locate/powtec

Wei Zhang a,⁎, Linlin Zhang a, Jingxuan Yang a, Xiaogang Hao a, Guoqing Guan b,⁎⁎, Zhihua Gao c

a Department of Chemical Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
b Energy Conversion Engineering Laboratory, Institute of Regional Innovation (IRI), Hirosaki University, 2-1-3, Matsubara, Aomori 030-0813, Japan
c Key Laboratory Coal Science and Technology, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China

Article info

Article history: Received 15 October 2018; Received in revised form 21 January 2019; Accepted 25 January 2019; Available online 1 March 2019

Keywords: Cyclone separator; Grade efficiency; Support vector regression algorithm; Particle swarm optimization; Principal component analysis

Abstract

Accurate prediction of the complicated nonlinear relationship among the grade efficiency, geometrical dimensions, and operating parameters based on limited experimental data is the most effective way to design a high-efficiency cyclone separator. Herein, a hybrid PCA-PSO-SVR model is proposed to predict the grade efficiency of cyclone separators with the operating parameters based on 217 sets of experimental data provided in the literature. The experimental data are first preprocessed by random sampling, normalization, and principal component analysis (PCA); subsequently, the particle swarm optimization (PSO) algorithm is used to optimize the parameters of the support vector regression (SVR), including the penalty factor C, kernel function parameter g, and insensitive loss ε. Finally, the SVR model with the optimized parameters is trained with 80% of the preprocessed data, and the generalization ability of the model is tested with the remaining 20%. The mean squared error of the test set is 6.948 × 10⁻⁴ with a correlation coefficient of 0.982. The comparison results show that the PCA-PSO-SVR model has higher accuracy, better generalization ability, and stronger robustness than the existing models for predicting the cyclone separator efficiency when only a small amount of experimental data is available. © 2019 Elsevier B.V. All rights reserved.

1. Introduction

Cyclone separator efficiency is considered one of the major criteria for designing the cyclone geometry and evaluating its performance. As shown in Table 1, four approaches have been developed to estimate the separation efficiency of cyclone separators. In approach (1), the theoretical and semi-empirical models [1–9] are derived from physical descriptions of the gas flow pattern and energy dissipation mechanisms in the cyclone. Although this is the conventional way, the assumptions and simplifications used in these models easily lead to significant errors between the experimental data and predicted results. Considering that the efficiency model of a cyclone separator should ideally be established through experimental data [10], approach (2) has been applied. For instance, using this approach, Zhu et al. [11] studied the effects of the flow rate, cylinder height, and exit tube length on the collection efficiency with a set of experimental data;

⁎ Corresponding author at: College of Chemistry and Chemical Engineering, Taiyuan University of Technology, Taiyuan 030024, China. ⁎⁎ Corresponding author at: Energy Conversion Engineering Laboratory, Institute of Regional Innovation (IRI), Hirosaki University, 2-1-3, Matsubara, Aomori 030-0813, Japan. E-mail addresses: [email protected] (W. Zhang), [email protected] (G. Guan).

https://doi.org/10.1016/j.powtec.2019.01.070 0032-5910/© 2019 Elsevier B.V. All rights reserved.

Chen et al. [12] conducted an experimental investigation to predict the influence of the operating temperature on the overall cyclone efficiency; Lim et al. [13] experimentally examined the effects of cylinder- and cone-shaped vortex finders on the particle collection efficiencies of cyclones at different flow rates; Luo et al. [14] derived the efficiency formula for a particular reverse-flow cyclone separator with a plane top and volute inlet (i.e., the PV-cyclone) by applying similarity theory and regression analysis to a large set of experimental data. However, a series of assumptions has to be made to facilitate the similarity analysis, and these do not fit the actual situation; in addition, the accuracy of the regression model depends on a large set of data. Moreover, a series of experiments [14–16] on the PV-cyclone has shown that the factors affecting the separation efficiency of a cyclone separator are too complex to be represented by the Stokes number alone. To understand the effect of the geometrical ratios on the flow field pattern and separation performance, approach (3), i.e., computational fluid dynamics (CFD), is commonly applied [17–22]. However, obtaining modeling data through CFD simulation is time-consuming, and the obtained data can deviate considerably from the real situation. Furthermore, the assumptions of fully elastic particle-wall collisions and ideal dust collection at the bottom usually result in over-prediction of the separation efficiency for smaller particles.

Table 1. Summary of the different approaches for separation efficiency estimation.

Approach | Comments | References
(1) Theoretical and semi-empirical models | These models are derived from physical descriptions of gas flow pattern and energy dissipation mechanisms in the cyclone. However, assumptions and simplifications used in these models lead to significant errors between the experimental data and predicted results. | Zhao [1,2]; Göran et al. [3]; Qiu [4]; Sun et al. [5]; Yang et al. [6]; Barth [7]; Dietz [8]; Leith-Licht [9]
(2) Experimental and statistical models | These models are developed through a statistical regression analysis based on an experimental data set for different cyclone configurations; however, determining the optimal correlation function for fitting experimental data is difficult. | Rafiee et al. [10]; Zhu et al. [11]; Chen et al. [12]; Lim et al. [13]; Luo et al. [14]; Jin et al. [15,16]
(3) Computational fluid dynamics (CFD) | CFD can provide the performance parameters and the detailed information of the flow field inside the cyclone; the main drawback of CFD is that solving the Navier–Stokes equations is computationally expensive. | Sun [17]; Francesco et al. [18]; Huang et al. [19]; Misiulia [20]; Mazyana [21]; Zhou et al. [22]
(4) Artificial intelligence (AI) models and machine learning algorithms | AI models (such as artificial neural networks and genetic algorithms) and machine learning algorithms (such as support vector machines, SVM) have become powerful tools of scientific research and technology that do not require an understanding of the nature of the phenomenon. | Elsayed [23,25]; Zhao [24]; Yetilmezsoy [26]; Khalkhali [27]

With the development of modern computer technology, processing large data sets has become much easier. As such, approach (4), i.e., Artificial Neural Network (ANN) and Support Vector Regression (SVR) [23,24] algorithms, has become a hot topic because it can handle complex nonlinear mathematical models based on sample data without knowledge of the underlying mechanism. These algorithms have been successfully applied to model the efficiency of the cyclone separator based on CFD samples or experimental data [25–27]. In particular, Elsayed et al. [25] applied two radial basis function neural networks (RBFNNs) to model the pressure drop and cut-off diameter of cyclone separators, studying the effect of seven geometrical parameters on the cyclone separator performance without considering the operating parameters. However, the effects of particle size, particle density, gas velocity, surface roughness, kinematic viscosity, and so on, as well as the effects of the dimensions and geometry of the cyclone, still need to be taken into account for accurate modeling. Based on the above review, an accurate mathematical model is required to effectively predict the complex and nonlinear relationship between the separation efficiency and both the geometrical and operating parameters. Accurate prediction of the complicated nonlinear relationship among the grade efficiency, geometrical dimensions, and operating parameters based on limited experimental data is considered the most effective way to design a high-efficiency cyclone separator.

In this study, a hybrid model is proposed to predict the grade efficiency of cyclone separators with the operating parameters based on 217 sets of experimental data provided in the literature. The study has the following three objectives.
(1) The separation efficiency of a cyclone separator is modeled with experimental data, considering both the geometrical and operating parameters.
(2) Principal component analysis (PCA) is used to reduce the eight factors that affect the grade efficiency to five independent factors for the modeling, with minimal information loss.
(3) SVR is used to create accurate mathematical models with limited experimental data; meanwhile, the particle swarm optimization (PSO) algorithm is applied to improve the modeling accuracy.

The paper is organized as follows. After the introduction, the procedure for predicting the grade efficiency of cyclone separators with the PCA-PSO-SVR model is described in Section 2. Then, in Section 3, to evaluate the universality and accuracy of PCA-PSO-SVR, it is compared with the classical theoretical models and several ANN models, and the results are discussed. Finally, the conclusions of this study are given.

2. SVR modeling hybrid PCA and PSO

Fig. 1 shows the flowchart of grade efficiency modeling with the proposed PCA-PSO-SVR.

Fig. 1. Flowchart of the PCA-PSO-SVR proposed method.

2.1. Experimental data

The experimental data used in this study come from research on a particular reverse-flow cyclone with a plane top and volute inlet, which is an efficient gas-solid separator jointly developed by the University of Petroleum (China) and China Petrochemical Corporation in 1990 [28]. It is widely used in fluidized catalytic cracking (FCC) units, coal combustion, gasification, and petrochemical reaction processes for gas-solid separation under high temperature, high pressure, and high dust concentration. The cyclone separator shown in Fig. 2 has an inlet height a, an inlet width b, a vortex finder diameter dr, a vortex finder height S, a cyclone diameter D, a particle exit diameter B, a separation space height Hs, a cylindrical part height H1, and a conical part height H2. According to the cyclone flow field studies and performance tests for the FCC catalyst-gas system, the optimum ratios are B/D = 0.4–0.5, Hs/D = 2.8–3.0, S/a = 0.8–1.0, and a/b = 2.2–2.5 [5].

Fig. 2. The structure of PV cyclone separator.

A set of 217 experimental data points from the literature [14–16] is extracted to investigate the effect of different modeling methods on the modeling accuracy. The factors influencing the efficiency of the cyclone separator include the geometrical and operating parameters. Three geometrical parameters strongly affect the separation efficiency, namely, the cyclone diameter D, the ratio of cyclone cross-sectional area to inlet cross-sectional area Ka = πD²/(4ab), and the ratio of the vortex finder diameter to the cyclone diameter d̃r = dr/D. Moreover, six operating parameters affect the collection efficiency: the gas velocity at the cyclone inlet vi, the concentration of inlet particles Ci, the particle diameter δ, the particle density ρp, the median particle size dm, and the mean square error of the dust particle size distribution σ. However, the effect of the mean square error of the dust particle size distribution on the separation performance can be neglected with regard to both physical and mathematical aspects. In total, there are therefore eight input variables. The grade efficiency of particles ηi is selected as the output variable. Table 2 summarizes the input and output variables of the SVR and gives some of the experimental data.

Table 2. Input and output variables of the SVR model with corresponding experimental data.

Input variables (x1–x8) | Output variable (y1)
x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | y1
D | Ka | d̃r | vi | Ci | δ | ρp | dm | ηi
800 | 4.4 | 0.25 | 15.16 | 10 | 6 | 2876 | 13.570 | 93.264
800 | 4.4 | 0.312 | 15.16 | 10 | 6 | 2876 | 13.570 | 91.334
800 | 4.4 | 0.445 | 15.16 | 10 | 6 | 2876 | 13.570 | 85.826
400 | 4.4 | 0.44 | 15.99 | 10 | 9 | 2876 | 11.986 | 95.215
400 | 4.4 | 0.44 | 15.96 | 30 | 9 | 2876 | 11.986 | 96.294
400 | 4.4 | 0.44 | 16 | 50 | 9 | 2876 | 11.986 | 96.746
800 | 4.4 | 0.44 | 11.12 | 10 | 5 | 2876 | 14.27 | 82.939
800 | 4.4 | 0.44 | 11.12 | 10 | 6 | 2876 | 14.27 | 86.591
800 | 4.4 | 0.44 | 11.12 | 10 | 7 | 2876 | 14.27 | 89.320
800 | 4.4 | 0.44 | 11.12 | 10 | 8 | 2876 | 14.27 | 91.121
800 | 4.4 | 0.44 | 15.16 | 10 | 8 | 2876 | 14.27 | 93.264
800 | 4.4 | 0.44 | 19 | 10 | 8 | 2876 | 14.27 | 94.241
• • •
800 | 4.4 | 0.44 | 14.97 | 10 | 5 | 2876 | 11.986 | 82.451
800 | 4.4 | 0.44 | 16.10 | 10 | 5 | 2876 | 13.540 | 85.140
800 | 4.4 | 0.44 | 15.16 | 10 | 5 | 2876 | 14.270 | 84.727
800 | 7.2 | 0.44 | 15.16 | 10 | 8 | 2876 | 13.57 | 97.153
800 | 4.26 | 0.44 | 15.16 | 10 | 8 | 2876 | 13.57 | 94.435
400 | 5.5 | 0.44 | 20 | 10 | 8 | 2876 | 9.976 | 97.487
400 | 4.4 | 0.44 | 10.73 | 10 | 5 | 3050 | 11.835 | 87.415
800 | 4.4 | 0.44 | 11.01 | 10 | 5 | 3050 | 11.835 | 82.496
800 | 4.4 | 0.44 | 14.97 | 10 | 9 | 2876 | 11.986 | 89.138
800 | 4.4 | 0.44 | 14.59 | 10 | 9 | 3050 | 11.835 | 90.249

The significance of [bold] in the table is to emphasize the change of variables.

2.2. Support vector regression

Support Vector Regression (SVR) is a powerful learning model based on statistical learning theory that minimizes the structural risk and thereby achieves better generalization capability. The core concept of SVR is first to map the original data nonlinearly into a high-dimensional feature space, and then to find an optimal linear regression function in this feature space. In short, the problem is linearized by raising its dimension. Thus, the problem of finding the optimal high-dimensional linear plane is transformed into a convex quadratic programming problem. SVR problems with kernel functions are represented in Fig. 3. Solving the nonlinear regression problem is actually the process of solving for the weight vector ω and the threshold value u. The values of ω and u are estimated by minimizing Eq. (1) based on the Structural Risk Minimization Principle,

$$R_{\mathrm{reg}}[f] = R_{\mathrm{emp}}[f] + \lambda \lVert \omega \rVert^2 = \sum_{i=1}^{m} C(e_i) + \lambda \lVert \omega \rVert^2 \tag{1}$$

where Rreg is the structural risk; Remp is the empirical risk; f is the nonlinear regression function; ei is the error between the predicted value and the true value, ei = f(xi) − yi; λ is the regularization constant; m is the number of samples; and C(⋅) is the ε-insensitive loss function defined as [29]

$$C(e_i) = \max\left(0, \lvert e_i \rvert - \varepsilon\right) \tag{2}$$

where ε is the insensitive loss, which denotes the fault tolerance level of the model. The larger the value, the greater the tolerance of the model to the error, and the higher the probability of under-learning; conversely, the smaller the value, the higher the probability of over-fitting. Therefore, it is crucial to select a proper insensitive loss ε for support vector regression.
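The role of ε in Eq. (2) can be illustrated with a minimal sketch. The residual values and the regularization constant below are hypothetical and chosen only for illustration; the function names are not taken from the paper.

```python
def eps_insensitive_loss(error, eps):
    """Eq. (2): zero loss inside the tube |e| <= eps, linear loss outside."""
    return max(0.0, abs(error) - eps)


def regularized_risk(errors, weights, eps, lam):
    """Eq. (1): sum of eps-insensitive losses plus lam * ||w||^2."""
    empirical = sum(eps_insensitive_loss(e, eps) for e in errors)
    complexity = lam * sum(w * w for w in weights)
    return empirical + complexity


# Hypothetical residuals e_i = f(x_i) - y_i on normalized efficiencies.
errors = [0.01, -0.02, 0.05, -0.10]

# A wider tube ignores more of the residuals (greater tolerance).
for eps in (0.0, 0.026, 0.10):
    losses = [eps_insensitive_loss(e, eps) for e in errors]
    print(eps, [round(v, 3) for v in losses])
```

Residuals that fall inside the tube contribute nothing to the empirical risk, which is why a large ε tends toward under-learning and a small ε toward over-fitting.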

Fig. 3. Schematic of SVR model.

Solving the minimization problem of Eq. (1) is then transformed into solving the quadratic programming problem of Eq. (3) after introducing the slack variables ξi and ξi*,

$$\min J = \frac{1}{2}\lVert \omega \rVert^2 + C\sum_{i=1}^{m}\left(\xi_i + \xi_i^*\right) \tag{3}$$

$$\text{s.t.}\quad \begin{cases} y_i - (\omega \cdot \phi(x_i)) - u \le \varepsilon + \xi_i \\ (\omega \cdot \phi(x_i)) + u - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0 \end{cases}$$

where ω is the weight vector; ½‖ω‖² represents the model complexity; and C is the penalty factor, which keeps a balance between the complexity and the empirical risk [30]. Increasing the value of C means that more attention is paid to the empirical risk, and the possibility of over-fitting becomes greater. Conversely, under-fitting easily occurs when C is too small. Therefore, choosing a suitable value of C is crucial for establishing a favorable SVR model. The Lagrangian multiplier method and the KKT conditions can be used to transform the quadratic programming problem of Eq. (3) into the dual optimization problem of Eq. (4),

$$\max J(\alpha, \alpha^*) = \max\left\{ -\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)K(x_i, x_j) - \varepsilon\sum_{i=1}^{m}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{m} y_i\left(\alpha_i - \alpha_i^*\right) \right\} \tag{4}$$

$$\text{s.t.}\quad \begin{cases} \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right) = 0 \\ 0 < \alpha_i < C \\ 0 < \alpha_i^* < C \end{cases}$$

where αi, αi*, αj and αj* are Lagrange multipliers, and K(xi, xj) is the kernel function with which the input space of the data can be transformed into a nonlinear, high-dimensional space. From the αi and αi* calculated from Eq. (4), the support vectors xi (for which αi and αi* are not both 0) and the standard support vectors xi (for which one of αi and αi* is C) can be determined, and then the threshold value u can be calculated according to Eq. (5),

$$u = \frac{1}{N_{NSV}}\left\{ \sum_{0<\alpha_i<C}\left[ y_i - \sum_{j=1}^{l}\left(\alpha_j - \alpha_j^*\right)K(x_i, x_j) - \varepsilon \right] + \sum_{0<\alpha_i^*<C}\left[ y_i - \sum_{j=1}^{l}\left(\alpha_j - \alpha_j^*\right)K(x_i, x_j) + \varepsilon \right] \right\} \tag{5}$$

where NNSV is the number of standard support vectors and l is the number of support vectors. The resulting approximation function can then be written as Eq. (6):

$$f(x) = \sum_{i=1}^{l}\left(\alpha_i - \alpha_i^*\right)K(x_i, x) + u \tag{6}$$

Although the feature space and nonlinear mapping are used in the derivation, their explicit expressions are not required in the actual calculation [31]. The nonlinear regression function is computed from the kernel function and the coefficients αj, αj*, and u that correspond to the support vectors in the sample data. The selection of the kernel function is crucial to support vector regression because it directly affects the nonlinear mapping of the samples. Different kernel functions, including the polynomial function, the Gaussian radial basis function (RBF), and the sigmoid (s-shaped) kernel function, can be selected in the SVR algorithm [32]. For better generalization and nonlinear regression ability, the RBF kernel function is selected for the SVR modeling. Its expression is shown in Eq. (7),

$$K(x, x_i) = \exp\left(-g\lVert x - x_i \rVert^2\right) \tag{7}$$

where g is the kernel function parameter. Changing the value of g indirectly changes the nonlinear mapping function, which directly determines the complexity and performance of the model. In this study, the purpose of SVR model training is to find an appropriate correspondence that satisfies Eq. (8) once the input and output variables are settled.

$$\eta_i = f\left(D, K_a, \tilde{d}_r, v_i, C_i, \delta, \rho_p, d_m\right) \tag{8}$$

The 217 sets of experimental data from the literature [14–16], as shown in Table 2, are used to train and test the SVR model. The range of each input parameter is shown in Table 3. Using random sampling, 80% of the data are randomly selected as the training set of the SVR, and the remaining 20% are used as the test set to verify the generalization ability of the model. Before training, the input data need to be normalized so that each variable is converted into a number between 0 and 1. The output results after training should be reversely normalized.

Table 3. Range of input parameters.

 | D (mm) | Ka | d̃r | vi (m/s) | Ci (g/m³) | δ (μm) | ρp (kg/m³) | dm (μm)
Min | 300 | 4 | 0.25 | 5 | 5 | 1 | 2000 | 8
Max | 1200 | 8 | 0.5 | 50 | 1000 | 15 | 3500 | 15

2.3. Dimensionality reduction based on PCA

When modeling multivariate data, a large number of variables can increase the model complexity and computation time. To solve this problem, principal component analysis (PCA) is adopted to reduce the dimension of the dataset. PCA is one of the most commonly used dimensionality reduction algorithms, and it overcomes the computational complexity that results from too many dependent variables. The idea of PCA is to map the n-dimensional features to k dimensions (k < n) according to the maximum variance theory. The k new features are called the principal components, and each is a linear combination of the original features. The new k features are uncorrelated and reflect most of the information of the sample space. Choosing the number of retained dimensions is a critical step in PCA. If too few dimensions are kept, some information is lost by the dimension-reduced matrix. Conversely, if the number of dimensions remains high, the complexity of the regression model also remains too high. In both cases, the generalization ability of the regression model is low. In this study, the original SVR model has an eight-dimensional input space, which PCA can reduce to as few as three dimensions. The performance parameters of the model on the test set after the dimension reduction and the corresponding SVR parameters are listed in Table 4. These performance parameters include the information retention ratio and the mean square error and correlation coefficient of the test set. It can be observed directly from Fig. 4 that the information retention ratio decreases as the number of dimensions decreases, and that the greatest information loss occurs when the dimension is reduced from five to four. After the dimensionality-reduced models are tested one by one using the test set, it is found that the correlation coefficient of the model is the largest and the root mean square error is the smallest when the dimensions are reduced to five. Thus, reducing the dimension from eight to five is the best choice. In this study, the dimension reduction matrix W is obtained by the following steps.

Step 1: Normalizing the training set.
Step 2: Centralizing the training set.
Step 3: Calculating the covariance matrix of the training set.
Step 4: Calculating the eigenvalues of the covariance matrix and the corresponding eigenvectors.
Step 5: Sorting the eigenvalues from large to small and finding the eigenvectors corresponding to the first five eigenvalues.

The five eigenvectors form the dimension reduction matrix W shown in Eq. (9). According to Eq. (10), the input matrix A consisting of eight input variables is reduced to a five-dimensional feature matrix N by the 8 × 5 dimension reduction matrix W. The newly generated matrix N is composed of five independent variables N1, N2, N3, N4 and N5.
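The five steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not say which eigen-solver was used, so power iteration with deflation is substituted here, and the sample values and all function names are hypothetical.

```python
import math
import random


def min_max_normalize(rows):
    """Step 1: scale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]


def covariance(rows):
    """Steps 2-3: center each column, then form the covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1)
            for b in range(d)] for a in range(d)]
    return centered, cov


def top_eigenpairs(cov, k, iters=500, seed=0):
    """Steps 4-5: k largest eigenpairs via power iteration with deflation
    (an assumed solver; any symmetric eigen-solver would serve)."""
    rng = random.Random(seed)
    d = len(cov)
    m = [row[:] for row in cov]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(m[a][b] * v[b] for b in range(d))
                  for a in range(d))
        pairs.append((lam, v))
        # Deflate: subtract the component that was just found.
        for a in range(d):
            for b in range(d):
                m[a][b] -= lam * v[a] * v[b]
    return pairs


def retention_ratio(cov, pairs):
    """Share of total variance kept by the selected components."""
    return sum(lam for lam, _ in pairs) / sum(cov[i][i] for i in range(len(cov)))


def project(centered, pairs):
    """N = A W, with the selected eigenvectors as the columns of W."""
    return [[sum(r[j] * vec[j] for j in range(len(r))) for _, vec in pairs]
            for r in centered]


# Hypothetical 3-variable samples (not data from the paper).
samples = [[400, 4.4, 11.1], [800, 4.4, 15.2], [800, 7.2, 15.2],
           [400, 5.5, 20.0], [800, 4.3, 19.0]]
centered, cov = covariance(min_max_normalize(samples))
pairs = top_eigenpairs(cov, k=2)
reduced = project(centered, pairs)  # 5 samples x 2 components
print(round(retention_ratio(cov, pairs), 4))
```

The retention ratio computed here corresponds in spirit to the information retention ratio used to choose the number of dimensions; how the paper computes that ratio exactly is not stated, so the variance-share definition above is an assumption.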

$$W = \begin{bmatrix}
-0.0068 & 0.0547 & 0.1711 & 0.3651 & 0.9095 \\
0.6800 & -0.2824 & -0.4245 & 0.3984 & -0.0366 \\
0.1585 & 0.8006 & -0.3241 & 0.1317 & -0.0345 \\
0.5024 & -0.2846 & 0.2136 & -0.2094 & 0.0256 \\
-0.3822 & -0.3918 & -0.0870 & 0.2790 & -0.0314 \\
-0.2003 & -0.0450 & -0.1815 & 0.5427 & -0.2155 \\
-0.2123 & -0.0941 & -0.1701 & 0.2292 & -0.1155 \\
0.1696 & 0.1758 & 0.7553 & 0.4711 & -0.3300
\end{bmatrix} \tag{9}$$

$$N = AW \tag{10}$$

where N = [N1 N2 N3 N4 N5]ᵀ and A = [δ D ρp dm vi Ci Ka d̃r].

Table 4. Performance parameters of the SVR model after dimension reduction.

Dimension | Information retention ratio | Mean square error of test set | Correlation coefficient of test set | Penalty factor C | RBF kernel parameter g | Insensitive loss ε
8 | 1 | 1.840e-3 | 0.951 | 650 | 0.673 | 0.03
7 | 0.9997 | 1.095e-3 | 0.970 | 217.62 | 0.62 | 0.01
6 | 0.9988 | 9.383e-4 | 0.975 | 368 | 0.9 | 0.021
5 | 0.9985 | 6.984e-4 | 0.982 | 660 | 0.673 | 0.026
4 | 0.9605 | 3.100e-3 | 0.916 | 800 | 1.5 | 0.01
3 | 0.9543 | 3.736e-3 | 0.900 | 25.63 | 240 | 0.001

Fig. 4. Model performance corresponding to the different dimensions.

2.4. Parameter optimization of SVR by PSO

The generalization capacity of SVR depends greatly on its hyperparameters, i.e., the penalty factor C, kernel function parameter g, and insensitive loss ε. However, it is difficult to determine proper values of these parameters from prior knowledge, and tuning them manually is time-consuming. Furthermore, the effect of these three parameters on the model performance is still uncertain. Thus, particle swarm optimization (PSO) is adopted to optimize the parameters. The PSO algorithm was first proposed by Kennedy and Eberhart [33], inspired by the foraging behavior of birds. In the optimization process, each particle has its own velocity, position, and fitness value determined by the target function. In each iteration, the particle updates its velocity and position based on the best historical position that it has passed through (individual best) and the best position found by all particles (global best). The update formulas for position and velocity are as follows,

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{11}$$

$$v_{id}(t+1) = \omega \cdot v_{id}(t) + C_1 \cdot r_1 \cdot \left(P_{id} - x_{id}(t)\right) + C_2 \cdot r_2 \cdot \left(G_{id} - x_{id}(t)\right) \tag{12}$$

where i denotes the ith particle, d is the dimension, t is the iteration number, C1 and C2 are learning factors, r1 and r2 are random numbers between 0 and 1, ω is the linearly decreasing inertia weight, Pid is the individual best value of the ith particle in dimension d, and Gid is the global best value of all particles. Five-fold cross-validation is used to evaluate the fitness of each particle to maintain a balance between the computation cost and the effectiveness of the parameter optimization. The training set is randomly divided into five non-intersecting subsets with a roughly equal number of data patterns. For every set of SVR parameters C, g and ε extracted from the corresponding particle, four subsets are used as the training set for establishing the SVR model, and the performance of this SVR model is measured by calculating the RMSE on the remaining subset according to Eq. (13).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - f(x_i)\right)^2} \tag{13}$$

where n is the number of samples; yi is the true value; and f(xi) is the predicted value of the model. This process is repeated five times so that each of the five subsets is used exactly once as the testing subset. Eventually, the fitness value of each particle is estimated by averaging the RMSE over the five subsets [34]. However, to prevent over-fitting of the SVR model, a lower limit is set for the root mean square error during the particle swarm optimization, and the optimization ends when the root mean square error falls below this lower limit. In this study, the SVR parameter optimization with PSO is described as follows:

Step 1: The PSO parameters are set and the particle swarm is initialized as shown in Table 5. The parameters include the swarm size, the maximum number of iterations, the acceleration coefficients c1 and c2, the inertia weight, the penalty factor C∈[0.1, 800], the RBF kernel parameter g∈[0.1, 10], and the ε-insensitive loss function parameter ε∈[0, 1]. Then, a population of initial particles is generated with random positions and velocities.
Step 2: For the training set, five-fold cross-validation is used to calculate the fitness value of the different parameter combinations, and the calculated result is taken as the initial individual pbest of each particle. The best pbest in the particle swarm is set as the initial gbest.
Step 3: The velocity and position of each particle are updated according to Eqs. (11) and (12), and then the fitness value is calculated before pbest and gbest are updated.
Step 4: Step 3 is repeated until the end condition is met and the optimal parameters are finally obtained.

Fig. 5. Fitness curve.

Fig. 5 shows how the optimization result varies with the number of iterations, i.e., the trend of the best population fitness over the whole evolutionary process. The fitness decreases with increasing generation number and converges at about generation 25. After 50 iterations, the RMSE obtained on the training set through the five-fold cross-validation is 3.123 × 10⁻⁴, and the value of {C, g, ε} in the final optimization result is {660, 0.673, 0.026}.
The SVR model configured with the optimal value of {C, g, ε} obtained by the particle swarm optimization is trained on the randomly selected training data until it meets the convergence conditions.
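The cross-validated fitness described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `fit` and `predict` callables are hypothetical placeholders for training and evaluating an SVR with a given {C, g, ε}, and a trivial mean predictor stands in for the SVR so that the sketch stays self-contained.

```python
import math
import random


def rmse(y_true, y_pred):
    """Eq. (13): root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k non-intersecting folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_fitness(xs, ys, fit, predict, k=5, seed=0):
    """Average validation RMSE over k folds; each fold is held out once."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [predict(model, xs[i]) for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return sum(scores) / k


# Stand-in learner (an assumption, not SVR): predict the training mean.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


# Hypothetical normalized inputs and efficiencies.
xs = [[i / 10.0] for i in range(10)]
ys = [0.80, 0.82, 0.85, 0.86, 0.88, 0.90, 0.91, 0.93, 0.94, 0.96]
print(cv_fitness(xs, ys, fit_mean, predict_mean))
```

In a PSO run, each particle's position would be decoded into {C, g, ε}, passed to `fit`, and the returned average RMSE used as that particle's fitness.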

Table 5. PSO parameter settings.

Parameter | Setting
Particle swarm size | 50
Maximum iterations | 50
(C, g, ε) search range | Min = (0.1, 0.1, 0); Max = (800, 10, 1)
The initial position of the particle swarm | randomly generated
The initial velocity of the particle swarm | randomly generated
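The update rules of Eqs. (11) and (12) with the settings of Table 5 can be sketched as below. Several choices are assumptions rather than values from the paper: the learning factors C1 = C2 = 1.5, the inertia weight decreasing linearly from 0.9 to 0.4, the velocity clamp, and the toy fitness with a hypothetical optimum, which stands in for the cross-validated RMSE of an SVR.

```python
import random


def pso(fitness, lo, hi, n_particles=50, n_iter=50,
        c1=1.5, c2=1.5, w_start=0.9, w_end=0.4, seed=0):
    """Minimize `fitness` inside the box [lo, hi] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(lo)
    span = [h - l for l, h in zip(lo, hi)]
    # Random initial positions and velocities, as in Table 5.
    x = [[l + rng.random() * s for l, s in zip(lo, span)]
         for _ in range(n_particles)]
    v = [[(rng.random() - 0.5) * 0.2 * s for s in span]
         for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_f = [fitness(p) for p in x]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for t in range(n_iter):
        # Linearly decreasing inertia weight (assumed range 0.9 -> 0.4).
        w = w_start - (w_start - w_end) * t / max(1, n_iter - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (12): velocity update, then an assumed velocity clamp.
                vel = (w * v[i][d]
                       + c1 * r1 * (pbest[i][d] - x[i][d])
                       + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-span[d], min(span[d], vel))
                # Eq. (11): position update, kept inside the search range.
                x[i][d] = min(hi[d], max(lo[d], x[i][d] + v[i][d]))
            f = fitness(x[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = x[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = x[i][:], f
    return gbest, gbest_f


# Search range for (C, g, eps) from Table 5.
LO, HI = (0.1, 0.1, 0.0), (800.0, 10.0, 1.0)
TARGET = (400.0, 5.0, 0.5)  # hypothetical optimum of the toy fitness


def toy_fitness(p):
    """Range-scaled squared distance to TARGET (stand-in for CV RMSE)."""
    return sum(((a - t) / (h - l)) ** 2
               for a, t, l, h in zip(p, TARGET, LO, HI))


best, best_f = pso(toy_fitness, LO, HI)
print(best, best_f)
```

Replacing `toy_fitness` with a function that trains an SVR for the given {C, g, ε} and returns its five-fold cross-validated RMSE would give the overall structure of the procedure described in Steps 1 to 4.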

Subsequently, the trained models are performed to predict the simulated results according to the input of testing data and then compared with the true value. Finally, the performance of the model is evaluated according to the evaluation parameters. 2.5. Evaluation parameters For evaluating the performance of the model for the grade efficiency prediction, the normalized mean squared error MSE and the correlation coefficient R are defined as MSE ¼

1 n 2 ∑ ðy −f ðxi ÞÞ n i¼1 i

  n ∑i¼1 ðyi −yÞ  f ðxi Þ− f R2 ¼ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi  2 Pn 2 Pn i¼1 ðyi −yÞ i¼1 f ðxi Þ−f

ð14Þ

ð15Þ

where, n is number of samples; yi is true value; f(xi) is predicted value of the model; y is average valuation of true values; f is average of the predicted values. The smaller the mean squared error MSE, the higher the accuracy of the model prediction. Meanwhile, the greater the correlation coefficient, the higher the correlation between experimental data and predicted values. Moreover, R2 = 1 indicates that the predicted value is completely correlated with the experimental data; that is, there is a linear relationship in the sense that the probability is 1. Besides, in the following, the simulation time (CPU time) t is also considered to evaluate the computational efficiency. 3. Comparison and discussion 3.1. Comparison between the prediction results of PCA-PSO-SVR and experimental data Fig. 6 shows the comparison between the predicted results of the PCA-PSO-SVR model and experimental data for the grade efficiency of cyclone separators. The abscissa represents the experimental data of grade efficiency as reported in the literature [14–16], and the ordinate represents the predicted values of grade efficiency output of the PCAPSO-SVR model. The red balls illustrate the predicted results of grade efficiency by the PCA-PSO-SVR model for the training samples. The green triangles are the grade efficiency values predicted by the PCAPSO-SVR model for the test samples. They are all concentrated near the x = y line, indicating that the predicted results are consistent with the experimental data. The normalized mean squared error MSE and the correlation coefficient R of the training samples and testing samples

120

W. Zhang et al. / Powder Technology 347 (2019) 114–124

To compare the modeling accuracy of the two dimensionality reduction methods between PCA and Stokes number, the PSO-SVR hybrid algorithm is used for regression modeling with 80% experimental data of 217 sets on the two five-dimensional models. Subsequently, both models are tested separately with the remaining 20% data. The test results are shown in Fig. 7. The values of {C, g, ε} in the PCA-PSO-SVR algorithm are {660, 0.673, 0.026} and the value of {C, g, ε} in the Stokes number-PSO-SVR are {203, 3, 0.01}. The black squares indicate the predicted results of the PCA-PSO-SVR model for the testing samples. The closer the data points are to the line x = y, the closer the prediction results are to the experimental data. Most of the black squares cluster are found to be near the x = y line. The PCA-PSO-SVR model maintains high accuracy even when the number of experimental data is small (distributing in grade efficiency less than 80%). The results show that PCA method requires more data for the process of dimensionality reduction. 3.3. Comparison among the PCA-PSO-SVR and classical theoretical models

Fig. 6. Comparison on the prediction results of grade efficiency between training sample and test sample.

are shown in Fig. 6. Notably, both correlation coefficients are close to 1. This indicates that the PCA-PSO-SVR model can be used as a new method to fit the complex nonlinear relationship between the grade efficiency and the other influencing factors of the cyclone separator, and to improve the generalization ability and robustness.

3.2. Comparison of the dimensionality reduction between PCA and Stokes number

It is well known that the cyclone efficiency is greatly influenced by the Stokes number, a dimensionless number characterizing the behavior of particles suspended in a fluid flow. The Stokes number is defined in Eq. (16):

$$\mathrm{Stk} = \frac{\rho_p \delta^2 v_i}{18 \mu D} \tag{16}$$

where ρp is the particle density, δ is the particle diameter, μ is the dynamic viscosity of the gas, D is the cylinder diameter, and vi is the gas velocity at the cyclone inlet. The Stokes number-based method can be regarded as a dimensionality reduction method that integrates four of the factors affecting efficiency (δ, D, ρp and vi) into one dimensionless variable. Thus, the eight variables {δ, D, ρp, dm, vi, Ci, Ka, d̃r} that affect the cyclone efficiency are reduced to five, {Stk, dm, Ci, Ka, d̃r}.
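The reduction from eight variables to five can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and dictionary key names are hypothetical, the gas viscosity `mu` is not among the eight listed variables and is assumed here to be that of ambient air, and the sample values are made up rather than taken from the experimental data.

```python
def stokes_number(rho_p, delta, v_i, mu, D):
    """Stokes number of Eq. (16): Stk = rho_p * delta**2 * v_i / (18 * mu * D).

    All inputs are expected in SI units (kg/m3, m, m/s, Pa*s, m).
    """
    return rho_p * delta ** 2 * v_i / (18.0 * mu * D)


def reduce_to_stokes_features(sample, mu=1.8e-5):
    """Map the eight raw variables to the five-variable set {Stk, dm, Ci, Ka, dr/D}.

    `sample` is a hypothetical dict. The particle diameter is given in
    micrometres and the cyclone diameter in millimetres, following the units
    of the Notation, and both are converted to metres here.
    `mu` is an assumed dynamic viscosity of ambient air in Pa*s.
    """
    stk = stokes_number(
        rho_p=sample["rho_p"],
        delta=sample["delta_um"] * 1e-6,
        v_i=sample["v_i"],
        mu=mu,
        D=sample["D_mm"] * 1e-3,
    )
    return [stk, sample["dm_um"], sample["Ci"], sample["Ka"], sample["dr_ratio"]]


# Made-up operating point for illustration only.
example = {
    "rho_p": 2700.0,   # particle density, kg/m3
    "delta_um": 5.0,   # particle diameter, micrometres
    "v_i": 20.0,       # inlet gas velocity, m/s
    "D_mm": 300.0,     # cyclone diameter, mm
    "dm_um": 10.0,     # median particle size, micrometres
    "Ci": 10.0,        # inlet concentration, g/m3
    "Ka": 5.5,         # area ratio, dimensionless
    "dr_ratio": 0.3,   # vortex finder to cyclone diameter ratio
}
features = reduce_to_stokes_features(example)
```

The four quantities δ, D, ρp and vi enter the model only through Stk, which is why the Stokes-number route yields a five-dimensional input of the same size as the PCA route it is compared against.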

Three theoretical models of the cyclone separator (Barth [7], Dietz [8] and Leith-Licht [9]) are compared with the PCA-PSO-SVR model. Figs. 8(a), (c), (d), (e) and (f) show the influence of Ka, D, δ, ρp and vi on the grade efficiency as predicted by these models. The PCA-PSO-SVR model is consistent with the trend reflected by the experimental data. However, the Barth model gives overly optimistic predictions of the grade efficiency, whereas the Dietz and Leith-Licht models give more pessimistic predictions. Most of the values predicted by the PCA-PSO-SVR model are closer to the experimental data. Fig. 8(b) shows that the experimental grade efficiency decreases as d̃r increases in the range of 0.25 to 0.45. The trends of the PCA-PSO-SVR and Barth models are consistent with the experimental data, whereas, interestingly, the other two models give the opposite trend in the same range.

The concentration of inlet particles and the particle size distribution are two important parameters that affect the grade efficiency, but they are not considered in the three theoretical models. In contrast, the PCA-PSO-SVR model can handle any factor that has been measured in the experiments. The particle size distribution is expressed by the median diameter dm and the root mean square difference of the dust particle size. As the median particle size increases, the large particles exert a certain drag effect on the small particles, which improves the separation efficiency within a certain range. According to the analysis of the experimental data, the influence of the root mean square difference of the dust particle size distribution on the separation performance is negligible. As the concentration increases, the drag force generated by the large particles moving toward the wall entrains the small particles toward the wall. As a result, collision, interception, and agglomeration between particles increase. The viscous force of the gas stream on the particles is relatively reduced, which leads to an increase in the particle separation efficiency. The predictions of the PCA-PSO-SVR model for the effect of the median particle size dm and the inlet concentration Ci on the grade efficiency are compared with the experimental data in Figs. 8(g) and (h). The trends show that the PCA-PSO-SVR model captures the effect of these two parameters on the grade efficiency.

3.4. Comparison among PCA-PSO-SVR, PCA-SVR, PSO-SVR, and SVR models

Fig. 7. Prediction results comparison between PCA-PSO-SVR and Stokes-PSO-SVR.

To test the improvements in SVR performance brought by PCA and PSO, respectively, the prediction results of the PCA-PSO-SVR, PCA-SVR, PSO-SVR, and SVR models for the testing samples are shown in Fig. 9. The red circles indicate the predicted results of the PCA-PSO-SVR model for the testing samples. Most of them are concentrated near the line x = y, which means that the predicted results agree well with the experimental data. The PCA-PSO-SVR model achieves the minimum mean squared error and the highest correlation coefficient among the four models. The PCA-SVR model, whose correlation coefficient of 0.957 is higher than those of the PSO-SVR and SVR models, shows that the PCA effectively


Fig. 8. Performance comparison between the PCA-PSO-SVR and theoretical models.


Table 6 Evaluation parameters and hyper-parameters of SVR hybrid with PSO and PCA.

        PCA-PSO-SVR     PCA-SVR         PSO-SVR         SVR
MSE     6.948 × 10−4    1.617 × 10−3    1.010 × 10−3    1.736 × 10−3
R2      0.982           0.957           0.929           0.885
C       660             512             313             256
g       0.673           0.5             0.131           0.125
ε       0.026           0.031           0.010           0.016
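The two evaluation quantities listed in Table 6 can be computed as in the following minimal sketch. It assumes the standard definitions of the mean squared error and of the Pearson correlation coefficient between true and predicted values; the function names are hypothetical, and the exact form used by the authors is the one given in their own equations.

```python
import math


def mean_squared_error(y_true, y_pred):
    """MSE = (1/n) * sum over i of (f(x_i) - y_i)^2."""
    n = len(y_true)
    return sum((f - y) ** 2 for y, f in zip(y_true, y_pred)) / n


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between true values y_i and predictions f(x_i)."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    f_mean = sum(y_pred) / n
    cov = sum((y - y_mean) * (f - f_mean) for y, f in zip(y_true, y_pred))
    var_y = sum((y - y_mean) ** 2 for y in y_true)
    var_f = sum((f - f_mean) ** 2 for f in y_pred)
    if var_y == 0.0 or var_f == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_y * var_f)


# Toy check: predictions that are an exact linear function of the targets
# are perfectly correlated even though their MSE is not zero.
y = [1.0, 2.0, 3.0]
f = [2.0, 4.0, 6.0]
mse = mean_squared_error(y, f)
r = correlation_coefficient(y, f)
```

The toy check illustrates why both quantities are reported together: a correlation of 1 only says the predictions vary linearly with the data, while the MSE measures how far they actually are from it.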

Fig. 10. Time consumption of parameter optimization.


Fig. 11. Comparison of the PCA-PSO-SVR model with the BP, RBF and GRNN models for the grade efficiency.

Fig. 9. Comparison of the PCA-PSO-SVR with PCA-SVR, PSO-SVR and SVR models for the grade efficiency.

Table 7 Evaluation parameters of different models.

        PCA-PSO-SVR     BP              RBF             GRNN
MSE     6.948 × 10−4    1.200 × 10−2    4.680 × 10−2    3.780 × 10−2
R2      0.982           0.901           0.8245          0.7532
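To put the Table 7 values on a common scale, the short sketch below ranks the four models and expresses each ANN error as a multiple of the PCA-PSO-SVR error. The numbers are copied from Table 7; the variable names are illustrative only.

```python
# Test-set values copied from Table 7.
mse = {"PCA-PSO-SVR": 6.948e-4, "BP": 1.200e-2, "RBF": 4.680e-2, "GRNN": 3.780e-2}
r2 = {"PCA-PSO-SVR": 0.982, "BP": 0.901, "RBF": 0.8245, "GRNN": 0.7532}

# Model with the smallest mean squared error.
best = min(mse, key=mse.get)

# How many times larger each other model's MSE is than the best one.
ratios = {name: value / mse[best] for name, value in mse.items() if name != best}

# Models ordered from lowest to highest MSE.
ranking = sorted(mse, key=mse.get)
```

On these figures the BP network, the best of the three ANN models, has an MSE roughly 17 times that of the PCA-PSO-SVR model.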


reduces the dimensionality of the feature space and improves the generalization ability of the model. The mean squared error of the PSO-SVR method is 1.010 × 10−3, lower than those of the PCA-SVR and SVR models, which means that the particle swarm optimization improves the modeling accuracy of the SVR. Table 6 lists the mean squared error MSE and correlation coefficient R used to evaluate the performance of the models, together with the hyper-parameters of SVR {C, g, ε} for the grade efficiency prediction. Fig. 10 shows the CPU time for SVR and PCA-SVR with the standard grid method (t = 145.07 s and 3508.85 s, respectively) and the CPU time for PSO-SVR and PCA-PSO-SVR with the PSO algorithm (t = 25.63 s and 502.65 s, respectively). The PSO algorithm requires far less time than the standard grid method: with 50 iterations and 50 particles, the particle swarm algorithm evaluates the fitness function by 5-fold cross-validation 2500 times, whereas a standard grid search with each of the three optimization parameters set to 50 levels requires 125,000 such evaluations. In summary, as an advanced evolutionary algorithm, the particle swarm optimization can replace the standard grid search method to find better model parameters and to improve the optimization speed and accuracy.
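The evaluation counts quoted above can be reproduced with a minimal particle swarm sketch. It is not the authors' implementation: the inertia weight and acceleration coefficients, the search bounds, and the toy fitness function are all assumptions, and the toy function merely stands in for the 5-fold cross-validation error of an SVR (its minimum is placed at the PCA-PSO-SVR values of Table 6 purely for illustration).

```python
import random


def pso_minimize(fitness, bounds, n_particles=50, n_iters=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best position, best value, evaluations)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [float("inf")] * n_particles
    gbest, gbest_val = pos[0][:], float("inf")
    n_evals = 0
    for _ in range(n_iters):
        # One fitness evaluation per particle per iteration.
        for i in range(n_particles):
            val = fitness(pos[i])
            n_evals += 1
            if val < pbest_val[i]:
                pbest_val[i], pbest[i] = val, pos[i][:]
            if val < gbest_val:
                gbest_val, gbest = val, pos[i][:]
        # Velocity and position update, clipped to the search bounds.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
    return gbest, gbest_val, n_evals


# Assumed search ranges for {C, g, epsilon}; not taken from the paper.
BOUNDS = [(0.1, 1000.0), (0.01, 10.0), (0.001, 1.0)]


def toy_fitness(params):
    """Hypothetical stand-in for the cross-validation MSE of an SVR."""
    C, g, eps = params
    return ((C - 660.0) / 1000.0) ** 2 + ((g - 0.673) / 10.0) ** 2 + (eps - 0.026) ** 2


best, best_val, n_evals = pso_minimize(toy_fitness, BOUNDS)

# Exhaustive grid with 50 levels for each of the three parameters.
grid_evaluations = 50 ** 3
```

With 50 particles and 50 iterations the swarm calls the fitness function 2500 times, against 125,000 calls for the full grid, a factor of 50 in fitness evaluations before the cost of each cross-validation is taken into account.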

3.5. Comparison among PCA-PSO-SVR, BP, RBF and GRNN models

To test the validity of the PCA-PSO-SVR model, three types of ANN (Artificial Neural Network) models are adopted to model the cyclone grade efficiency, namely, back propagation (BP), radial basis function (RBF), and general regression neural network (GRNN). The BP neural network adopts a single hidden layer structure with 10 neurons. The radial basis function in the RBF neural network has a spread velocity of 7.5, and the spread velocity in the GRNN is set to 0.1. Most of the prediction results of the PCA-PSO-SVR model cluster near the x = y line in Fig. 11, which means that the accuracy of the PCA-PSO-SVR model is superior to that of the three neural networks. Some values predicted by RBF are lower than the experimental data, while some values predicted by GRNN are higher than the experimental data. This phenomenon is especially noticeable in the region where only a few data are available (grade efficiency below 80%). Table 7 lists the evaluation parameters of the BP, RBF, GRNN, and PCA-PSO-SVR models. It shows that the PCA-PSO-SVR model achieves the minimum mean squared error and the highest correlation coefficient compared with the three ANN models.

4. Conclusions

The PCA-PSO-SVR modeling method, which combines principal component analysis, particle swarm optimization, and the support vector regression algorithm, is proposed to model the cyclone efficiency using experimental data. The simulation results show that PCA, as an unsupervised dimensionality reduction algorithm, can effectively reduce the dimensionality of the feature space, eliminate part of the noise in the data, reduce the complexity of the model, and improve the generalization ability of the model. As an optimization algorithm, PSO has an excellent ability to find proper parameters for the SVR model. With the optimized parameters, SVR is successfully used to predict the grade efficiency of the cyclone separator. The prediction results show that the PCA-PSO-SVR model has strong predictive ability, high stability, high generalization ability and robustness compared with the classical theoretical models, the PSO-SVR, SVR, and PCA-SVR models, and several types of ANN models. As a future extension of this work, the development of higher-performance artificial intelligence models and advanced optimal search algorithms is necessary to predict the grade efficiency of the cyclone separator more accurately and to guide its optimization design.

Acknowledgments

The authors acknowledge support from the National Key Research and Development Program of China (2018YFB0604603-03), National Natural Science Foundation of China (No. 21506139), NSFC-Shanxi Joint Fund for Coal-Based Low-Carbon Technology (No. U1710101) and Special Talent Program of Shanxi Province (No. 201605D211005).

Notation

a       Inlet height, mm
b       Inlet width, mm
H1      Cylinder height, mm
H2      Cone height, mm
S       Length of vortex finder, mm
B       Particle exit diameter, mm
C       The penalty factor
Ci      Concentration of inlet particles, g/m3
D       Cyclone diameter, mm
dr      Cyclone gas outlet diameter, mm
d̃r      The ratio of diameter of vortex finder to that of cyclone (dimensionless), d̃r = dr/D
g       The parameter of the kernel function
Ka      The ratio of cyclone cross-section area to inlet cross-sectional area (dimensionless), Ka = πD2/4ab
vi      Gas velocity at cyclone inlet, m/s
δ       Particle diameter, μm
dm      Median size of particle, μm
ε       The insensitive loss
η       Overall efficiency, %
ηi      Graded efficiency of particles, %
ρp      Particle density, kg/m3
ωi      The weight vector
u       The threshold value

References

[1] B. Zhao, Development of a dimensionless logistic model for predicting cyclone separation efficiency, Aerosol Sci. Technol. 44 (12) (2010) 1105–1112, https://doi.org/10.1080/02786826.2010.512027.
[2] B. Zhao, Prediction of gas-particle separation efficiency for cyclones: a time-of-flight model, Sep. Purif. Technol. 85 (2012) 171–177, https://doi.org/10.1016/j.seppur.2011.10.006.
[3] G. Lidén, A. Gudmundsson, Semi-empirical modelling to generalise the dependence of cyclone collection efficiency on operating conditions and cyclone design, J. Aerosol Sci. 28 (5) (1997) 853–874, https://doi.org/10.1016/S0021-8502(96)00479-X.
[4] Y.F. Qiu, B.Q. Deng, N.K. Chang, Numerical study of the flow field and separation efficiency of a divergent cyclone, Powder Technol. 217 (2012) 231–237, https://doi.org/10.1016/j.powtec.2011.10.031.
[5] G.G. Sun, J.Y. Chen, M.X. Shi, Optimization and applications of reverse-flow cyclones, China Particuology 3 (2005) 43–46, https://doi.org/10.1016/S1672-2515(07)60162-6.
[6] J.X. Yang, G.G. Sun, M.S. Zhan, Prediction of the maximum-efficiency inlet velocity in cyclones, Powder Technol. 286 (2015) 124–131, https://doi.org/10.1016/j.powtec.2015.07.024.
[7] W. Barth, Design and layout of the cyclone separator on the basis of new investigations, Brennstoff-Warme-Kraft 8 (1956) 1–9, http://refhub.elsevier.com/s0032-5910(17)30882-3/rf0100.
[8] P.W. Dietz, Collection efficiency of cyclone separators, AIChE J. 27 (1981) 888–892, http://refhub.elsevier.com/s0032-5910(17)30882-3/rf0105.
[9] D. Leith, W. Licht, The collection efficiency of cyclone type particle collectors: a new theoretical approach, AIChE Symp. Ser. 68 (1972) 196–206, http://refhub.elsevier.com/s0032-5910(16)30086-9/rf0030.
[10] S.E. Rafiee, M.M. Sadeghiazad, Efficiency evaluation of vortex tube cyclone separator, Appl. Therm. Eng. 114 (2017) 300–327, https://doi.org/10.1016/j.applthermaleng.2016.11.110.
[11] Y. Zhu, K.W. Lee, Experimental study on small cyclones operating at high flowrates, J. Aerosol Sci. 30 (10) (1999) 1303–1315, https://doi.org/10.1016/S0021-8502(99)00024-5.
[12] J.Y. Chen, M.X. Shi, Analysis on cyclone collection efficiencies at high temperatures, China Particuology 1 (2003) 20–26, https://doi.org/10.1016/S1672-2515(07)60095-5.
[13] K.S. Lim, H.S. Kim, K.W. Lee, Characteristics of the collection efficiency for a cyclone with different vortex finder shapes, J. Aerosol Sci. 35 (2004) 743–754, https://doi.org/10.1016/j.jaerosci.2003.12.002.
[14] X.L. Luo, J.Y. Chen, Research on the effect of the particle concentration in gas upon the performance of cyclone separators, J. Eng. Thermophys-rus. 13 (3) (1992) 282–285, http://jetp.iet.cn/EN/Y1992/V13/I3/282.
[15] Y.H. Jin, J.Y. Chen, Computation method of PV™ cyclone performance, Acta Pet. Sin. 2 (1995) 93–99, http://lib.cqvip.com/qk/81668X/200001/1878380.html.
[16] Y.H. Jin, M.X. Shi, Experimental studies on scale-up of cyclone separator, J. China Univ. Pet. Ed. Nat. Sci. 5 (1990) 46–55, http://qikan.cqvip.com/article/detail.aspx?id=353292.
[17] X. Sun, Y.Y. Joon, Multi-objective optimization of a gas cyclone separator using genetic algorithm and computational fluid dynamics, Powder Technol. 325 (2018) 347–360, https://doi.org/10.1016/j.powtec.2017.11.012.
[18] M. Francesco, R. Francesco, N.G. Carlo, Separation efficiency and heat exchange optimization in a cyclone, Sep. Purif. Technol. 179 (2017) 393–402, https://doi.org/10.1016/j.seppur.2017.02.024.
[19] A.N. Huang, I. Keiya, F. Tomonori, F. Kunihiro, K. Hsiu-Po, Effects of particle mass loading on the hydrodynamics and separation efficiency of a cyclone separator, J. Taiwan Inst. Chem. E. 90 (2018) 61–67, https://doi.org/10.1016/j.jtice.2017.12.016.
[20] D. Misiulia, A.G. Andersson, T.S. Lundström, Effects of the inlet angle on the collection efficiency of a cyclone with helical-roof inlet, Powder Technol. 305 (2017) 48–55, https://doi.org/10.1016/j.powtec.2016.09.050.
[21] W.I. Mazyan, A. Ahmadi, J. Brinkerhoff, H. Ahmed, M. Hoorfar, Enhancement of cyclone solid particle separation performance based on geometrical modification: numerical analysis, Sep. Purif. Technol. 191 (2018) 276–285, https://doi.org/10.1016/j.seppur.2017.09.040.
[22] F. Zhou, G.G. Sun, Y. Zhang, H. Ci, Q. Wei, Experimental and CFD study on the effects of surface roughness on cyclone performance, Sep. Purif. Technol. 193 (2018) 175–183, https://doi.org/10.1016/j.seppur.2017.11.017.
[23] K. Elsayed, C. Lacor, CFD modeling and multi-objective optimization of cyclone geometry using desirability function, artificial neural networks and genetic algorithms, Appl. Math. Model. 37 (8) (2013) 5680–5704, https://doi.org/10.1016/j.apm.2012.11.010.
[24] B. Zhao, Modeling pressure drop coefficient for cyclone separators: a support vector machine approach, Chem. Eng. Sci. 64 (2009) 4131–4136, https://doi.org/10.1016/j.ces.2009.06.017.
[25] K. Elsayed, C. Lacor, Modeling and pareto optimization of gas cyclone separator performance using RBF type artificial neural networks and genetic algorithms, Powder Technol. 217 (2) (2012) 84–99, https://doi.org/10.1016/j.powtec.2011.10.015.
[26] K. Yetilmezsoy, Determination of optimum body diameter of air cyclones using a new empirical model and a neural network approach, Environ. Eng. Sci. 23 (4) (2006) 680–690, https://doi.org/10.1089/ees.2006.23.680.
[27] A. Khalkhali, H. Safikhani, Pareto based multi-objective optimization of a cyclone vortex finder using CFD, GMDH type neural networks and genetic algorithms, Eng. Optim. 44 (1) (2012) 105–118, https://doi.org/10.1080/0305215X.2011.564619.
[28] G.G. Sun, M.X. Shi, The proper design and application of PV cyclone, Pet. Refin. Eng. 32 (9) (2002) 4–7 (in Chinese), https://doi.org/10.3969/j.issn.1002-106X.2002.09.002.
[29] M.P. Wang, Q. Tian, Dynamic heat supply prediction using support vector regression optimized by particle swarm optimization algorithm, Math. Probl. Eng. 1 (2016) 1–10, https://doi.org/10.1155/2016/3968324.
[30] R. Dash, P.K. Sa, B. Majhi, Particle swarm optimization based support vector regression for blind image restoration, J. Comput. Sci. Technol. 27 (5) (2012) 989–995, https://doi.org/10.1007/s11390-012-1279-z.
[31] Y.Y. Chen, Q.F. Xiong, Support Vector Machine Method and Application Course [M]. Beijing, 2011.
[32] Y. Yajima, H. Ohi, M. Mori, Extracting feature subspace for kernel based linear programming support vector machines, J. Oper. Res. Soc. Jpn. 46 (4) (2003) 395–408, https://doi.org/10.15807/jorsj.46.395.
[33] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proc. IEEE Int. Conf. Neural Netw. 4 (1995) 1942–1948, https://doi.org/10.1109/ICNN.1995.488968.
[34] Z. Zhong, D. Pi, Forecasting satellite attitude volatility using support vector regression with particle swarm optimization, IAENG Int. J. Comput. Sci. 41 (3) (2014) 153–162, http://www.iaeng.org/IJCS/issues_v41/issue_3/IJCS_41_3_01.pdf.