Comparative study of regression modeling methods for online coal calorific value prediction from flame radiation features


Fuel 142 (2015) 164–172

Contents lists available at ScienceDirect

Fuel journal homepage: www.elsevier.com/locate/fuel

Lijun Xu a,⇑, Yanting Cheng a, Rui Yin a, Qi Zhang b

a Key Laboratory of Precision Opto-Mechatronics Technology of Ministry of Education, School of Instrument Science and Opto-Electronic Engineering, Beihang University, Beijing 100191, China
b Beijing Huashengjincheng Science & Technology Co., Ltd., Beijing 100085, China

Highlights

• Models for coal calorific value prediction from flame radiation were established.
• Comparison of linear and nonlinear regression models was made.
• Statistical approaches could not improve the performance of the linear model.
• Statistical approaches could effectively improve the performance of the nonlinear model.
• The PLSA-based SVR model shows the best performance with fewer feature components.

Article info

Article history:
Received 17 July 2014
Received in revised form 24 October 2014
Accepted 29 October 2014
Available online 15 November 2014

Keywords: Coal calorific value; Flame radiation features; Multiple regression analysis; Statistical approach

Abstract

In this paper, multiple regression methods are presented and compared for online coal calorific value prediction from multi-spectral flame radiation features. Several statistical approaches, including principal component analysis (PCA), independent component analysis (ICA) and partial least squares analysis (PLSA), were used in linear and nonlinear regression analyses. The results show that a nonlinear regression model approximates the relationship between the coal calorific value and the flame radiation features better than a linear regression model. In linear regression analysis, the performance of the coal calorific value prediction models was not improved by the statistical approaches. In nonlinear regression analysis, however, the performance of the prediction models was significantly improved when they were combined with the statistical approaches. The variation of the coefficients of multiple regression (R²) showed that only the PLSA-based nonlinear regression model can discriminate useful feature components from useless ones. The PLSA-based nonlinear regression model showed the best performance for coal calorific value prediction, with the number of features reduced to about a third of that in the other models. With this model, online coal calorific value prediction from the multi-band flame radiation features, under the operating conditions used by the industrial boiler, has a mean absolute error, standard deviation of the absolute errors, mean relative error and standard deviation of the relative errors of 148.76 kJ/kg, 291.86 kJ/kg, 0.76% and 1.53%, respectively.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

The study of combustion processes has attracted extensive attention from researchers for centuries. Reported research consists mostly of numerical simulation studies supplemented by experimental verification [1]. For large-scale industrial boilers, such research is more challenging because of the harsh application environment.

⇑ Corresponding author. Tel.: +86 10 82317325. E-mail address: [email protected] (L. Xu).
http://dx.doi.org/10.1016/j.fuel.2014.10.081
0016-2361/© 2014 Elsevier Ltd. All rights reserved.

Combustion involves fuel/air chemical reactions and complicated heat and mass transfer. Simulating combustion inside a combustor requires boundary parameters such as fuel composition, pressure and temperature to be specified and, more importantly, special measurement technologies are needed to validate the simulation results. Although some energy-intrusive techniques, such as tunable diode laser absorption spectroscopy [2,3], laser induced breakdown spectroscopy [4] and X-ray fluorescence analysis [5], have been used in engine design to measure parameters of interest, including temperature and gas concentration, from the attenuation or spectral variation of the penetrating rays,

L. Xu et al. / Fuel 142 (2015) 164–172

the techniques have limited application in the design and monitoring of large-scale industrial boilers because of the complicated industrial environment.

Industries that operate coal-fired energy conversion utilities face the problems of low efficiency and tightened emission limits. Industrial boilers must burn fuels within their design ranges to maintain high efficiency [6–8]. However, a supply of standard coal cannot be assured because of the complexity of raw coal resources. Coal mining usually has regional characteristics, and coal from different underground layers often differs in rank. The flame generated with one type of fuel will not be duplicated with another. Coal-fired industries often have to use sub-quality or blended coals and often operate under off-design conditions, which degrades efficiency and causes excess pollutant emissions [9].

The lack of an adequate automatic adjustment strategy is another factor that hinders the performance of modern industrial boiler systems. As mentioned, it is hard to obtain knowledge of instantaneous combustion conditions even on pilot rigs. There are scarcely any in-situ approaches that directly provide detailed knowledge of the combustion inside a boiler, from which automated control could be implemented to optimize utility operation. In industrial applications, steam pressure, temperature and other conveniently obtained parameters are normally used to control the energy conversion process [10]. In the routine procedure for adjusting utility operation, steam pressure is translated into the heat content of the fuel feed, which is then used to regulate the coal/air ratio. However, signals such as steam pressure are post-stage parameters of the coal/air combustion and lag considerably behind what is happening in the boiler. This greatly limits the efficiency of modern boiler control systems. As a result, coal-fired power plants suffer from economic losses and excess pollutant emissions.

Because of the technical obstacles in industrial boilers, much research has sought in-situ methods to improve energy conversion efficiency. The related literature mostly adopts passive methods, which are safe and practical for industrial applications. In such technologies, combustion parameters are obtained from spectral analysis of optical radiation, which carries rich information about combustion status and fuel properties [11]. Combined with artificial intelligence methods, some parameters of the combustion conditions, such as the fuel/air ratio and SOx and NOx emissions, can be evaluated [12,13]. 3D visualization of flame profiles has also been implemented using several CCDs [14].

Some industry-oriented research also attempts in-situ measurement of combustion process parameters using energy-intrusive technologies [15]. These technologies analyze flue gas online using laser absorption sensors that are usually arranged in the exhaust duct or near the exit of the boiler. Concentrations of OH, CO, NHx and other gas components are measured, which helps in understanding the combustion process and in making optimization decisions to reduce pollutant emissions. Commercial products using laser technologies for in-situ applications have also begun to appear in recent years. However, there is no consensus about the cost-benefit of in-situ flue gas analyzers using laser techniques, and flue gas analyzers have not been widely used in industrial applications.

Boiler performance is closely related to coal quality [16,17]. In the literature, the quality of coal used in coal-fired power stations is traced through proximate or ultimate analyses using secondary ion mass spectrometry (SIMS), ion chromatography-inductively coupled plasma mass spectrometry (IC-ICPMS), X-ray powder diffraction (XRD), etc. Through sampling and analysis of the feed coal and the fly and bottom ash of the boilers, the concentrations of major and trace elements, especially toxic elements, as well as the modes of occurrence of trace elements, can be obtained. Furthermore, the relationship between the content of trace


elements and the ash yields is studied, from which the environmental impact of fly and bottom ash disposal from coal-fired power stations can be assessed [18–20]. However, coal quality is seldom measured for online boiler efficiency improvement. Some studies have realized coal calorific value prediction using artificial intelligence techniques from ultimate or proximate analysis data of coal samples [21,22]. Such studies require sampling of the feed coal and analysis using SIMS, IC-ICPMS, XRD, etc. to obtain data for the prediction. However, the method is off-line and takes a long time to produce a result. In earlier publications of the authors' research group, online fuel type identification and preliminary research on coal calorific value prediction were reported [23–25].

The challenges for in-situ techniques include tolerance of the harsh environment in industrial boilers and fast sampling and interpretation of data. Operability, reliability and the economic costs of installation and maintenance must be considered, and the cost-efficiency ratio should be low enough.

Fuel/air combustion inside a boiler can be characterized by flame radiation features that directly reflect fuel properties. The products of coal combustion comprise hot soot, molecules, radicals, ash, atoms, etc., and flame radiation is the sum of the optical radiation of all these components. Different ranks of coal generate products with different mass concentrations and different radiation spectra [26,27]. Given an adequate interpretation scheme, important indexes of coal properties can be obtained from the radiation features.

In this paper, multiple regression analysis-based methods for coal calorific value prediction are presented and compared. The measurement system used photoelectric sensors to detect the flame radiation of a coal-fired boiler in the visible, ultraviolet and infrared bands. In the system preparation stage, a database containing the calorific values of coal samples and synchronous recordings of flame radiation signals was established and used in regression analysis for linear and nonlinear modeling of the relationship between coal calorific value and flame radiation features. Raw features were extracted from the raw multi-band radiation signals. Linear and nonlinear regression analyses for coal calorific value prediction from multi-band flame radiation signals were based on the least squares method and support vector regression (SVR), respectively, and the performances of all the linear and nonlinear regression models established were compared. By combining the regression with statistical approaches, including principal component analysis (PCA), independent component analysis (ICA) and partial least squares analysis (PLSA), correlations among the raw features were eliminated and new components with different statistical properties were generated. By comparing the coefficients of multiple regression of the models and the accuracies of coal calorific value prediction using the different models, the best-fitting model for coal calorific value prediction from the flame radiation features was obtained. Based on this model, online prediction of coal calorific value can be realized at high accuracy with a low-cost measurement system. Fuel type is well known to be one of the important factors in boiler design, and knowledge of fuel properties can help operators gain insight into the combustion process and regulate fuel and air rates more accurately. Meanwhile, online measurement of the calorific value of the burning coal can provide verification for coal-fired boiler designs.

2. Methodology

2.1. Experiment system

The experiment system used photoelectric sensors to detect flame radiation in the visible, infrared and ultraviolet bands. From the photoelectric sensors commercially available from Hamamatsu Co. Ltd., we selected three cost-effective and easy-to-use sensors that cover the ranges of 190–1000 nm, 400–900 nm and 1000–1650 nm, respectively. The detection axis of the sensors was directed at the flame root area of the burner and formed an angle of about 7° with the burner axis, as shown in Fig. 1. The probe hardware was portable and easily installed. Because it used the view hole reserved for the flame-eye in the boiler design, installing the photoelectric probe required no extra opening in the wall, which ensured the safety of industrial applications. An air-cooled jacket was designed to protect the probe from overheating in the boiler. Being passive, the measurement system is suitable for in-situ application in coal-fired industries.

2.2. Statistical analysis of radiation features

From the flame radiation signals in the ultraviolet, visible and infrared bands, features in the time and frequency domains were extracted, including the DC mean, root mean square, variance, kurtosis, skewness, n-entropy, number of zero-crossings, the first and second kinds of average frequency, and the shape factor of the power spectral density (PSD) of the signal. In all, 3 × 10 variables were extracted from the radiation signals in the visible, infrared and ultraviolet bands, referred to here as raw features.

Coal calorific value prediction from flame radiation features based on regression modeling is affected by several factors. First, collinearity among the radiation features increases the variance of the regression coefficients. Second, irrelevant information or data redundancy may also affect the prediction of coal calorific value. To alleviate these possible disturbances, PCA, ICA and PLSA were used to transform the raw radiation features into more regular statistical spaces.

Principal component analysis is a multivariate data analysis technique widely used in conjunction with regression analysis. PCA maps the variables into an orthogonal space and can help eliminate correlations among multiple variables. Independent component analysis is a computational method for blind source separation [28]. ICA treats multivariate signals as mixtures of independent sources and finds the source signals by maximizing the statistical independence of the estimated components [29]. PLSA is a multivariate statistical approach first proposed by Herman Wold in 1983. PLSA developed in two fields: chemometrics, and econometrics and the social sciences. In regression analysis, a latent set not accounted for by the global model may lead to biased or erroneous results in terms of model parameters and model quality. PLSA can achieve good results on this problem through canonical correlation analysis.

PCA uses the first and second moments of the measured data and hence relies heavily on Gaussian features [30]. ICA exploits the inherently non-Gaussian features of the data and employs higher moments. In regression applications, PLSA is based on the principle of maximizing the relationship between inputs and outputs, whereas PCA and ICA deal only with the inherent statistics of the input set.

2.3. Regression analyses

Linear and nonlinear regression analyses were made for coal calorific value prediction. Regression models were established both from the raw radiation features and in combination with the statistical approaches.

2.3.1. Linear regression modeling

In linear regression analysis, it is assumed that the radiation features and the coal calorific value statistically satisfy

$$ Y = X\beta + \varepsilon \qquad (1) $$

where

$$ Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}_{n \times 1}, \quad X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{bmatrix}_{n \times (p+1)}, \quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}_{(p+1) \times 1}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}_{n \times 1} $$

and $Y$ is the vector of observed coal calorific values, $X$ is the matrix of radiation features, $\beta$ is the vector of linear regression parameters, and $\varepsilon$ denotes the random errors. The estimate of $\beta$ is denoted by $B_{LS}$ and is calculated from

$$ \hat{Y} = X B_{LS} \qquad (2) $$

where $\hat{Y}$ denotes the estimate of the coal calorific value. According to the least squares principle, $B_{LS}$ should minimize $\|Y - \hat{Y}\|^2$, and the solution is

$$ B_{LS} = (X'X)^{-1} X'Y \qquad (3) $$
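As a minimal sketch of Eqs. (1)–(3), the least squares estimate can be computed with NumPy. The function names `fit_least_squares` and `predict` and the synthetic data are illustrative assumptions, not part of the original study. `numpy.linalg.lstsq` is used here in place of the explicit inverse in Eq. (3) because it is numerically safer when the features are strongly correlated; the two agree when X has full column rank.

```python
import numpy as np


def fit_least_squares(features, y):
    """Return B_LS for Y = X*beta + eps, with an intercept column prepended."""
    features = np.asarray(features, dtype=float)
    y = np.asarray(y, dtype=float)
    # Build the n x (p+1) design matrix X of Eq. (1): a column of ones, then the features.
    X = np.column_stack([np.ones(len(features)), features])
    # Least squares solution; equals (X'X)^-1 X'Y of Eq. (3) when X has full column rank.
    B_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    return B_ls


def predict(features, B_ls):
    """Eq. (2): Y_hat = X * B_LS."""
    features = np.asarray(features, dtype=float)
    X = np.column_stack([np.ones(len(features)), features])
    return X @ B_ls


# Synthetic, noise-free example (hypothetical data, not the boiler measurements).
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(50, 2))
demo_y = 2.0 + 3.0 * demo_features[:, 0] - 1.0 * demo_features[:, 1]
demo_B = fit_least_squares(demo_features, demo_y)
```

Because the synthetic target is an exact linear function of the two features, `demo_B` recovers the intercept and slopes used to generate it.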

2.3.2. Nonlinear regression modeling

Nonlinear regression modeling was performed using the machine learning method of SVR, an important application of the support vector machine (SVM) to regression analysis. SVM was proposed in the 1990s. Because it is based on structural risk minimization, SVR can avoid over-learning, especially when the data are insufficient for an adequate statistical estimation [31]. SVR establishes the regression model by defining a hyperplane, i.e.

$$ f(x) = \sum_{i=1}^{N} (\alpha_i^* - \alpha_i)\, k(x_i, x) + b \qquad (4) $$

where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers and the kernel function

$$ k(x_i, x) = \varphi(x_i) \cdot \varphi(x) \qquad (5) $$

is the dot product of the nonlinear mappings of the inputs. A schematic diagram of the SVR-based regression model is shown in Fig. 2. In SVR, the form of the kernel function makes the regression either a linear or a nonlinear problem. In this study, the Gaussian radial basis function was used, as given by

$$ K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2} \qquad (6) $$

The primal problem can be optimized by minimizing the regularized risk function

$$ R_{reg}[f] = \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{l} L_\varepsilon(y_i) \qquad (7) $$

$$ L_\varepsilon(y_i) = \begin{cases} 0, & \text{for } |f(x_i) - y_i| < \varepsilon \\ |f(x_i) - y_i| - \varepsilon, & \text{otherwise} \end{cases} \qquad (8) $$

Here $\omega$ is the weight vector of the hyperplane, $L_\varepsilon(y)$ is the $\varepsilon$-insensitive loss function, and $C$ is the regularization parameter that trades off model complexity against losses.

Fig. 1. Installation of the photoelectric probe (labels: burner, coal flame illumination area).

Fig. 2. SVR model for coal calorific value prediction (labels: kernel nodes $K(x, x_1)$, $K(x, x_2)$, ..., $K(x, x_i)$ with weights $\alpha_i^* - \alpha_i$).
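The building blocks of Eqs. (4), (6) and (8) can be sketched with NumPy as follows. This is an illustration only: the function names and parameter values are assumptions, `dual_coef` stands for the multiplier differences that weight the kernel values in Eq. (4), and no training (that is, no solution of the optimization problem in Eq. (7)) is performed here.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel of Eq. (6) between the rows of A and the rows of B."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    # Squared Euclidean distance between every row of A and every row of B.
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def eps_insensitive_loss(f_x, y, eps):
    """Loss of Eq. (8): zero inside the eps-tube, linear outside it."""
    abs_err = np.abs(np.asarray(f_x, dtype=float) - np.asarray(y, dtype=float))
    return np.maximum(0.0, abs_err - eps)


def svr_predict(x_new, support_vectors, dual_coef, b, gamma):
    """Eq. (4): weighted sum of kernel values plus the bias b."""
    K = rbf_kernel(x_new, support_vectors, gamma)  # shape (n_new, n_support)
    return K @ np.asarray(dual_coef, dtype=float) + b
```

In a trained model, the support vectors, `dual_coef` and `b` would come from the solver; here they are plain inputs so that the prediction rule itself is visible.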

3. Experiments and results

The experiment was carried out on a tangentially fired boiler of a 300 MW unit. The site installation of the photoelectric probe is shown in Fig. 3. The horizontal cross-section of the boiler measures 14,048 mm × 12,468 mm, and the boiler is equipped with horizontal bias burners. The design coal of the boiler is bituminous coal. The unit is equipped with a distributed control system (DCS) that regulates the operation of the boiler and the steam turbine conversion system. Fuel/air rates are controlled according to the load settings, the steam flow and feedback signals of steam pressure and temperature from the superheater, reheater and other sections. During the experiments, the unit was operated under its normal service plan, and blended coals were fired.

Fig. 3. Site installation of the photoelectric probe.

3.1. Data preparation

In the experiment, pulverized coal was sampled from the primary fuel/air pipe near the burner. Because the pulverized coal was transported very fast by the primary air in the primary pipe, it took very little time, less than a second, for the coal to travel from the sampling device to the burner. Each sampling took about 10 min to collect the pulverized coal. The coal samples were sent to the analysis laboratory for quantitative evaluation of the coal calorific value, and the reported value for each sample was taken as the average over the corresponding sampling period. Meanwhile, raw features of the flame radiation were recorded synchronously. The 3 × 10 raw features of the flame radiation in the ultraviolet, visible and infrared bands, in the time and frequency domains, as introduced in the previous section, were extracted in real time and stored about every 10 s. Pulverized coal was sampled from the primary pipe about every 40 min over two days. The load setting of the unit was changed during the experiment according to the normal electricity generation scheduling of the power plant. A database was created by recording the calorific values of the coal samples collected from the primary fuel/air pipe together with the corresponding raw radiation features, and the regression models for coal calorific value prediction from radiation features were established on this basis.

3.2. Radiation feature analysis

Processed with PCA, ICA and PLSA, the raw radiation features were transformed into new statistical spaces, and new feature components with the same dimension as the raw features were generated. The new feature components had better statistical properties than the raw features. Fig. 4 shows the simple and partial correlation coefficients between the coal calorific value and the raw feature components and the feature components obtained through PCA, ICA and PLSA, respectively.

Fig. 4. Simple and partial correlation coefficients between coal calorific value and (a) raw feature components; (b) feature components obtained by PCA; (c) feature components obtained by ICA; (d) feature components obtained by PLSA.

Fig. 5. Variation of R² with increasing number of feature components in linear regression models (a) using raw features; (b) based on PCA; (c) based on ICA; and (d) based on PLSA, respectively. (Axes: number of feature components, 0 to 30, versus R², 0 to 0.2.)

Fig. 6. Coal calorific value prediction by linear regression models.

The correlation coefficient measures the strength of the linear relationship between variables. Partial correlation coefficients reflect the correlation between the feature components and the coal calorific value more precisely, because the correlations between the feature components are eliminated. Given the random variables $X_1, X_2, \ldots, X_n$, denote the linear approximations of $X_1$ and $X_2$ from the remaining variables $X_3, \ldots, X_n$ by $\hat{X}_{1;3 \ldots n}$ and $\hat{X}_{2;3 \ldots n}$, respectively. The partial correlation coefficient of $X_1$ and $X_2$ is then expressed by

$$ \rho_{1,2;3 \ldots n} = \frac{E\{(Y_1 - E(Y_1))(Y_2 - E(Y_2))\}}{\sqrt{D Y_1 \, D Y_2}} \qquad (9) $$

where $Y_1 = X_1 - \hat{X}_{1;3 \ldots n}$, $Y_2 = X_2 - \hat{X}_{2;3 \ldots n}$, and $D$ denotes the variance.

Fig. 4 shows that the simple and partial correlations of the raw features with the coal calorific value differ greatly, which implies that there are strong correlations among the raw features. This has a negative influence on regression analysis. The simple and partial correlation coefficients between the coal calorific value and the feature components obtained with the statistical approaches, namely PCA, ICA and PLSA, show good consistency. The feature components obtained by PCA and ICA show both positive and negative correlations with the coal calorific value, whereas all the feature components obtained by PLSA show positive correlations. PLSA extracts feature components that carry as much of the variation of the input data as possible and are most correlated with the coal calorific value. As the serial number of the PLSA feature components increases, the strength of their correlation with the coal calorific value tends to decline.
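Eq. (9) can be illustrated with a short NumPy sketch: each of the two variables is regressed on the remaining variables by least squares, and the ordinary correlation coefficient of the two residuals is the partial correlation. The function name `partial_correlation` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def partial_correlation(x1, x2, others):
    """Partial correlation of x1 and x2 given the columns of `others` (Eq. (9))."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    others = np.asarray(others, dtype=float).reshape(len(x1), -1)
    # Linear approximation of x1 and x2 from the remaining variables (with intercept).
    A = np.column_stack([np.ones(len(x1)), others])
    y1 = x1 - A @ np.linalg.lstsq(A, x1, rcond=None)[0]
    y2 = x2 - A @ np.linalg.lstsq(A, x2, rcond=None)[0]
    # Correlation coefficient of the two residuals Y1 and Y2.
    return np.corrcoef(y1, y2)[0, 1]


# Synthetic example: x1 and x2 are related only through the shared variable z.
rng = np.random.default_rng(1)
z = rng.normal(size=2000)
x1 = z + 0.3 * rng.normal(size=2000)
x2 = z + 0.3 * rng.normal(size=2000)
simple_r = np.corrcoef(x1, x2)[0, 1]
partial_r = partial_correlation(x1, x2, z)
```

In this synthetic case the simple correlation is high while the partial correlation is close to zero, which is the kind of gap between the two measures that arises when variables are strongly inter-correlated.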

3.3. Regression modeling

Through regression modeling, we aim to trace the variation of coal calorific value from observations of flame radiation features. In the experiments, the pulverized coal sampled on-site had calorific values ranging from 1.76 × 10^4 kJ/kg to 2.14 × 10^4 kJ/kg. Linear and nonlinear regression models were established, and the performances of the different models were compared.

3.3.1. Linear regression modeling

Linear regression modeling was carried out between the coal calorific value and the raw radiation features, as well as the feature components obtained by PCA, ICA and PLSA, respectively. The squared coefficient of multiple correlation, R², was calculated for each linear model. R² measures the overall strength of the correlation between the model inputs and the output, and is mathematically defined by

$$ R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (10) $$

Here $\hat{y}_i$ and $y_i$ denote the predicted and standard values of the dependent variable, respectively, and $\bar{y}$ is the mean of the standard values. R² equals the fraction of the variance of the dependent variable that can be predicted from the input variables [32]. The greater the value of R², the better the regression model performs.

Fig. 5 shows the variation of R² with increasing number of feature components in the different linear models. The faster R² increases, the more significant the corresponding component is for the prediction. In PLSA-based linear regression, the feature components are ordered by the strength of their correlation with the coal calorific value and by how much of the variance of the input data they reflect. As more feature components were involved in the regression, R² in the PLSA-based linear regression showed the smoothest increase. In PCA-based linear regression, the feature components are ordered by their contribution to the total variance of the radiation features, but the variation of R² shows no definite tendency, as shown in Fig. 5(b). When all the feature components are involved in the regression analysis, all the linear models reach their optimum R² of 0.179 and give identical predictions of coal calorific value, as shown in Fig. 6. Test results for the linear models have a mean absolute error of 576.94 kJ/kg and a standard deviation of 450.08 kJ/kg.

Fig. 7. Variation of R² with increasing number of feature components in nonlinear regression models (a) using raw features; (b) based on PCA; (c) based on ICA; and (d) based on PLSA. (Axes: number of feature components, 0 to 30, versus R², 0 to 1.)

Fig. 8. Coal calorific value predictions from nonlinear regression model using raw features.

Fig. 9. Coal calorific value predictions from PCA and ICA-based nonlinear regression models.

Fig. 10. Coal calorific value predictions from PLSA-based nonlinear regression model.

Table 1. Definitions of mean absolute error, standard deviation of the absolute errors, mean relative error and standard deviation of the relative errors.

| Variables | Definitions |
| --- | --- |
| Mean absolute error | $\frac{1}{N}\sum_{n=1}^{N} \lvert y_n - \hat{y}_n \rvert$ |
| Mean relative error | $\frac{1}{N}\sum_{n=1}^{N} \lvert (y_n - \hat{y}_n)/y_n \rvert$ |
| Standard deviation of absolute errors | $\left[\frac{1}{N-1}\sum_{n=1}^{N} (\delta_n - \bar{\delta})^2\right]^{1/2}$, with $\delta_n = \lvert y_n - \hat{y}_n \rvert$, $\bar{\delta} = \frac{1}{N}\sum_{n=1}^{N} \lvert y_n - \hat{y}_n \rvert$ |
| Standard deviation of relative errors | $\left[\frac{1}{N-1}\sum_{n=1}^{N} (\beta_n - \bar{\beta})^2\right]^{1/2}$, with $\beta_n = \lvert (y_n - \hat{y}_n)/y_n \rvert$, $\bar{\beta} = \frac{1}{N}\sum_{n=1}^{N} \lvert (y_n - \hat{y}_n)/y_n \rvert$ |
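The error statistics of Table 1 and the R² of Eq. (10) can be computed as in the sketch below. The function name `prediction_errors`, the dictionary keys and the three sample values are hypothetical and serve only to show the arithmetic. R² is evaluated in its 1 − SSE/SST form, which coincides with SSR/SST for a least squares fit with an intercept.

```python
import numpy as np


def prediction_errors(y_true, y_pred):
    """Error statistics of Table 1 plus R^2 of Eq. (10), returned as a dict."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)        # delta_n in Table 1
    rel_err = abs_err / np.abs(y_true)       # beta_n in Table 1
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "mean_abs_error": abs_err.mean(),
        "std_abs_error": abs_err.std(ddof=1),  # 1/(N-1) normalization
        "mean_rel_error": rel_err.mean(),
        "std_rel_error": rel_err.std(ddof=1),
        "r2": 1.0 - sse / sst,                 # 1 - SSE/SST form of Eq. (10)
    }


# Hypothetical numbers, for illustration only (not measured calorific values).
demo = prediction_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

Multiplying the relative-error entries by 100 gives the percentages in which the relative errors are reported.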

3.3.2. Nonlinear regression modeling

In this section, nonlinear regression modeling was realized through SVR using the raw features and through SVR combined with PCA, ICA and PLSA, respectively. Fig. 7 shows the variation of R² for all the nonlinear regression models with increasing number of feature components involved in the regression. Fig. 7(a)–(c) shows that, for the nonlinear model with raw features and the models based on PCA and ICA, R² increases steadily with the number of feature components, except at several singular points, and reaches its highest level when all feature components are involved in the regression modeling. The optimum R² is 0.391 for direct nonlinear regression from the raw features and 0.857 for both the PCA- and ICA-based nonlinear regression models. In other words, in nonlinear regression, PCA and ICA improve the model performance to the same extent.

Fig. 7(d) shows that the PLSA-based nonlinear model has the highest precision when the first eleven feature components are involved in the regression. When more components are involved, however, R² slowly decreases. In other words, in the PLSA-based nonlinear regression, the first eleven feature components are the most meaningful for coal calorific value prediction, while the remaining components are only weakly correlated with the coal calorific value. The feature components extracted later contribute little to explaining the variation of coal calorific value, and including them degrades rather than improves the performance of the regression model. The PLSA-based nonlinear model shows the best performance among the four nonlinear models, with an optimum R² of 0.913, which is reached with only about a third of the number of inputs used in the other models.

Coal calorific value predictions using the nonlinear models with optimum values of R² are shown in Figs. 8–10. Denoting the estimate of variable $y$ by $\hat{y}$, the definitions of the mean absolute error, standard deviation of the absolute errors, mean relative error and standard deviation of the relative errors are given in Table 1. The R² values, mean absolute errors, mean relative errors, and standard deviations of the absolute and relative errors for the different nonlinear models are listed in Table 2. The PCA- and ICA-based nonlinear models give identical predictions of coal calorific value, as shown in Fig. 9. Prediction from the nonlinear regression model using the raw features directly shows the worst accuracy, whereas prediction from the PLSA-based nonlinear model shows the highest accuracy, with a mean absolute error, standard deviation of the absolute errors, mean relative error and standard deviation of the relative errors of 148.76 kJ/kg, 291.86 kJ/kg, 0.76% and 1.53%, respectively.

3.4. Comparison of the regression models

In linear regression, the R² values of the models are no more than 0.2; in other words, no more than 20% of the variation of coal calorific value can be predicted by the linear regression models. Fig. 5 shows that each feature component plays an active role in improving the accuracy of coal calorific value prediction, so for all the linear models the best performance is achieved when all the feature components are involved in the regression analysis. In addition, as shown in Fig. 5(d), the PLSA-based linear model has the smoothest increase of R². However, adopting the statistical approaches PCA, ICA and PLSA contributes nothing to the performance of the linear models.

Regression modeling for coal calorific value prediction from multiple radiation features shows that the nonlinear model approximates the relationship between the coal calorific value and the radiation features better than the linear model. All the nonlinear models perform better than the corresponding linear models: the R² of each nonlinear regression model is more than twice that of the corresponding linear regression model. In nonlinear regression, the use of the statistical approaches greatly improved the performance of the regression models. Among all the linear and nonlinear regression models, the PLSA-based nonlinear model shows the best performance, with the highest R² and the lowest error in coal calorific value prediction.

Table 2
Comparison of performances of different nonlinear regression models.

Models               R²      Mean absolute    Standard deviation of     Mean relative   Standard deviation of
                             error (kJ/kg)    absolute errors (kJ/kg)   error (%)       relative errors (%)
PLSA-based           0.913   148.76           291.86                    0.76            1.53
ICA-based            0.857   167.47           332.89                    0.86            1.72
PCA-based            0.857   167.47           332.89                    0.86            1.72
Using raw features   0.391   422.03           464.57                    2.18            2.47


The feature components themselves carry either useful or useless information for coal calorific value prediction. Components that are only weakly related to the coal calorific value make the regression model very sensitive to minor disturbances in the data. When more than 11 components are involved in the PLSA-based nonlinear modeling, the performance of the model no longer improves but instead slowly degrades. The PLSA-based nonlinear model needs only about one third of the feature components to reach its best performance.

Unlike the other statistical methods, PLSA not only regulates the statistical properties of the inputs, but also maximizes the correlations between the feature components and the coal calorific value. In the PLSA iterations for component extraction, once a component has been extracted, the part of the input data matrix that corresponds to that component is removed, the next component is extracted from the remainder of the input data matrix, and so on. Each component is extracted so that it reflects as much of the variation of coal calorific value as possible while also carrying as much information about the variation of the input data as possible. PCA and ICA, in contrast, only orthogonalize the inputs and neglect the relationship between the inputs and the output. This merit makes the PLSA-based nonlinear model the most suitable for coal calorific value prediction from flame radiation features.

It is expected that a coal calorific value lying between the upper and lower bounds that have been tested can be predicted precisely from the combustion radiation features with the established regression model, and the closer the prediction is to the laboratory result, the better. Fig. 10 shows that the variation of coal calorific value is traced well by the PLSA-based nonlinear regression model. Even when the coal calorific value is close to either the upper or the lower bound of its test range, the predicted values still follow the corresponding results of lab analysis closely.
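The extract-and-remove iteration described above can be sketched with the standard single-output NIPALS procedure (PLS1). This is a minimal pure-Python illustration under that assumption and is not necessarily the implementation used in this study; the function name `pls1_components` and the sample data are hypothetical, and the subsequent nonlinear regression on the extracted components is not included.

```python
import math


def pls1_components(X, y, n_components):
    """Extract PLS1 score vectors by NIPALS-style deflation.

    X is a list of rows (samples x features) and y a list of targets.
    Both are mean-centered internally. Returns a list of score vectors.
    """
    n, m = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(m)]
    X = [[row[j] - col_mean[j] for j in range(m)] for row in X]
    y_mean = sum(y) / n
    y = [v - y_mean for v in y]

    scores = []
    for _ in range(n_components):
        # Weight vector: feature-space direction with maximal covariance with y
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that co-varies with y
        w = [v / norm for v in w]
        # Score vector: projection of every sample onto w
        t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings of X and of y on the score vector
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(y[i] * t[i] for i in range(n)) / tt
        # Deflation: remove the part already explained by this component
        X = [[X[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        y = [y[i] - q * t[i] for i in range(n)]
        scores.append(t)
    return scores


# Hypothetical data: 5 samples, 3 radiation features
X = [[1.0, 2.0, 0.5],
     [2.0, 1.0, 1.5],
     [3.0, 4.0, 1.0],
     [4.0, 3.0, 2.5],
     [5.0, 6.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = pls1_components(X, y, n_components=2)
```

Because each weight vector is computed from the covariance with the output, later components relate progressively less to the calorific value, which is consistent with the observation that components beyond the first eleven do not help.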

4. Conclusion

Linear and nonlinear regression models for coal calorific value prediction were presented. Regression analyses show that the nonlinear model approximates the relationship between coal calorific value and radiation features better than the linear model. Multiple statistical approaches, including PCA, ICA and PLSA, were adopted in regression modeling. The results show that adopting the statistical approaches did not improve model performance in linear regression. In nonlinear modeling, however, the use of the statistical approaches greatly improved the performance of the regression models. Among all linear and nonlinear regression models, the PLSA-based nonlinear model shows the best performance.

All the other regression models perform best with all the feature components involved in the regression, while the PLSA-based regression model performs best when only the first eleven feature components are involved. The remaining feature components obtained by PLSA have weak correlations with the variation of coal calorific value, and the performance of the model is degraded when these components are involved in the regression. The PLSA-based nonlinear model reduces the dimension of the inputs to about one third of that in the other models. Compared with the PCA-based and ICA-based nonlinear models, the PLSA-based nonlinear model performs best, with a coefficient of multiple regression as high as 0.913. Coal calorific value prediction with the PLSA-based nonlinear regression model has a mean absolute error, standard deviation of the absolute errors, mean relative error and standard deviation of the relative errors of 148.76 kJ/kg, 291.86 kJ/kg, 0.76% and 1.53%, respectively.

The results verify that the flame radiation features have a definite relationship with coal calorific value, and that the nonlinear regression models developed in this paper predict coal calorific value better than the linear regression models. In addition, in-situ installation of the measurement system is convenient, with no particular requirements for operating conditions. The online coal calorific value prediction system is especially suitable for boiler applications in harsh industrial environments, which impose strict measurement constraints for safety reasons. Online monitoring of coal calorific value could help improve the energy conversion efficiency of coal-fired boilers by providing a reliable parameter for coal/air regulation, and could provide a supplementary tool for boiler engineering.

Acknowledgements

The authors gratefully acknowledge the financial support from the National Natural Science Foundation of China under Grant Nos. 60972087, 61225006 and 61327011.
