Prediction intervals based soft sensor development using fuzzy information granulation and an improved recurrent ELM


Chemometrics and Intelligent Laboratory Systems 195 (2019) 103877

Contents lists available at ScienceDirect

Chemometrics and Intelligent Laboratory Systems journal homepage: www.elsevier.com/locate/chemometrics

Yuan Xu a,b,c, Han Jiang a,b, Wei Zhang a,b, Abbas Rajabifard c, Nengcheng Chen d, Yiqun Chen c, Yanlin He a,b,*, Qunxiong Zhu a,b,**

a College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, PR China
b Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing, 100029, PR China
c Center for SDI and Land Administration, Department of Infrastructure Engineering, The University of Melbourne, VIC, 3010, Australia
d State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, Wuhan, Hubei, 430079, PR China

ARTICLE INFO

Keywords: Prediction intervals; Soft sensor; Fuzzy information granulation; Extreme learning machine; Purified terephthalic acid

ABSTRACT

With the increasing complexity of large-scale industrial production processes, the number of process variables keeps growing, which makes accurate prediction of key process variables demanding. Currently, most soft sensor models using support vector regression and artificial neural networks are based on point prediction. Such models can only track or fit set values, so it is difficult to handle system uncertainty or to perform reliability analysis with them. To address this problem, this paper proposes a soft sensor development method based on prediction intervals. Prediction intervals, instead of point predictions, are used to describe the stable operation of the industrial process, and the interval boundaries of the trend change are used to quantify and estimate the associated uncertainty. The proposed soft sensor is based on fuzzy information granulation and an improved recurrent extreme learning machine. First, fuzzy information granulation is adopted to obtain the lower bound, trend and upper bound of the interval. Second, an improved recurrent extreme learning machine is built to further enhance the prediction of the intervals. In the improved model, a feedback layer stores the hidden layer output, calculates the trend change of the data and dynamically updates the outputs of the feedback layer. Third, a comprehensive interval evaluation function is used to evaluate the rationality of the interval results. Case studies on a University of California Irvine dataset and the purified terephthalic acid solvent system show that the proposed method can directly generate the upper and lower bounds of key process variables with high accuracy.

1. Introduction

Nowadays, with the growing complexity of industrial processes, the diversity of control variables and rising product quality requirements, there is a high demand for accurate prediction of key process variables [1–3]. In actual production, most systems are non-linear, and no quantitative relationship between inputs and outputs is available. Soft sensors are therefore often used to estimate process variables [4,5]. In the past, soft sensor models were mostly based on point prediction of variables [6]. However, a point prediction method yields only a single value, so uncertainty information cannot be handled well. Prediction intervals (PIs), in contrast, provide a range bounded by upper and lower limits within which the target falls at a given confidence level. PIs predict not only future trends but also the associated uncertainty, so system reliability and security can be improved with PIs based prediction. PIs are considered an important requirement for decision management in complex industrial processes, since they quantify the related uncertainty by forecasting the bounds of the trend changes [7–9]. Recently, soft sensor development for complex processes has attracted increasing attention from factories and enterprises. Therefore, apart from point prediction based soft sensors, interval prediction based soft sensors also require significant attention for future development and applications [10–12].

* Corresponding author. College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, PR China. ** Corresponding author. College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, PR China. E-mail addresses: [email protected] (Y. He), [email protected] (Q. Zhu). https://doi.org/10.1016/j.chemolab.2019.103877 Received 5 August 2019; Received in revised form 17 October 2019; Accepted 23 October 2019 Available online 25 October 2019 0169-7439/© 2019 Elsevier B.V. All rights reserved.

PIs approaches have attracted much attention from researchers [13–15]. For example, an extreme learning machine (ELM) model with self-feedback, combined with the particle swarm optimization algorithm, was proposed to construct high quality intervals robustly and accurately [16,17]. The most commonly known PIs techniques include the delta [18], Bayesian [19], mean–variance estimation (MVE) [20] and bootstrap [21] methods. However, these methods may incur a heavy computational cost or need assumptions on the data distribution to construct the intervals. With the development of machine learning, artificial neural networks (ANNs) have become very popular [22–25], because ANNs can fit non-linear functions well. ELM [26] has been widely used as a non-parametric tool for directly obtaining PIs, with good generalization performance and fast learning capability [27]. In addition, to obtain higher prediction accuracy, an objective function including the prediction interval coverage percentage and the average width percentage is constructed and optimized to assess the performance of PIs [28,29]. However, ELM ignores the dynamic features of processes, so prediction intervals obtained with ELM are less reliable. Currently, fuzzy algorithms are studied extensively in the field of interval prediction, since they require neither a prior distribution nor the support of an optimization algorithm to construct PIs [30,31]. Fuzzy information granulation (FIG) [32] is based on fuzzy logic rules and the information granulation method. FIG separates the original data into intervals, obtains the upper bound, lower bound and trend of each interval, and achieves sample compression and information extraction by replacing the initial time series with these parameters.

There are many methods for data processing and data classification. However, it is difficult to divide data into intervals that can be visualized on the basis of rules. Moreover, the traditional ELM algorithm often gives unstable results and loses dynamic characteristics because the input weights are set randomly, so ELM cannot describe an industrial production process in a sequential manner. Therefore, in this paper, a novel prediction interval approach is proposed based on FIG and an improved recurrent ELM (IRELM). Firstly, the FIG method is used to fuzzify the historical time sequence window by window, and then the upper, lower and trend components are generated. Secondly, an IRELM model is presented to predict these three components. In the IRELM model, a feedback layer connected to the hidden layer is added to the original ELM. The feedback layer stores the hidden layer outputs, and its output is dynamically updated using the rate of the trend change. Thirdly, the prediction interval coverage width-based criterion (PICWC) is utilized to assess the rationality of the intervals. In this way, the proposed soft sensor based on PIs using FIG and IRELM (FIG-IRELM) is developed. To test the performance of FIG-IRELM, a dataset from the University of California Irvine (UCI) repository and a dataset collected from the purified terephthalic acid (PTA) solvent system are selected. Compared with other models, the simulation results indicate that the proposed FIG-IRELM can obtain PIs with higher quality and achieve better performance.

The remaining parts of this paper are organized as follows: Section 2 briefly introduces the background of fuzzy information granulation and the extreme learning machine; the proposed prediction intervals based soft sensor development method is provided in Section 3; Section 4 gives some interval evaluation indexes; case studies and result analyses are presented in Section 5; finally, Section 6 provides conclusions.

Fig. 1. Triangular membership function.

Fig. 2. The structure of IRELM network.

Fig. 3. Flowchart of the proposed PIs using FIG-IRELM.

Fig. 4. Visualization of the FIG.

Fig. 5. Comparison results of prediction performance for upper bound.

Table 1 The selected variables of the Combined Cycle Power Plant.

Input  Variable description
1      Relative Humidity (RH)
2      Ambient Pressure (AP)
3      Average ambient Temperature (AT)
4      Exhaust Vacuum (V)

2. Related works

2.1. Fuzzy information granulation

Information granulation (IG) computation is widely applied to simulate human thinking and solve complex problems. It divides a whole into several parts by similarity, function approximation and distinction; each part is an information granule. FIG handles uncertain knowledge by using fuzzy theory. The FIG technology is based on methods of reasoning, decision-making and recognition, and it deals with problems in a way that is close to human thinking. Therefore, FIG is regarded as a prototype of machine intelligence (MI) for making reasonable decisions in an environment of uncertain knowledge. The original model of FIG is expressed as G(D, [0, 1], F), where D is the domain and F is the map from D to [0, 1]. FIG divides the time series by partitioning it into windows and fuzzifying each window. Firstly, the time series is regarded as an operation window, denoted as D, and a fuzzy granule P is established on D. The granule can describe the fuzzy concept G of D comprehensively. Thus, the fuzzy concept G is determined as:

\[ g \triangleq (x \in G) \tag{1} \]


Fig. 6. Comparison results of prediction performance for trend bound.

Fig. 7. Comparison results of prediction performance for lower bound.

Table 2 Comparison results of prediction evaluation indexes.

Component  Index  BP       ELM         IRELM
Upper      RMSE   2.5564   0.0387      0.0386
Upper      MAPE   0.0370   6.1877e-05  5.3668e-05
Trend      RMSE   9.8156   4.5572      2.3560
Trend      MAPE   0.0141   0.0052      0.0041
Lower      RMSE   11.1119  6.8668      3.2575
Lower      MAPE   0.0193   0.0102      0.0057

The commonly used fuzzy membership functions of a fuzzy granule include the triangular and Gaussian functions, among others. The triangular membership function is expressed as follows:

\[ f(x; a, d, b) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{d - a}, & a \le x \le d \\ \dfrac{b - x}{b - d}, & d < x \le b \\ 0, & x > b \end{cases} \tag{2} \]

where a, d and b are the parameters.
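As a small illustration of the triangular membership function in Eq. (2), the following Python sketch evaluates f(x; a, d, b) for a single value. The function name `triangular_membership` and the requirement a < d < b are assumptions made for this example; they are not taken from the paper.

```python
def triangular_membership(x, a, d, b):
    """Triangular membership f(x; a, d, b) in the piecewise form of Eq. (2).

    a: lower bound, d: kernel (peak), b: upper bound.
    Assumption for this sketch: a < d < b, so no division by zero occurs.
    """
    if not a < d < b:
        raise ValueError("expected a < d < b")
    if x < a or x > b:
        return 0.0
    if x <= d:
        return (x - a) / (d - a)  # rising edge on [a, d]
    return (b - x) / (b - d)      # falling edge on (d, b]


if __name__ == "__main__":
    for value in (0.0, 1.0, 1.5, 2.0, 3.0, 5.0):
        print(value, triangular_membership(value, a=1.0, d=2.0, b=4.0))
```

The membership is 1 at the kernel d and falls linearly to 0 at a and b, which is the shape referred to in Fig. 1.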

2.2. Extreme learning machine

ELM with a single hidden layer has been widely applied in many fields, such as regression and classification. One attractive feature of ELM is its extremely fast training speed. Consider N different samples (X_i, T_i) ∈ R^m × R^n, where X_i is the m × 1 input vector and T_i is the n × 1 expected output vector. L hidden layer nodes are assigned in the ELM. With the current N samples, the predictions of ELM can be represented as follows:

\[ \sum_{i=1}^{L} \beta_i \, g(W_i \cdot X_j + b_i) = o_j, \quad j = 1, \cdots, N \tag{3} \]

where g(x) represents the activation function assigned to the hidden layer nodes (usually the sigmoid function), W_i represents the weights connecting the input layer and the ith hidden node, β_i represents the weights connecting the ith hidden node and the output layer, and b_i represents the bias assigned to the ith hidden node. Eq. (3) can be re-written as follows:

\[ H\beta = T \tag{4} \]

where H represents the matrix of hidden layer output, which can be represented as follows:

\[ H = \begin{bmatrix} H(x_1) \\ \vdots \\ H(x_n) \end{bmatrix} = \begin{bmatrix} g(\omega_{11} \cdot x_1 + b_1) & \cdots & g(\omega_{1n} \cdot x_1 + b_n) \\ \vdots & \ddots & \vdots \\ g(\omega_{n1} \cdot x_n + b_1) & \cdots & g(\omega_{nn} \cdot x_n + b_n) \end{bmatrix}_{n \times n} \tag{5} \]

\[ \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}_{n \times 1}, \quad T = \begin{bmatrix} T_1 \\ T_2 \\ \vdots \\ T_n \end{bmatrix}_{n \times 1} \tag{6} \]

The output weights β can be obtained using least squares:

\[ \beta = H^{\dagger} T \tag{7} \]

\[ H^{\dagger} = G^{H} (G G^{H})^{-1} (F^{H} F)^{-1} F^{H} \tag{8} \]

where H† represents the Moore–Penrose pseudoinverse of the matrix H. In Eq. (8), the superscript H denotes the conjugate transpose, and F and G are the factors of a full-rank decomposition H = FG. ELM performs well in prediction problems. However, structural risk is not considered in ELM modeling, which may lead to insufficient generalization ability and stability. In addition, the random initialization in the algorithm can lead to low prediction accuracy.

Fig. 8. The result of prediction performance for BP.

Fig. 9. The result of prediction performance for ELM.

Fig. 10. The result of prediction performance for IRELM.

Table 3 Comparison results of prediction evaluation indexes.

Index  BP       ELM      IRELM
PICP   0.8977   0.9205   0.9545
MPIW   49.7674  47.0703  46.9150
NMPIW  0.7215   0.1529   0.13668
PICWC  10.5696  5.8079   1.1837

Fig. 11. Flowchart of PTA solvent system.

Table 4 The selected variables of PTA solvent system.

Input  Variable description
1      Reflux temperature
2      Feed quantity
3      Temperature point between the 44th tray and 50th tray
4      Water reflux
5      Temperature point above the 35th tray
6      Tray temperature near the up sensitive plate
7      Produced quantity of top tower
8      Temperature point between the 35th tray and 40th tray

Fig. 12. Visualization of the FIG.

3. Proposed FIG-IRELM PIs

The proposed FIG-IRELM PIs method consists of FIG-based time series division and IRELM prediction. Suppose that the data series is X = (x_1, x_2, …, x_n). Each subinterval produced by the FIG-based division is called a window. A suitable window size preserves the data information while simplifying the time series and making it easier to study. Then, the windows are fuzzified to generate a number of fuzzy granules that replace the related information. The triangular fuzzy membership function is used for fuzzification, and the kernel d of the triangular fuzzy set is determined.

\[ f(x; a, d, b) = \max\left( \min\left( \frac{x - a}{d - a}, \frac{b - x}{b - d}, 1 \right), 0 \right) \tag{9} \]

The triangular membership function is shown in Fig. 1. The lower bounds for triangular fuzzy sets are defined by Eq. (10).
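The window partitioning and the max-min form of the membership function in Eq. (9) can be sketched as follows. This is only an illustrative sketch: the helper names `partition_windows` and `membership_max_min` are hypothetical, and the choice to drop a trailing incomplete window is an assumption, since the text does not say how leftover points are handled.

```python
def partition_windows(series, size):
    """Split a series into consecutive, non-overlapping windows of `size` points.

    Assumption for this sketch: a trailing window with fewer than `size`
    points is dropped.
    """
    if size < 1:
        raise ValueError("size must be positive")
    count = len(series) // size
    return [list(series[k * size:(k + 1) * size]) for k in range(count)]


def membership_max_min(x, a, d, b):
    """Max-min form of the triangular membership, Eq. (9); assumes a < d < b."""
    rising = (x - a) / (d - a)    # increases from 0 at a to 1 at d
    falling = (b - x) / (b - d)   # decreases from 1 at d to 0 at b
    return max(min(rising, falling, 1.0), 0.0)


if __name__ == "__main__":
    print(partition_windows([1, 2, 3, 4, 5, 6, 7], 3))
    print(membership_max_min(3.0, a=1.0, d=2.0, b=4.0))
```

For a < d < b this form gives the same values as the piecewise definition of the triangular function, because the smaller of the two linear branches is always the active one and the outer `max` clips negative values to zero.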


Fig. 13. Comparison results of prediction performance for upper bound.

Fig. 14. Comparison results of prediction performance for trend bound.

Fig. 15. Comparison results of prediction performance for lower bound.


Table 5 Comparison results of prediction evaluation indexes.

Component  Index  BP      ELM     IRELM
Upper      RMSE   0.2585  0.1891  0.1049
Upper      MAPE   0.0033  0.0028  0.0018
Trend      RMSE   0.4442  0.2257  0.1046
Trend      MAPE   0.006   0.0025  0.0015
Lower      RMSE   0.2999  0.1354  0.0734
Lower      MAPE   0.0046  0.0015  0.0012

The lower bound a of a triangular fuzzy set is obtained by maximizing:

\[ \max F(a) = \frac{\sum_{x_i \le d} f(x_i)}{d - a}, \qquad d = \begin{cases} x_{n/2}, & n \text{ is even} \\ x_{(n+1)/2}, & n \text{ is odd} \end{cases} \tag{10} \]

Similarly, the upper bounds are defined as follows:

\[ \max F(b) = \frac{\sum_{x_i \ge d} f(x_i)}{b - d} \tag{11} \]

The established fuzzy granules are described as follows:

\[ P = f(a, d, b), \qquad \begin{cases} f(a) = 0 \\ f(d) = 1 \\ f(b) = 0 \end{cases} \tag{12} \]

The parameters a, d and b are the lower value, trend value and upper value, respectively. The lower parameter a describes the minimum value of the sequence change, the trend parameter d describes the trend level, and the upper parameter b describes the maximum value.

For the IRELM prediction model, a four-layer network structure is constructed by adding a feedback layer, which dynamically memorizes the hidden layer output. The structure of IRELM is shown in Fig. 2. Suppose that the memorized samples for the Ith feedback layer are g(k − I); the output weight W_f of the feedback layer is defined as follows:

\[ W_{fi} = \left[ W_{f1}^{N}, W_{f2}^{N}, \ldots, W_{fn}^{N} \right], \quad W_{fi} \in [0, 1] \tag{13} \]

Among them, the weight of the Ith layer is the Nth power of the output weight W_f, which makes the feedback layer update constantly. The matrix of the feedback layer, H', is given by Eq. (14).
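For the granulation step, the kernel and the two bound criteria of Eqs. (10) and (11) can be evaluated for candidate bounds with the short sketch below. The function names are hypothetical, the indices in the kernel rule are assumed to be 1-based positions in the window sorted in ascending order, and the sketch only evaluates F(a) and F(b); how the maximization is constrained or solved is not specified here and is left out.

```python
def kernel(window):
    """Kernel d as in Eq. (10): x_{n/2} for even n, x_{(n+1)/2} for odd n.

    Assumption: indices are 1-based and refer to the window sorted ascending.
    """
    ordered = sorted(window)
    n = len(ordered)
    if n == 0:
        raise ValueError("empty window")
    position = n // 2 if n % 2 == 0 else (n + 1) // 2
    return ordered[position - 1]


def lower_criterion(window, a):
    """F(a) in Eq. (10): summed memberships of points in [a, d], divided by d - a."""
    d = kernel(window)
    if a >= d:
        raise ValueError("a must lie below the kernel d")
    # Points below a have zero membership and therefore add nothing to the sum.
    total = sum((x - a) / (d - a) for x in window if a <= x <= d)
    return total / (d - a)


def upper_criterion(window, b):
    """F(b) in Eq. (11): summed memberships of points in [d, b], divided by b - d."""
    d = kernel(window)
    if b <= d:
        raise ValueError("b must lie above the kernel d")
    # Points above b have zero membership and therefore add nothing to the sum.
    total = sum((b - x) / (b - d) for x in window if d <= x <= b)
    return total / (b - d)


if __name__ == "__main__":
    window = [3.0, 1.0, 2.0]
    print(kernel(window), lower_criterion(window, 0.0), upper_criterion(window, 4.0))
```

Both criteria reward bounds that keep many points with high membership while staying close to the kernel, which is the trade-off between coverage and specificity that a granule is meant to capture.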

Fig. 16. Comparison of prediction results.

Fig. 17. The result of prediction performance for BP.

Fig. 18. The result of prediction performance for ELM.

Fig. 19. The result of prediction performance for IRELM.

Table 6 Comparison results of prediction evaluation indexes.

Index  BP       ELM     IRELM
PICP   0.7976   0.8571  0.9286
MPIW   0.7864   0.6671  0.4396
NMPIW  0.3410   0.2781  0.2008
PICWC  57.3135  29.152  0.7870

The matrix of the feedback layer is:

\[ H' = \sum_{i=1}^{I} \eta_i W_{fi} b_i \cdot g(X(k - i)), \quad i \in [1, I] \tag{14} \]

\[ \eta_i = [\mu_1, \mu_2, \ldots, \mu_n] \tag{15} \]

where η_i represents the change rate of the Ith feedback layer.

\[ \mu_n = \frac{g(a_i \cdot x_{k-i+1} + b_i) - g(a_i \cdot x_{k-i} + b_i)}{C(k - i + 1) - C(k - i)} \tag{16} \]

where C(⋅) represents a unit time and μ_n is the change rate of the nth node in the feedback layer. The output matrix of the new hidden layer, H_new, is modified as follows:

\[ H_{new} = H + H' = \begin{bmatrix} g(w_{11} \cdot x_1 + b_1) + H'_1 & \cdots & g(w_{1n} \cdot x_1 + b_n) + H'_n \\ \vdots & \ddots & \vdots \\ g(w_{n1} \cdot x_n + b_1) + H'_1 & \cdots & g(w_{nn} \cdot x_n + b_n) + H'_n \end{bmatrix}_{n \times n} \tag{17} \]

On this basis, this paper introduces regularization, which improves the generalization ability, structural stability and prediction accuracy of the algorithm to a certain extent. The objective function of the IRELM algorithm is:

\[ \min_{\beta} \text{IRELM} = \min \left( \frac{\eta}{2} \|\mu\|^2 + \frac{1}{2} \|\gamma\|^2 \right) \tag{18} \]

where η is the regularization coefficient and μ is the sum of the training errors. ||μ||² and ||γ||² are the risk terms, representing the empirical risk and the structural risk, respectively. The output weight matrix is then calculated by Eq. (19).
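To make the role of the regularization term more concrete, the sketch below trains a plain single-hidden-layer ELM with ridge-regularized output weights in pure Python. It is not the IRELM of this paper: the feedback layer and the change-rate update are omitted, a scalar target is assumed, and the closed form (HᵀH + I/λ)⁻¹Hᵀt is the standard regularized least-squares solution assumed for this example. All function names, the weight range [-1, 1] and the default λ are hypothetical choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(X, W, b):
    """Hidden layer matrix: H[j][i] = g(W_i . X_j + b_i)."""
    return [[sigmoid(sum(w * x for w, x in zip(W[i], xj)) + b[i])
             for i in range(len(W))] for xj in X]


def solve(A, y):
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def train_elm(X, t, n_hidden, lam=1e3, seed=0):
    """Random input weights and biases, then ridge-regularized output weights."""
    rng = random.Random(seed)
    m = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = hidden_output(X, W, b)
    rows = range(len(X))
    # Normal equations of the ridge problem: (H^T H + I / lam) beta = H^T t
    A = [[sum(H[j][p] * H[j][q] for j in rows) + (1.0 / lam if p == q else 0.0)
          for q in range(n_hidden)] for p in range(n_hidden)]
    rhs = [sum(H[j][p] * t[j] for j in rows) for p in range(n_hidden)]
    return W, b, solve(A, rhs)


def predict_elm(X, W, b, beta):
    return [sum(h * w for h, w in zip(row, beta)) for row in hidden_output(X, W, b)]
```

A call such as `W, b, beta = train_elm(X, t, n_hidden=6)` followed by `predict_elm(X, W, b, beta)` fits and evaluates the model. Only the output weights are solved for; the input weights and biases stay at their random values, which is what makes ELM training fast and also what makes its results depend on the random initialization.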

The output weight matrix is:

\[ \gamma = \left( H_{new}^{T} H_{new} + \frac{I}{\lambda} \right)^{-1} H_{new}^{T} T \tag{19} \]

Finally, the fitted regression model is obtained:

\[ y = \sum_{i=1}^{L} \gamma_i \, g(\omega_i x + b_i) \tag{20} \]

The basic steps of IRELM are shown above. The model incorporates a regularization coefficient to balance the empirical risk and the structural risk. During this process, every three adjacent data points in the original data are taken as a window for generating information granules, and the windows are then fuzzified. The results are divided into upper, lower and trend components, each of which is fed to the IRELM network model for time series prediction. The flowchart of the proposed FIG-IRELM method is shown in Fig. 3.

4. Interval evaluation indexes

The prediction interval coverage probability (PICP) and normalized mean prediction interval width (NMPIW) are mainly used to evaluate the validity, rationality and accuracy of the interval prediction model.

4.1. Prediction interval coverage probability

The PICP indicates the probability that the true value is contained in the prediction interval. The larger the PICP, the better the interval prediction.

\[ \text{PICP} = \frac{1}{n} \sum_{i=1}^{n} c_i \tag{21} \]

where n represents the number of samples in the testing set and c_i is a logical value equal to 0 or 1, defined as follows:

\[ c_i = \begin{cases} 1, & y_i \in [L(X_i), U(X_i)] \\ 0, & y_i \notin [L(X_i), U(X_i)] \end{cases} \tag{22} \]

where L(X_i) and U(X_i) represent the lower limit and the upper limit of the prediction interval at the ith moment, respectively.

4.2. Normalized mean prediction interval width

NMPIW is an indicator of the width of PIs. Intervals that are too wide make predictions meaningless, and intervals that are too narrow lead to inaccurate predictions. NMPIW is calculated as follows:

\[ \text{MPIW} = \frac{1}{n} \sum_{i=1}^{n} \left( U(X_i) - L(X_i) \right) \tag{23} \]

where n represents the number of testing samples, and L(X_i) and U(X_i) are the lower and upper limits of the prediction interval at the ith moment, respectively.

\[ \text{NMPIW} = \frac{\text{MPIW}}{R} \tag{24} \]

where R is the range (the difference between the maximum and the minimum) of the actual observations in the testing set.

4.3. Prediction interval coverage width-based criterion (PICWC)

A well constructed interval is narrow enough and has large enough coverage at the same time. An indicator named PICWC measures the synergy between PICP and NMPIW:

\[ \text{PICWC} = \text{NMPIW} \cdot \left( 1 + \alpha(\text{PICP}) \cdot e^{-\beta(\text{PICP} - \gamma)} \right) \tag{25} \]

In the above formula, α(PICP) is defined as follows:

\[ \alpha(\text{PICP}) = \begin{cases} 1, & \text{PICP} - \gamma < 0 \\ 0, & \text{PICP} - \gamma \ge 0 \end{cases} \tag{26} \]

where α(PICP) takes the value 0 or 1 and determines the overall trend of NMPIW and PICWC. Here γ is the coverage level against which the PICP is compared and β scales the exponential penalty; they are parameters of the criterion, not the output weights of Sections 2 and 3. Because of the exponential term, if the PICP is greater than or close to γ, the PICWC approaches or equals the NMPIW. In this way, the interval width is kept small on the basis of ensuring the coverage of the interval.

5. Case study

The proposed FIG-IRELM prediction method is applied to the Combined Cycle Power Plant data set from UCI and to the PTA solvent system for performance verification.

5.1. Case study on the Combined Cycle Power Plant data set

To validate the effectiveness of the proposed FIG-IRELM algorithm, the Combined Cycle Power Plant Data Set from UCI is first selected. The data set contains 4 input variables: the Relative Humidity (RH), the Ambient Pressure (AP), the hourly average ambient Temperature (AT) and the Exhaust Vacuum (V). These four input factors are used to predict the net hourly electrical energy output (EP). The four variables are listed in Table 1. The Combined Cycle Power Plant Data Set is treated as a time series, i.e., a function of time t. First of all, this paper sets the time window to 5 and uses 450 sets of data for fuzzy information granulation. The results are divided into upper, lower and trend components, which are visualized in Fig. 4. Secondly, the upper, lower and trend components obtained by FIG are predicted using BP, ELM and IRELM, respectively; the predicted results are shown in Figs. 5–7. Two error indexes, the root mean square error (RMSE) and the mean absolute percentage error (MAPE), are adopted to test the performance of the proposed method. Their mathematical expressions are given as follows:

\[ \text{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{y(t) - \tilde{y}(t)}{\tilde{y}(t)} \right| \tag{27} \]

\[ \text{RMSE} = \sqrt{ \frac{1}{N - 1} \sum_{t=1}^{N} \left| y(t) - \tilde{y}(t) \right|^2 } \tag{28} \]

where y(t) and ỹ(t) represent the expected value and the predicted value, respectively, and N is the number of samples. From Figs. 5–7, it can be seen that the proposed PIs model based on FIG and IRELM has better prediction accuracy than the traditional ELM and BP network models. The evaluation indexes are shown in Table 2.

To verify the superiority of the proposed model in interval prediction, BP, ELM and IRELM are used as regression models to obtain the upper, lower and trend components of the Combined Cycle Power Plant data. The interval prediction results are shown in Figs. 8–10, respectively. From Figs. 8–10, it can be seen that the proposed prediction model gives better prediction results than the traditional ELM and BP network models. Compared with the other two methods, the proposed PIs method covers the real values effectively, and the fluctuation of the upper and lower limits follows the trend of the real values well. In addition, owing to the amount of training data, the BP neural network easily falls into a local minimum and its upper and lower bounds are reversed. By adding a feedback layer, the over-fitting and dynamic feature problems can be effectively addressed, and regularization is added to improve prediction accuracy and structural stability. The evaluation indicators are shown in Table 3. As seen from the comparison results in Tables 2 and 3, the FIG-IRELM method proposed in this paper is superior to the other methods in point prediction accuracy. At the same time, the proposed method has better interval coverage and a better PICWC than the other traditional methods.

5.2. Case study on the PTA solvent system

The structure of the PTA solvent system is shown in Fig. 11. The PTA process is complex and is mainly made up of three parts: the solvent dehydration tower, the recovery unit of N-Butyl Acetate (NBA) and the recycle unit of NBA. These three parts work together to produce the purified terephthalic acid. In the PTA production process, the acetic acid consumption is an important process variable. The reaction is mainly produced by the oxidation of acetic acid. However, the acetic acid consumption at the top of the tower is difficult to measure directly; that is, it is a key process variable that is difficult to measure. Generally, the acetic acid consumption at the top of the tower can be calculated indirectly by measuring the top tower conductivity [22,23]. For the PTA process, the 8 input factors most related to the top tower conductivity are selected; they are listed in Table 4. The top tower conductivity is selected as the output variable.

Based on the proposed FIG-IRELM method, the data set is first divided into windows. In this process, every three adjacent data points in the original data are taken as a window for generating information granules, and the windows are then fuzzified. The results are divided into the upper, lower and trend components of the conductivity, which are visualized in Fig. 12. Secondly, the upper, lower and trend components obtained by FIG are predicted using BP, ELM and IRELM, respectively; the predictions are shown in Figs. 13–15. From Figs. 13–15, the proposed PIs model using FIG and IRELM achieves better prediction accuracy than the traditional BP network and the ELM model.

The BP neural network easily falls into a local minimum and requires a lot of training data. The ELM prediction model converges better than the BP model, but it is prone to over-fitting. By adding a feedback layer, the IRELM model can address the over-fitting and dynamic feature problems, and the prediction accuracy is thereby improved. The evaluation indexes are shown in Table 5, and the final prediction results are shown in Fig. 16.

To validate the superiority of the proposed model in interval prediction, the BP, ELM and IRELM models are used as regression models to obtain the upper, lower and trend components of the conductivity. The interval prediction results are shown in Figs. 17–19. From Figs. 17–19, the proposed prediction method achieves better prediction results than the traditional BP network and the ELM model. Compared with the other two methods, the proposed PIs method covers the real values effectively, and the fluctuation of the upper and lower limits follows the trend of the real values well. In addition, owing to the amount of training data, the BP neural network easily falls into a local minimum and its upper and lower bounds are reversed. By adding a feedback layer to ELM, the problem of over-fitting can be well addressed and the prediction accuracy is improved. The evaluation indicators are shown in Table 6. The simulation results in Table 6 indicate that the proposed method gives the best comprehensive evaluation, with the largest coverage probability and the smallest prediction interval width among the three models.

6. Conclusions

In this paper, a novel PIs approach for soft sensors is proposed based on FIG and IRELM. Most current soft sensor models are based on a specific predicted value. If the change interval of key process variables can be predicted, the range of variation of the variables is reflected and the future fluctuation range can be provided to the operators. In the proposed method, an improved extreme learning machine model with an added feedback layer is adopted to predict the outputs, and FIG is used to estimate the parameters of the interval. Case studies on a University of California Irvine dataset and the purified terephthalic acid solvent system show that the proposed prediction intervals method can directly generate the upper and lower bounds of key process variables with high accuracy. The simulation results indicate that the proposed PIs method has a larger coverage probability and a narrower width than the other models. In future research, the effects of different time window widths on model predictions, the cross-over effects between different time window widths, and the effects of different membership functions on model predictions can be considered. In this way, more appropriate membership functions and time windows can be designed for different time series data. An appropriate time window and membership function can greatly reduce the error and yield stable prediction models.
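As a supplementary illustration of the interval evaluation indexes of Section 4 (PICP, MPIW, NMPIW and PICWC), the following sketch computes them from lists of targets and interval bounds. The function names are hypothetical, and the values of `beta` and `gamma` must be supplied by the caller because the text does not report the values used in the case studies.

```python
import math


def picp(y, lower, upper):
    """Coverage probability: share of targets inside [L(X_i), U(X_i)]."""
    hits = sum(1 for yi, lo, up in zip(y, lower, upper) if lo <= yi <= up)
    return hits / len(y)


def mpiw(lower, upper):
    """Mean prediction interval width."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


def nmpiw(y, lower, upper):
    """MPIW normalized by the range R of the observed targets."""
    target_range = max(y) - min(y)
    return mpiw(lower, upper) / target_range


def picwc(y, lower, upper, beta, gamma):
    """Coverage width-based criterion with the 0/1 switch alpha(PICP)."""
    coverage = picp(y, lower, upper)
    alpha = 1.0 if coverage - gamma < 0 else 0.0
    penalty = math.exp(-beta * (coverage - gamma))
    return nmpiw(y, lower, upper) * (1.0 + alpha * penalty)


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    lower = [0.5, 1.5, 3.5, 3.0]
    upper = [1.5, 2.5, 4.5, 5.0]
    print(picp(y, lower, upper), mpiw(lower, upper), nmpiw(y, lower, upper))
    print(picwc(y, lower, upper, beta=50.0, gamma=0.7))
```

When the coverage reaches `gamma`, the switch is 0 and the criterion reduces to the normalized width; below `gamma`, the exponential term inflates the score, so narrow intervals with poor coverage are penalized.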

Declaration of competing interest

The authors have declared that no conflict of interest exists.

Acknowledgments

This work is supported by grants from the National Natural Science Foundation of China under Grant Nos. 61573051 and 61703027, the China Scholarship Council State-Sponsored Scholarship Program (Grant No. 201806885004), the Open Research Fund of State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University (Grant No. 18I01), and the Fundamental Research Funds for the Central Universities under Grant Nos. XK1802-4 and JD1914.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.chemolab.2019.103877.

References

[1] Y.L. He, Z.Q. Geng, Q.X. Zhu, Soft sensor development for the key variables of complex chemical processes using a novel robust bagging nonlinear model integrating improved extreme learning machine with partial least square, Chemometr. Intell. Lab. Syst. 151 (2016) 78–88.
[2] Q.X. Zhu, X.H. Zhang, Y. Wang, et al., A novel intelligent model integrating PLSR with RBF-Kernel based Extreme Learning Machine: application to modelling petrochemical process, IFAC-PapersOnLine 52 (1) (2019) 148–153.
[3] Y.L. He, Z.Q. Geng, Q.X. Zhu, Data driven soft sensor development for complex chemical processes using extreme learning machine, Chem. Eng. Res. Des. 102 (2015) 1–11.
[4] L. Pan, D.N. Politis, Bootstrap prediction intervals for linear, nonlinear and nonparametric autoregressions, J. Stat. Plan. Inference 177 (2016) 1–27.
[5] M.T. Tham, G.A. Montague, A.J. Morris, et al., Soft-sensors for process estimation and inferential control, J. Process Control 1 (1) (1991) 3–14.
[6] Y. Xu, Z.Q. Zhou, Q.X. Zhu, A new feedback DE-ELM with time delay-based EFSM approach for fault prediction of non-linear processes, Can. J. Chem. Eng. 93 (9) (2015) 1603–1612.
[7] C. Wan, Z. Xu, P. Pinson, et al., Probabilistic forecasting of wind power generation using extreme learning machine, IEEE Trans. Power Syst. 29 (3) (2013) 1033–1044.
[8] G. Zhang, Y. Wu, K.P. Wong, et al., An advanced approach for construction of optimal wind power prediction intervals, IEEE Trans. Power Syst. 30 (5) (2014) 2706–2715.
[9] M. Hu, Z. Hu, J. Yue, et al., A novel multi-objective optimal approach for wind power interval prediction, Energies 10 (4) (2017) 419.
[21] R. Errouissi, J. Cardenas-Barrera, J. Meng, et al., Bootstrap prediction interval estimation for wind speed forecasting, in: 2015 IEEE Energy Conversion Congress and Exposition (ECCE), IEEE, 2015, pp. 1919–1924.
[22] X.H. Zhang, Q.X. Zhu, Y.L. He, et al., A novel robust ensemble model integrated extreme learning machine with multi-activation functions for energy modeling and analysis: application to petrochemical industry, Energy 162 (2018) 593–602.
[23] Y.L. He, Z.Q. Geng, Y. Xu, et al., A hierarchical structure of extreme learning machine (HELM) for high-dimensional datasets with noise, Neurocomputing 128 (2014) 407–414.
[24] A. Khosravi, S. Nahavandi, D. Creighton, A neural network-GARCH-based method for construction of Prediction Intervals, Electr. Power Syst. Res. 96 (2013) 185–193.
[25] A. Khosravi, E. Mazloumi, S. Nahavandi, et al., Prediction intervals to account for uncertainties in travel time prediction, IEEE Trans. Intell. Transp. Syst. 12 (2) (2011) 537–547.
[26] K. Ning, M. Liu, M. Dong, et al., Two efficient twin ELM methods with prediction interval, IEEE Trans. Neural Netw. Learn Syst. 26 (9) (2014) 2058–2071.
[27] G.B. Huang, Q.Y. Zhu, C.K. Siew, Extreme learning machine: theory and applications, Neurocomputing 70 (1–3) (2006) 489–501.
[28] T. Xiong, Y. Bao, Z. Hu, et al., Forecasting interval time series using a fully complex-valued RBF neural network with DPSO and PSO algorithms, Inf. Sci. 305 (2015) 77–92.
[29] N.A. Shrivastava, A. Khosravi, B.K. Panigrahi, Prediction interval estimation of electricity prices using PSO-tuned support vector machines, IEEE Trans. Ind. Inf. 11 (2) (2015) 322–331.
[30] L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets Syst. 90 (2) (1997) 111–127.
[31] A. Kavousi-Fard, A. Khosravi, S.
Nahavandi, A new fuzzy-based combined prediction interval for wind power forecasting, IEEE Trans. Power Syst. 31 (1) (2015) 18–26. [32] S. Yin, Y. Jiang, Y. Tian, et al., A data-driven fuzzy information granulation approach for freight volume forecasting, IEEE Trans. Ind. Electron. 64 (2) (2016) 1447–1456.

[10] C. Wan, Z. Xu, P. Pinson, et al., Optimal prediction intervals of wind power generation, IEEE Trans. Power Syst. 29 (3) (2013) 1166–1174. [11] A. Khosravi, S. Nahavandi, D. Creighton, et al., Lower upper bound estimation method for construction of neural network-based prediction intervals, IEEE Trans. Neural Netw. 22 (3) (2010) 337–346. [12] I.D. Lins, E.L. Droguett, M. das Chagas Moura, et al., Computing confidence and prediction intervals of industrial equipment degradation by bootstrapped support vector regression, Reliab. Eng. Syst. Saf. 137 (2015) 120–128. [13] J. Quigley, L. Walls, Prediction Intervals for Reliability Growth Models with Small Sample Sizes, Springer London, 2006. [14] Y. Xu, M. Zhang, Research and application of interval prediction method for complex processes based on principal component independent analysis and mixed kernel RVM, CIE J. 68 (3) (2017) 925–931. [15] H. Quan, D. Srinivasan, A. Khosravi, Short-term load and wind power forecasting using neural network-based prediction intervals, IEEE Trans. Neural Netw. Learn Syst. 25 (2) (2013) 303–315. [16] Y. Xu, M. Zhang, L. Ye, et al., A novel prediction intervals method integrating an error & self-feedback extreme learning machine with particle swarm optimization for energy consumption robust prediction, Energy 164 (2018) 137–146. [17] Y. Xu, M. Zhang, Q. Zhu, et al., An improved multi-kernel RVM integrated with CEEMD for high-quality intervals prediction construction and its intelligent modeling application, Chemometr. Intell. Lab. Syst. 171 (2017) 151–160. [18] N. Buntao, S. Niwitpong, Confidence intervals for the difference of coefficients of variation for lognormal distributions and delta-lognormal distributions, Appl. Math. Sci. 6 (134) (2012) 6691–6704. [19] E.K. AL-Hussaini, A.H. Abdel-Hamid, A.F. Hashem, Bayesian prediction intervals of order statistics based on progressively type-II censored competing risks data from the half-logistic distribution, J. Egypt. Math. Soc. 
23 (1) (2015) 190–196. [20] A. Khosravi, S. Nahavandi, An optimized mean variance estimation method for uncertainty quantification of wind power forecasts, Int. J. Electr. Power Energy Syst. 61 (2014) 446–454.

12