Zdravko Kravanja, Miloš Bogataj (Editors), Proceedings of the 26th European Symposium on Computer Aided Process Engineering – ESCAPE 26 June 12th -15th, 2016, Portorož, Slovenia © 2016 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/B978-0-444-63428-3.50061-8
Development of soft sensor with neural network and nonlinear variable selection for crude distillation unit process
Kai Sun a, Shao-hsuan Huang b, Shi-Shang Jang b*, David Shan-Hill Wong b*
a Department of Automation, Qilu University of Technology, Jinan, Shandong, 250353, China
b Department of Chemical Engineering, National Tsing-Hua University, Hsin-Chu, 30013, Taiwan
Abstract
This study focuses on the development of a neural network-based soft sensor that estimates product properties for real-time monitoring and control in the crude distillation unit (CDU) process. The CDU process has a large number of predictor variables with a high level of cross-correlation, which increases the complexity of the model and lowers its accuracy. Therefore, a novel variable selection method for neural networks that can be applied to nonlinear industrial processes is developed to solve the problem. The proposed method is an iterative two-step approach. Firstly, a multi-layer perceptron (MLP) is constructed. Secondly, the least absolute shrinkage and selection operator (LASSO) is introduced to select the input variables that are truly essential to the model, with the shrinkage parameter determined by cross-validation. Then, variables whose input weights are zero are eliminated from the dataset. The algorithm is repeated until there is no improvement in the model accuracy. The results show that the model constructed by the proposed soft sensor can successfully follow the dynamics of the CDU process. In addition, the superiority of the proposed approach is illustrated by comparison with other state-of-the-art methods: the proposed approach builds a more compact model and achieves a higher level of prediction accuracy than the existing methods.
Keywords: soft sensor; variable selection; neural network; LASSO; crude distillation unit
1. Introduction
Crude distillation units (CDUs) are widely used in the chemical and petroleum industries to separate incoming crude oil into its component fractions by exploiting differences in their boiling points. The quality of the final products is verified by laboratory assays once a day, using the American Society of Testing Materials (ASTM) method D86, which is very time consuming. Therefore, it is necessary to develop an effective soft sensor that estimates the product properties for real-time monitoring and control of the process. Bolf et al. [1] designed a neural network-based soft sensor to estimate the quality of kerosene, and their results showed the validity of applying soft sensors to refinery product quality estimation as an alternative to process analyzers and laboratory assays. Mohler et al. [2] analysed the nonlinearity of the CDU process and proposed a neural
network-based soft sensor for the prediction of the kerosene property. Caponetto et al. [3] developed an FPGA-based soft sensor for estimating the kerosene property by integrating a neural network with principal component analysis (PCA). There are usually a large number of candidate explanatory variables in the CDU process, which increases the complexity of the model and reduces the prediction accuracy of the soft sensor. Variable selection techniques can be applied to improve the prediction accuracy, reduce the complexity of the model, better capture the nature of the CDU process, and reduce the cost of measurements. The motivation of this paper is to develop a robust variable selection method for soft sensors that can accurately model the complex CDU process. This paper is organized as follows. Section 2 gives a detailed description of the CDU process. The development of the methodology is presented in Section 3. In Section 4, the proposed method is applied to predict a product property of the CDU process. Finally, some concluding remarks are given in Section 5.
2. Process description
Figure 1 provides a schematic flow diagram of a real crude oil distillation unit, which consists of a crude distillation column at atmospheric pressure as well as a furnace, a stripper unit, pumparounds, and a decanter. The incoming crude oil falls into one of two categories: crude oil with high viscosity and common crude oil. The crude oil with high viscosity is heated and pre-distilled in the LSHP section (F161 and V161), and the steam obtained in this process is fed into the main column (V101). Common crude oil is preheated by D103, and the resulting steam is fed into V101 directly. In addition, the liquid obtained from D103 flows into the furnace (F101), where it is heated to a temperature of around 340°C. Following that, the steam obtained from the liquid in F101 is fed into V101. There are three pumparounds at the top of the CDU: the top pumparound (TPA), middle pumparound (MPA), and bottom pumparound (BPA). These pumparounds control the temperature of the steam by connecting to one or several heat exchangers. They also keep the vapour loading of the column stable and regulate the amount of liquid traffic in the column to achieve effective fractionation. As shown in the flow diagram, the top distillate fraction extracted in V101 is naphtha. The residual oil is discharged from the bottom of V101. The products of the side strippers are kerosene, light diesel (LD), and heavy diesel (HD), respectively. Kerosene is an important product, which can be burned in lamps and domestic heaters and is also used as a fuel for jet and turboprop aircraft engines. The quality of the kerosene is verified by laboratory assays once a day, using the American Society of Testing Materials (ASTM) method D86. The distillation end point of a product is defined as the maximum reading of the temperature sensor obtained during the test. However, the 95 % distillation end point (D95) is commonly used instead, as the end point is difficult to measure with a good level of repeatability [1]. The kerosene D95 is very unstable during the process. In addition, the verification of kerosene quality using laboratory assays is significantly time consuming. Therefore, a reliable inferential model for the estimation of the kerosene D95 is necessary for real-time monitoring and control in the CDU process.
Figure 1. A schematic flow diagram of the CDU process
3. Proposed algorithm
Suppose that $X \in \mathbb{R}^{n \times p}$ is the input data matrix with $n$ samples, in which each column represents a candidate explanatory variable, and $Y \in \mathbb{R}^{n}$ is a vector representing the response variable. Then, for given $p$ input variables $x_1, x_2, \ldots, x_p$, the response $y$ is predicted by
$$\hat{y} = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \qquad (1)$$
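The LASSO algorithm is obtained by introducing an extra penalty into the ordinary least squares problem, which is written as [4]:
$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{k=1}^{n} \Big( y_k - \sum_{i=1}^{p} \beta_i x_{ki} \Big)^2 + \lambda \sum_{i=1}^{p} |\beta_i| \right\} \qquad (2)$$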
where $\lambda \sum_{i=1}^{p} |\beta_i|$ is called the LASSO penalty, and $\lambda$ is a nonnegative tuning parameter. The penalty causes the $\beta$ values to shrink continuously towards zero as the parameter $\lambda$ increases. The coefficients $\beta_1, \beta_2, \ldots, \beta_p$ shrink exactly to zero if $\lambda$ is sufficiently large, which implies that all variables are eliminated.
For a multi-layer perceptron (MLP) neural network with three layers, assume that the input variables of the network are given by the candidate variables $x_1, x_2, \ldots, x_p$; the hidden layer has $q$ nodes, represented as $h_1, h_2, \ldots, h_q$. The weight $w_{ij}$ ($i \in [1, p]$, $j \in [1, q]$) denotes the input weight between the input variable $x_i$ and the $j$th hidden neuron $h_j$, while the bias of the $j$th neuron of the hidden layer is denoted by $b_j$. The output signal of the $j$th neuron of the hidden layer, $h_j$, is given by
$$h_j = f\Big( \sum_{i=1}^{p} w_{ij} x_i + b_j \Big) \qquad (3)$$
where $f$ denotes the activation function of the hidden layer. Let the weight $w_j$ ($j \in [1, q]$) represent the $j$th output weight between the hidden layer and the output layer, let $b_0$ denote the bias of the output layer, and let $g$ represent the activation function of the output layer. The input-output relationship of the MLP is formulated as:
$$\hat{y} = g\Big( \sum_{j=1}^{q} w_j \, f\Big( \sum_{i=1}^{p} w_{ij} x_i + b_j \Big) + b_0 \Big) \qquad (4)$$
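As an illustration of Eqs. (3) and (4), the following minimal Python sketch evaluates a three-layer MLP for a single sample. The function and argument names are hypothetical, and the choice of tanh for the hidden activation and the identity for the output activation is an assumption made for the example; the paper does not state which activation functions were used.

```python
import math


def mlp_forward(x, w_in, b_hidden, w_out, b_out,
                f=math.tanh, g=lambda z: z):
    """Output of a three-layer MLP for one sample.

    x        : list of p input values
    w_in     : p x q nested list, w_in[i][j] links input i to hidden neuron j
    b_hidden : list of q hidden-layer biases
    w_out    : list of q output weights
    b_out    : output-layer bias
    f, g     : hidden and output activations (tanh / identity assumed here)
    """
    p, q = len(x), len(b_hidden)
    # Eq. (3): output signal of each hidden neuron
    hidden = [
        f(sum(w_in[i][j] * x[i] for i in range(p)) + b_hidden[j])
        for j in range(q)
    ]
    # Eq. (4): weighted sum of hidden signals passed through g
    return g(sum(w_out[j] * hidden[j] for j in range(q)) + b_out)
```

With two inputs and two hidden neurons, for example, `mlp_forward([1.0, 2.0], [[0.5, -1.0], [0.25, 0.0]], [0.0, 1.0], [2.0, -1.0], 0.5)` evaluates `2 * tanh(1) + 0.5`, because the second hidden neuron receives a net input of zero.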
In this paper, we integrate the LASSO penalty into the MLP in order to achieve model shrinkage and variable selection. The hybrid algorithm implements LASSO to conduct
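the accurate shrinkage of input weights of the MLP. Eq. (4) is reformulated by adding the parameters $\beta_1, \beta_2, \ldots, \beta_p$ in front of the input nodes of the MLP:
$$\hat{y} = g\Big( \sum_{j=1}^{q} w_j \, f\Big( \sum_{i=1}^{p} \beta_i w_{ij} x_i + b_j \Big) + b_0 \Big) \qquad (5)$$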
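The effect of the added parameters in Eq. (5) can be seen in a small Python sketch. Each input is multiplied by its own scaling factor before it reaches the trained input weights, so a factor of exactly zero removes that variable from the model, while a vector of ones leaves the network unchanged. The names below are hypothetical, and the tanh and identity activations are assumptions made for the example.

```python
import math


def gated_mlp_forward(x, beta, w_in, b_hidden, w_out, b_out,
                      f=math.tanh, g=lambda z: z):
    """MLP output with a scaling factor beta[i] in front of input i (Eq. 5).

    The network weights are treated as fixed; only beta is meant to be
    adjusted by the LASSO step.
    """
    p, q = len(x), len(b_hidden)
    hidden = [
        f(sum(beta[i] * w_in[i][j] * x[i] for i in range(p)) + b_hidden[j])
        for j in range(q)
    ]
    return g(sum(w_out[j] * hidden[j] for j in range(q)) + b_out)


# Hypothetical weights for a network with two inputs and two hidden neurons
W_IN = [[0.5, -1.0], [0.25, 0.0]]
B_HIDDEN = [0.0, 1.0]
W_OUT = [2.0, -1.0]
B_OUT = 0.5

# beta = [1, 0]: the second input no longer influences the prediction
y_a = gated_mlp_forward([1.0, 2.0], [1.0, 0.0], W_IN, B_HIDDEN, W_OUT, B_OUT)
y_b = gated_mlp_forward([1.0, 99.0], [1.0, 0.0], W_IN, B_HIDDEN, W_OUT, B_OUT)
```

Here `y_a` and `y_b` are identical even though the second input differs, which is the property the deletion step relies on.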
In a similar way, eq. (2) is also reformulated as
$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{k=1}^{n} \Big( y_k - g\Big( \sum_{j=1}^{q} w_j \, f\Big( \sum_{i=1}^{p} \beta_i w_{ij} x_{ki} + b_j \Big) + b_0 \Big) \Big)^2 + \lambda \sum_{i=1}^{p} |\beta_i| \right\} \qquad (6)$$
Eq. (6) is a nonlinear minimization problem that can be solved using a trust region reflective optimization algorithm [5]. Following that, the new neural network is obtained by replacing $w_{ij}$ with $\beta_i w_{ij}$ in eq. (4). The selection of the shrinkage parameter $\lambda$ is a key component of the proposed approach because the choice of this parameter significantly influences the performance of the algorithm. A K-fold cross-validation (CV) [6] strategy is implemented in this approach to select the best value of $\lambda$.
In this paper, a new iterative backward deletion method for the MLP network is proposed, which introduces LASSO into an MLP and is labelled LASSO-MLP. At each iteration, the proposed LASSO-MLP trains a new network with the current dataset and shrinks the input weights by invoking LASSO. Then, the input variables for which $\beta_i = 0$ are deleted, and a new dataset is constructed using the remaining variables. This process is repeated until a termination condition is met: either a maximum number of iterations is reached, or no further improvement is achieved in the model error. The procedure of the proposed LASSO-MLP is as follows:
Step 1. Train or retrain a neural network with the training dataset $\{X, Y\}$.
Step 2. Introduce the magnitude parameters $\beta_1, \beta_2, \ldots, \beta_p$ into the current neural network and obtain eq. (5), where $p$ is the number of remaining variables.
Step 3. Determine the value of the parameter $\lambda$ using K-fold CV.
Step 4. Solve eq. (6) and obtain $\beta_1, \beta_2, \ldots, \beta_p$. Replace $w_{ij}$ with $\beta_i w_{ij}$, and obtain the new neural network.
Step 5. Determine whether the termination condition is met. If not, delete from $X$ the candidate variables which have $\beta_i = 0$, and obtain a new training dataset $\{X', Y\}$. Let $\{X, Y\} = \{X', Y\}$, and then go to Step 1.
Step 6. Output the results.
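The control flow of Steps 1 to 6 can be sketched in Python as follows. This is only an outline under stated assumptions: `train_fn`, `shrink_fn` and `error_fn` are hypothetical callables supplied by the user (network training, the LASSO step including the choice of λ, and the model error respectively), and whether the error is measured on training or held-out data is left to `error_fn`, since the paper does not specify it.

```python
import math


def lasso_mlp_select(X, y, train_fn, shrink_fn, error_fn, max_iter=10):
    """Iterative backward deletion loop (sketch of Steps 1-6).

    X         : list of samples, each a list of candidate variable values
    train_fn  : (X_cur, y) -> model                  (Step 1, hypothetical)
    shrink_fn : (model, X_cur, y) -> list of beta    (Steps 2-4, hypothetical)
    error_fn  : (model, beta, X_cur, y) -> float     (model error, hypothetical)
    Returns (indices of the selected variables, best error).
    """
    kept = list(range(len(X[0])))          # start from all candidate variables
    best_err, best_vars = math.inf, list(kept)
    for _ in range(max_iter):              # termination: iteration limit
        X_cur = [[row[i] for i in kept] for row in X]
        model = train_fn(X_cur, y)
        beta = shrink_fn(model, X_cur, y)
        err = error_fn(model, beta, X_cur, y)
        if err >= best_err:                # termination: no improvement
            break
        active = [i for i, b in zip(kept, beta) if b != 0.0]
        best_err, best_vars = err, active
        if not active or len(active) == len(kept):
            break                          # nothing left, or nothing to delete
        kept = active                      # Step 5: drop variables with beta = 0
    return best_vars, best_err
```

Passing the three stages in as callables keeps the sketch independent of any particular neural network library or optimizer; in a real implementation the shrinkage step would solve Eq. (6) with a bound-constrained nonlinear solver and select λ by K-fold cross-validation.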
4. Simulation Results
In this section, the performance of the proposed nonlinear variable selection algorithm, abbreviated as LASSO-MLP, is investigated by comparison with some state-of-the-art neural network-based variable selection algorithms, namely SBS-MLP [7] and NMIFS [8], and with an MLP with no variable selection. The performance of these algorithms is evaluated using three statistics: (1) Root mean square error (RMSE) of prediction. (2) Coefficient of determination ($R^2$): the square of the sample correlation coefficient between the outcomes and their predicted values. (3) Model size (M.S.): the number of nonzero elements in the final $\beta^{*}$, that is, the number of variables remaining in the candidate variable pool after the model construction. The dataset consisted of 25 candidate input variables and one target output variable, namely kerosene D95. The data were collected on a daily basis in 2013. Altogether, there
were 361 samples, of which the first 240 samples were used as calibration data and the other 121 samples as validation data. Each input sample was composed of the average values of the 60-min data recorded before the daily kerosene sample was taken. Table 1 displays the statistical results of the different approaches over 100 runs. The M.S. was determined on the calibration data, and the RMSE and $R^2$ were determined on the validation data. It is clear that LASSO-MLP has a better prediction accuracy and a smaller M.S. than the other algorithms, which demonstrates that LASSO-MLP can build a more accurate and more compact model.

Table 1. Statistical results for kerosene D95 prediction
          LASSO-MLP   NMIFS   SBS-MLP   MLP
RMSE      3.68        4.03    4.27      4.95
$R^2$     0.86        0.82    0.79      0.73
M.S.      10.18       11.02   11.75     25
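
The three statistics reported in Table 1 can be computed as in the following Python sketch, which follows the definitions given in the text ($R^2$ as the squared sample correlation between outcomes and predictions, model size as the count of nonzero shrinkage coefficients). The function names are hypothetical and the numbers in the usage note are illustrative only, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Square of the sample correlation between outcomes and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


def model_size(beta):
    """Number of variables whose shrinkage coefficient is nonzero."""
    return sum(1 for b in beta if b != 0.0)
```

For instance, with outcomes `[1, 2, 3]` and predictions `[1, 2, 4]`, `rmse` returns the square root of 1/3 and `r_squared` returns 27/28. Note that this definition of $R^2$ is insensitive to a constant bias or scale error in the predictions, so it is best read together with the RMSE.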
Figure 2 uses the validation data to compare the kerosene D95 verified by laboratory assays with the predictions given by LASSO-MLP. The model constructed by LASSO-MLP successfully follows the dynamics of the kerosene D95.
[Plot: kerosene D95 (°C), 245 to 285, against sample index 0 to 120; series: Lab assays, LASSO-MLP]
Figure 2. The kerosene D95 with Lab assays and LASSO-MLP
Figure 3 shows that variables 24, 13, 5, and 7 are chosen on over 80% of occasions by LASSO-MLP, whereas the other variables are selected at a frequency of below 60%, which demonstrates that these four variables are very important for the CDU process. In practical CDU operation, field operators prefer to regulate flow rates, as controlling a flow rate is considerably more convenient than controlling a temperature. Variable 24, the flow rate of the distilled kerosene, has the highest selection probability in the model, as shown in Figure 3. That result is consistent with field experience. The kerosene product has a lower boiling point and density if the kerosene flow rate is lower. Conversely, it has a higher boiling point and greater density if the kerosene flow rate is increased. In addition, the steam entering the bottom of V101 introduces heat energy into the column, which can influence the distillation process of the CDU. For that reason, the flow rate of the bottom steam, variable 21, has the second highest selection probability in the model. Variables 5 and 7 are the MPA and BPA flow rates, which have the third and fourth highest selection probabilities, respectively. This is because these two variables regulate the loading of the oil vapour and the amount of liquid traffic in the column for effective distillation.
[Bar chart: selection probability (%), 0 to 100, for variables 1 to 25]
Figure 3. Selection probability of CDU process variables with LASSO-MLP
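
A selection probability of the kind plotted in Figure 3 can be tallied as in the Python sketch below, assuming it is the percentage of repeated runs in which a variable survives the selection procedure. That reading is an assumption based on the 100 runs mentioned for Table 1, and the function name and the run data in the usage note are hypothetical.

```python
from collections import Counter


def selection_probability(runs, n_variables):
    """Percentage of runs in which each variable (numbered from 1) was kept.

    runs        : list of collections of selected variable numbers, one per run
    n_variables : total number of candidate variables
    """
    counts = Counter(v for selected in runs for v in set(selected))
    return {
        v: 100.0 * counts[v] / len(runs)
        for v in range(1, n_variables + 1)
    }
```

For example, `selection_probability([[24, 5], [24, 7], [24, 5, 7], [5]], 25)` assigns 75.0 to variables 24 and 5, 50.0 to variable 7, and 0.0 to every variable that was never selected.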
5. Conclusions
The verification of kerosene quality using laboratory assays is significantly time consuming. In addition, the CDU process has a large number of predictor variables with a high level of cross-correlation, which increases the complexity of the model and lowers its accuracy. In this study, a nonlinear variable selection method for the MLP neural network was proposed. The proposed approach introduces the LASSO penalty into the general MLP and carries out shrinkage on the input weights of the MLP in order to achieve accurate variable selection. The proposed LASSO-MLP combines the MLP's ability to describe nonlinear processes with the accurate variable selection provided by LASSO. Simulation results show that the model constructed by LASSO-MLP can successfully estimate the dynamics of the kerosene D95, and that the approach performs better than the other state-of-the-art nonlinear variable selection methods considered.
References
[1] N. Bolf, G. Galinec, and M. Ivandić, "Soft sensors for kerosene properties estimation and control in crude distillation unit," Chemical and Biochemical Engineering Quarterly, vol. 23, pp. 277-286, 2009.
[2] I. Mohler, Z. Ujevic Andrijic, N. Bolf, and G. Galinec, "Distillation End Point Estimation in Diesel Fuel Production," Chemical and Biochemical Engineering Quarterly, vol. 27, pp. 125-132, 2013.
[3] R. Caponetto, G. Dongola, A. Gallo, and M. G. Xibilia, "FPGA based soft sensor for the estimation of the kerosene freezing point," in Industrial Embedded Systems, 2009. SIES'09. IEEE International Symposium on, 2009, pp. 228-236.
[4] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society. Series B (Methodological), pp. 267-288, 1996.
[5] T. F. Coleman and Y. Li, "An interior trust region approach for nonlinear minimization subject to bounds," SIAM Journal on Optimization, vol. 6, pp. 418-445, 1996.
[6] R. Kohavi, "A study of cross-validation and bootstrap for accuracy estimation and model selection," in IJCAI, 1995, pp. 1137-1145.
[7] E. Romero and J. M. Sopena, "Performing feature selection with multilayer perceptrons," IEEE Transactions on Neural Networks, vol. 19, pp. 431-441, 2008.
[8] P. A. Estévez, M. Tesmer, C. A. Perez, and J. M. Zurada, "Normalized mutual information feature selection," IEEE Transactions on Neural Networks, vol. 20, pp. 189-201, 2009.