Adaptive soft sensor based on time difference Gaussian process regression with local time-delay reconstruction


Chemical Engineering Research and Design 117 (2017) 670–680

Journal homepage: www.elsevier.com/locate/cherd

Weili Xiong a,b, Yanjun Li b, Yujia Zhao c, Biao Huang c,∗

a Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
b School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
c Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G2G6, Canada

Article info

Article history:
Received 8 September 2016
Received in revised form 14 November 2016
Accepted 20 November 2016
Available online 25 November 2016

Keywords: Soft sensor; Time difference model; Moving window; Time-delay extraction; Dataset reconstruction

Abstract

Apart from strong nonlinearity and time-varying behaviors in industrial processes, the hidden time-delay information, which is unfortunately overlooked in most existing modeling methods, should also be taken into account in soft sensor modeling. In view of this, a novel soft sensor, referred to as local time-delay reconstruction based moving window time difference Gaussian process regression (LTR-MWTDGPR), is proposed in this paper. To deal with the time-delay, a fuzzy curve analysis based local time-delay parameter extraction procedure is performed along with a strategy of a moving window, which simultaneously captures the process time-varying feature. Then the local window training dataset and new query sample are reconstructed according to the time-delay parameter set at the next sampling instant. Afterwards, the time difference Gaussian process regression is employed to handle the drifting feature of local reconstructed dataset. The effectiveness and accuracy of the proposed LTR-MWTDGPR approach in predicting quality variables are verified through a real sulfur recovery unit and an industrial debutanizer column.

© 2016 Institution of Chemical Engineers. Published by Elsevier B.V. All rights reserved.

∗ Corresponding author. Fax: +1 780 492 2881. E-mail address: [email protected] (B. Huang).
http://dx.doi.org/10.1016/j.cherd.2016.11.020

1. Introduction

There is a great demand for control and optimization of product quality in modern industrial processes, which leads to requirements for the online measurement of process variables (Fortuna et al., 2007). In many practical applications, quality-related variables such as concentration in gas flow and certain chemical ingredients of products are difficult to measure online (Ahmad et al., 2014; Yan et al., 2004). In such circumstances, soft sensors have been extensively applied by constructing mathematical models between auxiliary process variables and the dominant quality variables (Facco et al., 2009; Khatibisepehr et al., 2013). In some cases, although online analyzers for quality variables are installed on site, the measurement sequence of the quality variables is not consistent with the sampling sequence of the process variables, exhibiting significant time-delay introduced by signal and material transmission, or by the installation location and analyzing cycle of the measuring instruments (Fortuna et al., 2007). If such time-delay is ignored, the modeling accuracy and control quality of the system are greatly compromised, and control performance deteriorates as the delay increases. Therefore, it is imperative to have a reliable estimate of the delay between process variables and quality variables to optimize the control of chemical production processes.

As process time-delay plays a critical role in system dynamics and control, there are numerous published works on the identification of time delay systems (Richard, 2003; Bozorg and Davison, 2006; Tufa and Ramasamy, 2011), and the question of how to develop reliable online soft sensors in the presence of time delays has attracted much attention. In order to extract the process delay information, Fortuna et al. (2005) made use of the design parameters of process hardware, such as reactor volume or the length of a pipe, to estimate the approximate delay range of the device, using a nonlinear autoregressive moving average model structure which involves a certain number of lagged samples to overcome the large time-delay introduced by a gas chromatograph. Besides, by computing the correlation coefficient between input and output variables, Komulainen et al. (2004) and Zhang et al. (2006) constructed an online dynamic partial least squares model and an aggregated neural network, respectively, to refine soft sensor models with lagged sample information. In addition, Souza et al. (2010) developed an artificial neural network based data-driven soft sensor, utilizing the mutual information index of process and quality variables to introduce variable delay into the soft sensor model and to select informative input variables to further improve the soft sensor reliability. However, a number of open problems remain. For example, the number of lagged samples to be used in the ARMA model structure is obtained by trial and error, which is prone to unstable model performance. Correlation coefficient based delay estimation methods are limited to linear systems, and mutual information based algorithms tend to show a high degree of computational complexity and need a lot of data. Although the delay parameters can sometimes be determined through prior knowledge from in-depth analysis of the process mechanism, such an estimation procedure might exhibit randomness and uncertainty. Thus, a delay estimation method that can describe process nonlinearity well with a relatively low computational load is much needed. At the same time, since the operating conditions of process control are often time-varying, process data present clear stage-wise characteristics. Therefore, when estimating delay parameters, both time-delay and shifting features under different operating conditions should be considered so as to better capture local characteristics and provide a more reliable soft sensor model for process quality control.

Currently, data-driven methods are widely applied in soft sensor modeling because they do not need much prior process knowledge and are simple to develop compared with first principle models (Yuan et al., 2016; He et al., 2016; Gholami et al., 2015; Kadlec et al., 2009). Data-driven models, just as the name implies, are built on numerous data collected from the process historical database. Although synchronous data acquisition can be done on a large scale with the rapid development of distributed control systems, the time-delay between process variables and quality variables still exists due to the different spatial and temporal distribution of process instruments. Thus the real-time collected dataset contains useful delay information, which provides opportunities for the establishment of time-delay related soft sensor models. Given that delay is almost ubiquitous in industrial processes, active steps must be taken to correct the dataset by re-matching the input and output time sequences, which will bring a lot of benefits to subsequent model building, real-time quality estimation, and online correction.

Fig. 1 – The flow diagram of FCA-LDR procedure.

On the other hand, nonlinearity and time-varying behavior of chemical industrial processes have been two main topics throughout soft sensor development history. When a soft sensor model is established and put into service, the problem of model degradation is inevitable, which can lead to deterioration in prediction accuracy of

the quality variables (Kaneko and Funatsu, 2013; Kadlec et al., 2011). The reasons are diverse, such as process state shifting, sensor or process drift, catalytic performance loss and so on (Kaneko et al., 2014). If the model fails to adapt effectively to the gradual or abrupt changes of the chemical plant, the soft sensor will perform poorly in process monitoring and control. To reduce the degradation, adaptive learning strategies have been put forward over the years to maintain the prediction accuracy in the long run. The most common approaches are the recursive methods (RM) (Li et al., 2000; Matias et al., 2015), moving window (MW) methods (Xu et al., 2015; Lu et al., 2014), just-in-time learning methods (JITL) (Liu et al., 2012; Cheng and Chiu, 2005) and time difference (TD) methods (Kaneko and Funatsu, 2011a; Kaneko and Funatsu, 2011b). Among them, MW and RM based models can adapt to gradual shifts in process variables and quality variables; JITL based models are suitable for changes such as shifts in process variables; however, each of these model types alone requires frequent model reconstruction in many applications. In contrast, TD models can adapt simultaneously to drifts in both input and output variables with high stability, and they do not encounter the issue of frequent model updating (Kaneko and Funatsu, 2015). Nevertheless, a global TD model has the same limitation as all offline models: it ages with time. In order to enable TD models to continue capturing incipient and abrupt process changes, in this paper two adaptive mechanisms, the MW and TD strategies, are combined to enhance the reliability of soft sensors in dealing with changing process dynamics.

To describe the local nonlinear characteristics, the selection of the local model building algorithm is of great importance. Towards this end, various data-driven methods such as principal component analysis (PCA) (Yuan et al., 2015a), partial least squares (PLS) (Xu et al., 2015) and least squares support vector machine (LSSVM) (Lv et al., 2012) based algorithms have proven useful in many practical applications. Among these algorithms, Gaussian process regression (GPR) has gained great momentum recently by virtue of its nonparametric probabilistic nature (Bishop, 2006). Several studies have shown that GPR not only can perform better in terms of system modeling than other algorithms, but can also provide a probabilistic estimate of the uncertainties (Chen and Wang, 2010; Ge, 2016; Rasmussen and Williams, 2006).

As mentioned before, in practical industrial processes, time-varying behavior, process nonlinearity and time-delay are all significant issues in soft sensor modeling. Thus, the main goal of the present work is to propose an adaptive soft sensor based on time difference GPR with local time-delay reconstruction. The framework of the proposed methodology is briefly as follows: first, to deal with the aging problem of the static model, a MW strategy is combined with the TD method to gradually capture the time-varying and nonlinear drift dynamics; then an online time-delay correction algorithm is utilized to rematch the modeling dataset in the moving window, thus providing a more reliable dataset for modeling; finally, the GPR algorithm is adopted to fit the nonlinear drifts of the reconstructed local dataset.

The remainder of this article is organized as follows: Section 2.1 gives a brief introduction of the proposed local time-delay estimation and dataset reconstruction algorithm, followed by the fundamentals of time difference Gaussian process regression (TDGPR) in Section 2.2; next, in Section 2.3, the overall development procedure of the local time-delay reconstruction based moving window time difference Gaussian process regression (LTR-MWTDGPR) soft sensor is introduced in detail. In Section 3, two real industrial cases, namely the sulfur recovery unit and the debutanizer column, are investigated to demonstrate the validity of the proposed method. Finally, concluding remarks are presented in Section 4.

Fig. 2 – Simplified diagram of TDGPR modeling.

Fig. 3 – The modeling diagram of the proposed soft sensor method.

Fig. 4 – The block scheme of SRU process.

Table 1 – Input and output variables of SRU process.
Variable   Description
x1         Gas flow MEA GAS
x2         Air flow AIR MEA
x3         Secondary air flow AIR MEA 2
x4         Gas flow in SWS zone
x5         Air flow in SWS zone
y1         Concentration of H2S
y2         Concentration of SO2

2. Methodology

2.1. Fuzzy curve analysis based local dataset time-delay reconstruction

For chemical processes, assuming that the data are collected in real time and uniformly sampled, traditional modeling strategies basically establish the mapping relation between $x(t)$ and $y(t)$, where $x(t) = (x_1(t), x_2(t), \cdots, x_m(t))$ is the auxiliary variable vector obtained at time $t$, $m$ is the dimension of the input auxiliary variables, and $y(t)$ is the value of the dominant variable at time $t$. However, a model built on the $x(t)$-to-$y(t)$ relationship is not consistent with the actual process mechanism, because the input corresponding to $y(t)$ should be $x_i(t - d_i)$, where $d_i$ is defined as the delay of the $i$-th input variable. Since time delay mismatch can lead to significant deterioration of the soft sensor accuracy, it is important to have a reliable estimate of $d_i$.

In this article, time-delay parameter estimation is done through the fuzzy curve analysis (FCA) method. The fuzzy curve method was originally proposed to simplify fuzzy rules and determine the model structure of fuzzy neural networks (Lin and Cunningham, 1995). In consideration of its effectiveness in determining variable importance and its low computational complexity, a FCA based local dataset reconstruction algorithm (FCA-LDR) is provided in this section to handle process time-delay characteristics. To apply the algorithm, the maximum delay $T_{max}$ needs to be specified either by a priori knowledge or through preliminary analysis, such that $0 \le d_i \le T_{max}$, $d_i \in \mathbb{N}$. Fig. 1 provides a brief idea of FCA based local dataset time-delay reconstruction. The detailed FCA-LDR algorithm is given as follows:

Step 1: Collect uniformly sampled input and output variables and construct an initial moving window that contains $L$ consecutive samples ($L$ is defined as the length of the moving window). The selection of $L$ should be suitable for explaining local dynamics as well as the local time-delay feature. Once the initial window is established, extend the original auxiliary variable $x_i(t)$ to a collection of time-delayed variables denoted as $\{x_i(t - \tau),\ \tau = 0, 1, \ldots, T_{max}\}$. The extension mode is shown in (1); thus an $m \times (T_{max} + 1)$ dimensional time-delay input variable set ($m$ sets) is obtained during this step.

$$\{x_i(t),\ x_i(t-1),\ \cdots,\ x_i(t-d_i),\ \cdots,\ x_i(t-T_{max})\}, \quad i = 1, 2, \cdots, m \tag{1}$$

Step 2: For each extended time-delayed variable in the set $\{x_i(t - \tau),\ \tau = 0, 1, \ldots, T_{max}\}$, suppose that the mapping relationship in the window is $x_i(t - \tau) \to y(t)$. For every point $(x_{i,t-\tau},\ y(t))$ in the $x_i(t - \tau) \to y(t)$ space, we define a Gaussian fuzzy membership function of the input variable $x_i(t - \tau)$ as $\mu_{it}(x_i(t - \tau))$ in formula (2), where $b$ is determined as 20% of the range of variable $x_i(t - \tau)$, $I_n = [1, 1, \ldots, 1]^T$, $n = L - T_{max}$, $x_{i,t-\tau}$ is the $(t - \tau)$-th sampling value of variable $x_i$ in the window, and $x_i(t - \tau)$ is the lagged variable sequence, $t = T_{max} + 1, \ldots, L$.

$$\mu_{it}(x_i(t-\tau)) = \exp\left(-\left(\frac{I_n x_{i,t-\tau} - x_i(t-\tau)}{b}\right)^2\right) \tag{2}$$

Step 3: By introducing time-delayed variables, the original variable $x_i$ is expanded to $T_{max} + 1$ dimensions. The fuzzy curve vector $C_{i,\tau}$ can be obtained by centroid defuzzification of each new expanded variable with Eq. (3); afterwards, as shown in (4), $d_i$ is the $\tau$ for which $C_{i,\tau}$ attains the maximum coverage range. In (4), $C_{i,\tau}^{max}$ is the maximum value in the fuzzy curve vector, while $C_{i,\tau}^{min}$ is the minimum value of the obtained $C_{i,\tau}$. If the range of $C_{i,\tau}$ is closer to that of $y$, then the input variable $x_i(t - \tau)$ is more important to the output variable. In view of this, the importance of each variable in the set $\{x_i(t - \tau),\ \tau = 0, 1, \ldots, T_{max}\}$ can be determined by ranking the coverage ranges. Finally, the optimal time-delay variables $\{x_i(t - d_i),\ i = 1, 2, \ldots, m\}$ in the window are obtained.

$$C_{i,\tau} = \frac{\sum_{t=T_{max}+1}^{L} \mu_{it}[x_i(t-\tau)] \cdot y(t)}{\sum_{t=T_{max}+1}^{L} \mu_{it}[x_i(t-\tau)]} \tag{3}$$

$$d_i = \arg\max_{\tau} \left( C_{i,\tau}^{max} - C_{i,\tau}^{min} \right) \tag{4}$$

Step 4: Use the $d_i$ determined in the previous step to reconstruct the dataset in the window. Reconstruct the input time series into $x_d(t) = [x_1(t - d_1),\ x_2(t - d_2),\ \cdots,\ x_m(t - d_m)]$; then the reorganized initial moving window dataset is defined by $W_{rec} = [x_d(t),\ y(t)]_{t = T_{max}+1, \ldots, L}$.

Step 5: When a new sample is available, push the moving window one step forward, add the most recently collected sample into the window and remove the oldest one, and repeat the above steps to complete the local dataset time-delay reconstruction.

Fig. 5 – Time plots of x and y for SRU process dataset.
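The FCA-LDR steps above can be illustrated with a small sketch. This is a minimal reading of Eqs. (2) to (4) and Step 4, not the authors' implementation: the function names (`fuzzy_curve`, `estimate_delay`, `reconstruct_window`), the 0-based indexing, the fallback value of `b` for a constant input, and the tie-breaking rule (smallest lag wins) are all assumptions made for illustration.

```python
import math

def fuzzy_curve(x_lagged, y, b):
    """Fuzzy curve values for one lagged input sequence (Eqs. 2 and 3)."""
    curve = []
    for center in x_lagged:
        # Gaussian membership of every sample relative to this point (Eq. 2)
        mu = [math.exp(-((center - x) / b) ** 2) for x in x_lagged]
        # Centroid defuzzification (Eq. 3)
        curve.append(sum(m * yt for m, yt in zip(mu, y)) / sum(mu))
    return curve

def estimate_delay(x, y, t_max):
    """Lag in 0..t_max whose fuzzy curve covers the widest range (Eq. 4)."""
    best_lag, best_cover = 0, -1.0
    y_win = y[t_max:]  # y(t) for t = Tmax+1, ..., L (0-based slicing)
    for lag in range(t_max + 1):
        x_lagged = x[t_max - lag: len(x) - lag]  # x(t - lag), aligned with y_win
        span = max(x_lagged) - min(x_lagged)
        b = 0.2 * span if span > 0 else 1.0  # 20% of the range; 1.0 is an assumed fallback
        curve = fuzzy_curve(x_lagged, y_win, b)
        cover = max(curve) - min(curve)
        if cover > best_cover:  # strict comparison: ties keep the smaller lag
            best_lag, best_cover = lag, cover
    return best_lag

def reconstruct_window(X, y, delays, t_max):
    """Step 4: X is a list of m input sequences of length L; returns (inputs, targets)."""
    rows = [[X[i][k - d] for i, d in enumerate(delays)]
            for k in range(t_max, len(y))]
    return rows, y[t_max:]
```

For a window in which the output simply echoes an input three samples later, `estimate_delay` would be expected to return 3, and `reconstruct_window` then leaves L − Tmax re-matched samples, as in Step 4.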

2.2. Time difference Gaussian process regression

Different from traditional modeling of $X = \{x_1, x_2, \ldots, x_m\} \in R^{n \times m}$ and $y \in R^{n \times 1}$ ($n$ is the length of the sampling data, $m$ is the number of input variables), a TD model is built on the time difference of the input and output datasets so as to adapt to the drifts of both $X$ and $y$, namely, learning the mathematical relationship between $\Delta X$ and $\Delta y$. The $j$-th order time difference of $X$ and $y$ is defined below:

$$\Delta_j X(t) = X(t) - X(t-j), \qquad \Delta_j y(t) = y(t) - y(t-j) \tag{5}$$

Next, data-driven algorithms can be employed to model the mathematical relationship between $\Delta_j X$ and $\Delta_j y$, and the GPR method is utilized to explain the nonlinear drift characteristics of the process, as follows. Given the training sample sets $X \in R^{n \times m}$ and $y \in R^{n \times 1}$, input $x$ and output $y$ satisfy

$$y = f(x) + \varepsilon \tag{6}$$

where $f$ is an unknown function and $\varepsilon$ is Gaussian noise with zero mean and variance $\sigma_n^2$. Generally, in GPR the regression model in (6) is considered to have a prior distribution defined as $y = [f(x^1), f(x^2), \ldots, f(x^n)] \sim N(0, C)$, where $C$ is an $n \times n$ covariance matrix (for simplicity, the mean function is preprocessed to 0). GPR can choose various covariance functions to describe the distribution characteristics of the sample space. In the present paper, the covariance function of sample $x^p$ and sample $x^q$ is selected as the form in (7), where $c(x^p, x^q)$ represents the element in the $p$-th row and the $q$-th column of matrix $C$. $\delta_{pq} = 1$ holds only when $p = q$; otherwise, $\delta_{pq} = 0$. In (7), the parameter $v$ controls the magnitude of the covariance and $\omega_d$ functions as an importance weight assigned to each input variable $x_d$. $x_d^p$ and $x_d^q$ represent the $d$-th attribute of $x^p$ and $x^q$ respectively.

$$c(x^p, x^q) = v \exp\left(-\frac{1}{2} \sum_{d=1}^{m} \omega_d \left(x_d^p - x_d^q\right)^2\right) + \delta_{pq}\,\sigma_n^2 \tag{7}$$

Fig. 6 – Relationship between time difference order j and RMSE of TDGPR, MWTDGPR and the proposed method under different window sizes for SRU: (a) L = 50; (b) L = 110.
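As a concrete reading of the covariance function in (7), the following sketch evaluates $c(x^p, x^q)$ and assembles the matrix $C$. The function and argument names (`ard_covariance`, `covariance_matrix`, `same_index`) are illustrative assumptions; the hyperparameter values in the usage note are arbitrary and are not taken from the paper.

```python
import math

def ard_covariance(xp, xq, v, omega, noise_var=0.0, same_index=False):
    """Covariance of Eq. (7); the noise term is added only when p == q."""
    weighted_sq = sum(w * (a - b) ** 2 for w, a, b in zip(omega, xp, xq))
    c = v * math.exp(-0.5 * weighted_sq)
    return c + noise_var if same_index else c

def covariance_matrix(X, v, omega, noise_var):
    """n x n matrix C whose (p, q) element is c(x^p, x^q)."""
    n = len(X)
    return [[ard_covariance(X[p], X[q], v, omega, noise_var, p == q)
             for q in range(n)] for p in range(n)]
```

For example, with `X = [[0.0, 0.0], [1.0, 0.0]]`, `v = 2.0`, `omega = [1.0, 1.0]` and `noise_var = 0.1`, the diagonal entries are `v + noise_var` and the off-diagonal entries are `v * exp(-0.5)`. A larger weight in `omega` makes the covariance decay faster along that input, which is why it acts as an importance weight.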


Table 2 – The determination of maximum delay parameter Tmax.
Maximum delay   Optimal delay [d1 d2 d3 d4 d5]
Tmax = 5        [5 5 5 5 5]
Tmax = 6        [6 6 6 6 6]
Tmax = 7        [7 7 7 7 7]
Tmax = 8        [8 8 8 8 8]
Tmax = 9        [9 9 9 0 0]
Tmax = 10       [10 9 10 10 1]
Tmax = 11       [11 9 11 11 2]
Tmax = 12       [11 9 12 11 3]
Tmax = 13       [11 9 13 11 4]
Tmax = 14       [11 9 14 11 5]
Tmax = 15       [11 9 14 6 10]

Fig. 7 – RMSE trends of the proposed method with 5 different window sizes for SRU (RMSE value versus time difference order j, for L = 30, 50, 70, 90, 110).

Table 3 – Variable delay results under different numbers of training samples for SRU.
Training sample number   [d1, d2, d3, d4, d5]
n = 500                  [11 9 14 6 2]
n = 600                  [11 9 14 6 2]
n = 700                  [11 9 14 11 10]
n = 800                  [11 9 14 11 10]
n = 900                  [11 9 14 11 10]
n = 1000                 [11 9 14 6 10]

For a new input sample $x^{new}$, the corresponding prediction of the output $y^{new}$ also follows a Gaussian distribution, with its mean and variance given by (8), where $c(x^{new}) = [c(x^{new}, x^1), c(x^{new}, x^2), \ldots, c(x^{new}, x^n)]^T$ is an $n \times 1$ covariance vector between the new input and the $n$ training inputs in the database, and $c(x^{new}, x^{new})$ is the autocovariance of the new input.

$$y^{new} \mid X, y, x^{new} \sim N\left(\hat{y}^{new},\ \sigma^2_{y^{new}}\right) \quad \text{s.t.} \quad \begin{cases} \hat{y}^{new} = c^T(x^{new})\, C^{-1} y \\ \sigma^2_{y^{new}} = c(x^{new}, x^{new}) - c^T(x^{new})\, C^{-1} c(x^{new}) \end{cases} \tag{8}$$

Once the unknown parameter set $\Theta = \{v, \sigma_n^2, \omega_1, \cdots, \omega_m\}$ is estimated through the maximum likelihood estimation approach, GPR can be utilized for prediction. In this case, a model of $\Delta_j X$ and $\Delta_j y$ is built on the $j$-th order time difference; that is, if a new input arrives at time $t_{new}$, the prediction value $y_{pred}(t_{new})$ can be calculated from $y(t_{new} - j)$ by (9). Fig. 2 describes the simplified TDGPR modeling idea.

$$\begin{aligned} \Delta_j x(t_{new}) &= x(t_{new}) - x(t_{new} - j) \\ \Delta_j y_{pred}(t_{new}) &= f_{GPR}\left(\Delta_j x(t_{new})\right) \\ y_{pred}(t_{new}) &= y(t_{new} - j) + \Delta_j y_{pred}(t_{new}) \end{aligned} \tag{9}$$
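To make Eqs. (8) and (9) concrete, here is a minimal sketch of a TDGPR prediction with fixed hyperparameters. Several things are assumptions rather than the paper's procedure: the hyperparameters are passed in instead of being estimated by maximum likelihood, the linear systems are solved by plain Gaussian elimination, the autocovariance $c(x^{new}, x^{new})$ is taken to include the noise term of (7), and the last row of `X` is assumed to be the sample at $t_{new} - 1$. All function names are hypothetical.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

def kernel(a, b, v, omega):
    """Noise-free part of the covariance in Eq. (7)."""
    return v * math.exp(-0.5 * sum(w * (p - q) ** 2 for w, p, q in zip(omega, a, b)))

def gpr_predict(X, y, x_new, v, omega, noise_var):
    """Predictive mean and variance of Eq. (8)."""
    n = len(X)
    C = [[kernel(X[p], X[q], v, omega) + (noise_var if p == q else 0.0)
          for q in range(n)] for p in range(n)]
    c_new = [kernel(x_new, xi, v, omega) for xi in X]
    alpha = solve(C, y)      # C^{-1} y
    beta = solve(C, c_new)   # C^{-1} c(x_new)
    mean = sum(c * a for c, a in zip(c_new, alpha))
    # Assumption: the autocovariance includes the noise variance
    var = kernel(x_new, x_new, v, omega) + noise_var - sum(c * b for c, b in zip(c_new, beta))
    return mean, var

def td_gpr_predict(X, y, x_new, j, v, omega, noise_var):
    """Eq. (9): predict the j-th order difference, then add y(t_new - j)."""
    dX = [[a - b for a, b in zip(X[t], X[t - j])] for t in range(j, len(X))]
    dy = [y[t] - y[t - j] for t in range(j, len(y))]
    dx_new = [a - b for a, b in zip(x_new, X[-j])]  # x(t_new) - x(t_new - j)
    d_mean, d_var = gpr_predict(dX, dy, dx_new, v, omega, noise_var)
    return y[-j] + d_mean, d_var
```

Because the model is fitted on differences, a constant offset in `y` cancels out of the training data, which is the property the TD approach relies on to follow drifting outputs.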

2.3. Modeling framework of the proposed method

With the purpose of introducing local time-delay characteristics into the soft sensor model while dealing with the drift dynamics of the local process stage, a class of local time-delay reconstruction based moving window TDGPR (LTR-MWTDGPR) soft sensors is proposed in this article. The main idea of the modeling is sketched in Fig. 3. The main steps of the modeling framework are detailed as follows:

Step 1: Determine the maximum delay parameter $T_{max}$ through empirical and prior process knowledge or a data pre-analysis method.

Step 2: Collect the uniformly sampled data containing both input and output variables; construct an initial moving window $W_{ini}$ that contains $L$ consecutive samples, where $W_{ini} = [x(t), y(t)]_{t=1,\ldots,L}$, $x(t) = \{x_1(t), x_2(t), \ldots, x_m(t)\}$.

Step 3: Extend the original $m$ variables with delays ranging from 0 to $T_{max}$, so that a total of $m \times (T_{max} + 1)$ lagged variables as in (1) are obtained. Extract the optimal delays in the local window with the FCA method using (2)–(4); the $m$ time-delay variables with the largest fuzzy curve range are then prepared for the next step. The optimal delay parameters are denoted as $d_1, d_2, \ldots, d_m$.

Step 4: Correct the original $x(t)$-to-$y(t)$ mapping relations in the moving window via the estimated delays $d_1, d_2, \ldots, d_m$. In this step, the original $L$ input samples are reduced to $L - T_{max}$ input samples which are more predictive for the output $y(t)$. The reconstructed window is defined as $W_{rec} = [x_1(t - d_1), x_2(t - d_2), \ldots, x_m(t - d_m), y(t)]_{t = T_{max}+1, \ldots, L}$.

Step 5: At the next sampling instant, $y(t_{new})$ can be predicted using the information provided by the reconstructed input $x_d(t_{new}) = [x_1(t_{new} - d_1), x_2(t_{new} - d_2), \ldots, x_m(t_{new} - d_m)]$.

Step 6: As in (5), take the $j$-th order time difference of the $L - T_{max}$ samples in $W_{rec}$ as well as of $x_d(t_{new})$, and then build a GPR model on the differenced dataset. After that, the predicted drift output for $\Delta_j x_d(t_{new})$ (denoted as $\Delta_j y_{pred}(t_{new})$) is obtained by TDGPR modeling. Based on the historical output $y(t_{new} - j)$, the final real-time prediction $y_{pred}(t_{new})$ is obtained by (9). When the prediction is complete, discard the current LTR-MWTDGPR model and return the window database to its original state with $L$ training samples, that is, $W_{ini} = [x(t), y(t)]_{t=1,\ldots,L}$.

Step 7: Update the moving window to capture new process dynamics by adding the newest sample and dropping the oldest sample. By repeating the above steps, the real-time prediction of the quality variable is achieved.

To verify the effectiveness of the presented algorithm, comparisons are made on prediction performance among the following 5 soft sensor models:
Model 1: Global GPR model (Rasmussen and Williams, 2006);
Model 2: Moving window based GPR (MWGPR) model (Zhang et al., 2015);
Model 3: Global TDGPR model (Kaneko and Funatsu, 2011a);

Model 4: Moving window based TDGPR (MWTDGPR) model (Kaneko and Funatsu, 2015; Yuan et al., 2015b);
Model 5: MWTDGPR model considering local time-delay estimation and dataset reconstruction, namely, the proposed soft sensor model.

Two evaluation indexes, root-mean-square error (RMSE) and coefficient of determination (r2), are selected to describe the prediction and tracking ability of the different models in this paper.

Fig. 8 – Schematic diagram of debutanizer process.

Table 4 – The performance of different models using dataset collected from SRU when L = 50 and L = 110.

(a) L = 50
Method            Model type   Mean CPU runtime   RMSE     r2
GPR               Global       –                  0.0785   −0.0919
MWGPR             Adaptive     0.237 s            0.0379   0.7453
TDGPR             Global       –                  0.0191   0.9357
MWTDGPR           Adaptive     0.199 s            0.0205   0.9252
Proposed method   Adaptive     1.042 s            0.0168   0.9502

(b) L = 110
Method            Model type   Mean CPU runtime   RMSE     r2
GPR               Global       –                  0.0823   −0.1321
MWGPR             Adaptive     0.360 s            0.0341   0.8054
TDGPR             Global       –                  0.0196   0.9355
MWTDGPR           Adaptive     0.294 s            0.0208   0.9276
Proposed method   Adaptive     5.520 s            0.0193   0.9377

3. Case studies and discussion

3.1. Sulfur recovery unit

In this section, the industrial sulfur recovery unit (SRU) is investigated for the development of the soft sensor. The SRU process is a vital part of a large refinery system, and its tail gas must be strictly treated and monitored in order to avoid environmental hazards. In this industrial case, there are 5 auxiliary variables and 2 dominant variables, the detailed description of which is presented in

Table 1, and the simplified block scheme of SRU is shown in Fig. 4 (Fortuna et al., 2003). A total of 1000 data samples have been collected under normal operating conditions with a sampling interval of 1 min. Different nonlinear model structures are designed for online estimation of the concentration of hydrogen sulfide (H2S) in the tail stream of the SRU, and the time trend plots of the process variables are depicted in Fig. 5. The concentration of H2S can be obtained by an on-site measuring instrument. However, there exists a time lag between the real-time input and its response, which cannot simply be ignored. Due to inadequate prior process knowledge of the delay range, a pre-analysis procedure is carried out on the collected 1000 samples to determine the possible maximum delay parameter Tmax, as shown next.

Table 2 shows the delay estimation results of the 5 auxiliary variable sequences with respect to the H2S variable sequence under different settings of Tmax. The estimated variable delays gradually increase up to 14 and never reach 15 over the whole trial with Tmax ranging from 5 to 15. From Table 3, it is noted that when Tmax is set as 15, the delay parameters do not exceed 15; however, they change when the number of training samples is changed. Nevertheless, the results of the two tables indicate some consistency of the local time-delays in the process dataset. Hence, the hidden delay information of the database must be taken into account during soft sensor development to depict local timing sequence relations. Therefore, in the subsequent analysis, Tmax is set as 15.

Fig. 9 – Time plots of x and y for debutanizer dataset.

Table 5 – Model performance of 3 LTR-MWTDGPR algorithms using SRU dataset when L = 50.
Method            Model type   Mean CPU runtime   RMSE     r2
CC-MWTDGPR        Adaptive     0.387 s            0.0188   0.9373
MI-MWTDGPR        Adaptive     2.395 s            0.0178   0.9436
Proposed method   Adaptive     1.042 s            0.0168   0.9502

Table 6 – The performance of different models using dataset collected from debutanizer when L = 30.
Method            Model type   Mean CPU runtime   RMSE     r2
GPR               Global       –                  0.1123   −0.5721
MWGPR             Adaptive     0.186 s            0.0312   0.8788
TDGPR             Global       –                  0.0089   0.9902
MWTDGPR           Adaptive     0.163 s            0.0089   0.9900
Proposed method   Adaptive     0.342 s            0.0069   0.9940
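The two evaluation indexes reported in Tables 4 to 6 can be computed as in the sketch below. The paper does not spell out its formulas, so the common definitions are assumed here: RMSE as the square root of the mean squared error, and r2 as one minus the ratio of the residual sum of squares to the total sum of squares. The function names are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SSE / SST (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    sse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    sst = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - sse / sst
```

Under this definition r2 becomes negative whenever a model predicts worse than the constant mean of the test data, which would be consistent with the negative values listed for the global GPR model.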


To test the validity of the proposed method, 5 methods are used to construct soft sensor models using the same dataset: GPR, MWGPR, TDGPR, MWTDGPR and the LTR-MWTDGPR presented in this article. The related simulation settings are as follows:
1. In the two global modeling methods, i.e. GPR and TDGPR, the number of training samples is equal to the size of the moving window, and the rest of the dataset is used as test samples;
2. In the three TD based methods, i.e. TDGPR, MWTDGPR and LTR-MWTDGPR, the time difference order j is set as 1 during model performance comparison;
3. To introduce the local delay, all MW based methods are implemented with relatively small window sizes (30, 50, 70, 90, 110), so as to reduce the algorithm complexity and enhance model reliability at the same time.
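Under these settings, the moving-window TD methods follow a loop of roughly the following shape. This is only a sketch: `drift_model` is a hypothetical callable standing in for the GPR fit-and-predict step, and the least-squares stand-in `ls_drift_model` is not a method from the paper; it is used here purely to keep the example self-contained. The local delay reconstruction of Section 2.1 is omitted; in the proposed method it would be applied to each window before differencing.

```python
def moving_window_td_predict(X, y, window, j, drift_model):
    """One-step-ahead TD predictions with a sliding window of `window` samples.

    drift_model(dX, dy, dx_new) returns the predicted j-th order difference of y.
    """
    preds = []
    for t_new in range(window, len(y)):
        # Most recent `window` samples; the oldest sample drops out each step
        Xw, yw = X[t_new - window:t_new], y[t_new - window:t_new]
        # j-th order time differences inside the window (Eq. 5)
        dX = [[a - b for a, b in zip(Xw[k], Xw[k - j])] for k in range(j, window)]
        dy = [yw[k] - yw[k - j] for k in range(j, window)]
        dx_new = [a - b for a, b in zip(X[t_new], X[t_new - j])]
        # Eq. (9): measured y(t_new - j) plus the predicted drift
        preds.append(y[t_new - j] + drift_model(dX, dy, dx_new))
    return preds

def ls_drift_model(dX, dy, dx_new):
    """Stand-in for GPR: single-input least squares through the origin."""
    num = sum(row[0] * d for row, d in zip(dX, dy))
    den = sum(row[0] ** 2 for row in dX)
    return (num / den) * dx_new[0] if den else 0.0
```

The model is rebuilt from scratch at every step, which matches the observation that adaptive runtime grows with the window size. The loop also assumes that the measured output at time t_new − j is available when the prediction for t_new is made.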

Fig. 10 – Relationship between time difference order j and RMSE of TDGPR, MWTDGPR and the proposed method under L = 30 for debutanizer.

Take the model performance of two window sizes L = 50, L = 110 as an example, the detailed computed accuracy indexes of the above-mentioned 5 methods are listed in Table 4. As observed in Table 4(a) and (b), with the window sizes L = 50 and L = 110, predition RMSE and r2 of different models are presented accordingly. Note that, no matter with a relatively small L(L = 50) or with a large L(L = 110), on the basis of most recent measured value (j = 1), the prediction results of the proposed method have all exhibited the least RMSE values and the highest r2 values in these two cases,which indicates that the proposed method has the best prediction accuracy and highest tracking performance, and it is not very sensitive to the selection of window size. In Table 4, the average computational results for adaptive model building are also provided. The computer configuration is as follows: OS: Windows 7 (32 bit), CPU: Intel(R) Core(TM) i5-4570 (3.20 GHz), RAM: 4G byte, and MATLAB 2013a is used. Of the 3 adaptive models, CPU runtime of the proposed method increases significantly with rising window size, and constant parameters update seems to be detrimental to its speed. However, the computation load would not become an issue if a small L (e.g., L = 30) is selected. To justify the superiority of fuzzy curve analysis algorithm in local time-delay estimation, the conventional correlation coefficient (CC) analysis and the mutual information (MI) analysis (Souza et al., 2010) are respectively introduced into local time-delay analysis procedure under the proposed LTRMWTDGPR framework, and the two corresponding tailored LTR-MWTDGPR algorithms are referred to as CC-MWTDGPR and MI-MWTDGPR. Take L = 50 as an example, the main computed indexes using dataset of SRU with (j = 1) are compared in Table 5. It is clearly shown that FCA based LTR-MWTDGPR exhibits superiority over CC and MI based methods in estimating local time-delay. 
In addition, to further demonstrate the necessity of the local delay estimation procedure, simulation comparisons are carried out on the TDGPR, MWTDGPR and LTR-MWTDGPR models under different j values, as shown in Fig. 6. As j varies from 1 to 9, the accuracy of the traditional TDGPR declines drastically, whereas TDGPR updated with the MW adaptive mechanism performs noticeably better than the plain TDGPR model. At the same time, after further enhancement by extracting the local delay, the proposed method performs better than MWTDGPR in most cases. In Fig. 6(a), the proposed method shows an apparent improvement in model performance. Since the local time-delay feature and the extraction of sufficient data information are both considered, the proposed method outperforms the traditional MWTDGPR and TDGPR methods in both accuracy and reliability. In Fig. 6(b), the proposed method shows little advantage over MWTDGPR, owing to an inadequate ability to capture local time-delay changes despite the abundant time re-matched training samples for TD modeling.

Fig. 11 – RMSE values of LTR-MWTDGPR method with different window sizes for debutanizer. (Plot of RMSE value against j = 1–9, with one curve each for L = 30, 50, 70, 90 and 110.)

To better understand the impact of window size L on the performance of the LTR-MWTDGPR method, the prediction RMSE trends under five different window sizes for the SRU are plotted in Fig. 7. As Fig. 7 shows, when a small window size (e.g., L = 30) is employed to track both the time-delay and the time-varying drift dynamics, it is desirable to choose j from 1 to 5. If a current-time prediction of the H2S concentration is wanted from, or can only be obtained from, sample information collected 6–8 min ago, it is recommended that L be selected around 50–70, which is more likely to achieve a better balance between local time-delay estimation and nonlinear dynamics characterization.
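The roles of the time difference order j and the window size L discussed above can be illustrated with a small sketch. It is given under stated assumptions and is not the implementation used in this work: a single input variable is used, the helper names `td_pairs`, `fit_slope` and `predict_td` are hypothetical, and a least-squares slope through the origin stands in for the GPR model purely to keep the example short.

```python
import math


def td_pairs(x, y, j):
    """Build time-difference samples (x(t) - x(t-j), y(t) - y(t-j))."""
    dx = [x[t] - x[t - j] for t in range(j, len(x))]
    dy = [y[t] - y[t - j] for t in range(j, len(y))]
    return dx, dy


def fit_slope(dx, dy):
    """Least-squares slope through the origin (placeholder for GPR)."""
    den = sum(v * v for v in dx)
    return sum(a * b for a, b in zip(dx, dy)) / den if den else 0.0


def predict_td(x_hist, y_hist, x_new, j, L):
    """Predict y(t) from history ending at time t - j.

    x_hist[-1] and y_hist[-1] are x(t-j) and y(t-j); x_new is x(t).
    Only the latest L labeled samples (the moving window) are used.
    """
    xw, yw = x_hist[-L:], y_hist[-L:]
    dx, dy = td_pairs(xw, yw, j)
    slope = fit_slope(dx, dy)
    # TD prediction: last measured output plus the predicted change.
    return y_hist[-1] + slope * (x_new - x_hist[-1])


# Hypothetical data: y = 2x + 5, where the constant 5 plays the role of an offset.
x_hist = [math.sin(0.2 * t) + 0.05 * t for t in range(40)]
y_hist = [2.0 * v + 5.0 for v in x_hist]
x_new = math.sin(0.2 * 42) + 0.05 * 42  # query at t = 42, i.e. j = 3 after t = 39
pred = predict_td(x_hist, y_hist, x_new, j=3, L=30)
```

The constant offset cancels in the differences, so the model trained on the windowed TD samples never has to learn it. A larger j shortens the list of usable TD pairs inside a window of fixed L, which is one way to see why j and L have to be chosen together.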

3.2. Debutanizer column

The debutanizer column is an essential part of the desulfuring and naphtha splitter plant. In this process, propane and butane are required to be removed as overheads from the naphtha stream. The flowchart of the simplified debutanizer process is shown in Fig. 8. In this section, simulation experiments are conducted on a real process dataset of the debutanizer to test the reliability of the established soft sensors. The input variables that can be directly measured online in the debutanizer column are: x1 - top temperature; x2 - top pressure; x3 - reflux flow; x4 - flow to next process; x5 - 6th tray temperature; x6 - bottom temperature 1; x7 - bottom temperature 2. The quality variable, namely the butane content, cannot be measured directly online; its values are acquired from on-site online analyzers. The sampling interval of the process dataset is 6 min. The measuring cycle and installation location of the online analyzers result in an approximate delay of 45–90 min between the process variables and the arrival of the butane content value (Fortuna et al., 2007, 2005). To improve the control quality of the process, real-time and just-in-time estimation of the butane content is needed. For simplicity, the two bottom temperature variables are averaged here into one variable; the trends of the 6 input variables and 1 output variable are plotted in Fig. 9. On the basis of prior knowledge, we set Tmax to 19, and a total of 800 consecutive samples of the dataset are collected for soft sensor development and simulation discussion.

As in the SRU case, the GPR, MWGPR, TDGPR, MWTDGPR and LTR-MWTDGPR models are each used to set up soft sensors for real-time prediction of the butane content. Table 6 compares the different models; the two computed indicator values are given, and the simulation settings of the five models are exactly the same as those for the SRU. TDGPR, MWTDGPR and the proposed method use j = 1. Taking L = 30 as an example, the detailed RMSE and r2 indicators are listed in the table. Table 6 clearly shows that the global model without an updating mechanism, i.e., the GPR model, has the largest RMSE, and its tracking ability is far from adequate for process time-varying dynamics.
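The two indicators reported in the comparison tables can be computed as in the sketch below. This is an illustration under an assumption: r2 is taken here to be the coefficient of determination, 1 - SS_res / SS_tot, which may differ from the exact definition given earlier in the paper; the function names `rmse` and `r2` and the sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted outputs."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination (assumed definition of r2).

    Assumes y_true is not constant, otherwise SS_tot would be zero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
err = rmse(y_true, y_pred)
fit = r2(y_true, y_pred)
```

A lower RMSE reflects smaller prediction errors, while an r2 closer to 1 reflects how well the predictions follow the variation of the measured output, which is why the two are read together as accuracy and tracking indicators.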
After the traditional GPR is integrated with an adaptive mechanism, such as the MW strategy (MWGPR) or the TD strategy (TDGPR), the modeling accuracy improves to a large degree. Since the traditional TDGPR model is not updated with the most recent dynamics, it tends to deteriorate significantly as the time difference increases. After the moving window strategy is added to TDGPR modeling, time-varying and nonlinear drift characteristics are both well dealt with. The proposed MWTDGPR with the local time-delay reconstruction strategy maintains the highest prediction accuracy and explanation capability of the process dynamics.

Fig. 10 compares the reliability of the different TD-based models with L = 30. The global TDGPR model cannot capture the drift nonlinearity well as the time difference order increases and deteriorates drastically, while the MWTDGPR model performs better in terms of accuracy and stability, owing to the incorporation of time-varying process dynamics into the soft sensor model. By further extracting the process delay, the proposed LTR-MWTDGPR method enhances the prediction accuracy of the MWTDGPR method significantly, and the influence of the time difference is greatly reduced as well.

The selection of window size plays an important role in soft sensor development for the debutanizer column. As observed from Fig. 11, the plotted RMSE trends roughly indicate that the reliability of the LTR-MWTDGPR model tends to decline as L increases. In terms of prediction accuracy and TD model validity, L should take a relatively small value here, such as L = 30. On the one hand, with a small L, local time-delay characteristics can be best handled, so the underlying local delay information can be fully extracted. On the other hand, a relatively small L also makes the reconstructed samples the most informative and the most closely related to local drift dynamics; thus time-varying behavior and local nonlinearity can both be handled by a corrected TDGPR model. After a series of simulation comparisons, the results verify that MWTDGPR with local delay estimation and reconstruction can provide real-time prediction of the butane content with high precision, and that it offers better tracking ability in prediction applications where a large time difference is needed.
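The reconstruction step referred to above, in which input samples are re-matched to the output according to the estimated local delays, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the per-variable delays are taken as already estimated and expressed in samples, and the function name `reconstruct_window` and the toy data are hypothetical.

```python
def reconstruct_window(X, y, delays):
    """Re-align each input variable with the output using its own delay.

    X      : list of rows [x1(t), ..., xm(t)] inside the local window
    y      : list of outputs y(t) for the same time indices
    delays : non-negative integer delay d_i (in samples) for each input
    Returns rows [x1(t - d1), ..., xm(t - dm)] paired with y(t).
    """
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    d_max = max(delays)
    X_rec, y_rec = [], []
    # Only time indices with a full set of delayed inputs can be kept.
    for t in range(d_max, len(y)):
        X_rec.append([X[t - d][i] for i, d in enumerate(delays)])
        y_rec.append(y[t])
    return X_rec, y_rec


# Toy window of 6 samples with two inputs; the first input lags by 2 samples.
X = [[t, 10 * t] for t in range(6)]
y = [100 + t for t in range(6)]
X_rec, y_rec = reconstruct_window(X, y, delays=[2, 0])
```

The reconstructed window is shorter than the original by the largest delay, so a very small L combined with large delays leaves few samples for TD modeling; this trade-off is consistent with the window-size discussion above.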

4. Concluding remarks

In this paper, a novel soft sensor modeling strategy, referred to as local time-delay reconstruction based moving window time difference Gaussian process regression (LTR-MWTDGPR), is proposed for estimating the quality variables of time-varying nonlinear processes with time-delays. The work has three main aspects. First, the moving window method and the TD strategy are combined to adapt to both gradual and abrupt changes in process characteristics. Second, a fuzzy curve analysis based local time-delay estimation and reconstruction algorithm is performed to correct the mapping relationship between input and output samples in the local window, which retains the most informative data for subsequent modeling. Third, the real-time quality prediction is obtained using the TDGPR modeling method, which is capable of capturing local drifts. The proposed LTR-MWTDGPR soft sensor is applied to a sulfur recovery unit with abrupt changes and to an industrial debutanizer column process with apparent drift dynamics. Compared with traditional methods, the proposed method is superior in handling nonlinearity, time-varying characteristics, or both in soft sensor development. Moreover, the presented method can also be extended as an effective solution to different types of modeling problems in other nonlinear time-varying processes with time-delay.

Acknowledgments

The authors acknowledge the financial support of the National Natural Science Foundation of China (Nos. 21206053, 21276111), the 111 Project (B12018), the Six Talent Peaks Project in Jiangsu Province (2013-DZXX-043), the Fundamental Research Funds for the Central Universities (JUSRP51510), and Alberta Innovates Technology Futures.

References

Ahmad, I., Kano, M., Hasebe, S., et al., 2014. Gray-box modeling for prediction and control of molten steel temperature in tundish [J]. J. Process Control 24 (4), 375–382.
Bishop, C.M., 2006. Pattern Recognition and Machine Learning [M]. Springer, New York.
Bozorg, M., Davison, E.J., 2006. Control of time delay processes with uncertain delays: time delay stability margins [J]. J. Process Control 16 (4), 403–408.
Chen, T., Wang, B., 2010. Bayesian variable selection for Gaussian process regression: application to chemometric calibration of spectrometers [J]. Neurocomputing 73 (13), 2718–2726.
Cheng, C., Chiu, M.S., 2005. Nonlinear process monitoring using JITL-PCA [J]. Chemom. Intell. Lab. Syst. 76 (1), 1–13.
Facco, P., Doplicher, F., Bezzo, F., et al., 2009. Moving average PLS soft sensor for online product quality estimation in an industrial batch polymerization process [J]. J. Process Control 19 (3), 520–529.


Fortuna, L., Rizzo, A., Sinatra, M., et al., 2003. Soft analyzers for a sulfur recovery unit [J]. Control Eng. Pract. 11 (12), 1491–1500.
Fortuna, L., Graziani, S., Xibilia, M.G., 2005. Soft sensors for product quality monitoring in debutanizer distillation columns [J]. Control Eng. Pract. 13 (4), 499–508.
Fortuna, L., Graziani, S., Rizzo, A., et al., 2007. Soft Sensors for Monitoring and Control of Industrial Processes [M]. Springer Verlag, Berlin.
Ge, Z., 2016. Active probabilistic sample selection for intelligent soft sensing of industrial processes [J]. Chemom. Intell. Lab. Syst. 151, 181–189.
Gholami, A., Shahbazian, M., Safian, G., 2015. Soft sensor development for distillation columns using fuzzy C-means and the recursive finite Newton algorithm with support vector regression (RFN-SVR) [J]. Ind. Eng. Chem. Res. 54 (48), 12031–12039.
He, Y.L., Geng, Z.Q., Zhu, Q.X., 2016. Soft sensor development for the key variables of complex chemical processes using a novel robust bagging nonlinear model integrating improved extreme learning machine with partial least square [J]. Chemom. Intell. Lab. Syst. 151, 78–88.
Kadlec, P., Gabrys, B., Strandt, S., 2009. Data-driven soft sensors in the process industry [J]. Comput. Chem. Eng. 33 (4), 795–814.
Kadlec, P., Grbić, R., Gabrys, B., 2011. Review of adaptation mechanisms for data-driven soft sensors [J]. Comput. Chem. Eng. 35 (1), 1–24.
Kaneko, H., Funatsu, K., 2011a. Development of soft sensor models based on time difference of process variables with accounting for nonlinear relationship [J]. Ind. Eng. Chem. Res. 50 (18), 10643–10651.
Kaneko, H., Funatsu, K., 2011b. Maintenance-free soft sensor models with time difference of process variables [J]. Chemom. Intell. Lab. Syst. 107 (2), 312–317.
Kaneko, H., Funatsu, K., 2013. Classification of the degradation of soft sensor models and discussion on adaptive models [J]. AIChE J. 59 (7), 2339–2347.
Kaneko, H., Funatsu, K., 2015. Moving window and just-in-time soft sensor model based on time differences considering a small number of measurements [J]. Ind. Eng. Chem. Res. 54 (2), 700–704.
Kaneko, H., Okada, T., Funatsu, K., 2014. Selective use of adaptive soft sensors based on process state [J]. Ind. Eng. Chem. Res. 53 (41), 15962–15968.
Khatibisepehr, S., Huang, B., Khare, S., 2013. Design of inferential sensors in the process industry: a review of Bayesian methods [J]. J. Process Control 23 (10), 1575–1596.
Komulainen, T., Sourander, M., Jämsä-Jounela, S.L., 2004. An online application of dynamic PLS to a dearomatization process [J]. Comput. Chem. Eng. 28 (12), 2611–2619.
Li, W., Yue, H.H., Valle-Cervantes, S., et al., 2000. Recursive PCA for adaptive process monitoring [J]. J. Process Control 10 (5), 471–486.
Lin, Y., Cunningham, G.A., 1995. A new approach to fuzzy-neural system modeling [J]. IEEE Trans. Fuzzy Syst. 3 (2), 190–198.

Liu, Y.Q., Huang, D.P., Li, Y., 2012. Development of interval soft sensors using enhanced just-in-time learning and inductive confidence predictor [J]. Ind. Eng. Chem. Res. 51 (8), 3356–3367.
Lu, B., Castillo, I., Chiang, L., et al., 2014. Industrial PLS model variable selection using moving window variable importance in projection [J]. Chemom. Intell. Lab. Syst. 135, 90–109.
Lv, Y., Liu, J., Yang, T., 2012. Nonlinear PLS integrated with error-based LSSVM and its application to NOx modeling [J]. Ind. Eng. Chem. Res. 51 (49), 16092–16100.
Matias, T., Souza, F., Araujo, R., et al., 2015. On-line sequential extreme learning machine based on recursive partial least squares [J]. J. Process Control 27, 15–21.
Rasmussen, C.E., Williams, C.K.I., 2006. Gaussian Processes for Machine Learning [M]. The MIT Press, Cambridge.
Richard, J.P., 2003. Time-delay systems: an overview of some recent advances and open problems [J]. Automatica 39 (10), 1667–1694.
Souza, F., Santos, P., Araujo, R., 2010. Variable and delay selection using neural networks and mutual information for data-driven soft sensors [C]. IEEE Conference on Emerging Technologies and Factory Automation, 1–8.
Tufa, L.D., Ramasamy, M., 2011. Closed-loop identification of systems with uncertain time delays using ARX–OBF structure [J]. J. Process Control 21 (8), 1148–1154.
Xu, O., Liu, J., Fu, Y., et al., 2015. Dual updating strategy for moving-window partial least-squares based on model performance assessment [J]. Ind. Eng. Chem. Res. 54 (19), 5273–5284.
Yan, W., Shao, H., Wang, X., 2004. Soft sensing modeling based on support vector machine and Bayesian model selection [J]. Comput. Chem. Eng. 28 (8), 1489–1498.
Yuan, X., Ye, L., Bao, L., et al., 2015a. Nonlinear feature extraction for soft sensor modeling based on weighted probabilistic PCA [J]. Chemom. Intell. Lab. Syst. 147, 167–175.
Yuan, X., Ge, Z., Song, Z., 2015b. Spatio-temporal adaptive soft sensor for nonlinear time-varying and variable drifting processes based on moving window LWPLS and time difference model [J]. Asia Pac. J. Chem. Eng. 11 (2), 209–219.
Yuan, X., Huang, B., Ge, Z., et al., 2016. Double locally weighted principal component regression for soft sensor with sample selection under supervised latent structure [J]. Chemom. Intell. Lab. Syst. 153, 116–125.
Zhang, J., Jin, Q.B., Xu, Y.M., 2006. Inferential estimation of polymer melt index using sequentially trained bootstrap aggregated neural networks [J]. Chem. Eng. Technol. 29 (4), 442–448.
Zhang, W., Li, Y., Xiong, W., et al., 2015. Adaptive soft sensor for online prediction based on enhanced moving window GPR [C]. Control, automation and information sciences (ICCAIS). 2015 International Conference on IEEE, 291–296.