
Data-driven soft sensor approach for online quality prediction using state dependent parameter models

Bahareh Bidar, Jafar Sadeghi*, Farhad Shahraki
Department of Chemical Engineering, University of Sistan and Baluchestan, P.O. Box 98164-161, Zahedan, Iran
[email protected], [email protected], [email protected]
* Corresponding author.

Abstract
The present study describes the design and implementation of a new data-driven soft sensor that uses state dependent parameter (SDP) models to improve product quality monitoring in a real industrial debutanizer column. The model parameters are assumed to be functions of the system states and are estimated within the data-based modeling philosophy using the state dependent parameter method. A comparative study of different soft sensing methods for online monitoring of the debutanizer column is also carried out. The results show that the process non-linearity can be addressed under this modeling method and that the behavior of the process is well tracked even when missing values exist in the observed data. The results also indicate that the new model is more robust and reliable, with fewer model parameters, which makes it attractive for industrial applications. In addition, the performance indexes show the superiority of the proposed model over other conventional soft sensing methods.
Keywords: soft sensor; data-based modeling; state dependent parameter; debutanizer column; quality prediction; missing value

1. Introduction
Modern processes trend towards safer, cleaner, more energy efficient and more profitable production in chemical plants and refineries. These aspects can be addressed by installing advanced monitoring and control systems in industrial plants, which rely on the measurement of different process variables. However, industrial processes face many problems in measuring primary variables such as product quality, and difficulties in measuring quality-related variables inevitably mean poor control or no control at all. Although hardware sensors are adopted in industrial plants to measure quality-related variables and deliver data for process monitoring and control, they are frequently too expensive, difficult to install, and simply not reliable or accurate enough. On some occasions, these variables are obtained through

laboratory analyses, which lead to infrequent and sometimes inaccurate estimates. This negatively affects process monitoring and control performance, which can result in increased production costs, lower product quality or even dangerous situations for the plant, personnel and the environment [1, 2]. It is well known that industrial processes are generally quite complex to model and are characterized by significant, inherent non-linearities, so a rigorous theoretical modeling approach is often impractical, expensive and requires a great amount of effort. Moreover, large amounts of data can be collected and stored within process plants, and these are a useful source of information for building predictive models and for identification [3]. A soft sensor is an inferential model or estimation algorithm that relates available sensor measurements to important process variables that are correlated with them but not physically measured [4-6]. Soft sensors have become an important tool for industrial processes to eliminate, or at least reduce, the above mentioned problems. If sufficiently accurate, the inferred primary outputs may then be used as feedback for automatic control and optimization. In this regard, data-driven soft sensors have gained continuously increasing attention, since they are easy to develop and to embed in automatic control systems [7, 8]. Various data-driven soft sensor approaches have been proposed for online prediction and process monitoring in recent years. The most popular linear approaches are multivariate statistical methods such as principal component analysis (PCA) [9, 10] in combination with a regression model, principal component regression (PCR) [9, 11, 12], and partial least squares (PLS) [13, 14]. Multivariate statistical methods have been widely used in plant data-based modeling due to their simplicity and clear mathematical background. Although multiple linear regression (MLR) is one of the most widely used regression methods, its prediction performance often fails in the presence of correlated process variables, large datasets and measurement noise [15, 16]. The PCA method also gives poor predictions because it ignores the input-output relationship and the non-linearity of the data; it is usually used as a pre-processing step followed by other computational methods. The PLS method is an extension of PCA that considers the correlation between input and output variables, but it also models only linear relations between the data [2, 4]. Further, several extended algorithms of the PCA and PLS approaches using local modeling methods [17-19] and supervised or semi-supervised methods [9, 20, 21] have been proposed to deal with these problems.

The most popular nonlinear approaches are artificial neural networks (ANN) [22-24], neuro-fuzzy systems (NFS) [25, 26], support vector machines (SVM) [27, 28], kernel PCA (KPCA) methods [29-33] and so forth. A fundamental problem in the modeling of nonlinear systems is to predict the system output as a function of past inputs and outputs. Artificial neural network and neuro-fuzzy models, however, do not explicitly represent such relationships, which would otherwise make them ideally suited for various standard applications such as controller design or online prediction. Although the aforementioned nonlinear methods have been reported more and more in the literature, linear modeling methods often seem more practically useful [5]. While multivariate statistical methods are currently well known, their deployment in industrial applications remains a challenge because of data-related issues such as outliers, collinearity and missing data in industrial processes [4, 34]. Currently, the most common approach to dealing with these issues is to obtain as much process knowledge as possible and to incorporate this knowledge into the model in the form of data pre-processing [35]. One of the first steps of any data pre-processing is to identify and document the extent of the missing data. Missing data can arise for several reasons, such as sensor failures, multiple sampling rates, and network transmission losses or delays. There is no single solution for handling missing data. Most statistical procedures automatically exclude observations that are missing values for any variables being analyzed, regardless of whether the analyst is aware of these exclusions. Moreover, many data imputation algorithms have been developed, such as the EM-based maximum likelihood approach and data augmentation (DA). The required process knowledge has to be acquired for each new soft sensor to be built, because the optimal strategy depends on the study design, the goals of the analysis, and the pattern of the missing data. After acquiring the knowledge, it has to be manually implemented into the models, which is also very time-consuming [36, 37]. Additionally, most industrial processes have non-stationary characteristics, whose static and dynamic behavior changes over time, and thus require a strategy for online adaptation. It can be observed that the predictive accuracy of soft sensor models decreases during online operation; this is called soft sensor model degradation. Consequently, soft sensor models should be updated automatically and frequently in order to deal with changes in the process plant and maintain high performance. Otherwise, after some time, the performance of the model reaches an unacceptable level and the model has to be re-tuned or re-developed [6, 38, 39].

Moving window [40-42] and recursive methods [11, 43, 44] have been commonly employed to adapt soft sensors to new process dynamics, and successful applications of these methods have been reported. Although these methods can adapt soft sensors to new process states, they fail to cope with abrupt changes of the process characteristics. Moreover, a single global model cannot perform well over a wide operating range when strong process non-linearities exist. Compared with a global model, local learning based soft sensors construct several local models, one for each specific operating range of the process [17-19]. There are several well-known local modeling methods, such as neuro-fuzzy and T-S fuzzy models, most of which suffer from requiring a priori knowledge to determine the division of the operating space. The just-in-time (JIT) models were developed to build the local models automatically [45-48]. JIT modeling can cope with changes in process characteristics as well as non-linearity, but its performance is not always satisfactory because the correlation between the process variables is ignored. In the correlation-based just-in-time learning (CoJIT) method, the correlation between the process variables is utilized for local model adaptation. Although CoJIT often outperforms conventional JIT learning, it needs massive memory space and sometimes leads to inappropriate model adaptation [49, 50]. Numerous studies have been conducted over the years to model industrial processes, ranging from simple linear models to highly complicated nonlinear ones. The parameters of all the proposed models are considered constant or only time variable. In some cases, however, the non-linearity and chaotic behavior of the system arise from its state dependency, and hence the system behavior cannot be identified well with constant or slowly changing parameter models. The data-based mechanistic (DBM) modeling philosophy was introduced and has been developed by Peter C. Young and co-workers [51-61]. The approach is efficient in both modeling and control of nonlinear systems, in contrast to complicated modeling strategies, and the DBM method provides a clear indication of the model structure where classical methods fail to converge on a sensible set of parameters. In this philosophy, the non-stationary and nonlinear aspects of the system are reflected by a simple-structure varying parameter model, whose parameters vary with time or as a function of the state variables of the system. These types of models are called state dependent parameter (SDP) models [62, 63]. The idea of SDP modeling originated with Peter C. Young [64, 65] and Jerry Mendel [66], and its practical development was then explored within a broader setting [51, 52, 62, 63, 67-69]. The SDP method normally provides quasi-linear models

with fewer parameters than other data-driven models and with higher prediction accuracy, involving the system inputs, outputs and their past values. The major advantage of the method is that it balances the order of the model (simplicity) against the model efficiency, while at the same time providing a good explanation of the data and the statistical analysis required for the identification and estimation of such systems. Because the resultant model is simple and directly in digital form, it avoids both the complications and the errors produced through either simplification or digitization of a complex continuous-time model. It is also worthwhile to point out that the resulting model and its parameters, according to the DBM philosophy, are meaningful and interpretable, whereas in purely data-based, non-mechanistic modeling this is generally not the case [70]. However, the design of robust soft sensors based on the SDP method and its application to industrial chemical processes have not been considered yet. The main novelty of the current study is to propose a new data-driven soft sensing model based on the SDP method. It is used to estimate the butane concentration in an industrial debutanizer column in order to illustrate the practical application and effectiveness of the method. Based on the effective input variables, several SDP models are constructed, and the prediction accuracy of each SDP model is compared through performance indexes. The most significant input variables are selected based on correlation analysis and prediction accuracy, and they are then considered as the best inputs of the model. The selected structure is then evaluated on a dataset in which some data are randomly missing, in order to demonstrate the capability of handling missing data automatically. The other contribution of the present study is a comparative study of soft sensor techniques that have already been proposed for the debutanizer column. The rest of the paper is organized as follows. In section 2, an overview of DBM and SDP modeling is briefly introduced, and the modeling method is described in detail, including the fixed interval smoothing (FIS) and back-fitting algorithms. In section 3, an industrial chemical process, namely the debutanizer column, is employed to demonstrate the feasibility and effectiveness of the proposed model; in addition, a complete overview of previously proposed soft sensors is carried out and a summary of their prediction results is presented. Section 4 gives the overall analysis of the online quality predictions, and some conclusions are drawn in the last section.

2. Preliminaries
In this section, the DBM philosophy is first briefly described; the section then goes on to review the identification and estimation of SDP models for nonlinear, stochastic, dynamic systems, which are playing an increasingly important role in DBM modeling.

2.1. Data-based mechanistic modeling

The DBM approach has been developed primarily for the direct analysis of experimental or observed data and has been applied successfully to various environmental, biological, ecological, engineering and economic systems. DBM models are most useful when the time-series data are rich enough to identify the system's behavior completely; in the absence of such data, the DBM modeling strategy does a poor job, although of course any estimation method will fail when the data do not exhibit the modeled modes of system behavior. Moreover, the prior assumptions about the form and structure of the model are kept to a minimum in order to avoid the prejudicial imposition of untested perceptions about the nature and complexity of the model [71]. Appropriate model structures are identified by a process of statistical inference applied directly to the time-series data, in which parameters are estimated as time variable parameters (TVPs) or state dependent parameters. However, the parameters of TVP models are assumed to vary slowly compared with those of SDP models, so SDP models are able to describe a widely applicable class of nonlinear systems that includes rapid parametric changes. These state dependencies are estimated in the form of non-parametric relationships. The non-parametric state dependent relationships are normally parameterized in a finite form, and the resulting (normally constant) nonlinear model parameters are estimated using some form of numerical optimization, such as nonlinear least squares, prediction error minimization or maximum likelihood (ML) optimization [62, 72, 73].
2.2. Identification of TVP and SDP models

The estimation of ‘slowly changing’ time variable parameters in the various kinds of linear regression models has been discussed by Young [74]. The term ‘slow’ here means that the temporal variation of parameters is slow in comparison with the variation of the system variables (input, output, and states) in the model. These TVP models include the dynamic linear regression (DLR), dynamic auto regression (DAR), dynamic harmonic regression (DHR) and dynamic auto-regressive with exogenous variables (DARX) models [75]. While the DARX model can

describe a truly dynamic system and produce complex response characteristics because of its TVPs, it is only when these parameters are functions of the system variables, and vary at a rate related to these variables, that the resultant model can behave in a heavily nonlinear or even chaotic manner. These models are known as state dependent parameter ARX (SDARX) models [62]. The SDARX model equation can be written most conveniently for estimation purposes in the following form:

y_t = a_1(\chi_t) y_{t-1} + \dots + a_n(\chi_t) y_{t-n} + b_0(\chi_t) u_{t-\delta} + \dots + b_m(\chi_t) u_{t-\delta-m} + e_t    (1)

or, in more general form,

y_t = \sum_{i=1}^{n+m+1} p_{i,t} z_{i,t} + e_t    (2)

where y_t is the observed output and

p_{i,t} = \begin{cases} a_i(\chi_t), & i = 1, 2, \dots, n \\ b_{i-n-1}(\chi_t), & i = n+1, \dots, n+m+1 \end{cases}    (3)

are the state dependent parameters, which are functions of one variable in the state vector \chi_t = [y_{t-1}, \dots, y_{t-n}, u_{t-\delta}, \dots, u_{t-\delta-m}, U_t]. Here U_t is a vector of other variables that may affect the relationship between these two variables. In addition, n+m+1 is the number of parameters, and z_{i,t} is the regressor, as follows:

z_{i,t} = \begin{cases} y_{t-i}, & i = 1, 2, \dots, n \\ u_{t-\delta-(i-n-1)}, & i = n+1, \dots, n+m+1 \end{cases}    (4)

The term \delta is a pure time delay, measured in sampling intervals, which is introduced to allow for any time delay that may occur between the incidence of a change in u_t and its first effect on y_t. The term e_t is a zero mean, white noise input with Gaussian amplitude distribution and variance \sigma^2 (although this assumption is not essential to the practical application of the resulting estimation algorithms).
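To make the structure of Eqs. (1)-(4) concrete, the following minimal MATLAB sketch simulates a first-order SDARX model (n = 1, m = 0, \delta = 1) in which the single autoregressive parameter depends on the lagged output. The parameter function a1fun and the input signal are purely illustrative assumptions, not the model identified later in this paper.

```matlab
% Minimal SDARX simulation sketch (illustrative only):
% y_t = a1(y_{t-1}) * y_{t-1} + b0 * u_{t-1} + e_t
rng(0);                                   % reproducible noise
N  = 500;                                 % number of samples
u  = randn(N,1);                          % assumed exogenous input
e  = 0.05*randn(N,1);                     % zero mean white observation noise
a1fun = @(x) 0.8 - 0.3*tanh(x);           % hypothetical state dependent parameter
b0 = 0.5;                                 % constant input parameter (assumption)

y = zeros(N,1);
for t = 2:N
    % regressors z_t = [y_{t-1}; u_{t-1}], parameter a1 evaluated at the state y_{t-1}
    y(t) = a1fun(y(t-1))*y(t-1) + b0*u(t-1) + e(t);
end

% one-step-ahead prediction when the parameter functions are known
yhat = [0; a1fun(y(1:N-1)).*y(1:N-1) + b0*u(1:N-1)];
```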

Before considering full SDP models such as Eq. (1), it is instructive to first deal with the simpler situation where the parameters in Eq. (2) are slowly variable with time. In order to estimate these time variable parameters, it is necessary to make some assumptions about the nature of their temporal variability. One of the simplest and most generally useful classes of stochastic models involves the assumption that the i-th parameter p_{i,t} is defined by a two-dimensional stochastic state vector x_{i,t} = [l_{i,t} \; d_{i,t}]^T, where l_{i,t} and d_{i,t} are the changing level and slope of the associated TVP, respectively. The stochastic evolution of each x_{i,t} (and, therefore, of each of the n+m+1 parameters in Eq. (2)) is assumed to be described by a generic generalized random walk (GRW) process [76, 77]. It is defined in state space form as

x_{i,t} = F_i x_{i,t-1} + G_i \eta_{i,t}    (5)

where

F_i = \begin{bmatrix} \alpha & \beta \\ 0 & \gamma \end{bmatrix}, \quad G_i = \begin{bmatrix} \delta & 0 \\ 0 & \varepsilon \end{bmatrix}    (6)

and \eta_{i,t} is a 2×1, zero mean, white noise vector that allows for stochastic variability in the parameters and is assumed to be characterized by a (normally diagonal) covariance matrix Q_{\eta,i}. The parameters \alpha, \beta, \gamma, \delta, \varepsilon and the elements of Q_{\eta,i} are referred to as hyper-parameters, which are assumed to be unknown a priori and should be estimated from the observed data. These normally constant parameters can adopt different values to produce special cases such as the IRW, RW, SRW, AR(1), LLT and DT models (see reference [62]). An overall state space model can then be constructed by the aggregation of the subsystem matrices defined in Eq. (5), with the observation equation defined by the model of Eq. (7-b).

State equation:

x_t = F x_{t-1} + G \eta_t    (7-a)

Observation equation:

y_t = H_t x_t + e_t    (7-b)

where y_t is the scalar stochastic observed variable and

x_t = [x_{1,t}^T \; x_{2,t}^T \; \dots \; x_{n+m+1,t}^T]^T    (8)

is the stochastic state vector formed by stacking the n+m+1 parameter state vectors. F is a block diagonal matrix with blocks defined by the F_i matrices in Eq. (6); G is a block diagonal matrix with blocks defined by the corresponding subsystem matrices G_i in Eq. (6); and \eta_t is a vector containing the white noise inputs \eta_{i,t} to each of the GRW models in Eq. (5). These white noise inputs are assumed to be independent of the observation noise e_t and to have a diagonal covariance matrix Q, in which each diagonal element corresponds to one of the GRW noise variances; the smaller the value of the associated element, the smoother the estimated parameter. Additionally, H_t is a row vector of the following form:

H_t = [z_{1,t} \; 0 \; z_{2,t} \; 0 \; \dots \; z_{n+m+1,t} \; 0]    (9)

where the zero entries, i = 1, 2, ..., n+m+1, are zero row vectors conformable with the slope states of the associated x_{i,t}. H_t relates the scalar observation y_t to the state variables, as defined by Eq. (7-b), so that it represents the model in which each parameter is defined as a GRW process.

The state space formulation is particularly well suited for optimal recursive estimation, in which the time variable parameters are estimated sequentially whilst working through the data in temporal and reverse-temporal order, through two separate passes, forward-pass filtering and backward-pass fixed interval smoothing, implemented using Kalman filter concepts. The recursive character of the Kalman filter allows the GRW approach to handle naturally the missing data that are very common in industrial practice and that have no readily apparent pattern. It is sufficiently flexible to modify the covariance of the estimate and manage the estimation; it is also fast in calculation and execution time, and it provides an implicit local kernel for estimation, hence allowing, when necessary, an increase or decrease of the bandwidth [75].

2.2.1. Forward-pass recursive least squares filtering

The Kalman filter equations can be divided into prediction equations (a priori estimate), for the observations and the propagation of the state variables (and their covariance), and correction equations (a posteriori estimate), for updating the state estimates [73]. In relation to the time series y_t, z_{i,t} and the system described by Eqs. (7-a) and (7-b), the prediction and correction equations are as follows.

Prediction:

\hat{x}_{t|t-1} = F \hat{x}_{t-1}    (10-a)

\hat{P}_{t|t-1} = F \hat{P}_{t-1} F^T + G Q_r G^T    (10-b)

The estimation here is based on all the observations before time t. To include time t, the correction step shown in Eq. (11) is needed.

Correction:

\hat{x}_t = \hat{x}_{t|t-1} + \hat{P}_{t|t-1} H_t^T \left[ 1 + H_t \hat{P}_{t|t-1} H_t^T \right]^{-1} \left( y_t - H_t \hat{x}_{t|t-1} \right)    (11-a)

\hat{P}_t = \hat{P}_{t|t-1} - \hat{P}_{t|t-1} H_t^T \left[ 1 + H_t \hat{P}_{t|t-1} H_t^T \right]^{-1} H_t \hat{P}_{t|t-1}    (11-b)

Here \hat{x}_0 and \hat{P}_0 are the initial state estimate and its covariance matrix. Note that the term \left[ 1 + H_t \hat{P}_{t|t-1} H_t^T \right] is simply a scalar quantity. As a result, there is no requirement for direct matrix inversion, even though the repeated solution of the equivalent classical en bloc problem entails inverting an n×n matrix for each solution.
2.2.2. Backward-pass fixed interval smoothing

The estimates obtained from the forward-pass filtering algorithm are based on the initial conditions \hat{x}_0 and \hat{P}_0 and on all estimated values prior to and including sample t. Therefore, they need to be updated sequentially, whilst working through the data in reverse temporal order, using a backward-recursive procedure termed the fixed interval smoothing (FIS) algorithm, in order to obtain a smoothed estimate based on all the elements of the time series and to remove any lag effect:

\hat{x}_{t|N} = F^{-1} \left[ \hat{x}_{t+1|N} + G Q_r G^T L_t \right]    (12-a)

L_t = \left[ I - \hat{P}_{t+1} H_{t+1}^T H_{t+1} \right]^T \left[ F^T L_{t+1} - H_{t+1}^T \left( y_{t+1} - H_{t+1} \hat{x}_{t+1} \right) \right], \quad L_N = 0    (12-b)

\hat{P}_{t|N} = \hat{P}_t + \hat{P}_t F^T \hat{P}_{t+1|t}^{-1} \left[ \hat{P}_{t+1|N} - \hat{P}_{t+1|t} \right] \hat{P}_{t+1|t}^{-1} F \hat{P}_t    (12-c)

The algorithm requires the inversion of the F matrix which, in general, may cause some problems if it is non-invertible. In these algorithms, the noise variance ratio (NVR) matrix Q_r and \hat{P}_t are defined as follows:

Q_r = Q / \sigma^2, \quad \hat{P}_t = P_t^* / \sigma^2    (13)

where P_t^* is the error covariance matrix associated with the state estimates \hat{x}_t, which defines the estimated uncertainty in the parameters. The FIS algorithm runs backwards after the filtering stage and yields a smoothed estimate of the state vector \hat{x}_{t|N} and its covariance matrix \hat{P}_{t|N} based on all N samples. When missing data are encountered, an interpolation is generated based on the data on both sides of the gap and the estimated model.
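As a concrete illustration of the forward-pass filtering and backward-pass smoothing described above, including the automatic interpolation over missing samples, the following MATLAB sketch estimates a single time variable parameter under an IRW model. It uses the mathematically equivalent Rauch-Tung-Striebel form of the smoother rather than Eq. (12), so that no inversion of F is required; the signals and the NVR value are illustrative assumptions only.

```matlab
% Forward Kalman filter + backward (RTS-form) fixed interval smoother for a
% single TVP modeled as an integrated random walk (IRW): y_t = p_t*z_t + e_t.
rng(1);
N   = 400;
z   = 1 + 0.5*randn(N,1);                 % assumed regressor series
p   = 1 + 0.5*sin((1:N)'/60);             % "true" slowly varying parameter
y   = p.*z + 0.1*randn(N,1);              % noisy observations
y(150:170) = NaN;                         % a block of missing data

F = [1 1; 0 1];  G = [0; 1];              % IRW parameter model
nvr = 1e-4;                               % noise variance ratio (hyper-parameter, assumed)

x  = zeros(2,1);  P = 1e3*eye(2);         % diffuse initial conditions
xf = zeros(2,N);  Pf = zeros(2,2,N);      % filtered estimates
xp = zeros(2,N);  Pp = zeros(2,2,N);      % one-step-ahead predictions
for t = 1:N
    x = F*x;  P = F*P*F' + G*nvr*G';      % prediction (Eq. 10)
    xp(:,t) = x;  Pp(:,:,t) = P;
    H = [z(t) 0];
    if ~isnan(y(t))                       % correction (Eq. 11), skipped for missing data
        k = P*H'/(1 + H*P*H');            % scalar innovation variance -> no matrix inverse
        x = x + k*(y(t) - H*x);
        P = P - k*H*P;
    end
    xf(:,t) = x;  Pf(:,:,t) = P;
end

% backward pass (fixed interval smoothing, RTS form)
xs = xf;  Ps = Pf;
for t = N-1:-1:1
    A = Pf(:,:,t)*F'/Pp(:,:,t+1);
    xs(:,t)   = xf(:,t)   + A*(xs(:,t+1)   - xp(:,t+1));
    Ps(:,:,t) = Pf(:,:,t) + A*(Ps(:,:,t+1) - Pp(:,:,t+1))*A';
end
p_hat = xs(1,:)';                         % smoothed estimate of the TVP level
```

Inside the missing-data block the correction step is simply skipped, so the smoothed estimate interpolates across the gap using the data on both sides, as described above.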

2.2.3. Back-fitting algorithm for SDP models

The FIS algorithm works very well when the parameters are changing slowly with time compared with the changes in the temporally observed inputs and output. Since each p_{i,t} is potentially a nonlinear function of a single variable, which might, for instance, be its associated past input or output variable, conventional TVP estimation will fail, because the simple GRW model (Eqs. (5) and (6)) will be unable to track these rapid variations effectively. If, however, the data are re-ordered in a non-temporal order prior to estimation, the variables and associated parameters change quite slowly in the re-ordered space, and the recursive FIS estimation can then be applied using a special iterative back-fitting algorithm. Here, each parameter is estimated based on a modified dependent variable (MDV) series, obtained by subtracting all the other terms on the right-hand side of Eq. (1) from y_t, using the values of the other parameter estimates from the previous iteration. At each back-fitting iteration, the sorting can then be based on the single state variable associated with the current SDP being estimated. The back-fitting algorithm for the SDARX model takes the following form, in which the z_{i,t} are the regressors defined by Eq. (4) and each SDP depends on its associated state variable in \chi_t [62, 63]:

1. Assume that FIS estimation has yielded prior TVP estimates \hat{p}_{i,t|N}^0 of the SDPs.
2. Iterate, for k = 1, 2, ... and for each i = 1, 2, ..., n+m+1:
3. Form the MDV, y_{i,t}^k = y_t - \sum_{j \neq i} \hat{p}_{j,t|N} z_{j,t}.
4. Sort both y_{i,t}^k and z_{i,t} according to the ascending order of the state variable on which p_{i,t} depends.
5. Obtain an FIS estimate \hat{p}_{i,t|N}^k of p_{i,t} in the MDV relationship y_{i,t}^k = p_{i,t} z_{i,t} + e_t.
6. Repeat steps 2-5 until a convergence criterion is met.

The smoothing hyper-parameters required for the FIS estimation at each stage are optimized by the maximum likelihood approach. Note that the optimization can be carried out either at every iteration or only at the initial (or first two) iterations, with the hyper-parameters maintained at these values for the rest of the back-fitting procedure.
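The back-fitting loop itself is compact. The sketch below exercises steps 1-6 for a model with two SDPs; a sliding-window least squares fit in the sorted domain is used as a simple stand-in for the sorted-order FIS estimation, and the data-generating functions are assumptions chosen only to illustrate the algorithm.

```matlab
% Back-fitting sketch for y_t = p1(z1_t)*z1_t + p2(z2_t)*z2_t + e_t  (two SDPs).
rng(2);
N  = 2000;
z1 = randn(N,1);  z2 = randn(N,1);               % regressors; each SDP depends on its own regressor
p1 = 1 + 0.5*tanh(z1);  p2 = 0.4*z2.^2 - 0.3;    % "true" SDPs (assumptions)
y  = p1.*z1 + p2.*z2 + 0.05*randn(N,1);

w  = ones(101,1);                                % local window (bandwidth assumption)
localfit = @(mdv,z) conv(mdv.*z, w, 'same') ./ conv(z.^2, w, 'same');   % stand-in for FIS

p1hat = ones(N,1);  p2hat = zeros(N,1);          % step 1: initial parameter estimates
for k = 1:10                                     % step 2: back-fitting iterations
    mdv = y - p2hat.*z2;                         % step 3: MDV for p1
    [zs, idx] = sort(z1);                        % step 4: sort by the state variable of p1
    ps = localfit(mdv(idx), zs);                 % step 5: smooth estimate in sorted order
    p1hat(idx) = ps;                             % map back to temporal order
    mdv = y - p1hat.*z1;                         % repeat steps 3-5 for p2
    [zs, idx] = sort(z2);
    ps = localfit(mdv(idx), zs);
    p2hat(idx) = ps;
end                                              % step 6: stop after a fixed number of sweeps
% p1hat vs z1 and p2hat vs z2 now approximate the non-parametric SDP relationships
```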

2.2.4. Hyper-parameter optimization

It is necessary to specify the NVR parameters that characterize the parameter variation models, as well as any other unknown hyper-parameters in the state space model (Eqs. (7-a) and (7-b)), in order to represent the parameter variations. The optimization of these hyper-parameters can be accomplished in various ways. In this method, a maximum likelihood approach based on prediction error decomposition (PED) is utilized, in which the nature of the noise processes is also taken into consideration. Given initial values for the hyper-parameters, the Kalman filter algorithm yields the one-step-ahead prediction errors, or residuals, as follows:

\varepsilon_t = y_t - \hat{y}_{t|t-1} = y_t - H_t \hat{x}_{t|t-1}    (14)

The concentrated log-likelihood function can then be expressed as

\log L = -\frac{1}{2} \sum_{t=1}^{N} \left[ \log \hat{\sigma}_{t|t-1}^2 + \frac{\varepsilon_t^2}{\hat{\sigma}_{t|t-1}^2} \right] + \mathrm{const}    (15)

which needs to be maximized (or, equivalently, minimized after multiplication by -1) with respect to the unknown hyper-parameters in order to obtain their ML estimates. Here \hat{\sigma}_{t|t-1}^2 is the covariance of \varepsilon_t. Since the likelihood function is a nonlinear function of the unknown hyper-parameters, the minimization needs to be carried out numerically. It is accomplished by initiating the optimization with hyper-parameter estimates either selected by the user or set to some default values. The recursive filtering part of the algorithm is then used repeatedly to generate the values of \varepsilon_t and \hat{\sigma}_{t|t-1}^2 in Eq. (15) associated with the latest selection of hyper-parameters made by the optimization algorithm. The optimization algorithm then adjusts its selection of hyper-parameter estimates in order to converge on those estimates which minimize this concentrated likelihood [62, 75]. This yields an appropriate low-order model, and the SDP estimates, when plotted against the associated dependent variables, are identified as non-parametric relationships. The estimated linear or nonlinear relationship can then be parameterized in some convenient form, such as a linear, polynomial, exponential, power law or trigonometric function, or a more general radial basis function, neuro-fuzzy system or neural network; the over-parameterization that often accompanies nonlinear models such as neural networks and neuro-fuzzy systems should, however, be avoided [75]. The advantage of such parameterization is that it makes the model fully self-contained, thus revealing more clearly the nature of the identified nonlinear dynamic system.
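A minimal numerical version of this optimization, for the single IRW-modeled TVP used in the earlier filtering sketch, can be written as a small function file; the search over log10(NVR) and the concentration of the observation noise variance out of the likelihood are standard devices assumed here, not details taken from the original manuscript.

```matlab
function nvr_opt = optimize_nvr(y, z)
% ML (prediction error decomposition) optimization of the NVR hyper-parameter
% for a single IRW-modeled TVP in y_t = p_t*z_t + e_t.  Illustrative sketch.
    obj     = @(theta) negloglik(10.^theta, y, z);
    nvr_opt = 10.^fminsearch(obj, -3);            % start from NVR = 1e-3
end

function nll = negloglik(nvr, y, z)
    F = [1 1; 0 1];  G = [0; 1];                  % IRW parameter model (Eq. 6)
    x = zeros(2,1);  P = 1e3*eye(2);
    N = numel(y);  slog = 0;  ssq = 0;
    for t = 1:N
        x = F*x;  P = F*P*F' + G*nvr*G';          % prediction (Eq. 10)
        H = [z(t) 0];
        s = 1 + H*P*H';                           % normalized innovation variance
        e = y(t) - H*x;                           % one-step prediction error (Eq. 14)
        slog = slog + log(s);  ssq = ssq + e^2/s;
        k = P*H'/s;  x = x + k*e;  P = P - k*H*P; % correction (Eq. 11)
    end
    nll = 0.5*slog + 0.5*N*log(ssq/N);            % minus the concentrated log-likelihood
end
```

The routine would be called as nvr_opt = optimize_nvr(y, z) with the observed output and regressor series; in the study itself the equivalent optimization is handled by the CAPTAIN toolbox routines.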

3. Case Study

In order to illustrate the superiority of SDP modeling over different soft sensing methods, it is necessary to benchmark these methods on publicly available datasets. In this section, the performance of the SDP method is evaluated on a real industrial debutanizer column process. The input-output dataset for the debutanizer column has been shared by [78] and is available at http://www.springer.com/gp/book/9781846284793; it has become a benchmark dataset for evaluating the performance of soft sensors [33]. An overview of soft sensing methods that have been exploited for real-time monitoring of product quality in the debutanizer column is also provided.

3.1. Debutanizer distillation column

The debutanizer column is an important unit operation in petroleum refining industries and it is a part of a de-sulfuring and naphtha splitter plant where propane (C3) and butane (C4) are removed

as overheads from the naphtha stream, as shown in Fig. 1. With sufficient fractionation, the debutanizer column aims to maximize the stabilized gasoline content in the debutanizer overheads and, simultaneously, to minimize the butane content in the debutanizer bottoms. This requires continuous monitoring of the butane concentration in the bottom product. Typically, many sensors, indicated with gray circles in Fig. 1, are installed in the plant to measure the process variables and monitor product quality; the concentration of butane in the bottom of the debutanizer is measured only indirectly [79]. Owing to the lack of a real-time monitoring system for the butane concentration, the process non-linearity, the large number of interactions between variables, the multivariate nature of the process and open-loop instability issues, control of product quality is a difficult task in this column [80]. To improve the control performance, a real-time estimate of the quality variable with a high degree of precision should be provided from the available training data, and a soft sensor model is constructed to capture this relation. The first step in designing a soft sensor is to select the input variables correctly; this is done using the knowledge of the operating experts on the system. Referring to [78], seven process variables, including pressures, temperatures and flows in the plant that are most relevant to the application, are selected as the input variables of the soft sensors. Table 1 gives a detailed description of these variables.

3.2. Debutanizer soft sensing methods

Fortuna et al. [79] were the first to design a real-time estimator for a debutanizer column. They developed a complex soft sensor based on an MLP with 13 hidden neurons; the model was a cascaded three-level neural network. In addition to the input variables measured within the column, the model used delayed samples of the input and output variables. The model was obtained as follows:

\hat{y}_t = f\left( u_{1,t}, u_{2,t}, u_{3,t}, u_{4,t}, u_{5,t}, u_{5,t-1}, u_{5,t-2}, u_{5,t-3}, \frac{u_{6,t}+u_{7,t}}{2}, y_{t-4}, y_{t-5}, y_{t-6} \right)    (16)

where \hat{y}_t represents the predicted value of y_t for the t-th query sample. The proposed model gave satisfactory results for the online prediction of the butane concentration. In the following years, many researchers have carried out studies on this benchmark dataset. They constructed soft sensor models based on the variables proposed in Eq. (16), which were determined by the analysis of expert knowledge and consideration of the process dynamics [78]. An overview of published soft sensors for the debutanizer column is summarized, in chronological order of their appearance in the literature, in Table 2.
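For readers working with the published dataset, the following sketch shows one way to assemble the 12-dimensional input vector of Eq. (16). The variable names U (an N-by-7 matrix of u1...u7) and yout (the measured butane concentration) are assumptions about how the data have been loaded, not part of the original benchmark files.

```matlab
% Build the Eq. (16) regressor matrix X (one row per usable sample).
% Assumes U is N-by-7 with columns u1..u7 and yout is the N-by-1 butane concentration.
N    = size(U,1);
t    = (7:N)';                                % first sample with all lags available (y_{t-6})
X    = [U(t,1:5), U(t-1,5), U(t-2,5), U(t-3,5), ...
        (U(t,6)+U(t,7))/2, yout(t-4), yout(t-5), yout(t-6)];
ytar = yout(t);                               % prediction target aligned with X
```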

3.3. Soft sensor performance evaluation

The debutanizer column dataset contains 2394 samples, which were collected under normal operating conditions of the process. The dataset can be partitioned into two parts: the modeling dataset (the first 1197 samples) for training, and the testing dataset (the remaining 1197 samples) for evaluating the performance of the proposed soft sensor. The prediction accuracy is determined using three performance indexes, the root mean square error (RMSE), the mean absolute error (MAE) and the correlation coefficient (R), defined as

\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}    (17)

\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left| y_i - \hat{y}_i \right|    (18)

R = \frac{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}}    (19)

where N is the number of data samples and y_i, \hat{y}_i, \bar{y} and \bar{\hat{y}} are, respectively, the real value, the predicted value, and the mean values of y and \hat{y}.
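The data split and Eqs. (17)-(19) translate directly into a few lines of MATLAB; y and yhat are assumed to be column vectors of actual and predicted butane concentrations on the part of the data being evaluated.

```matlab
% Train/test split of the 2394-sample benchmark and the indexes of Eqs. (17)-(19).
ntrain   = 1197;
test_idx = ntrain+1:2394;          % second half of the data is used for evaluation

% y, yhat: actual and predicted butane concentration on the evaluated samples
% (e.g. y = ybutane(test_idx); yhat = ypred(test_idx); both assumed column vectors)
err  = y - yhat;
RMSE = sqrt(mean(err.^2));                                        % Eq. (17)
MAE  = mean(abs(err));                                            % Eq. (18)
R    = sum((y-mean(y)).*(yhat-mean(yhat))) / ...
       sqrt(sum((y-mean(y)).^2) * sum((yhat-mean(yhat)).^2));     % Eq. (19)
```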

The configuration of the utilized computer is as follows: OS: Windows 7 (32 bit); CPU: Intel Core2 Duo 8700 (2.53 GHz × 2); RAM: 4 GB; MATLAB version 2008a.

4. Implementation and results

Since one objective of the debutanizer column is to minimize the butane (C4) concentration in the bottom flow of the column, this concentration is chosen as the output variable that must be estimated by the soft sensor. Fig. 2(a-g) shows the trends of the input variables, and part (h) shows the characteristics of the output variable. From these subfigures it is easy to see the nonlinear relations between the output variable and the input variables; in addition, the process changes gradually or rapidly as the process variables vary. According to Eq. (16), the dimension of the input variable vector is 12 for the model identification. None of the previously proposed methods considered the state dependency of the system parameters, and all input variables were used to estimate the butane quality. Owing to the characteristics of SDP modeling, however, it requires fewer input variables than neural networks and other regression methods. Evidently, surplus and inappropriate variables would produce

relatively poor predictions. Therefore, Eq. (16) can be simplified by eliminating some input variables while preserving the prediction accuracy of the model. To build the soft sensor model, the associated parameters are determined first. The dataset is considered in the form of Eq. (1), in which each parameter is a function of only one state variable. For this purpose, the 'CAPTAIN' toolbox of MATLAB is used; it provides access to novel algorithms for various important aspects of identification, estimation, time series analysis and automatic control system design, and it has been developed from 1981 to the present at Lancaster University (http://www.es.lancs.ac.uk/cres/captain/). Several soft sensing models are built using different combinations of parameters, each involving the optimization procedure. Detailed results of the soft sensing models are provided in Table 3, where the best achievable performance is highlighted in bold. The one-, two- and three-step delayed outputs (y_{t-1}, y_{t-2}, y_{t-3}) are also included to account for the dynamic conditions of the system. It is easy to see that, as the number of parameters decreases, the model provides more and more accurate results. Moreover, the SDARX model with the one-, two- and three-step delayed outputs obtains more accurate results than the SDARX model with the four-, five- and six-step delayed outputs (y_{t-4}, y_{t-5}, y_{t-6}) under the same number of parameters (Cases 3-8).
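The structure comparison behind Table 3 can be reproduced in outline as follows. Ordinary least squares on each candidate regressor set is used here purely as a quick stand-in for the full SDP estimation, and the candidate sets themselves are illustrative assumptions, not the exact cases of Table 3.

```matlab
% Sketch of the candidate-structure comparison, with OLS as a stand-in estimator.
% U (N-by-7) and yout (N-by-1) are assumed to hold the benchmark inputs and output.
N = size(U,1);  t = (7:N)';                    % keep all lags available
cands = { [yout(t-1)], ...                                  % y_{t-1} only
          [U(t-3,5), yout(t-1)], ...                        % u5_{t-3}, y_{t-1}
          [U(t,5), U(t-3,5), yout(t-1), yout(t-2)] };       % a larger hypothetical set
ntr  = floor(numel(t)/2);  ytar = yout(t);
for c = 1:numel(cands)
    Z     = cands{c};
    theta = Z(1:ntr,:) \ ytar(1:ntr);                       % least squares fit (train half)
    e     = ytar(ntr+1:end) - Z(ntr+1:end,:)*theta;         % test-half residuals
    fprintf('candidate %d: %d regressors, RMSE = %.4f\n', c, size(Z,2), sqrt(mean(e.^2)));
end
```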

A more detailed comparison of the quality prediction results of three different models is shown in Fig. 3. The online prediction of case 1 matches the actual data well, but at many points it is rather worse, which clearly indicates that over-fitting might have occurred during training of the model, as shown in Fig. 3(a). Compared with case 1, case 16 shows more robust prediction performance, and hence the obtained soft sensor behaves more accurately and stably for online quality prediction. On the contrary, the non-robust model (case 22) can track the trend of the output variable, but it is very sensitive to noise and its predictions deviate strongly from the actual data. Hence, the number of parameters for the selected model is chosen as 2, because a single parameter results in under-fitting problems, while too many parameters improve the performance little but increase the complexity of the model. As some input variables, such as the pressure, the flow to the next process and the bottom temperatures, appear to contribute little to the prediction accuracy, they can be omitted from the model. As a result, the following combination of parameters, the 6th tray temperature with a three-step delay (u_{5,t-3}) and the butane concentration with a one-step delay (y_{t-1}), performs better than the other proposed models in terms of minimum error and maximum prediction accuracy, as shown in Table 3 and Fig. 3. The SDARX model with the selected parameters and states takes the following form:

\hat{y}_t = a_1(y_{t-1}) \, y_{t-1} + b_3(u_{5,t-3}) \, u_{5,t-3}    (20)

where a_1(y_{t-1}) and b_3(u_{5,t-3}) are the identified state dependent parameters. The procedure in section 2 has been applied to estimate the state dependent parameters in Eq. (20). The corresponding NVR of each SDP is obtained based on the IRW parameter variation model (the special case of Eq. (6) with \alpha = \beta = \gamma = \varepsilon = 1 and \delta = 0). Finally, the ML optimization of the NVR hyper-parameters is accomplished as discussed in subsection 2.2.4; here, the optimization is carried out at the first back-fitting iteration and the optimized values are maintained for the subsequent three iterations required for convergence. The ML optimization of the model then yields the values of the NVR hyper-parameters reported in Table 4. The optimized NVR of a_1(y_{t-1}) is insignificantly different from zero, illustrating that a_1(y_{t-1}) is almost a constant line. In contrast, the optimized NVR of b_3(u_{5,t-3}) indicates the presence of state dependency and allows for its estimation. Fig. 4 shows the optimum estimation results for the IRW model of parameter variations; it provides the trends of the SDP estimates of both parameters against the sample number and against the associated state variables. It is also clear from Table 4 and Fig. 4 that larger values of the NVR exhibit more variation, suggesting possibly significant changes in the mean value over the data series. Clearly, the parameter b_3(u_{5,t-3}) changes little with time but changes considerably with the state variation. The results also demonstrate that the algorithm is not very sensitive to the parameter a_1(y_{t-1}).

The parameterization problem is to find a low-dimensional parametric relationship for each SDP which is able to characterize the graphical relationship as well as possible, without over-fitting. For this purpose, the 'curve fitting toolbox' of MATLAB is used. On the basis of the results in Fig. 4, suitable parametric forms for the parameters a_1(y_{t-1}) and b_3(u_{5,t-3}) are a linear function of y_{t-1} and the inverse tangent (tan^{-1}) of u_{5,t-3}, respectively. Consequently, simple curve fitting based on least squares estimation yields the following relationships:

a_1(y_{t-1}) = p_1 + p_2 \, y_{t-1}    (21)

b_3(u_{5,t-3}) = p_3 + p_4 \tan^{-1}(u_{5,t-3})    (22)

with associated coefficients p_1 = 1.026 (0.001), p_2 = -0.0561 (0.00255), p_3 = -0.1307 (0.0023) and p_4 = 0.1597 (0.0028), where the standard errors with 95% confidence are shown in parentheses. In order to evaluate the robustness of the designed soft sensor, a further test is carried out, in which the testing dataset is contaminated by some random missing values. The quality predictions of the SDP-based soft sensor on the testing dataset with and without missing values are shown in Fig. 5(a, b). One can judge from Fig. 5 that the soft sensor performance is not affected by the missing values and that the missing samples are correctly assigned predicted values. Comparing Fig. 5(a) with Fig. 5(b), there are no significant deviations between the actual and predicted values across the whole process, especially at the missing parts. This feature makes the SDARX model a very powerful tool that can handle missing values without using data imputation algorithms, such as missing value replacement, EM-based maximum likelihood or data augmentation methods, and it guarantees the real-time efficiency of the SDP-based soft sensor. Fig. 6 shows the actual versus predicted values of the butane concentration using the SDARX model. Most of the data points fall close to the 45 degree line over the whole operating range, which indicates good prediction by the selected model. Table 5 quantitatively compares the prediction performance of the proposed model and other soft sensor models for the debutanizer column in terms of the RMSE, MAE and correlation coefficient criteria. The comparison mainly focuses on data-driven soft sensors that have been benchmarked against the debutanizer column dataset. As can be seen from Table 5, some performance indexes have not been reported, and some others are reproduced approximately from the published graphs. It can readily be concluded that the SDARX model produces the highest correlation coefficient, while it has the lowest RMSE and MAE compared with the other models. The MLR and MWPLS models perform worst among the soft sensors. The RMSE values of some models, such as OLPLS-1 and OLPLS-2 [17] and LPLS-APSP [19], are close to that of the SDARX model, but the SDARX model is superior to them in terms of the small number of input variables, the automatic handling of missing values, the simplicity of the model and, as a consequence, its high computing speed.
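Once parameterized, the resulting soft sensor is light enough to run in a few lines. The sketch below evaluates Eqs. (20)-(22) recursively over new data, substituting the model's own previous prediction for y_{t-1} whenever the measured value is missing; this simple fallback is an illustration, not necessarily the exact replacement strategy of the study, and u5 and ylab are assumed variable names for the 6th tray temperature and the (possibly NaN) measured butane concentration.

```matlab
% Online evaluation of the parameterized SDARX soft sensor, Eqs. (20)-(22).
p1 = 1.026;  p2 = -0.0561;  p3 = -0.1307;  p4 = 0.1597;   % coefficients from Eqs. (21)-(22)
N     = numel(ylab);
yhat  = zeros(N,1);
yprev = ylab(1);                                          % assumes the first sample is observed
for t = 2:N
    a1 = p1 + p2*yprev;                                   % Eq. (21)
    if t > 3
        b3 = p3 + p4*atan(u5(t-3));                       % Eq. (22), needs u5_{t-3}
        yhat(t) = a1*yprev + b3*u5(t-3);                  % Eq. (20)
    else
        yhat(t) = a1*yprev;                               % not enough lags yet
    end
    if isnan(ylab(t))
        yprev = yhat(t);                                  % missing sample: reuse the prediction
    else
        yprev = ylab(t);                                  % otherwise use the measured value
    end
end
```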

5. Conclusions

In the present paper, a new soft sensor modeling approach based on the state dependent parameter method has been proposed, which can successfully incorporate the state dependency of the parameters in the nonlinear relationship between the input and output variables. Based on the DBM philosophy, the present study develops an innovative method for soft sensor design, which demonstrates an appropriate capability for analyzing data with drastic changes. Using the SDP method, the butane concentration in the bottom product of an industrial debutanizer column has been estimated non-parametrically as a function of the 6th tray temperature and the previous butane concentration values. A suitable parametric form of the parameters is then provided using least squares estimation. On the basis of the prediction results, it can be concluded that the SDP method is capable of reliably and effectively handling nonlinear and time-varying industrial processes. A comparative study of different soft sensing methods has also been carried out for the debutanizer column, and it has been shown that, among the soft sensing approaches, the SDARX model demonstrates superior performance in terms of reliability and robustness. Moreover, compared with other models, the SDARX model is less complex and has fewer identified parameters, which introduces further ease of implementation into the nonlinear modeling process. Such satisfactory prediction performance verifies the feasibility and efficiency of the new soft sensing model for online deployment. In addition, since the low-order model is simple and converges rapidly, it is also suitable for inclusion in industrial control systems. Finally, it is worth pointing out that the SDP non-parametric models can be useful in their own right: they can be simulated easily in programs such as MATLAB and provide a completely new way of estimation and identification, in parametric or non-parametric form.

Funding sources: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References: [1] T.M. Avoy, Intelligent control applications in the process industries, Annual Reviews in Control, 26 (2002) 75-86. http://dx.doi.org/10.1016/S1367-5788(02)80014-1 [2] D. Sliskovic, R. Grbic, Z. Hocenski, Methods for plant data-based process modeling in softsensor development, Automatika, 52 (2011). [3] Z. Ge, Z. Song, P. Wang, Probabilistic combination of local independent component regression model for multimode quality prediction in chemical processes, Chemical Engineering Research and Design, 92 (2014) 509–521. http://dx.doi.org/10.1016/j.cherd.2013.09.010 [4] P. Kadlec, B. Gabrys, S. Strandt, Data-driven soft sensors in the process industry, Computers and Chemical Engineering, 33 (2009) 795–814. http://dx.doi.org/10.1016/j.compchemeng.2008.12.012

[5] M. Kano, M. Ogawa, The state of the art in chemical process control in Japan: Good practice and questionnaire survey, Journal of Process Control, 20 (2010) 969–982. http://dx.doi.org/ 10.1016/j.jprocont.2010.06.013 [6] M. Kano, K. Fujiwara, Virtual sensing technology in process industries: Trends and challenges revealed by recent industrial applications, Journal of chemical engineering of Japan, 46 (2013) 1-17. http://dx.doi.org/10.1252/jcej.12we167 [7] E. Zamprogna, M. Barolo, D.E. Seborg, Optimal selection of soft sensor inputs for batch distillation columns using principal component analysis, Journal of Process Control, 15 (2005) 39–52. http://dx.doi.org/10.1016/j.jprocont.2004.04.006 [8] S. Khatibisepehr, B. Huang, S. Khare, Design of inferential sensors in the process industry: A review of Bayesian methods, Journal of Process Control, 23 (2013) 1575–1596. http://dx.doi.org/10.1016/j.jprocont.2013.05.007 [9] Z. Ge, Z. Song, Semi supervised Bayesian method for soft sensor modeling with unlabeled data samples, AIChE Journal, 57 (2011) 2109–2119. http://dx.doi.org/10.1002/aic.12422 [10] I.T. Jolliffe, Principal component analysis, Second ed., Springer, New York, 2002. [11] J. Tang, W. Yu, T. Chai, L. Zhao, On-line principal component analysis with application to process modeling, Neurocomputing, 82 (2012) 167–178. http://dx.doi.org/10.1016/j.neucom.2011.10.026 [12] J. Zhang, Offset-free inferential feedback control of distillation compositions based on PCR and PLS models, Chemical Engineering and Technology, 29 (2006) 560–566. http://dx.doi.org/10.1002/ceat.200500259 [13] H. J. Galicia, Q.P. He, J. Wang, A reduced order soft sensor approach and its application to a continuous digester, Journal of Process Control, 21 (2011) 489–500. http://dx.doi.org/10.1016/j.jprocont.2011.02.001 [14] S. Wold, M. Sjostrom, L. Eriksson, PLS-regression: A basic tool of chemometrics, Chemometrics and Intelligent Laboratory Systems, 58 (2001) 109–130. http://dx.doi.org/10.1016/S0169-7439(01)00155-1 [15] J.V. Kresta, J.F. Macgregor, T.E. Marlin, Multivariate statistical monitoring of process operating performance, Canadian Journal of Chemical Engineering, 69 (1991) 35 - 47. http://dx.doi.org/10.1002/cjce.5450690105 [16] N.R. Draper, H. Smith, Applied regression analysis, Third ed., Wiley, New York, 1998. [17] W. Shao, X. Tian, P. Wang, Local partial least squares based online soft sensing method for multi-output processes with adaptive process states division, Chinese Journal of Chemical Engineering, 22 (2014) 828–836. http://dx.doi.org/10.1016/j.cjche.2014.05.003 [18] W. Shao, X. Tian, Adaptive soft sensor for quality prediction of chemical processes based on selective ensemble of local partial least squares models, Chemical Engineering Research and Design, 95 (2015) 113–132. http://dx.doi.org/10.1016/j.cherd.2015.01.006 [19] W. Shao, X. Tian, P. Wang, X. Deng, S. Chen, Online soft sensor design using local partial least squares models with adaptive process state partition, Chemometrics and Intelligent Laboratory Systems, 144 (2015) 108–121. http://dx.doi.org/10.1016/j.chemolab.2015.04.003 [20] Z. Ge, B. Huang, Z. Song, Nonlinear semi supervised principal component regression for soft sensor modeling and its mixture form, Journal of Chemometrics, 28 (2014) 793–780. http://dx.doi.org/10.1002/cem.2638 [21] J. Zhu, Z. Ge, Z. Song, Robust supervised probabilistic principal component analysis model for soft sensing of key process variables, Chemical Engineering Science, 122 (2015) 573–584. 
http://dx.doi.org/10.1016/j.ces.2014.10.029

[22] J.C.B. Gonzaga, L.A.d.C. Meleiro, C. Kiang, R.M. Filho, ANN-based soft-sensor for realtime process monitoring and control of an industrial polymerization process, Computers and Chemical Engineering, 33 (2009) 43–49. http://dx.doi.org/10.1016/j.compchemeng.2008.05.019 [23] C.M. Bishop, Neural networks for pattern recognition, Oxford University Press, USA, 1995. [24] J.C. Principe, N.R. Euliano, W.C. Lefebvre, Neural and adaptive systems, Wiley, New York, 2000. [25] J.-S.R. Jang, C.-T. Sun, E. Mizutani, Neuro-fuzzy and soft computing: A computational approach to learning and machine intelligence, Prentice-Hall Upper Saddle River, NJ, USA, 1997. [26] C.-T. Lin, C.S.G. Lee, Neural fuzzy systems: A neuro-fuzzy synergism to intelligent systems, Prentice-Hall, Upper Saddle River, NJ, USA, 1996. [27] J. Yu, A Bayesian inference based two-stage support vector regression framework for soft sensor development in batch bioprocesses, Computers and Chemical Engineering, 41 (2012) 134–144. http://dx.doi.org/10.1016/j.compchemeng.2012.03.004 [28] V.N. Vapnik, Statistical learning theory, Wiley, New York, 1998. [29] S.W. Choi, I.-B. Lee, Nonlinear dynamic process monitoring based on dynamic kernel PCA, Chemical Engineering Science, 59 (2004) 5897–5908. http://dx.doi.org/10.1016/j.ces.2004.07.019 [30] J.-M. Lee, C. Yoo, S.W. Choi, P.A. Vanrolleghem, I.-B. Lee, Nonlinear process monitoring using kernel principal component analysis, Chemical Engineering Science, 59 (2004) 223–234. http://dx.doi.org/10.1016/j.ces.2003.09.012 [31] Z. Ge, C. Yang, Z. Song, Improved kernel PCA-based monitoring approach for nonlinear processes, Chemical Engineering Science, 64 (2009) 2245–2255. http://dx.doi.org/10.1016/j.ces.2009.01.050 [32] Q. Jiang, X. Yan, Weighted kernel principal component analysis based on probability density estimation and moving window and its application in nonlinear chemical process monitoring, Chemometrics and Intelligent Laboratory Systems, 127 (2013) 121–131. http://dx.doi.org/10.1016/j.chemolab.2013.06.013 [33] P. Kadlec, R. Grbic, B. Gabrys, Review of adaptation mechanisms for data-driven soft sensors, Computers and Chemical Engineering, 35 (2011) 1–24. http://dx.doi.org/10.1016/j.compchemeng.2010.07.034 [34] H.T. Toivonen, State-dependent parameter models of non-linear sampled-data systems: A velocity-based linearization approach, International Journal of Control, 76 (2003) 1823-1832. http://dx.doi.org/10.1080/00207170310001637002 [35] D. Sliskovic, R. Grbic, E.K. Nyarko, Data preprocessing in data based process modeling, 2nd IFAC Conference on Intelligent Control Systems and Signal Processing, 2009, pp. 559–564. http://dx.doi.org/10.3182/20090921-3-TR-3005.00096 [36] P.J. Huber, E.M. Ronchetti, Robust statistics, Second ed., Wiley, Hoboken, New Jersey, 2009. [37] J. Deng, B. Huang, Identification of nonlinear parameter varying systems with missing output data, AIChE Journal, 58 (2012) 3454–3467. http://dx.doi.org/10.1002/aic.13735 [38] L. Fortuna, A. Rizzo, M. Sinatra, M.G. Xibilia, Soft analyzers for a sulfur recovery unit, Control Engineering Practice, 11 (2003) 1491-1500. http://dx.doi.org/10.1016/S09670661(03)00079-0 [39] S. Kim, M. Kano, S. Hasebe, A. Takinami, T. Seki, Long-term industrial applications of inferential control based on just-in-time soft-sensors: Economical impact and challenges,

Industrial and Engineering Chemistry Research, 52 (2013) 12346–12356. http://dx.doi.org/10.1021/ie303488m [40] S. Zhang, F. Wang, D. He, R. Jia, Online quality prediction for cobalt oxalate synthesis process using least squares support vector regression approach with dual updating, Control Engineering Practice, 21 (2013) 1267–1276. http://dx.doi.org/10.1016/j.conengprac.2013.06.002 [41] J. Liu, D.-S. Chen, J.-F. Shen, Development of self-validating soft sensors using fast moving window partial least squares, Industrial and Engineering Chemistry Research, 49 (2010) 11530–11546. http://dx.doi.org/10.1021/ie101356c [42] H. Kaneko, K. Funatsu, Preparation of comprehensive data from huge data sets for predictive soft sensors, Chemometrics and Intelligent Laboratory Systems, 153 (2016) 75–81. http://dx.doi.org/10.1016/j.chemolab.2016.02.011 [43] K. Chen, J. Ji, H. Wang, Y. Liu, Z. Song, Adaptive local kernel-based learning for soft sensor modeling of nonlinear processes, Chemical Engineering Research and Design, 89 (2011) 2117–2124. http://dx.doi.org/10.1016/j.cherd.2011.01.032 [44] S.J. Qin, Recursive PLS algorithms for adaptive data modeling, Computers and Chemical Engineering, 22 (1998) 503-514. http://dx.doi.org/10.1016/S0098-1354(97)00262-7 [45] C. Cheng, M.-S. Chiu, Nonlinear process monitoring using JITL-PCA, Chemometrics and Intelligent Laboratory Systems, 76 (2005) 1–13. http://dx.doi.org/10.1016/j.chemolab.2004.08.003 [46] Y. Liu, Z. Gao, P. Li, H. Wang, Just-in-time kernel learning with adaptive parameter selection for soft sensor modeling of batch processes, Industrial and Engineering Chemistry Research, 51 (2012) 4313–4327. http://dx.doi.org/10.1021/ie201650u [47] Y. Liu, J. Chen, Integrated soft sensor using just-in-time support vector regression and probabilistic analysis for quality prediction of multi-grade processes, Journal of Process Control, 23 (2013) 793–804. http://dx.doi.org/10.1016/j.jprocont.2013.03.008 [48] H. Jin, X. Chen, J. Yang, L. Wu, Adaptive soft sensor modeling framework based on justin-time learning and kernel partial least squares regression for nonlinear multiphase batch processes, Computers and Chemical Engineering, 71 (2014) 77–93. http://dx.doi.org/10.1016/j.compchemeng.2014.07.014 [49] K. Fujiwara, M. Kano, S. Hasebe, A. Takinami, Soft-sensor development using correlationbased just-in-time modeling, AIChE Journal, 55 (2009) 1754–1765. http://dx.doi.org/10.1002/aic.11791 [50] Z. Ge, Z. Song, A comparative study of just-in-time-learning based methods for online soft sensor modeling, Chemometrics and Intelligent Laboratory Systems, 104 (2010) 306–317. http://dx.doi.org/10.1016/j.chemolab.2010.09.008 [51] P.C. Young, M.J. Lees, The Active Mixing Volume (AMV): A new concept in modelling environmental systems, in: V. Barnett, K.F. Turkman (Eds.) Statistics for the Environment, Wiley, Chichester, 1993, pp. 3–44. [52] P.C. Young, Data-Based Mechanistic modeling of engineering systems, Journal of Vibration and Control, 4 (1998) 5-28. http://dx.doi.org/10.1177/107754639800400102 [53] P.C. Young, K.J. Beven, Data-Based Mechanistic modelling and the rainfall-flow nonlinearity, Environmetrics, 5 (1994) 335–363. http://dx.doi.org/10.1002/env.3170050311 [54] P.C. Young, Data-Based Mechanistic modelling of environmental, ecological, economic and engineering systems, Environmental Modelling & Software, 13 (1998) 105–122. http://dx.doi.org/10.1016/S1364-8152(98)00011-5

[55] M.B. Beck, P.C. Young, A dynamic model for DO-BOD relationships in a non-tidal stream, Water Research, 9 (1975) 769-776. http://dx.doi.org/10.1016/0043-1354(75)90028-7 [56] P.C. Young, P.E.H. Minchin, Environmetric time-series analysis: modelling natural systems from experimental time-series data, International Journal of Biological Macromolecules, 13 (1991) 190-201. http://dx.doi.org/10.1016/0141-8130(91)90046-W [57] P.C. Young, Parallel processes in hydrology and water quality: A unified time-series approach, Water and Environment Journal, 6 (1992) 598–612. http://dx.doi.org/10.1111/j.17476593.1992.tb00796.x [58] P.C. Young, P.G. Whitehead, A recursive approach to time series analysis for multivariable systems, in: G.C. Vansteenkiste (Ed.) Modeling and Simulation of Water Resource Systems, North Holland: Amsterdam, 1975, pp. 39–58. [59] P.C. Young, S. Parkinson, M. Lees, Simplicity out of complexity in environmental modelling: Occam's razor revisited, Journal of Applied Statistics, 23 (1996) 165-210. http://dx.doi.org/10.1080/02664769624206 [60] P.C. Young, Time variable and state dependent parameter modeling of nonstationary and nonlinear time series, in: T.S. Rao (Ed.) Developments in Time Series Analysis, Chapman and Hall, London, 1993, pp. 374–413. [61] P.C. Young, The validity and credibility of models for badly defined systems, in: M.B. Beck, G.v. Straten (Eds.) Uncertainty and Forecasting of Water Quality, Springer-Verlag, Berlin, 1983. [62] P.C. Young, P. McKenna, J. Bruun, Identification of nonlinear stochastic systems by state dependent parameter estimation, International Journal of Control, 74 (2001) 1837-1857. http://dx.doi.org/10.1080/00207170110089824 [63] P.C. Young, Stochastic, dynamic modelling and signal processing: Time variable and state dependent parameter estimation, in: W.J. Fitzgerald, A. Walden, R. Smith, P.C. Young (Eds.) Nonstationary and Nonlinear Signal Processing, Cambridge University Press, Cambridge, 2000, pp. 74-114. [64] P.C. Young, Applying parameter estimation to dynamic systems: Part I - Theory, Control Engineering, 16 (1969) 119-125. [65] P.C. Young, Applying parameter estimation to dynamic systems: Part II - Applications, Control Engineering, 16 (1969) 118-124. [66] J. Mendel, A priori and a posteriori identification of time varying parameters, 2nd IEEE Conference on System SciencesHawaii (USA), 1969. [67] P.C. Young, The identification and estimation of nonlinear stochastic systems, in: A.I. Mees (Ed.) Nonlinear Dynamics and Statistics, Birkhauser, Boston, 2001a, pp. 127-166. [68] P.C. Young, A general theory of modelling for badly defined dynamic systems, in: G.C. Vansteenkiste (Ed.) Modeling, Identification and Control in Environmental Systems, North Holland, Amsterdam, 1978, pp. 103–135. [69] M.B. Priestley, Nonlinear and nonstationary time series analysis, Academic Press, London, 1988. [70] P.C. Young, Data-Based Mechanistic modelling: Natural philosophy revisited?, in: L. Wang, H. Garnier (Eds.) System Identification, Environmetric Modelling and Control, SpringerVerlag, Berlin, 2011. [71] P.C. Young, Data-Based Mechanistic modelling, generalised sensitivity and dominant mode analysis, Computer Physics Communications, 117 (1999) 113-129. http://dx.doi.org/10.1016/S0010-4655(98)00168-4

[72] P.C. Young, D.J. Pedregal, W. Tych, Dynamic harmonic regression, Journal of Forecasting, 18 (1999) 369–394.
[73] R.E. Kalman, A new approach to linear filtering and prediction problems, Journal of Basic Engineering, 82 (1960) 35–45. http://dx.doi.org/10.1115/1.3662552
[74] P.C. Young, Nonstationary time series analysis and forecasting, Progress in Environmental Science, 1 (1999) 3–48.
[75] P.C. Young, Recursive estimation and time-series analysis, Second ed., Springer, New York, 2011.
[76] A.J. Jakeman, P.C. Young, Recursive filtering and the inversion of ill-posed causal problems, Utilitas Mathematica, 35 (1984) 351–376.
[77] A.J. Jakeman, P.C. Young, Refined instrumental variable methods of recursive time-series analysis Part II. Multivariable systems, International Journal of Control, 29 (1979) 621–644. http://dx.doi.org/10.1080/00207177908922724
[78] L. Fortuna, S. Graziani, A. Rizzo, M.G. Xibilia, Soft sensors for monitoring and control of industrial processes, Springer, New York, 2007.
[79] L. Fortuna, S. Graziani, M.G. Xibilia, Soft sensors for product quality monitoring in debutanizer distillation columns, Control Engineering Practice, 13 (2005) 499–508. http://dx.doi.org/10.1016/j.conengprac.2004.04.013
[80] N.M. Ramli, M.A. Hussain, B.M. Jan, B. Abdullah, Composition prediction of a debutanizer column using equation based artificial neural network model, Neurocomputing, 131 (2014) 59–76. http://dx.doi.org/10.1016/j.neucom.2013.10.039

Figure Captions
Fig. 1. Block diagram of the debutanizer column
Fig. 2. Training data characteristics of the debutanizer column: (a)-(g) input data, (h) output data
Fig. 3. Prediction results of butane concentration on the training dataset: (a) case 1, (b) case 16 and (c) case 22
Fig. 4. Left-hand side: SDARX model parameters plotted against sample number. Right-hand side: estimated a1(yt-1) (top) and b3(u5,t-3) (bottom) plotted against their associated state variables
Fig. 5. Prediction results of butane concentration on the testing dataset with (a) no missing values, (b) some random missing values
Fig. 6. Predicted versus actual butane concentration

Table 1 Detailed description of soft sensor input variables
Variable   Description
u1         Top temperature
u2         Top pressure
u3         Reflux flow
u4         Flow to next process
u5         6th tray temperature
u6         Bottom temperature
u7         Bottom temperature

Table 2 List of published soft sensors for debutanizer column
No.  Quality prediction                                                Applied method(s)              Ref.
1    C4 and C5 concentrations in top and bottom product                MLP                            (Fortuna et al., 2005)
2    C4 concentration in bottom product                                PLS, Standard SVR, LSSVR       (Ge & Song, 2010)
3    C4 concentration in bottom product                                SBPCR                          (Ge & Song, 2011)
4    C4 concentration in bottom product                                LWKPCR                         (Yuan et al., 2014)
5    C4 concentration in bottom product                                Mixture semi supervised PCR    (Ge, Huang, et al., 2014a)
6    C4 concentration in bottom product                                Nonlinear semi supervised PCR  (Ge, Huang, et al., 2014b)
7    C4 concentration in bottom product                                GMM                            (Fan et al., 2014)
8    C4 concentration in top and bottom product with another dataset   ANN                            (Ramli et al., 2014)
9    C4 concentration in bottom product                                PCR                            (Ge, 2014)
10   C4 concentration in bottom product                                OLPLS                          (Shao et al., 2014)
11   C4 concentration in bottom product                                MRSPPCA, MSPPCA                (Zhu et al., 2015)
12   C4 concentration in bottom product with another dataset           LSSVR, IWLSSVR                 (Behnasr & Jazayeri-Rad, 2015)
13   C4 concentration in bottom product                                LPLS-APSP                      (Shao et al., 2015)
14   C4 concentration in bottom product                                WPPCA                          (Yuan et al., 2015)
15   C4 concentration in bottom product                                NPRE, KNPRE                    (Aimin et al., 2015)
16   C4 concentration in bottom product                                MLR, PCR, BPNN                 (Pani et al., 2016)
17   C4 concentration in bottom product                                ALGPR                          (Ge, 2016)
18   C4 concentration in bottom product                                ANN                            (Soares & Araujo, 2016)

Table 3 Performance indexes for different sets of input variables used in the SDARX model
Case  No. of states  R       RMSE    MAE
1     12             0.9924  0.0174  0.0111
2     11             0.9925  0.0173  0.0112
3     10             0.9578  0.0407  0.0272
4     10             0.9694  0.0347  0.0232
5     10             0.9797  0.0284  0.0190
6     10             0.9881  0.0218  0.0146
7     10             0.9945  0.0148  0.0100
8     10             0.9985  0.0077  0.0051
9     9              0.9985  0.0079  0.0052
10    8              0.9984  0.0080  0.0054
11    7              0.9983  0.0082  0.0055
12    6              0.9985  0.0076  0.0053
13    5              0.9985  0.0078  0.0054
14    4              0.9985  0.0076  0.0053
15    3              0.9985  0.0078  0.0053
16    2              0.9984  0.0080  0.0055
17    2              0.9982  0.0085  0.0057
18    2              0.9979  0.0091  0.0058
19    2              0.9977  0.0096  0.0060
20    2              0.9977  0.0096  0.0060
21    9              0.7212  0.0981  0.0684
22    6              0.6839  0.1033  0.0706
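The performance indexes used in Table 3 (and again in Table 5) are R, RMSE and MAE between the measured and predicted butane concentration. As a minimal illustrative sketch, not code from the original work, and assuming R denotes the linear correlation coefficient between the two series, these indexes could be computed as follows (the function and variable names are illustrative):

```python
import numpy as np

def performance_indexes(y_meas, y_pred):
    """Correlation coefficient R, RMSE and MAE between a measured and a
    predicted series (e.g. butane concentration on the test dataset)."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_meas, y_pred)[0, 1]            # linear correlation coefficient
    rmse = np.sqrt(np.mean((y_meas - y_pred) ** 2))  # root mean square error
    mae = np.mean(np.abs(y_meas - y_pred))           # mean absolute error
    return r, rmse, mae
```

Each case in Table 3 then corresponds to a different set of input variables, with the three indexes recomputed on the resulting predictions.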

Table 4 Optimized NVR corresponding to each SDP
SDP    NVR

Table 5 Prediction performance indexes of debutanizer column soft sensors Publication Ge and Song (2010)* Shao et al. (2014) Yuan et al. (2014)

Model Type PLS Standard SVR LSSVR CoJIT OLPLS-1 OLPLS-2 PCR KPCR LWPCR

R Not reported -

RMSE 0.1650 0.1450 0.1418 0.0153 0.0133 0.0105 0.1041 0.0933 0.0670

MAE Not reported -

Ge et al. (2014) Ge (2014) * Fan et al. (2014)*

Shao et al. (2015)

Zhu et al. (2015)

Yuan et al. (2015)

Aimin et al. (2015) Ge (2016)*

Pani et al. (2016)

This work *

LWKPCR Nonlinear semi supervised PCR New active learning PCR GMM-based JITL LSSVM JIT-PLS RPLS MWPLS CoJIT LPLS-APSP MRSPPCA MSPPCA KNN WKNN G-PCA L-PCA L-WPCA G-PPCA L-PPCA L-WPPCA NPRE PCR KPCR KNPRE ALGPR MLR PCR BPNN trained by Gradient-Descent BPNN trained by Conjugate-Gradient BPNN trained by Levenberg-Marquardt SDARX model

0.9541 0.8474 0.5692 0.7753 0.3950 0.1480 0.5530

0.0577 0.1499 0.1450 0.1345 0.0215 0.0175 0.0163 0.1800 0.0170 0.0120 0.1385 0.1324 0.1253 0.0839 0.1515 0.1082 0.0811 0.1505 0.1071 0.0803 0.1429 0.1509 0.1550 0.1503 0.0640 0.9990 0.1511 0.1250

0.0810

0.6640

0.1110

0.0690

0.8560

0.0760

0.0550

0.9975

0.0101

0.0109

The RMSE values mentioned are approximate, since they are available only in the form of charts.

Figure 1

Figure 2 [panels (a)–(h): top temperature (u1), top pressure (u2), reflux flow (u3), flow to next process (u4), 6th tray temperature (u5), bottom temperatures (u6, u7) and concentration of butane (y), each plotted against sample sequence number]

Figure 3 [panels (a)–(c): predicted and real values of butane concentration plotted against sample sequence number]
Figure 4 [left: a1 and b3 plotted against sample sequence number; right: a1 plotted against y(t-1) and b3 plotted against u5(t-3)]
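For orientation, Fig. 4 shows the two state dependent parameters of the SDARX model, a1 and b3, plotted first against the sample index and then against the states on which they depend, y(t-1) and u5(t-3). A plausible sketch of the corresponding model terms, written out only for these two plotted parameters (the remaining regressors, the model orders and the noise term are defined in the main text and abbreviated here), is:

$$
y_t \;=\; a_1(y_{t-1})\, y_{t-1} \;+\; \cdots \;+\; b_3(u_{5,t-3})\, u_{5,t-3} \;+\; \cdots \;+\; e_t
$$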

Figure 5 [panels (a) and (b): predicted and real values of butane concentration plotted against sample sequence number]
Figure 6 [predicted value versus real value of butane concentration]

Highlights
• A new soft sensing method is proposed based on a state dependent parameter model.
• The methodology provides a low-order model with fewer parameters.
• The methodology easily predicts missing values without using supplementary algorithms.
• The prediction accuracy of the model is high compared with other models.