Copyright © 2002 IFAC 15th Triennial World Congress, Barcelona, Spain
www.elsevier.com/locate/ifac
DEVELOPMENT OF A SOFT SENSOR FOR A BATCH DISTILLATION COLUMN USING LINEAR AND NONLINEAR PLS REGRESSION TECHNIQUES

Eliana Zamprogna (1), Massimiliano Barolo (1,*) and Dale E. Seborg (2)

(1) DIPIC, Department of Chemical Engineering, University of Padova, Via Marzolo 9, 35131 Padova PD (Italy)
(2) Department of Chemical Engineering, University of California, Santa Barbara, CA 93106 (U.S.A.)
Abstract: A PLS-based model is developed for a batch distillation process in order to estimate the product compositions from temperature measurements. Both linear and nonlinear versions of PLS are employed and their estimation performance is compared. Several issues are addressed, such as the selection of the most appropriate model input variables and the effect of augmenting the original process data with lagged measurements. A novel PLS approach is also proposed that provides for the development of multiple PLS models for different time intervals during the batch operation.

Keywords: batch distillation, soft sensing, composition estimation, partial least squares.
1. INTRODUCTION

Because of its inherent high operating flexibility and low capital investment, batch distillation has become a very attractive and profitable separation method, particularly in the manufacture of specialty chemicals. However, in order for these characteristics to be fully exploited, it is necessary to perform accurate composition monitoring during the entire operation to ensure that products of consistent high quality are obtained. This task is notoriously difficult because of the lack of rapid and reliable on-line analyzers.

To circumvent the disadvantages of conventional composition analyzers, recent research activity has focused on the development of soft sensors (inferential estimators), which use a model of the process to be controlled and information retained by secondary process variables to obtain on-line estimates of the product compositions. Even though a soft sensor for a batch distillation process can be developed based on state estimation methods using a fundamental model of the process (Barolo and Berto, 1998; Oisiovici and Cruz, 2000), the development of an accurate mechanistic model for a batch distillation column is difficult (Edgar, 1996), and the resulting estimator can be computationally heavy for on-line use.
As an alternative to first-principles modeling, an empirical model can be developed using information contained in the process historical database together with multivariate regression techniques. Partial Least Squares (PLS) regression has proven to be a powerful tool for analysing historical data and diagnosing operating problems (Kourti and MacGregor, 1995). This projection method has been widely used to develop steady-state inferential estimators for various continuous processes, and several applications have been reported for continuous distillation columns (Mejdell and Skogestad, 1991; Park and Han, 1998; Kano et al., 2000; Shin et al., 2000). This technique has recently been extended to the analysis, on-line monitoring, and diagnosis of batch processes (Nomikos and MacGregor, 1995; Duchesne and MacGregor, 2000), and successful applications have been reported (Kourti et al., 1995; Wold et al., 1998; Zheng et al., 2001). However, in most of these papers, the use of PLS is limited to the estimation of the final product quality.

In this paper, various PLS regression approaches are used to estimate the quality of the products obtained from a batch distillation column not only at the end of the process, but also during the entire duration of the batch. A PLS-based soft sensor is developed from measurements of secondary process variables and a simulated historical database. The following sections address several issues, including a comparison of the performance of linear and nonlinear PLS models, the selection of the most informative input variables for the PLS models, and the effect of augmenting the original process data with lagged measurements. A novel PLS regression approach is proposed that is based on the development of multiple regression models for different portions of the batch process.

_______________________
* To whom all correspondence should be addressed. E-mail: [email protected]

2. PARTIAL LEAST SQUARES REGRESSION

Partial Least Squares is a multivariate statistical technique that transforms large sets of correlated measurements (an input matrix X and an output matrix Y) into smaller sets of uncorrelated variables, with minimal loss of information. This technique eliminates the high dimensionality and collinearity of the original data by projecting them onto a low-dimensional space defined by a small number of orthogonal variables. A detailed description of the PLS algorithm and its mathematical formulation is provided by Geladi and Kowalski (1986). Only a brief overview of this method is presented here.

The input data matrix X usually consists of easy-to-measure process variables, whereas the output data matrix Y is typically composed of difficult-to-measure quality variables. The dimensions of X are $m_X \times n_X$, where $m_X$ is the number of samples and $n_X$ is the number of variables. Similarly, the dimensions of Y are $m_Y \times n_Y$. In PLS regression, the original matrices X and Y are decomposed simultaneously in order to find the directions (latent variables) in the input space X that explain most of the data variation and, at the same time, can predict the output space Y:

$$X = T P^{T} + E, \qquad Y = U Q^{T} + F \tag{1}$$

The latent variables are aligned along the k columns of the two score matrices, T ($m_X \times k$) and U ($m_Y \times k$), and are ordered in such a way that the amount of information (variance) of the original data described by each variable decreases as the number of latent variables increases. The PLS transformation is performed so that the score vectors of each i-th latent variable are mutually related through an inner linear relationship:

$$u_i = b_i t_i + h_i, \tag{2}$$

where $b_i$ is a coefficient determined by minimizing the norm of the residual vector $h_i$.

In Equation (1), the loading matrices P ($n_X \times k$) and Q ($n_Y \times k$) represent the projection of the original matrices X and Y, respectively, onto the latent variable space. Two residual matrices E ($m_X \times n_X$) and F ($m_Y \times n_Y$) are also obtained that contain the parts of X and Y, respectively, that are left out of the regression. The optimal number of latent variables k can be assessed using a number of methods, with cross-validation being the most reliable and widely used (Kourti and MacGregor, 1995; Jackson, 1991).

Because most practical problems are nonlinear, nonlinear PLS techniques have been developed in order to maintain the robust generalization property of the conventional PLS approach and, at the same time, represent any nonlinear relationship existing between X and Y. These techniques retain the framework of linear PLS but use a nonlinear relationship for each pair of latent variables $u_i$ and $t_i$. This relationship can be generally represented as:

$$u_i = f_i(t_i) + h_i, \tag{3}$$

where $f_i(\cdot)$ stands for a nonlinear vector function. For example, it could be a second-order polynomial regressor (Wold et al., 1989), a spline function (Wold, 1992), or a feedforward artificial neural network (ANN-PLS; Qin and McAvoy, 1992). These linear and nonlinear techniques are generally referred to as static PLS algorithms. When process variables are also characterized by auto-correlation in time, dynamic PLS can be used. In this approach, a relatively large number of past samples of the variables (lagged values) are included at each sampling instant in the original input data matrix X (Ricker, 1988), or in both the input and output matrices, X and Y (Qin and McAvoy, 1996). The augmented data matrices are then processed using conventional linear or nonlinear PLS regression algorithms.

Conventional PLS assumes that the data are given by two-dimensional matrices. However, for batch processes, data matrices are typically arranged in the form of three-dimensional arrays, where different batch runs are organized as rows, the measurement variables are in the columns, and their time evolution occupies the third dimension.

[Figure 1: the three-dimensional array XB (batches × variables × samples) is unfolded into the two-dimensional array X (samples × variables).]

Fig. 1. Unfolding of a three-dimensional batch data array XB to create a two-dimensional array X.
Each horizontal slice through this 3D array contains the trajectories of all the variables from a single batch; each vertical slice collects the values of all the variables for all the batches at the same time instant. To extend the conventional PLS approach to batch datasets, these three-dimensional batch data arrays have to be unfolded to create two-dimensional arrays. One possible way to rearrange the original arrays is represented by stacking one horizontal slice after the other, as shown in Figure 1. This unfolding procedure is particularly useful when the number of recorded samples varies from batch to batch. Alternatively, Multiway PLS can be performed (Nomikos and MacGregor, 1995).
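The stacking procedure described above can be illustrated with a minimal NumPy sketch. The function name `unfold_batches` and the random data are hypothetical and serve only to show the array shapes; the sketch assumes that each batch is stored as a (samples × variables) array and that all batches share the same variables.

```python
import numpy as np

def unfold_batches(batches):
    """Stack per-batch (samples x variables) arrays into one 2-D array.

    Batches may differ in the number of samples (rows), but they must
    all have the same number of variables (columns).
    """
    n_vars = batches[0].shape[1]
    for b in batches:
        if b.ndim != 2 or b.shape[1] != n_vars:
            raise ValueError("every batch must be 2-D with the same number of columns")
    # One horizontal slice after the other, as in Figure 1.
    return np.vstack(batches)

# Hypothetical data: three batches of different duration, five variables each.
rng = np.random.default_rng(0)
batches = [rng.normal(size=(n, 5)) for n in (120, 95, 140)]
X = unfold_batches(batches)
print(X.shape)  # (355, 5)
```

Because the batches are simply concatenated along the sample axis, no resampling or alignment of the trajectories is required when batch durations differ.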
To determine how well the original data conform to the PLS regression space, the Mean Squared (MSQ) error is defined as:

$$\mathrm{MSQ}_i = \frac{(y_i - \hat{y}_i)^{T}(y_i - \hat{y}_i)}{m_Y}, \tag{4}$$

where $y_i$ is the column vector of measurements of the generic i-th output variable, and $\hat{y}_i$ is its estimate obtained from the PLS model.

3. PROCESS DESCRIPTION AND DATA GENERATION

In the following example, batch distillation is used to separate a zeotropic ternary mixture. The separation is carried out in a 20-tray batch rectifier according to the constant-reflux operating procedure described by Luyben (1991). In this strategy, the column is initially operated at total reflux. When the distillate composition meets the desired quality specification, the distillate withdrawal is started, and products and slop cuts are sequentially collected from the top and segregated in separate tanks. The heaviest cut is recovered from the reboiler at the end of the batch. The schematic diagram of the batch column is shown in Figure 2.

The process objective is to have all the products satisfy the assigned purity specifications. For each product, the lower bound on purity is 0.95 (mole fraction) for the dominant component in that product. It is assumed that temperature measurements are available for the still pot and four trays (trays #5, 10, 15, 20, considering a bottom-to-top numbering scheme). These measurements are used to estimate the composition of the light and intermediate components in the distillate stream (xD(1) and xD(2), respectively), and the composition of the heavy component in the bottoms (xB(3)) by means of a PLS-based soft sensor. In general, the requirements for open-loop monitoring are different from those for control. However, for the batch rectifier, only the requirements for monitoring need to be addressed, because the column is basically operated in an open-loop mode (i.e., at constant reflux ratio).

The datasets needed to develop the PLS regression model were generated by repeatedly running a nonlinear deterministic model of the column (Barolo and Berto, 1998) for different operating conditions. This approach generates a highly informative process database easily and economically, because the process behavior can be investigated for a wide region of operating conditions. Several batch operations were simulated by varying the initial feed composition, the boilup rate and the reflux rate from batch to batch. For each batch run, the time-varying trajectories of all process variables were recorded using a sampling period of 36 s. For the i-th batch, the recorded temperature measurements TB, T5, T10, T15, T20 were arranged as column vectors in the input data matrix Xi. Similarly, the measurements of xD(1), xD(2) and xB(3) were used as column vectors in the output matrix Yi.

Because the simulated batches have different durations, the matrices Xi and Yi have different numbers of samples. The simulated datasets were divided into two groups: data from 11 batches were used to compute the parameters of the multivariate regression model (calibration sets), and data from 5 batches were used to test the accuracy of the PLS representation (validation sets). The input and output matrices for the calibration data sets were arranged according to the unfolding procedure described in Figure 1, to give the comprehensive input and output calibration data matrices Xc and Yc. Similarly, the input and output matrices corresponding to the selected validation sets were used to build the validation data matrices Xv and Yv. Calibration and validation data were scaled to zero mean and unit variance.
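The calibration and validation workflow can be sketched as follows. This is a minimal illustration and not the implementation used in the paper: it fits a linear PLS model with a NIPALS iteration in the style of the Geladi and Kowalski tutorial and evaluates the per-output MSQ on validation data. The helper names (`autoscale`, `fit_linear_pls`, `msq`) and the synthetic data are hypothetical, and scaling the validation data with the calibration mean and standard deviation is an assumption, since the scaling details are not specified in the text.

```python
import numpy as np

def autoscale(A, mean=None, std=None):
    """Scale columns to zero mean and unit variance (or reuse given statistics)."""
    mean = A.mean(axis=0) if mean is None else mean
    std = A.std(axis=0, ddof=1) if std is None else std
    return (A - mean) / std, mean, std

def fit_linear_pls(X, Y, k, tol=1e-10, max_iter=500):
    """NIPALS linear PLS with k latent variables; returns B with Y_hat = X @ B."""
    X, Y = X.copy(), Y.copy()
    W, P, Q, b = [], [], [], []
    for _ in range(k):
        u = Y[:, [np.argmax(Y.var(axis=0))]]      # start from the most variable output
        t_old = None
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)                # input weights (unit norm)
            t = X @ w                             # input scores t_i
            q = Y.T @ t
            q /= np.linalg.norm(q)                # output loadings (unit norm)
            u = Y @ q                             # output scores u_i
            if t_old is not None and np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
                break
            t_old = t
        tt = np.vdot(t, t)
        p = X.T @ t / tt                          # input loadings
        b_i = np.vdot(u, t) / tt                  # inner coefficient of u_i = b_i t_i + h_i
        X -= t @ p.T                              # deflate X
        Y -= b_i * t @ q.T                        # deflate Y
        W.append(w); P.append(p); Q.append(q); b.append(b_i)
    W, P, Q = np.hstack(W), np.hstack(P), np.hstack(Q)
    return W @ np.linalg.inv(P.T @ W) @ np.diag(b) @ Q.T

def msq(Y_true, Y_pred):
    """Mean squared error per output variable, as in Equation (4)."""
    err = Y_true - Y_pred
    return (err ** 2).sum(axis=0) / Y_true.shape[0]

# Synthetic stand-in data (assumption): 5 "temperatures", 3 "compositions".
rng = np.random.default_rng(1)
latent = rng.normal(size=(400, 3))
X_all = latent @ rng.normal(size=(3, 5)) + 0.01 * rng.normal(size=(400, 5))
Y_all = latent @ rng.normal(size=(3, 3)) + 0.01 * rng.normal(size=(400, 3))
Xc, Xv, Yc, Yv = X_all[:300], X_all[300:], Y_all[:300], Y_all[300:]

Xc_s, mx, sx = autoscale(Xc)
Yc_s, my, sy = autoscale(Yc)
Xv_s, _, _ = autoscale(Xv, mx, sx)
Yv_s, _, _ = autoscale(Yv, my, sy)

B = fit_linear_pls(Xc_s, Yc_s, k=3)
msq_val = msq(Yv_s, Xv_s @ B)
print(msq_val)
```

The nonlinear variants discussed in Section 2 would replace the scalar coefficient `b_i` with a polynomial, spline or neural-network fit of `u` against `t`; that extension is not shown here.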
[Figure 2 labels: Water, Steam, Feed, LC, FC, TI, soft sensor, R, D, V, P1, S1, P2, S2, P3.]

Fig. 2. Schematic diagram of the batch rectifier.

4. PLS REGRESSION MODELS

Conventional linear, polynomial, spline and ANN-based PLS regressions were carried out on the calibration data sets Xc and Yc. The reliability and the accuracy of the models were subsequently tested using the validation data sets Xv and Yv. Several issues were addressed, such as the effect of additional exogenous process inputs, and the sensitivity of the model accuracy to the presence of lagged measurements in the model inputs.

4.1 Static PLS regression models

Three latent variables were retained in each PLS model, the optimal number having been determined by cross-validation. As reported in Table 1, the cumulative percentage of variance explained by the PLS models for the input and output calibration data is not strongly affected by the regression method used. Nevertheless, when validation results are considered, the linear PLS model provides the best overall estimation performance, because it minimizes the value of MSQ for two of the three estimated variables (xD(1) and xD(2); the best estimation of xB(3) is provided by the polynomial PLS model).

Table 1. Model comparison: base case

PLS regression    Variance explained [%]    MSQ × 10^5
method            X        Y                xB(3)    xD(1)    xD(2)
Linear            98.05    88.37            5.15     5.43     7.13
Polynomial        98.05    87.66            5.08     6.25     7.89
Spline            98.03    88.79            5.99     7.88     9.13
ANN-based         98.05    87.18            5.90     7.65     8.83

Figure 3 shows that the linear PLS model is capable of reproducing the cumulative profile of the validation data fairly accurately. However, some mismatch between the actual and estimated profiles occurs, particularly for xD(2). Even though some error occurs in the estimation of xB(3) at the beginning of a batch, the PLS model is able to estimate this variable quite accurately when the composition specification target is approached.

[Figure 3: three panels (xD(1), xD(2), xB(3)), composition from 0.0 to 1.0 versus time sample from 0 to 1600.]

Fig. 3. Validation data set. Comparison between the actual value of the product compositions (solid lines) and their estimates provided by a linear PLS model (dashed lines).

Effect of additional input variables. The reflux flowrate was included in the calibration and validation input matrices to check whether increasing the information content of the input data improves the accuracy of the regression models. In order to explain the same amount of cumulative variance that was achieved when only temperatures were used as model inputs, it was necessary to consider PLS models with four latent variables in this case.

Table 2. Model comparison: additional flowrate input

PLS regression    Variance explained [%]    MSQ × 10^5
method            X        Y                xB(3)    xD(1)    xD(2)
Linear            98.39    88.97            4.09     5.44     7.12
Polynomial        98.39    88.56            4.22     6.86     7.59
Spline            98.21    89.38            6.71     30.47    22.02
ANN-based         98.39    87.88            5.39     7.66     8.57

As shown in Table 2, even though the percentage of variance of Y captured by each PLS regression model is generally slightly higher than that reported in Table 1, including the reflux flowrate in the input datasets does not result in a substantial improvement of the model accuracy. In fact, this data augmentation produces an insignificant reduction of the estimation error for the linear, polynomial and ANN-based regression models. In contrast, the value of the performance index increases significantly for the spline model.

4.2 Dynamic PLS regression models

Linear and nonlinear PLS models were developed after adding different numbers of lagged measurements to the original datasets.

[Figure 4: four panels (Linear PLS, Polynomial PLS, Spline PLS, ANN PLS) showing MSQ × 10^5 versus the number of lagged inputs for xB(3), xD(1) and xD(2).]

Fig. 4. Validation results for dynamic PLS models.

Figure 4 indicates that the augmentation of the data matrices does not influence the accuracy of the PLS models significantly. In fact, the MSQ value does not change significantly when different numbers of past samples are included in the data. Only the estimation of xD(1) seems to benefit from this dynamic PLS approach, because a moderate decrease of MSQ can be observed for this variable when the number of lagged variables increases.
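The augmentation of the input matrix with lagged measurements can be sketched as follows. The function `add_lagged_inputs` is a hypothetical helper, not code from the paper. The sketch assumes that lags are built separately for each batch before unfolding, so that past samples never cross a batch boundary, and that the first rows (for which past values are missing) are discarded; neither detail is stated in the text.

```python
import numpy as np

def add_lagged_inputs(X, n_lags):
    """Return rows [x(t), x(t-1), ..., x(t-n_lags)] for one batch.

    X is a (samples x variables) array for a single batch. The first
    n_lags rows are dropped because their past values are not available.
    """
    m = X.shape[0]
    if n_lags < 0 or n_lags >= m:
        raise ValueError("n_lags must satisfy 0 <= n_lags < number of samples")
    # Block 'lag' holds the samples delayed by 'lag' steps, aligned row by row.
    blocks = [X[n_lags - lag : m - lag] for lag in range(n_lags + 1)]
    return np.hstack(blocks)

# Small example: 6 samples of 2 variables, augmented with 2 lags.
X = np.arange(12).reshape(6, 2)
X_aug = add_lagged_inputs(X, n_lags=2)
print(X_aug.shape)  # (4, 6)
print(X_aug[0])     # [4 5 2 3 0 1]
```

With n variables and L lags the augmented matrix has n(L + 1) columns, which is the growth in dimension that the dynamic approach has to pay for.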
A major disadvantage of dynamic PLS is that the addition of a relatively large number of lagged values in the input data matrix causes a substantial increase in the dimension of this matrix. This implies that the computational load required for the multivariate regression increases considerably. This is particularly true for spline PLS and ANN-based PLS, which are based on more computationally intensive algorithms. It should however be noted that the data regression is performed off-line once and for all, and therefore the on-line computational effort is only marginally increased when lagged measurements are included in the input matrix. In any case, the regression model becomes more cumbersome to manipulate, because of the larger number of parameters.

5. MULTIPLE PLS REGRESSION MODELS

A novel PLS method is proposed that takes into account the peculiar characteristics of the batch distillation process. This approach consists of subdividing the data recorded for each batch run into subsets, each of which corresponds to a particular operating period (e.g., startup; main cut 1 withdrawal; slop cut 1 withdrawal; …). For each period, separate models using linear and nonlinear PLS regression algorithms are developed. The PLS models of the same type (linear, polynomial, spline or ANN-based) obtained for each period are then used sequentially to estimate the whole development of the process. The resulting models are referred to as Multiple PLS (MPLS) regressions.

This modeling strategy is motivated by the fact that each batch goes through a series of phases with substantially different characteristics. In fact, because of the operating procedure adopted, the process dynamic regime moves from a condition of total reflux to a condition of constant reflux. At the same time, the process experiences large excursions in the tray compositions, due to the movement of the light and intermediate components from the bottom to the top of the column. These changes in the column dynamic regime and composition distribution are reflected in the variability of the temperature profile, which is used in the PLS regressions to reconstruct the entire development of the process. In contrast to conventional PLS, the MPLS method can handle this complexity and variety of information about the different process phases directly, and this fact can potentially improve the model accuracy.

In order to develop the MPLS models, the original data for each batch run were partitioned into five sections, corresponding to the total reflux phase and to the withdrawal of the products and slop cuts obtained from the top of the column. Linear and nonlinear MPLS models were developed. Table 3 reports the results for the validation data.

Table 3. Validation results: MPLS models

PLS regression    MSQ × 10^5
method            xB(3)    xD(1)    xD(2)
Linear            5.23     3.28     4.02
Polynomial        7.10     4.29     4.02
Spline            30.55    15.46    54.78
ANN-based         6.23     4.62     4.68

The spline PLS algorithm gives very poor results for the MPLS approach. For the other MPLS models, the MSQ values for xB(3) are slightly higher than those obtained using the conventional PLS approach. However, for the linear and ANN-based PLS regression models this increase is negligible. It is noteworthy that the MSQ values calculated for xD(1) and xD(2) are markedly lower, in particular for the linear MPLS regression model.

[Figure 5: three panels (xD(1), xD(2), xB(3)), composition from 0.0 to 1.0 versus time sample from 0 to 1600.]

Fig. 5. Validation data set. Comparison between the actual value of the product compositions (solid lines) and their estimates provided by a linear MPLS model (dashed lines).

Figure 5 indicates that the proposed MPLS approach describes the process fairly accurately, and is clearly superior to the conventional PLS approach.

6. CONCLUSIONS

Both linear and nonlinear PLS-based models have been developed for a batch distillation process in order to estimate the composition of the distillate stream and of the bottom product from temperature measurements. Several issues were addressed, including the selection of the most appropriate input variables, and the effect of augmenting the original process data with lagged input measurements. It was found that multiple PLS regression provides the best representation of the process, in particular as far as the estimation of the distillate composition is concerned. This novel approach is based on developing a different PLS model for each operating phase of the process, and using the models sequentially to estimate the product composition during the entire batch run.

ACKNOWLEDGEMENT

This research was carried out in the framework of the Research Project "VirSens" funded by the University of Padova (Project M1P012-1999).

REFERENCES

Barolo, M. and F. Berto (1998). Composition Control in Batch Distillation: Binary and Multicomponent Mixtures. Ind. Eng. Chem. Res., 37, 4689-4698.
Duchesne, C. and J.F. MacGregor (2000). Multivariate Analysis and Optimization of Process Variable Trajectory for Batch Processes. Chemom. Intell. Lab. Syst., 51, 125-137.
Edgar, T.F. (1996). Control of Unconventional Processes. J. Proc. Contr., 6, 99-110.
Geladi, P. and B.R. Kowalski (1986). Partial Least-Squares Regression: a Tutorial. Analytica Chimica Acta, 185, 1-17.
Jackson, J.E. (1991). A User's Guide to Principal Components. John Wiley & Sons, New York (U.S.A.).
Kano, M., K. Miyazaki, S. Hasebe and I. Hashimoto (2000). Inferential Control System of Distillation Compositions Using Dynamic Partial Least Squares Regression. J. Proc. Contr., 10, 157-166.
Kourti, T., P. Nomikos and J.F. MacGregor (1995). Analysis, Monitoring and Fault Diagnosis of Batch Processes Using Multiblock and Multiway PLS. J. Proc. Contr., 4, 277-284.
Kourti, T. and J.F. MacGregor (1995). Tutorial: Process Analysis, Monitoring and Diagnosis, Using Multivariate Regression Methods. Chemom. Intell. Lab. Syst., 28, 3-21.
Luyben, W.L. (1991). Multicomponent Batch Distillation. 1. Ternary Systems with Slop Recycle. Ind. Eng. Chem. Res., 27, 642-647.
Mejdell, T. and S. Skogestad (1991). Estimation of Distillation Compositions from Multiple Temperature Measurements Using Partial-least-squares Regression. Ind. Eng. Chem. Res., 30, 2543-2555.
Nomikos, P. and J.F. MacGregor (1995). Multi-way Partial Least Squares in Monitoring Batch Processes. Chemom. Intell. Lab. Syst., 30, 97-108.
Oisiovici, R.M. and S.L. Cruz (2000). State Estimation of Batch Distillation Columns Using an Extended Kalman Filter. Chem. Eng. Sci., 55, 4667-4680.
Park, S. and C. Han (1998). A Nonlinear Soft Sensor Based on Multivariate Smoothing Procedure for Quality Estimations in Distillation Columns. Comput. Chem. Eng., 24, 871-877.
Qin, S.J. and T.J. McAvoy (1992). Non-linear PLS Modelling Using Artificial Neural Networks. Comput. Chem. Eng., 16, 379-391.
Qin, S.J. and T.J. McAvoy (1996). Nonlinear FIR Modeling via a Neural Net PLS Approach. Comput. Chem. Eng., 20, 147-159.
Ricker, N.L. (1988). The Use of Biased Least Squares Estimators for Parameters in Discrete-time Pulse Response Model. Ind. Eng. Chem. Res., 27, 343-350.
Shin, J., M. Lee and S. Park (2000). Design of a Composition Estimator for Inferential Control of Distillation Columns. Chem. Eng. Comm., 178, 221-248.
Wold, S., N. Kettaneh, H. Friden and A. Holmberg (1998). Modelling and Diagnosis of Batch Processes and Analogous Kinetic Experiments. Chemom. Intell. Lab. Syst., 44, 331-340.
Wold, S. (1992). Nonlinear Partial Least Squares Modelling. II. Spline Inner Relation. Chemom. Intell. Lab. Syst., 14, 71-84.
Wold, S., N. Kettaneh-Wold and B. Skagerberg (1989). Non-linear PLS Modelling. Chemom. Intell. Lab. Syst., 7, 53-65.
Zheng, L.L., T.J. McAvoy, Y. Huang and G. Chen (2001). Application of Multivariate Statistical Analysis in Batch Processes. Ind. Eng. Chem. Res., 40, 1641-1649.