A Novel Soft Sensor Method Detecting Completion of Transition for Industrial Polymer Processes


Hiromasa Kaneko*, Masamoto Arakawa*, Kimito Funatsu*

* Department of Chemical System Engineering, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, Japan (Tel: +81-3-5841-7751; e-mail: [email protected]).

Abstract: Soft sensors are widely used to estimate values of process variables that are difficult to measure online, for example, polymer quality variables. Industrial polymer processes generally produce many grades of products. In order to reduce the quantity of off-grade material and produce a consistent product, values of polymer quality variables should be estimated with high accuracy by using soft sensor models. However, the predictive accuracy during grade transition can be low because the state in a polymer reactor is unsteady in transition. Values of process variables in the unsteady state can differ from those used to construct a regression model. It is desirable to know the time at which the polymer quality meets product specifications. Thus, we propose to construct a model that detects completion of transition in order to assure predicted values of the polymer quality variables after the transition. By using this model and constructing regression models for each grade of a product, values of the objective variables can be predicted with high accuracy by selecting a regression model appropriately. We analyzed real industrial data as an application of the proposed method. The proposed method achieved higher predictive accuracy than traditional ones.

Keywords: soft sensor, process control, transition, k-nearest neighbor, support vector machine, range based approach, one-class support vector machine, partial least squares

1. INTRODUCTION

In operating chemical plants, operators have to monitor the operating conditions of the plants and control process variables, for example, temperature, pressure, liquid level, concentration of products and others.
Therefore, these variables need to be measured online, but not all of them are easy to measure online because of technical difficulty, large measurement delays, high investment cost and other factors. In chemical plants, soft sensors are widely used to estimate process variables that are difficult to measure online (Kamohara et al., 2004). An inferential model is constructed between variables that are easy to measure online and those that are difficult to measure online, and values of the objective variables are estimated by the model. A partial least squares (PLS) method (Wold et al., 2001; Lin et al., 2007) is mainly used as the modeling method for soft sensors. In addition, a principal component regression method (Aguado et al., 2006), a nonlinear PLS method (Zhao et al., 2006), an artificial neural network (Qin et al., 1997), a support vector machine based regression method (Yan et al., 2004) and others have been studied as soft sensor methods. By using soft sensors, values of objective variables can be estimated with high accuracy. Industrial polymer processes generally produce many grades of products. Therefore, when a polymer grade changes, it is important to reduce the quantity of off-grade material. An early and accurate judgment of whether values of polymer quality are within specifications is desired from soft sensors because it is impossible to measure a great variety of polymer

quality by hard sensors online. McAuley et al. (1991) constructed theoretically based models to predict melt index and density, in which adjustable parameters were updated online. However, construction of a highly predictive model is difficult because there can be a nonlinear relationship between polymer quality and process variables. Kim et al. (2005) tried to deal with the nonlinearity by dividing data according to values of polymer quality and constructing a linear regression model for each subset, with PLS as the regression method. In prediction, an appropriate model is selected on the basis of a value predicted by a global PLS model constructed with all of the data. Then, a final predicted value of the polymer quality is calculated with the selected model. However, the global PLS model cannot represent the nonlinear relationship between polymer quality and process variables, so the selection of the appropriate model can be difficult. In addition, polymer quality is often measured in steady states after grade transition and seldom measured during the transition. The predictive accuracy during the transition can therefore be low, because the regression model is constructed with data that contain few samples from the transition. Thus, it is often difficult to judge accurately whether transition is completed or not. Fig. 1 shows an example of predicted values lagging behind measured ones. Asterisks represent measured values of density and the line represents its predicted values. The predicted values can be calculated online, but the measurement delay of the measured values is about 1 hour. The polymer grades are 1 and 2. Dashed

lines represent the upper and lower bounds for each polymer grade. It is important to detect the actual completion of the transition because polymer products made during the transition are off-grade. Measured values are within the grade 2 specifications at (A), but predicted values are not until (B). In practice, a large quantity of polymer product is wasted because the measurement result is delayed by about 1 hour.

Fig. 1. An example of predicted values lagging behind measured ones. (Labels in the figure: Grade 1, Grade 2, time points (A) and (B).)

It is desirable to know the time at which polymer quality meets product specifications. Thus, we propose to construct a model that detects completion of transition in order to assure predicted values of polymer quality variables Y after the transition. In this paper, a model that detects completion of transition is called a discriminant model. By using the discriminant model and constructing regression models for each grade, values of Y are predicted with a selected regression model. This avoids the problem that predictive accuracy drops during transition. In addition, by ensuring predictive accuracy with the discriminant model, values of polymer quality can be estimated with high predictive accuracy right after transition. We analyzed real industrial data as an application of the proposed method. In this paper, we constructed discriminant models with a k-nearest neighbor (k-NN) method and a support vector machine (SVM) (Vapnik et al., 1995) as methods that model both target-grade and off-grade data, and with a range based approach (RANGE) and a one-class SVM (OCSVM) (Vapnik et al., 1995) as methods that model only target-grade data. Then, we tried to comprehend the state of the plant with the discriminant models and to estimate values of Y with a PLS model, selecting from models constructed with the data of each grade and updating the models appropriately. By comparing the proposed methods with traditional ones, we show the superiority of the proposed ones.

2. METHODS

In the k-NN method (Section 2.1), a query sample is classified as either target-grade or off-grade, and if it is classified as the former, completion of transition is detected.

2.2 SVM

An SVM is a classification method that can create nonlinear classifiers by applying the kernel trick. In the linear SVM, the discriminant function f(x) is as follows:

$$f(\mathbf{x}) = \mathbf{x} \cdot \mathbf{w} + b \tag{1}$$

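where x ∈ R^n (n is the number of variables) is a query sample, w ∈ R^n is a weight vector and b is a bias. The primal form of the SVM can be written as the following optimization problem:

$$\text{Minimize} \quad \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_i \xi_i \tag{2}$$

$$\text{subject to} \quad y_i(\mathbf{x}_i \cdot \mathbf{w} + b) \ge 1 - \xi_i, \quad y_i \in \{-1, 1\} \tag{3}$$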

where y_i and x_i are training data, ξ_i are slack variables and C is a penalizing factor that controls the trade-off between the training error and the margin. By minimizing (2), we can construct a discriminant model that has a good balance between adaptive ability to the training data and generalization capability. The kernel function in our application is a radial basis function:

$$K(\mathbf{x}, \mathbf{x}') = \exp\left(-\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^2\right) \tag{4}$$

where γ is a tuning parameter controlling the width of the kernel function. By using (4), a nonlinear model can be constructed because the inner product of x and w in (1) is represented through the kernel function of x. In this paper, LIBSVM is used as the machine learning software. We construct an SVM model that discriminates between target-grade data and off-grade data and then detect completion of transition with the discriminant model.

2.3 RANGE

As the simplest method, a domain that represents completion of transition can be determined from the data range of each variable. If the values of all process variables in a query sample are within the target-grade data ranges, it is determined that transition is completed.
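The RANGE rule described above can be illustrated with a minimal sketch. The function names `fit_ranges` and `transition_completed` and the toy numbers are assumptions made for illustration only; the paper does not give an implementation.

```python
from typing import List, Sequence, Tuple


def fit_ranges(target_grade_data: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return the (min, max) of every variable over the target-grade samples."""
    columns = zip(*target_grade_data)  # one tuple of values per variable
    return [(min(col), max(col)) for col in columns]


def transition_completed(query: Sequence[float],
                         ranges: Sequence[Tuple[float, float]]) -> bool:
    """Transition is judged complete only if every variable lies within its range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(query, ranges))


# Hypothetical target-grade samples with two process variables each.
train = [[80.0, 2.0], [82.0, 2.4], [81.0, 2.2]]
ranges = fit_ranges(train)

inside = transition_completed([81.5, 2.1], ranges)   # both variables within range
outside = transition_completed([85.0, 2.1], ranges)  # first variable out of range
```

Because every variable must pass its own check, adding variables can only shrink the accepted domain, which is one way to read the later observation that RANGE becomes stricter as the number of variables grows.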

2.1 k-NN The k-nearest neighbor algorithm is a well-known classification method. When we want to classify a query sample, we simply determine which class most of the k known samples that are closest to the query sample belong to. We use the Euclidean distance between two samples as a distance measure. A query sample is classified to either

2.4 OCSVM

An OCSVM is a method that applies an SVM to a domain description problem. Given a set of training data in a high-dimensional input space, the objective of an OCSVM is to learn a function that takes the value +1 in the region where the majority of the data is concentrated and the value -1 everywhere else. The function to be learned is modeled as a hyperplane in a transformed space, and the hyperplane parameters are estimated so that its margin with respect to the training data is maximized, as dictated by the data-driven distribution-free paradigm. The maximum margin solution of the OCSVM problem is obtained by solving the following quadratic optimization problem:

$$\text{Minimize} \quad \frac{1}{2}\lVert \mathbf{w} \rVert^2 + \frac{1}{\nu l} \sum_i \xi_i - b \tag{5}$$

$$\text{subject to} \quad \mathbf{w} \cdot \mathbf{x}_i + \xi_i - b \ge 0, \quad \xi_i \ge 0 \tag{6}$$

where w ∈ R^n is a weight vector, b is a bias, ξ_i are slack variables, l is the number of training samples and ν is a parameter that represents an upper bound on the fraction of outliers in the data. Finally, the decision function given by the learned hyperplane is:

$$f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} - b \tag{7}$$
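When a kernel is used, the hyperplane is normally evaluated through the training samples rather than through w itself. As a sketch only, assuming the usual dual representation of w with coefficients α_i (a symbol not used in the original text) and assuming the offset enters as −b, as in the common OCSVM convention, the decision function with the radial basis function kernel would read:

```latex
f(\mathbf{x}) = \sum_{i=1}^{l} \alpha_i \, K(\mathbf{x}_i, \mathbf{x}) - b,
\qquad
K(\mathbf{x}_i, \mathbf{x}) = \exp\left(-\gamma \lVert \mathbf{x}_i - \mathbf{x} \rVert^2\right)
```

Under this reading, a query sample with f(x) ≥ 0 falls inside the target-grade domain, and only training samples with nonzero α_i contribute to the sum.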

The kernel function in our application is the radial basis function (4), as in the SVM. In this paper, LIBSVM is used as the machine learning software. By constructing an OCSVM model with the data of each target grade, whether transition is completed for a query sample can be judged with the discriminant model.

2.5 Proposed Soft Sensor Method

In this paper, we propose to construct a discriminant model with a k-NN, SVM, RANGE or OCSVM method in order to assure predicted values of polymer quality variables Y. By using the discriminant model and constructing regression models for each polymer grade, values of Y can be predicted with high accuracy by selecting a regression model appropriately. Fig. 2 shows the procedure of the proposed soft sensor method. First, we divide the process data by polymer grade. Next, regression models and discriminant models are constructed with the data of each grade. By applying a query sample to the discriminant model constructed with target-grade data, we judge whether transition is completed or not. If the model judges that the state in the polymer reactor is still in transition, the regression model corresponding to the polymer grade before the transition predicts values of polymer quality. In this case, the predicted values cannot be trusted because the state is unsteady during the transition. However, this is not a serious problem, because prediction matters after the transition rather than during it. On the other hand, if the model judges that the state is after the transition, the regression model corresponding to the target grade predicts values of polymer quality. These predicted values can be trusted because the query sample is near the training data of the target grade. In addition, if new data of polymer quality are obtained, the models corresponding to the target grade are updated.

3. RESULTS AND DISCUSSION

3.1 Data

We applied the proposed method to real industrial data in order to verify its prediction ability. We analyzed data obtained from the operation of an industrial polymer process at Mitsui Chemical. The objective variables Y are density and melt flow rate (MFR), and the explanatory variables X are 38 variables, such as temperature in the reactor, pressure and concentration

Fig. 2. A procedure of the proposed soft sensor method.

of monomer, comonomer and hydrogen. Reactor residence time is taken into consideration for the X-variables. We prepared data monitored from January 2005 to April 2007 as training data and data from May 2007 to May 2008 as test data.

3.2 Detection of Completion of Transition

We tried to detect completion of transition with models constructed with the k-NN, SVM, RANGE and OCSVM methods in order to verify the detection ability of each method. First, we visualized the domains in which the discriminant models detected completion of transition with the training data, to comprehend the features of the methods. Fig. 3 shows the results. As a visualization method, principal component analysis (PCA) was applied to the data. Before the PCA calculation, 12 variables having unique values for each polymer grade were selected from the 38 variables. Fig. 3(a) is a plot of the first principal component (PC) and the second PC. Black and gray points represent samples within specifications and off-grade samples, respectively. The contribution ratios of the first and second PCs were about 40% and 25%, respectively. The domains in which the k-NN, SVM, RANGE and OCSVM models detected completion of transition are shown in Fig. 3(b), (c), (d) and (e). Here the models were constructed with only the first and second PCs for the visualization of the domains. In this paper, k = 5 was selected for k-NN. In Fig. 3(b), completion of transition was detected in almost all of the domain, because with k-NN a query sample far from the target-grade data is still detected as completion of transition by majority vote if there are few off-grade samples around it. In Fig. 3(c) for SVM, the detected domain seemed to cover only the black points of Fig. 3(a). However, regions with few off-grade samples, such as the right side of Fig. 3(a), were detected widely. On the other hand, for RANGE and OCSVM, which construct models with only samples within specifications, the detected domains were large compared with those of SVM. RANGE and OCSVM models can detect completion of transition in domains around which few off-grade samples are distributed. Comparing RANGE and OCSVM, the detected domains of OCSVM are larger than those of RANGE in Fig. 3(d) and (e). For OCSVM, the detected domain would be too large compared with SVM and RANGE. This may be because the discriminant models were constructed with only the first and second PCs. By using the X-variables to construct them, the detected domains can be appropriately small. Second, completion of transition in the test data was detected with k-NN, SVM, RANGE and OCSVM models constructed with 12 variables. Fig. 4 shows the results. Precision and detection rate are defined as follows:

$$\text{precision} = \frac{TP}{TP + FP}, \qquad \text{detection rate} = \frac{TP}{TP + FN} \tag{8}$$

where TP is the number of true positives, i.e., how many samples are detected as completion of transition which actually are completion of transition; FP is the number of

Fig. 3. Visualization of domains in which discriminant models detected completion of transition: (a) PCA, (b) 5-NN, (c) SVM, (d) RANGE, (e) OCSVM. Black and gray points represent samples within specifications and off-grade samples, respectively.

because the results of SVM and OCSVM were distributed in the upper right as compared to those of 5-NN and RANGE.

3.3 Prediction

Fig. 4. The relationships between precision and detection rate. Gray open, gray closed, black open and black closed marks represent results of detection by using the 5-NN, SVM, RANGE and OCSVM methods, respectively. Triangles, circles and squares represent results with 18 variables that can be closely related to Y, 12 variables that have unique values for each polymer grade and all 38 variables, respectively.

false positives, i.e., how many samples are detected as completion of transition that actually are not completion of transition; and FN is the number of false negatives, i.e., how many samples are not detected as completion of transition that actually are completion of transition.

In order to verify the superiority of the proposed soft sensor methods, they were compared with traditional ones. We used a PLS method and a support vector regression (SVR) method (Yan et al., 2004; Lee et al., 2005) as linear and nonlinear regression methods, and the number of variables was 18. In this paper, the traditional methods are a method that predicts Y with one PLS model (PLS without update), a method that predicts Y with one SVR model (SVR without update) and a method that predicts Y with one PLS model that is updated (PLS with update). Systematic sampling was used so that the number of training data did not exceed 1000. For comparison, the proposed method without a discriminant model was also applied (PLS per grade). In this method, the target regression model is selected when an operator changes the polymer grade. PLS is used to construct the regression models in the proposed methods (5-NN+PLS, SVM+PLS, RANGE+PLS and OCSVM+PLS). Table 1 shows the prediction results. RMSE (root mean square error) is defined as follows:

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(y_{\text{obs},i} - y_{\text{pred},i}\right)^2}{n}} \tag{9}$$

where y_obs is the actual y value, y_pred is the predicted y value and n is the number of test samples. Accuracy rate is defined as follows:

$$\text{accuracy rate} = \frac{TP + TN}{TP + FP + TN + FN} \tag{10}$$

where TN is the number of true negatives, i.e., how many samples are not detected as completion of transition which actually are not completion of transition. In the four methods from the top in Table 1, when predicted values of both density and MFR are within specifications, completion of transition is detected. Then, accuracy rate, precision and detection rate are calculated.
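The evaluation measures in (8) to (10) can be computed directly from counts of detected and actual completions. The following is a small sketch with made-up labels; the function names and the toy data are illustrative assumptions, not the authors' code, and the functions assume non-zero denominators.

```python
import math


def confusion_counts(actual, detected):
    """Count TP, FP, TN, FN where True means 'completion of transition'."""
    pairs = list(zip(actual, detected))
    tp = sum(a and d for a, d in pairs)
    fp = sum((not a) and d for a, d in pairs)
    tn = sum((not a) and (not d) for a, d in pairs)
    fn = sum(a and (not d) for a, d in pairs)
    return tp, fp, tn, fn


def precision(tp, fp):
    return tp / (tp + fp)          # first part of Eq. (8)


def detection_rate(tp, fn):
    return tp / (tp + fn)          # second part of Eq. (8)


def accuracy_rate(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)   # Eq. (10)


def rmse(y_obs, y_pred):
    """Root mean square error, Eq. (9)."""
    n = len(y_obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / n)


# Hypothetical labels for five query samples.
actual = [True, True, False, False, True]
detected = [True, False, True, False, True]
tp, fp, tn, fn = confusion_counts(actual, detected)
```

A model that labels every sample as completed drives the detection rate toward 100% while precision falls, which is the pattern reported for "PLS per grade" in Table 1.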

Fig. 4 indicates that discriminant models with high precision and detection rate were constructed by each method. In particular, precision tended to be high in the order of RANGE, SVM, OCSVM and 5-NN, and detection rate tended to be high in the reverse order. The 5-NN model must have a high detection rate and low precision because the detected domain in Fig. 3(b) was large. The SVM model must have higher precision and a lower detection rate than the OCSVM model because it is constructed with off-grade data in addition to data within specifications. However, if off-grade samples are not distributed around the data within specifications, as in Fig. 3(c), the detected domain can be too large with the SVM. In this case, the RANGE and OCSVM models would have higher precision than the SVM model. It is recommended that an appropriate model be selected by checking the data distribution.
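The large detected domain of the 5-NN model can be reproduced with a toy example: a query far from all training data is still voted target-grade when its nearest neighbors happen to be target-grade samples. This is a minimal sketch with invented two-dimensional data; `knn_classify` and the labels "target" and "off" are illustrative names.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k=5):
    """Majority vote among the k training samples closest in Euclidean distance."""
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: a target-grade cluster and an off-grade cluster.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
           (-5.0, 0.0), (-6.0, 0.0), (-5.0, 1.0)]
labels = ["target"] * 4 + ["off"] * 3

# Far to the right of every sample, yet four of its five nearest neighbors are target-grade.
far_query = knn_classify((10.0, 0.0), samples, labels)
# In the middle of the off-grade cluster.
off_query = knn_classify((-5.5, 0.2), samples, labels)
```

The first query illustrates why precision suffers: distance to the training data plays no role in the vote, only the class mix of the nearest neighbors does.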

By comparison between “PLS without update” and “SVR without update”, the latter had a larger r² and a lower RMSE than the former for both density and MFR. This could be because there is a nonlinear relationship between Y and the process variables. In the case of “PLS with update” (Qin, 1998; Kaneko et al., 2009), r² decreased and RMSE increased because the state in the polymer reactor was unsteady during transition and the regression model was constructed with highly nonlinear data when it was updated. Updating the regression model alone could not deal with the nonlinearity of the polymer reactor. In addition, the number of training data did not affect the results significantly.

The larger the number of variables was, the higher precision and lower detection rate discriminant models tended to have. In the case of the large number of variables, a detected domain of each method would be small, in particular RANGE. In addition, SVM and OCSVM gave good results

When PLS models per grade were constructed with updating them, the predictive accuracy for both density and MFR increased. However, the models considered almost all samples as completion of transition because detection rate was almost 100% and precision was low. A model outputs

almost the same value of Y when any query samples were input to the model. Thus, a discriminant model is essential for judgment of completion of transition.

accuracy rate with 12 variables than 5-NN, and OCSVM has a higher accuracy rate with 12 variables than RANGE. Fig. 6 shows the prediction results on January 28, 2008. The SVM model considered query samples as completion of transition too early even though they were actually off-grade samples. On the other hand, the OCSVM model considered query samples as in transition when they were actually off-grade samples and vice versa. In order to investigate the reason, we checked a t1-t2 plot around the target-grade data. Fig. 7 shows the result. Black and gray points represent samples within specifications and off-grade samples, respectively, and the line represents the trajectory of query samples. There are few off-grade samples on the right side. Fig. 3(c) shows that an SVM model tends to consider too large a domain as completion of transition if off-grade samples are not distributed around target-grade samples. In addition, we investigated other days and confirmed the same tendency. In these cases, OCSVM models should be used as the discriminant models.

The proposed methods except “5-NN+PLS” had higher predictive accuracy than “PLS per grade”. The 5-NN model considered the query samples as completion of transition, which decreased the predictive accuracy. Though MFR is known to be highly nonlinear with respect to other process variables, the proposed methods could construct a more predictive model for MFR than “SVR without update”, one of the nonlinear regression methods. By constructing linear regression models for each grade, we could deal with the nonlinearity of the polymer reactor. Fig. 5 shows prediction examples of the traditional and the proposed methods. As representatives of the proposed methods, “SVM+PLS” with 12 variables and “OCSVM+PLS” with 12 variables are shown, which have higher accuracy rates than the others. The gray line and the thick line represent the time considered as in transition and the time considered as completion of transition, respectively. In the traditional method, though predicted values were within specifications once, query samples were afterwards considered as off-grade even though they were actually within specifications. In the proposed methods, predicted values were within specifications early and were near the measured values. It is important that the detection of completion of transition is achieved with only X-variables. The measurement errors of density and MFR are different, but the discriminant models can detect completion of transition regardless of these errors. We could comprehend the state of the plant with the models that detect completion of transition and estimate Y with the PLS model, selecting and updating it appropriately.
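The selection logic summarized above (judge the query with the target-grade discriminant model, then predict with the regression model of either the target grade or the previous grade) can be sketched as follows. Everything here is a hypothetical stand-in: the one-variable interval check replaces a trained SVM or OCSVM, the linear lambdas replace per-grade PLS models, and the names and coefficients are invented for illustration.

```python
def predict_quality(query, previous_grade, target_grade, discriminants, regressors):
    """Return (predicted value, grade of the model used, whether the value is trusted)."""
    completed = discriminants[target_grade](query)    # is transition judged complete?
    grade = target_grade if completed else previous_grade
    return regressors[grade](query), grade, completed


# Hypothetical discriminant model for grade 2: a single-variable range check.
discriminants = {"grade2": lambda x: 0.9 <= x[0] <= 1.1}

# Hypothetical per-grade linear models standing in for PLS models.
regressors = {
    "grade1": lambda x: 0.950 + 0.01 * x[0],
    "grade2": lambda x: 0.920 + 0.01 * x[0],
}

after = predict_quality([1.0], "grade1", "grade2", discriminants, regressors)
during = predict_quality([0.5], "grade1", "grade2", discriminants, regressors)
```

The third element of the result marks whether the prediction comes from the target-grade model and can therefore be trusted; during transition the previous-grade model still returns a value, but it is flagged as untrusted.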

Fig. 8 shows the prediction results on February 25 and 26, 2008. The OCSVM model considered query samples as completion of transition too late even though they were actually samples within specifications. On the other hand, the SVM model considered query samples as in transition when they were actually off-grade samples and vice versa, though the prediction results were a little unsteady. In order to investigate the reason, we checked a t1-t2 plot around the target-grade data. Fig. 9 shows that off-grade samples are distributed around the target-grade samples. In addition, we investigated other days and confirmed the same tendency. In these cases, SVM models should be used as the discriminant models. Thus, when discriminant models are used in practice, SVM or OCSVM models should be selected by checking t1-t2 plots.

4. CONCLUSION

In order to use discriminant models practically, we compared these models constructed by each method. SVM and OCSVM are regarded as proper because SVM has higher

In this paper, to ensure predictive accuracy of a regression model estimating polymer quality, we proposed to construct a

Table 1. Prediction results

                                   density                  MFR
Method              Variables(a)   r²      RMSE (×10⁻³)     r²      RMSE    Accuracy rate (%)   Precision (%)   Detection rate (%)
PLS without update  -              0.940   2.83             0.749   3.32    26.8                85.0            9.08
SVR without update  -              0.957   2.41             0.895   2.14    50.3                88.1            43.0
PLS with update     -              0.927   3.11             0.702   3.62    28.2                88.8            10.6
PLS per grade       -              0.976   1.80             0.933   1.72    79.6                79.6            99.9
5-NN+PLS            18             0.973   1.85             0.869   2.36    80.0                84.1            92.0
5-NN+PLS            12             0.972   1.93             0.925   1.83    80.5                81.9            96.7
5-NN+PLS            38             0.970   2.02             0.856   2.54    78.9                83.8            90.9
SVM+PLS             18             0.975   1.76             0.927   1.77    81.0                86.8            89.7
SVM+PLS             12             0.980   1.61             0.946   1.51    82.6                86.8            92.0
SVM+PLS             38             0.979   1.62             0.944   1.51    82.0                87.8            89.7
RANGE+PLS           18             0.976   1.58             0.949   1.42    78.3                86.1            86.6
RANGE+PLS           12             0.979   1.56             0.947   1.48    79.4                85.0            89.8
RANGE+PLS           38             0.975   1.37             0.954   1.29    74.2                89.1            76.8
OCSVM+PLS           18             0.982   1.48             0.949   1.46    81.9                83.9            95.5
OCSVM+PLS           12             0.982   1.47             0.946   1.49    83.0                84.2            96.7
OCSVM+PLS           38             0.981   1.45             0.953   1.38    82.4                85.8            93.1

(a) The number of variables.

discriminant model. Then, discriminant models were constructed by using the k-NN, SVM, RANGE and OCSVM methods and compared. When the discriminant models are used in practice, SVM or OCSVM models should be selected by checking t1-t2 plots. If the model selection is done for each polymer grade beforehand, the discriminant models used in practice can be selected automatically. By constructing a regression model for each polymer grade and selecting it with the discriminant model, the prediction ability increased. The proposed method could be applied not only to the industrial polymer process but also to other time series processes. By applying this method to process control, plants could be operated stably.

Fig. 5. The prediction examples of the traditional and the proposed methods: (a) SVR without update, (b) SVM+PLS, (c) OCSVM+PLS. The gray line and the thick line represent the time considered as in transition and the time considered as completion of transition, respectively.

Fig. 6. The prediction results on January 28, 2008: (a) SVM+PLS, (b) OCSVM+PLS.

Fig. 7. The t1-t2 plot on January 28, 2008. Black and gray points represent samples within specifications and off-grade samples, respectively, and the line represents the trajectory of query samples.

Fig. 8. The prediction results on February 25 and 26, 2008: (a) SVM+PLS, (b) OCSVM+PLS.

Fig. 9. The t1-t2 plot on February 25 and 26, 2008.

ACKNOWLEDGMENTS

The authors acknowledge the support of Mitsui Chemical Corporation.

REFERENCES

Aguado, D., Ferrer, A., Seco, A. and Ferrer, J. (2006). Comparison of different predictive models for nutrient estimation in a sequencing batch reactor for wastewater treatment. Chemom. Intell. Lab. Syst., 84, 75-81.
Kamohara, H., Takinami, A., Takeda, M., Kano, M., Hasebe, S. and Hashimoto, I. (2004). Product quality estimation and operating condition monitoring for industrial ethylene fractionator. J. Chem. Eng. Japan, 37, 422-428.
Kaneko, H., Arakawa, M. and Funatsu, K. (2009). Development of a New Soft Sensor Method Using Independent Component Analysis and Partial Least Squares. AIChE Journal, 55, 87-98.
Kim, M., Lee, Y.H., Han, I.S. and Han, C. (2005). Clustering-Based Hybrid Soft Sensor for an Industrial Polypropylene Process with Grade Changeover Operation. Ind. Eng. Chem. Res., 44, 334-342.
Qin, S.J., Yue, H.Y. and Dunia, R. (1997). Self-validating inferential sensors with application to air emission monitoring. Ind. Eng. Chem. Res., 36, 1675-1685.
Qin, S.J. (1998). Recursive PLS algorithms for adaptive data modelling. Comput. Chem. Eng., 22, 503-514.
Lee, D.E., Song, J.H., Song, S.O. and Yoon, E.S. (2005). Weighted support vector machine for quality estimation in the polymerization process. Ind. Eng. Chem. Res., 44, 2101-2105.
Lin, B., Recke, B., Knudsen, J.K.H. and Jorgensen, S.B. (2007). A systematic approach for soft sensor development. Comput. Chem. Eng., 31, 419-425.
McAuley, K.B. and MacGregor, J.F. (1991). On-Line Inference of Polymer Properties in an Industrial Polyethylene Reactor. AIChE Journal, 37, 825-835.
Vapnik, V.N. (1995). The Nature of Statistical Learning Theory. Springer, Berlin.
Wold, S., Sjöström, M. and Eriksson, L. (2001). PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst., 58, 109-130.
Yan, W.W., Shao, H.H. and Wang, X.F. (2004). Soft sensing modeling based on support vector machine and Bayesian model selection. Comput. Chem. Eng., 28, 1489-1498.
Zhao, S.J., Zhang, J., Xu, Y.M. and Xiong, Z.H. (2006). Nonlinear projection to latent structures method and its applications. Ind. Eng. Chem. Res., 45, 3843-3852.