Phase Transition Analysis Based Quality Prediction for Multi-phase Batch Processes

Phase Transition Analysis Based Quality Prediction for Multi-phase Batch Processes

PROCESS ESTIMATION AND SOFT SENSOR Chinese Journal of Chemical Engineering, 20(6) 1191—1197 (2012) Phase Transition Analysis Based Quality Prediction...

513KB Sizes 2 Downloads 23 Views

PROCESS ESTIMATION AND SOFT SENSOR Chinese Journal of Chemical Engineering, 20(6) 1191—1197 (2012)

Phase Transition Analysis Based Quality Prediction for Multi-phase Batch Processes* ZHAO Luping (赵露平)1,2, ZHAO Chunhui (赵春晖)1,** and GAO Furong (高福荣)1,2 1 2

State Key Laboratory of Industrial Control Technology, Department of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China

Abstract Batch processes are usually involved with multiple phases in the time domain and many researches on process monitoring as well as quality prediction have been done using phase information. However, few of them consider phase transitions, though they exit widely in batch processes and have non-ignorable impacts on product qualities. In the present work, a phase-based partial least squares (PLS) method utilizing transition information is proposed to give both online and offline quality predictions. First, batch processes are divided into several phases using regression parameters other than prior process knowledge. Then both steady phases and transitions which have great influences on qualities are identified as critical-to-quality phases using statistical methods. Finally, based on the analysis of different characteristics of transitions and steady phases, an integrated algorithm is developed for quality prediction. The application to an injection molding process shows the effectiveness of the proposed algorithm in comparison with the traditional MPLS method and the phase-based PLS method. Keywords multi-phase, transition, partial least squares, quality prediction, batch process

1

INTRODUCTION

As an important type of industrial production, batch processes have been widely applied to fine chemical, biopharmaceutical, food, polymer industries, and metallurgy, to obtain high-value-added products efficiently. The batch process safety and consistent product quality have become a focus of research. Data-based statistical analysis techniques, such as multiway principal component analysis (MPCA) and multiway partial least squares (MPLS) are the popular tools for batch process monitoring and quality prediction [1, 2]. Although other methods, such as Bayes classifier and neural networks, are also statistical quantitative feature extraction methods like PCA/PLS [3], MPCA/MPLS are still preferable for batch processes and many works were proposed based on them. Since MPCA/MPLS take the entire batch data as a single object, estimations of future measurements are needed for online applications, unavoidably affecting the model accuracy. Moreover, many batch processes have quite different phase characteristics that MPCA/MPLS are not capable of capturing. So different solutions were proposed [4-12], among which phase-based PCA/PLS [9-11] methods focus on characteristics of different operation phases. Lu et al. [9, 10] proposed phase-based PCA/PLS methods, recognizing that phases can reflect the changes of the inherent process correlations. Their methods can divide one batch process into several phases with different characteristics. Zhao et al. [13, 14] improved the quality prediction by focusing on correlation analyses in each phase, to reveal the phase-specific

effect on qualities. Considering the transitions between phases, Zhao et al. [15] proposed a soft-transition multiple PCA (STMPCA) method to detect and model transitions using a weighted sum of two phase models, and Yao et al. [16] improved STMPCA by proposing a new index of model similarity. In 2009, Yao et al. [17] gave an overview of multiphase/ multistage statistical process control methods. As mentioned above, works have been done focusing on different phase characteristics to improve process monitoring and quality prediction. However, few of them utilize phase transition information for quality prediction, though transitions occur between steady phases and they do have certain impacts on qualities. In this paper, for clarity, steady phases are referred to process phases showing constant characteristics, while transitions are those of change between two steady phases. By understanding how transitions influence product qualities, a new phase-based PLS method is proposed for quality prediction. First, batch processes are divided into several phases based on time-slice PLS model regression parameters. Then, the importance of each phase is evaluated through statistical analysis to identify critical-to-quality phases. Finally, different quality prediction schemes are utilized for transitions and steady phases, respectively. 2 2.1

METHODOLOGY PLS modeling based on time slices Batch process data are usually collected as a

Received 2012-05-31, accepted 2012-07-30. * Supported by Guangzhou Nansha District Bureau of Economy & Trade, Science & Technology, Information, Project (201103003), the Fundamental Research Funds for the Central Universities (2012QNA5012), Project of Education Department of Zhejiang Province (Y201223159), and Technology Foundation for Selected Overseas Chinese Scholar of Zhejiang Province (J20120561). ** To whom correspondence should be addressed. E-mail: [email protected]

1192

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

three-way array X ( I × J x × K ) , where I refers to the number of batches, J x refers to the number of process variables and K refers to the sample times within each batch. The measurement values of J y final quality variables in I batch are summarized into a matrix Y ( I × J y ) . The variables are first centered and scaled

across the batches. After that, the process data and the final qualities are denoted as X ( I × J x × K ) and

k

∑ xi β c

yˆ k _ alone =

X k = Tk PkT + Ek

(1)

Y = U k QkT + Fk

(2)

The above model can be written in regression form as Yˆk = X k BkPLS (3) where Tk and U k are the score matrices, Pk and Qk are the loading matrices, Ek and Fk are the residual matrices, BkPLS is the regression parameter matrix with k = 1, 2," , K . When single quality variable y ( I × 1) is considered, the regression model can be simplified as yˆ k = X k β k

(4)

where β k is the regression parameter, k = 1, 2," , K . PLS modeling based on time-slice matrices focuses only on the information at each sampling interval. To describe the accumulative effect, average information in each phase has to be utilized in prediction model. Considering a steady phase c, the average regression parameter from prediction models is used to represent the regression parameter of the prediction model for the whole phase: ke

βc =

k e − ks + 1

(6)

k

∑ xi β i

yk _ alone =

2.2

i = ks

k − ks + 1

(7)

Phase division

Process data in the transitions are more dynamic than the data in steady phases, which inspires an idea for phase identification. The details are introduced below: First, calculate the gradient of regression parameter β k (k = 1, 2," , K ) : β = β − β (8) k +1

k

k

Since β k is the regression parameter which describes the correlation between process variables and quality variable, the gradient of β k with respect to time, β , shows the dynamic characteristics contained k

in the correlation, that is, how the correlation changes in the time domain. If βk is small, the correlation keeps steady and the kth time interval belongs to a steady phase. Otherwise, the time interval belongs to a transition. However, the vector structure of the coefficient parameters makes it difficult to judge whether the gradients are small or large. Therefore, a straightforward index is desired. From the PLS theory, each element of β k corresponds to a process variable and shows the impact of this process variable to final qualities at the kth time interval. So utilize the sum of absolute value of all elements of βk as the index of progress dynamic characteristics: J

B k = ∑ βk , j

(9)

j =1

∑ βi

i = ks

k − ks + 1

In transitions, it is not reasonable to apply Eq. (6) because an average regression parameter neglects the dynamics and could not represent accurate correlations during the whole transition. So, the prediction during transition is calculated as

Y ( I × J y ) . The measurement values of all J x vari-

ables at the sampling interval k ( k = 1, 2, " , K) are stored in X k ( I × J x ) , which is called the kth time slice of X. The correlation between the process variables and the quality variables at time interval k can be extracted from matrices X k and Y . By applying PLS, time-slice PLS models are achieved as below [9, 10]:

i = ks

(5)

where β i is the regression parameter vector belonging to the ith time-slice PLS model, ks and ke are the time indices corresponding to the first and the last sampling intervals of the steady phase. Then taking the cumulative effect into account, the prediction during the phase for the new sample xi is

In Eq. (9), the same weight is assigned to each element of βk , because all process variables have been normalized before establishing time-slice PLS models. Or different weights can be given to the gradients to represent the importance of each process variable: J

B k = ∑ μk , j βk , j j =1

(10)

1193

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

where μ j , k = λ j , k

J

∑ λ j ,k

is the weight correspond-

j =1

ing to the importance of the jth variable, and λ j , k is the eigenvalue of the covariance matrix X kT X k . A larger B k indicates the process is more possible to be operating in a transition. A threshold B * should be specified so that those time intervals with B k values larger than B * can be regarded as transitions. When noises exist, Bk may have random spikes. In the present work, Lmin is defined as the minimum number of continuous process samples that show B k larger than the predefined threshold value so that these continuous process samples can be identified as a phase. In this way, the influences of noise can be removed since the noisy samples are not continuous. The threshold value B * is defined as follows: since B k of transitions have larger values and those of steady phases have smaller values, the trend of B k values for all the normal data can be captured and those steady Bk values are picked out which show no big variation; the threshold value B * is then chosen as the largest ones in the steady B k values. 2.3

be obtained to represent the predictions using the information from steady phase 1 and steady phase 2. By assuming that phase 1 is finished and phase 2 is current, the online prediction is as below: yˆ k = w1 yˆ1ke _ alone + w2 yˆ k2 _ alone

I

S R = 1 − SE = 1 − SST 2

∑ ( yi − yˆi )

wc =

i =1 I

∑ ( yi − y )

Rc2 R12 + R22

(11)

2

i =1

where yi is the measurement value of the final quality of the ith batch, y is the average of measurement values of the final quality of I batches, yˆi is the quality prediction for the ith batch, SSE is the sum of squares of prediction residuals, and SST is the sum of squares of original variation. A larger R 2 indicates better model prediction. The sampling intervals corresponding to large R 2 comprise the phases critical to the final quality. F-test is utilized to check the significance of R 2 according to the statistical theory [19]. The critical value of R 2 can be obtained given a significance factor α, e.g. 0.01 or 0.05. Algorithm of multi-phase PLS modeling

To model the accumulative effect to the final quality, the information of different critical phases

c = 1, 2

,

(13)

ke1

∑ Ri2

R12 =

2

(12)

where yˆ1ke _ alone is yˆ1k _ alone calculated at the last sampling interval of phase 1, and w1 and w2 are the weights assigned for phase 1 and phase 2, respectively.

Critical-to-quality phase identification

If one phase has significantly contributed to a final quality, a strong relationship exists between the process variables in this phase and the corresponding quality index. Therefore, a regression model which well reflects such relationship can provide good predication results. The multiple coefficient of determination, R 2 [18], can be utilized to evaluate the prediction precision of each time-slice model:

2.4

should be combined. For offline prediction, all critical phases should be involved in the model; while for online prediction, all the finished critical phases and all the finished sampling intervals in the current phase (if the current phase is a critical phase) should be utilized in the model. First, consider steady phases. If there are two steady phases critical to the quality, yˆ1k _ alone and yˆ k2 _ alone can

i = ks1 1 ke − ks1

+1

(14)

k

∑ Ri2

R22 =

i = ks2

k − ks2 + 1

(15)

where ks1 and ke1 are the time indices corresponding to the first and the last sampling intervals of phase 1, ks2 corresponds to the first sampling interval of phase 2. R 2 represents the average value of R 2 in the considered phase. Weight, w, is specified according to the presumed degree of contribution of the corresponding phase. Second, consider transitions. R 2 inspires an idea to establish a weighted regression model based on each sampling interval’s contribution. Then the online prediction can be conducted as below: k

yˆ k _ alone = ∑ ωi xi β i

(16)

i = ks

ωk =

Rk2 k



i = ks

Ri2

,

k ≥ ks

(17)

Here, weight ω is the ratio of R 2 at the kth sampling interval to the sum of the R 2 at all past sampling intervals. When transition is one of the two phases determining the quality, the prediction also can be performed

1194

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

using Eq. (12). By doing so, the contribution of each phase to the final quality is considered, no matter a phase is steady or transient, and the prediction result can be achieved, with properly specified weights. Without losing generality, the main algorithm of the composite regression model for multiphase processes with transitions can be described as

yˆ k = w1 yˆ1ke _ alone + " + wp yˆ kpe _ alone + " + wc −1 yˆ kce−_1 alone + wc yˆ kc _ alone wp =

R 2p R12 + " + R p2 + " + Rc2−1 + Rc2

,

(18)

as training data and 8 batches are used as test data. 3.2

Phase division illustrations

PLS algorithm is applied to each time-slice matrix X k and quality y . Then, B k are calculated and plotted in Fig. 1 to show the process dynamic characteristics for the two quality indices, length and mass. The characteristic differences between steady phases and transitions are shown clearly.

p = 1," , c − 1, c

(19) ken

∑ Ri2

Rn2 =

i = ksn n ke − ksn

+1

n = 1," , c − 1

,

(20)

k

∑ Ri2

Rc2 =

i = ksc

k − ksc + 1

(21)

(a)

where c is the index of current phase, c − 1 is the number of phases which have ended, yˆ kpe _ alone represents the prediction at the last sampling interval in phase p ( p = 1, 2," , c − 1) using the whole phase information, yˆ kc _ alone is the online prediction in the current phase using available phase information, and w p ( p = 1, 2," , c − 1) and wc are the weights for the finished phases and the current phase obtained from Eq. (19). 3 3.1

ILLUSTRATION AND DISCUSSION

(b) Figure 1 Process dynamic characteristics for the two quality indices: (a) length and (b) mass

Injection molding

Injection molding is a typical technique to manufacture plastic products in batch. Plastic particles are heated to melt, injected to a mold with certain configuration, and cooled down to solid state. Accordingly, there are three main steps in injection molding process: injection, packing-holding the solidifying material under pressure, and cooling [20, 21]. During the early cooling phase, plastication happens, where plastic particles are melted and prepared for the next cycle [22]. Most process variables are collected online through corresponding sensors, while final qualities are obtained at the end of each batch. Experiments are conducted using high-density polyethylene (HDPE) as the feed stock. The operating conditions are designed by design of experiments (DOE) [10, 13, 23]. Totally, process data from 33 normal batches are collected, where 25 batches are used

By setting B * as 0.11, the phase division labels can be obtained, as shown in Fig. 2. The curve shows whether B k exceeds B * , with 0 meaning negative and 1 meaning positive. By setting Lmin as 5 sampling intervals to eliminate the impact of spikes, phase division can be achieved. There are totally 9 phases, including 4 steady phases and 5 transitions. For simplicity, “S” refers to steady phase, and “T” refers to transition. Four steady phases are consistent with operation phases, with transitions between them. And at the beginning and the end of batch duration, the unsteady phases are regarded as transitions. Some spikes occur during the latter part of S3 and at about the 920th time interval during S4. The causing of the spikes is of two aspects. The spikes in S4 are caused by the data pretreatment of denoising and normalization. These spikes are easy to be misidentified as transitions. Using appropriate settings of B *

1195

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

(a)

(b)

Figure 2 Phase division labels for the two quality indices: (a) length and (b) mass (dashed lines: phase boundaries)

Figure 3

Phase division result for injection molding process

(a) (b) Figure 4 The R2 indices over the batch duration for the two quality indices: (a) length and (b) mass (vertical dashed lines: critical-to-quality region and phase boundaries; horizontal dashed lines: the critical value of R2)

and Lmin , such spikes can be differentiated from the transitions. The spikes in S3 deserve more attentions which are caused by a change of the variable correlations in the plastication phase. The two parts of the plastication phase without or with the spikes can be called Plastication-I and Plastication-II (S3-1 and S3-2), which have different effects on the final qualities. The final phase division result is shown in Fig. 3.

3.3

Critical-to-quality phase analysis

The R 2 indices for the two quality indices are plotted in Fig. 4. The horizontal dashed lines refer to the critical value of R 2 . The values of R 2 between the leftmost and rightmost vertical dashed lines are above the critical value. The other vertical dashed

1196

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

(a) (b) Figure 5 Online quality predictions of two quality indices (a) length and (b) mass for test batch 2 using (1) the proposed method, (2) MPLS, and (3) the phase-based PLS (dashed lines: measurement values of the two qualities)

lines show the related phase division. It is found that the values of R 2 change suddenly in T2 and in the middle part of S3, and S2, T3 and S3-1 are easy to be identified as critical-to-quality phases. While the first and the last sampling intervals between the first and last vertical dashed lines need more attention because they are not exactly the phase boundaries. First, the first sampling interval is in the middle of T2. T2 is the link between S1 and S2. S1 is not critical to final qualities, while S2 is. It takes a certain time span for the process transits from not critical-to-quality to critical-to-quality. The critical-toquality part belonging to T2 is called T2-2. So T2-2, S2, T3 and S3-1 are all critical-to-quality. Second, it has been discovered that the variable correlations during S3 show different features: one is steady and the other is a little dynamic. Here, it is revealed that S3-1 has a high and stable impact on the final qualities, while S3-2 barely affects the qualities. This difference within S3 is because of the mold gate freezing. Before the mold gate is frozen, the quantity of material filled in the cavity is highly correlated with cavity pressure, nozzle pressure and mold temperature, which affects product qualities. With the material solidification in the mold, such correlation becomes weak, so the value of R 2 decreases. When the mold gate is completely frozen, the product qualities are determined, and cannot be affected by the process variables any longer. 3.4

Quality predictions and comparisons

The online prediction results of two quality indices are shown in Fig. 5 for an arbitrary testing batch, compared with the results achieved based on the traditional MPLS and the phase-based PLS. MPLS provides the online prediction over the batch duration, while phase-based PLS and the proposed method only predict during critical-to-quality phases. Compared with MPLS, both phase-based PLS and the proposed method provide more efficient and accurate online prediction results.

Moreover, the phase-based PLS does not consider about transitions, which leads to vibratory predictions. Exemplifying the advantage of the proposed method, the predictions are smooth during the transitions. Furthermore, at the end of the last critical phase, the prediction of the proposed method is more accurate than the others. The offline prediction results of these three methods are evaluated by mean squared error (MSE) index as shown in Table 1. The proposed method also gives the best offline prediction results. Table 1 Prediction performance comparison based on MSE index for both training and test batches MSE Methods

Length/mm Training

Test

Mass/g Training

Test

proposed

0.0052

0.0069

0.0103

0.0133

MPLS

0.0089

0.0209

0.0237

0.0564

phase-based PLS

0.0067

0.0074

0.0173

0.0193

The traditional MPLS considers all data within batch as a unit without investigating whether they are relative to final qualities. It is unavoidable for it to involve redundant and useless information, which weakens its prediction ability. Phase-based PLS method and the proposed method focus on the critical-to-quality phases by excluding the useless information. It is reasonable that these two methods have better performances. However, the phase-based PLS omits the analysis of transitions, losing useful information, which impairs the precision of quality prediction and leads to vibratory online predictions. The comparisons verify that the proposed algorithm is more powerful than the other two. 4

CONCLUSIONS

A phase-based PLS algorithm is proposed for

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

quality predictions of batch processes with transitions in this work, and the influences of steady phases and transitions are both considered in an accumulative way. Batch processes are divided into phases by checking the dynamic characteristics of variable correlations. Then, critical-to-quality phases are identified, based on which an algorithm for quality prediction is developed. The application to injection molding process shows the effectiveness of the proposed algorithm. REFERENCES 1

2

3

4

5

6 7

8

9

Nomikos, P., MacGregor, J.F., “Monitoring batch processes using multiway principal component analysis”, AIChE J., 40, 1361-1375 (1994). Nomikos, P., MacGregor, J.F., “Multi-way partial least squares in monitoring batch processes”, Chemom. Intell. Lab. Syst., 30, 97-108 (1995). Venkatasubramanian, V., Rengaswamy, R., Kavuri, S.N., Yin, K., “A review of process fault detection and diagnosis Part III: Process history based methods”, Comput. Chem. Eng., 27, 327-346 (2003). Wold, S., Kettaneh, N., Tjessem, K., “Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection”, J. Chemom., 10, 463-482 (1996). Westerhuis, J.A., Kourti, T., MacGregor, J.F., “Analysis of multiblock and hierarchical PCA and PLS models”, J. Chemom., 12, 301-321 (1998). Choi, S.W., Lee, I.B., “Multiblock PLS-based localized process diagnosis”, J. Process Control, 15, 295-306 (2005). Duchesne, C., MacGregor, C.D., “Multivariate analysis and optimization of process variable trajectories for batch processes”, Chemom. Intell. Lab. Syst., 5, 125-137 (2000). Chu, Y.H., Lee, Y.H., Han, C., “Improved quality estimation and knowledge extraction in a batch process by bootstrapping-based generalized variable selection”, Ind. Eng. Chem. Res., 43, 2680-2690 (2004). Lu, N.Y., Wang, F.L., Gao, F.R., “Sub-PCA modeling and on-line

10 11 12

13

14

15

16 17

18 19

20

21

22 23

1197

monitoring strategy for batch processes”, AIChE J., 50, 255-259 (2004). Lu, N.Y., Gao, F.R., “Stage-based process analysis and quality prediction for batch processes”, Ind. Eng. Chem. Res., 44, 3547-3555 (2005). Ündey, C., Cinar, A., “Statistical monitoring of multistage, multiphase batch processes”, IEEE Control Syst. Mag., 22, 40-52 (2002). Gabrielsson, J., Jonsson, H., Airiau, C., Schmidt, B., Escott, R., Trygg, T., “The OPLS methodology for analysis of multi-block batch process data”, J. Chemom., 20, 362-369 (2006). Zhao, C.H., Wang, F.L., Gao, F.R., Lu, N.Y., Jia, M.X., “Improved knowledge extraction and phase-based quality prediction for batch processes”, Ind. Eng. Chem. Res., 47, 825-834 (2008). Zhao, C.H., Wang, F.L., Mao, Z.Z., Lu, N.Y., Jia, M.X., “Quality prediction based on phase-specific average trajectory for batch processes”, AIChE J., 54, 693-705 (2008). Zhao, C.H., Lu, N.Y., Wang, F.L., Jia, M.X., “Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes”, J. Process Control, 17, 728-741 (2007). Yao, Y., Gao, F.R., “Phase and transition based batch process modeling and online monitoring”, J. Process Control, 19, 816-826 (2009). Yao, Y., Gao, F.R., “A survey on multistage/multiphase statistical modeling methods for batch processes”, Annual Reviews in Control, 33, 172-183 (2009). Johnson, R., Wichern, D., Applied Multivariate Statistical Analysis, 5th edition, Prentice Hall, Upper Saddle River, NJ, 354-383 (2002). Montgomery, D.C., Peck, E.A., Vining, G., Introduction to Linear Regression Analysis, 4th edition, John Wiley & Sons, New York, 22-28 (2004). Yang, Y., Gao, F.R., “Cycle-to-cycle and within-cycle adaptive control of nozzle pressures during packing-holding for thermoplastic injection molding”, Polym. Eng. Sci., 39, 2042-2064 (1999). Yang, Y., Gao, F.R., “Adaptive control of the filling velocity of thermoplastics injection molding”, Control Eng. Practice, 8, 1285-1296 (2000). Lu, N.Y., Gao, F.R., “Stage-based online quality control for batch processes”, Ind. Eng. Chem. Res., 45, 2272-2280 (2006). Yang, Y., “Injection molding: from process to quality control”, Ph.D. Thesis, The Hong Kong University of Science & Technology, Hong Kong (2004).