Computers and Electrical Engineering 40 (2014) 2101–2112
Contents lists available at ScienceDirect
Computers and Electrical Engineering journal homepage: www.elsevier.com/locate/compeleceng
Multimode process monitoring with PCA mixture model q Xianzhen Xu a,b, Lei Xie a,⇑, Shuqing Wang a a b
Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027, Zhejiang, China School of Electric and Automatic Engineering, Changshu Institute of Technology, Changshu 215500, Jiangsu, China
a r t i c l e
i n f o
Article history: Available online 15 September 2014 Keywords: Multimode process monitoring Fault detection Gaussian mixture model EM algorithm PCA Tennessee Eastman process
a b s t r a c t For multimode processes, Gaussian mixture model (GMM) has been applied to estimate the probability density function of the process data under normal-operational condition in last few years. However, learning GMM with the expectation maximization (EM) algorithm from process data can be difficult or even infeasible for high-dimensional and collinear process variables. To address this issue, a novel multimode process monitoring approach based on PCA mixture model is proposed. First, the PCA technique is directly applied to the covariance matrix of each Gaussian component to reduce the dimension of process variables and to obtain nonsingular covariance matrices. Then the Bayesian Ying-Yang incremental EM algorithm is adopted to automatically optimize the number of mixture components. With the obtained PCA mixture model, a novel process monitoring scheme is derived for fault detection of multimode processes. Three case studies are provided to evaluate the monitoring performance of the proposed method. Ó 2014 Elsevier Ltd. All rights reserved.
1. Introduction With growing requirements of process safety and high product quality in modern industrial processes, monitoring and fault diagnosis have become increasingly important [1]. In the last few decades, multivariate statistical process monitoring (MSPM) methods, such as principle component analysis (PCA) and partial least squares (PLS), have been widely applied in different industrial processes for process analysis and monitoring [2–4]. Traditional MSPM approaches rely on the assumption that the operating data follows the unimode Gaussian distribution. In modern industrial processes, however, operating mode changes are often encountered due to the changes of feed flow-rate, product specifications, set points, etc. [5], the traditional multivariate process monitoring techniques are not suitable for such cases. This paper is focused on the statistical monitoring of multimode process characteristics. The real-time monitoring of multimode processes is a challenging problem, which has drawn increasing attention recently. Some multimode process monitoring methods have been developed in the last few years, including global PCA model [5], adaptive methods [6,7], multiple model methods [3,8], Gaussian mixture model based approaches [9,10], external analysis [11,12], etc. The global PCA model is applicable only if different operating modes share the common covariance structures, which restricts its application in many cases. Adaptive methods are implemented blindly so that it is easy to cause false alarm, especially during the transition of two operation modes. Similar problem may arise in the multiple model based method, since they need to assign the
q
Reviews processed and recommended for publication to the Editor-in-Chief by Guest Editor Dr. Zhihong Man.
⇑ Corresponding author. Tel.:+86 87952233x8237. E-mail address:
[email protected] (L. Xie).
http://dx.doi.org/10.1016/j.compeleceng.2014.08.002 0045-7906/Ó 2014 Elsevier Ltd. All rights reserved.
2102
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
Nomenclature m
xi x n
l h P T2 K K Xi hi
r t K SPE
number of variables mixing weight of the ith Gaussian component a m-dimensional sample number of samples mean vector global Gaussian model parameter eigenvector matrix hotelling statistic estimated number of Gaussian component actual number of Gaussian component subset of samples local Gaussian model parameters covariance matrix score vector eigenvalue squared prediction error statistic
monitored data sample to a single monitoring model. For the external analysis based method, it is difficult to carry out without appropriate process knowledge. Finite mixture models is a flexible probabilistic approach for multivariate data. Gaussian mixture model (GMM) has been applied in multimode process monitoring recently [9,10,13], in which each Gaussian component corresponds to an individual operating mode. Generally, the parameters of Gaussian mixture can be estimated by the expectation–maximization (EM) algorithm under the maximum likelihood framework [14–17]. However, EM has the following limitations, (i) it assumes that the number of Gaussian components in the mixture is a-priori known and fixed, (ii) it is a local searching approach, thus improper initialization can make it get trapped in a local maximal, and (iii) the collinearity among the variables leads the covariance matrix to be singular, which results in an ill-conditioned problem and cannot be solved by naïve EM algorithm. For the limitations (i) and (ii), in view of the absence of a priori process knowledge, the number of mixture components is unknown. Recently, some new approaches to solve this problem are presented via some selection criterion (like Bayesian Inference Criterion, Minimum Description Length, etc.). These methods can be classified into two categories: (i) pruning algorithm ,which starts with an arbitrarily large number of components and adjusts this number adaptively by eliminating the components of insignificant weights, such as F–J algorithm [18]. (ii) Incremental algorithm, which starts with one-or-two components and adds a new component sequentially after each EM procedure until a maximum number of components. Theoretical evidence [14] indicates that it is possible to learn a mixture density by maximum likelihood by incrementally adding components to the mixture up to a desired number of components. Based on approximation results, Vlassis [15] proposed a greedy EM algorithm, which was further discussed and strengthened in [16]. However, the stop criterion of greedy EM algorithm is still based on the maximum likelihood and thus the desired number of components is larger than the number of actual Gaussian components in the sample data. Ma [19] pointed out that the the determination of the Gaussian component number is equivalent to the maximization of a Bayesian Ying-Yang (BYY) harmony function. The BYY harmony function and its model selection consistency property were proved under wild conditions [20]. Therefore, we adopt the scale-incremental EM algorithm [21] based on BYY harmony function (BYY-EM for short) to automatically optimize the number of mixture components and estimate their statistical distribution parameters. For the last limitation , the PCA technique is utilized to decorrelate the elements of variable vectors. A novel multimode process monitoring approach based on PCA mixture model is proposed in this paper. First, the process data are assumed to be from a number of different clusters, each one corresponds to an operating mode and can be characterized by a Gaussian component. Then the principal component analysis is applied to the covariance of Gaussian components to decorrelate the elements of variable vectors, thus the parameters of the mixture components can be easily estimated by applying EM learning on PCA transformed space. In the absence of a-priori process knowledge, the BYY scale-incremental EM algorithm is then adopted to automatically optimize the number of mixture components. With the obtained PCA mixture model, a novel process monitoring scheme is derived for fault detection of multimode processes. The rest of the paper is organized as follows. The EM algorithm for Gaussian mixture model is briefly revisited in Section 2. Section 3 further introduces the PCA mixture model and model order selection method. The novel process monitoring scheme based PCA mixture model is also presented. Section 4 presents the simulation results of applying the proposed approach to the synthetic data, continuous stirred tank heater (CSTH) process and Tennessee Eastman process (TEP). Finally, a conclusion is outlined at the end of the article.
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
2103
2. Gaussian mixture models and EM algorithm For the processes operated under multiple conditions, although the mean shifts and/or covariance changes, the assumption that each operating conditions follows Gaussian distribution is still valid [10]. Therefore, the Gaussian mixture model is well suited to represent the data sources driven by different operating modes. Let x 2 Rm be a m-dimensional sample recorded from a multimode process. Its density function is represented by the convex combination of Gaussian densities as K X
pðxjhÞ ¼
xi gðxjhi Þ
ð1Þ
i¼1
where K is the number of Gaussian components,
PK
i¼1
xi ¼ 1; hi ¼ fli ; Ri g (i.e. mean vector li and covariance matrix ri ) and
h ¼ fh1 ; ; hK g represent the sets of local and global Gaussian model parameters, respectively. Each component density gðxjhi Þ is a normal probability distribution given by
gðxjhi Þ ¼
1 ð2pÞm=2 jRi j1=2
1 exp ðx li ÞT R1 ðx l Þ i i 2
ð2Þ
To construct a GMM, the following list of unknown parameters H ¼ ffx1 ; l1 ; R1 g; ; fxK ; lK ; RK gg need to be estimated from the sample data set X ¼ fx1 ; x2 ; ; xN g . It is difficult because we know neither the component from which each sample xj comes, nor the parameters H. However, the well-known expectation–maximization (EM) algorithm [22] is designed to deal with this difficulty through the concept of missing data. Given a set of training samples X, the log-likelihood function can be expressed as
log LðX; HÞ ¼
N X
log
K X
j¼1
xi gðxjhi Þ
ð3Þ
i¼1
and the parameter estimation problem is further formulated
^ ¼ arg maxðlog LðX; HÞÞ H
ð4Þ
H
The usual EM algorithm is implemented iteratively by repeating the expectation step (E-step) and maximization step (Mstep) to update the parameters of the Gaussian mixtures with the sample data until the above log-likelihood function increase to a local maximum. The iterative E-step and M-step are performed as follows [10]: E-step
ðsÞ ðsÞ xðsÞ k g xj j l k ; R k
PðsÞ ðkjxj Þ ¼ PK
i¼1
xsi g xj jlsi ; Rsi
ð5Þ
where P ðsÞ ðkjxj Þ denotes the posterior probability of the jth training sample within the kth Gaussian component at the sth iteration. M-step
PN
j¼1 lðsþ1Þ ¼ PN k
PðsÞ ðkjxj Þxj
j¼1 P
ðsÞ
ðkjxj Þ T PN ðsÞ ðsþ1Þ ðsþ1Þ xj l k j¼1 P ðkjxj Þ xj lk ðsþ1Þ Rk ¼ PN ðsÞ j¼1 P ðkjxj Þ PN ðsÞ j¼1 P ðkjxj Þ xðsþ1Þ ¼ k N ðsþ1Þ
ðsþ1Þ
ðsþ1Þ
ð6Þ
ð7Þ ð8Þ
where lk ; Rk , and xk are the mean, covariance, and prior probability of the kth Gaussian component at the ðs þ 1Þth iteration, respectively. However, two questions must be taken into account when EM algorithm is used to estimate the distribution parameters of the multimode operating data. (i) In industry process, there are always a large number of measured and controlled variables to be monitored. For m-dimensional monitored variables, the total number of Gaussian model parameters to be esti mated is K 12 m2 þ 32 m þ 1 1. Thus, the number of estimated parameters is very large and EM algorithm runs into local optimum more likely. (ii) These measured and controlled variables are always highly correlated due to mass balance, energy balance and other operation restrictions. This gives rise to a singular covariance matrix of Gaussian component. It is necessary to decorrelate the elements of variable vectors and to extract variable correlation from data.
2104
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
3. Multimode process monitoring with PCA mixture model 3.1. PCA mixture model Let X i ¼ fx1 ; x2 ; ; xni g be a m-dimensional subset of ith Gaussian component, whose parameter is hi ¼ fli ; Ri g. If collinearity among m variables exists,it can be reduced to a set of li -dimensional feature vector by a transformation matrix as
tp ¼ PTi ðxp li Þ; p ¼ 1; 2; ; ni
ð9Þ
where li 6 m is the number of PCA components, P i ¼ fpi;1 ; pi;2 ; ; pi;li g and vector pi;j is the eigenvector corresponding to the jth largest eigenvalue of Ri , which can be obtained by singular value decomposition (SVD) algorithm at each EM iterative M-step. The covariance matrix and mean vector in the PCA transformed space can be expressed as
^i ¼ R
Pni
T p¼1 t p t p
ni
Pni ¼
T p¼1 P i ðxp
li Þðxp li ÞT Pi ni
¼ PTi Ri Pi ¼ diagðki;1 ; ki;2 ; ; ki;li Þ
l^ i ¼ 0
ð10Þ ð11Þ
By Eqs. (10) and (11) we know that the PCA feature vectors t p are decorrelated due to the orthogonality of the transform ^i is a diagonal matrix whose diagonal elements are corresponding to the principal matrix Pi . Thus, its covariance matrix R eigenvalues. The removing of ðm li Þ eigenvalues close to zero reduces the number of parameters to be determined in the mixture model and guarantees the nonsingular covariance matrix. Since the transformation matrix P i is linear ,it will not change the data distribution character. If the retained data information is large enough, such as Pli Pm j¼1 ki;j P 0:999, then the conditional density function of Eq. (2) can be simplified as j¼1 ki;j =
gðxp jhi Þ ¼ gðt p jh^i Þ ¼
1 ð2pÞli =2 jR^i j1=2
" # Y li t 2p;j 1 1 exp t Tp R^i 1 t p ¼ exp 1=2 1=2 2 2kimj k j¼1 ð2pÞ
ð12Þ
i;j
When the conditional density function in E-step (Eq. (5)) is substituted into Eq. (12), it not only avoids the singularity of the covariance matrix, but also speeds up the iterative process. Considering the parameters of mixing component are updated at each iteration, a new step to obtain new eigenvalue parameters and corresponding eigenvectors should be added to M-step,which satisfied
Rkðsþ1Þ pi;j ¼ ki;j pi;j
ð13Þ
The above equation can be obtained by singular value decomposition (SVD) algorithm or QR decomposition [4]. Such iterative process is called PCA-EM algorithm in this paper. The final number of PCA bases can be determined by the cumulative percentage of variance (CPV) after iterative process finished. Thus, a PCA mixture model (PCAMM for short) can be constructed as:
pðxp jhÞ ¼
K X
xi gðti;p jhi Þ
ð14Þ
i¼1
ti;p ¼
PTi ðxp
li Þ
where each hi ¼ fli ; Ri ; Pi g is the set of parameters of the ith component, and xi are the mixing proportions subjected to PK ^ i¼1 xi ¼ 1, the conditional density function gðt i;p jhi Þ is given by Eq. (12). The parameters h can be estimated from training data set by PCA-EM algorithm. Then Hotelling’s T 2 and SPE statistics in PCA-based process monitoring can be easily extended to PCA mixture model. It is natural to adopt these statistics because each local model is normally distributed. The T 2 and SPE statistics for ith component are defined as T 1 T T 2i ¼ t Ti K1 l t i ¼ ðx li Þ P i Kli P i ðX li Þ i SPEi ¼ k I P i PTi ðX li Þk2
ð15Þ ð16Þ
3.2. Model order selection based on BYY
Given a sample data set X ¼ fx1 ; x2 ; ; xN g from an original mixture with k > 1 Gaussian components and an initial component number K ¼ 2. The harmony function JðHK Þ is expressed as follows:
JðHK Þ ¼
K X Hj ðpj kqj Þ j¼1
ð17Þ
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
where Hj ðpj kqj Þ ¼ N1
PN
x qðxt jhj Þ
t¼1
PK j
i¼1
xi qðxt jhi Þ
2105
ln½xj qðxt jhj Þ, which denotes the harmony level of the jth Gaussian with respect to the
actual Gaussian implied in the sample data,and qðxt jhi Þ is Gaussian mixture density given by Eq. (2). In order to improve the total harmony function, we can split the component with the least component harmony value Hj ðpj kqj Þ, e.g. the rth component, into two components along the direction of first principle component. The parameters for the two split components are described as follows:
xr1 ¼ xr2 ¼ xr lr1 ¼ lr 0:5A1 lr2 ¼ lr þ 0:5A1
ð18Þ
Rr1 ¼ Rr þ ððb bc2 1Þðxr =xr1 Þ þ 1ÞA1 AT1
ð21Þ
Rr2 ¼ Rr þ ððbc2 b c2 Þðxr =xr2 Þ þ 1ÞA1 AT1
ð22Þ
ð19Þ ð20Þ
pffiffiffiffiffiffiffi where b; c are equal to 0.5, A1 ¼ kr;1 Pr;1 is the offset in first principle component direction. PCA-EM algorithm mentioned above is then performed to estimate the parameters hKþ1 for K þ 1 mixture components. The above split procedure continues until the total harmony function reaches maximum. 3.3. Online monitoring with PCA mixture model A novel multimode process monitoring approach based PCA mixture model now is given to perform online process monitoring. In online monitoring, first, Bayesian inference strategy is utilized to compute the posterior probability of a new sample xt belonging to each mixture component as follows:
xk gðxt jlk ; Rk Þ ; ðk ¼ 1; 2; ; KÞ i¼1 xi gðxt jli ; Ri Þ
Pðxt 2 kÞ ¼ PK
ð23Þ
and xt belong to the following components,
c ¼ arg max Pðxt 2 kÞ
ð24Þ
k¼1;2;;K
Subsequently, the monitoring statistics T 2 and SPE-statistics are then calculated by Eq. (14). Considering that each mixture component has a different control limit, it is tedious to monitor all of the control charts. Therefore,the overall T 2 and SPE statistics are defined as follows for fault detection of multimode processes:
T2 ¼
T 2c T 2c;a
; SPE ¼
SPEc
ð25Þ
d2c;a
where T 2c;a ; d2c;a are T 2 and SPE control limit of cth mixture component, respectively. Only if all monitoring statistics is less than 1, the process is determined within normal operation. 4. Application examples 4.1. Synthetic data A key step for multimode process monitoring is derive the confidence boundary around the normal operating regions. In this experiment we generate two synthetic data sets to test the efficiency of PCA mixture model to estimate the distribution of multimode data. The first dataset (named Set 1) is generated from a mixture of seven Gaussian densities [15] with no collinearity among variables. In this case we create 20 random mixtures of varying complexity (see Fig. 1a–c), and sampled from each of them as a training set of 1000 points in Rm with m 2 f2; 5g. We test the methods for dimension m ¼ 5 because higher dimensions would suffer from the limited size of the training set (A Matlab code of the mixture data generation is available at http://www.science.uva.nl/vlassis/software). The other synthetic data (Set 2) was generated from the following multivariate linear system with collinearity among variables
2
x1
3
2
0:372 0:681
3
2
e1
3
7 6 7 6 6 7 6 x2 7 6 0:489 0:295 7 6 e2 7 7 s1 6 7 6 6 7 6 x3 7 ¼ 6 0:984 0:179 7 6 7 7 s þ 6 e3 7 6 7 6 7 2 6 7 6 6 7 0:52 0:49 x 5 4 45 4 4 e4 5 0:86 0:48 x5 e5
ð26Þ
where ½s1 s2 T denote Gaussian distributed data sources and ½e1 e2 e3 e4 e5 T are zero-mean white noises with standard deviation of rI. Five sets of data sources are simulated to represent different operating modes as: (1) Mode 1: s1 Nð10; 0:8Þ; s2 Nð12; 1:3Þ; (2) Mode 2: s1 Nð5; 0:6Þ; s2 Nð20; 0:7Þ; (3) Mode 3: s1 Nð16; 1:5Þ; s2 Nð30; 2:5Þ; (4) Mode 4: s1 Nð18; 0:8Þ; s2 Nð15; 1:3Þ; (5) Mode 5: s1 Nð2; 0:2Þ; s2 Nð7; 0:5Þ; The means and standard deviations of s1
2106
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
Fig. 1. Estimation of the Set 1 data distribution: (a)–(c) three train data sets in Set 1;(d)–(f) the result of F–J EM algorithm; (g)–(i) the result of PCA-EM algorithm.
and s2 are changed to reflect the shifts of operating region. In the simulation, we generate 200 samples under each operating mode and use all the 1000 samples as data set 2. Data generation and all algorithms are performed 20 times. We apply the proposed PCA-EM, the BYY-EM [19] and F–J EM algorithm [18] to construct mixture model on these data sets and computed the correct classification rate (CCR for short), that is the percent of K ¼ K runs over the 20 runs, and average run time (ART for short), respectively. The BYY-EM algorithm started from K ¼ 2 and the maximum number of components in F–J EM was 10. For Set 1, Fig. 1 shows the different distribution of three bivariate Gaussian data sets and learning results of F–J EM algorithm and PCA-EM algorithm on these data sets. The experimental results of BYY-EM algorithm are not mentioned here
Table 1 Experimental results of the three algorithms on Set 1. F–J GMM
BYY-GMM
BYY-PCAMM
Data dimension
CCR (%)
ART (s)
CCR (%)
ART (s)
CCR (%)
ART (s)
m=2 m=5
50 35
5.4574 4.1293
75 60
4.0379 5.7093
75 60
0.2658 0.3393
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
2107
because they are same as PCA-EM method. It can be verified that after the BYY-based learning, seven Gaussian components are finally located accurately at the seven components, respectively. In contrast, the F–J EM algorithm either gave a wrong number of components (Fig. 1(d)) or converged to wrong values of parameters (Fig. 1(e)). In Table 1 we give experimental results comparing PCA-EM with F–J EM and BYY-EM algorithm on all 20 mixture data sets. Obviously, the BYY-based EM algorithms have better estimation accuracy than the F–J algorithm. This result implies that splitting the component among the direction of first principle component is effective. Furthermore, the proposed BYY-PCAMM converges much faster than the first two methods. It shows diagonalization with PCA transformation can improve computational efficiency. When these methods are applied to estimate the distribution of linearly dependent dataset (i.e Set 2), as shown in Fig. 2 and Table 2, the first two EM algorithm both led to a wrong number of Gaussian components because of the singularity of covariance matrix. In contrast, as shown in Fig. 2(c), the proposed BYY-PCAMM can locate accurately at the five actual Gaussian components with fast convergence. As the standard deviation of white noises increases, as described in Table 2, the CCR of the first two methods are slightly increases, whereas CCR of BYY-PCAMM decreases. This is due to the fact that strong noises would reduce the collinearity among variables. 4.2. Simulated CSTH process In this section, the proposed monitoring method based on PCA mixture model is applied to a simulated CSTH process under multiple operating conditions. The CSTH simulation model was originally developed by Thornhill et al. [23]. As depicted in Fig. 3, the stirred tank is supplied with well mixed hot and cold water flow and heated by steam. The CSTH model inputs are the hot water, cold water and steam valve demands, and the outputs include the electronic measurements of the level, temperature, hot water and cold water flow that are controlled by multiple PI loops. A full description of the CSTH process can be found in [23] and the closed-loop simulation programs are available at their website. (http://www.ps.ic.uk./nina/CSTHSimulation/index.htm) As list in Table 3, five sets of operating conditions are designed to test the performance of the proposed method in monitoring multimode process. Three controlled variables, level, temperature, and cold water flow, are selected for monitoring in this study. In training period, 200 samples are collected under each operating mode with 1 s sampling time and the overall 1000 samples from multiple modes are used to construct the nominal PCA mixture model. The obtained PCA mixture model consists of five Gaussian components.The estimated prior probability for each component is 0.2, which is in accordance with the actual data generation. And the number of PCA bases for each component is 3, which shows no collinearity among monitoring variables. The accuracy of other model parameters, such as means and covariances, are further verified by the projected probability ellipses, as shown in Fig. 4. It can be observed that each operating region is well-defined by the corresponding Gaussian component. Two abnormal scenarios of instrument bias fault and random variation error are considered to test the effectiveness of this method. In the first case, the initial 100 samples are simulated under normal operating mode 1 and then a sudden step change of 0.5 is introduced into the level measurement from 101st sample through the 200th sample. For Case 2, the initial 100 normal samples are also collected under the first mode, but followed by an inflated random variation in the temperature measurement [10]. The conventional PCA method and the proposed approach based PCA mixture model are applied to the two cases and the corresponding control charts shown in Figs. 5 and 6. The control limits are 95% confidence limits and the number of principle components are determined by CPV > 0:8. Fig. 4 shows the train data and fault data distribution among with estimated probability ellipses. The five solid-line ellipses represent the 95% confidence boundaries of the five subsets of data from different modes, whereas the single dashed-line ellipse corresponds to the global confidence boundary (conventional PCA) of full set of mixed data. Note that the global ellipse covers much bigger region than sum of the five local ones. Both fault data sets are out of region of Mode 1 but within global normal regions. As shown in Fig. 5(a), the conventional PCA-based SPE and T 2 control charts perform poorly in detecting the bias fault in Case 1. The T 2 index is far below its control limit and thus unable to trigger any fault alarm. On the other hand, the SPE index
Fig. 2. Estimation of the Set 2 data distribution: (a) the result of F–J EM algorithm; (b) the result of BYY-EM algorithm; (c) the result of PCA-EM algorithm.
2108
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
Table 2 Experimental results of the three algorithms on Set 2. F–J EM
BYY-EM
PCA-EM
Noise deviation
CCR (%)
ART (s)
CCR (%)
ART (s)
CCR (%)
ART (s)
r = 0.01
0 10 5 5
1.5562 1.711 1.8271 1.9686
0 0 0 10
4.4492 6.3126 5.5855 6.7145
100 100 100 80
0.1552 0.173 0.2073 0.3457
r = 0.02 r = 0.03 r = 0.05
Fig. 3. Schematic diagram of the simulated CSTH process.
Table 3 Five sets of normal operating conditions of CSTH. Variable
Mode 1
Mode 2
Mode 3
Mode 4
Mode 5
Level SP Temperature SP Hot water valve
12 10.5 5.5
16 10.5 4.5
9 10.5 4.5
12 13.5 6
12 8 4
16 14
Temperature
Mode 4
12 Mode 3
10
Mode 2
8
Mode 5
train data fault 1 fault 2
6 4
5
10
15
20
Level Fig. 4. CSTH:scatter plots of training data and fault data among with estimated local (solid line) and global (dotted line) probability ellipses.
of some samples within first 100 samples exceeds its control limit, which shows PCA-based method has higher false alarm rate. Further, SPE index does not exceed the control limit until the 140th sample and returns below the limit line at 170th sample under abnormal operation mode. It suggests that the fault alarm is delayed by 40 samples after the first advent of the level step change. The high false alarm rate, low detection rate and long alarm delay of the conventional PCA method stem from the inflated model covariance and reduce fault sensitivity. On the contrary, the proposed monitoring method based PCA mixture model is illustrated in Fig. 5(b), the T 2 and SPE indexes provide a clear indication of fault occurrence right after the 100th sample. In the Case 2, the PCA-based T 2 and SPE charts in Fig. 6(a) completely fail to detect the random variation fault. However, as shown in Fig. 6(b), PCA mixture model detects the fault by violating the confidence limit from 101st sample without any delay. In this multimode CSTH example, the monitoring approach with PCA mixture mode is proven to be effective in detecting different types of sensor faults with lower false alarm rates, whereas the conventional PCA method appears to be very insensitive to the faults.
2109
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
0
nT
T2
2
10
−2
0
−2
10
10 0
10
10
50
100
150
200
0
50
0
50
100
150
200
100
150
200
0
SPE
nSPE
0
10
−2
10 0
50
100
150
200
Samples
Samples
Fig. 5. CSTH: T 2 and SPE control charts by (a) conventional PCA and (b) PCA mixture model in fault Case 1.
0
nT
T2
2
10
−2
−2
10
10 0
50
100
150
200
0
nSPE
SPE
0
10
10
0
50
100
Samples
150
200
0
50
0
50
100
150
200
100
150
200
0
10
Samples
Fig. 6. CSTH: T 2 and SPE control charts by (a) conventional PCA and (b) PCA mixture model in fault Case 2.
4.3. Tennessee-Eastman process (TEP) The Tennessee-Eastman process [24] is a model of a complex industrial chemical process. It has been widely used as a benchmark case study for different purposes. As shown in Fig. 7, the process has five major unit operations, i.e., a reactor, a product condenser, a recycle compressor, a stripper and a vapor/liquid separator. Four gaseous reactants are fed into the reactor to form two products along with a byproduct and an insert, which are separated through the downstream operations. The process consists of 12 manipulated variables and 41 measured variables for monitoring or control. The original process is itself strictly open-loop unstable, thus a certain control strategy must be introduced. Here the decentralized control design developed by Ricker [25] is adopted for the closed-loop process operation. Moreover, there are six modes of process operation at three different G/H mass ratios, as listed in Table 4. The optimal steady-state conditions can be found in [24] and the simulation codes can be downloaded online (http://depts.washington.edu/control/LARRY/TE/download.html). For the sake of practical consideration, the 22 continuous outputs among the 41 measurements are selected as monitored variables and the sampling time is 0.05 h. To construct the nominal model for process monitoring, 1000 samples are collected under each of the above operating modes and total 6000 samples constitute the training data set. The obtained PCA mixture model consists of five Gaussian components, in which mode 2 and mode 5 can be characterized by one Gaussian due to their distribution properties are very similar. The prior probability and the number of PCA bases for each component are f0:16670:16670:3330:16670:1667g and f12; 11; 4; 12; 13g, respectively. The number of PCA bases is determined by CPV P 0:75. The training data in the variable 1 and variable 2 along with the 95% probability ellipses corresponding to the six modes are exhibited in Fig. 8. It shows that the probability density of the six modes is well-characterized by the estimated Gaussian components. To examine the effectiveness of the proposed monitoring approach for complex industrial processes, three test scenarios with different kinds of faults are designed. Case 1: the process initially run under normal conditions in mode 5, then IDV6 (sudden loss of A feed) is added from the 101st through 200th samples. Case 2: the process is initially running at mode 3, then IDV11 (random variation in reactor cooling water temperature) is added at sampling point 100. Case 3: after running for about 50h under normal conditions in mode 3, the process operation switches to mode 1 at sampling point 100, then collect 100 samples under normal operation and IDV13 (slow drift in reaction kinetics) is introduced to mode 1 at 201st sample. The
2110
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
Fig. 7. Schematic diagram of the TEP.
Table 4 Six operating modes of TEP. Mode
Desired G/H mass ratio
Desired production (kg/h)
1 2 3 4 5 6
50/50 10/90 90/10 50/50 10/90 90/10
14076 14077 11111 Maximum Maximum Maximum
Fig. 8. TEP: scatter plots of training data along with the estimated local probability ellipse.
conventional PCA method and monitoring approach based on PCA mixture model are applied to the three cases and the corresponding control charts are shown in Figs. 9–11. The control limits are 95% and number of principle components are determined by CPV P 0:75. In Case 1, the sudden loss of feed A can cause a series of related variables to move far away from the normal operating regions rapidly due to the interactions among the monitored variables. Thus, as shown from Fig. 9, both global PCA and our proposed method capture the process fault with similar efficiency. For Case 2, however, as observed in Fig. 10(a), most of the faulty samples fall below the 95% control limits in both SPE and T 2 charts, which indicates PCA-based method has extremely
2111
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
2
10
2
2
nT
T
2
10
0
10 0
10
0
50
100
150
200
0
50
0
50
100
150
200
100
150
200
2
SPE
nSPE
10
0
10
0
10
0
50
100
150
200
Samples
Samples
Fig. 9. TEP: T 2 and SPE control charts by (a) conventional PCA and (b) PCA mixture model in fault Case 1.
2
10 1
2
nT
T
2
10
10
0
0
10
0
50
100
150
200
0
50
0
50
100
150
200
100
150
200
2
SPE
nSPE
10
0
10
0
10
0
50
100
150
200
Samples
Samples
Fig. 10. TEP: T 2 and SPE control charts by (a) conventional PCA and (b) PCA mixture model in fault Case 2.
2
10
0
T
2
nT2
10
50
100
150
200
250
SPE 10
0
300
nSPE
0
10
10
0
50
100
0
50
100
150
200
250
300
150
200
250
300
0
0
0
50
100
150
Samples
200
250
300
Samples
Fig. 11. TEP: T 2 and SPE control charts by (a) conventional PCA and (b) PCA mixture model in fault Case 3.
low detection rates. In Fig. 10(b)nevertheless, the PCA mixture model control charts are found to be able to detect the same fault without significant delay, which demonstrates the high accuracy of the proposed approach in detecting the difficult random variation fault. In Case 3, the global PCA method still performs poorly in dealing with the slow drifting fault. As shown in Fig. 11(a), the SPE and T 2 control charts miss most of the faulty samples and thus have high type II errors. In T 2 chart, for instance, it is obvious that the drifting trend of the statistical metric starts after about 190 min of the first event of drift fault. In contrast to the PCA method, the proposed approach successfully detects the drift fault in an early stage with a delay of 93 min only. The results show that the monitoring approach based on PCA mixture model is superior to the global PCA method in detecting the slow drift fault.
2112
X. Xu et al. / Computers and Electrical Engineering 40 (2014) 2101–2112
5. Conclusions The process monitoring problem can be considered as a one-class classification problem, of which the key task is to estimate the confidence boundary around the normal operating regions accurately to separate the normal process data from the faulty ones. Focusing on the complex industrial processes with multiple operating conditions, the proposed method based on PCA mixture model works well when the normal operating data follows a multimode distribution, which can be characterized by a finite Gaussian mixture model. Considering the ill-condition of covariance matrix duo to the collinearity among monitored variables, the PCA technique is applied to the E-step to decorrelate the elements of variable vectors. Then BYY algorithm is adopted to automatically optimize the number of mixture components. Compared to the F–J EM and BYYEM algorithm, the proposed PCA-base EM algorithm has a higher estimation accuracy and converges much faster. With the obtained PCA mixture model, a Bayesian inference procedure is utilized to compute the posterior probabilities of each monitored sample belonging to all the Gaussian components and the overall statistics index are calculated for online monitoring. The validity and effectiveness of the proposed monitoring approach are illustrated by CSTH and TE process. The comparison of monitoring results show that the presented approach can achieve a good parameter estimation of the mixture model with correct model selection and so it is superior to the conventional PCA method in terms of higher detection rates, lower false alarm rates and shorter delays of fault capture. Therefore, the proposed approach provides a promising tool to monitor the complex industrial processes under multiple operating conditions without a priori process knowledge. Acknowledgements This work was financially supported by the National Natural Science Foundation of China (Grant Nos. 61134007, 60904039, 613741211) and the 111 Project (Grant No. B07031). References [1] Ge ZQ, Gao FR, Song ZH. Two-dimensional Bayesian monitoring method for nonlinear multimode processes. Chem Eng Sci 2011;66(21):5173–83. [2] Qin SJ. Statistical process monitoring: basis and beyond. J Chemomet 2003;17:480–502. [3] Zhao CH, Yao Y, Gao FR, Wang FL. Statistical analysis and online monitoring for multimode processes with between-mode transitions. Chem Eng Sci 2010;65(22):5961–75. [4] Sharma Alok, Paliwal Kuldip K, Imoto Seiya, Miyano Satoru. Principal component analysis using QR decomposition. Int J Mach Learn Cybernet 2012. http://dx.doi.org/10.1007/s13042-012-0131-7. [5] Hwang D-H, Han C. Real-time monitoring for a process with multiple operating modes. Control Eng Pract 1999;7:891–902. [6] Qin SJ. Recursive PLS algorithms for adaptive data modeling. Comput Chem Eng 1998;22(4-5):503–14. [7] Li WH, Yue HH, Sergio V-C, Qin SJ. Recursive PCA for adaptive process monitoring. J Process Control 2000;10(5):471–86. [8] Zhao SJ, Xu YM, Zhang J. A multiple PCA model based technique for the monitoring of processes with multiple operating modes. In: Comput Aided Chem Eng. Elsevier; 2004. p. 865–70. [9] Choi SW, Park JH, Lee I-B. Process monitoring using a Gaussian mixture model via principal component analysis and discriminant analysis. Comput Chem Eng 2004;28(8):1377–87. [10] Yu J, Qin SJ. Mulitmode process monitoring with Bayesian inferece-based finite Gaussian mixture models. AIChE J 2008;54(7):1811–29. [11] Kano M, Hasebe S, Hashimoto I, Ohno H. Evolution of multivariate statistical process control: application of independent component analysis and external analysis. Comput Chem Eng 2004;28(6-7):1157–66. [12] Ge ZQ, Song ZH. Online monitoring of nonlinear multiple mode processes based on adaptive local model approach. Control Eng Pract 2008;16(12):1427–37. [13] Kim H-C, Kim D, Bang SY. An efficient model order selection for PCA mixture model. Pattern Recogn Lett 2003;24(9-10):1385–93. [14] Li JQ, Barron AR. Mixture density estimation. Adv Neural Inform Process Syst 2000;12:279–85. [15] Vlassis N, Likas A. A Greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 2002;15:77–87. [16] Verbeek JJ, Vlassis N, Krose B. Efficient greedy learning of Gaussian mixture models. Neural Comput 2003;15(2):469–85. [17] Wafo Soh C. Optimal filtering of measured spectral intensity, estimation of light attenuation coefficient and prediction of the euphotic zone in shallow waters. Int J Mach Learn Cybernet 2013. http://dx.doi.org/10.1007/s13042-013-0163-7. [18] Figueiredo MAF, AK Jain. Unsupervised learning of finite mixture models. IEEE Trans Pattern Mach Intell 2002;24:381–96. [19] Ma JW, Wang TJ, Xu L. A gradient BYY harmony learning rule on Gaussian mixture with automated model selection. Neurocomputing 2004;56:481–7. [20] Ma JW. Automated model selection (AMS) on finite mixtures:a theoretical analysis. In: Proceedings of the 2006 international joint conference on neural networks); 2006. p. 8255–61. [21] Li L, Ma JW. A BYY scale-incremental EM algorithm for Gaussian mixture learning. Appl Mathemat Comput 2008;205(2):832–40. [22] Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via em algorithm. J Royal Stat Soc Ser B-Methodol 1977;39(1):1–38. [23] Thornhill NF, Patwardhan SC, Shah SL. A continuous stirred tank heater simulation model with applications. J Process Control 2008;18(3-4):347–60. [24] Downs JJ, Vogel EF. Plant-wide industrial process control problem. Comput Chem Eng 1993;17:245–55. [25] Ricker NL. Decentralized control of the Tennessee Eastman challenge process. J Process Control 1996;6:205–21. Xianzhen Xu is currently a Ph.D. student in control sciences & engineering from Zhejiang University. At the same time she is an associate professor in Changshu Institute of technology. Her main research interest include, but are not limited to, process monitoring and fault diagnosis for complex processes. Lei Xie at present is an associate professor in Zhejiang University. In 2005, he was awarded the Ph.D. degree in control sciences & engineering from Zhejiang University. His main research interest include, but are not limited to, the interdisciplinary domain of statistics and system theory. Shuqing Wang at present is professor in Zhejiang University. His main research interest include, but are not limited to, the advance process control theory and technology, control theory and application.