PROCESS MONITOR Chinese Journal of Chemical Engineering, 20(6) 1174—1179 (2012)
Multimode Process Monitoring Based on Fuzzy C-means in Locality Preserving Projection Subspace* XIE Xiang (解翔) and SHI Hongbo (侍洪波)**
Key Laboratory of Advanced Control and Optimization for Chemical Processes, East China University of Science and Technology, Ministry of Education, Shanghai 200237, China
Abstract For complex industrial processes with multiple operational conditions, it is important to develop effective monitoring algorithms to ensure the safety of production processes. This paper proposes a novel monitoring strategy based on fuzzy C-means. The high-dimensional historical data are projected into a low-dimensional subspace spanned by locality preserving projection. The scores in this subspace are then classified into several overlapping clusters, each representing an operational mode. The distance statistics of each cluster are integrated through the membership values into a novel BID (Bayesian inference distance) monitoring index. The efficiency and effectiveness of the proposed method are validated through the Tennessee Eastman benchmark process.
Keywords multimode process monitoring, fuzzy C-means, locality preserving projection, integrated monitoring index, Tennessee Eastman process
1 INTRODUCTION
Multiple operating modes are one of the most common features of modern industrial processes, owing to varying market demands. To ensure the safety of production, it is important to develop effective monitoring methods for multimode processes. Once abnormal events occur during production, fault symptoms should be reflected in the monitoring chart and fault alarms should be triggered. Recently, this area has been intensively studied and many statistical monitoring algorithms have been reported. For processes with multiple operational modes, the most intuitive idea is to build separate models for different modes, which is adopted by most of the literature in this field. Since principal component analysis (PCA) and partial least squares (PLS) are the most mature techniques in single-mode process monitoring, they have been extended to multimode processes in a multiple-model way, combined with other pattern classification algorithms, with the traditional T² and SPE (squared prediction error) statistics used as monitoring indices [1-10]. Fuzzy C-means (FCM) is a pattern recognition tool that is usually applied as a preprocessing step in statistical process monitoring. FCM has been used to classify multimode data into different groups, with a separate PCA model built for each data group [11, 12]. Along the same line, by combining FCM with PCA, Ng and Srinivasan [9] proposed the so-called AdPCA (Adjoint principal component analysis) approach and validated its efficiency through a distillation unit and a multiphase penicillin cultivation process. However, the most significant disadvantage of these methods is that, for each online sample, only one local model is adopted while calculating the monitoring statistics and
the information from the other models is neglected, which may lead to biased monitoring results. To overcome this deficiency, Yu and Qin proposed a Bayesian inference based Gaussian mixture model (GMM) approach for process monitoring [13, 14], in which the local probability indices are integrated through the posterior probability of each Gaussian component. The monitoring index, named the BIP (Bayesian inference probability) statistic, contains the statistical information from all the Gaussian models, so the monitoring results are more comprehensive and reliable. Ge and Song further applied this Bayesian approach successfully to monitor a polypropylene production process with multiple operation modes [15].
In this study, we propose a novel monitoring strategy for processes with multimode features. High-dimensional historical data are projected into a low-dimensional LPP (locality preserving projection) subspace, where the clustering information is preserved and the sensitivity to outliers is improved. The FCM is trained to classify the scores into several clusters, each corresponding to an operation mode. Then the Mahalanobis distances of every online sample to all the clusters are integrated through Bayesian rules. Thus the local information from each individual cluster is contained in the final monitoring statistic, which enhances the reliability of the monitoring results. The effectiveness and efficiency of the approach are verified through the well-known TE (Tennessee Eastman) simulation platform.
2 METHODOLOGY
2.1 Dimension reduction by LPP
Locality preserving projection (LPP) is a
Received 2012-05-29, accepted 2012-07-26. * Supported by the National Natural Science Foundation of China (61074079) and Shanghai Leading Academic Discipline Project (B054). ** To whom correspondence should be addressed. E-mail:
[email protected]
Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012
geometrically motivated manifold learning method [16]. In our approach, LPP is used to reduce the dimension of the collected process variables. Since multimode process data are usually distributed in different clusters in their original space and LPP preserves local structures while reducing the dimension, it is reasonable to adopt LPP as the pretreatment tool for multimode data. Similar to PCA, LPP seeks a transformation matrix $A$ that maps the high-dimensional data $X = [x_1, \ldots, x_n]^T$, $x_i \in \mathbb{R}^m$, into low-dimensional scores $T = [t_1, \ldots, t_n]^T$, $t_i \in \mathbb{R}^p$ ($p \ll m$), so that $t_i$ represents $x_i$, where $t_i = A^T x_i$, $i = 1, 2, \ldots, n$.
To find the transformation matrix $A$ and preserve the local information, an adjacency graph is constructed first. Let $G$ denote a graph with $n$ nodes. Nodes $x_i$ and $x_j$ are connected by an edge if $x_i$ is among the $k$ nearest neighbors of $x_j$, or $x_j$ is among the $k$ nearest neighbors of $x_i$. After that, a symmetric matrix $W$ is defined to weigh the edges. Let $W_{ij}$ represent the weight of the edge joining vertices $x_i$ and $x_j$: $W_{ij} = 1$ if and only if vertices $i$ and $j$ are connected, and $W_{ij} = 0$ if there is no such edge [17]. Next, solve the following generalized eigenvector problem

$$X L X^T a = \lambda X M X^T a \tag{1}$$

where $\lambda$ is the eigenvalue and $a$ is its corresponding eigenvector, $M$ is a diagonal matrix whose entries are the column (or row) sums of $W$, namely $M_{ii} = \sum_j W_{ji}$, and $L = M - W$ is the Laplacian matrix. Let the column vectors $a_1, \ldots, a_m$ be the solutions of Eq. (1), arranged according to their eigenvalues, $\lambda_1 < \cdots < \lambda_m$. The projection can be written as

$$t_i = A^T x_i, \quad A = (a_1, \ldots, a_p), \quad p \ll m \tag{2}$$

where $t_i$ is a $p$-dimensional score vector and $A$ is an $m \times p$ matrix. In the LPP subspace, not only is the computational burden of modeling greatly reduced, but the sensitivity is also improved significantly. These characteristics of the preprocessing step facilitate the subsequent modeling and monitoring procedures.
2.2 Fuzzy clustering
In the LPP subspace, the scores of the historical data are separated into meaningful clusters, each representing an operation mode. The objective of fuzzy clustering is to partition the data set $T$ into $K$ clusters with vague boundaries. The objective function is defined as

$$\min_{U,\mu,\Sigma} J(T; \alpha, \mu, \Sigma) = \sum_{i=1}^{K} \sum_{j=1}^{n} (\alpha_{ij})^q D_{ij,\Sigma_i}^2 \tag{3}$$

$$D_{ij,\Sigma_i}^2 = (t_j - \mu_i)^T \Sigma_i^{-1} (t_j - \mu_i) \tag{4}$$

subject to the constraints

$$0 \le \alpha_{ij} \le 1, \quad \sum_{i=1}^{K} \alpha_{ij} = 1 \tag{5}$$

where $\alpha_{ij}$, the $(i, j)$ entry of the membership matrix $U$, is the membership value of the $j$-th data score belonging to the $i$-th cluster, $T$ is the set of low-dimensional scores obtained through the LPP projection, $q$ is the fuzzifier that determines the fuzziness of the resulting clusters, and $D_{ij,\Sigma_i}^2$ denotes the Mahalanobis distance of $t_j$ to the $i$-th cluster. To train the FCM, the following Gustafson-Kessel (GK) algorithm is adopted. With the scores $\{t_1, \ldots, t_n\}$ and an initial estimate $\{\alpha^{(0)}, \mu^{(0)}, \Sigma^{(0)}\}$, two steps are performed iteratively [18].
Step 1 Compute the cluster centers and covariances

$$\mu_i^{(s+1)} = \frac{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q \, t_j}{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q} \tag{6}$$

$$\Sigma_i^{(s+1)} = \frac{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q \left[ t_j - \mu_i^{(s+1)} \right] \left[ t_j - \mu_i^{(s+1)} \right]^T}{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q} \tag{7}$$

Step 2 Compute the distances and update the degrees of membership

$$D_{ij,\Sigma_i}^{2(s+1)} = \left| \Sigma_i^{(s+1)} \right|^{1/m} \left[ t_j - \mu_i^{(s+1)} \right]^T \left[ \Sigma_i^{(s+1)} \right]^{-1} \left[ t_j - \mu_i^{(s+1)} \right] \tag{8}$$

$$\alpha_{ij}^{(s+1)} = \frac{1}{\sum_{c=1}^{K} \left[ D_{ij,\Sigma_i}^{2(s+1)} \big/ D_{cj,\Sigma_c}^{2(s+1)} \right]^{2/(q-1)}} \tag{9}$$

where $\alpha_{ij}^{(s+1)}$, $\mu_i^{(s+1)}$ and $\Sigma_i^{(s+1)}$ are the membership value, the mean and the covariance of the $i$-th cluster at the $(s+1)$-th iteration, respectively, and the factor $|\Sigma_i^{(s+1)}|^{1/m}$ in Eq. (8) is the volume normalization of the GK algorithm. By repeating Steps 1 and 2 iteratively, the membership values and the parameters of each cluster are computed to a specified accuracy. After these training steps, the original process dataset is partitioned into $K$ clusters, which correspond to $K$ operation modes.
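The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gk_fuzzy_cluster`, the random initial memberships, the small ridge added to each covariance and the stopping tolerance are hypothetical choices. The exponent of the determinant in Eq. (8) is taken here as one over the score dimension, and Eq. (9) is applied as printed.

```python
import numpy as np

def gk_fuzzy_cluster(T, K, q=2.0, n_iter=100, tol=1e-6, seed=0):
    """GK-style fuzzy clustering of scores T (n x p), following Eqs. (6)-(9).

    Returns memberships alpha (K x n), centers mu (K x p)
    and covariances Sigma (K x p x p).
    """
    rng = np.random.default_rng(seed)
    n, p = T.shape
    # Assumption: random initial memberships, each column summing to one (Eq. 5).
    alpha = rng.random((K, n))
    alpha /= alpha.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        w = alpha ** q                                    # [alpha_ij]^q
        mu = (w @ T) / w.sum(axis=1, keepdims=True)       # Eq. (6)
        Sigma = np.empty((K, p, p))
        D2 = np.empty((K, n))
        for i in range(K):
            diff = T - mu[i]                              # n x p deviations
            Sigma[i] = (w[i, :, None] * diff).T @ diff / w[i].sum()   # Eq. (7)
            Sigma[i] += 1e-8 * np.eye(p)                  # assumption: ridge for invertibility
            maha = np.einsum("nj,jk,nk->n", diff, np.linalg.inv(Sigma[i]), diff)
            # Eq. (8); exponent assumed to be 1 / (score dimension).
            D2[i] = np.linalg.det(Sigma[i]) ** (1.0 / p) * maha
        D2 = np.maximum(D2, 1e-12)                        # guard against division by zero
        # Eq. (9) as printed: ratio[i, c, j] = (D2_ij / D2_cj) ** (2 / (q - 1)).
        ratio = (D2[:, None, :] / D2[None, :, :]) ** (2.0 / (q - 1.0))
        new_alpha = 1.0 / ratio.sum(axis=1)
        converged = np.abs(new_alpha - alpha).max() < tol
        alpha = new_alpha
        if converged:
            break
    return alpha, mu, Sigma

# Synthetic demo: three well-separated groups of 2-D scores (illustrative data only).
rng = np.random.default_rng(1)
T_demo = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in ((0, 0), (4, 0), (0, 4))])
alpha, mu, Sigma = gk_fuzzy_cluster(T_demo, K=3)
modes = alpha.argmax(axis=0)    # hard mode label per sample, if one is needed
```

Each column of `alpha` sums to one by construction, so a sample lying between two modes receives a split membership instead of being forced into a single cluster.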
2.3 Integrated monitoring statistics
For monitoring algorithms, effective monitoring statistics are important and should be carefully designed. Since each online sample may belong to any of several operation conditions, the traditional T² and SPE statistics may be inapplicable. To avoid biased monitoring results, we construct an integrated global monitoring statistic based on Bayesian rules. Theoretically, the membership $\alpha_{ij}$ may be considered as the prior probability of the j-th score belonging to the i-th cluster. Each cluster is assumed to follow the
Gaussian distribution with $\mu_i$ as mean vector and $\Sigma_i$ as covariance matrix, so the posterior probability of the score $t_{new}$ of an arbitrary online sample can be calculated with Bayesian rules as follows

$$P(\mu_i, \Sigma_i \mid t_{new}) = \frac{\alpha_i \, g(t_{new} \mid \mu_i, \Sigma_i)}{\sum_{k=1}^{K} \alpha_k \, g(t_{new} \mid \mu_k, \Sigma_k)} \tag{10}$$

where $g(\cdot)$ is the Gaussian density function. Note that Eq. (4) gives the Mahalanobis distance of $t_j$ to each cluster, which can be directly used as a local monitoring index. At a given confidence level, this index indicates whether the monitored sample is normal or faulty, provided that it belongs to the corresponding cluster. Considering that the monitored samples may come from multiple clusters, a global integrated index is further defined to combine the local indices across all the possible clusters. The proposed index, named BID (Bayesian inference distance), is given by

$$BID = \sum_{i=1}^{K} P(\mu_i, \Sigma_i \mid t_{new}) \, D_{i,\Sigma_i}^2(t_{new}) \tag{11}$$

where $D_{i,\Sigma_i}^2(t_{new})$ is the Mahalanobis distance of Eq. (4) evaluated at $t_{new}$ for the $i$-th cluster. The upper control limit for BID, denoted as DL, can be calculated through the F-distribution as

$$DL = \frac{p(n^2 - 1)}{n(n - p)} F_{p,\,n-p;\,\gamma} \tag{12}$$

where $n$ and $p$ represent the number of historical samples and the dimension of the scores, respectively, and $F_{p,\,n-p;\,\gamma}$ is the critical value of the F-distribution with $p$ and $n - p$ degrees of freedom at the given confidence level $\gamma$, usually 95% or 99% [14].
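To make Eqs. (10) to (12) concrete, the following Python sketch evaluates the BID index for one online score. It is an illustration under stated assumptions: the function names `bid_index` and `control_limit`, the two-cluster parameters and the test points are hypothetical. The posterior is computed in the log domain because a faulty sample can lie far from every cluster, where the Gaussian densities underflow. The F critical value is passed in as a number; it can be read from a table or obtained from a statistics library such as `scipy.stats.f.ppf(gamma, p, n - p)`.

```python
import numpy as np

def bid_index(t_new, alpha, mus, Sigmas):
    """Posterior-weighted Mahalanobis distance of one score, Eqs. (10)-(11)."""
    p = t_new.size
    K = len(mus)
    d2 = np.empty(K)
    logw = np.empty(K)
    for i in range(K):
        diff = t_new - mus[i]
        d2[i] = diff @ np.linalg.solve(Sigmas[i], diff)      # Eq. (4) at t_new
        _, logdet = np.linalg.slogdet(Sigmas[i])
        # log of alpha_i * g(t_new | mu_i, Sigma_i)
        logw[i] = np.log(alpha[i]) - 0.5 * (d2[i] + logdet + p * np.log(2.0 * np.pi))
    post = np.exp(logw - logw.max())                         # stable normalization
    post /= post.sum()                                       # Eq. (10)
    return float(post @ d2), post                            # Eq. (11)

def control_limit(n, p, f_crit):
    """Eq. (12); f_crit is the F critical value F_{p, n-p; gamma}."""
    return p * (n ** 2 - 1) / (n * (n - p)) * f_crit

# Hypothetical two-mode model with unit covariances and equal priors.
alpha_demo = np.array([0.5, 0.5])
mus_demo = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
Sigmas_demo = [np.eye(2), np.eye(2)]

bid_center, post_center = bid_index(np.array([0.0, 0.0]), alpha_demo, mus_demo, Sigmas_demo)
bid_between, post_between = bid_index(np.array([2.5, 2.5]), alpha_demo, mus_demo, Sigmas_demo)
```

In this toy model a score at the first cluster center gets a BID close to zero, while a score halfway between the two centers is weighted equally by both clusters and takes their common squared distance.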
2.4 Steps for offline modeling and online monitoring
The detailed procedures of the proposed monitoring methodology are as follows.
(1) Collect a set of historical data under all possible operation conditions and scale them to eliminate the effect of different measurement scales.
(2) Find the transformation matrix A from Eq. (1) to reduce the dimension of the data while preserving the clustering information.
(3) Map the historical data into the LPP subspace with Eq. (2) to obtain the scores of the historical data.
(4) Apply the GK algorithm to train the FCM and learn the parameters of each cluster {μ1, Σ1, …, μK, ΣK} and their corresponding membership values {α1, …, αK}.
(5) For each monitored online sample, scale it with the mean and covariance of the historical data and project it into the LPP subspace by Eq. (2).
(6) Compute the posterior probabilities of the score belonging to each cluster via Eq. (10), and calculate its global monitoring index BID with Eq. (11).
(7) Specify a confidence level γ and compute the control limit DL with Eq. (12).
(8) Detect abnormal operation conditions when BID > DL for several consecutive samples.
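Steps (2), (3) and (5) rely on the LPP projection of Eqs. (1) and (2). The sketch below shows one way to compute it with NumPy, under stated assumptions: the function name `lpp_fit`, the neighborhood size `k` and the random demo data are hypothetical, and samples are stored as rows, so the products $X L X^T$ and $X M X^T$ of Eq. (1) become `X.T @ L @ X` and `X.T @ M @ X`. The generalized eigenproblem is solved by Cholesky whitening, which requires more samples than variables and linearly independent columns.

```python
import numpy as np

def lpp_fit(X, p, k=10):
    """Return the m x p LPP matrix A and its eigenvalues for row-sample data X (n x m)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]           # k nearest neighbors of each sample
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)                         # edge if either sample is a neighbor of the other
    M = np.diag(W.sum(axis=0))                     # M_ii = sum_j W_ji
    L = M - W                                      # graph Laplacian
    S_L = X.T @ L @ X
    S_M = X.T @ M @ X
    # Solve S_L a = lambda S_M a via S_M = C C^T and a = C^{-T} v.
    C = np.linalg.cholesky(S_M)
    C_inv = np.linalg.inv(C)
    lam, V = np.linalg.eigh(C_inv @ S_L @ C_inv.T) # eigenvalues in ascending order
    A = C_inv.T @ V[:, :p]                         # eigenvectors of the p smallest eigenvalues
    return A, lam[:p]

# Illustrative data only: 60 scaled samples of 5 variables, projected to 2 scores.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))
X_demo = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)
A, lam = lpp_fit(X_demo, p=2, k=5)
scores = X_demo @ A                                # row i holds t_i = A^T x_i
```

An online sample is handled the same way in step (5): it is scaled with the statistics of the historical data and multiplied by the stored `A`.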
3 TE BENCHMARK SIMULATION
Figure 1 Schematic diagram of the Tennessee Eastman process
The Tennessee Eastman (TE) simulation program is widely applied to evaluate the efficiency of process monitoring techniques [19]. The original version of the simulation program is provided as FORTRAN code, in which the plant runs in open loop and shuts down after running for two hours due to high pressure in the reactor. The schematic diagram of the TE process is shown in Fig. 1. It consists of 5 unit operations, i.e., a reactor, a partial condenser, a recycle compressor, a
stripper and a vapor/liquid separator. Four gaseous reactants (A, C, D and E) are fed into the reactor to form two products (G and H) along with a byproduct (F) and an inert (B). Several control schemes and identification studies for the plant have been reported. In our experiment, the decentralized control strategy proposed by Ricker [20] is implemented to generate simulated process data. The Matlab simulation codes can be downloaded from its website: http://depts.washington.edu/control/LARRY/TE/. There are six prespecified operation modes in the platform to meet the demands of different production grades, among which three modes are chosen in this simulation, as shown in Table 1. The training data are obtained by running the simulation for 75 h under normal operation conditions: the first 25 h under Mode 1, the next 25 h under Mode 2 and the last 25 h under Mode 3. The sampling time is 3 min, so 500 samples are generated under each operation mode, and the 1500 samples constitute the historical data. Moreover, the 51 process variables consist of three parts, namely, 22 continuous measurements, 19 component measurements and 12 manipulated variables, among which the 22 continuous measurements are chosen as monitored variables.
Table 1 Three operation modes used in experiment
Mode  Mass ratio G/H  Production rate
1     50/50           7038 kg·h−1 G and 7038 kg·h−1 H
2     10/90           1111 kg·h−1 G and 10000 kg·h−1 H
3     90/10           10000 kg·h−1 G and 1111 kg·h−1 H

Table 2 Detection delay for the 20 predefined disturbances under operation Mode 1 (min)
Case  AdPCA T²   AdPCA SPE  DPCA T²    DPCA SPE   GMM BIP    LPP-FCM BID
1     30         33         6          0          3          0
2     111        42         21         36         36         18
3     606        561        558        546        585        495
4     603        failed     567        failed     594        453
5     597        failed     failed     597        603        555
6     0          6          0          0          0          0
7     357        27         21         15         45         6
8     failed     561        594        failed     540        498
9     147        135        141        132        129        111
10    153        162        102        9          12         6
11    21         21         153        15         15         9
12    75         81         72         66         84         57
13    60         48         90         57         69         42
14    9          15         9          12         6          6
15    597        582        588        failed     606        570
16    591        failed     516        failed     585        516
17    144        123        111        108        99         12
18    108        99         303        264        231        99
19    291        171        288        165        504        141
20    177        159        174        162        153        147

Table 3 Detection delay for the 20 predefined disturbances under operation Mode 3 (min)
Case  AdPCA T²   AdPCA SPE  DPCA T²    DPCA SPE   GMM BIP    LPP-FCM BID
1     18         24         9          21         18         9
2     147        114        48         30         30         27
3     585        606        573        468        648        498
4     597        603        591        588        612        561
5     6          6          3          3          6          3
6     0          3          3          0          0          0
7     495        477        471        447        444        444
8     45         42         42         364        48         42
9     579        573        456        429        438        408
10    135        123        129        111        114        93
11    75         24         477        24         15         12
12    24         27         21         21         15         15
13    108        144        84         75         84         72
14    27         18         366        15         21         6
15    621        606        591        552        666        519
16    609        603        573        561        573        498
17    99         99         96         90         102        96
18    240        216        240        141        201        108
19    48         33         42         39         45         30
20    168        177        171        150        159        150

Tables 2 and 3 give the monitoring results of four different approaches, i.e. AdPCA, DPCA (dynamic principal component analysis), GMM and the proposed LPP-FCM; the best result for each disturbance under both Mode 1 and Mode 3 is the shortest delay in its row. From these two tables, we can draw the following conclusions. On average, all four methods give better monitoring results for most of the disturbances under Mode 3 than under Mode 1, due to the different control mechanisms of the operation modes. AdPCA and DPCA, which are PCA based algorithms, tend to produce longer detection delays than the Bayesian based approaches, i.e. GMM and LPP-FCM, in this simulation, because for PCA based approaches only one local PCA model is chosen and applied to every online sample and the information from the other PCA models is neglected. This may lead to biased monitoring results and increase the detection delay. In contrast, for algorithms using Bayesian based monitoring statistics, the information from every local model is integrated into the final global index, enhancing the reliability of the monitoring results. It is noted that, among the simulation cases, LPP-FCM gives the best monitoring results. These improvements result from (1) the sensitivity to abnormal data in the preprocessing procedure, (2) the rational FCM models established for multimode data and their optimal local monitoring statistics, and (3) statistical information from all local
clusters contained in the final BID statistic, which makes the monitoring results comprehensive and reliable. In the following, two disturbances, i.e. disturbance 4 under Mode 1 and disturbance 10 under Mode 3, are analyzed in more detail. The fourth disturbance involves a step change of the inlet temperature of the reactor cooling water under Mode 1. This change is well suppressed by the control system and does not show observable influence on most monitored variables. Only two variables (the inlet temperature of the reactor cooling water and the inlet temperature of the condenser cooling water) deviate from their normal regions after a long time delay. The monitoring results of the four methods are displayed in Fig. 2. The T² and SPE statistics of AdPCA do not show clear fault indications until the 20th hour, which is a 10 h delay after the disturbance is introduced. This disturbance is also detected with many false negatives (type-II errors), which is undesirable for a monitoring algorithm. The SPE statistic of DPCA fails completely to detect this abnormal event, while its T² statistic detects the disturbance with considerable false positives (type-I errors) during the initial normal operation period. Fig. 2 (c) illustrates the monitoring results generated by GMM. Since BIP is a probability index with a value range of [0, 1], the fault symptoms are quite vague when detecting a subtle disturbance such as the one in this case. Compared with the former three monitoring algorithms, the proposed LPP-FCM approach gives the clearest sign of this fault, as illustrated in Fig. 2 (d). The BID statistic detects this disturbance with the shortest time delay, and after that, the monitoring statistic stays well above the 95% control limit until the end of the test. This case shows that the proposed LPP-FCM approach
is more sensitive to subtle faults than the other multimode monitoring methods compared. Disturbance 10 is a random variation of the temperature of feed D in stream 2 under Mode 3. Its effect is not widespread in the system either. The process variables influenced include the temperatures of the separator and stripper, the compressor work, and the coolant temperature of the reactor. As observed in the simulation, most of the variations propagate with a long time delay. It is important for data-driven algorithms to detect this fault as soon as possible. The monitoring results are displayed in Fig. 3. All four monitoring methods detect this fault successfully, and their fault symptoms are quite obvious. The main difference lies in the detection delay. The SPE statistic of AdPCA detects this fault 123 min after the fault is introduced, 12 min earlier than its T² statistic. For DPCA, although its SPE statistic detects the disturbance with a delay of only 111 min, it produces too many false positives (type-I errors) during the normal periods, and will thus result in false alarms. The GMM also shows clear features of random variation in its BIP statistic, and its time delay is 114 min. Our LPP-FCM approach detects the disturbance with a 93 min delay, which is the minimum among the six monitoring statistics of the four algorithms. The test scenarios of the TE example demonstrate the potential of the proposed approach to detect various types of faults in industrial processes with multiple modes.
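The detection delays discussed above follow from the alarm rule of step (8) in Section 2.4, which requires BID to exceed DL for several consecutive samples. The Python sketch below shows one possible way to turn a sequence of BID values into a delay in minutes. The function name `detection_delay`, the run length of three samples and the convention of counting the delay up to the first sample of the alarming run are assumptions made for illustration; the paper does not state them. The 3 min sampling time is taken from the simulation setup.

```python
def detection_delay(bid, limit, fault_index, n_consecutive=3, sample_minutes=3):
    """Minutes from fault onset to the first alarm, or None if no alarm is raised.

    An alarm needs n_consecutive successive samples above the control limit
    (assumed run length); the delay is counted up to the first sample of that run.
    """
    run = 0
    for idx in range(fault_index, len(bid)):
        run = run + 1 if bid[idx] > limit else 0
        if run == n_consecutive:
            first = idx - n_consecutive + 1        # first sample of the alarming run
            return (first - fault_index) * sample_minutes
    return None

# Hypothetical BID trace: fault introduced at sample 4, control limit of 3.
bid_trace = [1, 1, 1, 1, 5, 1, 5, 5, 5, 5]
delay = detection_delay(bid_trace, limit=3, fault_index=4)
no_alarm = detection_delay([1] * 10, limit=3, fault_index=4)
```

In the hypothetical trace the isolated exceedance at sample 4 is ignored, and the alarm is attributed to the run that starts at sample 6. A case with no qualifying run returns `None`, which corresponds to the "failed" entries of Tables 2 and 3.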
Figure 2 Monitoring results of disturbance 4 under Mode 1 using AdPCA (a), DPCA (b), GMM (c), and the proposed LPP-FCM approach (d)
Figure 3 Monitoring results of disturbance 10 under Mode 3 using AdPCA (a), DPCA (b), GMM (c), and the proposed LPP-FCM approach (d)
4 CONCLUSIONS
In this paper, a systematic monitoring strategy for multimode processes is proposed. It has the following advantages compared with other multimode process monitoring methods.
(1) The multimode feature of the historical data is preserved during the dimension reduction step. By projecting the high-dimensional historical data into a low-dimensional LPP subspace, the computational burden of the subsequent FCM training procedure is greatly reduced, and the sensitivity in the new subspace is also greatly improved.
(2) By separating the multimode data into several overlapping clusters through FCM, the historical data are well fitted and statistically described, which makes the monitoring model more accurate and reasonable.
(3) By integrating all the local distance indices into a global BID statistic, the monitoring results are more reliable. In particular, the integrated monitoring index permits certain samples, such as transient data and startup data, to belong to multiple clusters with different membership values simultaneously, which keeps the monitoring uninterrupted and continuous for all kinds of online samples. This feature is of practical importance for industrial monitoring algorithms.
REFERENCES
1 Qin, S.J., “Statistical process monitoring: Basics and beyond”, J. Chemometr., 17 (8-9), 480-502 (2003).
2 Zhao, S.J., Zhang, J., Xu, Y.M., “Monitoring of processes with multiple operation modes through multiple principle component analysis models”, Ind. Eng. Chem. Res., 43 (22), 7025-7035 (2004).
3 Zhao, S.J., Zhang, J., Xu, Y.M., “Performance monitoring of processes with multiple operating modes through multiple PLS models”, J. Process Control, 16 (7), 763-772 (2006).
4 Li, R.Y., Rong, G., “Fault isolation by partial dynamic principal component analysis in dynamic process”, Chin. J. Chem. Eng., 14 (4), 486-493 (2006).
5 Xiong, L., Liang, J., Qian, J.X., “Multivariate statistical process monitoring of an industrial polypropylene catalyzer”, Chin. J. Chem. Eng., 15 (4), 524-532 (2007).
6 Li, Y.F., Wang, Z.F., Yuan, J.Q., “On-line fault detection using SVM-based dynamic MPLS for batch processes”, Chin. J. Chem. Eng., 14 (6), 754-758 (2006).
7 Zhao, C.H., Wang, F.L., Lu, N.Y., Jia, M.X., “Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes”, J. Process Control, 17 (9), 728-741 (2007).
8 Xiong, L., Liang, J., Qian, J.X., “Multivariate statistical process monitoring of an industrial polypropylene catalyzer reactor with component analysis and kernel density estimation”, Chin. J. Chem. Eng., 15 (4), 524-532 (2007).
9 Ng, Y.S., Srinivasan, R., “An adjoined multi-model approach for monitoring batch and transient operations”, Comput. Chem. Eng., 33 (4), 887-902 (2009).
10 Liu, J., Chen, D.S., “Nonstationary fault detection and diagnosis for multimode processes”, AIChE J., 56 (1), 207-219 (2010).
11 Chen, J.H., Liu, J.L., “Using mixture principal component analysis networks to extract fuzzy rules from data”, Ind. Eng. Chem. Res., 39 (7), 2355-2367 (2000).
12 Liu, J.L., “Fault detection and classification for a process with multiple production grades”, Ind. Eng. Chem. Res., 47 (21), 8250-8262 (2008).
13 Yu, J., Qin, S.J., “Multimode process monitoring with Bayesian inference-based finite Gaussian mixture models”, AIChE J., 54 (7), 1811-1829 (2008).
14 Yu, J., Qin, S.J., “Multiway Gaussian mixture model based multiphase batch process monitoring”, Ind. Eng. Chem. Res., 48 (18), 8585-8594 (2009).
15 Ge, Z.Q., Song, Z.H., “Multimode process monitoring based on Bayesian method”, J. Chemometr., 23 (12), 636-650 (2009).
16 He, X.F., Niyogi, P., “Locality preserving projections”, Advances in Neural Information Processing Systems, 16, 153-160 (2003).
17 Belkin, M., Niyogi, P., “Laplacian eigenmaps for dimensionality reduction and data representation”, Neural Comput., 15 (6), 1373-1396 (2003).
18 Bezdek, J.C., Ehrlich, R., “FCM: The fuzzy C-means clustering algorithm”, Comput. Geosci-Uk, 10 (2-3), 191-203 (1984).
19 Downs, J.J., Vogel, E.F., “A plant-wide industrial process control problem”, Comput. Chem. Eng., 17 (3), 245-255 (1993).
20 Ricker, N.L., “Decentralized control of the Tennessee Eastman challenge process”, J. Process Control, 6 (4), 205-222 (1996).