Multimode Process Monitoring Based on Fuzzy C-means in Locality Preserving Projection Subspace


PROCESS MONITOR Chinese Journal of Chemical Engineering, 20(6) 1174—1179 (2012)

Multimode Process Monitoring Based on Fuzzy C-means in Locality Preserving Projection Subspace*

XIE Xiang (解翔) and SHI Hongbo (侍洪波)**

Key Laboratory of Advanced Control and Optimization for Chemical Processes, East China University of Science and Technology, Ministry of Education, Shanghai 200237, China

Abstract For complex industrial processes with multiple operational conditions, it is important to develop effective monitoring algorithms to ensure the safety of production processes. This paper proposes a novel monitoring strategy based on fuzzy C-means. The high-dimensional historical data are transferred to a low-dimensional subspace spanned by locality preserving projection. Then the scores in the new subspace are classified into several overlapping clusters, each representing an operational mode. The distance statistics of each cluster are integrated through the membership values into a novel BID (Bayesian inference distance) monitoring index. The efficiency and effectiveness of the proposed method are validated through the Tennessee Eastman benchmark process.

Keywords multimode process monitoring, fuzzy C-means, locality preserving projection, integrated monitoring index, Tennessee Eastman process

1 INTRODUCTION

Multimode operation is one of the most common features in modern industrial processes due to various demands from markets. To ensure the safety of production, it is meaningful to develop effective monitoring methods for multimode processes. Once abnormal events occur during production, faulty symptoms should be reflected in the monitoring chart and fault alarms should be triggered. Recently, this area has been intensively studied and many statistical monitoring algorithms have been reported. For processes with multiple operational modes, the most intuitive idea is to build separate models for different modes, which is adopted by most of the literature in this field. Since principal component analysis (PCA) and partial least squares (PLS) are the most mature monitoring techniques in single-mode process monitoring, they are extended to multimode processes in a multiple-model way, combined with other pattern classification algorithms, and the traditional $T^2$ and SPE (squared prediction error) statistics are used as monitoring indices [1-10].

Fuzzy c-means (FCM) is one of the artificial intelligence tools for pattern recognition, and it is usually implemented in the preprocessing procedure in the statistical process monitoring field. FCM was utilized to classify multimode data into different groups, and a separate PCA model was built for each data group [11, 12]. Along the same line, by combining FCM with PCA, Ng and Srinivasan [9] proposed a so-called AdPCA (adjoined principal component analysis) approach and validated its efficiency through a distillation unit and a multiphase penicillin cultivation process. However, the most significant disadvantage of these methods is that, for each online sample, only one local model is adopted while calculating the monitoring statistics and the information from the other models is neglected, which may lead to biased monitoring results. To overcome this deficiency, Yu and Qin proposed a Bayesian inference based Gaussian mixture model (GMM) approach for process monitoring [13, 14], in which the local probability indices are integrated through the posterior probability of each Gaussian component. The monitoring index, named the BIP (Bayesian inference probability) statistic, contains the statistical information from all the Gaussian models, so the monitoring results are more comprehensive and reliable. Ge and Song further implemented this Bayesian based approach successfully to monitor a polypropylene production process with multiple operation modes [15].

In this study, we propose a novel monitoring strategy for processes with multimode features. High-dimensional historical data are projected into a low-dimensional LPP (locality preserving projection) subspace, where the local cluster structure is preserved and the sensitivity to outliers is improved. The FCM is trained to classify the scores into several clusters, each corresponding to an operation mode. Then the Mahalanobis distances of every online sample to all the clusters are integrated through Bayesian rules. Thus the local information from each individual cluster is contained in the final monitoring statistic, which enhances the reliability of the monitoring results. The effectiveness and efficiency of the approach are verified through the well-known TE (Tennessee Eastman) simulation platform.

Received 2012-05-29, accepted 2012-07-26.
* Supported by the National Natural Science Foundation of China (61074079) and Shanghai Leading Academic Discipline Project (B054).
** To whom correspondence should be addressed. E-mail: [email protected]

Chin. J. Chem. Eng., Vol. 20, No. 6, December 2012

2 METHODOLOGY

2.1 Dimension reduction by LPP

Locality preserving projection (LPP) is a geometrically motivated manifold learning method [16]. In our approach, LPP is utilized to reduce the dimension of the collected process variables. Since multimode process data usually distribute in different clusters in their original space and LPP is able to preserve local structures while reducing the dimension, it is reasonable to adopt LPP as the pretreatment tool for multimode data. Similar to PCA, LPP seeks a transformation matrix $A$ that maps the high-dimensional samples $x_1, \dots, x_n \in \mathbb{R}^m$, collected as the columns of $X = [x_1, \dots, x_n]$, into low-dimensional scores $T = [t_1, \dots, t_n]$ with $t_i \in \mathbb{R}^p$ ($p \ll m$), so that $t_i$ represents $x_i$, where $t_i = A^T x_i$, $i = 1, 2, \dots, n$.

To find the transformation matrix $A$ and preserve the local information, an adjacency graph is constructed first. Let $G$ denote a graph with $n$ nodes. Nodes $x_i$ and $x_j$ are connected by an edge if $x_i$ is among the $k$ nearest neighbors of $x_j$, or $x_j$ is among the $k$ nearest neighbors of $x_i$. After that, a symmetric matrix $W$ is defined to weigh the edges. Let $W_{ij}$ represent the weight of the edge joining vertices $x_i$ and $x_j$: $W_{ij} = 1$ if and only if vertices $i$ and $j$ are connected, and $W_{ij} = 0$ if there is no such edge [17]. Next, solve the following generalized eigenvector problem

$$X L X^T a = \lambda X M X^T a \tag{1}$$

where $\lambda$ is an eigenvalue and $a$ is its corresponding eigenvector, $M$ is a diagonal matrix whose entries are the column (or row) sums of $W$, namely $M_{ii} = \sum_j W_{ji}$, and $L = M - W$ is the Laplacian matrix. Let the column vectors $a_1, \dots, a_m$ be the solutions of Eq. (1), arranged according to their eigenvalues, $\lambda_1 < \dots < \lambda_m$. The projection can be written as

$$t_i = A^T x_i, \quad A = [a_1, \dots, a_p], \quad p \ll m \tag{2}$$

where $t_i$ is a $p$-dimensional score vector and $A$ is an $m \times p$ matrix. In the spanned LPP subspace, the computation burden of modeling is greatly reduced and the sensitivity to abnormal data is also improved significantly. As a preprocessing tool, these characteristics facilitate the subsequent modeling and monitoring procedures.

2.2 Fuzzy clustering

In the LPP subspace, the scores of historical data are separated into different meaningful clusters, each representing an operation mode. The objective of fuzzy clustering is to partition the data set $T$ into $K$ clusters with vague boundaries. The objective function is defined as

$$\min_{\alpha, \mu, \Sigma} J(T; \alpha, \mu, \Sigma) = \sum_{i=1}^{K} \sum_{j=1}^{n} (\alpha_{ij})^q D_{ij,\Sigma_i}^2 \tag{3}$$

$$D_{ij,\Sigma_i}^2 = (t_j - \mu_i)^T \Sigma_i^{-1} (t_j - \mu_i) \tag{4}$$

subject to the constraints

$$0 \le \alpha_{ij} \le 1, \quad \sum_{i=1}^{K} \alpha_{ij} = 1 \tag{5}$$

where $\alpha_{ij}$ is the membership value of the $j$-th data score belonging to the $i$-th cluster, $T$ is the set of low-dimensional scores obtained through LPP projection, $q$ is the fuzzifier that determines the fuzziness of the resulting clusters, and $D_{ij,\Sigma_i}^2$ denotes the Mahalanobis distance of $t_j$ to the $i$-th cluster. To train the FCM, the following Gustafson-Kessel (GK) algorithm is adopted. With the scores $\{t_1, \dots, t_n\}$ and an initial estimation $\{\alpha^{(0)}, \mu^{(0)}, \Sigma^{(0)}\}$, two steps are performed iteratively [18].

Step 1 Compute the cluster centers and covariance

$$\mu_i^{(s+1)} = \frac{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q \, t_j}{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q} \tag{6}$$

$$\Sigma_i^{(s+1)} = \frac{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q \left[ t_j - \mu_i^{(s+1)} \right] \left[ t_j - \mu_i^{(s+1)} \right]^T}{\sum_{j=1}^{n} [\alpha_{ij}^{(s)}]^q} \tag{7}$$

Step 2 Compute the distance and update the degrees of membership

$$D_{ij,\Sigma_i}^{2(s+1)} = \left| \Sigma_i^{(s+1)} \right|^{1/p} \left[ t_j - \mu_i^{(s+1)} \right]^T \left( \Sigma_i^{(s+1)} \right)^{-1} \left[ t_j - \mu_i^{(s+1)} \right] \tag{8}$$

$$\alpha_{ij}^{(s+1)} = \frac{1}{\sum_{c=1}^{K} \left[ D_{ij,\Sigma_i}^{(s+1)} \big/ D_{cj,\Sigma_c}^{(s+1)} \right]^{2/(q-1)}} \tag{9}$$

where $\alpha_{ij}^{(s+1)}$, $\mu_i^{(s+1)}$ and $\Sigma_i^{(s+1)}$ are the membership value, the mean and the covariance of the $i$-th cluster at the $(s+1)$-th iteration, respectively, $p$ is the dimension of the scores, and $D_{ij,\Sigma_i}^{(s+1)}$ is the square root of the distance in Eq. (8). By repeating Steps 1 and 2 iteratively, the membership values and the parameters of each cluster are computed until they converge to a specified accuracy. After these training steps, the original process dataset is partitioned into $K$ clusters, which correspond to $K$ operation modes.
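The two GK steps can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gk_fcm`, the optional initial memberships `alpha0`, the stopping tolerance `tol` and the small `ridge` added to each covariance are hypothetical choices. The determinant factor uses the exponent 1/dimension of the clustered scores, which is the standard GK volume normalization and is assumed here, and the membership update of Eq. (9) is written in its equivalent form for squared distances.

```python
import numpy as np

def gk_fcm(T, K, q=2.0, alpha0=None, n_iter=200, tol=1e-6, ridge=1e-8, seed=0):
    """Gustafson-Kessel fuzzy clustering of scores T with shape (n, p).

    Returns memberships alpha (K, n), centers mu (K, p) and
    covariances Sigma (K, p, p).
    """
    T = np.asarray(T, dtype=float)
    n, p = T.shape
    if alpha0 is None:
        # Hypothetical default: random memberships as the initial estimate.
        alpha = np.random.default_rng(seed).random((K, n))
    else:
        alpha = np.array(alpha0, dtype=float)
    alpha /= alpha.sum(axis=0, keepdims=True)       # constraint of Eq. (5)

    for _ in range(n_iter):
        w = alpha ** q                               # fuzzified weights
        w_sum = w.sum(axis=1)
        # Step 1: cluster centers, Eq. (6), and covariances, Eq. (7).
        mu = (w @ T) / w_sum[:, None]
        Sigma = np.empty((K, p, p))
        D2 = np.empty((K, n))
        for i in range(K):
            diff = T - mu[i]
            Sigma[i] = (w[i][:, None] * diff).T @ diff / w_sum[i]
            Sigma[i] += ridge * np.eye(p)            # assumption: numerical safeguard
            # Step 2: volume-normalized squared distance, Eq. (8).
            volume = np.linalg.det(Sigma[i]) ** (1.0 / p)
            solved = np.linalg.solve(Sigma[i], diff.T).T
            D2[i] = volume * np.sum(diff * solved, axis=1)
        D2 = np.maximum(D2, 1e-12)                   # avoid division by zero
        # Membership update, Eq. (9), expressed with squared distances.
        inv = D2 ** (-1.0 / (q - 1.0))
        new_alpha = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(new_alpha - alpha).max() < tol
        alpha = new_alpha
        if converged:
            break
    return alpha, mu, Sigma
```

Calling `gk_fcm(T, K=3)` on the LPP scores would return one membership row per cluster; each column of `alpha` sums to one, so a transient sample can belong to several modes at once.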

2.3 Integrated monitoring statistics

For monitoring algorithms, effective monitoring statistics are important and should be carefully designed. Since each online sample may belong to several possible operation conditions, the traditional $T^2$ and SPE statistics may be inapplicable. To avoid biased monitoring results, we construct an integrated global monitoring statistic based on Bayesian rules. Theoretically, the membership $\alpha_{ij}$ may be considered as the prior probability that the $j$-th score belongs to the $i$-th cluster. Each cluster is assumed to follow a Gaussian distribution with $\mu$ as mean vector and $\Sigma$ as covariance matrix, so the posterior probability of the score of an arbitrary online sample can be calculated with Bayesian rules as follows

$$P(\mu_i, \Sigma_i \mid t_{\mathrm{new}}) = \frac{\alpha_i \, g(t_{\mathrm{new}} \mid \mu_i, \Sigma_i)}{\sum_{k=1}^{K} \alpha_k \, g(t_{\mathrm{new}} \mid \mu_k, \Sigma_k)} \tag{10}$$

where $g(\cdot)$ is the Gaussian density function. Note that Eq. (4) represents the Mahalanobis distance of $t_j$ to each cluster, which can be directly utilized as a local monitoring index. At a given confidence level, this index indicates whether the monitored sample is normal or faulty, provided that it belongs to the corresponding cluster. Considering that the monitored samples may come from multiple clusters, a global integrated index is further defined to combine the local metrics across all the possible clusters. The proposed index, named BID (Bayesian inference distance), is given by

$$\mathrm{BID} = \sum_{i=1}^{K} P(\mu_i, \Sigma_i \mid t_{\mathrm{new}}) \, D_i^2(t_{\mathrm{new}}) \tag{11}$$

where $D_i^2(t_{\mathrm{new}}) = (t_{\mathrm{new}} - \mu_i)^T \Sigma_i^{-1} (t_{\mathrm{new}} - \mu_i)$ is the distance of Eq. (4) evaluated at $t_{\mathrm{new}}$ for the $i$-th cluster. The upper control limit for BID, denoted as DL, can be calculated through the F-distribution, which is

$$\mathrm{DL} = \frac{p(n^2 - 1)}{n(n - p)} F_{p,\,n-p;\,\gamma} \tag{12}$$

where $n$ and $p$ represent the number of historical samples and the dimension of the scores, respectively, and $F_{p,\,n-p;\,\gamma}$ is the critical value of the F distribution with $p$ and $n - p$ degrees of freedom at the given confidence level $\gamma$, usually 95% or 99% [14].
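Eqs. (10) to (12) can be illustrated with a short NumPy sketch. It is a hedged example rather than the authors' code: the function names are hypothetical, the prior weights $\alpha_i$ are simply taken as given inputs, and the F critical value is passed in as `f_quantile` (it could be obtained, for instance, from `scipy.stats.f.ppf(gamma, p, n - p)`). The posteriors are computed in log space, an implementation choice that avoids underflow when a sample lies far from every cluster.

```python
import numpy as np

def mahalanobis_sq(t, mu, Sigma):
    """Squared Mahalanobis distance of Eq. (4)."""
    diff = np.asarray(t, dtype=float) - np.asarray(mu, dtype=float)
    return float(diff @ np.linalg.solve(Sigma, diff))

def posteriors(t_new, priors, mus, Sigmas):
    """Posterior probability of each cluster given t_new, Eq. (10)."""
    p = len(t_new)
    log_w = []
    for a, mu, S in zip(priors, mus, Sigmas):
        _, logdet = np.linalg.slogdet(S)
        # Log of the Gaussian density g(t_new | mu, S).
        log_g = -0.5 * (mahalanobis_sq(t_new, mu, S) + logdet + p * np.log(2 * np.pi))
        log_w.append(np.log(a) + log_g)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())      # shift by the maximum against underflow
    return w / w.sum()

def bid(t_new, priors, mus, Sigmas):
    """Bayesian inference distance, Eq. (11)."""
    post = posteriors(t_new, priors, mus, Sigmas)
    d2 = np.array([mahalanobis_sq(t_new, mu, S) for mu, S in zip(mus, Sigmas)])
    return float(post @ d2)

def control_limit(n, p, f_quantile):
    """Control limit DL of Eq. (12) for a given F critical value."""
    return p * (n ** 2 - 1) / (n * (n - p)) * f_quantile
```

For two unit-covariance clusters centered at (0, 0) and (10, 0) with equal priors, a sample midway between them receives posterior 0.5 for each cluster, so its BID is the common squared distance 25.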

2.4 Steps for offline modeling and online monitoring

The detailed procedures for the proposed monitoring methodology are as follows.
(1) Collect a set of historical data under all possible operation conditions and scale them to eliminate the effect of different measurement units and scales.
(2) Find the transformation matrix $A$ of Eq. (1) to reduce the dimension of the data while preserving the local cluster structure.
(3) Map the historical data into the LPP subspace with Eq. (2) to obtain the scores of the historical data.
(4) Implement the GK algorithm to train the FCM and learn the parameters of each cluster $\{\mu_1, \Sigma_1, \dots, \mu_K, \Sigma_K\}$ and their corresponding membership values $\{\alpha_1, \dots, \alpha_K\}$.
(5) For each monitored online sample, scale it with the mean and covariance of the historical data and project it into the LPP subspace by Eq. (2).
(6) Compute the posterior probabilities of the score belonging to each cluster via Eq. (10), and calculate its global monitoring index BID with Eq. (11).
(7) Specify a confidence level $\gamma$ and compute the control boundary DL with Eq. (12).
(8) Detect abnormal operation conditions when BID > DL for several consecutive samples.
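Steps (2) and (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `knn_graph` and `lpp`, the neighborhood size `k` and the small `ridge` term are hypothetical choices. Samples are stored as the rows of `X`, so the matrix $X$ of Eq. (1) corresponds to `X.T`, and the generalized eigenproblem is solved with plain NumPy by Cholesky whitening.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 adjacency matrix W of the k-nearest-neighbor graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                     # a sample is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    return np.maximum(W, W.T)                          # edge if either is a neighbor of the other

def lpp(X, p, k=5, ridge=1e-9):
    """Return A (m, p), scores T (n, p) and the p smallest eigenvalues of Eq. (1)."""
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    W = knn_graph(X, k)
    M = np.diag(W.sum(axis=0))
    L = M - W
    S_L = X.T @ L @ X                                  # X L X^T in the notation of Eq. (1)
    S_M = X.T @ M @ X + ridge * np.eye(m)              # X M X^T, slightly regularized
    # Whitening: with S_M = C C^T, Eq. (1) becomes an ordinary symmetric problem.
    C_inv = np.linalg.inv(np.linalg.cholesky(S_M))
    B = C_inv @ S_L @ C_inv.T
    vals, vecs = np.linalg.eigh((B + B.T) / 2.0)       # eigenvalues in ascending order
    A = C_inv.T @ vecs[:, :p]                          # eigenvectors of the p smallest eigenvalues
    return A, X @ A, vals[:p]
```

A new scaled sample `x` would then be projected as `A.T @ x`, in line with Eq. (2).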

3 TE BENCHMARK SIMULATION

The Tennessee Eastman (TE) simulation program is widely applied to evaluate the efficiency of process monitoring techniques [19]. The original version of the simulation program is provided in FORTRAN scripts, where the plant works under open-loop and would be shut down after running two hours due to the high pressure in the reactor. The schematic diagram of TE is shown in Fig. 1. It consists of 5 unit operations, i.e., a reactor, a partial condenser, a recycle compressor, a stripper and a vapor/liquid separator. Four gaseous reactants (A, C, D and E) are fed into the reactor to form two products (G and H) along with a byproduct (F) and an inert (B). Several control schemes and identification studies for the plant have been reported. In our experiment, the decentralized control strategy proposed by Ricker [20] is implemented to generate simulated process data. The Matlab simulation codes can be downloaded from its website: http://depts.washington.edu/control/LARRY/TE/.

Figure 1 Schematic diagram of the Tennessee Eastman process

There are six prespecified operation modes in the platform to meet the demands of different production grades, among which three modes are chosen in this simulation, as shown in Table 1. The training data are obtained by running the simulation for 75 h under normal operation conditions. The first 25 h are run under Mode 1, the next 25 h under Mode 2 and the last 25 h under Mode 3. The sampling time is 3 min, so 500 samples are generated under each operation mode. The 1500 samples constitute the historical data. Moreover, the 53 process variables are made up of three parts, namely, 22 continuous measurements, 19 component measurements and 12 manipulated variables, among which the 22 continuous measurements are chosen as monitored variables.

Table 1 Three operation modes used in experiment

Mode    Mass ratio G/H    Production rate
1       50/50             7038 kg·h⁻¹ G and 7038 kg·h⁻¹ H
2       10/90             1111 kg·h⁻¹ G and 10000 kg·h⁻¹ H
3       90/10             10000 kg·h⁻¹ G and 1111 kg·h⁻¹ H

Table 2 Detection time delays for the 20 predefined disturbances under operation Mode 1 (min)

Mode 1 case   AdPCA T²   AdPCA SPE   DPCA T²   DPCA SPE   GMM BIP   LPP-FCM BID
1             30         33          6         0          3         0
2             111        42          21        36         36        18
3             606        561         558       546        585       495
4             603        failed      567       failed     594       453
5             597        failed      failed    597        603       555
6             0          6           0         0          0         0
7             357        27          21        15         45        6
8             failed     561         594       failed     540       498
9             147        135         141       132        129       111
10            153        162         102       9          12        6
11            21         21          153       15         15        9
12            75         81          72        66         84        57
13            60         48          90        57         69        42
14            9          15          9         12         6         6
15            597        582         588       failed     606       570
16            591        failed      516       failed     585       516
17            144        123         111       108        99        12
18            108        99          303       264        231       99
19            291        171         288       165        504       141
20            177        159         174       162        153       147

Table 3 Detection time delays for the 20 predefined disturbances under operation Mode 3 (min)

Mode 3 case   AdPCA T²   AdPCA SPE   DPCA T²   DPCA SPE   GMM BIP   LPP-FCM BID
1             18         24          9         21         18        9
2             147        114         48        30         30        27
3             585        606         573       468        648       498
4             597        603         591       588        612       561
5             6          6           3         3          6         3
6             0          3           3         0          0         0
7             495        477         471       447        444       444
8             45         42          42        364        48        42
9             579        573         456       429        438       408
10            135        123         129       111        114       93
11            75         24          477       24         15        12
12            24         27          21        21         15        15
13            108        144         84        75         84        72
14            27         18          366       15         21        6
15            621        606         591       552        666       519
16            609        603         573       561        573       498
17            99         99          96        90         102       96
18            240        216         240       141        201       108
19            48         33          42        39         45        30
20            168        177         171       150        159       150

Tables 2 and 3 give the monitoring results of four different approaches, i.e. AdPCA, DPCA (dynamic principal component analysis), GMM and the proposed LPP-FCM; for each disturbance under both Mode 1 and Mode 3, the best result is the shortest detection delay in the corresponding row. From these two tables, we can draw the following conclusions. On average, all four methods give better monitoring results for most of the disturbances under Mode 3 than under Mode 1, due to different control mechanisms for different operation modes. AdPCA and DPCA, which belong to PCA based algorithms, are prone to producing longer detection delays than the Bayesian based approaches, i.e. GMM and LPP-FCM in this simulation, because for PCA based approaches, only one local PCA model is chosen and applied for every online sample and all the information from the other PCA models is neglected. This may lead to biased monitoring results and increase the detection time delay. In contrast, for algorithms using Bayesian based monitoring statistics, the information from every local model is integrated into the final global index, enhancing the reliability of the monitoring results. Among the simulation cases, LPP-FCM gives the shortest or tied-shortest detection delay for all 20 disturbances under Mode 1 and for 18 of the 20 disturbances under Mode 3. These improvements result from (1) the sensitivity to abnormal data in the preprocessing procedure, (2) the rational FCM models established for multimode data and their optimal local monitoring statistics, and (3) the statistical information from all local clusters contained in the final BID statistic, which makes the monitoring results comprehensive and reliable.

In the following, two disturbances, i.e. disturbance 4 under Mode 1 and disturbance 10 under Mode 3, are analyzed in more detail. The fourth disturbance involves a step change of the inlet temperature of the reactor cooling water under Mode 1. This change is well suppressed by the control system and does not show observable influences on most monitored variables. Only two variables (the inlet temperature of the reactor cooling water and the inlet temperature of the condenser cooling water) deviate from their normal regions after a long time delay. The monitoring results of the four methods are displayed in Fig. 2. The $T^2$ and SPE statistics of AdPCA do not show clear faulty indications until the 20th hour, which is a 10 h delay after the disturbance is introduced. This disturbance is also detected with many false negatives (type-II error), which is undesirable for a monitoring algorithm. The SPE statistic of DPCA fails completely to detect this abnormal event, while its $T^2$ statistic detects this disturbance with considerable false positives (type-I error) during the first normal operation period. Fig. 2 (c) illustrates the monitoring results generated by GMM. Since BIP is a probability index with a value range of [0, 1], the faulty symptoms are quite vague when detecting a subtle disturbance such as that in this case. Compared with the former three monitoring algorithms, the proposed LPP-FCM approach gives the clearest sign for this fault, as illustrated in Fig. 2 (d). The BID statistic detects this disturbance with the shortest time delay, and after that, the monitoring statistic stays high above the 95% control limit until the end of the test time. This case shows that the proposed LPP-FCM approach is more sensitive to subtle faults than the other available multimode monitoring methods.

Disturbance 10 is a random variation of the temperature of feed D in stream 2 under Mode 3. Its effect is not widespread in the system either. The process variables influenced include the temperatures of the separator and stripper, the compressor work, and the coolant temperature of the reactor. As observed in the simulation, most of the variations propagate with a long time delay. It is important for data-driven algorithms to detect this fault as soon as possible. The monitoring results are displayed in Fig. 3. All four monitoring methods detect this fault successfully, and their faulty symptoms are quite obvious and straightforward. The main difference lies in the time delay of detection. The SPE based AdPCA approach detects this fault 123 min after the fault is introduced, which is 12 min earlier than its $T^2$ statistic. For DPCA, although its SPE statistic detects the disturbance with only 111 min delay, it produces too many false positives (type-I error) during the normal periods, and will thus result in false alarms. The GMM also presents clear features of random variation with its BIP statistic, and its time delay is 114 min. Our LPP-FCM approach detects the disturbance with a 93 min time delay, which is the minimum value among all six monitoring statistics of the four algorithms. The test scenarios for the TE example demonstrate the potential of the proposed approach to detect various types of faults in industrial processes with multiple modes.
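How a detection delay such as those in Tables 2 and 3 might be computed from a monitoring chart can be sketched in plain Python. The details here are assumptions made for illustration only: the text states that a fault is declared when BID > DL for several consecutive samples but does not give the count, so `n_consecutive=3` is a hypothetical default, and the delay is measured from the fault onset to the first sample of the alarm run, using the 3 min sampling time.

```python
def detection_delay(stat, limit, fault_index, n_consecutive=3, sample_minutes=3.0):
    """Delay in minutes between fault onset and the first confirmed alarm.

    stat           -- sequence of monitoring statistics (e.g. BID values)
    limit          -- control limit (e.g. DL)
    fault_index    -- index of the sample at which the fault is introduced
    n_consecutive  -- assumed number of consecutive violations needed
    Returns None when the fault is never detected ("failed").
    """
    run = 0
    for idx in range(fault_index, len(stat)):
        run = run + 1 if stat[idx] > limit else 0
        if run == n_consecutive:
            first = idx - n_consecutive + 1      # first sample of the alarm run
            return (first - fault_index) * sample_minutes
    return None
```

Under these assumptions an isolated violation does not count as a detection, which suppresses single-sample false alarms at the cost of confirming the fault a few samples later.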

Figure 2 Monitoring results of disturbance 4 under Mode 1 using AdPCA (a), DPCA (b), GMM (c), and the proposed LPP-FCM approach (d)

Figure 3 Monitoring results of disturbance 10 under Mode 3 using AdPCA (a), DPCA (b), GMM (c), and the proposed LPP-FCM approach (d)

4 CONCLUSIONS

In this paper, a systematic monitoring strategy for multimode processes is proposed. It has the following advantages compared with other multimode process monitoring methods.
(1) The multimode feature of the historical data is preserved while implementing the dimension reduction step. By transferring the high-dimensional historical data into a low-dimensional LPP subspace, the computation burden of the subsequent FCM training procedure is greatly reduced, and the sensitivity of the new subspace is also greatly improved.
(2) By separating multimode data into several overlapping clusters through FCM, the historical data are well fitted and statistically described, which makes the monitoring model more accurate and reasonable.
(3) By integrating all the local distance indices into a global BID statistic, the monitoring results are more reliable. In particular, the integrated monitoring index permits certain samples, such as transient data and startup data, to belong to multiple clusters with different membership values simultaneously, which keeps the monitoring uninterrupted and continuous for all kinds of online samples. This feature is of practical importance for industrial monitoring algorithms.

REFERENCES

1 Qin, S.J., “Statistical process monitoring: Basics and beyond”, J. Chemometr., 17 (8-9), 480-502 (2003).
2 Zhao, S.J., Zhang, J., Xu, Y.M., “Monitoring of processes with multiple operation modes through multiple principle component analysis models”, Ind. Eng. Chem. Res., 43 (22), 7025-7035 (2004).
3 Zhao, S.J., Zhang, J., Xu, Y.M., “Performance monitoring of processes with multiple operating modes through multiple PLS models”, J. Process Control, 16 (7), 763-772 (2006).
4 Li, R.Y., Rong, G., “Fault isolation by partial dynamic principal component analysis in dynamic process”, Chin. J. Chem. Eng., 14 (4), 486-493 (2006).
5 Xiong, L., Liang, J., Qian, J.X., “Multivariate statistical process monitoring of an industrial polypropylene catalyzer”, Chin. J. Chem. Eng., 15 (4), 524-532 (2007).
6 Li, Y.F., Wang, Z.F., Yuan, J.Q., “On-line fault detection using SVM-based dynamic MPLS for batch processes”, Chin. J. Chem. Eng., 14 (6), 754-758 (2006).
7 Zhao, C.H., Wang, F.L., Lu, N.Y., Jia, M.X., “Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes”, J. Process Control, 17 (9), 728-741 (2007).
8 Xiong, L., Liang, J., Qian, J.X., “Multivariate statistical process monitoring of an industrial polypropylene catalyzer reactor with component analysis and kernel density estimation”, Chin. J. Chem. Eng., 15 (4), 524-532 (2007).
9 Ng, Y.S., Srinivasan, R., “An adjoined multi-model approach for monitoring batch and transient operations”, Comput. Chem. Eng., 33 (4), 887-902 (2009).
10 Liu, J., Chen, D.S., “Nonstationary fault detection and diagnosis for multimode processes”, AIChE J., 56 (1), 207-219 (2010).
11 Chen, J.H., Liu, J.L., “Using mixture principal component analysis networks to extract fuzzy rules from data”, Ind. Eng. Chem. Res., 39 (7), 2355-2367 (2000).
12 Liu, J.L., “Fault detection and classification for a process with multiple production grades”, Ind. Eng. Chem. Res., 47 (21), 8250-8262 (2008).
13 Yu, J., Qin, S.J., “Multimode process monitoring with Bayesian inference-based finite Gaussian mixture models”, AIChE J., 54 (7), 1811-1829 (2008).
14 Yu, J., Qin, S.J., “Multiway Gaussian mixture model based multiphase batch process monitoring”, Ind. Eng. Chem. Res., 48 (18), 8585-8594 (2009).
15 Ge, Z.Q., Song, Z.H., “Multimode process monitoring based on Bayesian method”, J. Chemometr., 23 (12), 636-650 (2009).
16 He, X.F., Niyogi, P., “Locality preserving projections”, Advances in Neural Information Processing Systems, 16, 153-160 (2003).
17 Belkin, M., Niyogi, P., “Laplacian eigenmaps for dimensionality reduction and data representation”, Neural Comput., 15 (6), 1373-1396 (2003).
18 Bezdek, J.C., Ehrlich, R., “FCM: The fuzzy C-means clustering algorithm”, Comput. Geosci., 10 (2-3), 191-203 (1984).
19 Downs, J.J., Vogel, E.F., “A plant-wide industrial process control problem”, Comput. Chem. Eng., 17 (3), 245-255 (1993).
20 Ricker, N.L., “Decentralized control of the Tennessee Eastman challenge process”, J. Process Control, 6 (4), 205-222 (1996).