Multimode Process Fault Detection Using Local Neighborhood Similarity Analysis


CJCHE-00083; No of Pages 8 Chinese Journal of Chemical Engineering xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Chinese Journal of Chemical Engineering journal homepage: www.elsevier.com/locate/CJCHE

Process Systems Engineering and Process Safety

Multimode Process Fault Detection Using Local Neighborhood Similarity Analysis☆

Xiaogang Deng ⁎, Xuemin Tian

College of Information and Control Engineering, China University of Petroleum (East China), Qingdao 266580, China

Article info

Article history: Received 10 January 2014; Received in revised form 25 January 2014; Accepted 5 March 2014; Available online xxxx

Keywords: Multimode chemical process; Fault detection; Local neighborhood similarity analysis; Principal component analysis

Abstract

Traditional data-driven fault detection methods assume a unimodal distribution of process data, so they often perform poorly in chemical processes with multiple operating modes. In order to monitor multimode chemical processes effectively, this paper presents a novel fault detection method based on local neighborhood similarity analysis (LNSA). In the proposed method, prior process knowledge is not required and only multimode normal operation data are used to construct a reference dataset. For online monitoring of the process state, LNSA applies a moving window technique to obtain a current snapshot data window. A neighborhood searching technique is then used to acquire the corresponding local neighborhood data window from the reference dataset. Similarity analysis between the snapshot and neighborhood data windows is performed, which includes the calculation of a principal component analysis (PCA) similarity factor and a distance similarity factor. The PCA similarity factor captures the change of data direction, while the distance similarity factor monitors the shift of the data center position. Based on these similarity factors, two monitoring statistics are built for multimode process fault detection. Finally, a simulated continuous stirred tank system is used to demonstrate the effectiveness of the proposed method. The simulation results show that LNSA can detect multimode process changes effectively and performs better than traditional fault detection methods.

© 2014 The Chemical Industry and Engineering Society of China, and Chemical Industry Press. All rights reserved.

1. Introduction

As modern chemical processes become increasingly complex, timely fault detection is necessary to ensure plant safety and increase economic profit. Data-driven fault detection is the most popular technique for large-scale process monitoring because rich running data are available in process databases [1–3]. Principal component analysis (PCA) is one of the classic data-driven fault detection methods and has been successfully applied in many industrial cases. However, the traditional PCA method assumes that the process has only one normal operation mode, while real chemical processes usually run under multiple operation modes because of changes in market demands and raw materials [4].

In order to overcome the shortcomings of traditional methods for multimode chemical processes, many improved methods have been presented. One common fault detection strategy is the multiple modeling method [5,6]. This method divides all normal operating data into different groups by prior knowledge or a clustering technique, and then builds an individual monitoring model for each group. In the work of Natarajan and Srinivasan [7], K-means clustering and PCA were combined to build a multi-model monitoring method. Zhu et al. [8] offered an adjoined multi-ICA–PCA model for fault detection under multiple operating modes. Some related studies were also made by Ge and Song [9] and Zhang et al. [10]. Another kind of multimode process monitoring method is based on a mixture model such as the Gaussian mixture model (GMM) [11], which is used to represent the data sources driven by different operating modes. A principal component based GMM was proposed by Yu [12] for semiconductor process monitoring, and a nonlinear version of GMM was developed by Yu [13] for a multimode wastewater treatment process. Xie and Shi [14] proposed a novel adaptive GMM based multimode process monitoring scheme. However, these methods require some prior knowledge of the process for offline mode identification or mixture modeling.

Recently, a local modeling strategy has been introduced for multimode process monitoring, which does not need prior knowledge of the process and only utilizes the nominal operation data. An early use of local modeling was given by He and Wang [15,16], who used the K nearest neighbor method for fault detection. However, their methods only considered the position change of data points and omitted data direction variations. Ge and Song [17] applied a local least squares support vector machine (LSSVM) to generate the prediction error and built an ICA–PCA monitoring model for the whole process, but online LSSVM training involved solving a complex optimization problem. Ma et al. [18] developed a novel local neighborhood standardization (LNS) strategy and built the LNS-PCA method, which provided a straightforward way to monitor multimode processes but did not consider possible parameter changes in mode switching.

☆ Supported by the National Natural Science Foundation of China (61273160, 61403418), the Natural Science Foundation of Shandong Province (ZR2011FM014), the Fundamental Research Funds for the Central Universities (10CX04046A) and the Doctoral Fund of Shandong Province (BS2012ZZ011).
⁎ Corresponding author. E-mail addresses: [email protected], [email protected] (X. Deng).

http://dx.doi.org/10.1016/j.cjche.2014.09.022 1004-9541/© 2014 The Chemical Industry and Engineering Society of China, and Chemical Industry Press. All rights reserved.

Please cite this article as: X. Deng, X. Tian, Multimode Process Fault Detection Using Local Neighborhood Similarity Analysis, Chin. J. Chem. Eng. (2014), http://dx.doi.org/10.1016/j.cjche.2014.09.022


X. Deng, X. Tian / Chinese Journal of Chemical Engineering xxx (2014) xxx–xxx

Based on the above analysis, this paper proposes a novel multimode process fault detection method called local neighborhood similarity analysis (LNSA). LNSA is constructed within the local modeling framework and detects faults by comparing the similarity of datasets. With this method, complex model optimization is not needed and process parameter changes are considered for operation mode switching. A simulation study on a CSTR is given to test the method.

2. Traditional Methods: PCA and LNS-PCA

PCA is a powerful dimension reduction technique. It produces uncorrelated variables that are linear combinations of the original variables [19,20]. For a given data matrix $X \in R^{n \times m}$, which represents $m$ columns of measured variables at $n$ rows of sample points, PCA decomposes the data matrix $X$ as

$$X = t_1 p_1^T + t_2 p_2^T + \dots + t_k p_k^T + E \tag{1}$$

where $t_i$ is the score vector, $p_i$ is the loading vector, $k$ is the number of principal components (PCs) retained in the PC model, and $E$ is the residual matrix. The loading vector $p_i$ represents the $i$th principal direction in the PC subspace, which corresponds to the $i$th eigenvector of the covariance matrix as

$$\frac{1}{n-1} X^T X p_i = \lambda_i p_i \tag{2}$$

where $\lambda_i$ is the eigenvalue corresponding to eigenvector $p_i$, and $X$ is assumed to be mean centered and variance scaled. When PCA is used in fault detection, two monitoring statistics are constructed as

$$T^2 = x^T P \Lambda^{-1} P^T x \tag{3}$$

$$Q = x^T \left( I - P P^T \right) x \tag{4}$$

where $x$ is a measurement vector with $m$ variables, $\Lambda$ is the diagonal matrix of eigenvalues $\lambda_i$, and $P = [p_1\ p_2\ \dots\ p_k]$ is the loading matrix. The confidence limits of the two statistics can be obtained according to the data distribution assumptions [21].

In order to cope with the problems caused by the multimode data characteristic, Ma et al. [18] integrated the LNS preprocessing strategy into PCA monitoring and developed a multimode process fault detection method, LNS-PCA. The LNS strategy standardizes a sample with its local neighborhood mean and standard deviation rather than the fixed global ones. Its main idea is as follows. For any testing sample $x_t$, its $K$ nearest neighbors in the training dataset $X$ are defined as

$$N(x_t) = \left\{ x_t^1, x_t^2, \dots, x_t^K \right\} \tag{5}$$

where the $K$ nearest neighbors should satisfy the distance relationship

$$d\left(x_t, x_t^1\right) \le d\left(x_t, x_t^2\right) \le \dots \le d\left(x_t, x_t^K\right) \tag{6}$$

where $d(x_t, x_t^i)$ $(i = 1, \dots, K)$ represents the Euclidean distance between $x_t$ and $x_t^i$. Based on the local neighborhood samples, the local neighborhood standardization is performed as [18]

$$\tilde{x}_t = \frac{x_t - m(N(x_t))}{s(N(x_t))} \tag{7}$$

where $m(N(x_t))$ and $s(N(x_t))$ are the mean and standard deviation of the neighborhood set, respectively.

With the use of local neighborhood standardization, LNS-PCA can monitor a multimode process more effectively than the classical PCA method. However, LNS-PCA applies a unified statistical model to all modes, which omits the variation of model parameters under different modes and may degrade the detection performance.

3. Local Neighborhood Similarity Analysis

3.1. Motivation

This paper aims to develop a data-driven multimode process fault detection method with the following traits. Firstly, prior process knowledge about mode partition is not required and only normal operation data are used for statistical model building. Secondly, the proposed method is straightforward and involves no complex online optimization. Thirdly, not only the steady point changes but also the parameter variations are considered for the switching of operation modes.

Based on the above requirements, this work presents a multimode process fault detection method, LNSA. The method integrates local neighborhood data searching and similarity analysis. Its main idea is depicted in Fig. 1. In online process monitoring, a moving window technique is applied to collect a current snapshot data window. The snapshot data window is updated online and indicates the current operation mode. For any snapshot data window, its local neighborhood data are searched from the multimode historical training dataset and a local neighborhood data window is built as a reference dataset. Two similarity factors between the snapshot and neighborhood data windows are calculated to judge whether a fault has appeared.

3.2. Local neighborhood data searching

In the LNSA based monitoring strategy, neighbors of the currently monitored data are required to cope with the multimode data characteristic. For a testing sample $x_t$, its local neighbors are determined by its distance to the training samples. The nearest neighbor of $x_t$ can be searched according to

$$x_t^{nn} = \arg\min_{x_i \in X} d(x_t, x_i) \tag{8}$$

Fig. 1. The framework of LNSA method.


where $X \in R^{n \times m}$ is the training dataset, and $d(\cdot)$ is the distance metric. Usually the distance is computed by the squared Euclidean norm as

$$d(x_t, x_i) = \left\| x_t - x_i \right\|^2. \tag{9}$$

For a testing data window including $L$ samples, $X^S \in R^{L \times m}$, its nearest neighbor can be defined as

$$X^N = \arg\min_{X_i \in X} d\left(X^S, X_i\right) \tag{10}$$

where $X^S = \{x_1^S, x_2^S, \dots, x_L^S\}$, $X_i = \{x_i, x_{i+1}, \dots, x_{i+L-1}\}$ is a data series of $L$ samples extracted from the training dataset $X$, and the distance function is defined as

$$d\left(X^S, X_i\right) = \frac{1}{L} \sum_{l=1}^{L} \left\| x_l^S - x_{i+l-1} \right\|^2. \tag{11}$$
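The window search of Eqs. (10) and (11) can be sketched in a few lines of Python. This is only an illustrative brute-force version under stated assumptions: the function names `window_distance` and `find_neighborhood_window` are hypothetical, the reference data are assumed to be a NumPy array with one sample per row, and no attempt is made at the faster search that the paper leaves for further study.

```python
import numpy as np


def window_distance(XS, Xi):
    """Eq. (11): mean squared Euclidean distance between two L x m windows."""
    return float(np.mean(np.sum((XS - Xi) ** 2, axis=1)))


def find_neighborhood_window(XS, X_ref):
    """Eq. (10): exhaustive search over all length-L windows of X_ref.

    Returns the start index of the closest window and the window itself.
    (Brute force is an assumption made for clarity, not the authors' code.)
    """
    L = XS.shape[0]
    n = X_ref.shape[0]
    if n < L:
        raise ValueError("reference dataset is shorter than the window")
    dists = np.array(
        [window_distance(XS, X_ref[i:i + L]) for i in range(n - L + 1)]
    )
    best = int(np.argmin(dists))
    return best, X_ref[best:best + L]
```

The cost of this sketch grows with the number of candidate windows times the window width, so a long reference dataset would likely need an indexed or incremental search in practice.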

3.3. Local neighborhood similarity factor

According to Section 3.2, we can obtain a local neighborhood dataset $X^N$ for the online snapshot dataset $X^S$. If the trajectory of the online snapshot dataset $X^S$ is very similar to the trajectory of its local neighborhood dataset $X^N$, the process state is classified as some normal operation mode. Otherwise, if the online snapshot dataset $X^S$ exhibits some deviations from $X^N$, it is concluded that some fault occurs. The deviations can be understood from two aspects: one is the shift of the data center position, and the other is the change of data direction. A graphical illustration is displayed in Fig. 2. Fig. 2(a) shows that datasets A and B have similar data directions but different data center positions. In Fig. 2(b), the center positions of the two datasets are close but their data directions are clearly distinguished.

Fig. 2. A comparison of different data deviations. [Plot omitted: panels (a) and (b) show datasets dataA and dataB in the x1–x2 plane.]

Based on the above analysis, the similarity analysis includes the comparisons of dataset direction and center position. Therefore, LNSA builds two similarity factors: the local neighborhood PCA (LNPCA) similarity factor and the local neighborhood distance similarity factor.

The LNPCA similarity factor is introduced first. It is known that the dataset direction can be described in a low dimensional principal component subspace by PCA [22]. Thus the direction similarity of two datasets can be formulated by the angles between two principal component subspaces. For the online snapshot dataset $X^S$ and its local neighborhood dataset $X^N$, their principal component subspaces can be constructed by selecting only the first $k$ principal components for each dataset, which are denoted by $L^S$ and $L^N$, respectively. Here the principal components are actually the loading vectors in the PCA decomposition. Thus an LNPCA similarity factor is built as [22]

$$S_{LNPCA}\left(X^S, X^N\right) = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=1}^{k} \cos^2 \theta_{ij} \tag{12}$$

where $\theta_{ij}$ is the angle between the $i$th principal component $l_i^N$ of $L^N$ and the $j$th principal component $l_j^S$ of $L^S$, and $k$ is the maximal value of the retained principal component numbers in the PCA decompositions of datasets $X^N$ and $X^S$. The angle cosine $\cos \theta_{ij}$ can be written as

$$\cos \theta_{ij} = \frac{\left(l_i^N\right)^T l_j^S}{\left\| l_i^N \right\| \left\| l_j^S \right\|}. \tag{13}$$

Combining Eqs. (12) and (13), the LNPCA similarity factor is represented as

$$S_{LNPCA}\left(X^S, X^N\right) = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=1}^{k} \left( \frac{\left(l_i^N\right)^T l_j^S}{\left\| l_i^N \right\| \left\| l_j^S \right\|} \right)^2 = \frac{\mathrm{trace}\left( \left(L^N\right)^T L^S \left(L^S\right)^T L^N \right)}{k}. \tag{14}$$

According to the procedure of PCA, the amount of variance described by each principal component is usually different. Thus the equal consideration of each principal component in Eq. (14) may be inappropriate. To take into account the variance explained by each principal direction [23], a weighted LNPCA similarity factor is defined as

$$S_{LNPCA}^{M}\left(X^S, X^N\right) = \frac{\mathrm{trace}\left( \left(M^S\right)^T M^N \left(M^N\right)^T M^S \right)}{\sum_{i=1}^{k} \lambda_i^N \lambda_i^S} \tag{15}$$

where

$$M^N = L^N \Lambda^N, \quad M^S = L^S \Lambda^S \tag{16}$$

$$\Lambda^N = \mathrm{diag}\left( \sqrt{\lambda_1^N}, \sqrt{\lambda_2^N}, \cdots, \sqrt{\lambda_k^N} \right) \tag{17}$$

$$\Lambda^S = \mathrm{diag}\left( \sqrt{\lambda_1^S}, \sqrt{\lambda_2^S}, \cdots, \sqrt{\lambda_k^S} \right) \tag{18}$$


where the eigenvalues $\lambda_i^N$ and $\lambda_i^S$ correspond to the $i$th principal components of $L^N$ and $L^S$, respectively, with $\lambda_1^N > \lambda_2^N > \cdots > \lambda_k^N$ and $\lambda_1^S > \lambda_2^S > \cdots > \lambda_k^S$.

The LNPCA similarity factor $S_{LNPCA}^{M}$ provides an effective and simple way to measure the similarity between datasets $X^N$ and $X^S$. If the value of $S_{LNPCA}^{M}$ is high and close to 1, the two datasets are considered to be from the same normal operation mode. Otherwise, the online snapshot data window may capture a fault case if the value of $S_{LNPCA}^{M}$ is close to 0.

As mentioned above, in some cases the directions of two datasets are similar but their positions may be different. The usual position metric is the distance between two datasets, which is defined as

$$d_{LN}\left(X^S, X^N\right) = \left(m^N - m^S\right)^T \left(m^N - m^S\right) \tag{19}$$

where $m^N = \frac{1}{L} \sum_{i=1}^{L} x_i^N$ and $m^S = \frac{1}{L} \sum_{i=1}^{L} x_i^S$ are the centers of the local neighborhood and snapshot data windows. Considering that different variables have different standard variances, the distance in Eq. (19) may be unfair for a variable with small standard variance. Thus a weighted distance is given as

$$d_{LN}^{M}\left(X^S, X^N\right) = \left(m^N - m^S\right)^T \Xi^{-1} \left(m^N - m^S\right) \tag{20}$$

where $\Xi$ is the weight matrix defined as

$$\Xi = \mathrm{diag}\{r_1, r_2, \dots, r_m\} \tag{21}$$

where $r_i$ is the standard variance of the $i$th variable in the local neighborhood dataset. Lastly, another similarity factor, called the local neighborhood distance similarity factor, is constructed as

$$S_{LND}^{M}\left(X^S, X^N\right) = \exp\left( -d_{LN}^{M}\left(X^S, X^N\right) \right). \tag{22}$$

The value of $S_{LND}^{M}$ ranges from 0 to 1. If the value of $S_{LND}^{M}$ is close to 1, the two datasets are considered to be similar and the online data window belongs to some normal operation mode. Analogously, some fault may occur if the value of $S_{LND}^{M}$ is close to 0.
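As a rough illustration of Eqs. (15) to (22), the two similarity factors can be computed with NumPy as below. This is a sketch under several assumptions that the text does not settle: each window is only mean-centered before its PCA, the "standard variance" $r_i$ is read as the sample standard deviation of the neighborhood window, and the function names are hypothetical.

```python
import numpy as np


def pca_directions(X, k):
    """First k loading vectors (as columns) and eigenvalues of the
    sample covariance of X. Mean-centering only is an assumption."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]         # keep the k largest
    return eigvecs[:, order], np.clip(eigvals[order], 0.0, None)


def weighted_lnpca_similarity(XS, XN, k):
    """Weighted LNPCA similarity factor, Eqs. (15)-(18)."""
    LS, lamS = pca_directions(XS, k)
    LN, lamN = pca_directions(XN, k)
    MS = LS * np.sqrt(lamS)                       # M^S = L^S Lambda^S
    MN = LN * np.sqrt(lamN)                       # M^N = L^N Lambda^N
    numerator = np.trace(MS.T @ MN @ MN.T @ MS)
    return float(numerator / np.sum(lamN * lamS))


def distance_similarity(XS, XN):
    """Local neighborhood distance similarity factor, Eqs. (20)-(22)."""
    diff = XN.mean(axis=0) - XS.mean(axis=0)
    r = XN.std(axis=0, ddof=1)                    # Xi = diag{r_1, ..., r_m}
    d_weighted = float(diff @ (diff / r))
    return float(np.exp(-d_weighted))
```

Because each window is centered before its eigendecomposition, a pure shift of the data leaves the direction factor unchanged and shows up only in the distance factor, which matches the division of labor between the two factors described above.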

4. Fault Detection Based on LNSA

4.1. Monitoring statistics

In order to monitor a multimode process effectively, two monitoring statistics are constructed based on the similarity factors $S_{LNPCA}^{M}$ and $S_{LND}^{M}$. For a testing data window $X^S(t)$ at time $t$ and its local neighborhood data window $X^N(t)$, the monitoring statistics $T$ and $D$ are given as

$$T = 1 - S_{LNPCA}^{M}\left(X^S(t), X^N(t)\right) \tag{23}$$

$$D = 1 - S_{LND}^{M}\left(X^S(t), X^N(t)\right). \tag{24}$$

Compared with the PCA method, LNSA uses a different viewpoint to construct monitoring statistics. The PCA monitoring statistics $T^2$ and $Q$ in Eqs. (3) and (4) are constructed by analyzing the change of the principal component subspace and the residual subspace, while the LNSA monitoring statistics in Eqs. (23) and (24) are built by investigating the deviations of dataset direction and center position. The $T$ statistic of LNSA monitors the deviation of dataset direction, which is related to the change of the relationship among process variables. The $D$ statistic of LNSA monitors the dataset position deviation, which often reflects the shift of process operation points.

After the monitoring statistics are obtained, the confidence limits are calculated to determine whether the process is in control. Because no prior distribution is considered in this work, the confidence limits of $T$ and $D$ cannot be obtained in the same way as for PCA. Here their confidence limits are computed using a kernel density estimator [24]. From the $T$ and $D$ values of normal operating data, a univariate kernel density estimate is built, and the point below which 95% of the area of the density function lies is used as the confidence limit.

4.2. Fault detection procedure

The fault detection procedure based on LNSA includes two stages: the offline modeling stage and the online detection stage. In the offline modeling stage, normal operating data covering the multiple operation modes are collected and divided into two subsets: a reference dataset and a testing dataset. Offline LNSA is applied to determine the confidence limits of the monitoring statistics. During the online detection stage, newly observed data are collected and online LNSA is used to compute real-time monitoring statistics. If any monitoring statistic exceeds its confidence limit, an alarm signal should be given to operators. The detailed steps are as follows.

Offline modeling stage:
(1) Acquire the normal operating data matrix $X \in R^{n \times m}$ with all possible operation modes and divide it into two subsets: reference dataset $X^R \in R^{n_1 \times m}$ and testing dataset $X^M \in R^{n_2 \times m}$.
(2) Apply the moving window technique to the testing dataset $X^M$ and obtain a series of snapshot data windows with window width $L$ and moving step 1.
(3) Search the local neighborhood data window for each local testing data window in the reference dataset $X^R$.
(4) Compute the local neighborhood similarity factors $S_{LNPCA}^{M}$ and $S_{LND}^{M}$, then get the monitoring statistics $T$ and $D$.
(5) Establish the confidence limits of the monitoring statistics by kernel density estimation.

Online detection stage:
(1) Obtain one new data window $X^S(t) = [x(t - L + 1)\ \dots\ x(t - 1)\ x(t)]^T$ at time $t$ by the moving window method.
(2) Search the local neighborhood data window $X^N(t)$ for dataset $X^S(t)$.
(3) Compute the two similarity factors between $X^N(t)$ and $X^S(t)$.
(4) Calculate the two monitoring statistics $T$ and $D$ according to Eqs. (23) and (24), respectively.
(5) Compare the statistics with the corresponding confidence limits. If some upper control limit is exceeded, abnormal behavior of the process is detected.
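The kernel-density confidence limit used in offline step (5) could be computed along the following lines. This sketch assumes a Gaussian kernel and a normal-reference rule-of-thumb bandwidth, neither of which is specified in the text, and the function name `kde_confidence_limit` is hypothetical.

```python
import math

import numpy as np


def kde_confidence_limit(values, alpha=0.95, bandwidth=None):
    """Point below which a fraction alpha of a Gaussian-kernel density
    estimate lies. Kernel and bandwidth rule are assumptions."""
    v = np.asarray(values, dtype=float)
    if bandwidth is None:
        # Normal-reference rule of thumb: 1.06 * sigma * n^(-1/5).
        bandwidth = 1.06 * v.std(ddof=1) * v.size ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    def cdf(x):
        # CDF of the KDE is the average of the kernel CDFs.
        z = (x - v) / (bandwidth * math.sqrt(2.0))
        return float(np.mean([0.5 * (1.0 + math.erf(zi)) for zi in z]))

    # Bisection on the monotone CDF between safely wide bounds.
    lo = float(v.min()) - 10.0 * bandwidth
    hi = float(v.max()) + 10.0 * bandwidth
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In the offline stage this would be called once on the $T$ values and once on the $D$ values obtained from the normal testing dataset; online statistics are then compared against the two returned limits.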

Fig. 3. Diagram of the CSTR system.


In the monitoring procedure, LNSA detects process faults with two similarity factors: the principal component direction similarity factor and the weighted distance similarity factor. These similarity factors consider both the operation point changes and the parameter variations in multimode process monitoring, so that process faults may be detected effectively.

5. Case Studies

The proposed LNSA process monitoring method is tested with a simulated continuous stirred tank reactor (CSTR) system [22,25,26], as shown in Fig. 3. In the CSTR system a first order irreversible reaction is assumed. The inlet flow of solvent and reactant A produces a single component B as an outlet stream. Heat from the exothermic reaction is removed through the cooling flow of the jacket. The temperature and the liquid level of the reactor are controlled using a cascade control strategy. Based on the mass, energy and component balances, the dynamic model of the CSTR system can be expressed as

$$\frac{dc_A}{dt} = -k_0 e^{-E/RT_R} c_A + \frac{Q_F c_{AF} - Q_F c_A}{A h} \tag{25}$$

$$\frac{dT_R}{dt} = \frac{k_0 e^{-E/RT_R} c_A (-\Delta H)}{\rho C_p} + \frac{Q_F T_F - Q_F T_R}{A h} + \frac{U A_C (T_C - T_R)}{\rho C_p A h} \tag{26}$$

$$\frac{dT_C}{dt} = \frac{Q_C (T_{CF} - T_C)}{V_C} + \frac{U A_C (T_R - T_C)}{\rho_C C_{pC} V_C} \tag{27}$$

$$\frac{dh}{dt} = \frac{Q_F - Q_O}{A} \tag{28}$$

where $A$ is the cross-sectional area of the reactor, $c_A$ is the concentration of species A in the reactor, $c_{AF}$ is the concentration of species A in the feed stream, $C_p$ is the heat capacity of the contents, $C_{pC}$ is the heat capacity of the coolant, $E$ is the activation energy, $h$ is the liquid level of the reactor, $k_0$ is the pre-exponential factor, $Q_F$ is the feed flow rate to the reactor, $Q_O$ is the outlet flow rate, $Q_C$ is the coolant flow rate, $R$ is the universal gas constant, $T_R$ is the reactor temperature, $T_C$ is the temperature of the coolant in the cooling jacket, $T_{CF}$ is the temperature of the coolant feed, $T_F$ is the temperature of the feed stream, $U$ is the heat-transfer coefficient, $A_C$ is the total heat transfer area, $\Delta H$ is the reaction heat, $\rho$ is the density of the contents, and $\rho_C$ is the density of the coolant [22,26].

In the CSTR simulation system, three normal operating modes are designed corresponding to the different operation conditions shown in Table 1. Similar multimode operation is often seen in industrial plants such as the polypropylene reactor. Table 2 lists the ten measured variables in the simulation procedure and the corresponding Gaussian noise standard deviation values. The simulation data of the three normal operating modes are generated with a sample interval of 10 s. One dataset of 900 samples is used as the reference dataset $X^R$, including 300 samples for each normal mode. Another 900 normal operation samples covering all modes are simulated to construct the testing dataset $X^M$, from which the confidence limits of the monitoring statistics are determined.

Table 1 Nominal operating conditions and model parameters for CSTR process

Mode     TR/K     TC/K     cA/mol·L⁻¹   h/dm   QC/L·min⁻¹   QO/L·min⁻¹
Mode 1   402.35   345.44   0.0372       6      15           100
Mode 2   408.35   353.71   0.0273       6      12.36        100
Mode 3   396.35   337.95   0.0510       6      18.43        100

Table 2 Measured variables and noise standard deviation values

Measured variable   Noise standard deviation
cA                  0.0024 mol·L⁻¹
cAF                 0.0024 mol·L⁻¹
TR                  0.45 K
TF                  0.45 K
TC                  0.45 K
TCF                 0.45 K
h                   0.004 m
QO                  0.71 L·min⁻¹
QF                  0.71 L·min⁻¹
QC                  0.32 L·min⁻¹

While the process runs in the different modes, five kinds of fault pattern data with 700 samples each are simulated. For each fault pattern, the fault is introduced after the 240th sample. The detailed fault descriptions are listed in Table 3, which covers process parameter changes, operation condition disturbances and sensor bias. Three methods, PCA, LNS-PCA and LNSA, are applied to detect the faults.

Table 3 Fault patterns for CSTR system

Fault   Mode   Description                          Variable   Value
F1      1      Rise of feed concentration           cAF        +0.6 × 10⁻³ mol·L⁻¹·min⁻¹
F2      1      Rise of feed temperature             TF         +0.05 K·min⁻¹
F3      2      Catalyst deactivation                E/R        +2 K·min⁻¹
F4      2      Heat exchanger fouling               UAC        −120 J·min⁻²·K⁻¹
F5      3      Bias of reactor temperature sensor   TR         +2 K

In the use of the LNSA method, the window width $L$ is a key parameter. Theoretically, a larger $L$ value is necessary to reduce the effect of random noise and measure the data deviations precisely. However, a larger $L$ value leads to a wider data window and may increase the fault detection delay. It is difficult to determine a best $L$ value for all fault cases by theoretical analysis. In this paper the value of $L$ is set to 50 empirically by simulation testing. When LNSA is applied, for sensitive detection, the online snapshot data window and its local neighborhood data window are combined as a whole snapshot window. In all results, a fault is considered detected if six consecutive samples exceed the 95% confidence limit, which is plotted as a dashed line in the following charts. In the following analysis, all monitoring statistics in the monitoring charts are divided by their respective 95% confidence limits so that the alarm limits in all plots are equal to 1.

Fault F1 is illustrated first for the comparison of methods. When fault F1 occurs, the monitoring charts are as shown in Figs. 4–6. The $T^2$ statistic of PCA detects this fault at the 329th sample but the $Q$ statistic fails to find the fault. LNS-PCA does better and its two statistics give alarm signals at the 318th and 411th samples, respectively. The LNSA charts give the best performance, with both statistics detecting this fault at the 298th sample. Thus for fault F1, LNSA displays good detection capability.

Fault F5, which involves a sensor bias, is also simulated for method testing. The monitoring results of the three methods are shown in Figs. 7–9. Because this fault only involves a small bias, it is difficult to detect. When PCA is used for monitoring, no obvious fault signal is given, so PCA fails to detect this fault because it does not consider the multimode data distribution. LNS-PCA in Fig. 8 uses the local neighborhood standardization technique and improves the monitoring performance. Its $T^2$ chart indicates the occurrence of the fault rather clearly, while its $Q$ value stays around the confidence limit. With the consideration of multiple modes and parameter variation, the two monitoring statistics of LNSA work well and both clearly exceed the threshold. These detection results on fault F5 again support the superiority of LNSA.
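The alarm rule stated above (a fault is declared once six consecutive samples exceed the limit, with statistics normalized so that the limit equals 1) can be sketched as follows. The text does not say whether the reported detection sample is the first or the last of the six, so this sketch assumes the latter; the function name `first_detection_index` is hypothetical.

```python
import numpy as np


def first_detection_index(stat, limit=1.0, run_length=6):
    """Return the 0-based index at which the run_length-th consecutive
    exceedance of `limit` is observed, or None if no such run occurs.

    Reporting the last sample of the run is an assumption.
    """
    run = 0
    for i, value in enumerate(np.asarray(stat, dtype=float)):
        run = run + 1 if value > limit else 0   # reset on any in-limit sample
        if run >= run_length:
            return i
    return None
```

For a statistic normalized by its 95% confidence limit, `first_detection_index(T / T_limit)` would give the detection sample under this reading, with `T` and `T_limit` standing for a monitoring series and its limit.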

Fig. 4. PCA monitoring chart for fault F1. [Plot omitted: T² and Q statistics versus sample number.]

Fig. 5. LNS-PCA monitoring chart for fault F1. [Plot omitted: T² and Q statistics versus sample number.]

Fig. 6. LNSA monitoring chart for fault F1. [Plot omitted: T and D statistics versus sample number.]

Fig. 7. PCA monitoring chart for fault F5. [Plot omitted: T² and Q statistics versus sample number.]

Fig. 8. LNS-PCA monitoring chart for fault F5. [Plot omitted: T² and Q statistics versus sample number.]

Fig. 9. LNSA monitoring chart for fault F5. [Plot omitted: T and D statistics versus sample number.]

An overall comparison of PCA, LNS-PCA and LNSA is given in Table 4, which lists the fault detection rates of the three methods on the five fault cases. Here the fault detection rate is defined as the percentage of alarmed samples among all fault samples. A higher fault detection rate means better process monitoring performance. In Table 4, the highest detection rate for each of the five faults is obtained by one of the two LNSA statistics. Therefore, LNSA is the better multimode monitoring method in this case study.

Table 4 Comparison of fault detection rate (%)

Fault   PCA T²   PCA Q   LNS-PCA T²   LNS-PCA Q   LNSA T   LNSA D
F1      84.13    11.96   88.04        70.22       89.13*   87.61
F2      55.65    66.09   75.87        57.17       78.48    89.35*
F3      4.57     6.96    49.57        12.17       54.78    74.35*
F4      4.13     12.83   79.35        52.17       80.00    90.87*
F5      7.83     5.87    96.96        45.87       99.78*   97.83

* indicates the highest fault detection rate for each fault.

6. Conclusions

This paper proposes a novel multimode process fault detection method based on LNSA. Two similarity factors between the local neighborhood data window and the online snapshot data window are constructed for LNSA based fault detection. They are easy to implement and suitable for multimode processes with varying parameters. The case study on a simulated CSTR system demonstrates that the proposed method can effectively deal with the multimode process monitoring problem and outperforms traditional methods. However, some problems of LNSA need further study, such as the determination of the online snapshot window width and the quick searching of the local neighborhood dataset.

Nomenclature

$D$: monitoring statistic of LNSA
$d$: distance metric of two vectors
$d_{LN}$: distance metric of two datasets
$d_{LN}^{M}$: weighted distance metric of two datasets
$E$: residual matrix
$k$: number of principal components retained in PC model
$L$: data window width
$L^N$: principal component matrix of dataset $X^N$
$L^S$: principal component matrix of dataset $X^S$
$l_i^N$: the $i$th principal direction of dataset $X^N$
$l_j^S$: the $j$th principal direction of dataset $X^S$
$M^N$: weighted principal component matrix of dataset $X^N$
$M^S$: weighted principal component matrix of dataset $X^S$
$m^N$: center of local neighborhood
$m^S$: center of snapshot data window
$m$: number of measured variables
$n$: number of sample points
$P$: loading matrix
$p_i$: loading vector
$Q$: monitoring statistic of PCA and LNS-PCA
$S_{LNPCA}$: LNPCA similarity factor
$S_{LNPCA}^{M}$: weighted LNPCA similarity factor
$S_{LND}^{M}$: local neighborhood distance similarity factor
$T$: monitoring statistic of LNSA
$T^2$: monitoring statistic of PCA and LNS-PCA
$t_i$: score vector
$X$: measured data matrix
$X^S$: online snapshot data window
$X^N$: local neighborhood data window
$X^R$: reference dataset
$X^M$: testing dataset
$x$: measurement vector
$x_t$: testing sample
$x_t^1$: the first neighbor of $x_t$
$x_t^i$: the $i$th neighbor of $x_t$
$\tilde{x}_t$: standardized testing sample
$\theta_{ij}$: angle between principal components
$\lambda_i$: eigenvalue
$\Xi$: weighted matrix for distance metric

