Efficient fault detection using multivariate exponentially weighted moving average and independent component analysis

Efficient fault detection using multivariate exponentially weighted moving average and independent component analysis

916 Process SystemsEngineering2003 B. Chenand A.W.Westerberg(editors) 9 2003Publishedby ElsevierScienceB.V. Efficient Fault Detection using Multivar...

306KB Sizes 0 Downloads 63 Views

916

Process SystemsEngineering2003 B. Chenand A.W.Westerberg(editors) 9 2003Publishedby ElsevierScienceB.V.

Efficient Fault Detection using Multivariate Exponentially Weighted Moving Average and Independent Component Analysis Jong-Min Lee", ChangKyoo Yoo b and In-Beum Lee ~

aDepartment of Chemical Engineering, Pohang University of Science and Technology, Pohang, Kyongbuk 790-784, Korea bBIOMATH, Ghent University, Coupure Links 653, B-9000 Gent, Belgium Abstract In this paper, a new statistical process monitoring algorithm is proposed for detecting the process change influenced by a small shift in process variables, which is based on the multivariate exponentially weighted moving average (MEWMA) monitoring concept with independent component analysis (ICA) and kernel density estimation. The proposed monitoring method is applied to fault detection in the simulation benchmark of the biological wastewater treatment process (WWTP). For a small shift in these processes, the simulation results illustrated the monitoring power of MEWMA-ICA versus existing methods of the conventional PCA, ICA, MEWMA-PCA monitoring. Keywords Process monitoring; Fault detection; Exponentially weighted moving average (EWMA); Independent component analysis (ICA)

1. INTRODUCTION The need to analyze high-dimensional and correlated process data has led to the development of many process monitoring schemes that use multivariate statistical methods based on principal component analysis (PCA) and partial least squares (PLS). However, like Shewharttype control charts, 7r and squared prediction error (SPE) charts based on PCA cannot exactly detect process faults with a small disturbance because it uses information only from the most current samples tl]. In order to solve these problems, Wold[21 proposed the exponentially weighted moving PCA and PLS that are generalized to update models dynamically with memory effect. Chen et al.[3] suggested a monitoring method using the combination of PCA and MEWMA. Their approach not only processes information with memory to improve the disturbance detectability but also utilizes the joint statistical measures of T~ and SPE based on the conventional statistical control limits of PCA. However, a process monitoring techniques based on PCA has an indispensable assumption that process data are normally distributed. This assumption is often invalid for measurements from actual chemical processes because of their dynamic, control, and nonlinear nature t4]. As a result, the conventional PCA monitoring may often cause false alarms. In order to overcome these drawbacks and obtain better monitoring performance, we propose a new statistical process monitoring algorithm which is based on the multivariate exponentially weighted moving average (MEWMA) monitoring concept with independent component analysis (ICA) and kernel density estimationtS].

917 2. INDEPENDENT C O M P O N E N T ANALYSIS To introduce the ICA algorithm, it is assumed that d measured variables

Xl,X 2 .....

X d are given

as linear combinations of m (___d ) unknown independent components s ~ , s E , . . . , s m . The independent components and the measured variables are zero mean. The relationship between them is given by X=AS+E where X=[x(1),x(2) ..... x(n)]~ R d'~ is the data matrix, A = [ a I .... ,am] e R d•

(1) is the mixing

matrix, S=[s(1),s(2) ..... s(n)] e R m• is the independent component matrix, E e R d• is the residual matrix, and n is the number of sample. Here, we assume d > m ; when d equals m, the residual matrix, E , becomes the zero matrix. The basic problem of ICA is to estimate the original components S or to estimate A from X without any knowledge of them. Therefore, the objective of ICA is to calculate a separating matrix W so that components of the reconstructed data matrix S, given as S = WX (2) becomes as independent of each other as possible. Using ICA algorithm, we can obtain the rows of S whose norm is 1. From now on, we assume d equals m unless otherwise specified. The initial step in ICA is whitening, also known as sphering, which transforms measured variables x(k) into uncorrelated variables z(k) with unit variance. This pretreatment can be accomplished by PCA. !

The whitening matrix Q is given by Q = A 2 u r , where A = d i a g [ ; q

..... 2d] is diagonal

matrix whit the eigenvalues of the data covariance matrix E(x(k)x r (k) and U is a matrix with the corresponding eigenvalues as its columns. By defining the whitening matrix as Q and B=QA, the relationship between z and s is given as z(k) = Qx(k) = QAs(k) = Bs(k) Since s are mutually independent and z are mutually uncorrelated,

(3)

E{z(k)z r (k)}= B E { s ( k ) s r ( k ) } B r = BB r = I

(4)

We have therefore reduced the problem of finding an arbitrary full-rank matrix A to the simpler problem of finding an orthogonal matrix B, which then gives s(k) = n r z ( k ) = BrQx(k)

(5)

From Eq. (2) and Eq. (5), the relation between W and B can be expressed as W : BrQ (6) To calculate B, it is initialized and then updated so that s may have great non-Gaussianity. The detail ICA algorithm based on maximizing non-Gaussionity to calculate B or W can be found in the paper of Hy~irinen and Oja t6] and Lee et al. ~71. 3. THE MULTIVARIATE EXPONENTIALLY W E I G H T E D MOVING AVERAGE CHART The MEWMA chart proposed by Lowry et al. [81is a natural extension of the EWMA. In the

918 univariate case, suppose that the inputs are Xl,X2,... , the on-target mean of the process is 0 and the variance is o -2 . The EWMA is defined as Yi = 2x; + ( 1 - 2)y~_l

(7)

where 0 < 2 < 1 is a constant and Y0 = 0. It can be shown that if x~, x 2,... are independently identical distributed (i.i.d.) random variables, then the mean of Yi is 0 and the variance is o-y,2 = {211- (1 - 2)2i ]/(2 - 2)}or 2 . A multivariate version of the EWMA (MEWMA) control chart, an extension of the univariate EWMA, is defined as Yi = 2 x i "~"( 1 - 2)yi_ ~

(8)

where 0 < 2 < 1 is a constant and Y0 = 0. MEWMA is a statistical method that gives less weight to older data. Substituting recursively for Yi, with i = 1, 2,..., we obtain i

y, = 2~-] (1-)/,)i-Y xy

(9)

j=!

The covariance matrix of M E W M A is Sy = [ 2 / ( 2 - 2 ) ] S

as i gets larger, where S is the

covariance matrix of original data,. 4. C O M B I N I N G I C A W I T H M E W M A

4.1 Develop normal operating condition (NOC) model 1) The normal operating data are normalized using the mean and standard deviation of each variable. Auto-scaled data form is X.o~t = [x~,x 2.... , x . ] ~ R d~ . 7) Apply M E ~

model to auto-scaled data x~ for i = 1, 2 . . . . . t

Yi - 2xi + (1-2)Y~-i

(10)

where 0 < 2 <_1 is a constant and y0 = 0. 3) Whitening procedure for M E W M A model data Z.orm.., = QY.orma,

(11)

where Y.or~l----[Yl,Y2 ..... Y.]~ Ra~" 4) ICA procedure Obtain W , B , and S . . . . t from S normal "--

WYno~l

= B rZ

....

l

(

12 )

5) Calculate the norm of the row vectors of W and separate W into the deterministic part and the excluded part based on the magnitude of norms tTJ. B and S,or,~t can be separated with the same criterion.

W-->Wd,W e,

B-->Bd,B ,,

6) Calculate I2, Ie 2 , and S P E metrics.

S.orm~t--+Sd,Se

919 I2(k)=Sd(k)rsd(k),

d

2

I e (k)=$e(k)Tse(k),

SPE(k)=~.,(yj(k)-yj(k))

2

j=l

where k is avalue from 1 to n and Y = Q - I B d S d = Q - I B d W d Y n.... t" 7) Obtain control limits of 1 2 , Ie 2 and SPE metrics using kernel density estimation ~5'71

4.2 On-line Monitoring 1) For new data, apply the same scaling used in modeling. Scaled data form is

Xne w - ' [ X I , X 2 , . . . , X t ] E R aX'" 2) Apply MEWMA model to scaled test data x i for i = I, 2,..., t y; = 2x, + (1-2)y,_~

(13)

where 0 < 2 _<1 is the same value used in the modeling and y0 - 0. 3) Calculate S.,~d and S . ~

from S.ewd=WdY~ew and S.ewe =WeY.e ~ , respectively,

where Y.ew----[Yl,Y2 ..... Yt] e Rd'~" 4) Calculate I~ew2 (k) , I.e~e 2 (k), and SPE(k) 2

2

Inew (k) -'$newd(k) T Snewd(k), Ine.,e (k)=Snewe(k)Tsnewe(k),

d

S P E ( k ) = Z(Ynewj(k)-)newj(k)) 2 j=l

where

'~new= Q

-1 BdS,ewd=

Q-IBdWdY,~.

5) Monitor if the Inew2 (k), I,,ewe2 (k), and SPE(k) exceed the control limits calculated from the modeling procedure. 5. CASE STUDY The proposed ICA monitoring algorithm is applied to the detection of various disturbances in the simulated data on the basis of benchmark simulation. The IAWQ model No. 1 and a tenlayer settler model are used to simulate the biological reactions and the settling process, respectively. Fig. 1 shows the flow diagram of the modeled WWTP system. Influent data and operation parameters developed by a working group on benchmarking of wastewater treatment plants, COST 624, are used in the simulation E9I. We used seven variables among many variables used in the benchmark to build monitoring system since they are typically monitored and important variables in real WWT systems. The variables are listed in Table 1. The internal disturbance was imposed by decreasing the nitrification rate in the biological reactor through a decrease in the specific growth rate of the autotrophs (/.tA). The autotrophic growth rate at sample 288 was decreased rapidly from 0.5 to 0.47 day l. Table 1 Variables used in the monitoring of the benchmark model No. Symbol Meaning 1 S-NHi, Influent ammoniac conc. Influent flow rate 2 9.,. Total suspended solid (reactor 4) 3 TSS4 Dissolved oxygen cone. (reactor 3) 4 S-03 Dissolved oxygen cone. (reactor 4) 5 S-04 Oxygen transfer coefficient (reactor 5) 6 KLas Nitrate cone. (reactor 2) 7 S-N02

920

Fig.1. Process layout for the simulation benchmark. First, conventional PCA is applied to this small internal disturbance. Three principal components (PCs) are extracted by cross-validation [1~ However, the conventional PCA with three PCs cannot detect the internal disturbance since the periodic features of the wastewater plant dominate, leading to some false alarms even for normal operating state. In contrast to the PCA monitoring chart, the 12 chart of ICA monitoring can detect the internal disturbance for some samples. However, on the whole, ICA monitoring is not successful in detecting this small internal disturbance. To detect this small internal disturbance efficiently, MEWMA can be applied to the process data in advance. When MEWMA is used, it is important to select 2 properly. Montgomery [111found that values of ,;L in the interval 0.05 < 2, < 0.25 work well in practice, with 2 = 0.05,2 = 0.10, and 3, = 0.20 being popular choices. In general, 2 should be adjusted to a value appropriate for the characteristic of the monitored process. Fig. 2 illustrates the monitoring chats of MEWMA-PCA with 2 =0.05. In this case, fewer PCs are used since MEWMA-PCA captures more variance of data than conventional PCA. SPE chart indicates that disturbance is detected with long delay time of 400 samples. However, after sample 800, MEWMA-PCA does not detect the disturbance any more. MEWMA-PCA cannot detect this small internal disturbance for even small 2 values. On the other hand, MEWMAICA monitoring charts with the same value o f 2 =0.05 can detect this type small drift disturbance. As shown in Fig. 3, 12 values are above 99% confidence limit from sample 320 to the end though the detection time is delayed about 32 samples. These results show the power and advantage of combining MEWMA and ICA for detecting small shifts.

9

2o0

4oo

eoo

coo

~ooo

t2,o

1400 s .~.,i.

,.ran..

Fig. 2. MEWMA-PCA monitoring charts with 2 =0.05:/~ and SPE plots of the AR process data with the 99% confidence limits (dotted line). 2 PCs are retained in the PCA model. 6. CONCLUSIONS In this paper, a new statistical process monitoring algorithm is proposed, which is based on the multivariate exponentially weighted moving average (MEWMA) monitoring concept with ICA and kernel density estimation. ICA can extract underlying factors from multivariate statistical data, MEWMA is able to give a memory effect for monitoring. The proposed

921 monitoring method was applied to a wastewater simulation benchmark. Through the simulation results, we can conclude that MEWMA-ICA is very useful monitoring methods for detecting small shitts in process.

'

,

| 00

~

~

IT,

a-~

~

t2',o

,0',0

O0

200

400

000

000

1000

1200

1400

s.mpt....u,,

Fig. 3. MEWMA-ICA monitoring charts with 2=0.05:12, Ie 2 and SPE plots with the 99% confidence limits (dotted line). 2 ICs are retained in the ICA model. ACKNOWLEDGEMENT

This work was supported by the Brain Korea 21 project. REFERENCES

[1] R. Scranton, G C. Runger, J. B. Keats, and D. C. Montgomery, Quality and reliability engineering international, 12 (1996) 165. [2] S. Wold, Chemometrics Intell. Lab. Syst., 23 (1994) 149. [3] J. Chen, C.-M. Liao, F. R. J. Lin and M.-J. Lu, Ind. Eng. Chem. Res., 40 (2001) 1516. [4] E. B. Martin, A.J. Morris, Papazoglou and C. Kiparissides, Comput. Chem. Eng., 20 (1996) $599. [5] B. W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman&Hall, UK, 1986. [6] A. Hyv/irinen and E. Oja, Neural Networks, 13 (2000), 411. [7] J.-M. Lee, C. K. Yoo, and I.-B. Lee, Journal of Process Control (2003) submitted. [8] C. A. Lowry, Journal of Quality Technology, 34 (1992), 46. [9] M. N. Pons, H. Spanjecrs and U. Jeppsson, Escape 9, Budapest, 1999. [ 10] S. Wold, Technometrics, 20 (1978) 397. [11] D. C. Montgomery, (3rd ed.), Introduction to Statistical Quality Control, Wiley, New York, 1996.