Preliminary-summation-based principal component analysis for non-Gaussian processes


PII: S0169-7439(15)00137-9
DOI: 10.1016/j.chemolab.2015.05.017
Reference: CHEMOM 3020

To appear in: Chemometrics and Intelligent Laboratory Systems

Received date: 25 March 2015
Revised date: 17 May 2015
Accepted date: 18 May 2015

Please cite this article as: Zhijiang Lou, Dong Shen, Youqing Wang, Preliminary-Summation-Based Principal Component Analysis for Non-Gaussian Processes, Chemometrics and Intelligent Laboratory Systems (2015), doi: 10.1016/j.chemolab.2015.05.017

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Preliminary-Summation-Based Principal Component Analysis for Non-Gaussian Processes

Zhijiang Lou, Dong Shen, Youqing Wang*

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

*Email: [email protected]

Abstract: To cope with the combined Gaussian and non-Gaussian features in industrial processes, a novel preliminary-summation-based principal component analysis (PS-PCA) method is proposed in this study. Unlike other approaches, which improve principal component analysis (PCA) by changing its algorithmic structure, PS-PCA only preprocesses the training and monitoring data and leaves PCA itself unmodified. Following the central limit theorem, PS-PCA adds up samples of each variable so that the distribution of the sum approaches a Gaussian distribution. These sums are then used for state monitoring. It is proved that preliminary summation can increase the fault detection rate for Gaussian processes. Furthermore, simulation tests substantiate that PS-PCA can improve the detection capability for non-Gaussian processes, and even for nonlinear processes, without increasing the computational load.

Key words: multivariate statistical process control (MSPC); principal component analysis (PCA); independent component analysis (ICA); preliminary-summation-based PCA (PS-PCA); the central limit theorem

1 INTRODUCTION

For large-scale industrial processes, abnormal situations may result in economic losses, environmental pollution, and even death. With the growing interest in safety, state monitoring is increasingly important in process systems engineering. A key issue for the safe operation of industrial processes is the rapid detection of abnormal situations, followed by the identification and removal of the causal factors. In recent years, with the rapid development of computer and sensor technology, multivariate statistical process control (MSPC) [1] has been developed and widely applied in industrial processes.

One of the most common MSPC methods is principal component analysis (PCA) [2-10]. PCA was proposed by Pearson [4] in the early 20th century, but the vast majority of its applications were not developed until the last few decades. The main idea of PCA is to reduce the data dimension by projecting correlated variables onto a smaller set of new variables that are uncorrelated and retain most of the original variance. When PCA is implemented for process monitoring, a PCA model is first established from process data collected under normal operation, and a control limit is calculated for each monitoring statistic. The process can then be monitored online by using these statistics. In the last few years, many improvements to PCA have been proposed, and many successful applications of PCA to process monitoring have been reported in the literature.

However, PCA cannot cope with non-Gaussian processes. To address this issue, independent component analysis (ICA) [11-16] was proposed. The main idea of ICA is to extract independent components instead of merely uncorrelated components. The fundamental restriction of ICA is that the independent components must be non-Gaussian [14].

However, for most industrial processes, both Gaussian and non-Gaussian features exist, and these are difficult to handle by using PCA or ICA alone. To cope with both Gaussian and non-Gaussian features simultaneously, several approaches have been proposed, including ICA-PCA [1], the support vector data description approach ICA-SVDD [17], and the statistical local approach LOCAL-ICA [18]. However, applying these methods is difficult for two reasons: first, all of them deal with Gaussian and non-Gaussian components separately, whereas a perfect separation of Gaussian and non-Gaussian components is difficult or even impossible; second, they require a large computational load. Hence, this paper proposes a preliminary summation method for PCA, termed PS-PCA. PS-PCA converts non-Gaussian components into approximately Gaussian components by summing the training and monitoring data in advance, and then monitors the summed data with traditional PCA. It therefore avoids the separation of Gaussian and non-Gaussian components and needs only a small computational load.

By adding up the samples of each variable, the fault information is accumulated and hence faults can be detected more easily. Through strict mathematical analysis, it is proved that preliminary summation improves the fault detection rate for Gaussian processes and that the fault detection rate increases as the summation number increases. According to the central limit theorem (CLT) [19], when the samples of each variable are added up, the distribution of the sum is close to the Gaussian distribution. As a result, preliminary summation can also improve the monitoring performance for non-Gaussian processes. When the same training data are used, PS-PCA has a lower computational load than conventional PCA in most situations. Moreover, according to the fault diagnosis results on a simulated process that contains both Gaussian and non-Gaussian features, PS-PCA performs much better than PCA, ICA-SVDD, and LOCAL-ICA: the average detection rate of PS-PCA is 90.3%, while the average detection rates of PCA, ICA-SVDD, and LOCAL-ICA are 24.4%, 24.8%, and 55.9%, respectively. PS-PCA is therefore an effective solution for handling both Gaussian and non-Gaussian features simultaneously.

Besides non-Gaussianity, nonlinearity is also widespread in industrial processes, and it is another significant challenge for PCA. Although PS-PCA is a linear approach, it can be successfully used for nonlinear processes. Tested on a nonlinear simulated process, PS-PCA detects faults efficiently, and its detection rates for most faults are greater than 60%. The detection rates of three other nonlinear methods, kernel principal component analysis (KPCA) [20-22], kernel independent component analysis (KICA) [23], and kernel independent component analysis-principal component analysis (KICA-PCA) [24], are all below 60%, which indicates that PS-PCA is also effective for nonlinear processes.

This paper proposes a new PCA variant based on data preprocessing and compares it with traditional PCA. Theoretical analysis demonstrates that PS-PCA is more efficient than traditional PCA. PS-PCA is then compared with traditional PCA, ICA-SVDD, LOCAL-ICA, KPCA, KICA, and KICA-PCA on the TE process [25, 26], and the results validate the improved fault detection ability of PS-PCA.

The remainder of this paper is organized as follows. Classical PCA for process monitoring is reviewed briefly in Section 2, where PS-PCA is then proposed and some details are introduced. In Section 3, PS-PCA is analyzed and the influence of preliminary summation is proved. To fully analyze the characteristics of PS-PCA and compare it with other MSPC methods, tests are carried out on a simulated non-Gaussian process and a simulated nonlinear process in Section 4. The TE process is employed to demonstrate the performance of the proposed method in Section 5. Finally, the contributions of this paper are summarized and some future studies are discussed in Section 6.

2 METHOD

2.1. Principal Component Analysis

PCA decomposes the data matrix $\mathbf{X} \in R^{n \times s}$ (where $n$ is the number of samples and $s$ is the number of variables) into a transformed subspace of reduced dimension, which is defined by the span of a chosen subset of the eigenvectors of the covariance or correlation matrix associated with $\mathbf{X}$. Each chosen eigenvector, or principal component (PC), captures the maximum amount of variability in the data in an ordered fashion. Mathematically, the decomposition is defined as follows:

$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} = \hat{\mathbf{X}} + \mathbf{E} \tag{1}$$

where $\mathbf{T} \in R^{n \times k}$ refers to the score matrix, $\mathbf{P} \in R^{s \times k}$ refers to the loading matrix, and $\mathbf{E} \in R^{n \times s}$ is the residual matrix. Usually, only the first few dominant PCs are selected in $\mathbf{P}$. In this paper, the covariance matrix $\mathbf{S}$ is used to derive a PCA model, defined as follows:

$$\mathbf{S} = \frac{1}{n-1}\mathbf{X}^{T}\mathbf{X} \tag{2}$$

The columns of $\mathbf{P}$ are the eigenvectors of $\mathbf{S}$ associated with the $k$ largest eigenvalues. The cumulative percent variance (CPV) method is usually adopted to determine the number of PCs, which is defined as follows:

$$\mathrm{CPV} = \frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{s}\lambda_i} \times 100\% \ge \varepsilon \tag{3}$$

where $\lambda_i$ ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_s \ge 0$) is the variance of the $i$-th score vector and $\varepsilon$ is a parameter usually set to 85%. The number of dominant PCs $k$ is taken as the smallest value for which CPV reaches $\varepsilon$.

The subspaces spanned by $\hat{\mathbf{X}}$ and $\mathbf{E}$ are called the score space and the residual space, respectively, and the $T^2$ and $Q$ statistics [27] are constructed to monitor the two spaces, respectively. The $T^2$ statistic represents the distance between the projection of the new data onto the subspace and the origin of the subspace; the $Q$ statistic is a measure of the approximation error of the new data within the PCA subspace. Given a monitoring vector $\mathbf{x} \in R^{s \times 1}$, the $T^2$ and $Q$ statistics can be calculated as below:

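$$T^2 = \mathbf{x}^{T}\mathbf{P}\boldsymbol{\Lambda}_k^{-1}\mathbf{P}^{T}\mathbf{x} \tag{4}$$

$$Q = \mathbf{x}^{T}(\mathbf{I} - \mathbf{P}\mathbf{P}^{T})(\mathbf{I} - \mathbf{P}\mathbf{P}^{T})\mathbf{x} \tag{5}$$

where $\mathbf{I}$ is an identity matrix and $\boldsymbol{\Lambda}_k = \mathrm{diag}(\lambda_1, \ldots, \lambda_k) \in R^{k \times k}$ is a diagonal matrix, which is the estimated covariance matrix of the principal component scores. The threshold of the $T^2$ index for a Gaussian process is

$$\delta_{T}^{2} = \frac{(n-1)(n+1)k}{n(n-k)}F_{\alpha}(k, n-k)$$

where $F_{\alpha}(k, n-k)$ is the critical value of an F-distribution with $k$ and $(n-k)$ degrees of freedom at the level of significance $\alpha$; the threshold of $Q$ is

$$\delta_{Q}^{2} = \theta_1\left[\frac{C_{\alpha}h_0\sqrt{2\theta_2}}{\theta_1} + 1 + \frac{\theta_2 h_0(h_0 - 1)}{\theta_1^{2}}\right]^{1/h_0}$$

where $\theta_i = \sum_{j=k+1}^{s}\lambda_j^{i}$, $h_0 = 1 - \dfrac{2\theta_1\theta_3}{3\theta_2^{2}}$, and $C_{\alpha}$ is the normal deviate corresponding to the $(1-\alpha)$ percentile.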
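The monitoring quantities of Section 2.1 can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `fit_pca`, `t2_q`, and `q_limit` are hypothetical, the data are scaled to zero mean and unit variance before the eigendecomposition, and the $T^2$ threshold is omitted because it needs an F-distribution quantile that the Python standard library does not provide.

```python
import numpy as np
from statistics import NormalDist


def fit_pca(X, cpv=0.85):
    """Fit a PCA monitoring model on training data X (n samples x s variables)."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mean) / std                          # assumption: scale with training statistics
    S = Xs.T @ Xs / (Xs.shape[0] - 1)              # covariance matrix, Eq. (2)
    lam, vec = np.linalg.eigh(S)                   # eigh returns ascending eigenvalues
    lam, vec = lam[::-1], vec[:, ::-1]             # reorder to descending
    ratio = np.cumsum(lam) / lam.sum()             # CPV for k = 1..s, Eq. (3)
    k = min(int(np.searchsorted(ratio, cpv)) + 1, lam.size)
    return {"mean": mean, "std": std, "P": vec[:, :k], "lam": lam, "k": k}


def t2_q(model, x):
    """T^2 (Eq. 4) and Q (Eq. 5) for one monitoring vector x."""
    xs = (x - model["mean"]) / model["std"]
    P, k = model["P"], model["k"]
    scores = P.T @ xs
    t2 = float(np.sum(scores ** 2 / model["lam"][:k]))
    resid = xs - P @ scores                        # (I - P P^T) x
    return t2, float(resid @ resid)


def q_limit(model, alpha=0.01):
    """Q threshold computed from the residual eigenvalues; requires k < s."""
    rest = model["lam"][model["k"]:]
    th1, th2, th3 = (float(np.sum(rest ** i)) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * th1 * th3 / (3.0 * th2 ** 2)
    c = NormalDist().inv_cdf(1.0 - alpha)          # normal deviate C_alpha
    inner = c * h0 * np.sqrt(2.0 * th2) / th1 + 1.0 + th2 * h0 * (h0 - 1.0) / th1 ** 2
    return float(th1 * inner ** (1.0 / h0))
```

Because $(\mathbf{I} - \mathbf{P}\mathbf{P}^{T})$ is idempotent, the sketch evaluates $Q$ as the squared norm of the residual vector instead of forming the projector twice.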

2.2. Preliminary-Summation-based PCA

When non-Gaussian features exist in a process, it is inappropriate to calculate the loading matrix $\mathbf{P}$ in (1) using traditional PCA, because not all process variables follow a Gaussian distribution. According to the central limit theorem (CLT) [19], for a process variable that has non-Gaussian features, the sum of its training data $\tilde{\mathbf{X}} = \sum_{\tau=1}^{r}\mathbf{X}(\tau)$ and the sum of its monitoring data $\tilde{\mathbf{x}} = \sum_{\tau=1}^{r}\mathbf{x}(\tau)$ will be close to a Gaussian distribution and thus satisfy PCA's restriction. For convenience, parameter $r$ is termed the summation number. As a result, the loading matrix $\mathbf{P}$ can be calculated by using $\tilde{\mathbf{X}}$ and $\tilde{\mathbf{x}}$. Hence, the main idea of PS-PCA is to sum the training and monitoring data before applying PCA. Note that the summation procedures for the training and monitoring data differ: for the training data, to guarantee the independence of the sums, each sample should be used only once, i.e., given $n$ samples of the data, there will be $n_{new} = [n/r]$ sums; for the monitoring data, however, each sample can be used many times. The summation process is shown in Figure 1.

(insert Figure 1 here)

As shown in Figure 1(b), the sum of the monitoring data is calculated at time $t$ with summation number $r$. If $t \ge r$, the current sample and the previous $r - 1$ samples of the monitoring data are used to form the sum; otherwise, the current sample, the previous $t - 1$ samples of the monitoring data, and another $r - t$ samples of the training data are used. In addition, Figure 1(b) indicates that $\tilde{\mathbf{x}}(t+1) = \tilde{\mathbf{x}}(t) + \mathbf{x}(t+1) - \mathbf{x}(t-r+1)$, and therefore recursion can be adopted to reduce the volume of calculations in the summation procedure of the monitoring data.
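The two summation procedures can be sketched as follows. This is a minimal illustration, assuming the data are stored as NumPy arrays with one sample per row; the names `sum_training` and `sum_monitoring` are hypothetical. The training sums use non-overlapping blocks, so each sample is used once, while the monitoring sums use a sliding window that is updated recursively and borrows the most recent training samples while fewer than $r$ monitoring samples are available.

```python
import numpy as np


def sum_training(X, r):
    """Non-overlapping block sums: each training sample is used exactly once."""
    n_new = X.shape[0] // r                       # n_new = [n / r]
    return X[: n_new * r].reshape(n_new, r, X.shape[1]).sum(axis=1)


def sum_monitoring(X_mon, r, X_train):
    """Sliding-window sums over the last r samples, updated recursively.

    The first r - 1 windows are completed with the latest training samples,
    so X_train must contain at least r - 1 rows.
    """
    history = np.vstack([X_train[-(r - 1):], X_mon]) if r > 1 else X_mon
    out = np.empty(X_mon.shape, dtype=float)
    window = history[:r].sum(axis=0)              # first window, computed directly
    out[0] = window
    for t in range(1, X_mon.shape[0]):
        # add the newest sample and drop the one that leaves the window
        window = window + history[t + r - 1] - history[t - 1]
        out[t] = window
    return out
```

Each recursive update costs one addition and one subtraction per variable, regardless of $r$, which is why the summation adds little to the monitoring cost.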

After summation, the next step is to use PCA to detect process faults with the new data, exactly as in traditional PCA. For PS-PCA, although there are $n$ samples of training data, the actual number of training samples used by PCA is $n_{new}$, so the computational load of the training process is determined by $n_{new}$ rather than $n$.

Compared with traditional PCA, PS-PCA has two disadvantages: the preliminary summation requires additional computation and more training data. However, relative to the cost of PCA itself, the computational cost of preliminary summation is very small. In addition, for most industrial processes, there are enough training data for PS-PCA. As a result, PS-PCA is a promising modification of PCA.

REMARK 1. Even though statistics pattern analysis (SPA) [28, 29] has some similarities with PS-PCA, their differences are clear. First, PS-PCA only uses first-order statistics, while SPA also adopts higher-order statistics. However, PCA is a linear method and is not good at handling the nonlinearity among statistics of different orders, so adding higher-order statistics may have a negative impact on SPA. In addition, calculating and monitoring higher-order statistics requires a much larger computational load. Second, many statistics are adopted in SPA to address the non-Gaussianity, nonlinearity, and dynamics problems, but the function of each statistic is not clear. In PS-PCA, the summation is introduced to address the non-Gaussian problem, and its function is supported by the CLT and by the simulation results. Third, to improve the fault detection performance, SPA emphasizes increasing the variety of the statistics, while PS-PCA focuses on increasing the summation number $r$.

3 THEORETICAL ANALYSIS OF PS-PCA

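To simplify the proof, it is assumed that both the normal and faulty processes are Gaussian. This assumption is reasonable because, even for a non-Gaussian process, the sums will be close to a Gaussian process when $r \to \infty$.

3.1 The influence of summation number on the fault detection rate

Assume the training data $\mathbf{x}_0 \in N(\boldsymbol{\mu}_{norm}, \boldsymbol{\Sigma}_{norm})$, the monitoring data without fault $\mathbf{x}_1 \in N(\boldsymbol{\mu}_{norm}, \boldsymbol{\Sigma}_{norm})$, and the monitoring data with a fault $\mathbf{x}_2 \in N(\boldsymbol{\mu}_{fault}, \boldsymbol{\Sigma}_{fault})$, where $N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ means a Gaussian distribution with expectation $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. Both $\mathbf{x}_i$ ($i = 0, 1, 2$) and $\boldsymbol{\mu}_j$ ($j = norm, fault$) are vectors, and $\boldsymbol{\Sigma}_j$ ($j = norm, fault$) are matrices:

$$\mathbf{x}_i = [x_i^{1}, x_i^{2}, \ldots, x_i^{s}]^{T}, \qquad \boldsymbol{\mu}_j = [\mu_j^{1}, \mu_j^{2}, \ldots, \mu_j^{s}]^{T}$$

$$\boldsymbol{\Sigma}_j = \begin{bmatrix} \sigma_j^{1}\sigma_j^{1} & \rho_j^{12}\sigma_j^{1}\sigma_j^{2} & \cdots & \rho_j^{1s}\sigma_j^{1}\sigma_j^{s} \\ \rho_j^{12}\sigma_j^{2}\sigma_j^{1} & \sigma_j^{2}\sigma_j^{2} & \cdots & \rho_j^{2s}\sigma_j^{2}\sigma_j^{s} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_j^{1s}\sigma_j^{s}\sigma_j^{1} & \rho_j^{2s}\sigma_j^{s}\sigma_j^{2} & \cdots & \sigma_j^{s}\sigma_j^{s} \end{bmatrix}$$

where $s$ is the number of variables and $\rho_j^{gh}$ ($g = 1, 2, \ldots, s$; $h = 1, 2, \ldots, s$) is the correlation coefficient between variables $g$ and $h$. For PS-PCA, the first step is to obtain the sum of each variable: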

Training data:

$$\tilde{\mathbf{x}}_0(t) = \sum_{\tau = r(t-1)+1}^{rt}\mathbf{x}_0(\tau) \in N(r\boldsymbol{\mu}_{norm}, r\boldsymbol{\Sigma}_{norm}) \tag{6}$$

Monitoring data without fault:

$$\tilde{\mathbf{x}}_1(t) = \sum_{\tau = t-r+1}^{t}\mathbf{x}_1(\tau) \in N(r\boldsymbol{\mu}_{norm}, r\boldsymbol{\Sigma}_{norm}) \tag{7}$$

where $[t-r+1, t]$ is the moving window for summation. Define $t_f$ as the time at which the fault occurs. According to the relationship between $[t-r+1, t]$ and $t_f$, there are three cases: $t_f > t$, $t-r+1 < t_f \le t$, and $t_f \le t-r+1$. In the first case, $\tilde{\mathbf{x}}_2(t) = \tilde{\mathbf{x}}_1(t)$, which is trivial. The second case means that only the last $m = t - t_f + 1$ samples are fault data, and the third case means that all of the $r$ samples are fault data; these are called transitional fault data and steady-state fault data, respectively. The last two cases are shown as follows:

$$\tilde{\mathbf{x}}_2(t) = \sum_{\tau = t-m+1}^{t}\mathbf{x}_2(\tau) + \sum_{\tau = t-r+1}^{t-m}\mathbf{x}_1(\tau) \in N\big(m\boldsymbol{\mu}_{fault} + (r-m)\boldsymbol{\mu}_{norm},\; m\boldsymbol{\Sigma}_{fault} + (r-m)\boldsymbol{\Sigma}_{norm}\big) \tag{8}$$

$$\tilde{\mathbf{x}}_2(t) = \sum_{\tau = t-r+1}^{t}\mathbf{x}_2(\tau) \in N(r\boldsymbol{\mu}_{fault}, r\boldsymbol{\Sigma}_{fault}) \tag{9}$$

Obviously, the steady-state fault data (9) can be considered a special case of the transitional fault data (8), and hence this section focuses on (8). From (6) to (9), one knows that PS-PCA degenerates into traditional PCA when $r = 1$.
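The moments of the window sum for transitional fault data can be checked numerically. The sketch below is illustrative only: the function names are hypothetical and the example means and covariances used to call them are arbitrary choices, not values from the paper. It draws windows containing $r - m$ normal samples and $m$ faulty samples, sums each window, and lets the sample moments be compared with the mean $m\boldsymbol{\mu}_{fault} + (r-m)\boldsymbol{\mu}_{norm}$ and covariance $m\boldsymbol{\Sigma}_{fault} + (r-m)\boldsymbol{\Sigma}_{norm}$.

```python
import numpy as np


def window_sum_moments(mu_norm, cov_norm, mu_fault, cov_fault, r, m):
    """Mean and covariance of the window sum for m faulty samples out of r."""
    mean = m * mu_fault + (r - m) * mu_norm
    cov = m * cov_fault + (r - m) * cov_norm
    return mean, cov


def simulate_window_sums(mu_norm, cov_norm, mu_fault, cov_fault, r, m, n_rep, rng):
    """Draw n_rep windows with r - m normal and m faulty samples (0 < m < r), then sum each."""
    normal = rng.multivariate_normal(mu_norm, cov_norm, size=(n_rep, r - m))
    faulty = rng.multivariate_normal(mu_fault, cov_fault, size=(n_rep, m))
    return normal.sum(axis=1) + faulty.sum(axis=1)
```

With independent samples, both the means and the covariances add, which is all that the closed-form expressions rely on.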

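The next step is to scale the training and monitoring data by using the sample average ($\bar{\mathbf{x}}_{norm}$) and variance ($\boldsymbol{\Sigma}_{norm}'$) of the training data. Because $\bar{\mathbf{x}}_{norm}$ is the sample average of the training data $\mathbf{x}_0$, $\bar{\mathbf{x}}_{norm} \approx \boldsymbol{\mu}_{norm}$. Hence, equations (6), (7), and (8) can be rewritten as

$$\begin{aligned} \tilde{\mathbf{x}}_0' &= (r\boldsymbol{\Sigma}_{norm}')^{-1/2}(\tilde{\mathbf{x}}_0 - r\bar{\mathbf{x}}_{norm}) \\ &\in N\Big(\sqrt{r}\,\boldsymbol{\Sigma}_{norm}'^{-1/2}(\boldsymbol{\mu}_{norm} - \bar{\mathbf{x}}_{norm}),\; \boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{norm}\boldsymbol{\Sigma}_{norm}'^{-1/2}\Big) \approx N\Big(\mathbf{0},\; \boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{norm}\boldsymbol{\Sigma}_{norm}'^{-1/2}\Big) \end{aligned} \tag{10}$$

$$\begin{aligned} \tilde{\mathbf{x}}_1' &= (r\boldsymbol{\Sigma}_{norm}')^{-1/2}(\tilde{\mathbf{x}}_1 - r\bar{\mathbf{x}}_{norm}) \\ &\in N\Big(\sqrt{r}\,\boldsymbol{\Sigma}_{norm}'^{-1/2}(\boldsymbol{\mu}_{norm} - \bar{\mathbf{x}}_{norm}),\; \boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{norm}\boldsymbol{\Sigma}_{norm}'^{-1/2}\Big) \approx N\Big(\mathbf{0},\; \boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{norm}\boldsymbol{\Sigma}_{norm}'^{-1/2}\Big) \end{aligned} \tag{11}$$

$$\begin{aligned} \tilde{\mathbf{x}}_2' &= (r\boldsymbol{\Sigma}_{norm}')^{-1/2}(\tilde{\mathbf{x}}_2 - r\bar{\mathbf{x}}_{norm}) \\ &\in N\Big((r\boldsymbol{\Sigma}_{norm}')^{-1/2}\big(m\boldsymbol{\mu}_{fault} + (r-m)\boldsymbol{\mu}_{norm} - r\bar{\mathbf{x}}_{norm}\big),\; (r\boldsymbol{\Sigma}_{norm}')^{-1/2}\big(m\boldsymbol{\Sigma}_{fault} + (r-m)\boldsymbol{\Sigma}_{norm}\big)(r\boldsymbol{\Sigma}_{norm}')^{-1/2}\Big) \\ &\approx N\Big(\tfrac{m}{\sqrt{r}}\boldsymbol{\Sigma}_{norm}'^{-1/2}(\boldsymbol{\mu}_{fault} - \bar{\mathbf{x}}_{norm}),\; (r\boldsymbol{\Sigma}_{norm}')^{-1/2}\big(m\boldsymbol{\Sigma}_{fault} + (r-m)\boldsymbol{\Sigma}_{norm}\big)(r\boldsymbol{\Sigma}_{norm}')^{-1/2}\Big) \\ &\approx N\Big(\tfrac{m}{\sqrt{r}}\boldsymbol{\Sigma}_{norm}'^{-1/2}(\boldsymbol{\mu}_{fault} - \boldsymbol{\mu}_{norm}),\; \tfrac{m}{r}\boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{fault}\boldsymbol{\Sigma}_{norm}'^{-1/2} + \big(1 - \tfrac{m}{r}\big)\boldsymbol{\Sigma}_{norm}'^{-1/2}\boldsymbol{\Sigma}_{norm}\boldsymbol{\Sigma}_{norm}'^{-1/2}\Big) \end{aligned} \tag{12}$$

As shown in (10), $\tilde{\mathbf{x}}_0'$ is not affected by the summation number $r$, so PS-PCA will have the same training results as PCA, including the loading matrix $\mathbf{P}$, the matrix $\boldsymbol{\Lambda}_k$, and the thresholds $\delta_{T}^{2}$ and $\delta_{Q}^{2}$. Because $\tilde{\mathbf{x}}_0'$ and $\tilde{\mathbf{x}}_1'$ follow the same distribution, and this distribution is not affected by the summation number $r$ or the fault data number $m$, their detection rates will be the same as those in traditional PCA, so only $\tilde{\mathbf{x}}_2'$ needs to be studied. Set $\tilde{\mathbf{y}}' = \boldsymbol{\Lambda}_k^{-1/2}\mathbf{P}^{T}\tilde{\mathbf{x}}_2'$ and $\tilde{\mathbf{z}}' = (\mathbf{I} - \mathbf{P}\mathbf{P}^{T})\tilde{\mathbf{x}}_2'$; then $T^2$ in (4) and $Q$ in (5) become:

$$T^2 = \tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \tag{13}$$

$$Q = \tilde{\mathbf{z}}'^{T}\tilde{\mathbf{z}}' \tag{14}$$

So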

1 1  −1  m − −  y% ′ = Λ k 2 PT x% ′2 ∈ N  Λ k 2 PT  Σ′norm 2 (μ fault − μ norm )  ,   r  

PT

1 1 1 1 1 1 − − − − −  −  m m Λ k 2 P T  Σ′norm 2 Σ fault Σ′norm 2 + (1 − ) Σ′norm 2 Σ norm Σ′norm 2  PΛ k 2   r r  

(15)

AC CE

m m   m =N  μ 3 , Σ 3 + (1 − )Θ  r r  r  1  − m  z% ′ = (I − PP T ) x% ′2 ∈ N  (I − PPT )  Σ′norm 2 (μ fault − μ norm )  ,   r   1 1 1 1  − − − −  m m (I − PPT )  Σ′norm 2 Σ fault Σ′norm 2 + (1 − ) Σ′norm 2 Σ norm Σ′norm 2  (I − PPT )   r r  

(16)

m m   m μ 4 , Σ 4 + (1 − )Ω  =N  r r  r  where

11

ACCEPTED MANUSCRIPT



1



1

μ 3 = Λ k 2 PT Σ′norm 2 (μ fault − μ norm ) 1



1



1



1



1



1



1



1



RI P

Θ = Λ k 2 PT Σ′norm 2 Σ norm Σ′norm 2 PΛ k 2

T



Σ 3 = Λ k 2 P T Σ′norm 2 Σ fault Σ′norm 2 PΛ k 2

1



1

SC

μ 4 = (I − PPT ) Σ′norm 2 (μ fault − μ norm ) −

1

Σ 4 = (I − PPT ) Σ′norm 2 Σ fault Σ′norm 2 (I − PPT ) −

1



1

Ω = (I − PP ) Σ′norm 2 Σ norm Σ′norm 2 (I − PPT )

NU

T

MA

Equations (15) and (16) demonstrate that both the expectation and the variance of $\tilde{\mathbf{y}}'$ and $\tilde{\mathbf{z}}'$ vary with the summation number $r$ and the fault data number $m$; however, when $m = r$ (or $t_f \le t-r+1$), their variances are fixed at $\boldsymbol{\Sigma}_3$ and $\boldsymbol{\Sigma}_4$ and their expectations are $\sqrt{r}\boldsymbol{\mu}_3$ and $\sqrt{r}\boldsymbol{\mu}_4$. As a first step, Section 3.1 only studies the relationship between the fault detection rate and $r$ for steady-state fault data ($m = r$ or $t_f \le t-r+1$), while the transitional fault data are studied in Section 3.2. In addition, because the $T^2$ and $Q$ statistics have similar forms, Section 3.1 only studies the relationship for the $T^2$ statistic.

THEOREM. If $\boldsymbol{\mu}_3 \ne \mathbf{0}$ ($\boldsymbol{\mu}_4 \ne \mathbf{0}$), the following conclusions hold: (a) there exists $r^{*} > 0$ such that the fault detection rate of PS-PCA is higher than that of PCA when $r > r^{*}$; (b) when $r \to \infty$, the fault detection rate of PS-PCA converges to 100%.

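PROOF. When $\tilde{\mathbf{y}}' \in N(\sqrt{r}\boldsymbol{\mu}_3, \boldsymbol{\Sigma}_3)$, its probability density function is

$$f(\tilde{\mathbf{y}}') = (2\pi)^{-k/2}|\boldsymbol{\Sigma}_3|^{-1/2}\exp\Big(-\tfrac{1}{2}(\tilde{\mathbf{y}}' - \sqrt{r}\boldsymbol{\mu}_3)^{T}\boldsymbol{\Sigma}_3^{-1}(\tilde{\mathbf{y}}' - \sqrt{r}\boldsymbol{\mu}_3)\Big) \tag{17}$$

So the fault detection rate is

$$P(r) = \int_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' > T_{\alpha}} f(\tilde{\mathbf{y}}')\,d\tilde{\mathbf{y}}' = 1 - \int_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \le T_{\alpha}} f(\tilde{\mathbf{y}}')\,d\tilde{\mathbf{y}}' \tag{18}$$

where $T_{\alpha}$ is the control limit of $T^2$. Taking the derivative of (18) with respect to $r$, one gets:

$$\frac{\partial P}{\partial r} = -\int_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \le T_{\alpha}} \frac{\partial f(\tilde{\mathbf{y}}')}{\partial r}\,d\tilde{\mathbf{y}}' = \int_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \le T_{\alpha}} f(\tilde{\mathbf{y}}')\Big[\tfrac{1}{2}\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\boldsymbol{\mu}_3 - \tfrac{1}{2\sqrt{r}}\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\tilde{\mathbf{y}}'\Big]\,d\tilde{\mathbf{y}}' \tag{19}$$

Because $f(\tilde{\mathbf{y}}')$ is a probability density function and $\boldsymbol{\Sigma}_3$ is a positive definite matrix, $f(\tilde{\mathbf{y}}') \ge 0$ and $\tfrac{1}{2}\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\boldsymbol{\mu}_3 \ge 0$. Because $\boldsymbol{\mu}_3 \ne \mathbf{0}$, $\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\boldsymbol{\mu}_3 \ne 0$ and hence

$$\bar{r} = \max_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \le T_{\alpha}}\left(\frac{\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\tilde{\mathbf{y}}'}{\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\boldsymbol{\mu}_3}\right)^{2} \ne \infty.$$

When $r > \bar{r}$, then $\big[\tfrac{1}{2}\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\boldsymbol{\mu}_3 - \tfrac{1}{2\sqrt{r}}\boldsymbol{\mu}_3^{T}\boldsymbol{\Sigma}_3^{-1}\tilde{\mathbf{y}}'\big] > 0$ and hence $\partial P / \partial r > 0$. That is to say, when $r \ge \bar{r}$, the fault detection rate increases as $r$ increases.

Generally, the fault detection rate of conventional PCA is less than 1, i.e., $P(1) < 1$. On the other hand,

$$\lim_{r \to \infty} P(r) = \lim_{r \to \infty}\left(1 - \int_{\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}' \le T_{\alpha}} (2\pi)^{-k/2}|\boldsymbol{\Sigma}_3|^{-1/2}\exp\Big(-\tfrac{1}{2}(\tilde{\mathbf{y}}' - \sqrt{r}\boldsymbol{\mu}_3)^{T}\boldsymbol{\Sigma}_3^{-1}(\tilde{\mathbf{y}}' - \sqrt{r}\boldsymbol{\mu}_3)\Big)\,d\tilde{\mathbf{y}}'\right) = 1 - 0 = 1$$

Hence, there exists $r^{*} \ge \bar{r}$ such that $P(r) > P(1)$ for all $r > r^{*}$. The proof is finished.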
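The behaviour stated in the theorem can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under simplifying assumptions that are not taken from the paper: the fault-free scores are assumed to be $N(\mathbf{0}, \mathbf{I})$, the control limit is taken as an empirical quantile of the fault-free $T^2$ values rather than from the F-distribution, and the function name `detection_rates` and the example values of $\boldsymbol{\mu}_3$ and $\boldsymbol{\Sigma}_3$ used to call it are hypothetical.

```python
import numpy as np


def detection_rates(mu3, sigma3, r_values, alpha=0.01, n_rep=50_000, seed=0):
    """Estimate P(T^2 > limit) for steady-state fault scores y' ~ N(sqrt(r) mu3, sigma3)."""
    rng = np.random.default_rng(seed)
    k = mu3.size
    # Assumption: fault-free scores are N(0, I); the limit is their empirical (1 - alpha) quantile.
    fault_free = np.sum(rng.standard_normal((4 * n_rep, k)) ** 2, axis=1)
    limit = np.quantile(fault_free, 1.0 - alpha)
    rates = {}
    for r in r_values:
        y = rng.multivariate_normal(np.sqrt(r) * mu3, sigma3, size=n_rep)
        rates[r] = float(np.mean(np.sum(y ** 2, axis=1) > limit))
    return rates
```

Calling, for instance, `detection_rates(np.array([0.3, -0.2, 0.1]), np.eye(3), (1, 25, 100))` gives estimated rates that grow with $r$, because the mean of the faulty scores moves away from the origin in proportion to $\sqrt{r}$ while the control limit stays fixed.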

REMARK 2. If $\boldsymbol{\mu}_3 = \mathbf{0}$ ($\boldsymbol{\mu}_4 = \mathbf{0}$), PS-PCA and traditional PCA will have the same fault detection rate. However, this condition is very rare, because most faults are unidirectional and continuous, which results in $\boldsymbol{\mu}_{fault} \ne \boldsymbol{\mu}_{norm}$ and usually leads to $\boldsymbol{\mu}_3 \ne \mathbf{0}$ ($\boldsymbol{\mu}_4 \ne \mathbf{0}$).

3.2 Influence of fault data number on the fault detection rate

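The conclusions in Section 3.1 are drawn for the situation $m = r$; this section examines the situation $m < r$.

For $\tilde{\mathbf{y}}' \in N\big(\tfrac{m}{\sqrt{r}}\boldsymbol{\mu}_3,\; \tfrac{m}{r}\boldsymbol{\Sigma}_3 + (1 - \tfrac{m}{r})\boldsymbol{\Theta}\big)$ and $\tilde{\mathbf{z}}' \in N\big(\tfrac{m}{\sqrt{r}}\boldsymbol{\mu}_4,\; \tfrac{m}{r}\boldsymbol{\Sigma}_4 + (1 - \tfrac{m}{r})\boldsymbol{\Omega}\big)$, the expectation of $T^2$ can be calculated as below.

$$\begin{aligned} E(T^2) = E(\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}') &= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(\tilde{\mathbf{y}}')\sum_{i=1}^{k}\tilde{y}_i'^{2}\,d\tilde{y}_1'\,d\tilde{y}_2'\cdots d\tilde{y}_k' \\ &= \sum_{i=1}^{k}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(\tilde{\mathbf{y}}')\,\tilde{y}_i'^{2}\,d\tilde{y}_1'\,d\tilde{y}_2'\cdots d\tilde{y}_k' \\ &= \sum_{i=1}^{k}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(\tilde{\mathbf{y}}')\,d\tilde{y}_1'\cdots d\tilde{y}_{i-1}'\,d\tilde{y}_{i+1}'\cdots d\tilde{y}_k'\right)\tilde{y}_i'^{2}\,d\tilde{y}_i' \\ &= \sum_{i=1}^{k}\int_{-\infty}^{\infty} f_{\tilde{y}_i'}(\tilde{y}_i')\,\tilde{y}_i'^{2}\,d\tilde{y}_i' \end{aligned} \tag{20}$$

where $f_{\tilde{y}_i'}(\tilde{y}_i') = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(\tilde{\mathbf{y}}')\,d\tilde{y}_1'\cdots d\tilde{y}_{i-1}'\,d\tilde{y}_{i+1}'\cdots d\tilde{y}_k'$ is the marginal distribution of $\tilde{y}_i'$, which is Gaussian with expectation and variance equal to the corresponding terms of $\tfrac{m}{\sqrt{r}}\boldsymbol{\mu}_3$ and $\tfrac{m}{r}\boldsymbol{\Sigma}_3 + (1 - \tfrac{m}{r})\boldsymbol{\Theta}$, respectively. So

$$E(T^2) = \sum_{i=1}^{k}E(\tilde{y}_i'^{2}) = \sum_{i=1}^{k}\Big[\big(E(\tilde{y}_i')\big)^{2} + D(\tilde{y}_i')\Big] = \frac{m^{2}}{r}\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3 + \mathrm{tr}\Big[\frac{m}{r}\boldsymbol{\Sigma}_3 + \Big(1 - \frac{m}{r}\Big)\boldsymbol{\Theta}\Big] \tag{21}$$

Similarly, one has

$$E(Q) = E(\tilde{\mathbf{z}}'^{T}\tilde{\mathbf{z}}') = \frac{m^{2}}{r}\boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4 + \mathrm{tr}\Big[\frac{m}{r}\boldsymbol{\Sigma}_4 + \Big(1 - \frac{m}{r}\Big)\boldsymbol{\Omega}\Big] \tag{22}$$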
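The closed-form expectation of $T^2$ can be checked against simulation. The sketch below is a hedged illustration: the function names are hypothetical and the numerical values of $\boldsymbol{\mu}_3$, $\boldsymbol{\Sigma}_3$, and $\boldsymbol{\Theta}$ passed to them are arbitrary, not taken from the paper. It evaluates $\tfrac{m^{2}}{r}\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3 + \mathrm{tr}[\tfrac{m}{r}\boldsymbol{\Sigma}_3 + (1 - \tfrac{m}{r})\boldsymbol{\Theta}]$ and compares it with the Monte Carlo mean of $\tilde{\mathbf{y}}'^{T}\tilde{\mathbf{y}}'$ for scores drawn from the corresponding Gaussian distribution.

```python
import numpy as np


def expected_t2(mu3, sigma3, theta, m, r):
    """Closed-form E(T^2) for m faulty samples in a window of length r."""
    w = m / r
    return (m ** 2 / r) * float(mu3 @ mu3) + float(np.trace(w * sigma3 + (1.0 - w) * theta))


def simulated_t2_mean(mu3, sigma3, theta, m, r, n_rep, rng):
    """Monte Carlo mean of y'^T y' with y' ~ N(m/sqrt(r) mu3, (m/r) sigma3 + (1 - m/r) theta)."""
    w = m / r
    cov = w * sigma3 + (1.0 - w) * theta
    y = rng.multivariate_normal(m / np.sqrt(r) * mu3, cov, size=n_rep)
    return float(np.mean(np.sum(y ** 2, axis=1)))
```

Setting $m = r$ in `expected_t2` makes the $\boldsymbol{\Theta}$ term vanish and leaves $r\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3 + \mathrm{tr}(\boldsymbol{\Sigma}_3)$, so the mean-shift contribution grows linearly with the summation number.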

When $m = r = 1$, which means PS-PCA degenerates into traditional PCA, the expectations of these statistics are

$$E(T^2)\big|_{m=r=1} = \boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3 + \mathrm{tr}(\boldsymbol{\Sigma}_3) \tag{23}$$

$$E(Q)\big|_{m=r=1} = \boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4 + \mathrm{tr}(\boldsymbol{\Sigma}_4) \tag{24}$$

When $m = r \ne 1$, the expectations of these statistics in steady state are

$$E(T^2)\big|_{m=r\ne 1} = r\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3 + \mathrm{tr}(\boldsymbol{\Sigma}_3) \tag{25}$$

$$E(Q)\big|_{m=r\ne 1} = r\boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4 + \mathrm{tr}(\boldsymbol{\Sigma}_4) \tag{26}$$

For convenience, $E(T^2)|_{m=r=1}$ and $E(Q)|_{m=r=1}$ are taken as the results of traditional PCA. According to Equations (23) to (26), one gets $E(T^2)|_{m=r\ne 1} > E(T^2)|_{m=r=1}$ and $E(Q)|_{m=r\ne 1} > E(Q)|_{m=r=1}$. In addition, PS-PCA and PCA have the same thresholds $\delta_{T}^{2}$ and $\delta_{Q}^{2}$, so PS-PCA will have a higher detection rate than PCA in steady state, which is consistent with the conclusion in Section 3.1.

Now consider the situation $m < r$. Define $e_{T^2} = E(T^2) - E(T^2)|_{m=r=1}$ and $e_Q = E(Q) - E(Q)|_{m=r=1}$, and then one gets

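$$e_{T^2} = \frac{\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3}{r}m^{2} - \frac{\mathrm{tr}(\boldsymbol{\Sigma}_3 - \boldsymbol{\Theta})}{r}m + \big[\mathrm{tr}(\boldsymbol{\Sigma}_3 - \boldsymbol{\Theta}) - \boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3\big] \tag{27}$$

$$e_{Q} = \frac{\boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4}{r}m^{2} - \frac{\mathrm{tr}(\boldsymbol{\Sigma}_4 - \boldsymbol{\Omega})}{r}m + \big[\mathrm{tr}(\boldsymbol{\Sigma}_4 - \boldsymbol{\Omega}) - \boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4\big] \tag{28}$$

Please note that the values of $e_{T^2}$ and $e_Q$ are determined only by $m$, because the other terms are constants. For convenience, define the following two constants.

$$\Delta_{T^2} = \Big(\frac{\mathrm{tr}(\boldsymbol{\Sigma}_3 - \boldsymbol{\Theta})}{r}\Big)^{2} - 4\frac{\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3}{r}\big[\mathrm{tr}(\boldsymbol{\Sigma}_3 - \boldsymbol{\Theta}) - \boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3\big] \tag{29}$$

$$\Delta_{Q} = \Big(\frac{\mathrm{tr}(\boldsymbol{\Sigma}_4 - \boldsymbol{\Omega})}{r}\Big)^{2} - 4\frac{\boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4}{r}\big[\mathrm{tr}(\boldsymbol{\Sigma}_4 - \boldsymbol{\Omega}) - \boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4\big] \tag{30}$$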

Assume $\Delta_{T^2} \ge 0$ ($\Delta_Q \ge 0$); then $e_{T^2} > 0$ ($e_Q > 0$) holds only when

$$m > m_{T^2} = \frac{\mathrm{tr}(\boldsymbol{\Sigma}_3 - \boldsymbol{\Theta}) + r\sqrt{\Delta_{T^2}}}{2\boldsymbol{\mu}_3^{T}\boldsymbol{\mu}_3} \qquad \left(m > m_Q = \frac{\mathrm{tr}(\boldsymbol{\Sigma}_4 - \boldsymbol{\Omega}) + r\sqrt{\Delta_Q}}{2\boldsymbol{\mu}_4^{T}\boldsymbol{\mu}_4}\right).$$

That is to say, PS-PCA needs $m_{T^2}$ ($m_Q$) steps for its statistic expectations to catch up with those of traditional PCA. Constants $m_{T^2}$ and $m_Q$ are termed the detection delays of PS-PCA, and they are monotonically increasing with respect to parameter $r$. Because $E(T^2)|_{m=r\ne 1} > E(T^2)|_{m=r=1}$ and $E(Q)|_{m=r\ne 1} > E(Q)|_{m=r=1}$, one knows that $m_{T^2} < r$ and $m_Q < r$.

If $\Delta_{T^2} < 0$ ($\Delta_Q < 0$), then $e_{T^2} > 0$ ($e_Q > 0$) for all $m \le r$, so $m_{T^2} = 0$ ($m_Q = 0$). In other words, PS-PCA has no detection delay compared with conventional PCA in this situation.

REMARK 3. For a fault that can be detected by conventional PCA, a non-zero detection delay means that PS-PCA may detect this fault later than PCA, and a larger detection delay indicates a longer lag.

Because $\boldsymbol{\mu}_3$, $\boldsymbol{\mu}_4$, $\boldsymbol{\Sigma}_3$, and $\boldsymbol{\Sigma}_4$ all depend on $\boldsymbol{\mu}_{fault}$ and/or $\boldsymbol{\Sigma}_{fault}$, both $\Delta_{T^2}$ and $\Delta_Q$ are affected by the fault data. That is to say, the influence of parameter $r$ on the detection delay varies with the fault. For some faults, a larger value of $r$ will introduce a larger delay, while for others it may introduce a smaller delay. All in all, a larger $r$ may increase the detection rate at the price of a larger detection delay. Hence, an appropriate value of $r$ should be chosen to balance the detection delay and the detection rate: for faults that are difficult to detect, a larger $r$ is more suitable because it can significantly increase the fault detection rate; for faults that are easy to detect, a larger $r$ is unreasonable because it significantly amplifies the detection delay without improving the detection rate much.
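The trade-off between $r$ and the detection delay can also be explored numerically without the closed-form delay expressions. The sketch below is an illustration under assumptions, not the paper's procedure: it evaluates the expectation in Eq. (21) directly for $m = 1, \ldots, r$ and reports the first $m$ at which it exceeds the traditional PCA value of Eq. (23). The function names and the example parameter values in the usage note are hypothetical.

```python
import numpy as np


def expected_t2(mu3, sigma3, theta, m, r):
    """E(T^2) from Eq. (21) for m faulty samples in a window of length r."""
    w = m / r
    return (m ** 2 / r) * float(mu3 @ mu3) + float(np.trace(w * sigma3 + (1.0 - w) * theta))


def catch_up_step(mu3, sigma3, theta, r):
    """Smallest m in 1..r whose E(T^2) exceeds the PCA value (m = r = 1), or None."""
    pca_value = expected_t2(mu3, sigma3, theta, 1, 1)
    for m in range(1, r + 1):
        if expected_t2(mu3, sigma3, theta, m, r) > pca_value:
            return m
    return None
```

For example, with `mu3 = np.array([1.0, 2.0])`, `sigma3 = np.diag([2.0, 3.0])`, and `theta = np.eye(2)`, the function returns 3 for `r = 4` and 4 for `r = 10`, so in this particular case a larger summation number needs more faulty samples before the expected statistic overtakes that of PCA.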

4 SIMULATION STUDY OF PS-PCA

This section studies the performance of PS-PCA on non-Gaussian and nonlinear processes through simulation tests. All experiments were carried out on the same computer with a 2.93-GHz Core CPU, 2 GB of memory, and the Windows 7 operating system.

4.1 Test on a simulated non-Gaussian process

To fully analyze the characteristics of PS-PCA and to compare it with other MSPC methods (PCA, ICA-SVDD, and LOCAL-ICA), a simple simulated process, similar to that used by Jiang and Yan [26], is employed to illustrate the monitoring performance of these methods. Consider the following simple process with six Gaussian distributed variables and six non-Gaussian distributed variables:

$$\begin{cases} x_1 = 0.1 \times N_1 + 0.01 \times \omega_1 \\ x_2 = 5 + 0.15 \times N_2 + 0.015 \times \omega_2 \\ x_3 = 7 + 0.1 \times P_3 + 0.01 \times \omega_3 \\ x_4 = 9 + 0.1 \times R_4 + 0.01 \times \omega_4 \\ x_5 = 11 - 0.3 \times x_1 + 0.8 \times x_2 + 0.9 \times x_3 + 0.3 \times N_5 + 0.03 \times \omega_5 \\ x_6 = 6 + x_2 - 0.3 \times x_3 + x_5 + 0.35 \times N_6 + 0.035 \times \omega_6 \\ x_7 = 8 - 0.5 \times x_1 + 0.8 \times x_2 + x_4 + 0.01 \times \omega_7 \\ x_8 = 15 + x_2 + x_3 + 0.1 \times N_8 + 0.01 \times \omega_8 \\ x_9 = 3 + 0.5 \times N_9 + 0.05 \times \omega_9 \\ x_{10} = 20 + 0.7 \times N_{10} + 0.07 \times \omega_{10} \\ x_{11} = x_9 + 0.8 \times x_{10} + 0.01 \times \omega_{11} \\ x_{12} = 0.3 \times x_1 + 0.6 \times x_2 + 0.4 \times x_{10} + 0.05 \times \omega_{12} \end{cases}$$

Random variables $N_i$ and $\omega_i$ follow the standard Gaussian distribution, and $\omega_i$ represents the process noise. Random variables $P_i$ and $R_i$ follow the Poisson distribution and the Rayleigh distribution, respectively. This process can be regarded as a simplified model of industrial processes. In this model, variables $x_1$, $x_2$, $x_9$, $x_{10}$, $x_{11}$, and $x_{12}$ are Gaussian distributed and represent the Gaussian components in industrial processes. Among them, variables $x_1$, $x_2$, $x_9$, and $x_{10}$ each contain only one Gaussian component, while the other two variables each contain two or more Gaussian components. In addition, $x_1$ has zero mean while the other five Gaussian variables do not. The Poisson and Rayleigh distributions are common non-Gaussian distributions, and they are adopted to represent the non-Gaussian components in industrial processes [30-32]. Variables $x_3$ and $x_4$ each contain one non-Gaussian component, and the remaining four variables $x_5$, $x_6$, $x_7$, and $x_8$ contain both Gaussian and non-Gaussian components. At most 30,000 normal observations are produced for offline modeling. Furthermore, another 3,000 samples are generated for online monitoring, where a fault occurs at the 1001st sample point. The fault may be one of the following nine types:

Fault 1: a step change with amplitude of 0.1 in $x_1$;
Fault 2: a step change with amplitude of 0.1 in $x_2$;
Fault 3: a step change with amplitude of 0.1 in $x_3$;
Fault 4: coefficient 0.8 in $x_5$ is changed to 0.88;
Fault 5: coefficient 0.8 in $x_7$ is changed to 0.88;
Fault 6: a step change with amplitude of 0.1 in $x_5$;
Fault 7: coefficient -0.3 in $x_6$ is changed to -0.25;
Fault 8: term $-0.3 \times x_1$ in the expression of $x_5$ is missing;
Fault 9: term $-0.3 \times x_3$ in the expression of $x_6$ is missing.

These faults are common in process monitoring, and they can be divided into three categories: Faults 1, 2, 3, and 6 are step change faults [26], which do not change the correlation between variables; Faults 4, 5, and 7 are coefficient change faults [18], which can affect the correlation between variables; Faults 8 and 9 are component missing faults, which can be regarded as a special case of the coefficient change fault. The faults occur in different variables to make the detection results more convincing.
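The simulated process can be generated with a few lines of NumPy. This is a sketch under assumptions: the paper does not state the Poisson rate or the Rayleigh scale, so both are exposed as hypothetical parameters with a default of 1, and the function name `simulate_process` is also hypothetical. As one example of a fault, the coefficient of $x_2$ in $x_5$ is exposed as a parameter so that Fault 4 can be reproduced by passing 0.88; the other faults are not implemented here.

```python
import numpy as np


def simulate_process(n, rng, poisson_rate=1.0, rayleigh_scale=1.0, coef_x2_in_x5=0.8):
    """Generate n samples of the 12-variable process (columns are x1..x12)."""
    N = rng.standard_normal((n, 13))     # column i plays the role of N_i (column 0 unused)
    w = rng.standard_normal((n, 13))     # column i plays the role of the noise omega_i
    P3 = rng.poisson(poisson_rate, n)    # assumed rate, not given in the paper
    R4 = rng.rayleigh(rayleigh_scale, n) # assumed scale, not given in the paper
    x = np.zeros((n, 13))
    x[:, 1] = 0.1 * N[:, 1] + 0.01 * w[:, 1]
    x[:, 2] = 5 + 0.15 * N[:, 2] + 0.015 * w[:, 2]
    x[:, 3] = 7 + 0.1 * P3 + 0.01 * w[:, 3]
    x[:, 4] = 9 + 0.1 * R4 + 0.01 * w[:, 4]
    x[:, 5] = (11 - 0.3 * x[:, 1] + coef_x2_in_x5 * x[:, 2] + 0.9 * x[:, 3]
               + 0.3 * N[:, 5] + 0.03 * w[:, 5])
    x[:, 6] = 6 + x[:, 2] - 0.3 * x[:, 3] + x[:, 5] + 0.35 * N[:, 6] + 0.035 * w[:, 6]
    x[:, 7] = 8 - 0.5 * x[:, 1] + 0.8 * x[:, 2] + x[:, 4] + 0.01 * w[:, 7]
    x[:, 8] = 15 + x[:, 2] + x[:, 3] + 0.1 * N[:, 8] + 0.01 * w[:, 8]
    x[:, 9] = 3 + 0.5 * N[:, 9] + 0.05 * w[:, 9]
    x[:, 10] = 20 + 0.7 * N[:, 10] + 0.07 * w[:, 10]
    x[:, 11] = x[:, 9] + 0.8 * x[:, 10] + 0.01 * w[:, 11]
    x[:, 12] = 0.3 * x[:, 1] + 0.6 * x[:, 2] + 0.4 * x[:, 10] + 0.05 * w[:, 12]
    return x[:, 1:]
```

Keeping a dummy column 0 lets the code use the same variable indices as the process equations, at the cost of one unused column.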

4.1.1 The influence of summation number on Gaussianity

To test the influence of the summation number r on Gaussianity, kurtosis [14] is adopted. In ICA, kurtosis is used to measure non-Gaussianity, and hence it can also reflect Gaussianity. The kurtosis of x is defined as

$$\mathrm{kurt}(x) = E\{x^4\} - 3\left(E\{x^2\}\right)^2 \qquad (31)$$

For a zero-mean Gaussian variable x, the fourth moment $E\{x^4\}$ equals $3(E\{x^2\})^2$. Thus, kurtosis is zero for a Gaussian random variable. For most non-Gaussian random variables, kurtosis is non-zero. Kurtosis can be either positive or negative. A random variable with negative kurtosis is classified as sub-Gaussian, and one with positive kurtosis as super-Gaussian. Generally, Gaussianity can be measured by the absolute value of kurtosis: a smaller absolute value indicates better Gaussianity, and vice versa.

In this paper, the expectation in (31) is calculated as $E(\chi) = \sum_{i=1}^{100} \chi_i / 100$. To test the influence of the summation number r on the Gaussianity of this process, r is varied from 1 to 100, and for each value the maximum absolute kurtosis over the 12 variables is recorded. Twenty groups of Monte Carlo tests were run, and the average results are shown in Figure 2.
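The effect of summation on kurtosis can be illustrated with a small numerical experiment. The following sketch is not the authors' code: it assumes a single variable that is uniformly distributed on [−1, 1], standardizes the data before applying (31), adds up non-overlapping groups of r samples, and uses far more than 100 sums so that the estimate is stable. All function names are hypothetical.

```python
import random


def kurtosis(samples):
    """Sample version of kurt(x) = E{x^4} - 3 (E{x^2})^2 after standardizing x."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    z = [(s - mean) / var ** 0.5 for s in samples]
    m2 = sum(v ** 2 for v in z) / n
    m4 = sum(v ** 4 for v in z) / n
    return m4 - 3.0 * m2 ** 2


def block_sums(samples, r):
    """Add up consecutive, non-overlapping groups of r samples (assumed grouping)."""
    return [sum(samples[i:i + r]) for i in range(0, len(samples) - r + 1, r)]


def kurtosis_of_sums(r, n_sums=20000, seed=0):
    """Kurtosis of sums of r uniform samples; the uniform source is an assumption."""
    rng = random.Random(seed)
    raw = [rng.uniform(-1.0, 1.0) for _ in range(r * n_sums)]
    return kurtosis(block_sums(raw, r))


if __name__ == "__main__":
    for r in (1, 2, 5, 12):
        print(r, abs(kurtosis_of_sums(r)))
```

For independent uniform samples, the kurtosis of the standardized sum is −1.2/r in theory, so the printed magnitudes should shrink roughly in proportion to 1/r.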

(insert Figure 2 here)

As shown in Figure 2, even though there are some fluctuations, the general trend is clear: the kurtosis decreases quickly as r increases from 1 to 10 and becomes steady once r exceeds 12. This result indicates that the Gaussianity of the data can be increased by summation and that r need not approach infinity: a value as small as 12 is enough to make this process approximately Gaussian.

4.1.2 PS-PCA fault detection results

The result in Section 4.1.1 indicates that when the summation number r is large enough, a non-Gaussian process will be very close to a Gaussian process, so PCA can be adopted to detect faults. In this test, the number of sums $n_{new}$ for PS-PCA is set to 300, and r is taken as 1, 2, 5, 10, 50, and 100, so the number of training data samples n is 300 ($300 \times 1$), 600 ($300 \times 2$), 1,500 ($300 \times 5$), 3,000 ($300 \times 10$), 15,000 ($300 \times 50$), and 30,000 ($300 \times 100$), respectively. For the sake of fairness, n for PCA and LOCAL-ICA is the same as for PS-PCA. However, when the number of training data samples is very large, the computational load of ICA-SVDD exceeds the capability of the above-mentioned computer. Hence, n for ICA-SVDD is chosen as 300, 600, 1,000, 1,500, 3,000, and 6,000. For convenience and fairness, the number of dominant PCs k for PS-PCA is set the same as that for PCA with 300 training data. In this study, all control limits are based on the confidence limit of 99%. Table 1 shows the false alarm rates of the four methods and Table 2 shows the detection rates of the nine faults. The false alarm rate is calculated as

$$\frac{\text{the number of faults detected before 1,000}}{1{,}000}$$

and the detection rate is calculated as

$$\frac{\text{the number of faults detected between 1,001 and 3,000}}{2{,}000}.$$

(insert Table 1 here)

(insert Table 2 here)
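The relation n = $n_{new}$ × r used above can be made concrete with a short sketch of the preliminary summation applied to training data. It assumes that the sums are formed over consecutive, non-overlapping groups of r samples, which is inferred from that relation rather than stated here; the function name is hypothetical.

```python
def preliminary_summation(data, r):
    """Sum each variable over consecutive, non-overlapping groups of r samples.

    data is a list of rows, one row per sample and one column per variable.
    Returns n_new = len(data) // r summed rows; leftover samples are dropped.
    """
    n_new = len(data) // r
    n_vars = len(data[0])
    sums = []
    for k in range(n_new):
        block = data[k * r:(k + 1) * r]
        sums.append([sum(row[j] for row in block) for j in range(n_vars)])
    return sums
```

With 1,500 training samples and r = 5, for instance, this returns 300 summed rows, which are then passed to ordinary PCA in place of the raw samples.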

In Table 1, when n is smaller than 3,000, LOCAL-ICA has a large false alarm rate, and thus Table 2 lists its detection rate only for n = 3,000, 15,000, and 30,000. Except for LOCAL-ICA, there is no significant difference in the false alarm rates of the other three methods. This is reasonable because neither $\tilde{x}'_0$ nor $\tilde{x}'_1$ in (11) and (12) is affected by the summation number r, and thus PS-PCA has the same false alarm rate as PCA. However, the detection rates of PS-PCA and the other three methods differ widely in Table 2. In this table, PCA and ICA-SVDD can only detect Faults 3 and 9. Because LOCAL-ICA uses a summation method to calculate its monitoring statistics, the fault information can be accumulated in a similar way to PS-PCA, and hence it can also detect Faults 5 and 7. However, LOCAL-ICA cannot perfectly separate Gaussian and non-Gaussian components, and thus it fails to detect the other five faults. For PS-PCA, the preliminary summation can successfully convert non-Gaussian components to Gaussian components; as a result, when r is 100, its detection rates for all faults except Fault 6 are nearly 100%. Even for Fault 6, the detection rates of PS-PCA are 28.6% and 25.6%, which are much better than those of the other three methods. From the conclusion in Section 3.1, the detection rate of Fault 6 will be larger when r is larger, and it will reach 100% as $r \to \infty$.

One interesting phenomenon is that the detection rates of PS-PCA for most faults are slightly less than 100%, even for Faults 3 and 9. This is due to the detection delay of PS-PCA. Because Faults 3 and 9 can be detected easily by traditional PCA, a larger value of r not only fails to improve the fault detection ability of PS-PCA but also noticeably increases the detection delay. As a result, unlike the other faults, whose detection rates increase with r, the detection rates for Faults 3 and 9 decrease. For Fault 9, PS-PCA achieves a 99.9% detection rate when r = 5, but its detection rate decreases to 99.6% when r = 100, which means that r = 5 is large enough for Fault 9 and a larger value only increases the detection delay. In general, the effect of the fault detection delay is very small, and PS-PCA is much better than the other methods.

In order to clearly illustrate the superiority of PS-PCA, the monitoring charts for Faults 1, 3, and 9 are shown in Figures 3–5. To display the curves clearly, some charts are drawn with logarithmic coordinates. In Figures 3–5, the mean values of the $T^2$ and Q statistics in PS-PCA increase with r, and they are much larger than those in PCA and ICA-SVDD. For example, in Figure 5, the statistics of PCA and ICA-SVDD marginally exceed the control limit soon after the fault occurs, and they sometimes fall back below the control limit. However, for PS-PCA, when r is large enough (such as 100), its $T^2$ and Q statistics are far beyond the control limit and do not fall below it again. As a result, the detection rates of PS-PCA are much larger than those of these two methods. Statistic $T_{ng}^2$ in Figure 4 and statistic $T_e^2$ in Figure 5 show that LOCAL-ICA can also detect Faults 3 and 9, because of the summation process in its monitoring statistics. However, in Figure 3, PS-PCA successfully detects the fault while LOCAL-ICA fails, which demonstrates that preliminary summation is a more effective way to handle the non-Gaussian problem, and hence it is a simple but effective improvement for PCA.

(insert Figure 3 here)

(insert Figure 4 here)

(insert Figure 5 here)

In addition, in Figure 5, the mean values of $T^2$ and Q are 9 and 13 for r = 1, 50 and 100 for r = 10, and 400 and 1,000 for r = 100, respectively, which indicates that $E(T^2) \approx 4r + 5$ and $E(Q) \approx 9r + 10$. This result is consistent with the conclusion in Section 3.2.

To test the influence of the summation number r on the detection rate, two situations were investigated: fixing the number of sums $n_{new}$ at 300 and changing r from 1 to 100; and fixing n at 30,000 and changing r from 1 to 100. The results for Faults 1, 4, and 6 are shown in Figures 6–8.

(insert Figure 6 here)

(insert Figure 7 here)

(insert Figure 8 here)

From these results it can be seen that, for both strategies, the detection rates increase as r increases, and when r is large enough (such as 100), the detection rate is very close to 100%, which is consistent with the conclusion in Section 3.1. For the second strategy, as n is fixed at 30,000, $n_{new}$ decreases when r increases. However, all three figures show that the detection rate still increases even as $n_{new}$ decreases. The reason is that $n_{new}$ only affects the training data, while r affects both the training and monitoring data. When $n_{new}$ is large enough, a further increase of $n_{new}$ no longer affects the training result, while an increase of r helps PS-PCA accumulate more fault information in the monitoring data and improves the detection performance. Therefore, r affects the detection rates in two ways: making the sums close to the Gaussian distribution and gathering more fault information. As a result, PS-PCA is more sensitive to faults and has the ability to deal with non-Gaussianity.
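One way to see why a larger r gathers more fault information is a back-of-the-envelope calculation. It assumes, purely for illustration, that the samples of a variable are independent in time with variance $\sigma^2$ and that the fault is a constant step of amplitude f; neither assumption is taken from the derivation in Section 3.1.

```latex
\begin{aligned}
\tilde{x}(t) &= \sum_{i=0}^{r-1} x(t-i), \\
E\{\tilde{x}(t)\}_{\text{fault}} - E\{\tilde{x}(t)\}_{\text{normal}} &= r f, \\
\operatorname{std}\{\tilde{x}(t)\} &= \sqrt{r}\,\sigma, \\
\frac{r f}{\sqrt{r}\,\sigma} &= \sqrt{r}\,\frac{f}{\sigma}.
\end{aligned}
```

Under these assumptions the ratio of mean shift to noise grows with $\sqrt{r}$: compared with r = 1, r = 100 makes the same step ten times larger relative to the noise, once all r summed samples are faulty.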

As mentioned earlier, PS-PCA requires more computation than PCA because of the additional preliminary summation step. The time costs of the four compared methods are listed in Table 3.

(insert Table 3 here)

In Table 3, the offline modeling time of the other methods is much longer than that of PS-PCA, and it grows further as the number of training data points increases. For PS-PCA, the time cost of preliminary summation in offline modeling is about 1/20 of that required for PCA, and preliminary summation in online monitoring needs less than 1/12 of the time needed for online monitoring. Although the time cost of preliminary summation becomes larger when r increases, it does not increase much because the sums are calculated recursively. Hence, PS-PCA with 1,500 or 30,000 training data samples needs almost the same computation as traditional PCA with 300 training data samples, and thus it needs much less time than the other methods with 1,500, 6,000, or 30,000 training data samples.
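The remark that the sums are calculated recursively can be sketched as follows. This is an illustration under the assumption that online monitoring uses a sliding window over the latest r samples, so that each new sum is obtained from the previous one by adding the newest sample and subtracting the oldest; the class name and interface are hypothetical.

```python
from collections import deque


class SlidingSum:
    """Maintain the sum of the latest r samples of each variable recursively."""

    def __init__(self, r, n_vars):
        self.r = r
        self.window = deque()
        self.total = [0.0] * n_vars

    def update(self, sample):
        """Add a new sample; return the current sum once r samples are held, else None."""
        self.window.append(sample)
        for j, value in enumerate(sample):
            self.total[j] += value
        if len(self.window) > self.r:
            oldest = self.window.popleft()
            for j, value in enumerate(oldest):
                self.total[j] -= value
        if len(self.window) == self.r:
            return list(self.total)
        return None
```

Each update costs a number of additions proportional to the number of variables and independent of r, which is in line with the observation that the summation time grows only slightly with r.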

Overall, PS-PCA is more sensitive to process faults and has better fault detection rates for processes with both Gaussian and non-Gaussian features, while adding only a small computational load when using the same $n_{new}$.

4.2 Test on a simulated nonlinear process

Besides non-Gaussianity, nonlinearity is also widespread in industrial processes, and it is another significant challenge for PCA. It has been demonstrated above that PS-PCA performs very well on linear processes. In this section, PS-PCA is applied to a nonlinear process and compared with traditional nonlinear methods, such as KPCA, KICA, and KICA-PCA.

The simulated process is as follows:

$$
\begin{cases}
x_1 = e^{U_1} + 0.01 \times \omega_1 \\
x_2 = U_1^2 + 0.01 \times \omega_2 \\
x_3 = U_1 + U_2^2 + 0.01 \times \omega_3 \\
x_4 = \sin(U_2 + N_1) + 0.01 \times \omega_4 \\
x_5 = e^{(N_1 + N_2)} + 0.01 \times \omega_5 \\
x_6 = (U_1 + U_2 + N_1 + N_2)^2 + 0.01 \times \omega_6 \\
x_7 = U_1 \times U_2 \times N_1 \times N_2 + 0.01 \times \omega_7 \\
x_8 = U_1^3 + U_2^3 + N_1^3 + N_2^3 + 0.01 \times \omega_8 \\
x_9 = U_1 \times e^{N_2} + 0.01 \times \omega_9 \\
x_{10} = N_1 \times \sin(N_2) + 0.01 \times \omega_{10} \\
x_{11} = U_1 + 2 \times U_2 + 3 \times N_1 + 4 \times N_2 + 0.01 \times \omega_{11} \\
x_{12} = U_1 U_2 + 0.01 \times \omega_{12}
\end{cases}
$$

Random variables $N_i$ and $\omega_i$ follow the standard Gaussian distribution, and $\omega_i$ represents the process noise. Random variables $U_i$ follow the uniform distribution on the interval $[-1, 1]$. At most 60,000 normal observations are produced for offline modeling. Furthermore, another 3,000 samples are generated for online monitoring, where a fault occurs at the 1,001st sample point. The fault is one of the following seven types:

Fault 1: a step change with amplitude of 0.1 in $x_1$;
Fault 2: a step change with amplitude of 0.1 in $x_2$;
Fault 3: a step change with amplitude of 0.1 in $x_3$;
Fault 4: term $e^{(s_3 + s_4)}$ in $x_5$ is changed to $0.5 \times e^{(s_3 + s_4)}$;
Fault 5: term $(s_1 + s_2 + s_3 + s_4)^2$ in $x_6$ is changed to $(s_1 + s_2 + s_3 + 5 \times s_4)^2$;
Fault 6: term $s_1 \times s_2 \times s_3 \times s_4$ in $x_7$ is changed to $s_1 \times s_2$;
Fault 7: term $s_1^3$ in $x_8$ is missing.

Matching these terms against the process equations, $s_1, s_2, s_3, s_4$ correspond to $U_1, U_2, N_1, N_2$, respectively.
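The data generation described above can be sketched in a few lines. The sketch reads the process equations as $x_1 = e^{U_1}$, $x_2 = U_1^2$, $x_3 = U_1 + U_2^2$, and so on, with $x_{12}$ taken as the product $U_1 U_2$ (an assumed reading), and it implements only the three step faults. Function names, the seed handling, and the 0-based indexing are choices made for this illustration.

```python
import math
import random


def simulate_sample(rng, step_fault=None):
    """Draw one 12-variable sample; step_fault in {1, 2, 3} adds 0.1 to x1, x2, or x3."""
    u1, u2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    n1, n2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    clean = [
        math.exp(u1),
        u1 ** 2,
        u1 + u2 ** 2,
        math.sin(u2 + n1),
        math.exp(n1 + n2),
        (u1 + u2 + n1 + n2) ** 2,
        u1 * u2 * n1 * n2,
        u1 ** 3 + u2 ** 3 + n1 ** 3 + n2 ** 3,
        u1 * math.exp(n2),
        n1 * math.sin(n2),
        u1 + 2 * u2 + 3 * n1 + 4 * n2,
        u1 * u2,  # assumed reading of the x12 equation
    ]
    # Add the process noise 0.01 * omega_i to every variable.
    x = [value + 0.01 * rng.gauss(0.0, 1.0) for value in clean]
    if step_fault in (1, 2, 3):
        x[step_fault - 1] += 0.1
    return x


def simulate_monitoring_data(step_fault=None, n_samples=3000, fault_start=1000, seed=0):
    """Normal samples first; the fault is active from index fault_start (the 1,001st sample)."""
    rng = random.Random(seed)
    data = []
    for t in range(n_samples):
        active = step_fault if t >= fault_start else None
        data.append(simulate_sample(rng, active))
    return data
```

Calling `simulate_monitoring_data(step_fault=1)` gives a 3,000-sample monitoring set with Fault 1, and `simulate_monitoring_data(n_samples=60000, fault_start=60000)` gives fault-free data of the size used for offline modeling.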

It is a nonlinear process with both Gaussian and non-Gaussian features, so the control limits of the $T^2$ and Q statistics in PS-PCA cannot be determined directly from a particular approximate distribution. Therefore, the control limits for these two statistics are obtained by kernel density estimation, as is done for ICA.
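A control limit based on kernel density estimation can be sketched as follows. The sketch is not the authors' implementation: it assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and a bisection search for the point at which the estimated cumulative distribution reaches the chosen confidence level. The function names and defaults are hypothetical.

```python
import math


def kde_cdf(x, samples, bandwidth):
    """CDF at x of a Gaussian kernel density estimate built on the samples."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def kde_control_limit(samples, confidence=0.95, bandwidth=None):
    """Value below which the KDE places the given fraction of probability mass."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not taken from the paper.
        bandwidth = 1.06 * std * n ** (-0.2)
    low = min(samples) - 10.0 * bandwidth
    high = max(samples) + 10.0 * bandwidth
    for _ in range(60):  # bisection on the monotone CDF
        mid = 0.5 * (low + high)
        if kde_cdf(mid, samples, bandwidth) < confidence:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

In use, `samples` would hold the $T^2$ (or Q) values computed on the summed training data, and a monitoring sample would raise an alarm when its statistic exceeds the returned limit.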

In the same manner as in Section 4.1, the number of sums $n_{new}$ for PS-PCA is set to 300, and r is taken as 1, 5, 10, 50, 100, and 200, so the number of training data samples n is 300 ($300 \times 1$), 1,500 ($300 \times 5$), 3,000 ($300 \times 10$), 15,000 ($300 \times 50$), 30,000 ($300 \times 100$), and 60,000 ($300 \times 200$), respectively. However, when the number of training data samples is very large, the computation of KPCA, KICA, and KICA-PCA exceeds the capability of the above-mentioned computer. Hence, n for KPCA, KICA, and KICA-PCA is chosen as 300, 1,500, and 15,000. In this test, all control limits are based on the confidence limit of 95%. For convenience and fairness, the number of dominant PCs k for PS-PCA is set the same as that for KPCA with 300 training data. The kernel functions and their parameters in KPCA, KICA, and KICA-PCA are selected by trial and error over many attempts. Table 4 shows the false alarm rates of the four methods and Table 5 shows the detection rates for the seven types of faults. The false alarm rate is calculated as

$$\frac{\text{the number of faults detected before 1,000}}{1{,}000}$$

and the detection rate is calculated as

$$\frac{\text{the number of faults detected between 1,001 and 3,000}}{2{,}000}.$$

(insert Table 4 here)

(insert Table 5 here)
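The two rates defined above amount to counting alarms before and after the fault onset. The following minimal sketch assumes that the monitoring run has already been reduced to one Boolean alarm flag per sample (how the $T^2$ and Q alarms are combined into that flag is left open), and that the fault starts at the 1,001st sample. The function name is hypothetical.

```python
def alarm_rates(alarms, fault_onset=1000):
    """Return (false alarm rate, detection rate) from per-sample alarm flags.

    alarms[t] is True when the statistic at sample t + 1 exceeds its control limit.
    Samples before fault_onset are fault-free; the rest are faulty.
    """
    before = alarms[:fault_onset]
    after = alarms[fault_onset:]
    false_alarm_rate = sum(before) / len(before)
    detection_rate = sum(after) / len(after)
    return false_alarm_rate, detection_rate
```

For a 3,000-sample run, the denominators are 1,000 and 2,000, matching the two expressions above.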

From Tables 4 and 5, it can be seen that PS-PCA performs much better than the other three methods, even though it is a linear method while the other three were proposed for nonlinear processes. In Table 4, there is no significant difference in false alarm rates among the four methods. The detection rates of PS-PCA are much larger than those of the other three methods. Across the seven faults, KPCA, KICA, and KICA-PCA can only detect Fault 5, and their detection rates are very small. Even for Fault 7, PS-PCA still performs much better than the other three methods, and its performance can be improved by increasing the number of training data, whereas that of the other three methods cannot. Similarly to the results in Section 4.1.2, the detection rate of PS-PCA increases when r increases. Figures 9–11 show the monitoring charts of PS-PCA for Faults 2, 4, and 6.

(insert Figure 9 here)

(insert Figure 10 here)

(insert Figure 11 here)

Just as in Figures 3–5, the mean values of the $T^2$ and Q statistics increase when r gets larger. However, in Figures 3–5 both statistics increase with r, while in Figures 9–11 sometimes only one statistic gets larger and the other shows no regular variation. This phenomenon indicates that there are some differences between applying PS-PCA to linear processes and to nonlinear processes. Understanding this phenomenon is an area for future research.

Table 6 shows the time costs of the four compared methods. In this table, the other three methods need much more time than PS-PCA. The computational loads of KPCA, KICA, and KICA-PCA grow explosively with the number of training data, while that of PS-PCA hardly changes. More importantly, for KPCA, KICA, and KICA-PCA, both the kernel functions and their parameters have to be set by trial and error [20]. In comparison, the parameter r in PS-PCA can be selected more conveniently. Hence, PS-PCA is a promising algorithm for nonlinear processes.

(insert Table 6 here)

5 FAULT DETECTION IN THE TENNESSEE EASTMAN PROCESS

The Tennessee Eastman (TE) process simulation was developed by Downs and Vogel (1993) [25, 26]; it simulates an industrial process containing both non-Gaussian and nonlinear features. As a benchmark simulation, the TE process has been widely used to test the performance of various monitoring approaches. The process consists of five major unit operations: a reactor, a product condenser, a vapor–liquid separator, a recycle compressor, and a product stripper. Two products are produced by two simultaneous gas–liquid exothermic reactions, and a byproduct is generated by two additional exothermic reactions. The process has 12 manipulated variables, 22 continuous process measurements, and 19 composition measurements sampled less frequently. A set of 20 programmed faults can be introduced to the process; they are listed in Table 7.

(insert Table 7 here)

Table 8 shows the 33 variables used for monitoring, comprising 22 measurements and 11 manipulated variables; the 19 composition measurements are excluded because they are hard to measure in real time, and one manipulated variable, the agitation speed, is excluded because it is not manipulated. In this work, a training data set including at most 60,000 normal samples is used to build the monitoring models. The testing data set includes 3,000 samples, and all faults are introduced from the 1,001st sample and continue until the end. The false alarm rate is calculated as

$$\frac{\text{the number of faults detected before 1,000}}{1{,}000}$$

and the detection rate is calculated as

$$\frac{\text{the number of faults detected between 1,001 and 3,000}}{2{,}000}.$$

(insert Table 8 here)

This section compares PS-PCA with PCA, ICA-SVDD, LOCAL-ICA, KPCA, KICA, and KICA-PCA on the TE process. Because of the limited computational capability of the computer, ICA-SVDD, KPCA, KICA, and KICA-PCA use less training data than the other three methods. All control limits are based on a confidence limit of 99%, and the false alarm rates are listed in Table 9. Table 9 indicates that 300 training data points are not enough for any of these methods because their false alarm rates are too large. To account for this, the numbers of training data samples for PS-PCA and PCA are set to 15,000 ($n_{new} = 300$, $r = 50$) and 60,000 ($n_{new} = 300$, $r = 200$). Because of the large false alarm rate of LOCAL-ICA, its number of training data is set only to 60,000. For the other four methods, the numbers of training data samples are set to 900 and 4,500. The detection rates of these seven methods are listed in Tables 10 and 11.

(insert Table 9 here)

(insert Table 10 here)

(insert Table 11 here)

Both Tables 10 and 11 show that the performance of PS-PCA is superior to that of the other algorithms for most faults. This is especially so for Fault 5, for which the detection rates of the other methods are generally below 25%, whereas the detection rate of PS-PCA is 96.2%. The two tables also show that for some faults which can be detected easily, such as Faults 1, 2, and 6, PS-PCA performs slightly worse than the other methods. The reason is that PS-PCA incurs a detection delay, which affects the fault detection rate. However, the effect is very small, and the detection rate of PS-PCA still reaches almost 100%. For Fault 12, only PS-PCA and LOCAL-ICA can detect it, and LOCAL-ICA achieves a higher detection rate than PS-PCA. However, LOCAL-ICA has a much larger computational load and also incurs a large false alarm rate, so PS-PCA works much better than LOCAL-ICA in terms of overall performance on the 20 faults. The two tables also indicate that, except for Faults 12, 15, and 16, r = 200 is large enough: the detection rates of the remaining faults are almost 100%, and a larger r would only lead to a larger detection delay.

Figures 12 and 13 show the monitoring charts of PS-PCA for Faults 3 and 15. In Figure 12, when r ≤ 10, the mean values of the two statistics decrease as r increases, which seems to contradict the conclusion in Section 3.2. The reason is that, when r ≤ 10, the process does not follow a Gaussian distribution, and thus the mean values of the two statistics do not increase. However, when r > 10, the process is closer to Gaussian, and thus the conclusion in Section 3.2 holds. Figure 12 also shows the relationship between r and the detection delay: $T^2$ reaches its control limit at sample time 1,850 when r = 50, at sample time 1,900 when r = 100, and at sample time 2,000 when r = 200, which indicates that PS-PCA needs more steps to reach the same value when r is larger. In Figure 13, when r = 10, few data points go beyond the control limit. However, more and more data points go beyond the control limit as r gets larger, which shows that a large r can gather more fault information.

(insert Figure 12 here)

(insert Figure 13 here)

It can be concluded from the simulation results that PS-PCA has better fault detection ability and a lower computational load than the other methods. It is therefore a significant improvement over PCA.

6 CONCLUSIONS

In this paper, Preliminary-Summation-based PCA (PS-PCA) was proposed to deal with processes in which Gaussian and non-Gaussian features exist simultaneously. Unlike other improvement approaches, which change the algorithm structure of PCA, PS-PCA only preprocesses the training and monitoring data without any modification of PCA itself. To deal with the non-Gaussian features in the process, PS-PCA adds up the samples of each variable to make the variable distribution close to the Gaussian distribution. Moreover, by adding up the variable samples, the fault information is accumulated and faults can be detected much more easily.

It has been proved that preliminary summation can improve the fault detection rate in Gaussian processes. Tested on a simulated linear non-Gaussian process and the TE process, PS-PCA performed much better than PCA, ICA-SVDD, and LOCAL-ICA: it not only has better fault detection ability but also needs less computation in most situations. It is therefore a promising MSPC method for non-Gaussian processes. Moreover, PS-PCA can be applied to nonlinear processes with superior performance compared with KPCA, KICA, and KICA-PCA. This indicates that PS-PCA has potential in dealing with nonlinearity.

Because PS-PCA is an improved PCA, it inherits some of PCA's drawbacks; e.g., it cannot work with multiple normal states [33]. On the other hand, AHPCA [34], Phase MPCA [35], and other methods [36] have been proposed to deal with the multi-stage issue. These methods could be integrated into PS-PCA, which will be studied in the near future.

ACKNOWLEDGEMENT

This work was supported by the National Natural Science Foundation of China (61374099), the Program for New Century Excellent Talents in University (NCET-13-0652), and the Beijing Higher Education Young Elite Teacher Project (YETP0505).

REFERENCES

1. Ge Z, Song Z. Process monitoring based on independent component analysis-principal component analysis (ICA-PCA) and similarity factors. Industrial & Engineering Chemistry Research. 2007; 46:2054-2063.
2. Zhao C, Gao F. Fault-relevant principal component analysis (FPCA) method for multivariate statistical modeling and process monitoring. Chemometrics and Intelligent Laboratory Systems. 2014; 133:1-16.
3. Zhao C, Wang F, Lu N, Jia M. Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes. Journal of Process Control. 2007; 17:728-741.
4. Elshenawy LM, Yin S, Naik AS, Ding SX. Efficient recursive principal component analysis algorithms for process monitoring. Industrial & Engineering Chemistry Research. 2009; 49:252-259.
5. Zhang Y, Li S, Teng Y. Dynamic processes monitoring using recursive kernel principal component analysis. Chemical Engineering Science. 2012; 72:78-86.
6. Kano M, Nagao K, Hasebe S, Hashimoto I, Ohno H, Strauss R, Bakshi BR. Comparison of multivariate statistical process monitoring methods with applications to the Eastman challenge problem. Computers & Chemical Engineering. 2002; 26:161-174.
7. Jeng J-C. Adaptive process monitoring using efficient recursive PCA and moving window PCA algorithms. Journal of the Taiwan Institute of Chemical Engineers. 2010; 41:475-481.
8. Choi SW, Martin EB, Morris AJ, Lee I-B. Adaptive multivariate statistical process control for monitoring time-varying processes. Industrial & Engineering Chemistry Research. 2006; 45:3108-3118.
9. Wang X, Kruger U, Irwin GW. Process monitoring approach using fast moving window PCA. Industrial & Engineering Chemistry Research. 2005; 44:5691-5702.
10. Zhao C, Sun Y. Subspace decomposition approach of fault deviations and its application to fault reconstruction. Control Engineering Practice. 2013; 21:1396-1409.
11. Zhao C, Gao F, Wang F. An improved independent component regression modeling and quantitative calibration procedure. AIChE Journal. 2010; 56:1519-1535.
12. Zhao C, Wang F, Mao Z, Lu N, Jia M. Adaptive monitoring based on independent component analysis for multiphase batch processes with limited modeling data. Industrial & Engineering Chemistry Research. 2008; 47:3104-3113.
13. Stefatos G, Ben Hamza A. Dynamic independent component analysis approach for fault detection and diagnosis. Expert Systems with Applications. 2010; 37:8606-8617.
14. Hyvärinen A, Oja E. Independent component analysis: algorithms and applications. Neural Networks. 2000; 13:411-430.
15. Hyvärinen A, Pajunen P. Nonlinear independent component analysis: existence and uniqueness results. Neural Networks. 1999; 12:429-439.
16. Hyvärinen A. Survey on independent component analysis. Neural Computing Surveys. 1999; 2:94-128.
17. Liu X, Xie L. Statistical-based monitoring of multivariate non-Gaussian systems. AIChE Journal. 2008; 54:2379-2391.
18. Ge Z, Xie L. Local ICA for multivariate statistical fault diagnosis in systems with unknown signal and error distributions. AIChE Journal. 2012; 58:2357-2372.
19. Peña VH, Víctor H, Lai TL, Shao Q-M. Self-normalized processes: limit theory and statistical applications. Springer, 2009.
20. Zhang Y. Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM. Chemical Engineering Science. 2009; 64:801-811.
21. Ge Z, Yang C, Song Z. Improved kernel PCA-based monitoring approach for nonlinear processes. Chemical Engineering Science. 2009; 64:2245-2255.
22. Xu Y, Zhang D, Song F, Yang J-Y, Jing Z, Li M. A method for speeding up feature extraction based on KPCA. Neurocomputing. 2007; 70:1056-1061.
23. Zhang Y, Qin SJ. Improved nonlinear fault detection technique and statistical analysis. AIChE Journal. 2008; 54:3207-3220.
24. Fan J, Qin SJ, Wang Y. Online monitoring of nonlinear multivariate industrial processes using filtering KICA–PCA. Control Engineering Practice. 2014; 22:205-216.
25. Khediri IB, Limam M, Weihs C. Variable window adaptive kernel principal component analysis for nonlinear nonstationary process monitoring. Computers & Industrial Engineering. 2011; 61:437-446.
26. Jiang Q, Yan X. Chemical processes monitoring based on weighted principal component analysis and its application. Chemometrics and Intelligent Laboratory Systems. 2012; 11-20.
27. Zhang Y, Li S. Modeling and monitoring of nonlinear multi-mode processes. Control Engineering Practice. 2014; 22:194-204.
28. He QP, Wang J. Statistics pattern analysis-a new process monitoring framework and its application to semiconductor batch processes. AIChE Journal. 2014; 57:107-121.
29. Wang J, He QP. Multivariate process monitoring based on statistics pattern analysis. Industrial & Engineering Chemistry Research. 2010; 49:7858-7869.
30. Xie M, He B, Goh T. Zero-inflated Poisson model in statistical process control. Computational Statistics & Data Analysis. 2001; 38:191-201.
31. Babus F, Kobi A, Tiplica T, Bacivarov I, Bacivarov A. Control charts for non-Gaussian distributions. In: Advanced Topics in Optoelectronics, Microelectronics, and Nanotechnologies III. 2007; 66350I-66350I-8.
32. Lee W-C, Wu J-W, Hong M-L, Lin L-S, Chan R-L. Assessing the lifetime performance index of Rayleigh products based on the Bayesian estimation under progressive type II right censored samples. Journal of Computational and Applied Mathematics. 2011; 235:1676-1688.
33. Zhao SJ, Zhang J, Xu YM. Monitoring of processes with multiple operating modes through multiple principle component analysis models. Industrial & Engineering Chemistry Research. 2004; 43:7025-7035.
34. Rännar S, MacGregor JF, Wold S. Adaptive batch monitoring using hierarchical PCA. Chemometrics and Intelligent Laboratory Systems. 1998; 41:73-81.
35. Dong D, McAvoy TJ. Batch tracking via nonlinear principal component analysis. AIChE Journal. 1996; 42:2199-2208.
36. Yao Y, Gao F. A survey on multistage/multiphase statistical modeling methods for batch processes. Annual Reviews in Control. 2009; 33:172-183.

Captions of Figures and Tables

Figure 1. The process of summation.
Figure 2. Relationship of Gaussianity and summation number r.
Figure 3. Monitoring chart for Fault 1.
Figure 4. Monitoring chart for Fault 3.
Figure 5. Monitoring chart for Fault 9.
Figure 6. Fault detection rates for Fault 1 in two situations.
Figure 7. Fault detection rates for Fault 4 in two situations.
Figure 8. Fault detection rates for Fault 6 in two situations.
Figure 9. Monitoring chart for Fault 2.
Figure 10. Monitoring chart for Fault 4.
Figure 11. Monitoring chart for Fault 6.
Figure 12. Monitoring chart for Fault 3.
Figure 13. Monitoring chart for Fault 15.

Table 1. False alarm rates (%) of four methods.
Table 2. Detection rates (%) for nine faults.
Table 3. Time costs (s) for four compared methods.
Table 4. False alarm rates (%) of four methods.
Table 5. Detection rates (%) for seven types of faults.
Table 6. Time costs (s) for four comparison nonlinear methods.
Table 7. Fault descriptions for TE process.
Table 8. Monitored variables in TE process.
Table 9. False alarm rates (%) of seven compared methods.
Table 10. Detection rates (%) of PS-PCA and three linear methods in TE process.
Table 11. Detection rates (%) of PS-PCA and three nonlinear methods in TE process.

[Figure 1 diagram: groups of r samples of the training data X are added to give $\tilde{X}(1), \tilde{X}(2), \tilde{X}(3), \ldots, \tilde{X}(n_{new}-1), \tilde{X}(n_{new})$; groups of r samples of the monitoring data x are added to give $\tilde{x}(t), \tilde{x}(t+1), \tilde{x}(t+2), \ldots$]

Figure 1. The process of summation. (a) the training data; (b) the monitoring data.

[Figure 2 plot: kurt (vertical axis, 0 to 0.2) versus r (horizontal axis, 0 to 100)]

Figure 2. Relationship of Gaussianity and summation number r.

[Figure 3 plots: twelve panels titled (a) PCA (n=300), (b) PCA (n=3000), (c) PCA (n=30000), (d) ICA-SVDD (n=300), (e) ICA-SVDD (n=1500), (f) ICA-SVDD (n=6000), (g) LOCAL-ICA (n=300), (h) LOCAL-ICA (n=3000), (i) LOCAL-ICA (n=30000), (j) PS-PCA (r=1, n=300), (k) PS-PCA (r=10, n=3000), (l) PS-PCA (r=100, n=30000). Horizontal axis: Sample Time (0 to 3000). Vertical-axis labels appearing in the panels: $T^2$, Q, $R^2$, $T_{ng}^2$, $T_g^2$, $T_e^2$.]

Figure 3. Monitoring chart for Fault 1. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.

[Figure 4 plots: twelve panels titled (a) PCA (n=300), (b) PCA (n=3000), (c) PCA (n=30000), (d) ICA-SVDD (n=300), (e) ICA-SVDD (n=1500), (f) ICA-SVDD (n=6000), (g) LOCAL-ICA (n=300), (h) LOCAL-ICA (n=3000), (i) LOCAL-ICA (n=30000), (j) PS-PCA (r=1, n=300), (k) PS-PCA (r=10, n=3000), (l) PS-PCA (r=100, n=30000). Horizontal axis: Sample Time (0 to 3000). Vertical-axis labels appearing in the panels: $T^2$, Q, $R^2$, $T_{ng}^2$, $T_g^2$, $T_e^2$; some panels use logarithmic vertical axes.]

Figure 4. Monitoring chart for Fault 3. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.

[Figure 5 image: twelve panels, (a) PCA (n=300); (b) PCA (n=3000); (c) PCA (n=30000); (d) ICA-SVDD (n=300); (e) ICA-SVDD (n=1500); (f) ICA-SVDD (n=6000); (g) LOCAL-ICA (n=300); (h) LOCAL-ICA (n=3000); (i) LOCAL-ICA (n=30000); (j) PS-PCA (r=1, n=300); (k) PS-PCA (r=10, n=3000); (l) PS-PCA (r=100, n=30000). Axis labels: T2, Q, R2, T2ng, T2g, and T2e against Sample Time.]

Figure 5. Monitoring chart for Fault 9. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.

[Figure 6 image: two panels, (a) and (b), each plotting the Detection Rate (%) of T2 and Q against r.]

Figure 6. Fault detection rates for Fault 1 in two situations: (a) the number of sums n_new is fixed at 300; (b) the total number of training data n is fixed at 30,000.
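The two situations in this caption differ only in how the summation length r, the number of sums n_new, and the total number of training data n relate to one another (n = n_new × r, as in the panel labels such as r=10, n=3000). The following is a minimal sketch of that bookkeeping, not the authors' implementation. It assumes, purely for illustration, that the sums are formed over non-overlapping blocks of r consecutive samples; the function name `preliminary_sums` is hypothetical.

```python
from typing import List, Sequence


def preliminary_sums(samples: Sequence[Sequence[float]], r: int) -> List[List[float]]:
    """Add up each run of r consecutive samples, variable by variable.

    Assumption (not taken from the paper): blocks do not overlap, and a
    trailing partial block is dropped.
    """
    if r < 1:
        raise ValueError("r must be a positive integer")
    n_new = len(samples) // r  # number of sums obtained from n samples
    sums = []
    for k in range(n_new):
        block = samples[k * r:(k + 1) * r]
        # zip(*block) groups the values of each variable across the block
        sums.append([sum(column) for column in zip(*block)])
    return sums


# Situation (a): n_new is fixed at 300, so n = 300 * r grows with r.
# Situation (b): n is fixed at 30,000, so n_new = 30000 // r shrinks as r grows.
for r in (1, 10, 100):
    print(r, 300 * r, 30000 // r)
```

The resulting sums would then be passed to an ordinary PCA model in place of the raw samples; with r = 1 the sums are the raw samples themselves.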

[Figure 7 image: two panels, (a) and (b), each plotting the Detection Rate (%) of T2 and Q against r.]

Figure 7. Fault detection rates for Fault 4 in two situations: (a) the number of sums n_new is fixed at 300; (b) the total number of training data n is fixed at 30,000.

[Figure 8 image: two panels, (a) and (b), each plotting the Detection Rate (%) of T2 and Q against r.]

Figure 8. Fault detection rates for Fault 6 in two situations: (a) the number of sums n_new is fixed at 300; (b) the total number of training data n is fixed at 30,000.

[Figure 9 image: six panels, (a) PS-PCA (r=1, n=300); (b) PS-PCA (r=5, n=1500); (c) PS-PCA (r=10, n=3000); (d) PS-PCA (r=50, n=15000); (e) PS-PCA (r=100, n=30000); (f) PS-PCA (r=200, n=60000), each plotting T2 and Q against Sample Time.]

Figure 9. Monitoring chart for Fault 2. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f) PS-PCA with 60,000 training data.

[Figure 10 image: six panels, (a) PS-PCA (r=1, n=300); (b) PS-PCA (r=5, n=1500); (c) PS-PCA (r=10, n=3000); (d) PS-PCA (r=50, n=15000); (e) PS-PCA (r=100, n=30000); (f) PS-PCA (r=200, n=60000), each plotting T2 and Q against Sample Time.]

Figure 10. Monitoring chart for Fault 4. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f) PS-PCA with 60,000 training data.

[Figure 11 image: six panels, (a) PS-PCA (r=1, n=300); (b) PS-PCA (r=5, n=1500); (c) PS-PCA (r=10, n=3000); (d) PS-PCA (r=50, n=15000); (e) PS-PCA (r=100, n=30000); (f) PS-PCA (r=200, n=60000), each plotting T2 and Q against Sample Time.]

Figure 11. Monitoring chart for Fault 6. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f) PS-PCA with 60,000 training data.

[Figure 12 image: six panels, (a) PS-PCA (r=1, n=300); (b) PS-PCA (r=5, n=1500); (c) PS-PCA (r=10, n=3000); (d) PS-PCA (r=50, n=15000); (e) PS-PCA (r=100, n=30000); (f) PS-PCA (r=200, n=60000), each plotting T2 and Q against Sample Time.]

Figure 12. Monitoring chart for Fault 3. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f) PS-PCA with 60,000 training data.

[Figure 13 image: six panels, (a) PS-PCA (r=1, n=300); (b) PS-PCA (r=5, n=1500); (c) PS-PCA (r=10, n=3000); (d) PS-PCA (r=50, n=15000); (e) PS-PCA (r=100, n=30000); (f) PS-PCA (r=200, n=60000), each plotting T2 and Q against Sample Time.]

Figure 13. Monitoring chart for Fault 15. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f) PS-PCA with 60,000 training data.

Table 1. False alarm rates (%) of four methods.

T

False alarm rates

0.9

Methods

2

PCA(n=1,500) 2

PCA(n=3,000) 2

T

Q

T

Q

T

Q

T

Q

T2

Q

1.4

1.1

0.9

1.1

0.9

1.0

0.8

1.0

0.8

1.0

0.8

ICA-SVDD(n=300)

ICA-SVDD(n=600)

ICA-SVDD(n=1,000)

R2

T2

Q

R2

T2

Q

R2

T2

Q

False alarm rates

0.2

0.4

1.9

0.1

0.9

1.1

0.1

0.5

0.8

LOCAL-ICA(n=300)

LOCAL-ICA(n=600)

LOCAL-ICA(n=1,500)

Indices

T2ng

T2g

T2e

T2ng

T2g

T2e

T2ng

T2g

T2e

False alarm rates

81.8

36.3

60.3

56.4

52.0

1.0

0.5

56.6

2.1

PS-PCA(300×1)

PS-PCA(300×2)

PS-PCA(300×5)

T2

Q

T2

Q

T2

False alarm rates

0.8

1.4

1.2

1.6

0.7

Q 1.4

ICA-SVDD(n=3,000)

ICA-SVDD(n=6,000)

T2ng

T2g

T2e

T2ng

T2g

T2e

T2ng

T2g

T2e

1.0

2.1

0.3

1.0

2.0

0.2

2.6

1.0

0.4

R2 0.1

T2

Q

0.8

5.6

LOCAL-ICA(n=3,000)

PS-PCA(300×10) T2

0.6

R2

T2

Q

0.1

0.3

1.9

LOCAL-ICA(n=15,000)

PS-PCA(300×50)

R2

T2

Q

0.2

0.5

0.7

LOCAL-ICA(n=30,000)

PS-PCA(300×100)

Q

T2

Q

T2

Q

1.1

0.7

1.4

0.1

1.2

Indices

ICA-SVDD(n=1,500)

Methods

2

PCA(n=30,000)

Q

Indices

Methods

PCA(n=15,000)

T

Indices

PCA(n=600)

2

Methods

PCA(n=300)

Table 2. Detection rates (%) for nine faults.

2

Indices

T

Fault 1

10.8

PCA(n=600) 2

PCA(n=1,500) 2

PCA(n=3,000) 2

2

T

Q

T

Q

T

Q

T

Q

T2

Q

3.7

1.8

1.8

1.8

1.8

1.4

2.3

1.5

1.5

1.8

1.4

1.4

1.3

0.6

1.3

0.6

1.6

Fault 3

100.0

20.9

100.0

1.1

100.0

1.1

100.0

Fault 4

1.9

4.5

2.3

3.7

2.3

3.7

3.0

Fault 5

3.4

14.3

3.9

18.6

3.9

18.6

3.7

Fault 6

1.1

1.8

1.0

1.1

1.0

1.1

1.2

Fault 7

0.7

1.6

0.5

1.0

0.5

1.0

0.7

Fault 8

1.5

2.7

1.9

1.8

1.7

1.1

1.8

5.2

81.3

8.8

85.5

8.8

85.5

11.5

79.4

Indices

R2

T2

Q

R2

T2

Q

R2

T2

Fault 1

0.1

0.0

5.6

0.1

0.3

1.9

0.1

Fault 2

0.1

0.2

2.4

0.1

0.1

1.5

0.0

Fault 3

8.1

0.0

100.0

0.1

8.5

100.0

0.1

Fault 4

0.1

0.0

5.8

0.1

0.1

4.3

0.1

Fault 5

0.1

0.5

21.7

0.1

0.4

13.4

0.0

Fault 6

0.1

0.0

2.2

0.1

0.1

1.7

0.1

Fault 7

0.1

0.1

2.1

0.1

0.1

1.5

Fault 8

1.6

0.0

3.4

0.3

0.0

1.8

1.7

0.0

1.4

0.0

Fault 9

0.1

0.2

83.4

0.1

0.3

79.6

0.1

0.0

73.4

0.0

Indices

LOCAL-ICA(n=300)

LOCAL-ICA(n=600)

T2ng

T2ng

T2g

T2e

T2e

Fault 2 Fault 3 Fault 4 Fault 5 Fault 6 Fault 7 Fault 8

PS-PCA(300×1)

PS-PCA(300×2)

39.6

2.0

2.3

2.9

2.4

2.9

3.8

16.9

4.1

16.7

0.9

1.0

1.1

1.0

1.1

1.2

0.7

0.9

0.7

0.9

0.9

1.9

0.8

2.1

0.8

9.1

84.3

9.8

83.7

Q

R2

T2

Q

R2

T2

Q

R2

T2

Q

0.0

1.8

0.0

0.1

1.8

0.0

0.2

1.2

0.2

0.2

1.3

0.1

1.2

0.0

0.0

1.3

0.1

0.0

1.5

0.1

0.3

2.0

3.0

100.0

0.0

2.2

100.0

0.2

10.5

100.0

0.1

11.2

100.0

0.1

2.8

0.0

0.2

2.8

0.0

0.2

3.5

0.1

0.1

3.6

0.6

12.7

0.1

0.2

11.8

0.0

0.8

13.6

0.0

0.5

20.8

0.0

1.4

0.0

0.1

1.3

0.0

0.2

2.5

0.0

0.8

3.6

1.2

0.0

0.1

1.1

0.2

0.1

2.2

0.1

0.0

3.1

0.2

1.3

0.1

0.0

0.9

0.1

0.0

0.8

0.4

74.7

0.1

0.5

75.3

0.0

0.1

80.3

0.0

T2g

T2e

Fault 9 Methods

0.6

100.0

ICA-SVDD(n=6,000)

LOCAL-ICA(n=1,500) T2ng

1.7

22.2

1.6

ICA-SVDD(n=3,000)

Fault 1

T2g

0.5

100.0

ICA-SVDD(n=1,500)

Methods

ICA-SVDD(n=1,000)

1.6

16.9

ICA-SVDD(n=600)

ICA-SVDD(n=300)

0.1

0.7 2.3

Fault 9

PCA(n=30,000)

Q

Fault 2

Methods

PCA(n=15,000)

T

Methods

PCA(n=300)

PS-PCA(300×5)

LOCAL-ICA(n=3,000)

LOCAL-ICA(n=15,000)

LOCAL-ICA(n=30,000)

T2ng

T2g

T2e

T2ng

T2g

T2e

T2ng

T2g

T2e

2.2

40.7

0.8

6.0

2.1

10.3

15.8

0.5

2.8

0.1

2.8

0.0

1.2

1.7

0.5

2.6

0.0

2.6

99.1

99.1

23.7

98.1

99.6

5.3

99.6

0.4

28.3

4.4

12.2

15.7

2.9

7.1

24.0

19.9

0.0

48.5

1.8

44.3

100.0

2.5

6.1

100.0

10.6

9.7

100.0

0.6

7.9

0.0

0.5

0.2

1.0

10.6

2.2

1.9

0.0

10.2

33.8

0.0

0.6

88.1

3.3

0.6

90.7

3.6

16.3

10.6

0.4

20.9

3.7

14.7

6.4

3.0

30.1

63.4

99.9

46.1

58.5

99.9

92.3

2.7

99.9

PS-PCA(300×10)

PS-PCA(300×50)

PS-PCA(300×100)

Indices

T2

Q

T2

Q

T2

Q

T2

Q

T2

Q

T2

Q

Fault 1

10.8

3.7

3.7

2.6

5.6

5.1

11.6

17.9

21.7

96.6

95.7

97.4

Fault 2

1.5

1.4

1.9

2.1

3.2

2.5

8.2

4.0

82.5

1.9

98.8

9.1

Fault 3

100.0

20.9

99.8

100.0

99.9

100.0

99.9

99.9

99.7

99.9

99.5

99.9

Fault 4

1.9

4.6

4.4

9.4

8.1

35.5

20.5

73.6

99.0

98.6

98.7

98.5

Fault 5

3.4

14.3

6.4

85.2

18.3

99.9

51.5

99.9

98.6

99.7

98.1

99.5

Fault 6

1.1

1.8

0.9

2.9

0.8

4.7

1.3

7.4

9.9

12.4

28.6

25.6

Fault 7

0.7

1.6

1.2

3.5

0.5

5.2

1.3

13.9

9.9

95.4

18.8

98.1

Fault 8

7.7

10.5

21.3

7.1

27.7

8.1

59.8

9.8

98.7

87.9

99.5

93.0

Fault 9

5.2

81.3

25.9

99.8

45.5

99.9

92.1

99.8

99.6

99.6

99.1

99.6

Table 3. Time costs (s) for four compared methods.

Method               Offline modeling   Online monitoring
PCA(1,500)           6.9E-1             8.1E-3
PCA(30,000)          2.5E+0             8.0E-3
ICA-SVDD(1,500)      5.3E+2             5.0E-1
ICA-SVDD(6,000)      8.7E+3             1.91E-0
LOCAL-ICA(1,500)     6.8E-1             6.2E-1
LOCAL-ICA(30,000)    3.1E+0             7.5E-1

PS-PCA step                                    PS-PCA(300×5)   PS-PCA(300×100)
Preliminary summation for offline modeling     8.8E-3          9.8E-3
Offline modeling by PCA                        2.1E-1          2.3E-1
Preliminary summation for online monitoring    6.0E-4          6.0E-4
Online monitoring by PCA                       7.2E-3          7.5E-3

Table 4. False alarm rates (%) of four methods.

T2

Q

False alarm rates Methods

6.6

3.6

KICA(n=300)

KPCA(n=600) T2

Q

6.2

1.9

KICA(n=600)

KPCA(n=1,500) T2

Q

6.5

2.8

KICA(n=1,500)

KPCA(n=3,000) T2

Q

5.8

2.2

KICA(n=3,000)

Indices

I2

Q

I2

Q

I2

Q

I2

False alarm rates

4.9

6.7

3.8

4.7

3.4

4.8

4.2

KICA-PCA(n=300)

KICA-PCA(n=600)

KICA-PCA(n=1,500)

Indices

I2

T2

Q

I2

T2

Q

I2

T2

Q

False alarm rates

6.1

7.2

5.9

3.8

6.2

7.2

1.2

3.6

4.3

PS-PCA(300×1)

PS-PCA(300×5)

PS-PCA(300×10)

T2

Q

T2

Q

T2

False alarm rates

4.5

5.9

6.0

6.5

6.1

6.5

2.3

KICA(n=6,000)

KPCA(n=15,000) T2

Q

6.6

2.1

KICA(n=15,000)

Q

I2

Q

3.6

5.0

4.8

4.0

4.5

I2

T2

Q

4.6

5.2

1.7

PS-PCA(300×50)

KICA-PCA(n=6,000)

KICA-PCA(n=15,000)

I2

T2

Q

I2

T2

Q

5.1

4.5

2.5

5.2

2.4

2.0

PS-PCA(300×100)

PS-PCA(300×200)

Q

T2

Q

T2

Q

T2

Q

5.3

4.7

3.2

4.8

4.9

2.2

2.0

Indices

Q

I2

KICA-PCA(n=3,000)

Methods

T2

Q

Methods

KPCA(n=6,000)

T

KPCA(n=300)

Indices

Methods

Table 5. Detection rates (%) for seven types of faults.

T

KPCA(n=600) 2

KPCA(n=1,500) 2

Q

T

Q

T

2.5

KPCA(n=3,000) 2

T

Q

T

Q

T2

Q

1.8

5.9

2.6

7.1

1.0

1.3

6.2

3.5

7.8

1.1

1.1

7.1

0.7

6.8

3.2

6.9

6.6

1.5

6.7

Fault 2

7.7

2.7

7.3

1.5

7.8

1.2

7.2

Fault 3

6.3

3.0

6.6

2.7

6.8

0.9

6.2

5.7

1.9

6.0

2.1

5.6

1.0

4.8

Fault 5

20.8

39.7

25.3

37.6

20.6

29.4

19.8

Fault 6

4.4

3.2

4.2

1.5

4.8

1.2

4.1

Fault 7

6.2

2.4

6.3

1.6

6.6

1.5

KICA(n=600)

KICA(n=1,500)

I2

Q

I2

Q 6.5

2.6

I2

5.8

4.3

3.8

Methods

6.5

9.4

5.9

4.2

3.8

Fault 3

6.1

9.0

5.4

5.2

7.2

4.2

4.8

2.1

4.2

24.8

44.1

44.6

Fault 1

4.9

Fault 2

5.4

Fault 3

4.8

3.5

2.6

3.6

1.2

4.5

3.6

KICA-PCA(n=3,000)

KICA-PCA(n=6,000)

KICA-PCA(n=15,000)

Q

I2

T2

Q

I2

T2

Q

I2

T2

Q

I2

T2

Q

4.8

2.6

3.2

3.1

3.2

3.5

2.8

3.7

2.2

0.9

2.6

1.6

1.1

2.5

3.2

4.8

3.3

4.4

2.5

1.6

3.8

3.2

1.2

3.4

4.3

0.8

3.5

3.6

4.4

4.5

3.5

2.3

3.5

2.9

3.3

2.6

2.7

0.8

5.2

4.6

3.2

4.6

1.7

4.8

1.5

2.2

5.6

5.1

1.1

27.4

30.9

50.2

37.6

33.9

44.3

45.6

33.9

46.2

43.3

51.7

T2

3.2

11.1

4.1

6.2

5.6

7.8

2.3

2.4

1.9

4.6

7.2

1.5

3.8

4.2

1.6

4.2

1.2

0.8

10.5

5.0

5.2

6.8

4.5

4.6

5.5

4.4

4.4

5.2

1.6

2.5

1.6

0.9

4.2

5.5

0.5

PS-PCA(300×1)

Indices

2.6

KICA-PCA(n=1,500)

Fault 2

5.8

3.0

3.5

4.5

Fault 7

6.7

3.2

3.6

3.8

Methods

3.4

4.2

6.1

5.3

4.2

4.2

8.7

Fault 6

3.0

4.0

5.2

55.2

4.8

4.4

Fault 1

5.8

5.2

3.8

3.5

T2

41.3

2.6

6.8

4.8

I2

8.9

4.6

4.4

Q

54.4

3.8

4.6

T2

8.5

5.4

5.2

3.9

I2

53.1

Q

3.2

3.8

Indices

Fault 5

I2

5.2

KICA-PCA(n=600)

5.2

Q

4.6

KICA-PCA(n=300)

Fault 4

I2

5.8

Q

Q

7.2

3.6

Fault 7

KICA(n=15,000)

2.3

3.5

4.8

I2

KICA(n=6,000)

2.4

5.6

3.6

1.2

KICA(n=3,000)

59.8

5.2

8.9

6.8

5.6

6.5

5.3

0.9

1.8

46.2

6.1

Fault 6

4.2

5.9

1.8

55.8

Fault 3

52.3

1.6

44.2

3.2

49.6

3.9

50.6

6.0

46

1.8

50.2

5.8

5.1

0.8 25.8

2.7

7.1

8.5

6.2 22.4

58.7

6.5

53.1

1.2 27.6

4.2

7.3

Fault 2

Fault 5

5.2

22.6

44.6

5.2

4.8

6.2

2.5

25.4

5.8

Fault 1

Fault 4

6.2

5.9

Indices

KICA(n=300)

3.2

Fault 4

2

KPCA(n=15,000)

Q

Fault 1

Methods

KPCA(n=6,000)

T

Indices

2

Methods

KPCA(n=300)

PS-PCA(300×5)

PS-PCA(300×10)

PS-PCA(300×50)

PS-PCA(300×100)

PS-PCA(300×200)

Q

T2

Q

T2

Q

T2

Q

T2

Q

T2

Q

6.5

6.6

7.4

5.5

8.2

3.6

13.6

5.2

30.1

7.2

91.5

6.9

8.1

8.1

7.8

8.1

9.3

25.7

19.8

78.3

38.5

92.5

6.1

6.2

7.3

5.5

6.8

6.5

12.2

37.6

21.8

20.3

70.2

Fault 4

3.8

5.7

4.9

5.3

3.8

3.5

1.6

8.2

8.7

15.4

15.6

80.9

Fault 5

17.6

49.8

53.6

94.9

73.5

99.6

99.6

99.9

99.0

99.8

99.7

99.8

Fault 6

2.7

10.3

5.7

27.6

8.2

52.1

43.5

94.7

85.7

97.1

95.7

96.5

Fault 7

4.1

5.9

5.5

6.6

4.8

5.7

3.6

3.0

7.7

0.8

10.6

0.4

Table 6. Time costs (s) for four comparison nonlinear methods.

Method               Offline modeling   Online monitoring
KPCA(1,500)          4.3E+2             2.3E+2
KPCA(15,000)         5.8E+4             3.6E+3
KICA(1,500)          2.3E+2             2.2E+2
KICA(15,000)         3.1E+4             3.0E+3
KICA-PCA(1,500)      2.3E+2             2.2E+2
KICA-PCA(15,000)     3.0E+4             2.9E+3

PS-PCA step                                    PS-PCA(300×5)   PS-PCA(300×100)
Preliminary summation for offline modeling     8.0E-4          1.4E-2
Offline modeling by PCA                        4.2E-1          4.2E-1
Preliminary summation for online monitoring    2.1E-3          2.3E-3
Online monitoring by PCA                       3.1E-2          3.5E-2

Table 7. Fault descriptions for TE process.

No.     Description                                                Type
1       A/C feed ratio, B composition constant (stream 4)          Step
2       B composition, A/C ratio constant (stream 4)               Step
3       D feed temperature (stream 2)                              Step
4       Reactor cooling water inlet temperature                    Step
5       Condenser cooling water inlet temperature                  Step
6       A feed loss (stream 1)                                     Step
7       C header pressure loss - reduced availability (stream 4)   Step
8       A, B, C feed composition (stream 4)                        Random variation
9       D feed temperature (stream 2)                              Random variation
10      C feed temperature (stream 4)                              Random variation
11      Reactor cooling water inlet temperature                    Random variation
12      Condenser cooling water inlet temperature                  Random variation
13      Reaction kinetics                                          Slow drift
14      Reactor cooling water valve                                Sticking
15      Condenser cooling water valve                              Sticking
16-20   Unknown                                                    Unknown

Table 8. Monitored variables in TE process.

No.   Variable
1     A feed (stream 1)
2     D feed (stream 2)
3     E feed (stream 3)
4     Total feed (stream 4)
5     Recycle flow (stream 8)
6     Reactor feed rate (stream 6)
7     Reactor pressure
8     Reactor level
9     Reactor temperature
10    Purge rate (stream 9)
11    Product separator temperature
12    Product separator level
13    Product separator pressure
14    Product separator underflow (stream 10)
15    Stripper level
16    Stripper pressure
17    Stripper underflow (stream 11)
18    Stripper temperature
19    Stripper steam flow
20    Compressor work
21    Reactor cooling water outlet temperature
22    Separator cooling water outlet temperature
23    D feed flow valve (stream 2)
24    E feed flow valve (stream 3)
25    A feed flow valve (stream 1)
26    Total feed flow valve (stream 4)
27    Compressor recycle valve
28    Purge valve (stream 9)
29    Separator pot liquid flow valve (stream 10)
30    Stripper liquid product flow valve (stream 11)
31    Stripper steam valve
32    Reactor cooling water flow
33    Condenser cooling water flow

Table 9. False alarm rates (%) of seven compared methods.

PCA(n=60,000)

T2

Q

T2

Q

T2

Q

T2

Q

False alarm rates

34.5

61.1

3.9

4.1

1.7

0.7

1.7

0.9

ICA-SVDD(n=900)

R2

T2

Q

False alarm rates

1.7

0.0

50.4

Methods

LOCAL-ICA(n=300)

Indices

T2ng

False alarm rates

99.9

R2

T2

Q

0.1

0.0

12.8

LOCAL-ICA(n=1,500)

T2g

T2e

T2ng

99.2

60.4

4.9

KPCA(n=300)

Methods

T2

Q

T2

False alarm rates

0.6

40.7

0.2

KICA(n=300) Q

I2

24.8

41.0

T2

Q

0.1

0.0

3.9

0.1

Indices

I2

T2

Q

I2

False alarm rates

24.8

29.6

23.6

0.9

PS-PCA(300×1) T

Q

False alarm rates

34.5

61.1

0.0

4.3

LOCAL-ICA(n=15,000)

LOCAL-ICA(n=60,000)

T2ng

T2g

T2e

T2ng

T2g

T2e

14.1

4.5

2.8

1.0

10.1

6.2

Q

T2

Q

T2

Q

1.6

0.4

2.2

1.1

0.8

KICA(n=1,500)

KICA(n=4,500)

I2

Q

I2

Q

0.9

1.6

1.3

1.1

0.4

KICA-PCA(n=1,500)

KICA-PCA(n=4,500)

T2

Q

I2

T2

Q

I2

T2

Q

0.9

1.4

1.6

0.8

1.0

1.1

2.6

0.7

PS-PCA(300×50) 2

PS-PCA(300×200)

T

Q

T

Q

T2

Q

0.6

1.8

0.3

1.4

0.0

0.6

Indices

Q

KPCA(n=4,500)

PS-PCA(300×5) 2

T2

KPCA(n=1,500)

KICA-PCA(n=900)

KICA-PCA(n=300)

2

ICA-SVDD(n=4,500) R2

Q

0.9

Methods

Methods

18.5

False alarm rates

99.8

ICA-SVDD(n=1,500) R2

KICA(n=900)

I2

Indices

T2e

KPCA(n=900)

Indices

Methods

T2g

ICA-SVDD(n=300)

Indices

Methods

PCA(n=1,500)

PCA(n=300)

T

PCA(n=15,000)

Indices

Methods

Table 10. Detection rates (%) of PS-PCA and three linear methods in TE process.

PCA(n=1500)

PCA(n=60000)

ICA-SVDD(n=900)

ICA-SVDD(n=4500)

LOCAL-ICA(n=60000)

T2

Q

T2

Q

R2

T2

Q

R2

T2

Q

T2ng

T2g

Fault 1

100.0

100.0

99.8

100.0

99.1

99.0

99.8

99.5

96.4

99.7

97.7

97.5

Fault 2

98.4

99.2

98.1

98.8

98.1

98.1

99.2

98.7

19.4

99.2

94.1

95.0

Fault 3

9.9

11.0

3.5

3.2

0.1

0.1

12.8

0.2

0.0

4.1

7.3

19.0

Fault 4

100.0

100.0

100.0

100.0

100.0

100.0

100.0

100.0

0.2

100.0

99.0

Fault 5

5.2

7.0

3.1

0.9

0.4

0.0

8.3

0.6

0.0

1.0

0.2

Fault 6

100.0

100.0

100.0

100.0

94.6

89.6

94.6

94.6

94.2

Fault 7

100.0

100.0

100.0

100.0

100.0

100.0

100.0

100.0

Fault 8

96.9

97.4

97.1

97.3

97.1

95.6

97.5

96.1

10.4

17.9

4.4

5.9

2.1

0.2

20.2

0.1

79.8

88.1

71.8

86.0

89.2

41.2

88.5

86.7

Fault 11

97.4

98.1

95.8

97.3

96.3

94.3

98.4

95.7

Fault 12

54.9

37.6

42.6

25.2

34.8

0.0

54.5

Fault 13

95.3

94.8

94.6

94.8

94.8

90.2

94.4

95.1

Fault 14

99.8

99.9

99.4

99.9

99.6

26.4

99.9

Fault 15

6.5

6.8

2.7

1.3

1.9

0.0

15.2

0.0

13.5

2.3

0.8

1.6

92.3

90.6

90.4

92.0

Fault 18

65.1

71.7

58.2

69.5

53.6

Fault 19

98.4

99.2

98.2

98.6

97.6

Fault 20

86.9

87.6

86.8

87.3

86.5

Q

T2

Q

99.5

99.5

99.7

99.7

99.1

98.8

98.7

99.3

97.2

97.5

85.6

16.2

56.3

94.8

97.4

99.4

99.9

100.0

100.0

99.1

100.0

22.1

15.5

11.0

24.0

96.2

34.9

90.0

92.5

94.5

94.3

94.5

96.9

99.1

99.9

100.0

99.0

99.4

99.9

100.0

100.0

99.6

100.0

38.7

97.4

95.1

95.1

97.0

97.7

97.3

96.8

97.8 77.1

0.0

6.1

1.9

23.8

71.3

18.5

58.4

59.7

0.0

86.5

50.4

77.3

90.0

81.4

88.3

83.4

91.1

93.6

98.1

95.1

97.5

96.0

97.7

98.4

98.8

98.7

45.4

3.7

21.4

93.0

66.6

66.8

35.1

65.1

94.3

95.0

92.0

93.1

94.4

94.8

95.0

94.8

95.3

97.5

99.1

99.9

98.8

99.0

99.7

99.6

99.8

90.0

99.8

2.1

0.0

8.4

1.5

11.2

15.2

14.3

5.5

15.5

4.4

26.5

17.5

5.9

92.0

2.0

0.0

6.9

2.5

9.9

9.7

12.6

4.5

15.7

3.8

65.3

91.8

83.1

91.2

92.4

87.6

90.6

92.0

92.0

92.3

91.8

91.8

28.9

73.0

45.6

51.1

72.3

59.0

69.4

77.0

76.7

78.1

77.7

83.3

38.8

99.2

97.2

93.6

99.1

96.8

96.4

98.3

98.3

98.3

97.0

97.2

50.4

88.5

86.2

80.6

87.8

79.8

85.0

87.7

87.0

87.6

86.1

89.3

5.4

Fault 17

Fault 16

PS-PCA(300×200)

T2

94.6

Fault 9 Fault 10

PS-PCA(300×5)

T2e

Indices

T

Methods

Table 11. Detection rates (%) of PS-PCA and three nonlinear methods in TE process.

Methods

KPCA(n=900)

KPCA(n=4500)

KICA(n=900)

KICA(n=4500)

KICA-PCA(n=900)

PS-PCA(300×5)

T2

Q

T2

Q

T2

Q

T2

Fault 1

99.1

99.4

99.3

99.2

98.8

99.4

99.5

99.7

99.5

99.7

99.5

99.4

99.5

Fault 2

97.2

97.4

97.4

97.0

96.5

97.2

98.7

99.3

98.7

99.3

97.8

96.5

97.1

97.5

98.7

Fault 3

0.8

8.8

0.2

1.9

0.2

3.6

16.2

56.3

16.2

56.3

24.9

0.9

4.2

1.1

16.2

56.3

Fault 4

99.9

99.9

99.9

99.9

99.9

99.9

100.0

100.0

100.0

100.0

99.9

99.9

99.8

99.9

100.0

Fault 5

0.3

4.8

0.3

1.0

0.4

2.3

11.0

24.0

11.0

24.0

8.1

0.7

3.1

1.3

11.0

Fault 6

99.9

99.9

99.9

99.9

99.9

99.9

94.3

94.5

94.3

Fault 7

99.9

99.9

99.9

99.9

99.9

99.9

100.0

100.0

100.0

Fault 8

93.8

94.2

94.4

94.2

93.3

94.2

97.7

97.3

97.7

Fault 9

2.6

19.2

2.1

9.1

2.4

8.4

18.5

58.4

18.5

58.4

Fault 10

60.8

75.9

60.0

72.1

73.3

74.2

81.4

88.3

81.4

88.3

Fault 11

97.1

97.0

95.2

95.5

95.1

97.6

97.7

98.4

97.7

98.4

94.2

94.6

96.6

95.5

97.7

Fault 12

34.9

56.4

32.4

42.7

42.7

45.4

66.6

66.8

66.6

66.8

48.5

37.0

47.1

43.2

66.6

Fault 13

90.1

90.2

89.8

90.0

89.4

90.7

Fault 14

99.7

99.7

99.7

99.7

99.4

99.7

Fault 15

0.0

5.1

0.0

0.8

0.1

1.6

0.0

0.0

0.7

0.2

84.0

84.5

84.3

84.2

84.1

Fault 18

45.6

50.4

47.6

47.4

42.7

Fault 19

96.6

97.6

97.0

97.0

96.7

Fault 20

73.6

75.4

73.8

75.0

1.3 85.0

98.6

T2

99.4

T2

PS-PCA(300×200) T2

Q

99.7

99.7

99.1

99.3

97.2

97.5

94.8

97.4

100.0

99.1

100.0

24.0

96.2

34.9

Q

99.9

99.9

99.9

99.9

94.3

94.5

96.9

99.1

99.9

99.9

99.9

99.9

100.0

100.0

99.6

100.0

97.3

95.1

93.2

94.3

94.5

97.7

97.3

96.8

97.8

23.0

4.3

7.8

8.4

18.5

58.4

59.7

77.1

79.6

51.4

66.0

76.5

81.4

88.3

83.4

91.1

98.4

98.8

98.7

66.8

35.1

65.1

94.5

100.0

Q

T

I2

Q

94.8

95.0

94.8

95.0

90.2

89.8

90.4

89.5

94.8

95.0

94.8

95.3

99.6

99.8

99.6

99.8

99.9

98.5

99.7

99.6

99.6

99.8

90.0

99.8

14.3

5.5

14.3

5.5

6.7

0.4

2.5

0.8

14.3

5.5

15.5

4.4

12.6

3.9

Fault 17

Q

92.0

4.5

12.6

4.5

5.9

0.6

1.2

1.5

12.6

4.5

15.7

3.8

92.3

92.0

92.3

85.1

83.8

84.6

84.4

92.0

92.3

91.8

91.8 83.3

49.6

76.7

78.1

76.7

78.1

53.6

32.9

49.3

49.3

76.7

78.1

77.7

97.3

98.3

98.3

98.3

98.3

97.9

94.6

97.0

97.7

98.3

98.3

97.0

97.2

Fault 16

Q

T2

KICA-PCA(n=4500)

Indices

87.0

87.6

87.0

87.6

76.6

73.1

74.0

75.5

87.0

87.6

86.1

89.3

75.8

74.1
