Preliminary-Summation-Based Principal Component Analysis for NonGaussian Processes Zhijiang Lou, Dong Shen, Youqing Wang PII: DOI: Reference:
S0169-7439(15)00137-9 doi: 10.1016/j.chemolab.2015.05.017 CHEMOM 3020
To appear in:
Chemometrics and Intelligent Laboratory Systems
Received date: Revised date: Accepted date:
25 March 2015 17 May 2015 18 May 2015
Please cite this article as: Zhijiang Lou, Dong Shen, Youqing Wang, PreliminarySummation-Based Principal Component Analysis for Non-Gaussian Processes, Chemometrics and Intelligent Laboratory Systems (2015), doi: 10.1016/j.chemolab.2015.05.017
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
T
Preliminary-Summation-Based Principal Component
RI P
Analysis for Non-Gaussian Processes
SC
Zhijiang Lou, Dong Shen, Youqing Wang*
NU
College of Information Science and Technology,
MA
Beijing University of Chemical Technology, Beijing 100029, China
ED
*Email:
[email protected]
Abstract: To cope with the combined Gaussian and non-Gaussian features in the industrial processes,
PT
a novel preliminary-summation-based principal component analysis (PS-PCA) method is proposed in
AC CE
this study. Different from other approaches which improve principal component analysis (PCA) by changing its algorithm structure, PS-PCA just preprocesses the training and monitoring data without modification on PCA. According to the central limit theorem, PS-PCA adds up samples of each variable to make the distribution of the sum approach Gaussian distribution. These sums are then used for state monitoring. It has been proved that preliminary summation can increase the fault detection rate for Gaussian processes. Furthermore, some simulation tests substantiate that PS-PCA can improve the detection capability for non-Gaussian processes and even for nonlinear processes without increasing the computation load. Key words: multivariate statistical process control (MSPC); principal component analysis (PCA); independent component analysis (ICA); Preliminary-Summation-based PCA (PS-PCA); the central limit theorem
1
ACCEPTED MANUSCRIPT
1 INTRODUCTION
RI P
T
For large-scale industrial processes, the presence of abnormal situations may result in economic losses, environmental pollution, and even death. With the growing interests in safety, state monitoring is increasingly important in process system engineering. A key issue for the safe
SC
operation of industrial processes is the rapid detection of the abnormal situations and then
NU
identification and removal of the causal factors. In recent years, with the rapid development in computer and sensor technology, multivariate statistical process control (MSPC)1 has been developed
MA
and widely applied in industrial process.
One of the most common MSPC methods is principal component analysis (PCA)2-10. PCA was
ED
proposed by Pearson4 in the early 20th century, but the vast majority of its applications were not
PT
developed until the last few decades. The main idea of PCA is to reduce the data dimensions by projecting correlated variables onto a smaller set of new variables that are uncorrelated and retain
AC CE
most of the original variance. When implementing PCA for process monitoring, a PCA model should be established based on the process data collected under normal operations. The control limit for each monitoring statistic should be calculated. The process can then be monitored online by using these statistics. In the last few years, a lot of improvements for PCA have been proposed and many successful applications of PCA for process monitoring have been reported in the literature. However, PCA cannot cope with non-Gaussian processes. To address this issue, independent component analysis (ICA)11-16 was proposed. The main idea of ICA is to extract independent components instead of merely uncorrelated components. The fundamental restriction of ICA is that the independent components must be non-Gaussian14. However, for most industrial processes, both Gaussian and non-Gaussian features exist, which 2
ACCEPTED MANUSCRIPT
are difficult to deal with by using PCA and ICA. To cope with both Gaussian and non-Gaussian
T
features simultaneously, several approaches have been proposed, including ICA-PCA1, the support
RI P
vector data description approach ICA-SVDD17, and the statistical local approach LOCAL-ICA18. However, applying of these methods is difficult, because of the following reasons: first, all of them
SC
deal with Gaussian and non-Gaussian components separately while the perfect separation of Gaussian and non-Gaussian components is difficult or even impossible; second, large computational
NU
load is required for these methods. Hence, this paper proposes a preliminary summation method for
MA
PCA, termed as PS-PCA. PS-PCA can convert non-Gaussian components to Gaussian components by preliminary summation the training/monitoring data and monitors the new process data with the traditional PCA, so it avoids separation of Gaussian and non-Gaussian components and just needs a
ED
little computational load.
PT
By adding up the samples of each variable, the fault information can be accumulated and hence
AC CE
faults can be detected more easily. Through strict mathematical analysis, it has been proved that preliminary summation can indeed improve the fault detection rate for Gaussian processes and the fault detection rate will increase when the summation number increases. According to the central limit theorem (CLT)19, when adding up the samples of each variable, the distribution of the sum is close to the Gaussian distribution. As a result, preliminary summation also can improve the monitoring performance for non-Gaussian processes. When the same training data are used, PS-PCA has less computational load than the conventional PCA in most situations. However, according to the fault diagnosis results on a simulated process which contains both Gaussian and non-Gaussian features, PS-PCA performs much better than PCA, ICA-SVDD and LOCAL-ICA. The average detection rate of PS-PCA is 90.3%, and while the average detection rates of PCA, ICA-SVDD and LOCAL-ICA are 24.4%, 24.8%, and 3
ACCEPTED MANUSCRIPT
55.9% respectively. PS-PCA is therefore an effective solution to handle both Gaussian and
RI P
T
non-Gaussian features simultaneously. Besides non-Gaussianity, non-linearity is also wide-spread in industrial processes and it is another significant challenge for PCA. PS-PCA is a linear approach; however, it can be successfully
SC
used for nonlinear processes. Tested on a nonlinear simulated process, PS-PCA detects faults
NU
efficiently, and its detection rates for most faults are greater than 60%. The detection rates of three other nonlinear methods, kernel principal component analysis (KPCA)20-22, kernel independent
MA
component analysis (KICA)23 and kernel-independent component analysis-principal component analysis (KICA-PCA)24, are all below 60%, and the result indicates that PS-PCA is also effective for
ED
nonlinear processes.
PT
This paper proposes a new PCA based on data preprocessing and compares it with traditional PCA. Through theoretical analysis, it demonstrates that PS-PCA is more efficient than traditional
AC CE
PCA. Then PS-PCA is compared with traditional PCA, ICA-SVDD, LOCAL-ICA, KPCA, KICA, and KICA-PCA on the TE process25, 26, and the results validate the improved fault detection ability of PS-PCA.
The remainder of this paper is organized as follows. The classical PCA used for process monitoring is reviewed briefly in Section 2. Then PS-PCA is proposed for process monitoring and some details are introduced. In Section 3, the analysis of PS-PCA is given and the influence of preliminary summation is proved. To fully analyze the characteristics of PS-PCA and compare it with other MSPC methods, tests are carried out on a simulated non-Gaussian process and a simulated nonlinear process in Section 4. The TE process is employed to demonstrate the performance of the proposed method in Section 5. Finally, the contributions of this paper are summarized and some
4
ACCEPTED MANUSCRIPT
T
future studies are discussed in Section 6.
RI P
2 METHOD
SC
2.1. Principal Component Analysis
PCA decomposes the data matrix X ∈ R n×s (where n is the number of samples and s is the
NU
number of variables) into a transformed s subspace of reduced dimensions, which is defined by the span of a chosen subset of the eigenvectors of the covariance or correlation matrix associated with
MA
X . Each chosen eigenvector, or principal component (PC), captures the maximum amount of
ED
variability in the data in an ordered fashion. Mathematically, the decomposition is defined as follows: ˆ +E X = TP T + E = X
(1)
PT
where T ∈ R n×k refers to the score matrix, P ∈ R s×k refers to the loading matrix, and E ∈ R n× s is
AC CE
the residual matrix. Usually, only the first few dominant PCs are selected in P . In this paper, the covariance matrix S is used to derive a PCA model, defined as follows:
S=
1 XT X n −1
(2)
The columns of P are actually eigenvectors of S associated with the k largest eigenvalues. The cumulative percent variance ( CPV ) method is usually adopted to determine the number of PCs, which is defined as follows: k
s
i =1
i =1
CPV = ∑ λi / ∑ λi ×100% ≥ ε
(3)
where λi ( λ1 ≥ λ2 ≥ L λs ≥ 0 ) is the variance of score vector and ε is a parameter usually set to 85%. When CPV is larger than ε , set k as the number of the dominant PCs. The subspaces 5
ACCEPTED MANUSCRIPT
ˆ and E are called the score space and the residual space respectively, and T 2 and spanned by X
T
Q statistics27 are constructed to monitor the two spaces, respectively. Statistic T 2 represents the
RI P
distance between the location of the new data projected onto the subspace and the origin of subspace; statistic Q is a measure of the approximation error of the new data within the PCA subspace. Given
SC
a monitoring vector x ∈ R s×1 , the T 2 and Q statistics can be calculated as below:
(4)
NU
T 2 = xT P ( Λ k ) −1 P T x
Q = xT (I − PPT )(I − PPT ) x
MA
(5)
where I is a unit matrix; Λ k = diag (λ1 L λk ) ∈ R k ×k is a diagonal matrix, which is the estimated
degrees
( n − 1)(n + 1)k Fα ( k , n − k ) , where Fα ( k , n − k ) is an F-distribution with k and (n − k ) n( n − k ) of
freedom
PT
is δ T2 =
ED
covariance matrix of principal component scores. The threshold of T 2 index for a Gaussian process
with
the
level
of
1
AC CE
C h 2θ 2 θ h (h ) + 1 + 2 0 0 ] h0 , where θi = δ = θ1[ α 0 θ1 θ1 2 Q
significance
α ;
s
∑λ
i j
, h0 = 1 −
j = k +1
the
threshold
of
Q
is
2θ1θ3 , and Cα is the normal 3θ 22
deviate corresponding to the ( 1 − α ) percentile.
2.2. Preliminary-Summation-based PCA When non-Gaussian features exist in a process, it is inappropriate to calculate the loading matrix P in (1) using traditional PCA, because not all process variables follow Gaussian distribution. According to the central limit theorem (CLT)19, for a process variable that has r
r
τ =1
τ =1
% = ∑ X(τ ) and monitoring data x% = ∑ x(τ ) non-Gaussian features, the sum of its training data X 6
ACCEPTED MANUSCRIPT
will be close to a Gaussian distribution and thus satisfy PCA’s restriction. For convenience,
T
parameter r is termed as summation number. As a result, the loading matrix P can be calculated
RI P
% and x% . Hence, the main idea of PS-PCA is summation the training and by using X
monitoring data before applying PCA. It must be noted that there are some differences between
SC
summation procedures for the training and monitoring data: for the training data, to guarantee the independence of each sum, each sample of the training data should be used only once, i.e., given
NU
n samples of the data, there will be nnew = [ n / r ] sums; however, for the monitoring data, each data
MA
sample can be used many times. The process of summation is shown in Figure 1.
ED
(insert Figure 1 here) As shown in Figure 1 (b), the summation of the monitoring data is calculated at t and the
PT
summation number is r . If t ≥ r , the previous r − 1 samples of the monitoring data are used to get the sum, otherwise, the previous t − 1 samples of the monitoring data and another r − t samples of
AC CE
training data are used. In addition, Fig. 1(b) indicates that x% (t + 1) = x% (t ) + x (t + 1) − x (t − τ + 1) and therefore recursion can be adopted to reduce the volume of calculations in the summation procedure of the monitoring data.
After summation, the next step is to use PCA to detect process faults with the new data, which is the same as that for a traditional PCA process. For PS-PCA, though there are n samples of training data, the actual number of training data adopted in PCA process is nnew , so the computational load in the training process is determined by nnew rather than n . Compared with traditional PCA, PS-PCA has two disadvantages: the preliminary summation requires additional computation and more training data. However, compared with PCA, the
7
ACCEPTED MANUSCRIPT
computation cost of preliminary summation is very small. In addition, for most industrial processes,
T
there are enough training data for PS-PCA. As a result, PS-PCA is a promising modification of PCA.
RI P
REMARK 1. Even though statistics pattern analysis (SPA)28, 29 has some similarities with PS-PCA, their differences are very clear. First, PS-PCA only uses the first order statistics while SPA adopts the
SC
higher order statistics together. However, PCA is a linear method and it is not good at handling the
NU
nonlinearity among the various-order statistics, so adding of higher order statistics may have negative impact on SPA. In addition, calculating and monitoring the higher order statistics will cost
MA
much more computational load. Second, many statistics are adopted in SPA to address the non-Gaussianity, nonlinearity, and dynamics problems, but the function of each statistic is not clear.
ED
For PS-PCA, the summation process is introduced to address the non-Gaussian problem and its function has been proved based on CLT and the simulation result. Third, to improve the fault
PT
detection performance, SPA emphasizes increasing the variety of the statistics while PS-PCA focuses
AC CE
on increasing the summation number r .
3 THEORETICAL ANALYSIS OF PS-PCA
To simplify the proof, it is assumed that both the normal and faulty processes are Gaussian. In fact, this assumption is reasonable because even for a non-Gaussian process, the summation will be close to Gaussian process when r → ∞ . 3.1 The influence of summation number on the fault detection rate
Assume the training data
x1 ∈ N
( μ norm , Σnorm )
x0 ∈ N
( μ norm , Σ norm )
, the monitoring data without fault
and the monitoring data with a fault x 2 ∈ N
(μ
fault
, Σ fault ) , where
8
ACCEPTED MANUSCRIPT
N
( μ, Σ )
means a Gaussian distribution with expectation of μ and covariance matrix of Σ . Both
RI P
T
xi ( i = 1, 2, 3 ) and μ j ( j = norm, fault ) are vectors, and Σ j ( j = norm, fault ) are matrices:
xi = [ xi1 , xi2 L xis ]T μ j = [ µ 1j , µ 2j L µ sj ]T
SC
ρ 12j σ 1jσ 2j L ρ 1j sσ 1jσ sj σ 2j σ 2j L ρ 2j sσ 2j σ sj M 2s s 2 s s ρ j σ j σ j L σ jσ j
NU
σ 1jσ 1j 12 2 1 ρ σ jσ j Σj = j ρ 1sσ sσ 1 j j j
MA
where s is the number of variables and ρ gh ( g = 1, 2,L , s; h = 1, 2,L , s ) is the covariance. For j
ED
PS-PCA, the first step is to obtain the sum of each variable:
PT
Training data:
r ×t
∑
x% 0 (t) =
x0 (τ ) ∈ N
( rμ norm , rΣ norm )
(6)
AC CE
τ = r ×( t −1) +1
Monitoring data without fault:
x% 1 (t) =
t
∑
x1 (τ ) ∈ N
( rμ norm , rΣnorm )
(7)
τ =t − r +1
where [t − r + 1, t ] is the moving window for summation. Define t f as the fault occur time. According to the relationship between [t − r + 1, t ] and t f , there are three cases: t f > t , t − r + 1 < t f ≤ t and t f ≤ t − r + 1 . In the first case, x% 2 (t) = x% 1 (t) , which is a trivial case. The second
case means that just the last m = t − t f samples are fault data, and the third case means that all of the
r samples are fault data, which are called transitional fault data and steady-state fault data, respectively. The last two cases are shown as follows:
9
ACCEPTED MANUSCRIPT
t
t−m
x 2 (τ ) +
τ =t − m +1
∑
x1 (τ ) ∈ N
( mμ
fault
+ ( r − m)μ norm , mΣ fault + ( r − m) Σ norm )
(8)
i =t − r +1 t
x% 2 (t) =
∑
x 2 (τ ) ∈ N
( rμ
fault
, rΣ fault )
τ = t − r +1
T
∑
(9)
RI P
x% 2 (t) =
Obviously, the steady-state fault data (9) can be considered a special case of transitional
NU
degenerates into the traditional PCA when r = 1 .
SC
fault data (8) and hence this section focuses on (8). From (6) to (9), one knows that PS-PCA
MA
The first step is to scale the training and monitoring data by using the sample average ( xnorm ) and variance ( Σ′norm ) of the training data. Because xnorm is the sample average of the training data
1
x% ′0 = ( rΣ′norm ) 2 ( x% 0 − rxnorm ) −
ED
x 0 , so xnorm ≈ μ norm . Hence, equations (6), (7), and (9) can be rewritten as
AC CE
PT
1 1 1 1 1 − − − − − ∈ N Σ′norm 2 r (μ norm − xnorm ), Σ′norm 2 Σ norm Σ′norm 2 ≈ N 0, Σ′norm 2 Σ norm Σ′norm 2
(10)
1
x% 1′ = ( rΣ′norm ) 2 (x% 1 − r xnorm ) −
1 1 1 1 1 − − − − − ∈ N Σ′norm 2 r (μ norm − xnorm ), Σ′norm 2 Σ norm Σ′norm 2 ≈ N 0, Σ′norm 2 Σ norm Σ′norm 2
(11)
1
x% ′2 = ( rΣ′norm ) 2 ( x% 2 − rxnorm ) −
1 1 1 − − − ∈ N (rΣ′norm ) 2 ( mμ fault + ( r − m)μ norm − rxnorm ), (rΣ′norm ) 2 ( mΣ fault + ( r − m) Σ norm )( rΣ′norm ) 2 1 1 1 − − − m =N Σ′norm 2 (μ fault − xnorm ), ( rΣ′norm ) 2 ( mΣ fault + ( r − m) Σ norm )( rΣ′norm ) 2 r 1 1 1 1 1 − − − − − m m m ≈N Σ′norm 2 (μ fault − μ norm ), Σ′norm 2 Σ fault Σ′norm 2 + (1 − ) Σ′norm 2 Σ norm Σ′norm 2 r r r
(12) As shown in (10), x% ′0 is not affected by summation number r , so PS-PCA will have the same
10
ACCEPTED MANUSCRIPT
training results as PCA, including loading matrix P , matrix Λ k , and thresholds δ T2 and δ Q2 .
T
Because x% ′0 and x% 1′ follow the same distribution and this distribution cannot be affected by
RI P
summation number r and fault data number m , their fault detection rates will be the same as those
MA
Q = z% ′T z% ′
and z% ′ = x% ′2 (I − PPT ) ,
NU
and T 2 in (4) and Q in (5) will become: T 2 = y% ′T y% ′
1 2
SC
in the traditional PCA, so only x% ′2 need to be studied. Set y% ′ = x% ′2 P ( Λ k )
−
(14)
ED
So
(13)
1 1 −1 m − − y% ′ = Λ k 2 PT x% ′2 ∈ N Λ k 2 PT Σ′norm 2 (μ fault − μ norm ) , r
PT
1 1 1 1 1 1 − − − − − − m m Λ k 2 P T Σ′norm 2 Σ fault Σ′norm 2 + (1 − ) Σ′norm 2 Σ norm Σ′norm 2 PΛ k 2 r r
(15)
AC CE
m m m =N μ 3 , Σ 3 + (1 − )Θ r r r 1 − m z% ′ = (I − PP T ) x% ′2 ∈ N (I − PPT ) Σ′norm 2 (μ fault − μ norm ) , r 1 1 1 1 − − − − m m (I − PPT ) Σ′norm 2 Σ fault Σ′norm 2 + (1 − ) Σ′norm 2 Σ norm Σ′norm 2 (I − PPT ) r r
(16)
m m m μ 4 , Σ 4 + (1 − )Ω =N r r r where
11
ACCEPTED MANUSCRIPT
−
1
−
1
μ 3 = Λ k 2 PT Σ′norm 2 (μ fault − μ norm ) 1
−
1
−
1
−
1
−
1
−
1
−
1
−
1
−
RI P
Θ = Λ k 2 PT Σ′norm 2 Σ norm Σ′norm 2 PΛ k 2
T
−
Σ 3 = Λ k 2 P T Σ′norm 2 Σ fault Σ′norm 2 PΛ k 2
1
−
1
SC
μ 4 = (I − PPT ) Σ′norm 2 (μ fault − μ norm ) −
1
Σ 4 = (I − PPT ) Σ′norm 2 Σ fault Σ′norm 2 (I − PPT ) −
1
−
1
Ω = (I − PP ) Σ′norm 2 Σ norm Σ′norm 2 (I − PPT )
NU
T
MA
Equations (15) and (16) demonstrate that both the expectation and variance of y% ′ and z% ′ vary with summation number r and fault data number m ; however, when m = r (or t f ≤ t − r + 1 ), rμ 3 and
rμ 4 . As the
ED
their variances will be fixed at Σ 3 and Σ 4 and their expectations will be
first step, Section 3.1 only studies the relationship between the fault detection rate and r in the
PT
steady-state fault data ( m = r or t f ≤ t − r + 1 ), and while the transitional fault data will be studied
AC CE
in Section 3.2. In addition, because the T 2 and Q statistics have similar form, Section 3.1 only study the relationship in T 2 statistic. THEOREM. If μ 3 ≠ 0 ( μ 4 ≠ 0 ), the following conclusions hold: (a) there exists r>0 , such that the fault detection rate of PS-PCA is higher than PCA when r>r ; (b) when r → ∞ , the fault detection rate of PS-PCA will converge to 100%.
PROOF. When y% ′ ∈ N
(
)
r μ 3 , Σ 3 , its probability density function is
1
f ( y% ′) =
2π Σ 3
−
1 2
e
1 % ′− r μ3 )] − [( y% ′ − r μ 3 )T Σ-1 3 (y 2
(17)
So the fault detection rate is 12
ACCEPTED MANUSCRIPT
P(r ) =
∫
∫
f (y% ′)dy% ′ = 1 −
y% ′T y% ′ >Tα
f (y% ′)dy% ′
(18)
y% ′T y% ′≤Tα
RI P
T
where Tα is the control limit of T 2 .Taking derivatives of (18), one gets:
∂P ∂f ( y% ′) 1 1 T −1 =− ∫ dy% ′ = ∫ f ( y% ′)[ μT3 Σ 3−1μ 3 − μ 3 Σ3 y% ′]dy% ′ ∂r ∂r 2 2 r y% ′T y% ′≤T y% ′T y% ′≤T α
SC
α
(19)
f (y% ′) ≥ 0
and
1 T −1 μ3 Σ3 μ 3 ≥ 0 2
.
2
NU
Because f ( y% ′) is the probability density function and Σ 3 is a positive definite matrix, so Because
μ3 ≠ 0
,
then
μT3 Σ 3−1μ 3 ≠ 0
and
hence
MA
μT3 Σ 3−1y% ′ 1 1 T −1 ∂P r = max ≠ ∞ . When r > r , then [ μT3 Σ 3−1μ 3 − μ 3 Σ3 y% ′] > 0 and hence >0. T − 1 T y% ′ y% ′≤Tα μ Σ μ 2 ∂r 2 r 3 3 3
ED
That is to say, when r ≥ r , the fault detection rate increases as r increases.
PT
Generally, the fault detection rate of the conventional PCA is less than 1, i.e. P (1) < 1 . On the
AC CE
other hand,
1 − [( y% ′ − r ×μ3 )T Σ3−1 ( y% ′ − r ×μ3 )] 1 2 lim P ( r ) = lim 1 − ∫ e dy% ′ =1-0=1 1 r →∞ r →∞ − y% ′T y% ′>Tα 2π Σ 2 3
Hence, there exists r ≥ r , such that P(r ) > P(1) for all r > r . The proof is finished.
REMARK 2. If μ 3 = 0 ( μ 4 = 0 ), PS-PCA and the traditional PCA will have the same fault detection rate. However, this condition is very rare, because most faults are unidirectional and continuous, which results in μ fault ≠ μ norm and it usually leads to μ 3 ≠ 0 ( μ 4 ≠ 0 ).
13
ACCEPTED MANUSCRIPT
3.2 Influence of fault data number on the fault detection rate
RI P
T
The conclusions in section 3.1 are drawn in situation when m = r and this section will examine the situation when m < r .
∫
E (T 2 ) = E ( y% ′T y% ′) =
NU
SC
m m m m m m For y% ′ ∈ N μ 3 , Σ 3 + (1 − )Θ and z% ′ ∈ N μ 4 , Σ 4 + (1 − )Ω , the r r r r r r 2 expectation of T can be calculated as below. f ( y% ′) × y% ′T y% ′dy% ′
y% ′T y% ′≤Tα ∞
∫
∞
−∞ −∞ k
L∫
= ∑ ∫ −∞ i =1 ∞
k
∞
−∞
(∫
∞
−∞
k
f ( y% ′) × ∑ ( y% i′2 )dy%1′dy% 2′ L dy% k′ = ∑ i =1
L∫
∞
∫
∞
−∞ −∞
k
L∫
MA
=∫
∞
−∞
i =1
(∫
∞
∫
∞
−∞ −∞
L∫
∞
−∞
f (y% ′) × y% i′2 dy%1′dy% 2′ L dy% k′
)
f ( y% ′)dy%1′ L dy%i′−1dy% i′+1 L dy% k′ × y% i′2 dy% i′
)
PT
ED
∞ = ∑ ∫ f y%′i ( y% i′) × y% i′2 dy% i′ −∞ i =1
∞
∞
−∞
−∞ −∞
∫
∞
L∫
AC CE
where f y%′i ( y% i′) = ∫ L ∫
∞
−∞
(20)
f (y% ′)dy%1′ L dy% i′−1dy% i′+1 L dy% k′ is the marginal distribution of y% i′ and
it follows Gaussian distribution with expectation and variance equal the corresponding terms of
m m m μ 3 and Σ 3 + (1 − )Θ , respectively. So r r r
k
k
2 E (T 2 ) = ∑ E ( y% i′2 ) = ∑ ( E ( y% i′) ) + D ( y% i′) i =1 i =1
m2 T m m = μ 3 μ 3 + tr[ Σ 3 + (1 − )Θ] r r r
(21)
Similarly, one has
E (Q ) = E (z% ′T z% ′) =
m2 T m m μ 4 μ 4 + tr[ Σ 4 + (1 − )Ω] r r r
(22)
When m = r = 1 , which means PS-PCA degenerates into the traditional PCA, the expectations 14
ACCEPTED MANUSCRIPT
of these statistics are
RI P
T
E (T 2 ) m = r =1= μ 3T μ 3 + tr ( Σ 3 ) E (Q ) m = r =1 = μ 4T μ 4 + tr ( Σ 4 )
(23) (24)
SC
When m = r ≠ 1 , the expectations of these statistics in steady-state are
NU
E (T 2 ) m = r ≠1= rμ 3T μ 3 + tr ( Σ 3 ) E (Q ) m = r ≠1 = rμ 4T μ 4 + tr ( Σ 4 )
(25) (26)
MA
For convenience, E (T 2 ) m = r =1 and E (Q ) m = r =1 are denoted as the results of the traditional PCA. According to Equations (23) to (26), one gets E (T 2 ) m = r ≠1> E (T 2 ) m = r =1 and E (Q ) m = r ≠1> E (Q ) m = r =1 .
ED
In addition, PS-PCA and PCA have the same thresholds δ T2 and δ Q2 , so PS-PCA will have higher
PT
detection rate than PCA in steady state, which is consistent with the conclusion in Section 3.1. Now consider the situation when m < r . Define eT 2 = E (T 2 ) − E (T 2 ) m = r =1 and eQ = E (Q ) − E (Q ) m = r =1 ,
AC CE
and then one gets
eT 2 =
μ 3T μ 3 2 tr ( Σ 3 − Θ ) m − m + [tr ( Σ 3 − Θ) − μ 3T μ 3 ] r r
(27)
eQ =
μ 4T μ 4 2 tr ( Σ 4 − Ω) m − m + [tr ( Σ 4 − Ω) − μ 4T μ 4 ] r r
(28)
Please note that the values of eT 2 and eQ are only determined by m , because other terms are constants. For convenience, define the following two constants.
tr ( Σ 3 − Θ ) 2 μ Tμ ) − 4 3 3 [tr ( Σ 3 − Θ ) − μ 3T μ 3 ] r r
(29)
tr ( Σ 4 − Ω) 2 μ Tμ ) − 4 4 4 [tr ( Σ 4 − Ω) − μ 4T μ 4 ] r r
(30)
∆T 2 = (
∆Q = (
15
ACCEPTED MANUSCRIPT
( ∆Q ≥ 0
tr ( Σ 3 − Θ) + r ∆T 2 T 3
2μ μ 3
),
then
( m > mQ =
eT 2 > 0
tr ( Σ 4 − Ω) + r ∆ Q 2μ 4T μ 4
( eQ > 0 )
holds,
only
when
). That is to say, PS-PCA needs mT 2
T
m > mT 2 =
∆T 2 ≥ 0
RI P
Assume
( mQ ) steps to make its statistic expectations catch up with those of traditional PCA. Constants mT 2
SC
and mQ are termed as detection delays for PS-PCA and they are monotonic increasing with respect
NU
to parameter r . Because E (T 2 ) m = r ≠1> E (T 2 ) m = r =1 and E (Q ) m = r ≠1> E (Q ) m = r =1 , one knows that
MA
mT 2 < r and mQ < r .
If ∆T 2 < 0 ( ∆ Q < 0 ), then eT 2 > 0 ( eQ > 0 ) for all m ≤ r , so mT 2 = 0 ( mQ = 0 ). In other
ED
words, PS-PCA has no detection delay compared with conventional PCA in this situation.
PT
REMARK 3. For fault which can be detected by conventional PCA, a non-zero detection delay means PS-PCA may detect this fault later than PCA, and the larger detection delay indicates the
AC CE
longer time later.
Because μ 3 , μ 4 , Σ3 and Σ4 are all related with μ fault and/or Σ fault , thus both ∆ T 2 and ∆ Q are affected by the fault data. That is to say, the influence of parameter r on the detection
delay varies with the fault. For some faults, a larger value of r will introduce a larger delay and while for some faults it may introduce a smaller delay. All in all, larger r may increase the detection rate at the price of larger detection delay. Hence, appropriate value of r should be chosen to balance the detection delay and detection rate: for faults which are difficult to detect, larger r is more suitable because it can significantly increase the fault detection rate; for faults which are easy to detect, larger r is unreasonable as it significantly amplify the detection delay while does not improve the detection rate too much.
16
ACCEPTED MANUSCRIPT
T
4 SIMULATION STUDY OF PS-PCA
RI P
This section aims to study the performance of PS-PCA on non-Gaussian and nonlinear processes through simulation tests. Experiments were all carried out on the same computer with a
SC
Core 2.93-GHz CPU, 2-G memory, and Windows 7 operating system.
NU
4.1 Test on a simulated non-Gaussian process
To fully analyze the characteristics of PS-PCA and to compare it with other MSPC methods
MA
(PCA, ICA-SVDD, and LOCAL-ICA), a simple simulated process is employed to illustrate the monitoring performance of these methods, similarly to that used by Jiang and Yan26. Given the
ED
following simple process with six Gaussian distributed variables and six non-Gaussian distributed
PT
variables.
AC CE
x1 = 0.1× N1 + 0.01× ω1 x = 5 + 0.15 × N + 0.015 × ω 2 2 2 x3 = 7 + 0.1× P3 + 0.01× ω3 x4 = 9 + 0.1× R4 + 0.01× ω 4 x = 11 − 0.3 × x + 0.8 × x + 0.9 × x + 0.3 × N + 0.03 × ω 1 2 3 5 5 5 x = 6 + x − 0.3 × x + x + 0.35 × N + 0.035 × ω 6 2 3 5 6 6 x7 = 8 − 0.5 × x1 + 0.8 × x2 + x4 + 0.01× ω7 x8 = 15 + x2 + x3 + 0.1× N 8 + 0.01× ω8 x9 = 3 + 0.5 × N 9 + 0.05 × ω9 x = 20 + 0.7 × N + 0.07 × ω 10 10 10 x11 = x9 + 0.8 × x10 + 0.01× ω11 x12 = 0.3 × x1 + 0.6 × x2 + 0.4 × x10 + 0.05 × ω12 Random variables N i and ωi follow the standard Gaussian distribution and ωi indicates the process noise. Random variables
Pi
and
Ri
follow Poisson distribution and Rayleigh
distribution, respectively. This process can be regard as a simplified model of industrial processes. In 17
ACCEPTED MANUSCRIPT
this model, variables x1 , x2 , x9 , x10 , x11 , and x12 are Gaussian distributed, which represents the
T
Gaussian components in industrial processes. Among them, variables x1 , x2 , x9 , x10 contain only
RI P
one Gaussian component for each while the other two variables contain two or more Gaussian
SC
components for each. In addition, x1 is zero means while the other five Gaussian variables are not. Poisson distribution and Rayleigh distribution are common non-Gaussian distribution and they are
NU
adopted to represent non-Gaussian components in industrial processes30-32. Both of variables x3 and x4 contains one non-Gaussian component and the rest four variables x5 , x6 , x7 and x8 contain
MA
both Gaussian component and non-Gaussian component simultaneously. At most 30,000 normal observations are produced for offline modeling. Furthermore, another 3,000 samples are generated
ED
for online monitoring, where a fault occurs at the 1001th sample point. The occurred fault might be
PT
the following nine types:
AC CE
Fault 1: a step change with amplitude of 0.1 in x1 ; Fault 2: a step change with amplitude of 0.1 in x2 ; Fault 3: a step change with amplitude of 0.1 in x3 ; Fault 4: coefficient 0.8 in x5 is changed to 0.88; Fault 5: coefficient 0.8 in x7 is changed to 0.88; Fault 6: a step change with amplitude of 0.1 in x5 ; Fault 7: coefficient -0.3 in x6 is changed to -0.25 Fault 8: term −0.3 × x1 in the expression of x5 is missed; Fault 9: term −0.3 × x3 in the expression of x6 is missed. These faults are common for process monitoring and they can be divided into three categories:
18
ACCEPTED MANUSCRIPT
Faults 1, 2, 3, and 6 are the step change fault26, which will not change the correlation between
T
variables; Faults 4, 5, and 7 are the coefficient change fault18 and they can affect the correlation
RI P
between variables; Faults 8 and 9 are the component missing fault, which can be regard as a special case of coefficient change fault. Each fault occurs in different variables to make the detection results
SC
more convincing.
NU
4.1.1 The influence of summation number on Gaussianity
MA
To test the influence of summation number r on Gaussianity, kurtosis14 is adopted. In ICA, kurtosis is used to measure non-Gaussianity and hence it can reflect Gaussianity. The kurtosis of x
ED
is defined as
kurt ( x ) = E ( x 4 ) − 3( E { x 2 }) 2
PT
(31)
AC CE
For a Gaussian variable x , its fourth moment E ( x 4 ) equals 3( E { x 2 }) 2 . Thus, kurtosis is zero for a Gaussian random variable. For most non-Gaussian random variables, their kurtosis values are non-zero. Kurtosis can be either positive or negative. A random variable with negative kurtosis is classified as sub-Gaussian, and that with positive kurtosis is classified as super-Gaussian. Generally, Gaussianity can be measured by the absolute value of kurtosis: smaller absolute value of kurtosis indicates better Gaussianity and vice versa. 100
In this paper, the expectation in (31) is calculated as E ( χ ) = ∑ χ i 100 . To test the influence of i =1
the summation number r on the Gaussianity of this process, the value of r changes from 1 to 100, and then the max absolute values of the 12 variables’ kurtosis can be obtained. Twenty groups of Monte Carlo tests have been done and the average results are shown in Figure 2.
19
ACCEPTED MANUSCRIPT
(insert Figure 2 here)
RI P
T
As shown in Figure 2, even though there are some fluctuations, the general trend is clear: the kurtosis decreases fast as r increases from 1 to 10, but becomes steady after r is larger than 12. This result indicates that Gaussianity of the data can be increased by summation and r needs not to
SC
approach infinity, a value as small as 12 is enough to make this process approximate to a Gaussian
NU
process.
MA
4.1.2 PS-PCA fault detection results
The result in section 4.1.1 indicates that when summation number r is large enough, a
ED
non-Gaussian process will be very close to a Gaussian process, so PCA can be adopted to detect fault. In this test, the number of sums nnew for PS-PCA are set as 300, and take r as 1, 2, 5, 10, 50, and
PT
100 respectively, so the number of training data samples n should be 300 ( 300 ×1 ), 600 ( 300 × 2 ),
AC CE
1,500 ( 300 × 5 ), 3,000 ( 300 ×10 ), 15,000 ( 300 × 50 ), and 30,000 ( 300 ×100 ), respectively. For the sake of fairness, n for PCA, LOCAL-ICA are the same as for PS-PCA. However, when the number of the training data samples is very large, the computational load of ICA-SVDD exceeds the computational capability of the above-mentioned computer. Hence, the number of the training data samples n for ICA-SVDD is chosen as 300, 600, 1,000, 1,500, 3,000, and 6,000. For convenience and fairness, the number of the dominant PCs k for PS-PCA is set the same as that for PCA with 300 training data. In this study, all control limits are based on the confidence limit of 99%. Table 1 shows the false alarm rates of the four methods and Table 2 shows the detection rates of the nine faults. The false alarm rate is calculated as
the number of faults detected before 1, 000 and 1, 000
20
ACCEPTED MANUSCRIPT
(insert Table 1 here)
SC
(insert Table 2 here)
T
the number of faults detected between 1, 001 and 3, 000 . 2, 000
RI P
the detection rate is calculated as
In Table 1, when n is smaller than 3000, LOCAL-ICA has a large false alarm rate and thus
NU
Table 2 just list its detection rate when n takes 3,000, 15,000, and 30,000. Except for LOCAL-ICA,
MA
there is no significant difference in the false alarm rates of the other three methods. This is reasonable because neither x% ′0 nor x% 1′ in (11) and (12) are affected by summation number r and
ED
thus PS-PCA will has the same false alarm rate as PCA. However, the detection rates of PS-PCA and the other three methods show large variation in Table 2. In this table, PCA and ICA-SVDD can only
PT
detect Faults 3 and 9. Because LOCAL-ICA uses summation method to calculate its monitoring statistics, the fault information can be cumulated similarly with that in PS-PCA, and hence it can also
AC CE
detect Faults 5 and 7. However, LOCAL-ICA cannot perfectly separate Gaussian and non-Gaussian components and thus it fails to detect the other five faults. For PS-PCA, the preliminary summation process can successfully convert non-Gaussian components to Gaussian components, as a result, when r is 100, its detection rates for all faults except for Fault 6 are nearly 100%. Even for Fault 6, the detection rates of PS-PCA are 28.6% and 25.6%, which are much better than those of the other three methods. From the conclusion in Section 3.1, one knows that the detection rates of Fault 6 will be larger when r is larger and it will reach 100% when r → ∞ . One interesting phenomenon is that the detection rates of PS-PCA for most faults are little less than 100%, even for Faults 3 and 9. This is due to the detection delay of PS-PCA. Because Fault 3 and 9 can be detected easily by the traditional PCA, a larger value of r not only fails to improve the fault detection ability of PS-PCA, but also increase the detection delay evidently. As a result, differently from other faults whose 21
ACCEPTED MANUSCRIPT
detection rates are increasing with respect to r , the detection rates for Fault 3 and 9 is decreasing.
T
For Fault 9, PS-PCA achieves 99.9% detection rate when r = 5 , but its detection rate decreases to
RI P
99.6% when r = 100 , which means that r = 5 is large enough for Fault 9, and larger value will only increase the detection delay. In general, the effect of the fault detection delay is very little and
SC
PS-PCA is much better than the other methods.
NU
In order to clearly illustrate the superiority of PS-PCA, the monitoring charts for Faults 1, 3, and 9 are shown in Figures 3–5. To clearly display the curves, some charts are drawn
MA
with logarithmic coordinates. In Figures 3–5, the mean values of the T 2 and Q statistics in PS-PCA are increasing when r increases and they are much larger than those in PCA and
ED
ICA-SVDD. For example, in Figure 5, the statistics of PCA and ICA-SVDD marginally exceed the control limit quickly after the fault occurs, and they may sometimes fall down below the control limit.
PT
However, for PS-PCA, when r is large enough (such as 100), its T 2 and Q statistics will be
AC CE
far beyond the control limit and will not fall down below the limit any more. As a result, the detection rates of PS-PCA are much larger than these two methods. Statistic Tng2 in Figure 4 and statistic Te2 in Figure 5 show that, LOCAL-ICA can also detect Faults 3 and 9, because of its summation process in monitoring statics. However, in Figure 3, PS-PCA can successfully detect the faults while LOCAL-ICA fails, which demonstrates that preliminary summation is a more effective approach to handle the non-Gaussian problem and hence it is a simple but effective improvement for PCA. (insert Figure 3 here) (insert Figure 4 here)
22
ACCEPTED MANUSCRIPT
(insert Figure 5 here)
T
T 2 and Q are 9 and 13 for r = 1 , 50 and 100
RI P
In addition, in Figures 5, the mean values of
for r = 10 , and 400 and 1000 for r = 100 , respectively, which indicates that E (T 2 ) ≈ 4r + 5 and
SC
E (Q ) ≈ 9 r + 10 . This result is consistent with the conclusion in Section 3.2.
NU
To test the influence of summation number r on the detection rate, two situations were investigated: fix the number of sums nnew at 300 and change r from 1 to 100; and fix n at
MA
30,000 and change r from 1 to 100. The results for Faults 1, 4, and 6 are shown in Figures 6–8.
ED
(insert Figure 6 here)
PT
(insert Figure 7 here) (insert Figure 8 here)
AC CE
From these results it can be found that, for both strategies, the detection rates increase as r increases, and when r is large enough (such as 100), the detection rate is very close to 100%, which is consistent with conclusion in Section 3.1. For the second strategy, as n is fixed at 30,000, nnew decreases when r increases. However, the results of the three figures all show that the detection rate still increases even nnew decreases. The reason for this phenomenon is that nnew only affects the training data while r affects both the training and monitoring data. When nnew is large enough, the increase of nnew will not affect the training result any more while the increase of r will help PS-PCA accumulate more fault information in the monitoring data and improve the detection performance. Therefore, r affects the detection rates in two ways: making the sums close to Gaussian distribution and gathering more fault information. As a result, PS-PCA is more sensitive 23
ACCEPTED MANUSCRIPT
to faults and has the ability to deal with non-Gaussianity.
RI P
T
As mentioned earlier, PS-PCA requires more computation than PCA because of the additional preliminary summation step. The time cost for four comparison methods is listed in Table 3.
SC
(insert Table 3 here)
NU
In Table 3, the time cost for modeling offline with other methods is much longer than PS-PCA and the time is longer still when the number of the training data points increases. For PS-PCA the
MA
time cost of preliminary summation for modeling offline is about 1/20 of that required for PCA. Preliminary summation for online monitoring needs less than 1/12 of the time for monitoring online.
ED
Though the time cost for preliminary summation becomes larger when r increases, it does not increase too much because the sums are calculated recursively. Hence, PS-PCA with 1,500 or 30,000
PT
training data samples needs almost the same computation as the traditional PCA with 300 training data samples and thus it needs much less time than the other methods with 1,500, 6,000, or 30,000
AC CE
training data samples.
Overall, PS-PCA is more sensitive to process faults and has better fault detection rates for the processes with both Gaussian and non-Gaussian features, while only adding little computational load when using the same nnew . 4.2 Test on a simulated nonlinear process Besides non-Gaussianity, nonlinearity is also wide-spread in industrial processes and it is another significant challenge for PCA. It has been demonstrated that PS-PCA performs very well on linear processes. Then PS-PCA is applied to nonlinear processes and is compared with traditional nonlinear methods, such as KPCA, KICA and KICA-PCA. 24
ACCEPTED MANUSCRIPT
RI P SC
ED
MA
NU
x1 = eU1 + 0.01× ω1 2 x2 = U1 + 0.01× ω2 x = U + U 2 + 0.01× ω 1 2 3 3 x4 = sin(U 2 + N1 ) + 0.01× ω4 (N +N ) x5 = e 1 2 + 0.01× ω5 2 x6 = (U1 + U 2 + N1 + N 2 ) + 0.01× ω6 x7 = U1 × U 2 × N1 × N 2 + 0.01× ω7 x = U 3 + U 3 + N 3 + N 3 + 0.01× ω 1 2 1 2 8 8 x9 = U1 × e N 2 + 0.01× ω9 x10 = N1 × sin( N 2 ) + 0.01× ω10 x = U + 2 × U + 3 × N + 4 × N + 0.01× ω 1 2 1 2 11 11 x12 = U1U 2 + 0.01× ω12
T
The simulated process is shown as below:
Random variables N i and ωi follow the standard Gaussian distribution and ωi indicates the
PT
process noise. Random variable U i follow the uniform distribution in interval [ −1,1] . At most
AC CE
60,000 normal observations are produced for offline modeling. Furthermore, another 3,000 samples are generated for online monitoring, where a fault occurs at the 1,001th sample point. The occurred fault might be the following seven types: Fault 1: a step change with amplitude of 0.1 in x1 ; Fault 2: a step change with amplitude of 0.1 in x2 ; Fault 3: a step change with amplitude of 0.1 in x3 ; Fault 4: term e ( s3 + s4 ) in x5 is changed to 0.5 × e( s3 + s4 ) ; Fault 5: term ( s1 + s2 + s3 + s4 ) 2 in x6 is changed to ( s1 + s2 + s3 + 5 × s4 ) 2 ; Fault 6: term s1 × s2 × s3 × s4 in x7 is changed to s1 × s2 ;
25
ACCEPTED MANUSCRIPT
Fault 7: term s13 in x8 is missed.
RI P
T
It is a nonlinear process with both Gaussian and non-Gaussian features, so the control limits of T 2 and Q statistics in PS-PCA cannot be determined directly from a particular approximate
SC
distribution. Therefore, the control limits for these two statistics, T 2 and Q , are obtained by using
NU
kernel density estimations as used for ICA.
In the same manner as Section 4.1, the number of sums nnew for PS-PCA are set as 300, and
MA
take r as 1, 5, 10, 50, 100, 200 respectively, so the number of the training data samples, n should be 300 ( 300 ×1 ), 1,500 ( 300 × 5 ), 3,000 ( 300 ×10 ), 15,000 ( 300 × 50 ), 30,000 ( 300 ×100 ), and
ED
60,000 ( 300 × 200 ), respectively. However, when the number of the training data samples is very large, the computation of KPCA, KICA, and KICA-PCA exceeds the computational capability of the
PT
above-mentioned computer. Hence, the number of the training data samples, n for KPCA, KICA,
AC CE
and KICA-PCA were chosen as 300, 1,500 and 15,000. In this paper, all control limits are based on the confidence limit of 95%. For convenience and fairness, the number of the dominant PCs, k , for PS-PCA is set the same as that for KPCA with 300 training data. The kernel functions and their parameters in KPCA, KICA, and KICA-PCA are selected through many times of attempts. Table 4 shows the false alarm rates of four methods and Table 5 shows the detection rates for the
seven
types
of
faults.
The
the number of faults detected before 1, 000 1, 000
and
false the
alarm detection
rate rate
is is
calculated
as
calculated
as
the number of faults detected between 1, 001 and 3, 000 . 2, 000
(insert Table 4 here) 26
ACCEPTED MANUSCRIPT
(insert Table 5 here)
RI P
T
From Tables 4 and 5, it can be seen that PS-PCA performs much better than the other three methods, despite being a linear method, while the other three methods are proposed for nonlinear processes. In Table 4, there is no significant difference in false alarm rates between the four methods.
SC
The detection rates for PS-PCA are much larger than the other three methods. Across the seven faults,
NU
KPCA, KICA and KICA-PCA can only detect Faults 5, and their detection rates are very small. Even for Fault 7, PS-PCA still performs much better than the other three methods, and its performance can
MA
be improved by increasing the number of training data while the other three methods cannot. Similarly to the results in Section 4.1.2, the detection rate of PS-PCA increases when r increases.
ED
Figures 9–11 show the monitoring charts for PS-PCA of Faults 2, 4, and 6.
AC CE
PT
(insert Figure 9 here)
(insert Figure 10 here) (insert Figure 11 here)
Just as in Figures 3–5, the mean values of T 2 and Q statistics increase when r gets larger. However, in Figures 3–5, both statistics increase when r gets larger, while in Figures 9–11, sometimes just one statistic get larger and the other one shows no regularity in variation. This phenomenon indicates that there are some differences in applying PS-PCA to linear process and nonlinear processes. Understanding this phenomenon is an area for future research. Table 6 shows the time cost for four compared methods. In this table, the other three methods need much more time than PS-PCA. The calculation loads of KPCA, KICA and KICA-PCA explosively grow with the increasing number of training data, while the calculation load of PS-PCA 27
ACCEPTED MANUSCRIPT
almost does not change. More important, for KPCA, KICA and KICA-PCA, both kernel functions
T
and their parameters have to set by trial and error20. Compared with them, parameter r in PS-PCA
SC
(insert Table 6 here)
RI P
can be selected more conveniently. Hence, PS-PCA is a promising algorithm for nonlinear processes.
NU
5 FAULT DETECTION IN THE TENNESSEE EASTMAN PROCESS The Tennessee Eastman process simulation was developed by Downs and Vogel (1993)25, 26,
MA
which simulates industrial processes containing both non-Gaussian and nonlinear features. As a benchmark simulation, the TE process has been widely used to test the performance of various
ED
monitoring approaches. The process consists of five major unit operations: a reactor, a product
PT
condenser, a vapor–liquid separator, a recycle compressor, and a product stripper. Two products are produced by two simultaneous gas–liquid exothermic reactions, and a byproduct is generated by two
AC CE
additional exothermic reactions. The process has 12 manipulated variables, 22 continuous process measurements, and 19 composition measurements sampled less frequently. A set of 20 programmed faults are introduced to the process, listed in Table 7. (insert Table 7 here)
Table 8, shows 33 variables, containing 22 measurements and 11 manipulated variables that are used for monitoring, because the 19 composition measurements are hard to measure in real time and one manipulated variable, the agitation speed, is not manipulated. In this work, a training data set including at most 60,000 normal samples is used to build the monitoring models. The testing data set includes 3,000 samples and all faults are introduced from the 1,001th sample and continue until the
28
ACCEPTED MANUSCRIPT
the number of faults detected between 1, 001 and 3, 000 . 2, 000
SC
(insert Table 8 here)
RI P
detection rate is calculated as
the number of faults detected before 1, 000 and the 1, 000
T
end. The false alarm rate is calculated as
NU
This section compares PS-PCA with PCA, ICA-SVDD, LOCAL-ICA, KPCA, KICA, and KICA-PCA in TE process. Considering the computational capability of computer, ICA-SVDD,
MA
KPCA, KICA, and KICA-PCA have less training data than the other four methods. All control limits are based on a confidence limit of 99% and their false alarm rates are listed in Table 9. Table 9
ED
indicates that 300 training data points are not enough for all of these methods because their false alarm rates are too large. To account for this, the numbers of training data samples for PS-PCA and
PT
PCA are set to 15,000 ( nnew = 300, r = 50 ) and 60,000 ( nnew = 300, r = 200 ). Because of the large
AC CE
false alarm rate of LOCAL-ICA, its number of training data is only set to 60000. For the other four methods, their numbers of training data samples are set as 900 and 4500, successively. The detection rates of these seven methods are listed in Tables 10 and 11. (insert Table 9 here) (insert Table 10 here) (insert Table 11 here) Both Tables 10 and 11 show that the performance of PS-PCA is superior to those of the other algorithms for most faults. Especially for Faults 5, whose detection rates of other methods are generally below 25%, whereas its detection rates under PS-PCA is 96.2%. The results in two tables also show us that for some faults which can be detected easily, such as Faults 1, 2, and 6, PS-PCA 29
ACCEPTED MANUSCRIPT
achieves a little worse than the other methods. The reason for this phenomenon is that PS-PCA incurs
T
a detection delay which impacts the fault detection rate. However, its effect is very little, so the
RI P
detection rate of PS-PCA reaches almost 100%. For Fault 12, only PS-PCA and LOCAL-ICA can detect it and LOCAL-ICA performs better than PS-PCA in detection rate. However, LOCAL-ICA
SC
costs much larger computational load and in addition incurs large false alarm rate, so PS-PCA works much better than LOCAL-ICA based on the overall performance of the 20 faults. Results in two
NU
tables also indicates that r = 200 is large enough for most faults except for Faults 12, 15, and 16,
MA
because their detection rates are almost 100% and larger r will only leads to larger detection delay. Figures 12-13 show the monitoring charts of PS-PCA for Faults 3 and 15. In Figure 12, when
ED
r ≤ 10 , the mean values of two statistics decrease when r increases, which seems contradicts to the conclusion in Section 3.2. The reason for this phenomenon is that, when r ≤ 10 , the process does not
PT
follow Gaussian distribution and thus the mean values of two statistics do not increase. However,
AC CE
when r > 10 , this process is closer to Gaussian and thus conclusion in Section 3.2 works. Figures 12 also shows the relationship between r and the detection delay: T 2 reaches its control limit at sample time 1850 when r =50, at sample time 1900 when r =100, and at sample time 2000 when
r =200, respectively, which indicates that PS-PCA needs more steps to reach the same value when r is larger. In Figure 13, when r = 10 , there are few data go beyond the control limit. However, more and more data go beyond the control limit when r gets larger, which proves that a large r can gather more fault information. (insert Figure 12 here) (insert Figure 13 here) It can be concluded from the simulation result that PS-PCA has better fault detection ability and 30
ACCEPTED MANUSCRIPT
T
a lower calculation load than the other methods. It is therefore a significant improvement over PCA.
RI P
6 CONCLUSIONS
In this paper, Preliminary-Summation-based PCA (PS-PCA) was proposed to deal with
SC
processes where Gaussian features and non-Gaussian features exist simultaneously. Differently from
NU
other improvement approaches, which change the algorithm structure of PCA, PS-PCA just preprocesses the training and monitoring data without any modification on PCA itself. To deal with
MA
the non-Gaussian features in the process, PS-PCA adds up the samples of each variable to make the variable distribution close to the Gaussian distribution. Moreover, by adding up the variable
ED
samples, the fault information is accumulated and faults can be detected much more easily.
PT
It has been proved that preliminary summation can improve the fault detection rate in Gaussian process. Tested on a simulated linear non-Gaussian process and the TE process, PS-PCA performed
AC CE
much better than PCA, ICA-SVDD, and LOCAL-ICA: not only have better fault detection ability but also need less calculation in most situations. It is therefore a promising MSPC method for non-Gaussian processes. Moreover, PS-PCA can be applied to non-linear processes with superior performance compared with KPCA, KICA, and KICA-PCA. This indicates that PS-PCA has potential in dealing with nonlinearity. Because PS-PCA is an improved PCA, it inherits some PCA’s drawbacks, e.g., it cannot work in multiple normal states33. On the other hand, AHPCA34, Phase MPCA35, and other methods36 have been proposed to deal with multi-stage issue. These above-mentioned methods could be integrated into PS-PCA, which will be studied in the near future.
31
ACCEPTED MANUSCRIPT
ACKNOWLEDGEMENT
RI P
T
This work was supported by National Natural Science Foundation of China (61374099), the Program for New Century Excellent Talents in University (NCET-13-0652), and Beijing Higher
AC CE
PT
ED
MA
NU
SC
Education Young Elite Teacher Project (YETP0505).
32
ACCEPTED MANUSCRIPT
REFERENCES
7 8 9 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24 25 26
T
RI P
SC
6
NU
5
MA
4
ED
3
PT
2
Ge Z, Song Z. Process monitoring based on independent component analysis-principal component analysis (ICA-PCA) and similarity factors. Industrial & Engineering Chemistry Research. 2007; 46:2054-2063. Zhao C, Gao F. Fault-relevant principal component analysis (FPCA) method for multivariate statistical modeling and process monitoring. Chemometrics and Intelligent Laboratory Systems. 2014; 133:1-16. Zhao C, Wang F, Lu N, Jia M. Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes. Journal of Process Control. 2007; 17:728-741. Elshenawy LM, Yin S, Naik AS, Ding SX. Efficient recursive principal component analysis algorithms for process monitoring. Industrial & Engineering Chemistry Research. 2009; 49:252-259. Zhang Y, Li S, Teng Y. Dynamic processes monitoring using recursive kernel principal component analysis. Chemical Engineering Science. 2012; 72:78-86. Kano M, Nagao K, Hasebe S, Hashimoto I, Ohno H, Strauss R, Bakshi BR. Comparison of multivariate statistical process monitoring methods with applications to the Eastman challenge problem. Computers & Chemical Engineering. 2002; 26:161-174. Jeng J-C. Adaptive process monitoring using efficient recursive PCA and moving window PCA algorithms. Journal of the Taiwan Institute of Chemical Engineers. 2010; 41:475-481. Choi SW, Martin EB, Morris AJ, Lee I-B. Adaptive multivariate statistical process control for monitoring time-varying processes. Industrial & Engineering Chemistry Research. 2006; 45:3108-3118. Wang X, Kruger U, Irwin GW. Process monitoring approach using fast moving window PCA. Industrial & Engineering Chemistry Research. 2005; 44:5691-5702. Zhao C, Sun Y. Subspace decomposition approach of fault deviations and its application to fault reconstruction. Control Engineering Practice. 2013; 21:1396-1409. Zhao C, Gao F, Wang F. An improved independent component regression modeling and quantitative calibration procedure. AIChE Journal. 2010; 56:1519-1535. Zhao C, Wang F, Mao Z, Lu N, Jia M. Adaptive monitoring based on independent component analysis for multiphase batch processes with limited modeling data. Industrial & Engineering Chemistry Research. 2008; 47:3104-3113. Stefatos G, Ben Hamza A. Dynamic independent component analysis approach for fault detection and diagnosis. Expert Systems with Applications. 2010; 37:8606-8617. Hyvärinen A, Oja E. Independent component analysis: algorithms and applications. Neural Networks. 2000; 13:411-430. Hyvärinen A, Pajunen P. Nonlinear independent component analysis: existence and uniqueness results. Neural Networks. 1999; 12:429-439. Hyvarinen A. Survey on independent component analysis. Neural Computing Surveys. 1999; 2:94-128. Liu X, Xie L. Statistical-based monitoring of multivariate non-Gaussian systems. AIChE journal. 2008; 54:2379-2391. Ge Z, Xie L. Local ICA for multivariate statistical fault diagnosis in systems with unknown signal and error distributions. AIChE Journal. 2012; 58:2357-2372. Peña VH, Víctor H, Lai TL, Shao Q-M. Self-normalized processes: limit theory and statistical applications. Springer, 2009. Zhang Y. Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM. Chemical Engineering Science. 2009; 64:801-811. Ge Z, Yang C, Song Z. Improved kernel PCA-based monitoring approach for nonlinear processes. Chemical Engineering Science. 2009; 64:2245-2255. Xu Y, Zhang D, Song F, Yang J-Y, Jing Z, Li M. A method for speeding up feature extraction based on KPCA. Neurocomputing. 2007; 70:1056-1061. Zhang Y, Qin SJ. Improved nonlinear fault detection technique and statistical analysis. AIChE Journal. 2008; 54:3207-3220. Fan J, Qin SJ, Wang Y. Online monitoring of nonlinear multivariate industrial processes using filtering KICA–PCA. Control Engineering Practice. 2014; 22:205-216. Khediri IB, Limam M, Weihs C. Variable window adaptive kernel principal component analysis for nonlinear nonstationary process monitoring. Computers & Industrial Engineering. 2011; 61:437-446. Jiang Q, Yan X. Chemical processes monitoring based on weighted principal component analysis and its application. Chemometrics and Intelligent Laboratory Systems. 2012; 11-20.
AC CE
1
33
ACCEPTED MANUSCRIPT
33 34 35 36
T
RI P
SC
32
NU
31
MA
30
ED
29
PT
28
Zhang Y, Li S. Modeling and monitoring of nonlinear multi-mode processes. Control Engineering Practice. 2014; 22:194-204. He QP, Wang J. Statistics pattern analysis-a new process monitoring framework and its application to eemiconductor batch processes. AIChE journal. 2014; 57:107-121. Wang J, He QP. Multivariate process monitoring based on statistics pattern analysis. Industrial & Engineering Chemistry Research. 2010; 49:7858-7869. Xie M, He B, Goh T. Zero-inflated Poisson model in statistical process control. Computational Statistics & Data Analysis. 2001; 38:191-201. Babus F, Kobi A, Tiplica T, Bacivarov I, Bacivarov A. Control charts for non-Gaussian distributions. in Advanced Topics in Optoelectronics, Microelectronics, and Nanotechnologies III. 2007; 66350I-66350I-8. Lee W-C, Wu J-W, Hong M-L, Lin L-S, Chan R-L. Assessing the lifetime performance index of Rayleigh products based on the Bayesian estimation under progressive type II right censored samples. Journal of Computational and Applied Mathematics. 2011; 235:1676-1688. Zhao SJ, Zhang J, Xu YM. Monitoring of processes with multiple operating modes through multiple principle component analysis models. Industrial & Engineering Chemistry Research. 2004; 43:7025-7035. Rännar S, MacGregor JF, Wold S. Adaptive batch monitoring using hierarchical PCA. Chemometrics and Intelligent Laboratory Systems. 1998; 41:73-81. Dong D, McAvoy TJ. Batch tracking via nonlinear principal component analysis. AIChE Journal. 1996; 42:2199-2208. Yao Y, Gao F. A survey on multistage/multiphase statistical modeling methods for batch processes. Annual Reviews in Control. 2009; 33:172-183.
AC CE
27
34
ACCEPTED MANUSCRIPT
Captions of Figures and Tables
RI P
Figure 2. Relationship of Gaussianity and summation number r.
T
Figure 1. The process of summation.
Figure 3. Monitoring chart for Fault 1. Figure 4. Monitoring chart for Fault 3.
SC
Figure 5. Monitoring chart for Fault 9. Figure 6. Fault detection rates for Fault 1 in two situations.
NU
Figure 7. Fault detection rates for Fault 4 in two situations.
Figure 8. Fault detection rates for Fault 6 in two situations.
Figure 10. Monitoring chart for Fault 4. Figure 11. Monitoring chart for Fault 6. Figure 12. Monitoring chart for Fault 3.
MA
Figure 9. Monitoring chart for Fault 2.
ED
Figure 13. Monitoring chart for Fault 15.
PT
Table 1. False alarm rates (%) of four methods. Table 2. Detection rates (%) for nine faults.
AC CE
Table 3. Time costs (s) for four compared methods. Table 4. False alarm rates (%) of four methods. Table 5. Detection rates (%) for seven types of faults. Table 6. Time costs (s) for four comparison nonlinear methods. Table 7. Fault descriptions for TE process. Table 8. Monitored variables in TE process. Table 9. False alarm rates (%) of seven compared methods. Table 10. Detection rates (%) of PS-PCA and three linear methods in TE process. Table 11. Detection rates (%) of PS-PCA and three nonlinear methods in TE process.
35
n
r
r
⊕ % (1) X
⊕ % (2) X
⊕ % (3) X
r
L
SC
⊕ ⊕ % (n − 1) X % (n ) X new new
r L
L
⊕ x% (t + 1)
L
⊕ x% (t + 2)
MA
⊕ x% (t )
NU
x
r
RI P
r
X
T
ACCEPTED MANUSCRIPT
Figure 1. The process of summation.
AC CE
PT
ED
(a) the training data; (b) the monitoring data.
36
ACCEPTED MANUSCRIPT
0.2 0.18
T
0.16
RI P
0.14
kurt
0.12 0.1
0.06 0.04
0
10
20
30
40
50 r
60
70
NU
0.02
SC
0.08
80
90
100
AC CE
PT
ED
MA
Figure 2. Relationship of Gaussianity and summation number r.
37
ACCEPTED MANUSCRIPT
(b) PCA (n=3000)
(a) PCA (n=300)
20
20
20
T2
T2
30
10
3,000
15
10
10 5
0 0
0 0
3000
1000
2000
3000
0.2 0.1 0.01 0
0 0
3000
20 Q
1000 2000 Sample Time
0 0
3000
(g) LOCAL-ICA (n=300)
1000
2000
1000
2000
2000
3000
1000
2000
(j) PS-PCA (r=1, n=300)
AC CE
10
1000
2000
Q
Q
1000 2000 Sample Time
0 0
3000
2000
3000
1000
2000
3000
1000 2000 Sample Time
3000
(l) PS-PCA (r=100, n=30000)
100 T2
150
40
50
1000
2000
0 0
3000
20
50
15
40
0 0
1000
2000
3000
1000 2000 Sample Time
3000
30
10
20
5 3000
1000
20
60
0 0
10
5
1000 2000 Sample Time
20
15
3000
10
(k) PS-PCA (r=10, n=3000)
3000
1000 2000 Sample Time
Q
T2
20
3000
40
T2
30
2000
10
0 0
3000
T 2e
T 2e
3000
PT
1000 2000 Sample Time
1000
20
20
0 0
3000
20
0 0
3000
40
2000
2000
30
T 2g
T 2g
1000
4000
1000
(i) LOCAL-ICA (n=30000)
10
0 0
3000
10 0 0
3000
20
ED
T 2g
3000
30
50
0 0
3000
10
0 0
3000
100
0 0
1000 2000 Sample Time
0.3 0.2 0.1 0 0 30 20 10 0 0 20
T 2ng
T 2ng
50
0 0
2000
1000 2000 Sample Time
(f) ICA-SVDD (n=6000)
20
100
0 0
0 0
(h) LOCAL-ICA (n=3000)
150
0 0
10
1000
MA
Q
2000
10 0 0
2000
20 10
-3
1000
3,000
NU
T2
0
0 20
1000
30
10 10
1,000 2,000 Sample Time
R2
R2
R2 T2
0 0 4 10
3000
5
0.3
0.2
2000
10
(e) ICA-SVDD (n=1500)
0.4
1000
15
SC
1000 2000 Sample Time
0 0
3000
Q
5
(d) ICA-SVDD (n=300)
T 2ng
2000
Q
Q
15
1000
RI P
2,000
T2
1,000
10
Q
0 0
0 0
T
T2
30
10
T 2e
(c) PCA (n=30000)
30
10 1000 2000 Sample Time
3000
0 0
Figure 3. Monitoring chart for Fault 1. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.
38
ACCEPTED MANUSCRIPT
(a) PCA (n=300)
(b) PCA (n=3000)
100
100
100
T2
T2
150
1000
2000
0 0
3000
0 0
3000
4
10 0
1000
2000
0 0
3000
150 100
10 10
Q
-2
0
1000
2000
3000
4 2 0 -2
0
10 T 2e
10 2 10 0 10 -2 10 0 4 10 2 10 0 10 -2 10 0 50
T 2ng
0
10 10 10
1000
2000
6 4
0
1000 2000 Sample Time
0 0
3000
(j) PS-PCA (r=1, n=300)
AC CE
50
0 0
1000
2000
10 10
R2
T2
1000 2000 Sample Time
10
15
10
0 0
1000 2000 Sample Time
3000
2000
3000
1000 2000 Sample Time
3000
50 0 0
3000
1000
2000
(i) LOCAL-ICA (n=30000)
4
10 2 10 0 10 -2 10 0 20
3000
1000
2000
3000
1000
2000
3000
1000 2000 Sample Time
3000
10
1000
2000
0 0 50
3000
25
1000 2000 Sample Time
0 0
3000
(k) PS-PCA (r=10, n=3000) 10
2
10
1
10
0
0
1000
2000
10
3000
4
10
2
10
Q 10
5
1000
100
(l) PS-PCA (r=100, n=30000)
5
3
2
10
20
0 0 150
T
10
3000
10
3
3000
1
-1
0
1000
2000
3000
1000 2000 Sample Time
3000
4
2
Q
T2
100
2000
10
3000
T2
150
Q
2000
25
2
0
3000
1000
20
(h) LOCAL-ICA (n=3000)
4
2
0 0
T 2g
10
4
1000
T 2g
10
(g) LOCAL-ICA (n=300)
0.1
3000
T 2e
10
0 0
3000
ED
T 2ng
10
50 1000 2000 Sample Time
0.2
30
MA
0 0
PT
Q
50
10
2000
20
-3
100
10
1000
30
0
SC
0.1
T 2e
10
2000
3000
(f) ICA-SVDD (n=6000)
NU
10
1000
0 0
3000
0.3
0.2
T2
T2
0.3 R2
R2
1
1000 2000 Sample Time
10
5
1000 2000 Sample Time (e) ICA-SVDD (n=1500)
0.5
3000
Q
(d) ICA-SVDD (n=300)
10
Q
Q
Q
0 0
3000
2000
15
10 5
1000 2000 Sample Time
1000
20
15
10
0 0
0 0
3000
20
20
0 0
2000
RI P
30
50
1000
T 2ng
0 0
50
T
T2
150
50
T 2g
(c) PCA (n=30000)
150
10
0
10
-2
0
1000 2000 Sample Time
3000
10
0
-2
0
Figure 4. Monitoring chart for Fault 3. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.
39
ACCEPTED MANUSCRIPT
(a) PCA (n=300)
(b) PCA (n=3000)
20
20
20
T2
T2
30
2000
3000
30
20
20
0 0
3000
2000
0 0
4
10 1000
2000
0 0
3000
40
30
20
10 1,000 2,000 Sample Time
0 0
3,000
(g) LOCAL-ICA (n=300) T 2ng 1000
2000
3000
0 0
10
T 2e
4
0
1000 2000 Sample Time
10 10
3000
0 0
T2
20
AC CE
10
1000
2000
0 0
3000
1000 2000 Sample Time
3000
(i) LOCAL-ICA (n=30000) 40
1000
2000
0 0
2000
2000
3000
1000
2000
3000
1000 2000 Sample Time
3000
20 10 0 0
3000 10
-1
1000 2000 Sample Time
1000
30 T 2g
1000
20
3000
10 10
3000
80
5
2
-1
0
10
4
(l) PS-PCA (r=100, n=30000)
60 40
10
2
20 0 0
3000 10
20
10
1000
2000
10
2
10
10
3000
0
0
1000
2000
3000
1000 2000 Sample Time
3000
4
2
Q
Q
10
10
3000
4
Q
30
1000 2000 Sample Time
2000
10
(k) PS-PCA (r=10, n=3000)
T2
30
1000
20
3000
2
(j) PS-PCA (r=1, n=300)
3000
30
5
0
2000
10
T 2e
3000
PT
T 2g
T 2g
2000
8
0 0
1000 2000 Sample Time
25
ED
1000
1000
20
3000
50
25
0 0
2000
25
0 0
50
0
1000
T2
T 2ng
50
0 0
0 0
50
100
0 0
0.1
3000
(h) LOCAL-ICA (n=3000)
150
3000
30
MA
0 0
Q
Q
20
10
2000
20
-2
0
1000 2000 Sample Time
0.2
T2
0
10
1000
30
2
10
R2
0.1
3000
0 0
(f) ICA-SVDD (n=6000)
NU
10
1000
3000
0.3
0.2
T2
T2
10
1000 2000 Sample Time
0.3 R2
R2
0 0
3000
10
(e) ICA-SVDD (n=1500)
0.5
2000
20
SC
(d) ICA-SVDD (n=300)
0.25
1000
30
10
1000 2000 Sample Time
0 0
3,000
T 2ng
0 0
10
2,000
Q
Q 10
10
1,000
Q
30
10
RI P
1000
0 0
Q
0 0
10
T
T2
30
10
T 2e
(c) PCA (n=30000)
30
10
0
10
-2
0
1000 2000 Sample Time
3000
10
0
-2
0
Figure 5. Monitoring chart for Fault 9. Subfigures (a), (b), and (c) show the monitoring chart of PCA with 300, 3,000, and 30,000 training data, respectively; subfigures (d), (e), and (f) show the monitoring chart of ICA-SVDD with 300, 1,500, and 6,000 training data, respectively; subfigures (g), (h), and (i) show the monitoring chart of LOCAL-ICA with 300, 3,000, and 30,000 training data, respectively; subfigures (j), (k), and (l) show the monitoring chart of PS-PCA with 300, 3,000, and 30,000 training data, respectively.
40
ACCEPTED MANUSCRIPT
(b)
(a) 100
100 T2 Q
80
70
70
40
60 50 40
30
30
20
20
10
10 0 0
10
20
30
40
50
60
70
80
90
100
RI P
50
0
r
SC
60
T
80
0
T2 Q
90
Detection Rate (% )
Detection Rate (%)
90
10
20
30
40
50
60
70
80
90
100
r
NU
Figure 6. Fault detection rates for Fault 1 in two situations. (a) the number of sums nnew is fixed as 300; (b) the total number of training data n is fixed as
AC CE
PT
ED
MA
30,000.
41
ACCEPTED MANUSCRIPT
(a)
(b)
100
100 T2 Q
80
70
70
40
60 50 40
30
30
20
20
10
10 0
10
20
30
40
50
60
70
80
90
100
RI P
50
0
0
r
SC
60
T
80
0
T2 Q
90
Detection Rate (%)
Detection Rate (%)
90
10
20
30
40
50
60
70
80
90
100
r
NU
Figure 7. Fault detection rates for Fault 4 in two situations. (a) the number of sums nnew is fixed as 300; (b) the total number of training data n is fixed as
AC CE
PT
ED
MA
30,000.
42
ACCEPTED MANUSCRIPT
(a)
(b)
30
35 T2 Q
T2 Q
30
25
10
T
15
20
RI P
Detection Rate (% )
Detection Rate (% )
25 20
15
10 5
0
10
20
30
40
50
60
70
80
90
100
0
0
r
SC
0
5
10
20
30
40
50
60
70
80
90
100
r
NU
Figure 8. Fault detection rates for Fault 6 in two situations. (a) the number of sums nnew is fixed as 300; (b) the total number of training data n is fixed as
AC CE
PT
ED
MA
30,000.
43
ACCEPTED MANUSCRIPT
(a) PS-PCA (r=1, n=300)
(b) PS-PCA (r=5, n=1500)
T2 20
2000
10
0 0
3000
20
30
15
20
10
Q
40
10
1000
5
20
T2
1000
10
Q
Q
15 10
5
5 1000 2000 Sample Time
3000
2000
0 0
1000 2000 Sample Time
0 0
3000
1000
2000
3000
1000 2000 Sample Time
3000
20 15 10 5 3000
0 0
MA
0 0
10
5
0 0
3000
(f) PS-PCA (r=200, n=60000)
15
10
NU
2000
3000
20
5 1000
0 0
SC
T2
T2
15
10
3000
(e) PS-PCA (r=100, n=30000)
15
1000 2000 Sample Time
10
5
1000 2000 Sample Time
(d) PS-PCA (r=50, n=15000) 20
3000
15
0 0
3000
2000
1000
20
Q
1000 2000 Sample Time
0 0
3000
5
20
0 0
2000
Q
1000
20
RI P
10
Q
30
40
20
0 0
40
T2
T2
30
0 0
(c) PS-PCA (r=10, n=3000)
60
T
40
Figure 9. Monitoring chart for Fault 2.
ED
(a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000 training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f)
AC CE
PT
PS-PCA with 60,000 training data.
44
ACCEPTED MANUSCRIPT
(b) PS-PCA (r=5, n=1500)
20 10
20
20
10
10 1000
2000
0 0
3000
1000
0 0
3000
30
15
15
20
10
1000 2000 Sample Time
1000 2000 Sample Time
(d) PS-PCA (r=50, n=15000)
T2
10
1000
2000
8
NU
Q
4 2
1000 2000 Sample Time
3000
0 0
3000
1000
2000
3000
1000 2000 Sample Time
3000
15
6
0 0
10 5
0 0
3000
5
SC
T2
T2
2000
3000
15
10
1000
0 0
20
20
10
1000 2000 Sample Time
10
(f) PS-PCA (r=200, n=60000)
30
20
0 0
3000
(e) PS-PCA (r=100, n=30000)
30
3000
5
0 0
3000
2000
0 0
1000 2000 Sample Time
10 Q
0 0
5
1000
RI P
20
Q
20
10
Q
2000
40
Q
Q
0 0
(c) PS-PCA (r=10, n=3000) 30
T
30 T2
40
30 T2
T2
(a) PS-PCA (r=1, n=300) 40
5
3000
0 0
MA
Figure 10. Monitoring chart for Fault 4. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000
ED
training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f)
AC CE
PT
PS-PCA with 60,000 training data.
45
ACCEPTED MANUSCRIPT
(a) PS-PCA (r=1, n=300)
(b) PS-PCA (r=5, n=1500)
T2
T2 1000
2000
1000
0 0
3000
30
30
15 Q
20
20 10
1000 2000 Sample Time
1000 2000 Sample Time
(d) PS-PCA (r=50, n=15000)
2000
20
T2
1000
NU
Q
5
5 1000 2000 Sample Time
3000
10
0 0
3000
1000
2000
3000
1000 2000 Sample Time
3000
20
15
0 0
2000
10
10
3000
5
0 0
3000
SC
T2 1000
0 0
15
10
5
1000 2000 Sample Time
10
20
20
10
3000
(f) PS-PCA (r=200, n=60000)
30
15
0 0
3000
(e) PS-PCA (r=100, n=30000)
20
2000
5
0 0
3000
1000
RI P
20
Q
40
0 0
T2
2000
40
10
Q
20 10
0 0
3000
0 0
1000 2000 Sample Time
15
Q
T2
30
20
10
Q
40
40
20
0 0
(c) PS-PCA (r=10, n=3000)
60
30
T
40
10 5
3000
0 0
MA
Figure 11. Monitoring chart for Fault 6. (a) PS-PCA with 300 training data; (b) PS-PCA with 1,500 training data; (c) PS-PCA with 3,000
ED
training data; (d) PS-PCA with 15,000 training data; (e) PS-PCA with 30,000 training data; (f)
AC CE
PT
PS-PCA with 60,000 training data.
46
ACCEPTED MANUSCRIPT
(b) PS-PCA (r=5, n=1500)
(a) PS-PCA (r=1, n=300)
40
30
T2
20
1000
2000
Q
Q
100
50
10
0 0
3000
1000
2000
40
20
30
15
20 10
1000 2000 Sample Time
1000 2000 Sample Time
(d) PS-PCA (r=50, n=15000)
T2
2000
40
Q
Q 10
20 10
3000
1000
2000
3000
0 0
1000 2000 Sample Time
1000 2000 Sample Time
3000
30 20 10 3000
0 0
MA
1000 2000 Sample Time
0 0
3000
40
30
20
0 0
1000
NU
30
3000
10
0 0
3000
0 0
SC
T2
T2
2000
1000 2000 Sample Time
10
20
20 10
1000
3000
30
30
10
2000
(f) PS-PCA (r=200, n=60000)
40
20
0 0
3000
(e) PS-PCA (r=100, n=30000)
30
1000
5
0 0
3000
Q
0 0
0 0
3000
Q
0 0
20
RI P
50
T
100
(c) PS-PCA (r=10, n=3000) 40
T2
60
T2
150
Figure 12. Monitoring chart for Fault 3. (a) PS-PCA with 300 training data; (b) PS-PCA with 1500 training data; (c) PS-PCA with 3000
ED
training data; (d) PS-PCA with 15000 training data; (e) PS-PCA with 30000 training data; (f)
AC CE
PT
PS-PCA with 60000 training data.
47
ACCEPTED MANUSCRIPT
(a) PS-PCA (r=1, n=300)
(b) PS-PCA (r=5, n=1500)
50
30
20 10
2000
Q
50
10
0 0
3000
1000
2000
20
20
15
15
10
1000 2000 Sample Time
(d) PS-PCA (r=50, n=15000)
20
2000
2000
20
Q
Q 5
10 5
1000 2000 Sample Time
3000
500
1000 1500 2000 Sample Time
2500
3000
(f) PS-PCA (r=200, n=60000)
0 0
3000
1000
2000
3000
1000 2000 Sample Time
3000
15
15
10
0 0
1000
NU
15
0 0
10
0 0
3000
3000
20
10
1000
2500
T2
T2 10
2000
10
SC
20
1500
30
T2
30
0 0
3000
(e) PS-PCA (r=100, n=30000)
30
1000
5
0 0
3000
500
0 0
1000 2000 Sample Time
10 Q
1000 2000 Sample Time
0 0
3000
5 0 0
20
RI P
1000
100
Q
30
Q
0 0
40
T
T2
T2
100
(c) PS-PCA (r=10, n=3000)
40
T2
150
5
3000
0 0
MA
Figure 13. Monitoring chart for Fault 15. (a) PS-PCA with 300 training data; (b) PS-PCA with 1500 training data; (c) PS-PCA with 3000
ED
training data; (d) PS-PCA with 15000 training data; (e) PS-PCA with 30000 training data; (f)
AC CE
PT
PS-PCA with 60000 training data.
48
ACCEPTED MANUSCRIPT
Table 1. False alarm rates (%) of four methods. T
False alarm rates
0.9
Methods
2
PCA(n=1,500) 2
PCA(n=3,000) 2
T
Q
T
Q
T
Q
T
Q
T2
Q
1.4
1.1
0.9
1.1
0.9
1.0
0.8
1.0
0.8
1.0
0.8
ICA-SVDD(n=300)
ICA-SVDD(n=600)
ICA-SVDD(n=1,000)
R2
T2
Q
R2
T2
Q
R2
T2
Q
False alarm rates
0.2
0.4
1.9
0.1
0.9
1.1
0.1
0.5
0.8
LOCAL-ICA(n=300)
LOCAL-ICA(n=600)
LOCAL-ICA(n=1,500)
Indices
T2ng
T2g
T2e
T2ng
T2g
T2e
T2ng
T2g
T2e
False alarm rates
81.8
36.3
60.3
56.4
52.0
1.0
0.5
56.6
2.1
PS-PCA(300×1)
PS-PCA(300×2)
PS-PCA(300×5)
T2
Q
T2
Q
T2
False alarm rates
0.8
1.4
1.2
1.6
0.7
Q 1.4
ICA-SVDD(n=3,000)
ICA-SVDD(n=6,000)
T2ng
T2g
T2e
T2ng
T2g
T2e
T2ng
T2g
T2e
1.0
2.1
0.3
1.0
2.0
0.2
2.6
1.0
0.4
R2 0.1
T2
Q
0.8
5.6
LOCAL-ICA(n=3,000)
PS-PCA(300×10) T2
0.6
R2
T2
Q
0.1
0.3
1.9
LOCAL-ICA(n=15,000)
PS-PCA(300×50)
R2
T2
Q
0.2
0.5
0.7
LOCAL-ICA(n=30,000)
PS-PCA(300×100)
Q
T2
Q
T2
Q
1.1
0.7
1.4
0.1
1.2
AC CE
PT
ED
MA
Indices
ICA-SVDD(n=1,500)
NU
Methods
2
PCA(n=30,000)
Q
Indices
Methods
PCA(n=15,000)
T
Indices
PCA(n=600)
RI P
2
SC
Methods
PCA(n=300)
49
ACCEPTED MANUSCRIPT
Table 2. Detection rates (%) for nine faults. 2
Indices
T
Fault 1
10.8
PCA(n=600) 2
PCA(n=1,500) 2
PCA(n=3,000) 2
2
T
Q
T
Q
T
Q
T
Q
T2
Q
3.7
1.8
1.8
1.8
1.8
1.4
2.3
1.5
1.5
1.8
1.4
1.4
1.3
0.6
1.3
0.6
1.6
Fault 3
100.0
20.9
100.0
1.1
100.0
1.1
100.0
Fault 4
1.9
4.5
2.3
3.7
2.3
3.7
3.0
Fault 5
3.4
14.3
3.9
18.6
3.9
18.6
3.7
Fault 6
1.1
1.8
1.0
1.1
1.0
1.1
1.2
Fault 7
0.7
1.6
0.5
1.0
0.5
1.0
0.7
Fault 8
1.5
2.7
1.9
1.8
1.7
1.1
1.8
5.2
81.3
8.8
85.5
8.8
85.5
11.5
79.4
Indices
R2
T2
Q
R2
T2
Q
R2
T2
Fault 1
0.1
0.0
5.6
0.1
0.3
1.9
0.1
Fault 2
0.1
0.2
2.4
0.1
0.1
1.5
0.0
Fault 3
8.1
0.0
100.0
0.1
8.5
100.0
0.1
Fault 4
0.1
0.0
5.8
0.1
0.1
4.3
0.1
Fault 5
0.1
0.5
21.7
0.1
0.4
13.4
0.0
Fault 6
0.1
0.0
2.2
0.1
0.1
1.7
0.1
Fault 7
0.1
0.1
2.1
0.1
0.1
1.5
Fault 8
1.6
0.0
3.4
0.3
0.0
1.8
1.7
0.0
1.4
0.0
Fault 9
0.1
0.2
83.4
0.1
0.3
79.6
0.1
0.0
73.4
0.0
Indices
LOCAL-ICA(n=300)
LOCAL-ICA(n=600)
T2ng
T2ng
T2g
T2e
T2e
Fault 2 Fault 3 Fault 4 Fault 5 Fault 6 Fault 7 Fault 8
PS-PCA(300×1)
PS-PCA(300×2)
39.6
RI P 2.0
2.3
2.9
2.4
2.9
3.8
16.9
4.1
16.7
0.9
1.0
1.1
1.0
1.1
1.2
0.7
0.9
0.7
0.9
0.9
1.9
0.8
2.1
0.8
9.1
84.3
9.8
83.7
Q
R2
T2
Q
R2
T2
Q
R2
T2
Q
0.0
1.8
0.0
0.1
1.8
0.0
0.2
1.2
0.2
0.2
1.3
0.1
1.2
0.0
0.0
1.3
0.1
0.0
1.5
0.1
0.3
2.0
3.0
100.0
0.0
2.2
100.0
0.2
10.5
100.0
0.1
11.2
100.0
0.1
2.8
0.0
0.2
2.8
0.0
0.2
3.5
0.1
0.1
3.6
0.6
12.7
0.1
0.2
11.8
0.0
0.8
13.6
0.0
0.5
20.8
0.0
1.4
0.0
0.1
1.3
0.0
0.2
2.5
0.0
0.8
3.6
1.2
0.0
0.1
1.1
0.2
0.1
2.2
0.1
0.0
3.1
0.2
1.3
0.1
0.0
0.9
0.1
0.0
0.8
0.4
74.7
0.1
0.5
75.3
0.0
0.1
80.3
0.0
T2g
T2e
Fault 9 Methods
0.6
100.0
ICA-SVDD(n=6,000)
LOCAL-ICA(n=1,500) T2ng
1.7
22.2
1.6
ICA-SVDD(n=3,000)
AC CE
Fault 1
T2g
0.5
100.0
ICA-SVDD(n=1,500)
ED
PT
Methods
ICA-SVDD(n=1,000)
1.6
16.9
SC
ICA-SVDD(n=600)
MA
ICA-SVDD(n=300)
0.1
0.7 2.3
NU
Fault 9
PCA(n=30,000)
Q
Fault 2
Methods
PCA(n=15,000)
T
Methods
PCA(n=300)
PS-PCA(300×5)
LOCAL-ICA(n=3,000)
LOCAL-ICA(n=15,000)
LOCAL-ICA(n=30,000)
T2ng
T2g
T2e
T2ng
T2g
T2e
T2ng
T2g
T2e
2.2
40.7
0.8
6.0
2.1
10.3
15.8
0.5
2.8
0.1
2.8
0.0
1.2
1.7
0.5
2.6
0.0
2.6
99.1
99.1
23.7
98.1
99.6
5.3
99.6
0.4
28.3
4.4
12.2
15.7
2.9
7.1
24.0
19.9
0.0
48.5
1.8
44.3
100.0
2.5
6.1
100.0
10.6
9.7
100.0
0.6
7.9
0.0
0.5
0.2
1.0
10.6
2.2
1.9
0.0
10.2
33.8
0.0
0.6
88.1
3.3
0.6
90.7
3.6
16.3
10.6
0.4
20.9
3.7
14.7
6.4
3.0
30.1
63.4
99.9
46.1
58.5
99.9
92.3
2.7
99.9
PS-PCA(300×10)
PS-PCA(300×50)
PS-PCA(300×100)
Indices
T2
Q
T2
Q
T2
Q
T2
Q
T2
Q
T2
Q
Fault 1
10.8
3.7
3.7
2.6
5.6
5.1
11.6
17.9
21.7
96.6
95.7
97.4
Fault 2
1.5
1.4
1.9
2.1
3.2
2.5
8.2
4.0
82.5
1.9
98.8
9.1
Fault 3
100.0
20.9
99.8
100.0
99.9
100.0
99.9
99.9
99.7
99.9
99.5
99.9
Fault 4
1.9
4.6
4.4
9.4
8.1
35.5
20.5
73.6
99.0
98.6
98.7
98.5
Fault 5
3.4
14.3
6.4
85.2
18.3
99.9
51.5
99.9
98.6
99.7
98.1
99.5
Fault 6
1.1
1.8
0.9
2.9
0.8
4.7
1.3
7.4
9.9
12.4
28.6
25.6
Fault 7
0.7
1.6
1.2
3.5
0.5
5.2
1.3
13.9
9.9
95.4
18.8
98.1
Fault 8
7.7
10.5
21.3
7.1
27.7
8.1
59.8
9.8
98.7
87.9
99.5
93.0
Fault 9
5.2
81.3
25.9
99.8
45.5
99.9
92.1
99.8
99.6
99.6
99.1
99.6
50
ACCEPTED MANUSCRIPT
Table 3. Time costs (s) for four compared methods. PCA(1,500)
PCA(30,000) online monitoring
offline modeling
6.9E-1
8.1E-3
2.5E+0
offline modeling
online monitoring
offline modeling
online monitoring
5.3E+2
5.0E-1
8.7E+3
1.91E-0
RI P
ICA-SVDD(1,500)
8.0E-3
SC
LOCAL-ICA(30,000)
online monitoring
6.8E-1
online monitoring
ICA-SVDD(6,000)
LOCAL-ICA(1,500) offline modeling
T
offline modeling
6.2E-1
online monitoring
3.1E+0
7.5E-1
PS-PCA(300×100)
NU
PS-PCA(300×5)
offline modeling
Offline modeling by PCA
preliminary summation for online monitoring
Online monitoring by PCA
Preliminary summation for offline modeling
Offline modeling by PCA
preliminary summation for online monitoring
Online monitoring by PCA
8.8E-3
2.1E-1
6.0E-4
7.2E-3
9.8E-3
2.3E-1
6.0E-4
7.5E-3
AC CE
PT
ED
MA
Preliminary summation for offline modeling
51
ACCEPTED MANUSCRIPT
Table 4. False alarm rates (%) of four methods.
T2
Q
False alarm rates Methods
6.6
3.6
KICA(n=300)
KPCA(n=600) T2
Q
6.2
1.9
KICA(n=600)
KPCA(n=1,500) T2
Q
6.5
2.8
KICA(n=1,500)
KPCA(n=3,000) T2
Q
5.8
2.2
KICA(n=3,000)
Indices
I2
Q
I2
Q
I2
Q
I2
False alarm rates
4.9
6.7
3.8
4.7
3.4
4.8
4.2
KICA-PCA(n=300)
KICA-PCA(n=600)
KICA-PCA(n=1,500)
Indices
I2
T2
Q
I2
T2
Q
I2
T2
Q
False alarm rates
6.1
7.2
5.9
3.8
6.2
7.2
1.2
3.6
4.3
PS-PCA(300×1)
PS-PCA(300×5)
PS-PCA(300×10)
T2
Q
T2
Q
T2
False alarm rates
4.5
5.9
6.0
6.5
6.1
6.5
2.3
KICA(n=6,000)
KPCA(n=15,000) T2
Q
6.6
2.1
KICA(n=15,000)
Q
I2
Q
3.6
5.0
4.8
4.0
4.5
I2
T2
Q
4.6
5.2
1.7
PS-PCA(300×50)
KICA-PCA(n=6,000)
KICA-PCA(n=15,000)
I2
T2
Q
I2
T2
Q
5.1
4.5
2.5
5.2
2.4
2.0
PS-PCA(300×100)
PS-PCA(300×200)
Q
T2
Q
T2
Q
T2
Q
5.3
4.7
3.2
4.8
4.9
2.2
2.0
AC CE
PT
ED
MA
Indices
Q
I2
KICA-PCA(n=3,000)
NU
Methods
T2
Q
SC
Methods
KPCA(n=6,000)
T
KPCA(n=300)
Indices
RI P
Methods
52
ACCEPTED MANUSCRIPT
Table 5. Detection rates (%) for seven types of faults. T
KPCA(n=600) 2
KPCA(n=1,500) 2
Q
T
Q
T
2.5
KPCA(n=3,000) 2
T
Q
T
Q
T2
Q
1.8
5.9
2.6
7.1
1.0
1.3
6.2
3.5
7.8
1.1
1.1
7.1
0.7
6.8
3.2
6.9
6.6
1.5
6.7
Fault 2
7.7
2.7
7.3
1.5
7.8
1.2
7.2
Fault 3
6.3
3.0
6.6
2.7
6.8
0.9
6.2
5.7
1.9
6.0
2.1
5.6
1.0
4.8
Fault 5
20.8
39.7
25.3
37.6
20.6
29.4
19.8
Fault 6
4.4
3.2
4.2
1.5
4.8
1.2
4.1
Fault 7
6.2
2.4
6.3
1.6
6.6
1.5
KICA(n=600)
KICA(n=1,500)
I2
Q
I2
Q 6.5
2.6
I2
5.8
4.3
3.8
Methods
6.5
9.4
5.9
4.2
3.8
Fault 3
6.1
9.0
5.4
5.2
7.2
4.2
4.8
2.1
4.2
24.8
44.1
44.6
Fault 1
4.9
Fault 2
5.4
Fault 3
4.8
3.5
2.6
3.6
1.2
4.5
3.6
KICA-PCA(n=3,000)
KICA-PCA(n=6,000)
KICA-PCA(n=15,000)
Q
I2
T2
Q
I2
T2
Q
I2
T2
Q
I2
T2
Q
4.8
2.6
3.2
3.1
3.2
3.5
2.8
3.7
2.2
0.9
2.6
1.6
1.1
2.5
3.2
4.8
3.3
4.4
2.5
1.6
3.8
3.2
1.2
3.4
4.3
0.8
3.5
3.6
4.4
4.5
3.5
2.3
3.5
2.9
3.3
2.6
2.7
0.8
5.2
4.6
3.2
4.6
1.7
4.8
1.5
2.2
5.6
5.1
1.1
27.4
30.9
50.2
37.6
33.9
44.3
45.6
33.9
46.2
43.3
51.7
PT
AC CE
T2
3.2
11.1
4.1
6.2
5.6
7.8
2.3
2.4
1.9
4.6
7.2
1.5
3.8
4.2
1.6
4.2
1.2
0.8
10.5
5.0
5.2
6.8
4.5
4.6
5.5
4.4
4.4
5.2
1.6
2.5
1.6
0.9
4.2
5.5
0.5
PS-PCA(300×1)
Indices
2.6
KICA-PCA(n=1,500)
ED
Fault 2
5.8
3.0
3.5
4.5
Fault 7
6.7
3.2
3.6
3.8
Methods
3.4
4.2
6.1
5.3
4.2
4.2
8.7
Fault 6
3.0
4.0
5.2
55.2
4.8
4.4
Fault 1
5.8
5.2
3.8
3.5
T2
41.3
2.6
6.8
4.8
I2
8.9
4.6
4.4
Q
54.4
3.8
4.6
T2
8.5
5.4
5.2
3.9
I2
53.1
Q
3.2
3.8
Indices
Fault 5
I2
5.2
KICA-PCA(n=600)
5.2
Q
4.6
KICA-PCA(n=300)
Fault 4
I2
MA
5.8
Q
Q
7.2
3.6
Fault 7
KICA(n=15,000)
2.3
3.5
4.8
I2
KICA(n=6,000)
2.4
5.6
3.6
1.2
KICA(n=3,000)
59.8
5.2
8.9
6.8
5.6
6.5
5.3
0.9
1.8
46.2
6.1
Fault 6
4.2
5.9
1.8
55.8
Fault 3
52.3
1.6
44.2
3.2
49.6
3.9
50.6
6.0
46
1.8
50.2
5.8
5.1
0.8 25.8
2.7
7.1
8.5
6.2 22.4
58.7
6.5
53.1
1.2 27.6
4.2
7.3
Fault 2
Fault 5
5.2
22.6
44.6
5.2
4.8
6.2
2.5
25.4
5.8
Fault 1
Fault 4
6.2
5.9
NU
Indices
KICA(n=300)
3.2
SC
Fault 4
2
KPCA(n=15,000)
Q
Fault 1
Methods
KPCA(n=6,000)
T
Indices
2
RI P
Methods
KPCA(n=300)
PS-PCA(300×5)
PS-PCA(300×10)
PS-PCA(300×50)
PS-PCA(300×100)
PS-PCA(300×200)
Q
T2
Q
T2
Q
T2
Q
T2
Q
T2
Q
6.5
6.6
7.4
5.5
8.2
3.6
13.6
5.2
30.1
7.2
91.5
6.9
8.1
8.1
7.8
8.1
9.3
25.7
19.8
78.3
38.5
92.5
6.1
6.2
7.3
5.5
6.8
6.5
12.2
37.6
21.8
20.3
70.2
Fault 4
3.8
5.7
4.9
5.3
3.8
3.5
1.6
8.2
8.7
15.4
15.6
80.9
Fault 5
17.6
49.8
53.6
94.9
73.5
99.6
99.6
99.9
99.0
99.8
99.7
99.8
Fault 6
2.7
10.3
5.7
27.6
8.2
52.1
43.5
94.7
85.7
97.1
95.7
96.5
Fault 7
4.1
5.9
5.5
6.6
4.8
5.7
3.6
3.0
7.7
0.8
10.6
0.4
53
ACCEPTED MANUSCRIPT
Table 6. Time costs (s) for four comparison nonlinear methods. KPCA(1,500)
KPCA(15,000) online monitoring
offline modeling
4.3E+2
2.3E+2
5.8E+4
offline modeling
online monitoring
offline modeling
online monitoring
2.3E+2
2.2E+2
3.1E+4
3.0E+3
RI P
KICA(1,500)
online monitoring 3.6E+3
KICA(15,000)
KICA-PCA(15,000)
SC
KICA-PCA(1,500) offline modeling
T
offline modeling
online monitoring
2.3E+2
2.2E+2
online monitoring
3.0E+4
2.9E+3
PS-PCA(300×100)
NU
PS-PCA(300×5)
offline modeling
Offline modeling by PCA
preliminary summation for online monitoring
Online monitoring by PCA
Preliminary summation for offline modeling
Offline modeling by PCA
preliminary summation for online monitoring
Online monitoring by PCA
8.0E-4
4.2E-1
2.1E-3
3.1E-2
1.4E-2
4.2E-1
2.3E-3
3.5E-2
AC CE
PT
ED
MA
Preliminary summation for offline modeling
54
ACCEPTED MANUSCRIPT
Table 7. Fault descriptions for TE process. Type
A/C feed ratio, B composition constant (stream 4) B composition, A/C ratio constant (stream 4)
Step Step
3 4
D feed temperature (stream 2) Reactor cooling water inlet temperature
5 6
Condenser cooling water inlet temperature A feed loss (stream 1)
7 8
C header pressure loss—reduced availability (stream 4) A, B, C feed composition (stream 4)
Step Random variation
9 10
D feed temperature (stream 2) C feed temperature (stream 4)
Random variation Random variation
11 12
Reactor cooling water inlet temperature Condenser cooling water inlet temperature
Random variation Random variation
13 14
Reaction kinetics Reactor cooling water valve
Slow drift Sticking
RI P
Step Step
NU
SC
Step Step
Condenser cooling water valve Unknown
Sticking Unknown
AC CE
PT
ED
15 16-20
T
Description
1 2
MA
No.
55
ACCEPTED MANUSCRIPT
Table 8. Monitored variables in TE process.
1 A feed (stream 1)
18 Stripper temperature 19 Stripper steam flow
3 E feed (stream 3)
20 Compressor work
RI P
2 D feed (stream 2)
T
Variables
21 Reactor cooling water outlet temperature
5 Recycle flow (stream 8)
22 Separator cooling water outlet temperature
6 Reactor feed rate (stream 6)
23 D feed flow valve (stream 2)
7 Reactor pressure
24 E feed flow valve (stream 3)
8 Reactor level
25 A feed flow valve (stream 1)
9 Reactor temperature
26 Total feed flow valve (stream 4)
NU
SC
4 Total feed (stream 4)
10 Purge rate (stream 9)
27 Compressor recycle valve
11 Product separator temperature
29 Separator pot liquid flow valve (stream 10)
MA
12 Product separator level
28 Purge valve (stream 9) 30 Stripper liquid product flow valve (stream 11) 31 Stripper steam valve 32 Reactor cooling water flow
16 Stripper pressure
33 Condenser cooling water flow
ED
13 Product separator pressure 14 Product separator under flow (stream 10) 15 Stripper level
AC CE
PT
17 Stripper underflow (stream 11)
56
ACCEPTED MANUSCRIPT
Table 9. False alarm rates (%) of seven compared methods. PCA(n=60,000)
T2
Q
T2
Q
T2
Q
T2
Q
False alarm rates
34.5
61.1
3.9
4.1
1.7
0.7
1.7
0.9
ICA-SVDD(n=900)
R2
T2
Q
False alarm rates
1.7
0.0
50.4
Methods
LOCAL-ICA(n=300)
Indices
T2ng
False alarm rates
99.9
R2
T2
Q
0.1
0.0
12.8
LOCAL-ICA(n=1,500)
T2g
T2e
T2ng
99.2
60.4
4.9
KPCA(n=300)
Methods
T2
Q
T2
False alarm rates
0.6
40.7
0.2
KICA(n=300) Q
I2
24.8
41.0
T2
Q
0.1
0.0
3.9
0.1
Indices
I2
T2
Q
I2
False alarm rates
24.8
29.6
23.6
0.9
PS-PCA(300×1) T
Q
False alarm rates
34.5
61.1
0.0
4.3
LOCAL-ICA(n=15,000)
LOCAL-ICA(n=60,000)
T2ng
T2g
T2e
T2ng
T2g
T2e
14.1
4.5
2.8
1.0
10.1
6.2
Q
T2
Q
T2
Q
1.6
0.4
2.2
1.1
0.8
KICA(n=1,500)
KICA(n=4,500)
I2
Q
I2
Q
0.9
1.6
1.3
1.1
0.4
KICA-PCA(n=1,500)
KICA-PCA(n=4,500)
T2
Q
I2
T2
Q
I2
T2
Q
0.9
1.4
1.6
0.8
1.0
1.1
2.6
0.7
PS-PCA(300×50) 2
PS-PCA(300×200)
T
Q
T
Q
T2
Q
0.6
1.8
0.3
1.4
0.0
0.6
AC CE
PT
Indices
Q
KPCA(n=4,500)
PS-PCA(300×5) 2
T2
KPCA(n=1,500)
KICA-PCA(n=900)
ED
KICA-PCA(n=300)
2
ICA-SVDD(n=4,500) R2
Q
0.9
Methods
Methods
18.5
MA
False alarm rates
99.8
ICA-SVDD(n=1,500) R2
KICA(n=900)
I2
Indices
T2e
KPCA(n=900)
Indices
Methods
T2g
RI P
ICA-SVDD(n=300)
Indices
NU
Methods
PCA(n=1,500)
SC
PCA(n=300)
T
PCA(n=15,000)
Indices
Methods
57
ACCEPTED MANUSCRIPT
Table 10. Detection rates (%) of PS-PCA and three linear methods in TE process. PCA(n=1500)
PCA(n=60000)
ICA-SVDD(n=900)
ICA-SVDD(n=4500)
LOCAL-ICA(n=60000)
T2
Q
T2
Q
R2
T2
Q
R2
T2
Q
T2ng
T2g
Fault 1
100.0
100.0
99.8
100.0
99.1
99.0
99.8
99.5
96.4
99.7
97.7
97.5
Fault 2
98.4
99.2
98.1
98.8
98.1
98.1
99.2
98.7
19.4
99.2
94.1
95.0
Fault 3
9.9
11.0
3.5
3.2
0.1
0.1
12.8
0.2
0.0
4.1
7.3
19.0
Fault 4
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
0.2
100.0
99.0
Fault 5
5.2
7.0
3.1
0.9
0.4
0.0
8.3
06
0.0
1.0
0.2
Fault 6
100.0
100.0
100.0
100.0
94.6
89.6
94.6
94.6
94.2
Fault 7
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
Fault 8
96.9
97.4
97.1
97.3
97.1
95.6
97.5
96.1
10.4
17.9
4.4
5.9
2.1
0.2
20.2
0.1
79.8
88.1
71.8
86.0
89.2
41.2
88.5
86.7
Fault 11
97.4
98.1
95.8
97.3
96.3
94.3
98.4
95.7
Fault 12
54.9
37.6
42.6
25.2
34.8
0.0
54.5
Fault 13
95.3
94.8
94.6
94.8
94.8
90.2
94.4
95.1
Fault 14
99.8
99.9
99.4
99.9
99.6
26.4
99.9
Fault 15
6.5
6.8
2.7
1.3
1.9
0.0
15.2
0.0
13.5
2.3
0.8
1.6
92.3
90.6
90.4
92.0
Fault 18
65.1
71.7
58.2
69.5
53.6
Fault 19
98.4
99.2
98.2
98.6
97.6
Fault 20
86.9
87.6
86.8
87.3
86.5
Q
T2
Q
99.5
99.5
99.7
99.7
99.1
98.8
98.7
99.3
97.2
97.5
85.6
16.2
56.3
94.8
97.4
99.4
99.9
100.0
100.0
99.1
100.0
22.1
15.5
11.0
24.0
96.2
34.9
SC 90.0
92.5
94.5
94.3
94.5
96.9
99.1
99.9
100.0
99.0
99.4
99.9
100.0
100.0
99.6
100.0
38.7
97.4
95.1
95.1
97.0
97.7
97.3
96.8
97.8 77.1
0.0
6.1
1.9
23.8
71.3
18.5
58.4
59.7
0.0
86.5
50.4
77.3
90.0
81.4
88.3
83.4
91.1
93.6
98.1
95.1
97.5
96.0
97.7
98.4
98.8
98.7
45.4
3.7
21.4
93.0
66.6
66.8
35.1
65.1
94.3
95.0
92.0
93.1
94.4
94.8
95.0
94.8
95.3
97.5
99.1
99.9
98.8
99.0
99.7
99.6
99.8
90.0
99.8
2.1
0.0
8.4
1.5
11.2
15.2
14.3
5.5
15.5
4.4
26.5
17.5
MA
5.9
92.0
2.0
0.0
6.9
2.5
9.9
9.7
12.6
4.5
15.7
3.8
65.3
91.8
83.1
91.2
92.4
87.6
90.6
92.0
92.0
92.3
91.8
91.8
28.9
73.0
45.6
51.1
72.3
59.0
69.4
77.0
76.7
78.1
77.7
83.3
38.8
99.2
97.2
93.6
99.1
96.8
96.4
98.3
98.3
98.3
97.0
97.2
50.4
88.5
86.2
80.6
87.8
79.8
85.0
87.7
87.0
87.6
86.1
89.3
ED
5.4
Fault 17
AC CE
PT
Fault 16
PS-PCA(300×200)
T2
94.6
NU
Fault 9 Fault 10
PS-PCA(300×5)
T2e
RI P
Indices
T
Methods
58
ACCEPTED MANUSCRIPT
Table 11. Detection rates (%) of PS-PCA and three nonlinear methods in TE process. Methods
KPCA(n=900)
KPCA(n=4500)
KICA(n=900)
KICA(n=4500)
KICA-PCA(n=900)
PS-PCA(300×5)
T2
Q
T2
Q
T2
Q
T2
Fault 1
99.1
99.4
99.3
99.2
98.8
99.4
99.5
99.7
99.5
99.7
99.5
99.4
99.5
Fault 2
97.2
97.4
97.4
97.0
96.5
97.2
98.7
99.3
98.7
99.3
97.8
96.5
97.1
97.5
98.7
Fault 3
0.8
8.8
0.2
1.9
0.2
3.6
16.2
56.3
16.2
56.3
24.9
0.9
4.2
1.1
16.2
56.3
Fault 4
99.9
99.9
99.9
99.9
99.9
99.9
100.0
100.0
100.0
100.0
99.9
99.9
99.8
99.9
100.0
Fault 5
0.3
4.8
0.3
1.0
0.4
2.3
11.0
24.0
11.0
24.0
8.1
0.7
3.1
1.3
11.0
Fault 6
99.9
99.9
99.9
99.9
99.9
99.9
94.3
94.5
94.3
Fault 7
99.9
99.9
99.9
99.9
99.9
99.9
100.0
100.0
100.0
Fault 8
93.8
94.2
94.4
94.2
93.3
94.2
97.7
97.3
97.7
Fault 9
2.6
19.2
2.1
9.1
2.4
8.4
18.5
58.4
18.5
58.4
Fault 10
60.8
75.9
60.0
72.1
73.3
74.2
81.4
88.3
81.4
88.3
Fault 11
97.1
97.0
95.2
95.5
95.1
97.6
97.7
98.4
97.7
98.4
94.2
94.6
96.6
95.5
97.7
Fault 12
34.9
56.4
32.4
42.7
42.7
45.4
66.6
66.8
66.6
66.8
48.5
37.0
47.1
43.2
66.6
Fault 13
90.1
90.2
89.8
90.0
89.4
90.7
Fault 14
99.7
99.7
99.7
99.7
99.4
99.7
Fault 15
0.0
5.1
0.0
0.8
0.1
1.6
0.0
0.0
0.7
0.2
84.0
84.5
84.3
84.2
84.1
Fault 18
45.6
50.4
47.6
47.4
42.7
Fault 19
96.6
97.6
97.0
97.0
96.7
Fault 20
73.6
75.4
73.8
75.0
1.3 85.0
98.6
T2
99.4
T2
PS-PCA(300×200) T2
Q
99.7
99.7
99.1
99.3
97.2
97.5
94.8
97.4
100.0
99.1
100.0
24.0
96.2
34.9
Q
99.9
99.9
99.9
99.9
94.3
94.5
96.9
99.1
99.9
99.9
99.9
99.9
100.0
100.0
99.6
100.0
97.3
95.1
93.2
94.3
94.5
97.7
97.3
96.8
97.8
23.0
4.3
7.8
8.4
18.5
58.4
59.7
77.1
79.6
51.4
66.0
76.5
81.4
88.3
83.4
91.1
98.4
98.8
98.7
66.8
35.1
65.1
MA
94.5
100.0
NU
SC
Q
T
I2
RI P
Q
94.8
95.0
94.8
95.0
90.2
89.8
90.4
89.5
94.8
95.0
94.8
95.3
99.6
99.8
99.6
99.8
99.9
98.5
99.7
99.6
99.6
99.8
90.0
99.8
14.3
5.5
14.3
5.5
6.7
0.4
2.5
0.8
14.3
5.5
15.5
4.4
12.6
ED
3.9
Fault 17
Q
92.0
4.5
12.6
4.5
5.9
0.6
1.2
1.5
12.6
4.5
15.7
3.8
92.3
92.0
92.3
85.1
83.8
84.6
84.4
92.0
92.3
91.8
91.8 83.3
49.6
76.7
78.1
76.7
78.1
53.6
32.9
49.3
49.3
76.7
78.1
77.7
97.3
98.3
98.3
98.3
98.3
97.9
94.6
97.0
97.7
98.3
98.3
97.0
97.2
PT
Fault 16
Q
T2
KICA-PCA(n=4500)
Indices
87.0
87.6
87.0
87.6
76.6
73.1
74.0
75.5
87.0
87.6
86.1
89.3
75.8
AC CE
74.1
59