Control Engineering Practice 62 (2017) 11–21
Contents lists available at ScienceDirect
Control Engineering Practice journal homepage: www.elsevier.com/locate/conengprac
Incipient fault detection with smoothing techniques in statistical process monitoring
Hongquan Ji a, Xiao He a, Jun Shang a, Donghua Zhou b,a,⁎
a Tsinghua National Laboratory for Information Science and Technology (TNList) and Department of Automation, Tsinghua University, Beijing 100084, PR China
b College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, PR China
ARTICLE INFO
Keywords: Incipient fault detection; Fault detectability; Quadratic form; Smoothing technique; Multivariate statistical process monitoring
ABSTRACT
In modern industry, timely detection of incipient faults is of vital importance to prevent serious deterioration of system performance and to ensure optimal process operation. Recently, multivariate statistical process monitoring (MSPM) techniques have been extensively studied and widely applied to modern industrial systems. However, conventional fault detection indices used in statistical process monitoring are not sensitive to incipient faults of small magnitude. In this paper, by introducing two representative smoothing techniques, novel incipient fault detection strategies based on a generic fault detection index in MSPM are proposed. Fault detectability of each proposed strategy is analyzed. In addition, the effects of the smoothing parameters on fault detection, including advantages and disadvantages, are investigated. Finally, case studies on a numerical example and two practical industrial processes are carried out to demonstrate the effectiveness of the proposed incipient fault detection strategies.
1. Introduction
Data-driven fault detection and diagnosis (FDD) techniques, especially multivariate statistical process monitoring (MSPM) methods, have received increasing research attention in the past decades (Beghi et al., 2016; Chiang, Russell, & Braatz, 2001; Ge, Song, & Gao, 2013; Venkatasubramanian, Rengaswamy, Kavuri, & Yin, 2003). In modern industries, large amounts of process and quality data are available, which provide a necessary basis for data-based analysis, modeling, and monitoring. In comparison with model- or knowledge-based methods, data-driven methods do not require first-principle models (Zhao, Lin, & Wang, 2015) or expert experience (Huan, Wu, & Ning, 2001), which are usually costly or time-consuming to obtain in practice, especially for large-scale industrial systems.
Commonly used MSPM methods include principal component analysis (PCA) (Portnoy, Melendez, Pinzon, & Sanjuan, 2016), partial least squares (PLS) (MacGregor & Kourti, 1995), Fisher discriminant analysis (FDA) (Chiang, Kotanchek, & Kordon, 2004), canonical correlation analysis (CCA) (Chen, Ding, Zhang, Li, & Hu, 2016), etc. Usually two steps are needed when using these methods for fault detection. In the offline design procedure, a statistical model is established from training data as the reference; in the online monitoring procedure, real-time measurements are collected and used to calculate multivariate statistics, which are compared against the reference model to judge whether the system is normal or not.
Since the pioneering works of Kresta, MacGregor, and Marlin (1991) and Wise, Veltkamp, Davis, Ricker, and Kowalski (1988), a great number of novel and modified techniques have been proposed in the MSPM area to enhance the monitoring performance. However, existing methods in the literature mainly focus on serious faults, and incipient faults are rarely considered. In practice, many kinds of serious faults evolve gradually from incipient faults of small magnitude. Timely and accurate detection of an incipient fault can help to schedule preventive maintenance, thus preventing more serious faults from occurring and ensuring that monitored systems stay in the optimal operating status. If the incipient fault is further isolated, efficient component maintenance or replacement can then be accomplished, and costly maintenance activities can be reduced to a large extent. Therefore, incipient fault detection is of great importance. Nevertheless, due to the small magnitude, normal samples and faulty samples containing an incipient fault often overlap greatly. The conventional MSPM methods mentioned above are not sensitive to incipient faults, resulting in many missed detections (Shang et al., in press).
To cope with the incipient fault detection problem, some improvements have been presented. Wachs and Lewin (1999) proposed to
⁎ Corresponding author at: College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, PR China. E-mail address: [email protected] (D. Zhou).
http://dx.doi.org/10.1016/j.conengprac.2017.03.001 Received 21 December 2016; Received in revised form 27 February 2017; Accepted 4 March 2017 0967-0661/ © 2017 Elsevier Ltd. All rights reserved.
recursively sum the last s scores obtained from the PCA model and construct descriptive statistics for process monitoring, so as to improve fault detectability for relatively small shifts as well as for dynamic applications with sequential dependencies among samples. Later, Chen, Liao, Lin, and Lu (2001) proposed to combine PCA with two techniques having a memory effect, i.e., multivariate exponentially weighted moving average (MEWMA) and multivariate s-term sum (MSSUM). Control limits of the two statistics $T^2$ and $Q$ are re-established, since in that work PCA is built on samples processed by MEWMA and MSSUM. Similarly, moving average PCA (MA-PCA) and EWMA-PCA were recently proposed in a master's thesis (Zheng, 2015). The effect of MA and EWMA on the $T^2$ and $Q$ statistics as well as on their control limits is analyzed theoretically. Also, critical fault magnitudes using MA-PCA and EWMA-PCA are derived. To improve the detection efficiency within the principal component subspace of PCA, the Kullback-Leibler divergence (KLD) was employed by Harmouche, Delpha, and Diallo (2014, 2015) as a discrimination measure, which is proved to be more sensitive to incipient faults than the traditional $T^2$ statistic.
Though the aforementioned works enhance the incipient fault detection performance to some extent, they still suffer from several limitations. First, the contributions are restricted to one specific MSPM method, i.e., PCA, and only the $T^2$ or $Q$ index is involved, while other commonly used MSPM methods and fault detection indices are not considered. Second, existing methods (Chen et al., 2001; Zheng, 2015) that combine MA or EWMA with PCA build a new PCA model on the processed data. Thus, the $T^2$ and $Q$ indices share the same smoothing parameter, such as the window width in MA and the weighting factor in EWMA. This somewhat constrains the corresponding methods because $T^2$ and $Q$ play different roles in process monitoring (Qin, 2003). Hence, it should be possible to design the two indices with different smoothing parameter values. Besides, the theoretical analyses provided by current works are not sufficient to provide insight into incipient fault detection. For example, the time delay caused by the use of the MA or EWMA technique is not analyzed in detail. As for Harmouche et al. (2014) and Harmouche et al. (2015), calculation of the KLD measure requires probability density estimation of latent scores, which needs a set of observations simultaneously. Consequently, the instant of fault occurrence cannot be obtained.
In the present work, two novel strategies are proposed to perform incipient fault detection. Because the two representative smoothing techniques MA and EWMA are well-recognized tools that are sensitive to small shifts (Chen et al., 2001; Wierda, 1994), they are adopted here and combined with conventional MSPM methods. Different from existing works designed specifically for PCA, this work extracts and analyzes a generic fault detection index, which can be applied to various MSPM methods. Two new indices integrating MA and EWMA with the generic index are proposed respectively, and their control limits are established. Fault detectability analyses of the conventional index and the two proposed indices are provided and compared with each other, demonstrating the superior performance of the proposed indices over the conventional one for incipient faults. The corresponding geometric interpretation is provided as well. In addition, the effects of the smoothing parameters on the detection performance are analyzed. In our previous work, the MA technique was combined with several statistics to improve the incipient fault detection efficiency (Ji, He, Shang, & Zhou, 2016). However, those contributions are limited to the PCA model and the single-sensor incipient fault type.
The remainder of this paper is organized as follows. In Section 2, two basic MSPM methods are first reviewed. A generic fault detection index is then extracted and a sufficient detectability condition based on the generic index is derived. Section 3 includes the main results, which consist of a motivational example, two proposed incipient fault detection strategies with corresponding detectability analyses and geometric interpretations, as well as some guidelines for selecting the smoothing parameters in the proposals. Case studies on a numerical example, the continuous stirred tank reactor (CSTR) process, and a practical electric multiple unit (EMU) air brake process are carried out in Section 4, followed by concluding remarks in Section 5.
2. Preliminaries
In this section, the two most popular MSPM models, i.e., PCA and PLS, are first reviewed briefly. Then, a generic fault detection index of quadratic form is extracted to represent the fault detection indices used in various MSPM models. After that, a sufficient condition for fault detectability based on the generic detection index is derived.

2.1. Generic fault detection index in MSPM
Given a normalized data matrix $X \in \mathbb{R}^{N \times m}$ consisting of $N$ samples with $m$ variables, PCA decomposes $X$ as follows (Qin, 2003)
$$X = T P^T + E \tag{1}$$
where $P \in \mathbb{R}^{m \times l}$ is the loading matrix, $T \in \mathbb{R}^{N \times l}$ is the score matrix, and $l$ is the number of principal components (PCs) retained in the model (Valle, Li, & Qin, 1999). The model parameter $P$ can be obtained by eigen-decomposition of the sample covariance of $X$,
$$S = \frac{1}{N-1} X^T X = P \Lambda P^T + \tilde{P} \tilde{\Lambda} \tilde{P}^T \tag{2}$$
The measurement space is thus partitioned via PCA into the principal component subspace (PCS) and the residual subspace (RS), which are spanned by the columns of $P$ and $\tilde{P}$, respectively. Several fault detection indices are defined in these subspaces, such as the $T^2$ index, $T^2 = x^T P \Lambda^{-1} P^T x$, and the squared prediction error index, $Q = x^T (I - P P^T) x$. Alternatively, one can choose to monitor the PCS and RS simultaneously with the combined index $\phi$ or the global Mahalanobis distance $D$ (Qin, 2003).
PLS is another commonly used technique in MSPM. PLS uses quality data to guide the decomposition of process data (Qin, 2012). Given normalized process data $X \in \mathbb{R}^{N \times m}$ and quality data $Y \in \mathbb{R}^{N \times p}$, the nonlinear iterative partial least squares (NIPALS) algorithm can be used to establish the following PLS model (Zhou, Li, & Qin, 2010):
$$\begin{cases} X = T P^T + E \\ Y = T Q^T + F \end{cases} \tag{3}$$
In PLS-type methods, $T^2$ and $Q$ indices are also commonly used for fault detection. For a sample $x$, they are defined as $T^2 = x^T R \Lambda^{-1} R^T x$ and $Q = \| (I - P R^T) x \|^2$, where $R$ can be obtained through the NIPALS algorithm and $\Lambda = T^T T / (N - 1)$. A combined index integrating $T^2$ and $Q$ can also be defined in PLS like that in the PCA model (Yue & Qin, 2001).
These fault detection indices used in PCA and PLS are all of quadratic form, which can be denoted by a unified expression:
$$\Upsilon(x) = x^T M x \tag{4}$$
The kernel matrix $M$ varies for different fault detection indices, as listed in Table 1. Though the quadratic form (4) is extracted from PCA and PLS models, it can also be applied to other MSPM models, e.g., factor analysis (FA) (Ge, Xie, & Song, 2009), probabilistic PCA (PPCA) (Chen & Sun, 2009), FDA (Yin, Ding, Haghani, Hao, & Zhang, 2012), independent component analysis (ICA) (Lee, Yoo, & Lee, 2004), among others (Qin & Zheng, 2013; Yin et al., 2011; Zhou et al., 2010). It can be easily verified that $M$ is symmetric and positive definite or semi-definite.
To accomplish online fault detection, the threshold or control limit of the generic fault detection index (4) ought to be calculated in advance. We employ results reported in Box (1954) to calculate the control limit, as expressed by
$$\eta^2 = g^{\Upsilon} \chi^2_{\alpha}(h^{\Upsilon}) \tag{5}$$
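Fig. 1. Fault detection using $D$ in the illustrative example.

Table 1
Fault detection indices in PCA and PLS.

| Model | Index $\Upsilon$ | Kernel matrix $M$ |
|-------|------------------|-------------------|
| PCA | $T^2$ | $P \Lambda^{-1} P^T \triangleq D$ |
| PCA | $Q$ | $I - P P^T \triangleq \tilde{C}$ |
| PCA | $\phi$ * | $D/\tau^2 + \tilde{C}/\delta^2$ |
| PCA | $D$ | $S^{-1}$ |
| PLS | $T^2$ | $R \Lambda^{-1} R^T$ |
| PLS | $Q$ | $(I - R P^T)(I - P R^T)$ |

* Note: $\tau^2$ and $\delta^2$ represent control limits of the $T^2$ and $Q$ indices in PCA, respectively (Yue & Qin, 2001).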
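As a hedged illustration of Table 1 and the quadratic form (4), the following minimal Python sketch builds the PCA kernel matrices for $T^2$, $Q$, and $D$ from a normalized data matrix and evaluates $\Upsilon(x) = x^T M x$. The function names `pca_kernels` and `generic_index`, the synthetic data, the mixing matrix, and the choice of two retained PCs are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def pca_kernels(X, n_pc):
    """Kernel matrices M of the PCA indices T^2, Q and D (Table 1).

    X is assumed to be normalized (zero mean, unit variance per column).
    """
    N, m = X.shape
    S = X.T @ X / (N - 1)                    # sample covariance, Eq. (2)
    eigval, eigvec = np.linalg.eigh(S)       # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]         # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    P = eigvec[:, :n_pc]                     # loading matrix
    Lam_inv = np.diag(1.0 / eigval[:n_pc])   # inverse of retained eigenvalues
    return {
        "T2": P @ Lam_inv @ P.T,             # P Lambda^{-1} P^T
        "Q": np.eye(m) - P @ P.T,            # I - P P^T
        "D": np.linalg.inv(S),               # S^{-1}
    }

def generic_index(x, M):
    """Generic quadratic-form index of Eq. (4)."""
    return float(x @ M @ x)

# Hypothetical data: four correlated variables, 500 samples.
rng = np.random.default_rng(0)
mix = np.array([[1.0, 0.5, 0.0, 0.0],
                [0.0, 1.0, 0.5, 0.0],
                [0.0, 0.0, 1.0, 0.5],
                [0.2, 0.0, 0.0, 1.0]])
X = rng.standard_normal((500, 4)) @ mix
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

kernels = pca_kernels(X, n_pc=2)
indices = {name: generic_index(X[0], M) for name, M in kernels.items()}
```

Because every index shares the same form, a monitoring routine only needs the matrix $M$; switching from $T^2$ to $Q$ or $D$ changes the kernel, not the code path.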
where
$$g^{\Upsilon} = \frac{\mathrm{tr}[(SM)^2]}{\mathrm{tr}(SM)}, \qquad h^{\Upsilon} = \frac{[\mathrm{tr}(SM)]^2}{\mathrm{tr}[(SM)^2]} \tag{6}$$
Here, $S$ denotes the covariance matrix of $x$ under normal conditions. Calculation of (5) is under the assumption that $x$ follows a multinormal distribution. For a new sample $x$, a fault is detected if $\Upsilon(x) > \eta^2$; otherwise, the monitored process is considered normal.

2.2. Sufficient detectability condition
The fault is assumed to affect the measurements through an additive term (Dunia & Qin, 1998; Qin, 2003)
$$x = x^* + \Xi_j f \tag{7}$$
where $x^*$ denotes the normal part, $\Xi_j f$ denotes the faulty part, and $\Xi_j$ and $\| f \|$ are the fault direction and magnitude, respectively. Both sensor faults and process faults can be represented by this form (Yue & Qin, 2001). To guarantee that the faulty sample $x$ as expressed in (7) can be detected successfully, the following condition should be satisfied
$$\Upsilon(x) = \| M^{1/2} x \|^2 = \| M^{1/2} (x^* + \Xi_j f) \|^2 > \eta^2 \tag{8}$$
or, equivalently
$$\| M^{1/2} (x^* + \Xi_j f) \| > \eta \tag{9}$$
Using the triangle inequality for the vector norm, we have
$$\| M^{1/2} (x^* + \Xi_j f) \| \ge \| M^{1/2} \Xi_j f \| - \| M^{1/2} x^* \| \tag{10}$$
Besides, $\| M^{1/2} x^* \|^2 \le \eta^2$, or equivalently,
$$\| M^{1/2} x^* \| \le \eta \tag{11}$$
because $x^*$ is the normal portion of $x$. Combining (10) and (11) gives $\| M^{1/2} (x^* + \Xi_j f) \| \ge \| M^{1/2} \Xi_j f \| - \eta$, so (9) holds whenever the right-hand side exceeds $\eta$. By integrating (9), (10), and (11), a sufficient detectability condition is therefore obtained as follows
$$\| M^{1/2} \Xi_j f \| > 2\eta \tag{12}$$

3. Proposed method
A numerical example is first provided to illustrate the inefficiency of conventional fault detection indices for incipient faults. Then, novel fault detection strategies integrating conventional detection indices with smoothing techniques are proposed, followed by fault detectability analyses. To demonstrate the superiority of the proposed strategies for incipient faults intuitively, a comparison of geometric interpretations of the conventional and proposed strategies is made. Besides, effects of the smoothing parameters on fault detection, including pros and cons, are analyzed and summarized.

3.1. An illustrative example
Consider a simple process with two correlated variables that, under normal operating conditions, follow an arbitrarily specified multivariate Gaussian distribution as follows
$$x \sim \mathcal{N}(\mu, \Sigma), \qquad \mu = \begin{bmatrix} 6 \\ 4 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 3 & 2.6 \\ 2.6 & 4 \end{bmatrix} \tag{13}$$
The $D$ index is used here to monitor the process. In this case, the control limit is $\chi^2_{\alpha}(2)$, which can be calculated explicitly from (5) and (6). The significance level $\alpha$ is set as 0.01. In the offline modeling step, 500 training samples are generated according to (13) to obtain the sample mean $\hat{\mu}$ and covariance $S$. In the online monitoring step, the $D$ index of a new test sample $x$ is calculated, $D = (x - \hat{\mu})^T S^{-1} (x - \hat{\mu})$, and compared against the control limit $\chi^2_{0.01}(2)$ to judge whether a fault occurs or not. In this example, the test dataset consists of 500 samples as well. They are first generated according to (13) and a process fault is introduced from sample 201. This fault has an additive form $x = x^* + \xi f$, where $\xi = [0.2425 \;\; 0.9701]^T$ and $f = 3$. Fig. 1 shows the fault detection result for the test dataset. Two commonly used performance indices, i.e., false alarm rate (FAR) and fault detection rate (FDR), are calculated and marked in the figure. Although the fault is present from sample 201 onward, most $D$ indices of samples 201 to 500 are still below the control limit (indicated by the red line in Fig. 1). Thus, the FDR is only 12.67%, which is far from satisfactory. This result can be attributed directly to the relatively small fault magnitude. From (12), we can determine that for the illustrative example the critical fault magnitude is equal to 10.26. In other words, if $f \le 10.26$, then the fault is not guaranteed to be detectable using the conventional $D$ index. Therefore, improved fault detection indices are needed for incipient fault detection.

3.2. Incipient fault detection strategies
To enhance the incipient fault detection performance, two smoothing techniques, i.e., MA and EWMA, are respectively incorporated into the conventional fault detection index, resulting in two novel fault detection strategies. Through the fault detectability analysis, improvements of the new proposals over the conventional one for incipient fault detection are indicated explicitly. As indicated by (4), the fault detection index at time $k$ depends solely on the current sample $x_k$. No additional information from past data is kept. Therefore, as shown by the illustrative example, this strategy is not sensitive to incipient faults. On the other hand, MA, which incorporates historical information, performs well at detecting small shifts in the mean. For these reasons, we propose to use the following fault detection index at time $k$
$$\bar{\Upsilon}_k = \bar{x}_k^T M \bar{x}_k \tag{14}$$
where
$$\bar{x}_k = \frac{1}{W} \sum_{w = k - W + 1}^{k} x_w \tag{15}$$
is the averaged sample of the last $W$ samples. Because $\bar{\Upsilon}$ is different from $\Upsilon$, its control limit has to be re-established so that online fault detection can be performed. It is observed from (14) that $\bar{\Upsilon}$ has the quadratic form as well. Thus, the control limit of $\bar{\Upsilon}$ can also be approximated by the $\chi^2$ distribution as follows (Box, 1954)
$$\bar{\eta}^2 = g^{\bar{\Upsilon}} \chi^2_{\alpha}(h^{\bar{\Upsilon}}) \tag{16}$$
where
$$g^{\bar{\Upsilon}} = \frac{\mathrm{tr}[(\bar{S}M)^2]}{\mathrm{tr}(\bar{S}M)}, \qquad h^{\bar{\Upsilon}} = \frac{[\mathrm{tr}(\bar{S}M)]^2}{\mathrm{tr}[(\bar{S}M)^2]} \tag{17}$$
Here, $\bar{S}$ denotes the covariance matrix of the averaged sample $\bar{x}$ under normal conditions. Under the assumption that normal samples are i.i.d., it can be easily verified that
$$\bar{S} = \frac{1}{W} S \tag{18}$$
Making use of (5), (6), (16), (17), and (18), we can obtain the following result after simple manipulation
$$\bar{\eta}^2 = \frac{1}{W} \eta^2 \tag{19}$$
That is to say, the control limit of the new fault detection index is reduced due to the use of the MA technique. To show that $\bar{\Upsilon}$ is superior to $\Upsilon$ for incipient faults, the fault detectability condition of $\bar{\Upsilon}$ is analyzed and then compared with that of $\Upsilon$. According to the fault model (7), the averaged sample within a time window can be expressed as
$$\bar{x} = \bar{x}^* + \Xi_j f \tag{20}$$
Here it is assumed that within the time window, all $W$ samples are faulty. The case that a time window contains both normal and faulty samples, i.e., the initial stage of a fault, will be discussed in Section 3.4. The fault can be guaranteed to be detected by $\bar{\Upsilon}$ if
$$\bar{\Upsilon} = \bar{x}^T M \bar{x} = \| M^{1/2} \bar{x} \|^2 = \| M^{1/2} (\bar{x}^* + \Xi_j f) \|^2 > \bar{\eta}^2 \tag{21}$$
Using the same technique as in Section 2.2, we can obtain a sufficient detectability condition for (21) as follows
$$\| M^{1/2} \Xi_j f \| > 2\bar{\eta} \tag{22}$$
For an incipient fault, (12) may not be fulfilled; thus the fault cannot be successfully detected by conventional detection indices. However, as shown in (19), $\bar{\eta}$ is inversely proportional to $\sqrt{W}$. Thus, (22) can be satisfied if an appropriate window width $W$ is chosen. More specifically, define the critical window width as
$$W_c = \left( \frac{2\eta}{\| M^{1/2} \Xi_j f \|} \right)^2 \tag{23}$$
and we will have the following theorem.
Theorem 1. With the proposed fault detection index $\bar{\Upsilon}$, the incipient fault as expressed in (7) can be guaranteed to be detectable if $W > W_c$.
The proof of this theorem can be obtained directly by substituting (19) into (22). Regarding the illustrative example in Section 3.1, we can conclude from Theorem 1 that the imposed fault can be guaranteed to be detected by $\bar{D}$ with a window width greater than 13, as shown in Fig. 2. Apparently, compared with $\Upsilon$, $\bar{\Upsilon}$ provides a much better detection performance for the incipient fault.

Fig. 2. Fault detection using $\bar{D}$ ($W = 13$) in the illustrative example.

Besides MA, EWMA is another well-known filtering technique sensitive to small shifts (Roberts, 2000). For a set of centered (zero-mean) samples $x_k$, $k = 1, 2, \ldots, N$, the multivariate EWMA (MEWMA) statistic is defined as
$$\check{x}_k = \lambda x_k + (1 - \lambda) \check{x}_{k-1} \tag{24}$$
where $0 < \lambda \le 1$ is the weighting factor and $\check{x}_0 = 0$. Some further manipulation leads to
$$\check{x}_k = \lambda \sum_{i=1}^{k} (1 - \lambda)^{k-i} x_i \tag{25}$$
Then, we propose the following fault detection index at time $k$
$$\check{\Upsilon}_k = \check{x}_k^T M \check{x}_k \tag{26}$$
Similar to $\bar{\Upsilon}_k$, $\check{\Upsilon}_k$ is also of the quadratic form. Thus, the control limit of $\check{\Upsilon}_k$, denoted by $\check{\eta}_k^2$, can be calculated via a $\chi^2$ distribution as well. To determine the distribution parameters $g_k^{\check{\Upsilon}}$ and $h_k^{\check{\Upsilon}}$ in $\check{\eta}_k^2$, the covariance of $\check{x}_k$, $\check{S}_k = E(\check{x}_k \check{x}_k^T)$, ought to be calculated first. Assuming $x_i$, $i = 1, 2, \ldots, k$ are independent and identically Gaussian distributed, from (25) we can express $\check{S}_k$ as
$$\check{S}_k = \frac{\lambda [1 - (1 - \lambda)^{2k}]}{2 - \lambda} S \tag{27}$$
As $k$ gets larger, $(1 - \lambda)^{2k} \to 0$ and $\check{S}_k \to \check{S} = [\lambda / (2 - \lambda)] S$. Consequently, we have
$$\begin{cases} g_k^{\check{\Upsilon}} \to g^{\check{\Upsilon}} = \dfrac{\mathrm{tr}[(\check{S}M)^2]}{\mathrm{tr}(\check{S}M)} = [\lambda / (2 - \lambda)] \, g^{\Upsilon} \\[2ex] h_k^{\check{\Upsilon}} \to h^{\check{\Upsilon}} = \dfrac{[\mathrm{tr}(\check{S}M)]^2}{\mathrm{tr}[(\check{S}M)^2]} = h^{\Upsilon} \end{cases} \tag{28}$$
leading to
$$\check{\eta}_k^2 \to \check{\eta}^2 = [\lambda / (2 - \lambda)] \, \eta^2 \tag{29}$$
For simplification, we will take $\check{\eta}^2$ instead of $\check{\eta}_k^2$ as the control limit of $\check{\Upsilon}_k$ in the following, because $(1 - \lambda)^{2k}$ in (27) converges rapidly as $k$ increases provided $\lambda$ is not extremely small. To demonstrate the detection performance of $\check{\Upsilon}$ for incipient faults, its detectability is analyzed and compared with that of $\Upsilon$. Substituting the fault model (7) into (25) and assuming the fault occurs at time $T$, we have
$$\check{x}_k = \lambda \sum_{i=1}^{k} (1 - \lambda)^{k-i} x_i^* + \lambda \sum_{i=T}^{k} (1 - \lambda)^{k-i} \Xi_j f = \lambda \sum_{i=1}^{k} (1 - \lambda)^{k-i} x_i^* + [1 - (1 - \lambda)^{k-T+1}] \Xi_j f \tag{30}$$
We can further assume that $k$ is much larger than $T$ such that $(1 - \lambda)^{k-T+1} \to 0$. Under this circumstance, (30) reduces to
$$\check{x}_k = \lambda \sum_{i=1}^{k} (1 - \lambda)^{k-i} x_i^* + \Xi_j f = \check{x}_k^* + \Xi_j f \tag{31}$$
where $\check{x}_k^* \triangleq \lambda \sum_{i=1}^{k} (1 - \lambda)^{k-i} x_i^*$. The case that $(1 - \lambda)^{k-T+1}$ cannot be treated as 0, i.e., the initial stage of a fault, will be discussed in Section 3.4. Successful detection of the fault using $\check{\Upsilon}$ requires
$$\check{\Upsilon} = \check{x}^T M \check{x} = \| M^{1/2} \check{x} \|^2 = \| M^{1/2} (\check{x}^* + \Xi_j f) \|^2 > \check{\eta}^2 \tag{32}$$
where the subscript $k$ is omitted for clarity. Using the same technique as aforementioned, we can obtain a sufficient detectability condition for (32) as follows
$$\| M^{1/2} \Xi_j f \| > 2\check{\eta} \tag{33}$$
According to (29), $\check{\eta} = \sqrt{\lambda / (2 - \lambda)} \, \eta$. In addition, $0 < \lambda \le 1$, thus $0 < \check{\eta} \le \eta$. Therefore, for an incipient fault, though (12) may not be fulfilled, (33) can be satisfied if an appropriate weighting factor $\lambda$ is chosen. More specifically, define the critical weighting factor as
$$\lambda_c = \frac{2}{[2\eta / \| M^{1/2} \Xi_j f \|]^2 + 1} \tag{34}$$
and we will have the following theorem.
Theorem 2. With the proposed fault detection index $\check{\Upsilon}$, the incipient fault as expressed in (7) can be guaranteed to be detectable if $\lambda < \lambda_c$.
This theorem can be proven directly by substituting (29) into (33). For the illustrative example in Section 3.1, we can conclude from Theorem 2 that the introduced fault can be certainly detected by $\check{D}$ if $\lambda$ is less than 0.1423 ($\lambda_c$). As shown in Fig. 3, compared with $\Upsilon$, $\check{\Upsilon}$ provides a much better detection performance.

Fig. 3. Fault detection using $\check{D}$ ($\lambda = 0.14$) in the illustrative example.

The two indices $\bar{\Upsilon}$ and $\check{\Upsilon}$ are related to each other in terms of their detectability properties. It can be observed from (23) and (34) that $\lambda_c$ relates to $W_c$ via $\lambda_c = 2 / (W_c + 1)$, which links the two proposed indices together. Indeed, as shown in Section 4, the two indices exhibit very similar detection results in various simulation examples. The main difference between $\bar{\Upsilon}$ and $\check{\Upsilon}$ rests with the difference between the two smoothing techniques MA and EWMA. Because the EWMA statistic is calculated in a recursive manner, its requirement for data storage is small (Salsbury & Alcala, 2017). Thus, the $\check{\Upsilon}$ index may be preferred in practice.

3.3. Geometric interpretation
In this part, a geometric interpretation is provided to illustrate intuitively the superiority of the proposed indices over the conventional one. Again, the two-dimensional Gaussian variable (13) is taken as the example for illustration. The distribution parameters, as well as the introduced fault scenario, remain unchanged. 500 normal and 500 faulty samples are respectively generated and displayed in Fig. 4 (a). Normal data are denoted by blue points, while faulty data are denoted by pink asterisks. The red ellipse represents the control limit when the conventional $D$ index is used. As we can see from Fig. 4 (a), a large proportion of faulty samples lie within the control limit, implying a low FDR offered by $D$. Then, we turn to the geometric property of the proposed $\bar{D}$ index. Fig. 4 (b) shows the scatter plot of normal and faulty data handled by the MA technique with the window width $W$ selected as 6. The control limit of $\bar{D}$ is re-established according to (19) and marked by the red ellipse. It is observed from this figure that the MA-smoothed normal and faulty samples are well separated by the control limit of $\bar{D}$. Therefore, compared with $D$, $\bar{D}$ is much better at detecting incipient faults and at the same time has a low FAR. Analogously, Fig. 4 (c) shows the case for comparison of $\check{D}$ and $D$.

Fig. 4. Scatter plot of the normal and faulty data: (a) original data, (b) MA-smoothed data, and (c) EWMA-smoothed data.

3.4. Selection of the smoothing parameter
The window width $W$ and the weighting factor $\lambda$ are crucial parameters for the proposed indices $\bar{\Upsilon}$ and $\check{\Upsilon}$ respectively, which directly affect the incipient fault detection performance. In this subsection, effects of $W$ (on $\bar{\Upsilon}$) and $\lambda$ (on $\check{\Upsilon}$) are analyzed and summarized, which can guide the selection of these smoothing parameters.
First we investigate the effects of $W$ on $\bar{\Upsilon}$. The advantage of using the MA technique for incipient fault detection is apparent. As we can conclude from (23), a small fault ($\Xi_j f$) corresponds to a large critical window width $W_c$. Thus, according to Theorem 1, a larger $W$ ensures the successful detection of a smaller fault. However, an overlarge $W$ may lead to two undesired side effects. In the first place, it can be observed from (15) that an overlarge $W$ increases the amount of computation slightly. Second, a large detection delay can be incurred due to the use of a large $W$. To derive a concrete relation between the detection time delay and the selected window width, we analyze the initial stage of a fault, that is, within a time window some samples are normal while others are faulty. As illustrated in Fig. 5, we consider the case that the current time window contains $L_d$ faulty samples and $(W - L_d)$ normal samples, so the detection time delay is equal to $(L_d - 1)$. The averaged sample within this time window is
$$\bar{x} = \bar{x}^* + \frac{L_d}{W} \Xi_j f \tag{35}$$

Fig. 5. Illustration of the initial stage of a fault.

It is further assumed that the fault is successfully detected by $\bar{\Upsilon}$ at this moment. Similar to (22), a sufficient detectability condition for $\bar{x}$ expressed in (35) is
$$\left\| M^{1/2} \cdot \frac{L_d}{W} \Xi_j f \right\| > 2\bar{\eta} \tag{36}$$
which leads to the following result after simple manipulation
$$L_d > \sqrt{W \cdot W_c} \tag{37}$$
As $W_c$ is a constant value given a specific fault, the detection delay is approximately proportional to the square root of the selected window width $W$. To avoid serious detection delay, $W$ should not be too large. On the other hand, $W > W_c$ has to be fulfilled to guarantee fault detectability. As a compromise, the ideal choice is the critical window width $W_c$.
Then, we turn to the discussion of the effects of $\lambda$ on $\check{\Upsilon}$. From (34), we observe that a small fault corresponds to a small critical weighting factor $\lambda_c$. Hence, according to Theorem 2, a smaller $\lambda$ allows the successful detection of a smaller fault. Nevertheless, a much smaller $\lambda$ is not always better. As indicated by (24), $\lambda$ measures the rate at which older measurements enter into the calculation of the statistic $\check{x}_k$. A small value of $\lambda$ gives more weight to older data and less weight to current data, which may result in a delay for fault detection. To analyze the relation between the detection delay and the chosen $\lambda$, we reconsider (30). This time, however, $(1 - \lambda)^{k-T+1}$ cannot be assumed equal to 0. Define $\check{L}_d \triangleq (k - T + 1)$, which denotes the number of faulty samples up to the current time $k$. We can express $\check{x}_k$ as follows
$$\check{x}_k = \check{x}_k^* + [1 - (1 - \lambda)^{\check{L}_d}] \Xi_j f \tag{38}$$
where $\check{x}_k^*$ is the same as in (31). It is further assumed that $\check{x}_k$ in (38) is successfully judged as faulty by $\check{\Upsilon}$, so the detection delay is $(\check{L}_d - 1)$. Similar to (33), a sufficient detectability condition for $\check{x}_k$ in (38) is
$$\| M^{1/2} \cdot [1 - (1 - \lambda)^{\check{L}_d}] \Xi_j f \| > 2\check{\eta} \tag{39}$$
Solving for $\check{L}_d$, we have
$$\check{L}_d > f(\lambda) = \frac{\ln \left[ 1 - \sqrt{\dfrac{\lambda}{2 - \lambda} \cdot \dfrac{2 - \lambda_c}{\lambda_c}} \right]}{\ln (1 - \lambda)} \tag{40}$$
The domain of $f(\lambda)$ is $\lambda \in (0, \lambda_c)$. The function $f(\lambda)$ is continuous and differentiable everywhere on this domain. Besides, we have $\lim_{\lambda \to 0} f(\lambda) = +\infty$ and $\lim_{\lambda \to \lambda_c} f(\lambda) = +\infty$. Thus, there exists an optimal parameter $\lambda_{opt}$ within the domain such that $f(\lambda_{opt})$ is minimized. The optimal selection $\lambda_{opt}$ can be estimated by numerical methods. To illustrate the shape of $f(\lambda)$, Fig. 6 plots its graph for four different $\lambda_c$ values. It can be noted from Fig. 6 that the minimum value of $f(\lambda)$, i.e., $f(\lambda_{opt})$, increases as $\lambda_c$ decreases. That is to say, a smaller fault corresponds to a smaller $\lambda_c$, and thus the detection delay may be more serious.

Fig. 6. Illustration of $f(\lambda)$ versus $\lambda$ for different $\lambda_c$ values: (a) $\lambda_c = 0.1$, (b) $\lambda_c = 0.3$, (c) $\lambda_c = 0.5$, and (d) $\lambda_c = 0.7$.

To sum up, effects of the smoothing parameters ($W$, $\lambda$) on the proposed indices ($\bar{\Upsilon}$, $\check{\Upsilon}$) are sketched in Table 2.

Table 2
Effects of the smoothing parameters on the proposed indices.

| Index | Parameter | Feasible region (a) | Effects of the parameter (pros and cons) | Optimal choice |
|-------|-----------|---------------------|------------------------------------------|----------------|
| $\bar{\Upsilon}$ | $W$ | $(W_c, +\infty)$ | (1) A larger $W$ guarantees detection of a smaller fault; (2) Overlarge $W$ increases the computational load; (3) Overlarge $W$ may result in serious detection delay. | $W_c$ |
| $\check{\Upsilon}$ | $\lambda$ | $(0, \lambda_c)$ | (1) A smaller $\lambda$ guarantees detection of a smaller fault; (2) Too small $\lambda$ makes detection delay serious. | $\lambda_{opt}$ (b) |

(a) The feasible region guarantees fault detectability.
(b) See (40) and the corresponding analysis for explanation.

As discussed in this section, the proposed indices, compared with conventional ones, possess a better detection performance for incipient faults. In addition, since only extra addition operations are required by the proposed indices, their computational complexity is almost the same as that of conventional indices. However, two limitations of the proposed method should also be noted. First, similar to many static MSPM methods, samples are assumed independent and identically Gaussian distributed in this work. Under this assumption, the control limits of the proposed indices, i.e., (19) and (29), are valid and accurate. Though in Section 4.2 the proposed method is applied to an industrial process where serial correlation among variables is present and also shows good detection performance, the dynamic process monitoring problem is not explicitly dealt with in this work. Second, as analyzed in Section 3.4, a time delay is usually needed to successfully detect the incipient fault. In other words, the proposed method improves the incipient fault detection performance at the cost of detection timeliness.
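To make the relations in Section 3 concrete, the following Python sketch applies the conventional, MA-smoothed, and EWMA-smoothed $D$ indices to the two-variable example (13). It is a minimal illustration under stated assumptions, not the authors' implementation: the true $\mu$ and $\Sigma$ are used in place of estimates from training data, the random seed, the choice $W = \lfloor W_c \rfloor + 1$, and the choice $\lambda = 0.9 \lambda_c$ are arbitrary, and the helper names (`ma_index`, `ewma_index`, `delay_bound`) are hypothetical. Because of these choices, the values of $W_c$ and $\lambda_c$ it produces need not match those reported in the text.

```python
import numpy as np
from scipy.stats import chi2

# Example (13) and the fault of Section 3.1; true parameters are assumed known.
mu = np.array([6.0, 4.0])
Sigma = np.array([[3.0, 2.6], [2.6, 4.0]])
xi = np.array([0.2425, 0.9701])
f, alpha = 3.0, 0.01

M = np.linalg.inv(Sigma)                    # kernel of the D index (M = S^{-1})
eta2 = chi2.ppf(1 - alpha, df=2)            # Eq. (5): g = 1, h = 2 when SM = I
eta = np.sqrt(eta2)
dir_norm = np.sqrt(xi @ M @ xi)             # ||M^{1/2} xi||

f_crit = 2 * eta / dir_norm                 # critical magnitude from Eq. (12)
W_c = (2 * eta / (dir_norm * f)) ** 2       # Eq. (23)
lam_c = 2 / (W_c + 1)                       # Eq. (34)

def quad(Z):
    """Row-wise quadratic form z^T M z."""
    return np.einsum("ij,jk,ik->i", Z, M, Z)

def ma_index(Xc, W):
    """MA index, Eqs. (14)-(15); entry j uses centered samples j..j+W-1."""
    csum = np.cumsum(np.vstack([np.zeros(Xc.shape[1]), Xc]), axis=0)
    return quad((csum[W:] - csum[:-W]) / W)

def ewma_index(Xc, lam):
    """EWMA index, Eqs. (24) and (26), started from zero."""
    state = np.zeros(Xc.shape[1])
    out = np.empty(len(Xc))
    for k, x in enumerate(Xc):
        state = lam * x + (1 - lam) * state
        out[k] = state @ M @ state
    return out

def delay_bound(lam, lam_c):
    """f(lambda) of Eq. (40), valid for 0 < lam < lam_c."""
    ratio = np.sqrt(lam / (2 - lam) * (2 - lam_c) / lam_c)
    return np.log(1 - ratio) / np.log(1 - lam)

rng = np.random.default_rng(1)
X = rng.multivariate_normal(mu, Sigma, size=500)
X[200:] += xi * f                           # fault introduced from sample 201
Xc = X - mu

W = int(np.floor(W_c)) + 1                  # smallest integer above W_c
lam = 0.9 * lam_c                           # arbitrary value below lam_c

fdr_raw = np.mean(quad(Xc[200:]) > eta2)
fdr_ma = np.mean(ma_index(Xc, W)[200:] > eta2 / W)                 # Eq. (19)
fdr_ewma = np.mean(ewma_index(Xc, lam)[200:] > lam / (2 - lam) * eta2)  # Eq. (29)

print(f"f_crit={f_crit:.2f}, W_c={W_c:.2f}, lam_c={lam_c:.4f}")
print(f"FDR raw={fdr_raw:.3f}, MA={fdr_ma:.3f}, EWMA={fdr_ewma:.3f}")
print(f"MA delay bound={np.sqrt(W * W_c):.1f}, EWMA delay bound={delay_bound(lam, lam_c):.1f}")
```

The MA detection rate is computed only over windows that are entirely faulty, matching the assumption behind (20), whereas the EWMA rate includes the initial transient analyzed through (38) to (40).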
4. Case studies

In this section, three examples are used to demonstrate the effectiveness of the proposed method in comparison with classical approaches. The significance level, which directly determines the theoretical false alarm rate (FAR), is required for calculating the control limits of detection indices. Thus, for a fair comparison, the significance levels for all models and detection indices are set to 0.01. In theory, the FARs of the different methods will then all be close to the significance level of 1%, provided that the control limits are determined properly; in practice, they need not be exactly the same because of the randomness in a single experiment.

4.1. A numerical example

As mentioned above, the proposed incipient fault detection strategies, i.e., $\bar{\Upsilon}$ and $\check{\Upsilon}$, are applicable to various MSPM models provided that the detection indices are of the quadratic form (4). Here, without loss of generality, we apply $\bar{\Upsilon}$ and $\check{\Upsilon}$ to the total projection to latent structures (T-PLS) model (Zhou et al., 2010). The following synthetic model, which is the same as in Zhou et al. (2010), is adopted to generate normal data for model training
$$\begin{cases} x_k = A z_k + e_k \\ y_k = C x_k + v_k \end{cases} \qquad (41)$$

where

$$z_k \in \mathbb{R}^3,\ z_{k,i} \sim U[0, 1],\ i = 1, 2, 3, \qquad A = \begin{bmatrix} 1 & 3 & 4 & 4 & 0 \\ 3 & 0 & 1 & 4 & 1 \\ 1 & 1 & 3 & 0 & 0 \end{bmatrix}^T$$

$$e_k \in \mathbb{R}^5,\ e_{k,j} \sim \mathcal{N}(0, 0.05^2),\ j = 1, \ldots, 5, \qquad C = [\,2\ \ 2\ \ 1\ \ 1\ \ 0\,],\ v_k \sim \mathcal{N}(0, 0.1^2)$$

$U[0, 1]$ represents the uniform distribution on the interval [0, 1], and $\mathcal{N}(\mu, \sigma^2)$ represents the Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Faulty data are constructed via (7), in which the normal portion $x^*$ is also produced by (41) but is independent of the normal data used for training.

According to Zhou et al. (2010), the T-PLS method, which is based on the PLS model, further decomposes the X-space into four subspaces, denoted by $S_y$, $S_o$, $S_{rp}$, and $S_{rr}$, each with a different meaning. Four statistics, i.e., $T_y^2$, $T_o^2$, $T_r^2$, and $Q_r$, together with their corresponding control limits, are developed in these subspaces to perform fault detection. Note that Zhou et al. (2010) mainly aims to demonstrate the superiority of T-PLS over PLS and PCA for quality-related process monitoring, whereas the present work takes the T-PLS model only as an example and aims to show the superiority of the proposed detection indices over the conventional ones used in the T-PLS model. Consequently, we decrease the fault magnitude used in Zhou et al. (2010) and test whether the proposed indices can detect the incipient fault efficiently.

We use 500 normal samples, produced by (41), to train a T-PLS model. The model parameter matrices obtained in this simulation are shown in Table 3. The test dataset is composed of 200 samples, of which the first 100 are normal and the last 100 are faulty. Without loss of generality, we consider only one fault case for demonstration. Let the fault occur in $S_o$ only, that is, the fault direction $\Xi_j$ in (7) is set equal to $P_o$ (see Table 3). The fault magnitude is set as $f = 1.5$, which is half of that used in Zhou et al. (2010).

Fig. 7. Fault detection using T-PLS based Υ in the numerical example.

Fig. 7 shows the fault detection result using the conventional detection statistics developed in T-PLS. For each statistic, the FAR and the fault detection rate (FDR) are calculated and marked in the figure. As can be seen from Fig. 7, $T_o^2$ (the top-right subfigure) fails to detect this fault efficiently. Usually, the $T_o^2$ statistic can detect a fault occurring in $S_o$ if the fault magnitude is relatively large. We can calculate from (12) that, for the $T_o^2$ statistic, a sufficient condition for fault detectability is $f > 4.74$ in this example. The imposed fault, however, is far from satisfying this sufficient condition. It is therefore understandable that the conventional statistic, denoted by $\Upsilon$, cannot detect the incipient fault efficiently. Then, the proposed statistics $\bar{\Upsilon}$ and $\check{\Upsilon}$ are respectively applied to the test dataset. It can be concluded from Theorem 1 that the incipient fault is guaranteed to be detected by $\bar{T}_o^2$ with a window width larger than 10. Fig. 8 shows the fault detection result using $\bar{\Upsilon}$, i.e., $\bar{T}_y^2$, $\bar{T}_o^2$, $\bar{T}_r^2$, and $\bar{Q}_r$. It can be clearly observed that the $\bar{T}_o^2$ statistic detects the incipient fault successfully after a short delay of only three samples. For $\check{\Upsilon}$, we learn from Theorem 2 that the imposed fault is guaranteed to be detectable by $\check{T}_o^2$ if the weighting factor is less than 0.1824. Indeed, the $\check{T}_o^2$ statistic successfully detects the incipient fault
after a short delay, as shown in Fig. 9. Therefore, compared with the conventional statistic, the proposed ones are more effective in detecting incipient faults. On the other hand, as discussed in Section 3.4, to avoid serious detection delay, the window width used in $\bar{\Upsilon}$ should not be too large
Table 3
T-PLS model parameter matrices.
$p_y$    $P_o$    $P_r$    $\tilde{P}_r$
0.1292 0.1315 0.2282 0.2505 0.0246
−0.2388 0.2636 0.6218 −0.6779 −0.1655
−0.7298 0.5738 0.2321 −0.0990 −0.2727
−0.5047 −0.6961 0.4283 0.2192 0.1710
0.1406 −0.2682 0.0460 0.1191 −0.9445
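The training data behind a model such as the one in Table 3 come from the synthetic system (41). As a rough illustration, the sketch below draws samples from (41) using only the Python standard library; the function name `generate_normal_data` and the seed handling are assumptions made here, and the T-PLS training step itself is not reproduced.

```python
import random

# Matrices of model (41). A is 5x3, i.e., the transpose of the 3x5 array printed in the text.
A = [[1, 3, 1],
     [3, 0, 1],
     [4, 1, 3],
     [4, 4, 0],
     [0, 1, 0]]
C = [2, 2, 1, 1, 0]


def generate_normal_data(n, seed=0):
    """Draw n samples (x_k, y_k) from x = A z + e, y = C x + v."""
    rng = random.Random(seed)
    X, Y = [], []
    for _ in range(n):
        z = [rng.uniform(0.0, 1.0) for _ in range(3)]            # z_i ~ U[0, 1]
        x = [sum(A[j][i] * z[i] for i in range(3)) + rng.gauss(0.0, 0.05)
             for j in range(5)]                                   # e_j ~ N(0, 0.05^2)
        y = sum(C[j] * x[j] for j in range(5)) + rng.gauss(0.0, 0.1)  # v ~ N(0, 0.1^2)
        X.append(x)
        Y.append(y)
    return X, Y
```

Calling `generate_normal_data(500)` would give a training set of the size used in this example; an independent test set can be drawn with a different seed.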
Fig. 8. Fault detection using T-PLS based $\bar{\Upsilon}$ (W=10) in the numerical example.
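To make the moving-window idea behind Fig. 8 concrete, the following sketch assumes that the smoothed index is the quadratic form $\bar{x}_k^T M \bar{x}_k$ of the mean $\bar{x}_k$ of the latest $W$ samples. That definition belongs to an earlier section and is restated here only as an assumption; the names `quadratic_form` and `moving_window_index` are hypothetical, and the control limit from (19) is not computed.

```python
def quadratic_form(x, M):
    """Return x^T M x for a vector x and a square matrix M (nested lists)."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def moving_window_index(samples, M, W):
    """Quadratic index of the W-sample window mean, for every k >= W - 1 (0-based)."""
    n = len(samples[0])
    window_sum = [0.0] * n
    values = []
    for k, x in enumerate(samples):
        for i in range(n):
            window_sum[i] += x[i]                   # add the newest sample
            if k >= W:
                window_sum[i] -= samples[k - W][i]  # drop the sample leaving the window
        if k >= W - 1:
            mean = [s / W for s in window_sum]
            values.append(quadratic_form(mean, M))
    return values
```

With `W = 1` the window mean is the sample itself, so the function returns the conventional index. The running sum means each step costs only a few extra additions.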
Fig. 10. Fault detection using PCA based Υ in the CSTR process.
control errors. $C_{Af}$ is the input reactant concentration, and $T_f$ is the input reactant temperature. In addition, $v_1$ and $v_2$ denote independent system noises. The other parameters in (42) and (43) are constant for the process. The measurements consist of four variables $x = [C_A, T, T_c, q]$, to which independent measurement noises are added. All simulation settings, including parameter values, initial conditions, and controller information, are the same as in Li et al. (2010). To build a PCA model, 1500 samples collected under normal conditions are used. According to the cumulative percent variance (CPV) criterion, two PCs are retained, which account for more than 95% of the variance in the original variables. The test dataset also consists of 1500 samples, in which a sensor bias fault is imposed from sample 501 onward. The fault is added to the fourth measurement $q$, and its magnitude is 5 L/min. Note that the fault magnitude is even smaller than the standard deviation of $q$ (5.4 L/min). Fig. 10 shows the fault detection result using the conventional Q and $T^2$ statistics of PCA. $T^2$ completely fails to detect this fault, since its FDR is close to the significance level. Although the Q index indicates some anomalies after sample 500, its FDR is merely 22.6% and most of the faulty samples are missed. Thus, the conventional statistics in PCA are not efficient in detecting incipient faults. Ku et al. (1995) reported that, in dynamic processes, DPCA usually outperforms static PCA for faults with relatively small magnitude. DPCA performs a standard PCA decomposition on an augmented data matrix built from time-lagged measurements. For the CSTR process, the time lag is determined as $l = 1$ according to the procedure presented in Ku et al. (1995). As shown in Fig. 11, the Q index of the DPCA model indeed improves the detection performance slightly compared with the Q index of PCA. Moreover, as the time lag $l$ increases, the FDR provided by the Q index of DPCA is gradually enhanced, as shown in Table 6.
Nevertheless, the $T^2$ index of DPCA consistently fails to detect this incipient fault. Besides, a large time lag greatly increases the computational complexity of DPCA. Even when the time lag is set to 5, the FDR provided by the Q index is still unsatisfactory. We then turn to the fault detection performance of the
Fig. 9. Fault detection using T-PLS based Υˇ (λ=0.12) in the numerical example.
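The exponentially weighted counterpart used in Fig. 9 can be sketched in the same spirit. The code assumes the recursion $\check{x}_k = \lambda x_k + (1 - \lambda)\check{x}_{k-1}$ followed by the quadratic form $\check{x}_k^T M \check{x}_k$, with the recursion started from the zero vector; both the start value and the function names are assumptions of this sketch, and the control limit from (29) is not computed.

```python
def quadratic_form(x, M):
    """Return x^T M x for a vector x and a square matrix M (nested lists)."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def ewma_index(samples, M, lam):
    """Quadratic index of the EWMA-smoothed sample at every time step."""
    n = len(samples[0])
    smoothed = [0.0] * n  # assumed start value: the (scaled) normal-condition mean
    values = []
    for x in samples:
        smoothed = [lam * x[i] + (1.0 - lam) * smoothed[i] for i in range(n)]
        values.append(quadratic_form(smoothed, M))
    return values
```

With `lam = 1.0` the smoothed sample equals the raw sample and the conventional index is recovered. For a constant offset $f$ entering a zero-start recursion, the smoothed value after $L$ steps is $[1 - (1 - \lambda)^L] f$, which is the factor that appears in the detectability condition for this index.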
Table 4
Detection delay $L_d$ vs. window width $W$.
W        10    15    20    50    100
L_d      3     6     8     9     11

Table 5
Detection delay $\check{L}_d$ vs. weighting factor $\lambda$.
λ        0.12    0.05    0.03    0.01    0.005
Ľ_d      3       4       6       9       14
and the weighting factor used in $\check{\Upsilon}$ should not be too small. To illustrate this point, the detection delays with respect to $W$ (used in $\bar{T}_o^2$) and $\lambda$ (used in $\check{T}_o^2$) for this numerical example are listed in Tables 4 and 5, respectively. Hence, selecting proper smoothing parameters is important for guaranteeing fault detectability and avoiding serious detection delay at the same time. According to Table 2, the critical parameter values, i.e., $W_c$ and $\lambda_c$, directly affect the optimal parameter choice. In practice, the critical parameter values can be determined by (23) and (34), given the detection method adopted and the fault information.

4.2. The CSTR process

In this subsection, a classical benchmark, the continuous stirred tank reactor (CSTR) process, is employed to demonstrate the effectiveness of the proposed method. The proposed detection indices are applied to the Q and $T^2$ statistics of PCA to show that $\bar{Q}$ ($\bar{T}^2$) and $\check{Q}$ ($\check{T}^2$) are more sensitive to incipient faults than the conventional Q ($T^2$) statistic. In addition, the proposed detection indices are compared with detection indices based on dynamic PCA (DPCA) (Ku, Storer, & Georgakis, 1995). The CSTR process is described by two differential equations as follows (Ji et al., 2016; Li, Qin, Ji, & Zhou, 2010)
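$$\frac{dC_A}{dt} = \frac{q}{V}(C_{Af} - C_A) - k_0 \exp\left(-\frac{E}{RT}\right) C_A + v_1 \qquad (42)$$

$$\frac{dT}{dt} = \frac{q}{V}(T_f - T) + \frac{-\Delta H}{\rho C_p} k_0 \exp\left(-\frac{E}{RT}\right) C_A + \frac{UA}{V \rho C_p}(T_c - T) + v_2 \qquad (43)$$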
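For readers who want to simulate the process, the deterministic part of (42) and (43) can be coded directly. The sketch below is only an illustration: the function name `cstr_rhs` and the parameter-dictionary keys are hypothetical, the noise terms $v_1$ and $v_2$ are omitted, and no parameter values are supplied because the text defers them to Li et al. (2010).

```python
import math


def cstr_rhs(CA, T, q, Tc, p):
    """Noise-free right-hand sides of (42) and (43).

    p is a dict with keys V, CAf, Tf, k0, E, R, dH, rho, Cp, UA
    (dH stands for the reaction enthalpy Delta H).
    """
    # Reaction rate term k0 * exp(-E / (R T)) * CA shared by both equations.
    rate = p["k0"] * math.exp(-p["E"] / (p["R"] * T)) * CA
    dCA = q / p["V"] * (p["CAf"] - CA) - rate
    dT = (q / p["V"] * (p["Tf"] - T)
          + (-p["dH"]) / (p["rho"] * p["Cp"]) * rate
          + p["UA"] / (p["V"] * p["rho"] * p["Cp"]) * (Tc - T))
    return dCA, dT
```

The returned pair can be passed to any ODE integration scheme; the closed-loop behaviour additionally depends on the controllers, which are not modeled here.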
$C_A$ and $T$ denote the outlet concentration and the reaction temperature, respectively; they are the controlled variables with nominal values. $T_c$ and $q$, representing the cooling water temperature and the feed flow rate, respectively, are chosen as manipulated variables with feedback from
Fig. 11. Fault detection using DPCA (l=1) in the CSTR process.
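The augmented data matrix on which DPCA operates can be built by stacking each sample with its predecessors. The helper below is a minimal sketch with a hypothetical name, `augment_with_lags`; it only constructs the lagged rows and leaves the PCA decomposition itself to whatever routine is in use.

```python
def augment_with_lags(samples, lag):
    """Return rows [x_k, x_{k-1}, ..., x_{k-lag}] for k = lag, ..., N - 1."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    rows = []
    for k in range(lag, len(samples)):
        row = []
        for d in range(lag + 1):
            row.extend(samples[k - d])  # append the sample delayed by d steps
        rows.append(row)
    return rows
```

Each augmented row has `(lag + 1)` times as many entries as a raw sample, which is why larger time lags raise the computational cost noted in the text. With `lag = 0` the data are returned unchanged and DPCA coincides with static PCA.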
Table 6
FDR provided by the Q index vs. time lag $l$ in DPCA.
l          1       2       3       4       5
FDR (%)    38.4    45.8    52.1    56.3    59.6
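Rates such as those in Table 6 are obtained by counting alarms before and after the fault onset. The sketch below assumes the usual definitions, i.e., FAR is the share of normal samples exceeding the control limit and FDR is the share of faulty samples exceeding it; the function name `far_fdr` and the 0-based `fault_start` convention are choices made for this illustration.

```python
def far_fdr(index_values, control_limit, fault_start):
    """Return (FAR, FDR) in percent.

    Samples with position < fault_start (0-based) are treated as normal,
    the remaining samples as faulty.
    """
    alarms = [v > control_limit for v in index_values]
    normal, faulty = alarms[:fault_start], alarms[fault_start:]
    far = 100.0 * sum(normal) / len(normal)
    fdr = 100.0 * sum(faulty) / len(faulty)
    return far, fdr
```

For the CSTR test set described in the text, where the fault is present from sample 501 of 1500, `fault_start` would be 500.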
proposed strategies. Fig. 12 shows the $\bar{Q}$ and $\bar{T}^2$ plots, for which the window widths are selected as 4 and 220, respectively. The incipient fault is successfully detected by $\bar{Q}$ after a very short delay, and by $\bar{T}^2$ after a relatively long delay. The $\check{Q}$ and $\check{T}^2$ plots, shown in Fig. 13, present a similar fault detection result. Comparing Figs. 12 and 13 with Figs. 10 and 11, it can be concluded that the proposed detection indices perform better and more stably than the conventional PCA- and DPCA-based detection indices. One drawback is that the FAR for $\bar{Q}$ or $\check{Q}$, though acceptable, is slightly increased. This implies that the control limits for $\bar{Q}$ and $\check{Q}$, as calculated by (19) and (29), are no longer entirely suitable. This is understandable because (19) and (29) hold under the assumption that samples are i.i.d., whereas the CSTR process exhibits autocorrelation and nonlinearity. In this case, adjusting the control limits might be necessary.
4.3. An EMU air brake process

In this part, we apply the proposed incipient fault detection strategies to an electric multiple unit (EMU) air brake system. Because an EMU runs at high speed, a reliable air brake system is critical for slowing down or stopping the running train in a timely and smooth manner upon request. The brake system studied here is the EMU brake test platform of CRRC Qingdao Sifang Rolling Stock Research Institute Co., Ltd., China, as shown in Fig. 14. This platform corresponds to one specific CRH (China Railway High-speed) EMU type and is capable of implementing the real braking process.

Fig. 14. EMU brake test bench (Ji et al., 2016).

For each EMU car, a simplified control schematic diagram of the air brake system is depicted in Fig. 15. The brake control unit (BCU), which includes electrical and mechanical components, is the core part of the air brake system. Under negative feedback control, the pre-controlled air pressure $P_{pre}$ closely follows the pressure setpoint $P_{set}$. Then, $P_{pre}$ is transformed into the output pressure $P_{out}$ via the relay valve, and $P_{out}$ is subsequently converted to the brake cylinder pressures $\{P_i, i = 1, \ldots, 4\}$. The brake cylinder pressure is finally converted to the braking force by the foundation brake rigging to achieve the brake application (Ji et al., 2016).

As a key parameter, the brake cylinder pressure $P_i$ directly affects the braking function and performance. In practice, pressure sensors are usually installed to measure $P_i$, which is monitored in real time to guarantee normal operating status. According to the experience of rail traffic engineers, measurement sensors may suffer from drift and small bias faults in practical applications because of the train's harsh operating environment, which involves vibration, shock, electromagnetic interference, etc. However, current monitoring strategies, e.g., univariate control charts, are not effective in detecting these small-magnitude sensor faults. Although these incipient faults are not serious at their initial stages, detecting them in time helps to prevent more serious faults, provides useful information for sensor maintenance and replacement, and keeps the air brake system in its optimal operating state.

Without loss of generality, the four brake cylinder pressure sensors mounted on car No. 1 of the test bench are studied. The braking level is set to 7, i.e., the highest braking level of service brake application. In total, 600 samples are collected from the test bench after steady state is reached. The first 300 samples are normal, while the last 300 samples contain a fault. The fault model complies with (7), and here we consider a general multiple sensor fault. Specifically, the fault direction $\Xi_j$ and the fault magnitude $f$ are set as
$$\Xi_j = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}^T, \qquad f = \begin{bmatrix} 1.5 \\ 1.0 \end{bmatrix} \qquad (44)$$

Fig. 12. Fault detection using PCA based $\bar{\Upsilon}$ in the CSTR process.
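The effect of the setting in (44) on a single measurement vector can be checked with a few lines of code. The sketch assumes the additive fault model $x = x^* + \Xi_j f$ (the form of (7) is not reproduced in this section, so this is an assumption), and the function name `add_fault` is hypothetical.

```python
# Fault direction and magnitude from (44); XI is 4x2, the transpose of the printed 2x4 array.
XI = [[0, 0],
      [1, 0],
      [0, 0],
      [0, 1]]
F = [1.5, 1.0]  # kPa


def add_fault(x_normal, xi=XI, f=F):
    """Return x_normal + xi @ f for one four-sensor measurement vector."""
    offset = [sum(xi[i][j] * f[j] for j in range(len(f))) for i in range(len(xi))]
    return [x + o for x, o in zip(x_normal, offset)]
```

Only the second and fourth entries of the measurement vector change, by 1.5 and 1.0 respectively, while the first and third are left untouched.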
That is, incipient bias faults are independently imposed on the second sensor $P_2$ and the fourth sensor $P_4$, with magnitudes of 1.5 and 1.0 kPa, respectively. The PCA model is employed for fault detection. In addition to Q and $T^2$, $\phi$ is also included in this example. A PCA model with one PC retained is established using normal samples. Fig. 16 shows the Q, $T^2$, and $\phi$ plots for the incipient multiple sensor fault. $T^2$ fails to detect this fault. Although Q and $\phi$ indicate some anomalies after sample 300, their FDRs are too low. Thus, the conventional Q, $T^2$, and $\phi$ indices in PCA cannot detect the incipient fault efficiently. As for the DPCA method, the time lag determined according to the design procedure proposed in Ku et al. (1995) is equal to 0. This implies that DPCA is not necessary for this example, since there is no autocorrelation among samples. The reason is that the samples are collected when the process is at steady state and the sampling frequency is low. Thus,
Fig. 13. Fault detection using PCA based Υˇ in the CSTR process.
Fig. 15. Control schematic diagram of the EMU air brake system (Ji et al., 2016).
Fig. 16. Fault detection using PCA based Υ in the EMU air brake process.

Fig. 18. Fault detection using PCA based $\check{\Upsilon}$ in the EMU air brake process.
samples are nearly independent and not correlated in time. The proposed detection indices $\bar{\Upsilon}$ and $\check{\Upsilon}$ are then tested. Note from (23) and (34) that, even for the same fault, the critical parameters ($W_c$ or $\lambda_c$) differ between detection indices, because $M$ and $\eta$ depend on the specific detection index adopted. Fig. 17 shows the fault detection result using $\bar{\Upsilon}$, i.e., $\bar{Q}$, $\bar{T}^2$, and $\bar{\phi}$ within the established PCA model. The window widths used in $\bar{Q}$, $\bar{T}^2$, and $\bar{\phi}$ are 7, 55, and 9, respectively, all of which satisfy their respective sufficient detectability conditions (see Theorem 1). Indeed, Fig. 17 shows that all three indices detect the incipient fault successfully, with different amounts of delay. Similarly, Fig. 18 shows the fault detection result using $\check{Q}$, $\check{T}^2$, and $\check{\phi}$ with weighting factors equal to 0.2, 0.02, and 0.15, respectively. Likewise, the proposed $\check{\Upsilon}$ index is able to detect the incipient multiple sensor fault efficiently. To illustrate that the detection performance of $\bar{\Upsilon}$ and $\check{\Upsilon}$ relies on the smoothing parameters, their FDRs with respect to different parameter values in this industrial example are calculated and compared with
each other. Without loss of generality, we take only the $T^2$ index as an example. Tables 7 and 8 list the FDRs of $\bar{T}^2$ and $\check{T}^2$ with respect to $W$ and $\lambda$, respectively. Table 7 shows that the FDR generally improves as $W$ increases. When $W$ is equal to 1, the proposed statistic $\bar{T}^2$ reduces to the conventional $T^2$ index. For the $\check{T}^2$ index, Table 8 shows that its FDR increases as $\lambda$ decreases. Similarly, $\check{T}^2$ reduces to $T^2$ when $\lambda$ is equal to 1. Overall, the detection performance of $\bar{\Upsilon}$ and $\check{\Upsilon}$ can be enhanced to a certain degree when $W$ increases or $\lambda$ decreases. On the other hand, $W$ should not be too large and $\lambda$ should not be too small, so as to avoid serious detection delay.

Fig. 17. Fault detection using PCA based $\bar{\Upsilon}$ in the EMU air brake process.

Table 7
FDR vs. window width $W$ in $\bar{T}^2$.
W          1       3        5       10       30       55
FDR (%)    4.67    14.67    35.0    66.33    93.67    93.0

5. Conclusions

In this work, the incipient fault detection task under the MSPM framework has been addressed. Instead of investigating one specific statistical analysis method such as PCA, we analyze a generic fault detection index of quadratic form, which can represent various detection indices used in many MSPM methods. By incorporating two smoothing techniques, novel fault detection statistics based on the generic detection index are presented. Accordingly, control limits of the two proposed statistics are derived for fault detection. Fault detectability analysis shows that the two proposed statistics are more sensitive to incipient faults than the conventional fault detection index. The superiority of the proposed indices over the conventional one is also illustrated geometrically with a simple example. To demonstrate the effectiveness of the proposed statistics for incipient faults, case studies on three examples are carried out. In these examples, the proposed strategies are applied to the T-PLS and PCA models, and both process fault and multiple sensor fault types are involved. Monitoring
Proceedings of the 35th Chinese Control Conference (pp. 6668–6672). Chengdu, China. http://dx.doi.org/10.1109/ChiCC.2016.7554406.
Ji, H., He, X., Shang, J., & Zhou, D. (2016). Incipient sensor fault diagnosis using moving window reconstruction-based contribution. Industrial & Engineering Chemistry Research, 55(10), 2746–2759. http://dx.doi.org/10.1021/acs.iecr.5b03944.
Kresta, J. V., MacGregor, J. F., & Marlin, T. E. (1991). Multivariate statistical monitoring of process operating performance. The Canadian Journal of Chemical Engineering, 69(1), 35–47. http://dx.doi.org/10.1002/cjce.5450690105.
Ku, W., Storer, R. H., & Georgakis, C. (1995). Disturbance detection and isolation by dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems, 30(1), 179–196. http://dx.doi.org/10.1016/0169-7439(95)00076-3.
Lee, J. M., Yoo, C. K., & Lee, I. B. (2004). Statistical process monitoring with independent component analysis. Journal of Process Control, 14(5), 467–485. http://dx.doi.org/10.1016/j.jprocont.2003.09.004.
Li, G., Qin, S. J., Ji, Y., & Zhou, D. (2010). Reconstruction based fault prognosis for continuous processes. Control Engineering Practice, 18(10), 1211–1219. http://dx.doi.org/10.1016/j.conengprac.2010.05.012.
MacGregor, J. F., & Kourti, T. (1995). Statistical process control of multivariate processes. Control Engineering Practice, 3(3), 403–414. http://dx.doi.org/10.1016/0967-0661(95)00014-L.
Portnoy, I., Melendez, K., Pinzon, H., & Sanjuan, M. (2016). An improved weighted recursive PCA algorithm for adaptive fault detection. Control Engineering Practice, 50, 69–83. http://dx.doi.org/10.1016/j.conengprac.2016.02.010.
Qin, S. J., & Zheng, Y. (2013). Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures. AIChE Journal, 59(2), 496–504. http://dx.doi.org/10.1002/aic.13959.
Qin, S. J. (2003). Statistical process monitoring: Basics and beyond. Journal of Chemometrics, 17(8–9), 480–502. http://dx.doi.org/10.1002/cem.800.
Qin, S. J. (2012). Survey on data-driven industrial process monitoring and diagnosis. Annual Reviews in Control, 36(2), 220–234. http://dx.doi.org/10.1016/j.arcontrol.2012.09.004.
Roberts, S. W. (2000). Control chart tests based on geometric moving averages. Technometrics, 42(1), 97–101. http://dx.doi.org/10.2307/1271439.
Salsbury, T. I., & Alcala, C. F. (2017). A method for setpoint alarming using a normalized index. Control Engineering Practice, 60, 1–6. http://dx.doi.org/10.1016/j.conengprac.2016.12.002.
Shang, J., Chen, M., Ji, H., & Zhou, D. (2017). Recursive transformed component statistical analysis for incipient fault detection. Automatica, in press. http://dx.doi.org/10.1016/j.automatica.2017.02.028.
Valle, S., Li, W., & Qin, S. J. (1999). Selection of the number of principal components: The variance of the reconstruction error criterion with a comparison to other methods. Industrial & Engineering Chemistry Research, 38(11), 4389–4401. http://dx.doi.org/10.1021/ie990110i.
Venkatasubramanian, V., Rengaswamy, R., Kavuri, S. N., & Yin, K. (2003). A review of process fault detection and diagnosis, Part III: Process history based methods. Computers & Chemical Engineering, 27(3), 327–346. http://dx.doi.org/10.1016/S0098-1354(02)00162-X.
Wachs, A., & Lewin, D. R. (1999). Improved PCA methods for process disturbance and failure identification. AIChE Journal, 45(8), 1688–1700. http://dx.doi.org/10.1002/aic.690450808.
Wierda, S. J. (1994). Multivariate statistical process control: recent results and directions for future research. Statistica Neerlandica, 48(2), 147–168. http://dx.doi.org/10.1111/j.1467-9574.1994.tb01439.x.
Wise, B. M., Veltkamp, D. J., Davis, B., Ricker, N. L., & Kowalski, B. R. (1988). Principal components analysis for monitoring the West Valley liquid fed ceramic melter. Waste Management, 88, 811–818.
Yin, S., Ding, S. X., Zhang, P., Haghani, A., & Naik, A. (2011). Study on modifications of PLS approach for process monitoring. In Proceedings of the 18th World Congress of the International Federation of Automatic Control (pp. 12389–12394). Milano, Italy. http://dx.doi.org/10.3182/20110828-6-IT-1002.02876.
Yin, S., Ding, S. X., Haghani, A., Hao, H., & Zhang, P. (2012). A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. Journal of Process Control, 22(9), 1567–1581. http://dx.doi.org/10.1016/j.jprocont.2012.06.009.
Yue, H. H., & Qin, S. J. (2001). Reconstruction-based fault identification using a combined index. Industrial & Engineering Chemistry Research, 40(20), 4403–4414. http://dx.doi.org/10.1021/ie000141+.
Zhao, D., Lin, Z., & Wang, Y. (2015). Integrated state/disturbance observers for two-dimensional linear systems. IET Control Theory & Applications, 9(9), 1373–1383. http://dx.doi.org/10.1049/iet-cta.2014.1380.
Zheng, X. (2015). Incipient fault detection based on principal component analysis. M.Eng. thesis, North China Electric Power University, Beijing, China.
Zhou, D., Li, G., & Qin, S. J. (2010). Total projection to latent structures for process monitoring. AIChE Journal, 56(1), 168–178. http://dx.doi.org/10.1002/aic.11977.
Table 8
FDR vs. weighting factor $\lambda$ in $\check{T}^2$.
λ          1.0     0.5      0.3      0.2      0.1      0.02
FDR (%)    4.67    13.67    35.67    58.33    94.33    96.0
results illustrate that the proposed strategies are capable of detecting the imposed incipient faults efficiently, provided that appropriate smoothing parameters are chosen. In general, as the window width increases or the weighting factor decreases, the detection performance of the two proposed statistics is enhanced to a certain degree. On the other hand, to avoid serious detection delay, the window width should not be too large and the weighting factor should not be too small.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (61490701, 61290324, 61210012, 61473163, 61522309), and the Research Fund for the Taishan Scholar Project of Shandong Province of China.

References

Beghi, A., Brignoli, R., Cecchinato, L., Menegazzo, G., Rampazzo, M., & Simmini, F. (2016). Data-driven fault detection and diagnosis for HVAC water chillers. Control Engineering Practice, 53, 79–91. http://dx.doi.org/10.1016/j.conengprac.2016.04.018.
Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. The Annals of Mathematical Statistics, 25(2), 290–302. http://dx.doi.org/10.1214/aoms/1177728786.
Chen, T., & Sun, Y. (2009). Probabilistic contribution analysis for statistical process monitoring: A missing variable approach. Control Engineering Practice, 17(4), 469–477. http://dx.doi.org/10.1016/j.conengprac.2008.09.005.
Chen, J., Liao, C. M., Lin, F. R. J., & Lu, M. J. (2001). Principal component analysis based control charts with memory effect for process monitoring. Industrial & Engineering Chemistry Research, 40(6), 1516–1527. http://dx.doi.org/10.1021/ie000407c.
Chen, Z., Ding, S. X., Zhang, K., Li, Z., & Hu, Z. (2016). Canonical correlation analysis-based fault detection methods with application to alumina evaporation process. Control Engineering Practice, 46, 51–58. http://dx.doi.org/10.1016/j.conengprac.2015.10.006.
Chiang, L. H., Russell, E. L., & Braatz, R. D. (2001). Fault detection and diagnosis in industrial systems. London, U.K.: Springer-Verlag.
Chiang, L. H., Kotanchek, M. E., & Kordon, A. K. (2004). Fault diagnosis based on Fisher discriminant analysis and support vector machines. Computers & Chemical Engineering, 28(8), 1389–1401. http://dx.doi.org/10.1016/j.compchemeng.2003.10.002.
Dunia, R., & Qin, S. J. (1998). Subspace approach to multidimensional fault identification and reconstruction. AIChE Journal, 44(8), 1813–1831. http://dx.doi.org/10.1002/aic.690440812.
Ge, Z., Xie, L., & Song, Z. (2009). A novel statistical-based monitoring approach for complex multivariate processes. Industrial & Engineering Chemistry Research, 48(10), 4892–4898. http://dx.doi.org/10.1021/ie800935e.
Ge, Z., Song, Z., & Gao, F. (2013). Review of recent research on data-based process monitoring. Industrial & Engineering Chemistry Research, 52(10), 3543–3562. http://dx.doi.org/10.1021/ie302069q.
Harmouche, J., Delpha, C., & Diallo, D. (2014). Incipient fault detection and diagnosis based on Kullback-Leibler divergence using principal component analysis: Part I. Signal Processing, 94, 278–287. http://dx.doi.org/10.1016/j.sigpro.2013.05.018.
Harmouche, J., Delpha, C., & Diallo, D. (2015). Incipient fault detection and diagnosis based on Kullback-Leibler divergence using principal component analysis: Part II. Signal Processing, 109, 334–344. http://dx.doi.org/10.1016/j.sigpro.2014.06.023.
Huan, Z. L., Wu, X. H., & Ning, W. (2001). Design and implementation of expert system for failure diagnosis of boiler system in a thermo-power. Journal of Shandong University of Science and Technology (Natural Science), 20(1), 85–87. http://dx.doi.org/10.16452/j.cnki.sdkjzk.2001.01.023.
Ji, H., He, X., Sai, H., Tai, X., & Zhou, D. (2016). Fault detection of EMU brake cylinder. In