Anton A. Kiss, Edwin Zondervan, Richard Lakerveld, Leyla Ozkan (Eds.)
Proceedings of the 29th European Symposium on Computer Aided Process Engineering
June 16th to 19th, 2019, Eindhoven, The Netherlands. © 2019 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/B978-0-12-818634-3.50200-9
Incipient Fault Detection, Diagnosis, and Prognosis using Canonical Variate Dissimilarity Analysis

Karl Ezra S. Pilario^{a,b,*}, Yi Cao^{c} and Mahmood Shafiee^{a}

^{a} Department of Energy and Power, Cranfield University, College Road, Bedfordshire, MK43 0AL, United Kingdom
^{b} Department of Chemical Engineering, University of the Philippines, Diliman, Quezon City 1101, Philippines
^{c} College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, China
k.pilario@cranfield.ac.uk
Abstract
Industrial process monitoring deals with three main activities, namely, fault detection, fault diagnosis, and fault prognosis. Respectively, these activities seek to answer three questions: ‘Has a fault occurred?’, ‘Where did it occur and how large is it?’, and ‘How will it progress in the future?’ As opposed to abrupt faults, incipient faults are those that develop slowly in time, ultimately leading to process failure or an emergency situation. A recently developed multivariate statistical tool for the early detection of incipient faults under varying operating conditions is Canonical Variate Dissimilarity Analysis (CVDA). In CVDA, a dissimilarity-based statistical index was derived to improve detection sensitivity over the traditional canonical variate analysis (CVA) indices. This study extends the CVDA detection framework towards diagnosis and prognosis of process conditions. For diagnosis, contribution maps are used to convey the magnitude and location of the incipient fault effects, as well as their evolution in time. For prognosis, CVA state-space prediction and Kalman filtering during faulty conditions are proposed in this work. By covering the three main process monitoring activities in one framework, our work can serve as a baseline strategy for future application to large process industries.

Keywords: canonical variate analysis (CVA), incipient fault, Kalman filter (KF), dynamic process monitoring
1. Introduction
Data-driven methods for industrial process monitoring, known in general as multivariate statistical process monitoring (MSPM) techniques, have seen significant development in the last few decades (Reis and Gins, 2017). By addressing challenges such as high dimensionality, temporal correlation, nonlinearity, and non-Gaussianity in plant data, MSPM tools have become more reliable for the automated detection of various faults in the plant (Ge et al., 2013). However, from a broader perspective, fault detection must be followed by fault diagnosis and prognosis (Chiang et al., 2005). Fault diagnosis aims to determine the location and magnitude of a detected fault, while fault prognosis aims to predict its evolution in time. Indeed, Reis and Gins (2017) noted an increasing research focus on fault diagnosis and prognosis in the process industries. By developing more reliable diagnostic and prognostic tools, well-informed decisions can be made during production and maintenance operations planning, even while the fault continues to degrade the process performance. In addition, the ability to address
the aforementioned challenges in handling plant data must be present in all three process monitoring activities. Hence, it is beneficial to establish them under an integrated framework.

Among the various types of faults, incipient faults are deemed the type that most requires prognosis (Li et al., 2010). As opposed to abrupt faults, incipient faults start at small magnitudes but slowly worsen in time (Isermann, 2005). If an incipient fault is left to degrade the process, it may ultimately lead to process failure or an emergency situation (Pilario and Cao, 2017). Hence, incipient fault monitoring is an important issue that needs to be addressed by more advanced diagnostic and prognostic tools. For the sensitive detection of incipient faults under dynamically varying conditions, the Canonical Variate Dissimilarity Analysis (CVDA) method was recently developed (Pilario and Cao, 2018). This method is based on the well-known Canonical Variate Analysis (CVA) (Odiowei and Cao, 2010) but uses a dissimilarity-based statistical index for enhanced sensitivity, measuring the predictability of the hidden states of the process. In this work, the CVDA fault detection methodology is extended to fault diagnosis and prognosis. Specifically, we aim to: (i) formulate a contributions-based approach for diagnosis using the canonical variate dissimilarity index; and (ii) propose a prognostic tool using CVA state-space modelling and Kalman filtering under faulty conditions.

This paper is structured as follows. The CVDA methodology is revisited in Section 2. The proposed diagnosis and prognosis procedures are given in Section 3. Section 4 demonstrates the monitoring performance on a simulated CSTR case study. Lastly, concluding remarks are given.
2. CVDA for Fault Detection
The CVDA detection method proceeds by performing CVA to calculate the state and residual variables from the input-output process data, and then computing the statistical indices T^2, Q, and D, which serve as health indicators of the process. Given N samples of input and output data at normal conditions, denoted as u_k ∈ R^{m_u} and y_k ∈ R^{m_y} at the kth sampling instant, respectively, the past and future column vectors are formed as:

p_k = [u_{k-1}^T  u_{k-2}^T  ...  u_{k-p}^T  y_{k-1}^T  y_{k-2}^T  ...  y_{k-p}^T]^T ∈ R^{mp}   (1)
f_k = [y_k^T  y_{k+1}^T  y_{k+2}^T  ...  y_{k+f-1}^T]^T ∈ R^{m_y f}   (2)

where p and f are the numbers of lags in the past and future windows of data, respectively, and m = m_u + m_y. The numbers of lags are chosen large enough to capture the autocorrelation in the output data (Odiowei and Cao, 2010).

The sample covariance matrices are then obtained as Σ_pp = (1/(M-1)) Y_p Y_p^T, Σ_ff = (1/(M-1)) Y_f Y_f^T, and Σ_fp = (1/(M-1)) Y_f Y_p^T, where Y_p = [p_{p+1}  p_{p+2}  ...  p_{p+M}] and Y_f = [f_{p+1}  f_{p+2}  ...  f_{p+M}] are the past and future Hankel matrices, respectively, and M = N − p − f + 1.
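To make this data arrangement concrete, the following Python sketch (our own illustration, not the authors' code; names such as build_past_future are ours) forms the past and future vectors of Eqs. (1)-(2) and the sample covariance matrices from raw input-output records. It assumes the data are already centred/scaled and uses 0-based indexing, so sample indices shift by one relative to the paper.

```python
import numpy as np

def build_past_future(U, Y, p, f):
    """Stack past/future vectors per Eqs. (1)-(2).
    U: (N, mu) inputs, Y: (N, my) outputs, p/f: numbers of past/future lags."""
    N, mu = U.shape
    _, my = Y.shape
    M = N - p - f + 1                      # number of usable samples
    P = np.zeros(((mu + my) * p, M))       # past Hankel matrix Y_p
    F = np.zeros((my * f, M))              # future Hankel matrix Y_f
    for i, k in enumerate(range(p, p + M)):
        # past vector p_k: lagged inputs then lagged outputs
        past_u = U[k-1::-1][:p].reshape(-1)    # u_{k-1}, ..., u_{k-p}
        past_y = Y[k-1::-1][:p].reshape(-1)    # y_{k-1}, ..., y_{k-p}
        P[:, i] = np.concatenate([past_u, past_y])
        # future vector f_k: y_k, ..., y_{k+f-1}
        F[:, i] = Y[k:k+f].reshape(-1)
    return P, F, M

def sample_covariances(P, F, M):
    """Sample covariance matrices Sigma_pp, Sigma_ff, Sigma_fp."""
    Spp = P @ P.T / (M - 1)
    Sff = F @ F.T / (M - 1)
    Sfp = F @ P.T / (M - 1)
    return Spp, Sff, Sfp
```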
CVA aims to find vectors a ∈ R^{m_y f} and b ∈ R^{mp} such that the correlation between the linear combinations a^T f_k and b^T p_k is maximized (Odiowei and Cao, 2010). The algebraic solution is given by the singular value decomposition (SVD) of the scaled Hankel matrix:

H = Σ_ff^{-1/2} Σ_fp Σ_pp^{-1/2} = U Σ V^T   (3)

where U = [υ_1  υ_2  ...  υ_r] and V = [ν_1  ν_2  ...  ν_r] are the left and right singular matrices, Σ = diag(σ_1, σ_2, ..., σ_r) is the diagonal matrix of descending non-zero singular values, and r is the rank of H. The singular values σ_i are the maximizing solutions of Eq. (3), which are also the canonical correlations between the projected past and future data. Taking only the first n projections, corresponding to the n largest canonical correlations, the states, residuals, and dissimilarity
features at the kth sampling instant are computed as:

z_k = J_n p_k ∈ R^n   (4)
e_k = F_n p_k ∈ R^{mp}   (5)
d_k = L_n f_{k-f+1} − Σ_n J_n p_{k-f+1} ∈ R^n   (6)

where J_n = V_n^T Σ_pp^{-1/2} ∈ R^{n×mp}, F_n = (I − V_n V_n^T) Σ_pp^{-1/2} ∈ R^{mp×mp}, L_n = U_n^T Σ_ff^{-1/2} ∈ R^{n×m_y f}, V_n and U_n are the first n columns of V and U, respectively, and Σ_n is the diagonal matrix of the n largest singular values in Eq. (3).
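Continuing the sketch above (a minimal illustration under the same assumptions; the inverse square roots are computed here by eigendecomposition, and near-zero eigenvalues may need regularization in practice), the SVD of Eq. (3) and the projections of Eqs. (4)-(6) could look like:

```python
import numpy as np

def inv_sqrt(S, eps=1e-12):
    """Inverse matrix square root via symmetric eigendecomposition."""
    w, V = np.linalg.eigh(S)
    w = np.maximum(w, eps)                 # guard against numerical rank deficiency
    return V @ np.diag(w ** -0.5) @ V.T

def cva_projections(Spp, Sff, Sfp, n):
    """SVD of H = Sff^{-1/2} Sfp Spp^{-1/2} (Eq. 3) and the projection
    matrices J_n, F_n, L_n used in Eqs. (4)-(6)."""
    Spp_ih, Sff_ih = inv_sqrt(Spp), inv_sqrt(Sff)
    H = Sff_ih @ Sfp @ Spp_ih
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Un, Vn, Sn = U[:, :n], Vt[:n].T, np.diag(s[:n])
    Jn = Vn.T @ Spp_ih                                  # states:    z_k = Jn p_k
    Fn = (np.eye(Vn.shape[0]) - Vn @ Vn.T) @ Spp_ih     # residuals: e_k = Fn p_k
    Ln = Un.T @ Sff_ih                                  # used in the dissimilarity d_k
    return Jn, Fn, Ln, Sn

def cvda_features(Jn, Fn, Ln, Sn, p_vec, f_vec):
    """State, residual, and dissimilarity vectors at one time step.
    For d_k (Eq. 6), p_vec and f_vec are the vectors taken at k-f+1."""
    z = Jn @ p_vec
    e = Fn @ p_vec
    d = Ln @ f_vec - Sn @ (Jn @ p_vec)
    return z, e, d
```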
Finally, the statistical indices at the kth sampling instant are given by:

T_k^2 = z_k^T z_k   (7)
Q_k = e_k^T e_k   (8)
D_k = d_k^T (I − Σ_n^2)^{-1} d_k   (9)
which measure departures from the usual state subspace, departures from the residual subspace, and the predictability of the future states from the past states, respectively.

In the CVDA training phase, the distributions of T^2, Q, and D are estimated using kernel density estimation (KDE), without any assumption of Gaussianity in the data. Upper control limits (UCLs) are computed as the values T_UCL^2, Q_UCL, and D_UCL at which P(T^2 < T_UCL^2) = α, P(Q < Q_UCL) = α, and P(D < D_UCL) = α, respectively. Throughout this paper, a significance level of α = 99 % is adopted. During online monitoring, the process condition is deemed faulty when any of the indices exceeds its respective UCL.
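As a sketch of how the indices of Eqs. (7)-(9) and their KDE-based control limits might be computed (illustrative only; scipy's gaussian_kde is one of several possible KDE implementations, and the α-quantile is found numerically on a grid):

```python
import numpy as np
from scipy.stats import gaussian_kde

def cvda_indices(z, e, d, Sn):
    """Monitoring statistics of Eqs. (7)-(9) for one sample."""
    T2 = float(z @ z)
    Q = float(e @ e)
    W = np.linalg.inv(np.eye(Sn.shape[0]) - Sn @ Sn)   # (I - Sn^2)^{-1}
    D = float(d @ W @ d)
    return T2, Q, D

def kde_ucl(index_values, alpha=0.99):
    """Upper control limit as the alpha-quantile of a KDE fitted to the
    training values of one index (no Gaussianity assumed)."""
    kde = gaussian_kde(index_values)
    grid = np.linspace(0.0, 1.5 * np.max(index_values), 2000)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]                                     # normalize numerical CDF
    return grid[np.searchsorted(cdf, alpha)]

# Online use: a new sample is flagged as faulty when any of T2, Q, D
# exceeds its corresponding UCL computed on fault-free training data.
```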
3. CVDA for Fault Diagnosis and Prognosis
3.1. Fault Diagnosis
To achieve fault diagnosis, the value of a statistical index at time point k can be decomposed into parts contributed by each measured variable. Hence, the measured variable(s) with the largest contributions can be identified as those most associated with the fault that occurred. Incipient fault diagnosis results should be reported using contribution maps instead of the traditional contribution bar plots, so that the fault evolution and propagation across time can be visually illustrated. Contributions for the traditional CVA indices were already given by Jiang et al. (2015). In this paper, we derive the contributions for the dissimilarity index, D, as well. Altogether, the contributions of the ith measured variable (i = 1, ..., m) to statistical index J at time point k, denoted by C_{i,k}^J with J ∈ {T^2, Q, D}, are given by:

C_{i,k}^{T^2} = Σ_{j=0}^{p-1} z_k^T J_n^{(i+mj)} p_k^{(i+mj)}   (10)

C_{i,k}^{Q} = Σ_{j=0}^{p-1} e_k^T F_n^{(i+mj)} p_k^{(i+mj)}   (11)

C_{i,k}^{D} = { Σ_{j=0}^{p-1} d_k^T (I − Σ_n^2)^{-1} [ L_n^{(i−m_u+m_y j)} f_{k−f+1}^{(i−m_u+m_y j)} − Σ_n J_n^{(i+mj)} p_{k−f+1}^{(i+mj)} ],   i > m_u
             { Σ_{j=0}^{p-1} d_k^T (I − Σ_n^2)^{-1} [ −Σ_n J_n^{(i+mj)} p_{k−f+1}^{(i+mj)} ],   i ≤ m_u   (12)

where J_n^{(l)}, F_n^{(l)}, L_n^{(l)} denote the lth columns of the matrices J_n, F_n, L_n, respectively, and p_k^{(l)}, f_k^{(l)} denote the lth elements of the column vectors p_k, f_k, respectively. For Eq. (12) to apply, we assume that p = f and that the measured variables are sorted as all inputs followed by all outputs.
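The decomposition in Eqs. (10)-(12) can be sketched as follows (our own reading of the indexing, not the authors' implementation; we assume 0-based Python indices, p = f, and that each lag block of p_k stores all inputs followed by all outputs):

```python
import numpy as np

def cvda_contributions(z, e, d, p_k, p_lag, f_lag, Jn, Fn, Ln, Sn, mu, my, p):
    """Per-variable contributions to T2, Q, D at time k (Eqs. 10-12).
    p_k is the past vector at time k; p_lag and f_lag are the past/future
    vectors at time k-f+1 used by the dissimilarity d (Eq. 6)."""
    m = mu + my
    W = np.linalg.inv(np.eye(Sn.shape[0]) - Sn @ Sn)      # (I - Sn^2)^{-1}
    cT2, cQ, cD = np.zeros(m), np.zeros(m), np.zeros(m)
    for i in range(m):                                    # 0-based variable index
        for j in range(p):                                # lag index j = 0..p-1
            l = i + m * j                                 # position in p_k
            cT2[i] += (z @ Jn[:, l]) * p_k[l]             # Eq. (10)
            cQ[i]  += (e @ Fn[:, l]) * p_k[l]             # Eq. (11)
            term = -(Sn @ Jn[:, l]) * p_lag[l]            # common part of Eq. (12)
            if i >= mu:                                   # output variable: add L_n f term
                lf = (i - mu) + my * j                    # position in f_{k-f+1}
                term = Ln[:, lf] * f_lag[lf] + term
            cD[i] += d @ W @ term                         # Eq. (12)
    return cT2, cQ, cD
```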
Using Eqs. (10)-(12), contributions from a fault-free training data set are first computed. For online use, relative contributions are then obtained by removing the mean normal contribution from C_{i,k}^J and scaling by the standard deviation of the normal contributions (Deng and Tian, 2011):

C_{i,k}^{J,rel} = ( C_{i,k}^J − mean(C_{i,k}^J | normal) ) / std(C_{i,k}^J | normal)   (13)
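A small sketch of Eq. (13), where the normal-condition mean and standard deviation of each contribution are estimated from the fault-free training set (the epsilon guard is our addition to avoid division by zero):

```python
import numpy as np

def relative_contributions(C_online, C_normal, eps=1e-12):
    """Eq. (13): centre and scale online contributions by the statistics of
    the contributions on fault-free training data.
    C_online: (K, m) contributions over K monitored samples;
    C_normal: (K0, m) contributions over K0 normal training samples."""
    mu0 = C_normal.mean(axis=0)
    sd0 = C_normal.std(axis=0) + eps
    return (C_online - mu0) / sd0

# A contribution map is then the (K, m) array of relative contributions
# shown as an image, with time on one axis and measured variables on the other.
```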
3.2. Fault Prognosis
CVA is distinguished from other process monitoring methods in that it is also a system identification method (Larimore, 1990). This benefit is useful for prognosis under the CVDA framework. In CVA, the system is assumed to be represented by the following state-space model:

x_{k+1} = A x_k + B u_k + w_k   (14)
y_k = C x_k + D u_k + v_k   (15)

where A, B, C, D are the state-space matrices, w, v are the process and measurement noise, respectively, and x, y, u are the states, outputs, and inputs, respectively. After Eq. (4), CVA identification is achieved by letting x_k := z_k and performing multivariate regression to estimate Â, B̂, Ĉ, D̂ as follows (Larimore, 1990):

[Â  B̂; Ĉ  D̂] = Σ[ [x_{k+1}; y_k], [x_k; u_k] ] · Σ[ [x_k; u_k], [x_k; u_k] ]^{-1}   (16)

where Σ[· , ·] denotes the sample covariance operation. Moreover, since the noise can be recovered as ŵ_k = x_{k+1} − Â x_k − B̂ u_k and v̂_k = y_k − Ĉ x_k − D̂ u_k, the noise covariances can be computed as Q̂ = Σ[ŵ_k, ŵ_k] and R̂ = Σ[v̂_k, v̂_k]. Meanwhile, P̂_0 = Σ[x̂_k, x̂_k] is the initial state covariance.

Given a future input sequence, the estimated state-space model can be used to predict the output variables during normal operation. When a fault occurs, the state-space model must be re-trained by applying Eq. (16) to the samples obtained during the early stages of degradation (Ruiz-Cárcel et al., 2016). However, an incipient fault can bring changes in both the state-space model parameters and the states x_k themselves; re-training only corrects the model parameters. Hence, in this paper, we propose the use of Kalman filtering (KF) to correct changes in the states in conjunction with model re-training. In KF, the pertinent equations are:

x_{k|k-1} = Â x_{k-1|k-1} + B̂ u_k   (17)
P̂_{k|k-1} = Â P̂_{k-1|k-1} Â^T + Q̂   (18)
x_{k|k} = x_{k|k-1} + K_k (y_k − Ĉ x_{k|k-1})   (19)
P̂_{k|k} = (I − K_k Ĉ) P̂_{k|k-1}   (20)

where K_k = P̂_{k|k-1} Ĉ^T (Ĉ P̂_{k|k-1} Ĉ^T + R̂)^{-1} is the Kalman gain. To apply Eqs. (17)-(20), the initial states are first estimated using Eq. (4) on the first p samples. Assuming that x_k, ŵ_k, and v̂_k are Gaussian-distributed, KF is used to update x_k at every time step. Note that changes due to other incipient faults, such as drifts in the noise statistics, may be difficult to track using the Kalman filter. This is a topic for future work.
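A compact sketch of the regression in Eq. (16) and one predict/update cycle of Eqs. (17)-(20) (our own illustration, not the authors' code; zero-mean data are assumed so the (M−1) factors cancel in the covariance products, and the innovation follows Eq. (19) without a feed-through term):

```python
import numpy as np

def identify_state_space(X, U, Y):
    """Least-squares estimate of (A, B, C, D) per Eq. (16), with x_k := z_k.
    X: (n, M) states, U: (mu, M) inputs, Y: (my, M) outputs; columns are time."""
    Z = np.vstack([X[:, :-1], U[:, :-1]])            # regressors [x_k; u_k]
    T = np.vstack([X[:, 1:],  Y[:, :-1]])            # targets    [x_{k+1}; y_k]
    Theta = (T @ Z.T) @ np.linalg.pinv(Z @ Z.T)      # Sigma[T, Z] Sigma[Z, Z]^{-1}
    n = X.shape[0]
    A, B = Theta[:n, :n], Theta[:n, n:]
    C, D = Theta[n:, :n], Theta[n:, n:]
    # Residuals give the noise covariances Q, R and the initial state covariance P0
    W = X[:, 1:] - A @ X[:, :-1] - B @ U[:, :-1]
    V = Y[:, :-1] - C @ X[:, :-1] - D @ U[:, :-1]
    Q, R, P0 = np.cov(W), np.cov(V), np.cov(X)
    return A, B, C, D, Q, R, P0

def kalman_step(x, P, u, y, A, B, C, D, Q, R):
    """One predict/update cycle of Eqs. (17)-(20)."""
    x_pred = A @ x + B @ u                           # Eq. (17)
    P_pred = A @ P @ A.T + Q                         # Eq. (18)
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)   # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)            # Eq. (19)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred        # Eq. (20)
    y_pred = C @ x_pred + D @ u                      # 1-step-ahead output prediction
    return x_new, P_new, y_pred
```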
4. Case Study
To evaluate the performance of CVDA for fault diagnosis and prognosis, a closed-loop CSTR case study is used, which is available in (Pilario, 2017). Data are simulated from a system of 4 state equations, with 2 cascade control loops to maintain the liquid level, h, and the reactor temperature, T. The measured variables are u = [F_i, C_i, T_i, T_ci]^T and y = [h, C, T, T_c, F, F_c]^T. The input variables are perturbed every 30 sampling times to simulate disturbance changes. Using a data set of 1000 samples at normal operation, CVDA was trained to generate the projection matrices, the UCLs, and the state-space model. The number of lags and the number of states are selected as 14 and 8, respectively, chosen using autocorrelation analysis and the dominant singular values method (Odiowei and Cao, 2010).

Two incipient faults are studied in the CSTR, namely, catalyst decay (Fault 1) and heat transfer fouling (Fault 2). In the faulty data sets, both faults are slow drifts introduced after an initial 200 min of normal operation. Fig. 1(a) shows the detection results for the faulty data sets. As shown, the D index gives the earliest detection time for both faults. To locate and track the magnitude of the detected fault, contribution maps are generated using Eqs. (10)-(13) and are shown in Fig. 1(b). The variables most associated with the catalyst decay and fouling faults are shown to be the outlet concentration, C, and the outlet temperature of the coolant, T_c, respectively. These are indeed the expected results for fault diagnosis. Moreover, the maps illustrate how the fault effects propagate to other variables further in time. Notably, the D contribution maps reveal not only the fault-affected variables, but also the input variables that contributed to the dissimilarity between the past and future states. Hence, we suggest using all three contribution maps in conjunction to better capture incipient fault signatures.

Figure 1: Monitoring results for: (a) fault detection (arrows: detection time in min; red dashed: UCLs; blue solid: detection index); and (b) fault diagnosis. Top: catalyst decay; bottom: fouling. In (b), the contribution maps are for T^2, Q, and D, from left to right.

The 1-step-ahead prediction of the top 2 faulty variables in each fault case is given in Fig. 2. The estimated data are Kalman-filter predictions from a state-space model re-trained every 20 sampling times using the latest 400 samples while under faulty conditions. Table 1 gives the R^2 measure of fit of the predictions, showing that without KF the faulty variables cannot be tracked accurately. Hence, KF improves CVDA fault prognosis in this case study.
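Some of the R^2 values in Table 1 are negative; for reference, a minimal sketch of the fit measure we assume is being used (the standard coefficient of determination, which is negative whenever the prediction is worse than the mean of the measured data):

```python
import numpy as np

def r2_fit(y_true, y_pred):
    """Coefficient of determination; assumed to match the R^2 fit in Table 1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot
```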
Figure 2: Fault prognosis results for the top 2 faulty variables in: (a) the catalyst decay fault (detected at 286 min); (b) the fouling fault (detected at 273 min). See Table 1 for the R^2 fitness of all the charts.

Table 1: R^2 fitness values of the predictions of the faulty variables

Fault 1    with KF    without KF        Fault 2    with KF    without KF
C          90.57 %    −135.0 %          T_c        98.32 %    −35.3 %
F_c        94.41 %    27.30 %           F_c        96.52 %    12.93 %

5. Conclusion
In this paper, an incipient fault detection, diagnosis, and prognosis methodology is presented under an integrated framework, namely, Canonical Variate Dissimilarity Analysis (CVDA). Using a CSTR case study involving 2 parametric incipient fault scenarios, the method was shown to provide early detection, reliable diagnosis, and accurate tracking of faulty variables. Hence, the framework constitutes a baseline strategy for industrial process monitoring.
References
L. H. Chiang, E. L. Russell, R. D. Braatz, 2005. Fault Detection and Diagnosis in Industrial Systems. Springer-Verlag, London.
X. Deng, X. Tian, 2011. A new fault isolation method based on unified contribution plots. Proceedings of the 30th Chinese Control Conference, CCC 2011 (10), 4280–4285.
Z. Ge, Z. Song, F. Gao, 2013. Review of recent research on data-based process monitoring. Industrial and Engineering Chemistry Research 52, 3543–3562.
R. Isermann, 2005. Model-based fault-detection and diagnosis - status and applications. Annual Reviews in Control 29 (1), 71–85.
B. Jiang, D. Huang, X. Zhu, F. Yang, R. D. Braatz, 2015. Canonical variate analysis-based contributions for fault identification. Journal of Process Control 26, 17–25.
W. E. Larimore, 1990. Canonical variate analysis in identification, filtering, and adaptive control. Proceedings of the IEEE Conference on Decision and Control 2, 596–604.
G. Li, S. J. Qin, Y. Ji, D. Zhou, 2010. Reconstruction based fault prognosis for continuous processes. Control Engineering Practice 18 (10), 1211–1219.
P.-E. Odiowei, Y. Cao, 2010. Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimations. IEEE Transactions on Industrial Informatics 6 (1), 36–45.
K. E. Pilario, 2017. Cascade-controlled CSTR for Fault Simulation. URL https://www.mathworks.com/matlabcentral/fileexchange/65091-cascade-controlled-cstrfor-fault-simulation
K. E. Pilario, Y. Cao, 2017. Process incipient fault detection using canonical variate analysis. In: 2017 23rd International Conference on Automation and Computing (ICAC). IEEE, pp. 1–6.
K. E. S. Pilario, Y. Cao, 2018. Canonical variate dissimilarity analysis for process incipient fault detection. IEEE Transactions on Industrial Informatics.
M. Reis, G. Gins, 2017. Industrial Process Monitoring in the Big Data/Industry 4.0 Era: from Detection, to Diagnosis, to Prognosis. Processes 5 (3), 35.
C. Ruiz-Cárcel, L. Lao, Y. Cao, D. Mba, 2016. Canonical variate analysis for performance degradation under faulty conditions. Control Engineering Practice 54, 70–80.