Comparison of Two Basic Statistics for Fault Detection and Process Monitoring


Proceedings of the 20th World Congress, The International Federation of Automatic Control, Toulouse, France, July 9-14, 2017. Available online at www.sciencedirect.com

ScienceDirect

IFAC PapersOnLine 50-1 (2017) 14776–14781

Zhiwen Chen ∗,∗∗, Kai Zhang ∗∗,∗∗∗∗, Yuri A.W. Shardt ∗∗,∗∗∗, Steven X. Ding ∗∗, Xu Yang ∗∗∗∗, Chunhua Yang ∗, Tao Peng ∗

∗ School of Information Science and Engineering, Central South University, 410083 Changsha, China
∗∗ Institute for Automatic Control and Complex Systems, University of Duisburg-Essen, Bismarckstrasse 81 BB, 47057 Duisburg, Germany
∗∗∗ Department of Chemical Engineering, Waterloo, ON N2L 3G1, Canada
∗∗∗∗ University of Science and Technology Beijing, Beijing, China

Abstract: In this paper, two common statistics, the $T^2$ and the Q statistics, for fault detection and process monitoring are compared. Specifically, the geometric relationship between the $T^2$ statistic and 3 common forms of the Q statistic is analysed. Furthermore, using the false alarm rate (FAR) and the fault detection rate (FDR), the fault detection performance of both statistics is quantified and compared. The results show that, for a given significance level, the $T^2$ statistic has the best overall FDR.

Keywords: $T^2$ statistic, Q statistic, multivariate statistical fault detection, process monitoring.

1. INTRODUCTION

Over the last three decades, fault detection and process monitoring (FD-PM) techniques have received considerable attention due to the increasing demands on process safety, product quality, economic operation, and overall system reliability. The existing FD-PM methods can be subdivided into 2 different groups: model-based and data-driven methods. Compared to model-based methods, data-driven methods are relatively easy to apply to real, large-scale processes, since such methods do not require accurate process models (Gertler, 1998; Isermann, 2006; Ding, 2013). Within the area of data-driven methods, multivariate analysis (MVA)-based methods are commonly used in various industrial processes, including chemical, iron and steel, waste water treatment, and metallurgical processes (Kourti and MacGregor, 1995; Chiang et al., 2000; Qin, 2012; Ding et al., 2013; Hu et al., 2014; Chen et al., 2016a; He et al., 2015; Chen et al., 2016b; Chen, 2017). Typically, the Hotelling's $T^2$ statistic and the Q statistic, which is also known as the squared prediction error (SPE), are used for fault detection (FD).

The $T^2$ test statistic is a generalized likelihood-ratio test statistic (Basseville and Nikiforov, 1993; Ding, 2014). Thus, the Neyman-Pearson lemma states that, for a given significance level, that is, allowable false alarm rate, the $T^2$ statistic gives the best fault detectability (fault detection rate). However, when the $T^2$ statistic is unavailable, for example, when computing the inverse of the covariance matrix causes numerical problems, an alternative test statistic, the Q statistic, is widely used in MVA-based techniques, such as principal component analysis (PCA) and partial least squares (PLS). Since the distribution of the Q statistic is usually only known approximately, it is necessary to examine the geometric relationship between the $T^2$ and Q statistics in order to understand the impact of these methods on fault detection performance.

Thus, the objectives of this paper are:
• to study the relationship between the $T^2$ statistic and three common definitions of the Q statistic;
• to compare their FD performance; and
• to recommend the design of a new fault detection method that can use these two types of statistics.

Notation: The notation used in this paper is standard. $\mathbb{R}^n$ denotes the n-dimensional Euclidean space consisting of $n \times 1$ vectors with real components, $\mathbb{R}^{n \times m}$ is the set of all $n \times m$ real matrices, and eig(X) denotes the eigenvalues of matrix X. tr(X) denotes the trace of matrix X. I is an identity matrix and 0 is a zero vector. $x \sim \mathcal{N}(\mu_x, \Sigma_x)$ denotes that x is a normally distributed random vector with mean $\mu_x$ and covariance $\Sigma_x$. $\chi^2(m)$ stands for the chi-square distribution with m degrees of freedom. The quantile $\chi^2_\alpha(m)$ is defined by $\mathrm{prob}(\chi^2 > \chi^2_\alpha(m)) = \alpha$, that is, the probability that $\chi^2 > \chi^2_\alpha(m)$ equals $\alpha$ (the significance level).

$^{1}$ Chunhua Yang, Tao Peng and Zhiwen Chen would like to thank the National Natural Science Foundation of China (grant # 61490702). Xu Yang and Yuri Shardt would like to thank the National Natural Science Foundation of China (grant # 61673053) and the Beijing Natural Science Foundation (grant # 4162041) for funding.

2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2017.08.2586


2. PROBLEM FORMULATION

Let $y \in \mathbb{R}^m$ be a sample measurement of a vector of m sensors. Assume that:
• $y \sim \mathcal{N}(0, \Sigma_y)$;
• the covariance matrix $\Sigma_y$ is regular;
• the samples are independent and identically distributed (i.i.d.).

The $T^2$ statistic is defined as

$$T^2 = y^T \Sigma_y^{-1} y \sim \chi^2(m) \qquad (1)$$

where $\chi^2(m)$ is the chi-squared distribution with m degrees of freedom, while the 3 common definitions of the Q statistic are

$$Q = y^T y \sim \begin{cases} \mathrm{tr}(\Sigma_y)\,\chi^2(1) \\ \lambda_1 \chi^2(m) \\ g\,\chi^2(h) \end{cases} \qquad (2)$$

where

$$g = \frac{\mathrm{tr}(\Sigma_y \Sigma_y)}{\mathrm{tr}(\Sigma_y)} = \frac{\sum_{i=1}^{m} \lambda_i^2}{\sum_{i=1}^{m} \lambda_i}, \qquad h = \frac{[\mathrm{tr}(\Sigma_y)]^2}{\mathrm{tr}(\Sigma_y \Sigma_y)} = \frac{\left(\sum_{i=1}^{m} \lambda_i\right)^2}{\sum_{i=1}^{m} \lambda_i^2}$$

(Box, 1954), $\lambda_i$ is the ith eigenvalue of $\Sigma_y$ (i = 1, . . . , m), and $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_m$. Note that if all eigenvalues of $\Sigma_y$ are equal, that is, $\lambda_1 = \lambda_m$, then $T^2 = \lambda_1^{-1} y^T y$, which is indeed a type of Q statistic. In this paper, it is assumed that $\lambda_1 > \lambda_m$. Furthermore, let $Q_{\max}$, $Q_{tr}$ and $Q_{gh}$ be the Q statistics whose distributions are approximated by $\lambda_1 \chi^2(m)$, $\mathrm{tr}(\Sigma_y)\chi^2(1)$ and $g\chi^2(h)$, respectively. Given a significance level $\alpha$, the corresponding thresholds are determined as

$$J_{th,T^2} = \chi^2_\alpha(m), \quad J_{th,Q_{tr}} = \mathrm{tr}(\Sigma_y)\chi^2_\alpha(1), \quad J_{th,Q_{\max}} = \lambda_1 \chi^2_\alpha(m), \quad J_{th,Q_{gh}} = g\chi^2_\alpha(h)$$

where the three thresholds for Q are obtained based on modifications of the $\chi^2$-distribution with different degrees of freedom.$^{2}$ Let $J_{th,Q}$ be any of the three possible thresholds.

It can be noted that the formulation of the $T^2$ and Q statistics in (1) and (2) is different from the common formulation in the FD-PM literature, where the $T^2$ statistic is used to monitor variations in the principal component subspace and the Q statistic those in the residual subspace (Jackson and Mudholkar, 1979; Kourti and MacGregor, 1995; Wise and Gallagher, 1996; Qin, 2012). However, in this paper, the objective is to monitor the changes in the data directly without first creating a model, e.g., a PCA model. In such circumstances, there exists a relationship between these two statistics, which needs to be clarified (Zhang et al., 2017).

$^{2}$ The approximation of the Q statistic that is not based on the $\chi^2$ distribution is not considered here.

3. GEOMETRIC RELATIONSHIP BETWEEN THE $T^2$ AND Q STATISTICS

First, the definitions of false alarm rate (FAR) and fault detection rate (FDR) are introduced. Then, the geometric relationship between the two statistics is examined.

3.1 Definition of FAR and FDR

For performance evaluation of FD methods, the conventional way is to check the false alarms and the detection alarms. A false alarm is understood as an event where an alarm is triggered when there is no fault, and a detection alarm is regarded as an event where an alarm is raised when a fault actually occurs. From the probability point of view, FAR and FDR can be correspondingly defined as the probabilities of false alarms and detection alarms. Let $J$ and $J_{th}$ be the test statistic and the corresponding threshold, respectively, and let f represent the fault. The definitions of FAR and FDR are then (Ding, 2013):

Definition 1. False Alarm Rate (FAR). Given $J$ and $J_{th}$, we call the conditional probability

$$\mathrm{FAR} = \mathrm{prob}(J > J_{th} \mid f = 0) \qquad (3)$$

the false alarm rate.

Definition 2. Fault Detection Rate (FDR). Given $J$ and $J_{th}$, we call the conditional probability

$$\mathrm{FDR} = \mathrm{prob}(J > J_{th} \mid f \neq 0) \qquad (4)$$

the fault detection rate.
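Definitions 1 and 2 can be illustrated with a simple Monte Carlo estimate of the alarm frequency. The following is only a minimal sketch and not the authors' implementation. It assumes m = 2, a diagonal covariance matrix with hypothetical eigenvalues `lam1 >= lam2`, an additive fault vector, and the thresholds of $T^2$, $Q_{\max}$ and $Q_{tr}$ only, because for one and two degrees of freedom the $\chi^2$ quantiles have closed forms. All function names are illustrative.

```python
import math
import random
from statistics import NormalDist

def thresholds_2d(lam1, lam2, alpha):
    """Thresholds of T^2, Q_max and Q_tr for m = 2 (closed-form quantiles)."""
    chi2_2 = -2.0 * math.log(alpha)                         # chi2_alpha(2)
    chi2_1 = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2   # chi2_alpha(1)
    return {"T2": chi2_2, "Qmax": lam1 * chi2_2, "Qtr": (lam1 + lam2) * chi2_1}

def alarm_rates(lam1, lam2, alpha, fault=(0.0, 0.0), n=100_000, seed=0):
    """Fraction of samples with J > J_th: FAR if fault is zero, FDR otherwise."""
    rng = random.Random(seed)
    th = thresholds_2d(lam1, lam2, alpha)
    hits = {"T2": 0, "Qmax": 0, "Qtr": 0}
    for _ in range(n):
        # Assumption: Sigma_y = diag(lam1, lam2), so the components are independent.
        y1 = rng.gauss(0.0, math.sqrt(lam1)) + fault[0]
        y2 = rng.gauss(0.0, math.sqrt(lam2)) + fault[1]
        t2 = y1 * y1 / lam1 + y2 * y2 / lam2   # equation (1)
        q = y1 * y1 + y2 * y2                  # equation (2)
        hits["T2"] += t2 > th["T2"]
        hits["Qmax"] += q > th["Qmax"]
        hits["Qtr"] += q > th["Qtr"]
    return {name: count / n for name, count in hits.items()}

# Hypothetical eigenvalues and fault, chosen only for illustration.
far = alarm_rates(4.0, 1.0, alpha=0.05)
fdr = alarm_rates(4.0, 1.0, alpha=0.05, fault=(0.0, 3.0))
print("FAR:", far)
print("FDR:", fdr)
```

Because $Q \le \lambda_1 T^2$ holds for every sample, the $Q_{\max}$ test can never raise an alarm that the $T^2$ test misses, so its estimated FAR and FDR are both bounded by those of $T^2$ in this sketch.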

For the calculation of FAR and FDR, there exist numerous approaches. For example, a numerical approximation approach was given in Yin et al. (2012), which is simple to use but may have a large estimation error. A theoretical approach was proposed in Zhang et al. (2015), which is based on a probabilistic framework but is more involved for practical use. In this study, the numerical approximation approach based on randomized algorithms is used (Tempo et al., 2005), which is simple for practical use and maintains an acceptable estimation error.

3.2 Discussion of the geometric relationship between the $T^2$ and Q statistics

The quadratic form of $T^2$ is

$$T^2 \le J_{th,T^2}$$

which defines an ellipse for m = 2 or a hyperellipsoid for m ≥ 3 (Chiang et al., 2000). The lengths of the semiaxes are given by the positive square roots of the reciprocals of the eigenvalues of $(\Sigma_y \Sigma_y^T)^{-1}$; for example, the largest semiaxis is

$$\sqrt{\frac{1}{\min \mathrm{eig}\left((\Sigma_y \Sigma_y^T)^{-1}\right)}}$$

Since $\Sigma_y$ is symmetric, the lengths of the semiaxes are equal to the eigenvalues of $\Sigma_y$, with the largest semiaxis being equal to $\lambda_1$ and the smallest semiaxis to $\lambda_m$ (Fenna, 2006). Meanwhile, the quadratic form of Q is

$$Q \le J_{th,Q}$$

which defines a circle for m = 2 or a hypersphere for m ≥ 3 (Chiang et al., 2000).

In the theory of statistics, the boundary of the quadratic form (1) is given by the threshold $J_{th,T^2}$, which is determined by the given level of significance $\alpha$. In this case, the largest semiaxis is equal to $\sqrt{\lambda_1 J_{th,T^2}}$ and the smallest semiaxis to $\sqrt{\lambda_m J_{th,T^2}}$. For the quadratic form (2), the radius is equal to $\sqrt{J_{th,Q}}$.
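To make the geometry concrete, the sketch below computes the semiaxes of the $T^2$ ellipse and the radii of two Q circles for m = 2. It is an illustration under stated assumptions, not part of the original paper: the semiaxes are taken as $\sqrt{\lambda_i J_{th,T^2}}$ and the radius as $\sqrt{J_{th,Q}}$, which is what the boundary curves $y_1^2/\lambda_1 + y_2^2/\lambda_2 = J_{th,T^2}$ and $y_1^2 + y_2^2 = J_{th,Q}$ imply; the eigenvalues are hypothetical; and only $Q_{\max}$ and $Q_{tr}$ are covered, since their quantiles have closed forms for m = 2.

```python
import math
from statistics import NormalDist

def confidence_region_geometry(lam1, lam2, alpha):
    """Semiaxes of the T^2 ellipse and radii of the Q circles for m = 2."""
    chi2_2 = -2.0 * math.log(alpha)                         # chi2_alpha(2), closed form
    chi2_1 = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2   # chi2_alpha(1)
    semi_major = math.sqrt(lam1 * chi2_2)            # along the eigenvector of lam1
    semi_minor = math.sqrt(lam2 * chi2_2)            # along the eigenvector of lam2
    radius_qmax = math.sqrt(lam1 * chi2_2)           # sqrt(J_th,Qmax)
    radius_qtr = math.sqrt((lam1 + lam2) * chi2_1)   # sqrt(J_th,Qtr)
    return semi_major, semi_minor, radius_qmax, radius_qtr

# Hypothetical eigenvalues, chosen only for illustration.
oa, ob, r_max, r_tr = confidence_region_geometry(lam1=3.0, lam2=1.0, alpha=0.05)
print(f"semi-major = {oa:.3f}, semi-minor = {ob:.3f}")
print(f"radius(Qmax) = {r_max:.3f}, radius(Qtr) = {r_tr:.3f}")
```

With these values the $Q_{\max}$ circle touches the ellipse exactly at the ends of the major axis, while the $Q_{tr}$ radius lies strictly between the two semiaxes.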


Fig. 1. Confidence regions for the $T^2$ and Q statistics (axes: x, y; curves: confidence region of Q, confidence region of $T^2$; marked points: O, A to H, and the intersections Q1 to Q4)

For the sake of simplicity, consider the special case m = 2. From Fig. 1, for a given significance level $\alpha$, it can be seen that
• the ellipse defines the confidence region of $T^2$, which is the set $\{T^2 \mid T^2 \le J_{th,T^2}\}$. The x- and y-axes are represented by the eigenvectors of $\Sigma_y$.
• the circle defines the confidence region of Q, which is the set $\{Q \mid Q \le J_{th,Q}\}$.

Since both the circle and ellipse are centred on the origin, as shown in Fig. 1, there are three possible interactions between them:
• no intersections, that is, the ellipse always lies inside the circle;
• two intersections, that is, the ellipse and circle touch at two points; and
• four intersections.

As shown in Fig. 1, in the case of four intersections, they can be labelled as Q1, Q2, Q3 and Q4, which gives 5 separate regions to consider. The first region, denoted by Q1Q2Q3Q4 in Fig. 1, is the confidence region where both shapes overlap. The other 4 regions are separate regions, where one of the two statistics is larger: the upper region Q1FQ2B; the lower region Q3DQ4E; the left region Q1AQ4H; and the right region Q2CQ3G.

An FD method's performance is a function of the FAR and FDR. The smaller the FAR and the larger the FDR, the better the performance. It should be stressed that the performance of the $T^2$ and Q statistics with respect to FAR and FDR depends on the relationship between the fault information and the geometric relationship between the statistics. For the first two cases, where the ellipse lies inside the circle, the FAR of the Q statistic will be lower than that of the $T^2$ statistic. However, for the FDR, the situation will be reversed. For the third case, if a fault occurs in the upper and lower regions, the FDR of the $T^2$ statistic will be larger than that for the Q statistic, while the result is reversed when the fault is in the left and right regions along the x-axis. In order to further examine the implications of these different regions for the performance of different fault detection algorithms, there is a need to understand the conditions under which four intersections can occur for the three Q statistics.

The formulae for the boundary curves of the corresponding confidence regions are

$$\text{Ellipse:} \quad \frac{y_1^2}{\lambda_1} + \frac{y_2^2}{\lambda_2} = J_{th,T^2} \qquad (5)$$

$$\text{Circle:} \quad y_1^2 + y_2^2 = J_{th,Q} \qquad (6)$$

Let OA be the semimajor axis and OB the semiminor axis, then $OA = y_1$ and $OB = y_2$. It can be observed from Fig. 1 and formulae (5)-(6) that the condition that the circle and ellipse have four intersections, i.e., Q1, Q2, Q3 and Q4, requires that

$$OA^2 > J_{th,Q} \quad \text{and} \quad OB^2 < J_{th,Q}$$

Note that

$$OA^2 = \lambda_1 J_{th,T^2}, \qquad OB^2 = \lambda_2 J_{th,T^2}$$

Therefore, the condition reduces to

$$\lambda_1 J_{th,T^2} > J_{th,Q} \qquad (7)$$

$$\lambda_2 J_{th,T^2} < J_{th,Q} \qquad (8)$$

This can be extended to the general case for m > 2. Then, conditions (7) and (8) are reformulated as

$$\lambda_1 J_{th,T^2} > J_{th,Q} \qquad (9)$$

$$\lambda_m J_{th,T^2} < J_{th,Q} \qquad (10)$$

which can guarantee that the m-dimensional spheroid and ellipsoid have intersections. In the following, we will discuss whether the three approximations of the Q statistic satisfy conditions (9) and (10).

Results for $Q_{\max}$ statistic

In this case, $J_{th,T^2} = \chi^2_\alpha(m)$ and $J_{th,Q} = \lambda_1 \chi^2_\alpha(m)$, which gives

$$\lambda_1 J_{th,T^2} = J_{th,Q}, \qquad \lambda_m J_{th,T^2} < J_{th,Q}$$

Thus, condition (9) fails. The ellipse always lies inside the circle.

Results for $Q_{tr}$ statistic

We first consider the special case m = 2. Since $J_{th,T^2} = \chi^2_\alpha(2)$ and $J_{th,Q} = (\lambda_1 + \lambda_2)\chi^2_\alpha(1)$, condition (8) becomes

$$\lambda_2 J_{th,T^2} < J_{th,Q} \;\Rightarrow\; \lambda_2 \chi^2_\alpha(2) < (\lambda_1 + \lambda_2)\chi^2_\alpha(1) \qquad (11)$$

Since $\lambda_2 < \lambda_1$ and $\chi^2_\alpha(2) < \chi^2_\alpha(1) + \chi^2_\alpha(1)$, we have

$$\lambda_2 \chi^2_\alpha(2) < \lambda_2 \left(\chi^2_\alpha(1) + \chi^2_\alpha(1)\right) < \lambda_1 \chi^2_\alpha(1) + \lambda_2 \chi^2_\alpha(1)$$

Thus, condition (8) holds. Furthermore, condition (7) can be rewritten as

$$\lambda_1 J_{th,T^2} > J_{th,Q} \;\Rightarrow\; \lambda_1 \chi^2_\alpha(2) > (\lambda_1 + \lambda_2)\chi^2_\alpha(1) \qquad (12)$$

It is difficult to determine in general whether condition (12) holds. However, a counterexample is enough to disprove this condition. Therefore, we take $\alpha = 0.05$. Since $\chi^2_\alpha(2)/\chi^2_\alpha(1) = 1.5597$, condition (12) becomes

$$\lambda_1 \frac{\chi^2_\alpha(2)}{\chi^2_\alpha(1)} = 1.5597\,\lambda_1 > \lambda_1 + \lambda_2 \;\Rightarrow\; \frac{\lambda_1}{\lambda_2} > 1.7867$$

Hence, the circle and ellipse do not have four intersections when $1 < \lambda_1/\lambda_2 \le 1.7867$. For the other significance levels, e.g., 0.01, 0.02, 0.03 and 0.04, the conclusions are similar. In the general case m > 2, it is harder to determine whether the hyperellipsoid and hypersphere intersect, since the results depend on the trace of the covariance matrix under consideration.
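The critical eigenvalue ratio for the $Q_{tr}$ statistic can be reproduced numerically. The sketch below is an illustration rather than the authors' code. It relies on the closed forms $\chi^2_\alpha(2) = -2\ln\alpha$ and $\chi^2_\alpha(1) = z_{1-\alpha/2}^2$, where z is the standard normal quantile, and evaluates the ratio for the two 2 × 2 covariance matrices used later in Section 4.1. The helper names are hypothetical.

```python
import math
from statistics import NormalDist

def critical_ratio_qtr(alpha):
    """Return chi2_a(2)/chi2_a(1) and the lam1/lam2 bound above which (12) holds."""
    chi2_2 = -2.0 * math.log(alpha)
    chi2_1 = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    ratio = chi2_2 / chi2_1
    # lam1 * ratio > lam1 + lam2  is equivalent to  lam1 / lam2 > 1 / (ratio - 1)
    return ratio, 1.0 / (ratio - 1.0)

def eig_sym_2x2(a, b, d):
    """Eigenvalues (descending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean + half_gap, mean - half_gap

ratio, critical = critical_ratio_qtr(0.05)
print(f"chi2 ratio = {ratio:.4f}, critical lam1/lam2 = {critical:.4f}")

examples = {"Example 1": (4.0, 0.5, 3.0), "Example 2": (2.0, 0.5, 1.0)}
four_intersections = {}
for name, (a, b, d) in examples.items():
    lam1, lam2 = eig_sym_2x2(a, b, d)
    four_intersections[name] = lam1 / lam2 > critical
    print(f"{name}: lam1/lam2 = {lam1 / lam2:.4f}, "
          f"four intersections with Qtr: {four_intersections[name]}")
```

Calling `critical_ratio_qtr` with other significance levels gives the corresponding bounds for those levels.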


Algorithm 1:

Step 1: Let $\alpha$ = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08], m be any integer uniformly selected from the interval [1, 1000], n be a randomly generated integer from [2, 100] and the number of Monte Carlo experiments be 50 million.
Step 2: Generate m random eigenvalues selected uniformly from the interval [$\epsilon$, 1000], where $\epsilon$ is a very small real number.
Step 3: Run the Monte Carlo experiment to test condition (16). Record the results, where 1 represents that the inequality holds and 0 that it does not.
Step 4: Check the decision logic: if no 0 is present, inequality (16) holds; otherwise, inequality (16) does not hold.
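A scaled-down version of Algorithm 1 can be sketched as follows. This is not the authors' implementation, and several choices are assumptions made only to keep the example short and dependency-free: the tested inequality is written in the equivalent form $\lambda_m \chi^2_\alpha(m) < g\,\chi^2_\alpha(h)$, the $\chi^2$ quantile is computed by bisection on a power series for the regularised incomplete gamma function, m is drawn from [2, 50] instead of [1, 1000], only 100 trials are run instead of 50 million, and the integer n of Step 1 is not used.

```python
import math
import random

def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma P(a, x), evaluated by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_quantile(alpha, k):
    """Upper quantile x with prob(chi2(k) > x) = alpha; k may be non-integer."""
    lo, hi = 0.0, max(1.0, k)
    while 1.0 - reg_lower_gamma(0.5 * k, 0.5 * hi) > alpha:
        hi *= 2.0                      # expand until the tail is below alpha
    for _ in range(60):                # bisection on the survival function
        mid = 0.5 * (lo + hi)
        if 1.0 - reg_lower_gamma(0.5 * k, 0.5 * mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inequality_holds(eigs, alpha):
    """Test lam_m * chi2_a(m) < g * chi2_a(h) for one set of eigenvalues."""
    s1 = sum(eigs)
    s2 = sum(lam * lam for lam in eigs)
    g, h = s2 / s1, s1 * s1 / s2
    return min(eigs) * chi2_quantile(alpha, len(eigs)) < g * chi2_quantile(alpha, h)

def run_trials(n_trials=100, seed=0, eps=1e-6):
    """Steps 1 to 3: draw alpha, m and the eigenvalues, record 1 (holds) or 0."""
    rng = random.Random(seed)
    alphas = [0.01 * i for i in range(1, 9)]
    results = []
    for _ in range(n_trials):
        m = rng.randint(2, 50)
        eigs = [rng.uniform(eps, 1000.0) for _ in range(m)]
        results.append(int(inequality_holds(eigs, rng.choice(alphas))))
    return results

results = run_trials()
# Step 4: the inequality is accepted only if no 0 was recorded.
print("inequality held in all trials:", all(results))
```

A finite number of random trials can only support the inequality, not prove it, which is the same limitation that applies to the full-scale experiment.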

Results for $Q_{gh}$ statistic

First, reformulate conditions (9) and (10) as

$$\lambda_1 \chi^2_\alpha(m) > \frac{\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2}{\lambda_1 + \lambda_2 + \ldots + \lambda_m}\,\chi^2_\alpha(h) \qquad (13)$$

$$\lambda_m \chi^2_\alpha(m) < \frac{\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2}{\lambda_1 + \lambda_2 + \ldots + \lambda_m}\,\chi^2_\alpha(h) \qquad (14)$$

It can be shown that 1 < h < m due to the fact that

$$\frac{1}{m}(\lambda_1 + \lambda_2 + \ldots + \lambda_m)^2 < \lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2 \qquad (15)$$

Let m = h + a, a ≥ 0, then $\chi^2_\alpha(m) \le \chi^2_\alpha(h) + \chi^2_\alpha(a)$. Condition (13) can be easily obtained from

$$\lambda_1 \chi^2_\alpha(m) > \frac{\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2}{\lambda_1 + \lambda_2 + \ldots + \lambda_m}\,\chi^2_\alpha(h) \;\Rightarrow\; \lambda_1 (\lambda_1 + \lambda_2 + \ldots + \lambda_m)\chi^2_\alpha(m) > (\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2)\chi^2_\alpha(h)$$

since $\lambda_1 (\lambda_1 + \lambda_2 + \ldots + \lambda_m) > \lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2$ and $\chi^2_\alpha(m) > \chi^2_\alpha(h)$ because h < m.

From (14), we have

$$\lambda_m \chi^2_\alpha(m) < \frac{\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2}{\lambda_1 + \lambda_2 + \ldots + \lambda_m}\,\chi^2_\alpha(h) \;\Rightarrow\; \frac{\lambda_1 \lambda_m + \lambda_2 \lambda_m + \ldots + \lambda_m^2}{\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_m^2} < \frac{\chi^2_\alpha(h)}{\chi^2_\alpha(m)} \;\Rightarrow\; h\,\frac{\lambda_m}{\lambda_1 + \lambda_2 + \ldots + \lambda_m} < \frac{\chi^2_\alpha(h)}{\chi^2_\alpha(m)} \qquad (16)$$

The remaining problem is to show that the above inequality holds.

Since there is no known analytical solution for $\chi^2_\alpha(m)$, it is hard to obtain a proof for (16). Thus, Algorithm 1 is designed to test this inequality. The result based on this algorithm suggests that inequality (16) holds.

From the above results, only the third approximate distribution for the Q statistic guarantees that the circle and ellipse intersect at four points. In this case, when information about a certain type of fault is known, the FDR of the Q statistic is higher than that for the $T^2$ statistic. In practice, fault information is unknown and the inverse of the covariance matrix can be easily computed, so that the $T^2$ statistic is preferred since it will give a higher FDR. This point is demonstrated by a numerical example in the next section.

Fig. 2. Confidence regions for the $T^2$ and Q statistics in Example 1 (axes: $y_1$, $y_2$; legend: data, $\chi^2_\alpha(m)$, $\mathrm{tr}(\Sigma_y)\chi^2_\alpha(1)$, $g\chi^2_\alpha(h)$, $\lambda_1\chi^2_\alpha(m)$)

4. NUMERICAL EXAMPLE

Fig. 3. Confidence regions for the T 2 and Q statistics in Example 2 4.1 Example for the geometric relationship In order to validate the results in Section 3, the significance level is set to be 0.05. Two numerical examples are used:   4 0.5 • Example 1. Let Σy = , then λλ12 = 1.5064 < 0.5 3 1.7867. Fig. 2 shows that only the gχ2α (h)-based confidence region has four intersections with the χ2α (m)based one.   2 0.5 • Example 2. Let Σy = , then λλ12 = 2.7836 > 0.5 1 1.7867. Fig. 3 shows that both the gχ2α (h)- and tr(Σy )χ2α (1)-based confidence regions have four intersections with the one spanned by χ2α (m). The above two examples are consistent with the results in Section 3. 4.2 Example for performance evaluation

Two numerical examples are provided in this section. The first one aims to show the results in the last section. The second one is used to compare the performance of the T 2 and Q statistics.

We now discuss the FD performance of the T 2 statistic and three cases of the Q statistic with respect to FAR and FDR.
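The eigenvalue ratios and intersection counts quoted for Examples 1 and 2 in Section 4.1 can be cross-checked with a short script. The sketch below is not the paper's Algorithm 1, which is not reproduced in this section. It assumes that a circular Q region meets the $T^2$ ellipse in four points exactly when its threshold J satisfies $\lambda_m\chi^2_\alpha(m) < J < \lambda_1\chi^2_\alpha(m)$, takes the thresholds $\lambda_1\chi^2_\alpha(m)$, $\mathrm{tr}(\Sigma_y)\chi^2_\alpha(1)$ and $g\chi^2_\alpha(h)$ from the figure legends, and infers $g = \sum\lambda_i^2/\sum\lambda_i$ and $h = (\sum\lambda_i)^2/\sum\lambda_i^2$ from (13)-(15). All function names are hypothetical.

```python
import math


def gamma_p(a, x):
    """Regularized lower incomplete gamma P(a, x), evaluated by power series."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / a
    denom = a
    for _ in range(1000):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_upper_quantile(alpha, dof):
    """c such that P(chi2(dof) > c) = alpha, by bisection; dof may be non-integer."""
    lo, hi = 0.0, 1.0
    while gamma_p(dof / 2.0, hi / 2.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_p(dof / 2.0, mid / 2.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eig_sym_2x2(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    centre = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return centre + radius, centre - radius


def trace_ratio_threshold(alpha=0.05):
    """Ratio lam1/lam2 above which tr(Sigma)*chi2(1) < lam1*chi2(2), for m = 2."""
    c1, c2 = chi2_upper_quantile(alpha, 1), chi2_upper_quantile(alpha, 2)
    return c1 / (c2 - c1)


def crossing_report(a, b, d, alpha=0.05):
    """Return lam1/lam2 and, per Q threshold, whether four intersections occur."""
    lam1, lam2 = eig_sym_2x2(a, b, d)
    trace = lam1 + lam2
    sum_sq = lam1 ** 2 + lam2 ** 2
    g, h = sum_sq / trace, trace ** 2 / sum_sq  # inferred from (13)-(15)
    c_m = chi2_upper_quantile(alpha, 2)
    thresholds = {
        "Qmax": lam1 * c_m,
        "Qtr": trace * chi2_upper_quantile(alpha, 1),
        "Qgh": g * chi2_upper_quantile(alpha, h),
    }
    # Assumed criterion: the circle radius lies strictly between the semi-axes.
    crosses = {k: lam2 * c_m < j < lam1 * c_m for k, j in thresholds.items()}
    return lam1 / lam2, crosses


if __name__ == "__main__":
    print("ratio threshold:", round(trace_ratio_threshold(), 4))
    for name, entries in {"Example 1": (4.0, 0.5, 3.0), "Example 2": (2.0, 0.5, 1.0)}.items():
        ratio, crosses = crossing_report(*entries)
        print(name, round(ratio, 4), crosses)
```

The chi-square quantile is computed from a series for the incomplete gamma function so that the non-integer degrees of freedom h can be handled without third-party libraries; a statistics package would normally be used instead.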


Table 1. FAR for fault-free scenario

α      FAR_e   Q_max          Q_tr     Q_gh     T^2
0.01   0.01    0.70×10^-3     0.0011   0.0121   0.0100
0.02   0.02    57.58×10^-3    0.0023   0.0213   0.0200
0.03   0.03    0.0013         0.0025   0.0312   0.0300
0.04   0.04    0.0011         0.0111   0.0418   0.0400
0.05   0.05    0.0014         0.1498   0.0499   0.0501
0.06   0.06    0.0024         0.1387   0.0600   0.0593
0.07   0.07    0.0025         0.0248   0.0688   0.0693
0.08   0.08    0.0029         0.0368   0.0778   0.0799
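A fault-free Monte Carlo run of the kind summarized in Table 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab code: the eigenvalues are hypothetical values inside the interval [0.1, 100] used in the paper, the data are simulated directly in the eigenvector basis of $\Sigma_y$ (both $T^2 = y^T\Sigma_y^{-1}y$ and $Q = y^Ty$ are unchanged by that rotation), and the Q thresholds $\lambda_1\chi^2_\alpha(m)$, $\mathrm{tr}(\Sigma_y)\chi^2_\alpha(1)$ and $g\chi^2_\alpha(h)$ are taken from the figure legends, with g and h inferred from (13)-(15).

```python
import math
import random


def gamma_p(a, x):
    """Regularized lower incomplete gamma P(a, x), evaluated by power series."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / a
    denom = a
    for _ in range(1000):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_upper_quantile(alpha, dof):
    """c such that P(chi2(dof) > c) = alpha, by bisection; dof may be non-integer."""
    lo, hi = 0.0, 1.0
    while gamma_p(dof / 2.0, hi / 2.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_p(dof / 2.0, mid / 2.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_far(eigenvalues, alpha=0.05, n_samples=20000, seed=0):
    """Monte Carlo false alarm rates of T2, Qmax, Qtr and Qgh on fault-free data."""
    rng = random.Random(seed)
    m = len(eigenvalues)
    trace = sum(eigenvalues)
    sum_sq = sum(lam * lam for lam in eigenvalues)
    g, h = sum_sq / trace, trace ** 2 / sum_sq  # inferred from (13)-(15)
    thresholds = {
        "T2": chi2_upper_quantile(alpha, m),
        "Qmax": max(eigenvalues) * chi2_upper_quantile(alpha, m),
        "Qtr": trace * chi2_upper_quantile(alpha, 1),
        "Qgh": g * chi2_upper_quantile(alpha, h),
    }
    alarms = dict.fromkeys(thresholds, 0)
    for _ in range(n_samples):
        # In the eigenvector basis y_i = sqrt(lam_i) * z_i with z_i ~ N(0, 1).
        z_sq = [rng.gauss(0.0, 1.0) ** 2 for _ in range(m)]
        t2 = sum(z_sq)                                          # y' Sigma^-1 y
        q = sum(lam * v for lam, v in zip(eigenvalues, z_sq))   # y' y
        alarms["T2"] += t2 > thresholds["T2"]
        for name in ("Qmax", "Qtr", "Qgh"):
            alarms[name] += q > thresholds[name]
    return {name: count / n_samples for name, count in alarms.items()}


if __name__ == "__main__":
    # Hypothetical eigenvalues for m = 5, chosen only for illustration.
    print(estimate_far((60.0, 25.0, 8.0, 1.5, 0.2), alpha=0.05))
```

Averaging the returned rates over several eigenvalue sets with equal weights would mirror the equal-probability assumption described for the total FAR.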

Table 2. FDR for fault scenario (%)

Ξ         f   Q_max   Q_tr    Q_gh    T^2
unknown   3   1.55    3.08    13.29   25.03
unknown   5   7.02    11.73   32.96   56.65
unknown   8   19.59   30.51   66.59   87.11
known     3   2.34    4.16    14.69   10.04
known     5   9.05    13.89   33.71   21.70
known     8   36.12   45.92   70.93   52.94

In the first part, the FAR performance is compared. Let m = 5; ten covariance matrices are then generated,³ with their eigenvalues constrained to lie between 0.1 and 100. Furthermore, 40,000 fault-free samples are generated based on these covariance matrices. For each covariance matrix, a Monte Carlo experiment is run 20,000 times (Tempo et al., 2005). The FAR is estimated as the mean value of the false alarms. Each covariance matrix is assumed to be equally probable, from which the total FAR is obtained. This comparison is conducted at eight different significance levels, that is, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, and 0.08. Table 1 shows the results. In the second column, the expected FAR is equal to the significance level. It can be seen from the third and fourth columns that the $Q_{\max}$ statistic has the smallest FAR, followed by the $Q_{tr}$ statistic. Both of them are much smaller than the expected FAR. From the last two columns, we can see that the FARs of the $Q_{gh}$ and $T^2$ statistics are around the expected FAR. Furthermore, for the first four significance levels, the FAR of the $Q_{gh}$ statistic is higher than that of the $T^2$ statistic. For the last four significance levels, the FAR of the $Q_{gh}$ statistic is smaller than that of the $T^2$ statistic. Across the eight cases, the FAR values of the $T^2$ statistic are closer to the corresponding expected FAR values because it follows a standard distribution. The FAR values of the $Q_{gh}$ statistic deviate a little from the expected FAR values, because its threshold rests on an approximate distribution.

³ The Matlab code used to generate the covariance matrix was provided by Sihan Yu.

In the second part, we study the detection performance. Assume that the fault structure is known and can be modelled as

$$y_f = y + \Xi f \qquad (17)$$

where $\Xi$ represents the direction of the fault and $f$ the magnitude of the fault. In order to remove the influence of $\Xi$ on the fault magnitude, the fault direction is scaled to unit length, that is, $\Xi / \|\Xi\|$. Three fault magnitudes are considered, namely, 3, 5 and 8. In this part, two cases are considered as follows:

• In the first faulty case, the fault direction is assumed to be unknown and is randomly generated. Eight directions are considered, each with the same probability.
• In the second faulty case, the fault direction $\Xi$ is assumed to be known a priori and is set to be the first eigenvector of $\Sigma_y$, where

$$\Sigma_y = \begin{bmatrix} 4.86 & 0.41 & 0.51 & -1.05 & -0.57 \\ 0.41 & 5.98 & -1.56 & -1.73 & 2.09 \\ 0.51 & -1.56 & 2.79 & -1.10 & -1.62 \\ -1.05 & -1.73 & -1.10 & 2.60 & 0.70 \\ -0.57 & 2.09 & -1.62 & 0.70 & 5.10 \end{bmatrix}$$
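The known-direction case can be sketched in code as a rough illustration of fault model (17). This is a simplified stand-in for the experiment, not the authors' implementation: it evaluates only the $T^2$ and $Q_{gh}$ statistics, finds the first eigenvector by power iteration, uses 5,000 rather than 20,000 runs, and assumes $g = \mathrm{tr}(\Sigma_y^2)/\mathrm{tr}(\Sigma_y)$ and $h = \mathrm{tr}(\Sigma_y)^2/\mathrm{tr}(\Sigma_y^2)$ as inferred from (13)-(15). All function names are hypothetical.

```python
import math
import random

# Covariance matrix of the known-direction case, as printed in the text.
SIGMA_Y = [
    [4.86, 0.41, 0.51, -1.05, -0.57],
    [0.41, 5.98, -1.56, -1.73, 2.09],
    [0.51, -1.56, 2.79, -1.10, -1.62],
    [-1.05, -1.73, -1.10, 2.60, 0.70],
    [-0.57, 2.09, -1.62, 0.70, 5.10],
]


def gamma_p(a, x):
    """Regularized lower incomplete gamma P(a, x), evaluated by power series."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / a
    denom = a
    for _ in range(1000):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_upper_quantile(alpha, dof):
    """c such that P(chi2(dof) > c) = alpha, by bisection; dof may be non-integer."""
    lo, hi = 0.0, 1.0
    while gamma_p(dof / 2.0, hi / 2.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_p(dof / 2.0, mid / 2.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cholesky(a):
    """Lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def leading_eigenvector(a, iters=500):
    """Unit eigenvector of the largest eigenvalue, by power iteration."""
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def estimate_fdr(sigma, direction, magnitude, alpha=0.05, n_samples=5000, seed=0):
    """Monte Carlo detection rates of T2 and Qgh for y_f = y + Xi * f."""
    rng = random.Random(seed)
    m = len(sigma)
    low = cholesky(sigma)
    norm = math.sqrt(sum(x * x for x in direction))
    xi = [x / norm for x in direction]                  # unit-length direction
    trace = sum(sigma[i][i] for i in range(m))          # sum of eigenvalues
    sum_sq = sum(x * x for row in sigma for x in row)   # sum of squared eigenvalues
    j_t2 = chi2_upper_quantile(alpha, m)
    j_gh = (sum_sq / trace) * chi2_upper_quantile(alpha, trace ** 2 / sum_sq)
    hits_t2 = hits_gh = 0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]
        y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(m)]
        yf = [y[i] + xi[i] * magnitude for i in range(m)]
        w = []  # forward substitution L w = yf, so T2 = w'w = yf' Sigma^-1 yf
        for i in range(m):
            w.append((yf[i] - sum(low[i][k] * w[k] for k in range(i))) / low[i][i])
        hits_t2 += sum(x * x for x in w) > j_t2
        hits_gh += sum(x * x for x in yf) > j_gh
    return {"T2": hits_t2 / n_samples, "Qgh": hits_gh / n_samples}


if __name__ == "__main__":
    xi_known = leading_eigenvector(SIGMA_Y)
    for f in (3.0, 5.0, 8.0):
        print(f, estimate_fdr(SIGMA_Y, xi_known, f))
```

Passing a randomly drawn direction instead of `leading_eigenvector(SIGMA_Y)` would correspond to the unknown-direction case, and a magnitude of zero reduces the estimate to the false alarm rate.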

For each fault magnitude, the Monte Carlo experiment is run 20,000 times. Table 2 shows the results for FDR. In both cases, the $Q_{\max}$ statistic has the smallest FDR, followed by the $Q_{tr}$ statistic. In the first case, the $T^2$ statistic has the largest FDR. In the second case, the $Q_{gh}$ statistic has the largest FDR when the fault is along the direction of the first eigenvector, which represents the largest semi-axis. Based on the performance of the four statistics in the fault-free and fault scenarios, it is clear that the lower FARs of the $Q_{\max}$ and $Q_{tr}$ statistics come at the cost of lower FDR. The $T^2$ statistic leads to a satisfactory trade-off between the FAR and the FDR. If the $T^2$ statistic is inaccurate due to numerical problems, the $Q_{gh}$ statistic is a sound alternative.

Remark 1. Note that the $T^2$ statistic in this paper is constructed assuming that the fault structure is unknown. If the fault structure is known, then the $T^2$ statistic can be reconstructed by taking this information into account (Ding, 2014).

5. CONCLUDING REMARKS

This paper has focused on comparing the $T^2$ and Q statistics for fault detection. The geometric relationship between the $T^2$ statistic and three cases of the Q statistic, i.e., $Q_{\max}$, $Q_{tr}$ and $Q_{gh}$, has been discussed. It has been found that the geometric relationship is closely related to detection performance. Only the confidence region given by the $Q_{gh}$ statistic is guaranteed to have four intersections with the one given by the $T^2$ statistic. As a consequence, the $Q_{\max}$ and $Q_{tr}$ statistics have undesirable FAR performance, and the FDR given by the $Q_{gh}$ statistic may be higher than that of the $T^2$ statistic in some special cases. This point has been verified by numerical examples. The results have shown that the $T^2$ statistic has an acceptable FAR performance and the highest FDR when the fault structure is unknown. The FAR of the $Q_{gh}$ statistic deviates relatively little from the expected value compared with the $Q_{\max}$ and $Q_{tr}$ statistics. In addition, at the significance level of 0.05, the $Q_{gh}$ statistic has the smallest deviation. In the special faulty case, the $Q_{gh}$ statistic has the highest FDR, even better than $T^2$. Finally, it can be concluded that the $T^2$ statistic is the better choice for fault detection; if it is unavailable, the alternative should be the $Q_{gh}$ statistic. Future work will consider:

• studying further approximations to the distribution of the Q test statistic;
• applying the two test statistics to benchmark processes and real data sets.

REFERENCES

M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes: Theory and Application. Prentice-Hall, New York, 1993.
G. E. P. Box. Some theorems on quadratic forms applied in the study of analysis of variance problems: Effect of inequality of variance in one-way classification. Annals of Mathematical Statistics, 25:290–302, 1954.
Z. W. Chen. Data-Driven Fault Detection for Industrial Processes: Canonical Correlation Analysis and Projection Based Methods. Springer Vieweg, Wiesbaden, 2017.
Z. W. Chen, S. X. Ding, K. Zhang, Z. B. Li, and Z. K. Hu. Canonical correlation analysis-based fault detection methods with application to alumina evaporation process. Control Engineering Practice, 46:51–58, 2016a.
Z. W. Chen, K. Zhang, S. X. Ding, Y. A. W. Shardt, and Z. K. Hu. Improved canonical correlation analysis-based fault detection methods for industrial processes. Journal of Process Control, 41:26–34, 2016b.
L. H. Chiang, E. L. Russell, and R. D. Braatz. Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis. Chemometrics and Intelligent Laboratory Systems, 50:243–252, 2000.
S. X. Ding. Model-Based Fault Diagnosis Techniques: Design Schemes, Algorithms and Tools (2nd ed.). Springer-Verlag, London, 2013.
S. X. Ding. Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems. Springer-Verlag, London, 2014.
S. X. Ding, S. Yin, K. X. Peng, H. Y. Hao, and B. Shen. A novel scheme for key performance indicator prediction and diagnosis with application to an industrial hot strip mill. IEEE Transactions on Industrial Informatics, 9(4):2239–2247, 2013.
D. Fenna. Cartographic Science: A Compendium of Map Projections, with Derivations. CRC, 2006.
J. J. Gertler. Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker, New York, 1998.
Z. M. He, H. Y. Zhou, J. Q. Wang, Z. W. Chen, D. Y. Wang, and Y. Xing. An improved detection statistic for monitoring the nonstationary and nonlinear processes. Chemometrics and Intelligent Laboratory Systems, 145:114–124, 2015.
Z. K. Hu, Z. W. Chen, W. H. Gui, and B. Jiang. Adaptive PCA based fault diagnosis scheme in imperial smelting process. ISA Transactions, 53(5):1446–1455, 2014.
R. Isermann. Fault Diagnosis Systems. Springer-Verlag, London, 2006.
J. E. Jackson and G. S. Mudholkar. Control procedures for residuals associated with principal component analysis. Technometrics, 21:341–349, 1979.
T. Kourti and J. F. MacGregor. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemometrics and Intelligent Laboratory Systems, 28(1):3–21, 1995.
S. J. Qin. Survey on data-driven industrial process monitoring and diagnosis. Annual Reviews in Control, 36(2):220–234, 2012.
R. Tempo, G. Calafiore, and F. Dabbene. Randomized Algorithms for Analysis and Control of Uncertain Systems. Springer, 2005.
B. M. Wise and N. B. Gallagher. The process chemometrics approach to process monitoring and fault detection. Journal of Process Control, 6(6):329–348, 1996.
S. Yin, S. X. Ding, A. Haghani, H. Y. Hao, and P. Zhang. A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. Journal of Process Control, 22:1567–1581, 2012.
K. Zhang, H. Y. Hao, Z. W. Chen, S. X. Ding, and K. X. Peng. A comparison and evaluation of key performance indicator-based multivariate statistics process monitoring approaches. Journal of Process Control, 33:112–126, 2015.
K. Zhang, S. X. Ding, Y. A. W. Shardt, Z. W. Chen, and K. X. Peng. Assessment of T2- and Q-statistics for detecting additive and multiplicative faults in multivariate statistical process monitoring. Journal of the Franklin Institute, 354(2):668–688, 2017.
