9th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes
September 2-4, 2015. Arts et Métiers ParisTech, Paris, France
Available online at www.sciencedirect.com
ScienceDirect
IFAC-PapersOnLine 48-21 (2015) 1402–1407

Fault Detection and Isolation Using Interval Principal Component Analysis Methods

Tarek AIT IZEM, Wafa BOUGHELOUM, Mohamed Faouzi HARKAT, Messaoud DJEGHABA

Badji-Mokhtar, Annaba University, Department of Electronics, P.O.Box 12, 23000 (e-mail: [email protected], [email protected], [email protected], [email protected])

Abstract: Principal component analysis (PCA) is a commonly used approach to process monitoring. However, it was developed for single-valued (singleton) variables, and in many real-life cases this leads to a severe loss of information, which can be overcome by introducing the interval notion. The present paper studies fault detection and isolation (FDI) of uncertain processes using interval PCA. Interval data are generated according to various models, and the FDI procedure is carried out using the reconstruction principle, in a new interval form, for three interval PCA methods: Vertices PCA, Centers PCA, and Midpoints-Radii PCA. A comparison is presented that reports the conditions under which each method performs best for FDI.

Keywords: Principal Component Analysis, Interval Data, Reconstruction Principle, Fault detection and isolation.

1. INTRODUCTION

Principal component analysis (PCA) is a well-known method in multivariate analysis that can represent the main tendency of observed data; it has found application in various fields of engineering. PCA is widely used in the diagnosis and monitoring of processes for the detection of aberrant information by exploiting linear or quasi-linear dependencies between the observed data.

More precisely, PCA yields an implicit model of the system to be monitored by estimating its structure and its parameters through an eigendecomposition of the covariance matrix of the data, where the principal components define the new reduced space of data representation. Once the PCA model has been obtained using data collected from a normally operating system, faults can be detected by generating residuals from a comparison between the observed data and those given by the PCA model.

Diagnosis techniques, and more specifically those based on statistical methods like PCA, have been developed for the analysis of single-valued variables. However, in real life there are many situations in which the use of such variables may cause a severe loss of information. In this case, more complete information can be achieved by describing a set of statistical units in terms of interval data. There are, however, almost no adaptations of interval PCA for diagnosis purposes.

Several extensions of PCA to interval-valued data exist in the literature. Cazes et al. (1997) and Chouakria et al. (1998) proposed the first adaptations, known as the centers PCA (C-PCA) and the vertices PCA (V-PCA) methods. The centers method computes the principal components (PCs) using the interval centers, whereas the vertices method computes the PCs using the vertices of the observed hyper-rectangles.

Another interval PCA method is the midpoints-radii PCA (MR-PCA) introduced by Palumbo and Lauro (2003), which treats the midpoint and the interval range as two separate variables in order to enhance C-PCA by incorporating the radius. D'Urso and Giordani (2004) introduced an alternative approach using least squares for MR-PCA. In contrast to these methods, Gioia and Lauro (2006) put forward an analytical interval PCA based on an interval-valued covariance matrix, and Le-Rademacher and Billard (2012) employed symbolic covariance to extend classical PCA.

This paper proposes the use of three interval PCA methods (V-PCA, C-PCA and MR-PCA) in order to obtain an interval PCA model for diagnosis purposes. The reconstruction principle is used for fault detection and isolation; a new extension of this principle is, however, introduced in order to take into account the interval nature of the data. A comparison between the interval PCA models used, in terms of good detection and isolation, is presented to identify the method best suited for diagnosis.

2. PRINCIPAL COMPONENT ANALYSIS

Principal component analysis is a vector space transformation often used to transform a multivariable space into a subspace which preserves the maximum variance of the original space in a minimum number of dimensions. The measured process variables are usually correlated with each other. PCA can be defined as a linear transformation of the original correlated data into a new set of uncorrelated data that explain the trend of the process.

Consider a data matrix $X \in \mathbb{R}^{n \times m}$ containing $n$ samples of $m$ process variables collected under normal operation. This matrix must be normalized to zero mean and unit variance using the scale parameter vectors of mean and variance. PCA determines an optimal linear transformation of the data matrix $X$ in terms of capturing the variation in the data:
Copyright © 2015 IFAC
2405-8963 © 2015, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2015.09.721
SAFEPROCESS 2015 September 2-4, 2015. Paris, France
Tarek AIT IZEM et al. / IFAC-PapersOnLine 48-21 (2015) 1402–1407
$$T = XP \quad \text{and} \quad X = TP^{T} \tag{1}$$

with $T \in \mathbb{R}^{n \times m}$ being the principal component matrix. The matrix $P \in \mathbb{R}^{m \times m}$ contains the principal vectors, which are the eigenvectors associated with the eigenvalues $\lambda_i$ of the covariance matrix (or correlation matrix) $\Sigma$ of $X$:

$$\Sigma = P \Lambda P^{T} \quad \text{with} \quad PP^{T} = P^{T}P = I_m \tag{2}$$

where $\Lambda$ is a diagonal matrix that contains on its diagonal the eigenvalues of $\Sigma$ sorted in decreasing order. Once the number of components $l$ ($l < m$) has been determined, the eigenvector matrix $P$ can be partitioned in the form:

$$P = [\hat{P} \;\; \tilde{P}] \tag{3}$$

Note that the smallest eigenvalues indicate the existence of linear or quasi-linear relations between the components of the data.

The transformation matrix $\hat{P} \in \mathbb{R}^{m \times l}$ is generated by choosing the $l$ eigenvectors (columns of $P$) corresponding to the $l$ principal eigenvalues. The matrix $\hat{P}$ transforms the space of the measured variables into the reduced-dimension space:

$$\hat{T} = X\hat{P} \tag{4}$$

$$\hat{X} = X\hat{P}\hat{P}^{T} = X\hat{C}_l \tag{5}$$

$$\hat{C}_l = \hat{P}\hat{P}^{T} \tag{6}$$

The residual space is spanned by the matrix $\tilde{P}$ generated by choosing the last $(m-l)$ eigenvectors (columns of $P$):

$$\tilde{X} = X - \hat{X} = X(I - \hat{C}_l) \tag{7}$$

The determination of the PCA model thus comes down to an eigendecomposition of the covariance matrix and the determination of the number $l$ of components to be retained.

3. INTERVAL PRINCIPAL COMPONENT ANALYSIS

Let $X \in \mathbb{R}^{n \times m}$ be the data matrix containing $n$ samples of $m$ process variables collected under normal operation, where $x_j(i)$ is the $i$th observation of the $j$th variable.

In real life, $x_j(i)$ is only an approximate value given by the sensor and is generally tainted with uncertainties due to different factors. It is therefore more appropriate to represent such a measurement by an uncertain model. The approximation error being unknown, we suppose that its variation is bounded and can be represented by an interval of the form $[\underline{x}_{ij}, \overline{x}_{ij}]$, thus obtaining the new interval data matrix $[X]$:

$$[X] = \begin{pmatrix} [\underline{x}_{11}, \overline{x}_{11}] & \cdots & [\underline{x}_{1m}, \overline{x}_{1m}] \\ \vdots & \ddots & \vdots \\ [\underline{x}_{n1}, \overline{x}_{n1}] & \cdots & [\underline{x}_{nm}, \overline{x}_{nm}] \end{pmatrix} \tag{8}$$

The bounds $\underline{x}_{ij}$ and $\overline{x}_{ij}$ define a domain whose dimensions are representative of the uncertainties affecting $x_j(i)$.

3.1. Vertices PCA

The Vertices Principal Component Analysis (V-PCA), proposed by Cazes et al. (1997), offers the possibility to detect the underlying structure of a two-way interval-valued data set. Let $S_1, S_2, \ldots, S_n$ be $n$ objects described by $m$ interval-valued variables $X_1, X_2, \ldots, X_m$:

$$[X] = \begin{pmatrix} x_{s_1 1} & \cdots & x_{s_1 m} \\ \vdots & \ddots & \vdots \\ x_{s_n 1} & \cdots & x_{s_n m} \end{pmatrix} \tag{9}$$

where row $i$ corresponds to the object $S_i$ and $x_{s_i j} = [\underline{x}_{ij}, \overline{x}_{ij}]$ is the variable $X_j$ for the object $S_i$. An object $S_i$ can be visualized in the description space as a hyper-rectangle with $2^m$ vertices, the lengths of its sides being given by the intervals associated with each description variable. In V-PCA, each object, in an $m$-dimensional space, can be represented by a numerical data matrix $M_i$ of $2^m$ lines and $m$ columns containing the vertices of the associated hyper-rectangle. The vertices are given by all the possible combinations of the bounds of the object $S_i$. The global V-PCA data matrix $M$ (of dimension $n2^m \times m$) is obtained by concatenating the vertices matrices $M_i$ of all the objects $S_i$:

$$M = [M_1^{T} \; \cdots \; M_n^{T}]^{T} \tag{10}$$

V-PCA consists of performing PCA on (10). As for ordinary PCA, it is advisable to pre-process the data in order to avoid unwanted differences among the variables. The matrix in (10) can be pre-processed as in the standard single-valued case.

The V-PCA procedure for interval data is summarized in the following steps:

- Calculate the vertices matrix $M$ for the interval data.
- Apply classical PCA to the vertices matrix. Let $Y_1, \ldots, Y_l$ ($l < m$) be the first components of this PCA, and $\lambda_1, \ldots, \lambda_l$ their respective eigenvalues.
- Determine the new interval principal components $[Y_1], \ldots, [Y_l]$ from the numerical components $Y_1, \ldots, Y_l$ obtained.

Let $L_{S_i}$ be the set of line numbers of $M$ associated with the object $S_i$, and let $y_{kj}$ ($k \in L_{S_i}$) be the value of the $j$th numerical principal component $Y_j$ associated with the vertices of object $S_i$ and corresponding to the $k$th line of $M$. The value of the interval principal component $[Y_j]$ for object $S_i$ is given by:

$$[Y_{S_i j}] = [\underline{y}_{ij}, \overline{y}_{ij}] \tag{11}$$

where

$$\underline{y}_{ij} = \min_{k \in L_{S_i}} y_{kj}, \qquad \overline{y}_{ij} = \max_{k \in L_{S_i}} y_{kj} \tag{12}$$
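As a small worked illustration of (10) to (12), consider $m = 2$ and a single object $S_i$ described by the intervals $[1, 2]$ and $[3, 5]$. The numbers, and the unit direction $u = (1, 1)^{T}/\sqrt{2}$ used below, are hypothetical values chosen only for this example, and pre-processing is ignored. The $2^2 = 4$ vertices of the associated rectangle give

$$M_i = \begin{pmatrix} 1 & 3 \\ 1 & 5 \\ 2 & 3 \\ 2 & 5 \end{pmatrix}, \qquad M_i u = \frac{1}{\sqrt{2}} \begin{pmatrix} 4 \\ 6 \\ 5 \\ 7 \end{pmatrix}$$

so that the interval score of $S_i$ along this direction is

$$[\underline{y}_{i}, \overline{y}_{i}] = \left[ \tfrac{4}{\sqrt{2}}, \tfrac{7}{\sqrt{2}} \right] \approx [2.83, 4.95].$$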
In summary, the V-PCA procedure allows the transformation of interval data from the initial space to the principal space, and vice versa, through the vertices matrix.
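The V-PCA steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `vertices_matrix` and `vpca`, the choice of standardizing the vertices matrix with its own column means and standard deviations, and the toy data are all hypothetical.

```python
import itertools

import numpy as np


def vertices_matrix(lower, upper):
    """Stack the 2^m vertices of each of the n hyper-rectangles, as in (10).

    lower, upper: arrays of shape (n, m) holding the interval bounds.
    Returns an array of shape (n * 2^m, m); the rows of object i occupy
    the block i * 2^m ... (i + 1) * 2^m - 1.
    """
    n, m = lower.shape
    bounds = np.stack([lower, upper], axis=-1)  # shape (n, m, 2)
    # Each row of `combos` picks lower (0) or upper (1) for every variable.
    combos = np.array(list(itertools.product([0, 1], repeat=m)))  # (2^m, m)
    verts = bounds[:, np.arange(m), combos]  # shape (n, 2^m, m)
    return verts.reshape(n * 2**m, m)


def vpca(lower, upper, l):
    """Hypothetical V-PCA sketch: PCA on the vertices, then min/max per object."""
    n, m = lower.shape
    M = vertices_matrix(lower, upper)
    # Assumption: pre-process as in the single-valued case (zero mean, unit variance).
    Z = (M - M.mean(axis=0)) / M.std(axis=0)
    cov = Z.T @ Z / Z.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:l]  # keep the l largest
    P_hat = eigvec[:, order]
    # Numerical scores of all vertices, regrouped per object.
    Y = (Z @ P_hat).reshape(n, 2**m, l)
    # Interval principal components, as in (11) and (12).
    return Y.min(axis=1), Y.max(axis=1), eigval[order]


if __name__ == "__main__":
    # Toy interval data: 4 objects, 2 variables (illustrative values only).
    low = np.array([[1.0, 3.0], [2.0, 1.0], [0.0, 0.5], [4.0, 2.0]])
    up = low + np.array([[1.0, 2.0], [0.5, 0.5], [1.0, 1.0], [0.2, 0.3]])
    y_lo, y_hi, lam = vpca(low, up, 1)
    print(np.column_stack([y_lo, y_hi]))
```

Note that the vertices matrix grows as $n2^m$ rows, so this direct enumeration is only practical for a moderate number of variables.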
3.2. Centers PCA

An alternative exploratory tool for summarizing interval-valued data sets is the centers PCA (C-PCA), as proposed by Cazes et al. (1997). Similarly to V-PCA, C-PCA transforms the interval-valued data matrix in (8) into a new single-valued matrix. Specifically, the interval-valued score of the generic observation unit $i$ on the generic variable $j$ is replaced by the single-valued score

$$x^{c}_{ij} = \frac{\underline{x}_{ij} + \overline{x}_{ij}}{2} \tag{13}$$

for $i = 1, \ldots, n$ and $j = 1, \ldots, m$, that is, the midpoint or center of the interval at hand. We thus get the centers matrix

$$X^{c} = \begin{pmatrix} x^{c}_{11} & \cdots & x^{c}_{1m} \\ \vdots & \ddots & \vdots \\ x^{c}_{n1} & \cdots & x^{c}_{nm} \end{pmatrix} \tag{14}$$

In C-PCA, a PCA is performed on the matrix in (14), standardized in the classical way. The C-PCA procedure for interval data is summarized in the following steps:

- Calculate the centers matrix $X^{c}$ as in (14).
- Apply classical PCA to the centers matrix. Let $Y_1, \ldots, Y_l$ ($l < m$) be the first components of this PCA, with $\lambda_1, \ldots, \lambda_l$ and $u_1, \ldots, u_l$ their respective eigenvalues and eigenvectors.
- Determine the new interval principal components as:

$$\underline{y}_{ij} = \sum_{k=1,\, u_{kj} < 0}^{m} \overline{x}_{ik}\, u_{kj} + \sum_{k=1,\, u_{kj} > 0}^{m} \underline{x}_{ik}\, u_{kj}$$

$$\overline{y}_{ij} = \sum_{k=1,\, u_{kj} < 0}^{m} \underline{x}_{ik}\, u_{kj} + \sum_{k=1,\, u_{kj} > 0}^{m} \overline{x}_{ik}\, u_{kj} \tag{15}$$

Summing up, the centers method, analogously to the vertices method, performs a classical PCA, here on the midpoints matrix.

3.3. Midpoints-Radii PCA

The Midpoints-Radii PCA on interval-valued data, introduced by Palumbo and Lauro (2003), is resolved in terms of midpoints ($X^{c}$), midranges ($X^{r}$), and the inter-connection between midpoints and midranges. The interval matrix (8) may be expressed as:

$$[X] = [X^{c} - X^{r},\; X^{c} + X^{r}] \tag{16}$$

where

$$X^{c} = \frac{\overline{X} + \underline{X}}{2}, \qquad X^{r} = \frac{\overline{X} - \underline{X}}{2} \tag{17}$$

Two independent PCAs are carried out separately on these two matrices. The solutions are given by the following eigensystems:

$$\Sigma^{-1} X'^{c} X^{c} u^{c} = \lambda^{c} u^{c} \tag{18}$$

$$\Sigma^{-1} X'^{r} X^{r} u^{r} = \lambda^{r} u^{r} \tag{19}$$

where $\lambda^{c}, u^{c}$ and $\lambda^{r}, u^{r}$ are, respectively, the eigenvalues and eigenvectors of the two partial analyses of the midpoints and midranges matrices, and $\Sigma$ is the covariance matrix given by:

$$\Sigma = X'^{c} X^{c} + X'^{r} X^{r} + |X'^{c} X^{r}| + |X'^{r} X^{c}| \tag{20}$$

To get a logical graphical representation of the statistical units as a whole, Palumbo and Lauro (2003) proposed to represent the rotated radii coordinates on the midpoints PCs as supplementary points.

4. FAULT DETECTION AND ISOLATION USING INTERVAL PCA

In classical PCA, analysing the eigenstructure of the covariance matrix of data collected under normal operating conditions reveals linear relations among the variables. The PCA model so obtained describes the normal process behaviour; using projections onto the principal and residual spaces, unusual events are then detected by referencing the observed behaviour against this model. Several diagnosis techniques are used for FDI with PCA; in this paper, we focus on diagnosis by variable reconstruction. The variable reconstruction approach assumes that each variable may be faulty and reconstructs the assumed faulty variable from the remaining variables using the PCA model [10]. This reconstructed variable is then used to detect and isolate the faults. There are several reconstruction approaches, which lead to exactly the same solution. In the following, we define the different reconstructions for the classical and interval cases, using the interval PCA methods presented.

4.1. Diagnosis by Reconstruction of Variables

For classical PCA, the reconstruction of the $i$th variable of a vector $x(k)$ is given by:

$$\hat{x}_i(k) = \frac{\displaystyle\sum_{j=1,\, j \neq i}^{m} c_{ij}\, x_j(k)}{1 - c_{ii}} \tag{21}$$

where $c_{ij}$ are the coefficients of the matrix $\hat{C}_l$. Note that the reconstruction of the $i$th variable uses the data of all the other variables except the $i$th variable. Thus, if only this variable is faulty, its reconstruction eliminates the fault affecting it. In other words, the reconstruction obtained is a fault-independent estimate. The reconstructed data matrix can also be formulated as:

$$\hat{X}^{(i)} = G_l^{(i)} X \tag{22}$$

where $G_l^{(i)}$ acts on each measurement vector $x(k)$. The matrix $G_l^{(i)}$ is expressed in terms of $\hat{C}_l$ as:

$$G_l^{(i)} = I - \xi_i \left( \xi_i^{T} (I - \hat{C}_l)\, \xi_i \right)^{-1} \xi_i^{T} (I - \hat{C}_l) \tag{23}$$

where $\xi_i$ is the vector of the reconstruction direction, with all elements equal to 0 except the $i$th, whose value is 1. Since $\xi_i^{T}(I - \hat{C}_l)\,\xi_i = 1 - c_{ii}$, the $i$th row of (23) applied to $x(k)$ gives back (21). The notation $\hat{X}^{(i)}$ denotes the reconstruction $\hat{X}$ obtained without using the variable of rank $i$. For the interval case, the calculation of the elements of $[\hat{X}^{(i)}]$ depends on the interval PCA method used. In the following,
1404
SAFEPROCESS 2015 September 2-4, 2015. Paris, France
Tarek AIT IZEM et al. / IFAC-PapersOnLine 48-21 (2015) 1402–1407
we introduce the extension of the variable reconstruction, for the three presented interval PCA methods, using the projection matrix G l i for the first (l) PCs, as:
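The classical reconstruction of relations (21) to (23) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`pca_projector`, `reconstruction_matrix`, `reconstruct_variable`) are hypothetical, the PCA is assumed to be computed on standardized data, and the matrix in (23) is taken in the column-vector convention, so that the reconstruction of one sample x is `G @ x`.

```python
import numpy as np

def pca_projector(X, n_pc):
    """Return C_hat = P_l P_l^T from a PCA of the standardized data X (rows = samples)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    cov = np.cov(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    P = eigvecs[:, ::-1][:, :n_pc]              # loadings of the first n_pc components
    return P @ P.T

def reconstruction_matrix(C_hat, i):
    """G_l^(i) as in (23), column-vector convention: x_hat = G @ x."""
    m = C_hat.shape[0]
    xi = np.zeros((m, 1))
    xi[i, 0] = 1.0                              # reconstruction direction
    R = np.eye(m) - C_hat                       # residual projector I - C_hat
    denom = (xi.T @ R @ xi).item()              # equals 1 - c_ii
    return np.eye(m) - (xi @ xi.T @ R) / denom

def reconstruct_variable(C_hat, x, i):
    """Scalar formula (21): estimate x_i from all variables except the ith."""
    mask = np.arange(len(x)) != i
    return C_hat[i, mask] @ x[mask] / (1.0 - C_hat[i, i])
```

With samples stored as rows, the whole reconstructed matrix would be obtained as `X @ reconstruction_matrix(C_hat, i).T`. Column i of the matrix is identically zero, which is why an offset on variable i leaves the reconstruction unchanged.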
2 x1 k 0.3v1 k 1.2sin k / N cos k / 4 , v1 k N 0, 2 k / 2 N , v2 k N 0, 2 x2 k 0.8 v2 k 1.5cos k / 5 e x3 k x1 k x2 k x4 k x1 k x3 k (31) x k x k x k 2 3 5
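For V-PCA, using the vertices matrix M given in (10), the interval reconstruction of the variables is obtained from the following relations:

\[ \underline{\hat{x}}_{ij} = \min_{k \in L_{S_i}} \hat{H}_{kj}, \qquad \overline{\hat{x}}_{ij} = \max_{k \in L_{S_i}} \hat{H}_{kj} \tag{24} \]

\[ \hat{H} = M\,G_{l}^{(i)} \tag{25} \]

Analogously, the elements of $[\hat{X}^{(i)}]$ for C-PCA are obtained from the following relations:

\[
\begin{aligned}
\underline{\hat{x}}_{ij} &= \sum_{k=1,\,(G_{l}^{(i)})_{kj}<0}^{m} \overline{x}_{ik}\,\big(G_{l}^{(i)}\big)_{kj} + \sum_{k=1,\,(G_{l}^{(i)})_{kj}>0}^{m} \underline{x}_{ik}\,\big(G_{l}^{(i)}\big)_{kj} \\
\overline{\hat{x}}_{ij} &= \sum_{k=1,\,(G_{l}^{(i)})_{kj}<0}^{m} \underline{x}_{ik}\,\big(G_{l}^{(i)}\big)_{kj} + \sum_{k=1,\,(G_{l}^{(i)})_{kj}>0}^{m} \overline{x}_{ik}\,\big(G_{l}^{(i)}\big)_{kj}
\end{aligned}
\tag{26}
\]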
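The two interval rules above can be compared numerically. The sketch below is an illustration under assumptions, not the authors' code: `interval_matmul` applies the sign-split sums of (26) and `vertex_bounds` takes the min and max over the vertices of one observation, in the spirit of (24) and (25). Both function names are hypothetical, and `G` is assumed to be the m-by-m matrix that right-multiplies row-stored data. For one and the same matrix `G`, the two rules give identical bounds, since a linear map reaches its extremes at the vertices of a box; the methods in the text differ because each one builds its own PCA model.

```python
import itertools
import numpy as np

def interval_matmul(X_lo, X_hi, G):
    """Bounds of X @ G when every entry of X lies in [X_lo, X_hi] (sign-split rule)."""
    G_pos = np.clip(G, 0.0, None)   # positive coefficients keep the same bound
    G_neg = np.clip(G, None, 0.0)   # negative coefficients swap lower and upper bounds
    lo = X_lo @ G_pos + X_hi @ G_neg
    hi = X_hi @ G_pos + X_lo @ G_neg
    return lo, hi

def vertex_bounds(x_lo, x_hi, G):
    """Min and max over the 2^m vertices of one observation's hyper-rectangle."""
    m = x_lo.size
    corners = np.array([np.where(bits, x_hi, x_lo)
                        for bits in itertools.product([False, True], repeat=m)])
    H = corners @ G                 # one row per vertex, as in H = M G
    return H.min(axis=0), H.max(axis=0)
```

Vertex enumeration costs 2^m rows per observation, whereas the sign-split rule needs only two matrix products, which matters once m grows beyond a handful of variables.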
For the MR-PCA case, and by adding the rotated radii points, the reconstructed variables are given by:

\[
\begin{aligned}
\underline{\hat{X}}^{(i)} &= X^{c} G_{l}^{(i)} - X^{r} G_{l}^{(i)} \\
\overline{\hat{X}}^{(i)} &= X^{c} G_{l}^{(i)} + X^{r} G_{l}^{(i)}
\end{aligned}
\tag{27}
\]

where

\[ [\hat{X}^{(i)}] = \left[ \underline{\hat{X}}^{(i)},\; \overline{\hat{X}}^{(i)} \right] \tag{28} \]

4.2. Diagnosis by Projection of Reconstructions

It is also possible to project the reconstructions onto the residual space in order to generate residuals capable of detecting and isolating faults. This projection is given by:

\[ \tilde{X}^{(i)} = F_{l}^{(i)} X \tag{29} \]

The matrix $F_{l}^{(i)}$ is expressed in terms of $\hat{C}_l$ as:

\[ F_{l}^{(i)} = (I - \hat{C}_{l})\, G_{l}^{(i)} \tag{30} \]

For interval data, the new projection of the reconstructed variables is obtained, for the last (m-l) components, via the projection matrix $F_{l}^{(i)}$. This is done by replacing the projection matrix $G_{l}^{(i)}$ in relations (25), (26), and (27) by the new matrix $F_{l}^{(i)}$, in order to obtain the elements of $[\tilde{X}^{(i)}]$.

5. COMPARATIVE STUDY

To illustrate the presented interval PCA diagnosis techniques, we consider a static system based on 5 variables, j={1,…,5}, and two models. The offline FDI procedure for the obtained interval models is executed using projection of reconstructions. A comparison between the three interval PCA models for diagnosis is carried out, in terms of good detections and isolations, in order to determine the best-suited method for the task.

5.1. Data Generation

The data matrix X includes N=200 measurements and is described at each instant k by the relations in (31).

A centered variation $\delta X$, amounting to 5% of the variation range of each variable, is generated and added to the data in order to simulate noise, thus giving the new interval-type data $[X] = [X - \delta X,\; X + \delta X]$ (Fig. 1).

Fig. 1. Structure of generated interval data [X] (panels [x1] to [x5], samples 0 to 200).

Using the presented methods, we established three interval PCA models of the data. According to the variance of reconstruction error (VRE) criterion of Valle et al. (1999), the number of PCs to retain is l = 2.

5.2. Diagnosis Procedure

Once the modelling phase has been performed with the different interval PCA methods presented, a new set of data is generated based on (31), and different types of offsets are added in order to proceed with the diagnosis routine. However, since we are using the projection of reconstructions technique for FDI, it is of interest to know the structure of the interval residuals before injecting faults. Fig. 2 gives a partial representation of the residuals [r1],...,[r5] obtained by projection onto the residual space using the last three (m-l=3) components, for all the interval PCA models. Furthermore, in this work, and for simplicity, the representations related to the interval models are plotted in blue, red and magenta for V-PCA, C-PCA and MR-PCA, respectively.

Fig. 2. Structure of residuals (panels [r1] to [r5]).
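The interval residuals of Section 4.2 can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, relation (30) is taken in the column-vector convention (so row-stored data are multiplied by the transpose), and the residual bounds are obtained with the same sign-split rule as in (26).

```python
import numpy as np

def residual_projection_matrix(C_hat, i):
    """F_l^(i) = (I - C_hat) G_l^(i) as in (30), column-vector convention."""
    m = C_hat.shape[0]
    R = np.eye(m) - C_hat               # residual projector
    G = np.eye(m)
    G[i, :] -= R[i, :] / R[i, i]        # G_l^(i) of (23): only row i differs from identity
    return R @ G

def interval_residuals(X_lo, X_hi, F):
    """Bounds of the residuals X @ F.T when X lies in the box [X_lo, X_hi] (rows = samples)."""
    A = F.T
    A_pos = np.clip(A, 0.0, None)
    A_neg = np.clip(A, None, 0.0)
    r_lo = X_lo @ A_pos + X_hi @ A_neg
    r_hi = X_hi @ A_pos + X_lo @ A_neg
    return r_lo, r_hi
```

For fault-free centers lying in the principal subspace, the point residual is zero, so the two bounds fall on either side of the zero line and form an envelope around it.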
We note from Fig. 2 that the upper and lower projections, which are respectively the residuals of the upper and lower estimates of the interval data matrix $[\hat{X}]$, form an envelope around the zero line.

5.2.1. Fault Detection

In classical PCA, unusual events are projected onto the residual space and can therefore be detected. The same principle can be applied in the interval case. However, for the new interval detection method, such abnormal behaviour cannot be considered a fault unless one of the residual bounds changes sign, as shown in Fig. 3. In other words, the envelope created by the residuals is a safe zone, spanned by the interval of approximation error, in which an unusual event is considered not a fault but an uncertainty.

Fig. 3. Fault detection for interval data (annotated regions: real fault, uncertainty).

To illustrate this notion, and in order to compare the three interval PCA models, we simulate two different offsets, $\delta x_1$ and $\delta x_3$, with different amplitudes varying around the mean of $\delta X$, and affecting variables $[x_1]$ and $[x_3]$, respectively. As the mean of $\delta X$ equals 0.2 in our case, the two offsets are injected as follows:

- $\delta x_1$, from instant 40 to 60, varying in the range [0.1, 0.2], to simulate an approximation error, or uncertainty.
- $\delta x_3$, from instant 120 to 140, varying in the range [0.2, 0.3], to simulate a real fault.

As an example, the residuals obtained after injecting offsets $\delta x_1$ and $\delta x_3$, of amplitude 0.14 and 0.26 respectively, are shown for the three interval PCAs in Fig. 4, Fig. 5, and Fig. 6.

Fig. 4. Residuals with added offsets for V-PCA model.
Fig. 5. Residuals with added offsets for C-PCA model.
Fig. 6. Residuals with added offsets for MR-PCA model.

Running the diagnosis routine several times, injecting outliers whose amplitude increases by 0.01 at each step, allows us to determine which of the interval PCA models performs best at distinguishing the real faults from the uncertainties. The obtained results are gathered in Table 1, where detected outliers are marked by ones and undetected outliers by zeroes.

Results for the detection phase (Table 1) are in favour of the MR-PCA model, which demonstrates a near-perfect detection rate. C-PCA also behaves well in differentiating the uncertainties from the faults; nonetheless, it is slightly oversensitive to the added outliers. As for the V-PCA model, the obtained results show poor performance, with only high-amplitude outliers being detected.

Table 1. Detection results for different outliers

Uncertainties
Amplitude  0.1   0.11  0.12  0.13  0.14  0.15  0.16  0.17  0.18  0.19  0.2
V-PCA      0     0     0     0     0     0     0     0     0     0     0
C-PCA      0     0     0     0     0     0     0     0     1     1     1
MR-PCA     0     0     0     0     0     0     0     0     0     0     0

Faults
Amplitude  0.21  0.22  0.23  0.24  0.25  0.26  0.27  0.28  0.29  0.3
V-PCA      0     0     0     0     1     1     1     1     1     1
C-PCA      1     1     1     1     1     1     1     1     1     1
MR-PCA     0     1     1     1     1     1     1     1     1     1

Note that, since the added variations are slight compared to the data, detection decisions become tight as the offset amplitude approaches 0.2; they are made by separating persistent from non-persistent sign changes according to a predefined limit.
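The detection rule described above (a residual bound changing sign, followed by a persistence filter) can be sketched as follows. This is only an illustrative reading of the text: the function names are hypothetical, and the parameter `min_run`, the minimum number of consecutive flagged samples, stands in for the "predefined limit", whose value the text does not give.

```python
import numpy as np

def sign_change_flags(r_lo, r_hi):
    """True where the residual interval [r_lo, r_hi] no longer contains zero."""
    return (r_lo > 0.0) | (r_hi < 0.0)

def persistent_detection(flags, min_run):
    """Keep only the flags that belong to a run of at least min_run consecutive samples."""
    flags = np.asarray(flags, dtype=bool)
    out = np.zeros_like(flags)
    start = None
    for k, f in enumerate(np.append(flags, False)):   # trailing sentinel closes the last run
        if f and start is None:
            start = k                                  # a run of flagged samples begins
        elif not f and start is not None:
            if k - start >= min_run:
                out[start:k] = True                    # run is long enough: keep it
            start = None
    return out
```

For a single residual series, `persistent_detection(sign_change_flags(r_lo, r_hi), min_run=3)` would discard isolated sign changes and keep only those sustained over at least three samples.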
5.2.2. Fault isolation

In this section, we illustrate the fault isolation procedure using the new interval reconstruction principle. From the properties of the projection matrix $G_{l}^{(i)}$, as in Dunia et al. (1998), we know a priori the influence of faults on the residuals, i.e. the theoretical signature of faults, Harkat et al. (2002). As an example, Table 2 gives the theoretical signatures of faults for the reconstructed projections, where the appearance of a fault is marked by ones and its absence by zeroes.

Table 2. Theoretical Signature of Faults
(columns: projections $[\tilde{X}^{(i)}]$, i = 1,...,5)

Fault   i=1     i=2     i=3     i=4     i=5
x1      00000   10111   11011   11101   11110
x2      01111   00000   11011   11101   11110
x3      01111   10111   00000   11101   11110
x4      01111   10111   11011   00000   11110
x5      01111   10111   11011   11101   00000

The complete isolation of all faults can only be achieved after performing a number of reconstructions, eliminating one variable at a time. An example of two projections onto the residual space, $[\tilde{X}^{(2)}]$ and $[\tilde{X}^{(3)}]$, is given in Fig. 7 for the different interval PCAs. The representation is focused on the fault occurrence zone between instants 120 and 140, the fault to isolate being $\delta x_3$, of amplitude 0.26, used in the detection phase.

Fig. 7. Residuals $[\tilde{X}^{(2)}]$ and $[\tilde{X}^{(3)}]$ for fault occurrence zone

Based on the obtained residuals, the experimental signature of faults can be established. Table 3 gives the experimental signatures, limited to the zone where the fault occurred, for all the interval PCA models. Fault isolation results from a comparison between the experimental and theoretical signatures; this can be done using a similarity, correlation or distance measure. After running several tests, the three interval PCA methods have shown the ability to isolate faults effectively, but only once the faults have been detected.

Table 3. Experimental Signature of Faults
(rows: projections $[\tilde{X}^{(i)}]$, i = 1,...,5)

Projection   V-PCA   C-PCA   MR-PCA
i=1          00111   00101   00111
i=2          00101   01101   10101
i=3          00000   00000   00000
i=4          10101   11101   10101
i=5          01110   01110   01110
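The comparison between experimental and theoretical signatures can be sketched with a simple distance. The snippet below is an assumption-laden illustration: the text leaves the choice open ("similarity calculation, correlation or distance"), and the Hamming distance is picked here only as one possible option. The signature strings are transcribed from Tables 2 and 3, and the function names are hypothetical.

```python
import numpy as np

# Theoretical signatures (Table 2): one 25-bit string per faulty variable.
THEORETICAL = {
    "x1": "00000 10111 11011 11101 11110",
    "x2": "01111 00000 11011 11101 11110",
    "x3": "01111 10111 00000 11101 11110",
    "x4": "01111 10111 11011 00000 11110",
    "x5": "01111 10111 11011 11101 00000",
}

# Experimental signatures (Table 3), projections i = 1..5 concatenated per model.
EXPERIMENTAL = {
    "V-PCA": "00111 00101 00000 10101 01110",
    "C-PCA": "00101 01101 00000 11101 01110",
    "MR-PCA": "00111 10101 00000 10101 01110",
}

def to_bits(signature):
    """Turn a signature string into an array of 0/1 integers, ignoring spaces."""
    return np.array([int(ch) for ch in signature.replace(" ", "")])

def isolate(experimental, theoretical=THEORETICAL):
    """Return the fault whose theoretical signature is closest in Hamming distance."""
    e = to_bits(experimental)
    distances = {fault: int(np.sum(e != to_bits(sig)))
                 for fault, sig in theoretical.items()}
    return min(distances, key=distances.get), distances
```

Calling `isolate(EXPERIMENTAL["MR-PCA"])` returns the closest fault label together with the distance to every candidate, which makes it easy to see how clear-cut the decision is.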
6. CONCLUSION

Introducing the interval notion into system diagnosis, and more precisely into principal component analysis, is a novel technique that emphasizes measurement uncertainties. The interval nature of the projections ensures the elimination of approximation errors during the fault detection and isolation procedure. In this paper, we presented three different interval PCA methods used for FDI purposes, V-PCA, C-PCA and MR-PCA, and carried out a comparative study in which several offsets were used to simulate both faults and uncertainties. The difference observed between the methods lies mainly in the detection phase: V-PCA shows a lack of precision, with only high-amplitude faults being detected; C-PCA is relatively better, despite its slight oversensitivity to outliers; MR-PCA performed best in detecting the outliers and distinguishing the real faults from the approximation errors. The isolation of faults, carried out using the new interval reconstruction principle, proved effective.

REFERENCES

Cazes, P., Chouakria, A., Diday, E., and Schektman, Y. (1997). Extension de l'analyse en composantes principales à des données de type intervalle. Revue de Statistique Appliquée, 45(3), 5–24.
Chouakria, A. (1998). Extension des méthodes d'analyse factorielle à des données de type intervalle. Ph.D. dissertation, Université Paris-Dauphine.
Valle, S., Li, W., and Qin, S. J. (1999). Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods. Ind. Eng. Chem. Res., 38(11), 4389–4401.
Palumbo, F., and Lauro, N. C. (2003). A PCA for interval-valued data based on midpoints and radii. In New Developments in Psychometrics. Tokyo.
D'Urso, P., and Giordani, P. (2004). A least squares approach to principal component analysis for interval valued data. Chemometr Intell Lab Syst, 70(2), 179–192.
Gioia, F., and Lauro, C. (2006). Principal component analysis on interval data. Computational Statistics, 21, 343–363.
Le-Rademacher, J., and Billard, L. (2012). Symbolic covariance principal component analysis and visualization for interval-valued data. Journal of Computational and Graphical Statistics, 21(2), 413–432.
Dunia, R., and Qin, S. (1998). A subspace approach to multidimensional fault identification and reconstruction. American Institute of Chemical Engineers Journal.
Harkat, M. F., Mourot, G., and Ragot, J. (2002). Différentes méthodes de localisation de défauts basées sur les dernières composantes principales. Conférence Internationale Francophone d'Automatique CIFA, Nantes, France.