Information Sciences 490 (2019) 265–284
Contents lists available at ScienceDirect
Information Sciences journal homepage: www.elsevier.com/locate/ins
Fault detection of uncertain chemical processes using interval partial least squares-based generalized likelihood ratio test
Harkat M.-F. a,∗, Mansouri M. b, Nounou M.N. a, Nounou H.N. b
a Chemical Engineering Program, Texas A&M University at Qatar, Doha, Qatar
b Electrical and Computer Engineering Program, Texas A&M University at Qatar, Doha, Qatar
Article info
Article history: Received 12 April 2018; Revised 23 March 2019; Accepted 25 March 2019; Available online 30 March 2019
Keywords: Partial least squares (PLS); Generalized likelihood ratio test (GLRT); Process monitoring; Interval-valued data; Fault detection (FD)
Abstract
Fault detection (FD) is essential for monitoring various chemical processes. Many chemical processes can be described by input-output models. The partial least squares (PLS) method is one of the most popular statistical approaches used for modeling and monitoring chemical processes. In many situations, measured process data can exhibit some level of uncertainty. In such cases, expressing the data in interval form can be useful. Therefore, this work addresses the problem of fault detection of uncertain chemical processes using an interval input-output PLS-based generalized likelihood ratio test (GLRT). The proposed approach extends the applicability of the GLRT to uncertain processes represented by interval-valued input-output data. In the developed approach, the modeling phase is performed using PLS and then the GLRT chart is applied to the interval residuals for fault detection. To evaluate the fault detection abilities of the proposed PLS-based interval-valued GLRT approach, two examples are used: a simulated example and a distillation column example. The performance of the proposed technique is evaluated in terms of the missed detection and false alarm rates. © 2019 Elsevier Inc. All rights reserved.
1. Introduction
Data-driven process monitoring techniques have been gradually gaining interest for chemical process modeling and monitoring because they are easy to implement and do not require deep knowledge of the process itself [19,30,33,36,37,47]. Multivariate statistical approaches, including principal component analysis (PCA) [14,21,22,24,26,31,39] and partial least squares (PLS), are very effective for modeling and monitoring chemical processes [12,15,27,46]. PLS is one of the important data-driven multivariate statistical process monitoring (MSPM) approaches; it finds the latent variables of the data sets by capturing the largest variance in the data and maximizing the correlation between the input and output variables. In PLS, the matrices computed during off-line modeling are then used for on-line monitoring. PLS was first proposed by Wold [45] and has been successfully applied in several research areas, such as process modeling, fault detection and process monitoring. Geladi and Kowalski [15] presented a detailed tutorial on PLS with some applications.
In many situations, the measured process data are uncertain. Such uncertainties can have a negative impact on the established model and, thus, on the fault detection performance [38]. Hence, there is a need for a mechanism to reduce
∗ Corresponding author: H. M.-F.
https://doi.org/10.1016/j.ins.2019.03.068 0020-0255/© 2019 Elsevier Inc. All rights reserved.
H. M.-F., M. M. and N. M.N. et al. / Information Sciences 490 (2019) 265–284
the effect of uncertainties in the data, which can be due to the imprecision of measurement devices or to missing data at different sampling times. One useful representation of data that accounts for such uncertainties is interval-valued data. For this purpose, several PCA models for interval-valued data have been proposed in the literature [8,9,13,29,35]. Recently, a PCA-based process monitoring approach for interval-valued data was presented [2,3]. The developed techniques apply interval PCA to estimate the model and then generate interval residuals for fault detection. To enhance the detection abilities of these techniques, the authors of [18,32] proposed using generalized likelihood ratio test (GLRT) and exponentially weighted moving average (EWMA) control charts.
In this paper, we extend our previous work to input-output models, using PLS to improve the model accuracy. PLS is one of the most popular statistical approaches used for chemical process modeling and monitoring, since it takes into account the input-output relationship between the variables and captures the major trends in the data set. Here, an interval PLS-based GLR test is developed and an interval GLR test is derived for fault detection. The GLRT has been shown to provide effective detection performance for fixed false alarm rates. Therefore, this work addresses the problem of fault detection of uncertain chemical systems using a PLS-based GLRT. The presented approach is applicable to uncertain processes represented by interval-valued input-output data. The detection effectiveness of the developed PLS-based GLRT is evaluated through two examples: synthetic data and a distillation column process. The fault detection results are evaluated in terms of false alarm and missed detection rates.
The rest of the paper is organized as follows. Section 2 provides a brief presentation of the classical PLS regression model. Section 3 describes the PLS regression model for interval-valued data. The proposed interval GLRT-based fault detection chart is described in Section 4. In Section 5, the performance of the interval PLS-based GLRT fault detection technique is evaluated using two examples: a synthetic data set and a distillation column benchmark. Finally, concluding remarks are presented in Section 6.
2. Partial least squares
Consider a linear PLS algorithm that models the relation between two data sets. Given an input matrix $X \in \mathbb{R}^{n \times m}$ with $n$ samples and an output matrix $Y \in \mathbb{R}^{n \times M}$, the relationships between $X$ and $Y$ are modeled by means of latent vectors,

$$X = T P^T + E_x = \sum_{i=1}^{\ell} t_i p_i^T + E_x \tag{1}$$

$$Y = U Q^T + E_y = \sum_{i=1}^{\ell} u_i q_i^T + E_y \tag{2}$$

where $T$ and $U$ are the $(n \times \ell)$ matrices of the $\ell$ retained latent vectors, and the matrices $P \in \mathbb{R}^{m \times \ell}$ and $Q \in \mathbb{R}^{M \times \ell}$ are the loading matrices. The matrices $E_x \in \mathbb{R}^{n \times m}$ and $E_y \in \mathbb{R}^{n \times M}$ are the residual matrices. The objective of the PLS algorithm is to solve the following optimization problem [10]:

$$\max\ w_i^T X_i^T Y_i q_i, \quad \text{s.t. } \|w_i\| = 1,\ \|q_i\| = 1 \tag{3}$$
where $w_i$ and $q_i$ are weight vectors that yield $t_i = X_i w_i$ and $u_i = Y_i q_i$, respectively. The weight vectors are computed iteratively using the nonlinear iterative partial least squares (NIPALS) algorithm. More details about the PLS algorithm can be found in [10]. Once the score vectors $t$ and $u$ are extracted, the loading vectors $p$ and $q$ in Eqs. (1) and (2) can be computed by regressing $X$ on $t$ and $Y$ on $u$, respectively. The resulting regression model and the estimate of its coefficient matrix are

$$Y = X B + E_y, \tag{4}$$

$$\hat{B} = X^T U \left(T^T X X^T U\right)^{-1} T^T Y. \tag{5}$$
We denote by $\hat{P}$, $\hat{W}$, and $\hat{B}$ the estimated parameters of the PLS regression model. Given new scaled sample vectors $x(k) \in \mathbb{R}^m$ and $y(k) \in \mathbb{R}^M$, their estimates $\hat{x}(k)$ and $\hat{y}(k)$ can be expressed, respectively, as:

$$\hat{x}(k) = \hat{C} x(k), \tag{6}$$

$$\hat{y}(k) = \hat{B} x(k), \tag{7}$$

where $\hat{C} = \hat{P} R^T$ and $R = \hat{W} (\hat{P}^T \hat{W})^{-1}$.
3. Partial least squares for interval-valued data
3.1. Interval data description
The uncertainties in process data, which can be due to measurement imprecision or process variations, may be represented by treating the data as interval-valued. In reality, the actual value $x_j^*(k)$ of a variable can deviate from the measured one $x_j^c(k)$. The measurement error is defined as $\delta x_j(k) = x_j^c(k) - x_j^*(k)$. Hence, once a measurement $x_j^c(k)$ is available, the actual (unknown) value of the measured variable belongs to an interval, $x_j^*(k) \in [\underline{x}_j(k), \overline{x}_j(k)]$, where $\underline{x}_j(k) = x_j^c(k) - \delta x_j(k)$ and $\overline{x}_j(k) = x_j^c(k) + \delta x_j(k)$. An interval-valued variable (IVV) $[X_j] \subset \mathbb{R}$ is represented by a set of real intervals delimited by ordered couples: $[X_j] = \{[x_j(1)], [x_j(2)], \ldots, [x_j(n)]\}$, where $[x_j(k)] \equiv [\underline{x}_j(k), \overline{x}_j(k)]$ $\forall k \in \{1, \ldots, n\}$ and $\underline{x}_j(k) \le \overline{x}_j(k)$. The interval $[x_j(k)]$ can also be expressed by the couple $\{x_j^c(k), x_j^r(k)\}$, where:

$$x_j^c(k) = \frac{1}{2}\left(\underline{x}_j(k) + \overline{x}_j(k)\right), \tag{8}$$

and,

$$x_j^r(k) = \frac{1}{2}\left(\overline{x}_j(k) - \underline{x}_j(k)\right). \tag{9}$$
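The center/radius representation of Eqs. (8) and (9) can be illustrated with a minimal Python sketch. The function names `to_center_radius` and `to_bounds` and the numerical values are hypothetical and chosen only for illustration.

```python
def to_center_radius(lower, upper):
    """Return (center, radius) of the interval [lower, upper], as in Eqs. (8)-(9)."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def to_bounds(center, radius):
    """Inverse map: rebuild (lower, upper) from a center and a non-negative radius."""
    return center - radius, center + radius


# A measurement of 10.0 with an assumed imprecision of 0.5 (illustrative values).
lower, upper = to_bounds(10.0, 0.5)
center, radius = to_center_radius(lower, upper)
```

The two functions are inverses of each other, so either representation can be stored and the other recovered when needed.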
Typically, some standardization must be performed prior to processing data.
Definition 3.1. Let us introduce two basic concepts: the mean interval and the variance [25,28]. The mean interval $[m_j]$ is defined as:

$$[m_j] = \frac{1}{n} \sum_{k=1}^{n} [x_j(k)]. \tag{10}$$

For interval-valued data, the following distance measure is used [11]:

$$d\left([x_j(k)], [y_j(k)]\right) = \left|x_j^c(k) - y_j^c(k)\right| + \left|x_j^r(k) - y_j^r(k)\right|, \tag{11}$$
where $d([x_j(k)], [y_j(k)])$ satisfies the Euclidean distance properties [11,34]. Based on Eq. (11), the following properties can be verified. Let $[x_j(1)], [x_j(2)], \ldots, [x_j(n)]$ be a finite set of intervals, so that $[x_j(k)] \subset \mathbb{R}$, $\forall k \in \{1, \ldots, n\}$, and let $[m_j]$, $j \in \{1, \ldots, m\}$, be their corresponding mean intervals. Then

$$\sum_{k=1}^{n} \left[\left(x_j^c(k) - m_j^c\right) + \left(x_j^r(k) - m_j^r\right)\right] = 0,$$

and

$$\sum_{k=1}^{n} d^2\left([x_j(k)], [m_j]\right)$$

is minimized. Using Eq. (11), the variance can be computed as [11,34]:
$$\sigma_j^2 = \frac{1}{n}\left[\sum_{k=1}^{n} \left(x_j^c(k) - m_j^c\right)^2 + \sum_{k=1}^{n} \left(x_j^r(k) - m_j^r\right)^2 + 2 \sum_{k=1}^{n} \left|x_j^c(k) - m_j^c\right| \left|x_j^r(k) - m_j^r\right|\right], \tag{12}$$

where $m_j^c = \frac{1}{n}\sum_{k=1}^{n} x_j^c(k)$ and $m_j^r = \frac{1}{n}\sum_{k=1}^{n} x_j^r(k)$.
Eq. (12) shows that the variance for interval-valued data can be decomposed into three components: the variance among midpoints, the variance among ranges, and twice the connection between midpoints and ranges, given by $\sum_{k=1}^{n} \left|x_j^c(k) - m_j^c\right| \left|x_j^r(k) - m_j^r\right| \ge 0$.
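The properties in Eq. (11) indicate that the distance between intervals can be generalized to the Euclidean distance in the space $\mathbb{R}^m$. A standardized interval is given by:

$$\left[\frac{1}{\sigma_j}\left(\left(x_j^c(k) - m_j^c\right) - \left(x_j^r(k) - m_j^r\right)\right),\ \frac{1}{\sigma_j}\left(\left(x_j^c(k) - m_j^c\right) + \left(x_j^r(k) - m_j^r\right)\right)\right].$$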
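As an illustration of Definition 3.1, the following Python sketch computes the mean interval of Eq. (10) and the variance of Eq. (12) for one interval-valued variable. It assumes the reading of Eq. (12) in which the cross term uses absolute deviations (so that it is non-negative, as stated in the text); the function names are hypothetical.

```python
def interval_mean(intervals):
    """Mean interval of Eq. (10): the bounds are averaged separately."""
    n = len(intervals)
    lower = sum(a for a, _ in intervals) / n
    upper = sum(b for _, b in intervals) / n
    return lower, upper


def interval_variance(intervals):
    """Variance of Eq. (12), built from centers and radii of the intervals."""
    n = len(intervals)
    centers = [(a + b) / 2.0 for a, b in intervals]
    radii = [(b - a) / 2.0 for a, b in intervals]
    mc = sum(centers) / n
    mr = sum(radii) / n
    total = 0.0
    for c, r in zip(centers, radii):
        dc, dr = abs(c - mc), abs(r - mr)
        # midpoint term + range term + twice the (non-negative) cross term
        total += dc * dc + dr * dr + 2.0 * dc * dr
    return total / n


samples = [(0.0, 2.0), (1.0, 5.0)]  # illustrative intervals
m = interval_mean(samples)
var = interval_variance(samples)
```

With this reading, the variance is the average of the squared distances of Eq. (11) between each interval and the mean interval.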
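Definition 3.2. Given any interval-valued variables:

$$[X_j] = \left([\underline{x}_j(1), \overline{x}_j(1)]\ \ldots\ [\underline{x}_j(k), \overline{x}_j(k)]\ \ldots\ [\underline{x}_j(n), \overline{x}_j(n)]\right)^T,$$

and

$$[X_i] = \left([\underline{x}_i(1), \overline{x}_i(1)]\ \ldots\ [\underline{x}_i(k), \overline{x}_i(k)]\ \ldots\ [\underline{x}_i(n), \overline{x}_i(n)]\right)^T, \quad j \ne i. \tag{13}$$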
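The product of two intervals is defined as:

$$[\underline{x}_i(k), \overline{x}_i(k)]\,[\underline{x}_j(k), \overline{x}_j(k)] = [\min(s), \max(s)], \tag{14}$$

where $s = \{\underline{x}_i(k)\underline{x}_j(k),\ \underline{x}_i(k)\overline{x}_j(k),\ \overline{x}_i(k)\underline{x}_j(k),\ \overline{x}_i(k)\overline{x}_j(k)\}$ [20], and the square of an interval is given by

$$[x_i(k)]^2 = \begin{cases} [\min(s_1), \max(s_1)] & \text{if } 0 \notin [x_i(k)] \\ [0, \max(s_1)] & \text{otherwise} \end{cases} \tag{15}$$

where $s_1 = \{\underline{x}_i(k)^2, \overline{x}_i(k)^2\}$.
Definition 3.3. For an interval-valued variable $[X_j]$, the squared norm is given by

$$\left\langle [X_j], [X_j] \right\rangle = \left\|[X_j]\right\|^2 = \sum_{k=1}^{n} \left\|[x_j(k)]\right\|^2 = \frac{1}{3}\sum_{k=1}^{n} \left(\underline{x}_j(k)^2 + \underline{x}_j(k)\overline{x}_j(k) + \overline{x}_j(k)^2\right).$$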
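The interval product of Eq. (14), the interval square of Eq. (15) and the squared norm of Definition 3.3 can be sketched in a few lines of Python. Intervals are represented as `(lower, upper)` tuples, and the function names are hypothetical.

```python
def interval_product(x, y):
    """Product of two intervals, Eq. (14): min and max over the four bound products."""
    (xl, xu), (yl, yu) = x, y
    s = (xl * yl, xl * yu, xu * yl, xu * yu)
    return min(s), max(s)


def interval_square(x):
    """Square of an interval, Eq. (15): the lower bound is 0 when 0 lies in the interval."""
    xl, xu = x
    s1 = (xl * xl, xu * xu)
    if xl <= 0.0 <= xu:
        return 0.0, max(s1)
    return min(s1), max(s1)


def squared_norm(intervals):
    """Squared norm of an interval-valued variable (Definition 3.3)."""
    return sum(xl * xl + xl * xu + xu * xu for xl, xu in intervals) / 3.0
```

Note that the product of an interval with itself and its square differ when the interval contains zero: for `(-1, 2)` the product is `(-2, 4)` while the square is `(0, 4)`, because the product treats the two factors as independent.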
Definition 3.4. Given any interval-valued variables $[X_1], [X_2], \ldots, [X_m]$ of $n$ observations and $\forall a_j \in \mathbb{R}$, $j = 1, \ldots, m$, define an interval-valued variable $[Y]$ as a linear combination of $[X_1], [X_2], \ldots, [X_m]$, i.e.,

$$[Y] = \sum_{j=1}^{m} a_j [X_j] = \left([\underline{y}(1), \overline{y}(1)]\ \ldots\ [\underline{y}(n), \overline{y}(n)]\right)^T.$$

In order to avoid the problem that the predicted lower bound values of the response variable are greater than the upper bound values, Moore's linear combination algorithm [34] is adopted as follows:

$$\underline{y}(k) = \sum_{j=1}^{m} a_j \left(\tau \underline{x}_j(k) + (1 - \tau)\overline{x}_j(k)\right),$$

$$\overline{y}(k) = \sum_{j=1}^{m} a_j \left((1 - \tau)\underline{x}_j(k) + \tau \overline{x}_j(k)\right),$$

with

$$\tau = \begin{cases} 0 & \text{if } a_j \le 0 \\ 1 & \text{otherwise.} \end{cases}$$
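The linear combination rule of Definition 3.4 can be sketched as follows. The function name `linear_combination` and the sample values are hypothetical; the switch `tau` is evaluated per coefficient, as in the definition above.

```python
def linear_combination(coeffs, intervals):
    """Interval linear combination of Definition 3.4 for one observation k.

    coeffs    : real coefficients a_j
    intervals : (lower, upper) bounds of [x_j(k)], one tuple per variable
    """
    lower = 0.0
    upper = 0.0
    for a, (xl, xu) in zip(coeffs, intervals):
        tau = 0.0 if a <= 0 else 1.0
        # A negative coefficient swaps the roles of the two bounds.
        lower += a * (tau * xl + (1.0 - tau) * xu)
        upper += a * ((1.0 - tau) * xl + tau * xu)
    return lower, upper


y_lower, y_upper = linear_combination([2.0, -1.0], [(1.0, 2.0), (3.0, 5.0)])
```

Because each term contributes its smallest value to the lower bound and its largest value to the upper bound, the resulting lower bound can never exceed the upper bound.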
3.2. Center partial least squares (CPLS) method for interval-valued data
Partial least squares (PLS) is widely used in process monitoring [30]. However, to deal with uncertain processes and interval-valued data, a new PLS model is required. An extension of the PLS approach is therefore proposed so that it can be applied to interval-valued data. The proposed approach, called the center PLS (CPLS) method, relies on fitting a PLS regression model using the centers of the intervals of the variables in the training set, and then applying the identified PLS model to the lower and upper bounds of the interval values of the independent variables to predict the interval values of the dependent variables. A regression model for interval-valued data was introduced by Billard and Diday [5]: a model is built on the center points of the intervals and then applied to the interval independent variables to predict the interval dependent variables [40]. This technique is used here to derive a PLS model. Assume that $[X_1], \ldots, [X_m]$ are $m$ independent interval-valued variables, and $[Y_1], \ldots, [Y_M]$ are $M$ dependent interval-valued variables, as follows:

$$[X] = \begin{bmatrix} [x_1(1)] & \cdots & [x_m(1)] \\ \vdots & \ddots & \vdots \\ [x_1(n)] & \cdots & [x_m(n)] \end{bmatrix}, \qquad [Y] = \begin{bmatrix} [y_1(1)] & \cdots & [y_M(1)] \\ \vdots & \ddots & \vdots \\ [y_1(n)] & \cdots & [y_M(n)] \end{bmatrix}. \tag{16}$$
Let $x_j^c(k)$ and $y_l^c(k)$, $k = 1, \ldots, n$, $j = 1, \ldots, m$ and $l = 1, \ldots, M$, be the center points of the interval-valued data. Let the $k$th observed value of $[X_j]$ be $[x_j(k)] = [\underline{x}_j(k), \overline{x}_j(k)]$, and the $k$th observed value of $[Y_l]$ be $[\underline{y}_l(k), \overline{y}_l(k)]$. Hence,

$$x_j^c(k) = \left(\underline{x}_j(k) + \overline{x}_j(k)\right)/2, \qquad y_l^c(k) = \left(\underline{y}_l(k) + \overline{y}_l(k)\right)/2, \qquad k = 1, 2, \ldots, n. \tag{17}$$
Then, the matrices $X^c$ and $Y^c$ can be expressed as:

$$X^c = \begin{bmatrix} x_1^c(1) & \cdots & x_m^c(1) \\ \vdots & \ddots & \vdots \\ x_1^c(n) & \cdots & x_m^c(n) \end{bmatrix}, \qquad Y^c = \begin{bmatrix} y_1^c(1) & \cdots & y_M^c(1) \\ \vdots & \ddots & \vdots \\ y_1^c(n) & \cdots & y_M^c(n) \end{bmatrix}, \tag{18}$$
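and the fitted linear regression model can be written as:

$$Y^c = X^c B^c + \varepsilon^c, \tag{19}$$

where $B^c \in \mathbb{R}^{m \times M}$ is the matrix of regression parameters. The PLS estimators of $B$ and $C$ are $\hat{B}$ and $\hat{C}$, respectively. Then, the lower and upper bounds of the interval data estimates $[\hat{X}] = [\underline{\hat{X}}, \overline{\hat{X}}]$ and $[\hat{Y}] = [\underline{\hat{Y}}, \overline{\hat{Y}}]$ can be computed as:

$$[\hat{X}] = [\underline{\hat{X}}, \overline{\hat{X}}] = [X]\hat{C} = [\underline{X}, \overline{X}]\,\hat{C}, \tag{20}$$

$$[\hat{Y}] = [\underline{\hat{Y}}, \overline{\hat{Y}}] = [X]\hat{B} = [\underline{X}, \overline{X}]\,\hat{B}. \tag{21}$$

By taking into account the following property of interval-valued data:

$$\gamma\,[\underline{x}, \overline{x}] = \begin{cases} [\gamma \underline{x},\ \gamma \overline{x}] & \text{if } \gamma > 0 \\ [\gamma \overline{x},\ \gamma \underline{x}] & \text{if } \gamma < 0 \end{cases} \tag{22}$$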
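where $\gamma \in \mathbb{R}$ is a real scalar, the lower and upper bound estimates can be expressed as:

$$\begin{cases} \underline{\hat{x}}_j(k) = \sum\limits_{i=1,\ \hat{C}_{ij} < 0}^{m} \overline{x}_i(k)\hat{C}_{ij} + \sum\limits_{i=1,\ \hat{C}_{ij} > 0}^{m} \underline{x}_i(k)\hat{C}_{ij} \\[2mm] \overline{\hat{x}}_j(k) = \sum\limits_{i=1,\ \hat{C}_{ij} < 0}^{m} \underline{x}_i(k)\hat{C}_{ij} + \sum\limits_{i=1,\ \hat{C}_{ij} > 0}^{m} \overline{x}_i(k)\hat{C}_{ij} \end{cases} \qquad k = 1, \ldots, n,\quad j = 1, \ldots, m \tag{23}$$

$$\begin{cases} \underline{\hat{y}}_j(k) = \sum\limits_{i=1,\ \hat{B}_{ij} < 0}^{m} \overline{x}_i(k)\hat{B}_{ij} + \sum\limits_{i=1,\ \hat{B}_{ij} > 0}^{m} \underline{x}_i(k)\hat{B}_{ij} \\[2mm] \overline{\hat{y}}_j(k) = \sum\limits_{i=1,\ \hat{B}_{ij} < 0}^{m} \underline{x}_i(k)\hat{B}_{ij} + \sum\limits_{i=1,\ \hat{B}_{ij} > 0}^{m} \overline{x}_i(k)\hat{B}_{ij} \end{cases} \qquad k = 1, \ldots, n,\quad j = 1, \ldots, M. \tag{24}$$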
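Let $[x(k)]$ and $[y(k)]$ be new interval measurement vectors at time $k$. Their estimates using the CPLS model can be expressed as:

$$[\hat{x}(k)] = \hat{C}[x(k)], \qquad [\hat{y}(k)] = \hat{B}[x(k)], \tag{25}$$

and then the corresponding estimation errors can be computed, using interval arithmetic [20], as:

$$[e_x(k)] = [x(k)] - [\hat{x}(k)], \qquad [e_y(k)] = [y(k)] - [\hat{y}(k)], \tag{26}$$

where

$$[e_x(k)] = [\underline{e}_x(k),\ \overline{e}_x(k)] = \left[\underline{x}(k) - \overline{\hat{x}}(k),\ \overline{x}(k) - \underline{\hat{x}}(k)\right], \tag{27}$$

$$[e_y(k)] = [\underline{e}_y(k),\ \overline{e}_y(k)] = \left[\underline{y}(k) - \overline{\hat{y}}(k),\ \overline{y}(k) - \underline{\hat{y}}(k)\right]. \tag{28}$$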
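The propagation of interval inputs through a fitted coefficient matrix (Eqs. (23) and (24)) and the computation of interval residuals (Eq. (26)) can be sketched in Python as follows. This is a sketch under stated assumptions: the coefficient matrix `coef` is taken as already estimated on the interval centers, the row-vector convention of Eq. (23) is used (`coef[i][j]` links input `i` to output `j`), and the residual bounds follow the standard interval subtraction rule, which is assumed here to be the interval arithmetic the text refers to. All names are hypothetical.

```python
def predict_interval(x_lower, x_upper, coef):
    """Lower and upper bound estimates in the spirit of Eqs. (23)-(24).

    A positive coefficient keeps the bounds in place; a negative one swaps them.
    """
    n_out = len(coef[0])
    y_lower = [0.0] * n_out
    y_upper = [0.0] * n_out
    for i, row in enumerate(coef):
        for j, c in enumerate(row):
            if c >= 0:
                y_lower[j] += c * x_lower[i]
                y_upper[j] += c * x_upper[i]
            else:
                y_lower[j] += c * x_upper[i]
                y_upper[j] += c * x_lower[i]
    return y_lower, y_upper


def interval_residual(x_lower, x_upper, xhat_lower, xhat_upper):
    """Interval residual [x] - [x_hat] using standard interval subtraction (assumed)."""
    e_lower = [xl - hu for xl, hu in zip(x_lower, xhat_upper)]
    e_upper = [xu - hl for xu, hl in zip(x_upper, xhat_lower)]
    return e_lower, e_upper


coef = [[1.0, -1.0], [2.0, 0.5]]  # illustrative coefficients, not from the paper
y_lo, y_hi = predict_interval([0.0, 1.0], [1.0, 2.0], coef)
```

With interval subtraction, the residual interval is always at least as wide as the measurement interval, which should be kept in mind when interpreting the detection indices built on it.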
4. Fault detection indices based on CPLS
The obtained interval CPLS model describes normal process behavior, and unusual events can be detected by comparing the observed behavior against this model. From Eqs. (27) and (28), it is clear that two sets of interval residuals have to be evaluated. Based on these residuals, several indices can be computed. In order to reduce the number of equations and simplify the presentation, the index $x$ or $y$ in those equations is replaced by $\bullet$. In this case, Eqs. (27) and (28) can be represented by a single equation,

$$[e_\bullet(k)] = [\underline{e}_\bullet(k),\ \overline{e}_\bullet(k)], \tag{29}$$

where $[e_\bullet(k)] = \left[[e_{1,\bullet}(k)]\ \ldots\ [e_{j,\bullet}(k)]\ \ldots\ [e_{m,\bullet}(k)]\right]^T$, $\underline{e}_\bullet(k) = \left[\underline{e}_{1,\bullet}(k)\ \ldots\ \underline{e}_{j,\bullet}(k)\ \ldots\ \underline{e}_{m,\bullet}(k)\right]^T$, and $\overline{e}_\bullet(k) = \left[\overline{e}_{1,\bullet}(k)\ \ldots\ \overline{e}_{j,\bullet}(k)\ \ldots\ \overline{e}_{m,\bullet}(k)\right]^T$.
This notation will be adopted in the rest of the paper. From the primary interval residuals $[e_x]$ and $[e_y]$, several fault detection indices can be derived. The most widely used multivariate index for fault detection is the Q statistic, also called the squared prediction error (SPE), which is defined, for single-valued data, as [16,26,27,31,36]:

$$Q(k) = \|e(k)\|^2 = e^T(k)\,e(k) = \sum_{j=1}^{m} e_j^2(k). \tag{30}$$
From the definition of the Q statistic for single-valued data and its different expressions given by Eq. (30), several indices can be derived for interval-valued data.
4.1. Norm of upper and lower bounds of residuals
This chart computes two classical single-valued Q charts using the lower and upper bounds of the interval residuals, respectively. It is defined as

$$Q_{1,\bullet} = \left[\left\|\underline{e}_\bullet\right\|^2,\ \left\|\overline{e}_\bullet\right\|^2\right] = \left[\sum_j \underline{e}_{j,\bullet}^2(k),\ \sum_j \overline{e}_{j,\bullet}^2(k)\right]. \tag{31}$$
4.2. Sum of squares of interval residuals
This chart presents another way of computing the quadratic form of interval-valued residuals, based on the interval square defined in Eq. (15):

$$Q_{2,\bullet} = \sum_j [e_{j,\bullet}(k)]^2, \qquad [e_{j,\bullet}(k)]^2 = \begin{cases} [\min s_j,\ \max s_j] & \text{if } 0 \notin [e_{j,\bullet}(k)] \\ [0,\ \max s_j] & \text{otherwise} \end{cases} \tag{32}$$

where $s_j = \{\underline{e}_{j,\bullet}^2(k),\ \overline{e}_{j,\bullet}^2(k)\}$. The two fault detection indices $Q_{1,\bullet}$ and $Q_{2,\bullet}$ for interval-valued data, proposed in [1,4], are computed from the upper and lower bounds of the interval residuals, and thus yield an interval index with an upper and a lower bound. Ait-Izem et al. [1] proposed to use the SPE control limit presented in [22] and extended it to interval data based on Box's quadratic form approximation [7]. Hence, the limits for the corresponding indices can be computed based on their respective estimated mean ($a$) and estimated variance ($b$), so that

$$\eta_\alpha = g_{index}\,\chi^2_{h_{index},\alpha}, \tag{33}$$

where $g = b/(2a)$ and $h = 2a^2/b$. Note that the thresholds for the presented indices are calculated for each bound of the statistic. However, the computed thresholds (upper and lower bound) tend to be equal due to the symmetry of the bounds. So, for the two fault detection indices $Q_{1,\bullet}$ and $Q_{2,\bullet}$, only one control limit (threshold) is computed for each index and then used for fault detection.
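A minimal Python sketch of the two interval indices of Eqs. (31) and (32) is given below. The residual bounds are passed as two lists, one entry per variable, and the function names `q1` and `q2` are hypothetical.

```python
def q1(e_lower, e_upper):
    """Q1 of Eq. (31): classical Q statistic applied to each bound vector."""
    return sum(v * v for v in e_lower), sum(v * v for v in e_upper)


def q2(e_lower, e_upper):
    """Q2 of Eq. (32): sum over variables of the interval squares of Eq. (15)."""
    lower = 0.0
    upper = 0.0
    for el, eu in zip(e_lower, e_upper):
        s_min = min(el * el, eu * eu)
        s_max = max(el * el, eu * eu)
        if el <= 0.0 <= eu:
            s_min = 0.0  # the residual interval contains zero
        lower += s_min
        upper += s_max
    return lower, upper


e_lo = [-1.0, 2.0]  # illustrative residual bounds
e_hi = [0.5, 3.0]
q1_bounds = q1(e_lo, e_hi)
q2_bounds = q2(e_lo, e_hi)
```

The first element returned by `q1` is the statistic of the lower residual bounds, not necessarily the smaller of the two values, which follows the definition in Eq. (31).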
Fig. 1. Time evolution of interval-valued simulated variables (panels [x1], [x2] and [x3]; upper and lower bounds versus sample number).
Fig. 2. Scatter plots of predicted and observed training data y1 (estimated [y1] versus [y1]).
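4.3. Product of interval residuals
A global interval residual can be computed as a product of interval residual vectors (see Definition 3.2, Eq. (14)), as follows:

$$[Q_{3,\bullet}(k)] = [e_\bullet(k)]^T [e_\bullet(k)] = \sum_j [e_{j,\bullet}(k)]\,[e_{j,\bullet}(k)], \tag{34}$$

where

$$[e_{j,\bullet}(k)]\,[e_{j,\bullet}(k)] = \left[\underline{Q}_{3,\bullet}(k),\ \overline{Q}_{3,\bullet}(k)\right],$$

$$\underline{Q}_{3,\bullet}(k) = \min\left\{\underline{e}_{j,\bullet}^2(k),\ \underline{e}_{j,\bullet}(k)\,\overline{e}_{j,\bullet}(k),\ \overline{e}_{j,\bullet}^2(k)\right\}, \tag{35}$$

$$\overline{Q}_{3,\bullet}(k) = \max\left\{\underline{e}_{j,\bullet}^2(k),\ \underline{e}_{j,\bullet}(k)\,\overline{e}_{j,\bullet}(k),\ \overline{e}_{j,\bullet}^2(k)\right\}. \tag{36}$$

The result is an interval residual, and a fault is detected if $0 \notin \left[\underline{Q}_{3,\bullet}(k),\ \overline{Q}_{3,\bullet}(k)\right]$.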
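The product-based index and its detection rule can be sketched as follows. The sketch assumes that the per-variable products of Eqs. (35) and (36) are accumulated with interval addition, as the sum in Eq. (34) suggests; the function names are hypothetical.

```python
def q3(e_lower, e_upper):
    """Interval index Q3: sum over variables of the products [e_j][e_j]."""
    lower = 0.0
    upper = 0.0
    for el, eu in zip(e_lower, e_upper):
        s = (el * el, el * eu, eu * eu)
        lower += min(s)
        upper += max(s)
    return lower, upper


def q3_fault(e_lower, e_upper):
    """A fault is declared when zero lies outside the interval [Q3]."""
    lower, upper = q3(e_lower, e_upper)
    return not (lower <= 0.0 <= upper)


faulty = q3_fault([-1.0, 2.0], [0.5, 3.0])
normal = q3_fault([-1.0, -0.5], [0.5, 1.0])
```

No control limit is needed for this rule, since the decision only depends on whether zero belongs to the computed interval.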
Fig. 3. Scatter plots of predicted and observed training data y2 (estimated [y2] versus [y2]).
Fig. 4. Evolution of Q1,x in both cases fault-free and faulty (upper bound, lower bound and 95% threshold versus sample number).
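4.4. Interval norm of residuals
From Definition 3.3, the interval norm of residuals index $Q_{4,\bullet}$ is defined as [2,3]:

$$Q_{4,\bullet}(k) = \left\|[e_\bullet(k)]\right\|^2 = \frac{1}{3}\sum_{j=1}^{m}\left(\underline{e}_{j,\bullet}^2(k) + \underline{e}_{j,\bullet}(k)\,\overline{e}_{j,\bullet}(k) + \overline{e}_{j,\bullet}^2(k)\right). \tag{37}$$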
4.5. Generalized likelihood ratio (GLR) chart
In this section, the classical GLR is presented first, and then the proposed interval GLR is derived. The GLR chart is a commonly used hypothesis testing technique in model-based fault detection [17,43,44]. Let $E \in \mathbb{R}^m$ be an observation vector following a Gaussian distribution $\mathcal{N}(0, \sigma^2 I_m)$ or $\mathcal{N}(\theta \ne 0, \sigma^2 I_m)$, where $\theta$ is the mean vector and $\sigma^2 > 0$ is the known variance. The hypothesis testing problem can be formulated as follows:

$$\begin{cases} H_0 = \{E \sim \mathcal{N}(0, \sigma^2 I_m)\} & \text{(null hypothesis)}, \\ H_1 = \{E \sim \mathcal{N}(\theta, \sigma^2 I_m)\} & \text{(alternative hypothesis)}. \end{cases} \tag{38}$$
Fig. 5. Evolution of Q1,y in both cases fault-free and faulty (upper bound, lower bound and 95% threshold versus sample number).
Fig. 6. Evolution of Q2,x in both cases fault-free and faulty.
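The likelihood estimate of $\theta$ (this parameter represents the value of the fault under consideration) is computed by maximizing the GLRT statistic $T(E)$ as follows:

$$\begin{aligned} T(E) &= 2\log\frac{\sup_{\theta \in \mathbb{R}^m} f_\theta(E)}{f_{\theta=0}(E)} \\ &= 2\log\left\{\sup_{\theta}\exp\left(-\frac{\|E - \theta\|_2^2}{2\sigma^2}\right) \Big/ \exp\left(-\frac{\|E\|_2^2}{2\sigma^2}\right)\right\} \\ &= \frac{1}{\sigma^2}\left\{-\min_{\theta}\|E - \theta\|_2^2 + \|E\|_2^2\right\} \\ &= \frac{1}{\sigma^2}\left\{-\|E - \hat{\theta}\|_2^2 + \|E\|_2^2\right\} = \frac{1}{\sigma^2}\|E\|_2^2, \end{aligned} \tag{39}$$

where $\hat{\theta} = \arg\min_{\theta}\|E - \theta\|_2^2 = E$ is the estimate of $\theta$, the probability density function of $E$ is $f_\theta(E) = \frac{1}{(2\pi)^{m/2}\sigma^m}\exp\left(-\frac{1}{2\sigma^2}\|E - \theta\|_2^2\right)$, and $\|\cdot\|_2$ represents the Euclidean norm.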
Fig. 7. Evolution of Q2,y in both cases fault-free and faulty.
Fig. 8. Evolution of Q3,x in both cases fault-free and faulty.
Here, the distribution of the decision function $T(E)$ under $H_0$ allows designing a statistical test with a desired false alarm rate $\alpha$ and a threshold $\eta_\alpha$,

$$P_0\left(T(E) \ge \eta_\alpha\right) = \alpha, \tag{40}$$

where $P_0(A)$ represents the probability of an event $A$ when $E$ is distributed according to the null hypothesis $H_0$. The statistic $T$ is distributed according to the $\chi^2$ law with $m$ degrees of freedom. This law is central under $H_0$ and non-central under $H_1$, with a non-centrality parameter equal to $\kappa_\theta = \frac{1}{\sigma^2}\|\theta\|_2^2$. The GLRT statistic follows a chi-square distribution [6,23]:

$$T(E) = \frac{1}{\sigma^2}\|E\|_2^2 \sim \chi_m^2. \tag{41}$$

To deal with interval-valued data, an interval GLRT is derived and applied to the interval residuals computed by the CPLS method.
Fig. 9. Evolution of Q3,y in both cases fault-free and faulty.
Fig. 10. Evolution of Q4,x in both cases fault-free and faulty.
Considering Eq. (41) with $E = [e_\bullet(k)]$, the interval GLR (IGLR) is given by:

$$IGLR_\bullet(k) = \sum_{j=1}^{m}\left\|\frac{[e_{j,\bullet}(k)]}{\sigma_{j,\bullet}}\right\|^2 = \frac{1}{3}\sum_{j=1}^{m}\frac{\underline{e}_{j,\bullet}^2(k) + \underline{e}_{j,\bullet}(k)\,\overline{e}_{j,\bullet}(k) + \overline{e}_{j,\bullet}^2(k)}{\sigma_{j,\bullet}^2}, \tag{42}$$

where $\sigma_{j,\bullet}^2$ is the variance of the $j$th interval residual given by Eq. (12). The control limits $\eta_\alpha$ for the described fault detection charts, i.e. $IGLR_\bullet$ and $Q_{4,\bullet}$, can be computed from their corresponding approximate distribution as [2,3]:

$$\eta_\alpha = g\,\chi^2_{h,\alpha}. \tag{43}$$

This control limit is based on Box's equation [7]. Considering that $a$ is the estimated mean of the fault detection chart and $b$ is its estimated variance, we have:

$$g = \frac{b}{2a}, \qquad h = \frac{2a^2}{b}. \tag{44}$$
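The following Python sketch illustrates how the index of Eq. (37), the IGLR of Eq. (42) and the control limit of Eqs. (43) and (44) fit together. It is a sketch under stated assumptions: $a$ and $b$ are estimated by the sample mean and sample variance of the statistic over fault-free training data, and the chi-square quantile is obtained with the Wilson-Hilferty approximation, which is swapped in here only to keep the example free of external dependencies (an exact quantile routine would normally be used). All function names are hypothetical.

```python
from statistics import NormalDist, mean, variance


def q4(e_lower, e_upper):
    """Interval norm of residuals, Eq. (37)."""
    return sum(el * el + el * eu + eu * eu
               for el, eu in zip(e_lower, e_upper)) / 3.0


def iglr(e_lower, e_upper, sigma2):
    """Interval GLR, Eq. (42): Q4 with each term divided by the residual variance."""
    return sum((el * el + el * eu + eu * eu) / s2
               for el, eu, s2 in zip(e_lower, e_upper, sigma2)) / 3.0


def box_parameters(train_stats):
    """g and h of Eq. (44) from the training values of a detection statistic."""
    a = mean(train_stats)
    b = variance(train_stats)
    return b / (2.0 * a), 2.0 * a * a / b


def chi2_quantile_wh(p, dof):
    """Wilson-Hilferty approximation of the chi-square quantile of order p."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def control_limit(train_stats, alpha=0.05):
    """Control limit of Eq. (43) for a false alarm rate alpha."""
    g, h = box_parameters(train_stats)
    return g * chi2_quantile_wh(1.0 - alpha, h)
```

When all residual variances are equal to one, `iglr` reduces to `q4`, and when the lower and upper residual bounds coincide, `q4` reduces to the classical Q statistic of Eq. (30).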
Fig. 11. Evolution of Q4,y in both cases fault-free and faulty.
Fig. 12. Evolution of IGLRx in both cases fault-free and faulty.
Therefore, the proposed detection approach combines the benefits of PLS and IGLRT: PLS is used for modeling and IGLRT is used for fault detection. In order to assess the performance of the developed technique, a synthetic example and a benchmark process are used.
5. Interval partial least squares-based generalized likelihood ratio test with applications
To illustrate the proposed center partial least squares (CPLS) based interval generalized likelihood ratio test (GLRT), we consider a simulated example using 7 variables with n = 500 samples. The simulated example represents a linear relationship between the variables [2,4]. First, two independent variables represented by orthogonal sine and cosine functions are generated; the other variables are then generated from a linear combination of those variables with added noise. The simulated example represents an input/output linear model, where the first five variables form the input matrix X and the last two variables form the output matrix Y.
Fig. 13. Evolution of IGLRy in both cases fault-free and faulty.
Fig. 14. Basic distillation column controlled with LV-configuration.
The monitored variables are described at time instant $k$ by the following relations:

$$\begin{cases} x_1(k) = 0.4\,v_1(k) - 1.3\sin(k/2) + \varepsilon_1(k), & v_1(k) \sim \mathcal{N}(0, \sigma^2) \\ x_2(k) = v_2(k) - 2\cos(k/4) + \varepsilon_2(k), & v_2(k) \sim \mathcal{N}(0, \sigma^2) \\ x_3(k) = \cos(k/3)\,e^{(v_3(k) - 1)} + \varepsilon_3(k), & v_3(k) \sim \mathcal{N}(0, \sigma^2) \\ x_4(k) = x_1(k) + x_2(k) + \varepsilon_4(k) \\ x_5(k) = x_2(k) + x_3(k) + \varepsilon_5(k) \\ x_6(k) = 2x_1(k) + x_3(k) + \varepsilon_6(k) \\ x_7(k) = x_4(k) + 2x_5(k) + \varepsilon_7(k), \end{cases} \tag{45}$$

where $\varepsilon_j(k)$, $j = 1, \ldots, 7$, represents Gaussian noise with small variance added to the measurements. In order to generate interval-valued data $[x]$, a variation $\delta x_j$, $j = 1, \ldots, 7$, is added to each variable; $\delta x_j$ is taken as 10% of the variation range of the corresponding variable $x_j$. Hence, the construction of intervals is given by:

$$[x_j(k)] = \left[x_j(k) - \delta x_j(k),\ x_j(k) + \delta x_j(k)\right], \qquad [X(k)] = \left[[x_1(k)], \ldots, [x_5(k)]\right], \qquad [Y(k)] = \left[[x_6(k)]\ [x_7(k)]\right]. \tag{46}$$

Fig. 15. Distillation column interval-valued measurements (panels [x1], [x2] and [x3] versus sample number).
Fig. 16. Evolution of Q1,x and Q1,y with a fault on variable x2 (upper and lower bounds versus sample number).
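The data generation of Eqs. (45) and (46) can be sketched in Python as follows. Several quantities are not specified in the text and are assumptions of this sketch: the standard deviation `sigma` of the variables $v_i$, the standard deviation `noise_std` of the measurement noise, the random seed, and the reading of "variation range" as the difference between the maximum and the minimum of each variable. The function names are hypothetical.

```python
import math
import random


def simulate(n=500, sigma=1.0, noise_std=0.05, seed=0):
    """Generate n samples of the seven variables of Eq. (45)."""
    rng = random.Random(seed)
    data = []
    for k in range(1, n + 1):
        v1, v2, v3 = (rng.gauss(0.0, sigma) for _ in range(3))
        eps = [rng.gauss(0.0, noise_std) for _ in range(7)]
        x1 = 0.4 * v1 - 1.3 * math.sin(k / 2) + eps[0]
        x2 = v2 - 2.0 * math.cos(k / 4) + eps[1]
        x3 = math.cos(k / 3) * math.exp(v3 - 1.0) + eps[2]
        x4 = x1 + x2 + eps[3]
        x5 = x2 + x3 + eps[4]
        x6 = 2.0 * x1 + x3 + eps[5]
        x7 = x4 + 2.0 * x5 + eps[6]
        data.append([x1, x2, x3, x4, x5, x6, x7])
    return data


def to_intervals(data, fraction=0.10):
    """Build the interval bounds of Eq. (46) with delta = fraction * (max - min)."""
    n_vars = len(data[0])
    deltas = []
    for j in range(n_vars):
        column = [row[j] for row in data]
        deltas.append(fraction * (max(column) - min(column)))
    lower = [[row[j] - deltas[j] for j in range(n_vars)] for row in data]
    upper = [[row[j] + deltas[j] for j in range(n_vars)] for row in data]
    return lower, upper


data = simulate(n=50)
lower, upper = to_intervals(data)
# Columns 0-4 would form the interval input [X]; columns 5-6 the interval output [Y].
```

A fixed seed keeps the sketch reproducible; the numerical values it produces are not those of the paper.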
Fig. 1 shows the time evolution of the interval-valued data of the simulated example. A CPLS model is constructed using ℓ = 2 retained latent variables. Scatter plots of the measured and predicted variables y1 and y2 are presented in Figs. 2 and 3, respectively. These plots indicate a good performance of the identified PLS model. Once the process model has been developed, the fault detection step is applied. Two faults are simulated: one on variable x3 between samples 120 and 130, and one on variable x7 (i.e., y2) between samples 300 and 340. To quantify the efficiency of the proposed interval fault detection indices, two metrics are used: the false alarm rate (FAR) and the missed detection rate (MDR) [19]. The FAR is the number of normal observations that are wrongly judged as faulty (false alarms) over the total number of fault-free samples. The MDR is the number of faulty samples that are wrongly considered as normal (missed detections) over the total number of faulty samples. Figs. 4, 6, 8, 10 and 12 represent the time evolution of the fault detection indices Q1,x, Q2,x, Q3,x, Q4,x and IGLRx, respectively, computed from the interval residuals [ex]. Figs. 5, 7, 9, 11 and 13 represent the time evolution of the fault detection indices Q1,y, Q2,y, Q3,y, Q4,y and IGLRy, respectively, computed from the interval residuals [ey]. The performances of the different fault detection charts are summarized in Table 1.
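The two metrics defined above can be computed with a short Python sketch. The function name `far_mdr` and the sample sequences are hypothetical; `alarms[k]` is true when the chart exceeds its limit at sample k, and `faulty[k]` is true when a fault is actually present.

```python
def far_mdr(alarms, faulty):
    """Return (FAR %, MDR %) from per-sample alarm flags and true fault flags."""
    false_alarms = sum(1 for a, f in zip(alarms, faulty) if a and not f)
    missed = sum(1 for a, f in zip(alarms, faulty) if f and not a)
    n_faulty = sum(1 for f in faulty if f)
    n_normal = len(faulty) - n_faulty
    far = 100.0 * false_alarms / n_normal if n_normal else 0.0
    mdr = 100.0 * missed / n_faulty if n_faulty else 0.0
    return far, mdr


# Eight illustrative samples: the last four are faulty.
alarms = [False, True, False, False, True, True, False, True]
faulty = [False, False, False, False, True, True, True, True]
far, mdr = far_mdr(alarms, faulty)
```

In this illustrative sequence, one of the four fault-free samples raises an alarm and one of the four faulty samples is missed, so both rates equal 25%.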
Fig. 17. Evolution of Q2,x and Q2,y with a fault on variable x2.
Table 1
FAR % and MDR % for the presented fault detection charts (simulated example).

Charts    X: FAR %   X: MDR %   Y: FAR %   Y: MDR %
Q1,•      62         4.87       6.13       1.21
Q2,•      62.03      3.86       6          2
Q3,•      0          68.88      0          35.24
Q4,•      29.26      3.86       5.37       0
IGLR•     3.44       0          4.95       0
Table 2
Distillation column process variables.

        Variables   Description
x1      qF          fraction of liquid in feed
x2      zF          feed composition [mole fraction]
x3      TF          feed temperature [°C]
x4      FM          feed molar flow [kmol/min]
x5      FV          feed volumetric flow [kmol/min]
x6      D           distillate flow [kmol/min]
x7      B           bottom flow [kmol/min]
x8      L           reflux flow [kmol/min]
x9      V           boilup flow [kmol/min]
x10     MD          condenser holdup [kmol]
x11     MB          reboiler holdup [kmol]
x12     xB          bottom composition [mole fraction]
From Table 1, it can be noted that the index Q3,• resulted in a false alarm rate FAR = 0 because its evaluation does not need any control limit estimation, unlike the other fault detection charts, for which an approximation of the chart distribution is needed and a probabilistic control limit is computed with a certain confidence level. However, this chart resulted in a fairly high rate of missed detection: MDR = 68.88% for the index Q3,x and MDR = 35.24% for Q3,y. The fault detection chart Q1,• (respectively Q2,•) is formed by quadratic upper and lower bounds, as expressed in Eq. (31) (respectively, Eq. (32)). During the training phase, two control limits are computed for each chart, corresponding to the upper and lower values. The computed control limits are very close, and only one control limit is used for fault detection. Detection charts Q1,• and Q2,• resulted in nearly the same rates of false alarms and missed detection, with FAR ≈ 62% and MDR between 3.86% and 4.87% when the indices are computed from the interval residuals [ex]. This is due to the expressions of those indices, given by Eqs. (31) and (32), which are very close. At each sample time, each chart has both upper and lower bounds in quadratic form, and those bounds are $\sum_j \underline{e}_{j,\bullet}^2$ and $\sum_j \overline{e}_{j,\bullet}^2$. It should be noted that the IGLR• index is a weighted Q4,•, where the weights are the inverses of the interval residual variances. The index Q4,y resulted in performance similar to that of the IGLRy index because the variances of the interval residuals [ey] are close to each other, but the IGLRx index performs better than Q4,x.
Fig. 18. Evolution of Q3,x and Q3,y with a fault on variable x2.
Table 3
FAR % and MDR % for the presented fault detection charts (scenario 1 of distillation column).

Charts    X: FAR %   X: MDR %   Y: FAR %   Y: MDR %
Q1,•      4.40       79.02      3.40       92.10
Q2,•      4.40       79.02      3.40       92.10
Q3,•      0          72.12      0.40       100
Q4,•      4.80       2.10       4.50       92.00
IGLR•     4.90       0          5.30       91.60
Table 4
FAR % and MDR % for the presented fault detection charts (scenario 2 of distillation column).

Charts    X: FAR %   X: MDR %   Y: FAR %   Y: MDR %
Q1,•      4.40       92.20      3.40       94.80
Q2,•      4.40       92.20      3.40       94.80
Q3,•      0          100        0.40       73.82
Q4,•      4.80       92.30      4.50       34.36
IGLR•     4.90       92.30      5.30       0
This is due to the fact that Q4,• relies on the squared prediction error statistic computed in the residual subspace as the interval norm of the residual components. When the variances of the interval residuals are quite different from one another, the detection ability of Q4,• visibly declines. To monitor faults accurately, the IGLR• is proposed to take into account the difference in variances between residuals. In this simulated example, the interval residuals [ey] have comparable variances, and hence the performance of Q4,y is close to that of IGLRy. In conclusion, the proposed IGLR• provides the best results in terms of FAR and MDR compared to the other indices. To confirm this observation, the proposed fault detection approach is applied to a distillation column example.
5.1. Distillation column example
Here, the fault detection strategy presented in this paper is tested on a simulated distillation column process. Distillation is one of the most common liquid-liquid separation processes, and can be carried out in a continuous or batch system. Distillation can be used to separate binary or multi-component mixtures. Many variables, such as column pressure, temperature, size, and diameter, are determined by the properties of the feed and the desired products. The plant considered is a continuous distillation column, represented by a linearized dynamic model [41,42]. Column A has 40 theoretical stages and separates a binary mixture with a relative volatility of 1.5 into products of 99% purity. Fig. 14 shows the diagram of a simple distillation column.
Fig. 19. Evolution of Q4,x and Q4,y with a fault on variable x2.
[Figure: two panels, IGLRx and IGLRy, each plotted against sample number (0 to 2000).]
Fig. 20. Evolution of IGLRx and IGLRy with a fault on variable x2.
The distillation column is simulated for 2 h under normal operating conditions, and 2000 data samples are generated. Table 2 shows the 12 process variables to be monitored. Because the sensor measurements are noisy and imprecise, interval-valued data are generated by adding 1% of the magnitude of the data to represent sensor imprecision. Fig. 15 shows the time evolution of the column interval-valued data. The output matrix contains two variables, Y = [x6 x7] (dependent variables), and the remaining variables form the input data matrix X (independent variables). The CPLS model is derived from this data set with 6 retained latent variables.
For fault detection, two scenarios are considered. In the first scenario, a fault is introduced on the variable x2 (input variable) between sample times 1000 and 2000. Figs. 16–20 show, respectively, the time evolution of the fault detection indices Q1,•, Q2,•, Q3,•, Q4,• and IGLR•. The performance of the different fault detection indices in terms of FAR and MDR is presented in Table 3. From Table 3, it is clear that the indices IGLRx and Q4,x resulted in the best performance, with IGLRx slightly better than Q4,x.
In the second scenario, a fault on the variable y2 (output variable) is introduced between sample times 1000 and 2000. Here, only the indices Q4,• and IGLR• are presented, as depicted in Figs. 21 and 22, respectively. The FAR and MDR results for this case are given in Table 4. The index IGLRy gives the best performance compared to the other indices, with MDR = 0 and FAR = 5.3.
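The data preparation and scoring steps of this experiment can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the interval is assumed to be the measurement plus or minus 1% of its absolute value, FAR is assumed to be the percentage of fault-free samples that raise an alarm, MDR the percentage of faulty samples that raise none, and the fault is assumed to be active from sample index 1000 onward. The function names `to_interval` and `far_mdr`, and the toy alarm sequence, are hypothetical.

```python
import numpy as np


def to_interval(x, imprecision=0.01):
    """Return (lower, upper) bounds x -/+ imprecision * |x| (assumed interval model)."""
    radius = imprecision * np.abs(x)
    return x - radius, x + radius


def far_mdr(alarms, faulty):
    """Return (FAR, MDR) in percent from boolean alarm flags and a boolean fault mask."""
    far = 100.0 * np.mean(alarms[~faulty])   # alarms raised on fault-free samples
    mdr = 100.0 * np.mean(~alarms[faulty])   # faulty samples with no alarm
    return far, mdr


# Interval-valued version of a few hypothetical measurements.
x = np.array([100.0, -50.0, 0.0])
lower, upper = to_interval(x)

# Toy alarm sequence for 2000 samples with the fault active from index 1000.
n = 2000
faulty = np.arange(n) >= 1000
alarms = faulty.copy()          # start from a perfect detector
alarms[:50] = True              # inject 50 false alarms in the fault-free part
alarms[1000:1100] = False       # inject 100 missed detections in the faulty part

far, mdr = far_mdr(alarms, faulty)
print(f"FAR = {far:.2f} %, MDR = {mdr:.2f} %")
```

In practice the boolean `alarms` array would come from comparing a detection index with its control limit at each sample; the toy sequence is used only so that the two rates can be checked by hand (50 of 1000 fault-free samples and 100 of 1000 faulty samples).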
Fig. 21. Evolution of Q4,x and Q4,y with a fault on variable y2.
[Figure: two panels, IGLRx and IGLRy, each plotted against sample number (0 to 2000).]
Fig. 22. Evolution of IGLRx and IGLRy with a fault on variable y2.
In conclusion, the proposed interval fault detection index IGLR• performs similarly to the Q4,• index in terms of FAR and MDR when the interval residuals have comparable variances. However, when the variances of the interval residuals differ from one another, the detection ability of the Q4,• visibly declines, and the proposed fault detection index, IGLR•, gives the best performance among the presented fault detection indices.
6. Conclusions
In this paper, we presented a new fault detection technique for monitoring chemical processes based on partial least squares (PLS) and the generalized likelihood ratio test (GLRT). The PLS-based interval GLRT approach deals with the problem of uncertainties in systems using a latent-variable technique based on interval-valued data. The developed center PLS (CPLS) model is applied to generate the interval residuals, which are then used for fault detection. The idea behind the novel approach is to widen the applicability of the GLRT to processes represented by input-output and interval-valued data. This provides more accurate modeling of uncertain systems and, in turn, supports better decision making with respect to fault detection. Two examples are used to evaluate the fault detection performance of the proposed CPLS-based interval GLRT approach: the first is a simulated example and the second is a distillation column example. The detection abilities of the proposed technique are evaluated in terms of the missed detection and false alarm rates. The detection results demonstrate the effectiveness of the CPLS-based interval GLRT technique, which achieves low missed detection rates with small false alarm rates. However, the PLS model is linear, whereas process data collected from most chemical systems are nonlinear. Hence, one future research direction is to develop an interval nonlinear PLS-based method to handle a wide range of nonlinearities.
Supplementary material
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.ins.2019.03.068.
References
[1] T. Ait-Izem, W. Bougheloum, M.F. Harkat, M. Djeghaba, Fault detection and isolation using interval principal component analysis methods, IFAC 48 (21) (2015) 1402–1407, doi:10.1016/j.ifacol.2015.09.721.
[2] T. Ait-Izem, M.F. Harkat, M. Djeghaba, F. Kratz, Sensor fault detection based on principal component analysis for interval-valued data, Qual. Eng. (2017) 1–13.
[3] T. Ait-Izem, M.F. Harkat, M. Djeghaba, F. Kratz, On the application of interval pca to process monitoring: a robust strategy for sensor fdi with new efficient control statistics, J. Process Control 63 (2018) 29–46.
[4] A. Benaicha, G. Mourot, J. Ragot, K. Benothman, Fault detection and isolation with interval principal component analysis, in: Proceedings of the International Conference on Control, Engineering and Information Technology, 162, 2013.
[5] L. Billard, E. Diday, Regression analysis for interval-valued data, in: Data Analysis, Classification, and Related Methods, Springer, 2000, pp. 369–374.
[6] C. Botre, M. Mansouri, M.N. Karim, H. Nounou, M. Nounou, Multiscale pls-based glrt for fault detection of chemical processes, J. Loss Prev. Process. Ind. 46 (2017) 143–153, doi:10.1016/j.jlp.2017.01.008.
[7] G. Box, et al., Some theorems on quadratic forms applied in the study of analysis of variance problems, i. effect of inequality of variance in the one-way classification, Ann. Math. Stat. 25 (2) (1954) 290–302, doi:10.1214/aoms/1177728786.
[8] P. Cazes, A. Chouakria, E. Diday, Y. Schektrman, Extension de l’analyse en composantes principales à des données de type intervalle, Revue de Statistique appliquée 45 (3) (1997) 5–24.
[9] A. Chouakria, Extension de l’analyse en composantes principales à des données de type intervalle, Ph.D. thesis, Paris IX Dauphine, INRIA-Rocquencourt, 1998.
[10] B. Dayal, J.F. MacGregor, et al., Improved pls algorithms, J. Chemom. 11 (1) (1997) 73–85.
[11] F. De-Carvalho, P. Brito, H.H. Bock, Dynamic clustering for interval data based on l2 distance, Comput. Stat. 21 (2) (2006) 231–250.
[12] S. De-Jong, Simpls: an alternative approach to partial least squares regression, Chemometr. Intell. Lab. Syst. 18 (3) (1993) 251–263, doi:10.1016/0169-7439(93)85002-x.
[13] P. D’Urso, P. Giordani, A least squares approach to principal component analysis for interval valued data, Chemometr. Intell. Lab. Syst. 70 (2) (2004) 179–192, doi:10.1016/j.chemolab.2003.11.005.
[14] R. Fezai, M. Mansouri, O. Taouali, M.F. Harkat, N. Bouguila, Online reduced kernel principal component analysis for process monitoring, J. Process Control 61 (2018) 1–11.
[15] P. Geladi, B.R. Kowalski, Partial least-squares regression: a tutorial, Anal. Chim. Acta 185 (1986) 1–17, doi:10.1016/0003-2670(86)80028-9.
[16] J. Gertler, D. Singer, A new structural framework for parity equation-based failure detection and isolation, Automatica 26 (2) (1990) 381–388, doi:10.1016/0005-1098(90)90133-3.
[17] F. Gustafsson, The marginalized likelihood ratio test for detecting abrupt changes, IEEE Trans. Automat. Contr. 41 (1) (1996) 66–78.
[18] M.F. Harkat, M. Mansouri, M. Nounou, H. Nounou, Enhanced data validation strategy of air quality monitoring network, Environ. Res. 160 (2018) 183–194.
[19] M.F. Harkat, G. Mourot, J. Ragot, An improved pca scheme for sensor fdi: application to an air quality monitoring network, J. Process Control 16 (6) (2006) 625–634, doi:10.1016/j.jprocont.2005.09.007.
[20] T. Hickey, Q. Ju, M.H. Van Emden, Interval arithmetic: from principles to implementation, J. ACM 48 (5) (2001) 1038–1068.
[21] E. Jackson, A User’s Guide to Principal Components, 587, John Wiley & Sons, 2005, doi:10.1002/0471725331.
[22] E. Jackson, G. Mudholkar, Control procedures for residuals associated with principal component analysis, Technometrics 21 (3) (1979) 341–349, doi:10.2307/1267757.
[23] S.M. Kay, Fundamentals of Statistical Signal Processing, vol. II: Detection Theory, Signal Processing, Prentice Hall, Upper Saddle River, NJ, 1998.
[24] T. Kourti, J. MacGregor, Process analysis, monitoring and diagnosis using multivariate projection methods: a tutorial, Chemometr. Intell. Lab. Syst. 28 (3) (1995) 3–21.
[25] V. Kreinovich, H.T. Nguyen, B. Wu, On-line algorithms for computing mean and variance of interval data, and their use in intelligent systems, Inf. Sci. 177 (16) (2007) 3228–3238.
[26] J.V. Kresta, J.F. MacGregor, T.E. Marlin, Multivariate statistical monitoring of process operating performance, Can. J. Chem. Eng. 69 (1) (1991) 35–47, doi:10.1002/cjce.5450690105.
[27] U. Kruger, X. Wang, Q. Chen, S.J. Qin, An alternative pls algorithm for the monitoring of industrial process, in: IEEE American Control Conference, 6, 2001, pp. 4455–4459, doi:10.1109/acc.2001.945680.
[28] N.C. Lauro, F. Palumbo, Principal component analysis on subpopulations: an interval data approach, in: IMPS Conference’01, Osaka (Japan), 2001.
[29] J. Le-Rademacher, L. Billard, Symbolic covariance principal component analysis and visualization for interval-valued data, J. Comput. Graph. Stat. 21 (2) (2012) 413–432, doi:10.1080/10618600.2012.679895.
[30] J.F. MacGregor, C. Jaeckle, C. Kiparissides, M. Koutoudi, Process monitoring and diagnosis by multiblock pls methods, AIChE J. 40 (5) (1994), doi:10.1002/aic.690400509.
[31] J.F. MacGregor, T. Kourti, Statistical process control of multivariate processes, Control Eng. Pract. 3 (3) (1995) 403–414.
[32] M. Mansouri, M.F. Harkat, M. Nounou, H. Nounou, Midpoint-radii principal component analysis-based ewma and application to air quality monitoring network, Chemometr. Intell. Lab. Syst. (2018).
[33] M. Mansouri, M.N. Nounou, H. Nounou, Improved statistical fault detection technique and application to biological phenomena modeled by s-systems, IEEE Trans. Nanobiosci. 16 (6) (2017) 504–512.
[34] R. Moore, Interval Analysis, Prentice Hall, Englewood Cliffs, NJ, 1966.
[35] F. Palumbo, C.N. Lauro, A pca for interval-valued data based on midpoints and radii, in: New Developments in Psychometrics, Springer, 2003, pp. 641–648, doi:10.1007/978-4-431-66996-8_74.
[36] S.J. Qin, Statistical process monitoring: basics and beyond, J. Chemom. 17 (8–9) (2003) 480–502, doi:10.1002/cem.800.
[37] S.J. Qin, Survey on data-driven industrial process monitoring and diagnosis, Annu. Rev. Control. 36 (2) (2012) 220–234, doi:10.1016/j.arcontrol.2012.09.004.
[38] P. Rumschinski, J. Richter, A. Savchenko, S. Borchers, J. Lunze, R. Findeisen, Complete fault diagnosis of uncertain polynomial systems, IFAC Proc. Vol. 43 (5) (2010) 127–132.
[39] E. Russell, L. Chiang, R. Braatz, Data-driven methods for fault detection and diagnosis in chemical processes, Springer-Verlag, London, 2000, doi:10.1007/978-1-4471-0409-4.
[40] B. Sinova, A. Colubi, G. González-Rodríguez, et al., Interval arithmetic-based simple linear regression between interval data: discussion and sensitivity analysis on the choice of the metric, Inf. Sci. 199 (2012) 109–124.
[41] S. Skogestad, et al., Dynamics and control of distillation columns: a tutorial introduction, Chem. Eng. Res. Des. 75 (6) (1997) 539–562.
[42] S. Skogestad, P. Lundström, E.W. Jacobsen, Selecting the best distillation control configuration, AIChE J. 36 (5) (1990) 753–764.
[43] X. Wei, H. Liu, Y. Qin, Fault diagnosis of rail vehicle suspension systems by using glrt, in: Control and Decision Conference (CCDC), 2011 Chinese, IEEE, 2011, pp. 1932–1936.
[44] A. Willsky, E. Chow, S. Gershwin, C. Greene, P. Houpt, A. Kurkjian, Dynamic model-based techniques for the detection of incidents on freeways, IEEE Trans. Automat. Contr. 25 (3) (1980) 347–360.
[45] H. Wold, Estimation of principal components and related models by iterative least squares, Multivariate analysis (1966) 391–420.
[46] S. Wold, N. Kettaneh-Wold, B. Skagerberg, Nonlinear pls modeling, Chemometr. Intell. Lab. Syst. 7 (53) (1989) 53–65, doi:10.1016/0169-7439(89)80111-x.
[47] S. Yin, S.X. Ding, X. Xie, H. Luo, A review on basic data-driven approaches for industrial process monitoring, IEEE Trans. Ind. Electron. 61 (11) (2014) 6418–6428, doi:10.1109/tie.2014.2301773.