Sensor fault detection and diagnosis in the presence of outliers


Neurocomputing 349 (2019) 156–163

Contents lists available at ScienceDirect

Neurocomputing journal homepage: www.elsevier.com/locate/neucom

Chen Xu, Shunyi Zhao, Fei Liu∗

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi 214122, China

Article info

Article history: Received 27 August 2018; Revised 27 December 2018; Accepted 8 January 2019; Available online 17 January 2019. Communicated by Shen Yin.

Keywords: Sensor fault detection and diagnosis; t-distribution; Unknown noise statistics; Variational Bayesian inference

Abstract

In this paper, a sensor fault detection and diagnosis (FDD) method is proposed for linear state-space models in the presence of outliers. The t-distribution with an unknown scale matrix and an unknown degrees of freedom (dof) parameter is used to describe the measurement noise. Using variational Bayesian inference, the states, the scale matrix, and the dof parameter are estimated simultaneously. Since the noise distribution is no longer Gaussian, a modified residual evaluation is proposed to detect the fault. After that, the cause of the fault can be determined by observing the changes in the measurement noise covariance. A simulated two continuous stirred tank reactor (CSTR) process is used to demonstrate that the proposed method provides more reliable FDD results than the existing methods when the measurements contain outliers.

1. Introduction

Due to the safety and reliability requirements of complex industrial processes, the problem of fault detection and diagnosis (FDD) has attracted wide attention [1–6]. With the increasing level of automation, process control systems place strict requirements on the equipment, including the sensor measurements. Sensor abnormalities or faults can therefore lead to serious consequences, such as product quality degradation. The primary task of FDD is to detect faults and identify the faulty information so that relevant control actions can be taken before the faults develop into unpredictable events [7–11]. At present, FDD has been effectively applied to cyber-physical systems (CPS) and distributed systems, and a relevant toolbox has been developed [12–16].

In general, FDD methods can be divided into two categories: model-based methods and data-based methods [17–21]. Among the existing model-based FDD approaches, the filter-based method is considered an efficient tool and has been widely used from various perspectives [22–25]. The Kalman filter (KF), designed on the basis of the nominal system model, provides state estimates in the minimum mean square error (MMSE) sense and is often used to detect faults in the linear case [26,27]. By using a residual generator, the well-known generalized likelihood ratio (GLR) test framework can be used for online FDD [28]. Since then, many extensions and modifications have been proposed to enhance the sensitivity to faults as well as the robustness to disturbances [29–31]. For example,



Corresponding author. E-mail address: fl[email protected] (F. Liu).

https://doi.org/10.1016/j.neucom.2019.01.025 0925-2312/© 2019 Published by Elsevier B.V.


two KFs are designed in [29] to diagnose faults in a linear drive system, and the interacting multiple-model method is proposed in [30] for the case of mismodeled transition probabilities. In [31], a sensor FDD algorithm based on the extended KF is presented, which can isolate all sensor faults and is robust to system noise. A limitation of the aforementioned methods is that they assume complete prior knowledge of the noise statistics. However, the measurement noise statistics are not easily available in real applications, and the detection performance degrades when they are inaccurate.

On the other hand, outliers are of considerable practical importance since they occur frequently in industrial processes [32]. Outliers are mainly caused by random variations, recording errors in the measurements, and so on, whereas a fault may be introduced by sensor failure or damage [33]. Therefore, when outliers exist in the measurements, the KF based FDD approaches [25–28] cannot distinguish between faults and outliers and break down, since the Gaussian assumption is no longer valid. To cope with outliers, some robust filters based on the t-distribution have been proposed in the state estimation domain [34–36]. However, the outlier issue has not been solved in the field of model-based FDD.

In this paper, we propose a sensor FDD method for linear state-space models in the presence of outliers. To handle outliers occurring in the measurements, a robust filter is designed based on the t-distribution, which is used to describe the measurement noise. Using variational Bayesian (VB) inference, the unknown noise statistics, including the scale matrix and the degrees of freedom (dof) parameter, are estimated simultaneously with the system states. By introducing a tolerance interval, the fault can be detected using residual generators in a way that distinguishes outliers from faults. Furthermore, the modified measurement covariance is estimated to determine which sensors

C. Xu, S. Zhao and F. Liu / Neurocomputing 349 (2019) 156–163

are faulty, which is meaningful for troubleshooting. The main contributions of this paper are summarized as follows:

• In the proposed method, the tolerance interval and the modified measurement covariance are first introduced to detect the fault and determine the faulty sensors.
• The proposed FDD method can be implemented in the case of unknown noise statistics, which is critical for practical applications.
• Compared with the traditional KF based FDD [25–28], the proposed method can significantly reduce false alarms in the presence of outliers.

The rest of this paper is organized as follows. In Section 2, we introduce the t-distribution and formulate the problem. In Section 3, a robust filter is designed using the VB approximation, and in Section 4 a sensor FDD method is proposed based on this filter. In Section 5, a chemical engineering example is used to test the proposed method, and conclusions are drawn in Section 6.

2. Preliminaries and problem formulation

Consider the linear state-space model

$$x_{k+1} = A_k x_k + B_k u_k + w_k, \tag{1}$$

$$z_k = C_k x_k + f_k + v_k, \tag{2}$$

where $k$ denotes the time index, $x_k \in \mathbb{R}^m$ is the state vector, $u_k \in \mathbb{R}^p$ is the control input signal, $z_k \in \mathbb{R}^d$ is the measurement vector, $f_k \in \mathbb{R}^q$ is the unknown sensor fault to be detected, and $A_k \in \mathbb{R}^{m \times m}$, $B_k \in \mathbb{R}^{m \times p}$, and $C_k \in \mathbb{R}^{d \times m}$ are the system matrices. The process noise $w_k \sim N(0, Q_k)$ is Gaussian with mean vector 0 and covariance matrix $Q_k$, and $v_k$ is the measurement noise. In this paper, the initial state is $x_0 \sim N(x_{0|0}, P_{0|0})$, and $x_0$, $w_k$ and $v_j$ are assumed to be mutually uncorrelated for any $k$ and $j$.

In the traditional KF based FDD methods [25–28], outliers are often treated as faults since they violate the Gaussian assumption. It is worth noting that the t-distribution is often used to characterize potential outliers [37,38]. The probability density function (PDF) of an $n$-dimensional t-distribution is defined by

$$t(x; u, P, \nu) = \frac{\Gamma\left(\frac{\nu + n}{2}\right) |P|^{-\frac{1}{2}}}{\Gamma\left(\frac{\nu}{2}\right) (\pi \nu)^{\frac{n}{2}} \left(1 + \frac{\Delta^2}{\nu}\right)^{\frac{\nu + n}{2}}}, \tag{3}$$

where $\Delta^2 = (x - u)^T P^{-1} (x - u)$ and $\Gamma(\cdot)$ is the Gamma function. In this paper, $v_k$ is assumed to follow

$$p(v_k) = t(v_k; 0, R_k, \nu_k), \tag{4}$$

where $t(v_k; 0, R_k, \nu_k)$ denotes the t-distribution PDF with location 0, scale matrix $R_k$ and dof parameter $\nu_k$. In this paper, $R_k = \mathrm{diag}(R_{k,1}, \ldots, R_{k,d})$ is a diagonal matrix, and the dof parameter is $\nu_k = (\nu_{k,1}, \ldots, \nu_{k,d})^T$.

Here, we compare the Gaussian distribution and the t-distribution by means of a two-dimensional example, where the parameters are given by

$$u = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad P = \begin{bmatrix} 1 & -0.35 \\ -0.35 & 1 \end{bmatrix}, \quad \nu = \begin{bmatrix} 5 \\ 5 \end{bmatrix}.$$

Fig. 1 shows the heat maps of $\ln N(x; u, P)$ and $\ln t(x; u, P, \nu)$. The coloring is the same in Fig. 1(a) and (b). The white region (values below −15) appears in Fig. 1(a) but not in Fig. 1(b). That is, the t-distribution assigns a larger probability to a sample (outlier) that lies far away from $u$. This confirms that the t-distribution has heavier tails, which are controlled by the dof parameter [37,38]. Therefore, the t-distribution can model heavy-tailed noise and deal with the outlier problem [34–36].

The problem considered in this paper can be formulated as follows: given the system model (1) and (2) with t-distributed measurement noise, we design a robust filter for sensor FDD. A distinct feature of the proposed method is that the scale matrix and the dof parameter of the t-distribution for the measurement noise are unknown.

3. Robust filter design using variational Bayesian inference

3.1. Prior distribution selection for t-distribution based model

To derive a closed-form solution to the posterior PDF, the likelihood $p(z_k|x_k)$ needs to be modeled. Using (2) and (4), the likelihood $p(z_k|x_k)$ can be expressed as

$$p(z_k|x_k) = t(z_k; C_k x_k, R_k, \nu_k). \tag{5}$$

In order to guarantee that the posterior distribution has the same functional form as the prior distribution, conjugate prior distributions are selected for the unknown scale matrix $R_k$ and the dof parameter $\nu_k$. Specifically, independent Inverse-Gamma and Gamma distributions are chosen as the prior distributions for $R_k$ and $\nu_k$, respectively [34,39]:

$$p(R_k|z_{1:k-1}) = \prod_{i=1}^{d} \mathrm{IG}(R_{k,i}; \bar{\alpha}_{k,i}, \bar{\beta}_{k,i}) = \prod_{i=1}^{d} \frac{\bar{\beta}_{k,i}^{\bar{\alpha}_{k,i}} R_{k,i}^{-\bar{\alpha}_{k,i} - 1} e^{-\bar{\beta}_{k,i}/R_{k,i}}}{\Gamma(\bar{\alpha}_{k,i})}, \tag{6}$$

$$p(\nu_k|z_{1:k-1}) = \prod_{i=1}^{d} \mathrm{G}(\nu_{k,i}; \bar{a}_{k,i}, \bar{b}_{k,i}) = \prod_{i=1}^{d} \frac{\bar{b}_{k,i}^{\bar{a}_{k,i}} \nu_{k,i}^{\bar{a}_{k,i} - 1} e^{-\bar{b}_{k,i} \nu_{k,i}}}{\Gamma(\bar{a}_{k,i})}, \tag{7}$$

where $\bar{\alpha}_{(\cdot)}$ and $\bar{\beta}_{(\cdot)}$ are the shape and scale parameters of the Inverse-Gamma distribution, and $\bar{a}_{(\cdot)}$ and $\bar{b}_{(\cdot)}$ are the shape and rate parameters of the Gamma distribution. According to the Kalman prediction equations, the one-step predicted PDF can be written as

$$p(x_k|z_{1:k-1}) = N(x_k; x_{k|k-1}, P_{k|k-1}), \tag{8}$$

where the predicted state $x_{k|k-1}$ and the corresponding error covariance $P_{k|k-1}$ are given by

$$x_{k|k-1} = A_{k-1} x_{k-1|k-1} + B_{k-1} u_{k-1}, \tag{9}$$

$$P_{k|k-1} = A_{k-1} P_{k-1|k-1} A_{k-1}^T + Q_{k-1}. \tag{10}$$
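The prediction step (9)–(10) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the helper names (`mat_mul`, `mat_vec`, `predict`) and all numerical values are assumptions chosen only for the example.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def mat_mul(A: Matrix, B: Matrix) -> Matrix:
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_vec(A: Matrix, x: Vector) -> Vector:
    """Product of a matrix and a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def predict(x: Vector, P: Matrix, A: Matrix, B: Matrix, u: Vector,
            Q: Matrix) -> Tuple[Vector, Matrix]:
    """One-step prediction: x_pred = A x + B u (9), P_pred = A P A^T + Q (10)."""
    x_pred = [ax + bu for ax, bu in zip(mat_vec(A, x), mat_vec(B, u))]
    A_T = [list(col) for col in zip(*A)]
    APAt = mat_mul(mat_mul(A, P), A_T)
    P_pred = [[p + q for p, q in zip(prow, qrow)] for prow, qrow in zip(APAt, Q)]
    return x_pred, P_pred


# Hypothetical two-state system (illustrative values only).
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.0], [1.0]]
x_pred, P_pred = predict([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], A, B, [2.0],
                         [[0.1, 0.0], [0.0, 0.1]])
```

With these values the predicted state is `[3.0, 4.0]`, and the predicted covariance is the previous covariance propagated through `A` plus the process noise covariance.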

To derive $p(R_k|z_{1:k-1})$ and $p(\nu_k|z_{1:k-1})$, dynamic models of the scale matrix $R_k$ and the dof parameter $\nu_k$ are necessary. Similar to [39], a heuristic model for $R_k$ and $\nu_k$ is implemented, which simply spreads their previous values by a scalar factor $\rho \in (0, 1]$. According to (6) and (7), we obtain the dynamic model as follows:

$$\bar{\alpha}_{k,i} = \rho \alpha_{k-1,i}, \tag{11}$$

$$\bar{\beta}_{k,i} = \rho \beta_{k-1,i}, \tag{12}$$

$$\bar{a}_{k,i} = \rho a_{k-1,i}, \tag{13}$$

$$\bar{b}_{k,i} = \rho b_{k-1,i}. \tag{14}$$
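The effect of the factor ρ in (13)–(14) can be checked numerically: scaling both Gamma hyperparameters by ρ leaves the mean a/b unchanged and inflates the variance a/b² by 1/ρ. The sketch below is only an illustration; the function names and the hyperparameter values are assumptions, not taken from the paper.

```python
from typing import Tuple


def spread(shape: float, second: float, rho: float) -> Tuple[float, float]:
    """Scale both hyperparameters by rho in (0, 1], as in (11)-(14)."""
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    return rho * shape, rho * second


def gamma_mean_var(a: float, b: float) -> Tuple[float, float]:
    """Mean and variance of a Gamma distribution with shape a and rate b."""
    return a / b, a / (b * b)


# Hypothetical posterior hyperparameters of one dof component at time k-1.
a_prev, b_prev = 10.0, 2.0
a_bar, b_bar = spread(a_prev, b_prev, rho=0.5)

mean_prev, var_prev = gamma_mean_var(a_prev, b_prev)
mean_pred, var_pred = gamma_mean_var(a_bar, b_bar)
```

Here the mean stays at 5.0 while the variance doubles from 2.5 to 5.0, so a smaller ρ lets the filter forget old information about the noise statistics faster. For the Inverse-Gamma pair (11)–(12), the same scaling keeps the ratio β/α fixed.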


Fig. 1. Comparison of Gaussian distribution and t-distribution by heat maps.

Note that $\rho = 1$ means that the noise variance is stationary, and that this form ensures the conjugacy and the stability of the variational method [39]. Because a closed-form solution of the posterior PDF is unavailable, the t-distribution based model needs to be transformed into a hierarchical Gaussian model by introducing an auxiliary variable $\Lambda_k = \mathrm{diag}(\lambda_{k,1}, \ldots, \lambda_{k,d})$. According to [34,36], the likelihood $p(z_k|x_k)$ can be rewritten in the following form:

$$p(z_k|x_k, \Lambda_k, R_k) = N(z_k; C_k x_k, \Lambda_k^{-1} R_k), \tag{15}$$

$$p(\Lambda_k|\nu_k) = \prod_{i=1}^{d} \mathrm{G}\left(\lambda_{k,i}; \frac{\nu_{k,i}}{2}, \frac{\nu_{k,i}}{2}\right). \tag{16}$$

The problem has thus been transformed into state estimation for the t-distribution based hierarchical model given by (6)–(8) and (15)–(16).

3.2. Posterior updating with VB

In order to estimate the state $x_k$ of the hierarchical model formulated in (6)–(8) and (15)–(16), the joint posterior PDF $p(x_k, \Lambda_k, R_k, \nu_k|z_{1:k})$ needs to be computed. Using Bayes' theorem, the joint filtering posterior PDF can be written as

$$p(x_k, \Lambda_k, R_k, \nu_k|z_{1:k}) \propto p(x_k, \Lambda_k, R_k, \nu_k, z_k|z_{1:k-1}) = p(z_k|x_k, \Lambda_k, R_k)\, p(x_k|z_{1:k-1})\, p(\Lambda_k|\nu_k)\, p(\nu_k|z_{1:k-1})\, p(R_k|z_{1:k-1}). \tag{17}$$

Exploiting (17), $p(x_k, \Lambda_k, R_k, \nu_k, z_k|z_{1:k-1})$ is formulated as

$$p(x_k, \Lambda_k, R_k, \nu_k, z_k|z_{1:k-1}) = N(z_k; C_k x_k, \Lambda_k^{-1} R_k)\, N(x_k; x_{k|k-1}, P_{k|k-1}) \times \prod_{i=1}^{d} \mathrm{G}\left(\lambda_{k,i}; \frac{\nu_{k,i}}{2}, \frac{\nu_{k,i}}{2}\right) \prod_{i=1}^{d} \mathrm{G}(\nu_{k,i}; \bar{a}_{k,i}, \bar{b}_{k,i}) \times \prod_{i=1}^{d} \mathrm{IG}(R_{k,i}; \bar{\alpha}_{k,i}, \bar{\beta}_{k,i}). \tag{18}$$

Since the system states $x_k$, the auxiliary variable $\Lambda_k$, the scale matrix $R_k$, and the dof $\nu_k$ are coupled through the likelihood $p(z_k|x_k, \Lambda_k, R_k, \nu_k)$, the exact joint posterior is not analytically tractable. In order to solve this issue, the VB method is used to approximate the posterior distribution. We search for a free-form factored approximating distribution for $p(x_k, \Lambda_k, R_k, \nu_k|z_{1:k})$ as

$$p(x_k, \Lambda_k, R_k, \nu_k|z_{1:k}) \approx q(x_k)\, q(\Lambda_k)\, q(R_k)\, q(\nu_k), \tag{19}$$

where $q(x_k)$, $q(\Lambda_k)$, $q(R_k)$ and $q(\nu_k)$ are the approximate posterior PDFs of $x_k$, $\Lambda_k$, $R_k$ and $\nu_k$, respectively. This approximation can be formed by minimizing the Kullback–Leibler (KL) divergence [40]

$$\{q(x_k), q(\Lambda_k), q(R_k), q(\nu_k)\} = \arg\min \mathrm{KL}\big(q(x_k)\, q(\Lambda_k)\, q(R_k)\, q(\nu_k)\, \|\, p(x_k, \Lambda_k, R_k, \nu_k|z_{1:k})\big), \tag{20}$$

where $\mathrm{KL}(q(x)\|p(x)) = \int q(x) \log \frac{q(x)}{p(x)}\, dx$ is the KL divergence between $q(x)$ and $p(x)$. The analytical solutions for $q(x_k)$, $q(\Lambda_k)$, $q(R_k)$ and $q(\nu_k)$ are then obtained by fixed-point iterations of the following equation [36]:

$$\log q(\vartheta) = \langle \log p(\Xi, z_k|z_{1:k-1}) \rangle_{\Xi^{(-\vartheta)}} + c_{\vartheta}, \tag{21}$$

$$\Xi \triangleq \{x_k, \Lambda_k, R_k, \nu_k\}, \tag{22}$$

where $\langle \cdot \rangle$ denotes the expectation operation, $\vartheta$ is an arbitrary element of $\Xi$, $\Xi^{(-\vartheta)}$ is the set of all elements in $\Xi$ except for $\vartheta$, and $c_{\vartheta}$ is a constant with respect to the variable $\vartheta$. Setting $\vartheta = x_k$ and substituting (18) into (21) yields

$$\log q(x_k) = -\frac{1}{2} (z_k - C_k x_k)^T \langle R_k^{-1} \rangle \langle \Lambda_k \rangle (z_k - C_k x_k) - \frac{1}{2} (x_k - x_{k|k-1})^T P_{k|k-1}^{-1} (x_k - x_{k|k-1}) + c_x. \tag{23}$$

Define the modified likelihood PDF $p(z_k|x_k)$ as

$$p(z_k|x_k) = N(z_k; C_k x_k, \tilde{R}_k), \tag{24}$$

where the modified measurement covariance $\tilde{R}_k$ is given by

$$\tilde{R}_k = \langle \Lambda_k \rangle^{-1} \langle R_k^{-1} \rangle^{-1}. \tag{25}$$

Using (8), (24) and (25) in (23), we obtain

$$q(x_k) \propto N(z_k; C_k x_k, \tilde{R}_k)\, N(x_k; x_{k|k-1}, P_{k|k-1}). \tag{26}$$

Employing (8) and (23)–(26), $q(x_k)$ can be updated as a Gaussian PDF:

$$q(x_k) = N(x_k; x_{k|k}, P_{k|k}), \tag{27}$$

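where $x_{k|k}$ is the filtering estimate and $P_{k|k}$ is the estimation error covariance, given by

$$K_x = P_{k|k-1} C_k^T (C_k P_{k|k-1} C_k^T + \tilde{R}_k)^{-1}, \tag{28}$$

$$x_{k|k} = x_{k|k-1} + K_x (z_k - C_k x_{k|k-1}), \tag{29}$$

$$P_{k|k} = (I - K_x C_k) P_{k|k-1}. \tag{30}$$

When $\vartheta = \Lambda_k$, $q(\Lambda_k)$ is updated as a Gamma PDF:

$$q(\Lambda_k) = \prod_{i=1}^{d} \mathrm{G}(\lambda_{k,i}; \theta_{k,i}, U_{k,i}), \tag{31}$$

where

$$\theta_{k,i} = \frac{1}{2} (\langle \nu_{k,i} \rangle + 1), \tag{32}$$

$$U_{k,i} = \frac{1}{2} ([\Psi_k]_{ii} + \langle \nu_{k,i} \rangle), \tag{33}$$

where

$$\Psi_k = \mathrm{tr}\big(\langle R_k^{-1} \rangle ((z_k - C_k x_{k|k})(z_k - C_k x_{k|k})^T + C_k P_{k|k} C_k^T)\big), \tag{34}$$

and $[\cdot]_{ii}$ denotes the $i$th diagonal element of $[\cdot]$. The derivation of (31)–(33) is given in Appendix A.1.

When $\vartheta = R_k$, $q(R_k)$ is updated as an Inverse-Gamma PDF:

$$q(R_k) = \prod_{i=1}^{d} \mathrm{IG}(R_{k,i}; \alpha_{k,i}, \beta_{k,i}), \tag{35}$$

where

$$\alpha_{k,i} = \bar{\alpha}_{k,i} + \frac{1}{2}, \tag{36}$$

$$\beta_{k,i} = \bar{\beta}_{k,i} + \frac{1}{2} \langle \lambda_{k,i} \rangle \big[(z_k - C_k x_{k|k})_i^2 + (C_k P_{k|k} C_k^T)_{ii}\big]. \tag{37}$$

The derivation of (35)–(37) is also given in Appendix A.1.

When $\vartheta = \nu_k$, $q(\nu_k)$ is updated as a Gamma PDF:

$$q(\nu_k) = \prod_{i=1}^{d} \mathrm{G}(\nu_{k,i}; a_{k,i}, b_{k,i}), \tag{38}$$

where

$$a_{k,i} = \bar{a}_{k,i} + \frac{1}{2}, \tag{39}$$

$$b_{k,i} = \bar{b}_{k,i} - \frac{1}{2} - \frac{1}{2} \langle \log \lambda_{k,i} \rangle + \frac{1}{2} \langle \lambda_{k,i} \rangle. \tag{40}$$

The derivation of (38)–(40) is given in Appendix A.2.

3.2.1. Computation of expectations

Using (27), (31), (35) and (38), we can compute the required expectations $\langle \Lambda_k \rangle$, $\langle R_k^{-1} \rangle$, $\langle \nu_k \rangle$ and $\langle \log \lambda_k \rangle$ as

$$\langle \Lambda_k \rangle = \mathrm{diag}(\langle \lambda_{k,1} \rangle, \ldots, \langle \lambda_{k,d} \rangle) = \mathrm{diag}\left(\frac{\theta_{k,1}}{U_{k,1}}, \ldots, \frac{\theta_{k,d}}{U_{k,d}}\right), \tag{41}$$

$$\langle R_k^{-1} \rangle = \mathrm{diag}\left(\frac{\alpha_{k,1} - 1}{\beta_{k,1}}, \ldots, \frac{\alpha_{k,d} - 1}{\beta_{k,d}}\right), \tag{42}$$

$$\langle \nu_{k,i} \rangle = a_{k,i} / b_{k,i}, \tag{43}$$

$$\langle \log \lambda_{k,i} \rangle = \psi(\theta_{k,i}) - \log U_{k,i}, \tag{44}$$

where $\psi(\cdot)$ denotes the digamma function. When the lower bound converges (say, after $N$ iteration steps), the approximate posterior PDFs $q(x_k)$, $q(\Lambda_k)$, $q(R_k)$ and $q(\nu_k)$ are updated as

$$q(x_k) \approx N\big(x_k; x_{k|k}^{(N)}, P_{k|k}^{(N)}\big) = N(x_k; x_{k|k}, P_{k|k}), \tag{45}$$

$$q(\Lambda_k) \approx \prod_{i=1}^{d} \mathrm{G}\big(\lambda_{k,i}; \theta_{k,i}^{(N)}, U_{k,i}^{(N)}\big) = \prod_{i=1}^{d} \mathrm{G}(\lambda_{k,i}; \theta_{k,i}, U_{k,i}), \tag{46}$$

$$q(R_k) \approx \prod_{i=1}^{d} \mathrm{IG}\big(R_{k,i}; \alpha_{k,i}^{(N)}, \beta_{k,i}^{(N)}\big) = \prod_{i=1}^{d} \mathrm{IG}(R_{k,i}; \alpha_{k,i}, \beta_{k,i}), \tag{47}$$

$$q(\nu_k) \approx \prod_{i=1}^{d} \mathrm{G}\big(\nu_{k,i}; a_{k,i}^{(N)}, b_{k,i}^{(N)}\big) = \prod_{i=1}^{d} \mathrm{G}(\nu_{k,i}; a_{k,i}, b_{k,i}). \tag{48}$$

4. Sensor FDD based on robust filter

4.1. Sensor fault detection

As in the traditional KF based methods [25–28], the fault detection step is carried out by evaluating the residual at each estimation step. Accordingly, the following two hypotheses are introduced:

• H0: the system operates normally,
• H1: a fault occurs in the system.

In order to detect the fault, the test statistic is defined as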

$$T_k = e_k^T \big[C_k P_{k|k-1} C_k^T + \tilde{R}_k\big]^{-1} e_k, \tag{49}$$

where the residual is $e_k = z_k - C_k x_{k|k-1}$. The traditional threshold can be determined by the $\chi^2$ distribution, i.e.,

$$H_0: \; T_k \le J_{th} = \chi^2_{\alpha_s}(d) \quad \forall k, \tag{50}$$

$$H_1: \; T_k > J_{th} = \chi^2_{\alpha_s}(d) \quad \exists k, \tag{51}$$

where $J_{th}$ is the threshold, $\alpha_s$ is the significance level, and $d$ is the measurement dimensionality. However, according to [33], the threshold $J_{th}$ is often beyond the ideal limits since the measurement noise is assumed to be t-distributed, and the test may become too weak to detect faulty events. In this paper, we introduce a tolerance interval, denoted here by $\Omega$, to relieve the insensitivity caused by the t-distribution, and a new logic is used to detect the sensor fault. The modified residual evaluation is defined as

$$H_0: \; T_k \le J_{th} = \chi^2_{\alpha_s}(d) \quad \forall k, \tag{52}$$

$$H_1: \; T_K > J_{th} = \chi^2_{\alpha_s}(d) \quad \exists K, \tag{53}$$

where $K = [k - \Omega, k]$, $T_K = \{T_{k-\Omega}, \cdots, T_k\}$, and $\Omega$ is the tolerance interval. Although this degrades the fault detection accuracy by $\Omega$ time steps, the modification improves the effectiveness of fault detection in the presence of outliers.

Remark 1. If an outlier exists, the statistic $T_k$ changes abruptly, exceeds $J_{th}$, and returns to normal within the next few time steps (this number of time steps is the tolerance interval $\Omega$). Under the traditional logic (50)–(51), the outlier would be misjudged as a fault. Therefore, it is necessary to design a new logic to eliminate the influence of outliers on sensor fault detection.

Table 1
Parameters of the two-CSTR process.

Parameter   Value                  Parameter   Value
F0          4.998 m³/h             R           8.314 kJ/(kmol K)
F1          39.996 m³/h            cp          0.231 kJ/kg
F3          30 m³/h                E1          5.0 × 10⁴ kJ/kmol
Fr          34.998 m³/h            E2          7.53 × 10⁴ kJ/kmol
V1          1.0 m³                 E3          7.53 × 10⁴ kJ/kmol
V2          3.0 m³                 k10         3.0 × 10⁶ h⁻¹
T0          300.0 K                k20         3.0 × 10⁵ h⁻¹
T03         300.0 K                k30         3.0 × 10⁵ h⁻¹
ρ           1000.0 kg/m³           ΔH1         −5.0 × 10⁴ kJ/kmol
C_A0^s      4.0 kmol/m³            ΔH2         −5.2 × 10⁴ kJ/kmol
C_A03^s     2.0 kmol/m³            ΔH3         −5.4 × 10⁴ kJ/kmol

Fig. 2. Two CSTR process.

Remark 2. The appropriate tolerance interval $\Omega$ differs between systems and can be determined from prior knowledge. If the process noise is small, $\Omega$ will be small, and vice versa. If $\Omega$ is 0, the fault detection method reduces to (50) and (51). In this paper, $\Omega$ is set to 2. Although this setting is chosen for the example in this paper, in our experience the proposed method with the suggested $\Omega$ gives good fault detection performance in many contexts.

4.2. Sensor fault diagnosis

After detecting the sensor fault, it is necessary to determine which sensor is faulty. For this purpose, the modified measurement covariance $\tilde{R}_k$ is used for sensor fault diagnosis. Although it is not known, the true measurement covariance does not change significantly in the normal operating mode, while a fault leads to a large change in the measurement covariance. Hence, the fault diagnosis step can be described as

$$\tilde{R}_{k,i} \le J_i, \quad i = 1, \ldots, d, \tag{54}$$

where $\tilde{R}_{k,i}$ denotes the $i$th diagonal element of $\tilde{R}_k$, and $J_i$ is the threshold of $\tilde{R}_{k,i}$. Because there are outliers in the system, the estimated $\tilde{R}_{k,i}$ will also be affected by them. Therefore, we define a modified fault diagnosis logic as

$$\tilde{R}_{K,i} \le J_i, \quad i = 1, \ldots, d, \tag{55}$$

where $\tilde{R}_{K,i} = \{\tilde{R}_{k-\Omega,i}, \ldots, \tilde{R}_{k,i}\}$. That is, when $\tilde{R}_{K,i}$ is greater than $J_i$, the $i$th sensor is faulty.

Fig. 4. Detection results in Case 1.
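The windowed logic of (53) and (55) can be sketched with one small helper. The sketch assumes a particular reading of the paper's notation, namely that a window "is greater than" the threshold when every value in it exceeds the threshold, so that an isolated spike shorter than the tolerance interval is ignored. The function name and the data are hypothetical; the threshold 5 and the tolerance interval 2 are the values quoted for the simulation study.

```python
from typing import List, Sequence


def persistent_exceedance(values: Sequence[float], threshold: float,
                          tol: int) -> List[bool]:
    """Flag step k only if every value in the window [k - tol, k] exceeds the threshold."""
    flags = []
    for k in range(len(values)):
        if k < tol:
            flags.append(False)          # window not yet complete
            continue
        window = values[k - tol:k + 1]
        flags.append(all(v > threshold for v in window))
    return flags


# Hypothetical modified covariance of one sensor: a single spike at k = 2
# (outlier-like) followed by a sustained increase from k = 5 to k = 8 (fault-like).
r_mod_seq = [0.1, 0.1, 30.0, 0.1, 0.1, 30.0, 28.0, 31.0, 29.0, 0.1]
faulty = persistent_exceedance(r_mod_seq, threshold=5.0, tol=2)
pointwise = persistent_exceedance(r_mod_seq, threshold=5.0, tol=0)
```

With a tolerance interval of 2, the spike at k = 2 is never flagged, while the sustained increase is flagged from k = 7 onward, which is the two-step delay that the tolerance interval trades for robustness. With a tolerance interval of 0 the helper reduces to the pointwise tests (51) and (54). The same helper can be applied to a sequence of test statistics T_k with the threshold J_th.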

Fig. 3. Simulated measurements including fault and outliers: (a) Case 1: fault; (b) Case 2: fault + outliers. Both panels plot the measurements against time (h).

The threshold $J_i$ can be determined empirically. Because faults and outliers can cause large variations in the measurement covariance, $J_i$ is generally set to a large value.

5. Simulation studies

To demonstrate the effectiveness of the proposed sensor FDD method, in this section we test the proposed robust filter (RF) method using a simulation of two connected continuous stirred tank reactors (CSTRs). For comparison, two methods are implemented: the KF with inaccurate measurement covariance (KF1) and the KF with the true measurement covariance (KF2) [25,26]. Three performance indices are introduced to evaluate the performance: false alarm (FA), missed fault detection (MFD), and correct detection (CD). For more details about the three indices, one can refer to [30].

5.1. Two CSTR process

As shown in Fig. 2, a two CSTR process [41,42] is used to test the proposed sensor FDD algorithm in the presence of outliers. Based on the energy and material conservation laws, we obtain the two CSTR model:

$$
\begin{aligned}
\frac{dT_1}{dt} &= \frac{F_0}{V_1}(T_0 - T_1) + \frac{F_r}{V_1}(T_2 - T_1) - \sum_{i=1}^{3} G_i(T_1) C_{A1} + \frac{Q_1}{\rho c_p V_1}, \\
\frac{dC_{A1}}{dt} &= \frac{F_0}{V_1}(C_{A0} - C_{A1}) + \frac{F_r}{V_1}(C_{A2} - C_{A1}) - \sum_{i=1}^{3} R_i(T_1) C_{A1}, \\
\frac{dT_2}{dt} &= \frac{F_1}{V_2}(T_1 - T_2) + \frac{F_3}{V_2}(T_{03} - T_2) - \sum_{i=1}^{3} G_i(T_2) C_{A2} + \frac{Q_2}{\rho c_p V_2}, \\
\frac{dC_{A2}}{dt} &= \frac{F_1}{V_2}(C_{A1} - C_{A2}) + \frac{F_3}{V_2}(C_{A03} - C_{A2}) - \sum_{i=1}^{3} R_i(T_2) C_{A2},
\end{aligned} \tag{56}
$$

where the definitions of the variables in (56) can be found in [38]. The values of all parameters are summarized in Table 1. When $Q_1 = Q_2 = 1.4 \times 10^4$ kJ/h, $C_{A0} = C_{A0}^s$ and $C_{A03} = C_{A03}^s$, a steady state of this process is $(T_1^s, C_{A1}^s, T_2^s, C_{A2}^s)$ = (305.051 K, 2.497 kmol/m³, 303.914 K, 2.283 kmol/m³). A linear model is obtained by linearizing the nonlinear model at this operating point. In this example, the two temperatures $T_1$ and $T_2$ are measured with sampling time $\Delta t = 0.005$ h. The initial values are given as $x(0) = [306, 2.5, 304, 2.3]^T$ and $F_0 = 10(1 + 0.9 \sin(0.008t))$ m³/h [41]. The true noise covariances are given as $Q_k = \mathrm{diag}(0.01, 0.002, 0.01, 0.002)$ and $R_k = \mathrm{diag}(0.1, 0.1)$, respectively. In the two CSTR model, the measurement matrix is

$$C_k = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}. \tag{57}$$

We consider sensor faults by replacing the second row of $C_k$ by [0, 0, 5, 0] when 0.5 h ≤ t ≤ 1 h and by adding a bias of magnitude 10 to the measurements when 1.5 h ≤ t ≤ 2 h. Two cases are considered:

Case 1: The measurements include only the fault;
Case 2: The measurements include both the fault and outliers.

The simulated fault and outliers can be seen in Fig. 3. The tolerance interval is set to 2 and $J_i = 5$. The simulation results are shown in Figs. 4–7. Figs. 4 and 5 show the fault detection performance in the two cases, where the index "1" denotes that the residual evaluation function falls below the threshold, and the index "2" indicates that the residual evaluation function exceeds the threshold. We can see that the fault is detected well by all three methods when there are no outliers in the measurements. However, if outliers exist, the proposed RF method is still effective while the KF based methods (KF1 and KF2) fail. Fig. 6 compares the different methods using the performance indices. It can be seen that the proposed RF method guarantees a high CD and a low FA, whereas the KF based methods (KF1 and KF2) increase the FA considerably when outliers exist. Fig. 7 shows the fault diagnosis result after the fault is detected, where the index "1" denotes that the sensor is normal, and the index "2" indicates that the sensor is faulty. By using the modified covariance $\tilde{R}_{K,i}$, the cause of the fault can be determined accurately in both Case 1 and Case 2.

Fig. 5. Detection results in Case 2.

Fig. 6. Comparison of different algorithms using performance indices. Panels: performance indices (CD, MFD, FA; values in %) of KF1, KF2 and RF for Case 1 and Case 2.

Fig. 7. Diagnosis results of two Cases.

6. Conclusion

In this paper, a new sensor FDD algorithm is proposed for the linear state-space model with t-distributed measurement noise to handle the problem of outliers. The states and the unknown noise statistics are estimated simultaneously using VB inference. The fault and the faulty sensors are detected by introducing the tolerance interval and the modified measurement covariance. The simulation results demonstrate that the proposed method can solve the sensor FDD problem in the presence of outliers and performs better than the conventional KF based methods.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (no. 61773183 and no. 61833007), and the national first-class discipline program of Light Industry Technology and Engineering (LITE2018-25).

Appendices

A.1. Derivation of (31)–(33) and (35)–(37)

The logarithm of the joint filtering posterior is given by

$$
\begin{aligned}
\log p(z_k, x_k, \Lambda_k, R_k, \nu_k|z_{1:k-1}) = {} & -\frac{1}{2} (z_k - C_k x_k)^T R_k^{-1} \Lambda_k (z_k - C_k x_k) - \frac{1}{2} (x_k - x_{k|k-1})^T P_{k|k-1}^{-1} (x_k - x_{k|k-1}) \\
& - \sum_{i=1}^{d} \frac{\nu_{k,i}}{2} \lambda_{k,i} + \sum_{i=1}^{d} \frac{\nu_{k,i} - 1}{2} \log \lambda_{k,i} + \sum_{i=1}^{d} \frac{\nu_{k,i}}{2} \log \frac{\nu_{k,i}}{2} - \sum_{i=1}^{d} \log \Gamma\left(\frac{\nu_{k,i}}{2}\right) \\
& - \sum_{i=1}^{d} \left(\bar{\alpha}_{k,i} + \frac{3}{2}\right) \log R_{k,i} - \sum_{i=1}^{d} \frac{\bar{\beta}_{k,i}}{R_{k,i}} + \sum_{i=1}^{d} (\bar{a}_{k,i} - 1) \log \nu_{k,i} - \sum_{i=1}^{d} \bar{b}_{k,i} \nu_{k,i}.
\end{aligned} \tag{58}
$$

Setting $\vartheta = \Lambda_k$ and $\vartheta = R_k$, respectively, and substituting (58) into (21) yields

$$\log q(\Lambda_k) = -\frac{1}{2} \big\langle (z_k - C_k x_k)^T R_k^{-1} \Lambda_k (z_k - C_k x_k) \big\rangle + \sum_{i=1}^{d} \frac{\langle \nu_{k,i} \rangle - 1}{2} \log \lambda_{k,i} - \sum_{i=1}^{d} \frac{\langle \nu_{k,i} \rangle}{2} \lambda_{k,i} + c_{\Lambda}, \tag{59}$$

$$\log q(R_k) = -\sum_{i=1}^{d} \left(\bar{\alpha}_{k,i} + \frac{3}{2}\right) \log R_{k,i} - \sum_{i=1}^{d} \frac{\bar{\beta}_{k,i}}{R_{k,i}} - \frac{1}{2} \mathrm{tr}\big(\langle \Lambda_k \rangle ((z_k - C_k x_{k|k})(z_k - C_k x_{k|k})^T + C_k P_{k|k} C_k^T) R_k^{-1}\big) + c_R. \tag{60}$$

Utilizing (34) in (59) gives

$$\log q(\Lambda_k) = \sum_{i=1}^{d} \left[ -\frac{1}{2} \lambda_{k,i} [\Psi_k]_{ii} + \frac{\langle \nu_{k,i} \rangle - 1}{2} \log \lambda_{k,i} - \frac{\langle \nu_{k,i} \rangle}{2} \lambda_{k,i} \right] + c_{\Lambda}. \tag{61}$$

According to (60) and (61), we obtain (31)–(33) and (35)–(37).

A.2. Derivation of (38)–(40)

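Using $\vartheta = \nu_k$ and (58) in (21), $\log q(\nu_k)$ can be formulated as

$$
\begin{aligned}
\log q(\nu_k) = {} & \sum_{i=1}^{d} \frac{\nu_{k,i}}{2} \log \frac{\nu_{k,i}}{2} - \sum_{i=1}^{d} \log \Gamma\left(\frac{\nu_{k,i}}{2}\right) + \sum_{i=1}^{d} \left[ \frac{\nu_{k,i} - 1}{2} \langle \log \lambda_{k,i} \rangle - \frac{\nu_{k,i}}{2} \langle \lambda_{k,i} \rangle \right] \\
& + \sum_{i=1}^{d} \left[ (\bar{a}_{k,i} - 1) \log \nu_{k,i} - \bar{b}_{k,i} \nu_{k,i} \right] + c_{\nu}.
\end{aligned} \tag{62}
$$

Using Stirling's approximation $\log \Gamma\left(\frac{\nu_{k,i}}{2}\right) \approx \frac{\nu_{k,i} - 1}{2} \log\left(\frac{\nu_{k,i}}{2}\right) - \frac{\nu_{k,i}}{2}$ [34], $\log q(\nu_k)$ can be reformulated as

$$\log q(\nu_k) = \sum_{i=1}^{d} \left(\bar{a}_{k,i} + \frac{1}{2} - 1\right) \log \nu_{k,i} - \sum_{i=1}^{d} \left(\bar{b}_{k,i} - \frac{1}{2} - \frac{1}{2} \langle \log \lambda_{k,i} \rangle + \frac{1}{2} \langle \lambda_{k,i} \rangle\right) \nu_{k,i} + c_{\nu}. \tag{63}$$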

Using (63), we can obtain (38)–(40).

References

[1] M. Zhong, T. Xue, S.X. Ding, A survey on model-based fault diagnosis for linear discrete time-varying systems, Neurocomputing 306 (2018) 51–60.
[2] S. Zhao, B. Huang, Iterative residual generator for fault detection with linear time-invariant state-space models, IEEE Trans. Autom. Control 62 (10) (2017) 5422–5428.
[3] C. Zhao, B. Huang, Incipient fault detection for complex industrial processes with stationary and nonstationary hybrid characteristics, Ind. Eng. Chem. Res. 57 (14) (2018) 5045–5057.
[4] K. Yan, Z. Ji, W. Shen, Online fault detection methods for chillers combining extended Kalman filter and recursive one-class SVM, Neurocomputing 228 (2017) 205–212.
[5] C. Xu, S. Zhao, F. Liu, Distributed plant-wide process monitoring based on PCA with minimal redundancy maximal relevance, Chemom. Intell. Lab. Syst. 169 (2017) 53–63.
[6] Y. Mao, F. Ding, A. Alsaedi, Adaptive filtering parameter estimation algorithms for Hammerstein nonlinear systems, Signal Process. 128 (2016) 417–425.
[7] Y. Qin, C. Zhao, F. Gao, An iterative two-step sequential phase partition (ITSPP) method for batch process modeling and online monitoring, AIChE J. 62 (7) (2016) 2358–2373.
[8] J. Feng, K. Han, H. Zhang, Identification-oriented robust finite memory fault detection filter design for networked industrial process with fading channel communication, Neurocomputing 267 (2017) 624–634.
[9] X. Zhang, T. Parisini, M.M. Polycarpou, Sensor bias fault isolation in a class of nonlinear systems, IEEE Trans. Autom. Control 50 (3) (2005) 370–376.
[10] G. Dong, Y. Li, S. Sui, Fault detection and fuzzy tolerant control for complex stochastic multivariable nonlinear systems, Neurocomputing 275 (2018) 2392–2400.
[11] M.Y. Zhong, S.X. Ding, E.L. Ding, Optimal fault detection for linear discrete time-varying systems, Automatica 46 (8) (2010) 1395–1400.
[12] Y. Jiang, S. Yin, Recursive total principle component regression based fault detection and its application to vehicular cyber-physical systems, IEEE Trans. Ind. Electron. 14 (4) (2018) 1415–1423.
[13] Y. Jiang, S. Yin, O. Kaynak, Data-driven monitoring and safety control of industrial cyber-physical systems: basics and beyond, IEEE Access 6 (2018) 47374–47384.
[14] X. Yin, J. Liu, Distributed output-feedback fault detection and isolation of cascade process networks, AIChE J. 63 (10) (2017) 4329–4342.
[15] H. Chen, B. Jiang, N. Lu, Real-time incipient fault detection for electrical traction systems of CRH2, Neurocomputing 306 (2018) 119–129.
[16] Y. Jiang, S. Yin, Recent advances in key-performance-indicator oriented prognosis and diagnosis with a MATLAB toolbox: DB-KIT, IEEE Trans. Ind. Inform. (2018), doi:10.1109/TII.2018.2875067.
[17] Z. Ge, Z. Song, F. Gao, Review of recent research on data-based process monitoring, Ind. Eng. Chem. Res. 52 (10) (2013) 3543–3562.
[18] S. Yin, X. Zhu, C. Jing, Fault detection based on a robust one class support vector machine, Neurocomputing 145 (2014) 263–268.
[19] V. Venkatasubramanian, R. Rengaswamy, K. Yin, S.N. Kavuri, A review of process fault detection and diagnosis: Part III: Process history based methods, Comput. Chem. Eng. 27 (3) (2003) 327–346.
[20] S. Yin, Y. Jiang, Y. Tian, O. Kaynak, A data-driven fuzzy information granulation approach for freight volume forecasting, IEEE Trans. Ind. Electron. 64 (2) (2017) 1447–1456.
[21] Y. Qin, C. Zhao, X. Wang, Subspace decomposition and critical phase selection based cumulative quality analysis for multiphase batch processes, Chem. Eng. Sci. 166 (2017) 130–143.
[22] I. Samy, I. Postlethwaite, D. Gu, Survey and application of sensor fault detection and isolation schemes, Control Eng. Pract. 19 (7) (2011) 658–674.
[23] Y. Mao, F. Ding, A novel data filtering based multi-innovation stochastic gradient algorithm for Hammerstein nonlinear systems, Digital Signal Process. 46 (2015) 215–225.
[24] M. Zajac, Online fault detection of a mobile robot with a parallelized particle filter, Neurocomputing 126 (2014) 151–165.
[25] A.S. Willsky, A survey of design methods for failure detection in dynamic systems, Automatica 12 (6) (1975) 601–611.
[26] B. Brumback, M. Srinath, A chi-square test for fault-detection in Kalman filters, IEEE Trans. Autom. Control 32 (6) (1987) 552–554.
[27] R.K. Mehra, J. Peschon, An innovations approach to fault detection and diagnosis in dynamic systems, Automatica 7 (5) (1971) 637–640.
[28] A. Willsky, H.L. Jones, A generalized likelihood ratio approach to the detection and estimation of jumps in linear systems, IEEE Trans. Autom. Control 21 (1) (1976) 108–112.
[29] S. Huang, K.K. Tan, T.H. Lee, Fault diagnosis and fault-tolerant control in linear drives using the Kalman filter, IEEE Trans. Ind. Electron. 59 (11) (2012) 4285–4292.
[30] S. Zhao, B. Huang, F. Liu, Fault detection and diagnosis of multiple-model systems with mismodeled transition probabilities, IEEE Trans. Ind. Electron. 62 (8) (2015) 5063–5071.
[31] G. Foo, X. Zhang, D.M. Vilathgamuwa, A sensor fault detection and isolation method in interior permanent-magnet synchronous motor drives based on an extended Kalman filter, IEEE Trans. Ind. Electron. 60 (8) (2013) 3485–3495.
[32] R. Pearson, Outliers in process modeling and identification, IEEE Trans. Control Syst. Technol. 10 (1) (2002) 55–63.
[33] J. Zhu, Z. Ge, Z. Song, Robust modeling of mixture probabilistic principal component analysis and process monitoring application, AIChE J. 60 (6) (2014) 2143–2157.
[34] Y. Huang, Y. Zhang, N. Li, A robust Gaussian approximate filter for nonlinear systems with heavy tailed measurement noises, in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016, 2016, pp. 4209–4213.
[35] M. Roth, E. Ozkan, F. Gustafsson, A student's filter for heavy-tailed process and measurement noise, in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2013, 2013, pp. 5770–5774.
[36] Y. Huang, Y. Zhang, Z. Wu, A novel robust student's t based Kalman filter, IEEE Trans. Aeros. Electron. Syst. 53 (3) (2017) 1545–1554.
[37] C.M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, NY, 2006.
[38] C. Xu, S. Zhao, B. Huang, F. Liu, Distributed student's t filtering algorithm for heavy-tailed noises, Int. J. Adapt. Control Signal Process. 32 (6) (2018) 875–890.
[39] S. Sarkka, A. Nummenmaa, Recursive noise adaptive Kalman filtering by variational Bayesian approximations, IEEE Trans. Autom. Control 54 (3) (2009) 596–600.
[40] Y. Ma, B. Huang, Bayesian learning for dynamic feature extraction with application in soft sensing, IEEE Trans. Ind. Electron. 64 (9) (2017) 7171–7180.
[41] M. Rashedi, J. Liu, B. Huang, Communication delays and data losses in distributed adaptive high-gain EKF, AIChE J. 62 (12) (2016) 4321–4333.
[42] Y. Sun, N.H. El-Farra, Quasi-decentralized model-based networked control of process systems, Comput. Chem. Eng. 32 (9) (2008) 2016–2029.

Chen Xu received the Bachelor's degree from the Department of Automation, Jiangnan University, Wuxi, China, in 2013. Currently, he is a Ph.D. candidate in the Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi, China. From 2016 to 2018, he was a visiting student in the Department of Chemical and Materials Engineering, University of Alberta, AB, Canada. His research interests include state estimation, fault detection and diagnosis, and stochastic signal processing.

Shunyi Zhao (M’14) was born in Jinhua, China, in 1987. He received the Ph.D. degree in control theory and application from the Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi, China, in 2015. From 2013 to 2014, he was a Visiting Student in the Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB, Canada, where he was a Postdoctoral Fellow. In 2015, he joined Jiangnan University as an Associate Professor. His research interests include statistical signal processing, Bayesian estimation theory, and fault detection and diagnosis. Dr. Zhao is a recipient of the Alexander von Humboldt Research Fellowship in Germany, the excellent Ph.D. thesis award (2016) in Jiangsu Province, China, and a nomination for excellent doctoral thesis from the Chinese Association of Automation (CAA) in 2016.

Fei Liu received the B.Sc. degree in electrical technology from Wuxi Institute of Light Industry, China, in 1987; the M.Sc. degree in industrial automation from Wuxi Institute of Light Industry, China, in 1990; and the Ph.D. degree in control science and control engineering from Zhejiang University, China, in 2002. From 1990 to 1999, he was an Assistant Lecturer, and Associate Professor in Wuxi Institute of Light Industry. Since 2003, he has been a Professor of the Institute of Automation, Jiangnan University. From 2005 to 2006, he was a Visiting Professor with the University of Manchester, UK. His research interests include advanced control theory and applications, batch process control engineering, statistical monitoring and diagnosis in industrial processes, and intelligent techniques with emphasis on fuzzy and neural systems.