IFAC
COl>
Copyright 0 IFAC Fault Detection, Supervision and Safety of Technical Processes, Washington, D.C., USA, 2003
Publications www.elsevier.comllocatelifac
FAULT DETECTION WITH OBSERVERS AND GENETIC PROGRAMMING: APPLICATION TO THE DAMADICS BENCHMARK PROBLEM
Marcin Witczak * Ron J. Patton" Jozef Korbicz •
• Institute of Control and Computation Engineering, University ofZielona G6ra, ul. Podgorna 50, 65-246 Zielona G6ra, Poland, e-mail: {M.Witczalc.J.Korbicz}@issi.uu.gora.pl .. Control and Intelligent Systems Engineering, Department of Engineering, University of Hull, Cottingham Road, East Yorkshire HU6 7RX, United Kingdom, e-mail:
[email protected]
Abstract: This paper is focused on the problem of designing a fault diagnosis scheme for a class of non-linear systems. The one objective is to show how to employ a genetic programming technique to obtain state-space models of non-linear systems. Another objective is to employ a modified version of the well-known unknown input observer to form a non-linear deterministic observer for the purpose of residual generation. The final part of the paper shows how to use the proposed approach to tackle fault detection of the DAMADICS benchmark. Copyright © 2003 IFAC Keywords: fault diagnosis, non-linear systems, system identification, observers, genetic algorithms
I. INTRODUCTION
hoc approaches. each applicable only to a very restricted class of systems. With the advent of neural networks. fuzzy models (Nelles. 2001), and modem structure optimization techniques (Koza. 1992). a much wider class of systems can be handled. This implies the possibility of designing more reliable control and supervision tools.
It is well known that there is an increasing demand for modem systems to become more effective and reliable. This real world's development pressure has transformed automatic control, initially perceived as the art of designing a satisfactory system. into the modem science that it is today. The observed increasing complexity of modem systems necessitates the development of new control and supervision techniques. To tackle this problem. it is obviously profitable to have all the knowledge concerning a system behaviour. Undoubtedly, an adequate model of a system can be a tool providing such knowledge. Indeed. nowadays. advanced techniques for designing controllers are also based on models of systems. As most of industrial systems exhibit a non-linear behaviour, this has been the main reason for further development of non-linear system identification theory. Indeed, a few decades ago, nonlinear system identification was a field of several ad-
Irrespective of the identification method used. there is always the problem of model uncertainty. i.e. the model-reality mismatch. Thus. the better the model used to represent a system behaviour. the better the chance of improving the reliability and performance in diagnosing faults . Unfortunately, disturbances as well as model uncertainty are inevitable in industrial systems. and hence there exists a pressure creating the need for robustness in fault diagnosis systems. This robustness requirement is usually achieved in the fault detection stage.
1101
In the context of robust fault detection, many approaches have been proposed (Chen and Palton, 1999; Patton et al., 2(00). Undoubtedly, the most common one is to use robust observers, such as the Unknown Input Observer (UIO) (Chen and Patton, 1999; Seliger and Frank, 2(00), which can tolerate a degree of model uncertainty and hence increase the reliability of fault diagnosis. There are, of course, many different observers which can be applied to non-linear, and especially non-linear deterministic systems. Logically, the number of "real world" applications (not only simulated examples) should proliferate, yet this is not the case. It seems that there are two main reasons why strong formal methods are not accepted in engineering practice. First, the design complexity of most observers for non-linear systems (Chen and Patton, 1999; Seliger and Frank, 2(00) does not encourage engineers to apply them in practice. Second, the application of observers is limited by the need for nonlinear state-space models of the system being considered, which is usually a serious problem in complex industrial systems. This explains why most of the examples considered in the literature are devoted to simulated or laboratory systems, e.g. the celebrated three- (two- or even four-) tank system, an inverted pendulum, a travelling crane, etc. Bearing all these difficulties in mind, the paper is organized as follows. Section 2 focuses on a statespace model designing with genetic programming. In Section 3 an UIO which can relatively easy be applied for a class of non-linear systems is presented. In order to justify the reliability of the presented approaches, a comprehensive simulation study regarding the DAMADICS benchmark problem (Bartys et al., 2oo3) is performed in Section 4.
Moreover, it is assumed that the true state vector Xk is, in particular, unknown. Without loss of generality, it is possible to assume that
A(Xk) = diag[al,l(xk), . . . ,a..,n(Xk)J .
(6)
Thus, the problem reduces to identifying a..,i(Zk), hi(Uk), i = 1, ... , n, and Ck+l . Assuming that la.,.(xk)1 < 1, i = 1, . . . ,n it can be shown (Witczak et al., 2oo2) that the model (4)-(5) is globally asymptotically stable. This implies that ai,i(zk) should have the following structure
ai,i(xk)
= tanh(si,i(xk»
,
i = 1, . .. , n,
(7)
where tanh(·) is a hyperbolic tangent function, and
Si,i(Zk) is a function to be determined. Undoubtedly, many tools can be employed to obtain (4)-(5), e.g. neural networks or Genetic Programming (GP) (Koza, 1992). GP is an extension of genetic algorithms (Michalewicz, 1996), which are a broad class of stochastic optimization algorithms inspired by some biological processes, which allow populations of organisms to adapt to their surrounding environment. The main difference between these two approaches is that in GP the evolving individuals are parse trees rather than fixed-length binary strings. The main advantage of GP over neural networks is that the models resulting from this approach are less sophisticated (from the point of view of the number of parameters). Since Si,i(Zk), hi(Uk), i = 1, .. . , n are assumed to be non-linear (in general) functions they can easily be represented as trees as shown in Fig. 1. The language
2. DESIGN OF STATE-SPACE MODELS Let us consider the following class of non-linear discrete-time systems:
Xk+l Yk+l
= g(Xk, Uk,P) + Wk,
(1)
= Ck+lXk+l + tJk ·
(2)
Fig. 1. An exemplary tree representing Si,i(Zk). of the trees in GP is formed by a user-defined function IF set and terminal T set, which form the nodes of the trees (cf. Fig. 1). The functions should be chosen so as to be a priori useful in solving the problem, i.e. any knowledge concerning the system under consideration should be included in the function set. This function set is very important and should be universal enough to be capable of representing a wide range of nonlinear systems. In the case of a parameterized tree. as shown in Fig. I, the terminal set is composed of variables only. Such a parameterization has proven to be especially useful for a model designing purpose (Witczak, 2oo3; Witczak et al., 2oo2). On the other hand, it leads to the problem of non-linear parameter estimation which has to be solved by some nonlinear programming tools, e.g. the Adaptive Random Search (ARS) algorithm (Waiter and Pronzato. 1997).
where p is the parameter vector, Xk is the state, uk is the input, Yk is the output, g( .) is a non-linear function , Wk and Vk are the process and measurement noise, respectively. With a slight abuse of notation the parameter vector will be neglected in model equations . Assume that the function g( .) has the form
g(Xk ' Uk)
= A(Xk)Xk + h(uk) .
(3)
where h(·) is a non-linear function, and A( ·) is a matrix of function. The state-space model of the system (1)-(2) can be expressed as
Xk+l Yk+l
= A(Xk)Xk + h(uk), = Ck+lXk+l .
(4) (5)
The problem is to determine A(·), Ck+l and h(·), given a set of input-output measurements {( Uk, Yk) }k!.O 1.
1102
As a result of applying the above approach to the identification of (1)-(2), each entry of A(Xk) and
As usual. to perform further derivation. it is necessary to Iinearize the model around the current state estimate Xk. This leads directly to the classical approximation
h(Uk) can be obtained with a population of trees evolved by the GP algorithm. It should be pointed out that for that particular purpose the two terminals sets can be distinguished. i.e. first for A(Zk) (T = {Zk}) and second for h( Uk) (f = {Uk}).
ek+l/k :::: Akek
ek+l/k
xo.
n . -1
k=O
= akAkek + Ekdk,
(16)
The problem is to obtain an appropriate form of the instrumental matrices Qk-l and Rk in such a way as to ensure the convergence of the observer or adequately to maximize the bounds of the diagonal elements of the matrix ak .
n.-1
L xkzI = L Ykxk,
(15)
In order to avoid the above approximation. the diagonal matrix ak = diag(Ol.k, ... ,On.k) can be introduced (Witczak et al.• 2(02). which makes it possible to establish the following exact equality
As can be observed. parameter estimation involves computation of C k • which is necessary to obtain the output error t:k and. consequently. the value of the fitness function . To tackle this problem. for each trial point p it is necessary to first set an initial state estimate and then to obtain the state estimate Xko k = 1•... , nt -1. Knowing the state estimate and using the least-square method. it is possible to obtain C k (assuming Ck = C) by solving the following equation:
C
+ Ekdk .
For that purpose Witczak et al. (2001.2002) performed a comprehensive convergence analysis with the Lyapunov method. As a result they obtained the following conditions
(8)
k=O
It should be also pointed out that the order n of the model is in general unknown and hence should be determined throughout experiments.
_ () Q:(Ak) ( (1 - ()Q:(P k ) ) ak $ 1'1 = - - - T (1 (Ak) if (Al.kP~Al'k)
i
(1
3. DESIGN OF OBSERVERS
and (1
Let us consider a class of non-linear system described by the following equations
Xk+l Yk+l
= g(Xk) + h(Uk + L1 .d k ) + Ekdk, = Ck+lXk+l + L2.k+lfk+l,
if (ak - I) $ 1'2
Q:(Cr)Q:(Ck) Q:(Rk) x ( if (Cr) if (C k ) if (CkPkCI + Rk)
(9)
(10)
where 9(Xk) is assumed to be continuously differentiable with respect to Xk . f k states for the fault signal, dk is the unknown input. and L 1,k. L 2 ,k. Ek are their distribution matrices. Similarly to the Extended Kalman Filter (Anderson and Moore. 1979). the UIO presented in (Chen and Patton. 1999. pp.98-108) can be extended to the class of non-linear systems (9)-(10) This leads to the following structure of the Extended UlO (EUIO):
. max 10;,kl $ 1'1 and . max 10;,k-11 $ 1'2 · (17)
l=l •... ,n
Pk
It should also be pointed out that the matrix Ak used in the designing procedure is now defined by
8Xk
",.-z.
= Al,kP~AIk + TkQk_1 T I + HkRkHI,
(18) it is clear that an appropriate selection of the instrumental matrices Qk-l and Rk may enlarge the b0unds 1'1 and 1'2 and consequently. the domain of attraction. Indeed. if the conditions (17) are satisfied. then Xk converges to Xk .
(12)
= 8g(Xk) I
l=l, .. . ,n
Since (cC. (Chen and Patton. 1999. pp. 98-108»
= Xk+1/k + H k+lF:k+l/k + K 1,k+lEk,
Ak
)!
Bearing in mind that ak is a diagonal matrix. the above inequalities can be expressed as
(11)
Xk+l
(Ak)
= ~ (Ak) x
Unfortunately. an analytical derivation of the matrices Qk-1 and Rk seems to be an extremely difficult problem. However. it is possible to set the above matrices as follows: Qk-l = {31I. Rk = {31I. with {31 and {31 large enough. On the other hand. it is well known that the convergence rate of such an EKF-like approach can be increased by an appropriate selection of the covariance matrices Q k-l and R/r.. i.e. the more accurate (near "true" values) the co variance matrices. the better the convergence rate. This means that in the deterministic case (w/r. = 0 and Vk == 0). both matrices should be zero ones. Unfortunately. such an approach usually leads to the divergence of the observer as well as other computational problems. To tackle this. a compromise between the convergence
(13)
The main objective of this section is to show that the convergence of the EUlO strongly depends on the appropriate choice of the instrumental matrices Rk and Q k (measurement and process noise covariance matrices in the stochastic case). Moreover. the fault free mode is assumed. i.e. f k = O. For notational convenience. let us define the a priori state estimation error
1103
4.1 Determination of a state-space model
and the convergence rate should be established. This can easily be done by setting the instrumental matrices as
Qk-l Rk
= !31eLl e k-l l + oil, = {32ek e/cl + 021,
The objective of this section is to design the statespace model of the actuator being considered (cf. Fig. 2) according to the approach described in Section 2. The parameters used during the identification process were n.,. = 200, nd = 10, n. = 10, IF = {+,., /}. For the sake of comparison, the linear statespace model was obtained with the use of MATLAB System Identification Toolbox. In both the linear and non-linear cases the order of the model was tested between n = 2, ... ,8. Unfortunately, the relation between the input U/c and the juice flow Yl,le cannot be modelled by a linear state-space model. Indeed, the modelling error was approximately 35% thus making linear model not acceptable. On the other hand, the relation between the input Uk and the rod displacement Y2,k can be modelled, with very good results, by the linear state-space model. Bearing this in mind, the identification process was decomposed into two phases, i.e.:
(19) (20)
with {31, 132 large enough, and 010 02 small enough.
4. EXPERIMENTAL RESULTS
Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems (DAMADICS) is a Research Training Network funded by the European Commission under Framework V. This multidisciplinary and complementary RTN DAMADICS is focusing on drawing together wide-ranging techniques and fault diagnosis within the framework of a real application to on-line diagnosis of a 5-stage evaporisation plant of a sugar factory at Lublin, Poland (as shown in Fig. 2). The network focuses on the dia____
(1) derivation of a relation between rod displacement and the input with a linear state-space model, (2) derivation of a relation between juice flow and the input with a non-linear state-space model designed by GP.
.,..
-; '
-
J1.I,,~
:
Experimental results showed that the best-suited linear model is of order n = 2. After the 50 runs of the GP algorithm performed for each model order, it was found that the order of the model which provides the best approximation quality is n = 2. Thus, as a result of combining both linear and non-linear models, the following model structure of an actuator was obtained:
Fig. 2. The scheme of the first stage of the evaporation station. gnosis of a valve (cf. Fig. 2) plant actuators and looks towards real implementation methods for new actuator systems. The sugar factory is a subcontractor (under the Technical University of Warsaw) providing real process data and evaluation of trials of fault diagnosis methods.
(21)
Yk+l
= CXIe+l,
(22)
where
The actuator itself can be considered as a four-input and two-output system. The input Uk and output y/c are defined as follows: Uk = (CV, PI, P2, Tl), Yk = (F, X), where CV is the control value, PI and P2 are the pressures at the inlet and the outlet of the valve, respectively, Tl is the juice temperature at the inlet of the valve, F is the juice flow at the outlet of the valve, X is the servomotor rod displacement.
AF(Xk)
) +.X2,Ie26xl,k + 001' . A
x
B
x
The data gathered from the real plant is available on the DAMADICS web site (Bartys et aI., 2003). In order to check the reliability and performance of the fault diagnosis systems developed in the project, it has been necessary to generate the faults. It is obvious fact that it is very hard or even impossible to generate faults with a real industrial plant. Thus, an actuator simulator was developed with MATLAB Simulink. This tool makes it possible to generate the data for total of 19 different faults as well as for normal operating mode. The comprehensive description of all faults can be found in (Bartys et al., 2003).
= diag [0.3tanh (lOx~,1e + 23xl,leX2,k 0.15tanh
(5X~,/c + 1. 5X l,k) ' 2 00 xI,k
+ .
1
1'
= [0.78786
-0.28319] 0.41252 -0.84448 '
=
[2.3695 -1.3587 -0.29929 1.1361 ] 12.269 -10.042 2.516 0.83162 '
h 1 (ule) = -1.08 7u t/c + 0.0629u~.k - 0. 5019u~.k - 3.0108u~,k + 0.9491(Ul,kU2,k - Ul ,kU3,k) _ 0.5409 Ul,le U4,/c + 0.9783, U2,kU3,1e + 0.01 h 2(Uk) - 0.292utk + 0.0162u~,k - 0. 1289u~,k - 0.7733u~,/c
+ 0.2438(Ul,kU2,/c - til,kti3,k) _ 0.1389 Ul ,/c ti 4,/c + 0.2513, ti2,leti3,/c + 0.01
C
1104
110
= [ 00 0.79
0] -0.047 .
The mean-squared output error for the model (21)(22) was 0.0079. The main differences between the behaviour of the model and the system was observed for the non-linear model (juice flow) during the saturation of the system. This inaccuracy constitutes the main part of the modelling uncertainty, but in practice this drawback is less important because it is rather vain, owing to the safety requirements, to expect that the control system will force the valve till its maximum allowable flow rate.
Knowing an estimate dk of an unknown input dk. k 1, ... , n, it is straightforward, using the approach described in (Cben and Patton, 1999). to obtain the disturbance distribution matrix Ek and consequently the unknown input decoupling matrix H k. In the case of the actuator being considered the matrix H k has the following form
=
[0.2074 0] (29) 0.3926 Since the matrix H k is given the instrumental matrices can be set. A good solution to this task is to set 3 /31 = 10, 61 = om in (19) and f32 = 10 • 61 = om in (19). k
4.2 Observer-based/ault diagnosis 4.2.1. Unknown input estimation and design 0/ instrumental matrices Determination of the unknown input distribution matrix Ek constitutes a very impor-
4.2.2. Threshold determination and fault detection In practical situation the residual rk (output error or innovation in the stochastic case) cannot be at the zero level. even when no faults occur. Thus, many efforts have been made {see (Chen and Patton. 1999) for a list of references) to enhance the robustness of FDI at the decision-making stage, i.e. while making decisions based on the residual signal. It is wellknown that the fixed threshold and methods in which it is assumed that the noise is Gaussian usually do not work perfectly in practice. e.g. due to the impossibility of perfect decoupling.
tant part of the EUIO designing procedure. Indeed, without this knowledge it is impossible to design the EUIO, and consequently, to achieve robust residual generation. In general non-linear case the unknown input estimation problem can be viewed as an unconstrained optimization task of the form
Since Ek+l
= Yk+l
- Yk+l where
= g(Xk) + h(Uk) + dk. = Ck+1 X k+l.
(24)
Yk+1
Xk+l
= g(Xk) + h(Uk) + dk.
(26)
Xk+!
Another solution which seems more suitable relies on the use of the so-called adaptive threshold. i.e. a threshold which changes in time according to some prespecified rules. For a single residual being considered as a scalar variable such an adaptive threshold can be defined as follows
(25)
and
Yk+! = Ck+1 X k+l ·
rmax ]
(27)
[
The solution to the optimization task (23) can be obtained by solving the following set of linear equations TAT
rrin
= T(Uk-d
(30)
where rk'&X and Tk'in are the bounds of the residual which can be perceived as a residual confidence region. Thus, the problem is to design the relation T(·) . Although many solutions have been proposed in the literature (see (Chen and Patton, 1999) for a list of references), most of them can only be applied for linear systems.
[
C k+1 Ck+l d k =C H1 Yk+l - CHl [g(Xk) + h(Uk)]]·
°°°°°
=
H
(28)
It should clearly be pointed out that (28) can be solved = n. This condition is usually very difficult to attain in practice, and hence an approximate solution has to be employed which can be obtained by minimizing the norm of the difference between left and right hand side of (28). e.g. by the Broyden-F1etcher-Goldfarb-Shanno method (Waiter and Pronzato, 1997).
expilicite only when rank(Ck+d
In this work. a very simple approach to designing T( ·) for both linear and non-linear systems is proposed. It is assumed that TO can be approximated by the finite order polynomial, which for a single-input system can be described as follows " •. -1
As it can be seen from (23) the unknown input dk is chosen in such a way as to minimize the output error Ek (residual) instead of the state estimation error ek. This simplifies, or even makes possible, estimation of the unknown input as the output error can be directly obtained based on the output of the system and the model. Such a solution can be fully justified by the fact that the main purpose of robust FDI is to make the residual robust to the unknown input while it is not necessary for the state estimation error.
T(Uk-t)
=
L
p,uLI + Po.
(31)
,=1
where p, = lPr'in ,pf'in] denotes the confidence region of the i-th parameter. These confidence regions can easily be obtained based on the least-square estimate of parameters and the corresponding Fisher information matrix (Walter and Pronzato, 1997). For the considered actuator, a 99% parameter confidence region was assumed. Moreover, it was founded
1105
influence the industry to be more interested and to have more confidence in applying such techniques in practice.
Table 1. Results of fault detection (D detectable. N - not detectable) Fault
It
12
h
/8
/10
Small
D D D
N
D
D D D
N N
D D D
D
1t3
/1s
It6
/17
/18
/19
D
N N D
D
D D D
D D D
Medium Big Fault Small
Medium Big
D D D
/11
/12 N N
ACKNOWLEDGMENTS
D
The authors acknowledge funding support under the EC RTN contract (RTN-1999-(0392) DAMADICS. Thanks are expressed to the management and staff of the Lublin sugar factory. Cukrownia Lublin SA. Poland for their collaboration and provision of manpower and access to their sugar plant.
that only the first input Ul.k (control value) is significant for the residual. Thus for both residuals (juice flow and rod displacement) single-variable polynomials of order 5 and 3. respectively. were employed. Fig. 3 presents the residuals and their bounds (adaptive threshold) for the fault-free mode. Since the method
6. REFERENCES Anderson. B. D. 0 .• and Moore. 1. B .• 1979. Optimal Filtering (New Jersey: Prentice-Hall). Bartys M .• Syfert M .• Quevedo J .• and Patton R .•2003. Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems (DAMADICS): A Benchmark Study. Proc. IFAC Symp. Fault Detection. Supervision and Safety of Technical Processes: SAFEPROCESS 2003. Washington D.C.• USA. 9-11 June. Chen. 1.. and Patton. R. 1.. 1999. Robust Model-based Fault Diagnosis for Dynamic Systems (London: Kluwer Academic Publishers). Koza. 1. R .• 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection (Cambridge: The MIT Press). Michalewicz, Z .• 1996. Genetic Algorithms + Data Structures Evolution Programs (Berlin: Springer). Nelles. 0 .. 2001. Non-linear Systems Identification. From Classical Approaches to Neural Networks and Fuzzy Models (Berlin: Springer). Patton. R. J .• Frank. P.• and Clark. R. N. (Eds.). 2000. Issues of Fault Diagnosis for Dynamic Systems (Berlin: Springer-Verlag). Seliger. R.. and Frank. P.. 2000. Robust observerbasedfault diagnosis in non-linear uncertain systems. - In: Issues of Fault Diagnosis for Dynamic Systems (Patton. R. J .• P. Frank and R. N. Clark. Eds.) (Berlin: Springer-Verlag). Waiter. E .• and Pronzato. L.. 1997. Identification of Parametric Models from Experimental Data (Berlin: Springer). Witczak. M .• 2003. Identification and Fault Detection of Non-linear Dynamic Systems (Zielona G6ra: University of Zielona G6ra Press). Witczak. M .• and Korbicz. 1..2001. An extended unknown input observer for non linear discrete-time systems. Proc. European Control Conj. ECC 2001. Porto. Portugal. 7-9 September. CD-ROM. Witczak. M .• Obuchowicz A.. and Korbicz. 1.• 2002. Genetic programming based approaches to identfication and fault diagnosis of non-linear dynamic systems. Int. J. Contr.. Vo!. 75. No. 13. pp. 1012-1031.
Fig. 3. Fault-free residuals and their bounds (juice flow - left. rod displacement - right). of designing an appropriate threshold is known it is possible to check the fault detection capabilities of the proposed observer-based fault detection scheme. For that purpose. data sets with faults were generated. It should be pointed out that only abrupt faults were considered. Table 1 shows the results of fault detection. As can be seen. the abrupt faults presented in Tab. 1 can be considered as small. medium. and big according to the benchmark description (Bartys et al.. 2(03).
=
5. CONCLUSIONS The purpose of this work was to provide a complte design procedure of the residual generator. Starting from a set of data. it was shown how to obtain a state-space model and then how to use it to design an observer for the purpose of residual generator. The main drawback to the proposed systems identification framework is its computational cost resulting in a relatively long identification time. However. as usual. the model construction procedure is realized off-line. and hence the identification time is not very important. Another objective was to provide a detailed description of an industrial application study of the proposed fault diagnosis scheme for a valve plant actuators (being one of the parts of the evaporation station at the Lublin sugar factory in Poland). It was shown. using a set of faults. that the proposed observer-based fault detection scheme can provide good results. The authors believe that the presented benchmark problem will encourage other researchers (outside the network) to test the reliability and performance of various analytical redundancy approaches. This may
1106