Copyright © IFAC Adaptive Systems in Control and Signal Processing, Glasgow, UK, 1989
FAULT DETECTION AND DIAGNOSIS IN A CHEMICAL PLANT USING SEQUENTIAL HYPOTHESIS TESTING
X. J. Zhang and M. B. Zarrop
Control Systems Centre, UMIST, P.O. Box 88, Manchester M60 1QD, UK
Abstract. In this paper we apply probability ratio tests to the problem of fault detection and diagnosis in a chemical plant. The sequential probability ratio test (SPRT) developed by Wald (1947) for selecting one of two hypotheses is a simple on-line scheme but, in general, its exact implementation for a dynamic system requires the use of Kalman filtering techniques (Willsky, 1976). Armitage (1950) proposed an extended SPRT for multiple hypothesis testing and this is applied here to dynamic systems. However, substantial computational savings are made by using input-output models to yield the steady state output predictors under each hypothesis and thus avoid the implementation of Kalman filters. In addition, a resetting technique (Chien and Adams, 1976) is extended to the multihypothesis case and this speeds up detection when a fault occurs. Examples are given of the detection of faults in parts of a chemical plant, where the alternative hypotheses correspond to level sensor bias, leakage and valve malfunction in a stock tank and fouling in a heat exchanger.
Keywords. Fault detection and diagnosis; process control; techniques; system failure and recovery.
INTRODUCTION
In this paper, the on-line detection and diagnosis of two types of fault in a chemical plant are considered. First, a sequential probability ratio test (SPRT) for multiple hypothesis testing is described. A simpler form of the SPRT for discriminating between two possible means of a gaussian process is then discussed. A description of the chemical plant is given, followed by a discussion of the behaviour of the system under different fault conditions. The modelling of the normal and fault conditions is briefly described and some simulation results are presented to illustrate the fault detection and diagnosis (FDD) procedures.
ARMITAGE'S SPRT
Consider a single-output dynamical system operating either in a normal fault-free mode (hypothesis $H_0$ is true) or in one of $N$ different fault modes ($H_i$ is true, $i \in \{1,\ldots,N\}$). The $N+1$ possible operating conditions are each represented by an input-output model of the linear, discrete-time, ARMAX type, expressed in the predictor form

$$y(t) = \hat{y}_i(t|t-1) + \varepsilon_i(t) \qquad (i = 0,1,\ldots,N) \qquad (1)$$

where $y(t)$ is the scalar output. Under $H_i$, $\{\varepsilon_i(t)\}$ is a zero mean white noise sequence with variance $\sigma_i^2$.

Let $p_i(t)$ denote the joint probability density of $y(1),\ldots,y(t)$ when $H_i$ is true and define the log-likelihood ratios

$$\Lambda_{ij}(t) = \ln \frac{p_i(t)}{p_j(t)} \qquad (i,j = 0,1,\ldots,N;\ i \neq j) \qquad (2)$$

Under gaussian assumptions, these statistics can be calculated recursively via

$$\Lambda_{ij}(t) = \Lambda_{ij}(t-1) + \ln\frac{\sigma_j}{\sigma_i} + \frac{\varepsilon_j^2(t)}{2\sigma_j^2} - \frac{\varepsilon_i^2(t)}{2\sigma_i^2} \qquad (3)$$

using (1) and some initial $\Lambda_{i0}(0)$ indicating prior belief in $H_i$.
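The step from the definition of the log-likelihood ratio to a recursion can be made explicit. The short derivation below is a sketch under the stated gaussian assumption, writing $\Lambda_{ij}(t)$ for $\ln[p_i(t)/p_j(t)]$ and $\varepsilon_i(t) = y(t) - \hat{y}_i(t|t-1)$ for the prediction error of model $i$ in (1); the equal-variance special case in the last line is added here for illustration only.

```latex
\begin{align*}
p_i(t) &= p_i(t-1)\,\frac{1}{\sqrt{2\pi\sigma_i^2}}
          \exp\!\left(-\frac{\varepsilon_i^2(t)}{2\sigma_i^2}\right)
          && \text{(factorise off the newest observation)} \\
\ln p_i(t) &= \ln p_i(t-1) - \tfrac{1}{2}\ln\!\left(2\pi\sigma_i^2\right)
          - \frac{\varepsilon_i^2(t)}{2\sigma_i^2} \\
\Lambda_{ij}(t) &= \ln p_i(t) - \ln p_j(t) \\
          &= \Lambda_{ij}(t-1) + \ln\frac{\sigma_j}{\sigma_i}
          + \frac{\varepsilon_j^2(t)}{2\sigma_j^2}
          - \frac{\varepsilon_i^2(t)}{2\sigma_i^2} \\
\sigma_i = \sigma_j = \sigma \;\Rightarrow\;
\Lambda_{ij}(t) &= \Lambda_{ij}(t-1)
          + \frac{\varepsilon_j^2(t) - \varepsilon_i^2(t)}{2\sigma^2}
\end{align*}
```

Each new sample therefore moves $\Lambda_{ij}$ towards $H_i$ whenever model $i$ predicts the output better than model $j$, measured in units of the respective noise variances.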
Armitage's SPRT for deciding which hypothesis is true gives rise to the following algorithm:
Data: Thresholds $\Lambda^*_{ij}$ ($i,j = 0,1,\ldots,N$; $i \neq j$), $B_L$; initial values $\Lambda_{i0}(0)$ ($i = 0,1,\ldots,N$)

1. Set $t = 1$ and $H_0$ = .false.
2. Observe $y(t)$.
3. For $i = 1,\ldots,N$, compute $\Lambda_{i0}(t)$ from (1), (3) and check:
   if $H_0$ = .false. and $\Lambda_{i0}(t) \leq -\Lambda^*_{0i}$ for $i = 1,\ldots,N$, set $H_0$ = .true. and go to 4;
   if $H_0$ = .true. and $\Lambda_{i0}(t) < B_L$, set $\Lambda_{i0}(t) = B_L$.
4. Set $j = 1$.
5. For $i = j+1$ to $N$, compute $\Lambda_{ij}(t)$ and check:
   if $\Lambda_{ij}(t) \leq -\Lambda^*_{ji}$ for $i = j+1,\ldots,N$ and $\Lambda_{jk}(t) \geq \Lambda^*_{jk}$ for $k = 0,\ldots,j-1$, set $H_j$ = .true. and terminate the test;
   otherwise:
   if $j < N$, return to 5 with $j = j+1$;
   if $j = N$ and $\Lambda_{Nk}(t) \geq \Lambda^*_{Nk}$ for $k = 0,\ldots,N-1$, set $H_N$ = .true. and terminate the test.
6. Set $t = t+1$ and return to 2.

The thresholds $\Lambda^*_{ij}$ can be simply related to selected error rates as in the two hypothesis case (Wald, 1947; Armitage, 1950; Zhang, 1988). The resetting threshold $B_L$ is employed when no fault is present ($H_0$ is accepted) and is roughly equivalent to restarting the test. This technique originated with Chien and Adams (1976) and speeds up detection of a fault once it occurs.

From (1), (3) it is clear that the calculation of the SPRT statistics requires the construction of the one-step-ahead output predictor corresponding to each hypothesis. Here the steady state predictor is used, based on the standard polynomial partition (Astrom, 1970), rather than employing Kalman filters (Willsky, 1976). This yields considerable computational savings and has been found to work well.

A SIMPLE SPRT
A simple form of SPRT is obtained if the FDD problem can be considered as deciding between two mean value functions of a white stochastic process. For example:

$H_0$: $y(t)$ is gaussian with zero mean and variance $\sigma^2$ ($t = 1,2,\ldots$)
$H_1$: $y(t)$ is gaussian with mean $m(t)$ and variance $\sigma^2$ ($t = 1,2,\ldots$)

Then the SPRT statistic is generated recursively via

$$\Lambda(t) = \Lambda(t-1) + m(t)\left[y(t) - \tfrac{1}{2} m(t)\right]/\sigma^2 \qquad (5)$$

and tested against suitably chosen thresholds. This type of test is particularly effective for diagnosing faults that occur suddenly but have only a transient effect at the output:

$$m(t) \neq 0 \ \text{for all } t < \text{some } t_f, \qquad m(t) \to 0 \ \text{as } t \to \infty \qquad (6)$$

PLANT DESCRIPTION
The chemical plant considered is part of an Anhydrous Caustic Soda (ACS) plant (ICI, 1986). The process consists of a stock tank, a heat exchanger and two symmetric flow control loops (Fig. 1). The initial fluid is stored in the stock tank, where the level is kept within desired bounds by a proportional controller (LIC). It is then fed into the pre-concentrator through a valve (VLVP). The valve is operated by a PI controller (LPC), which is used to maintain a constant level in the base of the pre-concentrator. The output flow-rate of the base is controlled by two PI controllers (FCA, FCB). The system has the following inputs and measured outputs:

$U_1$: stock tank level set-point (DVL1)
$U_2$: pre-concentrator base level set-point (SPP1)
$U_3$: pipeline A flow-rate set-point (SPCA1)
$U_4$: pipeline B flow-rate set-point (SPCB1)
$Y_1$: stock tank level (LI)
$Y_2$: stock tank input flow-rate (FT2)
$Y_3$: pre-concentrator base level (LP1)
$Y_4$: pre-concentrator vapour temperature (TPV)
$Y_5$: pipeline A flow-rate (FLOWA)
$Y_6$: pipeline B flow-rate (FLOWB)
Since the instruments used for measurement and control must operate between well-defined upper and lower limits, saturation nonlinearities exist almost everywhere in the system. The transient characteristics of the tank, the pre-concentrator and the pipelines are considerably different. The time constants of the tank and the pipelines differ by a factor of 275. For convenience, the system is separated into three subsystems: the tank, the pre-concentrator and the pipelines.

FAULT DYNAMICS
The faults considered in this system are: leakage in the tank, fouling in the pre-concentrator, valve parameter change in the tank and level transmitter parameter change in both the tank and the pre-concentrator base. It is assumed that a single fault may develop at any time in the form of a step change in a physical parameter. Simulations show that the response of the system to such faults differs according to the fault location within the plant.

When a sudden jump develops in the valve parameter, the level transmitter output changes slowly and reaches a new steady state level after some time (Fig. 2). The behaviour of the level transmitter output is similar when a leakage develops in the tank. For convenience, this type of fault is termed 'Type I'.

On the other hand, when the parameters of the level transmitters change, an immediate jump can be observed at the transmitter outputs, and it dies away gradually (Fig. 3). Similar behaviour is observed in the temperature sensor when fouling develops in the pre-concentrator. These faults are associated with a transient effect at the sensor. We call this type of fault 'Type II'; it requires the special type of mean test described earlier. Note that any fault in the pre-concentrator affects the tank behaviour but not vice versa.

The FDD problem can be cast in a multiple hypothesis testing framework. In order to apply the SPRT, a number of linear models are required representing different operating conditions of interest, including key fault conditions.
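As a rough illustration of the mean test (5) applied to a Type II fault, the sketch below runs the recursion against a transient signature. The geometrically decaying shape of the mean sequence, the function names and all numerical values are assumptions made for this example; they are not taken from the plant. The thresholds use Wald's standard approximations for the two-hypothesis case.

```python
import math
import random


def wald_thresholds(alpha, beta):
    """Wald's approximate log-thresholds for false-alarm rate alpha and miss rate beta."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this value
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this value
    return lower, upper


def mean_sprt(y, m, sigma, lower, upper):
    """Run recursion (5) on samples y against the hypothesised mean sequence m.

    Returns (decision, index of the sample at which the decision was taken).
    """
    stat = 0.0
    for t, (y_t, m_t) in enumerate(zip(y, m)):
        stat += m_t * (y_t - 0.5 * m_t) / sigma ** 2
        if stat >= upper:
            return "H1", t
        if stat <= lower:
            return "H0", t
    return "undecided", len(y) - 1


def transient_signature(jump, decay, n):
    """Hypothetical Type II signature: an initial jump that dies away geometrically."""
    return [jump * decay ** t for t in range(n)]


if __name__ == "__main__":
    random.seed(0)
    sigma = 0.2                                   # assumed noise standard deviation
    m = transient_signature(jump=1.0, decay=0.9, n=50)
    y = [m_t + random.gauss(0.0, sigma) for m_t in m]
    lower, upper = wald_thresholds(alpha=0.01, beta=0.01)
    print(mean_sprt(y, m, sigma, lower, upper))
```

Because $m(t)$ dies away, the increments in (5) shrink with it, so a decision has to be reached while the signature is still visible at the sensor.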
MODELLING THE HYPOTHESES
Ten hypotheses are chosen:
$H_0$ - Normal operation in the tank.
$H_0^*$ - Normal operation in the pre-concentrator.
$H_1$ - Positive bias in the tank valve parameter.
$H_2$ - Negative bias in the tank valve parameter.
$H_3$ - Leakage in the tank.
$H_4$ - Positive bias in the tank level sensor parameter.
$H_5$ - Negative bias in the tank level sensor parameter.
$H_6$ - Positive bias in the pre-concentrator base level sensor parameter.
$H_7$ - Negative bias in the pre-concentrator base level sensor parameter.
$H_8$ - Fouling in the pre-concentrator (causing a decrease in the tube-to-wall heat transfer coefficient).

In the above hypotheses, it is assumed that the size of changes in the parameters is 10%, except in the case of leakage, where a loss of 3.6% of the tank's output flow-rate is considered to be significant.

Models corresponding to the above hypotheses have been developed. Note the following points: Separate models can be built for each subsystem, if interaction between the subsystems is considered properly. Linearized models are used to represent the non-linear system at the various steady-state operating points. A PRBS input is applied to excite the system dynamics and thus improve the quality of the input-output data. The physical structure of the system suggests that the tank is a first-order system and the pre-concentrator is of third order.

If we represent the effect which the pre-concentrator has on the tank by the output level measurement $Y_3$, the models of the stock tank take the following form (Zhang, 1988):

$$Y_1(t) + a_1 Y_1(t-1) = b_1 Y_3(t) + b_2 Y_3(t-1) + b_3 U_1(t) + b_4 U_1(t-1) \qquad (7)$$

Similarly, for the pre-concentrator:

[...] (8)

The sampling interval is 0.01 hours. Models are validated according to a) Akaike's FPE. b) Confidence intervals for the parameter estimates. c) Simulation accuracy. d) Whiteness of the residuals. e) Bode plots. Having obtained the models, the SPRT can be applied to detect and diagnose faults in the system.
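With a predictor available for each hypothesis, the multiple hypothesis test reduces to bookkeeping on the prediction errors. The sketch below is a simplified illustration, not the authors' implementation. It assumes a single common threshold, takes the prediction error sequences as given, approximates the pairwise statistic $\Lambda_{jk}$ by the difference $\Lambda_{j0} - \Lambda_{k0}$, and applies the resetting floor on every sample once $H_0$ has been accepted. The function name `armitage_sprt` and all numbers are hypothetical.

```python
import math


def armitage_sprt(residuals, sigmas, threshold, b_l=None):
    """Simplified multi-hypothesis SPRT on one-step prediction errors.

    residuals[i][t] is the prediction error of model i at sample t
    (model 0 is the no-fault model); sigmas[i] is its noise standard deviation.
    Returns (index of the accepted fault hypothesis, sample index),
    or (None, last sample index) if no fault hypothesis is accepted.
    """
    n_hyp = len(residuals)
    n_samples = len(residuals[0])
    lam = [0.0] * n_hyp            # lam[i] tracks Lambda_i0(t); lam[0] stays at 0
    h0_accepted = False
    for t in range(n_samples):
        e0, s0 = residuals[0][t], sigmas[0]
        for i in range(1, n_hyp):
            ei, si = residuals[i][t], sigmas[i]
            # Gaussian log-likelihood-ratio increment of H_i against H_0
            lam[i] += math.log(s0 / si) + e0 ** 2 / (2 * s0 ** 2) - ei ** 2 / (2 * si ** 2)
        if not h0_accepted and all(lam[i] <= -threshold for i in range(1, n_hyp)):
            h0_accepted = True
        if h0_accepted and b_l is not None:
            # Resetting: stop the statistics drifting far below b_l while no fault is present
            for i in range(1, n_hyp):
                lam[i] = max(lam[i], b_l)
        for j in range(1, n_hyp):
            # Accept H_j when it beats every other hypothesis by the threshold
            if all(lam[j] - lam[k] >= threshold for k in range(n_hyp) if k != j):
                return j, t
    return None, n_samples - 1


if __name__ == "__main__":
    # Synthetic, noise-free prediction errors: fault 1 appears at sample 50
    res0 = [0.0] * 50 + [1.0] * 100    # no-fault model fits until the fault occurs
    res1 = [1.0] * 50 + [0.0] * 100    # fault-1 model fits after the fault occurs
    res2 = [-1.0] * 50 + [2.0] * 100   # fault-2 model never fits
    data = [res0, res1, res2]
    print(armitage_sprt(data, [1.0, 1.0, 1.0], threshold=4.6, b_l=-1.0))
    print(armitage_sprt(data, [1.0, 1.0, 1.0], threshold=4.6))
```

On this synthetic data the fault introduced at sample 50 is accepted at sample 61 with the resetting floor and only at sample 109 without it, which illustrates why resetting speeds up detection after a long fault-free period.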
FDD FOR THE PLANT
System faults can be divided into Type I and Type II according to the behaviour of the sensor output. For Type II faults, simulation experience shows that when Armitage's SPRT is applied (using input/output models) the fault remains undetected. This is because the distance between a Type II fault model and a no-fault model is very small compared with the modelling errors due to linearization (Zhang, 1988). For this reason fault detection schemes for the two types of fault are designed separately.

As shown in Fig. 3, when a bias develops in the parameter of the level sensor, it is observable from the sensor outputs for a short period only. The deviation of the sensor output from its steady state value (the straight line) represents a fault signature generated by the output observation and its prediction at steady state. Let

$$\tilde{y}(t) = y(t) - \hat{y}(\infty) \qquad (9)$$

where $\hat{y}(\infty)$ is the predicted value of the output $y(t)$ when the system is operating at steady state. Equation (5) can be applied for the detection of Type II faults, if the mean of the corresponding sequence $\tilde{y}(t)$ can be modelled accurately after the fault occurs. Since a Type II fault is associated with a sudden jump at the sensor output, a 'switch' is set to start the estimation of the mean of $\tilde{y}(t)$. This is done by comparing successive output values. If, for example, the difference between two consecutive values is greater than some prescribed threshold, the estimation and the recursion (5) are initiated.

For the detection and diagnosis of Type I faults, Armitage's SPRT can be directly based on the models developed above. In order to make a unique and reliable decision about the operating conditions of the system, the above two FDD schemes are combined and a decision making mechanism is designed. The flow chart of the complete FDD scheme is shown in Fig. 4.

Simulations show that:
(i) All the faults of interest can be detected and diagnosed correctly.
(ii) Type II faults are detected immediately after they occur. The fault sizes estimated are close to the true values. For example, when fouling develops in the pre-concentrator at t = 22 hrs, it is detected and diagnosed at t = 22.02 hrs. Figure 5 shows the cumulative sum of the SPRT for testing the mean of the fault signature corresponding to output $Y_4$.
(iii) Type I faults are detected a few hours after they occur. For example, when a positive bias in the tank valve develops at t = 24 hrs, it is detected at t = 26.50 hrs. Figure 6 shows the behaviour of the cumulative sums $\Lambda_{10}$, $\Lambda_{21}$, $\Lambda_{31}$ and the sensor output $Y_1$.
(iv) A fault whose size is not precisely modelled can also be detected if it lies closer to a modelled fault mode than to the no-fault model. For example, when a leakage of 5% of the output flow-rate occurs at t = 24 hrs, it is detected at t = 26.89 hrs.

CONCLUSION
In this paper, the detection and diagnosis of faults in a chemical process are discussed. A comprehensive FDD scheme has been developed based on estimated linear input-output models. Simulations show that the FDD scheme can be applied to detect faults of different sizes and that it is robust to non-gaussian noise and modest modelling errors (Zhang, 1988).

ACKNOWLEDGEMENTS
The authors would like to thank ICI and in particular Dr. A. Allidina for their help in this work. Financial support from the British Council and the State Education Commission of China is also appreciated.
REFERENCES
Armitage, P. (1950). Sequential analysis with more than two alternative hypotheses and its relation to discriminant function analysis. J. Royal Stat. Soc., B, 12, 137-144.
Astrom, K.J. (1970). Introduction to Stochastic Control. Academic Press.
Chien, T.T. and M.B. Adams (1976). A sequential failure detection technique and its application. IEEE Trans. AC, October, 750-757.
ICI (1986). Anhydrous caustic soda fault prediction and diagnosis. Report BH1/ELECT/DISC43/D1.
Wald, A. (1947). Sequential Analysis. Dover Publications Inc.
Willsky, A.S. (1976). A survey of design methods for failure detection in dynamic systems. Automatica, 12, 601-611.
Zhang, X.J. (1988). Auxiliary signal design in fault detection and diagnosis. Ph.D. Thesis, Control Systems Centre, UMIST, U.K.
Fig. 1. Description of the chemical process.
Fig. 2. Level sensor behaviour when the valve parameter changes (horizontal axis: time in hours).
Fig. 3. Level sensor behaviour when the sensor parameter changes (horizontal axis: time in hours).
Fig. 4. Flow chart of the FDD scheme.
Fig. 5. Behaviour of the cumulative sum under $H_8$.
Fig. 6. Behaviour of the output $Y_1$ and the cumulative sums $\Lambda_{10}$, $\Lambda_{21}$, $\Lambda_{31}$ under $H_1$ (horizontal axes: time in hrs).