AI '89 Conference, Prague
Knowledge represented by mathematical models for fault diagnosis in chemical processing units

J Lutcha and J Zejda

The automation of fault diagnosis in the field of chemical processing has been thwarted, in the past, chiefly by the inability to represent quantitative causal models and process constraints. This paper describes a system which uses mathematical model-based reasoning for fault diagnosis in a chemical plant. This approach appears to give somewhat better results than past attempts at using expert systems, which typically carried out the diagnosis on the basis of empirical knowledge alone. The latter systems usually suffer from the disadvantage of being too process-specific, inflexible and cumbersome.

Keywords: knowledge, fault diagnosis, chemical processing industry, expert systems, empirical knowledge, causal models, process constraints

Research Institute of Chemical Equipment, Křižíkova 70, 602 00 Brno, Czechoslovakia
Paper received 3 August 1989. Accepted 26 October 1989

Nowadays, equipment for chemical processes has become more complex, more extended in scope and integrated both internally and with the adjacent processing units in the whole industrial plant. Because of these trends, every abnormal shutdown caused by an equipment failure brings about sizable economic losses, particularly in bulk chemicals production. Above all, if process malfunctions are left uncorrected, the situation could result in a catastrophic event such as an explosion, fire, or the release of toxic chemicals.

Since the existing processing units are complex, it is a difficult task for a process operator to diagnose an oncoming malfunction. Although well trained in standard procedures, an operator may have difficulty in coping with unanticipated events and low probability failures. Because time constraints are critical, hesitation as well as inappropriate action could lead to disaster. Therefore, any system which would help operators determine
in good time the presence of process faults, and identify them, could increase the operability and safety of processing units.

In recent years, a number of papers on the fault diagnosis of chemical processes have been published. A relatively new and promising approach appears to be the use of methods based on artificial intelligence, particularly expert systems. However, existing findings have indicated that expert systems based on shallow knowledge, i.e. on empirical experience, are not well suited to the diagnosis problems in the field of chemical processing units. This could especially be the case where there is not enough expertise or where, for a new or modified chemical process, the operating experience is simply missing.

The relatively well defined structure of chemical processing equipment could suggest a way of creating an expert system knowledge base. Thus, the process could be broken down, for diagnosis purposes, into smaller subsystems, and these may then be analysed independently. The analysis could concentrate on the subsystems which are dominant from the viewpoint of the diagnosis, i.e. those which play a key role in the proper functioning of the whole chemical processing unit.

The behaviour of a subsystem, usually represented by one unit operation or a group of unit operations, can be described by a mathematical model. It should be observed that the material and energy balance equations, expressing the conservation laws, form the backbone of the mathematical model. In addition, the balances taken over the whole process could be considered as fundamental relationships between the subsystems determined by the breakdown procedure. However, it should be pointed out that experimental knowledge collected over years of operating chemical processing units generally makes a substantial contribution when applying expert systems to fault diagnosis.
This paper is aimed at exploiting the knowledge incorporated in mathematical models and developing the coding of this knowledge in systems detecting malfunctions of chemical processing equipment.
0950-7051/90/010032-04 $03.00 © 1990 Butterworth & Co (Publishers) Ltd Knowledge-Based Systems
KNOWLEDGE REPRESENTED BY GOVERNING EQUATIONS

To illustrate the technique of generating a knowledge base for fault diagnosis from mathematical models, a very simple subsystem of a stream splitter is considered (see Figure 1). The splitting ratio for the flow rate is assumed to be 1 : 1. (Under actual circumstances, it could be a case of heating up the feedstock material.) It might be noticed that the classical diagnosing methods are usually oriented towards predetermined steady states and are unable to cope with a very trivial situation when e.g. the flow rate is varying.

Mathematical model equations used for the purpose of diagnosis express the constraints upon process variables and are called governing equations (see Reference 1). For the subsystem under investigation, the governing equations are given by material balance equations applied to the flow rates of incoming and leaving streams

\[ F_1 + F_2 - F_3 = 0 \tag{1} \]
\[ F_1 - F_2 = 0 \tag{2} \]
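As a minimal sketch of how the left-hand sides of equations (1) and (2) could be evaluated from measured flow rates, the following Python fragment may help. The function name and the sample readings are illustrative assumptions and are not part of the described system.

```python
def residua(f1, f2, f3):
    """Return the left-hand sides of governing equations (1) and (2)."""
    r1 = f1 + f2 - f3  # equation (1): balance over the three streams
    r2 = f1 - f2       # equation (2): 1 : 1 splitting ratio
    return r1, r2


# Hypothetical readings: a consistent set, then one with F2 reading too high.
print(residua(5.0, 5.0, 10.0))
print(residua(5.0, 6.0, 10.0))
```

With consistent readings both values are zero; a biased reading shifts one or both of them away from zero.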
If the governing equations are satisfied, a normal faultless operation of the process equipment is assumed. The verification could be carried out on-line, without any operator action, by using measured values acquired from a plant data acquisition system. A set F of specified anticipated faults is associated with the governing equations, so the occurrence of a fault causes the equations to be violated. For the considered example, the set F is specified as follows:

A .... sensor F2 fails high
B .... sensor F1 fails high
C .... sensor F3 fails high
D .... sensor F1 fails low
E .... sensor F2 fails low
F .... sensor F3 fails low
It should be noted that if, for example, a sensor indicates a lower value than the correct one (fails low), then this is not necessarily caused by a faulty sensor alone. The lower value could be caused by a leaking pipe, say.

The measured data substituted into the governing equations can result in non-zero left-hand sides; such a value is the residuum of a governing equation. The sign of the residuum (positive, negative or zero) could help to break down the specified set of faults into classes corresponding to the particular governing equation. Table 1 shows how the considered faults are classified according to the suggested procedure.

In process monitoring for fault diagnosis purposes, the task is reversed. A data acquisition system on the processing unit supplies measured values, which may or may not be correct, and the process state is tested by using these measurements. For example, let the actual measured values result in a positive residuum for the first governing equation and in a negative residuum for the second. If it is assumed that a single fault occurs, then the diagnosis could be derived from the intersection of the appropriate subsets

\[ F = H_1^+ \cap H_2^- = \{A, B, F\} \cap \{A, D\} = \{A\} \tag{3} \]
Further, this failure must explain the simultaneous validity or violation of all governing equations. Thus, in this simple example, fault A causes the above mentioned state of validity/violation of the considered governing equations. If all possible process states are to be analysed in connection with the diagnosis of all failures A ... F, it is necessary to consider all alternatives to H*. The list of these alternatives is shown in Table 2.
[Figure 1. Stream splitter. F1, F2, F3: stream flow rates]

Table 1. Classification of considered faults according to suggested procedure

                          Subset of faults for which the residuum of the governing equation is
Governing equation        Positive H+     'Zero' H0      Negative H-
1. F1 + F2 - F3 = 0       A, B, F         $              C, D, E
2. F1 - F2 = 0            B, E            C, F, $        A, D

Vol 3 No 1 March 1990
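The classification in Table 1 can be cross-checked by simulating each anticipated fault as a bias on one sensor and recording the sign of each residuum. The sketch below does this in Python; the nominal flow rates, the size of the bias and all identifiers are hypothetical choices made only for illustration.

```python
NOMINAL = {"F1": 5.0, "F2": 5.0, "F3": 10.0}  # hypothetical fault-free readings

FAULTS = {  # fault label -> (sensor, direction of the bias)
    "A": ("F2", +1), "B": ("F1", +1), "C": ("F3", +1),
    "D": ("F1", -1), "E": ("F2", -1), "F": ("F3", -1),
}

EQUATIONS = {  # governing equations (1) and (2) as residuum functions
    1: lambda m: m["F1"] + m["F2"] - m["F3"],
    2: lambda m: m["F1"] - m["F2"],
}


def classify(bias=1.0, tolerance=1e-9):
    """Sort '$' (normal operation) and faults A..F by the sign of each residuum."""
    table = {i: {"+": set(), "0": set(), "-": set()} for i in EQUATIONS}
    cases = {"$": None, **FAULTS}
    for label, fault in cases.items():
        reading = dict(NOMINAL)
        if fault is not None:
            sensor, direction = fault
            reading[sensor] += direction * bias  # sensor fails high or low
        for i, equation in EQUATIONS.items():
            r = equation(reading)
            key = "+" if r > tolerance else "-" if r < -tolerance else "0"
            table[i][key].add(label)
    return table


for number, row in classify().items():
    print(number, {sign: sorted(faults) for sign, faults in row.items()})
```

For larger units the same idea would apply to every governing equation, although the subsets would normally be derived from physical reasoning and not from a numerical perturbation.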
Table 2. Alternatives to H*

          H1+     H1^0    H1-
H2+       B       #       E
H2^0      F       $       C
H2-       A       #       D

$ ... normal operation
# ... logical conflict
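The single-fault diagnosis of equation (3) and the alternatives of Table 2 amount to set intersections, which can be sketched as follows. The tolerance value and the function names are assumptions introduced for this example only.

```python
# Fault subsets of Table 1, indexed by equation number and residuum sign.
H = {
    1: {"+": {"A", "B", "F"}, "0": {"$"}, "-": {"C", "D", "E"}},
    2: {"+": {"B", "E"}, "0": {"C", "F", "$"}, "-": {"A", "D"}},
}


def sign(residuum, tolerance):
    """Classify a residuum as '+', '0' or '-' against a tolerance."""
    if residuum > tolerance:
        return "+"
    if residuum < -tolerance:
        return "-"
    return "0"


def diagnose(residua, tolerance):
    """Intersect the subsets selected by the sign of each residuum."""
    candidates = None
    for number, residuum in enumerate(residua, start=1):
        subset = H[number][sign(residuum, tolerance)]
        candidates = set(subset) if candidates is None else candidates & subset
    return candidates  # an empty set corresponds to '#' (logical conflict)


def alternatives():
    """Enumerate every combination of signs, as in Table 2."""
    return {(s1, s2): H[1][s1] & H[2][s2] for s1 in "+0-" for s2 in "+0-"}


print(diagnose((0.6, -0.6), tolerance=0.1))  # the case of equation (3)
for signs, faults in alternatives().items():
    print(signs, sorted(faults) or "#")
```

A residuum close to the tolerance flips between two signs under small measurement noise, so the result of `diagnose` can jump from one fault to another.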
The suggested diagnosing procedure is discussed in more detail by Zejda (Reference 2). The formula (3) may be generalized as follows:

\[ D = H_1^* \wedge H_2^* \wedge \dots \wedge H_n^* \tag{4} \]

where

D: true statement on a fault
\( H_i^* \): denotes that one of the actual statements \( H_i^+ \), \( H_i^0 \), \( H_i^- \) is true (\( H_i^+ \) is a true statement if at least one failure has occurred which is associated with the positive residuum of the appropriate governing equation)
i = 1, ..., n: numbers of the governing equations

INFERENCE UNDER UNCERTAINTY

The previous derivation of the diagnosis does not respect the fact that a decision on whether the constraints given by the governing equations are violated can be made with limited accuracy only. The diagnostic identification is based there on two-valued (Boolean) logic, i.e. the two disjoint categories of true and false are applied. This inherent property causes a logical conflict to appear in several cells, denoted as # in Table 2. It is simply inadmissible, in the Boolean approach, that violations of the (+), (0) and (-) type co-exist at the same time. Further, if the value of a governing equation residuum happens to be located in the vicinity of the prespecified threshold, then the diagnosis is infinitely sensitive to incremental changes in the plant; e.g. measurement noise will cause the diagnosis to fluctuate between two individual faults, and diagnosis instability occurs.

Therefore, attaching a certain level of belief to the statement that a governing equation is violated, e.g. in the positive direction, appears to be much more natural. It corresponds better to the limited accuracy of measuring instruments, to random fluctuations of the processing unit state, and to the often much simplified assumptions on which the mathematical models are based, so that the models do not reflect reality accurately.

To cover these aspects of fault diagnosis, the Shafer-Dempster theory can be used; it was elaborated by Shafer (Reference 3) on fundamentals laid down by Dempster (Reference 4). It has been decided here to exploit this theory for its ease of applicability to the discussed diagnosing problem (a finite set of process states) and also for its popularity among researchers in the field of expert systems.

The following presentation is aimed at showing the main features of an algorithm based on the Shafer-Dempster theory which is capable of dealing with uncertainty in diagnosis. There is no intention to define precisely the categories used. The discussion is strictly bound to the above mentioned example of a stream splitter. Based on intuitive notions, the fundamental procedure stemming from the Shafer-Dempster theory, particularly Dempster's rule of combination, is presented in more detail. The basic probability mass distributions corresponding to the governing equations read

\[ m_1(H_1^+, H_1^0, H_1^-, F) = (0.8, 0.2, 0.0, 0.0) \tag{5} \]
\[ m_2(H_2^+, H_2^0, H_2^-, F) = (0.0, 0.1, 0.9, 0.0) \tag{6} \]
The significance of the given figures is the following: the first constraint, Equation (1), is violated in the positive direction with a degree of belief of 0.8, whereas 0.2 is assigned to normal operation. Analogously, the same applies to the second governing equation (2), the only difference being the direction of violation, i.e. negative. The mass attributed to F expresses the degree of ignorance; if its value is equal to zero, the uncertainty is reflected only by a probability. When constructing the basic probability mass distributions, it should be realized that, according to the theory of evidence, the sum of the weights must equal one.

By using Dempster's rule of combination, Table 2 could be completed with the sums of partial contributions derived from violation/validity of the individual governing equations (see Table 3). For evaluation of the degree of belief attributed to an individual fault diagnosis, Dempster's rule has the following form

\[ m(F) = \frac{\sum m_1(H_1^*)\, m_2(H_2^*)}{1 - k} \tag{7} \]

where the basic probability mass is calculated for each F in sequence from A to F, and the summing is performed over \( H_1^* \cap H_2^* = F \). The value of k is

\[ k = \sum m_1(H_1^*)\, m_2(H_2^*) \tag{8} \]

where the summing is over \( H_1^* \cap H_2^* = \emptyset \). The degrees of belief for the individual diagnoses then read:

\[ m(A) = 0.878, \quad m(F) = 0.098, \quad m(\$) = 0.024, \quad m(B) = m(C) = m(D) = m(E) = 0.0 \tag{9} \]
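Equations (7) and (8) can be checked numerically with a short Python sketch. The representation of a distribution as a dictionary from subsets to masses, and the names `combine` and `FRAME`, are assumptions made here for illustration; the system described in this paper was written in Turbo-Pascal.

```python
from itertools import product

FRAME = frozenset("ABCDEF$")  # all faults plus normal operation '$'


def combine(m1, m2):
    """Dempster's rule for two mass functions given as {frozenset: mass}.

    Returns the combined masses of equation (7) and the conflict k of (8).
    """
    joint, k = {}, 0.0
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        weight = w1 * w2
        if weight == 0.0:
            continue  # zero-mass statements contribute nothing
        common = s1 & s2
        if common:
            joint[common] = joint.get(common, 0.0) + weight
        else:
            k += weight  # empty intersection: conflicting evidence
    if k >= 1.0:
        raise ValueError("the two distributions are in total conflict")
    return {s: w / (1.0 - k) for s, w in joint.items()}, k


# Distributions (5) and (6); the last entry is the mass on the whole frame.
m1 = {frozenset("ABF"): 0.8, frozenset("$"): 0.2, frozenset("CDE"): 0.0, FRAME: 0.0}
m2 = {frozenset("BE"): 0.0, frozenset("CF$"): 0.1, frozenset("AD"): 0.9, FRAME: 0.0}

combined, k = combine(m1, m2)
for subset, mass in sorted(combined.items(), key=lambda item: -item[1]):
    print("".join(sorted(subset)), round(mass, 3))
```

Rounded to three decimals, the masses printed for A, F and $ agree with the values in (9), and the conflict k equals the product 0.2 x 0.9 = 0.18.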
Table 3. Partial contributions derived from violation/validity of individual governing equations. Each cell gives the intersection H1* ∩ H2* followed by the product m1(H1*) m2(H2*)

                               H1+ = A∪B∪F     H1^0 = $       H1- = C∪D∪E
                               m1 = 0.8        m1 = 0.2       m1 = 0.0
H2+  = B∪E      m2 = 0.0       B   0.0         #   0.0        E   0.0
H2^0 = C∪F∪$    m2 = 0.1       F   0.08        $   0.02       C   0.0
H2-  = A∪D      m2 = 0.9       A   0.72        #   0.18       D   0.0
Thus, for the discussed example, the hypothesis that fault A has occurred could be favoured. However, it should be admitted that fault F might be present, though with a low weight, or even that the system might be operating in the normal state.

A comparison of Table 2 with Table 3 illustrates one of the fundamental differences between the Boolean approach and the theory of evidence approach to fault diagnosis. Whereas the former selects only one fault, the latter assumes that all faults are present but orders them according to their weights. Thus, the method can deal with the simultaneous occurrence of several faults. In this case, the only limitation seems to be two faults existing at the same instant and acting in such a way that one compensates for the effect of the other. A more detailed discussion can be found in Reference 2.

The success of mathematical models combined with the theory of evidence approach to the diagnosis problem appears to be conditioned by the mutual independence of the governing equations and of the elements (faults) of the set F. To secure this, it is necessary to apply deep knowledge embedded in physical principles. Moreover, there must be no causal relation between two faults in the set F. If such a pair does occur in F, then the presence of only one of them is sufficient. Unfortunately, in the case of more complex processing units the analysis of the above mentioned aspects is no easy problem.

In a correctly constructed model, each governing equation represents an independent source of knowledge. (However, even when there is some dependency, the presented procedure does not collapse; only the weights are redistributed. A similar situation in a Boolean, or even statistical, approach might cause some insurmountable difficulties.) The conjunction (4) will then acquire the form of an n-multiple Dempster's rule of combination.
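One way to picture the n-multiple rule is to fold the pairwise rule over a list of distributions, which is permissible because Dempster's rule is associative and commutative. The sketch below assumes the dictionary representation used for illustration here; the third distribution `m3` is purely hypothetical and does not correspond to any governing equation in this paper.

```python
from functools import reduce
from itertools import product

FRAME = frozenset("ABCDEF$")  # all faults plus normal operation '$'


def combine(m1, m2):
    """Pairwise Dempster's rule for mass functions given as {frozenset: mass}."""
    joint, k = {}, 0.0
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        common = s1 & s2
        if common:
            joint[common] = joint.get(common, 0.0) + w1 * w2
        else:
            k += w1 * w2  # conflicting evidence
    if k >= 1.0:
        raise ValueError("total conflict between sources")
    return {s: w / (1.0 - k) for s, w in joint.items() if w > 0.0}


def combine_all(distributions):
    """Apply the pairwise rule repeatedly over n distributions."""
    return reduce(combine, distributions)


m1 = {frozenset("ABF"): 0.8, frozenset("$"): 0.2}    # distribution (5)
m2 = {frozenset("CF$"): 0.1, frozenset("AD"): 0.9}   # distribution (6)
m3 = {frozenset("AF"): 0.5, FRAME: 0.5}              # hypothetical third source

for subset, mass in combine_all([m1, m2, m3]).items():
    print("".join(sorted(subset)), round(mass, 3))
```

A source that places all of its mass on the whole frame expresses complete ignorance and leaves the combined result unchanged.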
To proceed with an evaluation according to expression (7), it is necessary to have n distributions having the structure of (5). These distributions could be obtained from each file of measured plant data and the specified tolerance threshold values for each governing equation. If the absolute value of the residuum exceeds this tolerance, then the governing equation is considered to be violated. The higher the residuum value, the higher the weight attributed to the belief that the equation is violated. These weights are then the objects which are manipulated according to Dempster's rule of combination. The tolerances could be quantified e.g. statistically (see Reference 2).

CONCLUSION

Designing a computer system which would be based on the discussed approach and would effectively assist operators in detecting and diagnosing process upsets necessitates the resolving of a few practical problems; i.e. an actual computer implementation and integration
into a process plant's digital control system have to be developed. For integration of the diagnosing system, it is necessary to link it with the control system so that there is easy access to plant data and, inversely, the results of diagnosing can be simply sent back to an operator's information system. The concept chosen from several possibilities implements the diagnosing system on an autonomous PC communicating with the plant computer control system via a properly designed link. Although at an early stage of development an attempt was made to use part of an expert system shell, this was later abandoned because it did not bring the expected saving in programming effort.

The suggested system has been divided into three segments. The first one generates, on the basis of plant data acquired on-line, the appropriate distributions in the form of Equations (5) and (6). The second segment processes these basic probability mass distributions according to the n-multiple Dempster's rule of combination, as shown by expressions (7) and (8). The last segment is designed to interpret and display, in graphical form, the results of the diagnosis.

The whole system is programmed in a Turbo-Pascal environment. The choice of programming language has been influenced by some features of Turbo-Pascal which are advantageous for the implementation of the system. In particular, the language offers easy parallel handling of data structures of set and real number type. For simplification of procedures, which is a significant factor in the case of on-line application, the ranking of diagnosis weights is carried out by the operation of a point estimate (Reference 5). This operation is necessary if the intersections in Table 3 are not single-element sets.
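The first segment, which turns a residuum and its tolerance into a basic probability mass distribution, might be sketched as below. The paper does not state the weighting function, so the saturating ramp used here is a hypothetical stand-in chosen only because the weight grows with the size of the residuum; the function name is likewise an assumption.

```python
def mass_from_residuum(residuum, tolerance):
    """Map a residuum to masses on the statements '+', '0' and '-'.

    Hypothetical weighting: the belief in a violation is 0 for a zero
    residuum, 0.5 at the tolerance and approaches 1 for large residua.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    size = abs(residuum)
    violated = size / (size + tolerance)
    return {
        "+": violated if residuum > 0 else 0.0,
        "0": 1.0 - violated,  # remaining mass goes to normal operation
        "-": violated if residuum < 0 else 0.0,
    }


# Illustrative values only: residua of opposite sign for the two equations.
print(mass_from_residuum(0.4, tolerance=0.1))
print(mass_from_residuum(-0.9, tolerance=0.1))
```

Because the weights change smoothly with the residuum, a value near the tolerance no longer switches the diagnosis abruptly from one fault to another. No mass is assigned to the whole frame in this sketch; a fuller version could reserve part of the weight for ignorance.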
Currently, attention is being paid to the design of a system which would exploit both the shallow and deep knowledge, and is being designed to be used as part of a computer control system of a fired heater in a reduced crude oil vacuum distillation unit.
REFERENCES
1 Kramer, M A 'Malfunction diagnosis using quantitative models with non-Boolean reasoning in expert systems' AIChE J. Vol 33 No 1 (January 1987) pp 130-140
2 Zejda, J Quantitative Models Approach in Fault Diagnosis of Chemical Processing Units PhD Thesis, Technical University FE-VUT Brno, Czechoslovakia (1988)
3 Shafer, G A Mathematical Theory of Evidence Princeton University Press, USA (1976)
4 Dempster, A P 'Upper and lower probability inference' Ann. Math. Stat. Vol 38 (1967) pp 325-339
5 Havranek, T 'An interpretation of Hajek-Valdes results on Dempster's semigroup' Proc. AI '88 Conference Prague (April 1988) pp 67-76