Computers & Chemical Engineering, Vol. 3, pp. 489-493, 1979. Printed in Great Britain. All rights reserved.
0098-1354/79/040489-05$02.00/0 Copyright © 1981 Pergamon Press Ltd.
Paper 9.3
AN ALGORITHM FOR DIAGNOSIS OF SYSTEM FAILURES IN THE CHEMICAL PROCESS
M. IRI
University of Tokyo, Tokyo, Japan
K. AOKI
Mitsubishi Heavy Industries, Ltd., Nagasaki, Japan
E. O’SHIMA*
Tokyo Institute of Technology, Nagatsuta, Midori-Ku, Yokohama, Japan
and
H. MATSUYAMA
Kyushu University, Fukuoka, Japan
(Received 11 September 1979)
Abstract-An attempt was made to apply graph theory to the diagnosis of system failures in the chemical process. A signed digraph is used as a mathematical model representing the influences among the elements of the system. The concept of a pattern on the signed digraph is introduced to represent a state of the system. To avoid complicated and inefficient quantitative simulation, the mathematical model of the system structure that represents the propagation of failures is simplified in a qualitative fashion. The origin of the system failure can be located at the maximal strongly-connected component in the cause-effect graph reflecting the pattern of abnormality. Even when the pattern is observed only partially, the assumption of a single origin of the failure reduces, to some extent, the range of possible candidates for the first cause of the failure.
Scope-In a chemical plant, when an abnormal situation appears, the operator is warned by flashing or buzzing devices, each of which normally corresponds to one state variable of the plant. However, it is not easy for the human operator to identify the first cause of the system failure if several alarms start simultaneously. The importance of automated diagnosis of system failures has increased remarkably, as chemical processes have grown in scale and become more complex in structure. There are basically two kinds of approaches to failure diagnosis by computer. The first is the experience-oriented method, which is based on a list processing algorithm; all the experienced patterns of failures and the corresponding causes are filed, and the pattern faced in practice is searched for in the list. The other is the logic-oriented method, which uses a cause-effect algorithm; all possible cause-effect relations are prepared and chains of cause-effect relations are used to explain consistently the observed failure pattern.
Conclusions and Significance-The method of the present work is categorised as a cause-effect algorithm. The state of each variable is represented qualitatively by three ranges: high, normal and low. With the aid of graph treatment, the logical structure of cause-effect relations consistent with the observed failure pattern is constructed. The most important part of the algorithm is to reduce the number of candidates for the real cause. It was found that the assumption that two or more failures do not occur at the same time can reduce the searching process to the extent of being practically feasible.
1. INTRODUCTION
A tremendous number of state variables is involved in determining the characteristics of propagation of system failures in the chemical process. It is not an easy task even for an experienced operator to identify the first cause of the failures when more than one variable has deviated from the normal level. It is, however, logically possible to identify the first cause, provided that all the variables are detected and the structure of dependencies among the variables is fully described. Though these two conditions do not necessarily hold in a practical chemical plant, it can be shown that the computer can assist diagnosis to some extent, with the aid of graph theory. The basic principle of the algorithm for diagnosis is to trace the cause back along the directed graphs, which are to be qualitatively derived from the characteristic equations of the process. It is essential, in order to carry out efficient computation, to classify the state variables in accordance with their mathematical characteristics [1].
*Author to whom correspondence should be addressed.
[Fig. 1. Buffer tank.]
[Fig. 2. Signed digraph of buffer tank.]
[Fig. 3(a). Negative feedback.]
2. GRAPHICAL REPRESENTATION OF A SYSTEM STRUCTURE AND ITS STATES
2.1 Signed digraph and pattern
For the buffer tank system shown in Fig. 1, F1, F2 and F3 are flow rates, L denotes the level of the tank, and V1, V2 and V3 are the apertures of the valves. The mathematical structure of this system can be represented by the signed digraph shown in Fig. 2. In the graph, the nodes correspond to the state variables of the system and the branches represent the immediate influences between the nodes. Positive and negative influences (i.e. promotion and suppression), respectively, are distinguished by signs ‘+’ and ‘-’ given to the branches. It is assumed that the state of the system can be specified by the value of each state variable divided into three ranges, high, normal and low, which are designated, respectively, as ‘+’, ‘0’ and ‘-’. Thus the state of the system can be described as a combination of signs assigned to the nodes of the signed digraph representing the structure of the system. This combination of signs is defined as a ‘pattern’; if any node has a nonzero sign, the system is in failure. For instance, the abnormal decrease in the flow rate F2 caused by sticking of V2 gives rise to the pattern shown in Table 1.
Table 1
Node:  V1  F1  L   V2  F2  V3  F3
Sign:  0   0   0   -   -   +   +
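The signed digraph and the pattern lend themselves to a very small data representation. The following Python sketch is only an illustration: the names `BRANCHES`, `TABLE_1`, `abnormal_nodes` and `in_failure` are hypothetical, and the branch list is an assumption inferred from the description of the buffer tank in the text, since the drawing of Fig. 2 remains the authority on the actual branches.

```python
# Signed digraph: (initial node, terminal node) -> sign of the branch.
# ASSUMPTION: this branch list is inferred from the text, not read from Fig. 2.
BRANCHES = {
    ("V1", "F1"): "+",   # opening V1 promotes F1
    ("F1", "L"): "+",    # F1 raises the level
    ("V2", "F2"): "+",   # opening V2 promotes F2
    ("F2", "L"): "-",    # F2 lowers the level
    ("L", "V3"): "+",    # level controller opens V3 when L rises
    ("V3", "F3"): "+",   # opening V3 promotes F3
    ("F3", "L"): "-",    # F3 lowers the level
}

# Pattern of Table 1: one sign ('+', '0' or '-') per node.
TABLE_1 = {"V1": "0", "F1": "0", "L": "0",
           "V2": "-", "F2": "-", "V3": "+", "F3": "+"}


def abnormal_nodes(pattern):
    """Return the nodes whose sign is nonzero, in alphabetical order."""
    return sorted(node for node, sign in pattern.items() if sign != "0")


def in_failure(pattern):
    """A system is in failure if any node of the pattern has a nonzero sign."""
    return bool(abnormal_nodes(pattern))


print(in_failure(TABLE_1), abnormal_nodes(TABLE_1))
```

Keeping the structure (branches) separate from the state (pattern) mirrors the distinction the text draws between the signed digraph and a pattern on it.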
2.2 Cause-effect graph
In the context of failure diagnosis, we are not interested in nodes with sign ‘0’ and branches through which there exist no routes of propagation of the failure. A node $n_j$ is said to be valid if
$$\psi(n_j) \neq 0$$
and a branch $b_k$ is said to be consistent if
$$\psi(\partial^+ b_k)\,\phi(b_k)\,\psi(\partial^- b_k) = +$$
where $\psi(n_j)$ is the sign of $n_j$, $\phi(b_k)$ is the sign of $b_k$, and $\psi(\partial^+ b_k)$ and $\psi(\partial^- b_k)$ denote respectively the signs of the initial and terminal nodes of $b_k$.
[Fig. 3(b). Pattern a.]
[Fig. 3(c). Pattern b.]
It can be considered that failure propagates only through consistent branches. A simple negative feedback loop as shown in Fig. 3(a) is a typical example illustrating how the concept of the consistent branch is applied to diagnostic problems. Since the state variables A and B are directly interrelated, abnormality would appear simultaneously at both nodes when failure occurs at either node. When both A and B are observed to deviate positively, it is considered that the branch with sign ‘+’ is consistent (shown as pattern ‘a’ of Fig. 3(b)), and node A is deduced to be the origin of the failure. Conversely, when pattern ‘b’ is observed, where A shows positive deviation and B shows negative deviation, the branch with sign ‘-’ becomes consistent and B is the origin of the failure. The generalization of this relationship is the cause-effect graph, or CE graph for short, which consists of all the valid nodes and all the consistent branches in the signed digraph.
2.3 Extension of definitions to controlled nodes
The CE graph describes the propagation of the failure so long as the failure does not propagate through a node with sign ‘0’. If the system has controlled variables, it may happen that the failure apparently propagates through a node with sign ‘0’ corresponding to a controlled variable. In the buffer tank system shown in Fig. 1, the level L shows a normal value even if the flow rate F2 decreases abnormally, because the level controller compensates the tendency of the level L to become greater by opening the valve V3 to increase the outflow F3. The pattern corresponding to this failure is shown in Table 1 and the CE graph for this pattern is shown in Fig. 4. The maximal strongly-connected components of this CE graph are V2 and V3, which are suspected to be the origins of the failure. However, it is known that V3 is not the origin of the failure; the actual propagation of failure in this case should be represented by the digraph of Fig. 5.
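The construction of the CE graph under the basic definitions (valid nodes, consistent branches) and the search for its maximal strongly-connected components can be sketched as follows. This is a minimal illustration, not the authors' program. It assumes that a "maximal" component is one with no branch entering it from another component, that the feedback loop of Fig. 3(a) has a ‘+’ branch from A to B and a ‘-’ branch from B to A (as the discussion of patterns ‘a’ and ‘b’ implies), and that the buffer tank has the branch list given in `TANK`, which is inferred from the text rather than read from Fig. 2. All function and variable names are hypothetical.

```python
SIGN = {"+": 1, "-": -1, "0": 0}


def ce_graph(branches, pattern):
    """Keep the valid nodes (sign != 0) and the consistent branches
    (sign of initial node * branch sign * sign of terminal node = +)."""
    valid = {n for n, s in pattern.items() if SIGN[s] != 0}
    edges = [(u, v) for (u, v), s in branches.items()
             if u in valid and v in valid
             and SIGN[pattern[u]] * SIGN[s] * SIGN[pattern[v]] > 0]
    return valid, edges


def strongly_connected_components(nodes, edges):
    """Depth-first decomposition into strongly-connected components (Tarjan)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    index, low, stack, on_stack, comps = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for n in sorted(nodes):
        if n not in index:
            visit(n)
    return comps


def maximal_components(nodes, edges):
    """Components that no branch enters from a different component."""
    comps = strongly_connected_components(nodes, edges)
    owner = {n: c for c in comps for n in c}
    entered = {owner[v] for u, v in edges if owner[u] != owner[v]}
    return [c for c in comps if c not in entered]


# ASSUMED branch signs, see the lead-in.
LOOP = {("A", "B"): "+", ("B", "A"): "-"}
TANK = {("V1", "F1"): "+", ("F1", "L"): "+", ("V2", "F2"): "+",
        ("F2", "L"): "-", ("L", "V3"): "+", ("V3", "F3"): "+",
        ("F3", "L"): "-"}
TABLE_1 = {"V1": "0", "F1": "0", "L": "0",
           "V2": "-", "F2": "-", "V3": "+", "F3": "+"}

print(maximal_components(*ce_graph(LOOP, {"A": "+", "B": "+"})))  # pattern a
print(maximal_components(*ce_graph(LOOP, {"A": "+", "B": "-"})))  # pattern b
print(maximal_components(*ce_graph(TANK, TABLE_1)))
```

Under these assumptions the sketch yields A for pattern ‘a’, B for pattern ‘b’, and the two candidates V2 and V3 for the pattern of Table 1, which is the ambiguity that motivates the extension to controlled nodes.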
In order to take into account the special feature of control equipment in the system, the states of variables are categorized into five states designated as ‘+’, ‘0’, ‘-’, ‘⊕’ and ‘⊖’, where sign ‘⊕’ (‘⊖’) might intuitively be interpreted as the state of a variable which would have sign ‘+’ (‘-’) without control but does not appear abnormal due to the action of control. The controlled node is said to be valid if it has any one of the signs ‘+’, ‘-’, ‘⊕’ and ‘⊖’. The definition of a consistent branch is rather complicated. The branch $b_k$ is said to be consistent if it satisfies one of the following five conditions (i)-(v):
(i) $\psi(\partial^+ b_k) = \pm$, $\psi(\partial^- b_k) = \pm$, and $\psi(\partial^+ b_k)\,\phi(b_k)\,\psi(\partial^- b_k) = +$
(ii) $\psi(\partial^- b_k) = \oplus$ and $\psi(\partial^+ b_k)\,\phi(b_k) = +$
(iii) $\psi(\partial^- b_k) = \ominus$ and $\psi(\partial^+ b_k)\,\phi(b_k) = -$
(iv) $b_k$ is a control-information carrying branch, $\psi(\partial^+ b_k) = \oplus$ and $\phi(b_k)\,\psi(\partial^- b_k) = +$
(v) $b_k$ is a control-information carrying branch, $\psi(\partial^+ b_k) = \ominus$ and $\phi(b_k)\,\psi(\partial^- b_k) = -$.
With the use of these extended definitions for signs of nodes, valid nodes and consistent branches, it is possible to obtain the pattern corresponding to the abnormal decrease of F2 as Table 2 and the CE graph for this pattern as Fig. 5, the correct representation of propagation of the failure.
Table 2
Node:  V1  F1  L   V2  F2  V3  F3
Sign:  0   0   ⊕   -   -   +   +
[Fig. 4.]
[Fig. 5.]
3. FORMULATION OF THE PROBLEM OF FINDING THE ORIGIN OF A FAILURE
3.1 Partial pattern and expanded pattern
The problem of finding the origin of the failure can be reduced to that of finding the maximal strongly-connected components of the CE graph. The algorithm proposed by Tarjan [2] is now regarded as one of the most efficient for this purpose. However, usually some of the node signs cannot be observed for either technical or economical reasons, so that the set of nodes of the signed digraph should be partitioned into two subsets: one consisting of observed nodes whose signs are known, and the other consisting of unobserved nodes whose signs are not known. A set of signs of the observed nodes is called a partial pattern (‘partial’ in the sense that signs are assigned only to a part of the nodes). For instance, if the apertures of V1-V3 in the buffer tank system in Fig. 1 are not observed, a partial pattern corresponding to the abnormal decrease of F2 is that given in Table 3. Assuming, in an arbitrary way, the signs of the unobserved nodes, an ‘expanded pattern’ of the partial pattern is obtained. The maximal strongly-connected components of the CE graph for the expanded pattern are considered to be candidates for the origin of the failure. An example of the expanded pattern of the partial pattern in Table 3 is given in Table 4, and the CE graph for the expanded pattern is shown in Fig. 6.
Table 3
Node:  V1  F1  L   V2  F2  V3  F3
Sign:      0   0       -       +
[Table 4. An expanded pattern of the partial pattern in Table 3.]
[Fig. 6.]
3.2 Presumption of a single origin and formulation of the problem
In order to restrict the locations of the origins of failure as far as possible, the following fundamental presumption is adopted, since the probability of the simultaneous occurrence of more than one failure is considered to be extremely small.
Presumption. There is a single origin of the system failure.
If the CE graph for the expanded pattern satisfies the presumption of a single origin, it should have only one maximal strongly-connected component (such a CE graph is called a ‘rooted digraph’). Then the problem of searching for the origin of a system failure under the presumption of a single origin is formulated as follows.
Problem. Given a signed digraph and a partial pattern on it, enumerate the expanded patterns which make the corresponding CE graph rooted.
4. FRAMEWORK OF THE ALGORITHM FOR SEARCHING THE ORIGIN OF A FAILURE
The problem formulated in the preceding section can be solved in principle by enumerating the CE graphs for all the possible expanded patterns and by testing whether each CE graph is rooted or not. However, such a primitive method requires $3^M$ trials when $M$ unobserved nodes exist in the signed digraph. The concept of the partial pattern can be extended to a set of signs of observed nodes and those of assumed ones. An assumed node also is said to be valid if it has a nonzero sign. A branch is said to be consistent if it satisfies one of the five conditions (i)-(v) in the previous section irrespective of whether its initial and terminal nodes are observed nodes or assumed ones. Furthermore, a branch is said to be semi-consistent if it satisfies one of the following two conditions: (a) At least one of the initial or terminal nodes of the branch is not assumed. (b) Both initial and terminal nodes of the branch are valid.
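The primitive enumeration described above can be sketched in a few lines of Python for the buffer tank. This is an illustration under stated assumptions, not the authors' program: the branch list is inferred from the text rather than read from Fig. 2; the branch from L to V3 is taken to be the only control-information carrying branch; the signs ‘⊕’ and ‘⊖’ are written `"(+)"` and `"(-)"`; and the controlled node L, observed as ‘0’, is treated as hiding one of ‘0’, ‘⊕’ or ‘⊖’, so that four nodes are enumerated ($3^4 = 81$ trials). Rootedness is tested by looking for a node from which every valid node is reachable, which is equivalent to the CE graph having exactly one maximal strongly-connected component. All names are hypothetical.

```python
from itertools import product

PLUS, MINUS, ZERO, CPLUS, CMINUS = "+", "-", "0", "(+)", "(-)"
VAL = {PLUS: 1, MINUS: -1}

# (initial node, terminal node, branch sign, carries control information)
# ASSUMED branch list, see the lead-in.
BRANCHES = [
    ("V1", "F1", PLUS, False), ("F1", "L", PLUS, False),
    ("V2", "F2", PLUS, False), ("F2", "L", MINUS, False),
    ("V3", "F3", PLUS, False), ("F3", "L", MINUS, False),
    ("L", "V3", PLUS, True),
]


def consistent(tail, phi, head, control):
    """Extended consistency test, conditions (i)-(v) of Section 2.3."""
    p = VAL[phi]
    if tail in VAL and head in VAL:                    # (i)
        return VAL[tail] * p * VAL[head] > 0
    if tail in VAL and head == CPLUS:                  # (ii)
        return VAL[tail] * p > 0
    if tail in VAL and head == CMINUS:                 # (iii)
        return VAL[tail] * p < 0
    if control and tail == CPLUS and head in VAL:      # (iv)
        return p * VAL[head] > 0
    if control and tail == CMINUS and head in VAL:     # (v)
        return p * VAL[head] < 0
    return False


def ce_graph(pattern):
    """Successor lists of the CE graph: valid nodes, consistent branches."""
    valid = {n for n, s in pattern.items() if s != ZERO}
    succ = {n: [] for n in valid}
    for u, v, phi, control in BRANCHES:
        if u in valid and v in valid and consistent(pattern[u], phi, pattern[v], control):
            succ[u].append(v)
    return succ


def roots(succ):
    """Nodes from which every valid node is reachable; empty if not rooted."""
    def reach(start):
        seen, todo = {start}, [start]
        while todo:
            for w in succ[todo.pop()]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen
    return {r for r in succ if len(reach(r)) == len(succ)}


def rooted_expansions(observed, candidates):
    """Try every combination of candidate signs and keep the rooted ones."""
    names = sorted(candidates)
    found = []
    for signs in product(*(candidates[n] for n in names)):
        pattern = {**observed, **dict(zip(names, signs))}
        origin = roots(ce_graph(pattern))
        if origin:
            found.append((pattern, origin))
    return found


# Partial pattern of Table 3 (V1-V3 unobserved, L observed as '0' but controlled).
OBSERVED = {"F1": ZERO, "F2": MINUS, "F3": PLUS}
CANDIDATES = {"V1": [PLUS, ZERO, MINUS], "V2": [PLUS, ZERO, MINUS],
              "V3": [PLUS, ZERO, MINUS], "L": [ZERO, CPLUS, CMINUS]}

for pattern, origin in rooted_expansions(OBSERVED, CANDIDATES):
    print(sorted(origin), pattern)
```

With these assumed branches, only two of the 81 expanded patterns give a rooted CE graph, one with V2 and one with F2 as the origin. The cost of the loop grows as $3^M$, which is what the quasi-CE graph and the depth-first pruning of the framework below are meant to avoid.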
The framework of an origin-searching algorithm combined with the depth-first technique can be outlined as follows:
(1) Input a partial pattern consisting of signs of observed nodes.
(2) Assume a sign for a non-assumed node and expand the partial pattern by adding the sign to it.
(3) Decompose the quasi-CE graph for the partial pattern into strongly-connected components and determine the partial order among them. At this step two cases are possible:
Case A-If the quasi-CE graph has more than one maximal essential component ($m \geq 2$), stop expanding the partial pattern and examine the possibility of changing the assignment of a sign to one of the assumed nodes.
Case B-If the quasi-CE graph has at most one maximal essential component ($m \leq 1$), the case is further divided into two subcases:
Case B1-If there are one or more non-assumed nodes, return to step (2).
Case B2-If there exists no non-assumed node, output the current pattern and the CE graph for it, which represents one of several possible ways of propagation of the failure. Then, go over the possibility of changing the assignment of a sign to one of the assumed nodes.
Repeat this process until all the possibilities are exhausted.
5. METHODS FOR GRAPHICAL REPRESENTATION OF CHEMICAL PROCESSES
Signed digraphs for chemical processes are obtained by use of the following two methods: (1) Produce a signed digraph from operation data and the experience of operators. (2) Produce a signed digraph from a mathematical model of the process. In the signed digraph obtained by the first method, few unobserved nodes would be contained, but the branches would represent not only direct but also indirect influences. The first method, in addition, cannot cover patterns of failures which have not been predicted from either experience or guess. When the operation data and the experience of operators are not sufficient to obtain a consistent representation of the process, it is more desirable to construct the signed digraph from the mathematical model. The mathematical model usually consists of ordinary differential equations and algebraic equations. In general, ordinary differential equations can be rewritten in the form:
$$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n).$$
A branch is defined to start from $x_j$ and end at $x_i$ if $\partial f_i/\partial x_j \neq 0$, and the sign of $\partial f_i/\partial x_j$ is assigned to the branch. If the sign of $\partial f_i/\partial x_j$ varies depending on the situation, two branches starting from $x_j$ and ending at $x_i$, with different signs ‘+’ and ‘-’ respectively, should be defined. No self-loop is defined even if $\partial f_i/\partial x_i \neq 0$, because the self-loop has nothing to do with the search for the origin of failure.
6. ILLUSTRATIVE EXAMPLE
When a partial pattern as shown in Table 3 is observed, possible origins of the failure are searched for according to the following procedures.
(a) The quasi-CE graph for the partial pattern consisting of signs of observed nodes only, as shown in Table 3, is given as Fig. 7. Though the sign of L is ‘0’, L is not removed from the quasi-CE graph because it is a controlled node.
(b) If ‘⊖’ or ‘0’ is assigned to L, the quasi-CE graph is split into two parts as shown in Fig. 8 (a) or (b). Thus, the sign of L should be ‘⊕’. The extended pattern obtained by adding the sign of L to the partial pattern in Table 3 is given as Table 5 and the corresponding quasi-CE graph is shown in Fig. 8 (c).
(c) If the sign of V3 is assumed as ‘-’ or ‘0’, the quasi-CE graph is split into two parts as shown in Fig. 9 (a) or (b). Thus, the sign of V3 should be ‘+’. Table 6 shows the extended pattern obtained by adding the sign of V3 to the partial pattern in Table 5. The quasi-CE graph for the extended pattern is shown in Fig. 9 (c).
(d) Extended patterns corresponding to the assignment of ‘-’, ‘+’ and ‘0’ to V2, respectively, are shown in Tables 7 (a)-(c). The CE graphs for the patterns given in Tables 7 (b) and (c) are shown in Figs. 10 (b) and (c), respectively.
(e) Since we have already examined all possible extended patterns of the original partial pattern in Table 3, it can be concluded that the origin of the failure is either sticking of valve V2 or an abnormal decrease of flow rate F2, perhaps caused by blockage of the line or a failure of the pump.
Table 5
Node:  V1  F1  L   V2  F2  V3  F3
Sign:      0   ⊕       -       +
Table 6
Node:  V1  F1  L   V2  F2  V3  F3
Sign:      0   ⊕       -   +   +
Table 7
Node:  V1  F1  L   V2  F2  V3  F3
(a):       0   ⊕   -   -   +   +
(b):       0   ⊕   +   -   +   +
(c):       0   ⊕   0   -   +   +
[Fig. 7.]
[Fig. 8(a).] [Fig. 8(b).] [Fig. 8(c).]
[Fig. 9(a).] [Fig. 9(b).] [Fig. 9(c).]
[Fig. 10(a).] [Fig. 10(b).] [Fig. 10(c).]
REFERENCES
[1] P. K. Andow & F. P. Lees, Process computer alarm analysis: Outline of a method based on list processing. Transactions of the Institution of Chemical Engineers 53, 195-208 (1975).
[2] R. E. Tarjan, Depth-first search and linear graph algorithms. SIAM J. Computing 1 (2), 146-160 (1972).