Signed digraph based multiple fault diagnosis


Computers chem. Engng, Vol. 21, Suppl., pp. S655-S660, 1997

Pergamon

© 1997 Elsevier Science Ltd All rights reserved Printed in Great Britain

PII:S0098-1354(97)00124-5

0098-1354/97 $17.00+0.00

Hiranmayee Vedam, Venkat Venkatasubramanian*
Laboratory for Intelligent Process Systems, School of Chemical Engineering, Purdue University, W. Lafayette, IN 47906-1283, U.S.A.

Abstract. Abnormal Situation Management (ASM) has received considerable attention from industry and academia recently. The first step towards better ASM is the timely detection and diagnosis of the abnormal situation. Most of the existing methods for fault diagnosis assume that only a single fault occurs at any given time. However, multiple faults do occur in processes, albeit less frequently than single faults. When multiple faults occur, existing methods lead either to an incorrect diagnosis or to no diagnosis at all. Multiple fault diagnosis (MFD) is a difficult problem because the number of combinations grows exponentially with the number of faults. In this paper, a signed directed graph (SDG) based algorithm for MFD is developed. The computational complexity is handled by assuming that the probability of occurrence of a multiple fault scenario decreases with an increasing number of faults involved. SDG based diagnosis, like any other qualitative method, has poor resolution. This poor resolution is overcome by using a knowledge base consisting of knowledge about the process constraints, maintenance schedules etc. The proposed algorithm is implemented in Gensym's expert system shell, G2. The application of the algorithm is illustrated using an industrial scale simulation of a standard FCCU called TRAINER.

1 Introduction

Abnormal situation management (ASM) involves the timely detection, diagnosis and correction of abnormal process conditions. An estimated $20 billion is lost annually by the petrochemical industry in the US due to insufficient ASM. It is also estimated that there were 240 plant shutdowns during a one year period that could have been prevented (Nimmo, 1995). The identification of the root causes for the abnormal situation from process measurements is known as fault diagnosis (see footnote 1). During an abnormal situation, the operators are typically flooded with dozens of alarms. An operator is expected to make a quick decision about the root causes and take corrective action in a very short span of time, making ASM a difficult task. An automated framework to support operator decision-making in performing fault diagnosis and suggesting possible corrective actions can improve the situation significantly.

Significant attention has been given by the research community in recent years to automating fault diagnosis. These automated fault diagnosis methods can be classified into model based methods like signed directed graphs (SDG) (Iri et al., 1979; Wilcox and Himmelblau, 1994), observer based methods (Frank, 1990) and assumption based methods (Petti et al., 1990), and process history based methods like neural networks (Thompson and Kramer, 1994; Kavuri and Venkatasubramanian, 1994), principal component analysis (Nomikos and MacGregor, 1994), and qualitative trend analysis (Rengaswamy and Venkatasubramanian, 1995). Qualitative methods have the advantage that they do not require detailed knowledge about the process and are relatively easy to develop. Their disadvantage is a lack of resolution, i.e., the diagnostic method reports a very large number of possible root causes. Also, most of the diagnostic methods assume that the abnormal situation is due to a single root cause. Hence these methods would lead to either a wrong diagnosis or a complete lack of diagnosis when multiple faults occur.

Multiple fault diagnosis is a difficult problem because it has a combinatorially explosive search space. The probability of multiple faults occurring in a process is often small, but when multiple faults do occur, they make it difficult for the operator to identify the root causes and reduce the number of possible corrective actions. Operators receive little or no training in identifying multiple faults. This can significantly hamper the ability of the operator to cope with the abnormal situation when multiple faults occur. de Kleer and Williams (1987) developed an assumption based algorithm to diagnose multiple faults in digital circuits. The disadvantage of their approach is that the computational complexity increases exponentially with the number of faults. Morales and Garcia (1990) modified their development by using a group propagation technique to reduce the computational complexity. They applied their modular approach to diagnose multiple faults in digital circuits. However, the above approaches are not directly applicable to chemical processes because of the dynamic nature of chemical processes. Furthermore, the diagnosis in chemical processes for ASM should be performed on-line in a short period of time. Hence, computational efficiency is a crucial factor. Chung et al. (1994) developed a SDG-neural network based method for identifying multiple incipient faults in a nuclear power plant. They developed SDGs for subsystems of the nuclear plant. Each SDG operates under a single fault assumption. The root causes from individual SDGs are combined to arrive at multiple faults.

In this paper, we will discuss algorithms to perform multiple fault diagnosis (MFD) using a SDG for the whole process. The computational complexity arising from combinatorics is reduced by the assumption that the probability of occurrence of a multiple fault scenario decreases with the number of faults in a given scenario. The resolution of SDG based diagnosis is significantly enhanced by using a knowledge base to screen out root nodes which have very little probability. The proposed algorithms will be illustrated on a simulation of a standard FCCU called TRAINER (SACDA, 1995). Section 2 will discuss the development of signed digraphs, the knowledge base development and the algorithm for SDG based single fault diagnosis. In Section 3, the shortcomings of the existing algorithm in the context of MFD are elaborated and the algorithms to perform MFD are presented. In Section 4, the case study is briefly discussed and results of MFD for the case study are presented. We will conclude with a discussion of the present work and suggestions for future work.

* Author to whom all correspondence should be addressed. Email: venkat@ecn.purdue.edu Fax: 1-317-494-0805

1 The terms root cause and fault are used interchangeably in this paper.

PSE '97-ESCAPE-7 Joint Conference

2 Signed Directed Graphs

A signed directed graph is a representation of the process causal information, in which the process variables (and parameters) are represented as graph nodes and causal relations are represented by directed arcs. Nodes in the SDG assume values of (0), (+) and (-) representing the nominal steady state value, higher and lower than steady-state values respectively. Directed arcs point from a cause node to its effect node. Arc signs associated with each directed arc can take values of (+) and (-) representing whether the cause and effect change in the same direction or opposite direction respectively. A SDG may also include conditional arcs which become active only if certain conditions are satisfied. For example, the arc connecting a manipulated variable to the controlled variable is active only if the controller is not in manual mode.

A SDG for the process can be developed from model equations representing the process or from the operator's knowledge of the process. An automated framework for the development of SDGs based on model equations was developed by Mylaraswamy (1996). For many processes, however, model equations may be available only for some of the units. Hence, a combination of the model equation based approach and the operator's knowledge is required to develop the SDG for the entire process. Process data for various abnormal situations is used as the operator's knowledge in this work. A partial digraph for the process can be built using model equations for the units, wherever available. The operator's knowledge is then used to infer the cause-effect relations in the units where model equations are not available. This procedure is elucidated while developing the SDG for TRAINER FCCU in Section 4.

One of the serious limitations of qualitative methods like SDG is their lack of resolution. The reason can be attributed to the qualitative ambiguities involved in this kind of approach (Kuipers, 1986). However, SDGs have the distinction of finding all possible fault candidates. In this work, the resolution of SDG based fault diagnosis is improved using a knowledge base which screens out physically impossible root nodes. This knowledge base can consist of knowledge about reliability of equipment, infeasible root nodes and information about equipment maintenance. If a piece of equipment has been recently serviced and its reliability is high, then the root node pertaining to that equipment has very little probability of being a possible root node. For example, the heat exchanger transfer coefficient decreases with time due to fouling, so if no recent maintenance was performed, a positive change in the heat exchanger transfer coefficient can be ruled out as a root node. Using such a knowledge base, we will show in Section 4 how the resolution of SDG based fault diagnosis has improved by more than 52% on average for the TRAINER FCCU.

Iri et al. (1979) proposed an algorithm for using SDG for fault diagnosis.
The inherent assumption in their approach was that a single root cause which can explain the given abnormal situation can be found. They also assume that there exists a valid causal path between the root node and the observed abnormal measurements. A root node is any node in the digraph which has at least one consistent arc connecting it to an effect node and no consistent arc connecting it to a cause node. An arc is said to be consistent if sign(cause) * sign(arc) * sign(effect) = (+). Wilcox and Himmelblau (1994) presented a new digraph based diagnostic reasoning approach called the possible cause-effect graph. This approach reduces the search space and hence the number of root causes generated. Based on these approaches, the basic algorithm for performing single fault diagnosis using SDG is summarized in Figure 1. The basic steps involved in performing SDG based single fault diagnosis are:

1. Propagate the deviation in the nodes representing process measurements (measured nodes) from effect to cause node via consistent arcs till the root nodes are identified.
2. Use the knowledge base to screen out physically impossible root nodes.
3. If a conflict exists in assigning a sign to any node then a voting scheme is used.
4. For each of the root nodes a breadth-first search is performed to check if a valid causal path exists between the root node and the observed abnormal measurements.
5. A root node is identified as the root cause for the abnormal situation if such a causal path exists.

Figure 1: Algorithm for SDG based single fault diagnosis (flowchart: propagate node deviations from effect node to cause node; perform breadth-first search with the root node as origin; check for a path from the root node to all the abnormal measured nodes; if one exists, the root cause is identified)

Consider, for example, the SDG for the preheater section in Model IV FCCU (McFarlane et al., 1993) in Figure 2. The variables T2 and T3 are measured. If T2 increases and T3 decreases, then the T2 and T3 nodes in the SDG receive values of (+) and (-) respectively. These deviations are propagated from effect to cause nodes via consistent arcs till root nodes are identified. The root nodes identified for the present abnormal situation are {F3, T1, Qloss, UAf, [illegible], Ttm}. The knowledge base is used to screen out the root node with very little probability, Ttm. A breadth-first search is performed on each of the remaining root nodes. However, there exists no valid causal path from any of the root nodes to both T2 and T3. Hence, no root cause is identified using the above algorithm. In Section 3, we will show how the MFD algorithm identifies the correct root cause, namely {F3 and UAf}.

Figure 2: SDG for Preheater Section of Model IV FCCU

3 Multiple Fault Diagnosis Using SDG

The algorithm described in Section 2 for performing single fault diagnosis will lead to either lack of diagnosis or wrong diagnosis when multiple faults occur in a process. The algorithm tries to find a single consistent path from the root node to all the abnormal measurements. Since no such path exists when multiple faults occur, the algorithm might find a wrong root cause or might not find any root node that can explain the abnormal measurements. Hence, the above algorithm needs to be modified to perform multiple fault diagnosis. A simplistic approach would be to change the single fault diagnosis algorithm as follows:

1. Perform Steps 1-3 as in the single fault diagnosis discussed in Section 2.
2. All combinations of root nodes which can explain all the observed measured node deviations are identified.
3. Minimal cut sets of such combinations are identified as possible multiple root causes for the observed abnormal situation. A minimal cut set is defined as the minimal number of root causes required to explain the given abnormal situation.
4. A combination of root nodes which can explain all the observed measured node deviations and is minimal is called a root cause.

The MFD algorithm discussed above, called MFD1, is computationally expensive and the computational time increases exponentially with the number of root nodes. For example, if the number of root nodes after Step 1 is, say, 20, then the worst case estimate of the number of combinations explored in Step 2 above is of the order of 10^10. In order to perform MFD in a reasonable period of time, MFD1 is modified based on the assumption that the probability of occurrence of multiple faults decreases with the number of faults in a given scenario. We call this algorithm MFD2. For example, if the number of root nodes is 20, assume all root nodes are equally likely for the given abnormal situation. The probability that one of them can explain the abnormal situation is 0.05, that two root nodes can


explain the scenario is 0.0025, and so on. Hence a single fault is 8000 times more likely than a four fault scenario that can explain the same abnormal situation. The algorithm can be described as follows:

1. Perform Steps 1-3 as in the single fault diagnosis discussed in Section 2.
2. Arrange the root nodes in decreasing order according to the number of measured node deviations each root node can explain.
3. Combinations of root nodes are called root node lists (RNLs). RNLs with fewer root nodes are explored before RNLs with more root nodes. For example, single root nodes which can explain the observed abnormal situation are explored before RNLs consisting of two root nodes that can explain the same abnormal situation.
4. Whenever a RNL can explain the observed abnormal situation, all other combinations involving that RNL are removed from the search space, thus reducing the search space dramatically. The rationale behind this step is that if a RNL, say RNL1, can explain a given abnormal situation, then every other RNL containing all the nodes in RNL1 is a superset of RNL1; hence RNL1 is the minimal cut set of all such RNLs.
5. If the number of root nodes in any RNL reaches a maximum number, Nmax, and that RNL is still unable to explain the abnormal situation, then such RNLs are also eliminated, further reducing the search space. This is a valid procedure because of the assumption that the probability of occurrence of an RNL that can explain the observed abnormal situation decreases with the number of nodes in that RNL.
6. An RNL which can explain the measured node deviations and is minimal is called a root cause.

Single fault diagnosis is a special case of this algorithm when Nmax is set equal to 1. For the example with 20 root nodes, if Nmax is set to 3, then the number of combinations explored in the worst case is 1350. The reduction in computational load for 20 nodes with Nmax equal to 3 is of the order of 10^8, making the algorithm an attractive one.

Revisiting the SDG for the preheater section of Model IV FCCU, each of the root nodes is examined to see if it can explain the deviation in T2 and T3. Since no such root node exists, two node combinations of the root nodes form two node RNLs. Each RNL is examined to see if the root nodes in the RNL can explain the deviation in T2 and T3. An RNL consisting of {F3, Qloss} can explain the deviations and hence becomes a root cause. Then the RNL is removed from the search space. Similarly, an RNL consisting of {F3, UAf} also becomes a root cause and is hence removed from the search space. However, an RNL consisting of {F3, T1} cannot explain the observed deviations and is hence retained in the search space. If Nmax is set to 3, then combinations of three nodes are formed from the root nodes and the two node RNLs that cannot explain the observed measured node deviation. One such RNL is {F3, T1, Qloss}. Even though this RNL can explain the observed measured node deviation, it does not become a root cause because its subset {F3, Qloss} is already a root cause. Hence this RNL is discarded. If any RNLs with three nodes still remain in the search space after all the combinations are explored, the search is stopped because Nmax is set to 3. If none exists, then the algorithm stops because the search space has been explored completely.

This algorithm differs from MFD1 in that MFD1 explores all the combinations possible before finding the minimal cut sets. The approach to root cause identification in MFD1 is similar to identifying minimal cut sets in Fault Tree Analysis. Only the performance of the MFD2 algorithm on TRAINER FCCU is discussed in Section 4, since MFD1 did not converge for the abnormal situations examined.

Figure 3: Signed Digraph for Slurry PA of TRAINER FCCU
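The MFD2 steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' G2 implementation: the function names, the input format (a dictionary mapping each root node to the set of measured deviations it can explain) and the coverage sets for the preheater example are all assumptions made here, and treating an RNL's explanatory power as a simple set union ignores the path-consistency checks that a real SDG search would perform.

```python
from itertools import combinations
from math import comb


def mfd2(coverage, observed, n_max):
    """Return minimal root node lists (RNLs) that explain `observed`.

    coverage: dict mapping root node -> set of measured deviations it can
              explain (hypothetical input format for this sketch).
    observed: set of measured deviations that must be explained.
    n_max:    largest RNL size to explore (Nmax in the text).
    """
    # Step 2: order root nodes by how many observed deviations each explains.
    nodes = sorted(coverage, key=lambda n: len(coverage[n] & observed),
                   reverse=True)
    root_causes = []
    # Step 3: explore smaller RNLs before larger ones.
    for size in range(1, n_max + 1):
        for rnl in combinations(nodes, size):
            candidate = frozenset(rnl)
            # Step 4: skip supersets of an RNL already accepted.
            if any(found <= candidate for found in root_causes):
                continue
            explained = set().union(*(coverage[n] for n in rnl))
            if observed <= explained:
                root_causes.append(candidate)
    # Step 5 is the loop bound: RNLs larger than n_max are never formed.
    return root_causes


def worst_case_rnls(n_roots, n_max):
    """Number of RNLs of size 1..n_max that can be formed from n_roots nodes."""
    return sum(comb(n_roots, k) for k in range(1, n_max + 1))


# Invented coverage sets for the preheater example (not read from Figure 2).
coverage = {
    "F3": {"T2+"},
    "T1": {"T2+"},
    "Qloss": {"T3-"},
    "UAf": {"T3-"},
}
observed = {"T2+", "T3-"}
causes = mfd2(coverage, observed, n_max=3)
```

With these invented sets the search accepts {F3, Qloss} and {F3, UAf}, rejects {F3, T1}, and skips {F3, T1, Qloss} as a superset of an accepted RNL, mirroring the walkthrough in the text. It also accepts the pairs containing T1, which is a by-product of the invented coverage sets rather than a result from the paper. `worst_case_rnls(20, 3)` evaluates 20 + 190 + 1140 and reproduces the 1350 combinations quoted above; the factor of 8000 is 20^3, the ratio 0.05 / 0.05^4.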

4 MFD for TRAINER FCCU

In this Section, MFD2 is applied to an industrial simulation of a standard FCCU called TRAINER. This simulation was provided to us by Honeywell Technology Center in the form of executables. No model equations for any part of the FCCU are available. The FCCU simulated is a stacked regenerator/disengager design with the regenerator on the bottom and operating at a higher pressure. The converter consists of an external riser standpipe and single stage stripper. A waste heat boiler is used to extract heat from the hot flue gas that emerges from the regenerator. It provides steam required for the air blower steam turbine and a number of steam driven pumps. A centrifugal air blower provides the regenerator air. A main fractionator with several pumparounds and a single stage wet gas compressor form the downstream processing units. The feed to the riser is preheated using the slurry pumparound (PA) from the main fractionator. Additional heating of the feed is provided using a feed furnace before the feed enters the riser. A slide valve controls the catalyst flow from the regenerator to the riser. The flow and temperature of the feed, catalyst, steam and air are controlled using low level PID controllers. The level controllers are cascaded around the flow controllers where necessary. Field operated devices and override controllers are also provided. The heating of feed using slurry PA and the waste heat boiler make the system highly coupled. This makes decomposition into subsystems a difficult task. Over 300 different abnormal scenarios can be simulated (Mylaraswamy, 1996; SACDA, 1995).

Table 1: Increase in Resolution Using a Knowledge Base

Fault No.   Number Before KB   Number After KB   Improvement (%)
1           42                 23                45
2           33                 10                70
3           33                 10                70
4           33                 7                 79
5           34                 11                68
6           8                  5                 37
7           4                  4                 0
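The improvement column and the average quoted in Section 2 can be checked directly from the counts in Table 1. The short Python sketch below assumes that improvement is defined as the percentage reduction in the number of candidate root causes; the paper does not state the formula explicitly, but this definition reproduces the tabulated values.

```python
# Root-cause counts from Table 1: (before KB, after KB) for faults 1 to 7.
counts = [(42, 23), (33, 10), (33, 10), (33, 7), (34, 11), (8, 5), (4, 4)]


def improvement(before, after):
    """Percentage reduction in the number of candidate root causes."""
    return 100.0 * (before - after) / before


per_fault = [improvement(b, a) for b, a in counts]
average = sum(per_fault) / len(per_fault)
```

Under this definition the average comes to about 52.7%, consistent with the "more than 52%" stated in Section 2. Fault 6 evaluates to exactly 37.5%, which the table lists as 37.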

The SDG for TRAINER is developed using a combination of model based and operator's knowledge as discussed in Section 2. Model equations are available for equipment like controllers, shell and tube heat exchangers, fan type heat exchangers and valves. Model equations for large equipment like the main fractionator, air blower, riser/regenerator and wet gas compressor are not available. The knowledge base consists of the data obtained from the simulation for ten different faults. The partial SDGs for equipment with known model equations are combined with the data from the operator's knowledge to construct the SDG for the TRAINER FCCU. The SDG consists of 69 measured nodes and 224 unmeasured nodes. The SDG has the capability to diagnose 88 different faults and their combinations. It took about 100 (wo)man hours to build the SDG. The SDG for slurry PA is shown in Figure 3.

Knowledge based systems to perform both single fault diagnosis and multiple fault diagnosis are implemented in Gensym's real time expert system shell, G2. The single fault diagnosis algorithm has been implemented as a part of AEGIS (Abnormal Event Guidance and Information System), Honeywell's prototype for ASM (ASM Home Page). The original SDG algorithm had very poor resolution as shown in Table 1. The actual abnormal situations used cannot be discussed due to proprietary reasons. Significant improvements in the algorithm have been achieved by using a knowledge base which can sieve out physically impossible faults. The number of root causes identified after using the knowledge base (KB) is shown in Table 1. A sample of the rules used in the knowledge base is shown in Table 2.

Table 2: Sample Rules Used in the Knowledge Base

Pump efficiency has values (-) or (0)
UAF has values (-) or (0)
Power is (-) or (0)
Fans are (-) or (0)
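Rules of the kind listed in Table 2 amount to a set of allowed qualitative values per root node. The Python sketch below shows one way such screening could be expressed; it is an assumed representation for illustration only, not the rule format used in G2, and the identifiers (including the `feed_flow` node, which has no rule) are hypothetical.

```python
# Allowed qualitative values per root node, following Table 2.
# The keys are hypothetical identifiers chosen for this sketch.
ALLOWED = {
    "pump_efficiency": {"-", "0"},
    "UAF": {"-", "0"},
    "power": {"-", "0"},
    "fans": {"-", "0"},
}

ANY_SIGN = {"+", "-", "0"}


def screen(candidates, allowed=ALLOWED):
    """Drop (node, sign) candidates whose sign the knowledge base rules out.

    Nodes without a rule are kept unchanged.
    """
    return [(node, sign) for node, sign in candidates
            if sign in allowed.get(node, ANY_SIGN)]


candidates = [
    ("pump_efficiency", "+"),  # ruled out: efficiency cannot increase
    ("pump_efficiency", "-"),
    ("UAF", "+"),              # ruled out by the UAF rule
    ("feed_flow", "+"),        # no rule for this node, so it is kept
]
kept = screen(candidates)
```

In this example two of the four candidate root nodes are removed before any search over root node lists begins, which is how a knowledge base of this kind reduces the candidate counts reported in Table 1.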

MFD2 has been implemented on several combinations of the faults. In all the cases, the actual fault combination is correctly identified. The computational time is less than a minute on a Sparc 10 machine for all the abnormal situations studied. The algorithm can correctly identify the fault combination even in cases when the effect of one fault annulled the effect of the other on some of the measured variables. It can also identify multiple faults that occur both sequentially and simultaneously. Only one abnormal situation is discussed in detail here due to space constraints. The abnormal situation is a combination of loss in slurry PA due to a decrease in pump efficiency and a positive drift in the controller transmitter that controls the air inflow to the regenerator. The simulation starts at steady state. After 1 minute the air controller transmitter begins to drift. The slurry PA pump efficiency starts decreasing at t=5 minutes. The algorithm identifies air flow controller transmitter failure as one of the root causes between t=1 minute and t=5 minutes. After t=5 minutes, both the faults are identified. The output of the algorithm after t=5 minutes in the final form presented to the operator is shown in Figure 4. Some of the other root causes that can show similar deviations in the process measurements, like transmitter drifts in slurry PA controllers and reduction in air blower capacity, are also identified.

5 Discussion and Future Work

This paper discusses algorithms for performing MFD using SDGs. The algorithm differs from the earlier approaches in that the multiple faults are explored only on an as-needed basis, reducing the computational load significantly. The resolution of SDG based diagnosis is improved significantly using a knowledge base. The algorithm is illustrated using TRAINER FCCU. The multiple faults identified need to be presented coherently to the operator. One approach to presentation is shown in Figure 4. Better ways to present the results are being explored. From Figure 4 it can be seen that the resolution in the case of multiple faults is still inadequate. We are currently investigating several methods to improve the resolution and to perform conflict resolution.

Figure 4: Diagnosis output of MFD2 for Failure of Slurry PA pump and Bias in air flow controller transmitter. F5 - Air flow controller transmitter bias; F1 - Slurry PA pump failure

References

ASM Home Page, The abnormal situation management joint research and development consortium. URL Ref.: http://www.iac.honeywell.com/Pub/Tech/asmwww.html.
Chung, H., Z. Bien, J. Park, and P. Seong, Incipient multiple fault diagnosis in real time with application to large-scale systems. IEEE Trans. Nuclear Science, 41(4), 1692-1703 (1994).
de Kleer, J. and B. C. Williams, Diagnosing multiple faults. Artificial Intelligence, 32, 97-130 (1987).
Frank, P. M., Fault diagnosis systems using analytical and knowledge-based redundancy - a survey. Automatica, 26, 459-474 (1990).
Iri, M., K. Aoki, E. O'Shima, and H. Matsuyama, An algorithm for diagnosis of system failures in chemical processes. Comput. Chem. Engng., 3, 489-493 (1979).
Kavuri, S. and V. Venkatasubramanian, Neural network decomposition strategies for large-scale fault diagnosis. Int. J. Control, 59, 767-792 (1994).
Kuipers, B., Qualitative simulation. Artificial Intelligence, 29, 289-338 (1986).
McFarlane, R., R. Reineman, J. Bartee, and C. Georgakis, Dynamic simulator for a Model IV fluid catalytic cracking unit. Comput. Chem. Engng., 17(3), 275-300 (1993).
Morales, E. and H. Garcia, Artificial Intelligence in Process Engineering, chapter 5. Academic Press (1990).
Mylaraswamy, D., DKIT: A Blackboard-based, distributed, multi-expert environment for Abnormal Situation Management. PhD thesis, Purdue University (1996).
Nimmo, I., Adequately address abnormal situation operations. Chem. Eng. Prog., 91(9), 36-45 (1995).
Nomikos, P. and J. F. MacGregor, Monitoring of batch processes using multiway principal component analysis. AIChE J., 40, 1361-1375 (1994).
Petti, T., J. Klein, and P. Dhurjati, Diagnostic model processor, using deep knowledge for process fault diagnosis. AIChE J., 36(4), 565-575 (1990).
Rengaswamy, R. and V. Venkatasubramanian, A syntactic pattern-recognition approach for process monitoring and fault diagnosis. Engng Applic. Artif. Intell., 8(1), 35-51 (1995).
SACDA, Process Model Description: Fluidized Catalytic Cracking Unit Standard Model. SACDA Inc. (1995).
Thompson, M. and M. A. Kramer, Modeling chemical processes using prior knowledge and neural networks. AIChE J., 40(8), 1328-1338 (1994).
Wilcox, N. A. and D. M. Himmelblau, The possible cause-effect graph model for process fault diagnosis - I. Methodology. Comput. Chem. Engng., 18(2), 103-116 (1994).

Acknowledgement The authors would like to thank Honeywell Technology Center and the CIPAC consortium member companies which provided the financial support for this research. We also would like to thank the ASM consortium members for their valuable feedback and insights into FCC operations.