Failures in control systems

Reliability Engineering 7 (1984) 193-211

Failures in Control Systems

M. Galluzzo, Istituto di Ingegneria Chimica, Università di Palermo, Sicily, Italy

and P. K. Andow, Department of Chemical Engineering, Loughborough University of Technology, Loughborough, Leicestershire LE11 3TU, Great Britain

(Received: 24 February, 1983)

ABSTRACT

There is an increasing demand for detailed safety and reliability analyses of new and existing plant designs. Quite often this will involve fault-tree construction. There are well-known methods for constructing trees for ordinary items of plant. Many attempts have been made to provide computer-based aids. There are some quite difficult problems when control loops are encountered. The major contribution to advance in this area has been by Lapp and Powers. Their algorithm has been the subject of much debate and theoretical argument. This paper reports some very simple laboratory experiments to test the algorithm with a real control system. The algorithm is found to be generally good (with minor discrepancies) for proportional controllers, but somewhat lacking if integral control action is used, as is nearly always the case in practice.

Reliability Engineering 0143-8174/84/$03.00 © Elsevier Applied Science Publishers Ltd, Printed in Great Britain

1 INTRODUCTION

A number of failure analysis techniques have been developed over the years. They range from very qualitative ones like criticality analysis [1], HAZOP (hazard and operability studies) [2,3] and FMEA (failure modes and effects analysis) [4], to qualitative and quantitative ones like event trees [5], fault trees [5-7] and cause-consequence diagrams [8]. Well-developed and frequently used computer codes already exist for fault-tree analysis [9-12]. Furthermore, some computerized methods have been proposed for the automatic synthesis of fault trees and cause-consequence diagrams [13-17].

Synthesis is the weakest step in the entire process of failure analysis because, if manually done, it is time consuming and does not guarantee a unique result. This is because different analysts develop the fault tree in different ways. There are two quite distinct reasons for this:

(1) Although the basic fault-tree methodology is well defined, there are some differences when the less common types of gate and event are used. All significant fault trees contain AND and (inclusive) OR gates. Many contain NOT gates. Some analysts also use the exclusive OR gate (written as EOR). Some dispute exists as to the interpretation of fault trees containing this type of gate.
(2) During fault-tree development, it is natural to group together causal paths that have some convenient common physical significance. Different analysts will use different groupings. The result is that even if two fault trees have identical cut sets they may appear quite different. Analysts may also develop trees to different levels of detail, perhaps because they were initially trained in different disciplines and do not perceive systems in the same way.

It is therefore difficult to check fault trees satisfactorily. During the design cycle it is also quite common to change a design as a result of a fault-tree analysis. The modified system must then be analysed again. In this environment it is attractive to have a fault-tree synthesis program for two reasons:

(1) As a repeatable, independent check on manual analysis.
(2) As a means of evaluating a modified design quickly (or perhaps for use in sensitivity analyses).

(It is not usually suggested that automated synthesis is used in isolation, since many of the benefits are due to the insight that is gained into the system by the process of manual synthesis.) Computer codes for synthesis have been developed by several groups, as noted above.
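The remark that two fault trees can look quite different and still have identical cut sets can be checked mechanically. The following is a minimal sketch, not any of the published codes cited above: the tuple encoding of gates, the function names and the two example trees (events A to D) are assumptions made purely for illustration, and only AND and inclusive OR gates are handled.

```python
from itertools import product

# A fault-tree node is either a basic-event name (str) or a tuple
# ("AND" | "OR", [children]). This encoding is an illustrative assumption.


def cut_sets(node):
    """Return all cut sets of `node` as a set of frozensets of basic events."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child is enough to cause the gate output.
        return set().union(*child_sets)
    if gate == "AND":
        # One cut set from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unsupported gate: {gate}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(t < s for t in sets)}


# Two hypothetical analysts group the same causes in different ways.
tree_a = ("OR", ["A", ("AND", ["B", ("OR", ["C", "D"])])])
tree_b = ("OR", [("AND", ["B", "C"]), ("OR", ["A", ("AND", ["B", "D"])])])

if __name__ == "__main__":
    for name, tree in (("tree_a", tree_a), ("tree_b", tree_b)):
        print(name, sorted(sorted(cs) for cs in minimal_cut_sets(tree)))
    print("identical cut sets:", minimal_cut_sets(tree_a) == minimal_cut_sets(tree_b))
```

Comparing minimal cut sets rather than tree shapes is one way to use a synthesis program as the "repeatable, independent check" mentioned above, since the comparison ignores how each analyst chose to group the causal paths.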


Some difficulties still exist in the application of these codes to process systems. Some of them have been indicated for fault trees [18,19]. The main difficulties arise from:

(1) The type of models used for process units.
(2) Time-dependent behaviour.

These difficulties mainly arise in analysing control loops. Their behaviour is in fact strongly dependent on time constants, dead times and dynamic actions of controllers, none of which is considered in normal failure analyses. In some cases, where complex control systems are involved, it would be necessary to use continuous and dynamic models instead of the commonly used discretized steady-state models. This would require discretized or continuous-simulation techniques, but would increase the complexity and hence the cost of the analysis. There is therefore a need to use the simplest models possible for all components, including controllers. For these, only proportional action is usually considered in the published literature.

This paper studies the application of the most common failure analysis technique, the fault-tree method, to a simple control loop and the implications of assuming proportional action only for the controller when integral action is also present. In contrast to much of the published work, the results here are based on experiments on a level control loop carried out (by M.G.) in the control laboratory at Loughborough.

There are two methods that can be used in fault-tree construction where a control loop is encountered. One consists of representing fault propagation by an AND gate with two inputs: 'a disturbance' and 'control loop does not work'. The second one consists of using an operator that takes into consideration how faults can propagate around the loop. We are concerned with the second method here. We shall consider the operator that has received so much attention, namely that used by Lapp and Powers [14]. The operator is in the form of a fault tree and is shown in Fig. 1.

[Fig. 1. Fault-tree form of the Lapp and Powers operator.]

The system we will use throughout this paper is a simple level control loop, shown in Fig. 2. The failures of Table 1 have been introduced for the components of the loop. Failure modes of process components like tank or pipe leakages have been disregarded in order to focus attention on control-loop components. Reverse action of components (which is part of the Lapp and Powers operator) will not appear in this analysis, because this kind of failure is considered to be more relevant to preventive inspection than to fault or disturbance analysis.

[Fig. 2. The level control loop.]

TABLE 1
System Failure Events

Component                      Failure modes
Control valve                  Fails open; Fails closed; Stuck
Level transducer               Fails high; Fails low; Stuck
Controller                     Fails high; Fails low; Stuck
Line transducer-controller     Leakage
Line controller-valve          Leakage
Controller air supply          Low pressure
Transducer air supply          Low pressure

Two top events are of interest here: LEVEL HIGH and LEVEL LOW. The digraph of the system and the fault trees for these events, obtained by applying the Lapp and Powers operator, are shown in Figs. 3, 4(a) and 4(b). As previously stated, the operator does not take account of the type of controller, although by implication the fault trees are supposed to be valid, or at least conservative, whatever controller is used.

[Fig. 3. Digraph of the system.]

[Figs. 4(a) and 4(b). Fault trees for the top events LEVEL HIGH and LEVEL LOW, obtained by applying the Lapp and Powers operator.]

In the digraph, leakages in the lines from the transducer to the controller and from the controller to the control valve are introduced as fault variables directly affecting the input variable of the downstream component. Control devices failing HIGH or LOW are also introduced as univariant input variables instead of branches with different gain. Two levels, moderate (denoted as +1 or -1) and large (denoted as +10 or -10), are considered for disturbances.

The laboratory level control loop used for the experiments consisted of an open tank, a control valve on the output line, a controller and a level measuring instrument. The instruments used were: a pneumatic PID Taylor Transcope controller, a Taylor fixed-range differential pressure transmitter (range 0 to 7.5 × 10³ N m⁻², 0-30 inWG; output 2 × 10⁴ to 1.03 × 10⁵ N m⁻², 3-15 psi) and a Taylor diaphragm valve of the air-to-open type. Monitoring and recording were accomplished by a DEC Minc microcomputer system connected to the loop by pressure-voltage converters. After parameters for proportional (P) and proportional + integral (PI) controllers had been set up by conventional experimental procedure, two complete sets of experiments were carried out for both cases, introducing into the system the same failures considered in drawing the fault trees.
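The instrument ranges above are quoted in both SI and imperial units. As a quick cross-check of the rounded SI figures, the sketch below converts the imperial values using standard conversion factors; the constant and function names are illustrative assumptions, and the inch-of-water factor is the conventional one.

```python
# Standard conversion factors (the names are illustrative, not from the paper).
PA_PER_PSI = 6894.757   # 1 lbf/in^2 expressed in N m^-2
PA_PER_INWG = 249.089   # 1 inch water gauge (conventional) in N m^-2


def psi_to_pa(psi):
    """Convert a pressure in psi to N m^-2."""
    return psi * PA_PER_PSI


def inwg_to_pa(inwg):
    """Convert a pressure in inches water gauge to N m^-2."""
    return inwg * PA_PER_INWG


if __name__ == "__main__":
    checks = [
        ("transmitter range, 30 inWG", inwg_to_pa(30.0)),
        ("output lower limit, 3 psi", psi_to_pa(3.0)),
        ("output upper limit, 15 psi", psi_to_pa(15.0)),
    ]
    for label, value in checks:
        print(f"{label}: {value:.3g} N m^-2")
```

The converted values agree with the quoted 7.5 × 10³, 2 × 10⁴ and 1.03 × 10⁵ N m⁻² to within the rounding used in the text.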

2 SUMMARY OF EXPERIMENTAL RESULTS

2.1 Valve fails closed and valve fails open

These failures are, as expected, one-event cut sets for the top events L(HIGH) and L(LOW) respectively for both P and PI loops.

2.2 Valve stuck

No difference exists between P and PI loops, because if the valve is stuck the controller cannot affect it. If a significant disturbance (> 5% of input flowrate for the particular loop) is not present when the failure is introduced, the loop remains in its initial condition. 'Valve stuck' is therefore not a one-event cut set. Depending on the sequence of events, different results are produced when a moderate disturbance of the input flowrate occurs:

(1) If the disturbance precedes the event 'valve stuck' and the disturbance is one that the control loop can cope with, then the event 'valve stuck' is not sufficient to cause a top event.
(2) If the order of events is reversed, so that 'valve stuck' is followed by the disturbance, then the control loop behaves like an open loop with a fixed resistance. The disturbances that it can cope with are obviously smaller than those in the first case. A disturbance that is tolerated in the first case can therefore cause a top event in the second.

In the first case the gain and integral time of the controller determine the size of the allowed disturbances. In the second case there is no difference between P and PI controllers, and the size of the allowed disturbances is determined by the open-loop characteristics only.

2.3 Transducer fails high and transducer fails low

These failures are, as expected, one-event cut sets for the top events L(HIGH) and L(LOW) respectively for both P and PI control loops.

2.4 Transducer stuck

2.4.1 Loop with P controller

In this case the failure is not sufficient to drive the loop to a top event. A disturbance in the flow input is necessary to cause a top event. The direction of the disturbance determines the particular top event. However, the sequence of events is important, and what has been said for the event 'valve stuck' in sub-section 2.2 holds true here.

2.4.2 Loop with PI controller

In this case, the failure represents a one-event cut set. The noise present is sufficient to drive the system to a top event. In the experiments, a small difference (always less than 6.9 × 10² N m⁻², 0.1 psi) between the actual transducer pressure and the constant pressure of the signal sent to the controller to simulate this failure was sufficient to cause a top event. The same pressure difference was not sufficient to produce the same effect in the loop with a P controller.

2.5 Controller fails high and controller fails low

These failures, as expected, represent one-event cut sets for the top events L(HIGH) and L(LOW) respectively, for both P and PI control loops.

2.6 Controller stuck

From a functional point of view there is no difference between this failure and 'valve stuck' in sub-section 2.2. The behaviour is the same, even with regard to the event sequence with a disturbance.

2.7 Leakage in the line from the transducer to the controller

The size of the leak determines the behaviour. The control loop can compensate for small leakages, while the top event is reached very quickly when the leakage exceeds a certain value. Therefore, in considering this type of failure, it is necessary to define at least a limit on the leakage, above which the leakage represents a one-event cut set. No difference has been observed between loops with P and PI action.

2.8 Leakage in the line from the controller to the valve

The behaviour is qualitatively the same as for the previous failure. The control loop is less sensitive to this second leakage.
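The contrast between sub-sections 2.4.1 and 2.4.2 (a stuck transducer is harmless on its own with P action but a one-event cut set with PI action) can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and not a model of the laboratory rig: the square-root outflow law, the normalised parameter values, the level limits standing in for the top events, and the fixed small offset that stands in for the noise mentioned above.

```python
import math


def simulate(integral_action, stuck_offset=0.01, t_end=2000.0, dt=0.1):
    """Euler simulation of a tank whose level transducer is stuck.

    All numbers are illustrative assumptions in normalised units.
    Returns "LEVEL HIGH", "LEVEL LOW" or "no top event".
    """
    area, k_valve, inflow = 1.0, 2.0, 1.0   # tank area, valve constant, feed
    setpoint, bias = 1.0, 0.5               # nominal level, nominal opening
    gain, t_int = 2.0, 10.0                 # controller gain, integral time
    low_limit, high_limit = 0.5, 1.5        # levels taken as the top events

    level = setpoint
    integral = 0.0
    # Measured level minus setpoint: frozen, because the transducer is stuck.
    error = stuck_offset
    for _ in range(int(t_end / dt)):
        if integral_action:
            integral += error * dt / t_int
        # Direct-acting controller driving an air-to-open outlet valve.
        opening = min(1.0, max(0.0, bias + gain * (error + integral)))
        outflow = k_valve * opening * math.sqrt(max(level, 0.0))
        level += (inflow - outflow) / area * dt
        if level >= high_limit:
            return "LEVEL HIGH"
        if level <= low_limit:
            return "LEVEL LOW"
    return "no top event"


if __name__ == "__main__":
    for label, with_integral in (("P", False), ("PI", True)):
        for offset in (0.01, -0.01):
            print(label, offset, simulate(with_integral, stuck_offset=offset))
```

Under these assumptions the P loop settles at a slightly shifted level and stays inside the limits, whereas the PI loop integrates the frozen error until the valve saturates and a top event is reached; the sign of the offset decides which one.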

2.9 Air-supply failure

Two different kinds of failure have been explored separately for the controller and the transducer:

(1) Total air-supply failure.
(2) Air-supply pressure less than normal (about 1.38 × 10⁵ N m⁻², 20 psi).

In all cases no difference has been observed between P and PI loops. The total air-supply failure represents a one-event cut set for the top event L(HI) in all cases. On the other hand, the loop is unaffected by the air-supply pressure, either to the controller or to the transducer, until the pressure goes below the working output pressure of the instruments. After that, the behaviour is the same as for the events 'level transducer fails low' (sub-section 2.3) or 'valve fails closed' (sub-section 2.1) and the top event L(HI) is reached in all cases.

3 DISCUSSION

It is normally supposed that when a basic event happens, the system is in a normal 'perfect state'. For a control loop, normal state can mean that it is reacting to a disturbance. It is supposed, for instance, that when the transducer becomes stuck, its output pressure is the same as the reference in the controller. In practice, that is not usually the case. If the controller has integral action, as many controllers do, and if there is an error when the transducer becomes stuck, then this will be integrated indefinitely until saturation is reached.

The results of the experiments are displayed in the form of fault trees in Figs. 5(a) to 5(d). Three main considerations arise from them and from their comparison with the fault trees predicted by the standard algorithm. In order to highlight the differences, the L(HI) fault trees have been converted to cut sets and are shown in Tables 2-5.

(1) For a single control loop, fault trees for P and PI controllers differ only in the branch where the basic event 'transducer stuck' is considered. This is a one-event cut set for the PI-controller case, while for the P controller a disturbance must be present at the same time.

TABLE 2
Large-Disturbance Cut Sets (Common to Theory and Experiments)

OR of:
    Fi very high
    Valve fails closed
    Level transducer fails LO
    Level controller fails LO
    Level transducer air supply fails completely
    Level controller air supply fails completely
    Large leak between level transducer and level controller
    Large leak between level controller and valve

TABLE 3
Moderate-Disturbance Cut Sets for General System

    Fi (High)  AND  (Valve stuck  OR  Controller stuck  OR  Transducer stuck)
OR
    (Small leak between transducer and controller  OR  Small leak between controller and valve)
        AND  (Controller stuck  OR  Transducer stuck)

TABLE 4
Moderate-Disturbance Cut Sets for P Controller (Experimental Results)

    Fi (High) (must be second event)  AND  (Valve stuck  OR  Controller stuck  OR  Transducer stuck)

[Figs. 5(a) to 5(d). Fault trees derived from the experimental results.]

TABLE 5
Moderate-Disturbance Cut Sets for PI Controller (Experimental Results)

    Fi (High) (must be 2nd event)  AND  (Valve stuck  OR  Controller stuck)
OR
    Transducer stuck

If it is desirable to represent both cases using only one fault tree, then the one for the PI controller has to be chosen, since it is more conservative.

(2) Common to all cases involving an instrument sticking and a disturbance is the observation that the event sequence would have to be considered to represent the fault structure exactly. However, that would involve introducing priority AND gates in subsequent quantitative analysis [20].

(3) The third consideration is concerned more with the definition of some failures than with the fault structure itself. Leakages in instrument lines affect the control-loop behaviour only when they are larger than a certain value; below it, the only drawback is, in some cases, a larger consumption of air. Air-supply failures are significant only when the supply pressure is lower than the working output of an instrument. That means that the consequences of the event 'air-supply failure' depend on the working point. As a result, if 2 × 10⁴ to 1.03 × 10⁵ N m⁻² (3-15 psi) is the range of the output pressure, 1.03 × 10⁵ N m⁻² (15 psi) has to be considered the limit value for a conservative quantitative description of this failure, unless conditional events are introduced.

As noted earlier, the controller was set up according to conventional practice. This means that all the assumptions of linear analysis, including relatively small disturbances, apply. It is obvious that this may not be the best way to set up a controller if its behaviour is particularly important when large disturbances occur, as is likely to be the case for failure events. However, for the purpose of the experiments described here, it seems reasonable to assume that a controller on an industrial plant would (at best!) have been set up in the conventional way.

It is also worth noting that the 'transducer-stuck' one-event cut set for the PI controller is a distinct disadvantage of PI control from the point of view of safety. If such faults are likely, then it seems best to specify that no integral action should be used.

The results of this study must always be considered in the light of the equipment used (this is why the maker's name has been given). However, it does appear that other types of equipment would exhibit the same, or similar, characteristics. The essential point is that the safety and reliability analyst must find out the failure characteristics of the controller etc. when needed.
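Consideration (2) above says that an exact treatment of event sequence needs priority AND gates. As a hedged illustration of what that changes numerically, suppose (this is an assumption, not a result of the paper) that the instrument-sticking event A and the disturbance B occur independently, are not repaired, and have constant rates \lambda_A and \lambda_B. The probability that both have occurred by time t with A first is then:

```latex
\begin{align}
P_{\mathrm{PAND}}(t)
  &= \int_0^t \lambda_A e^{-\lambda_A s}
     \bigl(e^{-\lambda_B s} - e^{-\lambda_B t}\bigr)\,ds \\
  &= \frac{\lambda_A}{\lambda_A+\lambda_B}
     \bigl(1 - e^{-(\lambda_A+\lambda_B)t}\bigr)
     - e^{-\lambda_B t}\bigl(1 - e^{-\lambda_A t}\bigr) \\
  &\approx \tfrac{1}{2}\,\lambda_A \lambda_B\, t^2
     \qquad (\lambda_A t,\ \lambda_B t \ll 1).
\end{align}
```

For comparison, an ordinary AND gate under the same assumptions gives (1 - e^{-\lambda_A t})(1 - e^{-\lambda_B t}) \approx \lambda_A \lambda_B t^2, so ignoring the required order roughly doubles the estimate for small t. Treating such cut sets with plain AND gates is therefore conservative in this simplified setting.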

4 CONCLUSIONS

The following conclusions can be drawn from this experiment:

(1) The simple discrete-state model type employed by Lapp and Powers can reasonably approximate the failure behaviour of the control system tested for this work.
(2) The basic model given in the literature really applies to proportional (P) controllers only. In particular, for the 'transducer-stuck' failure, the basic model is optimistic in that it yields a two-event cut set where, in practice, a one-event cut set exists. This would make a significant difference in reliability calculations.
(3) The basic model also yields a total of four two-event cut sets involving 'small-leak' failures. In practice, these two-event cut sets did not occur, but this might not be true for all hardware types. This weakness in the basic model leads to a pessimistic prediction of overall failure rate, and is thus much less serious than that mentioned in conclusion 2 above. A more serious disadvantage might be that it would give an unnecessarily complicated result, if applied widely within a plant.
(4) It may be that conventional linear analysis is not the most appropriate method of tuning control loops where there are safety implications involved. On the same theme, it might be better to use proportional controllers rather than proportional-integral controllers on some loops where transducer faults are likely.
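Conclusions (2) and (3) can be made concrete by writing the moderate-disturbance cut sets of Tables 3 to 5 as sets and comparing them. The sketch below is illustrative only: the event labels are shorthand for the table entries, the "must be second event" ordering is ignored, and the single event probability of 0.01 is a hypothetical number chosen to show the effect, not a measured value. The top-event probability uses the usual rare-event approximation (sum over cut sets of the product of event probabilities).

```python
from itertools import product
from math import prod

DISTURBANCE = "Fi high"
STUCK = ("valve stuck", "controller stuck", "transducer stuck")
LEAKS = ("small leak transducer-controller", "small leak controller-valve")

# Table 3: cut sets predicted by the basic model.
predicted = {frozenset((DISTURBANCE, s)) for s in STUCK}
predicted |= {
    frozenset(pair)
    for pair in product(LEAKS, ("controller stuck", "transducer stuck"))
}

# Table 4: experimental cut sets, P controller (event order ignored here).
observed_p = {frozenset((DISTURBANCE, s)) for s in STUCK}

# Table 5: experimental cut sets, PI controller (event order ignored here).
observed_pi = {
    frozenset((DISTURBANCE, "valve stuck")),
    frozenset((DISTURBANCE, "controller stuck")),
    frozenset(("transducer stuck",)),
}


def rare_event_probability(cut_sets, q):
    """Rare-event approximation of the top-event probability."""
    return sum(prod(q[event] for event in cs) for cs in cut_sets)


# Hypothetical probability, identical for every basic event.
q = {event: 0.01 for cs in predicted | observed_pi for event in cs}

if __name__ == "__main__":
    print("predicted but not observed (P):",
          sorted(sorted(cs) for cs in predicted - observed_p))
    print("observed but not predicted (PI):",
          sorted(sorted(cs) for cs in observed_pi - predicted))
    print("model estimate:", rare_event_probability(predicted, q))
    print("PI experiment estimate:", rare_event_probability(observed_pi, q))
```

With these assumed probabilities the basic model gives roughly 7 × 10⁻⁴ while the PI cut sets give roughly 1 × 10⁻², almost all of it from the single 'transducer stuck' event. The four small-leak pairs add to the model estimate (the pessimistic side), but their contribution is small compared with the missing one-event cut set (the optimistic side).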


ACKNOWLEDGEMENTS

The authors would like to sincerely thank Dr D. W. Drott for his help with the experimental work and Consiglio Nazionale delle Ricerche (Italy) for financial support for M. Galluzzo. Thanks are also due to Mr B. E. Kelly for the fault-tree graphics program used in the figures.

REFERENCES

1. Hammer, W. Handbook of System and Product Safety, Prentice-Hall, New Jersey, 1972, p. 157.
2. Lawley, H. G. Operability Studies and Hazard Analysis, Chem. Engng. Prog., 70(4) (1974) p. 45.
3. Chemical Industry Safety and Health Council. A Guide to Hazard and Operability Studies, Chemical Industries Association Ltd, London, 1977.
4. King, C. F. and Rudd, D. R. AIChE J., 18 (1971) p. 257.
5. Atomic Energy Commission. Reactor Safety Study. An Assessment of Accident Risks in US Commercial Nuclear Power Plants; ref: WASH 1400, Washington, D.C., 1975.
6. Barlow, R. E., Fussell, J. B. and Singpurwalla, N. D. (eds). Reliability and Fault Tree Analysis, Soc. for Ind. and Appl. Mathematics, Philadelphia, Pa, 1975.
7. Haas, D. F. in System Safety Symposium, The Boeing Co., Seattle, Washington, 1965.
8. Nielsen, D. S. The Cause-Consequence Method as a Basis for Quantitative Accident Analysis, Danish Atomic Energy Commission, RISO, Report RISO-M-1374, 1971.
9. Vesely, W. E. and Narum, R. E. PREP and KITT: Computer Codes for Automatic Evaluation of Fault Trees, Idaho Nuclear Corp., IN 1349, 1970.
10. Fussell, J. B., Henery, E. B. and Marshall, N. H. MOCUS - A Computer Program to Obtain Minimal Cut Sets from Fault Trees, ANCR-1156, 1974.
11. Willie, R. Fault Tree Analysis Program, Operations Research Center Report No. ORC 78-14, University of California, Berkeley, Rept UCRL-13981, Lawrence Livermore National Laboratory, 1978.
12. Lambert, H. E. and Davis, B. J. The Fault Tree Computer Codes IMPORTANCE and GATE, Short Course Material, 1980.
13. Fussell, J. B. Synthetic Tree Model: A Formal Methodology for Fault Tree Construction, ANCR 1098, Aerojet Nuclear Co., Idaho Falls, Idaho, 1973.
14. Lapp, S. A. and Powers, G. J. Computer Aided Synthesis of Fault Trees, IEEE Trans. Rel., R-26(2) (1977).
15. Apostolakis, G. E., Salem, S. L. and Wu, J. S. CAT: A Computer Code for the Automated Construction of Fault Trees, EPRI-705, Electric Power Res. Inst., Palo Alto, California, 1978.
16. Martin-Solis, G. A., Andow, P. K. and Lees, F. P. An approach to fault tree synthesis for process plants. In: Loss Prevention and Safety Promotion in the Process Industries, Dechema, Frankfurt, DECHEMA 2: 367, 1977.
17. Hollo, E. and Taylor, J. R. Algorithm and Program for Consequence Diagram and Fault Tree Construction, Danish Atomic Energy Commission, RISO, Rep. RISO-M-1907, 1975.
18. Andow, P. K. Difficulties in Fault Tree Synthesis for Process Plant, IEEE Trans. Rel., R-29(2) (1980).
19. Andow, P. K. Fault Trees and Failure Analyses: Discrete State Representation Problems, Trans IChemE, 59 (1981) p. 125.
20. Fussell, J. B., Aber, E. F. and Rahl, R. G. IEEE Trans. Rel., R-25 (1976) p. 324.