Fault Handling Design for Integrated Marine Systems


Mogens Blanke and Rikke Bille Jørgensen

Aalborg University, Department of Control Engineering Fredrik Bajers Vej 7, DK 9220 Aalborg, Denmark, email: [email protected]

Abstract. A consistent method for the design of fault handling in control systems is presented. It is based on analysis of component fault modes and their effects. Mathematical models for fault detection and isolation are derived from bond graphs associated with each component and subsystem, and automated analysis provides decision tables for fault handling. The result is a methodology for engineering design which presents the propagation of component faults and shows how fault handling should stop further migration. The method offers significantly improved dependability and a way to obtain this with simple means. A marine case study illustrates the methodology.

Keywords. Dependable Systems, Marine Control, Reliability, Fault Handling.

1. INTRODUCTION

In marine automation, production quality and efficiency are very important, but the crucial issue is to achieve a high degree of reliability, availability and safety. In this paper, a system with these properties is referred to as a dependable system. Advances in marine automation have integrated monitoring and control functions to enhance the operator's overview and his ability to act quickly and correctly when faults occur. This has significantly advanced the operator's ability to perform manual supervision: to detect a fault, isolate its cause, and handle the fault by changing operation, using alternative equipment to maintain the desired operation, or reducing machinery performance to a safe level.

Simultaneous advances in process technology have led to a higher degree of automation and to more complex plants, in order to achieve enhanced quality and efficiency in normal operation. The increased plant complexity has, however, made automated systems more vulnerable to faults and, despite the integration effort, made it more difficult to handle faults by human supervision. With the focus changing towards enhanced availability and safety, including environmental protection, there is serious interest in moving part of plant supervision to the automation level. This is technically possible with integrated automation systems as the platform, but new design methods are needed to cope efficiently with the complexity and to ensure correct and consistent fault handling.

Within the nuclear and avionics industries, much effort has gone into the design of fail-safe systems that are tolerant to any single fault. These systems are expensive in terms of both hardware and development effort, and their cost is prohibitive for ordinary marine automation. Here, additional hardware should not be required, and additional design and implementation costs should be very limited. The requirement is that faults can be tolerated but must be prevented from developing into failures at a subsystem or plant level.

The purpose of this work has been to develop a concept that meets these requirements. A method is suggested that gives a consistent design and assures system dependability. The basic philosophy is to use existing sensors and actuators in an integrated system and make systematic use of any direct and indirect redundancy in the available information.

The main focus of this paper is on faults that influence system operation and may develop into failures at a higher level. This is an area where requirements from autonomous systems and higher demands on dependability from ordinary industrial applications challenge fault detection theory, but it is also a domain where this fairly new field can prove its potential.

2. REQUIREMENTS TO DESIGN METHOD

The paradigm is that models for fault modes and effects at the component level can be expanded to the subsystem level and be the basis for fault handling within control loops at this higher level. The method should be a component-based, bottom-up approach. This is a consequence of two realities:

- machinery and other marine subsystems have clearly defined goals and functionalities, and
- system information is only available as manufacturing drawings showing mechanical components and their interconnections.

Logical and mathematical models should exist in a library of generic components, and the interconnection of components should describe the subsystem level. Overall input-output behaviour should be defined at the subsystem level, making it possible to connect subsystems into a hierarchy. This must include fault effect propagation. Fault detection and isolation (FDI) should be accomplished both as low-level single-sensor fault detection and by more sophisticated methods using analytic redundancy techniques. The above analysis should show where it would be beneficial to apply analytic FDI. Mathematical models needed for analytic FDI should be automatically generated.

3. HOW DEPENDABILITY IS OBTAINED

Faults in one subsystem of an automated plant often have undesired effects on other subsystems if remedial actions are not taken after a fault occurs. Today, shut-down functions and interlocks are used to prevent failures from propagating from one subsystem to another. The use of such functions has, however, the consequence that plant availability is sometimes reduced without good reason. With an ever higher degree of automation, the penalty has been higher control system vulnerability to faults, particularly in sensors and actuators.

Dependability of a control system can be obtained by giving it the ability to detect and isolate faults and to react with actions that accommodate the fault. Reactions are predetermined at the design stage: a control system can freeze to a safe state, or the controller can be reconfigured, e.g., by using a reduced set of sensors if a sensor fault has occurred.

3.1. Open and closed loop systems

Handling of faults in open loop systems, e.g., monitoring and remote control, is technically straightforward, but the reactions used to accommodate a fault need to be designed with careful consideration of the safety and availability of the total plant. Optimization at a local level may easily violate an overall safety goal. Handling of faults in closed loop components is a more difficult and challenging task. Properly designed systems can accommodate the effects of faults, whereas less careful designs can let fault effects propagate to other subsystems. For these reasons, fault analysis needs to incorporate analysis throughout a system. However, to limit an explosion in complexity, different degrees of detail can be used at different levels of analysis.

3.2. Architecture

The first step to achieve dependability is detection of a non-normal condition. The second step is to isolate the cause to one or more possible component faults. The third step is to evaluate the condition, decide which actions shall be activated to accommodate the fault, and finally enforce the handling actions. These functions are adequately implemented as a supervisory structure with three levels:

1. a lower level with control and input/output.
2. a second level with functions to detect fault conditions in sensors, actuators, control loops and control algorithms where this is needed.
3. a third level with state-event logic which reacts on the current condition, receiving inputs from detectors on any non-normal state and the operational mode of the process. Dedicated effector modules will also exist to execute handling actions when required.

The 2nd and 3rd levels are meta-levels which together constitute a supervisory control.

Levels 1 and 2 are executed in real time. Level 3 is executed when triggered by events at a lower level. Implementation of the supervisor has been made using a BEOLOGIC (Møller, 1995) generated state-event machine on a small scale prototype, and meta-level object-oriented programming was used to implement a large, but conceptually simpler, problem within emergency management (Lunau and Nielsen, 1995).

3.3. Single sensor validity check

The detection should first be considered a simple single sensor/single signal detection problem. As is standard practice, a validity check is conducted on signals in connection with conversion to digital representation. This check is a range check as a minimum. The time to detect a sensor fault is important in this context. Some applications require detection within one sample (feedback elements in critical loops); others have less stringent timing requirements (reference signals to control loops and feed-forward signals) (Blanke et al., 1993).

Range violation. The ability to detect faults using a range check requires that analog measurements are offset from zero such that any disconnection or short circuit, whether it is to safety ground, signal ground, or any supply line, and whichever external wire is involved, will cause a range violation. The 4-20 mA standard can easily achieve these properties. For voltage sources, resistance measurements, and three or four terminal bridge measurements, care needs to be taken in the design of input circuitry. The demand for a range violation requires that active pull-up is applied on input terminals, or that other techniques ensure rapid divergence from the normal input potential if any input wire should break.

Slew rate violation. Some sensor and measurement faults develop stepwise but within the validity range. A slew rate check can be beneficial in such cases.

Mean and RMS value change. Incipient faults may be impossible to locate on a single sensor basis, but mean value change detection or RMS value change detection can sometimes be applied (Basseville and Nikiforov, 1994).

A generic sensor description module is envisaged to include a selected range of these detection methods. They are easy to parameterise, and some (e.g. cusum algorithms) are self-adapting to the noise level of a particular measurement.
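The range, slew rate, and mean value change checks of this section can be combined per sample, as in the minimal Python sketch below. It is an illustration, not the authors' implementation: the class name `SensorCheck`, all numeric limits, and the choice of a two-sided cusum with a known reference mean are assumptions made for the example. A sample that fails the range check is rejected before the other checks see it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorCheck:
    """Single-sensor validity checks; every limit here is an assumed example value."""
    low: float = 4.0          # lower range limit, e.g. mA on a 4-20 mA loop
    high: float = 20.0        # upper range limit
    max_slew: float = 2.0     # largest credible change between two samples
    ref_mean: float = 12.0    # expected mean value in normal operation
    drift: float = 0.5        # cusum slack: smaller deviations are ignored
    threshold: float = 5.0    # cusum alarm threshold
    _last: Optional[float] = None
    _g_pos: float = 0.0
    _g_neg: float = 0.0

    def update(self, value: float) -> list[str]:
        """Return the violations raised by one new sample."""
        if not (self.low <= value <= self.high):
            return ["range"]          # invalid sample: skip the other checks
        faults = []
        if self._last is not None and abs(value - self._last) > self.max_slew:
            faults.append("slew")
        self._last = value
        # Two-sided cusum on the deviation from the reference mean.
        dev = value - self.ref_mean
        self._g_pos = max(0.0, self._g_pos + dev - self.drift)
        self._g_neg = max(0.0, self._g_neg - dev - self.drift)
        if self._g_pos > self.threshold or self._g_neg > self.threshold:
            faults.append("mean")
            self._g_pos = self._g_neg = 0.0   # restart after an alarm
        return faults
```

With these example limits, a broken wire reading 0 mA is flagged as a range violation within one sample, a step from 12 to 16 is flagged as a slew violation, and a small persistent offset is only flagged by the cusum after several samples, which mirrors the different detection times discussed above.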

3.4. Redundancy Based Fault Detection and Isolation

Once a signal has passed the first validity check, analytic redundancy techniques can be very effective in detecting non-normal conditions that are not observable at the single sensor detection level. Furthermore, redundant information is needed in most cases to be able to isolate a fault. Formulating requirements for fault detection and subsequent isolation (FDI) (Patton et al., 1989) is today done as ad hoc engineering. This process requires deep process knowledge and engineering skills to make a successful design. It is, therefore, expensive in terms of both key personnel and engineering resources. The main tasks are model building and determination of the isolation procedure. The task of residual generation for detection is quite well established and fairly easily automated. What is needed is thus to automate the modelling and isolation tasks.

FDI methods require a dynamic fault model like equation (1). An FDI method detects a deviation from normal and isolates the component of the fault vector, $f$, which is the most likely cause of the observation. Only fault effects which have been included in the model can be isolated. This implies that FDI methods cannot guarantee that all relevant faults can be isolated.

$$
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t) + E_f\,f(t) + E_d\,d(t) \\
y(t) &= C\,x(t) + D\,u(t) + G_f\,f(t) + G_d\,d(t)
\end{aligned}
\tag{1}
$$

The symbols in eq. (1) are: state vector $x$, control input $u$, disturbance $d$, and additive fault vector $f$. Fault propagation is described by the plant dynamics and the matrices $E_f$ and $G_f$.
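To make the role of the additive fault terms concrete, the following Python sketch integrates a scalar instance of eq. (1) and forms a residual by comparing the measured output with a fault-free copy of the model. The assumptions are deliberate simplifications: no disturbance ($d = 0$), no direct feedthrough ($D = 0$), forward Euler integration, and a stable plant. The function name `simulate` and all parameter values are hypothetical, and a practical residual generator would more likely use an observer or parity relations.

```python
def simulate(a, b, e_f, c, g_f, u, f, dt=0.01):
    """Euler-integrate a scalar instance of eq. (1) with d = 0 and D = 0.

    u and f are equal-length sequences of input and fault samples.
    Returns the residual r = y - y_hat, where y_hat is the output of a
    fault-free copy of the model driven by the same input.
    """
    x = x_hat = 0.0
    residuals = []
    for u_k, f_k in zip(u, f):
        y = c * x + g_f * f_k                      # measured output, fault included
        y_hat = c * x_hat                          # output predicted by nominal model
        residuals.append(y - y_hat)
        x += dt * (a * x + b * u_k + e_f * f_k)    # plant state, fault included
        x_hat += dt * (a * x_hat + b * u_k)        # nominal model state
    return residuals


# Sensor fault (g_f = 1, e_f = 0) appearing halfway through the run.
n = 100
r = simulate(a=-1.0, b=1.0, e_f=0.0, c=1.0, g_f=1.0,
             u=[1.0] * n, f=[0.0] * 50 + [1.0] * 50)
```

In this run the residual stays at zero while no fault is present and moves to approximately the fault magnitude once the sensor fault appears. A fault entering through `e_f` instead would show up in the residual only gradually, filtered by the plant dynamics.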

[...] inference engine had difficulties. The work-around solution was to incorporate an additional state with each FMEA block stating whether a logic search had already been through this part of the diagram. The result was easy determination of closed loop paths, which could be used in the identification of potential points where fault handling could be activated to stop further propagation (Blanke, Jørgensen, and Svavarsson, 1995).

The false detection problem is not solved in this way, however. False detection and noise on FDI residuals may cause considerable diagnosis uncertainty. This problem needs to be solved using, e.g., the usual stochastic detection methods.

Automatic handling of bond graph interconnection and translation to state space models has not been pursued. The reason was that other groups have reported such results (de Vries, 1994). The prototype tool is certainly far from a full scale implementation, but the experience has shown that the concept as such seems to be worth pursuing at a larger scale. The methodology is summarized below.
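The work-around with a per-block "already searched" state resembles a depth-first search with visited marking. The Python sketch below illustrates that idea only; it is not the prototype tool. The function name `propagate`, the dictionary representation of component links, and the component names in the example are all hypothetical.

```python
def propagate(links, start):
    """Propagate a fault effect from `start` through component links.

    links maps each component to the components its output feeds.
    Returns (affected, loop_points): every component the effect reaches,
    and the components where the search re-entered a block that is still
    on the current search path, i.e. where a closed loop closes.
    """
    visited = set()       # the per-block "already searched" state
    on_path = []          # blocks on the current search path
    loop_points = set()

    def visit(block):
        if block in on_path:
            loop_points.add(block)    # closed loop found; do not search again
            return
        if block in visited:
            return
        visited.add(block)
        on_path.append(block)
        for nxt in links.get(block, []):
            visit(nxt)
        on_path.pop()

    visit(start)
    return visited, loop_points


# Hypothetical control loop with one branch leaving the loop.
links = {
    "sensor": ["controller"],
    "controller": ["actuator"],
    "actuator": ["engine"],
    "engine": ["sensor", "shaft"],
    "shaft": [],
}
affected, loops = propagate(links, "sensor")
```

Here a fault effect starting at the sensor reaches every component, and the search reports the sensor as the point where the loop closes. Such points are candidates for where fault handling could stop further propagation.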

8. SYSTEMATIC DESIGN

The Matrix FMEA method is first conducted. It has the following steps:

1.a. Detailed FMEA model for each component
1.b. List all potential component faults
1.c. Find fault effects for each component fault
1.d. Propagate fault effects through system
1.e. Locate closed loop points and determine fault reactions.

The second step is to get the FDI dynamic model.

2.a. Describe the component interconnections
2.b. Link bond graph models for components to get subsystem model
2.c. Translate the bond graph model to state space form. Use fault effects from 1.b as inputs.

The third step is fault handling specification and implementation:

3.a. Use 2.c and 1.e to define detectors.
3.b. Use 1.e to define fault accommodation actions
3.c. Design a supervisor layer state event logic that executes 3.b when faults are detected.
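Step 3.c can be pictured as a small event-triggered state machine driven by a decision table fixed at the design stage. The Python sketch below is a minimal illustration under assumptions: the modes, fault names, and actions in `DECISIONS` are hypothetical examples, and the class `Supervisor` is not the generated state-event machine used in the prototype.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    REDUCED = "reduced sensor set"
    SAFE = "frozen in safe state"


# Decision table from the design stage:
# (current mode, detected fault) -> (next mode, handling action).
# All entries are hypothetical examples.
DECISIONS = {
    (Mode.NORMAL, "sensor_fault"): (Mode.REDUCED, "reconfigure controller"),
    (Mode.NORMAL, "actuator_fault"): (Mode.SAFE, "freeze actuator command"),
    (Mode.REDUCED, "sensor_fault"): (Mode.SAFE, "freeze actuator command"),
    (Mode.REDUCED, "actuator_fault"): (Mode.SAFE, "freeze actuator command"),
}


class Supervisor:
    """Event-triggered logic: runs only when a detector reports a fault."""

    def __init__(self):
        self.mode = Mode.NORMAL
        self.log = []    # handling actions handed on for execution

    def on_event(self, fault):
        key = (self.mode, fault)
        if key not in DECISIONS:
            return None              # no predetermined reaction for this event
        self.mode, action = DECISIONS[key]
        self.log.append(action)
        return action


sup = Supervisor()
first = sup.on_event("sensor_fault")     # controller is reconfigured
second = sup.on_event("sensor_fault")    # second sensor fault: go to safe state
```

Keeping the reactions in a table rather than in code paths makes it easier to check the table for consistency and completeness, which is the concern raised in the paragraph that follows.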

The problem of consistency and completeness can be partly solved using an inference engine. Reliable implementation is, however, a subject of continued research.

9. CONCLUSIONS

The paper showed how a matrix formulation of an FMEA method could be adapted to fit the fault detection and isolation problem. State space fault descriptions of system dynamics and propagation were obtained from generic bond graph models of components. It was shown how the component models could be simplified into generic types for use in the design, and how the generic types were used in the model building stage. It was further shown that the FMEA method and the generic component types enable isolation of failure modes with different degrees of criticality and determination of control system actions for the various faults.

The main contribution was a new method for systematic capture of requirements for fault detection and accommodation, and a systematic way of specifying FDIA properties related to component failure modes. It is hoped that the paper will increase awareness of the dependability of control systems and of the importance, and feasibility, of incorporating fault handling already at the design stage of new control systems.


10. REFERENCES

Andow, P.K. (1980): "Difficulties in Fault-Tree Synthesis for Process Plants". IEEE Trans. on Reliability, Vol. R-27, Apr. 1980, pp. 1-9.

Basseville, M. and I. Nikiforov (1994): Statistical Change Detection. Prentice Hall, 1994.

Bell, T. E. (1989): "Managing Murphy's Law: Engineering a Minimum-Risk System". IEEE Spectrum, June 1989.

Blanke, M., R. B. Jørgensen, M. Svavarsson (1995): A New Approach to Design of Dependable Control Systems. Proc. 40. KoREMA, Zagreb, Croatia, April 1995.

Blanke, M., S. B. Nielsen and R. B. Jørgensen (1993): Fault Accommodation in Feedback Control Systems, in Hybrid Systems, Springer Verlag Lecture Notes in Computer Science vol. 736, October 1993, pp. 393-425 (ed. R. L. Grossman, A. Nerode, A. P. Ravn, and H. Rischel).

Blanke, M., S. Bøgh, R. Bille Jørgensen, and R. J. Patton (1994): Diesel Engine Actuator - A Benchmark for FDI. Proc. IFAC SAFEPROCESS'94, Finland, June 94, pp. 498-506.

Blanke, M. and R. B. Jørgensen: Reliability Related to Sensor and Actuator Interface in Machinery Systems. Aalborg University report R93-4016.

Blanke, M. and M. Gottlieb (1989): "Ship Control and Supervisory System for the Standard Flex 300 Multi-Role Naval Ships". RINA International Conference on New Developments in Warship Propulsion, London, Nov. 1989, 9 p.

Galluzo, M., P.K. Andow (1986): "Reliability Analysis of Systems Containing Complex Control Loops". IFAC Proc. Symp. on Reliability of Instrumentation Systems, 1986, pp. 47-52.

Herrin, S. A. (1981): "Maintainability Applications Using the Matrix FMEA Technique". IEEE Transactions on Reliability, Vol. R-30, No. 3, August 1981.

Hogan, P.A., C.R. Burrows, K.A. Edge, R.M. Atkinson, M.R. Montakhab, D.J. Wollons (1992): "Automated Fault Analysis for Hydraulic Systems. Part 2: Applications". Jour. of Systems and Control Engineering, IMechE 1992, pp. 215-224.

Jørgensen, R.B. (1995): Development and Test of Methods for Fault Detection and Isolation: Theory and Practice. Ph.D. thesis, Department of Control Engineering, Aalborg University.

Karnopp, D. and R. Rosenberg (1983): Introduction to Physical System Dynamics. McGraw-Hill.

Laprie, J.C. (1987): "Computing Systems Dependability and Fault Tolerance: Basic Concepts and Terminology". AGARDograph, No. 289, pp. 1.1-1.15.

Lege, J. M. (1978): "Computerized Approach for Matrix-Form FMEA". IEEE Trans. on Reliability, Vol. R-27, No. 1, 1978, pp. 154-157.

Lunau, C.P. and J.K. Nielsen (1995): Emma: An Emergency Management System for use onboard Ships. IFAC CAMS'95 Workshop, Trondheim, Norway, May 1995.

Møller, G. (1995): On the Technology of Array-based Logic. Ph.D. thesis, Electric Power Eng. Dept., Tech. University of Denmark, Lyngby, Denmark.

Patton, R.J., P. Frank, D. Clarke (1989): Fault Diagnosis in Dynamic Systems: Theory and Applications. Prentice Hall.

Ulerich, N. H. and Gary J. Powers (1988): "On-Line Hazard Aversion and Fault Diagnosis in Chemical Processes: The Digraph + Fault Tree Method". IEEE Transactions on Reliability, Vol. 37, No. 2, June 1988, pp. 171-177.

Vries, Th. J. A. de (1994): Conceptual Design of Controlled Electro-Mechanical Systems. Ph.D. thesis, Universiteit Twente, NL.

Willems, J.C. (1991): Paradigms and Puzzles in the Theory of Dynamical Systems. IEEE Transactions AC, Vol. 36, No. 3, pp. 259-294.