Fault-tolerant control systems — A holistic view


Control Eng. Practice, Vol. 5, No. 5, pp. 693-702, 1997. Copyright © 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0967-0661/97 $17.00 + 0.00

Pergamon

PII:S0967-0661(97)00051-8

FAULT-TOLERANT CONTROL SYSTEMS - A HOLISTIC VIEW

M. Blanke*, R. Izadi-Zamanabadi*, S.A. Bøgh* and C.P. Lunau**
*Department of Control Engineering, Aalborg University, Fredrik Bajers Vej 7C, DK 9220 Aalborg, Denmark ([email protected])
**Department of Computer Science, Aalborg University, Fredrik Bajers Vej 7C, DK 9220 Aalborg, Denmark

(Received March 1997)

Abstract: Fault-tolerant control is used in systems that need to be able to detect faults and prevent simple faults related to control loops from developing into production stoppages or failures at a plant level. This is obtained by combining fault detection with supervisory control and re-configuration to accommodate faults. Much attention has been focused on fault detection in its own right. This paper deals with fault-tolerant control from a much wider point of view, covering the entire design process from the engineering of the interface to structural implementation. Experience ranging from a simple temperature control to a complex satellite control system demonstrates significant improvements in plant availability using simple means. Copyright © 1997 Elsevier Science Ltd

Keywords: Fault-tolerant control, fault detection, supervisory control, industrial control.

1. INTRODUCTION

Around-the-clock availability has become a key incentive to automate production processes, supply utilities, and manufacturing using industrial automation technology. Earlier industrial requirements focused mainly on improvements in quality to get increased throughput with desired quality. This resulted in the industrial use of self-tuning control on a large scale, with model-based, optimal and robust techniques playing an important role, and adaptive methods being used in difficult cases. With increasing process monitoring and an ever-higher level of automation to achieve desired quality, plants have become more vulnerable to faults in instrumentation, and availability is now the single factor with the highest impact on profitability. Fault-tolerant control offers a method of obtaining increased availability, by avoiding inadvertent process shut-downs from simple faults, e.g. in instrumentation and control loops, that could develop into production stoppage or plant failures.

Much effort has gone into advancing the theory and practice of fail-safe systems within the nuclear and avionics industry (Warwick and Tham, 1991). These high-risk applications require fail-safe operation, i.e. systems that can withstand any single point failure without effects on system operation. This requires solutions that are very costly in both hardware and development effort. The expense of such technology is prohibitive for ordinary industrial automation where implementation must be cheap. Nevertheless, the enhancement in availability and safety should be significant. The purpose of this work has been to give an overview of a development concept with techniques that meet these requirements.

This work introduces a methodology for the engineering design of fault-tolerant control (FTC). It outlines how fault tolerance is obtained using the functional levels: single sensor validation, fault detection and isolation (FDI) by analytical redundancy, autonomous supervision and reconfiguration. A three-layer architecture is introduced to separate these into independent, yet interconnected functions. A procedure is described that features systematic design and the consistent selection of faults to be detected. Analysis of the distribution of fault effects to obtain consistent and correct specifications for re-configuration is a salient feature of the method. Software and implementation aspects are then discussed. A meta-level architecture is suggested based on reflection, and logic representations within the supervisor are dealt with. Finally, two examples are presented: a simple temperature control loop and FTC design of the attitude control system for the Danish Ørsted satellite.

2. FAIL-SAFE VERSUS FAULT-TOLERANT SYSTEMS

A definition of fault-tolerance versus fail-safe systems is useful. Fail-safe systems are able to withstand any single point failure without any noticeable change in their functionality or performance. Fail-safe systems have the following characteristics. They
• continue despite any single point failure
• involve triple redundancy in hardware
• use voting (2 out of 3) for sensor signals
• use triple signal processing computers
• employ dual actuators (or more)
• are commonly regarded as very expensive.

Fault-tolerant systems may degrade performance when a fault occurs, but a fault will not develop into a failure at the system level, if this could be prevented through proper action in the programmable parts of a control loop. Fault-tolerant systems have the following properties. They
• aim to prevent any simple fault from developing into failure at system level
• use information redundancy to detect faults
• use reconfiguration in programmable system components to accommodate faults
• accept degraded performance due to a fault but keep plant availability
• are cheap - no new hardware.

FTC could be implemented by ad hoc methods, but correct implementation is more likely with a systematic procedure and a well-defined architecture.

3. AN ARCHITECTURE FOR FAULT-TOLERANT CONTROL

An architecture for fault-tolerant control systems is illustrated in Fig. 1. It has three layers: a lower layer with the control loop, the second with detector functions and effectors to effect reconfiguration, and the third with supervisor functionality. The reasons for separating them into three layers are the benefits of a clear development structure, independent specification and development of each layer, and last but not least to obtain testability of detector and supervisor functions.

3.1. Lowest layer, with signal checking, control, and actuator commands

The lowest layer comprises the traditional control loop with sensor and actuator interfaces, signal conditioning and filtering, and the controller. Aiming at obtaining FTC, key features for the sensor interfaces should be to support detectability, and include validity checking designed to fit the purpose. Examples are:
• check range
• check slew rate (abrupt faults)
• check mean or RMS (incipient faults).
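The three validity checks listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `SignalValidator`, its parameters and all thresholds are hypothetical, and the RMS check is taken here to mean the RMS deviation about a moving-window mean.

```python
from collections import deque
from math import sqrt


class SignalValidator:
    """Range, slew-rate and RMS checks on one sampled signal (illustrative)."""

    def __init__(self, lo, hi, max_slew, rms_limit, window=8):
        self.lo, self.hi = lo, hi          # valid electrical range
        self.max_slew = max_slew           # largest credible change per sample
        self.rms_limit = rms_limit         # largest credible fluctuation
        self.prev = None
        self.buf = deque(maxlen=window)    # moving window for the RMS check

    def check(self, x):
        """Return the list of checks that the new sample x violates."""
        faults = []
        if not (self.lo <= x <= self.hi):
            faults.append("range")
        if self.prev is not None and abs(x - self.prev) > self.max_slew:
            faults.append("slew")          # abrupt change
        self.prev = x
        self.buf.append(x)
        if len(self.buf) == self.buf.maxlen:
            mean = sum(self.buf) / len(self.buf)
            rms = sqrt(sum((v - mean) ** 2 for v in self.buf) / len(self.buf))
            if rms > self.rms_limit:
                faults.append("rms")       # slowly developing fluctuation
        return faults
```

Each call to `check` handles one new sample, so the validator can sit directly behind the A/D conversion and report to the detector layer.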

3.2. Second level, with detectors and effectors

This level comprises a number of detectors, usually one for each fault effect which must be detected, and effectors that implement desired re-configuration or other remedial actions initiated from the autonomous supervisor. The functions of the modules are:
• detection based on hardware or analytic redundancy based on FDI methods
• detection of faults in control algorithms and application software
• dedicated effector modules to execute fault handling.

3.3. Third level, with autonomous supervision

The supervisor comprises state-event logic to describe the logical state of the controlled object. Transitions between states are driven by events. The supervisor functionality includes:
• interface to detectors for change detection
• interface to upper level for mode change signals
• demand re-configuration or other remedial actions to accommodate a fault
• signal to plant-wide co-ordination or operator about current state.

Fig. 1. Three-level architecture for fault-tolerant autonomous control system. The three layers are control, detection, and supervision, the latter with communication to a plant-wide overall control or operator.

Fig. 2. Electrical diagram of potentiometer with computer interface when used as an angle transducer. Any short circuit to ground or between terminals is likely, as are disconnected wires to any terminal.

4. ENGINEERING ASPECTS: SINGLE-SENSOR FAULT DETECTION

The detection of faults in single sensors ought to be well understood and implemented in any process control equipment when standard transducers are considered. This is not the case, however. The scientific community does not consider this area as important: this is fairly straightforward circuit design, and instrumentation engineers do not consider fault conditions in normal designs. The consequence has been that nearly all automation systems live with the consequences of interfaces that are not effective at detecting faults. An exception is the advent of intelligent transducers, where manufacturers build single sensor validation into their products. The following example clarifies the statement.

4.1. Example: angular position measurement

Angular position measurement, shown in Fig. 2, is used in remotely controlled valves and in a large number of other devices. Potentiometers are the most commonly used feedback element for angular rotation when a limited range of rotation is needed. However, circuit design for fault detectability is rarely considered at all. As an example, typical connection circuits use a symmetric supply. This prevents the effective detection of a short circuit from wiper to signal ground, a quite common event.

The faults to consider for interface design are:
• short-circuit between any terminals (A, B, or C)
• short-circuit between any terminal and ground
• disconnection at any terminal (broken wire)
• intermittent short-circuit
• intermittent loss of connection
• electromagnetic disturbances in cables or units.

4.2. Design for detectability

Requirements for the detection of abrupt changes depend on the use of the potentiometer. Used in a feedback loop, detection is normally required within 2 samples. When used for setpoint generation, detection within 3-5 samples is usually sufficient. It is obvious that the ability to detect the above-listed faults within this short time interval requires additions to the generic interface circuit of Fig. 2. The necessary amendments are:
• the supply is one-sided and lifted from ground
• a parasite current is induced as shown to detect broken wire at terminal C
• a margin is allowed between normal range and the level for any short circuit.

These changes could make single-sensor fault detection fast and straightforward. More sophisticated techniques can detect such faults, but a sound engineering design principle is to keep simple things simple.

4.3. Front-end interface software

On the front-end software, simple design rules will also add to robustness and dependability. One should distinguish between physical and electrical ranges with adequate margins. Slew rate estimation must employ a tiny Kalman-filter. Triple sampling should be used if electromagnetic interference (EMI) is a risk. This is illustrated in Fig. 3 where one triplet sampling gives one number (A/D conversion):

$$s_{12} = s_1 - s_2, \quad s_{23} = s_2 - s_3, \quad s_{31} = s_3 - s_1,$$

select $s_i$ or $(s_i + s_j)/2$ based on $|s_{ij}| < 3\,\mathrm{sdev}(s)$.

Fig. 3. Triple sampling by AD converter - a simple but efficient technique to avoid EMI-generated outliers in measurement signals.

5. A DEVELOPMENT METHOD

Other areas of system design employ standard patterns for design, and rules for the quality management of a development are well established. FTC development can benefit from similar procedures. A development method was described by Blanke (1995, 1996a, b) and by Bøgh et al. (1995). The paradigm in this work was that component-based fault models can be expanded to subsystem and system level. The argument was that most technical systems have clearly defined goals: functionality and descriptions are available in the form of drawings showing component inter-connections, and these systems are rarely specified in a top-down goal description. Related developments, using the same paradigm, were used by Atkinson et al. (1992), with applications given in (Hogan et al., 1992). Veillette et al. (1992) used a hybrid system approach but did not consider the design process in its entirety.

5.1. Systematic design

The first step of the proposed systematic design (Blanke, 1996a) is a component-based Failure Mode and Effects Analysis (FMEA) which has the following features:
• all known (included) faults are treated
• propagation of faults through the subsystem is shown
• a well-established QA technology is used
• it is possible to build a standardised component library.

The second step is the generation of mathematical models for FDI:
• a library of FDI functions is needed
• component models must include fault effect models
• it must be possible to generate mathematical models, e.g. in state-space form.

A third step is to find specifications for the design of fault accommodation (reconfiguration):
• supported by design tool - it is possible to change fault propagation models for programmable elements
• automatic controller redesign for faulty system is not yet feasible but specifications are produced.

The next steps concern implementation where consistency and correctness are of major concern. A software architecture should support re-use and maintainability. This is particularly important for fault-tolerant systems where full-scale functionality is extremely difficult to validate.
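Returning to the triple sampling rule of Section 4.3, the selection step can be sketched as below. This is one possible reading, given as an assumption: the closest pair of samples whose difference is below 3 sdev(s) is averaged, and when no pair qualifies the median is returned together with a flag. The function name `triple_sample` and the median fallback are hypothetical and not taken from the paper.

```python
from itertools import combinations
from statistics import median


def triple_sample(s1, s2, s3, sdev):
    """Reduce three rapid A/D samples to one value.

    Returns (value, consistent). `sdev` is the known standard deviation of
    the measurement noise; the 3*sdev limit follows the rule in the text.
    """
    limit = 3.0 * sdev
    # Pairwise differences |s_ij| together with the pair that produced them.
    pairs = [(abs(a - b), a, b) for a, b in combinations((s1, s2, s3), 2)]
    consistent = [p for p in pairs if p[0] < limit]
    if not consistent:
        # Assumption: fall back to the median and flag the triplet.
        return median((s1, s2, s3)), False
    _, a, b = min(consistent, key=lambda p: p[0])
    return (a + b) / 2.0, True
```

A single EMI spike then affects at most one of the three samples, and the agreeing pair determines the value passed on to the control loop.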

5.2. Failure mode and effects analysis (FMEA)

An FMEA analysis deals with component faults and the propagation of fault effects. Components are, e.g. sensors, controllers, actuator motors, and valves. Components are linked, and so are their FMEA schemes. The FMEA analysis describes how fault effects propagate. FMEA analysis is commonly required for safety-critical systems. The trend is to require this analysis for an increasing number of industrial systems. The FMEA analysis is based on FMEA schemes for each component or functional block in the system, see (Legg, 1978; Galluzo and Andow, 1986; Blanke, 1996a). There is no absolute truth in the construction of failure mode and effects diagrams. Relating output effects to first principles and generalised states is a way to proceed for the physical parts of a system. In this context, the art of FMEA scheme construction is to make them simple enough to be manageable. This means that only physical faults with different effects need to be included in the scheme. Table 1 shows the FMEA scheme for the potentiometer. Statistical data for a large number of failed components are available in reliability databases, which can form the basis for generic schemes.

Table 1. FMEA scheme for potentiometer measurement of angle of rotation
Component: potentiometer
Output effects: too low signal; too high signal; fluctuation; no relation to angle
Input faults: loss of supply; vibration
Component faults: broken wire A; short A-B; short circuit B-C; element broken; broken wire B; short A-C; broken wire C; loose connection; stuck; shaft broken; wiper defect; dust between wiper and element

5.3. Matrix formulation of FMEA

The FMEA scheme can also be expressed as:

$$e_{ci} \leftarrow A_f^i \otimes f_{ci} \tag{1}$$

where $A_f^i$ is a Boolean matrix representing the propagation. The index $i$ is a component identifier and $\otimes$ the inner product disjunction operator. The operation carried out by the operator is equivalent to the scalar Boolean disjunction "$\vee$" and the inner product to the "$\wedge$", i.e., row no. $k$ of (1) is

$$e_{c,k} \leftarrow (a_{k1} \wedge f_{c1}) \vee (a_{k2} \wedge f_{c2}) \vee \ldots \vee (a_{kn} \wedge f_{cn}) \tag{2}$$

An effect may be a function of both the presence of one fault and the absence of another. In this case, an additional fault vector component is defined as

$$f_{c3} = f_{c1} \wedge \bar{f}_{c2} \tag{3}$$

This requires further - yet simple - logical steps in the analysis. If faults are propagated effects from other components,

$$e_{ci} \leftarrow A_f^i \otimes \begin{bmatrix} f_{ci} \\ e_{c,i-1} \end{bmatrix} \tag{4}$$

Interconnection makes it possible to describe the FMEA matrix for an entire subsystem, where closed logic loops need special treatment (Blanke, 1996a, b). When the output effects from a subsystem are observed, the set of faults that caused these can be found from applying logic inference to get the inverse relation:

$$f_{sys} \leftarrow A_{sys}^{b} \otimes e_{c3} \tag{5}$$

It is noted that this relation will not be unique, i.e. there are several possible component faults which could cause a certain set of fault effects. The salient features of this analysis include
• automated generation of fault propagation
• completeness assured in engineering terms
• possibility of changing the properties of programmable components in the analysis.

The last feature is particularly useful during the design process. Fault detection and re-configuration can be specified as part of programmable components, usually the autonomous controller, to stop the propagation of fault effects.

The result of the FMEA analysis is thus a specification of which faults should be detected, and what reactions should be imposed on the system when certain patterns of fault effects are observed by the supervisor. This implies that certain fault reactions will be done autonomously at the automation level. Previously, the process control strategy was to display relevant information and let an operator decide the remedial action. The advantages of autonomous reaction are rapid detection and recovery, and that the choice of remedial action is not an on-the-spot reaction of any operator on watch, but is part of the deliberate design of the process automation. The completeness and correctness properties of the design method are crucial in daring to take this step.
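The Boolean propagation of relations (1) and (2), the extra fault component built from the presence of one fault and the absence of another, and a simple backward inference can be sketched as follows. The sketch rests on assumptions: the backward step uses the transposed matrix, which is only one simple way to obtain candidate faults and is not claimed to be the construction used in the paper, and the example matrix and its fault and effect ordering are hypothetical.

```python
def propagate(A, f):
    """Forward relation: e[k] = OR over j of (A[k][j] AND f[j])."""
    return [any(a and fj for a, fj in zip(row, f)) for row in A]


def candidate_faults(A, e):
    """Backward inference (assumed form): fault j is a candidate when it
    can cause at least one of the observed effects."""
    n_faults = len(A[0])
    return [any(A[k][j] and e[k] for k in range(len(A)))
            for j in range(n_faults)]


def with_absence(f, present, absent):
    """Append the component f[present] AND NOT f[absent] to the fault vector."""
    return f + [f[present] and not f[absent]]


# Hypothetical component: three faults (columns), two effects (rows).
A = [
    [True, True, False],   # effect 0 is caused by fault 0 or fault 1
    [False, False, True],  # effect 1 is caused by fault 2
]
f = [False, True, False]          # only fault 1 is present
e = propagate(A, f)               # effect 0 observed, effect 1 not
candidates = candidate_faults(A, e)
```

Here `candidates` marks both fault 0 and fault 1, which illustrates the non-uniqueness noted for the inverse relation: the observed effect alone cannot separate the two.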

5.4. Relations to fault detection and isolation

Fault detection and isolation has been a research area for a number of years, and the theory is well established, see, e.g. (Patton et al., 1989; Gertler, 1995). Patton (1993) gives an overview of available approaches, Frank (1994) treats fuzzy methods, and Basseville and Nikiforov (1993) treat the detection problem from a statistical point of view. Sampath et al. (1996) provide an analysis of fault-diagnostic control using discrete-event analysis. New approaches to modelling issues, which are very important in this context, are treated in (Willems, 1991) and by Gawthrop and Smith (1996). Dynamic models for fault detection and isolation, using the well-established continuous-time modelling, can be described in the state-space form:

$$\begin{aligned} \dot{x}(t) &= A x(t) + B u(t) + E_{xf} f(t) + E_{xd} d(t) \\ y(t) &= C x(t) + D u(t) + E_{yf} f(t) + E_{yd} d(t) \end{aligned} \tag{6}$$

where $f$ represents the faults that can be isolated; $x$, $u$, and $y$ are the state, input and measurement vectors, respectively, and $d$ is an unknown disturbance. Today, $f$ is found ad hoc, from inspection of the state-space model. Systematic design based on component fault analysis will ensure that it is the complete set of fault effects which is actually used. This completeness property is maintained throughout the design in so far as all essential component faults were included from the start.

Fig. 5. AC motor with open-close activated by binary signals to relays and local control (manual). Abbreviations: HTR: heating element; LS: limit switch; TS: thermal switch; AC: 220 V AC supply.

6. EXAMPLE: TEMPERATURE CONTROL

A temperature control loop for a fluid cooling system with cascade control, shown in Fig. 4, is an example of perhaps the most common control loop in the process industry. It comprises
• actuator with AC motor and up-down command
• temperature sensor
• controller with process interface
• three-way valve and filter system.

Fig. 4. Temperature control loop.

Consequences of a fault in the actuator loop would most likely be temperature control loop instability, i.e. saturation or sustained large fluctuation. Failure of the temperature control loop would have consequences for operation, wear, and process availability. The three-way valve is shown as a functional diagram in Fig. 5. Simple faults include:
• actuator position feedback (potentiometer)
• actuator limit (end stop) switch
• relay fails to open, fails to close, fails to disengage.

The result of an FMEA analysis of this problem is a specification of reconfiguration possibilities:
• faulty position signal → use estimate of the valve position in the motor controller
• limit switch fault → override limit switch information
• a relay fault → actuator fault, cannot be accommodated, only detected.

7. SUPERVISOR IMPLEMENTATION

Unlike the run-time parts of a control or monitoring system, the fault detection and accommodation parts of an autonomous fault-tolerant controller cannot be rigorously tested against the full-scale plant. Correct implementation is thus an even more urgent requirement than for the ordinary controller.

The functionality of the supervisor is:
• stop propagation and damaging effects of faults
• consistent handling of faults
• provide reconfiguration, close-down, shut-down
• actions are part of deliberate design decisions, not on-the-spot reactions by any operator on duty
• completeness of rules in the supervisor is mandatory.

Requirements for implementation include:
• completeness of the states and events which are treated
• correctness of the implemented code and fault accommodation
• maintainability - software faults are very easily introduced when changes are made to the plant configuration and subsequently to control-system software. Maintenance is of crucial importance
• re-usability helps in using proven code for e.g. fault detectors.

7.1. State-event machine logic for supervisor design

Completeness and correctness requirements can - at present - be rigorously fulfilled only for the part of the design and implementation process from which logic rules are formulated and code is generated. The FMEA-based analysis is a significant step toward completeness, from the process fault description to the formulation of a fault-accommodation strategy. State-machines, or automata, are suitable for the implementation of supervisor logic, because a "state" captures the relevant history of a part of the system at hand in a very compact way. However, a straightforward implementation is likely to cause a combinatorial explosion in realistic cases. To avoid this problem, an extended version of state-event machines can be used to describe the state set of the supervisor and be implemented as parallel, connected automata (Izadi-Zamanabadi et al., 1996).

7.2. An extended state-event machine

An extended state-event machine (E-SEM) description supports implementation as a number of state-event machines. The extension consists of a condition set which defines the dependencies between distinct state-event machines.

Formally the $i$th E-SEM is defined by:

$$M_i = (S_i, S_{con,i}, P_i, Z_i, f_i, g_i) \tag{7}$$

consisting of:
• a finite state set $S_i = \{s_{ij}\}$, $j = [1, N_s]$,
• a condition set (includes operational mode) $S_{con,i} = \{S_{s_{i1},con}, S_{s_{i2},con}, \ldots, S_{s_{iN_s},con}\}$, $j \in [1, N_s]$,
• an input alphabet $p_{ij} \in P_i$, which corresponds to messages from detectors and operator or plant-wide co-ordination,
• an output alphabet $z_{ij} \in Z_i$, which contains messages about events and commands to execute reconfiguration or other fault accommodation,
• functions $f_i$ and $g_i$ for state transition and output, respectively, described by the following equations:

$$s_{ij}(n+1) = f_i\big(s_{ij}(n), p_{ij}(n)\big) \Big|_{S_{s_{ij},con}(n)} \tag{8}$$

$$z_{ij}(n) = g_i\big(s_{ij}(n), p_{ij}(n)\big) \Big|_{S_{s_{ij},con}(n)} \tag{9}$$

where $n$ is incremented by events and $N_s$ is the number of states in the E-SEM $M_i$. Fig. 6 illustrates the interaction between states in the E-SEM. Implementation as parallel, extended machines is advantageous. This is seen from the graphical illustrations of two implementations. Fig. 7 shows the implementation of a problem as parallel automata with 2, 2, and 3 states, respectively, and 17 transitions. Fig. 8 shows the same problem implemented as one state-event machine with 12 states. This requires 144 transitions. The complexity of the transitions is clearly reflected in the software complexity and ease of maintenance.

Fig. 6. Interaction between one state with others in an extended state-event machine. A condition set is attached to each transition.

Fig. 7. Implemented as parallel, extended machines. There are 2, 2, and 3 states with 17 transitions.

Fig. 8. Implemented as one machine. There are 12 states and 144 transitions.
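The idea of parallel machines coupled through condition sets can be sketched as below. This is an illustrative toy, not the supervisor described in the paper: the class `ESEM`, the function `dispatch`, and the machine, state and event names (borrowed loosely from the temperature control example) are all hypothetical, and a condition set is represented simply as the states that other machines must be in for a transition to fire.

```python
from itertools import product


class ESEM:
    """One extended state-event machine; transitions carry a condition set."""

    def __init__(self, name, states, initial):
        self.name = name
        self.states = tuple(states)
        self.state = initial
        self.transitions = {}  # (state, event) -> (next state, output, condition)

    def add(self, src, event, dst, output=None, condition=None):
        # condition maps another machine's name to the set of its states
        # in which this transition is allowed (empty means unconditional).
        self.transitions[(src, event)] = (dst, output, condition or {})

    def step(self, event, others):
        entry = self.transitions.get((self.state, event))
        if entry is None:
            return None
        dst, output, condition = entry
        for machine, allowed in condition.items():
            if others.get(machine) not in allowed:
                return None  # condition set not satisfied
        self.state = dst
        return output


def dispatch(machines, event):
    """Send one event to all machines, using a snapshot of their states."""
    snapshot = {m.name: m.state for m in machines}
    outputs = [m.step(event, snapshot) for m in machines]
    return [o for o in outputs if o is not None]


# Hypothetical machines with 2, 2 and 3 states (names are illustrative).
sensor = ESEM("position_sensor", ["ok", "faulty"], "ok")
sensor.add("ok", "position_fault", "faulty", output="use_position_estimate")
limit = ESEM("limit_switch", ["ok", "faulty"], "ok")
limit.add("ok", "limit_fault", "faulty", output="override_limit_switch")
mode = ESEM("mode", ["normal", "degraded", "stopped"], "normal")
mode.add("normal", "position_fault", "degraded",
         condition={"limit_switch": {"ok"}})
mode.add("degraded", "relay_fault", "stopped", output="shut_down")
machines = [sensor, limit, mode]

outputs = dispatch(machines, "position_fault")
flat_states = list(product(*(m.states for m in machines)))
```

Flattening the three machines gives 2 x 2 x 3 = 12 combined states. If a transition is allowed between every ordered pair of states, that is 12 x 12 = 144 transitions, which agrees with the counts quoted for the single-machine implementation under that reading.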

7.3. Meta-object architecture

Object oriented software technology offers interesting features for the implementation of the architecture outlined above for fault-tolerant systems. The desire to separate the software into a base level and meta-levels can be implemented with reflective mechanisms. This can be done using meta-objects, each of which is responsible for a limited part of the overall functionality. Meta-objects are objects with the following properties. They
• intercept base-object messages
• are fully transparent to base-objects
• have full access to input and output from base-level objects
• have full information about states of base-level objects
• are instantiated upon events.

Dynamic attachment is possible to both base-objects and other meta-objects, and dynamic re-routing of messages can be used for re-configuration.

An architecture with meta-objects that are able to intercept messages between objects at the base level is illustrated in Fig. 9.
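Message interception and re-routing can be illustrated with a small proxy. This is a Python stand-in chosen for brevity and is an assumption throughout: the reflective mechanisms referred to in the paper are language extensions, and the names `MetaObject`, `reroute`, `PositionSensor` and `PositionEstimator` are hypothetical.

```python
class MetaObject:
    """Wraps a base-level object, observes its messages, and can re-route them."""

    def __init__(self, base):
        self._base = base
        self._routes = {}   # message name -> replacement handler
        self.log = []       # (message, arguments, result) seen so far

    def reroute(self, name, handler):
        """Redirect one message to another handler (re-configuration)."""
        self._routes[name] = handler

    def __getattr__(self, name):
        # Called only for names not found on the meta-object itself.
        target = self._routes.get(name) or getattr(self._base, name)
        if not callable(target):
            return target

        def intercepted(*args, **kwargs):
            result = target(*args, **kwargs)
            self.log.append((name, args, result))
            return result

        return intercepted


class PositionSensor:
    def read(self):
        return 2.0


class PositionEstimator:
    def read(self):
        return 1.9


meta = MetaObject(PositionSensor())
first = meta.read()                               # answered by the sensor
meta.reroute("read", PositionEstimator().read)    # sensor declared faulty
second = meta.read()                              # answered by the estimator
```

The caller keeps sending the same `read` message before and after the re-routing, which mirrors the transparency property listed above: the base-level code is unaware that a detector or effector sits in the message path.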

7.4. Object oriented implementation

Meta-object technology has been implemented by extensions to languages such as Objective C (Lunau, 1995) and Smalltalk. A reflective architecture for process-control applications is defined in (Lunau, 1997). Implementation in Java is expected. The implication of using the suggested meta-object technology is that it requires the control level to be implemented in OO technology as well. State-event logic needs to be implemented in this framework too. The main features of the application of Meta-OO technology for this purpose are: that the control level is implemented and tested independently of the detectors and supervisor; detectors can be added and supervisor state-event logic changed for maintenance (changes in processes require such maintenance); and addition of modules is possible at run-time without interrupting the control layer (which is particularly essential for large-scale systems).

8. APPLICATION EXPERIENCE

Several test cases for the implementation of fault-tolerant control have been considered, ranging from examples in ship propulsion and cargo control to the industrial actuator laboratory experiment (Blanke et al., 1995). The following is, however, an actual application on the autonomous attitude control system for the Danish Ørsted satellite (Bøgh et al., 1995).

Fig. 9. Reflective architecture for object-oriented implementation of supervisor using meta-levels for detectors, effectors, and supervisor logic.

8.1. On-board FTC for the Ørsted Satellite

The system considered here is the autonomous attitude control and attitude determination system (ACS) for a 60 kg satellite which has a planned launch in late 1997. The task of the control system is to determine the spacecraft's orientation towards Earth, and to maintain a desired direction using active control. An on-board ACS manager comprises state-event logic to handle mode changes and re-configurations. Detector functions are included for faults which would develop into mission-critical failures if not accommodated. In the development, delicate trade-offs had to be made between functionality and hard constraints set by available on-board memory. The hard real-time implementation was made in Ada. The meta-level ideas came later than the definition of the software architecture, so these ideas were not used in this project.

Fault-tolerant control is particularly interesting in the Ørsted project because it was not possible - due to cost and weight constraints - to use hardware redundancy to obtain a fail-safe design. In addition, operator intervention from the ground is limited to periods where the satellite passes Denmark. This means that intervals of up to 13 hours might pass without a fault being noticed. Autonomous FTC was, therefore, a natural desire in this project. As an example of fault detection and reconfiguration, consider the plot of the satellite inclination and the power used by the control coils when a coil driver fault occurs, see Fig. 10. The test case is the following scenario: one of the satellite's actuators, a coil producing mechanical torque by interaction with the Earth's magnetic field, fails. The plot shows the satellite's inclination angle towards Earth. An increase of the mean error from 10 to 45 deg., with a peak of 150 deg., is the result without fault handling. FTC is able to keep the satellite within 20 deg. inclination, which still allows scientific instruments to operate.

Fig. 10. Satellite's inclination angle towards Earth and the power consumed by the control coils when a fault occurs in one out of three perpendicular coils. On-board reconfiguration gives uninterrupted scientific observation, while this is not possible without FTC. (Upper panel: angle between z-axis and B-vector, with and without fault handling, annotated "Coil driver AX shunt short circuit at 50 min". Lower panel: power consumption. Horizontal axis: time [min].)

9. CONCLUDING REMARKS

The aim of this paper was to provide an overview, and a systems development view, of the problem of fault-tolerant control. The main idea was that faults in control systems must be detected and the control system reconfigured when faults occur. This is considered a better approach than attempting to increase robustness to cope with fault conditions, due to the penalty in performance. The two main purposes of fault-tolerant control were shown to be the achievement of maximum availability, and the prevention of subsystem faults from developing into failures at a system level.

10. ACKNOWLEDGEMENTS

This research was supported by the Danish Research Council for Technology and Science under grant no. 9500765. This grant and the support from the Ørsted Project are gratefully acknowledged.

11. REFERENCES

Atkinson, R.M., M.R. Montakhab, K.D.A. Pillay, D.J. Woollons, P.A. Hogan, C.R. Burrows, K.A. Edge (1992): Automated Fault Analysis for Hydraulic Systems, part 1: Fundamentals. IMechE Proc. Vol. 206, 207-214.
Basseville, M. and I. Nikiforov (1994): Statistical Change Detection. Prentice Hall, 1994.
Bell, T.E. (1989): Managing Murphy's Law: Engineering a Minimum-Risk System. IEEE Spectrum, June 1989.
Blanke, M., R.B. Jørgensen, M. Svavarsson (1995): A New Approach to Design of Dependable Control Systems. AUTOMATIKA, Croatia, Dec. 1995, Vol 3-4, 101-108.
Blanke, M. and R.B. Jørgensen (1995): Fault Handling Design for Integrated Marine Systems. Proc. 3rd IFAC Workshop on Control Applications in Marine Systems, CAMS'95, Trondheim, Norway, 10-12 May, pp. 238-246.
Blanke, M. (1995): Aims and tools in the evolution of fault-tolerant control. COSY Workshop, Rome, Sept. 95.
Blanke, M. (1996a): A component based approach to industrial fault detection and accommodation. Proc. IFAC World Congress 1996, San Francisco, Vol. N, pp 97-102.
Blanke, M. (1996b): Consistent Design of Dependable Control Systems. Control Engineering Practice, Vol. 4, No. 9, pp 1305-1312, 1996.
Bøgh, S.A., R. Izadi-Zamanabadi, and M. Blanke (1995): Onboard supervisor for the Ørsted satellite attitude control system. 5th ESA Workshop on AI and knowledge based systems for space. Noordwijk, Oct. 1995, pp 137-152.
Frank, P.M. (1994): Application of Fuzzy logic to process supervision and fault diagnosis. IFAC Safeprocess'94, Helsinki, Finland, June 1994.
Galluzo, M., P.K. Andow (1986): Reliability Analysis of Systems Containing Complex Control Loops. Proc. IFAC Symp. on Reliability of Instrumentation Systems, pp 47-52.
Gawthrop, P. and L. Smith (1996): Metamodelling: For bond graphs and dynamic systems. Prentice Hall, 1996.
Gertler, J. (1995): Towards a theory of dynamic consistency relations. IFAC Workshop: On-line Fault Detection And Supervision. Newcastle, UK, June 1995, 55-75.
Hogan, P.A., C.R. Burrows, K.A. Edge, R.M. Atkinson, D.J. Woollons (1992): Automated Fault Analysis for Hydraulic Systems, part 2: Applications. IMechE Proc. Vol. 206, 215-224.
Izadi-Zamanabadi, R., S.A. Bøgh, and M. Blanke (1996): On the design and realization of supervisory functions in fault tolerant control. KoRema Workshop, Opatia, Croatia, Sept. 1996.
Legg, J.M. (1978): Computerized Approach for Matrix-Form FMEA. IEEE Trans. on Reliability, Vol. 27, No. 1, 154-157.
Lunau, C.P. (1995): EMMA - Emergency Management System for use onboard Ships. Proc. 3rd IFAC Workshop on Control Applications in Marine Systems, CAMS'95, Trondheim, 164-173.
Lunau, C.P. (1997): A Reflective Architecture for Process Control Applications. Proc. European Conference on Object Oriented Programming, ECOOP'97, published by Springer Verlag in Lecture Notes in Computer Science. Helsinki, Finland, June 1997.
Patton, R.J., P. Frank, D. Clarke (1989): Fault Diagnosis in Dynamic Systems: Theory and Applications. Prentice Hall.
Patton, R.J. (1993): Robustness Issues in Fault-Tolerant Control. Proc. Tooldiag, Toulouse, France, 5-7 April, 1993. Vol. 3.
Sampath, M., R. Sengupta, S. Lafortune, K. Sinnamohideen, and D.C. Teneketzis (1996): Failure Diagnosis Using Discrete-Event Models. IEEE Transactions on Control Systems Technology, Vol. 4, 2, 105-124.
Veillette, R.J., J.V. Medanic, and W.R. Perkins (1992): Design of Reliable Control Systems. IEEE Trans. AC, Vol. 37, No. 3, 290-304.
Warwick, K. and M.T. Tham (eds) (1991): Failsafe Control Systems. Applications and Emergency Management. Chapman and Hall, London.
Willems, J.C. (1991): Paradigms and Puzzles in the Theory of Dynamical Systems. IEEE Trans. AC, Vol. 36, No. 3, 259-294.