European Journal of Control (2008)5:387–390 © 2008 EUCA DOI:10.3166/EJC.14.387–390
Discussion on: "Reconfigurable Fault-tolerant Control: A Tutorial Introduction"

Jan Maciejowski
Cambridge University Engineering Department, Cambridge CB2 1PZ, England
E-mail: [email protected]
This contribution is more a reflection on the status of fault-tolerant control than a detailed response to the paper by Lunze and Richter. It also gives me the opportunity to push some of my own favourite ideas and do some self-advertisement. Citations of the form [LRn] will refer to reference [n] in Lunze and Richter's paper.

I am nervous about the fault-tolerant control literature. Someone reviewing it from outside the community could conclude that quite a lot has been achieved. I believe that, although many good ideas and proposals have been published, very little has really been achieved. Putting it bluntly, I think we have very little evidence that any of our proposals really work. This is not really the fault of the research community; it is due to the fact that the problem of fault-tolerant control is very hard, and that the opportunities for implementing fault-tolerant controllers are very limited. The realities of the academic world dictate that we publish papers about problems that we can solve, even if they are not exactly the problems that need to be solved. Thus the majority of papers on fault-tolerant control assume linear models, although significant faults will surely push most systems far from equilibrium conditions, at least in the initial transients. We often assume that actuator and sensor faults appear as additive disturbances on inputs and outputs, although most such faults are 'hard-over' faults to minimum or maximum values, which cannot be modelled in this way.
(In [7] I proposed that a suitable model for actuator faults is u = Mv + d, where u is the signal applied to the plant, v is the signal output from the controller, M is a matrix, and d is a vector, constant for most common faults. M = I and d = 0 represent the no-fault condition.)
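To make this model concrete, here is a minimal numerical sketch of how u = Mv + d can represent a 'hard-over' actuator fault: the corresponding row of M is zeroed and the corresponding entry of d is set to the value at which the actuator is stuck. The three-actuator example, the commanded inputs and the stuck value are invented purely for illustration.

```python
import numpy as np

# Actuator-fault model u = M v + d:
# v is the controller output, u is what actually reaches the plant.
# No fault: M = I and d = 0.  A 'hard-over' fault on actuator i:
# row i of M is zeroed and d[i] holds the value the actuator is stuck at.
# (The three-actuator plant and the stuck value 1.0 are illustrative only.)
n_act = 3
v = np.array([0.2, -0.5, 0.8])        # inputs commanded by the controller

# No-fault condition
M_nominal = np.eye(n_act)
d_nominal = np.zeros(n_act)

# Actuator 1 stuck hard-over at its upper limit (taken here as 1.0)
M_fault = np.eye(n_act)
M_fault[1, 1] = 0.0                   # commands no longer reach actuator 1
d_fault = np.zeros(n_act)
d_fault[1] = 1.0                      # actuator 1 delivers its maximum instead

print("applied inputs, no fault:        ", M_nominal @ v + d_nominal)
print("applied inputs, stuck actuator 1:", M_fault @ v + d_fault)
```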
We also frequently assume that the objective of a fault-tolerant controller is to recover the no-fault performance, which seems a very unreasonable expectation for a system which has suffered a serious fault; this is surely a key distinction between fault-tolerant and more conventional adaptive control.

A vital part of fault-tolerant control is Fault Detection and Isolation (FDI). Many papers on fault-tolerant control (including my own) assume that the results of FDI are available. I think it is reasonable to try to decompose the problem in this way, but one has to make sensible assumptions about how quickly such results might become available after a failure, and/or about the quality of such results. I have reviewed, and even seen published, papers which present FDI schemes in which faults are detected almost instantaneously. These of course rely on perfect measurements and perfect plant models, and in practice would raise false alarms all the time. There is a well-established statistically-based literature on FDI (such as [2]), from which it is clear that data must be collected for a significant time after a failure occurs if the probability of a false alarm is to be acceptably low, and that this time depends primarily on sensor signal-to-noise ratios. (Alternative approaches, which focus on model uncertainty rather than sensor noise and use ideas from robust control theory rather than statistics, are exemplified by [LR16] and [4].)
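By way of illustration of the kind of statistical test meant here, the following is a minimal sketch of a one-sided CUSUM change detector of the sort analysed in [2], applied to a noisy scalar residual. The shift size, noise level, fault time and threshold are invented for illustration; the point is only that the alarm threshold trades detection delay against false-alarm rate, and that at realistic signal-to-noise ratios the delay amounts to many samples rather than being instantaneous.

```python
import numpy as np

rng = np.random.default_rng(0)

# Residual signal: zero-mean noise before the fault, small mean shift after it.
# Shift size, noise level, fault time and threshold are all illustrative.
n, fault_time = 400, 200
shift, sigma = 0.5, 1.0                      # fault-to-noise ratio of 0.5
residual = rng.normal(0.0, sigma, n)
residual[fault_time:] += shift

# One-sided CUSUM test for an increase in the mean (cf. [2]).
drift = shift / 2.0        # reference value, tuned to the assumed shift size
threshold = 10.0           # larger -> fewer false alarms, but longer detection delay
g, alarm_at = 0.0, None
for k, r in enumerate(residual):
    g = max(0.0, g + r - drift)
    if g > threshold:
        alarm_at = k
        break

if alarm_at is None:
    print(f"no alarm raised in {n} samples")
else:
    print(f"fault injected at sample {fault_time}, alarm raised at sample {alarm_at}")
```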
There is very little reference to such results in the fault-tolerant control literature, and very little investigation of how much detection delay could be tolerated for control purposes. It is possible that things are better than such results suggest, because of three factors: (1) In the aftermath of a failure, large signal excursions and large excitations are likely, so signal-to-noise ratios may be much better than in standard conditions, though nonlinear behaviour is also likely to be excited, so the problem setting becomes harder. (2) Modern actuators and sensors often have some self-diagnostic capability, so some failures in these areas may be detectable easily and quickly. (3) High-fidelity models are increasingly available for on-line use, and these provide additional information which it should be possible to exploit somehow. But these are all unproven speculations.

Consideration of the interplay between FDI and fault-tolerant control reveals how little we really understand about some basic questions. If a serious failure occurs (as in the aircraft examples mentioned in the Lunze-Richter paper), it is clearly not necessary to have a very accurate model of the post-failure plant. An approximate model which captures the essentials of the new behaviour is better than an accurate model of the pre-failure plant. But what does 'essentials' mean here? How can we judge that the new approximate model is better for control than the old one? How much data do we have to collect in order to reach that point? We do not have answers to such questions. (We have some proposals, such as monitoring multiple models and selecting the one which looks best according to some criterion, but these are 'fixes', possibly very effective, which do not yield deep understanding. One proposal which may lead to deeper understanding is [8].)

Aircraft offer good possibilities for fault-tolerant control, because there is so much redundancy available with conventional control surfaces (and even more with exotic surfaces such as canards). This is particularly true of modern electrically-actuated aircraft, in which every surface can be controlled individually, so that ailerons can be used to produce lifting forces without rolling moments, or elevators to produce rolling moments, for example. Such possibilities seem much rarer in process control, because there is very little redundancy available, at least at the level of individual processes. Presumably this is because process plants are constructed at minimum capital cost, so redundancy has been deliberately eliminated. One of the few examples of which I am aware that offers significant redundancy is the multistage refrigeration process described in [6], where one or more stages could be by-passed in the event of component failure, at the cost of reduced efficiency.
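As an illustration of the aircraft redundancy argument above, the following is a minimal sketch of redundant control allocation: a made-up effectiveness matrix maps five surface deflections to three demanded quantities (lift, roll moment and pitch moment), and a minimum-norm allocation still meets the same demand when one surface is lost. The numbers are invented for illustration and have no aerodynamic significance.

```python
import numpy as np

# Columns: five control surfaces; rows: lift, roll moment, pitch moment.
# The effectiveness matrix B and the demanded vector are purely illustrative.
B = np.array([
    [1.0,  1.0,  0.3,  0.3, 0.0],   # lift
    [0.8, -0.8,  0.1, -0.1, 0.0],   # roll
    [0.1,  0.1, -0.9, -0.9, 1.2],   # pitch
])
demand = np.array([0.5, 0.2, -0.3])   # (lift, roll, pitch) demanded by the control law

# Nominal minimum-norm allocation over all five surfaces.
delta_nominal = np.linalg.pinv(B) @ demand

# Surface 0 jammed at zero deflection: drop its column and re-allocate
# the same demand over the remaining four surfaces.
B_fault = np.delete(B, 0, axis=1)
delta_fault = np.linalg.pinv(B_fault) @ demand

print("nominal deflections:        ", np.round(delta_nominal, 3))
print("deflections, surface 0 lost:", np.round(delta_fault, 3))
print("achieved moments (faulty):  ", np.round(B_fault @ delta_fault, 3))
```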
Another example was the one that triggered my interest in using model predictive control (MPC) for fault-tolerant control [5]. I heard a development engineer of the DMC Corporation (now part of Aspentech) describe how, in a DMC implementation of MPC on a petrochemical plant, the controller operated a valve that it had never operated before; the reason turned out to be that the valve which was usually used had become ineffective due to fouling. This led to my investigation of some inherent fault-tolerance of MPC [7], which does not rely on explicit FDI.

Is it reasonable to expect that we might be able to do something realistic about fault-tolerant control? We know that something useful is possible, because we know that humans can often do it successfully. So it does not seem futile to aim for acceptable fault-tolerance in control systems of unmanned vehicles or unmanned pumping stations, or to provide automated assistance in systems operating with minimal personnel, such as military and future civil aircraft, ships, power generation and distribution systems, petrochemical plants, paper mills, and so on. Perhaps we aim too high when we consider extreme cases like the aircraft examples described by Lunze and Richter, in which engines or complete hydraulic systems are lost; pilots exhibit fault-tolerant functionality regularly in the face of more minor problems such as false alarms, and even such 'easy' functionality must be provided before UAVs can be deployed safely above densely inhabited areas. On the other hand, even in the B-747 incident described in [LR45], a high-fidelity model of the post-fault aircraft was obtained on the basis of on-board data only [9]; so perhaps it is possible to do this automatically on board in 5 minutes, rather than off-line by a student in 6 months, and in that incident 5 minutes would have been fast enough to avoid a disaster [3]. There is also anecdotal evidence (e.g. [1]) that significant fault-tolerant capability, specifically control re-allocation, has been available in military aircraft for many years, but I have not been able to obtain reliable confirmation of this, presumably because any such developments remain secret. Publications such as [LR1, LR10, LR14, LR32] are to some extent ambiguous as to whether the proposed solutions have really been applied in practice.

Are our methods and techniques appropriate for solving the fault-tolerant control problem? Some claim that only 'AI' approaches can provide solutions to this kind of problem. I remain optimistic that optimisation-based methods, such as MPC, offer an effective framework. (I do not agree completely with the comments in section 4.4.2 of the Lunze-Richter paper about computing power and process speed, as they relate to MPC, which seem to me to be rather out-of-date, or with those about the implementability of 'explicit' MPC.)
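As a deliberately simplified, single-step illustration of this optimisation-based mechanism (not the receding-horizon formulation of [5] or [7]), the sketch below computes plant inputs by constrained least squares: when one actuator's admissible range collapses, for instance because a valve is jammed, the optimiser shifts effort to the remaining actuator without any explicit FDI. The gain matrix, set-point and bounds are invented for illustration.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Steady-state gain from two inputs (e.g. two valves) to two outputs.
# The gains, set-point and input bounds are illustrative only.
B = np.array([[1.0, 0.4],
              [0.2, 1.0]])
setpoint = np.array([0.8, 0.5])

def compute_inputs(lower, upper):
    """One-step input computation: min ||B u - setpoint||^2 s.t. lower <= u <= upper."""
    result = lsq_linear(B, setpoint, bounds=(lower, upper))
    return result.x

# Nominal case: both inputs available over [-1, 1].
u_nominal = compute_inputs(np.array([-1.0, -1.0]), np.array([1.0, 1.0]))

# Fault case: input 0 effectively jammed at zero (its bounds collapse to a
# sliver around 0); the optimiser recruits input 1 more heavily instead.
u_fault = compute_inputs(np.array([0.0, -1.0]), np.array([1e-9, 1.0]))

print("nominal inputs:       ", np.round(u_nominal, 3), " outputs:", np.round(B @ u_nominal, 3))
print("inputs with u0 jammed:", np.round(u_fault, 3), " outputs:", np.round(B @ u_fault, 3))
```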
But I believe that we should remain open-minded; in particular, machine learning may have much to offer here [LR19]. Of course the control and systems community's tradition of rigorous analysis should certainly be applied to any proposed solutions, whatever their provenance. And applying that tradition to what we have accomplished so far should lead, I believe, to modesty on our part.
References

1. Atkinson PJ. Letter to the editor: "Re-inventing the wheel?" Flight Int 2003; 163(4870): 38–39
2. Basseville M, Nikiforov IV. Detection of Abrupt Changes: Theory and Application. Prentice-Hall, 1993
3. Edwards CJ (ed). Fault-Tolerant Control – A Benchmark Challenge. Springer, in press
4. Emami-Naeini A, Akhter MM, Rock SM. Effect of model uncertainty in failure detection: the threshold selector. IEEE Trans Autom Control 1988; 33(12): 1106–1115
5. Maciejowski JM. Reconfiguring control systems by optimisation. Proceedings of the European Control Conference, Brussels, July 1997
6. Maciejowski JM. Modelling and predictive control: enabling technologies for reconfiguration. In: Gertler JJ (ed), Annual Reviews in Control, vol. 23, Pergamon, 1999, pp. 13–23
7. Maciejowski JM. The implicit daisy-chaining property of constrained predictive control. Appl Math Comput Sci 1998; 8(4): 101–117
8. Safonov M, Tsao T. The unfalsified control concept and learning. IEEE Trans Autom Control 1997; 42(6): 843–847
Final Comments by the Authors

J. Lunze, J.H. Richter

In this note, we respond to the "Discussion of Fault-Tolerant Control" by Jan Maciejowski. We adopt the same referencing scheme, citing the literature from our original paper as [LRn] and the literature from Maciejowski's discussion as [Mn].

The discussion concerns the value of theoretical contributions to fault-tolerant control ("the literature") in general and states that little evidence for successful applications of such contributions has been reported. The main issues raised are (1) the discrepancy between realistic system and fault models and the models that are amenable to analytical treatment, (2) the dependence of active fault-tolerant control on the availability of reliable diagnostic results, and (3) the interplay between fault diagnosis and fault-tolerant control, which is largely ignored in the literature. In summary, the concern is that the prevailing assumptions and frameworks do not adequately reflect reality. We share these concerns, which are supported by our first-hand practical experience with experimental tests of fault-tolerant control approaches on our test-bed VERA; some of these results were reported in detail in [LR57] and outlined in our tutorial. We do see significant progress at this time, in particular on the system-models aspect (1): a few interesting new approaches for nonlinear and hybrid systems have appeared, which became available after the final submission [1, 3–5]. Furthermore, we hope that issues (2) and (3) can be approached by rigorous analysis to a much larger extent than has been done so
far. New research in these directions is strongly encouraged, possibly based on [2] and [M8].

The discussion by Maciejowski also touches on the more fundamental question of the relation between the applied engineering sciences and practice, which we see as follows. It is the task of science to identify, clarify and formalise general prototypical problems, to make statements about the solvability of these problems, and to provide systematic and generalised approaches to their solution. From this point of view, every practical problem deviates more or less significantly from the prototype problems, for example in its assumptions. To solve a practical problem, parts of the general solution might be usable without modification, while other parts have to be adapted for reasons specific to the domain of application or extended by heuristic elements.

From this perspective, some achievements of the fault-tolerant control community are not as negligible as the discussion might suggest. For example, necessary and sufficient conditions for the solvability of several fault-tolerant control problems have been stated. These statements are useful at least for identifying unsolvable problems, and for identifying the prerequisites that have to be satisfied in any application to enable fault-tolerant control. In this respect, theoretical results that matter for practical applications do not only show how to solve a fault-tolerant control problem but, sometimes more importantly, also prove which problems are unsolvable.