The development of safe advanced road transport telematic software

P H Jesty, T F Buckley† and M M West describe the work done as part of the EC DRIVE project, including the philosophy behind a proposal for a software standard and its certification criteria

Safety Critical Computing Group, School of Computer Studies, University of Leeds, UK
† Dr Tom Buckley died during the later stages of the work described in this paper
Paper received: 20 December 1991. Revised: 15 August 1992
Whilst the road transport industry has already started to take advantage of the increased functionality offered by the use of programmable electronic systems, the current Type Approval mechanism, whereby all road transport equipment is certified as being fit and safe for its purpose, cannot adequately assess them. This paper describes the work done as part of an EC DRIVE project to propose a European standard for the development of safe road transport telematic systems. In particular the philosophy behind the software standard and its certification criteria, where most of the new problems lie, is discussed. A solution is proposed that is pragmatic, meaningful and workable.

Keywords: safety-critical software standards, safety engineering, road transport
It is now generally accepted that there are a number of major problems associated with road transport, including congestion, pollution, injuries and fatalities. These result in a steady deterioration in the environment, and an increasing cost on society in general and on individuals in particular. In various parts of the First World a number of major research programmes are now under way that are attempting to tackle one or more of these problems. These programmes have two things in common, namely the use of radio communications and the use of programmable electronics, often actual computers. Applications that are being investigated include intelligent cruise control, collision warning, platooning of vehicles, automatic navigation (some with centralized control),
traveller information and automatic toll collection. Some of these applications require communication between vehicles, whilst others require communication between a vehicle and the roadside. Some require programmable systems or computers inside the vehicle, whilst others need a network of large computers spread over a wide area. Whatever their objective, it is essential that the addition of one or more of such new systems does not create a less safe road transport environment than exists at present; preferably it should create a safer one.

From 1987 most of the major vehicle manufacturers and electronic component companies in the European Community (EC) started to collaborate within the Commission of the European Communities' (CEC) EUREKA research programme on a Programme for a European Traffic with Highest Efficiency and Unprecedented Safety (PROMETHEUS). The objective is to 'create concepts and solutions which will make traffic perceptively safer, more economical, with less impact on the environment and will make the traffic system more efficient' 1. The programme is principally concerned with advanced applications associated directly with vehicles.

The year 1989 saw the start of a further CEC funded research programme for Dedicated Road Infrastructure for Vehicle Safety in Europe (DRIVE) 2. Whilst this has a similar set of stated objectives to PROMETHEUS, they were to be achieved through the application of road transport informatics (RTI) both to road vehicles and to the roadside infrastructure. Whilst the PROMETHEUS programme runs until 1995, the DRIVE programme was completed at the end of 1991, and has been succeeded by DRIVE II, whose objective is to evaluate the ideas developed during the first DRIVE programme (now known as DRIVE I) by means of field trials. There are two main sets of trials: urban (known as POLIS) and inter-urban, usually international (known as CORRIDORS). The new approach has now been given the name of Advanced Road Transport Telematics (ATT).
DRIVE Safely

This paper is principally concerned with the work done in the DRIVE I project 'DRIVE Safely' (V1051) by a consortium consisting of TUV Rheinland in Germany, TNO Delft in The Netherlands, Program Validation Ltd and the Safety Critical Computing Group at the University of Leeds, the last two being in the UK. The problem that was addressed by this project can be explained as follows.

The current situation is that most of the failures that occur in a road transport situation are caused by some form of human error; very few are the result of equipment failure. There are two principal reasons for this. Firstly, most of the control of any situation is placed in the hands of a human operator, and thus most loss of control is only possible through human error. Secondly, such equipment as can undertake control is designed and manufactured using a well understood traditional technology, which is now very reliable.

One of the aims of the research programmes mentioned above is to replace some of the control currently undertaken by human operators with ATT systems. However, such systems will only increase the overall safety of road transport if the number of failures associated with the new ATT systems is very small. Unfortunately ATT systems are being designed and implemented using new technologies which are not fully understood, and of which there is little experience. In particular there are currently no standards that specify how safe programmable systems should be designed and implemented. The task of DRIVE Safely was therefore to produce a proposal for a European standard for the development of safe RTI systems 3.
THE SAFETY TASK FORCE

During the EC DRIVE I programme a group of projects met regularly during Concertation Meetings and ultimately formed a 'Safety Task Force' to provide, for the first time, a unified front to the problem of road traffic safety. They identified three basic areas:

• System safety - concerned with the safe functioning of equipment under all conditions, and where response times are usually less than one second. (DRIVE Safely was the only project concerned with system safety, and its representative in the Safety Task Force was Dr T F Buckley.)
• Man-machine interface (MMI) safety - concerned with how a human individual reacts to the information displayed by the equipment, and provides responses back to the equipment. Response times are usually a few seconds.
• Traffic safety - concerned with the whole road traffic infrastructure (e.g. road layout), where it can often take a very long time to demonstrate the effectiveness, or otherwise, of a new feature.

The Safety Task Force is an expression of the fact that it is not always possible to categorize the safety aspects of any proposed system under a single heading. Indeed sometimes an application can involve all three safety areas simultaneously. One such is intelligent automatic route guidance. In this case system safety is concerned with ensuring that safe instructions are given so that, for example, a driver is not instructed to go the wrong way along a one-way street. MMI safety is concerned with ensuring that the mechanism by which the driver is instructed to get to the destination does not create too much of a distraction from the primary task of driving. Meanwhile, traffic safety can be involved in the following scenario: the centralized control of the guidance system discovers that there is an obstruction on a particular road which has produced major congestion; in order to keep the rest of the traffic flowing, drivers are diverted onto another set of roads. If these roads are in a residential area then a relatively safe situation for pedestrians, and playing children, suddenly becomes less safe.

SYSTEM SAFETY

The proposed standard is divided into five main parts. Part A covers the generic system aspects of risk assessment and overall development philosophy. It should be noted that this proposed standard is only concerned with (programmable) electronic (sub-)systems, and so other (sub-)systems are the concern of other standards. It is applicable to new designs, and may be used for the evaluation of existing equipment provided the information is made available. Part B covers the specification and architecture of (programmable) electronic systems. The safe design of the software is covered in Part C, whilst the hardware aspects are covered in Part D. Throughout the design and implementation of any programmable electronic system or sub-system, there will be a process of assessment, and the mechanism by which a Certification Authority should carry this out is described in Part E. Part F contains a glossary of jargon terms used throughout the document. The principal recommendations of the non-software parts can be summarized as follows.
Risk classification and failure probabilities

Risk is usually defined as the product of the seriousness of the effect of a failure and the probability of that failure. In the road transport situation this is complicated by the fact that a failure will not automatically lead to an accident, and that the severity of an accident is not dependent on the type of failure. We therefore decided to classify risks in terms of the controllability of the safety of the situation after a failure. Thus, for example, the sudden failure of a vehicle's radio would normally only be a nuisance, although it could result in an accident if the driver's attention was diverted sufficiently by the occurrence. Similarly the failure of a steering system would leave the vehicle uncontrollable, but in very favourable circumstances no accident need ensue.

Before the design of any new equipment is started, the question 'What risk is acceptable?' must be answered. A hazard analysis must therefore be performed to attempt to identify a preliminary hazard list. This should be done by a group containing as wide an expertise related to the system as possible, including at least one member from each of the safety areas above. The hazards are then classified in terms of the controllability of the situation should they occur, and we have identified five controllability categories as shown in Figure 1. The controllability category for each hazard defines an integrity level required for the design of the new (sub-)system, which in turn defines the requirements for the process of development.
Figure 1. Effect of integrity levels: hazard identification assigns each hazard a controllability category (Uncontrollable, Difficult to Control, Debilitating, Inconvenient, Nuisance Only), which maps onto an integrity level (Extremely High, Very High, High, Medium, Low); the integrity level determines the requirements for the development process, configuration and component choice, and a target total probability of failure of the product (Extremely Improbable, Very Remote, Remote, Reasonably Possible, Probable)
The aim is to produce, and to be able to demonstrate that one has produced, in a manner that will be discussed below, a (sub-)system whose probability of failure is also shown in Figure 1.

There is currently considerable debate as to whether the failure probabilities in diagrams such as Figure 1 should be given numbers. Whilst the advantage of having numbers cannot be denied, they must have a meaning and be demonstrable as useful. For software this is a problem. Much work has been undertaken developing metrics for the reliability of software, but this has concentrated on large systems, and produced, in safety-critical terms, only modest probabilities of failure 4. We therefore felt that it was too early to put numbers into Figure 1, and that to do so might create a false sense of security for those who then used them.
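As an illustration of how the classification of Figure 1 might be applied in practice, the following sketch maps controllability categories onto integrity levels in code. The category and level names follow Figure 1; the helper function and the hazard examples are hypothetical and, in keeping with the discussion above, no numerical failure probabilities are attached.

```python
from enum import Enum

class Controllability(Enum):
    UNCONTROLLABLE = 1
    DIFFICULT_TO_CONTROL = 2
    DEBILITATING = 3
    INCONVENIENT = 4
    NUISANCE_ONLY = 5

class IntegrityLevel(Enum):
    EXTREMELY_HIGH = 1
    VERY_HIGH = 2
    HIGH = 3
    MEDIUM = 4
    LOW = 5

# One-to-one mapping taken from Figure 1: the harder a failure is to
# control, the higher the integrity level demanded of the design.
INTEGRITY_FOR = {
    Controllability.UNCONTROLLABLE: IntegrityLevel.EXTREMELY_HIGH,
    Controllability.DIFFICULT_TO_CONTROL: IntegrityLevel.VERY_HIGH,
    Controllability.DEBILITATING: IntegrityLevel.HIGH,
    Controllability.INCONVENIENT: IntegrityLevel.MEDIUM,
    Controllability.NUISANCE_ONLY: IntegrityLevel.LOW,
}

def classify(hazard: str, controllability: Controllability) -> IntegrityLevel:
    """Assign an integrity level to a hazard from its controllability.

    Hypothetical helper: in the proposed standard this judgement is made
    by a hazard-analysis group, not by software.
    """
    level = INTEGRITY_FOR[controllability]
    print(f"{hazard}: {controllability.name} -> integrity {level.name}")
    return level

# Hazards drawn from the examples in the text.
classify("in-car radio fails silently", Controllability.NUISANCE_ONLY)
classify("steering system fails", Controllability.UNCONTROLLABLE)
```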
Safety by design

If a system is safety-critical then both the functional requirements and the safety requirements of the system must be considered.
Figure 2. Safety specification: the safety requirement specification is divided into the functional safety requirements and the safety integrity requirements, the latter comprising measures to avoid faults and measures to control faults
The system may be monitored by a safety-related protection process, or the safety-related process may be part of the overall control of the system. In either situation the specification of the safety-related process should be divided into the functions that it has to perform (functional safety requirements) and how faults are to be treated (safety integrity requirements), as shown in Figure 2.

Failures can be of two main types, random or systematic. Random failures can occur at any time and are due to a degradation in the hardware. Systematic failures are the result of a fault in some stage of the life-cycle of a system, and can occur in either hardware or software. There are three principal measures that can be taken against such failures:

• Quality - mainly systematic failures
• Reliability - random failures
• Configuration - mainly random failures

Quality is achieved through the application of a suitable quality management system that should conform, at least, to ISO 9000 5. Configuration of the system architecture is the mechanism which ensures that those faults that do occur do not lead to an unsafe situation. The basic requirement is that the probability that a single failure can lead to a dangerous situation should be negligible. In addition, if a failure is undetected, there should be a negligible probability that it could lead to a dangerous situation even in combination with one or more other failures; the actual number depends on the integrity level required.

One particular problem of concern is that, whilst the use of redundancy is suitable for the control of random hardware faults, systematic faults in hardware or software will produce a common mode failure in all channels simultaneously, unless they have each been designed diversely. It must also always be remembered that the safety requirements specification itself is a potential cause of common mode failures.
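The point about redundancy and common mode failure can be made concrete with a small sketch. The triplex voter below is a standard pattern, not something prescribed by the proposed standard, and the channel functions are hypothetical. Majority voting masks a random fault in one channel; but if all three channels share the same systematic design fault, the voter simply agrees on the same wrong answer three times.

```python
from collections import Counter

def vote(a, b, c):
    """2-out-of-3 majority voter: returns the value at least two
    channels agree on, or raises if all three disagree."""
    value, n = Counter([a, b, c]).most_common(1)[0]
    if n < 2:
        raise RuntimeError("no majority - channels disagree")
    return value

# Hypothetical channels computing the same braking demand.
def channel_a(speed): return round(speed * 0.4)      # implementation 1
def channel_b(speed): return round(speed * 2) // 5   # implementation 2 (diverse)
def channel_c(speed): return round(speed * 0.4)      # implementation 3

speed = 50
# A random hardware fault in one channel (here, a corrupted value of
# 999) is outvoted by the two healthy channels.
print(vote(channel_a(speed), 999, channel_c(speed)))  # -> 20

# A systematic fault is different: if every channel embodies the same
# specification or design error, all three outputs are identically
# wrong, and the voter passes the fault straight through. Only diverse
# design of each channel offers any protection against this.
```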
Safe hardware

The reliability of hardware components can be increased by using good quality products, and possibly derating them. The environment both inside a vehicle, and at the roadside, can be very hostile, and so tests must be performed to ensure that the hardware will perform correctly under all conditions, and in all the required geographical locations. These include tests for temperature, humidity, vibration, shock, salt mist, dust, sand and the action of fluids. Tests must also be done to ensure electromagnetic compatibility, such that the electronic components do not cause interference, nor are affected by it. There are many standards relating to these issues, which are listed in our document.
Type approval

Type approval is the certification of the product and is suitable for those systems (usually mechanical or electrical) which can be fully black box tested once they have been made. Most new cars and goods vehicles have to be subjected to this type of certification, which is 'a way of making sure that cars and goods vehicles are safe for use on the road, without having to inspect and test every single one' 6. Whilst vehicles and roadside equipment were built using traditional mechanical and electrical engineering techniques, this procedure worked well. However, the increased functionality required for the new ATT systems can only be obtained by using programmable systems. Since such systems cannot be exhaustively black box tested, another means of compliance is required for certification.
TOWARDS THE DEVELOPMENT OF SAFE SOFTWARE
The nature of the problem

Traditional engineering products can be considered as developed in two stages: design and production, as shown in Figure 3a. First an idea is developed into a design until one or more prototypes are created. Once it has been confirmed that these prototypes do represent what is required, the production stage is entered, which basically produces multiple copies of the same system. It is assumed that, once the prototype is accepted, the design is 'correct' and so, providing quality control is maintained during production, any failures will be due to random faults in a component. This is the basis of Type Approval.

Unfortunately the development of software does not follow these same stages in the same way (see Figure 3b). The production phase (producing multiple copies) is trivial, and software is 100% reliable in the sense that it can never change, and so always causes the same thing to be done. Thus any faults that are in software appear during the requirements and development stages, and if they are not discovered they remain as systematic design faults in the production model. The development of fault-free software is notoriously difficult for two principal reasons:

• it is a digital system, and any state can follow any other state - the feature that provides the flexibility and reliability
• its potential for complexity - the feature that provides the increased functionality.

These, of course, are the two reasons for which it was being used in the first place! Since software cannot, in general, be exhaustively black box tested, it can be seen from the above argument that its evaluation should be done by assessing the development process, which will include testing, rather than the final product. This is achieved in the civil aviation industry as laid down in 'Software Considerations in Airborne Systems and Equipment Certification: ED-12A/DO-178A' 7. The basic message of ED-12A/DO-178A is that, if they are to obtain certification of their products which contain software, 'designers must take a disciplined approach to software requirements definition, design, development, testing, configuration management and documentation'. It should be noted that the aviation industry does not certify software as a product in its own right - only as a component of a larger system.

Figure 3. a, Development of a traditional engineering product: requirements → design and development → prototypes → production → operation. b, Development of a software engineering product: requirements → design and development (assumed correct) → operation (100% reliable)
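The claim that software cannot, in general, be exhaustively black box tested is easy to quantify. The following sketch (our illustration, not part of the proposed standard) counts the test cases needed to exercise every input of even a very small pure function; a stateful program, in which any state can follow any other, multiplies this figure by the number of reachable states.

```python
# Exhaustive black box testing of a pure function over two 32-bit
# inputs requires one test per input combination.
combinations = 2 ** 64            # every pair of 32-bit values

tests_per_second = 10 ** 9        # an optimistic test rig: 1e9 tests/s
seconds_per_year = 60 * 60 * 24 * 365

years = combinations / (tests_per_second * seconds_per_year)
print(f"{combinations:.3e} cases, about {years:.0f} years of testing")
# -> 1.845e+19 cases, about 585 years of testing
```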
Legal liability

The 1985 European Community Directive on product liability requires each member state to introduce laws imposing strict liability on the producers of defective products that can cause damage.
For the United Kingdom this has resulted in the Consumer Protection Act 1987 (CPA). The definition, in the CPA, of a product includes 'a product which is comprised in another product, whether by virtue of being a component part....', and it is thus reasonably clear that the producer of any product, or of the component of any product, would become liable if it inflicted damage. Unfortunately this clarity disappears as soon as the product under discussion is a piece of software. However, lawyers who have studied the Act believe that it does apply to safety-critical software 8, and hence a prudent producer of safety-critical software will develop and market it in a sensible and professional manner 9.
Existing (draft) standards

It became clear during the 1980s that the general quality of software engineering was poor 10, and that there were particular problems with the development of safety-related software 11. The year 1989 saw the publication of two sets of draft standards relating to safety-critical systems containing programmable electronics. The first was the Ministry of Defence DEFSTANs 00-55 and 00-56, which have now been issued as interim standards 12,13. The basic philosophy behind these is that once a sub-system containing software has been identified as being safety-critical, DEFSTAN 00-55 prescribes a development route for safety-critical programs that is based upon the application of mathematical methods for specification, design and verification. They put a very real limitation on what is currently feasible, and their publication naturally caused some considerable debate. A few months later, the International Electrotechnical Commission (IEC) draft standards were published, which took a very different view of the situation 14,15. They propose that every sub-system should be analysed with respect to the risks associated with it, and an integrity level then given to it; the greater the risk, the higher the level, and the greater the care that must be given to its design. Various development techniques are then identified as being suitable for each integrity level. There is no specified limit as to what can be attempted.

Unfortunately, neither of these two sets of draft standards fully solves the current problems faced by industry. There are already many safety-critical systems in existence which could not have been developed according to the interim DEFSTANs, and their removal is not a viable option. In addition the apparently sensible attitude taken by the IEC/SC65A/WG9 standard is flawed: the requirement that different design techniques should be used for different integrity levels implies that some methods are demonstrably better than others, and there is no scientific or quantitative basis for this assumption 16. The situation is therefore one in which control over safety-critical software is being called for, but no-one agrees as to how, in detail, it should be done - in other words, this is not the time to write a standard, but many people are saying that they want one! We therefore had to address all these points whilst we were writing the software standard and certification criteria (Parts C and E) of the proposal for a European standard for the development of safe RTI systems.
THE PROPOSED SOFTWARE STANDARD

The proposed standard for RTI systems addresses a number of problems simultaneously:

• It gives guidance as to how safe software should be designed and implemented at a time when there is no absolute consensus as to how this should be done.
• It introduces the certification of software sub-systems into the existing Type Approval systems.
• It provides a meaningful way of handling the safety integrity of a software sub-system.

These three topics will now be discussed individually, and we will show that, by allocating certain tasks to the Certification Authority, a practical solution to these problems can be achieved without holding back any advances that may be made in the area of safety-critical computing, or handicapping the European motor industry. The proposal currently only deals with the design, implementation and certification of new systems.
Design and implementation methodologies

The previous section concentrated on pointing out the deficiencies in the two sets of draft standards. There is, however, much in common between them and the various guidelines that preceded them; in particular those produced by Technical Committee 7 of the European Workshop on Industrial Computer Systems (EWICS) 17, the Software Tools for Application to Large Real Time Systems (STARTS) Public Purchaser Group 18,9, and the Health and Safety Executive (HSE) 19,20. The draft standards are only flawed in that their authors have tried to impose rules as to which development methodologies may, or may not, be used before a general consensus has been reached. Our approach to this problem is to recognize the lack of consensus and to acknowledge it, rather than ignore it 21.

Engineering knowledge usually advances in one of two principal ways: either by research or by experience ('learning by one's mistakes'). Whilst the results of the former are usually publicly available, there is a natural tendency for victims of the latter to keep very quiet about them, especially if safety is involved! Thus in order for everyone to be able to benefit from a mistake, especially one that was committed in good faith and may even be common practice, it is necessary to have a central body who can sanitize any report made to it and feed the information quickly back into the industry 22. Many industries already have such a system in existence, though few, if any, include software faults in their reports; an unofficial list is being maintained in the USA 23. We therefore require a standard or guideline that fosters 'best practice', but that at the same time permits the subject to develop in the light of experience. A standard on its own cannot achieve this properly as it is purely a document, and its change mechanism can be extremely slow.

There is a consensus shared by many workers in the area of safety-critical computing that, since it is so easy for software to be developed in a haphazard manner, all safety-related software should be formally and independently assessed before being permitted to become operational.
Our approach is therefore to combine the job of independent assessor or Certification Authority with that of information disseminator, in a manner that will be described below, though we do acknowledge that there will be political, economic and staffing problems in setting up such a body. The proposed standard is therefore definitive in those areas where there is an undisputed consensus, but it gives more open guidance where it covers the methodologies that should be used.
The software life-cycle

A life-cycle is a management tool to provide a systematic way of ensuring that all important items are covered during a course of action. The life-cycle for safety-critical software in the proposed standard, shown in Figure 4, is based on that proposed in the draft international standard IEC/SC65A/WG9 14, and can be summarized as follows:
By performing a number of hazard analyses of the entire system, those sub-systems which will contain software, and that are deemed to be safety-critical, can be discovered. The software requirements specification for each sub-system is then drawn up in two stages. The first stage deals solely with the functions that the sub-system has to perform, whilst the second stage details the additional requirements that the sub-system must have in order to maintain its safety integrity. Once a complete and unambiguous specification has been produced for a sub-system it is then possible to proceed to the design.

During the design, regular checks are made to ensure that it remains consistent both with itself and with the original specification. Meanwhile, possibly using a different team of people, a plan is made as to how the safety of the final sub-system is to be validated. Once the design is finished a complete design review takes place to ensure that all the original requirements are accounted for correctly.
Figure 4. A life-cycle for safety-critical software: designation of safety-related software; software requirements specification (functional requirements specification and safety integrity requirements specification); validation planning, design and verification; design review; implementation and verification; test; safety validation; operation and maintenance; and system modification/retro-fit. Certification runs alongside every phase
After everyone is satisfied with the design, the implementation can proceed, with regular checks being made to ensure that the code remains consistent with the specification. Once the code is written, testing can take place. When the software sub-system is complete the validation plan is used to check the safety of the sub-system, and its performance in so far as it can affect safety. (It should be noted that, strictly speaking, this proposed standard is solely concerned with ensuring that a product does not cause an injury. Subject to this being so, it is not normally concerned with the correct functioning of the system, which is a matter for normal quality control.)

The result of each phase is one or more documents (and sometimes code), which can be classified as being either a plan, a specification or a report. The proposed standard specifies what each document should contain and their interrelationship. Those documents that must be made available to the Certification Authority will depend on the integrity level being designed for. It is during the creation of the design plan that the developer must consult with the Certification Authority as to which method(s) may be used for the development. The advantage of this system is that the Certification Authority will, over the years, gain experience as to which techniques are the most suitable, and will therefore be able to influence the techniques used by the industry and to guide it in the 'best' direction. The exact mechanism as to how this might be done needs further investigation, since the Certification Authority should not be put at risk for product liability.

During the lifetime of the system it will need to be maintained and possibly modified. Whenever any changes are made to the system, and these may be changes to the external environment within which it is to be used, a check must be made to ensure that the features incorporated into the design, and the assumptions made during the initial (or previous) safety integrity assessments, are still valid for the existing and new circumstances.
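A minimal sketch of this document-driven life-cycle follows, assuming hypothetical phase and document names; the proposed standard itself, not this sketch, specifies the actual contents and interrelationships of the documents.

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    PLAN = "plan"
    SPECIFICATION = "specification"
    REPORT = "report"

@dataclass
class Document:
    name: str
    kind: Kind

# Each life-cycle phase yields one or more documents (and sometimes
# code); the phase and document names below are illustrative only.
LIFE_CYCLE = {
    "requirements": [Document("functional requirements spec", Kind.SPECIFICATION),
                     Document("safety integrity requirements spec", Kind.SPECIFICATION)],
    "design": [Document("design plan", Kind.PLAN),
               Document("validation plan", Kind.PLAN),
               Document("design review report", Kind.REPORT)],
    "implementation": [Document("verification report", Kind.REPORT)],
    "test": [Document("test report", Kind.REPORT)],
    "safety validation": [Document("safety validation report", Kind.REPORT)],
}

def submission(phases):
    """Gather the documents from the named phases, as might be offered
    to the Certification Authority for inspection."""
    return [doc for phase in phases for doc in LIFE_CYCLE[phase]]

for doc in submission(["requirements", "design"]):
    print(f"{doc.kind.value:>13}: {doc.name}")
```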
Formal methods

The use of discrete mathematics for the functional specification of systems, now known as formal methods, is suffering the same problem as the other candidate techniques for the development of safety-critical software: they have only been available for about 30 years, they have been practised for less, and they are still very much in their infancy. The idea that the user's requirements can be unambiguously captured in a formal specification, from which a program can be refined, and against which a program can be proved, is exceedingly attractive. If this were possible in all cases then the contents of our proposal would be obvious, and short! The current reality is somewhat different. Of the available formal specification notations few have commercial toolsets at present, and even when they do become available the notations are not suitable for all types of problem 24. One must also understand precisely the nature of the verification and proof work carried out 25.

These problems do not denigrate the role of formal methods; they merely put them in perspective. They are a tool, and sometimes a very powerful tool, in the locker of the software engineer, but they are not the only tool. Whilst they are as fundamental to the understanding of computer systems as traditional mathematics is to the understanding of other engineering disciplines, their application is by no means as well developed. A software engineer should therefore use them when appropriate, but must be prepared to use less rigorous methods, supported by documented engineering judgements (that only come with experience), when necessary.
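To give a flavour of the kind of specification being discussed, the sketch below attaches a pre- and postcondition to a hypothetical route-guidance function; it is our illustration only, in no particular notation endorsed by the proposal. Note the distinction it makes visible: run-time assertions merely check the contract on the cases actually executed, whereas a formal proof would establish it for all inputs.

```python
def advised_speed(limit_kmh: int, obstruction_distance_m: int) -> int:
    """Hypothetical route-guidance helper.

    Precondition:  0 < limit_kmh <= 130 and obstruction_distance_m >= 0
    Postcondition: 0 <= result <= limit_kmh, and result == 0 whenever an
                   obstruction is closer than 10 m.
    """
    assert 0 < limit_kmh <= 130 and obstruction_distance_m >= 0   # pre

    result = 0 if obstruction_distance_m < 10 else limit_kmh

    assert 0 <= result <= limit_kmh                               # post
    assert obstruction_distance_m >= 10 or result == 0            # post
    return result

print(advised_speed(50, 100))  # -> 50
print(advised_speed(50, 5))    # -> 0 (stop: obstruction ahead)
```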
CERTIFICATION

The nature of software certification
Certification implies some form of official approval of a software product, which in turn implies that the software performs, in its associated hardware, in accordance with the claims of the software producer. It is, to a certain degree, an implied second guarantee. (Whereas hardware guarantees are actually useful, a typical software supplier's guarantee is often virtually worthless 26.) This 'second' guarantee is given by an independent authority to which the software suppliers submit their products for testing and assessment. The certifying body puts its reputation and prestige at risk in endorsing the software, and whilst there may be no financial penalty should the software fail, the resulting loss of credibility to the certifying body can be of serious consequence. Because of this risk, the wording used by the certifying authority can be expected to be very precise and exact, and the customer for safety-critical software has to be aware of the certification process so as to avoid falling into the trap of assuming that certification means that the software is perfect.

Certification can have at least three meanings, or levels of guarantee:

• The software is correct.
• The software is correct in that it provides the service as described in its top level requirements specification.
• The software has been tested in its target hardware and appears to be correct, and appears to have been produced in accordance with good software engineering practices and in accordance with a recognized relevant standard.

These interpretations of what is meant by certification become progressively weaker. The first says that the software will provide its intended service and that there will be no unexpected side effects; that is to say, it does the job it is meant to do and it does nothing else besides. It is virtually impossible to give such a guarantee for any but the simplest software product. The second level says that if the specification is correct then the software is correct, but in any case the software does what its designers thought that it should do. This level of certification is possible for small systems. If it can also be shown that the specification is complete then the software can be certified as having no unwanted side effects, and we have in effect moved up to the first level. The third level is the most commonly used in practice; it is the method by which the civil aviation authorities accept the software components of certified airborne computer equipment.

There are two basic problems that force certification down from the top level to the bottom level. The first is the difficulty of knowing that one has derived a complete, unambiguous and correct specification from the initial requirements - a universal problem. The second is the problem of testing and/or proving the implemented system against the specification; this is an area being addressed by the use of formal methods.
Integrity levels

As shown in Figure 1, the objective of designing a (sub-)system to a particular integrity level is to achieve a specified probability of failure. In some forms of engineering this can be achieved by designing in a certain way, or by using certain components. It is then possible to demonstrate that, when a product has been designed to, say, a very high integrity level, it indeed has a lower probability of failure than one designed to, say, a medium integrity level. This is because all failures are assumed to be due to random faults. Software, however, cannot have random faults; any faults that might exist will be systematic faults in the design. The probability of failure does not therefore have the same type of meaning. In fact it is very difficult to find any criteria by which two software programs may be compared in a manner analogous to Figure 1, especially since it is possible for even a 'very badly' designed program to be correct or, contrariwise, a 'very well' designed program still to have faults. Thus, since it is not possible to assure that one design methodology will produce a demonstrably better product than another using the criterion of the probability of failure, we have rejected this interpretation of integrity levels.

The probabilities of failure shown in Figure 1 ultimately give the road user confidence in the equipment. We have therefore extended this notion into the idea of levels of confidence, which we map onto the integrity levels, an idea also being proposed for secure software 27. The principle is that the more one knows about the design and implementation processes of a piece of software, the greater the confidence that one can place in it. This knowledge is gained through the documentation produced, and so the requirements for certification at a particular integrity level are specified in terms of the documents required for inspection. Thus, for example, whilst a software developer may use a formal specification language for a system of any integrity level, a formal architectural specification would only be mandated by the Certification Authority for the higher integrity levels. The conformance quality of any software tools to be used in the development, in particular compilers, is also specified.
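The confidence-through-documentation principle could be expressed as follows. The document names and the levels at which they first become mandatory are invented for illustration; the proposed standard, not this sketch, defines the real requirements. The key property is monotonicity: a higher integrity level never demands less evidence.

```python
# Illustrative only: which evidence a Certification Authority might
# require for inspection at each integrity level.
LEVELS = ["Low", "Medium", "High", "Very High", "Extremely High"]

# (document, lowest integrity level at which it is mandated)
EVIDENCE = [
    ("quality management records",         "Low"),
    ("test reports",                       "Low"),
    ("design review report",               "Medium"),
    ("verification reports",               "High"),
    ("formal architectural specification", "Very High"),
    ("tool conformance evidence",          "Very High"),
]

def documents_required(level: str) -> list[str]:
    """Every document mandated at or below the given integrity level."""
    rank = LEVELS.index(level)
    return [doc for doc, first in EVIDENCE if LEVELS.index(first) <= rank]

for level in LEVELS:
    print(level, "->", documents_required(level))
```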
Organization

So far the Type Approval system has worked well, whilst road transport equipment was made entirely from mechanical or electrical components. However, as explained above, software systems differ from electromechanical systems. It is therefore necessary to update the Type Approval mechanism to cater for the fact that software does not have a corresponding 'production' phase (all the effort goes into its design) and that, in general, it cannot be black box tested as fully as mechanical or electrical systems, due to its additional complexity. The assessment of the process of software development against a standard (as distinct from the conformance of a product, e.g. a compiler, to a standard) is a relatively new idea. The crucial difference between the Type Approval of electromechanical systems and the assessment of software is that the former is carried out after design and implementation, with a relatively short contact time with the approval authority, whereas the latter needs to be initiated very early in the life-cycle of the product, with a resulting long contact time with the Certification Authority. We envisage software certification in this manner for road transport informatic systems.

Figure 5 shows a possible administrative structure, based on one being proposed for certifying secure software systems 27. By reporting to the CEC through a Management Board, the Certification Authority can ensure compatibility throughout the EC. The Licensed Evaluation Centres would be registered with the Certification Authority, and act as Certification Bodies on behalf of the Certification Authority. These Licensed Evaluation Centres may, or may not, be part of an existing Type Approval organization.
Figure 5. Proposed software certification organization: the CEC oversees a Management Board, to which the Certification Authority reports; Licensed Evaluation Centres are registered with the Certification Authority and deal directly with developers, system builders, contractors and sub-contractors
FUTURE WORK
As part of the DRIVE II programme we are now extending the results of DRIVE Safely, and assisting in the safe production of some of the field trials, through a project called PASSPORT. As well as providing specific advice, PASSPORT will also produce some guidelines for the development of safe ATT systems, which will amplify the philosophies developed during DRIVE Safely. In addition, since DRIVE Safely was only able to concentrate on basic programmable electronic systems, in DRIVE II we also wish to extend our studies to the safe development of more complex systems, such as networks and databases, as well as to the related areas of the safe development of application-specific integrated circuits and custom-designed very large scale integrated circuits.

The UK Government is sponsoring a research programme on 'safety-related systems' (SafeIT), funded jointly by the Department of Trade and Industry and the Science and Engineering Research Council. A number of companies in the motor industry are collaborating in a project in this programme called MISRA, with which we are associated. This will produce guidelines for the development of safety-critical software, and will also use some of the results of DRIVE Safely.
CONCLUSIONS

DRIVE Safely believes that it is proposing a standard for the development of safe RTI systems that is both meaningful and workable. The development of any safety-critical software is particularly difficult due to the inherent problems associated with software, and the lack of consensus as to which methods should be employed. The solution that we propose is pragmatic, and uses the Certification Authority as the mechanism by which best practice is fostered throughout the industry. The integrity level required of a piece of software is to be mapped onto a confidence level in the final code. This is to be achieved by varying the level of knowledge gained by the Certification Authority by means of the documentation supplied by the developer.
ACKNOWLEDGEMENT

This work was funded under the EC DRIVE Programme.
REFERENCES

1 Williams, M 'PROMETHEUS is rolling' Colloquium on The Car and its Environment - What DRIVE and PROMETHEUS have to Offer, Digest No 1990/020, IEE (1990)
2 'Advanced Telematics in Road Transport' Proceedings of the DRIVE Conference (Vols 1 & 2), DG-XIII, Elsevier (1991)
3 Towards a European Standard: The Development of Safe Road Transport Informatic Systems (Draft 2) DRIVE Safely Consortium, DRIVE Project Number V1051 (1992)
4 Brocklehurst, A S and Littlewood, B 'New ways to get accurate reliability measures' IEEE Softw. Vol 9 No 4 (July 1992) pp 34-42
5 Quality Management and Quality Assurance Standards - Guidelines for Selection and Use ISO 9000 (equivalent to BS 5750), International Standards Organization (1987)
6 How to Get Type Approval for Cars and Goods Vehicles Vehicle and Component Approvals Division, Department of Transport, Bristol, UK
7 Software Considerations in Airborne Systems and Equipment Certification The European Organisation for Aviation Equipment, ED-12A (equivalent to Radio Technical Commission for Aeronautics, DO-178A) (1985)
8 Robertson, A R 'Product liability issues for developers of safety-critical software' Industrial Experience of Formal Methods: SafetyNet 89 Conference Proceedings (1989)
9 The STARTS Purchasers' Handbook: Procuring Software-Based Systems STARTS Public Purchaser Group, NCC Publications (1989)
10 Software: A Vital Key to UK Competitiveness Cabinet Office: Advisory Council for Applied Research and Development (ACARD), Her Majesty's Stationery Office (1986)
11 Software in Safety-Related Systems IEE (1989)
12 Procurement of Safety Critical Software in Defence Equipment Interim Defence Standard 00-55, Ministry of Defence (1991)
13 Hazard Analysis and Safety Classification of the Computer and Programmable Electronic System Elements of Defence Equipment Interim Defence Standard 00-56, Ministry of Defence (1991)
14 Software for Computers in the Application of Industrial Safety Related Systems IEC SC65A WG9 (Technical Committee No 65: Industrial-Process Measurement and Control) Draft, International Electrotechnical Commission (1989)
15 Functional Safety of Electrical/Electronic/Programmable Electronic Systems: Generic Aspects IEC SC65A WG10 (Technical Committee No 65: Industrial-Process Measurement and Control) Draft, International Electrotechnical Commission (1992)
16 Jesty, P H, Buckley, T F, Hobley, K M and West, M 'DRIVE Project V1051 - Procedure for safety submissions for road transport informatics' Colloquium on Safety Critical Software in Vehicle and Traffic Control, Digest No 1990/031, IEE (1990)
17 Redmill, F J (Ed) Dependability of Critical Computer Systems 1 Elsevier Applied Science (1988)
18 The STARTS Purchasers' Handbook: Software Tools for Application to Large Real Time Systems STARTS Public Purchaser Group, NCC Publications (1986)
19 Guide to Programmable Electronic Systems in Safety Related Applications: Introduction Health and Safety Executive, Her Majesty's Stationery Office (1987)
20 Guide to Programmable Electronic Systems in Safety Related Applications: General Technical Guidelines Health and Safety Executive, Her Majesty's Stationery Office (1987)
21 Buckley, T F, Jesty, P H, Hobley, K M and West, M 'DRIVE-ing standards - a safety critical matter' Fifth Annual Conference on Computer Assurance: Systems Integrity, Software Safety and Process Security (COMPASS '90), IEEE (1990)
22 Thomas, A M 'Should we trust computers?' The BCS/Unisys Annual Lecture, The British Computer Society (1988)
23 Bloomfield, R E and Froome, P K D 'Aspects of the licensing and assessment of highly dependable computer systems' in Anderson, T (Ed) Safe and Secure Computing Systems Blackwell Scientific Publications (1989)
24 Sowerbutts, B 'Formal methods - white hope or red herring?' SafetyNet No 5, Viper Technologies Ltd, Worcester (June 1989)
25 Cohn, A 'The notion of proof in hardware verification' J. Autom. Reasoning Vol 5 No 1 (1989) pp 127-139
26 Bryan, W L and Siegel, S G Software Product Assurance Elsevier (1988)
27 Description of The Scheme UK IT Security Evaluation and Certification Scheme, UKSP01, Department of Trade and Industry (1991)
Peter Jesty graduated in electrical and electronic engineering at Leeds University, UK. After gaining an MSc in computing and spending two years at Leeds Polytechnic, he became the undergraduate Computing Officer at Leeds University in 1974, and was appointed Lecturer in 1980. He is now a Senior Lecturer and a founder member of the Safety Critical Computing Group. He is currently the co-ordinator of the EC DRIVE project PASSPORT.
Tom Buckley graduated in physics at Imperial College, UK. After gaining his PhD at Westfield College, and spending a year at Portsmouth Polytechnic, he was appointed lecturer in digital technology and computer architecture at Edinburgh University in 1971, moving to Leeds University in 1979. He was a founder member of the Safety Critical Computing Group and had become established as an international expert within the EC DRIVE programme. He died on 13th July 1991 in the USA, after a short illness, whilst en route to give a paper at a conference.
Margaret West graduated in mathematics at Liverpool University, UK, and was initially involved in research in theoretical physics. She subsequently received an MSc in computing from Bradford University, and in 1989 joined the Safety Critical Computing Group at Leeds University to work on the DRIVE Safely project. Her research interests are in software engineering and in particular the application of formal methods to safety-critical computing.