Verifying cyber attack properties



Accepted Manuscript Verifying Cyber Attack Properties

Colin O’Halloran, Tom Gibson-Robinson, Neil Brock

PII: S0167-6423(17)30127-2
DOI: http://dx.doi.org/10.1016/j.scico.2017.06.006
Reference: SCICO 2110

To appear in: Science of Computer Programming

Received date: 19 May 2016
Revised date: 14 June 2017
Accepted date: 14 June 2017

Please cite this article in press as: C. O’Halloran et al., Verifying Cyber Attack Properties, Sci. Comput. Program. (2017), http://dx.doi.org/10.1016/j.scico.2017.06.006


Highlights

• Assuring the resilience and security properties of a Cyber-Physical System is important.
• How is such assurance to be achieved when such systems have already been developed?
• Refinement is used to check abstractions of source code.
• The abstractions of source code components are represented in the process algebra CSP.
• The CSP representation is checked against system security properties through refinement checking using the FDR3 model checker.

Verifying Cyber Attack Properties

Colin O’Halloran, Tom Gibson-Robinson^b, Neil Brock^a

^a The Charles Stark Draper Laboratory
^b Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK

Abstract

The heterogeneous, evolving and distributed nature of Cyber-Physical Systems (CPS) means that there is little chance of performing a top down development or anticipating all critical requirements such devices will need to satisfy individually and collectively. This paper describes an approach to verifying system requirements, when they become known, by performing an automated refinement check of its composed components abstracted from the actual implementation. This work was sponsored by the Charles Stark Draper Laboratories under the DARPA HACMS project. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Keywords: Autonomy, Cyber-Physical Systems, Internet of Things, Verification, Validation, Security, Safety

1. Introduction

This paper is a revised and extended version of an invited paper [36] presented at AVOCS 2015. Cyber-Physical Systems (CPS) are increasingly novel hardware and software compositions creating smart, autonomously acting devices, enabling efficient end-to-end work-flows and new forms of user-machine interaction. Currently CPS range from large supervisory control and data acquisition (SCADA) systems (which manage physical infrastructure) [30] to medical devices such as pacemakers and insulin pumps [14], to vehicles such as airplanes [7] and cars [38]. In the future they will include low power small devices that will make up the Internet of Things [41]. Clearly resilience, safety, security and privacy are of paramount concern. However, the heterogeneous, evolving and distributed nature of CPS means that there is little chance of performing a top down development or anticipating all critical requirements such devices will need to satisfy individually and collectively. The looming problem is therefore how to address the verification and validation of such systems when they are critical and rely upon other systems that

Preprint submitted to Elsevier, June 22, 2017

have limited assurance? Worse still, how can CPS be verified when the requirements will not be known until after they are deployed and composed with other CPS to form unanticipated critical services?

An approach to addressing this problem is to automatically extract the behaviours of such deployed systems that are composed together through well-defined architectures or protocols. At this point system requirements can be articulated, because the overall system of systems has been constructed for some purpose. This means that a system level representation of the relevant behaviours is needed. The importance of pushing the representation up to the level of system properties is that it is the natural level for people to articulate requirements; it is often difficult to express these as lower level specifications for components unless they are part of a top down development. Such top down development has already been excluded from the problem space being addressed.

Verifying components against specified lower level properties that are generally desirable is rather hit and miss. The history of developing secure systems is strewn with examples of systems where low level security properties hold, but the system is insecure. Conversely it is also possible for the system level to be secure, but low level vulnerabilities to be present, though not exploitable at the system level. There is no reason to believe that the same is not true for systems that are required to be safe.

Another advantage of working with system level requirements is that they can guide what behaviours are relevant to the system level requirements, i.e. they drive relevant abstractions that can be performed. For example, a subsystem on a CAN bus might have no relevant critical behaviour except that it does not flood the CAN bus with messages when critical messages are being passed between other sub-systems.
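A system-level requirement of the kind just described can be phrased directly as a property of event traces. The following Python sketch (purely illustrative; the event names `CRITICAL` and `NOISE` and the bound are hypothetical, not taken from the paper) checks a "no flooding" property over an explicit trace of bus events:

```python
# Illustrative trace-level "no flooding" property: a subsystem must not
# emit more than max_noise NOISE messages after a CRITICAL message has
# been seen on the bus. Event names and the bound are hypothetical.

def no_flooding(trace, max_noise=2):
    """Return False if more than max_noise NOISE events follow a
    CRITICAL event before the next CRITICAL event resets the count."""
    noise_run = 0
    critical_seen = False
    for event in trace:
        if event == "CRITICAL":
            critical_seen = True
            noise_run = 0
        elif event == "NOISE" and critical_seen:
            noise_run += 1
            if noise_run > max_noise:
                return False
    return True

ok_trace = ["CRITICAL", "NOISE", "CRITICAL", "NOISE", "NOISE"]
bad_trace = ["CRITICAL", "NOISE", "NOISE", "NOISE", "CRITICAL"]
print(no_flooding(ok_trace))    # True
print(no_flooding(bad_trace))   # False
```

A refinement checker such as FDR establishes this kind of property over all traces of a model at once, rather than testing individual traces as this sketch does.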
In the past the approach discussed in this paper would have been perceived to have too many obstacles for any chance of success. Advances in various areas of automated reasoning and model checking now make such an approach possible. The overall approach discussed in this paper is shown in the work-flow presented in Figure 1. The work-flow presented is between artefacts using various tools and techniques.

The work-flow starts with code for each component of the software architecture of interest. The code is in a predictable subset of the C programming language [33]. The FSG tool processes the C code to generate a compliance script that links a specification in the Z notation [43] to its implementation in the C code. The Z specification is also projected out for further processing. At this point we have limited assurance that the Z specification generated by the FSG tool is actually implemented by the original C code. To establish the required assurance, the compliance script is processed by a tool called QCZ (built on top of ProofPowerZ [37]) that treats the compliance script as though it is a top down refinement from the top level Z specification via intermediate refinement steps to the original C code. At each refinement step a Verification Condition within the Z notation is generated. The set of Verification Conditions

Figure 1: Work-flow of tools used


can now be processed by ProofPowerZ, a theorem prover for conjectures in the Z notation. If the Verification Conditions can be proven then we have established that the top level Z specification is a valid abstraction of the C code's behaviour. If the Verification Conditions are true then they can be automatically proven using proof tactics developed for proving auto-code against a Simulink diagram represented by a specification in Z [32].

Returning to Figure 1, the top level Z specification is translated into machine readable CSP, called CSPM [40]. For each component of the software architecture of interest, the work-flow described above is repeated. At this point the various CSPM representations are composed together in parallel, synchronising over the appropriate alphabet. The composed artefact is processed by the refinement model checker FDR to automatically check whether the composed CSPM model satisfies a property that represents a class of cyber attacks that, for example, a penetration tester might try.

In the rest of the paper a brief introduction to CSP, CSPM and the model checker FDR is given. An overview of a restricted subset of C [33] is given because it is the executable part of a wide spectrum language called the C Compliance Notation. The C Compliance Notation encompasses the C subset and specification statements expressed in the Z notation that are linked by a refinement relation. The semantics of the C Compliance Notation and the generation of Verification Conditions, which establish the correctness of refinement steps, are discussed. An example representing encrypted communication to a hobbyist quadcopter is given to illustrate the approach. The software written in C is abstracted to a system level and then verified against properties that encode the sort of attack a penetration tester might try. Finally, future work that employs de-compilation of executable binary into a mathematical representation is outlined along with related work.

2. A Brief Introduction to CSP and FDR

2.1. The CSP language

CSP is a process algebra in which programs or processes may be described. The processes communicate events from a set Σ (the set of all possible communications in the universe under consideration) with an environment. We sometimes structure events by sending them along a channel. For example, c.3 denotes the value 3 being sent along the channel c. Further, given a channel c the set {|c|} ⊆ Σ contains those events of the form c.x.

The simplest CSP process is the process STOP, which can perform no events. The process a → P offers the environment the event a ∈ Σ and then behaves like P. τ denotes a hidden action that is invisible to the environment. The process P □ Q offers the environment the choice of the events offered by P and by Q; the choice is not resolved by the internal action τ, since it is invisible to the environment. P ⊓ Q non-deterministically chooses which of P or Q to behave like. P ▷ Q initially behaves like P, but can timeout (via τ) and then behaves as Q.
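The operators above can be given a toy finite-trace semantics, in which each loop-free process is represented by the set of traces it can perform. The Python sketch below is purely illustrative (it is not how FDR represents processes); note that external and internal choice have the same trace sets, which is why the finer failures models exist:

```python
# Toy finite-trace semantics for STOP, event prefix and choice,
# assuming loop-free processes. Illustrative only, not FDR's model.

def STOP():
    return {()}                          # only the empty trace

def prefix(a, P):
    """a -> P: the empty trace, plus a prepended to each trace of P."""
    return {()} | {(a,) + t for t in P}

def choice(P, Q):
    """External and internal choice are indistinguishable in traces."""
    return P | Q

P = prefix("a", prefix("b", STOP()))     # a -> b -> STOP
Q = prefix("c", STOP())                  # c -> STOP
print(sorted(choice(P, Q)))
```

Under this semantics `choice(P, Q)` yields the traces of a → b → STOP together with those of c → STOP.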


INTRUDER(in, out) = in?x → (out!x → INTRUDER(in, out))
                    ⊓ in?_ → INTRUDER(in, out)

Figure 2: Intruder Behaviours

Recursive processes can be defined either equationally or using the notation μX · P. In the latter, every occurrence of X within P represents a recursive call. For example, Figure 2 specifies the behaviour of a particular kind of intruder that we will use later in the paper. The intruder's behaviour is modelled abstractly as listening to the transmissions over a channel in and then replaying them over a channel out before recursively behaving like INTRUDER again; or it non-deterministically ignores a transmission before recursively behaving like INTRUDER again. The process INTRUDER is parameterised by the channels in and out, so it is a general model of a replay attack requiring only in and out to be instantiated. The alphabet of INTRUDER will be whatever values can be communicated down the channels in and out once they are instantiated.

P A‖B Q allows P and Q to perform only events from A and B respectively and forces P and Q to synchronise on events in A ∩ B. P ‖_A Q allows P and Q to

run in parallel, forcing synchronisation on events in A and arbitrary interleaving of events not in A. The interleaving of two processes, denoted P ||| Q, runs P and Q in parallel but enforces no synchronisation. P \ A behaves as P but hides any events from A by transforming them into the internal event τ. This event does not synchronise with the environment and thus can always occur. P[[R]] behaves as P but renames the events according to the relation R. Hence, if P can perform a, then P[[R]] can perform each b such that (a, b) ∈ R, where the choice (if there is more than one such b) is left to the environment (like □).

Skip is the process that immediately terminates. The sequential composition of P and Q, denoted P ; Q, runs P until it terminates, at which point Q is run. Termination is indicated using the special tick event (denoted ✓): Skip is defined as ✓ and then STOP and, if the left argument of P ; Q performs a ✓, then P ; Q performs a τ to the process Q (i.e. P is discarded and Q is started).

Using the operations of parallel composition, renaming and hiding, a new operator of linked parallelism is defined in CSP. The effect of linked parallelism is to link two channels together so that the communications across the linked channels are not visible. For example, in Figure 3 the process INTRUDER of Figure 2 is instantiated with two channels GCSXMIT and ICPY. Using linked parallelism the INTRUDER is composed with the processes DLR, DECRYPT and RECVR. Each process is similarly parameterised by channels that are hidden by linking them, leaving only GCSXMIT and the communication


R2M.CMsgStruct.1 visible.

STREAM2 = INTRUDER(GCSXMIT, ICPY)
          [ICPY ↔ ICPY]
          (DLR(ICPY, D2D)
           [D2D ↔ D2D]
           (DECRYPT(Sk, D2D, D2R)
            [D2R ↔ D2R]
            RECVR(0, D2R, R2M.CMsgStruct.1)))

Figure 3: Composition of STREAM2
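The nondeterminism of the INTRUDER process of Figure 2 can be made concrete by enumerating, for a given finite input trace, every output trace the replay intruder can produce: each message heard on the in channel is either replayed on out or dropped. The following Python sketch is illustrative only (message names are hypothetical):

```python
# Sketch of INTRUDER's trace-level behaviour: every message heard on
# `in` is either replayed on `out` or nondeterministically ignored.
# Enumerating all keep/drop decisions yields all possible outputs.

from itertools import product

def intruder_outputs(inputs):
    """All output traces the replay intruder can produce for `inputs`."""
    results = set()
    for keep in product([True, False], repeat=len(inputs)):
        results.add(tuple(m for m, k in zip(inputs, keep) if k))
    return results

print(sorted(intruder_outputs(("m1", "m2"))))
# For two messages the intruder may output both, either one, or nothing.
```

FDR explores exactly this kind of branching when checking a composition such as STREAM2, but symbolically over the labelled transition system rather than by explicit enumeration of traces.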

2.2. The semantics of CSP

The simplest approach to giving meaning to a CSP expression is by defining an operational semantics. The operational semantics of a CSP process naturally creates a labelled transition system where the edges are labelled by events from Σ ∪ {τ, ✓} and the nodes are process states. Formally, a Labelled Transition System is a 3-tuple consisting of a set of nodes, an initial node, and a relation −a→ on the nodes: i.e. it is a directed graph where each edge is labelled by an event. The usual way of defining the operational semantics of CSP processes is by presenting Structured Operational Semantics (SOS) style rules in order to define −a→. For instance, the operational semantics of the external choice operator are defined by:

    P −a→ P′                      P −τ→ P′
    ─────────────                 ─────────────────
    P □ Q −a→ P′                  P □ Q −τ→ P′ □ Q

with symmetric rules for Q.

CSP also has a number of denotational models, such as the traces, stable failures and failures-divergences models. In these models, each process is represented by a set of behaviours: the traces model represents a process by the set of sequences of events it can perform; the failures model represents a process by the set of events it can refuse after each trace; the failures-divergences model augments the failures model with information about when a process can perform an unbounded number of τ events. Two processes are equal in a denotational model iff they have the same set of behaviours. If every behaviour of Impl is a behaviour of Spec in the denotational model X, then Spec is refined by Impl, denoted Spec ⊑_X Impl.

2.3. FDR

FDR is a refinement checker for CSP. As input, FDR takes a file written in a textual form of CSP describing a number of processes, and can then verify if

one process refines another, according to the CSP denotational models. When a refinement check fails, FDR provides a counterexample that shows a behaviour of the system that violates the required refinement property. FDR is also capable of checking other properties such as deadlock freedom, determinism, and divergence freedom by automatic reduction to a refinement check.

All versions of FDR have operated in a similar way, first developed by Jackson and Roscoe [16]. In order to check if P ⊑_X Q in the denotational model X, P and Q are converted into Labelled Transition Systems using CSP's operational semantics. The refinement check is then performed by considering the product of these Labelled Transition Systems, and performing either a depth- or breadth-first search over the product. In each state of the product, various properties are checked according to the denotational model that the refinement check is being performed in. All versions of FDR have also been explicit, in the sense that data values are explicitly represented rather than being represented by symbolic values; the latter is still very much a research topic.

2.3.1. CSPM

CSPM combines the CSP process algebra (in an essentially general way) with a lazy functional programming language that permits general purpose models to be constructed easily. The generality of CSPM is one of the principal strengths of FDR, and is one of the key features that has enabled FDR to tackle such a variety of problems. For example, the CSPM process defined in Figure 4 uses the power of the functional language, CSP and the type checker built into FDR3. It declares a process EXTERNAL_recv with six parameters. The let .. within construct of the functional language declares a local identifier Input_Channel to have the value of queue in the scope of the within part. A value Input_Value is communicated across the channel Input_Channel and then another let ..
within construct is used to declare Indirections1, set to the value given by applying a built-in function called mapUpdate. The function mapUpdate inserts a key-value pair (in this case x_v' and Input_Value) into the supplied map (in this case Indirections), overwriting the current value if the specified key is already in the map. The new map Indirections1 is used in the within part as a parameter to the process P supplied as a parameter of EXTERNAL_recv. EXTERNAL_recv models a C function recv (with a simple pointer parameter) used to communicate external values to the C code that calls recv. The parameter Q is a continuation for the process P, i.e. it enables P's future behaviour to be described in terms of Q. The type checker of FDR infers the appropriate types in the context of the use of EXTERNAL_recv.

2.3.2. Evolution of FDR

FDR2 was notable for its support for supercombinators. These provide a general way of combining a number of Labelled Transition Systems using rules that indicate how transitions from different Labelled Transition Systems should interact. For example, a rule may specify that the supercombinator performs the event a in the case that process 1 performs the event b and process 2 performs the event c. The main advantage of supercombinators is that it makes

EXTERNAL_recv(x_v', queue, P, i, Indirections, Q) =
    let
        Input_Channel = queue
    within
        Input_Channel?Input_Value ->
            let
                Indirections1 = mapUpdate(Indirections, x_v', Input_Value)
            within
                P(i, Indirections1, Q)

Figure 4: Definition of an external service recv

it easy to support networks that are built from a complex combination of CSP operators without incurring any performance impact. For example, using supercombinators, renaming and hiding are essentially free.

FDR2 pioneered the usage of compression [39], which has proven an immensely valuable technique for analysing large complex CSP systems. Compression functions take in a Labelled Transition System and return a second Labelled Transition System that is semantically equivalent (in the relevant denotational model) but, hopefully, contains fewer nodes. For example, the normalisation algorithm outlined above is a compression function (and frequently performs well), as is strong bisimulation. The work on compression was instrumental in enabling industrial exploitation of FDR.

FDR2 also incorporated a surprisingly useful function in the form of chase, which is a form of partial-order reduction. The function chase(P) behaves like P, except that τ events are forced to occur in an arbitrary order until a stable state is reached. This has the effect of only considering one possible ordering of the τ actions. This remarkably simple function was first developed in order to support analysis of security protocols.

FDR3 [15] was a complete rewrite of FDR; for example, it addressed errors FDR2 gave along the lines of "is not a set" (with no attached line number). A type checker for CSPM was built into FDR3. Compared to FDR2, FDR3's major feature is its speed: it is typically three times faster on a single core but is also capable of using multiple cores. Further, when running on multiple cores and checking models in the traces and failures models, it scales linearly [17] (i.e. twice as many cores cause FDR to take half the time). FDR3 also incorporates a cluster version that is also capable of scaling linearly as more nodes are added to the cluster [17]. Few other model checkers are capable of scaling in such ways.
One particularly notable experiment involved the use of FDR3 on a cluster of 64 16-core servers running on Amazon’s EC2 service, which managed to model check a problem with over 1 trillion states.


3. C

The origins of the C subset go back to the late 1990s, when the pressure to use the C programming language for safety related, and to some extent safety critical, applications was growing. This led to a review of the then Defence Evaluation and Research Agency's position on the use of C, and to a period of research into pragmatic restrictions, which could be adopted immediately [19], to reduce potential safety clearance risks, and longer term development of a language subset of C for safety critical applications. Experience with the development of the SPARK language and the definition of its formal semantics [3] later led to the development of a concrete syntax for a restricted subset of C called "Restricted C". Restricted C was intended to be the basis for the development of a formal semantics and a subsequent verification tool to support its use in safety critical applications.

Before any formality was added to the language, a period of review, validation with MOD projects and simple maturing of the subset took place. During this period it became clear that manual development of software was being displaced by commercial automatic code generation tools. These commercial tools implicitly defined a subset of C that did not necessarily conform to the concrete syntax of Restricted C, but some of them could still be verifiable and usable for safety critical applications. For this reason the C subset was developed. The key insight was to abstract away from the concrete syntax of Restricted C to orientate the language definition more towards semantic concerns that directly support formal verification.

When enough confidence in the C subset was gained, work on formalisation took place. A formal semantics for the subset was defined and its validation through a prototype verification tool took place in 2007. The formal semantics, which will support the assurance of verification tool development by third parties, will be published in the near future.
It would be inappropriate to publish the formal semantics here because it requires a careful exposition in its own right. The C subset, and in particular Restricted C, are different from other proposed subsets of C (such as Clight [29], C0 [42], Cyclone [18] and Caduceus [11]) because both were designed with safety policy and usability as well as formal verification in mind. A brief discussion of the design criteria for Restricted C and the C subset, plus potential evolution of the subset, is now given.

3.1. Predictability

The main difference between the C subset and MISRA C:1998 is that side-effects are not allowed in expressions. For example, i + j++ is allowed in MISRA C:1998 (as the value of the expression is the same under any order of evaluation of the sub-expressions) but is illegal in the abstract syntax of the subset and the particular concrete representation of Restricted C. The reason the expression i + j++ is illegal is that tool support would be required to check that the value of the expression is the same under any order of evaluation of the sub-expressions, and this leads to a combinatorial explosion of possible evaluations depending upon the number of sub-expressions. The removal of side-effects from expressions has

been largely achieved syntactically. The expression i + j++ is syntactically illegal in the subset. This has been done by separating expressions and statements. The removal of side-effects from function calls in expressions has not been achieved syntactically; this requires semantic enforcement. In the subset, expressions must not have side-effects, and all non-null statements must have side-effects. Interestingly, this separation of expressions and statements has removed the well known confusion in C between = and ==; for example, if (x = 1) ... and x == 2; are both syntactically illegal.

The concrete syntax with semantic restrictions achieves three things. The first is the removal of ambiguities from C. This includes both syntactic ambiguities (the dangling else problem) and semantic ambiguities (e.g. order of evaluation of sub-expressions). The second is the removal of features which, although unambiguous, are confusing for the programmer (e.g. the confusion between = and ==). The third is the removal of features which make formal reasoning harder (e.g. side-effects in expressions).

A static analysis tool that checks for undefined behaviour, such as divide-by-zero, is required in order to ensure predictability. Tools such as Polyspace [27] or ASTRÉE [8] could be used to perform the check.

3.2. Simplicity

Simplicity of a language tends to lead to simpler semantics that are easier to understand and reason about. Inevitably there is a tension between the simplicity of a language and its expressiveness or utility. The C language has primitive types and very limited type constraints; this provides great flexibility, but also programs that are difficult to understand correctly. The syntactic and semantic constraints required for predictability also impose some simplicity on the subset of C. The subset has been designed with simplicity in mind. The design philosophy is that the language can evolve, with layers added as necessary.

3.3. Expressive Power

The key issue is whether the subset is expressive enough for critical applications. Clearly the expressiveness of the full C language makes it useful for more applications than the subset; for example, the subset would be much more difficult to use for the implementation of operating systems. The restrictions to the C language embodied in Restricted C (which is slightly more constrained than the subset) were found to be usable by the Replacement Maritime Patrol Aircraft project, and guidelines based on Restricted C were input into the JSF project and Eurojet for Eurofighter Typhoon (in fact the subset finally adopted by Eurojet was even more restrictive than Restricted C). This experience indicates that the subset is expressive enough for embedded avionics software.


3.4. Verification

A tool called QCZ [4] processes scripts consisting of formal refinement steps from specification statements, expressed in Z, to code written in the C subset. For each refinement step, Verification Conditions are generated which, if proven, ensure the correctness of the code with respect to the semantics of C and Z. The Verification Conditions can be proven using the ProofPower theorem prover [37] for Z, since all the Verification Conditions are expressed in Z. A feature of the C language is that aspects of it are implementation specific, i.e. they depend upon the compiler for a target machine. To accommodate this feature of C, the QCZ tool allows the implementation specific aspects to be defined.

The QCZ tool has been successfully applied to signal conditioning code that had been extensively tested. The code that was the subject of QCZ verification had already been developed with good processes and with extensive testing, so most of the code examined was correct. Despite this, QCZ was able to demonstrate an error, and this is in line with expectations for the use of the subset [34]. The case study indicates that the subset can be used for realistic embedded software that can be formally verified against a formal specification.

3.5. Differences from ANSI-C

1. A translation unit (in the sense of a C input file) is defined as a sequence of paragraphs interleaved by narrative text. To avoid confusion, the non-terminal that was called translation unit has been renamed external declaration list, since it is used for an arbitrary subsequence of the declarations and function definitions in a C input file.
2. The entry point for the grammar is now the non-terminal paragraph, which is either an external declaration list or a refinement step.
3. Introduction of the body of a function can be deferred, i.e., the body in the function definition is given as a label that is later associated with the code of the body in a refinement step.
4. A function prototype may now include a formal specification of the function (as a function specification opt in a direct declarator or type direct declarator).
5. A statement may be given a formal specification (in the specification statement option for statement), which can subsequently be refined formally or informally (in a refinement step).
6. There is an option to recognise certain C comments as annotations that are recorded by the C Compliance Tool for use by other tools. Annotations are treated syntactically as a form of statement and are treated semantically as null statements.
7. A Z predicate or expression may now be used as an expression (primarily for use in function specifications and specification statements, but also for experimental purposes).


3.6. C Conformance

The subset enforces no stylistic guidelines. The abstract syntax used in its semantic processing is where the vital semantic constraints become important. The prototype C Compliance Tool (discussed in the next section) includes a conformance checker. This takes as input the full C Compliance Notation syntax, as described in the next section, and checks for conformance with the rules of the subset.

4. The C Compliance Notation

The C Compliance Notation (CCN) allows C programs to be presented in a literate programming style in which the order of presentation of program fragments is chosen by the writer rather than fixed by the C syntax rules. The C subset is used as part of the C Compliance Notation. The semantics of C are themselves expressed in Z as an operational semantics, based on work by Norrish formalised in the HOL system [31]. The operational semantics of C in Z informed the formalisation in ProofPower, but they were not used directly. The formalisation discussed later in section 4.1 uses a weakest precondition semantics for the C subset, along with a translation of declarations and expressions into Z (discussed in section 6.1). The formalisation is comparable to Frama-C; however, it has not been designed to act as a framework for other analysis tools.

A program presented in the C Compliance Notation may be interspersed with formal specifications of the program's behaviour written in the specification language Z. If the program conforms to the rules given in this language description, a Z document can be produced automatically from its presentation in the C Compliance Notation. This Z document contains Z paragraphs representing the types, objects and functions declared in a program, together with conjectures, known as Verification Conditions, whose proof constitutes a correctness proof for the program against its specification.

The level of mathematical rigour in a C Compliance Notation script is under the user's control. At one extreme, no formal material at all need be included; at the other extreme, every subprogram can be formally specified and verified. Most practical uses of the Compliance Notation will lie between these extremes. A construct is not handled formally if the C Compliance Notation allows the syntax for the construct in parts of the script which do not have a formal specification, but does not support formal reasoning about the construct.

The C Compliance Notation is supported by the C Compliance Tool, an extension to ProofPower. In addition to syntax-checking, type-checking and document preparation functions, the Compliance Tool supports extraction of the C program from a Compliance Notation script and the generation of the Z document. All the facilities of ProofPower are available for proving Verification Conditions. These facilities are augmented with a range of theorems and proof procedures which are customised for Verification Condition proofs. The C Compliance Tool includes a facility, rccc, for checking the conformance of a C source file with the rules of the subset.


4.1. Semantics of the C Compliance Notation

The semantics of the C Compliance Notation, CCN, are defined by the following:

1. The CCN Toolkit of Z definitions, defined with respect to the formal semantics of C.
2. A scheme for translating C expressions into Z using the vocabulary of the CCN Toolkit.
3. A scheme for translating C declarations into Z.
4. A Verification Condition generation algorithm which, given pre- and postconditions specifying a state change in a C program, and a C statement, generates verification conditions which, if true, imply that the C statement satisfies the specification.

The translation of expressions and declarations is shown by example in section 6.1 later.

5. Verification Condition Generation

Verification Condition generation is defined in terms of a weakest precondition algorithm. The Verification Conditions generated for a statement S against pre- and post-conditions A and G are logically equivalent to (a suitable universal closure of) A ⇒ B, where B is the weakest pre-condition WP(S, G) for G through S. In this section, we give a schematic definition of the function WP for each statement form.

Assignment Statement. The assignment statement is split into special cases. The first is that of a simple variable, i.e. a variable on the left hand side of an assignment that does not involve any explicit pointer operations. To deal with explicit pointer operations (or their absence) we need a way of denoting the store, i.e. memory, which a pointer accesses. The store is denoted by σ.
If T is a simple variable and the store σ is not mentioned in G then

WP(T = E, G) ≡ G[E′/T′]

If T is a simple variable with an array index of known size then

WP(T[I] = E, G) ≡ WP(T = T ⊕ {I ↦ E}, G)

If T is a simple variable with a structure member selection M then

WP(T.M = E, G) ≡ WP(T = StrUpd C (T, M, E), G)

If T does not fall under the above special cases, the assignment is transformed into an equivalent specification statement asserting the final value of the store σ:

WP(T = E, G) ≡ WP(Δ T [true, σ′ = Assign(T, E) σ], G)

Δ followed by a set of identifiers denotes the frame of identifiers whose values can be changed by a specification statement; any identifiers not in the frame cannot be changed by the specification statement.

Specification Statement. The weakest pre-condition calculation for a specification statement is as follows:

WP(Δ w [pre, post], G) ≡ pre[w′/w] ∧ (∀ W″ • post[w″/w′][w′/w] ⇒ G[w″/w′])

Here, w denotes initial variables and w′ final variables; w′/w denotes substituting final variables for initial variables. The quantification over W″ is over a set of allocation schemas asserting that the program variables w are validly allocated in the store σ.

If Statement. An if statement has weakest pre-condition as follows:

WP(if (C) S1 else S2, G) ≡ (C[v′/v] ⇒ WP(S1, G)) ∧ (¬C[v′/v] ⇒ WP(S2, G))

In the above, C[v′/v] means the result of adding a prime to each variable in C.

Switch Statement. A switch statement is called “structured” if it satisfies the rules of C (so that its body is a compound statement, with no declarations, each of whose statements is a labelled compound statement terminating in a break statement, the last of these statements being labelled with the keyword default). Unstructured switch statements cannot be handled formally. The weakest pre-condition calculation for a structured switch statement is similar to that for an equivalent iterated if-statement, but using membership assertions, V ∈ {A, B, ...}, to abbreviate multiple equations.

Do Statement. If the body of a do statement is a specification statement, the weakest pre-condition is calculated as follows:

WP(do Δ w [pre, post] while (C), G) ≡ pre[w′/w] ∧ (∀ W″ • post ∧ ¬C ⇒ G)

If the body of the do statement is not a specification statement, the body is ignored, and the weakest pre-condition is as above with pre and post both taken as true.

While Statement. While statements are transformed into an equivalent if statement whose body is a do statement.

For Statements. For statements are transformed into an equivalent while statement.

Expression Statements. Expression statements other than assignments are handled using the following program transformations:

WP(T op= E, G) ≡ WP(T = T op E, G)
WP(T++, G) ≡ WP(T = T + 1, G)
WP(T−−, G) ≡ WP(T = T − 1, G)
WP(++T, G) ≡ WP(T = T + 1, G)
WP(−−T, G) ≡ WP(T = T − 1, G)

These transformations are valid given that the expression T has no side effects.

Labelled Statements. The label has no effect: WP (L::S , G) ≡ WP (S , G)

6. Z

The formal notation Z [43] is based on the standard mathematical notation used in axiomatic set theory, lambda calculus, and first-order predicate logic. All expressions in the Z notation are typed, thereby avoiding some of the paradoxes of naive set theory. Z includes a mathematical toolkit of commonly used mathematical functions and predicates. It is not appropriate to try to describe the whole of the Z notation in this paper; however, it is worth taking a little time to describe the use of Z to link C code to a representation in CSPM.

An example of the use of Z in this paper is the definition of DoLoop0 in Figure 5. It is introduced as an axiomatic definition, i.e. it has global scope across the whole of the Z document. The axiomatic definition is of a function from a set of pairs of values and locations to itself. The sets of values and locations have been defined as part of the semantics of expressions and declarations in the C language, which is discussed in the next sub-section. The semantics of the function DoLoop0 is given by the equality to the expression on the right hand side, involving a function DoLoopBody0 and a set DoLoopTest0 that are combined using predefined Z operators.

The function DoLoop0 is a model of a Do .. While loop in C. The transformation of the program state, modelled by the function DoLoopBody0, is composed with its own reflexive transitive closure, constrained by the set of states admitted by the While loop test and the set of states that result if the loop terminates. If the loop cannot terminate then the set of states it can terminate in is the empty set, and therefore DoLoop0 is the empty function. In Z the domain restriction operator constrains the reflexive transitive closure of DoLoopBody0 (treated as a relation) to the set of states that are possible on entry to the recursion.
The co-domain restriction operator constrains the reflexive transitive closure to the set of states possible if termination occurs. The axiomatic definition of Figure 5 is an example of what an SML functional program, called FSG, automatically generates when it processes a Do .. While loop in C. The definition of the function DoLoopBody0 is also automatically generated as an axiomatic definition by FSG. FSG generates these definitions in the context of a C compliance script, which it also generates, that links a top level specification to the C code the specification was generated from. A compliance tool, called QCZ, processes the compliance script, using the Z generated by FSG together with Z that QCZ itself generates, in order to generate Verification Conditions. The Verification Conditions establish the validity of the top level specification with respect to the C code it was generated from. In the next section a simple example of a script in the C Compliance Notation, CCN, being processed by QCZ is shown to illustrate the generation of Verification Conditions.


Z

DoLoop0 : VALUE C × LOC C → VALUE C × LOC C

∀ in V : VALUE C ; MAVLINK local x L : LOC C •
  DoLoop0 (in V , MAVLINK local x L ) =
    (DoLoopBody0 o9 (DoLoopTest0 ◁ DoLoopBody0 )∗ ⩥ DoLoopTest0 )
      (in V , MAVLINK local x L )

Figure 5: FSG generated Z axiomatic definition of DoLoop0

6.1. Processing a C Compliance Notation script

The C Compliance Tool, called QCZ, has two main functions: generation of Z from C expressions and declarations in order to produce Verification Conditions; and C conformance checking. In the rest of this section we illustrate Z generation using simple examples. A C translation unit is introduced by the command new_translation_unit, which prepares the tool to input C source and creates a new theory to hold Z definitions associated with the file scope in the translation unit.

SML

new_translation_unit "stack.c";

C declarations and function definitions can be entered in C Compliance Notation, CCN, paragraphs such as:

C Compliance Notation

enum {STACK_MAX = 50};

When paragraphs are entered they are parsed and stored for future processing. To invoke the semantic processing, the function generate_z is used:

SML

generate_z();

If print_z_document is now called, then QCZ introduces a Z global variable, STACK MAX e , to represent the value of the enumeration constant STACK_MAX.

SML

print_z_document "-";

(The parameter of print_z_document is the name of the theory you wish to print, or a minus sign to mean the current theory.)

print z document output

STACK MAX e =  IntVal C 50


At any time, you may use C quotation symbols to translate C expressions into Z; for example, if you type in:

C  : STACK MAX + 1 ;

then the tool will respond with:

val it = Z IntVal C (mk signed int C (IntOf C STACK MAX e + 1 )) : TERM

which is the translation of the C expression STACK MAX + 1. Continuing the example, two global variables are introduced that will represent a stack:

C Compliance Notation

int stack[STACK_MAX], stack_index;

SML

generate_z();

Examining the Z document shows that the QCZ tool has introduced global variables stack l and stack index l , which are bindings of type LOC C representing the address and type of the variables.

print z document output

stack l , stack index l : LOC C

stack l .type = ArrayType C (IntOf C STACK MAX e , ArithmeticType C (Signed C Int C ));
stack index l .type = ArithmeticType C (Signed C Int C )

Now suppose a function to reset the state of the stack variables is introduced:

C Compliance Notation

void reset(void)
Δ stack index
[ stack index v ′ = IntVal C 0 ]
{ stack_index = 0; }

Invoking the semantic processing:

SML

generate_z();

informs us that a new theory “reset” has been created and a Verification Condition has been generated and stored in it. If the Verification Condition

is true (which it is), then the body of the function is a correct refinement of the specification in the function prototype. Applied to this theory, print_z_document will now show us the Verification Condition.

SML

print_z_document "reset";

print z document output

vc 1 ? IntVal C 0 = IntVal C 0

For the function pop, a specification statement is used to specify the value assigned to the variable that is returned:

C Compliance Notation

int pop(int x)
{
    int result;
    Δ stack index , result
    [ IntOf C stack index v > 0 ,
      result v ′ = ArrayOf C stack v (IntOf C stack index v − 1 )
      ∧ IntOf C stack index v ′ = IntOf C stack index v − 1 ]
    POP 1 ;
    return result;
}

Generating the Z for this function creates a new theory “pop” containing global variables representing the addresses and types of the parameter and the local variable.

SML

generate_z();

The specification statement labelled POP 1 is refined, resulting in two Verification Conditions stored in the theory “pop”.

C Compliance Notation

POP 1

if (stack_index > 0)
{
    stack_index -= 1;
    result = stack[stack_index];
}

SML

generate_z();

There is a facility allowing the introduction of the function body in a function definition to be deferred to a later step. This means the body of the

function can be developed without needing write access to the theory “stack.c”, so that functions in one translation unit can be developed in separate ProofPower databases with a common parent containing the theory associated with the translation unit. The development of the push operation will illustrate this. To defer the introduction of the function body, we give a label, PUSH BODY in this example, in place of the function body: C Compliance Notation

void push(int x)
Δ stack index , stack
[ IntOf C stack index v < IntOf C STACK MAX e ,
  IntOf C stack index v ′ = IntOf C stack index v + 1
  ∧ ArrayOf C stack v ′ (IntOf C stack index v ) = x v ]
PUSH BODY

When we generate the Z for this function definition, the function declaration is recorded in the current context, but the new theory for the function is not created and no Verification Conditions are generated.

SML

generate_z();

We now provide the body of the function in a refinement step. Typically, this would be done in a child database after first opening the theory for the translation unit:

SML

open_theory "stack.c";

C Compliance Notation

PUSH BODY

{
    if (stack_index < STACK_MAX)
    {
        stack[stack_index] = x;
        stack_index += 1;
    }
}

Generating the Z for this step causes the theory “push” to be created and populated with a global variable representing the address and type of the parameter, and a Verification Condition for the refinement of the function specification statement by the function body.

SML

generate_z();

This completes the example of using the Z Generation facilities for the C Compliance Notation, CCN. In the next section an example of an encrypted communication sub-system implemented in C is introduced.

Figure 6: SMACCMPilot communication channels

7. The System

The example system requirement used to illustrate the approach proposed in this paper is the absence from the system's behaviour of successful security attacks. Such security attacks would be typical of penetration testing. The difference is that the model checking approach provides a more comprehensive examination of realistically exploitable vulnerabilities for an attacker outside a system.

The simplified example system is based upon a version of SMACCMPilot [22] developed under the DARPA High Assurance Cyber Military Systems (HACMS) project [6]. SMACCMPilot is open-source autopilot software for small unmanned aerial vehicles (UAVs) built using new high-assurance software methods; the simple C code used in the example is not from SMACCMPilot, and the exploits present in the initial unhardened code are no longer present. The relevant part of the system for the requirement of interest is the encrypted pairwise communication between a ground station and the SMACCMPilot, and between a safety controller and the SMACCMPilot, illustrated in Figure 6.

Within the quadcopter's software there are two channels: Stream1 and Stream2. Stream1 consists of: an internal buffer (for the packets transmitted from the ground station); a decryption function; and a process that assembles the message for processing according to the MAVLINK protocol [21]. A diagram representing Stream1 and Stream2 appears in Figure 7. The design of Stream1 is in terms of 3 reactive processes, which are internal

Figure 7: Diagrammatic representation of communication to the quadcopter

(STREAM1 [| {|GCSXMIT|} |] STREAM2) [| {| R2M |} |] MAVLINK(R2M)

Figure 8: CSPM composition of communication streams with MAVLINK

to the quadcopter, plus an external process representing communication from the Ground Control Station, GCS. The internal processes use 2 system services (send and receive) to communicate with each other. The 3 internal processes are named DLR, DECRYPT and RECVR. DLR takes the data from GCS and sends it onward to be decrypted. DECRYPT decrypts the data and sends it onward to be dealt with by RECVR, before it is sent onwards to the MAVLINK protocol software.

The architectural composition of the streams, shown in Figure 7, is represented in the machine processable dialect of the process algebra of Communicating Sequential Processes, CSPM, discussed in section 2.3.1. In Figure 8 STREAM1 and STREAM2 communicate through the messages over the channel GCSXMIT. The channel GCSXMIT represents wireless communication that can be openly observed even if the data transmitted is encrypted. MAVLINK communicates with STREAM1 and STREAM2 through the messages over R2M. In CSPM the composition of STREAM1 is in terms of the 4 processes GCS, DLR, DECRYPT and RECVR, shown in Figure 9. The messages over GCSXMIT consist of symbolic values for a key (denoted by Sk), a sequence number and data. The messages over R2M consist of a sequence number and data. The messages over D2D (like GCSXMIT) consist of the symbolic values for a key, a

STREAM1 =
  GCS(Sk, GCSXMIT)
    [| {|GCSXMIT|} |]
  ( DLR(GCSXMIT, D2D)
      [D2D ↔ D2D]
    ( DECRYPT(Sk, D2D, D2R)
        [D2R ↔ D2R]
      RECVR(0, D2R, R2M.CMsgStruct.0) ) )

Figure 9: Composition of STREAM1

void GCS(int Key, chan *out)
{
    SMsg GCS_local_message;
    GCS_local_message.Key = Key;
    GCS_local_message.cmsg.SeqNo = 0;
    GCS_local_message.cmsg.msg.val = 0;
    for (;;) {
        GCS_local_message.cmsg.SeqNo = (GCS_local_message.cmsg.SeqNo + 1) % 4;
        send(&GCS_local_message, sizeof(GCS_local_message), out);
    }
    return;
}

Figure 10: The source of the GCS C function

sequence number and data. The C code for the function GCS is shown in Figure 10. Note that the function GCS does not terminate because the ‘for’ loop, which calls a service to send a packet onward, does not terminate. The simplifications of functionality do not detract from the points this paper makes about architectural composition, abstraction and verification of system properties.

The composition of STREAM2 is similar to STREAM1, but a process modelling an intruder’s behaviour replaces the process GCS in Figure 3. In Figure 2 the intruder’s behaviour is modelled abstractly as listening to the transmissions from the ground station and replaying them, or non-deterministically ignoring a transmission.

At this point all that has been done is to model the system’s architecture in CSPM. The functionality of the components of the architecture needs to be abstracted from the source code. Directly translating the behaviour of the source code as CSPM introduces the problem of computing intermediate states before

int MAVLINK(chan *in)
{
    SMsg MAVLINK_local_x;
    do {
        recv((SMsg *)&MAVLINK_local_x, sizeof(MAVLINK_local_x), in);
    } while (1);
    return 1;
}

Figure 11: The source of the MAVLINK C function

the final state of a terminating function. Accommodating intermediate states greatly increases the overall number of states for a model checking tool. To address this problem a predicate representing the final state is calculated and the correctness of this calculation is proven within a theorem proving environment. For functions that do not terminate, i.e. implementations of reactive processes, the same approach is taken. The limitation is that reasoning about reactive behaviour cannot be done within the theory; hence the translation to CSPM, where reactive processes can be reasoned about. The next section addresses this refinement based approach to abstracting the code to derive the behaviours of GCS, DLR, DECRYPT, RECVR and MAVLINK. The overall system composition, as represented by the architectural composition of Figure 8, can then be subjected to a refinement check of specified external “attack” properties.

8. Abstracting the implementation into CSPM

8.1. Generating a functional specification

MAVLINK is used to illustrate the abstraction of the C code (shown in Figure 11) to a postcondition. The postcondition is then translated into CSPM as a process transforming the state and calling system services denoted by events specified as part of the system architecture represented in CSPM. The code shown in Figure 11 simply acts as a sink for the decrypted messages; the extra functionality of a real implementation would be incorporated into the postcondition.

A tool called FSG (developed by D-RisQ Ltd. under the DARPA HACMS project) generates the strongest postcondition for a C function in a manner similar to that discussed in [35]. The tool, implemented on top of ProofPower’s QCZ tool (discussed in section 6.1), generates as output the C Compliance Notation, CCN. When FSG is run on the C function of Figure 11 it generates the specification statement shown in Figure 12.
The specification states that if the “do loop” within MAVLINK terminates then the result is a relational closure called ‘DoLoop0’. Note that neither parameter to ‘DoLoop0’ is changed by ‘DoLoop0’. If the loop does not terminate

C Compliance Notation

int MAVLINK (chan ∗in)
Z: MAVLINK local x loc : LOC C •
Ξ [ ((int) BC (if (DoLoopTest0 ◁ DoLoopBody0 )∗ ⩥ DoLoopTest0 = ∅
      then (in v , MAVLINK local x loc ) = DoLoop0 (in v , MAVLINK local x loc )
      else true))
    && MAVLINK ! == (1 ) ]
MAVLINK 1

Figure 12: FSG output for MAVLINK C function

then the behaviour is completely underspecified. The point of the predicate is not to reason about it directly to establish some assertion; it is there to record sufficient information in order to translate into CSPM.

8.2. Soundness of the abstraction

The soundness of the abstraction is established by recording the abstraction steps and presenting them as a forwards stepwise refinement that can be processed by ProofPower’s QCZ tool to generate verification conditions (discussed in section 6.1). The verification establishes that the original C code is correct with respect to the specification statement generated by FSG. For example, Figure 13 is a refinement of Figure 12; and Figure 14 is a refinement of Figure 13.

An example of a verification condition is shown in Figure 15. The verification condition shown is the second of two generated for the first refinement step of Figure 13. It is of the form A ⇒ (A ⇒ B) and can therefore be immediately simplified to (A ⇒ B). Informally the verification condition requires a proof that, under the assumption of the boolean condition of the “if then else”, the binding

(MAVLINK local x loc ′ ≙ MAVLINK local x loc , MAVLINK v ! ≙ IntVal C 1 , in v ′ ≙ in v , σ′ ≙ σ)

is an element of the postcondition for MAVLINK. The verification conditions can be proven within the theorem prover ProofPower.


C Compliance Notation

MAVLINK 1

{
  SMsg MAVLINK local x ;
  Δ [ 1 ,
      ((int) BC (if (DoLoopTest0 ◁ DoLoopBody0 )∗ ⩥ DoLoopTest0 = ∅
        then (in v , MAVLINK local x l ) = DoLoop0 (in v , MAVLINK local x l )
        else true)) ]
  MAVLINK 2
  return 1 ;
}

Figure 13: First refinement step

The advantage of dealing with partial correctness is clear for non-terminating loops. The underlying semantics of a loop in QCZ is that the body of the loop is repeatedly executed in states satisfying the loop condition until a state that does not satisfy it is reached. In the case of non-termination the result is the empty relation. There is no obligation to assign any further meaning to non-terminating execution as far as verification of the code is concerned. However, in the generated specification the behaviour is left underspecified if the loop does not terminate, leaving the model checker for CSPM to resolve it. The small number of data values required to express the examples of cyber attack properties in this paper means that the model checker is able to do this.

8.3. Translating to CSP

The translation from the output of the FSG function into CSPM has been specified in Z. The subset of Z that is the output of FSG is represented as an Abstract Syntax that is naturally described by a recursive Z freetype. Similarly the subset of CSPM used for the translation is also represented as an Abstract Syntax and described by another recursive Z freetype. The translation function specified in Z is defined denotationally over the Abstract Syntax representing the output of FSG.

The output of FSG is essentially a predicate representing the state reached by executing the C code being abstracted. The predicate has the form of “For


C Compliance Notation

MAVLINK 2

do {
    recv((SMsg *)&MAVLINK_local_x, sizeof(MAVLINK_local_x), in);
} while (1);

Figure 14: Second refinement step

all” quantified variables “satisfying” a conjunction of equalities “such that” a predicate about communicating with the environment holds. The intuition behind the translation function is that: the quantified variables appear as some of the parameters to a CSP process; the conjunction of equalities appears in the “let .. within” clause of a CSPM process; and the predicate concerning communication is represented by events, with suitable control flow like “if .. then .. else”. State is recorded by the parameters passed between processes, including map functions to deal with limited use of pointers. Calls to other C functions are treated as continuation processes. If no communication ever takes place and the C code terminates, the process SKIP occurs.

To illustrate the translation we return to Figure 12, which is translated into CSP as the process MAVLINK, shown in Figure 16. The communication channel ‘in’ is a parameter of MAVLINK, reflecting the channel ‘in’ passed as a parameter to the C function MAVLINK in Figure 11. As the name implies, the “do loop” in the C code executes the body of the loop before the loop guard is evaluated. In this example the loop guard is true, therefore the body is executed indefinitely. The call of the external function recv used as part of the definition of DoLoopBody0 in Figure 19 means that the behaviour is that of a reactive system.

The translation of DoLoop0 into CSPM is shown in Figure 17. It models the predicate equalities by use of a let clause and then evolves into a process that models the externally provided function of a receive service. The external C function recv has three arguments. Two are recv__x (corresponding to the address of a local variable) and recv__queue (corresponding to the channel ‘in’), but the sizeof argument is not used in the human generated specification of recv and is therefore ignored.
The remaining parameters of the CSP process EXTERNAL recv are used to supply continuation information for subsequent behaviour. The augmented definition of EXTERNAL recv was shown in Figure 4. The argument Indirections is a map of the Haskell-like functional language that augments the process algebra of CSP in CSPM. The map Indirections models simple pointers, such as those used for pass by reference in a function call. The process EXTERNAL recv is defined by a human as part

C Compliance Notation

vc 2
∀ MAVLINK local x l , MAVLINK local x loc : [addr : Z; type : TYPE C ];
  [in v : VALUE C ; σ : STORE C | (in v , in l ) AllocatedIn σ] •
BC (if (DoLoopTest0 ◁ DoLoopBody0 )∗ ⩥ DoLoopTest0 = ∅
    then (in v , MAVLINK local x loc ) = DoLoop0 (in v , MAVLINK local x loc )
    else true) = IntVal C 0
⇒ (BC (if (DoLoopTest0 ◁ DoLoopBody0 )∗ ⩥ DoLoopTest0 = ∅
      then (in v , MAVLINK local x l ) = DoLoop0 (in v , MAVLINK local x l )
      else true) = IntVal C 0
  ⇒ (MAVLINK local x loc ′ ≙ MAVLINK local x loc , MAVLINK v ! ≙ IntVal C 1 ,
      in v ′ ≙ in v , σ′ ≙ σ) ∈ MAVLINK post )

Figure 15: Example of a Verification Condition

MAVLINK(in) = DoLoop0(in, (| |), SKIP)

Figure 16: Definition of the CSPM process MAVLINK


DoLoop0(in_V, Indirections, Q) =
  let
    recv__x = MAVLINK_local_x_L
    recv__y = SizeOf(SMsg)
    recv__queue = in_V
  within
    EXTERNAL_recv(recv__x, recv__queue, Loop0, in_V, Indirections, Q)

Figure 17: CSP translation of DoLoop0

Loop0(in_V, Indirections, Q) =
  if true then
    DoBody0_Star(in_V, Indirections, Loop0, in_V, Indirections, Q)
  else
    let MAVLINK_output = true within Q

Figure 18: Translation of Loop0 process

of the CSPM model of the system architecture. It simply takes an input value received and updates the map that models pointers. It then behaves like the process P supplied as an argument to EXTERNAL recv with P’s required arguments, which also include another continuation process Q. For example, the actual parameter, Loop0, substituted for P in EXTERNAL recv is shown in Figure 17. The continuation process Loop0 corresponds to the translation of the reflexive transitive closure of the loop body. The loop guard is true, therefore the loop never terminates and hence, in the process definition, the ‘else’ part can never be taken. In general, if a loop terminates, the process Q takes over the flow of execution.

The CSPM definition of DoBody0 Star is translated from the FSG generated axiomatic definition of DoLoopBody0; DoLoopBody0 is shown in Figure 19. Although it looks (at first sight) complicated, Figure 19 is simply setting up the abbreviations for expressions that are substituted for the formal parameters of the external function recv. For example, the identifier recv__x is an abbreviation for the address of a pointer and recv__y is an abbreviation for the size of the structure passed to recv. The translation into CSPM of DoLoopBody0 of Figure 19 is shown in Figure 20. The definition of the process DoBody0 Star is similar to that of DoLoop0 and reflects the axiomatic

Z

DoLoopBody0 : VALUE C × LOC C → VALUE C × LOC C ∀ in V : VALUE C ; MAVLINK local x L : LOC C | BC (∀ recv queue, recv recv , recv x , recv x  , recv y : VALUE C ; σ, σ  : STORE C | recv x = PointerVal C MAVLINK local x L .addr ∧ recv y = IntVal C (SizeOf C (StructType C ("Key", ArithmeticType C (Signed C Int C )), ("cmsg", StructType C ("SeqNo", ArithmeticType C (Signed C Int C )), ("msg", StructType C ("val ", ArithmeticType C (Signed C Int C )) ) ) )) ∧ recv queue = in V • (recv call [recv queue/queue v , recv recv /recv v !, recv x  /x v  , recv x /x v , recv y/y v ])) = IntVal C 0 • DoLoopBody0 (in V , MAVLINK local x L ) = (in V , MAVLINK local x L ) Figure 19: FSG generated Z axiomatic definition of the loop body


DoBody0_Star(in_V, Indirections, P, i, j, Q) =
  let
    recv__x = MAVLINK_local_x_L
    recv__y = SizeOf(SMsg)
    recv__queue = in_V
  within
    EXTERNAL_recv(recv__x, recv__queue, P, i, j, Q)

Figure 20: Translation of DoLoopBody0 into CSPM

SYSTEM = (STREAM1 [| {|GCSXMIT|} |] STREAM2) [| {| R2M |} |] MAVLINK(R2M) Figure 21: CSPM SYSTEM definition

definition of the loop body in Figure 19. The required recursion is achieved by supplying Loop0 as the continuation P (along with Loop0’s arguments in V for i and Indirections for j) that is supplied to EXTERNAL recv in Figure 18. The translation of the reactive process MAVLINK into CSPM, along with the similarly translated C code implementations of the reactive processes which make up Stream1 and Stream2, can be plugged into the CSPM system architecture definition of Figure 8.

9. Verification of System Properties

9.1. Replay attack property

The encrypted communication over two streams (one for a ground control station and one potentially for a safety controller) is represented in CSPM in Figure 21. The process MAVLINK is represented by the CSPM process

MAVLINK(in) = DoLoop0(in, (| |), SKIP)

where the definition of DoLoop0 is derived from the specification generated by the FSG tool that abstracts the C code implementation, as described in section 8. Thus the behaviour of MAVLINK instantiated with the channel R2M can be plugged into the SYSTEM definition above. Similarly the behaviours of the processes that make up STREAM1 and STREAM2 can be derived from their implementations in the manner described previously in section 8.

The quadcopter is a Cyber-Physical System that was initially developed without reference to potential penetration tester type attacks. A form of external attack is to interfere with the commands sent from the ground control station by replaying them through the second communication stream. A specification of robustness against this type of attack is that, once started with a particular input stream, it must not be possible to receive messages from the other stream. The specification can be stated operationally as the CSPM process shown in Figure 22.


SPEC1 = R2M?CMsgStruct.x.msg → SPEC1'(x)
SPEC1'(x) = R2M!CMsgStruct.x? → SPEC1'(x)

Figure 22: Replay attack property

This specifies that once MAVLINK has received a message from one source, all future messages must be received from the same source. FDR can check whether the SYSTEM process, running in parallel with the INTRUDER process, satisfies this specification. In this case the assertion fails, indicating that the specification is not satisfied by the system. FDR returns a counterexample showing that the intruder can take a message sent over the ground-to-air interface and replay it over the second interface. The weakness is that the second stream is a copy of the first, using the same initial nonce and encryption key, and can therefore accept messages replayed from the other stream. The same system representation can be used to check other system security properties specified in CSPM.

9.2. Denial of service through flooding a channel

Denial of service attacks try to make a resource unavailable to its intended users. In this system, if the intruder can access one of the interfaces, it is reasonable to ask whether the intruder can deny service to the legitimate user of the other interface. The following assertion checks this in the model:

assert SYSTEM \ diff(Events, {| R2M.0 |}) :[divergence free]

This will fail if SYSTEM can perform an infinite sequence of events containing no event that corresponds to MAVLINK receiving a message from the legitimate user, indicating that the intruder can deny the legitimate user access to the system, assuming the intruder is able to produce an unbounded sequence of messages. The check fails, demonstrating that such a denial of service is possible. The cause is the lack of fairness in the implementation of MAVLINK: a simple change that would make the system resilient to such an attack is for MAVLINK to read from both streams alternately, which would ensure that the intruder could not overwhelm the system.

9.3. Denial of service through bounded sequences

Another form of denial of service arises from the way that sequence numbers are handled in the system. Sequence numbers are used to ensure that messages arrive in the correct order. When a message is received with a given sequence number, the system checks whether that number is larger than the sequence number of the last message it received. If it is, it accepts the message and stores the new sequence number. The following assertion reveals a problem with this:

assert SYSTEM' :[deadlock free]


SYSTEM' is as per SYSTEM but allows the intruder to inject messages into the ground-to-air channel at the start (which is possible since this is a wireless link). This assertion fails, indicating that there is a way in which SYSTEM can deadlock and accept no further actions. The attacker replays a message from a previous session with a very high sequence number, which the system accepts because it is properly constructed. When the ground control station then attempts to send a message with a low sequence number, it is not accepted. This attack is a form of denial of service, since it prevents the legitimate user from interacting with the system. It is made possible by the fact that replay attacks work not only between different interfaces but also between different sessions, because there is no session-specific data.
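The sequence-number weakness of section 9.3 can be illustrated with a small Python sketch. The acceptance rule follows the paper's description ("accept only if the sequence number is larger than the last one received"); the concrete numbers are invented for illustration.

```python
class Receiver:
    """Accepts a message only if its sequence number is strictly
    larger than the last accepted one, as described in section 9.3."""

    def __init__(self):
        self.last_seq = -1

    def accept(self, seq):
        if seq > self.last_seq:
            self.last_seq = seq
            return True
        return False

r = Receiver()
# The attacker replays a properly constructed message from an
# earlier session carrying a very high sequence number...
assert r.accept(2**31 - 1)
# ...after which the ground station's legitimate, low-numbered
# messages are all rejected: a denial of service.
assert not r.accept(0)
assert not r.accept(1)
```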

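Returning to the flooding attack of section 9.2, the suggested fix — reading from the two streams alternately rather than greedily — can be sketched in Python. The queue representation and the round-robin loop are assumptions of the sketch, not the quadcopter's code.

```python
from collections import deque

def serve_round_robin(stream1, stream2, budget):
    """Alternate strictly between the two input streams so that a
    flood on one stream cannot starve the other (the fairness fix
    suggested in section 9.2). Returns the messages processed."""
    served = []
    queues = deque([stream1, stream2])
    for _ in range(budget):
        for _ in range(len(queues)):
            q = queues[0]
            queues.rotate(-1)        # move to the other stream next time
            if q:
                served.append(q.popleft())
                break                # one message served this round
    return served

# The intruder floods stream2, but the legitimate user's messages
# on stream1 are still served in alternation.
legit = deque(["cmd1", "cmd2"])
flood = deque(["junk"] * 100)
out = serve_round_robin(legit, flood, budget=6)
print(out)  # → ['cmd1', 'junk', 'cmd2', 'junk', 'junk', 'junk']
```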
10. Scalability

To address the issue of scalability it is worth reviewing the overall form of the resulting translation from C into CSPM. A C function without any loops is translated as a terminating reactive process. The C function can be many hundreds of lines of code with many auxiliary variables; the result of applying FSG is a functional description relating the initial and final states. The translation into CSPM is a single-step process that reacts to its environment, where the environment is the interface to the external world, or communications to the rest of the system. If the C function has a loop then the form of the CSPM is: a single-step reactive process up to the point of the loop; a continuation to a, possibly nonterminating, single-step reactive process; and a continuation to the CSPM process that is the translation of the output of FSG for the rest of the function. Any C functions called within the C code are treated as continuations to the CSPM processes that result from translating the specifications generated for them by FSG.

The overall form of the CSPM representation is a series of single-step processes that eliminate most of the intermediate states, because FSG generates the predicate linking initial to final states. Hence there are many fewer states than a software model checker would have to deal with. The state space of these translated processes is driven by the size of the channels used, i.e. the number of values that can be communicated through a channel. The CSPM representation is thus parameterised by the set of channel values. A system, or sub-system, requirement will use a subset of the channels in the CSPM representation, therefore only the values for those channels are needed. For example, the property of replay protection specified in Figure 22 uses the channel R2M. The CSPM that defines the channel is:

SZ = 4  -- should be a power of 2
SeqNo = {0..SZ-1}
datatype M = Msg
channel R2M : SeqNo.M


Thus only 4 values are required to express the property. The actual message is irrelevant to the replay attack property and has been abstracted to a single value M. Channels that depend upon this channel can be similarly abstracted; systematic methods are available to determine the size of such sets of data for CSP [26]. If a requirement needs many values to express it then the approach described in this paper will not scale. The question then becomes: are there interesting requirements that can be expressed using only a few values? The examples of section 9 indicate that there are useful properties that lend themselves to scaling.

The example of the system implemented in the C code is not in itself large enough to demonstrate that the approach is scalable, but even a large example would only be indicative. The example does show that the number of states can be drastically reduced compared with directly model checking the C source code. The number of concurrent processes in the example sub-system was accurate for an early version of the Arducopter. Further scalability would require different properties of sub-systems, such as the communication sub-system example, to be established and then the properties for each sub-system composed to give an overall system requirement. The in-built compositionality of CSPM has been used many times to assess large systems [16], based upon claimed behaviour of sub-systems.

The authors believe that the described approach scales sufficiently for many embedded systems where the architecture is known. The limitations of the approach discussed in this paper are: knowledge of the software architecture and its size; restricting pointer use to relatively simple purposes, such as parameter passing; and the types of property to be checked. The number of software architectures actually used in real embedded systems is limited.
Although it would be a significant amount of work to characterize such architectures and to determine which one was being employed in a particular instance, it is not infeasible. The size of the architecture will increase the size of the model check required; however, the property of interest mitigates this. For example, only a small part of the quadcopter's software architecture needed to be described to check plausible cyber attacks. The biggest limitation to the applicability of the current work is the use of pointers, because of the limited ability to represent them in the CSPM model. Embedded systems in the aerospace sector tend to fit these criteria, especially when auto-coding is used. Other areas such as the Internet of Things will probably be less amenable to the current approach because of the lack of constraints in software development compared to aerospace.

11. Related Work

The High Assurance Cyber Military Systems (HACMS) project [6] is creating technology for the construction of high-assurance Cyber-Physical Systems, where high assurance is concerned with functional correctness and satisfaction of appropriate safety and security properties. The work described in this paper is part of the "Red Team" effort that seeks to use "traditional" penetration

testing techniques allied to analysing the system using automated verification. In section 6, ongoing work is discussed that seeks to de-compile executable binary into the C compliance notation, from which a specification can be generated and translated into CSPM. Thus the aim is to use the C compliance notation as an intermediate representation generated from a de-compiler.

Recent advances indicate that the time is ripe for verifying properties of systems. A number of projects on systems verification have emerged, most notably the seL4 verified OS kernel project [23]. The project developed and formally verified a high-performance microkernel on ARM and x86 architectures. Such microkernels are a critical core component of modern embedded systems architectures: they are the piece of software that has the most privileged access to hardware and regulates access to that hardware for the rest of the system. Virtually every modern smart-phone runs a microkernel quite similar to seL4. In the rest of the HACMS project a clean-slate approach has been adopted, using formal methods-based approaches to enable semi-automated code synthesis from executable, formal specifications. In addition to generating code, HACMS seeks a synthesizer capable of producing a machine-checkable proof that the generated code satisfies functional specifications as well as security and safety policies. Work on extending the verified seL4 kernel to an RTOS is taking place within HACMS. All of the preceding work involves top-down development of software from scratch, where the safety and security requirements are known. The approach advocated in this paper aims to address software that already exists and requirements that only emerge later. Another notable example is the Verisoft and Verisoft XT projects [1], which verify special-purpose operating systems and a hypervisor at the level of source code, but without I/O.
There have also been systems verifications that focus on safety properties, most notably Yang and Hawblitzel's type-safe operating system [44]. New supporting technology, such as programming logics for system source code, has been developed, e.g. by Shao's group at Yale University [10]. In all of the above lines of work, verification assumes availability of the source code.

The Research Institute in Automated Program Analysis and Verification [9] in the UK has two related research projects. The first is "Compositional Security Analysis for Binaries", which is translating binaries to C code in order to apply software analysis tools like CBMC [24]. However, this approach is focussed on low-level properties, like zeroing of released memory, rather than system-level properties. The second project is "Program Verification Techniques for Understanding Security Properties"; like this paper, it addresses source code rather than binaries. Unlike the approach described in this paper it does not compose source code to verify system-level properties, although it might be possible to extend it in that direction.

The VATES project [2] aimed to verify the behaviour of embedded software at the level of an LLVM intermediate representation. It uses a top-down approach, inventing an abstract CSP-based specification and then formally relating it to a model of a concrete system implementation, given in the form of the LLVM intermediate representation. This demonstrated that well-established

formal verification techniques for declarative process-algebraic specifications are applicable to low-level software programs. The VATES project developed a formal operational semantics of LLVM and established a bisimulation relation between LLVM and CSP models in order to prove that a given LLVM program is a correct implementation of a given CSP model. Again a top-down development was assumed, and the translation from the intermediate representation down to assembly is simply assumed to be correct.

The most relevant work is that of Magnus Myreen, who developed an approach to formal verification of machine code [28] that has been exploited to verify binary code [23]. It has subsequently been extended by Fox to deal with more Instruction Set Architectures (ISAs) [13]. Myreen and Fox's work has so far targeted ARM, Intel x86 and IBM PowerPC processors with a proof-producing de-compiler, which, given some machine code for any of the processors mentioned above, will automatically, via proof, derive a functional program from the machine code. This functional program is a record of the state change the machine code can perform. Although the approach only provides a low-level description in Higher Order Logic, it could form the basis for connecting binary executables to system architectures. An alternative output for the de-compiler has been developed as part of the verification of the binary of the seL4 verified OS kernel. The output generated is a graph language whose nodes describe the effect on the state of executing the binary and whose arcs represent control flow. In particular, one type of node represents the effect of calling a function.

12. Ongoing and Future Work

Within the HACMS project, the lead organisation for the Red Team is the Charles Stark Draper Laboratories, or Draper Labs. Draper Labs are working on a de-compiler tool under HACMS.
The HACMS de-compiler, called the Fracture tool, converts Executable Object Code for an Instruction Set Architecture (ISA) into an intermediate representation called LLVM [25]. Current work on the HACMS de-compiler is attempting to emit a representation, in the C compliance notation, from LLVM that is abstracted into a predicate by FSG and then translated into CSPM for the refinement model checker FDR [15] to verify against system security properties. Work is also ongoing on the formal specification of a translator from the specifications generated by the FSG tool to CSPM.

Myreen's de-compilation technology provides advantages over the use of Fracture and FSG. For example, the Fracture tool does not provide any assurance of correctness against the semantics of the ISA used. The manipulation of LLVM and the translation into the C compliance notation is similarly unassured, and there is no planned verification of the translation from the output of FSG to CSPM. As a consequence any reported violations of system properties may be false positives, and satisfaction of the properties cannot be relied upon. Such limitations make its use questionable when safety is required, although its role within the Red Team's assessment of systems is still useful.

The new graph representation generated by Myreen's de-compilation technology, allied with some simplification support from an SMT solver, seems to map naturally onto the approach described by this paper. If this is the case then system safety and security requirements could be established with high assurance and automatically. The gap between a graph representation of a binary executable in Higher Order Logic and CSPM can be addressed through the Unifying Theories of Programming (UTP) [20]. An assured translator could be built on top of a current tool [12] that supports UTP and is built on top of Isabelle. To employ the FDR3 model checker for properties that are sensitive to data values would require a two-pronged approach: one prong is to build data abstraction into the translation from the representation in Higher Order Logic to CSPM; the other is the development of a symbolic model checker for CSPM to assure such data abstractions. The capability of FDR3 to harness cloud resources to check over large state spaces means that it is plausible that embedded systems could be within the reach of automated verification against system properties. For example, FDR3 checked a trillion states using Amazon's EC2 web service in a few hours at a cost of approximately $70 [15].

The only part of the process for analysing Cyber-Physical Systems that is not currently automated is the incorporation of the communication events, such as 'recv' and 'send', that enable the reactive processes to be composed together. Work under the DARPA Mining and Understanding Software Enclaves (MUSE) project [5] is building a new machine learning arrangement that will analyse terabytes of open source software. The work carried out by Draper Labs, in collaboration with a group at Stanford University led by Andrew Ng, identifies which open source library components have been used in an implementation, with a view to addressing known security vulnerabilities.
The same technology could be adapted to recognise the limited number of architectural compositions used in the construction of Cyber-Physical Systems. Once identified, a library of CSPM models of communication events for that architecture could be automatically incorporated to build an overall model of an aspect of a system, like the one described in this paper.

Acknowledgements. This work was sponsored by the Charles Stark Draper Laboratories under the DARPA project on High Assurance Cyber Military Systems (HACMS) under agreement number FA8750-12-2-0247. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. The authors are grateful for help with FSG from Mark Teasedale of D-RisQ Ltd and Rob Arthan of Lemma1. The authors are indebted to the reviewers for vastly improving the paper.

Bibliography

[1] Eyad Alkassar, Mark A. Hillebrand, Wolfgang Paul, and Elena Petrova. Verified Software: Theories, Tools, Experiments: Third International Conference, VSTTE 2010, Edinburgh, UK, August 16-19, 2010. Proceedings, chapter Automated Verification of a Small Hypervisor, pages 40–54. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010.

[2] Björn Bartels, Sabine Glesner, and Thomas Göthel. Model transformations to mitigate the semantic gap in embedded systems verification. ECEASST, 30, 2010.

[3] B. Carré, C. O'Halloran, and C. T. Sennett. Final report on work to define a formal semantics for SPARK. DRA customer report, 1993.

[4] D-RisQ. http://www.drisq.com/.

[5] DARPA. http://www.darpa.mil/program/mining-and-understanding-software-enclaves.

[6] DARPA. HACMS project. http://www.cyber.umd.edu/sites/default/files/documents/symposium/fisher-HACMS-MD.pdf, 2015.

[7] Terry L David. Cyber security challenges in the global airspace. Integrated Communications Navigation and Surveillance (ICNS), 2016.

[8] David Delmas and Jean Souyris. Astrée: from research to industry. Proc. 14th International Static Analysis Symposium, SAS 2007, G. Filé and H. Riis-Nielson (eds), pages 437–451, 2007.

[9] EPSRC. https://verificationinstitute.org/.

[10] Xinyu Feng, Zhong Shao, Yuan Dong, and Yu Guo. Certifying low-level programs with hardware interrupts and preemptive threads. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '08, pages 170–182, New York, NY, USA, 2008. ACM.

[11] Jean-Christophe Filliâtre and Claude Marché. The Why/Krakatoa/Caduceus platform for deductive program verification. Computer Aided Verification: 19th International Conference, CAV 2007, 2007.

[12] Simon Foster. http://www-users.cs.york.ac.uk/~simonf/utp-isabelle/.

[13] Anthony Fox. Interactive Theorem Proving: Third International Conference, ITP 2012, Princeton, NJ, USA, August 13-15, 2012. Proceedings, chapter Directions in ISA Specification, pages 338–344. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.

[14] Kevin Fu and James Blum. Controlling for cybersecurity risks of medical device software. Communications of the ACM, 56(10):35–37, 2013.


[15] Thomas Gibson-Robinson, Philip Armstrong, Alexandre Boulgakov, and A. W. Roscoe. FDR3: a parallel refinement checker for CSP. International Journal on Software Tools for Technology Transfer, 18(2):149–167, 2015.

[16] Thomas Gibson-Robinson, Guy Broadfoot, Gustavo Carvalho, Philippa Hopcroft, Gavin Lowe, Sidney Nogueira, Colin O'Halloran, and Augusto Sampaio. FDR: from theory to industrial application. Bill Roscoe Festschrift, LNCS, 2017.

[17] Thomas Gibson-Robinson and A. W. Roscoe. FDR into the cloud. In Communicating Process Architectures, 2014.

[18] Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett. Cyclone: a type-safe dialect of C. C/C++ User's Journal, 2005.

[19] M. G. Hill and E. V. Whiting. Risk reduction for C coding. Replacement Maritime Patrol Aircraft project - internal report, version 1.6, 2001.

[20] C. A. R. Hoare and H. Jifeng. Unifying Theories of Programming. Prentice Hall series in computer science. Prentice Hall, 1998.

[21] Galois Inc. https://github.com/mavlink.

[22] Galois Inc. SMACCMPilot. http://smaccmpilot.org/, 2015.

[23] Gerwin Klein, June Andronick, Kevin Elphinstone, Gernot Heiser, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. seL4: formal verification of an operating-system kernel. Commun. ACM, 53(6):107–115, 2010.

[24] Daniel Kroening. http://www.cprover.org/cbmc/.

[25] Draper Labs. Fracture tool. https://github.com/draperlaboratory/fracture.

[26] Ranko Lazic and Bill Roscoe. Data independence with generalised predicate symbols. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '99), Volume I, pages 319–325. CSREA Press, 1999.

[27] The MathWorks. http://www.mathworks.com/products/polyspace/.

[28] Magnus O. Myreen. Formal verification of machine-code programs. PhD thesis, University of Cambridge, 2008.

[29] V. Nepomniaschy, I. Anureev, and A. Promsky. Verification-oriented language C-light and its structural operational semantics. Ershov Memorial Conference, 2003.


[30] A. Nicholson, S. Webber, S. Dyer, T. Patel, and H. Janicke. SCADA security in the light of cyber-warfare. Computers and Security, 31(4):418–436, 2012.

[31] Michael Norrish. C formalised in HOL. PhD thesis, University of Cambridge, 1999.

[32] C. O'Halloran. Automated verification of code automatically generated from Simulink®. Automated Software Engineering, 20(2):237–264, 2013.

[33] C. O'Halloran and C. H. Pygott. Formalising C and C++ for use in high integrity systems. The Safety of Systems, Redmill F., Anderson T. (eds), 2007.

[34] C. O'Halloran, A. Smith, and P. Clayton. Verifying C software by proof – a case study. QINETIQ/EMEA/S&DU/TR0704743, 2007.

[35] Colin O'Halloran. Guess and verify – back to the future. FM 2009: Formal Methods, pages 23–32, 2009.

[36] Colin O'Halloran. Verifying critical cyber-physical systems after deployment. Vol. 72, Proceedings of the 15th International Workshop on Automated Verification of Critical Systems, Electronic Communications of the EASST, journal.ub.tu-berlin.de/eceasst/issue/view/87, 2015.

[37] Lemma One. http://www.lemma-one.com/ProofPower/index/.

[38] Jonathan Petit and Steven E. Shladover. Potential cyberattacks on automated vehicles. IEEE Transactions on Intelligent Transportation Systems, 16(2):546–556, 2014.

[39] A. W. Roscoe, P. H. B. Gardiner, M. H. Goldsmith, J. R. Hulance, D. M. Jackson, and J. B. Scattergood. Hierarchical compression for model-checking CSP or how to check 10^20 dining philosophers for deadlock. TACAS, 1995.

[40] A. W. Roscoe. The Theory and Practice of Concurrency. Prentice Hall, 1997.

[41] Ahmad-Reza Sadeghi, Christian Wachsmann, and Michael Waidner. Security and privacy challenges in industrial Internet of Things. Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE, 2015.

[42] M. Strecker. Compiler verification for C0. Technical report, Université Paul Sabatier, Toulouse, 2005.

[43] Jim Woodcock and Jim Davies. Using Z: specification, refinement, and proof. Prentice Hall, 1996.

[44] Jean Yang and Chris Hawblitzel. Safe to the last instruction: automated verification of a type-safe operating system. http://research.microsoft.com/apps/pubs/default.aspx?id=122884, 2010.