Int. J. Human-Computer Studies (2002) 56, 199–223 doi:10.1006/ijhc.2001.0521 Available online at http://www.idealibrary.com
Verification and validation of the SACHEM conceptual model M. Le Goc Usinor, Sachem, LBI, 13776 Fos sur Mer Cedex France. LSIS, Av. Escadrille Normandie Niemen, 13397 Marseille Cedex 20 France. email:
[email protected] C. Frydman and L. Torres LSIS, Av. Escadrille Normandie Niemen, 13397 Marseille Cedex 20 France. emails:
[email protected];
[email protected]
(Received 7 April 2000 and accepted in revised form 3 December 2001)
We present a method for transforming a KADS conceptual model (informal) into an operational model (formal) based on high-level Petri nets. The KADS model we consider specifies the functional architecture of the knowledge-based system called SACHEM, designed for blast furnace control. The operationalizing process we propose allows the KADS model to be completed and validated. Upon execution of the operational model, the dynamics of the system can be simulated. Thus the proposed operationalizing process contributed to the validation and verification of the SACHEM conceptual model.
© 2002 Elsevier Science Ltd.
KEYWORDS: specification; validation; operationalizing; knowledge-based system; KADS.
1. Introduction
Our research developed out of a contract between two French groups: the Usinor Group, a European steel company, and the L6 Research Laboratory. This collaboration† falls within the scope of the SACHEM project, the aim of which is to develop an interactive, real-time, knowledge-based system to assist in decision-making during blast furnace control operations, 24 h a day, 365 days a year. The problem addressed in this paper concerns adding the action-recommending service to the problem detection and diagnostic services. The development of a large, complex knowledge-based system like SACHEM is a long and difficult process. Any error left undetected during system specification leads to very expensive delays. Therefore, specification validation is important. In this paper we describe our proposed method for validating the conceptual model of SACHEM.
†This work is supported in part by Sollac contract number A190759 (Sollac is a company of the Usinor Group).
1071-5819/02/020199 + 25 $35.00/0
This work starts from the SACHEM conceptual model elaborated by Usinor with the KADS† methodology (Wielinga, Schreiber & Breuker, 1992) and concerns the validation of this model. To reach this objective, we propose to complete and transform the SACHEM informal conceptual model into an operational model based on high-level Petri nets. During the building of this operational model, different problems of the SACHEM conceptual model were identified. These problems were resolved, in collaboration with the specification team, to complete and correct the conceptual model. Moreover, we provided a behaviour description language that is useful for adding system behavioural specifications before or during operationalization. Finally, the operational model obtained from the completed and corrected conceptual model was used to simulate the dynamics of the system. The proposed approach constitutes a decisive step that precedes design and should be applicable to any conceptual model obtained with the KADS methodology.
In Section 2, we summarize the objectives of the specification and design phases of the software development cycle and briefly describe the KADS methodology for knowledge-based system development. Section 3 introduces the SACHEM system in order to present the problem of validating the SACHEM conceptual model. In discussing our approach for operationalizing the KADS conceptual model of the SACHEM system, we first explain how the conceptual model is completed (Section 4) and how high-level Petri nets are used to build an operational model from the completed conceptual model (Section 5). Then, in Section 6, we present the results of simulations on the SACHEM operational model. The main lessons to be taken from the operationalization of the conceptual model of SACHEM in an industrial setting are presented in Section 7.
†We use KADS to refer to the methodology that emerged from the Esprit projects KADS-I and KADS-II, of which the last version is CommonKADS.
2. The knowledge level and the development process
2.1. INITIAL PHASES IN SOFTWARE DEVELOPMENT
The first phase in the software development process consists in establishing the specification of the system from the customer’s requirements (Calvez, 1990; Sommerville, 1992). Specification aims at characterizing a system by defining what the system must do. This implies considerable dialogue between the customer and the specification team. Consequently, a specification must be easily elaborated, understood and modified (Calvez, 1990). Furthermore, a specification must be validated by the customer, to prove that it satisfies the request, and verified by the specification team, to prove that it is coherent (without ambiguities and contradictions). A specification model is used to describe the specification of the system to be developed. The formal property of this model allows it to be verified and validated. A specification model is usually composed of three parts (Calvez, 1990):
(1) A data model describing the data and the relations between them.
(2) A function model describing the data transformations in terms of functions.
(3) A behavioural model defining the temporal evolution and the execution conditions of the functions.
The next step in the development process is the design phase. The design aims at characterizing a system by defining how it behaves. The designer takes into account the specification in order to obtain a solution and then validate it. Any solution proposed must satisfy the specification. A design model describes a solution for the system. The three dimensions just listed must be found once again in this model. By nature, the specification model is free from implementation details. On the other hand, the design model takes into account computational techniques. The passage from the specification to the design phase is a critical problem in software development (Sommerville, 1992) that must inevitably be faced when one is developing any software, including knowledge-based systems like SACHEM.
Several methodologies exist for software development, some of them specific to knowledge-based systems. The KADS methodology is the one that was chosen by Usinor for the development of the SACHEM system. The justifications of this choice are outside the scope of this paper, but the fundamental reason is that in 1989, the KADS methodology was considered to be the most complete approach to managing the development of a complex knowledge-based system.
2.2. THE KADS METHODOLOGY FOR KNOWLEDGE-BASED SYSTEM SPECIFICATION
The KADS methodology provides a general framework for the development of knowledge-based systems (Wielinga et al., 1992). It proposes guidelines for building an informal model, known as a conceptual model, which must be completed to become a specification model. Like any software engineering methodology, KADS distinguishes the specification model from the design model (see Figure 1). 2.2.1. The KADS conceptual model. Since the KADS conceptual model is built in an implementation-independent way, it constitutes a part of a true specification model
Figure 1. Abstraction process of KADS.
(Fensel, 1995). This model is based on the identification of different types of knowledge, which are distinguished as three layers within the KADS conceptual model: the domain, inference and task layers (see Wielinga et al., 1992, for detailed information).
2.2.2. KADS abstraction process. One of the most important contributions of the KADS methodology is the effort to position knowledge-based system development as a complement to the classical software development process. This point of view is particularly clear in the abstraction process of KADS. With KADS it is possible to distinguish the conceptual specification process (producing the conceptual model) from the complete specification (producing the specification model) and the technical design (producing the design model). This capability helps to clarify the development process for knowledge-based systems (Wielinga et al., 1992). Knowledge-based system development can then be viewed as the adding of knowledge acquisition to classical software development. Knowledge acquisition, in turn, establishes a bridge between the requirements document and the specification model document. This bridge is precisely the informal conceptual model, which facilitates communication among experts, the specification team, decision makers and clients. But neither the KADS methodology nor other software methodologies give an explicit solution to the specification validation problem (van Harmelen, Balder, Aben & Akkermans, 1991); the problem of moving from specification to design likewise remains unsolved.
2.3. THE INCOMPLETENESS OF THE KNOWLEDGE LEVEL
The abstraction cycle of KADS is connected with the Newell proposition, formulated in 1982 as an hypothesis (Newell, 1982): ‘‘There exists a distinct computer systems level, lying immediately above the symbol level, which is characterized by knowledge as the medium and the principle of rationality as the law of behavior’’. The fundamental point of the Newell proposition is that the knowledge level is characterized by a radical incompleteness. So, from this point of view, any attempt to build a complete description of a system at the knowledge level is illusory. While systems at the knowledge level are functionally characterized in terms of the functional role that a subset of knowledge is playing, the same systems must be characterized structurally at the symbol level, in terms of the logical structures that realize some functional role (Steel, 1990). Thus the global design process of a knowledge-based system can be defined as the transformation of knowledge in formal structures. The basic difficulty lies in transforming a conceptual model, expressed at the knowledge level, into a specification model, expressed at the symbol level. This process is complex because it is concerned with the intricate relations existing between two abstraction levels and within a level (Newell, 1982). The process is particularly complex when one is starting from scratch, for at least two reasons. (1) The concept of knowledge is not yet clearly understood. There is no operational definition of knowledge. We know only what it is not, and what we are permitted
to do. In this sense, we can say that our knowledge about a system constitutes a specification of what the system must be able to do (i.e. the “skill” of the system) and it is important to note that it concerns the whole system. In other words, the knowledge must specify all the knowledge-based system modules, not just the knowledge bases. Therefore, the distribution of the knowledge in a functional model is a key factor of complexity in the design process.
(2) The perception of a system and its modelling are two linked cognitive processes. We assume that a model of a system based on either construct always results from the choice of at least one point of view; this position was recently discussed in more detail (Le Goc, 1999). The more complex the system, the greater the number of points of view that can, or must, be adopted. When a system is actually complex, then, many models must be constructed to solve some real-world problem. Because a model can be viewed as a structured set of knowledge about the system, the set of knowledge to be acquired is the result of the choice of a particular point of view. Consequently, the knowledge necessary for the construction of a system must also be specified and, unfortunately, the set of knowledge required by the system is not known until the customer validates the system.
The preceding points justify the assertion that “KADS does not provide a dedicated computational framework that provides a straightforward operationalization of conceptual model” (Wielinga et al., 1992). Thus, the problem of transforming a conceptual model into a specification model is still open. However, this problem must be solved when one is applying KADS to the development of an industrial knowledge-based system. That is why Wielinga et al. (1992) recommend adopting an operationalizing process that preserves the conceptual model structure. This recommendation is important, but it is not sufficient.
We adopt the approach that considers knowledge-based system development “as a process of adding symbol-level information to a conceptual model” (Wielinga et al., 1992). The question we put in this paper is the following: given a behaviour described at a knowledge level, what kind of information must be added to obtain a truly functional specification at the symbol level? This question concerns not only the problem of the conceptual model completeness, but also the problem of conceptual model validation.
2.4. HOW TO VALIDATE A CONCEPTUAL MODEL
Since descriptions at a knowledge level are radically incomplete, the process of adding information can be endless and an objective criterion must be found to stop it. Because representation exists at the symbol level, it is possible to adopt the classical software development criterion, namely validation of the specification model, for stopping the process of adding information. We thus propose to stop the process of conceptual model construction by the validation of the specification model. This leads to the joint validation of the conceptual model and the specification model (see Figure 2). Moreover, this validation can be engaged without waiting for the completion of the domain layer of the conceptual model (Frydman, Torres & Le Goc, 2000). In this case, the validation is concerned only with the generic model of interpretation. The key point
Figure 2. Joint validation of a conceptual model and a specification model.
is that this model specifies the functional architecture of the knowledge-based system. This property is very important when a very large-scale knowledge-based system such as SACHEM is being developed from scratch. The process of acquiring the domain knowledge of SACHEM took 14 man-years (see Section 3). Therefore, the operational process could not be grounded on domain layer knowledge operationalization because the construction of the specification model and the corresponding design model cannot be postponed until the knowledge elicitation and formalization processes have been completed. But since the behaviour of a system described at the knowledge level is described within the generic model of interpretation of a KADS model, this model constitutes the basic input of the specification phase of SACHEM development. The operationalization of this knowledge suffices to simulate the future system behaviour. This approach is grounded on the assumption that the elicitation of domain knowledge will not modify the generic model of interpretation. This is one of the ideas the KADS methodology used to ground the generic model concept. And this idea has been verified in the acquisition of knowledge for SACHEM. On the other hand, the SACHEM project team estimated that when the operationalizing and implementation languages differ, the domain layer operationalization constitutes a ‘‘double coding effort’’, that is, a double formalization effort. Indeed, the first formalization corresponds to the representation of the domain knowledge in a language derived from the language of predicates of the first order. The second one corresponds to the knowledge representation in the language retained for the
implementation of the knowledge bases, here the KOOL-94 language. In other words, operationalization of the whole SACHEM conceptual model would represent a considerable and more or less useless cost. But because our approach does not operationalize the domain layer, the operationalization effort is acceptable. As a result, our validation approach does not address the problem of validating the domain knowledge. Completing and operationalizing are concerned with the inference and task layers, which specify the functional architecture and contain the behavioural knowledge that constrains the dynamics of the system to develop, here the SACHEM system. To make this paper self-contained, we first describe the SACHEM system to which our validation approach has been applied. We then show how the behavioural part of the SACHEM conceptual model is completed and transformed into high-level Petri nets. The types of problem this transformation reveals are examined. Eventually, it will be possible to identify other problems by simulating the Petri nets so obtained.
3. The SACHEM conceptual model
SACHEM is a very large-scale, knowledge-based system used since 1999 to monitor and control six of the blast furnaces of the Usinor Group (two in Fos sur Mer, two in Dunkirk and two in Lorraine). SACHEM is also in operation to control a galvanization line at Montataire.
3.1. THE SACHEM SYSTEM
SACHEM improves the product quality and lengthens the life of all the related equipment. The SACHEM system is designed as a reactive rational agent whose mission is to satisfy this major need. This agent acts in real time, continuously (24 h/day, 365 days/year) in a reliable way (see Le Goc & Thirion, 1999, for more information about SACHEM). The architecture of SACHEM is tuned to process more than 11 000 data items/min for a given blast furnace (see Figure 3). SACHEM assumes the following functions.
(1) Data acquisition, synchronization, ordering and verification.
(2) Data processing and validating (using physical and chemical models).
(3) Signal analysis, including neural networks, to detect modifications in time and space.
(4) Detection of phenomena occurring during operation, interpretation of the current situation, and alarm generation when necessary.
(5) Recommendation of action for the operator, when a situation must be corrected.
The SACHEM system is now considered to run a blast furnace better than the best Usinor operator. When the development project began in 1991, the intended benefit
Figure 3. Knowledge distribution in relation to general system architecture.
Figure 4. Conceptual architecture of the perception function of SACHEM.
was around 1 Euro per ton of hot metal. In 1999 the measured return on investment for one blast furnace of the Usinor Group was around 1.7 Euro per ton.
3.2. KNOWLEDGE ASPECTS IN THE SACHEM SYSTEM
To understand the SACHEM experience and the methodological solution developed, it is important to keep in mind some numbers. The method used to build the SACHEM system was a specialization of the KADS conceptual framework. The specifications of the whole system result from an analysis of the conceptual model of the knowledge. The latter was constructed with the OpenKADS environment (Kirsh, Maesano & Raboux, 1993) which represents 25 000 objects for 33 goals, 27 tasks, 75 inference structures, 3200 concepts, and 2000 relations. This model (see Figure 4) represents 14 man-years: the efforts of a team of six knowledge engineers and 12 industry experts over 3 years. The conceptual model has two parts: a generic model of interpretation and a linguistic
model, which is a representation in natural language of the domain knowledge. This linguistic model comprises 2000 pages of text and graphs, distributed in over 70 documents. In 1997 the total software volume represented around 400 000 lines of code (Figure 3). About 60% of this code volume implements the SACHEM functional model, of which 33% is dedicated to the actual knowledge bases. These knowledge bases contain more than 1060 classes of objects, 1100 first-order logic rules and 140 chronicles (a chronicle is a set of temporal constraints linking a set of events). All these items of knowledge are expressed in KOOL-94, an object-oriented language coupled with a first-order logic inference engine and a temporal relations manager.
3.3. VERIFYING AND VALIDATING THE DYNAMICS OF THE SACHEM CONCEPTUAL MODEL
The approach we advocate to obtain a formal specification model from the SACHEM informal conceptual model consists of completing the inference and task layers of the conceptual model and transforming them into high-level Petri nets. Petri nets are widely used tools in domains as varied as operating system design and real-time system modelling (Banâtre, 1991). They present several advantages.
(1) As a formalism, they allow the properties of the model to be formally verified.
(2) As an operational model, they allow the dynamics of the system to be validated by simulation.
(3) Their graphical representation facilitates communication among experts.
One important feature of our approach is that it does not call for any additional skills on the part of the knowledge engineers. They do not need to know anything about Petri nets or operationalizing languages. “Operationalizing” is the accepted term used in knowledge engineering to designate transformation from an informal model into an executable model (van Harmelen et al., 1991; Fensel & van Harmelen, 1994; Jacob-Delouis & Krivine, 1995). However, two ways of operationalizing exist and must be distinguished (Fensel & van Harmelen, 1994; Torres & Frydman, 1998).
(1) The first approach recommends directly translating the informal conceptual model into executable code: operationalizing is synonymous with producing the target knowledge-based system itself.
(2) The second recommends the transformation of the informal conceptual model into a formal and executable model before the target system is produced.
Our work is concerned with the second kind of operationalizing, which allows the conceptual model and the specification model to be verified and validated before coding. Our work, like that of van Harmelen and ten Teije (1997), attempts to verify and validate the models before the knowledge-based system is built and consequently before the system itself has been verified and validated.
In other words, we suppose that the conceptual model does not necessarily describe the valid behaviour of the knowledge-based system; this is exactly what one wants to verify. Our approach must be distinguished from others (Pipard, 1987; Pierret-Golbreich, 1994; Haouche & Charlet, 1996). Even if a supplementary implementation effort is required, verifying and validating before implementation is widely justified by the benefits accruing from specification correction (Sommerville, 1992; Fensel, 1995; Frydman et al., 2000). Since we do not transform the domain layer, we are not concerned with verifying and validating the knowledge base, as done by others (Pipard, 1987; Fensel, Schoenegge, Groenboom & Wielinga, 1996), but with verifying and validating the behaviour of the SACHEM conceptual model in the way outlined by Groot, ten Teije and van Harmelen (1999). Our completing and operationalizing approach then consists of four steps, two for completing and two for operationalizing.
(1) Completing the inference layer.
(2) Introducing additional behavioural descriptions in the task layer.
(3) Building the inference Petri net from the completed inference layer.
(4) Building the task Petri nets from the completed task layer and from the inference Petri net.
4. Completing the SACHEM conceptual model 4.1. COMPLETING THE INFERENCE LAYER
An inference in the conceptual model describes a transformation from input to output roles.† An inference structure is a network of inferences that allows one to perceive the data dependency among these inferences, abstracting from control. The specification of any system must allow each module of the system to be developed by one team and used by another without knowing how this will be (or is) achieved. Therefore, the specification of a module must at least include the identifier, the input, the output and the associated initial and final situations. Thus, an inference specification must be completed by a set of preconditions describing the conditions needed for the inference execution and a set of postconditions describing the situations in which the inference puts its context after execution (Pierra, 1991). It may also be completed by giving its execution time. Since the execution time of an inference is often difficult to state with precision at the specification phase, an inference may, rather, be characterized by an interval defining some possible execution times. We provide the knowledge engineers with a language LI to describe the preconditions, postconditions and execution time intervals associated with the inferences. Preconditions and postconditions are expressed in terms of input or output properties, such as type, value and creation date. The following grammar, in Backus-Naur form, defines these additional specification elements for each inference (obvious definitions are deliberately omitted).
†The KADS methodology advocates one output role for each inference. However, generalizing to many output roles is not contradictory and is permitted in some environments such as OpenKADS.
An example of a completed inference is given in Figure 5.
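As an illustration only, the elements that complete an inference (preconditions, postconditions and an execution time interval) can be sketched as a small data structure. This is not the LI language itself: the class `CompletedInference`, its field names and the role names `disturbances` and `selected_disturbances` are hypothetical, chosen for this sketch; only the inference name comes from Figure 5.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A role is described here by its properties (type, value, creation date).
Role = Dict[str, object]
Condition = Callable[[Dict[str, Role]], bool]


@dataclass
class CompletedInference:
    """Hypothetical container for an inference completed as in Section 4.1."""
    name: str
    inputs: List[str]
    outputs: List[str]
    preconditions: List[Condition] = field(default_factory=list)
    postconditions: List[Condition] = field(default_factory=list)
    duration: Tuple[float, float] = (0.0, 0.0)  # [min, max] execution time

    def has_consistent_interval(self) -> bool:
        """An execution time interval is consistent when 0 <= min <= max."""
        low, high = self.duration
        return 0 <= low <= high

    def enabled(self, roles: Dict[str, Role]) -> bool:
        """True when every input role is present and all preconditions hold."""
        if not all(r in roles for r in self.inputs):
            return False
        return all(cond(roles) for cond in self.preconditions)


# Assumed example data: role names and values are invented for illustration.
selecting = CompletedInference(
    name="Selecting disturbances",
    inputs=["disturbances"],
    outputs=["selected_disturbances"],
    preconditions=[lambda roles: roles["disturbances"]["value"] != []],
    duration=(1.0, 5.0),
)
roles = {"disturbances": {"type": "list", "value": ["d1"], "created": 0.0}}
print(selecting.enabled(roles), selecting.has_consistent_interval())
```

Representing conditions as predicates over role properties keeps the sketch close to the text, where preconditions and postconditions are expressed in terms of input or output properties.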
4.2. COMPLETING THE TASK LAYER
The KADS methodology indicates that the task definition must include the behavioural description, that is, the way of chaining the components of the task. However, it does not provide the definition of the concepts needed and still less the tools required to do so. The specification team must define the syntax and the semantics of the behavioural description language. Within the OpenKADS framework, this necessitates a major programming effort, as well as the very advanced programming knowledge of a computer specialist. Therefore, we provide a formal language LT to allow the conceptual model to be completed by adding behavioural descriptions. This language is easily used by computer novices and sufficiently expressive to describe the task behaviour. Any behavioural definition language must contain, at the very least, the following chaining between task components: sequence, repetition, conditional execution, synchronization and parallelism. For the SACHEM project, only the sequence and synchronization (at the start of the task) operators are useful and are thus taken into account in our language. In any case, the language can be extended by adding new operators (Torres & Frydman, 1998). Our language LT is formally defined by the following grammar, where the operator “,” represents the sequence and the operator “//” represents the synchronization. The synchronization operation that we consider here consists of synchronizing the beginning of the execution of the following components.
Figure 5. The Selecting disturbances inference.
An example of a completed task is given in Figure 6.
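To make the two operators concrete, here is a minimal parsing sketch for behavioural descriptions written with “,” and “//”. The concrete syntax is an assumption inferred from expressions such as (Inference2, Inference3) and (Inference2//Inference3) used later in the paper: component names are taken to be plain identifiers, parentheses group sub-expressions, and mixing the two operators without parentheses is rejected because no operator precedence is assumed. The function `parse_behaviour` and the tuple encoding of the result are hypothetical.

```python
import re


def parse_behaviour(text):
    """Parse a behavioural description into a nested ('seq'|'sync', items) tree."""
    tokens = re.findall(r"//|[(),]|\w+", text)
    # Reject any character that the tokenizer silently skipped.
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise ValueError("unexpected character in behavioural description")
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of description")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if tok in {",", "//", ")"}:
            raise ValueError(f"component name expected, found {tok!r}")
        return tok  # a component (inference or subtask) name

    def expr():
        nonlocal pos
        items = [term()]
        op = None
        while pos < len(tokens) and tokens[pos] in {",", "//"}:
            if op is not None and tokens[pos] != op:
                raise ValueError("mixing ',' and '//' requires parentheses")
            op = tokens[pos]
            pos += 1
            items.append(term())
        if op is None:
            return items[0]
        return ("seq" if op == "," else "sync", items)

    tree = expr()
    if pos != len(tokens):
        raise ValueError("trailing input after description")
    return tree


print(parse_behaviour("(Inference2, Inference3)"))
print(parse_behaviour("Inference1 // (Inference2, Inference3)"))
```

The resulting tree keeps the sequence and synchronization structure explicit, which is the information needed later to decide where control places and control transitions would be introduced.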
5. Operationalizing and verifying the SACHEM conceptual model
The formalism we chose to use in operationalizing the inference and task layers of the SACHEM conceptual model is based on Petri nets. Moreover, we use the following kinds of Petri net.
(1) Object (Bax, 1995) and interpreted (Jensen, 1994), to allow the preconditions and postconditions on the execution of inferences and tasks to be represented.
(2) Hierarchical (Jensen, Huber & Shapiro, 1990), to naturally express the hierarchical structure of tasks.
(3) Time (Merlin et al., 1976), to represent execution times of tasks and to study time constraints.
Figure 6. The determining corrections task.
5.1. BUILDING AND VERIFYING THE INFERENCE PETRI NET
5.1.1. Building method. From the description of the inference set, we define the inference Petri net by associating places with roles and transitions with inferences. When it is used by more than one inference, a role generates only one place in the inference Petri net. The properties of a role correspond to attributes of the place representing the role. Preconditions of an inference are translated into preconditions of the transition representing the inference and postconditions are translated into actions of the transition. Finally, the execution time interval of an inference is associated with the execution time interval of the transition representing the inference. Figure 7 illustrates the method for building the inference Petri net.
Figure 7. Operationalizing inference structures of the inference layer.
5.1.2. Verifying the inference Petri net. The inference Petri net allows verifications to be made on the conceptual model, which may lead to a first correction of that model. For instance, we can identify the following on the inference Petri net.
(1) The input and output places† of the inference Petri net, corresponding to the inputs and the outputs of the conceptual model.
(2) Isolated transitions, corresponding to isolated inferences in the conceptual model.
(3) Transition wells (without output places) or sources (without input places), corresponding to incompletely defined inferences.
(4) Strong similarities between identifiers of roles or between identifiers of inferences. The degree of similarity is valued depending on the number of substituted and inserted characters. For instance, “Action Plans”, “action Plan” and “action plans” give strongly similar results and correspond either to different entities whose names should be changed or to one entity with a typing error or spelling mistake.
(5) Errors in transition preconditions or actions, such as the use of place names that do not correspond to input or output places of the transition. Preconditions must only use input roles while postconditions use input and output roles.
(6) Transitions with an inconsistent execution time interval.
†Input and output places of a Petri net (PN) can be identified from the input and output transitions of the PN. A transition is an input/output one if there is no transition preceding/following this transition. Then, input places/output places are the places in input/output of input transitions/output transitions.
Experts must correct the conceptual model in the OpenKADS environment and start the translation and verification process again. For instance, various problems were detected with respect to the 15 inferences of the action-recommending subsystem of SACHEM. Some of them were not actual problems: the 14 inputs and the four outputs were validated by the experts, and the presence of one isolated transition was justified. Others proved to be actual problems: of the four problems on role identifiers, three corresponded to spelling mistakes. The inference Petri net obtained from the corrected conceptual model is nothing more than a precedence graph between inferences (precedence that is implicitly described in the inference structures). This graph constrains the inference chaining described in the behavioural description of a task (given in the language LT).
5.2. BUILDING AND VERIFYING THE TASK PETRI NETS
The task Petri nets are built from the completed task layer and from the inference Petri net. 5.2.1. Building Method. In the completed task layer of the conceptual model, the structure of a task redefines the task hierarchically in subtasks and/or inferences (Fensel
& van Harmelen, 1994) and its behaviour describes the execution order of the subtasks/ inferences that compose the task. This hierarchical structure of tasks is naturally represented by hierarchical Petri nets. Indeed, each task is associated at the highest abstraction level with a Petri net composed of one substitution transition (Jensen, 1994), with input/output places, preconditions, actions and execution time intervals deduced from the Petri net of the task on the next lower abstraction level. At this immediately lower abstraction level, a transition is associated with each component of the task. In this Petri net, each transition can again be replaced by the Petri nets of its associated components, to obtain a Petri net of lower abstraction level and so on. The hierarchical Petri net represents the task at different abstraction levels (see Figure 8). At a given abstraction level, the Petri net of a task describes the precedence graph of the elements associated with its transitions. The whole system, considered as a task, can be associated with a hierarchical Petri net, as can any part of the system. The translation method of the behaviour of a task distinguishes independent components of the task from dependent ones as follows. (1) Dependent components are those that form a sequence in the inference Petri net. (2) Independent components are not dependent components. In fact, dependent components are the components of the task for which chaining remains implicitly defined in the inference Petri net. They correspond to dependent inferences/subtasks that share some of their input or output roles. Inference1 and Subtask2 of Figure 8 are dependent components of the task Example. Inference2 and Inference3 are dependent components of the task Subtask2. Independent components correspond to inferences/subtasks that have no common
Figure 8. Operationalizing a task at various abstraction levels.
role in the inference Petri net; that is, there is no implicit chaining defined between them in the inference Petri net. The behaviour of two dependent components is translated directly from the inference Petri subnet into the hierarchical Petri net corresponding to these components. The translation illustrated in Figure 8 represents this case. The behaviour of two independent components is translated by introducing control transitions and/or control places in the corresponding hierarchical Petri net (see Figure 9). The synchronization translation leads to the introduction of a control transition and as many control places as there are synchronized components. The sequence translation consists of adding a control place between the sequentialized components.
Figure 9. Defining the behaviour of the two independent tasks.
5.2.2. Verifying the task layer. Certain structural and behavioural properties of the tasks can be automatically verified during the building of the task Petri nets. Indeed, we can identify the following.
(1) Roles that are inputs of several inferences. This situation may correspond to two cases: either the datum represented by the role is consumed by one of the inferences and is not available for the others (this condition, which reveals an incomplete specification, will oblige the experts to revise the conceptual model) or the datum is read, then restored. All these situations must be pointed out to the experts.
(2) Strong similarities between identifiers of tasks.
(3) Incomplete definitions of task composition. No task may be empty; a task must contain at least one component; and every inference must be a component of a task.
(4) Cycles in task composition (a component cannot contain its container).
(5) Names that do not correspond to components of the task in its behavioural description.
(6) Wrong or incomplete input or output list in the structural description of a task. The true inputs (respectively outputs) of a task are the inputs (respectively outputs) of its components that are not also outputs (respectively inputs) of its components. Once deduced, the true inputs and outputs of a task can be compared to those listed in the structural description of the task.
(7) Task behavioural descriptions that do not respect the precedence constraints expressed by the inference Petri net. The only acceptable description is one of the following.
(a) A synchronization of independent components.
(b) A sequence between independent components.
(c) A sequence between dependent components that is already expressed in the inference Petri net. Note that a behaviour corresponding to such a sequence is redundant (since it is already expressed in the inference Petri net); it is useful only to make the task description more explicit.
Let us consider Subtask2 of the SACHEM conceptual model in Figure 8. The defined behaviour (Inference2, Inference3) is the only one acceptable for this task. No other behaviour is acceptable:
(i) (Inference3, Inference2) is a sequence between two dependent components that is not expressed by the inference Petri net.
(ii) (Inference2//Inference3), equivalent to (Inference3//Inference2), corresponds to a synchronization between dependent components.
(8) Missing task behavioural descriptions.
All these potential or actual problems are listed in a special file produced during the Petri net building. The experts are responsible for correcting the task layer of the conceptual model. During the study of the 10-task action-recommending subsystem of SACHEM, six problems concerning task composition were detected: two tasks had no component, one inference was not used as a task component and three tasks were incompletely defined, being composed of a task without a component or of an incompletely defined one. Two further problems concerned the names of tasks, as a result of spelling mistakes. Among the task behavioural descriptions, some were redundant (i.e. they expressed sequences that were already expressed in the inference Petri net), and many others were missing. The model comprising all the task Petri nets we finally obtained is an executable one. Several properties can now be verified by simulation on this model or on some parts of it.
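The composition checks (3), (4) and (6) of Section 5.2.2, together with the dependent/independent distinction of Section 5.2.1, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function names and the role names r1 to r4 are hypothetical (they are neither OpenKADS nor Cabernet syntax, and the roles of Figure 8 are not given in the text), and two components are taken to be dependent as soon as they share one role.

```python
# Hypothetical data layout (not OpenKADS syntax):
#   inferences: name -> (set of input roles, set of output roles)
#   tasks:      name -> list of component names (subtasks and/or inferences)


def empty_tasks(tasks):
    """Check (3): tasks without any component."""
    return sorted(name for name, parts in tasks.items() if not parts)


def unused_inferences(tasks, inferences):
    """Check (3): inferences that are not a component of any task."""
    used = {part for parts in tasks.values() for part in parts}
    return sorted(set(inferences) - used)


def cyclic_tasks(tasks):
    """Check (4): tasks that contain themselves, directly or transitively."""
    def reaches(start, current, seen):
        for part in tasks.get(current, []):
            if part == start:
                return True
            if part in tasks and part not in seen:
                seen.add(part)
                if reaches(start, part, seen):
                    return True
        return False
    return sorted(name for name in tasks if reaches(name, name, set()))


def true_io(name, tasks, inferences):
    """Check (6): true inputs/outputs of a component.

    Assumes cyclic_tasks() reported nothing and every name is known.
    """
    if name in inferences:
        return inferences[name]
    ins, outs = set(), set()
    for part in tasks[name]:
        part_in, part_out = true_io(part, tasks, inferences)
        ins |= part_in
        outs |= part_out
    # A role both produced and consumed inside the task is internal.
    return ins - outs, outs - ins


def dependent(a, b, tasks, inferences):
    """Assumption: two components are dependent if they share a role."""
    roles_a = set().union(*true_io(a, tasks, inferences))
    roles_b = set().union(*true_io(b, tasks, inferences))
    return bool(roles_a & roles_b)


# Structure of Figure 8; the role names r1..r4 are invented for the example.
inferences = {
    "Inference1": ({"r1"}, {"r2"}),
    "Inference2": ({"r2"}, {"r3"}),
    "Inference3": ({"r3"}, {"r4"}),
}
tasks = {
    "Example": ["Inference1", "Subtask2"],
    "Subtask2": ["Inference2", "Inference3"],
}
```

With these hypothetical roles, the deduced inputs and outputs of Subtask2 can be compared with the lists given in its structural description, and a behaviour such as (Inference2//Inference3) can be flagged because the two inferences are dependent.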
5.3. PRACTICAL BUILDING
From a practical point of view, the OpenKADS environment that allowed the SACHEM conceptual model to be defined can be used to translate conceptual models. The translation only requires the definition of Lisp translation functions (since the environment is coded in Lisp). The Petri nets so generated are expressed in the Cabernet syntax. Cabernet (computer-aided software engineering based on ER NETs) (Silva, 1994; Pezzè, 1994) is the software engineering environment we chose. It provides an integrated set of tools (graphical editor, executor, animator, analyser, hierarchy manager, etc.) for specifying and analysing the specifications of real-time systems based on Cab nets.
6. Simulating the SACHEM operational model
The Cabernet simulator can be used to perform behavioural analysis on the operational specification model. Starting from an initial marking corresponding to the input places of the system (or sub-system), we say that the simulation result is correct if the simulation produces a marking corresponding to the output places of the system (or sub-system). The initial marking is such that every input place is marked with one token. A place marking represents the existence of the associated datum and its accessibility. We say that the execution of a task is correct if, for any Petri net representing this task with the initial marking of its input places, when the execution of this Petri net stops, every output place of this Petri net is marked with one token and no other place is marked. Simulation allows us to identify specification problems of three kinds.
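The correctness criterion stated above can be illustrated with a minimal token-game sketch in Python. It is not the Cabernet simulator: it assumes ordinary place/transition nets without time, conditions or hierarchy, it explores every firing order exhaustively, and all names (`final_markings`, `execution_is_correct`, the places p1 to p3, the transitions x and y) are hypothetical.

```python
from collections import Counter

# A net is a dict: transition name -> (input places, output places).


def enabled(marking, transitions):
    """Transitions whose input places all hold at least one token."""
    return [name for name, (pre, _) in transitions.items()
            if all(marking[place] >= 1 for place in pre)]


def fire(marking, transition):
    """Return the marking obtained by firing one transition."""
    pre, post = transition
    new = Counter(marking)
    for place in pre:
        new[place] -= 1
    for place in post:
        new[place] += 1
    return +new  # unary plus drops places left with zero tokens


def final_markings(transitions, inputs, max_states=10_000):
    """Markings in which execution stops, over every firing order."""
    start = Counter({place: 1 for place in inputs})
    seen, stack, finals = set(), [start], []
    while stack:
        marking = stack.pop()
        key = frozenset(marking.items())
        if key in seen:
            continue
        seen.add(key)
        if len(seen) > max_states:
            raise RuntimeError("state space too large for this sketch")
        ready = enabled(marking, transitions)
        if not ready:
            finals.append(marking)
        for name in ready:
            stack.append(fire(marking, transitions[name]))
    return finals


def execution_is_correct(transitions, inputs, outputs):
    """One token in every output place and nowhere else, whenever it stops."""
    expected = Counter({place: 1 for place in outputs})
    finals = final_markings(transitions, inputs)
    return bool(finals) and all(m == expected for m in finals)


# Hypothetical nets: a plain sequence, and one place feeding two transitions.
sequence = {"x": (("p1",), ("p2",)), "y": (("p2",), ("p3",))}
conflict = {"x": (("p1",), ("p2",)), "y": (("p1",), ("p3",))}
```

On the `sequence` net the criterion holds for input p1 and output p3. On the `conflict` net, two different final markings are reached depending on which transition fires, so the criterion fails whatever output list is declared.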
6.1. EFFECTIVE CONFLICTS CAUSING DIFFERENT EXECUTION RESULTS
Effective conflicts† sometimes cause different execution results depending on the order in which the components are executed, or reveal transitions that can never be launched. Figure 10 illustrates an effective conflict (on place 1, which is an input of two transitions) causing different execution results. From the initial state (a), only one transition can be launched: the final state obtained may be (b) if x is launched or (c) if y is launched. This corresponds to an incomplete specification: at least one condition on the execution of the inferences/subtasks represented by the transitions x and y has not been expressed. Alternatively, if the two transitions must both be executed every time, the datum may have to be restored after having been read (i.e. the case of resource sharing).
Figure 10. Effective conflict with different execution results. (a) initial state; (b) & (c), final states.
Figure 11 illustrates an effective conflict (at place 2, which is an input of two transitions) that prevents a transition from ever being launched. From the initial state in Figure 11(a), after the transition x has been launched, the intermediate state of Figure 11(b) is obtained. In this intermediate state, only transition y can be launched and this leads to the Petri net of Figure 11(c). Is this situation realistic? Transition z will never be launched (i.e. the inference/subtask associated with this transition will never be executed). A solution consists of restoring the token in place 2 after transition y has been launched, but the decision lies with the experts.
Figure 11. Effective conflict with a transition that cannot be launched. (a) initial state; (b) intermediate state; (c) final state.
†The case of a place that is an input of two transitions defines a structural conflict. A structural conflict can be automatically detected on a Petri net. However, a structural conflict is not necessarily effective: in Figure 10(a), if we add an edge from transition y to place 1 and from transition x to place 1, and if conditions on transitions x and y (determining which one is fired) are defined, there is no longer a conflict. Consequently, effective conflicts can be detected only at the time of simulation.
6.2. PLACES MARKED WITH MORE THAN ONE TOKEN
Figure 12. Place marked with more than one token. (a) initial state; (b) final state.
From the initial state in Figure 12(a), the obtained final state is given in Figure 12(b), where place 3 is marked with two tokens. Do these two tokens represent only one datum or two data? The specification team must resolve this ambiguity: either the data were incrementally obtained (e.g. a database) or each execution destroys the previous result.
This case, which occurs when a place is an output of two transitions that are both launched before one of the tokens has been consumed by another transition, cannot be easily detected in the Petri net. The same ambiguity exists when a token is produced in one place several times in such a way that, before a new token is produced in the place, the previously produced token is consumed by a transition (this implies the presence of cycles in the Petri net). See, for instance, place 1 in Figure 12, and suppose that an edge is added from transition x to place 1. Then the problematic place is never marked with more than one token, but it can be identified by the presence of cycles.
6.3. TOKENS REMAINING IN PLACES THAT MAY NOT BE OUTPUT PLACES
The initial state in Figure 13(a) produces the final state given in Figure 13(b), where the two places 2 and 4 are marked. Is place 2 a true output of this sub-system? This situation arises if the place belongs to a cycle in the Petri net (but not for all the places in the cycle; see, e.g. place 3 in Figure 13). Even though cycles can be detected in the Petri net, cycle detection alone does not identify the problematic places (those concerned by the previous ambiguity between redefinition and incremental definition, or those that are not really outputs of the system).
6.4. ADVANTAGES OF SIMULATION
We have shown that simulation allows the behaviour of the specification model to be validated. This helps us to reach our objectives. Moreover, simulation allows the system to be studied from a viewpoint that is relevant to design. For instance, the performance of the system and/or of any sub-system may be evaluated when a launch-time interval is associated with each transition (i.e. a time interval for the execution of an inference or a task). This possibility is interesting from the standpoint of real-time system development. In addition, the system can be studied at different abstraction levels corresponding to the hierarchical structure of tasks. The user can then try different behavioural descriptions for each task and compare them. All this should allow different architectures of the system to be compared, a first step toward a design solution.
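As a rough illustration of this kind of performance comparison, the sketch below computes lower and upper bounds on the completion time of a task from the execution-time intervals of its components. It rests on assumptions that are not in the text: the precedence graph is acyclic, a component starts as soon as all its predecessors have finished, and the component names and durations are invented. It is not how Cabernet evaluates timed nets.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def completion_bounds(durations, predecessors):
    """Bounds (earliest, latest) on the completion time of a task.

    durations:    component -> (min duration, max duration)
    predecessors: component -> set of components that must finish first
    """
    graph = {name: set(predecessors.get(name, ())) for name in durations}
    finish = {}
    # static_order() yields each component after all its predecessors.
    for name in TopologicalSorter(graph).static_order():
        start_lo = max((finish[p][0] for p in graph[name]), default=0)
        start_hi = max((finish[p][1] for p in graph[name]), default=0)
        lo, hi = durations[name]
        finish[name] = (start_lo + lo, start_hi + hi)
    return (max(f[0] for f in finish.values()),
            max(f[1] for f in finish.values()))


# Hypothetical components and time intervals (arbitrary time units).
durations = {"A": (1, 2), "B": (2, 4), "C": (1, 1)}
as_sequence = {"B": {"A"}, "C": {"B"}}         # behaviour (A, B, C)
as_synchronization = {"B": {"A"}, "C": {"A"}}  # behaviour (A, (B // C))
```

Calling `completion_bounds(durations, as_sequence)` and `completion_bounds(durations, as_synchronization)` gives one pair of bounds per candidate behaviour, which is the kind of comparison between behavioural descriptions mentioned above.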
Figure 13. Tokens remaining in places that are not output places. (a) initial state; (b) final state.
7. Industrial feeling
7.1. PROBLEMS WITH THE VALIDATION OF A GENERIC MODEL OF INTERPRETATION
The interpretation model of the knowledge required to control a blast furnace is called the “Analysis Expertise General”. This model contains the generic interpretation model of SACHEM, some case studies, and the lists of actuators and of information channels. The analysis model of the general expertise specifies the fundamental functions of the reasoning of SACHEM and guides the process of knowledge acquisition. The validation of this model is therefore crucial. The knowledge engineers and the experts of the “Usinor experts’ group” have validated the analysis model of the general expertise from two perspectives:
(1) To validate the analysis model of the general expertise as an abstract description of the experts’ knowledge. Experts must read the model to answer the following question: Is this description compatible with your knowledge? The process of acquiring expert knowledge will only begin when the “Usinor experts’ group” has validated this property of the analysis model of the general expertise.
(2) To validate the analysis model of the general expertise as a description of a “conceptual engine”. Thus the “Usinor experts’ group” was asked the following question: Is this description compatible with your problem-solving method? In other words, are the logical properties of the tasks and inference structures contained in the analysis model of the general expertise adapted to the problem-solving process you are working with? This question is concerned with the ability of the analysis model of the general expertise to constitute an efficient specification of the functional architecture of the knowledge-based system to be constructed.
These two perspectives illustrate the epistemological rupture within the abstraction cycle of KADS (see Figure 1).
7.2. TO MAINTAIN A GENERIC MODEL OF INTERPRETATION
This difficulty occurred during the SACHEM system development, notably in 1992 when the analysis model of the general expertise was delivered for the first time. The experts declared that they were not able to validate this model. Technically, the difficulty in validating the “State Correction” task can be formulated as the following question: How can an expert validate the dynamic dimension of a generic model of interpretation? The problem was not approached until 1995, when the analysis model of the general expertise of SACHEM was updated to introduce a new description of the “State Correction” task. The problem was then to complete the analysis model of the general expertise and to extend the underlying ontology. The analysis model of the general expertise of SACHEM was then updated to add the 10 tasks and the 15 inference structures of the “State Correction” task.
The application of the validation principle proposed in this paper is based on two underlying ideas. The first is that the notion of function is a natural concept for the experts, so a description of functional behaviour is a familiar way for an expert to describe the dynamics of a system. The second is that the validation of the functional specification model entails that of the generic model of interpretation of the expertise.
7.3. LESSONS DRAWN
The application of the proposed method to the SACHEM analysis model of the general expertise leads to the following lessons.
(1) A KADS generic model of interpretation constitutes an efficient tool for developing the functional architecture of the SACHEM knowledge-based system. It is not necessary to operationalize the domain knowledge. This aspect becomes a decisive advantage when the volume of knowledge at the domain level is large.
(2) The operationalization of the task and inference layers of a KADS model makes the dynamic aspects of the reasoning process concrete. More precisely, the simulation shows when, why and how the tasks and inference structures are activated and deactivated to solve a problem. This aspect is particularly crucial in the design of a real-time knowledge-based system.
(3) Operationalization permits the detection of specification errors, especially inconsistencies and incompleteness. This fundamental property of the operationalization process results from the crossing of the epistemological rupture that exists between the knowledge level and the symbol level: the imprecision in all system descriptions at the knowledge level must disappear when the descriptions are recorded at the symbol level.
7.4. ECONOMIC IMPACT
To assess the economic relevance of the proposed method, we take the following facts as a basis for calculation.
(1) The “Action Recommending model” represents 33% of the whole generic model of interpretation of SACHEM.
(2) About 20% of the “Action Recommending model” was erroneous. About 80% of the mistakes were syntactic, the remaining 20% being semantic. Because the syntactic mistakes are easy to correct, we only take into account the semantic errors. So 4% of the “Action Recommending model” had to be corrected.
(3) The application of the method represents an effort of 40 man days, i.e. €26 000.
(4) The development of SACHEM cost around M€24.39 from 1992 to 1997. The cost of the SACHEM developments was distributed as indicated in the following points.
(a) The conceptual modelling phase cost M€1.52 (from 1992 to mid-1993).
(b) The specification modelling phase cost M€4.57 (from 1993 to mid-1995).
(c) The design and the coding phases cost M€18.3 (from 1994 to the end of 1997).
These numbers show that for every €1 invested in the conceptual modelling phase, a further €15 must be spent to construct SACHEM.
If one extrapolates these numbers to the whole of SACHEM, one arrives at the following evaluations.
(1) The application of the method to the whole generic model of interpretation of SACHEM represents an effort of 3 × 40 = 120 man days, i.e. €78 000.
(2) The stakes amount to a total of M€4.57 + M€18.3 = M€22.87 (i.e. 94% of the global cost of the SACHEM project).
(3) About 4% of the whole generic model of interpretation of SACHEM is erroneous. The cost of these mistakes is then 0.04 × M€22.87 = €914 800.
It is then possible to draw the following findings.
(1) The application of our method avoids an expense of 4% of the global cost of the SACHEM project (i.e. 0.04 × M€24.39 = M€0.97).
(2) The validation phase we add represents 0.3% of the global cost of the SACHEM project (€78 000/€24 390 000).
(3) In terms of stakes, the validation of the model costs 0.3% of the global cost of the SACHEM project and concerns 94% of the expenses (i.e. M€22.87/M€24.39).
It appears clearly that the cost of applying our method is negligible in relation to the global cost of the project and in relation to its stakes. Therefore, it can be applied in an industrial development environment.
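The ratios quoted in this section can be re-derived from the stated facts with a few lines of Python. The variable names are ours and amounts are in millions of euros unless noted. The only inputs are the figures given above, with 33% read as one third, as in the 3 × 40 extrapolation.

```python
import math

# Phase costs in millions of euros, as stated in fact (4).
phase_cost = {"conceptual": 1.52, "specification": 4.57, "design_coding": 18.3}
total = sum(phase_cost.values())                        # global project cost
stakes = phase_cost["specification"] + phase_cost["design_coding"]

# 20% of the model was erroneous and 20% of the mistakes were semantic.
semantic_error_rate = 0.20 * 0.20

# 40 man days cost 26 000 euros for one third of the generic model.
cost_per_day = 26_000 / 40                              # euros per man day
whole_model_days = 3 * 40
validation_cost = whole_model_days * cost_per_day       # euros

leverage = stakes / phase_cost["conceptual"]            # euros spent per euro of modelling
stakes_share = stakes / total                           # share of the global cost at stake
validation_share = validation_cost / (total * 1e6)      # validation cost vs global cost
error_cost = semantic_error_rate * stakes * 1e6         # euros

print(f"total = M€{total:.2f}, stakes = M€{stakes:.2f}")
print(f"leverage = {leverage:.1f}, stakes share = {stakes_share:.1%}")
print(f"validation = €{validation_cost:,.0f} ({validation_share:.2%})")
print(f"cost of semantic errors = €{error_cost:,.0f}")
```

The three phase costs sum to the stated global cost of M€24.39, and the derived values agree with the rounded percentages quoted in the text.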
8. Conclusion
In this paper we have proposed a method to validate a KADS conceptual model that is applicable without operationalizing the domain knowledge. The method transforms an informal KADS conceptual model into an operational specification model, based on the formalism of high-level Petri nets, that can be executed through simulation. To complete the conceptual model, we have provided the knowledge engineer with two formal languages: a language at the inference layer and a behavioural language at the task layer. We have shown that by formalizing the completed conceptual model into Petri nets, it is possible to reduce the vagueness and ambiguity of the informal conceptual model, particularly at the dynamic level, and to contribute to verifying the completeness and consistency of the specification. Finally, the simulation of the operational model contributes both to the validation of the specification and to the bridging of the gap between system specification and design.
An advantage of our method is that it allows knowledge engineers to complete and validate the operational model in the terms of the KADS informal model. Unlike other approaches (formalizing or operationalizing), we do not require the knowledge engineer to possess knowledge of the formalism used (e.g. formal logic), and the coherence between the two models (formal and informal) is easily assured. Thus the cost of operationalization is reduced to an acceptable level for a project aimed at the development of an industrial knowledge-based system. The method has been validated on the part of the conceptual model relating to the Action Recommendation Service of SACHEM.
Work in progress concerns the replacement of the Petri net formalism with a more general one, the DEVS formalism. It should lead to the introduction of temporal constraints in a KADS conceptual model so that execution constraints of tasks can be specified and simulated.
References
Banâtre, J.-P. (1991). La programmation parallèle. Outils, méthodes et éléments de mise en oeuvre. Paris: Eyrolles.
Bax, M. (1995). Réseaux de Petri Orienté Objet pour la modélisation des systèmes distribués temps réel. Thèse de doctorat, Université de Montpellier II, Décembre.
Calvez, J.-P. (1990). Spécification et conception des systèmes. Une méthodologie. Paris: Masson.
Fensel, D. (1995). Specification languages in knowledge engineering and software engineering. Knowledge Engineering Review, 10, 361–404.
Fensel, D. & van Harmelen, F. (1994). A comparison of languages which operationalize and formalize KADS models of expertise. Knowledge Engineering Review, 9, 105–147.
Fensel, D., Schoenegge, A., Groenboom, R. & Wielinga, B. (1996). Specification and verification of knowledge-based systems. Knowledge Acquisition Workshop, November.
Frydman, C., Torres, L. & Le Goc, M. (2000). Vérification et validation du modèle d’expertise du Système SACHEM. Conférence Ingénierie des Connaissances, Toulouse, Mai.
Groot, P., ten Teije, A. & van Harmelen, F. (1999). Formally verifying dynamic properties of KBS. 11th European Workshop on Knowledge Acquisition, Modeling, and Management.
Haouche, C. & Charlet, J. (1996). Knowledge-based system validation: a knowledge acquisition perspective. 12th European Conference on Artificial Intelligence, Budapest, Hungary.
Jacob-Delouis & Krivine. (1995). LISA: un langage réflexif pour opérationnaliser les modèles d’expertise. Revue d’Intelligence Artificielle, 9, 53–88.
Jensen, K. (1994). An introduction to the theoretical aspects of colored Petri nets. In J. W. Bakker, N. P. Roever & G. Rozenberg (Eds.), A Decade of Concurrency. Lecture Notes in Computer Science, Vol. 803, pp. 230–272. Berlin: Springer-Verlag.
Jensen, K., Huber, P. & Shapiro, R. M. (1990). Hierarchies in colored Petri nets. In Advances in PNs, Lecture Notes in Computer Science, Vol. 483, pp. 313–341. Berlin: Springer.
Kirsh, P., Maesano, L. & Rabaux, E. (1993). Open KADS. méthode & atelier pour la modélisation des connaissances. Génie Logiciel & Systèmes Experts No. 31, Juin.
Le Goc, M. (1999). Ontological models as shared model to validate a knowledge based system. Knowledge Acquisition Workshop, Banff, Alberta, Canada, October.
Le Goc, M. & Thirion, C. (1999). Using both numerical and symbolic models to create economic value: The SACHEM system example. 27th McMaster Symposium on Iron and Steelmaking, Ont., Canada, May.
Merlin, P. & Farber, D. J. (1976). Recoverability of communication protocols: implications of a theoretical study. IEEE Transactions on Communications, 24, 1036–1043.
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18, 87–127.
Pezzè, M. (1994). CABERNET: A customizable environment for the specification and analysis of real-time systems. Dipartimento di Elettronica et Informazione, Politecnico di Milano, June 2.
Pierra, G. (1991). Les bases de la programmation et du génie logiciel. Paris: Dunod.
Pierret-Golbreich, C. (1994). TASK MODEL: a framework for the design of expertise models and their application. Knowledge Acquisition Workshop, University of Calgary, October.
Pipard, E. (1987). INDE: un système de détection d’inconsistances et d’incomplétudes dans les bases de connaissances. Thèse de docteur de 3ème cycle en informatique, Université de Paris Sud, 17/12/1987.
Silva, S. (1994). Cabernet User Manual, Computer Aided Software Engineering Based on ER NETs. Cefriel and Politecnico de Milano, May.
Sommerville, I. (1992). Software Engineering. Reading, MA: Addison-Wesley.
Steel, L. (1990). Components of expertise. Artificial Intelligence, 11, 28–49.
Torres, L. & Frydman, C. (1998). Verifying and validating specifications of knowledge-based systems. European Conference on Artificial Intelligence, Brighton, UK, August.
van Harmelen, F. & ten Teije, A. (1997). Validation and verification of conceptual models of diagnosis. 4th European Symposium on the Validation and Verification of Knowledge-Based Systems (EUROVAV ’97), Leuven, Belgium, June.
van Harmelen, F., Balder, J. R., Aben, M. W. M. M. & Akkermans, J. M. (1991). (ML)²: a formal language for KADS models of expertise. Esprit Projet P5248 KADS-II, KADS-II/T1.2/TR/ECN/006/1.0, D1.2.1, 28/10/91.
Wielinga, B. J., Schreiber, A. T. & Breuker, J. A. (1992). KADS: a modelling approach to knowledge engineering. Knowledge Acquisition, 4, 5–53.
Paper accepted for publication by Associate Editor, Mark Musen