Int. J. Human-Computer Studies (1996) 44, 629-652
Causal model-based knowledge acquisition tools: discussion of experiments†

JEAN CHARLET
DIAM, Département Biomathématiques & Service d’Informatique Médicale de l’APHP, 91 Boulevard de l’Hôpital, 75634 Paris Cedex 13, France. email: charlet@biomath.jussieu.fr

CHANTAL REYNAUD
Laboratoire de Recherche en Informatique, Bâtiment 490, Université Paris-Sud, 91405 Orsay Cedex, France. email: Chantal.Reynaud@lri.fr

JEAN-PAUL KRIVINE
E.D.F. Direction des Etudes et Recherches, 1, Avenue du Général de Gaulle, 92141 Clamart Cedex, France. email: Krivine@clr34el.der.edf.fr

(Received 3 June 1994 and accepted in revised form 18 December 1995)

The aim of this paper is to study causal knowledge and demonstrate how it can be used to support the knowledge acquisition process. The discussion is based on three experiments we have been involved in. First, we identify two classes of Causal Model-Based Knowledge Acquisition Tools (CMBKATs): bottom-up designed causal models and top-down designed causal models. We then go on to discuss the properties of each type of tool and how they contribute to the whole knowledge acquisition process.

© 1996 Academic Press Limited
† This paper is a revised version based on a paper that was presented by the authors at the 6th European Knowledge Acquisition for Knowledge Based System Workshop, Heidelberg and Kaiserslautern, Germany (Charlet, Krivine & Reynaud, 1992).

1. Introduction

The aim of this paper is to study causal knowledge and demonstrate how it can be used to support the knowledge acquisition (KA) process. We use the term “causal knowledge” to refer to conceptual forms that allow us to describe and reason about the succession of phenomena. In Artificial Intelligence, we deal with two kinds of causality: (i) causality as an abstraction of the mathematization of a phenomenon, e.g. the qualitative physics approach; (ii) causality as a relationship between high-level concepts observed as phenomena, e.g. states of a device being modelled (see Section 5.4.1). The discussion is based on three experiments we have been involved in: the ADÈLE tool (Reynaud, 1993) and the ACTE tool (Charlet, 1991; 1993) in the field of medical diagnosis, and the DIVA experiment for turbine-generator diagnosis (David & Krivine, 1989a). These experiments were carried out separately and independently. Our discussion fits in with the new paradigm of “task and method specific tools” for KA (Musen, 1992; Puerta, Egar, Tu & Musen, 1992). We present
KA through a conceptual model of possible processes for building an expert system (Section 2), then we use this framework to locate the role and the place of Causal Model-Based Knowledge Acquisition Tools (CMBKATs) (Section 3). Such KA tools have two major roles, depending on the way they are built. They can be useful in validating the heuristic knowledge being acquired, although this supposes first that the causal KA process has been developed independently of the heuristic KA process, and second that a connection between the two knowledge bases is nevertheless possible. In this paper, such models are termed “bottom-up designed causal models”. Causal models can also be used to ensure consistency in expert discourse, but this supposes that they are connected with the heuristic level by construction. Such causal models, termed “top-down designed causal models”, are often built using expert justifications during interviews.

The two roles described above are important and useful in the KA process, and particularly in what we call “instantiation of the conceptual model”. The two kinds of tools which can be built by applying these different approaches can contribute significantly to enhancing capabilities during the KA process. ACTE and ADÈLE, described in Section 4, are representative illustrations of both kinds of tools. However, a single CMBKAT cannot fulfil both of the above roles. We therefore consider that it would be useful to study how these tools can be integrated or work together in the same KA workbench. Similarly, a promising research path would be to investigate how CMBKATs can be made to work with more conventional KA tools, with explicit reference to the underlying conceptual model.
In the last section of the paper, we discuss causal models (still in the KA context) and a few specific points based on the three experiments: (i) comparison with other closely related work, (ii) how CMBKATs manage a conceptual model of expertise, (iii) determination of an accurate causal model for KA, which leads us to identify three important properties: connection, consistency and completeness, (iv) causal models from an ontological point of view, and finally, (v) how CMBKATs deal with uncertainty.
2. Knowledge acquisition as conceptual model design

A new and fruitful paradigm for KA has emerged in recent years. This paradigm, sometimes called “task and method specific tools”, aims at bridging the conceptual gap between the form in which knowledge is described in the natural discourse of domain specialists, and the form in which it is represented in Knowledge Based programs. This approach is based on the design of a Conceptual Model which attempts to describe the problem-solving process at a higher level of abstraction.† This conceptual model can either be designed by selection and refinement from a predefined library of generic components, or can be built from scratch when no suitable generic components are available. Of course, the former case is preferable, and most attention is paid to the development of such a library [see CONSTRUCT, Esprit project P5477, or the KADS methodology (Wielinga, Schreiber & Breuker, 1992)]. Thus, KA becomes a modelling task, and once the conceptual model has been designed the process is largely dependent on the role and type of knowledge required. This issue has been widely examined. Clancey (Heuristic Classification), Chandrasekaran (Generic Tasks), Wielinga and Breuker (Models of Expertise and the KADS Methodology) and McDermott (Role Limiting Methods and KA Tools) were among the forerunners in this field (Clancey, 1985; Wielinga & Breuker, 1986; Chandrasekaran, 1987; McDermott, 1988).

† “Conceptual model” is now a widely used term. However, it may still give rise to some confusion. For a deeper discussion on how various approaches refer to it, see Karbach, Linster and Voss (1990).

KA, as a cognitive task, can also be modelled in this context. Figure 1, from Aussenac-Gilles, Krivine and Sallantin (1992), proposes a conceptual model for an expert system building process. This breakdown is not wholly original, and it is neither a generalization of previous descriptions nor an alternative description. It does not attempt to come up with a new KA theory, but merely reformulates a shared framework that enables us to discuss and compare KA tools and methods. For this reason, it is relevant to our discussion on causal models in KA. Before commenting further on this breakdown, let us first outline the utility of breaking down the KA process in this way.

Firstly, this kind of analysis provides a better delimitation of the purposes of individual tools and methods. Because of the increasing number of KA tools and methods, we must be able to delimit individual roles and define their respective scopes. A sound basis for comparing tools and methods is essential; otherwise we may end up trying to compare items that are not really comparable.

Secondly, such a framework is useful for specifying how several tools or methods can be used together. It attempts to specify the inputs and outputs of each step, and then to specify how complementary or redundant the various tools are. More generally, we need to model the KA process itself when designing a general KA workbench so that the various tools supplied should be recognizable by the role they can play during KBS design.
Last, and this is the purpose of our paper, such a framework is useful for specifying the benefit and role of a specific approach with respect to other ones.

We will now describe the four steps of Figure 1, as given by Aussenac-Gilles et al. (1992). Naturally, these steps are not sequential; they should be seen as a breakdown of the KA task rather than a linear KA procedure.

(1) Knowledge elicitation. Some authors, such as Motta, Rajan and Eisenstadt (1990), call this phase “bottom-up elicitation”. The output consists of “raw data” that can be more or less self-organized, depending on the tools or the method used. This data serves as a basis for constructing the conceptual model. Further elicitation will then become “model-driven” (conceptual model-driven); this is included in step 3 [termed “top-down elicitation” by Motta et al. (1990)]. The main characteristic of this phase is that no a priori model is assumed. Methods such as interviews or think-aloud protocols are the most frequently used at this stage. Motta et al. (1990) consider these techniques to be “weak” because they are domain-independent. This step also includes “data analysis”, which aims to remove the background noise in raw data and outputs the first set of organized elicited knowledge. Some current research on KA from domain experts focuses on the reuse of predefined ontologies. An ontology is a declarative model of the terms and relationships in a domain. Typically, ontologies are organized as class hierarchies, where each class defines a set of objects of a certain type. Each class has a set of attributes which models the concept that the class represents (Neches, Fikes, Finin, Gruber, Patil, Senator & Swartout, 1991). Such techniques could enhance the knowledge elicitation process by making it more efficient. For example, ontologies provide terms referring to the entities that must be distinguished as categories in the domain (Wielinga, van de Velde, Schreiber & Akkermans, 1993).

[FIGURE 1. Knowledge acquisition (from Aussenac-Gilles et al., 1992). Elements shown: knowledge sources (experts, documents, etc.), partial expertise, conceptual model schema, complete conceptual model, operational system. Numbered steps: (1) acquisition driven by data; (2) design of the conceptual model schema; (3) conceptual model instantiation; (4) operationalization of the conceptual model.]

(2) Design of the conceptual model. Based on the first set of data and knowledge gathered from step 1, step 2 seeks to provide a framework for representing these pieces of knowledge and describing how this knowledge will be used (the problem-solving process). Undoubtedly, one of the most difficult tasks in this paradigm is to define suitable problem-solving methods for tackling concrete problems. Many approaches adopt a selection and refinement process which, in fact, takes into account elements of various grain sizes. The KADS-I project is representative of one of the first approaches whereby a predefined and fairly comprehensive conceptual model, called an interpretation model, is selected from a library and subsequently adapted (Breuker, Wielinga, Van Someren, De Hoog, Schreiber, De Greef, Bredeweg, Wielemaker & Billault, 1987). More recently, various researchers have tried to develop new problem-solving methods from reusable components with a grain size finer than that of problem-solving methods. Thus, current research on PROTÉGÉ-II (Tu, Eriksson, Gennari, Shahar & Musen, 1995), on SPARK (Yost, Klinker, Linster, Marquès & McDermott, 1994), on KREST (Steels, 1990), on DIDS (Runkel & Birmingham, 1993), and also in the COMMON-KADS project (Wielinga et al., 1993) pays a lot of attention to identifying these small-grain abstractions, often called mechanisms, and to how these might be combined into useful problem-solving methods.

(3) Instantiation of the conceptual model. The conceptual model defined in step 2
provides a very solid foundation for KA. The role and place of knowledge still to be acquired is clearly defined. Many tools and methods exploit this fact, and most KA tools can be classified in this category. The most typical are certainly those from CMU, i.e. MORE (Kahn, Nowlan & McDermott, 1985), MOLE (Eshelman, 1988), SALT (Marcus & McDermott, 1989), and all the tools generated by PROTÉGÉ-II (Tu et al., 1995). Motta et al. (1990) term this step “top-down elicitation”.

(4) Operationalization. This phase consists of translating the conceptual model and the associated knowledge into a running system. Here, we are no longer directly concerned with “knowledge acquisition”. However, since KBS design very frequently needs feedback and looping, it would appear useful to design an operational system that offers the best possible reflection of the conceptual model. Some authors refer to a “semi isomorphism” (Hickman, Killin, Land, Mulhall, Porter & Taylor, 1989). This is mostly a matter of architecture or language, and this subject constitutes an important line of research (Karbach, Voss, Schuckey & Drouven, 1991; Reinders et al., 1991).

These four steps are not, of course, sequential, and iterations are frequent. They can be seen as a kind of conceptual model of the KA process itself. Of course, the human expert has to intervene during all four steps (at least during the first three).
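As an illustration of how such a framework can be used to delimit and compare tools, the four steps can be written down as a small lookup structure. This is only a sketch: the step names follow Figure 1, the tool assignments are those mentioned in the text above, and the identifiers (`Step`, `TOOL_STEPS`, `tools_for`, `uncovered_steps`) are hypothetical names introduced here.

```python
from enum import Enum


class Step(Enum):
    ELICITATION = 1         # acquisition driven by data
    SCHEMA_DESIGN = 2       # design of the conceptual model schema
    INSTANTIATION = 3       # conceptual model instantiation
    OPERATIONALIZATION = 4  # operationalization of the conceptual model


# Assignments taken from the discussion of steps (1) to (3) above.
TOOL_STEPS = {
    "interviews": Step.ELICITATION,
    "think-aloud protocols": Step.ELICITATION,
    "KADS-I interpretation models": Step.SCHEMA_DESIGN,
    "MORE": Step.INSTANTIATION,
    "MOLE": Step.INSTANTIATION,
    "SALT": Step.INSTANTIATION,
}


def tools_for(step):
    """Names of the tools or methods registered for a step, sorted."""
    return sorted(name for name, s in TOOL_STEPS.items() if s is step)


def uncovered_steps():
    """Steps for which no tool or method has been registered."""
    covered = set(TOOL_STEPS.values())
    return [s for s in Step if s not in covered]
```

Tools listed under the same step are candidates for comparison or redundancy, while a step returned by `uncovered_steps()` points to a gap in a given workbench.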
3. Causal models and their relations with KA

3.1. TWO CLASSES OF CAUSAL MODELS FOR KA
In our discussion, causal models are examined for their role in the KA process. Thus, the terms “heuristic level” or “heuristic knowledge base” refer to knowledge that needs to be acquired (i.e. the knowledge that causal models are supposed to improve), while the term “causal knowledge base” refers to the causal level. Before going on to discuss the place of causal models for KA in the above context, and commenting on our experience, let us first briefly define what we mean by causal models.

First, following Davis (1989), we reject the terms “deep models” and “shallow models”. These notions are relative. As Sticklen, Chandrasekaran & Bond (1988) state, one particular model is only deep with respect to another model, and may well be shallow compared to a third. Another point of view concerns the function of the expressed knowledge (the “teleology”). What we term in our discussion “causal models” are models that express causality at a certain level of abstraction, without referring to the goal of the final system in which they will be used (e.g. diagnosis or classification in a medical KBS).

Intuitively, causal models express knowledge that may be relevant over a wide range of applications and so appear to be more “generic”. They are supposed to capture fundamental laws governing a domain and often try to represent “domain theory”. For this reason, the term “deep knowledge” is sometimes applied. This kind of knowledge seems to be very naturally available, often appearing as a background to a number of interviews for KA (“why are you telling me that?”). Causal models also serve to justify acquired knowledge and provide new clues and new directions for broadening the KA process itself.

From our experience, we distinguish between two classes of causal models in KA,
according to the way these models are built. Each class is of interest and raises different kinds of difficulties.

Bottom-up designed causal models: these models are designed independently of the KA process. They generally develop out of existing models (design models, handbooks or tutorial manuals), and often describe “how things work” or “how things fail”. In electromyography, for example, medical students are taught detailed knowledge of neuroanatomy (the relationship between nerves and the muscles they innervate) and of pathology (the findings of the diseases). This knowledge can easily be acquired from medical handbooks.

Top-down designed causal models: these models are designed during the KA process. They are limited to the concepts involved in the application and, with regard to this application, they seek to be complete (see Section 5.3). They frequently appear as justifications of elicited knowledge (“why are you telling me that?”). Thus, using top-down designed causal models leads to two KA processes running simultaneously: the main KA process (to acquire what we have called “heuristic knowledge”), and the process of designing the causal model itself. In the domain of acute abdominal pains, for example, the description of the pathologies could not refer to a particular pathophysiological model because of the multiplicity of the systems involved. Only a descriptive model with findings (i.e. signs, syndromes) and diseases is reasonably available. It forms a causal model in the abdominal territory and can be the foundation of the elicitation phase.

Bottom-up designed causal models provide a different viewpoint: different with regard to the heuristic knowledge base, and different because their source is not the (generally expert) source which provides heuristic knowledge. The combination of two views of the same subject area may prove to be very fruitful, especially in terms of validating the heuristic level, but may also be difficult to achieve.
Conversely, although top-down designed causal models and the associated heuristic levels are represented at different levels of abstraction, the former are derived from the latter, and thus take on a role of “internal consistency in the expert discourse” rather than a validation of what is acquired. We return to this point below (see Section 5.3).

3.2. PLACE AND INTEREST OF CAUSAL MODELS FOR KA
In this paper, we aim to present and then comment on the reasoning behind the following three assertions:
• Causal models can be very useful for KA because they provide an alternative and complementary viewpoint on the knowledge being acquired. Several uses have been cited: (i) the use of justifications of expert knowledge for validating acquired knowledge or extending an existing knowledge base (Neches, Swartout & Moore, 1985), (ii) the acquisition of heuristic knowledge (Davis, 1989), and (iii) the acquisition of strategic knowledge (Gruber, 1991).
• Most causal model-based knowledge acquisition tools or methods (CMBKATs) play a major role during the “instantiation” of a conceptual model and are therefore relevant for step 3 in the framework described in Figure 1 (conceptual model instantiation). We will see that the various experiments use a pre-existing conceptual model. Gruber assumes that an initial amount of knowledge has been collected before the system (the ASK system) can be used (Gruber, 1991). In the
ACTE tool, a conceptual model is assumed (a kind of classification) and an initial set of knowledge (hypotheses and symptoms) is still specified when building and using the causal model. Similarly, ADÈLE can only run once the heuristic knowledge base has been expressed with an initial set of rules.
• However, CMBKATs do not perform this instantiation automatically, but help to achieve it. They therefore have to be used in collaboration or interaction with other tools or methods, and benefits come from explicitly referencing these tools or methods to the underlying conceptual model.
4. Three experiments

We will now present the three experiments our discussion is based on. Two of them are complete KA systems. They have been fully described elsewhere, in (Reynaud, 1993) for the ADÈLE system and in (Charlet, 1993) for the ACTE system. ADÈLE and ACTE were designed to support the KA activity in an interactive way in architectures combining heuristic and causal knowledge (often termed “Second Generation Expert Systems”).† Both of these systems are used in medical diagnosis applications. In this section we merely list the main features of these systems. The third experiment is described in David and Krivine (1989a,b) and is an experiment in combining causal models and heuristic levels for KA purposes rather than a pure KA tool.

4.1. THE ADÈLE SYSTEM
4.1.1. Description

ADÈLE is a CMBKAT with two main features. First, it is designed within the context of Second Generation Expert Systems, specifically in the context of systems in which heuristic and causal models cooperate. Second, it contributes to the design of a heuristic knowledge base, an equivalent to an instantiated conceptual model.

The basic idea in ADÈLE is to use causal models to make the main (heuristic) KA process easier. This does not mean that problems encountered with the (main) KA process will entirely disappear. ADÈLE does not generate the heuristic level from a causal level. On the contrary, heuristic and causal models are represented by two independent knowledge bases, each with its own KA process: the heuristic knowledge base (HKB) is represented by production rules (and will form the operational knowledge base), while the causal knowledge base (CKB) is composed of causal models and is represented using a formalism closely related to that of the semantic network. Since they come from different sources, these knowledge bases will probably model various viewpoints. The objective of ADÈLE is to take advantage of these different but possibly complementary viewpoints to help the main (heuristic) KA process while it is in progress.

† David, Krivine and Simmons (1993) outline experiments and reflections on Second Generation Expert Systems.

ADÈLE thus supports the knowledge engineer (assumed to be the system user) by providing control and understanding of the (heuristic) knowledge being acquired. Furthermore, it can provide suggestions to complete the HKB. It is a tool for
refining and extending an already existing HKB, using causal models. Moreover, this tool makes communication with the domain expert easier, because it is based on a causal level assumed to be more tractable. In addition, it contributes to easier expression and validation of the (heuristic) knowledge base.

The KA process in ADÈLE is as follows: the expert expresses a production rule encoding some piece of heuristic knowledge. Each rule is then considered separately. When a new production rule is acquired (let us assume that the heuristic KA process is in progress when ADÈLE operates), abductive reasoning based on the causal knowledge base provides justifications that can be regarded as proofs of the association between the conditions and the conclusion of the rule. A precise analysis of the justifications provides the expert with interesting results concerning the nature of the relationship between conditions and conclusion, the strength of this link, the roles played by the conditions and the discriminating power of the conditions over the conclusion. ADÈLE takes advantage of this analysis to explain the rule, comment on it, check it, and suggest modifications or new rules.

ADÈLE can also be used for designing new knowledge bases and maintaining existing knowledge bases. Such an approach cannot be used without any heuristic knowledge. Reasoning on causal models which have been acquired separately from the heuristic level has to be controlled. In the ADÈLE approach, the heuristic level is used for supporting reasoning on causal models. This approach has been tested on a medical diagnosis application in electromyography, diagnosing muscle and peripheral nerve disorders by electrical measurements. An example of a session in this field is described below (see Figure 2).
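The justification step described above can be approximated, in a very reduced form, by searching the causal knowledge base for a chain of relations linking each condition of a rule to its conclusion. The sketch below is an assumption-laden illustration, not ADÈLE's actual algorithm: the triple representation, the node and relation names, and the functions `justify` and `check_rule` are all hypothetical, and a plain breadth-first search stands in for the abductive reasoning.

```python
from collections import deque

# Toy causal knowledge base as (source, relation, target) triples.
# Every node and relation name here is an illustrative assumption.
CKB = [
    ("paresthesia_median_distal", "localized_in", "median_distal_muscles"),
    ("median_distal_muscles", "innervated_by", "median_nerve"),
    ("median_nerve", "damaged_by", "carpal_channel_syndrome"),
    ("hypoesthesia_median_distal", "localized_in", "median_distal_muscles"),
]


def justify(condition, conclusion, triples):
    """Return the chain of triples linking condition to conclusion, or None."""
    outgoing = {}
    for src, rel, dst in triples:
        outgoing.setdefault(src, []).append((rel, dst))
    queue = deque([(condition, [])])
    seen = {condition}
    while queue:
        node, path = queue.popleft()
        if node == conclusion:
            return path
        for rel, dst in outgoing.get(node, []):
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(node, rel, dst)]))
    return None


def check_rule(conditions, conclusion, triples):
    """Map each condition of a rule to its justification (None if unexplained)."""
    return {c: justify(c, conclusion, triples) for c in conditions}


report = check_rule(
    ["paresthesia_median_distal", "woman"], "carpal_channel_syndrome", CKB
)
```

In this toy run the first condition is justified by a three-step chain, whereas `"woman"` has no justification, which is the kind of result a knowledge engineer could then discuss with the expert.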
Other applications have been developed in this specific domain, among them the MUNIN expert system (Andreassen, Woldbye, Falck & Andersen, 1987), an expert assistant for electromyography which is part of ESPRIT project P599. Researchers began by working on knowledge representation, and proposed a probabilistic causal network as a unified approach to diagnosis, planning and explanation. This choice led to difficulties concerning acquisition of knowledge in the causal network. The solution which has been proposed consisted of using deep models (models of pathophysiological processes as expressed in medical textbooks and papers). This would reduce the problem of estimating thousands of probabilities to a problem of adjusting a much smaller number of model parameters.

4.1.2. CKB in ADÈLE

The CKB models the domain knowledge. In the CKB, causal relations co-exist with descriptive or definition relations. In addition, the CKB is not composed of a unique causal net, but may contain several competitive models. This corresponds to the fact that a model is always designed from a particular viewpoint and that a variety of viewpoints may be possible. In ADÈLE, we would like to take advantage of the multiplicity of available domain models. Nonetheless, not every causal model will be useful for the heuristic KA process. The knowledge engineer has to choose the most appropriate models, those most closely related to the heuristic knowledge being acquired.

With reference to the classification of causal models given in Section 3, causal models used in ADÈLE can be qualified as bottom-up designed causal models.
The knowledge engineer gives the rule described below. ADÈLE immediately operates on it.

    If not Paresthesie in the median distal region
    and if Woman
    then Carpal channel syndrome

This rule expresses the relation between paresthesie (prickling) confined to a part of the arm, the fact that the patient is a woman, and carpal channel syndrome, an obstruction of the channel of the median nerve at the wrist. The results of ADÈLE are:
1. This rule is not valid, it cannot be explained. (No explanation has been found for the association between the conditions and the conclusion of the rule in instantiated causal models.)
2. Nevertheless, one (partial) generic explanation exists: “A clinical symptom is localized in a muscle; a muscle is innervated by nerves; a nerve can be damaged by diseases”.
3. No explanation has been found for “woman”. Is “woman” a condition which reflects heuristic context?

ADÈLE has identified generic concepts which can be associated with some terms of the rule: clinical symptom for Paresthesie in the median distal region and disease for Carpal channel syndrome. It justifies the association between these concepts using generic causal models. This helps the knowledge engineer by providing understanding of the rule being acquired. Let us note that the explanation is partial because no concept has been found for the condition “woman”. This condition could reflect a heuristic context.

Results of ADÈLE are analyzed. The knowledge engineer decides to modify the rule (to remove the “not” in the conditions of the previous rule) and proposes that ADÈLE should validate the modified rule.

    If Paresthesie in the median distal region
    and if Woman
    then Carpal channel syndrome

The results of ADÈLE are:
1. This rule is partially valid. (One partial explanation has been found in instantiated causal models.) “Paresthesies in the median distal region are localized in the muscle in the median nerve; this is innervated by the median nerve which can be damaged by carpal channel syndrome.”
2. No additional condition is required to explain the conclusion of the rule.
3. The explanation below suggests that the rule is an evocative one.
4. Here is an analogous rule:

    If Hypoesthesie in the median distal region
    then Carpal channel syndrome

... etc.

FIGURE 2. An example of a session with the ADÈLE system, in the field of electromyography.
The CKB has its own KA process, which is quite independent of the main KA process. The CKB is built from pre-existing models directly extracted from medical textbooks. Obviously, such existing models are not directly connected with the heuristic level. The approach is therefore extended by a connection phase in which the knowledge engineer tries to reformulate the initial, “raw”, causal models to enable connection with the HKB.
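The session in Figure 2 distinguishes a generic explanation (over concepts such as clinical symptom, muscle, nerve and disease) from an explanation found in instantiated causal models, and it flags the term “woman” as having no associated concept. A minimal sketch of that two-stage check is given below. All identifiers and data are hypothetical, the generic model is assumed to be a simple chain with at most one useful outgoing relation per class, and the code is only meant to make the idea of connection between rule terms and the CKB concrete.

```python
# Hypothetical two-level CKB; every name below is an illustrative assumption.
IS_A = {  # object -> generic class
    "paresthesia_median_distal": "clinical_symptom",
    "median_distal_muscles": "muscle",
    "median_nerve": "nerve",
    "carpal_channel_syndrome": "disease",
}
GENERIC = [
    ("clinical_symptom", "localized_in", "muscle"),
    ("muscle", "innervated_by", "nerve"),
    ("nerve", "damaged_by", "disease"),
]
INSTANTIATED = [
    ("paresthesia_median_distal", "localized_in", "median_distal_muscles"),
    ("median_distal_muscles", "innervated_by", "median_nerve"),
    ("median_nerve", "damaged_by", "carpal_channel_syndrome"),
]


def generic_chain(start_class, end_class):
    """Follow generic relations from start_class to end_class (chain-shaped model assumed)."""
    chain, current, visited = [], start_class, {start_class}
    while current != end_class:
        step = next(
            (t for t in GENERIC if t[0] == current and t[2] not in visited), None
        )
        if step is None:
            return None
        chain.append(step)
        current = step[2]
        visited.add(current)
    return chain


def instantiate(chain, start_obj, end_obj):
    """Find instantiated triples matching each generic step, from start_obj to end_obj."""
    current, found = start_obj, []
    for _src_cls, rel, dst_cls in chain:
        match = next(
            (t for t in INSTANTIATED
             if t[0] == current and t[1] == rel and IS_A.get(t[2]) == dst_cls),
            None,
        )
        if match is None:
            return None
        found.append(match)
        current = match[2]
    return found if current == end_obj else None


def explain(conditions, conclusion):
    """Return per-condition explanations and the list of unconnected terms."""
    unconnected = [c for c in conditions if c not in IS_A]
    results = {}
    for c in conditions:
        if c in unconnected:
            continue
        chain = generic_chain(IS_A[c], IS_A.get(conclusion))
        inst = instantiate(chain, c, conclusion) if chain else None
        results[c] = {"generic": chain, "instantiated": inst}
    return results, unconnected
```

Searching the generic level first keeps the instantiated search narrow: only instantiations of a generic explanation that has already been found are looked for.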
The CKB is modelled at two levels, represented by two kinds of models: “Instantiated Causal Models” and “Generic Causal Models”. “Instantiated Causal Models” model objects and relationships between objects of the subject domain. These objects refer directly to the conditions and the conclusions of the HKB production rules. “Generic Causal Models” model classes of objects and relations between them, and are obtained from “Instantiated Causal Models”. Classes of objects stem from grouping together objects, and relations between classes of objects stem from generalizing relations between objects. Relations of the models (whatever the kind of model) are described by a name in a literal form, a type which denotes the type of domain knowledge that the relation refers to (neuroanatomical, pathophysiological, etc.) and its nature (hierarchic, descriptive, functional, causal, evocative). The values of these attributes are exploited when analysing explanations provided by the CKB. Relations between Generic Models and Instantiated Models are exclusively “is-a” relations between classes of objects and objects which belong to them.

Using such causal models was helpful in our medical application: medical diagnosis is a suitable domain, and one for which causal models are available. We worked on a medical diagnostic reasoning system for electromyography. The CKB is composed of:
• an anatomical model which models which muscle is innervated by which nerve,
• a model which describes an electrical schema of diseases, and which in particular describes electrical measurements which are abnormal when a disease appears,
• a model of localization which shows in which muscular region a symptom appears,
• a pathophysiological model which indicates the effects of the diseases on nerves, on electrical measurements and on the clinical schema of a patient.

4.1.3. Using ADÈLE

ADÈLE improves communication with the domain expert.
It provides the knowledge engineer with a deeper background to understand the acquired expert knowledge. By giving explanations and making comments, it helps to avoid potential misinterpretations and modelling errors.† Moreover, the knowledge engineer will be able to detect strange combinations of knowledge elements, which are expressed at different levels of abstraction. Expressing knowledge with the production rule formalism is sometimes difficult for an expert and can lead to imperfections (Console, Fossa & Torasso, 1989). The results provided by ADÈLE can serve as the basis for discussion between the expert and the knowledge engineer, which can lead the expert to reformulate the system’s knowledge. In addition, ADÈLE provides the knowledge engineer with the possibility of checking the acquired expert knowledge by using the causal knowledge base; this is a way of validating the acquired heuristic knowledge.

† As far as the design of a knowledge base is considered as a modelling process.

ADÈLE has a very interesting structure, from both methodological and computational viewpoints. First, explanations are sought at generic level; then, when (at least) one has been found (termed generic explanation), further explanations are
sought at the instantiated level. This approach does not explore all instantiated models, but seeks instantiations of generic explanations.

4.2. THE ACTE SYSTEM
4.2.1. Description
ACTE is a KA tool for a medical diagnostic domain. The resulting knowledge-based system seeks to achieve a heuristic classification. The approach is based on the interpretation of a causal model. The aim of this interpretation is twofold: it facilitates checking of the causal model, and it generates heuristic knowledge which is usable for a specific task. ACTE handles two knowledge bases: the CKB (i.e. the causal model), which is the input to the KA process, and the HKB, which is the output of this process and will constitute the operational knowledge base. The CKB is represented using a semantic network formalism, while the HKB is represented by production rules. The following steps describe ACTE’s KA process.
(1) The conceptual model is first defined so as to describe the problem-solving methods that accomplish the diagnostic task. These methods deal, for example, with ‘‘data-abstraction’’ or the ‘‘unicity of the fault process’’.
(2) The expert provides domain knowledge in a causal model framework. This causal model is the CKB.
(3) The CKB is taken as a whole and globally analysed in interaction with the expert in order to check its consistency.
(4) Each relation is analysed separately and interpreted by the problem-solving methods of the conceptual model. This generates both immediate (obvious) and heuristic knowledge. Depending on the nature of the heuristic knowledge (i.e. the nature of the causal relationships and problem-solving methods which have generated it), it may be checked and refined by the expert.
The CKB is therefore interpreted by the interpretation model, and the final HKB is built. ACTE has been applied to LÉZARD, a medical diagnostic reasoning system for discriminating between diseases in the domain of acute abdominal pain. We will illustrate various features of ACTE with examples from the knowledge base of the LÉZARD system (Charlet, 1992, 1993).
4.2.2. CKB in ACTE
We have seen that the CKB models domain knowledge. This knowledge is pathophysiological and semiological. Both types of knowledge are represented in a causal network. The pathophysiological part of the network is made up of causal and hierarchical relationships between diagnoses. Causal relationships describe the etiology of diagnoses. Hierarchical relationships are tangled taxonomies based on anatomical or pathological considerations. These tangled taxonomies express multiple viewpoints on the same set of diseases. In ACTE we chose both an anatomical description of
[FIGURE 3. Causal network representation. Dotted lines represent causal links, black lines represent taxonomic links and grey lines represent evocation links. Nodes shown: EPIGASTRIC-PAIN, DIGESTIVE-DISEASE, APPENDICITIS, OBSTRUCTION, ORGANIC-OBSTRUCTION, FUNCTIONAL-OBSTRUCTION, OMBILICAL-PAIN, PERITONITIS, VASCULAR-DISEASE, URINARY-DISEASE, PYELONEPHRITIS, MESENTERIC-INFARCTION, RENAL-COLIC, URINARY-TRACT-INFECTION.]
the diseases (e.g. DIGESTIVE-DISEASE or URINARY-DISEASE) and a process-based description (e.g. ORGANIC-OBSTRUCTION or FUNCTIONAL-OBSTRUCTION). Semiological knowledge is described by relationships between signs and diagnoses. These signs are symptoms (input data) or syndromes (abstracted data). Furthermore, the nature of these relationships is twofold: qualifying conditions, which express that a sign must be observed before evoking a disease (Eshelman, 1988), and triggering conditions, which evoke some hypotheses (Szolovits, Patil & Schwartz, 1988). Figure 3 gives an illustration of this type of knowledge.
4.2.3. Using ACTE
ACTE is closely related to a type of conceptual model (a type of heuristic classification applied to medical diagnosis). This domain is characterized by incompleteness and uncertainty.† The problem-solving methods, which are now described, have been influenced by previous work (Long, Naimi, Criscitiello, Pauker & Szolovits, 1984; Clancey, 1985; Chandrasekaran, 1987; Szolovits et al., 1988):
• generalizing from specific observations with the ‘‘data-abstraction’’ method;
• focusing on the highest abstraction levels, i.e. trying to decide if the classes of the most widespread diseases are present before deciding on more specific classes;
• using trigger or necessity signs to evoke hypotheses;
• applying the unique fault process hypothesis, i.e. attempting to create a unique causal chain to explain the observed symptoms.
† Not all medical domains are characterized by incompleteness and uncertainty, e.g. cardiac pathophysiology, which may raise ‘‘qualitative physics’’ issues.
The last method intervenes, among other methods, in the validation of the CKB. This method is used in several diagnostic systems, particularly in the medical field (Long et al., 1984; Szolovits et al., 1988). According to this hypothesis, multiple disorders (in medicine, multiple diseases) may occur simultaneously if, and only if, they are connected by causal relations. ACTE’s interpretation of the causal network is based on this hypothesis. The interpretation algorithm is the exclusions calculus.† The basic idea is to use the unique fault process in two ways: either to seek diagnoses which may appear in the same causal process or to seek diagnoses which may not appear in the same causal process; a simplified formulation is: ‘‘two concepts without causal relationships are exclusive’’. In practice, this interpretation yields mutual-exclusions (i.e. two concepts of the domain cannot belong to the same solution) as opposed to mutual-compatibilities. The reality is obviously more complex insofar as the causal relationships may or may not be transitive: ‘‘A may cause B’’ and ‘‘B may cause C’’ do not imply ‘‘A may cause C’’. These problems of ‘‘weak transitivity’’ have been discussed in cognitive science (Johnson-Laird, 1980) and in AI (Szolovits et al., 1988; David & Krivine, 1989a). Our approach consists in asking the expert about the plausibility of chains of causally-related events such as A and C. Mutual-exclusions and mutual-compatibilities are then retained conditionally on the transitivity (or non-transitivity) attributed by the expert to the successive causal relationships. The benefit of this interpretation is twofold.
• It validates the causal network. For example, the expert may not agree with the result of the system (e.g. a mutual-exclusion relation between two diagnoses). He might thus be led to add new causal relationships to justify his position (e.g. that the two concepts above are compatible).
• The unique fault process hypothesis is operationalized in heuristic knowledge through mutual-exclusions.
Just like a bottom-up designed causal model, the causal model used in ACTE may well come from existing models directly extracted from medical textbooks. Nevertheless, such a model has been designed for a particular discrimination diagnostic task, with well known signs and diagnoses. Moreover, the CKB is checked during the KA process. In this way, and with reference to the classification of causal models in Section 3, the causal model used in ACTE mainly refers to a top-down designed causal model. This illustrates how a causal model can help to instantiate a conceptual model in order to create an operational and heuristic knowledge base. Further work will be carried out to make the conceptual model in the program more explicit, as for heuristic knowledge. This may be difficult, but the result should enhance the performance and the understanding of the interpretation process of semiological and pathophysiological relations (i.e. by making the nature of the knowledge provided explicit in terms of the conceptual model).
† More details and examples of this algorithm are presented in Charlet (1993).
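The simplified formulation of the exclusions calculus given in this section (‘‘two concepts without causal relationships are exclusive’’) can be illustrated by a minimal sketch. It is not ACTE’s algorithm, which is presented in Charlet (1993). It only assumes that the CKB is reduced to a set of directed ‘‘may cause’’ links, that a chain of links counts as a causal connection only when the expert has accepted it (weak transitivity), and that two diagnoses with no accepted connection are mutually exclusive. The function names, the `accepted_chains` parameter and the abstract diagnoses A to D are hypothetical.

```python
from itertools import combinations


def causal_closure(links, accepted_chains):
    """Return the (cause, effect) pairs considered causally connected.

    links: set of direct 'a may cause b' pairs (illustrative CKB extract).
    accepted_chains: set of (a, c) pairs whose chained causality the
        expert judged plausible; transitivity is never assumed by default.
    """
    connected = set(links)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(connected):
            for (b2, c) in list(connected):
                if b == b2 and a != c and (a, c) not in connected:
                    # The chain a -> b -> c is kept only if the expert accepted it.
                    if (a, c) in accepted_chains:
                        connected.add((a, c))
                        changed = True
    return connected


def mutual_exclusions(diagnoses, links, accepted_chains=frozenset()):
    """Pairs of diagnoses with no accepted causal connection in either direction."""
    connected = causal_closure(links, accepted_chains)
    exclusions = set()
    for a, b in combinations(sorted(diagnoses), 2):
        if (a, b) not in connected and (b, a) not in connected:
            exclusions.add((a, b))
    return exclusions


# Hypothetical network: 'A may cause B' and 'B may cause C'; D is isolated.
links = {("A", "B"), ("B", "C")}
print(sorted(mutual_exclusions({"A", "B", "C", "D"}, links)))
# The same network once the expert accepts the chain from A to C.
print(sorted(mutual_exclusions({"A", "B", "C", "D"}, links, {("A", "C")})))
```

In this sketch the pair (A, C) is reported as exclusive until the expert accepts the chain, which mirrors the validation dialogue described above: a disputed exclusion prompts the expert either to accept the chain or to add a causal link.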
4.3. COMPARISON BETWEEN ACTE AND ADÈLE
ADÈLE and ACTE are two very similar CMBKATs; in particular their CKB and HKB are the same as regards the nature of the modelled links. Nevertheless some specific differences must be noted.
First, ADÈLE and ACTE do not refer in the same way to the conceptual model of the task being modelled. Neither takes advantage of an explicit representation of the conceptual model (developments along these lines are planned). ACTE, however, assumes a fixed conceptual model and interprets the causal model in this framework in order to create some, often heuristic, rules, even though the conceptual model might not be explicit. Conversely, ADÈLE takes heuristic rules (i.e. a partially instantiated conceptual model) and verifies that these rules are plausible interpretations of a causal model. In this sense, ADÈLE has the potential to be used for a number of different tasks.
Moreover, ACTE works from a causal model towards heuristic rules, whereas ADÈLE attempts to work in the opposite direction (from heuristic rules towards a causal model). In the former case, an examination of the CKB can help to generate new pieces of heuristic knowledge. In the latter case, the starting point is the heuristic knowledge and the CKB is used for various checking procedures. These approaches are complementary: ACTE allows us to instantiate a conceptual model on a domain in order to build a heuristic and operational knowledge base, whereas ADÈLE enables us to validate chunks of an implicit conceptual model included in new heuristic rules (Charlet, 1993; Reynaud, 1993).
Finally, the causal model in ACTE is mainly what we have termed a ‘‘top-down designed causal model’’, while the causal model in ADÈLE refers to a ‘‘bottom-up designed causal model’’. As we will see in the next section, the latter is more suitable for validation, while the former is more useful for internal consistency checking or for extending the knowledge base over a broader domain area.
4.4. CAUSAL MODELS IN THE DIVA EXPERIMENT
DIVA is an expert system for vibration-based monitoring of turbine-generators. It is currently in the industrialization phase. DIVA is built around an explicit conceptual model. The main task concerns classification of typical situations which represent possible states of the machine. The system tries to identify the situations which best account for concrete symptoms present on the machine. This description characterizes the type of knowledge that must be acquired [accordingly, a strong protocol has been defined to guide the KA process (David, Krivine & benoit Richard, 1993)]. However, during this process, which led to the construction of a large knowledge base, it appeared that the experts often provided justifications for the elicited knowledge. Thus, in order to support the KA process, an attempt was made to gather all justifications in a causal network (David & Krivine, 1989b). According to our classification of causal models for KA, the DIVA causal model is clearly a top-down designed causal model. The starting point is the heuristic level, and causal models are derived from this heuristic level. Section 5 will discuss the problems we encountered during this experiment.
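The idea of gathering expert justifications into causal networks can be sketched as follows. This is only an illustration under stated assumptions, not a description of DIVA: we assume that each justification is recorded as causal steps attached to the heuristic rule it justifies and to the typical situation it explains, and that one local network is kept per situation. The function names, the tuple layout and the identifiers (R1, S1, c1, ...) are hypothetical.

```python
from collections import defaultdict


def build_local_networks(justifications):
    """Group justification links by the typical situation they explain.

    justifications: iterable of (rule_id, situation, cause, effect) tuples,
        one per causal step given by the expert to justify a heuristic rule.
    Returns {situation: {(cause, effect): set of rule_ids}} so that every
    causal link stays connected to the heuristic rules it justifies.
    """
    networks = defaultdict(lambda: defaultdict(set))
    for rule_id, situation, cause, effect in justifications:
        networks[situation][(cause, effect)].add(rule_id)
    return {situation: dict(edges) for situation, edges in networks.items()}


def unjustified_rules(rule_ids, networks):
    """Heuristic rules for which no justification has been gathered yet."""
    justified = {
        rule
        for edges in networks.values()
        for rules in edges.values()
        for rule in rules
    }
    return sorted(set(rule_ids) - justified)


# Hypothetical justifications for three rules and two typical situations.
justifications = [
    ("R1", "S1", "c1", "c2"),
    ("R1", "S1", "c2", "symptom_a"),
    ("R2", "S1", "c2", "symptom_a"),
    ("R3", "S2", "c3", "symptom_b"),
]
networks = build_local_networks(justifications)
print(networks["S1"][("c2", "symptom_a")])
print(unjustified_rules(["R1", "R2", "R3", "R4"], networks))
```

Because each link is stored together with the rules it justifies, the connection between the causal level and the heuristic level is obtained by construction; nothing in this sketch guarantees that the local networks are mutually consistent.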
5. Discussion
Before discussing CMBKATs, let us remember that our discussion of causal models only applies within a KA context. Our aim is not to comment on causal knowledge alone. This is a very broad area, ranging from qualitative physics to causal nets, and a large amount of literature is available. Rather, we wish to discuss causal models from a KA point of view, first by comparing our work with other closely related work, then by addressing specific points relative to the use of our causal models in the KA process. Subsequently, after restricting the discussion to the problem of KA, we place our causal models in relation to the ‘‘classical’’ causal models mentioned in the literature, especially models of correct behaviour and fault models, and discuss uncertainty.
5.1. COMPARISON WITH OTHER CLOSELY RELATED WORK
One can find similarities between our work and many computer-based tools that facilitate the knowledge acquisition process. TEIRESIAS, developed by Davis (1979), already had debugging capabilities and was able to determine whether a newly added piece of knowledge fitted into an existing knowledge base. From the functionality point of view, ADÈLE is very similar. However, these systems differ in the way they improve their capabilities. ADÈLE relies on a high-level understanding of the acquired knowledge by using two-level causal models, whereas TEIRESIAS relies on rule models and proposes explanations at the level of the program control structure.
Such comparisons are numerous; we cannot be exhaustive. We shall therefore restrict ourselves to KA systems that share the main characteristics of the tools that we have put forward in the paper. This leads us to consider only systems which help in the phase of instantiation of the conceptual model. Many such systems were developed by J. McDermott’s research group at Carnegie Mellon University during the 1980s. We are going to compare our work with MOLE (Eshelman, 1988), one of these systems, which, in our opinion, can be viewed as a CMBKAT.
MOLE allows users to describe knowledge for diagnostic tasks in terms of a method called Cover-and-Differentiate. Because it is a method-oriented tool, some of its features are very interesting: for example, the expert is not required to describe anything more than an under-specified network before starting the KA process, and effort has been directed toward not bothering the expert with unnecessary questions. Like MOLE, ACTE makes heuristic assumptions about the world, which set limits and constraints on the knowledge to be acquired. Some of these assumptions play a similar role in the KA process of the two systems. Nevertheless, similar assumptions may lead to different expressions of the corresponding problem-solving methods. It may also be interesting to compare the results (in knowledge base terms) given by the assumption of Exclusivity in MOLE and the assumption of unicity of the fault process in ACTE.
The main difference between MOLE and our systems is the use of the causal model. MOLE acquires just enough knowledge about the domain to find an initial set of hypotheses that cover the symptoms and differentiate between the hypotheses. So, some pieces of knowledge acquired during the KA process can be viewed as pieces of causal knowledge, but there is no representation of a causal model as
TABLE 1
Comparison of the ACTE, ADÈLE, and MOLE CMBKATs
Characteristics | ACTE | ADÈLE | MOLE
Functionality | Generate heuristic knowledge from causal knowledge | Control and guide the heuristic knowledge transfer | Guide the heuristic knowledge acquisition
Type of KA tool | Method-oriented | Method-independent | Method-oriented
Supported method | Heuristic Classification | - | Cover-and-differentiate
Hypotheses of the method | Unicity of the fault process | - | Exhaustivity and Exclusivity of the diagnoses
Class of causal model | Top-down | Bottom-up | Top-down
Role in the KA phase | Assure internal consistency of the KB | Validate the KB | Assure internal consistency of the KB†
Means | Global checking of the KB | Iterative process of checking pieces of heuristic knowledge | Iterative process of refinement of the heuristic knowledge†
† Roles and means are re-interpretations from the authors. Causal models problematic is not explicitly addressed by Eshelman (1988).
such. No specific use of a causal model is planned in MOLE’s approach. In ACTE or ADÈLE, an explicit causal model is defined (with a hierarchical structure in the case of ACTE). This description is more restrictive but allows global checking and validation of the knowledge to be acquired.
In the way they model the problem-solving process, neither MOLE, ADÈLE nor ACTE has an explicit conceptual model. Nevertheless, MOLE and ACTE make assumptions on the method used by the designed expert system. MOLE assumes that the method is cover-and-differentiate while ACTE assumes that it is a kind of heuristic classification. Both MOLE and ACTE are method-oriented KA tools whereas ADÈLE is a method-independent KA tool.
Finally, referring to our classification of causal models for KA, we can say that the MOLE causal model is clearly a top-down designed causal model. The starting point is the heuristic level (the expert), and the causal model is acquired during the knowledge acquisition process. Table 1 summarizes the comparison between ACTE, ADÈLE, and MOLE.
Let us now examine another important category comprising tools that operate at a meta-level. These tools are able to generate knowledge acquisition tools automatically from the model of a task. Examples of this type of tool are PROTÉGÉ-I (Musen, 1989), which is method-oriented, DOTS (Eriksson, 1990), which does not follow a given problem-solving method, and, more recently, PROTÉGÉ-II (Tu et al., 1995), a method-independent shell. Let us compare these generated knowledge acquisition tools with our work. The generated KA tools are specialized knowledge editing environments for application experts. A task model is used to determine components of an interface for a knowledge editor. As they are task-oriented, the tools guide the user through the acquisition sessions by prescribing the roles in which the knowledge entered by a developer will be used by a given inference
engine. They are also very useful for ensuring that the knowledge is complete and consistent with respect to the model. Neither ACTE nor ADÈLE is task-oriented. They do not know the roles in which the entered knowledge will be used. Neither of them is therefore concerned with the manner in which the contents of a KB should be presented to their users for editing. ACTE and ADÈLE are more suitable for validation or internal consistency checking. With regard to the use of task or problem-solving models, ACTE and ADÈLE are, in a sense, more general. They are task-independent KA tools. Moreover, ADÈLE’s approach is independent of any problem-solving method.
5.2. RELATION TO THE CONCEPTUAL MODEL
We have seen that neither ADÈLE nor ACTE benefits from an explicit conceptual model (even if ACTE assumes that the method is a kind of heuristic classification). However, making the underlying conceptual model explicit appears to be a promising path for research. The ‘‘knowledge level approach’’, briefly described in Section 3, allows us to assign a precise problem-solving role to each chunk of acquired knowledge. It can also provide a way of ‘‘labelling’’ the outputs from the various CMBKATs. Viewing KA in terms of ‘‘acquiring knowledge for a specific task in a specific conceptual model’’ has required a further step forward, by comparison with ‘‘acquiring rules’’ as in the first generation of expert systems. In the same way, it may be interesting to look at how CMBKATs can work together with other, more conventional KA tools performing instantiation of conceptual models (with explicit use). However, it may become difficult to identify and characterize the nature of the knowledge or comments provided by the CMBKATs in terms of the conceptual model. This is currently being looked into with the ADÈLE and ACTE experiments.
Let us now give a more concrete example of how CMBKATs can deal with an explicit conceptual model. Part of the conceptual model of the DIVA system is a classification task. This classification task requires a set of well identified types of knowledge: knowledge to decide if a specific typical situation accounts for the concrete symptoms gathered on the machine, knowledge to refine a situation that has been recognized, knowledge to interpret an inference as ‘‘abstracting data’’, etc. The causal model provides the basis for a global explanation of symptoms. Thus, it is strongly related to the first type of knowledge quoted above (knowledge to decide if a specific typical situation accounts for the concrete symptoms gathered on the machine).
In other words, if the classification is mainly based on a heuristic matching, a causal explanation may help us acquire this heuristic knowledge. In this example, the causal model is of no help for other types of knowledge identified in the conceptual model. 5.3. WHAT IS AN ACCURATE CAUSAL MODEL FOR KA?
Designing a causal model is a KA process that comes in addition to the main process that aims at eliciting the HKB. These two processes may be completely separate or inter-related, depending on the type of model (top-down and bottom-up). But for both of them, three properties are expected.
(1) Connection. This property refers to the degree of connection between the CKB and the HKB. Obviously, the connection must be strong if it is to improve the KA process.
(2) Consistency. This property refers to the whole causal model and its internal consistency. This property is not essential. A CMBKAT can operate on several partial justifications or several competing networks.
(3) Completeness. This property refers to the knowledge present in the causal model or part of a causal model. A certain kind of completeness is expected (completeness is relative, just like the ‘‘deep model’’; what is expected is at least completeness regarding the role the causal model has to play for KA).
We shall discuss these three properties in connection with the two classes of causal models defined in Section 3. Then, we shall introduce some advantages of causal models.
5.3.1. Obtaining causal models by a ‘‘top-down approach’’
The first class of causal model for KA identified above is obtained from the ‘‘heuristic level’’ (the knowledge used in the KBS under construction). These causal models are termed ‘‘top-down designed causal models’’. They are mostly the direct justification of the chunks of knowledge already acquired (‘‘why are you telling me that?’’). This process of construction has some important consequences in terms of connection, consistency and completeness. These are as follows.
• The connection between the causal model and the heuristic level is easy to produce (one is derived from the other). Connection is ‘‘by construction’’.
• However, it may be difficult to design a coherent and unique causal network. Each justification is strongly related to the heuristic level, but there is no reason why the set of justifications should be coherent. This problem was encountered in the DIVA experiment, where it was not easy to connect concepts belonging to one local justification with similar (but slightly distinct) concepts in another local justification. Experimentation has led to the design of several local causal networks, attached to the various diagnostic hypotheses. Therefore, consistency is hard to achieve, but is not essential, and partial models can be used (Console & Torasso, 1988). In the ACTE experiment, this coherence had to be taken into account through the unique fault process hypothesis (see Section 4.2.3).
• The completeness of the network may sometimes appear to be a challenge as difficult to meet as the design of the ‘‘main knowledge base’’ (i.e. what we have called the ‘‘heuristic level’’ and which the causal model is supposed to help). This is the case in the DIVA experiment, and so the advantages of using causal models need to be reconsidered in the light of the costs involved in designing such models.
5.3.2. Obtaining causal models by a ‘‘bottom-up approach’’
Bottom-up causal models are designed (or are pre-existing) independently of the heuristic level. This is typically the case for causal models obtained from tutorial manuals (for instance medical treatises) or resulting from material design. In this case, connection becomes a major problem. This is illustrated by ADÈLE. A connection step, following the selection of pre-existing causal models, aims to
connect the HKB with the CKB. The process of designing the two levels is quite separate from the initial stages. It is not surprising that connection does not always come naturally.
As opposed to top-down designed causal models, consistency and completeness are often a major prerequisite of such pre-existing causal models, especially when they are also correct behaviour models. This remark must be qualified for fault models, which are partial by nature (see Section 5.4). In ADÈLE, for example, the approach suffers from the incompleteness of the CKB. Incompleteness of the CKB is revealed when some piece of heuristic knowledge cannot be justified, or is only partially justified. In this case, the CKB must be extended with new, more appropriate causal models. ADÈLE does not help to add this knowledge, but it does allow the user to localize the missing knowledge.
The notion of consistency can be illustrated by the ‘‘top-down’’ designed causal models of ACTE. At the beginning of the KA process they may not be consistent, but the approach helps the expert to create consistency.
5.3.3. Advantages of causal models
To conclude this section, we note that the study of causal models has highlighted some of their advantages, which we describe below. The CKB and HKB are two models of the same domain, at different levels of abstraction (the CKB is ‘‘deep’’ with regard to heuristic knowledge). Thus, CMBKATs allow us to build KBS in a more consistent and complete way by providing a separate specification of the problem. In particular, causal models can be useful for HKB validation and consistency. As we have seen, for validation purposes, a ‘‘bottom-up designed causal model’’ is certainly better suited, while consistency is more easily attainable using a ‘‘top-down designed causal model’’. This is what we have observed in the ADÈLE and ACTE experiments.
This is not really surprising, since validation requires an independent reference, and ‘‘bottom-up designed causal models’’ have this property; consistency, on the other hand, requires a strong connection between the HKB and the CKB, and ‘‘top-down designed causal models’’ have this property by construction. Secondly, CKB contains knowledge which is supposed to be more readily available. Acquisition of CKB is thus assumed to be easier than acquisition of HKB (even though the DIVA experiment tends to prove the contrary), and this is one of the advantages we can expect from using a causal model for KA. Finally, CKB is a KA support, but it can also be very useful for other tasks. Justifications can be used for explanation (Swartout, 1983), and also for the operational system itself, covering atypical problems the HKB is unable to solve. 5.4. ONTOLOGICAL POINT OF VIEW
5.4.1. Ontology of the causal model
In Section 3.1 we saw that causal models attempt to represent the domain theory. Yet, in the literature, two kinds of causal models have been identified, each of them referring to a different domain theory (Sticklen et al., 1988). Let us recall these two classes of causal models so as to place our causal models in relation to them. The first type is related to the qualitative physics approach. In this case, we derive
the functionality of a device from the function and the structure of its constituent sub-devices. The causality is often coupled with a temporal precedence relationship. These models are models of correct behaviour (MCB), characterized by completeness. The ability to design models of correct behaviour proves that the domain can be completely specified. In such a domain (e.g. in qualitative physics) the KA process is easier to perform. Models of correct behaviour may be helpful for designing expert systems and diagnostic systems.
The second type is centred on a causal network representing descriptions of the states of the device being modelled. Such a causal network is often linked to an associational model which establishes relationships between observations and states. Together, the two models (causal and associational) define a fault model (FM). Often, only partial models are available. In medicine, KBS are often based on FM because of the incomplete understanding of the underlying processes. Therefore, in such domains FM are required to perform diagnostic tasks (see Section 4.2.3). As pointed out above, these fault models can be used for reasoning and justification in addition to KA.
According to this classification, let us characterize the causal models in ACTE and ADÈLE. In ACTE, the CKB is a causal-associational state network mapped on a disease process hierarchy, as in numerous medical KBS (Clancey, 1989). Thus, the knowledge model of ACTE is typically a FM. In ADÈLE, causal relations co-exist with descriptive or definition relations as we pointed out in Section 4.1.2. Causal relations describe diseases and the symptoms the diseases bring on. They thereby define a FM, whereas descriptive relations are related to a MCB. With regard to the knowledge level analysis of Section 2, this observation suggests further directions of research.
CMBKATs play a major role during the instantiation of the conceptual model (step 3 of the framework described above) but they should also play a role in the design of the conceptual model (step 2). In the same way, current research in methodologies now explicitly recognizes the need to integrate modelling based on libraries of generic models, and modelling that consists in designing the conceptual model from the domain model (Aussenac-Gilles, 1994).
5.4.2. Causal model and ontology
In the recent approaches which enhanced the relationships between the different components of the conceptual model (Steels, 1990; Wielinga et al., 1993; Runkel & Birmingham, 1993; Tu et al., 1995), the domain theory is very rich and the notion of ontology is of prime interest; it is used to make explicit different statements about domain knowledge.† In this context it is possible to explicitly declare causality in the domain model: ‘‘Causality’’ is a term of the model ontology used to build the model schema which structures the causal domain model (Wielinga et al., 1993). This observation suggests possible interaction between a CMBKAT and these approaches: if causal models have been stated as necessary in the developed KBS, such a framework may fully integrate causal models, and a CMBKAT might be connected in order to play its role of validation or consistency checking.
† We exemplify the relationship between causal model and KBS using ontologies with the vocabulary of the COMMON-KADS approach. Nevertheless, the discussion could be the same with regard to Steels’s or Musen’s approaches (Steels, 1990; Tu et al., 1995).
5.5. DEALING WITH UNCERTAINTY
It is widely recognized that reasoning on causal nets is a difficult process. However, in addition to the classic difficulties (often in terms of complexity), causal models for KA are frequently required to deal with uncertainty (especially for top-down designed causal models, more closely related to the heuristic level). Weak causality appears in the form of relations such as ‘‘A may cause B under conditions C or D’’. These conditions can be either explicit (the conditions are explicitly represented) or implicit (e.g. a certainty factor). In the former case, the complexity of the network may increase and make design difficult. In the latter case, classical problems of weak transitivity have to be dealt with (see ACTE, Section 4.2). Nevertheless, the need to represent this uncertainty is widely recognized (Szolovits et al., 1988). Using uncertain relations in causal nets has been attempted in several experiments [for example in CHECK (Console & Torasso, 1990), in DIVA (David & Krivine, 1989b), and in ACTE (Charlet, 1993)]. With his probabilistic causal network, Long has tried to determine a global and incremental solution to uncertainty management (Long, 1989).
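The two ways of representing weak causality mentioned in this section can be contrasted in a short sketch. It is an illustration only and does not reproduce the representation used in CHECK, DIVA or ACTE. We assume that an explicit representation attaches a set of enabling conditions to a link (any one of them suffices, as in ‘‘A may cause B under conditions C or D’’), that an implicit representation attaches a single certainty factor, and that a chain containing an implicitly conditioned link is referred to the expert rather than combined numerically. The class `CausalLink`, its fields and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CausalLink:
    cause: str
    effect: str
    # Explicit representation: the link applies if any one condition holds.
    conditions: FrozenSet[str] = frozenset()
    # Implicit representation: the conditions are summarized by one number.
    certainty: Optional[float] = None


def applicable(link, known_conditions):
    """An unconditioned link always applies; otherwise one condition must hold."""
    return not link.conditions or bool(link.conditions & set(known_conditions))


def needs_expert_review(chain):
    """Flag chains of two or more links that contain an implicitly conditioned
    link, since weak transitivity cannot be assumed for them."""
    return len(chain) > 1 and any(link.certainty is not None for link in chain)


# 'A may cause B under conditions C or D', represented explicitly.
a_to_b = CausalLink("A", "B", conditions=frozenset({"C", "D"}))
# A weak link whose conditions stay implicit (hypothetical certainty factor).
b_to_e = CausalLink("B", "E", certainty=0.6)

print(applicable(a_to_b, {"D"}))
print(applicable(a_to_b, {"X"}))
print(needs_expert_review([a_to_b, b_to_e]))
```

The explicit form makes every condition a node that the designer has to name and connect, which is where the growth in network complexity comes from; the implicit form keeps the network small but leaves the plausibility of each chain to be judged separately.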
6. Conclusion This paper is the result of a recent comparison of several experiments on the use of causal models for KA. We have tried to place the various CMBKATs in the paradigm of ‘‘task and method specific tools’’. We have outlined the fact that they do not generally use explicit reference to any underlying conceptual model. We have noted how such a relation might be a promising path of research. In particular, we think that it will be of great help to study how a set of tools can be integrated or can collaborate in the same KA workbench. Further work should be done in this direction. We have also identified two different types of CMBKAT. Of course, such a classification is not strict, and very often tools can be either ‘‘top-down designed’’ or ‘‘bottom-up designed’’. However, we have also noted how the former are more appropriate for checking ‘‘internal consistency’’, and how the latter are better suited for validation. We have also seen some of the properties that such tools may have. Connection, consistency and completeness have been identified with the two classes of tools.
References
ANDREASSEN, S., WOLBYE, M., FALCK, B. & ANDERSEN, S. (1987). Munin - a causal probabilistic network for interpretation of electromyographic findings. Proceedings of the 10th International Joint Conference on Artificial Intelligence, pp. 366–372, Milan, Italy.
AUSSENAC-GILLES, N. (1994). How to combine data abstraction and model refinement: a methodological contribution in macao. Proceedings of the 8th European Workshop on Knowledge Acquisition for Knowledge-Based Systems.
AUSSENAC-GILLES, N., KRIVINE, J.-P. & SALLANTIN, J. (1992). Editorial. Revue d’Intelligence Artificielle, 6, 7–18.
BREUKER, J. A., WIELINGA, B. J., VAN SOMEREN, M., DE HOOG, R., SCHREIBER, G., DE GREEF, P., BREDEWEG, B., WIELEMAKER, J. & BILLAULT, J. P. (1987). Model-driven
knowledge acquisition: interpretation models. Technical report, Esprit project 1098. Deliverable A1.
CHANDRASEKARAN, B. (1987). Towards a functional architecture for intelligence based on generic information processing tasks. Proceedings of the 10th International Joint Conference on Artificial Intelligence, pp. 1183–1192, Milan, Italy.
CHARLET, J. (1991). ACTE: a strategic knowledge acquisition method. In M. LINSTER & B. R. GAINES, Eds. Proceedings of the 5th European Workshop on Knowledge Acquisition for Knowledge-Based Systems, pp. 85–93. Glasgow, UK.
CHARLET, J. (1992). ACTE acquisition des connaissances par interprétation d’un modèle causal. Revue d’Intelligence Artificielle, 6, 99–129.
CHARLET, J. (1993). ACTE: a causal model-based knowledge acquisition tool. In J.-M. DAVID, J.-P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 495–516. Berlin: Springer-Verlag.
CHARLET, J., KRIVINE, J.-P. & REYNAUD, C. (1992). Causal model-based knowledge acquisition tools: discussion of experiments. In T. WETTER, K.-D. ALTHOFF, J. H. BOOSE, B. R. GAINES, M. LINSTER & F. SCHMALHOFER, Eds. Proceedings of the 6th European Workshop on Knowledge Acquisition for Knowledge-Based Systems, pp. 318–336. Heidelberg: Springer-Verlag.
CLANCEY, W. J. (1985). Heuristic classification. Artificial Intelligence, 27, 289–350.
CLANCEY, W. J. (1989). Viewing knowledge bases as qualitative models. IEEE Expert, 4, 9–23.
CONSOLE, L., FOSSA, M. & TORASSO, P. (1989). Acquisition of causal knowledge in the CHECK system. Computers and Artificial Intelligence, 8, 323–345.
CONSOLE, L. & TORASSO, P. (1988). A logical approach to deal with incomplete causal models in diagnostic problem solving. Lecture Notes in Computer Science, 313. Berlin: Springer-Verlag.
CONSOLE, L. & TORASSO, P. (1990). Hypothetical reasoning in causal models. International Journal of Intelligent Systems, 5, 83–124.
DAVID, J.-M. & KRIVINE, J.-P. (1989a).
Augmenting experience-based diagnosis with causal reasoning. Applied Artificial Intelligence, 3, 239–248.
DAVID, J.-M. & KRIVINE, J.-P. (1989b). Designing knowledge-based systems within functional architecture: the DIVA experiment. Proceedings of the 5th IEEE Conference on Artificial Intelligence Applications, Miami, FL, USA.
DAVID, J.-M., KRIVINE, J.-P. & RICARD, B. (1993). Building and maintaining a large knowledge-based system from a ‘‘knowledge level’’ perspective: the DIVA experiment. In J.-M. DAVID, J.-P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 376–402. Berlin: Springer-Verlag.
DAVID, J.-M., KRIVINE, J.-P. & SIMMONS, R., Eds. (1993). Second Generation Expert Systems. Berlin: Springer-Verlag.
DAVIS, R. (1979). Interactive transfer of expertise: acquisition of new inference rules. Artificial Intelligence, 12, 121–157.
DAVIS, R. (1989). Form and content in model-based reasoning. Proceedings of the IJCAI Workshop on Model-Based Reasoning.
ERIKSSON, H. (1990). Meta-tool support for customized domain-oriented knowledge acquisition. In J. H. BOOSE & B. R. GAINES, Eds. Proceedings of the 5th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, pp. 6.1–6.20. Banff, Alberta, Canada.
ESHELMAN, L. (1988). MOLE: a knowledge acquisition tool for cover-and-differentiate systems. In S. MARCUS, Ed. Automating Knowledge Acquisition for Expert Systems. Boston, MA: Kluwer Academic.
GRUBER, T. R. (1991). Learning why by being told what. IEEE Expert, x, 65–75.
HICKMAN, F. R., KILLIN, J. L., LAND, L., MULHALL, T., PORTER, D. & TAYLOR, R. M. (1989). Analysis for Knowledge-Based Systems, a Practical Guide to the KADS Methodology. Hemel Hempstead: Ellis Horwood.
JOHNSON-LAIRD, P. N. (1980). Mental models in cognitive science. Cognitive Science, 4, 71–115.
KAHN, G., NOWLAN, S. & MCDERMOTT, J. (1985). MORE: an intelligent knowledge
acquisition tool. In A. JOSHI, Ed. Proceedings of the 9th International Joint Conference on Artificial Intelligence, pp. 581–585. Los Angeles, CA: Morgan Kaufmann.
KARBACH, W., LINSTER, M. & VOSS, A. (1990). Models, methods, roles and tasks: many labels—one idea? Knowledge Acquisition, 2, 279–299.
KARBACH, W., VOSS, A., SCHUCKEY, R. & DROUVEN, U. (1991). Model-K: prototyping at the knowledge level. In D. HERIN-AIMÉ, R. DIENG, J.-P. REGOUARD & J. ANGOUJARD, Eds. Proceedings of the 1st Conference on Knowledge Modelling & Expertise Transfer, pp. 195–208. Oxford: IOS Press.
LONG, W. (1989). Medical diagnosis using a probabilistic causal network. Applied Artificial Intelligence, 3, 367–383.
LONG, W., NAIMI, S., CRISCITIELLO, M. G., PAUKER, S. G. & SZOLOVITS, P. (1984). An aid to physiological reasoning in the management of cardiovascular disease. Proceedings of the Computers in Cardiology Conference, pp. 3–6.
MARCUS, S. & MCDERMOTT, J. (1989). SALT: a knowledge acquisition language for propose-and-revise systems. Artificial Intelligence, 39, 1–37.
MCDERMOTT, J. (1988). Preliminary steps towards a taxonomy of problem-solving methods. In S. MARCUS, Ed. Automating Knowledge Acquisition for Expert Systems. Boston, MA: Kluwer Academic.
MOTTA, E., RAJAN, T. & EISENSTADT, M. (1990). Knowledge acquisition as a process of model refinement. Knowledge Acquisition, 2, 21–49.
MUSEN, M. A. (1989). Automated Generation of Model-Based Knowledge Acquisition Tools. Research Notes in Artificial Intelligence. Los Altos, CA: Morgan Kaufmann Publishers.
MUSEN, M. A. (1992). Editorial. Overcoming the limitations of role-limiting methods. Knowledge Acquisition, 4, 165–170.
NECHES, R., FIKES, R., FININ, T., GRUBER, T., PATIL, R., SENATOR, T. & SWARTOUT, W. (1991). Enabling technology for knowledge sharing. The Artificial Intelligence Magazine, 12, 16–36.
NECHES, R., SWARTOUT, W. & MOORE, J. (1985). Explainable (and maintainable) expert systems. In A.
JOSHI, Ed. Proceedings of the 9th International Joint Conference on Artificial Intelligence, pp. 382–389. Los Angeles, CA: Morgan Kaufmann.
PUERTA, A. R., EGAR, J. W., TU, S. W. & MUSEN, M. A. (1992). A multiple-method knowledge-acquisition shell for the automatic generation of knowledge-acquisition tools. Knowledge Acquisition, 4, 171–196.
REINDERS, M., VINKHUYZEN, E., VOSS, A., AKKERMANS, H., BALDER, J., BARTSCH-SPÖRL, B., BREDEWEG, B., DROUVEN, U., VAN HARMELEN, F., KARBACH, W., KARSSEN, Z., SCHREIBER, G. & WIELINGA, B. (1991). A conceptual modelling framework for knowledge-level reflection. AI Communications, 4, 74–87.
REYNAUD, C. (1993). Acquisition and validation of expert knowledge by using causal models. In J.-M. DAVID, J.-P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 517–540. Berlin: Springer-Verlag.
RUNKEL, J. T. & BIRMINGHAM, W. P. (1993). Knowledge acquisition in the small: building knowledge-acquisition tools from pieces. Knowledge Acquisition, 5, 221–243.
STEELS, L. (1990). Components of expertise. The Artificial Intelligence Magazine.
STICKLEN, J., CHANDRASEKARAN, B. & BOND, W. E. (1988). Distributed causal reasoning for knowledge acquisition: a functional approach to device understanding. Proceedings of the 3rd Banff Workshop for Knowledge Acquisition, Banff, Canada. (Revised version: Applied AI, 3, 275–304, 1989.)
SWARTOUT, W. (1983). Xplain: a system for creating and explaining expert consulting programs. Artificial Intelligence, 21, 285–325.
SZOLOVITS, P., PATIL, R. S. & SCHWARTZ, W. B. (1988). Artificial intelligence in medical diagnosis. Annals of Internal Medicine, 108, 80–87.
TU, S. W., ERIKSSON, H., GENNARI, J. H., SHAHAR, Y. & MUSEN, M. A. (1995). Ontology-based configuration of problem-solving methods and generation of knowledge-acquisition tools: applications of PROTÉGÉ-II to protocol-based decision support. Artificial Intelligence in Medicine, 7, 257–289.
WIELINGA, B., VAN DE VELDE, W., SCHREIBER, G. & AKKERMANS, H. (1993). Towards a
unification of knowledge modelling approaches. In J.-M. DAVID, J.-P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 299–335. Berlin: Springer-Verlag.
WIELINGA, B. J. & BREUKER, J. A. (1986). Models of expertise. Proceedings of the 7th European Conference on Artificial Intelligence, Brighton, UK.
WIELINGA, B. J., SCHREIBER, A. T. & BREUKER, J. A. (1992). KADS: a modelling approach to knowledge engineering. Knowledge Acquisition, 4, 5–54.
YOST, G., KLINKER, G., LINSTER, M., MARQUÈS, D. & MCDERMOTT, J. (1994). The SBF framework 1989–1994: from applications to workplaces. Proceedings of the 8th European Workshop on Knowledge Acquisition for Knowledge-Based Systems, pp. 21–40. Amsterdam, The Netherlands.

Paper accepted for publication by the Editor, B. R. Gaines