Data & Knowledge Engineering 4 (1989) 125-155 North-Holland
Mapping between a NIAM conceptual schema and KEE frames

Steven TWINE
Department of Computer Science, University of Queensland, St Lucia, 4067, Australia

Abstract. The basis of knowledge representation in both information systems and expert systems is a declarative specification of the facts about some Universe of Discourse. Representation schemes can be considered on the conceptual level, which is concerned only with the semantics of these facts, or at the internal level, which is concerned with the implementation of these facts in computer-oriented data structures. The requirements at each level are very different. Intellicorp's Knowledge Engineering Environment (KEE) supports a frame-based representation language for defining knowledge bases. KEE's frame language is unsuitable for use at the conceptual level because it fails to satisfy both the ISO Conceptualisation Principle and a weaker form of the ISO 100% Principle (called the Explicit Representation Principle). However, KEE could still be used as a representation scheme at the internal level. This will permit us to use a conceptual description language (with all the well-known advantages) to describe and analyse the Universe of Discourse for a knowledge-based application, and to implement this system effectively using the existing (and expanding) body of KEE support software. In this paper, I will present a mapping between a NIAM knowledge base and a KEE knowledge base. This mapping is formally defined by a conceptual schema which permits us to record the required mapping information. The constraints on this conceptual schema reflect the necessary restrictions on the mapping transformation. Using this mapping schema, I will develop procedures to map the dynamic (update) operations on a NIAM conceptual database into corresponding operations on the KEE knowledge base.

Keywords. Conceptual schemas, KEE, Knowledge representation, Knowledge transformation, NIAM.
1. Introduction
One major research activity in recent years has been exploring the relationships between knowledge bases (in artificial intelligence) and databases (in information systems) [11]. Most researchers view this relationship at the implementation level, and concentrate on the transfer of data structures and processing techniques from one research world to the other. Examples include the use of frame-based representations to store relational data [4], the use of relational databases to store frame data [2], and the addition of rule processing algorithms to a database management system [5, 17]. A more fundamental view of this relationship is concerned only with the deep semantic structure of the information contained in the database or knowledge base. At this conceptual level, both databases and knowledge bases are simply collections of facts about the Universe of Discourse. The conceptual level completely excludes any implementation and user presentation aspects, but includes linguistic aspects (which can be effectively used for design and validation purposes). This provides a different perspective on the relationship between knowledge representation schemes from artificial intelligence (AI) research and conceptual schema languages from information systems (IS) research.

0169-023X/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)
In this paper, I will analyse the relationships between Intellicorp's KEE ([7]) and the NIAM conceptual schema language [15, 16, 22] at the conceptual level (KEE as an alternative conceptual description language to NIAM) and at the internal level (KEE as an implementation language for NIAM). In Section 2, I will explain the distinction between the specification of a knowledge base (at the conceptual level) and its implementation (at the internal level). I will introduce the ISO Conceptualisation Principle and three weaker forms of the ISO 100% Principle to define the requirements of a conceptual description language. Section 3 is an introduction to the relevant concepts from the NIAM conceptual schema language. It is not within the scope of this paper to present the complete NIAM model, or to justify the design decisions it embodies. For more information, see [16]. In Section 4, I will examine the suitability of KEE's frame language for specifying knowledge bases at the conceptual level. A more detailed discussion of this analysis can be found in [19]. The conclusion is that KEE is unsuitable for this task because it includes too many incompatible ways to represent simple declarative facts. So, KEE is unsuitable for use as a conceptual description language, and any analysis and design methods based on KEE will inherit this unsuitability (because of the strong dependence between the semantics of a description language and the guidance offered by its associated design procedure). However, it would be useful to exploit the large body of KEE support software for developing knowledge-based systems. This suggests a different approach which is quite pragmatic: specify the Universe of Discourse using NIAM and implement it using KEE. In Section 5, I will specify the mapping between a NIAM conceptual schema and a KEE knowledge base. 
This section includes formal descriptions (using NIAM) of the NIAM model, the KEE model and the mapping relations that record the cross-references between the corresponding components. In Section 6, I will briefly address some dynamic aspects of the NIAM to KEE mapping. Finally, in Section 7, I discuss the implications of these results and directions for future research.
2. Specification and implementation of knowledge bases

Recall that any description at the conceptual level must not contain implementation or user presentation information. This is prescribed by the ISO Conceptualisation Principle [21]:

The Conceptualisation Principle. A conceptual schema should only include conceptually relevant aspects, both static and dynamic, of the Universe of Discourse, thus excluding all aspects of (external or internal) data representation, physical data organization and access as well as all aspects of particular user representation such as message formats, data structures, etc.

This principle is partially adequate, because it explicitly excludes some aspects of the problem which are clearly not relevant to the conceptual level. However, it does not specify what these "conceptually relevant aspects" are. A more precise definition, derived from a 1981 paper by Falkenberg [6] is:

A conceptual schema should only include propositions which refer exclusively to the Universe of Discourse.

In other words, a proposition is conceptually relevant if and only if it refers exclusively to the Universe of Discourse. The Universe of Discourse is that part of the real world about which we wish to communicate knowledge. If the description language permits (or worse, requires) the designer to specify a
conceptual schema which violates the Conceptualisation Principle, then this language can be said to have violated the Conceptualisation Principle (and is, therefore, unsuitable as a conceptual description language). Of course, it may be possible to select a suitable subset of this language which satisfies the Conceptualisation Principle. This conceptual schema acts as a prescriptive grammar to specify which sentences (or facts) may be recorded in the database. More precisely, the propositions in the conceptual schema must prescribe the permitted states of the database such that every permitted state of the database corresponds to a meaningful state of the Universe of Discourse. This is specified by the ISO 100% Principle [21]: The 100% Principle. All relevant general static and dynamic aspects, i.e. all rules, laws, etc., of the Universe of Discourse should be described in the conceptual schema. The information system cannot be held responsible for not meeting those described elsewhere, including, in particular, those in application programs. This can be split into three independent principles which, taken together, imply the 100% Principle: The Explicit Representation Principle. Every proposition in the conceptual schema must be specified explicitly. It is not permitted to represent some of these propositions implicitly (so that they are only detectable by applying metarules of interpretation to the conceptual schema). This principle specifies that every defined rule about the Universe of Discourse must be specified explicitly. It similarly forbids implicit assumptions such as those often found in AI knowledge representation schemes, such as the Unique Names Assumption [19] or the Closed World Assumption [3]. If such assumptions are required, then they must be stated explicitly in the conceptual schema. This has been suggested as good practice within the AI community as well [3]. The Central Representation Principle. 
Every semantic rule (or metaproposition) about the Universe of Discourse must be specified in the conceptual schema. It is not permitted to specify any semantic rules in application programs.

This principle captures the part of the 100% Principle which forbids the implementation of schema rules within application programs, or elsewhere. It is difficult (if not impossible) to verify the consistency of a set of constraints if they are distributed among a number of application programs. It is similarly difficult to ensure that the database conforms to the rules specified in the conceptual schema at all times if the responsibility for constraint enforcement is left with the application programmer.

The Complete Representation Principle. The conceptual schema must contain a set of rules such that each and every permitted state of the database corresponds to a meaningful state of the Universe of Discourse, and each and every permitted state of the Universe of Discourse corresponds to a permitted state of the database.

This principle captures the part of the 100% Principle which requires that "all relevant general static and dynamic aspects" be defined in the conceptual schema. Many conceptual schema languages do not satisfy this principle and designers are, therefore, forced to rely on informal notations for specifying certain constraints. In other words, if a conceptual schema language is insufficiently powerful, then some conceptual schemas expressed in that language will fail to satisfy the Complete Representation Principle because they will permit database states which do not correspond to meaningful states of the Universe of Discourse. Such a conceptual schema is said to be underconstrained.

In summary, the Complete Representation Principle states that all relevant rules must be defined so that each and every database state permitted by these rules corresponds to a meaningful state in the Universe of Discourse, and the Explicit Representation Principle
states that every defined rule must be explicitly represented in the conceptual schema. Together, they imply that every relevant rule must be explicitly represented in the conceptual schema (which is the ISO 100% Principle).

A conceptual schema, together with a conforming database, is a specification of a knowledge base. This knowledge base can be implemented by encoding (or representing) the grammar of the conceptual schema, and the facts of the database, using some representation scheme. This internal representation of the conceptual schema is called (obviously) an internal schema. The internal level is concerned with representing (or implementing) the grammar specified by the conceptual schema.

3. The NIAM model

The most basic component of the NIAM model is a (uniquely referenceable) object instance in the Universe of Discourse. There is an explicit, and a priori, distinction between non-lexical object instances (called entity instances) and lexical object instances (called label instances). Entity instances (such as a person called Bill) do not have direct external representations. Label instances (such as the person's name "Bill") must have an immediate external representation. Entity instances represent things; label instances represent names.

Any proposition involving one or more entity instances (and their associated label instances) is called a fact instance. As an example, the following proposition involves two entity instances and two label instances:

    The Person Bill works for the Company IBM.

Contrary to popular misconception, a fact instance may involve more than two entity instances. NIAM ([13, 15]) is not restricted to a "binary relationship" model. However, a variation of NIAM that is restricted to binary fact types (called Binary Relationship Modelling) has gained reasonably wide acceptance [10]. Fact instances are the only "knowledge-bearing elements" in the NIAM model.
It is only possible to record the existence of an entity instance if some fact about that entity instance is recorded in the database. Each entity instance must play a distinct role in the fact instance. In the example above, the Bill entity plays the role works-for, and the IBM entity plays the role employs. Any proposition involving an entity instance and a label instance is called a reference instance. Reference instances must be binary. As an example, the following proposition represents a naming convention for a Person entity instance: There is a Person with the first name "Bill"
Entity instances which may play the same role are classified together, giving entity types. Similarly, label instances which may play the same role are classified together, giving label types. Fact instances which involve the same roles are classified together, giving fact types. Similarly, reference instances which involve the same roles are classified together, giving reference types. Graphically,
• A circle represents an entity type.
• A dashed circle indicates a label type.
• A box represents a role. A series of connected boxes represents a fact type.
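The classification of instances into types described above can be illustrated with a minimal Python sketch. It is not part of NIAM or of any NIAM tool: the class names (`EntityType`, `Entity`, `Role`, `FactType`) and the helper `well_typed` are hypothetical, and the sketch assumes the simplest naming convention, in which each entity instance is denoted by a single label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    name: str


@dataclass(frozen=True)
class Entity:
    """An entity instance, denoted here by a single identifying label."""
    entity_type: EntityType
    label: str


@dataclass(frozen=True)
class Role:
    name: str
    player: EntityType  # the entity type whose instances may play this role


@dataclass(frozen=True)
class FactType:
    roles: tuple  # one Role per position; two roles give a binary fact type


def well_typed(fact_type, players):
    """True if there is one player per role and each has the role's type."""
    return len(players) == len(fact_type.roles) and all(
        entity.entity_type == role.player
        for role, entity in zip(fact_type.roles, players)
    )


person = EntityType("Person")
company = EntityType("Company")
employment = FactType((Role("works for", person), Role("employs", company)))

bill = Entity(person, "Bill")
ibm = Entity(company, "IBM")

print(well_typed(employment, (bill, ibm)))  # "Bill works for IBM"
print(well_typed(employment, (ibm, bill)))  # players swapped between roles
```

The check mirrors the idea that a fact type acts as a grammar: a tuple of entity instances is an acceptable fact instance only if each instance plays a role that its entity type is allowed to play.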
Fig. 1. Conceptual schema. [Diagram not reproduced; recoverable labels: the roles "works for" and "employs", and the label types "First Name" and "Company Name".]
Fig. 1 shows the entity types, label types, roles, fact types and reference types involved in the two propositions listed above. If every instance of an entity type can be uniquely identified by a corresponding instance of a single label type, the label type is sometimes omitted from the diagram and the label type name is shown in parentheses next to the entity type name.

It is often useful to show the population of a fact type on the same diagram as the specification of that fact type. This can be done by showing a table of fact instances immediately below the boxes that represent the roles of that fact type (Fig. 2). This poses a technical problem: the roles of a fact instance are played by entity instances, and these entity instances are (by definition) not directly representable. If this is the case, how can we show the population of a fact type? The answer is simple: entity instances are denoted by the label instance (or label instances) that identify them.

In some circumstances, we may need to specify a fact instance about some other fact instance. For example, we may know that Bill works for IBM, and that this employment commenced in 1984. In the NIAM model, this is specified by nesting, nominalizing or objectifying the first fact type. This means that the first fact type can be treated as an entity type when specifying the second fact type. Using the previous notation, this is shown in Fig. 3.

A major part of most practical conceptual schemas is the constraint section. Theoretically, constraints reduce semantic knowledge to a syntactic level where it can be enforced: what is semantically meaningless is made syntactically illegal. Of course, these constraints must be defined in terms of the entity types, subtypes and fact types in the conceptual schema.
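Fig. 2. Populated conceptual schema. [Diagram not reproduced; the population table under the roles "works for" and "employs" uses the person labels Bill, Robert and Eckhard and the company labels IBM, SIEMENS, CDC and HP.]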
Fig. 3. A nested fact type.
The NIAM model includes a small set of predefined constraint types which (empirically) cover most of the requirements that arise in practice. Any other relevant constraints can be specified using an appropriate constraint language (which is constructed on top of NIAM's semantic primitives). For an example of such a language, refer to Meersman's RIDL language [9, 10]. For the purposes of this paper, only a few constraint types are relevant.

Uniqueness constraints are shown on the diagram as a double-headed arrow over one or more roles. A uniqueness constraint requires that each entity instance playing the role (or each combination of entity instances playing the combination of roles) under the constraint arrow occurs at most once in any population of the fact type containing those roles. In Fig. 4, C1 and C4 are uniqueness constraints.

Mandatory role constraints are shown on the diagram as a large dot (on the circle) where the line joins the mandatory role to its associated entity type. This constraint requires that if any fact instance is recorded about a given entity instance of the entity type which plays the mandatory role, then that entity instance must play the mandatory role in at least one fact instance of that fact type. In Fig. 4, C2, C3 and C5 are mandatory role constraints.

Sometimes, a fact type is only meaningful for a well-defined subset of the instances of an entity type. This requires a constraint which is stronger than an optional (non-mandatory) role, but weaker than a mandatory role. For example, in Fig. 5, the earns role is non-mandatory. However, there is a rule which says that a person may earn a salary if and only if they have worked for any company after 1985. If we accept the conceptual schema of Fig. 5, it violates the Complete Representation Principle because it permits a user to record that a person who has not worked for any company after 1985 nevertheless earns a salary.
(Note that the rule does not say that a person must earn a salary if they have worked for a company after 1985).
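The uniqueness and mandatory role constraints described above can be read as checks over the population of a fact type. The following Python sketch illustrates that reading under stated assumptions: fact instances are plain tuples of labels, roles are identified by tuple position, and the function names and the sample population are hypothetical rather than taken from NIAM or RIDL.

```python
from collections import Counter


def uniqueness_violations(population, role_positions):
    """Return the role-player combinations that occur more than once."""
    counts = Counter(
        tuple(fact[i] for i in role_positions) for fact in population
    )
    return sorted(combo for combo, n in counts.items() if n > 1)


def mandatory_violations(known_entities, population, role_position):
    """Return known entities that never play the mandatory role."""
    players = {fact[role_position] for fact in population}
    return sorted(set(known_entities) - players)


# Hypothetical population of a binary fact type (person, company).
works_for = [("Bill", "IBM"), ("Robert", "SIEMENS"), ("Bill", "HP")]

# Uniqueness over the first role alone is violated by Bill, who appears twice.
print(uniqueness_violations(works_for, (0,)))

# Uniqueness spanning both roles holds: no pair is repeated.
print(uniqueness_violations(works_for, (0, 1)))

# If Eckhard is known to the database through some other fact, a mandatory
# "works for" role would be violated, since he plays it in no fact instance.
print(mandatory_violations({"Bill", "Robert", "Eckhard"}, works_for, 0))
```

Passing the set of entities that appear in any recorded fact as `known_entities` reflects the conditional wording of the mandatory role constraint: it only applies to entity instances about which something has been recorded.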
Fig. 4. Conceptual schema with simple constraints. [Diagram not reproduced; it carries the constraint labels C1 to C5 referred to in the text.]
Fig. 5. Conceptual schema without subtypes. [Diagram not reproduced.]
The solution is to define the subset of entity instances about which instances of the fact type can be recorded. This subset is called an entity subtype. To enforce the constraint, the fact type is associated with the subtype instead of the supertype. Graphically, an entity subtype is linked to its entity supertype with an arrow. Fig. 6 shows the conceptual schema of Fig. 5, with the constraints corrected.

Fig. 6. Conceptual schema with subtypes. [Diagram not reproduced; recoverable labels include the roles "earns" and "is earned by" and the constraints C4 and C6.]

A conceptual schema can include any number of directed (acyclic) networks of supertype-subtype relationships. Every member of the subtype is always a member of the supertype. Therefore, instances of fact types associated with the supertype can be recorded for subtype instances. Similarly, it is not necessary to specify reference types for each entity subtype (because instances of the subtype can be referenced through the naming conventions associated with the supertype).

In NIAM, a subtype is a constraint, not a way of encoding a fact. In other words, knowing that the entity instance Bill is a member of the subtype Employed Person does not tell us anything new about the Universe of Discourse. Rather, this fact is explicitly recorded about some supertype of the subtype, as a subtype-defining fact. For example, in Fig. 6, the fact that Bill works for IBM tells us that Bill is in the subtype Employed Person. More detailed information about the NIAM model can be found in [15, 16, 22].
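The idea that subtype membership is derived from a subtype-defining fact, rather than stored as a fact in its own right, can be sketched in Python as follows. The encoding is an assumption made for illustration only: employment facts are written as hypothetical `(person, company, year)` tuples, where `year` is a year in which the person worked for the company, and the data values are invented.

```python
def employed_persons(works_for):
    """Derive the subtype Employed Person from subtype-defining facts.

    A person belongs to the subtype if they worked for any company
    after 1985 (the rule used in the running example).
    """
    return {person for person, _company, year in works_for if year > 1985}


def illegal_salary_facts(earns, works_for):
    """Return salary facts recorded for persons outside the subtype."""
    members = employed_persons(works_for)
    return [(person, salary) for person, salary in earns
            if person not in members]


# Hypothetical populations.
works_for = [("Bill", "IBM", 1987), ("Robert", "SIEMENS", 1984)]
earns = [("Bill", 50000), ("Robert", 40000)]

print(sorted(employed_persons(works_for)))
print(illegal_salary_facts(earns, works_for))
```

Because membership is computed from the employment facts, there is nothing extra to store or keep consistent: attaching the earns role to the subtype simply rejects salary facts for anyone the derivation does not admit.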
4. Representing facts in KEE's frame language

KEE has a frame-based knowledge representation component which includes units (or frames), links between units, slots in units, facets in slots, and values in facets. In this paper, we will only consider the declarative aspects of KEE's frame language. In particular, I will not discuss methods, active values or inference rules. Fikes and Kehler [7] provide a good introduction to KEE.

Fig. 7 contains an example of a unit. This unit is named Bill. Each unit must have a single, unique name. KEE units are used to represent both facts about entity types and entity instances. There is no syntactic distinction in KEE between a unit which represents facts about an entity type and a unit which represents facts about an entity instance. There are no apparent advantages in permitting types and instances to become confused, and there are a large number of obvious disadvantages (erroneous inferences, inability of the system to maintain consistency, no clear interpretation for such structures etc). It is important to realise that the unit's name cannot be used to make this distinction: to the system, this name is simply an uninterpreted character string. Such "meaningful identifiers" can have only an informal semantics.

The name of a type-unit corresponds to the name of the entity type it represents. The name of an instance-unit corresponds to the label instance that identifies the entity instance represented by that unit. KEE cannot represent the case where an entity instance is identified by a combination of label instances (a complex naming convention) or where an entity instance can be identified by a number of different label instances (of different label types) or even a number of different label instance combinations. KEE can only represent the case where an entity instance is identified by exactly one label instance.
    UNIT Bill
      MEMBER OF: Manager
      OWN SLOT holdsJob
        Values: Manager
        Cardinality.Max: 1
        Cardinality.Min: 1
      OWN SLOT worksForCompany
        Values: IBM
        Cardinality.Max: 1
        Cardinality.Min: 1
      OWN SLOT managesProject
        Values: IRIS
        Cardinality.Min: 1
        Cardinality.Max: 1

Fig. 7. A KEE frame.

The only way in which type-units and instance-units may be distinguished is by adopting a (self-enforced) restriction, together with an implicit rule of interpretation. The restriction is that we will represent only a single level of type-abstraction within a KEE knowledge base, and that every instance-unit must be linked to exactly one type-unit. The rule of interpretation is that any unit which is at the tail of a member link is an instance-unit. Conversely, every unit which is at the head of a member link is a type-unit. Because every instance-unit is associated with exactly one type-unit, a unit which does not participate in any member links must be a type-unit (representing a type with an empty current extension). Because this type-instance fact can only be deduced by applying an informal rule of interpretation (that is, a rule that cannot be enforced by the system), KEE violates the Explicit Representation Principle.

KEE supports two sorts of links: member links and subclass links. The unit Bill in Fig. 7 has one member link (to the unit Manager), but no subclass links. The member link represents a type-instance relationship (that is, an instance of the membership fact type); the subclass link represents a subtype-supertype relationship (that is, an instance of the subtype fact type). KEE does not provide any declarative way to restrict which links a unit may participate in. As a consequence, many semantically meaningless propositions (such as "The class of all projects is a subclass of the person called Bob") can be represented by syntactically legal KEE frame structures. This makes it difficult to provide a consistent semantic interpretation for any KEE data structure.

KEE includes two types of slots: MEMBER slots and OWN slots. The unit named Bill contains three OWN slots. If a slot is of the type MEMBER, then its enclosing unit represents an entity type, and the slot represents a type of fact that can be known about instances of that entity type. If the slot is of the type OWN, then its enclosing unit represents an entity instance, and the slot represents a type of fact that can be known about that entity instance. KEE includes a slot inheritance procedure which is activated whenever a member link or subclass link is asserted.
When a member link is asserted, KEE copies all of the MEMBER slots from the unit at the head of the link (assumed to be a type-unit), converts them into OWN slots and installs them in the unit at the tail of the link (assumed to be an instance-unit). This ensures that instances of each fact type known about an entity type can be recorded for any instances of that entity type. When a subclass link is asserted, KEE copies all of the MEMBER slots from the unit at the head of the link (assumed to be a type-unit representing the supertype), and installs them in the unit at the tail of the link (assumed to be a type-unit representing the subtype). This ensures that any fact type known about an entity supertype is also known about any subtype of that supertype. For this reason, if an entity instance is to be represented as a unit, then the unit should be a member of the unit which represents the most specific entity subtype that the entity instance belongs to. This rule prevents the redundant specification of facts. Finally, a slot may include several different types of facets. We will only consider the interpretation of Values, Cardinality.Min, and Cardinality.Max facets in this paper. A Values facet records one or more values for a slot. The combination of enclosing unit, slot and value (of the Values facet) represents a binary fact instance. In the unit above, the combination of UNIT Bill, OWN SLOT worksForCompany and VALUES IBM encodes the fact instance:
    The manager Bill works for the company IBM.

Of course, this implies that each value in the Values facet represents an entity instance. Therefore, there are two (overlapping) possibilities for representing an entity instance: entity instances may be represented as units or as Values facet values or both.
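The slot inheritance behaviour and the UNIT + SLOT + VALUE reading of a fact described above can be mimicked with a small Python model. This is only a sketch of the behaviour as the text describes it, not KEE's actual interface: the `Unit` class, the function names and the Person supertype of Manager are all assumptions introduced for illustration.

```python
class Unit:
    """A toy stand-in for a KEE unit with MEMBER and OWN slots."""

    def __init__(self, name):
        self.name = name
        self.member_slots = {}  # slot name -> facets (type-level slots)
        self.own_slots = {}     # slot name -> facets (instance-level slots)

    def add_member_slot(self, slot):
        self.member_slots.setdefault(slot, {"Values": []})


def assert_subclass_link(subtype, supertype):
    """Copy the supertype's MEMBER slots into the subtype as MEMBER slots."""
    for slot in supertype.member_slots:
        subtype.add_member_slot(slot)


def assert_member_link(instance, type_unit):
    """Copy the type-unit's MEMBER slots into the instance as OWN slots."""
    for slot in type_unit.member_slots:
        instance.own_slots.setdefault(slot, {"Values": []})


def fact_triples(unit):
    """Read each own-slot value as a (unit, slot, value) binary fact."""
    return [(unit.name, slot, value)
            for slot, facets in unit.own_slots.items()
            for value in facets["Values"]]


person = Unit("Person")  # hypothetical supertype of Manager
person.add_member_slot("worksForCompany")

manager = Unit("Manager")
manager.add_member_slot("managesProject")
assert_subclass_link(manager, person)

bill = Unit("Bill")
assert_member_link(bill, manager)
bill.own_slots["worksForCompany"]["Values"].append("IBM")

print(sorted(bill.own_slots))
print(fact_triples(bill))
```

Note that nothing in this model stops `assert_member_link` from being called with an instance-unit at the head of the link, which is the same lack of enforcement that the text attributes to KEE.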
The representation chosen for the entities involved in a binary fact determines the way in which the fact will be represented. When the first entity is represented by a unit, and the second entity is represented by a value, then the fact must be represented by placing that value in the appropriate slot in that unit. When both entities are represented by units, the fact can be asymmetrically represented by placing the name of one unit, as a value, in the appropriate slot in the other unit. As a consequence, one of the entities will now have a dual representation (as a unit and as a value). There is no semantic basis for choosing which unit should contain the slot. Alternatively, the fact can be symmetrically, and redundantly, represented by a value in the appropriate slot in each unit. However, KEE provides no declarative, centralized way to maintain mutual consistency between the two slot values. Hence, there is no way that the system can prevent an update anomaly from occurring. The knowledge base designer must explicitly control the consistency of these slot values by writing LISP update procedures. Finally, if both entities are represented only by values, then there is no way to represent the fact in KEE. Intuitively, we would like to represent "important" entity instances (and their types) as units, and "unimportant" entity instances as values. There appears to be no good, formal basis for deciding a priori which entity instances are "important" and which are not. There may be many cases where the correct choice is obvious, but there are just as many cases where the correct choice is quite difficult. Furthermore, different people may make different "obvious choices" within the same set of entity instances. The notion of "important" is quite subjective. More importantly, the process of partitioning entity instances into units and values seems to be iterative and unguided. Consider the fact type shown in Fig. 
8(a) and the two possible frame representations in Fig. 8(b). The choice of whether to represent the Person entity type, or the Company entity type, or both, by a unit is totally arbitrary. There is no formal basis for making the decision.

Fig. 8(a). Conceptual schema. [Diagram not reproduced.]

    UNIT Person
      MEMBER SLOT worksForCompany

    UNIT Company
      MEMBER SLOT employsPerson

Fig. 8(b). Equivalent unit representations.

Now, if we decide we also need to record facts like:

    IBM is located in San Jose.
    CDC is located in Brussels.
    CDC is located in Minneapolis.

then we must add a new fact type to the conceptual schema in Fig. 8(a). This new schema is shown in Fig. 9(a), together with the four possible unit choices in Fig. 9(b), 9(c), 9(d) and
9(e). Fig. 9 indicates the four possible choices if each fact type is to be represented in only one unit. If we permit symmetrical representation of a binary fact (as a slot in each of two units), then the number of possible choices increases to seven (one possibility where both facts are represented symmetrically, and two possibilities where only one fact is represented symmetrically). The choice seems to be totally arbitrary, although it is not without consequences. It may not be possible to represent certain constraints declaratively with certain grouping choices.

Fig. 9(a). Conceptual schema. [Diagram not reproduced; recoverable labels: the roles "works for" and "employs", and "located in" and "is location of".]

    UNIT Company
      MEMBER SLOT employsPerson
      MEMBER SLOT locatedInCity

Fig. 9(b). Unit representation, choice 1.

    UNIT Person
      MEMBER SLOT worksForCompany

    UNIT Company
      MEMBER SLOT locatedInCity

Fig. 9(c). Unit representation, choice 2.

    UNIT Person
      MEMBER SLOT worksForCompany

    UNIT City
      MEMBER SLOT containsCompany

Fig. 9(d). Unit representation, choice 3.

    UNIT Company
      MEMBER SLOT employsPerson

    UNIT City
      MEMBER SLOT containsCompany

Fig. 9(e). Unit representation, choice 4.

It is even possible to arbitrarily split units into sub-units, representing the same entity in the Universe of Discourse (or the same type), but with different slots attached. In this case, there is no longer any obvious connection between the units to suggest that they are both partial groupings of facts that can be known about the same entity! Of course, the possibility of arbitrary "unit decompositions" (rather like the relational decompositions that occur during Normalization) permits even more semantically equivalent possibilities.

It should be clear that it is only possible to represent binary facts in KEE frames. This does not limit KEE's formal representational power because all n-ary or nested fact types can be transformed into (semantically equivalent) binary fact types by introducing artificial entity types. However, some people propose to represent nested or n-ary fact types by storing a list of values in a Values facet. Consider these ternary fact instances:

    Bill worked on Project Iris for 20 hours.
    Bill worked on Project Noah for 15 hours.
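The transformation of an n-ary fact type into binary fact types mentioned above can be sketched in Python for the ternary instances just listed. The artificial entity type (here called Work, with identifiers Work1, Work2, ...) and the role names `isDoneBy`, `isOnProject` and `lastsHours` are hypothetical names chosen for this illustration.

```python
def binarize(ternary_facts, artificial_type="Work"):
    """Replace each (person, project, hours) fact by three binary facts.

    Every ternary fact instance is given an artificial entity instance,
    and each original role becomes a binary fact about that entity.
    """
    binary_facts = []
    for number, (person, project, hours) in enumerate(ternary_facts, start=1):
        artificial = f"{artificial_type}{number}"
        binary_facts.append((artificial, "isDoneBy", person))
        binary_facts.append((artificial, "isOnProject", project))
        binary_facts.append((artificial, "lastsHours", hours))
    return binary_facts


worked_on = [("Bill", "Iris", 20), ("Bill", "Noah", 15)]

for fact in binarize(worked_on):
    print(fact)
```

Each resulting triple names its role explicitly, so no positional convention is needed to tell a project from a number of hours.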
The ternary fact type in Fig. lO(a) and the KEE frame in Fig. lO(b) are both supposed to permit this fact as part of a valid population. The unit in Fig. lO(b) is clearly an inadequate representation. There is no formal information to distinguish between the instances of the Project entity type and the instances of the Hours entity type. Instead, the knowledge engineer must rely on an informal and unenforceable intemretation rule: the first value in each list is the Project instance and the second value in each list is the Hours instance. Finally, uniqueness and mandatory role constraints (which can also be considered as metalevel facts) can be encoded by values in the other facets of a slot. The Cardinality.Min facet specifies the minimum number of values that are permitted within the Values facet at any moment in time. Recall that each value represents an entity instance. Therefore, the Cardinality.Min facet specifies the minimum number of fact instances (of the type represented by the sl0t) which can be recorded for the entity instance represented by the unit. A single mandatory role is equivalent to Cardinality.Min = 1. Similarly, the Cardinality.Max facet specifies the maximum number of values that are permitted within the Values facet at a given instant. A uniqueness constraint (over the role
Fig. 10(a). Conceptual schema for employees, projects, hours. (Sample population: Bill, Iris, 20; Bill, Noah, 15.)
UNIT Bill
  SUPERCLASSES: ()
  MEMBER OF: Person
  OWN SLOT WorksOnProjectForHours
    Values: ((Iris 20) (Noah 15))
Fig. 10(b). KEE frame for employees, projects, hours.
played by the entity type which the unit represents an instance of) is equivalent to Cardinality.Max = 1. KEE cannot represent uniqueness constraints over more than one role.
To summarize, a binary fact may be encoded in KEE in any of the following ways:
• If both entities are represented by units, and the fact is a membership (meta)fact, then represent it as a member link.
• If both entities are represented by units, and the fact is a subtype (meta)fact, then represent it as a subclass link.
• If one entity is represented as a unit, and the other is represented by a value, then represent the fact as a UNIT + SLOT + VALUE combination.
• If both entities are represented by units, and the fact is not a membership fact or a subtype fact, then represent the fact as a UNIT + SLOT + VALUE combination in either unit (asymmetrically) or in both units (symmetrically).
As you can see, there are too many different (and, in many cases, incompatible) ways of encoding facts in KEE data structures. So far as representing declarative facts is concerned, KEE shares some of the problems of the "multi-target" fact representation data models of the 1970s (namely, CODASYL DDL [1]). For more detail on the semantic anomalies of KEE, and their consequences, refer to [19].
This analogy with CODASYL is sufficiently strong to suggest that knowledge engineers using KEE will have certain difficulties in designing, maintaining and integrating KEE knowledge bases, similar to the problems that database designers had in designing, maintaining and integrating CODASYL DDL databases [12]. One of the most severe of these problems is the need for extensive reprogramming (or rewriting of inference rules) whenever a minor change is made to the underlying data structures. Many of these changes do not even affect the semantics of the knowledge base, because they simply change from one way of representing a fact to an alternative way of representing the same fact.
The underlying cause for these problems is simple: it is not possible to specify a fact without also indicating how it must be implemented (i.e. in which manner it will be encoded, and, in the case of facts encoded as unit-slot-value combinations, which unit it will be stored in). The conceptual specification of the KEE knowledge base is implicit within its implementation, and can only be retrieved by abstracting away from these implementation details. It is not possible to specify the conceptual structure of a knowledge base, independent of a specific implementation of that structure.
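The consequence of these alternative encodings can be illustrated with a small sketch. It models units as plain Python dictionaries, which is an assumption made purely for illustration (it is not KEE's storage format or API), and the function names are hypothetical. The same fact, "Bill manages Iris", is stored in three ways, and a lookup written against one encoding fails silently against another.

```python
# Hypothetical model: a knowledge base is {unit_name: {slot_name: [values]}}.
# These structures are illustrative only and are not part of KEE.

def asymmetric_in_manager():
    """Fact grouped with the manager unit only."""
    return {"Bill": {"managesProject": ["Iris"]}, "Iris": {}}

def asymmetric_in_project():
    """Fact grouped with the project unit only."""
    return {"Bill": {}, "Iris": {"managedByManager": ["Bill"]}}

def symmetric():
    """Fact stored redundantly in both units."""
    return {"Bill": {"managesProject": ["Iris"]},
            "Iris": {"managedByManager": ["Bill"]}}

def projects_managed_by(kb, manager):
    """Lookup that assumes the fact is grouped with the manager unit."""
    return kb.get(manager, {}).get("managesProject", [])

def projects_managed_by_any_encoding(kb, manager):
    """Encoding-independent lookup: inspects both possible groupings."""
    found = set(kb.get(manager, {}).get("managesProject", []))
    for unit, slots in kb.items():
        if manager in slots.get("managedByManager", []):
            found.add(unit)
    return sorted(found)
```

The first lookup returns an empty list for the project-grouped encoding even though the fact is present, which is the kind of reprogramming hazard described above.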
5. A mapping between NIAM conceptual schemas and KEE frames
Because KEE fails to satisfy the Conceptualisation Principle, it is not adequate for specifying knowledge bases at the conceptual level. This leaves two possibilities. The first is to "conceptualise" KEE frames (i.e. to alter the representation language so that it satisfies the principles in Section 2). Then we could provide a design method for this conceptual frame language. Unfortunately, to "conceptualise" KEE frames, we must make the following changes:
1. We must distinguish between units which encode types and units which encode instances. This is to eliminate the ambiguities discussed in Section 4.
2. We must eliminate the distinction between values and units. This distinction is arbitrary and non-conceptual. All entities in the UoD should be represented by units. This will reduce the number of equivalent possibilities for encoding facts as slots.
3. We must eliminate the distinction between facts represented by values in slots in units, and facts represented as links between units. This means that we must permit the
specification of constraints over facts stored as links. Currently, only facts stored as unit, slot, value combinations can be explicitly constrained.
4. Similarly, we must eliminate the physical grouping of slots around units. This clustering is an implementation consideration, not a conceptual one.
In other words, we must reduce the number of equivalent possibilities for representing facts, so that the conceptual mechanism for representing a fact does not depend on implementation considerations (e.g. which unit the fact is to be stored in). These changes would be so radical as to result in a different language entirely. "Conceptualised" KEE would bear little resemblance to current KEE! This would give us yet another proposal for a conceptual description language (to be rationally evaluated on the grounds of semantic expressiveness) but we could not employ the current KEE software in the development of systems specified in that new language.
The second alternative is to use KEE for implementing those knowledge bases at the internal level. This implies that we can define a mapping between the NIAM conceptual schema language and the KEE frame language. This would allow us to take advantage of the current KEE software development environments to support the implementation of software (a task they are suited for) even though they are of little use in the specification and design of that software.
Before specifying this transformation, we must consider the general requirements on any transformation between a conceptual schema and an internal schema.
An implementation of a conceptual schema and a specification of that conceptual schema are said to be representationally equivalent, if every fact population that is permitted by the conceptual schema can be recorded in the internal database specified by the implementation of that schema.
An implementation of a conceptual schema and the specification of that conceptual schema are said to be semantically equivalent, if exactly those fact populations permitted by the conceptual schema can be recorded in the internal database specified by the implementation of the schema.
An internal schema may not be semantically equivalent to its conceptual schema because it may permit facts to be recorded in the internal database which are not part of any valid population of the conceptual schema. This can happen if the internal schema model does not include the same (or an equivalent) set of constraints as the conceptual schema model. In a three-schema architecture this is not a major problem because the conceptual information processor (CIP) is responsible for enforcing all constraints. Therefore only those facts which are permitted by the conceptual schema will be stored in the internal database. In other words, although the internal database is permitted to contain facts which violate the conceptual schema, the CIP ensures that only facts which satisfy the conceptual schema will ever be stored in the internal database.
The lack of semantic equivalence is only a problem if a full three-schema architecture is not supported, and the internal information processor is expected to check constraints on the internal database. This often happens when the conceptual schema is implemented using an internal (non-conceptual) description language (such as KEE). In this case, the extra constraints must be checked by application programs (or, in KEE, LISP functions triggered by active values).
Internal schemas are not expected to satisfy the Explicit Representation, Complete Representation or Conceptualisation Principles (!). However, an internal schema must be representationally equivalent to its conceptual schema.
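The two notions of equivalence can be stated as set comparisons over permitted fact populations. The following is a minimal sketch under that reading; the function names and the tiny example (a single "holds job" fact type with a uniqueness constraint that the internal schema does not enforce) are assumptions introduced for illustration.

```python
from itertools import combinations

def representationally_equivalent(conceptual_pops, internal_pops):
    """Every population permitted conceptually can be recorded internally."""
    return set(conceptual_pops) <= set(internal_pops)

def semantically_equivalent(conceptual_pops, internal_pops):
    """Exactly the conceptually permitted populations can be recorded."""
    return set(conceptual_pops) == set(internal_pops)

def all_populations(facts):
    """Every subset of the candidate facts, as a set of frozensets."""
    return {frozenset(c)
            for r in range(len(facts) + 1)
            for c in combinations(facts, r)}

def satisfies_uniqueness(population):
    """Hypothetical constraint: each person holds at most one job."""
    people = [person for person, _job in population]
    return len(people) == len(set(people))

# Candidate (person, job) facts; illustrative data only.
FACTS = [("Bill", "Manager"), ("Bill", "Scientist")]

internal = all_populations(FACTS)  # internal schema has no constraint
conceptual = {p for p in internal if satisfies_uniqueness(p)}
```

Here the internal schema admits one extra population (Bill holding two jobs), so the pair is representationally but not semantically equivalent.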
If an internal schema were not representationally equivalent to its conceptual schema, then the conceptual-internal mapping would involve a loss of knowledge: there would be facts which were permitted by the conceptual schema but which were not permitted by the internal schema.
The Lossless Mapping Principle. Any internal schema must be representationally equivalent to its corresponding conceptual schema. However, an internal schema need not be semantically equivalent to its corresponding conceptual schema.
The Meta Principle [14] states:
The Meta Principle. Every conceptual schema can be considered as the contents of a (meta)database, whose permitted states are specified by another conceptual (meta)schema.
In Section 4, I indicated that KEE was only capable of representing a subset of the NIAM model:
• binary fact types, without nesting
• simple naming conventions that consist of exactly one reference type
• single role uniqueness constraints
• single role mandatory role constraints
• entity subtypes.
A metaschema for this restricted model is given in Fig. 11. Similarly, we can use NIAM to specify a conceptual schema which may be populated with syntactically well-formed (declarative) KEE knowledge bases. The KEE metaschema is shown in Fig. 12. Finally, we can specify the mapping information using another conceptual schema. This mapping schema may be populated by valid mappings between a NIAM conceptual schema (a permitted population of the NIAM metaschema) and a KEE knowledge base (a permitted population of the KEE metaschema).
Fig. 11. A restricted NIAM metaschema.
Fig. 12. A KEE metaschema.
It is important to realise that this mapping schema is a formal specification of the information required to establish a semantic mapping between a NIAM conceptual schema and a corresponding KEE knowledge base. I have chosen to use a graphical notation to describe this formal specification, because I believe that this is a more efficient language for communication than (say) traditional predicate logic. This mapping schema could trivially be translated into a predicate logic equivalent. The precise interpretation of NIAM conceptual schemas is in no way impaired by their graphical representation.
The constraints on the mapping schema define the possible space of valid mappings between a NIAM conceptual schema and a representationally equivalent KEE frame structure. Any algorithm intended to generate KEE frames automatically from a NIAM conceptual schema must satisfy the restrictions imposed by the conceptual specification of the mapping.
Finally, this mapping schema could be used to construct (in KEE) a conceptual information processor which supports the NIAM model. This would allow the developer, or even the end user, to use the NIAM model directly. The CIP would use this mapping schema information to translate NIAM operations into corresponding KEE operations. An example of this is given in Section 6.
The correspondences between NIAM and KEE are (informally) specified in Table 1. The correspondence between a fact instance (a fact type populated with a set of entity instances) and a Unit, Own slot, Values facet value combination is derivable from (1) and (3).
Rather than present the mapping conceptual schema in a single, incomprehensible diagram, I will present it, piece by piece, in order to show the individual mappings between corresponding KEE concepts and NIAM concepts.
Table 1.
     NIAM construct                                  Equivalent KEE construct
  1  Role                                            Slot (Member or Own)
  2  Entity Type                                     Unit
  3  Entity Instance                                 Unit or Values facet value
  4  Uniqueness Constraint                           Cardinality.Max facet value
  5  Mandatory Role Constraint                       Cardinality.Min facet value
  6  Subtype Relationship                            Subclass link
  7  Type-Instance Relationship (usually implicit)   Member link
Fig. 13(a) contains a small conceptual schema, together with a small population. Figs. 13(b), 13(c), 13(d), 13(e), and 13(f) contain the specific KEE implementation of the conceptual schema and fact population in Fig. 13(a). Let us decide to represent only the following entity types as units: Person, Employee, Manager, City, and Project.
Fig. 13(a). Example conceptual schema.
UNIT Person
  MEMBER SLOT holdsJob
    Cardinality.Min: 1
    Cardinality.Max: 1
  MEMBER SLOT worksForCompany
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(b). KEE unit for person facts.

UNIT Manager
  SUPERCLASSES: Person
  MEMBERS: Bill Gerard
  MEMBER SLOT holdsJob
    Cardinality.Min: 1
    Cardinality.Max: 1
  MEMBER SLOT worksForCompany
    Cardinality.Min: 1
    Cardinality.Max: 1
  MEMBER SLOT managesProject
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(c). KEE unit for manager facts.

UNIT Employee
  SUPERCLASSES: Person
  MEMBERS: Robert
  MEMBER SLOT holdsJob
    Cardinality.Min: 1
    Cardinality.Max: 1
  MEMBER SLOT worksForCompany
    Cardinality.Min: 1
    Cardinality.Max: 1
  MEMBER SLOT worksOnProject
Fig. 13(d). KEE unit for employee facts.

UNIT Project
  MEMBERS: Iris Noah
  MEMBER SLOT involvesCompany
  MEMBER SLOT hasBudget
    Cardinality.Max: 1
  MEMBER SLOT managedByManager
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(e). KEE unit for project facts.

UNIT City
  MEMBER SLOT locationOfCompany
  MEMBER SLOT isInCountry
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(f). KEE unit for city facts.

UNIT Bill
  OWN SLOT holdsJob
    Values: Manager
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT worksForCompany
    Values: IBM
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT managesProject
    Values: IRIS
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(g). KEE unit for Bill facts.

UNIT Gerard
  OWN SLOT holdsJob
    Values: Manager
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT worksForCompany
    Values: CDC
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT managesProject
    Values: Noah
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(h). KEE unit for Gerard facts.

UNIT Robert
  OWN SLOT holdsJob
    Values: Scientist
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT worksForCompany
    Values: CDC
    Cardinality.Min: 1
    Cardinality.Max: 1
  OWN SLOT worksOnProject
    Values: Iris Noah
Fig. 13(i). KEE unit for Robert facts.

UNIT Minneapolis
  OWN SLOT locationOfCompany
    Values: CDC
  OWN SLOT isInCountry
    Values: USA
Fig. 13(j). KEE unit for Minneapolis facts.

UNIT Brussels
  OWN SLOT locationOfCompany
    Values: CDC IBM
  OWN SLOT isInCountry
    Values: Belgium
Fig. 13(k). KEE unit for Brussels facts.

UNIT Iris
  OWN SLOT involvesCompany
    Values: CDC IBM
    Cardinality.Min: 1
  OWN SLOT hasBudget
    Cardinality.Max: 1
  OWN SLOT managedByManager
    Values: Bill
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(l). KEE unit for Iris facts.

UNIT Noah
  OWN SLOT involvesCompany
    Values: CDC
    Cardinality.Min: 1
  OWN SLOT hasBudget
    Values: 25000
    Cardinality.Max: 1
  OWN SLOT managedByManager
    Values: Gerard
    Cardinality.Min: 1
    Cardinality.Max: 1
Fig. 13(m). KEE unit for Noah facts.
Notice that the KEE representation includes a good deal of redundancy: the specification of every fact type (and, therefore, every constraint on a fact type) is redundantly stored with every instance of the entity type involved in that fact type. Some entity types in the NIAM conceptual schema are mapped to KEE units. This mapping is specified by the conceptual schema in Fig. 14. Notice that the mapping is one-to-one (specified by the uniqueness constraints over the mapped to and mapped from roles) but not total (because the mapped-to role is not mandatory).
Fig. 14. Mapping schema for entity types. (Entity Type Name is mapped to Unit Name; in the example population, Person, Employee, Manager, City and Project are each mapped to the unit of the same name.)
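The two properties of the entity type mapping (one-to-one, but not total) can be checked mechanically. The sketch below represents the mapping as a Python dictionary, which already guarantees that each entity type is mapped to at most one unit. The function names are hypothetical, and the unmapped entity types "Company" and "Country" are assumed from the slot names in Fig. 13.

```python
def is_one_to_one(type_to_unit):
    """No two entity types may be mapped to the same unit."""
    units = list(type_to_unit.values())
    return len(units) == len(set(units))

def is_total(type_to_unit, entity_types):
    """True only if every entity type has been mapped to a unit."""
    return all(entity_type in type_to_unit for entity_type in entity_types)

# Entity types chosen for representation as units in the running example.
TYPE_TO_UNIT = {
    "Person": "Person",
    "Employee": "Employee",
    "Manager": "Manager",
    "City": "City",
    "Project": "Project",
}

# "Company" and "Country" are assumed names for entity types that are
# represented only as values, never as units.
ENTITY_TYPES = list(TYPE_TO_UNIT) + ["Company", "Country"]
```

With this data the mapping is one-to-one and not total, which is the situation the constraints in Fig. 14 permit.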
Fig. 15. An example of role grouping.
Every fact type in the NIAM conceptual schema must be represented by at least one KEE MEMBER slot. More precisely, each binary fact type could be represented by two MEMBER slots, if both the entity types involved are represented by units. This is the symmetric redundancy case. If a binary fact is represented by more than two MEMBER slots, then the extra slots are simply duplicates. This is never useful. So, we could define a mapping relation between MEMBER slots and fact types. However, it is better to define the mapping between roles and MEMBER slots. The mapping between a fact type and a member slot is derivable from the mapping between its component roles and their member slots.
Fig. 15 shows the relationship between the roles in a fact type and the slots in a unit (assuming that the facts are to be grouped with the Person entity type). Recall from the KEE metaschema that a slot is identified by the combination of a unit name and a slot name. In other words, slots have information-bearing names. For clarity in the diagram, we have identified slots by expressions of the form U.S, where U is the unit name and S is the slot name, rather than showing the involved reference types.
At least one role in each fact type must be mapped to a MEMBER slot if the KEE knowledge base is to be representationally equivalent to the NIAM conceptual schema. This is specified by the mapping schema fragment in Fig. 16.
Subtype relationships in the conceptual schema are mapped onto subtype links in the KEE knowledge base. Fig. 17 shows this mapping. The artificial identifiers ([S1], [S2], [L1], [L2]) are used to conveniently identify instances of the two nested fact types.
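The requirement that at least one role in each fact type be mapped to a MEMBER slot can be expressed as a simple check. The following is a sketch only: the fact type labels, the dictionary layout and the function names are assumptions for illustration, and role names are treated as unique across the schema for simplicity.

```python
def unmapped_fact_types(fact_types, role_to_slot):
    """Fact types with no mapped role; these would lose information."""
    return [name for name, roles in fact_types.items()
            if not any(role in role_to_slot for role in roles)]

def symmetric_fact_types(fact_types, role_to_slot):
    """Fact types with both roles mapped (symmetric redundancy)."""
    return [name for name, roles in fact_types.items()
            if all(role in role_to_slot for role in roles)]

# Each binary fact type is described by its pair of role names.
FACT_TYPES = {
    "Person holds Job": ("holds", "is held by"),
    "Manager manages Project": ("manages", "managed by"),
}

# Slots are written U.S (unit name, slot name), as in the text.
ROLE_TO_SLOT = {
    "holds": "Person.holdsJob",
    "manages": "Manager.managesProject",
    "managed by": "Project.managedByManager",
}
```

In this population every fact type has at least one mapped role, and only the manages fact type is stored symmetrically.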
Fig. 16. Mapping schema for roles. (Role Name is mapped to a slot; the slots in the example population are Person.holdsJob, Person.worksForCompany, Manager.managesProject, Project.managedByManager, Employee.worksOnProject, Project.hasBudget, Project.involvesCompany and City.locationOfCompany.)
Fig. 17. Mapping schema for subtype relationships. (Example population: the subtype relationships [S1] and [S2] are mapped to the subclass links [L1] Employee, Person and [L2] Manager, Person.)
The uniqueness constraints over the mapped to and mapped from roles ensure that the mapping between (direct) subtype relationships and subtype links is a one-to-one mapping. The mapped-to role is not total because some subtype relationships may not be specified (because one or both of the entity types involved were not represented as units).
Type-instance relationships are not explicitly represented in a NIAM conceptual schema. However, they are implicitly represented because the population is associated with the fact types in the schema. More precisely, the type-instance fact is not properly a part of the conceptual schema, but is represented in the architectural division between the conceptual schema and the database. In KEE, these type-instance relationships are shown as member links between units. Fig. 18 shows this mapping.
In the conceptual metaschema fragment of Fig. 18, the type-instance relationship is derivable: if an entity instance EI plays a role R in some fact instance FI, and the entity type ET plays the same role R in the fact type FT (and FI is an instance of FT), then it follows that EI is an instance of ET. This can be written as a formal derivation rule and added to the conceptual schema definition.
It appears at first that an error has been made in this schema. Why hasn't the type-instance fact type been nested to create an entity type (called Type-Instance Relationship) which is mapped-to the (nested) Type-Instance Link entity type? This situation is shown in Fig. 19.
The schema in Fig. 19 is incorrect because it has a uniqueness constraint on a nested fact type which does not include all of the roles in the nested fact type. Formally, then, for every entity instance (every instance of the Entity Instance entity type) there can be precisely one associated Entity Type (an instance of the Entity Type entity type) recorded.
For every one of these fact instances that can be recorded, there can be at most one mapping fact recorded (to link the type-instance relationship to a member link). Therefore, for every entity instance recorded, there can be at most one mapping fact recorded. This means that nesting the type-instance fact type is unnecessary (and technically incorrect). This is why the mapping fact involves the Entity Instance entity type and not a nested type-instance fact type. Again, the "short" uniqueness constraints over each of the mapping roles ensure that the mapping is one-to-one. The mapped-to role is not mandatory because some type-instance relationships may not be representable as member links in KEE (i.e. where either the entity instance or the entity type has not been represented as a unit).
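The derivation rule for type-instance relationships stated above (an entity instance playing role R belongs to the entity type that plays R in the fact type) lends itself to a direct sketch. The data layout and names below are assumptions for illustration; role names are again treated as unique across the schema.

```python
def derive_type_instances(fact_instances, role_played_by):
    """Apply the rule: if EI plays role R in a fact instance, and ET plays
    R in the fact type, then EI is an instance of ET."""
    derived = set()
    for roles in fact_instances:
        for role, entity_instance in roles.items():
            derived.add((entity_instance, role_played_by[role]))
    return derived

# Which entity type plays each role in the schema (illustrative subset).
ROLE_PLAYED_BY = {"manages": "Manager", "managed by": "Project"}

# Each fact instance records which entity instance plays which role.
FACT_INSTANCES = [
    {"manages": "Bill", "managed by": "Iris"},
    {"manages": "Gerard", "managed by": "Noah"},
]

TYPE_INSTANCES = derive_type_instances(FACT_INSTANCES, ROLE_PLAYED_BY)
```

Each derived pair corresponds to a candidate member link in KEE, provided both the instance and the type are represented as units.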
Fig. 18. Mapping schema for type-instance relationships. (The example population includes the member links [L3] Bill, Manager; [L4] Robert, Employee; [L5] Gerard, Manager; [L6] Iris, Project.)
Fig. 19. An alternative mapping for type-instance relationships.
Fig. 20. Mapping schema for entity instances. (In the example population, the entity instances Bill, Iris and Robert are mapped to units of the same name, and CDC and IBM are mapped to values.)
Finally, entity instances can be mapped into KEE units or into Values facet values. This mapping is specified in Fig. 20. The mandatory role disjunction ensures that every entity instance in the database is mapped to a KEE unit or to a facet value. This prevents information loss.
Fact instances are not mapped explicitly. As I indicated previously, the mapping between a fact instance and a unit, slot, value combination is derivable from three component mappings: the one-to-one mapping between an entity instance and the unit, the one-to-one mapping between an entity instance and a facet value and the one-to-one mapping between a role and a slot.
This completes the mapping definitions for the structural components of the NIAM model. This is perfectly adequate if we have a conceptual information processor to enforce the conceptual schema constraints. However, this is often not the case. If there is no conceptual information processor, then the internal information processor must take responsibility for enforcing these constraints. If this is the case, then we will want to map as many conceptual schema constraints onto their equivalent (declarative) internal schema constraints as possible. Any remaining conceptual schema constraints will need to be mapped onto constraint procedures at the internal level in order to be enforced.
KEE is capable of declaratively representing the following (restricted) set of NIAM constraints:
1. single-role uniqueness constraints
2. single-role mandatory role constraints
Recall from Section 4 that some single-role uniqueness constraints can be mapped onto the Cardinality.Max facet value for the slot which represents the fact type containing the constrained role. Fig. 21 shows this mapping relation. The "long" uniqueness constraint C8 cannot be mapped to a declarative KEE constraint, so it is not shown as part of the mapping schema population.
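The derivation of a fact instance's unit, slot, value combination from the three component mappings can be sketched as follows. This is an illustrative reading of the text, not the paper's formal rule; the function and parameter names are hypothetical, and slots are given by name only (the unit is supplied by the entity instance mapping).

```python
def derive_triples(fact, instance_to_unit, instance_to_value, role_to_slot):
    """Derive (unit, slot, value) triples for one binary fact instance.

    `fact` is ((EI1, R1), (EI2, R2)). A triple exists for a role when the
    role is mapped to a slot, its player is mapped to a unit, and the other
    entity instance is mapped to a facet value.
    """
    first, second = fact
    triples = []
    for (subject, role), (other, _other_role) in ((first, second), (second, first)):
        if (role in role_to_slot
                and subject in instance_to_unit
                and other in instance_to_value):
            triples.append((instance_to_unit[subject],
                            role_to_slot[role],
                            instance_to_value[other]))
    return triples

FACT = (("Bill", "manages"), ("Iris", "managed by"))
ROLE_TO_SLOT = {"manages": "managesProject", "managed by": "managedByManager"}
NAMES = {"Bill": "Bill", "Iris": "Iris"}  # instances keep their own names
```

When both roles are mapped (symmetric redundancy) two triples are derived; when only one role is mapped, only one is.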
It is only possible to represent single-role uniqueness constraints over the role played by the "pivot" entity type (that is, the entity type which is the basis for the fact grouping). If a fact type is represented twice, by grouping with both entity types, then each single-role uniqueness constraint can be represented exactly once, in a slot in the unit that corresponds to the entity type that plays the unique role. This means that no single-role uniqueness constraint will be represented more than once. Hence the mandatory role constraint over the mapped to role in the mapping conceptual schema.
Single-role mandatory role constraints can be represented by Cardinality.Min constraints. Fig. 22 specifies this mapping relation.
Fig. 21. Mapping schema for single role uniqueness constraints. (Example population: the uniqueness constraints governing the roles holds, works for, manages and managed by are mapped to the Cardinality.Max value 1, identified as [V1] to [V4], in the slots Person.holdsJob, Person.worksForCompany, Manager.managesProject and Project.managedByManager.)
Fig. 22. Mapping schema for single-role mandatory role constraints. (Example population: the mandatory role constraints governing the roles holds, works for, manages and involves are mapped to the Cardinality.Min value 1, identified as [V5] to [V8], in the slots Person.holdsJob, Person.worksForCompany, Manager.managesProject and Project.involvesCompany.)
Again, it is only possible to represent a single-role mandatory role constraint over the role played by the "pivot" entity type (that is, the entity type which is the basis for the fact grouping).
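The constraint mappings of this section (single-role uniqueness to Cardinality.Max = 1, single-role mandatory role to Cardinality.Min = 1, both only for the pivot role) can be sketched as a pair of small functions. The dictionary layout and function names are assumptions for illustration and do not reflect how KEE itself stores or checks facets.

```python
def cardinality_facets(pivot_role, unique_roles, mandatory_roles):
    """Facet values for the slot mapped from `pivot_role`."""
    facets = {}
    if pivot_role in mandatory_roles:
        facets["Cardinality.Min"] = 1
    if pivot_role in unique_roles:
        facets["Cardinality.Max"] = 1
    return facets

def violations(values, facets):
    """Report which cardinality facets a Values facet currently violates."""
    problems = []
    if len(values) < facets.get("Cardinality.Min", 0):
        problems.append("Cardinality.Min")
    if "Cardinality.Max" in facets and len(values) > facets["Cardinality.Max"]:
        problems.append("Cardinality.Max")
    return problems

# Roles constrained in the running example (illustrative subset).
UNIQUE_ROLES = {"manages", "has budget"}
MANDATORY_ROLES = {"manages", "involves"}
```

For instance, the involvesCompany slot of Fig. 13(l) carries only Cardinality.Min, so it may hold several companies but not none, while hasBudget carries only Cardinality.Max and may be empty.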
6. Mapping the dynamic aspects
The previous section discussed a (representational equivalence) mapping between the corresponding NIAM constructs and KEE constructs. This mapping schema included the structural (fact-encoding) components of the restricted NIAM model and a limited set of the constraints. Given that a NIAM knowledge base and a KEE knowledge base have been defined, and the mapping between them constructed, the next task is to map the dynamic aspects.
More precisely, it is possible to update the fact population of a conceptual schema using two operators: ADD FACT and DELETE FACT. These two operators are theoretically sufficient to make any permitted database change. Let's consider the case of adding a fact to the database first. Note that we will not consider the possibility of adding a subtype relationship (because the subtype structure is part of the conceptual schema, not part of the knowledge base). We can formally define the syntax of an ADD operator as
ADD (FT, EI1, R1, EI2, R2)
where FT is a (binary) fact type, EI1 and EI2 are entity instances, and R1 and R2 are the roles (in FT) that EI1 and EI2 play. The semantics of this operator (as implemented by the CIP) are: examine the conceptual schema, and its existing fact population, check that the fact to be added satisfies the required constraints, and, if so, add the fact instance to the fact population.
To implement this ADD operation across the NIAM → KEE mapping, it is necessary to transform it into a set of additional ADD operations that ensure the contents of the KEE metaschema (that is, the KEE knowledge base) and the contents of the mapping schema (that is, the current mapping information) are altered appropriately.
The task of mapping this ADD operation is actually quite simple. The only difficulty is working out which entity instance gets mapped to a unit; the other must then get mapped to a value, in a slot (which is mapped from the fact type) in that unit. The procedure shown in Fig. 23 performs this mapping.
An example of this procedure's behaviour might help to understand it. Consider the following ADD operation, applied to the mapping schema developed earlier.
ADD (MP, Bill, manages, Isaac, managed by)
This statement adds the fact instance
The manager Bill manages the project Isaac
to the database. We will assume that this fact instance satisfies all constraints. Now, if we consider the mapping developed before, each of the roles in this fact type is mapped to a different slot. Conversely, because arbitrary facts can only be encoded by unit, slot, value triples, the other role (more precisely: the entity type playing the other role) must be mapped to a unit.
PROCEDURE MAP (ADD (FT, EI1, R1, EI2, R2))
  IF there is a slot S mapped from role R1 THEN
    IF there is NO unit U mapped from entity instance EI1 THEN
      /* Map EI1 to a unit */
      create a unit, U, with the same name as EI1
      create a member link between the unit U and the unit UT
        (mapped from the entity type of EI1)
      ADD Entity Instance EI1 is mapped to Unit U
      ADD (Entity Instance EI1 belongs to Entity Type ET1)
        is mapped to (Unit U is member of Unit UT)
    ENDIF
    /* Map EI2 to a value */
    LET S1 = OWN Slot (in unit U) inherited from slot S
    create a value, V, with the same name as EI2,
      in the Values facet of the slot S1 in unit U
    ADD Entity Instance EI2 mapped to Facet Value V
  ENDIF
  /* Now do the same processing for the inverse slot, if needed */
  IF there is a slot S mapped from role R2 THEN
    IF there is NO unit U mapped from entity instance EI2 THEN
      /* Map EI2 to a unit */
      create a unit, U, with the same name as EI2
      create a member link between the unit U and the unit UT
        (mapped from the entity type of EI2)
      ADD Entity Instance EI2 is mapped to Unit U
      ADD (Entity Instance EI2 belongs to Entity Type ET2)
        is mapped to (Unit U is member of Unit UT)
    ENDIF
    /* Map EI1 to a value */
    LET S1 = OWN Slot (in unit U) inherited from slot S
    create a value, V, with the same name as EI1,
      in the Values facet of the slot S1 in unit U
    ADD Entity Instance EI1 mapped to Facet Value V
  ENDIF
END PROCEDURE
Fig. 23. Mapping procedure for the ADD operation.
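The ADD mapping procedure can be approximated in a few lines of Python. This is a simplified sketch, not the paper's procedure verbatim: the knowledge base and mapping database are plain dictionaries, units and values take the same names as the entity instances they are mapped from, each entity type is assumed to be mapped to a unit of the same name, and constraint checking by the CIP is assumed to have already succeeded. All function and key names are hypothetical.

```python
def empty_mapping():
    """Mapping database: which instances became units, values, member links."""
    return {"instance_to_unit": {}, "instance_to_value": {}, "member_links": set()}

def map_add(kb, mapping, ei1, r1, ei2, r2, role_to_slot, type_of):
    """Sketch of MAP(ADD(FT, EI1, R1, EI2, R2)) for one binary fact instance."""
    # Process role R1 first, then the inverse role R2, as in the procedure.
    for subject, role, other in ((ei1, r1, ei2), (ei2, r2, ei1)):
        slot = role_to_slot.get(role)
        if slot is None:
            continue  # no slot is mapped from this role
        if subject not in mapping["instance_to_unit"]:
            # Map the subject to a new unit and record its member link.
            unit_type = type_of[subject]
            kb[subject] = {"member_of": unit_type, "slots": {}}
            mapping["instance_to_unit"][subject] = subject
            mapping["member_links"].add((subject, unit_type))
        unit = kb[mapping["instance_to_unit"][subject]]
        # Map the other entity instance to a value in the (own) slot.
        unit["slots"].setdefault(slot, []).append(other)
        mapping["instance_to_value"][other] = other
    return kb, mapping

ROLE_TO_SLOT = {"manages": "managesProject", "managed by": "managedByManager"}
# Entity types here follow Fig. 13(c) and 13(e); this is an assumption.
TYPE_OF = {"Bill": "Manager", "Isaac": "Project"}
```

Calling `map_add({}, empty_mapping(), "Bill", "manages", "Isaac", "managed by", ROLE_TO_SLOT, TYPE_OF)` creates both units, because both roles are mapped to slots (the symmetric case).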
In the first case,
managed by → UNIT Manager
manages → SLOT managesProject
In the second case,
managed by → UNIT Project
manages → SLOT managedByManager
Taking the first case (the managesProject slot) first, the IF condition evaluates to true, so we must map entity instance Bill (EI1) to a unit, and entity instance Isaac (EI2) to a facet value.
Next, we check to see whether the entity instance Bill has already been mapped to a unit or not. We will assume that it has not (therefore, this is the first fact instance recorded about that entity instance). The following facts are added to the mapping database:
Entity Instance Bill is mapped to Unit Bill
Unit Bill is a member of Unit Person
(Entity Instance Bill belongs to Entity Type Person) is mapped to (Unit Bill is a member of Unit Person)
PROCEDURE MAP (DELETE (FT, EI1, R1, EI2, R2))
  IF there is a slot S mapped from role R1 THEN
    /* Delete the value mapped from EI2 */
    LET U = the unit mapped from Entity Instance EI1
    LET S1 = OWN Slot (in unit U) inherited from slot S
    LET V = the Facet Value mapped from Entity Instance EI2
    delete the value, V, from the Values facet of the slot S1 in unit U
    DELETE Entity Instance EI2 mapped to Facet Value V
    /* Check if unit U is empty */
    IF there are NO entity instances mapped to facet values in any
       Values facet of any OWN slot in unit U THEN
      delete unit U
      DELETE Entity Instance EI1 is mapped to Unit U
      DELETE Unit U is member of Unit
        (mapped from Entity Type (contains Entity Instance EI1))
      DELETE (Entity Instance EI1 belongs to Entity Type ET1)
        is mapped to (Unit U is member of Unit
        (mapped from Entity Type (contains Entity Instance EI1)))
    ENDIF
  ENDIF
  /* Now do the same processing for the inverse slot, if needed */
  IF there is a slot S mapped from role R2 THEN
    /* Delete the value mapped from EI1 */
    LET U = the unit mapped from Entity Instance EI2
    LET S1 = OWN Slot (in unit U) inherited from slot S
    LET V = the Facet Value mapped from Entity Instance EI1
    delete the value, V, from the Values facet of the slot S1 in unit U
    DELETE Entity Instance EI1 mapped to Facet Value V
    /* Check if unit U is empty */
    IF there are NO entity instances mapped to facet values in any
       Values facet of any OWN slot in unit U THEN
      delete unit U
      DELETE Entity Instance EI2 is mapped to Unit U
      DELETE Unit U is member of Unit
        (mapped from Entity Type (contains Entity Instance EI2))
      DELETE (Entity Instance EI2 belongs to Entity Type ET2)
        is mapped to (Unit U is member of Unit
        (mapped from Entity Type (contains Entity Instance EI2)))
    ENDIF
  ENDIF
END PROCEDURE
Fig. 24. Mapping procedure for the delete operation.
KEE's inheritance procedure automatically adds the following facts:

Unit Bill contains a Slot with Slot Name managesProject
Slot Bill.managesProject is of Slot Type Own

Finally, the procedure creates the facet value which represents EI2 and adds the remaining mapping information:

Slot Bill.managesProject has Values value of Facet Value Isaac
Entity Instance Isaac mapped to Facet Value Isaac

The mapping procedure is precisely the same for the other slot, except that the ELSE part will be executed, so EI1 will be mapped to a facet value and EI2 will be mapped to a unit. The procedure for mapping a DELETE operation (Fig. 24) is also straightforward; the only difficulty is the deletion of "empty" units when the last fact instance encoded in a unit has been deleted.
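The ADD and DELETE mappings for a single slot can be illustrated with a minimal Python sketch. It is not KEE code: the classes Unit and KnowledgeBase and their methods are hypothetical names introduced here, KEE's inheritance of an own slot from the class unit is approximated by creating the slot on first use, and the inverse slot (symmetric redundancy) would be handled by a second call with the roles swapped.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for a KEE unit with own slots."""
    name: str
    member_of: str                              # class unit mapped from the entity type
    slots: dict = field(default_factory=dict)   # slot name -> set of facet values


@dataclass
class KnowledgeBase:
    """Hypothetical knowledge base plus the instance-to-unit mapping database."""
    units: dict = field(default_factory=dict)             # unit name -> Unit
    instance_to_unit: dict = field(default_factory=dict)  # entity instance -> unit name

    def add_fact(self, ei1, entity_type, slot, ei2):
        """ADD: encode ei2 as a facet value in the unit mapped from ei1."""
        unit_name = self.instance_to_unit.get(ei1)
        if unit_name is None:
            # First fact about ei1: create its unit as a member of the class unit.
            unit_name = ei1
            self.units[unit_name] = Unit(unit_name, member_of=entity_type)
            self.instance_to_unit[ei1] = unit_name
        # Creating the slot on demand approximates KEE's slot inheritance.
        self.units[unit_name].slots.setdefault(slot, set()).add(ei2)

    def delete_fact(self, ei1, slot, ei2):
        """DELETE: remove the facet value, then drop the unit if it is empty."""
        unit_name = self.instance_to_unit.get(ei1)
        if unit_name is None:
            return
        unit = self.units[unit_name]
        unit.slots.get(slot, set()).discard(ei2)
        # A unit is "empty" when no own slot holds any value.
        if not any(unit.slots.values()):
            del self.units[unit_name]
            del self.instance_to_unit[ei1]


kb = KnowledgeBase()
kb.add_fact("Bill", "Person", "managesProject", "Isaac")
kb.delete_fact("Bill", "managesProject", "Isaac")  # last fact: unit Bill is removed
```

The emptiness check runs over every own slot of the unit, not only the slot just updated, which mirrors the condition in Fig. 24.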
7. Conclusions
In this paper, I have demonstrated that KEE's frame language is unsuitable for use at the conceptual level, but possibly suitable for use at the internal level. I have defined a conceptual schema to store the information to map facts between a NIAM conceptual schema and a representationally equivalent KEE frame knowledge base (with or without symmetric redundancy). I have also specified procedures to map the conceptual update operators (ADD fact and DELETE fact) into the corresponding operations on the KEE knowledge base and mapping database. An obvious extension is to compose a NIAM to Relational mapping with a NIAM to KEE mapping, as shown in Fig. 25. This composed mapping would allow the free exchange of data (encoded facts) between a relational database and a representationally equivalent KEE knowledge base. The relational operations would be mapped into a set of conceptual updates, which would be mapped into a set of KEE updates. The semantics of the knowledge base would be defined in the conceptual schema, and enforced by the conceptual information processor.
[Figure omitted. Labels: Conceptual Schema; NIAM-KEE Transformer; NIAM-Rel. Transformer; KEE; RELATIONAL.]
Fig. 25. A relational to KEE mapping.
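The composition through the conceptual level can be sketched in Python as two small functions chained together. This is an illustration under stated assumptions, not an implementation of either transformer: the function names, the tuple shapes, the operation labels "add-value" and "remove-value", and the column-to-role table are all hypothetical.

```python
def relational_to_conceptual(op, row, key, roles):
    """Map a row-level INSERT or DELETE to a list of conceptual updates.

    roles: column name -> role name (hypothetical mapping information).
    """
    conceptual_op = {"INSERT": "ADD", "DELETE": "DELETE"}[op]
    return [
        (conceptual_op, row[key], role, row[column])
        for column, role in roles.items()
        if row.get(column) is not None  # a null column encodes no fact
    ]


def conceptual_to_kee(update):
    """Map one conceptual update to a (hypothetical) KEE slot operation."""
    op, entity, role, value = update
    kee_op = "add-value" if op == "ADD" else "remove-value"
    return (kee_op, entity, role, value)


def relational_to_kee(op, row, key, roles):
    """Compose the two mappings; every update passes through the conceptual level."""
    return [conceptual_to_kee(u)
            for u in relational_to_conceptual(op, row, key, roles)]


row = {"name": "Bill", "manages": "Isaac", "office": None}
roles = {"manages": "managesProject", "office": "occupiesOffice"}
updates = relational_to_kee("INSERT", row, "name", roles)
```

Because both mappings meet at the list of conceptual updates, that list is the natural point at which a conceptual information processor could check constraints before anything reaches the KEE knowledge base.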
[Figure omitted. Labels: Conceptual Schema; NIAM-KEE Transformer (twice); KEE kb; virtual transformation.]
Fig. 26. A KEE to KEE mapping.
Existing work in this area, such as Intellicorp's KEEconnection, requires extremely complex mappings because of the different ways in which facts can be defined in KEE knowledge bases, and does not seem to permit all possible encoding combinations.

Similarly, this mapping would permit the transfer of (encoded) facts between two KEE knowledge bases, each of which may have a different internal representation of the same facts, as shown in Fig. 26. This provides a rational semantic basis for the (difficult) task of integrating different KEE knowledge bases.

Finally, knowledge engineering is generally considered a difficult task which requires strong prescriptive guidance for the knowledge engineer. There are no theories or design procedures to guide a knowledge engineer in constructing a KEE knowledge base to model some Universe of Discourse. However, there is a well-defined, and extensively tested, design procedure for designing a high-quality, user-validated NIAM conceptual schema for any Universe of Discourse ([15, 16]). The mapping permits the possibility of constructing a conceptual schema for the Universe of Discourse, mapping that into a representationally equivalent KEE knowledge base, and using that mapping information to transform conceptual updates to KEE updates.
Acknowledgements
This research was made possible by the UNISYS AI Research Grant awarded to the Department of Computer Science, University of Queensland. Comments on an earlier version by Professor Sjir Nijssen, David Duke and the referees have led to improvements in the presentation of this paper.
References and bibliography
[1] CODASYL: Database Task Group Report, ACM, New York, 1971.
[2] R.M. Abarbanel and M.D. Williams, A Relational Representation for Knowledge Bases, Intellicorp, 1987, Internal Report.
[3] R.J. Brachman and H.J. Levesque, What makes a knowledge base knowledgeable? A view of databases from the knowledge level, in Expert Database Systems, L. Kerschberg (ed.) (Benjamin-Cummings, Menlo Park, California, 1986).
[4] E. Chow, Representing databases in frames, Proceedings AAAI-87,
[5] M. Deering and J. Faletti, Database support for storage of AI reasoning knowledge, in Expert Database Systems, L. Kerschberg (ed.) (Benjamin-Cummings, 1986).
[6] E. Falkenberg, Foundations of the conceptual schema approach to information systems, in Lecture Notes of the NATO Advanced Study Institute on Database Management and Applications, June 1-13, 1981, Portugal (North-Holland, 1982).
[7] R.E. Fikes and T. Kehler, The role of frame-based representation in reasoning, Comm. ACM 28, 9 (Sept. 1985) 904-920.
[8] W. Kent, Data and Reality (North-Holland, 1982).
[9] R. Meersman, The high-level end user, in Infotech State of the Art Report, Series 10, No. 7 (Pergamon Press UK, 1982).
[10] R. Meersman, O. De Troyer and F. Ponsaert, RIDL: User Guide, Control Data (Belgium) Internal Report, February 1984.
[11] R. Meersman, Knowledge and data: A survey in the margin of the IFIP DS-2 conference, in Entity-Relationship Approach, S. Spaccapietra (ed.) (North-Holland, 1987).
[12] G.M. Nijssen, Two major flaws in the CODASYL DDL 1973 and proposed corrections, Information Systems (1975) 115-132.
[13] G.M. Nijssen, On the gross management for the next generation database management systems, in B. Gilchrist (ed.) Proc. IFIP World Congress (1977).
[14] G.M. Nijssen, A framework for advanced mass storage applications, Proceedings IFIP Medinfo 1980 (Tokyo, 1980).
[15] G.M. Nijssen, On experience with large-scale teaching and use of fact-based conceptual schemas in industry and university, in R. Meersman and T.B. Steel Jr. (eds.) Proceedings of IFIP Conference on Data Semantics (DS-1) (Elsevier North-Holland, 1986).
[16] G.M. Nijssen and T.A. Halpin, Conceptual Schema and Relational Database Design: A Fact-Based Approach (Prentice-Hall, 1989).
[17] M. Stonebraker, Adding semantic knowledge to a relational database system, in M.L. Brodie, J. Mylopoulos and J.W. Schmidt (eds.), On Conceptual Modelling (Springer-Verlag, 1984).
[18] S. Twine, From information analysis towards knowledge analysis, Proc. 2nd European Workshop on Knowledge Acquisition for Knowledge-Based Systems (Bonn, Federal Republic of Germany, June, 1988).
[19] S. Twine, Representing facts in KEE's frame language, Proc. IFIP WG2.6/WG8.1 Conf. on the Role of Artificial Intelligence in Databases and Expert Systems (Guangzhou, China, July, 1988).
[20] S. Twine, Towards a knowledge engineering procedure, Proceedings Expert Systems '88 (Brighton, UK, December, 1988).
[21] Concepts and Terminology for the Conceptual Schema and the Information Base, J.J. van Griethuysen (ed.), 1982. Report of ISO TC97/SC5/WG3.
[22] G.M.A. Verheijen and J.V. Bekkum, NIAM: An information analysis method, in T.W. Olle, H.G. Sol and A.A. Verrijn-Stuart (eds.) Information Systems Design Methodologies: A Comparative Review (North-Holland, 1982).