Can a large knowledge base be built by importing and unifying diverse knowledge?: lessons from scruffy work

G Berg-Cross

Two different roads to the modelling of intelligence and the building of intelligent systems have been proposed: the 'neat' (logical) versus the 'scruffy' (ad hoc) philosophy applied to the building of AI systems. The paper revisits this issue, and characterizes the nature of recent relevant work, with particular emphasis on the Cyc project. A constructionist perspective akin to Piagetian work and Sowa's crystallizing of theory is espoused. This perspective notes that scruffy, bottom-up methods provide a developmental basis for more formal theories which, in turn, provide a further bootstrapping of subsequent scruffy development of important formal elements, such as conceptual catalogues. Particular emphasis is placed on conceptual-analysis results that are available from the Cyc project, which is attempting to achieve robust intelligence using vast pools of handcrafted knowledge. This effort, like a good conceptual catalogue, is an attempt to provide an empirical basis for knowledge acquisition via automated understanding of documentation and machine learning.

Keywords: intelligence systems, conceptual analysis, knowledge acquisition, neat and scruffy philosophies

Advanced Decision Systems, Suite 800, 2111 Wilson Boulevard, Arlington, VA 22209, USA
Paper received 11 March 1992. Accepted 13 April 1992.
Knowledge-Based Systems Vol 5 No 3 September 1992

For the author, research on large repositories of knowledge grows out of his conflicting experiences in knowledge engineering several large knowledge bases (KBs) on the one hand, and researching issues for complex KBs on the other. This is a topic that was discussed at the first Conceptual Graphs Workshop [1]. Unfortunately, none of this knowledge-engineering work combines largeness with complexity. Like many, the author believes that a
harmonizing of the two is needed to understand, research and ultimately build robust systems of substantial, real intelligence. This paper discusses a pragmatic approach to building large and complex knowledge bases. Particular use is made of consensual repositories of knowledge achieved through exploratory knowledge engineering. Guha and Lenat [2], reporting on the Cyc project, argue that there is no elegant, seemingly low-effort road to achieving large, complex KBs and a resulting robust intelligence. Three low-effort candidates that they reject as a road to truly intelligent systems are shown in Figure 1: the natural-language understanding of documents, the automated building of knowledge via machine learning, and the unification of knowledge from existing expert systems/KBs. Guha and Lenat reject natural-language understanding as a 'free-lunch' route, because it requires a large amount of common-sense knowledge (shown in the natural-language knowledge-base (NLKB) box in Figure 1) to handle the extraction of knowledge from documents. (Some discussion of domain-specific issues of such extraction is found in Reference 3.) Machine learning (ML) (e.g. through inventing new concepts similar to those already known) is rejected as an automated method of building KBs, because similarity learning has not been mastered, and there is not a good, broad range of concepts with which to seed a learning system. Indeed, ML is also seen as a potential beneficiary of a large, fundamental KB. The final approach proposes achieving the goal by a unification of diverse KBs built for expert systems or research. Several arguments are raised against these by Guha and Lenat, including the problem of different representations, different names, and the lack of a 'semantic glue' to hold domain-specific knowledge together. The open hypothesis motivating this paper is that the conceptual-structures [4] approach may overcome each of these problems, and thus might achieve some large, complex
0950-7051/92/030245-10 © 1992 Butterworth-Heinemann Ltd
[Figure 1. Roads to large, complex knowledge bases. L-CKB: large, complex knowledge base. The figure shows three roads to L-CKBs: natural-language understanding of documents, machine-learning discovery, and the unification of existing KBs, together with a natural-language knowledge base.]
KB by the unification of diverse knowledge accumulated by separate workers. Perspectives for the discussion are developed by first considering a scruffy, evolutionary approach to conceptual analysis. The experience of the Cyc project is partially summarized through examples, and then some implications for conceptual-graph (CG) KBs and knowledge unification are noted.
NEAT THEORIES AND SCRUFFY METHODS

The neat-versus-scruffy dichotomy [5] characterizes two broad ways of thinking about and developing intelligent systems. The neat position, represented by Nilsson and McCarthy, espouses formal systems built on logic or mathematically based principles. Neat advocates favour elegant, homogeneous representational/reasoning systems, and precise approaches such as those that abounded in early theorem-proving systems. They argue that a formal basis provides clean paths to understanding. A milder, associated belief of this position is that logic is an essential ingredient in any competent system. Minsky, Schank and Hofstadter represent the scruffy point of view, which sees a weakness in such formalisms, and suggests that reliance on a formal base at this time is counterproductive. Formalisms are seen as distracting researchers from the hard, but truly important, problems of AI. Minsky recently formulated an objection that contrasts people's ability to build assembly robots with their inability to develop housework robots, as follows:

[Our success in the assembly line is] because the conditions in factories are constrained, [but] . . . the objects and activities of everyday life are too endlessly varied to be described by precise, logical definitions and deductions. Commonsense reality is too disorderly to represent in terms of universally valid axioms. To deal with such complexity we need more flexible styles of thought, such as we see in human commonsense reasoning, which is based more
on analogies and approximations than on precise formal procedures.

The neat-and-scruffy view of AI discussed above is largely focused on representational issues, but the dichotomy can be viewed more broadly at other levels. It exists at the level of methodology for building KBs, at a project level, at a philosophy-of-science level, and at a personal level. Advocates of the scruffy and neat AI positions often differ, if tacitly, on the proper degree of neatness at these other levels. Scruffy proponents can point to a number of problem-solving domains where early progress, using 'neat' theorem-proving approaches, has proven quite inadequate for the problems at hand. Notable among these is the move away from these pure approaches in natural-language processing, as the Schankian use of schemata proved more robust, and the move away from STRIPS-style planning, owing to its computational complexity. On a number of fronts, more ad hoc measures seem successful in this area, and they show a greater promise of development. Entire areas of AI have grown out of the episodic view of memory, which is contrasted with the more formalized semantic memory. Notable among these are case-based reasoning [6] and the adaptive planning of Hammond [7]. Given the continued success of this work, it is time to ask if insights from some ad hoc work can be assembled into a more consistent base for subsequent work. Indeed there are 'bases' to each approach; Schankian work includes standard scripts, MOPs and cases, for example.
Piagetian and cognitive rationales for scruffy understanding

'Neats' and 'scruffies' divide along several grounds, including those of what products they identify as success (theory versus programs), and how performance is used to judge success. They are, however, mutually interested in the phenomena of commonsense reasoning. Neats believe that commonsense is orderly, while scruffies see it as disorderly. The pursuit of common intelligence has a long tradition in psychology, and one can argue for a modern, cognitive thrust growing from a Piagetian tradition. One might place part of the cognitive-science roots of common-sense reasoning there, based on a number of different models that Piaget proposed to describe the development of children's knowledge and thinking styles. Piaget, for example, provides a balanced view of the cognitive structures and processes involved in the everyday logic of space, time and causality. Piaget also achieves a balance between neat models of logic and semilogic and the scruffy, heuristic principles by which humans attempt to stabilize their cognitive models of the world. His theories are based on a three-part clinical-critical, 'epistemological' method of

• observing child behaviour longitudinally; this provides a historical perspective,
• using language-intense interviews to provide a psychological perspective,
• gathering a rational (adult) perspective through the use of other scientific disciplines.

Analysis of repeated behaviour makes the work consistent with the stricter philosophy of behaviourism. Subjecting hypotheses to cross-discipline measures represents a scruffy strategy for ferreting out alternative explanations of the phenomena. Piaget pays great attention to the role of language in intelligence, but also sees logic as an additional mirror in which to observe thinking. His model of intelligence is partially developmental, in that biological structures play a role, especially in the early stages of intelligence. Increasingly neat (logical) structures emerge from interactions with the external factors of the 'logically organized' physical world.
This position is at odds with that of theorists such as Fodor and Jackendoff, who favour innately based rather than interactively derived concepts for recognizing such things as measurement and number. It is, in a word, scruffy, compared with a neat, innate position. Modern inheritors of the Piagetian perspective can be seen in qualitative physics and in Sowa's writings. For example, a balance between logic and practical reasoning can also be seen in Sowa's metaphor of crystallizing theories out of a knowledge soup [8]. Like Piagetian work, Sowa's examples use language and linguistic phenomena to outline the cognitive-structure model. From this base, a wide cognitive perspective is cast, including perception and planning. Piagetian theory's structural-developmental view of intelligence provides an additional way of viewing scruffy and neat approaches. Given an intelligent organism's structure, the Piagetian view is that it develops from essentially scruffy experiences, such as those of trying to understand sentences or images that one has not experienced, assimilating these into rudimentary cognitive structures, and accommodating to reality by constructing more formal structures and processes as it goes along. A case can be made that AI might proceed as a science in such a fashion. That is, science should organize a theory, microtheory or system around a few core ideas
and processes, run these against problems of increasing difficulty, and identify and abstract out more formalized structures and processes. This allows cognitive scientists time to master the emerging skill of representing knowledge. This approach has been used intentionally or unintentionally several times, and a very nice discussion of these ideas is contained in Reference 9. It might further be argued that Sowa's discussions of the use of conceptual analysis from a soup of information and of theory refinement via concept salience [10] are in this vein. On the face of it, this is more of a scruffy idea of science, but it includes a hidden formalism in its 'rudimentary' cognitive structures. These ideas are explored further in the next section.
Convergence, common sense and Cyc project

One area for convergence between the two views is commonsense knowledge. Scruffies see common sense and basic cognitive processing as more of an ad hoc assembly of knowledge than a logical organization. A working hypothesis of the scruffy position seems to be that AI research will 'discover' a good path to intelligent systems by trying lots of things and weeding through the results. This view speculates that a pattern will emerge from a collection of scruffy knowledge. Emergence, in this case, comes with the hard work of knowledge engineering and conceptual analysis of particulars. One effort along these lines is to build pragmatically large dictionaries or 'encyclopaedias' of knowledge. This effort, the Cyc project, has evolved out of an initial approach based on the 'expert-system'-like experimental analysis that abounded in the mid 1980s. In these early efforts, general issues of time, space, causality, substance and intent were ignored in favour of highly specific domain representations. By the late 1980s Cyc had moved to a 'middle ground that combines the lessons from a formal analysis with insights from an empirical approach' [2]. The current middle approach is outlined in Figure 2. Knowledge-base building still starts empirically with an examination of a very scruffy collection of 'snippets' from a variety of sources. These include the original entries from encyclopaedias (from which the name Cyc partially derives), children's stories, cartoons, and, more recently, news clippings. The move away from encyclopaedias as sources of information is based on the recognition that encyclopaedia entries presuppose substantial commonsense knowledge on the part of the reader. Children's stories also surprised the Cyc knowledge engineers by requiring large amounts of real-world knowledge.
Cartoons, on the other hand, entail many violations of physical laws, which provide insights into hidden assumptions. The current Cyc method encodes text fragments into a loose Cyc representation. These are accumulated, and then carefully refined and incorporated into the existing Cyc KB (see Figure 2) with a more formal representation. To ensure that this incorporation is carried out properly, the new Cyc KB is queried, and its 'understanding' is checked against human sensitivity. Refinements are made on the basis of these results, new, related assertions are added, and the scruffy query-analysis cycle is continued. On the whole, the approach bears a
[Figure 2. Modified (scruffy) experimental approach. After Reference 2. The cycle runs: examine text fragments from encyclopaedias, children's stories/cartoons and newspaper snippets (the scruffy start); add new assertions to a Cyc representation; generalize; create test questions and express them as Cyc queries; analyse the results and refine; and carry out a scruffy revision of the refined Cyc representation against the old Cyc knowledge base.]

remarkable resemblance to Piaget's three-stage 'epistemological' approach. It starts with somewhat isolated 'scruffy facts' that are organized by observation. These are translated into a communicable form to allow discussion and queries to improve them. Finally, a broader perspective is used as a final scruffy revision that adds details in diverse contexts. Historically, it may be worth noting that, in the same year that Cyc was initiated, Sowa [4] took a middle position in the overall scruffy-and-neat argument. As discussed in Reference 4, both neat and scruffy methods have something to offer in the development of intelligent systems, and each overlooks the limitations of its position. Neats, for example, overlook the heuristic value of schemata such as Schank's scripts. In Sowa's work, logic, as supplemented by Peirce's idea of context, is used partially to bridge the neat-scruffy gap. Another tempering element of Sowa's position is the balancing of a neat representational form (CGs) with a scruffy and practical method of conceptual analysis (described in Chapter 6 of Reference 4). The work reported in this paper follows in that spirit by exploring findings, particularly from Cyc researchers' (usually scruffy) conceptual analysis, with an eye towards converting these into a large, complex CG model. Things such as a standard 'seed' model for the development of large KBs would be desirable. Sowa's [4] concept catalogue and his prescribed set of conceptual relations provide some uniform help in the expression of consensual reality. A standardization of CG representations, such as is pursued by researchers at Unisys and the University of Minnesota, USA, further sparks the belief that diverse conceptual-analysis results, ontologies and
KBs may be joining into a large repository of common knowledge. Indeed, the integration of diverse domains expressed in CG formalisms may itself provide a large KB, despite the challenges that exist with naming conventions, concept ontologies and numerous simplifications and assumptions.

CONCEPTUAL CATALOGUES, COMMON SENSE AND REPRESENTATION IN CYC PROJECT

Scruffy approaches see efforts to reason about ordinary phenomena as growing out of millions of small learning episodes. An oversimplification of the position is to view childhood, when such learning goes on, as a repository or knowledge-base building phase. In this view, assembling a huge knowledge base of 'facts', such as a lexicon, is part of the cognitive basis that allows humans to deal with the world. In a sense, this view says that general knowledge/reasoning is built by first accumulating large amounts of domain-specific knowledge/reasoning, and then 'generalizing' it (e.g. into conceptual catalogues) through some abstracting process. Traditional expert systems are criticized for disguising their lack of such generalized knowledge by artificial restriction of their problem domains. Lenat and Guha [2] suggest, for example, that

. . . such systems are hopelessly brittle: they do not cope well with novelty, nor do they communicate well with each other. Also, the approach taken in such systems to the representation of knowledge just doesn't scale up.

Consistently with this view, the Cyc project is spending ten years of effort (1984-94) to get at commonsense functionality by assembling vast pools of facts into what is called a consensus KB that will overcome brittleness. The Cyc team assumes that humans rely on a broad core base of facts and general knowledge. The adjective 'consensus' is used to describe this knowledge, either because people tacitly act as if they agree with it as true, or because knowledge engineering and cognitive studies show it to be widely believed. This general knowledge is assumed to provide a basis for the learning of additional facts, and also to cushion people if specific rules do not apply. However, what knowledge might provide the best basis for this subsequent learning of facts? Lenat seems to fall back on a scruffy, but consistent, type of approach to decide the type of knowledge to encode. Much of it grows out of a nontheoretical and pragmatic notion that a KB should consist of things that people agree on and need to get along on. The Cyc approach is to put in '. . . a non-trivial fraction of consensus reality -- the millions of things that we all know, and that we assume everyone else knows'. Currently, the size of the KB is between 1M and 2M facts. As a frame-based system, knowledge can be detailed by more than 4000 different slots. The Cyc 'knowledge-engineering' approach has been described above (see Figure 2), and it is organized around a comprehensive, yet scruffy, idea. As people create rules, they define the terms that they place in them, and they later reflect back on these and work out inconsistencies. If this iterative and encyclopaedic approach is not taken, argues the Cyc team, one quickly runs into brittleness. As an example, Lenat and Guha offer the following rule as evidence of the brittleness of typical small facts placed in KBs.

IF the purchase price is greater than the
and you query 'are there unusual circumstances this month?'
and the response is 'yes'
THEN authorize the purchase

[Figure 3. Portion of concept analysis of 'unusual'. Terms include: Abnormal, Uncommon Arrangement, Unidentified, Unknown Object, New, Never Seen Before, Never Before Seen, Seen for the First Time, Never Before Observed, Never Observed Before, Never Present Before, Not Previously Noted, Established.]

There is a vast amount of knowledge quietly embedded
in this rule. For example, terms such as 'unusual circumstances' and 'authorize' are abstract concepts that could require considerable processing to recognize. Essentially, the rule cheats, by simplifying the true understanding of these terms into a very simple pattern match that fires the rule. The use of rules like this results in a brittle, superficially 'expert' system. Real understanding of the query and of the concepts in it, such as 'unusual', needs to be based on so many alternative phrasings that one might as well speak of natural-language understanding. In the author's own knowledge engineering, a concept such as 'unusual' or 'abnormal' is manifest in a statement in many ways. In one message-processing system, for an aerial-image domain, a group of domain analysts gave a large list of terms that they meant by 'unusual' in such an image. Image analysis makes the work interesting from a cognitive point of view, since the mapping between perceptual and language phenomena is involved. A small portion of their initial conceptual analysis is shown in Figure 3. 'Unusual' divides into an abnormal or strange idea, with 'uncommon arrangements' or 'unknown objects' on the one hand, and 'new' things on the other. A central part of 'new' is the idea of seeing, noting and observing, which brings in subtle variations of primitive ideas. This is evident in phrases for 'new', which include such things as 'not previously noted' and 'seen for the first time'. Temporal concepts such as 'first' and 'previously' are part of the understanding. In all such analysis, a knowledge engineer finds him/herself striving to find semantic 'primitives' with which to describe the terms. Observations on the problem of identifying conceptual primitives and avoiding circular definitions for natural-language systems are discussed in Reference 12. Cyc, like CG work, takes seriously the semantic disambiguation of 'sentences'; as previously noted, Cyc is largely driven by text fragments.
To do this, each 'slot' in a Cyc rule is defined by its own Cyc 'frame'. Thus 'authorize' and 'query' in the above rule have their own definitions outside the roles that they play in the rule. This amounts to a large conceptual catalogue. A simple example of a Cyc frame adapted from 'The world according to Cyc' by Lenat and Guha is
New York
  capital: (Albany)
  residents: (John Traveler)
  stateOf: (USA)

[Figure 4. Relationship between epistemological level and heuristic level. After Reference 14. Users, human and machine, interact with the clean epistemological level via Tell/Ask; a translator connects it to the scruffier heuristic level, whose unit (frame) editor is used by KEs and system builders.]

The slots' names become highly compounded as difficult concepts are described. Slot values are sets, as are instances in CGs. Each Cyc frame or unit corresponds to a CG concept (see Reference 13 for a correspondence in form). However, following the methodology of Figure 2, Cyc's simple frames proved awkward and inadequate for the expression of many assertions. Important deficiencies noted in Guha and Lenat [2] include an inability to express

• disjunctions,
• inequalities,
• existentially quantified statements,
• metalevel propositions about sentences.
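The unit-and-slot style of the example above can be sketched in a few lines of Python. This is an illustration of the general frame idea only, not Cyc's actual implementation; the makesSenseFor slot is a hypothetical name invented for the sketch.

```python
# Minimal sketch of Cyc-style units (frames): each unit has named slots,
# slot values are sets, and each slot name is itself defined by a unit.
# Illustration only -- not Cyc's actual data structures.

class Unit:
    def __init__(self, name):
        self.name = name
        self.slots = {}                 # slot name -> set of values

    def add(self, slot, value):
        self.slots.setdefault(slot, set()).add(value)

kb = {}

def unit(name):
    """Fetch or create the unit with this name."""
    if name not in kb:
        kb[name] = Unit(name)
    return kb[name]

# The 'New York' frame from the text (written as one token here).
ny = unit("NewYork")
ny.add("capital", "Albany")
ny.add("residents", "JohnTraveler")
ny.add("stateOf", "USA")

# Crucially, the slot name 'capital' is itself a unit with its own frame,
# so it has a definition outside the role it plays in NewYork.
capital = unit("capital")
capital.add("makesSenseFor", "GeopoliticalRegion")   # hypothetical slot

print(sorted(ny.slots["capital"]))   # ['Albany']
```

Even this toy makes the deficiencies listed above visible: a set of slot values can state conjunctions of atomic facts, but nothing in the structure can express a disjunction, an inequality, or a statement about another statement.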
Thus frames are supplemented by CycL, a powerful constraint language, to guide inference mechanisms ranging from inheritance to 'nearly-closed-world guessing'. CycL is an nth-order predicate calculus, augmented by several special constructions such as 'SetOf'. As a scruffy engineering approach, Cyc's inference techniques were initially ad hoc fixes to problems as they arose (IS-A links with rigid toCompute definitions of a new slot in terms of an old slot, plus production rules and lumps of LISP code). However, given several years of work and some formalization from its scruffy start, it is time to ask some questions. Is there a set of primitive knowledge that is in common with other efforts? Is there a core set of ideas that can be shared? Are there practical lessons for other implementers? These issues are pursued in the remainder of this paper.
Epistemological level and heuristic level

As noted above, a scruffy representational approach allows the ready, albeit sloppy, expression of ideas and fast inferences. However, it works against a clean and simple semantics. Cyc's evolutionary solution (shown in Figure 4) has been to employ two separate representational levels, with a translation ability between them, and a facility TA (for TELL-ASK) enabling user interaction with the system using the formal constraint language. The epistemological level (EL) uses the FOPC formalism with enhancements for set construction and reification. This level has a clean semantics, and, like CG formalisms, allows easier communication. The heuristic level (HL), on the other hand, uses special-purpose representations and algorithms for rapid inferencing. Both KBs are part of Cyc, with equal knowledge with which users can interact. Lessons can be learned from each of these aspects of Cyc. The EL provides a cleaner introduction to knowledge representation, and represents several refinements of knowledge growing out of the early, scruffier efforts. Cyc has pushed beyond the ontological level into the less elegant heuristic areas, the HL, and has useful ideas on six logically superfluous (and hence scruffy) operations. The following give some idea of the management of large KBs:
• Tell(s): asserts a statement s by the user into the KB.
• Unassert(s): after unassert(s), nothing can be said about s.
• Justify(s): obtains an argument for s.
• Ask(s): tests the truth value of a statement s.
• Deny(s): after deny(s), s is no longer true in the KB.
• Bundle(s): packages several of the above functions.
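A toy rendering of these six operations might look as follows, assuming a simple propositional store rather than CycL's full logic; all behaviours here are an illustrative sketch, not Cyc's TA facility.

```python
# Toy sketch of the Tell/Unassert/Justify/Ask/Deny/Bundle interface
# described above, over a propositional store. Not CycL.

class KB:
    def __init__(self):
        self.told = {}                    # sentence -> justification

    def tell(self, s, because="told by user"):
        self.told[s] = because            # assert s into the KB

    def ask(self, s):
        return s in self.told             # truth value of s in the KB

    def unassert(self, s):
        # After unassert(s), the KB is agnostic about s.
        self.told.pop(s, None)

    def deny(self, s):
        # After deny(s), s is no longer true: retract it and record its negation.
        self.unassert(s)
        self.tell(("not", s), because="denied by user")

    def justify(self, s):
        return self.told.get(s)           # an argument for s, if any

    def bundle(self, ops):
        # Package several of the above operations into one call.
        for name, arg in ops:
            getattr(self, name)(arg)

kb = KB()
kb.bundle([("tell", "walks(John)"), ("deny", "flies(John)")])
print(kb.ask("walks(John)"))      # True
print(kb.ask("flies(John)"))      # False
```

The sketch makes the paper's point concrete: these operations are logically superfluous (they manage the store rather than extend the logic), yet a large KB is unmanageable without them.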
The HL's job is to speed up processing. Some of this goes on in the TELL/ASK-guided translation. For example, the TA translator can notice that a FOPC sentence is an instance of a particular schema that has an efficient inference mechanism. Thus, the user TELLs the system something in an FOPC way, without knowing all the terms of the system, and some interpretation is made:
TELL (for all x, y, z) owns(x,y) ∧ physicalPart(y,z) → owns(x,z)

This is converted to a predicate using a specialized procedure in Cyc called transfersThro. TransfersThro's schema covers a class of rules that includes the above example, the inheritance of last names from father to son, etc. This allows such propositions to be expressed as a simple predicate, transfersThro(lastname, son). One observation about this heuristic level is that Cyc, at least in its recent reports, provides no formal model of the TELL, UNASSERT etc. ideas. It is scruffy about these.

[Figure 5. Portion of 'thing' specializations within the Cyc ontology. Recoverable nodes include Thing [Entity], State, Event, Process, Stuff, IntangibleThing, Artifacts, Person, Number, Law, External Rep Thing, People and Driving Law, with General Primitives at the top and Specifics (e.g. #47) at the bottom.]
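The transfersThro idea, a predicate that transfers through a relation, can be sketched as a toy backward-chaining check. The facts and names below are hypothetical, and this is in no way Cyc's inference engine; it only illustrates how one declared schema instance stands in for a whole quantified rule.

```python
# Sketch of a transfersThro-style schema: a binary predicate 'pred'
# transfers through a relation 'rel', e.g.
#   owns(x, y) and physicalPart(y, z)  =>  owns(x, z).
# Toy illustration with hypothetical facts; assumes the part-of
# relation is acyclic so the recursion terminates.

facts = {
    ("owns", "Mary", "Car1"),
    ("physicalPart", "Car1", "Engine1"),
    ("physicalPart", "Engine1", "Piston3"),
}

transfers_thro = {("owns", "physicalPart")}   # declared schema instances

def holds(pred, x, z):
    """Does pred(x, z) hold, directly or via a transfers-through chain?"""
    if (pred, x, z) in facts:
        return True
    for p, rel in transfers_thro:
        if p == pred:
            # Look for an intermediate y with rel(y, z) and pred(x, y).
            for (r, y, z2) in facts:
                if r == rel and z2 == z and holds(pred, x, y):
                    return True
    return False

print(holds("owns", "Mary", "Engine1"))   # True
print(holds("owns", "Mary", "Piston3"))   # True (via Engine1)
```

The point of the schema is efficiency: instead of storing and exercising a fully quantified FOPC rule, the heuristic level recognizes the pattern once and dispatches to a specialized procedure like this one.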
Division of knowledge in Cyc

Cyc's ontology is organized around categories within a generalization/specialization directed graph. Like many KBs, Cyc divides knowledge into basic propositions about the world and events in it (e.g. 'John went to Boston by bus') and more general principles that are often used (e.g. 'ancestors are older than descendants'). Cyc attempts to scale up to a strong, nonbrittle system in a two-tier approach:

• a top level made up of a set of general concept primitives,
• a vast bottom level of specifics that are built out of the top level.

The top level is hoped to be a powerful 'seed' KB from which instances can be grown. Figure 5 shows a set of primitives at the top and scruffy details on the bottom of what might be called a knowledge 'broom'. Going down the ontology, all the lines of specialization are moved through. Thus 'Number' is a specialization of 'IntangibleThing'. For several concepts, there is a corresponding partology. For example, furniture has parts, although PeanutButter (Cyc runs major terms together, while other representations use hyphens to separate them) does not.
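The generalization/specialization graph of Figure 5 can be sketched as a small DAG with an upward reachability test. The node names follow the text, but the exact links are illustrative only.

```python
# Sketch of a generalization/specialization directed graph like Figure 5:
# general primitives at the top, specifics below. Links are illustrative.

spec_of = {                      # concept -> its direct generalizations
    "IndividualObject": {"Thing"},
    "IntangibleThing": {"Thing"},
    "Event": {"Thing"},
    "Process": {"Event"},
    "Number": {"IntangibleThing"},
    "Person": {"IndividualObject"},
}

def is_specialization(child, ancestor):
    """Walk all lines of specialization upward (a DAG, not a tree)."""
    if child == ancestor:
        return True
    return any(is_specialization(g, ancestor)
               for g in spec_of.get(child, ()))

print(is_specialization("Number", "IntangibleThing"))   # True, as in the text
print(is_specialization("Number", "Event"))             # False
```

Because a node may have several direct generalizations, the check must follow every upward path; a tree-shaped taxonomy would not need the `any(...)` over multiple parents.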
There are sets of instances of concepts that are not shown. Figure 5 contains several concepts that overlap with Sowa's concept catalogue. In cases such as 'event' and 'state', even the same names are used. In other cases, there are reasonable alternative names for similar ideas. Thus 'thing' corresponds roughly to Sowa's [entity], and 'RepresentedThing' corresponds to [information]. Cyc's definition of thinghood is very pragmatic, and exemplifies its practical conceptual-analysis style. Something is a thing if a statement can be made about it. Thus the conceptual analysis is driven by a not unreasonable pragmatics of communication. A Cyc example is that 'DiningAtARestaurant' is a thing because a conceptual analysis says that one can define how often one dines per month, its interestingness, etc. This is slightly different from Sowa's approach, since events are now things (entities). In many cases, the large Cyc project conceptual analysis has articulated new concepts that seem to make sense. Examples of ontological and commonsense concepts developed in the Cyc project include

• process,
• substantiveness versus individual objects,
• collections versus individual objects,
• tangible versus intangible objects,
• composite objects and agents.
These represent possible refinements to Sowa's preliminary conceptual catalogue. For example, the concept 'process' is not included in Sowa's catalogue, but might be defined as a primitive in between [event] and [act]. This leads to the following three changes and additions to the conceptual catalogue:

• ACT < PROCESS: An act is a process with an animate agent.
• PROCESS < EVENT: A process is an event such that a part of the process is still defined as that process:

  [PROCESS: 1001] → (PART) → [PROCESS: 1001]
• EVENT < ENTITY: Events are the set of entities (things) which may have a temporal extent, such as a starting time, an ending time and a duration. Events can be associated by temporal relations (after, during etc.). Events include things started by agents as well as happenings:

  [EVENT: *X] -
    (PSBL) → [PROPOSITION: [START] → (AGNT) → [ANIMATE]]
    (PSBL) → [PARTICIPANT]
    (DUR) → [TIME-PERIOD]
    (SUCC) → [EVENT: *Y]
    (LOC) → [PLACE]

It is worth noting that the definition of EVENT identifies it as an ENTITY, and its definition now includes the idea of a START: a concept also not included in Sowa's conceptual catalogue. Another addition to EVENT is the idea that it may include PARTICIPANTS. Specializations of PROCESS include TANGIBLE-EVENTS as compared with INTANGIBLE-EVENTS. These, of course, correspond to TANGIBLE-OBJECT and INTANGIBLE-OBJECT on the THING side of the ontology. The rational definition of INTANGIBLE-OBJECT offered by Cyc is

  [INTANGIBLE-OBJECT] → (ATTR) → [{MASSLESS, ENERGYLESS}]
An instance of INTANGIBLE-OBJECT is [PERSON]. As an example of a TANGIBLE-OBJECT, Cyc suggests the rule that collections cannot have mass, and so every TANGIBLE-OBJECT must be an INDIVIDUAL-OBJECT; or, in Cyc terms, TANGIBLE-OBJECT is a specialization of INDIVIDUAL-OBJECT. This is well and good. However, there are a few fine points that Cyc makes in considering cognitive agents. Take the concept for 'John Sowa'. What type of object is that? The Cyc conceptual analysis is that this is a 'CompositeTangible&IntangibleObject'. The reason is that Sowa possesses a body portion that has mass and energy, but he also possesses a mental part that is intangible. A portion of an approximate, translated representation is as follows:

  [PERSON: John Sowa] -
    (PART) → [BODY] → (ATTR) → [{MASS, ENERGY}]
    (PART) → [MIND] → (ATTR) → [{MASSLESS, ENERGYLESS}]

On the whole, Cyc offers many suggestions for a conceptual catalogue, but, as far as the author can tell, outside the KB itself these are not collected as a definitive ontology. Guha and Lenat [2] provide very high-level summaries of some findings, and note that almost all knowledge is defeasible. Only 5% is monotonic, and, of this part of the knowledge in Cyc, only 1% is definitional (e.g. cousin(x) = children(sibling(parents(x)))). Thus, it is hard to determine at the present time how convergent the fit is between Cyc and other ontologies such as Sowa's conceptual catalogue. For the related fields of AI and
cognitive science, such convergences would be valuable evidence of scientific progress in the field of commonsense reasoning.
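The ontology fragments discussed above (EVENT below ENTITY, the tangible/intangible split, and composite objects such as a person) can be illustrated as a small type lattice. The following Python sketch is the author's toy illustration, not Cyc's actual representation; all class and type names are assumed for exposition only.

```python
# Toy type lattice illustrating the ontology fragments discussed above.
# This is an illustrative sketch, not Cyc code.

class Type:
    def __init__(self, name, *supertypes):
        self.name = name
        self.supertypes = set(supertypes)

    def ancestors(self):
        """All types reachable upward through the specialization links."""
        out = set()
        for s in self.supertypes:
            out.add(s)
            out |= s.ancestors()
        return out

    def isa(self, other):
        return other is self or other in self.ancestors()

ENTITY            = Type("ENTITY")
EVENT             = Type("EVENT", ENTITY)                # EVENT < ENTITY
INDIVIDUAL_OBJECT = Type("INDIVIDUAL-OBJECT", ENTITY)
# Collections cannot have mass, so tangibles must be individual objects.
TANGIBLE_OBJECT   = Type("TANGIBLE-OBJECT", INDIVIDUAL_OBJECT)
INTANGIBLE_OBJECT = Type("INTANGIBLE-OBJECT", ENTITY)
# 'John Sowa' as a composite: a tangible body part plus an intangible mind.
COMPOSITE         = Type("CompositeTangible&IntangibleObject",
                         TANGIBLE_OBJECT, INTANGIBLE_OBJECT)

print(EVENT.isa(ENTITY))                 # prints True
print(COMPOSITE.isa(INDIVIDUAL_OBJECT))  # prints True, via TANGIBLE-OBJECT
```

The multiple-supertype link for the composite is a deliberate simplification: it mirrors the informal reading that such an object inherits both tangible and intangible aspects through its parts.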
Cyc's inferencing

On the whole, the Cyc effort holds to the pragmatic use of extant theory, importing and adjusting it to Cyc's needs. For example, defeasible reasoning uses defaults via a syntactic structure akin to circumscription. Thus, the common-sense proposition that 'all people walk' is weakened to 'people usually walk' and expressed as

InstanceOf(x, person) ∧ ¬ab1(x) ⇒ walks(x)

Cyc derives conclusions from this description by using an 'argumentation axiom' with the ab(x) or 'abnormal' portion of the default structure. For example, a sentence (i.e. a unit with a slot labelled as an assumption) can be added to the Cyc KB: 'John is handicapped'. 'Handicapped' may be in the set of variables with the ab(x) proposition, and thus make an argument that John may not walk. On the other hand, additional propositions may indicate that this handicap has to do with sight impairment, or that John is known to perform activities that imply walking ability. To handle this, Cyc 2 uses axioms to combine arguments. The following is the main axiom reflecting the common-sense notion that counterarguments exist, and that there are preferences between arguments:

(∀a)(argumentFor('p, a) ∧ involvedArg(a) ∧ ¬(∃a1)(argumentFor('¬p, a1) ∧ ¬preferred(a, a1) ∧ involvedArg(a1))) ⇒ True('p)

Exercising this axiom allows Cyc to judge one argument true and thus add its conclusion to the Cyc KB. Cyc's criteria for preferring arguments include ideas standard in the field, e.g. it prefers arguments whose inference paths are short, and those which have a causal argument slot value.
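The 'John is handicapped' example above can be sketched in miniature. The following Python toy (not Cyc's actual machinery; all names and the preference criteria are the author's illustrative assumptions) mirrors the argumentation axiom: a proposition is judged true when some argument for it is preferred over every counterargument, with causal arguments and shorter inference paths preferred.

```python
# Toy sketch of default reasoning via competing arguments, loosely
# mirroring Cyc's argumentation axiom. Not Cyc code; names are assumed.

class ArgumentStore:
    def __init__(self):
        self.args = []  # tuples: (conclusion, support, is_causal, path_len)

    def argue(self, conclusion, support, causal=False, path_len=1):
        self.args.append((conclusion, support, causal, path_len))

    def preferred(self, a, b):
        """a is preferred over b: causal arguments win, then shorter paths."""
        if a[2] != b[2]:
            return a[2]
        return a[3] < b[3]

    def holds(self, p):
        """True if some argument for p is preferred over every
        counterargument for not-p (the argumentation axiom in miniature)."""
        pro = [a for a in self.args if a[0] == p]
        con = [a for a in self.args if a[0] == ("not", p)]
        return any(all(self.preferred(a, c) for c in con) for a in pro)

store = ArgumentStore()
# Default: people usually walk (people who are not abnormal walk).
store.argue("walks(John)", "person(John) and not ab1(John)", path_len=2)
# Counterargument from the 'John is handicapped' assumption.
store.argue(("not", "walks(John)"), "handicapped(John)", path_len=2)
# A causal argument: John performs activities that imply walking ability.
store.argue("walks(John)", "plays_tennis(John)", causal=True, path_len=1)

print(store.holds("walks(John)"))  # prints True: the causal argument wins
```

Without the third, causal argument, the two remaining arguments tie and neither is preferred, so `holds` would report False: the handicap assertion blocks the default, just as in the narrative above.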
Context and microtheories

In Reference 4, Peirce's negation is used to partition assertions from irrelevant contexts. This idea is used to determine the scope of quantifiers. More recently, Sowa 8 has moved to a broader contextual approach, applying abductive approaches to large soup-like collections of knowledge (see also Reference 15). Context is also an important part of the revised Cyc system. The original thrust for context came through the use of axiomatic microtheories (sentences with quantified variables) needed to seed Cyc with an understanding of the physical world. For example, the proposition that something which is not supported will fall is formalized in Cyc as

IST((∀x) ¬supported(x) ⇒ falls(x), NTP)

The predicate-like IST means 'Is True within a microTheory'. In this case, the microtheory NTP stands for Naive Theory of Physics. Theories such as NTP are first-class Cyc objects, and they are conveyed in descriptions that indicate some scope of applicability. In Cyc, statements about the world always have to be understood and quantified within some context at some level of granularity. Currently, most Cyc objects and events are asserted within a default context called MEM (Most Expressive Microtheory).

SUMMARY AND ISSUES

The Cyc experience provides a number of valuable lessons for ongoing efforts to build large, complex KBs and systems of real intelligence. Interestingly, many of the 'discoveries' in the areas of knowledge engineering, related reasoning and representation were anticipated by Sowa 4 and his subsequent theory-generation work that he calls crystallizing theory out of knowledge soup. It also connects to ongoing efforts to bring representations and reasoning approaches together. Some commonalities and core approaches between Cyc and contemporary CG work include

• the use of truth-maintenance systems (TMS) for KB bookkeeping,
• the expression of default knowledge,
• the universal use of context for granular, situated knowledge,
• directed-graph ontologies,
• the expression of propositional attitudes,
• operations for reification and reflection,
• nonmonotonic constructions.

In building a large operative KB, one runs into a host of data-management issues. Some of these, such as historical/audit data, are identified in Reference 1. Cyc uses a host of pragmatic frames called SeeUnits to provide metalevel information about slot values and entries in a value set. These include the stating of

• constraints on slot values,
• relative or qualitative values on numerical slots,
• qualifiers such as the number of entries making up the slot value set,
• historical (audit) information on who created or edited an entry,
• notification information, i.e.
who gets contacted if this value is changed (this was an important issue in the directives model discussed in Reference 1, in which certain corporate information, such as standards, is 'owned' by parts of the corporation),
• epistemological information about who believes this and why (an important ingredient in a consensus KB).

Lessons learned, as described by Guha and Lenat 2, include:

• a better definition of the qualities of a good knowledge engineer,
• the recognition that a smaller number of fragments is needed to 'seed' the Cyc ontology,
• movement to NLU to support the 'reading' of an encyclopaedic entry,
• progress on 'working' solutions for representation and reasoning about
  ○ belief, awareness and intention,
  ○ causality and time (similar to formalizations of temporal reasoning by Esch and Nagle 16),
  ○ uncertainty,
  ○ substanceness versus individuality,
  ○ temporal and spatial 'subabstractions'.

Along with a scruffy approach comes the idea of ad hoc justifications for information and a lack of certainty. Both play important roles in Cyc, and they may also be of use to the CG system developer. Finally, a major interest for the CG community would be the ability to import concepts (and perhaps heuristics) from Cyc for large CG work. A simple translational model for this is shown in Figure 6. Experience with CG-to-frame translations makes this a hopeful prospect. Obviously, there is an enormous amount of code in the HL (Cyc's heuristic level) that would be difficult to 'translate' over. However, even the Cyc project itself notes that the EL (epistemological level) portion could be run with an entirely different and less sophisticated inference engine. One view of such an effort is that of adding more soup to the knowledge mix. As an aid to the study of translation, a preliminary reworking of the Sowa conceptual catalogue will need to be produced for review and comment as an aid to common understanding within the CG community. This would serve to test further the idea of a converging set of primitives predicted in the Naive Physics manifesto 17. Convergence is described as being confirmed by Lenat and Guha, although the criteria used to judge this are not stated. This is an important issue which should be discussed in the cognitive-science and AI communities.

REFERENCES

1 Berg-Cross, G and Hanna 'Database design with conceptual graphs' Proc. 1st Int. Wkshp. Conceptual Graphs (1986)
2 Guha, R V and Lenat, D B 'CYC: a mid-term report' MCC Technical Report ACT-CYC-134-90, MCC (1990)
3 Pazienza, M T and Velardi, P 'Methods for extracting knowledge from corpora' Proc. 5th Ann. Wkshp. Conceptual Structures (1990)
4 Sowa, J F Conceptual Structures: Information Processing in Mind and Machine Addison-Wesley, USA (1984)
5 Bundy, A 'What is the well-dressed AI educator wearing now?' AI Magazine (1982)
6 Kolodner, J L and Riesbeck, C 'Case-based reasoning' Proc. 8th Nat. Conf. Artificial Intelligence (1990)
7 Hammond, K J Case-Based Planning: A Framework for Planning from Experience (1990)
8 Sowa, J F 'Crystallizing theories out of knowledge soup' in Ras, Z W and Zemankova, M (Eds.) Intelligent Systems: State of the Art and Future Directions Ellis Horwood, USA (1990) pp 456-487
(Figure 6 diagram: labelled elements include 'Users, human and machine' and 'KEs, system builders'.)
Figure 6. Unification via translation
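The translational model of Figure 6 can be sketched mechanically: frame-style units (slot/value pairs, roughly the shape of Cyc's epistemological level) are rendered into CG linear notation. The function and the slot-to-relation table below are the author's illustrative assumptions, not part of either Cyc or any CG system.

```python
# Illustrative sketch of translating a frame-style unit into CG linear
# notation, in the spirit of the unification-via-translation model.
# The slot-to-relation mapping is assumed purely for illustration.

SLOT_TO_RELATION = {
    "parts": "PART",
    "attributes": "ATTR",
    "location": "LOC",
    "duration": "DUR",
}

def frame_to_cg(concept_type, referent, slots):
    """Render one frame (type, referent, slot/value pairs) as a CG."""
    lines = [f"[{concept_type}: {referent}] -"]
    for slot, values in slots.items():
        rel = SLOT_TO_RELATION.get(slot, slot.upper())
        for v in values:
            lines.append(f"  ({rel}) -> [{v}]")
    return "\n".join(lines)

cg = frame_to_cg("PERSON", "JOHN SOWA",
                 {"parts": ["BODY", "MIND"], "location": ["PLACE"]})
print(cg)
```

Run on the example frame, this emits the John Sowa graph used earlier in the paper, one relation arc per slot value. A real translation would, of course, also have to reconcile the two systems' ontologies and handle Cyc's procedural (HL) content, which is exactly the hard part the paper notes.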
9 Novak, J D and Gowin, D B Learning How to Learn Cambridge University Press (1984)
10 Sowa, J F 'Finding structure in knowledge soup' Proc. Info Japan '90 Vol 2 Information Processing Society of Japan, Japan (1990) pp 245-252
11 Lenat, D B, Guha, R V, Pittman, K, Pratt, D and Shepherd, M 'CYC: toward programs with common sense' Commun. ACM (1990)
12 Velardi, P 'Why human translators still sleep in peace?' Proc. COLING 1990 (1990)
13 Sowa, J F 'Knowledge acquisition by teachable systems' Proc. EPIA 89 Springer-Verlag (1989) pp 381-396 (Lecture Notes in Artificial Intelligence)
14 Derthick, M 'An epistemological level interface for CYC' MCC Technical Report ACT-CYC-084-90, MCC (1990)
15 Berg-Cross, G and Price, M E 'Acquiring and managing knowledge using a conceptual structures approach' IEEE Trans. Syst. Man & Cybern. Vol 19 No 3 (1989) pp 513-527
16 Esch, J W and Nagle, T E 'Representing temporal intervals using conceptual graphs' Proc. 5th Ann. Wkshp. Conceptual Structures (1990)
17 Hayes, P J 'The 2nd naive physics manifesto' in Formal Theories of the Common Sense World Ablex (1985)

BIBLIOGRAPHY
Cowan, P A Piaget with Feeling Holt, Rinehart & Winston (1978) Way, E C Knowledge Representation and Metaphor Kluwer, Netherlands (1990)
Knowledge-Based Systems