Artificial Intelligence 61 (1993) 105-112 Elsevier
ARTINT 942
Book Review
Ernest Davis, Representations of Commonsense Knowledge*

William Croft

Center for the Study of Language and Information, Stanford University, Stanford, CA 94305-4115, USA. E-mail: wcroft@csli.stanford.edu.

*(Morgan Kaufmann, San Mateo, CA, 1990); 515 pages.

0004-3702/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

Since this reviewer is a linguist, more specifically a "cognitive linguist", it is worth beginning this review with some significant differences in perspective between linguistics and AI, as well as some general remarks about what each can contribute to the other. Linguistics, in particular linguistic semantics, attempts to give an accurate and explanatory representation of the meaning of words and of grammatical inflections and constructions. As such, its scope appears to be much smaller than that of knowledge representation in AI. To be sure, natural language understanding is a central concern of AI. But AI is also concerned with perception, reasoning, planning, and other intelligent activities which appear to make little use of language. For example, planning the best route to drive to the record store can be done without uttering a word. For these reasons, knowledge representation in AI must represent a larger array of information than appears necessary for the characterization of linguistic meaning. Also, knowledge representation is not bound to semantic distinctions and representational structures that are useful for the characterization of natural language meaning.

However, these differences are not as great as they seem. From the linguist's point of view, it is not obvious that (natural language) semantic representation is distinct from (general) knowledge representation in its scope of coverage. This is known as the "dictionary versus encyclopedia" debate in linguistics (Haiman [3]): the dictionary view holds that there is a distinct component of linguistic semantic representation, while the encyclopedia view holds that semantic representation is knowledge representation, or, as Ray Jackendoff puts it, that semantic structure is conceptual structure. Cognitive linguists generally subscribe to the latter view, influenced by the role of "real-world" or "commonsense" knowledge in the interpretation of phrases such as house key versus house guest. For cognitive linguists, then, the scope of linguistic semantic research is the same as the scope of research into knowledge representation.

There remains the other issue: are representations of knowledge for natural language understanding the same in structure as those for other purposes, or not? A priori, it would seem inefficient for human beings to have two separate knowledge representation systems. On the other hand, there is no a priori reason to represent mental processes by means of language-like structures. Nevertheless, for practical reasons many knowledge representation systems use language-like structures, or primitive concepts based on language. That is because our commonsense knowledge of many phenomena is embodied in natural language, because those phenomena are "invisible" or otherwise not directly accessible to physical description. This is clearly true of mental and social phenomena such as knowledge and belief, emotions, intentions and goals, and social interaction. But at the commonsense level it is also arguably true of "invisibles" such as force or causation, and of higher-order perceptual structures such as the shape of ordinary objects.

These observations might indicate simply a failure of scientific inquiry to establish theories of mind, society, etc. that are independent of the "folk theories" recoverable from natural language. But it is possible to make a stronger argument that natural language semantics is more than a heuristic for commonsense knowledge representation.
Research in the semantics of natural languages reveals which conceptual distinctions and structures are used by natural languages and which are not, or at least are not commonly represented in the languages of the world.¹ Presumably those conceptual structures found widely in natural languages are those that are important to human beings, and moreover are useful in the encoding and transmission of our knowledge, since language has evolved in adaptation to its function of communicating information. Since humans are good at representing and reasoning with knowledge, it is reasonable to take seriously the conceptual structures that humans use, as revealed by the semantics of natural languages.

¹It is critical that any postulation of universal or near-universal conceptual distinctions be supported empirically by research in diverse languages. Unfortunately this is rarely the case.

The argument of the preceding paragraph has particular force for cognitive linguists, since they are concerned with human linguistic ability. Most cognitive linguists would agree that their semantic representations should be psychologically real: the conceptual structures are assumed to be those that people actually use, and the analyses are intended to model human behavior, including "errors" and misunderstanding. This view diverges from the prominent view that AI is the study of intelligence abstracted from human ability (including human fallibility). In point of fact, though, cognitive semanticists have developed most of their analyses through introspection on the subtleties of grammatical usage, and the aforementioned mental constructs have been "discovered" without psychological experimentation. Thus the contrast between linguistics and AI here is not as great as it appears.

Since the cognitive approach to semantics is relatively young, there is much empirical and analytical work to be done. For this reason, among others, there is not much interest among cognitive linguists in many of the representational issues that are important to AI researchers, apart from the issue of psychological reality. The emphasis has been on discovering what sorts of mental constructs are necessary for the representation of meaning: experiential domains, idealized cognitive models, mental spaces, force dynamics, focal adjustments, and so on. The further task of integrating these mental constructs into a single model of knowledge representation of sufficient clarity that it could be implemented has barely been started.²

Another area in which there is a difference between the concerns of linguistic semantics and AI is the relative emphasis on representation and reasoning. The linguist's chief concern is the representation of the meaning of natural language utterances. There is an immense variety of words, inflections, and grammatical particles that capture all facets of the human conceptualization of experience. The linguist is expected (eventually) to analyze all of these.
This is the chief reason for the empirical focus on developing all the mental constructs necessary for semantic representation that were just mentioned. Most AI research places a greater emphasis on reasoning, because of the central role of planning and problem solving. This is true even in natural language systems, where reasoning with commonsense knowledge is used to determine semantic interpretation (e.g. the reference of an anaphoric expression such as Put it in the drawer). This is a difference more of emphasis than of principle, I believe. There is a tradeoff between empirical breadth across the semantic structures of a human language and formal depth in the axiomatization of a domain so that complex inferences can be consistently drawn. The goals of the two disciplines determine preferences in the tradeoff.³

Davis' book exhibits the contrast between the linguistic and the AI approaches quite strongly. In the preface, Davis notes that one of three important omissions in his book is "representations of knowledge based on linguistic considerations" (p. viii), although he admits that such a study would be valuable (p. ix). The chapters on space (Chapter 6) and physics (Chapter 7) in particular are based heavily on geometry and physical theory, and are little connected to linguistic semantics in those domains. There is a greater emphasis on reasoning than on representation (in the sense described above), so that the range of phenomena seems restricted in scope to a cognitive linguist. These are not necessarily shortcomings of the book. To my knowledge, the book covers the range of topics that AI researchers interested in commonsense knowledge and reasoning have explored, and no phenomenon appears to be seriously misrepresented. In other words, the book accomplishes well what it sets out to do. (The chapter on plausible reasoning could be jettisoned, however, since Davis hardly uses it in the substantive chapters.) But AI research on commonsense knowledge could benefit from input from linguistic semantics. Here I will mention four models developed in semantics, particularly cognitive semantics, that might be of value to the AI approach: idealized cognitive models, focal adjustments, force dynamics, and mental spaces.

²Some feel that an effort in this direction is premature; others (those skeptical of the possibility of doing AI), impossible.
³The aims of formal semanticists in the logico-philosophical tradition resemble those of AI researchers much more closely.
Idealized cognitive models

An area in which linguists have had much to say is vagueness (Section 1.7). Davis interprets vagueness as gradience or continuous parameters (e.g. baldness and the number of hairs on one's head). It seems that this notion of vagueness can be quantified (or at least made more precise) in terms of neighborhoods on a scale or fuzzy intervals. But this is only one kind of vagueness or "natural imprecision". Lakoff [5] notes a number of other factors besides gradience that lead to prototype effects (a major source of natural imprecision). The most important of these factors is the idealization of the cognitive model of the concept. To put it in logical terms, the axioms and theorems that we use--say, those defining the social relations assumed in the definition of a "bachelor"--are strictly interpretable only in a model that represents an idealization of reality. However, speakers must use them to interpret reality itself, which is more complicated and includes such things as males living with their girlfriends, homosexuals, priests who have taken a vow of celibacy, etc., which are not present in the idealized model. These background assumptions are part of the meaning of the words (Searle [7], Langacker [6, pp. 183-189]). This is essentially a generalization of the frame problem, discussed by Davis in the chapter on time; the frame problem can be put in a wider perspective by viewing it in terms of idealized models.

Unfortunately, linguists such as Lakoff have only pointed to the necessity of including background assumptions in the meanings of words. Langacker [6] has argued that the background assumptions are simply the domain or combination of domains in which a concept is defined. A simple example is that an arc is defined against the background assumption of the structure of a circle, which in turn is defined against the background model of space. This only pushes the problem to the definition of experiential domains. However, at a more prosaic level natural language can give clues to useful idealized models. For example, Herskovits [4] identifies a class of idealized geometric shapes (which can be expanded through cross-linguistic comparison) that are used in defining prepositions and directional adverbs such as on, in, and through. This is a logical candidate set for the primitive shapes that could be used by a commonsense constructive solid geometry (see Davis, Section 6.2.2).
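To make the idealized-model point concrete, here is a toy computational sketch (my own illustration, not anything proposed by Davis or Lakoff; all names are hypothetical): the core conditions of "bachelor" are separated from the background assumptions of the idealized model, so that a reasoner can flag cases like priests or cohabiting partners where the idealization fails, rather than returning an unqualified verdict.

```python
# Hypothetical sketch of an idealized cognitive model for "bachelor".
# Core conditions define the concept inside the idealized model; background
# assumptions mark where the idealization may not fit reality.

from dataclasses import dataclass

@dataclass
class Person:
    male: bool
    adult: bool
    married: bool
    clergy: bool = False
    cohabiting: bool = False

def bachelor(p: Person) -> tuple[bool, list[str]]:
    """Return (verdict within the idealized model, violated background assumptions)."""
    core = p.male and p.adult and not p.married
    violated = []
    # Background assumptions of the idealized model: the person participates
    # in the ordinary marriage system and has no marriage-like partnership.
    if p.clergy:
        violated.append("assumes participation in the marriage system")
    if p.cohabiting:
        violated.append("assumes no marriage-like partnership")
    return core, violated

verdict, caveats = bachelor(Person(male=True, adult=True, married=False, clergy=True))
# verdict is True in the idealized model, but caveats is non-empty: a prototype
# effect, since "bachelor" fits this case only awkwardly.
```

The point of the sketch is only that the prototype effects Lakoff describes need not be modeled as gradience on a scale; they can arise from mismatches between an idealized model and the situation it is applied to.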
Focal adjustments

Idealization is one example of the phenomenon of construal or conceptualization, by which the infinite complexity of experience is reduced to mentally manageable proportions. Other construal functions are known as "focal adjustments" [6] (see also Talmy [9]); these include attention, perspective, and figure-ground organization, all familiar from perceptual psychology. These have to do primarily with space, though such processes are presumed to apply to nonspatial domains as well. To my mind, it is quite possible that spatial representation and reasoning could be improved by taking focal adjustments into consideration. In particular, this applies to indexical expressions, which encode location or motion relative to the speaker (as in here/there and come/go). Indexicals are also found extensively in temporal expressions, as in now/then, yesterday/today, two days ago, etc. In Chapter 1, Davis argues that indexicals should be translated into locations in an absolute coordinate system. Indexicals might instead be profitably modeled as they stand, in terms of positions, times, etc. relative to the intelligent agent, rather than translated into absolute coordinates. This is just a guess, of course, but this is the way natural language is overwhelmingly set up, especially in the description of space and time, and presumably there is a good conceptual reason for it. Likewise, prepositions make essential reference to speaker position and orientation; consider the bird is in front of the tree, the bird flew over the tree. Prepositions indicate spatial relations between two objects; the subject and object of a preposition are the figure and ground, in Gestalt psychology terms. Prepositions also represent a canonical set of spatial relations, utilizing basic concepts such as contact/no contact, orientation, and the geometry of the ground object.
The relations found in natural language would presumably be a natural set of relations for the formulation of relative positions of objects without using an absolute coordinate system.
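The agent-relative alternative to absolute coordinates can be sketched as follows (a hypothetical illustration of my own, not a proposal from the book): a relation like in front of is computed on demand from the agent's current position and facing direction, so it remains valid as the deictic center moves.

```python
# Hypothetical sketch: deictic (agent-relative) spatial relations, computed
# from the agent's position and heading rather than stored in absolute terms.

import math

class Agent:
    def __init__(self, x: float, y: float, heading: float):
        self.x, self.y = x, y
        self.heading = heading  # radians; 0 = facing along the +x axis

    def bearing_to(self, ox: float, oy: float) -> float:
        """Angle of the object relative to the agent's facing direction."""
        angle = math.atan2(oy - self.y, ox - self.x) - self.heading
        # Normalize to (-pi, pi] so "ahead" and "behind" compare cleanly.
        return math.atan2(math.sin(angle), math.cos(angle))

    def in_front_of(self, ox: float, oy: float) -> bool:
        # "In front of" = within 90 degrees of the facing direction.
        return abs(self.bearing_to(ox, oy)) < math.pi / 2

agent = Agent(0.0, 0.0, heading=0.0)  # at the origin, facing +x
print(agent.in_front_of(5.0, 1.0))    # True: the object is ahead
agent.heading = math.pi               # the agent turns around
print(agent.in_front_of(5.0, 1.0))    # False: the same object is now behind
```

Nothing here depends on a global reference frame: the figure-ground relation is recomputed from the deictic center, which is the behavior indexicals like here and in front of exhibit in natural language.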
Force dynamics

The semantics of space and time in language are the best explored areas in linguistics. Recently there has been more research on the structure of events and their participants (Talmy [10], Croft [1]). Davis discusses the structure of
events in the chapters titled "Time" (Chapter 5) and "Physics" (Chapter 7). A central property of events, the presence or absence of change, is discussed at length in the context of the frame problem, though change in general and causation in general are not discussed. But a central question about events is not touched upon: how are events individuated, and how are they linked to one another? This problem is closely tied to the semantics of verbs and the relation between subject and object. Verb meaning is based primarily on a commonsense notion of causal relations between individuals which is independent of the temporal dimension. The concept of causality can be extended to various noncausal relations such as spatial relations and possession [1, Chapters 4-5]. "Aspectual" structure, the inherent temporal structure of an event, is also central to verb meaning (Vendler [11]). Talmy [10] has developed a model of "force dynamics" that generalizes the notion of causation to include "letting" relations, as in The open window let the hostages escape. Talmy's model is based on a distinction between force and resistance between two (or more) objects, each of which has an inherent tendency towards action or inaction. This model can be applied to verbs and constructions describing interpersonal interactions (applicable to the modeling of multiagent plans) and degrees of belief. These theories cover a very broad range of the semantics of the expression of events in natural language, and so might figure importantly in the representation of events in AI.

Mental spaces
There have been many proposals in linguistics, philosophy, and AI as to how to model mental states that have propositional content. Unfortunately (and this is the most significant drawback of the book for me), Davis relies heavily on syntactic theories of the representation of propositional attitudes. For all the shortcomings of other approaches, the syntactic approach can be nothing but an unilluminating stopgap representation. Most other approaches do represent the propositional content of propositional attitudes as such, but situate it in a "possible world". This approach has problems in the usual metaphysical interpretation as sets of possible worlds linked by accessibility relations. Many of the problems with the metaphysical interpretation of possible worlds are avoided by the mental space model (Fauconnier [2]). A mental space is a mental construct: a single, partially-specified region in conceptual space. Individuals ("values") are bound to particular mental spaces; an individual can only have counterparts in other mental spaces. An equally critical aspect of Fauconnier's theory is the reification of descriptions (or "roles", as he calls them) of individuals in a specific mental space; this solves many of the referential opacity problems. For example, the sentence Oedipus wants to marry his mother can be interpreted in two ways: Oedipus knows Jocasta is his mother, in which case the sentence is false; and he doesn't know she is his
mother, in which case the sentence is true. This is analyzed as the contrast between referring to Jocasta using a description in Oedipus' desire space--i.e. Oedipus knows she's his mother--versus referring to her via the description of her counterpart in reality space--Oedipus doesn't know she's his mother. Without necessarily advocating mental spaces and roles versus values as the panacea for representing propositional attitudes--Fauconnier's work is still quite programmatic--this is another example of where semantic research by linguists could have a significant impact on commonsense models developed in AI.

In addition, the AI focus on rational intelligence leads to the neglect of the irrational, and of other sorts of human aims such as politeness. The chapter on minds (Chapter 8) is almost entirely about knowledge and belief (with a very brief and programmatic discussion of perception), and the one on plans and goals (Chapter 9) is about what psychologists would call intentions. Current research on commonsense theories of mind (e.g. Wellman [12]) postulates a third domain, desires (hence the name BDI, or belief-desire-intention psychology). This domain is missing from Davis' book, but this represents a gap in AI research (though one currently being remedied), not in the book's coverage of it. The last chapter, on society, is also sketchy and incomplete for the same reason, and AI would benefit from research in sociology and social psychology, including sociolinguistics (for example, to improve the greatly oversimplified discussion of speech acts).

Since Davis' book is a textbook, it would be nice to supplement it with readings that take linguistic semantics into consideration in the representation of space, time, force, events, mental states, and interpersonal interactions. Unfortunately, no such textbook exists. Most of the works cited in this review are book-length analyses and hence are not suitable for such a course.
However, they make reference to shorter articles by linguists such as Herskovits, Lakoff, Langacker, Talmy, Wierzbicka [13], and others who work in this paradigm. These articles could be used to supplement Davis' book in a broader-based, though less rigorous, presentation of the representation of commonsense knowledge.
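Fauconnier's roles-versus-values distinction, as applied to the Oedipus example discussed above, can be given a minimal computational rendering (a hypothetical sketch of my own, not Fauconnier's formalism): a description ("role") is evaluated relative to a chosen mental space, and the two readings of the sentence fall out from the choice of space.

```python
# Hypothetical sketch of mental spaces: each space binds an individual's
# counterpart to the descriptions (roles) it satisfies there. Evaluating the
# role "mother of Oedipus" in different spaces yields the two readings of
# "Oedipus wants to marry his mother".

reality = {"jocasta": {"mother_of_oedipus": True}}
desire  = {"jocasta": {"mother_of_oedipus": False}}  # Oedipus doesn't know

def satisfies(space: dict, individual: str, role: str) -> bool:
    """Does this individual's counterpart in the given space fit the role?"""
    return space[individual][role]

# Reading 1: the role is evaluated in Oedipus' desire space, where Jocasta's
# counterpart is not represented as his mother, so the reading is false.
print(satisfies(desire, "jocasta", "mother_of_oedipus"))   # False

# Reading 2: the role is evaluated against Jocasta's counterpart in reality
# space, the speaker's description, so the reading is true.
print(satisfies(reality, "jocasta", "mother_of_oedipus"))  # True
```

The referential opacity puzzle then becomes a question of which space a description is evaluated in, rather than a problem about substitution of co-referring terms.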
References

[1] W. Croft, Syntactic Categories and Grammatical Relations: The Cognitive Organization of Information (University of Chicago Press, Chicago, IL, 1991).
[2] G. Fauconnier, Mental Spaces (MIT Press, Cambridge, MA, 1985).
[3] J. Haiman, Dictionaries and encyclopedias, Lingua 50 (1980) 329-357.
[4] A. Herskovits, Language and Spatial Cognition (Cambridge University Press, Cambridge, England, 1986).
[5] G. Lakoff, Women, Fire and Dangerous Things: What Categories Reveal about the Mind (University of Chicago Press, Chicago, IL, 1987).
[6] R.W. Langacker, Foundations of Cognitive Grammar, Vol. 1: Theoretical Prerequisites (Stanford University Press, Stanford, CA, 1987).
[7] J. Searle, Literal meaning, in: Expression and Meaning (Cambridge University Press, Cambridge, England, 1979) 117-136.
[8] L. Talmy, How language structures space, in: H.L. Pick Jr and L.P. Acredolo, eds., Spatial Orientation: Theory, Research and Application (Plenum, New York, 1983) 225-282.
[9] L. Talmy, The relation of grammar to cognition, in: B. Rudzka-Ostyn, ed., Topics in Cognitive Linguistics (John Benjamins, Amsterdam, 1988) 165-205.
[10] L. Talmy, Force dynamics in language and cognition, Cogn. Sci. 12 (1988) 49-100.
[11] Z. Vendler, Verbs and times, in: Linguistics in Philosophy (Cornell University Press, Ithaca, NY, 1967) 97-121; originally: Philos. Rev. 66 (1957) 143-160.
[12] H. Wellman, The Child's Theory of Mind (MIT Press, Cambridge, MA, 1990).
[13] A. Wierzbicka, The Semantics of Grammar (John Benjamins, Amsterdam, 1988).