Constructing and Updating a Dynamic Model of a Problem Domain


Copyright © IFAC Artificial Intelligence, Leningrad, USSR, 1983

H. Oim, M. Koit, S. Litvak, T. Roosmaa and M. Saluveer

Tartu State University, Tartu, USSR

Abstract. The paper develops a method for text frame organization. Texts which describe dynamic situations are considered.

Keywords. Dynamic programming; models; modelling; machine languages; computer programming; cognitive systems; data handling; trees (mathematical).


INTRODUCTION

A person rarely stores data in memory exactly in the form given by observation. The observed data undergo various kinds of processing: assimilation, selection, generalization, etc. Speaking of this processing, the term "interpretation" is ordinarily used: the observed data are interpreted in a certain way, and later these interpreted data will be remembered and used. All this applies also to discourse comprehension. What people remember is not the exact wording of the text but their interpretation of it.

Although much research has been carried out on these kinds of processing, the exact nature of interpretation is still far from clear. We see two kinds of problem here. First, what types of interpretation processes are realized, and through what types of mechanisms are these processes carried out? Secondly, how can the corresponding mechanisms be incorporated into the formal means currently used in language understanding systems? In other words, this is the old problem facing AI experts: what to represent and how to do it (Popov, 1982).

Generally, when the relationship between language and knowledge is discussed, the connection of linguistic meaning with knowledge about the objects and phenomena denoted by the corresponding expressions is under consideration. Recent investigations, both in theoretical linguistics and in AI, of the relationship between language and other areas of human cognition, as well as the striving to study larger linguistic units than traditional words and sentences, have produced new types of data appropriate for consideration in discourse understanding.

Facts about the cognitive processes, knowledge and memory of language users belong to the first type. It is evident that text understanding requires much more than merely knowing the meaning of words and of morphological and syntactic structures: it also requires an extensive knowledge of the real world phenomena described in the text, and knowing how to use this knowledge in interpreting the meaning of the incoming text. This knowledge may be said to constitute the cognitive competence of the language user.

The second type of data consists of the knowledge of various regularities of social and conventional character that govern human linguistic communication, the knowledge of how to construct and/or interpret different kinds of texts in various contexts. This knowledge concerns the rules and models for building and/or interpreting texts, and may be termed the interactive competence of a language user. Together with human linguistic competence, i.e. the knowledge of language proper (knowledge of the meanings of words and rules of grammar), these two competences are used simultaneously in any act of communication, thus forming human communicative competence.

The recognition of the fact that knowledge associated with the meanings of separate words constitutes but a small part of the whole stock of knowledge needed for understanding natural language discourse, and that there exist other kinds of knowledge that cannot be attached to concrete words but nevertheless play an extremely important role in building or interpreting texts, was a major step forward in simulating the process of (human) discourse comprehension. These other kinds of knowledge have up to now been mainly studied in connection with building text understanding systems. Several types of such knowledge structures have been proposed: goals and plans (Schank and Abelson, 1977; Wilensky, 1978); points (Wilensky, 1980); affect or plot units (Lehnert, 1980). They have been designed for representing high level common sense knowledge structures that, in the process of interpreting a text, interact with concrete data presented by the text and organize the results of the interpretation into corresponding higher level units and structures.

But when we look at these theoretical constructs more closely, it must be noted that all the structures mentioned are in fact designed to characterize one kind of text only: stories about (human) goal-directed activities. Thus they reflect certain characteristic features of the reality described in these texts. It is not at all clear to what extent the same organizational (knowledge) structures are relevant in interpreting other kinds of texts. And still more open is the question of what types of such knowledge are used by humans in the task of discourse understanding in general, i.e. there is no clear picture of the general system of conceptual means that would be needed in the overall theory of language understanding.

Proceeding from the considerations outlined above, the authors have been developing a natural language understanding system, TARLUS (Litvak et al., 1981; Litvak et al., 1982; Koit et al., 1983). The central aim of TARLUS is to simulate integrative and constructive processes in language understanding. By assembling certain data from a text it must see that these data together represent an instance of a more general case: that certain actions of a person may be qualified as robbery, buying, refusal, etc. Such higher level events are called hyperevents in TARLUS. In general, it can be claimed that no understanding system can do without this kind of interpretation, i.e. without recognizing certain things, situations and events implicitly described but not explicitly mentioned in the text. Our second task arises directly from the first one: it is to develop a suitable representation formalism for simulating the recognition and reasoning processes in text understanding on a computer.

In the following we want to give some concrete illustrations of the implementations carried out in TARLUS. First, a synopsis of the knowledge representation used in TARLUS is given; we then proceed to the construction and updating of the model of the text; finally, we consider some topical problems in hyperevent recognition.

KNOWLEDGE REPRESENTATION IN TARLUS

One of the possible ways of representing knowledge is the frame formalism. Frames in the TARLUS knowledge base fall into two groups: terminal and conceptual. They constitute a hierarchy built up on the property inheritance principle, e.g.

JYRI --> PERSON --> LIVBEING --sup--> PHOBJ.
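A frame hierarchy with property inheritance of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TARLUS implementation: the class `Frame`, its methods, and the attribute names `HAS_LOCATION`, `ANIMATE` and `CAN_OWN` are hypothetical; only the frame names JYRI, PERSON, LIVBEING and PHOBJ come from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    name: str
    sup: Optional["Frame"] = None               # link to the more general frame
    slots: dict = field(default_factory=dict)   # attribute-value pairs stored locally

    def ancestors(self):
        """Yield this frame and every frame above it in the hierarchy."""
        node = self
        while node is not None:
            yield node
            node = node.sup

    def get(self, attribute):
        """Return the nearest value of `attribute`, searching upwards (inheritance)."""
        for frame in self.ancestors():
            if attribute in frame.slots:
                return frame.slots[attribute]
        raise KeyError(attribute)

    def is_a(self, category):
        """True if this frame or one of its ancestors is named `category`."""
        return any(frame.name == category for frame in self.ancestors())


# Hypothetical attribute values, chosen only to show inheritance at work.
PHOBJ = Frame("PHOBJ", slots={"HAS_LOCATION": True})
LIVBEING = Frame("LIVBEING", sup=PHOBJ, slots={"ANIMATE": True})
PERSON = Frame("PERSON", sup=LIVBEING, slots={"CAN_OWN": True})
JYRI = Frame("JYRI", sup=PERSON)   # terminal frame: nothing is stored locally

print([f.name for f in JYRI.ancestors()])
print(JYRI.get("ANIMATE"), JYRI.is_a("PHOBJ"))
```

The terminal frame stores no attributes of its own in this sketch; everything it "knows" is found by walking up the hierarchy, which is one simple way to read the property inheritance principle.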

The structure of all the frames is similar: each contains, on the one hand, a list of slots in the form of attribute-value pairs and, on the other hand, various attached procedures connected either with definite slots of a frame or with the frame as a whole.

Terminal frames are descriptions of concrete notions found on the lowest level of the hierarchy, e.g. JYRI in the above example. Their main difference from the frames of the second group lies in the absence of the reference INSTANCE to the frames in the TARLUS knowledge base. All the other frames belong to the group of conceptual frames, representing descriptions of semantic categories of concepts.

Some of the frames in the knowledge base are "marked" and belong to the group of hyperframes, the identification of which can be viewed as one of the main tasks of the system. As the group of hyperframes is an "open" one, the user can choose a suitable set of hyperframes for a concrete problem domain, i.e. s/he can simply mark a set of frames from among the overall list of frames in the knowledge base as hyperframes for the problem domain at hand. The current version of TARLUS makes use of texts describing events and situations of offences against private property. The hyperevents the system has to recognize are such as stealing, robbery, [illegible], etc.

All the slots in the frames fall into two groups: variable slots, being an abstraction of the list of attribute-value pairs, and procedural slots. To describe variable slots (e.g. AG, PAC, REC, OBJ, INSTR, etc.) the following constructions are used:

i) slot name = REQ (category 1, category 2, ..., category n)

The filler of this slot may be a word of any of the categories 1 to n, e.g.

OBJ = REQ (PHOBJ, PLACE)

ii) slot name = REQ (category 1, category 2, ..., category n) DEF (value)

Here DEF gives the most typical filler of the slot. When a slot of such construction remains unfilled during frame instantiation, it may be filled with the information from its DEF part, e.g. the frame TAKE has

INSTR = REQ (THING) DEF (HAND)

If the text does not say with the help of what something was taken, we can assume by default that the instrument was a hand.

iii) slot name = REQ (category 1, category 2, ..., category n) DEF (slot name from frame name)

Here what is given is not the value that must be ascribed to the slot but a rule (procedure) pointing to the part of a frame from which the value should be taken, e.g.

REC = REQ (PERSON, INSTIT) DEF (AG from TAKE)

Procedural slots, e.g. SETTING, GOAL, CONSEQ, trigger special procedures which generate new frames. With the help of such procedures we can explicate information given implicitly in a text. As an example consider the slot GOAL in the frame TAKE:

GOAL = POSSESS
    PAC = AG from frame name
    OBJ = OBJ from frame name
    TM = TLI from frame name

A typical procedural slot consists of 1) the slot name, e.g. GOAL; 2) the name of the new frame to be generated, e.g. POSSESS; this name serves as a cue for triggering a procedure for generating that frame and filling in its slots; 3) the names of the slots of the new frame, e.g. PAC, OBJ, TM; and 4) the procedures for filling in these slots, e.g. AG from frame name.

Procedural slots may also contain one or several requirements which state under what conditions the slot may be filled in at all, e.g.

GOAL = if INSTR from GO = ATHING then LOCATION

i.e. if the value of the slot INSTR of the frame GO belongs to the semantic category ATHING, then a new frame LOCATION may be generated, and it serves as the filler of the slot GOAL.

CONSTRUCTION OF THE CURRENT MODEL

The construction of the model of the input text is based upon the construction of every action and situation frame conveyed by the input text sentences. A sentence frame is thus a network of frames arrived at after the semantic interpretation of a sentence. The input to the semantic interpretation is a syntactico-semantic dependence tree put forward by the syntactic analyser. Activation of the frames and filling in of their slots proceeds top-down according to the input tree, i.e. first the topmost frame is activated and instantiated, then the second topmost one, etc. Filling in of a frame's slots proceeds in correspondence with the dependence tree, which means that the frames of the participants and objects of the situation or event are activated, the instances of the corresponding frames are generated and their slots filled in.

Sentence frame construction consists of two stages.
First, it is checked whether the input tree satisfies the requirements of plausibility:

i) the Slot Pattern Checker sees if there is the minimum amount of data required for generating a frame specimen, e.g. the frame TAKE must obligatorily have one of its slots, either LOCFR or SOURCE, filled;

ii) it is checked whether the semantic category of the filler corresponds to the constraints set up for that slot, e.g. the requirement AG = REQ (PERSON) means that the AG slot of a frame may be filled only by a word of the semantic category PERSON;

iii) it is verified that the input tree does not contain links not allowed by the frame, e.g. if the tree has a link

FORGET --INSTR-- HAND

then this dependence tree is an incorrect one, because the frame FORGET has no INSTR slot;

iv) it is checked whether it will be possible "to decipher" all the syntactic links whose syntactic names must be replaced by real semantic ties. This situation may arise when during the syntactic analysis it is not possible to determine the exact nature of the link connecting two words. In that case the words are tied with a "dumb" connector ATR, which must undergo a new interpretation during semantic analysis. For example, the pair

MAN --ATR-- CAR

(corresponding to the phrase "the man's car") is paraphrased with the help of the frame

OWN
    PAC = MAN
    OBJ = CAR
    TM = T1

i.e. it is the car owned by the man at the time T1.

If all these checks succeed, the system proceeds to the next stage, which is constructing the frame specimens. Here the specimens of the corresponding frames are generated and their slots filled in. Through the slot fillers the generated sentence frame is tied with the frames of preceding sentences, and thus the construction of every sentence frame leads to the updating of the current model of the text.

Let us see how the updating of the model proceeds on the example of the following fragment of a story:

Jack went by train from Tallinn to Leningrad. When he got off the train in Tapa he forgot his briefcase in the compartment.

The following drawing illustrates a fragment of updating the current text model:

[Figure: fragment of the current text model arranged along a time scale. Recoverable node and link labels: the frames GO, GET OFF and FORGET; AG links to JACK; LOCFR TALLINN; LOCTO LENINGRAD; INSTR TRAIN (NAME ESTONIA), with TRAIN also carrying a LOCFR link; for FORGET, the links AG, LOC, OBJ BRIEFCASE and LOCTO COMPARTMENT.]
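The slot constructions (REQ, DEF), the first three plausibility checks and the procedural slot GOAL described in this section can be illustrated with a small Python sketch. It is a simplified reading under stated assumptions, not the TARLUS code: the dictionaries `CATEGORIES` and `FRAMES`, the key names `req`, `def`, `one_of` and `goal`, and the functions `check_plausibility` and `instantiate` are hypothetical, the category chains are toy data, the TM slot of the GOAL frame is left out, and check iv) (reinterpreting ATR links) is not modelled.

```python
# Toy semantic categories: word -> categories it belongs to (assumed data).
CATEGORIES = {
    "JACK": ["PERSON", "LIVBEING", "PHOBJ"],
    "BRIEFCASE": ["THING", "PHOBJ"],
    "COMPARTMENT": ["PLACE"],
    "HAND": ["THING", "PHOBJ"],
}

# Frame definitions: slot -> {"req": allowed categories, "def": default filler}.
FRAMES = {
    "TAKE": {
        "slots": {
            "AG": {"req": ["PERSON"]},
            "OBJ": {"req": ["PHOBJ", "PLACE"]},
            "INSTR": {"req": ["THING"], "def": "HAND"},
            "LOCFR": {"req": ["PLACE"]},
            "SOURCE": {"req": ["PERSON"]},
        },
        "one_of": [("LOCFR", "SOURCE")],   # minimum data required (check i)
        # Procedural slot: GOAL = POSSESS, PAC = AG from TAKE, OBJ = OBJ from TAKE
        "goal": ("POSSESS", {"PAC": "AG", "OBJ": "OBJ"}),
    },
    "FORGET": {
        "slots": {
            "AG": {"req": ["PERSON"]},
            "OBJ": {"req": ["PHOBJ"]},
            "LOC": {"req": ["PLACE"]},
        },
        "one_of": [],
        "goal": None,
    },
}


def check_plausibility(frame_name, links):
    """Return a list of problems; an empty list means the tree is plausible."""
    spec = FRAMES[frame_name]
    problems = []
    for slot, word in links.items():
        if slot not in spec["slots"]:                        # check iii)
            problems.append(f"{frame_name} has no {slot} slot")
        elif not set(spec["slots"][slot]["req"]) & set(CATEGORIES[word]):  # check ii)
            problems.append(f"{word} cannot fill {slot}")
    for group in spec["one_of"]:                             # check i)
        if not any(slot in links for slot in group):
            problems.append(f"one of {group} must be filled")
    return problems


def instantiate(frame_name, links):
    """Build a frame specimen: copy fillers, apply DEF parts, run the GOAL slot."""
    problems = check_plausibility(frame_name, links)
    if problems:
        raise ValueError("; ".join(problems))
    spec = FRAMES[frame_name]
    specimen = {"frame": frame_name, **links}
    for slot, rule in spec["slots"].items():
        if slot not in specimen and "def" in rule:           # default filler
            specimen[slot] = rule["def"]
    if spec["goal"]:
        new_frame, mapping = spec["goal"]
        specimen["GOAL"] = {
            "frame": new_frame,
            **{new: specimen[old] for new, old in mapping.items() if old in specimen},
        }
    return specimen


# A link that the frame does not allow is reported, as in the FORGET/HAND example.
print(check_plausibility("FORGET", {"AG": "JACK", "INSTR": "HAND"}))

# An unfilled INSTR slot receives its default, and GOAL generates a POSSESS frame.
take = instantiate("TAKE", {"AG": "JACK", "OBJ": "BRIEFCASE", "LOCFR": "COMPARTMENT"})
print(take["INSTR"], take["GOAL"])
```

Keeping the checks separate from instantiation mirrors the two stages named in the text: a tree is first tested for plausibility, and only then are frame specimens generated.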

TEXT INTERPRETATION

The process of text understanding does not terminate with the construction of a text frame. In the course of understanding, people always make use of their knowledge: the more one knows beforehand of the events described in the text, the better one can understand the text. There exist different levels at which the regularities of communication and the structure of the knowledge which participates in this communication should be described.

Building the text frame may be termed the primary interpretation of a text. Besides this interpretation there also exists the so-called secondary interpretation, which consists of the interpretation of combinations of events presented in a text as a realization of higher order events (Oim, 1981). This interpretation is characterized by:

i) the distributed character of information;

ii) the existence of prestored knowledge of the structures that represent the corresponding general event or situation concepts in the interpreter's memory;

iii) its active nature (parallel occurrence with primary interpretation);

iv) the possibility of recursive operation: every higher order event predicts the possible following event(s) (on the same level of description) and thus can also direct the process of secondary interpretation of the text data.

TARLUS reasoning capabilities are made up of two components:

i) distributed reasoning procedures, consisting mostly of local components tied to single frames. This kind of reasoning is represented by the procedural slots of frames and is triggered when the corresponding concept is activated;

ii) centralized reasoning procedures, constituting an independent subsystem alongside the knowledge system and representing typical ways of reasoning in certain meaningful situations.

Reasoning mechanisms are an integral part of human text understanding capability, and modelling them helps to simulate the process of text understanding which, by and large, consists of bringing the content of what is being described in the text under some knowledge scheme and, proceeding from it, interpreting chunks of text as parts of that scheme. Knowledge schemes are generalized and most typical knowledge of frequently occurring events or situations, e.g. "football match", "shopping", "visiting the theatre", "talk to a friend", etc. Humans do not make all the possible deductions from a text but limit themselves to those which are essential for filling in the scheme with concrete data.

Global reasoning mechanisms are one kind of secondary interpretation procedure: they do not add new information to that already existing but interpret the following text with certain predictions, trying to accommodate new information with that already existing in the system's current model of the text. Besides making predictions, global reasoning makes it possible to recover the preceding situation, too, and thus makes either forward or backward inferences. For example, when the system has recognized that there is a shopping situation in the text, its global reasoning tells the system that there exist such constraints on the situation as:

i) the shop-assistant has the obligation to react in a certain way to the client's request;

ii) the client has to pay for the items obtained;

iii) the shop-assistant always demands payment for the items sold;

iv) the shop-assistant and the client know their respective obligations, etc.

The two main tasks of text interpretation (both primary and secondary interpretation) in TARLUS are hyperevent recognition and question answering about the input texts. Both tasks demand an extensive use of integrative and constructive processes in language understanding, and they must be selective as well as directed in a certain way.

The hyperevent recognition mechanism of TARLUS may be thought of as a context sensitive and adaptive tree whose nodes represent problems to be solved and whose arcs point to the paths which the reasoner can follow, depending upon the solution of the given problem and the knowledge state of the system. The order of solving the problems contained in every node depends on the current contents of the reasoner, which include, among other things, knowledge of the current topic and the characteristics of the participants of the scene. The problems are presented in the form of questions to which the reasoner must find answers on the ground of its knowledge. The output of the work of the hyperevent recognition mechanism is the name of the hyperevent recognized.

In the current version of TARLUS hyperframes are frames of the category TRANSFER, e.g. stealing, robbery, buying-selling, etc., but not the frames taking or transfer itself.

To illustrate the work of the reasoner let us see how the recognition of the hyperevent of stealing implicit in the following text proceeds (original texts with which TARLUS deals are in Estonian).

1. John entered a shop.
2. He was the only customer there.
3. John asked the shop-assistant to show him a Kodak camera.
4. As the shop-assistant had just sold the last camera in the shop she went to the back room to fetch some new ones.
5. At that time the man pocketed a hand-calculator lying on the counter.
6. When the shop-assistant returned with the goods she noticed that the calculator was missing.
7. She ordered John to return it, but instead of that he seized a camera from the counter and ran out of the shop.

The following drawing illustrates a fragment of the relevant reasoning tree:

[Figure: fragment of the reasoning tree. Recoverable labels: the root condition "TRANSFER occurred", the question "Is there a hyperevent ...", the answer "yes", and the hyperevents buying-selling, exchange, returning.]
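A tree of questions whose answers are established once and then looked up can be sketched as follows. This is a loose Python illustration, not the TARLUS reasoner. The question names follow the walk-through of the shop example in this section, but everything else is an assumption: the class `Reasoner`, the tables `TREE` and `RULES`, the field names of `shop_text`, and in particular the assignment of hyperevents to the yes/no branches, which the source does not fully specify. The toy model covers only the pocketing of the calculator (sentence 5).

```python
class Reasoner:
    """Walks a tree of questions; established facts are cached and reused."""

    def __init__(self, tree, rules, model):
        self.tree, self.rules, self.model = tree, rules, model
        self.known = {}                      # facts established so far

    def fact(self, name):
        if name not in self.known:           # infer once, look up afterwards
            self.known[name] = self.rules[name](self)
        return self.known[name]

    def recognize(self, start):
        node = start
        while node in self.tree:             # anything not in the tree is a leaf
            yes, no = self.tree[node]
            node = yes if self.fact(node) else no
        return node                          # hyperevent name, or None


# question -> (branch if yes, branch if no); leaf choices are illustrative guesses.
TREE = {
    "transfer_clue": ("hyperevent_word", None),
    "hyperevent_word": ("EXPLICIT", "return_transfer"),  # EXPLICIT: event named in the text
    "return_transfer": ("buying-selling", "owner_knew"),
    "owner_knew": ("lending", "stealing"),
}

# How each fact is inferred from the text model (hypothetical rules).
RULES = {
    "transfer_clue": lambda r: bool(r.model["transfer_words"]),
    "hyperevent_word": lambda r: bool(r.model["hyperevent_words"]),
    "return_transfer": lambda r: bool(r.model["given_in_return"]),
    # In a shop the shop-assistant counts as OWNER of the goods.
    "owner": lambda r: "shop-assistant" if r.model["place"] == "shop" else r.model["giver"],
    "owner_knew": lambda r: r.fact("owner") in r.model["witnesses"],
}

shop_text = {
    "transfer_words": ["pocket"],   # clue for entering the tree
    "hyperevent_words": [],         # no verb such as "buy" or "steal"
    "given_in_return": [],          # the giver received nothing
    "place": "shop",
    "giver": "shop-assistant",
    "witnesses": [],                # the shop-assistant was in the back room
}

reasoner = Reasoner(TREE, RULES, shop_text)
print(reasoner.recognize("transfer_clue"))
print(reasoner.known["owner"])
```

Because `fact` stores every answer in `known`, a fact such as OWNER is inferred once and afterwards only looked up by any rule or node that needs it.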

The first problem the reasoner must solve is whether there are any clues triggering the search. In the given case such clues are words of the semantic category TRANSFER occurring in structures which signal changes in possession relations. The requirement "hyperevent word" means that the reasoner has to check if the text contains any words explicitly mentioning some TRANSFER hyperframe, as for example the verb to buy in the sentence

Bob bought an old Chevrolet.

If there are such words, then the reasoner in the present sense is not needed and the event is stored away in the system's memory. But the presence of explicitly mentioned hyperevents does not terminate the search, as the text may have some other clues for triggering the reasoner.

The concepts giver and receiver are text constants ascribed to fixed participants during one pass through the tree. Giver denotes a person acting as the AGENT or SOURCE in the frame TRANSFER; receiver is the RECIPIENT of that frame.

From the above text the reasoner can conclude that there was no hyperevent word but that the text contained a clue for entering the search tree, namely the verb to pocket. So the search is triggered. Next the reasoner has to determine the respective roles of the two persons acting in the story:

i) who is the OWNER of the OBJECTS transferred; since the event happened in a shop, the reasoner concludes that OWNER = shop-assistant, as s/he is responsible for all things in the shop, though really s/he does not own them;

ii) who is the POSSESSOR of the OBJECTS transferred; at first it was the shop-assistant, as she was the OWNER of these OBJECTS, but from sentences 5 and 7 the reasoner infers that a) the shop-assistant does not POSSESS the calculator and camera any more, and b) John possesses the calculator and camera and is therefore the POSSESSOR.

Note that once these facts have been established they need not be inferred any more and are passed to the nodes where the corresponding information is required (the reasoner now "knows" in a sense that it knows this information and must simply look it up instead of generating it again).

Next the reasoner must establish if there was a return TRANSFER, i.e. if the giver got something from the receiver. If yes, it passes to the next problem. If the answer is negative, it proceeds along the branch leading to a set of hyperframes "stealing", "robbery", "lending", etc. To discriminate among them it must solve a number of other problems, one of which, as we could see, is whether the OWNER knows about the TRANSFER in question. This in turn leads to other problems, until the reasoner is able to decide that the text describes the hyperevent of stealing, with John as the agent and the shop-assistant's calculator and camera as the objects of stealing.

[Figure, continued: further labels recoverable from the reasoning tree drawing: the question "Did ... know about ...", the answers "yes" and "no", and the hyperevents stealing, robbery, losing-finding, lending, making a present.]

A text may contain more than one description of events of the class TRANSFER. With the aim of full interpretation of the text, the reasoner "marks" the frames used during one pass of the tree. If some frames are left after the first pass, then a new search is triggered, until no unchecked "marked" frames remain.

CONCLUDING REMARKS

There are several general faults that characterize present-day computer systems designed to understand coherent natural language texts. First, they make too heavy and rigorous use of predetermined knowledge schemes relating to the subject matter of texts and to the structure of the reasoning processes which carry out understanding. Secondly, these predetermined knowledge and reasoning structures operate on levels of organization that are too removed from the level on which immediate text contents are presented. We have little data as to how the text contents themselves dynamically manipulate the reasoning steps and decisions of human language understanders. Thirdly, it would be of great importance to obtain a more elaborated general picture of the different conceptual means that are needed in order to describe all the new kinds of data that should be incorporated in the theory of natural language understanding. This would also provide us with additional cues for deciding what characteristics should be attributed specifically to frames, e.g. to frames of words in the lexicon, to frames of events and episodes described in texts, and so on.

It is possible to point out certain other types of conceptualizations that are needed if frames are used to describe the process of text understanding.
In particular, there should exist certain means for describing the interaction of frames which occur in a text; in other words, for describing the ways in which context (which in the case of a text is itself presented in the form of frames) can influence the construction of an instantiation of a frame as the interpretation of a fixed subpart of a text.

Finally, TARLUS needs a more flexible central control structure to take into account the dynamically changing picture of the world (and of itself) which results from local reasoning processes. To this end we are further developing the "scene surveyor" mechanism of TARLUS (Litvak et al., 1982), which has to provide the system with some kind of metaknowledge: knowledge of what it already knows, what open questions it has, and what kind of knowledge it needs to answer these questions.

REFERENCES

Koit, M., Litvak, S., Roosmaa, T., Saluveer, M., Oim, H. (1983). Using frames in causal reasoning. In: Papers on Artificial Intelligence. Tartu, vol. V.

Lehnert, W. (1980). Affect Units and Narrative Summarization. Yale University Research Report 179.

Litvak, S.R., Roosmaa, T.A., Saluveer, M.E., Oim, H.I. (1981). Giperphenomena Recognition on the System of Connected Text Understanding. In: Proceedings of Artificial Intelligence. Tartu (in Russian).

Litvak, S., Roosmaa, T., Saluveer, M., Oim, H. (1982). On the Interaction of Knowledge Representation and Reasoning Mechanism in Discourse Comprehension. In: Proceedings of the 1982 European Conference on Artificial Intelligence. Orsay, France.

Oim, H. (1981). Language, Meaning and Human Knowledge. Nordic Journal of Linguistics, N 4.

Popov, E.V. (1982). A Natural Language Interaction with a Computer. Nauka, Moscow (in Russian).

Schank, R.C., Abelson, R.P. (1977). Scripts, Plans, Goals and Understanding. Lawrence Erlbaum Associates, Hillsdale, N.J.

Wilensky, R. (1978). Understanding Goal-Based Stories. Yale University Research Report N 140.

Wilensky, R. (1980). POINTS: A Theory of Story Content. Memorandum N UCB/ERL M80/17. University of California, Berkeley.