Copyright © IFAC Artificial Intelligence, Leningrad, USSR, 1983
FORMAL SYSTEMS FOR NATURAL LANGUAGE MAN-MACHINE INTERACTION MODELLING
V. A. Fomitchov
Moscow Institute of Electronic Mechanical Engineering, Moscow, USSR
Abstract. Some new classes of formal systems are briefly described: the classes of so-called S-calculuses of types 1-5 and, as a consequence, the class of so-called T-calculuses. For i = 1, 2, 3, 4 each inference rule of the S-calculus of type i is also an inference rule of the S-calculus of type i+1. There are 17 inference rules in S-calculuses of type 5. The results of the research bring a number of new ideas into mathematical logic and give new mathematical tools for investigating the knowledge representation problem and for developing natural language dialogue models.
Keywords. Man-machine systems; modelling; models; set theory; human factor; display systems; artificial intelligence.

INTRODUCTION
Studies on constructing formal models that reflect the use of natural language in thinking and communication processes are becoming more and more topical in the theory of artificial intelligence systems. Until recently, judging by the literature, no sufficiently general mathematical tools suited to a wide variety of applications were known for dealing with the following tasks: to describe the semantic structure of natural language texts and of separate sentences of some widespread types; to formalize the connections between the semantic and surface structures of texts; to develop conceptual dictionary models and to construct knowledge representation languages in accordance with frame theory; to simulate communication processes, taking into account that each dialogue participant has knowledge about himself and about the goals and knowledge of the other participants.

The new formal tools characterized below substantially enlarge the possibilities of mathematical studies of the enumerated questions. The main goal was to define and to investigate a class of formal systems (see Shoenfield, 1967), or calculuses, convenient for developing models that reflect some important aspects of natural language intercourse. In particular: 1) the dialogue is realized by means of statements, commands (imperatives) and questions of arbitrary length, which can include homogeneous parts of a sentence, direct and indirect speech, participial and gerundial constructions, subordinate attributive clauses and subordinate clauses of purpose; 2) one can omit words and sequences of words in texts, and refer to earlier mentioned objects and to the meaning of earlier used large parts of the text; 3) in texts one can assign names to the considered objects and introduce new concepts. The task posed is a new one.

THE S-CALCULUSES OF TYPES 1-5: AN OUTLINE OF THE DESCRIPTION

In this paper we will consider entities of four kinds, calling them, respectively, the individuals, the sets of individuals, the concepts denoting the individuals and the concepts denoting the sets of individuals. We will divide the individuals into two classes: objects (people, events, books, etc.) and senses (of statements, imperatives and questions).

We will assume that some set $S$ of strings is given, the elements of $S$ denoting sufficiently general concepts associated with individuals
and called the sorts. Let us define the sets of strings $S'$, $S''$, $S'''$ by setting

$$S' = \{u \mid u = {\uparrow}s,\ s \in S\}, \quad S'' = \{u \mid u = \{s\},\ s \in S\}, \quad S''' = \{u \mid u = {\uparrow}\{s\},\ s \in S\}, \tag{1}$$

and let $\bar{S} = S \cup S' \cup S'' \cup S'''$. We will assign to each considered entity a string of $\bar{S}$, calling it the type of this entity. Let us assign the strings of $S$, $S'$, $S''$, $S'''$, respectively, to the individuals, the concepts denoting the individuals, the sets of individuals, and the concepts denoting the sets of individuals. For instance, we will assign the string ${\uparrow}pers$ to the concept "person" and the string $\{pers\}$ to particular groups of people.

The principal idea consists in the following: 1) to define a class $\mathcal{L}$ of formal languages so that the strings of some languages of $\mathcal{L}$ give a convenient and useful form (from the viewpoint of developing natural language dialogue models) for denoting the entities of the four kinds considered; 2) to assign a type to each string of every language of $\mathcal{L}$, namely a string of $\bar{S}$ for some set of sorts $S$.

This idea has been realized in the following way. First, a class $X$ of ordered five-tuples of a special kind, called in this paper the basic semantic systems (b.s.s.), has been defined. Every b.s.s. $B$ determines a set of strings $\mathcal{X}(B)$. There exists a subclass $X'$ of the class $X$ such that for $B \in X'$ the set $\mathcal{X}(B)$ can be interpreted as consisting of function and relation names, and also of strings denoting the concepts, numbers, names, the meanings of verbs and modal words, the logical connectives and quantifiers, the meanings of expressions like "arbitrary", "all", "addressee", "in order to", and the meanings of words and expressions of some other kinds. In this case we will interpret a five-tuple $B \in X'$ as a formal representation of the knowledge used in constructing the informational representations (IR) of texts out of the elements of $\mathcal{X}(B)$. Every b.s.s. $B$ determines a set $S(B)$ and, hence, a set $\bar{S}(B)$. The elements of these sets are called, respectively, the sorts and the types.

Then 17 rules have been elaborated, enabling us to obtain compound expressions of special kinds out of the strings of $\mathcal{X}(B)$ for an arbitrary b.s.s. $B$. These rules essentially determine a set of operations for constructing the IR of texts.

The formal aspect of the question can be outlined as follows. Every b.s.s. $B$ defines a set of strings $D(B)$ that includes $\mathcal{X}(B)$ and does not contain the symbol $\&$. Some statements $A_0, A_1, \dots, A_i$ ($1 \le i \le 16$) define by joint induction a set $\mathcal{S}_i = \mathcal{S}_i(B)$ and, for $j = 1, \dots, i$, some set $W_j = W_j(B)$. The set $\mathcal{S}_i(B)$ consists of strings of the form $\beta \,\&\, t$, where $\beta \in D^+(B)$ and $t \in \bar{S}(B)$. For a "reasonably constructed" b.s.s. one can interpret $\beta$ and $t$ as the designations of some entity and of its type, respectively. If $1 \le i \le 16$, then each string $\tau \in W_j(B)$ has the form $\tau = \alpha_1 \,\&\, \dots \,\&\, \alpha_n \,\&\, \beta$, where $\alpha_1, \dots, \alpha_n, \beta \in D^+(B)$. In this case, for a "reasonably constructed" b.s.s., one can interpret $\beta$ as a part of the IR of a text obtained out of $\alpha_1, \dots, \alpha_n$ by means of some rules of the list $A_0, A_1, \dots, A_i$, the rule $A_j$ being the last one employed.
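The four families of type strings and the typed strings of the form "expression & type" described above can be illustrated with a minimal Python sketch. It is not part of the original formalism: the function names, the sample sorts, and the ASCII character `^` used as a stand-in for the concept prefix symbol are all assumptions made for illustration.

```python
CONCEPT_MARK = "^"  # assumption: ASCII stand-in for the concept prefix symbol


def type_families(sorts):
    """Return the four families of type strings built from a set of sorts."""
    base = set(sorts)                                             # types of individuals
    concepts = {CONCEPT_MARK + s for s in base}                   # concepts denoting individuals
    sets_ = {"{" + s + "}" for s in base}                         # sets of individuals
    set_concepts = {CONCEPT_MARK + "{" + s + "}" for s in base}   # concepts denoting sets
    return base, concepts, sets_, set_concepts


def all_types(sorts):
    """Union of the four families: every string usable as a type."""
    return set().union(*type_families(sorts))


def typed_string(expression, type_string, sorts):
    """Pair an expression with its type as 'expression & type'."""
    if "&" in expression:
        raise ValueError("the separator '&' may not occur inside the expression")
    if type_string not in all_types(sorts):
        raise ValueError(f"{type_string!r} is not a type over the given sorts")
    return f"{expression} & {type_string}"


SORTS = {"pers", "event"}  # hypothetical sorts
print(sorted(all_types(SORTS)))
print(typed_string("person", "^pers", SORTS))
```

Keeping the separator out of the expression part mirrors the requirement that the strings of the underlying alphabet do not contain the symbol &, so a typed string can always be split unambiguously at its last separator.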
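Finally, the following proposition has been proved.

Theorem. Let $B$ be an arbitrary b.s.s. and $1 \le i \le 16$. Then:
a) if the strings $\alpha \,\&\, t$ and $\alpha \,\&\, t'$ belong to $\mathcal{S}_i(B)$, then $t = t'$;
b) if $1 \le j \le i$, $1 \le k \le i$, $\beta \in D^+(B)$, $\alpha_1 \,\&\, \dots \,\&\, \alpha_n \,\&\, \beta \in W_j(B)$ and $\alpha'_1 \,\&\, \dots \,\&\, \alpha'_m \,\&\, \beta \in W_k(B)$, then $j = k$, $n = m$, $\alpha_1 = \alpha'_1, \dots, \alpha_n = \alpha'_n$;
c) if $\beta \in \mathcal{X}(B)$, then there exist no $n > 1$, $\alpha_1, \dots, \alpha_n \in D^+(B)$, $j \in \{1, \dots, i\}$ such that $\alpha_1 \,\&\, \dots \,\&\, \alpha_n \,\&\, \beta \in W_j(B)$.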
Let $k_1 = 7$, $k_2 = 10$, $k_3 = 11$, $k_4 = 15$, $k_5 = 16$. For $i = 1, \dots, 5$ the S-calculus of type $i$ in a b.s.s. $B$ is defined as the triple

$$\Psi_i(B) = \langle \tilde{D}(B),\ \mathcal{X}(B),\ R_i \rangle,$$

where

$$\tilde{D}(B) = \{\alpha \mid \alpha = d_1 \dots d_n,\ n \ge 1,\ d_1, \dots, d_n \in D(B) \cup \{\&\}\}, \qquad R_i = \{A_0, A_1, \dots, A_{k_i}\}.$$

The set of those $\beta \in D^+(B)$ for which, for some $t \in \bar{S}(B)$, the string $\beta \,\&\, t$ belongs to $\mathcal{S}_{k_i}(B)$ is called the S-language of type $i$ in the b.s.s. $B$ and is denoted by $L_i(B)$. It is the class of S-languages of type 5 that represents the sought-for class of formal languages $\mathcal{L}$.
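The nesting of the rule sets can be checked with a short Python sketch. The highest rule indices 7, 10, 11, 15, 16 are taken from the text; the function names are hypothetical and the rules themselves are represented only by their names.

```python
# Highest rule index for each calculus type, as given in the text.
K = {1: 7, 2: 10, 3: 11, 4: 15, 5: 16}


def rule_names(calculus_type):
    """Names A0..Ak of the inference rules available in the given type."""
    return [f"A{n}" for n in range(K[calculus_type] + 1)]


def new_rules(calculus_type):
    """Rules that appear in this type but not in the previous one."""
    previous = set(rule_names(calculus_type - 1)) if calculus_type > 1 else set()
    return [r for r in rule_names(calculus_type) if r not in previous]


for i in sorted(K):
    print(i, len(rule_names(i)), new_rules(i))
```

Because each rule list is a prefix of the next one, every inference rule of type i is also a rule of type i+1, and type 5 has the 17 rules A0 to A16 mentioned in the abstract.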
THE INFERENCE RULES OF S-CALCULUSES OF TYPES 1-5

The rule $A_0$ gives an initial inventory of inferred formulas. The rule $A_1$ governs joining the formal analogues of the meanings of the words "arbitrary", "some", "every", "all", etc. to the images of prime and compound concepts. The rule $A_2$ can be interpreted as a description of the method of marking, by means of variables, the expressions that are fragments of the IR of texts and that qualify the entities mentioned in texts. The rules $A_3$ and $A_4$ essentially indicate the methods of using the connectives "not" (symbol $\neg$), "and", "or" (symbols $\wedge$, $\vee$) in constructing the IR of texts. In particular, the rule $A_4$ allows one to obtain strings of the forms $(\alpha_1 \wedge \dots \wedge \alpha_n)$ or $(\alpha_1 \vee \dots \vee \alpha_n)$, where $n > 1$ and $\alpha_1, \dots, \alpha_n$ are all IR of statements, or all IR of imperatives, or all IR of questions. Besides, it is possible that $\alpha_1, \dots, \alpha_n$ denote prime or compound concepts, or that they denote individuals, or that they denote sets of individuals. In each considered case $\alpha_1, \dots, \alpha_n$ denote homogeneous (in some formal sense) entities.
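The homogeneity requirement on the operands of "and" and "or" can be sketched in Python as follows. This is only an illustration loosely modelled on the description of the rule for conjunction and disjunction: the exact formal conditions of that rule are not given in this outline, so the function name, the representation of operands as (expression, kind) pairs, and the requirement of at least two operands are assumptions.

```python
AND = "\u2227"  # logical "and" symbol
OR = "\u2228"   # logical "or" symbol


def join_homogeneous(connective, operands):
    """Join (expression, kind) pairs with one connective.

    All operands must share the same kind (for example "statement",
    "imperative", "question", "concept", "individual" or "set").
    Returns the compound expression together with that common kind.
    """
    if connective not in (AND, OR):
        raise ValueError("connective must be the 'and' or the 'or' symbol")
    if len(operands) < 2:
        raise ValueError("at least two operands are assumed here")
    kinds = {kind for _, kind in operands}
    if len(kinds) != 1:
        raise ValueError(f"operands are not homogeneous: {sorted(kinds)}")
    body = f" {connective} ".join(expr for expr, _ in operands)
    return f"({body})", kinds.pop()


print(join_homogeneous(AND, [("Kiev", "individual"), ("Moscow", "individual")]))
```

Mixing, say, a statement with an individual is rejected, which corresponds to the remark that the joined expressions denote homogeneous entities.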
In essence, the rules $A_5$, $A_6$, $A_7$ tell how to employ function and relation names in constructing the IR of texts. Besides, the rule $A_6$ permits one to state formally that some entities are identical. Functions and relations of some nontraditional kinds are proposed for consideration: for instance, functions whose arguments and/or values can be sets of objects, concepts, or IR of texts.

Using the rule $A_8$ one can construct the IR of phrases such as "stand up", "it is hot", etc. The rule $A_9$ is intended for describing the semantic structure of sentences with verbs, modal words and some verbal nouns. This rule explains the organization of sentences with direct and indirect speech and with subordinate clauses of purpose. The rule $A_{10}$ allows one, first, to mark the IR of a statement by a variable; such a variable designates the situation discussed in the given statement. Secondly, $A_{10}$ makes it possible to mark the IR of an imperative by a variable denoting the goal situation being discussed.

The rule $A_{11}$ makes it possible (together with the rules $A_0$-$A_{10}$) to build the IR of compound concepts and to approximate the semantic structure of descriptions of diverse entities: real objects, events, algorithms, etc. Such descriptions can be, for instance, the formal analogues of sentences with participial constructions and attributive subordinate clauses.

The inference rule system of S-calculuses of type 4 consists of the rules $A_0$-$A_{11}$ and of some new rules with numbers 12-15. The rule $A_{12}$ makes it possible to construct the IR of questions without interrogative words out of the IR of statements and imperatives. The rule $A_{13}$ indicates the method of constructing the IR of questions with interrogative words out of the IR of statements (or of imperatives) and out of variables. The rule $A_{14}$ makes it possible to build compact IR of answers to questions with interrogative words. The rule $A_{15}$ essentially makes it possible to mark an arbitrarily long fragment (corresponding to a statement,
or to an imperative, or to a question) of the IR of a text by a suitable variable. If there are references in the narrative text $T$ to the meaning of a fragment $T_1$, and the fragment $P_1$ of the IR of $T$ corresponds to $T_1$, then one can use variables to mark the fragment $P_1$ in different parts of the IR of $T$.

In S-calculuses of type 5 a new inference rule $A_{16}$ is added to the rules $A_0$-$A_{15}$. In essence, the rule $A_{16}$ is partially applicable to the IR of statements and to the IR of imperatives. This rule makes it possible to join the existential and universal quantifiers to the IR of statements and the universal quantifier to the IR of imperatives.

The scientific novelty of the inference rules of S-calculuses of type 5 manifests itself, in particular, in the following way: these rules make it possible to construct compact informational representations and to explain the organization of the surface semantic structure of such phrases and texts as "It is necessary to draw two circles. Diameters are 8 cm and 12 cm", "Somebody had not turned off the knife-switch. This caused a fire", "a group consisting of five persons", "all inhabitants of Kiev", "How many inhabitants are there in Kiev?", "Victor said that he had lived in Kiev and Moscow. It was new for Rita that Victor had lived in Kiev", "Victor works not at a plant. He is a teacher", "to take an arbitrary box", "to bring a toy to every child under ten".

T-CALCULUSES

The idea of describing the class of T-calculuses is as follows. A participant in the intercourse forms a text taking account of an analogue of some basic semantic system $B$. This analogue of a b.s.s. is a part of his world model. If a sentence introducing a new concept or a new designation occurs in the text, then the next fragment of the text will be formed with respect to some new b.s.s. $B'$.
The inference rules of T-calculuses explain, for example, the organization of the semantic structure of texts including the sentence "A tanker is a vessel for carrying liquid freights" and of texts including the sentence "Let M be the intersection of lines AO and BC".
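The idea that a defining sentence switches the text from one basic semantic system to an extended one can be sketched in Python. This is a toy model under strong assumptions, not the inference rules of T-calculuses: a b.s.s. is reduced to a table from symbols to type strings, a text is reduced to a list of "define" and "use" fragments, and all names and type strings below are hypothetical.

```python
class BasicSemanticSystem:
    """Toy stand-in for a b.s.s.: just a table from symbols to type strings."""

    def __init__(self, symbols):
        self.symbols = dict(symbols)

    def extended(self, name, type_string):
        """Return a new system that also contains the introduced symbol."""
        if name in self.symbols:
            raise ValueError(f"{name!r} is already defined")
        return BasicSemanticSystem({**self.symbols, name: type_string})


def read_text(fragments, initial):
    """Process fragments in order; return the system in force after each one.

    A fragment is ("define", name, type_string) or ("use", [names]).
    """
    current = initial
    in_force = []
    for fragment in fragments:
        if fragment[0] == "define":
            # A defining sentence: later fragments are read against the new system.
            current = current.extended(fragment[1], fragment[2])
        else:
            unknown = [n for n in fragment[1] if n not in current.symbols]
            if unknown:
                raise KeyError(f"undefined symbols: {unknown}")
        in_force.append(current)
    return in_force


b0 = BasicSemanticSystem({"vessel": "^obj", "liquid_freight": "^obj"})
text = [
    ("use", ["vessel"]),
    ("define", "tanker", "^obj"),  # "A tanker is a vessel for carrying liquid freights"
    ("use", ["tanker", "liquid_freight"]),
]
systems = read_text(text, b0)
print(["tanker" in s.symbols for s in systems])
```

The initial system is never modified; each definition yields a new system, which matches the description of the next fragment being formed with respect to some new b.s.s.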
ABOUT SOME POSSIBLE APPLICATIONS OF THE RESULTS

The developed mathematical tools considerably enlarge the possibilities of constructing models of the type "Narrative text $\Rightarrow$ Informational representation (IR)". Using the formalism of T-calculuses, the author of this paper has constructed a particular mathematical model of the type "Narrative text $\Rightarrow$ IR" for texts from some sublanguages of natural languages that are useful for applications. The texts can be, for example, descriptions of graphic objects or commands to calculate the values of parameters of diverse objects. This model is independent of the task area and can be generalized in various directions.

The formalism of S-calculuses makes it possible to obtain a mathematical interpretation of Schank's (1975) conceptual dependency theory in natural languages and of the principles used by Rieger (Schank, 1975) in constructing the conceptual memory of the experimental dialogue system MARGIE.

It can be proposed to represent formally the frames of object classes by means of T-calculus expressions of the form

[expression (2) is not legible in the source text]

where $t$ designates the meaning of the Russian word "некоторый" ("a certain"), $C$ is a formal analogue of a concept, for $i = 1, \dots, n$ $f_i$ is a function name or a relation name, and $v$ is a variable. In a similar way the T-calculuses can be used for developing conceptual dictionary models.
The strings of the form (2) can also be used for representing the meanings of narrative texts and for representing knowledge modules. For instance, the following relationships can hold: $C$ = statement, $f_1$ = source, $f_2$ = truth, $f_3$ = new. For representing compactly the knowledge about all information modules $J_1, \dots, J_m$ of the form (2) associated with an object $x$, it can be proposed to construct (in accordance with the rules $A_4$ and $A_7$) expressions of the form $Information(x, (v_1 \wedge \dots \wedge v_m))$, where $v_1, \dots, v_m$ are variables marking the modules $J_1, \dots, J_m$.

For modelling the knowledge of an artificial intelligence system about itself and for representing the knowledge about the goals and knowledge of other active objects, one can consider expressions of the form $KNOWS(x_1, x_2, a, \tau)$, where KNOWS is the name of a four-place relation, $x_1$, $x_2$ are constants denoting intelligent systems, $a$ designates a concept or an object, and $\tau$ is the IR of a statement. Such expressions could be interpreted in the following way: from the viewpoint of the object $x_1$, the object $x_2$ possesses the information $\tau$ about $a$.

The formalism of S-calculuses and T-calculuses has been developed on the basis of ideas published in (Fomitchov, 1978, 1981a, 1981b).

REFERENCES

Fomitchov, V.A. (1978). The algebraic description of structures of knowledge representation languages in the memory of integral robot. In: Avtomaticheskoe regulirovanie i upravlenie, v. 11. M.: Vsesouzn. zaochn. mashinostr. institut. (in Russian).
Fomitchov, V.A. (1981a). On the theory of logic-algebraic models of the meaning level mechanisms of speech constructing. I. The formulation of task and the idea of approach to its solving. Rukopis deponirovana v VINITI 27 October 1981, 85 pp. (in Russian).
Fomitchov, V.A. (1981b). On the development and the application of the theory of logic-algebraic modelling a number of the natural
languages mechanisms for the meaning level text building. In: IX Vsesouznyi simposium po cibernetike (Tezisy simposiuma, Sukhumi, 10-15 noyabrya 1981), vol. 1, The knowledge representation. Nauchny sovet po komplexnoy probleme "Cibernetika" Akad. Nauk SSSR, Moscow. (in Russian).
Schank, R.C. (1975). Conceptual Information Processing. North-Holland Publ. Comp., Amsterdam, Oxford; American Elsevier Publ. Comp. Inc., N.Y.
Shoenfield, J.R. (1967). Mathematical Logic. Addison-Wesley Publishing Company.