Reprinted from Int. J. Man-Machine Studies (1975) 7, 571-608
Natural Language Acquisition by a Robot*

H. D. BLOCK
Department of Theoretical and Applied Mechanics, Cornell University

J. MOULTON† AND G. M. ROBINSON
Department of Psychology, Duke University

(Received 24 October 1974)

". . . nothing reveals our ignorance about a phenomenon more clearly than an attempt to define it in constructional terms." (W. Grey Walter, The Development and Significance of Cybernetics)

We present a model of language acquisition which demonstrates that considerable language competence can be acquired without presupposing innate linguistic factors. Vague or magical properties are avoided by describing the model as a design for a robot using clearly defined algorithms and mechanisms which can be simulated or constructed. Described first is the development of a simplified semantic system using a minimum of cognitive apparatus operating in a restricted environment. This works in conjunction with an algorithm to develop grammatical competence. No innate grammatical knowledge is needed; only interaction with an environment and a native speaker of the language is required. Grammatical competence is portrayed in terms of a new syntactic model, whose power is demonstrated by a game "simulation" in which the players make connections allowed by the syntactic rules to generate and parse English sentences. The paper closes with a second game which simulates the robot and its "parent" by a team of human players. This simulation shows how the robot develops its semantic and syntactic systems through its experience in the world.
Introduction

This paper presents a model for natural language acquisition as a design for a robot. By attempting to "define the problem in constructional terms" we believe we can isolate a set of minimally essential features for language acquisition. Winograd (1971) has shown that considerable language competence can be programmed into a robot. We hope to show how a robot with a perceptual and motor system but with no pre-programmed linguistic information can acquire such competence by interacting with a linguistically competent teacher. We are not directly concerned with how humans learn language;

*An earlier version of this paper was presented at the Conference on Biologically Motivated Automata Theory held at the MITRE Corporation, McLean, Va., 19-21 June, 1974.
†Requests for reprints to Dr. Janice Moulton, Department of Psychology, Duke University, Durham, N.C. 27706 U.S.A.
ours is the more general question of what makes it possible for language to be learned at all. This paper deals only with the problems involved in specifying the development of a semantic and syntactic system. We do not deal with the difficult problems of speech production and perception, pattern recognition and the formulation of strategies, goals and decisions. We presuppose that our robot has a perceptual and motor system which combines the state-of-the-art features of SRI's mobile automaton (Nilsson, 1969) and the "eye-hand" machines built at Stanford University (McCarthy, Earnest, Reddy & Vicens, 1968) and MIT. Our robot is to operate in a restricted environment with a small set of readily discriminable objects. The robot "converses" with its teacher by Teletype about the entities, attributes, relations and events in its environment--a giant chessboard room.
The Problem

Natural language acquisition requires the learning of complex information. For our purpose, we can consider this information to be of three types*. (1) Lexical--the formation of an association between each concept and the lexeme (roughly, a word; see Lyons, 1968, p. 197 ff) which names it. We must show how concepts are mediated by the robot's cognitive apparatus and how lexemes become connected to these mediators. (2) Syntactic--the development of a system which can signify a relationship among concepts by a relationship among lexemes. (3) Pragmatic--learning whether particular concepts and/or their instantiations can be related to each other in a given situation. This information is essentially nonlinguistic and is used by every

*Readers should not concern themselves that the parts of this tristinction do not correspond neatly to any of the several uses of "syntax" and "semantics" in the literature.

[Fig. 1. A simplified version of a robot's cognitive apparatus. A small sample of a large number of initially random connections is shown. The figure depicts Sensory Detectors, the World Map, the Associator, the Syntax Crystal, and Dictionary Bins holding RED, RED PAWN, and PAWN.]
organism or automaton which learns to interact with its environment. It is essential for processing and resolving the frequent lexical and syntactic ambiguities of English. For example, consider the sentence: MARY HAD A LITTLE LAMB. Although the syntactic information tells clearly how the lexemes are related, the lexical ambiguity of HAD permits at least four interpretations, each of which is appropriate to a different situation. In the sentence VISITING PROFESSORS CAN BE DULL, only pragmatic information involving the non-linguistic context can determine which of the two possible meanings allowed by syntactic ambiguity is appropriate.

The robot processes the three types of information needed for language acquisition using a cognitive system of four components: World Map, Associator, Dictionary, and Syntax Crystal (see Fig. 1). The World Map is the robot's internal representation of its environment. It is a four-dimensional array (three spatial dimensions and time) whose elements stand for features and/or "things"* the robot perceives. When the robot acquires the ability to interpret linguistic messages, it can use this information to update its World Map. Since the World Map is topologically equivalent to the environment, the robot can formally represent complex events by computing changes in the relative position and state of objects which are represented on the Map. The robot could also represent hypothetical events to evaluate pragmatically alternative interpretations of messages, or even simulate the World Maps of its teacher and other robots.†

The Associator combines, gates and passes on information from other components. It consists of an interconnected network of logic gates whose outputs represent various combinations of features from the robot's sensory apparatus, motor controls and World Map. Each output is activated when the robot detects the particular combination of features it represents. The rows of dots in Fig. 1 represent just a few of its several layers, which are connected [as in the Perceptron (Rosenblatt, 1962) and Pandemonium (Selfridge, 1959) models] both convergently and divergently. This paper does not deal with the problems involved in training the robot to classify patterns and extract features. We abbreviate these processes here by teaching the robot to classify patterns by the presence or absence of a few basic features.

Storage Bins of the Dictionary are connected to the Teletype and to the Associator in such a way that Bins which are connected to active Associator outputs store any Teletype input which occurs during that activity. As explained in the next section, this enables the robot to build up a lexicon. During language output, the contents of Dictionary Bins which are connected

*Let "things" be a shorthand for entities, attributes, relations and events.
†The World Map is an elaboration of the S.R.I. robot's GRID MODEL (Nilsson, 1969).
to active Associator outputs are teletyped in a format constrained by the Syntax Crystal. The Syntax Crystal is the result of a learning algorithm which builds up a strongly adequate grammar by operating on linguistic messages. It enables the robot to combine the meanings of individual lexemes to provide semantic interpretation for parsing and generating sentences. The Syntax Crystal is a grammatical system which represents constituent and dependency relationships, provides distinct parsings for syntactically ambiguous sentences and can be used to portray the similarity in constituent relationships of paraphrases. It does not require an innate linguistic structure or an additional level of processing by a different set of rules as does transformational grammar (Chomsky, 1965). The model is described in some detail. Two games are provided which simulate (1) the parsing and generating ability of the Syntax Crystal, and (2) an overview of the robot's language acquisition behaviour.
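The four-dimensional World Map described above can be sketched in a few lines. The following is our own illustrative rendering, not the authors' implementation; the class and method names, and the coordinate coding of square B-1, are hypothetical.

```python
# A minimal sketch of the World Map: a four-dimensional array (three
# spatial dimensions plus time) whose cells hold the features the robot
# perceives at that location. All names here are our own assumptions.

class WorldMap:
    def __init__(self):
        # (x, y, z, t) -> set of feature labels, e.g. {"red", "pawn-shape"}
        self.cells = {}

    def record(self, x, y, z, t, features):
        """Store perceived features at a spatio-temporal location."""
        self.cells.setdefault((x, y, z, t), set()).update(features)

    def at(self, x, y, z, t):
        return self.cells.get((x, y, z, t), set())

    def moved(self, thing_features, t0, t1):
        """Detect a simple event: the same feature bundle at a new place."""
        def where(t):
            return {loc[:3] for loc, feats in self.cells.items()
                    if loc[3] == t and thing_features <= feats}
        return where(t0) != where(t1)

# A red pawn appears on square B-1 (coded here, arbitrarily, as (2, 1, 0))
# at time 0, then on B-2 at time 1; the Map registers the change.
wm = WorldMap()
wm.record(2, 1, 0, 0, {"red", "pawn-shape"})
wm.record(2, 2, 0, 1, {"red", "pawn-shape"})
```

Because the array is topologically equivalent to the chessboard room, an event such as a pawn being pushed reduces to a computable change of position between two time slices, as `moved` illustrates.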
Acquisition of Lexicon

Using Fig. 1, we can sketch briefly how the robot builds up a lexicon. Imagine the robot to be alone in its world, the giant chessboard room. Now we place a red pawn on square B-1, simultaneously teletyping RED PAWN to the robot. The color, shape, and location of the new object excite the detectors for red, pawn-shape and internal World Map location B-1. Each excited unit emits an impulse along the wires connecting it to other units, i.e. wires a, b, h, and j. In the Associator, nodes 1 and 2 receive impulses. Each logic node passes on the impulse if, and only if, all of its inputs are active. Nodes 1, 2, and 3 meet this condition. Impulses are passed down to the Dictionary Bins to which they happen to be connected. Each excited Bin stores the teletype input RED PAWN. At the same time, the Associator mediates connections between all active Dictionary Bins and location B-1 of the World Map. Connections are also made to units in the Syntax Crystal, which will be explained in the next section. If we now show the robot a blue pawn while teletyping BLUE PAWN, this phrase will be stored only in the third Dictionary Bin. This Bin now has the word PAWN stored twice but RED and BLUE each stored only once. A variety of "things" and their teletyped descriptions are presented to the robot. Lexemes that are stored in other Bins with higher frequency are suppressed (e.g. the third Bin in Fig. 1 has a greater frequency of PAWN, so this lexeme will be suppressed in Bins 1 and 2). RED will continue to occur in all three Bins until non-pawn occurrences of red are presented to the robot. In
the end, the Dictionary will contain a set of lexemes each of which is connected through the Associator to the set of sensory input features and World Map locations which are activated by the things named by each lexeme. This process of passively correlating lexemes with sets of sensory input and locations can be made more efficient by permitting the robot to actively test these connections, obtaining feedback from its teacher. The connections to the Dictionary represent part of a semantic system--what each lexeme is for the robot. For names and descriptions of concrete objects, the above scheme is clearly workable. It can be extended to cover words such as FAST, GO, PUSH, NEAR, BECOME, by connecting their Dictionary Bins with computations on outputs from the robot's sensory system and information stored on the World Map. However, we could not expect the robot to make useful connections to computations on perceptual or World Map data for lexemes like FOR and IS. The essence of these lexemes is what they do with other lexemes in sentences, i.e. their syntactic role, which is represented in the model by their connections to units in the Syntax Crystal.
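The Dictionary-Bin correlation and suppression process just described can be sketched as follows. This is our own toy rendering under simplifying assumptions: each Bin is keyed by a single Associator feature, and every teletyped lexeme is stored in every active Bin, with a lexeme surviving only in the Bin(s) where it occurs most frequently.

```python
from collections import Counter, defaultdict

# Each Bin (keyed here by the feature that activates it) counts every
# lexeme teletyped while that feature is active.
bins = defaultdict(Counter)

def present(active_features, teletype_phrase):
    """A 'thing' is shown while its description is teletyped."""
    for feature in active_features:
        bins[feature].update(teletype_phrase.split())

def survivors(feature):
    """Lexemes not suppressed in this Bin: a lexeme is kept only where
    its frequency is at least as high as in any other Bin (cf. Fig. 1)."""
    return {w for w, n in bins[feature].items()
            if n >= max(b[w] for b in bins.values())}

# The red pawn, the blue pawn, and a non-pawn occurrence of red:
present(["red", "pawn-shape"], "RED PAWN")
present(["blue", "pawn-shape"], "BLUE PAWN")
present(["red", "cube-shape"], "RED CUBE")
```

After these three presentations, PAWN survives only in the pawn-shape Bin (where it occurred twice), while it is suppressed in the red and blue Bins; this mirrors the frequency-suppression behaviour described in the text.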
Syntax and its Mediation

The robot confronts a world of entities, attributes, relations and events. When one of these is mediated through the Associator, it becomes one of the robot's concepts. In the process of acquiring a lexicon, a lexeme naming a concept becomes stored in a Dictionary Bin connected to the appropriate Associator output. But entities, attributes, relations and events rarely present themselves in isolated independence. The robot must deal with concepts which are related in varied and complex ways. The role of syntax in natural language is to provide a coding scheme whereby relationships among concepts can be signified by relationships among the lexemes which name these concepts. In English, the order of the lexemes in a string encodes most of the syntactic relationships; the rest is encoded by inflections which add on to or modify the form of lexemes. This paper will consider only -en, -ed and -ing inflections as examples of inflectional encoding. Suppose the robot has acquired the lexemes: HUMAN, PAWN, PUSH(ES) and BLACK. There are several ways these lexemes can be related, each expressing a different relation among the concepts they name and exhibited by a different linear permutation of the lexemes. Figure 2 cartoons an event described by the string: HUMAN PUSH(ES) BLACK PAWN. English syntax can represent this relationship among concepts by arranging their lexemes in the order given by the sentence above.
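The point that lexeme order carries relational information can be made concrete with a deliberately trivial sketch (our own, not the authors'): under a rigid agent-action-object reading, the same three lexemes in different orders signify different relations among the same concepts.

```python
# Toy illustration: English-like order encodes who does what to whom.
# The rigid agent-action-object pattern assumed here is our simplification.

def read_svo(tokens):
    """Read a three-lexeme string as agent, action, object."""
    agent, action, patient = tokens
    return {"agent": agent, "action": action, "object": patient}

r1 = read_svo(["HUMAN", "PUSH(ES)", "PAWN"])
r2 = read_svo(["PAWN", "PUSH(ES)", "HUMAN"])
```

The two permutations contain identical lexemes yet yield distinct relational readings, which is exactly the information the Syntax Crystal must learn to encode and decode.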
The Syntax Crystal mechanism must enable the robot to signify relationships among nameable concepts by a linear permutation of the lexemes which name the concepts, and to derive the correct (i.e. intended) set of relationships from any linear permutation of lexemes which comprises a well-formed input string. Where lexical or syntactic ambiguities occur, the robot, like any language user, would make use of the context. Figure 2 shows how the Associator provides information for the relations among concepts. Connected to the Dictionary Bins are output nodes 1-4, representing the concepts as explained in the previous section. The other nodes and connections responsible for building the lexicon are not shown in Fig. 2.
[Fig. 2. How the Syntax Crystal maps Associator node connections. The Associator's output nodes 1-4 connect through the Dictionary Bins to the lexemes of the sentence HUMAN PUSHES BLACK PAWN; below, a diagram combines the topology of the node connections with the linear order of the lexemes.]
Node 5 represents the connections which mediate the robot's detection that the concepts named PAWN and BLACK are directly (i.e. not through any other output nodes) related. The fact that they are related, and not any quality about the relationship, is all the robot need determine. In other words, it must determine only whether or not there is a direct connection like that portrayed through node 5. Node 6 represents a direct connection between PUSH(ES) and PAWN;
node 7 connects PUSH(ES) to HUMAN. That the two connection pathways are discriminable is the only necessary information; the robot does not need to classify the pathways as signifying "action-object" and "action-agent." The actual construction of an Associator to perform this function may prove more challenging than we suppose. But, except for connections to the Dictionary, this cognitive capacity is essentially non-linguistic. Our main interest is to show how this basic capacity can be exploited by a model which acquires and uses natural language syntax.

At the bottom of Fig. 2 is a diagram which combines the topology of the node connections with the proper ordering of the lexemes in the sentence. The light lines suggest pathways in the Associator node diagram above to clearly show the mapping. This new diagram matches up information about relations among the concepts with the linear permutation of lexemes which name the concepts so related. Since this is the function of syntax, we need to describe a mechanism which will enable the robot to use the information carried by the diagram. The Syntax Crystal is a model for such a mechanism in the form of a two-dimensional cellular automaton. As a model for syntactic competence it shows that a grammar more powerful than phrase-structure grammar is learnable, something that has not been done before.

L. Harris (1972) successfully simulated some aspects of language acquisition. His robot was able to build up a lexicon by correlating the names of several objects and activities with ". . . built-in concepts" (p. 89) defined in terms of the robot's sensory and motor apparatus. Each piece of apparatus was "wired to" a grammatical category, e.g. its motors were wired to the "verb" category. The robot inferred a simple (but not trivial) phrase-structure grammar by operating on the sequence of grammatical categories assigned to incoming strings according to the maxim: "The parts of speech are the parts of the robot" (p. 87). For the project described here, we deliberately avoid the use of innate grammatical categories, especially categories which are assigned on the basis of a priori ontological classifications*. The Syntax Crystal Learning Algorithm assigns words to grammatical categories on the basis of their relationship to other words in a string. The category distinctions are developed only as needed, and none are available before the acquisition process begins†. Among the other differences between this model and the one used by Harris are: lexical and syntactic acquisition are no longer treated as processes which

*See Lyons (1968, section 7.6) for a discussion of "notional" vs. "formal" grammatical categorization.
†Gradual elaboration of grammatical category distinctions occurs in human language acquisition. Discovery of the categories is seen as a major part of language acquisition (McNeill, 1970).
occur independently; the "top-down" parsing (p. 131) restriction is eliminated, allowing simultaneous elaboration of the parsing and generating structure from any number of starting points.
The Syntax Crystal

The syntactic model is represented as a set of rectangular cards. Guided by a simple algorithm, the robot enters either a lexeme or a connection code on the edges of the initially blank cards. The entries are determined by the algorithm from sentences teletyped to the robot and from the teacher's acceptance or rejection of sentences generated by the robot. Sentences are generated and parsed by permitting the rectangular cards to tessellate under the local control of the connection codes. Local control means that each card can determine only its immediate neighbors. When parsing a sentence, the local "crystal growth" is constrained by the order of the lexemes in the sentence. The completed crystal results in a diagram like that at the bottom of Fig. 2, from which the robot can "read off" the conceptual relations as discussed in the previous section. When generating a sentence, crystal growth is constrained to connect lexeme-bearing cards by pathways which are topologically equivalent to those in the Associator signifying relations among concepts, e.g. the pathway through nodes 5, 6, and 7 in Fig. 2. The completed crystal maps the conceptual relations into the proper lexeme order and applies inflections where necessary. In this section, the learning process will be described, and then the algorithm which controls it. Then we will show that a small set of completed cards can generate and parse English sentences of considerable complexity. A set of cards is provided in the form of a game for readers to test the adequacy of the model themselves.

In the Syntax Crystal model, grammatical features of natural language are acquired through the Learning Algorithm, a recursive procedure which encodes the structural relations of syntax as hierarchical connections between lexemes. Although the Learning Algorithm is specified here for syntax acquisition, it is not intrinsically specific to language.
It could easily be adapted to explain the acquisition of other cognitive and motor skills which are considered to be hierarchically organized (Lashley, 1951; Miller, Galanter & Pribram, 1960). Attempts by learning theorists to account for syntactic competence ignore the underlying structure of syntax because it is not directly represented in the linear strings of speech behaviour (Skinner, 1957). Instead, this approach tries to account for syntax acquisition in terms of association frequencies of words or syntactic categories (e.g. Braine, 1963). Inadequacies in this approach (see Bever, Fodor & Weksel, 1965; Chomsky, 1959) have prompted many
linguists to conclude that there must be innate linguistic mechanisms for syntax acquisition to be possible. Innate linguistic information is used specifically by L. Harris with his prewired notional syntactic categories, and proclaimed by Chomsky as ". . . the quite obvious fact that the speaker of a language knows a great deal that he has not learned" (1966, p. 60). The Syntax Crystal combines the ability to make simple hierarchical connections (i.e. two things connected, not to each other, but both to a third thing) with the cognitive skills of association, generalization and discrimination that are recognized by learning theorists, to provide an explicit procedure whereby the structural relations of syntax can be learned. Syntax learning requires the following four conditions: (1) An environment to provide something to converse about. (2) A linguistically competent teacher to provide linguistic input about the environment and to accept or reject the robot's trial utterances. (3) The robot must have some lexemes associated with concepts. (4) The robot must be able to learn that the co-occurrence of lexemes can express a relation between the concepts they represent. For example, for the string HUMAN WALK(S), HUMAN and WALK(S) both apply to the same object. This information is independent of syntactic characteristics such as order and inflection. Such characteristics must sometimes be unnecessary for a collection of lexemes to be treated as a string, for otherwise they could not be learned: one must be able to at least partly understand a string independent of these characteristics in order to recognize that it is a string; only then can the syntactic characteristics of such a string be acquired. This bootstrap problem is resolved by the non-linguistic relationships which mediate the understanding of the string they accompany.
Thus for a string such as THE HUMAN IS WALKING, the robot must first be able to pick out the lexemes it understands, say HUMAN and WALK, and then, recognizing that their concepts are related, use the algorithm to form a Syntax Crystal connection between them. Later it can add on Syntax Crystal cards for THE, IS, and -ING. It has been argued (Katz, 1966, p. 251) that the observable grammatical features (order, syntactic category inflections, etc.) of language are too impoverished for grammar to be learned without innate rules. Since the meaning of individual lexemes, and the information that the concepts expressed by the lexemes can be related, are not considered grammatical features, the argument amounts to a claim that one cannot learn the grammar of a language without paying attention to meaning. The validity of this argument has no
bearing on the Syntax Crystal Learning Algorithm which uses knowledge of the extra-grammatical features of language as a starting point. To begin the acquisition process, the robot enters the lexemes on the bottom of blank rectangular cards, e.g.:
Connections between cards are made by entering apposed matching codes. Two cards are connected if one can draw a horizontal or vertical line between the matching codes, with no other card intervening. The connection for HUMAN WALK(S)* could be:

[Card diagram: two structure cards joined horizontally by matching A codes; the left card's bottom code N matches the top code N of the HUMAN lexeme card, and the right card's bottom code S matches the top code S of the WALK(S) lexeme card.]
The lexeme cards are connected hierarchically by generating two other "structure" cards. Structure cards are not an arbitrary creation; they can be considered to arise from a kind of natural selection process. Think of the lexeme cards to be connected as surrounded by a matrix of blank cards. Initially the robot generates a (perhaps random) variety of paired codes on surrounding cards, connecting the two lexeme cards through a number of different pathways, direct lexeme-lexeme and less direct (hierarchical) connections. As will be seen, the hierarchical pathways are more general and thus will be used more frequently than the linear pathways which are specific to individual lexeme sequences. Let the connections be strengthened by frequent use and postulate a housecleaning operation which eliminates or makes less accessible the infrequent connections. That is not to deny that some lexeme-lexeme connections remain--they are such stuff as cliches are made on. The most important feature of the Syntax Crystal is that the connection codes on the cards are not programmed into the robot. The robot begins with a set of blank cards and a simple algorithm for entering and correcting the codes. The learning algorithm develops the hierarchical structure of syntax from linguistic experience. The local rules are represented by card types with *Agreement implemented by context sensitivity can be provided by additions to the model: (1) a plane above the plane of the cards which filters consistent sets of number, gender, etc. inflection forms, or (2) levers or cams on every card which adjust the lexeme cards for inflections and when moved, cause all the levers on constituent (see next section) cards to adjust correspondingly, so that, for example, the co-ordinate case inflections of nouns and related verbs will appear on their lexeme cards. Again, such mechanisms are not specifically linguistic, cf. Penrose (1959).
coded edges. The codes on the cards are arbitrary, with the following restriction: when a lexeme or phrase can be substituted for another in a string to produce an acceptable utterance, it is given the same top code; otherwise the codes must be different. For example, since ROBOT can be substituted for HUMAN in HUMAN WALK(S), it is given the same top code, A.
For a new string, for example, WALK FAST, the robot must recognize that there is some connection between the concepts expressed by WALK and FAST. To connect the WALK(S) card to the new FAST card, the algorithm constructs:

[Card diagram: B-B structure cards connecting the WALK(S) lexeme card to the FAST lexeme card (top code T).]

The WALK(S) card is given an additional top code, V. For a more complicated sentence, for example, HUMAN WALK(S) FAST, the phrase WALK(S) FAST is joined to the previously created structure:

[Card diagram: the completed structure for HUMAN WALK(S) FAST. The A-A structure cards from the earlier string sit at the top, connecting HUMAN to the column above WALK(S) FAST; beneath, B-B structure cards connect WALK(S) (top codes S, V) to FAST (top code T).]

Placing parentheses around a code indicates that the connection is optional. A string or parsing structure is complete when all nonoptional codes are connected. Here the phrase WALK(S) FAST connects to HUMAN as a unit; it was substitutable for WALK(S) just as ROBOT was substitutable for HUMAN in the earlier string. The highest horizontal connection (here the codes "A") represents the main constituent connection. Subordinate constituents and their structural relations are represented by the other horizontal connections and their positions in the hierarchy. The scope of any constituent is represented by everything which can be reached from the tallest column in the constituent via a horizontal and/or downward path (see Fig. 3 for a detailed illustration). The dependent constituent of a pair is the one whose column is lower down (e.g. FAST is the dependent constituent in WALK(S) FAST and in HUMAN WALK(S) FAST).
Feedback as to whether the substitution of a constituent results in a proper utterance comes from the response of the teacher. In most cases the response informs the robot whether or not its trial utterance is acceptable. Usually such feedback confounds syntactic, semantic, and possibly stylistic acceptability. Teachers would tend to withhold positive responses when the robot's utterances were ungrammatical, nonsensical, impolite or any combination of these faults. In addition, criteria of acceptability will vary widely across teachers and the sophistication of the robot. The important point is that the linguistic knowledge symbolized by connections in the Syntax Crystal is shaped by the teacher's responses: the strength of a coded connection increases when its use
[Fig. 3. A completed Syntax Crystal for THE BIG ROBOT IS KIND TO HUMANS, illustrating constituent scope. The constituent connections it encodes are:

Left constituent    Connection    Right constituent
THE BIG ROBOT       S1            IS KIND TO HUMANS
THE                 A1            BIG ROBOT
BIG                 J2            ROBOT
IS                  J3            KIND TO HUMANS
KIND                D1            TO HUMANS
TO                  N5            HUMANS]
results in an acceptable utterance; a connection is weakened when its use results in an unacceptable utterance.

In a longer string, which two lexemes are first connected together as a subunit is determined by relations among the concepts named by the lexemes. The most important relation is the scope of each concept: which other concepts it is directly related to. An example shows what is involved: BIG HUMAN WALKS. The scope of "big" is "human", while the scope of "walks" is "big human". Therefore, the Syntax Crystal connects BIG to HUMAN and then connects that pair as a unit to WALKS. The robot generates sentences by first setting out (in any order) lexeme cards which bear the names of things (see footnote on p. 573) which, according to its semantic system, are involved in what it wants to communicate. It then begins to attach structure cards to the lexeme cards, subject only to the constraint that the scope of a concept (and thus the hierarchy of relations
among constituents) is determined by the column of structure cards above its lexeme. This constraint alone enables the Syntax Crystal to place the lexemes in the proper order and introduce the required inflections and syntactic function words. When parsing a sentence given to it, the robot sets out the lexemes in the order they appear. It then begins to attach structure cards to the lexeme cards from left to right, subject only to the constraints provided by lexeme order and the presence of syntactic function words and inflections. The resulting structure portrays the conceptual relationships encoded by the syntax. Lexemes which name concepts connected in the robot's Associator will have their cards connected by a pathway through the Crystal. The level and column at which the connection is made encodes the constituent and dependency relations of the words and phrases so joined. Simple structures produced by the Syntax Crystal will look very much like rectangular versions of the structures described by phrase structure grammars. Our aim is not to reconstruct the well-known structural relations of syntax, but to show how they can be learned. However, the Syntax Crystal incorporates important departures from traditional grammars. Table 1 lists the grammatical categories of the codes used in the following sections as a convenience for the general reader.

TABLE 1
Explanation of codes in terms of the grammatical categories of the word cards they connect to

Code letter   Connects to
A             Particles
B             Adverbs
C             Co-ordinate conjunctions
D             Prepositions, also adverbs
F             Abstract nouns, gerunds
G             -ING phrases: gerunds, participles, progressive tense
I             Infinitive
J             Adjectives
M             Modal auxiliaries
N             Nouns
P             Passive constructions
R             Relative clauses
Q             Questions
S             Main connections of a sentence or clause
T             "that" clauses
V             Verbs
W             Interrogative adverbs
Z             The copula IS
TABLE 1 continued
Subdivision of N, V, and Z codes

If bottom code is   Side code connects to
N1     Adjective
N2     Verb phrase
N3     Verbs, prepositions, relative clauses, as object
N4     Prepositional phrases that modify it
N5     Adjective such that particle is required
N6     Particle
N7     Relative clauses
V0     Indirect object
V1     Noun phrase
V2     Direct object
V3     Adverbs and prepositions
V4     Progressive tense, -ING of gerund
V5     Propositions
V6     Noun phrase as participial adjective
V7     Passive form
V8     Modifiers of passive form
V9     Modal auxiliary
V10    Infinitives
V11    Relative clause, as verb of
V12    Infinitive, as verb of
Z0     Predicate adjective or predicate nominative
Z1     Abstract noun's phrase
Z2     Progressive tense
Z3     Question which begins with copula
Z4     Modal auxiliaries from copula
Z5     Abstract noun phrase and modal question

A phrase structure tree diagram can be characterized as a series of production rules of the form X → Y + Z, where X is a node which branches off into two lower nodes, Y and Z. A Syntax Crystal can be characterized as a set of connection rules which can be compared to the production rules of phrase structure. Consider the sentence GOOD ROBOT GOES:

[Two diagrams compare the models for GOOD ROBOT GOES: a phrase-structure tree (S branching to NP and VP; NP to ADJ and N; ADJ yielding GOOD, N yielding ROBOT, V yielding GOES) and the corresponding Syntax Crystal, built from the lexeme cards GOOD (top code D), ROBOT (top codes R, N) and GOES (top code V) together with their structure cards.]
NATURAL LANGUAGE ACQUISITION BY A ROBOT
S -+NP-kVP NP-+(ADJ)-}-N VP-+V ADJ-+GOOD N-+ROBOT V-+GOES
585
IN q-V]/S (N)-+D/Aq-R D -+GOOD N, R-+ROBOT V-+GOES
The rules are written directly below the diagram of each model. For both kinds of rules, the plus sign orders the sequential appearance of the two terms which surround it. The square brackets assure distributivity; parentheses indicate the connection code or element is optional. Preceding an arrow is the top connection code of a card. Following the arrow is either a lexeme or the bottom code of the same card and the bottom code-slash-side code of a card horizontally connected to it. The code on the same card is connected to the non-dependent constituent. The code followed by the slash is connected to the dependent constituent. The geometry of the rectangular Crystal is the source of added information. Portraying dependency relations is an advantage the Syntax Crystal shares with categorial grammar but not with phrase structure (Lyons, 1968, p. 231). Linguistic theories disagree about what comprises syntax and how much of the information in language is carried by syntax. Behaviourist language theory (Skinner, 1957) restricts syntax to the linear pattern of grammatical categories. Each linear pattern is a stimulus ("autoclitic") which controls the hearer's responses to the individual lexemes. Phrase structure generative grammars posit an underlying hierarchical structure which produces the linear arrangement of lexemes. The structure, with labeled nodes, represents the production rules of the grammar, a diagram of the constituent structure of linguistic strings. Categorial grammars (e.g. Ajdukiewicz, 1935) encode the dependency relations among constituents but do not distinguish among the many grammatical categories provided by phrase structure. This means that categorial production rules cannot convey the nature of the relationship between concepts. This information must be carried by a combination of lexical and pragmatic information.
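The card-and-code notation can be made concrete with a small data sketch (our illustration, not part of the original model's machinery; the type and field names are ours). A structure card carries coded edges, and a connection rule such as (N2) → J1/J1 + N1 is just a horizontally joined pair of cards whose facing side codes match:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Card:
    """One rectangular Syntax Crystal card; None marks a blank edge."""
    top: Optional[str] = None
    bottom: Optional[str] = None
    side: Optional[str] = None        # code on the short (horizontal) edge
    top_optional: bool = False        # a parenthesized top code, e.g. (N2)

def horizontally_joined(left: Card, right: Card) -> bool:
    """Two cards connect side by side when their facing side codes match."""
    return left.side is not None and left.side == right.side

# The rule (N2) -> J1/J1 + N1: the left card (over the dependent adjective)
# bears optional top N2, bottom J1 and side J1; the right card (over the
# non-dependent noun) bears bottom N1 and the matching side J1.
adj_card = Card(top="N2", bottom="J1", side="J1", top_optional=True)
noun_card = Card(bottom="N1", side="J1")

print(horizontally_joined(adj_card, noun_card))  # -> True
```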
The Syntax Crystal combines advantages of both phrase structure and categorial grammars; dependency relations and production rule structure are both incorporated. The Syntax Crystal is theoretically neutral as to whether or not to allow production rules to express types of ontological relations between concepts. In this paper, to minimize the complexity of syntax acquisition, we have elected not to map specific ontological relations on to specific production rules. This choice allows the Learning Algorithm to be
formulated in a simple way and does not require the robot to distinguish one relation of concepts from another for syntax acquisition. Our choice reflects our aim to identify the minimally essential features for language acquisition. Transformational grammar assigns a more ambitious role to syntax by positing a "deep structure" which can portray the similarity in meaning across logically equivalent paraphrases (e.g. THE STRONG GIRL and THE GIRL WHO IS STRONG) as products of the "same" deep structure (i.e. isomorphically similar). The Syntax Crystal portrays the similarities across paraphrases as topological similarity between pathways in the Crystals for each paraphrase.
[Figure omitted: the two Syntax Crystals for VISITING PROFESSORS CAN BE DULL, one for the gerund reading (annotated "It's much more interesting to visit students") and one for the participial reading (annotated "They don't seem to have the same . . . of the permanent faculty"). There is only one phrase structure tree for both sentences, but there are two different Syntax Crystals, and each shows the proper scope and dependency relations.]
FIG. 4.
It is argued that transformation rules are necessary to explain regularities between sentence modes (e.g. interrogative and declarative, active and passive); phrase structure grammar alone is inadequate. The Syntax Crystal could explain these as regularities of card substitutions. The substitution of a few basic mode-determining cards causes the Crystal to rearrange itself to reorder the lexemes and introduce the required function words and inflections. The next section illustrates this in detail. Although fluent language users probably employ the equivalent of these "card-substitution" rules, such rules are not essential to syntax acquisition and use. They would arise after a set of Syntax Crystal connections had been learned, as short-cut procedures to facilitate the connection of lexemes. Transformational grammar provides distinct structural descriptions for each version of a syntactically ambiguous sentence. Consider an example: VISITING PROFESSORS CAN BE DULL. On one interpretation the sentence is about a kind of visiting, on another it is about professors of a certain sort. Phrase structure grammar provides different grammatical labels for the two interpretations, but the two phrase structure trees are otherwise identical. The Syntax Crystal not only uses different structure cards for the two interpretations, but arranges them in different ways, indicating the dependency relations among the constituents (see Fig. 4). Transformational grammar also claims an advantage over phrase structure in being able to show a structural difference between the two sentences: JOHN IS EAGER TO PLEASE and JOHN IS EASY TO PLEASE.

[Figure omitted: phrase structure tree and Syntax Crystals for JOHN IS EAGER TO PLEASE and JOHN IS EASY TO PLEASE.]
FIG. 5. Syntax Crystal compared with phrase structure.

For these examples phrase structure grammar does not even provide different labels. Figure 5 shows that the Syntax Crystal will provide different structures. In the first sentence, TO PLEASE is the dependent constituent of EAGER TO PLEASE; in the second sentence, TO PLEASE is the non-dependent constituent, modified by IS EASY which tells how John is to please. However, there are some distinction-making powers which transformational theory attributes to syntax that we consider to be better reserved for lexical and pragmatic information. For example, THE SHOOTING OF THE HUNTERS receives two different syntactic structures under transformational analysis in order to distinguish "the hunters" as either the perpetrators of or the victims of "the shooting". The difference hinges on the nature of the relationship of hunters to the shooting. For comparison, consider the conceptual relations expressed by the following set of strings: BAG OF GOLD, BRICK OF GOLD, LOVE OF GOLD, CHLORIDE OF GOLD, DENSITY OF GOLD, PURCHASE OF GOLD. Here, OF connects GOLD to one of six other lexemes to express six different ontological relationships: filled with gold, made of gold, gold as the recipient of an affect, gold in molecular intimacy with chlorine, property of gold, transaction involving gold. Metaphoric and elliptical uses are not included. Transformational grammar does not structurally distinguish all these relationships*. In our model we call upon syntax to signify that there exists a relationship expressed through the word OF and allow pragmatic information to determine the nature of the relationship which obtains between gold and each of the other concepts listed above. Similarly, the Syntax Crystal would resolve the ambiguity of THE SHOOTING OF THE HUNTERS not in the syntax, but by pragmatic information gained at a coroner's inquest or a meeting of the SPCA.
Syntax need only encode information about how the lexemes are related, not how the concepts they stand for are related. Since creatures without language are capable of understanding these ontologic relationships, it is not necessary to detail them within the syntax. A squirrel by its behaviour can tell us that it knows the difference between a box containing acorns and a box constructed of acorns. In the following section the Learning Algorithm will be specified and then applied to illustrate how it encodes the constituent and dependency relations as well as order and inflections.
*Some versions of "generative semantic" grammars can cope with this problem by inventing a new syntactic category for each ontologic relationship. But see Bolinger's (1965) argument that the proliferation of categories needed for codification of such relations is unending and therefore the project is impossible.
The Learning Algorithm

1. Basic hierarchical connection: when two lexemes or phrases which name concepts the robot can connect in the Associator are presented together, the rule to generate Syntax Crystal cards is: place two blank rectangular cards with short edges adjacent. The first lexeme presented is entered on the bottom of the left card, the second on the bottom of the right card. Codes are assigned to the top of each card according to the procedures given below. Two more blank cards with short edges adjacent are placed above the coded cards. These new cards are given codes on their bottom edges to match the codes on the top edges of the lower cards. The adjacent edges of the upper pair are given matching codes.

2. To connect three or more lexemes: first connect an adjacent pair of the lexemes as above, then treat this array as a single lexeme card to connect it, as above, to the third card, and so on. Which two cards are connected first will be determined by the robot's semantic system.*

3. The codes on short edges cannot be matched with codes on long edges and are therefore independent of them, except for the following requirement: whenever a structure card is given a side code not found on either side of other structure cards, its bottom code must also be made unique.

4(a). If replacing a constituent with a new lexeme or string results in an acceptable sentence, the new "candidate constituent" receives, at its attachment edge, the same code as the one for which it is substitutable.† As the need

*Consider the example BIG ROBOT GOES. The robot's learning the meaning of BIG involves connecting that lexeme to some Associator dot which is active when the (distance-corrected) size of the object in the robot's perceptual field is greater than some value. Call the satisfaction of this inequality "B". ROBOT would be connected to the appropriate shape detector output "R" and GOES would be connected to the movement detection system "G". The robot should first join BIG and ROBOT and then connect them as a unit to GOES because, given any reasonable experience with various size objects in motion, its Associator will have: (B connected to R) connected to (G) rather than (B) connected to (R connected to G).

†Perhaps no lexeme is grammatically substitutable for another in all constructions. To allow the reuse (generality) of structure cards we will limit the robot's substitution test to some number of trials, supposing that a devaluing correction procedure can be applied later in case an important substitution has been missed. (The number of trials chosen may facilitate or hinder the robot's linguistic progress, just as any generalization may be too broad or too narrow. Subprograms to (1) vary the number of future substitutions inversely with the number of corrections required and (2) try substitutions for lexemes that have some codes in common, would be helpful.) This procedure will produce a unique set of grammatical classifications for each robot, but with much overlap. But that is as it should be. The range of grammatically unacceptable strings varies with different grammars and different speakers (Lyons, 1968, p. 152). Unacceptable strings which do not violate the grammar will be considered attempts to relate concepts that are not related.
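Procedure 1 can be sketched as card construction (our rendering only; cards are plain dicts, and the code names are supplied by the caller, standing in for the robot's code-assignment procedures described in the text):

```python
def basic_connection(lexeme1, lexeme2, code1, code2, pair_code):
    """Procedure 1: lay down two lexeme cards with short edges adjacent,
    then place two structure cards above them, with bottom codes matching
    the lexeme cards' top codes and adjacent edges sharing a new code."""
    left_lexeme = {"lexeme": lexeme1, "top": code1}
    right_lexeme = {"lexeme": lexeme2, "top": code2}
    left_struct = {"bottom": code1, "side": pair_code}
    right_struct = {"bottom": code2, "side": pair_code}
    return [left_lexeme, right_lexeme, left_struct, right_struct]

# The GOOD ROBOT example: four card types are generated.
cards = basic_connection("GOOD", "ROBOT", "J1", "N1", "A")
print(len(cards))  # -> 4
```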
for finer syntactic distinctions develops, the errors resulting from overgeneralization are corrected by procedure 7 below. In other words, this requirement insists that existing connection codes be tried before new codes are created as in (b), which follows.

(b). If substitution results in an ungrammatical string, the constituent must be given a new top code (and consequently a new structure card above must be made).

5. Additional codes can be entered on the top of existing lexeme cards.

6. Optional codes can be added to the top of structure cards in order to match bottom codes of other cards. Alternatively, codes on the top of structure cards can be made optional (by addition of parentheses around the code) when a grammatical string can be obtained by leaving the coded edge unmatched.

7. When an ungrammatical string results from combining existing cards, discard (or devalue) the structure cards in the string.

To illustrate how these procedures generate a Syntax Crystal we use, for heuristic purposes, codes that suggest the grammatical categories of linguistics: N for noun, V for verb, etc. To retain the heuristic advantages of these letter codes, the different codes needed for ordered modifiers are distinguished by subscripts. Given the string GOOD ROBOT, the four card types below will be generated, illustrated along with the corresponding connection rule. The reader should refer back to p. 585 for directions on how to derive the rules from the cards.

[Card diagrams omitted: the lexeme cards J1/GOOD and N1/ROBOT with their structure cards.] Rule: [J1 + N1]/J1.

The codes N1 and J1 must be different from each other by procedure 4(b) above, because GOOD GOOD and ROBOT ROBOT are not acceptable strings. A second sentence, ROBOT GO(ES), requires the robot to generate three more cards and add a code to the top of the ROBOT card as required by procedure 4(b) above. The rules will be: V1→GO(ES); N1,2→ROBOT; [N2+V1]/S1. A new string, GOOD ROBOT GO(ES), can be formed with the addition of an optional N2 on the top of the N1 card. The rules for this string are: [N2+V1]/S1; (N2)→J1/J1+N1; V1→GO(ES); J1→GOOD; N1,2→ROBOT. In this case it is clear that [GOOD ROBOT] GO(ES), rather than GOOD
[ROBOT GO(ES)], is the correct relationship. The robot has treated GOOD ROBOT as a unit and connected it to GO(ES) by the ROBOT column. If the constituent and dependency relations assigned by traditional grammar were irrelevant to the meaning of a string, the robot could connect the cards another way. In this illustration, however, we will connect the cards so that the usual dependencies are given (see also the footnote on p. 589). So far the robot has given each lexeme card a different top code. For efficiency we supply the robot with a testing procedure for substitutivity. Then, before coding the GO(ES) lexeme card, the robot would test to see if GO(ES) can be connected to other lexeme cards the way GOOD and ROBOT can. It would teletype GOOD GO(ES) and GO(ES) ROBOT. The proud parent of a human might initially take these utterances as baby-talk paraphrases of IT'S GOOD TO GO(ES) and ROBOT GO(ES), and so reinforce any connection between lexemes that the baby might have made.* But the parent of our robot is of the strict old school and will either reject (with the HUH? reaction) and/or correct these teletypings. Consequently, the top code for the GO(ES) lexeme card will be distinct from those of GOOD and ROBOT. The sentence GO(ES) FAST results in the rules: [V2+B1]/B1; B1→FAST; V1,2→GO(ES) from the rule: V1→GO(ES). ROBOT GO(ES) FAST can be formed with the addition of an optional V1 on the top of the V2 card, changing the rule to: (V1)→V2+B1/B1. Upon receiving the sentence ROBOT PUSH(ES), the card corresponding to V1→PUSH(ES) is created. The substitutivity test is then applied to PUSH(ES). The acceptability of ROBOT PUSH(ES) FAST modifies the rule to V1,2→PUSH(ES). With the robot's present state of information about lexicon and syntax, the PUSH(ES) card receives the same top code as the card for GO(ES). ROBOT PUSH(ES) HUMAN presents the robot with more work. Initially, HUMAN appears to deserve the same code as FAST, but the substitutivity test reveals that HUMAN is not substitutable for FAST in ROBOT GO(ES) FAST. Therefore, HUMAN receives a different top code. This card corresponds to the rule N3→HUMAN. An N3 structure card must be created to attach above it. By procedure 3, a new structure card, because of the new side code N3, is made to go next to it. The rule would begin V3+N3 for this connection, and change to V1→V3+N3/N3 for the connection to the ROBOT card. V3 must be added to the top of the PUSH(ES) card.

*Such grammatical tolerance on the part of human adults suggests that children learn that the co-occurrence of lexemes expresses a relation between the concepts they name. This paper does not deal with this discovery of the existence of syntactic connections.
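The substitutivity test just described can be sketched in a few lines (our illustration; the toy corpus and the trial limit are assumptions, and in the robot the acceptability judgment comes from the teacher's HUH?/correction feedback rather than a stored list):

```python
# Toy stand-in for the teacher: the strings the "parent" accepts.
ACCEPTABLE = {
    "good robot", "robot goes", "good robot goes", "goes fast",
    "robot goes fast", "robot pushes", "robot pushes fast",
    "pushes fast", "good robot pushes",
}

def substitutable(new, old, corpus=ACCEPTABLE, max_trials=5):
    """Procedure 4(a): swap `new` for `old` in up to `max_trials`
    known-acceptable strings; succeed only if every result is accepted."""
    trials = sorted(s for s in corpus if old in s.split())[:max_trials]
    return bool(trials) and all(
        " ".join(new if w == old else w for w in s.split()) in corpus
        for s in trials)

print(substitutable("pushes", "goes"))  # PUSH(ES) earns GO(ES)'s codes -> True
print(substitutable("goes", "good"))    # GOOD GOES is rejected -> False
```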
Presenting PUSH(ES) ROBOT requires that the V1 code of V1→V3+N3/N3 be made optional, and adds N3 to the top of the ROBOT card. GO(ES) UP TO HUMAN results in: B1→P1+N3/N3; P1→UPTO. ROBOT CAN PUSH results in: V1→M1/M1+V4; M1→CAN; V1,2,3,4→PUSH(ES). These 10 sentences given to the robot, along with the substitution tests and subsequent modifications, result in the following rules of a Syntax Crystal (phrase structure analogues are given on the right):

(1)  [N2+V1]/S1               S → NP + VP
(2)  (N2) → J1/J1+N1          NP → (Adj) + N
(3)  (V1,2,4) → V2+B1/B1      VP → V + (AdvP)
(4)  (V1,2,4) → V3+N3/N3      VP → Vtr + (NP)
(5)  B1 → P1+N3/N3            AdvP → Prep + NP
(6)  V1 → M1/M1+V4            VP → Aux + V
(7)  N1-3 → ROBOT             N → ROBOT
(8)  J1 → GOOD                Adj → GOOD
(9)  V1,2,4 → GO(ES)          V → GO(ES)
(10) B1 → FAST                AdvP → FAST
(11) V1-4 → PUSH(ES)          Vtr → PUSH(ES)
(12) N1-3 → HUMAN             N → HUMAN
(13) P1 → UPTO                Prep → UPTO
(14) M1 → CAN                 Aux → CAN
With these eight lexemes and six rules (11 Syntax Crystal cards which are elements of the rules) over 1000 well-formed sentences of nine words or less can be generated. Without even considering the dependency relations portrayed, the Syntax Crystal rules are more powerful than their phrase structure analogues. For example, there are strings which can be generated by the Syntax Crystal rules but not by the phrase structure rules given, while the contrary is not the case; e.g. ROBOT CAN PUSH HUMAN FAST and GO UPTO ROBOT FAST. More phrase structure rules than Syntax Crystal rules are needed to generate some strings; e.g. HUMAN PUSH(ES) requires phrase structure rules 1, 2, 4, 11 and 12 while only Syntax Crystal rules 1, 11 and 12 are needed. It is important to realize that these phrase structure rules cannot be generated directly by the Learning Algorithm. The Learning Algorithm requires the information found on the Syntax Crystal cards but not present in the derived phrase structure rules. For a further comparison, see the end of the next section.
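Claims like these can be spot-checked mechanically. Below, each Syntax Crystal rule is flattened into binary rewrites over its connection codes and fed to a naive recognizer (our sketch; the flattening deliberately ignores the cards' dependency geometry, and uninflected lower-case forms stand in for the lexemes):

```python
# Rules (1)-(6) flattened to binary rewrites over connection codes.
# A rule with several top codes, e.g. (V1,2,4) -> V2+B1/B1, yields one
# rewrite per top code; lexical entries come from rules (7)-(14).
BINARY = [
    ("S1", "N2", "V1"),                              # (1) [N2+V1]/S1
    ("N2", "J1", "N1"),                              # (2) (N2) -> J1/J1+N1
    *[(t, "V2", "B1") for t in ("V1", "V2", "V4")],  # (3)
    *[(t, "V3", "N3") for t in ("V1", "V2", "V4")],  # (4)
    ("B1", "P1", "N3"),                              # (5)
    ("V1", "M1", "V4"),                              # (6)
]
LEXICAL = {
    "robot": {"N1", "N2", "N3"}, "good": {"J1"},
    "goes": {"V1", "V2", "V4"}, "fast": {"B1"},
    "push": {"V1", "V2", "V3", "V4"},                # PUSH(ES)
    "human": {"N1", "N2", "N3"}, "upto": {"P1"}, "can": {"M1"},
}

def derives(code, words):
    """True when `code` can rewrite into exactly the word sequence."""
    if len(words) == 1:
        return code in LEXICAL.get(words[0], set())
    return any(derives(lhs, words[:i]) and derives(rhs, words[i:])
               for top, lhs, rhs in BINARY if top == code
               for i in range(1, len(words)))

print(derives("S1", "robot can push human fast".split()))  # -> True
print(derives("V1", "goes upto robot fast".split()))       # -> True
```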
Complexity of Syntax Crystal Structures

Using the structure cards of the Syntax Crystal Game, we can illustrate some of the grammatical features which can be achieved.* Structure cards controlling repeatable types of lexemes or phrases (more than one prepositional phrase may connect to a single verb: GO WITH HUMAN UP TO DOOR) repeat the code of the bottom edge on the top edge, permitting them to stack up recursively. Non-repeatable modifiers (e.g. the single noun phrase permitted after transitive verbs which do not take indirect objects) do not repeat the bottom code in their top code, so they cannot stack up.

Other functions are easily incorporated into the subscripting scheme. Unlimited embedding ("Horses who eat goats who eat mice who . . .") is permitted by the use of a card from which a clause can be generated that can contain the same type of card from which a clause can be generated, etc.
[Card diagrams omitted: relative-clause structure cards (R codes) joining WHO to noun and verb cards.]
Required modifiers are handled by the top codes on the cards of lexemes that require certain modifiers. For example, a count noun that needs an article will have a top code of N5,6. With this code, no matter how many adjectives occur in between, the existing structure cards guarantee that an article

*For ease of play, some cards have been modified from the form generated by the Learning Algorithm.
occur too. The lexeme card of a transitive verb that needs a direct object will be coded V2, ensuring that the card that adjoins it be the one that connects to a noun phrase on the right.
[Card diagrams omitted: the N5,6 count-noun cards and the V2 transitive-verb cards with their noun-phrase connections.]
The verb of an infinitive clause cannot have a modal auxiliary connection, but it can have the other connections of a main verb phrase. A Syntax Crystal does this by entering a new code, V13, on the connection to the verb of an infinitive phrase and adding this code to the top of all but the modal connection:
[Card diagrams omitted: infinitive-phrase cards (I codes) and the V13 verb connection.]
With these properties the cards of the Syntax Crystal game distinguish count nouns, mass nouns, abstract nouns, and pronouns; transitive verbs, intransitive verbs, propositional verbs, and verbs that take indirect objects; questions, relative clauses and infinitives; passive, progressive and passive progressive tenses. Sample Crystals illustrate the complexity of the grammar that such local rules can produce. We begin by matching a structure card to a word card:
N1-6 represents six distinct codes, N1, N2, N3, N4, N5 and N6. Several cards can now be attached:
[Crystal diagram omitted: structure cards attached above the ROBOTS lexeme card, through S1 to the copula IS, ARE, BE.]
A longer sentence can be obtained by the insertion of an appropriate card at the J3 top connection; no revision of the original structure is required:
[Crystal diagram omitted: the same crystal with an adjective card inserted at the J3 connection.]
Sometimes similar surface arrangements of words have different structural relations. For example:

[Crystal diagram omitted.]

Here FRIENDLY connects directly to AUTOMATA and not, as before, to ARE. Instead the whole phrase FRIENDLY AUTOMATA connects as a unit to ARE. Note that the cards of the J3 connection do not actually touch. Cards lower down in the hierarchy have "positional priority" (with one exception described below) and higher cards must move horizontally to get out of their way. The above sentence can become embedded by connecting the optional top code T3. The card connecting to T3 must move to the left because of the positional priority of the lower cards, and the F1 code above it is matched by a card to the far right of all the lower cards.
[Crystal diagram omitted: the embedded sentence, built from the THAT card through T3, with FRIENDLY AUTOMATA and IS, ARE, BE.]
A study of this crystal will reveal that other strings such as THAT IS RECOGNIZED; AUTOMATA RECOGNIZE; FRIENDLY ROBOTS, can be formed with subsets of the cards. This sentence now has a new unconnected optional top code T3 which permits multiple embedding. Note that some cards are duplicates. The Learning Algorithm produces card types whose tokens can be used repeatedly in the same sentence. The P2 card with the arrow on it connects to a suffix inflection and is the exception to the positional priority principle. A card with an arrow on it has positional priority no matter how high it appears in the hierarchy; all other cards must move out of its way.* Modal auxiliaries are easily inserted without any other variation in the crystal structure. Note again the reuse of card types:
[Crystal diagram omitted: the same crystal with a modal auxiliary inserted, reusing the THAT, IS/ARE/BE, RECOGNIZE and FRIENDLY AUTOMATA card types.]

*This exception to positional priority is an artifact of the two-dimensional representation of the Crystal.
The above declarative sentence can be "transformed" into a question by changing just four cards, leaving the rest of the structure intact. The Syntax Crystal model can account for many variations in linguistic forms with relatively few changes in the rules, thus the cognitive operations proposed do not become implausibly large.
[Crystal diagram omitted: the question form of the preceding sentence, produced by substituting four cards.]
Moving the two columns on the right over and inserting other cards will produce the progressive form of this passive mode. The G1 card has positional priority, so the card under it must move horizontally out of the way. Notice that, because of the discontinuous constituent, any equivalent phrase structure rule must be ad hoc and specific to this particular production. This phrase
[Crystal diagram omitted: the progressive form of the passive mode, showing the discontinuous constituent.]
TABLE 2
Examples of the sentence types that the Syntax Crystal cards can generate

Particles, adverbs (A, B): The girl laughed aloud. A small dog ate it quickly.
Prepositions (D): The boy with measles eats with his hands. Tea is good with sugar.
-ING words (G), gerunds, verbids, participles: She is exercising before eating lunch. Is he thinking that seeing is believing? It is sad being alone. Dinner is being eaten inside. The hunting party may be leaving. Being eaten hurts. Knowing logic helps. Careless playing is dangerous.
Infinitives (I): Al hopes to be eating with Bert. Al helps Henry to be polite. Al wants Jim to be polite. Ann uses the wrench to help.
Adjectives and copula (J, Z): (Why) is this injured person eating? / the leader? / kind to me? / gentle with her? The hunting dog may be chained / sleeping / happy near the fire.
Nouns (N): The big dog smelled lunch from the yard. Is the boy with fleas who ate it? He thinks that Alice who likes big meals is the winner. Certainty is rare. Dreaming can be delightful.
Modal auxiliaries (M): (How) can fish sleep? The dog may be thinking.
Passive (P): Beaten eggs must be folded into the batter to be cooked properly. Is the car wanted by the owner?
Relative clauses (R): The child who is singing is the one who was sick. The dog who likes cats smells. The girl who was questioned by them fled.
Questions and sentence connections (Q, S): Can people who smoke think that others like to smell it? Is the audience waiting to hear an encore? Why are you ignored?
That clauses (T): That you know that is good. I think that I do.
Verbs (V): They are fixing cars poorly in the shop by skimping on parts to save money.
structure rule shows neither constituent nor dependency relations. Such a rule might be of the form: VP → AUX + BE + BE + -ING + V + -ED, -EN. This is a very important point: a major argument for transformational grammar is the inability of phrase structure to handle discontinuous constituents. The Syntax Crystal, as the above illustration shows, does not have this shortcoming. The Syntax Crystal can use the above cards to generate other constructions: CAN BE RECOGNIZING; CAN BE RECOGNIZED; IS RECOGNIZING; IS RECOGNIZED; IS BEING RECOGNIZED. Phrase structure must use new rules to do these. The Syntax Crystal model has had to rely on the slow and often too sympathetic simulation by human manipulation. To date, only a small subset of the model has been computer simulated. In the game simulation described below, the reader is asked to play the role of computer, joining cards by hand in hopes of finding weevils in the works. The authors would be grateful to learn of any ungrammatical constructions which resulted and which structure cards participated in the deviant structures. Table 2 presents a sample of the sentence types that we have generated.
General Instructions for the Syntax Crystal Game

Lightly shade the faces of the cards in each section of Table 3 with the color indicated. This makes it easy to group the cards in order to increase the complexity of the structure in several steps. Semantic coherence is provided by restricting the lexicon. The lexeme cards are changed with each increase of structural complexity. Glue the card pages to light cardboard and cut them out. You will also need a supply of blank cards to duplicate coded cards. As described earlier, one card may connect to another if they can be placed so that facing edges (with no card in between) bear compatible codes. Compatible codes are matching pairs of identical letters with the same subscript or subscripts whose ranges overlap. A code in parentheses is optional, i.e. it may be treated either as a blank edge or as a coded edge. Blank edges may touch each other as a consequence of other connections, but they are not to be considered connections in themselves. Linguists usually select a few sentences to illustrate how their rules work. Instead, we encourage the production of any sentence and structure possible, providing lexicons and structure cards of increasing variety for this purpose. The game demonstrates the power of the Syntax Crystal as well as providing an excellent method for discovering mistakes in the rules.
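The compatibility rule (identical letters, overlapping subscript ranges, parentheses marking an optional edge) can be rendered as a small checker (our rendering; code strings such as "N1-6" or "V1,4,9" follow the notation used in Table 3, and edges bearing several letter codes would simply list each code separately):

```python
import re

def subscripts(code):
    """Split a code such as 'N1-6', 'V1,4,9' or 'S1' into its letter part
    and the set of subscripts it covers; parentheses (optional edges) are
    stripped here and would be tracked separately."""
    code = code.strip().strip("()")
    m = re.fullmatch(r"([A-Z]+)([\d,\-]*)", code)
    if m is None:
        raise ValueError(f"unrecognized code: {code!r}")
    letter, subs = m.groups()
    nums = set()
    for part in filter(None, subs.split(",")):
        lo, _, hi = part.partition("-")
        nums.update(range(int(lo), int(hi or lo) + 1))
    return letter, nums

def compatible(a, b):
    """Facing edges connect when letters match and subscripts overlap."""
    la, na = subscripts(a)
    lb, nb = subscripts(b)
    if la != lb:
        return False
    return (not na) or (not nb) or bool(na & nb)

print(compatible("N1-6", "N2"))      # ranges overlap -> True
print(compatible("N1", "V1"))        # different letters -> False
print(compatible("(T3)", "T1,3,4"))  # optional edge, subscripts overlap -> True
```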
TABLE 3
[Table 3 in the original gives the faces of every lexeme and structure card for the game, laid out two-dimensionally; the card geometry cannot be reproduced here. The recoverable contents are summarized below.]

Deli (color green). Lexeme cards: A(N); AND; WITH; ON; DOUBLE; HOT; COLESLAW; BLT; RYE; PASTRAMI; LOX; MUSTARD; COFFEE; ORDER. Structure cards bear A, C, D, J, N, Q, S and U codes.

Porno (color pink; discard all C codes). Lexeme cards: HIS; THE; HUNGRILY; BRUTALLY; UNDER; NEAR; -ING; HOT; MOIST; HE, HIM; NYMPH; INTRUDER; NIBBLE(S); IS, ARE, BE; BUSY; LIPS; HAND; FONDLE(S); DEVOUR(S). Structure cards bear B, D, G, J, N, P, Q, S, T, V and Z codes.

Politics (color purple; no B1 cards used). Lexeme cards: THE; WITH; TO; -ING; DIFFICULT; DELIBERATE; MIGHT; COULD; STABLE; HOSTILITIES; PEACE; WAR; PROMISE; -(E)N, -(E)D; WHICH; IS, ARE, BE; OPPOSITION; ATTEMPT; ANY; NECESSARILY; BEFORE; BEYOND. Structure cards bear D, I, J, M, N, P, Q, R, S, V and Z codes.

Philosophy (color gray; V0 cards not used). Lexeme cards: TO; -ING; OPAQUE; POSSIBLE; MUST; PEOPLE; DESCARTES; CERTAINTY; LOGIC; -(E)N, -(E)D; WHO, WHICH; THAT; LEARN(S); KNOW(S); THINK(S); IS, ARE, BE; HOW. Structure cards bear F, G, M, Q, T, V, W and Z codes.
Playing the Syntax Crystal Game The first round of the game is played using only the green cards which include the vocabulary of a delicatessen. In a deli, language is used for only one purpose--to name something to eat. So the language consists of noun phrases (nouns and their modifiers) describing what the speaker wants and how much. Grammatical strings in a deli include: H O T C O F F E E A N D A B.L.T. W I T H COLESLAW ON RYE A D O U B L E O R D E R OF H O T PASTRAMI A N D COLESLAW In a good deli, verb phrases are never used because people are too busy ordering or making sandwiches to tell stories about who did what to whom. Just say what you want. The second round of the game explores the language of erotica. The green word cards bearing the deli lexicon are set aside. Add all the pink cards to the green structure cards (discard any green cards bearing " C " codes). You have now replaced the deli vocabulary by the porno vocabulary and combined the deli structure cards with those of the porno language. Pornography is characterized by simple, direct, action sentences. Subject noun does something to object noun. Subject, verb, direct object and a few choice modifiers are all that is required. Noun modifiers were used in deli language. Now we add verb modifiers and the copula IS. There are no clauses, subjunctives or anything to slow down the heart rate. The progressive tense is introduced here to keep things going. In order to generate sentences appropriate to the subject matter, use as many pink structure cards as possible. Grammatical strings include: T H E H O T N Y M P H IS D E V O U R I N G HIS MOIST I N T R U D E R HUNGRILY. D E V O U R I N G LIPS ARE BUSY U N D E R HIM. For a more interesting syntax proceed to the next round. Discard the pink word cards. Reserve for the philosophy language the pink cards with the bottom code " B " . Politics is a language full of qualifications, hesitations and side steppings. 
Here we need modal auxiliaries, relative clauses, infinitives, indirect objects, and the passive construction. The purple cards can generate or parse sentences such as:

DELIBERATE HOSTILITIES MIGHT PROMISE THE OPPOSITION A DIFFICULT WAR.
A PEACE WHICH IS STABLE COULD BE ATTEMPTED.
For the most political sentences, use as many purple structure cards as possible. To introduce even greater syntactic complexity, discard the purple lexicon and add the gray cards. These represent philosophy jargon. In addition to the structures allowed in the deli, the porno shop, and the campaign rally, philosophy has propositional verbs ("that" clauses), abstract nouns, questions, and gerunds. For instant philosophy, use as many of the gray structure cards as possible. You may get:

THAT LOGIC IS POSSIBLE MUST BE BEING LEARNED.
HOW COULD DESCARTES THINK THAT CERTAINTY IS POSSIBLE?

The Syntax Crystal game can be played to generate English sentences and their structures. One can start with any card (observing the color-coding on the lexeme cards for semantic coherence) to generate a sentence. This is different from other linguistic models. To illustrate, we start the game with a lexeme card. The green cards are dealt out to the players along with three blank cards each. The dealer chooses a lexeme card to begin the game. Only cards bearing matching codes on corresponding edges can be connected. From then on the players try to connect one of their cards to one of the cards already played. At any time a player may duplicate a previously played card on a blank card and play that. This will be necessary to complete most sentences because initially only one token of each card type is given out. The game ends when the crystal generated on the board has no (non-optional) coded edges still unconnected. The object of the game is to have as few cards as possible when the game ends. A secondary object is, of course, to create the most interesting strings. The players may continue to use the green cards on the next deal or go on to the next color.
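The matching rule that drives play can be sketched in a few lines of code. This is only an illustration, assuming a simplified card representation; the edge names and codes below are invented, not the game's actual coding scheme.

```python
from dataclasses import dataclass

@dataclass
class Card:
    label: str    # lexeme text, or "" for a structure card
    edges: dict   # side name -> code, e.g. {"top": "N"}; uncoded sides absent

def can_connect(a: Card, a_side: str, b: Card, b_side: str) -> bool:
    """Two cards may be joined only where corresponding edges bear
    matching codes; an uncoded edge never connects."""
    code_a, code_b = a.edges.get(a_side), b.edges.get(b_side)
    return code_a is not None and code_a == code_b

# Example: a noun lexeme card connecting upward into a structure card.
noun = Card("PASTRAMI", {"top": "N"})
np_node = Card("", {"bottom": "N", "top": "NP"})
print(can_connect(noun, "top", np_node, "bottom"))  # True
```

A legal play simply extends the crystal at any edge for which this predicate holds; the game ends when no mandatory edge remains for which any card could make it hold.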
Although more difficult, players may also simulate the generation of sentences which encode previously decided constituent and dependency relations corresponding to the ontological relationships of a particular message. This simulates the process described on pp. 582-583.

The Syntax Crystal game can also be played to parse sentences. One starts with a string of lexeme cards of the same color. Then one searches for the structure cards which will connect together all of those lexeme cards. This game is more difficult, for now we need a card that connects to a single edge and connects to the other parts of the structure as well. Unless a sentence is ambiguous, only one structure is possible (see Fig. 3). When the crystal structure is completed, the dependency relations and constituent analysis may be read by tracing connection pathways over the crystal.
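The parsing round amounts to a search for structure cards that join adjacent pieces of the string. The sketch below is a deliberately crude stand-in for that search, treating each structure card as a rule joining the top-edge codes of two adjacent cards; the code inventory and card set are invented for illustration and are not the game's actual cards.

```python
# Each entry reads: a structure card whose two bottom edges bear these
# codes, and whose top edge bears the resulting code.
STRUCTURE_CARDS = {
    ("ADJ", "N"): "N",     # adjective + noun
    ("DET", "N"): "NP",    # determiner + noun group
    ("V", "NP"): "VP",     # verb + object
    ("NP", "VP"): "S",     # subject + predicate
}

def parse(codes):
    """Greedy sketch: repeatedly join the first adjacent pair that some
    structure card accepts, until one code (or a dead end) remains."""
    codes = list(codes)
    while len(codes) > 1:
        for i in range(len(codes) - 1):
            pair = (codes[i], codes[i + 1])
            if pair in STRUCTURE_CARDS:
                codes[i:i + 2] = [STRUCTURE_CARDS[pair]]
                break
        else:
            return None   # no structure card fits: string rejected
    return codes[0]

# A porno-round string reduced to its lexeme-card top codes:
print(parse(["DET", "ADJ", "N", "V", "NP"]))  # S
```

A real game, of course, keeps the joined cards on the board, so the completed crystal itself records the constituent analysis that this sketch throws away.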
The Robot Family Language Game

This section describes a simulation, in the form of a game, of how the robot learns to use language as it learns about its world. Human players take the role of a naive baby robot and a parent robot who is competent in "robot language". Each team represents a family of two competing against other families to see whose baby robot first acquires competence in the language. There may be two or more teams. A simplified version is presented here.

The game is played on a chess board with two teams to a board. The player from each team taking the role of the baby robot uses a pawn of the appropriate color as a token. The team-mate playing the role of parent robot uses the matching queen as a token. The players may place their tokens on any unoccupied square on the board. Each player may touch directly only their own token (which represents their corporeal manifestation in the robot world). There are two passive pieces of each color: knight and rook. Each team chooses the initial positions of the pieces of its own color. Once placed, the rooks remain fixed in position for the rest of the game, while the knight may be pushed around the board by the active pieces in any pattern or direction; chess-move restrictions are not applicable. With the token the player may push around any movable piece on the board (including other players' tokens, except the rooks). None of the pieces are permitted to leave the world of the chess board.

The teams take turns moving. During a turn only one of the players on each team may move his or her token. On each move, there is no limit to the length and complexity of the move or to the number of passive pieces moved. During each turn, the baby robot player and the parent robot player each have the privilege of transmitting one message to their partner. A message is written on a piece of notepaper and consists of a string of capital letters, each standing for one lexeme.
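The physics of this small world is simple enough to state as code. The following is a minimal sketch of the movement rules just described, assuming an ad hoc piece representation; the names and error messages are our own.

```python
from dataclasses import dataclass

BOARD = 8  # the world is a standard 8 x 8 chess board

@dataclass
class Piece:
    kind: str      # "baby", "parent", "knight", or "rook"
    square: tuple  # (file, rank), each coordinate 0-7

def push(target: Piece, dest: tuple, pieces: list):
    """Move a piece being pushed by a player's token. Rooks are fixed
    once placed; no piece may leave the board or share a square."""
    if target.kind == "rook":
        raise ValueError("rooks remain fixed for the rest of the game")
    if not all(0 <= c < BOARD for c in dest):
        raise ValueError("pieces may not leave the world of the chess board")
    if any(p.square == dest for p in pieces):
        raise ValueError("square already occupied")
    target.square = dest

baby = Piece("baby", (0, 0))
knight = Piece("knight", (1, 0))
pieces = [baby, knight, Piece("rook", (7, 7))]
push(knight, (1, 1), pieces)  # the baby robot's token pushes the knight
print(knight.square)          # (1, 1)
```

Note that `push` deliberately imposes no chess-move geometry: any destination on the board is legal, which is exactly the freedom the game grants to knights and tokens.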
Smiles of approval and frowns of disapproval are the only other communication permitted. Only the parent robot players of each team are provided with a lexicon containing the capital letters paired with the English names of the concepts they stand for. There are several randomized pairings of letters with concepts to permit the same person to play several games as the baby robot. Also, the same set of concepts is not used for every game. This is an important theoretical point. Learning the lexicon of a language is more complicated than merely learning to affix a verbal label onto the appropriate one of a previously established list of concepts. Important also is discovering which of the features in the world perceivable to the language learner the particular language happens to encode. The robot is equipped to be sensitive to many features, but not every feature or combination of features is given a
name by any particular language (cf. L. Harris, 1972). Robot language, as well as having its lexemes represented by arbitrary capital letters, has a syntax that the baby robot must also learn. Often, the ability to understand sentences depends on knowledge of syntax. The parent robots are each given a set of grammatical rules to use in communicating with their baby robots. Table 4 is a sample lexicon and a grammar for one version of the game. Another version, simulating an inflected language, has no restrictions on lexeme order but provides a list of numerals that serve as subscripts or superscripts on the letters which represent the lexemes to indicate their grammatical function.

TABLE 4
Sample version of robot family language game: parent robot guide sheet

Lexicon: concepts and their English names

A = Babyrobot
B = Parentrobot
C = Rook
F = Knight
D = Go(es); D = Red (note homonym)
E = Push(es)
G = Black
H = Upto
J = Awayfrom
K = Slowly
L = Fast; M = Fast (note synonym)
Grammar: the order of the lexemes

Adjective modifying subject, Noun subject, Verb, Adjective modifying direct object, Noun direct object, Adverb modifying verb, Preposition, Adjective modifying object of preposition, Noun object of preposition.

Example 1: A D K H G C
English translation: "Babyrobot goes slowly upto black rook."

Example 2: B E A
English translation: "Parentrobot pushes Babyrobot."
Although the parent robots are expected to communicate "grammatically", there may be times when a message with less than complete sentences may be useful. For instance, it has been found to be advantageous for a parent robot to simplify the grammar in the early stages, reducing it to a kind of "baby talk" in order for a message to be more easily understood by the baby robot. Since either or both players on a team may transmit messages during each turn, the type of message may vary considerably. The baby robot may move
and the parent robot transmit a description of the move. The parent robot may itself move and provide a description of its activity. The parent robot may command the baby robot to do something and observe how well the message was understood. The baby robot may describe its own move and await a confirming or correcting message.

The object of the game is for the baby of each robot family to acquire the language as quickly as possible. The criterion of success is a fairly complicated sentence which is posted at the beginning of each game. The sentence is a "playscript" for the baby robots to act out. Each baby robot is permitted three attempts during each game. The team whose baby is first to follow the script perfectly is declared the winning family.

Using an adult human to simulate a baby robot, while unusual, is not any less legitimate than using an electronic computer to simulate human cognitive operations. It is only necessary to control or make irrelevant those properties of the simulator that are not shared by the simulatee. Thus, a computer's greater speed, component reliability, or skin color must not enter critically into a conventional simulation. In the present case, those abilities of adult humans not shared by the baby robot must be made irrelevant. This is done through the restrictions imposed by the artificial language and rules of the Robot Family Language Game. The human player is restricted to performing only the operations of which the robot is capable. To get a sense of the effectiveness of the simulation, the reader is strongly advised to observe, or participate in, a round of the game.

Table 5 is a condensed protocol of an actual game. An experienced parent robot player was paired with a novice baby robot player. All moves wherein the baby robot made no progress (or regressed) are omitted for brevity. In this experiment, the baby robot had a private scratchpad to record hypothesized co-occurrences of linguistic and environmental events.
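The scratchpad bookkeeping can be sketched as a simple co-occurrence tally. This is only an illustration of the record-keeping the text describes, not the paper's learning algorithm; the feature names below are invented.

```python
from collections import Counter, defaultdict

# lexeme letter -> tally of features present when the letter was heard
scratchpad = defaultdict(Counter)

def record(message: str, features: set):
    """After each turn, pair every lexeme in the received message with
    every feature noticed in the concurrent events."""
    for lexeme in message.split():
        scratchpad[lexeme].update(features)

record("A D K H G C", {"babyrobot-moved", "slow", "near-black-rook"})
record("A D K", {"babyrobot-moved", "slow"})

# "K" has now co-occurred twice with "slow": a candidate meaning.
print(scratchpad["K"]["slow"])  # 2
```

Over many turns the tallies for an unambiguous lexeme concentrate on one feature, which is how a baby robot player can converge on hypotheses like "K means slowly" without ever being told the lexicon.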
References

AJDUKIEWICZ, K. (1935). Die syntaktische Konnexität. Studia Philosophica (Warszawa) 1, 1.
BEVER, T. G., FODOR, J. A. & WEKSEL, W. (1965). Theoretical notes on the acquisition of syntax: a critique of contextual generalization. Psychol. Rev. 72, 467.
BOLINGER, D. (1965). The atomization of meaning. Language 41, 555.
BRAINE, M. D. S. (1963). On learning the grammatical order of words. Psychol. Rev. 70, 323.
CHOMSKY, N. (1959). Review of Skinner's Verbal Behavior. Language 35, 26.
CHOMSKY, N. (1965). Aspects of the Theory of Syntax. Cambridge: MIT Press.
CHOMSKY, N. (1966). Cartesian Linguistics. New York: Harper & Row.
HARRIS, L. (1972). A model for adaptive problem solving applied to natural language acquisition. Unpublished dissertation, Cornell University.
KATZ, J. J. (1966). The Philosophy of Language. New York: Harper & Row.
LASHLEY, K. (1951). The problem of serial order in behavior. In Cerebral Mechanisms in Behavior: The Hixon Symposium. Ed. L. A. JEFFRESS. New York: Wiley.
LYONS, J. (1971). Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
McCARTHY, J., EARNEST, L. D., REDDY, D. R. & VICENS, P. J. (1968). A computer with hands, eyes, and ears. A.F.I.P.S. Conference Proceedings. Vol. 33. Washington, D.C.: Thompson Book Co.
McNEILL, D. (1970). The Acquisition of Language: the Study of Developmental Psycholinguistics. New York: Harper & Row.
MILLER, G., GALANTER, E. & PRIBRAM, K. (1960). Plans and the Structure of Behavior. New York: Holt, Rinehart & Winston.
NILSSON, N. J. (1969). A mobile automaton: an application of artificial intelligence techniques. Proceedings of the First International Joint Conference on Artificial Intelligence. Washington, D.C., May.
PENROSE, L. S. (1959). Self-reproducing machines. Scientific American 200, 105.
ROSENBLATT, F. (1961). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington, D.C.: Spartan.
SELFRIDGE, O. G. (1959). Pandemonium: a paradigm for learning. In The Mechanization of Thought Processes. London: H.M. Stationery Office.
SKINNER, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.
WINOGRAD, T. (1971). Procedures as a Representation for Data in a Computer Program for Understanding Natural Language. MAC TR-84. Cambridge: MIT Artificial Intelligence Laboratory.