JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 6, 707-713
(1967)
Some Structural and Sequential Factors in the Processing of Sentences¹
LAWRENCE E. MARKS²
Center for Cognitive Studies, Harvard University, Cambridge, Massachusetts
The Ss were presented a number of pairs of simple sentences in which the Subject and Object were substituted one for the other. Test sentences, derived from the original normal sentences by inversion of a pair of proximal words or a word and a proximal phrase, were shown singly, and the latency for the identification of the source of each test sentence by S was measured. The results showed that, regardless of the syntactic type of sentence investigated, strings with inversions that interrupted major phrase boundaries had longer source-identification times than had strings with inversions that left phrase boundaries intact, and, for structurally equivalent types of inversion, the earlier in the string the inversion occurred the longer was the identification time. These results were taken as support for the hypotheses that linguistic rules (phrases) have some psychological reality and that left-to-right predictiveness is important in the processing of sentences.

¹ A portion of this paper is based upon part of a doctoral thesis submitted to the Department of Psychology, Harvard University. The research was supported by funds granted to Harvard University, Center for Cognitive Studies, by the National Institute of General Medical Sciences, grant #5T1GM-1011-04. The author is indebted to Professor George A. Miller for his aid and advice.
² Now at the John B. Pierce Foundation Laboratory, New Haven, Connecticut.

One technique for assessing the importance of various linguistic rules in the processing of verbal materials is to attempt to isolate each rule independently, violate it, and measure the amount of deviation from normal response that occurs. In this experiment disjunctive reaction times were measured for the identification of sentences when some word-order inversion occurred within the strings. That is, what was measured was the time to identify the normal, undistorted sentence from which each test sentence was derived. The word-order inversions that were studied were, for the most part, inversions of pairs of proximal words. This appeared to be the most elementary level at which to begin an analysis of the importance of various syntactic rules.

If linguistic rules have some psychological reality, then it may be expected that within-phrase distortions, i.e., distortions that leave major phrase boundaries intact, will be less disruptive and result in shorter identification times than will interphrase distortions, i.e., distortions that disrupt major phrase boundaries. This prediction would follow from the claim that the perception of entire phrases is an integral part of sentence processing (Fodor and Bever, 1965).

However, an emphasis upon the importance of linguistic rules in the processing of sentences can lead one to ignore another important aspect of sentence processing: its left-to-right nature. The importance of left-to-right constraints (predictiveness) is emphasized to a greater extent when the focus of the study of the
structural aspects of language is upon statistical orders of approximation. Howes and Osgood (1954), for example, demonstrated that the probability of emission of a particular word to a series of word stimuli is an increasing function of the number of stimulus words that have the response word as their primary associate. If left-to-right predictiveness is psychologically important, that is, if words that appear early in a sentence somehow limit the possible set of alternatives that may occur later, then another prediction can be made: given syntactically equivalent types of word inversion, the earlier in the string the distortion occurs, the greater should be the interference with predictiveness, the greater the uncertainty concerning what may follow, and the longer the time for sentence identification.

METHOD
Materials

Three sets of test sentences were constructed. The first set (active sentences) was formed as follows. First, three sentences were constructed: "Sudden floods cause rising tides," "Herbivorous animals eat carnivorous plants," and "Handsome men like beautiful women." Each could be converted into another meaningful, grammatical sentence by reversing the Noun-Phrases that appear in the Subject and Object, e.g., producing "Rising tides cause sudden floods." For each such pair of sentences eight derivative strings were constructed by the inversion of a proximal pair of words; four such derivative strings were made from each member of each pair. Thus there were ten sentences for each pair; this included the undistorted sentences themselves. All of the undistorted sentences are of the structural form Adjective1-Noun1-Verb-Adjective2-Noun2. The derived strings from each were formed by inversion of the order of the first Adjective and Noun (Adjective1-Noun1, e.g., "Floods sudden cause rising tides."), the second Adjective and Noun (Adjective2-Noun2, e.g., "Sudden floods cause tides rising."), the first Noun and the proximal Verb (Noun1-Verb, e.g., "Sudden cause floods rising tides."), and the Verb and the second Adjective (Verb-Adjective2, e.g., "Sudden floods rising cause tides.").

Another set (passive sentences) was constructed as follows. First, both undistorted sentences from each subset of active sentences were transformed into passives (e.g., "Rising tides are caused by sudden floods."). Each thus had the syntactic form Adjective1-Noun1-Verb-Adjective2-Noun2, where the Verb here is a compound of are-Verb-by. Four strings were derived from each member of each pair in an identical manner to those above. An example of a Noun1-Verb string would be "Rising are caused by tides sudden floods." Thus again each of the three subsets consisted of ten strings, which included the normal sentences.

The final set (infinitive sentences) was constructed in the following manner. First, three new sentences were formed: "To create numerous troubles entails possible dangers," "To establish new concepts requires major changes," and "To meet great fortune implies good luck." Again, pairs of normal sentences were formed by the reversal of Noun-Phrases within each initial sentence. The six sentences were syntactically identical: To-Verb1-Adjective1-Noun1-Verb2-Adjective2-Noun2. Four derived strings were formed from each original sentence by inversion of the first Adjective and Noun and second Adjective and Noun as above, and by inversion of the first Verb and Noun-Phrase (Verb1-Noun-Phrase1) and second Verb and Noun-Phrase (Verb2-Noun-Phrase2). These last two, for example, produced "To numerous troubles create entails possible dangers," and "To create numerous troubles possible dangers entails," respectively. Here again ten strings, including the original sentences, were formed from each initial pair.
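The construction of the active and passive derived strings can be illustrated with a short sketch. It is not part of the original method description: the function names and the representation of a sentence as a list of five units are assumptions made for illustration, with the passive compound "are caused by" treated as a single Verb unit, as the text describes.

```python
# Illustrative sketch only: names and data layout are hypothetical.
# Each label maps to the index of the left-hand unit that is swapped
# with its right-hand neighbor in an
# Adjective1-Noun1-Verb-Adjective2-Noun2 sentence.
INVERSIONS = {
    "Adjective1-Noun1": 0,
    "Noun1-Verb": 1,
    "Verb-Adjective2": 2,
    "Adjective2-Noun2": 3,
}


def invert(units, i):
    """Return a copy of units with positions i and i + 1 exchanged."""
    out = list(units)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def render(units):
    """Join units into a string, capitalizing the first letter."""
    text = " ".join(units)
    return text[0].upper() + text[1:] + "."


def derived_strings(units):
    """Build the four derived strings for one normal sentence."""
    return {label: render(invert(units, i)) for label, i in INVERSIONS.items()}


active = ["sudden", "floods", "cause", "rising", "tides"]
# The compound are-Verb-by is kept together as one unit.
passive = ["rising", "tides", "are caused by", "sudden", "floods"]

for label, string in derived_strings(active).items():
    print(f"{label:17s} {string}")
```

Applied to the passive list, the Noun1-Verb entry reproduces the example given in the text, "Rising are caused by tides sudden floods."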
Procedure

Each sentence was typed on a 5 × 8 in. card. The S sat in front of a half-silvered mirror. The sentence to be responded to could be seen in the mirror when illuminated. Beside the viewing box was a response box flanked by the two standard sentences (typed on 5 × 8 in. cards) that were the two normal sentences that formed one of the pairs of one of the three sets. The response box had two keys side by side; each key corresponded to one of the two standard sentences. After E signalled "Ready," he pressed a switch that illuminated the test sentence and started a Standard Electric Timer. The S was instructed to decide which of the two standard sentences each test sentence was derived from (or was identical to) and to respond as quickly as possible by pressing the appropriate key. Pressing a key automatically stopped the timer, removed the illumination of the sentence, and signalled E which key had been pressed.

The S was presented the ten test sentences for each standard pair, and all three pairs that form one set were given. Each S received only one set. Ten sec elapsed between the response to one test sentence and the onset of the presentation of the next, 1 min between sets of ten test sentences. The order of appearance of the various types of sentences for any standard pair was varied according to a partial Latin square design, and the order of presentation of the three subsets was counterbalanced among the six Ss who received each set. The Ss were all male and female college undergraduates who were paid for their services.

RESULTS

Geometric means of the latencies for the various types of test-sentence were calculated twice, once with all of the latencies included and again with the latencies for responses to the "incorrect" key excluded. Both sets of means are presented in Table 1. The numbers of "errors" that occurred for the various sentence types also appear in the table. The maximum possible number of "correct" responses was 36, since, within each set, each of 6 Ss had all three subsets in which each sentence-type appeared twice (formed from each of the two standard sentences).

Clearly, responses were most rapid to the undistorted sentences in all cases, i.e., regardless of the syntactic structure of the sentence (whether active, passive, or infinitive). Furthermore, in the case of Adjective-Noun inversions, for each of the three sets inversion of the first Adjective and Noun resulted in a string that took longer for identification of the source than did inversion of the second Adjective and Noun. Though the difference is significant only for the infinitive sentences and nearly significant for active sentences (according to a sign test, Table 2), the probability that the direction of such differences would be the same for all three sets is obviously quite small. Similarly, with the infinitive sentences distortion of the first
TABLE 1
GEOMETRIC MEAN LATENCIES (SEC) OF SOURCE IDENTIFICATION FOR STRINGS DERIVED FROM THREE TYPES OF SENTENCE

Type of derived string
Type of normal sentence   Normal        Adjective2-Noun2   Adjective1-Noun1   Verb-Adjective2      Noun1-Verb
Active                    2.43 (2.28)   2.46 (2.45)        2.62 (2.66)        2.96 (2.?4)          3.40 (3.22)
Passive                   2.68 (2.68)   3.07 (3.??)        3.41 (3.4?)        4.10 (3.70)          4.88 (4.27)

Type of normal sentence   Normal        Adjective2-Noun2   Adjective1-Noun1   Verb2-Noun-Phrase2   Verb1-Noun-Phrase1
Infinitive                2.88 (2.??)   2.91 (2.??)        3.72 (3.6?)        3.54 (3.42)          4.07 (4.08)

Number of "errors" (columns in the same order as above)
Active                    8             3                  1                  1                    4
Passive                   0             2                  0                  3                    15
Infinitive                1             2                  1                  2                    4

Note. The entries within parentheses do not include latencies for "incorrect" responses. A question mark stands for a digit that is illegible in the source.
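The two kinds of entry in Table 1 (geometric means over all latencies, and over latencies for "correct" responses only) can be computed as in the following sketch. The trial values below are invented for illustration and are not data from the experiment; the function name and the (latency, correct) tuple layout are likewise assumptions.

```python
import math


def geometric_mean(latencies):
    """Geometric mean: the exponential of the mean of the log latencies."""
    if not latencies:
        raise ValueError("at least one latency is required")
    if any(t <= 0 for t in latencies):
        raise ValueError("latencies must be positive")
    return math.exp(sum(math.log(t) for t in latencies) / len(latencies))


# Hypothetical trials for one type of derived string: (latency in sec,
# whether the "correct" key was pressed). Not values from the paper.
trials = [(2.1, True), (2.6, True), (9.5, False), (3.0, True)]

all_mean = geometric_mean([t for t, _ in trials])
correct_mean = geometric_mean([t for t, ok in trials if ok])

print(f"all latencies:          {all_mean:.2f} sec")
print(f"'correct' latencies:    {correct_mean:.2f} sec")
```

Because the geometric mean averages on a logarithmic scale, a single very long latency moves it less than it would move an arithmetic mean, which is one common reason for preferring it with reaction-time data.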
TABLE 2
ONE-TAILED PROBABILITIES OF DIFFERENCES IN REACTION TIMES BETWEEN TYPES OF DERIVED STRING, BY A NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION (SIGN TEST)

Type of normal sentence
                                                 Active                  Passive                 Infinitive
Type of derived string                           z       p       N       z       p       N       z       p       N
Adjective2-Noun2 vs. Adjective1-Noun1            1.50    .07     36      1.17    .13     36      2.17    .02     36
                                                 (1.59)  (.06)   (32)    (0.86)  (.20)   (34)    (2.53)  (.01)   (34)
Verb2-Noun-Phrase2 vs. Verb1-Noun-Phrase1                                                        1.02    .16     35
                                                                                                 (1.47)  (.08)   (29)
Adjective2-Noun2 vs. Verb-Adjective2             1.83    .04     36      1.70    .05     35
                                                 (2.11)  (.02)   (33)    (2.51)  (.01)   (31)
Adjective2-Noun2 vs. Noun1-Verb                  3.17    .01     36      4.50    .001    36
                                                 (3.62)  (.001)  (31)    (3.52)  (.001)  (19)
Adjective1-Noun1 vs. Verb-Adjective2             1.17    .13     36      1.50    .07     36
                                                 (0.86)  (.20)   (34)    (1.05)  (.15)   (33)
Adjective1-Noun1 vs. Noun1-Verb                  3.11    .001    36      4.50    .001    36
                                                 (2.51)  (.01)   (31)    (3.0?)  (.01)   (21)

Note. The entries within parentheses do not include pairs in which "incorrect" responses occurred. Both sets of entries exclude ties. Maximum N = 36. A question mark stands for a digit that is illegible in the source.
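The kind of test summarized in Table 2 can be sketched as follows. The paper states only that a normal approximation to the binomial distribution was used; the continuity correction below is an assumption, as are the function name and the example counts (23 of 36 untied pairs in the predicted direction), which are hypothetical rather than reported. With those counts the sketch yields z = 1.50 and a one-tailed p of about .07.

```python
import math


def sign_test(n_predicted, n_pairs):
    """One-tailed sign test by normal approximation to the binomial.

    n_predicted: untied pairs whose difference is in the predicted direction.
    n_pairs:     total number of untied pairs (ties already excluded).
    The continuity correction is an assumption, not stated in the paper.
    """
    if n_pairs <= 0 or not 0 <= n_predicted <= n_pairs:
        raise ValueError("need 0 <= n_predicted <= n_pairs and n_pairs > 0")
    # (|k - N/2| - 1/2) / (sqrt(N) / 2), written without fractions.
    diff = abs(2 * n_predicted - n_pairs)
    z = max(diff - 1, 0) / math.sqrt(n_pairs)
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


# Hypothetical counts, chosen only to illustrate the calculation.
z, p = sign_test(23, 36)
print(f"z = {z:.2f}, one-tailed p = {p:.2f}")
```

The p value is meaningful as a one-tailed probability only when the observed majority lies in the direction predicted in advance.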
Verb-Phrase, i.e., inversion of the first Verb and Noun-Phrase, resulted in a string whose source took longer to identify than the string with an equivalent distortion of the second Verb-Phrase (Table 1). Though again this difference itself was not quite significant (Table 2), the left-to-right effect is manifested.

If the reaction times for strings in which the inversions cut across major phrase boundaries (Noun1-Verb and Verb-Adjective2) are compared to the reaction times for strings that maintain intact phrase boundaries (Adjective-Noun inversions), it may be seen that the former were clearly longer (Table 1), though not all of the comparisons reached significance (Table 2).

DISCUSSION
The results of the experiment seem amply to support the predictions made. With equivalent syntactic types of distortion, the earlier in the sentence the distortion appears, the greater the interference with processing and the longer the latency of source-identification. This was seen to occur both for Adjective-Noun and Verb-Noun-Phrase inversions. Furthermore, inversions that disrupted major phrase
boundaries interfered with sentence processing to a greater extent than did inversions that left them intact. It is unlikely that any of these differences are explicable in terms of the particular choice of words used, since Noun-Phrases were reversed in forming the pairs of standard sentences from which all of the test sentences were derived.

In order to interpret the results it is necessary to examine the relation between response times and "error" frequencies. Should a correlation appear (the longer the reaction time, the greater the frequency of "errors"), then the differences among response times would be explicable solely in terms of differences in interpretability. That is, the more ambiguous the "meaning" or interpretation of a derived string, the longer the time S might have spent just attempting to decide from which normal sentence the test sentence was indeed derived. For normal and for infinitive sentences it is obvious that there was no correlation (Table 1). Strings that had the longest reaction times, both when the "erroneous" responses were included and when they were excluded in the calculation of the means, did not in general also demonstrate greater variability in interpretation. In fact, the only abnormally large number of "errors" occurred for the Noun1-Verb inversion in the passive sentences. The fact that this string also had the longest identification time of all the passive strings allows the possibility that the above explanation is correct for this single case. On the other hand, since this is the only case in which this possibility is at all manifest, it does not seem that the explanation in terms of differences in interpretability is correct in general.

It is necessary to examine the left-to-right effect with regard to other variables of which it might be a function. For example, recent work in structural linguistics by
Chomsky (1965) has led to the development of the distinction, in the linguistic analysis of sentences, between deep structures and surface structures. The deep structure of a sentence represents what is semantically interpretable, while the surface structure represents what is phonologically realized. For the active sentences the sequential order of the words is essentially the same in both the deep and surface structures. When the active sentences are passivized, however, only in the surface structure do the positions of the two Noun Phrases (the logical Subject and Object) reverse. The word order in the deep structure of the passive sentences is essentially the same as that of the active sentences. The only difference is that the deep structure of the passive sentences contains a marker noting that the surface structure has a passive form. Then it can be asked whether the left-to-right effect in sentence processing depends upon serial position in the deep or the surface structure. If it is the former, then it would be predicted that the effect would reverse itself in the passive sentences, i.e., that a serially earlier distortion in the surface structure of passive sentences would have made the resultant string less difficult to process than would a later distortion. But clearly this was not the result; the effect was left-to-right (in the phonologically represented surface structures) for both active and passive sentences.

An alternative explanatory hypothesis that might be derived from Chomsky's model would involve the variable of hierarchical level. This hypothesis would state that the degree of interference with sentence processing would be a function of the degree of embedding in the surface structure at which the distortion occurs: the higher the hierarchical level, the greater the interference. For both active and passive sentences the first Adjective
and Noun appear higher in the surface-structural representation than do the second Adjective and Noun, since the former constitute a Noun Phrase that is an immediate constituent of a Sentence, whereas the latter form a Noun Phrase that is an immediate constituent of a Verb Phrase that is an immediate constituent of a Sentence. But this was not the case for the infinitive sentences, where the first Adjective and Noun occur at a lower hierarchical level than do the second Adjective and Noun. This is so because the second Adjective and Noun form a Noun Phrase that is a constituent of the major Verb Phrase, whereas the first Adjective and Noun form a Noun Phrase that is a constituent of a Verb Phrase that is itself a constituent of the major Noun Phrase. In other words, the number of rewriting rules that must be applied in order to produce the second Adjective and Noun for the infinitive sentences is smaller than the number needed to produce the first Adjective and Noun, whereas the reverse was true for both active and passive sentences. Yet the same left-to-right effect revealed itself in the data for the infinitive sentences.

Neither of the purely linguistic notions described above seems to account for the results of the experiment. The left-to-right effect found here seems to be independent of the syntactic structure of the sentence, at least for the three syntactic types examined here. The explanation given here is based upon the claim that sentences are processed from left to right and that words in the first part of a sentence make the words later in the sentence more predictable. That is, the first few words of a sentence give the reader or listener information concerning the semantic and syntactic structure of the sentence by limiting the possible word-classes and selections within classes that may come later. It is not being claimed that the first few words of a sentence necessarily limit the alternatives that may occur later to a greater extent than the last few words might limit the possibilities that could have occurred earlier; certainly being given "The dog buried the ___." does not make "bone" more predictable than being given "The ___ buried the bone." makes "dog." Rather, it is purely by virtue of the fact that spoken sentences are necessarily processed from left to right and that written sentences are normally read in that order that words early in the sentence are more constraining. That is, the first few words of a sentence give more information about what is to come than do the last few words for the very obvious reason that the first few words are first; when the last few words have been read or heard there is nothing that remains to the sentence (i.e., nothing else to be limited or predicted).

This is certainly not to say that verbal behavior should be described solely in terms of contextual constraints in a finite-state model. Thus, though the results of this experiment seem related to those of Howes and Osgood, the additional results of this experiment (that constituent structure, i.e., the maintenance of intact phrases, is important in sentence processing) and particularly of many other recent studies in psycholinguistics (Fodor and Bever, 1965; Miller, 1962) seem to demonstrate the necessity for the incorporation of more complex structural notions than statistical probabilities within any model of performance that is to be adequate. What seems to be necessary is some left-to-right non-finite-state grammar. That is, when a sentence is processed the expectations concerning the rest of the sentence which the initial parts set up must be described in terms of phrase structure and perhaps transformational rules such as those described by Chomsky. Thus the expectations would concern syntactic and semantic
categories and restrictions. This is certainly a complex matter, one whose surface is just barely being examined in contemporary psycholinguistics.

REFERENCES
CHOMSKY, N. Aspects of the Theory of Syntax. Cambridge: M.I.T. Press, 1965.
FODOR, J. A., AND BEVER, T. G. The psychological reality of linguistic segments. J. verb. Learn. verb. Behav., 1965, 4, 414-420.
HOWES, D., AND OSGOOD, C. E. On the combination of associative probabilities in linguistic contexts. Amer. J. Psychol., 1954, 67, 241-258.
MILLER, G. A. Some psychological studies of grammar. Amer. Psychologist, 1962, 17, 748-762.
(Received January 10, 1966)