Semantic heuristics and syntactic analysis*
KENNETH I. FORSTER ILMAR OLBREI Monash University
Abstract
This paper investigates the hypothesis that the component of sentence processing time directly attributable to syntactic processing depends critically on certain semantic properties of the sentence. Using two different procedures, it is found in a series of experiments that there is little evidence to support this view. Specifically, it is shown that syntactic processing time tends to be constant for sentences of varying semantic plausibility but constant syntactic structure, and further, that reversibility fails to affect sentence processing in a systematic way. These facts are interpreted as indicating that the recovery of the underlying structure of a sentence is controlled by purely syntactic properties of the input.
* This research was supported by a grant to the first author from the Australian Research Grants Committee. Requests for reprints should be addressed to K. I. Forster, Department of Psychology, Monash University, Clayton, Victoria, Australia 3168. The authors wish to thank Donald Thomson and Ivan Watson for reading earlier versions of this paper.
Cognition 2(3), 319-347
In order to provide an account of how a speaker of a language understands a sentence, one must come to terms with the fact that sentence comprehension is not a unitary process. The act of understanding a sentence involves decision-making at three levels of analysis, the lexical, syntactic and semantic levels, and it is generally assumed that there is considerable interaction between processing at any one level and processing at other levels. This assumption of interaction between levels greatly complicates the task of making inferences from experimental data, since we are unable to study the variables that control processing time at one level without also having to consider the effects of the same variables at other levels. Special problems arise in the case of experiments designed to study semantic processing. The appropriate experimental design holds syntactic structure constant and
studies the covariation of meaning and sentence processing time. However, this design assumes that there will be little or no variation in lexical or syntactic processing time as a function of meaning. The latter assumption seems untenable, especially in view of the widespread belief that the sequence of operations necessary to analyze the syntactic structure of a sentence can be drastically modified by certain semantic properties of the sentence (e.g., Slobin, 1966; Schlesinger, 1968; Bever, 1970; Schank, 1972). The purpose of this paper is to re-examine this belief and to present evidence from several experiments that bear on the issue. It will be argued that the current view of sentence processing is unnecessarily complex, and that there are grounds for making the simplifying assumption that syntactic processing is relatively independent of semantic processing, in the sense that the time required to analyze the syntactic structure of a sentence is constant, despite variations in the meaning of the sentence. Before outlining the experiments, it is necessary to at least sketch a preliminary model of sentence processing, indicating clearly the possible sources of interaction. As indicated above, we assume three levels of processing: Lexical, syntactic and semantic. Lexical processing refers to the processes of word recognition and lexical access (e.g., Oldfield, 1966; Rubenstein, Garfield and Millikan, 1970; Gough, 1972; Meyer and Schvaneveldt, 1971). The purpose of lexical processing is to gain access to the information contained in the mental lexicon about each lexical item in the sentence and involves a search through the lexical entries until a match is obtained between the perceived features of a segment of the input and the features contained in a lexical entry.
Syntactic processing refers simply to the formulation and testing of hypotheses about the grammatical relations that hold between the lexical items, while semantic processing refers to the formulation and testing of hypotheses about the meaning of the sentence. In the simplest case, we would assume that hypotheses about the syntactic structure are based exclusively on syntactic cues. These cues would consist of items of information such as the distribution of inflections, the surface order of formatives, the placement of relative pronouns (e.g., Fodor and Garrett, 1967) and the syntactic features of the lexical items provided by the lexicon (e.g., Fodor, Garrett and Bever, 1968). However, we must acknowledge the possibility that syntactic hypotheses might also be based on semantic cues, and here we would include the semantic features of the lexical items provided in the lexicon (e.g., selection constraints) and also purely pragmatic features derived from the speaker’s knowledge of the real world, such as the knowledge that dogs seldom wear golf-shoes or that firemen are typically employed to put out fires. Although there are bound to be areas of overlap between syntactic and semantic cues, this problem can be sidestepped by considering only relatively clear cases. Ideally, we would also wish to assume that syntactic hypotheses are tested by determining whether they are consistent only with the syntactic properties of the
sentence, but again, the possibility of interaction cannot be overlooked. Once formulated, a syntactic hypothesis may be evaluated by determining whether it is consistent with current semantic hypotheses. We shall return to this problem later. Finally, it should be noted that these proposals are not meant to imply that processing at any one level must be completed before processing at any other level can commence. Instead, processing may proceed at all three levels simultaneously, with syntactic and semantic hypotheses being constructed on the basis of quite independent cues. The most important consequence of this assumption is that the total time for sentence comprehension is not a simple sum of the times required for processing at each level.
[Figure 1. The six possible information channels between the three levels of processing.]
In Figure 1, the main information channels in the model are summarised and labelled. Channels A and B refer to the transfer of lexical information to higher levels, the former channel involving only syntactic features, and the latter involving semantic and pragmatic features. Channels D and E refer to feedback to the lexical processor
from the syntactic and semantic processors, which allows for the possibility that lexical search will be facilitated if the syntactic or semantic context of a given lexical item is known. Channels B and C represent the input to the semantic processor. The information in channel C stems from the syntactic processor and consists of a statement of the grammatical relations that have been isolated. Thus, hypotheses about the meaning of the sentence are based in part on a knowledge of grammatical relations, which are transferred piecemeal, as they are recovered. However, semantic hypotheses may also be based on purely lexical information (channel B). That is, in the sentence The doctor cured the patient, the semantic processor may be able to correctly predict the meaning of the sentence purely on the basis of the pragmatic information available about the key lexical items doctor, cure and patient. It is assumed that such hypotheses could be constructed quite independently of syntactic processing, and indeed, in some circumstances, these hypotheses may be accepted without checking that they are consistent with the known grammatical properties. Of central importance in this paper is the remaining channel F, which represents feedback to the syntactic processor from the semantic processor. In this case, information from the semantic level is used in a heuristic fashion to modify syntactic processing. This could occur in two ways. To take the example cited above, when the semantic hypothesis about the meaning of the sentence is formulated on the basis of pragmatic cues, certain predictions about the correct syntactic analysis may be formulated and relayed back to the syntactic processor. For example, it may be hypothesized that doctor must be the logical subject of cure, and the syntactic processor may merely have to determine whether this is a possible analysis.
This may well prove to be a more rapid method of determining syntactic structure than just relying on purely syntactic cues. That is, the analysis of structural relations is facilitated by considering the likely meaning of the sentence, in this case, owing to the fact that the correct analysis has been suggested by semantic hypotheses. Alternatively, discovery of the correct syntactic analysis may be facilitated by a more rapid elimination of incorrect syntactic hypotheses. Thus, the syntactic hypothesis that doctor is the object of cure (induced, say, on some purely syntactic cue) could be rejected immediately by noting that it is incompatible with the current semantic hypothesis. Obviously, there would have to be some procedure for weighting hypotheses derived from different sources of evidence, so that decisions could be made when conflicts arise. In what follows, the primary question of interest is whether channel F plays a significant role in sentence processing. Two positions on this issue can be contrasted. The first, which will be referred to as the constancy hypothesis, asserts that channel F is effectively inoperative, and that the component of total processing time directly attributable to syntactic processing remains constant despite variations in meaning.
The alternative view, which will be referred to as the interactive hypothesis, asserts that evidence from the semantic level of processing plays an important heuristic role in syntactic decision making, and hence it should be possible to show that the syntactic component of total processing time is markedly affected by semantic properties of the sentence. The major assumptions of the constancy hypothesis are as follows: (i) That the only inputs to the syntactic processor are the syntactic properties of the lexical items provided by the lexicon (channel A), and the cues available from the positioning of formatives in the input string; (ii) that decisions about syntactic analysis are made without reference to the possible meaning of the sentence; (iii) that although semantic hypotheses may be formulated on the basis of lexical information (channel B) while syntactic processing is still in progress, the only function of these hypotheses is to expedite semantic interpretation when the syntactic analysis becomes available; and (iv) that semantic processing is not completed until the syntactic analysis is available, and that semantic hypotheses must at least be checked for compatibility with the syntactic analysis. The major assumptions of the interactive hypothesis, on the other hand, can be represented as the denial of assumptions (i)-(iii), with an indeterminate position on assumption (iv). We are now in a position to interpret the evidence which bears on these issues. The most frequently cited evidence for the interactive view comes from the reversibility phenomenon (Slobin, 1966). When selection constraints dictate the assignment of the key lexical items in the sentence to the functions of logical subject or object (e.g., The girl is watering the flowers), the sentence is said to be nonreversible. When there are no such constraints (e.g., The girl is watching the dog), the sentence is said to be reversible.
Generally, nonreversible sentences are processed more rapidly than reversible sentences. This result establishes that reversibility affects part of the total process, without necessarily supporting the interactive position. That is, it could be that reversibility affects the time required to determine the meaning, not the structure, of the sentence. However, Slobin also showed that the usual difference in the time required to process actives and passives was absent for nonreversible sentences; passives took longer to process than actives only when there were no semantic cues. On the assumption that the active-passive difference reflects syntactic complexity, it is clear that this result supports the interactive position. Much the same result has been obtained for pragmatic cues (Herriot, 1969). In this case, nonreversibility is produced by expectancies about the probable meaning of the sentence. In nonreversible sentences, there are strong expectancies that constrain the assignment of nouns to grammatical functions (e.g., The doctor cured the patient), whereas in reversible sentences, these are absent (e.g., The boy spoke to the teacher). Once again, the active-passive difference was obtained only for reversible sentences.
The same model accounts for both results. The problem set for the syntactic analyzer is to decide whether the sequence $NP_1 \ldots V \ldots NP_2$ should be analyzed as an active or a passive sentence. For nonreversible sentences, a semantic test can resolve the issue immediately, whether we are dealing with logical reversibility (Slobin) or pragmatic reversibility (Herriot). In the first case, the incorrect hypothesis leads to an anomalous sentence, and in the second, it leads to an implausible sentence, and hence the hypothesis can be immediately rejected. However, the semantic test is of no assistance for reversible sentences, and further syntactic processing is required. Evidence for the constancy hypothesis has been provided by Forster and Ryder (1971) in an experiment with similar logic to the reversibility experiments, except that semantic properties were manipulated in a different way. The general design of experiments of this type is as follows: The relative difficulty of $N$ sentence types, $T_1, T_2, \ldots, T_N$, is measured under $M$ semantic conditions, $S_1, S_2, \ldots, S_M$, where a given sentence represents a combination of a particular sentence type and semantic condition, $S_j(T_i)$. In the reversibility experiment, there are two sentence types (active and passive), and two semantic conditions (reversible and nonreversible). If it can be shown that the relative difficulty of the $N$ sentence types varies markedly as a function of the semantic condition, then this constitutes support for the interactive hypothesis.
In the Forster and Ryder (1971) experiment, twenty different sentence types were studied under three different semantic conditions: (1) Where the meanings of the sentences were entirely plausible and relatively predictable from the meanings of the key lexical items (e.g., The queen danced at the ambassador’s ball); (2) where the meanings were relatively implausible (e.g., The author stared at his neighbour’s elbow), and it would be expected that semantic hypotheses based on lexical information would be less informative, and (3) where the sentences were semantically anomalous (e.g., The hotel arrived at the government’s bark).
In the first condition, syntactic analysis would be aided by the fact that the most likely semantic organization correctly predicts the appropriate syntactic analysis. However, in the second condition, the events referred to in the sentence were deliberately chosen to be relatively bizarre or unexpected. Knowing the meaning of the lexical items does not immediately suggest the correct syntactic analysis, and if syntactic hypotheses are sometimes rejected because they lead to implausible meanings, then it follows that at some stage, the correct syntactic hypothesis will initially be discarded, presumably to be later reinstated. Finally, in the third condition, where the sentences were semantically anomalous, it is obvious that semantic heuristics would be of no assistance whatever to the syntactic processor. These facts suggest that the time required to syntactically analyze the same structures under these three different semantic conditions would vary considerably. Within any one of these conditions, the relative difficulty of the twenty structures
would be a function of the time required to identify the structure of the sentence, and also the time required to determine the meaning (although these times are not necessarily additive). Since the meanings varied, any correlation between the three sets of estimates of processing difficulty would indicate a relatively constant effect of syntactic structure on processing time. The results obtained by Forster and Ryder (1971) showed clearly that the effects of syntactic structure on processing complexity were positively correlated across the three semantic conditions, despite the very considerable effects of implausibility and anomaly on overall performance. This result suggests that semantically based heuristics do not exert a profound effect on syntactic processing. However, the existence of a positive correlation due to structure does not rule out the interactive hypothesis, nor does it do more than provide the minimal support necessary for the constancy hypothesis to be true. That is, a positive correlation is a necessary condition for the constancy hypothesis to be true, but is by no means a sufficient condition. Thus we arrive at the point of central concern in this paper. Given the very strong apparent effect of semantic factors on syntactic processing observed by Slobin (1966), it would be expected that in the Forster and Ryder (1971) study it would be quite difficult to demonstrate a positive correlation due to syntactic structure. The latter study tends to support the constancy hypothesis, whereas the former disconfirms it. If both results prove to be empirically reliable, then it must be concluded that the interpretations we have offered are incorrect. The aim of the experiments to be reported in this paper was to determine whether both of these effects were sufficiently robust to be detectable under a new set of experimental conditions. 
The experimental technique chosen was an extension of the decision latency procedure used by Rubenstein, Garfield and Millikan (1970) to study word recognition. Their task required the subject to decide whether a given sequence of letters formed a word or not. This procedure can easily be extended to sentences: The subject is asked to decide whether a given sequence of words forms a meaningful sentence or not. Presumably, the only way such a decision can be made is to attempt to analyze the sequence as a meaningful sentence in English. Hence, decision times should reflect the total processing time required for the sentence. This technique has been used previously with some degree of success to study context effects (Dooling, 1972). In the first experiment, the constancy effect is re-examined using the decision latency technique, and the same procedure is then used in the next three experiments to study the reversibility phenomenon. Experiment 5 also examines reversibility, using the rapid visual presentation technique originally used in the Forster and Ryder (1971) study.
Experiment 1
The purpose of this experiment was to determine whether the syntactic constancy effects observed by Forster and Ryder (1971) can also be obtained with a decision latency technique. The design involves comparing the distributions of latencies of two sets of twenty sentences matched for syntactic structure, one set having plausible meanings and the other implausible meanings. The anomalous sentences used by Forster and Ryder (1971) could not be used in this design, since preliminary testing revealed that these sentences were almost always rejected by subjects, and hence that decision latency does not really reflect the total time required to analyze the structure and ‘meaning’ of these sentences. The specific prediction tested was that there would be a positive correlation between the processing times for plausible and implausible sentences of the same syntactic structure. In addition, a number of subsidiary hypotheses were considered. These were mainly concerned with the adequacy of the decision latency technique itself. Thus, regardless of the outcome of the first hypothesis, it should be the case that implausible sentences are associated with longer decision latencies than plausible sentences. In addition, it was expected that two-clause sentences would produce longer latencies than one-clause sentences of the same length.
Method
The sentences were taken from Forster and Ryder (1971). Twenty different surface structures were used, half consisting of one clause and half consisting of two clauses. On the basis of previous findings (Forster, 1970; Forster and Ryder, 1971) it was expected that the two-clause sentences would take longer to process than the one-clause sentences, although recent evidence (Holmes and Forster, 1972) suggests that this is not necessarily true for all two-clause constructions. Two versions of each surface structure were prepared, one having a plausible meaning, and one having a relatively unexpected, or implausible, meaning.
The actual sentences used are listed in the Appendix. All sentences contained seven words, and the average word length was equated for all four types of sentences. In addition, twenty ungrammatical sequences were included. These were not simple random word sequences, since this would enable a correct discrimination of sentences and non-sentences to be based on an analysis of only the first two or three words in the sequence. To prevent this, the ungrammatical sequences were designed so that, superficially at least, they resembled grammatical sequences. Examples of these items are:
Some of the oldest six down stooping.
About seven of nothing in the night.
The unbearable lightness such of running quiet.
The chorus of winning loud the lions.
Slides were prepared in such a fashion that when projected, the sentence was displayed as it would appear in a normal typewritten text, except that the lettering was white on a dark background. The initial letter of the first word of all sequences was in upper case. The subjects were instructed that their task was to determine as quickly as possible whether the presented sequence was an intelligible, grammatical sentence. If the sequence was a sentence, the subjects responded by pressing a push-button held in the right hand, but if it was not, no response was required at all. This arrangement was designed to simplify the response conditions as much as possible. Reaction times were measured from the onset of the test slide to the subject’s response. On those trials in which the subject failed to respond, the experimenter waited for ten seconds before terminating the trial. The action of the slide projector in changing slides served as a warning signal for the beginning of each new trial. The instructions provided clear examples of the types of items used, and the subjects were warned that some of the sentences would have unexpected meanings. It was also explained that the ungrammatical sequences would obviously be ungrammatical, and examples were given. Eight practice items were included to familiarize the subject with the procedure and materials. All subjects were tested individually, and a different random ordering of items was used for each subject. Ten undergraduate and postgraduate volunteers served as subjects. They were paid for their participation in the experiment.
Results
The principal analysis of the data concerned the hypothesis that there would be a positive correlation between decision latencies for the plausible and implausible versions of the twenty surface structures used in the experiment. In order to test this hypothesis, mean latencies were computed for each of the 40 sentences, averaging over the 10 subjects. Two sets of item means were computed.
For the first set, in order to eliminate the distortion introduced by exceptionally long latencies, the data for each individual subject were analyzed, and cutoff points were set two standard deviation units above and below the mean for each subject. Any observations exceeding these cutoffs were replaced by the appropriate cutoff value. These adjusted scores were then used to determine means for each item. For the second set of means, the reciprocals of the unadjusted raw observations for each subject were used. This transformation also limits the bias introduced by unduly long latencies and serves as a check on the first analysis. In both analyses, errors were excluded altogether. For both sets of data, there was a significant positive correlation between the means
for the plausible sentences and the means for the corresponding implausible sentences. The obtained product-moment correlations were .582 (p < .01) for the means based on the adjusted latencies, and .655 (p < .01) for the means based on the reciprocals. These results confirm the findings of Forster and Ryder (1971), and the obtained values are comparable to the value of .65 reported in the earlier study. Subsequent analysis of the results investigated whether decision times also reflected the number of clauses in the sentence, and the plausibility of the meaning. Means for each of the four conditions were computed for each subject, and these were analyzed as a 2 x 2 factorial with repeated measures on each factor. To ensure that the results were independent of the data metric chosen, analyses were carried out separately for the adjusted latencies, and for the reciprocals. Table 1 shows the means for the four conditions, using the adjusted latencies. As expected, two-clause sentences took longer to process than one-clause sentences of the same length, F(1,9) = 52.09, p < .01, and this effect was also observed in the reciprocal analysis. Also, implausible sentences took longer to process than plausible sentences, F(1,9) = 46.99, p < .01, with a similar effect for reciprocals. As can be seen in Table 1, the effect of adding an extra clause was not entirely constant. When an extra clause was added to a plausible sentence, decision times were increased by 252 msec, but in an implausible sentence, the extra clause added 364 msec to processing time. However, in this case, there was disagreement between the two analyses; for the adjusted latencies, the interaction of number of clauses and plausibility was not significant, F(1,9) = 3.43, p > .05, but in the reciprocal analysis it was significant, F(1,9) = 5.19, p < .05.

Table 1. Mean adjusted latencies (in msec) as a function of number of clauses and plausibility in Experiment 1

              Plausible   Implausible
One-clause    1281        1547
Two-clause    1533        1911
Difference     252         364

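The two data treatments and the item correlation described in the Results can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis procedure: the function names (`clip_to_two_sd`, `reciprocals`, `item_means`, `pearson_r`) and the dictionary layout are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used. Error trials are assumed to be simply absent from each subject's data.

```python
from math import sqrt
from statistics import mean, stdev


def clip_to_two_sd(latencies):
    """Adjust one subject's latencies (dict: item id -> msec).

    Values more than two standard deviations above or below that
    subject's mean are replaced by the cutoff value itself.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    values = list(latencies.values())
    m, s = mean(values), stdev(values)
    low, high = m - 2 * s, m + 2 * s
    return {item: min(max(x, low), high) for item, x in latencies.items()}


def reciprocals(latencies):
    """Alternative treatment: reciprocal of each unadjusted latency."""
    return {item: 1.0 / x for item, x in latencies.items()}


def item_means(subjects):
    """Average each item over the subjects who responded to it.

    `subjects` is a list of dicts (item id -> score), one per subject.
    """
    pooled = {}
    for scores in subjects:
        for item, x in scores.items():
            pooled.setdefault(item, []).append(x)
    return {item: mean(xs) for item, xs in pooled.items()}


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In use, each subject's data would be passed through `clip_to_two_sd` (or `reciprocals`), the results pooled with `item_means`, and `pearson_r` applied to the twenty plausible item means paired with the twenty implausible item means of matching surface structure.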
In studies of this kind, where variation between sentences is almost as great as variation between subjects, it is important to check that the effects are typical of all items. The appropriate analysis for this purpose is one which takes the means for individual items (sentences) as the basic data, not the means for each subject. In this case, the subject factor is collapsed, and the design becomes a straightforward factorial with a within-cells error estimate and no repeated measures. Under these conditions, a significant effect can only be obtained when the majority of items show the same
effect. This is not necessarily true of the subject analysis, where a small minority of items can produce a consistent effect for each subject. Accordingly, the item means were analyzed, and again, two analyses were carried out, one on the adjusted latencies and one on the reciprocals. For the adjusted latencies, both the number of clauses and plausibility produced significant results, F(1,36) = 12.96, p < .01 and F(1,36) = 12.34, p < .01, but the interaction term was not significant, F(1,36) = 0.40, p > .05. The reciprocal analysis showed exactly the same pattern. Thus it can be concluded that the main effects were general effects observed for all items, but that the interaction previously observed in the subject analysis was not typical of all items.
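The difference between the subject analysis and the item analysis lies in which unit the raw latencies are averaged over before the factorial analysis is run. The sketch below illustrates only that aggregation step, together with the interaction contrast implied by the Table 1 means. It is an illustration under assumptions: the record layout, `cell_means`, and `clause_by_plausibility_contrast` are hypothetical names, and no significance test is computed.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records, unit):
    """Mean latency per (unit, clauses, plausible) cell.

    `records` is a list of dicts with keys "subject", "item",
    "clauses", "plausible" and "latency" (hypothetical layout).
    unit="subject" gives the data for the subject analysis;
    unit="item" gives the data for the item analysis.
    """
    cells = defaultdict(list)
    for r in records:
        key = (r[unit], r["clauses"], r["plausible"])
        cells[key].append(r["latency"])
    return {key: mean(values) for key, values in cells.items()}


# Condition means from Table 1, keyed by (number of clauses, plausible?).
TABLE_1 = {
    (1, True): 1281,
    (2, True): 1533,
    (1, False): 1547,
    (2, False): 1911,
}


def clause_by_plausibility_contrast(means):
    """Extra-clause cost for implausible minus that for plausible sentences."""
    plausible_cost = means[(2, True)] - means[(1, True)]
    implausible_cost = means[(2, False)] - means[(1, False)]
    return implausible_cost - plausible_cost
```

Applied to `TABLE_1`, the contrast is the difference between the two clause increments reported in the text (364 msec and 252 msec).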
Discussion
The results of this experiment, taken together with the results of Forster and Ryder (1971), provide strong evidence that the time required to analyze a particular syntactic structure is approximately constant, despite gross variations in the plausibility and well-formedness of the meaning assigned to that structure. Such a result lends little support to an interactive view which argues that syntactic analysis is normally guided by semantic cues, although this hypothesis is by no means invalidated. The results obtained with the decision latency technique also confirm other findings. First, in agreement with earlier studies using accuracy of report as a criterion of performance (Forster, 1970; Forster and Ryder, 1971), two-clause sentences produced longer decision times than one-clause sentences of the same length, implying that in some sense, the clause constitutes a major unit of processing. In addition, semantically implausible sentences took longer to process than plausible sentences, a finding which is consistent with results obtained with a variety of techniques (Rosenberg and Jarvella, 1970; Herriot, 1969; Forster and Ryder, 1971). This is a particularly important result, since the decision latency technique is not open to some of the criticisms that could be levelled against earlier studies demonstrating the same effect. In particular, it cannot be argued that performance on plausible sentences is superior merely because subjects are better able to reconstruct the input sentence, a criticism which applies only to techniques requiring subjects to actually report the sentence. As noted earlier, the increment in processing time produced by the embedded clause was not exactly the same for plausible and implausible sentences (252 and 364 msec respectively), although this difference was significant in only one of the four separate (but not independent) tests of the interaction term.
Although the possible existence of an interaction between number of clauses and plausibility is of some interest, it should be noted that such an effect could be interpreted in a number of ways. For example, it could imply that implausibility increases the time required to
syntactically process an embedded clause, in which case some doubt would be cast on the constancy hypothesis. But equally, such an interaction might simply mean that the number of clauses affects not only syntactic processing time but also semantic processing time; that is, a two-clause sentence takes longer to process semantically than a one-clause sentence. This is a very real possibility but is in no way inconsistent with the constancy hypothesis. Although we have argued that the results provide support for the principle of constancy of syntactic processing times, it might be claimed that, in fact, the results show the opposite; the reported correlations between processing times for matched structures show that syntactic variables control less than 40% of the total between-sentence variance. This implies that at least 60% of the variance was controlled by non-syntactic variables, which might be taken as evidence that syntactic processing time was not constant across the two semantic conditions. This argument confuses syntactic processing time with total processing time. The constancy hypothesis relates to the former time, not the latter. However, observations can only be made on total processing time, and syntactic processing time is only one component of this, along with semantic processing time, measurement errors, sampling errors, etc. Thus there is an upper limit to the maximum possible effect that could have been observed, and this upper limit would have to be well below 100%, probably nearer to 50%, since the effect of plausibility (a semantic effect) was roughly equal to the effect of number of clauses (a syntactic effect). The major difficulty here is that we do not know how large a correlation to expect if the constancy hypothesis is true. If highly reliable data were used, and if syntactic processing time could be measured directly, then the correlation should be much higher.
But given that neither of these conditions was met, all that can be concluded is that the constancy hypothesis requires a non-zero positive correlation, which in fact was obtained.
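The point that a constant syntactic component can coexist with a modest correlation between total times can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis of the reported data: the additive model, the variance values, the constant plausibility cost, and the names `simulate` and `pearson` are all hypothetical choices made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_structures=2000, syntactic_var=1.0, other_var=1.5, seed=0):
    """Correlate total times for matched structures in two semantic conditions.

    Assumed model (hypothetical): total time = syntactic component + other
    components. The syntactic component is identical in both conditions
    (constancy); semantic processing and error terms vary independently.
    """
    rng = random.Random(seed)
    plausible, implausible = [], []
    for _ in range(n_structures):
        s = rng.gauss(0.0, syntactic_var ** 0.5)  # shared syntactic time
        plausible.append(s + rng.gauss(0.0, other_var ** 0.5))
        # Implausible versions carry an arbitrary constant extra cost,
        # which does not change the correlation.
        implausible.append(s + 0.5 + rng.gauss(0.0, other_var ** 0.5))
    return pearson(plausible, implausible)


r = simulate()
# Under this model the expected correlation is
# syntactic_var / (syntactic_var + other_var).
expected_r = 1.0 / (1.0 + 1.5)
print(round(r, 2), round(expected_r, 2))
```

In this sketch the syntactic component is perfectly constant across conditions, yet the correlation between total times stays far below 1 because the other components contribute independent variance.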
Experiments 2-3
In the next experiments, attention is directed to the reversibility phenomenon. As previously argued, this effect can be interpreted as showing that syntactic processing time is not constant when the number of semantic cues to the structure is varied. It should be noted that the reversibility effect is in no way incompatible with the results of the previous experiment. In Experiment 1, it was shown that syntactic processing time tended to be constant for plausible and implausible sentences. But this does not directly imply that it should also be constant for reversible and nonreversible sentences, since plausibility and reversibility are entirely orthogonal attributes. For example,
implausible sentences can be either nonreversible, as in The flea jumped over the skyscraper, or reversible, as in The flea jumped over the giraffe. Thus, empirically, it is entirely possible that both the constancy and reversibility results are correct. The conflict arises only when the implications of these results are considered, since reversible and implausible sentences share the common property of providing fewer semantic cues to structure than nonreversible and plausible sentences, respectively.
The aim of these experiments was to determine whether it is possible to reproduce the reversibility phenomenon, using procedures designed to eliminate some of the criticisms that can be made of earlier experiments (Slobin, 1966; Herriot, 1969). The first procedural change involved using the decision latency technique outlined in the previous experiment. We believe this to be a more direct index of sentence processing time than either of the techniques previously used, which at best could only be described as highly indirect.
The second procedural change involves the selection of sentences to be used in the experiment. One of the predictions to be made from the interactive theory is that reversible sentences will take longer to process than nonreversible sentences. However, reversible and nonreversible sentences must differ in meaning, and hence they may also differ in plausibility. Given the very strong effects of semantic plausibility on total processing times, it is clearly necessary to ensure that the two kinds of sentences do not differ in plausibility. There is some reason to expect that reversible sentences will often be less plausible than their nonreversible counterparts. Although we have not attempted to give any precise definition of plausibility, it is not too difficult to propose an operational test. For example, consider the procedure used by Rosenberg and Jarvella (1970) to manipulate what they call the semantic integration of a sentence.
Sentences were constructed by asking subjects to fill in the blank in sentences such as the following: The dog chased the ___. The majority of subjects will give cat as a response, and this fact is taken to mean that the sentence The dog chased the cat is semantically well-integrated. On the other hand, the fact that none give alligator as a response indicates that the sentence The dog chased the alligator is less well-integrated semantically. This distinction is extremely close, if not identical, to the intuitive distinction between plausible and implausible sentences. If we accept that plausibility can be indexed by the associative predictability of the sentence, then it can easily be seen that this index will not always be the same for reversible and nonreversible sentences. For example, consider some of the sentences used by Herriot (1969) to manipulate pragmatic reversibility. For nonreversible sentences, we have examples such as The doctor cured the patient, where any one of the three lexical items is highly predictable from the other two. For reversible sentences, we have examples such as The army assisted the navy, where associative connections
would be far weaker. Again, in the case of Slobin (1966), we have as an example of a logically nonreversible sentence, The girl watered the flowers. In this case, predictability is probably very high, although no higher than the example Slobin gives of a reversible sentence, The dog chased the cat (by Herriot’s standards, a nonreversible sentence). But if we chose to compare the nonreversible sentence with the following reversible counterpart, The girl obscured the flowers, we would obviously be confounding reversibility effects with associative predictability (plausibility) effects. These problems can be minimized by ensuring that both sets of sentences are matched for associative predictability, and hence plausibility.
In the experiments to be reported, the samples of items to be used were checked for variations in plausibility in the following way: The sentences were presented in a visually degraded form (blurred typewritten copies) to a panel of colleagues who were given unlimited time to reconstruct as much of the sentence as possible. The more predictable the sentence, the greater the probability that the entire sentence will be reconstructed. Also included in the test were the samples of one-clause plausible and implausible sentences used in Experiment 1. As would be expected, the percentage of words correctly reported for plausible sentences (79%) was considerably higher than for implausible sentences (58%), thus demonstrating that the technique is sensitive to plausibility.1 However, for the reversible and nonreversible sentences, there were only very small differences favoring the nonreversible versions: The predictability values for the reversible and nonreversible items used in Experiment 2 were 63% and 65%, and for Experiment 3 the corresponding values were 54% and 56%. Neither of these differences was significant.
The design of the experiments followed the design of Slobin (1966), and each consisted of a 2 x 2 factorial, the factors being syntactic type (actives vs. passives) and reversibility (reversible vs. nonreversible). If the reversibility effect is genuine, then the processing times for reversible sentences should be longer than for nonreversible sentences, and passives should require longer times than actives only for the reversible case. The same procedure was followed in both experiments, the only difference between the two experiments being the actual items used.
1. The correlation between plausibility and predictability can be estimated more precisely using the judgments of plausibility obtained for these sentences in the original study (Forster and Ryder, 1971, p. 292). The obtained product-moment correlation between the plausibility judgments and the mean percentage number of words reported for each sentence was .55, p < .02.
Method
The first step involved constructing a basic set of sentences, half actives and half passives. Each of these sentences was reversible. The active sentences contained five words and the passives contained seven words. Each of these sentences was then converted into a logically nonreversible sentence by changing either the first or the second nominal, ensuring that the nonreversible and reversible versions were, on the average, equally plausible. The average word length was also equated. Examples of the items are as follows:
Reversible: Four women touched the girl.
Nonreversible: Four women touched the skirt.
Reversible: Some teachers were dismayed by the parents.
Nonreversible: Some teachers were dismayed by the essays.
The sentences are listed in the Appendix. It will be observed that the constraints imposed by the design make many of the sentences relatively unnatural, but it was intended that this would have occurred equally often for reversible and nonreversible sentences. All the active sentences were of the general form NP ... V ... NP, and all passive sentences were of the form NP ... be ... V ... by NP. Thus the passives are always two words longer than the actives. This means that any difference in the processing time required for actives and passives will be partly due to structural differences and partly due to differences in number of words.
The procedure for measuring processing times was the same as in the preceding experiment. The sentences were presented in a random sequence by means of a slide projector, and subjects pressed a key if the sequence of words formed a meaningful sentence; otherwise no response was made.
Two sets of materials were used in each experiment (set A and set B). Each set contained actives and passives, half of each sentence type being logically nonreversible, half being reversible.2 If the reversible version of a given sentence were used in set A, then set B contained the nonreversible version, and vice versa. Thus, no single subject received both the reversible and nonreversible versions of a given item. In Experiment 2, each set contained 32 sentences, and in Experiment 3, each set contained 24 sentences. None of the sentences used in Experiment 2 was used in Experiment 3.
The number of distracters in each experiment equalled the number of test items. Basically, these distracters were designed to superficially resemble the test items. Typical examples are as follows:
That thought the woman was by person.
A poor substitute the amused.
The woman was only eaten the uncle.
The number light under side.
He was in also more.
The full measure was filled by empty.
In each experiment, 20 undergraduate volunteers served as subjects.
2. Three of the nonreversible sentences used in Experiment 3 are actually only pragmatically nonreversible.
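The assignment of item versions to sets A and B described in the Method can be sketched as follows. This is an illustrative reconstruction only: the function name `build_sets`, the data layout, and the rule of alternating within each sentence type are assumptions, and the sentences are examples quoted in the text rather than the full item lists given in the Appendix.

```python
def build_sets(item_pairs):
    """Split matched sentence pairs into two counterbalanced sets.

    item_pairs: list of (sentence_type, reversible, nonreversible) tuples,
    where the two versions differ in a single NP. Alternating within each
    sentence type (an assumed rule) gives each set about half reversible
    and half nonreversible items of each type, and no set ever contains
    both versions of the same item.
    """
    set_a, set_b = [], []
    seen = {}  # how many pairs of each sentence type have been assigned
    for sentence_type, rev, nonrev in item_pairs:
        k = seen.get(sentence_type, 0)
        seen[sentence_type] = k + 1
        if k % 2 == 0:
            set_a.append((sentence_type, "reversible", rev))
            set_b.append((sentence_type, "nonreversible", nonrev))
        else:
            set_a.append((sentence_type, "nonreversible", nonrev))
            set_b.append((sentence_type, "reversible", rev))
    return set_a, set_b


pairs = [
    ("active", "Four women touched the girl.", "Four women touched the skirt."),
    ("active", "The girl kissed the nurse.", "The girl kissed the photo."),
    ("passive", "Some teachers were dismayed by the parents.",
     "Some teachers were dismayed by the essays."),
]
set_a, set_b = build_sets(pairs)
for item in set_a:
    print("A:", item)
for item in set_b:
    print("B:", item)
```

Each subject would then see only one of the two sets, so reversibility is compared across versions of the same base sentence without any subject receiving both versions.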
Results
The decision latencies were analyzed in the same way as in Experiment 1, and the mean latencies over subjects in each condition for the two sets of items are reported in Table 2. The results for Experiments 2 and 3 were analyzed separately in a three-way factorial design, the factors being materials (Set A or Set B), sentence type (active-passive) and reversibility, with repeated measures on the last two factors.
Table 2. Mean adjusted latencies (in msec) as a function of sentence type in Experiments 2 and 3

                  Experiment 2                 Experiment 3
              Reversible  Nonreversible    Reversible  Nonreversible
Active           1023         1058            1037         1072
Passive          1201         1213            1191         1246
Difference        178          155             154          174
In order for the reversibility argument to be sustained, there are two essential requirements. First, reversible sentences must be more difficult to process than nonreversible sentences, and second, the difference between actives and passives must be substantially reduced for nonreversibles compared with reversibles. Neither requirement was satisfied in either experiment. First, in both cases, reversibles were slightly easier to process than nonreversibles, but this effect was significant only in Experiment 3, F(1,18) = 9.16, p < .01. Second, the interaction of sentence type and reversibility was not significant (F < 1) in either experiment, and in the case of Experiment 3, the difference between actives and passives was actually greater in the nonreversible condition. The only evidence of an interaction effect came from the subsidiary analysis of the reciprocals in Experiment 3, F(1,18) = 4.93, p < .05, but this makes little sense, since it implies that nonreversibility interferes with the processing of passives to a greater degree than for actives. Neither this effect, nor the significant reverse effect of reversibility in Experiment 3, was significant in the item analyses.
The only effect reaching significance in all analyses (including the item analyses) was the difference between actives and passives. This was significant at the 1% level in each case, with the F values ranging from 14.10 to 76.61.
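The two requirements of the reversibility argument can be checked descriptively against the cell means of Table 2. The sketch below only compares means; it says nothing about statistical significance, which depends on the analyses of variance reported above. The dictionary layout and variable names are assumptions made for illustration.

```python
# Mean adjusted latencies (msec) from Table 2, stored as (active, passive).
table2 = {
    ("Experiment 2", "reversible"): (1023, 1201),
    ("Experiment 2", "nonreversible"): (1058, 1213),
    ("Experiment 3", "reversible"): (1037, 1191),
    ("Experiment 3", "nonreversible"): (1072, 1246),
}

# Active-passive difference within each cell.
differences = {cell: passive - active for cell, (active, passive) in table2.items()}

summary = {}
for experiment in ("Experiment 2", "Experiment 3"):
    rev_active, rev_passive = table2[(experiment, "reversible")]
    non_active, non_passive = table2[(experiment, "nonreversible")]
    # Requirement 1: reversibles should be slower overall than nonreversibles.
    reversibility_effect = (rev_active + rev_passive) / 2 - (non_active + non_passive) / 2
    # Requirement 2: the active-passive difference should be clearly larger
    # for reversibles than for nonreversibles.
    interaction = (differences[(experiment, "reversible")]
                   - differences[(experiment, "nonreversible")])
    summary[experiment] = (reversibility_effect, interaction)
    print(experiment, reversibility_effect, interaction)
```

A negative first value means that reversibles were, on average, faster than nonreversibles, which is the opposite of what the first requirement demands.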
Discussion
Taken together, Experiments 2 and 3 provide a surprising disconfirmation of an almost universally accepted empirical fact. There can be no doubt that under the conditions of these experiments the reversibility of the NPs has little relevance for sentence processing. The implication of this result is either that the previously reported reversibility effects were in some way artifactual, or that the decision latency task does not adequately reflect total sentence processing time. We shall return to the first alternative later and for the present concentrate on the possibility of inadequacies in the experiments reported here. The most obvious possibility is that the subjects of the experiment were able to develop special strategies so that the decision latencies did not accurately index sentence processing time. For example, subjects may have been able to correctly classify sentences without completely processing the syntax or meaning of the sentence. In the case of a reversible sentence, for instance, there is no need to determine which NP is the logical subject and which is the logical object, since both versions would be well-formed. On the other hand, nonreversible sentences might require more detailed syntactic processing, since one of the possible arrangements of NPs would be anomalous. This might explain why reversibles were no more difficult to classify than nonreversibles, but would not explain why a large active-passive difference was found for both types of items. Another possibility is that little attention was paid to the meaning of the sentence, since the distracters were all ungrammatical, and hence there was no need to consider the meaning to arrive at a correct decision. The only way to ensure that both syntax and meaning are accurately analyzed is to manipulate the properties of the distracters so that errors will be made whenever an item is not fully processed. 
Thus, if some of the distracters are grammatical actives or passives, but are semantically anomalous, we can determine how much attention is paid to meaning by examining the error rates on these items. If meaning is routinely ignored, then the error rate should be close to 100%. Similarly, if some of the distracters are meaningful actives or passives, but contain minor syntactic errors, then we can determine how accurately the syntax is being processed by examining the error rates on these items. The next experiment examines the effects of including distracters of the above types.
Experiment 4
In this experiment, the test items were the sentences used in Experiment 3. However, instead of constructing a set of distracters that were merely superficially similar to sentences, the distracters were carefully designed so that in all respects, except that of being well-formed, they matched the properties of the test items. The two major classes of distracters used were as follows:
1. Anomalous items. In order to check that the subjects were determining the meaning of the sentences, eight semantically anomalous items were included, half active and half passive. These items were: The man built the cat; The idea o&fled the artist; The reporters printed the senator; The book wrote the author; The artist was painted by the picture; The boy was inserted by the mother; The workers were repaired by the priest; The feet were mentioned by the name. If the subjects were totally ignoring the meaning of the sentence, then it would be expected that a high percentage of errors would be made on these items. Of course, it should be noted that accurate classification of these items as malformed requires accurate processing of syntax as well.
2. Ungrammatical items. These consisted of 24 items, half active and half passive, which contained relatively minor grammatical errors. Within each sentence type, half would have been reversible sentences without the ungrammatical feature, and half would have been nonreversible. These items were based on the sentences used in Experiment 2. The errors consisted principally of changes in the correct word-order, lack of agreement and omission of articles. Typical examples are as follows:
Active reversible: Four women the touched girl; The boys seem liked girls; The teacher saved sorry boy.
Active nonreversible: The butcher recognized three hat; The girl the kissed photo; The plague three killed men.
Passive reversible: The student were annoyed by the woman; Some were teachers dismayed the by parents; Each child were identified by gentleman clever.
Passive nonreversible: The doctors was guided by her instinct; The stranger was overcome by now smoke; The statue was rescued quickly by artist.
In order to balance the number of sentences and distracters, an extra eight filler sentences were included. The experimental procedure was exactly the same as in the previous experiment, except in the following respects: (i) The subjects were provided with two response buttons, one for a ‘Yes’ response, one for a ‘No’ response (in the previous experiments, the subjects only responded to well-formed sentences). This procedure forces the subject to commit himself as rapidly as possible and produces higher error rates than the alternative procedure used earlier; (ii) the instructions
stressed the need for careful analysis of the items, with special attention being given to anomalous items. It was stressed that only fully meaningful and grammatical sentences should be responded to with a ‘Yes’ response. The subjects were asked to respond as quickly as possible, without making a large number of errors. A total of 20 undergraduate volunteers served as subjects, and they were paid for their participation in the experiment.
Table 3 shows the mean latency obtained in each condition, the method of analyzing the results being the same as in the previous two experiments. The first point to note is that the overall latency has increased markedly in comparison to Experiment 3 (see Table 2), even though the test items are exactly the same. This indicates that the manipulation of the distracters had the desired effect of forcing the subjects to analyze the test items in greater detail.

Table 3. Mean adjusted latencies (in msec) as a function of sentence type in Experiment 4

              Reversible  Nonreversible
Active           1436         1361
Passive          1709         1668
Difference        273          307
As in previous experiments, there is a large and significant difference between actives and passives, F(1,18) = 41.64, p < .01, with a rather larger effect being obtained in the present experiment. This effect was highly significant in all analyses. However, in contrast to the earlier experiments, reversibles tended to produce longer latencies than nonreversibles, although this effect is marginal. In the analysis of the mean latencies for subjects, the effect of reversibility just failed to reach significance, F(1,18) = 4.39, p > .05, but in the analysis of reciprocals, it reached significance, F(1,18) = 4.79, p < .05. However, in both item analyses, the effect was nonsignificant (F < 1). Whatever the status of this effect, it is not particularly strong and cannot be generalized over items. However, the central point to notice is that the classic reversibility effect is not obtained in that there is a substantial active-passive difference for both reversible and nonreversible items. Whatever interaction between sentence type and reversibility exists, it tends to go in the opposite direction to the predicted effect, with nonreversibles producing a larger difference than reversibles, although this interaction falls well short of significance in the analysis of the subject means, F(1,18) = 0.38, p > .05, with similar results in the other analyses.
Thus, on the basis of these results, there is no reason to reject the constancy hypothesis, since the slight effects of reversibility are equal for both sentence types. The most reasonable interpretation of the reversibility differences, if they are reliable, is that reversibility has affected either semantic or lexical processing. Furthermore, there can be no argument in this experiment about whether the subjects were in fact processing the sentences accurately. For six of the eight anomalous distracters, all twenty subjects responded correctly by rejecting the item as a meaningful sentence. Of the seven errors that occurred (equivalent to an error probability of .04), five were on the item The man built the cat, where there is some justification for arguing that the item was, in fact, perfectly well-formed. Thus performance on the anomalous distracters was virtually perfect, and there is no way in which this could be achieved without determining the meaning of every item in the experiment. This assumes, of course, that the error rate for the test items was also low, which was in fact the case: Over all items, the error probability was only .05. A similar conclusion applies to the processing of the ungrammatical distracters, although the error rates tended to be a little higher. For distracters based on active sentences, the error probability was .07, and for passives the value was .18. For 16 of the 24 ungrammatical distracters, at least 90% of the subjects correctly rejected the item, and the majority of errors were on items where errors of number were involved (e.g., The statue was rescued quickly by artist; The doctors was guided by her instinct).
Experiment 5
The final experiment again examines the reversibility issue, this time using a quite different experimental task. It might be argued that the decision latency technique represents an unnatural situation, somewhat analogous to proof-reading, in which the subject is asked to attend to relatively minor technical details of syntax, rather than attempting to interpret the input in a meaningful way. Accordingly, it was decided to use a different task, which places considerable emphasis on the subject’s ability to rapidly organize the input in meaningful terms. The task chosen was the rapid serial visual presentation (RSVP) technique, which had also been used in the original experiment of Forster and Ryder (1971). Briefly, this technique involves presenting each word of the sentence successively at an extremely rapid rate (16 words/sec) with the subject merely being required to report as many words as possible. Each word is visually superimposed on the preceding word, which prevents the formation of a cumulative sensory trace of the entire sentence and forces the subject to process the input at the same rate as it is presented. The assumption underlying the procedure is that the presentation rate is slow enough to permit
each word to be identified but too fast to allow each word to be separately encoded into memory. However, if the subject is able to organize the input meaningfully, then the encoding operation is assumed to be far more rapid. That is, the rapid presentation rate essentially causes the subject to forget much of what he has seen, unless he can impose a meaningful organization on the sequence. Although there is obviously some doubt as to whether the subjects can always identify every word, this is not an important issue in the present context, since there is no reason to expect purely visual factors such as forward or backward masking effects to differ systematically as a function of syntactic or semantic variables. The precise details of the procedure and evidence indicating the suitability of the technique are available elsewhere and will not be repeated here (Forster, 1970; Forster and Ryder, 1971; Holmes and Forster, 1972; Holmes, 1973).
The items used in this experiment were very similar to the test items used in the preceding experiments and consisted of 24 sentence-pairs, with one member of each pair being logically reversible, the other logically nonreversible. As in the earlier experiments, half the items were actives and half were passives, with the reversible and nonreversible versions being identical except for one of the NPs. The actives were all seven words in length, and the passives were all nine words in length. The procedure for checking the predictability of the sentences was also the same as in the earlier experiments. The results of this analysis showed no significant differences in predictability as a function of reversibility.
As in previous experiments, two sets of materials were prepared. The sentences were filmed so that each word occupied a single frame of a 16 mm movie film, with a ready signal preceding the sentence. The sentences were presented by means of a variable speed motion analyzer projector at a speed of 16 frames/sec.
After each sentence had been presented, the subject was asked to write down as much as possible, with no constraints being placed on guessing. Two films were prepared so that if the reversible member of a sentence-pair was included in the first film, the nonreversible member was included in the second. A total of 20 subjects were used, half being assigned to each film. In scoring the responses, it was necessary to select a procedure which allowed comparisons between actives and passives. The normal measure, number of words correct, is inappropriate since the passives were longer than the actives. The simplest method (to be referred to as method L) is to score the number of key lexical items reported, these items being the two NPs and the verb. This has the virtue of being relatively objective and allows a direct comparison of actives and passives. However, it ignores the question of whether the subject understands the grammatical and semantic relations between these items. This problem can be overcome by scoring extra points for reporting the grammatical relations correctly. That is, not only would the
subject receive one point for reporting the logical subject of the sentence, he would get an extra point if it were also clearly the logical subject of his reported version, similarly for the logical object, and the voice of the verb. Thus the maximum score for a sentence would be 6, and this would indicate that the essential features of the meaning of the sentence had been correctly reported. However, this technique (referred to as method S) also involves an element of subjectivity in the scorer, and it is for this reason that both sets of results are reported.
Table 4 shows the mean scores in each condition under both scoring systems. For scoring system L, the maximum possible score is three, and this is obtained if all three key lexical items are reported (inflections and order ignored). For scoring system S, the maximum possible score is six, and this is obtained if all three key lexical items are assigned the correct underlying grammatical function. Fortunately, the overall pattern of the results is identical for the two scoring systems. In both cases, there is a highly significant difference between actives and passives, F(1,18) = 25.64, p < .01 for scoring system L, and F(1,18) = 23.92, p < .01 for system S. However, for both analyses, neither the main effect of reversibility, nor the interaction of reversibility with sentence type was significant, F < 1.

Table 4. Mean number of key lexical items (L) and mean semantic score (S) per sentence in Experiment 5

                Reversible        Nonreversible
                L       S          L       S
Active         2.18    4.21       2.13    4.13
Passive        1.84    3.24       1.83    3.26
Difference     0.34    0.97       0.30    0.87
Thus, once again there is no reason to reject the constancy hypothesis, since reversibility fails to influence the difference in performance on actives and passives.
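The two scoring systems used in this experiment can be sketched in code. This is a hypothetical illustration rather than the scoring procedure actually applied: the dictionary layout, the names `score_l` and `score_s`, and the rule that a relational point is awarded only when the corresponding item was itself reported are assumptions. The step of deciding which NP a written report treats as logical subject or object was a judgment made by the scorer, and is represented here as hand-coded input.

```python
def score_l(target, report):
    """Method L: one point per key lexical item reported (maximum 3).

    Words are assumed to be reduced to base forms already, so inflections
    and word order are ignored.
    """
    keys = (target["subject"], target["verb"], target["object"])
    return sum(1 for word in keys if word in report["words"])


def score_s(target, report):
    """Method S: method L plus one point per correct relation (maximum 6)."""
    score = score_l(target, report)
    words = report["words"]
    if target["subject"] in words and report.get("subject") == target["subject"]:
        score += 1  # logical subject preserved in the reported version
    if target["object"] in words and report.get("object") == target["object"]:
        score += 1  # logical object preserved in the reported version
    if target["verb"] in words and report.get("voice") == target["voice"]:
        score += 1  # voice of the verb preserved
    return score


# Target: "Some teachers were dismayed by the parents."
# Logical subject = parents, logical object = teachers.
target = {"subject": "parents", "verb": "dismay", "object": "teachers", "voice": "passive"}

# A report such as "The teachers dismayed the parents" contains all three
# key items but reverses the relations and changes the voice.
reversed_report = {
    "words": {"teachers", "dismay", "parents"},
    "subject": "teachers", "object": "parents", "voice": "active",
}
# A fully accurate report.
accurate_report = {
    "words": {"teachers", "dismay", "parents"},
    "subject": "parents", "object": "teachers", "voice": "passive",
}

print(score_l(target, reversed_report), score_s(target, reversed_report))
print(score_l(target, accurate_report), score_s(target, accurate_report))
```

The example shows why both measures are worth reporting: the two reports are indistinguishable under method L, and only method S separates a report that preserves the meaning from one that does not.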
General discussion
The general conclusion to be drawn from the last four experiments seems inescapable: Under the conditions of these experiments, reversibility has very little impact on sentence processing. Why, then, was the effect of reversibility so pronounced in the experiments of Slobin (1966) and Herriot (1969)? The answer may lie in the nature of the items used. In the present experiment,
every effort was made to ensure that the reversible and nonreversible items were equivalent in all respects except reversibility; they differed only in one NP and were also matched for associative predictability. On the other hand, it is possible that in the earlier experiments, the nonreversible items tended to be more predictable, but it should be noted that this confounding would produce an equal facilitation for both actives and passives. Thus, an overall effect of reversibility could be explained in this way, but not an interaction between sentence type and reversibility. It is of interest to note a further property of the items used in this paper: The verbs necessarily had to be capable of appearing in both reversible and nonreversible sentences (since the verb was held constant). However, this was not necessarily true in the earlier studies. For example, it is difficult to construct a reversible counterpart for the nonreversible item The girl is watering the flowers merely by changing an NP. That is, the verb water appears more naturally in nonreversible environments.3 The tasks used by Slobin and Herriot also differ in important ways from the tasks we have used. The technique used by Slobin (1966) was the verification task, in which the subject must decide whether the sentence is a true description of a subsequently presented picture. There are several reasons for doubting results obtained with this procedure. First, as Gough (1966) has shown, there is some doubt that the verification task measures sentence processing time at all, but measures instead the time taken to compare sentence meanings with interpretations of pictures. Second, there is no guarantee that the various pictures are all equally easy to interpret. For example, if we take a nonreversible sentence, The girl kissed the photo, the appropriate picture may be quite easily interpreted. 
All that needs to be established is whether the picture contains a girl and a photo, and whether the action depicted is kissing. But in the reversible sentence The girl kissed the nurse, there may be some doubt as to who is doing the kissing. In fact the very nature of reversibility suggests that this may be a regular feature of the reversible picture. Third, it appears that the range of possible pictures which can follow reversible and nonreversible sentences must differ. For example, the reversible sentence The dog is chasing the cat can be followed by a picture of a dog chasing a cat, a cat chasing a dog, or something else altogether. However, the nonreversible sentence The girl is watering the flowers can be followed by a picture of a girl watering flowers or something else altogether, but not by a picture of flowers watering a girl, unless anomalous pictures are used. This suggests that the verification of a nonreversible sentence can be carried out at an earlier stage of processing than for a reversible sentence, and in some accounts of this experiment (e.g., Morton, 1966) it is explicitly assumed that syntactic processing is quite unnecessary for the nonreversible sentence to be verified. This is not a weakness of the experiment by any means, but if this account is correct, it means that the experiment tells us nothing about whether reversibility facilitates syntactic processing.
3. This suggests a classification of verbs as reversible or nonreversible. Verbs such as defuse or repair would almost always occur in nonreversible sentences, whereas verbs such as promise or resemble would almost always occur in reversible sentences. A further suggestion is that the property of reversibility resides in the verb rather than in the sentence, which would explain the lack of reversibility effects observed here, since, on this classification, all the verbs would have been neutral, i.e., occurring with equal probability in either kind of sentence. However, in several subsidiary experiments, using both the decision latency and the RSVP technique, no evidence has been found to support this notion. These experiments used sentence pairs such as The lecturer knew/taught the student; The soldier was recognized/dismissed by the general. In all cases, the usual active-passive effect was obtained, but there was a total absence of any effect of reversibility.
These problems were avoided by Herriot (1969), who measured the time taken to give the logical subject and object of the sentence, in that order. While this task cannot be dismissed entirely, it is open to obvious strategy effects, and it is an open question whether the task reliably measures sentence processing time. The principal difficulty is that it is probably far easier to give the logical subject and object when the order in which they must be reported is the same as the order in which they occur in the sentence.
The decision latency technique used in Experiments 1-4 does not appear to suffer from the same weaknesses. There is no simple strategy that can be adopted, provided that the distracters are suitably designed. In Experiment 4, where the distracters were made as similar as possible to the test items, any attempt to base a decision on an incomplete analysis of the syntax of the sentence would produce a high error rate. Similarly, any attempt to ignore the meaning of the sentence would have produced high error rates on the semantically anomalous distracters.
Even in the experiments where the distracters only superficially resembled the test items, it seems likely that both syntax and meaning were processed, since in all experiments, the active-passive difference was substantial, and in Experiment 1, there was a large effect due to both number of clauses and plausibility. However, it must be conceded that the decision latency task might introduce a certain artificiality, in that the precision of processing required is probably higher than in normal sentence processing. That is, the subject must behave more like a proof-reader than is normally the case. But this is not a deficiency in the technique; the goal is to understand how we process the syntax and meaning of sentences under a variety of conditions, not just a particular set of conditions where there is little premium placed on accuracy. By manipulating the properties of the distracters, we can potentially select any degree of precision we wish. It is interesting to note, incidentally, that it was only in Experiment 4, where the requirements were the most stringent, that any sign of an effect of reversibility was apparent. In this case, however, the effect (weak as it was) was equal for both actives and passives, thus supporting the constancy hypothesis.
The RSVP technique used in Experiment 5 potentially allows for quite different processing. The main weakness of the technique is that the subject must report the original sentence, which means that the report can be contaminated by the effects of guessing. This presents no special problems in the current situation, since the sentences were equated for predictability. However, the RSVP task differs from the decision latency task in that there is no reason why the subject must process the syntax at all. In the decision latency task, even if the meaning of the sentence had been correctly guessed on the basis of lexical information, the input string still had to be checked for grammaticality. But in the RSVP task, it is generally assumed that the critical factor governing performance is how rapidly the meaning of the sentence can be determined. Hence, if the meaning had been processed through channel B, then this would facilitate the subject’s report. If, as was suggested earlier, the meaning of nonreversibles can be determined without having to rely on syntactic analysis, then we should have observed a large effect of reversibility in Experiment 5. The fact that no effect occurred implies either that B-type processing did not occur at all or that it was too slow to be of any assistance. Further, the fact that a strong active-passive difference was found for both reversible and nonreversible items implies that syntactic processing was, in fact, attempted. In conclusion, we believe that the evidence presented here is sufficient to call into question the generality of the reversibility phenomenon. If this is accepted, then it appears that the case for the interactive hypothesis collapses, leaving the constancy hypothesis as the only viable alternative. 
It should be noted that the constancy principle does not require that syntactic analysis always precede semantic analysis, nor does it imply that there are no circumstances in which the meaning of a sentence could be obtained without syntactic analysis. The correct inference is that when syntactic analysis is required by the task conditions, it is executed without regard for the meaning of the sentence. This conclusion tends to suggest that there must be a psychologically real level of description which is purely syntactic, and quite independent of the semantic representation. Thus, we conclude that there is no evidence to indicate that the number of possible syntactic analyses of the sentence is reduced by the operation of semantic heuristics that apply prior to or during syntactic analysis. If such heuristics exist, then their operation must be so limited that they operate in exactly the same way, regardless of whether there is or is not a plausible semantic organization of the sentence (and, according to Forster and Ryder, 1971, even if the sentence is semantically anomalous), and regardless of whether the sentence is reversible or nonreversible. Nevertheless, it is quite clear that semantic heuristics play an important role in sentence processing, as shown by the very strong effects of plausibility in Experiment 1. We are now in a position to argue that the locus of this plausibility effect is more likely
to be in the semantic processing stage than in the syntactic stage. Evidently, semantic processing is organized in such a way that plausible meanings can be assigned more rapidly than implausible meanings. At the moment, we are unable to specify more precisely what is involved in plausibility. It may simply reflect variations in the associative connections between lexical items (e.g., Rosenberg and Jarvella, 1970; Collins and Quillian, 1972) or it may reflect something more abstract, such as the difficulty of constructing a reasonable context for the sentence. Whatever the nature of these effects, it seems likely that semantic interpretation must involve a heuristic stage of processing, in which hypotheses about the probable meaning of the sentence are formulated and tested, along the lines of analysis-by-synthesis routines.
Appendix
Sentences used in Experiment 1
Sentences are grouped by syntactic structure. The first sentence is the plausible version and the second is the implausible version.
The officials were given a warm reception. The aborigines were shown a rusty invention. Five girls waded into the large pool. Three bugs jumped over the mouldy meat. The wealthy child attended a private school. One gentle ghost haunted a scared parson. Some events greatly troubled the serious students. Their noise slowly deafened the pretty minister. The hungry boy found some dry bread. The clever fly made some tiny drugs. Several children raced to the burning building. Several lawyers rushed to the falling passage. The foreign film was an acclaimed success. The hideous plot was a continued failure. Nobody laughed at the boy’s silly mistake. Nobody climbed to the god’s frozen shrine. The queen danced at the ambassador’s ball. The author stared at his neighbour’s elbow. John smoked cigars throughout the dreary play. Mary chewed spears throughout the corrupt talk. The solicitor she wants is busy elsewhere. The daughter she hates is angry somewhere. The dress that Pam wore looked ugly. The aunt that Jim ate tasted foul. They expected their soldiers to approach quietly. They imagined their audience to applaud lightly. Your singing loudly disturbed the entire assembly. Her dying suddenly disrupted the amazing banquet. The choir sang hymns while we prayed. The babies rang bells while he fired. Having animals near us is terribly upsetting. Seeing libraries near us is slightly inviting. His father knew that he disliked marriage. His infant said that he inspected letters. Having aroused him she then left quickly. Having arranged him she then read bravely. The police accused us of trespassing again. The judge charged us with undressing often. Sue hoped nobody remembered the awful scene. Joan guessed nobody recognized the living form.
Sentences used in Experiment 2
Reversible versions take the first nominal in parentheses, and nonreversible versions take the second.
Four women touched the (girl, skirt). The butcher recognized the (man, hat). The boys liked the (girls, books). Poor John needed (his companions, some attention). A man treated the (farmer, disease). The girl kissed the (nurse, photo). The teacher saved the (boy, box). A cleaner discovered the (watchman, suitcase). The (girl, smell) disturbed the woman. The (guest, game) entertained the boys. The (doctor, plague) killed three men. The (agent, idea) surprised the model. The (youth, game) tired the
minister. That (woman, sheet) covered the girl. Six (women, banners) welcomed the crowd. The (horse, mask) frightened the thief. Many students were aided by the (woman, rules). The doctor was guided by his (servant, instinct). Some teachers were dismayed by the (parents, essays). The stranger was overcome by the (sailor, smoke). The student was annoyed by the (woman, noise). A woman was upset by the (crowd, photo). Most children were pleased by the (performers, performance). Stupid John was misled by the (traveller, signpost). (Uncle Bill, Bill’s face) was shaved by uncle Alan. A (tiger, deer) was attacked by a panther. The (general, proposal) was praised by the committee. The (woman, shirt) was washed by the girl. Each (child, vase) was identified by the gentleman. The (man, ban) was lifted by the king. The (student, statue) was rescued by the artist. The (builder, machine) was employed by the electrician.
Sentences used in Experiment 3
Five boys guarded the (man, cave). The boy saw six (officers, flowers). Uncle Jim forgot the (boy, box). The man seized the (robber, vessel). This person knew the (man, game). The boy kicked the (stranger, football). Many (warriors, buildings) shielded the army. The (girl, play) offended the workmen. A (worker, noise) interrupted the program. This (man, book) confused the teacher. The (patient, picture) shocked the doctor. The (visitor, program) amused the child. The (dog, moth) was chased by the man. The (parson, parcel) was received by the soldier. The (author, train) was met by the painter. The (woman, animal) was mistreated by the man. The (boy, body) was examined by uncle Charles. The (thief, drum) was beaten by the youth. Big Joan was assisted by the (man, book). Old Bob was saved by (the youth, good luck). The teacher was enlightened by the (speaker, article). Many students were impressed by the (visitors, displays). The warrior was injured by the (rebel, sword). The stranger was hidden by the (negro, grass).
REFERENCES
Bever, T. G. (1970) The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language. New York, Wiley.
Collins, A. M., and Quillian, M. R. (1972) How to make a language user. In E. Tulving and W. Donaldson (Eds.), Organization of memory. New York, Academic Press.
Dooling, D. J. (1972) Some context effects in the speeded comprehension of sentences. J. exp. Psychol., 93, 56-62.
Fodor, J. A., and Garrett, M. (1967) Some syntactic determinants of sentential complexity. Perc. Psychophy., 2, 289-296.
Fodor, J. A., Garrett, M., and Bever, T. G. (1968) Some syntactic determinants of complexity, II: Verb structure. Perc. Psychophy., 3, 453-461.
Forster, K. I. (1970) Visual perception of rapidly presented word sequences of varying complexity. Perc. Psychophy., 8, 215-221.
Forster, K. I., and Ryder, L. A. (1971) Perceiving the structure and meaning of sentences. J. verb. Learn. verb. Beh., 10, 285-296.
Gough, P. B. (1966) The verification of sentences: The effects of delay of evidence and sentence length. J. verb. Learn. verb. Beh., 5, 492-496.
Gough, P. B. (1972) One second of reading. In J. F. Kavanagh and I. G. Mattingley (Eds.), Language by ear and by eye. Cambridge, Mass., MIT Press.
Herriot, P. (1969) The comprehension of active and passive sentences as a function of pragmatic expectations. J. verb. Learn. verb. Beh., 8, 166-169.
Holmes, V. M. (1973) Order of main and subordinate clauses in sentence perception. J. verb. Learn. verb. Beh., 12, 285-293.
Holmes, V. M., and Forster, K. I. (1972) Perceptual complexity and underlying sentence structure. J. verb. Learn. verb. Beh., 11, 148-156.
Meyer, D. E., and Schvaneveldt, R. W. (1971) Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. J. exp. Psychol., 90, 227-234.
Morton, J. (1966) Comments on J. P. Thorne’s paper. In J. Lyons and R. J. Wales (Eds.), Psycholinguistics papers. Edinburgh, Edinburgh University Press.
Oldfield, R. C. (1966) Things, words and the brain. Q. J. exp. Psychol., 18, 340-353.
Rosenberg, S., and Jarvella, R. J. (1970) Semantic integration as a variable in sentence perception, memory and production. In G. B. Flores d’Arcais and W. J. M. Levelt (Eds.), Advances in psycholinguistics. Amsterdam, North-Holland.
Rubenstein, H., Garfield, L., and Millikan, J. A. (1970) Homographic entries in the internal lexicon. J. verb. Learn. verb. Beh., 9, 487-492.
Schank, R. C. (1972) Conceptual dependency: A theory of natural language understanding. Cog. Psychol., 3, 552-631.
Schlesinger, I. M. (1968) Sentence structure and the reading process. The Hague, Mouton.
Slobin, D. I. (1966) Grammatical transformations and sentence comprehension in childhood and adulthood. J. verb. Learn. verb. Beh., 5, 219-227.
Summary
This study tests whether the time attributable to the syntactic processing of a sentence is strongly influenced by the semantic properties of that sentence. A series of experiments using two types of procedure does not support this position. More precisely, these experiments show that processing time tends to be constant for sentences of varying semantic plausibility when those sentences have the same syntactic structure. They also show that reversibility does not systematically affect sentence processing. These facts are interpreted as indicating that the recovery of the underlying structure of a sentence depends on the syntactic properties of the input.