JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 15, 143-157 (1976)

Adjectival Negation and the Comprehension of Multiply Negated Sentences

MARK A. SHERMAN
State University College, New Paltz, New York
Two experiments examined the comprehension of singly and multiply negated sentences. Difficulty of comprehension was measured by the speed and accuracy with which subjects judged the semantic reasonableness of sentences. Marked and negatively prefixed adjectives (sad, unhappy) were a particular focus of the study, since their psycholinguistic status, with respect to negation, is uncertain. When sentences contained more than two negatives, comprehension suffered a large decrement. This finding is discussed in relation to Bever’s (1970) model and the concept of cognitive overload. Though prefixed and marked adjectives had little effect on comprehension in an otherwise affirmative context, they had a consistent and substantial effect when added to a double negative sentence. Thus, these adjectives appear to function as psycholinguistic negatives, that is, they may be analyzed as “not X,” at least when other negatives are present.
Since publication of Chomsky’s Syntactic Structures in 1957, psychologists have been interested in the comprehension of negative sentences. Initially, most studies confined themselves to the word not, but more recently attention has been given to other negative words and morphemes (Clark, 1970; Just & Carpenter, 1971; Sherman, 1973). The consistent finding has been that negatives are cognitively more complex than affirmatives, as typically demonstrated by an increase in verification time. That negatives add to the time required to verify a sentence is, in fact, a major assumption of the theoretical model of sentence verification derived independently by Chase and Clark (1972), and Trabasso, Rollins, and Shaughnessy (1971). But the term “negative” is a broad one, and includes words with negative affect which do not negate logically (e.g., sad). To avoid ambiguity, I shall refer to any conceivably negative word which consistently increases sentence difficulty as a psycholinguistic negative; and this effect I shall call psycholinguistic negation (as distinguished, for example, from logical negation). I shall assume that psycholinguistic negation occurs when the word not is literally present or is in the decoding of a word or phrase. Whether any word we would typically call a negative is capable of psycholinguistic negation is a central concern of the present investigation. Another concern is the effect, on comprehension, of two or more negatives appearing in one sentence. Multiply negated sentences are of theoretical interest (see Bever, 1970), and, because of their probable extreme difficulty, they could furnish a suitable testing ground for assumed psycholinguistic negativity.

This research was supported in part by a graduate traineeship from the National Institute of General Medical Sciences to the author, and by Grant SD-187 from the Advanced Research Project Agency to the Harvard University Center for Cognitive Studies. I am especially grateful to George A. Miller and Doris Aaronson for their advice and encouragement. I also wish to thank the Department of Psychology at Stanford University, and especially Roger N. Shepard, for generously providing a stimulating environment in which to work on this paper. The assistance of Joseph Keating in statistical computations is also gratefully acknowledged. Requests for reprints should be sent to Mark A. Sherman, Department of Psychology, State University College, New Paltz, New York 12561.

Copyright © 1976 by Academic Press, Inc. All rights of reproduction in any form reserved. Printed in Great Britain
A major aim of the present study was to ascertain the psycholinguistic status of marked adjectives. Clearly, such adjectives (examples are sad and rude) are evaluatively negative; but are they also psycholinguistic negatives? That is, are they more difficult to comprehend in sentences than their unmarked counterparts (happy, polite)? Sherman (1973) reported an increase in comprehension difficulty brought about by negative prefixed adjectives (such as unhappy), but here, of course, the negative marking is explicit. What reasons are there to suspect that there is also negative marking, albeit implicit, in marked adjectives? Evidence comes from several different sources (see Huttenlocher & Higgins, 1971, for a review): Adjectives like sad, bad, and rude are intuitively negative in affect, and most speakers would agree that there is even a negative tone in less emotional words such as narrow, shallow, and short (metaphorical usage corroborates this view; consider narrow interest, a shallow person, the short end of a bargain). In fact, by definition these words imply “lacking X.” Cross-linguistic studies have revealed that several languages have no word for “bad,” expressing it as “not-good,” but only one language studied expresses “good” as “not-bad” (Greenberg, 1966; Zimmer, 1964). A linguistic analysis (Klima, 1964) shows that at least those marked adjectives which can take complements have some of the defining syntactic characteristics of other more explicit negatives (although, like negative prefixed adjectives, they are incapable of sentence negation). For example, *He was willing to see any more clients that day is, at best, odd; but He was (not willing, unwilling, reluctant) to see any more clients that day are all quite grammatical. The relation between marking and negative affixation provides additional support.
Zimmer (1964) has shown that, for a large number of languages, negative affixes (e.g., un-, in-, and -less in English) are rarely attached to what he calls “evaluatively negative” stems, which appear to be coextensive with marked adjectives. It is an interesting and important fact, for example, that English contains the words happy, unhappy, and sad, but no “unsad.” There are a number of such adjectival triads in English, and in virtually all of them only the unmarked stem takes the prefix (e.g., polite-impolite-rude, intelligent-unintelligent-stupid). Assuming that marked adjectives are already marked for negation, adding the prefix would introduce considerable complexity. Also, the synonymy of such pairs as sad and unhappy implies there is a negative in both. Psychological studies of the internal lexicon have shown that, when they appear in lists, marked adjectives are treated as negatives. Hamilton and Deese (1971) found that naive subjects reliably sorted adjective pairs into marked and unmarked, and that their criterion was evaluative. When subjects are simply asked to define adjectives, they are more likely to use overt negatives, such as not, in their definitions of marked adjectives (e.g., sad = “not happy”) than in their definitions of unmarked adjectives (Mann, 1968; Huttenlocher & Higgins, 1971). But the fact that sad is typically defined as “not happy” does not mean that it is inevitably decoded this way when it appears in a sentence, especially when one considers that, logically, X is sad is an assertion. Studies of deductive reasoning (Clark, 1969; Huttenlocher & Higgins, 1971) show that marked adjectives have greater psychological complexity than their unmarked counterparts, but have left open the question of whether marking plays the role of psycholinguistic negative in single sentence comprehension. The only study which asked this question directly was Clark (1971), which compared the verification times of sentences containing either long or short.
But while significant, the difference in verification times was extremely small; furthermore, the pattern of latencies for true and false responses, crucial in Clark’s model, did not indicate that short
was treated as a negative. This result, therefore, remains ambiguous. The second (but related) interest of the present study was the multiply negated sentence. One of the principal differences between linguistic competence and performance is that the latter may not permit comprehension of strings that the former claims are grammatical. The main target for demonstration of this phenomenon has been the multiply self-embedded sentence (e.g., Miller & Isard, 1964; Stolz, 1967); however, multiply negated sentences probably exhibit the same property (extreme comprehension difficulty) and are certainly more common in English than the essentially nonexistent doubly and triply embedded sentence. Bever (1970) has made the reasonable assertion that sentences containing three (or, presumably, more) negatives are extremely complex, much more so than sentences containing two, which he believes are “perfectly comprehensible and acceptable” (p. 338). Bever presents no empirical evidence for these claims, however, and one of the purposes of Experiment I was to examine them empirically. The speed and accuracy with which multiply negated sentences are comprehended should be assessed not only because of the relevance of such data to part of Bever’s (1970) influential theory of sentence perception and comprehension, but also because such sentences might present a unique opportunity to test for psycholinguistic negation in marked adjectives. Confirmation of the catalyzing effect of the third negative (as essentially predicted by Bever) would suggest a sensitive test for the presence of such negation: Add the alleged psycholinguistic negative to a definitely double negative sentence and note the effect on comprehension. Experiment I examined the comprehension of sentences containing up to four overt negatives (including the negative prefix). Experiment II compared the effects of unmarked, marked and negative prefixed adjectives on the comprehension of sentences
containing either no other negatives or two other overt negatives. In both experiments the task was to judge the semantic reasonableness of sentences and the dependent variables were response time (RT) and error rate. This procedure is highly similar to the typical sentence-verification task.

EXPERIMENT I
The purpose of Experiment I was twofold: (1) To provide specific information on the effects of different types of negatives and their combinations on sentence comprehension, and (2) to examine the effects of the negative prefix and find, if possible, a sentence where the prefix had an especially large effect on comprehension difficulty. Assuming that such a sentence were found, it could be used in a further experiment to test marked adjectives for psycholinguistic negation. In addition to the negative prefix, Experiment I also investigated the effects of the negative particle (not), a sentence negator combined with an indefinite pronoun (no one = “not anyone”), and a negative verb without the power of sentence negation (doubt). Experiment I was actually suggested by the results of a pilot study in which subjects attempted to comprehend sentences containing up to four negatives. Several different negatives (specifically, not, few, deny, dis-, in-, un-, fail to, seldom, no one, doubted, and rarely) appeared in a variety of sentence types. As in the main experiments, subjects judged the semantic reasonableness of sentences. Although a statistical analysis of the effects of specific negatives was not possible in this preliminary investigation, the results clearly showed that two negatives make a sentence difficult and three often make it nearly incomprehensible (e.g., Few people strongly deny that the world is not flat). Following completion of the comprehension task, subjects were asked to describe whatever strategies they had used in dealing with the multiply negated sentences. Not surprisingly, more than half
reported the mental combination of double negatives to form affirmatives. The implications of this heuristic for an explanation of the extreme difficulty of 3- (and 4-) negative sentences are discussed below.

Method

Subjects. There were 32 paid subjects of whom the majority were Harvard and Radcliffe undergraduates. Students at other area colleges, employees of Harvard University, and one high school student also served as subjects.

Materials and design. Sixteen sets of 32 stimulus sentences were used, each set derived from a basic sentence of the form:

[Reasonableness value determiner] {everyone, no one} {believed, doubted} that [pronoun] would (not) be (un-) [adjective].

Each basic sentence contained a different prefixable adjective (e.g., willing, suited, capable). A complete listing of the 16 basic sentences is contained in the Appendix. Examples of actual stimulus sentences are: Because he often worked for hours at a time, no one believed that he was not capable of sustained effort (reasonable). Because he usually worked for 10 min at a time, no one believed that he was not capable of sustained effort (not reasonable). He liked to make decisions for the group and thus everyone doubted that he would be unsuited for the director’s job (reasonable). He liked to let others make decisions and thus everyone doubted that he would be unsuited for the director’s job (not reasonable). A possible major source of variability in response times could have been due to differences among the basic sentences. For example, it could be the case that Bob was a very sensitive person, and thus everyone believed that he would be hurt by criticism takes longer to verify than The boy liked to plan ahead and thus everyone believed that he
would be prepared for future events. To reduce this variability (as well as to eliminate basic sentences where reasonableness was not clear), the 16 basic sentences were selected from an original pool of 30 by means of a pretesting procedure. Eight other subjects participated in this phase. They made reasonableness judgments only for 0-negative versions of sentences, such as the examples above. Nine sentences were eliminated because at least one subject made an error or disagreed with the experimenter’s assignment of reasonableness. Of the remaining 21 sentences, those 16 were selected which had the smallest range of mean RTs.

For purposes of design and analysis, each of the negatives (no one, doubted, not, and un-) was considered to be a two-valued factor (present or absent), and Reasonableness (R or NR) was a fifth such factor. The design was a balanced factorial in blocks of 16 with repeated measures (Winer, 1962, pp. 379-412). In this design the highest order interaction is completely confounded with between-blocks differences, but unconfounded estimates of all main effects and lower-order interactions are provided. The design divided the 32 subjects into two blocks of 16. Each subject received 16 stimulus sentences, representing all possible combinations of the four negatives (within subject, a different basic sentence was used for each combination). For subjects in Block I (odd-numbered subjects), the eight 0-, 2-, and 4-negative sentences were reasonable, whereas the eight 1- and 3-negative sentences were unreasonable. For subjects in Block II (even-numbered subjects), this arrangement was reversed. Padding sentences helped to insure that subjects would not associate quantity of negatives with reasonableness value, a tendency which would be unlikely in any case.
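The assignment of reasonableness values to negative combinations described above can be sketched in a few lines. This is only an illustration of the stated design, not the author's materials: the function names and the "I"/"II" block labels as strings are hypothetical conveniences.

```python
from itertools import product

# The four two-valued negative factors named in the text.
NEGATIVES = ("no one", "doubted", "not", "un-")


def negative_combinations():
    """Return all 16 present/absent combinations of the four negatives."""
    combos = []
    for flags in product((False, True), repeat=len(NEGATIVES)):
        combos.append(tuple(n for n, on in zip(NEGATIVES, flags) if on))
    return combos


def reasonableness(n_negatives, block):
    """Reasonableness value for a sentence with n_negatives negatives.

    Block I: 0-, 2-, and 4-negative sentences are reasonable (R),
    1- and 3-negative sentences are not reasonable (NR).
    Block II: the arrangement is reversed.
    """
    even = n_negatives % 2 == 0
    if block == "I":
        return "R" if even else "NR"
    if block == "II":
        return "NR" if even else "R"
    raise ValueError("block must be 'I' or 'II'")


if __name__ == "__main__":
    for combo in negative_combinations():
        label = ", ".join(combo) or "no negatives"
        print(f"{label:28s} Block I: {reasonableness(len(combo), 'I'):2s}  "
              f"Block II: {reasonableness(len(combo), 'II')}")
```

Counting the output confirms the "eight and eight" split in each block: one 0-negative, six 2-negative, and one 4-negative combination on one side, and four 1-negative plus four 3-negative combinations on the other.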
To counterbalance fully position and sequence effects, the 256 pairings of negative combination and basic sentence were arranged in a randomized 16 x 16 Greco-Latin square (Fisher & Yates, 1943), whose rows represented negative combination (e.g., not and
un-) and whose columns represented the basic sentences. Each odd-numbered subject (Block I) received sentences represented by a row of the square; even-numbered subjects (Block II) received the same sentences in the same order, except for the reversed reasonableness values. Prior to receiving the stimulus sentences, the subject received six practice sentences of which only two were multiply negated. Interspersed among the 16 stimulus sentences were 18 padding sentences.

Procedure. The task was highly similar to sentence verification, and has been described in detail elsewhere (Sherman, 1973). By means of a simple apparatus, sentences were presented, one at a time, to each subject. The subject rested his hands on two response switches labeled YES and NO (for Reasonable and Not Reasonable, respectively). Because of difficulties experienced by some subjects in earlier experiments when the YES switch was on the left, it was kept on the right for all subjects in both experiments reported here. Written instructions asked the subject “to decide whether each sentence is making a reasonable (or true) statement, or if it is saying something which is unreasonable (or false).” Two simple examples were then presented. Subjects were advised that some sentences would be “more complex than these examples, but once understood, their reasonableness or lack of it will be quite apparent.” The instructions requested subjects to “give the most obvious answer,” and urged them “not (to) sacrifice correctness for speed.” As a further incentive to rapid but accurate responding, a cash bonus was offered to “the most accurate and fastest one-quarter of the subjects.” The experimenter began each trial by saying “Ready”; he then depressed a switch which illuminated the sentence and started a timer (accurate to 1/100 of a second). Depression of a response switch by the subject stopped the timer. The six practice sentences and the 34 stimulus and padding sentences were presented with an intertrial interval of 5 to 10 sec.
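For readers unfamiliar with the Greco-Latin square mentioned under Materials and design, the following sketch builds one pair of orthogonal 16 x 16 Latin squares and checks the defining property: superimposing them yields each of the 256 ordered symbol pairs exactly once. The construction used here (arithmetic in the finite field of 16 elements) is one standard method and is an assumption of this sketch; the study took its square from Fisher and Yates (1943) and randomized it, and how the symbols were mapped onto subjects and serial positions is not reproduced here.

```python
ORDER = 16


def gf16_mul(a, b):
    """Multiply two elements of GF(2^4), reducing by x^4 + x + 1 (0x13)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^4) is bitwise XOR
        b >>= 1
        a <<= 1
        if a & 0x10:         # degree reached 4: reduce modulo x^4 + x + 1
            a ^= 0x13
    return result


def build_squares():
    """Return two orthogonal Latin squares of order 16."""
    latin = [[i ^ j for j in range(ORDER)] for i in range(ORDER)]
    greek = [[gf16_mul(2, i) ^ j for j in range(ORDER)] for i in range(ORDER)]
    return latin, greek


def is_latin(square):
    """True if every row and every column contains each symbol once."""
    symbols = set(range(len(square)))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


def is_orthogonal(a, b):
    """True if superimposing a and b gives every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


if __name__ == "__main__":
    latin, greek = build_squares()
    print(is_latin(latin), is_latin(greek), is_orthogonal(latin, greek))
```

The orthogonality property is what allows two nuisance variables to be balanced simultaneously against the row and column factors of the square.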
Results and Discussion

In spite of the emphasis on accuracy in the instructions, the overall error rate was 15.8%, which is quite high for this type of task. Therefore, two separate analyses were carried out, the first with errors excluded (Excl.), and the second with all response times included, regardless of correctness (Incl.). Erroneous responses constituted missing values in the Excl. analysis, and these were estimated by an extrapolation procedure described in Winer (1962, p. 281). For the present study this method of estimation is conservative with respect to differences among negative combinations, as it tends to use RTs for the lower numbers of negatives (0, 1, 2) to estimate RTs for the higher numbers (3, 4); the reason for this is that many more errors were made on the latter. To deal with the problem of extreme values, and to increase the homogeneity of variance for the various factor combinations, a logarithmic transformation was applied to RTs for both analyses. Geometric means were then computed. Table 1 presents these means, and error rates, for each number of negatives.

TABLE 1
MEAN RESPONSE TIMES(a) AND ERROR RATES (PERCENTAGE) FOR EACH NUMBER OF NEGATIVES IN EXPERIMENT I(b)

Number of negatives
in sentence             Excl.    Incl.    Error rate
0                        4.75     4.75       0.0
1                        5.92     5.94       4.7
2                        7.45     7.63      12.5
3                        9.71     9.25      29.7
4                       11.41     9.56      40.6

(a) RTs in seconds; the average standard error was 0.53 sec for the Excl. analysis and 0.55 sec for the Incl. analysis.
(b) Excl. = errors excluded, Incl. = errors included.

There is a fairly large difference between the
Excl. and Incl. means for three negatives and a much larger difference for four negatives. The fact that Incl. means are noticeably lower for the 3- and 4-negative sentences indicates that subjects felt a certain natural time limit in comprehending a sentence and would tend to stay within this limit even if it meant error rates of 30 or 40 %. The data do not fully support Bever’s (1970) contention that sentences containing two negatives are “perfectly comprehensible and acceptable.” A second negative adds considerably to comprehension time, and the 12.5% error rate is higher than one usually finds in verification tasks. However, a consideration of both errors and response times indicates that the most dramatic increase in difficulty does indeed occur upon addition of a third negative, a result which Bever’s analysis predicts. Taken as a whole the 3-negative sentences used here were clearly beyond normal comprehension ability.
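As a rough illustration of the summary statistics used here, the sketch below shows how a geometric mean follows from averaging log-transformed response times, and then tabulates the step-by-step increases in the Table 1 values. The Table 1 numbers are copied from the table; the function names and the two-value example are hypothetical, and the raw response times themselves are not available in this text.

```python
import math


def geometric_mean(times):
    """Geometric mean: exponentiate the mean of the log-transformed values."""
    logs = [math.log(t) for t in times]
    return math.exp(sum(logs) / len(logs))


def increments(values):
    """Change from each number of negatives to the next, rounded to 2 places."""
    return [round(b - a, 2) for a, b in zip(values, values[1:])]


# Values reported in Table 1, indexed by number of negatives (0 through 4).
EXCL = [4.75, 5.92, 7.45, 9.71, 11.41]    # errors excluded, seconds
INCL = [4.75, 5.94, 7.63, 9.25, 9.56]     # errors included, seconds
ERROR_RATE = [0.0, 4.7, 12.5, 29.7, 40.6]  # percentage

if __name__ == "__main__":
    # Made-up response times, only to show the effect of the transformation.
    print("geometric mean of 4 s and 9 s:", geometric_mean([4.0, 9.0]))
    print("Excl. RT increase per added negative:", increments(EXCL))
    print("Error-rate increase per added negative:", increments(ERROR_RATE))
    gaps = [round(e - i, 2) for e, i in zip(EXCL, INCL)]
    print("Excl. minus Incl.:", gaps)
```

Because the geometric mean damps the influence of very long response times, it suits skewed latency data. In the Table 1 values, both the Excl. response time and the error rate show their largest single step at the third negative.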
TABLE 2
MEAN RESPONSE TIMES(a) AND ERROR RATES (PERCENTAGE) FOR EACH COMBINATION OF NEGATIVES IN EXPERIMENT I(b)

                                  Times
Negatives                     Excl.    Incl.    Error rate
No negatives                   4.75     4.75       0.0
no one                         5.72     5.72       0.0
doubted                        6.55     6.58      15.6
not                            5.97     5.97       0.0
un-                            5.49     5.52       3.1
no one doubted                 5.85     6.00       6.3
no one, not                    7.81     8.01      15.6
no one, un-                    7.36     7.63      15.6
doubted, not                   8.92     9.55      18.8
doubted, un-                   7.44     7.39      12.5
not un-                        7.64     7.62       6.3
no one doubted, not            8.40     7.92      31.3
no one doubted, un-            7.63     8.15      15.6
no one, not un-               10.67     9.68      37.5
doubted, not un-              12.97    11.70      34.4
no one doubted, not un-       11.41     9.56      40.6

(a) RTs given in seconds; the average standard error was 0.61 sec for the Excl. analysis and 0.71 sec for the Incl. analysis.
(b) Excl. = errors excluded, Incl. = errors included.

Specific negatives and their combinations. The primary interest of the present experiment was the effect of the four types of negatives both alone and in concert, with the negative prefix a particular focus. Table 2 presents the geometric mean RTs and error rates for the 16 combinations of negatives. Reasonableness value does not appear as a variable because none of its interactions were significant. With the four negatives and reasonableness considered as binary factors, analyses of variance were carried out, using a procedure described in Winer (1962, pp. 399-412). One analysis (Excl.) made use of correct responses only, with missing values estimated by the extrapolation procedure mentioned above. A second analysis (Incl.) used all response times, correct and incorrect. In the Excl. analysis the main effects of all four negatives were highly significant: No one, F(1, 450) = 14.13; doubted, F(1, 450) = 67.14; not, F(1, 450) = 179; and un-, F(1, 450) = 89.57; all ps < .001. NR judgments were significantly longer than R judgments, F(1, 450) = 4.23, p < .05. There was a highly significant negative interaction between no one and doubted, F(1, 450) = 40.09, p < .001, and a positive interaction between not and un-, F(1, 450) = 4.34, p < .05. When actual error times were used (Incl.), the main effects of the four negatives remained significant: No one, F(1, 450) = 8.43, p < .01; doubted, F(1, 450) = 50.49; not, F(1, 450) = 112.1; and un-, F(1, 450) = 55.45; ps all less than .001. Although neither Reasonableness, F(1, 450) = 1.88, nor the not x un- interaction, F < 1, remained significant in this analysis, the highly significant interaction of no one and doubted remained, F(1, 450) = 39.93, p < .001. The discussion that follows will confine itself to the Excl. data because there are few major differences between the Excl. and Incl. results, and because this parallels reports of results in previous studies of sentence verification. The average increases in comprehension time produced by each of the four negatives were:
No one, 0.75 sec; doubted, 1.66 sec; not, 2.71 sec; and un-, 1.91 sec. The means for no one and doubted would, no doubt, have been higher were it not for the strong negative interaction between the two; No one doubted was as easy or easier to comprehend than either of these negatives appearing without the other. Among the single negative sentences, those containing doubted were the most difficult, both in terms of RTs and errors; the negative prefix added least to difficulty, but this increase was significant, p < .01, Sign Test. The difficulty of doubted was not predicted, because doubt, like un-, is not capable of sentence negation (Klima, 1964). This can be seen in the contrast between He wasn’t sure of anyone, not even his best friend versus *He doubted (was unsure of) anyone, not even his best friend. The difficulty of doubted could be due to a rather subtle aspect of its meaning: For the sentences used here, at least, the word doubt is a less “polar” negative than the others. That is, it does not diametrically reverse meaning in quite the same way the other negatives do. Consider, for example, the contrast between the two statements, Everyone believed that he was capable of sustained effort and Everyone believed that he was not capable of sustained effort. The expressions capable and not capable are polar opposites (the same can be shown true for no one versus everyone, and for un-). On the other hand, doubt is not necessarily the polar opposite of believe. To “not believe” something may go further than “doubting” it, as doubt implies some small measure of belief. This can be shown in the contrast among the following three responses to the statement that John can cook really good potato pancakes: I believe it. I don’t believe it. I doubt it. Perhaps because the word doubt makes a less extreme or exact statement than no one, not, or un-, its reasonableness is less readily testable. Just and Carpenter’s (1971) finding that terms like few and a minority take longer to process than not, none, and no could be fitted into the same kind of framework.

The 2-negative sentences were substantially more difficult than the single negatives; an exception was no one doubted, which was easier than doubted alone and practically equal in difficulty to no one. Perhaps no one doubted lent itself particularly well to the strategy (reported by subjects in the pilot experiment) of combining two negatives to form an affirmative. However, this strategy should also have worked well for not and un-, whose effect in combination was possibly more difficult than an additive model would predict. The difference in ease of handling the two double negatives might also be accounted for by a difference in polarity of meaning, where no one doubted is the more polar and, hence, easier. For example, the two sentences No one doubted that he was capable of sustained effort and Everyone believed that he was not incapable of sustained effort have similar meanings, but the double negative in the first is making a definitive positive statement, whereas that in the second seems rather hesitant and uncertain. Another possibility is that no one doubt(ed) is a frequent, highly practiced expression in English and is therefore treated as a rather long affirmative unit in comprehension. But the data for 3- and 4-negative sentences dispute this. No one doubted is semantically very close to everyone believed. When not or un- is added to everyone believed . . ., response time increases by 1.22 and 0.74 sec, respectively (and the error rate stays at or very close to zero). But when either of these negative morphemes is added to no one doubted . . ., response times rise by 2.55 and 1.78 sec, respectively (and there are substantial increases in error rate, especially for no one doubted, not). What may be occurring here is a kind of cognitive overload. Suppose that the comprehender mentally combines no one and doubted to form an affirmative. His mental coding is probably something like “Everyone believed.” But this is strictly an internal coding and, hence, it can be assumed that to maintain the “affirmativeness” of No one doubted takes effort or space. This presents no problem until
another negative is encountered, at which point the sentence-processor is apparently overloaded and the affirmative unity of no one doubted can no longer be maintained. The concept of cognitive overload may be useful for explaining the difficulty of 3- and 4-negative sentences, in general, and it will be discussed again in the General Discussion section.

The negative prefix. One of the purposes of Experiment I was to find a specific sentence frame where comprehension difficulty would be greatly increased by the addition of a negative prefix. Each of the negatives used in this experiment, including un-, had its largest effect as a third negative. Specifically, the prefix increased difficulty most when added to doubted, not (e.g., . . . everyone doubted that he would not be prepared . . . versus . . . everyone doubted that he would not be unprepared . . .). Un- increased RT by over 4 sec in this type of sentence, and also sharply increased error rate. Clearly, the sentence frame . . . everyone doubted that PRO . . . not . . . is especially “sensitive” to the addition of a morphological negative; the possibility that this sensitivity could be exploited to show that implicitly marked adjectives may function as true psycholinguistic negatives was the basis for Experiment II.

EXPERIMENT II

Experiment II sought to determine whether marked adjectives have the same effect on comprehension as negative prefixed adjectives and, in particular, whether marked adjectives are capable of psycholinguistic negation. The sentence frames used were selected on the basis of the results of Experiment I. One group of subjects saw sentences of the form . . . everyone believed that PRO . . .; another group saw sentences of the form . . . everyone doubted that PRO . . . not . . . . Both types of sentences (one containing two overt negatives and the other containing none) were used to show whether marked and prefixed adjectives generally increase comprehension
difficulty, or whether they do so only when certain conditions are met, such as the straining of the sentence-processor.

Method

Subjects. Forty-eight subjects participated, of whom 29 were Harvard and Radcliffe undergraduates. Of the remaining 19 subjects, two were graduate students at Harvard, eight were recent college graduates, and eight had some college experience. Subjects were divided into two equal groups on the basis of time of arrival for the experiment.

Materials and design. The following six adjective triads were used in the construction of the stimulus sentences: happy-unhappy-sad, polite-impolite-rude, safe-unsafe-dangerous, intelligent-unintelligent-stupid, attractive-unattractive-ugly, and willing-unwilling-reluctant. Each triad consisted of an unmarked adjective (e.g., happy), its prefixed antonym (unhappy), and its marked antonym (sad). Responses to a questionnaire in which a separate group of subjects were asked to give antonyms for adjectives had clearly demonstrated that for all of the above triads both the prefixed and marked adjectives are considered antonyms of the unmarked. The six triads and their basic sentences were selected from an original pool of nine by means of a pre-testing procedure identical to that of Experiment I. From each triad 12 sentences were constructed and each sentence was typed separately on a 5 x 8 index card. Table 3 presents the schematics for the sentences derived from each triad. For each triad, sentences represented the combination of three factors: Number of overt negatives in sentence (none, two), Adjective (unmarked, prefix, marked), and Reasonableness (R, NR). Subjects in the No Other Negatives Group (Group NN) saw sentences of the form . . . everyone believed that PRO . . ., and subjects in the Two Other Negatives Group (Group TN) saw sentences matched with those of Group NN, but of the form . . . everyone doubted that PRO . . . not . . . . Each
subject saw six stimulus sentences, one from each triad, with all combinations of Adjective and Reasonableness represented. The experiment was treated as a 2 x 3 x 2 factorial design with repeated measures on the last two factors (Winer, 1962, pp. 319-337). Since no 6 x 6 Greco-Latin square exists, it was impossible to counterbalance fully both factor combination and triad. Because it was far more important to balance position and sequence effects for the factor combinations, they were arranged in a randomly determined 6 x 6 Latin square; the order determined by the square was used for subgroups of six subjects. The order of the triads was semi-randomized, with an effort made to balance the orderings as much as possible. The six stimulus sentences presented to each subject were interspersed among 16 padding sentences of various types. Six practice sentences, including one similar to the stimulus sentences, preceded presentation of the stimulus and padding sentences.

TABLE 3
STIMULUS SENTENCES (SCHEMATICS), EXPERIMENT II

1. He was always a (well-behaved, bratty) child, and thus everyone (believed, doubted) that he would (not) be (polite, impolite, rude) to the guests.
2. He had just (won a lot of money in a contest, lost a lot of money at poker), and everyone believed that he would be (happy, unhappy, sad) about this.
3. Because the car had (brand new, worn out) tires, everyone believed that it would be (safe, unsafe, dangerous) at high speeds.
4. Because his IQ was (160, 70), everyone believed that he would seem (intelligent, unintelligent, stupid) to most of those who met him.
5. She had (a lovely face and figure, scars and pockmarks all over her face), and thus everyone believed that she would look (attractive, unattractive, ugly) to most men.
6. He (loved, hated) little children, and thus everyone believed that he would be (willing, unwilling, reluctant) to perform at the children’s Christmas party.

Procedure. The procedure was essentially identical to Experiment I.

Results and Discussion

Because of a high overall error rate, again two separate analyses of the data were performed. As in Experiment I, one analysis (Excl.) used response times for correct responses only, whereas the second analysis (Incl.) included all times, regardless of correctness. The estimation procedure used for the Excl. analysis was the same as that used in Experiment I. To provide a greater homogeneity of variance across conditions, a logarithmic transformation was performed on the response times, and geometric means were obtained. These means, and error rates, appear in Table 4.

There is little difference in the results whether errors are excluded (and their times estimated) or included. This is especially true for Group NN, where the overall error rate was extremely low (2.8%). But even for Group TN, where the rate was 31.9%, the difference between the Excl. mean and Incl. mean was 0.30 sec (Excl. X̄ = 9.67 sec, Incl. X̄ = 9.37 sec). The results of the analyses were little affected by whether RTs for errors were included or estimated.

In line with Clark’s (1973) suggestions, triads were treated as a random effect. Hence, in what follows, F1 is the traditional F, where subjects are the sampling variable; F2 is the value obtained when triads are used as the sampling variable; min F' is the minimum possible value of the quasi-F ratio, where subjects and triads are treated as sampling variables simultaneously.

As was expected, the presence of the two explicit negatives made a tremendous difference in comprehension difficulty. Not only did the combination of doubt and not add nearly 6 sec to comprehension time (this difference is highly significant, Excl.: F1(1, 46) = 96.06, F2(1, 10) = 493.8, min F'(1, 56) = 80.42; Incl.: F1(1, 46) = 88.85, F2(1, 10) = 289.4, min F'(1, 39) = 67.97; all ps < .001), but even this addi-
152
MARK A. SHERMAN TABLE MEAN
RESPONSE TIMES’,~
4
AND ERROR RATES (PERCENTAGE)
IN EXPERIMENT
II
Reasonableness R
Adjective
NR
Group NN Unmarked Prefixed Marked Total
3.80 4.19 4.18 4.05
(3.80) (4.19) (4.13) (4.03)
3.83 3.59 3.76 3.73
(3.82) (3.59) (3.76) (3.72)
Group TN Unmarked Prefixed Marked Total
6.86 11.08 9.42 8.95
(6.82) (10.81) (9.82) (8.98)
8.58 11.59 11.43 10.44
(8.74) (10.05) (10.64) (9.78)
Total
3.81 (3.81) 3.88 (3.88) 3.96 (3.94) 3.88 (3.87) 7.67 11.33 10.38 9.67
(7.72) (10.42) (10.22) (9.37)
Error rate
2.1 0.0 6.2 2.8 29.2 33.3 33.3 31.9
n RTs in seconds; “Incl.” times are in parentheses. b The average standard error was 0.82 set for the Excl. analysis and 0.89 set for the Incl. analysis.
tional time was insufficient for accurate interaction of a number of negatives and comprehension, as indicated by the very high reasonableness approached significance, Excl. : error rate for Group TN. Note that the F,(l, 46) = 10.42, p < .Ol, F,(l, 10) = 3.86, maximum expected error rate for a subject p<.1;1nc1.:F1(1,46)=3.41,p<.1,F,(1,10)= who was merely guessing would be 50 %. 1.44, n.s. Table 4 clearly shows that if the sentence Clearly, the significance of the Adjective contained no overt negatives, the differences effect and the Adjective by Number of between the three types of adjectives were negatives interaction is due to the longer RTs small, whereas they were quite large for sen- for those doubted, not sentences which contences containing two overt negatives. In both tained prefixed and marked adjectives. This cases the differences are in the expected was verified by means of Newman-Keuls and direction, but individual analyses of variance Sign tests. Most important for the present carried out separately for Group NN and investigation, marked adjectives had a large Group TN showed that the adjective effect is effect on comprehension difficulty and, in fact, significant only for the latter. The overall effect they were nearly indistinguishable from the of adjective type was significant, Excl.: negative prefixed adjectives in terms of this F,(2, 92) = 14.43,~ < .OOl, F,(2, 20) = 11.64, effect (statistically substantiated by the tests p < .OOl, min F’(2, 57) = 6.44, p < .Ol; Incl.: just cited). Fl(2, 92) = 7.97, p < .OOl) Fz(2, 20) = 9.93, The results of Experiment II indicate that in p -c .Ol, min F’(2,75) = 4.42, p < .05. There was the presence of two other negatives, marked also a significant interaction between adjective and negative prefixed adjectives are processed types and presence of overt negatives, Excl.: as psycholinguistic negatives. 
However, it is F,(2, 92) = 11.14, p < .OOl, F,(2, 20) = 10.26, less clear how these adjectives are processed p < .OOl, min F’(2, 63) = 5.34, p < .Ol; Incl.: when there are no other negatives in the senF1(2, 92) = 5.64, p < .Ol, F,(2, 20) = 8.49, tence. With respect to prefixal negation, the p<.Ol,minF’(2,84)=3.39,p<.05.Noother results for Group NN contrast with those of effects reached significance, although the Sherman (1973), where the prefix added
significantly to comprehension time even when it was the only negative in the sentence. A possible explanation for this difference in results is that subjects in the earlier study, but not in Group NN here, also saw sentences containing the double negative, not un-. Faced with the possibility of un- appearing in such a construction, subjects in Sherman (1973) may have delayed responding when they encountered the prefix in order to make certain that a double negative construction was not involved.
The negation assumed to be implicit in marked adjectives also failed to affect sentence difficulty for Group NN. At first glance, this finding conflicts with Clark (1971), who reported a comprehension advantage for long (versus short). But, although significant, the mean difference reported by Clark was quite small: 42 msec. This is much below the time typically required by not, the sine qua non of psycholinguistic negatives, and, in fact, is lower than the nonsignificant mean differences found here (70 and 150 msec for Prefix and Marked, respectively). The typical effect of not was observed in a pilot study in which subjects saw sentences containing not plus the unmarked adjective, as well as sentences of the kind seen by Group NN. In that study, the prefixed and marked adjectives again did not significantly affect response time, although the differences were in the expected direction; the word not, however, led to a highly significant 0.5 sec increase in RT.
However, in the presence of two overt negatives (Group TN), both negative prefixed and marked adjectives had a large effect on comprehension difficulty: The negative prefix added 3.66 sec to response time (2.70 sec in the Incl. analysis), and marking added 2.71 sec (2.50 sec in the Incl. analysis). Differences of this magnitude are a clear indication of psycholinguistic negation.
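As a check on the arithmetic behind the figures reported in this section, the following Python sketch recomputes two of them. It is an illustration for the reader, not part of the original analysis. The function names (min_f_prime, adjective_effect) are hypothetical; the min F’ formula and its approximate denominator degrees of freedom are those usually attributed to Clark (1973), and are checked here only against the Excl. analysis of the number-of-negatives effect. The input values are copied from the text and from the Total column of Table 4.

```python
# Sketch only: names are illustrative and do not come from the paper.

def min_f_prime(f1, df1_error, f2, df2_error):
    """Return (min F', approximate denominator df).

    f1, df1_error: F with subjects as the sampling variable, and its error df.
    f2, df2_error: F with triads as the sampling variable, and its error df.
    Formula assumed from Clark (1973).
    """
    value = (f1 * f2) / (f1 + f2)
    denom_df = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return value, denom_df


# Table 4, Total column, Group TN: (Excl., Incl.) geometric means in seconds.
GROUP_TN_TOTAL = {
    "unmarked": (7.67, 7.72),
    "prefixed": (11.33, 10.42),
    "marked": (10.38, 10.22),
}


def adjective_effect(adjective, totals=GROUP_TN_TOTAL):
    """Difference from the unmarked baseline, as (Excl., Incl.) in seconds."""
    base_excl, base_incl = totals["unmarked"]
    excl, incl = totals[adjective]
    return round(excl - base_excl, 2), round(incl - base_incl, 2)


if __name__ == "__main__":
    # Number-of-negatives effect, Excl.: F1(1, 46) = 96.06, F2(1, 10) = 493.8
    value, df = min_f_prime(96.06, 46, 493.8, 10)
    print(f"min F'(1, {round(df)}) = {value:.2f}")
    print("prefixed:", adjective_effect("prefixed"))
    print("marked:", adjective_effect("marked"))
```

Run as a script, the sketch gives a min F’ of about 80.42 on roughly 56 denominator degrees of freedom, and Excl. adjective effects of 3.66 and 2.71 sec, in agreement with the values reported above.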
(It is difficult to determine whether for the prefixal and implicit negation there remains any comprehension advantage in comparison with particle negation, as found in Sherman (1973), because the substitution of not in the present context would yield the ungrammatical * . . . everyone doubted that he was not not . . .).
GENERAL DISCUSSION
The main results of the present study were: (1) Lexical negation (either overt, as in the prefixed unhappy, or implicit, as in the marked sad) did not consistently increase sentence difficulty when the sentence contained no other negatives and there was no set for negation. (2) Sentences containing more than two negatives were extremely difficult to comprehend. And when added to a sentence containing two negatives, a prefixed or marked adjective affected comprehension as if the adjective were truly a psycholinguistic negative (presumably analyzed as “not plus adjective”). (3) There were large differences in difficulty among different types of negatives, and the difficulty of each was not predictable strictly from linguistic considerations, such as scope of negation.
How can one explain the finding that neither negative prefixed nor marked adjectives significantly affected comprehension time when they appeared in sentences like He had just won a lot of money in a contest, and everyone believed that he would be (unhappy, sad) about this (as compared to the same sentence with happy), whereas such adjectives had a large effect on comprehension time when the sentence was He had just won a lot of money in a contest and everyone doubted that he would not be (unhappy, sad) about this?
One possibility is that the mental decoding of sad is always “not happy” but the time needed for the “not” was so little in sentences of the type read by subjects in Group NN (Experiment II) that it was not detected statistically. For the subjects in Group TN, however, the sentence comprehension mechanism was already strained by the presence of two overt negatives; the presumed “not” in the analysis of sad could no longer be dealt with swiftly because it now had to enter into
the difficult mental calculations incurred by the presence of three negatives in one sentence.
An alternative explanation is that prefixed and marked adjectives may be decoded in two different ways, either as unanalyzed lexical simples or as “not plus adjective” (see Fodor, Bever, & Garrett, 1974, pp. 381 ff.). What I am suggesting here is that subjects in Group TN treated prefixed and marked adjectives as psycholinguistic negatives, but subjects in Group NN, and in Experiment I, treated them as psycholinguistic affirmatives. The possibility that a particular linguistic phenomenon may be handled in very different ways in comprehension has been discussed by Holmes (1973) with reference to self-embedded sentences. It has been known for some time (e.g., Stolz, 1967) that sentences containing two or more embeddings are difficult or incomprehensible, whereas the corresponding right-branching sentence is comprehended without great difficulty. But Holmes showed that a sentence containing a single embedding may actually be easier to deal with than the corresponding right-branching sentence. She suggests that the processing strategy changes drastically when a sentence contains more than one embedding, and, in fact, citing the research of Stolz and others, she points out that multiply embedded sentences may not even be treated as grammatical strings. A similar situation may exist in the case of multiply negated sentences.
The question here, of course, is why the comprehension mechanism should analyze sad as “sad” when the sentence is easy, but analyze it as “not happy” when the sentence is already heavy with negatives. The answer could be that some kind of set for negation is established by multiply negated sentences. Faced with having to “compute” the overall direction of a statement, the reader (or listener) quickly searches out every possible negative and includes them in his computation.
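The second account, in which the same adjective is decoded in two different ways, can be made concrete with a small Python sketch. It is a caricature of the hypothesis offered for illustration, not a tested processing model. The lexicon is limited to triads quoted in this paper, and the names TRIADS, BASE_OF, decode, and the boolean flag negation_set are all hypothetical.

```python
# Toy lexicon: unmarked adjective -> (prefixed form, marked form).
# Only triads that appear in the text of this paper are included.
TRIADS = {
    "happy": ("unhappy", "sad"),
    "safe": ("unsafe", "dangerous"),
    "intelligent": ("unintelligent", "stupid"),
    "attractive": ("unattractive", "ugly"),
    "willing": ("unwilling", "reluctant"),
}

# Map every prefixed or marked adjective back to its unmarked base.
BASE_OF = {neg: base for base, negs in TRIADS.items() for neg in negs}


def decode(adjective, negation_set):
    """Return the assumed analysis of an adjective as a tuple of units.

    negation_set: True when a set for negation is assumed to be active
    (for example, after exposure to multiply negated sentences).
    Without such a set, the word is treated as an unanalyzed lexical
    simple; with it, as "not" plus the unmarked base.
    """
    if negation_set and adjective in BASE_OF:
        return ("not", BASE_OF[adjective])
    return (adjective,)


if __name__ == "__main__":
    print(decode("sad", negation_set=False))
    print(decode("sad", negation_set=True))
    print(decode("unhappy", negation_set=True))
```

Under these assumptions, sad is returned whole when no set for negation is active and as “not” plus happy when one is, which is the contrast the second account proposes between Group NN and Group TN.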
Support for this description comes from Experiment I, where un- added significantly to comprehension difficulty even when it was the only negative in the sentence. A major difference between Experiment I and Group NN in Experiment II is that subjects in the former saw many multiply negated sentences, whereas subjects in the latter saw very few (only in the padding sentences).
But regardless of which theory of adjective decoding is ultimately the more adequate, the (at least occasional) psycholinguistic negativity of prefixed and marked adjectives seems virtually certain. In both experiments reported here, as well as in Sherman (1973), negative prefixed adjectives showed the typical “negative” effect on comprehension; and in Experiment II, the effect of adjectival marking directly paralleled that of prefixation.
Incidentally, these findings offer no support for a simple frequency explanation of differences in adjectival difficulty. The marked adjectives used in Experiment II are far more frequent than their prefixed synonyms, but they were nearly as difficult. If frequency of occurrence in English were the critical factor in determining difficulty of comprehension, the marked adjectives should have fallen midway between the unmarked and prefixed adjectives in difficulty. But this did not occur. For example, for one of the triads, intelligent-unintelligent-stupid, the Kučera and Francis (1967) frequency counts are 26, 0, and 24, respectively. Yet the mean RTs for Group TN in Experiment II were, for these adjectives, 7.24, 11.26, and 10.32 sec, respectively.
Why are sentences containing more than two negatives so very difficult, even when one of these is a lexical negative? Bever (1970) attempts to explain the excessive difficulty of 3-negative sentences by subsuming them under a theory which proposes a general perceptual difficulty with words having double functions. Thus in his example, “They did not want me not to promise not to help them” (p. 338) the heart of the problem is the second not, which simultaneously is within the scope of a negative (the first not) and includes a negative (the third not) within its own scope. The strength of Bever’s analysis lies in its generality
(he applies it to self-embedded and other very complicated sentences), but with regard to multiply negated sentences it may be inadequate. Its main limitation is that it does not incorporate the chief strategy used by subjects, namely, combining negatives to form affirmatives. To comprehend Bever’s example seems virtually impossible without this strategy; on the other hand, the rather natural mental conversion of “. . . did not want me not to promise . . .” to “. . . wanted me to promise . . .” makes the sentence comprehensible.
The above strategy would help to explain why certain 3-negative sentences are easier than others, e.g., . . . no one doubted that he would be un- . . . versus . . . everyone doubted that he would not be un- . . . ; the greater ease of the former could be due to the facility with which no one and doubted are combined. Bever’s analysis does not directly deal with this point. It seems clear that an explanation of the extreme difficulty of 3- and 4-negative sentences should include the obvious fact that the comprehender combines negatives as he goes along (subjects in the pre-Experiment I pilot study often reported the use of such a strategy). Difficult as such sentences are even when this strategy is employed, without it comprehension would appear to be close to impossible.
If the combining of double negatives to form affirmatives is central in the comprehension of multiply negated sentences, then the reason for the large jump in difficulty at three negatives could be some kind of cognitive overload (Meyer, 1973), which, in this case, leads to a near-breakdown in comprehension. The main problem for the comprehender, in this view, is that the affirmative resulting from the combination of any two negatives is strictly a cognitive entity; thus it must share a limited processing capacity (Foss, 1969) with reasonableness determination. The presence of the third negative causes an overload, even when this negative is a lexical one.
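The strategy of combining negatives as one goes along can be sketched as a left-to-right pairing procedure. The Python sketch below is an illustration under strong assumptions, not the author’s model: it treats the four negatives of Experiment I as interchangeable logical negations, although doubted, for one, is not strictly polar. The names NEGATIVES, reduce_negatives, and net_polarity are hypothetical.

```python
# The four negatives used in Experiment I, treated here (as an
# assumption) as equivalent logical negations.
NEGATIVES = {"no one", "doubted", "not", "un-"}


def reduce_negatives(frame):
    """Cancel negatives in pairs, in surface order.

    frame: sequence of sentence-frame elements in surface order,
           e.g. ["everyone", "doubted", "not", "un-"].
    Returns (remaining_negatives, combinations_performed).
    """
    pending = None        # a negative still waiting for a partner
    combinations = 0      # number of pairs merged into an affirmative
    for item in frame:
        if item not in NEGATIVES:
            continue
        if pending is None:
            pending = item
        else:
            pending = None
            combinations += 1
    remaining = [] if pending is None else [pending]
    return remaining, combinations


def net_polarity(frame):
    """Overall direction of the frame after all pairs are combined."""
    remaining, _ = reduce_negatives(frame)
    return "negative" if remaining else "affirmative"


if __name__ == "__main__":
    for frame in (
        ["everyone", "believed"],
        ["no one", "doubted", "un-"],
        ["everyone", "doubted", "not", "un-"],
        ["no one", "doubted", "not", "un-"],
    ):
        print(frame, reduce_negatives(frame), net_polarity(frame))
```

In this sketch a 3-negative frame requires one combination and leaves one negative still to be applied, which is the point at which, on the overload account, the intermediate affirmative must be held while reasonableness is determined. The sketch says nothing about why some pairs (such as no one and doubted) combine more easily than others.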
Although the main interests of the present study were lexical negatives and multiple negation, the results also revealed rather consistent differences in difficulty among the various negatives employed. In Experiment I, the word doubted clearly emerged as the most troublesome of the four negatives that were used. Not was next, with no one and un- the least difficult. From a linguistic viewpoint the relative difficulty of doubt was unexpected, since it does not cause sentence negation, whereas not and no one both do (consider No one believes it either and John does not believe it either versus *John doubts it either). The nonpolarity of the meaning of doubt could be responsible, and, if so, this would further complicate any simple analysis of the difficulty presented by negation. Finally, the relative ease of un-, as compared to not, replicates the finding of Sherman (1973) and thus further strengthens the speculation that the “purpose” of negative prefixation in language is to facilitate comprehension.
APPENDIX
Stimulus Sentences Used in Experiment I (Schematics)
1. He was (6 ft.-5 in., 4 ft.-10 in.) tall, and thus (everyone, no one) (believed, doubted) that he would (not) be (un)comfortable with very tall girls.
2. Because he inevitably did (well, poorly) at all sorts of games of chance, everyone believed that he would be lucky at roulette.
3. Because the evidence was (forged, genuine), everyone believed that it would be questionable in a court of law.
4. Because he (often worked for hours at a time, usually worked for 10 min at a time), everyone believed that he was capable of sustained effort.
5. Because Mary’s personal appearance was always (immaculate, slovenly), everyone believed that she would be tidy in her housekeeping.
6. Bob was a very (sensitive, hard) person, and thus everyone believed that he would be hurt by criticism.
7. Because he was wearing (bright red clothing and it was a clear day, black clothing and it was a dark night), everyone believed that he would be visible to the enemy.
8. He always (thought out his decisions carefully before proceeding, acted on impulse and demanded to have his way), and thus everyone believed that he was reasonable in most of his actions.
9. He (loved, hated) little children, and thus everyone believed that he would be willing to perform at the children’s Christmas party.
10. He liked to (make decisions for the group, let others make decisions), and thus everyone believed that he would be suited for the director’s job.
11. Because he was quite (sober, drunk), everyone believed that his speech would be intelligible to the group.
12. John had repeatedly (made, lost) money on the stock market, and thus everyone believed that he would be successful in his latest stock venture.
13. The boy liked to (plan ahead, live only for the present), and thus everyone believed that he would be prepared for future events.
14. Because their defenses were extremely (weak, strong), everyone believed that they would be vulnerable to attack.
15. He had just (won a lot of money in a contest, lost a lot of money at poker), and everyone believed that he would be happy about this.
16. Because he had (strong moral and ethical convictions, a tendency to steal and embezzle), everyone believed that he would be honest in his business dealings.
REFERENCES
BEVER, T. G. The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language. New York: Wiley, 1970. Pp. 279-352.
CHASE, W. G., & CLARK, H. H. Mental operations in the comparison of sentences and pictures. In L. Gregg (Ed.), Cognition in learning and memory. New York: Wiley, 1972.
CHOMSKY, N. Syntactic structures. The Hague: Mouton, 1957.
CLARK, H. H. Linguistic processes in deductive reasoning. Psychological Review, 1969, 76, 387-404.
CLARK, H. H. How we understand negation. Paper presented at the COBRE Workshop on Cognitive Organization and Psychological Processes, Huntington Beach, California, 1970.
CLARK, H. H. The chronometric study of meaning components. Paper presented at the CRNS Colloque International sur les Problèmes Actuels de Psycholinguistique, Paris, 1971.
CLARK, H. H. The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 335-359.
FISHER, R. A., & YATES, F. Statistical tables for biological, agricultural and medical research. Edinburgh: Oliver & Boyd, 1943. 2nd ed.
FODOR, J. A., BEVER, T. G., & GARRETT, M. F. The psychology of language. New York: McGraw-Hill, 1974.
FOSS, D. J. Decision processes during sentence comprehension: Effects of lexical item difficulty and position upon decision times. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 457-462.
GREENBERG, J. H. Language universals. In T. A. Sebeok (Ed.), Current trends in linguistics. The Hague: Mouton, 1966. Vol. 3.
HAMILTON, H. W., & DEESE, J. Does linguistic marking have a psychological correlate? Journal of Verbal Learning and Verbal Behavior, 1971, 10, 707-714.
HOLMES, V. M. Order of main and subordinate clauses in sentence perception. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 285-293.
HUTTENLOCHER, J., & HIGGINS, E. T. Adjectives, comparatives, and syllogisms. Psychological Review, 1971, 78, 487-504.
JUST, M. A., & CARPENTER, P. A. Comprehension of negation with quantification. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 244-253.
KLIMA, E. S. Negation in English. In J. Fodor & J. Katz (Eds.), The structure of language. Englewood Cliffs, N.J.: Prentice-Hall, 1964.
KUČERA, H., & FRANCIS, W. N. Computational analysis of present-day American English. Providence, R.I.: Brown University, 1967.
MANN, J. W. Defining the unfavorable by denial. Journal of Verbal Learning and Verbal Behavior, 1968, 7, 760-766.
MEYER, D. E. Verifying affirmative and negative propositions: Effects of negation on memory retrieval. In S. Kornblum (Ed.), Attention and performance IV. New York: Academic Press, 1973. Pp. 379-394.
MILLER, G. A., & ISARD, S. Free recall of self-embedded English sentences. Information and Control, 1964, 7, 292-303.
SHERMAN, M. A. Bound to be easier? The negative prefix and sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 76-84.
STOLZ, W. S. A study of the ability to decode grammatically novel sentences. Journal of Verbal Learning and Verbal Behavior, 1967, 6, 867-873.
TRABASSO, T., ROLLINS, H., & SHAUGHNESSY, E. Storage and verification stages in processing concepts. Cognitive Psychology, 1971, 2, 239-289.
WINER, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1962.
ZIMMER, K. E. Affixal negation in English and other languages: An investigation of restricted productivity. Word, 1964, 20, No. 2, Monograph No. 5.
(Received October 1, 1974)