JOURNAL OF EXPERIMENTAL CHILD PSYCHOLOGY 30, 44-61 (1980)

Conditional Reasoning following Contradictory Evidence: A Developmental Analysis

DAVID O’BRIEN AND WILLIS F. OVERTON
Temple University
Recent evidence suggests that young adults do not correctly understand the logical relationship of the conditional (if p then q) as it applies to hypothesis testing, and most training procedures have not been productive. However, the introduction of contradictory evidence following faulty inferences has led to accurate inferences with conditional statements. Third and seventh grade and college students (9, 13, and 21 years of age, respectively) were tested to assess developmental differences in improvement following contradiction training and to test whether improved performance transfers to other conditional reasoning tasks. Significant improvement in conditional reasoning was found for the young adult group following the introduction of contradictory evidence, and the positive effect of the treatment transferred across tasks. The third grade students showed no effects from the introduction of the contradiction, but the seventh graders were often confused by it. Seventh grade and college student performances were generally worse than that of the third graders for the positive instances (p · q), but while the contradiction training improved college students’ performance, it did not affect the seventh graders. The results are discussed in terms of changes in cognitive structures.
A portion of this research was presented at the Biennial Meeting of the Society for Research in Child Development, San Francisco, 1979. The authors wish to express their appreciation to the principal, teachers, and children of the St. Barnabas School in Philadelphia, Pennsylvania. Requests for reprints should be sent to either author at the Department of Psychology, Temple University, Philadelphia, PA 19122.

Copyright © 1980 by Academic Press, Inc. All rights of reproduction in any form reserved.

Conditional reasoning has been of increasing interest to developmental psychology (e.g., Braine, 1978, 1979; Ennis, 1975, 1976; Kodroff & Roberge, 1975; Kuhn, 1977; Moshman, 1979; Staudenmayer & Bourne, 1977; Wildman & Fletcher, 1977). This interest is warranted, for the conditional relationship (i.e., if p then q) has a central role in scientific reasoning. The relationship between a set of scientific principles and a hypothesis is a conditional (Hempel, 1966; Popper, 1959), and reasoning from hypothetical propositions to a conclusion is governed by the conditional relationship (Leblanc & Wisdom, 1976). Popper (1959) noted that while particular positive instances cannot verify a conditional hypothesis, a single counterexample can falsify it. It is this asymmetry with which the present research is concerned. The conditional (p ⊃ q) is truth-functionally defined as true whenever both p and q are true (p · q), or whenever p is not true and q is true (p̄ · q), or whenever p and q are not true (p̄ · q̄). The only circumstances which make (p ⊃ q) false are when p is true and q is not true (p · q̄).

A basic and continuing issue in this area has concerned the performance of young adults on selection tasks that require recognition of those propositions that adequately test the truth of conditional statements (i.e., if p then q). The tasks employed generally require the selection, from among four propositional types (p, p̄, q, q̄), of those that can test the truth of the statement. For example, the statement “If a rod is thin, then it is flexible” results in the selection alternatives: (a) “thin rods” (p); (b) “flexible rods” (q); (c) “rods that are not thin” (p̄); and (d) “rods that are not flexible” (q̄). Although it is only by a falsification strategy that a test of the truth of a conditional statement can be accomplished, it has been found that even among young adults few select those propositions that can falsify the conditional (Evans, 1972; Wason, 1966, 1968; Wason & Johnson-Laird, 1972). In the example, (a) “thin rods” (p), and (d) “rods that are not flexible” (q̄), would be required to correctly test the statement, since these and only these could yield the disconfirming case of thin inflexible rods (p · q̄). Attempts to induce successful conditional testing have generally not been productive (Lunzer, Harrison, & Davey, 1972; Wason, 1968), and success with extremely concrete tasks has not generalized to tasks with abstract materials (Johnson-Laird, Legrenzi, & Legrenzi, 1972; Wason & Johnson-Laird, 1972).
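The truth-functional definition and the logic of the selection task can be made concrete with a small sketch. The Python below is an illustration only, not part of the original study; the function and variable names (`conditional`, `can_falsify`, `alternatives`) are assumptions introduced here. It enumerates the truth table of the conditional and checks which of the four selection alternatives could ever turn up the counterexample.

```python
from itertools import product


def conditional(p: bool, q: bool) -> bool:
    """Truth-functional 'if p then q': false only when p is true and q is false."""
    return (not p) or q


# Truth table over the four combinations of p and q.
truth_table = {(p, q): conditional(p, q) for p, q in product([True, False], repeat=2)}

# The four selection alternatives; each fixes the value of one variable
# and leaves the other unknown. (Names are illustrative assumptions.)
alternatives = {
    "p": ("p", True),
    "not-p": ("p", False),
    "q": ("q", True),
    "not-q": ("q", False),
}


def can_falsify(variable: str, value: bool) -> bool:
    """True if some value of the unknown variable yields a counterexample."""
    for hidden in (True, False):
        p, q = (value, hidden) if variable == "p" else (hidden, value)
        if not conditional(p, q):
            return True
    return False


informative = [name for name, (var, val) in alternatives.items() if can_falsify(var, val)]
print(informative)  # ['p', 'not-q']
```

Only the alternatives corresponding to p and to not-q can expose the single falsifying combination, which matches the selections described above for the thin-rod example.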
The absence of generalization from the concrete to abstract task presentations suggests that performance with the concrete material may frequently be based on interpretations other than that which would suggest insight into the logical structure of the conditional statement. The relationship between “if p then q” and (p ⊃ q) is not isomorphic. In ordinary language “if, then” statements are commonly used to indicate causal or temporal relations (Staudenmayer, 1975). For example, “If you study hard, then you will get a good grade” may be understood to mean that you get a good grade because you have studied hard, rather than being understood as an antecedent-consequent conditional relationship. Other authors have contended that “if, then” is often interpreted to indicate an equivalence or biconditional relationship (Geis & Zwicky, 1971; Knifong, 1974; Staudenmayer, 1975; Taplin, 1971). From this perspective it is asserted that in common usage “if, then” is often associated with such meanings as promises, and the truth of the consequent is understood to be restricted to the fulfillment of the antecedent clause. Thus, someone who is told that “if you mow the lawn, then you’ll get five dollars” expects to receive the money only if the lawn is mowed, despite the fact that the truth of the statement as a conditional is dependent only
upon the lack of a counterexample (i.e., mowing the lawn (p) and not being paid (q̄)) and cannot be tested whenever the lawn is not mowed (p̄) or the five dollars is forthcoming (q).

In light of these various interpretations that may be placed upon “if, then” statements, it is important to explore possible conditions that foster appropriate interpretations and successful performance. Wason (1964) found that young adults make correct inferences about conditional statements when early faulty inferences are followed by contradictory evidence. The procedure used, referred to here as the inference task, has been to present subjects with an incomplete conditional rule, such as “If a worker is ___ years of age, or older, then that worker will receive at least $350 each week.” This statement is followed by a series of exemplars of the rule, such as a 20-year-old who makes $50 each week, and a 60-year-old who makes $600 each week. The task requires that the subject state what can be inferred about the missing age in the rule following each exemplar. After the presentation of exemplars in which the monetary amount equals or exceeds that given in the rule (q), it is common for young adults to make the erroneous inference that the age in the rule does not exceed that in the exemplar. For example, following the exemplar of a 65-year-old who gets $550 each week, the frequent response is that the age in the rule is less than 65. While this response is consistent with a biconditional interpretation of the rule, it is not correct for a conditional interpretation. When this erroneous inference is followed by an exemplar that directly contradicts it, such as a 65-year-old who makes $200 each week, the error is usually eliminated in later trials. The evidence of the contradictory exemplar alerts the subject to the possibility of (p̄ · q), a realization that is necessary for conditional reasoning. 
Although this research suggests that young adults are capable of making appropriate interpretations and correct inferences concerning conditional statements, it does not establish whether insight gained from the contradictory evidence generalizes to other conditional reasoning tasks. In addition to the lack of evidence concerning generalization, research employing the contradictory training paradigm has been undertaken only with young adults, and has not addressed the issue of age differences in conditional reasoning abilities. Piaget’s theory suggests that the ability to reason with logical relationships systematically, i.e., as components of a whole structure, should not be expected until early adolescence with the development of formal operational thought structures (Inhelder & Piaget, 1958). The formal structure is combinatorial not only in the sense that it provides a structure of all possible combinations of binary truth-functional propositions, but also in the sense that the structure involves coordinated transformations that enable the thinker to systematically compare logical relationships to empirical evidence and to other rules. The ability to understand class inclusion relations is found at the concrete operational
stage, and the concrete thinker may thus be incorrectly attributed conditional reasoning skills (Inhelder & Piaget, 1958), but only the formal operational thinker with the complete combinatorial system should be able to infer both the inverse (p · q̄) and the reciprocal (q ⊃ p) to the conditional rule. Piaget directly hypothesizes that the formal operational structure is necessary in order to distinguish the conditional from the biconditional, and that only the formal operational thinker will seek a counterexample to test the truth of a conditional rule (Beth & Piaget, 1966, p. 181). The empirical literature suggests that there are substantial increases in conditional reasoning abilities at the beginning of adolescence reflected in performance with categorical and conditional syllogistic items (Roberge & Paulus, 1971), with transfer of concept-learning across rules (Bourne & O’Banion, 1971), with tasks requiring conditional interpretation of empirical evidence (Kuhn, 1977), with tasks directly training truth-function assignments (Staudenmayer & Bourne, 1977), and with tasks assessing truth-function assignments (Taplin, Staudenmayer, & Taddonio, 1974). Given this, the introduction of contradictory evidence in conditional reasoning tasks should result in improved performances only for those individuals old enough to possess formal operational cognitive structures.

The present study was designed to assess the effect of the contradictory training paradigm at the third grade, seventh grade, and young adult level, and to assess whether such training generalizes to other areas of conditional reasoning. A form of the inference task, described earlier, was used to introduce the contradictory evidence. To test for generalization to new tasks, a selection and an evaluation task were used. Both tasks involve the presentation of six conditional statements. 
Following each statement in the selection task, single propositions (p, p̄, q, q̄) are presented and the subject is asked if each proposition could test the truth of the statement. Following each statement in the evaluation task, proposition combinations (p · q, p̄ · q, p · q̄, p̄ · q̄) are presented and subjects are asked if the combination proves the rule true or false. Since conditional hypothesis testing requires falsification for deductive certainty, only the counterexample (p · q̄) should be evaluated as providing proof of the truth value of the statement. Given the statement “If someone gets a flu shot, then that person won’t get the flu” and the propositional combinations: (a) “Someone who got a flu shot and didn’t get the flu” (p · q); (b) “Someone who didn’t get a flu shot and didn’t get the flu” (p̄ · q); (c) “Someone who got a flu shot and got the flu” (p · q̄); and (d) “Someone who didn’t get a flu shot and got the flu” (p̄ · q̄), only the combination in (c) provides the necessary counterexample (p · q̄). It was expected that the contradictory training would provide insight into the logical relationship of the conditional for young adults and that this insight should generalize to the other two conditional reasoning tasks. It was also expected that such insight would not extend to third grade
children whose thought structure would lack formal operational characteristics. No specific predictions were made for the seventh grade target group.

METHOD

Subjects

Thirty third grade children (X̄ = 9 years, 9 months, SD = 5.5 months; 16 male, 14 female), 20 seventh grade children (X̄ = 13 years, 5 months, SD = 9.2 months; 11 male, 9 female), and 20 college students (X̄ = 21 years, 5 months, SD = 18.4 months; 12 male, 8 female) participated. The two younger groups were enrolled in a middle-class Catholic school in Philadelphia, Pennsylvania. The young adults were all students at Temple University in Philadelphia, Pennsylvania.

Tasks and Design

Inference task. Twenty subjects at each age were given the inference task (modified from Wason, 1964) which presented the rule “If a worker is ___ years of age, or older, then that person will receive at least $350 each week” (see Table 1). Each of 12 trials gave an age and salary for a single exemplar. The task was to decide which of three choices could be inferred from the exemplar about the missing age in the rule. For example, following the exemplar “There is a worker who is 25 years old and makes $200 each week,” it would be correct to assert that the information tells that “the age in the rule is more than 25” (Choice 1), while it would be incorrect to assert that the information shows “that the age in the rule is 25 at most” (Choice 2), or “nothing at all” (Choice 3). Following the exemplar “There is a worker who is 70 years old and makes $400 each week,” the correct assertion would be that the information tells “nothing at all” about the missing age in the rule. Subjects tend to follow exemplars in which the monetary amount is less than that in the stated rule with the conditionally correct assertion. However, subjects usually assert that those exemplars in which the amount given is equal to or exceeds the amount stated in the rule show that the missing age in the rule is less than that in the exemplar. It is not recognized that someone could make more than $350 (q) without exceeding the age in the exemplar. 
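The response logic of the inference task can be sketched in a few lines. The Python below is a hedged illustration rather than the authors' procedure: the function name `expected_choice` and the constant `SALARY_IN_RULE` are assumptions introduced here, and the exemplars are those for Trials 2-5 and 8, 9, and 11 as read from Table 1. Because the choice number depends only on whether the salary reaches the amount in the rule, the function takes the salary alone.

```python
SALARY_IN_RULE = 350  # "at least $350 each week"


def expected_choice(salary: int, interpretation: str) -> int:
    """Return the choice (1, 2, or 3) licensed by an interpretation of the rule.

    1: the age in the rule is more than the exemplar's age
    2: the age in the rule is the exemplar's age at most
    3: the exemplar tells nothing at all about the age in the rule
    """
    if interpretation not in ("conditional", "biconditional"):
        raise ValueError(f"unknown interpretation: {interpretation}")
    if salary < SALARY_IN_RULE:
        # not-q: the worker cannot have reached the age in the rule.
        return 1
    # q: uninformative under a conditional reading, but a biconditional
    # reader concludes the worker must have reached the age in the rule.
    return 3 if interpretation == "conditional" else 2


# (trial, age, salary) as read from Table 1; Trial 6 is omitted on purpose.
exemplars = [(2, 70, 400), (3, 20, 50), (4, 25, 200), (5, 65, 550),
             (8, 55, 450), (9, 50, 350), (11, 45, 500)]

# Trials on which the two interpretations call for different responses.
diverging = [trial for trial, _age, salary in exemplars
             if expected_choice(salary, "conditional")
             != expected_choice(salary, "biconditional")]
print(diverging)  # [2, 5, 8, 9, 11]
```

Under these assumptions the two interpretations part ways only on exemplars whose salary is $350 or more, which is why those trials are the diagnostic ones.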
Inspection of Table 1 shows that all ages below 45 are associated with salaries below $350 (q̄) and all ages above 45 are associated with salaries of $350 or above (q; except for Trial 6). For half of the subjects at each age level, Trial 6 provided a contradictory exemplar to the erroneous inference that was likely to have been made on previous trials, and was directly contradictory to any error made on Trial 5. The other half of the subjects at each age received no contradictory evidence. The association of ages below 45 with salaries below $350 and of ages above 45 with salaries of $350 or above sets subjects up to infer on Trial 6 that the age is below 40 when it in fact is 60 or older; thus the contradictory evidence. This exemplar can alert subjects to the possibility of (p̄ · q), a realization necessary for conditional reasoning. Subjects in the no contradictory evidence condition are never alerted to the possibility of (p̄ · q). There were, thus, two treatment groups. At each trial the information from the previous exemplars was provided in order to eliminate memory deficits as a possible performance variable. Trials 8, 9, and 11, following the critical evidence of Trial 6, are directly relevant to the effects of the contradictory training. While a biconditional interpretation should lead to Choice 2 (the age in the rule is at most the age of the exemplar), a conditional interpretation should lead to Choice 3 (nothing at all).

TABLE 1
AGE AND SALARY INSTANCES IN THE INFERENCE TASK

Rule: “If a worker is ___ years of age, or older, then that person will receive at least $350 each week.”

Instances: There is a worker who is 15 years old and makes $100 each week.

Trial   Age   Salary ($)
1       15    100
2       70    400
3       20    50
4       25    200
5       65    550
6 (a)   60    600
6 (b)   60    200
7       30    300
8       55    450
9       50    350
10      35    250
11      45    500
12      40    150

(a) Trial 6 without contradictory evidence.
(b) Trial 6 with contradictory evidence.

An additional 10 third grade students were given the inference task (half with contradictory Trial 6, half without contradictory evidence) with material that was more closely related to their life experience. The rule used was: “If someone is ___ years of age, or older, then that person will get an allowance of at least $3.50 each week.” In order to ensure that the performance of the third grade children did not suffer from extraneous logical or linguistic misunderstandings, each of the third grade children was pretested for a comprehension of the concepts “at most,” “at least,” and “more than” by making comparisons of collections of beans. All children were successful at comprehending these concepts.

Selection task. Following the inference task, half of each treatment group at each age was given a selection task. The selection task consisted of six conditional statements: (a) “If a worker is 70 years of age, or older,
then that person must be retired”; (b) “If a rod is thin, then it must be flexible”; (c) “If you get a flu shot, then you won’t get the flu”; (d) “If a card has a letter A on one side of it, then it has a number 3 on the other side of it”; (e) “If rilks are tall, then spritzers have teeth”; and (f) “If a student does the homework, then that student will get a good grade.” The first statement was always given first, while the order of presentation of the others was randomized. Following the presentation of each statement the subject was asked to decide for each of four propositional types (p, q, p̄, q̄) whether or not it could provide for a test of the truth of the statement. Thus, for example, the conditional statement (a) “If a worker is 70 years of age, or older, then that person must be retired” requires as correct responses the positive selection of (p), “workers 70 years of age, and older,” and of (q̄), “people who are still working,” as providing a test of the truth of the statement. This follows because only such selections can lead to finding a counterexample.

Evaluation task. Following the inference task, the half of each treatment group at each age that was not given the selection task received the evaluation task. This task presented the same six conditional statements as the selection task, with the same ordering. Following presentation of each statement the subject was presented with the four propositional combinations (p · q, p̄ · q, p · q̄, p̄ · q̄) in random order. The task was to decide for each propositional combination whether or not it proves the rule true or false. Thus, for example, the conditional statement (c) “If you get a flu shot, then you won’t get the flu” requires the assessment that the propositional combinations “people who get a flu shot and don’t get the flu” (p · q), “people who don’t get a flu shot and get the flu” (p̄ · q̄), and “people who don’t get a flu shot and don’t get the flu” (p̄ · q) do not prove the truth or falsity of the statement, while the combination “people who get a flu shot and get the flu” (p · q̄) proves it false.

Procedure

All tasks were administered individually in both written and oral form. Both the selection and evaluation tasks followed the inference task within the same testing session. The session began by having the subject informed that the research was concerned with the way students in different grades solved certain problems, and that all the information needed to answer each question correctly would be provided. The subject was assured there were no tricks involved. The subject was given a test booklet for the inference task that contained instructions on the cover. The subject read the instructions while the tester read them aloud. The instructions stated that there is a business that “has a rule about the amount of money it pays people each week. What we know is that IF SOMEONE IS ___ YEARS OF AGE, OR OLDER, THEN THAT PERSON WILL RECEIVE AT LEAST $350
EACH WEEK. We want to know what can be said about the missing age in the rule. I’m going to tell you the age and amount of money that different workers make. After I tell you the age and amount of money a worker makes, you tell me, on the basis of this information, what you can about the age in the rule. Do you understand?” The tester turned to the first page of the booklet (Trial 1). If the subject responded that the instructions had not been understood, the tester used Trial 1 as an example. If the subject responded that the instructions had been understood, the tester allowed the subject to respond to Trial 1. If the subject responded incorrectly (either a conditional or a biconditional interpretation would lead to the correct response, that the age is more than 15), the tester used the trial as an example. (This trial was not used in data analysis.) Each page of the booklet provided the rule with the missing age, a list of all the previous exemplars (see Table 1), a new exemplar, and a list of the three response choices. The subject was told that the response referred only to the information provided in that trial, i.e., the ages given in Choices 1 and 2 were those given in the exemplar for the trial. The tester read aloud all the information on each page, including the information from each previous exemplar, to ensure that the subject attended to the task as it progressed. The subject was periodically urged to think about the problem and not to rush to an answer. On the few occasions that a subject asked whether or not the amount of money was proportional to the age or referred to extraneous information, the tester responded that nothing should be inferred that is not in the stated rule.

Following the inference task each subject was given either the selection task or the evaluation task. For each task the subject was given a test booklet that contained instructions and the six problem conditional sentences. 
The tester read aloud the information that was before the subject. The instructions for the selection task were: “I am going to show you some rules. We don’t know if the rules are telling the truth or not. Each of them may be true or false. After you see each rule I am going to ask you to decide what you would need to look at to know if the rule was false or not.” The instructions were worded with emphasis on falsification in order to maximize the likelihood of appropriate responses. Following the presentation of each rule the subject was asked: “Could you find out if the rule is true or false if you looked at: . . .,” and was given a list of the four possible propositional types (p, q, p̄, q̄). The subject was required to answer yes or no for each proposition. The procedure of the evaluation task was the same, with the following instructions: “I am going to show you some rules. We don’t know if the rules are telling the truth or not. Each of them may be true or false. After you see each rule, I am going to ask you to decide what proves the rule true or false, and what does not prove the rule true or false.” Following presentation of each rule, the subject was asked: “Would you know if the
rule was true or false if you found: . . .,” and was given a list of propositional combinations (p · q, p̄ · q, p · q̄, p̄ · q̄). Each of the combinations required a yes or no answer, and did not require the subject to indicate whether it was truth or falsity which was proved. In both the selection and evaluation tasks the tester read aloud all of the information that was given the subject in written form, and the subject was periodically urged to think about the answer and not to rush to make a judgment.

RESULTS

Inference Task
Conditional and biconditional interpretations of the rule lead to different correct responses on Trials 8, 9, and 11. These are therefore the relevant trials following the critical Trial 6. Responses for the inference task were scored by giving one point for each correct response according to the conditional interpretation (that the information of the exemplar told nothing at all about the age in the rule) for the three relevant trials. There was no difference between the two forms of the inference task for the third grade children, t(15) = .18, and consequently, further analyses were computed with the same task material for all grade levels. A 3 (Grade) × 2 (Treatment) ANOVA computed for inference scores demonstrated a significant main effect for grade, F(2, 54) = 3.54, p < .05, and a significant Grade × Treatment interaction, F(2, 54) = 5.33, p < .01. The maximum score was 3, and the mean scores were: third grade, .45 without treatment and .40 with treatment; seventh grade, .50 without treatment and .40 with treatment; and college, 1.10 without treatment and 1.80 with treatment. Newman-Keuls comparisons of within-grade treatment effects were significant only for the college group, Q(54) = 4.37, p < .01, demonstrating that the contradiction treatment was effective in providing insight into the structure of the conditional relationship only for the age group originally tested by Wason (1964). Between-grades comparisons demonstrated that the college group performed significantly better than either the third or seventh grades only within the contradiction condition, Q(54) = 3.61, p < .05. These comparisons suggest that while young adults show improved performance, the third and seventh grade children have not yet developed the prerequisite cognitive structures necessary to benefit from the contradictory evidence. 
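The scoring rule for the inference task can be written out as a minimal sketch. The Python below is illustrative only; the function name `inference_score` and the response record are assumptions introduced here, and the subject shown is hypothetical rather than drawn from the study's data.

```python
RELEVANT_TRIALS = (8, 9, 11)  # trials following the critical Trial 6
NOTHING_AT_ALL = 3            # Choice 3, the conditionally correct response


def inference_score(responses: dict[int, int]) -> int:
    """One point per relevant trial answered with Choice 3 (maximum 3)."""
    return sum(1 for trial in RELEVANT_TRIALS
               if responses.get(trial) == NOTHING_AT_ALL)


# Hypothetical subject: Choice 2 on Trial 8, Choice 3 on Trials 9 and 11.
hypothetical_subject = {8: 2, 9: 3, 11: 3}
print(inference_score(hypothetical_subject))  # 2
```

A subject who responds consistently with a biconditional interpretation (Choice 2 on all three trials) would score 0 under this rule, and a consistently conditional subject would score 3.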
While the ANOVA demonstrates the effectiveness of the contradictory training for the young adults, it does not provide information on how the subjects were responding when not making a conditional interpretation of the task. Since subjects were often incorrect, it is particularly important to know if they were answering randomly or in accordance with a biconditional interpretation. In addition, it is important to know how subjects
responded before Trial 6. Trials 2, 3, 4, and 5 provide exemplars that give information corresponding to (q), (q̄), (q̄), (q), respectively. A conditional interpretation of the task would lead to responses for these four trials such that Trials 2 and 5 tell “nothing at all about the age in the rule,” while Trials 3 and 4 tell that “the age in the rule is more than (20 or 25).” A biconditional interpretation of the task would lead to the same responses for Trials 3 and 4, but Trials 2 and 5 would lead to the conclusion that “the age in the rule is (70 or 65) at most.” Each subject was placed in one of the following response classes: (a) conditional, if all four trials were given the correct response for the conditional interpretation; (b) biconditional, if all four trials were given the correct response for the biconditional interpretation; and (c) other. The numbers of subjects classified respectively as conditional, biconditional, and other were: 2, 11, and 7 for the third grade; 0, 12, and 8 for the seventh grade; and 3, 13, and 4 for the college group. The proportion of subjects responding consistently with either a conditional or a biconditional interpretation did not significantly differ for the three age groups, χ²(2) = .634, and the majority of subjects were rule following at all ages.

Selection Task
The selection task was scored by giving one point to each correct response for each propositional type (p, p̄, q, q̄), and across all six statements there could be a total of six points for each subject for each propositional type. To test for the generalization of the insight gained from the contradiction training, a 3 (Grade) × 2 (Treatment) × 4 (Proposition Type) ANOVA was computed with repeated measures on proposition type. There was a significant main effect for grade, F(2, 24) = 18.14, p < .01, and for proposition type, F(3, 72) = 57.07, p < .01, and a significant Grade × Treatment × Proposition Type interaction, F(6, 72) = 4.36, p < .01. Results of the selection task are summarized in Table 2. Because of the specific hypothesis concerning Grade × Treatment interaction, planned comparisons of within-grade treatment effects were computed. These analyses demonstrated that the contradiction treatment introduced in the earlier inference task had a generalized positive influence only for the college group, t(24) = 2.18, p < .05. Newman-Keuls comparisons of proposition types revealed that it is significantly easier to correctly select p than to correctly assess p̄, q, or q̄, Q(72) = 7.65, p < .01. Subjects also found it more difficult to correctly assess q than to deal correctly with the other proposition types, Q(72) = 9.69, p < .01. Comparisons at grade levels demonstrated that the college group performed better than either the third, Q(24) = 6.02, p < .01, or the seventh grade, Q(24) = 7.94, p < .01, and there was no significant difference between the third and seventh grade performances. Comparisons of the Proposition Type × Grade × Treatment interaction were consistent with the findings of grade and treatment effects, except that the introduction of the contradiction in the earlier inference task resulted in significantly poorer seventh grade performance for the p̄ proposition type, Q(72) = 5.11, p < .05, so that performance was better for the third grade than for the seventh grade following contradiction training, Q(72) = 6.62, p < .05.

TABLE 2
SELECTION TASK MEAN SCORES: GRADE × TREATMENT × PROPOSITION TYPE

                            Grade
Proposition type     Third   Seventh   College   Total

Contradiction group
  p                  5.0     5.4       5.8       5.4
  p̄                  5.4     1.6       5.4       4.1
  q                  1.2     1.4       3.4       2.0
  q̄                  3.0     3.4       4.4       3.6
  Totals             3.65    2.95      4.75      3.78

No contradiction group
  p                  4.6     5.6       6.0       5.4
  p̄                  3.6     4.0       4.0       3.87
  q                  1.6     0.6       2.2       1.47
  q̄                  3.0     2.8       4.4       3.4
  Totals             3.2     3.25      4.15      3.53

Note. Maximum score = 6.

In view of the influence of content on reasoning (Staudenmayer, 1975; Wason & Johnson-Laird, 1972) and in view of the varied content of the six statements used in the present task, an analysis was made of scores collapsed across proposition types rather than across statements. In this analysis each subject could score up to four points (one for each proposition type) for each statement. A 3 (Grade) × 2 (Treatment) × 6 (Statement) ANOVA was computed with repeated measures on statements. A significant main effect was found for grade, F(2, 24) = 17.85, p < .01, and for statements, F(5, 120) = 7.07, p < .01, and a significant interaction for Grade × Treatment × Statement, F(10, 120) = 7.38, p < .01. Newman-Keuls comparisons showed that the college group scored significantly higher than either of the other groups, Q(24) = 6.07, p < .01. The two statements “If rilks are tall, then spritzers have teeth” and “If a card has a letter A on one side of it, then it has a number 3 on the other side of it” were both significantly more difficult than each of the other statements except “If a rod is thin, then it is flexible,” Q(120) = 4.62, p < .01. Comparisons of cell means for the interaction of Grade × Treatment × Statement revealed that the beneficial effect of the inference task training
for the college students was not evident for the easiest statements, “If a worker is 70 years of age, or older, then that person must be retired” and “If you get a flu shot, then you won’t get the flu,” Q(120) = .752. College group performance for these two statements was significantly better even without the contradiction training, Q(120) = 8.85, p < .01. Inspection of all other cell means for the three-way interaction reveals no surprises, and it appears that the treatment and grade effects are not due to one or two of the six statements employed.

Neither of the ANOVAs computed for the selection task provides information as to what responses, other than the correct ones, were being made by the subjects, and in light of the overall number of incorrect responses it is important to know what proposition types were selected. Since there are four propositional types, there are 16 possible combinations of selections which can be made, e.g., no propositional types are selected, p is selected, p · q · q̄ are selected, etc. Of the 16 possible combinations of selections, only five were made with any regularity: p · q; p · q̄; p · p̄ · q; p · q · q̄; p · p̄ · q · q̄. Table 3 gives the frequency of each of these selection patterns for each grade and treatment group. It can be seen that subjects at all grade levels were systematic rather than random in the choices made. There was no significant difference in the proportion of responses that were in these five response classes rather than some other combination for the six Grade × Treatment groups, χ²(5) = .956. Inspection of Table 3 shows that both third and seventh graders tended to select the verifying proposition types (p · q), but following the contradiction training many of the seventh graders tended to select all four of the proposition types. The only group that selected the correct (p · q̄) proposition types with any regularity was the college group following the contradiction training.

TABLE 3
NUMBER OF SUBJECTS AT EACH GRADE SELECTING ALTERNATIVE PROPOSITION TYPES

Columns: Grade (Third, Seventh, College).
Rows, for the contradiction group and for the no contradiction group: p · q; p · q̄; p · p̄ · q; p · q · q̄; p · p̄ · q · q̄; all others.
[Table body not recoverable from the source. Surviving cell values, positions unknown: 10; 12, 4, 1; 8; 0, 5; 4, 5, 2, 12, 4, 3.]

Evaluation Task
The evaluation task was scored by giving one point to each correct response for each propositional combination (p · q, p̄ · q, p · q̄, p̄ · q̄), and across all six statements there could be a total of six points for each subject for each propositional combination. To test for the generalization of the insight gained from the contradiction training, a 3 (Grade) x 2 (Treatment) x 4 (Combination Type) ANOVA was computed with repeated measures on combination type. There was a significant main effect for combination type, F(3, 72) = 27.48, p < .01, and a significant interaction for Combination Type x Grade x Treatment, F(6, 72) = 3.51, p < .05. Results of the evaluation task are summarized in Table 4. Because of the specific hypothesis concerning Grade x Treatment interaction, planned comparisons of within-grade treatment effects were computed. These analyses demonstrated that the contradiction treatment introduced in the earlier inference task had a beneficial effect only for the college group, t(24) = 2.28, p < .05. Newman-Keuls comparisons of combination types demonstrated that the p · q combination type was significantly more difficult to correctly assess than each of the other types, Q(72) = 9.17, p < .01. Comparisons of cell means for the Combination Type x Grade x Treatment interaction revealed a pattern that is consistent with the expected results except for the p · q combination type. The three lowest cell means were for the p · q combination type, with both the seventh grade and college group incorrectly evaluating the verifying evidence, but while the contradiction training significantly improved the performance of the young adults, Q(72) = 5.24, p < .01, the seventh grade scores remained low.

TABLE 4
NUMBER OF SUBJECTS AT EACH GRADE LEVEL CHOOSING ALTERNATIVE COMBINATION TYPES AS PROOF OF THE CONDITIONAL
[Columns: Third, Seventh, College. Rows, for the contradiction group and for the no contradiction group: A; C; A & C; A & D; B & C; A, B, & C; A, B, C, & D. Cell frequencies are not legible in this copy.]
Note. A = p · q; B = p̄ · q; C = p · q̄; D = p̄ · q̄.

A second ANOVA was computed for the evaluation task, with scores collapsed across combination types rather than statements. In this analysis each subject can score a maximum of four points (one for each combination type) for each statement. A 3 (Grade) x 2 (Treatment) x 6 (Statement) ANOVA was computed with repeated measures on statements. A significant main effect was found for statements, F(5, 120) = 9.15, p < .01, and significant interactions were found for Statement x Grade, F(10, 120) = 13.95, p < .01, and for Statement x Treatment x Grade, F(10, 120) = 2.73, p < .01. Newman-Keuls comparisons showed that “If you get a flu shot, then you won’t get the flu” was significantly easier than each of the other statements, Q(120) = 4.17, p < .01. This is consistent with previous findings that task performances improve when the consequent (q) is presented as a negated proposition (Evans, 1972; Evans & Lynch, 1973; Wason, 1977), and Evans argues that this is the result of a “matching bias” and represents the right answer for the wrong reason.
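The two ways of scoring the evaluation task described above (collapsing across statements, maximum 6 per combination type, versus collapsing across combination types, maximum 4 per statement) can be illustrated with a minimal sketch. It assumes, purely for illustration, that one subject's responses are stored as a 6 x 4 grid of 0/1 correctness values; the function names, labels, and example data are hypothetical and are not taken from the study.

```python
# Hypothetical sketch: the data layout, names, and values are assumptions
# made for illustration, not the study's actual materials or data.
COMBINATION_TYPES = ["p.q", "not-p.q", "p.not-q", "not-p.not-q"]


def score_by_combination(correct):
    """Collapse across statements: one score per combination type (max 6)."""
    return {
        combo: sum(row[j] for row in correct)
        for j, combo in enumerate(COMBINATION_TYPES)
    }


def score_by_statement(correct):
    """Collapse across combination types: one score per statement (max 4)."""
    return [sum(row) for row in correct]


# Made-up subject: 6 statements (rows) x 4 combination types (columns),
# where 1 marks a correct evaluation and 0 an incorrect one.
example_subject = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
]

by_combination = score_by_combination(example_subject)
by_statement = score_by_statement(example_subject)
```

Both summaries are built from the same 24 responses, so their grand totals always agree; the two analyses differ only in which factor is retained as the repeated measure.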
Comparisons of cell means for the Statement x Grade interactions revealed no unexpected findings except that the college group did more poorly than the third or seventh grades for the statement “If a worker is 70 years of age, or older, then that person must be retired,” Q(120) = 2.12, p < .01. Comparison of means for the interaction of Statement x Treatment x Grade showed that the college group was significantly better on the “If a worker . . .” statement following contradiction training, Q(120) = 2.81, p < .05. It is difficult to know how to interpret this finding, because the same statement was so easy in the selection task for the college students that the benefits of the contradiction training were not required. Since this statement was always presented first, it is likely that it represents an order effect. The evaluation task is less complex than the selection task, and it is likely that the apparent simplicity of the evaluation task leads college students to respond on the first trial without reflecting on the implications of their choice. Inspection of all other cell means for
the 3-way interaction reveals no surprises, and the grade and treatment effects for the task are not limited to a small subset of statements.

Since there are four combination types, there are 16 possible response classes that could be made. For example, a subject might respond that only the p · q combination proves the statement’s truth-value, or that both the p · q and p · q̄ combinations prove the statement’s truth-value. Of the 16 possible response classes, only seven were found with any regularity, and these are presented in Table 5. The only group to choose the p · q̄ combination with any regularity was the college group following the contradiction training.

TABLE 5
EVALUATION TASK MEAN SCORES: GRADE x TREATMENT x PROPOSITIONAL COMBINATION TYPE

                                         Grade
Proposition combinations      Third   Seventh   College   Total
Contradiction group
  p · q                        2.4      .2       2.2      1.6
  p̄ · q                        3.8     4.2       4.2      4.1
  p · q̄                        4       2.8       5.8      4.2
  p̄ · q̄                        3       4.4       5        4.1
  Totals                       3.3     2.9       4.3      3.5
No contradiction group
  p · q                        1.6      .4        .8       .9
  p̄ · q                        4       2.6       3.2      3.3
  p · q̄                        2.4     4         4.4      3.6
  p̄ · q̄                        4.2     4         4.4      4.2
  Totals                       3.05    2.75      3.2      3.0

Note. Maximum score = 6.

DISCUSSION

The results replicate earlier findings indicating that introduction of contradictory evidence significantly improves performance for the young adults in making conditional inferences (Wason, 1964), and that young adults without such information do not select the correct propositional types with which hypothesis testing can take place (Wason & Johnson-Laird, 1972). Further, it is clear that the insight gained with the introduction of contradictory evidence in the inference task generalizes to other forms of conditional reasoning for young adults. Taken together, this evidence suggests that young adults do have the competence to engage successfully in hypothesis testing. It seems unlikely that understanding of the conditional relationship is learned in the single trial in which the
contradictory evidence is presented, but rather, it is more reasonable that the subject is alerted to the asymmetry of the relationship and alters the strategy employed. Stone and Day (1978) have suggested that formal operational thinking is not always spontaneous, but that latent formal operational strategies may be invoked by many thinkers following appropriate informational cues. The young adults in the present study demonstrated such latent formal operational thinking, but this does not appear to be the case for the children at the two younger age levels.

The two findings, that (a) the introduction of the contradictory evidence increases the occurrence of the error of selecting the p̄ proposition type by seventh graders, and (b) the third grade students perform better than the seventh graders in evaluating the p · q combination type, were unexpected. Either of these results taken alone might suggest chance effects, but taken together they suggest that the seventh grade group may be undergoing a transition in cognitive organization. The increased evaluation of the p · q instance combination as verifying the conditional rule may represent an increased awareness of the importance of empirical evidence that the third grade children lack. Within the Piagetian tradition of investigating the child as scientist, this view of the place of empirical evidence is parallel to the faith placed in induction by scientists prior to Hume’s revelation that induction is logically invalid. The young adults also seem to share this tendency to choose the p · q combination, but this verification strategy is not employed following the contradiction training. While the third grade children are unaffected by the introduction of the contradictory evidence, the increased selection of the p̄ proposition type following the contradiction training demonstrates that the seventh graders were at least confused by it.
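The asymmetry of the conditional relationship referred to above, and the count of 16 possible response classes in the evaluation task, can be made concrete with a short sketch. It assumes the standard truth-functional reading of “if p then q” and of the biconditional; that formalization and the function names are assumptions made for illustration, not something specified in the study.

```python
from itertools import combinations, product


def conditional(p, q):
    """'If p then q' under the truth-functional reading: false only for p and not-q."""
    return (not p) or q


def biconditional(p, q):
    """'p if and only if q': false whenever p and q differ."""
    return p == q


def falsifying_instances(rule):
    """Return the (p, q) truth-value pairs that make the rule false."""
    return [(p, q) for p, q in product([True, False], repeat=2) if not rule(p, q)]


# The four combination types: (p, q), (p, not-q), (not-p, q), (not-p, not-q).
combination_types = list(product([True, False], repeat=2))

conditional_falsifiers = falsifying_instances(conditional)
biconditional_falsifiers = falsifying_instances(biconditional)

# Every subset of the four combination types is one possible response class.
response_classes = [
    subset
    for size in range(len(combination_types) + 1)
    for subset in combinations(combination_types, size)
]
```

Under these assumptions only the p · q̄ instance falsifies the conditional, whereas the biconditional is also falsified by p̄ · q, so a single instance of the latter kind is compatible with the conditional while contradicting a biconditional interpretation.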
The seventh graders’ confusion is further illustrated by the fact that, on the inference task, six of the ten seventh graders who received the contradictory evidence on Trial 6 responded that the information provided on Trial 6 told them “nothing at all” about the missing age in the rule, even though the inference “that the age in the rule is more than 65” is valid. Neither the third grade nor college students showed this pattern of responding.

The contradiction training paradigm is a particularly appropriate method with which to improve the young adult’s performance on tasks with conditional statements. The interpretation of an “if, then” statement as expressing an equivalence or a biconditional relationship is a hypothesis as to the meaning of the statement. Given such an interpretation, the errors made in the inference task following exemplars with monetary values exceeding the amount given in the rule are reasonable. The instances in the trials preceding the contradictory evidence of Trial 6 are arranged in such a way as to support an equivalence relationship, for the ages and monetary amounts progress towards an apparent final solution to the problem with a biconditional interpretation (see Table 1). The
logic of hypothesis testing, though, precludes the possibility of proving such a hypothesis, no matter how many positive instances are provided. The single counterexample to the biconditional interpretation provided by the evidence of Trial 6 is sufficient to prove the biconditional hypothesis false. As one college student stated: “So that’s what you’re after!” With the single piece of contradictory evidence sufficient to disprove the original interpretation of the “if, then” statement, generalization of the insight to other conditional reasoning tasks that require recognition of a counterexample and a falsification strategy should be expected. Earlier findings (e.g., Moshman, 1979) that few subjects employ nonverification strategies may not be indicative of the competence of young adults; rather, contradiction training may be necessary to elicit falsification.

REFERENCES

Beth, E. W., & Piaget, J. Mathematical epistemology and psychology. Dordrecht: Reidel, 1966.
Bourne, L. E., Jr., & O’Banion, K. Conceptual rule learning and chronological age. Developmental Psychology, 1971, 5, 525-534.
Braine, M. D. S. On the relation between the natural logic of reasoning and the standard logic. Psychological Review, 1978, 85, 1-21.
Braine, M. D. S. If-then and strict implication: A response to Gandy’s note. Psychological Review, 1979, 86, 154-156.
Ennis, R. H. Children’s ability to handle Piaget’s propositional logic: A conceptual critique. Review of Educational Research, 1975, 45, 1-41.
Ennis, R. H. An alternative to Piaget’s conceptualization of logical competence. Child Development, 1976, 47, 903-919.
Evans, J. St. B. T. Interpretation and matching bias in a reasoning task. Quarterly Journal of Experimental Psychology, 1972, 24, 193-199.
Evans, J. St. B. T., & Lynch, J. S. Matching bias in the selection task. British Journal of Psychology, 1973, 64, 391-397.
Falmagne, R. Reasoning: Representation and process in children and adults. Hillsdale, N.J.: Erlbaum, 1975.
Geis, M. L., & Zwicky, A. M. On invited inferences. Linguistic Inquiry, 1971, 2, 561-566.
Hempel, C. G. Philosophy of natural science. Englewood Cliffs, N.J.: Prentice-Hall, 1966.
Inhelder, B., & Piaget, J. The growth of logical thinking from childhood to adolescence. New York: Basic Books, 1958.
Johnson-Laird, P. N., Legrenzi, P., & Legrenzi, M. S. Reasoning and a sense of reality. British Journal of Psychology, 1972, 63, 395-400.
Knifong, J. D. Logical abilities of young children-two styles of approach. Child Development, 1974, 45, 78-83.
Kodroff, J. K., & Roberge, J. J. Developmental analysis of the conditional reasoning abilities of primary-grade children. Developmental Psychology, 1975, 11, 21-28.
Kuhn, D. Conditional reasoning in children. Developmental Psychology, 1977, 13, 342-353.
Leblanc, H., & Wisdom, W. A. Deductive logic. Boston: Allyn and Bacon, 1976.
Lunzer, E. A., Harrison, C., & Davey, M. The four-card problem and the generality of formal reasoning. Quarterly Journal of Experimental Psychology, 1972, 24, 326-339.
Moshman, D. Development of formal hypothesis-testing ability. Developmental Psychology, 1979, 15, 104-112.
Popper, K. The logic of scientific discovery. New York: Basic Books, 1959.
Roberge, J. J., & Paulus, D. H. Developmental patterns for children’s class and conditional reasoning abilities. Developmental Psychology, 1971, 4, 191-200.
Staudenmayer, H. Understanding conditional reasoning with meaningful propositions. In R. Falmagne (Ed.), Reasoning: Representation and process in children and adults. Hillsdale, N.J.: Erlbaum, 1975.
Staudenmayer, H., & Bourne, L. E., Jr. Learning to interpret conditional sentences: A developmental study. Developmental Psychology, 1977, 13, 616-623.
Stone, C. A., & Day, M. C. Levels of availability of a formal operational strategy. Child Development, 1978, 49, 1054-1065.
Taplin, J. E. Reasoning with conditional sentences. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 218-225.
Taplin, J. E., Staudenmayer, H., & Taddonio, J. L. Developmental changes in conditional reasoning: Linguistic or logical? Journal of Experimental Child Psychology, 1974, 17, 360-373.
Wason, P. C. The effect of self-contradiction on fallacious reasoning. Quarterly Journal of Experimental Psychology, 1964, 16, 30-34.
Wason, P. C. Reasoning. In B. Foss (Ed.), New horizons in psychology. Harmondsworth: Penguin Books, 1966.
Wason, P. C. Reasoning about a rule. Quarterly Journal of Experimental Psychology, 1968, 20, 273-281.
Wason, P. C. Theory of formal operations: A critique. In B. A. Geber (Ed.), Piaget and knowing: Studies in genetic epistemology. London: Routledge and Kegan Paul, 1977.
Wason, P. C., & Johnson-Laird, P. N. Psychology of reasoning: Structure and content. Cambridge, Mass.: Harvard University Press, 1972.
Wildman, T. M., & Fletcher, H. J. Developmental increases and decreases in solutions of conditional syllogistic problems. Developmental Psychology, 1977, 13, 630-636.

RECEIVED: June 13, 1979; REVISED: September 18, 1979.