Children's interpretations of covariation data: Explanations reveal understanding of relevant comparisons


Learning and Instruction 59 (2019) 13–20


Learning and Instruction journal homepage: www.elsevier.com/locate/learninstruc


Andrea Saffran (a,∗), Petra Barchfeld (a), Martha W. Alibali (b), Kristina Reiss (c), Beate Sodian (a)

(a) LMU Munich, Germany
(b) University of Wisconsin-Madison, USA
(c) Technical University of Munich, Germany

ARTICLE INFO

Keywords: Scientific reasoning; Evidence evaluation; Data interpretation; Statistical reasoning; Development

ABSTRACT

This research investigates children's understanding of the significance of comparisons between data categories for judgments of covariation. Past studies showed that children sometimes neglect some of the relevant data categories. This may occur because children fail to understand the relevance of the comparisons between data categories. To investigate this interpretation, 51 second graders and 43 fourth graders were tested in a between-subject design. In the standard condition, children were asked to explain their own covariation judgments. In the explain-correct condition, children were told the correct judgments and asked to explain them. Children in the explain-correct condition often provided explanations that were consistent with the correct judgments; children in the standard condition did so less often. Thus, when asked to explain correct judgments, elementary school children's explanations reveal that they possess a basic conceptual understanding of inference from covariation data.

∗ Corresponding author. Department of Psychology, LMU Munich, Leopoldstr. 13, 80802, Munich, Germany. E-mail address: andrea.saff[email protected] (A. Saffran).

https://doi.org/10.1016/j.learninstruc.2018.09.003
Received 16 March 2017; Received in revised form 30 August 2018; Accepted 7 September 2018
0959-4752/© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Imagine you want to observe birds in the park. You realize that on some days, there are birds to observe and on other days, there are no birds to observe. You wonder why that might be. You think that the weather might play a role. Suppose you keep track for a period of 60 days. You tabulate your data in a 2 × 2 contingency table showing the number of days there were birds and the number of days there were no birds, along with the number of sunny days and not sunny days. The ability to intuitively analyze such data is an important scientific reasoning skill. It requires, minimally, a conceptual understanding of the comparisons between cells that are necessary to draw an inference about the association between sunny days and the presence of birds. The present research investigates children's abilities to make explicit covariation judgments based on data presented in 2 × 2 contingency tables (see Fig. 1).

Fig. 1. Labeled 2 × 2 contingency table.

Research with adults indicates that interpreting covariation data is a challenging task. Judgment accuracy is often poor (e.g., Batanero, Estepa, Godino, & Green, 1996; Osterhaus, Magee, Saffran, & Alibali, 2018; Shaklee & Elek, 1988) and inadequate strategies are common (e.g., Batanero et al., 1996; Mata, Garcia-Marques, Ferreira, & Mendonça, 2015; Osterhaus et al., 2018; Shaklee, 1983; Shaklee & Hall, 1983; Shaklee & Mims, 1982; Shaklee & Tucker, 1980; Shaklee & Wasserman, 1986). A prominent problem is the tendency to neglect parts of the data. For example, Shaklee and colleagues showed in several studies that adults base their judgments mostly on cells A and B of the table, thus neglecting cells C and D (e.g., Shaklee & Hall, 1983; Shaklee & Mims, 1982; Shaklee & Tucker, 1980). This finding is in line with the result that people weight the four cells of the table in descending order for their judgments (Levin, Wasserman, & Kao, 1993); that is, the value in cell A influenced their judgments most, followed by cells B, C, and D.

The few existing studies of children also indicate a strong tendency to neglect substantial portions of the data as well as poor judgment accuracy (Obersteiner, Bernhard, & Reiss, 2015; Shaklee & Mims, 1981; Shaklee & Paszek, 1985). For example, in Shaklee and Paszek’s (1985) study, 16.2% of second graders and 17.6% of fourth graders based their judgments only on cell A, and 40.5% of second graders and 70.6% of fourth graders based their judgments on cells A and B. In contrast, only 2.7% of second graders and 5.9% of fourth graders considered all four cells of the contingency table. More recently, Saffran, Barchfeld, Sodian, and Alibali (2016) found that, under facilitating task conditions that highlighted comparisons between rows, elementary school children referred more often to four cells (on about six out of nine items) than to two cells (on about three out of nine items) when explaining how they reached judgments about covariation data presented in contingency tables. Although these data point to more comprehensive reasoning than the results reported by Shaklee and Paszek (1985), perhaps due to different methodological approaches, it remains the case that data category neglect and poor judgment accuracy are prevalent in children's interpretations of covariation data.

Since children often make incorrect judgments of covariation when presented with contingency tables, their post-hoc justifications are often flawed as well (Saffran et al., 2016). They may fail to take parts of the data into account because their initial judgments rested solely on information from two cells and they attempt to be consistent with those initial judgments. Even if they consider all four cells, they may not integrate the information from the cells in an appropriate way. However, despite these flaws in their reasoning, children may be aware, in principle, of the significance of all four data categories. That is, they may understand the relevance of the comparisons between data categories, but extraneous factors, such as limited working memory or inadequate executive control, may prevent them from taking all data into account or integrating the data appropriately. If this is the case, then children should be able to display their understanding in an alternative task that has lower task demands.

The present study examines whether children possess a basic conceptual understanding of the significance of comparisons between categories of covariation data presented in 2 × 2 contingency tables. To address this question, we eliminated some of the task demands involved in deriving an inference about covariation. If children are presented with the correct judgment and asked to explain why that judgment is correct, then they should be able to display their basic, conceptual understanding of the relations between the data categories, without having to apply a mathematically correct integration rule.

This approach builds on the self-explanation technique discussed in the education literature. In self-explanation studies, students are prompted to explain learning materials (e.g., worked-out examples, one's own problem solving efforts) to themselves. This technique yields positive effects on learning outcomes across a wide range of domains (for reviews, see Fonseca & Chi, 2011; Rittle-Johnson & Loehr, 2016). Although much of this research focuses on children's explanations of their own ideas, some studies have shown that self-explaining why correct information is correct or why incorrect information is incorrect is especially likely to enhance learning (Siegler, 2002; Siegler & Chen, 2008). Kuhn and Katz (2009) argued that explaining why a correct judgment is correct may help to divert attention from evidence that is in line with one's preexisting ideas and as such, it may help reasoners to take alternative evidence into account. Following this line of reasoning, we propose to use self-explanations of correct judgments to investigate children's understanding of covariation data. We hypothesize that correct judgments may encourage children to focus on all relevant comparisons between cells, and thus overcome their tendency to neglect parts of the data. Thus, we propose to use self-explanations of correct judgments as a method to investigate children's conceptual understanding under optimal conditions (i.e., under reduced information processing demands). This research will therefore yield important data about children's conceptual understanding that is needed to support future efforts to develop methods for promoting learning of covariation.

We focus on elementary school children (Grades 2 and 4) because it has been shown that data category neglect is pronounced in this age group (e.g., Shaklee & Paszek, 1985). To classify children's explanations, we concentrate on explanations that are consistent with the correct judgment as an indicator of conceptual understanding of covariation data. We define consistent explanations as explanations that are suitable to explain a correct judgment for a given covariation problem. The number and types of comparisons between data categories that are needed to make an explanation consistent with the correct judgment vary depending on the data pattern. For instance, comparing differences (e.g., (A-B) vs. (C-D)) is consistent with the correct judgment for an item with the numerical structure A = 18, B = 15, C = 18, and D = 4 (see Item 3 in Fig. 3), but not for an item with the numerical structure A = 30, B = 11, C = 20, and D = 1 (see Item 8 in Fig. 3).

Using a between-subject design, we compare a standard condition, in which participants were asked to provide and explain their own judgments (explain-own condition), and an explain-correct condition, in which participants were asked to explain provided correct judgments. We expect children to provide more explanations that are consistent with the correct judgment when asked to explain correct judgments (explain-correct condition) than when asked to provide and explain their own judgments (explain-own condition). In light of prior work (Obersteiner et al., 2015; Saffran et al., 2016; Shaklee & Mims, 1981; Shaklee & Paszek, 1985), we predict that children will display low judgment accuracy and few explanations consistent with the correct judgments in the explain-own condition. If children produce comparable numbers of consistent explanations in the explain-correct condition, this would indicate that children fail to understand the meaning of the data categories, even with considerable task support. If children produce more consistent explanations in the explain-correct condition than in the explain-own condition, this would suggest that children's underlying conceptual understanding was masked in previous studies by other factors, such as processing demands.

2. Methods

2.1. Participants

Participants were 94 children, including 51 second graders (26 male; mean age M = 8.23 years, SD = 0.32, range = 7.59–8.87), and 43 fourth graders (29 male; mean age M = 10.24 years, SD = 0.39, range = 9.46–10.88). Children were recruited from four elementary schools in Munich, Germany.

2.2. Design

A 2 × 2 between-subject design with task condition and grade level as factors was used. To ensure that the experimental groups were comparable with respect to their data interpretation abilities, a paper-and-pencil pretest was administered in class four to ten weeks before the individual interview sessions. In this test, children were presented with a story context about the effectiveness of crèmes against pimples and were asked to decide for eight covariation problems which of two crèmes was more effective or if there was no difference. The experimental groups were matched based on their mean pretest performance and gender within each grade level. Table 1 shows the number of participants in each group.

Table 1
Number of Participants in Each Group.

| Condition | Grade 2 | Grade 4 | All |
|---|---|---|---|
| Explain-own | 25 | 20 | 45 |
| Explain-correct | 26 | 23 | 49 |
| All | 51 | 43 | 94 |

2.3. Materials

Two series of nine pictures with 2 × 2 contingency tables were used. The context story was about the association between different varieties of apples and apple juice color (light vs. dark). The rows and columns of the tables were labeled with small illustrations, indicating the levels of the variables. The rows were labeled “variety A” and “variety B” and the columns were labeled “light-colored apple juice” and “dark-colored apple juice”. The cell frequencies were depicted with small illustrations and numbers (see Fig. 2).

Fig. 2. Sample item.

The covariation problems were the same in both conditions (see Fig. 3). Numbers ranged from 1 to 30. There were three problems depicting no relationship (Items 5, 6, 9), three problems depicting positive relationships (Items 1, 4, 7), and three problems depicting negative relationships (Items 2, 3, 8). The numerical structure differed with respect to several characteristics, such as the saliency of the relation between values. Five data sets included the same cell value twice (Items 2, 3, 5, 6, 9) and the values in Item 9 were simple multiples. For the other items, relations were not as salient but could easily be estimated (e.g., 13 is about two times 6 in Item 1). The difference between the cells in the two rows and the cells in the two columns was the same for Items 5, 6, and 8, while this difference varied for the other data sets. The covariation problems were presented in two orders (order A: 1, 5, 3, 9, 4, 8, 6, 7, 2; order B: 1, 2, 7, 6, 8, 4, 9, 3, 5). Half of the children in each group received order A and the other half order B.

2.4. Procedure

Children were interviewed individually in a quiet place at their school. The items were presented on a laptop and the interviews were video-recorded and transcribed. Children in both conditions were told that they would hear a story about scientists and that there would be questions about the scientists and their research results. In both task conditions, the interview started with the context story about a company that wants to produce light-colored apple juice because it is selling better than dark-colored apple juice. They asked nine scientists to test the associations between different varieties of apples and apple juice color. Each scientist examined bottles of light- and dark-colored apple juice and checked with which of two varieties of apples (e.g., variety A vs. variety B) it was produced. After this introduction, the interviewer explained the meanings of the rows and columns based on a sample contingency table, which did not contain data:

“Each scientist writes down his observations in such a table. In the first row (interviewer pointed to the first row), he notes apple juice bottles from variety A and in the second row (interviewer pointed to the second row), he notes apple juice bottles from variety B. In the first column (interviewer pointed to the first column), he notes bottles with light-colored apple juice and in the second column (interviewer pointed to the second column), he notes bottles with dark-colored apple juice.”

Two control questions were administered to make sure that the participants understood the meanings of the cells (“In which cell do the scientists write down light-colored apple juice from variety A?”, “In which cell do the scientists write down dark-colored apple juice from variety B?”). Children who failed these questions received a second explanation of the table and a second set of control questions. After passing the control questions, the experiments and results of the nine scientists were presented one at a time.

In the explain-own condition, children were asked for a judgment (“The company wants to produce light-colored apple juice. Is it better to use apples from variety A or is it better to use apples from variety B or does it make no difference?”) and for an explanation (“Where do you see in the table, that (answer to judgment question)?”).

In the explain-correct condition, children were not asked for judgments. Instead, the interviewer told them the correct judgment for each table by saying that the scientist had looked at the data and is certain that, e.g., it is better to use apples from variety A to produce light-colored apple juice. Then, children were asked to provide an explanation for the given correct judgment with reference to the data table: “Where do you see in the table, that (correct judgment)?” In some cases, children did not believe that the judgment that they were told was true. In such cases, children received a prompt, e.g., “The scientist is very certain that it is better to use apples from variety A to produce light-colored apple juice,” and were asked again for an explanation of the correct judgment. After this procedure, the interview continued even if children still did not believe the correct judgment.

2.5. Coding

In the explain-own condition, we first coded children's judgments as correct or incorrect. Next, we coded the content of the explanations children provided in each condition (see Table 2 for sample explanations for each content category). Explanations that referred to four cells were divided into comparison of ratios and comparison of differences. The category comparison of ratios encompasses comparisons of conditional probabilities (e.g., A/(A + B) vs. C/(C + D)) and also comparisons of simple proportions (e.g., A/B vs. C/D). Note that correct comparisons of ratios lead to correct judgments for all covariation data patterns. The category comparison of differences includes row- and column-wise comparisons of differences (e.g., (A-B) vs. (C-D), (A-C) vs. (B-D)). Explanations in this category lead to the correct judgment for only a subset of the data patterns. Comparisons of two cells were divided based on which cells they referred to (A vs. B; C vs. D; A vs. C; B vs. D).

Fig. 3. Cell Frequencies for the nine covariation data sets.
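The distinction between ratio-based and difference-based comparisons can be illustrated with a minimal sketch. It uses only the cell frequencies that the text states explicitly (Items 1, 3, and 8) and assumes the cell labels A to D run row by row (A and B for variety A, C and D for variety B), as described for Fig. 1 and Table 2. The names `Table`, `judge_by_ratios`, and `judge_by_differences` are hypothetical helpers introduced for illustration; they are not part of the study materials.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """Cell frequencies of a 2 x 2 contingency table (hypothetical helper)."""
    a: int  # variety A, light-colored juice
    b: int  # variety A, dark-colored juice
    c: int  # variety B, light-colored juice
    d: int  # variety B, dark-colored juice


def sign(x: int) -> int:
    """Return 1, 0, or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def judge_by_ratios(t: Table) -> int:
    """Compare A/(A+B) with C/(C+D) by cross-multiplication (avoids floats)."""
    return sign(t.a * (t.c + t.d) - t.c * (t.a + t.b))


def judge_by_differences(t: Table) -> int:
    """Compare (A-B) with (C-D)."""
    return sign((t.a - t.b) - (t.c - t.d))


# 1: variety A yields more light juice, -1: variety B does, 0: no difference
LABELS = {1: "variety A", -1: "variety B", 0: "no difference"}

# Only the items whose cell values are given in the text.
ITEMS = {
    1: Table(13, 6, 10, 18),
    3: Table(18, 15, 18, 4),
    8: Table(30, 11, 20, 1),
}

for number, table in ITEMS.items():
    correct = judge_by_ratios(table)      # ratio comparison gives the correct judgment
    by_diff = judge_by_differences(table)
    status = "consistent" if by_diff == correct else "inconsistent"
    print(f"Item {number}: ratios -> {LABELS[correct]}, "
          f"differences -> {LABELS[by_diff]} ({status})")
```

Under these assumptions, the difference comparison agrees with the ratio comparison for Items 1 and 3 but not for Item 8, where both rows differ by 19 even though the proportions differ.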


Table 2
Content Categories with Sample Explanations.

| Content category | Explanation type | Sample explanation |
|---|---|---|
| Comparison of ratios | ratios are compared | “because there (first row) are 20 light ones and 10 dark ones and 10 is exactly half of 20. In the second row, there are only 10 light ones and 5 dark ones and 5 is also half of 10” |
| Comparison of differences | differences are compared (no proportional reasoning) | “because here (cell C) are fewer light ones than here (cell A), but there (cell B) are also fewer dark ones than there (cell D)” |
| A vs. B | reference only to the first row (cells A and B) | “because there is more light-colored apple juice (cell A) than dark-colored juice (cell B)” |
| C vs. D | reference only to the second row (cells C and D) | “because here (cell D) are more than here (cell C)” |
| A vs. C | reference only to the first column (cells A and C) | “because here are 13 (cell A) and there are only 10 (cell C)” |
| B vs. D | reference only to the second column (cells B and D) | “because here (cell D) is more dark-colored apple juice than in there (cell B)” |
| Disbelieved | children do not believe the judgment they are told by the interviewer (only in the explain-correct condition) | |
| Other | does not fit in any content category | |

Note. Sample explanations are from Item 9 (20/10/10/5) and from Item 1 (13/6/10/18).
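As a worked illustration of the ratio-based sample explanation, the Item 9 values from the note (A = 20, B = 10, C = 10, D = 5) can be written out as follows. This is a sketch that assumes the four numbers are listed in the order A/B/C/D; it shows why only a ratio comparison reveals the absence of a relationship here.

```latex
\[
\frac{A}{A+B} = \frac{20}{30} = \frac{2}{3},
\qquad
\frac{C}{C+D} = \frac{10}{15} = \frac{2}{3}
\quad\Rightarrow\quad \text{no difference between varieties,}
\]
\[
A - B = 20 - 10 = 10 \;\neq\; 5 = 10 - 5 = C - D
\quad\Rightarrow\quad \text{a difference comparison wrongly favors variety A.}
\]
```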

Attending to only two of the four cells is not adequate for any data pattern. However, comparing values of two cells, especially in the second row or column, can still be appropriate to explain a correct judgment in some data sets used in the present study (e.g., comparing cell B and cell D in Item 8).

In the explain-correct condition, some children did not believe that the judgment provided by the interviewer was true. Such explanations were coded as disbelieved, independent of the content of the explanation. Explanations that did not fit into any category were combined in the residual category other (e.g., explanations that did not refer to the table, one-cell and three-cell explanations).

All explanations were coded by two raters independently. Interrater agreement varied from 84.0% for Item 9 to 95.7% for Item 1. Disagreements were resolved by discussion.

In both conditions, we coded whether children's explanations were consistent with the correct judgment based on the described content coding. We defined consistent explanations as explanations that would yield the correct judgment for a given covariation problem. The consistency coding encompassed the categories consistent (which was further divided into comparison of ratios, comparison of differences, and comparison of two cells), inconsistent, and not applicable. Explanations that refer to comparisons of ratios (including comparisons of conditional probabilities) are consistent with correct judgments for every data pattern. Comparisons of differences are consistent with the correct judgments for some but not all data patterns. For example, when a child reports a comparison of differences for Item 3, the explanation is classified as consistent because it would yield the correct judgment for Item 3; however, comparison of differences is inconsistent with the correct judgment for Items 8 and 9. Likewise, comparisons of two cells are consistent with the correct judgment for some but not all data patterns. For example, comparing the two cells of the second column is consistent with the correct judgment for Item 3, but this same two-cell comparison is inconsistent with the correct judgment for Items 6 and 9. Explanations that were not consistent with a correct judgment for a given covariation problem were classified as inconsistent. Disbelieved cases were also classified as inconsistent. For explanations in the residual (other) category, their relation to the correct judgment could not be determined. Thus, those explanations were coded as not applicable.

The items in the present study vary with respect to which and how many explanation types are consistent with the correct judgments. Item 9 is the only item for which only a comparison of ratios is consistent with the correct judgment. For all other items, more than one of the explanation types is consistent with the correct judgment. Table 3 indicates which combinations of content categories and items lead to consistent and inconsistent explanations.

3. Results

For the purpose of analysis, each child received a sum score across the nine items for each consistency category. For each category, Cronbach's alpha was calculated as an indicator of homogeneity (consistent: α = 0.84; inconsistent: α = 0.90, Items 1 and 6 were not included in this analysis because there were no inconsistent explanations and thus zero variance; not applicable: α = 0.56). This analysis pointed to good internal consistency for the categories consistent and inconsistent. The category not applicable (n/a) showed poor internal consistency, which is not surprising for a residual category in which various rare explanations are combined. It is still reported in order to provide a comprehensive analysis. Based on these sum scores, mean values for experimental groups and grade levels are reported. Eighty-three children passed both control questions after the first explanation of the table. Ten children passed the control questions only after a second explanation of the table. For one child, data for the control questions was missing because the interviewer forgot to ask the questions. Gender and order of items did not influence consistency with correct judgments of children's explanations so these factors were not included in the statistical models. Judgment accuracy and relationships between judgment accuracy and consistent explanations are presented only for the explain-own condition since children were not asked for their own judgments in the explain-correct condition. Cronbach's alpha for judgment accuracy across the nine items was 0.91, indicating excellent internal consistency. Although judgment accuracy was not our focus, we report some descriptive statistics from the explain-own condition (n = 25 second graders, n = 20 fourth graders) in order to provide some information about children's performance level. 
The mean proportion of correct judgments was M = 0.36 (SD = 0.28; range: 0–1) for the second graders and M = 0.52 (SD = 0.35; range: 0–1) for the fourth graders. The difference in correct judgments across grades was not significant, F(1, 43) = 2.72, p = .106, ηp² = 0.06. Fourth graders' performance level was above what one would expect by chance, t(19) = 2.34, p = .030, d = 0.52, whereas second graders' was not, t(24) = 0.54, p = .597, d = 0.11.

As expected, judgment accuracy was associated with use of consistent explanations in the explain-own condition, r(43) = 0.81, p < .001. When children provided correct judgments, they nearly always provided explanations that were consistent with those correct judgments (M = 0.97, SD = 0.08, range: 0–1). This proportion did not differ between grade levels (Grade 2: M = 0.96, SD = 0.08; Grade 4: M = 0.97, SD = 0.07; F(1, 43) = 0.04, p = .840, ηp² < 0.01). Thus, children who judged a covariation problem correctly were also able to give suitable explanations for their judgments. In contrast, when children provided incorrect judgments, they rarely provided explanations that were consistent with the correct judgments (all: M = 0.12, SD = 0.26, n = 41, range: 0–1). This proportion also did not differ between grade levels (Grade 2: M = 0.14, SD = 0.26, n = 25; Grade 4: M = 0.10, SD = 0.26, n = 16; F(1, 39) = 0.19, p = .663, ηp² = 0.01).

Table 3
Transformation from Content Categories to Consistency Categories based on Item Structure.

| Content category | Consistent: comparison of ratios | Consistent: comparison of differences | Consistent: comparison of two cells | Inconsistent | n/a |
|---|---|---|---|---|---|
| Ratios (4 cells) | All items | | | | |
| Differences (4 cells) | | Items 1, 2, 3, 4, 5, 6, 7 | | Items 8, 9 | |
| A vs. B | | | Items 1, 5, 7 | Items 2, 3, 4, 6, 8, 9 | |
| C vs. D | | | Items 1, 2, 3, 4, 5, 7, 8 | Items 6, 9 | |
| A vs. C | | | Items 1, 6 | Items 2, 3, 4, 5, 7, 8, 9 | |
| B vs. D | | | Items 1, 2, 3, 4, 6, 7, 8 | Items 5, 9 | |
| Disbelieved | | | | All items | |
| Other | | | | | All items |

Note. n/a: not applicable.

We turn next to the question of central interest, which was whether children in the explain-correct condition were more likely to provide explanations that were consistent with the correct judgments. Table 4 presents mean values for each consistency category by condition and grade level. Multivariate analyses of variance including all consistency categories showed overall significant effects for condition (Pillai's Trace = 0.30, F(4, 87) = 9.12, p < .001), grade level (Pillai's Trace = 0.11, F(4, 87) = 2.78, p = .032), and their interaction (Pillai's Trace = 0.11, F(4, 87) = 2.67, p = .037). The univariate analyses (see Table 5) revealed significant main effects of condition for both consistent and inconsistent explanations; children reported more consistent and fewer inconsistent explanations in the explain-correct condition.

The detailed, content-based categorization of consistency categories also reveals some noteworthy effects: Comparisons of ratios were very rare, presumably because only one item (Item 9) required ratio-based reasoning for successful interpretations (see Section 2.5). However, there were main effects of condition and grade level, as well as a significant interaction. Fourth graders reported more comparisons of ratios in the explain-correct condition (on average for 0.7 of the nine items) while second graders did not. Comparisons of differences that were consistent with correct judgments were also more frequent in the explain-correct condition than in the explain-own condition.¹ Table A1 in Appendix A presents the relative frequencies of consistent explanations for each item. A breakdown of the consistency data into terciles (1–3 consistent, 4–6 consistent, 7–9 consistent; see Fig. 4) shows that the majority of children in the explain-correct condition provided explanations consistent with the correct judgments on most of the items.
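Some of the test statistics reported in this section can be approximately re-derived from the published summary values. The sketch below assumes a chance level of 1/3 for judgment accuracy, inferred from the three response options (variety A, variety B, no difference); the text does not state the chance level explicitly. It also uses the standard identity that converts an F value and its degrees of freedom into partial eta squared. The function names are hypothetical, and because the inputs are rounded, the recomputed values only approximate the reported ones (for example, the fourth graders' t comes out near 2.39 rather than 2.34).

```python
import math


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> tuple[float, float]:
    """Return (t, Cohen's d) for a sample mean tested against a fixed value mu0."""
    d = (mean - mu0) / sd          # standardized distance from the test value
    t = d * math.sqrt(n)           # equivalent to (mean - mu0) / (sd / sqrt(n))
    return t, d


def partial_eta_squared(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


CHANCE = 1 / 3  # assumption: three response options, not stated in the text

# Judgment accuracy in the explain-own condition (rounded values from the text).
t_grade4, d_grade4 = one_sample_t(mean=0.52, sd=0.35, n=20, mu0=CHANCE)
t_grade2, d_grade2 = one_sample_t(mean=0.36, sd=0.28, n=25, mu0=CHANCE)
print(f"Grade 4: t(19) ~ {t_grade4:.2f}, d ~ {d_grade4:.2f}")
print(f"Grade 2: t(24) ~ {t_grade2:.2f}, d ~ {d_grade2:.2f}")

# Effect of condition on consistent explanations, F(1, 90) = 27.44.
print(f"partial eta squared ~ {partial_eta_squared(27.44, 1, 90):.2f}")
```

The small gaps between the recomputed and reported values are what one would expect when starting from means and standard deviations rounded to two decimals.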

Table 4
Mean Values (SD in Parentheses) for Each Consistency Category by Condition and Grade Level.

| Category | Grade 2: Explain-own | Grade 2: Explain-correct | Grade 4: Explain-own | Grade 4: Explain-correct |
|---|---|---|---|---|
| Consistent | 3.88 (2.32) | 5.92 (2.04) | 4.40 (2.84) | 7.17 (1.61) |
| Consistent: comparison of ratios | 0.12 (0.44) | 0.08 (0.27) | 0.15 (0.37) | 0.70 (0.77) |
| Consistent: comparison of differences | 2.44 (2.92) | 3.38 (2.12) | 2.75 (2.86) | 4.91 (1.73) |
| Consistent: comparison of two cells | 1.32 (1.07) | 2.46 (1.68) | 1.50 (1.19) | 1.57 (1.83) |
| Inconsistent | 4.64 (2.56) | 2.23 (2.05) | 4.05 (2.82) | 1.22 (1.17) |
| n/a | 0.48 (1.19) | 0.85 (1.19) | 0.55 (0.89) | 0.61 (0.94) |

Note. n/a not applicable. Table 5 Univariate MANOVA Results for the Consistency Categories. Category

Factor

F

Consistent

Condition Grade level Interaction

27.44** 3.71+ 0.63

< .001 .057 .429

0.23 0.04 0.01

Condition Grade level Interaction Condition Grade level Interaction Condition Grade level Interaction

6.04* 10.06** 8.28** 9.40** 3.29 1.45 3.84+ 1.35 3.05

.016 .002 .005 .003 .073 .223 .053 .248 .084

0.06 0.10 0.08 0.10 0.04 0.02 0.04 0.02 0.03

Inconsistent

Condition Grade level Interaction

32.45** 3.04 0.21

< .001 .085 .647

0.27 0.03 < 0.01

n/a

Condition Grade level Interaction

0.91 0.14 0.48

.343 .708 .492

0.01 < 0.01 < 0.01

Comparison of ratios

Comparison of differences

Comparison of two cells

4. Discussion The present study investigated children's conceptual understanding of the significance of comparisons between data categories for covariation judgments. We compared a standard condition (the explain-own condition), in which participants were asked to give and explain their own judgments, and an explain-correct condition, in which participants

p

η2p

Note. **p < .01, *p < .05,+p < .07. One multivariate Analysis of Variance (MANOVA) with condition and grade levels as factors and the consistency categories as dependent variables was calculated. Multivariate statistics are reported in the text. Degrees of freedom of univariate tests 1, 90. n/a not applicable.

1 Please note that some distributions were skewed. However, F-tests are fairly robust against violations of the standard distribution assumption (e.g., Bühner & Ziegler, 2009; Eid, Gollwitzer, & Schmitt, 2010).

17

Learning and Instruction 59 (2019) 13–20

A. Saffran et al.

Fig. 4. Relative frequency of consistent explanations categories. 1–3 consistent: child reported 1–3 explanations consistent with the correct judgments. 4–6 consistent: child reported 4–6 explanations consistent with the correct judgments. 7–9 consistent: child reported 7–9 explanations consistent with the correct judgments.

were asked to explain provided correct judgments. In the explain-own condition, children's judgment accuracy was far from perfect, and they did not regularly provide explanations consistent with the correct judgments. When asked to explain provided correct judgments, children provided more explanations consistent with the correct judgments. Moreover, most second graders and almost all fourth graders provided consistent explanations for the majority of items in the explain-correct condition. Thus, children gave explanations in terms of comparisons of relevant data categories when asked to provide explanations for correct judgments of covariation data. Explanations referring to comparisons of ratios, which from a mathematical point of view are the most sophisticated ones, were very rare in both conditions. This is not surprising, given that there was only one item for which only comparing ratios was consistent with the correct judgment. For all other data patterns, comparisons of differences or comparisons of two specific data categories (not any possible two-cell comparison) were sufficient to explain a correct judgment (see Table 3). Thus, we cannot infer from the overall low number of ratio-based explanations whether children were unable to reason based on ratios; they may have stuck to simpler and more familiar explanations because these were sufficient to answer the test questions. However, the pattern of mean values of explanations referring to comparisons of ratios indicates that fourth graders were able to reason based on ratios under certain conditions, while there was no evidence that second graders could do so. In the explain-correct condition, fourth graders referred to ratios for 0.70 out of nine items. This value approaches the number of items (i.e., one item) for which only comparing ratios was consistent with the correct judgment.

In line with Vygotsky's (1978) reasoning, we might say that ratio-based reasoning is within the zone of proximal development for fourth graders (i.e., they can do it with task support if it is required) but it is not yet in the zone of proximal development for second graders. In sum, the present results strongly indicate that elementary school children possess a basic conceptual understanding of data categories for inferences about covariation. These findings are consistent with research on elementary school children's understanding of experimentation strategies: although the control of variables strategy is spontaneously produced only in adolescence, most fourth graders choose a controlled over a confounded experiment and can explain the rationale for their choice, thus indicating an understanding of a conclusive test (Bullock & Ziegler, 1999).

Many interpretations are possible for why children face severe difficulty in deriving correct judgments for covariation data patterns spontaneously. Factors that tax the information processing system, such as limited working memory capacity or lack of inhibitory control, may lead children to use inadequate heuristics and thereby neglect parts of the data. One such inadequate heuristic is a purely confirmatory strategy that is well documented by a large body of literature across different domains (e.g., Klayman & Ha, 1987; Mynatt, Doherty, & Tweney, 1977; for a review, see Nickerson, 1998). The task requirements in the two conditions of the present study differed with respect to the demand of overcoming the confirmation bias. In the standard task, children need to search for confirming as well as for disconfirming evidence relative to a hypothesis. Children focusing on confirming evidence may neglect parts of the data and therefore have a high risk of failure. In contrast, in the explain-correct task, children only need to search for evidence that confirms the correct judgment provided by the experimenter. Further work is needed to investigate the underlying information processing thoroughly so that the source of children's problems in data interpretation tasks becomes clearer. For example, one could vary characteristics of the task in a way that allows systematic examination of the role of specific executive functions such as inhibition. The role of confirmation bias could be investigated in a task in which participants are required to explain why an incorrect judgment is incorrect. In this case, participants would need to search for disconfirming evidence with equally low demands on working memory and inhibitory control as in the explain-correct paradigm. Research in this area should include not only children but also adults, because it has been shown that even adults' data interpretation abilities are far from perfect (e.g., Batanero et al., 1996; Levin et al., 1993; Osterhaus et al., 2018; Shaklee, 1983).

Research with very young children might also yield interesting results. The explain-correct paradigm could be used with a simplified data interpretation task to test for preschoolers' basic conceptual understanding of covariation data. Although one needs to keep in mind that explanation competencies are limited in young children, our pilot data suggest that preschoolers are able to refer to data categories when explaining their own judgments. Thus, it seems sensible to employ the explain-correct paradigm with preschoolers. Indicators of competency in this age group would be highly relevant in the recent discussion about early implicit causal reasoning competencies in young children (e.g., Gopnik, Sobel, Schulz, & Glymour, 2001; Kushnir & Gopnik, 2007; Sobel, Tenenbaum, & Gopnik, 2004) since they would point to a rather early explicit understanding of data categories.

Since the present study supports the view that elementary school children possess a rudimentary conceptual understanding of covariation data, it provides a starting point for developing stochastics curricula focusing on covariation data interpretation for elementary school classrooms. By working with experimenter-provided explain-correct problems, children may learn about the data categories they need to integrate in order to derive a correct judgment. If this is the case, then an explain-correct curricular unit should have a positive effect on students' subsequent abilities to interpret covariation data in contingency tables spontaneously. Positive effects of self-explanation on learning outcomes have been shown across various content domains and contexts (for reviews, see Fonseca & Chi, 2011; Rittle-Johnson & Loehr, 2016). However, one needs to pay attention to the specific circumstances in which self-explanations have positive effects. For example, Kuhn and Katz (2009) found that children's self-explanations of their own ideas enlarged the effect of prior beliefs, which led to poorer performance than no self-explanations if those beliefs were incorrect (for a recent review including constraining factors, see Rittle-Johnson & Loehr, 2016). In the case of covariation data interpretation, it seems especially important that learners know whether the judgments they explain are correct, because they are very likely to come to incorrect judgments on their own (e.g., Shaklee & Paszek, 1985). Results from Siegler and Chen (2008) suggest that it might also be fruitful to ask learners to explain why incorrect judgments are incorrect; these conditions may heighten learners' awareness of disconfirming evidence.

We did not ask children for confidence ratings about their judgments or explanations; however, it seems likely that metacognitive processes play a role in children's data interpretation, just as they do in strategy choice in other tasks (e.g., Kuhn & Pearsall, 1998; Luwel, Torbeyns, & Verschaffel, 2003). If children are highly confident that their own interpretations are correct, they may be less amenable to considering other comparisons. Recent work suggests that some metacognitive skills are specific to numerical information (Vo, Li, Kornell, Pouget, & Cantlon, 2014), and it stands to reason that such skills would be relevant in data interpretation. Thus, one promising direction for future work would be to explore the role of metacognitive skills, including specifically numerical ones, in children's interpretations of covariation data.

In conclusion, the present study demonstrates that elementary school children have a basic conceptual understanding of how to draw inferences from covariation data. Moreover, this work also shows that asking children to explain correct judgments is a useful and practical way to elicit this understanding. In these ways, this study provides a new perspective on instruction on basic data analytic skills, and it adds to a growing body of research documenting scientific reasoning abilities in elementary school children.

5. Author note

Andrea Saffran and Petra Barchfeld, Department of Psychology, LMU Munich, Germany; Martha W. Alibali, Department of Psychology, University of Wisconsin-Madison, USA; Kristina Reiss, TUM School of Education, Technical University of Munich, Germany; Beate Sodian, Department of Psychology, LMU Munich, Germany. This work was supported by the German Research Council (DFG SO 213/31-3, RE 1247/8-3; DFG SO 213/34-1).

Appendix A

Table A1
Relative Frequencies (%) of Consistent Explanations for Each Item by Grade Level and Condition.

Item   Data pattern      Grade 2                          Grade 4
                         Explain-own   Explain-correct    Explain-own   Explain-correct
1      13 10 / 6 18      87.5          100                100           100
2      30 27 / 30 10     33.3          75.0               45.0          90.9
3      18 18 / 15 4      48.0          76.0               38.9          95.7
4      6 10 / 7 29       36.0          70.8               42.1          100
5      8 16 / 8 16       44.0          84.6               45.0          95.7
6      24 24 / 12 12     100           100                100           100
7      15 16 / 10 30     40.0          97.2               50.0          95.2
8      30 20 / 11 1      0             46.2               15.8          25.0
9      20 10 / 10 5      4.2           10.0               16.7          60.0
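The discussion distinguishes explanations based on comparing two specific data categories, comparing differences, and comparing ratios. The following is a minimal sketch of how these three comparisons can disagree on a single 2×2 data pattern. The cell labels `a` to `d`, the class and function names, the choice of which two cells are compared, and the example numbers are all illustrative assumptions, not the coding scheme used in the study.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class DataPattern:
    """A 2x2 covariation data pattern (cell labels are illustrative assumptions)."""
    a: int  # group 1, outcome present
    b: int  # group 1, outcome absent
    c: int  # group 2, outcome present
    d: int  # group 2, outcome absent


def sign(x) -> int:
    """Return +1, 0, or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def compare_two_cells(p: DataPattern) -> int:
    """Compare only the 'outcome present' cells of the two groups."""
    return sign(p.a - p.c)


def compare_differences(p: DataPattern) -> int:
    """Compare the present-minus-absent difference of each group."""
    return sign((p.a - p.b) - (p.c - p.d))


def compare_ratios(p: DataPattern) -> int:
    """Compare the proportion of 'outcome present' cases in each group.

    Assumes both groups are non-empty; Fraction keeps the comparison exact.
    """
    return sign(Fraction(p.a, p.a + p.b) - Fraction(p.c, p.c + p.d))


# Hypothetical pattern chosen so that the strategies disagree.
pattern = DataPattern(a=20, b=10, c=10, d=5)
print(compare_two_cells(pattern))    # 1: group 1 appears ahead
print(compare_differences(pattern))  # 1: group 1 appears ahead
print(compare_ratios(pattern))       # 0: equal proportions, no covariation
```

Each function returns +1 if group 1 appears to show the outcome more often, -1 if group 2 does, and 0 if the comparison finds no difference. In this hypothetical pattern only the ratio comparison detects that the two groups do not differ, which illustrates why an item of this kind can only be explained consistently by referring to ratios.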

References

Batanero, C., Estepa, A., Godino, J. D., & Green, D. R. (1996). Intuitive strategies and preconceptions about association in contingency tables. Journal for Research in Mathematics Education, 27(2), 151–169.
Bühner, M., & Ziegler, M. (2009). Statistik für Psychologen und Sozialwissenschaftler. München, Deutschland: Pearson.
Bullock, M., & Ziegler, A. (1999). Scientific reasoning: Developmental and individual differences. In F. E. Weinert, & W. Schneider (Eds.). Individual development from 3 to 12: Findings from the Munich longitudinal study (pp. 38–54). Cambridge: Cambridge University Press.
Eid, M., Gollwitzer, M., & Schmitt, M. (2010). Statistik und Forschungsmethoden. Lehrbuch. Weinheim, Deutschland: Beltz.
Fonseca, B. A., & Chi, M. T. (2011). Instruction based on self-explanation. In R. E. Mayer, & P. A. Alexander (Eds.). Handbook of research on learning and instruction (pp. 296–321). New York, NY: Routledge.
Gopnik, A., Sobel, D. M., Schulz, L. E., & Glymour, C. (2001). Causal learning mechanisms in very young children: Two-, three-, and four-year-olds infer causal relations from patterns of variation and covariation. Developmental Psychology, 37(5), 620–629.
Klayman, J., & Ha, Y.-W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94(2), 211–228.
Kuhn, D., & Katz, J. (2009). Are self-explanations always beneficial? Journal of Experimental Child Psychology, 103(3), 386–394. https://doi.org/10.1016/j.jecp.2009.03.003.
Kuhn, D., & Pearsall, S. (1998). Relations between metastrategic knowledge and strategic performance. Cognitive Development, 13(2), 227–247. http://doi.org/10.1016/S0885-2014(98)90040-5.
Kushnir, T., & Gopnik, A. (2007). Conditional probability versus spatial contiguity in causal learning: Preschoolers use new contingency evidence to overcome prior spatial assumptions. Developmental Psychology, 43(1), 186–196.
Levin, I. P., Wasserman, E. A., & Kao, S.-F. (1993). Multiple methods for examining biased information use in contingency judgments. Organizational Behavior and Human Decision Processes, 55, 228–250.
Luwel, K., Torbeyns, J., & Verschaffel, L. (2003). The relation between metastrategic knowledge, strategy use and task performance: Findings and reflections from a numerosity judgement task. European Journal of Psychology of Education, 18(4), 425–447.
Mata, A., Garcia-Marques, L., Ferreira, M. B., & Mendonça, C. (2015). Goal-driven reasoning overcomes cell D neglect in contingency judgements. Journal of Cognitive Psychology, 27(2), 238–249.
Mynatt, C. R., Doherty, M. E., & Tweney, R. D. (1977). Confirmation bias in a simulated research environment: An experimental study of scientific inference. Quarterly Journal of Experimental Psychology, 29(1), 85–95. https://doi.org/10.1080/00335557743000053.
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175–220.
Obersteiner, A., Bernhard, M., & Reiss, K. (2015). Primary school children's strategies in solving contingency table problems: The role of intuition and inhibition. ZDM Mathematics Education, 47, 825–836.
Osterhaus, C., Magee, J., Saffran, A., & Alibali, M. (2018). Supporting successful interpretations of covariation data: Beneficial effects of variable symmetry and problem context. Quarterly Journal of Experimental Psychology, 1–11. https://doi.org/10.1177/1747021818775909.
Rittle-Johnson, B., & Loehr, A. M. (2016). Eliciting explanations: Constraints on when self-explanation aids learning. Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-016-1079-5.
Saffran, A., Barchfeld, P., Sodian, B., & Alibali, M. W. (2016). Children's and adults' interpretation of covariation data: Does symmetry of variables matter? Developmental Psychology, 52(10), 1530–1544.
Shaklee, H. (1983). Human covariation judgment: Accuracy and strategy. Learning and Motivation, 14, 433–448.
Shaklee, H., & Elek, S. (1988). Cause and covariate: Development of two related concepts. Cognitive Development, 3, 1–13.
Shaklee, H., & Hall, L. (1983). Methods of assessing strategies for judging covariation between events. Journal of Educational Psychology, 75(4), 583–594.
Shaklee, H., & Mims, M. (1981). Development of rule use in judgments of covariation between events. Child Development, 52, 317–325.
Shaklee, H., & Mims, M. (1982). Sources of error in judging event covariations: Effects of memory demands. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8(3), 208–224.
Shaklee, H., & Paszek, D. (1985). Covariation judgment: Systematic rule use in middle childhood. Child Development, 56, 1229–1240.
Shaklee, H., & Tucker, D. (1980). A rule analysis of judgments of covariation between events. Memory & Cognition, 8(5), 459–467.
Shaklee, H., & Wasserman, E. A. (1986). Judging interevent contingencies: Being right for the wrong reasons. Bulletin of the Psychonomic Society, 24(2), 91–94.
Siegler, R. S. (2002). Microgenetic studies of self-explanation. In N. Garnott, & J. Parziale (Eds.). Microdevelopment: A process-oriented perspective for studying development and learning (pp. 31–58). Cambridge, MA: Cambridge University Press.
Siegler, R. S., & Chen, Z. (2008). Differentiation and integration: Guiding principles for analyzing cognitive change. Developmental Science, 11(4), 433–448.
Sobel, D. M., Tenenbaum, J. B., & Gopnik, A. (2004). Children's causal inferences from indirect evidence: Backwards blocking and Bayesian reasoning in preschoolers. Cognitive Science, 28(3), 303–333.
Vo, V. A., Li, R., Kornell, N., Pouget, A., & Cantlon, J. F. (2014). Young children bet on their numerical skills. Psychological Science, 25(9), 1712–1721. http://doi.org/10.1177/0956797614538458.
Vygotsky, L. S. (1978). Mind in society: The development of higher mental processes. Cambridge, MA: Harvard University Press.