Middle school students' understanding of the role sample size plays in experimental probability


Journal of Mathematical Behavior 20 (2001) 229 – 245

Leslie Aspinwall (a,*), James E. Tarr (b,1)

(a) Department of Curriculum and Instruction, 209 Milton Carothers Hall, Florida State University, Tallahassee, FL 32306, USA
(b) Department of Middle and Secondary Education, 303 Townsend Hall, University of Missouri, Columbia, MO 65211, USA

Received 27 November 2000; received in revised form 15 August 2001; accepted 20 August 2001

Abstract

In this study, we examined the impact of an instructional program on sixth-grade students’ understanding of experimental probability as it relates to sample size. As the number of trials in an experiment increases, the experimental probability is more likely to reflect the parent distribution; thus, smaller samples are more likely to yield unusual results. Results of this study indicate that, while typical middle school students are seemingly unaware of the relationship between experimental probability and sample size, appropriate cognitive activity focused on results of simulations of random phenomena can foster conceptual development. We witnessed growth that occurred as a result of key instructional tasks and concomitant mental activity. © 2001 Elsevier Science Inc. All rights reserved.

Keywords: Probability; Experimental probability; Theoretical probability; Sample size; Statistics; Simulation; Mathematics instruction; Constructivism; Mathematics curriculum; Cognitive processes; Concept formulation

* Corresponding author. Tel.: +1-850-644-8427; fax: +1-850-644-1880. E-mail addresses: [email protected] (L. Aspinwall), [email protected] (J.E. Tarr).
1 Tel.: +1-573-882-4034; fax: +1-573-882-4481.
0732-3123/01/$ – see front matter © 2001 Elsevier Science Inc. All rights reserved. PII: S0732-3123(01)00066-9

1. Introduction

Recent curriculum reform documents in school mathematics (e.g., Australian Educational Council and Curriculum, 1994; Department of Education and Science and the Welsh Office, 1991) have advocated broadening the scope of probability in the middle school curriculum. In the United States, the National Council of Teachers of Mathematics (2000) has called for students to carry out simulations of random phenomena, and to compare experimental results to the mathematically derived probabilities. This advocacy of simulations to foster probabilistic reasoning is largely predicated on students’ ability to make a connection between the experimental probability of an event (using relative frequencies to determine the likelihood of an event) and the theoretical probability of an event (derived analytically). In general, the two concepts are related by the fact that, for a given event, experimental probability tends to approximate theoretical probability more closely as the number of trials increases. This relationship, sometimes called the Law of Large Numbers (Watkins, 1981), asserts that large samples are more likely to reflect the parent population while small samples often produce experimental probabilities that differ markedly from the parent distribution. The concept of sample is fundamental to data analysis and reasoning under uncertainty.
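The relationship between sample size and experimental probability described above can be illustrated with a short simulation. The following is only a sketch, not part of the study's materials: the function names, the chosen sample sizes, and the seed are illustrative assumptions, and the event is modeled as an independent trial with a fixed theoretical probability.

```python
import random


def experimental_probability(p_event, n_trials, rng):
    """Relative frequency of an event whose theoretical probability is p_event."""
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_event)
    return hits / n_trials


def deviation_by_sample_size(p_event, sizes, seed=0):
    """Absolute gap between experimental and theoretical probability per sample size."""
    rng = random.Random(seed)  # seeded only so that a run can be repeated
    return {n: abs(experimental_probability(p_event, n, rng) - p_event) for n in sizes}


if __name__ == "__main__":
    # Hypothetical sample sizes: two small samples and two large ones.
    for n, gap in deviation_by_sample_size(0.5, [4, 32, 1000, 100000]).items():
        print(f"{n:>6} trials: |experimental - theoretical| = {gap:.4f}")
```

A single run will not always show the gap shrinking at every step, since small samples can land close to the theoretical value by chance; the tendency appears reliably only across many runs or at large sample sizes.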

2. Theoretical background

Notwithstanding recommendations of various professional organizations for the inclusion of probability in the school mathematics curriculum, there is limited research to inform classroom practice. In a review of existing literature, Shaughnessy (1992) articulated the need for further investigations into the teaching and learning of probability stating, “There are so many areas of needed research that it is hard to know where to begin” (p. 489). He specified the need for teaching experiments to understand the effects of instruction on students’ understanding of probability concepts. Since the call for teaching experiments in probability, few studies (e.g., Berenson, 1999; Fischbein & Schnarch, 1997; Maher, 1998; Speiser & Walter, 1998; Tarr, 1997) have examined the growth of students’ probabilistic thinking as the result of instruction.

In a pivotal investigation, Jones, Langrall, Thornton and Mogill (1999) evaluated the thinking of third-grade students during 16 instructional sessions in probability. The instructional program was informed by a research-based framework (Jones, Langrall, Thornton, & Mogill, 1997) that included descriptions of young children’s probabilistic thinking. Case study analysis of third-grade students revealed key patterns to producing growth in student learning; however, research focused on conceptions in only four constructs: sample space, (theoretical) probability of an event, probability comparisons, and conditional probability; students’ understanding of experimental probability, and its relation to theoretical probability, was not reported. Indeed, although numerous studies have examined students’ thinking in probability (e.g., Falk, 1983; Green, 1983; Piaget & Inhelder, 1975), almost no research has examined students’ conceptions of experimental probability and its relation to sample size.
In a report of action research, Lawrence (1999) used various activities to help middle school students develop an informal definition of probability using a prescribed number of trials to determine experimental probabilities. By listing the set of all possible outcomes, students were able to determine theoretical probabilities that were subsequently compared to experimental probabilities. Berenson (1999) engaged eighth-grade students in probability simulations as a means of eliciting student thinking in the determination of the “fairness” of several dice games. Such activities promoted the concept of sample space and served to link students’ initial conceptions to more formal representations of sample space. Finally, Metz (1999) examined the ideas of second-, fourth-, and fifth-grade students about the nature of sampling. She determined that 41% of children argued for the power of sampling, including 56% of fifth graders. The majority of these students articulated that an adequate sample was needed in order to draw inferences about a population, while others qualified the information one could draw from a sample. She asserted that negligible research has investigated the impact of instruction on students’ understanding of sample size.

3. Purpose of the study

This research responded to the call for teaching experiments in probability (Shaughnessy, 1992) by determining the impact of instruction on middle school students’ thinking. In particular, it sought to examine whether simulations of random phenomena influence students’ understanding of the role sample size plays in determining experimental probabilities. This study used a cognitive framework (Jones, Thornton, Langrall, & Tarr, 1999) that captures the manifold nature of students’ probabilistic reasoning and identifies four levels of thinking with respect to the experimental probability of an event (see Fig. 1). This framework was used to generate assessment protocols and instructional tasks, and to evaluate the impact of instruction on students’ probabilistic thinking.

Fig. 1. Framework of probabilistic reasoning: experimental and theoretical probability.

4. Methodology and data analysis

The primary goal of this study was to determine the impact of instruction on sixth-grade students’ understanding of the experimental probability of an event. Accordingly, our research was organized into three phases: (1) a preinstructional assessment to determine students’ existing awareness of the role sample size plays in experimental probability; (2) a 5-day instructional program composed of a series of problem tasks that required students to carry out simulations of various random phenomena and to draw inferences from experimental results at various stages; and (3) a postinstructional assessment to determine the extent of growth, if any, in students’ probabilistic thinking as the result of instruction.

The 5-day instructional program was implemented in one Grade 6 classroom at a middle school in middle Tennessee. Twenty-three students participated in all three phases of the study conducted during the final weeks of an academic year. The 5-day instructional program comprised a series of problem-solving tasks, key questions, and writing prompts. Each probability task required students to carry out simulations of random phenomena. Results of individual experiments were then aggregated and served as a basis for whole-class discussions.

In this study, both researchers provided daily instruction and attempted to elicit students’ probabilistic thinking and to engage students in classroom discourse. The cooperating teacher stimulated additional class discussion with interjections of key questions. Moreover, she assisted in the selection of case study students chosen to represent a range of mathematical thinking. It should be noted that, in the tradition of naturalistic research, our goal was not to intervene in the development of student ideas by giving mathematical definitions or by correcting their thinking. Instead, attempts were made to use student responses as a vehicle for whole-class discussions in which student thinking was corroborated or challenged by classmates.

Baseline interviews were conducted by one of two teacher–researchers 1 week prior to the commencement of the instructional program. Interviews were audiotaped for subsequent analysis. The structured interview contained six tasks designed to assess students’ understanding of experimental probabilities, and student responses were analyzed using the cognitive framework (Fig. 1). More specifically, using the framework criteria, student responses were coded and used to determine the dominant thinking level for each student with respect to this key probability concept. Agreement between the coresearchers was achieved on 130 of 138 levels, a reliability of 94%. Disagreements in the coding of particular items were resolved through discussions and the scorings were used in subsequent quantitative analyses.

Using an interview protocol patterned upon items from the preinstructional assessment, students’ understanding was assessed in relation to the cognitive framework several days after instruction. A dominant thinking level was determined for each student, and the Wilcoxon signed ranks test (Siegel & Castellan, 1988) was used to determine whether significant differences were present between students’ level of thinking prior to and following instruction.

Qualitative analysis examined multiple sources of data including audiotaped interview assessments, videotapes of instruction, student responses to writing prompts, worksheets of case study students, and field notes taken by each researcher and the cooperating teacher. Such analyses were undertaken to identify patterns and changes in students’ probabilistic thinking and to determine possible catalysts for growth in student understanding of experimental probability. As the study unfolded and particular pieces of information came to light, data triangulation was used to carefully compare each data piece against at least one other source. For example, if a possible assertion arose from a case study student’s response to an out-of-class writing prompt, it was considered in light of data from other sources such as a researcher’s field notes, videotape of instruction, or an interview assessment. No single item of information was given serious consideration unless it could be triangulated (Lincoln & Guba, 1985).

The researchers’ approach to data collection and analysis was iterative; results from the instructional program and journal entries of both researchers and students were analyzed as they were collected. Emerging themes were identified and utilized to guide the formulation of subsequent activity. The process of triangulation by data source (Miles & Huberman, 1984) yielded assertions that emerged from the experiences and understandings of the participants: students and teacher–researchers. Assertions were confirmed or disconfirmed in the triangulation process.

5. Results of the study

The instructional program yielded varied success in fostering students’ understanding of experimental probability as it relates to the number of trials. Following instruction, marked changes in students’ thinking levels were observed. With respect to the experimental probability of an event, the Wilcoxon signed ranks test detected a significant difference (z = 2.03, P < .05) in students’ level of thinking prior to instruction and their level of thinking on the postinstructional assessment. Although subsequent qualitative analysis revealed evidence of students’ ability to relate the number of trials to experimental probability, such growth in student thinking was largely confined to their ability to determine the relative likelihood of large and small sets of outcomes.

Prior to instruction, five of six case study students demonstrated limited awareness of the relationship between experimental probability and the number of trials in an experiment. In particular, five of six case study students were coded at Level 1 or 2 in the baseline assessment (see Fig. 1). Students at Level 1 exhibited no awareness of the relationship between experimental and theoretical probability. These students viewed the results of simulations as irrelevant and instead used subjective thinking in making probability judgments. Other students were coded at Level 2 because they placed too much faith in small samples of data from simulations. Moreover, they typically believed that any sample of data, small or large, should reflect the parent distribution.


6. Case study analysis of student learning

6.1. Case study analysis of Emily and Ravi

Emily exemplified Level 2 thinking prior to instruction. In particular, she demonstrated her pervasive belief that all samples of data should reflect the parent population. Consider her response to an item from the baseline assessment in which she examined four sets of outcomes of repeated flips of a colored chip: (a) 3 of 4 whites, (b) 6 of 8 whites, (c) 12 of 16 whites, and (d) 24 of 32 whites (Figs. 2–5, respectively). Although 75% of the outcomes are white in each sample, this proportion is least likely to occur in the largest set of trials, (d) (Fig. 5); the smallest set of outcomes, (a) (Fig. 2), is most likely to produce the 3:1 ratio of white to red. The following excerpt from the initial assessment illustrates Emily’s misconception as she sought to determine which set of outcomes was most, and which least, likely to occur in repeated flips of the colored chip:

I: Which of these four sets of outcomes do you think is the most likely to occur, or do you think all four sets are equally likely?
E: Well, they’re all three-fourths so they all have the same chance because in each row there’s four and there’s only one red and three whites.
I: Which of these four sets do you think is the least likely to occur, or do you think all four sets are equally unlikely?
E: They’re all equally unlikely. It seems like there should be two reds and two whites [in each row] if there’s the same chance for each color.

Ravi, a second case study student, offered similar responses to the series of questions in the initial assessment.
Although he pointed out that each set of outcomes contained the same proportion of red and white outcomes, he was unable to differentiate between the relative likelihood of each of the four sets of flips, as indicated in the following excerpt:

I: Suppose I flipped the chip a lot of times: First 4 flips, then 8 flips, then I flipped the chip 16 times, and finally I flipped it 32 times [refer to Figs. 2–5, respectively]. Which one of these four sets of outcomes is most likely to happen: 3 of 4 whites, 6 of 8 whites, 12 of 16 whites, or 24 of 32 whites?
R: [emphatically] They’re all the same!
I: How would you explain that?
R: The chances would be, like, three-fourths [points to Fig. 2]. Twenty-four over 32 is three-fourths [pointing to Fig. 5] and this [again points to Fig. 2]. They’re all three-fourths!

Like Emily, Ravi incorrectly associated the proportion of white outcomes with the numerical (theoretical) probability of the event. This characteristic is typical of students at Level 2 of the framework shown in Fig. 1; they often realize that numbers play a role in determining numerical probabilities but are unable to do so correctly.

Fig. 2. Four flips: three of four are white.

Fig. 3. Eight flips: six of eight are white.

Fig. 4. Sixteen flips: twelve of sixteen are white.

Fig. 5. Thirty-two flips: twenty-four of thirty-two are white.

Growth in Emily and Ravi’s thinking was evident on the postinstructional assessment as they demonstrated an awareness of the long-term behavior of repeated flips of the colored chip. In particular, both case study students recognized that larger samples were more likely to reflect the parent distribution. The following is an excerpt of Emily’s response on the postinstructional assessment:

I: Which do you think is the most likely to occur, or do you think all four are equally likely?
E: [Points to three of four whites].
I: Why is this one [three of four whites] the most likely?
E: Well this one [three of four whites] is unlikely, so that means that if it would happen again like this one [six of eight whites], it would be even more unlikely. And if it were to happen again like that one [12 of 16 whites], it would be even more unlikely.
I: Which of these four sets do you think is the least likely to occur, or do you think all four sets of outcomes are equally unlikely?
E: That one [points to 24 of 32 whites].
I: And why would that be the most unlikely?
E: Because, well, Kevin brought that up in class [on Day 4 of the instructional program]. Since there’s an even chance [for each side], where there’s three whites and one red, it would probably happen once instead of, like, a bunch of times. It would be harder to get one red out of four flips all those times [24 of 32 whites] instead of just this one time [3 of 4 whites] because it would have to happen eight times.

The notion that smaller samples are more likely to yield results not representative of the parent population eluded most case study students in the initial assessment. Following instruction, understanding of this key principle in probability and statistics was evident in four of six case study students including Ravi, whose response paralleled that of Emily:

I: Which one of these four outcomes is most likely to occur?
R: Three of four.
I: Why?
R: Well, it’s unusual for that [points to Fig. 2], and it would have to happen eight times in a row right there [points to Fig. 5] for 24 of 32. It would have to happen two times and four times right there [points to Figs. 3 and 4, respectively].

Ravi’s response reflected his understanding that unusual or unrepresentative results are more likely to occur in smaller samples. As the number of trials in the experiment increases, he argued that unusual results would, in essence, have to occur repeatedly in order to produce the same proportion of red and white outcomes. This notion that an unlikely event “would have to happen eight times in a row” to produce the same proportion in a larger sample was never offered as a plausible explanation by any teacher–researcher during instruction; instead, it was a student-constructed response that occurred during whole-class discussions on Day 1. Upon elaboration, Ravi identified this key instructional moment in saying, “It’s like Kyle said in class when we were playing the games.” Specifically, he identified To Sum It Up [on Day 4 of instruction] as a catalyst for growth in his probabilistic reasoning. Indeed, qualitative analysis of Ravi’s journal entries and worksheets, coupled with field notes of the cooperating teacher and both teacher–researchers, supports the notion that growth in his thinking occurred as the result of one key instructional task.
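The claim that 3 of 4 whites is the most likely of the four sets, and 24 of 32 the least likely, can be checked with the binomial formula. The sketch below assumes a fair chip (white and red each with probability 0.5, as Emily's "even chance" suggests) and independent flips; the function name and the list of sets are illustrative and do not come from the study.

```python
from math import comb


def prob_exact_whites(n_flips, n_white, p_white=0.5):
    """Binomial probability of exactly n_white whites in n_flips independent flips."""
    return comb(n_flips, n_white) * p_white**n_white * (1 - p_white) ** (n_flips - n_white)


# The four sets of outcomes from the assessment item: (flips, whites).
SETS = [(4, 3), (8, 6), (16, 12), (32, 24)]

if __name__ == "__main__":
    for n_flips, n_white in SETS:
        print(f"{n_white} of {n_flips} whites: {prob_exact_whites(n_flips, n_white):.6f}")
```

Under these assumptions the probabilities are 0.25 for 3 of 4 and 0.109375 for 6 of 8, then roughly 0.028 for 12 of 16 and roughly 0.0024 for 24 of 32, so the same 75% proportion becomes steadily less likely as the number of flips grows.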


6.2. Case study analysis of Blake

The role sample size plays in the long-term behavior of random phenomena eluded most case study students prior to instruction. This was particularly evident in an assessment task in which students determined whether their chance of winning an unfair game was influenced by the number of trials. More specifically, the game involved a spinner partitioned into two sectors, with blue comprising 60% and yellow 40% of the circle (see Fig. 6). After playing a “Winner Takes All” format of the game, students considered whether playing the game to 3 points, or to 10 points, influenced their probability of winning. Blake was typical of case study students at baseline; he demonstrated little awareness of the relationship between the number of trials and the probability of winning the spinner game:

I: Let’s say that we play the game again so that the winner is the first to score three points. Compared to last time when we spun it only one time, are your chances of winning better, the same, or are they worse than when we played before?
B: It’s the same because there’s still more blue than yellow so I’ll probably still win.
I: What about my chances of winning? Does it make a difference to my chances of winning if we play to three?
B: No, because my space is still bigger.
I: Let’s say that instead of playing the game to three points, we played it so that the winner was the first to score 10 points. Does playing the game to 10 points make any difference to your chance of winning?
B: Well, yeah. Like in going to one [point], if it lands on mine then I would win; but if we played to 10, then I could be ahead and you could come back and win. I could still win but your chance [of winning] is better now.

Blake vacillated in stating whether the number of trials plays a role in determining the likelihood of winning the spinner game. At first, three trials did not influence his chance of winning but, later, Blake argued that 10 trials would affect his chances. Ultimately, he reverted to subjective reasoning (Level 1) in stating that his opponent could “come back and win [if they played longer]” when, in reality, a longer game would ultimately favor him.

Fig. 6. Blue and yellow spinner.


Following instruction, Blake demonstrated his awareness that smaller samples are more likely to produce unusual outcomes. In the following excerpt from the postinstructional assessment, Blake realizes that the number of trials does indeed influence the probabilities associated with the unfair game:

I: We’re going to start over only now we’re going to play the game so that the winner is the first one to score three points. Compared to the first time we played, are your chances of winning better, worse, or are they the same as they were before?
B: With three times I have a better chance [of winning] because, like, when we did that game where we had 12 and 2 [The Race Game, Day 1 of instruction]... well, with one try anything could happen but with three [spins] I probably have a better chance of winning.
I: What if we played the game so that the winner was the first to score 10 points? Do you think you have a better chance of winning, a worse chance of winning, or the same chance of winning if we played it to three points?
B: Better. A better chance because... with three, it’s still not that hard for you to win. You could easily just get three right here [points to the yellow sector]. But with 10, you’d get more chances to land on blue, and it’ll probably land on blue more because you know that [blue] has more of a percentage.

Analysis of case study students’ learning revealed that Blake and others likely benefited from one instructional task, namely The Race Game, and classroom discourse that focused on individual and aggregated results of probability simulations. Videotapes of instruction, student journal entries, responses to writing prompts, and three sets of field notes served to triangulate the assertions reported here.
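Blake's postinstructional reasoning can be checked by computing the chance that the blue player is first to reach a given number of points. This is a sketch under stated assumptions: spins are independent, blue scores with probability 0.6 on every spin, and yellow scores otherwise. The function name and parameters are illustrative rather than taken from the study.

```python
from math import comb


def prob_first_to(target, p_point=0.6):
    """Probability that the player who scores with p_point reaches `target` points first.

    The player wins with exactly k opponent points (k < target) when the final
    spin is theirs and the earlier target - 1 + k spins hold k opponent points.
    """
    q = 1 - p_point
    return sum(comb(target - 1 + k, k) * p_point**target * q**k for k in range(target))


if __name__ == "__main__":
    for target in (1, 3, 10):
        print(f"first to {target:>2}: P(blue wins) = {prob_first_to(target):.4f}")
```

With these assumptions the blue player wins a one-point game with probability 0.6 and a first-to-three game with probability 0.68256, and the value continues to rise for a first-to-ten game, which agrees with the observation that a longer game favors the player holding the larger sector.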

7. Fostering students’ understanding of sample size: key lessons from the instructional program

7.1. The race game

One relatively common approach to introducing probability concepts involves the rolling of a pair of dice. Using multiple trials of the experiment, the sum of the numbers appearing on the two dice can be determined in order to promote the idea that some events are more likely to occur than others. Through successive rolls, students may discover, for example, that the event “7” is more likely than the event “12.” In this study, a similar approach was taken in an instructional task called The Race Game. The in-class activity sheet for The Race Game appears in Fig. 7. In this game, students were assigned a race car numbered 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12. Working with a partner, each student rolled a pair of dice, one red and one green, and determined the sum of the spots on the dice. The outcome of each trial of the experiment determined which race car would advance one space toward the finish line. In essence, the “race track” served as a line plot to represent the frequency with which each event (2, 3, 4, . . ., 11, and 12) occurred. The winner of The Race Game is the player whose race car is first to cross a finish line located 10 spaces from the starting line. Before beginning play, students were asked to predict which car they thought would win. The majority of students felt that each race car had the same chance of winning or used subjective reasoning (e.g., “I think 9 will win because it’s my lucky number.”) in making their predictions. The length of the race track appeared to have no influence on student thinking; there was no connection made between the length of the race track and the large number of trials required to produce a winner in The Race Game.

Fig. 7. The Race Game.

As the game was played, we tabulated the whole-class results on a “race track” on the board. In particular, we asked students to announce which race car was ahead at each milepost; that is, which race car number was first to come up twice, which was first to occur three times. A list of race cars leading at each milepost was recorded on the board. Initially, the list of race cars leading at Milepost 1 was rather lengthy, and included the numbers of race cars whose theoretical probability was small such as those numbered 2, 3, 11, or 12. As the game progressed, however, the list of race cars in the lead began to dwindle; in the end, only race cars numbered 6, 7, and 8 won The Race Game. As a result of whole-class discussions, students came to realize that race cars with numbers such as 2 or 12 had a greater probability of winning when the race track was shorter or, stated alternatively, smaller samples are more likely to yield unusual results than a larger number of trials. Moreover, they learned that race cars with a higher probability of advancing in a single roll were more likely to emerge as victorious as the length of the race track increased.

7.2. To sum it up: a dice game

A second instructional task that played a key role in fostering students’ probabilistic reasoning was entitled To Sum It Up: A Dice Game. On Day 4 of the instructional program, students worked in pairs and carried out numerous trials of a simulation (see Fig. 8). This game is played with a pair of dice. Prior to commencement of the game, students picked a color as follows:

WHITE: Scores one point if the sum of the dice is 2, 3, 4, 9, 10, 11, or 12.
YELLOW: Scores one point if the sum of the dice is 5, 6, 7, or 8.

Students were then asked to predict which color would win if the game were played in each of the following formats:

The first player to score one point is the winner;
The winner is the player leading after three rolls;
The winner is the player leading after 11 rolls;
The winner is the player leading after 21 rolls.

The majority of students expressed their belief that players selecting WHITE were favored to win the game because WHITE comprised more possible outcomes (seven) than did YELLOW (four). Interestingly, irrespective of color choice, students almost uniformly felt their probability of winning was greatest in the longest format, 21 rolls, and used subjective reasoning to justify their thinking. Although a longer game format favors one color, it is YELLOW, not WHITE, whose probability of winning is increased.
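The gap between the number of sums a color holds and its chance of scoring can be seen by enumerating the 36 equally likely outcomes of two fair six-sided dice. The sketch below assumes fair, independent dice; the helper names are illustrative, and the two sets of sums are copied from the rules of To Sum It Up.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_distribution(sides=6):
    """Exact probability of each sum when two fair dice with `sides` faces are rolled."""
    counts = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    total = sides * sides
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


WHITE_SUMS = {2, 3, 4, 9, 10, 11, 12}  # seven sums
YELLOW_SUMS = {5, 6, 7, 8}             # four sums


def scoring_probability(sums, dist):
    """Probability that a single roll produces one of the given sums."""
    return sum(dist[s] for s in sums)


if __name__ == "__main__":
    dist = sum_distribution()
    print("P(7)  =", dist[7], " P(12) =", dist[12])
    print("P(WHITE scores)  =", scoring_probability(WHITE_SUMS, dist))
    print("P(YELLOW scores) =", scoring_probability(YELLOW_SUMS, dist))
```

Under the fair-dice assumption, WHITE scores on 16 of the 36 outcomes (4/9) and YELLOW on 20 of 36 (5/9), so YELLOW holds fewer sums but the more frequent ones. The same table gives 1/6 for a sum of 7 against 1/36 for a sum of 12, the comparison made in The Race Game.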


Fig. 8. To Sum It Up.

After a short discussion in which students decided on which color to select, the game was played as a whole-class activity. To play the game as a whole class, we created the cards shown in Figs. 9 and 10. Half the students were given WHITE cards and the other half were given YELLOW cards. Working as 11 pairs, alternating students rolled the dice, determined the sum, and kept track of the cumulative score for each player. At designated points in the simulation, scores were tabulated as the 11 pairs reported which color was leading the game. In particular, after one trial of the experiment, students were asked to display their colored card if they were leading. At this initial point in the game, the events WHITE and YELLOW appeared with approximately the same frequency: six for WHITE and five for YELLOW. After two additional trials of the experiment, students were asked to report which color was winning the game. This time, the tally was five for WHITE and six for YELLOW. Scores were monitored at two other points. After 11 trials of the experiment, students were solicited and the score was noted on the chalkboard: three for WHITE and eight for YELLOW. Finally, after 21 trials of the game, each winner was asked to display their colored card and the score was again noted: 1 for WHITE and 10 for YELLOW. The lone winner holding a WHITE card seemed to pique students’ interest and the pair of students was asked to elaborate their final score. After 21 trials, WHITE had scored 11 points and YELLOW had scored 10.

Fig. 9. YELLOW scores a point.

Whole-class discussions ensued at the conclusion of the game. The teacher–researchers posed the question, “Which player, WHITE or YELLOW, has a better chance of winning the game To Sum It Up?” A minority of the students, including the lone pair for whom WHITE outscored her opponent, argued that WHITE had a higher probability of winning in spite of pooled results indicating a 10:1 score of YELLOW over WHITE. These students continued to assert that WHITE has more numbers and should score more often. One student, Sarah, refuted their argument by stating, “They [WHITE] may have more numbers but we [YELLOW] have the good numbers!” A second student, Brian, shared that he had worked out the probability of scoring for each color. Using theoretical probabilities determined as the result of The Race Game [on Day 1 of the instructional program], students worked in pairs to confirm Brian’s assertion that one color, namely YELLOW, had a slightly better chance of scoring in each trial.

Fig. 10. WHITE scores a point.


As a result of carrying out the simulation and the subsequent classroom discourse, students learned that even those with WHITE cards (whose numbers were less likely to occur) can win the game, particularly in a format requiring only a small number of trials. As the number of trials increases, however, the favored player (YELLOW, in this case) should prevail. As one student summed it up after the seventh trial, “I can’t win now because I have the bad numbers and they’ll never catch up.”
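The relationship between the number of trials and the underdog's chance of leading can be made concrete with a binomial calculation. The sketch below is not part of the instructional program. It assumes that every roll awards exactly one point to one color, that rolls are independent, and that WHITE scores on each roll with probability 4/9. That value is hypothetical, chosen only to make WHITE the slightly less likely scorer, and the function name prob_white_leads is illustrative.

```python
from fractions import Fraction
from math import comb

# ASSUMPTION: hypothetical per-trial scoring probability for WHITE.
P_WHITE = Fraction(4, 9)


def prob_white_leads(n_trials, p=P_WHITE):
    """Exact probability that WHITE has strictly more points than YELLOW
    after n_trials independent trials, each scored by exactly one color."""
    q = 1 - p
    # WHITE leads when it has scored on more than half of the trials.
    return sum(
        comb(n_trials, k) * p**k * q**(n_trials - k)
        for k in range(n_trials // 2 + 1, n_trials + 1)
    )


# The checkpoints at which the class tallied the leading color.
for n in (1, 3, 11, 21):
    print(f"after {n:2d} trials: P(WHITE leads) = {float(prob_white_leads(n)):.3f}")
```

Under these assumptions, the probability that WHITE leads falls from 4/9 after one trial to 304/729 after three trials, and it continues to fall at 11 and 21 trials. A short game therefore gives the less favored color a better chance of being ahead than a long one.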

8. Discussion

Research indicates that The Law of Large Numbers is nonintuitive to students of all ages (Fischbein & Schnarch, 1997; Konold, Pollatsek, Well, Lohmeier, & Lipson, 1993; Shaughnessy, 1992). Nevertheless, results of this study suggest that this elusive, yet important, concept can be made comprehensible even to sixth-grade students. Prior to instruction, only one of five case study students exhibited an awareness of the relationship between the number of trials and the probability of an unlikely event. Despite these initial misconceptions, the majority of case study students demonstrated a higher level of thinking in their understanding of experimental probability following instruction. The use of probability simulations served as a catalyst to challenge students’ existing conceptions and to foster growth in their probabilistic reasoning. Such simulations evoked small-group and whole-class discussions that may have helped some students to reorganize their thinking.

Notwithstanding the growth in student learning, the simulation activities described herein failed to yield growth in understanding for all students. In particular, qualitative analysis suggests that results of individual simulations seemed to reinforce some students’ misconceptions. These students typically had erroneous intuitions validated by results of their simulations. Other students continued to be distracted by irrelevant aspects of the experiments and were misled by seemingly primitive intuitions. It follows that additional research is needed to determine ways to foster the probabilistic thinking of students who fail to understand the relevance of data from simulations.

A fundamental tenet of socioconstructivism is the view that language and agreement play a role in establishing and justifying mathematical concepts (Ernest, 1996). In this study, the simulation component of instructional tasks elicited students’ probabilistic thinking that, in turn, served as the basis for whole-class discussions. Ultimately, through social interactions and conversations, students were able to negotiate meaning for the outcomes of their experiments.

Growth in students’ understanding reported here was the likely result of appropriate instruction and concomitant mental activity. Given the size of our sample, however, it is only appropriate to remain somewhat tentative about the results of this study. Nevertheless, it is our belief that The Race Game and To Sum It Up represent examples of instructional activities that foster the development of students’ understanding of several key concepts in probability and promote powerful connections between data and chance. Additional investigations are needed into the teaching and learning of probability in order to ascertain the complexity of how students “learn to reason probabilistically” (Maher & Speiser, 1999; Speiser, 2000) and the long-term effects of sustained instruction in probability.


References

Australian Educational Council and Curriculum. (1994). Mathematics – a curriculum profile for Australian schools. Carlton, VIC: Curriculum.

Berenson, S. B. (1999). Students’ representations and trajectories of probabilistic thinking. In: R. Hitt, & M. Santos (Eds.), Proceedings of the twenty first annual meeting of the North American chapter of the international group for the psychology of education (pp. 459–465). Columbus, OH: ERIC Clearinghouse of Science, Mathematics, and Environmental Education.

Department of Education and Science and the Welsh Office. (1991). Mathematics for ages 5–16. London: Central Office of Information.

Ernest, P. (1996). Varieties of constructivism: a framework for comparison. In: L. P. Steffe, & P. Nesher (Eds.), Theories of mathematical learning (pp. 335–350). Mahwah, NJ: Lawrence Erlbaum Associates.

Falk, R. (1983). Children’s choice behaviour in probabilistic situations. In: D. R. Grey, P. Holmes, V. Barnett, & G. M. Constable (Eds.), Proceedings of the first international conference on teaching statistics (pp. 784–801). Sheffield, UK: Teaching Statistics Trust.

Fischbein, E., & Schnarch, E. (1997). The evolution with age of probabilistic, intuitively based misconceptions. Journal for Research in Mathematics Education, 28, 96–105.

Green, D. R. (1983). A survey of probability concepts in 3000 pupils aged 11–16. In: D. R. Grey, P. Holmes, V. Barnett, & G. M. Constable (Eds.), Proceedings of the first international conference on teaching statistics (pp. 784–801). Sheffield, UK: Teaching Statistics Trust.

Jones, G. A., Langrall, C. W., Thornton, C. A., & Mogill, A. T. (1997). A framework for assessing young children’s thinking in probability. Educational Studies in Mathematics, 32, 101–125.

Jones, G. A., Langrall, C. W., Thornton, C. A., & Mogill, A. T. (1999). Students’ probabilistic thinking in instruction. Journal for Research in Mathematics Education, 30, 487–519.

Jones, G. A., Thornton, C. A., Langrall, C. W., & Tarr, J. E. (1999). Understanding students’ probabilistic reasoning. In: L. V. Stiff, & F. R. Curcio (Eds.), Developing mathematical reasoning in grades K-12, National Council of Teachers of Mathematics’ 1999 yearbook (pp. 146–155). Reston, VA: The Council.

Konold, C., Pollatsek, A., Well, A., Lohmeier, J., & Lipson, A. (1993). Inconsistencies in students’ reasoning about probability. Journal for Research in Mathematics Education, 24, 392–414.

Lawrence, A. (1999). From the giver to the twenty-one balloons: explorations with probability. Mathematics Teaching in the Middle School, 4, 504–509.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage Publications.

Maher, C. A. (1998). Is this game fair? The emergence of statistical reasoning in children. In: Pereira-Mendoza, Kea, Kee, Wong (Eds.), Proceedings of the international conference on the teaching of statistics (ICOTS-5) (vol. 1, pp. 53–60). Singapore.

Maher, C. A., & Speiser, R. (1999). The complexity of learning to reason probabilistically. In: R. Hitt, & M. Santos (Eds.), Proceedings of the twenty first annual meeting of the North American chapter of the international group for the psychology of education (pp. 181–186). Columbus, OH: ERIC Clearinghouse of Science, Mathematics, and Environmental Education.

Metz, K. E. (1999). Why sampling works or why it can’t: ideas of young children engaged in research of their own design. In: R. Hitt, & M. Santos (Eds.), Proceedings of the twenty first annual meeting of the North American chapter of the international group for the psychology of education (pp. 492–498). Columbus, OH: ERIC Clearinghouse of Science, Mathematics, and Environmental Education.

Miles, M. B., & Huberman, A. M. (1984). Qualitative data analysis: a sourcebook of new methods. Beverly Hills: Sage.

National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: The Council.

Piaget, J., & Inhelder, B. (1975). The origin of the idea of chance in children. New York: W.W. Norton (L. Leake, Jr., P. Burrell, & H. D. Fischbein, Trans.).


Shaughnessy, J. M. (1992). Research in probability and statistics: reflections and directions. In: D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 465–494). New York: Macmillan.

Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.

Speiser, R. (2000). The complexity of learning to reason probabilistically. In: M. L. Fernandez (Ed.), Proceedings of the twenty second annual meeting of the North American chapter of the international group for the psychology of education (pp. 45–50). Columbus, OH: ERIC Clearinghouse of Science, Mathematics, and Environmental Education.

Speiser, R., & Walter, C. (1998). Two dice, two sample spaces. In: Pereira-Mendoza, Kea, Kee, Wong (Eds.), Proceedings of the international conference on the teaching of statistics (ICOTS-5) (vol. 1, pp. 61–66). Singapore.

Tarr, J. E. (1997). Using knowledge of middle school students’ thinking in conditional probability and independence to inform instruction. Unpublished doctoral dissertation, Illinois State University.

Watkins, A. E. (1981). Monte Carlo simulation. In: A. P. Shulte, & J. R. Smart (Eds.), Teaching statistics and probability, National Council of Teachers of Mathematics’ 1981 yearbook (pp. 146–155). Reston, VA: The Council.