
JOURNAL OF MATHEMATICAL BEHAVIOR 15, 285-302 (1996)

“Buggy Algorithms” as Attractive Variants

GIYOO HATANO, Keio University
SHIZUKO AMAIWA, Shinshu University
KAYOKO INAGAKI, Chiba University

Two experiments were conducted in order to investigate how buggy algorithms in multidigit subtraction are used. In Experiment 1, third graders were given tests of multidigit subtraction, comprehension of the trade principle, and single-digit addition. They were given a follow-up test of multidigit subtraction 2 years later. In Experiment 2, third to sixth graders were tested. Both experiments showed that a majority of the students who used low-level buggy algorithms could use more advanced algorithms including the correct one. Whereas the proficiency of component skills was a negative predictor of the use of buggy algorithms soon after the students had learned how to solve multidigit subtraction problems, the comprehension of the trade principle tended to immunize them against the use of buggy algorithms a few years later. Additional questions and tests in Experiment 2 revealed that the children who used buggy algorithms had an optimistic estimate of getting correct answers by them and that those algorithms were relied on more often when students had to solve many problems than when they had to solve just two problems. Thus, the results gave support to assumptions that students believe buggy algorithms to be effort-saving variants and that they rely on them even when they have the correct procedure. Buggy algorithms can be spontaneous inventions, not necessitated by students’ lack of some procedural rules.

Author Note. This study was supported in part by a Grant-in-Aid for Scientific Research to the first author (No. 57510050). We are grateful to the students, teachers, and principals of the schools for their willing cooperation, to Ms. Yasuko Suga for her assistance in conducting the first experiment, and to Prof. Alan Schoenfeld for his instructive comments when part of this article was presented at the AERA meeting in San Francisco, April 1986. A portion of the data in Experiment 1 was analyzed and reported in terms of abacus learning experience in Amaiwa, S., & Hatano, G. (1989), Effects of abacus learning on 3rd-graders’ performances in paper-and-pencil tests of calculation, Japanese Psychological Research, 31, 161-168. Correspondence and requests for reprints should be sent to Giyoo Hatano, Faculty of Letters, Keio University, 2-14-45 Mita, Minato-ku, Tokyo 108, Japan.

It has generally been agreed that children reveal, sometimes in the course of mastering the right borrow-and-decrement procedure for solving multidigit subtraction, “systematic” errors, which are due to “buggy algorithms” (Brown & Burton, 1978) or procedures only partially correct. However, two contrasting views have been offered as to how these algorithms or procedures are induced and when they are relied on for solving a series of multidigit subtraction problems. On the one hand, investigators adopting the information-processing or computational position have asserted that buggy algorithms are generated when some procedural rule(s) of the correct algorithm is missing or replaced by an incorrect one (Young & O’Shea, 1981), or when the missing rule causes an impasse that has to be repaired somehow (Brown & VanLehn, 1980, 1982). In other words, the buggy algorithms are regarded as reflecting students’ imperfect local systems of procedural rules or productions. These investigators have not given an explicit answer to the question of when the buggy algorithms are used by students. This is probably because their computer analogy presupposes that one and the same set of rules is always applied throughout a test session, and thus children’s errors are of the same type at least in terms of the impasse, although different repairs may produce somewhat different types of erroneous responses. This view, that partially correct procedures are necessitated by a missing rule, will be called the missing-rule view hereafter.

On the other hand, those taking the constructivist orientation are skeptical of the missing-rule view and have raised criticisms of it. They share the opinion that both the generation and the use of partially correct procedures involve active and complicated processes (Cobb, 1990; De Corte, Greer, & Verschaffel, 1996). More specifically, some of them have stressed the fact that, although children’s errors in the borrow-and-decrement procedure are usually not random but systematic, the use of an incorrect procedure is not stable (e.g., Hennessy, 1993).
Others have proposed that the induction or the use of a buggy algorithm is not just a procedural matter but, rather, is associated with a lack of general cognitive understanding (e.g., Cauley, 1988) or of understanding of the place value principle (e.g., Resnick, 1982, 1983).

We would like to join the constructivist position and propose an alternative view of buggy algorithms, which might be called the attractive-variant view. It consists of the following two key assumptions: (a) children who show systematic errors in multidigit subtraction often possess the correct or a nearly correct algorithm (the correct-procedure assumption); but (b) they rely on buggy algorithms because these algorithms seem to be correct variants that can save them mental effort (the effort-saving assumption).

We can derive a few behavioral predictions from this alternative view. First, based on the correct-procedure assumption, we predict that a majority of those children who show typical buggy responses will also make some correct responses to the same type of problems within the same test session. In other words, children will display both mature (correct or nearly correct) and immature (buggy) strategies, often quite different, at a single test session. Second, we predict, based on the effort-saving assumption, that children’s proficiency in the component skills that are involved in subtraction with borrowing (e.g., single-digit addition and finding complementary numbers to 10) will make it less likely for them to rely on buggy algorithms, because the algorithms are adopted for being simpler or more effort saving. Proficient component skills make the correct algorithm easy to run, and thus adopting another, simpler (but actually erroneous) procedure becomes less rewarding. Third, also based on the effort-saving assumption, we predict that children’s knowledge about why the standard borrowing procedure is right (e.g., their comprehension of the trade principle between columns) will inhibit their use of buggy algorithms, because those children who comprehend the trade principle are more likely to judge the algorithms not to be legitimate variants.

It should be noted that these predictions are differential; that is, if confirmed, they not only lend support to the attractive-variant view but also make the missing-rule view less tenable. The first prediction clearly contradicts the notion of buggy algorithms in the missing-rule view, which holds that incorrect constituent rules are as stable as other parts of the solution procedures. The second and third predictions cannot be derived from the missing-rule view either, unless a number of auxiliary assumptions are introduced: proficiency in component skills that do not concern the borrow-and-decrement procedure is irrelevant in the missing-rule view, because neither the limitation of working memory capacity nor the tendency to save mental effort is considered. Because the bugs are procedural, and “critics that would reflect the basic principles of quantity” are not involved in the repair process (Resnick, 1987, p. 23), conceptual knowledge plays no direct role in either the use or the removal of buggy algorithms.

EXPERIMENT 1

This experiment was aimed at testing the three behavioral predictions already mentioned by analyzing a set of longitudinal data. Students, initially in the third grade, were given a test consisting of 20 items of multidigit subtraction, and differences in their mature and immature strategies were examined. They also took tests measuring their proficiency in component skills (single-digit addition and finding complementary numbers to 10) and comprehension of the trade principle. The test of multidigit subtraction was administered again about 20 months later.

Method

Participants. Participants were 110 third graders (about equal numbers of boys and girls) from two elementary schools in a small town near Tokyo. Their ages ranged from 8 years, 6 months to 9 years, 5 months, with a mean of 9 years, 0 months, and they had learned three-digit subtraction with two-step borrowing about a year before. Classes of about 30 students were given tests of multidigit subtraction, comprehension of the trade principle, single-digit addition, and finding complementary numbers to 10, in this order. Most of them were given the test of multidigit subtraction again in Grade 5.

Multidigit Subtraction. Two three-digit numbers were given as minuend and subtrahend in each item. The test consisted of 10 practice items without borrowing, 10 items with borrowing once (either from tens or hundreds), and 10 items with borrowing twice (including four items with two-step borrowing, such as 701 - 252 = ?), in that order. Four minutes each were allowed for the set of 10 items with borrowing once and the set of 10 items with borrowing twice. This was more than enough time for completion by most children.

Comprehension of the Trade Principle. Two sets of numbers, expressed in terms of units, tens, and hundreds, were compared in each item. For example, children were asked to judge whether “9 tens and 9 units” was equal to “8 tens and 10 units,” and whether “8 hundreds, 2 tens, and 6 units” was equal to “7 hundreds, 11 tens, and 16 units.” Ten items dealt with trade between two adjacent columns, as in the first example, and 10 other items dealt with trade among all three columns, as in the second example. Three practice items (e.g., comparing “1 ten and 5 units” with “15 units”), to which the right answer was given after students had responded, preceded the 20 test items. Ten minutes were allowed, which was ample.

Single-Digit Addition. Students were presented a long, randomized series of single-place numbers 2 through 9 and required to add as many consecutive numbers as possible. Two minutes were allowed.

Finding Complementary Numbers to 10. Students were given a series of single-place numbers 1 through 9 and required to find the complementary number to 10 (i.e., to write 3 to 7, 6 to 4, etc.) as quickly as possible. One minute was allowed.

Scoring. The number of correctly computed or judged items for the last three tests was taken as the measure of proficiency or comprehension.
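The equality judgments in the trade-principle test reduce to comparing place-value expansions: trading 10 units for 1 ten (or 10 tens for 1 hundred) leaves the total value unchanged. A minimal sketch in Python (our own illustration of the arithmetic; the test itself was paper-and-pencil):

```python
def value(hundreds=0, tens=0, units=0):
    """Total value of a 'hundreds, tens, units' description.
    The tens and units parts may exceed 9, as in the test items,
    so that descriptions related by legitimate trades denote the
    same number."""
    return 100 * hundreds + 10 * tens + units

# "9 tens and 9 units" vs. "8 tens and 10 units": NOT equal (99 vs. 90);
# a legitimate trade of one ten would give "8 tens and 19 units."
print(value(tens=9, units=9), value(tens=8, units=10))  # 99 90

# "8 hundreds, 2 tens, 6 units" vs. "7 hundreds, 11 tens, 16 units":
# equal, since 700 + 110 + 16 = 826.
print(value(8, 2, 6) == value(7, 11, 16))  # True
```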
Assessment of Most and Least Mature Strategies. The multidigit subtraction test was scored in terms of the level of strategies used for borrow-and-decrement, not by the number of correct answers. We first constructed a table of expected incorrect answers generated by nine major buggy algorithms related to the borrow-and-decrement procedure that had been found among Japanese students (Suga & Hatano, 1987). They were smaller-from-larger, smallest-is-zero, borrow-no-decrement, borrow-overdecrement, borrow-decrement-twice, ignoring-smaller, add-and-decrement, borrow-across-zero, and borrow-from-zero.¹ With this table of incorrect answers and the correct answers, the pattern of responses of each child was scrutinized. Each algorithm or strategy, including the correct one, was identified by using a criterion of two consecutive expected answers from the algorithm or three expected answers in all. Those strategies were classified into the following four levels:

Level 1: Processing is limited within each column; that is, neither borrowing nor decrement is attempted (smaller-from-larger and smallest-is-zero are included here).

Level 2: The borrow-and-decrement procedure is performed, but incorrectly (borrow-no-decrement, borrow-overdecrement, borrow-decrement-twice, ignoring-smaller, and add-and-decrement).

Level 3: Borrowing is performed incorrectly when the next column left of the minuend is zero (borrow-across-zero and borrow-from-zero).

Level 4: The correct procedure.

Algorithms belonging to Level 3 could produce correct solutions when two-step borrowing was not required. Therefore, Level 4 was assigned only when a child made two consecutive correct or three correct responses on those items in which minuends had zero in the tens column. The performance of each child was described by the levels of his or her least and most mature strategies. In a few cases of “tinkering,” when a student made four or more errors of the same (or lower) level without any buggy algorithms being identified, he or she was assigned that level as his or her least mature strategy. These levels were assigned values 1 to 4 when the most or least mature strategy was used as a covariate in statistical analyses.

Results

Relationships Between LMS and MMS. Cross-tabulations of the least mature strategy (LMS) and most mature strategy (MMS) at both Grade 3 and Grade 5 clearly demonstrate that a great majority of the participants who revealed bugs

¹Four of those bugs (smaller-from-larger, borrow-no-decrement, borrow-across-zero, and borrow-from-zero) are listed in VanLehn (1982) by essentially the same names. Smallest-is-zero and ignoring-smaller correspond to his ZERO/INSTEAD/OF/BORROW and BORROW/ADD/IS/TEN, respectively. The remaining three, which are not listed in VanLehn (1982), are defined as follows. Borrow-overdecrement: the student correctly borrows but decrements the next left column by 2. Borrow-decrement-twice: the student correctly borrows but decrements by 1 each of the next two columns. Add-and-decrement: if the top number is smaller than the bottom number, the student adds them but decrements the next column by 1.
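For concreteness, the two Level 1 bugs and the two-consecutive/three-in-all identification criterion can be simulated. The following is our own illustrative sketch in Python, with hypothetical function names (it is not the authors’ scoring procedure), checked against the buggy answers the Results section reports for 880 - 526 and 806 - 492:

```python
def digits(n, width=3):
    """Split n into a list of digits, most significant first."""
    return [int(d) for d in str(n).zfill(width)]

def from_cols(cols):
    """Reassemble per-column digits into a number."""
    return int("".join(str(c) for c in cols))

def smaller_from_larger(minuend, subtrahend):
    """Level 1 bug: in each column, subtract the smaller digit
    from the larger one; borrowing is never attempted."""
    return from_cols([abs(a - b) for a, b in
                      zip(digits(minuend), digits(subtrahend))])

def smallest_is_zero(minuend, subtrahend):
    """Level 1 bug: when the top digit is smaller, write 0 in
    that column instead of borrowing."""
    return from_cols([a - b if a >= b else 0 for a, b in
                      zip(digits(minuend), digits(subtrahend))])

def matches_strategy(answers, expected):
    """Identification criterion from the Method section: a strategy
    is credited given two consecutive expected answers, or three
    expected answers in all."""
    hits = [a == e for a, e in zip(answers, expected)]
    return any(h1 and h2 for h1, h2 in zip(hits, hits[1:])) or sum(hits) >= 3

# Predicted buggy answers for the two items discussed in the Results:
print(smaller_from_larger(880, 526), smaller_from_larger(806, 492))  # 366 494
print(smallest_is_zero(880, 526), smallest_is_zero(806, 492))        # 360 404
```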


(i.e., LMS at Levels 1-3) also used other, more advanced strategies, including the correct one (see Table 1). For example, among the 32 participants whose LMS at the third grade was coded Level 1, nine showed MMS of Level 1, three MMS of Level 2, 15 of Level 3, and five of Level 4. In other words, only nine of them consistently used Level 1 strategies. All 15 students whose LMS was Level 2 had MMS of Level 3 or 4. Similarly, at Grade 5, out of the 18 students whose LMS was identified as Level 1 or 2, 17 gave at least a few correct solutions, and nine of them did so even for items involving two-step borrowing. In contrast, the discrepancy was zero among those whose LMS was Level 3. This result, however, may have been obtained mainly because there were only four items requiring two-step borrowing. How did the participants choose different strategies? Usually one strategy was used for a while, then the student switched to another. In other words, the selection of a strategy was not a random, item-by-item process. Moreover, there seemed to be some features of problems that often caused some specific strategies to be chosen. For example, two problems out of the 10 items with borrowing once that had zero either in the units or tens column of the minuend (880 - 526 = ? and 806 - 492 = ?) often induced the smaller-from-larger or smallest-is-zero strategy. That is, 15 students gave answers of 366 and 494 to them (as predicted from the smaller-from-larger strategy) and eight gave 360 and 404 (as predicted from the smallest-is-zero strategy). Ten of the 15 and two of the eight made correct responses three times or more for these 10 problems. In other words, even those students who could solve a number of multidigit problems

TABLE 1
Cross-Tabulations of Most Mature Strategy (MMS) and Least Mature Strategy (LMS) (Experiment 1)

MMS at Grade 3

LMS at Grade 3    Level 1    Level 2    Level 3    Level 4
Level 1                 9          3         15          5
Level 2                            0          9          6
Level 3                                       7          0
Level 4                                                 56

MMS at Grade 5

LMS at Grade 5    Level 1    Level 2    Level 3    Level 4
Level 1                 0          1          4          0
Level 2                            0          4          9
Level 3                                      14          0
Level 4                                                 74

Note. Figures show the number of participants.


involving borrowing were tempted to use less mature strategies by some particular features of problems. Once they were tempted, they would often repeat those induced strategies for a few more items.

Proficiency in Component Skills and Strategy Levels. Because the results regarding single-digit addition and those of finding complementary numbers were quite similar, we discuss the former only. The students were divided into strategy groups by LMS or MMS, and the mean number correct for single-digit addition was calculated for each strategy group (see Table 2). Separate analyses of variance (ANOVAs) on the numbers correct for single-digit addition were conducted for LMS and MMS at both grades. When an ANOVA revealed a significant effect of the strategy group, an analysis of covariance (ANCOVA) with grade in language (a crude measure of school achievement in general) as a covariate was carried out for Grade 3; two ANCOVAs were carried out with grade in language or the corresponding strategy level at Grade 3 as a covariate for Grade 5. Both ANOVAs for LMS and MMS at Grade 3 indicated a significant effect of the strategy group; for LMS, F(3, 106) = 9.39, p < .001, and for MMS, F(3, 106) = 7.66, p < .001. Post hoc analyses by Tukey test revealed that the Level 4 students performed significantly (p < .05) better than the Level 1 students for LMS, and the Level 4 students significantly outperformed both those at Level 1 and those at Level 3 for MMS. The effect remained significant even when grade in language was entered as a covariate: for LMS, F(3, 105) = 3.96, p < .01, and for MMS, F(3, 105) = 2.72, p < .05.

TABLE 2
Numbers Correct for Single-Digit Additions (in 2 Minutes) Among Strategy Groups (Experiment 1)

                  Level 1    Level 2    Level 3    Level 4
LMS at Grade 3
  M                  23.8       28.8       26.7       35.6
  SD                  8.0       10.1        9.8       11.7
  N                    32         15          7         56
MMS at Grade 3
  M                  21.8       20.0       26.3       34.4
  SD                  7.1        4.6        9.7       11.5
  N                     9          3         31         67
LMS at Grade 5
  M                  24.2       24.8       25.6       32.8
  SD                  2.5        9.6       11.0       11.9
  N                     5         13         14         74
MMS at Grade 5
  M                     -       21.0       24.6       32.2
  SD                    -          -        9.1       11.8
  N                     0          1         22         83

Note. N means the number of participants.


The corresponding ANOVAs at Grade 5 revealed a smaller, but still significant, effect of the strategy group, F(3, 102) = 3.54 for LMS and F(2, 103) = 4.26 for MMS, p < .05 for both. Post hoc analyses revealed that no pairwise difference was significant for LMS, but the difference between Level 3 and Level 4 was significant at .05 for MMS. The effect at Grade 5 became nonsignificant in ANCOVAs; that is, when either the grade in language or the corresponding strategy level at Grade 3 was used as a covariate. It should be noted that proficiency in the component skills was assessed under time pressure, whereas the time limit for the multidigit subtraction test was lenient, and the numbers of “No Answers” were small (only 2%) and did not differ significantly among the strategy groups. In addition, there were very few errors for the component skills, and thus individual differences in the skills were probably mostly due to one’s speed in retrieving number facts. Nevertheless, as predicted, buggy algorithms were adopted more often when the students did not have proficiency in component skills.

Comprehension of the Trade Principle and Strategy Levels. Mean scores on the test of trade principle comprehension were calculated for strategy groups (see Table 3). Similar ANOVAs and ANCOVAs for LMS and MMS at both grades were conducted. The ANOVA did not show a significant effect of the LMS group at Grade 3, but revealed a significant effect of the MMS group, F(3, 106) = 3.95, p < .05. Post hoc analyses revealed that no pairwise difference was significant. This significant effect disappeared when grade in language was added as a covariate. The corresponding ANOVAs for Grade 5, however, showed a significant effect of the strategy group: for LMS, F(3, 102) = 5.75, and for MMS, F(2, 103) = 5.34, p < .01 for both. Post hoc analyses revealed that the students at Level 4 performed significantly (p < .05) better than those at Level 3, but those at Level 2 unexpectedly outperformed the Level 3 students for LMS. The difference between Level 3 and Level 4 was significant for MMS. The effect became nonsignificant for MMS when either grade in language or the MMS at Grade 3 was entered as a covariate, but stayed significant for the LMS: with grade in language as a covariate, F(3, 101) = 3.93, p < .05; with the third-grade LMS, F(3, 101) = 4.60, p < .01.

TABLE 3
Numbers Correct for Comprehension-of-Trade-Principle Items Among Strategy Groups (Experiment 1)

                  Level 1    Level 2    Level 3    Level 4
LMS at Grade 3
  M                  14.3       15.0       13.9       16.0
  SD                  3.7        3.7        3.0        2.9
  N                    32         15          7         56
MMS at Grade 3
  M                  13.0       11.0       15.1       15.8
  SD                  3.9        1.0        3.2        3.1
  N                     9          3         31         67
LMS at Grade 5
  M                  12.6       16.2       12.4       15.6
  SD                  3.1        3.0        3.5        3.1
  N                     5         13         14         74
MMS at Grade 5
  M                     -       11.0       13.4       15.7
  SD                    -          -        3.6        3.1
  N                     0          1         22         83

Note. The total number of items is 20. N means the number of participants.

Discussion

The results in Table 1 strongly support the correct-procedure assumption. These clear, frequent discrepancies between LMS and MMS within the same session deny that each participant was responding consistently using a fixed set of rules, as assumed in the missing-rule view. It is not only theoretically possible but also tenable to assume that quite a number of students are tempted to use erroneous procedures even when they can use the right one. It should be pointed out that the criteria for identifying strategies were not too lenient in this experiment: two or more consecutive expected answers, or a total of three or more expected answers, from the strategy in the test of 20 items, where these expected answers were highly distinctive. VanLehn (1982) actually adopted the criterion of two or more expected answers in total for identifying buggy algorithms that were close to the correct procedure. Instability of human buggy algorithms was admitted by VanLehn (1982, 1990) himself, and reported by Hennessy (1993) and VanLehn (1982), but in two separate sessions.

The effort-saving assumption is also supported by the fact that both of the predictions based on it were confirmed. Students were probably tempted to rely on a buggy algorithm because it looked like a simpler but correct variant of the standard procedure. They were strongly tempted to try a simpler version when they were not proficient in component skills of the standard procedure, as suggested by clear differences in the speed of single-digit addition between the groups classified by levels of multidigit subtraction strategies (Table 2). Furthermore, as shown in Table 3, comprehension of the trade principle showed a contribution to levels of strategies that was significant at Grade 5, but negligible at Grade 3. The results suggest that children could continue to resist the temptation to adopt erroneously simplified procedures only if they possessed conceptual understanding of the principle involved in the right procedure, although they could follow the procedure they had recently been taught without understanding. Although these findings concerning the effort-saving assumption cannot easily be explained by the missing-rule view, a number of alternative explanations can


be offered for each piece of evidence. For example, those advanced intellectually would tend both to use the correct procedure for multidigit subtraction and to comprehend the trade principle better; those who studied school math harder would be proficient in basic computation and also good at multidigit subtraction. In other words, the results of Experiment 1 fall short of providing the conclusive evidence for this assumption. We thus conducted Experiment 2.

EXPERIMENT 2

Experiment 2 was aimed at, in addition to replicating the findings of Experiment 1 with a larger sample, (a) examining whether those students relying on a buggy algorithm in fact believed that they could often get the correct answer by it, and (b) investigating whether they relied more readily on a buggy algorithm when they expected a larger amount of effort. Although we were interested in testing the effort-saving assumption, we thought that it would not be informative to ask students directly whether they thought these algorithms were attractive variants of the correct procedure. Answers to such a question could be biased by their general confidence, modesty, and other variables. Thus we decided to use two indirect strategies instead: first, to ask whether students believed that they could get a good score by relying on the (buggy) algorithm they had chosen; and second, to observe whether they were more likely to use a simpler buggy algorithm when a large number of problems was presented. Some readers may think we should have adopted an intensive individual experiment for aims (a) and (b). We did not, because we believe that students are more likely to rely on buggy algorithms on paper-and-pencil tests administered in groups; that is, when their performance is not closely scrutinized by an adult.

Method

Participants and Procedure. Participants were 79 third graders (the expected mean age was 9 years), 79 fourth graders (10 years), 62 fifth graders (11 years), and 81 sixth graders (12 years) from an elementary school in a small city near Sendai, Japan. They had learned how to solve three-digit subtraction problems with borrowing about a year before. They were first given a test of 30 multidigit subtraction items in their classrooms. After solving each set of 10 items, they were required to give an estimate of the number correct; that is, how many of the 10 preceding items they thought they had correctly solved.
Then came two other tests of comprehension of the trade principle and single-digit addition. Finally, they were given two multidigit subtraction problems that had been included in the previous test, 873 - 567 = ? and 405 - 188 = ?, without being told so.


Multidigit Subtraction. Two three-digit numbers were given as minuend and subtrahend in each item. This test consisted of 10 items with borrowing once (either from tens or hundreds), 10 items with borrowing twice (including four items with two-step borrowing), and 10 mixed items (including three items with two-step borrowing), in this order. The last 10 items were added to the 20 items that had been used in Experiment 1, primarily to increase the number of two-step borrowing items. Four minutes were allowed for each set of 10 items, which was more than enough for completion by most children. Strategies used for borrow-and-decrement were assessed for each student from the pattern of responses, as in Experiment 1. Those strategies were again classified into the same four levels.

Comprehension of the Trade Principle. Two more practice items, in which numbers were accompanied by measures (e.g., comparing 1 week and 2 days with 9 days), preceded the same three practice items and 20 test items as in Experiment 1.

Single-Digit Addition. This was exactly the same test as the one used in Experiment 1.

Results

Relationships Between LMS and MMS. Cross-tabulations of the least mature strategy (LMS) and most mature strategy (MMS) demonstrate that a great majority of the participants revealing bugs (i.e., LMS at Levels 1-3) also used other, more advanced strategies, including the correct one. See Table 4 for the results of Grade 3. Of the 103 such students for Grades 3 to 6, only 10 (9.7%) belonged to the same MMS level as LMS. Again, those whose LMS was at Level 3 were exceptions. Out of eight students who revealed Level 3 LMS, only one had MMS

TABLE 4
Cross-Tabulations of Most Mature Strategy (MMS) and Least Mature Strategy (LMS) (Experiment 2)

MMS at Grade 3

LMS at Grade 3    Level 1    Level 2    Level 3    Level 4
Level 1                 2          0         12          7
Level 2                            0         17         15
Level 3                                       2          0
Level 4                                                 24

Note. Figures show the number of participants.


at Level 4, in spite of the increased number of two-step borrowing items. We will offer our interpretation of this result later.

Proficiency of Component Skills, Comprehension of the Trade Principle, and Strategy Levels. Considering that there were fewer participants belonging to less mature levels at later grades, we compared the mean single-digit addition score of combined Levels 1 and 2 with that of combined Levels 3 and 4 for LMS, and combined Levels 1 to 3 with Level 4 for MMS, at each grade. The results are summarized in Table 5. ANOVAs and ANCOVAs revealed that the mean scores of single-digit addition differed highly significantly between the combined strategy groups at Grade 3, F(1, 77) = 9.69, p < .01 for LMS, and F = 11.27, p < .01 for MMS, and did so even when the grade in language was entered as a covariate, F(1, 76) = 8.39, p < .01 for LMS, and F = 8.97, p < .01 for MMS. However, there were only two significant differences at later grades, for MMS at Grade 4, F(1, 77) = 6.66, p < .05, and for LMS at Grade 5, F(1, 60) = 16.18, p < .001, and the effect of the strategy group stayed significant only for the latter, F(1, 59) = 5.28, p < .05. Although the single-digit addition score was assessed in each grade in this experiment, these results corroborated those of Experiment 1: proficiency in the component skills effectively prevented students from relying on buggy algorithms shortly after they had learned multidigit subtraction, but not later. Similar ANOVAs and ANCOVAs were conducted on the comprehension of the trade principle. Those more advanced in strategy outperformed those less

TABLE 5
Numbers Correct for Single-Digit Additions (Experiment 2)

             LMS Levels             MMS Levels
           1 & 2     3 & 4        1-3        4
Grade 3
  M         23.2      30.6       21.2     28.8
  SD         8.3      12.9        8.2     11.0
  N           53        26         33       46
Grade 4
  M         35.4      41.5       27.3     41.9
  SD        20.4      16.3       12.5     17.2
  N           18        61         10       69
Grade 5
  M         36.6      61.6       37.5     58.1
  SD        11.0      20.7        5.4     21.7
  N           12        50          4       58
Grade 6
  M         47.8      59.1       50.8     57.9
  SD        17.3      18.9       23.8     18.6
  N           12        69          6       75

Note. N means the number of participants.


advanced, as can be seen in Table 6, and the combined strategy group showed a significant effect in the ANOVA for both LMS and MMS in all grades except for MMS at Grade 6: F(1, 77) = 5.14 and 5.93 at Grade 3, p < .05 for both; F(1, 77) = 13.93 and 18.13 at Grade 4, p < .01 for both; F(1, 60) = 16.93, p < .001, and 4.81, p < .05, at Grade 5; F(1, 79) = 5.82, p < .05 for LMS at Grade 6. However, the ANCOVAs revealed that when grade in language was entered as a covariate, comprehension of the trade principle did not differ significantly between the LMS groups or between the MMS groups at Grade 3. It differed significantly for both LMS and MMS at Grade 4, F(1, 76) = 7.44, p < .01, and F = 6.07, p < .05, and for LMS at Grade 5, F(1, 59) = 7.88, p < .01. These results were consistent with those obtained in Experiment 1: comprehension of the trade principle effectively prevented the use of buggy algorithms a few years after the learning of the borrow-and-decrement procedure.

Strategy Levels and Estimation of Numbers Correct. Here we analyze the data for only the third graders, who revealed low-level strategies fairly often. The students’ LMSs were assessed for each set of 10 items, using slightly more lenient criteria than those for the entire set of 30 problems. When a student made two consecutive erroneous responses expected from one of the nine buggy algorithms, or three or more errors of the same (or lower) level strategies, he or she was assigned that level as his or her LMS. Table 7 shows mean self-estimated and actual numbers of correct responses for each set of 10 items for students at

TABLE 6
Numbers Correct for Comprehension-of-Trade-Principle Items (Experiment 2)

            Grade 3          Grade 4          Grade 5          Grade 6
Levels      M     SD   N     M     SD   N     M     SD   N     M     SD   N
LMS
 1 & 2     12.9   3.5  53   12.3   3.3  18   14.3   4.3  12   16.1   3.0  12
 3 & 4     14.8   3.1  26   15.7   3.4  61   18.0   2.3  50   18.1   2.7  69
MMS
 1-3       12.4   3.3  33   10.8   3.1  10   14.0   5.3   4   17.3   3.4   6
 4         14.4   3.8  46   15.5   3.3  69   17.5   2.9  58   17.9   2.8  75

Note. N means the number of participants.


TABLE 7
Estimated Scores and Actual Scores Among Strategy Groups (Experiment 2)

Levels of Strategy:     Level 1             Level 2             Levels 3 & 4
Scores               Estimated  Actual   Estimated  Actual   Estimated  Actual
First 10 items
  M                     7.8      4.0       9.6       5.4       9.1       8.8
  SD                    2.0      2.5       0.5       1.4       1.4       1.1
  N                     11       11        5         5         48        48
                     t(10) = 4.60***    t(4) = 4.88**       t(47) = 1.70ᵃ
Middle 10 items
  M                     7.7      0         9.1       3.9       8.7       8.1
  SD                    2.6      0         1.3       2.4       1.5       2.1
  N                     3        3         14        14        37        37
                     t(2) = 4.13ᵃ       t(13) = 6.29***     ns
Last 10 items
  M                     8.8      1.8       8.8       4.4       8.8       8.3
  SD                    1.3      2.6       1.6       1.6       1.5       1.3
  N                     6        6         16        16        36        37
                     t(5) = 6.38**      t(15) = 8.52***     ns

Note. Differences between estimated and actual scores were tested by matched t tests. One participant at Levels 3 and 4 gave no estimation of the number correct for the last set of 10 items.
ᵃp < .10. **p < .01. ***p < .001.

LMS Level 1, Level 2, and Levels 3 and 4 combined, excluding students who had left any item unanswered in the set of 10 items in question. It is clear that a majority of the students who used buggy algorithms were "optimistic" as to their scores. ANOVAs showed that the effect of the strategy group was significant only for the first set of 10 items, F(2, 61) = 3.87, p < .05, with the Level 3 and 4 students' estimates being significantly (p < .05) higher than the Level 1 students' by post hoc Tukey analyses. For the remaining two sets of 10 items, the self-estimated scores differed only negligibly among the LMS strategy groups, although the actual scores naturally differed greatly. Matched t tests between self-estimated and actual scores showed that the Level 1 and Level 2 students always overestimated their scores significantly, except for those at Level 1 for the middle set of 10 items, where the number of cases was too small. In contrast, self-estimated and actual scores did not differ significantly for the Level 3 and 4 students.

The Same Problems With and Without Many Other Similar Problems. Performance on the two selected problems varied substantially with the context of presentation, at least among the third graders, whose component skill seemed not to be highly automated. For the problem 873 - 567 = ?, 25 students made an error when this problem was embedded among many other items, whereas 12 did so when it was paired with the other selected problem (six made an error in both


contexts). The difference was significant by McNemar's test, χ²(1, N = 25) = 6.76, p < .01. The former context produced 20 responses predicted from the nine major buggy algorithms, whereas the latter produced only five such responses. This difference was also significant, χ²(1, N = 25) = 9.00, p < .01. For the problem 405 - 188 = ?, a smaller but similarly significant difference was observed in the number of errors: 42 versus 30 students made errors, 25 of them in both contexts, χ²(1, N = 22) = 6.55, p < .05. However, there was almost no reduction in the number of buggy responses, 23 versus 20; as expected, 18 of these buggy responses in each context were predictable from the two Level 3 strategies (borrow-across-zero and borrow-from-zero).
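For readers unfamiliar with McNemar's test, the computation on these paired error counts can be sketched as follows. This is our illustration, not part of the original analysis; the counts come from the paragraph above, and the uncorrected chi-square statistic is assumed, since it reproduces the reported values:

```python
def mcnemar_chi2(b, c):
    """McNemar's chi-square (uncorrected) for paired binary outcomes.

    Only the discordant pairs enter the statistic:
    b = students whose outcome changed one way (e.g., error only in the
        embedded context), c = those who changed the other way.
    """
    return (b - c) ** 2 / (b + c)

# 873 - 567: 25 students erred in the embedded context, 12 in the paired
# context, and 6 in both, so b = 25 - 6 = 19, c = 12 - 6 = 6, and the
# number of discordant pairs is N = 19 + 6 = 25.
print(mcnemar_chi2(19, 6))   # 6.76, matching the reported chi-square

# 405 - 188: 42 versus 30 students erred, 25 in both contexts, so
# b = 17, c = 5 (N = 22).
print(mcnemar_chi2(17, 5))   # 6.545..., reported (rounded) as 6.55
```

Note that the N in each reported chi-square is the number of discordant pairs, not the total number of students tested.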

Discussion

The results of Experiment 2 replicated well and extended those of Experiment 1. Again, it was confirmed that quite a number of students used immature buggy algorithms even though they could use more advanced algorithms. The proficiency of component skills tended to prevent students from relying on buggy algorithms at Grade 3, but not later. In contrast, the comprehension of the trade principle tended to immunize them against buggy algorithms a few years later. Thus, the results gave support to the effort-saving assumption as well as the correct-procedure assumption.

The results also suggest that children's use of buggy algorithms is based on their optimistic estimate of getting correct answers by doing so. These algorithms, which are simpler than the correct one, are relied on when students have to solve many problems. In other words, it is likely that students regard buggy algorithms as effort-saving, legitimate (therefore attractive) variants.

In spite of the increased number of two-step borrowing items, only one of those whose LMS was at Level 3 revealed an MMS at Level 4. Buggy algorithms for two-step borrowing (borrow-across-zero and borrow-from-zero) may have been exploited only when the students had no other way to proceed. In other words, for these buggy algorithms, the missing-rule view may hold. The small contextual variation observed in the number of buggy responses for the item 405 - 188 = ? was consistent with this interpretation. Because these two strategies are fairly complicated, adopting either of them does not clearly save effort.

GENERAL DISCUSSION

The two experiments reported here showed that a number of students who used low-level buggy algorithms could use more advanced algorithms, including the correct one. Whereas the proficiency of component skills was a negative predictor of the use of buggy algorithms soon after the students had learned how to solve multidigit subtraction problems, the comprehension of the trade principle tended to prevent them from using buggy algorithms a few years later. Conceptual knowledge may have no effect on the algorithmic behavior of students in the lower and middle elementary grades (Davis & McKnight, 1980). In addition, Experiment 2 revealed that the third-grade children who used buggy algorithms had an optimistic estimate of getting correct answers by doing so and that those algorithms were relied on more often when students had to solve many problems than when they were given just two problems. Thus, the results generally supported the assumptions that students believe buggy algorithms to be effort-saving but legitimate variants and that they rely on the buggy algorithms even when they have the correct procedure.

Children's use of multiple strategies is not new. Siegler (1986, 1987) already demonstrated it in various sets of skills. He indicated that "almost all children use several different strategies" and "one strategy is chosen one day and a different one the next day" (1991, p. 82). The results reported here show that his conclusions can be extended to a set of strategies including incorrect ones, and even within the same 4-minute test session. As Siegler (1991) put it, our students seemed to use "each strategy most often on problems where it works especially well" (p. 83), more specifically, where it can save the time and effort needed for solving problems. Likewise, students tend to rely on buggy algorithms when there are many items to solve and when component skills are less proficient; in short, when they can save much effort by relying on variants of the taught procedure.

The findings in the experiments given here indicate that children's systematic errors in multidigit subtraction are not necessarily generated because students lack some procedural rules. Buggy algorithms in multidigit subtraction may be spontaneous inventions (in this case false ones) to reduce mental effort, rather than needed or enforced repairs.
These algorithms are probably generated by applying effort-saving modification heuristics that are valid in other contexts, such as swapping, default-value substitution, ignoring the secondary operation, or doing the same operation on the next available entity. Students may have stored and used the algorithms depending on how quickly and easily they could be applied in solving the problems.

We do not assert that buggy algorithms in multidigit subtraction are always produced by spontaneous invention. Pupils' only choice may be to rely on repair strategies or their stored outcomes; for instance, they may have failed to incorporate the taught, correct algorithm. Even in this study, the buggy algorithms for two-step borrowing seemed to be adopted because the students did not have any other way to solve the problem. Those students who can use the subtraction algorithm correctly for ordinary problems may be stuck when there are intermediate zeroes in the minuend (Davis & McKnight, 1980). Still another possibility is that pupils know correctly how to subtract with borrowing but do not know when that procedure should be applied (Morita, Sato, & Hoshino, 1987). However, the findings reported here generally support the attractive-variant view better than the missing-rule view.
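To make the contrast between the taught procedure and an effort-saving variant concrete, the following sketch (ours, in Python) implements the borrow-and-decrement algorithm alongside smaller-from-larger, a well-documented bug from the Brown and Burton (1978) catalogue in which the student "swaps" the digits in any column where the top digit is smaller, avoiding borrowing altogether. The function names are our own, and smaller-from-larger stands in here for the low-level bugs generally; it is not necessarily one of the nine algorithms coded in these experiments:

```python
def digits(n, width):
    """Decimal digits of n, least significant first, padded to width."""
    return [(n // 10 ** i) % 10 for i in range(width)]

def borrow_and_decrement(top, bottom):
    """The taught column-by-column subtraction with borrowing."""
    width = len(str(top))
    t, b = digits(top, width), digits(bottom, width)
    result, borrow = 0, 0
    for i in range(width):
        d = t[i] - borrow - b[i]
        borrow = 1 if d < 0 else 0          # borrow from the next column
        result += (d + 10 * borrow) * 10 ** i
    return result

def smaller_from_larger(top, bottom):
    """Buggy variant: in each column, subtract the smaller digit from the
    larger one (the 'swapping' heuristic), so no borrowing is ever needed."""
    width = len(str(top))
    t, b = digits(top, width), digits(bottom, width)
    return sum(abs(t[i] - b[i]) * 10 ** i for i in range(width))

# One of the two selected problems from Experiment 2:
print(borrow_and_decrement(873, 567))  # 306 (correct)
print(smaller_from_larger(873, 567))   # 314 (systematic, predictable error)
```

Because the variant's answers are fully determined by the bug, responses such as 314 are exactly the "predicted" erroneous responses used above to classify students' strategy levels; note also how much shorter the variant's control flow is, which is the sense in which it saves effort.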


Our notion of buggy algorithms as attractive variants is similar to the conceptualization offered by Baroody and Gannon (1984) with regard to the use of economical strategies. According to those authors, young children often invent a "labor-saving" addition strategy that is apparently based on their understanding of the principle of commutativity but that, in fact, is not. A similar observation and conceptualization of the generation and use of strategies, in this case physical effort-saving ones among preloaders, was presented by Scribner (1984).

The attractive-variant view of buggy algorithms has a few important instructional implications. It implies that supplementing the missing rules is neither necessary nor sufficient for removing students' systematic errors. "Debugging," or the removal of those systematic errors in human participants, means that the retrievability of erroneous strategies becomes virtually zero, and it thus necessitates long, complex processes, to which at least the following two acquisitions contribute: (a) children's proficiency in component skills, and (b) conceptual understanding by which they can reject a buggy algorithm. Even without any remedial teaching for multidigit subtraction, buggy algorithms should disappear as students become more proficient (and automatic) in component skills or come to understand the trade principle (e.g., through the development of the part-whole schema).

Another important instructional implication may be that children's use of buggy algorithms is not a sign of instructional failure. Teachers need not be upset by the buggy algorithms that many of their students reveal. They could treat the buggy algorithms as an indicator of students' proficiency in the component skills (e.g., single-digit addition) and of their comprehension of the place-value and trade principles. Alternatively, they could encourage students to discuss whether these buggy variants are valid, why they are not legitimate, and what principle they break.
Although the computer metaphor has its advantages, it may be misleading if taken literally. Even when systematic errors seem to be generated by an algorithm involving a local, procedural deficiency, they are, much more often than not, a product of the problem solver's intelligent attempt at adaptation, for example, to save effort by using what is taken to be a legitimate variant. "Remediation at the individual bug level has no future" (Hennessy, 1993, p. 335) because it ignores such goal-directed, complex construction processes (Cobb, 1990; De Corte et al., 1996).

REFERENCES

Baroody, Arthur J., & Gannon, Kathleen E. (1984). The development of the commutativity principle and economical addition strategies. Cognition and Instruction, 1, 321-339.
Brown, John Seely, & Burton, Richard R. (1978). Diagnostic models for procedural bugs in basic mathematical skills. Cognitive Science, 2, 155-192.


Brown, John Seely, & VanLehn, Kurt (1980). Repair theory: A generative theory of bugs in procedural skills. Cognitive Science, 4, 379-426.
Brown, John Seely, & VanLehn, Kurt (1982). Toward a generative theory of "bugs." In T.P. Carpenter, J.M. Moser, & T.A. Romberg (Eds.), Addition and subtraction: A cognitive perspective (pp. 117-136). Hillsdale, NJ: Erlbaum.
Cauley, Kathleen M. (1988). Construction of logical knowledge: Study of borrowing in subtraction. Journal of Educational Psychology, 80, 202-205.
Cobb, Paul (1990). A constructivist perspective on information-processing theories of mathematical activity. International Journal of Educational Research, 14, 67-92.
Davis, Robert B., & McKnight, Curtis (1980). The influence of semantic content on algorithmic behavior. Journal of Mathematical Behavior, 3, 39-87.
De Corte, Erik, Greer, Brian, & Verschaffel, Lieven (1996). Mathematics learning and teaching. In D. Berliner & R. Calfee (Eds.), Handbook of educational psychology. New York: Macmillan.
Hennessy, Sara (1993). The stability of children's mathematical behavior: When is a bug really a bug? Learning and Instruction, 3, 315-338.
Morita, Eiji, Sato, Hitoshi, & Hoshino, Akihiko (1987). A study of errors in multi-digit subtraction: An alternative view. Bulletin of the Chiba University Center for Educational Technology, 8, 41-62.
Resnick, Lauren B. (1982). Syntax and semantics in learning to subtract. In T.P. Carpenter, J.M. Moser, & T.A. Romberg (Eds.), Addition and subtraction: A cognitive perspective (pp. 137-155). Hillsdale, NJ: Erlbaum.
Resnick, Lauren B. (1983). A developmental theory of number understanding. In H.P. Ginsburg (Ed.), The development of mathematical thinking (pp. 109-151). New York: Academic Press.
Resnick, Lauren B. (1987). Constructing knowledge in school. In L.S. Liben & D.H. Feldman (Eds.), Development and learning: Conflict or congruence? (pp. 19-50). Hillsdale, NJ: Erlbaum.
Scribner, Sylvia (1984). Studying working intelligence. In B. Rogoff & J. Lave (Eds.), Everyday cognition (pp. 9-40). Cambridge, MA: Harvard University Press.
Siegler, Robert S. (1986). Unities in strategy choices across domains. In M. Perlmutter (Ed.), Minnesota symposium on child development (Vol. 19, pp. 1-48). Hillsdale, NJ: Erlbaum.
Siegler, Robert S. (1987). Strategy choices in subtraction. In J.A. Sloboda & D. Rogers (Eds.), Cognitive processes in mathematics (pp. 81-106). Oxford, UK: Clarendon.
Siegler, Robert S. (1991). Children's thinking (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Suga, Yasuko, & Hatano, Giyoo (1987, October). A taxonomy of buggy procedures in multidigit subtraction. Paper presented at the 29th annual convention of the Japanese Association of Educational Psychology, Tokyo.
VanLehn, Kurt (1982). Bugs are not enough: Empirical studies of bugs, impasses and repairs in procedural skills. Journal of Mathematical Behavior, 3, 3-72.
VanLehn, Kurt (1990). Mind bugs: The origin of procedural misconceptions. Cambridge, MA: MIT Press.
Young, Richard M., & O'Shea, Tim (1981). Errors in children's subtraction. Cognitive Science, 5, 153-177.