Studies in Educational Evaluation, Vol. 22, No. 1, pp. 29-40, 1996
Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved
0191-491X/96 $15.00 + 0.00
S0191-491X(96)00002-8

THE CUMULATIVE EFFECT OF ABILITY GROUPING ON MATHEMATICAL ACHIEVEMENT: A LONGITUDINAL PERSPECTIVE¹

Sorel Cahan and Liora Linchevski with Naama Ygra and Irit Danziger
School of Education, The Hebrew University, Jerusalem, Israel
Introduction

Ability grouping entails any organizational scheme aimed at decreasing the heterogeneity of the learning group. It may be based on general achievement - involving the creation of homogeneous homeroom classes (streaming) - or on achievement in a particular subject, where students in the same homeroom class learn that subject in groups at different levels (tracking). Within these two categories there is a good deal of variability between implementations regarding such factors as the degree of heterogeneity within the groups, the number and size of levels, and teaching methods. These differences stem from organizational considerations and constraints (e.g., simplifying the school timetable) no less than from educational ones (Hallinan, 1992; Hallinan & Sorenson, 1983).

The main justification for using ability grouping is the need to adapt content, level, pace and teaching methods to students who function at varied levels (Slavin, 1988, 1990; Sorenson & Hallinan, 1986). Such a didactic fit is considered particularly important in mathematics, not only because ability is seen as the central factor explaining differential achievements in this domain (Lorenz, 1982), but also because mathematics is perceived as hierarchical, serial, or cumulative (Ruthven, 1987), which makes it difficult to work with heterogeneous groups. Thus, while more than 80% of mathematics teachers believe that their subject is inappropriate for teaching groups of students of mixed ability, only 16% of
science teachers and 3% of English teachers hold this view about their own subjects (Her Majesty’s Inspectorate, 1978, 1979, 1980).

The proponents of ability grouping perceive it as a means of improving the scholastic achievements of all students. This notion is countered, however, by the argument that homogeneous classes on a low scholastic level decrease the students’ intellectual stimulation, sense of challenge and ambition to progress in their classwork. This, in turn, lowers the chances of the weaker students to progress in their schoolwork and increases the achievement gap between them and their stronger classmates (Braddock, Craim, & McPartland, 1984; Dar, 1985; Eash, 1961). Moreover, grouping seems to exact a social price, as ability levels largely overlap with socioeconomic differences (Alexander, Cook, & McDill, 1978; Kerckhoff, 1986; Oakes, 1982). More importantly, it tends to block the way to equality of opportunity since the level of a student’s ability group in junior high school - especially in mathematics - itself constitutes a consideration in placing the student in a high school track (Oakes, 1992).

The effect of ability grouping on the achievements of students of various ability levels has thus both theoretical and practical interest. Does ability grouping succeed in improving the achievements of students in general and the weaker ones in particular? The appropriate design for examining grouping effects is a random experiment involving the hypothetical assignment of students in heterogeneous classes to ability group levels, together with actual implementation of ability grouping in a random subsample of schools.
However, the difficulty in performing random experiments in the educational system and the methodological problems associated with post-hoc comparisons between schools with and without ability grouping (for a review, see Slavin, 1990) have led to a less ambitious type of research, one that compares ability-group levels (or tracks) within schools in search of differential effects. Studies of this sort investigate whether the gap between better and weaker students after placement in ability groups for a certain length of time differs from the gap that would be expected on the basis of initial differences (Slavin, 1990). Their most prevalent finding is that ability grouping increases the achievement gap between students at different levels (Alexander et al., 1978; Gamoran, 1986; Gamoran & Barns, 1987; Gamoran & Mare, 1989; Oakes, 1982; Sorenson & Hallinan, 1986; Kerckhoff, 1986).

These studies suffer, however, from the methodological difficulty of disentangling the effect of ability grouping per se from that of initial differences. Selection is inherent in the very notion of grouping, which entails differential treatment of students with different abilities, so that it is impossible to separate the treatment from the students. The differential effect of ability grouping cannot be studied by random experiments, as the random placement of students in ability groups is a contradiction in terms. This is an interesting case in which it is impossible in principle to investigate a treatment effect by using an experimental design.

In an attempt to separate the grouping effect from that of initial differences, studies of this sort have statistically controlled for background data (such as IQ scores, previous achievements, socioeconomic status), thereby providing an estimate of the criteria used to place students in ability groups (Gamoran, 1986; Gamoran & Nystrand, 1990; Rowan & Miracle, 1983; Sorenson & Hallinan, 1986; Vanfossen, Jones, & Spade, 1987).
Gamoran and Nystrand (1990), for example, controlled for initial differences by means of a socioeconomic questionnaire and a test given at the beginning of the year. Sorenson and Hallinan (1986) administered a standard achievement test to all students at the beginning of the year and obtained personal data from teachers about the students in each ability group. However, because these data only partially overlap the criteria actually used to place students in ability groups (usually some measure of previous achievement), controlling for them statistically is an insufficient means of resolving the problem of initial selection. This difficulty is especially salient in regard to mathematics, where initial differences in knowledge between group levels are especially great (Slavin, 1990).

The Regression Discontinuity Design
The problem can be overcome if assignment to group levels is based solely on a measure of ability and/or previous achievements in the subject matter at hand (henceforth, the pre-test), according to agreed-upon cutoff points determined by the desired number and size of ability groups. If this condition is satisfied, then the variance among student scores on a common achievement test some time later (the post-test) can be seen as the sum of two effects: (1) the effect of initial differences (i.e., differences on the pre-test), and (2) the effect of the group level. These two effects can be disentangled by means of a regression discontinuity design (Cook & Campbell, 1979): the effect of initial differences is estimated by the regression line of the post-test on the pre-test within each group level, while the grouping effect is estimated by the discontinuity between the regression lines within contiguous group levels (Figure 1).
[Figure 1, three panels plotting post-test against pre-test for a higher and a lower group: (a) no grouping effect; (b) positive effect of high group; (c) negative effect of high group]
Figure 1: Hypothetical regressions of post-test scores on pre-test scores, reflecting the existence (1b, 1c) and absence (1a) of a grouping effect (dotted line)
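The logic of the regression discontinuity design can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `fit_line` and `discontinuity_at_cutoff` and the data are made up for illustration, and only two contiguous group levels are considered. A separate least-squares line is fitted within each group, and the grouping effect is read off as the gap between the two lines at the cutoff.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def discontinuity_at_cutoff(higher, lower, cutoff):
    """Gap between the two within-group regression lines at the cutoff.

    `higher` and `lower` are lists of (pre_test, post_test) pairs for two
    contiguous group levels. A positive result means that, at the same
    pre-test score, the higher group is predicted to score more.
    """
    b_hi, a_hi = fit_line(*zip(*higher))
    b_lo, a_lo = fit_line(*zip(*lower))
    return (a_hi + b_hi * cutoff) - (a_lo + b_lo * cutoff)

# Made-up data: both groups share a slope of 1.0 (effect of initial
# differences); the higher group's line is shifted upward by 0.5.
lower = [(-2.0, -2.0), (-1.0, -1.0), (0.0, 0.0)]
higher = [(0.0, 0.5), (1.0, 1.5), (2.0, 2.5)]
gap = discontinuity_at_cutoff(higher, lower, cutoff=0.0)
print(round(gap, 6))
```

In this constructed example the slope within each group reflects initial differences only, while the vertical gap at the cutoff corresponds to panel (b) of Figure 1.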
Abadzi (1984) used this design to investigate the effect of streaming on the achievements of fourth-grade students in eight American schools. The results showed that group assignment did have a significant effect: after one year of streaming the gap between the groups increased. At the end of the second year, however, this gap had diminished considerably (Abadzi, 1985). According to this study, therefore, the differential effect of ability grouping is not cumulative. Of course, these results may be specific to the particular grade level (elementary school) or sample involved, as well as to the idiosyncratic implementation of the ability grouping idea: streaming (rather than tracking); the number (2), relative size and heterogeneity of the group levels; and the treatments (e.g., curriculum, teaching methods) administered to each group. Thus, additional replications are needed to establish generalizability, preferably with radically different implementations of ability grouping.

The present study takes one step in this direction. Using the regression discontinuity design, we investigated the cumulative effect of ability grouping (tracking) in mathematics in Israeli junior high schools. The achievement tests (post-tests) used were based on the schools’ curricula, reflecting the material actually taught, rather than the standardized tests that have generally been employed (Slavin, 1990). In addition, the grouping effect was estimated separately for each of the nine schools investigated; these estimates constitute independent replications. This is in sharp contrast to earlier studies (including Abadzi, 1984), which calculated effects in a pooled sample of schools, hence obtaining results which are not necessarily representative of the majority of schools, or even of the average school. Finally, our study included information concerning student movement from one group level to another, enabling us to arrive at an even more precise definition of the subject pool.
Method

Design

The study was a longitudinal one and involved repeated testing of the same students at the end of the first and last (third) year of study in junior high school.

Sample

The target population consisted of all students enrolled in the seventh grade during the 1989/90 school year, in all junior high schools in Israel that practiced ability grouping in mathematics and that met the following criteria:

1. There were at least three levels of ability groups in mathematics.
2. Students were allocated to groups according to a well-defined criterion (generally scores on a grade-wide placement test).
3. The placement data were kept in school records that were available for perusal.
Of the 15 schools that met all these conditions (the third was the most problematic), 12 agreed to cooperate and were included in the initial sample. During the course of the study three schools dropped out, so that the sample ultimately included nine junior high schools. Out of the initial number of seventh graders in the participating schools, about 90% in each school took the mathematics test administered at the end of seventh grade and about 70% also took the post-test administered two years later, at the end of ninth grade (see Table 1). To allow for meaningful comparisons between grouping effects at the end of seventh and ninth grades, the estimation of both effects was based on a common sample within each school. In each school this sample consisted of all ninth-grade students in 1991/92 who met the following conditions (Table 1, Column D):

1. Took the post-test administered at the end of the seventh grade.
2. Took the post-test administered at the end of the ninth grade.
3. Did not move from one group level to another during the eighth or ninth grade (see below).
Table 1: Number of Students by Stage and School

School   Initial number    Took post-test 1    Took both         Final number
         of students (A)   (7th grade) (B)     post-tests (C)    of students* (D)
1        278               158                 120               114
2        355               228                 190               179
3        179               148                 115               104
4        117                92                  74                67
5        279               257                 206               206
6        184               165                 139               127
7        126                91                  85                73
8        193               169                 143               129
9        238               205                 175               170

* Took both tests and did not change level in Grade 8 or 9.
Variables

Pre-test score. The pre-test score was obtained from a placement test administered by the schools at the beginning of seventh grade. Since the schools administered different tests, scores were standardized separately for each school, with a mean of 0 and a standard deviation of 1. By examining the regression of group level on the pre-test score, we were able to locate the cutoff points used by each school for allocating students to the ability groups.
Ability group level. The individual student’s ability level was determined by the school at the beginning of the seventh grade according to his/her pre-test score. In all schools in the sample, there were either three or four group levels (level 1 = high; level 4 = low) which constituted separate math classes.

Post-test scores. For each student there are two post-test scores: (1) the score obtained on the mathematics test administered at the end of the seventh grade (post-test 1), and (2) the score on the mathematics test administered at the end of the ninth grade (post-test 2). The tests were constructed on the basis of the mathematics curriculum and only included material common to all ability groups (post-test 1 included whole numbers, rational numbers, algebra I and word problems; post-test 2 included algebra II, word problems, geometry and functions). They were approved by experts and by the General Inspector for Mathematics Teaching at the Israeli Ministry of Education. In order to control for between-school differences in mean achievement levels and test-score variance, the scores on both post-tests were standardized separately for each school, with a mean of 0 and an SD of 1.
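The within-school standardization applied to the pre-test and post-test scores can be sketched as follows. This is an illustration only: the school labels and raw scores are invented, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say which SD estimator was used.

```python
from statistics import mean, pstdev

def standardize_within_school(scores_by_school):
    """Convert each school's raw scores to z-scores (mean 0, SD 1).

    Assumption: the population SD is used as the scaling factor.
    """
    z = {}
    for school, scores in scores_by_school.items():
        m, sd = mean(scores), pstdev(scores)
        z[school] = [(s - m) / sd for s in scores]
    return z

# Invented raw scores from two schools that used different tests.
raw = {"school_a": [40, 50, 60], "school_b": [70, 80, 90, 100]}
z = standardize_within_school(raw)
for school, values in z.items():
    print(school, [round(v, 2) for v in values])
```

After this step a score expresses a student's position relative to his or her own school only, which is why the effects reported later are in within-school SD units.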
Implementation of the Regression Discontinuity Design

Underlying the use of the regression discontinuity design to estimate the differential effect of ability grouping are two assumptions:

1. Children are allocated to group levels only on the basis of their pre-test scores, according to well defined cutoffs.
2. The ability group is fixed for each student, i.e., students do not move from one group to another.
Obviously, these assumptions are not perfectly met in practice. As a regression of ability-group level on the pre-test score revealed, some students were placed in a different level than that indicated by their scores (the relative frequency of such exceptions ranged between 0% and 23% of the seventh graders across the nine schools, with a median of 18%). We interpreted this discrepancy as the school’s “correction” of a measurement error in the pre-test scores, based on additional data, mainly sixth-grade marks. For these students, the pre-test scores were revised; they were assigned a random score out of the distribution of pre-test scores in the ability group in which they were actually placed.

Nor was the second assumption found to be universally true. Some students were moved from one ability group to another during their three years of study in junior high school. We treated such exceptions differently, depending on the grade level in which they occurred. When students were moved during the first year (seventh grade), their ability level was defined according to the group they were in for at least half the school year (20 weeks or more). Such adjustments were performed for between 0% and 18% of the students across the nine schools, with a median of 4%. Students who moved from one group to another in eighth or ninth grade (between 0% and 20% of the students who took both post-tests, with a median of 9%) were excluded from data analysis because it was impossible to trace the length of their stay in each ability group.
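The revision of pre-test scores for students placed off-cutoff might look roughly like the sketch below. Several details are assumptions made for illustration and are not stated in the text: the cutoffs are given as descending lower bounds, the replacement score is drawn from the observed scores of the students correctly placed in the actual group, and the names `indicated_level` and `revise_pretest` are hypothetical.

```python
import random

def indicated_level(score, cutoffs):
    """Level implied by a pre-test score (level 1 = high).

    `cutoffs` holds the lower bounds of levels 1..m-1 in descending order.
    """
    for level, bound in enumerate(cutoffs, start=1):
        if score >= bound:
            return level
    return len(cutoffs) + 1

def revise_pretest(students, cutoffs, rng):
    """Replace the pre-test score of every student placed off-cutoff.

    `students` is a list of (pre_test, actual_level) pairs. A student whose
    actual level differs from the indicated one receives a score drawn at
    random from the correctly placed students of the actual level
    (an assumption about which distribution is sampled).
    """
    pools = {}
    for score, level in students:
        if indicated_level(score, cutoffs) == level:
            pools.setdefault(level, []).append(score)
    revised = []
    for score, level in students:
        if indicated_level(score, cutoffs) != level:
            score = rng.choice(pools[level])
        revised.append((score, level))
    return revised

# Invented example with three levels; the fifth student scored 0.6
# (indicating level 1) but was actually placed in level 2.
cutoffs = [0.5, -0.5]
students = [(1.2, 1), (0.8, 1), (0.1, 2), (-0.2, 2), (0.6, 2), (-1.0, 3)]
revised = revise_pretest(students, cutoffs, random.Random(0))
print(revised)
```

After the revision every student's score is consistent with the group actually attended, so the within-group regressions are fitted over non-overlapping pre-test ranges.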
Calculation of Effects
The effects of grouping and of the initial differences at entry to junior high school on achievement in each school were calculated separately for each post-test. The overall effect of the initial differences between the students was defined as equal to $\sum_{j=1}^{m} P_j$, where $m$ indicates the number of ability groups in the school and $P_j$ is the effect of initial differences for ability group $j$:

$$P_j = b_j \, (x_{\max j} - x_{\min j})$$

where $(x_{\max j} - x_{\min j})$ is the range of the pretest scores within ability group $j$, and $b_j$ is the regression coefficient of the pretest scores in ability group $j$ (see Figure 2). Similarly, the overall grouping effect was defined as equal to $\sum_{j=1}^{m-1} L_j$, where $L_j$ is the effect of ability group $j$. As illustrated by Figure 2, $L_j$ is the difference between the predicted posttest score for $x_{\min j}$ in group $j$ and that predicted for the same pretest score ($x_{\min j}$) in group $j+1$, the next lower group.
[Figure 2: within-group regression lines of post-test on pretest (x) for Levels 1, 2 and 3, with the initial-difference effects P1, P2, P3 marked along each line and the grouping effects L1, L2 marked as the vertical gaps between contiguous levels]
Figure 2: Computation of the effects of ability grouping (L1 + L2) and initial differences at entry to junior high school (P1 + P2 + P3) using the regression discontinuity design. A hypothetical example.
For example, let $m = 3$, $P_1 = 2.1$, $P_2 = 1.9$, $P_3 = 1.8$, $L_1 = 1.5$ and $L_2 = 0.8$. In this case, the overall effect of the initial differences is

$$\sum_{j=1}^{3} P_j = P_1 + P_2 + P_3 = 2.1 + 1.9 + 1.8 = 5.8.$$

Accordingly, the overall grouping effect is:

$$\sum_{j=1}^{2} L_j = L_1 + L_2 = 1.5 + 0.8 = 2.3.$$
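The computation of the two overall effects from the within-group regression lines can be sketched in Python. The slopes, intercepts and pretest ranges below are invented so that they reproduce the hypothetical values of the example above; the dictionary keys and function names are assumptions made for illustration, and groups are listed from level 1 (high) downward.

```python
def initial_difference_effects(groups):
    """P_j = b_j * (x_max_j - x_min_j) for every ability group."""
    return [g["slope"] * (g["x_max"] - g["x_min"]) for g in groups]

def grouping_effects(groups):
    """L_j: gap, at the lowest pretest score of group j, between the line of
    group j and the line of the next lower group."""
    effects = []
    for upper, lower in zip(groups, groups[1:]):
        x = upper["x_min"]
        upper_pred = upper["intercept"] + upper["slope"] * x
        lower_pred = lower["intercept"] + lower["slope"] * x
        effects.append(upper_pred - lower_pred)
    return effects

# Invented regression lines, ordered from level 1 (high) to level 3 (low).
groups = [
    {"slope": 2.1, "intercept": 2.25, "x_min": 0.5, "x_max": 1.5},
    {"slope": 1.9, "intercept": 0.85, "x_min": -0.5, "x_max": 0.5},
    {"slope": 1.8, "intercept": 0.0, "x_min": -1.5, "x_max": -0.5},
]
P = initial_difference_effects(groups)
L = grouping_effects(groups)
print([round(p, 1) for p in P], round(sum(P), 1))
print([round(l, 1) for l in L], round(sum(L), 1))
```

With these made-up lines the sums agree with the example (5.8 for initial differences, 2.3 for grouping). Note that there is one grouping effect fewer than there are groups, since each effect is defined between two contiguous levels.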
Results

Table 2 displays the estimated effects of ability level and initial differences at entry to junior high school (beginning of grade 7) on achievement at the end of the first and last year of study in junior high school (the end of grades 7 and 9, respectively) for each of the nine schools. At the end of seventh grade, after one year of ability grouping, the grouping effect ranged from 0.1 to 1.3 SD, with a median of 0.9 SD. In the majority of schools this effect was considerably lower than that of initial differences (median difference = -0.7 SD; Column G in Table 2). After three years of tracking (end of ninth grade), the grouping effect ranged from 0.7 to 2.1 SD, with a median of 1.4 SD, and was greater than the seventh-grade grouping effect in every school (differences ranged from 0.1 to 0.8 SD, with a median of 0.6 SD; Column E). This increase in the grouping effect over the last two years of junior high school was accompanied by a decrease in the effect of initial differences in eight out of the nine schools (Column F in Table 2). As a result, at the end of ninth grade the grouping effect exceeded that of initial differences in most of the schools (Column H).

Table 2: Estimated Effects of Group Level and Initial Differences at Entry to Junior High School on Achievement at the End of the Seventh and Ninth Grade, by School (SD units)

Columns:
(A) Effect on achievement of group level, Grade 7
(B) Effect on achievement of group level, Grade 9
(C) Effect on achievement of initial differences at entry to junior high school, Grade 7
(D) Effect on achievement of initial differences at entry to junior high school, Grade 9
(E) Difference (B)-(A)
(F) Difference (D)-(C)
(G) Difference (A)-(C)
(H) Difference (B)-(D)
;*‘2
1.3 1.5
0.7 0.2
-0.9 -0.7
-1.5 -1.3
0.1 -0.4
1 2
iiz
1.4
: :
0:1 0.8 ::i
A*$ 1:5 K
2:1 1.5 ::;
1.4 1.2 0.6
0.6 0.7 E:;’
-0.9 -0.1 -0.6
-2.0 -0.7 -7.;
-0.5 0.1 -1.1 1.0
i 9
:4 0:9
2:1 1.2 1.5
:*z 1:5
i::: ii::
E!; 0:6
-yj -0:3 -0.7
-0:4 -0.5 -0.6
A:: 0.7
Median: (A) 0.9, (B) 1.4, (C) 1.7, (D) 1.2, (E) 0.6, (F) -0.7, (G) -0.7, (H) 0.1
Discussion

The results of the present study, although based on a different methodology, are consistent with the most frequent finding in the literature (Alexander et al., 1978; Gamoran & Barns, 1987; Gamoran & Mare, 1989; Oakes, 1982; Sorenson & Hallinan, 1986), namely, that placement in ability groups increases the gap between students at different group levels. In other words, if students with the same pre-test scores, close to the cutoff point, were randomly placed in groups at different levels, the scholastic achievements of the students in the higher group would be greater, on the average, than those of the students placed in the next lower group. The direction of the effect was consistent across all nine schools in the sample.

Moreover, ability grouping was found to have a real cumulative effect on achievement. In each of the schools investigated, the effect was greater after three years of grouping (end of ninth grade) than after one year (end of seventh grade), with a median increase of 0.6 SD (Table 2, Column E). The consistency of results across the nine junior high schools studied - which constitute independent replications - supports this conclusion. Such support is particularly important owing to the partial fulfilment of the assumptions underlying the regression discontinuity approach: allocation to groups in our sample of schools was not based solely on pre-test scores according to clearly defined cutoff points, and some students moved from one level to another.

The size of the cumulative effect of ability grouping - a median of 1.4 SD at the end of the ninth grade, greater than the effect of individual differences in most cases - is quite impressive, especially in comparison with the size of other effects in education, where an effect of 0.5 SD is considered large (Cohen, 1977). In fact, the obtained effect of group level underestimates the real effect size.
As explained earlier, the tests which examined student achievements were constructed on the basis of material common to all the ability groups. Any material studied in the higher-level but not in the lower-level groups was not included. Had it been included, this would have increased the differences between groups (that is, the estimate of the grouping effect).

These findings raise questions about the feasibility of using ability grouping in junior high schools to promote the achievements of all students and to narrow scholastic gaps. At the same time, it must be stressed that the results of the present study do not refer to the effect of ability grouping per se, but rather to its implementation in the junior high schools investigated. In fact, the discrepancy between our results and those of Abadzi’s (1985) study - which found a diminished grouping effect over time - points to the limited generalizability of conclusions in this domain, as well as the importance of replications. It should also be emphasized that widening of the gap between stronger and weaker students might also occur in heterogeneous classes. Therefore, our results have no clear bearing on the comparative benefits of homogeneous and heterogeneous educational settings.
Notes

1. This study was supported by grants from the Israeli Ministry of Education, Culture and Sport and by the NCJW Research Institute for Innovation in Education, The Hebrew University of Jerusalem.
2. The actual calculation of effects used the multiple regression equation of post-test scores on pre-test scores and ability-group levels. In this design, the regression coefficients of ability level and pre-test equal the mean, across ability groups, of the effects of ability level and initial differences, respectively. Thus, the overall effect of grouping in each school is $(m-1)\,b_L$, where $b_L$ is the regression coefficient of ability level. The overall effect of initial differences is $b_P\,(x_{\max} - x_{\min})$, where $b_P$ is the regression coefficient of the pre-test score and $(x_{\max} - x_{\min})$ is the range of the pre-test scores within the school.
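The multiple-regression calculation described in Note 2 can be illustrated with a small Python sketch. It is not the authors' code: the data are invented, the helper name `two_predictor_ols` is hypothetical, and ability level is coded here as a rank in which a larger value means a higher group (an assumption, the reverse of the paper's labelling in which level 1 is the highest), so that a positive coefficient favors the higher group.

```python
from statistics import mean

def two_predictor_ols(x1, x2, y):
    """Least-squares slopes of y on x1 and x2, with an intercept.

    Uses the closed-form solution of the normal equations for two
    predictors, computed on mean-centered data.
    """
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Invented school with m = 3 groups; rank 2 is the highest group.
pre = [-1.5, -1.0, -0.6, -0.4, 0.0, 0.4, 0.6, 1.0, 1.5]
rank = [0, 0, 0, 1, 1, 1, 2, 2, 2]
# Post-test built as 0.5 * pre + 0.4 * rank, so the fit should recover both.
post = [0.5 * p + 0.4 * r for p, r in zip(pre, rank)]

b_p, b_l = two_predictor_ols(pre, rank, post)
m = len(set(rank))
overall_grouping = (m - 1) * b_l
overall_initial = b_p * (max(pre) - min(pre))
print(round(overall_grouping, 3), round(overall_initial, 3))
```

In this constructed case the overall grouping effect is (3 - 1) x 0.4 and the overall effect of initial differences is 0.5 times the pre-test range of 3.0, following the two formulas in Note 2.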
References

Abadzi, H. (1984). Ability grouping effects on academic achievement and self-esteem in a southwestern school district. Journal of Educational Research, 77, 287-292.

Abadzi, H. (1985). Ability grouping effects on academic achievement and self esteem: Who performs in the long run as expected? Journal of Educational Research, 79, 36-39.

Alexander, K.L., Cook, M.A., & McDill, E.L. (1978). Curriculum tracking and educational stratification. American Sociological Review, 41, 963-980.

Braddock, I.J., Craim, R.L., & McPartland, J.M. (1984). A long term view of school desegregation: Some recent studies of graduates as adults. Phi Delta Kappan, Dec. 84, 259-264.

Cohen, J. (1977). Statistical power analysis for the behavioral sciences (revised ed.). New York: Academic Press.

Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation: Design & analysis issues for field settings. Boston: Rand McNally.

Dar, Y. (1985). Teachers’ attitudes toward ability grouping: Educational considerations and social and organizational influences. Interchange, 16 (2), 17-38.

Eash, M.J. (1961). Grouping: What have we learned? Educational Leadership, 18, 429-434.

Gamoran, A. (1986). Instructional and institutional effects of ability grouping. Sociology of Education, 59, 185-198.

Gamoran, A., & Barns, M. (1987). The effects of stratification in secondary schools: Synthesis of survey and ethnographic research. Review of Educational Research, 57, 415-435.

Gamoran, A., & Mare, R. (1989). Secondary school tracking and educational inequality: Compensation, reinforcement, or neutrality? American Journal of Sociology, 94, 1146-1183.

Gamoran, A., & Nystrand, M. (1990). Tracking, instruction and achievement. Paper presented at the World Congress of Sociology, Madrid, Spain.

Hallinan, M.T. (1992). The organization of students for instruction in the middle school. Sociology of Education, 65, 114-127.

Hallinan, M.T., & Sorenson, A.B. (1983). The formation and stability of instructional groups. American Sociological Review, 48, 838-851.

Her Majesty’s Inspectorate. (1978). Primary education in England. HMSO, London.

Her Majesty’s Inspectorate. (1979). Aspects of Secondary Education in England. HMSO, London.

Her Majesty’s Inspectorate. (1980). Aspects of Secondary Education in England: Supplementary Information on Mathematics. HMSO, London.

Kerckhoff, A.C. (1986). Effects of ability grouping in British secondary schools. American Sociological Review, 51, 842-858.

Lorenz, J.H. (1982). On some psychological aspects of mathematics achievement assessment and classroom interaction. Educational Studies in Mathematics, 13, 1-19.

Oakes, J. (1982). The reproduction of inequality: The content of secondary school tracking. The Urban Review, 14, 107-120.

Oakes, J. (1992). Can tracking research inform practice? Technical, normative, and political considerations. Educational Researcher, May, 12-21.

Rowan, B.A., & Miracle, W., Jr. (1983). System of ability grouping and the stratification of achievement in elementary schools. Sociology of Education, 56, 133-144.

Ruthven, H. (1987). Ability stereotyping in mathematics. Educational Studies in Mathematics, 18, 243-253.

Slavin, R.E. (1988). Synthesis of research on grouping in elementary and secondary schools. Educational Leadership, 46, 67-76.

Slavin, R.E. (1990). Achievement effects of ability grouping in secondary schools: A best-evidence synthesis. Review of Educational Research, 60 (3), 471-499.

Sorenson, A.B., & Hallinan, M.T. (1986). Effects of ability grouping on academic achievement. American Educational Research Journal, 23, 519-542.

Vanfossen, B.E., Jones, J.D., & Spade, J.Z. (1987). Curriculum tracking and status maintenance. Sociology of Education, 60, 104-122.
The Authors

SOREL CAHAN is a Senior Lecturer at the School of Education of the Hebrew University of Jerusalem, specializing in educational measurement and research and evaluation methodology.

LIORA LINCHEVSKI is a senior teacher at the School of Education of the Hebrew University of Jerusalem, specializing in the teaching of mathematics.