Class size as a means of three-tiered support in Finnish primary schools



Learning and Individual Differences 56 (2017) 96–104

Contents lists available at ScienceDirect

Learning and Individual Differences journal homepage: www.elsevier.com/locate/lindif

Mari-Pauliina Vainikainen¹, Ninja Hienonen⁎,¹, Risto Hotulainen

University of Helsinki, Faculty of Educational Sciences, Centre for Educational Assessment, Finland

ARTICLE INFO

Keywords: Class size; Thinking skills; Support needs; Tiered support model; Longitudinal study

ABSTRACT

In Finland, class size is used as a means of support by placing students with milder support needs in slightly smaller classes. This study tests the scientific basis of this practice by following the development of 869 students' performance from the fourth grade to the sixth grade, analyzing the effects of class size on performance and the patterns of performance in the groups of students receiving tier 2 (n = 69) and tier 3 (n = 36) support. The results confirmed that on average larger classes perform better and that students receiving support study in slightly smaller classes. At the individual level, receiving support was related to lower initial performance and the gap increased during the follow-up. However, at the class level, the proportion of students receiving support in the class predicted later performance positively. Class size was only related to initial differences and not to the development of performance.

⁎ Corresponding author. E-mail address: ninja.hienonen@helsinki.fi (N. Hienonen).
¹ These authors contributed equally.

http://dx.doi.org/10.1016/j.lindif.2017.05.004
Received 20 July 2016; Received in revised form 12 May 2017; Accepted 13 May 2017
1041-6080/ © 2017 Elsevier Inc. All rights reserved.

1. Introduction

Class size is one of the most controversial topics in the politics of education. Teachers and parents exert great pressure to diminish class sizes and in Finland the Ministry of Education and Culture has recently provided the organizers of education with considerable amounts of extra funding for this purpose (Ministry of Education and Culture, 2014). The common understanding in the field is that especially students with support needs would benefit from studying in smaller classes as they would have more opportunities for interaction with the adults in the class. However, there is little sound evidence about the effectiveness of regulating the class size in general and hardly any research on how the regulation of class size functions as a means of support, even though it is commonly practiced in Finland. Therefore, the aim of the present study is to test these assumptions by following the development of 869 students' thinking skills from the beginning of the fourth grade to the end of the sixth grade. We analyze whether class size affects the development of performance when initial differences are controlled for and whether the pattern is similar for students receiving tier 2 or tier 3 support.

1.1. Class size and academic performance

One main challenge when investigating the relationship between class size and student achievement is that students are seldom distributed to classes randomly; hence, there is a positive correlation between class size and performance (Akerhielm, 1995; Kupiainen & Hienonen, 2016). Some well-designed experimental studies have attempted to investigate the effects of class size in randomized settings. The most notable of them has been the Student-Teacher Achievement Ratio (STAR) project, which was a four-year, large-scale randomized experiment based in Tennessee in the mid-1980s. In the experiment, both students and teachers were randomly assigned to smaller or larger classes within schools (Finn & Achilles, 1990; Krueger & Whitmore, 2001). The first results indicated the positive effects of small classes and also indicated more benefits for low-performers and minority students (Finn & Achilles, 1990; Krueger, 1999). However, in more recent analyses of the data, Konstantopoulos (2007) found that although all types of students benefited from being in small classes, reductions in class size did not reduce the achievement gap between low and high performers. On the contrary, high-achieving students may have benefited even more (see also Rice, 1999). Konstantopoulos and Traynor (2014) later used Progress in International Reading Literacy Study (PIRLS) data and found the opposite: a positive association between class size and achievement. However, the relation was not statistically significant when teacher, classroom and school variables were taken into account.

When investigating the effects of class size, there are naturally many classroom processes that should be taken into account (Pedder, 2006). The effects of class size are often related to teacher-student interaction (e.g., Konstantopoulos & Sun, 2013), student on-task behavior and student-student relations that may or may not result in learning outcomes (Hattie, 2005). Blatchford, Edmonds and Martin (2003) found that there were more individualized task-related contacts between teacher and student in small classes, but the diminished class size did not affect student on-task behavior or peer interaction (see also

Blatchford, Edmonds, & Martin, 2003). The explanations for low performers benefiting from smaller classes are most often related to the teacher. Some studies suggest that teachers in smaller classes are more likely to focus their attention on specific students and provide individualized instruction and feedback (Blatchford, Basset, Goldstein, & Martin, 2003; Blatchford & Martin, 1998; Hargreaves, Galton, & Pell, 1998; Molnar et al., 1999), whereas other studies have shown that teachers seldom change their teaching or instructional practices according to the class size (Hattie, 2005; Hoxby, 2000). According to Betts and Shkolnik (1999), teachers react more strongly to class size changes when teaching below-average students. The focus of the present study is mainly on students receiving support, who usually perform below average.

1.2. The Finnish support model

In Finland, supporting the weakest learners has been considered to be extremely important ever since the implementation of comprehensive school in the 1970s (Graham & Jahnukainen, 2011; Sabel, Saxenian, Miettinen, Kristensen, & Hautamäki, 2011). The effectiveness of the support system has been demonstrated in international comparisons, in which the weakest Finnish students have usually outperformed their comparison groups in other countries (e.g., OECD, 2013). However, the share of low-performing students has increased since 2003 (OECD, 2016). The current three-tiered support model (National Board of Education, 2016; see also Thuneberg et al., 2013, 2014) emphasizes early identification and preventative actions. To a certain extent, this multi-tier model is functionally equivalent to the Response-to-Intervention (RTI) service delivery model in the United States (Jahnukainen & Itkonen, 2015; see also Björn, Aro, Koponen, Fuchs, & Fuchs, 2015, for a slightly different view).

The starting point of the Finnish model is that, with some exceptions, moving to the next tier is possible only when the previous tier has proven to be insufficient. Tier 1, general support, should be provided immediately when any concern is raised and support can be only temporary. Tier 1 interventions can be conducted at a school or class level or they can be individually designed for specific students. The most common means of support at this tier are differentiation, remedial instruction and part-time special education either as co-teaching or in a smaller group (Thuneberg et al., 2013). Receiving general support does not require any decisions or official documents, and therefore it is difficult to evaluate the effectiveness of the support in quantitative studies like the present one.

If general support is concluded to be insufficient, a pedagogical assessment is conducted in multiprofessional collaboration (Vainikainen, Thuneberg, Greiff, & Hautamäki, 2015). According to the assessment, an individual learning plan is created and tier 2 intensified support is organized. Intensified support consists largely of the same means as general support, but the intensity increases and multiple types of interventions are typically implemented simultaneously. Even though regulating class size is not officially tied to the support system, it seems to be common to place students with such milder support needs in slightly smaller classes so that they can receive more attention from the pedagogical staff.

When intensified support fails to provide sufficient support for the student, a pedagogical evaluation is conducted in multiprofessional collaboration and an individual education plan is drawn up accordingly. Tier 3 always requires an official decision, and only at the tier 3 level can full-time special education be provided and a student be placed more or less permanently in a clearly smaller special education class.

One aim of the present study is to deepen the understanding of how class size and intensity of support are related during tier 2 and tier 3 support and to analyze whether these students benefit from studying in smaller classes.

1.3. Assigning students into classrooms in Finland

In Finland, classes are often reorganized after the second grade. From then on, class placements are fairly permanent until the end of the sixth grade. The average class size of the first and the second grade classes is smaller than in the higher grades: in 2012, the average class size for grades 1–2 was 18.7, whereas for grades 3–6 it was 20.2 (Karjalainen & Lamberg, 2014). The first foreign language choices are made after the second grade and students are likely to be divided across classes according to the chosen language. In other words, students who have chosen more exceptional languages will probably end up in the same classes (e.g., Kosunen, 2016). In general, students studying exceptional languages perform better, as they tend to come from homes where the parents have a higher level of education (e.g., Kalalahti & Varjo, 2016). Moreover, at least in larger cities, there are other types of selective classes with a special emphasis (e.g., music, science). Their student admission is based on application and selection via aptitude tests in the emphasized subject area (Kosunen, 2016). There are some indications that students receiving tier 2 or tier 3 support only seldom study in these emphasized classes, though this needs to be investigated further. These practices are among the main reasons for the positive correlation between class size and student performance.

Tier 2 support is provided as part of mainstream education. At the tier 3 level, students primarily stay in regular classes; only if their needs cannot be met there can a part-time or full-time small group be used as an alternative. A tier 3 student's primary teaching group must be stated in the decision on special support. Under the Basic Education Decree (852/1998), in education given to students receiving tier 3 support, the teaching group may consist of a maximum of ten pupils, though this can be exceeded when justified. In 80% of general education classes in primary schools in Finland, there are students receiving tier 2 or tier 3 support (Kupiainen & Hienonen, 2016).

Also, it seems that students at tier 2 or tier 3 level are distributed to classes of very different sizes. Approximately two thirds of tier 2 level students study in regular size classes (with 16 to 29 students) whereas half of the students at tier 3 level study in smaller classes (with < 16 students). The present study investigates whether students receiving support are placed in smaller classes and whether they seem to benefit from studying in smaller classes.

1.4. The development of thinking skills during primary school

There is a relatively common agreement that in addition to subject matter-specific knowledge, education should enhance more general skills needed in all learning (e.g., Recommendation 2006/962/EC of the European Parliament and of the Council of 18 December 2006). The importance of developing thinking skills as a general goal of education has been understood for decades (Resnick, 1987); nowadays, they are often the focus of curricula as well as national and international educational assessment frameworks (see Adey & Csapó, 2011; Moseley, Elliott, Gregson, & Higgins, 2005; Vainikainen, Hautamäki, Hotulainen, & Kupiainen, 2015). In the 1990s, Finland had already defined learning to learn (Hautamäki et al., 2002) as a measurable outcome of education (National Board of Education, 1999) and the assessment of thinking skills forms an important part of its measurement. This study utilizes the data of one of the ongoing longitudinal Finnish studies focusing on the development of learning to learn (first reported by Vainikainen, Wüstenberg, Kupiainen, Hotulainen, & Hautamäki, 2015).

The thinking skills tasks used in this study are related to curricular contents but they require the application of higher-order thinking skills instead of just the repetition of subject-specific knowledge. The tasks of the instrument can roughly be grouped into the categories of general reasoning, mathematical thinking and reading comprehension skills (see Vainikainen, Wüstenberg et al., 2015, for further details). They can be interpreted through the theory developed by Demetriou, Spanoudis, and Mouyi (2011) on the architecture, development and education of the human mind, as they measure the functioning of the inference system and problem solving in the contexts of categorical, quantitative, spatial, causal and verbal structural systems.


In Finland children begin formal schooling at the age of seven years. At this age, children are in a transition regarding the development of thinking skills and the differences between children can be relatively large (Demetriou et al., 2011). Even though longitudinal studies have shown that initial differences predict later achievement relatively strongly (e.g., Duncan et al., 2007), there is also a strong body of evidence showing that both specific training and, more generally, schooling influence the development of these skills (Adey, Csapó, Demetriou, Hautamäki, & Shayer, 2007; Hotulainen, Mononen, & Aunio, 2016; Molnár, 2011) and part of this development seems to happen at a class level (Vainikainen, Hautamäki et al., 2015). It is likely that beyond instruction, organizational factors influence the student outcomes as well. Regarding students who receive tier 2 or tier 3 support, there is evidence that the development of their thinking skills is slightly slower and the differences increase over time (Vainikainen, 2014), but the mechanisms are still unknown. The present study tests the assumptions that in addition to the general Matthew effect (Stanovich, 1986) class size could have a separable effect on the development of thinking skills, and that students receiving tier 2 or tier 3 support could benefit from studying in smaller classes, even though there is otherwise contradictory evidence on the effectiveness of the diminished class size on achievement. If effects can be found for general thinking skills, in the future it should be studied whether the conclusions are also applicable to the development of subject matter-specific knowledge.

1.5. Research questions and hypotheses

There are cross-sectional studies showing that in Finland class size is positively related to performance, and there are some indications that this might be due to the regulation of class size according to students' support needs and, more generally, the level of performance, by having more students in classes with a special emphasis (Kupiainen & Hienonen, 2016). Thus, the aim of the present study is to test these assumptions using longitudinal data. The research questions and the hypotheses are:

Q1: Is class size positively related to thinking skills already at the beginning of the fourth grade due to the use of class size regulation as a means of support?

H1. The positive correlation of class size and performance can already be seen in the fourth grade as students receiving support (tier 2 and tier 3) study in classes which are on average smaller.

Q2: Is class size a predictor of later performance or does it simply reflect initial differences between the student populations of the classes?

H2. The gap between smaller and larger classes increases over time. One important reason for this is that students' initial performance level is higher in larger classes, which also have fewer students with support needs. At an individual level, higher initial level often predicts better later acquisition of the same skills (cf., Bast & Reitsma, 1997) and students with support needs tend to slowly fall behind (Vainikainen, 2014).

Q3: Does regulation of class size function as a means of support for students with identified support needs? That is, when only students receiving tier 2 and tier 3 support are studied, does their performance increase when class size decreases?

H3. For students receiving support (tier 2 and tier 3), smaller classes predict more positive performance as students receiving support benefit from studying in smaller classes.

2. Methods

2.1. Participants

This study utilized the data of a longitudinal learning to learn study conducted in a large municipality in Finland (see Vainikainen, Wüstenberg et al., 2015). The ethical consideration for the study was done at the Education Department of the municipality. The parents of the participating children gave their informed consent to the follow-up when responding to the parent questionnaires at grades one, four and six. The results of these questionnaires are not presented in this article. The analyses presented here cover the school grades four to six.

All students of 20 randomly selected schools within the municipality participated in the study at the beginning of the fourth grade in 2010 and at the end of sixth grade in 2013. Altogether 978 students attended these classes during that time, but only students providing data in both data collections were included in the analyses (N = 883, 52% girls). When the measure for class size was extracted from the original fourth grade student lists received from the Education Department of the city, it turned out that there were classes with only one fourth grader (n = 2) and 12 more classes with fewer than nine fourth graders. These classes were most likely mixed-grade special education classes and they were excluded from further analyses as their real class size was unknown. Thus, the final number of students in the further analyses was 869. The mean age of these students was 10.23 years (SD = 0.33) at the time of the fourth grade data collection and 12.81 years (SD = 0.33) at the time of the sixth grade data collection. Most of the students came from families with no immigrant background: 45 students had a home language other than the language of teaching. As only 10 of them received support, their results were not reported separately.

2.2. Measures

The data collections were conducted by class teachers according to written instructions. The students worked individually on booklets consisting of cognitive tasks and beliefs questionnaires, 4 × 45 min in the fourth grade, and one 90-minute session in the sixth grade. Only cognitive items that were identical for both age groups were used in this study. The number of items per task type and the reliabilities of the test for both age groups are presented in Table 1. The reliabilities were acceptable.

Table 1
Reliabilities of the tests for different grades (Cronbach's α).

Scale                              Number of items   α 4th grade   α 6th grade
Thinking skills test ᵃ             37                0.75          0.83
Arithmetical operations            4
Mental arithmetic                  5
Reasoning skills (bottles task)    8
Reading comprehension              20
Analogical reasoning               8                 0.77

ᵃ The test score was calculated as a weighted mean of the three domains so that reading comprehension, mathematical thinking and reasoning skills received equal weight.

Mathematical thinking skills were measured with nine common items. Five of these were from the Mental Arithmetic task, an adaptation of the Arithmetic subscale of the Wechsler Adult Intelligence Scale - Revised (WAIS-R: Wechsler, 1981), and four from the Arithmetical Operations task of Demetriou, Pachaury, Metallidou, and Kazi (1996). The items were first coded dichotomously as correct or incorrect and the maximum number of correct items was nine. A mean for the percentage of correctly solved items was calculated separately for the two task types, and they were then averaged once more to produce a score for mathematical thinking.

Reasoning skills were measured by the water-level task of Piaget and Inhelder (1956). The students were asked to draw, on a picture of eight empty bottles, the lines indicating the water level and to mark the area filled with water when the bottles were half full. The bottles were coded dichotomously as correct or incorrect and a mean for the percentage of correctly solved items was calculated.

Reading comprehension was assessed by two tasks. In the hierarchy-rating task, the students were asked to read a one-page text and to assess 16 statements as to whether they present a good description of the text as a whole, important information regarding the content of the text, or refer to less important details in the text (Lyytinen & Lehto, 1998). The other task was a shorter text with four multiple-response items for measuring students' ability to understand and interpret complex sentences. The items were coded dichotomously as correct or incorrect and a mean for the percentage of correctly solved items was calculated for all 20 items together.

The final thinking skills test scores for grades four and six were calculated by averaging once more the three task type mean scores from each grade. The assumption for the measures described so far is that they assess general cognitive competences that schooling enhances through subject-specific teaching (Adey et al., 2007). In addition to them, we used an adaptation of a Dutch geometric analogies test (Hosenfeld, van den Boom, & Resing, 1997), which the students completed in the fourth grade, to measure effects of more general cognitive competences. Students were presented with a pair of geometric figures and asked to choose a corresponding pair for another figure from among five answer options. Each analogy (8 items) was scored dichotomously as correct or incorrect and a mean for the percentage of correctly solved items was calculated.

Class size was extracted from the fourth grade student lists provided by the Education Department, including also students who did not provide any data in the assessment. Support was measured by asking class teachers at grade six whether the student received tier 2 intensified (n = 69) or tier 3 special support (n = 36). The information was not available for 77 students. As we used Maximum Likelihood Robust (MLR) estimation for fitting the models, we also used the data provided by these students where applicable.
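The scoring rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the input layout (one list of 0/1 item codes per task) are hypothetical and do not come from the study's own scoring scripts, and scores are expressed as proportions between 0 and 1, as in the descriptive statistics.

```python
from statistics import mean


def percent_correct(items):
    """Share of dichotomously coded items (1 = correct, 0 = incorrect) solved."""
    return sum(items) / len(items)


def thinking_skills_score(mental_arithmetic, arithmetical_operations,
                          bottles, reading):
    """Equal-weight composite of the three domains (hypothetical helper).

    Mathematical thinking is the mean of the two task-level scores,
    reasoning is the bottles task score, and reading comprehension is
    scored over all reading items together.
    """
    mathematical = mean([percent_correct(mental_arithmetic),
                         percent_correct(arithmetical_operations)])
    reasoning = percent_correct(bottles)
    reading_score = percent_correct(reading)
    return mean([mathematical, reasoning, reading_score])


# Made-up responses for one student (5 + 4 + 8 + 20 = 37 items).
example = thinking_skills_score(
    mental_arithmetic=[1, 1, 0, 1, 0],
    arithmetical_operations=[1, 0, 1, 1],
    bottles=[1, 1, 1, 1, 0, 0, 0, 0],
    reading=[1] * 10 + [0] * 10,
)
```

Averaging the two mathematical tasks first, rather than pooling all nine items, gives each task the same weight even though they contain different numbers of items, which is how the text describes the procedure.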
For testing the hypotheses, we specified a two-level model with cross-level interaction effects. At the individual level, we predicted the sixth grade test score by the fourth grade test score and the fourth grade analogical reasoning. In addition, we added categorical variables for tier 2 support (0 = no support, 1 = tier 2 support) and for tier 3 support (0 = no support, 1 = tier 3 support) as predictors. At the class level, we used the average fourth grade performance level, class size and the proportion of students receiving tier 2 or tier 3 support (mean for all tier 2 and tier 3 students) in a class as predictors. Additionally, we calculated cross-level interaction effects for class size and tier 2 and tier 3 support.
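As a reading aid, the two-level model described above can be written in conventional multilevel regression notation. This is a simplified sketch, not the authors' Mplus specification: all symbols are assumptions introduced here, and a multilevel structural equation model decomposes the fourth grade score into within- and between-class components rather than entering an observed class mean. Here $y^{(6)}_{ij}$ and $y^{(4)}_{ij}$ are the sixth and fourth grade test scores of student $i$ in class $j$, $a_{ij}$ is fourth grade analogical reasoning, $T2_{ij}$ and $T3_{ij}$ are the support dummies, $n_j$ is class size and $p_j$ is the proportion of students receiving support.

```latex
\begin{align}
y^{(6)}_{ij} &= \beta_{0j} + \beta_{1}\,y^{(4)}_{ij} + \beta_{2}\,a_{ij}
              + \beta_{3j}\,T2_{ij} + \beta_{4j}\,T3_{ij} + e_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\bar{y}^{(4)}_{j} + \gamma_{02}\,n_{j}
              + \gamma_{03}\,p_{j} + u_{0j} \\
\beta_{3j} &= \gamma_{30} + \gamma_{31}\,n_{j} + u_{3j} \\
\beta_{4j} &= \gamma_{40} + \gamma_{41}\,n_{j} + u_{4j}
\end{align}
```

In this notation the cross-level interactions are $\gamma_{31}$ and $\gamma_{41}$: they describe how the effect of receiving tier 2 or tier 3 support on the sixth grade score changes with class size.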

Table 2
Descriptive statistics for the variables used in modeling.

Variable                                          N     Min    Max    M       Sd
Test score sixth grade                            869   0.00   1.00   0.60    0.18
Within level predictors
Test score fourth grade                           869   0.00   1.00   0.43    0.18
Analogical reasoning fourth grade                 831   0.00   1.00   0.50    0.31
Tier 2 support                                    798   0.00   1.00   0.09    0.28
Tier 3 support                                    798   0.00   1.00   0.05    0.21
Between level predictors
Test score fourth grade                           45    0.27   0.60   0.43    0.08
Class size                                        45    9      30     21.77   4.53
Proportion of students at tier 2 and tier 3 (%)   45    0.00   1.00   0.13    0.14

N = Number of responses, Min = minimum value, Max = maximum value, M = Mean, Sd = Standard deviation
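The between-level predictors in Table 2 are class-level summaries of student-level information. A minimal sketch of such an aggregation is shown below. The record layout, the field names (`class_id`, `class_size`, `score_g4`, `tier`) and the handling of missing values are assumptions made for illustration only; in particular, computing the proportion among students whose support status is known is one possible choice, not necessarily the one used in the study. Class size is taken from a supplied field because the text states that it came from the student lists rather than from the number of test takers.

```python
from collections import defaultdict
from statistics import mean


def class_level_predictors(students):
    """Aggregate student records (dicts) into one summary per class."""
    by_class = defaultdict(list)
    for student in students:
        by_class[student["class_id"]].append(student)

    summaries = {}
    for class_id, members in by_class.items():
        # Fourth grade scores that are actually available.
        scores = [m["score_g4"] for m in members if m["score_g4"] is not None]
        # Students whose support status was reported (tier 0 = no support).
        known = [m for m in members if m["tier"] is not None]
        supported = sum(1 for m in known if m["tier"] in (2, 3))
        summaries[class_id] = {
            "mean_score_g4": mean(scores) if scores else None,
            "class_size": members[0]["class_size"],
            "prop_support": supported / len(known) if known else None,
        }
    return summaries


# Made-up records for two small classes.
records = [
    {"class_id": "A", "class_size": 20, "score_g4": 0.4, "tier": 0},
    {"class_id": "A", "class_size": 20, "score_g4": 0.6, "tier": 2},
    {"class_id": "B", "class_size": 12, "score_g4": 0.5, "tier": None},
    {"class_id": "B", "class_size": 12, "score_g4": 0.3, "tier": 3},
]
summary = class_level_predictors(records)
```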

2.3. Statistical methods

Descriptive statistics were calculated and multilevel mixed models performed with SPSS 24. Multilevel structural equation modeling was applied in Mplus 7.2 (Muthén & Muthén, 2012) using the MLR estimation without imputation of missing values. The deviation from normality of all variables was within the recommended limits. The models were considered to have a good fit with Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) > 0.95, and Root Mean Square Error of Approximation (RMSEA) < 0.06 (Kline, 2005).

3. Results

Descriptive statistics were first calculated for all variables used in modeling. The results for the whole sample are presented in Table 2. The correlations of the variables used in multilevel modeling are presented in Appendices A and B. The data used in the analyses contained 45 classes. In 24 classes there was at least one student receiving tier 2 support and in 21 classes there was at least one student receiving tier 3 support. There was only one class, with 9 students, in our data that consisted only of students receiving support. The average proportion of students receiving support in a class was 0.13 (SD = 0.14). The number of students receiving support in classes of different sizes is presented in Appendix C.

3.1. The first hypothesis

In the first hypothesis we assumed that the positive correlation of class size and performance could already be seen in the fourth grade and that students receiving support studied in classes that on average were smaller. Descriptive statistics showed that students with no support studied in classes with on average 21.91 (SD = 4.02) students, whereas the average class size of those receiving tier 2 support was 21.49 (SD = 3.27) and tier 3 support 17.97 (SD = 4.17).

To test the first hypothesis, a two-level structural equation model (later referred to as Model 1) was specified, in which the variance of the sixth grade test score was divided into individual and class levels. The intra-class correlation of 0.17 meant that 17% of the variance was situated at the class level in the null model. To test the first hypothesis further, individual- and class-level variables were added to the model one by one. First, the initial individual differences were controlled for by adding the fourth grade test score. After this, the share of explained variance by the individual level variables was 7.5%. The fourth grade test score was a relatively strong predictor of later performance (β = 0.57, p < 0.001). Next, the sixth grade class level performance was predicted by the average fourth grade performance level of the class, which predicted sixth grade class level performance relatively well (β = 0.55, p < 0.001), explaining 30% of the class level variance. As a next step, the fourth grade analogical reasoning was added to the model to control for the individual differences in addition to the fourth grade test score, and it also predicted the sixth grade test score (β = 0.27, p < 0.001). Together they explained 37% of the individual level sixth grade test score variance.

The fourth grade test performance and the analogical reasoning correlated relatively strongly (r = 0.46, p < 0.001) and the direct effect of the fourth grade test performance on later performance slightly decreased (β = 0.39, p < 0.001). Next, we added received support at tier 2 and tier 3 level separately in the model as dummy-coded individual level predictors (0 = no support). We studied the relations of the predictors by examining their partial correlations. Tier 2 support correlated negatively with the fourth grade test score (r = −0.17, p < 0.001) and with the fourth grade analogical reasoning (r = −0.12, p < 0.001). Also, tier 3 support correlated negatively with the fourth grade test score (r = −0.14, p < 0.001) and with the fourth grade analogical reasoning (r = −0.14, p < 0.001). With the support variables, the model explained 42% of the individual level variance.

At the class level, we predicted the sixth grade test score with class size and the proportion of students receiving support in the class. As expected, class size correlated positively with the fourth grade test score (r = 0.36, p < 0.005). The proportion of students receiving support correlated negatively with the class size (r = −0.40, p < 0.005) and with the fourth grade test score (r = −0.40, p = 0.055), though the latter was not statistically significant. The model, presented in Fig. 1, explained 42% of the individual level and 34% of the class level variance.
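The intra-class correlation reported for the null model in Section 3.1 is the share of total variance located between classes. The sketch below shows that definition and, as a stand-in technique, the classical one-way ANOVA estimator for grouped scores. Neither function reproduces the Mplus estimation used in the study; the function names are hypothetical and the example data are made up.

```python
from statistics import mean


def icc_from_components(between_var, within_var):
    """ICC as the between-class share of total variance."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC for lists of scores per class.

    Handles unequal class sizes through the usual effective group size n0.
    Requires at least two classes and more observations than classes.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# With 17% of the variance between classes, the ICC is 0.17.
share = icc_from_components(between_var=0.17, within_var=0.83)

# Two made-up classes with no within-class variation give an ICC of 1.
perfect = icc_oneway([[1.0, 1.0], [3.0, 3.0]])
```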


Fig. 1. Model 1 for testing the Hypotheses 1 and 2.

classes would be helpful for them. Therefore, at the final stage of the modeling we wanted to study cross-level interactions of tier 2 and tier 3 level support and class size. For calculating the interaction effects, we first ran a two-level random model using Montecarlo integration in Mplus. However, the model did not converge and we could not get reliable estimates for the interactions. A mixed model in SPSS produced estimates, but they were very close to 0 (β = −0.01, p < 0.05 for Tier 2 support ∗ Class size; β = −0.002, p = 0.465 for Tier 3 support ∗ Class size). Thus, there was no interaction between class size and tier 3 support and a very weak interaction of class size with tier 2 support. Even though the latter gave some indication on that students receiving tier 2 support could benefit from smaller classes, the instability of the result and the extremely weak estimate made us conclude that the third hypothesis was not supported.

It was concluded that H1 was supported regarding the relationship between the test score and class size, whereas the use of class size as a means of support was supported only partially: only students at tier 3 level studied in clearly smaller classes.

3.2. The second hypothesis

In the second hypothesis it was assumed that the gap between smaller and larger classes increases over time, as a higher initial level of performance often predicts better later acquisition of the same skills and students with support needs tend to slowly fall behind in development. To test this hypothesis, we studied the regression coefficients of the model described in Section 3.1. Both tier 2 support (β = −0.12, p < 0.001) and tier 3 support (β = −0.20, p < 0.001) negatively predicted sixth grade test performance. To have sufficient degrees of freedom for calculating the fit indices, we fixed the unstandardized individual level regression coefficients for the fourth grade test score to 0.37 and for the fourth grade analogical reasoning to 0.13, as they had remained stable at all stages when new variables were added to the model. The constrained model fitted the data well (CFI = 1.000, TLI = 1.019, RMSEA = 0.06, χ2 = 0.092, df = 2, p = 0.955). Class level results showed that class size did not predict the sixth grade test score statistically significantly. Instead, the proportion of students receiving support was a positive predictor of the sixth grade class level performance (β = 0.32, p < 0.05). However, when class size was added to the model without any other class-level variables, it predicted the sixth grade test score statistically significantly (β = 0.44, p < 0.05). It was concluded that H2 was not fully supported: receiving support was indeed a negative predictor of later performance at the individual level, but the proportion of students receiving support in the class was a positive predictor of the class level results. Moreover, contrary to expectations, class size did not predict more positive class-level development statistically significantly after initial differences in test performance were controlled for.
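The two class-level predictors discussed here, class size and the proportion of students receiving tier 2 or tier 3 support, can be derived by aggregating student-level records. The sketch below shows one way to do this; the record layout, the class labels and the function name `class_level_predictors` are hypothetical, and the toy data are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical student records: (class_id, support_tier).
# Tier 1 is taken here to mean general support only (an assumption).
students = [
    ("A", 1), ("A", 1), ("A", 2), ("A", 3),
    ("B", 1), ("B", 1), ("B", 1), ("B", 1), ("B", 2),
]


def class_level_predictors(records):
    """Return {class_id: (class_size, proportion at tier 2 or tier 3)}."""
    sizes = defaultdict(int)
    supported = defaultdict(int)
    for class_id, tier in records:
        sizes[class_id] += 1
        if tier in (2, 3):
            supported[class_id] += 1
    return {c: (sizes[c], supported[c] / sizes[c]) for c in sizes}


predictors = class_level_predictors(students)
for class_id, (size, proportion) in sorted(predictors.items()):
    print(class_id, size, round(proportion, 2))
```

Aggregating in this way keeps the individual-level support indicators and the class-level proportion as separate variables, which is what allows them to point in opposite directions, as they do in the results above.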

4. Discussion

The present study tested the assumption that class size would positively predict primary school students' thinking skills because of the practice of placing students receiving support in smaller classes. Moreover, the effectiveness of this practice was evaluated by analyzing whether students receiving support really benefit from studying in smaller classes. The hypotheses were tested by fitting a multilevel structural equation model to three-year follow-up data from 10- to 12-year-old students (N = 869), looking at the development of performance of students receiving tier 2 and tier 3 support within the relatively new Finnish three-tiered support model (see Thuneberg et al., 2013). The first hypothesis to be tested was that already at the beginning of the fourth grade, there would be class size-based differences in performance and that students receiving support would study in smaller classes. Modeling the data confirmed that the first part of the assumption was correct: the positive correlation of class size and performance was statistically significant. The second part of the hypothesis was supported only partly: students receiving tier 3 special support studied in classes with on average fewer than 18 students, whereas students receiving tier 2 support and the other students without support typically studied in classes with almost 22 students. The results most likely reflect two phenomena typical for Finnish schools at least in the most urban areas: there are classes with a special emphasis (Kosunen, 2016; Varjo, Kalalahti, & Silvennoinen, 2014) that tend to be larger than average classes, have fewer students receiving support and more

3.3. The third hypothesis

In the last hypothesis it was assumed that regardless of the general tendency of class size to be positively related to performance and its development, at least in Finland, for students receiving support the prediction would go in the opposite direction and studying in smaller


performance level the effect of class size was non-existent. Thus, it confirmed the purposeful sorting of students into classrooms. Of course, learning outcomes are not the only reason for reducing class size; factors like working climate and classroom management should also be taken into consideration. The number of students in the classroom can be seen as one contextual influence on classroom life and, for example, the potential for distractions and off-task behavior can be lower in smaller classes (e.g., Blatchford, Basset, & Brown, 2011). In all, class size as such is likely to influence performance only through mediating effects, and other variables such as the forms of support given to students can be seen as at least as effective as the size of a class. The final stage of the present study was to evaluate whether the use of class size as a means of support was an effective practice from the perspective of the development of performance. Model 1 showed that class size as such hardly predicts more beneficial development of performance and that the phenomenon is explained by the uneven distribution of students at different achievement levels across classes of different sizes. That is, after we controlled for the initial competences, the effect of class size was not statistically significant. In addition, classes with higher proportions of students receiving support performed lower in the fourth grade, though not quite statistically significantly. However, as already discussed above, the proportion of students receiving support in class predicted higher performance in the sixth grade. This might indicate that the placement of students across classes within schools is somewhat purposeful. A higher proportion of students receiving support in a class does not indicate that the average development of the class would be slower. Finally, we looked at the cross-level interaction of class size separately for students at tier 2 and at tier 3 level.
With students at tier 3 level, the class size effect on development was nonexistent even though the lower their initial performance was, the smaller the classes they studied in. Of course, this makes the interpretation of the results difficult; first of all, it points to a functioning system: at least on average in this system, no disadvantage is caused to the students at tier 3 support level by not placing them in small classes. Therefore, further research should use propensity scores to examine how the performance of very similar students receiving special support develops in classes of different sizes. The results for students receiving tier 2 support revealed a possible trend that urgently needs more research with larger samples, as it may mean that reduced class size works as a means of support regarding the development of performance. The interaction between class size and tier 2 support was slightly negative, which could indicate that students with milder support needs benefit from smaller classes over time. However, the effect was extremely weak and we could not replicate it with our own data when using another statistical package. Thus, it can only be concluded that more research is clearly needed. The most notable limitations of the study are related to the sample. Even though the original sample consisted of almost 900 students who were followed for several years with numerous measures that make the design of the study relatively unique, there were still too few students receiving support to analyze the group differences properly. We had to exclude classes with fewer than 9 students from the analyses, which excluded especially some students at tier 3 level and decreased their proportion in the data. There were only 69 students receiving tier 2 intensified support and 36 students receiving tier 3 special support, and these groups were simply too small to yield stronger statistical effects.
Furthermore, the data included only 45 classes, which restricted the study of cross-level interactions. However, the results create a base for further research with larger samples, and there is already another longitudinal study underway in Finland with a total sample size of about 10,000 students. Another limitation of this study was that it was not possible to differentiate between students studying full-time and those studying part-time in regular classes. In addition, the current data do not offer any information about the pedagogical staff present in classes.

students with a relatively high initial level of performance (Kupiainen & Hienonen, 2016). On the other hand, class size is used as one means of support already in the early stages of the three-tiered support model, which means that class size is actively regulated to allow teachers more time to pay attention to students with support needs. These fourth grade cross-sectional results are not quite in line with international literature on class size, as very few studies have been conducted in settings in which class size and support needs have been clearly related. Nevertheless, some studies have indicated that the performance of minority or low-income students has been enhanced by placement in smaller classes (Blatchford, Goldstein, Martin, & Browne, 2002; Finn & Achilles, 1990; Krueger, 1999). There is evidence that students in smaller classes interact more with the teacher (Blatchford, Edmonds et al., 2003) and that individual instruction increases more in classes identified by the teacher as “below average” (Betts & Shkolnik, 1999). This extra attention may be extremely important for students who are falling behind, but may have only a slight impact on the overall average achievement of other students in the class (Ehrenberg, Brewer, Gamoran, & Willms, 2001). Indeed, the class-level results of our study indicate that the higher the proportion of students receiving support is, the smaller the class is, and this seems to have a positive effect on class level performance. However, at the same time, receiving support was related to more negative development at the individual level. There are also studies that have found no evidence that class size reduction is more efficacious for minority students (Hoxby, 2000; Konstantopoulos, 2007). In the light of our findings, it seems that small classes are at least expected to provide more attention for students receiving tier 3 support, as these students are assigned to slightly smaller classes.
After finding that the fourth grade results followed exactly the pattern that we expected based on earlier cross-sectional Finnish studies, we next tested whether the same pattern applied to the development of thinking skills as well. That is, it was assumed that a larger class would predict faster improvement of test performance due to Matthew effects (cf. Bast & Reitsma, 1997): the differences between high and low performers increase over time. Contrary to expectations, class size did not predict performance in the sixth grade when the initial differences in test performance and general cognitive competences were controlled for. Receiving support at both tier 2 and tier 3 level, as expected, predicted lower later test performance. This result is in accord with Vainikainen's (2014) earlier results showing that the differences between students receiving support and others begin to increase towards the end of the primary stage of the Finnish comprehensive education system. At this age, the factors related to students' background also generally have a greater impact on the development of academic achievement and differences among students begin to increase (e.g., Caro, McDonald, & Willms, 2009). This growth most likely depends not only on children's cognitive development, but also on the changes in students' self-representations as they decrease from exceedingly positive to more realistic (Demetriou & Kazi, 2006). As a disproportionately large share of students receiving tier 3 special support have both low motivation and low achievement (Thuneberg, 2007), there is clearly room for interventions to prevent or slow these tendencies. Based on the present study and the results from earlier cross-sectional studies, reduced class size is used as one means of support in Finland at least for tier 3 level students.
It must be mentioned, though, that because the focus in this study was on regular classes and special education classes were excluded from the analyses, the proportion of students receiving tier 3 support is lower than the national average (7.3% of all comprehensive school students, Official Statistics of Finland). When class size was added to the model without other class-level variables, it predicted later test performance positively and statistically significantly. However, after controlling for the initial


students at tier 2 level (Kupiainen & Hienonen, 2016; Lintuvuori, 2015). Either way, assigning students with support needs to smaller classes can be one way to manage the increasing student heterogeneity in regular classrooms. Still, the question remains whether reduced class size enhances the performance and the development of performance of low achieving students. While acknowledging the complexity of investigating class size effects, this study gave some indication that reduced class size might be beneficial for students receiving tier 2 intensified support, though further research with larger data sets is needed. However, does this mean that class sizes for other students could be increased with little sacrifice to offset the costs of smaller classes (see Rice, 1999)? On the other hand, if the additional attention and support for students with support needs is provided in a regular class, it could benefit the whole class. As the results of this study showed, the proportion of students receiving support in class positively predicted later performance. A next step would be to take a closer look at class composition and include different compositional features in the analyses with larger data sets. Thus, it could be determined what kind of class composition would best enhance the progress of students receiving support.

Furthermore, the instruction and support methods can differ between classes. Studying those effects would require different data and methods and thus goes beyond this study.

5. Conclusions

One explanation for the quality of the Finnish education system is the extensive learning and schooling support system (Malinen, Väisänen, & Savolainen, 2012; Sabel et al., 2011). The aim is to teach students receiving support “in conjunction with other instruction” (Basic Education Act 628/1998), meaning that they are primarily taught in regular classrooms. The results of this study confirm that students are not randomly assigned to classrooms of different sizes within schools. It seems that schools use their autonomy when distributing students to classes and that class size can be used as one means of support for students in need of support. Nevertheless, according to our data, it seems that only students receiving tier 3 level support are placed in clearly smaller classes, whereas students receiving tier 2 support study in classes of average size. However, there are indications that some schools form smaller classes containing only

Appendix A. Pearson correlation coefficients for individual level variables

Variable                                 1         2         3         4         5
1. Test score sixth grade                1
2. Test score fourth grade               0.57⁎⁎    1
3. Analogical reasoning fourth grade     0.48⁎⁎    0.45⁎⁎    1
4. Tier 2 support                        −0.17⁎⁎   −0.17⁎⁎   −0.11⁎⁎   1
5. Tier 3 support                        −0.28⁎⁎   −0.15⁎⁎   −0.14⁎⁎   −0.07 ns  1

ns not significant. ⁎⁎ p < 0.01.

Appendix B. Pearson correlation coefficients for class level variables

Variable                                          1         2         3
1. Test score fourth grade                        1
2. Class size                                     0.36⁎⁎    1
3. Proportion of students at tier 2 and tier 3    −0.19 ns  −0.14 ns  1

ns not significant. ⁎⁎ p < 0.01.

Appendix C. Students receiving support in classes

Number of tier 2 students   Number of classes   Number of tier 3 students   Number of classes
0                           17                  0                           20
1                           9                   1                           16
2                           7                   2                           3
3                           2                   3                           2
4                           3                   4                           –
5                           1                   5                           –
6                           1                   6                           –
7                           1                   7                           –
8                           –                   8                           1
9                           –                   9                           –
10                          1                   10                          –

– no classes.
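As a quick consistency check, the class counts in Appendix C can be multiplied out to see whether they reproduce the group sizes reported in the article (69 students at tier 2 and 36 at tier 3). The sketch below assumes that the two isolated single-class entries in the extracted table belong to the rows for 10 tier 2 students and 8 tier 3 students; the extraction does not preserve row positions, so that placement is an assumption, as are the variable and function names.

```python
# Keys: number of supported students in a class; values: number of such classes.
# The entries for 10 (tier 2) and 8 (tier 3) are assumed row positions.
tier2_classes = {0: 17, 1: 9, 2: 7, 3: 2, 4: 3, 5: 1, 6: 1, 7: 1, 10: 1}
tier3_classes = {0: 20, 1: 16, 2: 3, 3: 2, 8: 1}


def total_students(classes_by_count):
    """Sum students over classes: students per class times number of classes."""
    return sum(n * c for n, c in classes_by_count.items())


def total_classes(classes_by_count):
    """Count all classes listed in one column pair of the table."""
    return sum(classes_by_count.values())


print("tier 2 students:", total_students(tier2_classes))
print("tier 3 students:", total_students(tier3_classes))
print("classes listed:", total_classes(tier2_classes), total_classes(tier3_classes))
```

Under this placement the totals match the reported group sizes of 69 and 36. Each half of the table then lists 42 classes.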


References

Adey, P., & Csapó, B. (2011). Developing and assessing scientific reasoning. In B. Csapó, & G. Szabó (Eds.), Framework for diagnostic assessment of science (pp. 17–54). Budapest: Nemzeti Tankönyvkiadó.
Adey, P., Csapó, B., Demetriou, A., Hautamäki, J., & Shayer, M. (2007). Can we be intelligent about intelligence? Why education needs the concept of plastic general ability. Educational Research Review, 2, 75–97. http://dx.doi.org/10.1016/j.edurev.2007.05.001.
Akerhielm, K. (1995). Does class size matter? Economics of Education Review, 14, 229–241.
Basic Education Act 628/1998 (1998). Amendments up to 1136/2010. Government of Finland. Retrieved February 15, 2016 from http://www.finlex.fi/en/laki/kaannokset/1998/en19980628.pdf.
Basic Education Decree 852/1998 (1998). Amendments up to 966/2016. Government of Finland. Retrieved February 15, 2016 from http://www.finlex.fi/fi/laki/ajantasa/1998/19980852.
Bast, K., & Reitsma, P. (1997). Matthew effects in reading: A comparison of latent growth curve models and simplex model with structured means. Multivariate Behavioural Research, 32(2), 135–167.
Betts, J. R., & Shkolnik, J. L. (1999). The behavioral effects of variations in class size: The case of math teachers. Educational Evaluation and Policy Analysis, 21(2), 193–213.
Björn, P., Aro, M. T., Koponen, T. K., Fuchs, L. S., & Fuchs, D. H. (2015). The many faces of special education within RTI frameworks in the United States and Finland. Learning Disability Quarterly, 39(1), 58–66. http://dx.doi.org/10.1177/073194871559478.
Blatchford, P., Basset, P., & Brown, P. (2011). Examining the effect of class size on classroom engagement and teacher-pupil interaction: Differences in relation to pupil prior attainment and primary vs. secondary schools. Learning and Instruction, 21, 715–730. http://dx.doi.org/10.1016/j.learninstruc.2011.04.001.
Blatchford, P., Basset, P., Goldstein, H., & Martin, C. (2003). Are class size differences related to pupils' educational progress and classroom processes? Findings from the institute of education class size study of children aged 5–7 years. British Educational Research Journal, 29, 709–730. http://dx.doi.org/10.1080/0141192032000133668.
Blatchford, P., Edmonds, S., & Martin, C. (2003). Class size, pupil attentiveness and peer relations. British Journal of Educational Psychology, 73, 15–36.
Blatchford, P., Goldstein, H., Martin, C., & Browne, W. (2002). A study of class size effects in English school reception year classes. British Educational Research Journal, 28(2), 169–185. http://dx.doi.org/10.1080/01411920120122130.
Blatchford, P., & Martin, C. (1998). The effects of class size on classroom processes: ‘It's a bit like a treadmill—Working hard and getting nowhere fast!’. British Journal of Educational Studies, 46(2), 118–137.
Caro, D. H., McDonald, J. T., & Willms, J. D. (2009). Socio-economic status and academic achievement trajectories from childhood to adolescence. Canadian Journal of Education, 32(3), 558–590.
Demetriou, A., & Kazi, S. (2006). Self-awareness in g (with processing efficiency and reasoning). Intelligence, 34, 297–317. http://dx.doi.org/10.1016/j.intell.2005.10.002.
Demetriou, A., Pachaury, A., Metallidou, Y., & Kazi, S. (1996). Universals and specificities in the structure and development of quantitative-relational thought: A cross-cultural study in Greece and India. International Journal of Behavioural Development, 19(2), 255–290. http://dx.doi.org/10.1080/016502596385785.
Demetriou, A., Spanoudis, G., & Mouyi, A. (2011). Educating the developing mind: Towards an overarching paradigm. Educational Psychology Review, 23(4), 601–663. http://dx.doi.org/10.1007/s10648-011-9178-3.
Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., ... Japel, C. (2007). School readiness and later achievement. Developmental Psychology, 43(6), 1428–1446. http://dx.doi.org/10.1037/0012-1649.43.6.1428.
Ehrenberg, R. G., Brewer, D. J., Gamoran, A., & Willms, J. D. (2001). Class size and student achievement. Psychological Science in the Public Interest, 2(1), 1–30.
Finn, J. D., & Achilles, C. M. (1990). Answers and questions about class size: A statewide experiment. American Educational Research Journal, 27(3), 557–577.
Graham, L. J., & Jahnukainen, M. (2011). Where art thou, inclusion? Analysing the development of inclusive education in New South Wales, Alberta and Finland. Journal of Education Policy, 26(2), 263–288. http://dx.doi.org/10.1080/02680939.2010.493230.
Hargreaves, L., Galton, M., & Pell, A. (1998). The effects of changes in class size on teacher-pupil interactions. International Journal of Educational Research, 29, 779–795.
Hattie, J. (2005). The paradox of reducing class size and improving learning outcomes. International Journal of Educational Research, 43, 387–425.
Hautamäki, J., Arinen, P., Eronen, S., Hautamäki, A., Kupiainen, S., Lindblom, B., ... Scheinin, P. (2002). Assessing Learning-to-learn: A Framework. Evaluation 4/2002. Helsinki: National Board of Education.
Hosenfeld, B., van den Boom, D. C., & Resing, W. C. M. (1997). Constructing geometric analogies test for the longitudinal testing of elementary school children. Journal of Educational Measurement, 34(4), 367–372.
Hotulainen, R., Mononen, R., & Aunio, P. (2016). Thinking skills intervention for low-achieving first graders. European Journal of Special Needs Education, 31(3), 360–375. http://dx.doi.org/10.1080/08856257.2016.1141541.
Hoxby, C. M. (2000). The effects of class size on student achievement: New evidence from population variation. The Quarterly Journal of Economics, 115, 1239–1285.
Jahnukainen, M., & Itkonen, T. (2015). Tiered intervention: History and trends in Finland and the United States. European Journal of Special Needs Education, 31(1), 140–150. http://dx.doi.org/10.1080/08856257.2015.1108042.
Kalalahti, M., & Varjo, J. (2016). Lähikoulupolut ja painotet valinnat [Local school paths and emphasized choices]. In H. Silvennoinen, M. Kalalahti, & J. Varjo (Eds.), Koulutuksen tasa-arvon muuttuvat merkitykset [Varying meanings of educational equity] (pp. 37–69). Jyväskylä: Finnish Educational Research Association.
Karjalainen, T., & Lamberg, K. (2014). Esi-ja perusopetuksen opetusryhmät 2013 [Teaching groups in preschool and basic education]. In T. Kumpulainen (Ed.), Opettajat Suomessa 2013 (pp. 41–52). [Teachers in Finland 2013]. Helsinki: National Board of Education.
Kline, R. B. (2005). Principles and practice of structural equation modeling (2nd ed.). New York: The Guilford Press.
Konstantopoulos, S. (2007). Do small classes reduce the achievement gap between low and high achievers? Evidence from project STAR (Discussion paper no. 2904). The Institute for the Study of Labor.
Konstantopoulos, S., & Sun, M. (2013). Are teacher effects larger in small classes? School Effectiveness and School Improvement: An International Journal of Research, Policy and Practice, 25, 1–17. http://dx.doi.org/10.1080/09243453.2013.808233.
Konstantopoulos, S., & Traynor, A. (2014). Class size effects on reading achievement using PIRLS data: Evidence from Greece. Teachers College Record, 116(2).
Kosunen, S. (2016). Families and the social space of school choice in urban Finland. Studies in educational sciences 267. Helsinki: University of Helsinki.
Krueger, A. (1999). Experimental estimates of education production functions. Quarterly Journal of Economics, 114, 497–532.
Krueger, A. B., & Whitmore, D. M. (2001). The effect of attending a small class in the early grades on college-test taking and middle school test results: Evidence from project STAR. The Economic Journal, 111, 1–28.
Kupiainen, S., & Hienonen, N. (2016). Luokkakoko [Class size]. Research in education and science 72. Jyväskylä: Finnish Educational Research Association.
Lintuvuori, M. (2015). Oppimisen ja koulunkäynnin tuen järjestäminen virallisen tilastotiedon ja empiirisen tutkimusaineiston kuvaamana [Describing the support to learning and school attendance through official statistics and empirical data]. In M. Jahnukainen, E. Kontu, H. Thuneberg, & M.-P. Vainikainen (Eds.), [From special education to the support to learning and school attendance]. Research in Educational Sciences 67 (pp. 43–76). Turku: Finnish Educational Research Association.
Lyytinen, S., & Lehto, J. E. (1998). Hierarchy rating as a measure of text macroprocessing: Relationship with working memory and school achievement. Educational Psychology, 18(2), 15–169.
Malinen, O.-P., Väisänen, P., & Savolainen, H. (2012). Teacher education in Finland: A review of a national effort for preparing teachers for the future. Curriculum Journal, 23(4), 567–584. http://dx.doi.org/10.1080/09585176.2012.731011.
Ministry of Culture and Education (2014). Opetusryhmien tila Suomessa. Selvitys eduskunnan sivistysvaliokunnalle esi-ja perusopetuksen opetusryhmien nykytilasta [Teaching groups in Finland. The present state of the teaching groups in preschool and basic education, report for the parliaments Committee for Education and Culture]. Opetus-ja kulttuuriministeriön julkaisuja 2014:2. Helsinki: Ministry of Culture and Education.
Molnar, A., Smith, P., Zahorik, J., Palmer, A., Halbach, A., & Ehrle, K. (1999). Evaluating the SAGE program: A pilot program in targeted pupil-teacher reduction in Wisconsin. Educational Evaluation and Policy Analysis, 21(2), 165–177.
Molnár, G. (2011). Playful fostering of 6- to 8-year-old students' inductive reasoning. Thinking Skills & Creativity, 6(2), 91–99. http://dx.doi.org/10.1016/j.tsc.2011.05.002.
Moseley, D., Elliott, J., Gregson, M., & Higgins, S. (2005). Thinking skills frameworks for use in education and training. British Educational Research Journal, 31(3), 367–390. http://dx.doi.org/10.1080/01411920500082219.
Muthén, L. K., & Muthén, B. O. (2012). Mplus user's guide version 7.
National Board of Education (1999). A Framework for evaluating educational outcomes in Finland. Evaluation 8/1999. Helsinki: National Board of Education.
National Board of Education (2016). National Core Curriculum for basic education. Helsinki: National Board of Education.
OECD (2013). PISA 2012 results: What students know and can do – Student performance in mathematics, reading and science. Volume I. OECD Publishing. http://dx.doi.org/10.1787/9789264201118-en.
OECD (2016). Low-performing students: Why they fall behind and how to help them succeed. OECD Publishing. http://dx.doi.org/10.1787/9789264250246-en.
Pedder, D. (2006). Are small classes better? Understanding relationship between class size, classroom processes and pupils' learning. Oxford Review of Education, 32, 213–234.
Piaget, J., & Inhelder, B. (1956). The child's conception of space (Translated from French by F. J. Langdon and J. L. Lunzer). London: Routledge & Kegan Paul.
Recommendation 2006/962/EC of the European Parliament and of the Council of 18 December (2006). on key competences for lifelong learning. Official Journal L 394 of 30.12.2006.
Resnick, L. (1987). Education and learning to think. Washington, DC: National Academy Press.
Rice, J. K. (1999). The impact of class size on instructional strategies and the use of time in high school mathematics and science courses. Educational Evaluation and Policy Analysis, 21(2), 215–229.
Sabel, C., Saxenian, A., Miettinen, R., Kristensen, P. H., & Hautamäki, J. (2011). Individualized service provision in the new welfare state: Lessons from special education in Finland. Sitra Studies 62. Helsinki: Sitra.
Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21(4), 360–407.
Thuneberg, H. (2007). Is a majority enough? Psychological well-being and its relation to academic and prosocial motivation, self-regulation and achievement at school (Research report 281). Helsinki: University of Helsinki.
Thuneberg, H., Hautamäki, J., Ahtiainen, R., Lintuvuori, M., Vainikainen, M.-P., & Hilasvuori, T. (2014). Conceptual change in adopting the nationwide special education strategy in Finland. Journal of Educational Change, 51, 37–56. http://dx.doi.org/10.1007/s10833-013-9213-x.
Thuneberg, H., Vainikainen, M.-P., Ahtiainen, R., Lintuvuori, M., Salo, K., & Hautamäki, J. (2013). Education is special for all: The Finnish support model. Gemeinsam leben. 2. Gemeinsam leben (pp. 67–78).
Vainikainen, M.-P. (2014). Finnish primary school pupils' performance in learning to learn assessments: A longitudinal perspective on the educational equity. Research report 360. Helsinki: University of Helsinki.
Vainikainen, M. P., Hautamäki, J., Hotulainen, R., & Kupiainen, S. (2015). General and specific thinking skills and schooling: Preparing the mind to new learning. Thinking Skills and Creativity, 18, 53–64. http://dx.doi.org/10.1016/j.tsc.2015.04.006.
Vainikainen, M.-P., Thuneberg, H., Greiff, S., & Hautamäki, J. (2015). Multiprofessional collaboration in Finnish schools. International Journal of Educational Research, 72, 137–148. http://dx.doi.org/10.1016/j.ijer.2015.06.007.
Vainikainen, M.-P., Wüstenberg, S., Kupiainen, S., Hotulainen, R., & Hautamäki, J. (2015). Development of learning to learn in primary school. International Journal of Lifelong Education, 34(4), 376–392. http://dx.doi.org/10.1080/02601370.2015.1060025.
Varjo, J., Kalalahti, M., & Silvennoinen, H. (2014). Families, school choice, and democratic iterations on the right to education and freedom of education in Finnish municipalities. Journal of School Choice, 8(1), 20–48. http://dx.doi.org/10.1080/15582159.2014.875408.
Wechsler, D. (1981). WAIS-R: Manual: Wechsler Adult Intelligence Scale—Revised. Harcourt Brace Jovanovich for Psychological Corp.
