Understanding how teachers practise inclusive classroom assessment

Studies in Educational Evaluation 63 (2019) 113–121

Pei-Ying Lin a,⁎, Yu-Cheng Lin b

a Department of Educational Psychology and Special Education, College of Education, University of Saskatchewan, 28 Campus Dr., Saskatchewan, Canada
b Department of Psychological Science, University of Texas Rio Grande Valley, 1201 W University Dr., Edinburg, Texas, 78539, United States

ARTICLE INFO

Keywords: assessment accommodations, assessment practices, classroom assessment, inclusive education, special education

ABSTRACT

This study was undertaken to better understand how Canadian teachers use the three major purposes of assessment for students with diverse needs, through a questionnaire that reflected the major concepts of AFL, AOL, and ACC (assessment for and of learning; accommodations and modifications). We found that a higher percentage of teachers, especially special education teachers, reported offering ACC frequently to their students than reported practising AFL and AOL frequently. In line with the literature, our findings suggest that AOL was adopted by a greater percentage of teachers than AFL, especially when teachers reported practising AAL (assessment as learning, which is considered a subset of AFL) less frequently. Our data also show that teachers who put one assessment concept into practice tended to utilize the other concepts as well. Finally, a small but notable number of the teachers surveyed reported only sometimes or never implementing AFL, AOL, and ACC in class.

1. Introduction

With the increasing trends toward inclusive education and internationalization, school communities have become more diverse, and school systems have expanded inclusive education in response to the growing demands of that diversity. Teacher education programs have offered courses to prepare pre-service and in-service teachers for diverse students and inclusive education. In the past, a majority of students with special needs were excluded from full or partial inclusion in education, and it was an open question how well schools facilitated these students' inclusion in broader social contexts. Providing inclusive education around the world is mainly grounded in human rights and educational equity; it is "a process of responding to individual difference within the structures and processes that are available to all learners" (Florian, 2008, p. 202). In other words, education is provided to all students at local schools, including students with and without special needs as well as English language learners whose home language is not English. However, placing all students in the mainstream classroom does not in itself ensure that learners have equitable opportunities to learn. It is critical for teachers to differentiate their instructional methods in order to meet the diverse needs of all students at different grade levels (Schimmer, 2014; Tomlinson, 1999; Tomlinson & McTighe, 2006). Classroom assessment is one of the major elements of student learning and teaching and has proven to be an effective teaching methodology for fostering students' learning growth, if employed properly and interpreted validly and



fairly (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014; Black & Wiliam, 1998a, 1998b; Popham, 2011); therefore, understanding teachers' assessment practices for all students is essential for stakeholders such as teacher educators (McMillan, 2000; Mertler, 2009; Roscoe, 2013). The findings of previous research on assessment beliefs or practices are valuable and informative; however, the research to date has tended to focus on either the two major components of assessment (assessment for learning [AFL] and assessment of learning [AOL]) or accommodations (ACC). Very little research so far has surveyed all three elements in a single study, and it is therefore difficult to form a complete picture of teachers' inclusionary teaching practices. To narrow this knowledge gap, this study investigates teachers' assessment practices with respect to inclusion.

1.1. Assessment for, as, of learning

Assessments and examinations have long been developed and used for different purposes. In the past, tests were mainly used for accountability and diagnostic purposes. Standardized psychological assessments such as the Woodcock-Johnson IV Tests of Achievement have been used for diagnostic purposes; standardized achievement assessments such as the Canadian Achievement Tests-4 for measuring students' academic achievements for accountability purposes. Accountability-oriented assessment practices rest primarily on the notion of assessment of learning (AOL), or summative assessment. AOL is an integrated concept that covers a variety of summative assessments, including regular end-of-unit, end-of-term, and high- and low-stakes assessments. However, it is believed that AOL mainly focuses on testing and on reporting students' academic performance to varied stakeholders such as school administrators, school divisions, parents, and students. This belief has raised serious concerns and heated debates among educators and researchers (Harlen, 2005; Remesal, 2011; Teasdale & Leung, 2000; Volante & Jaafar, 2010). The concerns that have emerged include increased stress among teachers over high-stakes testing results, more instances of "teaching to the test", an overemphasis on memorizing facts and test-taking strategies, and reduced time for subjects that are not tested, such as art, music, or physical education (e.g., Earl et al., 2003; Klenowski, 2011; Torney-Purta & Amadeo, 2013; Volante & Jaafar, 2010). The seminal work of Black and Wiliam (1998a, 1998b) strongly advocates the concept of assessment for learning (AFL), or formative assessment, emphasizing a continuously adjusted learning and teaching process that should occur between teachers and students. Moreover, a number of researchers have paid close attention to another interconnected concept, assessment as learning (AAL), which teaches students how to self-monitor and assess others on an ongoing basis in order to provide and obtain feedback on their own learning progress (Black, McCormick, James, & Pedder, 2006; Black, Harrison, Lee, Marshall, & Wiliam, 2003; Earl, 2003; Earl & Katz, 2006). This concept has been considered a subset of formative assessment, or AFL (Earl, 2003; Volante, 2010).

The intent or spirit underlying the theory of assessment for and as learning is appealing to educators, and there is a large volume of published articles describing strategies for putting the theory into practice (Ciobanu, 2014; Clark, 2012; Donham, 2010; Gibbons & Kankkonen, 2011). However, the empirical data have often yielded mixed results. Secondary and/or elementary school teachers perceived quality assessment as important for student learning and for improving teaching (Brown, 2004; Brown, Lake, & Matters, 2009; Segers & Tillema, 2011); in these teachers' eyes, assessment was also driven by a school accountability purpose, holding teachers and students accountable for school achievements. That is, teachers viewed assessment both as a diagnostic tool that can inform their teaching practices and student progress and as a summative tool for measuring students' achievements. However, previous studies found that teachers mainly focused on grading and reporting practices (Delanshere & Jones, 1999; Volante, 2010). Philippou and Christou (1997) reported that although primary teachers acknowledged the role of assessment in evaluating student learning and making decisions about teaching effectiveness, they nevertheless might not use assessment results to make changes in the math curriculum to enhance student learning. This finding reflects a gap between teachers' assessment beliefs and practices. Teachers' concerns were associated with external assessment policies, pressures of increased accountability, regulations, rules, and the curriculum (Brown et al., 2009; Klenowski, 2011; Marshall & Drummond, 2006). Secondary teachers often expressed more concerns over assessments than primary teachers, as they were expected to meet the external demand of preparing secondary students for high-stakes assessments (Brown et al., 2009; Scott, Webber, Aitken, & Lupart, 2011). Furthermore, Scott et al. (2011) indicated that subject specialization may in part explain why high school teachers use formative assessment less frequently than elementary and middle school teachers. Although significant differences were found among teachers of different grade levels and subjects in the studies reviewed above, none of the factors examined in Brown (2004), including gender, years of experience, and years of teacher training, were significantly related to teachers' conceptions of assessment.

⁎ Corresponding author. E-mail address: [email protected] (P.-Y. Lin).
https://doi.org/10.1016/j.stueduc.2019.08.002. Received 24 October 2017; Received in revised form 25 April 2019; Accepted 12 August 2019. 0191-491X/© 2019 Elsevier Ltd. All rights reserved.

1.2. Inclusive assessment: accommodations and modifications

The strategies of differentiated instruction and assessment have long been used to promote inclusive education for students with diverse needs educated in the general classroom, especially students with special needs and English language learners (Butcher, Sedgwick, Lazard, & Hey, 2010; Jung & Guskey, 2007; Tomlinson & McTighe, 2006). However, previous studies indicate that some teachers used their own teaching experiences to guide their teaching practices instead of utilizing assessments to inform them (Lin & Lin, 2015). Teaching and assessment can be seen as two sides of the same coin, but these findings suggest that, for some educators, assessment and teaching may be only loosely connected. We would argue that using differentiated assessments for all learners in the general classroom is critical and essential. Accommodations such as extended time, computers, and assistive technology, if appropriate, can help students demonstrate the knowledge and skills they have acquired and mastered while still measuring what an assessment is intended to measure (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014). Thus, the use of appropriate and individualized accommodations can enhance inclusivity and differentiate assessments for students with diverse needs. Modifications, in contrast, refer to changes that do affect what a test is intended to measure. Although accommodations and modifications differ from a theoretical standpoint, teachers use both on a daily basis. As the present study is aimed at understanding teachers' assessment practices, it is necessary to investigate both accommodations and modifications within its scope.
There is a large volume of published studies reporting that English language learners who persistently struggle with phonological processing (e.g., phonological awareness, decoding, and spelling) and show significant weaknesses in verbal working memory display cognitive, linguistic, and learning profiles very similar to those of English monolingual poor readers (Geva & Herbert, 2012; Geva & Wiener, 2014; Geva, Yaghoub-Zadeh, & Schuster, 2000). In addition to the accommodations discussed in special education research, English language learners also receive accommodations relevant to the linguistic aspects of instruction or of a test, such as linguistically simplified versions, bilingual word-for-word dictionaries without definitions, bilingual versions, or oral reading in the native language and English (Abedi & Gándara, 2006; Abedi & Hejri, 2004; Abedi, Hofstetter, & Lord, 2004; Rivera & Collum, 2006). However, English language learners are more likely to be provided with accommodations that do not interfere with the language of an assessment (e.g., extended testing time or a separate testing room) than with those that directly address the linguistic aspects of the test, in order to maintain the integrity of the assessment (Albus & Thurlow, 2008). Although accommodations are often used only for assessment, they should also be provided for instructional purposes. Previous studies have suggested aligning each student's individual learning needs with instructional and test accommodations (Christensen, Lazarus, Crone, & Thurlow, 2008; Christensen, Thurlow, & Wang, 2009; Thurlow, Christensen, & Lail, 2008). In particular, individualized accommodations that have been used for instruction should also be employed for assessment. Following this logic, accommodations for AFL should be consistent with those for AOL for students with special needs and English language learners.
Our literature review suggests that there have been serious concerns about an overemphasis on AOL in teachers' assessment practices, even though many educators advocate for AFL and AAL, or for a balanced assessment system consisting of AFL, AAL, and AOL with appropriate consideration of accommodations and modifications. In an effort to further understand these concerns, this paper uses a teacher questionnaire to investigate in-service teachers' inclusive classroom assessment practices. Through this questionnaire, the current study surveyed a group of Canadian teachers to better understand their inclusionary assessment practices. It set out to answer three specific questions: (1) How do teachers put inclusive classroom assessment into practice? (2) What are the relationships among teachers' practices that meet the different purposes of inclusive classroom assessment? Specifically, do teachers' assessment practices correspond to what has been postulated or debated in the literature? (3) Do teachers' assessment practices differ by teachers' characteristics and experience? To address these questions, we discuss our research design, questionnaire construction, and data analyses in detail below.

2. Methods

2.1. Participants

We recruited 119 elementary and secondary school teachers in Ontario (51.3%) and Saskatchewan (48.7%), Canada, of whom 84.9% were female and 15.1% male (Table 1). Respondents were recruited through university contacts in both provinces and consented to participate in this study. These teachers taught at the elementary level (27.7%), the secondary level (21.8%), or both (45.4%). Many taught multiple subjects such as language arts, math, and science, and approximately half were special education teachers (55.5%). About 93.3% had taught for at least two years, and 14.3% had taught for more than 21 years. A majority were female (84.9%) and had worked with students with special needs (96.6%) or English language learners (92.4%) during their teaching careers.

Table 1. Demographic Characteristics of Participating Teachers.

Characteristic                              n      %
Province
  Ontario                                   61     51.3
  Saskatchewan                              58     48.7
Gender
  Female                                    101    84.9
  Male                                      18     15.1
Grade level taught
  Elementary                                33     27.7
  Secondary                                 26     21.8
  Elementary and secondary                  54     45.4
Teaching experience with
  Students with special needs               94     96.9
  English language learners                 90     92.8
Years of teaching
  Less than 2 years                         8      6.7
  2-4 years                                 30     25.2
  5-10 years                                30     25.2
  11-20 years                               34     28.6
  More than 21 years                        17     14.3
Subject areas
  English language arts                     97     81.5
  Math                                      83     69.7
  Social studies                            84     70.6
  Science                                   76     63.9
  Special education                         66     55.5
  Other                                     33     27.7

2.2. Context of the study

Educational assessment in Canada is mainly guided by provincial policies. The Ontario Ministry of Education mandates that curriculum implementation, class activities, and assignments take into account the accommodations documented in each student's Individual Education Plan (IEP) (Ontario Ministry of Education, 2010, 2017). Based on the IEP, a student with special needs may require "accommodations only", "modified learning expectations, with the possibility of accommodations", or "an alternative program, not derived from the curriculum expectations for a subject/grade or a course" (Ontario Ministry of Education, 2010, p. 70). Moreover, an IEP should specify the accommodations for provincial assessments allowed by the Education Quality and Accountability Office (EQAO), and these must be compatible with those provided for classroom assessment (Ontario Ministry of Education, 2010). The EQAO permits three types of special provisions for English language learners for Grade 9 math and the Grade 10 Ontario Secondary School Literacy Test: additional time up to a maximum of double the allotted time, periodic supervised breaks, and administration in a small group or individual setting or at a separate study carrel (EQAO, 2017a, 2017b). Given the complexity of exceptionalities, more types of accommodations are permitted for students with special needs, including a sign language or oral interpreter, large print, and assistive technology (EQAO, 2017a, 2017b). While Ontario provides differentiated assessment guidelines for students with special needs and English language learners, Saskatchewan's current adaptation guidelines apply to all learners, including these two student populations. A recent educational policy of the Saskatchewan Ministry of Education offers a guideline that assists educators in the province in adapting the learning environment, instruction, assessment, and resources to support all students in achieving curricular outcomes (Saskatchewan Ministry of Education, 2017a). In particular, the Ministry recognizes the importance of assessment adaptations that help achieve the optimal educational goal of equitable and fair assessments for all learners in inclusive classrooms across the province. The use of adaptations for assessment should match student strengths and learning needs but should not compromise the integrity of a given assessment. Moreover, the Ministry suggests that the three purposes of assessment (AFL, AAL, and AOL) are interrelated components applied across Saskatchewan curricula. Teachers are encouraged to involve students in the assessment process and to design assessments that align with curricular expectations and outcomes in order to gather a variety of valid data. Unlike Ontario's assessment policy, Saskatchewan does not make documenting accommodations mandatory; adaptations (accommodations) for students with special needs can be added to a student's Inclusion and Intervention Plan (IIP) as supplementary information (Saskatchewan Ministry of Education, 2017b). The principles of the IIP are the same as those of the IEP in Ontario, although the two may differ in details and implementation.

2.3. Measures

The present study developed an Inclusive Assessment Practices Questionnaire designed to investigate teachers' practices with regard to the purposes of classroom assessment. The questionnaire consists of 16 items reflecting three components (AFL, AOL, and ACC). As AAL is deemed a subset of AFL (Earl, 2003; Volante, 2010), these two components were combined in the questionnaire. In this study, the AFL component was used to understand teachers' formative assessment practices that track students' learning progress on an ongoing basis and reflect their current teaching practices (e.g., "I provide every student with descriptive and detailed feedback using my ongoing records."); AAL was defined as how often teachers instruct their students to monitor, assess, and adjust their own learning progress as well as make decisions about their learning (e.g., "I teach students how to assess their own learning progress."); AOL investigates how often teachers summarize students' current level of learning against their peers, curriculum expectations, and standards in order to determine the next step in students' learning (e.g., "I use assessment performance on major assignments / report cards to decide the next step in students' learning / placement / promotion."); and the ACC component was used to investigate teachers' practices of using accommodations or modifications to support students with disabilities and/or English language learners in demonstrating their knowledge and skills as well as to inform their own teaching practices (e.g., "I provide one or more accommodations that a student usually uses."). The questionnaire used a 4-point Likert scale with four response categories (Never = 0, Sometimes = 1, Often = 2, Always = 3). Table 2 reports the percentages of survey responses given by participating


teachers. Teachers' background information was also collected through the questionnaire, including gender, years of teaching, subjects and grade levels taught in school, and any experience working with students with special needs or English language learners. The background information was included in the data analysis in order to investigate the relationship between teachers' characteristics and assessment practices.

Table 2. Descriptive Statistics of Survey Results. Response percentages are listed as Never / Sometimes / Often / Always.

1. I develop assessments aligned with the curriculum expectations and standards. (3 / 11 / 36 / 39)
2. I use different kinds of assessments for students to demonstrate their learning. (3 / 9 / 41 / 40)
3. I provide every student with descriptive and detailed feedback using my ongoing records. (3 / 24 / 45 / 21)
4. I use assessments to reflect my current teaching practices. (1 / 11 / 47 / 33)
5. I teach students how to assess their own learning progress. (5 / 36 / 41 / 10)
6. I use assessments that allow students to monitor their own learning progress. (5 / 29 / 43 / 12)
7. I teach students how to keep ongoing records to assess their own learning progress. (18 / 34 / 31 / 8)
8. I compare students' assessment performance on major assignments / report cards to know how well a student is performing in relation to their peers. (18 / 36 / 23 / 12)
9. I use students' assessment performance on major assignments / report cards to examine whether they achieve curriculum expectations and standards. (3 / 20 / 34 / 33)
10. I use assessment performance on major assignments / report cards to decide the next step in students' learning / placement / promotion. (2 / 24 / 38 / 27)
11. I provide one or more accommodations (e.g., extra time, quiet setting, scribe) for students with special education needs based on their IEPs. (3 / 6 / 13 / 71)
12. I provide one or more accommodations (e.g., dictionary, glossary) for students who speak a first language other than English based on their identified learning needs. (3 / 8 / 17 / 56)
13. I modify assessments (e.g., change the assessment content or criteria/rubrics) for students with special education needs based on their IEPs. (4 / 7 / 24 / 55)
14. I modify assessments for students who speak a first language other than English based on their identified learning needs. (7 / 13 / 24 / 39)
15. I provide one or more accommodations that a student usually uses. (2 / 12 / 30 / 49)
16. I change one or more accommodations if they are not useful for assessing students' particular skills / knowledge / concepts. (3 / 9 / 39 / 39)
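To make the scoring concrete, the sketch below applies the questionnaire's 4-point coding (Never = 0 through Always = 3) to hypothetical responses, forms component scores using the item groupings suggested by the factor solution (items 1-7 AFL, 8-10 AOL, 11-16 ACC), and computes the Pearson correlations among components. The data and helper names are illustrative assumptions, not the study's.

```python
import numpy as np

# Likert coding used by the questionnaire: Never=0, Sometimes=1, Often=2, Always=3.
# Hypothetical responses for 5 teachers x 16 items (rows = teachers).
responses = np.array([
    [3, 3, 2, 3, 1, 1, 0, 1, 2, 2, 3, 3, 2, 2, 3, 2],
    [2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 3, 2, 3, 2, 2, 2],
    [3, 2, 1, 2, 1, 1, 1, 2, 3, 2, 3, 3, 3, 3, 3, 3],
    [1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 2, 1, 1, 1, 2, 1],
    [2, 3, 2, 2, 2, 2, 2, 1, 2, 3, 3, 3, 2, 2, 3, 2],
])

# Item-to-component mapping implied by the factor loadings (see Table 3).
components = {"AFL": range(0, 7), "AOL": range(7, 10), "ACC": range(10, 16)}

# Mean score per teacher on each component.
scores = {name: responses[:, list(idx)].mean(axis=1) for name, idx in components.items()}

# Pearson correlations among the three component scores.
r = np.corrcoef([scores["AFL"], scores["AOL"], scores["ACC"]])
print(np.round(r, 2))
```

With real data the same correlation matrix is what underlies the component-level analysis reported in the Results section.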

2.3.1. Factor-analytic evidence

Two methods were employed to evaluate the reliability and validity of the items, which were developed specifically to reflect the constructs of the questionnaire. First, a panel of assessment and measurement professionals, including a professor and three doctoral candidates in educational assessment and measurement, reviewed the questionnaire items, which were revised based on their feedback before data collection. Second, the questionnaire was validated by factor analysis. The analysis was conducted on the 16 items with oblique rotation using the direct oblimin method. The Kaiser-Meyer-Olkin measure indicates that the sample size is adequate for factor analysis (KMO = .79); Bartlett's test of sphericity suggests that there are relationships among the items and that factor analysis is appropriate (χ² = 738.85, p < .001). Three factors were extracted, together explaining 53.33% of the total variance: ACC (factor 1), AFL (including a subset of AAL items; factor 2), and AOL (factor 3) (Table 3). In addition, a Cronbach's alpha reliability coefficient of .88 indicates that the questionnaire has high internal consistency.

Table 3. Summary of Items and Factor Loadings.

Item    1: ACC    2: AFL    3: AOL
1       −.063      .560      .226
2        .107      .442      .166
3        .018      .707     −.166
4        .052      .496      .192
5       −.055      .700      .032
6       −.046      .804     −.044
7        .080      .597     −.109
8       −.094      .032      .444
9       −.040      .103      .793
10       .080     −.162      .903
11       .836     −.001      .020
12       .850      .030     −.118
13       .785     −.043     −.023
14       .733     −.017      .008
15       .794      .100      .075
16       .688     −.042      .109

Note: For each item, the highest loading (boldfaced in the published table) indicates its primary factor.
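The two scale diagnostics reported above, Bartlett's test of sphericity and Cronbach's alpha, follow standard formulas and can be computed directly from raw item responses. The sketch below uses simulated data (not the study's); only the formulas are assumed.

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's test of sphericity: chi-square statistic and degrees of freedom.

    Tests whether the item correlation matrix differs from the identity matrix,
    i.e., whether the items are correlated enough to factor-analyze.
    """
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    df = p * (p - 1) / 2
    return chi2, df

def cronbach_alpha(data):
    """Cronbach's alpha: internal-consistency reliability of a summed scale."""
    k = data.shape[1]
    item_vars = data.var(axis=0, ddof=1)
    total_var = data.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated responses: 119 "teachers" by 16 correlated items (one common factor).
rng = np.random.default_rng(0)
factor = rng.normal(size=(119, 1))
data = factor + 0.8 * rng.normal(size=(119, 16))

chi2, df = bartlett_sphericity(data)
alpha = cronbach_alpha(data)
print(f"Bartlett chi2 = {chi2:.1f} on {df:.0f} df; alpha = {alpha:.2f}")
```

With 16 items the test has 16 × 15 / 2 = 120 degrees of freedom, matching the questionnaire's item count; the simulated alpha is high only because the synthetic items share a single strong factor.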

2.4. Data analysis

The first two research questions concern teachers' assessment practices and the relationships among the three major components: AFL, AOL, and ACC. We employed two statistical methods, latent class analysis and correlation, to address these questions by analyzing teachers' questionnaire responses in Mplus 7 (Muthén & Muthén, 2012) and SPSS 22 (IBM Corp, 2013). First, a latent class analysis was conducted to examine teachers' response patterns on the three components (AFL, AOL, and ACC) and on the entire questionnaire. Teachers were assigned to the same latent class if their response patterns were the same or similar, because each latent class model is built on the joint probabilities of the observed items measuring a component (Finch & Bronk, 2011; Goodman, 2002; McCutcheon, 2002). Furthermore, each final latent class model was selected using information indices such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), entropy, the Vuong-Lo-Mendell-Rubin likelihood ratio test, and the Lo-Mendell-Rubin adjusted likelihood ratio test (Lin & Lin, 2014; Muthén & Muthén, 2000; Nylund, Asparouhov, & Muthén, 2007). Second, we examined the Pearson correlations among the three components (AFL, AOL, and ACC) to address the second research question, concerning the relationships among teachers' assessment practices of AFL, AOL, and ACC. In particular, this analysis investigated how teachers responded to one set of questions (e.g., AFL) in relation to the other sets (e.g., AOL, ACC). To address the third research question, a series of multivariate analyses of variance (MANOVA) was performed to investigate whether teachers' assessment practices of AFL, AOL, and ACC (dependent variables) differed significantly in relation to their characteristics or backgrounds (independent variables). The MANOVA models included different background variables: province, gender, years of teaching experience, grade level taught, and whether or not the teachers


were compared with those responding ‘sometimes’ and ‘never’ to one or more items across three factors (Fig. 1). Overall, a greater percentage of teachers reported putting AFL and ACC into practice frequently (ranging from 63% to 85%) than reported putting AOL into practice frequently (ranging from 34% to 67%), although fewer teachers reported frequently teaching students to self-assess or self-monitor their own learning progress (AAL; ranging from 39% to 55%). For instance, approximately 18% of teachers answered that they never taught their students how to keep ongoing records to keep track of their own progress. In addition, a smaller group of teachers reported that they often compared students’ assessment performance on major assignments or report cards to know how well a student was performing in relation to their peers (AOL; 34%) and 18% of the teachers reported that they never did so in the classroom. It is worth noting that approximately 65% of teachers reported ‘always’ or ‘often’ using assessment performance on major assignments or report cards to decide the next step in student’s learning, placement, or promotion, while 24% of teachers reported ‘sometimes’ using summative assessment results to advance student learning. In contrast, nearly 2% of teachers reported ‘never’ using assessment results to further student learning (Fig. 1). We then included each set of items that measured a factor in separate analyses. The set of analyses for AOL and ACC reveal that the two-class model fits the data better than the other three models (latent classes 1, 3, and 4), while participating teachers’ responses to AFL were grouped into three latent classes. The response patterns are illustrated by the sizes of latent classes as well as the mean scores of each component and the entire scale in Fig. 2. 
These results show that a higher number of teachers frequently offered accommodations or modifications to students with special needs or English language learners (47.1%) than the teachers who adopted AFL (19.3%) and AOL (35.3%). Approximately half of the teachers sometimes practised AFL (58.0%) and AOL (50.4%) and a small percentage of teachers infrequently practised AFL in class (7.6%). These findings suggest that it is imperative to conduct the latent class analyses because they can validly distinguish the assessment practices that were similar from those were different. The results, as shown in Table 5, indicate that there are significant positive correlations among AFL, AOL, and ACC (correlation coefficients ranging from .31 to .43). These results indicate that teachers’ responses are consistent across three assessment components; teachers’ ratings of one component are associated with the ratings of other components and the entire measure.

Table 4
Fit Indices for Each Latent Class Model Specification for Inclusive Assessment Practice Questionnaire.

Model / index                Class 1     Class 2     Class 3     Class 4
AFL
  No. of free parameters     21          43          65          87
  AIC                        1755.027    1638.430    1573.676    1560.810
  BIC                        1812.487    1756.086    1751.529    1798.86
  Sample-size adjusted BIC   1746.113    1620.178    1546.086    1523.882
  Entropy                    –           0.810       0.934       0.910
  VLMR LRT (p)               –           0.217       0.024       0.781
  LMRA LRT (p)               –           0.219       0.024       0.781
AOL
  No. of free parameters     9           19          29          55
  AIC                        787.485     746.518     729.413     764.411
  BIC                        811.708     797.653     807.462     907.695
  Sample-size adjusted BIC   783.269     737.616     715.826     733.991
  Entropy                    –           0.788       0.867       0.906
  VLMR LRT (p)               –           0.000       0.088       0.361
  LMRA LRT (p)               –           0.000       0.093       0.364
ACC
  No. of free parameters     18          37          56          75
  AIC                        1298.606    1081.128    1046.971    1011.874
  BIC                        1347.699    1182.041    1199.705    1216.428
  Sample-size adjusted BIC   1290.810    1065.102    1022.716    979.390
  Entropy                    –           0.924       0.926       0.951
  VLMR LRT (p)               –           0.000       0.061       1
  LMRA LRT (p)               –           0.000       0.063       1
All components
  No. of free parameters     48          97          146         195
  AIC                        3842.542    3570.892    3518.645    3613.82
  BIC                        3974.298    3837.151    3919.405    4149.082
  Sample-size adjusted BIC   3822.579    3530.552    3457.926    3532.724
  Entropy                    –           0.909       0.926       0.688
  VLMR LRT (p)               –           0.095       0.628       0.760
  LMRA LRT (p)               –           0.097       0.629       0.760

Note: The final latent class models for AFL, AOL, and ACC were highlighted in bold in the published table; AFL = assessment for learning; AOL = assessment of learning; ACC = accommodations or modifications; AIC = Akaike information criterion; BIC = Bayesian information criterion; VLMR LRT = Vuong-Lo-Mendell-Rubin likelihood ratio test; LMRA LRT = Lo-Mendell-Rubin adjusted likelihood ratio test. *p < .05, ***p < .001.
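As a companion to Table 4, the relative fit indices it reports are simple functions of a fitted model's maximized log-likelihood, its number of free parameters, and the sample size, while entropy summarizes how cleanly the posterior class-membership probabilities assign people to classes. The sketch below uses the standard formulas with made-up numbers (the log-likelihoods, sample size, and posterior probabilities are illustrative assumptions, not values from the study):

```python
import math

def fit_indices(log_likelihood, k, n):
    """Relative fit indices of the kind reported in Table 4.
    k = number of free parameters, n = sample size; lower is better
    when comparing candidate models fit to the same data."""
    aic = -2 * log_likelihood + 2 * k
    bic = -2 * log_likelihood + k * math.log(n)
    # Sample-size adjusted BIC substitutes (n + 2) / 24 for n
    ssa_bic = -2 * log_likelihood + k * math.log((n + 2) / 24)
    return {"AIC": aic, "BIC": bic, "SSA-BIC": ssa_bic}

def relative_entropy(posteriors):
    """Relative entropy of a latent class solution.
    posteriors: per-person class-membership probability vectors.
    Returns 1.0 when every person is assigned with certainty and
    values near 0 when class membership is highly ambiguous."""
    n, k = len(posteriors), len(posteriors[0])
    dispersion = sum(-p * math.log(p)
                     for row in posteriors for p in row if p > 0)
    return 1 - dispersion / (n * math.log(k))

# Hypothetical comparison of a one-class and a two-class solution
one_class = fit_indices(-880.0, k=9, n=120)
two_class = fit_indices(-770.0, k=19, n=120)
prefer_two = two_class["BIC"] < one_class["BIC"]

# Hypothetical posteriors: two crisp assignments and one ambiguous one
ent = relative_entropy([[0.99, 0.01], [0.03, 0.97], [0.55, 0.45]])
```

In practice these indices are produced by LCA software such as Mplus (used with the VLMR and LMRA likelihood ratio tests above); the sketch only shows how the AIC/BIC/entropy columns relate to a fitted model.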

3. Results

3.1. Questions 1 and 2: teachers' assessment practices of AFL, AOL, and ACC

To investigate whether or not teachers showed distinct response patterns across the questionnaire items, we included all items in one-, two-, three-, and four-latent-class models. The model fit indices, including measures of relative model fit (AIC, BIC, and sample-size adjusted BIC), a measure of classification accuracy (entropy), and tests of whether a model with a given number of latent classes fits better than a model with one fewer class (Vuong-Lo-Mendell-Rubin LRT; Lo-Mendell-Rubin adjusted LRT), show that the one-class model fits the data better than the other models (Table 4). That is, teachers' responses regarding how often they practised inclusive classroom assessment were, in general, similar and were grouped into only one latent class. To further investigate the relationships within teachers across the three factors, the percentages of teachers' responses to each item are presented in Fig. 1 (Lamont et al., 2017; Urick & Bowers, 2014). To better interpret the data, the percentages of teachers reporting that they 'always' and 'often' put inclusive classroom assessment into practice were combined into one category and …

3.2. Question 3: assessment practices in relation to teachers' characteristics and experience

Participants' background variables were included separately in seven sets of MANOVA: province, gender, grade level taught, years of teaching, and whether or not the respondents had taught as special education teachers, had worked with students with special needs, or had worked with English language learners in schools. These analyses tested whether our results are generalizable across teachers with different characteristics. Results show that teachers' assessment practices, in general, did not significantly differ by these characteristics, except for a group difference in ACC. The data indicate that special education teachers offered accommodations or modifications more often than other teachers (F(1, 111) = 11.31, p < .01, η² = .09), although no significant group difference was found in AFL and AOL. Furthermore, no significant group difference was found between females and males, among those who taught at different grade levels or in different provinces, between those with fewer or more years of teaching, or between those who did or did not work with students with special needs or ELL students.
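The reported effect size can be checked directly: partial eta squared is recoverable from a univariate F statistic and its degrees of freedom, and the reported F(1, 111) = 11.31 reproduces the η² ≈ .09 given above. A quick sketch:

```python
def eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared from a univariate F test:
    eta^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Group difference in ACC reported above: F(1, 111) = 11.31
effect = eta_squared(11.31, 1, 111)
print(round(effect, 2))  # 0.09
```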


Fig. 1. Plot of the percentages of teachers who reported practising AFL, AOL, and ACC in the one-class model.

4. Discussion

We found significant positive correlations among AFL, AOL, and ACC, indicating that participating teachers who employed even one assessment strategy were more likely to use the other assessment components more frequently in the general classroom. What is interesting in these data is that positive correlations were found between AFL and AOL. In other words, teachers' adoption of AOL (summative assessment) did not necessarily mean that AFL or AAL (formative assessment) was not used. These results echo researchers' suggestion that building a balanced and cohesive assessment system is important for high-quality schools (Lin & Lin, 2015; Earl & Katz, 2006; Stiggins, 2006; Wang, Beckett, & Brown, 2006). In line with previous studies (Brown, 2004; Brown et al., 2009; Segers & Tillema, 2011), participating teachers used assessments as both diagnostic and summative tools for enhancing student learning and informing teaching practices, as well as for reporting and accountability purposes. Given that a noticeable percentage of participating teachers, in general, put AFL, AOL, and ACC into practice, their inclusionary assessment practices varied by level of perceived intensity or frequency.

Table 5
Intercorrelations for Components of Inclusive Assessment Practice Questionnaire.

Measure    AFL       AOL       ACC
AFL        –         .425**    .372**
AOL                  –         .310**
ACC                            –

Note: AFL = assessment for learning; AOL = assessment of learning; ACC = accommodations or modifications. ** p < .01.
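The coefficients in Table 5 are ordinary Pearson product-moment correlations between teachers' component scores. A self-contained sketch with illustrative scores (the five teachers' composite scores below are made up, not the study's data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative AFL and AOL composite scores for five teachers
afl = [3.2, 4.1, 2.8, 3.9, 4.5]
aol = [3.0, 3.8, 2.5, 4.2, 4.4]
r = pearson_r(afl, aol)  # positive, consistent with Table 5's pattern
```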

The first set of latent class analyses, which included all items in the models, resulted in only one latent class, indicating that teachers who completed the questionnaire often utilized inclusive assessment concepts in the general classroom (Fig. 2). However, our further latent class analyses suggest that participating teachers can be regrouped into two different clusters for AOL (black and red bubbles in Fig. 2): more than one-third of the teachers often put AOL into practice, and approximately half of the teachers sometimes adopted AOL in class (Fig. 2).

Fig. 2. Response patterns of teachers in latent classes 1, 2, and 3. Note: The size of each bubble reflects the number of teachers in that latent class, and the number in each bubble refers to the percentage of participating teachers in a given latent class. AFL = assessment for learning; AOL = assessment of learning; ACC = accommodations or modifications.


Although a majority of the teachers reported practising AOL frequently (Fig. 2), it is somewhat surprising that a small group of participating teachers indicated that they never summarized their students' current level of performance in relation to counterparts, curriculum expectations, and outcome indicators (AOL; ranging from 2% to 18%, Fig. 1). Moreover, teachers were assigned to three clusters for AFL (black, red, and blue bubbles in Fig. 2). Similar to the use of AOL, the results show that some teachers still adopted AFL substantially less often than the other teachers (ranging from 1% to 18% in Fig. 1; blue bubble in Fig. 2). This finding suggests that some teachers rarely used the assessment strategies AFL and AOL to reflect on their current practices and support students' learning progress. The assessment practices of these teachers do not agree with the current literature, which shows that classroom assessment plays an important role in effective teaching if employed properly and interpreted validly and fairly (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014; Black & Wiliam, 1998a, 1998b; McMillan, 2000; Popham, 2011). Our data also reveal that teachers may not teach their students how to self-assess or self-monitor their own learning progress (ranging from 39% to 55%, Fig. 1). This result is consistent with Volante (2010) and Volante and Beckett (2011), who found that many teachers and administrators were not familiar with AAL and, consequently, did not consistently put it into practice in school. Our findings about AAL call for more professional development opportunities on learner autonomy (Black et al., 2006; Marshall & Drummond, 2006; Remesal, 2011).

Although a greater number of teachers reported adopting AFL frequently (ranging from 39% to 82%) than reported practising AOL frequently (ranging from 34% to 67%, Fig. 1), fewer teachers reported frequently instructing students how to self-assess, or even just allowing students to self-assess or self-monitor their own learning progress (AAL, which is considered a subset of AFL; ranging from 39% to 55%, Fig. 1). As such, we observed that a higher percentage of teachers reported practising AOL frequently than reported practising AFL frequently (35.3% versus 19.3%, Fig. 2). This finding accords with earlier observations concerning an imbalance between the use of AFL and AOL and an overemphasis on AOL in school systems (Delanshere & Jones, 1999; Philippou & Christou, 1997; Volante, 2010). Taken together, our results suggest that a greater percentage of teachers reported adopting AOL frequently than reported adopting AFL frequently, and in particular that they practised AAL (a subset of AFL) less frequently. Researchers often express serious concern about an imbalance between teachers' use of AOL and AFL. With the increased pressures of accountability and the demands of external assessment policies, teachers' views about assessment have been found to focus mainly on grading and test reporting rather than on using formative assessment as a diagnostic tool for enhancing student progress and the quality of teaching (Klenowski, 2011; Marsh, 2009; Philippou & Christou, 1997; Volante, 2010). Moreover, such practices are not consonant with the recommendation that a balanced assessment system, which includes both AFL and AOL, is required to meet the standard of a quality education (Stiggins, 2006; Wang et al., 2006).

A majority of students with special needs had received at least one accommodation for large-scale assessments. For instance, 96.4% of students with intellectual disabilities received accommodations for a high-stakes literacy assessment (Lin & Lin, 2016). For English language learners, as many as 75 different types of accommodations were identified in a national review of state policies conducted by Rivera and Collum (2006). In another review of state assessment policies, all 51 states allowed at least some types of accommodations for test takers who were English language learners (Willner, Rivera, & Acosta, 2008). It is worth noting that a small percentage of participating teachers never or only sometimes offered accommodations or modifications to their students (ranging from 9% to 19%, Fig. 1). Prior studies have noted the importance of consistently linking instructional accommodations to those for classroom and large-scale assessments (Bolt & Thurlow, 2004; Helwig & Tindal, 2003; Ketterlin-Geller, Alonzo, Braun-Monegan, & Tindal, 2007; National Research Council, 2004). This finding suggests there may be a discrepancy between the high level of perceived need for accommodations and the accommodation practices in some general classrooms.

Overall, the results of MANOVA suggest that our findings are invariant across different teacher characteristics and teaching experience, since participating teachers' assessment practices did not significantly differ by gender, province, grade level taught, or whether or not the participant had worked with students with special needs or English language learners in schools. The exception was that special education teachers reported offering ACC more frequently than other teachers. In addition, compared with their use of AFL and AOL, more participating teachers reported providing ACC to their students with special needs or English language learners. This finding accords with earlier investigations showing that teachers demonstrated positive beliefs toward accommodations (Bielinski, Sheinker, & Ysseldyke, 2003; Lin & Lin, 2015; Tindal, Lee, & Ketterlin-Geller, 2008). Moreover, it is also interesting that there was no significant difference between veteran and junior teachers' assessment practices of AFL, AOL, and ACC. This result is consistent with Brown (2004), who found that years of experience was not significantly associated with teachers' conceptions of assessment. Although previous studies found that secondary school teachers employed formative assessments less frequently than elementary school teachers (Brown et al., 2009; Scott et al., 2011), the current study did not find a significant difference in AFL among teachers who taught at different grade levels. In addition, our AOL results agree with the finding of Volante and Beckett (2011) that the use of provincial assessment data was similar among interviewed elementary and secondary teachers in Ontario.

5. Conclusions, implications, and limitations

The present study was undertaken to better understand the assessment practices of teachers with varied characteristics and teaching experience. In particular, we investigated how teachers utilized three major purposes of assessment for students with and without special needs and for English language learners: AFL, AOL, and ACC. Making use of latent class analyses, distinct group patterns revealed that a greater number of teachers, especially special education teachers, frequently offered accommodations or modifications (ACC) to their students than frequently practised AFL and AOL. Significant correlations were found among the three major components of inclusive assessment practices; that is, teachers who put one assessment concept into practice tended to utilize the other concepts. This study also found that a greater percentage of teachers reported practising AOL frequently than reported adopting AFL frequently, and in particular that they practised AAL (considered a subset of AFL) less frequently. Furthermore, the data reveal variability in the utilization of AFL, AOL, and ACC, because some teachers practised all three types of assessment strategies infrequently. Overall, the results of MANOVA suggest that the findings are invariant across varied teaching experiences and backgrounds. These results assist in our understanding of teachers' assessment practices.

This research offers several important implications for inclusion, teacher education, induction, and professional development programs, and for future studies. First, finding significant correlations among the three major components of inclusive assessment practices lends support to the notion that a cohesive and balanced assessment system is still necessary to produce high-quality schools (Earl & Katz, 2006; Lin & Lin, 2015; Stiggins, 2006). Teachers who put one assessment concept into practice tended to utilize the other concepts.
Second, our findings indicate that general classroom teachers may need support to enact assessment as learning (AAL). Third, we also found that a small but notable number of teachers sometimes or never implemented AFL, AOL, and ACC in class. These assessment practices may not align well with the recommendations of a large body of current literature on classroom and large-scale assessments (Black & Wiliam, 1998a, 1998b; Fuchs, Fuchs, Eaton, Hamlett, Binkley, et al., 2000; Fuchs, Fuchs, Eaton, Hamlett, & Karns, 2000; Volante & Beckett, 2011). Teacher education and professional development should take into consideration teachers' needs for training in the use of classroom assessment and accommodations (Brown, 2004; Crawford & Ketterlin-Geller, 2013; Mertler, 2009; Volante, 2010).

The present study represents a collection of teachers who practised inclusive assessment. In contrast with methods that collect one or a small number of cases (e.g., a case study), the survey method employed in the current study can broaden the research scope, and advanced statistical analyses allow inferences to be made about a population of interest (Roberts, 1999). Like many qualitative interview studies (e.g., Crawford & Ketterlin-Geller, 2013; Remesal, 2011; Volante, 2010; Volante & Beckett, 2011), the present study may be susceptible to response biases commonly seen in self-reports (e.g., acquiescence, social desirability) (Furr & Bacharach, 2014; Knowles & Condon, 1999; Paulhus, 2002; Schwarz, 1999). In other words, participants may have responded to the questions positively even though they did not actually put the assessment concepts into practice. Consequently, attention should be paid to the limitations of self-reporting when interpreting the results. It is recommended that further research be undertaken to reduce response biases in self-reports.
Moreover, the current study investigated teachers' inclusive assessment practices across various assessment approaches within an alternative theoretical framework and a broader scope: one that pertains not only to a single conventional concept of AFL, AOL, or ACC but also to an integration of all three components. Further research might explore all assessment components of different assessment approaches through classroom-based observations, case studies, or action research.

Funding

This research was supported by the Social Sciences and Humanities Research Council of Canada.

Declaration of Competing Interest

The authors declared no potential conflicts of interest.

Acknowledgements

We are very grateful to the teachers who participated in this study.

References

Abedi, J., & Gándara, P. (2006). Performance of English language learners as a subgroup in large-scale assessment: Interaction of research and policy. Educational Measurement: Issues and Practice, 25(4), 36–46.

Abedi, J., & Hejri, F. (2004). Accommodations for students with limited English proficiency in the National Assessment of Educational Progress. Applied Measurement in Education, 17, 371–392.

Abedi, J., Hofstetter, C. H., & Lord, C. (2004). Assessment accommodations for English language learners: Implications for policy-based empirical research. Review of Educational Research, 74, 1–28.

Albus, D., & Thurlow, M. L. (2008). Accommodating students with disabilities on state English language proficiency assessments. Assessment for Effective Intervention, 33, 156–166.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (2014). Standards for educational and psychological testing. Washington, DC: Authors.

Bielinski, J., Sheinker, A., & Ysseldyke, J. (2003). Varied opinions on how to report accommodated test scores: Findings based on CTB/McGraw-Hill's framework for classifying accommodations (NCEO Synthesis Rep. 49). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for learning: Putting it into practice. Maidenhead, UK: Open University Press.

Black, P., McCormick, R., James, M., & Pedder, D. (2006). Learning how to learn and assessment for learning: A theoretical inquiry. Research Papers in Education, 21(2), 119–132.

Black, P., & Wiliam, D. (1998a). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80, 139–148.

Black, P., & Wiliam, D. (1998b). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5, 7–73.

Brown, G. T. L. (2004). Teachers' conceptions of assessment: Implications for policy and professional development. Assessment in Education: Principles, Policy & Practice, 11, 301–318.

Brown, G. T. L., Lake, R., & Matters, G. (2009). Assessment policy and practice effects on New Zealand and Queensland teachers' conceptions of teaching. Journal of Education for Teaching, 35(1), 61–75.

Butcher, J., Sedgwick, P., Lazard, L., & Hey, J. (2010). How might inclusive approaches to assessment enhance student learning in HE? Enhancing the Learner Experience in Higher Education, 2(1), 25–40.

Christensen, L. L., Lazarus, S. S., Crone, M., & Thurlow, M. L. (2008). 2007 state policies on assessment participation and accommodations for students with disabilities (Synthesis Rep. 69). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Christensen, L. L., Thurlow, M. L., & Wang, T. (2009). Improving accommodations outcomes: Monitoring instructional and assessment accommodations for students with disabilities. Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Ciobanu, M. (2014). In the middle - whose learning is it anyway? Increasing students' engagement through assessment as learning techniques. Gazette - Ontario Association for Mathematics, 53(2), 16–21.

Clark, I. (2012). Formative assessment: Assessment is for self-regulated learning. Educational Psychology Review, 24(2), 205–249.

Crawford, L., & Ketterlin-Geller, L. (2013). Middle school teachers' assignment of test accommodations. The Teacher Educator, 48(1), 29–45.

Delanshere, G., & Jones, J. H. (1999). Elementary teachers' beliefs about assessment in mathematics: A case of assessment paralysis. Journal of Curriculum and Supervision, 14(3), 216–240.

Donham, J. (2010). Creating personal learning through self-assessment. Teacher Librarian, 37(3), 14–21.

Earl, L. (2003). Assessment as learning: Using classroom assessment to maximize student learning. Thousand Oaks, CA: Corwin Press.

Earl, L., & Katz, S. (2006). Rethinking classroom assessment with purpose in mind. Winnipeg, MB: Western Northern Canadian Protocol.

Earl, L., Levin, B., Leithwood, K., Fullan, M., Watson, N., Torrance, N., et al. (2003). England's national literacy and numeracy strategies: Final report of the external evaluation of the implementation of the strategies. Notts, England: Department for Education and Skills.

Education Quality and Accountability Office (EQAO) (2017a). How to administer the OSSLT. Toronto, ON: EQAO.

Education Quality and Accountability Office (EQAO) (2017b). Guide for accommodations and special provisions. Toronto, ON: EQAO.

Finch, W. H., & Bronk, K. C. (2011). Conducting confirmatory latent class analysis using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 18(1), 132–151.

Florian, L. (2008). Special or inclusive education: Future trends. British Journal of Special Education, 35(4), 202–208.

Fuchs, L. S., Fuchs, D., Eaton, S. B., Hamlett, C., Binkley, E., & Crouch, R. (2000). Using objective data sources to enhance teacher judgments about test accommodations. Exceptional Children, 67(1), 67–81.

Fuchs, L. S., Fuchs, D., Eaton, S. B., Hamlett, C. L., & Karns, K. M. (2000). Supplemental teacher judgments of mathematics test accommodations with objective data sources. School Psychology Review, 29, 65–85.

Furr, R. M., & Bacharach, V. R. (2014). Psychometrics: An introduction (2nd ed.). Thousand Oaks, CA: Sage.

Geva, E., & Herbert, K. (2012). Assessment and interventions in English language learners with LD. In B. Wong & D. Butler (Eds.), Learning about learning disabilities (4th ed., pp. 271–298). New York, NY: Elsevier.

Geva, E., & Wiener, J. (2014). Psychological assessment of culturally and linguistically diverse children: A practitioner's guide. New York, NY: Springer.

Geva, E., Yaghoub-Zadeh, Z., & Schuster, B. (2000). Understanding individual differences in word recognition skills of ESL children. Annals of Dyslexia, 50, 123–154.

Gibbons, S. L., & Kankkonen, B. (2011). Assessment as learning in physical education: Making assessment meaningful for secondary school students. Physical & Health Education Journal, 76(4), 6.

Goodman, L. A. (2002). Latent class analysis: The empirical study of latent types, latent variables, and latent structures. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 3–55). Cambridge, UK: Cambridge University Press.

Harlen, W. (2005). Teachers' summative practices and assessment for learning: Tensions and synergies. The Curriculum Journal, 16(2), 207–223.

IBM Corp (2013). IBM SPSS Statistics for Windows 22.0. Armonk, NY: IBM Corp.

Jung, L. A., & Guskey, T. R. (2007). Standards-based grading and reporting: A model for special education. Teaching Exceptional Children, 40(2), 48–53.

Ketterlin-Geller, L. R., Alonzo, J., Braun-Monegan, J., & Tindal, G. (2007). Recommendations for accommodations: Implications of (in)consistency. Remedial and Special Education, 28, 194–206.

Klenowski, V. (2011). Assessment for learning in the accountability era: Queensland, Australia. Studies in Educational Evaluation, 37(1), 78–83.

Knowles, E. S., & Condon, C. A. (1999). Why people say "yes": A dual-process theory of acquiescence. Journal of Personality and Social Psychology, 77, 379–386.

Lamont, A. E., Markle, R. S., Wright, A., Abraczinskas, M., Siddall, J., Wandersman, A., et al. (2017). Innovative methods in evaluation: An application of latent class analysis to assess how teachers adopt educational innovations. The American Journal of Evaluation, 39(3), 364–382. https://doi.org/10.1177/1098214017709736

Lin, P. Y., & Lin, Y. C. (2014). Examining student factors in sources of setting accommodation DIF. Educational and Psychological Measurement, 74(5), 759–794.

Lin, P. Y., & Lin, Y. C. (2015a). Identifying Canadian teacher candidates' needs for training in the use of inclusive classroom assessment. International Journal of Inclusive Education, 19(8), 771–786.

Lin, P. Y., & Lin, Y. C. (2015b). What teachers believe about inclusive assessment in Canada: An empirical investigation. In L. Thomas & M. Hirschkorn (Eds.), Change and progress in Canadian teacher education: Research on recent innovations in teacher preparation in Canada (pp. 492–525). E-book published by the Canadian Association for Teacher Education.

Lin, P. Y., & Lin, Y. C. (2016). Examining accommodation effects for equity by overcoming a methodological challenge of sparse data. Research in Developmental Disabilities, 51–52, 10–22.

Marsh, C. (2009). Key concepts for understanding curriculum. London: Routledge.

Marshall, B., & Drummond, M. (2006). How teachers engage with assessment for learning: Lessons from the classroom. Research Papers in Education, 21(2), 133–149.

McCutcheon, A. L. (2002). Basic concepts and procedures in single- and multiple-group latent class analysis. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 57–88). Cambridge, UK: Cambridge University Press.

McMillan, J. H. (2000). Fundamental assessment principles for teachers and school administrators. Practical Assessment, Research & Evaluation, 7(8). Retrieved from http://PAREonline.net/getvn.asp?v=7&n=8

Mertler, C. A. (2009). Teachers' assessment knowledge and their perceptions of the impact of classroom assessment professional development. Improving Schools, 12(2), 101–113.

Muthén, L. K., & Muthén, B. O. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882–891.

Muthén, L. K., & Muthén, B. O. (2012). Mplus user's guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

National Research Council (2004). Keeping score for all: The effects of inclusion and accommodation policies on large-scale educational assessments. Committee on Participation of English Language Learners and Students with Disabilities in NAEP and Other Large-Scale Assessments. In J. A. Koenig & L. F. Bachman (Eds.), Board on Testing and Assessment, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535–569.

Ontario Ministry of Education (2010). Growing success: Assessment, evaluation, and reporting in Ontario schools. Retrieved from http://www.edu.gov.on.ca/eng/policyfunding/growSuccess.pdf

Ontario Ministry of Education (2017). Special education in Ontario, kindergarten to grade 12: Policy and resource guide. Retrieved from http://www.edu.gov.on.ca/eng/parents/speced.html

Paulhus, D. L. (2002). Socially desirable responding: The evolution of a construct. In H. Braun, D. N. Jackson, & D. E. Wiley (Eds.), The role of constructs in psychological and educational measurement (pp. 67–88). Hillsdale, NJ: Lawrence Erlbaum.

Philippou, G., & Christou, C. (1997). Cypriot and Greek primary teachers' conceptions about mathematical assessment. Educational Research and Evaluation, 3(2), 140–159.

Popham, W. J. (2011). Assessment literacy overlooked: A teacher educator's confession. The Teacher Educator, 46, 265–273.

Remesal, A. (2011). Primary and secondary teachers' conceptions of assessment: A qualitative study. Teaching and Teacher Education, 27, 472–482.

Rivera, C., & Collum, E. (Eds.). (2006). State assessment policy and practice for English language learners: A national perspective. Mahwah, NJ: Lawrence Erlbaum.

Roscoe, K. (2013). Enhancing assessment in teacher education courses. The Canadian Journal for the Scholarship of Teaching and Learning, 4(1), Article 5.

Saskatchewan Ministry of Education (2017a). The adaptive dimension for Saskatchewan K-12 students. Regina, SK: Saskatchewan Ministry of Education.

Saskatchewan Ministry of Education (2017b). Inclusion and intervention plan guidelines. Regina, SK: Saskatchewan Ministry of Education.

Schimmer, T. (2014). Ten things that matter from assessment to grading. Upper Saddle River, NJ: Pearson.

Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54, 93–105.

Scott, S., Webber, C. F., Aitken, N., & Lupart, J. (2011). Developing teachers' knowledge, beliefs, and expertise: Findings from the Alberta Student Assessment Study. The Educational Forum, 75(2), 96–113.

Segers, M., & Tillema, H. (2011). How do Dutch secondary teachers and students conceive the purpose of assessment? Studies in Educational Evaluation, 37, 49–54.

Stiggins, R. J. (2006). Assessment for learning: A key to student motivation and learning. Phi Delta Kappa Edge, 2(2), 1–19.

Teasdale, A., & Leung, C. (2000). Teacher assessment and psychometric theory: A case of paradigm crossing? Language Testing, 17(2), 163–184.

Thurlow, M., Christensen, L., & Lail, K. E. (2008). An analysis of accommodations issues from the standards and assessments peer review (Technical Report 51). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

Tindal, G., Lee, D., & Ketterlin-Geller, L. (2008). The reliability of teacher decision-making in recommending accommodations for large-scale tests (Technical Report 08-01). Eugene, OR: Behavioral Research and Teaching, University of Oregon.

Tomlinson, C. A. (1999). The differentiated classroom: Responding to the needs of all learners. Alexandria, VA: Association for Supervision and Curriculum Development.

Tomlinson, C. A., & McTighe, J. (2006). Integrating differentiated instruction and understanding by design: Connecting content and kids. Alexandria, VA: Pearson.

Torney-Purta, J., & Amadeo, J.-A. (2013). International large-scale assessments: Challenges in reporting and potentials for secondary analysis. Research in Comparative and International Education, 8(3), 248–258.

Urick, A., & Bowers, A. J. (2014). What are the different types of principals across the United States? A latent class analysis of principal perception of leadership. Educational Administration Quarterly, 50(1), 96–134.

Volante, L. (2010). Assessment of, for, and as learning within schools: Implications for transforming classroom practice. Action in Teacher Education, 31(4), 66–75.

Volante, L., & Beckett, D. (2011). Formative assessment and the contemporary classroom: Synergies and tensions between research and practice. Canadian Journal of Education, 34(2), 239–255.

Volante, L., & Jaafar, S. B. (2010). Assessment reform and the case for learning-focused accountability. The Journal of Educational Thought, 44(2), 167–188.

Wang, L., Beckett, G. H., & Brown, L. (2006). Controversies of standardized assessment in school accountability reform: A critical synthesis of multidisciplinary research evidence. Applied Measurement in Education, 19, 305–328.

Willner, L. S., Rivera, C., & Acosta, B. D. (2008). Descriptive study of state assessment policies for accommodating English language learners. Washington, DC: George Washington University, Center for Equity and Excellence in Education.
