Kindergarten literacy assessment of English Only and English language learner students: An examination of the predictive validity of three phonemic awareness measures


Journal of School Psychology 47 (2009) 369–394

Danielle L. Linklater⁎, Rollanda E. O'Connor, Gregory J. Palardy
Graduate School of Education, University of California, Riverside, United States
Received 8 September 2008; received in revised form 30 June 2009; accepted 4 August 2009

Abstract

The study assessed the ability of English phonemic awareness measures to predict kindergarten reading performance and to determine factors that contributed to growth trajectories on those measures for English Only (EO) and English language learner (ELL) students. Using initial sound fluency (ISF), phoneme segmentation fluency (PSF), and a combined phoneme segmentation task (CPST), students' beginning-of-kindergarten scores were used to predict end-of-kindergarten Nonsense Word Fluency (NWF) and reading (WRMT-R/NU). Regression analyses revealed that ISF and CPST early in kindergarten predicted variance in NWF and WRMT-R/NU. PSF did not predict reading performance over ISF or CPST. While gender was a significant factor in the growth curves across the measures, results revealed no significant difference between EO and ELL students. Published by Elsevier Ltd. on behalf of the Society for the Study of School Psychology.

Keywords: Early literacy; Phonemic awareness; Reading assessment; Kindergarten; English language learners

⁎ Corresponding author at: Sproul Hall, University of California, Riverside, Riverside, CA 92521, United States. E-mail address: [email protected] (D.L. Linklater). Action Editor: Scott Ardoin. doi:10.1016/j.jsp.2009.08.001

As federal policies continue to emphasize the prevention of reading disabilities for at-risk students, educators and researchers alike have become progressively more concerned with identifying early literacy risk factors. Most children formally enter the school system in kindergarten, which makes the kindergarten year critical for determining and deploying preventative measures. Identifying and providing intervention to kindergartners who are likely to experience reading problems is critical to later success in school. Juel (1988) and Scarborough (1995) have shown that, without early identification and intervention, poor readers in first grade are likely to remain poor readers in fourth grade and beyond. For students who are behind in reading and literacy development, the opportunities to "catch up" to peers diminish over time (Good, Simmons, & Smith, 1998). As Mathes and Torgesen (1998) observed, in order to reduce reading failure, we need to allocate resources for early identification and prevention.

Early literacy measures

The National Reading First Assessment Committee (Kame'enui, 2002) concluded that comprehensive school-wide early literacy assessment systems should be used to screen, monitor progress, diagnose, and measure student outcomes. Screening assessments can be used to determine which children are at risk for reading difficulties so that intervention can be provided. Recent scientific advances in early literacy assessment have given schools access to information about students' foundational beginning reading skills. Over the past decade, researchers have made significant progress in improving educators' ability to accurately and reliably assess students' early literacy skills (Good & Kaminski, 2003; Torgesen, 2002). A large body of research suggests that individual differences in phonemic awareness and alphabetic knowledge in the early grades are strong predictors of success in acquiring beginning reading skills (Good, Simmons, & Kame'enui, 2001; O'Connor & Jenkins, 1999; Torgesen et al., 1999; Vanderwood, Linklater, & Healy, 2008). In kindergarten, phonemic awareness has gained prominence because of its critical role in reading development.
Phonemic awareness (PA) is the awareness of sounds in spoken words and the ability to manipulate those sounds (Wagner, Torgesen, Laughon, Simmons, & Rashotte, 1993), and it is essential to developing later reading skills (O'Connor, Jenkins, & Slocum, 1995; Torgesen, Morgan, & Davis, 1992). Assessing alphabetic knowledge is relatively simple because the number of items to assess is small and each item is distinct from the others; measuring PA early in kindergarten is more challenging because some letters have multiple sounds and the wide variety of possible letter combinations complicates the task. Most studies evaluating student performance on phoneme segmentation tasks do not measure this skill until well into kindergarten or the beginning of first grade (Felton & Pepper, 1995; Good et al., 2001; Juel, 1988; Kaminski & Good, 1996; Rouse & Fantuzzo, 2006). Creating a measure that can maintain predictive strength throughout kindergarten for a skill that changes rapidly over the year is difficult: a single outcome measure used to monitor progress over the course of a school year must avoid floor effects on a task that is hard at the beginning of the year while also avoiding ceiling effects on a task that is relatively easy for many kindergartners at the end of the year.

Developers of the Dynamic Indicators of Basic Early Literacy Skills (DIBELS; Good & Kaminski, 2003) addressed the problem by creating a measure of first-sound choices (Initial Sound Fluency; ISF) for early assessments and a production task requiring sequential segmentation (Phoneme Segmentation Fluency; PSF) for assessments toward the end of kindergarten. The problem with this solution is that it creates a measurement shift that makes it difficult to gauge progress continuously. A single measure that would not need to be changed during the school year would be more desirable: a continuous measure of performance would create more stability.

Several approaches could ease the problem of floor effects on segmenting measures early in kindergarten. For example, Spector (1992) developed a measure of segmenting in which the examiner offered increasingly supportive feedback on test items and scored each item depending on the level of support the child needed to respond (i.e., 0–6 points per word). The measure included two-phoneme and three-phoneme words. In an experimental comparison with simple (i.e., respondents say the phonemes) and complex (i.e., respondents delete a phoneme) static measures, her measure was a stronger predictor of end-of-kindergarten segmenting ability and word reading. The measure is referred to as a "dynamic" measure because results incorporate changes in students' responding as a function of feedback and instruction during the assessment itself.

Segmenting words into three phonemes is challenging because it requires respondents to verbally isolate three distinct sounds within a single word. Typical readers may perform poorly on a 3-phoneme segmenting task because they do not understand the directions or because they lack either the prerequisite skills or experience in word play. A measure of 3-phoneme segmenting given early in kindergarten may be helpful for identifying children who will be poor readers later in school, but it will also probably over-identify poor readers, because many children with experience and good kindergarten instruction will learn to segment on time with their peers. Spector's measure addressed this problem by providing extensive modeling (including hand-over-hand movement of disks to represent speech phonemes) and scoring items variably, depending on the level of support needed.
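To make the idea of scaled, support-dependent scoring concrete, the sketch below encodes a 0–6 rubric in Python. The specific support levels and point values are illustrative assumptions, not Spector's published rubric; only the 0–6 range and the principle (more examiner support earns fewer points) come from the description above.

```python
# Hypothetical sketch of scaled scoring for a dynamic segmenting item.
# The rubric below is an illustrative assumption: the same correct
# response earns fewer points as more examiner support is required.
SUPPORT_LEVELS = {
    "independent": 6,     # child segments the word with no help
    "repeat_prompt": 5,   # word repeated or directions restated
    "partial_model": 3,   # examiner models the first phoneme
    "full_model": 1,      # examiner models all phonemes, child imitates
    "no_response": 0,     # child cannot respond even after a full model
}

def score_item(support_needed: str) -> int:
    """Return the 0-6 score for one word, given the support required."""
    return SUPPORT_LEVELS[support_needed]

def score_test(responses) -> int:
    """Sum the scaled scores over all two- and three-phoneme words."""
    return sum(score_item(r) for r in responses)

print(score_test(["independent", "partial_model", "no_response"]))  # 9
```

The key design point is that a child who needs heavy support still earns partial credit, which spreads scores out at the low end and eases the floor effect of all-or-nothing static items.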
This kind of scaled scoring of each item helps to ease the floor effects found on static measures given early in kindergarten (Catts, Petscher, Schatschneider, Bridges, & Mendoza, 2009); however, her dynamic measure required 15 to 20 min per child to administer, and its reliability was not determined. O'Connor et al. (O'Connor, Jenkins, & Slocum, 1995; O'Connor & Jenkins, 1999) verified that tasks that were easy enough to avoid floor effects early in kindergarten, such as first-sound identification or onset-rime tasks, were poor predictors of reading skill late in kindergarten.

Based on the success of Spector's dynamic assessment method, O'Connor and Jenkins (1999) addressed task difficulty level and corrective feedback in their combined phonemic segmenting test (CPST). They began the test with onset-rime level segmenting, which is accessible to typical 4- and 5-year-olds (Slocum, O'Connor, & Jenkins, 1993), and continued with segmenting 3-phoneme words for children who performed well on the easier task. They provided modeling and feedback for each item that was incorrect or partially correct, with variable scoring of each response. They found that CPST administered early in kindergarten was a significant predictor of end-of-first-grade reading achievement. Moreover, it retained predictive power when administered late in kindergarten and required about 5 min to administer. Tasks that had seemed predictive early in kindergarten, such as syllable deletion and letter naming, ceased to predict first-grade reading significantly when they were administered in the spring of kindergarten. Other types of kindergarten measures, such as rhyming, blending, receptive language, and memory for sounds, were not significant predictors of first-grade achievement after segmenting was accounted for, whether administered early or late in kindergarten, despite their significant correlations with later reading achievement.
Like the measures developed by Spector (1992) and O'Connor and Jenkins (1999), DIBELS includes variable scoring for each item on PSF; however, ISF does not, and neither DIBELS measure is dynamic, because examiners do not provide feedback or additional prompts after the practice trials, which may contribute to the observed floor effects when the measures are used early in kindergarten (Catts et al., 2009). For students who have difficulty understanding the instructions, it seems reasonable to hypothesize that providing ongoing support through modeling and practice trials during test administration might yield a more accurate view of what they understand about segmenting spoken words.

Additionally, most of the research on the ability of phoneme awareness tasks to predict later reading skills was conducted with monolingual English-speaking students or did not disaggregate the findings for ELLs (Good et al., 2001; Rouse & Fantuzzo, 2006). Data are beginning to accumulate from more diverse populations, demonstrating that phonemic awareness transfers across languages and is predictive of reading in one's second language (Lesaux & Siegel, 2003; Lesaux, Rupp, & Siegel, 2007; Quiroga, Lemos-Britton, Mostafapour, Abbott, & Berninger, 2002). More research is needed to determine how well initial sound fluency and phoneme segmentation measures predict later reading skills for ELLs.

English language learners

The groups at highest risk for reading difficulties include children from low-income and minority families and children with limited English proficiency (Snow, Burns, & Griffin, 1998). The number of children in the United States and Canada who speak a language other than English has risen rapidly in recent decades and continues to grow. English language learners (ELLs) typically exhibit lower academic achievement, particularly in literacy, than their non-ELL peers (Peregoy & Boyle, 2000; Slavin & Cheung, 2005; Snow et al., 1998).
Longitudinal data show that children from low-income and minority families not only begin school with lower reading skills but also fail to close the gap in reading even after years of academic instruction (Davison, Seo, Davenport, Butterbaugh, & Davison, 2004). Placement in special education may be a more complex issue for ELLs than for non-ELLs due to numerous cultural and linguistic factors that contribute to poor academic achievement (Bernhardt, 2003). In multilingual educational settings, differentiating normal second language reading acquisition from signs of reading failure is particularly challenging (Geva, 2000; Wilkinson, Ortiz, Robertson, & Kushner, 2006). Schools often overlook or delay addressing the possibility that ELLs may be having difficulties due to a reading disability as opposed to a lack of English language proficiency (Lesaux & Siegel, 2003). Specifically, general education teachers sometimes hesitate to refer ELLs to special education because they are unable to determine whether the students' difficulties in learning to read stem from second language acquisition issues or from disabilities (Klingner, Artiles, & Barletta, 2006). While many school districts opt to wait until a student's language proficiency has developed before determining whether a learning disability is present, researchers are demonstrating that early intervention reduces later academic difficulties (Mathes & Torgesen, 1998; O'Connor, Fulmer, Harty, & Bell, 2005; Quiroga et al., 2002). Research indicates that ELLs who start out seriously delayed in reading can make substantial gains in a short period of time if identified early and given an empirically supported reading intervention (Healy, Vanderwood, & Edelston, 2005; Justice & Pullen, 2003; O'Connor, Bocian, Beebe-Frankenberger, & Linklater, in press; Quiroga et al., 2002).


Changes in federal legislation allowing schools to use Response-to-Intervention (RtI) models to identify and treat academic performance problems may provide ELLs with additional support within general education without waiting for students to fail for a long period of time. RtI evaluates the degree to which a student identified as at risk for academic or behavior problems benefits from an empirically supported intervention (Linan-Thompson, Vaughn, Prater, & Cirino, 2006). Students who make expected gains are considered responsive to instruction and are expected to continue to make progress when adequate instruction is provided in the general education classroom. However, students who make minimal gains or do not meet criterion-level performance even after receiving empirically supported interventions are considered "poor responders". These students may need more intensive long-term instruction, such as special education. ELLs may perform poorly in school due to lack of exposure to or instruction in English, or due to learning problems. The RtI model may address the disproportionate numbers of ELLs referred for special education by reducing inappropriate referrals (Klingner, Sorrells, & Barrera, 2007).

The use of curriculum-based measurement is a critical component of the RtI model, because it allows local norms to be used in determining expected levels of performance and goals. To accomplish this aim, research is needed to determine which measures are most appropriate for monitoring student progress and to establish typical growth curves on these early literacy assessments for ELLs and their English Only classmates. In order to determine appropriate normative goals, typical student growth over a specified period must be ascertained.
Most studies analyzing student growth evaluate literacy with oral reading fluency measures, once students are expected to be capable of reading text (Deno, Fuchs, Marston, & Shin, 2001; Fuchs, Fuchs, Hamlett, & Walz, 1993; Hasbrouck & Tindal, 1992). Several studies (Lesaux et al., 2007; O'Connor & Jenkins, 1999; Ritchey & Speece, 2006) have demonstrated that growth in early literacy skills in kindergarten is related to word reading. However, data on early literacy measures have not been used to establish the rate of growth that is expected during the academic school year. Instead, criterion goals have been established for students at each grade level (Good et al., 2001). Normative slope data could be used to establish appropriate goals for student outcomes; however, rate of growth may vary throughout the school year. In a study of second-grade growth in oral reading fluency, Ardoin and Christ (2008) found a main effect for semester, such that fall-to-winter growth was greater than winter-to-spring growth. It would be useful to determine typical student growth over the kindergarten year on early literacy measures and to compare the growth of English Only and ELL students.

A second consideration is whether growth in early literacy skills varies sufficiently by gender to suggest different intervention needs for girls and boys. Diamond and Onwuegbuzie's (2001) study of reading achievement in elementary schools found higher achievement for girls than for boys. In 4th grade, the National Center for Education Statistics (NCES, 2007) identified significantly higher reading scores for girls than boys in California, as well as in other states. The advantage for girls was also apparent in kindergarten in McCoach, O'Connell, Reis, and Levitt's (2006) analysis of the Early Childhood Longitudinal Study (ECLS-K) data. Although MacMillan (2000) also reported an advantage for girls in his study of growth in reading rate, the difference was only equivalent to one month's growth.
In contrast to these studies, Scarborough (1995) found no significant contribution of gender toward predicting reading difficulties in 2nd grade. These mixed findings raise questions about whether generating separate norms by gender is meaningful in kindergarten and, if so, when during the year differences in level and slope become apparent.

The purpose of the current study

The current study evaluated the predictive validity of three types of phonemic awareness measures (initial sound fluency, phoneme segmentation fluency, and a combined phoneme segmentation task) for English Only and ELL kindergarten girls and boys using growth curve analysis and linear regression. While the first two measures (ISF and PSF) assess fluency and accuracy as a snapshot of the current level of performance, the third measure, the Combined Phoneme Segmentation Task (CPST), incorporates an element of responsiveness to instruction by assessing performance in the context of corrective feedback, modeling, and practice. Due to this difference, we expected CPST to be a more sensitive measure of PA in the fall of kindergarten than ISF or PSF, and perhaps a more accurate assessment of PA for ELLs, for whom the extensive modeling could be especially supportive. This study measured growth rates for three segmenting measures and compared the ability of these measures to predict kindergarten reading outcomes (nonsense word reading, word reading, and comprehension) of ELLs and non-ELLs. The research questions we investigated were: (1) What are the expected initial skill level and growth rate on ISF, PSF, and CPST over the kindergarten year, and to what degree do those measures vary? (2) Are Language Status (ELL versus EO) and Gender significant predictors of the initial score and growth in ISF, PSF, and CPST? (3) Do initial scores on ISF, PSF, and CPST account for incremental variance in future reading performance (Nonsense Word Fluency and Woodcock Reading Mastery Test scores) at the end of kindergarten? (4) Do Gender and Language Status account for variance in future reading performance at the end of kindergarten beyond that explained by initial ISF, PSF, and CPST scores?
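Research question 3 concerns incremental variance, which is typically assessed by comparing R² between nested regression models: fit the outcome on a baseline set of predictors, add the predictor of interest, and take the change in R². The sketch below illustrates that computation on synthetic data; the variable names and effect sizes are placeholders, not results from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic fall-of-kindergarten predictors and a spring outcome
# (placeholder data; coefficients chosen only for illustration).
isf = rng.normal(10, 4, n)
cpst = 0.5 * isf + rng.normal(0, 3, n)           # correlated predictors
nwf = 1.2 * isf + 0.8 * cpst + rng.normal(0, 5, n)

def r_squared(y, predictors):
    """R^2 from an OLS fit with an intercept column."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_base = r_squared(nwf, [isf])          # baseline model: ISF alone
r2_full = r_squared(nwf, [isf, cpst])    # full model: ISF + CPST
print(f"incremental R^2 for CPST over ISF: {r2_full - r2_base:.3f}")
```

Because the full model's column space contains the baseline model's, R² can only stay equal or increase; the size of the increase is what "incremental variance" quantifies.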
Method

Participants and setting

Students

The participants were all students enrolled in partial-day kindergarten at four elementary schools in two school districts in southern California. The districts served a lower socioeconomic population, with 31–84% of the students at the schools receiving free or reduced-price lunch (California Department of Education, 2006). There were 22 kindergarten classrooms (two to eight kindergarten classrooms at each school) with approximately 20 students per class, taught by 21 female teachers and 1 male teacher. The language of instruction in the schools was English. ELLs received 30 min per day of English Language Development (ELD) in English. The participants consisted of 401 students (206 males and 195 females; 289 English Only and 112 ELLs). The breakdown of student ethnicity at each school ranged from 50 to 54% Hispanic, followed by 20–31% White and 7–19% African American. According to the California Department of Education (CDE), the demographic characteristics of the participants are representative of California elementary schools (CDE, 2008).


Across schools, 25–29% of the participants were classified as ELLs, as determined by Limited English Proficient (LEP) scores on the California English Language Development Test (CELDT). The CELDT was developed by CTB/McGraw-Hill through a contract with the California Department of Education (CDE). The test is used to identify students who have limited English proficiency and to determine those students' level of English language proficiency in the areas of listening and speaking in English. The CELDT was administered by school staff at the beginning of kindergarten. Internal consistency estimates of reliability of scores for the reading subtests range from .85 to .91. CELDT scores range from 1 to 5, with 5 representing advanced English proficiency. Students who scored within the limited English proficiency range (levels 1–3) are considered ELLs by the CDE. For the purposes of this study, students were classified into two groups (ELL and English Only) on the basis of the school's classification system. Approximately 15% of the participants across the four schools (N = 60; 31 males and 41 ELLs) participated in a tier-2 literacy intervention program for 45 min per week as part of a larger research study evaluating students' response to intervention. Students were selected for intervention based on low performance on measures of early literacy and vocabulary.

The subsample for predicting end-of-year outcomes

Additional parent permission was obtained for measuring reading outcomes on a norm-referenced assessment. Due to the time involved in test administration near the end of the school year, the researchers determined that a subsample would be sufficient to evaluate standardized reading outcomes. First, the sample was stratified by English Only and ELL students. Next, 50 students were randomly selected from each group to yield a similar number of English Only and ELL students. Students who returned the permission forms were included in the subsample.
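The subsample selection described above amounts to stratified random sampling: split the roster by language status, then draw the same number from each stratum. A minimal sketch, with a hypothetical roster of (id, status) pairs standing in for the actual student data:

```python
import random

# Hypothetical roster mirroring the study's group sizes (289 EO, 112 ELL);
# the ids and encoding are illustrative, not the study's actual data.
roster = [(i, "EO") for i in range(289)] + [(289 + i, "ELL") for i in range(112)]

def stratified_sample(roster, per_group=50, seed=1):
    """Randomly draw `per_group` students from each language-status stratum."""
    rng = random.Random(seed)
    sample = []
    for status in ("EO", "ELL"):
        stratum = [s for s in roster if s[1] == status]
        sample.extend(rng.sample(stratum, per_group))  # without replacement
    return sample

subsample = stratified_sample(roster)
print(len(subsample))  # 100
```

Stratifying first guarantees balanced group sizes, which a simple random draw from the full roster (roughly 72% EO) would not.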
The subsample consisted of 86 students (42 males and 46 English Only). Of these students, 25 participated in the tier-2 intervention (14 males and 21 ELLs).

Reading curriculum

The adopted district-wide reading program was the Houghton Mifflin Reading Program (HMR; Houghton Mifflin Reading, 2002). The philosophical foundation of the HMR program was developed from a compilation of peer-reviewed research supporting evidence that oral language, knowledge of letter names, phonological awareness, and concepts of print are important factors in learning to read (National Reading Panel, 2000; Snow et al., 1998).

Measures

Initial sound fluency

The Dynamic Indicators of Basic Early Literacy Skills (DIBELS) test of Initial Sound Fluency (ISF) is a standardized, individually administered measure of phonological awareness that assesses a child's ability to verbally produce the initial sound in an orally presented word (Kaminski & Good, 1996). This measure assesses accuracy and fluency. The examiner presented four pictures to the child, named each picture, and then asked the child to identify the picture that began with the sound produced orally by the examiner. The child was also asked to produce the beginning sound for an orally presented word that matched one of the given pictures. The examiner calculated the amount of time taken to verbally produce the correct sound and converted the score into the number of initial sounds correct in 1 min. The ISF measure took about 3 min to administer and has over 20 alternate forms for monitoring progress. ISF can be given as early as preschool through the winter of kindergarten. ISF was measured in the fall (approximately week 5 of school), winter (approximately the 5th month of school), and spring (approximately the 9th month of school). Alternate-form reliability for ISF is .72 in January of kindergarten (Good, Simmons, & Kame'enui, 2001). ISF has been shown to correlate .36 with the Woodcock–Johnson Psycho-Educational Battery Readiness Cluster score (Good et al., 2001) and .48 with the DIBELS PSF in the winter of kindergarten (Good et al., 2004). With respect to spring-of-first-grade reading, ISF has been shown to correlate .38 with CBM oral reading fluency and .36 with the Woodcock–Johnson Psycho-Educational Battery Total Reading Cluster score (Good et al., 2004).

Phoneme segmentation fluency

The DIBELS PSF measure is a standardized, individually administered test of phonemic awareness (Kaminski & Good, 1996). PSF assesses a student's ability to verbally segment three- and four-phoneme words into their individual phonemes. Similar to ISF, this measure assesses accuracy and fluency. The PSF task was administered by an examiner who orally presented words of three to four phonemes. The student was asked to produce the individual phonemes for each word. After the student responded, the examiner presented the next word; the total number of correct phonemes produced in 1 min constituted the final score. The PSF measure took about 2 min to administer and has over 20 alternate forms for monitoring progress. PSF was measured in the fall, winter, and spring. Only English PSF probes and English directions were used. PSF can be given as early as the winter of kindergarten through the spring of first grade.
The two-week alternate-form reliability for the PSF measure is .88 (Kaminski & Good, 1996). PSF has been shown to correlate .54 with the Woodcock–Johnson Psycho-Educational Battery Readiness Cluster score in the spring of kindergarten (Good et al., 2004; Good et al., 2001). PSF administered in the spring of kindergarten has been shown to correlate .68 with the spring-of-first-grade Woodcock–Johnson Psycho-Educational Battery Total Reading Cluster score (Good et al., 2001).

Combined phoneme segmentation task

A combined phoneme segmentation task was also used (O'Connor et al., 1995), which required students to segment 10 words into onset and rime (t—op) and 5 words into isolated phonemes (m—ai—n). This measure was individually administered and assessed a student's ability to verbally segment 1-syllable words into smaller-than-syllable units, including first sound, onset-rime, and isolated phonemes. Following 3 modeled and practiced words for onset-rime segmenting ("I can say two sounds in soap. Watch me: s—oap. You try it."), the examiner orally presented 10 words and asked the student to produce two sounds in each word as onset and rime. The examiner scored the child's response, and then provided feedback on every item, which consisted of praise for correct responses or modeling and practice for partially correct, incorrect, or missing responses. For example, if the examiner said "make" and the student said /m/, the examiner scored 1 point for a correct word part and said: "Yes, make starts with /m/. I can say two sounds: /m/—/ake/. Try it." The examiner encouraged the child to repeat the correct response. Each of the first 10 words was worth 0–2 points: 1 point was awarded for each of any two correct phonemes, or 2 points for the onset-rime segmented form of the word. We did not penalize children who segmented words into 3 isolated phonemes during the onset-rime task, but awarded the full 2 points for all such responses (e.g., 2 points for m—ake or for m—a—ke). For students who produced any correct responses on the first 10 words, the examiner modeled 3-phoneme segmentation for 3 practice trials, and five additional words were presented with 0 to 3 points awarded per word. The student attempted to produce the individual phonemes for each word; for example, the examiner said boat, and the student said b—oa—t for 3 points. The total possible score was 35 (i.e., 10 words × 2 points each + 5 words × 3 points each). CPST was measured in the fall, winter, and spring. The split-half reliability (Spearman–Brown corrected) was .96. The fall-of-kindergarten CPST correlates .42 with the spring-of-kindergarten WRMT-R Word Identification and Word Attack subtests (O'Connor & Jenkins, 1999).

Nonsense Word Fluency

The DIBELS test of Nonsense Word Fluency (NWF) is a standardized, individually administered test of letter-sound correspondence in the English alphabet. The student was shown an 8.5″ × 11″ sheet of paper with randomly ordered vowel-consonant (VC) and consonant-vowel-consonant (CVC) nonsense words (e.g., "aj," "lut") and asked to produce the individual sound of each letter or to read the whole nonsense word. The final score was the number of letter sounds produced correctly in 1 min. Students were assessed with NWF in the spring only. The 1-month alternate-form reliability for NWF in January of first grade is .83 (Good et al., 2004). NWF (administered in January of first grade) has been shown to correlate .66 with the Woodcock–Johnson Psycho-Educational Battery Total Reading Cluster score (administered in May; Good et al., 2004).
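The CPST point rules described above can be expressed as a small scoring sketch. Only the point values (0–2 per onset-rime word, 0–3 per 3-phoneme word, maximum of 35, full credit for over-segmenting on the onset-rime items) come from the text; the function names and response encoding are illustrative.

```python
# Sketch of the CPST scoring rules described in the text.

def score_onset_rime(correct_parts: int, fully_segmented: bool = False) -> int:
    """0-2 points for an onset-rime word: 1 point per correct part, with
    the full 2 points for a complete onset-rime split OR for segmenting
    the word into all isolated phonemes (over-segmenting is not penalized)."""
    if fully_segmented:
        return 2
    return min(correct_parts, 2)

def score_phonemes(correct_phonemes: int) -> int:
    """0-3 points for a 3-phoneme word: 1 point per correctly isolated phoneme."""
    return min(correct_phonemes, 3)

# Maximum possible score: 10 onset-rime words + 5 phoneme words.
max_score = 10 * 2 + 5 * 3
print(max_score)  # 35
```

The variable per-item credit is what distinguishes CPST from an all-or-nothing static item: a child who produces only /m/ for "make" still registers measurable partial knowledge.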
Woodcock Reading Mastery Test-Revised Normative Update

The Woodcock Reading Mastery Test-Revised Normative Update (WRMT-R/NU; Woodcock, 1998) is a norm-referenced battery of tests for kindergarten through age 75 that provides a measure of overall reading achievement. The WRMT-R/NU was individually administered, and the Total Reading Short Scale score comprised two subtests: Word Identification and Passage Comprehension. The WRMT-R/NU was administered in the spring. The split-half reliabilities for the Total Reading Cluster, Word Identification subtest, and Passage Comprehension subtest are .98, .98, and .94, respectively, in first grade.

Procedures

Training procedures

Doctoral students were trained as part of a larger research study to administer and score the measures. Each graduate student was provided with a set of written instructions detailing how the reading probes were to be administered and scored. All examiners watched appropriate test administrations and were observed by at least one other examiner as part of the inter-rater agreement data collection (see below).


Administration procedures

All students were assessed with the ISF, PSF, and CPST measures in the fall, winter, and spring. In the spring of kindergarten, students were also assessed with NWF. The WRMT-R/NU was administered to the subsample in the spring. Inter-scorer agreement data were collected on 10% of the probes administered (randomly selected). A simple formula for calculating inter-scorer agreement [Agreements / (Agreements + Disagreements) × 100] was used, resulting in a range from 96% to 99% total percentage agreement across all measures.

Data analysis

Multilevel growth modeling and regression analysis were used to answer the research questions.

Multilevel growth model descriptions and equations

Because our general purpose was to model changes in outcomes over time, a growth model was used. Moreover, because children are nested in classrooms, which violates the independence-of-observations assumption and can result in biased parameter estimates if ignored, a 3-level growth model was employed. The levels are a measurement model at Level 1, a child model at Level 2, and a classroom model at Level 3. Besides protecting against parameter mis-estimation due to violation of independence, including the classroom level provides an indication of the degree to which the classroom context, as opposed to individual differences, impacts growth in reading skills. Separate 3-level growth models were used for each of the 3 segmenting-skill outcomes. Below we introduce the equations for this model using the multilevel modeling notation developed by Raudenbush and Bryk (2002). In its most basic form, the individual growth trajectories are represented by intercept and slope parameters in what may be referred to as the measurement model:

Level 1 (measurement):

Y_tij = π_0ij + π_1ij a_tij + e_tij,    e_tij ~ N(0, σ²).    (1)

The repeated measurements on the outcome, Y, are conceptualized as nested within children, who are nested within classrooms, hence the three-level model. The notation in Eq. (1) uses subscripts to describe this data structure: Ytij indicates the observed value of the respective outcome variable at time t for individual i in classroom j; π0ij is the initial status, or expected value of Y for individual i when time equals zero; π1ij is the growth rate, or expected change on the outcome per unit change in time for individual i; atij is a predictor variable measuring time within each individual; and etij denotes the residuals or errors associated with this relationship, which are assumed to have a mean of zero and to be normally distributed.²

² Note that the time variable was coded so that 0 corresponds to the first time point, 1 corresponds to the second, and 2 corresponds to the third. Therefore, the intercept or initial status is the expected score on the respective outcome at the first time point, which was collected a few weeks after the start of kindergarten.
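As a concrete illustration of Eq. (1) and the time coding described in footnote 2, the following sketch simulates three waves of scores for one hypothetical child and recovers that child's intercept and slope by ordinary least squares. All parameter values here are invented for illustration; they are not the study's estimates.

```python
import numpy as np

# Level-1 (measurement) model of Eq. (1) for a single hypothetical child.
rng = np.random.default_rng(42)

pi0, pi1 = 6.0, 10.0            # hypothetical initial status and per-wave growth
a = np.array([0, 1, 2])         # time coding from footnote 2 (fall, winter, spring)
e = rng.normal(0, 2.0, size=3)  # residuals e_tij ~ N(0, sigma^2)
y = pi0 + pi1 * a + e           # observed scores Y_tij

# An OLS line through the three points recovers the child's intercept and slope.
slope, intercept = np.polyfit(a, y, 1)
print(round(intercept, 1), round(slope, 1))   # near 6.0 and 10.0
```

With only three time points per child the per-child estimates are noisy, which is why the multilevel model pools them across children and classrooms rather than fitting each child in isolation.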


The measurement model generates an intercept (π0ij) and slope (π1ij) for each child, which become the outcomes at Level 2, the child-level model. This yields the following Level-2 equations:

Level 2 (children):

π0ij = β00j + Σ(q=1 to Q) β0qj Xqij + r0ij
π1ij = β10j + Σ(q=1 to Q) β1qj Xqij + r1ij,    rij ~ N(0, Tπ).    (2)

In the intercept equation, β00j is the mean initial status, or the mean of the child intercepts in classroom j, which can be interpreted as the expected value on the outcome for classroom j when time equals zero. Similarly, in the slope equation, β10j represents the average rate of change, or the expected change in classroom j on the outcome per unit change in time. The intercept and slope equations may also include a set of Q covariates, represented by the Xs, each of which has an associated coefficient describing the linear relationship between the covariate and the intercept or slope outcome. Both the intercept and slope equations have a random error term, notated r0ij and r1ij. These random effects are assumed to be normally distributed, to have a mean of zero, and to covary. Tπ represents the covariance matrix of random effects at Level 2, which consists of the variance in r0ij (τπ00), the variance in r1ij (τπ11), and the covariance between r0ij and r1ij (τπ10).

The Level 3 model is for classroom effects. In the present study we do not have a substantive interest in classroom effects; rather, the classroom model is included to account for variation in the children's growth trajectories due to teacher effectiveness, classroom composition, and other classroom effects. If the classroom level is ignored, that variation is distributed through the rest of the model, potentially contaminating other parameter estimates; in other words, the model is misspecified. The mean initial status (β00j) and mean growth rate (β10j) for each classroom are the outcomes in the classroom-level model, which can be represented by Eq. (3):

Level 3 (classroom):

β00j = γ000 + u0j
β10j = γ100 + u1j,    uj ~ N(0, Tu).    (3)

Note that the classroom model is an unconditional model, meaning that there are no covariates. γ000 represents the mean of the classroom intercept estimates, and γ100 represents the mean of the classroom growth estimates (i.e., the grand mean intercept and the grand mean growth rate, respectively). As we saw with the Level-2 model, both the intercept and slope equations have a random error term associated with them, now notated by u0j and u1j. Again, these random effects are assumed to be normally distributed, have a mean of zero, and to covary. Tu represents the covariance matrix of random effects at Level 3 consisting of the variance in u0j (τu00), the variance in u1j (τu11), and the covariance


between u0j and u1j (τu10). Conceptually, this matrix quantifies the extent to which the average intercepts and the average growth rates vary across classrooms.

The model-building strategy was as follows. We first estimated the unconditional three-level growth model for each outcome. The results of the unconditional models provided a baseline for model fit and were used to compute the percent of the total variance in initial status and growth that was between classrooms, that is, attributable to classroom effects rather than individual differences. We also examined the classroom-level variance components of this model for statistical significance to test whether the classroom level was necessary: if those variance components are not statistically greater than zero, classroom effects do not contribute to differences in the growth trajectories, and the classroom level can be omitted. Next, the conditional model was estimated for each outcome. The conditional model included two dummy-coded predictors, one for Gender and one for Language Status, which were student measures (rather than classroom measures) and therefore were entered into the Level-2 model.³ These predictors were included in both the intercept and slope models because the intercepts and slopes covary, and thus including them in only one and not the other could affect parameter estimates.

Regression model descriptions and equations
Regression analysis was used to address research questions 3 and 4 regarding the incremental predictive power of the three tests of reading segmentation skill, Gender, and Language Status. Note that because we were only interested in whether successive predictors entered into the model accounted for additional variance in the outcome, and not whether that variance was at the child level or the classroom level, multiple regression was suitable for our purpose. The formulation for this model is shown in Eq. (4):

Yi = β0 + Σ(p=1 to P) βp Xpi + ei,    ei ~ N(0, σ²).    (4)

Yi is the outcome variable; β0 is the intercept, defined as the expected value of the outcome when all the predictors equal 0; Σ is summation notation indicating that there are P predictor variables and associated coefficients, represented by X and β respectively, each having an additive relationship with the outcome; and ei is the residual, which has a mean of 0 and is assumed to be normally distributed.

The model-building strategy was as follows. Because we wanted to test whether CPST explained additional variation beyond the widely used DIBELS measures, the DIBELS measures were entered into the model first. We entered ISF in step one and PSF in step two, followed by CPST in step three. Lastly, we entered Gender and then Language Status to

³ The original data collection design included an intervention that was not substantively related to the focus of the present study. While the intervention was implemented with child groups balanced on gender and language status and was not believed to be associated with those factors, we wanted to make sure it did not bias the estimates of the effects of gender and language status on growth in the reading outcomes. We therefore ran the models with and without a dummy variable indicating whether the child received the intervention. A comparison of these models verified that the intervention did not change the results for gender and language status. We therefore present only the results for the simpler model that did not include the intervention variable.


examine whether they account for variance in NWF and WRMT above and beyond the segmentation measures.

Results

As an index of how students performed on the kindergarten measures of ISF, PSF, and CPST in the fall, winter, and spring, as well as on the outcome measures of NWF and WRMT-R/NU in the spring, the mean scores, standard deviations, and percentiles are presented in Table 1.

Predictors of initial status and growth in ISF, PSF, and CPST
The results of the unconditional model (see Table 2) address research question 1. At the beginning of kindergarten the average child was able to identify 5.82 (t(21) = 13.84, p < .01) initial sounds per min on ISF and 7.67 (t(21) = 10.27, p < .01) phonemes per min on PSF, and scored 16.88 (t(21) = 13.66, p < .01) on the CPST. These mean initial status scores (γ000) are all significantly greater than zero. Moreover, on average children increased their scores significantly during kindergarten on each reading skill measure, with average gains (γ100) between testing periods (approximately 15 weeks) of 10.87 (t(21) = 18.03, p < .01) for ISF, 19.66 (t(21) = 17.81, p < .01) for PSF, and 7.54 (t(21) = 17.04, p < .01) for CPST. The initial status for each outcome varied significantly across children (ISF τπ00 = 15.10, χ²(323) = 391.42, p < .05; PSF τπ00 = 52.69, χ²(323) = 430.16, p < .01; CPST τπ00 = 122.62, χ²(323) = 1321.22, p < .01), but the mean initial status for classrooms did not, except for CPST (ISF τβ00 = 0.32, χ²(21) = 24.50, p > .05; PSF τβ00 = 1.42, χ²(21) = 25.12, p > .05; CPST τβ00 = 23.00, χ²(21) = 68.67, p < .01). With the exception of the classroom estimate for CPST, these results were expected. Previous research has shown that children enter kindergarten with varying levels of reading skills (Lee & Burkam, 2002). However, the mean initial status is not expected to vary between classrooms because students are generally not tracked in kindergarten and the demographic characteristics of the schools in our sample were similar.
Hence, the classroom means were expected to be highly similar when children entered kindergarten. The growth rates varied across children on two of the three outcomes (ISF τπ10 = 17.37, χ²(323) = 490.62, p < .01; PSF τπ10 = 13.43, χ²(323) = 333.42, p > .05; CPST τπ10 = 12.10, χ²(323) = 481.80, p < .01) and between classrooms on all three outcomes (ISF τβ10 = 5.25, χ²(21) = 64.53, p < .01; PSF τβ10 = 21.38, χ²(21) = 106.32, p < .01; CPST τβ10 = 1.98, χ²(21) = 40.90, p < .01). The significant estimates of the child-level variances suggest that individual differences in the children's backgrounds and abilities contributed to growth in reading skills. It is worth noting that the magnitude of the child-level growth trajectory variance estimate for PSF was similar to those for ISF and CPST; however, because the repeated measures for PSF had far greater measurement error, the PSF growth variance was not significant. The significant variance in growth rates between classrooms suggests that classroom characteristics affect the rate of change in reading skills. This comes as no surprise, given that previous research has shown that an array of classroom effects impact student learning, with the practices and attitudes of the teacher being the most prominent variables (Palardy & Rumberger, 2008). Finally, the unconditional model results show 14 to 61% of the

Table 1
Means, SDs, and percentiles for EO and ELL students.

Overall
Measure        Mean (SD)        25th    50th    75th
Fall ISF       5.63 (6.55)      0       4       9.5
Winter ISF     18.24 (20.32)    8.7     15.3    25
Spring ISF     27.43 (14.72)    17.1    25.3    35.5
Fall PSF       6.46 (11.80)     0       0       8
Winter PSF     30.13 (20.41)    11      35      46
Spring PSF     45.55 (18.78)    35.5    49      58
Fall CPST      16.93 (13.28)    1.5     15      30
Winter CPST    25.34 (12.34)    17      32      35
Spring CPST    32.19 (7.73)     33      35      35
Spring NWF     32.09 (18.51)    20      29      42
Spring WRMT    100.93 (11.39)   92.3    101.5   107
Spring WI      12.36 (11.13)    4.3     8       19
Spring PC      4.92 (5.1)       1.25    3       7

EO students
Measure        Mean (SD)        25th    50th    75th
Fall ISF       6.17 (6.85)      0       4.8     10
Winter ISF     18.36 (12.07)    9.3     17.1    25.7
Spring ISF     28.23 (14.50)    18      26      35.9
Fall PSF       7.18 (12.36)     0       0       10
Winter PSF     30.89 (20.38)    12      35      47
Spring PSF     46.85 (18.61)    36.3    50.3    59.8
Fall CPST      17.24 (13.17)    2       15      30
Winter CPST    26.22 (11.87)    19.8    33      35
Spring CPST    32.49 (8.06)     33      35      35
Spring NWF     33.51 (19.06)    21      31      44
Spring WRMT    103.76 (9.93)    98      103     109.5
Spring WI      14.67 (11.44)    6       12      19.5
Spring PC      5.29 (5.09)      2       3       7

ELL students
Measure        Mean (SD)        25th    50th    75th
Fall ISF       4.23 (5.50)      0       2.1     7.2
Winter ISF     17.96 (32.40)    6.4     13.2    20.7
Spring ISF     25.62 (15.11)    15.3    23.4    32.8
Fall PSF       4.62 (10.05)     0       0       3.8
Winter PSF     28.32 (20.46)    8       30      45
Spring PSF     42.58 (18.93)    31.5    43      56.5
Fall CPST      16.13 (13.58)    0       14.5    30
Winter CPST    23.27 (13.21)    12      30      35
Spring CPST    31.52 (6.91)     31.5    35      35
Spring NWF     28.83 (16.82)    17      26      40
Spring WRMT    98.23 (12.13)    90      101     104
Spring WI      10.15 (10.47)    2       6       15
Spring PC      4.57 (5.14)      1       2       7

Note. ISF = Initial Sound Fluency, PSF = Phoneme Segmentation Fluency, CPST = Combined Phoneme Segmentation Task, NWF = Nonsense Word Fluency, WRMT = Woodcock Reading Mastery Test, WI = Word Identification subtest, PC = Passage Comprehension subtest.

Table 2
Predictors of ISF, PSF, and CPST growth and initial status during kindergarten.

                                      Unconditional                   Conditional
                                      ISF      PSF      CPST          ISF      PSF      CPST
Coefficient estimates
Initial status
  Mean (γ000)                         5.82**   7.67**   16.88**       6.06**   6.78**   15.30**
  Gender (γ010)                       –        –        –             1.17     3.65*    5.08**
  Language Status (γ020)              –        –        –             −2.55**  −2.60    −2.47
Growth
  Mean (γ100)                         10.87**  19.66**  7.54**        9.77**   18.63**  8.09**
  Gender (γ110)                       –        –        –             2.58**   2.70*    −1.47**
  Language Status (γ120)              –        –        –             −0.28    −0.68    0.42

Variance components
Within child
  Measurement error (σ²)              49.42**  139.96** 46.40**       48.99**  139.74** 46.10**
Child level
  Initial status (τπ00)               15.10**  52.69**  122.62**      14.62*   49.85**  116.41**
  Growth (τπ10)                       17.37**  13.43    12.10**       15.73**  11.29    12.14**
Classroom level
  Initial status (τβ00)               0.32     1.42     23.00**       0.58     1.49     21.69**
  Growth (τβ10)                       5.25**   21.38**  1.98**        4.76**   20.56**  1.92**
Percent of the total variance between classrooms
  Initial status                      2.10     2.62     15.79**       3.82     2.90     15.71**
  Growth                              23.21**  61.42**  14.06**       23.23**  64.55**  13.66**

Model summary
  Deviance statistic                  7596.52  8547.76  7447.35       7569.44  8520.57  7427.93
  Number of parameters                9        9        9             13       13       13
  Deviance change (χ²Δ)               –        –        –             27.08**  27.19**  19.42**

* p < .05. ** p < .01.

total variance in reading skill growth is between classrooms, depending on the outcome. Again, this suggests a substantial classroom effect.

The conditional model tests whether Gender and Language Status are associated with initial status and growth on the outcomes (research question 2). Note that the intercept terms (γ000 and γ100) are now conditioned on the Gender and Language Status indicators. As a result, they are interpreted as the expected initial status and rate of growth on the outcomes for children coded zero on both indicators, which corresponds to boys classified as English Only. The model summary results presented in Table 2 show that the addition of those two indicators significantly reduced the deviance statistic for all three models, which indicates that adding those predictors significantly improved model fit to the data (ISF χ²(4) = 27.08, p < .01; PSF χ²(4) = 27.19, p < .01; CPST χ²(4) = 19.42, p < .01).⁴

⁴ Full maximum likelihood estimation was used, which is appropriate for likelihood ratio test model comparisons based on deviance statistics.
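The deviance comparison above is a likelihood ratio test: the drop in deviance is referred to a chi-square distribution with 4 degrees of freedom (the four added parameters). A quick check of the reported values, together with the between-classroom share of ISF growth variance from Table 2, might look like this (a sketch using SciPy's chi-square survival function):

```python
from scipy.stats import chi2

# Likelihood ratio tests for the conditional vs. unconditional models,
# using the deviance changes reported in the text (df = 4 added parameters).
for label, dev_change in [("ISF", 27.08), ("PSF", 27.19), ("CPST", 19.42)]:
    p_value = chi2.sf(dev_change, df=4)
    assert p_value < .01          # all three comparisons are significant at .01
    print(label, round(p_value, 5))

# Percent of growth variance between classrooms for ISF (unconditional model):
# tau_beta10 / (tau_pi10 + tau_beta10), matching the 23.21% in Table 2.
pct_between = 100 * 5.25 / (17.37 + 5.25)
print(round(pct_between, 2))      # 23.21
```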


This suggests that jointly Gender and Language Status account for a significant amount of variance in the repeated measurements of each outcome. The dummy-coded Gender indicator estimates the degree to which girls differ from boys on the outcome. The coefficients show that girls tended to begin kindergarten with significantly higher phoneme segmentation skills than boys (PSF γ010 = 3.65, t(342) = 2.23, p < .05; CPST γ010 = 5.08, t(342) = 4.12, p < .01); however, Gender was not significantly associated with initial ISF scores (γ010 = 1.17, t(342) = 1.58, p > .05). Girls' reading skills increased at significantly greater rates than boys' on ISF (γ110 = 2.58, t(342) = 3.13, p < .01) and PSF (γ110 = 2.70, t(342) = 2.43, p < .05), but boys increased their CPST scores faster during kindergarten (γ110 = −1.47, t(342) = −2.51, p < .01). These results indicate that by the end of kindergarten girls had average scores 6.33 higher on ISF, 8.05 higher on PSF, and 2.14 higher on CPST, which correspond to boys lagging behind girls by approximately 30%, 20%, and 7% of a school year.⁵ The coefficient on the dummy-coded Language Status indicator estimates the degree to which children classified as ELL differ from those classified as English Only. ELL children scored significantly lower on ISF at the beginning of kindergarten (ISF γ020 = −2.55, t(342) = −3.44, p < .01), but Language Status was not associated with initial PSF or CPST scores (PSF γ020 = −2.60, t(342) = −1.83, p > .05; CPST γ020 = −2.47, t(342) = −1.55, p > .05). However, in each case children classified as ELL tended to have lower initial scores compared with children classified as English Only. Growth in segmentation skills was not associated with Language Status on any of the outcomes (ISF γ120 = −.28, t(342) = −.45, p > .05; PSF γ120 = −.68, t(342) = −.60, p > .05; CPST γ120 = .42, t(342) = .66, p > .05).

Incremental predictive powers of segmentation scores on NWF and WRMT-R/NU
The effects of the predictor variables on NWF were examined first (see Table 3). ISF fall produced a significant effect, F(1, 339) = 87.43, p < .001, R² = .21. PSF fall, entered second, had a significant additive effect, F(1, 338) = 49.23, p < .01, ΔR² = .02, accounting for an additional 2% of the variance in NWF scores. CPST fall, entered next, was also significant over ISF and PSF fall scores, F(1, 337) = 47.13, p < .001, ΔR² = .07, bringing the total to 30% of the variance in spring-of-kindergarten NWF performance accounted for. The addition of Gender and Language Status did not account for significantly more variance. Because CPST fall accounted for a significant amount of variance above the DIBELS measures, additional regression equations were run with CPST fall entered first, followed by ISF fall and PSF fall. When CPST fall was entered first in the regression equation, it accounted for 23% of the variance in NWF performance. ISF fall added 6% more variance; however, PSF fall, entered last, accounted for no additional variance. The addition of Gender and Language Status did not produce a significant change in R² (p > .05), indicating that after the scores on the measures were considered, Gender and Language Status did not account for additional variance in NWF performance.

⁵ The gender difference at the end of kindergarten is computed by adding the gender difference in initial status to two times the gender difference in growth rate (γ010 + 2·γ110). The end-of-kindergarten gender difference in the percent-of-school-year metric is computed by dividing that difference by two times the mean growth rate from the unconditional model ((γ010 + 2·γ110) / (2·γ100)).
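Footnote 5's computation can be checked directly for ISF, using the conditional estimates from Table 2 (a gender effect of 1.17 on initial status, 2.58 on growth, and an unconditional mean growth of 10.87 per testing period):

```python
# End-of-kindergarten gender difference on ISF, per footnote 5.
gamma_010 = 1.17          # gender difference in initial status (Table 2)
gamma_110 = 2.58          # gender difference in growth rate per period
gamma_100 = 10.87         # unconditional mean growth per period

gap = gamma_010 + 2 * gamma_110        # difference after two growth periods
share_of_year = gap / (2 * gamma_100)  # expressed as a fraction of a school year

print(round(gap, 2), round(share_of_year, 2))   # 6.33 0.29, i.e., ~30% of a year
```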


Table 3
Hierarchical linear regression models for fall predictor variables.

Nonsense Word Fluency (N = 345)

Entry order: ISF, PSF, CPST, Gender, Language Status
Step  Predictor added   df (model, residual)  MS        F           ΔR²
1     ISF fall          1, 339                23871.32  87.43***    .21***
2     PSF fall          2, 338                13130.64  49.23***    .02**
3     CPST fall         3, 337                11470.03  47.13***    .07***
4     Gender            4, 336                8765.64   36.20***    .01
5     Language Status   5, 335                7112.08   29.46***    .00

Entry order: CPST, ISF, PSF, Gender, Language Status
1     CPST fall         1, 339                27124.50  102.96***   .23***
2     ISF fall          2, 338                17005.22  69.74***    .06***
3     PSF fall          3, 337                11470.03  47.13***    .00
4     Gender            4, 336                8765.64   36.20***    .01
5     Language Status   5, 335                7112.08   29.46***    .00

WRMT-R/NU (N = 92)

Entry order: ISF, PSF, CPST, Gender, Language Status
1     ISF fall          1, 86                 4057.99   48.27***    .36***
2     PSF fall          2, 85                 2109.51   25.36***    .01
3     CPST fall         3, 84                 1931.24   29.52***    .14***
4     Gender            4, 83                 1470.19   22.57***    .01
5     Language Status   5, 82                 1254.27   20.50***    .04*

Entry order: CPST, ISF, PSF, Gender, Language Status
1     CPST fall         1, 86                 3786.81   43.41***    .34***
2     ISF fall          2, 85                 2612.16   36.61***    .13***
3     PSF fall          3, 84                 1931.24   29.52***    .05**
4     Gender            4, 83                 1470.19   22.57***    .01
5     Language Status   5, 82                 1254.27   20.50***    .04*

* p < .05. ** p < .01. *** p < .001.


Next, the effects of the fall predictor variables on WRMT-R/NU were examined (see Table 3, bottom). ISF fall, entered at the first step, produced a significant effect, F(1, 86) = 48.27, p < .001, R² = .36. PSF fall, entered next, did not produce a significant change in R². CPST fall, in the third step, produced a significant effect, F(1, 84) = 29.52, p < .001, ΔR² = .14. Together, ISF fall and CPST fall accounted for 51% of the variance in WRMT-R/NU performance. When CPST fall was entered first in the regression equation, it accounted for 34% of the variance in WRMT-R/NU performance, with an additional 13% accounted for by ISF. PSF fall, entered last, accounted for an additional 5% of the variance in WRMT-R/NU scores. The addition of Gender did not produce statistically significant results. The addition of Language Status, entered last, produced a significant effect, accounting for an additional 4% of the variance. This effect suggests that Language Status accounted for a small amount of variance in WRMT-R/NU performance at the end of kindergarten. Therefore, fall literacy predictors were regressed separately for English Only and ELL students (see Table 4). Both ISF and CPST were significant predictors of the outcome measure for English Only and ELL students. ISF and CPST accounted for more variance for English Only students (52%) than for ELLs (44%).
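The hierarchical entry procedure used in these analyses (fitting nested OLS models and examining the change in R² at each step) can be sketched as follows. The data here are simulated stand-ins for the fall predictors and spring outcome; only the procedure, not the numbers, mirrors the study.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on X (an intercept column is added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Simulated stand-ins for the fall predictors and spring outcome.
rng = np.random.default_rng(7)
n = 345
isf = rng.normal(size=n)
psf = 0.6 * isf + rng.normal(size=n)    # predictors correlate, as in practice
cpst = 0.5 * isf + rng.normal(size=n)
nwf = 2.0 * isf + 1.0 * cpst + rng.normal(size=n)

# Enter predictors hierarchically: ISF, then PSF, then CPST.
steps = [np.column_stack([isf]),
         np.column_stack([isf, psf]),
         np.column_stack([isf, psf, cpst])]
r2 = [r_squared(X, nwf) for X in steps]
delta_r2 = [r2[0]] + [later - earlier for earlier, later in zip(r2, r2[1:])]
print([round(v, 3) for v in delta_r2])   # per-step increments in variance explained
```

Because the models are nested, R² can only increase at each step; the question at each step is whether the increment is large (and statistically significant) enough to matter.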

Table 4
Hierarchical linear regression models for fall predictor variables (outcome variable = WRMT-R/NU).

English Only students (N = 45)

Entry order: ISF, PSF, CPST
Step  Predictor added   df (model, residual)  MS       F          ΔR²
1     ISF fall          1, 41                 1783.62  31.05***   .43***
2     PSF fall          2, 40                 915.07   15.85***   .01
3     CPST fall         3, 39                 761.64   16.02***   .11**

Entry order: CPST, ISF, PSF
1     CPST fall         1, 41                 1398.16  20.91***   .34***
2     ISF fall          2, 40                 1061.97  21.08***   .18***
3     PSF fall          3, 39                 761.64   16.02***   .04

English language learners (N = 47)

Entry order: ISF, PSF, CPST
1     ISF fall          1, 43                 2011.64  19.38***   .31***
2     PSF fall          2, 42                 1032.78  9.84***    .01
3     CPST fall         3, 41                 1044.55  12.81***   .17**

Entry order: CPST, ISF, PSF
1     CPST fall         1, 43                 2177.24  21.78***   .34***
2     ISF fall          2, 42                 1405.94  16.12***   .10*
3     PSF fall          3, 41                 1044.55  12.81***   .05

* p < .05. ** p < .01. *** p < .001.


Differential predictive powers by Language Status
We also used regression to further address research question 4 by investigating the extent to which ISF fall, PSF fall, and CPST fall differentially predicted reading performance for English Only students and ELLs (see Table 5). Overall, results for the fall predictors of end-of-kindergarten reading were similar for English Only students and ELLs, with two exceptions. First, ISF accounted for less variance in reading outcomes for ELLs than for English Only students. This result is not surprising, because ISF was designed with English-speaking children in mind. More surprising was the finding that CPST predicted similarly for English Only students and ELLs, and was the strongest predictor for ELLs on both reading outcomes.
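For the simple regressions in Table 5, the reported statistics are internally consistent: with a single predictor, R² can be recovered from the t statistic and residual degrees of freedom as R² = t² / (t² + df), and the standardized β is its square root. Checking the English Only NWF row for ISF fall:

```python
# R^2 and standardized beta recovered from a reported t statistic and
# residual df (English Only, NWF outcome, ISF fall: t = 7.83, df = 244).
t_stat, df_resid = 7.83, 244
r2 = t_stat ** 2 / (t_stat ** 2 + df_resid)
beta = r2 ** 0.5
print(round(r2, 2), round(beta, 2))   # 0.2 0.45, matching Table 5
```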

Discussion

The purpose of the current study was to provide empirical evidence of reading segmentation skill growth trajectories during kindergarten, as measured by ISF, PSF, and CPST, and of the predictive powers of those measures at the start of kindergarten on subsequent reading skills for English Only students and ELLs. We used growth modeling in this study for two reasons: (a) to identify early literacy trajectories of kindergarten students, and (b) to compare ELLs' early literacy growth curves with those of English Only kindergarten students, as well as to compare the growth curves of boys and girls. The purpose of the regression analyses was to determine how well these measures predicted later reading skills.

Table 5
Separate linear regression models for fall predictor variables.

NWF outcome
Source            N     df      t         β     R²
English Only
  ISF fall        246   1, 244  7.83***   .45   .20***
  PSF fall        246   1, 244  6.63***   .39   .15***
  CPST fall       246   1, 244  8.54***   .49   .24***
English language learners
  ISF fall        107   1, 105  5.01***   .44   .19***
  PSF fall        107   1, 105  3.37**    .31   .10**
  CPST fall       103   1, 101  5.40***   .47   .22***

WRMT-R/NU outcome
Source            N     df      t         β     R²
English Only
  ISF fall        45    1, 43   5.71***   .66   .43***
  PSF fall        45    1, 43   2.07*     .30   .09*
  CPST fall       45    1, 43   4.57***   .58   .39***
English language learners
  ISF fall        46    1, 42   4.33**    .56   .31***
  PSF fall        46    1, 44   1.24      .18   .03
  CPST fall       44    1, 42   4.89***   .60   .36***

* p < .05. ** p < .01. *** p < .001.


Modeling growth in phonemic awareness
First, this study provides evidence to support a growth model for the ISF, PSF, and CPST measures in kindergarten. Most data on early literacy measures establish criterion goals for students at each grade level (Good et al., 2001). This study extends previous research by providing estimates of the expected rate of growth that students achieved per week during the kindergarten year. Growth from October to January can be especially important, because students can score poorly at the beginning of the year due to factors that have little to do with learning potential, such as difficulties with language, sparse exposure to printed words, or interest in activities unrelated to literacy. For the sample overall, growth was slightly higher from October to January than later in the year (see Table 1), with growth per week of approximately 0.8 initial sounds (ISF), 1.6 phonemes (PSF), and 0.6 word segments (CPST). We were most interested in the growth of students who began kindergarten with poorly developed skills, because these children are the most likely candidates for more intensive reading instruction. For ELLs in the bottom quartile, higher growth per week was found in the second than in the first half of the year. Although this difference may be due to gradual improvement in English, it creates potential problems for early identification. The rate of improvement in the first semester was highest for these low-skilled ELLs on the CPST (0.8 segments per week), which suggests that this measure may be useful for monitoring the literacy development of students whose language skills are low. Across the measures, Language Status was not significantly associated with the rate of skill acquisition (i.e., growth rate). Additionally, Language Status was not significantly associated with initial skill level for PSF or CPST.
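The per-week growth figures cited above can be reproduced from the overall fall and winter means in Table 1, assuming roughly 15 weeks between testing periods:

```python
# Approximate fall-to-winter growth per week from the overall means in Table 1.
weeks_between_testings = 15
fall_winter_means = {"ISF": (5.63, 18.24),
                     "PSF": (6.46, 30.13),
                     "CPST": (16.93, 25.34)}

for measure, (fall, winter) in fall_winter_means.items():
    per_week = (winter - fall) / weeks_between_testings
    print(measure, round(per_week, 1))   # ISF 0.8, PSF 1.6, CPST 0.6
```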
However, for ISF, the analyses demonstrated that ELLs initially scored significantly lower than English Only students (by approximately 2.5 initial sounds correct per min), which corroborates previous research indicating that ELLs tend to enter school with lower reading skills (Snow et al., 1998). Of the segmenting measures used here, ISF relied most heavily on English vocabulary, which was needed to remember previously identified picture names. For this reason, ELLs may have more difficulty on this measure when first entering kindergarten, before exposure to English language and vocabulary has been established. Across the measures, Gender had a significant association with initial level of performance (PSF and CPST) and rate of growth (ISF, PSF, and CPST). For PSF and CPST, the analyses demonstrated that girls initially scored higher than boys (approximately 3.7 and 5.1 points higher, respectively). This result is similar to other studies examining gender differences in reading acquisition (McCoach et al., 2006). Gender gaps in reading have been reported in the United States for over 15 years (NCES, 2007), and are also found worldwide (Marks, 2008). Still, school personnel need to understand that even when gender differences are found, they tend to be small, with little impact on responsiveness to instruction (Meece, Glienke, & Burg, 2006). On ISF and PSF, girls grew at a faster rate than boys (2.5 and 2.7 points faster per semester, respectively). However, on CPST the opposite was seen: boys grew at a faster rate than girls (approximately 1.5 points per semester). Differences between the measures may explain these trends. First, unlike PSF and ISF, CPST is not fluency-based, which allows students to spend more time thinking about the task and their possible responses.


Second, feedback, modeling, and practice for incorrect responses help some students improve their understanding of the skill during administration of the test. This corrective feedback provides instruction directly on the skill being tested, which can improve the readiness of less prepared children to engage in the tasks. While this level of support may be unnecessary for higher skilled children, it may have been helpful for students who began kindergarten with less developed literacy understandings, in particular the boys and ELLs. Third, unlike the timed measures, CPST had a maximum score of 35. Girls scored higher initially and therefore did not have as much room to grow as boys did.

Predicting kindergarten literacy outcomes
The variance explained by ISF in the fall of kindergarten was consistent with previous studies (Good & Kaminski, 2002; Good et al., 2001), and demonstrated that ISF at the beginning of kindergarten significantly predicted and accounted for variability on end-of-kindergarten measures of nonsense words, word identification, and reading comprehension. This study extends those findings by demonstrating that ISF at the beginning of kindergarten significantly predicts later reading performance for ELLs. While studies have demonstrated that phonological awareness transfers across languages (Denton, Hasbrouck, Weaver, & Riccio, 2000; Lesaux & Siegel, 2003; Lindsey, Manis, & Bailey, 2003; Manis, Lindsey, & Bailey, 2004), most evaluated how well Spanish tasks predict English outcome measures. This study extends those results and concurs with Quiroga et al.'s (2002) finding that English phonemic awareness is related to learning to read English when one's first language is Spanish and reading instruction is in English. This result is encouraging for educators who plan to assess ELLs as early as the beginning of kindergarten in order to determine which ELLs may need additional reading support.
The PSF measure has been shown to be a stronger predictor of later reading performance than ISF (Kaminski & Good, 1996; Good et al., 2001; Rouse & Fantuzzo, 2006); however, most studies evaluating DIBELS PSF and other phoneme segmentation fluency tasks do not administer the task until late in kindergarten or the beginning of first grade (Felton & Pepper, 1995; Good et al., 2001; Rouse & Fantuzzo, 2006). Additionally, most research on the ability of phoneme awareness fluency tasks to predict later reading skills involves monolingual English-speaking students or does not disaggregate findings for ELLs (Good et al., 2001; Rouse & Fantuzzo, 2006). The results of this study indicated that PSF given in the beginning of kindergarten accounted for only 2% additional variance above ISF on NWF performance and did not account for significant additional variance above ISF on WRMT-R/NU. Furthermore, with ELL students, fall PSF accounted for only 10% of the variance on NWF and no significant variance on the WRMT-R/NU. These results indicate that PSF given in the beginning of kindergarten does not predict later reading performance significantly better than ISF for English Only or ELL students.

Most intriguing was the finding that a phonemic awareness accuracy measure capturing features of both ISF and PSF given early in kindergarten (i.e., CPST) significantly added to the prediction of later reading achievement. Fall CPST scores accounted for a significant amount of variance in both NWF and WRMT-R/NU scores at the end of kindergarten for English Only students and ELLs, which gives teachers an additional measure to consider in the beginning of kindergarten to account for variability in end-of-kindergarten reading. CPST may account for more variance than PSF in end-of-kindergarten reading due to its combined level of difficulty, low receptive language requirement, and corrective feedback procedure. This finding is consistent with earlier studies (e.g., Spector, 1992) suggesting that measures that include corrective feedback (i.e., dynamic measures) may be more predictive of future outcomes than static measures. CPST may capture some level of responsiveness to instruction that is missing in static early literacy measures such as the DIBELS. A comparison of fall scores on the three predictors lends support to this possibility: among the three predictors, CPST was the least prone to floor effects. Moreover, for English learners, the modeling of the task demands during the feedback stage likely reduced reliance on comprehension of the English directions.

Limitations

The generality of our findings could be limited by specific features of our sample. First, all participating schools were located in low to low-middle income communities. Furthermore, due to the large ELL population, ELLs in the study may have received more instruction directed to their needs than in schools where the ELL population is small. The ELLs in our sample were Spanish-speaking children, and we did not differentiate them by level of language proficiency in English (i.e., CELDT scores for ELLs ranged from 1–5) or Spanish. Although we were encouraged to find that growth rates were similar for ELLs and English Only students, ELLs with home languages that are less phonetic than Spanish, or non-phonetic, may have growth trajectories different from our sample. Additionally, all teachers across the four schools used the same kindergarten curriculum. How that curriculum was used probably varied widely, because we found strong teacher effects, especially on PSF, so the growth rates generated on the measures used here could differ with other instructional approaches.
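The floor-effect comparison among the fall predictors can be made concrete with a simple summary: the proportion of zero scores and the lower percentiles of each measure. The scores below are invented for illustration, not the study's data; they merely mimic the reported pattern of a strong floor on the timed fluency tasks and a usable spread on the CPST accuracy scale.

```python
import numpy as np

def floor_summary(scores):
    """Summarize floor effects: share of zero scores and low percentiles."""
    scores = np.asarray(scores, dtype=float)
    return {
        "prop_zero": float(np.mean(scores == 0)),
        "p25": float(np.percentile(scores, 25)),
        "p50": float(np.percentile(scores, 50)),
    }

# Illustrative fall scores (not the study's data): the fluency task piles up
# at zero, while the 0-35 accuracy task spreads students out.
psf_fall  = [0, 0, 0, 0, 0, 0, 2, 5, 9, 14]
cpst_fall = [0, 2, 5, 8, 12, 15, 18, 22, 27, 33]

print(floor_summary(psf_fall))   # heavy floor: most students score zero
print(floor_summary(cpst_fall))  # few zeros, informative lower percentiles
```

A screener with a heavy floor cannot distinguish among the very students it is meant to flag, which is why the wider usable range of an accuracy measure matters for early identification.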
General conclusions and implications

The normative data from this study provide a starting point for documenting expected student growth on ISF, PSF, and CPST and for determining appropriate growth goals across the year. It is important to note that we found no significant differences between growth rates for ELLs and English Only students on any of our predictors. While significant differences between growth rates for boys and girls were apparent, gender did not contribute significantly to predicting later reading outcomes, which suggests that educators should focus on the desired outcomes and provide the appropriate instruction for achieving them regardless of Language Status or Gender.

Overall, ISF in the beginning of kindergarten had higher validity coefficients for our sample (spring NWF = .45 and WRMT-R/NU = .60) than reported in other studies (e.g., .36 with the Woodcock–Johnson Psycho-Educational Battery Readiness Cluster score in Good et al.'s 2001 study). Furthermore, CPST had higher correlations with NWF (.48) and WRMT-R/NU (.59) performance than did ISF, accounting for an additional 7% and 15% of the variance, respectively. Often it is difficult for educators to find tests that accurately assess beginning ELLs' skills due to the impact of limited English language skills.
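The incremental-variance results reported here follow the usual hierarchical-regression logic: fit a baseline model, add a predictor, and examine the change in R². The sketch below runs that comparison on simulated data; the variable names and the data-generating model are assumptions for illustration only, not the study's data or analysis code.

```python
import numpy as np

def r_squared(X, y):
    """R^2 for ordinary least squares with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(0)
n = 200
isf = rng.normal(size=n)                            # simulated fall ISF
cpst = 0.6 * isf + rng.normal(size=n)               # fall CPST, correlated with ISF
nwf = 0.5 * isf + 0.4 * cpst + rng.normal(size=n)   # simulated spring outcome

r2_base = r_squared(isf[:, None], nwf)              # ISF alone
r2_full = r_squared(np.column_stack([isf, cpst]), nwf)  # ISF + CPST
delta = r2_full - r2_base                           # variance added by CPST
```

Because the models are nested and include an intercept, R² can never decrease when a predictor is added; the substantive question is whether the increment is statistically and practically meaningful, which in the study was evaluated with significance tests of the added predictor.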


However, ISF and CPST given in the beginning of kindergarten were shown to predict later word reading and reading comprehension for ELLs.

Several aspects of the results compel us to consider more carefully what initial status and growth across the year imply for researchers, teachers, and school psychologists who are interested in screening for possible reading difficulties and intervening based on screening. First, bear in mind that ELLs and boys began the year significantly below English Only children and girls on these measures. Notice also that floor effects were evident, particularly for ISF and PSF. As evidence, students in the bottom 25th percentile scored 0 on both measures, and students at the 50th percentile did not score much better (e.g., 4 and 0, respectively, for the total sample, and 2 and 0 for ELLs). While students in the bottom 25th percentile also scored poorly on the CPST (mean 1.5), students at the 50th percentile scored 15, with no difference between ELLs and English Only students. The improved range of the CPST may help educators make more accurate decisions about students who may need more careful monitoring over the first few months of school.

Growth curves are becoming widely used in educational and psychological research, and for good reason: they are ideal for studying processes that take place over time. The growth model is flexible, and based on our experiences conducting the present study we offer some recommendations for optimizing its usefulness. First, the linear growth model requires only 3 repeated measurements on the outcome, but having more than 3 allows greater modeling flexibility. For example, with additional measurements the form of the curve need not be restricted to linear; other plausible non-linear forms, such as quadratic or piecewise, can be tested. These model extensions can help shed new light on the data.
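The payoff of collecting more than three occasions can be illustrated with a quick model comparison. The sketch below fits linear and quadratic curves to one simulated student's four scores; the time points and scores are invented for illustration, and a full analysis would use a multilevel growth model across students as in the study.

```python
import numpy as np

def trajectory_sse(times, scores, degree):
    """Sum of squared residuals for a polynomial growth curve of given degree."""
    coefs = np.polyfit(times, scores, degree)
    fitted = np.polyval(coefs, times)
    return float(np.sum((scores - fitted) ** 2))

# Four occasions (months into kindergarten) for one simulated student whose
# growth accelerates mid-year -- a pattern a straight line understates.
times = np.array([0.0, 2.0, 4.0, 7.0])
scores = np.array([1.0, 3.0, 12.0, 30.0])

sse_linear = trajectory_sse(times, scores, degree=1)
sse_quad = trajectory_sse(times, scores, degree=2)
# Nested least-squares fits: the quadratic can only fit as well or better.
# The substantive question is whether the improvement justifies the extra term.
```

With only three occasions the quadratic would fit every student exactly and the comparison would be uninformative; a fourth occasion is the minimum that lets curvature be tested rather than assumed.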
Collecting measurements near the beginning and end of the period under consideration (kindergarten in the present study) is important for obtaining reliable growth estimates. However, in cases where a child is developmentally unprepared for testing with a given instrument, reliability of the growth estimate can be improved by collecting additional measures after he or she has had more exposure to instruction in language and reading domains. For example, some non-native English speakers may enter kindergarten with insufficient reading skills to be tested reliably with a given instrument, but they may be ready one or more months later, especially where instruction includes a focus similar to the instrument. In that case, the reliability of their growth trajectories may improve considerably if additional data are collected one or two months after initial screening, even if that is not part of the original protocol.

It could be tempting to attribute initial differences to language difficulties for the ELLs (suggesting we should wait to intervene until language development in English has improved) and to well documented developmental differences in reading acquisition between girls and boys. However, taking either stance could be a disservice to students. It is important to note that the growth rates for these groups were strong in the first semester of kindergarten: the growth trajectory from October to January was steep for most kindergartners. Given this rapid progress on segmenting skills, especially for the lowest performing students on the CPST, educators should consider screening early in the year and screening low-scoring students again 4 to 8 weeks later for evidence of growth.

Our findings regarding prediction of outcomes suggest that educators could use ISF and/or CPST in the beginning of kindergarten to identify students, including ELLs, who need additional reading instruction to reduce the risk of poor kindergarten outcomes, regardless of the student's English language proficiency. Overall, however, ELLs had great difficulty engaging in ISF in the fall and performed better on the CPST, even though both measures were designed to capture students' understanding of the first sound in words. These differences may be due to the difficulty of naming and retaining names for the pictures used on ISF. The extensive modeling and practice on the CPST may have helped students focus on the sounds in words without regard to their meanings. Based on these findings, schools could screen early to identify those ELLs at risk for reading failure, then monitor and re-evaluate risk over the first semester of kindergarten. Other studies have demonstrated that early identification and intervention can have a significant impact on later reading performance (Vellutino et al., 1996; Morris, Tyner, & Perney, 2000). School personnel no longer have to wait until an ELL student has failed in reading or gained English proficiency before intervening. The results of this study provide further evidence and tools in support of screening all students early in their schooling to identify those who may need reading intervention, regardless of Language Status.

References

Ardoin, S. P., & Christ, T. J. (2008). Evaluating curriculum-based measurement slope estimates using data from triannual universal screenings. School Psychology Review, 37, 109–125.
Bernhardt, E. (2003). Challenges to reading research from a multilingual world. Reading Research Quarterly, 38, 112–117.
California Department of Education. (2006). Sacramento, CA: www.cde.ca.gov
California Department of Education. (2008). Sacramento, CA: www.cde.ca.gov
Catts, H., Petscher, Y., Schatschneider, D., Bridges, M., & Mendoza, K. (2009). Floor effects associated with universal screening and their impact on the early identification of reading disabilities. Journal of Learning Disabilities, 42, 163–176.
Davison, M. L., Seo, Y. S., Davenport, E.
C., Butterbaugh, D., & Davison, L. J. (2004). When do children fall behind? What can be done? Phi Delta Kappan, 85, 752–761.
Deno, S. L., Fuchs, L. S., Marston, D., & Shin, J. (2001). Using curriculum-based measurement to establish growth standards for students with learning disabilities. School Psychology Review, 30, 507–524.
Denton, C. A., Hasbrouck, J. E., Weaver, L. R., & Riccio, C. A. (2000). What do we know about phonological awareness in Spanish? Reading Psychology, 21, 335–352.
Diamond, P. J., & Onwuegbuzie, A. J. (2001). Factors associated with reading achievement and attitudes among elementary school-aged students. Research in the Schools, 8(1), 1–11.
Felton, R. H., & Pepper, P. P. (1995). Early identification and intervention of phonological deficits in kindergarten and early elementary children at risk for reading disabilities. School Psychology Review, 24, 405–414.
Fuchs, L. S., Fuchs, D., Hamlett, C., & Walz, L. (1993). Formative evaluation of academic progress: How much growth can we expect? School Psychology Review, 22, 27–48.
Geva, E. (2000). Issues in the assessment of reading disabilities in L2 children: Beliefs and research evidence. Dyslexia, 6, 13–28.
Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills (6th ed.). Eugene, OR: Institute for the Development of Educational Achievement.
Good, R. H., & Kaminski, R. A. (2003). DIBELS: Dynamic indicators of basic early literacy skills (6th ed.). Longmont, CO: Sopris West.
Good, R. H., Kaminski, R. A., Shinn, M., Bratten, J., Shinn, M., Laimon, L., et al. (2004). Technical adequacy and decision making utility of DIBELS (Technical Report No. 7). Eugene, OR: University of Oregon.
Good, R. H., Simmons, D. C., & Kame'enui, E. J. (2001). The importance and decision-making utility of a continuum of fluency-based indicators of foundational reading skills for third grade high-stakes outcomes. Scientific Studies of Reading, 5, 257–288.
Good, R. H., Simmons, D. C., & Smith, S. B. (1998). Effective academic interventions in the United States: Evaluating and enhancing the acquisition of early reading skills. School Psychology Review, 27, 45–56.


Hasbrouck, J. E., & Tindal, G. (1992). Oral reading fluency norms: A valuable assessment tool for reading teachers. Reading Teacher, 59, 636–644.
Healy, K., Vanderwood, M., & Edelston, D. (2005). Early literacy interventions for English language learners: Support for an RTI model. The California School Psychologist, 10, 55–64.
Houghton Mifflin Reading. (2002). A research-based framework for Houghton Mifflin Reading, Grades K-8. Retrieved April 7, 2005, from www.eduplace.com/marketing/nc/framework.html
Individuals with Disabilities Education Improvement Act (2004).
Juel, C. (1988). Learning to read and write: A longitudinal study of 54 children from first through fourth grades. Journal of Educational Psychology, 80, 437–447.
Justice, L. M., & Pullen, P. C. (2003). Promising interventions for promoting emergent literacy skills: Three evidence-based approaches. Topics in Early Childhood Special Education, 23, 99–114.
Kame'enui, E. J. (2002). Final report on the analysis of reading assessment instruments for K-3. Retrieved July 5, 2006, from http://idea.uoregon.edu/assessment.html
Kaminski, R. A., & Good, R. H. (1996). Towards a technology for assessing basic early literacy skills. School Psychology Review, 25, 215–227.
Klingner, J. K., Artiles, A. J., & Barletta, L. M. (2006). English language learners who struggle with reading: Language acquisition or learning disabilities? Journal of Learning Disabilities.
Klingner, J. K., Sorrells, A. M., & Barrera, M. T. (2007). Considerations when implementing response to intervention with culturally and linguistically diverse students. In D. Haager, J. Klingner, & S. Vaughn (Eds.), Evidence-based reading practices for response to intervention. Baltimore, MD: Brookes Publishing.
Lee, V. E., & Burkam, D. T. (2002). Inequality at the starting gate: Social background differences in achievement as children begin school. Washington, DC: Economic Policy Institute.
Lesaux, N. K., Rupp, A. A., & Siegel, L. S. (2007). Growth in reading skills of children from diverse linguistic backgrounds: Findings from a 5-year longitudinal study. Journal of Educational Psychology, 99, 821–834.
Lesaux, N. K., & Siegel, L. S. (2003). The development of reading in children who speak English as a second language. Developmental Psychology, 39, 1005–1019.
Linan-Thompson, S., Vaughn, S., Prater, K., & Cirino, P. T. (2006). The response to intervention of English language learners: At-risk for reading. Journal of Learning Disabilities, 39, 390–398.
Lindsey, K. A., Manis, F. R., & Bailey, C. E. (2003). Prediction of first-grade reading in Spanish-speaking English-language learners. Journal of Educational Psychology, 95, 482–494.
MacMillan, P. (2000). Simultaneous measurement of reading growth, gender, and relative-age effects: Many-faceted Rasch applied to CBM reading scores. Journal of Applied Measurement, 1, 393–408.
Manis, F. R., Lindsey, K. A., & Bailey, C. E. (2004). Development of reading in grades K–2 in Spanish-speaking English-language learners. Learning Disabilities Research and Practice, 19, 214–224.
Marks, G. N. (2008). Accounting for the gender gaps in student performance in reading and mathematics: Evidence from 31 countries. Oxford Review of Education, 34, 89–109.
Mathes, P. G., & Torgesen, J. K. (1998). All children can learn to read: Critical care for the prevention of reading failure. Peabody Journal of Education, 73, 317–340.
McCoach, D. B., O'Connell, A. A., Reis, S. M., & Levitt, H. A. (2006). Growing readers: A hierarchical linear model of children's reading growth during the first 2 years of school. Journal of Educational Psychology, 98, 14–28.
Meece, J. L., Glienke, B. B., & Burg, S. (2006). Gender and motivation. Journal of School Psychology, 44, 351–373.
Morris, D., Tyner, B., & Perney, J. (2000). Early Steps: Replicating the effects of a first grade reading intervention program. Journal of Educational Psychology, 92, 681–693.
National Center for Education Statistics (2007). The nation's report card. Retrieved January 21, 2009, from http://nces.ed.gov/nationsreportcard/pdf/stt2007/2007497CA4.pdf
National Reading Panel (2000). Teaching children to read: An evidence-based assessment of scientific research literature on reading and its implications for reading instruction. Bethesda, MD: National Institutes of Health.
O'Connor, R. E., Bocian, K., Beebe-Frankenberger, M., & Linklater, D. L. (in press). Responsiveness of students with language difficulties to early intervention in reading. Journal of Special Education. doi:10.1177/0022466908317789
O'Connor, R. E., Fulmer, D., Harty, K., & Bell, K. (2005). Layers of reading intervention in kindergarten through third grade: Changes in teaching and child outcomes. Journal of Learning Disabilities, 38, 440–455.
O'Connor, R. E., & Jenkins, J. R. (1999). The prediction of reading disabilities in kindergarten and first grade. Scientific Studies of Reading, 3, 159–197.


O'Connor, R. E., Jenkins, J. R., & Slocum, T. A. (1995). Transfer among phonological tasks in kindergarten: Essential instructional content. Journal of Educational Psychology, 87, 202–217.
Palardy, G. J., & Rumberger, R. W. (2008). Teacher effectiveness in first grade: The importance of background qualifications, attitudes, and instructional practices for student learning. Educational Evaluation and Policy Analysis, 30, 111–140.
Peregoy, S. F., & Boyle, O. F. (2000). English learners reading English: What we know, what we need to know. Theory Into Practice, 39, 237–247.
Quiroga, T., Lemos-Britton, Z., Mostafapour, E., Abbott, R. D., & Berninger, V. W. (2002). Phonological awareness and beginning reading in Spanish-speaking ESL first graders: Research into practice. Journal of School Psychology, 40, 85–111.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Ritchey, K. D., & Speece, D. L. (2006). From letter names to word reading: The nascent role of sublexical fluency. Contemporary Educational Psychology, 31, 301–327.
Rouse, H. L., & Fantuzzo, J. W. (2006). Validity of the dynamic indicators for basic early literacy skills as an indicator of early literacy for urban kindergarten children. School Psychology Review, 35, 341–355.
Scarborough, H. S. (1995, March). The fate of phonemic awareness beyond the elementary school years. Paper presented at the biennial meeting of the Society for Research in Child Development, Indianapolis, IN.
Slavin, R. E., & Cheung, A. (2005). A synthesis of research on language of reading instruction for English language learners. Review of Educational Research, 75, 247–284.
Slocum, T. A., O'Connor, R. E., & Jenkins, J. R. (1993). Transfer among phonological manipulation skills. Journal of Educational Psychology, 85, 618–630.
Snow, C. E., Burns, M. S., & Griffin, P. (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press.
Spector, J. E. (1992). Predicting progress in beginning reading: Dynamic assessment of phonemic awareness. Journal of Educational Psychology, 84, 353–363.
Torgesen, J. K. (2002). The prevention of reading difficulties. Journal of School Psychology, 40, 7–26.
Torgesen, J. K., Morgan, S. T., & Davis, C. (1992). Effects of two types of phonological awareness training on word learning in kindergarten children. Journal of Educational Psychology, 84, 364–370.
Torgesen, J. K., Wagner, R. K., Rashotte, C. A., Rose, E., Lindamood, P., Conway, T., et al. (1999). Preventing reading failure in young children with phonological processing disabilities: Group and individual responses to instruction. Journal of Educational Psychology, 91, 1–15.
Vanderwood, M., Linklater, D., & Healy, K. (2008). Nonsense word fluency and future literacy performance for English language learners. School Psychology Review, 37, 5–17.
Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Pratt, A., Chen, R., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 86, 601–638.
Wagner, R. K., Torgesen, J. K., Laughon, P., Simmons, K., & Rashotte, C. A. (1993). Development of young readers' phonological processing abilities. Journal of Educational Psychology, 85, 83–103.
Wilkinson, C. Y., Ortiz, A. A., Robertson, P. M., & Kushner, M. I. (2006). English language learners with reading-related learning disabilities: Linking data from multiple sources to make eligibility determinations. Journal of Learning Disabilities.
Woodcock, R. W. (1998). Woodcock Reading Mastery Test–Revised/Normative Update. Circle Pines, MN: American Guidance Service.