Journal of School Psychology 59 (2016) 13–38
Repeated versus wide reading: A randomized control design study examining the impact of fluency interventions on underlying reading behavior

Scott P. Ardoin a,⁎, Katherine S. Binder b, Tori E. Foster a, Andrea M. Zawoyski a

a University of Georgia, United States
b Mount Holyoke College, United States
Article info

Article history:
Received 9 June 2015
Received in revised form 13 May 2016
Accepted 6 September 2016
Available online xxxx

Keywords:
Repeated reading
Wide reading
Reading fluency
Prosody
Eye movements
Abstract

Repeated readings (RR) has garnered much attention as an evidence-based intervention designed to improve all components of reading fluency (rate, accuracy, prosody, and comprehension). Despite this attention, there is not an abundance of research comparing its effectiveness to other potential interventions. The current study presents the findings from a randomized control trial involving the assignment of 168 second-grade students to a RR, wide reading (WR), or business as usual condition. Intervention students were provided with 9–10 weeks of intervention with sessions occurring four times per week. Pre- and post-testing were conducted using Woodcock-Johnson III reading achievement measures (Woodcock, McGrew, & Mather, 2001), curriculum-based measurement (CBM) probes, measures of prosody, and measures of students' eye movements when reading. Changes in fluency were also monitored using weekly CBM progress monitoring procedures. Data were collected on the amount of time students spent reading and the number of words read by students during each intervention session. Results indicate substantial gains made by students across conditions, with some measures indicating greater gains by students in the two intervention conditions. Analyses do not indicate that RR was superior to WR. In addition to expanding the RR literature, this study greatly expands research evaluating changes in reading behaviors that occur with improvements in reading fluency. Implications regarding whether schools should provide more opportunities to repeatedly practice the same text (i.e., RR) or practice a wide range of text (i.e., WR) are provided.

© 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
⁎ Corresponding author. E-mail address: [email protected] (S.P. Ardoin).
Action Editor: Stephen Patrick Kilgus
http://dx.doi.org/10.1016/j.jsp.2016.09.002
0022-4405/© 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

1. Introduction

In order to comprehend text, students must develop proficiency in reading fluency. Fluent reading occurs when one reads text accurately with appropriate speed and expression (National Institute of Child Health and Human Development, [NICHD], 2000). LaBerge and Samuels (1974) bridged the relationship between comprehension and fluency with the theory of automatic information processing. The theory suggests that although the effort to read draws upon the brain's limited capacity for processing information, the effort requirement decreases with practice, allowing the brain to take on additional tasks such as comprehension.
Consequently, non-fluent readers must devote a substantial amount of attention to decoding text, which depletes the resources they have available to attend to the meaning of text. In contrast, fluent readers are able to decode words automatically, allowing them to devote sufficient attention to comprehension (Tannenbaum, Torgesen, & Wagner, 2006).

One highly recommended procedure for improving students' reading fluency is repeated readings (RR). Although there is abundant research suggesting that RR improves students' reading rate and accuracy on practiced passages (NICHD, 2000), there are few studies that examine the extent to which it results in generalized improvements in students' reading accuracy, rate, prosody, and comprehension. Likewise, despite evidence regarding the importance of reading fluency (Therrien, 2004), there is a dearth of research examining how reading behavior changes as students become more fluent readers. Unique assessment techniques, such as eye movement monitoring and spectrographic analysis of prosodic characteristics, may allow for a more sophisticated understanding of how interventions improve reading fluency and what features of instruction will yield the fastest progress. Fortunately, improvements in technology increase the feasibility with which such instruments can be employed within schools (Rayner, Ardoin, Binder, et al., 2013). The current study extends the existing body of research examining fluency-based interventions by employing randomized control design procedures and evaluating the effects of intervention across multiple measures: rate, comprehension, prosody, and eye movements.

1.1. Repeated reading (RR)

Samuels (1979, 1997) developed the method of repeated readings (RR), which was largely influenced by the theory of automatic information processing (LaBerge & Samuels, 1974).
Typically, RR requires students to reread a story a pre-established number of times, to a pre-established level of fluency, or to a pre-established percent above their baseline fluency level (Ardoin, McCall, & Klubnik, 2007; Samuels, 2006). The goals for RR are to increase students' reading speed, transfer learning to new passages, and improve comprehension (Samuels, 2006). According to the National Reading Panel (NICHD, 2000), RR is an effective instructional practice for improving the fluency of all students through fourth grade and struggling readers through high school. In a meta-analysis of the RR literature, Therrien (2004) reported that RR resulted in large increases in fluency (effect size [ES] = 0.83) and moderate increases in comprehension (ES = 0.67) on practiced passages. Observed improvements in accuracy and rate are suggested to be a result of multiple opportunities to practice reading the same words correctly (Haring & Eaton, 1978). With repeated exposure, the student receives more practice and recognizes the letters that make up the words more efficiently (Ehri, 1992; Leslie & Calhoon, 1995; Share & Stanovich, 1995).

1.2. RR: Promoting generalization

In addition to evidence supporting the use of RR for improving students' fluency on practiced passages, there is evidence suggesting that gains transfer to untrained passages (Chard, Vaughn, & Tyler, 2002; Meyer & Felton, 1999; Therrien, 2004). Transfer effects can be explained by Haring and Eaton's (1978) instructional hierarchy, which implies that once students develop sufficient fluency, they can effectively generalize their skills to new contexts. Ardoin et al. (2007) applied the instructional hierarchy in an examination of procedures for promoting generalization of skills learned during reading instruction. Students participated in a Multiple Exemplars (ME) condition, which involved reading multiple passages with high word overlap, as well as an RR condition.
Practice opportunities between conditions were equivalent. Transfer effects were evaluated on high-word-overlap generalization passages. Half of the students in the RR condition achieved greater or equivalent gains in reading fluency compared to students in the ME condition. Results were inconclusive for the other half of participants. Ardoin et al. hypothesized that because students in the ME condition did not have the opportunity to develop sufficient fluency, intervention gains did not generalize to the high-word-overlap assessment.

1.3. Limitations of RR

One notable limitation of RR is that the many claims for its utility in improving reading fluency and comprehension are based on reviews that combine studies with varying populations and intervention procedures. Therefore, it is difficult to determine which components have the largest impact on improvements in reading skills. For example, some studies incorporated experimenter-developed passages (e.g., O'Shea, Sindelar, & O'Shea, 1987; Rashotte & Torgesen, 1985) or single-case design procedures that failed to control for carry-over effects (e.g., Daly, Martens, Hamler, Dool, & Eckert, 1999; Eckert, Ardoin, Daly, & Martens, 2002). Other studies involved partner reading, incorporated instructional components beyond RR, or failed to control for the number of readings completed by students (e.g., Fuchs, Fuchs, & Kazdan, 1999; Vaughn et al., 2000; Wexler, Vaughn, Roberts, & Denton, 2010). Furthermore, claims are made regarding the beneficial effects of RR on prosody, a component of reading fluency rarely studied within the RR literature (Ardoin, Morena, Binder, & Foster, 2013). Without clear standards for implementation, RR is subject to distortion and misuse in educational settings.

Another limitation was identified in a recent What Works Clearinghouse (WWC; “What works clearinghouse”, 2014) intervention report on RR for students with learning disabilities.
Surprisingly, findings indicated that RR does not significantly impact alphabetics, general reading achievement, or reading fluency for students in this population. The report included only one pilot study using a single-case design (i.e., Steventon, 2004) and two group design studies (i.e., Ellis & Graves, 1990; Wexler et al., 2010) conducted with participants in upper elementary and secondary grades. Closer examination of methodology reveals that Ellis and Graves (1990) did not evaluate students' reading fluency despite the primary goal of RR being to improve reading fluency. Further, Wexler et al. (2010) utilized a peer-mediated intervention involving students with severe reading disabilities as
tutors and tutees. Despite unconventional procedures and measurement, the studies evaluated in the WWC review met rigorous design criteria, commanding attention from researchers evaluating RR. Ultimately, findings from the review may challenge the NRP's (NICHD, 2000) recommendation for using RR as a fluency intervention for struggling readers through high school.

Furthermore, despite the NRP's suggestion that RR is an effective instructional practice for all elementary students through fourth grade (NICHD, 2000), the majority of RR studies only examine the effects of RR on the reading performance of struggling readers (Chard et al., 2002; Therrien, 2004). Although all students might benefit from RR, elementary students who are not struggling in their development of reading fluency might benefit more from exposure to a greater amount of materials as opposed to repeated practice with the same materials. If students are fluent readers, they may benefit from instruction that promotes generalization through exposure to words and new vocabulary across contexts.

1.4. Wide Reading (WR)¹

By requiring students to reread the same passage, RR limits the breadth of vocabulary and content to which students are exposed (Homan, Klesius, & Hite, 1993). Considering Haring and Eaton's (1978) emphasis on promoting fluency via opportunities to practice along with LaBerge and Samuels's (1974) theory of automatic information processing, it is possible that students' fluency may improve if they simply are given multiple opportunities to practice reading new material. In an early evaluation of this hypothesis, Rashotte and Torgesen (1985) exposed participants to three conditions in order to compare the effectiveness of different reading interventions involving the same number of reading opportunities.
Students received (a) RR on a set of passages with low word and content overlap, (b) RR on a set of passages with high word and content overlap, and (c) intervention involving non-repetitive reading of passages with minimal word and content overlap (i.e., wide reading [WR]). Results indicated that improvements in reading fluency were greatest when students received RR on passages with high word and content overlap. However, interestingly, students exposed to WR exhibited greater gains in reading fluency than did students receiving RR with low word and content overlap. Thus, rather than repetition, guided practice may be the main advantage of RR instruction. Unfortunately, limitations pertaining to variability in assessment passages, small sample size (N = 12), and potentially confounding practice effects call for an extension of Rashotte and Torgesen's study through a randomized control design study with standardized assessments across students and conditions. In another evaluation of the effects of RR and WR, Homan et al. (1993) examined changes in reading rate, errors, and comprehension among sixth-grade students participating in either an assisted non-repetitive condition (i.e., WR) or an RR condition. Participants received 20 min of intervention three times each week for seven weeks. Students in both conditions improved significantly across measures with no differences between conditions. Similar findings were revealed in a synthesis of studies evaluating fluency instruction for struggling secondary students, with the effects of RR not exceeding those of WR (Wexler, Vaughn, Edmonds, & Reutebuch, 2008; Wexler et al., 2010). However, caution should be taken when generalizing the findings from Wexler et al. (2008, 2010) to elementary students, as the reading skills of secondary students—even those who are struggling in reading—are likely different from those of early elementary students whose sight word vocabularies are continuing to grow rapidly (Carlisle, 2000). 
Unfortunately, to date researchers have not evaluated whether the benefits of RR versus WR may vary depending on students' reading skills. It may be that students with fluency difficulties benefit more from RR than WR because RR provides multiple opportunities to correctly read words and to practice reading at a faster pace. In contrast, students without fluency difficulties may benefit more from WR as it exposes them to a greater number of words as well as the opportunity to practice the same words across multiple contexts. Researchers have also failed to examine whether RR or WR leads to greater improvements in prosody, a component of reading fluency indicative of students' comprehension of texts (Rasinski, Blachowicz, & Lems, 2006). Finally, with one exception (Ardoin, Binder, Zawoyski, Foster, & Blevins, 2013), researchers have failed to examine whether or how these two interventions differ in the improvements in reading behavior (i.e., eye movements) that they produce.

1.5. Prosody

Despite considerable research evaluating the impact of fluency-based intervention on students' rate of reading, little research exists examining an often forgotten component of reading fluency: prosody. The National Reading Panel (NICHD, 2000) included prosody as one of the essential components of fluent reading. Good prosodic reading occurs when individuals read in expressive and melodic patterns characterized by meaningful pausing, correct timing, and appropriate variations in pitch, stress, and volume (Dowhower, 1987, 1991). When text is read with good prosody, it sounds similar to spoken language and signifies that the reader successfully interpreted the meaning of the text (Rasinski et al., 2006). Despite its important role in the development of reading fluency, research examining prosody is limited (Cowie, Douglas-Cowie, & Wichmann, 2002).
Furthermore, extant studies are plagued by small sample sizes (e.g., Dowhower, 1987; N = 17) and less-than-perfect assessment procedures, such as subjective ratings of prosody and simple frequency counts of variables (e.g., Herman, 1985; Klauda & Guthrie, 2008; Ravid & Mashraki, 2007). Fortunately, recent advances in technology permit more objective and precise measurement of prosody by transferring students' readings into spectrograms that provide a visual representation of prosodic characteristics. For example, using spectrograms of students' reading, Cowie et al. (2002) statistically described reading fluency, prosody, and the relationship between these skills among elementary-aged readers. Analyses supported previous descriptions of fluency and expressiveness, indicating that readers rated as fluent by researchers read sentences faster and paused for shorter durations between sentences, as compared to readers rated as less fluent.

More recent studies (e.g., Schwanenflugel, Westmoreland, & Benjamin, 2013; Valle, Binder, Walsh, Nemier, & Bangs, 2013) examining prosody typically have analyzed readers' pause durations throughout texts (e.g., pauses between words and paragraphs, pauses following commas and periods) and final pitch declinations (i.e., decreases in pitch on the last words of declarative sentences). Research examining readers' pauses and final pitch declinations suggests that these variables are uniquely related to reading comprehension and oral reading fluency and that they explain significant variance in reading comprehension beyond accuracy and speed (Benjamin & Schwanenflugel, 2010; Klauda & Guthrie, 2008). For example, research indicates that high-achieving readers, as compared to lower-achieving readers, pause less frequently, place less emphasis on words, and are more likely to decrease their pitch at the conclusions of sentences (Miller & Schwanenflugel, 2006; Schwanenflugel, Hamilton, Kuhn, Wisenbaker, & Stahl, 2004).

¹ The name “Wide Reading” has not been used consistently throughout the literature. Other names for the same intervention included Multiple Exemplars, Nonrepetitive Conditions, and Guided Wide Reading. To maintain consistency within the current study, all published labels describing a wide reading intervention were changed to “Wide Reading (WR).”

1.6. RR and prosody

Although RR is recommended for developing reading fluency (i.e., reading with appropriate accuracy, rate, and prosody), there is little empirical support for RR's benefits related to prosody. Schreiber (1987) hypothesized that RR may help readers gain proper prosodic information about passages and that, with practice, readers might progress from reading a passage word by word to reading in significant phrases. Dowhower (1987) and Herman (1985) found preliminary evidence to support Schreiber's hypothesis; however, results should be interpreted cautiously due to their small sample sizes (i.e., Dowhower, 1987, N = 17; Herman, 1985, N = 8) and measurement limitations.
More recent findings suggest that the effects of RR on prosody may differ depending on how students receive directions and feedback. For example, Ardoin, Morena et al. (2013) found that, when provided with feedback and directions regarding improvement in reading speed, students receiving RR improved their reading rates but exhibited declined performance on prosodic variables across readings. In contrast, when given directions and feedback about prosody, students receiving RR improved on nearly all prosodic variables but did not significantly improve their reading rates across readings.

1.7. Eye movement research

Similar to prosody research, there is limited research examining the eye movements of students during reading. Until recently, almost all of the research on eye movements during reading was conducted with skilled adult readers (Miller & O'Donnell, 2013). Considering the vast differences between the eye movements of beginning early elementary readers versus skilled adult readers (Blythe & Joseph, 2011; Miller & O'Donnell, 2013), findings from research conducted with adults are limited in helping researchers understand children's reading behavior. Differences in the eye movement patterns of beginning child readers and skilled adult readers likely reflect the difficulty that beginning readers face in recognizing words comprising a given text. Given that the focus of fluency-based interventions is to improve students' recognition of words, eye movement measures serve as seemingly ideal dependent measures for evaluating how reading fluency interventions change students' reading behavior (Miller & O'Donnell, 2013).

Reading behavior involves a series of discrete eye movements (saccades) separated by pauses (fixations). It is during the fixations that readers derive information from what they are reading. Readers also look back at previously read text (regressions) when reading (see Appendix A for definitions).
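The distinction among fixations, saccades, and regressions can be illustrated with a minimal Python sketch. It assumes a hypothetical record format (one entry per fixation, giving the index of the fixated word and the fixation duration) and a simplified word-based rule in which a saccade counts as a regression when it lands on an earlier word; neither the format nor the rule is taken from the study's Appendix A, and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    word_index: int     # position of the fixated word in the passage (0-based)
    duration_ms: float  # how long the eye paused on that word


def global_measures(fixations):
    """Return (mean fixation duration, proportion of regressive saccades)."""
    if len(fixations) < 2:
        raise ValueError("need at least two fixations to form a saccade")
    mean_duration = sum(f.duration_ms for f in fixations) / len(fixations)
    # A saccade is the movement between two consecutive fixations. Under the
    # simplifying assumption used here, it is a regression when the eye lands
    # on a word that appears earlier in the text.
    saccades = list(zip(fixations, fixations[1:]))
    regressions = sum(1 for a, b in saccades if b.word_index < a.word_index)
    return mean_duration, regressions / len(saccades)


# Invented fixation sequence: the reader refixates word 1, looks back to
# word 0, and then moves forward to word 2.
sample = [
    Fixation(0, 320), Fixation(1, 350), Fixation(1, 280),
    Fixation(0, 300), Fixation(2, 350),
]
mean_ms, regression_rate = global_measures(sample)
print(mean_ms, regression_rate)
```

A refixation on the same word is counted here as a saccade that is not regressive; a within-word analysis based on character positions would need a different record format.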
The eye movements of developing readers differ from those of skilled readers (see Blythe & Joseph, 2011; Morris & Rayner, 1990 for review). Elementary-aged readers fixate on virtually every word and often make multiple consecutive fixations on a single word (Ardoin, Binder et al., 2013). As a result, they make shorter saccades on average and exhibit longer average fixation durations (300–400 ms) than do skilled adults (200–250 ms). Furthermore, up to 50% of their eye movements are regressive as compared to only 10%–20% for adults. Differences between elementary-aged and skilled adult readers are thought to reflect the difficulty that beginning readers face in recognizing words in texts (Valle et al., 2013).

Accurate interpretation of eye movement data is aided by our ability to extract multiple measures from a single reading episode, with different measures reflecting various stages of word processing (Rayner, Chace, Slattery, & Ashby, 2006). For example, two parameters related to early lexical processing (First Fixation Duration = the duration of the first fixation on a word, regardless of how many total fixations there are on a word; Gaze Duration = the sum of all fixations on a word before a reader moves to another word) can be measured on each word read in a passage, and analyses can be conducted to examine differences in students' word recognition skills across word types (Joseph, Nation, & Liversedge, 2013). Based upon the same reading episode, measures of fixation time over large segments of text can be used to examine processing associated with building higher-level representations of the meaning of the text, and rereading time can be used as an indicator of comprehension difficulty. By examining these and other measures in concert, researchers can use eye movement measures to construct a coherent picture of how the reading process unfolds over time (Rayner et al., 2006; Rayner et al., 2013).

1.8. Children's eye movements during reading

Foster, Ardoin, and Binder (2013) conducted the first study examining the eye movements of second-grade readers during RR of a passage embedded with low- and high-frequency target words. Results indicated that RR improved reading for all participants, as evidenced by decreases in overall reading time and changes in eye movement measures (first fixation duration, gaze duration, total fixation time, interword regressions, and average fixation count per word) across readings in the expected direction.
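Three of the word-level measures listed above can be sketched in Python from the definitions given in Section 1.7. The sketch assumes a hypothetical input format (a chronological list of word index and duration pairs) and invented sample values. It treats total fixation time as the sum of all fixations on the word, which is an assumption rather than the study's stated definition, and it does not handle the case in which a word is skipped on first pass and fixated only later.

```python
def word_level_measures(fixations, target):
    """Compute word-level measures for one target word.

    fixations: chronological list of (word_index, duration_ms) pairs.
    target:    index of the word of interest.
    """
    first_fixation = None   # duration of the first fixation on the target
    gaze = 0.0              # fixations on the target before the eye leaves it
    total = 0.0             # all fixations on the target, including rereading
    in_first_pass = False
    for word, duration in fixations:
        if word == target:
            total += duration
            if first_fixation is None:
                first_fixation = duration
                in_first_pass = True
            if in_first_pass:
                gaze += duration
        elif in_first_pass:
            # The reader moved to another word, so the first pass is over.
            in_first_pass = False
    return {
        "first_fixation_duration": first_fixation,
        "gaze_duration": gaze,
        "total_fixation_time": total,
    }


# Invented sequence: two consecutive fixations on word 1, a move to word 2,
# then a regression back to word 1.
sequence = [(0, 250), (1, 300), (1, 200), (2, 280), (1, 150)]
measures = word_level_measures(sequence, target=1)
print(measures)
```

In this sequence the later return to word 1 adds to total fixation time but not to gaze duration, which is the kind of separation between early processing and rereading that the measures are meant to capture.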
Additionally, Foster et al. found that, for beginning readers, RR differentially impacted processing of low- and high-frequency target words, with the greatest facilitation occurring on low-frequency target words. Results suggested that with each new reading, students' word recognition of unfamiliar words increased. In an extension of this work, results of Zawoyski, Ardoin, and Binder (2015) indicated that the effects of RR were greater for lower-performing readers than for higher-performing readers, as evidenced by their improved processing of both low- and high-frequency target words. Interestingly, Valle et al. (2013) also reported differences in the eye movements of elementary students as a function of their reading achievement scores, particularly with regard to eye movement measures related to decoding.

Ardoin, Binder et al. (2013) conducted the only study to date examining changes in the reading behavior of elementary students provided with (a) the opportunity to read a single passage with embedded target words four times [RR], (b) the opportunity to read four different passages embedded with the target words [WR], or (c) no opportunity to practice reading target words [control]. Findings suggested that, when compared to the control condition, neither intervention improved students' reading of high-frequency target words in a generalization passage but that students in both the RR and WR conditions improved equally on low-frequency target words as compared to control students (Ardoin, Binder et al., 2013). Results supported implications of past applied research on RR and WR (e.g., Homan et al., 1993; Wexler et al., 2008) suggesting that WR may be as effective as RR in improving reading fluency. However, given that Ardoin et al. examined a single intervention session, the effects of extended RR and WR on reading as well as whether effects might vary as a function of achievement level require further exploration.

1.9. Purpose

Despite a large research base supporting the effectiveness of RR and WR separately, only one study incorporated a randomized controlled design to evaluate the effects of RR, WR, and typical classroom instruction on multiple reading measures over an extended period of time (i.e., Wexler et al., 2010). Results from Wexler et al. (2010) indicated null effects, although it is important to note both that participants were high-school students with reading disabilities and that participants assumed the roles of both tutor and tutee. The current study aimed to extend Wexler et al. by examining RR, WR, and a “business-as-usual” (BU) condition with second-grade students. Participants varied in their reading achievement from low average to proficient students in an effort to examine how RR and WR might differentially impact students according to their skill. As recommended by Therrien (2004), students in the current study received intervention from skilled adults in order to ensure maximum effectiveness.

In addition to measuring traditional reading outcome variables (e.g., oral reading fluency measures, standardized tests of reading achievement) at pre- and posttest, we also measured students' prosody during oral reading. To address the limited nature of research on prosody development, we analyzed intervention effects on students' prosody utilizing advanced technology (i.e., audio equipment and spectrograms). Additionally, we measured students' eye movements during silent reading. Eye tracking allowed for real-time measurement of students' reading behavior and detection of subtle changes in word- and passage-level processing that might ultimately lead to improvements in oral reading rate and accuracy. This was the first eye movement study to examine behavioral changes underlying improvements in reading fluency as well as whether the two interventions would produce differential effects in eye movements.
The overarching purpose of the current study was to investigate how different interventions impact the core components of reading fluency (i.e., accuracy, speed, and prosody) and eye movements during reading. An additional consideration was whether these effects varied by achievement level. By integrating an extended intervention period with findings from traditional standardized assessments, eye movement measures, and prosodic evaluation, this study examined associations among reading and educational outcomes from a unique and multifaceted perspective. The study was guided by three primary research questions:

1) Achievement Measures: Does RR or WR intervention improve students' reading achievement above and beyond a BU condition? Do results vary as a function of intervention condition and/or students' initial reading achievement level?
2) Reading Prosody: Do 9–10 weeks of RR or WR intervention alter students' reading prosody above and beyond BU? Do results vary as a function of intervention condition and/or students' initial reading achievement level?
3) Global Eye Movements: Do 9–10 weeks of RR or WR intervention result in changes in students' global eye movement behavior above and beyond changes observed for students provided with BU? Do results vary as a function of intervention condition and/or students' initial reading achievement level?

2. Method

2.1. Participants and setting

Students were recruited by asking the second-grade teachers within each of three participating schools to send letters of consent, approved by the University of Georgia institutional review board, to the parents of students who met inclusionary criteria.
Inclusionary criteria were that the student (a) was receiving classroom instruction within a second-grade classroom in one of the three participating schools; (b) had the skills necessary to read connected text; (c) had a record of good attendance; (d) had a first language of English; and (e) was receiving reading instruction only within the regular education classroom. Data of one student were not included in analyses due to excessive absences, but all other students who began the study were included in the analyses. The final set of participants included 78 boys and 90 girls of White (86%), multiracial (6%), Black or African-American (3%), Hispanic or Latino (3%), and Asian (2%) ethnicities with a mean age of 7 years, 8 months (range = 6 years, 11 months
to 8 years, 8 months) at the initial point of data collection. These participants were drawn from two elementary schools (Grades K–5) and one primary school (Grades K–2) in the Southeastern United States across four consecutive semesters. Within the two elementary schools, students were drawn from the classrooms of four teachers, and within the primary school, students were drawn from the classrooms of five teachers. Across schools, the percentage of students eligible to receive free or reduced-price meals ranged from 23% to 32%. At the initial point of data collection, participants demonstrated Low Average to Superior broad reading skills (range = 85 to 124; M = 104.74), as indicated by composite standard scores on the Woodcock-Johnson Tests of Academic Achievement – Third Edition, Form A (WJ-III ACH; Woodcock, McGrew, & Mather, 2001). Similarly, participants' pretest oral reading fluency rates ranged from 20 to 194 words read correctly per min (WRCM; M = 83.92). Demographic information and pretest achievement scores for participant subsamples assigned to each experimental condition are provided in Table 1.

2.2. Materials/apparatus and measures

2.2.1. Pre- and posttest

Standardized measures were individually administered to assess participants' reading achievement, oral reading fluency, oral reading prosody, and eye movements during silent passage reading.

2.2.2. Reading achievement

To assess broad reading achievement, four reading subtests from the WJ-III ACH (Woodcock et al., 2001) were administered. The first subtest, Letter-Word Identification (LWI), required participants to read aloud lists comprised of individual letters and words. The second subtest, Reading Fluency (RF), required them to silently read and indicate the veracity of as many printed sentences as possible within a 3-min time limit.
During the third task, Passage Comprehension (PC), participants were asked to read sentences/short texts containing a missing word and to supply each missing word orally (e.g., “The ___ barked, frightening the children.”). The last subtest, Word Attack (WA), required participants to read lists of nonsense words (e.g., “glerz”) aloud. Each administration yielded raw scores for each subtest and a Broad Reading (BR) composite score (an age-based standard score summarizing performance on the first three subtests) for each participant. The WJ-III technical manual provides evidence of the technical adequacy of the WJ-III subtests and BR composite by presenting median test–retest reliability coefficients which exceed 0.80 across subtests and BR. Evidence of validity is found in correlations between the WJ-III ACH subtests and other norm-referenced measures of reading achievement, with coefficients ranging from 0.62 to 0.81 (Schrank, McGrew, & Woodcock, 2001).

2.2.3. Oral reading fluency

Participants' oral reading fluency (ORF) was assessed at pre- and posttest using three reading curriculum-based measurement (CBM-R) probes drawn from Formative Assessment Instrumentation and Procedures for Reading (FAIP-R) materials, which demonstrate levels of technical adequacy comparable to those of other CBM-Rs (Ardoin, Eckert et al., 2013; Christ, Ardoin, & Eckert, 2010). In line with universal screening procedures for CBM-R (Shinn, 1998), participants individually read FAIP-R probes aloud to an examiner who marked errors, supplied words after hesitations of 3 s, and recorded the last word read after 1 min elapsed. Performance across all three probes was used to calculate median scores in WRCM for each participant.

2.2.4. Oral reading prosody

To assess prosody during oral passage reading, participants were asked to read an experimenter-developed narrative passage (“Molly”) aloud.
This passage consisted of 355 words in five paragraphs, with a reading level of grade 3.36 according to the Spache (1953) readability formula. The passage was designed to include multiple sentence types (e.g., declarative, yes/no questions, and wh- questions) and varied punctuation (e.g., adjective and clause commas, sentence and paragraph periods) to allow for evaluation of pitch changes at ends of sentences and punctuation-related pausing. It was presented to participants in printed form as two pages of double-spaced, standard upper- and lowercase text. Prosody was measured using digital recordings of participants' oral reading and the Praat software program, version 5.2.26 (Boersma & Weenink, 2011). Recordings were gathered using an Audio Technica AT2020USB Condenser USB Microphone connected to a PC and were coded using Praat to analyze word, sentence, and paragraph boundaries; pause durations within and between these boundaries (i.e., pauses after commas separating list items, after commas ending phrases, at sentence boundaries, and at paragraph boundaries); and minimum and maximum pitch levels during specific portions of reading (i.e., declarative sentences, wh- questions, and yes/no questions). Definitions of prosody measures similar to those employed by Ardoin, Morena et al. (2013) are provided in Appendix B. Although traditional studies of reliability and validity have yet to be conducted with the measures of prosody employed, research suggests modest relationships with students' comprehension (Miller & Schwanenflugel, 2006; Schwanenflugel et al., 2004), significant differences in prosody as a function of passage difficulty (Young & Bowers, 1995), as well as significant differences in the reading prosody of students with and without learning disabilities in reading.

Table 1
Descriptive summary of participants at pretest by experimental condition.

| Condition (n) | Gender: Male | Gender: Female | Race: White | Race: Black or African American | Race: Hispanic or Latino | Race: Asian | Race: Multiracial | ORF M (SD) | WJ-III M (SD) |
|---|---|---|---|---|---|---|---|---|---|
| RR (56) | 39.29% (22) | 60.71% (34) | 91.07% (51) | 1.79% (1) | 3.57% (2) | 3.57% (2) | – | 85.36 (32.64) | 105.27 (7.86) |
| WR (56) | 44.64% (25) | 55.36% (31) | 85.71% (48) | 5.36% (3) | 1.79% (1) | – | 7.14% (4) | 84.53 (31.37) | 104.71 (7.87) |
| BU (56) | 55.36% (31) | 44.64% (25) | 82.14% (46) | 1.79% (1) | 3.57% (2) | 1.79% (1) | 10.71% (6) | 81.75 (30.17) | 104.25 (7.80) |

Note. Demographic information is reported for each condition in terms of percentage and total number of participants (n). Pretest achievement scores for Oral Reading Fluency (ORF) and WJ-III Broad Reading are reported in WRCM (words read correctly in a min) and standard scores, respectively.

2.2.5. Eye movements during silent reading

To assess eye movements during reading, participants were asked to silently read an experimenter-developed narrative passage adapted from Anansi and the Talking Melon, a story written by Eric A. Kimmel. This passage (“Sammy”) consisted of 157 words in 12 sentences. The reading level of the passage fell at grade level 3.53, according to the Spache (1953) readability formula. The passage was displayed as one page of 1.5-spaced black text (against a white background) formatted in standard upper- and lowercase letters using 20-point Times New Roman font. The maximum line width was 84 characters, with 3.7 characters equaling 1 degree of visual angle. Eye movements during silent reading were measured using an SR Research EyeLink 1000 system. Eye movements were recorded from one eye (i.e., the right eye, except when tracking issues necessitated recording from the left eye), but passage viewing was binocular.
The passage was displayed on either a 19-inch or 22-inch LCD monitor. Eye movement recording was conducted in a dimly lit room in each participant's school. Participants used a Microsoft Sidewinder Plug and Play game pad to answer comprehension questions and to indicate when they were finished reading displayed text. Eye movement data were collected and analyzed across multiple measures which are defined in Appendix A (first fixation duration, gaze duration, fixations, total fixation time, intraword regressions). Analyses of the psychometric qualities of commonly used eye movement measures indicate coefficients ranging from 0.73 to 0.82 (test-retest reliability), 0.86 to 0.89 (alternate-form reliability), and 0.47 to 0.77 (criterion-related validity in relation to WJ-III ACH Broad Reading standard scores and CBM-R median scores; Foster, Ardoin, & Binder, 2016).

2.2.6. Intervention passages

Intervention passages were pulled from first- through third-grade-level reading textbooks published in 1978–1997 through the Harcourt Brace Jovanovich “Signatures,” Silver Burdett Ginn “World of Reading,” and Scott Foresman “Celebrate Reading!” series. These materials were selected to minimize the possibility of prior exposure to the passages; none of these curricula were being used in participants' schools. Following selection, passages were annotated with running word counts for each line (as in examiner copies of CBM-R probes), scanned, and converted to PDF electronic copies. During intervention implementation, PDF files were displayed on PCs with Adobe Reader software using the “Two-Up Continuous” or “Two Page Scrolling,” “Show Cover Page During Two-Up,” and “Reading Mode” viewing options. Each reading textbook was converted into a separate PDF file consisting of several passages. PDF files were categorized and labeled by level (i.e., Grade 1, Level 1; Grade 1, Level 2; Grade 2, Level 1; Grade 2, Level 2; Grade 3, Level 1; Grade 3, Level 2).
Participants continuously read PDF files at their level from start to finish in a set order. Random sampling of the passages at each grade level (based on 42 first-grade-level stories, 15 second-grade-level stories, and 19 third-grade-level stories) revealed that, on average, the proportion of unique words (i.e., # of unique individual words / total # of words within passage) within each passage was 44% for Grade 1 stories (range = 14%–100%), 45% for Grade 2 stories (range = 22%–78%), and 38% for Grade 3 stories (range = 25%–57%). When counting unique words, words that appeared multiple times in the passage were counted only once.

2.2.7. CBM-R progress monitoring

During the intervention period, CBM-R progress monitoring was conducted weekly with students in each of the three conditions. Each week, participants individually read three FAIP-R probes drawn from level-appropriate passage sets selected based on pretest ORF performance. As typically conducted when estimating student growth using CBM-R data (Ardoin, Christ, Morena, Cormier, & Klingbeil, 2013), ordinary least squares regression was used to calculate the rate of words gained per week for each student. Specifically, for each student a line of best fit (slope) was determined by entering the date of each CBM-R session and the student's median WRCM from the related session. The resulting beta coefficient was multiplied by 7 to produce a measure of words gained per week for each student.

2.3. Procedures

2.3.1. Pre- and posttest

Approximately one week prior to and following the intervention period, participants were pulled from their classrooms for administration of the aforementioned pre- and posttest measures. Participants each completed two pre- and posttest sessions (i.e., one session each on two consecutive days), each lasting 30–45 min; tasks were split across two sessions.
During the first of these sessions, participants were administered previously described assessments measuring broad reading achievement and oral reading fluency. Graduate students trained in standardized testing procedures (i.e., school psychology doctoral students) administered tests individually in quiet areas near participants' classrooms (i.e., unoccupied classrooms, office spaces, etc.). During the second
session, participants completed tests requiring computer equipment (i.e., assessment of silent passage reading and oral reading prosody) in a separate room selected to avoid distractions. Detailed procedures for these tasks are outlined below.

2.3.2. Eye movement monitoring

Each participant was seated approximately 50–55 cm from a computer monitor and placed his/her chin on a chin rest used to minimize movement of the head. While experimenters adjusted the chin rest and camera, they described the tasks to the participant (i.e., that he/she would silently read stories from the monitor while the camera recorded his/her eye movements) and showed him/her how to use the game pad. The eye monitoring system was calibrated for each participant using a nine-point calibration grid covering the entire display screen. Following successful calibration and validation (using another nine-point grid), each participant completed a practice trial that acquainted him/her with silently reading from the monitor and using the game pad. After completing the practice trial, the participant was given a short break to prevent discomfort or fatigue. Prior to the experimental trial, the participant was provided with further instructions; each participant was told to do his/her “best reading,” to try to read each word in the passage, that the experimenter could not assist with reading, and that he/she would be asked a comprehension question after reading. Following repetition of the calibration process, the participant was instructed to fixate on a dot displayed in the upper-left corner of the screen (i.e., where the passage would begin). Once fixation was satisfactory, the passage was displayed on the monitor. After silently reading the passage, the participant used the game pad to indicate when he/she was finished reading and to answer the subsequent comprehension question. The purpose of the comprehension questions was to encourage students to read for comprehension.
At posttest, participants were presented with additional passages and questions for purposes associated with other studies; the experimental passage presented at pretest was always presented first, followed by three additional passages. For information associated with those studies, see publications by Ardoin, Binder et al. (2013), Foster et al. (2013), and Zawoyski et al. (2015).

2.3.3. Oral prosody assessment

Each participant was provided with the printed passage and instructed to do his/her “best” oral reading. The experimenter monitored performance, supplied words after hesitations of 5 s, and silently redirected students to the correct line when needed (i.e., pointed) but did not provide error correction.

2.3.4. Participant assignment

Following pretest assessment, participants were cumulatively ranked according to their ORF median scores and raw scores for correct responding on the WJ-III ACH Reading Fluency subtest. Based on these rankings, experimenters created matched groups of three students and, using Microsoft Excel's random number generator, assigned one group member to each of three conditions: repeated reading (RR), wide reading (WR), and “business as usual” (BU). Efforts were made to ensure that condition assignment was distributed evenly across classrooms. At pretesting, experimenters were blind to student condition, since students had yet to be assigned; however, following assignment, experimenters were not blind to students' conditions, as data collectors were also often involved in intervention implementation.

Appropriate levels of intervention passages (i.e., PDF files) and progress-monitoring probes were determined based on participants' pretest ORF median scores. For example, participants earning scores of 70 WRCM or above were progress monitored using Level 3 FAIP-R probes, whereas participants earning lower scores were assigned to Level 2.
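The matched-group assignment and probe-level rule described in Section 2.3.4 can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' procedure: the "cumulative" ranking is taken to be the sum of the ORF rank and the Reading Fluency rank, Python's `random` module stands in for the Microsoft Excel random number generator used in the study, and the function names and example scores are hypothetical.

```python
import random

CONDITIONS = ("RR", "WR", "BU")


def rank_descending(values):
    """Return 1-based ranks (1 = highest score); ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def assign_matched_triplets(students, seed=0):
    """Assign one member of each matched triplet to each condition.

    students: list of (student_id, orf_median_wrcm, rf_raw_score).
    Assumption: the cumulative rank is the sum of the two separate ranks.
    Students left over when the count is not a multiple of 3 stay unassigned.
    """
    rng = random.Random(seed)
    orf_ranks = rank_descending([s[1] for s in students])
    rf_ranks = rank_descending([s[2] for s in students])
    ordered = sorted(
        range(len(students)),
        key=lambda i: (orf_ranks[i] + rf_ranks[i], i),
    )
    assignment = {}
    usable = len(ordered) - len(ordered) % 3
    for start in range(0, usable, 3):
        triplet = ordered[start:start + 3]
        conditions = list(CONDITIONS)
        rng.shuffle(conditions)  # random order of conditions within the triplet
        for index, condition in zip(triplet, conditions):
            assignment[students[index][0]] = condition
    return assignment


def probe_level(orf_median_wrcm):
    """Progress-monitoring probe level from pretest ORF (70 WRCM cut-off)."""
    return 3 if orf_median_wrcm >= 70 else 2


# Hypothetical pretest scores: (id, ORF median WRCM, WJ-III RF raw score).
example = [("s1", 120, 40), ("s2", 110, 38), ("s3", 100, 35),
           ("s4", 60, 20), ("s5", 55, 18), ("s6", 50, 15)]
print(assign_matched_triplets(example, seed=1))
```

Shuffling the condition labels within each triplet keeps the three conditions balanced on pretest skill, which is the purpose of matching before randomization.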
Of note, due to the error inherent in CBM-R, changes in participants' levels of intervention materials occasionally were necessitated. Levels were adjusted such that participants reading Grade 2-level passages read approximately 40–60 WRCM during their initial reading. Similarly, participants reading Grade 3-level passages were expected to read 60 or more WRCM during their first reading and at least 100 WRCM during their final reading. Attention was also given to students' reading accuracy. Students were assigned to easier materials if their reading accuracy was below 90%.

2.3.5. Intervention

One-to-one reading intervention was implemented 4 days per week during 15–20-min daily sessions over a 9- to 10-week period. Due to resource limitations, the study was conducted across four separate periods, with an approximately equal number of participants during each period. Due to weather-related school closures, one such period involved only 9 weeks of intervention. In cases where students were absent on four or more occasions, make-up sessions were conducted 5 days per week to ensure that all students experienced approximately the same number of intervention sessions. The number of intervention sessions completed by students in the RR (M = 37.04; range = 30–40) and WR (M = 37.00; range = 32–40) conditions was approximately equal, t(109) = 0.08, p = 0.94. To ensure that intervention was supplementary (i.e., in addition to typical instruction), sessions took place during teacher-selected intervals in which students were not receiving reading/language arts instruction. Specifically, sessions generally occurred in the morning prior to the official start time of the school day or during morning announcements. During each RR/WR session, participants completed four consecutive segments of oral reading and answered comprehension questions verbally following two of those segments.
Comprehension questions required participants to recount information (i.e., the name of a person, thing, or place in the passage and a detail about that entity), to predict future details, to evaluate past predictions, or to summarize the passage. Participants progressed sequentially through appropriate-level reading materials (PDF files), beginning each session's reading at the word on which students left off at the completion of the previous session. In addition to delivering instructions and questions, interventionists monitored performance and provided immediate error correction (i.e., prompted participants to reread any misread words, modeled accurate reading of misread/unknown words, and instructed participants to reread such words correctly in their surrounding contexts). Following each reading segment, interventionists
also provided performance feedback regarding participants' oral reading rate (i.e., number of words read, number of min read) during that segment. Once per week, participants in both conditions were allowed to choose a small toy/prize to thank them for their participation. In order to ensure procedural fidelity, interventionists followed paper or electronic scripts/protocols specific to each intervention during all intervention sessions. At the end of each session, interventionists logged the number of words read by the participant and the session duration. Across intervention conditions, participants spent 12–15 min per session reading aloud, not including time spent on preparation, answering comprehension questions, and performance feedback. For sessions involving CBM-R progress monitoring, reading times were shortened to ensure that overall session length remained consistent across days (i.e., 15–20 min). To ensure that participants in both intervention conditions completed an equivalent amount (duration) of oral reading (t[109] = 0.35, p = 0.73), the length of time allotted for WR participants' reading (in min) was yoked to that of participants in the RR condition. Across the study, participants in the RR condition read aloud for an average of 451 min (SD = 39; range = 345–516), and participants in the WR condition read aloud for an average of 449 min (SD = 35; range = 376–509).

2.3.6. Repeated reading (RR)

In line with past research on critical components of RR-based intervention procedures (Therrien, 2004), participants assigned to the RR condition read passages four consecutive times. During the first reading segment of each session, participants read aloud for 5 min. During each subsequent reading segment, they read the same selection of text (i.e., what they had read during the first reading) while interventionists timed and recorded the duration of each reading. A copy of the RR protocol is provided in Appendix C.

2.3.7. Wide reading (WR)

To ensure that participants in the WR condition read for the same duration as the RR participants, interventionists implemented RR intervention before working with WR participants. After providing RR intervention, interventionists determined the length of corresponding WR reading segments by summing the durations of all four RR reading segments and dividing by four. If an RR participant was absent, data from that student's previous session were used to calculate WR reading time. WR intervention was designed to expose participants to a wider range of text than that encountered by RR participants; thus, participants assigned to the WR condition read passages continuously without rereading any material. During each reading segment within the intervention session, WR participants read aloud for the predetermined duration described above (i.e., 1/4 of the total time spent reading by corresponding RR participants). Interventionists used timers to indicate the end of each segment and then recorded the number of words that participants read during each reading; the number of words read was calculated using the running word count totals at the ends of each line of text. A copy of the WR protocol is provided in Appendix C.

2.3.8. Business as usual (BU)

In addition to their regular classroom instruction, BU students participated in pre- and posttesting and weekly CBM-R progress monitoring sessions. Otherwise, BU participants' school day did not differ from that of second-grade students who did not participate in the study. Furthermore, because RR and WR participants were generally provided with intervention outside of classroom instructional time, the amount and quality of instruction provided by classroom teachers to students did not differ between participants regardless of condition assignment or their participation in the study.

2.3.9. Progress monitoring

CBM-R administration procedures for weekly progress monitoring were identical to those employed during pre- and posttest assessment. Students assigned to intervention conditions completed the assessment during regularly scheduled sessions (prior to intervention procedures), and students assigned to the BU condition were assessed weekly during the same times of the school day.

2.3.10. Training and procedural fidelity

During the intervention period, each interventionist worked with a select group of students, such that intervention for a given participant was always implemented by the same individual(s). Interventionists worked one to one with students, typically with two to three students per day. To reduce the possibility of interventionist effects, each interventionist was assigned both RR and WR students. Intervention and progress monitoring procedures were conducted by experimenters who completed two 3-h training sessions prior to the start of data collection. Retraining sessions were provided when deemed necessary based upon collected procedural fidelity/inter-rater reliability data. To facilitate collection of procedural fidelity data, all sessions were audio recorded. Procedural fidelity checks were conducted by interventionists (who checked one another's sessions) and graduate research assistants throughout the 9- to 10-week intervention period. These evaluators collected data regarding whether interventionists: (a) read daily instructions to participants; (b) provided the correct form of intervention (RR or WR) to participants; (c) asked comprehension questions; (d) properly implemented error correction procedures; and (e) responded appropriately to each word read (i.e., corrected errors and did not respond inappropriately to words read accurately).
These data were gathered for 12% of all intervention sessions, distributed across all interventionists (N = 33) and both experimental conditions (237 RR sessions and 236 WR sessions). On average, interventionists read daily instructions with 99% accuracy, provided the correct form of intervention with 100% accuracy, and asked comprehension questions with 97% accuracy. Across sessions evaluated, interventionists responded appropriately to 99% of words read. In addition, evaluators examined interventionists' degree of procedural fidelity in administering CBM-R probes to monitor progress.
Procedural fidelity was calculated by dividing the number of steps implemented correctly by the total number of steps (N = 8). Data were collected for 26% of progress monitoring sessions; average procedural fidelity was 98% (range = 38%–100%). The case of 38% fidelity was due to one of the randomly selected sessions not including audio recording of the entire session; thus, all steps prior to those recorded were marked as not being properly implemented.

2.3.11. Inter-rater reliability

Inter-rater reliability for progress monitoring data (i.e., CBM-R scoring) was calculated by two school psychology graduate research assistants and was assessed for 26% of administered CBM-R probes. Percent agreement for each probe was calculated by subtracting the number of disagreements from the number of total words read recorded by the original scorer, then dividing that difference by the number of total words read. Average percent agreement across probes was 99% (range = 90%–100%).

2.3.12. Classroom observations

Observational data were collected to ensure that RR and WR intervention was supplementary to (i.e., in addition to and not in place of) typical classroom instruction. Classroom observations were conducted during teachers' regularly scheduled Language Arts/reading blocks (M = 104 min) in order to ensure that BU did not change for students in either of the intervention conditions. One random participant was observed during each session for the duration of the Language Arts/reading block. Each participant was observed in his/her classroom setting at least one time; the average number of observations conducted per student was 1.26. Across the four semesters of data collection, 19 observers (not interventionists) were trained by a graduate research assistant in collecting behavioral observation data until they were able to pass an “observation test” and achieve 95% inter-rater reliability with the primary observer.
Observational data were collected using laptop computers equipped with Instant Data software (version 1.4 for PC; Samaha, Vollmer, & Bourret, 2002). Each context/behavior was associated with an assigned key that could be toggled on or off, allowing for the recording of the duration associated with each context/behavior. Observation data files were analyzed using Instant Analyzer software (version 1.0; Samaha, Vollmer, & Bourret, 2001).

2.3.13. Observational codes

Dependent measures fell within three broad categories: General Context, Specific Context, and Reading Behavior (described below). Observers recorded at least one context/behavior for each category for the duration of each observation. Upon completion of each observation, data were summarized by calculating the total amount of time (in seconds) participants spent in each context or engaging in each behavior. This number was then divided by the total amount of time spent completing Reading Activity (defined below) and multiplied by 100. This calculation allowed analysis of the percentage of time each participant spent in each context and engaging in each behavior. In addition, it allowed analysis of the percentage of time associated with context-behavior combinations (e.g., Alone-Silent, Small Group-Instruction).

2.3.14. General context

First, observers recorded which General Context target students were in: Reading Activity or Non-Reading Activity. “Reading Activity” was defined as time in which participants were expected to be engaging in a language arts activity including reading, writing, grammar, or spelling. “Non-Reading Activity” was defined as time during which participants were engaged in an activity that was clearly not related to language arts (e.g., math, PE, science, transitions). Since all observations took place during regularly scheduled Language Arts/reading blocks, participants were generally involved in Reading Activity.

2.3.15. Specific context

Observers also recorded the Specific Context in which participants were engaged: Alone, One-on-One Teacher, Peer-No Adult, Small Group, or Large Group. “Alone” was defined as the participant being by him- or herself, without peer interaction or grouping. “One-on-One Teacher” was defined as the participant working one on one with an adult (i.e., teacher, parent, or volunteer), without any peers present. “Peer-No Adult” was defined as the participant working with a peer group without an adult present. “Small Group” was defined as the participant working in a small group of peers led by an adult. “Large Group” was defined as the participant working in a large group, listening to teacher lecture (e.g., going over spelling words, vocabulary words, using whiteboard).

2.3.16. Reading behavior

Observers recorded the Reading Behavior in which participants were engaged: Aloud, Silent, Listen, Write, Instruction, Choral, or No-LA. “Aloud” was defined as the participant reading out loud so others could hear him/her. “Silent” was defined as the participant reading silently (e.g., silent sustained reading of a chapter book). “Listen” was defined as the participant listening to another person (student or adult) reading to him/her, with the participant having access to the reading material. “Write” was coded if the student was doing written work silently (e.g., worksheets). “Instruction” was coded if the participant was receiving instruction in reading, answering comprehension questions, talking with peers about books, or retelling a story. “Choral” was coded if the participant was reading out loud with another student or group of students in unison. “No-LA” was coded if the participant was not engaged in any behavior considered related to language arts (e.g., staring at the wall, waiting his/her turn, completing math worksheets) during times in which he/she was expected to be engaged in Reading Activity.
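The summary calculation described in Section 2.3.13 (seconds in a context or behavior, divided by total Reading Activity time, multiplied by 100) can be sketched as follows. This is an illustration only: the function name and the example durations are hypothetical, and the study's own summaries were produced with Instant Analyzer rather than with code like this.

```python
def percent_of_reading_time(durations_s, reading_activity_s):
    """Convert seconds per context/behavior into percentages of Reading Activity time.

    durations_s: dict mapping a code (e.g., "Small Group") to seconds observed.
    reading_activity_s: total seconds coded as Reading Activity.
    """
    if reading_activity_s <= 0:
        raise ValueError("Reading Activity time must be positive")
    return {
        code: 100.0 * seconds / reading_activity_s
        for code, seconds in durations_s.items()
    }


# Hypothetical observation: 6000 s of Reading Activity split across contexts.
observed = {"Alone": 1500, "Small Group": 3000, "Large Group": 1500}
print(percent_of_reading_time(observed, reading_activity_s=6000))
```

The same function applies to context-behavior combinations (e.g., Alone-Silent) if the combined code is used as the dictionary key.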
2.4. Data analyses

We used G*Power (Faul, Erdfelder, Lang, & Buchner, 2007) to calculate a priori power for our study. A meta-analysis conducted by Therrien (2004) indicated that the effect sizes associated with the RR method have been estimated to be 0.76. Thus, assuming a critical p value of 0.05, to achieve a power of 0.80 with our mixed design with two independent group variables (i.e., intervention condition and skill level) and two repeated measures (i.e., pre- and posttest), it was necessary to recruit a total of 162 participants. SPSS was used to analyze resulting study data. The effects of intervention (RR n = 56; WR n = 55 [see Footnote 2]; BU n = 56), time, and skill level were examined first using MANOVAs in which all of the dependent variables associated with a behavior (i.e., achievement tests, prosody and eye movement behavior) were entered into the analyses, and the MANOVAs were then followed up with mixed analyses of variance (ANOVAs). Participants were grouped by skill level based on their pretest WJ-III BR composite scores (lower n = 57; medium n = 53; higher n = 57). We divided the data set into thirds, representing the lower, medium, and higher-level groups. The medium group had fewer participants because there were a few students in both the lower and higher groups that obtained the same composite score, so they were placed in each respective group. One-way ANOVAs and Bonferroni-corrected follow-up analyses were conducted to examine significant interactions between variables (corrected p-value of 0.017 for each follow-up analysis). Overall effect sizes are reported as values of partial eta squared, which can be interpreted using Cohen's (1988) f benchmarks for small (ηp² = 0.010), medium (ηp² = 0.059), and large effects (ηp² = 0.1378).

3. Results

3.1. Classroom instruction

Analyses were conducted to examine whether there were differences in classroom instruction provided to the students across conditions.
One-way ANOVAs were conducted to compare the percentage of time that participants in each condition spent engaged in small-group instruction, one-to-one instruction, and reading aloud within their classrooms. There were no significant differences between groups in terms of small-group instructional time, F(2, 158) = 2.21, p = 0.113, one-to-one instructional time, F(2, 158) = 0.70, p = 0.498, or time spent reading aloud, F(2, 162) = 0.15, p = 0.859.

3.2. Differences in intervention practice

Independent-samples t-tests were used to compare the number of words read by participants in the RR and WR conditions. Participants in the RR group (M = 36,000.80, SD = 9566.85) read approximately 25% more words than participants in the WR group (M = 28,815.40, SD = 8625.10), t(109) = 4.15, p < 0.001, d = 0.789. However, after adjusting for the repeated nature of RR intervention (i.e., comparing only those words read as part of non-repeated sequences of text), participants in the WR group (M = 28,815.40, SD = 8625.10) read approximately 220% more unique words as compared to participants in the RR group (M = 9000.20, SD = 2391.71), t(109) = 16.56, p < 0.001, d = 3.131.

3.3. Achievement assessments

Two data files were missing for the ORF measure (one each from WR and RR), but data files were complete for WJ-III across all participants. Achievement assessment analyses corresponded to the WJ-III measures of WA, LWI, RF, and PC, as well as the FAIP-R CBM-R probes (see Table 2 for descriptive statistics). Prior to analyzing the data, outliers were identified using the outlier labeling rule (Hoaglin & Iglewicz, 1987). This rule establishes cut-off values by taking the difference between the scores at the 75th and 25th percentile, and then multiplying this difference by g, which is 2.2 as suggested by Hoaglin and Iglewicz (1987). The generated value is then subtracted from the score at the 25th percentile and added to the score at the 75th percentile.
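The cut-off calculation just described can be sketched in a few lines. Two choices below are assumptions, not details reported in the source: the percentile method (`statistics.quantiles` with `method="inclusive"`; other quantile definitions give slightly different cut-offs) and the form of winsorization (replacing flagged scores with the nearest cut-off value). The function names and example scores are hypothetical.

```python
import statistics


def labeling_rule_fences(scores, g=2.2):
    """Lower and upper cut-offs from the outlier labeling rule.

    The spread between the 75th and 25th percentiles is multiplied by g,
    then subtracted from the 25th and added to the 75th percentile.
    """
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    margin = g * (q3 - q1)
    return q1 - margin, q3 + margin


def winsorize_to_fences(scores, g=2.2):
    """Assumed winsorization: clip scores outside the cut-offs to the cut-offs."""
    lower, upper = labeling_rule_fences(scores, g)
    return [min(max(score, lower), upper) for score in scores]


# Hypothetical scores with one extreme value.
scores = [10, 12, 14, 16, 18, 20, 22, 24, 200]
print(labeling_rule_fences(scores))
print(winsorize_to_fences(scores))
```

With g = 2.2 the cut-offs sit farther from the quartiles than the more familiar 1.5 multiplier, so fewer scores are flagged.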
Scores falling outside these boundaries were classified as outliers and handled via winsorization. Outliers represented a very small proportion of the data: ORF = 0.01, LWI = 0.006, RF = 0.006, and there were no outliers for either PC or WA. We first ran a 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) × 5 (Task) MANOVA (Table 3). This analysis produced significant effects for skill, time, task, Time × Intervention, Time × Skill, Task × Intervention, Task × Skill, Task × Time, and Time × Task × Intervention. Thus, a 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) mixed ANOVA was conducted (see Table 4) for each variable as a follow-up to the MANOVA. Similar results were obtained for LWI, PC, and ORF. For each of these assessments, statistically significant main effects for Time were found, such that participants' scores were higher at posttest compared to pretest. There were also statistically significant main effects for skill level, with low-skilled students scoring lower than both the middle-performing and highest-performing students. There were also Time × Condition interactions for each of these measures. In post hoc follow-up tests, we found that, although all intervention groups improved their performance from pre- to posttest to a statistically significant degree, the effect sizes associated with RR (LWI = 0.642, t(55) = 9.939, p < 0.001; PC = 0.501, t(55) = 7.432, p < 0.001; ORF = 0.680, t(55) = 10.818, p < 0.001) and WR (LWI = 0.578, t(54) = 8.603, p < 0.001; PC = 0.449, t(54) = 6.636, p < 0.001; ORF = 0.742, t(54) = 12.477, p < 0.001) were always greater than the effect sizes for the BU

Footnote 2: One participant receiving WR was excluded from these analyses due to excessive absences during the intervention period.
24
S.P. Ardoin et al. / Journal of School Psychology 59 (2016) 13–38
Table 2 Summary of descriptive statistics for achievement assessments across time, intervention, and skill level. Measure
RR Low M (SD)
ORF (WRCM) (N = 167) Pretest 55.71 (19.94) Posttest 89.00 (23.56) LWI (N = 167) Pretest 36.57 (3.30) 41.21 Posttest (3.09) RF (N = 167) Pretest 21.07 (5.31) Posttest 27.93 (4.70) PC (N = 167) Pretest 18.14 (2.18) Posttest 23.50 (2.98) WA (N = 167) Pretest 13.71 (4.66) Posttest 18.21 (5.54) CBM Slopea (N = 165) 2.1 (0.35) a
RR Med M (SD)
RR High M (SD)
WR Low M (SD)
WR Med M (SD)
WR High M (SD)
BU Low M (SD)
BU Med M (SD)
BU High M (SD)
76.43 (17.44) 102.86 (19.22)
114.05 (28.35) 132.19 (34.00)
57.22 (16.70) 84.65 (20.69)
92.14 (17.61) 117.93 (17.87)
113.50 (24.11) 133.78 (23.50)
57.85 (26.05) 76.55 (28.62)
81.72 (13.97) 105.11 (15.89)
108.33 (23.88) 120.67 (34.82)
41.14 (1.68) 44.95 (3.25)
45.86 (3.28) 48.95 (3.20)
37.96 (1.74) 41.65 (2.74)
41.50 (1.99) 44.29 (3.17)
47.89 (3.66) 49.83 (4.05)
37.35 (2.39) 40.60 (3.10)
41.61 (1.42) 43.89 (2.37)
46.72 (3.46) 48.33 (4.89)
25.33 (4.32) 30.52 (5.14)
33.95 (5.52) 38.62 (7.90)
20.57 (5.82) 28.09 (5.99)
29.14 (4.40) 36.29 (3.27)
32.83 (5.46) 38.44 (5.01)
19.60 (6.16) 26.05 (6.63)
26.28 (4.51) 32.94 (4.80)
32.72 (6.92) 35.83 (9.02)
22.95 (2.46) 25.95 (2.46)
26.76 (1.34) 28.10 (2.30)
19.26 (1.68) 22.74 (3.18)
23.36 (1.98) 26.36 (2.68)
25.89 (2.27) 27.17 (2.68)
18.65 (2.25) 21.00 (3.20)
23.44 (1.72) 24.83 (2.81)
25.83 (2.26) 25.72 (3.30)
17.00 (4.11) 18.91 (3.88)
19.38 (2.73) 22.14 (2.55)
11.83 (3.88) 14.22 (4.83)
14.79 (4.06) 17.07 (4.70)
20.22 (3.89) 22.28 (4.55)
11.90 (4.04) 16.00 (3.58)
15.00 (3.25) 17.94 (3.65)
19.22 (4.76) 21.06 (5.18)
1.68 (0.28)
2.1 (0.35)
1.54 (0.28)
1.19 (0.28)
1.47 (0.28)
1.47 (0.28)
1.47 (0.28)
0.77 (0.28)
CBM Slope is reported in words gained per week. Obtained slopes were multiplied by 7 to reflect words gained per week.
group (LWI = 0.440, t(55) = 5.579, p b 0.001; PC = 0.137, t(55) = 2.591, p = 0.005; ORF = 0.525, t(55) = 7.921, p b 0.001). In addition, there was a significant Time × Skill interaction. Post hoc analyses revealed that, although all skill level groups showed statistically significant improvement from pre- to posttest, the effect sizes associated with these differences was greater for the lowest-level (LWI = 0.686, t(56) = 11.048, p b 0.001; PC = 0.552, t(56) = 12.647, p b 0.001; ORF = 0.727, t(56) = 12.219, p b 0.001) and middle-level students (LWI = 0.563, t(52) = 8.184, p b 0.001; PC = 0.472, t(52) = 10.514, p b 0.001; ORF = 0.789, t(52) = 13.935, p b 0.001), while the effect size was smaller for the highest-performing students (LWI = 0.406, t(56) = 6.187, p b 0.001; PC = 0.084, t(56) = 6.444, p b 0.001; ORF = 0.445, t(56) = 6.703, p b 0.001). For WA, there were two statistically significant main effects. First, there was a statistically significant main effect for time, such that students improved from pre- to posttest. Second, there was a statistically significant main effect for skill level, such that Table 3 Multivariate statistics of achievement assessments across time, intervention, and skill level. F
p
η2p
(2, 158) (2, 158) (1, 158) (4, 158)
2.11 108.08 472.14 4868.49
0.125 b0.001 b0.001 b0.001
0.026 0.578 0.749 0.992
(2, 158) (2, 158) (4, 158) (8, 312) (8, 312) (4, 155) (16, 632) (4, 158) (8, 312) (8, 312) (16, 632)
4.29 10.20 0.79 2.52 14.02 74.61 1.24 0.75 2.73 1.71 0.56
0.015 b0.001** 0.531 0.011 b0.001** b0.001** 0.231 0.557 0.006 0.095 0.913
0.051 0.114 0.020 0.061 0.264 0.658 0.030 0.019 0.065 0.042 0.014
df Multivariate Statistics Main Effects Intervention: Skill: Time: Task: Interactions Time × Intervention: Time × Skill: Intervention × Skill: Task × Intervention: Task × Skill: Task × Time: Task × Skill × Intervention: Time × Skill × Intervention: Time × Task × Intervention: Time × Task × Skill: Time × Skill × Intervention × Task: Note: *pb.05, **pb.01.
S.P. Ardoin et al. / Journal of School Psychology 59 (2016) 13–38
25
Table 4 Summary of ANOVA statistics for achievement assessments across time, intervention, and skill level. df ORF (WRCM) (N = 167) Main Effects Intervention Skill Time Interactions Time × Intervention: Time × Skill: Intervention × Skill: Time × Skill × Intervention: WJ-III LWI (N = 167) Main Effects Intervention Skill Time Interactions Time × Intervention: Time × Skill: Intervention × Skill: Time × Skill × Intervention: WJ-III RF (N = 167) Main Effects Intervention Skill Time Interactions Time × Intervention: Time × Skill: Intervention × Skill: Time × Skill × Intervention: WJ-III PC (N = 167) Main Effects Intervention Skill Time Interactions Time × Intervention: Time × Skill: Intervention × Skill: Time × Skill × Intervention: WJ-III WA (N = 167) Main Effects Intervention Skill Time Interactions Time × Intervention: Time × Skill: Intervention × Skill: Time × Skill × Intervention: CBM Slopea (N = 165) Main Effects Intervention Skill Interactions Intervention × Skill:
F
p
η2p
(2, 158) (2, 158) (1, 158)
1.87 72.14 325.95
0.157 b0.001⁎⁎ b0.001⁎⁎
0.023 0.477 0.674
(2, 158) (2, 158) (4, 158) (4, 158)
3.65 5.74 0.89 0.71
0.028⁎ 0.004⁎⁎ 0.472 0.588
0.044 0.068 0.022 0.018
(2, 158) (2, 158) (1, 158)
1.36 1433.16 209.99
0.259 b0.001⁎⁎ b0.001⁎⁎
0.017 0.644 0.571
(2, 158) (2, 158) (4, 158) (4, 158)
4.44 5.37 0.42 0.01
0.013⁎ 0.006⁎⁎ 0.797 1.0
0.053 0.064 0.010 0.000
(2, 158) (2, 158) (1, 158)
1.92 64.25 266.97
0.149 b0.001⁎⁎ b0.001⁎⁎
0.024 0.449 0.628
(2, 158) (2, 158) (4, 158) (4, 158)
1.37 4.37 1.48 0.58
0.258 0.014⁎ 0.210 0.679
0.017 0.052 0.036 0.014
(2, 158) (2, 158) (1, 158)
3.89 124.61 111.26
0.023⁎ b0.001⁎⁎ b0.001⁎⁎
0.047 0.612 0.413
(2, 158) (2, 158) (4, 158) (4, 158)
7.33 14.64 0.79 0.68
0.001⁎⁎ b0.001⁎⁎ 0.533 0.604
0.085 0.156 0.020 0.017
(2, 158) (2, 158) (1, 158)
2.83 44.27 80.59
0.062 b0.001⁎⁎ b0.001⁎⁎
0.035 0.359 0.338
(2, 158) (2, 158) (4, 158) (4, 158)
0.69 2.25 1.14 0.73
0.505 0.109 0.341 0.576
0.009 0.028 0.028 0.018
(2, 156) (2, 156)
2.48 3.98
0.087 0.021⁎
0.031 0.049
(4, 156)
0.93
0.448
0.023
a
CBM Slope is reported in words gained per week. Obtained slopes were multiplied by 7 to reflect words gained per week. ⁎ p b 0.05. ⁎⁎ p b 0.01.
better performance was associated with each higher level of skill. There were no statistically significant effects associated with condition. For RF, as with all the achievement measures, students statistically significantly improved their performance from pre- to posttest, and statistically significant differences were noted across all three skill levels. In addition, there was a statistically significant Time × Skill level interaction. Post hoc analyses revealed that, while all skill level groups showed improvement from pre- to posttest, the effect sizes associated with these differences were greater for the lowest-level (0.741, t(56) = 12.647, p b 0.001) and middle-level students (0.680, t(52) = 10.514, p b 0.001), while the effect size was smaller for the highest-
26
S.P. Ardoin et al. / Journal of School Psychology 59 (2016) 13–38
performing students (0.426, t(56) = 6.444, p b 0.001). As with WA, there were no statistically significance effects for condition for RF. Slope data (i.e., words gained per week) for the CBM-R progress monitoring assessments completed across the 9–10 weeks of intervention were also analyzed using a 3 (Intervention Group) × 3 (Skill Level) ANOVA. The lowest-level students had a higher rate of growth in ORF as compared to the highest-level students, but no other effects were significant (see Table 4). 3.4. Prosody analyses One participant's data file (from the WR condition) was missing due to a technical problem with the recording equipment. Prosody analyses were conducted on pause durations, pitch changes, and intrusions. As with the achievement data, prior to analyzing the data, outliers were identified using the outlier labeling rule, and winsorizing was used to deal with the outliers. Outliers represented a very small proportion of the data: pause = 0.014, intrusions = 0.01, and pitch = 0.003. Prosody data were initially analyzed via a 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) × 9 (Measure) MANOVA. This analysis produced significant effects for Measure, Time × Intervention, Time × Skill, Measure × Skill, Measure × Time, and Time × Measure × Skill (see Table 5). A 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) × n (Type: 4 for pause manipulation, 2 for the intrusion measure, and 3 for the pitch measure) mixed ANOVA was conducted for each variable as follow up to the MANOVA (Table 6). Means, test statistics, and effect sizes are presented in tables 6 and 7. For both the pause and pitch data, the sphericity assumption was violated, so the values presented in Table 6 represent the Greenhouse-Geiser values. 
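As a rough illustration of what the Greenhouse-Geisser adjustment does to the degrees of freedom reported in Table 6: both the effect and the error degrees of freedom for a within-subject factor are multiplied by a sphericity estimate ε. The sketch below assumes ε values back-calculated from the reported degrees of freedom (about 0.879 for pause type and about 0.811 for pitch type). The article does not report ε itself, and the function name is hypothetical.

```python
def greenhouse_geisser_df(levels, between_error_df, epsilon):
    """Return (effect df, error df) for a within-subject factor after
    scaling both by the sphericity estimate epsilon."""
    effect_df = epsilon * (levels - 1)
    # Uncorrected error df is (levels - 1) * between_error_df; epsilon scales it too.
    error_df = effect_df * between_error_df
    return effect_df, error_df


# Pause type has 4 levels; the between-subject error df in Table 6 is 157.
pause_df = greenhouse_geisser_df(4, 157, 0.87901)  # epsilon is an assumed value

# Pitch type has 3 levels; the between-subject error df in Table 6 is 152.
pitch_df = greenhouse_geisser_df(3, 152, 0.81129)  # epsilon is an assumed value
```

Under these assumed ε values the results land close to the fractional degrees of freedom listed for the Type main effects, (2.637, 414.016) for pauses and (1.623, 246.633) for pitch.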
For the pause analyses, there were four different types of pause durations that were evaluated: (a) pauses after commas that separated items in a list (e.g., red, white, and blue), (b) pauses after commas that ended a phrase, (c) pauses at sentence boundaries, and (d) pauses at paragraph boundaries. For intrusions, the number of times a student stumbled on a word (word intrusion) or paused at an inappropriate place in a phrase (i.e., not at a comma or sentence boundary) was counted. Finally, for the pitch analyses, the pitch change in declarative sentences, wh-questions, and yes/no questions was measured. For both pauses and intrusions, there were statistically significant main effects of time such that readers made shorter pauses and had fewer intrusions over time, and the effect sizes were quite large. There were also statistically significant main effects for skill level such that there were sizable and significant differences among each of the skill level groups. There were also statistically significant main effects of the types of pauses and intrusions made by students. That is, readers paused the longest at paragraph boundaries, followed by sentence boundaries, then at commas that separated clauses, and least at commas separating items in a list. All of those differences were statistically significant. In addition, readers had statistically more sentence intrusions than word intrusions. Although there was no statistically significant main effect of the intervention for the intrusion measure, there was a statistically significant main effect for the pause measure, with students in the WR condition pausing significantly longer than students in the BU and RR conditions. For pauses, there were several statistically significant two-way interactions.
Time and skill level interacted such that all skill level groups displayed a statistically significant decrease in pause length from pre- to posttest, but the lowest-skilled (Mdiff = 203 ms, SDdiff = 296, t(56) = 5.157, p < 0.001) and medium-skilled (Mdiff = 219 ms, SDdiff = 208, t(52) = 7.661, p < 0.001) students showed a bigger decrease compared to the highest-skilled students (Mdiff = 100 ms, SDdiff = 201, t(55) = 3.744, p < 0.001). Time also statistically interacted with type of pause. The biggest pre- to posttest differences were found for commas that separated items in a list (Mdiff = 212 ms, SDdiff = 294, t(165) = 9.296, p < 0.001) and for paragraph boundaries (Mdiff = 219 ms, SDdiff = 450, t(165) = 6.267, p = 0.001); however, statistically significant decreases were also observed for clause (Mdiff = 125 ms, SDdiff = 328, t(165) = 4.904, p < 0.001) and sentence (Mdiff = 137 ms, SDdiff = 353, t(165) = 4.999, p < 0.001) boundaries.

Table 5. Multivariate statistics of prosody parameters across time, intervention, skill level, and type.
| Effect | df | F | p | ηp² |
|---|---|---|---|---|
| Main Effects | | | | |
| Intervention | (2, 152) | 0.58 | 0.561 | 0.008 |
| Skill | (2, 152) | 2.13 | 0.123 | 0.027 |
| Time | (1, 152) | 0.05 | 0.832 | 0.000 |
| Measures | (8, 145) | 206.15 | <0.001** | 0.919 |
| Interactions | | | | |
| Time × Intervention | (2, 152) | 5.26 | 0.006 | 0.065 |
| Time × Skill | (2, 152) | 3.10 | 0.048 | 0.039 |
| Intervention × Skill | (4, 152) | 0.66 | 0.617 | 0.017 |
| Measure × Intervention | (16, 292) | 0.88 | 0.591 | 0.046 |
| Measure × Skill | (16, 292) | 5.23 | <0.001** | 0.223 |
| Measure × Time | (8, 145) | 15.85 | <0.001** | 0.466 |
| Measure × Skill × Intervention | (32, 592) | 0.69 | 0.901 | 0.036 |
| Time × Skill × Intervention | (4, 152) | 1.16 | 0.333 | 0.030 |
| Time × Measure × Intervention | (16, 292) | 1.77 | 0.034 | 0.088 |
| Time × Measure × Skill | (16, 292) | 1.84 | 0.026 | 0.092 |
| Time × Skill × Intervention × Measure | (32, 592) | 0.68 | 0.908 | 0.036 |
*p < .05, **p < .01.
Table 6. Summary of ANOVA statistics for prosody parameters across time, intervention, skill level, and type.
| Effect | df | F | p | ηp² |
|---|---|---|---|---|
| Pause Duration (ms) (N = 166) | | | | |
| Main Effects | | | | |
| Intervention | (2, 157) | 3.31 | 0.039* | 0.040 |
| Skill | (2, 157) | 28.61 | <0.001** | 0.267 |
| Time | (1, 157) | 87.98 | <0.001** | 0.359 |
| Type | (2.637, 414.016) | 100.40 | <0.001** | 0.390 |
| Interactions | | | | |
| Time × Intervention | (2, 157) | 0.76 | 0.470 | 0.010 |
| Time × Skill | (2, 157) | 4.02 | 0.020* | 0.049 |
| Time × Type | (2.761, 433.488) | 4.35 | 0.005** | 0.027 |
| Intervention × Skill | (4, 157) | 1.99 | 0.098 | 0.048 |
| Intervention × Type | (5.274, 414.016) | 0.80 | 0.568 | 0.010 |
| Type × Skill | (5.274, 414.016) | 2.02 | 0.071 | 0.025 |
| Time × Skill × Intervention | (4, 157) | 1.07 | 0.375 | 0.026 |
| Type × Skill × Intervention | (10.584, 414.016) | 0.71 | 0.720 | 0.018 |
| Time × Type × Intervention | (5.522, 433.488) | 1.22 | 0.295 | 0.015 |
| Time × Skill × Type | (5.522, 433.488) | 0.528 | 0.79 | 0.007 |
| Time × Intervention × Skill × Type | (11.044, 433.488) | 0.84 | 0.602 | 0.021 |
| Intrusions (#) (N = 165) | | | | |
| Main Effects | | | | |
| Intervention | (2, 156) | 2.00 | 0.139 | 0.025 |
| Skill | (2, 156) | 62.15 | <0.001** | 0.443 |
| Time | (1, 156) | 269.46 | <0.001** | 0.633 |
| Type | (1, 156) | 171.29 | <0.001** | 0.523 |
| Interactions | | | | |
| Time × Intervention | (2, 156) | 0.79 | 0.458 | 0.010 |
| Time × Skill | (2, 156) | 19.15 | <0.001** | 0.197 |
| Time × Type | (1, 156) | 21.74 | <0.001** | 0.122 |
| Intervention × Skill | (4, 156) | 0.59 | 0.669 | 0.015 |
| Intervention × Type | (2, 156) | 2.72 | 0.069 | 0.034 |
| Type × Skill | (2, 156) | 7.27 | 0.001** | 0.085 |
| Time × Skill × Intervention | (4, 156) | 1.13 | 0.345 | 0.028 |
| Type × Skill × Intervention | (4, 156) | 0.63 | 0.644 | 0.016 |
| Time × Type × Intervention | (2, 156) | 0.10 | 0.909 | 0.001 |
| Time × Skill × Type | (2, 156) | 0.24 | 0.791 | 0.003 |
| Time × Intervention × Skill × Type | (4, 156) | 0.07 | 0.990 | 0.002 |
| Pitch Change (Hz) (N = 161) | | | | |
| Main Effects | | | | |
| Intervention | (2, 152) | 0.54 | 0.587 | 0.007 |
| Skill | (2, 152) | 2.62 | 0.076 | 0.033 |
| Time | (1, 152) | 0.001 | 0.978 | 0.000 |
| Type | (1.623, 246.633) | 655.12 | <0.001** | 0.812 |
| Interactions | | | | |
| Time × Intervention | (2, 152) | 5.23 | 0.006** | 0.064 |
| Time × Skill | (2, 152) | 3.25 | 0.041* | 0.041 |
| Time × Type | (1.638, 249.011) | 0.36 | 0.655 | 0.002 |
| Intervention × Skill | (4, 152) | 0.65 | 0.629 | 0.017 |
| Intervention × Type | (3.245, 246.633) | 0.57 | 0.682 | 0.007 |
| Type × Skill | (3.245, 246.633) | 1.60 | 0.173 | 0.021 |
| Time × Skill × Intervention | (4, 152) | 1.14 | 0.338 | 0.029 |
| Type × Skill × Intervention | (6.490, 246.633) | 0.68 | 0.738 | 0.017 |
| Time × Type × Intervention | (3.276, 249.011) | 1.27 | 0.280 | 0.016 |
| Time × Skill × Type | (3.276, 249.011) | 0.53 | 0.714 | 0.007 |
| Time × Intervention × Skill × Type | (6.553, 249.011) | 0.31 | 0.961 | 0.008 |
* p < 0.05. ** p < 0.01.
Table 7. Summary of descriptive statistics for prosody parameters across time, intervention, skill level, and type. Values are M (SD).
| Measure | Time | RR Low | RR Med | RR High | WR Low | WR Med | WR High | BU Low | BU Med | BU High |
|---|---|---|---|---|---|---|---|---|---|---|
| Pause Duration (ms) (N = 166) | | | | | | | | | | |
| Adjective Comma (ms) | Pretest | 871.23 (324.38) | 666.51 (421.71) | 397.10 (215.14) | 797.46 (388.98) | 547.80 (317.25) | 308.07 (149.21) | 979.19 (474.09) | 666.83 (244.47) | 437.57 (207.35) |
| | Posttest | 566.02 (253.40) | 486.66 (260.40) | 254.41 (173.27) | 559.95 (340.79) | 329.35 (121.34) | 228.59 (120.11) | 714.47 (406.14) | 334.12 (130.34) | 283.06 (133.45) |
| Clause Comma (ms) | Pretest | 705.11 (342.91) | 775.83 (356.39) | 486.62 (231.52) | 799.27 (460.80) | 644.10 (266.61) | 417.99 (152.60) | 1012.54 (495.30) | 591.26 (246.22) | 534.58 (219.78) |
| | Posttest | 490.14 (198.02) | 602.78 (317.76) | 513.78 (293.06) | 615.80 (452.11) | 463.23 (124.85) | 423.96 (143.56) | 820.54 (444.53) | 487.25 (281.94) | 390.92 (194.66) |
| Sentence Period (ms) | Pretest | 928.33 (270.66) | 864.52 (327.06) | 695.22 (300.37) | 796.97 (501.18) | 664.81 (254.90) | 527.37 (201.11) | 1010.94 (493.83) | 678.96 (227.27) | 632.02 (294.15) |
| | Posttest | 615.67 (250.50) | 554.65 (298.56) | 615.85 (358.12) | 750.15 (402.45) | 466.35 (206.71) | 442.19 (190.09) | 930.00 (597.30) | 613.48 (260.03) | 483.60 (237.71) |
| Paragraph Period (ms) | Pretest | 1351.25 (530.88) | 1129.36 (525.72) | 832.40 (424.24) | 1059.57 (373.75) | 954.99 (398.78) | 738.22 (232.64) | 1281.93 (479.70) | 954.11 (399.62) | 819.06 (366.22) |
| | Posttest | 882.62 (266.62) | 799.39 (429.69) | 759.80 (412.95) | 896.25 (457.00) | 646.09 (267.20) | 606.48 (222.59) | 1115.92 (548.00) | 732.71 (415.85) | 624.48 (191.56) |
| Intrusions (#) (N = 165) | | | | | | | | | | |
| Word Intrusions | Pretest | 1.72 (0.95) | 1.01 (0.56) | 0.48 (0.38) | 1.45 (1.24) | 0.63 (0.33) | 0.47 (0.39) | 1.52 (0.85) | 0.75 (0.44) | 0.55 (0.54) |
| | Posttest | 1.01 (0.48) | 0.56 (0.33) | 0.39 (0.34) | 0.84 (0.51) | 0.46 (0.22) | 0.41 (0.29) | 0.98 (0.69) | 0.51 (0.31) | 0.39 (0.33) |
| Sentence Intrusions | Pretest | 2.46 (0.89) | 1.91 (0.60) | 0.95 (0.72) | 2.22 (0.70) | 1.56 (0.53) | 0.96 (0.42) | 2.74 (1.19) | 1.70 (0.60) | 1.24 (0.82) |
| | Posttest | 1.49 (0.45) | 1.11 (0.58) | 0.59 (0.44) | 1.41 (0.71) | 0.96 (0.46) | 0.51 (0.35) | 1.90 (0.88) | 1.10 (0.50) | 0.70 (0.39) |
| Pitch Change (Hz) (N = 161) | | | | | | | | | | |
| Declarative Pitch Change | Pretest | −65.71 (33.20) | −63.87 (40.32) | −57.11 (43.02) | −98.87 (92.80) | −81.43 (58.71) | −81.68 (48.39) | −78.23 (42.72) | −60.06 (43.82) | −60.50 (37.84) |
| | Posttest | −74.44 (61.49) | −74.12 (45.18) | −77.63 (48.95) | −66.87 (38.23) | −54.30 (44.10) | −70.70 (45.59) | −72.78 (30.68) | −60.45 (45.05) | −72.87 (37.22) |
| Wh- Question Pitch Change | Pretest | 59.25 (69.57) | 71.99 (65.48) | 77.77 (63.31) | 56.12 (37.17) | 45.70 (54.51) | 77.29 (63.62) | 55.39 (47.24) | 68.88 (64.16) | 87.37 (66.66) |
| | Posttest | 36.55 (32.46) | 53.06 (46.40) | 58.28 (60.36) | 73.63 (56.43) | 64.24 (55.55) | 91.57 (66.26) | 74.50 (63.08) | 44.70 (47.44) | 68.44 (69.88) |
| Yes/No Pitch Change | Pretest | 56.74 (53.27) | 73.10 (50.95) | 55.59 (39.18) | 60.18 (52.63) | 49.27 (36.74) | 68.27 (47.97) | 43.19 (37.38) | 64.74 (52.35) | 80.48 (53.03) |
| | Posttest | 65.68 (56.79) | 60.42 (36.02) | 58.81 (46.01) | 67.66 (47.57) | 38.48 (19.09) | 71.48 (44.17) | 83.07 (53.29) | 543.82 (42.48) | 61.33 (33.30) |

Finally, pause type and skill level produced a statistically significant interaction. Statistically, the lowest-skilled students paused the longest at paragraph boundaries compared to other types of pauses (adjective, t(56) = 7.082, p < 0.001; clause, t(56) = 7.317, p < 0.001; sentence, t(56) = 6.951, p < 0.001), yet there were no statistically significant differences between the other three pause types. Medium-skilled students, however, displayed a more nuanced understanding of punctuation than did low-skilled students. Statistically, they made the shortest pauses after commas separating items in a list, followed by commas separating clauses and sentence boundaries (adjective vs. clause, t(52) = 3.139, p = 0.003; adjective vs. sentence, t(52) = 4.603, p < 0.001; adjective vs. paragraph, t(52) = 8.916, p < 0.001), with no statistical difference between clause and sentence pauses (t(52) = 1.582, p = 0.120). They also had the longest pauses at paragraph boundaries (clause vs. paragraph, t(52) = 6.164, p < 0.001; sentence vs. paragraph, t(52) = 6.580, p < 0.001). The highest-skilled students displayed the most sophisticated understanding of punctuation. Statistically, they made the shortest pauses after commas separating items in a list, followed by commas separating clauses, sentence boundaries, and then paragraph boundaries. All differences were statistically significant (adjective vs. clause, t(55) = 6.825, p < 0.001; adjective vs. sentence, t(55) = 8.380, p < 0.001; adjective vs. paragraph, t(55) = 11.692, p < 0.001; clause vs. sentence, t(55) = 3.963, p < 0.001; clause vs. paragraph, t(55) = 9.866, p < 0.001; sentence vs. paragraph, t(55) = 5.600, p < 0.001; see Table 7). Intervention group did not statistically interact with any of the other variables.
There were also several statistically significant two-way interactions for the intrusion measure. Time statistically interacted with skill level such that, while all of the differences from pre- to posttest were statistically significant, the biggest difference occurred for the lowest-skilled students (Mdiff = 0.730, SDdiff = 0.471, t(56) = 11.718, p < 0.001) compared to the medium-skilled (Mdiff = 0.497, SDdiff = 0.379, t(51) = 9.445, p < 0.001) and highest-skilled students (Mdiff = 0.282, SDdiff = 0.281, t(55) = 7.496, p < 0.001). Time also significantly interacted with type of intrusion; although both improved statistically over time, the difference for sentence intrusions (Mdiff = 0.665, SDdiff = 0.622, t(164) = 13.730, p < 0.001) was greater than the difference for word intrusions (Mdiff = 0.345, SDdiff = 0.593, t(164) = 7.473, p < 0.001). Finally, type of intrusion statistically interacted with skill level. Though participants at all three skill levels always had significantly more sentence than word intrusions, the biggest differences occurred for both low-skilled (Mdiff = 0.799, SDdiff = 0.717, t(56) = 8.409, p < 0.001) and medium-skilled (Mdiff = 0.722, SDdiff = 0.556, t(51) = 9.368, p < 0.001) students as compared to highest-skilled students (Mdiff = 0.375, SDdiff = 0.537, t(55) = 5.228, p < 0.001). Intervention group did not statistically interact with any of the other variables.
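The follow-up comparisons reported above appear to be paired-samples t-tests on difference scores. Assuming that, each t statistic can be approximated from the reported Mdiff and SDdiff as t = Mdiff / (SDdiff / √n), with n = df + 1. The helper below is a hypothetical sketch for checking the arithmetic, not the authors' code, and small gaps from the published values are expected because the summary statistics are rounded.

```python
from math import sqrt


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Approximate a paired-samples t statistic and its df from summary statistics."""
    standard_error = sd_diff / sqrt(n)
    return mean_diff / standard_error, n - 1


# Lowest-skilled students, pre- to posttest change in intrusions:
# Mdiff = 0.730, SDdiff = 0.471, and df = 56 implies n = 57.
t_low, df_low = paired_t_from_summary(0.730, 0.471, 57)
# t_low comes out near 11.7, in line with the reported t(56) = 11.718.
```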
For pitch changes, there were a few statistically significant effects. First, although readers displayed a statistically significant decrease in pitch for declarative sentences, they exhibited a statistically significant increase in pitch for both question types, with no statistical differences between the two types. Second, there was a statistically significant Time × Skill level interaction such that the lowest-level readers became more expressive from pre- to posttest (t(54) = 2.439, p = 0.018), whereas there were no statistical changes in pitch for medium- or highest-skilled students (t(50) = 1.225, p = 0.226, and t(54) = 1.442, p = 0.155, respectively). Finally, time statistically interacted with intervention condition such that participants in the WR condition became statistically more expressive over time (t(51) = 2.371, p = 0.022), those in the RR condition became statistically less expressive over time (t(53) = 2.478, p = 0.016), and those in the BU condition did not experience any changes over time (t(54) = 0.366, p = 0.716).
3.5. Eye movement global analyses
In total, there were 20 missing data files for these analyses. Six of the files (2 BU, 3 WR, and 1 RR) were missing due to track losses. Track losses can occur when the participant makes a large head movement, which results in the tracking system losing calibration. Fourteen of the files (7 BU, 3 WR, and 4 RR) were lost due to non-reading behavior by the children. Non-reading behavior was characterized by shorter gaze durations and total fixation time, more skipping, and in general a more erratic reading pattern than on-task reading behavior. Trained raters identified non-reading behaviors from the eye movement records. (See Nguyen, Binder, Nemier, and Ardoin (2014) for additional information on this behavior.) Analyses were conducted at the global level (i.e., across all words in the passage) across groups of low-, medium-, and highest-performing readers.
Global measures included first fixation duration, gaze duration, total fixation time, number of intraword regressions, average fixation count per word, and average saccade length in number of characters. As with the other data sets, prior to analyzing the data, outliers were identified using the outlier labeling rule, and winsorizing was used to deal with the outliers. Outliers represented a very small proportion of the data: first fixation duration had no outliers, gaze duration = 0.009, total time = 0.018, intraword regression = 0.012, fixation count = 0.006, saccade length = 0.006. We first ran a 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) × 6 (Measure) MANOVA. This analysis produced statistically significant effects for skill, time, measure, Measure × Skill, and Measure × Time (see Table 8). A 2 (Time) × 3 (Intervention Group) × 3 (Skill Level) mixed ANOVA was conducted for each variable as a follow-up to the MANOVA. Means, test statistics, and effect sizes for global measures are presented in Tables 9 and 10. Across all measures, we found large, statistically significant effects of time. Students improved on all measures across the 9–10 week intervention period. The size of the improvement in eye movement measures from pre- to posttest is notable. As expected due to the design specification of grouping participants by achievement scores, analyses across all measures confirmed large, statistically significant between-groups effects. For all measures, with the exception of first fixation duration, there were statistically significant differences among all three skill groups. For first fixation duration, although the highest-skilled students had shorter times compared to the medium- and lowest-skilled groups, there was no statistical difference between the latter two groups. There were two instances in which time and skill level produced statistically significant interactions: intraword regressions and saccade length.
For intraword regressions, readers always statistically improved from pre- to posttest; however, the magnitude of the difference was greater for medium-skilled students, t(47) = 6.604, p < 0.001, than for lowest-skilled, t(43) = 2.454, p = 0.018, or highest-skilled students, t(54) = 3.906, p < 0.001. For saccade length, there was no statistical improvement from pre- to posttest for lowest-skilled students, t(43) = −1.178, p = 0.245; however, both medium-, t(47) = −3.857, p < 0.001, and highest-skilled students, t(54) = −4.297, p < 0.001, did experience improvement over time.

Table 8. Multivariate statistics of global eye movement parameters across intervention by skill level.
| Effect | df | F | p | ηp² |
|---|---|---|---|---|
| Main Effects | | | | |
| Intervention | (2, 138) | 1.11 | 0.334 | 0.016 |
| Skill | (2, 138) | 33.77 | <0.001** | 0.329 |
| Time | (1, 138) | 110.96 | <0.001** | 0.446 |
| Measures | (5, 134) | 9801.84 | <0.001** | 0.997 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 0.52 | 0.598 | 0.007 |
| Time × Skill | (2, 138) | 1.58 | 0.210 | 0.022 |
| Intervention × Skill | (4, 138) | 0.65 | 0.625 | 0.019 |
| Measure × Intervention | (10, 270) | 0.85 | 0.583 | 0.030 |
| Measure × Skill | (10, 270) | 5.86 | <0.001 | 0.178 |
| Measure × Time | (5, 134) | 26.46 | <0.001 | 0.497 |
| Measure × Skill × Intervention | (20, 548) | 0.98 | 0.480 | 0.035 |
| Time × Skill × Intervention | (4, 138) | 0.07 | 0.991 | 0.002 |
| Time × Measure × Intervention | (10, 270) | 1.05 | 0.405 | 0.037 |
| Time × Measure × Skill | (10, 270) | 1.71 | 0.079 | 0.060 |
| Time × Skill × Intervention × Measure | (20, 548) | 1.35 | 0.140 | 0.047 |
*p < .05, **p < .01.
Table 9. Summary of ANOVA statistics for global eye movement parameters across intervention by skill level.
| Effect | df | F | p | ηp² |
|---|---|---|---|---|
| First Fixation Duration (ms) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 1.54 | 0.218 | 0.022 |
| Skill | (2, 138) | 11.18 | <0.001** | 0.139 |
| Time | (1, 138) | 32.86 | <0.001** | 0.192 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 3.40 | 0.036* | 0.047 |
| Time × Skill | (2, 138) | 0.36 | 0.702 | 0.005 |
| Skill × Intervention | (4, 138) | 1.41 | 0.232 | 0.039 |
| Time × Skill × Intervention | (4, 138) | 1.13 | 0.344 | 0.032 |
| Gaze Duration (ms) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 1.12 | 0.331 | 0.016 |
| Skill | (2, 138) | 34.15 | <0.001** | 0.331 |
| Time | (1, 138) | 103.22 | <0.001** | 0.428 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 0.88 | 0.417 | 0.013 |
| Time × Skill | (2, 138) | 2.50 | 0.086 | 0.035 |
| Skill × Intervention | (4, 138) | 0.99 | 0.415 | 0.028 |
| Time × Skill × Intervention | (4, 138) | 0.67 | 0.614 | 0.019 |
| Total Fixation Time (ms) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 0.85 | 0.431 | 0.012 |
| Skill | (2, 138) | 28.54 | <0.001** | 0.293 |
| Time | (1, 138) | 81.64 | <0.001** | 0.372 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 0.51 | 0.603 | 0.007 |
| Time × Skill | (2, 138) | 0.92 | 0.403 | 0.013 |
| Skill × Intervention | (4, 138) | 0.33 | 0.859 | 0.009 |
| Time × Skill × Intervention | (4, 138) | 0.25 | 0.910 | 0.007 |
| Number of Intraword Regressions (#) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 0.30 | 0.740 | 0.004 |
| Skill | (2, 138) | 15.31 | <0.001** | 0.182 |
| Time | (1, 138) | 44.81 | <0.001** | 0.245 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 2.78 | 0.066 | 0.039 |
| Time × Skill | (2, 138) | 3.10 | 0.048* | 0.043 |
| Skill × Intervention | (4, 138) | 0.09 | 0.986 | 0.003 |
| Time × Skill × Intervention | (4, 138) | 0.64 | 0.632 | 0.018 |
| Average Fixation Count per Word (#) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 0.44 | 0.648 | 0.006 |
| Skill | (2, 138) | 14.02 | <0.001** | 0.169 |
| Time | (1, 138) | 17.91 | <0.001** | 0.115 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 0.60 | 0.550 | 0.009 |
| Time × Skill | (2, 138) | 0.61 | 0.544 | 0.009 |
| Skill × Intervention | (4, 138) | 0.14 | 0.965 | 0.004 |
| Time × Skill × Intervention | (4, 138) | 0.88 | 0.481 | 0.025 |
| Average Saccade Length per Word (# of letters) (N = 147) | | | | |
| Main Effects | | | | |
| Intervention | (2, 138) | 1.68 | 0.190 | 0.024 |
| Skill | (2, 138) | 35.50 | <0.001** | 0.207 |
| Time | (1, 138) | 24.77 | <0.001** | 0.152 |
| Interactions | | | | |
| Time × Intervention | (2, 138) | 0.51 | 0.601 | 0.007 |
| Time × Skill | (2, 138) | 4.20 | 0.017* | 0.057 |
| Skill × Intervention | (4, 138) | 0.11 | 0.980 | 0.003 |
| Time × Skill × Intervention | (4, 138) | 1.75 | 0.142 | 0.048 |
* p < 0.05. ** p < 0.01.
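The ηp² column in Table 9 can be recovered from each F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short check below uses a hypothetical function name and the gaze duration and first fixation duration main effects as inputs; it is offered as a way to read the table, not as the authors' computation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Convert an F statistic and its degrees of freedom to partial eta squared."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# Gaze duration main effects from Table 9.
gaze_skill = partial_eta_squared(34.15, 2, 138)   # about 0.331
gaze_time = partial_eta_squared(103.22, 1, 138)   # about 0.428

# First fixation duration main effects from Table 9.
ffd_skill = partial_eta_squared(11.18, 2, 138)    # about 0.139
ffd_time = partial_eta_squared(32.86, 1, 138)     # about 0.192
```

Because ηp² is a function of F and the degrees of freedom alone, the same check can be applied to any row of Tables 3 through 9.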
Aside from the first fixation duration measure, intervention group was not statistically related to any changes in reading behavior across time. For first fixation duration, there was a statistically significant two-way interaction between time and intervention group. Follow-up t-tests demonstrated that there was no statistical change in duration from pre- to posttest for the BU group (t(46) = 1.868, p = 0.068), while students in the RR and WR groups demonstrated a statistically significant decrease in first fixation duration from pre- to posttest (t(50) = 5.516, p < 0.001 and t(48) = 2.453, p = 0.018, respectively).
Table 10
Summary of descriptive statistics for global eye movement parameters across intervention by skill level. Values are M (SD); N = 147 for each measure.

| Measure | Time | RR Low | RR Med | RR High | WR Low | WR Med | WR High | BU Low | BU Med | BU High |
|---|---|---|---|---|---|---|---|---|---|---|
| First Fixation Duration (ms) | Pretest | 316.83 (36.77) | 300.01 (34.88) | 275.03 (34.73) | 300.49 (34.84) | 282.25 (27.04) | 284.30 (31.55) | 287.84 (26.81) | 293.26 (30.54) | 261.77 (27.76) |
| | Posttest | 295.07 (29.05) | 287.32 (37.00) | 259.42 (30.29) | 295.19 (31.00) | 271.31 (25.19) | 274.97 (30.04) | 288.79 (27.73) | 279.81 (31.63) | 257.55 (32.29) |
| Gaze Duration (ms) | Pretest | 530.71 (114.32) | 480.99 (106.64) | 381.84 (76.88) | 515.52 (95.18) | 418.83 (64.48) | 379.57 (55.18) | 481.27 (98.22) | 458.64 (82.35) | 373.88 (76.50) |
| | Posttest | 458.79 (68.98) | 411.86 (69.09) | 337.78 (65.88) | 457.34 (78.38) | 365.48 (41.79) | 351.85 (45.92) | 447.47 (100.74) | 388.76 (54.63) | 333.08 (63.90) |
| Total Fixation Time (ms) | Pretest | 775.54 (202.84) | 693.06 (159.37) | 556.50 (124.94) | 741.39 (165.50) | 623.58 (136.48) | 534.92 (104.15) | 768.21 (263.92) | 648.22 (124.84) | 555.12 (172.76) |
| | Posttest | 687.70 (114.36) | 576.80 (112.27) | 458.63 (126.16) | 668.06 (181.24) | 502.57 (91.13) | 468.43 (101.21) | 638.82 (183.71) | 524.89 (72.94) | 465.02 (116.45) |
| Number of Intraword Regressions (#) | Pretest | 0.30 (0.12) | 0.29 (0.13) | 0.19 (0.10) | 0.35 (0.17) | 0.27 (0.14) | 0.17 (0.09) | 0.36 (0.29) | 0.31 (0.14) | 0.23 (0.14) |
| | Posttest | 0.29 (0.11) | 0.20 (0.10) | 0.15 (0.08) | 0.27 (0.10) | 0.19 (0.10) | 0.15 (0.10) | 0.28 (0.22) | 0.17 (0.07) | 0.14 (0.07) |
| Average Fixation Count per Word (#) | Pretest | 2.14 (0.64) | 1.84 (0.61) | 1.75 (0.38) | 2.14 (0.70) | 1.91 (0.55) | 1.51 (0.50) | 2.15 (0.99) | 2.06 (0.47) | 1.69 (0.76) |
| | Posttest | 2.11 (0.30) | 1.72 (0.36) | 1.49 (0.42) | 1.93 (0.55) | 1.65 (0.41) | 1.45 (0.37) | 2.00 (0.58) | 1.67 (0.32) | 1.46 (0.42) |
| Average Saccade Length per Word (# of letters) | Pretest | 4.60 (1.14) | 4.93 (1.16) | 5.36 (1.01) | 4.53 (1.09) | 5.16 (1.18) | 5.88 (1.31) | 4.63 (0.78) | 4.81 (0.94) | 5.31 (0.98) |
| | Posttest | 4.49 (0.74) | 5.38 (0.84) | 5.92 (1.37) | 5.09 (1.26) | 5.61 (1.20) | 6.38 (1.36) | 4.44 (0.80) | 5.29 (0.82) | 6.23 (1.29) |
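As a reading aid for Table 10, the pre-to-post change in any cell can be expressed as a percentage of the pretest mean. The Python sketch below does this for the low-skill gaze duration cells; the means are copied from the table, whereas the function name and dictionary layout are illustrative assumptions and are not part of the authors' analyses.

```python
# Illustrative sketch only: percent change from pretest to posttest means.
# Means are copied from Table 10; the names and layout are assumptions.

def percent_change(pre: float, post: float) -> float:
    """Return the pre-to-post change as a percentage of the pretest mean."""
    return (post - pre) / pre * 100.0

# (pretest mean, posttest mean) for gaze duration in ms, low-skill groups
gaze_duration_low = {
    "RR": (530.71, 458.79),
    "WR": (515.52, 457.34),
    "BU": (481.27, 447.47),
}

changes = {
    condition: percent_change(pre, post)
    for condition, (pre, post) in gaze_duration_low.items()
}

for condition, pct in changes.items():
    print(f"{condition} Low gaze duration: {pct:+.1f}%")
```

Negative values indicate shorter gaze durations at posttest. These are descriptive summaries of cell means only; they say nothing about statistical significance.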
4. Discussion
Extensive research exists supporting the use of RR as a means of increasing students' reading fluency. However, there are three conspicuous gaps in the literature. First and foremost, to date, there does not exist a randomized controlled trial comparing the effects of RR to another intervention providing elementary students with an equal amount of time practicing reading. Given the wide use of RR within schools for promoting students' fluency, such research would seem to be a necessity. Another missing component of RR research addressed by this study is an examination of the impact of RR (or, for that matter, any fluency-based intervention) on students' reading prosody across an extended period of time. No study to date has measured within-group changes in prosody as a function of time using objective measures of changes in pitch and pause durations. There are also no existing applied research studies that examine changes in students' eye movements during reading as a function of improvements in fluency across time. Finally, researchers have yet to evaluate whether the benefits of RR and WR might vary depending on students' developmental reading fluency levels. To address these limitations, we conducted a randomized controlled trial in which students received 9–10 weeks of either RR or WR or engaged in BU. The primary finding across the measures employed was that students across conditions showed substantial growth. For instance, students across conditions gained an average of 2.28 words per week according to pre-post CBM-R data, which exceeds expected published rates of gain for second-grade students (1.66 words gained per week; Deno, Fuchs, Marston, & Shin, 2001). Surprisingly, despite providing students in the two intervention conditions with 12–15 min daily of actual reading time (not including preparation time and time spent responding to comprehension questions), large intervention effects were not observed.
That is, while we obtained intervention effects for a few measures (LWI, PC, ORF, and first fixation duration), the benefits of the intervention did not surface across the many remaining measures. Although results from this randomized controlled trial study provide some evidence to suggest that students receiving intervention made greater gains than students in the BU condition, intervention effects were neither consistent across measures nor exceptionally large. There were, however, no incidents in which students in the BU condition made greater gains than those of students in either intervention condition. The only consistent finding across measures was that all students, regardless of achievement level and condition, made substantial growth between pre- and posttest.
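The growth-rate comparison reported above (2.28 versus 1.66 words gained per week) is simple arithmetic on pre- and posttest CBM-R scores. The Python sketch below illustrates the calculation; only the 2.28 and 1.66 figures come from the text, and the pre/post scores and the 10-week duration are hypothetical values chosen so that the example reproduces the reported average.

```python
# Illustrative sketch: weekly CBM-R growth compared with a published benchmark.
# The benchmark is the second-grade rate cited in the text (Deno et al., 2001).
BENCHMARK_GAIN_PER_WEEK = 1.66

def weekly_gain(pre_score: float, post_score: float, weeks: float) -> float:
    """Average number of words gained per week between pretest and posttest."""
    return (post_score - pre_score) / weeks

# Hypothetical scores (not study data), chosen to yield the reported 2.28.
gain = weekly_gain(pre_score=80.0, post_score=102.8, weeks=10)
ratio = gain / BENCHMARK_GAIN_PER_WEEK

print(f"Observed gain: {gain:.2f} words per week")
print(f"Ratio to benchmark: {ratio:.2f}")
```

A ratio above 1 means the observed weekly gain exceeds the benchmark rate.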
4.1. Achievement measures
Analyses indicated that, across all measures, students made large and significant gains in reading achievement from pre- to posttest. Interestingly, across measures of ORF, LWI, PC, and RF, gains were generally greatest for the lowest-skilled students, followed by the medium-skilled students, then the highest-skilled students. The only measure for which there was not a time by skill level interaction was WA. Greater improvements among the lower-achieving students are likely a function of those students having greater room for improvement along with factors associated with regression to the mean among the highest- and lowest-achieving students. Although a time by intervention condition interaction was not observed across all of the achievement measures, analyses of the LWI, PC, and ORF measures suggested that students in the RR and WR conditions made greater gains across time than did students in the BU condition. There were not, however, any differences observed between the two intervention conditions (RR vs. WR). It is also interesting to note that there was not a Time × Condition × Skill Level interaction, suggesting that there were no differences in the benefits of RR versus WR for students according to their skill levels. Although RR is an intervention generally provided to students struggling with reading, as suggested by the National Reading Panel (NICHD, 2000), it seems that it might be beneficial to students still developing reading fluency, regardless of their level of reading achievement. It is, however, important to remember that there were no differences in the level of growth made by students in the RR versus WR condition, suggesting that students might simply benefit from extra time spent reading aloud to an adult with immediate corrective feedback. Essentially, there may be nothing magical about rereading materials during these intervention periods for students with low average to above average reading achievement.
Given the wide use of CBM-R progress monitoring procedures in schools for evaluating intervention effects, we were interested in the discrepancy in findings based on CBM-R data: although CBM-R pre-post and progress monitoring data both indicated gains in student achievement, between-group comparisons differed by measure. Specifically, pre-post data using a set of three FAIP-R CBM-R passages suggested greater gains for students in the two intervention conditions as compared to BU, but CBM-R progress monitoring data did not suggest differences. Although the purpose of this study was not to evaluate which measurement procedure should be employed for evaluating the effects of intervention on student progress, these findings strongly support the need for schools to use multiple measures when evaluating the effect of interventions on student gains.

4.2. Reading prosody
Considering past research, we assumed that, regardless of intervention condition, improvements would be observed in students' achievement levels across the intervention period. We were, however, less certain as to whether differences in prosody would be observed across time, as this was the first study to measure changes in prosody objectively across a short period of time (9–10 weeks). In an effort to evaluate changes in students' prosody between pre- and posttest, we employed measures of pause durations, intrusions, and pitch changes, all of which have been associated with differences in students' levels of comprehension and achievement. Our results supported past research, indicating differences in students' level of sophistication in prosody as a function of their skill levels (Miller & Schwanenflugel, 2006; Valle et al., 2013).
For instance, whereas the lowest-skilled students only exhibited significantly different pause durations at paragraph boundaries compared to all other measured locations, significant differences between the durations of all four pause types (i.e., separating list items, at commas separating clauses, at sentence boundaries, and at paragraph boundaries) were observed among the highest-skilled students. Results also indicated significant differences as a function of time (i.e., from pre- to posttest), with significant decreases in pause durations and intrusions as well as changes in the expected direction in regards to pitch. Interestingly, similar to improvements in our measures of achievement, there were significant Time × Skill level interactions. Low- and medium-skilled students made greater improvements in pause durations than did the highest-skilled students, the lowest-skilled students made greater improvements in intrusions than did both the medium- and highest-skilled students, and lowest-skilled students became more expressive while medium- and highest-skilled students exhibited no overall changes in pitch from pre- to posttest. A Time × Intrusion Type interaction was also observed, with students displaying greater decreases in sentence versus word intrusions. This interaction was possibly due to the small number of word intrusions made by students; however, it also demonstrates that, as students' reading skills improve, their reading becomes less effortful and more fluent, allowing for greater cognitive resources to be dedicated to text comprehension. In summary, these data provide initial evidence that, even within short timeframes, students' reading expression improves along with reading achievement. Differences in growth made by students as a function of intervention were only observed in regards to changes in pitch. 
Results regarding pre- to posttest changes in students' level of expressiveness indicated no differences for students in the BU condition, an increase in expressiveness for students in the WR condition, and a decrease in expressiveness for students in the RR condition. These findings are interesting in light of the fact that Ardoin, Morena et al. (2013) reported differences in students' reading prosody as a function of the feedback given to them as part of an RR intervention. In that study, effects on pitch were not observed for students who received feedback regarding reading speed, but changes were observed for students who were given feedback regarding reading prosody. Although current participants in both RR and WR conditions were provided with feedback regarding rate, the extent to which feedback may have reinforced students' fast reading was likely greater in the RR condition. During each intervention session, students in the RR condition first read for a selected amount of time (e.g., 5 min) and then read the completed section of text three additional times; following each reading, they were provided with feedback regarding the amount of time they took to read the passage. Typically, this amount of time decreased with each reading; thus, feedback
was generally positive. In contrast, WR sessions were divided into four equivalent segments of time, with the sum of those segments equaling the cumulative reading time completed by the RR students to whom WR students were yoked. Following each reading segment, WR students were informed of the number of words they read during that segment. Since students in the WR condition were always reading novel text, the number of words they read across the four reading segments each day did not necessarily improve; therefore, feedback was not always positive. Given that performance feedback regarding rate did not expose students in the WR condition to the same dense schedule of reinforcement as that in the RR condition, WR students may not have attended to rate to the same extent as did students in the RR condition. Thus, these results provide further evidence that focusing students' attention primarily on their reading rate may have detrimental effects on their reading expression.

It is often stated that RR improves the degree to which students read with expression (Dowhower, 1987). However, the extent to which effects on prosody generalize to unpracticed materials and students with varying levels of reading fluency needs to be further investigated. With the exception of Ardoin, Morena et al. (2013), RR studies demonstrating improvements in reading expression have employed subjective measures of prosody and passages initially at students' frustrational levels. Results of this study suggest that generalized improvements in reading expression are likely to occur as a function of RR, but that they are due to generalized improvements in reading achievement rather than repeatedly reading the same materials.

4.3. Eye movement global analyses
Similar to results regarding students' reading prosody, we were uncertain of the extent to which differences might be observed in students' eye movements across the 9–10 week intervention period.
Although previous research examining the effects of intervention on students' reading of a single passage reported significant changes in eye movements across four readings of the same passage (Foster et al., 2013), such gains from isolated practice would not be expected to generalize across an extended period of time; for example, reading rate improvements on a single passage across rereadings (e.g., gain of 35 WRCM observed by Ardoin, Morena et al., 2013) are generally greater than those expected for students across a 10-week period of time (1.5 WRCM gained per week × 10 = 15 WRCM). Although RR of a single passage might be expected to improve students' passage comprehension and, thus, yield significant changes in eye movement measures associated with higher-order processing (Zawoyski et al., 2015), equally significant general gains in comprehension might not be expected across only a 9–10 week period. Nevertheless, results indicated a large significant effect of time across every global eye movement measure. Even though we expected decreases in fixations and fixation lengths from pre- to posttest, it is astonishing how large the effect sizes were after just 9–10 weeks of intervention. Across our eye movement measures, student performance improved by 4%–36% (effect size range: 0.12–0.43), with an average improvement of 15%. These results are the first to demonstrate rapid changes in children's eye movements at this point in their reading development. Also consistent with achievement and prosody data were analyses indicating differences in students' reading behavior (i.e., eye movements) as a function of their skill level. For instance, analyses of global eye movement measures indicated that the lowest-skilled students were associated with longer fixations on words (gaze duration and total fixation time), more fixations on words, more regressions to previously read words, and shorter saccade lengths.
These differences between students are especially interesting considering that participants in this study were generally average- to high-achieving students. These results extend the eye movement research literature by providing evidence that, just as there is substantial variability in the passage reading fluency of students within the same grade, there are large differences between same-grade students in terms of their eye movements during passage reading. Unlike with the achievement and prosody data, Time × Skill Level interactions were inconsistent across global eye movement measures, with few such interactions being significant. Analyses indicated that (a) the magnitude of change from pre- to posttest for the measure of intraword regressions was greater for medium-skilled students, and (b) whereas medium- and high-skilled students increased their saccade lengths, lower-skilled students made no such improvement. Thus, whereas achievement and prosody measures revealed the greatest observed gains among the lowest level of students, these same lower-skilled students were the least likely to exhibit significant changes from pre- to posttest in reading behaviors measured by global eye movements. Similar to analyses of the prosody and CBM-R progress monitoring data, eye movement data do not provide strong evidence that either RR or WR led to greater rates of improvement than did BU. Across global analyses, the only significant Time × Intervention Condition interaction was observed for the measure of first fixation duration, which is generally considered to reflect lower-level processing of text. Analyses suggested that students in both RR and WR conditions significantly decreased their first fixation durations from pre- to posttest, but significant differences were not observed for students in the BU condition.

4.4. Limitations
There are three key limitations that should be considered when interpreting the outcomes of these data.
The primary limitation is the fact that the current participant sample was drawn from a single, relatively high-performing school district. Although participants were drawn from three different schools, the general curriculum to which they were exposed was consistent across schools. Furthermore, participating schools were relatively similar in regards to racial composition, socioeconomic status, and academic performance. A second limitation is that students receiving special education, English to Speakers of Other Languages (ESOL), or gifted services were not allowed to participate in this study. These students were excluded due to the increased difficulties and limited feasibility that we would have faced in ensuring that participants across conditions were receiving equivalent amounts of teacher-led instruction throughout the school day. Unfortunately, as a result, the participant sample in this study was not representative of
the students with whom fluency-based interventions typically are implemented. Nevertheless, the NRP (NICHD, 2000) has emphasized the importance of fluency-based instruction for all students and also has suggested that RR is a suitable means of such instruction for students through fourth grade. Given the level of improvements in fluency observed in this study, it is clear that fluency instruction is appropriate for second-grade students, regardless of achievement level. Due to the high level of personnel and physical (space) resources required to implement one-to-one interventions each morning prior to the beginning of classroom instruction, along with the considerable amount of time associated with pre- and posttest assessment, we were required to conduct this study across four semesters and three schools. Within each school, students were drawn from 4 to 6 different classrooms. Thus, a final limitation of the current study is that we were not able to account for nesting effects in our analyses due to the small number of participants within each cell.
4.5. Summary and implications
This study was funded by the U.S. Department of Education's Institute of Education Sciences for the purpose of examining underlying changes in students' reading behavior (i.e., eye movements) as a function of improvements in students' reading fluency. To increase the probability that students' fluency would change from pre- to posttest, two-thirds of study participants were provided with one of two interventions, both of which increased the amount of time students spent reading aloud to an adult. In addition, participants were selected from second-grade classes in light of research suggesting that second graders make the greatest gains in reading fluency relative to students in other early grade levels (Chall, 1996; Kuhn & Stahl, 2003). The most consistent finding across all of the measures employed within this study was that students indeed made large improvements in reading behavior, as indicated by WJ-III performance, reading rate, reading expression, and eye movement patterns. A second consistent finding was that the magnitude of changes across assessment measures was not uniform across students' skill levels. For the majority of employed measures, the students in the lowest achievement group made the greatest gains across time. Although such great gains are highly favorable in potentially allowing these students to “catch up” to their higher-achieving peers, maximum growth for all students would seem to be a better outcome. Regardless, these results highlight the fact that achievement level should be considered when examining the impact of instruction on student performance. A third notable finding from this study, and one related to its ultimate purpose, was that both RR and WR conditions resulted in greater improvements in students' achievement compared to BU.
Although results related to many variables suggested that students in the BU condition exhibited gains as great as those for students in the RR and WR conditions, there were Time × Intervention Condition interactions for the majority of the achievement variables (ORF, LWI, and PC). Considering that past researchers have not administered prosody and eye movement measures, these results are generally consistent with past research suggesting that RR improves students' reading achievement (Therrien et al., 2006). However, it is interesting that RR failed to produce greater gains in students' achievement compared to WR. These results strongly suggest that previously observed improvements in students' reading achievement as a result of RR implementation were largely due to additional time spent reading as opposed to repeatedly reading text. It should be noted that students in the RR and WR conditions practiced reading materials that were deemed to be at their instructional levels. If more challenging materials had been employed, we may have observed different effects. Although research suggests that students make greater gains on practiced text when RR involves more challenging materials (Daly, Bonfiglio, Mattson, Persampieri, & Foreman-Yates, 2005), researchers have not yet evaluated whether this difference in benefit persists when intervention is implemented across an extended period of time. We chose to have students read instructional-level materials for two primary reasons. First, to maintain experimental control, students in both intervention conditions needed to read materials at the same level of difficulty; requiring students in the WR condition to read passages that were not at the instructional level may have caused undue frustration due to their lack of opportunities to reread and master the material. 
Second, requiring students to read more challenging texts would have necessitated an overwhelming amount of error correction procedures across both intervention conditions. On occasion, results from pretest assessment erroneously resulted in the assignment of a student to frustrational-level material. When faced with such situations, we quickly changed the assigned level of material not only to be consistent with our established rules, but also because intervention sessions were frustrating for both the experimenters and the students. Given the lack of differences in outcomes between the RR and WR conditions, practitioners should thoughtfully consider the potential benefits and drawbacks of each set of procedures when providing students with a fluency-based intervention. Although results of this study would suggest that RR might not benefit students more than WR, there is strong evidence to suggest that RR improves students' fluency and comprehension on passages on which intervention is provided. Thus, if intervention will expose students to specific content and/or materials that they may need to read later in class, then RR may be the best set of procedures. However, a drawback of RR is that students may become annoyed with reading the same passages repeatedly, especially if the material is not of interest to them. WR addresses this issue, as students read materials only once. Furthermore, WR exposes them to a significantly broader range of words. For example, students in the RR condition read an average of 9,000 words comprising unique (i.e., non-repeated) text sequences across the 10 weeks of intervention, whereas students in the WR condition read an average of 28,815. One would expect that, by reading a greater variety of texts, students would be exposed to a greater breadth of vocabulary and contextual information.
However, ensuring that passages are at an appropriate reading level might be of particular importance for WR intervention given that students are provided with only one opportunity to read words correctly and understand materials. Despite the wide use of RR-based intervention within elementary schools and its extensive literature base, there clearly remains much left unknown regarding RR as an intervention for improving elementary students' reading achievement. This study
demonstrates the importance of employing multiple measures, including a true control group, and not assuming that an intervention deemed “empirically valid” will be of great benefit for all students regardless of age and skill level.

Acknowledgements
“The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A100496 to the University of Georgia. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.”

Appendix A. Common Definitions of Eye Movements and Eye Movement Parameters (e.g., Rayner et al., 2006)
Average number of fixations per word: fixation count divided by number of words of interest (i.e., number of words in the passage)
First fixation duration: the duration of the first fixation on a word, regardless of the number of fixations made on the word
Fixations: pauses on words that allow readers to extract information from fixated points and surrounding areas of visual acuity
Fixation count: total number of fixations
Gaze duration: the sum of all of the fixations made on a word prior to movement to another word
Probability of skipping words: average likelihood (from 0 to 100%) of skipping a given word during the first pass of reading (i.e., not fixating on that word before fixating beyond that word), as indicated by the number of words skipped during first-pass reading divided by the number of words of interest (i.e., number of words in the passage)
Regressions: backward saccadic movements that typically reflect additional processing of previously identified text or correction for “overshooting” eye movements
Saccades: rapid movements in which the eyes move from one point to another and vision is suppressed
Total fixation time: the sum of all fixations, including regressions, on a word
Number of interword regressions: total number of regressions between words (versus within words)
Number of intraword regressions: total number of regressions within single words (versus between words)

Appendix B. Definitions of Prosody Measures (e.g., Ardoin, Morena, Binder, & Foster, 2013)
Pitch Measures
Pitch difference: the highest pitch in a sentence minus the lowest pitch in that sentence
Pitch drop: the pitch level at the end of the last word of a declarative sentence minus the maximum pitch on that word
Pitch rise: the pitch level at the end of the final word of a question minus the minimum pitch level for that word
Pause Measures
Sentence intrusion: an inappropriate pause between words of at least 200 ms; may be related to decoding processes that are not efficient enough to slow down memory access needed to resolve word meaning
Sentential pause duration: the amount of time elapsed (in ms) between the end of the word preceding a targeted delimiter (e.g., adjective comma, clause comma, paragraph period, sentence period) and the start of the word following it
Word intrusion: an inappropriate pause within a word; may be related to difficulty navigating grapheme-phoneme mappings or limited sight word vocabulary
Duration Measures
Pause duration: the duration of a pause before or after a word of interest (e.g., target word)
Word duration: the amount of time elapsed (in ms) on a word of interest (e.g., target word)

Appendix C. Intervention Protocols
Day 1
Materials
- Computer with passages on Adobe Reader. Make sure that in Adobe, you have selected “Two-Up Continuous” or “Two Page Scrolling” and “Show Cover Page During Two-Up” under Page Display (in View menu). Also, selecting “Reading Mode” under the View menu will maximize the document so there are no tool buttons on the window.
- Timer
Directions for Students
Read the following to all students:
Over the next few weeks, we are going to read together 4 days a week. Reading out loud to me will help you to become even better at reading than you already are.
When you are reading, I want you to try and do your best reading. It will be important that you pay attention to what you are reading because I will ask you questions about the stories to make sure that you understand what you are reading. If there are words that are difficult for you to read, I will help you with them but I always need you to try your best.
Wide Reading General Instructions
1) Calculate amount of time per reading segment by calculating data with RR student. 2) Note time you start working (do not include CBM time) 3) Read directions to student 4) Start timer before each of student's readings 5) Follow along while student is reading, providing error correction procedures when the student hesitates for 3 s. 6) Provide immediate error correction when student misreads a word 7) After you are finished working with student, note the time. 8) Record the following on data sheet a. Total time working with student b. Length of reading sessions c. Record # of last word read d. Information for calculating total # of words read.Daily Instructions Today when you read to me, after ___minutes, I will tell you how many words you read to me and then I will ask you to read three more times for ___ minutes. After each reading I will tell you how many words you read and I will sometimes ask you a question. Remember to do your best reading. I will help you with any words that you need help with, but I want you to try your best to read all the words.Feedback • That time you read ____ words in ___ minutes. Repeated Readings General Instructions
1) 2) 3) 4)
Note time you start working (do not include CBM time) Read directions to student Start timer before each of student readings Follow along while student is reading, providing error correction procedures when the student hesitates for 3 s 5) Provide immediate error correction when student misreads a word 6) After you are finished working with student, note the time. 7) Record the following on data sheet a. Total time working with student b. Length of each reading session c. Record # of last word read d. Information for calculating total # of words read. Daily Instructions Today at the end of 5 min, I will tell you how many words you read correctly and then you will read up to that point a second, third, and fourth time. After each of these readings I will tell you how long it took you to read the section and I will sometimes ask you a question about what you just read. Remember to do your best reading. I will help you with any words that you need help with, but I want you to try your best to read all the words. Feedback • You read _____ words in 5 min. (first time) • You read _____ words in ___ minutes. (2nd-4th reading)
Immediate Error Correction Procedures (including for 3-second delays)
• Immediately after an error is made, say the following: That's not right. What word is this? (point to where made error)
• If student says word correctly, let student continue.
• If student does not say word correctly or has hesitated on a word, say: This word is _______. What is this word? Now read this over again, starting here. (Have student start at the word following the previous sentence or previous comma.)
Comprehension Questions: Circulate through the following questions, trying to go in order as much as possible. Ask 2 questions per session.
• Tell me the name of one person, thing, or place in the story and something that you learned about that person, thing, or place in the part of the story that you just read.
• What do you think will happen next in the story? (Only ask this question if the student has another section of the story to read.)
• Did your prediction come true?
• Tell me what the story was about in 7 words or less.
• *If above questions do not apply to passage, ask others (what favorite part of story was, what another title could be, what they learned)
Before 2nd, 3rd & 4th readings
• Great, now read for another __ minutes. Be sure to do your best reading.
Immediate Error Correction Procedures (including for 3-s delays)
• Immediately after an error is made, say the following: That's not right. What word is this? (point to where made error)
• If student says word correctly, let student continue.
• If student does not say word correctly or has hesitated on a word, say: This word is _______. What is this word? Now read this over again, starting here. (Have student start at the word following the previous sentence or previous comma.)
Comprehension Questions: Circulate through the following questions, trying to go in order as much as possible. Ask 2 questions per session.
• Tell me the name of one person, thing, or place in the story and something that you learned about that person, thing, or place in the part of the story that you just read.
• What do you think will happen next in the story? (Only ask this question if the student has another section of the story to read.)
• Did your prediction come true?
• Tell me what the story was about in 7 words or less.
• *If above questions do not apply to passage, ask others (what favorite part of story was, what another title could be, what they learned)
Before 2nd, 3rd, & 4th reading
• Great, now read that section again. Be sure to do your best reading.
Read the following to Wide Reading students:
Each day I will set a timer for 2–3 min and after the timer goes off, I will tell you how many words you read during that time and I will sometimes ask you a question to make sure that you are paying attention to what you are reading.
Read the following to Repeated Readings students:
Each day I will have you read as much of a story as you can in 5 min. At the end of 5 min, I will tell you how many words you read and I will sometimes ask you a question about what we read. We will then read up to that point a second, third, and fourth time. After each of these readings, I will tell you how long it took you to read the section.
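The Repeated Readings data-sheet entries (number of the last word read, length of each reading) lend themselves to simple arithmetic for the feedback statements and for the total number of words read. The following is only a minimal sketch under stated assumptions, not the authors' scoring procedure: the class name, field names, and sample values are hypothetical, each rereading is assumed to cover the full section up to the last word read in the first 5-min reading, and words are counted as read without subtracting errors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RepeatedReadingSession:
    """One Repeated Readings session (hypothetical data-sheet layout)."""
    last_word_read: int          # number of the last word read in the first reading
    reread_minutes: List[float]  # minutes taken for the 2nd, 3rd, and 4th readings
    first_reading_minutes: float = 5.0  # first reading is timed at 5 min

    def total_words_read(self) -> int:
        # Assumption: every rereading covers the same section in full.
        return self.last_word_read * (1 + len(self.reread_minutes))

    def words_per_minute(self) -> List[float]:
        # Rate for each reading: words in the section divided by minutes spent.
        times = [self.first_reading_minutes] + list(self.reread_minutes)
        return [self.last_word_read / t for t in times]

    def total_reading_minutes(self) -> float:
        # Time spent reading only; excludes directions and questions.
        return self.first_reading_minutes + sum(self.reread_minutes)


# Made-up values for illustration only; these are not study data.
session = RepeatedReadingSession(last_word_read=200, reread_minutes=[4.0, 2.5, 2.0])
print(session.total_words_read())       # words read across all four readings
print(session.words_per_minute())       # one rate per reading
print(session.total_reading_minutes())  # minutes spent reading in the session
```

A Wide Reading session would need a different record, because each timed segment covers new text and the words read in each segment would be summed directly.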