Using portfolios to assess the writing of ESL students: a powerful alternative?

Journal of Second Language Writing 11 (2002) 49–72

Bailin Song*, Bonne August

Kingsborough Community College, City University of New York, 2001 Oriental Blvd., Brooklyn, NY 11235, USA

* Corresponding author. Tel.: +1-718-368-5849; fax: +1-718-368-4786. E-mail address: [email protected] (B. Song).

Abstract

This article describes a quantitative study that compared the performance of two groups of advanced ESL students in ENG 22, a second-semester composition course. Both groups had been enrolled in ENG C2, a compensatory version of Freshman English for students with scores one level below passing on the CUNY Writing Assessment Test (WAT). At the end of ENG C2, one group was assessed on the basis of portfolios as well as the CUNY WAT; the other was assessed using the WAT alone. Comparable percentages of students in both groups passed the WAT at the end of C2. However, students from the portfolio group with passing portfolios were permitted to advance to ENG 22 regardless of their performance on the WAT, while students in the non-portfolio group moved ahead only if they had passed the WAT. (The WAT remained a graduation requirement for all students.) The study found that students were twice as likely to pass into ENG 22 from ENG C2 when they were evaluated by portfolio than when they were required to pass the WAT. Nevertheless, at the end of ENG 22, the pass rate and grade distribution for the two groups were nearly identical. However, because portfolio assessment was able to identify more than twice the number of ESL students who proved successful in the next English course, it seems a more appropriate assessment alternative for the ESL population. © 2002 Elsevier Science Inc. All rights reserved.

Keywords: Portfolio assessment; ESL students; CUNY WAT

Introduction

Portfolio assessment of writing, which incorporates several diverse writing samples produced at different times, has often seemed ideally suited to programs that use a curriculum influenced by the writing process.


Portfolios can accommodate and even support extensive revision, can be used to examine progress over time, and can encourage students to take responsibility for their own writing. Furthermore, assessment criteria seem less arbitrary for portfolios than they might when applied to a single, impromptu piece. The literature is rich in discussions of the important issues raised by portfolio assessment (Belanoff & Dickson, 1991; Black, Daiker, Sommers, & Stygall, 1994; Camp, 1993; Elbow & Belanoff, 1997; Hamp-Lyons & Condon, 2000; Huot & Williamson, 1997; Kearns, 1993; Yancey, 1999) and in accounts of the development of portfolio assessment programs (Belanoff & Elbow, 1986; Courts & McInerney, 1993; Gill, 1993; Hamp-Lyons & Condon, 1993, 2000; Markstein, Withrow, Brookes, & Price, 1992). As yet, however, this literature has not been much augmented by quantitative research. Based on their screening of articles on portfolio assessment over a 10-year period, Herman and Winters (1994) find that a very small number of them (<8%) "either report technical data or employ accepted research methods" (p. 48). Camp (1993) discusses some of the reasons for this absence, pointing mainly to the gap between conventional psychometrics and some of the factors which make portfolio assessment appealing: the inter-relationship of pedagogy and assessment, the variability in conditions of performance, and the collaborative nature of the revision process. Nevertheless, as more quantitative research appears, it will be of immediate and considerable use both in making a decision as to whether or not to adopt portfolio assessment and in making the case for its adoption.

While seeking to reap the benefits promised by portfolio assessment, programs weighing its adoption as a replacement for standardized measures must keep in mind additional practical factors. Not the least of these is the labor-intensive nature of portfolio evaluation. Often, too, departments making this decision are accountable not only to their own administrations but also to state legislatures and other holders of pursestrings (Huot, 1992; White, 1992). Those unfamiliar with portfolios may consider them "soft," subjective, and easily subverted (Hutchings, 1990). Beyond teacher and student satisfaction, can a case be made for portfolio assessment as a dependable method for determining readiness to advance to a higher level of instruction? Particularly in the case of ESL students, has portfolio assessment been clearly established as a reliable basis for high-stakes decisions about placement and progress?

Portfolio assessment and ESL students

The writing abilities of ESL students are more difficult to assess than those of native speakers (Johns, 1991; Thompson, 1990). In spite of this difficulty, their writing is more likely than that of native speakers to be evaluated in large-scale assessments, from the TOEFL to system-wide or institutional placement programs. Carlson and Bridgeman (1986) welcomed the introduction of writing samples as adjuncts to indirect methods of assessment, but cautioned those who used them to "be cognizant of the numerous variables that condition the interpretation of their results."


That caution was well-advised, for the benefits of portfolio assessment appear to lie not in reducing the number of variables but in extending the number and range of pieces available for assessment and in its link to process writing (Hamp-Lyons, 1994). Hamp-Lyons (1991) advocates the use of portfolios for ESL students. Portfolios are thought to be especially suitable for non-native English-speaking students because "portfolios provide a broader measure of what students can do, and because they replace the timed writing context, which has long been claimed to be particularly discriminatory against non-native writers" (Hamp-Lyons & Condon, 2000, p. 61). According to Hamp-Lyons and Condon, after portfolios were introduced into the exit assessment, more ESL students tested out of the University of Michigan's Writing Practicum on the first try than had been the case when the exit assessment used a timed essay.

Ruetten (1994), citing research that shows that ESL students find holistically scored competency exams particularly difficult, describes her own research involving the second course of a composition sequence, which required students to pass a proficiency exam. In Ruetten's study, native English speakers and non-native speakers achieved a comparable pass rate when a prototype of the portfolio, an appeals folder containing several representative pieces of writing, was evaluated. Given the results, Ruetten concludes that some kind of portfolio assessment is particularly useful in evaluating ESL writers (p. 94). Research at Borough of Manhattan Community College of the City University of New York (Jones, 1992) shows that ESL students assessed by portfolio achieved results in the next course that were better than or comparable to those achieved by native English speakers who were assessed on the CUNY Writing Assessment Test (WAT), a holistically graded, timed impromptu essay.

Challenges of portfolio assessment

While portfolio assessment promises potential benefits for curriculum and assessment, it also faces challenges. Brown and Hudson (1998) sum up from the literature five disadvantages of using portfolio assessment: the issues of design decisions, logistics, interpretation, reliability, and validity. Of great concern are the assessment's time-consuming nature and the issues of reliability and validity. Portfolio assessment programs, without a doubt, make substantial demands on instructors. While planning portfolio tasks and lessons, coaching students on drafts, and helping them compile portfolios can be comfortably folded into a process-oriented course, the actual evaluation of portfolios is inevitably labor intensive, requiring a significant amount of time from instructors. Furthermore, reliability and validity remain unresolved issues. For example, how can we assure psychometric reliability, such as scoring consistency and rater agreement? Through "negotiation" among raters or through rater training directed by anchor papers and scoring guides (Yancey, 1999)?


How can scoring fairness be achieved? Through the articulation of difference and negotiation so that community standards are established (Elbow & Belanoff, 1997)? How do we provide "equitable assessment settings" (Herman & Winters, 1994, p. 52) and make sure that all students have equal access to resources (Brown & Hudson, 1998)? More importantly, how do we ascertain that portfolios adequately exemplify students' writing abilities so that the decisions we make about students are accurate?

In the debate over validity and reliability, there are conflicting perspectives on the role of psychometric standards and standardization. Some consider portfolio assessment a viable alternative precisely because it resists standardization (Huot & Williamson, 1997; Moss, 1994). They believe that reliability and validity in the narrow psychometric sense are undesirable factors in evaluation. On the other hand, others (Herman, Gearhart, & Baker, 1993; LeMahieu, Gitomer, & Eresh, 1995) find that psychometric integrity is attainable for portfolio assessment. Hamp-Lyons and Condon (2000) believe that both reliability and validity are necessary and must be established "if portfolio-based assessments are to grow and to replace less satisfactory ones" (p. 136), since only these types of data can convince bureaucrats. In 1994 they found that the University of Michigan's full-scale entry-level and exit portfolio assessment achieved levels of reliability and validity equal to or better than direct tests of writing based on timed writings scored holistically. Williams (1998, 2000) argues strongly for, and demonstrates, a need for standardization and strict adherence to standard protocols. He reasons that standardized procedures are necessary in establishing performance standards. Without standards for implementation and outcomes, portfolio assessment will become whimsical, capricious, and unfair because "it increases the subjectivity teachers bring to evaluation" (2000, p. 136). This unreliability will threaten any benefits portfolio assessment brings and make it lose its appeal, because portfolio assessment was, indeed, "developed with the goal of making the evaluation of classroom writing more objective, more fair, and more realistic" (2000, p. 147).

The purpose of the study

In 1995 the Department of English at Kingsborough Community College, a large urban campus that is part of the City University of New York, adopted portfolio assessment as the standard procedure for students in intermediate and advanced ESL. At the same time, portfolios were also adopted for the developmental English writing courses, which ESL students took after completing advanced ESL. The decision to implement portfolio assessment throughout the ESL and developmental sequences followed 2 years of experimenting with portfolio assessment on a limited, voluntary basis. A major factor in the decision was the desire to implement a form of assessment that supported our curriculum. We also reasoned that portfolio assessment would be a more valid measure for exit from the developmental writing sequence, a relatively high-stakes decision that had previously depended upon the CUNY WAT.


The CUNY WAT (see Appendix A for a sample test and criteria for evaluation) was given to all entering freshmen on all CUNY campuses prior to the introduction of the American College Testing Program (ACT) writing and reading exams in the fall of 2000. The WAT is a timed impromptu essay in which students must argue for or against a proposition of general interest (e.g., "States should mandate the use of automobile safety belts" or "Students who come late to class should not be penalized"), supporting their arguments from their experience, their reading, and their observations of others. Essays were evaluated by two readers on a scale of 1–6, with a total score of 8 (4 + 4) considered passing. All WAT readers were certified by the University's Office of Academic Affairs. For certification, readers participated in an all-day training session, at the end of which they were given a test. They had to meet the University's standards in order to be permitted to grade WAT tests.

At Kingsborough Community College, entering students who passed the WAT were admitted into the standard freshman composition class, ENG 12. Those who scored below 6 were placed into developmental courses or ESL, as appropriate. Students who scored a 6 or 7 were placed in ENG C2. The ENG C2 course was designed as a variant of the standard four-credit freshman English course; however, because the students had not yet passed the CUNY WAT and were thereby "remedial," the course was augmented with an additional non-credit instructional hour. Under the old, i.e., pre-portfolio, system, if students passed both the course and the WAT, they received credit for the first semester of Freshman English and advanced to ENG 22, the second-semester English course. If they failed the WAT, regardless of their performance in the course, they had to repeat the course. Like students who had been placed in ENG 12, successful C2 students then advanced to the second-semester Freshman English course, ENG 22. Thus, the developmental sequence and the composition sequence intersected at ENG C2. Moreover, this course, which included both native English speakers and ESL students, presented a particular challenge to instructors because the high-stakes, university-mandated exit testing took place at its conclusion. C2 students pressed for direct test preparation and resisted attention to the broader development of academic writing, even though the course was intended to have the same objectives as ENG 12.

When the portfolio was introduced experimentally, both instructors and students commented that the new system seemed to them more suitable and fairer than a single test. The portfolio was a performance-based appraisal that evaluated students' progress and accomplishment within the learning environment. Although the portfolio also contained an in-class final writing exam, the exam was not a test of writing speed, because it allowed an adequate amount of time for writing; nor was it culturally biased, as students had an opportunity to become familiar with the issues addressed in the prompts.
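The placement rules just described reduce to a simple function of the two readers' scores. The sketch below, in Python, is our illustration only (the function name and structure are hypothetical, not the University's actual procedure); the thresholds come directly from the description above: a combined score of 8 passes, a 6 or 7 places into ENG C2, and below 6 places into developmental courses or ESL.

def wat_placement(reader1: int, reader2: int) -> str:
    """Map two WAT reader scores (each on a 1-6 scale) to a placement,
    following the pre-2000 Kingsborough rules described in the text.
    Hypothetical illustration, not the college's actual software."""
    if not (1 <= reader1 <= 6 and 1 <= reader2 <= 6):
        raise ValueError("each reader scores the essay from 1 to 6")
    total = reader1 + reader2       # a total of 8 (e.g., 4 + 4) is passing
    if total >= 8:
        return "ENG 12"             # passed the WAT: standard freshman composition
    if total >= 6:
        return "ENG C2"             # scored 6 or 7: compensatory Freshman English
    return "developmental/ESL"      # scored below 6: placed as appropriate

# Example: a 4 from one reader and a 3 from the other totals 7, so ENG C2.
assert wat_placement(4, 3) == "ENG C2"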


Regardless of the department's satisfaction with the portfolio, however, the WAT could not be waived: it was a university-wide, as well as a college, requirement. At best, the use of portfolios could allow students who had done passing work in the course to receive credit for ENG C2 and continue into ENG 22 if their portfolios were deemed passing, thus deferring for a semester the requirement that the students pass the WAT. This provided time for an additional writing course before students had to retake and pass the high-stakes test.

As the department's use of portfolios expanded to include all of the developmental writing courses, it became increasingly important to have data demonstrating whether, in addition to instructor and student satisfaction, the use of portfolios to pass out of the developmental program was effective in determining readiness for the next course. Thus, we decided to conduct a study. We were interested in finding out how portfolio assessment had served our ESL students and whether or not it demonstrated validity for the purpose of making placement decisions about these students. The questions the study attempted to answer are as follows:

1. Did pass rates for the CUNY WAT and for ENG C2 differ significantly between portfolio sections and non-portfolio sections?
2. When they took ENG 22, how did the performance of ESL students from portfolio sections of ENG C2 compare with that of students who had passed the ENG C2 sections that used the WAT as the method of assessment?
3. In particular, how successful in ENG 22 were those portfolio students who had passed portfolios but had not achieved a passing score on the WAT?

Using portfolios to assess the writing of ESL students at Kingsborough

The basic writing/composition sequence at Kingsborough integrates intensive reading and writing in thematically organized courses. The development of writing through a process of revision is central to the sequence. For this reason, portfolio assessment seemed a particularly appropriate way to evaluate student progress. As members of the department had moved from a pedagogy that emphasized the teaching of grammatical correctness and rhetorical modes to one that stressed process, revision, and text-based writing assignments, they were eager to replace the timed impromptu essay format with a method of assessment that supported the new curriculum. While in the long run students might be expected to apply process techniques in a timed essay, this did not seem a reasonable expectation for students who had completed only 12 weeks (the length of the Kingsborough semester) of basic writing; nor did one sample of student writing seem a sufficient basis for a high-stakes decision.

A second reason for seeking a different form of assessment was the increasing number of immigrant students. These ESL students, many of whom were well-educated in their first languages, responded well to the curriculum, but found great difficulty in passing the WAT.


Unfamiliarity with topics, inability to construct convincing arguments for or against culturally specific issues, and anxiety about being tested impromptu in a second language may account for the difficulty. Frequently, however, students cited inadequate time as the main factor affecting their performance on the WAT. This is probably because ESL students simply need more time to implement all aspects of the writing process: thinking and planning, writing (and translating), and reviewing and editing. In a study comparing first- and second-language writing, Raimes (1987) finds that ESL students tend to edit and correct their work more than native English speakers do. This suggests that the opportunity to review and edit is of crucial importance for ESL students. When pressured by time on the WAT, ESL students usually lacked time for even a quick review of their essays, thus producing essays with more errors and language problems than the scoring criteria allowed.

As the ESL population increased, average pass rates in the upper-level basic writing course fell from over 50% to just above 40%. The holistically scored WAT became a barrier to ESL students, confirming Ruetten's (1994) research on holistically scored evaluations. ESL students were held back by the WAT even though they had demonstrated that they could complete course assignments, including in-class writing assignments, satisfactorily or even excellently. This situation generated a large number of students who were repeating the course, sometimes several times, and a consequent retention problem of growing proportions, as students found themselves unable to proceed.

The Kingsborough model of portfolio assessment

In order for the portfolio assessment of writing to receive official sanction and be considered a trustworthy and credible evaluation measure, portfolio programs must establish acceptable performance standards and have in place standardized procedures and guidelines for conducting the assessments. We believe it is very important to implement standardized procedures and outcome standards strictly if we want to achieve and maintain high levels of reliability and consistency.

The Kingsborough portfolio, designed by a group of instructors, both full-time and part-time, contains a cover letter, two revised essays with several drafts, and a departmental writing exam. The cover letter provides a venue for students to present themselves as writers to the portfolio reader. It encourages accounts of struggles encountered and progress made during the semester, as well as discussion of how the student views the writing process. The two revised essays are chosen by the students from a collection of five to six revised essays as those best representing their writing ability.

A typical 12-week basic writing course at Kingsborough requires of students five to six revised essays, at least half of which must be reading-based, and four to five in-class essays, as well as informal writing. Instructors develop their own essay topics, incorporating the course readings as much as possible (see Appendix B for a sample assignment).


Each of the revised essays in the portfolio must be accompanied by at least two earlier drafts with the instructor's comments. Students who do not have all the drafts or who have not fulfilled the requirements of the class are not allowed to submit a portfolio. We decided to require two of the student's revised essays, one of which must be reading-based, in order to provide the reader with an adequate but not overwhelming sample of each student's work.

The departmental writing exam, the in-class writing piece included in all portfolios, is 2 hours long, more than twice the length of time allowed for the WAT. The exam is based on a long reading passage distributed in advance. The students do not discuss or write about it in class ahead of time, but are encouraged to clarify their understanding by using a dictionary or discussing it with each other outside of class. On the exam, students are asked to spend about 30 minutes answering short-response questions on the reading passage and about 90 minutes writing an essay on one of two given topics about issues drawn from the passage (see Appendix C).

Evaluation criteria for portfolios include finding and organizing ideas, using the revision process, and editing and presenting work (see Appendix D). The specific checklist items included in the criteria were determined by the department based on discussions with instructors and a survey of the literature, as well as reference to the Evaluation Scale for the WAT.

Portfolios are cross-read by pairs of instructors. To maintain reliability, instructors of portfolio sections are trained intensively twice a semester, once at mid-term and once at the end, just before the actual reading. In training sessions, instructors examine and discuss "anchor" essays. Without exception, all instructors teaching developmental writing must attend these training sessions. Students' portfolios are graded pass/fail by the readers. Space is provided on the evaluation sheets for readers to make additional comments on the weaknesses and strengths of the portfolios they read. Discussion between the pair of readers is discouraged, to make sure that students' portfolios are evaluated strictly according to the pre-established standards rather than through their instructor's negotiations with the reader. Instead, an appeals process is available to the student's instructor should the instructor and the portfolio reader disagree.

Thanks to the intensive training and strict adherence to rating guidelines and standards, the inter-rater reliability of our portfolio assessment has remained high over the years. For example, in the spring semester of 2001, portfolio readings for the highest level of our developmental English sequence achieved an inter-rater reliability of 0.82, above the 0.80 benchmark for holistic ratings of timed essays. As the instructors have become more accustomed to reading portfolios and the process has become more systematized, the time required has decreased. We have developed strategies for responding succinctly, especially to students with passing portfolios. An experienced reader can evaluate a portfolio in about 12–15 minutes.
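The article reports an inter-rater reliability of 0.82 for the spring 2001 reading but does not name the coefficient behind that figure. As an illustration only, the following Python sketch computes Cohen's kappa, one common agreement statistic for paired pass/fail judgments; the ratings in the example are invented, not the department's data.

def cohens_kappa(ratings1, ratings2):
    """Cohen's kappa for two raters' pass/fail judgments:
    observed agreement corrected for agreement expected by chance."""
    assert len(ratings1) == len(ratings2) and ratings1
    n = len(ratings1)
    observed = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    p1 = ratings1.count("pass") / n
    p2 = ratings2.count("pass") / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)   # chance agreement on pass or on fail
    return (observed - expected) / (1 - expected)

# Invented cross-reading of ten portfolios by two trained readers
reader_a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
reader_b = ["pass", "pass", "fail", "pass", "pass", "pass", "pass", "fail", "pass", "pass"]
print(round(cohens_kappa(reader_a, reader_b), 2))   # 0.74 for this toy sample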


ESL population and data collection procedures

Kingsborough Community College has a large ESL population. Each semester, about 500 students are enrolled in ESL classes taught by ESL faculty, while over 300 other second-language students take upper-level developmental English courses, many of which are either designated for ESL students or taught by English faculty with experience in working with ESL students. Students come from as many as 60 countries and regions and speak more than 40 languages or dialects. However, the predominant language spoken by our ESL students is Russian (44%), followed by French and Creole (15%), Chinese (12%), Spanish (10%), Arabic (4%), and Polish (3.6%), according to the fall 2000 freshman enrollment record.¹ Most of the ESL students are recent immigrants and have completed their high school education in their native countries; a small number of them have studied 1 or 2 years at college or hold a bachelor's degree in their first language. Although most attend Kingsborough to pursue academic degrees, some take classes for certificates or to prepare for license exams so that they can enter a profession quickly.

All nine sections of ENG C2 (designated for ESL students) that first adopted portfolio assessment were included in the study as the portfolio group, and a matching number of ENG C2 sections (also for ESL students) that used the CUNY WAT as an exit criterion were included as the non-portfolio group. Although the portfolio was used as an exit criterion, students in the portfolio group were also given the WAT at the end of the semester, as required by the college. Students had to have completed all the course work in order to submit a portfolio or be allowed to take the WAT. One hundred and three students from the portfolio group and 107 from the non-portfolio group (50% of the total enrollment of each group) were randomly selected for inclusion in the study.

Students in both groups, as required for entry into ENG C2, had received scores of 6 or 7 on the CUNY WAT. Those who had scored lower or higher were placed into different levels. Students chose sections solely on the basis of scheduling concerns, without any knowledge of which sections were participating in the portfolio program or which professors were teaching the sections (Kingsborough does not list professors' names in the course catalogue). It is, therefore, reasonably safe to conclude that students were distributed randomly and that all sections began with students of comparable writing ability.

Our analysis of grades in ENG 22 operated from the pool of students who passed ENG C2, either through the portfolio or the WAT. We analyzed only those students who took ENG 22 in the semester immediately following the one in which they completed ENG C2. Therefore, we cannot report on those students who did not pass ENG C2.

¹ Although specific numbers of students speaking these languages vary from semester to semester, Russian has long been the predominant language spoken by our ESL students. For this study, students of Russian language background made up more than 50% of the subjects.


In addition, some students in both groups who passed ENG C2 did not take ENG 22 at Kingsborough during the next semester because of transfers, drops, and other reasons; therefore, only students whose ENG 22 grades were available, specifically 64 of 80 students in the portfolio group and 36 of 41 in the non-portfolio group, were included in further data analysis. Of the 16 portfolio students who did not take ENG 22, 5 received an A in C2, 8 received a B, and 3 received a C, averaging 3.13 on the standard numerical scale for grades. All five non-portfolio students who did not take ENG 22 received a B, or 3.0, in C2. The average grades are very similar and thus do not suggest that the sample of students taking ENG 22 was skewed by the absence of those who passed ENG C2 but did not take ENG 22.

Analysis of data

According to Huck and Cormier (1996), a chi-square test or a z-test can be used to contrast two unrelated groups on a dichotomous dependent variable when data take the form of proportions, percentages, or frequencies. These two tests are mathematically equivalent and will always lead to the same decision regarding a null hypothesis. Thus, we used the z-test for proportions to contrast the two groups' CUNY WAT and ENG C2 pass rates (Question 1). In comparing the students' performance in ENG 22 (Question 2), the distributions of the letter grades for the two groups were analyzed using χ². As the two groups' mean ENG 22 grades appeared very similar, we wanted to know whether the students in the two groups were distributed in the same fashion across the five letter grades. To compare the performance in ENG 22 of the portfolio students who had not passed the CUNY WAT with that of the non-portfolio students who had passed the CUNY WAT (Question 3), the more commonly used t-test was employed to see whether their numerical grade means were significantly different.

Results

Question 1: The CUNY WAT and ENG C2 pass rates

The pass rates of the CUNY WAT and ENG C2 for the two groups are reported in Table 1. The two groups' CUNY WAT pass rates do not appear significantly different. Of the 103 students in the portfolio group, 33 (32%) passed the CUNY WAT, as compared to 37 (35%) of the 107 students in the non-portfolio group. However, 80 (78%) of the 103 students in the portfolio group passed the portfolio assessment and thus the ENG C2 course; by contrast, only 41 (38%) of the 107 students in the non-portfolio group passed the course as determined mainly by their scores on the CUNY WAT.

Table 1
Comparison of CUNY WAT and ENG C2 pass rates

Passing     Portfolio group (n = 103)    Non-portfolio group (n = 107)    z       P
CUNY WAT    33 (32%)                     37 (35%)                         0.46    .3228
ENG C2      80 (78%)                     41 (38%)                         6.43    0

These ENG C2 pass rates look significantly different. The z-tests for proportions confirmed that there was no significant difference between the two groups' CUNY WAT pass rates, but that the two groups differed significantly in passing the ENG C2 course.² While ESL students in the portfolio group had a CUNY WAT pass rate similar to that of the ESL students in the non-portfolio group, students in the former group were twice as likely as students in the latter group to pass ENG C2 when they were evaluated by portfolio.

These results indicate that on the timed impromptu test, the WAT, the performance of the two groups of students was similar. The students in the portfolio sections, however, were able to demonstrate their readiness for the next level through a range of pieces: representations of their semester's work produced under less time pressure and with access to dictionaries and other reference material, and a writing exam completed in 2 hours rather than 50 minutes and on topics they had read about. Their revised pieces of writing often reflected the skills of experienced writers, showing extensive changes in ideas, content, support, text organization and structure, and expression. The non-portfolio students, though, did not have this chance to demonstrate their writing ability.

² To be on the safe side, we also performed chi-square tests on these data. The tests yielded the same results as did the z-tests: no significant difference was found between the two groups' CUNY WAT pass rates, but their ENG C2 pass rates were significantly different. The tests confirm Huck and Cormier's statement that the chi-square test and the z-test are mathematically equivalent and will always lead to the same decision regarding the null hypothesis when analyzing data in the form of proportions, percentages, or frequencies.

Question 2: Students' performance in ENG 22

Since the ultimate test of evaluation criteria is usually their ability to predict future success, we followed the two groups of students into their next English course, ENG 22. As shown in Table 2, the two groups' means of equated grades do not appear significantly different. A chi-square test confirmed this: no significant difference was found, χ²(4, n = 100) = 2.15, P = .71, in the distributions of the five letter grades for the portfolio group and the WAT group. (The non-portfolio group is referred to as the WAT group here since all its members had passed the WAT before they took ENG 22.)

Table 2
Distribution of frequencies and percentages by grades and group

Grade (equated)    Portfolio group, n (%)    WAT group, n (%)
A (4)              12 (19)                   4 (11)
B (3)              25 (39)                   19 (53)
C (2)              18 (28)                   9 (25)
D (1)              3 (5)                     1 (3)
F (0)              6 (9)                     3 (8)
Total              64 (100)                  36 (100)
Mean (S.D.)        2.53 (1.14)               2.56 (1.03)
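For readers who want to verify the arithmetic, the sketch below reruns both tests from the published counts, assuming Python with numpy and scipy (the authors do not say what software they used). The chi-square on the Table 2 frequencies reproduces the reported χ²(4) = 2.15, P = .71; the z values for Table 1 depend on which variant of the two-proportion test is computed, so the statistics may differ somewhat from the reported 0.46 and 6.43, but the decisions (no difference on the WAT, a large difference on ENG C2) are the same, and squaring z recovers the equivalent uncorrected chi-square, as Huck and Cormier note.

import numpy as np
from scipy import stats

def two_prop_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; z**2 equals the uncorrected chi-square."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * stats.norm.sf(abs(z))        # two-sided p-value

# Table 1 counts: students passing out of each group's sample
print(two_prop_ztest(33, 103, 37, 107))        # WAT pass rates: not significant
print(two_prop_ztest(80, 103, 41, 107))        # ENG C2 pass rates: highly significant

# Table 2 frequencies: grades A..F for the portfolio and WAT groups
grades = np.array([[12, 4], [25, 19], [18, 9], [3, 1], [6, 3]])
chi2, p, df, expected = stats.chi2_contingency(grades, correction=False)
print(round(chi2, 2), df, round(p, 2))         # 2.15, 4, 0.71

# Equated grade means (A = 4 .. F = 0): 2.53 and 2.56, as in Table 2
vals = np.array([4, 3, 2, 1, 0])
print((grades[:, 0] * vals).sum() / 64, (grades[:, 1] * vals).sum() / 36)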

In other words, the students in the portfolio group had the same proportionate breakdown of grades as the students in the WAT group. In both groups, the great majority of students (over 85%) received a letter grade of C or better; that is, portfolio students had the same success rate as WAT students in ENG 22. This result further suggests that portfolio assessment for ESL students in ENG C2 was as good a predictor of their success in ENG 22 as was the required writing proficiency test, the CUNY WAT.

Question 3: The performance in ENG 22 of the portfolio students who passed ENG C2 but failed the WAT

Using the ENG 22 data, we examined in particular the subgroup of portfolio students who advanced to ENG 22 on the merit of their portfolios without a passing score on the WAT. Of the 64 students who took ENG 22, 38 did not have a passing score on the WAT. Table 3 shows the distribution of the ENG 22 letter grades and the equated grade mean of this sub-group of portfolio students as compared to those of the WAT group. The means of the two groups were not significantly different (t = 1.03, df = 72, P = .31). The great majority (84%) of the portfolio students received a grade of C or better. The success rate in ENG 22 of the portfolio sub-group was comparable to that of the WAT group (89%), despite the fact that none of the portfolio students had passed the WAT before they enrolled in ENG 22. This finding is particularly encouraging because these students would have been denied the chance even to enroll in the course, not to mention to reach their full writing potential, if the WAT had been used as the criterion for exit from ENG C2.

Table 3
Distribution of frequencies and percentages by grades and group

Grade (equated)    Portfolio sub-group, n (%)    WAT group, n (%)
A (4)              4 (11)                        4 (11)
B (3)              15 (39)                       19 (53)
C (2)              13 (34)                       9 (25)
D (1)              0 (0)                         1 (3)
F (0)              6 (16)                        3 (8)
Total              38 (100)                      36 (100)
Mean (S.D.)        2.29 (1.18)                   2.56 (1.03)

t = 1.03, P = .31
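The t-test in Table 3 can likewise be reconstructed from the summary statistics alone. A minimal sketch, again assuming scipy; with the rounded means and standard deviations it yields approximately the reported t = 1.03 with df = 72 and P = .31 (df = 38 + 36 − 2 implies the equal-variance form of the test).

from scipy import stats

# Summary statistics from Table 3
m1, s1, n1 = 2.29, 1.18, 38    # portfolio sub-group (passed C2 by portfolio, not the WAT)
m2, s2, n2 = 2.56, 1.03, 36    # WAT group

t, p = stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=True)
print(round(abs(t), 2), n1 + n2 - 2, round(p, 2))   # about 1.05, 72, 0.30 from rounded inputs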

We also followed up on these students' WAT records, since passing the test remained a graduation requirement. Twenty-nine of them (76%) passed the test within a year after they completed ENG C2. This suggests that additional writing instruction and practice, even though not directed toward the test, enabled them to meet that requirement within the mandated time frame.

Discussion

The results of the study indicate that ESL students, whether enrolled in portfolio or non-portfolio sections, generally have difficulty passing a timed impromptu test, as evidenced by the low WAT pass rates for the two groups of students (32% and 35%, respectively). These results support the findings of earlier studies (Johns, 1991; Ruetten, 1994; Thompson, 1990) which suggest that holistically scored, timed writing proficiency exams "are particularly (and perhaps unfairly) difficult for ESL writers" (Ruetten, 1994, p. 91). These exams particularly handicap ESL students because they not only test them on unfamiliar genres and tasks, but also require them to meet standards of excellence in grammatical and mechanical accuracy that they cannot reach on a first draft in 50 minutes (Hamp-Lyons & Condon, 2000). However, it is encouraging to note that ESL students are twice as likely to pass their first-semester English course when they are evaluated by portfolio as when they are evaluated by the timed WAT. This finding supports Ruetten's (1994) conclusion that portfolio assessment is indeed very useful for ESL students.

Although there is a concern that students who pass the course through the portfolio without passing the required writing competency exam may not be able to function in the subsequent English course, such concern appears groundless according to the findings. The study found no evidence that allowing ESL students to pass an English writing course through a carefully structured and monitored portfolio assessment reduced their chances of successfully completing the requirements in the subsequent English course.

Indeed, in their second semester of English, the sub-group of students who passed the first-level English course on the merit of their portfolios without a passing score on the writing proficiency exam performed as well as students who had passed that exam.

Demonstrating validity is often asked of any assessment method. The Kingsborough portfolio assessment model seems to have established criterion-related validity in predicting students' future performance. As measured by the second-semester English course, the great majority of students (over 85%) who were permitted to advance to the course with passing portfolios succeeded in that course. They had the same course pass rate as students who had passed the CUNY WAT, which indicates that portfolio assessment is as valid as the CUNY WAT in predicting students' success in a subsequent English course. However, because portfolio assessment was able to identify more than twice the number (80 out of 103 versus 41 out of 107) of our ESL students who actually proved successful in the next English course, it seems a more appropriate assessment alternative for our ESL population.

It is possible, of course, that other alternatives, e.g., allowing extra time for the WAT, might have proven equally or even more effective than the portfolio in placing students in ENG 22; however, that would not have provided the curricular support we were seeking for this course, since the primary purpose of ENG C2 was to serve as a variant of Freshman English, an introduction to academic writing. Moreover, college and university restrictions precluded other alternatives. Although we continue to use portfolios for course exit, the Trustees require the use of a nationally normed standardized instrument (the ACT tests) for exit from the developmental sequence and entrance into Freshman English and will not entertain proposals for alternative methods of assessment.

Limitations of the study and suggestions for future research

The findings of this study are based on empirical data from real classrooms rather than on data from controlled research settings. As such, it is difficult to control the variables affecting performance, and the study cannot identify which variables contributed to the portfolio students' success in both the first and the subsequent English courses. Successful performance might be attributable to the changed pedagogy, which focuses on higher-order critical thinking skills, or perhaps to the extra time students are required to invest in the writing process. Another possible factor could be students' higher motivation as they see the opportunity to pass on to the next level of the course sequence rather than repeating the same course. It is possible that all these factors have a confounding impact. Future studies are necessary to identify the contributing variables. Follow-up case studies of individual portfolio students would also be helpful.

Another limitation of the study is likewise due to its empirical nature. Although our portfolio assessment has good scoring reliability, the correlation between the grades assigned by ENG 22 instructors and students' achievement in writing cannot be assessed.

Therefore, using ENG 22 grades as a success measure can be problematic, for example, when classroom teachers grade on a curve, i.e., assign a spread of grades (from A to F) no matter how good a class is relative to other classes. Nevertheless, we can assume quite safely that the course grade is a reliable, useful measure for this study, because our ENG 22 instructors all have substantial experience in teaching this course and they do not curve grades. In addition, the students enrolling in ENG 22 were randomly, and therefore presumably normally, distributed over more than 30 sections. Hence, the overall comparability of the grade distributions for the two groups seems indicative of comparable performance.

Because quantitative data are so limited in portfolio assessment studies, studies of this nature should be replicated at other institutions where the ESL population is different and other types of writing competency exams are given. In addition, studies with different research designs should be conducted to further examine the effects of portfolio assessment. For example, to measure the instructional aspects of the portfolio approach, a more appropriate writing test, perhaps one similar to our departmental writing exam, which allows more time and gives topics on issues that are familiar to students, could be given to both portfolio and non-portfolio students at the end of the course. This study is unable to determine the instructional effects of the portfolio approach because only the timed, impromptu WAT was given to both portfolio and non-portfolio students, and the two groups had similar pass rates on that test.

Conclusion

This study adds to the evidence noted by Ruetten and others that the holistically graded, timed impromptu essay exam appears to discriminate against competent ESL writers and in fact places an unnecessary obstacle in their path. Large numbers of Kingsborough students who might have continued successfully to the next course were blocked from progressing because they were unable to pass the holistically scored WAT. Further, the study demonstrates that, when carefully conducted with clear evaluation standards, portfolio assessment can be relied upon as a basis for making judgments about the writing proficiency of ESL students.

Acknowledgments

The authors are grateful to the three anonymous reviewers whose questions and comments have been so helpful to the development of this article. We owe special thanks to our colleagues in the Department of English at Kingsborough Community College, CUNY, who developed the portfolio system and the curriculum it supports.


Appendix A. Sample CUNY Writing Assessment Test and grading criteria

The CUNY Writing Skills Assessment Test
City University of New York, 1983

I. Sample test

Directions

You will have 50 minutes to plan and write the essay assigned below. You may wish to use your 50 minutes in the following way: 10 minutes planning what you are going to write; 30 minutes writing; 10 minutes reading and correcting what you have written. You should express your thoughts clearly and organize your ideas so that they will make sense to a reader. Correct grammar and sentence structure are important. Write your essay on the lined pages of your booklet. You may use the inside of the front cover of the booklet for preliminary notes. You must write your essay on one of the following assignments. Read each one carefully and then choose either A or B.

A. It always strikes me as a terrible shame to see young people spending so much of their time staring at television. If we could unplug all the TV sets in America, our children would grow up to be healthier, better educated, and more independent human beings. Do you agree or disagree? Explain and illustrate your answer from your own experience, your observations of others, or your reading.

B. Older people bring to their work a lifetime of knowledge and experience. They should not be forced to retire, even if keeping them on the job cuts down on the opportunities for young people to find work. Do you agree or disagree? Explain and illustrate your answer from your own experience, your observations of others, or your reading.

II. Evaluation scale for the Writing Skills Assessment Test

6
The essay provides a well-organized response to the topic and maintains a central focus. The ideas are expressed in appropriate language. A sense of pattern of development is present from beginning to end. The writer supports assertions with explanation or illustration, and the vocabulary is well suited to the context. Sentences reflect a command of syntax within the ordinary range of standard written English. Grammar, punctuation, and spelling are almost always correct.

5
The essay provides an organized response to the topic. The ideas are expressed in clear language most of the time. The writer develops ideas and generally signals relationships within and between paragraphs. The writer uses vocabulary that is appropriate for the essay topic and avoids oversimplifications or distortions. Sentences generally are correct grammatically, although some errors may be present when sentence structure is particularly complex. With few exceptions, grammar, punctuation, and spelling are correct.

4
The essay shows a basic understanding of the demands of essay organization, although there might be occasional digressions. The development of ideas is sometimes incomplete or rudimentary, but a basic logical structure can be discerned. Vocabulary generally is appropriate for the essay topic but at times is oversimplified. Sentences reflect a sufficient command of standard written English to ensure reasonable clarity of expression. Common forms of agreement and grammatical inflection are usually, although not always, correct. The writer generally demonstrates through punctuation an understanding of the boundaries of the sentence. The writer spells common words, except perhaps so-called "demons," with a reasonable degree of accuracy.

3
The essay provides a response to the topic but generally has no overall pattern of organization. Ideas are often repeated or undeveloped, though occasionally a paragraph within the essay does have some structure. The writer uses informal language occasionally and records conversational speech when appropriate written prose is needed. Vocabulary often is limited. The writer generally does not signal relationships within and between paragraphs. Syntax is often rudimentary and lacking in variety. The essay has recurrent grammatical problems, or, because of an extremely narrow range of syntactical choices, only occasional grammatical problems appear. The writer does not demonstrate a firm understanding of the boundaries of the sentence. The writer occasionally misspells common words of the language.

2
The essay begins with a response to the topic but does not develop that response. Ideas are repeated frequently, or are presented randomly, or both. The writer uses informal language frequently and does little more than record conversational speech. Words are often misused, and vocabulary is limited. Syntax is often tangled and is not sufficiently stable to ensure reasonable clarity of expression. Errors in grammar, punctuation, and spelling occur often.

1
The essay suffers from general incoherence and has no discernible pattern of organization. It displays a high frequency of error in the regular features of standard written English. Lapses in punctuation, spelling, and grammar often frustrate the reader. Or the essay is so brief that any reasonably accurate judgment of the writer's competence is impossible.


Appendix B. Sample ENG C2 writing assignment

ENG C2
Professor August
In-class Essay 2

In The Broken Cord, Michael Dorris makes many references to his heritage as an American Indian. He describes and examines both negative and positive aspects of being a Native American. Discuss the importance of this heritage to Dorris, referring to specific incidents in the book to illustrate and support your points. In relation to Dorris, consider your own ethnic heritage. What is its importance to you? Does it play a similar role for you as it does for Dorris, or is it quite different?

This is a more complicated essay than we've done so far. In order to find material and make certain that the structure of your essay is logical and clear, be sure to take time for step 1 below.

Please follow these steps as you write your essay:

1. Use about 10 minutes for invention, thinking about and planning your essay. You might brainstorm, freewrite, outline, or just jot down ideas.
2. Draft the essay. Be sure to double space (skip lines). Remember to use paragraphs to indicate steps in your thinking or new aspects of your topic. Try to have an engaging introduction and a good conclusion.
3. Leave at least 10 minutes to edit and proofread carefully. Look for places that are not clear. Also look for words you may have left out, as well as for spelling, capitalization, correct punctuation of quotations, and sentence boundaries.

Appendix C. Departmental writing examination

Kingsborough Community College
City University of New York
Department of English

The questions below refer to the essay "The Tyranny of the Majority" by Lani Guinier, which was distributed in advance by the instructor. Students are encouraged to use their own marked-up copies. Extra copies are available in the English Department (C309) for students who have not brought their own.


Use about 30 minutes to answer Parts I and II. Write your answers on the attached paper. Be sure to label each answer with the question number.

Part I: Answer both questions in Part I.

1. Summarize, using your own words, paragraph 25.
2. Paraphrase paragraph 14.

Part II: Choose one of the following questions. Answer in a paragraph.

1. How does Nikolas's solution to the Sesame Street Magazine exercise correspond to Guinier's views on "majority rule"?
2. Explain the irony in Guinier's use of the word "tyranny" to describe a basic principle of the American form of government (see paragraph 12).
3. Use your own words to explain what Guinier means by a "Madisonian Majority" (paragraph 16). Explain why she gives it this name.

Part III: Essay. Use about one and one-half hours of the exam time to write your essay. Choose one of the topics below. Write a logically organized, well-developed, and carefully proofread essay on the topic. In your essay, refer to Guinier's essay and quote from it. Write the essay on the paper provided.

1. In her essay "The Tyranny of the Majority," Lani Guinier questions the equity of a winner-take-all approach to democracy in a multicultural society. Write an essay in which you continue this discussion. In your essay:
   - Using your own words, explain the difference Guinier finds between majority rule in a homogeneous society and in a heterogeneous society (see, e.g., paragraph 14).
   - Guinier uses the example of an incident that occurred at Brother Rice High School in Chicago. From your own experience or reading, choose an example of majority rule in a heterogeneous society or community. Describe the experience. Then discuss specifically and in detail how it illustrates the problem posed by Guinier or a possible solution to that problem.

2. Guinier makes extensive use of her young son's idea of "fairness." Is she stretching her son's vision of the world too far by assuming that what is right and works for children is also right and works for adults, or is she accurate in her application of Nikolas's moral view to the idea of democracy? Write an essay in which you examine these ideas and extend the discussion. In your essay:
   - Summarize, using your own words, what Guinier presents as her son's idea of fairness.
   - From your experience or your reading, choose an example of a conflict or tension between majority and minority interests. Describe this conflict or tension. Then discuss specifically how Nikolas's moral view might apply. Does it offer a possible solution or not? Explain your thinking.


Appendix D. Sample portfolio evaluation form

[The sample portfolio evaluation form appears in the original article as a full-page facsimile and is not reproduced here.]


References

Belanoff, P., & Dickson, M. (Eds.). (1991). Portfolios: Process and product. Portsmouth, NH: Boynton/Cook Heinemann.
Belanoff, P., & Elbow, P. (1986). Using portfolios to increase collaboration and community in a writing program. WPA: Writing Program Administration, 9, 27–39.
Black, L., Daiker, D., Sommers, J., & Stygall, G. (Eds.). (1994). New directions in portfolio assessment: Reflective practice, critical theory, and large-scale scoring. Portsmouth, NH: Boynton/Cook Heinemann.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32, 653–675.


Camp, R. (1993). The place of portfolios in our changing view of writing assessment. In R. Bennett & W. C. Ward (Eds.), Construction versus choice in cognitive measurement: Issues in constructed response, performance testing, and portfolio assessment (pp. 183–212). Hillsdale, NJ: Erlbaum.
Carlson, S., & Bridgeman, B. (1986). Testing ESL student writers. In K. Greenberg, H. Wiener, & R. Donovan (Eds.), Writing assessment: Issues and strategies (pp. 126–152). New York and London: Longman.
Courts, P. L., & McInerney, K. H. (1993). Assessment in higher education: Politics, pedagogy, and portfolios. Westport, CT: Praeger.
Elbow, P., & Belanoff, P. (1997). Reflections on an explosion: Portfolios in the 90's and beyond. In K. Yancey & I. Weiser (Eds.), Situating portfolios: Four perspectives (pp. 21–33). Logan, UT: Utah State University Press.
Gill, K. (1993). Process and portfolios in writing instruction. Urbana, IL: National Council of Teachers of English.
Hamp-Lyons, L. (Ed.). (1991). Assessing second language writing in academic contexts. Norwood, NJ: Ablex.
Hamp-Lyons, L. (1994). Interweaving assessment and instruction in college ESL writing classes. College ESL, 4, 43–55.
Hamp-Lyons, L., & Condon, W. (1993). Questioning assumptions about portfolio-based assessment. College Composition and Communication, 44, 176–190.
Hamp-Lyons, L., & Condon, W. (2000). Assessing the portfolio: Principles for practice, theory, and research. Cresskill, NJ: Hampton.
Herman, J. L., Gearhart, M., & Baker, E. (1993). Assessing writing portfolios: Issues in the validity and meaning of scores. Educational Assessment, 1, 201–224.
Herman, J. L., & Winters, L. (1994). Portfolio research: A slim collection. Educational Leadership, 52, 48–55.
Huck, S. W., & Cormier, W. H. (1996). Reading statistics and research. New York, NY: HarperCollins.
Huot, B. (1992, October). Portfolios, fad or revolution? Paper presented at New Directions in Portfolio Assessment, the Fourth Miami Conference on the Teaching of Writing, Miami, OH.
Huot, B., & Williamson, M. (1997). Rethinking portfolios for evaluating writing: Issues of assessment and power. In K. Yancey & I. Weiser (Eds.), Situating portfolios: Four perspectives (pp. 43–56). Logan, UT: Utah State University Press.
Hutchings, P. (1990). Learning over time: Portfolio assessment. AAHE Bulletin (April), 6–8.
Johns, A. M. (1991). Interpreting an English competency examination. Written Communication, 8, 379–401.
Jones, J. W. (1992). Evaluation of the English as a Second Language portfolio assessment project at Borough of Manhattan Community College. A practicum report presented to Nova University.
Kearns, E. (1993). On the running board of the portfolio bandwagon. WPA: Writing Program Administration, 16, 50–58.
LeMahieu, P., Gitomer, D. H., & Eresh, J. T. (1995). Portfolios in large-scale assessment: Difficult but not impossible. Educational Measurement: Issues and Practice, 14, 11–16, 25–28.
Markstein, L., Withrow, J., Brookes, G., & Price, S. (1992, March). A portfolio assessment experiment for college ESL students. Paper presented at the 1992 TESOL Convention, Vancouver.
Moss, P. (1994). Validity in high stakes writing assessment: Problems and possibilities. Assessing Writing, 1, 109–128.
Raimes, A. (1987). Language proficiency, writing ability, and composing strategies: A study of ESL college student writers. Language Learning, 37, 439–467.
Ruetten, M. K. (1994). Evaluating ESL students' performance on proficiency exams. Journal of Second Language Writing, 3, 85–96.
Thompson, R. M. (1990). Writing proficiency tests and remediation: Some cultural differences. TESOL Quarterly, 24, 99–102.
White, E. (1992). Portfolios as an assessment concept. Paper presented at New Directions in Portfolio Assessment, the Fourth Miami Conference on the Teaching of Writing, Miami, OH.


Williams, J. D. (1998). Preparing to teach writing. Mahwah, NJ: Lawrence Erlbaum.
Williams, J. D. (2000). Identity and reliability in portfolio assessment. In B. Sunstein & J. Lovell (Eds.), The portfolio standard (pp. 135–148). Portsmouth, NH: Heinemann.
Yancey, K. B. (1999). Looking back as we look forward: Historicizing writing assessment. College Composition and Communication, 50, 483–503.