Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality


Available online at www.sciencedirect.com

ScienceDirect Journal of Second Language Writing 28 (2015) 53–67

Weiwei Yang a,*, Xiaofei Lu b, Sara Cushing Weigle c

a College of Foreign Languages, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Ave., Nanjing, Jiangsu 211106, China
b Department of Applied Linguistics, The Pennsylvania State University, 304 Sparks Building, University Park, PA 16802, USA
c Department of Applied Linguistics and ESL, Georgia State University, P.O. Box 4099, Atlanta, GA 30302-4099, USA

Abstract

This study examined the relationship between syntactic complexity of ESL writing and writing quality as judged by human raters, as well as the role of topic in the relationship. Syntactic complexity was conceptualized and measured as a multi-dimensional construct with interconnected sub-constructs. One hundred and ninety ESL graduate students each wrote two argumentative essays on two different topics. It was found that topic had a significant effect on syntactic complexity features of the essays, with one topic eliciting a higher amount of subordination (finite and non-finite) and greater global sentence complexity and the other eliciting more elaboration at the finite clause level (in particular, coordinate phrases and complex noun phrases). Local-level complexity features that were more prominent in essays on one topic (i.e., subordination and elaboration at the finite clause level) tended not to correlate with scores for that topic. Rather, a reversed pattern was observed: the less prominent local-level complexity features for essays on one topic tended to have a stronger correlation with scores for that topic. Regression analyses revealed global sentence and T-unit complexity as consistently significant predictors of scores across the two topics, but local-level features exhibited varied predicting power for scores for the two topics.

© 2015 Elsevier Inc. All rights reserved.

Keywords: Syntactic complexity; ESL writing performance; Topic effect

* Corresponding author. Tel.: +86 15261807574. E-mail addresses: [email protected] (W. Yang), [email protected] (X. Lu), [email protected] (S.C. Weigle).
http://dx.doi.org/10.1016/j.jslw.2015.02.002
1060-3743/© 2015 Elsevier Inc. All rights reserved.

Introduction

The inquiry into syntactic complexity of writing and its relationship with writing quality is not new. However, as Ortega (2003) points out, many early second language (L2) studies in this area suffer from problems of small sample sizes and homogeneity of learner proficiency, often yielding conflicting findings. Furthermore, given the relatively large number of syntactic complexity measures that have been used (see Lu, 2011; Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998), we cannot assume that the relationship between syntactic complexity and writing quality is the same across the different measures (Norris & Ortega, 2009). The number of measures that exist also invites the question of what the construct really is and what measures are appropriate. Norris and Ortega (2009) usefully propose examining syntactic complexity as a multi-dimensional construct. To date, however, this proposal has been adopted by very few studies (see, e.g., Byrnes, Maxim, & Norris, 2010).

Additionally, while some research suggests that variations in writing tasks can influence the linguistic features of texts and the writing scores given to those texts, the role of writing topic has not been given due attention in studies of the relationship between syntactic complexity and writing quality, although the very few studies that touched upon this issue suggest that topic effects can be expected (Crowhurst & Piche, 1979; Tedick, 1990).

In this study, we hope to circumvent the limitations of previous studies by measuring syntactic complexity as a multi-dimensional construct and using a larger sample size. We also explore the role of writing topic in the relationship between syntactic complexity and writing quality. In the rest of this section, we review related literature, first establishing syntactic complexity as a multi-dimensional construct and then synthesizing related studies. Then, we present the methodology and results of our study and discuss the findings as well as their implications for syntactic complexity research and L2 writing assessment.

Syntactic complexity as a multi-dimensional construct

In linguistic theories, syntactic complexity traditionally refers to compound and complex sentences, i.e., clausal complexity (see Diessel, 2004; Ravid & Berman, 2010). In some linguistic traditions, the notion of syntactic complexity has not extended to phrasal complexity (see, e.g., Givón, 2009; Givón & Shibatani, 2009).
However, in another view emerging in L1 and L2 developmental studies focusing on syntactic maturity (e.g., Cooper, 1976; Crossley, McNamara, Weston, & McLain Sullivan, 2011; Hunt, 1965; Lu, 2011; Ravid & Berman, 2010) and discourse analysis of texts in different genres (e.g., Biber, 2006; Biber, Gray, & Poonpon, 2011; Ravid & Berman, 2010), phrasal complexity (particularly noun phrase complexity) has been considered an integral part of syntactic complexity.

What complicates the construct of syntactic complexity further is that the notion of clause has not been defined consistently across disciplines. Notably, linguistic theories of grammar (Cristofaro, 2003; Givón, 2009; Halliday & Matthiessen, 2004; Langacker, 2008) count both finite and non-finite clauses as clauses. In writing research, however, following Hunt’s (1965) definition, the term clause has been predominantly used to refer only to finite clauses. Therefore, when calculating an index such as number of clauses per sentence as a syntactic complexity measure, discrepancies in results may arise from the different definitions of clause adopted. There may be no easy answer as to which definition of clause is more appropriate, but we adopt the view that both finite clauses and non-finite elements should be examined as part of the construct. However, to maintain consistency with previous writing research, we use the term clause to refer to finite clauses only and use the term non-finite element to refer to non-finite clauses. In alignment with grammar theories, we see both finite dependent clauses and non-finite elements as representing subordination.

In general, we agree with Norris and Ortega’s (2009) conceptualization of syntactic complexity as a multi-dimensional construct represented at the levels of global complexity, clausal subordination (finite), clausal coordination, and sub-clausal elaboration (including non-finite elements/subordination, phrasal coordination, and noun phrase complexity). Based on Norris and Ortega (2009), the diagram in Fig. 1 displays our conceptualization of this multi-dimensional construct and the hierarchical relationships among the sub-constructs. Laid out in the parentheses for each sub-construct are the indices selected from previous literature that can best measure the constructs in our framework. The distinct, discrete sub-constructs for syntactic complexity are found at the terminal nodes: clausal coordination, clausal subordination (finite), non-finite elements/subordination, phrasal coordination, and noun phrase complexity. Non-terminal nodes in the diagram are composites of discrete sub-constructs, including elaboration at the clause level, overall T-unit complexity, and overall sentence complexity, with each composite forming a higher level and with overall sentence complexity essentially encapsulating all sub-constructs. In this study, mean length of sentence and mean length of T-unit are seen as global complexity measures, and the other six measures are seen as local-level complexity measures.

[Fig. 1 diagram nodes: Overall Sentence Complexity (Mean Length of Sentence: MLS); Clausal Coordination (T-units per Sentence: TU/S); Overall T-unit Complexity (Mean Length of T-unit: MLTU); Elaboration at Clause Level (Mean Length of Clause: MLC); Phrasal Coordination (Coordinate Phrases per Clause: CP/C); Clausal Subordination (Finite) (Dependent Clauses per T-Unit: DC/TU); Non-finite Elements/Subordination (Non-finite Elements per Clause: NFE/C); Noun-Phrase Complexity (Complex Noun Phrases per Clause: CNP/C)]

Fig. 1. A multi-dimensional representation of syntactic complexity.

When selecting the measures for the discrete sub-constructs (i.e., the ones at the terminal nodes in the diagram), we ensured that they not only represent the sub-constructs well but also keep the sub-constructs non-overlapping and distinct. The following complex sentence illustrates how the five discrete sub-constructs are represented distinctly with the measures we selected:

It was quite a difficult decision for him, but he decided that he would go back to college but keep his job in order to pay off the tuition, fees, and living expenses.

Units marked in this sentence: 1 complex noun phrase; 2 coordinate clauses; 1 finite dependent clause; 1 coordinate phrase; 1 coordinate phrase; 1 non-finite element.

The measures selected for the composites of sub-constructs (i.e., the ones at the non-terminal nodes in the diagram) are overall length measures that have been commonly used in the writing literature. These measures represent the sub-constructs holistically rather than discretely, with the assumption that the length of the analytical unit for a composite level will increase when one or more sub-constructs under that level are utilized. An obvious drawback of such measures is that one would not know which discrete sub-constructs, if any, make a difference in the analysis at hand. However, such composites of sub-constructs are useful when none of the discrete sub-constructs makes a difference on its own. For the same example sentence shown above, we display below how the composite sub-constructs are represented with these length measures.

1 sentence (33 words)
    1 T-unit (8 words)
        1 clause (8 words): It was quite a difficult decision for him,
    1 T-unit (25 words)
        1 clause (3 words): but he decided¹
        1 clause (22 words): that he would go back to college but keep his job in order to pay off the tuition, fees, and living expenses.

¹ One anonymous reviewer questioned counting “but he decided” as a three-word clause, given that the dependent clause following it serves as the object of the main clause. To the best of our knowledge, to calculate the index of “mean length of clause”, text length (i.e., the total number of words in a text) is divided by the total number of clauses in the text. “[But] he decided” in the example is counted as a clause and is thereby technically counted as a three-word clause. See Hunt (1965) for such a mean clause length calculation method. Byrnes, Maxim, and Norris (2010) have similar examples of clause segmentation. The computational tool we used for our analysis counts such a unit as three words as well. We do see the limitation of this method of counting, particularly when finite object clauses are used frequently in a text, which can greatly reduce mean clause length.
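As a rough illustration of how the indices named in Fig. 1 relate to the unit counts in the example sentence, the Python sketch below divides the hand-annotated counts as the measure names suggest. It is only a sketch under stated assumptions: the counts are copied from the annotation of the example rather than produced by a parser, whitespace splitting stands in for a real word tokenizer, and the variable and function names are hypothetical.

```python
# Example sentence from the text; unit counts below are the hand annotations,
# not the output of any parser. All names here are illustrative only.
sentence = (
    "It was quite a difficult decision for him, but he decided that he would "
    "go back to college but keep his job in order to pay off the tuition, "
    "fees, and living expenses."
)

counts = {
    "words": len(sentence.split()),  # whitespace tokens (assumed tokenizer)
    "sentences": 1,
    "t_units": 2,                    # the two coordinate main clauses
    "clauses": 3,                    # finite clauses only
    "dependent_clauses": 1,
    "coordinate_phrases": 2,
    "complex_noun_phrases": 1,
    "non_finite_elements": 1,
}


def ratio(numerator, denominator):
    """Divide one unit count by another."""
    return counts[numerator] / counts[denominator]


indices = {
    "MLS": ratio("words", "sentences"),
    "TU/S": ratio("t_units", "sentences"),
    "MLTU": ratio("words", "t_units"),
    "DC/TU": ratio("dependent_clauses", "t_units"),
    "MLC": ratio("words", "clauses"),
    "CP/C": ratio("coordinate_phrases", "clauses"),
    "CNP/C": ratio("complex_noun_phrases", "clauses"),
    "NFE/C": ratio("non_finite_elements", "clauses"),
}

for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Because "but he decided" counts as a clause of its own, the 33 words are spread over three clauses, which is the effect on mean clause length that the footnote describes.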


Syntactic complexity and writing quality

A number of L2 and L1 studies have looked into the relationship between syntactic complexity and writing quality, with the latter typically indicated by holistic or analytic ratings of essays. While language proficiency is often assumed as a given in L1 writing studies, writing specialists agree that L2 writing ability involves both L2 proficiency and writing ability (see, e.g., Cumming, 1989; Weigle, 2002). A reasonable hypothesis regarding L2 writing is that increased language proficiency involves control over increasingly complex syntactic structures, while increased writing ability involves the successful deployment of these linguistic resources in the service of specific writing goals. A score on an L2 writing test may thus be an indicator of language proficiency, writing ability, or both, depending on the nature and purpose of the assessment and the scoring criteria. In this paper, we adopt the view that L2 writing quality (as judged by raters) is a function of both writing ability and language proficiency.

Among the multiple dimensions of syntactic complexity, few have been examined in terms of their relationship with L2 writing quality. Such studies have tended to employ overall length measures, with mean length of T-unit (MLTU) being the most commonly used, along with mean length of sentence (MLS) and mean length of clause (MLC). Clausal subordination (finite) has been of considerable interest as well, typically measured by clauses per T-unit (C/TU). Previous results on these measures have been rather mixed (see also Ortega, 2003). For the relationship between MLTU and writing quality, both significant (e.g., Homburg, 1984; Kameen, 1979) and non-significant findings (e.g., Larsen-Freeman & Strom, 1977; Nihalani, 1981) have been reported.
Similarly, for the relationship between finite clausal subordination and writing quality, both significant (e.g., Flahive & Snow, 1980; Homburg, 1984) and non-significant relationships (e.g., Bardovi-Harlig & Bofman, 1989; Kameen, 1979; Perkins, 1980) have been identified. Fewer studies have examined MLS and MLC, but these studies have revealed a significant relationship between complexity at those levels and writing quality (Homburg, 1984; Kameen, 1979). The same set of complexity variables has been examined in L1 studies on the relationship between syntactic complexity and writing quality as well, and these studies paint an equally unclear picture (for reviews, see Crowhurst, 1983; Hillocks, 1986).

As Ortega (2003) points out, previous L2 inquiries into the relationship between syntactic complexity and writing quality suffered from several limitations. Many of them used homogeneous language-proficiency groups (thereby yielding small between-group variance) and had small sample sizes. Further, they typically employed analysis of variance rather than correlation, so that many data values were lost. These limitations left early studies with less statistical power, i.e., less able to reveal significant findings when they existed. Consequently, conclusions from previous studies must be interpreted with caution. More empirical studies that can avoid or reduce these limitations are much needed.

Topic effect on syntactic complexity

The effects of variables related to writing tasks or prompts on textual features of the written product and scores or ratings have been studied in both the L1 and L2 writing literature (for overviews of this literature, see Shaw & Weir, 2007; Weigle, 2002).
These variables include, among others, genre (e.g., letters, essays, and reports), discourse mode (e.g., narrative, expository, and argumentative) and dimensions of the topic or subject matter itself (e.g., personal or impersonal, discipline-specific or general, and familiar or unfamiliar). The existing literature points to a general agreement that discourse mode affects syntactic complexity in writing, with potentially different effects for different syntactic complexity dimensions (e.g., Crowhurst & Piche, 1979; Lu, 2011; Ravid, 2004; San Jose, 1972), as well as an emerging picture that the relationship between syntactic complexity and writing quality is dependent on discourse mode (e.g., Beers & Nagy, 2009; Crowhurst, 1980; Spaan, 1993).

However, less is known about the effect of topics within genres or discourse modes on syntactic complexity and how topic may play a role in the relationship between complexity and writing quality. For the purposes of this paper, we define topic as exactly what is construed by the writing prompt (i.e., the actual wording of the writing task) and what the writers are specifically invited to write about. For the prompt “What are the advantages and disadvantages of homeschooling?”, the topic is not simply homeschooling, but specifically the pros and cons of homeschooling. In assessment settings, an examination of topic is as interesting and as important as other task variables, since much is to be learned about topic features that may affect equivalence of writers’ linguistic and writing performance across different topics, a condition for the reliability of an assessment.


In the analysis of topics, we have found the literature on task-based language teaching (TBLT) relevant and useful. TBLT is an approach to L2 curriculum design, including instruction and assessment. In this approach, real-world tasks involving the use of language and pedagogical tasks conducive to promoting learners’ ability to perform real-world tasks are used as the basic units to organize instruction and assessment (Ellis, 2003; Long & Crookes, 1993; Skehan, 1998). Writing an essay, as we see it, is a real-life task that may need to be performed in various situations. In the TBLT literature, there is explicit theorizing of the effect of task variables on syntactic complexity of language production, along with other performance features. The TBLT literature and the L1 and L2 writing literature overlap in some of the topic dimensions they have examined, such as personal versus impersonal topics and familiar versus unfamiliar topics (see the task complexity framework in Skehan, 1998, 2014 in particular).

What is perhaps more enlightening and relevant to the current study is the set of resource-directing dimensions in Robinson’s (2001, 2007, 2011) task cognitive complexity framework, which currently includes six dimensions: +/− here and now, +/− few elements, +/− spatial reasoning, +/− causal reasoning, +/− intentional reasoning, and +/− perspectives-taking, where the +/− signs denote with/without or more/less. These dimensions are seen to make cognitive/conceptual demands on learners that can direct the learners’ attention to form-function mappings. Robinson hypothesizes that increased task complexity along these dimensions will lead to higher syntactic complexity in language production. For example, learners will produce syntactically more complex language for tasks that require causal reasoning than for those that require no or less causal reasoning. Robinson’s theorizing of such an effect is in large part based on Givón’s (1985) notion that “greater structural complexity tends to accompany greater functional complexity in syntax” (p. 1021). We found Robinson’s task complexity dimensions pertinent to the tasks used in our study and will therefore employ them in our analysis of topic effect.

Overall, the current study aims to fill the research gaps by examining syntactic complexity as a multi-dimensional construct, and by considering the role of writing topic in the relationship between syntactic complexity and writing quality. The study aims to address the following three research questions:

1. What is the effect of topic on syntactic complexity (with its different dimensions) of ESL students’ writing?
2. What is the relationship between syntactic complexity (with its different dimensions) and quality of ESL students’ writing?
3. What is the predictive power of syntactic complexity (with its different dimensions) on ESL students’ writing quality?

Material and methods

Participants and data

The dataset used in this study was a subset of the essays collected by Weigle (2011) for a study investigating the validity of automated scores of TOEFL iBT independent writing tasks. Weigle collected essays on two different prompts from each of 386 nonnative English-speaking students studying at eight different institutions in the United States. Her participants included matriculated undergraduate and graduate students from 10 fields of study as well as non-matriculated English language students enrolled in language programs. The first prompt asked students to discuss whether people place too much emphasis on personal appearance (hereafter, the appearance topic). The second prompt asked students to discuss whether careful planning while young ensures a good future (hereafter, the future topic).

The two writing tasks do not differ in genre or discourse mode, as they are both argumentative essays, nor do they differ in the dimensions of being personal versus impersonal, or familiar versus unfamiliar. However, they do differ in one of the cognitive complexity dimensions in Robinson’s (2001, 2007, 2011) framework, namely, causal reasoning. Robinson (2005) defines causal reasoning as “justify[ing] beliefs, and support[ing] interpretations of why events follow each other by giving reasons.” The future topic tends to elicit causal reasoning in the sense that it requires the writers to justify why a good future follows or does not follow careful planning, whereas the appearance topic does not.

The participants had 30 minutes to write each essay. Half of them wrote on the appearance topic first and half on the future topic first. Each essay was rated on a five-point scale by two human raters (out of a pool of six trained raters) using the TOEFL iBT Independent writing scoring guide (ETS, 2008), with a third rater adjudicating if the scores of the two raters were separated by more than one point; only four percent of essays required such adjudication.


Table 1
Descriptive statistics for essay scores.

Topic        N     Length (words)         Score
                   Mean      Std. dev.    Mean    Std. dev.
Appearance   190   309.22    84.11        3.60    0.70
Future       190   348.15    104.53       3.70    0.80

The TOEFL rubric covers rating descriptors in the areas of task fulfillment, ideas development and organization, unity and coherence, and language use. The rubric does not explicitly address syntactic complexity as a criterion; however, higher scoring essays are expected to demonstrate “syntactic variety.”

In our study, we used 380 essays written by 190 matriculated graduate students whose essay and score data were available for both prompts from Weigle’s (2011) study. We chose to use graduate student data because the reliability of the automated tool we used to calculate the syntactic complexity indices has not been established for lower proficiency ESL writers, and English proficiency requirements tend to be more stringent for graduate programs than for undergraduate admission (typically above 79 for TOEFL iBT or above 550 for paper-based TOEFL). Thus, these participants’ L2 English proficiency can be said to fall in the intermediate-high to advanced range. Participant age ranged from 21 to 46 years, with a mean of 27.27; 109 were female, and 80 were male. Ten major fields of study were represented, the most common being social sciences (60 participants), business (37 participants), and natural sciences (23 participants); 38 L1 backgrounds were represented, with Chinese being the most common native language (87), followed by Korean (18) and Japanese (10). Table 1 provides the descriptive statistics for the essays in the dataset. The average of the ratings given by the two human raters was taken as the measure of writing quality for each essay.

Syntactic complexity measurement

The syntactic complexity of each essay was assessed using eight different measures representing the eight interconnected sub-constructs laid out in the Introduction.
These include mean length of sentence (MLS), T-units per sentence (TU/S), mean length of T-unit (MLTU), mean length of clause (MLC), dependent clauses per T-unit (DC/TU), coordinate phrases per clause (CP/C), complex noun phrases per clause (CNP/C), and non-finite elements per clause (NFE/C). The definitions of the eight measures and the sub-constructs they represent are summarized in Table 2.

Table 2
Syntactic complexity measures.

Sub-construct                        Measure                                  Definition
Overall sentence complexity          Mean length of sentence (MLS)            Number of words divided by number of sentences
Clausal coordination                 T-units per sentence (TU/S)              Number of T-units divided by number of sentences
Overall T-unit complexity            Mean length of T-unit (MLTU)             Number of words divided by number of T-units
Clausal subordination                Dependent clauses per T-unit (DC/TU)     Number of dependent clauses divided by number of T-units
Elaboration at clause level          Mean length of clause (MLC)              Number of words divided by number of clauses
Phrasal coordination                 Coordinate phrases per clause (CP/C)     Number of coordinate phrases divided by number of clauses
Noun phrase complexity               Complex NPs per clause (CNP/C)           Number of complex NPs divided by number of clauses
Non-finite elements/subordination    Non-finite elements per clause (NFE/C)   Number of non-finite elements divided by number of clauses

The essays were analyzed using the L2 Syntactic Complexity Analyzer (L2SCA) (Lu, 2010), with some minor adaptations. This analyzer takes a written English text as input, produces frequency counts of nine linguistic units in the text (word, sentence, clause, dependent clause, T-unit, complex T-unit, coordinate phrase, complex nominal, and verb phrase), and generates 14 indices of syntactic complexity for the text. We followed Lu’s (2010, 2011) definitions for most of the linguistic units and computed six measures (MLS, MLTU, MLC, TU/S, DC/TU, and CP/C) with the original version of L2SCA. Then, following Biber et al. (2011), we defined complex noun phrases as noun phrases that contain one or more of the following: pre-modifying adjectives, post-modifying prepositional phrases, and post-modifying appositives. The pattern used to identify complex nominals in the original L2SCA was modified accordingly to match this definition in order to calculate CNP/C. This measure does not cover the full range of NP complexity (cf. Bulté & Housen, 2012; Ravid & Berman, 2010); however, other types of NP complexity such as relative clauses and non-finite modifications of nouns are captured within the measures of DC/TU and NFE/C. Finally, to calculate non-finite elements per clause, we subtracted 1 from the measure of verb phrases per clause, since by definition a clause contains one finite VP, and the other VPs are therefore non-finite.

Statistical analysis

The complexity indices and writing scores of the essays were analyzed to answer the three research questions. First, dependent samples t tests were conducted to examine the effect of writing topic on the syntactic complexity of the students’ writing. Second, Pearson’s product-moment correlations between syntactic complexity indices and writing scores were calculated for each topic to identify the relationship between syntactic complexity and the quality of the essays. Finally, regression analyses were run for each topic to assess the predictive power of syntactic complexity on the writing scores.²

We took two approaches for the regression analyses: the first to look at global syntactic complexity features and the second to look at local-level complexity features. In the first approach, we used MLS and MLTU, separately, to examine the predictive power of these global syntactic complexity features on writing scores. In the second, we conducted all-possible-subsets regression analyses, with the measures for the six local-level complexity sub-constructs as predictors in the full model (i.e., TU/S, DC/TU, MLC, CP/C, CNP/C, and NFE/C).
These predictor variables did not have problems with multicollinearity, i.e., high inter-correlations among predictor variables, as tolerance values for each of the measures were all above 0.10. The all-possible-subsets regression, in contrast to the often-used step methods (forward, backward, and forward stepwise), makes possible an exhaustive analysis of all subsets (often combinations) of predictor variables and their predictive power. Instead of producing only one regression model, the all-possible-subsets regression method provides several regression models that can predict the dependent variable well, often as well as what the step methods may produce (Huberty, 1989; Kutner, Neter, Nachtsheim, & Li, 2005; Stevens, 2009). The researcher can then choose the “best” regression model based on other relevant criteria and can also observe patterns based on the best models produced.

The all-possible-subsets regression analyses for each of the topics in our study were conducted with the Automatic Linear Modeling function in SPSS version 21. Akaike Information Criterion Corrected (AICC, as defined by Hurvich & Tsai, 1989) was used as the information criterion to determine the best regression models. The best models based on AICC are the ones that have an SSE (sum of squares for the error) as small as the one for the full model and have a smaller number of predictors. The smaller the AICC is, the better a model is.

Results

Research question 1: effect of topic on syntactic complexity

Table 3 displays the descriptive statistics for the syntactic complexity features used in the essays for the appearance topic and the future topic; the t statistics and the p values indicate the statistical testing results for the topic comparison for each of the features, and Cohen’s d values show the effect sizes.
In comparison to essays on the future topic, essays on the appearance topic in general showed a significantly higher amount of elaboration at the finite clause level, as can be observed in the significantly higher values for MLC, CP/C, and CNP/C. On the other hand, essays on the future topic utilized a significantly higher amount of subordination, both finite and non-finite, as can be seen in the significantly higher values for DC/TU and NFE/C. Essays on the future topic also displayed significantly greater overall sentence complexity, as measured by MLS. There were, however, no statistically significant differences in overall T-unit complexity as measured by MLTU, and clausal coordination as measured by TU/S, for essays on the two topics. As

² Although path analyses were deemed more appropriate for such a question for our study, since they can take into account the hierarchical relationships among the predictors, we were unable to successfully run path analyses on our data. This was probably due to the complexity of the model based on Fig. 1, the relatively small sample size, and potentially other unknown factors.


Table 3
Syntactic complexity indices by writing topic.

| Sub-construct | Measure | Appearance topic, mean (SD) | Future topic, mean (SD) | t | p^a | Cohen’s d |
|---|---|---|---|---|---|---|
| Overall sentence complexity | Mean length of sentence (MLS) | 18.55 (4.27) | 19.47 (4.61) | 3.36 | 0.001^b | 0.21 |
| Clausal coordination | T-units per sentence (TU/S) | 1.11 (0.11) | 1.13 (0.13) | 2.11 | 0.036 | 0.17 |
| Overall T-unit complexity | Mean length of T-unit (MLTU) | 16.70 (3.63) | 17.22 (3.80) | 2.01 | 0.046 | 0.14 |
| Clausal subordination | Dependent clauses per T-unit (DC/TU) | 0.75 (0.36) | 0.92 (0.34) | 6.10 | 0.000^b | 0.51 |
| Elaboration at clause level | Mean length of clause (MLC) | 9.62 (1.64) | 8.94 (1.63) | 5.08 | 0.000^b | 0.41 |
| Non-finite subordination | Non-finite elements per clause (NFE/C) | 0.35 (0.13) | 0.43 (0.18) | 6.21 | 0.000^b | 0.55 |
| Phrasal coordination | Coordinate phrases per clause (CP/C) | 0.32 (0.15) | 0.18 (0.11) | 11.75 | 0.000^b | 1.09 |
| Noun phrase complexity | Complex NPs per clause (CNP/C) | 0.94 (0.30) | 0.72 (0.28) | 9.42 | 0.000^b | 0.78 |

a The alpha value for the analysis was adjusted to 0.05/8, or 0.00625, by the Bonferroni correction for multiple tests, as eight tests were done.
b p < 0.00625.
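The effect sizes and the adjusted alpha in Table 3 can be checked by hand. The sketch below is illustrative only: it assumes that Cohen’s d was computed from the two topic means and the root mean square of the two standard deviations, which the article does not state, and the names `TABLE_3`, `cohens_d`, and `effect_size_table` are hypothetical helpers introduced here.

```python
import math

# Rows copied from Table 3:
# (measure, appearance mean, appearance SD, future mean, future SD, reported d)
TABLE_3 = [
    ("MLS", 18.55, 4.27, 19.47, 4.61, 0.21),
    ("TU/S", 1.11, 0.11, 1.13, 0.13, 0.17),
    ("MLTU", 16.70, 3.63, 17.22, 3.80, 0.14),
    ("DC/TU", 0.75, 0.36, 0.92, 0.34, 0.51),
    ("MLC", 9.62, 1.64, 8.94, 1.63, 0.41),
    ("NFE/C", 0.35, 0.13, 0.43, 0.18, 0.55),
    ("CP/C", 0.32, 0.15, 0.18, 0.11, 1.09),
    ("CNP/C", 0.94, 0.30, 0.72, 0.28, 0.78),
]

# Bonferroni correction: the nominal alpha divided by the number of tests.
ADJUSTED_ALPHA = 0.05 / len(TABLE_3)


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Absolute standardized mean difference.

    Assumption: the two SDs are combined as a root mean square; the
    article does not say which variant of d it used.
    """
    combined_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / combined_sd


def effect_size_table(rows=TABLE_3):
    """Return (measure, recomputed d, reported d) for every row."""
    return [
        (name, round(cohens_d(m_a, sd_a, m_f, sd_f), 2), reported)
        for name, m_a, sd_a, m_f, sd_f, reported in rows
    ]


if __name__ == "__main__":
    print(f"Bonferroni-adjusted alpha: {ADJUSTED_ALPHA}")
    for name, recomputed, reported in effect_size_table():
        print(f"{name:6s} recomputed d = {recomputed:.2f}  reported d = {reported:.2f}")
```

Because the inputs are rounded to two decimals, the recomputed values can differ from the tabled ones by a few hundredths; the ordering of effect sizes, with phrasal coordination largest, is unaffected.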

Table 4
Pearson correlations between syntactic complexity indices and writing scores.

| Sub-construct | Measure | Appearance topic | Future topic |
|---|---|---|---|
| Overall sentence complexity | Mean length of sentence (MLS) | 0.27^a | 0.25^a |
| Clausal coordination | T-units per sentence (TU/S) | 0.15 | 0.11 |
| Overall T-unit complexity | Mean length of T-unit (MLTU) | 0.21^a | 0.22^a |
| Clausal subordination | Dependent clauses per T-unit (DC/TU) | 0.14 | 0.08 |
| Elaboration at clause level | Mean length of clause (MLC) | 0.13 | 0.23^a |
| Non-finite subordination | Non-finite elements per clause (NFE/C) | 0.03 | 0.20^a |
| Phrasal coordination | Coordinate phrases per clause (CP/C) | 0.03 | 0.21^a |
| Noun phrase complexity | Complex NPs per clause (CNP/C) | 0.12 | 0.20^a |

The alpha value for the analysis was adjusted to 0.05/8, or 0.00625, by the Bonferroni correction for multiple tests, as eight tests were done.
a p < 0.00625.
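For a regression with a single predictor, R² equals the squared Pearson correlation and the F statistic equals the squared t statistic for that correlation. The following sketch uses these standard identities to relate the MLS and MLTU correlations in Table 4 to the R² and F(1, 188) values reported for the single-predictor regressions under research question 3. The sample size of 190 is taken from the abstract; the function and variable names are hypothetical.

```python
import math

N_WRITERS = 190  # essays per topic, as reported in the abstract

# (topic, measure, r from Table 4, reported R^2, reported F(1, 188))
GLOBAL_MEASURES = [
    ("appearance", "MLS", 0.27, 0.07, 14.33),
    ("appearance", "MLTU", 0.21, 0.04, 8.71),
    ("future", "MLS", 0.25, 0.06, 12.50),
    ("future", "MLTU", 0.22, 0.05, 9.64),
]


def shared_variance(r):
    """Proportion of score variance shared with one predictor (r squared)."""
    return r ** 2


def t_from_r(r, n=N_WRITERS):
    """t statistic for a Pearson correlation with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


if __name__ == "__main__":
    for topic, measure, r, reported_r2, reported_f in GLOBAL_MEASURES:
        r2 = shared_variance(r)
        f_stat = t_from_r(r) ** 2
        print(f"{topic:10s} {measure:4s} r^2 = {r2:.3f} (reported {reported_r2}), "
              f"t^2 = {f_stat:.2f} (reported F = {reported_f})")
```

The small gaps between recomputed and reported values come from the correlations being rounded to two decimals. The same arithmetic shows why correlations of 0.20 to 0.27 correspond to only about 4–7% of shared variance.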

Cohen’s d values in Table 3 show that, except for the small effect size observed for MLS, the effect sizes for all the significant differences found for the local-level features are moderate to large, indicating the practical significance of these differences.3

3 Cohen (1988) considers 0.20 to be a small effect for t tests, 0.50 to be a moderate effect, and 0.80 to be a large effect.

Research question 2: relationship between syntactic complexity and writing quality

Table 4 summarizes the correlations between each of the eight syntactic complexity indices and writing scores for the two topics. First, MLS and MLTU, indicating overall sentence complexity and overall T-unit complexity respectively, correlated significantly and positively with writing scores for both topics. Second, all four measures pertaining to elaboration at the finite clause level (MLC, NFE/C, CP/C, and CNP/C) correlated significantly and positively with writing scores for the future topic, but not for the appearance topic. Third, DC/TU, measuring finite clausal subordination, did not significantly correlate with writing scores for either topic, but its correlation for the appearance topic (0.14) was almost twice as large as that for the future topic (0.08). Finally, TU/S, showing clausal coordination, did not correlate significantly with writing scores for either topic. It should also be noted that the strength of the relationship for all significant findings is rather low overall, ranging from 0.20 to 0.27.

Research question 3: predictive power of syntactic complexity on writing quality scores

In our first approach to the regression analyses, MLS and MLTU were used separately as predictors of writing scores. For the appearance topic, MLS was found to be a significant predictor of writing scores (R² = 0.07, F(1, 188) = 14.33, p < 0.001; b = 0.05); so was MLTU (R² = 0.04, F(1, 188) = 8.71, p < 0.01; b = 0.04). For the future topic, MLS was also found to be a significant predictor of writing scores (R² = 0.06, F(1, 188) = 12.50, p < 0.001; b = 0.04); so was MLTU (R² = 0.05, F(1, 188) = 9.64, p < 0.01; b = 0.05). The analyses showed that both MLS and MLTU were significant, consistent predictors of scores across the two topics, accounting for a small proportion of the variance in scores (4–7%). MLS was a slightly stronger predictor of scores than MLTU. Further, the regression coefficients (b) showed that each additional word per sentence or T-unit was associated with an increase of 0.04–0.05 in scores.

Table 5
Best regression models based on all-possible-subsets regression.

| Topic | Regressors | SSE | R² | Adj. R² | AICC |
|---|---|---|---|---|---|
| Appearance | MLC, DC/TU, TU/S | 93.99 | 0.09 | 0.07 | -125.52 |
| Appearance | DC/TU, TU/S, CNP/C | 94.69 | 0.08 | 0.06 | -124.11 |
| Appearance | MLC, DC/TU, TU/S, NFE/C | 93.67 | 0.09 | 0.07 | -124.06 |
| Appearance | MLC, DC/TU, TU/S, CNP/C | 93.85 | 0.09 | 0.07 | -123.68 |
| Appearance | MLC, DC/TU, TU/S, CP/C | 93.93 | 0.09 | 0.07 | -123.53 |
| Future | MLC, TU/S, DC/TU | 108.02 | 0.09 | 0.08 | -99.09 |
| Future | DC/TU, TU/S, CP/C, NFE/C | 107.30 | 0.10 | 0.08 | -98.24 |
| Future | DC/TU, TU/S, NFE/C, CNP/C | 107.33 | 0.10 | 0.08 | -99.09 |
| Future | MLC, DC/TU, TU/S, CP/C | 107.38 | 0.10 | 0.08 | -98.86 |
| Future | DC/TU, TU/S, CP/C, NFE/C, CNP/C | 106.20 | 0.11 | 0.08 | -98.63 |

MLC = mean length of clause; DC/TU = dependent clauses per T-unit; TU/S = T-units per sentence; CP/C = coordinate phrases per clause; CNP/C = complex noun phrases per clause; NFE/C = non-finite elements per clause.

In our second approach to the regression analyses, we used all-possible-subsets regression and entered measures for all six local-level sub-constructs (i.e., TU/S, DC/TU, MLC, CP/C, CNP/C, and NFE/C) as predictors. Table 5 displays the five best regression models for each of the two topics. The first row for each topic shows the “best” regression model, and the order of the variables in that row is based on their importance in predicting the scores for that topic, with the most important listed first. The order of presentation of the five best models for each topic is based on AICC values, with lower values considered better. As can be seen in Table 5, the “best” model for both topics consisted of MLC, DC/TU, and TU/S, and MLC was the most important predictor in both cases; DC/TU was the second most important for the appearance topic, while TU/S was the second most important for the future topic. For the appearance topic, these three predictors accounted for 9% of the variance in scores (F(3, 186) = 5.83, p < 0.001); for the future topic, they also explained 9% of the variance in scores (F(3, 186) = 6.37, p < 0.001). The other best models for the appearance topic, particularly the second best one, further show that among the sub-constructs subsumed under MLC, only CNP/C was relatively important in predicting scores for this topic.
In contrast, the other four best models for the future topic demonstrate the importance of all the sub-constructs subsumed under MLC (i.e., CP/C, NFE/C, and CNP/C) in predicting scores for this topic. Table 6 lists the predictors in the “best” model for each topic, presented in the order of their importance and with their b (regression coefficient), β (standardized regression coefficient), and p (significance) values. The table further demonstrates the relative importance of the three predictors in the “best” regression models. Comparatively speaking, the b values indicate that MLC is a more important predictor of scores for the future topic than for the appearance topic, while DC/TU and TU/S have greater importance in predicting scores for the appearance topic than for the future topic. For example, one additional word in each finite clause is associated with an increase of 0.14 in scores for the future topic but an increase of only 0.10 in scores for the appearance topic.
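The AICC values that rank the models in Table 5 can be approximated from the SSE column. The sketch below assumes the least-squares form of the corrected criterion, AICC = n ln(SSE/n) + 2k + 2k(k + 1)/(n - k - 1), with n = 190 essays and k counted as the number of predictors plus the intercept. This parameter count is an assumption about how the software computed the criterion, not something stated in the text, and `aicc_from_sse` is a hypothetical helper.

```python
import math

N_ESSAYS = 190  # essays per topic


def aicc_from_sse(sse, n_predictors, n=N_ESSAYS):
    """Corrected AIC for a least-squares model, computed from its SSE.

    Assumption: k counts the slopes plus the intercept.
    """
    k = n_predictors + 1
    aic = n * math.log(sse / n) + 2 * k
    small_sample_correction = 2 * k * (k + 1) / (n - k - 1)
    return aic + small_sample_correction


# Appearance-topic models from Table 5:
# (regressors, SSE, magnitude of the tabled AICC)
APPEARANCE_MODELS = [
    (("MLC", "DC/TU", "TU/S"), 93.99, 125.52),
    (("DC/TU", "TU/S", "CNP/C"), 94.69, 124.11),
    (("MLC", "DC/TU", "TU/S", "NFE/C"), 93.67, 124.06),
    (("MLC", "DC/TU", "TU/S", "CNP/C"), 93.85, 123.68),
    (("MLC", "DC/TU", "TU/S", "CP/C"), 93.93, 123.53),
]


def recomputed_aicc(models=APPEARANCE_MODELS):
    """Return the recomputed AICC for each model, in table order."""
    return [aicc_from_sse(sse, len(regressors)) for regressors, sse, _ in models]


if __name__ == "__main__":
    for (regressors, sse, tabled), value in zip(APPEARANCE_MODELS, recomputed_aicc()):
        print(f"{', '.join(regressors):28s} SSE = {sse:6.2f}  "
              f"AICC = {value:8.2f}  (tabled magnitude {tabled})")
```

Under these assumptions the recomputed values are negative, because SSE/n is below 1, and their magnitudes agree with the tabled ones to within about 0.01. The first model has the lowest value, which is why it is listed as the “best” model even though a four-predictor model has a slightly smaller SSE: the criterion penalizes each additional predictor.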

Table 6
“Best” regression models and regression coefficients b (β).

| Topic | Regressor | b (β) | p |
|---|---|---|---|
| Appearance | MLC | 0.10 (0.22) | 0.003 |
| Appearance | DC/TU | 0.43 (0.21) | 0.005 |
| Appearance | TU/S | 1.21 (0.18) | 0.013 |
| Future | MLC | 0.14 (0.28) | 0.000 |
| Future | TU/S | 0.98 (0.16) | 0.022 |
| Future | DC/TU | 0.31 (0.13) | 0.064 |

MLC = mean length of clause; DC/TU = dependent clauses per T-unit; TU/S = T-units per sentence.
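The two coefficients in Table 6 answer different questions: b gives the expected change in score for a one-unit change in a predictor, while the standardized coefficient expresses the same slope in standard-deviation units. A minimal sketch of how they relate follows, using the standard identity that the standardized coefficient equals b multiplied by the predictor SD and divided by the outcome SD. The predictor SDs are taken from Table 3. The score SD that falls out of the calculation is only an inference from rounded values, not a figure reported in this section, and all names are hypothetical.

```python
# (topic, regressor, b, standardized coefficient) from Table 6
BEST_MODELS = [
    ("appearance", "MLC", 0.10, 0.22),
    ("appearance", "DC/TU", 0.43, 0.21),
    ("appearance", "TU/S", 1.21, 0.18),
    ("future", "MLC", 0.14, 0.28),
    ("future", "TU/S", 0.98, 0.16),
    ("future", "DC/TU", 0.31, 0.13),
]

# Predictor standard deviations from Table 3
PREDICTOR_SD = {
    ("appearance", "MLC"): 1.64,
    ("appearance", "DC/TU"): 0.36,
    ("appearance", "TU/S"): 0.11,
    ("future", "MLC"): 1.63,
    ("future", "TU/S"): 0.13,
    ("future", "DC/TU"): 0.34,
}


def predicted_score_change(b, delta):
    """Expected change in score when the predictor changes by delta units."""
    return b * delta


def implied_score_sd(b, beta, predictor_sd):
    """Outcome SD implied by beta = b * SD_x / SD_y (an inference, not a reported value)."""
    return b * predictor_sd / beta


def implied_sds_by_topic(models=BEST_MODELS):
    """Collect the implied score SDs for each topic."""
    result = {}
    for topic, regressor, b, beta in models:
        sd_x = PREDICTOR_SD[(topic, regressor)]
        result.setdefault(topic, []).append(implied_score_sd(b, beta, sd_x))
    return result


if __name__ == "__main__":
    # One extra word per finite clause, as in the example in the text.
    print(predicted_score_change(0.14, 1), predicted_score_change(0.10, 1))
    for topic, values in implied_sds_by_topic().items():
        print(topic, [round(v, 2) for v in values])
```

Within each topic the three implied score SDs are close to one another, which suggests that the b and standardized values in Table 6 are mutually consistent with the predictor SDs in Table 3.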

In summary, the regression findings for both topics showed that syntactic complexity was a significant predictor of writing scores, explaining a consistent yet rather small proportion of variance in writing scores across the two topics. However, although the importance of the two global syntactic complexity measures (i.e., MLS and MLTU) in predicting scores was consistent across the two topics, the importance of the local-level complexity measures in predicting scores varied across the two topics.

Synthesis of the research findings

Merging the results for the three research questions, we observed two main patterns, categorized according to the level of syntactic complexity dimensions. At the global complexity levels, as indicated by MLS and MLTU, topic did not have much effect on the syntactic complexity features. These global features were also found to have a significant, positive, and consistent relationship with writing quality scores across the two topics. At the local level, though, topic was found to exert significant and greater effects on the syntactic complexity features, with the exception of clausal coordination (measured by TU/S). Essays on the appearance topic included a significantly higher amount of elaboration at the finite clause level, primarily attributable to the use of more coordinate phrases and complex noun phrases. The future topic elicited significantly more use of subordination, both finite and non-finite. This suggests that specific topics may naturally elicit more use of certain syntactic complexity features. What was particularly intriguing was that this topic effect was connected in a patterned manner with the relationship between the local-level syntactic complexity features and writing quality scores.
Specifically, although the appearance topic elicited a significantly higher amount of elaboration at the finite clause level, through more use of coordinate phrases and complex noun phrases, these features did not significantly correlate with writing scores for essays on this topic and were not as important in predicting scores; on the other hand, while essays on the future topic used significantly fewer of these features, scores on these essays had a significant, positive correlation with the frequency of these features and were explained more by these features. Similarly, although finite subordination was used more in the essays for the future topic, it did not have a significant relationship with writing quality scores for those essays and was not as important in predicting the scores. Meanwhile, a reversed pattern was observed for the appearance topic: the lower usage of finite subordination in the essays on this topic was accompanied by a much stronger, positive relationship between finite subordination and writing quality scores for those essays. The only syntactic complexity feature that did not show such a reversed pattern was non-finite subordination: it was used more in the essays on the future topic and also showed a significant relationship with scores for those essays. What the combined results suggest is that the writers who were able to use not only topic-intrinsic complexity features but also other types of local-level complexity features were awarded higher scores, which could well be an acknowledgment of their higher linguistic ability and/or writing ability. We show in Appendix A the use of the syntactic complexity features in essays for the two topics at two different score points, demonstrating how a higher level of syntactic complexity and variation in features is achieved in the more highly rated essays and how the more topic-intrinsic complexity features are also prominent in the lower-scored essays for each topic.
The first two samples are excerpts from two essays on the appearance topic rated at 3.5 and 5 points, respectively. In both, coordinate phrases and complex noun phrases were frequently employed, providing descriptions and lengthening the clauses. However, in the lower-scored essay sample, the use of finite and non-finite subordination was much more limited than in the higher-scored essay sample. The higher-scored essay demonstrated much greater syntactic complexity and variation. The second two essay samples illustrate how, for the future topic, both essays utilized finite and non-finite subordination, making both essays highly propositional. However, the lower-scored essay sample used only a few complex noun phrases and no coordinate phrases. In contrast, the higher-scored essay sample contained considerably more complex noun phrases and several coordinate phrases, making the essay highly propositional as well as descriptive. Like its counterpart for the appearance topic, this higher-scored essay embodied both high syntactic complexity and high syntactic variation. It should be noted that the lower- and higher-scored essay samples for each topic also differ in other linguistic features, notably lexical sophistication, which has been found to significantly correlate with scores on argumentative essays (e.g., Yang & Weigle, 2011). Collectively, however, these samples illustrate how syntactic complexity differs in writing samples of different quality and how it may be affected by topic.
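The reversed pattern described in this synthesis can be tabulated directly from the means in Table 3 and the correlations in Table 4. The sketch below is one possible operationalization, not the authors’ procedure: a feature is flagged as “reversed” when the topic that elicited more of it is not the topic where it correlates more strongly with scores. TU/S is left out because its topic difference was not significant. All names are hypothetical.

```python
# measure: (appearance mean, future mean, appearance r, future r)
# Means come from Table 3 and correlations from Table 4.
LOCAL_FEATURES = {
    "DC/TU": (0.75, 0.92, 0.14, 0.08),
    "MLC": (9.62, 8.94, 0.13, 0.23),
    "NFE/C": (0.35, 0.43, 0.03, 0.20),
    "CP/C": (0.32, 0.18, 0.03, 0.21),
    "CNP/C": (0.94, 0.72, 0.12, 0.20),
}


def larger_of(appearance_value, future_value):
    """Name the topic with the larger value."""
    return "appearance" if appearance_value > future_value else "future"


def shows_reversed_pattern(app_mean, fut_mean, app_r, fut_r):
    """True when the topic with more of a feature has the weaker correlation."""
    return larger_of(app_mean, fut_mean) != larger_of(app_r, fut_r)


def reversed_features(features=LOCAL_FEATURES):
    """Map each measure to whether it shows the reversed pattern."""
    return {name: shows_reversed_pattern(*values) for name, values in features.items()}


if __name__ == "__main__":
    for name, is_reversed in reversed_features().items():
        app_mean, fut_mean, app_r, fut_r = LOCAL_FEATURES[name]
        print(f"{name:6s} more frequent in {larger_of(app_mean, fut_mean):10s} "
              f"stronger correlation in {larger_of(app_r, fut_r):10s} "
              f"reversed = {is_reversed}")
```

With this rule, every local-level feature except non-finite subordination (NFE/C) is flagged, which matches the description above.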

Discussion

Examining syntactic complexity as a multi-dimensional construct (Norris & Ortega, 2009) with different levels of sub-constructs, the study revealed complex yet patterned findings about the relationship between syntactic complexity and writing quality and the role of topic in this relationship. The discussion centers on two main areas that our study is able to illuminate: (1) the systematic and patterned findings associated with topic in linguistic performance; (2) measurement issues pertaining to syntactic complexity.

One particularly intriguing question arising from our results is why such topic effects on syntactic complexity were observed and whether such findings are generalizable in any way to other topics. In our study, with one topic eliciting a higher amount of elaboration at the finite clause level and the other inviting greater subordination, we concluded that certain topics may naturally call for more use of certain local-level syntactic complexity features. One possible explanation for our findings is that the future topic demands causal reasoning in task performance while the appearance topic does not, a comparison we laid out in the Material and methods section. Since causal reasoning requires juxtaposing the relationship between two or more entities or events, it is reasonable that the future topic elicited more frequent use of multi-propositional sentences containing subordination. In contrast, the appearance topic does not demand causal reasoning; it simply involves one entity (i.e., appearance) and one proposition (i.e., people are placing too much emphasis on appearance) and asks about the truth value of that proposition. Most likely due to these factors, the appearance topic elicited more descriptions, rather than propositions.
Since Robinson’s (2001, 2007, 2011) framework for task complexity makes hypotheses about the relationship between causal reasoning and syntactic complexity in language production, the current study is able to illuminate the proposed relationship. Robinson predicts that an increase in causal reasoning will lead to an increase in the syntactic complexity of language production. The findings of the current study provide partial support for the prediction and show the prediction to be true for some of the syntactic complexity sub-constructs but not others. Overall, though, the findings support Robinson’s prediction, which is primarily concerned with the amount of subordination as the main syntactic complexity construct, since the future topic, which required causal reasoning, indeed elicited more use of subordination. However, the view of the syntactic complexity construct in the TBLT literature has also been expanded to include other sub-constructs such as elaboration at the clause level (see Skehan, 2014). Examined this way, Robinson’s prediction is not supported regarding the other syntactic complexity sub-constructs.

The findings of the study can also illuminate measurement choices for syntactic complexity. Such considerations are two-fold, concerning examinations of task effects on syntactic complexity and examinations of the relationship between syntactic complexity and writing quality (i.e., writing proficiency and linguistic proficiency). First of all, based on the findings of the current study, when we investigate task effects on syntactic complexity, syntactic complexity is ideally measured multi-dimensionally. The study demonstrates that task may have different effects on different sub-constructs of syntactic complexity, as Crowhurst and Piche’s (1979) study also indicated.
When claims or hypotheses are made about task effects on syntactic complexity, consideration should be given to which sub-constructs may be affected and why, and predictions should be made explicitly in relation to different sub-constructs. Currently, in the task-based language literature, the most commonly used sub-construct is the amount of subordination (Skehan, 2014). Global complexity, clausal coordination, overall elaboration at the finite clause level, phrasal coordination, and noun-phrase complexity have not been given adequate attention. Secondly, in examinations of the relationship between syntactic complexity and writing quality, our study suggests that either of the two global measures (MLS and MLTU) can work well as a generic syntactic complexity measure, since both were found to significantly and consistently predict writing scores across topics. However, it is equally important to identify which local-level complexity features are at work in predicting scores. The challenge of using local-level complexity features, though, is that different topics may need different constellations of these features in predicting writing scores, or that the importance of each local-level feature in predicting scores varies across topics. Based on our findings, MLC, TU/S, and DC/TU could be used to collectively predict scores across topics. However, the regression coefficients (or their relative importance) for each of these measures are likely to differ across topics. Examined in conjunction with the syntactic complexity measures commonly used in the writing literature (see Ortega, 2003), the most commonly used measure, MLTU, could work, ideally along with the clausal coordination measure of TU/S.
However, using local-level measures such as MLC or DC/TU as the only syntactic complexity measure, or as the only local-level measure in addition to the global measure of MLTU, to make claims about the studied relationships can be problematic, given that choices of local-level measures should be contingent upon topics and that three or more local-level measures are needed to function collectively to predict writing scores. Such complex relationships between syntactic complexity and writing quality, mediated by the topic, could also explain the mixed findings reported about the syntactic complexity–writing quality relationship in earlier studies (Crowhurst, 1983; Hillocks, 1986; Ortega, 2003). In general, the study illustrates the importance of considering syntactic complexity measurement choices in view of potential influences from the topic of the discourse and the cognitive operations that may be invited by the topic. Syntactic complexity is also studied and measured to answer other research questions, such as trajectories of syntactic development and maturity (e.g., Hunt, 1965; Ravid & Berman, 2010; Nippold, Hesketh, Duthie, & Mansfield, 2005) and comparisons of syntactic complexity features in different registers (e.g., Biber et al., 2011). In investigations of these other questions, topic and other task factors must also be carefully considered.

Finally, the study demonstrates that diversity and variation in the local-level syntactic complexity features employed, rather than mere use of the complexity features called for by the topic, contribute to higher ratings of the writing. This indicates that one essence of linguistic development and L2 writing development is seen in learners’ ability to stretch their linguistic repertoire and achieve linguistic complexity in ways not constrained by the task or the topic, shown in the greater linguistic resources and means to attain greater diversity and sophistication in language use. In not only meeting the task demands but also going beyond what is expected by enriching the discourse through other complex constructions and the meanings embodied, learners demonstrate their highest linguistic ability and convey their thoughts in sophisticated and linguistically impressive ways.
The observation that greater variation in the syntactically complex structures employed in writing is a benchmark of higher writing quality is also seen in the work of Myhill (2008, 2009), who examined L1 young children’s writing. Berman (2008) similarly found that L1 speakers/writers’ ability to “stack” or “nest” different finite clauses through coordination and subordination developed as a function of age. Our findings thus support syntactic variety (specifically, variety of complex structures used) as one criterion in writing rating rubrics, as found in some existing rating rubrics (e.g., Jacobs, Zinkgraf, Wormuth, Hartfiel, & Hughey, 1981; Gentile, Riazantseva, & Cline, 2002).

Conclusions

The study revealed intricate relationships among writing topic, syntactic complexity of writing, and writing quality. The relationship between syntactic complexity and writing quality was found to be significant and rather constant across different topics at the global syntactic complexity levels (global sentence complexity and global T-unit complexity). Yet, such a relationship was found to vary across topics at the local complexity levels (clausal coordination, finite subordination, overall elaboration at the finite clause level, non-finite subordination, phrasal coordination, and noun-phrase complexity). The generic length measures of mean length of sentence and mean length of T-unit may work in predicting writing scores across topics, but they fall short in their capacity to indicate exactly how syntactic complexity is achieved by ESL writers on a given topic. In general, however, syntactic variety achieved by writers, using not only features naturally called for by a certain topic but also other features to add to the variation, appeared to contribute to higher writing quality as judged by human raters. The study had a number of strengths, such as the use of repeated measures, a relatively large sample size, and a thorough examination of the syntactic complexity construct.
The findings and our interpretations should, however, be viewed in relation to other features of our study design. In particular, our writer sample was limited to ESL graduate students, who are likely to be more linguistically and cognitively mature than many other ESL populations.4 Second, the writing tasks were argumentative tasks only, so the findings are not to be generalized to other rhetorical tasks. Third, only two writing topics were examined and thus the findings, although revealing, may be confined to certain topics. Finally, there were not as many lower-scored essays as higher-scored essays in our data, which may have affected the strength of the relationships reported in our study. We certainly welcome future inquiries that replicate our study with other writing topics and writer populations. Further, for future work, syntactic complexity sub-constructs can be conceptualized and measured in a more fine-grained manner, taking a more functional perspective and taking into account different types of subordination (complement, adverbial, and adjective; finite and non-finite), since these different subordination types may not develop with age and proficiency in similar ways and may function differently in different discourses, as suggested in the work of Hunt (1965), Nippold et al. (2005), Nippold, Mansfield, and Billow (2007), and Biber et al. (2011), which all primarily examined L1 language samples. Likewise, with increasing interest in noun-phrase complexity, sophisticated and more sensitive measures to tap the complexity that arises from the use of different types of modification for head nouns (see Bulté & Housen, 2012; Ravid & Berman, 2010) can be pursued in our understanding of syntactic complexity and its relationship with other variables.

4 Whether our findings will be borne out for lower-proficiency and younger or older writers is an open question. Both language proficiency and learners’ age have been found to affect syntactic complexity in language production (see Ortega (2003) for L2 cross-sectional studies, and Berman (2008) and Hunt (1965) as examples of L1 studies examining different age groups), and learners’ age may affect the relationship between writing quality and syntactic complexity, as Crowhurst’s (1980) study suggests. However, there is also evidence that the effects of task on syntactic complexity can possibly be the same across age groups; studies comparing syntactic complexity in argumentative and narrative essays have reported greater global syntactic complexity and amount of subordination in argumentative essays for different age groups (e.g., Lu, 2011; Beers & Nagy, 2009; Crowhurst, 1980; Crowhurst & Piche, 1979).

Acknowledgment

The essay data for this paper were collected for a project sponsored by the TOEFL Committee of Examiners and the TOEFL program at Educational Testing Service entitled “Validation of Automated Scoring of TOEFL iBT Tasks Against Non-Test Indicators of Writing Ability.”

Appendix A

Essay samples

Annotation symbols:
underlined: subordinate clause or element
italicized: complex noun phrases
bold: coordinate phrases
italicized and bold: coordinate phrases as modifiers of nouns, or complex noun phrases in a coordinate noun phrase

Appearance topic

Score: 3.5

First, you are so easily judged by your appearance when you first time meeting with people. It is right. You and a stranger met each other. You know nothing about him/her, and he/she completely has no clue about you. The appearance and fashion somehow at this time become a proxy for your personal characteristics. You could be labeled as “neat”, “cute”, “sharp” or “nasty” by your appearance. Job interview is good example for such case. You want to look as smart and professional, then wear the business suit.

Score: 5

The world has gone through various changes in the recent years, among which the way people dress and appear in public. The advances in technology have contributed to the development of new fibers and textile material that have helped people find the best attire for the best situation. This has created a huge dependence on appearance in various societies—not to say all. Fashion is, indeed, a big and industry nowadays, satisfying the needs for people to feel better about themselves and to please others around them in society.

Future topic

Score: 3.5

Planning is something we should all learn to do when little. By learning how to do it early in life, it becomes a habit, that will ensure success in other activities. There are a lot of factors that come in to play in when planning something, one of those factors are the unknown. I believe that by learning to take the unknown into consideration you are able to react better to changes in life.

Score: 5

I believe that these aspirations and careful planning will guide young people in other steps they might take. For example, a young boy who desires to be a medical doctor will know for sure that college education is undebatable. Keeping this goal in mind will also prevent such youngster from non-normative or delinquent behaviors that could hinder him from achieving his goals. In other words, aspirations and goals for the future promote academic resilience in young people.

References

Bardovi-Harlig, K., & Bofman, T. (1989). Attainment of syntactic and morphological accuracy by advanced language learners. Studies in Second Language Acquisition, 11, 17–34.
Beers, S. F., & Nagy, W. E. (2009). Syntactic complexity as a predictor of adolescent writing quality: Which measures? Which genre? Reading and Writing, 22, 185–200.
Berman, R. A. (2008). The psycholinguistics of developing text construction. Journal of Child Language, 35, 735–771.
Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45, 5–35.
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 21–46). Amsterdam: John Benjamins.
Byrnes, H., Maxim, H. H., & Norris, J. M. (Eds.). (2010). Realizing advanced foreign language writing development in collegiate education: Curricular design, pedagogy, assessment [Special Issue]. The Modern Language Journal, 94(S1), i–iv, 1–235.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of German. The Journal of Educational Research, 69, 176–183.
Cristofaro, S. (2003). Subordination. Oxford: Oxford University Press.
Crossley, S. A., Weston, J. L., McLain Sullivan, S. T., & McNamara, D. S. (2011). The development of writing proficiency as a function of grade level: A linguistic analysis. Written Communication, 28, 282–311.
Crowhurst, M. (1980). Syntactic complexity in narration and argument at three grade levels. Canadian Journal of Education, 5, 6–13.
Crowhurst, M. (1983). Syntactic complexity and writing quality: A review. Canadian Journal of Education, 8, 1–16.
Crowhurst, M., & Piche, G. L. (1979). Audience and mode of discourse effects on syntactic complexity in writing at two grade levels. Research in the Teaching of English, 13, 101–109.
Cumming, A. (1989). Writing expertise and second-language proficiency. Language Learning, 39, 81–135.
Diessel, H. (2004). The acquisition of complex sentences. Cambridge, England: Cambridge University Press.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, UK: Oxford University Press.
ETS. (2008). iBT/Next Generation TOEFL Test Independent Writing Rubrics (Scoring Standards). Retrieved from www.ets.org/Media/Tests/TOEFL/pdf/Writing_Rubrics.pdf
Flahive, D., & Snow, B. (1980). Measures of syntactic complexity in evaluating ESL compositions. In J. W. Oller, Jr. & K. Perkins (Eds.), Research in language testing (pp. 171–176). Rowley, MA: Newbury House.
Gentile, C., Riazantseva, A., & Cline, F. (2002). A comparison of handwritten and word-processed TOEFL essays: Final report. Internal document. ETS.
Givón, T. (1985). Function, structure, and language acquisition. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition (Vol. 1, pp. 1008–1025). Hillsdale, NJ: Lawrence Erlbaum.
Givón, T. (2009). The genesis of syntactic complexity: Diachrony, ontogeny, neuro-cognition, evolution. Amsterdam: John Benjamins.
Givón, T., & Shibatani, M. (Eds.). (2009). Syntactic complexity: Diachrony, acquisition, neuro-cognition, evolution. Amsterdam: John Benjamins.
Halliday, M. A. K., & Matthiessen, C. (2004). An introduction to functional grammar (3rd ed.). London: Arnold.
Hillocks, G. (1986). Research on written composition: New directions for teaching. Urbana, IL: ERIC Clearinghouse on Reading and Communication Skills and the National Conference on Research in English.
Homburg, T. J. (1984). Holistic evaluation of ESL compositions: Can it be validated objectively? TESOL Quarterly, 18, 87–107.
Huberty, C. J. (1989). Problems with stepwise methods: Better alternatives. In B. Thompson (Ed.), Advances in social science methodology (Vol. 1, pp. 43–70). Greenwich, CT: JAI Press.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. Champaign, IL: National Council of Teachers of English.
Hurvich, C. M., & Tsai, C.-L. (1989). Regression and time series model selection in small samples. Biometrika, 76, 297–307.
Jacobs, H. L., Zinkgraf, S. A., Wormuth, D. R., Hartfiel, V. F., & Hughey, J. B. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House.
Kameen, P. T. (1979). Syntactic skill and ESL writing quality. In C. Yorio, K. Perkins, & J. Schachter (Eds.), On TESOL ’79: The learner in focus (pp. 343–364). Washington, DC: TESOL.
Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models (5th ed.). New York: McGraw-Hill.
Langacker, R. W. (2008). Cognitive grammar: A basic introduction. Oxford: Oxford University Press.
Larsen-Freeman, D., & Strom, V. (1977). The construction of a second language acquisition index of development. Language Learning, 27, 123–134.
Long, M. H., & Crookes, G. (1993). Units of analysis in syllabus design: The case for task. In G. Crookes & S. M. Gass (Eds.), Tasks in a pedagogical context: Integrating theory and practice (pp. 9–54). Clevedon, UK: Multilingual Matters.
Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15, 474–496.
Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers’ language development. TESOL Quarterly, 45, 36–62.
Myhill, D. (2008). Towards a linguistic model of sentence development in writing. Language and Education, 22, 271–288.
Myhill, D. (2009). Becoming a designer: Trajectories of linguistic development. In R. Beard, D. Myhill, J. Riley, & M. Nystrand (Eds.), The Sage handbook of writing development (pp. 402–414). London: Sage.
Nihalani, N. K. (1981). The quest for the L2 index of development. RELC Journal, 12, 50–56.

Nippold, M. A., Hesketh, L. J., Duthie, J. K., & Mansfield, T. C. (2005). Conversational versus expository discourse: A study of syntactic development in children, adolescents, and adults. Journal of Speech, Language, and Hearing Research, 48, 1048–1064.
Nippold, M. A., Mansfield, T. C., & Billow, J. L. (2007). Peer conflict explanations in children, adolescents, and adults: Examining the development of complex syntax. American Journal of Speech-Language Pathology, 16, 179–188.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30, 555–578.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492–518.
Perkins, K. (1980). Using objective methods of attained writing proficiency to discriminate among holistic evaluations. TESOL Quarterly, 14, 61–69.
Ravid, D. (2004). Emergence of linguistic complexity in written expository texts: Evidence from later language acquisition. In D. Ravid & H. Bat-Zeev Shyldkrot (Eds.), Perspectives on language and language development (pp. 337–355). Dordrecht: Kluwer.
Ravid, D., & Berman, R. A. (2010). Developing noun phrase complexity at school age: A text-embedded cross-linguistic analysis. First Language, 30, 3–26.
Robinson, P. (2001). Task complexity, cognitive resources and syllabus design: A triadic framework for examining task influences on SLA. In P. Robinson (Ed.), Cognition and second language instruction (pp. 287–318). Cambridge: Cambridge University Press.
Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential framework for second language task design. International Review of Applied Linguistics in Language Teaching, 43, 1–32.
Robinson, P. (2007). Criteria for classifying and sequencing pedagogic tasks. In M. P. García Mayo (Ed.), Investigating tasks in formal language learning (pp. 7–26). Clevedon, UK: Multilingual Matters.
Robinson, P. (2011). Second language task complexity, the cognition hypothesis, language learning, and performance. In P. Robinson (Ed.), Second language task complexity: Researching the cognition hypothesis of language learning and performance (pp. 3–38). Amsterdam: John Benjamins.
San Jose, C. P. M. (1972). Grammatical structures in four modes of writing at fourth-grade level (Unpublished doctoral dissertation). Syracuse, NY: Syracuse University.
Spaan, M. (1993). The effect of prompt on essay examinations. In D. Douglas & C. Chapelle (Eds.), A new decade of language testing research (pp. 98–122). Alexandria, VA: TESOL.
Shaw, S. D., & Weir, C. J. (2007). Examining writing: Research and practice in assessing second language writing. Studies in Language Testing 26, Cambridge: Cambridge University Press.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2014). The context for researching a processing perspective on task performance. In P. Skehan (Ed.), Processing perspectives on task performance (pp. 1–26). Amsterdam: John Benjamins.
Stevens, J. (2009). Applied multivariate statistics for the social sciences (5th ed.). Mahwah, NJ: Lawrence Erlbaum.
Tedick, D. J. (1990). ESL writing assessment: Subject-matter knowledge and its impact on performance. English for Specific Purposes, 9, 123–143.
Weigle, S. C. (2002). Assessing writing. Cambridge: Cambridge University Press.
Weigle, S. C. (2011). Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability. TOEFL iBT Research Report (TOEFL iBT-15). Princeton, NJ: Educational Testing Service.
Wolfe-Quintero, K., Inagaki, S., & Kim, H.-Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. Honolulu, HI: University of Hawai’i, Second Language Teaching and Curriculum Center.
Yang, W., & Weigle, S. C. (2011). Lexical richness of ESL writing and the role of prompt. Paper presented at the 10th Conference for the American Association for Corpus Linguistics (AACL).

Weiwei Yang is Associate Professor of English at Nanjing University of Aeronautics and Astronautics. Her research interests include cognition and discourse, discourse analysis, second language literacy development and assessment, and second language teaching and learning. She holds a PhD in Applied Linguistics from Georgia State University.

Xiaofei Lu is Gil Watz Early Career Professor in Language and Linguistics and Associate Professor of Applied Linguistics and Asian Studies at The Pennsylvania State University. His research interests are primarily in computational linguistics, corpus linguistics, and intelligent computer-assisted language learning. He is the author of Computational Methods for Corpus Annotation and Analysis (2014, Springer).

Sara Cushing Weigle is Professor of Applied Linguistics at Georgia State University. She has conducted research in the areas of assessment, second language writing, and teacher education and is the author of Assessing Writing (2002, Cambridge University Press). Her most recent research has focused on the validity of automated scoring of ESL writing and the use of integrated tasks in writing assessment.