Journal of Second Language Writing 44 (2019) 51–62
Syntactic complexity and writing quality in assessed first-year L2 writing
J. Elliott Casal a,⁎, Joseph J. Lee b
a Department of Applied Linguistics, The Pennsylvania State University, 304 Sparks Building, University Park, PA 16802, USA
b ELIP Academic & Global Communication Program, Department of Linguistics, Ohio University, 383 Gordy Hall, Athens, OH 45701, USA
ARTICLE INFO
Keywords: First-year writing; Second language writing; Academic writing; Syntactic complexity; Corpus linguistics; Writing quality
ABSTRACT
This study explores the relationship between syntactic complexity and writing quality in assessed source-based research papers produced by ESL undergraduate writers in a first-year writing course through a combination of holistic and fine-grained measures of complexity. The analysis is based on a corpus of 280 student papers across three grade tiers: high, mid, and low. A one-way MANOVA was used to explore the statistical significance of differences in five commonly used syntactic complexity measures (assessed using Lu’s Second Language Syntactic Complexity Analyzer, 2010) across these grade tiers. Results reveal little variation in clausal subordination and coordination, but significantly lower complex nominal densities, mean length of clauses (phrasal measures), and mean length of T-units (global measure) in low-rated papers. Analysis of complex nominal composition using Stanford Tregex, with differences assessed with a one-way MANOVA, shows that the highest densities of complex nominal types are present in high-rated papers, with statistical significance in adjectival pre-, prepositional post-, and participle modification, and the lowest densities in low-rated papers. While clausal complexity did not demonstrate a relationship with assessed quality, both global and phrasal complexity features appear to be important components. We conclude with implications for syntactic complexity research and ESL composition pedagogy.
1. Introduction
Over the past several decades, research on syntactic complexity has featured prominently in the second language (L2) writing literature, with a notable proliferation of such studies over the last ten years. While this domain of research has a long tradition of utilizing various measures of syntactic complexity as potential benchmarks of language proficiency or development (e.g., Ansarifar, Shahriari, & Pishghadam, 2018; Bulté & Housen, 2014; Crossley & McNamara, 2014; Mazgutova & Kormos, 2015; Vyatkina, Hirschmann, & Golcher, 2015), recent studies have also considered the impacts of factors such as topic (e.g., Yang, Lu, & Weigle, 2015), genre (e.g., Lu, 2011; Staples & Reppen, 2016), and first language (e.g., Lu & Ai, 2015) on the production of complexity in L2 writing. From the perspective of the Complexity, Accuracy, and Fluency (CAF) framework, these studies have contributed to second language acquisition theory by conceptualizing the evolving complexity of syntactic dimensions of language. They have also expanded our understanding of learner choices and practices, with important implications for the learning, production, and assessment of L2 writing. Although this line of research has deepened our understanding of the language produced by L2 writers in English and the various
⁎ Corresponding author. E-mail address: [email protected] (J.E. Casal).
https://doi.org/10.1016/j.jslw.2019.03.005 Received 20 September 2018; Received in revised form 25 March 2019; Accepted 25 March 2019 1060-3743/ © 2019 Elsevier Inc. All rights reserved.
affecting variables, Ortega (2015) emphasizes that we do not yet understand the extent to which syntactic development leads to better writing in the eyes of gatekeepers and evaluators. Laying out research implications in the Journal of Second Language Writing 2015 special issue on L2 writing complexity, Ortega posits that “if syntactic complexity grows as writers become increasingly more capable of using the additional language with linguistic maturity, so will they also write with more communicative and rhetorical flexibility” (p. 87). Yet, while studies into the relationship between syntactic complexity and L2 writing quality exist (e.g., Biber, Gray, & Staples, 2016; Bulté & Housen, 2014; Crossley & McNamara, 2014; Taguchi, Crawford, & Wetzel, 2013; Yang et al., 2015), as do studies which take features of syntactic complexity into account in broader profiles of high-rated writing (e.g., Friginal & Weigle, 2014; Jarvis, Grant, Bikowski, & Ferris, 2003), extremely limited scholarship has been dedicated to the relationship between syntactic complexity and assessed writing quality in L2 student writing in the context of first-year writing (FYW) courses (Staples & Reppen, 2016, is a notable exception). This gap is particularly striking because, as with many issues related to academic writing, FYW courses represent a crucial intersection between general English language development and the ability to engage in academic writing practices and genres. While research suggests that characteristics of general L2 English proficiency and academic writing proficiency differ greatly (Biber & Gray, 2010; Biber, Gray, & Poonpon, 2011), little is known about the syntactic complexity of source-based academic writing produced by L2 student writers in FYW courses or how syntactic complexity relates to assessed writing quality in this context.
Overall, although some studies have found measures related to syntactically complex phrases (e.g., mean length of clause, complex nominals per clause) to be useful descriptors or predictors of quality (e.g., Biber et al., 2016; Staples & Reppen, 2016; Yang et al., 2015), Crossley and McNamara (2014) found no relationship between phrasal complexity features and ratings. They conclude that “the syntactic features that develop in L2 learners are not the same syntactic features that will assist them in receiving higher evaluations for essay quality” (p. 75). To address the limited research in this area, this study explores the extent to which five holistic measures of syntactic complexity (mean length of T-unit, mean length of clause, number of T-units per sentence, number of clauses per T-unit, and number of complex nominals per clause) relate to writing quality (operationalized as instructor-assigned grades) in L2 FYW students’ source-based writing. Given the importance of phrase-level syntactic complexity in academic writing, a follow-up analysis of how five specific types of complex nominals relate to writing quality is presented as well. This study, thus, draws on both holistic and fine-grained measures of phrasal syntactic complexity.
1.1. Syntactic complexity as a multidimensional construct
Syntactic complexity studies are increasingly adopting the view that syntactic complexity should be operationalized as a “multidimensional construct” (Norris & Ortega, 2009), encompassing global (e.g., mean length of T-unit), clausal (e.g., subordinated or coordinated clauses per T-unit), and phrasal (e.g., mean length of clause, complex nominals per clause) sub-constructs (e.g., Lu, 2011, 2017; Mancilla, Polat, & Akcay, 2017; Norris & Ortega, 2009). This extension in how complexity is conceptualized highlights syntactically complex structures as a range of linguistic features with the potential to construct various meanings.
It also reinforces a definition of syntactic complexity as the use of both “sophisticated” and “varied” (Lu, 2010) structures which enables “expansion of the capacity to use the additional language in ever more mature and skillful ways, tapping the full range of linguistic resources offered by the given grammar in order to fulfill various communicative goals successfully” (Ortega, 2015, p. 82). Thus, a developed capacity to produce syntactically complex structures should afford L2 writers greater agency in accomplishing their communicative and rhetorical goals. While clausal and global dimensions of syntactic complexity have a long history in applied linguistics research, Biber and Gray’s (2010) counter to the “stereotypes that academic writing is elaborated and explicit” (p. 3) has played a role in advancing the view that phrasal complexity should also be considered. By demonstrating through analysis of corpora that subordination of finite clauses, long assumed to be prominent in academic writing, is more strongly associated with speech, they called the use of clausal complexity measures into question. Rather, they argue that academic writing is characterized by its “reliance on phrasal structures, especially complex phrases with phrasal modifiers” (Biber, Gray, & Poonpon, 2013, p. 192). This set of structures which did not feature prominently in many syntactic complexity approaches previously is now seen to have “stronger discriminative power” (Lu, 2011, p. 57) at certain proficiency levels and in certain contexts. When considering L2 academic writing, such prominent features of academic writing are relevant measures. In a “weaker” conclusion to their findings, Biber et al. (2011) maintain that “complexity is not a single unified construct” (p. 29) which can be measured by only one means. They, thus, advocate for the inclusion of phrasal measures in a broader conceptualization of syntactic complexity. In their “stronger” version, however, Biber et al. 
propose a developmental progression for both first language (L1) and L2 writers. This development begins with speech (and a reliance on finite dependent clause manipulation) and progresses through five stages of increased phrasal complexity, particularly those involved in noun modification, which are produced in “only the more specialized circumstances of formal writing” (p. 29). Their proposed developmental progression has attracted both criticism (see Yang, 2013) and empirical testing (e.g., Ansarifar et al., 2018; Parkinson & Musgrave, 2014). With regard to the view of syntactic complexity as a multidimensional construct, Biber et al.’s (2011) weak view allows for phrasal complexity to be a sub-construct of syntactic complexity alongside more traditional global and clausal complexity sub-constructs, while the strong view suggests that some of the more common linguistic features targeted by global and clausal measures are “obviously not difficult” (p. 29) to produce and, therefore, somewhat less complex. As discussed below, we believe that the developmental progression proposed by Biber and colleagues represents a useful means of hypothesizing the syntactic development of maturing academic writers in English, but that the global and clausal measures, left out of
the strong view, may capture valuable early stages of development or communicative choices in particular contexts. As such, the present study employs a multidimensional set of complexity measures reflecting both Norris and Ortega’s (2009) hypothesized syntactic development progression (based theoretically on Halliday & Matthiessen, 1999) that learners rely syntactically on coordinate clauses, then subordinate clauses, then phrasal elaboration, and Biber et al.’s (2011) developmental progression from finite dependent clauses to complex noun phrases. In the analysis of syntactic complexity in L2 FYW students’ writing, many of the variables relevant to syntactic complexity in L2 English writing are important. L2 student writers in this context represent distinct L1 backgrounds, are asked to engage in a wide variety of academic and pedagogical genres on a range of topics, and are not always past the developmental stages where global and clausal measures of complexity are expected to be relevant. Biber et al.’s developmental progression places the target of many advanced L2 English writers, the production of high-level (and often high-stakes) academic English texts, in focus. At the same time, it is important to consider that the domains of complexity often associated with speech (i.e., global and clausal) are likely to represent important components of student choices or evaluator expectations.
1.2. Global and clausal measures of syntactic complexity in L2 writing research
While opinions vary, some arguments and evidence support the view that clausal and global measures of syntactic complexity can be useful metrics in analyzing pre-advanced L2 writing. Norris and Ortega (2009) explain that measures of complexity that contain “a potentially multi-clausal production unit” (p. 561) must be interpreted as global because they can be increased in a large variety of distinct ways (e.g., increased number or length of clauses, longer noun phrases).
In this regard, mean length of T-unit (MLT) has been the most widely used metric (Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998). In Ortega’s (2003) research synthesis, she found that MLT could reliably indicate proficiency differences for L2 English writers, and significant changes in MLT have been reported even over the span of a single English for academic purposes (EAP) course (Bulté & Housen, 2014) or five weeks of in-class writing (Stockwell & Harrington, 2003) without explicit instruction. In some cases, MLT has been shown to have a relationship with writing quality assessments, as Bulté and Housen (2014) found strong correlations, and Yang et al. (2015) reported MLT as a “consistent” predictor of quality. However, MLT and other T-unit based measures have drawn criticism, particularly because the nature of the changes (e.g., the product of linguistic development or rhetorical decisions) cannot be recovered from the global-level holistic measure (Biber et al., 2013). Norris and Ortega (2009) also recommend clausal-level complexity measures targeting subordination, which they argue is syntactically significant during intermediate English proficiency, and clausal coordination measures for analyzing the writing of beginning stage proficiency (based on Bardovi-Harlig, 1992). Subordination has been measured in a wide variety of ways, with some utilizing holistic ratios, such as clauses per T-unit (e.g., Bulté & Housen, 2014; Lu, 2011), and others using more narrowly defined features such as adverbial subordinate clauses (e.g., Staples & Reppen, 2016; Taguchi et al., 2013). Norris and Ortega (2009) reported that clausal coordination had been rarely used in syntactic complexity research almost a decade ago, and it still does not feature prominently. 
While some scholars question the usefulness of clausal subordination or coordination as measures of syntactic complexity, Ortega (2003) found evidence that clausal subordination measures can be used to investigate differences in intermediate L2 writers, and Grant and Ginther (2000) summarize that, “by and large, writers use more subordination as they become more proficient” (p. 139). In terms of coordination, Crossley and McNamara (2014) report significant changes in coordinated clauses over the duration of a course. Some studies, however, document little relationship between proficiency and clausal subordination (e.g., Bulté & Housen, 2014; Yoon, 2017); Lu (2011) found little ability to discriminate between course-based proficiency levels with either clausal subordination (particularly clauses per T-unit) or clausal coordination (T-units per sentence); and the strong associations of subordination with speech, rather than writing, are well-documented (Biber & Gray, 2010; Biber et al., 2011). Somewhat surprising, however, is Crossley and McNamara’s (2014) finding that dependent clause rates best predicted raters’ evaluations of writing quality in an academic writing class. So, while results of how clausal measures, particularly clausal subordination measures, reflect proficiency development are mixed, some evidence indicates that clausal measures overall play a role in judgments of quality. Crossley and McNamara offer the descriptive nature of the task as a possible explanation for this preference, but this nonetheless underscores the rhetorical impact that particular types of complexity have, further supporting the inclusion of “a sufficiently wide range of judiciously chosen complexity measures” (Bulté & Housen, 2014, p. 56).
In the same study, Crossley and McNamara (2014) note that many learners appear to produce fewer embedded clauses in favor of more complex noun phrases, highlighting not only that measures of syntactic complexity operate on distinct dimensions, but also that they are likely to interface in complex ways as learners develop linguistic proficiency and genre appropriate ways of constructing meaning.
1.3. Phrasal dimensions of syntactic complexity in L2 writing research
The importance of the phrasal dimension of syntactic complexity is common to both Biber et al.’s (2011) and Norris and Ortega’s (2009) hypothesized developmental trajectories and is consistent across variables of interest in the literature, both for holistic and narrow measures. Common holistic measures of phrasal complexity are complex nominals per clause and mean length of clause, which have both been shown to vary by genre (e.g., Lu, 2011) and topic (e.g., Yang et al., 2015), to discriminate between proficiency levels (Lu & Ai, 2015), to change over time (e.g., Bulté & Housen, 2014), and to be strong predictors of writing quality assessments (e.g., Bulté & Housen, 2014; Yang et al., 2015). In this paper, we adopt Lu’s (2010) definition of a complex nominal, based on Cooper (1976), which refers to a noun modified by an attributive adjective, possessive noun, post-preposition, relative clause, participle, or appositive; a noun clause; or gerund and infinitival subjects (see Lu, 2010, p. 483, for further explanation). Therefore, the complex nominal approach to phrasal complexity emphasizes noun phrase complexity.
Noun pre-modifiers, participle post-modifiers, attributive adjectives, and prepositional post-modifiers are the most commonly measured structures, and they play important roles in the findings of two studies comparing the syntactic complexity of graduate students and disciplinary “expert” writers. Ansarifar et al. (2018) found that L2 MA abstract writers used significantly fewer noun, participle, adjective, and prepositional noun-modifiers than both PhD and expert writers and significantly fewer of the other types of complex nominal types. The PhD writers were similar to the disciplinary experts, differing only in their use of prepositional postmodifiers. Similarly, Parkinson and Musgrave (2014) found that novice graduate writers used significantly more attributive adjectives (an early acquired form in Biber et al.’s developmental progression) than experienced graduate writers, while the more experienced group used a wider variety of complex nominals more frequently. In an FYW context, Staples and Reppen (2016) document a complex interplay between L1, genre, and language ratings for both attributive adjectives and pre-modifying nouns. Just as Ortega (2003) and Wolfe-Quintero et al. (1998) reported in their synthesis and meta-analyses, many factors regarding these studies make them difficult to compare. Nevertheless, a cohesive picture of the significant role that phrase-level complexity plays both in academic writing and in L2 writing development emerges, and these two domains converge in FYW. Yet, while L2 writers in such courses have recently received increasing scholarly attention overall (e.g., Eckstein & Ferris, 2017; Lee, Hitchcock, & Casal, 2018; Staples & Reppen, 2016), we know very little about how complexity factors into assessment of student writing in these contexts. 
The particular affordances of complex structures, as Ryshina-Pankova (2015) points out, are little understood, but the importance of syntactic complexity as a multidimensional construct in academic writing and L2 writing development raises crucial questions regarding the role syntactic complexity plays in the assessment of L2 students’ writing in FYW.
1.4. The present study
This study, therefore, explores the relationship between the use of prominent syntactic features and quality of assessed L2 FYW source-based research papers across three grade tiers: high, mid, and low. These syntactic features include five holistic measures across three dimensions discussed above: mean length of T-unit as a global measure; clauses per T-unit and T-units per sentence as clausal measures of subordination and coordination, respectively; and mean length of clause and complex nominals per clause as phrasal measures. Due to the importance of phrasal complexity in recent research on academic writing and the differences in phrasal complexity observed in the present study, complex nominal composition is further examined across the three grade tiers through the number of noun phrases containing adjective pre-modifiers, prepositional post-modifiers, participle modifiers, possessive noun modifiers, and relative clauses, normalized to occurrence per clause. Through these comparisons and the use of both holistic and fine-grained measures of syntactic complexity, the study aims to deepen our understanding of how the syntactic choices of L2 academic writers in FYW courses relate to assessed writing quality. Thus, our study is motivated by the following primary research questions:
1. To what extent do high-, mid-, and low-rated research papers written by second language undergraduate students in a first-year writing course differ in their global, clausal, and phrasal syntactic complexity?
2. To what extent do high-, mid-, and low-rated research papers written by second language undergraduate students in a first-year writing course differ in the normalized frequency of five specific types of noun-modifier based phrasal complexity measures?
2. Corpus and methodology
2.1. Corpus description
The study’s data consist of a corpus of 280 assessed source-based research papers written by L2 university students in a US-based first-year writing (FYW) course. The papers are a subset of the Corpus of Ohio Learner and Teacher English (COLTE), a large corpus of assessed English as a second language (ESL) students’ writing, ranging in grades from A to F, and teachers’ written feedback at Ohio University. The 280 texts selected for this study come from multiple sections of the second of two courses in the FYW sequence, designed specifically for international and multilingual undergraduate students and taught exclusively by L2 writing specialists with at least an MA in TESOL/applied linguistics. To enroll in the course, students must have obtained a TOEFL iBT writing section score of 24 or higher, scored a 6/6 on the institutional intensive English program’s (IEP) composition test, and/or completed the first FYW course with a grade of C or higher; thus, the present study’s student writers have already demonstrated relatively high levels of English language proficiency. As can be seen in Table 1 below, participants’ prior English language learning experiences varied considerably in terms of duration of study, time in the US, and years in a university IEP. The English language and academic writing instruction learners received is likely a variable as well, depending on a variety of factors including perhaps participation in IEP bridge programs.
The standardized curriculum is designed to develop students’ higher-level academic writing abilities to succeed in disciplinary courses: compose effective papers for different audiences and purposes; analyze audience and purposes related to various academic genres; engage in secondary research; integrate sources through paraphrasing, summarizing, and quoting, following APA style; use appropriate academic style; and self-edit for grammatical accuracy. Successful completion of this course fulfills institutional FYW requirements for graduation. The source-based research paper assignments analyzed in this study required students to select and research specific issues related to broader course themes (e.g., human migration, refugee crisis, natural disasters, online learning, public health). The dataset includes texts submitted for both a shorter, four-page assignment and a longer, eight-page assignment, and in some cases student writers contributed samples of both short and long papers to the corpus. Both writing tasks were oriented around analysis and
Table 1 Demographic information of student authors of each text.
                                          Mean     SD       Range
Age                                       21.49    1.79     18–27
Years of English study in home country    8.49     3.51     1–22
Months of US residence                    26.43    12.78    0.25–84
Terms in IEP                              3.22     2.07     0–9
Sex    125 Female; 155 Male
L1     193 Mandarin; 49 Arabic; 38 Other L1s (Armenian, Cantonese, French, Greek, Gujarati, Kazakh, Korean, Macedonian, Norwegian, Portuguese, Spanish, Thai, Twi)
Note: SD = standard deviation; complete information is not available for all students.
synthesis of source content and perspectives, and students were expected to utilize at least four academic sources in the shorter task and at least six academic sources in the longer task. Students developed research questions, conducted in-depth library research using primary and secondary sources, and explored, analyzed, and evaluated sources in writing their papers. For these assignments, students submitted an outline and two or three drafts, depending on the instructor. Course instructors graded and provided written feedback on each draft based on a standardized grading rubric used by all teachers of the FYW course. The rubric includes categories of content, organization, source use, and language use, and instructors participate in regular program-wide norming sessions led by the undergraduate composition coordinator to ensure that the use of the rubric is consistent. The 280 papers selected were the first graded draft of source-based secondary research paper assignments. The first graded draft was selected because it did not receive any written feedback from course instructors prior to submission. While the majority of the student writers from whom the papers were drawn are L1 Mandarin and Arabic speakers, the study’s corpus also includes texts written by students from 13 other L1 backgrounds. This distribution of language backgrounds is representative of the overall enrollment in this course sequence. Table 1 presents a summary of the student writers’ demographics. All of the papers were manually cleaned: the paper codes, titles, section headers, footers, figures, appendices, and bibliography were removed. After cleaning the papers, the total word count of the study’s corpus was 387,994 words. The corpus was divided into three grade tiers based on the grades assigned to the papers by course instructors: high (i.e., A papers), mid (i.e., B papers), and low (i.e., C papers) grade tiers. Table 2 presents descriptive statistics for the corpus used in this study.
Table 2 Description of the corpus.
Grade tier    N      Word Total    Word Mean    Word SD
High          88     117,614       1336.5       469.6
Mid           142    203,792       1435.2       531.7
Low           50     66,588        1331.8       481.8
Total         280    387,994       1385.7       504.9
Note: SD = standard deviation.
2.2. Syntactic complexity measures
The analysis consists of 10 total measures of syntactic complexity deemed significant based on previous research (see Tables 3 and 4). The first five measures are more holistic, ratio-oriented measures that represent the multiple dimensions recommended by Lu (2011) and Norris and Ortega (2009). These include a global measure (mean length of T-unit), two clausal measures (sentence coordination ratio and T-unit complexity ratio), and two phrasal measures (mean length of clause and complex nominal density). The global dimension of complexity includes “generic” measures that “index overall syntactic complexity” (Norris & Ortega, 2009, p. 561) through consideration of highly variable chunks. Complex nominals are considered an important component of advanced academic writing in English (and L2 English writing development), and the results reported below support this claim in our data. A follow-up analysis, thus, was conducted using several of the individual types of complex nominals that are collapsed into the holistic measure CN/C in the automated tool used in this analysis (see below). An analysis of clausal complexity using fine-grained, specific measures was not conducted, due to both previous findings regarding the role of clausal complexity in advanced academic writing and the limited variation found in the holistic analysis of clausal complexity. Of the complex nominal types included in the measure, adjectival pre-modified nouns, prepositional post-modified nouns, participle modified nouns, possessive noun modified nouns, and noun phrases with relative clauses beginning with that or wh- pronouns occurred frequently enough to warrant inclusion. These five measures were normalized to occurrence per clause to be compared across grade tiers. In Lu (2010), other complex nominal measures include appositives, nominal clauses, and gerunds/infinitives in
Table 3 Syntactic complexity measures by dimension.
Dimension                  Measure name                   Calculation                Label
Global                     Mean length of T-unit          Words/T-units              MLT
Clausal (Coordination)     Sentence coordination ratio    T-units/sentence           T/S
Clausal (Subordination)    T-unit complexity ratio        Clauses/T-unit             C/T
Phrasal                    Mean length of clause          Words/clause               MLC
Phrasal                    Complex nominal density        Complex nominals/clause    CN/C
Note: Measure names, calculations, and labels adopted from Lu (2010, pp. 478–479).
Table 4 Types and examples of nominal modifiers.
Complex nominal type    Example
Pre-Adjective           Detrimental devices; immediate response; hurtful words
Post-Preposition        The presence of technology; negative influence on young people
Participle              Studies tracking the effects; another man neglecting his friends
Possessive Noun         Armenia’s unemployment rate; a person’s academic direction
Relative Clause         Tests on animals where scientists were trying to determine
Note: All examples extracted from A papers in the corpus.
subject positions. These features were extracted initially, but, due to their infrequent occurrences across all groups (less than once per paper), they ultimately were left out of the analysis. Of these items, appositives are a feature that, according to Biber et al.’s (2011) developmental progression, emerges at high levels of academic proficiency, so the rarity of this feature in the corpus was not entirely unexpected.
2.3. Analytical tools
Text lengths were assessed using the macOS command line, and statistical procedures were conducted using SPSS 22. Syntactic complexity analysis was conducted using Lu’s (2010) L2 Syntactic Complexity Analyzer (L2SCA), a free Python-based automated text analyzer that can compute 14 indices of syntactic complexity. The L2SCA integrates the Stanford parser (Klein & Manning, 2002) to perform reliable part-of-speech tagging and sentence parsing according to the Penn Treebank guidelines (Marcus, Santorini, & Marcinkiewicz, 1993), as well as Stanford Tregex (Levy & Andrew, 2006) to identify and tally specified syntactic patterns (e.g., noun phrases dominating an adjective) with which to calculate the value for each index. For the indices included in the present study, Lu (2010) reports strong correlation with human raters in a random subset of a large corpus of college level English as a Foreign Language data from nine Chinese universities: MLT (0.987), MLC (0.932), T/S (0.919), C/T (0.961), and CN/C (0.867); and Yoon and Polio (2017) report reliability scores of 0.81 between the tool and 30 manually coded essays from their sample of advanced ESL writing, with the exception of T/S (0.74). For a more detailed description of the tool, the unused measures, and system validation, see Lu (2010). The particular complex nominal forms were extracted and tallied using the Stanford Tregex tool with the input commands provided in Lu’s L2SCA (p. 47).
Regarding the accuracy of the extraction procedures in this dataset, precision (an analysis of overproduction) and recall (an analysis of underproduction) were calculated for each of these individual forms. For precision, one of the researchers manually coded sentences that included 100 extracted items for each measure (randomization accomplished through a Python script) and calculated precision as the number of agreements (true positives) divided by the number of agreements plus the number of disagreements (true positives plus false positives; in this case 100). The following precision scores were obtained: adjective (0.93), post-preposition (0.92), participle (0.88), possessive noun (0.98), and relative clause (0.83). For recall, one of the researchers manually coded one page of randomly selected texts for each of the five features until 100 codes were assigned for each measure. These texts were then analyzed using Tregex and the procedures described above, and recall was calculated as the number of manually coded instances identified by the automated approach (true positives) divided by the total number of manually coded instances (true positives plus false negatives; in this case 100). The following recall scores were obtained: adjective (0.96), post-preposition (0.97), participle (0.95), possessive noun (0.94), and relative clause (0.91).

2.4. Analytical procedures and statistical analysis

Syntactic complexity measures were calculated using the L2SCA via the MacOS command line. The values for the five selected measures were then exported to Excel for calculation of descriptive statistics and to SPSS for statistical analysis. A one-way multivariate analysis of variance (MANOVA) was conducted with post-hoc Tukey tests, with grade tier as the independent variable and the five selected measures of syntactic complexity as the dependent variables, and with an alpha level of 0.05.
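The precision and recall calculations described for the Tregex extraction in Section 2.3 can be sketched as follows. The sketch assumes the standard definitions (precision over true positives plus false positives, recall over true positives plus false negatives); the function names are hypothetical, and the counts are those implied by the reported adjective scores with 100 items per check.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of automatically extracted items that the manual coder accepted."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of manually coded items that the automated extraction also found."""
    return true_pos / (true_pos + false_neg)


# Counts implied by the reported adjective scores (100 items checked each time).
adjective_precision = precision(true_pos=93, false_pos=7)
adjective_recall = recall(true_pos=96, false_neg=4)
```

Precision penalizes overproduction by the automated patterns, whereas recall penalizes underproduction, so the two checks draw on differently sampled sets of 100 items.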
As Lu’s (2010) L2SCA collapses several complex nominal structures into the holistic complex nominal measures, Stanford Tregex (Levy & Andrew, 2006) was used to analyze the occurrence of particular types of complex nominals individually across the grade
Table 5
Descriptive statistics of syntactic complexity measures across grade tiers.

Measure         Grade tier  N    Mean   SD    95% CI Lower  95% CI Upper
MLT (global)    High        88   19.98  3.10  19.32         20.64
MLT (global)    Mid         142  19.19  3.73  18.57         20.60
MLT (global)    Low         50   17.99  2.56  17.26         18.72
T/S (clausal)   High        88   1.14   0.28  1.12          1.17
T/S (clausal)   Mid         142  1.15   0.25  1.13          1.16
T/S (clausal)   Low         50   1.14   0.28  1.11          1.17
C/T (clausal)   High        88   1.85   0.11  1.79          1.90
C/T (clausal)   Mid         142  1.81   0.10  1.76          1.89
C/T (clausal)   Low         50   1.78   0.11  1.71          1.84
MLC (phrasal)   High        88   10.95  1.51  10.63         11.27
MLC (phrasal)   Mid         142  10.66  1.40  10.43         10.89
MLC (phrasal)   Low         50   10.11  1.00  9.82          10.39
CN/C (phrasal)  High        88   1.48   0.33  1.40          1.55
CN/C (phrasal)  Mid         142  1.41   0.28  1.36          1.52
CN/C (phrasal)  Low         50   1.32   0.26  1.12          1.25

Note: SD = standard deviation; CI = confidence interval; MLT = mean length of T-unit; T/S = sentence coordination ratio; C/T = T-unit complexity ratio; MLC = mean length of clause; CN/C = complex nominals per clause.
tiers. To ensure consistency, the same Tregex commands specified by Lu (2010) were used in this study. A batch file was created to run each of the five assessed complex nominal constructs for each of the 280 files sequentially. These values were then normalized by occurrence per clause per text, exported to Excel for calculation of descriptive statistics, and then exported to SPSS for statistical analysis via a one-way MANOVA with post-hoc Tukey tests, with grade tier as the independent variable and the five complex nominal constructs as dependent variables, and with an alpha level of 0.05.

3. Results

3.1. Research question 1

Our first research question asked how high-, mid-, and low-quality (according to instructor assessments) source-based research papers written by L2 English writers in a FYW course differed in terms of the five measures of syntactic complexity. The measures chosen spanned the three dimensions of syntactic complexity that Norris and Ortega (2009) proposed: global (MLT), clausal (T/S and C/T), and phrasal (MLC and CN/C). Table 5 presents the descriptive statistics for these measures. There is an observable pattern across four of the five measures (all but T/S): high-rated papers contain the highest levels of syntactic complexity; mid-rated papers contain somewhat lower complexity than high-rated papers and somewhat higher complexity than low-rated papers; and low-rated papers contain the lowest levels of complexity. What is noteworthy is that little overall variation is found for both clausal measures, T/S (clausal coordination) and C/T (clausal subordination), whereas more pronounced differences are found for the global measure (MLT) and for both phrasal measures (CN/C and MLC). Results of a one-way MANOVA revealed a significant difference in complexity by grade tier, although the effect was small: F (10, 546) = 2.12, p = 0.022; Wilks’ Lambda = 0.927; partial η² = 0.037. Pairwise post-hoc Tukey results are presented in Table 6.
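As a consistency check on the reported MANOVA statistics, the following Python sketch converts Wilks’ Lambda into partial η² and an F value using standard textbook formulas. It assumes, without confirmation from the source, that SPSS derived the reported values in this way; the function name and parameters are hypothetical. The F conversion shown is the special case that holds for three groups, which matches the three grade tiers and the 280 papers analyzed here.

```python
def wilks_effect_size(wilks_lambda: float, n_dvs: int, n_groups: int, n_cases: int):
    """Convert Wilks' Lambda from a one-way MANOVA into partial eta squared and F.

    partial eta^2 = 1 - Lambda**(1/s), with s = min(number of DVs, groups - 1).
    The F conversion below applies only to the three-group case (s = 2).
    """
    if n_groups != 3:
        raise ValueError("this sketch only covers the three-group case")
    s = min(n_dvs, n_groups - 1)
    root = wilks_lambda ** (1 / s)
    partial_eta_sq = 1 - root
    df1 = 2 * n_dvs                      # numerator degrees of freedom
    df2 = 2 * (n_cases - n_dvs - 2)      # denominator degrees of freedom
    f_value = ((1 - root) / root) * (df2 / df1)
    return partial_eta_sq, f_value, df1, df2


# Reported values: Lambda = 0.927 (five holistic measures),
# Lambda = 0.922 (five complex nominal types); 280 papers, 3 grade tiers.
eta_rq1, f_rq1, df1, df2 = wilks_effect_size(0.927, n_dvs=5, n_groups=3, n_cases=280)
eta_rq2, f_rq2, _, _ = wilks_effect_size(0.922, n_dvs=5, n_groups=3, n_cases=280)
```

Under these assumptions the sketch returns degrees of freedom of (10, 546) and partial η² values that round to the reported 0.037 and 0.04; the recomputed F values are close to those reported, with small differences attributable to rounding of Lambda.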
Table 7
Descriptive statistics of nominal modifier measures across grade tiers.

Complex nominal type  Grade tier  N    Mean  SD    95% CI Lower  95% CI Upper
Pre-Adjective         High        88   1.29  0.52  1.18          1.39
Pre-Adjective         Mid         142  1.14  0.39  1.07          1.20
Pre-Adjective         Low         50   1.03  0.33  0.93          1.12
Post-Preposition      High        88   0.98  0.41  0.90          1.07
Post-Preposition      Mid         142  0.86  0.36  0.79          0.91
Post-Preposition      Low         50   0.77  0.28  0.69          0.85
Participle            High        88   0.15  0.10  0.13          0.17
Participle            Mid         142  0.12  0.09  0.10          1.13
Participle            Low         50   0.11  0.06  0.09          0.13
Possessive Noun       High        88   0.08  0.07  0.06          0.09
Possessive Noun       Mid         142  0.10  0.08  0.08          0.11
Possessive Noun       Low         50   0.07  0.06  0.06          0.09
Relative Clause       High        88   0.42  0.18  0.38          0.46
Relative Clause       Mid         142  0.38  0.18  0.36          0.42
Relative Clause       Low         50   0.36  0.14  0.32          0.41

Note: All measures normalized per clause; SD = standard deviation; CI = confidence interval.
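The Cohen’s d values reported alongside the pairwise comparisons in Table 8 can be approximated from the means and standard deviations in Table 7. The sketch below assumes that d was computed with the root mean square of the two group standard deviations as the pooled SD; the source does not state the formula, so this is an inference, and the function name is hypothetical.

```python
import math


def cohens_d(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Cohen's d using the root mean square of the two SDs as the pooled SD."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Pre-Adjective (mean, SD) per grade tier, copied from Table 7.
pre_adjective = {"High": (1.29, 0.52), "Mid": (1.14, 0.39), "Low": (1.03, 0.33)}

d_high_mid = cohens_d(*pre_adjective["High"], *pre_adjective["Mid"])
d_high_low = cohens_d(*pre_adjective["High"], *pre_adjective["Low"])
d_mid_low = cohens_d(*pre_adjective["Mid"], *pre_adjective["Low"])
```

For the Pre-Adjective row, this yields values within rounding of the 0.326, 0.597, and 0.305 listed in Table 8, which suggests that the group sizes were not used to weight the pooled SD.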
Significant differences were found for global and phrasal measures, but not for either of the clausal measures. Concerning the global measure, the lower length of T-units in low-rated papers was statistically significant when compared to high-rated papers (p = 0.003). For phrasal measures, once more low-rated papers were constructed with significantly shorter clauses than both high- (p = 0.002) and mid-rated papers (p = 0.41), as well as significantly fewer complex nominals per clause than high-rated papers (p = 0.001). No significant differences were found between high- and mid-rated papers for any measure, in spite of modest observable differences in means for global and phrasal measures. In summary, the holistic measures of syntactic complexity at the global, clausal, and phrasal levels reveal that low-rated papers differ little from other papers in terms of clausal complexity but are significantly less complex at both global and phrasal levels. Low-rated papers are distinguished from other texts by having shorter T-units and clauses and lower densities of complex nominals, but are similar in levels of clausal coordination and subordination. This suggests that complexity, or the meanings afforded by complex structures, is at least a part of the equation in how students are evaluated as L2 academic writers.

3.2. Research question 2

Due to the importance placed on phrasal complexity in academic writing in recent scholarship (e.g., Biber & Gray, 2010; Biber et al., 2011) and findings that complex nominals are increasingly abundant in academic texts produced by more advanced-level L2 writers (e.g., Ansarifar et al., 2018; Parkinson & Musgrave, 2014), our second research question asked how five particular complex nominal structures normalized per clause varied across high-, mid-, and low-quality (according to instructor assessments) source-based research papers written by L2 English writers in a FYW course.
Descriptive statistics for each structure across each grade tier are presented in Table 7. As with the holistic measures, high-rated papers contain the highest number of four of the five types of complex nominals, and low-rated papers include the lowest number of all five types. Pre-modifying adjectives and post-modifying prepositions are notably abundant, with student writers producing both structures an average of approximately once per clause, particularly writers of mid- and high-rated papers. Results of a one-way MANOVA revealed a significant difference in complex nominal types by grade tier, although the effect was small: F (10, 546) = 2.26, p = 0.013; Wilks’ Lambda = 0.922; partial η² = 0.04. Pairwise post-hoc Tukey results are presented in Table 8. Significant differences were present in adjective-, preposition-, and participle-modified noun phrases, with a pattern similar to that of the holistic measures emerging: a progressive increase in complexity from low- to high-rated papers, with greater complexity in high-rated papers and less complexity in low-rated papers. However, rather than statistical measures separating low-rated papers from higher-rated papers in terms of syntactic complexity, in the analysis of particular complex nominal structures the high-rated papers are shown to contain statistically significantly more adjective-, preposition-, and participle-modified nominals than both mid- (p = 0.031; 0.015; and 0.018, respectively) and low-rated (p = 0.002; 0.003; and 0.019, respectively) papers, with no differences found in other comparisons for any measure. Thus, the evidence suggests that the use of particular complex nominal structures (adjective pre-modification, preposition post-modification, and participle modification) is a feature of high-rated L2 English academic writing in FYW compositions, at least for source-based research papers.

4. Discussion

Taken together, the results of this study highlight that higher levels of global and phrasal syntactic complexity (particularly in adjective-, participle-, and preposition-modified nominal phrases) are associated with higher instructor assessment of quality for L2 English writers of research papers in FYW courses, and that lower levels of global and phrasal complexity are associated with lower assessment in this context. At the same time, the results suggest that clausal complexity is not a distinguishing factor in assessed
Table 8
Results of pairwise post-hoc Tukey for nominal modifiers across grade tiers.

Measure           Pair      p value  Cohen’s d
Pre-Adjective     High-Mid  0.031*   0.326
Pre-Adjective     High-Low  0.002*   0.597
Pre-Adjective     Mid-Low   0.256    0.305
Post-Preposition  High-Mid  0.015*   0.311
Post-Preposition  High-Low  0.003*   0.598
Post-Preposition  Mid-Low   0.395    0.279
Participle        High-Mid  0.018*   0.315
Participle        High-Low  0.019*   0.485
Participle        Mid-Low   0.774    0.131
Possessive Noun   High-Mid  0.185    0.266
Possessive Noun   High-Low  0.852    0.153
Possessive Noun   Mid-Low   0.106    0.424
Relative Clause   High-Mid  0.353    0.222
Relative Clause   High-Low  0.177    0.372
Relative Clause   Mid-Low   0.714    0.124

* p < 0.05.
quality of such writing, but this does not necessarily suggest that clausal complexity is not important. A lack of differences does not indicate a lack of complexity, but rather that writers of texts which are assessed to be of higher or lower quality do not produce such structures more or less frequently.

Overall, these findings are in line with much of the previous scholarship on syntactic complexity in L2 English writing. The global complexity level differences detected between high- and low-rated texts in MLT align with both Bulté and Housen’s (2014) and Yang et al.’s (2015) findings that T-unit length is an effective predictor of writing quality, even though the L2 writers in Bulté and Housen’s (2014) study were placed into a lower-level EAP writing course than the students in the current study, and the L2 writers in Yang et al.’s (2015) study were graduate students. While Biber et al. (2013) quite accurately point out that measures such as MLT do not capture a particular identifiable type of complexity, the present study provides further evidence that MLT appears to be associated with higher writing quality. In terms of clausal measures, some studies have previously reported proficiency-based differences in subordination (e.g., Grant & Ginther, 2000; Ortega, 2003) and development over time in coordination (e.g., Crossley & McNamara, 2014), but correlations with quality were only reported for subordination by Crossley and McNamara in more narrative genres. Further, other studies have reported little proficiency-based variation in these clausal measures overall (e.g., Lu, 2011; Yoon, 2017) or found the development of clausal complexity to plateau at advanced levels of language proficiency (e.g., Byrnes, Maxim, & Norris, 2010). Thus, the absence of meaningful differences across groups and overall lack of variance for both measures in the present study is somewhat unsurprising, considering the advanced proficiency of these student writers.
Similarly, differences in assessed writing quality across grade tiers in phrasal complexity measures align with the findings of those who have explored the relationship between complex nominals and writing quality or writing development both by holistic measures including CN/C, MLC (e.g., Bulté & Housen, 2014; Lu, 2011; Yang et al., 2015), and specific forms of complex nominals, including adjective modifiers (Ansarifar et al., 2018; Parkinson & Musgrave, 2014; Staples & Reppen, 2016), preposition modifiers (Ansarifar et al., 2018), and participles (Ansarifar et al., 2018). When viewed through Biber et al.’s (2011) and Norris and Ortega’s (2009) proposed trajectories for syntactic development, the differences observed in phrasal complexity and lack of differences in clausal complexity suggest that the instructors valued learners’ production of syntactically complex structures as predicted. Norris and Ortega provide a generalized hypothesis for L2 syntactic development, based theoretically on systemic functional linguistics, that L2 linguistic expression progresses from coordination at lower levels to subordination at intermediate levels, and that “subordination’s role should subside at even higher levels of development in favor of greater use of phrasal-level complexification” (p. 563). While the nature of the data presented in this study does not lend itself to analysis of L2 writing development, it is noteworthy that phrasal complexity distinguished low-rated texts holistically and high-rated texts with specific complex phrasal structures in an L2 FYW course (advanced ESL), a stage where the emergence of these features is predicted. 
Further, while the claim that subordination is reduced in favor of phrasal complexity cannot be attested in this data, the general lack of variation between groups in both coordination and subordination at the clausal level can be cautiously understood to suggest that learners in this context may have “mastered” such structures, as previous research suggests (e.g., Byrnes et al., 2010). This claim, of course, would be better supported by analysis that included fine-grained measures of clausal complexity. While Norris and Ortega (2009) focus broadly on L2 development, and not exclusively on writing development, the developmental progression proposed by Biber et al. (2011) hypothesizes that L2 academic writers begin with a developed ability to produce syntactically complex structures associated with speech, including subordinated clauses, and then develop the capacity to produce a variety of phrasal complexity features characteristic of formal writing. It is, therefore, more oriented towards a description of how academic writing develops in L2 English writers. While we did not find that low-rated papers included more features strongly associated with speech, we found a relationship between three features predicted to emerge in the second and third stages of Biber et al.’s five stages (noun phrases with attributive adjectives, participle modifiers, and prepositional post modifiers) with high-rated
essays. This can possibly be explained by the instructors’ preference (conscious or unconscious) for these features, the advanced development of student writers obtaining A grades on their texts, or the communicative potential of these structures. Appositives, hypothesized to emerge in the fifth and final stage, were exceedingly rare in this corpus, which the developmental framework would predict. Overall, while the nature of the data does not provide clear evidence for either developmental hypothesis (which was never an aim of this study), neither does it contradict what would be expected for this population of L2 writers. However, we are hesitant to fully embrace either developmental hypothesis as fixed and neatly staged, avoiding any assumptions that language evolves along a fixed trajectory.

4.1. Implications for syntactic complexity research and L2 writing instruction

An important research-oriented implication emerges from this study: there are benefits to considering syntactic complexity as a multidimensional construct and carefully assembling a set of complexity features when addressing complexity-related research questions. Biber et al. (2013) rightly highlight the criticality of considering the “distinctive grammatical features of [academic] register - phrasal modifiers” (p. 195) when conducting syntactic complexity analysis in L2 writing research (though they do not advocate for exclusive exploration of phrasal complexity). These structures feature prominently in academic writing overall and were characteristic of high-rated papers in this study. However, not all writing that L2 English writers produce (or wish to produce) is academic, and there is little reason to believe that even most of it is. Some dimensions of complexity have been shown to be important for writing about particular topics (e.g., C/T in Yang et al., 2015) or constructing particular genres (e.g., several subordination and coordination measures in Lu, 2011).
Additionally, Crossley and McNamara (2014) report that raters scored essays with higher subordination ratios to be of higher quality, possibly because subordination was better suited to the writing tasks. L2 writers make syntactic decisions at the clausal and global levels too (e.g., deciding that something is just too long), and analysis of syntactic complexity, particularly when exploring variation across genre or correlations with quality, should be able to reflect these choices. These arguments lead to a methodological issue that helped shape this study from its early stages: both holistic and specific approaches to syntactic complexity assessment exist. Particularly in terms of complex nominals, the CN/C measure analyzed from Lu’s (2010) L2SCA combines several types of complex nominals that other scholars consider individually into a single measure. Importantly, CN/C has been shown in the present study as well as other studies to be a relevant measure of complexity in L2 writing. In this study, both holistic and specific structures provided insights into the differences among high-, mid-, and low-rated papers. CN/C revealed that low-rated papers have generally less complex noun phrases, while high-rated papers were found to be distinctive in their use of nominals with pre-adjective, post-prepositional, and participle modifiers. In this study, we utilized the L2SCA to extract the holistic complex nominal measures (MLC and CN/C), but also utilized the Stanford Tregex commands that Lu has built into his tool for extraction of individual features.

These arguments and the findings of this study highlight the importance of syntactic complexity in L2 English writing instruction, as evidence indicates that L2 writers are likely to benefit from both an increased repertoire of complex structures and increased awareness of the appropriateness and functional capacities of particular complex structures.
Of particular potential is an emphasis on noun phrases in academic writing pedagogy, as this study’s findings support the claim that noun phrase complexity is an important characteristic of English academic writing overall. We argue that such instruction could draw on both explicit instruction of the linguistic options available to learners in constructing noun phrases and discourse-based examination of noun phrases in authentic writing, thus targeting students’ ability to produce complex structures as well as raising their awareness of how and why a writer may choose to do so. Nevertheless, although considerable attention has been paid to the role of complexity in L2 writing and development, little attention has been paid to the instruction or learning of syntactically complex structures. Evidence suggests that syntactic complexity develops over time, as Bulté and Housen (2014), Crossley and McNamara (2014), and Mazgutova and Kormos (2015) found statistically significant development in a variety of syntactic complexity measures in as little as a single EAP course, despite these studies not including instruction in the production of syntactically complex structures. Thus, further research on the forms and benefits of including targeted instruction of complex structures in L2 writing may be welcome.

4.2. Limitations and future directions

While this study sought to pursue a multidimensional approach to syntactic complexity and utilize both holistic and specific measures of syntactic complexity in the analysis of L2 English writing in an FYW course, a few limitations exist. First, although all grade tiers were sufficiently well-represented, the three groups were unbalanced. Future research, therefore, could conduct similar comparisons with a more balanced set of groups using a larger corpus. Another aspect of the corpus design that could be controlled in future analyses is the inclusion of two writing tasks of differing length.
While the tasks were similarly focused on source-based research synthesis, they had different length requirements which may not be completely accounted for with the normalization procedures implemented here. Also, in terms of methodology, studies have shown that topic effect and L1 may impact the syntactic choices of L2 writers in English. Neither of these factors, however, was controlled in this study because of the nature of the corpus. Students wrote about a wide variety of topics but were predominantly L1 Mandarin speakers. Future research could make greater efforts to control these variables. Additionally, this study included both holistic and fine-grained measures of phrasal complexity but excluded fine-grained measures of clausal complexity. Although limited variation was found overall in clausal complexity, this does not preclude the possibility of variance existing for particular structures across grade tiers or for proficiency-based differences (which were not assessed) to exist. At the same time, the fine-grained features included for analysis of phrasal complexity are by no means
exhaustive. For the present study, we explored the specific noun modifiers included in Lu’s (2010) complex nominal measures individually as a follow-up analysis. Future studies that adopt both holistic and specific measures of phrasal complexity could expand the range of particular structures captured by both approaches. Finally, while precision and recall values were calculated and reported for the fine-grained features used in this study, accuracy information for Lu’s (2010) L2SCA was reported from other studies rather than calculated on the current dataset. This research process has also uncovered several questions that remain largely unexplored, many relating to the decision-making processes of L2 student writers and FYW instructors. Little is known regarding L2 writers’ motivations to select particular types of syntactically complex structures over others or their awareness of genre or register appropriateness. Similarly, little is known about instructors’ perceptions and awareness of the syntactic complexity of the writing they assess. Future research into the relationship between syntactic complexity and writing quality could take learner and teacher cognition into account, exploring the factors contributing to writing and rating decisions or awareness of the complexity and meaning of structures present in a text.

5. Conclusion

This study found that global and phrasal syntactic complexity differed, sometimes significantly, across high-, mid-, and low-rated L2 FYW research papers, with high-rated papers demonstrating the highest levels of complexity and low-rated papers demonstrating the lowest. Clausal complexity was shown to vary little across all grade tiers, both in terms of coordination and subordination.
Overall, these results underscore the importance of syntactic complexity, particularly nominal complexity, in producing successful (or in this case high-rated) academic writing, and highlight the pedagogical attention that should be paid to the production and meaning of such structures in FYW courses.

Acknowledgments

We would like to thank Xiaofei Lu, at The Pennsylvania State University, for his feedback and the editors and anonymous reviewers for their thoughtful and valuable comments on earlier versions of this paper.

References

Ansarifar, A., Shahriari, H., & Pishghadam, R. (2018). Phrasal complexity in academic writing: A comparison of abstracts written by graduate students and expert writers in applied linguistics. Journal of English for Academic Purposes, 31, 58–71.
Bardovi-Harlig, K. (1992). A second look at T-unit analysis: Reconsidering the sentence. TESOL Quarterly, 26, 390–395.
Biber, D., & Gray, B. (2010). Challenging stereotypes about academic writing: Complexity, elaboration, explicitness. Journal of English for Academic Purposes, 9(1), 2–20.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–35.
Biber, D., Gray, B., & Poonpon, K. (2013). Pay attention to the phrasal structures: Going beyond T-units—A response to WeiWei Yang. TESOL Quarterly, 47(1), 192–201.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across language exam task types and proficiency levels. Applied Linguistics, 37(5), 639–668.
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-term changes in L2 writing complexity. Journal of Second Language Writing, 26, 42–65.
Byrnes, H., Maxim, H. H., & Norris, J. M. (2010). Realizing advanced foreign language writing development in collegiate education: Curricular design, pedagogy, assessment [Monograph]. Modern Language Journal, 94(s1).
Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of German. The Journal of Educational Research, 69, 176–183.
Crossley, S. A., & McNamara, D. S. (2014). Does writing development equal writing quality? A computational investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66–79.
Eckstein, G., & Ferris, D. (2017). Comparing L1 and L2 texts and writers in first-year composition. TESOL Quarterly, 52(1), 137–162.
Friginal, E., & Weigle, S. (2014). Exploring multiple profiles of L2 writing using multi-dimensional analysis. Journal of Second Language Writing, 26, 80–95.
Grant, L., & Ginther, A. (2000). Using computer-tagged linguistic features to describe L2 writing differences. Journal of Second Language Writing, 9(2), 123–145.
Halliday, M. A. K., & Matthiessen, C. (1999). Construing experience through meaning: A language-based approach to cognition. London: Cassell.
Jarvis, S., Grant, L., Bikowski, D., & Ferris, D. (2003). Exploring multiple profiles of highly rated learner compositions. Journal of Second Language Writing, 12, 377–403.
Klein, D., & Manning, C. D. (2002). Fast exact inference with a factored model for natural language parsing. In S. Becker, S. Thrun, & K. Obermayer (Eds.). Proceedings of the 15th international conference on neural information processing systems (pp. 3–10). Cambridge, MA: MIT Press.
Lee, J. J., Hitchcock, C., & Casal, J. E. (2018). Citation practices of L2 university students in first-year writing: Form, function, and stance. Journal of English for Academic Purposes, 33, 1–11.
Levy, R., & Andrew, G. (2006). Tregex and Tsurgeon: Tools for querying and manipulating tree data structures. Proceedings of the fifth international conference on language resources and evaluation.
Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474–496.
Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers’ language development. TESOL Quarterly, 45(1), 36–62.
Lu, X. (2017). Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Language Testing, 34(4), 493–511.
Lu, X., & Ai, H. (2015). Syntactic complexity in college-level English writing: Differences among writers with diverse L1 backgrounds. Journal of Second Language Writing, 29, 16–27.
Mancilla, R. L., Polat, N., & Akcay, A. O. (2017). An investigation of native and nonnative English speakers’ levels of written syntactic complexity in asynchronous online discussions. Applied Linguistics, 38(1), 112–134.
Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
Mazgutova, D., & Kormos, J. (2015). Syntactic and lexical development in an intensive English for academic purposes programme. Journal of Second Language Writing, 29, 3–15.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30(4), 555–578.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492–518.
Ortega, L. (2015). Syntactic complexity in L2 writing: Progress and expansion. Journal of Second Language Writing, 29, 82–94.
Parkinson, J., & Musgrave, J. (2014). Development of noun phrase complexity in the writing of English for academic purposes students. Journal of English for Academic Purposes, 14, 48–59.
Ryshina-Pankova, M. (2015). A meaning-based approach to the study of complexity in L2 writing: The case of grammatical metaphor. Journal of Second Language Writing, 29, 51–63.
Staples, S., & Reppen, R. (2016). Understanding first-year L2 writing: A lexico-grammatical analysis across L1s, genres, and language ratings. Journal of Second Language Writing, 32, 17–35.
Stockwell, G., & Harrington, M. (2003). The incidental development of L2 proficiency in NS-NNS email interactions. CALICO Journal, 20, 337–359.
Taguchi, N., Crawford, W., & Wetzel, D. Z. (2013). What linguistic features are indicative of writing quality? A case of argumentative essays in a college composition program. TESOL Quarterly, 47(2), 420–430.
Vyatkina, N., Hirschmann, H., & Golcher, F. (2015). Syntactic modification at early stages of L2 German writing development: A longitudinal learner corpus study. Journal of Second Language Writing, 29, 28–50.
Wolfe-Quintero, K., Inagaki, K. S., & Kim, H. Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. Honolulu, HI: University of Hawaii Press.
Yang, W. (2013). Response to Biber, Gray, and Poonpon (2011). TESOL Quarterly, 47(1), 187–191.
Yang, W., Lu, X., & Weigle, S. C. (2015). Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality. Journal of Second Language Writing, 28, 53–67.
Yoon, H. (2017). Linguistic complexity in L2 writing revisited: Issues of topic, proficiency, and construct multidimensionality. System, 66, 130–141.
Yoon, H., & Polio, C. (2017). The linguistic development of students of English as a second language in two written genres. TESOL Quarterly, 51(2), 275–301.

J. Elliott Casal is a PhD candidate in Applied Linguistics at the Pennsylvania State University. His research and teaching interests include academic and professional writing practices, corpus linguistics, and genre studies. His publications have appeared in Journal of English for Academic Purposes, Language Learning and Technology, and System.

Joseph J. Lee, PhD, is an associate lecturer in the Department of Linguistics and Assistant Director of the ELIP Academic & Global Communication Program at Ohio University. His research and teaching interests include ESP/EAP, genre studies, classroom discourse, advanced academic literacy, applied corpus linguistics, and teacher education. His publications have appeared in English for Specific Purposes, Journal of English for Academic Purposes, Journal of Second Language Writing, and System, among others. His recent book is Exploring Spoken English Learner Language Using Corpora: Learner Talk (2017, Palgrave Macmillan), co-authored with Eric Friginal, Brittany Polat, and Audrey Roberson.