Journal Pre-proof
The use of English prepositions in lexical bundles in essays written by Korean university students
Young Eun Lee, Isaiah WonHo Yoo, Yu Kyoung Shin
PII: S1475-1585(19)30423-0
DOI: https://doi.org/10.1016/j.jeap.2020.100848
Reference: JEAP 100848
To appear in: Journal of English for Academic Purposes
Received Date: 10 August 2019
Accepted Date: 29 January 2020
Please cite this article as: Young Eun Lee, Isaiah WonHo Yoo, Yu Kyoung Shin, The use of English prepositions in lexical bundles in essays written by Korean university students, Journal of English for Academic Purposes (2020), https://doi.org/10.1016/j.jeap.2020.100848
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.
The use of English prepositions in lexical bundles in essays written by Korean university students
Young Eun Lee (first author), Isaiah WonHo Yoo (second author), Yu Kyoung Shin (corresponding author)
*This research was supported by the Hallym University Research Fund (HRF-201809-010).
The purpose of this study is to explore how Korean university undergraduate students use English prepositions embedded in frequently occurring multiword sequences, or lexical bundles, in their essays. Most prior research on prepositions has centered on prepositional phrases (PPs), including idiomatic expressions, identified on the basis of their structure and function. This study investigates prepositions in lexical bundles, which are identified solely on the basis of frequency in context and are therefore generally incomplete in structure. The findings show patterns of preposition use by English learners that differ from the accumulated findings on this topic in the literature. The study first identifies the types of PP-based bundles in a learner corpus built on the English writing samples of 2,130 students. It then compares the learners’ uses of PP-based bundles to the uses by native speakers of English in academic prose, as documented by Biber et al. (1999). Results show that Korean learners rely heavily on a small number of PP-based bundles and underuse those that are characteristic of academic prose. A subsequent error analysis of prepositions in the learner bundles reveals an error rate of approximately 7% in 13 bundle types. More than 70% of the errors are preposition misuses. Based on these findings, this study offers suggestions for classroom materials and further research topics centering on PP-based lexical bundles.
INTRODUCTION
Prepositions are notoriously difficult for many L2-English learners to acquire because of the difficulty of using them accurately with content words in context (Celce-Murcia & Larsen-Freeman, 1999; Lindstromberg, 2010; Park & Jang, 2011). Existing studies on learners’ use of prepositions often focus on prepositional phrases (PPs), which comprise a preposition and its complement, often a nominal phrase (e.g., in a shop). As many such studies have demonstrated, learners have problems using English prepositions accurately in context. In his error analysis of the English writing and speaking of 200 Korean university students majoring in English, Yun (2004) reported that the accurate use of prepositions was as difficult as that of articles for the learners. More specifically, he found that for the prepositions at, on, and in, the most frequent error occurred when their function was indicating time (over 30%); the second most frequent was
using an incorrect preposition with a verb; and the third most frequent error type occurred when prepositions were used to indicate position. Focusing on the preposition of, Lah and Yoo (2015) analyzed errors in a learner corpus containing essays written by Korean university students and found that only 4% of all the tokens of the preposition of were used incorrectly, suggesting that most students understood the precise use of the preposition. However, a further analysis revealed that more than 80% of the tokens were used in what Lindstromberg (2010, p. 206) calls an integrative function (e.g., “the red roof of the house”), demonstrating the students’ inability to use the preposition in a variety of different functions in their writing. In addition, Yu and Yoo (2010) analyzed errors with prepositional verbs (e.g., believe in, ask for) in 416 Korean university students’ argumentative essays, classifying the errors into five categories (omission, wrong preposition, addition, misordering, and others), and found that omission errors accounted for 54% of the total prepositional verb errors, followed by wrong-preposition errors (18%) and addition errors (12%). There were no misordering errors, which was also the case in the studies of Lah and Yoo (2015) and of Ahn (2013), a finding which suggests that students clearly understand where prepositions need to be placed in a sentence. These researchers have argued that preposition errors are due to students’ inability to distinguish transitive from intransitive verbs and a lack of knowledge about prepositional verbs. Moreover, the studies highlighted the need for sufficient exposure to prepositions in context to help students use prepositions more accurately. Despite these studies of preposition use, there is still a gap in the research on preposition errors in multiword sequences in relation to usage patterns in discourse.
Almost all of these studies have investigated L2 learners’ use of prepositions as part of prepositional phrases
(PPs), mostly in independent semantic units such as idiomatic expressions, thus taking a top-down approach in which target preposition usage is predetermined and meaning-based. There have been, however, very few attempts to examine how L2 learners use prepositions in expressions typical of academic prose, which requires consideration of context.
PREVIOUS STUDIES ON LEXICAL BUNDLES
One way to investigate contextual uses of prepositions involves recurrent multiword sequences in discourse. Lexical bundles (LBs), frequently occurring sequences of three or more words in a given genre, are a common type of recurrent sequence (Biber, Johansson, Leech, Conrad, & Finegan, 1999). The literature on this topic has previously centered on the frequency of LBs in different registers and their structural/functional classifications (e.g., Shin, 2019; Biber, Conrad, & Cortes, 2004; Biber et al., 1999; Hyland, 2008; Salazar, 2014), commonly suggesting that phrasal bundles including noun phrases (NPs) and PPs are more common in academic prose, while clausal bundles are more typical of face-to-face conversation.
Much research has substantiated that students produce increasingly complex writing as they advance in their academic careers (e.g., Biber & Gray, 2010; Biber, Gray, & Poonpon, 2011; Ortega, 2003; Pan, Reppen, & Biber, 2016). L1 students as well as L2 students must learn to use academic prose appropriately, which entails learning the use of appropriate LBs. Several L1 studies have focused on the differences in the usage of LBs between novice academic writers and expert academic writers; these studies have compared published academic articles to university students’ production (e.g., Cortes, 2002, 2004; Levy, 2008; Scott & Tribble, 2006; Tribble, 2011).
Two studies by Cortes (2002, 2004) compared bundles in student writing to bundles in published research articles. In the first study, the student corpus consisted of English native-speaker freshmen’s essays. She found significant differences between student writing and published writing and observed that students’ choice of bundles was affected by the assigned tasks. The second study compared published academic texts in the fields of history and biology to texts written by students (including graduate students) in the same disciplines. Cortes again found little overlap between the two groups: the most frequent phrasal bundles in the published text corpus were infrequent in the student corpus; when students did use some of the same bundles that appeared in the published work, their use was unlike that of the published authors.
Other studies have focused on lexical bundles produced by L2 writers, often comparing them to native English-speaking writers and generally demonstrating that L2 writers favor clausal bundles characteristic of conversation (e.g., Shin, 2018, 2019; Bychkovska & Lee, 2017; Chen & Baker, 2010; Huang, 2015; Pan et al., 2016; Qin, 2014; Yoon, 2017). For example, Chen and Baker (2010) compared the use of LBs by native speakers and Chinese learners of L2 English, using one corpus of published academic texts and two corpora of native and nonnative student academic writing consisting of university assignments. Overall, this study showed that student writers are different from expert writers; for example, the L1 and L2 students used more clausal bundles than did the expert writers, who used more phrasal bundles. However, a qualitative analysis investigating expanded concordance lines further revealed several distinctive features of L2 writers. L2 students were found to overgeneralize a limited number of expressions that L1 writers rarely used in academic writing. At the same time, the L2 students underused the bundles most frequently used by the L1 writers in both published and student writing.
These findings recall other studies targeting EFL Korean university students, which have indicated that these English learners tend to use colloquial and idiomatic LBs in their academic writing (e.g., Huh, 2008; Kim, 2013; Lee & Kim, 2017; Nam, 2017; Yoon & Choi, 2015).
Specifically, Yoon and Choi (2015) examined the use of bundles in argumentative essays produced by Korean university students and native English-speaking university students. Their findings showed that the Korean students favored bundles used in speech, including those with personal pronouns and clausal bundles. Additionally, the same group rarely used phrasal bundles typical of academic written genres, such as nominalizations. Their native counterparts, on the other hand, displayed frequent uses of phrasal bundles; thus, the study identified few overlapping bundles between the native and nonnative student groups. Likewise, Lee and Kim (2017) compared lexical bundles in English academic writing by L1 Korean and L1 English university students, showing distinctive differences between the two groups. The Korean students were found to heavily rely on colloquial expressions (e.g., there are lots of); when they did use PP-based bundles, these were mostly limited to idiomatic expressions (e.g., for a long time). In contrast, the native students used PP bundles such as as a result of, in relation to the, and in the absence of, which are generally considered characteristic of academic prose. Furthermore, not just L2 student writers but also L2 expert academic writers appear to generally underuse PP-based bundles (e.g., Hong, 2018; Pan et al., 2016; Salazar, 2014). Salazar (2014), for instance, reported that L1-Spanish experts employed a relatively small number of phrasal bundles in their published English-language research articles. She also found that in comparison to native English writers, the nonnative writers overused PP-based bundles, but only eight types. Salazar argued that the repeated use of a few PP-based bundles reflected the writers’ lack of knowledge of alternative expressions. 
In short, previous corpus analyses that have sought to compare native and nonnative LB usage have frequently asserted that nonnative academic writers use fewer specific bundles or use bundles less frequently than native-speaker academic writers (e.g., Ädel & Erman, 2012; Chen &
Baker, 2010; Pang, 2009; Qin, 2014; Salazar, 2014). This body of research, however, consists almost entirely of corpus studies that use automatic search programs, which can only locate full and correct LBs (e.g., Chen & Baker, 2010; De Cock, 2000; Nekrasova, 2009; Paquot, 2017; Salazar, 2014; Wei & Lei, 2011). Among the few studies that have examined the partial production of LBs by L2 writers (Bychkovska & Lee, 2017; Huang, 2015; Kang, Yoo, & Shin, 2018; Shin, Cortes, & Yoo, 2018; Shin & Kim, 2017), Huang (2015) investigated the accuracy of lexical bundles used by junior and senior Chinese university students in EFL settings. Overall, senior students produced 2.5 times more bundle tokens and almost twice as many bundle types as did juniors, but there was no significant statistical difference in the two levels’ grammatical and functional accuracy rates with bundles. This finding suggests that accuracy did not improve with higher proficiency. In addition, Bychkovska and Lee (2017), who studied the use of LBs in L1-Chinese university students’ writing, claimed that more than 50% of all the errors they found were related to function words such as articles and prepositions. They concluded that the participants’ articleless L1 (Chinese) was a cause for the high proportion of function-word errors. Another study, Shin et al. (2018), examined English learners’ use of definite articles in LBs by analyzing the core expressions of LBs, i.e. central phrases often comprising a single word or phrase with a following preposition. Shin et al. discovered that more than 89% of all the definite article errors they identified were omission errors, with both misformation and addition errors occurring at approximately 5%. Despite the increasing amount of research addressing L2 learners’ use of LBs, little research has investigated problematic target forms in the framework of LBs. 
In particular, there is a dearth of research on the uses of prepositions in LBs in L2 production, although prepositions seem to be
the most frequently embedded items in phrasal bundles (Kang et al., 2018). The current study conducts a qualitative analysis of natives’ and nonnatives’ use of prepositions embedded in LBs to provide a fuller picture of the areas that are problematic for L2 learners and that are therefore likely to lead to the consistent reports from corpus-based research that learners underuse PP-based bundles. By so doing, this study hopes to provide useful findings to inform pedagogical approaches to teaching formulaic language to L2 learners.
RESEARCH QUESTIONS
In this study, LBs serve as a tool to examine the use of prepositions in L1-Korean English learners’ writing. It investigates the extent to which Korean university undergraduates use prepositions in LBs in their English academic prose. Specifically, it first identifies four-word LBs that include a preposition(s) in a corpus of Korean university student writing and then compares these with the LBs used by native speakers of English reported in Biber et al. (1999). Next, it examines the accuracy of the LBs used by the Korean students with respect to embedded prepositions. The two specific research questions are as follows:
1. What are the most frequently occurring four-word PP-based bundles in essays written by Korean university students and how differently are they used compared with those produced by native speakers of English?
2. How accurately do the Korean undergraduate university students use prepositions embedded in lexical bundles?
METHOD
Corpus and lexical bundles
The learner corpus used in the present study consists of 604,008 words of English writing samples produced by undergraduate students at a Korean university. A total of 2,130 first-year students wrote argumentative essays with 284 words on average, as shown in Table 1. The students were instructed to write essays in 50 minutes on one of the following three essay prompts, all of which were followed by the instruction “use specific reasons and examples to support your answer”: 1) It has been said, “Not everything that is learned is contained in books.” Compare and contrast knowledge gained from experience with knowledge gained from books. Which source is more important? Why? 2) If you could change one important thing about your hometown, what would you change? 3) If you could make one important change in a school that you attended, what change would you make?
Table 1. Description of the learner corpus
| Number of texts | Number of words | Average text length |
|---|---|---|
| 2,130 | 604,008 | 283.6 |
To answer the first research question, the four-word lexical bundles that include a target form (i.e., a preposition) were identified in the learner corpus. Biber et al. (1999) set a threshold of at least ten times per million words across five texts for a four-word expression to be considered a lexical bundle. We decided to set a lower threshold, considering the smaller size of the learner corpus used in this study and the documented findings that English learners underuse phrasal bundles in academic prose (e.g., Ädel & Erman, 2012; Bychkovska & Lee, 2017; Huang, 2015; Pan et al., 2016; Shin, 2019). The present study therefore used a threshold of seven times
across three different texts in the learner corpus, which includes about 600,000 words in total. The lower threshold should also help us to overlook fewer PP-based bundles that could serve as target bundles for the study, suiting our main purpose, which is to analyze the use of prepositions in phrasal bundles rather than to report on frequency. In addition, given the different sizes of the learner and the native corpora, tokens per million words (pmw) was used for the comparison of the PP-based bundles in the two corpora. Following Biber et al. (1999), frequency in each corpus was indicated with three asterisks for more than 100 occurrences per million words, two for 40–99.99 occurrences, one for 20–39.99 occurrences, and none for 10–19.99 occurrences. Prepositional bundles were extracted using a tagging program and n-gram/cluster analysis in AntConc (Anthony, 2019). Tagging was done using the Free CLAWS web tagger¹ with the C7 tagset to add the part of speech of each word in the corpus files. The retrieved bundles were then manually checked for any LBs that were taken directly from the essay prompts (e.g., thing about your hometown), which were excluded from the analysis. Lastly, the set of prepositional bundles from the learner corpus was compared with the set of prepositional bundles used by native English speakers in academic prose, as identified by Biber et al. (1999).
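The frequency and range thresholds described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which relied on AntConc and CLAWS part-of-speech tags; the tokenizer, the function names, and the toy essays are illustrative assumptions, and no part-of-speech filtering or prompt-based exclusion is attempted.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase word tokens; punctuation and sentence
    # boundaries are ignored, so n-grams may span sentences.
    return re.findall(r"[a-z']+", text.lower())


def extract_bundles(texts, n=4, min_freq=7, min_texts=3):
    """Return {n-gram: (frequency, number of texts)} for n-grams
    that meet both the frequency and the range threshold."""
    freq = Counter()
    seen_in = defaultdict(set)
    for text_id, text in enumerate(texts):
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            freq[gram] += 1
            seen_in[gram].add(text_id)
    return {
        gram: (count, len(seen_in[gram]))
        for gram, count in freq.items()
        if count >= min_freq and len(seen_in[gram]) >= min_texts
    }


# Toy usage with lowered thresholds (three invented mini-texts).
essays = [
    "In my opinion the school is good.",
    "in my opinion the city is big",
    "In my opinion the road is old",
]
bundles = extract_bundles(essays, n=4, min_freq=3, min_texts=3)
```

Requiring a minimum number of texts as well as a minimum frequency keeps a sequence repeated many times by a single writer from being counted as a bundle.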
Core expression approach
In order to answer the second question (i.e., accuracy of LBs in terms of prepositions), a core-expression approach (Shin et al., 2018) was adopted. Shin et al. defined a core expression as the part of an LB that consists of its core word and the following word. For instance, number of is the core expression of bundles like in a number of and the total number of. As they argued,
¹ The Free CLAWS web tagger was first developed in the early 1980s to tag words in the British National Corpus (BNC). It has achieved 96–97% accuracy with an error rate of only 1.5%, according to the program’s official website (http://ucrel-api.lancaster.ac.uk/claws/free.html).
this approach allows the researcher to examine the separate elements of a given type of LB, which in turn makes it easier to investigate inaccurate uses. Many of the core expressions consist of a content word followed by a preposition (e.g., number of, some of). Because most of the core expressions include a preposition, additional efforts were made to extract every potentially relevant bundle. For instance, from the bundle in front of my, the core expression front of was searched for in the learner corpus to examine the use of the preceding preposition in. In addition, the corpus was also searched for the expression front of my to investigate the accurate use of the preceding preposition. Appendix A shows the 50 core expressions that were used to examine the total tokens of frequently used PP-based bundles in the learner corpus. These are relatively longer than the original core expressions used for the detection of article errors (Shin et al., 2018), because this study’s analysis focuses on the initial prepositions in the bundles. Based on the results of an n-gram/cluster analysis, a modified core expression approach was used to add a few bundles that had been uncounted due to differences in uppercase and lowercase words in the bundles (see Appendix A).
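The search step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' tooling: the function name `preceding_words`, the `<none>` label, and the sample sentences are hypothetical. The sketch looks up every occurrence of a core expression, ignoring case, and tallies the word that immediately precedes it, so that missing or unexpected prepositions stand out.

```python
import re
from collections import Counter


def preceding_words(text, core):
    """Count the word immediately before each occurrence of `core`.

    Occurrences with no preceding word in the same clause (for
    example at the start of a sentence) are counted as "<none>".
    Matching is case-insensitive, so "Front of my" is also found.
    """
    pattern = re.compile(
        r"(?:\b([\w']+)\s+)?\b" + re.escape(core) + r"\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for match in pattern.finditer(text):
        counts[(match.group(1) or "<none>").lower()] += 1
    return counts


# Invented sample text; only the first sentence uses the target bundle.
sample = (
    "My school is in front of my house. "
    "The shop is front of my school. "
    "Front of my house is a park."
)
counts = preceding_words(sample, "front of my")
```

In the sample, one occurrence is preceded by in, while the other two (preceded by is, or by nothing) would be flagged for manual inspection as possible omissions.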
Errors with prepositions in lexical bundles
Errors with prepositions embedded in lexical bundles were classified into three types: omission, addition, and misuse, following the tradition of the error analysis literature (e.g., Back, 2011; Lah & Yoo, 2015; Lee, 2001; Yun, 2004). Missing obligatory prepositions were classified as omission errors, unnecessary prepositions as addition errors, and incorrect prepositions as misuse errors. To check the reliability of the classification process, one of the researchers (a nonnative speaker of English) and two native speakers of English analyzed and categorized the preposition errors in 36% of all the bundles containing such errors. The initial agreement rate was about 85%; however, the raters negotiated each case of disagreement until they reached
complete agreement.
RESULTS
PP-based bundles used by native and nonnative speakers of English
This section presents the frequency analysis of lexical bundles used by Korean university students. Table 2 shows the final list of bundles from the learner corpus after topic-dependent bundles were removed, a process which left about 32% of the bundles first identified. As the table shows, limited types of bundles were identified, and most of them were used infrequently: only 25 bundles were used 10 or more times. The finding of small numbers of tokens for these bundles echoes previous studies’ findings of English learners’ underuse of PP-based bundles (e.g., Kim, 2013; Nam, 2018; Salazar, 2014; Shin, 2019).
Table 2. PP-based bundles in the learner corpus (722 tokens)
| Lexical bundle | Token | Lexical bundle | Token |
|---|---|---|---|
| in my opinion knowledge | 53 | in his or her | 9 |
| for these reasons I | 52 | in my case when | 9 |
| on the other hand | 48 | in these reasons I | 9 |
| for a long time | 35 | of people who live | 9 |
| in my case I | 32 | to people who live | 9 |
| at the same time | 27 | with my friends in | 9 |
| of the most important | 23 | in front of the | 8 |
| in my opinion the | 18 | in my opinion experience | 8 |
| at that time I | 16 | in other words knowledge | 8 |
| in front of my | 16 | in other words the | 8 |
| because of lack of | 15 | in this essay I | 8 |
| for the first time | 15 | of all I want | 8 |
| in my opinion I | 15 | of the biggest city | 8 |
| in the real world | 15 | for students to study | 7 |
| to the subway station | 15 | for students who want | 7 |
| for this reason I | 13 | in books there are | 7 |
| of these reasons I | 12 | in my life I | 7 |
| as a result I | 11 | in my opinion it | 7 |
| in conclusion I think | 11 | in other words I | 7 |
| in the middle of | 11 | in the case of | 7 |
| as a result the | 10 | of all I would | 7 |
| by reading a book | 10 | of all there are | 7 |
| for these reasons if | 10 | of knowledge that we | 7 |
| in conclusion if I | 10 | of money and time | 7 |
| in the world and | 10 | of the city is | 7 |
| between students and teachers | 9 | on the road and | 7 |
| between teachers and students | 9 | | |
Next, the bundles found in the learner corpus were compared with 91 different PP-based bundles frequently occurring in academic prose as reported by Biber et al. (1999), i.e., 56 types of prepositional phrase with embedded of-phrase fragments and 35 types of other prepositional phrase (fragments). Because the total number of words differed in the two corpora, the raw frequency of each learner bundle was converted to tokens per million words (pmw) and rounded to two decimal places for an effective comparison with the native bundles. In Table 3, the bundles are marked by asterisks according to the pmw rate: three asterisks (***) for more than 100 occurrences pmw, two (**) for 40–99.99 occurrences, one (*) for 20–39.99 occurrences, and none for 10–19.99 occurrences.
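As a worked illustration of this normalization, the following Python sketch converts raw token counts to pmw using the learner corpus size reported in Table 1 (604,008 words) and assigns the asterisk bands. The function names are hypothetical, and the treatment of values that fall exactly on a band boundary or below 10 pmw is an assumption.

```python
LEARNER_CORPUS_SIZE = 604_008  # words, as reported in Table 1


def per_million(tokens, corpus_size=LEARNER_CORPUS_SIZE):
    """Normalize a raw token count to tokens per million words (pmw)."""
    return round(tokens / corpus_size * 1_000_000, 2)


def frequency_band(pmw):
    """Map a pmw value to the asterisk bands used in the comparison."""
    if pmw > 100:
        return "***"
    if pmw >= 40:
        return "**"
    if pmw >= 20:
        return "*"
    return ""  # 10-19.99 pmw; values below 10 are also left unmarked here


pmw_opinion = per_million(53)  # in my opinion knowledge, 53 tokens
pmw_case = per_million(7)      # in the case of, 7 tokens
```

With these inputs, 53 tokens correspond to 87.75 pmw (two asterisks) and 7 tokens to 11.59 pmw (no asterisk).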
Table 3. Distribution of lexical bundles in the native corpus (Biber et al., 1999) and the learner corpus
| Native corpus (29 types) | Frequency | Learner corpus (26 types, 510 tokens) | Frequency | Tokens | pmw |
|---|---|---|---|---|---|
| in the case of † | *** | in my opinion knowledge | ** | 53 | 87.75 |
| on the other hand † | *** | for these reasons I | ** | 52 | 86.09 |
| as a result of | ** | on the other hand † | ** | 48 | 77.81 |
| at the same time † | ** | for a long time | ** | 35 | 57.95 |
| at the time of | ** | in my case I | ** | 32 | 52.98 |
| in the absence of | ** | at the same time † | ** | 27 | 44.70 |
| in the form of | ** | of the most important † | * | 23 | 38.08 |
| in the presence of | ** | in my opinion the | * | 18 | 29.80 |
| on the basis of | ** | at that time I | * | 16 | 26.49 |
| as a function of | * | in front of my | * | 16 | 26.49 |
| as part of the | * | because of lack of | * | 15 | 24.83 |
| at the beginning of | * | for the first time † | * | 15 | 24.83 |
| at the level of | * | in my opinion I | * | 15 | 24.83 |
| by the fact that | * | in the real world | * | 15 | 24.83 |
| for the first time † | * | to the subway station | * | 15 | 24.83 |
| in a number of | * | for this reason I | * | 13 | 21.52 |
| in such a way | * | of these reasons I | | 12 | 19.87 |
| in terms of the | * | as a result I | | 11 | 18.21 |
| in the context of | * | in conclusion I think | | 11 | 18.21 |
| in the course of | * | in the middle of | | 11 | 18.21 |
| in the development of | * | as a result the | | 10 | 16.56 |
| in the number of | * | by reading a book | | 10 | 16.56 |
| in the present study | * | for these reasons if | | 10 | 16.56 |
| in the process of | * | in conclusion if I | | 10 | 16.56 |
| in the same way | * | in the world and | | 10 | 16.56 |
| in the United States | * | in the case of † | | 7 | 11.59 |
| on the one hand | * | | | | |
| to the development of | * | | | | |
| of the most important † | | | | | |
Note. Bundles marked with † are shared bundles found in both corpora.
As shown in Table 3, PP-based bundles occurred less frequently per million words in the learner corpus than in the native corpus. Five bundles occurred in both corpora (in the case of, on the other hand, at the same time, for the first time, and of the most important), three of them with considerable differences in frequency: in the case of, on the other hand, and of the most important. These differences indicate that the nonnative writers did not use the PP bundles that are normally used by native writers in academic prose. In addition, the structural types of PP bundles used by the native and nonnative writers differed. A comparison of the most frequently used four-word PP-based bundles highlights two main structural differences between the two corpora. First, the Korean student writers showed a lack of variety in the use of these bundles in their writing. Among the 26 bundles frequently used by Korean student writers, approximately 42% (11 bundles) contained overlapping parts with one to five other bundles. Table 4 lists the bundles with overlapping content words or phrases in the learner corpus.
Table 4. Frequently used bundles with overlapping words in the learner corpus
| Overlapping words | Lexical bundles | Tokens |
|---|---|---|
| in my opinion (3) | in my opinion knowledge | 53 |
| | in my opinion the | 18 |
| | in my opinion I | 15 |
| this reason / these reasons (4) | for these reasons I | 52 |
| | for this reason I | 13 |
| | of these reasons I | 12 |
| | for these reasons if | 10 |
| as a result (2) | as a result I | 11 |
| | as a result the | 10 |
| in conclusion (2) | in conclusion I think | 11 |
| | in conclusion if I | 10 |
| Total | 11 types | 215 |
As shown in the table, the 11 learner bundles can be divided into four groups based on their
overlapping words. The total token count of bundles with overlapping parts was 215, which is 42.16% of the total tokens of the bundles used by the learners (see Table 3). This high percentage reflects the extent to which the learner writers used the same formulaic sequences repeatedly in their writing, in line with the findings reported by Shin (2019). Three phrases (i.e., in my opinion, as a result, and in conclusion) appeared in seven bundle types. Two different prepositions were used in bundles with the noun phrase this reason and its plural form these reasons; however, the writers’ intention in using these two phrases seems to be the same. In context, the four overlapping phrases in Table 4 were used to make claims, generally summing up the writers’ argument at the beginning or end of their discourse. When these expressions are followed by the first person singular, as they frequently are in these overlapping bundles, the writers’ intention is evidently to mark the introductory or concluding remarks of their argumentative essays (Shin, 2019). Thus, while the students’ written production shows an effort to use PP-based bundles in their writing, one problem they face in doing so is that they lack knowledge of alternative expressions with similar functions (Salazar, 2014). The deficiency in LB types and the repeated use of the same LBs in the learner corpus recall the lack of bundle variety found in nonnative writing in other studies (Salazar, 2014; Nam, 2018).
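The overlap shares reported above can be recomputed from the token counts in Table 4 and the totals in Table 3. The short Python sketch below does so; the variable names are illustrative, and the counts are simply transcribed from the two tables.

```python
# Token counts transcribed from Table 4, grouped by overlapping phrase.
overlap_groups = {
    "in my opinion": [53, 18, 15],
    "this reason / these reasons": [52, 13, 12, 10],
    "as a result": [11, 10],
    "in conclusion": [11, 10],
}
TOTAL_TYPES = 26    # frequent learner bundle types in Table 3
TOTAL_TOKENS = 510  # tokens of those bundles in Table 3

overlap_types = sum(len(counts) for counts in overlap_groups.values())
overlap_tokens = sum(sum(counts) for counts in overlap_groups.values())

# Shares of bundle types and of tokens, in percent.
type_share = round(overlap_types / TOTAL_TYPES * 100, 2)
token_share = round(overlap_tokens / TOTAL_TOKENS * 100, 2)
```

The result is 11 overlapping types (about 42% of the 26 types) and 215 tokens, or 42.16% of the 510 tokens.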
Analysis of preposition errors in learner bundles
To answer the second research question, the accuracy of the LBs in terms of prepositions in the learner corpus was examined by employing a modified version of Shin et al.’s (2018) core expression approach. The analysis centered on the use of the preposition that precedes a core (lexical) word in each lexical bundle in context. Table 5 shows 13 bundles frequently used
by the learners, along with the numbers of tokens of attempts at producing these bundles in the learner corpus that included one of the three error types; the percentages indicate the proportions of each type of error.
Table 5. Thirteen lexical bundles with three types of preposition errors in the learner corpus
| Lexical bundles | Omission | Addition | Misuse | Total |
|---|---|---|---|---|
| in my opinion experience | | | 1 (100%) | 1 |
| in front of my | 4 (100%) | | | 4 |
| in front of the | 2 (100%) | | | 2 |
| for these reasons I | 1 (20%) | | 4 (80%) | 5 |
| in these reasons I | | | 9 (100%) | 9 |
| for these reasons if | | | 3 (100%) | 3 |
| on the other hand | 1 (25%) | | 3 (75%) | 4 |
| at the same time | | | 2 (100%) | 2 |
| at that time I | 7 (100%) | | | 7 |
| for the first time | | | 3 (100%) | 3 |
| for this reason I | | | 8 (100%) | 8 |
| in the middle of | 1 (20%) | | 4 (80%) | 5 |
| on the road and | | | 2 (100%) | 2 |
| Total | 16 (29.1%) | 0 (0%) | 39 (70.9%) | 55 (100%) |
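The totals and percentages in Table 5 can be checked with a few lines of Python. The sketch below is only a verification aid with hypothetical names; the counts are transcribed from the table, with empty cells entered as zero.

```python
# (omission, addition, misuse) counts per bundle, transcribed from Table 5.
errors = {
    "in my opinion experience": (0, 0, 1),
    "in front of my": (4, 0, 0),
    "in front of the": (2, 0, 0),
    "for these reasons I": (1, 0, 4),
    "in these reasons I": (0, 0, 9),
    "for these reasons if": (0, 0, 3),
    "on the other hand": (1, 0, 3),
    "at the same time": (0, 0, 2),
    "at that time I": (7, 0, 0),
    "for the first time": (0, 0, 3),
    "for this reason I": (0, 0, 8),
    "in the middle of": (1, 0, 4),
    "on the road and": (0, 0, 2),
}


def error_shares(table):
    """Return the grand total and {error type: (count, percent)}."""
    totals = [sum(column) for column in zip(*table.values())]
    grand_total = sum(totals)
    names = ("omission", "addition", "misuse")
    shares = {
        name: (count, round(count / grand_total * 100, 1))
        for name, count in zip(names, totals)
    }
    return grand_total, shares


grand_total, shares = error_shares(errors)
```

Summing the columns gives 55 errors in total: 16 omissions (29.1%), no additions, and 39 misuses (70.9%).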
The examples in (1–3) illustrate the omission of prepositions. The bundles with preposition errors are underlined and preceded by asterisks; other errors are neither corrected nor marked. Following each example is the target LB in parentheses.
(1) My favorite thing is that the school is *front of my house. (in front of my)
(2) *The other hand, students who are poor at the subject don’t understand that class. (On the other hand)
(3) *That time I could not realize the importance of practical academy that teach the math essay. (At that time I)
The example sentences in (4–6) show the misuse of prepositions in PP-based bundles. As in the omission errors, the student writers were successful in using the core expressions of each
bundle, but they lacked accuracy in selecting the adjoining preposition in each bundle, suggesting their difficulty in producing multiword sequences with correct prepositions in context. As for example (4), the word experience from the bundle for my opinion experience appears in one of the writing prompts given to the students (i.e., comparison of knowledge from experience and knowledge from books), referring to the knowledge gained from experience.
(4) *For my opinion, experience knowledge is more important than contained in books. (In my opinion, experience)
(5) *Of these reasons, I want make the ship roads in my hometown. (for these reasons I)
(6) if there is such programme *at the middle of the semester then students can be fresh for remaining class. (in the middle of)
The results of this analysis of preposition errors embedded in lexical bundles point to interesting uses of PP-based bundles by Korean student writers. First, even though English preposition use is known to be a challenge for Korean students (e.g., Lee, 2012; Park & Jang, 2011), the LB preposition error rate was quite low, at 7.13% (55 tokens). Part of the reason for the relatively low error percentage may be the underuse of PP-based bundles by learners in the first place. Moreover, the lack of variety in the LB types used by the writers might have the effect of decreasing their preposition uses. Considering that four individual prepositions (to, in, of, and for) rank among the 20 most frequently used words in the corpus data, there is a high possibility that both reasons affect the rate of preposition errors in the learner bundles. This low error rate is in line with the results of Lah and Yoo (2015), who reported an error rate of about 4% with the preposition of, again emphasizing the scarcity of various preposition uses in learners’ writing. Second, misuse errors accounted for more than 70% of the total preposition errors (omission
Journal Pre-proof
29.1% and addition 0%). The high rate of misuse errors in bundles contrasts somewhat with previous studies’ findings on prepositions. For instance, Lah and Yoo’s (2015) study of learners’ errors with of found 70% of the errors to involve addition, and 20% to be misuse of prepositions. In contrast, the high proportion of misuse errors in the present study seems to stem from the fact that only 18.9% of the frequent bundles beginning with of comprise 13.2% of the total tokens, while other preposition errors such as those with in, at, and for contribute to the high percentage of misuse errors. Not only the error token patterns but each type of error also showed a certain relationship between the prepositions and the error types. For example, errors of omission and substitution occurred primarily with bundles containing the prepositions in, for, on, and at. In addition, almost half of the preposition errors (45.5%) occurred in bundles containing these reasons or this reason. In the context of the whole writing discourse, preposition misuse errors seem to occur most frequently when the writers are making claims or giving reasons for their arguments as they conclude their essays. DISCUSSION AND CONCLUSION The present study is among the few attempts to investigate erroneous uses of prepositions embedded in LBs in L2 production. While prior research has centered on learners’ uses of prepositions in PPs, often drawing on predetermined and meaning-based phrases (e.g., idiomatic expressions), this study targeted PP-based bundles identified in L2 argumentative essays. The findings provide new insights into learners’ uses of prepositions in recurrent multiword sequences in the given genre. The first part of the study identified PP-based bundles produced by Korean university students in order to compare them with PP-based bundles produced by native speakers, as documented by Biber et al. (1999). The results show that when Korean students attempt to use 18
Journal Pre-proof
PP-based bundles in their writing, they not only have a restricted repertoire but also tend to select bundles different from those found in the native corpus. This observation is consistent with the accumulated findings in the literature, including other studies involving L1-Korean EFL speakers (e.g., Kim, 2013; Lee & Kim, 2017; Yoon & Choi, 2015). The second part of the study examined the accuracy of prepositions in PP-based bundles as produced by the learners. The results showed that the Korean students tend to misuse prepositions when attempting to use phrasal bundles. A core-expression-based error analysis found a high rate of substitution of a different preposition for the more appropriate one (e.g., *at the middle of for in the middle of), with such substitutions accounting for approximately 70% of all the preposition errors. In other words, while the Korean students used core expressions of LBs correctly, they very often combined them with incorrect prepositions, resulting in a variety of different forms which did not occur with sufficient frequency to be detected by automatic procedures for analyzing corpus data (Shin et al., 2018). The frequency of such errors with embedded items in LBs therefore can partly explain the small numbers of PP-based bundles in L2 production consistently reported by datadriven corpus studies (e.g., Bychkovska & Lee, 2017; Chen & Baker, 2010; Qin, 2014; Shin, 2019; Yoon, 2017). These findings provide useful pedagogical implications, as they indicate the need for carefully designed preposition-teaching materials to help students improve the accuracy of their preposition use. Recurrent sequences, for example, could be an effective tool to teach accurate preposition use in the classroom. 
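The core-expression-based error analysis summarized above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name classify, the small PREPOSITIONS set, and the rule that a non-preposition word before the core counts as an omission are all assumptions made for illustration. Addition errors are not modeled, since none were observed in the learner bundles.

```python
import re

# Illustrative preposition list; an assumption, not the study's inventory.
PREPOSITIONS = {"in", "on", "at", "for", "of", "to", "by", "with", "as", "between"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(sentence, target_bundle, core):
    """Label the preposition slot of one bundle in a learner sentence.

    Returns "correct", "misuse", "omission", or "other"; returns None when
    the core expression does not occur in the sentence.
    """
    words = tokenize(sentence)
    target = tokenize(target_bundle)
    core_words = tokenize(core)
    n = len(core_words)
    if len(target) <= n or target[-n:] != core_words:
        raise ValueError("target bundle must be lead words followed by the core")
    # Words the target places before the core, e.g. ["in", "the"] + "middle of".
    prep, between = target[0], target[1:len(target) - n]
    for i in range(len(words) - n + 1):
        if words[i:i + n] != core_words:
            continue
        j = i - len(between)
        if j < 0 or words[j:i] != between:
            return "other"  # e.g. a missing article; not a preposition error
        if j == 0:
            return "omission"  # nothing precedes the bundle in the sentence
        before = words[j - 1]
        if before == prep:
            return "correct"
        return "misuse" if before in PREPOSITIONS else "omission"
    return None


# Learner sentences (1), (4), and (6) from the examples in this article.
examples = [
    ("My favorite thing is that the school is front of my house.",
     "in front of my", "front of my"),
    ("For my opinion, experience knowledge is more important than contained in books.",
     "in my opinion experience", "my opinion experience"),
    ("if there is such programme at the middle of the semester",
     "in the middle of", "middle of"),
]
for sentence, target, core in examples:
    print(classify(sentence, target, core), "<-", target)
```

A rule of this kind would mislabel an omission as a misuse whenever an unrelated preposition happens to precede the core expression, so any automatic labels would still need manual checking in context.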
Considering that learners already have some knowledge of the core expressions but frequently use incorrect adjoining prepositions in context, controlled activities such as preposition-matching and fill-in-the-blank tasks would likely help students to use prepositions more accurately (see also Shin & Kim, 2017, which describes using core expressions to teach adjoining articles in LBs in controlled activities; and Kang et al., 2018, for LB-based preposition instruction with varying degrees of explicitness in the teaching materials). Such instruction on prepositions in LBs can help English learners use LBs’ constituent grammatical elements correctly, an essential skill in fluent language production and effective communication. Further, it would increase their exposure to LBs typical of academic prose (e.g., Hyland, 2012; Wray, 2002; Shin & Kim, 2017).

However, it should be noted that the small number of PP bundles found in the current study may not be solely attributable to the nature of nonnative writing but could be due in part to the type of essay prompts the learners were given. While the findings of the study show that the Korean students produced LBs greatly different from those produced by their native counterparts, the essay prompts used in the learner corpus appear to have had considerable influence on the choice and usage of bundles. Considering that the Korean students were asked to give their own opinions on the given topic (e.g., the changes to be made in their school or hometown), it seems at least partly natural that they employed bundles characteristic of spoken language. The current study’s data comprised essays written in response to only three writing topics (all encouraging the writers to express their personal thoughts), which limits the study’s generalizability to academic writing. In this respect, further research is needed. In particular, comparative studies that use native and nonnative corpora closely matched for writing prompts are needed to provide a more concrete picture of the usages of PP-based bundles unique to each language group.
Overall, the present study adds useful insights into the use of lexical bundles produced by English learners, pinpointing specific areas with which learners have difficulty in their attempts to use multiword sequences in the genre of argumentative essays. The findings of the study thus offer practical pedagogical information that should be useful for ESL and EFL teachers, while also potentially providing a basis for future research to delve more deeply into the usage of formulaic sequences and their constituent items.
REFERENCES

Ädel, A., & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-native speakers of English: A lexical bundles approach. English for Specific Purposes, 31, 81–92.
Ahn, S. (2013). An analysis of the use of phrasal verbs and prepositional verbs by Korean EFL learners. The New Korean Journal of English Language and Literature, 55(2), 207–232.
Anthony, L. (2019). AntConc (Version 3.4.3) [Computer software]. Tokyo, Japan: Waseda University. [Available from http://www.laurenceanthony.net/].
Back, J. (2011). Preposition errors in writing and speaking by Korean EFL learners: A corpus-based approach. Studies in British and American Language and Literature, 99, 227–247.
Biber, D. (2006). Stance in spoken and written university registers. Journal of English for Academic Purposes, 5, 97–116.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at...: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25(3), 371–405.
Biber, D., & Gray, B. (2010). Challenging stereotypes about academic writing: Complexity, elaboration, explicitness. Journal of English for Academic Purposes, 9, 2–20.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–35.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and written English. Harlow, UK: Longman.
Bychkovska, T., & Lee, J. (2017). At the same time: Lexical bundles in L1 and L2 university student argumentative writing. Journal of English for Academic Purposes, 30, 38–52.
Celce-Murcia, M., & Larsen-Freeman, D. (1999). The grammar book: An ESL/EFL teacher’s course (3rd ed.). Boston, MA: Heinle & Heinle.
Chen, Y., & Baker, P. (2010). Lexical bundles in L1 and L2 academic writing. Language Learning and Technology, 14(2), 30–49.
Cortes, V. (2002). Lexical bundles in freshman composition. In R. Reppen, S. M. Fitzmaurice, & D. Biber (Eds.), Using Corpora to Explore Linguistic Variation (pp. 131–145). Amsterdam: John Benjamins Publishing Company.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for Specific Purposes, 23(4), 397–423.
De Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and writing. In C. Mair, & M. Hundt (Eds.), Corpus linguistics and linguistic theory: Papers from the twentieth international conference on English language research on computerized corpora (pp. 51–68). Atlanta: Rodopi.
Gray, B. (2015). On the complexity of academic writing: Disciplinary variation and structural complexity. In V. Cortes & E. Csomay (Eds.), Corpus-based research in Applied linguistics: Studies in honor of Doug Biber (pp. 49–77). Amsterdam: John Benjamins.
Hong, J. (2018). Use of lexical bundles by Korean EAP students at two different academic levels. Foreign Languages Education, 25(4), 23–52.
Huang, K. (2015). More does not mean better: Frequency and accuracy analysis of lexical bundles in Chinese EFL learners’ essay writing. System, 53, 13–23.
Huh, M. (2008). I think it is: Lexical bundles in EFL writing. Korean Journal of Applied Linguistics, 24(3), 129–146.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27, 4–21.
Hyland, K. (2012). Bundles in Academic Discourse. Annual Review of Applied Linguistics, 32, 150–169.
Jaworska, S., Krummes, C., & Ensslin, A. (2015). Formulaic sequences in native and non-native argumentative writing in German. International Journal of Corpus Linguistics, 20(4), 500–525.
Kang, S., Yoo, I., & Shin, Y. (2018). Formulaic-language-based preposition instruction: An instructional experiment with Korean high school EFL students. Paper presented at AACL (American Association for Corpus Linguistics), Atlanta, USA.
Kim, J. (2013). Lexical bundles in Korean college students’ English essays. Language & Literature Teaching, 19(3), 157–179.
Kim, S. (1981). Syntactic analysis of the preposition of. English Teaching, 21, 119–126.
Lah, J., & Yoo, I. (2015). A corpus analysis of the preposition of in Korean college matriculants’ writing. English Teaching, 70(3), 99–115.
Lee, H. (2001). An analysis of errors on prepositions in English free writings. Modern English Education, 2(2), 239–253.
Lee, K. (2012). English prepositions (3rd ed.). Seoul: Kyomunsa.
Lee, H., & Kim, H. (2017). Features of lexical bundles in academic writing by non-native vis-à-vis native speakers of English. English Language and Linguistics, 23(2), 67–88.
Levy, S. (2008). Lexical bundles in professional and student writing: A study in linguistic variation. Germany: VDM Verlag Dr. Müller.
Lindstromberg, S. (2010). English prepositions explained. Philadelphia, PA: John Benjamins.
Nam, D. (2017). Functional distribution of lexical bundle in native and non-native students’ argumentative writing. The Journal of Asia TEFL, 14(4), 703–716.
Nam, J. (2018). English learner corpus-based analysis of structural patterns and functional types of lexical bundles. The New Korean Journal of English Language and Literature, 60(2), 115–132.
Nekrasova, T. (2009). English L1 and L2 speakers’ knowledge of lexical bundles. Language Learning, 59(3), 647–686.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492–518.
Pan, F., Reppen, R., & Biber, D. (2016). Comparing patterns of L1 versus L2 English academic professionals: Lexical bundles in Telecommunications research journals. Journal of English for Academic Purposes, 21, 60–71.
Paquot, M. (2017). L1 frequency in foreign language acquisition: Recurrent word combinations in French and Spanish EFL learner writing. Second Language Research, 33(1), 13–32.
Park, K., & Jang, B. (2011). Language teaching and learning. Seoul: Parkyoungsa.
Pang, P. (2009). A study of the use of four-word lexical bundles in argumentative essays by Chinese English majors – a comparative study based on WECCL and LOCNESS. CELEA Journal, 32(3), 25–45.
Qin, J. (2014). Use of formulaic bundles by non-native English graduate writers and published authors in applied linguistics. System, 42, 220–231.
Salazar, D. (2014). Lexical bundles in native and non-native scientific writing: Applying a corpus-based study to language teaching. Philadelphia, PA: John Benjamins.
Scott, M., & Tribble, C. (2006). English for academic purposes: Building an account of expert and apprentice performances in literary criticism. In M. Scott and C. Tribble (Eds.), Textual Patterns: Key Words and Corpus Analysis in Language Education (pp. 131–159). Amsterdam, the Netherlands: John Benjamins.
Shin, Y. (2018). Lexical bundles in argumentative essays by native and nonnative English-speaking novice academic writers (Unpublished doctoral dissertation). Atlanta, GA: Georgia State University.
Shin, Y. (2019). Do native writers always have a head start over nonnative writers? The use of lexical bundles in college students’ essays. Journal of English for Academic Purposes, 40, 1–14.
Shin, Y., Cortes, V., & Yoo, I. (2018). Using lexical bundles as a tool to analyze definite article use in L2 academic writing: An exploratory study. Journal of Second Language Writing, 39, 29–41.
Shin, Y., & Kim, Y. (2017). Using lexical bundles to teach articles to L2 English learners of different proficiencies. System, 69, 79–91.
Thewlis, S. H. (2007). Grammar dimensions 3: Form, meaning, and use. Boston, MA: Thomson Heinle.
Tribble, C. (2011). Revisiting apprentice texts: Using lexical bundles to investigate expert and apprentice performance in academic writing. In F. Meunier, S. De Cock, G. Gilquin and M. Paquot (Eds.), A taste for corpora. In honour of Sylviane Granger (pp. 85–108). Amsterdam: John Benjamins.
Wei, Y., & Lei, L. (2011). Lexical bundles in the academic writing of advanced Chinese EFL learners. RELC Journal, 42(2), 155–166.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge, UK: Cambridge University Press.
Yoon, C., & Choi, J. (2015). Lexical bundles in Korean university students’ EFL compositions: A comparative study of register and use. Modern English Education, 16(3), 47–69.
Yoon, H. (2017). A study on the use of lexical bundles in second language writing at different levels of proficiency. Studies in Foreign Language Education, 31(1), 35–58.
Yu, T., & Yoo, I. (2010). Korean university students’ use of prepositional verbs: A corpus-based study. English Teaching, 65(4), 403–424.
Yun, S. K. (2004). Data analysis of errors in English by Korean college students. Modern English Education, 5(2), 131–153.

Appendix A. Core expressions (50) and lexical bundles (53)

No.  Core Expression         Lexical Bundles                 Refined Tokens
1    my opinion knowledge    in my opinion knowledge         53
2    my opinion the          in my opinion the               18
3    my opinion experience   in my opinion experience        8
4    my opinion it           in my opinion it                7
5    my opinion I            in my opinion I                 13 [15]
6    my case I               in my case I                    26 [32]
7    my case when            in my case when                 9
8    front of my             in front of my                  16
9    front of the            in front of the                 8
10   result I                as a result I                   10 [11]
11   result the              as a result the                 10
12   conclusion I think      in conclusion I think           10 [11]
13   conclusion if I         in conclusion if I              10
14   other words knowledge   in other words knowledge        8
15   other words the         in other words the              8
16   other words I           in other words I                7
17   these reasons I         for these reasons I             48 [52]
                             of these reasons I              12
                             in these reasons I              7 [9]
18   these reasons if        for these reasons if            10
19   students and teachers   between students and teachers   9
20   teachers and students   between teachers and students   9
21   students to study       for students to study           7
22   students who want       for students who want           7
23   people who live         of people who live              9
                             to people who live              9
24   other hand              on the other hand               47 [48]
25   long time               for a long time                 35
26   same time               at the same time                27
27   most important          of the most important           23
28   that time I             at that time I                  16
29   first time              for the first time              15
30   real world              in the real world               15
31   the subway station      to the subway station           15
32   lack of                 because of lack of              14 [15]
33   this reason I           for this reason I               11 [13]
34   middle of               in the middle of                11
35   reading a book          by reading a book               10
36   the world and           in the world and                10
37   his or her              in his or her                   9
38   my friends in           with my friends in              9
39   all I want              of all I want                   8
40   biggest city            of the biggest city             8
41   books there are         in books there are              7
42   my life I               in my life I                    7
43   case of                 in the case of                  7
44   this essay              in this essay I                 7 [8]
45   all I would             of all I would                  7
46   all there are           of all there are                7
47   money and time          of money and time               7
48   knowledge that we       of knowledge that we            7
49   the city is             of the city is                  7
50   the road and            on the road and                 7

Note. The ten bracketed numbers in the refined token column (bold in the original) are the total tokens of lexical bundles extracted by the core expression method.
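As a rough illustration of how a search anchored on a core expression might recover low-frequency variants such as of these reasons I, the sketch below counts every word sequence formed by a core expression and the word preceding it. This is only an assumed approximation: the function name variants_of_core and its lead_len parameter are hypothetical, the toy sentences are invented rather than drawn from the learner corpus, and the exact extraction procedure behind the refined token counts is not specified in this appendix.

```python
import re
from collections import Counter


def variants_of_core(sentences, core, lead_len=1):
    """Count sequences made of up to `lead_len` preceding words plus `core`."""
    core_words = core.lower().split()
    n = len(core_words)
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        for i in range(len(words) - n + 1):
            if words[i:i + n] == core_words:
                # Keep whatever precedes the core, even if it is nothing.
                lead = words[max(0, i - lead_len):i]
                counts[" ".join(lead + core_words)] += 1
    return counts


# Toy sentences invented for illustration; they are not corpus data.
toy_sentences = [
    "For these reasons I want a new library.",
    "For these reasons I like my hometown.",
    "Of these reasons I want make the ship roads.",
    "These reasons I agree with the idea.",
]
counts = variants_of_core(toy_sentences, "these reasons I")
for variant, freq in counts.most_common():
    print(freq, variant)
```

Grouping by core expression in this way surfaces variants that would each fall below a frequency cutoff if counted as separate four-word sequences.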
First author: Young Eun Lee received her master's degree in the Department of English from Sogang University. Her current research interests include lexical bundles, corpus linguistics and L2 acquisition in adult learners, with particular emphasis on pedagogical applications in L2 teaching.
Second author: Isaiah WonHo Yoo is Professor in the Department of English at Sogang University. His primary research focuses on how corpus linguistics informs language pedagogy. His recent publications have appeared in Corpora, Applied Linguistics, the Journal of Second Language Writing, Language Acquisition, and Linguistic Inquiry.
Corresponding author: Yu Kyoung Shin is Assistant Professor in the School of Global Studies at Hallym University. Her research interests include corpus analysis of academic written registers and disciplinary variation, particularly for applications to L2 language learning and teaching. Her recent work has appeared in TESOL Quarterly, the Journal of English for Academic Purposes, English for Specific Purposes, System, Language Teaching Research, and the Journal of Second Language Writing.
Young Eun Lee (first author) Sogang University Department of English Seoul, Republic of Korea Phone: +82-10-5150-1159 Email:
[email protected]
Isaiah WonHo Yoo (second author) Sogang University Department of English Seoul, Republic of Korea Phone: +82-2-705-8340 Fax: +82-2-715-0705 Email:
[email protected]
Yu Kyoung Shin (corresponding author) Hallym University School of Global Studies Chuncheon, Republic of Korea Phone: +82-33-248-2486 Fax: +82-33-248-2485 Email:
[email protected]