The British Academic Written English (BAWE) corpus

Journal of English for Academic Purposes 7 (2008) 294 www.elsevier.com/locate/jeap

The Editors are pleased to announce that the 6.5 million word British Academic Written English (BAWE) corpus is now available to researchers who register with the Oxford Text Archive. It is listed as resource number 2539; see http://ota.ahds.ac.uk/headers/2539.xml.

BAWE was developed at the Universities of Warwick, Reading and Oxford Brookes as part of a larger ESRC-funded project to investigate genres of assessed writing in British higher education.¹ Publications and findings from this project are available on request. The corpus developers were Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics [previously called CELTE], University of Warwick), Paul Thompson (Department of Applied Linguistics, Reading) and Paul Wickens (Westminster Institute of Education, Oxford Brookes).

The corpus contains 2,761 proficient student assignments, produced and assessed as part of university degree coursework, and fairly evenly distributed across 35 university disciplines and four levels of study (first year undergraduate to Masters level). About half the assignments were graded at a level equivalent to ‘distinction’ (D) (70% or above), and half at a level equivalent to ‘merit’ (M) (between 60% and 69%). The majority (1,953) were written by L1 speakers of English. As some assignments contained multiple pieces of coursework, the total number of separate texts in the corpus is 2,897.

Texts have been categorised into 13 broad genre families, including “essays”, “critiques”, “case studies”, “explanations”, “methodology recounts”, “problem questions” and “proposals”. Information about the genre family, discipline, level and grade of each assignment is provided in the header for each file, alongside contributor information which did not influence collection policy, such as gender, year of birth, native speaker status, and years of UK secondary education.
A spreadsheet providing these details and a manual explaining the encoding conventions are also included as part of the Oxford Text Archive deposit. The corpus is suitable for use with concordancing programs such as AntConc or WordSmith Tools, and provides scope for extensive research into lexis, phraseology and language variation in university student writing.

The BAWE corpus developers welcome use of the corpus for research purposes, provided that they are informed of any output arising from analysis in the form of dissertations, theses, presentations or publications. Please contact Hilary Nesi ([email protected]) for details of how to cite BAWE, or if you have any queries about its design and development.

Ken Hyland
Institute of Education
University of London
United Kingdom
E-mail address: [email protected]

¹ 2004–2007, project number RES-000-23-0800.

1475-1585/$ - see front matter doi:10.1016/j.jeap.2008.10.012