The linguistic impact of a CLIL Science programme: An analysis measuring relative gains

System xxx (2015) 1–11

Contents lists available at ScienceDirect

System journal homepage: www.elsevier.com/locate/system

The linguistic impact of a CLIL Science programme: An analysis measuring relative gains

Carmen Pérez Vidal*, Helena Roquet 1

Department of Translation and Linguistic Sciences, Roc Boronat 138, Barcelona 08018, Spain

Article info

Article history: Received 20 December 2014; Received in revised form 13 May 2015; Accepted 17 May 2015; Available online xxx

Keywords: Language learning; Time and timing; Composition task; Text completion task; Dictation task; Lexico-grammatical abilities; Sentence completion task; Grammaticality judgement task; EFL; CLIL

Abstract

The present study seeks to contribute new evidence regarding the linguistic progress achieved over one academic year by Content and Language Integrated Learning (CLIL) secondary education learners, enrolled in an English-medium Science course. It gauges the relative linguistic gains resulting from the CLIL programme in contrast with a formal instruction (FI) programme developed in the same school. Participants were followed longitudinally with a pre-test, post-test design. Tests used to elicit data were modelled on the type of tasks used both in FI and CLIL. They tapped into the productive and receptive skills of the learners. The sample included 2 groups of bilingual Catalan/Spanish participants, English being their L3 (N = 50 each). One group had FI plus CLIL in the Science course (experimental group), the other had FI only (control group). Results obtained confirm that larger relative gains are obtained by the FI + CLIL programme, however not in all domains and to the same degree: relative higher gains accrue in reading, as expected, but not in listening. Similarly, their writing ability, and particularly so their accuracy, shows higher relative gains, and so do their lexico-grammatical abilities. In sum, in the CLIL programme analysed, reading and grammar seem to benefit the most. © 2015 Elsevier Ltd. All rights reserved.

1. Introduction

The main goal of this classroom-based exploratory study is to analyse the linguistic progress achieved over one academic year by English as a Foreign Language (EFL) learners. In order to do so, we measure the relative amount of progress they achieve, with a primary focus on skill development. Participants are secondary education learners who, in addition to following conventional formal instruction (FI) in English since age 5, have been enrolled in an English-medium Science programme taught with a Content and Language Integrated Learning (CLIL) approach since age 10, that is, two years prior to the onset of the study. They are the GA: FI + CLIL group. We examine their progress over one year with respect to their level at the beginning of the year, between ages 13 and 14. Progress is measured in terms of the degree of relative proficiency gains accrued as a result of the FI + CLIL programme they have undergone. Their relative gains are then in turn contrasted with the relative gains obtained by another group, the GB: FI group, from the same school, which has followed a FI only programme. Age of onset and hours of instruction for the two groups are kept similar. Ultimately, this should allow us to gauge the effects of the FI + CLIL programme, and examine whether it yields differential relative linguistic gains in EFL written production,

* Corresponding author. Tel.: +34 935421138. E-mail addresses: [email protected] (C.P. Vidal), [email protected] (H. Roquet).
1 Tel.: +34 935422409.
http://dx.doi.org/10.1016/j.system.2015.05.004
0346-251X/© 2015 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Vidal, C. P., & Roquet, H., The linguistic impact of a CLIL Science programme: An analysis measuring relative gains, System (2015), http://dx.doi.org/10.1016/j.system.2015.05.004


C.P. Vidal, H. Roquet / System xxx (2015) 1e11

written and oral comprehension and lexico-grammatical abilities. To this end, Section 2 offers a brief appraisal of the background to CLIL programmes, Section 3 describes the Science study, Section 4 presents its results, Section 5 discusses them, and, finally, conclusions are drawn in Section 6.

2. Literature review

CLIL has been defined as a new idiosyncratic development in modern European educational policies vis-à-vis languages, instruction, and European citizenship. The European motto defining it reads ‘United in diversity’, a reference to the linguistic and cultural diversity of its 27 member states with 23 different languages (European Commission, 1995, 2005, 2007, 2008, 2012). In CLIL, a language different from the domestic language is used as the medium of instruction for curricular subjects at primary and secondary levels of education. In our view, four interrelated features are intrinsic to such programmes: i) the fact that an additional or foreign language, for both the teachers and the learners, is used as the medium of instruction; ii) the fact that the culture of the classroom and the curriculum remains that of the L1, as do classroom communication practices (Johnson & Swain, 1997); iii) the international ethos which such an educational option confers on the classroom (see Author 1, 2015 for a detailed presentation of such a view); iv) the unsurprising fact that such programmes would not have been possible without a robust policy behind them, as already mentioned, something which underscores the fact that CLIL is an educational approach, not a simple ‘methodology’. Other authors have identified somewhat similar sets of features, for example Lasagabaster and Sierra (2010), who refer to language of instruction, teachers, starting age, teaching materials, language objectives, inclusion of immigrants and research; a list which has subsequently been further qualified (Llinares, Morton, & Whittaker, 2013).
Undoubtedly, as a result of such a multifaceted nature and in spite of being still very much in its infancy (Eurydice, 2008), the European CLIL approach to education has already been on several agendas, as has previously been discussed (Author 1, 2013, 2007). Besides being on the political and the educational agenda, CLIL has also been on the social agenda, as many families at the end of the 1990s had pinned their hopes on another main tenet of the European policy vis-à-vis languages: the early introduction of foreign languages. Seeing it fail, certainly in contexts such as Spain (García Mayo & García Lecumberri, 2003; Muñoz, 2006), they placed their hopes in CLIL as a possible solution to poor standards in foreign languages. CLIL has also been on the research agenda, holding enormous potential for the field of Second Language Acquisition (SLA) research. Indeed it is proving to be nearly as prolific as the Canadian immersion programmes on which it was modelled, particularly in its initial stages. The present monographic issue is a renewed effort to further contribute to such a raft of studies (see for example the volumes published only in Spain over half a decade: Abello-Contesse, Chandler, López-Jiménez & Chacón-Beltrán, 2013; Alcón & Michavila, 2012; Cenoz, 2009; Dafouz & Guerrini, 2009; Escobar Urmeneta, Evnitskaya, Moore, & Patiño, 2011; Lasagabaster & Ruiz de Zarobe, 2010; Llinares et al., 2013; Lorenzo, Casal, de Alba, & Moore, 2007; Ruiz de Zarobe & Jiménez Catalán, 2009; Ruiz de Zarobe, Sierra, & Gallardo del Puerto, 2011), with the remaining countries in Europe not falling short of publications either, such as those from Austria (Ackerl, 2007; Dalton-Puffer, 2007, 2008, 2011); Belgium (Van de Craen, Ceuleers, Mondt & Allain, 2008); Finland (Nikula, 2007); Germany (Zydatiss, 2007, 2012); the Netherlands (Admiraal, Westhoff, & de Bot, 2006); Norway (Hellekjaer, 2010); Sweden (Sylvén, 2004), to name but a few.

If we now turn to a quick overview of the existing consensus on the linguistic benefits of CLIL, general results seem to be by and large positive, although there are aspects which are either unaffected by CLIL or for which research is non-existent or inconclusive, namely syntax, productive vocabulary, written accuracy, discourse skills and pragmatic efficiency (although see Llinares et al. 2013; Whittaker & Llinares, 2009), and pronunciation, that is, degree of foreign accent. Such a positive impact has generally been attributed to higher quantity and quality of exposure. However, methodological issues are still unresolved in CLIL research and subject to debate, a debate to which this article seeks to contribute, as several key variables affecting such positive results remain by and large underexplored (Author 1, 2013; Muñoz, 2012, 2015).

Focussing specifically on the skills examined in this study, that is writing, reading, listening and lexico-grammatical abilities, Ruiz de Zarobe et al. (2011) summarized mixed results in the literature, with fluency and complexity improving, but not accuracy and discourse (Dalton-Puffer, 2011; Escobar, 2004; Lasagabaster, 2008; Moore, 2009; Ruiz de Zarobe, 2008; Zydatiss, 2007). Whittaker and Llinares (2009) also attested good results in writing. Ruiz de Zarobe (2008) emphasized the potential contrasting CLIL versus non-CLIL effects on reading, which are also rather evident with respect to listening in a positive direction for CLIL (Lasagabaster, 2008). Of special interest for our data is the study by Victori and Vallbona (2008), whose CLIL 6th graders were specifically better at listening, measured by means of a dictation task. Receptive vocabulary also seems to clearly improve (Jexenflicker & Dalton-Puffer, 2010; Zydatiss, 2007).
This is not the case for lexico-grammatical abilities regarding some morphological phenomena, such as the use of null subjects, negation and suppletive forms (Martínez Adrián & Gutiérrez Mangado, 2009), which are not reported to improve, while, in contrast, sentence complexity and affixal inflection do seem to improve with CLIL (Dalton-Puffer, 2007), as does morphosyntax (Lazaro Ibarrola & García Mayo, 2012).

Against such a mixed-findings backdrop, the current article aims at providing new empirical evidence with a study adopting a pre-test, post-test design, in which an experimental group is measured against a control group, both matched regarding number of hours of exposure. This should allow us to address the following research question:

Will the GA group experiencing a FI + CLIL Science programme at secondary education level obtain higher relative linguistic gains over one year than the GB group which experiences FI only, when hours of instruction are matched?


Table 1. Data elicitation tasks.

Domain | Tests | Tasks
1. Productive abilities | Writing test | Task 1: Composition
2. Receptive abilities | Reading test | Task 2: Text completion
2. Receptive abilities | Listening test | Task 3: Dictation
3. Lexico-grammatical abilities | Grammar test | Task 4: Sentence completion task (DCT); Grammaticality judgment tasks (GJT)

Our prediction is that the GA: FI + CLIL group will show larger relative gains than the GB: FI group in all the skills analysed, that is writing, reading, listening comprehension and lexico-grammatical abilities.

3. The Science study

The Science study presented here deals with the contrastive effects of two different educational programmes applied in the same school, respectively experienced by two EFL early adolescent learner groups. One educational programme includes two different learning contexts, that is, a CLIL context and a FI context, both of which learners experience over the course of the study. The other educational programme includes a FI context of learning with no CLIL. The study compares the gains obtained by an experimental group of learners following the CLIL + FI programme, GA, with those obtained by a group following the FI only programme, GB, the control group. It takes a fresh look at the results obtained in an existing Science study (Author 1 & Author 2, 2015; Author 2, 2011). With this new approach the current study presents new results concerning the relative language proficiency gains in productive and receptive abilities obtained by GA: FI + CLIL and GB: FI. It calculates CLIL effects in a descriptive manner, by comparing the percentage of gains for each ability. In this section the materials and methods of the study are presented.

3.1. Materials and methods

The materials, namely the tasks, and a description of how they were used for the data elicitation and collection procedure, the participants and the design of the study are described in detail below.

3.1.1. Data collection materials

As shown in Table 1, three domains of EFL proficiency were tapped into in this study: 1. Productive abilities, 2. Receptive abilities, and 3. Lexico-grammatical abilities.
Productive abilities were measured by means of a written test (see footnote 2). Receptive abilities were measured by means of a reading comprehension test and a listening comprehension test. Lexico-grammatical abilities were examined by means of a grammar test. An important issue to be considered here is the distinction between tests and tasks. In this study we follow Macaro (2010), who has emphasized:

…What constitutes a test as opposed to a task. In part the distinction is between measures intended to assess language proficiency, and measures intended to explore issues in second language acquisition. If the measure is referred to as a test, the implication is that there is a certain level of rigour, established through piloting the measure and establishing its reliability and validity. A task on the other hand has not necessarily undergone such rigorous scrutiny, in terms of reliability and validity. Also the results from its use will probably have a different purpose than for a test. Having said this, in reality the purpose for both tests and tasks may be identical. So there is a problem with the nomenclature. (Macaro, 2010: 121).

In line with this idea, we argue that our elicitation instruments are tasks, chosen for their ecological validity, as they are the type of tasks used both in the CLIL and the FI classroom. However, the umbrella term ‘test’ is also adequate for our purposes, since we use them to analyse SLA, as Table 1 reflects. As has previously been emphasized, meaningful tasks are key in organising FI conducive to language learning (Foster & Skehan, 1996; Tavakoli & Foster, 2008). In the case of Content Based Instruction (CBI) or CLIL programmes, meaning-focused instruction which is based on curricular content provides ample opportunities for employing meaningful tasks in the classroom. In what follows the tasks used to test the abilities examined in this study are described, together with their administration procedure.

2 No data on oral proficiency were available in the Science study due to the constraints of data collection within the school where the study was conducted.

3.1.1.1. Productive task - Writing: Composition

In order to gauge production in writing, participants were administered a composition task based on a picture. It showed two policemen, a mother and a boy at the entrance door of a home. Beneath the picture there were 3 lined boxes. Participants were shown the picture and then they were asked to use each of the three boxes in turn in order to: i) “Imagine and write the dialogue between the policemen and the mother and the child” (20 lines); ii) “Answer the following two questions (10 lines per question in one box respectively), in the space of 20 min”:

Why did this happen? How do you think the situation will end?

The choice of a composition was based on the number of subskills that come into play when learners write a piece of text. This particular material was chosen because it was thought that the young boy in the picture would allow for a process of identification on the part of the learner (Tavakoli & Foster, 2008). The task thus required them to use different genres, i.e. a dialogue and two very short narratives.

3.1.1.2. Receptive task - Reading: Text completion

In order to collect the data for the measurement of gains in reading comprehension skills, we used a text completion task, designed following a ‘cloze’ procedure. It contained 20 blanks which participants had to fill in by choosing the correct option out of 4 given in a multiple choice format. The text dealt with tsunamis, a topic studied in the Science subject and therefore familiar to GA: FI + CLIL as far as vocabulary and structures are concerned, but not to GB: FI. In the instructions participants were told to: “Choose the adequate word in the list to fill in each blank by circling the right answer in your answer sheet. Take your time to read the text once from beginning to end before you start answering.” Cloze tests measure the reader's ability to decode interrupted or mutilated messages by making the most acceptable substitutions from all the contextual clues available (Weir, 1998: 50), while avoiding the particular difficulty of asking learners to deploy written abilities while dealing with text comprehension.

3.1.1.3. Receptive task - Listening: Dictation

A dictation was read aloud to the participants in order to measure oral comprehension skills by tapping into their abilities to memorize short strings of language, recall them and write them correctly (Hughes, 2003). It consisted of a text dealing with Antarctica. Again it was a familiar topic related to the Science subject for GA: FI + CLIL but not for GB: FI. The following anxiety lowering instructions were given (Dörnyei, 2005): “We shall now do a dictation. Listen carefully and write down what you hear, bit by bit, as your teacher will stop at intervals. If you don't understand something do not worry. Write down what you can, and leave a blank space when you have doubts so that you can fill in that blank upon the second reading. Sure you will do well!” Fluent native speakers nearly always score 100% on a well administered dictation while non-native learners make errors of omission, insertion, word order, inversion, etc., indicating that their internalised grammars are, to some extent, inaccurate and incomplete; that they do not fully understand what they hear and that what they re-encode is correspondingly different from the original (Oller, 1979; Weir, 1998).

3.1.1.4. Lexico-grammatical ability task: Sentence completion and grammaticality judgements

In order to obtain data on lexico-grammatical ability, participants were administered a test with two differentiated parts. In the first one learners had to carry out a Sentence Completion task (SCT), for which they were required to provide preference judgements, i.e. select the appropriate sentence among those provided, or the appropriate string to complete an isolated sentence. It included 30 items. The sentences were ordered according to three progressive degrees of difficulty. Testees were also given encouraging instructions to proceed to each subsequent part of the task, from “Surely you will be able to answer these questions”, and then, “Try now with these”, to “The following ones are a bit more difficult, but surely you can try and answer some”. In the second part participants were given 20 sentences whose grammaticality they had to judge. Widespread use is made of Grammaticality Judgement tasks (GJT) in L2 acquisition research as they offer quick clear data on learners' grammars in context (see for example García Mayo, 2003; Ionin, 2011).
Regarding data collection, it took place in an exam-like situation and was handled by the class teachers due to institutional conventions, with no prior notice given to participants. All instructions in the test booklet were written in Catalan, the main language of the participants. They were given two hours, in two separate class sessions, to complete the tasks (with some free time after each task).


3.2. Participants and design

The two groups examined in this study were Catalan/Spanish bilingual EFL adolescents following secondary education, as shown in Table 2. The school is located in an area of high socioeconomic status. The learners had been together in the school since nursery, and for both the experimental group, GA: FI + CLIL (N = 50), and the control group, GB: FI (N = 50), age of onset of EFL instruction was 5/6; hence their respective 1330 and 1260 accumulated hours of instruction at the onset of the study placed them at an upper intermediate level. Each group was 50% male and 50% female. The school has a high stakes multilingual profile and had taken great care in the design and implementation of the CLIL programme (see Author 1 & Author 2, 2015). The programme was intended for the entire student population in the school, hence no bias in the profile of the CLIL group should be expected. In fact, this study was made possible because the CLIL programme was being introduced in the school in stages, so that the GA and the GB groups coexisted over one year.

The study has a longitudinal pre-test/post-test design, as Table 2 shows. Both groups of learners were measured before and after one academic year, that is at time 1 (T1) and time 2 (T2) respectively, in order to tap into relative gains obtained over the course of that year. Data collection at pre-test took place at the end of GA: FI + CLIL's first year of secondary education (Grade 7), that is, at 13 years of age. They had then had 8 years of exposure to the target language. However, since the age of 10 (Grade 5) they had in addition received the extra CLIL hours. Thus, their accumulated time of EFL exposure at pre-test (T1) amounted to 1330 h, which included 1120 h of FI and 210 h of CLIL. At pre-test (T1), the controls GB: FI, in turn, were measured in Grade 8 when they were 14 years old, also at the end of the academic year.
They had then had 9 years of FI, and a total number of accumulated hours of 1260. At post-test (T2) the experimental GA: FI + CLIL had had 9 years of exposure to English and 4 to CLIL, and an accumulated 1540 h, 1260 of FI and 280 of CLIL. In turn, the controls GB: FI were measured in Grade 9 when they were 15 years old, also at the end of the academic year. They had then had 10 years of FI when data were collected a second time (T2), and a total number of accumulated hours of 1400. In sum, over the year under study, the experimental group had received 210 h of exposure to English (140 FI + 70 CLIL), in contrast with the control group, who had received 140 h (FI).

The design described above made sure that the respective accumulated hours of exposure to English for the experimental and the control groups were as similar as possible both at pre-test (T1) and at post-test (T2), although for GA: FI + CLIL some of the hours were CLIL hours. This made it possible to compare the effect of the hours over one year. The quantity of hours being relatively similar and the quality being different, any contrast in the gains obtained by each group over a year was expected to reveal whether or not the CLIL hours have a significantly higher positive effect on learners' linguistic progress than the FI hours. However, a relevant factor appears here: not only quality but also distribution, that is intensity, is different for both groups, and in favour of GA: FI + CLIL, as more hours in the course of the same number of weeks means more hours per week: the experimental group is exposed to English for 7 h weekly, while GB: FI receives 4.6 h of EFL exposure weekly (Muñoz, 2015), an issue further discussed below.

Meeting the above methodological requirement to measure CLIL effects on linguistic progress made inter-group comparisons possible, however, at the expense of not matching the research groups exactly for age at pre-test. Otherwise, had we kept age constant we would have created a disadvantage for GB: FI in terms of time of exposure to English (Muñoz, 2012; Serrano, 2012). This entailed that at pre-test the latter group included learners who were a year older than the former, as explained above and as Table 2 displays, 14 versus 13, but both still within the same educational level, lower secondary. In return, GB: FI had had a total number of hours of EFL exposure of 1260 h, a highly comparable figure to that of the GA: CLIL + FI, who had 1330, at only one year distance in age (Muñoz, 2006; Singleton, 2005).

3.3. Analyses

In this study, in order to proceed to the new analyses of the relative proficiency gains obtained by both groups for each task, we have drawn on the results of an existing study conducted with the same corpus, which provided us with the raw gains made by GA: FI + CLIL and GB: FI (Author 1 & Author 2, 2015; Author 2, 2011). What we aim to do here is take a step backward with respect to those raw gains and measure relative gains, that is, the percentage of gains made by each group in any given skill in relation to their point of departure at the onset of the study: whether they have improved 5, 10, 15 percent or more with respect to T1. By so doing, we will obtain a point of reference to compare how much progress each group makes over one year.

Table 2. Participants and design.

Group | Onset of instruction | Pre-tests (T1) | Post-tests (T2) | T2-T1 hours
GA: FI + CLIL (N = 50) | FI: Nursery (5/6 yrs); CLIL: Grade 5 (10 yrs) | Grade 7 (12/13 yrs); FI: 1120 h + CLIL: 210 h = 1330 h | Grade 8 (13/14 yrs); FI: 1260 h + CLIL: 280 h = 1540 h | 210
GB: FI (N = 50) | FI: Nursery (5/6 yrs) | Grade 8 (13/14 yrs); FI: 1260 h | Grade 9 (14/15 yrs); FI: 1400 h | 140
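The hour totals in Table 2 and the weekly intensity figures reported in Section 3.2 can be cross-checked with a short sketch. The hour values are those given in the table; the 30-week teaching year is an assumption, inferred from the 210 h and 7 h per week reported for GA, and the helper names are hypothetical.

```python
# Accumulated hours of instruction per group and test time, as in Table 2.
HOURS = {
    "GA": {"T1": {"FI": 1120, "CLIL": 210}, "T2": {"FI": 1260, "CLIL": 280}},
    "GB": {"T1": {"FI": 1260}, "T2": {"FI": 1400}},
}

# Assumption: 30 teaching weeks per year (210 h / 7 h per week for GA).
WEEKS_PER_YEAR = 30


def total_hours(group: str, time: str) -> int:
    """Sum FI and, where present, CLIL hours accumulated at a test time."""
    return sum(HOURS[group][time].values())


def hours_over_year(group: str) -> int:
    """Hours of exposure received between pre-test (T1) and post-test (T2)."""
    return total_hours(group, "T2") - total_hours(group, "T1")


def weekly_intensity(group: str) -> float:
    """Average hours of exposure per week over the year under study."""
    return hours_over_year(group) / WEEKS_PER_YEAR
```

Under the 30-week assumption, GA's 210 h correspond to 7 h weekly and GB's 140 h to roughly 4.67 h weekly, in line with the 4.6 h figure given in the text.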


Different procedures were used to obtain the raw results from which the percentages of gains in this study have been drawn. The data collected from the composition task were transcribed using the CLAN programme. They were then analysed quantitatively for the conventional domains of Complexity, both Grammatical and Lexical, Accuracy and Fluency (CAF) (Housen, Kuiken, & Vedder, 2012; Wolfe Quintero, Inagaki & Kim, 1998). Then they were also analysed qualitatively with a rating scale (Friedl & Auer, 2007) whereby Task Fulfilment, Organisation, Grammar and Vocabulary features were measured taking into account 6 behavioural levels on a scale of 0 (not enough to evaluate) to 5 (very good). The data on progress made in receptive skills, including the text completion task and the dictation task, were corrected using objective criteria with a correcting profile. Finally, the data on lexico-grammatical abilities yielded a frequency count of correct/incorrect items. An average score across both tasks was thus obtained in order to calculate linguistic progress for both specific competence dimensions analysed. Results were entered into a Stats Graphic matrix, and the formulae for each ratio were calculated. Finally, mean results for all measures per group were drawn and compared with an ANOVA statistical analysis, with the significance level set at p < 0.05. The results of these analyses have been reported in Author 1 and Author 2 (2015) and in Author 2 (2011), as already mentioned. Hence, when we present the relative gains in what follows, we will be able to refer to whether the comparisons of the raw results of GA: FI + CLIL and GB: FI had resulted in any significant differences in the preceding studies. For the purpose of the new analyses of the relative proficiency gains presented here, the percentage of gains in the three domains scrutinized with respect to Time 1 has been calculated on the basis of such raw results.
Hence, we are not aiming here at a statistical result, but rather at describing the proportion of gain, which is in principle unbounded: one could gain anything from nothing to 1000 times what one had at a specific point in time. The results of these analyses are presented in Section 4 below.

4. Results

In this section we address the research question stated above, namely, whether a FI + CLIL programme brings about higher relative linguistic gains over one year than a FI programme.

4.1. Relative gains in production abilities: The written composition task

This first set of results contrasting relative gains for GA: CLIL + FI and GB: FI taps into the participants' written skills by means of a composition on a general issue, not related to the CLIL subject Science, which hence both groups have practised in the EFL classroom. The compositions are analysed quantitatively for CAF, and qualitatively following Friedl and Auer (2007) to tap into the domains of Task Fulfilment, Organisation, Grammar and Vocabulary. The contrast between both groups is hence made clear by means of the relative proficiency gains, as shown in Fig. 1. In order to avoid one of the most serious weaknesses of holistic methods, subjectivity, the evaluation with holistic measures was carried out by two specialists following the same procedure

Fig. 1. Results for written abilities (% of gains).


(McNamara, 2000). The two specialists started evaluating the compositions together so as to find common criteria to set up the level. Once an agreement had been reached, they proceeded with the evaluation separately and met once a week to check the results. At the beginning it was established that if the difference in the scores for the same student was higher than 2 (e.g. one evaluator gave a 4 for vocabulary and the other evaluator a 1) the evaluation would have to be repeated. However, this was never the case, since results were always the same or differed by only one point. In this last case, the final result was the average value (i.e. if one evaluator gave a 4 for organisation and the other one a 2, the result was a 3).

Written Fluency is measured by means of Total number of Words (W) written in the space of 20 min. We follow Wolfe Quintero et al. (1998), who use number of words as a general measure of fluency. These authors define fluency as more words and more structures being accessed in a limited time, whereas a lack of fluency means that only a few words or structures are accessed (Wolfe-Quintero et al., 1998: 14). Results show that both groups experience a small decrease with respect to T1, albeit more so in the case of GB: FI. GA: FI + CLIL decreases by 0.7%, while GB: FI goes down by 2.9%. The statistical analyses of mean raw gains had shown that this decrease was not significant in either group (Author 1 & Author 2, 2015). Hence, neither group makes progress in Fluency and consequently no relative gains can be calculated.

Written Accuracy is examined by measuring the number of Errors per Word (E/W). Results show that both groups experience an increase in the percentage of gains obtained, albeit more so in the case of GA: FI + CLIL, with a 35% increase, in contrast with the 6.5% experienced by GB: FI. Hence both groups write with fewer errors, but GA: FI + CLIL clearly more so than GB: FI.
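The double-rating procedure described above for the qualitative scale can be summarised as a simple decision rule. The sketch below is only an illustration under the assumptions stated in the text (band scores from 0 to 5, re-evaluation when the two raters differ by more than 2 points, averaging otherwise); the function name `reconcile_ratings` is hypothetical.

```python
from typing import Optional


def reconcile_ratings(score_a: int, score_b: int, max_gap: int = 2) -> Optional[float]:
    """Combine two raters' band scores for one learner on one dimension.

    Returns the average of the two scores, or None when the gap between
    them exceeds max_gap and the composition has to be evaluated again.
    """
    for score in (score_a, score_b):
        if not 0 <= score <= 5:
            raise ValueError("band scores are expected on a 0 to 5 scale")
    if abs(score_a - score_b) > max_gap:
        return None  # discrepancy too large: repeat the evaluation
    return (score_a + score_b) / 2
```

For instance, scores of 4 and 2 for organisation are averaged to 3, whereas scores of 4 and 1 for vocabulary would trigger a new evaluation.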
Previous analyses of mean gains had shown that the difference in gains between both groups reached significance (Author 1 & Author 2, 2015).

We now move to the results for Complexity. Syntactic Complexity is measured by means of the coordination index (CI), which establishes the relative weight of subordination with respect to coordination: the lower the resulting figure, the higher the amount of subordination. It is calculated by dividing the number of coordinate clauses by the number of combined clauses (coordinate + subordinate). The results reveal that GA: FI + CLIL experiences an increase of 2.5% while GB: FI experiences a decrease of 4.2%, which implies that the relative improvement of GA: FI + CLIL in the use of subordination is clearly higher than that of GB: FI. Finally, Lexical Complexity is measured by means of the Guiraud Index, which is calculated by dividing the total number of word types by the square root of the number of tokens, in order to take text length into account. In this measure, GA: FI + CLIL's relative gains of 3.2% are clearly below GB: FI's 6.6% increase. Previous analyses of mean gains with the raw data had shown that GB's increase nearly reached significance (Author 1 & Author 2, 2015).

Turning to the qualitative results for the compositions, GA: FI + CLIL also experiences higher relative gains than GB: FI in Text Organisation, with 27.5% versus 15.30% respectively; Grammar, with 26.1% versus 8.30%; Task Fulfilment, with 14% versus 10.80%; and Vocabulary, with 12.6% versus 9.10%, where the difference is smaller than for the other measures, in agreement with the results for CAF. It must be noted that prior analyses with the raw measures had shown a tendency for GA to overtake GB in the qualitative measures, albeit not reaching significance (Author 1 & Author 2, 2015).
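The agreement rule followed by the two raters for the qualitative scores can be stated compactly. The Python sketch below is illustrative only: the function name `reconcile_scores` and the parameter `max_gap` are hypothetical, and the rule is simply the one described in the text (a difference higher than 2 would have triggered a repeated evaluation; otherwise the two scores are averaged).

```python
from typing import Optional


def reconcile_scores(rater_a: int, rater_b: int, max_gap: int = 2) -> Optional[float]:
    """Combine two raters' scores for one student on one scale.

    Returns None when the difference is higher than max_gap, signalling that
    the evaluation would have to be repeated; otherwise returns the average.
    The threshold of 2 is taken from the text; the names are hypothetical.
    """
    if abs(rater_a - rater_b) > max_gap:
        return None  # re-evaluation needed
    return (rater_a + rater_b) / 2
```

With the examples given in the text, scores of 4 and 2 for organisation yield 3, whereas scores of 4 and 1 for vocabulary would call for a repeated evaluation.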
In sum, the relative CAF gains and the gains in the qualitative measures in written production obtained by GA: FI + CLIL with respect to its proficiency level at T1 are clearly superior to those of GB: FI as far as Accuracy and Syntactic Complexity are concerned, and less so as regards Fluency, where both groups experience a small loss, as shown in Fig. 1. In contrast, GB experiences larger gains in Lexical Complexity. Regarding the qualitative measures, GA obtains higher gains than GB right across the board.
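For readers wishing to replicate the quantitative measures, the indices defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the Guiraud Index is computed here over lower-cased, whitespace-separated tokens (a simplification), and `relative_gain` assumes that the percentage of gains is the change from T1 to T2 expressed as a percentage of the T1 value, a formula which is not given in this section.

```python
import math


def errors_per_word(num_errors: int, num_words: int) -> float:
    """Accuracy (E/W): errors divided by words; a lower value means fewer errors."""
    return num_errors / num_words


def coordination_index(coordinate: int, subordinate: int) -> float:
    """CI: coordinate clauses divided by combined (coordinate + subordinate) clauses."""
    return coordinate / (coordinate + subordinate)


def guiraud_index(text: str) -> float:
    """Word types divided by the square root of tokens.

    Assumption: tokens are lower-cased, whitespace-separated strings.
    """
    tokens = text.lower().split()
    return len(set(tokens)) / math.sqrt(len(tokens))


def relative_gain(t1: float, t2: float) -> float:
    """ASSUMED formula: change from T1 to T2 as a percentage of the T1 value."""
    return 100.0 * (t2 - t1) / t1
```

Note that for E/W and CI a lower figure is the desirable outcome, so under the assumed formula a raw decrease would have to be reported with the sign reversed in order to read as a gain.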

4.2. Relative gains in receptive abilities: The text completion task and the dictation task

The second set of results, contrasting relative gains for GA: FI + CLIL and GB: FI, taps into the participants' receptive abilities, both reading, by means of a text completion task, and listening, by means of a dictation task, as shown in Fig. 2.

Fig. 2. Results for receptive skills and lexico-grammatical abilities (% of gains).

Please cite this article in press as: Vidal, C. P., & Roquet, H., The linguistic impact of a CLIL Science programme: An analysis measuring relative gains, System (2015), http://dx.doi.org/10.1016/j.system.2015.05.004


In this domain, the results show that GA's relative gains in the text completion task, 11.8%, clearly contrast with GB's 1.5%, as Fig. 2 displays. GA participants achieve greater gains than GB participants in reading abilities over the course of one year. This is an area in which previous analyses had revealed a significant benefit of the FI + CLIL programme over the FI-only programme (Author 1 & Author 2, 2015). Regarding the dictation task, similar relative gains seem to accrue for both groups, with GB gaining 2.8% and GA 2.5%. If we compare these results across the two receptive abilities, reading and listening, we see, as Fig. 2 makes evident, that GA's relative gains over the course of one year in the text completion task and in the dictation task are in clear contrast with each other.

Regarding the tasks used for data elicitation, several considerations must be made which should throw some light on these mixed results. The use of text completion tasks based on a cloze procedure with a multiple-choice technique poses no practical problems, but this is not the case for a dictation task. One problem which needs to be stressed is the intricacy of scoring a dictation task. In this study, correct spelling was not required for a response to be scored as correct, since it was meaning comprehension rather than form (spelling) that was meant to be tested. However, it was not enough for our participants to attempt a representation of the sounds that they heard without making sense of those sounds. The difficulty lay in knowing whether, irrespective of the spelling being correct or not, the learner had recognised the word, assigning to those particular sounds the sense that the specific context required. For example, “(…) the british explorer (…)” was scored as correct because the meaning of British was understood.
However, in “(…) he was to late (…)” we did not really know whether the mistake was due to the fact that the learner did not know the meaning of too, so that correct comprehension was impossible, or whether he/she did not know how to spell it. In this particular case, it was considered a mistake.

4.3. Relative gains in lexico-grammatical abilities: The discourse completion task and the grammaticality judgement task

The third set of results, contrasting relative gains for GA: FI + CLIL and GB: FI, taps into the participants' general grammatical abilities, as measured with a SCT and a GJT. GA's relative gains, 7.3%, are in clear contrast with GB's 0.77%, as Fig. 2 displays. This is an area in which previous studies had revealed a significant benefit of the FI + CLIL programme over the FI-only programme (Author 1 & Author 2, 2015). We now turn to the discussion of results in order to address the research question in this study.

5. Discussion

The results presented above allow us to examine whether the experimental group, GA: FI + CLIL, makes higher relative gains in EFL than the control group, GB: FI, in the course of one academic year with respect to their onset level at T1. A summary of the results is as follows. Regarding written productive skills across all quantitative and qualitative measures, GA shows higher relative progress than GB in Accuracy, and a small but somewhat higher relative progress than GB in Syntactic Complexity, although the opposite is true for Lexical Complexity. Both groups experience a loss in Fluency. Larger gains of over 25% are also displayed for GA in Accuracy, Grammar and Text Organisation. These are domains in which GB makes much less relative progress. Benefits of between 10 and 15% accrue in Task Fulfilment and Vocabulary for GA, but at a lesser distance from GB. These results seem to be in agreement with Llinares et al. (2013) and Whittaker and Llinares (2009).
GA has also made higher relative gains in lexico-grammatical abilities (Escobar, 2004; Lasagabaster, 2008; Ruiz de Zarobe, 2008; Zydatiss, 2007). As for the remaining two domains of proficiency scrutinized, relative gains in reading, as prior studies had found (Dalton-Puffer, 2011; Ruiz de Zarobe, 2008), are clearly higher for the GA participants. Relative gains in listening, however, are not higher, this time in contrast with the results from prior studies (Lasagabaster, 2008; Victori & Vallbona, 2008).

The above summary allows us to establish that our hypothesis is only partially confirmed. It also allows us to answer our research question by stating that the GA: FI + CLIL participants have improved to a larger degree than the GB: FI group in relative gains as far as their ability to write a more purposeful, better organised, more accurate and syntactically more complex text is concerned, and this irrespective of the fact that the test they were administered did not deal with Science topics, which GA: FI + CLIL had studied but GB: FI had not. The same is true for the grammar tests, which deal with general topics and language. We can additionally state that GA: FI + CLIL also shows much higher relative gains in reading, when the test does deal with tsunamis, a topic studied in the Science class. However, the same is not true for the listening test, in spite of the fact that it deals with Antarctica, a topic also dealt with in the Science class. The answer to our research question is thus only partially affirmative, as GA: FI + CLIL shows higher relative gains in the productive skill measured but, concerning receptive abilities, only reading improves, and listening does not. Consequently, and most importantly, results seem to uncover the fact that, in the case of the sample analysed in this study, the CLIL programme benefits specific language domains practised in the Science subject, although not across all skills, while at the same time it also benefits general language abilities.
Several issues may be brought into focus in a discussion of such findings, namely the quantity of time of exposure to the target language, that is EFL, and its distribution; the type of tasks used to measure the impact of a CLIL programme; and the quality of time.

In relation to the issue of intensity of exposure and practice in the target language, as already mentioned in Section 3.2 above, GA: FI + CLIL has had 210 h of instruction including FI and CLIL over one year, that is, seven hours per week, and GB: FI has had 140 h of FI over one year, that is, four and a half hours per week, hence a lower-intensity programme. It must be borne in mind, however, that even if of a lower intensity, the GB programme is far from the so-called ‘drip-feed’ programmes of 2–3 h a week (Stern, 1985). As has been emphasized in the second language acquisition literature, intensive courses have been shown to be highly effective for students' language development. Consequently, the higher intensity of exposure experienced by GA might explain why its relative gains in the composition task, the reading task and the lexico-grammatical task are higher than GB's gains. In this respect Serrano (2012) has emphasized:

In general, research conducted on time distribution and SLA suggests a certain degree of benefit in concentrating the hours of instruction instead of spreading them over long periods of time. This benefit is clearer when the comparison includes intensive programs that offer more hours of instruction than regular ‘drip-feed’ programmes. However in this comparison we cannot know whether it is time increase or time concentration (or both) that is causing the effect (p. 16).

Concerning task typology, a further explanation for our positive results for GA might derive from the very nature of the tests used in this study, which are all integrative and comprehension-based, while, in contrast, when production of fine-grained morphological features is analysed, no CLIL versus non-CLIL differences have been reported (García Mayo & Villarreal Olaizola, 2011).

Finally, regarding quality of exposure to EFL, from the perspective of the psycholinguistic literature it has been argued that CLIL contexts allow for practice which is meaning-oriented. This is in contrast with FI, in which practice most commonly focuses on form, unless very committed communicative approaches to language teaching are adopted. Meaning-oriented practices are supposed to provide adequate opportunities for communication, in which negotiation of meaning and plenty of opportunities for output practice spur linguistic development (Sanz, 2014).

Given these three factors, what explains GA's advantage over GB requires further exploration. Quite probably, experimental methods might help us disentangle our results.
Interestingly, in this respect, one of the most relevant findings in this study, in our view, lies precisely at the interface between meaning and form, namely the higher results in Accuracy, Grammar, and lexico-grammatical abilities experienced by the group following the FI + CLIL programme. The significant improvement found in the area of accuracy in the writing skill and in lexico-grammatical abilities is a rather surprising finding. Opposite results were obtained by the empirical studies carried out in Canada and Europe. In Canada, this led to a concern for fostering accuracy, as proposed by Harley, Allen, Cummins, and Swain (1990), and more recently Lyster (2007). A recent appraisal of this issue underscores the need for FI:

Studies in Canada have (…) demonstrated that the effectiveness of immersion depends on a combination of factors, including amount of exposure to the second language, the age of the learners and pedagogical approach. There has been increased attention in recent years to pedagogical issues and, in particular, the effectiveness of more systematic and explicit instruction in linguistic structures. (Genesee, 2013: 39)

Our positive results might be explained by transfer of knowledge and skills from the EFL FI context to a CLIL context, where meaningful practice provides extra opportunities for linguistic development to take place. Indeed, grammar abilities are very often practised in the FI context. In the literature exploring the impact of different learning contexts, such as CLIL and FI, the effect of practice on skill development has been discussed from the perspective of how learners transfer knowledge from one context into the other, as more often than not students experience combinations of contexts (Author 1, 2015; DeKeyser, 2007, 2014).

6. Conclusions

This exploratory classroom study has looked into the development of general language proficiency as measured through relative percentage of gains, with the emphasis placed on skill development. Our findings have shown that adolescent EFL learners undergoing a well-established programme of 210 h including FI and CLIL, 140 + 70 respectively, over one year, progress relatively more than learners following an FI-only programme for 140 h. More specifically, they reveal that relative gains accrue particularly in an integrative written production task in the form of a composition and in a grammar task in two parts, involving a SCT and a GJT respectively, neither of them dealing with a Science topic. The FI + CLIL learners also make higher relative gains in a reading task dealing specifically with a Science topic. We can conclude by saying that, in the case of our participants, the CLIL programme allows those following it to make progress in their general use of English, and in particular grammar, as well as in domain-specific uses of the target language, and that relative gains are larger than those of the non-CLIL learners for all skills measured, except listening and Lexical Complexity. The CLIL immersion context seems to have provided learners with the large amount of meaningful practice necessary for automatisation to take place (DeKeyser, 2007), however not in a linear manner, as has often been claimed to happen (Jessner, 2008).

One year difference at post-test, 14 and 15, 70 h of FI and 210 h of accumulated CLIL instruction separate both research groups; a highly comparable number of accumulated hours of instruction makes them similar, 1260 versus 1400. Whether the reason for the FI + CLIL programme advantage is quantity and intensity, task specificity, or quality of exposure deserves the attention of further research. It may be the case, as Genesee (2013), mentioned above, stresses, that it is ‘a combination of factors’ which explains ‘the CLIL advantage’.
Acknowledgements

This work was supported by the AGENCIA UNIVERSITARIA DE RECERCA (AGAUR), in Catalonia, [2014 SGR 1568]; and by the Ministry of Economy and Competitiveness [FFI2013-48640-C2-1-P]. We are grateful to the school for allowing us to collect data and to the participants, who, ultimately, made this study possible.


References

Abello-Contesse, C., Chandler, P., López-Giménez, M. D., & Chacón-Beltrán, R. (Eds.). (2013). Bilingual and multilingual education in the XXI century. Clevedon: Multilingual Matters.
Ackerl, C. (2007). Lexico-grammar in the essays of CLIL and non-CLIL students: error analysis of written production. ViewZ (Vienna English Working Papers), 16(3), 6–11.
Admiraal, W., Westhoff, G., & de Bot, K. (2006). Evaluation of bilingual secondary education in the Netherlands: students' language proficiency in English. Educational Research and Evaluation, 12, 75–93.
Alcón, E., & Michavila, F. (2012). La universidad multilingüe. Madrid: Tecnos.
Author 1. (2007). The need for focus on form in content and language integrated approaches: an exploratory study. Modelos y Prácticas en Eicle. [Special Issue]. Revista Española de Lingüística Aplicada (RESLA), 39–53.
Author 1. (2013). Perspectives and lessons from the challenge of CLIL experiences. In C. Abello-Contesse, P. Chandler, M. D. López-Giménez, & R. Chacón-Beltrán (Eds.), Bilingual and multilingual education in the 21st century (pp. 59–85). Clevedon: Multilingual Matters.
Author 1. (2015). Languages for all in education: CLIL and ICLHE at the crossroads of multilingualism, mobility and internationalisation. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based learning in multilingual educational environments (pp. 31–50). Berlin: Springer.
Author 1, & Author 2. (2015). CLIL in context: profiling language abilities. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based learning in multilingual educational environments (pp. 237–254). Berlin: Springer.
Author 2. (2011). Integrating content and language in mainstream education in Barcelona. A study of the acquisition of English as a foreign language. Germany: Lambert Academic Publishing.
Cenoz, J. (2009). Towards multilingual education. Basque educational research from an international perspective. Clevedon: Multilingual Matters.
Dafouz, E., & Guerrini, M. (2009). CLIL across educational levels, 3–17. Madrid: Richmond Publishing, Santillana.
Dalton-Puffer, C. (2007). Discourse in content and language integrated learning (CLIL). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Dalton-Puffer, C. (2008). Outcomes and processes in content and language integrated learning (CLIL): current research from Europe. In W. Delanoy, & L. Volkmann (Eds.), Future perspectives for English language teaching (pp. 139–157). Heidelberg: Carl Winter.
Dalton-Puffer, C. (2011). Content and language integrated learning: from practice to principles? Annual Review of Applied Linguistics, 31, 182–204.
DeKeyser, R. (2007). Study abroad as foreign language practice. In R. DeKeyser (Ed.), Practice in a second language: Perspectives from applied linguistics and cognitive psychology (pp. 209–226). Cambridge: Cambridge University Press.
DeKeyser, R. (2014). Research on language development during study abroad: methodological considerations and future perspectives. In Author 1 (Ed.), Second language acquisition in study abroad and formal instruction contexts. Amsterdam/Philadelphia: John Benjamins.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in second language acquisition. Mahwah, NJ: Lawrence Erlbaum.
Escobar, C. (2004). Content and language integrated learning: do they learn content? Do they learn language? In Proceedings of the XXI International Conference AESLA (pp. 27–38).
Escobar Urmeneta, C., Evnitskaya, N., Moore, E., & Patiño, A. (2011). AICLE-CLIL-EMILE Educació plurilingüe: Experiències, research & polítiques. Bellaterra: Universitat Autònoma de Barcelona, Servei de Publicacions.
European Commission. (1995). White paper on education and training, teaching and learning. Towards the learning society. Retrieved June 10, 2014 from http://ec.europa.eu/white-papers/#block_13.
European Commission. (2005). A new framework strategy for multilingualism. Retrieved June 10, 2014 from http://ec.europa.eu/languages/eu-languagepolicy/multilingualism_en.htm.
European Commission. (2007). Final report. High level group on multilingualism. Retrieved June 10, 2014 from http://ec.europa.eu/languages/orphans/highlevel-group_en.htm.
European Commission. (2008). A rewarding challenge. How the multiplicity of languages could strengthen Europe. Group of intellectuals for intercultural dialogue chaired by Mr. Amin Maalouf.
European Commission. (2012). The Bologna higher education area in 2012: Bologna process implementation report.
Eurydice. (2008). Key data on teaching languages at school in Europe. Special Eurobarometer 243. Brussels: European Commission.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language performance. Studies in Second Language Acquisition, 18, 299–324.
Friedl, G., & Auer, M. (2007). Erläuterungen zur Novellierung der Reifeprüfungsverordnung für AHS, lebende Fremdsprachen (Rating scale used for assessment of the writing task). Wien/St. Pölten: BIFIE.
García Mayo, M. P. (2003). Age, length of exposure and grammaticality judgements in the acquisition of English as a foreign language. In M. P. García Mayo, & M. L. García-Lecumberri (Eds.), Age and the acquisition of English as a foreign language (pp. 94–114). Clevedon: Multilingual Matters.
García Mayo, M. P., & García Lecumberri, M. L. (Eds.). (2003). Age and the acquisition of English as a foreign language. Clevedon: Multilingual Matters.
García Mayo, M. P., & Villarreal Olaizola, I. (2011). The development of suppletive and affixal tense and agreement morphemes in the L3 English of Basque-Spanish bilinguals. Second Language Research, 27(1), 129–149.
Genesee, F. C. (2013). Insights into bilingual education from research on immersion programmes in Canada. In C. Abello-Contesse, P. Chandler, M. D. López-Giménez, & R. Chacón-Beltrán (Eds.), Bilingual and multilingual education in the 21st century (pp. 24–42). Clevedon: Multilingual Matters.
Harley, B., Allen, P., Cummins, J., & Swain, M. (1990). The development of second language proficiency. Cambridge: Cambridge University Press.
Hellekjaer, G. (2010). Language matters: assessing lecture comprehension in Norwegian English-medium instruction. In C. Dalton-Puffer, T. Nikula, & U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 253–278). Amsterdam: John Benjamins.
Housen, A., Kuiken, F., & Vedder, I. (2012). Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA. Amsterdam/Philadelphia: John Benjamins.
Hughes, A. (2003). Testing for language teachers. Cambridge: Cambridge University Press.
Ionin, T. (2011). Formal theory-based methodologies. In A. Mackey, & S. Gass (Eds.), Research methods in second language acquisition (pp. 30–52). London: Blackwell.
Jessner, U. (2008). A DST model of multilingualism and the role of metalinguistic awareness. The Modern Language Journal, 92(2), 270–283.
Jexenflicker, S., & Dalton-Puffer, C. (2010). The CLIL differential: comparing the writing of CLIL and non-CLIL students in higher colleges of technology. In C. Dalton-Puffer, T. Nikula, & U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 169–189). Amsterdam/Philadelphia: John Benjamins.
Johnson, K., & Swain, M. (1997). Immersion education: International perspectives. Cambridge: Cambridge University Press.
Lasagabaster, D. (2008). Foreign language competence in content and language integrated courses. The Open Applied Linguistics Journal, 1, 31–42.
Lasagabaster, D., & Ruiz de Zarobe, Y. (2010). CLIL in Spain: Implementation, results and teacher training. Newcastle, UK: Cambridge Scholars Publishing.
Lasagabaster, D., & Sierra, J. M. (2010). Immersion and CLIL in English: more differences than similarities. ELT Journal, 64, 376–395.
Lázaro Ibarrola, A., & García Mayo, M. P. (2012). Faster and further morphosyntactic development of CLIL vs. EFL Basque-Spanish bilinguals learning English in high-school. IRAL International Review of Applied Linguistics, 50(2), 135–160.
Llinares, A., Morton, T., & Whittaker, R. (2013). The roles of language in CLIL. Cambridge: Cambridge University Press.
Lorenzo, F., Casal, S., de Alba, V., & Moore, P. (2007). Models and practices in CLIL. [SI]. Revista Española de Lingüística Aplicada (RESLA), 39–55.
Lyster, R. (2007). Learning and teaching languages through content. A counterbalanced approach. Amsterdam/Philadelphia: John Benjamins.
Macaro, E. (2010). Continuum companion to second language acquisition. London: Continuum.
Martínez Adrián, M., & Gutiérrez Mangado, M. J. (2009). The acquisition of English syntax by CLIL learners in the Basque country. In Y. Ruiz de Zarobe, & R. M. Jiménez Catalán (Eds.), Content and language integrated learning: Evidence from research in Europe (pp. 176–196). Bristol: Multilingual Matters.
McNamara, T. F. (2000). Language testing. Oxford: Oxford University Press.


Moore, P. (2009). On the emergence of L2 oracy in bilingual education: A comparative analysis of CLIL and mainstream learner talk. Sevilla: Universidad de Sevilla. Unpublished Doctoral Dissertation.
Muñoz, C. (Ed.). (2006). Age and the rate of foreign language learning. Clevedon: Multilingual Matters.
Muñoz, C. (Ed.). (2012). Intensity exposure experiences in second language learning. Clevedon: Multilingual Matters.
Muñoz, C. (2015). Time and timing in CLIL: a comparative approach to language gains. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based learning in multilingual educational environments (pp. 87–105). Berlin: Springer.
Nikula, T. (2007). Speaking English in Finnish content-based classrooms. World Englishes, 26, 206–223.
Oller, J. W. (1979). Language tests at school. London: Longman.
Ruiz de Zarobe, Y. (2008). CLIL and foreign language learning: a longitudinal study in the Basque Country. International CLIL Research Journal, 1(1), 60–73.
Ruiz de Zarobe, Y., & Jiménez Catalán, R. (2009). Content and language integrated learning. Evidence from research in Europe. Clevedon: Multilingual Matters.
Ruiz de Zarobe, Y., Sierra, J. M., & Gallardo del Puerto, F. (2011). Content and foreign language integrated learning: Contributions to multilingualism in European contexts. Bern, Berlin, Bruxelles, Frankfurt am Main, New York, Oxford, Wien: Peter Lang (Linguistic Insight series).
Sanz, C. (2014). Contributions of study abroad research to our understanding of SLA processes and outcomes: the SALA Project. In Author 1 (Ed.), Second language acquisition in study abroad and formal instruction contexts (pp. 1–17). Amsterdam/Philadelphia: John Benjamins.
Serrano, R. (2012). Is intensive learning effective? Reflecting on the results from cognitive psychology and the second language acquisition literature. In C. Muñoz (Ed.), Intensity exposure experiences in second language learning (pp. 3–25). Clevedon: Multilingual Matters.
Singleton, D. (2005). The critical period hypothesis. A coat of many colours. International Review of Applied Linguistics, 10, 209–231.
Stern, H. H. (1985). The time factor and compact course development. TESL Canada Journal, 3, 13–27.
Sylvén, L. (2004). Teaching in English or English teaching? On the effects of content and language integrated learning on Swedish learners' incidental vocabulary acquisition. Doctoral Dissertation. Göteborg University.
Tavakoli, P., & Foster, P. (2008). Task design and second language performance. The effect of narrative type on learner output. Language Learning, 58(2), 439–473.
Van de Craen, P., Ceuleers, E., Mondt, K., & Allain, L. (2008). European multilingual language policies in Belgium and policy-driven research. In K. Lauridsen, & D. Toudic (Eds.), Languages at work in Europe (pp. 139–151). Göttingen: V&R Unipress.
Victori, M., & Vallbona, A. (2008, December 12). A case study on the implementation of CLIL methodology in a primary education school: Results, benefits and challenges. Barcelona: Universitat de Barcelona. Paper presented at the CLIL-TBL Seminar.
Weir, C. J. (1998). Communicative language testing with special reference to English as a foreign language (Vol. 11). University of Exeter: Exeter Linguistic Studies.
Whittaker, R., & Llinares, A. (2009). CLIL in social science classrooms: analysis of spoken and written productions. In Y. Ruiz de Zarobe, & R. Jiménez Catalán (Eds.), Content and language integrated learning. Evidence from research in Europe (pp. 215–234). Clevedon: Multilingual Matters.
Wolfe-Quintero, K., Inagaki, S., & Kim, H. (1998). Second language development in writing: Measures of fluency, accuracy and complexity. Hawai'i: University of Hawai'i at Manoa.
Zydatiss, W. (2007). Deutsch-Englische Züge in Berlin (DEZIBEL). Eine Evaluation des bilingualen Sachfachunterrichts in Gymnasien: Kontext, Kompetenzen, Konsequenzen. Frankfurt am Main, Germany: Peter Lang.
Zydatiss, W. (2012). Linguistic thresholds in the CLIL classroom? The threshold hypothesis revisited. International CLIL Research Journal, 1(4), 16–28.
