Understanding the role of orthography in the acquisition of a non-native vowel contrast


Available online at www.sciencedirect.com

Language Sciences 32 (2010) 380–394 www.elsevier.com/locate/langsci

Ellen Simon a,*, Della Chambless b, Ubiratã Kickhöfel Alves c

a English Department, Ghent University, Muinkkaai 42, 9000 Ghent, Belgium
b Department of Romance Studies, Duke University, Durham, NC 27708-0257, USA
c Programa de Pós-Graduação em Letras, Universidade Católica de Pelotas, 96010-000 Pelotas RS, Brazil

Received 22 May 2009; received in revised form 30 June 2009; accepted 6 July 2009

Abstract

This paper examines the role of orthographic information used during training on the ability to learn a non-native vowel contrast. We investigate whether exposure to novel grapheme-to-phoneme correspondences can help learners in the acquisition of a new phonological contrast. Three related experiments were carried out on the acquisition of the French vowel opposition between /u/ (as in ‘vous’, you) and /y/ (as in ‘vu’, seen) by American English listeners. The experiments consisted of word learning, perceptual discrimination and vowel-categorization tasks. The results reveal that the use of orthography during training did not appear to have a significant influence on performance during testing and that the consonantal context in which the French vowels occur influences the categorization of the vowels by American English listeners. We explore several explanations as to the lack of an effect and, secondarily, discuss implications of these studies for pronunciation training involving the use of minimal pairs.

© 2009 Elsevier Ltd. All rights reserved.

Keywords: Orthography; Perception; Phonology; Second language acquisition; Vowels; Word learning

* Corresponding author. Tel.: +32 9 331 32 77; fax: +32 9 264 41 79. E-mail address: [email protected] (E. Simon).

0388-0001/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.langsci.2009.07.001

1. Introduction

It is well known that speakers of different native languages have difficulty perceiving certain vowel and consonant contrasts in a second language (henceforth L2). The difficulty is greatest when the target language has a contrast between two sounds which the source language lacks. According to Best’s Perceptual Assimilation Model (PAM, Best, 1994; Best et al., 2005), L2 learners may then perceptually ‘assimilate’ (hence the term ‘Perceptual Assimilation Model’) the two members of the L2 contrast to the L1 sound which is perceptually closest. In other words, they may interpret the two L2 sounds as allophones of the same phoneme, a phenomenon called single-category assimilation.

An example of a difficult contrast for American English (henceforth AE) speakers is the French opposition between front vs. back rounded vowels. English does not have any front rounded vowel phonemes, and earlier studies have shown that native English speakers have difficulty producing and perceiving the difference between the front rounded vowel /y/ and the back rounded vowel /u/ in French (Flege, 1987; Flege and Hillenbrand, 1984; Gottfried, 1984; Levy and Strange, 2007; Strange et al., 2007). Given the inherent difficulties associated with distinguishing between an L2 phoneme which occurs in the L1 and one which does not, many studies have examined the way in which particular L2 exposure conditions and/or laboratory training techniques maximize the ability of the listener to learn the contrast (Bradlow et al., 1997; Flege, 1989; Iverson and Evans, 2007; Nishi and Kewley-Port, 2007; Pisoni et al., 1982; Strange, 1995; Strange and Dittman, 1984; Tees and Werker, 1984; Wang and Munro, 2004).

In this study we aim to examine whether learning to distinguish between two quite similar-sounding categories is facilitated by the difference being reflected in the orthography. Our question is thus whether seeing minimally different written words encourages the establishment of separate categories for the sounds in question.

The role of orthography in the perception of L2 vowels has been investigated in a recent study by Escudero et al. (2008), who taught novel words containing the English vowels /e/ and /æ/ to L1 Dutch speakers who were proficient in English. The English contrast between /e/ and /æ/ is difficult for native speakers of Dutch, as Dutch has only one member of the contrast, /e/, which phonetically is often realized in between the two English vowels. Informants were trained in one of two ways: on word–picture pairings with auditory information only, or on pairings with both auditory information and the written form of the word. The words they were taught were nonce words, associated with non-objects (e.g. the word ‘tenzer’ was associated with a picture of a non-existing object).
During testing, informants heard an auditory target and were asked to click on one of four or five pictures on the screen. The eye movements of the participants were tracked in this testing phase. The results of the eye-tracking revealed that participants in the Spelling Group fixated more on /e/ targets than the Auditory Group, i.e. showed an asymmetry with /e/ as the dominant category. For example, when participants heard the first syllable of the English nonce word ‘tenzer’ (/ten/), they focused on the picture of the (non-object) ‘tenzer’ as well as on the picture of the ‘tandik’. However, when they heard the first syllable of the word ‘tandik’ (/tæn/), they only looked at the picture of the ‘tandik’, not at the picture of the ‘tenzer’. The Auditory Group (trained without orthography) did not show an asymmetric pattern of word recognition: auditory /e/ and /æ/ attracted looks to both ‘e’ and ‘a’ words. The authors surmise that only the group receiving spelling was able to build separate lexical representations, as only the Spelling Group showed the asymmetric word recognition pattern. They take this to mean that explicit instruction on orthography enables learners to encode lexical contrasts.

In this study, we take Escudero et al.’s (2008) finding a step further by testing whether the effect of orthography on the creation of distinct phonological contrasts for confusable L2 vowels can also be found for a different vowel pair in a different language and with a new methodology. We will focus on the acquisition of the French /u/–/y/ contrast and consider whether training with orthographic forms assists AE listeners in the creation of distinct phonological categories. Instead of investigating the effect of orthography by registering eye movements, we examine whether orthography can help learners in an L2 word learning task.

Levy and Strange (2007) examined the perception of French vowels by AE speakers who had never had formal instruction in French.
The results of an AXB task¹ testing the listeners’ performance on the pairs /u–y/, /i–y/, /u–œ/, and /y–œ/ revealed that the participants’ scores on the /u–y/ pair were significantly lower than on the other pairs, confirming earlier research by Gottfried (1984) and Rochet (1995), suggesting that /u/ and /y/ assimilate to the same AE phoneme.²

¹ In an AXB task, listeners hear sets of three audio stimuli. In each set, the stimulus in the middle (X) is either the same as the first (A) or as the third stimulus (B). Listeners are asked to decide whether X is the same as A or as B.
² The results of this study also revealed that there was a major effect of consonantal context for the inexperienced listeners, who produced more errors when the vowel occurred in alveolar context (in stimuli of the form /radVt/) than in bilabial context (in stimuli of the form /rabVp/). This finding will be further discussed in the results of Experiment II.

In our study, as in Escudero et al.’s (2008) study, participants are trained in one of two conditions: with auditory information only or with auditory information linked to spelled forms. It is important to point out that the orthographic information provided to the participants in one of the training conditions is built on a grapheme–phoneme correspondence which does not correspond to either the L1 or the L2 orthographic conventions. We are thus not investigating the influence of L1 grapheme–phoneme correspondences on L2 perception and word learning, but rather the possible effect of a newly learnt grapheme–phoneme correspondence on the acquisition of a non-native vowel contrast.

We hypothesized that listeners who were provided with orthographic forms during training would outperform learners who were not, in both word learning tasks and perceptual discrimination tasks. This may occur if phonological segments are matched to their corresponding graphemes, because learners who are trained with two different graphemes may be more likely to map them to two distinct phonological segments. In other words, two different orthographic representations will help learners create two different phonological representations, which will in turn help them to keep minimally different words apart in their mental lexicon. For example, in our experiment, the French nonce word ‘douge’ ([duʒ]) was matched to the picture of a boat, while the French nonce word ‘dûge’ ([dyʒ]) was associated with the picture of a banana. We hypothesize that learners trained on these words with the spelling beneath the pictures would find it easier to match the two graphemes for the vowels to the two distinct vowel phonemes, and hence that they would perform better on the word learning test following the training than learners who had been trained with just the auditory forms [duʒ] and [dyʒ], matched to the picture of the boat and that of the banana, respectively.

Three experiments were carried out to gain more insight into the perception of the French vowels /u/ and /y/ by AE listeners and into the potential role of orthographic information in training. Experiment I consisted of a word learning task in which tokens were recorded from multiple speakers, followed by an AXB task. Experiment II was a cross-language categorization task in which speakers had to identify French vowels as belonging to the nearest AE vowel category.
Finally, in Experiment III a categorization task was followed by a word learning task in which tokens were drawn from a single speaker. The three experiments are discussed in the following sections.

2. Experiment I: word learning task with multiple speakers and AXB task

In this experiment we examined the effect of training with or without orthographic information on the perceptual acquisition of a phonological vowel contrast which is absent in the listeners’ native language. The aim was to investigate whether listeners exposed to orthography during training on novel words outperformed listeners trained with audio stimuli alone.

2.1. Participants

The participants were 20 native speakers of AE, recruited at the University of North Texas. None of them had formal instruction in French or German beyond high school and none had been regularly exposed to another language besides English in the home.³ The informants participated for course credit or were paid $10. The analysis was based on the results from 20 people who participated in the word learning task, 16 of whom also performed the AXB task.

2.2. Stimuli

The auditory stimuli were monosyllabic nonwords of the form C(C)VC, organized into 16 minimal triplets: (a) a set of 12 target triplets containing the vowels [y, u, i]⁴ and (b) a set of four distractor triplets containing the vowels [ɑ, e, o]. The stimuli were constructed so as to avoid consonant frames that would yield real English words containing one of the English vowels to which the French vowels might assimilate (e.g. a frame like /b_k/ was avoided, as AE listeners might assimilate French /u/ to English /ʊ/ and hence store the French token ‘bouque’ as the English word ‘book’). The target vowels were embedded in words forming minimal triplets, because minimal sets are known to increase listeners’ awareness of the differences between members of a contrast (Nishi and Kewley-Port, 2007, p. 1507). Two examples of target triplets are provided in (1) and (2):

(1) dûge /dyʒ/ - douge /duʒ/ - dige /diʒ/
(2) stûgue /styg/ - stougue /stug/ - stigue /stig/

³ Best and Tyler (2007) refer to participants who are “linguistically naïve to the target language of the test stimuli” as ‘functional monolinguals’. As such, our study examines the very first stage in L2 perceptual acquisition.
⁴ Minimal triplets rather than minimal pairs were used, as these allowed the learner to contrast the front rounded vowel /y/ with the back rounded vowel /u/ as well as with the front unrounded vowel /i/.
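As an illustration of how the target triplets in (1) and (2) are built, the following minimal Python sketch inserts each of the three target vowels into a fixed consonant frame and checks that the resulting words differ in the vowel only. The function names `make_triplet` and `is_minimal_set` are hypothetical helpers introduced here for illustration, not part of the original materials, and only the two frames shown in (1) and (2) are used.

```python
# Hypothetical helpers for illustration only; the frames are the two
# examples given in (1) and (2), written in IPA.
TARGET_VOWELS = ("y", "u", "i")  # front rounded, back rounded, front unrounded


def make_triplet(onset, coda, vowels=TARGET_VOWELS):
    """Return one minimal triplet: same consonant frame, different vowel."""
    return tuple(f"{onset}{v}{coda}" for v in vowels)


def is_minimal_set(words):
    """True if all words have equal length and differ in exactly one position."""
    words = list(words)
    if len({len(w) for w in words}) != 1:
        return False
    differing = [i for i in range(len(words[0]))
                 if len({w[i] for w in words}) > 1]
    return len(differing) == 1


example_frames = [("d", "ʒ"), ("st", "g")]
triplets = [make_triplet(onset, coda) for onset, coda in example_frames]

for triplet in triplets:
    print(triplet, is_minimal_set(triplet))

# 12 target triplets with three vowels each give 36 target words.
n_target_words = 12 * len(TARGET_VOWELS)
```

With 12 target triplets and four distractor triplets, the same scheme yields 36 target words and 12 distractor words.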

Two repetitions of all stimuli were recorded from four native speakers of French (two male and two female). They were asked to read the words at their normal speaking rate.⁵ Tokens from multiple speakers were recorded as it has been argued that speaker variability may be essential to the creation of robust perceptual categories (e.g. Pisoni et al., 1994). Moreover, training learners on variable tokens produced by multiple speakers more closely simulates the situation of non-native listeners learning a foreign language in a natural setting (MacKain et al., 1981). The visual stimuli were pictures of objects (e.g. ‘banana’, ‘boat’, ‘glasses’), selected from the web and adjusted in size.

2.3. Procedure

The experiment was built with SuperLab 4.0.3⁶ and consisted of a word learning task and an AXB discrimination task.⁷ All subjects carried out both tasks in the same order. The experiment took about 30 min.

In the word learning task, participants were told they were going to learn words from an unfamiliar language, and that they would be taught three words at a time: triplets whose members differed only in the vowel sound. In the training phase, participants were assigned alternately to one of two groups, a Sound Only Group and a Sound–Spelling Group, and taught the meaning of a number of word triplets. Subjects in the Sound Only Group were shown a picture on the computer screen and heard the corresponding audio stimulus 1 s later (while the picture was still on the screen). After 2 s, a new screen appeared showing a new picture, followed by the auditory stimulus of the second member of the triplet. Finally, the third picture and corresponding sound file were presented. Subjects in the Sound–Spelling Group were trained in exactly the same way, except that they also saw the word spelled underneath the picture for each stimulus.
After three repetitions of the same triplet (with the words appearing in different orders and produced by three different speakers), the informants were tested on their ability to correctly match the audio stimulus of the word to the picture. The testing phase was the same for informants from the Sound Only Group and those from the Sound–Spelling Group. They were shown the three pictures simultaneously, displayed in a row on the screen, and heard the auditory forms of the three words but saw no written forms. After each sound, they had to press a button on a response pad to indicate which picture matched the sound they had just heard: picture 1, picture 2, or picture 3.⁸

The aim of the AXB task was to test the ability of the learners to generalize to novel stimuli. The task contained six new triplets, each one again containing the vowels /u/, /y/ and /i/. The task was the same for all subjects. They heard a triplet and were asked to press a button to indicate whether the second word they heard was the same as the first or the third one. The screen showed the number (1) on the left, a question mark in the middle, and the number (3) on the right. Participants were instructed to press the leftmost button if they thought the second word (X) was the same as the first word (A) and the rightmost button if they thought the second word was the same as the third word (B). The three words in each triplet were spoken by three different speakers. This was done to make sure that informants could not simply rely on acoustic identity between words, but had to access their phonological knowledge of the contrast (as discussed by Pater, 2003).

⁵ Natural rather than synthetic stimuli were used as the former have been shown to lead to more rapid improvement on L2 vowels than the latter (Nishi and Kewley-Port, 2007).
⁶ SuperLab is stimulus presentation software, which makes it possible to build experiments in which sound files are linked to pictures and orthographic forms on the screen, and in which participants’ responses can be saved.
⁷ In addition, 16 of the 20 participants in Experiment I also performed a production task. The results of this task are not discussed in the present paper, in which we focus on the role of orthography in perception.
⁸ Picture 1 corresponded to the picture on the left side of the screen, picture 2 corresponded to the center picture and picture 3 corresponded to the picture on the right side of the screen.

2.4. Results

The results for the word learning task are presented in Table 1. While the Sound–Spelling Group outperformed the Sound Only Group, a Mann–Whitney U-test⁹ revealed that this difference was not significant (P = .436 > .05). However, there was a great deal of variation in performance scores between the participants. The percentage of correct responses ranged from a minimum of 30.6% correct (participant no. 15) to a maximum of 87.5% correct (participant no. 6) and the interquartile range (IQR) was 11.4% for the Sound–Spelling Group and as much as 33.7% for the Sound Only Group.

Table 1
Percentage of items correct in the word learning task.

Participant    Sound–spelling    Participant    Sound only
1              67.1              11             70.8
2              70.8              12             31.9
3              34.7              13             34.7
4              66.7              14             56.9
5              76.4              15             30.6
6              87.5              16             79.2
7              33.3              17             69.4
8              66.7              18             75.0
9              78.6              19             63.8
10             81.4              20             83.3

Median         69                               66.6
IQR            11.4                             33.7
Mann–Whitney   U = 39.000, Z = −.832, P = .436 > .05

The results of the AXB task are presented in Table 2. The analysis revealed, firstly, that the participants in the Sound–Spelling Group did not significantly outperform the participants in the Sound Only Group (Mann–Whitney U-test, P = .721 > .05). Secondly, the scores were high for all speakers, ranging from 79.2% correct (participants no. 4 and 12) to a full 100% correct (participants no. 2 and 13).

Table 2
Percentage of items correct in the AXB task.

Participant    Sound–spelling    Participant    Sound only
1              87.5              9              68.8
2              100.0             10             93.8
3              85.4              11             87.5
4              79.2              12             79.2
5              91.7              13             100.0
6              95.8              14             87.5
7              85.4              15             85.4
8              85.4              16             85.4

Median         86.5                             86.5
IQR            7.3                              5.2
Mann–Whitney   U = 28.000, Z = −.832, P = .721 > .05

2.5. Discussion

The results did not confirm our hypothesis that the presence of orthographic information would help learners to build phonological forms of lexical items containing a non-native vowel contrast. While learners who received the written forms of the words during training performed better than learners who did not get this extra information, the difference was not significant in the word learning task or in the AXB task.¹⁰

The lack of effect of orthographic information during training in the word learning task could be accounted for by a number of different factors. One hypothesis was that some AE listeners do not have single-category assimilation for French /y/ and /u/ (i.e. do not assimilate both L2 sounds to the same L1 sound(s), see Best, 1994; Best et al., 2005), and spelling assists mostly (or only) in cases of single-category assimilation. In order to confirm whether AE listeners have single- or two-category assimilation for French /y/ and /u/, we decided to carry out a cross-language categorization task with the stimuli we used in Experiment I (see Experiment II).

A second hypothesis was that the task was too taxing for some participants: participants had to remember the meaning of 36 new target words and of 12 distractor words in a 20–25 min training session. Moreover, the stimuli were recorded by four different native speakers of French. We therefore decided to revise the experiment with a longer training phase to further encourage learning of the contrast, fewer triplets to reduce the memory load, and no speaker variability (see Experiment III).

⁹ Non-parametric Mann–Whitney U-tests are used here as well as in the remainder of the paper, because (1) the data sets are relatively small, and (2) the values for skewness and kurtosis were calculated for all data sets and proved not to meet the criteria for performing parametric t-tests.
¹⁰ Since the AXB task was designed as a post-test, to examine whether any possible effect of spelling would be generalized by listeners to novel tokens, the lack of effect in the AXB task follows naturally from the finding that there was no effect of orthography in the word learning task.

3. Experiment II: cross-language categorization task

In Experiment I we tested the hypothesis that exposure to orthography during training on a new vowel contrast would assist learners in the perceptual acquisition of the contrast. The analysis revealed that participants trained with orthographic forms did not significantly outperform participants trained without orthographic forms. One factor we suggested as an explanation for this result was that American listeners may assimilate the French vowels /u/ and /y/ to two distinct English vowels. If this is the case, then orthography may only play a minor role in helping learners to establish a perceptual contrast, and near-ceiling effects are expected. In order to determine whether the French /u/–/y/ contrast is an example of single-category assimilation or two-category assimilation, we ran a second experiment (Experiment II), in which we investigated how AE listeners map the French vowels /u/, /y/ and /i/ to AE vowels by means of a cross-language categorization task.

3.1. Participants

The participants were ten native speakers of AE, recruited at the University of North Texas. None of them had participated in Experiment I.
They had not had formal instruction in French or any other language with front rounded vowels beyond high school. The participants received $10 for participation.

3.2. Stimuli

The stimuli were the same as in Experiment I: 24 monosyllabic nonwords of the form C(C)VC produced by the same four native speakers of French and consisting of 12 minimal triplets containing the target vowels /u/, /y/ and /i/. Two example triplets are given in (3) and (4):

(3) jûque /y/ - jouque /u/ - jique /i/
(4) blûve /y/ - blouve /u/ - blive /i/

Four triplets ended in a bilabial stop, four in a velar stop, three in a labiodental fricative and one in a palatoalveolar fricative. Each stimulus was repeated 10 times.

3.3. Procedure

The stimuli were presented aurally over headphones, with a 500 ms inter-stimulus interval. Five English words were displayed on the screen: peek, pick, booth, book and poke, containing the vowels /iː/, /ɪ/, /uː/, /ʊ/, and /oʊ/, respectively. Participants were told that they were going to participate in a listening experiment. The instructions on the initial screen were the following: “After you hear a sound, click on the vowel that is most similar”. During the experiment the instructions “Choose the vowel that best matches the vowel that you heard” remained at the top of the screen. Participants were told that, although the vowels they would hear are obviously different from English vowels, they should, for each token, pick out the English vowel that the French vowel came closest to approximating. The experiment took about an hour.

3.4. Results

The graph in Fig. 1 displays the percentage of vowels chosen by all 10 informants together. The X-axis presents the vowels in the five words on the screen. The Y-axis shows the percentage of auditory tokens assigned to each of those five vowels. The results show that the French vowel /u/ was categorized as the English booth vowel (tense /u/) in the majority of tokens (64%) and as the English book vowel (lax /ʊ/) to a lesser extent (28%). The division was similar for French /i/, which was categorized as the tense peek vowel (/i/) in 72% of the tokens and as the lax pick vowel (/ɪ/) in 27% of the tokens.

Fig. 1. Categorization of French /u, y, i, o/ by AE listeners.
The categorization of French /y/ was variable: it was categorized as booth (/u/) in 35% of the tokens, as book (/ʊ/) in 32% of the tokens and as pick (/ɪ/) in 27% of the tokens.¹¹

¹¹ It should be noted that these results should not be taken to mean that the AE listeners did not perceive the difference between the French vowels /u/, /y/, and /i/, but rather they tell us something about how AE listeners categorize these French vowels as AE vowels in different contexts.

A closer analysis of the results revealed that the place of articulation of the consonant following the vowel in the C(C)VC cluster influenced the categorization of the French /y/ vowel. When /y/ was followed by a velar stop it was most frequently categorized as the book vowel (/ʊ/, 46%), followed by booth (/u/, 33%) and pick (/ɪ/, 16%). When /y/ was followed by a bilabial, on the other hand, it was most frequently categorized as the pick vowel (/ɪ/, 51%), followed by booth (/u/, 27%) and book (/ʊ/, 18%).

Fig. 2. Categorization responses to /y/ and /u/ preceding bilabial and velar codas.

Fig. 2 compares the categorization of /y/ and /u/ preceding bilabial and velar codas and shows that, for all speakers together, the categorization of /y/ was considerably different preceding a bilabial than preceding a velar, while the categorization of /u/ did not differ as a consequence of the place of articulation of the following consonant. The French vowel /y/ was most often categorized as the pick vowel (/ɪ/) in the bilabial context (51% of the tokens) and as the book vowel (/ʊ/) in the velar context (46%). The French vowel /u/, however, was categorized most frequently as the booth vowel (/u/) in both bilabial (66%) and velar (59%) contexts.
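The context effect described in these results can be made concrete with a small Python sketch that stores the pooled percentages reported in the text and asks, for each French vowel, whether the most frequent English response changes with the coda. The dictionary layout and the helper names `modal_category` and `context_sensitive` are assumptions made for illustration; only response categories whose percentages are given in the text are listed, so the rows do not sum to 100.

```python
# Pooled percentages as reported in the text for Experiment II.
# Keys are (French vowel, coda place); values map the English response
# word to the percentage of tokens. Unreported categories are omitted.
reported = {
    ("y", "velar"): {"book": 46, "booth": 33, "pick": 16},
    ("y", "bilabial"): {"pick": 51, "booth": 27, "book": 18},
    ("u", "bilabial"): {"booth": 66},
    ("u", "velar"): {"booth": 59},
}


def modal_category(responses):
    """Return the response word with the highest percentage."""
    return max(responses, key=responses.get)


def context_sensitive(vowel):
    """True if the modal response for this vowel differs across coda contexts."""
    modes = {modal_category(resp)
             for (v, _context), resp in reported.items() if v == vowel}
    return len(modes) > 1


for (vowel, context), responses in reported.items():
    print(f"/{vowel}/ before {context} coda: {modal_category(responses)}")

print("/y/ context-sensitive:", context_sensitive("y"))
print("/u/ context-sensitive:", context_sensitive("u"))
```

Under these figures the modal response to /y/ is pick before bilabials and book before velars, whereas /u/ is booth in both contexts.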

3.5. Discussion

The results showed that there was a great deal of variation in the categories to which the participants assigned French /y/ tokens and there was no clear single-category assimilation of French /u/ and /y/ to the English booth vowel (/u/). This may explain why the Sound–Spelling Group did not significantly outperform the Sound Only Group in Experiment I. Both /u/ and /y/ were categorized as English book (/ʊ/) with a similar frequency (in 28% of the tokens for /u/ and in 32% of the tokens for /y/), but the two vowels differed in the extent to which they were categorized as pick (/ɪ/) and booth (/u/): /y/ was matched to pick in 27% of the tokens, whereas the number of /u/ tokens matched to pick was minimal (1%). The number of booth assignments, on the other hand, was considerably greater for /u/ (64%) than for /y/ (35%). This indicates that for a fairly large number of the tokens, /u/ and /y/ were categorized as different AE vowels (i.e. there was no clear-cut single-category assimilation), meaning that the participants could distinguish between the two French vowels. When the data were broken down according to the bilabial versus velar place of articulation of the coda consonant, it appeared that, in the bilabial context, /y/ was categorized as the front unrounded pick vowel (/ɪ/) in more than half of the tokens (51%).

These findings of Experiment II may help to explain the lack of effect of orthography in Experiment I. It was shown that the consonantal context in our stimuli helped participants to distinguish between French /u/ and /y/, and earlier research has shown that French /u/ and /y/ are harder to distinguish in an alveolar context than in a bilabial context (Levy and Strange, 2007). As a result, it is possible that manipulating the presence of orthographic information during training in Experiment I did not affect some listeners’ performance on the word learning task, since they categorized the vowels /u/ and /y/ differently in certain consonantal contexts.
Learners would thus have largely been able to distinguish between lexical items containing /u/ and /y/ from the start, and the benefit which spelling might have had would thus have been lost. Moreover, the acoustic variability between tokens of the same type (e.g. various realizations of the same vowel /y/), resulting from the fact that the stimuli were produced by different native French speakers, could have confused learners in the creation of the contrast and prevented spelling from offering an extra boost. Because spelling could possibly have been prevented from having an effect on the perceptual acquisition of the contrast in the word learning task in Experiment I, we decided to design and run a third experiment, in which the stimuli were taken from just one native speaker of French and in which the consonantal context of the vowels was held constant.

4. Experiment III: vowel-categorization task and word learning task with a single speaker

Experiment III consisted of two parts: a categorization task and a word learning task. The results of the word learning task in Experiment I, in which the presence of orthographic forms during training was manipulated, revealed that there was no significant difference between the groups of participants according to training condition. This lack of effect may potentially have been due to the fact that, in some consonantal contexts, there was a tendency for some listeners to categorize the French vowels /u/ and /y/ as distinct English vowels, as was indeed shown to be the case in Experiment II. Another factor considered potentially responsible for the lack of effect of training condition in Experiment I was that the stimuli were read by multiple native speakers of French, which may have inhibited the learning process for some participants to such an extent that any effect of orthography was diminished. Some studies (Kingston, 2003; McCandliss et al., 2002) have shown that reducing stimulus and speaker variation has a positive effect if learners do not need to generalize to novel items and the training session is relatively short. Experiment III therefore consists of a word learning task, in which the place of articulation of the coda consonant in the stimuli is held constant at alveolar.
A short vowel- categorization task was also performed in order to confirm that categorization of the French vowels by the new set of AE listeners followed the same patterns of perception which were reported for Experiment II. For both tasks, the stimuli were recorded from just one native speaker of French. 4.1. Participants The participants were 20 native speakers of AE, recruited at Duke University, North Carolina and at the University of North Carolina. None of the informants had participated in Experiment I or II. They had not received instruction in French or German or any other language with front rounded vowels beyond highschool and were paid $12 for participation. 4.2. Stimuli The stimuli were novel French nonwords which formed minimal pairs with a CV(C)C structure in which V was /u/, /y/ or /i/. They were produced by one female native speaker of (Belgian) French. The stimuli were recorded twice; only one of each of the repetitions was used in the experiment. In contrast to the stimuli used in Experiments I and II, the place of articulation of the stimuli in Experiment III was held constant at alveolar. 4.3. Procedure In the vowel-categorization task, built in Praat (Boersma and Weenink, 2009), the participants heard a stimulus and were instructed to click on one of the five words on the screen which contained a vowel that best matched the sound they had just heard. They were told that the words they heard came from a language with which they were not familiar. The words on the screen were could, cooed, keyed, kid and cod, representing the vowels /u/, /u/, /i/, /I/ and /A/, respectively. Words with the vowels /u/, /u/, /i/ and /I/ were selected because AE listeners could potentially categorize the French vowels /u/, /y/ and /i/ as these AE vowels. The word cod functioned as a control word, to check that participants were not clicking randomly. Twenty-seven stimuli (nine triplets) were presented and repeated three times in random order. 
There was a 1.5 s pause between a click and the presentation of the following auditory stimulus.
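
The trial structure of the categorization task described above can be sketched as follows. This is only an illustration in Python, not the Praat script used in the study: the stimulus labels are hypothetical placeholders, and the sketch assumes that all 81 presentations were shuffled together, whereas the text only states that the stimuli were repeated three times in random order.

```python
import random
from collections import Counter

VOWELS = ("u", "y", "i")   # the three French vowels in each triplet
N_TRIPLETS = 9             # nine triplets = 27 stimuli
N_REPETITIONS = 3          # each stimulus is presented three times
PAUSE_S = 1.5              # pause between a click and the next stimulus


def build_trial_list(seed=None):
    """Return a shuffled list of stimulus labels (labels are hypothetical)."""
    rng = random.Random(seed)
    stimuli = [f"triplet{t}_{v}" for t in range(1, N_TRIPLETS + 1) for v in VOWELS]
    trials = stimuli * N_REPETITIONS
    # Assumption: one shuffle over all presentations, not block-wise shuffling.
    rng.shuffle(trials)
    return trials


trials = build_trial_list(seed=1)
presentations = Counter(trials)
# Pauses occur between consecutive trials, so there is one fewer than trials.
total_pause_s = (len(trials) - 1) * PAUSE_S
print(len(trials), len(presentations), total_pause_s)  # prints: 81 27 120.0
```

Under these assumptions a session comprises 81 trials, and the fixed pauses alone account for about two minutes of the task.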


The procedure in the word learning task, built in SuperLab 4.0.3, was largely the same as for Experiment I, with the important differences that (1) all stimuli were produced by one speaker instead of four speakers, (2) the number of triplets was reduced to six, and distractor triplets were omitted, and (3) each stimulus was repeated six times (i.e. two repetitions in each of three blocks). These three changes were intended to facilitate the learning process by decreasing the acoustic variability between the tokens (change (1)) and the memory load (changes (2) and (3)). The entire experiment took nearly an hour, with the vowel-categorization task lasting about 10–15 min and the word learning task about 40 min.

4.4. Results

Fig. 3 presents the results of the vowel-categorization task. The analysis revealed that French /i/ was categorized as English /i/ in 76% and as /ɪ/ in 23% of the tokens. These percentages are very similar to those in Experiment II, in which French /i/ was categorized as English /i/ in 72% and as /ɪ/ in 27% of the tokens. French /u/ was categorized as English /u/ in 66% of the tokens and as /ʊ/ in 29% of the tokens, which is again close to the 64% /u/ and the 28% /ʊ/ response rates found in Experiment II. The results for French /y/ are, however, different for the two experiments, as is illustrated in Fig. 4. The comparison shows that, while 27% of the tokens containing French /y/ were categorized as the English pick vowel (/ɪ/) in Experiment II, only 1% of the tokens were categorized as this vowel when preceding an alveolar coda, as in Experiment III. By contrast, far more /y/ tokens were categorized as the cooed vowel (/u:/) in Experiment III (68%) than in Experiment II (35%). Table 3 presents the results of the word learning task and shows that there was no significant difference (Mann–Whitney U-test, p > .05) between the two groups of learners, who got high scores in both conditions. 
There is little variation between the scores of individual informants, which range from 77.8% to 98.1% correct.

4.5. Discussion

The results of the categorization task in Experiment III confirm Levy and Strange’s (2007) claim that native speakers of English categorize French /u/ and /y/ as back rounded vowels preceding an alveolar consonant. Levy and Strange (2007) examined the perception of French vowels by a group of non-French speaking AE listeners. They tested the discrimination of the French vowels /u, , y, i/ in the nonwords ‘rabVt’ and ‘rabVp’

Fig. 3. Categorization of French /u, y, i/ preceding alveolar coda by AE listeners.


Fig. 4. The categorization of French /y/: comparison between Experiments II and III.

Table 3. Percentage of items correct on the word learning task.

Participant    Sound–spelling    Participant    Sound only
1              98.1              11             85.2
2              89.8              12             81.5
3              88.9              13             88.0
4              86.1              14             96.3
5              91.7              15             91.7
6              98.1              16             77.8
7              90.7              17             97.2
8              96.3              18             97.2
9              85.2              19             97.2
10             88.9              20             97.2

Median         90.3                             94
IQR            6.25                             11.3
Mann–Whitney   U = 48.5, Z = −.114, p = .912 (> .05)
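
The summary statistics in Table 3 can be recomputed from the individual scores. The sketch below is a plain-Python illustration, not the authors' analysis script. It assumes that the interquartile range uses linear interpolation between data points (the "inclusive" method of Python's `statistics.quantiles`) and that Z is the tie-corrected normal approximation; the paper does not say which software or options produced the reported values.

```python
from collections import Counter
from math import erfc, sqrt
from statistics import median, quantiles

# Percentage correct per participant, copied from Table 3.
sound_spelling = [98.1, 89.8, 88.9, 86.1, 91.7, 98.1, 90.7, 96.3, 85.2, 88.9]
sound_only = [85.2, 81.5, 88.0, 96.3, 91.7, 77.8, 97.2, 97.2, 97.2, 97.2]


def iqr(scores):
    """Interquartile range with linear interpolation (assumed method)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1


def mann_whitney(a, b):
    """Return (U, z, p): smaller U, tie-corrected z, two-sided normal p."""
    n1, n2 = len(a), len(b)
    # Count pairs in which the score from a is higher; a tie counts as half.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u = min(u_a, n1 * n2 - u_a)
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of identical scores.
    tie_term = sum(t**3 - t for t in Counter(a + b).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / sqrt(variance)
    p = erfc(abs(z) / sqrt(2))  # two-sided p from the normal approximation
    return u, z, p


u, z, p = mann_whitney(sound_spelling, sound_only)
print(median(sound_spelling), median(sound_only))
print(round(iqr(sound_spelling), 2), round(iqr(sound_only), 2))
print(u, round(z, 3), round(p, 2))
```

Under these assumptions the sketch gives medians of about 90.25 and 94.0, interquartile ranges of about 6.25 and 11.3, U = 48.5 and Z of about −.114. The normal-approximation p comes out near .91; the reported .912 may stem from an exact test, so the last digit should not be expected to match.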

in an AXB task, and found that the non-French speaking AE listeners confused /i/ and /y/ more often in bilabial than in alveolar context. They hypothesize that, because AE /u/ is often fronted in alveolar context, AE listeners confuse French /y/ and /u/ in the alveolar context more often than in the bilabial context, in which the acoustic closeness of /y/ and /i/ leads to confusion between these two vowels.

By keeping the coda consonant constant at alveolar in Experiment III, we succeeded in reducing the percentage of /y/ tokens that were categorized as /ɪ/ to 1%. As expected, both /u/ and /y/ were now categorized in very similar ways. The tendency for two-category assimilation, which was considered a possible explanatory factor for the lack of effect of orthography in Experiment I, can be excluded in this experiment given the results from the vowel-categorization task. However, in contrast to our expectation, learners in the Sound–Spelling Group again did not outperform learners in the Sound Only Group in the word learning task. Interestingly, the results showed that the median scores in Experiment III’s word learning task (90.3% for the Sound–Spelling Group and 94% for the Sound Only Group) were much higher than in Experiment I’s word learning task (69% for the Sound–Spelling Group and 66.6% for the Sound Only Group). One possible explanation for the much higher scores in Experiment III is that we eliminated all variability in the stimuli. First of all, the stimuli were


produced by just one native speaker of French instead of four (cf. Kingston, 2003, who also found that speaker variability impeded learning). Secondly, the tokens to which the participants were exposed during training were the same as those on which they were tested. As a result, it is possible that in this second word learning task, participants did not build lexical representations after all, but relied on phonetic similarity between the trained and the tested words.

5. Conclusions and implications

The aim of the set of studies reported on in this paper was to examine whether exposure to orthographic information during training assists learners in the acquisition of a novel vowel contrast. The results of the picture–word matching task in Experiment I revealed that participants trained on the acquisition of new words by means of word–picture associations with orthographic forms did not significantly outperform learners trained without orthographic information. This finding does not support our hypothesis that listeners who are trained with orthographic forms establish a grapheme-to-phoneme mapping which facilitates the creation of distinct lexical items for new L2 words. We suggested two factors which could explain the lack of effect of orthography in the word learning task.

A first possible reason for the lack of effect was that some AE listeners already had two distinct lexical representations for the French minimal pairs containing /u/ and /y/ when these vowels occurred in particular consonantal contexts. A vowel-categorization task with the stimuli used in Experiment I was therefore carried out (Experiment II). The analysis showed that there was indeed an increased tendency for two-category assimilation when the vowel was followed by a bilabial coda consonant.

A second possible explanation for the lack of effect of orthography was that there was a great deal of variability in the stimuli, as they were drawn from multiple native speakers of French. 
We hypothesize that, because of the variability in the stimuli, it may have been difficult for the AE listeners to assimilate different acoustic realizations to one and the same phoneme. For instance, if the listeners heard different phonetic realizations of /y/, but did not realize that these were all realizations of the same phoneme, then the fact that in the orthographic forms those realizations were all represented with the same letter may have been confusing rather than helpful. Of course, in real learning situations learners will also be confronted with different realizations of the same phoneme by different speakers and in various contexts. However, in a 1-h experiment like the one carried out in this study, it is likely that learners simply did not receive enough exposure to the target sounds in order to overcome the difficulties caused by the variability between the tokens. We therefore carried out a third experiment, which consisted of a vowel-categorization task, in which the coda was held constant at alveolar, and a word learning task in which all variability was eliminated: tokens were produced by just one speaker and participants were tested on the same tokens they were trained on. The results of Experiment III revealed that the French vowels /u/ and /y/ were both categorized as back rounded vowels when they were followed by an alveolar coda consonant, confirming Levy and Strange’s (2007) hypothesis that the fronting of AE /u/ in alveolar context leads AE listeners to categorize the French front rounded vowel /y/ as the AE back rounded vowel /u/. However, the word learning task, in which all stimuli had an alveolar coda consonant, and in which tokens were produced by just one native speaker of French, revealed that, again, participants trained with orthographic forms did not outperform participants who were trained without orthography. 
This could, however, also be the result of a near-ceiling effect in both groups (though none of the participants gave 100% correct responses).

A third potential explanation for the lack of effect of orthographic forms in training on a new vowel contrast is related to the nature of the orthographic system in English. Orthographic depth is defined as the degree of complexity of the mapping between graphemes and phonemes and conceptualized on a scale from more transparent (or ‘shallow’) to more opaque (or ‘deep’) (Van den Bosch et al., 1994). For most English vowels, there is no one-to-one grapheme-to-phoneme correspondence. The English grapheme ⟨o⟩, for instance, can represent the sound /ɑ/ (as in ‘hot’) and /ʌ/ (as in ‘mother’), as well as /u/ when doubled (as in ‘goose’) and /ɜ/ when followed by ⟨r⟩ (as in ‘word’). English is thus situated on the opaque end of the orthographic depth scale, as opposed to languages with more transparent orthographic systems, such as Serbo-Croatian and Italian (Van den Bosch et al., 1994). As a result, L1 English listeners may be less likely to rely on spelling


to create distinct phonological categories than speakers of a language with a more transparent orthographic system. In a study by Erdener and Burnham (2005), speakers of a language with a more transparent orthography (Turkish) and a more opaque orthography (Australian English) were trained on the production of Spanish (more transparent) and Irish (more opaque) nonwords. The analysis of the production task revealed that the Turkish participants made considerably fewer phoneme errors than the English-speaking participants when orthographic information of the transparent language was provided. The authors link this finding to the participants’ native language: as native speakers of Turkish are familiar with a transparent orthographic system in their L1, they are used to mapping graphemes onto phonemes on a one-to-one basis and hence orthographic forms of (non)words in a foreign language assist them in the production of those words more than they assist native speakers of a language with an opaque writing system, such as English. We believe that a parallel can be drawn to our study: we might not have found a positive effect of orthography on the perception of a novel vowel contrast by L1 English listeners, because these listeners may not be used to mapping graphemes onto phonemes on a one-to-one basis and hence they do not profit from the spelled forms of the words as much as native speakers of a language with a more transparent orthographic system.¹²

One suggestion for further research is thus to carry out the experiments with native speakers of a language with a more transparent grapheme-to-phoneme mapping, such as Spanish. If such an experiment were to be carried out, the native language of the listeners should be carefully selected. 
For instance, Brazilian Portuguese is a language with a very transparent orthographic system which lacks /y/, yet it would not be suitable for this experiment, as Rochet’s (1995) study showed that native speakers of Brazilian Portuguese typically have two-category assimilation for French /u/ and /y/. Spanish, by contrast, would be a suitable L1, since it has a transparent orthography (Erdener and Burnham, 2005) and L1 Spanish speakers tend to categorize both French /y/ and /u/ as back rounded vowels (Meunier et al., 2003).

As far as implications of this study for language teaching are concerned, the results of the picture–word matching tasks following different training conditions are inconclusive about the potential positive influence of presenting learners with orthographic information during L2 word learning tasks. While learners trained with orthographic forms obtained a slightly higher median score than learners trained without orthography in Experiment I, the difference was not significant. However, our study has implications for language teaching with respect to the use of minimal pairs or triplets for the teaching of a new vowel contrast. While teaching L2 phoneme contrasts by means of minimal pairs has been criticized by some (see e.g. Brown, 1995, who argues that minimal pairs do not merit the attention they receive), they are still pervasively present in pronunciation teaching materials (Jones, 1997). While we do not engage in the debate on the efficacy of minimal pair exercises compared to other pronunciation teaching techniques, the results of the vowel-categorization tasks in Experiments II and III suggest that, if minimal pairs are used in pronunciation teaching, they should be carefully selected. Specifically, we would like to make two suggestions. First, we suggest that teachers should be careful to adapt the minimal pairs/sets to the native language of their students. 
If, for example, a person is teaching French to native speakers of English, s/he might choose minimal pairs in which the vowel contrast being focused on is /u/–/y/. This is a sensible choice, since both vowels are commonly assimilated to back rounded vowels by AE speakers. However, if the same teacher then starts teaching French to native speakers of Portuguese, s/he might do better to focus on the pair /y/–/i/, as it has been shown that native speakers of Brazilian Portuguese tend to categorize both these vowels as /i/ (Rochet, 1995). In other words, those pairs which can be shown to result in single-category assimilation should receive priority in minimal pair training.

¹² A parallel can also be drawn here to the language-dependent use of visual information, such as lip rounding, accompanying auditory forms: speakers of languages in which visual information is relatively low, such as Japanese, have been shown to attend less to visual information and attune more to auditory information than speakers of a language with a high visual information load, such as English (Sekiyama et al., 2003). Similarly, speakers of a language with a more opaque orthographic system may attend less to orthographic information than speakers of a language with a more transparent orthography.

Secondly, the results of the vowel-categorization task in Experiment II revealed that AE listeners perceived the French vowels differently preceding velar, alveolar and bilabial consonants. The French vowel /y/, for instance, was categorized as AE /ɪ/ more often when preceding a bilabial than before an alveolar. When place of articulation of the coda consonant was held constant at alveolar, as in Experiment III, both /u/ and /y/ were


categorized as back rounded vowels. These results imply that the consonantal frames of minimal pairs are of great importance. If a person is, for instance, teaching French to native speakers of AE, s/he could start with (near-)minimal pairs with a bilabial in the coda (e.g. jupe [ʒyp] ‘skirt’ – loupe [lup] ‘magnifying glass’), in which the contrast is easier for native speakers of English to perceive, and then gradually move on to more difficult pairs, i.e. those with an alveolar coda consonant (e.g. tûtes [tyt] simple past, 2nd p. pl. of se taire, ‘to be silent’ – toute [tut] adj., ‘all, whole’). We suggest that starting with pairs in which learners can relatively easily perceive the contrast may considerably enhance learners’ confidence in their ability to distinguish between the novel L2 vowels. Once learners are aware that they hear the contrast in particular contexts, teachers can move on to presenting the contrast in a more difficult context, thereby reminding learners of their previous successful perception of the contrast (Dixo Leiff and Pow, 2000).

In sum, we argue for an approach which takes into account how learners with a specific L1 background perceive L2 vowels in particular phonetic contexts. Specifically, we suggest that, if minimal pairs are used in pronunciation training, the vowels should be embedded in different consonantal contexts, moving from contexts in which the contrast is easier to perceive to contexts which are more difficult for particular L1 speakers. While the present set of studies did not show a positive effect of orthography on the perception of novel vowels, we believe that one potential reason for this lack of effect may lie in the opaque orthographic system of the listeners’ L1, English. Further research with native speakers of a language with a transparent orthographic system and a single-category assimilation of a novel L2 vowel contrast should be carried out to test this hypothesis. 
While Escudero et al.’s (2008) recent eye-tracking study on the English /ɛ/–/æ/ contrast with L1 Dutch speakers revealed an effect of orthography, this effect was not replicated in the present word learning study on the French /u/–/y/ contrast with L1 AE listeners. It is thus clear that more work is needed to understand the role of orthography in the acquisition of an L2 sound system and that more studies with different methodologies and focusing on different contrasts are needed in order to grasp the complex interaction between orthography and phonological acquisition.

Acknowledgements

The authors would like to thank Joe Pater and John Kingston for feedback on the original design of the experiments and John Kingston for generously allowing the use of his lab for some recordings. They are grateful to all participants for their cooperation. The first author wishes to thank the Belgian American Educational Foundation for funding a research stay at the University of Massachusetts, Amherst, as well as the Fund for Scientific Research Flanders for a postdoctoral research grant. The third author would like to thank CAPES (Coordenação de Aperfeiçoamento Pessoal de Nível Superior) – Brazil for funding a stay at the University of Massachusetts, Amherst.

References

Best, C.T., 1994. The emergence of native-language phonological influences in infants: A perceptual assimilation hypothesis. In: Goodman, J.C., Nusbaum, H.C. (Eds.), The development of speech perception. MIT Press, Cambridge, MA, pp. 167–224.
Best, C.T., Tyler, M.D., 2007. Nonnative and second-language speech perception: Commonalities and complementarities. In: Munro, M.J., Bohn, O.-S. (Eds.), Second language speech learning: The role of language experience in speech perception and production. John Benjamins, Amsterdam, pp. 13–34.
Best, C.T., McRoberts, G.W., Goodell, E., 2005. Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener’s native phonological system. 
Journal of the Acoustical Society of America 109, 775–794.
Boersma, P., Weenink, D., 2009. Praat: Doing phonetics by computer [Computer program]. Retrieved from .
Bradlow, A., Pisoni, D.B., Akahane-Yamada, R., Tohkura, Y., 1997. Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. Journal of the Acoustical Society of America 101 (4), 2299–2310.
Brown, A., 1995. Minimal pairs: Minimal importance? Language Teaching Journal 49 (2), 169–175.
Dixo Leiff, C., Pow, E., 2000. Understanding and empowering the learner–teacher in pronunciation instruction: Key issues in pronunciation course design. Humanising Language Teaching 2.6 (retrievable from ).
Erdener, V.D., Burnham, D.K., 2005. The role of audiovisual speech and orthographic information in nonnative speech production. Language Learning 55 (2), 191–228.
Escudero, P., Hayes-Harb, R., Mitterer, H., 2008. Novel L2 words and asymmetric lexical access. Journal of Phonetics 36 (2), 345–360.


Flege, J.E., 1987. The production of ‘new’ and ‘similar’ phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics 15, 47–65.
Flege, J.E., 1989. Chinese learners’ perception of the word-final English /t/–/d/ contrast: Performance before and after training. Journal of the Acoustical Society of America 86 (5), 1684–1697.
Flege, J.E., Hillenbrand, J., 1984. Limits on phonetic accuracy in foreign language speech production. Journal of the Acoustical Society of America 76 (3), 708–721.
Gottfried, T.L., 1984. Effects of consonant context on the perception of French vowels. Journal of Phonetics 12, 91–114.
Iverson, P., Evans, B.G., 2007. Auditory training of English vowels for first-language speakers of Spanish and German. In: Proceedings of the sixteenth international conference of phonetic sciences (ICPhS), August 2007, pp. 1625–1628.
Jones, R., 1997. Beyond ‘listen and repeat’: Pronunciation teaching materials and theories of second language acquisition. System 25 (1), 103–112.
Kingston, J., 2003. Learning foreign vowels. Language and Speech 46, 295–349.
Levy, E.S., Strange, W., 2007. Perception of French vowels by AE adults with and without French language experience. Journal of Phonetics 36 (4), 141–157.
MacKain, K.S., Best, C., Strange, W., 1981. Categorical perception of English /r/ and /l/ by Japanese bilinguals. Applied Psycholinguistics 2, 369–390.
McCandliss, B.D., Fiez, J.A., Protopapas, A., Conway, M., McClelland, J.L., 2002. Success and failure in teaching the r-l contrast to Japanese adults: Predictions of Hebbian model of plasticity and stabilization in spoken speech perception. Cognitive, Affective, and Behavioral Neuroscience 2, 89–108.
Meunier, C., Frenck-Mestre, C., Lelekov-Boissard, T., Le Besnerais, M., 2003. Production and perception of foreign vowels: does the density of the system play a role? 
In: Proceedings of the fifteenth international congress of phonetic sciences (ICPhS), August 2003, Barcelona, Spain, pp. 723–726.
Nishi, K., Kewley-Port, D., 2007. Training Japanese listeners to perceive American English vowels: Influence of training sets. Journal of Speech, Language, and Hearing Research 50, 1496–1509.
Pater, J., 2003. The perceptual acquisition of Thai phonology by English speakers: task and stimulus effects. Second Language Research 19, 209–223.
Pisoni, D.B., Aslin, R.N., Perey, A.J., Hennessy, B.L., 1982. Some effects of laboratory training on identification and discrimination of voicing contrasts in stop consonants. Journal of Experimental Psychology: Human Perception and Performance 8, 297–314.
Pisoni, D.B., Lively, S.E., Logan, S.J., 1994. Perceptual learning of non-native speech contrasts. Implications for theories of speech perception. In: Goodman, J.C., Nusbaum, H.C. (Eds.), The development of speech perception. MIT Press, Massachusetts, pp. 121–166.
Rochet, B., 1995. Perception and production of second language speech sounds by adults. In: Strange, W. (Ed.), Speech perception and linguistic experience: Issues in cross-language research. York Press, Timonium, MD, pp. 379–410.
Sekiyama, K., Burnham, D., Tam, H., Erdener, D., 2003. Auditory-visual speech perception development in Japanese and English talkers. In: Proceedings of the international conference on audio-visual speech processing (AVSP), September 2003, St. Jorioz, France, pp. 43–47.
Strange, W., 1995. Cross-language studies of speech perception: A historical review. In: Strange, W. (Ed.), Speech perception and linguistic experience: Issues in cross-language research. York Press, Timonium, MD, pp. 3–45.
Strange, W., Dittman, S., 1984. Effects of discrimination training on the perception of /r-l/ by Japanese adults learning English. Perception and Psychophysics 36, 131–145.
Strange, W., Weber, A., Levy, E.S., Shafiro, V., Hisagi, M., Nishi, K., 2007. 
Acoustic variability within and across German, French, and AE vowels: Phonetic context effects. Journal of the Acoustical Society of America 122 (2), 1111–1129.
Tees, R.C., Werker, J.F., 1984. Perceptual flexibility: Maintenance or recovery of ability to discriminate non-native speech sounds. Canadian Journal of Psychology 38, 579–590.
Van den Bosch, A., Content, A., Daelemans, W., De Gelder, B., 1994. Measuring the complexity of writing systems. Journal of Quantitative Linguistics 1 (3), 178–188.
Wang, X., Munro, M.J., 2004. Computer-based training for learning English vowel contrasts. System 32, 539–552.