A perceptual basis for the systematic phonological correspondences between Japanese loan words and their English source words

Journal of Phonetics (1994) 22, 343-356

Naoyuki Takagi and Virginia Mann
School of Social Sciences, University of California at Irvine, Irvine, CA 92717, U.S.A.
(Received 17th March 1992, and in revised form 14th July 1993)

There exist systematic sound correspondences in Japanese loan words whose English origins contain a tense or lax vowel followed by a voiceless stop. For English words of the structure (C1)V1C2({C3/v2}), the length of Japanese loan word vowels and consonants corresponding to V1 (tense or lax) and C2 (voiceless stop) is predicted by the quality of V1 and any following elements. To test the hypothesis that these correspondences follow from perceptual assimilation, 18 native speakers of Japanese were presented with nonsense words uttered by two native speakers of American English, where C1 = /g/, V1 = /ɪ, i, ʊ, u/, C2 = /p, t, k/, C3 = /s, t/ and v2 = /ə/. The subjects chose which of four katakana representations sounded closest to each stimulus: S(hort)V + S(ingle)C, SV + G(eminate)C, L(ong)V + SC or LV + GC. The response patterns for CVC and CVC/s/ tokens were consistent with the experimental hypothesis in so far as they matched loan word correspondences and correlated with certain durational properties of the stimuli. However, the responses to CVC/t/ and CVCv tokens did not match the existing systematic correspondences, implicating factors other than perceptual assimilation.

1. Introduction

The Japanese language has borrowed many words from English. These words have been made consistent with Japanese phonology and have come to be used in everyday Japanese conversation. That certain loan words have become an integral part of the Japanese lexicon is evidenced not only by their Japanized pronunciation but also by their inclusion in compact-size Japanese dictionaries such as Shinmeikai Kokugo Jiten (a dictionary of approximately 70,000 entries, from which all the examples in this paper are taken; Yamada, 1981). Moreover, native speakers of Japanese learn the meaning of such loan words not by learning English but in the context of Japanese monolingual communication. For example, native speakers of Japanese learn that hitto refers to a hit in baseball without knowing that the word derives from the English word hit, just as native speakers of English learn what communication means without having learned Latin. (In the written language, these loan words are differentiated from surrounding text by being spelled out with katakana, the less frequently used of the two "syllabaries",1 but this in and of itself does not mark them as loan words, since katakana is also used for other purposes.)

0095-4470/94/040343 + 14 $08.00/0 © 1994 Academic Press Limited

TABLE I. Examples of Japanese loan words as a function of V1 quality and phonological environment

      Environment   Lax                       Tense
(a)   V1C           hit-hitto        VCC      beet-biito       VVC
(b)   V1C/s/        box-bokkusu      VCC      boots-buutsu     VVC
(c)   V1C/t/        duct-dakuto      VC       None
(d)   V1Cv          butter-bataa     VC       heater-hiitaa    VVC
                    batter-battaa    VCC

This investigation concerns Japanese loan words whose original English words contain lax and tense vowels followed by a voiceless stop. It limits itself to monosyllabic and disyllabic English words having the phonological structure of (1).

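$$(C_1)\,V_1\,C_2\left(\begin{Bmatrix} C_3 \\ v_2 \end{Bmatrix}\right) \tag{1}$$

where C1 = any consonant; V1 = stressed vowels, either tense (/i/, /u/) or lax (/ɪ/, /ʊ/, etc.); C2 = voiceless stops /p/, /t/, /k/; C3 = /s/ or /t/; v2 = unstressed vowels (lower case v); and parenthesized elements are optional. Both Japanese and American linguists have noted certain systematic phonological transformations involving such words (Kunihiro, 1963; Ohso, 1971; Lovins, 1975). These are illustrated in Table I and can be summarized by the following three rules:

Rule (1):

$$\text{lax } V_1 \rightarrow V, \qquad \text{tense } V_1 \rightarrow VV$$

Rule (2):

$$C_2 \rightarrow C \;/\; \text{tense } V_1 \,\underline{\hspace{1em}}\, \left(\begin{Bmatrix} C_3 \\ v \end{Bmatrix}\right)$$

Rule (3):

$$\text{(a)} \quad C_2 \rightarrow CC \;/\; \text{lax } V_1 \,\underline{\hspace{1em}}\, (\text{/s/})$$

$$\text{(b)} \quad C_2 \rightarrow C \;/\; \text{lax } V_1 \,\underline{\hspace{1em}}\, \text{/t/}$$

$$\text{(c)} \quad C_2 \rightarrow \begin{Bmatrix} C \\ CC \end{Bmatrix} \;/\; \text{lax } V_1 \,\underline{\hspace{1em}}\, v$$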
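As a reading aid, rules (1)-(3) can be restated as a small lookup function. This is only an illustrative sketch: the function name `predicted_categories` and the ending labels "C#", "C/s/", "C/t/" and "Cv" are assumptions introduced here and are not notation from the cited authors. The sketch applies rule (2) literally to the tense V1 + C + /t/ environment, even though Table I lists no attested loan word of that shape.

```python
def predicted_categories(vowel_quality, ending):
    """Return the set of length patterns that rules (1)-(3) predict.

    vowel_quality: "lax" or "tense" (quality of English V1)
    ending: "C#" (word-final stop), "C/s/", "C/t/" or "Cv"
    (these labels are illustrative, not the authors' notation)
    """
    if vowel_quality not in {"lax", "tense"}:
        raise ValueError("vowel_quality must be 'lax' or 'tense'")
    if ending not in {"C#", "C/s/", "C/t/", "Cv"}:
        raise ValueError("unknown ending: %r" % ending)

    # Rule (1): lax V1 -> short vowel (V); tense V1 -> long vowel (VV).
    vowel = "V" if vowel_quality == "lax" else "VV"

    if vowel_quality == "tense":
        # Rule (2): a single consonant after a tense vowel in every environment.
        consonants = ["C"]
    elif ending in {"C#", "C/s/"}:
        # Rule (3a): a geminate after a lax vowel, word-finally or before /s/.
        consonants = ["CC"]
    elif ending == "C/t/":
        # Rule (3b): a single consonant after a lax vowel before /t/.
        consonants = ["C"]
    else:
        # Rule (3c): single or geminate after a lax vowel before an unstressed vowel.
        consonants = ["C", "CC"]

    return {vowel + c for c in consonants}
```

For instance, `predicted_categories("lax", "C#")` gives `{"VCC"}`, which matches hit-hitto in Table I, while the lax "Cv" case returns both VC and VCC, matching butter-bataa and batter-battaa.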

1 The Japanese writing system includes three subsystems: kanji, a system of about 2000 characters borrowed from Chinese logographs, which are used to write Sino-Japanese words and many native Japanese nouns and verb roots; hiragana, a syllabary-like set of graphemes which is used primarily to spell out the moras of verbal inflexions and other grammatical morphemes; and katakana, an alternative set of the same number of graphemes for moras, which is used primarily to spell out more recent loan words from languages such as English, but also to write some native Japanese nouns in some contexts, such as plant and animal names in scientific writing.

Here, V and VV represent Japanese short and long vowels, and C and CC represent Japanese single and geminate consonants. As noted by Ohso (1971) and Lovins (1975), Japanese vowels and consonants corresponding to English V1 and C2 are for the most part predictable given the quality of V1 (lax or tense) and the phonological environment in which C2 occurs. The sole exception is stated in (3c). When C2 follows a lax V1 and precedes an unstressed vowel, it can correspond either to a single (C) or geminate (CC) consonant.

The purpose of this study is to examine the basis of the correspondence described by rules (1)-(3). Best and her colleagues offer a framework which clarifies the perceptual phenomenon that could underlie this and other inter-language phonological correspondences (Best, McRoberts & Sithole, 1988; Best & Strange, 1992). In their view the processing of foreign speech sounds "entails the perceptual assimilation of incoming speech sounds to the phonemic categories of the native language whenever possible" (Best et al., 1988: 347). Our hypothesis is that rules (1)-(3) follow from the perceptual assimilation which takes place when native speakers of Japanese hear English words of the form (C1)V1C2({C3/v2}). This possibility has been discussed or alluded to by several different authors.

A perceptual basis for rule (1) may be found in the durational differences between English lax and tense monophthongs. Although the opposition between English lax vowels /ɪ/, /ʊ/ and tense vowels /i/, /u/ is realized mainly in terms of vowel quality, lax vowels are well known to be shorter than tense ones (Jones, 1960; Peterson & Lehiste, 1960; Crystal & House, 1982, 1988). In Japanese, durational differences are utilized phonemically to distinguish both vowels and consonants; long vowels are literally longer in duration than short ones, while vowel quality remains more or less the same (Han, 1962; Homma, 1981; Beckman, 1982). Experience with this durational distinction could lead to a tendency to perceptually assimilate tense vowels as long, and lax vowels as short.

Many researchers have suggested a perceptual basis for the correspondences in rule (3a), where English voiceless stops follow lax vowels (Kunihiro, 1963; Ohso, 1971; Lovins, 1975; Takebayashi, 1982). The essence of their argument is that English word-final voiceless stops after lax vowels have, to a Japanese ear, an auditory impression which is closer to geminate consonants than to single ones. This tacitly implies that English voiceless stops after tense vowels sound like single consonants, and hence rule (2). It has been widely accepted in the literature that duration is an important cue not only for the perceived distinction between short and long vowels but also for the difference between short and long (i.e., geminate) consonants in Japanese (e.g., Fujisaki, Nakamura, and Imoto, 1973). Presumably, voiceless stops after lax vowels have a longer closure duration than those after tense vowels, hence the correspondences of rules (2) and (3a).

Any explanation of gemination in Japanese loan words is somewhat complicated by rules (3b) and (3c), which indicate that phonological environment can influence the gemination patterns involving English voiceless stops after lax vowels. In (3b), voiceless stops before /t/ always correspond to single stops, not to geminates as in (3a). In (3c), voiceless stops before unstressed vowels can correspond to either Japanese single or geminate consonants. For example, Shinmeikai Kokugo Jiten contains 26 loan words whose original English words have the particular phonological pattern described in rule (3c). C2 corresponds to a single consonant in eight of these and to a geminate in the remaining 18.

Ohso (1971: 32, cited in Lovins, 1975: 93-94) suggests the possibility that rule (3c) reflects a certain perceptual ambiguity. She attributes the bivalent correspondence to the ambisyllabic nature of inter-vocalic stops; the inter-vocalic /t/ in butter, for example, can be "perceived" as a part of the first syllable (/bʌt + ɚ/) or of the last syllable (/bʌ + tɚ/) by native speakers of Japanese. In the first case, she comments, the word is "realized as battaa with a geminate", and in the latter it is "bataa with a single t".

At this point, one might wonder why ambisyllabicity does not have an analogous consequence when preceding vowels are tense as in rule (2). In this particular environment, English voiceless stops always correspond to Japanese single stops even when followed by unstressed vowels. According to Lovins (1975), ambisyllabicity is not an issue because a long vowel followed by a geminate consonant "is prohibited in Japanese" (p. 84). Since English tense vowels correspond to long vowels in Japanese, this prohibition constrains the following consonant to a single stop. Note that such phonological constraints cannot explain all of the regularities in rules (1)-(3), because they do not exist when the English source vowel is lax. That is, for English hit, the Japanese counterpart could have been /hito/, rather than /hitto/, as this is a perfectly legitimate sound sequence in Japanese.2

We note also that there are certain problems even with Lovins' account of the regularity of rule (2) for tense vowels. First, while such forms are exceedingly rare, a geminate consonant does follow a long vowel in the past forms of certain verbs as in kootta ('froze') and in certain adverbial phrases as in paatto ('swiftly'). Furthermore, when Lovins' subjects were presented with a natural utterance of /kit/ and asked to represent what they heard, /kiitto/ (i.e., a geminate consonant after a long vowel) was one of the spontaneously generated responses (Lovins, 1975: 130). This certainly indicates that a voiceless stop after a tense vowel can be perceived as a long vowel followed by a geminate consonant, and an empirical test of this possibility is pertinent to Ohso's explanation of (3a).

One relatively straightforward test of the possibility that perceptual assimilation underlies rules (1)-(3) is to present native speakers of Japanese with English utterances whose phonological structure conforms to (1) and ask them to represent each utterance by spelling it out in katakana. Such was the procedure employed by Lovins (1975) in an experiment which used four native speakers of Japanese as subjects. However, her materials contained only a few instances of the systematic correspondences described by rules (1)-(3). For example, fix and cracker were the only instances of lax V1 + C2 + /s/ and lax V1 + C2 + v words, respectively. It was also the case that her stimuli were real English words read aloud to the subjects. The use of real words raises the possibility that at least some of her subjects could have been influenced by their knowledge of English. The manner of presentation also makes it difficult to determine any perceptual basis for the subjects' transcription responses because there is no means of verifying that the tokens of lax vowels were physically shorter than the tense ones, etc., or of relating acoustic structure to transcription behavior.

2 One might also suspect that the English spelling may contribute to the observed sound correspondences. Although such possibilities are suggested by Lovins (1975), this does not seem to be the case for the sound correspondences discussed in this paper. For example, 'tt' in butter and batter corresponds to a single consonant in the first, and to a geminate consonant in the latter case, as in bataa and battaa (Table I).

Our research is intended to remedy these problems and thereby complement Lovins' work on the correspondences described by rules (1)-(3). In seeking to substantiate the perceptual basis of those rules, we have employed a larger stimulus set consisting of pre-recorded nonsense words produced by two different speakers. We also have employed a larger pool of subjects who differ in their knowledge of English. Thus, we may more systematically control for knowledge of English as we more subtly examine the relation between the phonological and durational properties of the English utterances and the transcription behavior of Japanese subjects. On the other hand, we have limited our study to the correspondences in these rules, and specifically to forms where the C2 that alternates between single and geminate is a stop, as shown above in Table I. Lovins (1975) also studied gemination of other classes of sounds, such as fricatives and voiced stops, occurring in other phonological environments. Before extending the scope of our investigation, however, we decided to study a limited range of sound correspondence very carefully.

In the experiment reported below, native speakers of Japanese were presented with nonsense English words whose syllable structure conformed to (1). Our materials sampled all of the loan word patterns described in (1), using /g/ as C1; /ɪ/ and /ʊ/ as lax V1; /i/ and /u/ as tense V1; /p, t, k/ as C2; /s, t/ as C3; and /ə/ as an optional final unstressed vowel. The subjects were asked to choose which of four Japanese katakana spellings best represented each stimulus. For example, they heard a native English speaker's utterance of /gɪk/, and chose among the four katakana representations for /giku/ (VC), /gikku/ (VCC), /giiku/ (VVC) and /giikku/ (VVCC). Rule (1) predicts that the lax vowel /ɪ/ will be assimilated to the Japanese short vowel /i/, and rule (3a) predicts that the final voiceless stop /k/ will be assimilated to the Japanese geminate consonant /kk/, making /gikku/ the most frequent response. In general, it was expected that the regular pattern of correspondences described by rules (1)-(3) would emerge as the consequence of the Japanese listeners' perceptual assimilation of our English nonsense utterances. In the exceptional case of rule (3c), where both single and geminate consonants correspond to C2 of the English lax V1 + C2 + v2 sequence, we expected C2 to be assimilated to both single and geminate consonants.

A second goal was to determine whether certain durational properties of the test stimuli were related to the pattern of transcription responses. To this end, we investigated the possible acoustic correlates in a subset of our materials and their relation to subjects' responses. One question was whether, on the average, the lax and tense vowel tokens actually differed in duration. A second question was whether, on the average, the final consonant is "longer" after lax vowels (i.e., exhibits a longer closure duration). A third was whether durational differences between individual tokens correlate with the pattern of transcription responses.

2. Experiment

2.1. Subjects

All subjects were native speakers of Japanese. The "less exposure" (LE) group consisted of 12 high school and college students living in Japan, all of whom served as paid volunteers, receiving 500 yen (approximately $4). They had never lived in a country where English is spoken as a primary language of communication. Although, like all Japanese students, they had gone through 3-4 h of English classes every week in junior high and high school, most of their English teachers were non-native speakers of English and classroom instruction was predominantly based on what has been called a grammar-translation method. This method focuses on a conscious knowledge of English grammar and translation skills rather than oral communicative abilities. Thus, while most of the "less exposure" subjects were able to read English sentences with a dictionary in their hands and translate them into Japanese, their ability to understand English utterances spoken at a normal rate and to express themselves orally was minimal. Although eight subjects had taken classes from native speakers of English, it was at most one hour per week, and the demand to express themselves in English was far lower than that faced by the "more exposure" (ME) group of subjects.

The "more exposure" (ME) group consisted of six native speakers of Japanese who served as unpaid volunteers. Five of them had studied at least one full academic year in the United States. The remaining subject had a sound working knowledge of phonetics and had spent a year in Shanghai, where she used English quite often as a means of communication, especially with her English-speaking roommate.

2.2. Materials

The test stimuli were 44 nonsense syllables conforming to the syllable structures described in (1). The complete list of test stimuli is found in the Appendix. We chose /g/ as the initial consonant C1 to minimize the occurrence of meaningful English words in the test stimuli. This initial consonant was followed by one of four vowels: lax /ɪ/ or /ʊ/, or tense /i/ or /u/. The following consonant C2 was either /p/, /t/, or /k/, the optional C3 was either /s/ or /t/, and /ə/ was used as the optional unstressed vowel v2. Since the consonant cluster /tt/ is not allowed in English in word-final position, only /p/ and /k/ appeared as C2 before /t/. The stimuli contained effectively no meaningful English words. The only exception was /gik/, a word which has not established itself as a loan word in Japanese. This excluded the possibility that the subjects could rely either on their knowledge of English spelling patterns, or on their knowledge of loan words corresponding to the test tokens.

For the purpose of testing, two native speakers of American English, one male (Speaker CC) and one female (Speaker DM), read each of the 44 tokens twice in a randomized order. This allowed us to introduce variability between as well as within speakers, and to add to the generalizability of our results. The speakers were asked to release the final stops in those nonsense syllables ending with a stop. The randomization was different for each speaker, and the male speaker also read four additional tokens to be used in practice trials. Each token was cued by a serial number read in Japanese by the first author. The utterances were recorded in a soundproof chamber using a TASCAM 133 cassette deck and a SONY ECM-150T microphone.

To determine the durational properties of the test utterances, each stimulus was low-pass filtered at 5 kHz and digitized at a sampling rate of 10 kHz. The ILS signal processing program was used to obtain the PCM waveform, which was used to measure the vocalic and closure durations. The vocalic duration was defined as the duration from the release of the initial consonant (marked by a burst) to the cessation of periodicity (reflecting vocal cord vibration). The closure duration of the following consonant was defined as the duration from the onset of the silent portion to the final stop burst. Table II presents the mean and standard deviation of the vocalic and closure durations as a function of vowel quality (lax and tense), speaker identity (CC and DM) and phonological environment.

TABLE II. Mean and standard deviation (in parentheses) of vocalic duration and closure duration (ms) as a function of vowel quality, speaker identity and syllable ending

                     Lax                       Tense
Ending   Speaker     Vowel       Closure       Vowel       Closure
C#       CC          146 (12)    183 (24)      181 (16)    153 (32)
         DM          145 (11)    76 (11)       181 (16)    68 (19)
C/s/     CC          156 (12)    161 (17)      171 (18)    153 (31)
         DM          130 (18)    87 (16)       159 (21)    78 (20)
C/t/     CC          141 (17)    141 (38)      158 (15)    124 (24)
         DM          131 (12)    77 (10)       147 (18)    110 (35)
Cv       CC          118 (10)    151 (22)      150 (12)    114 (23)
         DM          122 (14)    75 (16)       143 (15)    68 (15)

As expected, for each speaker the vocalic duration was longer for tokens containing tense vowels than for those containing lax vowels, consistent with previous measurements of real English words (Peterson & Lehiste, 1960; Crystal & House, 1982, 1988). As predicted, the closure duration was slightly longer after lax vowels than after tense vowels, with the exception of Speaker DM's CVC/t/ tokens. In general, Speaker DM's stimuli had much shorter closure durations than those produced by Speaker CC.

The response alternatives for each item were the four katakana representations corresponding to all possible combinations of vowel and consonant length: short + single (VC), short + geminate (VCC), long + single (VVC) and long + geminate (VVCC). For each stimulus, the four choices were printed on a response sheet following the serial number assigned to the stimulus, and the subjects were required to circle the item which best represented that stimulus.
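The stimulus inventory and the four response alternatives described in this section can be enumerated mechanically. The sketch below is illustrative only: the authoritative list is the one in the Appendix, and the broad transcriptions generated here (for example "gɪk"), the ending labels and the names `build_stimuli` and `RESPONSES` are assumptions made for this example.

```python
from itertools import product

# Vowels and stops as described in the Materials section.
VOWELS = {"lax": ["ɪ", "ʊ"], "tense": ["i", "u"]}
STOPS = ["p", "t", "k"]


def build_stimuli():
    """Return (vowel quality, ending type, transcription) for every token type."""
    stimuli = []
    for quality, vowels in VOWELS.items():
        for v1, c2 in product(vowels, STOPS):
            stimuli.append((quality, "C#", "g" + v1 + c2))
            stimuli.append((quality, "C/s/", "g" + v1 + c2 + "s"))
            if c2 != "t":
                # Word-final /tt/ is not allowed in English, so only
                # /p/ and /k/ occur before /t/.
                stimuli.append((quality, "C/t/", "g" + v1 + c2 + "t"))
            stimuli.append((quality, "Cv", "g" + v1 + c2 + "ə"))
    return stimuli


# The four response alternatives: vowel length crossed with consonant length.
RESPONSES = [v + c for v, c in product(["V", "VV"], ["C", "CC"])]

STIMULI = build_stimuli()
TOKENS_PER_SPEAKER = 2 * len(STIMULI)  # each token type was read twice
```

Each vowel contributes 3 + 3 + 2 + 3 = 11 token types, so the four vowels give the 44 stimuli mentioned above, and two readings per speaker give 88 tokens per speaker.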

2.3. Procedure

The experiment was conducted at three different locations in Japan and two in the United States using conventional cassette tape players. In order to run more than one subject at a time, the stimuli were presented through loudspeakers at a comfortable listening level. Subjects were not informed that the nonsense words were possibly from English. They were told only to choose the item which best represented each stimulus. After hearing and responding to the four practice items, the subjects were first presented with the 88 stimuli recorded by one speaker. After a short break, they then heard the 88 stimuli recorded by the other speaker. The whole presentation took less than 30 minutes. The speed of presentation was controlled by the experimenter so as to allow each subject sufficient time to make a response.

2.4. Perceptual results

Table III summarizes the pattern of the subjects' transcription responses. The mean relative frequency of each of the four choice items is broken down according to the vowel (V1) quality and the four different ending patterns of the English tokens. The starred entries indicate the responses that would follow from the systematic correspondences described by rules (1)-(3). The data from the 12 subjects in the "less exposure" group appear separately from the 6 subjects in the "more exposure" group.

TABLE III. Mean relative frequency (%) of choice items as a function of V1 and syllable ending for the two subject groups. An asterisk indicates responses predicted by rules (1)-(3)

            V1 [lax]                            V1 [tense]
Ending      VC      VCC     VVC     VVCC        VC      VCC     VVC     VVCC
Less exposure group
C#          9.4     73.8*   6.9     10.4        7.6     22.2    52.1*   18.1
C/s/        4.5     78.8*   3.8     12.8        2.8     32.6    46.9*   17.7
C/t/        8.9*    70.8    6.8     13.5        9.4     37.5    32.3    20.8
Cv          24.3*   49.7*   8.0     18.1        37.5    17.7    30.9*   13.9
More exposure group
C#          1.4     88.9*   9.0     0.7         0       4.9     87.5*   7.6
C/s/        0       93.1*   2.8     4.2         0.7     13.2    79.2*   6.9
C/t/        8.3*    78.1    7.3     6.3         3.1     14.6    70.8*   11.5
Cv          2.8*    72.2*   1.4     23.6        34.0    20.8    19.4*   25.7

2.4.1. CVC and CVC/s/ tokens

For the English test tokens that end with C and C/s/ (the first two rows for each group), we expected that VCC and VVC responses would be most frequent for lax and tense vowel tokens, respectively, if the systematic sound correspondences are the product of perceptual assimilation. This prediction was confirmed. Both LE and ME subjects chose VCC most frequently for the lax vowels and VVC for the tense vowels. For these two types of ending, the mean relative frequencies of the predicted categories are represented graphically in Fig. 1 for each of the eight factorial combinations of vowel quality, subject group and syllable ending.

Figure 1. Mean relative frequency of predicted categories. (Plot not reproduced: separate lines for LE:C, LE:C/s/, ME:C and ME:C/s/, plotted for Lax V and Tense V on a vertical scale running from 40 to 100.)

A three-way ANOVA on the proportion of "predicted" responses after applying an arcsine transformation revealed a highly significant main effect of subject group [F(1, 16) = 20.04, p < 0.001]. The subjects with greater exposure to English (points connected with broken lines in the figure) chose the predicted items more systematically, although the most frequent categories were identical across the two subject groups. A significant main effect of vowel quality was also found [F(1, 16) = 14.31, p < 0.002], indicating that the tokens with lax vowels favored subjects' choice of the predicted category (VCC) more consistently than the tense vowel tokens favored VVC. This may be due to the fact that the vocalic duration contrast between lax and tense vowels in English is not as large as the contrast found in Japanese between short and long vowels, which is approximately 1:2 (Han, 1962; Homma, 1981; Beckman, 1982). In our experiment, the mean vocalic duration for the tense vowels of each speaker is considerably less than twice the mean vocalic duration for their lax vowels (lax:tense = 148:176 ms for CC and 138:170 ms for DM, i.e., tense-to-lax ratios of roughly 1.2), as was true in the data of Peterson & Lehiste (1960) and Crystal & House (1982, 1988). This could explain our observation that there were many more short vowel responses (VC and VCC) for the tense vowel tokens than there were long vowel responses to the lax vowels, and hence a lower percentage of predicted responses for the tense vowel stimuli.

The interaction between vowel quality and syllable ending was also significant [F(1, 16) = 7.70, p < 0.014]. When the test tokens contained lax vowels, the C/s/ ending led to a higher percentage of the predicted category, but when the tokens contained tense vowels, the C ending led to a higher percentage of the predicted category. The interaction between subject group and vowel quality was marginal [F(1, 16) = 4.06, p < 0.061]. None of the other higher order interactions attained significance.

2.4.2. CVC/t/ tokens

Subjects' responses to CVC/t/ tokens were particularly interesting; the lax vowel data did not agree with the predictions of rule (3b), and the tense vowel data of the ME subjects were systematic despite the lack of loan words having this structure. When the English vowel was lax, both groups chose VCC most frequently. Response VC, which reflects the systematic correspondence described by rules (1) and (3b), was chosen in fewer than 10% of the trials.
This indicates that not all the English utterances that have a lax vowel followed by two successive stops are assimilated to the Japanese sequence of a short vowel followed by a single consonant (VC). Moreover, the tendency to perceive VCC instead of VC suggests that our subjects were not generally responding on the basis of their knowledge of the systematic correspondences between English words and Japanese loan words described in Table I. Otherwise, VC should have been the most frequent category for this type, just as the predicted category was for C and C/s/ after lax vowels.

When the English vowel was tense, the most frequent category differed between the two groups. The LE subjects chose VCC somewhat more frequently than VVC, whereas the ME subjects chose VVC systematically. As noted in Table I, to our knowledge there are no Japanese loan words whose original English words end with two successive voiceless stops preceded by a tense vowel.3 The fact that the ME subjects' response pattern was nonetheless systematic suggests that their response behavior in general was not governed by conscious knowledge of the systematic correspondences. However, the response pattern difference between the LE and ME groups indicates that exposure to authentic spoken English can somehow change the way in which English utterances are assimilated to Japanese phonemes.

2.4.3. CVCv tokens

As is shown in (3c), English words with a lax vowel followed by a stop plus an unstressed vowel give rise to an anomalous one-to-two correspondence. If Ohso's ambisyllabicity account of the alternation between VC and VCC for this type of disyllable is true, the response to a particular token should be either VC or VCC, depending on whether the medial consonant was perceived as belonging to the following syllable or to the first syllable. Thus, these two responses together should be most frequent, and should show a similar distribution across the subject groups, since all subjects were listening to the same tokens.

The most frequent response category was VCC for both groups. However, the absolute frequency of the VCC response differed substantially between the two groups, and the next most frequent response also differed. It was the expected VC for the LE subjects, but VVCC for the ME subjects. Since VC was chosen in only 2.8% of the trials by the latter subjects, the tokens presented here were not ambiguous between VC and VCC interpretations for the ME subjects, disconfirming Ohso's ambisyllabicity hypothesis. Furthermore, although VVCC was only the third most frequent category for the LE subjects, it still accounted for 18.1% of the responses, nearly as many as the 23.6% for the ME subjects, and far more than the VVC responses by either group. That VVCC was chosen so much more often than VVC indicates that, contrary to Lovins' account of the regular correspondence patterns, the phonotactic constraint against this syllable structure is not strong enough to override the perception of a long vowel followed by a geminate for these English disyllables.

For the tense vowel tokens ending in Cv, the most frequent category was not the predicted VVC, but VC for both groups. Note that the relative frequency of the most frequent response category was low compared to the other cases. The next most frequent category was the predicted VVC for the LE group (30.9%), but VVCC for the ME group (25.7%), once again showing that the subjects could perceptually parse a syllable with a geminate consonant following a long vowel despite the phonotactic constraint against such syllables. Again, the frequency of this type of phonotactically dispreferred response, combined with the lack of any clear pattern of preference for responses in accord with the correspondences in rules (1)-(3) for this frequent loan source type, suggests that in general subjects were not governed by conscious knowledge of the systematic correspondences in their choice of response.

3 Two voiceless stops do not follow a tense vowel in English except in the inflected forms of regular verbs as a result of a historical sound change. Moreover, past and past participle forms of verbs such as steeped are seldom borrowed into Japanese. In Shinmeikai Kokugo Jiten there were five such cases (e.g., sutendo gurasu for stained glass), but none of them were from English words that had two word-final voiceless stops after a tense vowel.

2.5. The relation between acoustic structure and perceptual responses

The subjects' responses to CVC and CVC/s/ tokens did not exhibit the group differences observed in the responses to CVC/t/ and CVCV tokens (i.e., the most frequent response categories were the same for the two groups of subjects), and therefore seemed less dependent on knowledge of English. This being the case, we decided to investigate the relation between durational cues and the subjects' long vowel and geminate consonant responses for these stimuli. Pearson correlations were computed between vocalic or closure durations or their ratios (VD/CD), as one variable, and the relative frequencies of long vowel (VV) or geminate consonant (CC) responses as the other. The response data were stated as the number of subjects who gave VV or CC responses divided by the total number of subjects, after the arcsine transformation. The analyses combined the tokens of the two speakers but were conducted separately for the lax and tense vowel tokens. The results are presented in Table IV.

TABLE IV. Correlations between relative frequency of long vowel (LV)/geminate consonant (GC) perception (arcsine transformation) and vocalic duration (VD) and closure duration (CD) and their ratio (VD/CD) for lax and tense vowel tokens. n = 48; an asterisk indicates that the correlation is significant at α = 0.008 (level adjusted for multiple comparisons, two-tailed)

              Lax V               Tense V
           VV       CC         VV       CC
VD        0.24    -0.04       0.31    -0.24
CD       -0.47*    0.43*     -0.71*    0.75*
VD/CD     0.55*   -0.45*      0.59*   -0.56*

It was found that closure duration tended to have the highest correlations with the subjects' relative frequencies of both long vowel and geminate consonant responses. Longer closure duration favored more geminate consonant responses and fewer long vowel responses (hence the negative correlation). It is perhaps not surprising that vocalic duration had such low correlations with the relative frequencies of long vowel responses. Tables II and III show that the largest source of variability in vocalic duration and in frequency of long vowel responses is the contrast between lax and tense vowels, a contrast which has been eliminated by doing separate analyses for the two vowel types. (We ran separate analyses precisely so that the large differences between lax and tense vowels would not swamp any effects of the different input consonant coda types that are under consideration here.) On the other hand, Table II also shows that the range of variation in vocalic duration across ending types within a given vowel type is as large as (and complementary to) the variation in following closure duration. This complementary distribution of vocalic and closure durations may explain the significant correlations involving the VD/CD ratio. These correlations suggest that, within a given vowel type, there might be some scaling of vocalic duration to closure duration; that is, a shorter closure duration may make the preceding vowel sound longer.
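The correlation analysis described in this section can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the form of the arcsine transformation, arcsin(sqrt(p)), is an assumption (the text does not say which variant was used), the function names are hypothetical, and the durations and response counts are invented for demonstration only.

```python
import math


def arcsine_transform(p):
    """Return arcsin(sqrt(p)) for a proportion p in [0, 1] (assumed variant)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Invented example values, NOT data from the experiment.
N_SUBJECTS = 18                                  # number of listeners
closure_ms = [60.0, 75.0, 90.0, 110.0, 130.0]    # closure duration per token
geminate_counts = [2, 5, 9, 13, 16]              # listeners giving a CC response

# Convert counts to proportions, transform, then correlate with duration.
cc_scores = [arcsine_transform(c / N_SUBJECTS) for c in geminate_counts]
r_cd_cc = pearson_r(closure_ms, cc_scores)
print(f"r(CD, CC) = {r_cd_cc:.2f}")
```

In the invented data, geminate responses rise with closure duration, so the coefficient is positive, which is the direction reported for the CD row of Table IV. Swapping in vocalic duration or the VD/CD ratio as the first argument would give the other rows.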

3. Discussion

The present research was motivated by the systematic phonological correspondences between English words and their Japanese loan word counterparts summarized by rules (1)-(3). Its primary goal was to test the hypothesis that perceptual assimilation of English sounds into Japanese phonemes gave rise to these systematic correspondences. This hypothesis was most strongly confirmed for the CVC and CVC/s/ tokens. The lax vs. tense vowels in those stimuli systematically corresponded to short vs. long vowels in the response choices, and the stops after lax vs. tense vowels corresponded to geminate vs. single consonants, respectively. This matches the systematic pattern of correspondence presented in Table I, consistent with the possibility that these correspondences are a product of perceptual assimilation.

The present data also revealed that the subjects' vowel and consonant length judgments are correlated to some extent with the durational structure of the stimuli. For example, a significant portion of the variance in individual responses to utterances containing tense (50%) and lax (22%) vowels can be accounted for by closure duration. The significant correlations between the durational properties and the subjects' responses do indicate that these responses were based on perceptual judgment, not on the subjects' conscious knowledge of how English sounds should be represented. Interestingly, however, vocalic duration was not highly correlated with the vowel length judgments, although both the closure duration and the ratio of vocalic to closure duration were. This suggests that vowel length judgment was based more on the relative duration of the vocalic segment than upon absolute duration.
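The percentages of variance quoted above are consistent with squaring the correlations between closure duration and long vowel (VV) responses in Table IV. A minimal sketch of that arithmetic follows, assuming those are the coefficients the percentages were derived from; the variable names are illustrative.

```python
# Correlations between closure duration (CD) and long vowel (VV) responses,
# copied from Table IV. Assumption: these are the coefficients behind the
# quoted percentages of variance.
r_cd_vv = {"tense": -0.71, "lax": -0.47}

# The proportion of variance accounted for is the squared correlation.
variance_explained_pct = {
    vowel_type: round(r * r * 100) for vowel_type, r in r_cd_vv.items()
}
print(variance_explained_pct)
```

The sign of the correlation drops out when squaring, so the percentages say how much variance closure duration accounts for but not the direction of the relation.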
Given the moderate correlations between the durational cues in the stimuli and the vowel and consonant length judgments, further research is needed to identify other acoustic cues involved in this perceptual phenomenon (vowel quality is an apparent candidate) and the way they interact with each other. To achieve this end, it might be desirable to use synthesized speech tokens with the relevant acoustic parameters carefully manipulated.

The results for the CVC and CVC/s/ tokens conform to the correspondences in Table I, and are consistent with the possibility that English hit, for example, came to be represented as hitto in Japanese because it sounded as though it contained the Japanese short vowel /i/ followed by a geminate consonant, whereas beet came to be represented as biito because it sounded like it contained a long vowel followed by a single consonant. In contrast, our subjects' responses to the CVC/t/ and CVCV tokens do not conform to the systematic correspondences in Table I, and this suggests that the systematic phonological correspondences found for CVC/t/ and CVCV English words may not be direct consequences of perceptual assimilation.


Nonetheless, the correspondences involving English CVC/t/ and CVCV words appear to be too systematic to be a product of chance. As Lovins (1975) notes, some of these correspondences, especially when the English vowel is tense, may be related to the Japanese phonological constraint that a geminate consonant is infrequent after a long vowel. The fact that our subjects chose VVCC so often as a response indicates that the phonotactic constraint does not preclude the perceptual parsing of these forms as having a geminate consonant following a long vowel. However, it is still possible that a constraint on syllable structure influences some higher-level decisions as to how English utterances should be transcribed. Ohso (1971) offered an ambisyllabicity account of the anomalous one-to-two phonological correspondence in rule (3c) involving CVCV words containing lax vowels. We have modest support for that account insofar as the LE group tended to respond VC or VCC as predicted. If we argue that native speakers of Japanese with limited command of English were responsible for the introduction of English words such as butter and batter, ambisyllabicity could account for the bivalence in rule (3c).

In closing, however, we would like to suggest another possible source of influence on loan word correspondences, namely, a conscious effort to achieve morphological consistency. It is conceivable that some of the regularization of the Japanese spellings of English words such as heat and heater was done by individuals who were knowledgeable about English morphology. Given heat + er, for example, if a Japanese speaker knew that this word derives from heat, which sounded like /hiito/, he or she might have concluded that heater should be represented as /hiitaa/ with a long vowel, whereas our results suggest that heater may sound more like /hitaa/. By similar analogy, /kattaa/ (for cutter) came to contain a geminate consonant because cut sounds like /katto/, and cutter is derived from cut.
However, /bataa/ (for butter) did not come to contain a geminate, since this word does not derive from but. It is certainly true that there are exceptions; for example, putter is derived from put, yet /pataa/ contains a single consonant. Given the many decades during which Japanese has been borrowing from English, some of these differences may be due to different periods of incorporation into the language or different source dialects. However, to investigate such an explanation would take years of philological detective work, whereas possible morphological influences could be examined by a dictionary search in conjunction with transcription experiments of the sort reported in this paper.

We thank Mary Beckman, Court Crowther and our anonymous reviewers for their constructive input.

References

Beckman, M. (1982) Segment duration and the "mora" in Japanese, Phonetica, 39, 113-135.
Best, C. T., McRoberts, G. W. & Sithole, N. M. (1988) Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants, Journal of Experimental Psychology: Human Perception and Performance, 14, 345-360.
Best, C. T. & Strange, W. (1992) Effects of phonological and phonetic factors on cross-language perception of approximants, Journal of Phonetics, 20, 305-330.
Crystal, T. H. & House, A. S. (1982) Segmental durations in connected speech signals: preliminary results, Journal of the Acoustical Society of America, 72, 705-716.
Crystal, T. H. & House, A. S. (1988) Segmental durations in connected speech signals: current results, Journal of the Acoustical Society of America, 83, 1553-1573.
Fujisaki, H., Nakamura, K. & Imoto, T. (1973) Auditory perception of duration of speech and non-speech stimuli, Annual Bulletin (Research Institute of Logopedics and Phoniatrics, University of Tokyo), 7, 45-64.
Gimson, A. C. (1988) An introduction to the pronunciation of English (4th edition). London: Edward Arnold.
Han, M. S. (1962) The feature of duration in Japanese, Onsei no Kenkyuu, 10, 65-80.
Homma, Y. (1981) Durational relationship between Japanese stops and vowels, Journal of Phonetics, 9, 273-281.
Jones, D. (1960) An outline of English phonetics (9th edition). Cambridge: W. Heffer & Sons.
Kunihiro, T. (1963) Gairaigo hyooki ni tsuite - nichi-ei on'in taikei no hikaku, Nichi-ei Ryoogo no Hikaku Kenkyuu Jissen Kiroku, 27-48. Tokyo: Taishuukan.
Lovins, J. B. (1975) Loan words and the phonological structure of Japanese. Indiana University Linguistics Club.
Ohso, M. (1971) A phonological study of some English loan words in Japanese. Unpublished M.A. Thesis, Ohio State University.
Peterson, G. E. & Lehiste, I. (1960) Duration of syllable nuclei in English, Journal of the Acoustical Society of America, 32, 693-703.
Takebayashi, S. (1982) Eigo Onseigaku Nyuumon. Tokyo: Taishuukan.
Watanabe, S. & Hirano, N. (1985) The relation between the perceptual boundary of voiceless plosives and their moraic counterparts and the duration of the preceding vowels, Onseigengo, I, 1-8.
Yamada, T. (ed.) (1981) Shinmeikai Kokugo Jiten. Tokyo: Sanseido.

Appendix

Complete list of types

              V1 lax                                   V1 tense
(a) CVC       gɪp, gɪt, gɪk, gʊp, gʊt, gʊk             gip, git, gik, gup, gut, guk
(b) CVC/s/    gɪps, gɪts, gɪks, gʊps, gʊts, gʊks       gips, gits, giks, gups, guts, guks
(c) CVC/t/    gɪpt, gɪkt, gʊpt, gʊkt                   gipt, gikt, gupt, gukt
(d) CVCə      gɪpə, gɪtə, gɪkə, gʊpə, gʊtə, gʊkə       gipə, gitə, gikə, gupə, gutə, gukə