Processing classifier–noun agreement in a long distance: An ERP study on Mandarin Chinese



Brain & Language 137 (2014) 14–28


Brain & Language journal homepage: www.elsevier.com/locate/b&l

Chun-Chieh Hsu a,⇑, Shu-Hua Tsai b, Chin-Lung Yang c, Jenn-Yeu Chen d

a Department of Foreign Languages and Literature, National Tsing Hua University, Taiwan
b Department of Psychology, National Chung Cheng University, Taiwan
c Language and Cognition Laboratory, Department of Linguistics & Translation, City University of Hong Kong, Hong Kong
d Department of Chinese as a Second Language, National Taiwan Normal University, Taiwan

Article info

Article history: Accepted 15 July 2014

Keywords: Classifiers; Long-distance dependency; Relative clauses; Mandarin Chinese; ERPs; P600; N400; Semantic incongruity

Abstract

The classifier system categorizes nouns on a semantic basis. By inserting an object-gap relative clause (RC) between a classifier and its associate noun, we examined how temporary classifier–noun semantic incongruity and long-distance classifier–noun dependency are processed. Instead of a typical N400 effect, a midline anterior negativity was elicited by the temporary semantic incongruity, suggesting that the anticipation of coming words influences semantic processing and that metacognitive processes are involved in resolving the conflict. The lack of reduced P600 effects at the RC marker suggests that classifier–noun mismatch may not be effective in RC prediction. The N400 observed at the head noun suggests that the parser retains the temporary incongruity in the memory and computes the classifier–noun semantic agreement over a long distance. In addition, both successful and unsuccessful long-distance integration elicited P600 effects, supporting the view that P600 indexes more than just syntactic processing. Detailed discussion and implications are provided. © 2014 Elsevier Inc. All rights reserved.

1. Introduction

In many languages, nouns are categorized into different classes. The linguistic devices used to classify nouns naturally form a dependent relationship with the nouns. For example, the grammatical gender system in Indo-European languages arbitrarily classifies nouns into masculine, feminine, and neuter types, and the grammatical gender marked on the articles and/or the adjectives creates a dependency between the nouns and these elements (Corbett, 1991). In languages spoken in East and Southeast Asia, the classifier system is what expresses nominal classification. Unlike the gender agreement system, which is essentially grammatical, the classifier system has a lexical origin and is correlated with the semantic properties of the nouns (Aikhenvald, 2003; Dixon, 1986; Grinevald, 2000). What is special about the semantic agreement between the classifier and its associate noun is that these two dependent elements can be separated by an intervening relative clause (RC), creating a temporary semantic incongruity and a long-distance dependency. In this study, using the event-related potential (ERP) method, we examine the brainwave responses associated with the processing of temporary classifier–noun semantic incongruity, and look into how the processing of semantic agreement and structural dependency is manifested when the classifier and its associate noun are separated by a long distance. Below, we introduce the properties of classifier–noun agreement in Chinese and review relevant ERP components reported in past studies.

⇑ Corresponding author. Address: Department of Foreign Languages and Literature, National Tsing Hua University, Hsinchu, Taiwan. E-mail address: [email protected] (C.-C. Hsu). http://dx.doi.org/10.1016/j.bandl.2014.07.002 0093-934X/© 2014 Elsevier Inc. All rights reserved.

1.1. The semantic/conceptual nature of classifier–noun agreement

The classifiers of our focus here differ from measure words (e.g. san bei/ping shui ‘three cups/bottles of water’). They are obligatory elements in noun phrases that contain demonstratives or numerals. In linguistics, the system of individual classifiers in Chinese has been claimed to reflect conceptual structures, because the objects classified by the same classifier have to share similar semantic properties on some dimension, such as shape, texture, or function (Allan, 1977; Croft, 1994; Lakoff, 1986; Tai, 1994; Tien, Tzeng, & Hung, 2002). For example, while the items (fish, ropes, towels, and law statements) in (1) look quite heterogeneous on the surface, they can all be paired with the classifier “tiao (條)”, because they all have a long shape, either physically or metaphorically. That is, each classifier is associated with a certain class of nouns by agreeing on some type of semantic feature(s), and the semantic relationship between the classifier and its associate noun may be transparent or opaque (Ahrens, 1995; Zhang, 2007).

(1) na-tiao    yu / shengzi / maojin / fagui
    that-CLᵃ   fish / rope / towel / law-statement
    ‘that fish / rope / towel / law-statement’

ᵃ In the gloss, ‘CL’ is used to stand for classifiers.

In psycholinguistics, the conceptual/semantic nature of the classifier–noun agreement system has been empirically tested. One line of research focuses on the influence of the classifier system on human categorization (Gao & Malt, 2009; Kuo & Sera, 2009; Saalbach & Imai, 2007; Zhang & Schmitt, 1998). The other line of research targets the cognitive mechanisms for processing classifiers. In production studies, it was found that classifier selection is modulated by semantic category and visual shape (Bi, Yu, Geng, & Alario, 2010), and that classifiers are easier to access through pictures than words in naming tasks (Chen & Wang, 2003). These studies suggest that classifiers have a semantic representation located in the conceptual stratum of the production mechanism. Recent brain studies also attest to the semantic relationship between classifiers and their associate nouns. The functional magnetic resonance imaging (fMRI) study by Chou, Lee, Hung, and Chen (2012) showed that when participants read incongruous classifier–noun word pairs, greater activation was found in the mid-ventral region of the left inferior gyrus, suggesting an increased demand on semantic processing. Several electrophysiological studies demonstrated that a mismatch between the classifier and the noun elicited a significant N400 effect, a typical index of semantic anomaly. For example, Sakai, Fukumitsu, Yusa, and Koizumi (2007) measured ERPs with post-nominal numeral classifiers in Japanese, and found a broadly distributed N400 effect in the classifier–noun mismatch pairs when compared to the match pairs. In Chinese, typical N400 effects are also observed at classifier–noun mismatch pairs, whether they appear as word pairs (Tsai, Hsu, Yang, & Chen, 2008), at sentence-final positions (Zhou et al., 2010), or even in a reversed noun-classifier order (Zhang, Zhang, & Min, 2012). Thus, the semantic nature of classifier–noun agreement is well supported.

1.2. Processing semantic agreement: near and far

One very interesting aspect of classifier–noun agreement is that the classifier and its associate noun can be far apart from each other, creating a temporary semantic incongruity and a long-distance dependency. In Mandarin, the structure of noun phrases is strictly head-final, such that the determiners (demonstratives, numerals, classifiers) and the modifiers (adjectives and RCs) must appear before the head noun. This property allows the classifier and its associate noun to be separated by a modifying RC, as in (2).

(2) yi-ke          [RC linju      zhongzhi  ___  de ]  guoshu
    one-CL(tree)       neighbor   plant          DE    fruit-tree
    ‘a fruit-tree which the neighbor planted’

Example (2) presents a special case for on-line sentence processing. The intervening object-gap RC creates a temporary semantic incongruity in one classifier–noun pair as well as a long-distance dependency in another classifier–noun pair. Assuming that the parser works in a left-to-right, word-by-word fashion, when reading the first two words, the parser encounters an obvious semantic mismatch, because the classifier for trees, “ke”, does not match the linearly adjacent noun, linju ‘neighbor’, which denotes a human. Yet, the sequence does not stop here, but continues with a head-final RC, which is the only grammatical way to continue the mismatching sequence. As the parser proceeds, it encounters another noun, the head noun of the RC (guoshu ‘fruit-tree’), that agrees with the classifier (“ke”). The semantic agreement is then established over a long distance. In addition to the semantic agreement relationship, the classifier and the distant head noun are also syntactically dependent, because a classifier requires a suitable noun as its licensor in the structure.

Whether the classifier–noun mismatch may serve as an effective cue for the parser to predict an RC structure that contains a suitable head noun has been an issue under debate. Earlier studies found that in Mandarin, the temporary classifier–noun mismatch could not facilitate the processing of the subsequent RC and the compatible head noun unless a supportive context was provided. In a series of self-paced reading experiments testing sentences like (3), Hsu, Phillips, and Yoshida (2005) found a significant slowdown at the adjacent noun (zuojia ‘writer’) when it was preceded by a mismatching classifier (ben ‘CL(book)’) rather than a matching classifier (wei ‘CL(human)’), reflecting detection of the classifier–noun semantic incongruity. However, at the head noun (xiaoshuo ‘novel’), no facilitation was found in the sentences that contained a classifier–noun mismatch, suggesting that the cue of classifier–noun semantic incongruity did not help the parser predict and process the compatible head noun that appeared later. Facilitation was observed only when this type of sentence was situated within an RC-felicitous context (Hsu, Hurewitz, & Phillips, 2006; Wu, Haskell, & Andersen, 2006).
(3) Zuotanhui-shang, na-wei/ben [zuojia yu dangdi-de minzhong taolun ___ de] xiaoshuo juyou bushao zhengyixing.
    Seminar-at that-CL(human/book) writer with local people discuss DE novel have not-less controversy
    ‘At the seminar, the novel that the writer discussed with the local people is quite controversial.’

Later studies, on the contrary, showed that pre-RC mismatching classifiers are helpful in predicting an RC structure. Wu, Kaiser, and Andersen (2009) tested RC sentences with and without a pre-RC mismatching classifier, like (4).

(4) With/without pre-RC mismatching classifier
    (Na-wei) [jushi zazhong ___ de] jizhe jingtide huangu sizhou.
    that-CL(human) boulder hit DE journalist cautiously look-about surroundings
    ‘The journalist that the boulder hit cautiously looked about his surroundings.’

Assuming that head-final RCs without mismatching classifiers are initially processed as an SVO simple clause by an incremental parser, structural reanalysis would take place when encountering DE, which disambiguates the sequence as an RC. As predicted, facilitation was found in the RCs preceded by a mismatching classifier, but the effect was delayed to the adverb (jingtide ‘cautiously’). Chen, Xu, Tan, Zhang, and Zhong (2013) tested sentences similar to (4) with ERPs. At the RC-disambiguating region DE, the P600 effect, an index correlated with structural reanalysis (e.g. Friederici, Hahne, & Mecklinger, 1996; Osterhout, Holcomb, & Swinney, 1994), was significantly reduced in the RCs preceded by a mismatching classifier, suggesting that mismatching classifiers help to predict an RC structure and facilitate structural reanalysis.¹ However, as Chen et al. (2013) pointed out, all the object-gap RCs they used had animate head nouns. This means that only animate-counting classifiers were used in their experiment. Thus, it remains an open question whether mismatching inanimate-counting classifiers are also effective for RC prediction.

¹ It is possible that the discrepancy among these studies is related to the different materials used. One explanation for the lack of facilitation in sentences like (3) may be that the target RC contained multiple DEs (the adjective dangdi-de ‘local’ and the RC marker de). As a reviewer pointed out, there is a tendency to avoid multiple uses of DE in one sentence in Chinese. We suspect that sentences of the type in (3) may disrupt the parser’s RC prediction, and we avoided these problems in our test materials.

To summarize, the classifier–noun mismatch that appears along with an object-gap RC has a special status, because it is both a semantic conflict and a structural cue at the same time. Yet, very few studies have looked at the processing of semantic incongruity that is only temporary and at its associated brainwave responses. In addition, more needs to be done to understand whether the temporary classifier–noun incongruity is effective for RC prediction, and whether the semantic agreement and the structural integration between the classifier and its associate noun may be established over a long distance. Our study was motivated by these questions. The ERP method, with its multidimensional signals and high temporal sensitivity, allows us to observe brainwave responses that index the immediate processes the parser undergoes when processing temporary classifier–noun incongruity and the long-distance classifier–noun dependency.

1.3. The ERP components and the predictions

In the ERP literature, semantic processing and syntactic processing are traditionally associated with distinct components. The N400, a negative-going waveform peaking around 400 ms after the stimulus, is typically correlated with semantic processing (for reviews, see Kutas & Federmeier, 2011; Kutas, Van Petten, & Kluender, 2006). The P600, a late positive-going waveform observed at about 500–800 ms after the stimulus onset, is usually considered an index of various aspects of syntactic processing (for reviews, see Gouvea, Phillips, Kazanina, & Poeppel, 2010; Hagoort, 2003b).
Some past studies have investigated how semantic processing may be influenced by discourse or syntactic factors. It has been demonstrated that, when embedded in a supportive discourse context, the N400 effect associated with semantic congruency may be neutralized or reduced (Camblin, Gordon, & Swaab, 2007; Filik & Leuthold, 2008; Nieuwland & Van Berkum, 2006; Van Petten et al., 1999). When compounded with syntactic anomaly, the N400 effect associated with semantic processing may be suppressed (Friederici, Gunter, Hahne, & Mauth, 2004; Hahne & Friederici, 2002) or enlarged (Hagoort, 2003a). Recently, the traditional view of an N400–P600 dichotomy has been challenged. A number of studies that examined semantic reversal anomalies (i.e. role-reversed sentences such as The hearty meal was devouring . . .) consistently show that the mismatching target (devouring) elicited no N400 effects but only clear P600 effects, giving rise to the so-called “Semantic P600” phenomenon (e.g. Hoeks, Stowe, & Doedens, 2004; Kim & Osterhout, 2005; Kolk, Chwilla, Van Herten, & Oor, 2003; Kuperberg, Sitnikova, Caplan, & Holcomb, 2003). To explain the full range of data, various processing models that assume syntax-independent semantic interpretation mechanisms have been proposed (e.g. Bornkessel-Schlesewsky & Schlesewsky, 2008; Hagoort, Baggio, & Willems, 2009; Kuperberg, 2007, but also see Chow & Phillips, 2013), and some also suggest reconsidering the functional role of the N400 and P600 components (e.g. Brouwer, Fitz, & Hoeks, 2012). What is important here is that the N400 effect correlated with semantic processing is easily modulated under different discourse and syntactic conditions.

However, previous studies have focused only on the N400 effects elicited by global semantic incongruity (whether embedded in word pairs, sentence frames, or discourse contexts), and very little is known about the processing of semantic incongruity that is only temporary and could serve as a structural cue. Mueller, Hahne, Fujii, and Friederici (2005) is the only study that came close to examining how temporary semantic incongruity is processed. They created classifier–noun incongruity that was potentially temporary by placing the classifier–noun pair in the middle of sentences (e.g. ni-wa-no neko-o tobikoeru tokoro desu ‘two-CL(bird)-GEN cat-ACC jump over take place’)² in a miniature version of Japanese, Mini-Nihongo. Surprisingly, unlike the studies that observed clear N400 effects at classifier–noun semantic incongruity in word pairs (Sakai et al., 2005, 2007), Mueller et al. found no N400 effect at the mismatching noun, but a late left anterior negative (LAN) shift instead. Mueller et al. suggested that the delayed LAN effect might reflect a greater working memory demand for retaining the noun while waiting for another, correct classifier,³ similar to the sustained left anterior negativity found in constructions involving long-distance dependencies (Fiebach, Schlesewsky, & Friederici, 2002; King & Kutas, 1995; Kluender & Kutas, 1993). Yet, Sakai et al. (2007) pointed out that the absence of an N400 effect in Mueller et al. (2005) might have been caused by the weak semantic anomaly in their classifier–noun violations, because only combinations of animate-counting classifiers with animate nouns were used in their study, and this was likely a limitation of using a miniature language.

² Japanese NPs have case marking. GEN refers to ‘genitive case (-no)’ and ACC refers to ‘accusative case (-o)’.
³ Unlike in Chinese, classifiers in Japanese can either precede or follow their associate noun.

Two recent ERP studies also examined the processing of classifier–noun incongruity in sentences. Zhou et al. (2010) situated the classifier–noun pair at the sentence-final position as the object of the matrix verb, and manipulated the classifier to match or mismatch the following object noun, as in Zhao xiuli yizhang/tai changyi. ‘Zhao repaired one-CL(chair)/CL(electric appliance) chair.’ They found an N400 effect at the object noun when it was preceded by a mismatching classifier, suggesting a difficulty in semantic integration between the classifier and the noun. Interestingly, they also found a sustained late anterior negativity in the mismatching condition, and interpreted this late negativity as an indication of a second-pass semantic reinterpretation process initiated by the difficulty in integrating word meaning into the preceding context. Zhang et al. (2012) examined the processing of classifier–noun agreement in a non-canonical structure of object noun + subject noun + verb + numeral classifier + adjective, such as car / Qingfeng Zhao / had seen / one-ling(CLvehicle) / black ‘Qingfeng Zhao had seen a black car.’ The processing of classifier–noun agreement here involved a long distance, but in a reversed order. Zhang et al. also observed an N400 effect and a late anterior negativity when the classifier mismatched the object noun. They suggested that these effects reflect the difficulty in semantic integration and the secondary semantic interpretation, respectively. Yet, the non-canonical sentences used in Zhang et al.’s experiment may limit the generalizability of their findings.

In this study, to further understand the processing of classifier–noun pairs at varying dependency distances in which temporary semantic incongruity and long-distance semantic agreement are embedded, we utilized the semantic properties carried by different classifiers, in conjunction with object-gap RCs, to create the test paradigm shown in Table 1. Since our target sentences were all presented out of context, we used number-classifiers (one-CL) instead of demonstrative-classifiers (that-CL) to avoid any potential referential ambiguity.

Table 1. A sample of test paradigm. Only Region 5 (Num + classifier) differs across the three conditions; all other regions are identical.

| Region | Segment | Characters | Pinyin | Gloss |
|---|---|---|---|---|
| 1 | Adverbial phrase | 在 | zai | at |
| 2 | Adverbial phrase | 夏末 | xiamo | summer-end |
| 3 | Adverbial phrase | 秋初 | qiuchu | autumn-start |
| 4 | Adverbial phrase | 時, | shi | time |
| 5 | Num + classifier (a. Match-Short) | 一位 | yi-wei | one-CL(person) |
| 5 | Num + classifier (b. Match-Long) | 一棵 | yi-ke | one-CL(tree) |
| 5 | Num + classifier (c. Mismatch) | 一朵 | yi-duo | one-CL(flower) |
| 6 | Object relative clause | 鄰居 | linju | neighbor |
| 7 | Object relative clause | 種植 | zhongzhi | planted (followed by the object gap ___) |
| 8 | Object relative clause | 的 | de | DE |
| 9 | Head noun | 果樹 | guoshu | fruit-tree |
| 10 | Main clause | 結了 | jie-le | bear-LE |
| 11 | Main clause | 好多 | haoduo | many |
| 12 | Main clause | 果子。 | guozi | fruit |

English translation: “In the late summer and beginning of fall, a fruit-tree which a neighbor planted bore lots of fruit.”

In the Match-Short condition, the classifier was semantically congruous with the adjacent noun (yi-wei linju ‘one-CL(person) neighbor’), and was neither semantically nor syntactically dependent on the head noun. In the Match-Long condition, the classifier was semantically incongruous with the adjacent noun, but was semantically congruous with, as well as syntactically dependent on, the head noun (yi-ke ‘one-CL(tree)’ and guoshu ‘fruit-tree’). In the Mismatch condition, the classifier (yi-duo ‘one-CL(flower)’) was semantically incongruous with both the adjacent noun and the head noun. In this paradigm, both the Match-Long and Mismatch conditions contained a temporary semantic incongruity between the classifier and the adjacent noun, but these two conditions differed in whether the long-distance head noun matched the classifier or not.

To avoid the problem mentioned by Sakai et al. (2007), namely that the semantic anomaly in the combination of animate-counting classifiers with animate nouns may be too weak to elicit N400 effects, we used both animate-counting and inanimate-counting classifiers. Among the 76 classifiers used, only 10 were animate-counting classifiers, because Chinese has only about ten animate-counting classifiers, far fewer than its inanimate-counting classifiers (Gao & Malt, 2009; Zhang et al., 2012). Thus, the semantic incongruity between the classifier and the adjacent noun in the Match-Long and the Mismatch conditions was predominantly between inanimate-counting classifiers and animate nouns in our experiment.

We measured ERP responses at three critical regions (adjacent noun, DE, head noun) and made the following predictions corresponding to our research questions. First, the ERP responses at the adjacent noun (linju ‘neighbor’) reflect how the temporary classifier–noun semantic incongruity may be processed.
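As a compact summary of the design just described, the following is a minimal sketch that encodes the sample item from Table 1 and derives, for each condition, whether there is a temporary classifier–noun incongruity at the adjacent noun and whether the long-distance agreement with the head noun succeeds. The semantic class labels and all names (`Condition`, `NOUN_CLASS`, `CLASSIFIER_CLASS`) are hypothetical and introduced only for illustration; they are not part of the original materials.

```python
from dataclasses import dataclass
from typing import List, Optional

# Regions shared by all three conditions (pinyin from Table 1).
ADVERBIAL = ["zai", "xiamo", "qiuchu", "shi"]   # Regions 1-4
RELATIVE_CLAUSE = ["linju", "zhongzhi", "de"]    # Regions 6-8
HEAD_NOUN = "guoshu"                             # Region 9
MAIN_CLAUSE = ["jie-le", "haoduo", "guozi"]      # Regions 10-12

# Hypothetical semantic classes, assumed for this illustration only.
NOUN_CLASS = {"linju": "person", "guoshu": "tree"}
CLASSIFIER_CLASS = {"yi-wei": "person", "yi-ke": "tree", "yi-duo": "flower"}


@dataclass(frozen=True)
class Condition:
    name: str
    classifier: str  # Region 5, the only region that varies

    def regions(self) -> List[str]:
        """Return the 12 regions of the sentence in presentation order."""
        return (ADVERBIAL + [self.classifier] + RELATIVE_CLAUSE
                + [HEAD_NOUN] + MAIN_CLAUSE)

    def temporary_incongruity(self) -> bool:
        """True if the classifier mismatches the adjacent noun (Region 6)."""
        adjacent_noun = RELATIVE_CLAUSE[0]
        return CLASSIFIER_CLASS[self.classifier] != NOUN_CLASS[adjacent_noun]

    def long_distance_match(self) -> Optional[bool]:
        """Agreement with the head noun; None when the classifier is
        already licensed by the adjacent noun (no long-distance dependency)."""
        if not self.temporary_incongruity():
            return None
        return CLASSIFIER_CLASS[self.classifier] == NOUN_CLASS[HEAD_NOUN]


CONDITIONS = [
    Condition("Match-Short", "yi-wei"),
    Condition("Match-Long", "yi-ke"),
    Condition("Mismatch", "yi-duo"),
]

for cond in CONDITIONS:
    print(cond.name, cond.temporary_incongruity(), cond.long_distance_match())
```

Under these assumed labels, Match-Long and Mismatch share the temporary incongruity at the adjacent noun and differ only at the head noun, which is the contrast the predictions rely on.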
If the temporary classifier–noun incongruity is processed as a typical semantic anomaly that is global and irresolvable, we predict a typical N400 effect and a late anterior negativity at the adjacent noun in the Match-Long and the Mismatch conditions, as found in previous studies such as Zhou et al. (2010) and Zhang et al. (2012). If, on the other hand, the processing of the temporary classifier–noun semantic incongruity is influenced by its status as a syntactic cue for a subsequent RC structure, we expect a response that is not identical to the typical N400. Or, if the parser keeps the temporarily mismatching classifier in working memory for later use, we expect a sustained (delayed) LAN effect, as suggested by Mueller et al. (2005).

Second, the ERP pattern at the RC-disambiguating region (DE) can tell us whether the temporary classifier–noun mismatch helps the parser to successfully predict an RC structure. Although the reduced P600 effect observed in Chen et al. (2013) suggests that a pre-RC mismatching classifier is effective for RC prediction, only animate-counting classifiers were used in their study. We further tested this by using mostly inanimate-counting classifiers in our experiment. A reduced P600 effect is predicted at DE in the Match-Long and the Mismatch conditions, compared to the Match-Short condition, if the temporary semantic mismatch between the inanimate-counting classifier and the adjacent noun is used effectively in predicting an RC structure.

Last but not least, at the head noun (guoshu ‘fruit-tree’), if the parser attempts to establish the semantic agreement between the classifier and the head noun over a distance, we expect a larger N400 amplitude in the Mismatch condition than in the Match-Long condition, because the classifier is semantically incongruous with the head noun in the former but not in the latter. In addition, in the Match-Long and the Mismatch conditions, since the classifier was incompatible with the adjacent noun, it needs another noun, the distant head noun in this case, as its syntactic licensor. The P600 component has been associated with long-distance syntactic integration (e.g. Felser, Clahsen, & Munte, 2003; Kaan, Harris, Gibson, & Holcomb, 2000; Phillips, Kazanina, & Abada, 2005). Recent findings on the Semantic P600 phenomenon also suggest that the P600 is an index of general mapping conflict and the evaluation of well-formedness (Bornkessel-Schlesewsky & Schlesewsky, 2008). In our test paradigm, interestingly, the long-distance classifier–noun integration is successful in the Match-Long condition but unsuccessful in the Mismatch condition. This paradigm makes it possible to see whether similar P600 effects are observed in these two conditions, and the findings will help clarify the function of the P600 component.

2. Experimental methodology

2.1. Participants

Thirty people (17 females and 13 males) between the ages of 18 and 27 (mean age 22.1 years) participated in the experiment.
Participants were all right-handed native speakers of Mandarin Chinese as spoken in Taiwan, with normal or corrected-to-normal vision. Each participant was fully informed of the experimental procedure and signed a consent form before the experiment. Each participant was paid NT$100 per hour of participation.

2.2. Design and materials

A total of 120 triplets of test sentences, as illustrated in Table 1, were constructed. Each test sentence contained 12 regions. Regions 1–4 comprised an adverbial phrase that introduced the sentence. Region 5 was the numeral-classifier, followed by an object-gap RC at Regions 6–7, the RC marker DE at Region 8, and the head noun at Region 9. Regions 10–12 were the matrix clause that ended the sentence. In the Match-Long and Mismatch conditions, a total of 65 different classifiers (60 inanimate-counting and 5 animate-counting) were used. In the Match-Short condition, only 11 classifiers (6 inanimate-counting and 5 animate-counting) were used, among which the two classifiers for human nouns, wei and ming, were repeated many times to match the adjacent noun, which was usually a human taking the agent role. Since the Match-Short condition served as a control and contained no classifier–noun mismatch, the lexical repetition in this condition would not change the pattern of results, but may help to amplify the effect of the classifier–noun mismatch in the other two conditions.

Table 2. Mean scores and standard deviations (in parentheses) in each offline judgment task.

| Condition | Overall acceptability (2-point scale) | Semantic compatibility between classifier and adjacent noun (5-point scale) | Semantic compatibility between classifier and head noun (5-point scale) |
|---|---|---|---|
| Match-Short | 0.77 (.18) | 4.78 (.27) | NA |
| Match-Long | 0.69 (.23) | 1.24 (.31) | 4.71 (.17) |
| Mismatch | 0.13 (.17) | 1.21 (.27) | 1.34 (.31) |

To ensure that the test materials were natural and appropriate for comparison, three off-line judgment tasks were conducted. The first checked the overall acceptability of the three types of target sentences. The 120 sets of three test sentences were distributed among three lists in a Latin-square design so that each list contained 120 target sentences, with 40 from each condition. Each list was intermixed with 120 filler sentences to create a total of 240 sentences. To reduce the number of sentences in the questionnaire, each list was divided into two sub-lists with 120 sentences each (60 targets and 60 fillers). Two randomizations of each sub-list were then generated to counterbalance order effects. A separate group of 180 participants (mean age 20.86 years) were asked to judge the acceptability of each sentence by circling 1 (acceptable) or 0 (unacceptable).⁴ A participant’s data were discarded if the survey was not completed, if more than half of the mismatched sentences were judged acceptable, or if more than half of the fillers were judged unacceptable. The valid data from 155 participants were analyzed (see Table 2). An analysis of variance (ANOVA) performed on the acceptability scores revealed a main effect of condition (F(2, 308) = 485.90, p < .0001).
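The Latin-square distribution described above (120 triplets rotated across three lists, 40 targets per condition in each list) can be sketched as follows. This is a minimal illustration under the assumption of a simple modular rotation; the function name `latin_square_lists` and the item indexing are hypothetical, and the authors' actual assignment procedure is not specified in the text.

```python
from collections import Counter

CONDITIONS = ("Match-Short", "Match-Long", "Mismatch")


def latin_square_lists(n_items=120, conditions=CONDITIONS):
    """Rotate conditions over items so that each list sees every item once
    and every item appears in a different condition in each list."""
    k = len(conditions)
    lists = []
    for list_idx in range(k):
        # Item i gets condition (i + list_idx) mod k in this list.
        assignment = {
            item: conditions[(item + list_idx) % k] for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = latin_square_lists()

for idx, assignment in enumerate(lists, start=1):
    per_condition = Counter(assignment.values())
    print(f"List {idx}: {len(assignment)} targets, {dict(per_condition)}")
```

With 120 items and three conditions, each list ends up with 40 targets per condition, and no participant assigned to a single list sees the same item in more than one condition.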
The post hoc analyses with Bonferroni correction showed that the sentences in the Match-Short and the Match-Long conditions were more acceptable than those in the Mismatch condition (ps < .001). The Match-Short condition was also more acceptable than the Match-Long condition (p = .009). This is expected, because the sentences in the Match-Long condition involved a long-distance dependency and were more complex. In addition, one-sample t-tests showed that the sentences in the Match-Short and Match-Long conditions were acceptable above the chance level (50%) (t(154) = 18.78, p < .001; t(154) = 10.48, p < .001), whereas the acceptability of the sentences in the Mismatch condition was below the chance level (t(154) = 27.58, p < .001).

⁴ We used a 2-point scale acceptability test because we wanted to avoid gray areas between acceptable and unacceptable, and the results of a binary measure are as informative as those of a 5- or 7-point scale measure (Weskott & Fanselow, 2011).

The other two off-line judgment tasks were carried out to check the semantic compatibility between the classifier and the adjacent noun and between the classifier and the head noun. First, the classifier and the adjacent noun from each of the 360 target sentences were extracted and paired together in two lists of 180 pairs each, and two randomizations of each list were created. A separate group of 121 participants rated the semantic compatibility between the classifier and the adjacent noun on a 5-point scale, with ‘1’ indicating the lowest semantic compatibility and ‘5’ indicating the highest. The results are shown in Table 2. A repeated-measures ANOVA revealed a main effect of condition (F(2, 240) = 16640, p < .001). Post hoc analyses with Bonferroni correction showed that the classifier was significantly more compatible with the adjacent noun in the Match-Short condition than in the other two conditions (ps < .001). The mean compatibility between the classifier and the adjacent noun was nearly identical in the Match-Long and the Mismatch conditions (1.24 vs. 1.21), though this small difference was statistically significant (p = .041).

Second, since the classifier and the head noun in the Match-Long and the Mismatch conditions were assumed to be computed for semantic agreement over a long distance, they were extracted for a semantic compatibility check. They were paired together in a list, and two random orders were created. A separate group of 60 participants rated the semantic fit between the classifier and the head noun on a 5-point scale, with ‘1’ indicating the lowest semantic compatibility and ‘5’ indicating the highest. The results (see Table 2) showed that the classifier was significantly more compatible with the head noun in the Match-Long condition than in the Mismatch condition (t(59) = 77.69, p < .001).

The 120 sets of target sentences, together with 120 fillers of similar length and complexity, were used in the ERP experiment. The fillers were all grammatical declarative sentences with various types of structures other than RCs, and no individual classifiers were used in the fillers.
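As a quick consistency check on the offline judgment tasks reported above, the degrees of freedom in the test statistics can be recomputed from the participant counts. The sketch below assumes by-participant analyses with three within-participant conditions for the ANOVAs and n - 1 degrees of freedom for the one-sample and paired t-tests; the function names are illustrative only.

```python
def rm_anova_df(n_participants, n_conditions):
    """Degrees of freedom (effect, error) for a one-way
    repeated-measures ANOVA with sphericity assumed."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


def t_test_df(n_participants):
    """Degrees of freedom for a one-sample or paired t-test."""
    return n_participants - 1


# Acceptability task: 155 valid participants, 3 conditions.
print(rm_anova_df(155, 3))   # (2, 308), as in F(2, 308)
print(t_test_df(155))        # 154, as in t(154)

# Classifier / adjacent-noun compatibility: 121 participants, 3 conditions.
print(rm_anova_df(121, 3))   # (2, 240), as in F(2, 240)

# Classifier / head-noun compatibility: 60 participants, 2 conditions.
print(t_test_df(60))         # 59, as in t(59)
```

The recomputed values agree with the reported statistics, which suggests that all three tasks were analyzed by participants.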
An example of a filler is Xuexiao canting tigong xuesheng pianyi you haochi-de biandang ‘School restaurant provides students with cheap and delicious lunch boxes.’ Each participant saw one list in which the 240 items (120 targets and 120 fillers) were presented randomly. In each list, 35% (84/240) of the trials were coupled with a true-or-false comprehension question to ensure that the participants paid close attention while reading the test sentences.

2.3. Procedure

The EEG was recorded as the participants read each sentence for comprehension. Each sentence was presented one word at a time at the center of the computer screen for 300 ms with a stimulus-onset asynchrony (SOA) of 600 ms. Every trial began with a fixation cross to orient readers’ attention to the center of the screen, and the participants pressed the space bar to start a trial. A true-or-false comprehension question appeared after some of the trials on a random basis. Participants had to respond based on the meaning of the sentence they had just read. To counterbalance handedness, half of the participants pressed the “green” key for yes and the “pink” key for no, and the other half did the opposite. Feedback (“Sorry” for incorrect and “Good Job” for correct) was given immediately after their response. To reduce artifacts such as eye movements and eye blinks, participants were instructed to remain as still as possible with their eyes fixated at the center of the screen throughout a sentence trial. They were asked to refrain from blinking as much as possible while stimuli were presented and were encouraged to rest before initiating the next trial. A brief practice session was provided, and the whole experiment lasted around one and a half hours.

2.4. Apparatus and ERP recordings

The electroencephalogram (EEG) was recorded using a 128-channel Electrical Geodesics system (Electrical Geodesics, Inc.; Tucker, 1993), consisting of Geodesic Sensor Net electrodes, Net Amps, and Net Station software running on an Apple Macintosh G5, dual 2-GHz Intel core computer with Mac OS 10.4.9. Instructions and visual stimuli were presented on a 17-in. LCD monitor with a 75 Hz refresh rate. The E-prime program (Psychology Software Inc., Pittsburgh, PA) controlled the experimental trials and sent event information to the EEG recording system (Net Station, Electrical Geodesics Inc., Eugene, Oregon). The EEG signals were recorded continuously at 500 Hz by Net Station with a 24-bit A/D converter. Impedances were kept below 5 kΩ. The EEG was amplified and analog filtered with a 0.1–100 Hz bandpass filter and a 60 Hz notch filter, referenced to the vertex, and then digitized at 500 Hz. Six eye channels were used to monitor the trials for eye movements and blinks. All event onset times and accuracy were recorded for later analysis.

2.5. ERP data analysis

The EEG data were segmented off-line into 1300 ms epochs, spanning 200 ms pre-stimulus to 1100 ms post-stimulus for the three critical words. Data were digitally screened for artifacts (eye blinks or movements, or transient electronic artifacts) and contaminated trials were removed. Overall, 4.08% (147/3600) of trials were rejected, leaving no participant with fewer than 70% of good trials in any condition. The averaged ERP data were digitally high-pass filtered at 0.06 Hz to remove the DC offset while leaving the ERP components unharmed, and low-pass filtered at 30 Hz to remove residual high-frequency noise. The averaged ERP data were baseline-corrected over the 200 ms pre-stimulus period (see footnote 5) and re-referenced to an average reference to remove topographic bias that could result from the selection of a reference site (Dien, 1998). To increase the accuracy of the average-reference derivation, the spatial topography was assessed by dividing the electrodes into 13 spatial regions, and the mean voltage amplitudes of all channels within each region were averaged.
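The segmentation, baseline correction, and average referencing described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Net Station pipeline the authors used: the function names are hypothetical, the data layout (plain lists of voltages) is assumed, and only the sampling rate, epoch limits, and trial counts come from the text.

```python
SFREQ_HZ = 500              # sampling rate reported in the text
PRE_MS, POST_MS = 200, 1100  # epoch limits reported in the text


def ms_to_samples(ms, sfreq_hz=SFREQ_HZ):
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * sfreq_hz / 1000))


def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline samples from one channel's epoch."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [v - base for v in epoch]


def average_reference(sample):
    """Re-reference one time point: subtract the mean across all channels."""
    mean_v = sum(sample) / len(sample)
    return [v - mean_v for v in sample]


def rejection_rate(n_rejected, n_total):
    """Percentage of trials removed during artifact screening."""
    return 100.0 * n_rejected / n_total


n_pre = ms_to_samples(PRE_MS)              # samples in the baseline interval
n_epoch = ms_to_samples(PRE_MS + POST_MS)  # samples in the full 1300 ms epoch
print(n_pre, n_epoch)
print(round(rejection_rate(147, 3600), 2))
```

At 500 Hz the 200 ms baseline corresponds to 100 samples and the 1300 ms epoch to 650 samples. The total of 3600 trials is consistent with 30 participants reading 120 target sentences each.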
Using clusters that average multiple measurements can reduce spurious interactions with single electrode locations and provides a more reliable sample of the activity within a region than a single measurement taken within the same region (Dien & Santuzzi, 2005). The clustered regions included three lateral (left, midline, and right) and five lobe (pre-frontal, frontal, central, posterior, and temporal) sites corresponding to the international 10–20 system (F7–F8, F3–Fz–F4, C3–Cz–C4, P3–Pz–P4, T3–T4). Fig. 1 shows the 128 recording electrodes and the details of the 13 clusters of electrodes.

The statistical analysis used two repeated-measures ANOVAs. One tested the ERPs for the three medial electrode sites (Fz, Cz, and Pz) and the other tested the six lateral clustered electrode sites (F3–F4, C3–C4, and P3–P4). The midline ANOVAs included two within-participant factors: Condition (Match-Short/Match-Long/Mismatch) and Electrodes (Fz/Cz/Pz). For the analysis on the lateral electrode sites, since we used clustered electrode sites that averaged the voltages of the surrounding channels (see Fig. 1), the six lateral regions of interest by Hemisphere (left/right) and Region (anterior/central/posterior) were represented by each of the clustered regions: left anterior (F3); left central (C3); left posterior (P3); right anterior (F4); right central (C4); right posterior (P4). The ANOVAs for lateral clustered sites included three within-participant factors: Condition (Match-Short/Match-Long/Mismatch), Hemisphere (Left/Right), and Region (Anterior/Central/Posterior).

Footnote 5. Using the 200 ms pre-stimulus period as the baseline for target words is a common practice in sentence-reading ERP experiments. However, as one reviewer pointed out, the critical words in Region 6 (the adjacent noun) were preceded by different classifiers in Region 5. Using the 200 ms pre-stimulus baseline for Region 6 may distort the ERP patterns because of possible differences across conditions in the baseline interval. To address this concern, we analyzed the mean amplitudes of ERPs in the intervals of 300–600 ms and 400–600 ms after the onset of the classifiers. As expected, the results revealed neither a main effect of Condition nor any interaction involving Condition (ps > .1), suggesting that there were no systematic ERP differences across conditions in the baseline prior to the onset of the critical words at Region 6. Thus, using the 200 ms pre-stimulus baseline (i.e. the interval of 400–600 ms after the classifiers at Region 5) should have the same influence on the ERPs to the critical word in each condition and would not affect the ERP patterns at Region 6. This subtraction method has been used to address similar concerns (e.g. Zhang et al., 2012).

Fig. 1. The schematic flat representation of the 128 electrodes from which EEG activity was recorded. The 13 clustered electrode sites corresponding to the International 10–20 system localizations are marked.

3. Experimental results

3.1. Behavioral results

The mean accuracy for each condition was: Match-Short: 91.2% (SD = 6.94), Match-Long: 92.6% (SD = 4.65), and Mismatch: 84.1% (SD = 18.18). A repeated-measures one-way ANOVA was conducted on the mean accuracy. The analysis revealed a main effect of classifier type (F(2, 58) = 5.029, p = .024). Further contrast analyses showed that sentences in the Match-Short and the Match-Long conditions were answered more accurately than sentences in the Mismatch condition (F(1, 29) = 5.679, p = .024).

3.2. Event-related potentials

Three sets of analyses of the ERP results were carried out. The first analysis targeted the adjacent noun to examine the processing of the temporarily incongruous classifier–noun pairs. The second analysis focused on the RC marker DE to see whether the temporary classifier–noun mismatch facilitates structural reanalysis. The third analysis focused on the head noun to examine the long-distance classifier–noun semantic agreement and structural integration.

3.2.1. Effects on the adjacent noun

Fig. 2 presents grand average ERPs (n = 30) at the adjacent noun (Region 6) over the scalp surface with the 13 clustered sites in all three conditions. The target words in all conditions elicited a P1–N1–P2 complex at around 75–225 ms, a characteristic pattern for visually presented stimuli. Visual inspection showed that, starting at around 250 ms, more negative-going waveforms were elicited by the temporary classifier–noun incongruity in the Match-Long and Mismatch conditions in the frontal area, in comparison to the Match-Short condition. In the later latency, there were no distinct peaks or divergence among the waveforms of the three conditions, except for the sustained negativity in the midline anterior area in the Match-Long and the Mismatch conditions.

Fig. 2. Grand average ERPs for the 13 clustered electrode sites at the adjacent noun (Region 6).

These observations were quantified by performing ANOVAs in the latency windows of 250–450 ms and 500–800 ms after the onset of Region 6. In the 250–450 ms interval, the ANOVA on the midline electrodes revealed no main effect of Condition, but there was a reliable interaction between Condition and Electrodes. See Table 3 (Region 6) for the overall results. Subsequent analyses examining the interaction pattern showed that the effect of Condition was significant at Fz (F(2, 58) = 5.29, MSE = 4.669, p = .010), not significant at Cz (F(2, 58) = 1.08, MSE = .943, p = .345), and marginally significant at Pz (F(2, 58) = 3.12, MSE = 2.758, p = .054), suggesting that the interaction was mainly driven by the different patterns at Fz, Cz, and Pz. As shown in Fig. 2, at Fz, the ERPs in the conditions with classifier–noun incongruity were more negative than those in the Match-Short condition (F(1, 29) = 9.167, MSE = 12.597, p = .005), and no difference was found between the Match-Long and the Mismatch conditions. At Pz, on the other hand, the ERP response was more negative in the Match-Short condition than in the other two conditions (F(1, 29) = 4.399, MSE = 5.800, p = .045). The ANOVA on the lateral clustered sites revealed similar patterns, with no main effect of Condition but a marginal interaction between Condition and Region (p = .087). Although the interaction was only marginal, the overall waveform patterns were similar to those found at the midline electrodes, where more negative amplitudes were observed in the Match-Long and the Mismatch conditions than in the Match-Short condition at the anterior region. In the 500–800 ms interval, neither the medial nor the lateral ANOVA revealed an effect of Condition or any interaction involving Condition. Additional analyses on a later time window of 700–900 ms yielded similar results.
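The dependent measure in these ANOVAs is the mean amplitude of a clustered waveform within a latency window. The sketch below shows one way to compute it, assuming a 500 Hz epoch that starts 200 ms before word onset (as described in Section 2.5) and a half-open sample range for each window. The function names and the synthetic waveform are hypothetical and serve only as an illustration.

```python
def window_indices(start_ms, end_ms, pre_ms=200, sfreq_hz=500):
    """Half-open sample range [i0, i1) for a window given relative to word onset."""
    i0 = int(round((pre_ms + start_ms) * sfreq_hz / 1000))
    i1 = int(round((pre_ms + end_ms) * sfreq_hz / 1000))
    return i0, i1


def mean_amplitude(waveform, start_ms, end_ms, pre_ms=200, sfreq_hz=500):
    """Mean voltage of one waveform inside a post-onset latency window."""
    i0, i1 = window_indices(start_ms, end_ms, pre_ms, sfreq_hz)
    segment = waveform[i0:i1]
    if not segment:
        raise ValueError("window lies outside the epoch")
    return sum(segment) / len(segment)


def cluster_waveform(channel_waveforms):
    """Average several channels sample by sample into one cluster waveform."""
    n = len(channel_waveforms)
    return [sum(vals) / n for vals in zip(*channel_waveforms)]


# Synthetic 650-sample epoch: flat except for a -2 microvolt deflection at 250-450 ms.
wave = [0.0] * 650
for i in range(*window_indices(250, 450)):
    wave[i] = -2.0

print(mean_amplitude(wave, 250, 450))  # early window
print(mean_amplitude(wave, 500, 800))  # late window
```

One value per participant, condition, and cluster from a function like `mean_amplitude` would then enter the repeated-measures ANOVA.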

In sum, at the adjacent noun, the analyses show a larger early negativity at the midline anterior region (Fz) in the Match-Long and Mismatch conditions than in the Match-Short condition, and the pattern was reversed at the midline posterior region (Pz). In addition, the significant negativity at Fz elicited by the classifier–noun incongruity (in the Match-Long and Mismatch conditions) seemed to continue into the later latency, but it did not reach overall significance. No other effects were found in the analyses of ERPs in the later time window.

3.2.2. Effects on the RC marker DE

Fig. 3 presents grand average ERPs at the RC marker DE (Region 8) over the scalp surface with the 13 clustered sites in all three conditions. All three conditions elicited a P1–N1–P2 complex that is typical for visually presented stimuli. Visual inspection of the waveform patterns showed that the ERPs became more negative in the Match-Long and the Mismatch conditions in the midline areas, starting at around 200 ms. In the later latency, no distinct peaks or divergence were observed among the waveforms of the three conditions. These observations were tested by performing ANOVAs in the latency windows of 250–450 ms and 500–800 ms after the target onset. In the 250–450 ms interval, the ANOVA over the midline electrodes revealed a main effect of Condition (see Table 3). As shown in Fig. 3, the ERPs in the conditions with classifier–noun incongruity were more negative than those in the Match-Short condition (F(1, 29) = 7.069, MSE = 3.380, p = .013). The ANOVA for lateral regions did not yield any significant effects (Fs < 1). In the 500–800 ms time window, neither the medial nor the lateral ANOVA revealed any significant effects.

3.2.3. Effects on the head noun

Fig. 4 presents grand average ERPs at the head noun (Region 9) over the scalp surface with the 13 clustered sites in all three conditions. All three conditions elicited a P1–N1–P2 complex that is


Table 3
The mean amplitude ANOVAs over midline and lateral electrodes at three critical regions.

                                               250–450 ms                 500–800 ms
Effect (df)                                    F       MSE     p          F       MSE     p

Region 6 (adjacent noun)
  Midline electrodes
    Condition (2, 58)                          1.44    1.661   .246       1.80    2.719   .179
    Condition × Electrode (4, 116)             4.34    4.663   .009**     1.60    2.895   .199
  Lateral electrodes
    Condition (2, 58)                          <1      .579    .861       1.09    1.273   .336
    Condition × Region (4, 116)                2.45    2.77    .087†      <1      2.561   .428
    Condition × Hemisphere (2, 58)             1.06    .930    .351       <1      1.017   .467
    Condition × Region × Hemisphere (4, 116)   <1      .171    .552       <1      .802    .486

Region 8 (DE)
  Midline electrodes
    Condition (2, 58)                          3.48    3.411   .038*      1.65    1.371   .262
    Condition × Electrode (4, 116)             <1      .440    .834       <1      .337    .766
  Lateral electrodes
    Condition (2, 58)                          <1      .374    .690       <1      .384    .620
    Condition × Region (4, 116)                <1      1.116   .580       <1      1.907   .518
    Condition × Hemisphere (2, 58)             <1      .305    .703       <1      .225    .769
    Condition × Region × Hemisphere (4, 116)   1.972   .409    .133       1.767   .662    .168

Region 9 (Head noun)
  Midline electrodes
    Condition (2, 58)                          2.60    2.067   .086†      <1      1.265   .481
    Condition × Electrode (4, 116)             2.82    2.725   .043*      1.64    2.056   .185
  Lateral electrodes
    Condition (2, 58)                          4.80    4.036   .012*      3.18    4.593   .049*
    Condition × Region (4, 116)                2.30    3.132   .098†      2.31    5.701   .104
    Condition × Hemisphere (2, 58)             2.11    1.913   .140       <1      .517    .666
    Condition × Region × Hemisphere (4, 116)   1.348   .366    .267       <1      .313    .490

† 0.05 < p < 0.1. * p < 0.05. ** p < 0.01.
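The degrees of freedom listed in Table 3 follow from the design: 30 participants (the n given for the grand averages) and fully within-participant factors. The helper below is a minimal sketch of that arithmetic, assuming uncorrected degrees of freedom (no sphericity correction). The function name `rm_anova_df` is hypothetical.

```python
def rm_anova_df(n_participants, *factor_levels):
    """Uncorrected (effect, error) df for a within-participant effect.

    factor_levels lists the number of levels of each factor in the effect,
    e.g. (3,) for Condition or (3, 3) for Condition x Electrode.
    """
    effect_df = 1
    for k in factor_levels:
        effect_df *= (k - 1)
    error_df = effect_df * (n_participants - 1)
    return effect_df, error_df


print(rm_anova_df(30, 3))        # Condition
print(rm_anova_df(30, 3, 3))     # Condition x Electrode, Condition x Region
print(rm_anova_df(30, 3, 2))     # Condition x Hemisphere
print(rm_anova_df(30, 3, 3, 2))  # Condition x Region x Hemisphere
```

These reproduce the (2, 58) and (4, 116) values shown in the table.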

Fig. 3. Grand average ERPs for the 13 clustered electrode sites at DE (Region 8).


Fig. 4. Grand average ERPs for the 13 clustered electrode sites at the head noun (Region 9).

Table 4
The summary of the ERP effects at each critical region in each condition.

Condition     Adjacent noun                 RC marker DE         Head noun      Head noun
              (250–450 ms)                  (250–450 ms)         (250–450 ms)   (500–1000 ms)
Match-Short   X                             X                    N400           X
Match-Long    Midline anterior negativity   Midline negativity   X              P600
Mismatch      Midline anterior negativity   Midline negativity   N400           P600

typical for visually presented stimuli. Visual inspection of the waveform patterns showed that the ERPs became more negative in the Match-Short and the Mismatch conditions in the central and posterior regions, starting at around 250 ms. In addition, compared to the Match-Short condition, the Match-Long and the Mismatch conditions seemed to elicit more positivity in the later latency. At the posterior region, the positivity started at around 500 ms and lasted until 1000 ms. ANOVAs were run in the latency windows of 250–450 ms and 500–1000 ms after target onset to test these effects. In the 250–450 ms interval, the ANOVA over midline electrodes revealed a significant interaction between Condition and Electrodes, and a marginal effect of Condition. See Table 3 (Region 9) for the overall results. Subsequent analyses examining the interaction pattern showed that the effect of Condition was significant only at Cz (F(2, 58) = 6.31, MSE = 4.445, p = .004), and not at Fz (F(2, 58) < 1, MSE = .622, p = .509) or at Pz (F(2, 58) = 1.89, MSE = 1.293, p = .162), suggesting that the interaction was mainly driven by the different pattern at Cz. Further pairwise comparisons at Cz showed that the ERPs in the Match-Short and Mismatch conditions were significantly more negative than those in the Match-Long condition (ps < .05). For the lateral ANOVA, a significant main effect of Condition was found as well as a marginal interaction between Condition and Region. Since the interaction was only marginal, we did not run further analyses, although the waveform patterns seem to suggest that the effect of Condition was more obvious at the central and posterior regions. As for the main effect of Condition, Fig. 4 shows that the ERPs were more negative in the conditions where the classifier did not match the head noun than in the Match-Long condition (F(1, 29) = 8.297, MSE = 1.783, p = .007).

In the 500–1000 ms interval, while the medial ANOVA did not reveal any significant effect, the lateral ANOVA revealed a main effect of Condition (see Table 3). As shown in Fig. 4, the ERPs in the conditions involving long-distance integration were more positive than those in the Match-Short condition (F(1, 29) = 7.858, MSE = 2.232, p = .009), and there was no difference between the Match-Long and the Mismatch conditions (F < 1). Although there was no significant interaction between Condition and Region, the pattern of results seems to suggest that the late positivity elicited in the Match-Long and the Mismatch conditions is more obvious at the posterior (P3/P4) region.

In sum, at the head noun, in the 250–450 ms interval, a significant negativity with a central-posterior distribution typical of N400 effects was elicited in the Match-Short and the Mismatch conditions in contrast to the Match-Long condition. In the 500–1000 ms interval, positive ERPs were elicited in the Match-Long and the Mismatch conditions in comparison to the Match-Short condition. This long-lasting positive shift with no clear peak resembles the P600 effect.

4. Discussion

The aim of the present study was to understand the on-line processing of classifier–noun agreement when these two dependent elements are separated by an object-gap RC, creating a temporary


semantic incongruity and a long-distance dependency. Table 4 summarizes our ERP results: At the adjacent noun, the Match-Long and the Mismatch conditions elicited a larger midline anterior negativity. At DE, a larger negativity at midline areas was found in the Match-Long and Mismatch conditions than in the Match-Short condition. At the head noun, typical N400 effects were observed in the Match-Short and Mismatch conditions, and P600 effects were observed in the Match-Long and Mismatch conditions. Our findings and their theoretical implications are discussed below.

4.1. How the parser deals with the temporary classifier–noun mismatch

One of the intriguing questions that we raised in this study concerns the parser’s reaction to a semantic incongruity that is only temporary and could serve as a structural cue. In our test paradigm, the classifier and the adjacent noun were semantically incongruous in the Match-Long and the Mismatch conditions, but the incongruity was temporary, because the input sequences continued such that compatible nouns could potentially be introduced later. A greater anterior negativity with a midline focus (Fz) was observed at the adjacent noun in these two conditions, and this negativity was not identical to a typical N400 effect, which normally has a centro-parietal distribution. The finding of a midline anterior negativity rather than a typical N400 effect suggests that the parser does not treat the temporary classifier–noun semantic mismatch as a typical global semantic anomaly. Several studies have also found an early negativity with frontal maxima (Demestre, Meltzer, Garcia-Albea, & Vigil, 1999; Deutsch & Bentin, 2001; Wicha, Moreno, & Kutas, 2004). Wicha et al. (2004) argued that this type of negativity should be interpreted as an N400 effect, because the negative-going ERPs in the centro-parietal areas were neutralized by the earlier latency of the posterior P600 effect.
However, two reasons prevent us from interpreting our finding of the early anterior negativity as a typical N400 effect. First, in the previous studies that claimed the early anterior negativity to be an N400 effect, the negative ERP responses were found consistently across the anterior, the central, and the parietal areas, and were maximal at the anterior sites. In our study, the negative ERPs in the Match-Long and the Mismatch conditions were observed only in the midline anterior area, and not in the central and posterior areas. Second, we did not find P600 effects with an early latency that could potentially neutralize the distribution of the negativity in the posterior area. Thus, we do not think our finding of a midline anterior negativity is a typical N400 effect.

A midline anterior negativity, instead of a typical N400 effect, observed at the temporary classifier–noun semantic incongruity suggests that the semantic processing between the classifier and the noun is influenced by the syntactic information encoded at this point. Unlike global classifier–noun violations, which are purely semantic anomalies and consistently engender typical N400 effects (e.g. Sakai et al., 2007; Tsai et al., 2008; Zhou et al., 2010), the temporary classifier–noun semantic incongruity is linguistically more complex because it combines a semantic conflict and a structural cue at the same time. On the one hand, the classifier–noun mismatch is semantically incongruous because the semantic features of these two dependent elements are incompatible. On the other hand, the classifier–noun mismatch also serves as an unambiguous syntactic cue for RC prediction, since an RC is the only grammatical way to continue the classifier–noun mismatch fragment (Hsu, 2006).
Therefore, it is likely that the parser computes both kinds of information immediately and simultaneously, recognizing the classifier–noun semantic incongruity as well as taking it as a structural cue, resulting in a somewhat atypical ERP pattern. The other index found to be associated with processing classifier–noun incongruity inside sentences is the late anterior negativity, which has been interpreted to reflect the difficulty in secondary semantic integration of words and preceding contexts (Zhou et al., 2010; Zhang et al., 2012). We did observe the sustained anterior negativity at the midline anterior site (Fz), but this effect was not significant overall. We suspect that the lack of significance is also related to the temporary nature of the classifier–noun incongruity in our manipulation.

The midline anterior negativity found here cannot be the LAN (Left Anterior Negativity) effect either, despite the possible involvement of syntactic processing at the classifier–noun mismatch. The LAN effect usually occurs between 300 and 500 ms after the target onset with a more reliably left-lateralized distribution, and has generally been taken to reflect early detection of morpho-syntactic violations (e.g. Friederici, Pfeifer, & Hahne, 1993; Osterhout & Mobley, 1995). The midline anterior negativity observed in our study resembles the morpho-syntactic LAN with respect to its timing (around 300–500 ms), but not with respect to its distribution. In addition, the LAN elicited by morpho-syntactic violations tends to co-occur with a P600, forming a biphasic LAN-P600 pattern, as observed in most studies that combined syntactic and semantic violations (Friederici et al., 2004; Gunter, Stowe, & Mulder, 1997; Hahne & Friederici, 2002). In our study, the classifier–noun mismatch is a semantic violation but not a syntactic violation, and no P600 effect was observed, suggesting that the midline anterior negativity is not a LAN effect.

Our finding is somewhat similar to Mueller et al.’s (2005) study on miniature Japanese, in which the classifier–noun mismatch was situated in the middle of a sentence. Mueller et al. (2005) found a left anterior negativity in the later latency, and interpreted it as a delayed LAN associated with working memory processes. In contrast to Mueller et al., we observed the anterior negativity in the early latency and with a midline distribution.
While we agree that the anterior negativity effect is potentially associated with greater working memory demand, we suggest that the observed differences may be related to the more complicated, inanimate-counting classifiers adopted in our study but not in Mueller et al.’s study.

We suggest that a plausible interpretation of the midline anterior negativity elicited at the temporary classifier–noun mismatch is to associate it with a potential metacognitive strategy and an increasing working memory load. As mentioned earlier, the classifier–noun mismatch situated in the middle of a sentence plays a conflicting role, being a semantic anomaly and a structural cue at the same time. In order to avoid a processing breakdown caused by such a temporary conflict, the parser might need some top-down help from the metacognitive level to resolve the processing difficulty. Neuroimaging and ERP studies suggest that frontal areas, the anterior cingulate in particular, are associated with major metacognitive processes such as conflict monitoring and resolution (e.g. Fernandez-Duque, Baird, & Posner, 2000; Van Veen & Carter, 2002). In ERP experiments with Stroop verbal tasks in which the participants have to resolve the conflict between word and color, a larger negativity at around 350 ms with an anterior–medial focus has been found (e.g. Liotti, Woldorff, Perez, & Mayberg, 2000), similar to the midline anterior negativity effect observed in our study. In addition, in experiments involving manipulation of working memory load, a high working memory load was found to elicit a larger negativity than a low working memory load, and the increase in negativity is larger over the frontal areas (e.g. Löw et al., 1999). In our study, the midline anterior negativity elicited by the temporary classifier–noun semantic incongruity may reflect both the readers’ regulation of cognitive activity to resolve the conflict and the increasing working memory load associated with it.
In the experiment, since the participants expected to read full sentences instead of word pairs or phrases, when encountering a classifier–noun mismatch at the clause-initial position, they probably resolved the conflict by figuring that the incongruity could be temporary, and employed some cognitive strategies to make sense of the semantic incongruity. One way to do this is to keep the temporary incongruity in working memory and expect a compatible noun as the sentence unfolds. The midline anterior negativity thus might reflect the metacognitive processes involved in resolving the conflict as well as the working memory demand associated with keeping the classifier active and anticipating some plausible sequence to appear later. The supporting evidence for retaining the mismatching classifier in working memory for later use is the N400 and P600 effects observed at the head noun, which suggest that the parser tries to integrate the mismatching classifier with the head noun over a distance (see Sections 4.3 and 4.4). In other words, what the parser encountered here was not a typical semantic anomaly, which would have elicited a typical N400 effect, but a temporary linguistic conflict that needed to be resolved via metacognitive processes. These metacognitive processes are usually associated with brain activity in the anterior regions, and this may explain why the negativity was absent in the central and posterior regions.

4.2. Is classifier–noun mismatch an effective cue for RC prediction?

In our study, the classifier–noun mismatch is considered a structural cue for RC prediction, because an object-gap RC is the only way to continue the sequence of a temporary classifier–noun mismatch. If the parser exploits the classifier–noun mismatch cue to successfully predict an RC, we would expect a reduced P600 effect at the RC marker DE in the Match-Long and the Mismatch conditions, compared to the Match-Short condition. However, counter to the prediction, we did not find reduced P600 effects at DE in these two conditions. Instead, a larger early negativity in the midline, maximal at the central position (Cz), was observed in these two conditions than in the Match-Short condition.
This pattern of results at DE is interesting. On the one hand, the lack of reduced P600 effects here seems to suggest that the classifier–noun mismatch did not help the parser to predict the appearance of the DE of an RC structure beforehand and so facilitate structural reanalysis. On the other hand, an early midline negativity was elicited, and we suspect it to be related to the reactivation of the classifier–noun incongruity in the Match-Long and Mismatch conditions. In Mandarin, the element DE is used not only in RCs but also in all other noun-modifying phrases, and it always precedes the head noun. The observed early midline negativity at DE resembles the N400 in terms of timing, but is less typical in terms of distribution and shape. It is possible that, at the point of DE, which signals the coming of another noun, the mismatching classifier is quickly retrieved from working memory in preparation for integration with the coming noun, leading to this somewhat atypical response. Together, these findings seem to imply that while the classifier–noun mismatch cue may not be very effective in predicting a specific RC structure, it is facilitative in predicting another noun. In other words, while the parser probably did not explicitly predict an RC structure based on the classifier–noun incongruity, it did anticipate the appearance of another noun that could resolve the classifier–noun incongruity, giving rise to the early midline negativity effect at the DE position. More studies are necessary to further understand this effect.

The lack of reduced P600 effects at DE implies that there is a limit on the parser’s ability to use indirect cues effectively to make specific structural predictions (Hsu et al., 2005, 2006). Even though the classifier–noun mismatch is an unambiguous cue for an RC structure, it is a rather indirect cue because there is nothing in the grammar of the classifier or the grammar of the noun that directly requires an RC structure.
In order to predict an RC structure, the parser needs to project more structures such as noun phrases and clausal phrases ahead to connect these two mismatching elements (see Appendix A), and there is no guarantee that the parser is able to do so (Hsu, 2006). In Hsu et al.’s off-line sentence completion experiments, only 44% of the total responses continued the classifier–noun mismatch with an RC, and as many as 50% were ungrammatical continuations. Similarly, no facilitation was found at DE in sentences that contained a classifier–noun mismatch in their self-paced reading experiments. Our ERP finding suggests that, although the parser may resolve the classifier–noun mismatch cue via some metacognitive processes and probably anticipates another noun, it is unable to explicitly predict a specific RC structure ahead of time based on an indirect classifier–noun mismatch cue.

Why, then, did Chen et al.’s (2013) study show reduced P600 effects at DE in cases where the mismatching classifiers are available, whereas our study did not? We think that the difference is probably related to the test materials used. First, the two studies used different types of mismatching classifiers. Chen et al. (2013) used the very small set of animate-counting classifiers in their experiment, whereas we used 60 inanimate-counting classifiers for the classifier–noun mismatch in our experiment. It is possible that in Chen et al.’s study, the small number of animate-counting classifiers restricted the head-noun candidates, making the RC prediction easier. In our study, the diverse nature and the variety of inanimate-counting classifiers might have kept the parser from utilizing the mismatch cue successfully. Second, the two studies differed in the type of determiner used in conjunction with the mismatching classifier. In Chen et al.’s study, the mismatching classifier came with a demonstrative as in that-CL, whereas in our study, the mismatching classifier came with a numeral as in one-CL.
Since demonstratives tend to refer to more specific and definite nouns, whereas numerals often refer to indefinite nouns, demonstrative-classifiers might be a more useful cue than numeral-classifiers in helping the parser make an explicit RC prediction. Further investigation is necessary to find out how these factors affect the parser’s use of the indirect classifier–noun mismatch cue for successful RC prediction.

4.3. Processing classifier–noun semantic agreement in a long-distance

The other important goal of our study is to examine how the semantic agreement between a classifier and its associate noun is processed when the two are separated by a long distance. At the long-distance head noun, we observed typical N400 effects in the Match-Short and the Mismatch conditions, in both of which the classifier did not match the head noun. Since the classifier was semantically congruous with the adjacent noun in the Match-Short but not in the Mismatch condition, we assume that the N400 effects observed in these two conditions, despite both being associated with semantic integration, reflect somewhat different underlying mechanisms.

First, the N400 effect in the Mismatch condition relative to the Match-Long condition suggests that the parser keeps the temporary semantic incongruity in working memory and is able to process classifier–noun semantic agreement over a long distance. In both of these conditions, the classifier did not match the adjacent noun. Upon encountering the head noun, the parser tries to associate it backwards with the classifier on a semantic basis. The semantic incompatibility between the classifier and the head noun in the Mismatch condition, but not in the Match-Long condition, accounts for why a larger N400 effect was elicited in the former than in the latter.
This result indicates that the parser does not disregard the temporary classifier–noun incongruity altogether, but retains it in working memory for later processing. Most importantly, the N400 effect at the head noun in the Mismatch condition indicates that the parser computes the semantic agreement relationship between the classifier and the head noun, even when the computation spans a long distance. Our finding corroborates the previous finding that the processing of classifier–noun agreement is semantically based (e.g., Sakai et al., 2007; Zhou et al., 2010), and we further show that the semantic processing takes place despite the manipulated distance between these two dependent elements. This strengthens the linguistic argument that individual classifier systems categorize nouns based on their inherent semantic properties (e.g., Allan, 1977; Tai, 1994).

As mentioned in the Introduction, classifier systems are like gender systems in that they classify nouns into different categories. Studies show that gender agreement violations in Indo-European languages generally elicit a composite of a LAN effect and a P600 effect (see Molinaro, Barber, and Carreiras (2011) for a review), reflecting typical syntactic, form-driven processing of gender agreement, although the processing might interact with semantic information (e.g., Deutsch & Bentin, 2001; Gunter, Friederici, & Schriefers, 2000). Our finding, together with previous findings, shows on the other hand that classifier–noun agreement is a semantically driven process. The difference between gender agreement and classifier agreement suggests that although both serve the function of categorizing nouns into different classes, they involve quite different processes and mechanisms, supporting the hypothesized continuum of linguistic devices, from strictly grammatical to strictly lexical, in nominal classification across languages (Craig, 1986; Senft, 2000).

Second, and interestingly, an N400 effect was also observed in the Match-Short condition, a condition with no classifier–noun mismatch. Though a bit surprising at first, the N400 effect found here can be accounted for under either the semantic integration view (e.g., Brown & Hagoort, 1993; Hagoort et al., 2009) or the lexical retrieval view (e.g.,
Kutas & Federmeier, 2000; Lau, Phillips, & Poeppel, 2008) of the N400 effect. The test sentences of the Match-Short and Match-Long conditions were identical except for the classifier, which matched the adjacent noun in the former and the head noun in the latter. Contextually, the classifier–noun mismatch contained in the Match-Long condition, but not in the Match-Short condition, offers more clues for the parser to anticipate potential upcoming head nouns. Lexically, the semantic features carried by the classifier in the Match-Long condition, but not in the Match-Short condition, allow the pre-activation of the relevant semantic information of the head noun. In other words, the head noun was less predictable both semantically and contextually in the Match-Short condition, and thereby elicited larger N400 amplitudes. Thus, our finding of an N400 effect in the Match-Short condition reflects difficulty in processing a word that is less predictable from the preceding context, supporting the view that the extent to which the preceding context pre-activates or predicts the specific word(s) modulates N400 effects (Federmeier, 2007; Lau, Holcomb, & Kuperberg, 2013).

4.4. Long-distance integration between the classifier and the head noun

Classifiers and their associate nouns are not only related semantically but also dependent syntactically, because every classifier needs a suitable noun as its licensor in the structure. In the Match-Short condition, the classifier is semantically congruous with, and thus syntactically dependent on, the adjacent noun. In the Match-Long and the Mismatch conditions, on the other hand, the mismatching classifier is kept in working memory and awaits a suitable noun as its licensor as more input unfolds. We observed typical P600 effects at the head noun in the Match-Long and the Mismatch conditions, in contrast to the Match-Short condition, indicating that the parser is performing long-distance integration to associate the head noun with the classifier. However, since
these two conditions differ in whether the head noun is compatible with the classifier, the P600 effects elicited may reflect fundamentally different processes.⁶

In the Match-Long condition, the long-distance integration is successful because the head noun matches the classifier as a legitimate licensor. The sentences in the Match-Long condition contained no syntactic violations or garden paths that require syntactic reanalysis. Thus, the elicitation of P600 effects here argues against the view that the P600 is associated solely with syntactic repair processes. Previously, the P600 effect has been found to correlate with the processing of filler-gap dependencies in wh-constructions (Felser et al., 2003; Fiebach et al., 2002; Gouvea et al., 2010; Kaan et al., 2000; Phillips et al., 2005; Ueno & Kluender, 2009). A careful examination of the Match-Long sentences reveals that two types of long-distance dependencies are involved: the filler-gap dependency of the RC construction and the classifier–noun dependency. Since we did not find a similar late positivity in the Match-Short condition, where the filler-gap dependencies of RC constructions were also involved, it is unlikely that the P600 effect elicited at the head noun in the Match-Long condition results from filler-gap dependencies. In other words, the larger late positivity observed at the head noun in the Match-Long condition should be associated with the processing of the long-distance dependency between the classifier and the head noun. This finding suggests that the P600 effect may reflect the processing of any kind of long-distance dependency, and that it is associated with the processing cost of updating and integrating novel references into the current mental model (Brouwer et al., 2012).

In the Mismatch condition, the classifier cannot be successfully integrated into the ongoing sentence either semantically or syntactically.
Semantically, it is not compatible with the head noun; syntactically, no further syntactic structure is available to resolve the mismatch. Since the mismatching classifier kept in working memory still requires a noun as its licensor and the parser sees the head noun as its last chance, the long-distance yet unsuccessful integration resulted in a larger P600 response than in the Match-Short condition. Given that the sentences in the Mismatch condition do not violate any syntactic rules but are semantically ill-formed, because the classifier is semantically incompatible with the head noun, our finding of P600 effects here is of particular interest in light of the recent discussion of the Semantic P600 phenomenon. Studies that observed P600 effects in syntactically intact sentences have attributed them to animacy violations and surface implausibility (Chow & Phillips, 2013; Stroud & Phillips, 2012), or to difficulty in semantic integration at different levels of the syntactic hierarchy (Zhou et al., 2010). Here, we suggest that the P600 effect may reflect a conflict between semantic incompatibility and syntactic requirement, a problem in general mapping, as suggested by Bornkessel-Schlesewsky and Schlesewsky (2008). Alternatively, the P600 may be related to the assessment of overall well-formedness, since the target sentence is ultimately unacceptable and the conflict cannot be resolved, consistent with the model proposed by Bornkessel and Schlesewsky (2006).

Importantly, we did not find any difference in the P600 responses between the Match-Long and the Mismatch conditions: the distribution and the amplitude of this component were roughly the same in the two conditions. That is, similar P600 effects were elicited by both successful and unsuccessful long-distance integration. Our finding implies that any account of P600 effects should be able to capture both types of processes: one involving long-distance classifier–noun dependency integration and the other involving overall semantic incompatibility.

6 Thanks to the reviewers’ valuable comments, the discussion of the P600 effects here was substantially improved by incorporating their suggestions.


4.5. Conclusion

In this study, we have attempted to explicate the processes that the parser undergoes in computing the temporary semantic incongruity and the long-distance dependency of classifier–noun pairs, together with their associated ERP responses. Four important findings are summarized here. First, our finding of a midline anterior negativity suggests that the semantic processing of classifier–noun incongruity is influenced by the encoded structural information, and that metacognitive strategies are employed to retain the temporary mismatch cue in working memory. Second, the lack of reduced P600 effects at DE suggests that the classifier–noun mismatch is not effective for RC prediction, implying a limit on the parser’s ability to use indirect syntactic cues to predict specific structures. Third, we found that long-distance semantic incongruity between the classifier and its associate noun elicited typical N400 effects, suggesting that the parser is able to compute and establish semantic agreement over a long distance. Lastly, our finding of P600 effects in both successful and unsuccessful long-distance integration between the classifier and the head noun extends our view of the P600 component and provides further evidence that the P600 is associated with more than just syntactic processing.

Acknowledgments

We are very grateful to the three anonymous reviewers for their valuable comments, which helped us improve this paper substantially. Chun-Chieh Hsu, Shu-Hua Tsai, and Jenn-Yeu Chen were affiliated with National Cheng Kung University when the experiment was conducted under a grant from the NCKU Landmark and Integrative Research Project (#D16-B0045) to Jenn-Yeu Chen. The preparation of this article was partly supported by a grant from the Ministry of Science and Technology in Taiwan (NSC 102-2410H-007-029) to Chun-Chieh Hsu. It was also supported in part by the Aim for the Top University Project of National Taiwan Normal University (NTNU), sponsored by the Ministry of Education in Taiwan, the International Research-Intensive Center of Excellence Program of NTNU, and the Ministry of Science and Technology in Taiwan (NSC 103-2911-I-003-301).

Appendix A


References

Ahrens, K. (1995). Classifier neutralization in Mandarin Chinese. Linguistic Notes from La Jolla, 18, 140–174.
Aikhenvald, A. Y. (2003). Classifiers: A typology of noun categorization devices. New York, US: Oxford University Press.
Allan, K. (1977). Classifiers. Language, 53, 285–311.
Bi, Y., Yu, X., Geng, J., & Alario, F. X. (2010). The role of visual form in lexical access: Evidence from Chinese classifier production. Cognition, 116(1), 101–109.
Bornkessel, I., & Schlesewsky, M. (2006). The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages. Psychological Review, 113(4), 787–821.
Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008). An alternative perspective on “semantic P600” effects in language comprehension. Brain Research Reviews, 59(1), 55–73.
Brouwer, H., Fitz, H., & Hoeks, J. (2012). Getting real about semantic illusions: Rethinking the functional role of the P600 in language comprehension. Brain Research, 1446, 127–143.
Brown, C. M., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. Journal of Cognitive Neuroscience, 5(1), 34–44.
Camblin, C. C., Gordon, P. C., & Swaab, T. Y. (2007). The interplay of discourse congruence and lexical association during sentence processing: Evidence from ERPs and eye tracking. Journal of Memory and Language, 56, 103–128.
Chen, J.-Y., & Wang, T.-Y. (2003). The nature of the classifier–noun agreement in Chinese word production. Paper presented at the 44th annual meeting of the psychonomic society, Vancouver, Canada.
Chen, Q., Xu, X., Tan, D., Zhang, J., & Zhong, Y. (2013). Syntactic priming in Chinese sentence comprehension: Evidence from event-related potentials. Brain and Cognition, 83(1), 142–152.
Chou, T. L., Lee, S. H., Hung, S. M., & Chen, H. C. (2012). The role of inferior frontal gyrus in processing Chinese classifiers. Neuropsychologia, 50(7), 1408–1415.
Chow, W. Y., & Phillips, C. (2013). No semantic illusions in the “Semantic P600” phenomenon: ERP evidence from Mandarin Chinese. Brain Research, 1506, 76–93.
Corbett, G. (1991). Gender. Cambridge: Cambridge University Press.
Craig, C. (Ed.). (1986). Noun classes and categorization: Proceedings of a symposium on categorization and noun classification. Amsterdam/Philadelphia: John Benjamins.
Croft, W. (1994). Semantic universals in classifier systems. Word, 45, 145–171.
Demestre, J., Meltzer, S., Garcia-Albea, J. E., & Vigil, A. (1999). Identifying the null subject: Evidence from event-related brain potentials. Journal of Psycholinguistic Research, 28(3), 293–312.
Deutsch, A., & Bentin, S. (2001). Syntactic and semantic factors in processing gender agreement in Hebrew: Evidence from ERPs and eye movements. Journal of Memory and Language, 45(2), 200–224.
Dien, J. (1998). Issues in the application of the average reference: Review, critique, and recommendations. Behavioral Research Methods, Instruments, and Computers, 30, 34–43.
Dien, J., & Santuzzi, A. M. (2005). Application of repeated measures ANOVA to high density ERP datasets: A review and tutorial. In T. C. Handy (Ed.), Event-related potentials: A methods handbook (pp. 57–81). Cambridge, MA: MIT Press.
Dixon, R. M. W. (1986). Noun classes and noun classification in typological perspective. In C. Craig (Ed.), Noun classes and categorization (pp. 105–112). Amsterdam: John Benjamins.
Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44(4), 491–505.
Felser, C., Clahsen, H., & Münte, T. F. (2003). Storage and integration in the processing of filler-gap dependencies: An ERP study of topicalization and wh-movement in German. Brain and Language, 87(3), 345–354.
Fernandez-Duque, D., Baird, J. A., & Posner, M. I. (2000). Executive attention and metacognitive regulation. Consciousness and Cognition, 9, 288–307.
Fiebach, C. J., Schlesewsky, M., & Friederici, A. D. (2002). Separating syntactic memory costs and syntactic integration costs during parsing: The processing of German WH-questions. Journal of Memory and Language, 47, 250–272.
Filik, R., & Leuthold, H. (2008). Processing local pragmatic anomalies in fictional contexts: Evidence from the N400. Psychophysiology, 45, 554–558.
Friederici, A. D., Gunter, T. C., Hahne, A., & Mauth, K. (2004). The relative timing of syntactic and semantic processes in sentence comprehension. NeuroReport, 15, 165–169.
Friederici, A. D., Hahne, A., & Mecklinger, A. (1996). Temporal structure of syntactic parsing: Early and late event-related brain potential effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(5), 1219–1248.
Friederici, A. D., Pfeifer, E., & Hahne, A. (1993). Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1, 183–192.
Gao, M. Y., & Malt, B. C. (2009). Mental representation and cognitive consequences of Chinese individual classifiers. Language and Cognitive Processes, 24(7–8), 1124–1179.
Gouvea, A. C., Phillips, C., Kazanina, N., & Poeppel, D. (2010). The linguistic processes underlying the P600. Language and Cognitive Processes, 25(2), 149–188.
Grinevald, C. (2000). A morphosyntactic typology of classifiers. In G. Senft (Ed.), Systems of nominal classification (pp. 50–92). Cambridge, UK: Cambridge University Press.
Gunter, T. C., Friederici, A. D., & Schriefers, H. (2000). Syntactic gender and semantic expectancy: ERPs reveal early autonomy and late interaction. Journal of Cognitive Neuroscience, 12(4), 556–568.
Gunter, T. C., Stowe, L. A., & Mulder, G. (1997). When syntax meets semantics. Psychophysiology, 34, 660–676.
Hagoort, P. (2003a). Interplay between syntax and semantics during sentence comprehension: ERP effects of combining syntactic and semantic violations. Journal of Cognitive Neuroscience, 15, 883–899.
Hagoort, P. (2003b). How the brain solves the binding problem for language: A neurocomputational model of syntactic processing. Neuroimage, 20, s18–s29.
Hagoort, P., Baggio, G., & Willems, R. M. (2009). Semantic unification. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 819–836). Cambridge, MA: MIT Press.
Hahne, A., & Friederici, A. D. (2002). Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research, 13, 339–356.
Hoeks, J. C. J., Stowe, L. A., & Doedens, G. (2004). Seeing words in context: The interaction of lexical and sentence level information during reading. Cognitive Brain Research, 19, 59–73.
Hsu, C. C. N. (2006). Issues in head-final relative clauses in Chinese: Derivation, processing, and acquisition (Ph.D. dissertation), University of Delaware.
Hsu, C. C. N., Phillips, C., & Yoshida, M. (2005). Cues for head-final relative clauses in Chinese. Paper presented at the 18th annual CUNY conference on human sentence processing. University of Arizona, Tucson, AZ.
Hsu, C. C. N., Hurewitz, F., & Phillips, C. (2006). Contextual and syntactic cues for head-final relative clauses in Chinese. Paper presented at the 19th annual CUNY conference on human sentence processing. The City University of New York, NY.
Kaan, E., Harris, A., Gibson, E., & Holcomb, P. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15(2), 159–201.
Kim, A., & Osterhout, L. (2005). The independence of combinatory semantic processing: Evidence from event-related potentials. Journal of Memory and Language, 52, 205–225.
King, J. W., & Kutas, M. (1995). Who did what and when? Using word- and clause-level ERPs to monitor working memory usage in reading. Journal of Cognitive Neuroscience, 7(3), 376–395.
Kluender, R., & Kutas, M. (1993). Bridging the Gap: Evidence from ERPs on the processing of unbounded dependencies. Journal of Cognitive Neuroscience, 5(2), 196–214.
Kolk, H. H. J., Chwilla, D. J., Van Herten, M., & Oor, P. J. W. (2003). Structure and limited capacity in verbal working memory: A study with event-related potentials. Brain and Language, 85, 1–36.
Kuo, J. Y., & Sera, M. D. (2009). Classifier effects on human categorization: The role of shape classifiers in Mandarin Chinese. Journal of East Asian Linguistics, 18, 1–19.
Kuperberg, G. R. (2007). Neural mechanisms of language comprehension: Challenges to syntax. Brain Research, 1146(1), 23–49.
Kuperberg, G. R., Sitnikova, T., Caplan, D., & Holcomb, P. J. (2003). Electrophysiological distinctions in processing conceptual relationships within simple sentences. Cognitive Brain Research, 17, 117–129.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4(12), 463–470.
Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event-related brain potential (ERP). Annual Review of Psychology, 62, 621–647.
Kutas, M., Van Petten, C., & Kluender, R. (2006). Psycholinguistics electrified II (1994–2005). In M. Traxler & M. A. Gernsbacher (Eds.), Handbook of psycholinguistics (pp. 659–724). Academic Press.
Lakoff, G. (1986). Classifiers as a reflection of mind. In C. G. Craig (Ed.), Noun classes and categorization (pp. 13–51). John Benjamins.
Lau, E. F., Holcomb, P. J., & Kuperberg, G. R. (2013). Dissociating N400 effects of prediction from association in single-word contexts. Journal of Cognitive Neuroscience, 25(3), 484–502.
Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9(12), 920–933.
Liotti, M., Woldorff, M. G., Perez, R., & Mayberg, H. S. (2000). An ERP study of the temporal course of the Stroop color–word interference effect. Neuropsychologia, 38, 701–711.
Löw, A., Rockstroh, B., Cohen, R., Hauk, O., Berg, P., & Maier, W. (1999). Determining working memory from ERP topography. Brain Topography, 12, 39–47.
Molinaro, N., Barber, H. A., & Carreiras, M. (2011). Grammatical agreement processing in reading: ERP findings and future directions. Cortex, 47, 908–930.
Mueller, J. L., Hahne, A., Fujii, Y., & Friederici, A. D. (2005). Native and nonnative speakers’ processing of a miniature version of Japanese as revealed by ERPs. Journal of Cognitive Neuroscience, 17(8), 1229–1244.
Nieuwland, M. S., & Van Berkum, J. J. A. (2006). When peanuts fall in love: N400 evidence for the power of discourse. Journal of Cognitive Neuroscience, 18, 1098–1111.
Osterhout, L., Holcomb, P. J., & Swinney, D. A. (1994). Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 786–803.
Osterhout, L., & Mobley, L. A. (1995). Event-related brain potentials elicited by failure to agree. Journal of Memory and Language, 34(6), 739–773.
Phillips, C., Kazanina, N., & Abada, S. H. (2005). ERP effects of the processing of syntactic long-distance dependencies. Cognitive Brain Research, 22(3), 407–428.
Saalbach, H., & Imai, M. (2007). Scope of linguistic influence: Does a classifier system alter object concepts? Journal of Experimental Psychology: General, 136(3), 485–510.
Sakai, Y., Iwata, K., Riera, J., Wan, X., Yokoyama, S., Shimoda, Y., et al. (2005). Brain activities related to the integration of nouns and numeral classifiers in Japanese – An ERP study. IEICE technical report (institute of electronics, information and communication engineers) (Vol. 105(170), pp. 17–22).
Sakai, Y., Fukumitsu, Y., Yusa, N., & Koizumi, M. (2007). An ERP study of the classifier system in Japanese: Syntactic or semantic? Paper presented at the 20th annual CUNY conference on human sentence processing, La Jolla, California.
Senft, G. (Ed.). (2000). Systems of nominal classification. Cambridge, UK: Cambridge University Press.
Stroud, C., & Phillips, C. (2012). Examining the evidence for an independent semantic analyzer: An ERP study in Spanish. Brain and Language, 120(2), 108–126.
Tai, J. H.-Y. (1994). Chinese classifier systems and human categorization. In M. Y. Chen & O. J. L. Tzeng (Eds.), In honor of William S.-Y. Wang: Interdisciplinary studies on language and language change (pp. 479–494). Taipei: Pyramid Press.
Tien, Y. M., Tzeng, O. J. L., & Hung, D. L. (2002). Semantic and cognitive basis of Chinese classifiers: A functional approach. Language and Linguistics, 3(1), 101–132.
Tsai, S.-H. R., Hsu, C.-C. N., Yang, C.-L., & Chen, J.-Y. (2008). An event-related potential (ERP) study of the classifier–noun relationship in Mandarin Chinese. Paper presented at the British Association of Cognitive Neuroscience (BACN) joint annual meeting with the Wales Institute for Cognitive Neuroscience (WICN). Swansea University, UK.
Tucker, D. M. (1993). Spatial sampling of head electrical fields: The geodesic sensor net. Electroencephalography and Clinical Neurophysiology, 87, 154–163.
Ueno, M., & Kluender, R. (2009). On the processing of Japanese wh-questions: An ERP study. Brain Research, 1290, 63–90.
Van Petten, C., Coulson, S., Weckerly, J., Federmeier, K. D., Folstein, J., & Kutas, M. (1999). Lexical association and higher-level semantic context: An ERP study. Journal of Cognitive Neuroscience Supplement, 46.
Van Veen, V., & Carter, C. S. (2002). The anterior cingulate as a conflict monitor: fMRI and ERP studies. Physiology & Behavior, 77(4–5), 477–482.
Weskott, T., & Fanselow, G. (2011). On the informativity of different measures of linguistic acceptability. Language, 87(2), 249–273.
Wicha, N. Y., Moreno, E. M., & Kutas, M. (2004). Anticipating words and their gender: An event-related brain potential study of semantic integration, gender expectancy, and gender agreement in Spanish sentence reading. Journal of Cognitive Neuroscience, 16(7), 1272–1288.
Wu, F., Haskell, T., & Andersen, E. (2006). The interaction of lexical, syntactic, and discourse factors in on-line Chinese parsing: Evidence from Eye-tracking. Paper presented at the 19th annual CUNY conference on human sentence processing. The City University of New York, NY.
Wu, F., Kaiser, E., & Andersen, E. (2009). The effects of classifiers in predicting Chinese relative clauses. In M. Grosvald & D. Soares (Eds.), Proceedings of the western conference on linguistics (WECOL) (pp. 318–329). Davis: University of California.
Zhang, H. (2007). Numeral classifiers in Mandarin Chinese. Journal of East Asian Linguistics, 16, 43–59.
Zhang, S., & Schmitt, B. H. (1998). Language-dependent classification: The mental representation of classifiers in cognition, memory, and ad evaluations. Journal of Experimental Psychology: Applied, 4(4), 375–385.
Zhang, Y., Zhang, J., & Min, B. (2012). Neural dynamics of animacy processing in language comprehension: ERP evidence from the interpretation of classifier–noun combinations. Brain and Language, 120, 321–331.
Zhou, X., Jiang, X., Ye, Z., Zhang, Y., Lou, K., & Zhan, W. (2010). Semantic integration processes at different levels of syntactic hierarchy during sentence comprehension: An ERP study. Neuropsychologia, 48(6), 1551–1562.