Acta Psychologica 143 (2013) 253–260
The perceptual nature of audiovisual interactions for semantic knowledge in young and elderly adults

Guillaume T. Vallet a,b,⁎, Martine Simard b, Rémy Versace a, Stéphanie Mazza a

a Laboratoire d'Étude des Mécanismes Cognitifs, University Lyon 2, 5 avenue Pierre-Mendès France, 69676 Bron Cedex, France
b School of Psychology, Laval University, 2325 rue des Bibliothèques, Quebec City, Quebec G1V 0A6, Canada

⁎ Corresponding author at: Laboratoire EMC, Université Lyon 2, 5 avenue Pierre-Mendès France, 69676 Bron Cedex, France. Tel.: +33 4 78 77 43 50; fax: +33 4 78 77 43 74. E-mail addresses: [email protected] (G.T. Vallet), [email protected] (M. Simard), [email protected] (R. Versace), [email protected] (S. Mazza).

http://dx.doi.org/10.1016/j.actpsy.2013.04.009
Article info

Article history: Received 9 July 2012; Received in revised form 5 April 2013; Accepted 6 April 2013; Available online 15 May 2013.

PsycINFO classification: 2343 Learning & memory; 2860 Gerontology; 2320 Sensory perception.

Keywords: Memory; Perception; Grounded cognition; Masking; Audiovisual priming.
Abstract

Audiovisual interactions for familiar objects are at the core of perception. The nature of these interactions depends on the amodal – sensory-abstracted – or modal – sensory-dependent – approach of knowledge. According to these approaches, the interactions should be respectively semantic and indirect or perceptual and direct. This issue is therefore a central question for memory and perception, yet the nature of these interactions remains unexplored in young and elderly adults. We used a cross-modal priming paradigm combined with a visual masking procedure applied to half of the auditory primes. The data demonstrated similar results in the young and elderly adult groups. The mask interfered with the priming effect in the semantically congruent condition, whereas the mask facilitated the processing of the visual target in the semantically incongruent condition. These findings indicate that audiovisual interactions are perceptual, and support the grounded cognition theory.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction

Grounded cognition theories state that memory is grounded in its sensory–motor features, defining modal knowledge (see Barsalou, 2008). This assumption questions the nature of the multisensory interactions for semantic knowledge, which are generally described as indirect and semantic in nature. A sensory stimulus would activate a sensory-abstracted semantic representation – the concept – in semantic memory, which, in turn, would activate the sensory representations associated with that concept. Conversely, grounded cognition theories suppose that the first stimulus directly activates the other sensory representations. The first general objective of the present study is to explore the nature of the multisensory interactions for semantic knowledge and thus, indirectly, to assess the nature of knowledge. The second general objective is to test the effect of aging on these interactions, because this issue remains unexplored in healthy elderly individuals.

Multisensory interactions for semantic knowledge can be assessed using a cross-modal priming paradigm. In this paradigm, the prime, in one sensory modality, facilitates the processing of a semantically
associated target in another sensory modality (e.g. Doehrmann & Naumer, 2008). The cross-modal priming effect and its nature remain poorly understood in young adults (see Ryan et al., 2008; Vallet, Riou, Versace, & Simard, 2011) and almost unexplored in other populations such as normal aging (but see Ballesteros & Mayas, 2009). This effect has been observed across all modalities, such as touch and vision (Easton, Greene, & Srinivas, 1997), but most studies have been conducted with the auditory and visual modalities (see Schneider, Engel, & Debener, 2008). However, of the eight studies published on the cross-modal effect in normal aging, two did not report a priming effect (for a discussion on this topic, see Vallet, Simard, & Versace, 2012). The nature of the task might explain the discrepancies between these studies, since the two studies with negative results used a production task (word-stem completion), unlike the others. One conclusion would be to avoid production tasks when studying the cross-modal priming effect in normal aging (see also Fleischman, Wilson, Gabrieli, & Schneider, 2005 in Alzheimer's disease). More data are thus needed to better understand priming effects in healthy aging.

Within-modal (the same stimulus as prime and target) and between-modal priming were thought to be perceptual and semantic, respectively (Doehrmann & Naumer, 2008; Schacter, 1992), because the cross-modal priming effect was generally reported as weaker than the within-modal priming effect (for a review, see Roediger & McDermott, 1993). However, all these studies used words as experimental material. When different material is used, equivalent priming effects between within-modal
and cross-modal priming paradigms are reported, between touch and vision (objects; Easton, Srinivas, & Greene, 1997; Reales & Ballesteros, 1999) as well as between audition and vision (pictures and sounds; Ballesteros, Gonzalez, Mayas, Garcia-Rodriguez, & Reales, 2009; Schneider et al., 2008). Similar priming magnitudes suggest that a common representation underlies the sensory representations of the same object (see Chen & Spence, 2010; but see Greene, Easton, & LaShell, 2001 for discussion). This representation might be grounded in its sensory–motor features – modal knowledge (Barsalou, 2008) – or be abstracted from them – amodal knowledge (Tulving, 1972).

The cross-modal priming effect is currently interpreted within the amodal approach of knowledge (e.g. Chen & Spence, 2011). The processing of one stimulus in one modality (e.g. a meow sound) would activate, through bottom-up activation, the corresponding concept in semantic memory (‘cat’). The processing (e.g. identification) of a semantically associated stimulus (a cat's picture) would then benefit from the pre-activation of this semantic concept. The concept ‘cat’ is independent of the sensory (meowing sound, visual representation) and motor (stroking a cat, playing with a cat) components of the object, but these components are indirectly related to each other through the concept. From the opposite point of view, knowledge remains grounded in its sensory–motor features and no context-abstracted semantic intermediary should exist. Consequently, processing a familiar sound should automatically and directly activate the related representations in the other sensory modalities (Molholm, Martinez, Shpaner, & Foxe, 2007). The cognitive system is supposed to simulate, through mental imagery, the situation to be processed (for a review, see Barsalou, 2008). The simulation seems to occur at a modal level (i.e. perceptually based for perceptual features, see Pecher, Zeelenberg, & Barsalou, 2003) because it occurs in the same brain areas as perception (Beauchamp, McRae, Martin, & Barsalou, 2007; Slotnick & Schacter, 2006) and action (Hauk & Pulvermüller, 2004; Saygin, McCullough, Alac, & Emmorey, 2010). The overlap between memory and perceptuo-motor features suggests that memory and perception are probably not hierarchically organized but rather interact at the same level or are confounded (e.g. Riou, Lesourd, Brunel, & Versace, 2011).

Supporting the modal knowledge hypothesis, Vallet, Brunel, and Versace (2010) used a two-way cross-modal priming paradigm in which participants completed a categorization task (animal/artifact) in two phases. In the study phase, all the primes were first presented (e.g. auditory primes) and then, in the test phase, all the targets were processed (e.g. visual targets). The novelty of this paradigm was to add a meaningless mask sharing the modality of the target (e.g. a visual mask for auditory primes) during the presentation of half of the primes. The results indicated that the mask interfered with the priming effect in the test phase, without any significant effect of the mask in the study phase. This mask interference appears unlikely in an amodal view of knowledge since the mask shared the modality of the target and was meaningless. Indeed, in the amodal approach, the activation of the associated stimuli should be disrupted only by a concurrent semantic stimulus processed during the prime presentation (Mulatti & Coltheart, 2012).
Nevertheless, the results of this study cannot fully rule out the amodal hypothesis, since an alternative explanation remains possible: the mask could have distracted attention away from the prime during its processing. This attention hypothesis is compatible with both the modal and amodal approaches. It should thus be excluded in order to validate first the modal approach of knowledge, and then grounded cognition theories. The specific objective of the present study is therefore to assess the attention hypothesis in a masked cross-modal priming paradigm. According to this hypothesis, the mask effect should be the same regardless of the semantic relationship between the stimuli. On the contrary, the modal approach of knowledge states that the interference effect of the mask should be restricted to the semantically congruent condition (prime and target referring to the same object).

The second specific objective of the present study is to test the effect of aging on the mask interference effect. A first reason to include healthy elderly adults is that grounded cognition has almost never been
directly tested in this population. However, the validity of grounded cognition theories in normal aging is a mandatory step for these theories to be applied to the whole spectrum of cognition. In the first study on this issue (Dijkstra, Kaschak, & Zwaan, 2007), the authors demonstrated that congruent body postures between encoding and retrieval facilitate the retrieval of autobiographical memories. This result showed that body posture is part of the memory trace, as suggested by grounded cognition theories. In a second study, on the nature of knowledge in aging (Vallet, Simard, & Versace, 2011), the same procedure as the one used by Vallet et al. (2010) was applied to healthy elderly participants. The same results, and the same limits, as those reported in young adults were found in elderly adults. In other words, the attention hypothesis should also be ruled out to validate modal knowledge in normal aging.

These questions were tested by adapting the masked cross-modal priming paradigm previously used by Vallet et al. (2010). In each trial, the auditory prime was followed, 500 ms later, by the visual target. Half of the primes were presented with a meaningless visual mask. In the semantically congruent condition, the sound prime and the picture target referred to the same object (e.g. “meow” sound and cat picture). In the semantically incongruent conditions, the sound prime and the picture target belonged to two different categories (e.g. “bark” sound and piano picture) or to two different exemplars within the same category (e.g. clarinet sound and guitar picture). The potential involvement of attention and/or executive functions in the masking effect was tested by correlational analyses between a neuropsychological battery and the priming paradigm. We hypothesized that the interference effect of the mask would be restricted to the semantically congruent condition compared to the incongruent condition (i.e. an interaction between the mask and the semantic congruency), supporting the modal approach of knowledge. It was predicted that elderly adults would present the same pattern of results as the young adults for the mask effect (i.e. an interaction between the mask and the semantic congruency). Finally, contrary to the predictions of the attention hypothesis, it was expected that attention and/or executive functions would not be specifically associated with the masked conditions but rather with all the experimental conditions in elderly adults (see Vallet, Simard et al., 2011).

2. Method

The nature of the cross-modal interactions for semantic knowledge was assessed using a masked cross-modal priming paradigm from the auditory to the visual modality. Half of the auditory primes were presented simultaneously with a meaningless visual mask. The semantic congruency between the prime and the target was manipulated in order to test the specificity of the possible mask interference (i.e. the attention hypothesis). In the category-congruent condition, the sound prime and the picture target belonged to the same semantic category (animals vs. artifacts). This condition was divided into two equal sub-conditions: the item-congruent condition, in which the sound prime and the picture target referred to the same object (“meow” sound–cat picture), and the item-incongruent condition (guitar sound–hornet picture). In the category-incongruent condition, the prime and the target belonged to two different categories.
The general design was a 3 (congruency: item-congruent, item-incongruent, category-incongruent) × 2 (mask: unmasked primes, masked primes) within-subject design.

2.1. Participants

Thirty-two students and 32 healthy elderly adults were recruited, respectively, from the Laval University student volunteer pool and from public announcements in Quebec City (see Table 1 for demographic data). All participants were native French-speaking Quebecois. Health information was gathered from all participants during an extensive interview about their medical history and medication.
Participants with a medical history and/or taking medications for conditions with known sensory or neurological effects were excluded. All participants underwent a neuropsychological screening battery including hearing perception [audiogram measuring audio frequencies and auditory volume (Hearing Test, Digital Recordings)], visual acuity [Monoyer's test at 3 meters], cognitive speed [simple reaction time task (SRT)], a standard test of general cognitive functioning [Mini-Mental State Examination (Folstein, 1975)], verbal memory [RL/RI-16 free and cued recall task (Van der Linden, 2004)], executive functions [Trail Making Test (Delis, Kaplan, & Kramer, 2001; Lezak, Howieson, Loring, & Fisher, 2004), the Stroop test (Stroop, 1935) and the Hayling Test (Burgess & Shallice, 1997)], visuo-spatial abilities [Visual Object and Space Perception battery (Warrington & James, 1991); shape detection, incomplete letters and number location sub-tests], and executive-semantic functions [word fluency test (Cardebat, Doyon, Puel, & Goulet, 1990)]. The neuropsychological data of the participants are summarized in Table 1.

This research was approved by the Ethical Committee of the “Centre de recherche Université Laval Robert-Giffard” and all participants signed an informed consent form before the experimental session started. Each participant was tested individually, for a total duration of approximately 2 h for the young participants and about 3 h for the elderly participants. The young participants completed the experimental protocol in one session. The older adults completed the protocol in two sessions lasting 1.5 h each, usually separated by one week.

Table 1
Demographic and neuropsychological data by group.
                                         Young adults           Elderly adults
                                         Mean        SD         Mean        SD
Demographics
  Sex (F/M)                              25/7                   24/8
  Age (years)                            23.44       4.23       72.72       6.88⁎
  Education (years)                      16.19       2.18       14.31       5.02
Neuropsychological screening
  Global cognition
    MMSE                                 29.69       0.54       28.69       1.00⁎
  Speed
    Simple reaction time                 277         27.80      302         50.86⁎
  Perception
    Visual acuity uncorrected            5.85        3.93       4.52        2.83
    Visual acuity corrected              9.07        2.41       8.05        1.64
    Auditory acuity                      3.11        0.72       6.95        2.83⁎
  VOSP
    Screening                            19.84       0.45       19.66       0.60
    Letters                              19.69       0.59       19.38       0.75
    Number localisation                  9.69        0.59       8.72        1.16
Memory RL/RI
  Immediate recall (/16)                 15.94       0.25       15.28       0.77⁎
  Sum of free recall (/48)               39.28       4.16       29.81       6.72⁎
  Sum of total recall (/48)              47.59       0.61       46.25       4.10⁎
  Delayed free recall (/16)              15.31       0.86       14.38       6.67⁎
  Delayed total recall (/16)             16.00       0.00       15.84       0.39⁎
  Recognition                            16.00       0.00       15.86       0.34⁎
Executive
  Stroop
    Color                                55.44       8.46       68.16       14.01⁎
    Word                                 40.66       5.68       47.81       6.96⁎
    Word–color                           91.03       16.74      149         450.55⁎
    Errors word–color                    0.12        0.42       2.72
  TMT
    Part 1                               15.78       3.96       27.25       7.33⁎
    Part 2                               21.84       6.39       42.84       13.18⁎
    Part 3                               21.16       6.22       47.31       15.54⁎
    Part 4                               49.22       12.98      111.59      48.25⁎
    Part 5                               13.44       4.20       28.50       10.38⁎
    Errors part 4                        0.38        0.87       0.88
  Hayling test
    Part A                               43.12       2.81       49.09       5.46⁎
    Part B                               109.47      29.70      135         41.40⁎
    Mixed
  Verbal fluency
    Categorial                           27.84       4.84       18.84       3.50⁎
    Phonemic                             14.13       2.94       11.81       3.66

Notes. Executive = executive functions; SD = standard deviation.
⁎ p < 0.05.
2.2. Stimuli used in the priming paradigm

Overall, 256 stimuli were used: half of them (128) were sounds and half were color photographs illustrating the objects in their natural context (see Table 2 for the distribution of the conditions and the number of stimuli per condition). Half of the stimuli (128) represented familiar animals (e.g. cow, cat, dog, lion) and the other half familiar artifacts (e.g. piano, guitar, bell, airplane). All the photographs had the same format (393 × 295 pixels with a resolution of 72 × 72 dots per inch). All the sounds lasted 1000 ms. Eight visual color masks were created with Photoshop CS3 for Mac by applying a ripple effect to 8 new color pictures. This procedure was meant to make the source picture impossible to identify, so that the result looked like an abstract painting. Different masks were created to avoid a systematic association between the stimuli and a specific mask, and to avoid repetition. Each mask was associated with sixteen different sounds.

The experiment was composed of 128 trials, including 16 practice trials, 56 category-incongruent trials (half masked), and 56 category-congruent trials with half of these trials in the item-congruent and the other half in the item-incongruent condition (half masked in both conditions), constituting 14 trials per condition. The stimuli included in each kind of experimental condition are described below.

One hundred and forty stimuli (half photographs) consisted of 70 familiar bimodal objects (sounds matching the pictures) and were included in the category-congruent condition. Of these stimuli, 55 bimodal objects were the same objects (55 photographs and 55 sounds) as the ones used in our previous experiment (Vallet et al., 2010). Fifteen new bimodal items were added in order to adapt the experimental material to the Quebecois population. The selection of the stimuli was based on a new pre-test with 5 new participants. The pre-test assessed the sound–picture association of the items, as well as their recognition in each modality, to ensure they would be as prototypical and familiar as possible (see Vallet et al., 2010). For the needs of the categorization task, 84 new stimuli (half photographs) were added. These stimuli were objects that are not typically associated with a sound (e.g. ant, table), or hard to identify in the auditory modality (e.g. clarinet sound, birdsong). These stimuli were included in the category-incongruent condition and were excluded from the analyses since they were not counterbalanced with the items in the other conditions. Finally, 32 new stimuli (half photographs) were used in the 16 practice trials and not included in the analyses. The practice trials represented all the experimental conditions, and they were the same for all the participants. A trial-unique paradigm was chosen in order to avoid uncontrolled effects, which might result from multiple presentations (see Barense, Gaffan, & Graham, 2007).

While contrasting category-incongruent with item-congruent conditions is the most common design for manipulating semantic congruency (e.g. Chen & Spence, 2010), the present study focused on the category-congruent condition in order to be more specific. In this case, the prime and the target belong to the same semantic category, but may or may not refer to the same semantic object. In other words, the manipulation of semantic congruency was at the exemplar level rather than at the category level.

Table 2
Distribution of the number of stimuli as a function of the experimental factors (excluding the stimuli used in the practice trials).
Category-congruency        Item-congruency       Mask         Stimuli                      Total
Category-congruent         Item-congruent        Unmasked     14 bimodal stimuli           28
                                                 Masked       14 bimodal stimuli           28
                           Item-incongruent      Unmasked     14 sounds + 14 pictures      28
                                                 Masked       14 sounds + 14 pictures      28
Category-incongruent                             Unmasked     28 sounds + 28 pictures      56
                                                 Masked       28 sounds + 28 pictures      56

Note. Bimodal stimuli = sound and its matching picture (meow sound – cat's picture).
Fig. 1. Illustration of the experimental protocol. A sound is presented as the prime. For half of the sound primes, a meaningless visual mask is presented. Then, a photograph is categorized as an animal or as an artifact.
The drawback of this choice is that the category-incongruent condition had to be included anyway in order to prevent any possible prediction in the categorization task. All the stimuli belonging to the category-congruent condition were thus counterbalanced across all sub-conditions: item-congruent unmasked and masked, and item-incongruent unmasked and masked. The participants were randomly assigned to one of the 4 experimental groups, but all participants experienced all the conditions. The stimuli in the category-incongruent condition were not counterbalanced with the other conditions (category-congruent), because it was impossible to find new items with the same bimodal, familiar, and recognizable features as those previously chosen. These items were, therefore, excluded from the analyses. All the stimuli and conditions were presented in a pseudo-random order.
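To make the counterbalancing concrete, the sketch below shows one possible way of rotating the analysed bimodal objects through the four sub-conditions across the four participant groups. It is an illustrative assumption, not the authors' actual assignment procedure; the object names and the Latin-square rotation are hypothetical.

```python
# One possible Latin-square rotation for the counterbalancing described above
# (an illustrative assumption, not the original assignment procedure).
from itertools import cycle

SUB_CONDITIONS = ["item-congruent/unmasked", "item-congruent/masked",
                  "item-incongruent/unmasked", "item-incongruent/masked"]

def assign(objects, group):
    """Rotate the condition list by `group` (0-3) and deal it out over the objects."""
    rotated = SUB_CONDITIONS[group:] + SUB_CONDITIONS[:group]
    return dict(zip(objects, cycle(rotated)))

objects = [f"object_{i:02d}" for i in range(56)]      # placeholder item names
lists = {g: assign(objects, g) for g in range(4)}     # one stimulus list per participant group

# Across the four groups, every object appears once in each of the four sub-conditions,
# and within each list every sub-condition receives 14 objects.
assert all(len({lists[g][obj] for g in range(4)}) == 4 for obj in objects)
```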
2.3. Priming task procedure

The experiment was run on a Macintosh MacBook Pro, with PsyScope X B53 software (Cohen, MacWhinney, Flatt, & Provost, 1993) used to set up and manage the experiment. Each participant was tested individually in one session lasting approximately 12 min (see Fig. 1 for an illustration of the experimental procedure). The participants were informed that they were taking part in a study on reaction speed to visual stimuli. They were told that, before the presentation of each picture, a sound would be played that might or might not match the picture. They were also informed that a colored rectangle might sometimes appear on the screen while they heard the sound. The participants were instructed not to pay attention to these stimuli (sounds and rectangles) and to focus their attention only on the pictures for the categorization task. Each participant adjusted the auditory intensity to a comfortable level.

Each trial started with a fixation point displayed for 800 ms. This was followed, 300 ms later, by a 1000 ms sound presented binaurally through a stereo headset: half of the sounds corresponded to animals and the other half to artifacts. For half of these sounds, a visual mask was presented simultaneously for 1000 ms. Five hundred ms later, a centrally positioned picture appeared for 1000 ms, followed by a white screen displayed for 4000 ms or until the participant responded. The participants were asked to judge, as quickly and as accurately as possible, whether the picture corresponded to an animal or to an artifact. Response logging started with the presentation of the picture target. The response keys were ‘d’ and ‘k’, counterbalanced across participants.
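As a reading aid, the following sketch encodes the trial timeline just described (all durations in ms). It is a plain-Python reconstruction for illustration only, not the original PsyScope X script; names such as Trial and build_timeline are hypothetical.

```python
# Illustrative reconstruction of the trial timeline (not the original PsyScope script).
from dataclasses import dataclass

FIXATION_MS = 800          # fixation point
GAP_MS = 300               # blank interval before the prime
PRIME_MS = 1000            # auditory prime; the visual mask, when present, is simultaneous
ISI_MS = 500               # interval between prime offset and target onset
TARGET_MS = 1000           # picture target (response logging starts here)
RESPONSE_WINDOW_MS = 4000  # white screen, or until the participant responds

@dataclass
class Trial:
    sound: str        # e.g. "meow.wav"
    picture: str      # e.g. "cat.jpg"
    masked: bool      # True for half of the primes

def build_timeline(trial):
    """Return the ordered (event, duration in ms) pairs for one trial."""
    prime = "sound + visual mask" if trial.masked else "sound"
    return [("fixation", FIXATION_MS), ("blank", GAP_MS), (prime, PRIME_MS),
            ("blank (ISI)", ISI_MS), ("picture, animal/artifact decision", TARGET_MS),
            ("response window", RESPONSE_WINDOW_MS)]

for event, duration in build_timeline(Trial("meow.wav", "cat.jpg", masked=True)):
    print(f"{event:<35} {duration} ms")
```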
3. Results

3.1. Statistical analysis

The data were analyzed using R version 2.11.1 (R Foundation for Statistical Computing). The practice trials were not included in the analyses, nor were the category-incongruent items, since the latter were not counterbalanced with the items in the other conditions.1 The mean correct reaction times and mean rates of correct responses were calculated across subjects for each experimental condition. Reaction times exceeding 3000 ms and 2.5 standard deviations above each participant's mean in each condition, as well as reaction times of less than 150 ms and 2.5 standard deviations below each participant's mean per condition, were treated as outliers and removed from the analyses (1% of the data for each group).

Separate analyses of variance (ANOVA) were performed on the percentages of correct responses and on the correct reaction times. The analyses were performed with subjects as a random variable, according to a 2 (group: young vs. elderly) × 2 (item-congruency: item-congruent vs. item-incongruent) × 2 (mask: masked vs. unmasked) design, with group as the between-subject factor and the other factors as within-subject factors. Post-hoc analyses were conducted using two-tailed Student's t-tests. An alpha level of 0.05 was used for the ANOVA and the t-tests.

Pearson correlations were calculated between the executive function scores and the experimental conditions. A first step before conducting the correlation analysis was to examine whether the different conditions were inter-correlated or not (see Appendix A). In the two groups of participants, the reaction times were very highly inter-correlated (above 0.80, p < .001). Such high correlations between conditions showed that these conditions were not independent and should reflect a common entity (Tabachnick & Fidell, 2007). As a consequence, these conditions were averaged together rather than used as separate scores: the mean of the reaction times across the four conditions was calculated and used for the analyses.

1 An ANOVA including the category-incongruent condition revealed no effect for the correct response rates. For the reaction times, the ANOVA revealed an effect of the group, F(1,62) = 50.17, p < .05, but no interaction between the group and the other factors. There was a main effect of the semantic-congruency factor, F(1,62) = 10.96, p < .05, and a significant semantic-congruency × mask interaction, F(1,62) = 10.35, p < .05. The unmasked congruent items were processed faster than the masked ones, t(63) = 3.94, p < .05. Conversely, the masked incongruent items were processed faster than the unmasked ones, t(63) = 2.30, p < .05. Finally, no difference was observed between the masked and unmasked items within the category-incongruent condition, t(63) = .83, p = .41.
Yet the correct response rates were not inter-correlated, except for the item-congruent masked and the item-incongruent masked conditions, which were correlated in the young group (see Table A1 in Appendix A). The different conditions of the correct response rates were therefore used directly in the analyses. In order to avoid type I errors due to the multiple correlations performed, a more severe alpha level of 0.005 was set for these latter analyses.
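For illustration, the sketch below applies the trimming rule described in this section under one reading of it: a trial is kept only if its reaction time falls inside the absolute 150–3000 ms window and within 2.5 standard deviations of that participant's mean for that condition. The data-frame column names are hypothetical, not the authors' actual variable names.

```python
# Hedged sketch of the per-participant, per-condition outlier trimming (one reading of
# the rule above); column names ("subject", "congruency", "masked", "rt") are assumptions.
import pandas as pd

def trim_outliers(df: pd.DataFrame, rt_col: str = "rt") -> pd.DataFrame:
    grouped = df.groupby(["subject", "congruency", "masked"])[rt_col]
    m = grouped.transform("mean")                       # per-cell mean for each trial's cell
    sd = grouped.transform("std")                       # per-cell standard deviation
    keep = df[rt_col].between(150, 3000) & (df[rt_col] - m).abs().le(2.5 * sd)
    return df[keep]
```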
3.2. Short term cross-modal priming paradigm

3.2.1. Experimental results

The mean reaction times and correct response rates by group and by factor are presented in Table 3. The analyses performed on the correct responses revealed no significant effect (for all factors and interactions, F(1, 62) < 1). The overall correct response rate was 97.05% (97.48% and 96.61% for the young and elderly adults, respectively), suggesting a ceiling effect.

The analyses of reaction times revealed a main effect of group (F(1, 62) = 50.17, p < .05, η²partial = .45), with slower reaction times in the group of older adults. A main effect of item-congruency was also observed (F(1, 62) = 10.96, p < .05, η²partial = .15), with congruent stimuli processed faster than incongruent stimuli. Finally, the data showed an interaction between the item-congruency and mask factors (F(1, 62) = 14.35, p < .05, η²partial = .19). There was no main effect of the mask (F(1, 62) = 1.27, p > .05, η²partial = .02), no interaction between the group and the item-congruency (F(1, 62) < 1), no interaction between the group and the mask (F(1, 62) < 1), and no interaction between all the factors (F(1, 62) < 1).

The detailed analysis of the interaction between the mask and the item-congruency (see Fig. 2, including the effect of aging) showed that the items in the semantically congruent unmasked condition were processed faster than the masked items (t(63) = 3.94, p < .01, d = .21). In contrast, the items in the unmasked item-incongruent condition were processed more slowly than the masked items (t(63) = 2.30, p < .05, d = .11). Lastly, the size of the mask effect in the congruent and incongruent conditions appeared similar (t(63) = 0.91, p = .37).

The two groups of participants exhibited a significant priming effect of 33 ms (t(63) = 5.16, p < .05, d = .58), calculated by subtracting the mean reaction time of the unmasked item-congruent condition from that of the unmasked item-incongruent condition. The magnitude of the priming effect appeared to be equivalent between the two groups (t(62) = 0.17, p = .86). Raw data suggest that elderly adults may have been more sensitive to the semantic incongruency, since the congruent masked condition subtracted from the incongruent masked condition equaled −14 ms in the elderly adult group compared to 1.7 ms in the young adult group. However, this difference between the groups was not significant (t(62) = 1.16, p = .25), so the inhibition deficit reported in aging was not observed in the present paradigm.
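The priming and mask effects reported above can be recomputed directly from the condition means of Table 3. The short check below does so (values in ms), reproducing the roughly 33 ms priming effect and the opposite mask effects in the two congruency conditions.

```python
# Worked check using the condition means from Table 3 (reaction times in ms).
means = {
    "young":   {("congruent", "unmasked"): 522, ("congruent", "masked"): 542,
                ("incongruent", "unmasked"): 554, ("incongruent", "masked"): 544},
    "elderly": {("congruent", "unmasked"): 677, ("congruent", "masked"): 705,
                ("incongruent", "unmasked"): 712, ("incongruent", "masked"): 691},
}

for group, rt in means.items():
    priming = rt[("incongruent", "unmasked")] - rt[("congruent", "unmasked")]
    mask_congruent = rt[("congruent", "masked")] - rt[("congruent", "unmasked")]
    mask_incongruent = rt[("incongruent", "masked")] - rt[("incongruent", "unmasked")]
    print(group, "priming:", priming, "ms |",
          "mask effect, congruent:", mask_congruent, "ms |",
          "mask effect, incongruent:", mask_incongruent, "ms")

# young:   priming 32 ms, mask +20 ms (interference), mask -10 ms (facilitation)
# elderly: priming 35 ms, mask +28 ms (interference), mask -21 ms (facilitation)
# The average priming effect is about 33 ms, as reported in the text.
```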
Table 3
Mean reaction times and correct response rates for the young and older adults for the cross-modal priming paradigm.

                                                Reaction times (ms)       Correct response rates
                                                Mean        SD            Mean       SD
Young (N = 32)
  Item-congruent       Unmasked primes          522         72.13         .97        .04
                       Masked primes            542         83.83         .97        .04
  Item-incongruent     Unmasked primes          554         91.13         .98        .03
                       Masked primes            544         85.07         .97        .05
Elderly (N = 32)
  Item-congruent       Unmasked primes          677         86.73         .97        .04
                       Masked primes            705         111.35        .96        .05
  Item-incongruent     Unmasked primes          712         113.43        .97        .05
                       Masked primes            691         93.84         .97        .06

Note. SD = standard deviation.
Fig. 2. Means and standard errors for reaction times of the interaction between item-congruency and mask for the young and the older adults respectively.
3.3. Neuropsychological results

3.3.1. Cognitive profiles

The cognitive profiles of each participant and each group revealed no abnormal score compared with their respective populations matched for age and education (according to the norms of the test manuals). Except for the delayed recall and recognition scores of the RL/RI-16 and the errors in the TMT, all the neuropsychological measures differed significantly between the two groups (t-tests). The elderly participants demonstrated poorer performance – in both accuracy and speed – than the young participants on the recall memory tasks and the executive function tasks.

3.3.2. Correlations

The correlation analyses revealed no significant correlation between the neuropsychological tests and the correct response rates in the experiment, except between the masked item-congruent condition and the speed score of the TMT 1 in the young group. A single correlation in each group appears insufficient to conclude that the masked item-congruent condition is associated with visual search, since the other visual processing parts of the TMT and the Stroop task were not correlated with this experimental score. Similarly, there was a single correlation between the errors of the Stroop interference and the masked item-incongruent condition in the elderly group.

For reaction times, the analyses revealed no significant correlations in the young group. In the elderly group, the correlations demonstrated that age, SRT, speed of TMT 2 and TMT 4, speed and errors of the Stroop interference, as well as the flexibility and inhibition scores, were correlated with the mean reaction times of the experiment (see Table 4). The TMT 4, the flexibility score, and the Stroop interference and inhibition scores are known to be executive measures; the SRT and TMT 2 evaluate psycho-motor speed. The lack of correlation between executive functions and the priming paradigm in the young group suggests that attention was not particularly involved in our priming task. This conclusion is also supported by the fact that the masked conditions were not distinct from the other conditions (all conditions being highly inter-correlated), indicating that attention was not more associated with the masked conditions than with the unmasked conditions. The correlations between executive functions and the priming paradigm in the elderly adults suggest that they need to rely on more resources than younger adults to accomplish the same task (see Salthouse, 2000).
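The flexibility and inhibition composites used in these correlations are derived from the TMT and Stroop sub-scores according to the formulas given in the notes of Table 4. A minimal worked illustration is shown below, using the elderly group means from Table 1 as example inputs; the helper function names are ours, not the authors'.

```python
# Composite executive scores from the notes of Table 4 (illustrative helper functions).
def flexibility(tmt2, tmt3, tmt4):
    """Flexibility = TMT4 - ((TMT2 + TMT3) / 2): switching cost beyond baseline speed."""
    return tmt4 - (tmt2 + tmt3) / 2

def inhibition(word, color, word_color):
    """Inhibition = WC - ((W + C) / 2): Stroop interference cost beyond naming/reading speed."""
    return word_color - (word + color) / 2

# Example with the elderly group means reported in Table 1.
print(flexibility(tmt2=42.84, tmt3=47.31, tmt4=111.59))      # ≈ 66.5
print(inhibition(word=47.81, color=68.16, word_color=149))   # ≈ 91.0
```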
4. Discussion

The first general goal of the present study was to assess the nature of the cross-modal interactions for semantic knowledge, and therefore the nature of knowledge in memory. To the best of our knowledge, our study is the first to directly assess this issue by testing the specificity of an interference effect on the cross-modal priming effect: half of the auditory primes were presented with a meaningless visual mask. The second aim of the study was to test the effect of aging on these cross-modal interactions. We predicted a significant cross-modal priming effect in both the young and elderly adult groups. As predicted by grounded cognition theories, we hypothesized that young and elderly adults would present a specific interference effect of the mask limited to the semantically congruent condition.

The categorization task was performed very accurately. The older adults were as accurate as the younger adults, but they were slower, as predicted by the general slowing hypothesis in normal aging (Salthouse, 2000). Congruent stimuli were globally processed faster than incongruent stimuli, as typically expected (Laurienti, Kraft, Maldjian, Burdette, & Wallace, 2004). A cross-modal priming effect was demonstrated in both young and elderly participants, which replicates several studies using pictures and sounds of familiar objects in young adults (e.g. Schneider et al., 2008). Applied to normal aging, this result confirms that, most of the time, a cross-modal priming effect is observed in the elderly (see also Ballesteros et al., 2009), despite the contradictory results reported in the literature (see Vallet et al., 2012). This suggests that methodological factors, rather than a specific implicit memory impairment for cross-modal situations, might account for the lack of priming effect previously reported in normal aging.

The main finding of the present study was the interaction between the mask and the semantic congruency. Regardless of the group factor, the mask interfered with the processing of the target in the item-congruent condition, whereas the mask facilitated the processing of the picture target in the item-incongruent condition.
Table 4
Correlations in the elderly group between cognitive tests and the categorization task.

Test                   Sub-test          Correlations
General                Age               0.54⁎⁎
                       Education         −0.13
                       MMSE              0.18
                       SRT               0.66⁎⁎
TMT — speed            TMT-1             0.32
                       TMT-2             0.48⁎⁎
                       TMT-3             0.39
                       TMT-4             0.42⁎⁎
                       TMT-5             0.32
                       Flexibility       0.43⁎⁎
TMT — errors           TMT-1             0.00
                       TMT-2             −0.08
                       TMT-3             0.05
                       TMT-4             0.10
                       TMT-5             −0.03
Stroop — speed         Colors            −0.13
                       Words             0.16
                       Words/colors      0.45⁎⁎
                       Inhibition        0.49⁎⁎
Stroop — errors        Words/colors      0.42⁎⁎
Hayling test           Part A            0.26
                       Part B            0.27
                       Errors part B     −0.15
Word fluency           Animals           −0.17
                       T/N/P             −0.18

Notes. SRT = simple reaction time; Flexibility = TMT4 − ((TMT2 + TMT3) / 2); Inhibition = WC − ((W + C) / 2).
⁎⁎ p < 0.005.
It seems difficult to explain this interaction within the amodal approach of knowledge. First, masking effects are generally explained by the superposition of a sensory stimulus on a target stimulus sharing the same modality (for a review, see van den Bussche, Notebaert, & Reynvoet, 2009). In the present study, the mask did not share the modality of the prime. Moreover, our mask effect cannot be defined as forward masking – i.e. a mask presented before the target – since this kind of masking effect is limited to 300 ms (Enns & Di Lollo, 2000), and an inter-stimulus interval (ISI) of 500 ms was used in our study. Secondly, and more importantly, the mask effect does not seem to rely on attention. A first hypothesis would have been that the mask distracted the attention of the participant from the prime, so that the priming effect became less efficient. If this were true, then the mask would not have facilitated processing in the semantically incongruent condition, as was observed in our study. Moreover, the correct response rates in the unmasked conditions were equivalent to those in the masked conditions, while the attention hypothesis states that participants should perform more poorly for masked stimuli than for unmasked ones. A second hypothesis would have been that attention was divided between the two modalities (Alais, Morrone, & Burr, 2006). In this case, the visual mask would overload the visual attention capacity, leading to less efficient processing of the visual target. However, this would mean that the mask should have had the same effect in the semantically incongruent condition, which was not the case in the present study. Finally, the attention hypothesis also suggests that the masked conditions should be more correlated with attention/executive tasks than the unmasked conditions. This hypothesis did not find any support in the present study, since the reaction times of all the experimental conditions were very highly inter-correlated. It therefore seems unlikely that attention alone underlies the observed interaction, which goes against the amodal approach of knowledge.

The interaction between the mask and the semantic congruency factors is more easily explained within a modal approach of knowledge. The facilitation effect of the mask in the semantically incongruent condition might come from a pre-activation of the visual modality by the mask (e.g. Arieh & Marks, 2008). We might expect the same to happen in the semantically congruent condition, but such a bimodal facilitation should remain weaker than the priming effect. Regarding the interference effect of the mask, we can suppose that the visual mask interfered with the automatic and direct activation of the visual representation associated with the sound prime. Indeed, the sound prime leads to the simulation of the associated stimulus, here the visual representation (Brunel, Labeye, Lesourd, & Versace, 2009; Vallet et al., 2010), but this simulation should not occur if the same modality is engaged in another process (Kaschak et al., 2005).

The lack of interaction between the group and the other factors suggests that elderly adults may also have modal knowledge (see Vallet, Riou et al., 2011; Vallet, Simard et al., 2011). Sensory-dependent knowledge in elderly adults is an important finding since our results, combined with the studies demonstrating sensory-dependent knowledge in children (e.g.
Engelen, Bouwmeester, de Bruin, & Zwaan, 2011), indicate that grounded cognition theories could be applied to the whole spectrum of cognition. Moreover, sensory-dependent knowledge supposes that less accurate perception or less efficient perceptual processing should decrease the quality of memories. This assumption can explain the statistical associations found between memory and auditory and visual acuity (e.g. Gussekloo, Craen, & Oduber, 2005; Valentijn et al., 2005). In other words, poorer perception should be reflected in poorer memory performance. The lack of difference between the two groups of participants may appear surprising, since several studies have found early white matter and gray matter decline in aging. Such cerebral decline might alter the communication between sensory brain areas (Giorgio et al., 2010), but these changes do not significantly impact the sensory
areas themselves or their connectivity (Ceponiene, Westerfield, Torki, & Townsend, 2008).

Direct communication between modalities, as supposed by the modal approach, has found some support in the neurobiological data. Research performed over the last decade has indicated that the different modalities interact even at a very low level (e.g. Martuzzi et al., 2007) and possibly directly (e.g. Cappe, Rouiller, & Barone, 2009). The same seems to be true for semantic knowledge, with integration demonstrated in uni-modal and hetero-modal brain regions (for a review, see Amedi, Kriegstein, Atteveldt, Beauchamp, & Naumer, 2005).

The present study had some limitations. For instance, the time window chosen might be surprising. A short time window reinforces the likelihood of cross-modal integration (Colonius & Diederich, 2010), so that the prime and the mask in the present study might have been integrated together. Against this hypothesis, learned associations appear to outweigh direct perception in influencing multisensory perception (Mitterer & Jesse, 2010). In addition, the ISI used was also unusual for a study on multisensory integration (e.g. Chen & Spence, 2010). Yet multisensory integration can occur with an ISI of 500 ms, as in the present study (Wallace et al., 2004). The time window could also be questioned because aging is characterized by a slowing effect that may shift the time windows for multisensory perception. Against this hypothesis, young and older adults seem to share the same time windows for multisensory perception (Horváth, Czigler, Winkler, & Teder-Sälejärvi, 2007).

In conclusion, this study demonstrated for the first time that audiovisual interactions for semantically related stimuli are perceptual and direct. The perceptual nature of these interactions strongly supports modal knowledge, as predicted by grounded cognition theories. This result was found for both young and older adults. The data also suggest that different mechanisms could underpin the processing of stimuli that are not semantically related; in this case, a facilitation effect was observed following the visual mask. This interaction raises questions about the nature of the interactions for unrelated stimuli, and further research is needed to explore this issue.
Acknowledgments Guillaume T. Vallet and Rémy Versace are supported by a grant from the Rhône-Alpes Region in the cluster “Handicap, Aging, Neurosciences”. The authors wish to thank Benoit Riou for his assistance in this project.
Appendix A. Correlations between the different experimental conditions in the young and older adults' groups
Table A1. Correlations between the different experimental conditions for reaction times and correct response rates in the young and elderly adults' groups.

                                                                Reaction times        Correct response rates
                                                                Young      Older      Young      Older
Item-congruent unmasked – Item-congruent masked                 0.92       0.84       0.01       0.09
Item-congruent unmasked – Item-incongruent unmasked             0.87       0.86       0.03       0.12
Item-congruent unmasked – Item-incongruent masked               0.93       0.88       0.05       0.07
Item-congruent masked – Item-incongruent unmasked               0.85       0.89       0.24       0.18
Item-congruent masked – Item-incongruent masked                 0.87       0.82       0.43       0.08
Item-incongruent unmasked – Item-incongruent masked             0.87       0.80       0.02       0.15
References Alais, D., Morrone, C., & Burr, D. (2006). Separate attentional resources for vision and audition. Proceedings of the Royal Society of London B: Biological Sciences, 273, 1339–1345. Amedi, A., Kriegstein, K., Atteveldt, N. M., Beauchamp, M. S., & Naumer, M. J. (2005). Functional imaging of human crossmodal identification and object recognition. Experimental Brain Research, 166, 559–571. Arieh, Y., & Marks, L. E. (2008). Cross-modal interaction between vision and hearing: A speed–accuracy analysis. Perception & Psychophysics, 70, 412–421. Ballesteros, S., Gonzalez, M., Mayas, J., Garcia-Rodriguez, B., & Reales, J. M. (2009). Cross-modal repetition priming in young and old adults. European Journal of Cognitive Psychology, 21, 366–387. Ballesteros, S., & Mayas, J. (2009). Preserved cross-modal priming and aging: A summary of current thoughts. Acta Psychologica Sinica, 41, 1063–1074. Barense, M. D., Gaffan, D., & Graham, K. S. (2007). The human medial temporal lobe processes online representations of complex objects. Neuropsychologia, 45, 2963–2974. Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645. Beauchamp, M. S., McRae, K., Martin, A., & Barsalou, L. W. (2007). A common neural substrate for perceiving and knowing about color. Neuropsychologia, 45, 2802–2810. Brunel, L., Labeye, E., Lesourd, M., & Versace, R. (2009). The sensory nature of episodic memory: Sensory priming effects due to memory trace activation. Journal of Experimental Psychology. Learning, Memory, and Cognition, 35, 1081–1088. Burgess, P. W., & Shallice, T. (1997). The Hayling and Brixton tests. Bury St. Edmunds, UK: Thames Valley Test Company. Cappe, C., Rouiller, E. M., & Barone, P. (2009). Multisensory anatomical pathways. Hearing Research, 258, 28–36. Cardebat, D., Doyon, B., Puel, M., & Goulet, P. (1990). Evocation lexicale et sémantique chez des sujets normaux. Performances et et dynamiques de production en fonction du sexe, de l'âge et du niveau culturel. Acta Neurologica Belgica, 90, 207–217. Ceponiene, R., Westerfield, M., Torki, M., & Townsend, J. (2008). Modality-specificity of sensory aging in vision and audition: Evidence from event-related potentials. Brain Research, 1215, 53–68. Chen, Y. -C., & Spence, C. (2010). When hearing the bark helps to identify the dog: Semantically-congruent sounds modulate the identification of masked pictures. Cognition, 114, 389–404. Chen, Y. -C., & Spence, C. (2011). Crossmodal semantic priming by naturalistic sounds and spoken words enhances visual sensitivity. Journal of Experimental Psychology. Human Perception and Performance, 37, 1554–1568. Cohen, J., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: An interactive graphic system for designing and controlling experiments in the psychology. Behavior Research Methods, 25, 257–271. Colonius, H., & Diederich, A. (2010). The optimal time window of visual-auditory integration: A reaction time analysis. Frontiers in Integrative Neuroscience, 4, 1–6. Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). D-KEFS executive function system. San Antonio: The Psychological Corporation. Dijkstra, K., Kaschak, M. P., & Zwaan, R. A. (2007). Body posture facilitates retrieval of autobiographical memories. Cognition, 102, 139–149. Doehrmann, O., & Naumer, M. J. (2008). Semantics and the multisensory brain: How meaning modulates processes of audio-visual integration. Brain Research, 1242, 136–150. Easton, R. D., Greene, A. J., & Srinivas, K. (1997a). 
Transfer between vision and haptics: Memory for 2-D patterns and 3-D objects. Psychonomic Bulletin Review, 4, 403–410. Easton, R. D., Srinivas, K., & Greene, A. J. (1997b). Do vision and haptics share common representations? Implicit and explicit memory within and between modalities. Journal of Experimental Psychology. Learning, Memory, and Cognition, 23, 153–163. Engelen, J. A., Bouwmeester, S., de Bruin, A. B. H., & Zwaan, R. A. (2011). Perceptual simulation in developing language comprehension. Journal of Experimental Child Psychology, 110, 659–675. Enns, J. T., & Di Lollo, V. (2000). What's new in visual masking. Trends in Cognitive Science, 4, 345–352. Fleischman, D. A., Wilson, R. S., Gabrieli, J. D. E., & Schneider, J. A. (2005). Implicit memory and Alzheimer's disease neuropathology. Brain, 128, 2006–2015. Folstein, M. F. (1975). Mini-mental state. A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198. Giorgio, A., Santelli, L., Tomassini, V., Bosnell, R., Smith, S., De Stefano, N., et al. (2010). Age-related changes in grey and white matter structure throughout adulthood. Greene, J. D. W., Easton, R. D., & LaShell, L. S. (2001). Visual–auditory events: Cross-modal perceptual priming and recognition memory. Consciousness and Cognition, 10, 425–435. Gussekloo, J., Craen, A. D., & Oduber, C. (2005). Sensory impairment and cognitive functioning in oldest-old subjects: The Leiden 85+ Study. The American Journal of Geriatric Psychiatry, 13, 781–786. Hauk, O., & Pulvermüller, F. (2004). Neurophysiological distinction of action words in the fronto-central cortex. Human Brain Mapping, 21, 191–201. Horváth, J., Czigler, I., Winkler, I., & Teder-Sälejärvi, W. A. (2007). The temporal window of integration in elderly and young adults. Neurobiology of Aging, 28, 964–975. Kaschak, M. P., Madden, C. J., Therriault, D. J., Yaxley, R. H., Aveyard, M., Blanchard, A. A., et al. (2005). Perception of motion affects language processing. Cognition, 94, 79–89. Laurienti, P. J., Kraft, R. A., Maldjian, J. A., Burdette, J. H., & Wallace, M. T. (2004). Semantic congruence is a critical factor in multisensory behavioral performance. Experimental Brain Research, 158, 405–414. Lezak, M. D., Howieson, D. B., Loring, D. W., & Fisher, J. S. (2004). Neuropsychological assessment (4th ed.). New-York: Oxford University Press.
260
G.T. Vallet et al. / Acta Psychologica 143 (2013) 253–260
Martuzzi, R., Murray, M. M., Michel, C. M., Thiran, J. -P., Maeder, P. P., Clarke, S., et al. (2007). Multisensory interactions within human primary cortices revealed by BOLD dynamics. Cerebral Cortex, 17, 1672–1679. Mitterer, H., & Jesse, A. (2010). Correlation versus causation in multisensory perception. Psychonomic Bulletin & Review, 17, 329–334. Molholm, S., Martinez, A., Shpaner, M., & Foxe, J. J. (2007). Object-based attention is multisensory: Co-activation of an object's representations in ignored sensory modalities. European Journal of Neuroscience, 26, 499–509. Mulatti, C., & Coltheart, M. (2012). Picture-word interference and the response-exclusion hypothesis. Cortex, 48, 363–372. Pecher, D., Zeelenberg, R., & Barsalou, L. W. (2003). Verifying different-modality properties for concepts produces switching costs. Psychological Science, 14, 119–124. Reales, J. M., & Ballesteros, S. (1999). Implicit and explicit memory for visual and haptic objects: Cross-modal priming depends on structural descriptions. Journal of Experimental Psychology. Learning, Memory, and Cognition, 25, 644–663. Riou, B., Lesourd, M., Brunel, L., & Versace, R. (2011). Visual memory and visual perception: When memory increase visual search. Memory and Cognition, 39, 1094–1102. Roediger, H. L., & McDermott, K. B. (1993). Implicit memory in normal human subjects. In F. Boller, & J. Grafman (Eds.), Handbook of neuropsychology (pp. 63–131). Amsterdam: Elsevier. Ryan, J. D., Moses, S. N., Ostreicher, M. L., Bardouille, T., Herdman, A. T., Riggs, L., et al. (2008). Seeing sounds and hearing sights: The influence of prior learning on current perception. Journal of Cognitive Neuroscience, 20, 1030–1042. Salthouse, T. A. (2000). Aging and measures of processing speed. Biological Psychology, 54, 35–54. Saygin, A. P., McCullough, S., Alac, M., & Emmorey, K. (2010). Modulation of BOLD response in motion-sensitive lateral temporal cortex by real and fictive motion sentences. Journal of Cognitive Neuroscience, 22, 2480–2490. Schacter, D. L. (1992). Priming and multiple memory systems: Perceptual mechanisms of implicit memory. Journal of Cognitive Neuroscience, 4, 244–256. Schneider, T. R., Engel, A. K., & Debener, S. (2008). Multisensory identification of natural objects in a two-way crossmodal priming paradigm. Experimental Psychology, 55, 121–132. Slotnick, S. D., & Schacter, D. L. (2006). The nature of memory related activity in early visual areas. Neuropsychologia, 44, 2874–2886.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662. Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics. Boston: Allyn and Bacon. Tulving, E. (1972). Episodic and semantic memory. In E. Tulving, & W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press. Valentijn, S. A. M., van Boxtel, M. P. J. V., van Hooren, S. A. H., Bosma, H., Beckers, H. J. M., Ponds, R. W. H. M., et al. (2005). Change in sensory functioning predicts change in cognitive functioning: Results from the Maastricht Aging Study. Journal of the American Geriatrics Society, 53, 374–380. Vallet, G., Brunel, L., & Versace, R. (2010). The perceptual nature of the cross-modal priming effect: Arguments in favor of a sensory-based conception of memory. Experimental Psychology, 57, 376–382. Vallet, G., Riou, B., Versace, R., & Simard, M. (2011a). The sensory-dependent nature of audio-visual interactions for semantic knowledge. In C. Hoelscher, T. F. Shipley, & L. Carlson (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 2077–2082). Boston, MA: Cognitive Science Society. Vallet, G., Simard, M., & Versace, R. (2011b). Sensory-dependent knowledge in young and elderly adults: Arguments from the cross-modal priming effect. Current Aging Science, 4, 137–149. Vallet, G., Simard, M., & Versace, R. (2012). Exploring the contradictory results on the cross-modal priming effect in normal aging: A critical review of the literature. In H. N., & S. Z. (Eds.), Psychology of priming (pp. 102–122). Hauppauge: Nova Science Publishers Inc. van den Bussche, E., Notebaert, K., & Reynvoet, B. (2009). Masked primes can be genuinely semantically processed. Experimental Psychology, 56, 295–300. Van der Linden, M. (2004). L'épreuve de rappel libre/rappel indicé à 16 items (RL/RI-16). In M. Van der Linden, & F. Coyette (Eds.), L'évaluation des troubles de la mémoire (pp. 25–47). Marseille: Solal. Wallace, M. T., Roberson, G. E., Hairston, W. D., Stein, B. E., Vaughan, J. W., & Schirillo, J. A. (2004). Unifying multisensory signals across time and space. Experimental Brain Research, 158, 252–258. Warrington, E. K., & James, M. (1991). The Visual Object and Space Perception Battery. Bury St. Edmunds, UK: Thames Valley Test Company.