Effects of invalid feedback on learning and feedback-related brain activity in decision-making



Brain and Cognition 99 (2015) 78–86


Benjamin Ernst ⇑, Marco Steinhauser

Catholic University of Eichstätt-Ingolstadt, Germany
University of Konstanz, Germany

Article info

Article history: Received 23 December 2014; Revised 20 July 2015; Accepted 21 July 2015; Available online 8 August 2015

Keywords: Decision-making; Feedback processing; Event-related potentials; Feedback-related negativity; P300

Abstract

For adaptive decision-making it is important to utilize only relevant, valid feedback and to ignore irrelevant feedback. The present study investigated how feedback processing in decision-making is impaired when relevant feedback is combined with irrelevant and potentially invalid feedback. We analyzed two electrophysiological markers of feedback processing, the feedback-related negativity (FRN) and the P300, in a simple decision-making task in which participants processed feedback stimuli consisting of relevant and irrelevant feedback provided by the color and meaning of a Stroop stimulus. We found that invalid, irrelevant feedback not only impaired learning but also altered the amplitude of the P300 to relevant feedback, suggesting an interfering effect of irrelevant feedback on the processing of relevant feedback. In contrast, no such effect on the FRN was obtained. These results indicate that detrimental effects of invalid, irrelevant feedback result from failures of controlled feedback processing. © 2015 Elsevier Inc. All rights reserved.

1. Introduction

Optimal decision-making crucially relies on the ability to improve decisions based on the evaluation of feedback. However, feedback is often ambiguous, providing a mixture of valid and invalid information. For instance, a teacher may show an annoyed facial expression before she tells a student that her answer was correct. Even if the student knows that only the oral feedback is relevant and valid, the irrelevant and invalid facial expression might impair learning. The goal of the present study was to investigate whether feedback processing is impaired when relevant and valid feedback is presented together with interfering, irrelevant and potentially invalid feedback. Here, irrelevant feedback is defined as a stimulus that conveys valence information (positive, negative) but is not predictive of the actual future outcome. By considering electrophysiological indices of feedback processing, we aimed to examine whether irrelevant feedback impairs processing of relevant feedback.

⇑ Corresponding author at: Catholic University of Eichstätt-Ingolstadt, Ostenstraße 25, D-85072 Eichstätt, Germany. E-mail address: [email protected] (B. Ernst). http://dx.doi.org/10.1016/j.bandc.2015.07.006 0278-2626/© 2015 Elsevier Inc. All rights reserved.

In recent years, it has been shown that feedback about the outcome of a simple decision triggers a cascade of event-related potentials (ERPs) that reflect different aspects of learning and feedback processing. The so-called feedback-related negativity (FRN) refers to a negative deflection reaching its maximum around 200–300 ms after feedback onset at fronto-central electrode sites, which is more negative for negative feedback than for positive feedback (Miltner, Braun, & Coles, 1997). In their reinforcement learning theory of the error-related negativity (RL-ERN theory), Holroyd and Coles (2002) proposed that the FRN is a correlate of reinforcement learning. According to this account, a negative reward prediction error is generated in the midbrain dopamine system and is conveyed to the anterior cingulate cortex (ACC) where it elicits the FRN and guides learning (Holroyd & Yeung, 2011, 2012). Recently, this account has been modified by assuming that the FRN effect (i.e., the larger negativity following negative feedback) is actually caused by a reward positivity following positive feedback which overlaps with a feedback-locked N200 and which reflects a positive reward prediction error (e.g., Baker & Holroyd, 2011; Foti, Weinberg, Dien, & Hajcak, 2011; Hajihosseini & Holroyd, 2013; Holroyd, Krigolson, & Lee, 2011; Holroyd, Pakzad-Vaezi, & Krigolson, 2008). As an alternative to these variants of the RL-ERN theory, the predicted response-outcome (PRO) model (Alexander & Brown, 2011) proposed that ACC activity


reflected by the FRN represents both positive and negative prediction errors and that differences between positive and negative feedback reflect different degrees of expectedness (Ferdinand, Mecklinger, Kray, & Gehring, 2012). Despite these differences, these accounts share the assumption that the FRN reflects a reward prediction error related to reinforcement learning (for reviews, see San Martín, 2012; Walsh & Anderson, 2012). A second feedback-related component – the feedback-locked P300 – is a positivity peaking at posterior electrode sites between 200 and 600 ms after feedback onset. Although most studies found the P300 to be larger for positive feedback (Bellebaum & Daum, 2008; Bellebaum, Polezzi, & Daum, 2010; Ernst & Steinhauser, 2012; Ferdinand & Kray, 2013; Hajcak, Moser, Holroyd, & Simons, 2007; Holroyd, Baker, Kerns, & Müller, 2008; Ulrich & Hewig, 2014; Wu & Zhou, 2009; Zhou, Yu, & Zhou, 2010), others showed a larger P300 for negative feedback (Frank, Woroch, & Curran, 2005; Mathewson, Dywan, Snyder, Tays, & Segalowitz, 2008) or no valence effect at all (Holroyd & Krigolson, 2007; Li, Han, Lei, Holroyd, & Li, 2011; Yeung & Sanfey, 2004; for a review, see San Martín, 2012). The P300 following stimuli in simple decision tasks has been associated with attentional processes or the updating of working memory (Donchin & Coles, 1988; Nieuwenhuis, Aston-Jones, & Cohen, 2005; Polich, 2007). In the context of feedback, the interpretations of the P300 are more specific relating this component to controlled evaluation of action outcomes in working memory (Holroyd & Coles, 2002; Sato et al., 2005; Squires, Hillyard, & Lindsay, 1973; Yeung & Sanfey, 2004; for a review, see San Martín, 2012). 
Controlled outcome evaluation refers to processes that allow for rapid, flexible behavioral adaptation but rely strongly on attention or processing in working memory (Sailer, Fischmeister, & Bauer, 2010), as proposed in recent models of learning (Collins & Frank, 2012; Frank & Claus, 2006). For instance, a study on reversal learning found that a pronounced feedback-locked P300 predicts correct behavioral adjustment after a contingency reversal, while the FRN reflected adjustments based on prediction errors (Chase, Swainson, Durham, Benham, & Cools, 2011). Despite the ongoing debate about the exact functional significance of these components, the evidence described above suggests that the FRN is more related to reinforcement learning in the ACC, whereas the feedback-locked P300 is associated with controlled feedback evaluation (Chase et al., 2011; Sailer et al., 2010; Walsh & Anderson, 2011). This suggests that feedback is processed by two distinct systems, a perspective that has previously been proposed to account for data in behavioral (Collins & Frank, 2012), fMRI (Daw, Gershman, Seymour, Dayan, & Dolan, 2011), and patient studies (Frank & Claus, 2006). In the present study, we considered these components to examine how irrelevant and potentially invalid feedback is processed in decision-making. To achieve this, we constructed a simple task in which participants could optimize their decisions (and thus maximize their pay-off) by learning from feedback. The task required participants to decide which one of two characters was associated with a reward. Each character pair was presented a first time in a learning phase and a second time in a test phase. In the learning phase, the decision relied entirely on guessing and feedback had to be evaluated to learn the correct response. Then, in the test phase, correct responding was associated with a reward. 
In this way, performance in the test phase could be used as an indicator of how efficiently participants utilized feedback in the learning phase. Note that this paradigm differs from reinforcement learning paradigms utilized in previous studies insofar as the stimuli are only presented twice. By this, brain activity following feedback in the learning phase is unaffected by prior learning, but in contrast to gambling tasks this feedback can be used to improve the subsequent performance in the test phase.


Crucially, the feedback stimulus presented in the learning phase was ambiguous1 and provided two types of feedback. On the one hand, there was a relevant feedback that always validly indicated whether the response was correct or not. On the other hand, there was a preceding irrelevant feedback that also provided information about the correctness of the response, but this information was valid on half of the trials only. Because this irrelevant feedback contradicted relevant feedback in half of the trials, it was uninformative for learning. Participants knew at any time which feedback was relevant and which was irrelevant, and that only relevant feedback was informative for learning. To ensure that the irrelevant feedback was still processed under these conditions, relevant and irrelevant feedback was realized using Stroop stimuli (Stroop, 1935), that is, colored words whose meaning also referred to a color (e.g., the word BLUE in yellow color). In the present case, the relevant feedback dimension was the word color (e.g., blue for positive feedback, yellow for negative feedback), whereas the irrelevant feedback was the word meaning, which could be valid or invalid depending on whether it referred to the same (e.g., BLUE in blue color) or to the alternative color (e.g., YELLOW in blue color). A first question was whether the irrelevant feedback has a detrimental effect on learning from relevant feedback, even if participants know that the word is irrelevant. The advantage of using Stroop stimuli is that it is virtually impossible to ignore the word meaning, which is demonstrated by the finding that speeded naming of the color is typically strongly affected by the nature of the word (Stroop effect; for a review, see MacLeod, 1991). However, even if the word is encoded automatically and even if this delays the identification of the color, this does not necessarily imply that it also impairs learning from feedback provided by the color. 
This is because in contrast to the classical Stroop paradigm there is no response selection under time pressure and no response conflict. Rather, irrelevant feedback could prime either the color category or the feedback valence associated with this color. Accordingly, if irrelevant and relevant feedback are incompatible, this could either impair the identification of the relevant feedback color or activate false valence information. To examine whether this affects learning, we analyzed whether performance in the test phase was impaired if the feedback stimulus in the learning phase contained an invalid word as compared to when it contained a valid word. If irrelevant feedback has an influence on learning, this should also be reflected in ERPs elicited by the relevant feedback. If irrelevant feedback affects learning more indirectly because it primes color categories and influences the identification of the relevant feedback (which is one component of the Stroop effect; De Houwer, 2003), then both, the FRN and the P300, should be affected by the validity of irrelevant feedback as both rely on accurate stimulus identification. However, if priming occurs on the level of more abstract valence information, it is conceivable that the two components are differentially influenced, with the exact pattern depending on whether this priming influences reinforcement learning or controlled feedback processing or both. On the one hand, if priming occurs on the level of semantic representations of valence (i.e., ‘‘right’’ and ‘‘wrong’’) in working memory, this would affect controlled feedback processing but not necessarily reinforcement learning. In this case, we would expect that irrelevant feedback influences the P300 rather than the FRN. On the other hand, it is possible that controlled processes are better able to protect learning from the influence of irrelevant information. In this case, effects of irrelevant feedback on learning should be

1 Note that, in the present study, only the validity of an irrelevant dimension of the feedback stimulus was manipulated, while the relevant dimension was always valid. This differs from other studies in which feedback validity refers to the correctness of a single feedback stimulus (e.g., Mies, van der Veen, Tulen, Hengeveld, & van der Molen, 2011).



mediated only by automatic reinforcement learning, and irrelevant feedback should influence the FRN rather than the P300.

2. Method

2.1. Participants

Twenty-nine participants (19 female) between 19 and 28 years of age (mean 22.0) with normal or corrected-to-normal vision participated in the study. Participants were recruited at the University of Konstanz and received a base fee of 7 € and a performance-dependent bonus (mean: 6.00 €, range: 0.20 to 10.30 €). The data from two participants (both female) were excluded from further analysis because their test performance was close to chance. The study was conducted in accordance with institutional guidelines, and informed consent was obtained from all participants.

2.2. Stimulus material

The test stimuli consisted of 288 Chinese and Chinese-looking characters. Each character subtended 2.5–2.9° of visual angle in height and 2.7–2.9° in width at a viewing distance of 60 cm, and was presented in white on a black background. For each participant, the stimuli were randomly divided into 24 sets of six items, with each item consisting of a pair of characters. Within each item, each character was presented 0.5° of visual angle left or right of the screen center. The position of each character (left or right) was randomly determined for each presentation. The feedback stimuli consisted of four Stroop stimuli (Stroop, 1935) subtending 1.2° of visual angle in height and 3° in width, which were presented centrally on the screen. We used the color words BLAU (engl. blue) and GELB (engl. yellow), whose initial color was white and which then turned either blue or yellow (see description below).

2.3. Design and procedure

Participants were instructed that one of the characters in each item was associated with a reward of 10 euro cents whereas the other character was associated with a loss of 10 euro cents, and that each item was presented in a learning trial and a test trial. In learning trials, participants were given the opportunity to learn which one of the characters was rewarded. To achieve this, they first provided a guess by pressing a left or right key and then received visual feedback on whether their guess was correct or not. In test trials, participants again chose between the two characters by pressing a key, but now received the reward if the correct character was chosen and the punishment if the incorrect character was chosen. Because we wanted to incentivize learning from feedback rather than guessing, reward and punishment were provided in the test trials (in which responding reflected previous learning) but not in the learning trials (in which responding reflected mere guessing).
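The random division of the 288 characters into 24 sets of six character pairs (Section 2.2) could be sketched as follows. This is only an illustration of the described partitioning, not the authors' experiment code; the function name `build_item_sets` and its parameters are hypothetical, and the source does not say how the rewarded character within each pair was chosen, so that step is left out.

```python
import random


def build_item_sets(characters, n_sets=24, items_per_set=6, seed=None):
    """Randomly split characters into n_sets sets of items_per_set pairs.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    expected = n_sets * items_per_set * 2  # two characters per item
    if len(characters) != expected:
        raise ValueError(f"expected {expected} characters, got {len(characters)}")

    rng = random.Random(seed)
    shuffled = list(characters)
    rng.shuffle(shuffled)

    # Consecutive characters in the shuffled list form one item (a pair).
    pairs = [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled), 2)]

    # Consecutive pairs form one set (one block of the experiment).
    return [pairs[i:i + items_per_set] for i in range(0, len(pairs), items_per_set)]


if __name__ == "__main__":
    sets = build_item_sets(list(range(288)), seed=1)
    print(len(sets), "sets of", len(sets[0]), "items")
```

Passing a different `seed` per participant would give each participant an independent partition, in line with the per-participant randomization described above.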

Fig. 1. Sequence of events in a typical trial during the learning phase (A) and the test phase (B). The stimulus consisted of two Chinese-looking characters. After participants pressed a response key, the irrelevant feedback (color word) was presented followed by the relevant feedback (word color) indicating the correctness of the participants’ response. The box (C) illustrates all four possible combinations of relevant feedback valence (positive, negative) and irrelevant feedback validity (valid, invalid) for the case where yellow indicated positive feedback and blue indicated negative feedback. Note that German words (‘BLAU’, ‘GELB’) were used in the actual experiment. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


The procedure of a learning trial is illustrated in Fig. 1A. After presentation of an item, responses had to be given within five seconds by pressing one of two keys on a standard German keyboard (Y with the left index finger for a left response, M with the right index finger for a right response). Following the response (or after five seconds), the item disappeared and the screen remained blank for 200 ms. If a response had occurred within the time limit, a feedback stimulus was presented for 1200 ms. Feedback was a Stroop stimulus (Stroop, 1935) with a delayed color onset. It consisted of a color word (BLAU, engl. blue, or GELB, engl. yellow) that was presented in white for the first 600 ms and then turned blue or yellow for the remaining 600 ms. If no response was given within the time limit, the German word VERPASST (engl. missed) was presented for 1200 ms. During the learning phase, these misses were associated with a loss of 10 euro cents. Finally, a blank screen appeared for 1900 ms, followed by the next trial.

Participants were instructed that the word color served as relevant feedback indicating the true correctness of the response. In contrast, the word meaning was irrelevant and corresponded to the word color in half of the trials, but differed from the word color in the other half. Because word meaning was not predictive of the true correctness of the response but was semantically related to feedback, we considered it as irrelevant feedback that could be valid or invalid. Conditions in which word meaning was compatible with word color are termed valid irrelevant feedback (e.g., the word BLAU in blue color), whereas conditions in which word meaning was incompatible with word color are termed invalid irrelevant feedback (e.g., the word BLAU in yellow color). At the same time, we distinguish between conditions in which the relevant feedback indicated that a response was correct (positive relevant feedback) or incorrect (negative relevant feedback).
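The mapping from a feedback stimulus to its experimental condition can be made concrete with a small sketch. The function `classify_feedback` and its argument names are hypothetical; `positive_color` stands for the counterbalanced color that signalled a correct response for a given participant. The `irrelevant_valence` field (the valence that the word meaning would imply) is our reading of how irrelevant feedback valence is defined, not a definition taken verbatim from the text.

```python
# Color named by each German color word used as irrelevant feedback.
WORD_TO_COLOR = {"BLAU": "blue", "GELB": "yellow"}


def classify_feedback(word, ink_color, positive_color):
    """Classify one Stroop feedback stimulus (illustrative sketch).

    word           -- 'BLAU' or 'GELB' (irrelevant dimension)
    ink_color      -- 'blue' or 'yellow' (relevant dimension)
    positive_color -- color signalling a correct response for this participant
    """
    word_color = WORD_TO_COLOR[word]
    return {
        # Relevant feedback: the ink color always indicates true correctness.
        "relevant_valence": "positive" if ink_color == positive_color else "negative",
        # Irrelevant feedback is valid when the word names its own ink color.
        "irrelevant_validity": "valid" if word_color == ink_color else "invalid",
        # Valence implied by the word meaning alone (assumed definition).
        "irrelevant_valence": "positive" if word_color == positive_color else "negative",
    }


if __name__ == "__main__":
    # The word BLAU shown in yellow, for a participant with yellow = correct.
    print(classify_feedback("BLAU", "yellow", positive_color="yellow"))
```

Under this reading, invalid irrelevant feedback always carries the opposite valence to the relevant feedback, which is why validity and relevant valence together fully determine the condition.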
Fig. 1C illustrates how the four possible combinations of word color and word meaning map onto the combinations of irrelevant feedback validity and relevant feedback valence. Please note that colors were counterbalanced across participants, that is, whereas a correct response was indicated by a yellow color in half of the participants, it was indicated by a blue color in the other half.

The procedure of a test trial is illustrated in Fig. 1B. Here, the items from the learning phase were presented again, but now correct and wrong responses were associated with a reward of 10 euro cents and a loss of 10 euro cents, respectively. Moreover, although the same character pairs were used as in the learning phase, the spatial positions (left/right) of the two characters were switched relative to the learning phase in half of the items. This was intended to ensure that participants learned the stimuli rather than the response side. The procedure was the same as in learning trials with one exception: feedback was presented for 800 ms and indicated the amount of money won ("+10 cents") or lost ("−10 cents") depending on the correctness of the response. Again, feedback indicating a miss was given in case of a late response. During the test phase, these misses were associated with a loss of 30 euro cents.

Each participant was tested in an individual session. After fitting the electrode cap, participants were seated comfortably in a dimly lit, electrically shielded room. They were familiarized with the task by completing two training blocks. The main part of the experiment consisted of 24 blocks. Each block comprised a learning phase and a test phase in which six items were first learned and then tested. The order of items was randomized within each phase. Between phases and blocks, there was a short, self-paced interruption which allowed participants to inform the experimenter of any problems they encountered during the task. Moreover, there was a longer break every sixth block.
After the final block, participants were given a short questionnaire and then were paid according to their performance.

2.4. Electrophysiological recordings

The electroencephalogram (EEG) was recorded using a BIOSEMI Active-Two system (BioSemi, Amsterdam, The Netherlands) with 64 Ag-AgCl electrodes from channels Fp1, AF7, AF3, F1, F3, F5, F7, FT7, FC5, FC3, FC1, C1, C3, C5, T7, TP7, CP5, CP3, CP1, P1, P3, P5, P7, P9, PO7, PO3, O1, Iz, Oz, POz, Pz, CPz, Fpz, Fp2, AF8, AF4, AFz, Fz, F2, F4, F6, F8, FT8, FC6, FC4, FC2, FCz, Cz, C2, C4, C6, T8, TP8, CP6, CP4, CP2, P2, P4, P6, P8, P10, PO8, PO4, O2 as well as the left and right mastoid. The CMS (Common Mode Sense) and DRL (Driven Right Leg) electrodes were used as reference and ground electrodes. The vertical and horizontal electrooculogram (EOG) was recorded from electrodes above and below the right eye and on the outer canthi of both eyes. All electrodes were off-line re-referenced to averaged mastoids. EEG and EOG were continuously recorded at a sampling rate of 512 Hz.

2.5. Data analysis

EEG data from learning trials were analyzed using EEGLAB v6.01 (Delorme & Makeig, 2004) and custom routines written in MATLAB 7.0.4 (The Mathworks, Natick, MA). The data were band-pass filtered excluding activity below 1 Hz and above 30 Hz. Epochs were extracted ranging from 200 ms before to 500 ms after feedback onset, and baseline activity was removed by subtracting the average voltage from an interval between 100 ms and 0 ms before feedback onset. Large artifacts were identified by computing the joint probability of each epoch (see EEGLAB v6.01), and epochs were excluded whose joint probability deviated more than five standard deviations from the distribution mean. Furthermore, epochs were excluded for which activity exceeded more than 150 µV above or below the baseline-corrected mean. Ocular artifacts were corrected using a linear regression method (Gratton, Coles, & Donchin, 1983) implemented in EEGLAB.
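Three of the preprocessing steps above (epoch extraction, baseline correction, and the ±150 µV amplitude criterion) can be illustrated with a NumPy sketch. This is not the EEGLAB/MATLAB pipeline the authors used: filtering, joint-probability rejection, and ocular correction are omitted, all function names are hypothetical, and the amplitude criterion is interpreted here as "any baseline-corrected sample beyond ±150 µV", which is an assumption.

```python
import numpy as np

SFREQ = 512.0  # sampling rate in Hz, as reported in Section 2.4


def extract_epochs(data, onsets, sfreq=SFREQ, tmin=-0.2, tmax=0.5):
    """Cut epochs from continuous data of shape (n_channels, n_samples).

    onsets are feedback-onset sample indices; they are assumed to lie far
    enough from the recording edges for a full epoch to be cut.
    """
    start = int(round(tmin * sfreq))
    stop = int(round(tmax * sfreq))
    times = np.arange(start, stop + 1) / sfreq  # seconds relative to onset
    epochs = np.stack([data[:, o + start:o + stop + 1] for o in onsets])
    return epochs, times  # epochs: (n_epochs, n_channels, n_times)


def baseline_correct(epochs, times, bmin=-0.1, bmax=0.0):
    """Subtract each channel's mean voltage in the baseline interval."""
    mask = (times >= bmin) & (times <= bmax)
    return epochs - epochs[:, :, mask].mean(axis=2, keepdims=True)


def reject_by_amplitude(epochs, threshold_uv=150.0):
    """Drop epochs in which any sample exceeds +/- threshold_uv."""
    keep = np.abs(epochs).max(axis=(1, 2)) <= threshold_uv
    return epochs[keep], keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    continuous = rng.normal(0.0, 10.0, size=(2, 5000))  # synthetic data in µV
    epochs, times = extract_epochs(continuous, onsets=[1000, 2000, 3000])
    clean, keep = reject_by_amplitude(baseline_correct(epochs, times))
    print(epochs.shape, int(keep.sum()), "epochs kept")
```

At 512 Hz the −200 to 500 ms window corresponds to 359 samples per epoch with this rounding; a different rounding convention would shift the edges by one sample.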
Finally, epochs were averaged separately for each condition of interest. On average, this resulted in 32.1, 30.7, 30.0, and 33.0 artifact-free trials in the valid/positive, valid/negative, invalid/positive, and invalid/negative feedback conditions, respectively.

Quantification of the FRN and the P300 has to deal with the problem that the two components fully overlap in time and partially overlap with respect to their spatial distribution (e.g., San Martín, 2012). In most studies, the FRN is a small negative deflection at frontocentral electrodes that overlaps with the broader anterior tail of the posteriorly peaking P300 (e.g., Yeung & Sanfey, 2004). To control for the influence of the P300 on FRN measurement, we applied a peak-to-peak method to quantify the FRN amplitude (Frank et al., 2005; Yeung & Sanfey, 2004; for an overview of FRN quantification methods, see San Martín, 2012). FRN amplitudes were determined separately for each condition and each participant at electrode FCz by, first, determining the amplitude of the most negative peak between 200 and 400 ms after feedback onset and, then, subtracting the amplitude of the most positive peak between 150 ms after feedback onset and the previously determined negative peak (Ferdinand et al., 2012; Holroyd, Hajcak, & Larsen, 2006). If there was no negative peak in the 200–400 ms time window, the FRN amplitude was taken as 0 mV. For the P300, we used a mean amplitude measure because the P300 peak is often masked by the posterior tail of the FRN and/or an N200 peak. For waveforms following relevant feedback, we calculated mean amplitudes in the time interval between 336 and 387 ms, which captures the P300 peak at the Pz electrode. Finally, we conducted an exploratory analysis of ERPs elicited by irrelevant feedback. In addition to peak-to-peak analyses of the N200, we selected mean amplitudes between 225 and 335 ms at electrode CPz for analysis based on visual inspection.

3. Results

3.1. Behavioral data

As expected, the proportion of correct responses in the learning trials (M = 49.4%, SD = 3.78) did not significantly exceed chance performance, F(1, 26) = 0.62, p = .44, suggesting that participants were merely guessing. The main question of interest was whether participants were able to learn from feedback and whether learning was influenced by the validity of the irrelevant feedback. To investigate this, each item in the test phase was classified according to whether it was followed by positive or negative relevant feedback in the learning phase, and whether the irrelevant feedback provided in the learning phase was valid or invalid. Then, the proportion of correct trials in the test blocks was calculated for each of the resulting conditions (see Fig. 2). The data were subjected to a two-way ANOVA with repeated measures on the variables Relevant Feedback Valence (positive, negative) and Irrelevant Feedback Validity (valid, invalid). We obtained two significant main effects. First, test performance was lower after positive relevant feedback (M = 66.3%, SD = 8.88) than after negative relevant feedback (M = 78.6%, SD = 8.53), F(1, 26) = 36.1, p < .001. Second, and more importantly, test performance was impaired when the irrelevant feedback was invalid (M = 71.1%, SD = 7.69) as compared to when irrelevant feedback was valid (M = 73.9%, SD = 7.23), F(1, 26) = 6.01, p < .05. There was no significant interaction (F < 1). Taken together, these data show that although the words were completely irrelevant, invalid feedback provided by irrelevant words impaired learning from feedback.
Moreover, test performance was clearly above chance level even in the invalid irrelevant feedback condition. This indicates that participants were aware that they had to learn from relevant feedback only and that effects of irrelevant feedback are not driven by failures to understand the instruction.

Fig. 2. Behavioral performance in the test phase as a function of Relevant Feedback Valence and Irrelevant Feedback Validity in the learning phase. Please note that chance performance would correspond to 50% accuracy.

3.2. Feedback-locked ERP data

3.2.1. Relevant feedback

The behavioral data indicated that irrelevant feedback had a substantial influence on learning from feedback. To investigate the origin of this effect, we analyzed feedback-locked ERPs elicited by the relevant feedback and asked whether processing of relevant feedback is influenced by the presence of irrelevant feedback that was either valid or invalid. To achieve this, we computed waveforms for positive and negative relevant feedback as a function of whether the irrelevant word provided the same (i.e., valid, irrelevant feedback) or a different (i.e., invalid, irrelevant feedback) feedback. The FRN was quantified at electrode FCz using the aforementioned peak-to-peak measure. Visual inspection revealed that the P300 was maximal around 360 ms at channel Pz (see Fig. 3B). As a consequence, the P300 was quantified as mean amplitude in a time range between 336 ms and 387 ms at electrode Pz. Both measures were subjected to a two-way ANOVA with repeated measures on the variables Relevant Feedback Valence (positive, negative) and Irrelevant Feedback Validity (valid, invalid).

Fig. 3. Feedback-locked ERPs and topographies following relevant feedback in the learning phase as a function of Relevant Feedback Valence and Irrelevant Feedback Validity. (A and B) Averaged waveforms for channels FCz and Pz. (C and D) Topographies of the difference wave between positive and negative relevant feedback for the 336–385 ms time interval after relevant feedback onset, presented separately for the valid (C) and invalid (D) irrelevant feedback condition.

For the FRN, the analysis revealed a main effect of Relevant Feedback Valence, F(1, 26) = 4.63, p < 0.05. Negative relevant feedback was associated with a larger FRN amplitude (M = 5.42 µV, SD = 2.47) than positive relevant feedback (M = 4.30 µV, SD = 1.90). However, there was neither a significant main effect of Irrelevant Feedback Validity, F(1, 26) = 2.67, p = 0.11, nor a significant interaction, F < 1. This suggests that irrelevant feedback did not influence the FRN to relevant feedback.2

2 Although an amplitude difference at FCz after valid and invalid irrelevant feedback can be seen in the ERPs depicted in Fig. 3A, inspection of the topography of the difference wave in the time window around the FRN peak revealed that this difference was maximal at posterior electrode sites. This strongly suggests that this difference is due to an effect of Irrelevant Feedback Validity on the P300 and not on the FRN.

For the P300 (see Fig. 3B), we obtained a significant main effect of Relevant Feedback Valence, F(1, 26) = 27.1, p < 0.001, a marginally significant main effect of Irrelevant Feedback Validity, F(1, 26) = 2.84, p < 0.1, and a significant interaction, F(1, 26) = 5.12, p < 0.05. Fig. 3C and D illustrate this interaction by depicting topographies of difference waves reflecting the valence effect when irrelevant feedback was valid (Fig. 3C) and invalid (Fig. 3D) in the time range of the P300. The interaction indicated that the P300 was larger for positive feedback (M = 9.07 µV, SD = 3.86) than for negative feedback (M = 5.52 µV, SD = 3.85) when the preceding word was valid. However, this valence effect was strongly reduced when the preceding irrelevant feedback word was invalid (positive relevant feedback: M = 7.18 µV, SD = 3.22; negative relevant feedback: M = 5.37 µV, SD = 3.59). These results suggest that the P300 to relevant feedback was associated with a valence effect, which might reflect the strength of controlled feedback evaluation, but that this effect was strongly reduced when relevant feedback was preceded by invalid, irrelevant feedback.
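The two amplitude measures used in this section, the peak-to-peak FRN at FCz and the mean-amplitude P300 at Pz, can be sketched for a single averaged waveform as follows. This is a simplified illustration, not the authors' MATLAB routine. The function names are hypothetical; a "negative peak" is taken to be a strict local minimum, the "most positive peak" is approximated by the maximum voltage in the search interval, and the result is returned as a non-negative magnitude (positive peak minus negative peak), a sign convention inferred from the positive FRN amplitudes reported above.

```python
import numpy as np


def frn_peak_to_peak(erp, times, neg_win=(0.200, 0.400), pos_start=0.150):
    """Peak-to-peak FRN magnitude for one waveform (1-D array, µV).

    times holds the sample times in seconds relative to feedback onset.
    Returns 0.0 when the negative window contains no local minimum.
    """
    idx = np.arange(1, len(erp) - 1)
    is_local_min = (erp[idx] < erp[idx - 1]) & (erp[idx] < erp[idx + 1])
    in_window = (times[idx] >= neg_win[0]) & (times[idx] <= neg_win[1])
    candidates = idx[is_local_min & in_window]
    if candidates.size == 0:
        return 0.0

    # Most negative of the local minima inside the 200-400 ms window.
    neg_i = candidates[np.argmin(erp[candidates])]

    # Largest voltage between 150 ms and the latency of the negative peak.
    pos_mask = (times >= pos_start) & (times <= times[neg_i])
    return float(erp[pos_mask].max() - erp[neg_i])


def mean_amplitude(erp, times, tmin, tmax):
    """Mean voltage of a waveform in the interval [tmin, tmax] seconds."""
    mask = (times >= tmin) & (times <= tmax)
    return float(erp[mask].mean())


if __name__ == "__main__":
    times = np.arange(-102, 257) / 512.0
    # Synthetic waveform: a positive bump near 200 ms and a dip near 270 ms.
    erp = 4.0 * np.exp(-((times - 0.20) / 0.02) ** 2) \
        - 3.0 * np.exp(-((times - 0.27) / 0.02) ** 2)
    print(frn_peak_to_peak(erp, times), mean_amplitude(erp, times, 0.336, 0.387))
```

Measuring the negative peak relative to the preceding positivity, rather than relative to baseline, is what makes the measure less sensitive to the slow P300 on which the FRN is superimposed.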

3.2.2. Irrelevant feedback

Our first analyses showed that irrelevant feedback influences processing of relevant feedback. However, one could also ask whether the irrelevant feedback is itself processed as feedback (even though participants knew that it was irrelevant). If irrelevant feedback were processed not only as an irrelevant word stimulus but as a feedback stimulus, one would expect that irrelevant feedback valence influences ERPs in a similar way as relevant feedback valence does. Indeed, such an analysis is possible because our design allowed for distinguishing between ERPs elicited by irrelevant and relevant feedback due to the delayed onset of relevant feedback. Visual inspection of the ERPs at fronto-central electrode sites (see Fig. 4A) suggests a pronounced N200 which might reflect an FRN. However, when we analyzed the peak-to-peak amplitudes at electrode FCz, we found no significant difference between positive and negative irrelevant feedback, F(1, 26) = 1.83, p = .19. This suggests that in contrast to relevant feedback, feedback valence did not affect the N200 amplitude, indicating that there was no FRN effect (Holroyd & Coles, 2002; Holroyd et al., 2008). However, the topography of the difference wave between positive and negative irrelevant feedback (see Fig. 4C) suggests that irrelevant feedback valence affected activity at posterior electrodes with a peak of this effect at electrode CPz. In order to quantify this posterior activity, we calculated mean amplitudes at this electrode between 225 and 335 ms. The analysis revealed that positive irrelevant feedback was associated with larger amplitudes (M = 5.39 µV, SD = 2.13) than negative irrelevant feedback (M = 4.59 µV, SD = 1.74), F(1, 26) = 4.71, p < 0.05. Moreover, when we calculated peak-to-peak amplitudes at this electrode in the same way as in the preceding analyses, we found no significant valence effect, F(1, 26) = 0.003, p = .96. This suggests that this posterior valence effect is not due to a posterior N200 but could rather reflect a valence effect on the feedback-locked P300. However, this interpretation has to be treated with caution because the time course and spatial distribution of this effect clearly differ from those of the feedback-locked P300 following relevant feedback. Nevertheless, it indicates that feedback valence is reflected in the ERPs following irrelevant feedback.

In a next step, we analyzed how irrelevant feedback processing affected performance and relevant feedback processing. Interestingly, we found a correlation (across participants) between the valence effect on ERPs following irrelevant feedback (quantified as the difference between the amplitudes of the P300 following positive and negative irrelevant feedback) and the effect of irrelevant feedback validity on test performance (quantified as the difference in test performance after valid and invalid feedback), r = .39, p < .05.
That is, the more pronounced the posterior difference wave after irrelevant feedback, the greater the effect of irrelevant feedback validity on test performance. In contrast, the valence effect on ERPs following irrelevant feedback did not correlate with the validity effect (valid minus invalid) on the P300 following relevant feedback, r = .11, p = .59. Finally, we found no significant correlation between the valence effect on test performance and any of the aforementioned ERP effects. Taken together, these analyses suggest that irrelevant feedback is processed like feedback at least to some extent. Moreover, the strength of irrelevant feedback processing predicts how strongly invalid irrelevant feedback impairs learning. This implies that invalid feedback affects learning not only because relevant feedback processing is impaired but also because irrelevant feedback itself leads to false learning of the incorrect response.
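The measures used in this section (a mean amplitude within a time window, a positive-minus-negative difference score, and a correlation of difference scores across participants) can be illustrated with a minimal Python sketch. This is not the analysis code used in the study: the function names, the sampling grid and all numbers below are hypothetical and serve only to show how such quantities could be computed from averaged waveforms.

```python
from math import sqrt
from statistics import mean


def mean_amplitude(times_ms, erp_uv, start_ms, end_ms):
    """Mean amplitude (in µV) of an averaged ERP within [start_ms, end_ms]."""
    window = [v for t, v in zip(times_ms, erp_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples inside the requested time window")
    return mean(window)


def valence_effect(times_ms, erp_positive, erp_negative, start_ms, end_ms):
    """Difference score: positive minus negative mean amplitude."""
    return (mean_amplitude(times_ms, erp_positive, start_ms, end_ms)
            - mean_amplitude(times_ms, erp_negative, start_ms, end_ms))


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical single-participant waveforms at one electrode (toy values).
times = [200, 250, 300, 350]
erp_pos = [4.0, 6.0, 5.0, 3.0]
erp_neg = [4.0, 5.0, 4.0, 3.0]
# Window bounds taken from the text (225 to 335 ms after feedback onset).
single_effect = valence_effect(times, erp_pos, erp_neg, 225, 335)

# Hypothetical per-participant difference scores (toy values):
# ERP valence effect and validity effect on test performance.
erp_effects = [0.2, 0.9, 1.4, 0.5]
validity_effects = [0.01, 0.04, 0.05, 0.03]
r = pearson_r(erp_effects, validity_effects)
```

In this sketch the across-participant correlation is computed on difference scores, mirroring the way the valence effect and the validity effect are defined in the text. A significance test for r is omitted.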

4. Discussion

Fig. 4. Feedback-locked ERPs and topographies following irrelevant feedback in the learning phase as a function of Irrelevant Feedback Valence. (A and B) Averaged waveforms for channels FCz and CPz. (C) Topography of the difference wave between positive and negative irrelevant feedback for the 225–325 ms time interval after irrelevant feedback onset.

The aim of the present study was to investigate how irrelevant and potentially invalid feedback influences learning in decision making. To this end, we measured feedback-related ERPs in a simple decision-making task in which participants could maximize their pay-off in a test phase by learning from feedback provided during a learning phase. Crucially, the feedback stimuli included not only relevant, valid feedback but also irrelevant and potentially invalid feedback. Irrelevant and relevant feedback was implemented by first presenting a color word (the irrelevant feedback) which then adopted a specific color (the relevant feedback). Using Stroop stimuli with delayed color onset ensured that the irrelevant feedback was still automatically encoded (Glaser & Glaser, 1989). As expected, we found performance in the test phase to be influenced by irrelevant feedback provided during the learning phase. Learning from feedback was significantly impaired when the irrelevant word provided invalid feedback as compared to when it provided valid feedback. This clearly demonstrates that


irrelevant feedback can impair learning even under conditions where the learner is fully aware that a stimulus provides irrelevant feedback. In a further step, we analyzed feedback-locked ERPs to reveal processes that contribute to this effect. We observed an FRN that was larger for negative than for positive feedback as well as a feedback-locked P300 that was larger for positive than for negative feedback. Most importantly, the feedback-locked P300 following the relevant feedback was strongly impaired when the preceding irrelevant feedback was invalid. This effect was stronger for positive than for negative feedback, indicating that invalid feedback reduced the effect of valence on the P300. Recent models suggest that feedback can be processed by two distinct systems, one based on processing of prediction errors, and the other strongly based on controlled processes involving attention and working memory (Daw et al., 2011; Frank & Claus, 2006). Provided that this valence effect on the feedback-locked P300 is a marker of controlled feedback evaluation, this finding suggests that irrelevant feedback impaired only controlled processing of relevant feedback. This could suggest that irrelevant feedback primed valence information that was involved only in the controlled processing of feedback in working memory. The notion that irrelevant feedback has activated valence information receives support from the observation of a valence effect in the processing of irrelevant feedback. The modulation of a P300-like waveform by irrelevant feedback valence implies that irrelevant feedback was evaluated according to the valence associated with the irrelevant word. Moreover, this valence processing turned out to be predictive of learning decrements caused by invalid irrelevant feedback. 
This suggests that participants were less influenced by irrelevant feedback if they were better able to either inhibit the irrelevant feedback or prevent the word from being evaluated according to its valence. In contrast, we found no such effect of irrelevant feedback on the FRN. Considering current theories of the FRN (Alexander & Brown, 2011; Holroyd & Coles, 2002; Holroyd et al., 2008), this implies that the presence of invalid irrelevant feedback information did not affect the generation of a prediction error or the utilization of a prediction error in the ACC. This could reflect that the computation of prediction errors is independent of valence information in working memory. On the one hand, this is consistent with the idea that value computation in the human reward system is based on direct associations between stimulus characteristics (here: color) and outcomes (e.g., Holroyd & Coles, 2002). On the other hand, it could show that the processes underlying FRN generation are highly selective regarding which information is utilized and which is ignored. This is suggested by prior research showing that the FRN can reflect different types of information (e.g., monetary outcome or performance feedback) depending on which aspect was emphasized by the experimental context (Nieuwenhuis, Yeung, Holroyd, Schurger, & Cohen, 2004). Interestingly, we observed an effect of relevant feedback valence on the FRN amplitude although the probability of positive and negative feedback was equal. This is in line with the RL-ERN theory which assumes that only the prediction error generated by negative feedback contributes to the FRN (Holroyd & Coles, 2002). It is also consistent with the idea of a reward positivity that reflects the prediction error generated by positive feedback (e.g., Holroyd et al., 2008).
However, because positive and negative feedback were equally probable, this result appears to be in conflict with the PRO model (Alexander & Brown, 2011) which would predict FRN amplitude differences only when feedback probabilities differ. However, it could be argued that these amplitude differences were due to an overoptimistic bias (Miller & Ross, 1975) that led participants to expect positive feedback in some trials even though they were merely guessing (Oliveira, McDonald, & Goodman, 2007). Moreover, it is possible that subjective feedback

probabilities were biased because the learning phase was preceded by a test phase in which positive feedback was more probable than negative feedback. It is important to note that although the irrelevant feedback was not predictive of the relevant feedback (which is why we termed it irrelevant), it is still potentially conflicting with the relevant feedback. Therefore, an optimal strategy would be to actively inhibit the irrelevant feedback stimulus. The ERPs following irrelevant feedback suggest that this inhibition was not perfect, which is not surprising given the well-known Stroop effect. Rather, even irrelevant feedback elicited a valence effect, and this valence effect was predictive of the influence of feedback validity on test performance. However, one could ask whether the results would have changed if the irrelevant feedback had been less conflicting, e.g., because the irrelevant feedback was invalid on very few trials only. Indeed, research on the Stroop effect suggests that the frequency of incompatible stimuli influences the strength with which the irrelevant word is processed (Logan & Zbrodoff, 1979). Furthermore, if the irrelevant feedback is frequently valid, it becomes relevant feedback (because it is now informative regarding the correctness of the response) – a condition under which previous studies have obtained an FRN even for apparently irrelevant stimuli (Baker & Holroyd, 2009, Exp. 2; Holroyd et al., 2011). Thus, one would expect that the influence of irrelevant feedback crucially depends on the probability that it conflicts with relevant feedback – a prediction that should be tested in future research. Another question is whether conflict could play a direct role in the emergence of our effects. Indeed, conflict-based accounts of the FRN (Botvinick, Cohen, & Carter, 2004; Liu, Nelson, Bernat, & Gehring, 2014) suggest that the FRN reflects the conflict between actual and expected feedback.
One could assume that irrelevant feedback induces an expectation about the relevant feedback which then leads to a conflict with the actual relevant feedback. In this case, the FRN amplitude following relevant feedback should have been generally increased in the invalid irrelevant feedback condition. However, this is not reflected in our results because irrelevant feedback validity had no effect at all on the FRN following relevant feedback. Taken together, our results suggest that the detrimental effects of irrelevant feedback are driven by mechanisms reflected by the feedback-locked P300 and that this mechanism could involve the processing of valence information in working memory. To provide a more process-based explanation of these effects, we first have to discuss why the feedback-locked P300 is increased for positive feedback in our task. While initial studies suggested that the feedback-locked P300 is only sensitive to outcome magnitude but not to outcome valence (e.g. Gu, Wu, Jiang, & Luo, 2011; Sato et al., 2005; Yeung & Sanfey, 2004), recent studies have reported an effect of valence, albeit not consistently (see San Martín, 2012, for an overview). Possible explanations of this valence effect can be derived from current theories on the P300: First, the P300 amplitude has frequently been shown to reflect the unexpectedness of stimuli (for an overview, see Polich & Kok, 1995, but see Verleger, Hamann, Asanowicz, & Śmigasiewicz, 2015; Verleger, Metzner, Ouyang, Śmigasiewicz, & Zhou, 2014). However, it is rather unlikely that this can explain the present effects because positive and negative feedback was equally frequent. Moreover, if the participants had been overconfident concerning the probability of positive feedback – possibly due to the higher frequency of positive feedback in the test phases – a more pronounced P300 after the subjectively less expected negative feedback should have been observed.
Finally, it has been proposed that the feedback-locked P300 is modulated by the motivational significance of the feedback (e.g. Nieuwenhuis et al., 2005; Nieuwenhuis, 2011; San Martín, 2012; Wu & Zhou, 2009). This would indicate that in our paradigm


positive feedback was more motivationally significant than negative feedback. However, given that negative feedback was equally informative with respect to future outcomes (+10 ct. vs. −10 ct.), it is unclear why feedback should be of different significance. Here, we propose that the P300 following feedback reflects the feedback-based evaluation of a choice by controlled processes (Ernst & Steinhauser, 2012). This account is motivated by recent evidence that the error positivity or Pe – a P300-like ERP that follows incorrect responses in decision tasks – reflects a decision process by which the performance monitoring system decides whether an error has occurred or not (e.g., Murphy, Robertson, Allen, Hester, & O’Connell, 2012; Nieuwenhuis, Ridderinkhof, Blom, Band, & Kok, 2001; Steinhauser & Yeung, 2010; for reviews, see Overbeek, Nieuwenhuis, & Ridderinkhof, 2005; Ullsperger, Harsay, Wessel, & Ridderinkhof, 2010) and that the Pe amplitude represents the accumulated evidence in favor of an error (Murphy et al., 2012; Steinhauser & Yeung, 2010, 2012). We propose that a similar idea could account for the feedback-locked P300 in the present task. In contrast to the Pe, however, we assume that the feedback-locked P300 represents the accumulated evidence that the initial choice was correct, and that this explains why positive feedback is associated with larger P300 amplitudes. This is plausible because in speeded choice tasks, errors are more relevant for performance monitoring because they convey information about behavioral adjustments. In contrast, in learning tasks with feedback, correct feedback is more relevant because participants seek information confirming that their response is correct. Based on this account, the effect of invalid, irrelevant feedback on the P300 following relevant feedback would indicate that the evaluation of the previous choice is impaired when invalid feedback is provided.
This might reflect the interfering effect of the automatically encoded word because the word primed an incompatible valence representation. This should affect the accumulation of evidence that the initial choice was correct and, in this way, the feedback-locked P300 amplitude. Please note that this account would predict that the P300 following negative, relevant feedback should be – at least slightly – larger when irrelevant feedback was invalid as compared to when it was valid (because positive, irrelevant feedback should also add some evidence for a correct choice, and thus, should increase the P300). There are at least two reasons why such an effect was not obtained: First, a closer inspection of the ERPs at Pz shows that the FRN also affected activity at this electrode which might have masked the expected effect on the P300 on negative feedback trials. Second, the relationship between relevant feedback valence and irrelevant feedback validity might be multiplicative instead of additive. That is, irrelevant feedback does not affect the feedback-locked P300 following relevant feedback if this feedback-locked P300 is almost absent (because no evidence from relevant feedback is accumulated at all). Taken together, the present results demonstrate that ambiguous feedback can impair learning even if the learner is fully aware of which feedback stimulus is valid and which is potentially invalid and thus irrelevant. Our data provide evidence that one mechanism contributing to these performance decrements is the interfering effect of irrelevant feedback on the processing of relevant feedback. This effect was reflected in changes in the P300 rather than the FRN. Thus, at least in the current learning paradigm, it is the P300 rather than the FRN that is associated with performance – a finding that has been obtained elsewhere with different tasks (Chase et al., 2011; Ernst & Steinhauser, 2012; Sailer et al., 2010; Yeung & Sanfey, 2004).
This suggests that controlled processes can be crucially involved in learning from feedback in decision-making (e.g. Collins & Frank, 2012; Frank & Claus, 2006). Whereas literature on learning from feedback is currently strongly dominated by studies focusing on reinforcement


and the FRN, future research is needed that specifies the exact mechanism that underlies controlled feedback processing.

Acknowledgements

This research was supported by a grant to Marco Steinhauser from the Excellence Initiative of the University of Konstanz as part of the Center for Psychoeconomics. We are grateful to Lisa Kübler for assistance in conducting the experiments.

References

Alexander, W. H., & Brown, J. W. (2011). Medial prefrontal cortex as an action-outcome predictor. Nature Neuroscience, 14, 1338–1344.
Baker, T. E., & Holroyd, C. B. (2009). Which way do I go? Neural activation in response to feedback and spatial processing in a virtual T-maze. Cerebral Cortex, 19, 1708–1722.
Baker, T. E., & Holroyd, C. B. (2011). Dissociated roles of the anterior cingulate cortex in reward and conflict processing as revealed by the feedback error-related negativity and N200. Biological Psychology, 87, 25–34.
Bellebaum, C., & Daum, I. (2008). Learning-related changes in reward expectancy are reflected in the feedback-related negativity. European Journal of Neuroscience, 27, 1823–1835.
Bellebaum, C., Polezzi, D., & Daum, I. (2010). It is less than you expected: The feedback-related negativity reflects violations of reward magnitude expectations. Neuropsychologia, 48, 3343–3350.
Botvinick, M. M., Cohen, J. D., & Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: An update. Trends in Cognitive Sciences, 8, 539–546.
Chase, H. W., Swainson, R., Durham, L., Benham, L., & Cools, R. (2011). Feedback-related negativity codes prediction error but not behavioral adjustment during probabilistic reversal learning. Journal of Cognitive Neuroscience, 23, 936–946.
Collins, A. G. E., & Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35, 1024–1035.
Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-based influences on humans’ choices and striatal prediction errors. Neuron, 69, 1204–1215.
De Houwer, J. (2003). On the role of stimulus–response and stimulus–stimulus compatibility in the Stroop effect. Memory & Cognition, 31, 353–359.
Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134, 9–21.
Donchin, E., & Coles, M. G. H. (1988). Is the P300 component a manifestation of context updating? Behavioral and Brain Sciences, 11, 357–374.
Ernst, B., & Steinhauser, M. (2012). Feedback-related brain activity predicts learning from feedback in multiple-choice testing. Cognitive, Affective, & Behavioral Neuroscience, 12, 323–336.
Ferdinand, N. K., & Kray, J. (2013). Age-related changes in processing positive and negative feedback: Is there a positivity effect for older adults? Biological Psychology, 94, 235–241.
Ferdinand, N. K., Mecklinger, A., Kray, J., & Gehring, W. J. (2012). The processing of unexpected positive response outcomes in the mediofrontal cortex. The Journal of Neuroscience, 32, 12087–12092.
Foti, D., Weinberg, A., Dien, J., & Hajcak, G. (2011). Event related potential activity in the basal ganglia differentiates rewards from nonrewards: Temporospatial principal components analysis and source localization of the feedback negativity. Human Brain Mapping, 32, 2207–2216.
Frank, M. J., & Claus, E. D. (2006). Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychological Review, 113, 300–326.
Frank, M., Woroch, B., & Curran, T. (2005). Error-related negativity predicts reinforcement learning and conflict biases. Neuron, 47, 495–501.
Glaser, W. R., & Glaser, M. O. (1989). Context effects in Stroop-like word and picture processing. Journal of Experimental Psychology: General, 118, 13–42.
Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484.
Gu, R., Wu, T., Jiang, Y., & Luo, Y. J. (2011). Woulda, coulda, shoulda: The evaluation and the impact of the alternative outcome. Psychophysiology, 48, 1354–1360.
Hajcak, G., Moser, J. S., Holroyd, C. B., & Simons, R. F. (2007). It’s worse than you thought: The feedback negativity and violations of reward prediction in gambling tasks. Psychophysiology, 44, 905–912.
Hajihosseini, A., & Holroyd, C. B. (2013). Frontal midline theta and N200 amplitude reflect complementary information about expectancy and outcome evaluation. Psychophysiology, 50, 550–562.
Holroyd, C. B., Baker, T. E., Kerns, K. A., & Müller, U. (2008). Electrophysiological evidence of atypical motivation and reward processing in children with attention-deficit hyperactivity disorder. Neuropsychologia, 46, 2234–2242.
Holroyd, C. B., & Coles, M. G. H. (2002). The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–708.


Holroyd, C. B., Hajcak, G., & Larsen, J. T. (2006). The good, the bad and the neutral: Electrophysiological responses to feedback stimuli. Brain Research, 1105, 93–101.
Holroyd, C. B., & Krigolson, O. E. (2007). Reward prediction error signals associated with a modified time estimation task. Psychophysiology, 44, 913–917.
Holroyd, C. B., Krigolson, O. E., & Lee, S. (2011). Reward positivity elicited by predictive cues. Neuroreport, 22, 249–252.
Holroyd, C. B., Pakzad-Vaezi, K., & Krigolson, O. (2008). The feedback correct related positivity: Sensitivity of the event related brain potential to unexpected positive feedback. Psychophysiology, 45, 688–697.
Holroyd, C. B., & Yeung, N. (2012). Motivation of extended behaviors by anterior cingulate cortex. Trends in Cognitive Sciences, 16, 122–128.
Holroyd, C. B., & Yeung, N. (2011). An integrative theory of anterior cingulate cortex function: Option selection in hierarchical reinforcement learning. In R. Mars, J. Sallet, M. Rushworth, & N. Yeung (Eds.), Neural basis of motivational and cognitive control. Cambridge, MA: MIT Press.
Li, P., Han, C., Lei, Y., Holroyd, C. B., & Li, H. (2011). Responsibility modulates neural mechanisms of outcome processing: An ERP study. Psychophysiology, 48, 1129–1133.
Liu, Y., Nelson, L. D., Bernat, E. M., & Gehring, W. J. (2014). Perceptual properties of feedback stimuli influence the feedback-related negativity in the flanker gambling task. Psychophysiology, 51, 782–788.
Logan, G. D., & Zbrodoff, N. J. (1979). When it helps to be misled: Facilitative effects of increasing the frequency of conflicting stimuli in a Stroop-like task. Memory & Cognition, 7, 166–174.
MacLeod, C. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163–203.
Mathewson, K. J., Dywan, J., Snyder, P. J., Tays, W. J., & Segalowitz, S. J. (2008). Aging and electrocortical response to error feedback during a spatial learning task. Psychophysiology, 45, 936–948.
Mies, G. W., van der Veen, F. M., Tulen, J. H. M., Hengeveld, M. W., & van der Molen, M. W. (2011). Cardiac and electrophysiological responses to valid and invalid feedback in a time-estimation task. Journal of Psychophysiology, 25, 131–142.
Miller, D. T., & Ross, M. (1975). Self-serving biases in the attribution of causality: Fact or fiction? Psychological Bulletin, 82, 213–225.
Miltner, W. H. R., Braun, C. H., & Coles, M. G. H. (1997). Event-related brain potentials following incorrect feedback in a time-estimation task: Evidence for a “generic” neural system for error detection. Journal of Cognitive Neuroscience, 9, 788–798.
Murphy, P. R., Robertson, I. H., Allen, D., Hester, R., & O’Connell, R. G. (2012). An electrophysiological signal that precisely tracks the emergence of error awareness. Frontiers in Human Neuroscience, 6, 65.
Nieuwenhuis, S. (2011). Learning, the P3, and the locus coeruleus-norepinephrine system. In R. B. Mars, J. Sallet, M. Rushworth, & N. Yeung (Eds.), Neural Basis of Motivational and Cognitive Control. Oxford University Press.
Nieuwenhuis, S., Aston-Jones, G., & Cohen, J. D. (2005). Decision making, the P3, and the locus coeruleus-norepinephrine system. Psychological Bulletin, 131, 510–532.
Nieuwenhuis, S., Ridderinkhof, K. R., Blom, J., Band, G. P. H., & Kok, A. (2001). Error related brain potentials are differentially related to awareness of response errors: Evidence from an antisaccade task. Psychophysiology, 38, 752–760.
Nieuwenhuis, S., Yeung, N., Holroyd, C. B., Schurger, A., & Cohen, J. D. (2004). Sensitivity of electrophysiological activity from medial frontal cortex to utilitarian and performance feedback. Cerebral Cortex, 14, 741–747.

Oliveira, F. T., McDonald, J. J., & Goodman, D. (2007). Performance monitoring in the anterior cingulate is not all error related: Expectancy deviation and the representation of action-outcome associations. Journal of Cognitive Neuroscience, 19, 1994–2004.
Overbeek, T. J. M., Nieuwenhuis, S., & Ridderinkhof, K. R. (2005). Dissociable components of error processing: On the functional significance of the Pe vis-à-vis the ERN/Ne. Journal of Psychophysiology, 19, 319–329.
Polich, J. (2007). Updating P300: An integrative theory of P3a and P3b. Clinical Neurophysiology, 118, 2128–2148.
Polich, J., & Kok, A. (1995). Cognitive and biological determinants of P300: An integrative review. Biological Psychology, 41, 103–146.
Sailer, U., Fischmeister, F. P. S., & Bauer, H. (2010). Effects of learning on feedback-related brain potentials in a decision-making task. Brain Research, 1342, 85–93.
San Martín, R. (2012). Event-related potential studies of outcome processing and feedback-guided learning. Frontiers in Human Neuroscience, 6.
Sato, A., Yasuda, A., Ohira, H., Miyawaki, K., Nishikawa, M., Kumano, H., et al. (2005). Effects of value and reward magnitude on feedback negativity and P300. Neuroreport, 16, 407–411.
Squires, K. C., Hillyard, S. A., & Lindsay, P. H. (1973). Cortical potentials evoked by confirming and disconfirming feedback following an auditory discrimination. Perception and Psychophysics, 13, 25–31.
Steinhauser, M., & Yeung, N. (2010). Decision processes in human performance monitoring. The Journal of Neuroscience, 30, 15643–15653.
Steinhauser, M., & Yeung, N. (2012). Error awareness as evidence accumulation: Effects of speed–accuracy trade-off on error signaling. Frontiers in Human Neuroscience, 6.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology: General, 18, 643–662.
Ullsperger, M., Harsay, H. A., Wessel, J. R., & Ridderinkhof, K. R. (2010). Conscious perception of errors and its relation to the anterior insula. Brain Structure and Function, 214, 629–643.
Ulrich, N., & Hewig, J. (2014). A miss is as good as a mile? Processing of near and full outcomes in a gambling paradigm. Psychophysiology, 51, 819–823.
Verleger, R., Hamann, L. M., Asanowicz, D., & Śmigasiewicz, K. (2015). Testing the SR link hypothesis of P3b: The oddball effect on S1-evoked P3 gets reduced by increased task relevance of S2. Biological Psychology, 108, 25–35.
Verleger, R., Metzner, M. F., Ouyang, G., Śmigasiewicz, K., & Zhou, C. (2014). Testing the stimulus-to-response bridging function of the oddball-P3 by delayed response signals and residue iteration decomposition (RIDE). NeuroImage, 100, 271–280.
Walsh, M. M., & Anderson, J. R. (2011). Modulation of the feedback-related negativity by instruction and experience. Proceedings of the National Academy of Sciences, 108, 19048–19053.
Walsh, M. M., & Anderson, J. R. (2012). Learning from experience: Event-related potential correlates of reward processing, neural adaptation, and behavioral choice. Neuroscience & Biobehavioral Reviews, 36, 1870–1884.
Wu, Y., & Zhou, X. (2009). The P300 and reward valence, magnitude, and expectancy in outcome evaluation. Brain Research, 1286, 114–122.
Yeung, N., & Sanfey, A. D. (2004). Independent coding of reward magnitude and valence in the human brain. Journal of Neuroscience, 24, 6258–6264.
Zhou, Z., Yu, R., & Zhou, X. (2010). To do or not to do? Action enlarges the FRN and P300 effects in outcome evaluation. Neuropsychologia, 48, 3606–3613.