NeuroImage 50 (2010) 223–232
Altered activation in association with reward-related trial-and-error learning in patients with schizophrenia

Kathrin Koch a,⁎, Claudia Schachtzabel a, Gerd Wagner a, Julia Schikora a, Christoph Schultz a, Jürgen R. Reichenbach b, Heinrich Sauer a, Ralf G.M. Schlösser a

a Department of Psychiatry and Psychotherapy, Friedrich-Schiller-University Jena, Jahnstr., Philosophenweg 3, 07740 Jena, Germany
b Medical Physics Group, Institute for Diagnostic and Interventional Radiology, Friedrich-Schiller-University Jena, Philosophenweg 3, 07740 Jena, Germany

⁎ Corresponding author. Fax: +49 3641 9 35444. E-mail address: [email protected] (K. Koch).
doi:10.1016/j.neuroimage.2009.12.031
Article history: Received 23 July 2009; Revised 3 December 2009; Accepted 7 December 2009; Available online 16 December 2009.
Keywords: fMRI; Putamen; Reinforcement learning; Reward; Schizophrenia
Abstract

In patients with schizophrenia, the ability to learn from reinforcement is known to be impaired. The present fMRI study aimed at investigating the neural correlates of reinforcement-related trial-and-error learning in 19 schizophrenia patients and 20 healthy volunteers. A modified gambling paradigm was applied in which each cue indicated a subsequent number that had to be guessed. In order to vary predictability, the cue–number associations were based on different probabilities (50%, 81%, 100%) about which the participants were not informed. Patients' ability to learn contingencies on the basis of feedback and reward was significantly impaired. While in healthy volunteers increasing predictability was associated with decreasing activation in a fronto-parietal network, this decrease was not detectable in patients. Analysis of expectancy-related reinforcement processing yielded a hypoactivation in the putamen, dorsal cingulate and superior frontal cortex in patients relative to controls. The present results indicate that both reinforcement-associated processing and reinforcement learning might be impaired in the context of the disorder. They moreover suggest that the activation deficits which patients exhibit in association with the processing of reinforcement might constitute the basis for the learning deficits and their accompanying activation alterations.
Introduction

An increasing number of studies indicate that reinforcement learning is impaired in patients with schizophrenia (Morris et al., 2008; Premkumar et al., 2008; Waltz et al., 2007). The ability to learn from reinforcement, feedback or reward and to optimize behavior accordingly is essential for normal functioning in everyday life. This impairment must therefore be regarded as strongly debilitating. Several lines of evidence indicate that disorder-related alterations in the dopamine (DA) system might underlie this deficit (Holcomb and Rowland, 2007; Meisenzahl et al., 2007). Dopaminergic processes are assumed to constitute the basis of the so-called prediction error (PE). In primate studies (Schultz, 2000), firing in midbrain dopamine neurons has been found to be elevated after an unpredicted reward (positive prediction error) and decreased when a predicted reward was omitted (negative prediction error). The positive prediction error in particular is thought to be a key component of reinforcement-based learning (Hikosaka et al., 2008; Schultz, 2006). Medial and lateral prefrontal as well as dorsal and ventral striatal regions constitute central dopaminergic projection fields. Accordingly, phasic DA releases and accompanying
activation increases in fronto-striatal areas shortly after a positive prediction error have been demonstrated in a number of primate studies (Schultz et al., 1993; Schultz and Romo, 1990). In line with this notion, a number of studies on healthy volunteers found activation increases in predominantly medial and lateral frontal regions to be inversely related to the likelihood of receiving positive feedback or reward (McClure et al., 2003; O'Doherty et al., 2003; Pagnoni et al., 2002). Accordingly, increased activation in these regions in association with increased uncertainty has also been reported in healthy volunteers during feedback- or reward-based probabilistic learning (Fiorillo et al., 2003; Koch et al., 2008; Schlösser et al., 2009; Volz et al., 2003). There is strong empirical support for an altered dopaminergic state in patients with schizophrenia (Abi-Dargham, 2004; Abi-Dargham et al., 2002; Goldman-Rakic et al., 2004). Findings of decreased activation in the ventral striatum during reward anticipation (Kirsch et al., 2007) and processing (Schlagenhauf et al., 2008) corroborate this frequently discussed "dopamine hypothesis". Hence, investigating reward-based learning, with its putatively strong dependence on the fronto-striatal system, should reveal disorder-related alterations within this system. It is surprising, therefore, that only a few studies have investigated the neural correlates of reward- or reinforcement-related learning in patients (Murray et al., 2008; Waltz et al., 2009; Weickert et al., 2009). Initial evidence by Shepard and Holcomb (2006) pointed to a lack of habenular and dorsal striatal activation in association with
impaired feedback-based learning in schizophrenia patients. Recent results by Murray and colleagues (2008), who applied a reward-based trial-and-error learning task, revealed hypoactivations in mainly the dorsal striatum, midbrain and frontal regions in first-episode patients.
The present study explored reward-related trial-and-error learning in a dynamic environment by varying the predictability of a consequence. Reward was provided by the indication of a monetary gain. As most patients exhibited a high degree of negative symptoms, we implemented a monetary loss (instead of a mere omission of a predicted reward) to intensify the negative emotion accompanying the omission of a positive consequence. Assuming that it is the absence of an expected positive consequence and its accompanying negative emotion which characterizes the negative PE, the additional implementation of a monetary loss or punishment was an attempt to enhance this negative emotional effect. It should be mentioned, however, that our implementation of the negative prediction error is, strictly speaking, not in accordance with its original concept, which implies the absence of reinforcement when it was expected. To investigate the direct relation between impaired learning performance and brain activation, we assessed individual trial learning by modeling an individual learning rate parameter and relating it to learning-associated activation patterns. As mentioned above, previous studies on healthy subjects found a negative relation between activation intensity and predictability in task-relevant areas. Against the background of the findings mentioned above (Kirsch et al., 2007; Schlagenhauf et al., 2008; Waltz et al., 2009), indicating disorder-related alterations within the fronto-striatal system, which is known to be involved in successful reinforcement learning, we expected patients to show impaired learning performance. Accordingly, we expected the negative relation between activation intensity and predictability in task-relevant areas to be missing in patients. We furthermore anticipated this to be reflected in an altered relation between individual trial learning efficiency and brain activation.

Materials and methods

Subjects
Initially, 20 patients and 20 controls were included in the study. One patient was excluded due to excessive movement. Hence, 19 right-handed (Annett, 1967) patients (12 male, 7 female) with a DSM-IV diagnosis of schizophrenia and 20 right-handed healthy subjects (12 male, 8 female) were finally included in the study. On average, patients were 35.2 ± 11.5 years old and had a mean education of 10.58 ± 2.06 years. In the healthy controls, mean age was 29.7 ± 9.1 years and mean education 12.70 ± 0.92 years. There was no significant difference between the groups in terms of age (t(37) = 1.62, n.s.) but a significant difference regarding education (t(25) = −4.12, p < 0.01, corrected for unequal variances). Diagnosis was established by the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I) (First et al., 1996) and confirmed by two independent clinical psychiatrists (R.S. and Ch.S.). Patients were free of any concurrent psychiatric diagnosis and had no neurological conditions. They were in remission from an acute psychotic episode. One of the patients was unmedicated; the other 18 patients were on stable medication with atypical antipsychotics. The mean chlorpromazine equivalent dose was 736.1 ± 588.7.
Psychopathological status of the patients was assessed with the Positive and Negative Syndrome Scale (PANSS) (Kay et al., 1987). Ratings were 15.4 ± 4.7 on the positive subscale, 18.8 ± 6.8 on the negative subscale and 35.8 ± 7.6 on the general psychopathology scale. A verbal multiple-choice IQ test (Lehrl, 1989) confirmed that none of the participants was intellectually impaired (IQ < 70).
Volunteer subjects were screened by comprehensive assessment procedures for medical, neurological and psychiatric history. Exclusion criteria were current and potentially interfering medical conditions, any current or previous neurological or psychiatric disorder and first-degree relatives with axis I psychiatric or neurological disorders. All participants gave written informed consent to a study protocol approved by the Ethics Committee of the Friedrich-Schiller-University Medical School.

Experimental design
Using the Presentation software package (Neurobehavioral Systems Inc., USA), stimuli were projected onto a transparent screen inside the scanner tunnel, which the subject viewed through a mirror system mounted on top of the MRI head coil. The subjects' responses were recorded with an MRI-compatible fibre-optic response device (Lightwave Medical Industries, Canada) with a four-button keypad for the right hand.
A probabilistic trial-and-error learning task arranged as an event-related design was applied. The task is similar to one used previously in healthy controls (Koch et al., 2008); for the purpose of patient studies, however, task difficulty was reduced. Participants were informed that they would be presented with a card showing a geometrical figure (i.e. circle, cross, half-moon, triangle, square or pentagon). They were told that each figure was associated with an unknown value ranging from 1 to 9. Participants were asked to guess whether the figure on the card predicted a value higher or lower than five. They were told that each correct guess was followed by a monetary reward (+0.50 €) whereas each wrong guess was followed by a punishment (−0.50 €). Participants were also instructed that each figure predicted the respective value (higher or lower than five) with a certain probability. Three conditions were employed: one highly uncertain condition that did not allow any outcome prediction (i.e. 50% stimulus-outcome contingency), one condition permitting full prediction (i.e. 100% stimulus-outcome contingency) and one condition in which prediction was partly possible (i.e. 81% stimulus-outcome contingency). Participants were not informed about the predictive probabilities of the respective figures. Thus, in order to maximize their gains, they had to learn in the course of the experiment to improve their guesses based on the different prediction probabilities. The whole paradigm consisted of a series of 96 interleaved trials with 16 trials for each stimulus category. As each probability condition was based on two stimulus categories (e.g. the 81% condition contained 16 squares and 16 circles), each probability condition comprised 32 trials.
Each trial started with the presentation of the probability-condition-specific figure, which was shown for 1.5 s. After an inter-stimulus interval lasting 4.5 s, a question mark was presented for 2.5 s, during which participants had to answer by button press. After another inter-stimulus interval of 4.5 s, the correct solution followed by the indication of a reward or punishment appeared for 2.5 s. Each trial ended with an inter-trial interval lasting 3.5 s. In addition, we introduced a temporal jitter by varying the second inter-stimulus interval between 4.5 and 5.5 s in order to increase sensitivity. Participants were compensated according to their performance, although a minimum of €20 was guaranteed for volunteering.
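For illustration, the following minimal Python sketch (not part of the original study) shows how a trial list with the stimulus-outcome contingencies and payoffs described above could be generated. The figure-to-condition mapping follows the assignment given in the Data analysis section below; the assumption that every figure predicts the "higher than five" outcome, as well as the seed and function names, are illustrative.

```python
import random

# Hypothetical sketch of the trial structure described above (not the authors' code):
# six figures, two per probability condition, 16 trials each = 96 interleaved trials.
CONTINGENCIES = {           # probability that the figure predicts "higher than five"
    "triangle": 0.50, "pentagon": 0.50,   # 50% condition: outcome not predictable
    "circle":   0.81, "square":   0.81,   # 81% condition: partly predictable
    "half-moon": 1.00, "cross":  1.00,    # 100% condition: fully predictable
}
TRIALS_PER_FIGURE = 16
REWARD, PUNISHMENT = 0.50, -0.50          # gain/loss in euros per trial


def build_trial_list(seed=0):
    """Return a shuffled list of (figure, outcome_is_high) tuples."""
    rng = random.Random(seed)
    trials = []
    for figure, p_high in CONTINGENCIES.items():
        for _ in range(TRIALS_PER_FIGURE):
            trials.append((figure, rng.random() < p_high))
    rng.shuffle(trials)
    return trials


def payoff(guess_high, outcome_is_high):
    """Correct guesses earn +0.50 EUR, wrong guesses cost 0.50 EUR."""
    return REWARD if guess_high == outcome_is_high else PUNISHMENT
```

A participant who tracks the contingencies can, on average, only break even in the 50% condition but can approach the maximum payoff in the 81% and 100% conditions, which is what makes the learning-rate analysis below informative.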
fMRI procedure
Functional data were collected on a 3-T Siemens TIM Trio whole-body system (Siemens, Erlangen, Germany) equipped with a 12-element receive-only head matrix coil. Foam pads were used for positioning and immobilization of the subject's head within the head coil. T2*-weighted images were obtained using a gradient-echo EPI sequence (TR = 2040 ms, TE = 26 ms, flip angle = 90°) with 40 contiguous transverse slices of 3.3 mm thickness covering the entire brain. Matrix size was 72 × 72 pixels with an in-plane resolution of 2.67 × 2.67 mm², corresponding to a field of view of 192 mm. A series of 965 whole-brain volume sets was acquired, with the first three images of each series being discarded. High-resolution anatomical T1-weighted volume scans (MPRAGE) were obtained in sagittal orientation (TR = 2300 ms, TE = 3.03 ms, TI = 900 ms, flip angle = 9°, FOV = 256 mm, matrix = 256 × 256, number of sagittal slices = 192, acceleration factor (PAT) = 2, TA = 5:21 min) with a slice thickness of 1 mm and an isotropic resolution of 1 × 1 × 1 mm³.

Data analysis

Behavioral data
Performance was assessed by the percentage of correct reactions in each probability condition. A two-way repeated measures ANOVA with probability condition (50%, 81%, 100%) as within-subject factor and group (patients, controls) as between-subject factor, followed by post hoc t-tests, was performed to test for performance differences.
To assess individual trial learning, we applied the concept of temporal difference learning to estimate an individual learning rate (LR) parameter (Sutton and Barto, 1998). To this aim, the change in associative strength of stimulus i on each trial j, ΔVij, was determined as
\[
\Delta V_{ij} =
\begin{cases}
\gamma_{ij} \cdot A_{ij}, & j = 1 \\
\gamma_{ij} \cdot \left( A_{ij} - \sum_{k=1}^{j-1} \Delta V_{ik} \right), & j > 1
\end{cases}
\]
In psychological terms, γij can be interpreted as stimulus-reward associability (Dayan and Abbott, 2001) and moreover as a discount factor determining the extent to which rewards that arrive earlier are more important for learning than rewards that arrive later (Niv and Schoenbaum, 2008; O'Doherty et al., 2003). Considering that knowledge of the stimulus-reward associability changes as a function of experience, γij is updated on a trial-by-trial basis according to the previous stimulus-reward correlation. Thus, γij takes into account both the position of each trial j in the course of the learning process as well as the condition-specific predictability. Given that the correlation between geometrical figure and numeric value was not known a priori, we set γi,1 = 0.5 for each condition i. According to O'Doherty and colleagues (2003), a maximum value of γij = 0.9 was chosen. In case of a correct guess and monetary gain, Aij of a trial takes the value 1, for non-reinforcement trials (i.e. incorrect guess and jP −1 monetary loss), Aij was coded by 0. ΔVik illustrates the expected k=1 jP −1 reward in each trial j; as indicated by , therefore associative k=1
strengths of all j-1 previous trials are summed up (with an initial jP −1 value of 0). Applying the associative strengths ΔVik the learning k=1
progress in each probability condition was calculated for all subjects and averaged across each group (Fig. 1). The learning rate (LR), modeled to assess the individual learning capability under stable learning conditions (i.e. 100% condition), was then calculated as follows: LRi = 1 −
16 1 1 X 4 V − ΔVij 16 j = 1 ij = max
where Vij/max reflects the expected reward in case of optimal learning performance. Thus, higher LR values stand for better learning. A two-sample t-test served to compare the individual learning rates¹ between the groups.

¹ Given 16 trials for each stimulus category, 1 − (1/16)·Σj (Vij/max − ΔVij) gives the mean LR for each condition.
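As a worked illustration of the two formulas above, the Python sketch below (not the authors' code) computes the trial-wise change in associative strength and the learning rate for one stimulus category. The exact trial-by-trial update of γij and the precise definition of Vij/max are not fully specified in the text, so the running-proportion update and the ideal-learner trajectory used here are explicit assumptions.

```python
import numpy as np


def associative_strengths(outcomes, gamma_init=0.5, gamma_max=0.9):
    """Trial-wise change in associative strength, Delta V_ij, following the
    cases equation above. `outcomes` is a sequence of A_ij values (1 = correct
    guess/reward, 0 = incorrect guess/loss). The exact trial-by-trial update of
    the associability gamma_ij is not fully specified in the text; here it is
    approximated (assumption) by the running proportion of rewarded trials,
    bounded between gamma_init and gamma_max."""
    outcomes = np.asarray(outcomes, dtype=float)
    delta_v = np.zeros_like(outcomes)
    gamma = gamma_init
    for j, a in enumerate(outcomes):
        expected = delta_v[:j].sum()            # sum_{k=1}^{j-1} Delta V_ik (0 on trial 1)
        delta_v[j] = gamma * a if j == 0 else gamma * (a - expected)
        # assumed associability update: running stimulus-reward proportion, clipped
        gamma = min(gamma_max, max(gamma_init, outcomes[: j + 1].mean()))
    return delta_v


def learning_rate(delta_v, v_max):
    """LR_i = 1 - (1/16) * sum_j (V_ij/max - Delta V_ij); higher values = better learning."""
    return 1.0 - np.mean(np.asarray(v_max, float) - np.asarray(delta_v, float))


# Example: a hypothetical subject who responds correctly on 14 of 16 trials of the
# 100% condition, compared against an ideal learner's trajectory (assumed V_ij/max).
outcomes = [1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
ideal = associative_strengths([1] * 16)
print(round(learning_rate(associative_strengths(outcomes), ideal), 3))
```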
Fig. 1. Average change in associative strength (a: controls; b: patients) across each probability condition.
A repeated measures ANOVA with associative strength (i.e. the associative strength values across the 16 trials) as within-subject factor and group (patients, controls) as between-subject factor served to test for group differences across the learning process under stable learning conditions.

fMRI data
Preprocessing and statistical analysis of the fMRI data were performed using SPM5 (http://www.fil.ion.ucl.ac.uk/spm). Functional data were corrected for differences in acquisition time by sinc interpolation, realigned to the first image of every session and linearly and non-linearly normalized to the Montreal Neurological Institute (MNI, Montreal, Canada) reference brain. Data were spatially smoothed with a Gaussian kernel (8 mm full-width at half-maximum) and high-pass filtered with a 128-s cutoff. All data were inspected for movement artefacts. Subjects with movement parameters exceeding 3 mm translation on the x-, y- or z-axis or 3° rotation were excluded. In addition, individual movement parameters entered the analyses as covariates of no interest.
Brain activations were then analyzed voxel-wise to calculate statistical parametric maps of t-statistics for each condition of the task described above: the 50% probability condition (i.e. activation during responding to triangles and pentagons), the 81% probability condition (i.e. activation during responding to circles and squares), the 100% probability condition (i.e. activation during responding to half-moons and crosses) and the feedback (reward or punishment) condition (i.e. activation during presentation of the correct or incorrect solution and the indication of monetary win or loss, modeled by separate regressors). In addition, probability-specific feedback regressors (i.e. reward or punishment in association with the 50%, 81% and 100% probability conditions) were modeled separately to subsequently enter the analysis of probability-specific feedback. BOLD signal changes for the different conditions were modeled by variable-length boxcar functions convolved with a canonical hemodynamic response function (HRF). These HRF-convolved functions were then used as individual regressors within the general linear model (GLM). A fixed-effects model at the single-subject level was used to create images of parameter estimates.
A two-way non-sphericity corrected repeated measures ANOVA with the three probability conditions (50%, 81%, 100%) as within-subject factor and group (patients, controls) as between-subject factor was performed at the second level. Here, the probability-related activation changes were analyzed in association with both linearly increasing (i.e. 100% > 81% > 50%) and linearly decreasing (i.e. 100% < 81% < 50%) predictability. In addition, a regression analysis with the individual learning rate (LR) as covariate and group (patients, controls) as between-subject factor was performed to explore potential group differences with regard to the relationship between individual trial learning and brain activation under optimal learning conditions.
Finally, potential reinforcement-related activation abnormalities in dependence on reward expectation were explored. This was done by performing another two-way non-sphericity corrected repeated measures ANOVA with feedback (reward/punishment under the 50%, 81% and 100% probability conditions) as within-subject factor and group (patients, controls) as between-subject factor. This analysis served to reveal potential activation deficits in patients in association with positive and/or compared to negative prediction errors (i.e. unexpected reward relative to omission of expected reward). The positive PE was modeled by the linear decrease in positive feedback in relation to predictability (i.e. 50% > 81% > 100%) (Abler et al., 2006). This should reveal areas showing the strongest activation in response to unexpected reward. The negative PE was modeled by the linear increase in negative feedback in relation to predictability (i.e. 50% < 81% < 100%). This should reveal areas with the strongest activation in response to worse-than-expected (negative) feedback (i.e. negative feedback in the 100% condition). Because one healthy control subject produced only correct responses in the 100% condition, this subject had to be excluded from this analysis. All analyses were corrected for age and education.
Finally, we investigated a potential effect of atypical antipsychotic medication on brain activation. Medication was converted to chlorpromazine (CPZ) equivalent dosage (Atkins et al., 1997; Woods, 2003) and correlated with feedback- and probability-related activation. Within-group analyses were based on an FDR-corrected significance level of p < 0.01. Group comparisons were thresholded at p < 0.001 uncorrected. The number of expected voxels per cluster was taken as a spatial extent threshold. Coordinates were transformed into Talairach coordinates using the algorithm proposed by Brett et al. (2002). We used the Talairach Daemon (www.talairach.org) and the anatomical atlas by Mai et al. (2004) for anatomical labeling.
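The following NumPy/SciPy sketch illustrates, in simplified form, the two ingredients described above: an HRF-convolved boxcar regressor for one condition, and the linear contrast weights over the probability-specific feedback regressors used to model the positive and negative prediction errors. This is not the authors' SPM5 implementation; the double-gamma HRF approximation, the onset times and the scaling are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import gamma

TR = 2.04  # repetition time in seconds (2040 ms, as reported above)


def canonical_hrf(tr, duration=32.0):
    """Simple double-gamma approximation of a canonical HRF, sampled at the TR."""
    t = np.arange(0.0, duration, tr)
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0


def boxcar_regressor(onsets, durations, n_scans, tr):
    """Boxcar over the given onsets/durations (in seconds), convolved with the HRF."""
    box = np.zeros(n_scans)
    for onset, dur in zip(onsets, durations):
        start = int(round(onset / tr))
        stop = int(round((onset + dur) / tr)) + 1
        box[start:stop] = 1.0
    reg = np.convolve(box, canonical_hrf(tr))[:n_scans]
    return reg / reg.max()  # scaling for readability only


# Usage example with assumed onsets; 962 scans = 965 acquired minus 3 discarded volumes.
feedback_50 = boxcar_regressor(onsets=[10.5, 27.0], durations=[2.5, 2.5],
                               n_scans=962, tr=TR)

# Linear contrast weights over the probability-specific feedback regressors,
# ordered (50%, 81%, 100%): the positive PE is modeled as positive feedback
# decreasing with predictability, the negative PE as negative feedback increasing.
positive_pe_contrast = np.array([1.0, 0.0, -1.0])
negative_pe_contrast = np.array([-1.0, 0.0, 1.0])
```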
Table 1
Talairach coordinates of activation maxima (SPM{T} value) for decreasing predictability (controls: p < 0.01 FDR corrected; controls vs. patients: p < 0.001 uncorrected).

Region of activation        Side   BA    x     y     z     T      k
Controls
  Middle frontal gyrus      r      9     40    34    22    6.03   226
  Middle frontal gyrus      r      10    32    52    −3    5.54   64
  Inferior parietal lobe    r      40    54    −50   41    4.53   14
Controls vs. patients
  Anterior cingulate        l      32    −16   37    −2    4.47   54
  Middle frontal gyrus      r      9     34    32    21    4.26   92
  Anterior cingulate        r      32    20    38    15    3.84   68
Results

Behavioral data
The two-way repeated measures ANOVA on the percentage of correct responses yielded a significant main effect of probability condition (F(2,74) = 52.52, p < 0.001), a significant main effect of group (F(1,37) = 24.24, p < 0.001) and a significant probability-by-group interaction (F(2,74) = 6.82, p < 0.002). Planned post hoc t-tests revealed no significant difference in the percentage of correct reactions between the groups for the 50% condition (t(37) = 0.99, n.s.) but a significant difference for the 81% condition (t(37) = 4.18, p < 0.001) as well as for the 100% condition (t(37) = 4.21, p < 0.001).
Fig. 1 shows the modeled learning process based on the individual associative strengths $\sum_{k=1}^{j-1} \Delta V_{ik}$ of each trial, averaged across each group for all probability conditions. As can be seen, patients' ability to learn stimulus contingencies was diminished predominantly in the 81% and 100% conditions. The assessment of individual trial learning under stable learning conditions yielded a mean learning rate of 0.67 ± 0.17 in patients and 0.88 ± 0.14 in controls (Fig. 2). The independent two-sample t-test testing for differences in the individual learning rates between the groups yielded a significant effect (t(37) = 4.13, p < 0.001). The ANOVA testing for group differences in associative strength across the learning process under stable learning conditions yielded a main effect of group (F(1,37) = 17.07, p < 0.001), a main effect of associative strength (F(15,37) = 34.30, p < 0.001) and a significant interaction (F(15,37) = 2.61, p = 0.001), indicating lower overall performance as well as worse learning performance across time in patients compared to controls.
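For readers who want to reproduce this type of behavioral analysis, the sketch below sets up the 3 × 2 mixed-design ANOVA and the planned per-condition group comparisons. It is hypothetical: the pingouin/scipy calls, column names and placeholder data are assumptions for illustration and not the authors' pipeline.

```python
import numpy as np
import pandas as pd
import pingouin as pg
from scipy import stats

# One row per subject and probability condition, holding the percentage of
# correct responses; the accuracy values here are random placeholders.
df = pd.DataFrame({
    "subject":   np.repeat(np.arange(39), 3),
    "group":     np.repeat(["patient"] * 19 + ["control"] * 20, 3),
    "condition": np.tile(["50%", "81%", "100%"], 39),
    "accuracy":  np.random.default_rng(0).uniform(40, 100, 39 * 3),
})

# 3 (condition, within-subject) x 2 (group, between-subject) mixed-design ANOVA
anova = pg.mixed_anova(data=df, dv="accuracy", within="condition",
                       between="group", subject="subject")
print(anova[["Source", "F", "p-unc"]])

# Planned post hoc comparisons: independent two-sample t-test per condition
for cond, sub in df.groupby("condition"):
    pat = sub.loc[sub.group == "patient", "accuracy"]
    con = sub.loc[sub.group == "control", "accuracy"]
    t, p = stats.ttest_ind(pat, con)
    print(cond, round(t, 2), round(p, 4))
```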
Fig. 2. Distribution of individual learning rates and their means (±SD) in both groups.

Table 2
Talairach coordinates of activation maxima (SPM{T} value) for the regression with individual learning rate (patients vs. controls: p < 0.001 uncorrected).

Region of activation        Side   BA     x     y     z     T      k
Patients vs. controls
  Superior frontal gyrus    r      6      8     28    52    5.08   32
  Superior frontal gyrus    r      6      8     41    44    4.73   164
  Middle frontal gyrus      l      6      −20   11    57    4.94   56
  Middle frontal gyrus      l      8/9    −26   29    39    5.00   72
  Middle frontal gyrus      r      6      32    12    51    4.39   98
  Thalamus                  l      –      −2    −11   8     3.59   15
Controls vs. patients
  Occipital lobe            r      18     24    −72   −10   4.57   58
  Cerebellum                r      –      36    −47   −16   3.90   22
fMRI data
The two-way non-sphericity corrected ANOVA investigating neural activation in association with increasing and decreasing predictability yielded significant effects in healthy controls in association with decreasing predictability (or, in other terms, increasing uncertainty) in the right dorsolateral prefrontal cortex (BA 9), the right fronto-polar cortex (BA 10) and the right inferior parietal lobe (Table 1). Patients showed no significant activation increases in association with decreasing predictability. The direct group comparison revealed significantly stronger activation in controls compared to patients in association with decreasing predictability in the right dorsolateral prefrontal cortex (BA 9) and the anterior cingulate (BA 32) (Table 1). There were no significant results in association with linearly increasing predictability.
The regression analysis exploring potential group differences with regard to the relationship between trial learning efficiency and brain activation yielded significantly increased activation in patients compared to controls in middle and superior frontal regions (BA 6, BA 8/9) and the left thalamus (Table 2). As illustrated in Fig. 3, these effects were based on a negative relationship between trial learning efficiency and brain activation in controls and a positive relationship in patients.
Fig. 3. Illustration of the relationship between individual learning rate and parameter estimates for BA 8/9, BA 6 and the thalamus, which showed significant effects in the group comparison at p < 0.001 uncorrected (see Table 2). Negative correlations were detectable in controls, while patients exhibited a positive relationship between individual learning rate and activation.
Table 3
Talairach coordinates of activation maxima (SPM{T} value) for activation differences between the groups for positive, negative and positive compared to negative prediction error (p < 0.001 uncorrected).

Region of activation        Side   BA    x     y     z     T      k
Positive PE, controls vs. patients
  Anterior cingulate        r      32    12    44    18    4.98   97
  Putamen                   l      –     −20   8     13    4.31   50
  Medial frontal gyrus      r      10    6     46    −7    3.92   33
  Middle temporal gyrus     r      21    58    −37   2     3.58   19
Negative PE, controls vs. patients
  Hippocampus               r      –     36    −31   −5    4.30   32
  Insula                    r      –     38    −3    20    3.67   20
PE (positive–negative), controls vs. patients
  Anterior cingulate        r      32    12    44    18    4.18   40
  Putamen                   l      –     −22   10    12    3.79   27
  Superior frontal gyrus    l      9     −26   44    29    3.67   47
These correlations were significant for BA 6 (controls: r = −0.67, p < 0.001; patients: r = 0.43, n.s.) and BA 8/9 (controls: r = −0.74, p < 0.001; patients: r = 0.46, p < 0.05) in controls, and for the thalamus (controls: r = −0.56, p < 0.01; patients: r = 0.53, p < 0.02) in both groups (p < 0.05, corrected for multiple comparisons by FDR). The investigation of stronger activation in controls compared to patients for the regression with individual learning rate revealed significantly increased activation in controls relative to patients in the right cerebellum and the right occipital lobe (Table 2). Here, correlations between individual learning parameters and brain activation were only significant in controls (cerebellum: controls: r = 0.62, p = 0.003; patients: r = −0.43, n.s.; occipital lobe: controls: r = 0.63, p = 0.003; patients: r = −0.39, n.s.).
The analysis of activation abnormalities in association with the positive PE yielded a significant relative hypoactivation in patients in predominantly frontal regions (BA 32, 10) and the putamen. The analysis of activation abnormalities in association with the negative PE yielded a significant relative hypoactivation in patients in the hippocampus and the insula (Table 3). There were no relative hyperactivations in patients for either positive or negative PE. The investigation of positive compared to negative prediction error yielded a significant hypoactivation in patients relative to controls in the left superior frontal gyrus (BA 9), the dorsal cingulate (BA 32) and the lateral part of the left dorsal striatum/putamen (Table 3, Fig. 4). The opposite contrast, testing for hyperactivations in patients relative to controls, yielded no significant results.
There were no significant correlations, either positive or negative, between CPZ equivalents and activation in association with positive or negative feedback (p < 0.01 FDR corrected). There were also no significant correlations between CPZ equivalents and probability-related activation (in terms of predictability-related increases or decreases, or for each probability condition separately; p < 0.01 FDR corrected).

Discussion

The present findings indicate that reward-related probabilistic trial-and-error learning is significantly impaired in patients with schizophrenia. They moreover show that this impairment goes along with altered activation patterns in mainly frontal, cingulate and striatal regions. There were no significant performance differences between the groups under conditions which allowed no learning (i.e. the 50% condition). However, patients produced significantly fewer correct responses in association with moderate uncertainty as well as when stimulus contingencies were fully predictable. Moreover, the learning rate employed to assess individual learning capability under optimal conditions in a stable learning environment showed the ability to learn predictable stimulus contingencies to be significantly impaired in patients relative to healthy controls. As Fig. 2 indicates, variance within the patient group was comparatively large; only few patients (about 20%) showed a learning performance comparable to the average performance of healthy controls. The present findings are in line with the concept that, in a majority of schizophrenia patients, impairments emerge when feedback has to be used on a trial-by-trial basis to guide response selection (Gold et al., 2008). An increasing number of studies have demonstrated reinforcement learning on a trial-by-trial basis to be impaired in patients (Morris et al., 2008; Premkumar et al., 2008; Waltz et al., 2007), although some studies showed unimpaired learning (Keri et al., 2005; Weickert et al., 2002). Likewise, in the study by Murray et al. (2008), patients exhibited no learning impairments. The task administered by Murray and colleagues was, however, conceptually different and somewhat easier than the present one. Moreover, only first-episode patients were included, who might have been better able to compensate for the reported deficits in neural activation.
As Fig. 1 illustrates, healthy controls performed according to expectation under all predictability conditions. Fig. 2 moreover shows that intra-group variance regarding learning rates was smaller in controls compared to patients. Thus, in controls the majority of participants showed learning parameters between 0.8 and 1. On the cerebral level, this adequate performance in controls was linked to significant dorsolateral prefrontal, fronto-polar and inferior parietal activation decreases in association with increasing predictability. Thus, in line with previous studies (Fiorillo et al., 2003; Koch et al., 2008; Schlösser et al., 2009; Volz et al., 2003), the present findings converge on the idea that increased predictability enables the brain to reduce resources. In contrast, decisions under high task difficulty or uncertainty force the brain to uphold activation in areas shown to be relevant for processes like decision making, performance monitoring and cognitive control (Koch et al., 2006, 2007, 2008; Volz et al., 2003; Zysset et al., 2006). The findings suggest that both the ability to increase processing resources with increasing uncertainty and the ability to reduce resources under predictable and stable environmental conditions might characterize a "good learner". The negative correlation between individual learning rate and frontal signal intensity in healthy controls (as shown in Fig. 3) corroborates this presumption. It attributes to "bad learners" a decreased ability to reduce processing resources despite maximum predictability. In contrast, patients lacked these activation decreases with increasing predictability. The direct group comparison yielded significantly reduced activation decreases in patients compared to healthy volunteers in predominantly right dorsolateral prefrontal regions and the dorsal cingulate. Hence, the results are in line with our hypothesis and indicate that, in patients, relative activation increases prevail in areas thought to be involved in processes like cognitive control and error monitoring (Petrides, 2005; van Veen and Carter, 2002).
Interestingly, in patients there was a positive correlation between individual learning rate and signal intensity in a fronto-thalamic network. Moreover, the direct group comparison of the learning-rate-related activation revealed a significantly stronger signal in this network in patients compared to controls. This suggests that patients who showed a moderate learning performance sustained a (presumably compensatory) increased activation. This increased activation was detectable in networks known to be relevant for processes demanding a high amount of attention and cognitive control (BA 8/9, thalamus) (Wager and Smith, 2003) and in areas found to be activated during probabilistic cue-outcome feedback learning (BA 6) (Volz et al., 2003). Patients with a poor learning performance not only lacked any presumed compensatory activation increases but even displayed a tendency to deactivate this fronto-thalamic network (Fig. 3).
Fig. 4. Significantly stronger activation in the left striatum (putamen), dorsal anterior cingulate (BA 32) and dorsolateral prefrontal cortex (BA 9) for positive versus negative prediction error in controls compared to patients at p < 0.001 uncorrected. Parameter estimates show increased activation in association with the positive prediction error and suppressed activation in association with the negative prediction error in controls, with the opposite relation in patients.
Relative increases in deactivation have repeatedly been reported in patients in association with different cognitive processes (Harrison et al., 2007; Mannell et al., in press), although their underlying mechanisms remain to be elucidated. The present results suggest that the concept of cortical inefficiency, as originally proposed by Callicott et al. (2000) and formalized by Manoach (2003), might be transferable to the context of individual learning performance.
Despite this distinct evidence pointing to neurophysiological deficits in patients in association with reward-based learning, the underlying mechanisms remain open to discussion. First, an impaired processing of reinforcement or reward (Juckel et al., 2006) may prevent patients from learning from reinforcement. Second, the processing of reinforcement or reward may be intact, but the ability to integrate this reinforcement-related information into subsequent decision making may be deficient (Heerey et al., 2008). In the present study, the processing of reinforcement was associated with decreased fronto-striatal activation in patients. This activation abnormality, which affected the dorsal striatum/putamen, the dorsal cingulate and the medial frontal cortex, clearly speaks against the second assumption.
The striatum and the dorsal cingulate are known to constitute central components of the nigrostriatal and mesolimbic DA system (Van den Heuvel and Pasterkamp, 2008). The putamen forms the lateral part of the dorsal striatum. It is particularly relevant in the context of reinforcement processing, as it receives direct dopaminergic efferents from the substantia nigra (Moore, 2003). It has been shown to be involved in reinforcement processing per se (Bischoff-Grethe et al., 2009; Ino et al., 2009), in signaling discrepancies between reward expectations and outcomes (Tobler et al., 2006), as well as in reward-based learning processes (Haruno and Kawato, 2006; Wachter et al., 2009). Primate studies indicate that these reward-based learning processes might be facilitated by phasic DA releases and accompanying activation increases in striatal and medial frontal areas during and shortly after an unexpected positive reinforcement, as these appear to induce synaptic plasticity (Schultz et al., 1993; Schultz and Romo, 1990). The lack of activation in these regions observed in patients might therefore be one cause, or even the main cause, of the patients' inability to learn from the given feedback.
Interestingly, as the parameter estimates in Fig. 4 illustrate, healthy control subjects showed both increased fronto-striatal activation in association with positive PE processing and suppressed fronto-striatal activation in association with the negative PE. Both findings are in accordance with preceding studies on healthy subjects (Abler et al., 2005; McClure et al., 2003) and results from primate studies (Schultz, 2002). Patients, in contrast, not only lacked these relations but showed exactly the opposite pattern. In correspondence with our results, Waltz et al. (2009) revealed very similar activation abnormalities in patients when investigating the neural responses to both predictable and unpredictable primary reinforcers. In accordance with our procedure, they compared the neural correlates of the positive PE with those of the negative PE. In their study, the positive PE was implemented by analyzing reinforcement not delivered as expected; this corresponds to negative feedback in the 100% condition in our study. The negative PE was implemented by analyzing reinforcement delivered but not expected, corresponding, in our study, to positive feedback in the 50% condition. Similar to our results, Waltz and colleagues found significant hypoactivations in patients relative to controls in a more inferior frontal area and the left putamen. Hence, the present as well as preceding findings imply that the reduced activation in the putamen in response to expectancy-related rewards might be of major psychopathological relevance.
A number of studies in healthy subjects have demonstrated the dorsal cingulate to be involved predominantly in the coding of reward value (Magno et al., 2008; Rogers et al., 2004). The reduced dACC activation which we found in patients compared to controls in the analysis of PE-related processing suggests that this reward value sensitivity is impaired in patients. Thus, we assume that the lack of dorsal striatal as well as dorsal cingulate activation might constitute the basis of the patients' inability to learn from reinforcement. Somewhat contradictory to this presumption, recent results by Weickert and colleagues (2009) showed that fronto-striatal abnormalities might be present in patients even in the face of unimpaired learning performance.
The task applied by Weickert and colleagues was a feedback-based probabilistic learning task without monetary reward. Their findings nevertheless indicate that impaired learning might not be a necessary consequence of these fronto-striatal activation alterations as long as the alterations are compensated for. In their study, Weickert and colleagues found compensatory activation increases in patients in dorsolateral prefrontal, cingulate, parahippocampal and parietal regions.
Finally, it should be mentioned that the caudate has repeatedly been found to be relevant in the context of probabilistic feedback learning (Poldrack et al., 1999, 2001; Weickert et al., 2009). Our finding of altered activation in the putamen in patients, together with the lack of an activation abnormality in the caudate, is therefore somewhat surprising. It indicates that caudate alterations might be psychopathologically relevant in the context of probabilistic feedback learning without a reward component (Weickert et al., 2009), whereas putamen dysfunctions might become manifest when learning is associated with a rewarding consequence.

Limitations and concluding remarks

The present study revealed significant deficits in reinforcement learning in patients with schizophrenia. The already mentioned findings reporting unimpaired reward-related learning in patients with a first episode of schizophrenia (Murray et al., 2008), however, indicate that our results are not necessarily transferable to first-episode patients. In more chronic patients, progressive neurodegenerative processes might underlie these deficits. In concordance with this assumption, central components of the brain reward system have been reported to be affected by progressive neurodegenerative changes in schizophrenia (Tamagaki et al., 2005; Wang et al., 2008). Future studies in chronic patients should therefore investigate a potential influence of structural alterations within central dopaminergic networks on the ability to learn on the basis of feedback and reward.
Although somewhat speculative, some clinical implications might be derivable from our findings as well. The present findings imply that, in a clinical or therapeutic context, promoting intrinsic motivation might be significantly more effective than attempting to motivate by external remuneration. Especially in chronic patients, external remuneration might not constitute the proper strategy to activate patients' internal reward system. This conclusion might have implications for the application and further development of cognitive behavioral and psychoeducative treatment strategies in schizophrenia aiming at intrinsic motivational aspects.
As a limitation of the study, the groups were not matched for performance, which limits the explanatory power of the data to some degree. Thus, it cannot be excluded that the greater fronto-striatal activity during negative feedback in the 100% condition in patients relative to healthy controls is linked to the relatively higher number of errors in the patient group. Moreover, medication has to be mentioned as another factor which constrains the explanatory power of the present study to some extent. Apart from one patient, all patients received atypical antipsychotic medication, which is known to exert DA-antagonistic effects and to acutely affect striato-cingulate rCBF (Lahti et al., 2003, 2005; Meisenzahl et al., 2008). However, there is increasing literature showing that reward-related MDS activation in ventral striatal regions is unaffected by atypical antipsychotic treatment (Juckel et al., 2006; Schlagenhauf et al., 2008). The lack of a correlation between chlorpromazine equivalents and activation in the present study is in principle in line with these findings, as is the lack of altered activation in the ventral striatum. Our patients, however, did exhibit altered activation in the dorsal part of the striatum. Thus, in contrast to ventral striatal regions, the dorsal striatum seems to show reduced activation in response to (unexpected) monetary reward in patients despite atypical antipsychotic treatment.
It has to be clearly stated, however, that the design of the present study is not appropriate for drawing well-founded conclusions on medication effects. Further studies designed a priori to investigate medication effects are needed to clarify this issue.
In sum, the present study provides further evidence that the ability to learn on the basis of feedback and reward in a probabilistic environment is impaired in medicated schizophrenia patients. A reduced responsivity of the fronto-striato-cingulate system to positive reinforcement and monetary gain might constitute the basis of this impairment.
Acknowledgment

This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft [KO 3744/1-1 to K.K.]).

References

Abi-Dargham, A., 2004. Do we still believe in the dopamine hypothesis? New data bring new evidence. Int. J. Neuropsychopharmacol. 7 (Suppl. 1), S1–S5.
Abi-Dargham, A., Mawlawi, O., Lombardo, I., Gil, R., Martinez, D., Huang, Y., Hwang, D.R., Keilp, J., Kochan, L., Van Heertum, R., Gorman, J.M., Laruelle, M., 2002. Prefrontal dopamine D1 receptors and working memory in schizophrenia. J. Neurosci. 22, 3708–3719.
Abler, B., Walter, H., Erk, S., 2005. Neural correlates of frustration. Neuroreport 16, 669–672.
Abler, B., Walter, H., Erk, S., Kammerer, H., Spitzer, M., 2006. Prediction error as a linear function of reward probability is coded in human nucleus accumbens. Neuroimage 31, 790–795.
Annett, M., 1967. The binomial distribution of right, mixed and left handedness. Q. J. Exp. Psychol. 19, 327–333.
Atkins, M., Burgess, A., Bottomley, C., Riccio, M., 1997. Chlorpromazine equivalents: a consensus of opinion for both clinical and research applications. Psychiatr. Bull. 21, 224–226.
Bischoff-Grethe, A., Hazeltine, E., Bergren, L., Ivry, R.B., Grafton, S.T., 2009. The influence of feedback valence in associative learning. Neuroimage 44, 243–251.
Brett, M., Johnsrude, I.S., Owen, A.M., 2002. The problem of functional localization in the human brain. Nat. Rev. Neurosci. 3, 243–249.
Callicott, J.H., Bertolino, A., Mattay, V.S., Langheim, F.J., Duyn, J., Coppola, R., Goldberg, T.E., Weinberger, D.R., 2000. Physiological dysfunction of the dorsolateral prefrontal cortex in schizophrenia revisited. Cereb. Cortex 10, 1078–1092.
Dayan, P., Abbott, L.F., 2001. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge, MA.
Fiorillo, C.D., Tobler, P.N., Schultz, W., 2003. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902.
First, M.B., Spitzer, R.L., Williams, J.B.W., Gibbon, M., 1996. Structured Clinical Interview for DSM-IV (SCID). APA Press, Washington.
Gold, J.M., Waltz, J.A., Prentice, K.J., Morris, S.E., Heerey, E.A., 2008. Reward processing in schizophrenia: a deficit in the representation of value. Schizophr. Bull. 34, 835–847.
Goldman-Rakic, P.S., Castner, S.A., Svensson, T.H., Siever, L.J., Williams, G.V., 2004. Targeting the dopamine D1 receptor in schizophrenia: insights for cognitive dysfunction. Psychopharmacology (Berl.) 174, 3–16.
Harrison, B.J., Yucel, M., Pujol, J., Pantelis, C., 2007. Task-induced deactivation of midline cortical regions in schizophrenia assessed with fMRI. Schizophr. Res. 91, 82–86.
Haruno, M., Kawato, M., 2006. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J. Neurophysiol. 95, 948–959.
Heerey, E.A., Bell-Warren, K.R., Gold, J.M., 2008. Decision-making impairments in the context of intact reward sensitivity in schizophrenia. Biol. Psychiatry 64, 62–69.
Hikosaka, O., Bromberg-Martin, E., Hong, S., Matsumoto, M., 2008. New insights on the subcortical representation of reward. Curr. Opin. Neurobiol. 18, 203–208.
Holcomb, H.H., Rowland, L.M., 2007. How schizophrenia and depression disrupt reward circuitry. Curr. Treat. Options Neurol. 9, 357–362.
Ino, T., Nakai, R., Azuma, T., Kimura, T., Fukuyama, H., 2009. Differential activation of the striatum for decision making and outcomes in a monetary task with gain and loss. Cortex 46, 2–14.
Juckel, G., Schlagenhauf, F., Koslowski, M., Filonov, D., Wustenberg, T., Villringer, A., Knutson, B., Kienast, T., Gallinat, J., Wrase, J., Heinz, A., 2006. Dysfunction of ventral striatal reward prediction in schizophrenic patients treated with typical, not atypical, neuroleptics. Psychopharmacology (Berl.) 187, 222–228.
Kay, S.R., Fiszbein, A., Opler, L.A., 1987. The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr. Bull. 13, 261–276.
Keri, S., Juhasz, A., Rimanoczy, A., Szekeres, G., Kelemen, O., Cimmer, C., Szendi, I., Benedek, G., Janka, Z., 2005. Habit learning and the genetics of the dopamine D3 receptor: evidence from patients with schizophrenia and healthy controls. Behav. Neurosci. 119, 687–693.
Kirsch, P., Ronshausen, S., Mier, D., Gallhofer, B., 2007. The influence of antipsychotic treatment on brain reward system reactivity in schizophrenia patients. Pharmacopsychiatry 40, 196–198.
Koch, K., Wagner, G., von Consbruch, K., Nenadic, I., Schultz, C., Ehle, C., Reichenbach, J., Sauer, H., Schlösser, R., 2006. Temporal changes in neural activation during practice of information retrieval from short-term memory: an fMRI study. Brain Res. 1107, 140–150.
Koch, K., Wagner, G., Nenadic, I., Schachtzabel, C., Roebel, M., Schultz, C., Axer, M., Reichenbach, J.R., Sauer, H., Schlösser, R.G., 2007. Temporal modeling demonstrates preserved overlearning processes in schizophrenia: an fMRI study. Neuroscience 146, 1474–1483.
Koch, K., Schachtzabel, C., Wagner, G., Reichenbach, J.R., Sauer, H., Schlösser, R., 2008. The neural correlates of reward-related trial-and-error learning: an fMRI study with a probabilistic learning task. Learn. Mem. 15, 728–732.
Lahti, A.C., Holcomb, H.H., Weiler, M.A., Medoff, D.R., Tamminga, C.A., 2003. Functional effects of antipsychotic drugs: comparing clozapine with haloperidol. Biol. Psychiatry 53, 601–608.
Lahti, A.C., Weiler, M.A., Medoff, D.R., Tamminga, C.A., Holcomb, H.H., 2005. Functional effects of single dose first- and second-generation antipsychotic administration in subjects with schizophrenia. Psychiatry Res. 139, 19–30.
Lehrl, S., 1989. Mehrfachwahl-Wortschatz-Intelligenztest MWT-B [Multiple-choice vocabulary intelligence test]. Perimed, Erlangen.
Magno, E., Simoes-Franklin, C., Robertson, I.H., Garavan, H., 2008. The role of the dorsal anterior cingulate in evaluating behavior for achieving gains and avoiding losses. J. Cogn. Neurosci. 21, 2328–2342.
Mai, K.J., Assheuer, J., Paxinos, G., 2004. Atlas of the Human Brain. Elsevier, London.
Mannell, M.V., Franco, A.R., Calhoun, V.D., Canive, J.M., Thoma, R.J., Mayer, A.R., in press. Resting state and task-induced deactivation: a methodological comparison in patients with schizophrenia and healthy controls. Hum. Brain Mapp.
Manoach, D.S., 2003. Prefrontal cortex dysfunction during working memory performance in schizophrenia: reconciling discrepant findings. Schizophr. Res. 60, 285–298.
McClure, S.M., Berns, G.S., Montague, P.R., 2003. Temporal prediction errors in a passive learning task activate human striatum. Neuron 38, 339–346.
Meisenzahl, E.M., Schmitt, G.J., Scheuerecker, J., Moller, H.J., 2007. The role of dopamine for the pathophysiology of schizophrenia. Int. Rev. Psychiatry 19, 337–345.
Meisenzahl, E.M., Schmitt, G., Grunder, G., Dresel, S., Frodl, T., la Fougere, C., Scheuerecker, J., Schwarz, M., Boerner, R., Stauss, J., Hahn, K., Moller, H.J., 2008. Striatal D2/D3 receptor occupancy, clinical response and side effects with amisulpride: an iodine-123-iodobenzamide SPET study. Pharmacopsychiatry 41, 169–175.
Moore, R.Y., 2003. Organization of midbrain dopamine systems and the pathophysiology of Parkinson's disease. Parkinsonism Relat. Disord. 9 (Suppl. 2), S65–S71.
Morris, S.E., Heerey, E.A., Gold, J.M., Holroyd, C.B., 2008. Learning-related changes in brain activity following errors and performance feedback in schizophrenia. Schizophr. Res. 99, 274–285.
Murray, G.K., Corlett, P.R., Clark, L., Pessiglione, M., Blackwell, A.D., Honey, G., Jones, P.B., Bullmore, E.T., Robbins, T.W., Fletcher, P.C., 2008. Substantia nigra/ventral tegmental reward prediction error disruption in psychosis. Mol. Psychiatry 13, 267–276.
Niv, Y., Schoenbaum, G., 2008. Dialogues on prediction errors. Trends Cogn. Sci. 12, 265–272.
O'Doherty, J.P., Dayan, P., Friston, K., Critchley, H., Dolan, R.J., 2003. Temporal difference models and reward-related learning in the human brain. Neuron 38, 329–337.
Pagnoni, G., Zink, C.F., Montague, P.R., Berns, G.S., 2002. Activity in human ventral striatum locked to errors of reward prediction. Nat. Neurosci. 5, 97–98.
Petrides, M., 2005. Lateral prefrontal cortex: architectonic and functional organization. Philos. Trans. R. Soc. Lond. B Biol. Sci. 360, 781–795.
Poldrack, R.A., Prabhakaran, V., Seger, C.A., Gabrieli, J.D., 1999. Striatal activation during acquisition of a cognitive skill. Neuropsychology 13, 564–574.
Poldrack, R.A., Clark, J., Pare-Blagoev, E.J., Shohamy, D., Creso Moyano, J., Myers, C., Gluck, M.A., 2001. Interactive memory systems in the human brain. Nature 414, 546–550.
Premkumar, P., Fannon, D., Kuipers, E., Simmons, A., Frangou, S., Kumari, V., 2008. Emotional decision-making and its dissociable components in schizophrenia and schizoaffective disorder: a behavioural and MRI investigation. Neuropsychologia 46, 2002–2012.
Rogers, R.D., Ramnani, N., Mackay, C., Wilson, J.L., Jezzard, P., Carter, C.S., Smith, S.M., 2004. Distinct portions of anterior cingulate cortex and medial prefrontal cortex are activated by reward processing in separable phases of decision-making cognition. Biol. Psychiatry 55, 594–602.
Schlagenhauf, F., Juckel, G., Koslowski, M., Kahnt, T., Knutson, B., Dembler, T., Kienast, T., Gallinat, J., Wrase, J., Heinz, A., 2008. Reward system activation in schizophrenic patients switched from typical neuroleptics to olanzapine. Psychopharmacology (Berl.) 196, 673–684.
Schlösser, R.G., Nenadic, I., Wagner, G., Zysset, S., Koch, K., Sauer, H., 2009. Dopaminergic modulation of brain systems subserving decision making under uncertainty: a study with fMRI and methylphenidate challenge. Synapse 63, 429–442.
Schultz, W., 2000. Multiple reward signals in the brain. Nat. Rev. Neurosci. 1, 199–207.
Schultz, W., 2002. Getting formal with dopamine and reward. Neuron 36, 241–263.
Schultz, W., 2006. Behavioral theories and the neurophysiology of reward. Annu. Rev. Psychol. 57, 87–115.
Schultz, W., Romo, R., 1990. Dopamine neurons of the monkey midbrain: contingencies of responses to stimuli eliciting immediate behavioral reactions. J. Neurophysiol. 63, 607–624.
Schultz, W., Apicella, P., Ljungberg, T., 1993. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci. 13, 900–913.
Shepard, P.D., Holcomb, H.H., Gold, J.M., 2006. Schizophrenia in translation: the presence of absence: habenular regulation of dopamine neurons and the encoding of negative outcomes. Schizophr. Bull. 32, 417–421.
Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning: An Introduction. MIT Press.
Tamagaki, C., Sedvall, G.C., Jonsson, E.G., Okugawa, G., Hall, H., Pauli, S., Agartz, I., 2005. Altered white matter/gray matter proportions in the striatum of patients with schizophrenia: a volumetric MRI study. Am. J. Psychiatry 162, 2315–2321.
Tobler, P.N., O'Doherty, J.P., Dolan, R.J., Schultz, W., 2006. Human neural learning depends on reward prediction errors in the blocking paradigm. J. Neurophysiol. 95, 301–310.
Van den Heuvel, D.M., Pasterkamp, R.J., 2008. Getting connected in the dopamine system. Prog. Neurobiol. 85, 75–93.
van Veen, V., Carter, C.S., 2002. The anterior cingulate as a conflict monitor: fMRI and ERP studies. Physiol. Behav. 77, 477–482.
Volz, K.G., Schubotz, R.I., von Cramon, D.Y., 2003. Predicting events of varying probability: uncertainty investigated by fMRI. Neuroimage 19, 271–280.
Wachter, T., Lungu, O.V., Liu, T., Willingham, D.T., Ashe, J., 2009. Differential effect of reward and punishment on procedural learning. J. Neurosci. 29, 436–443.
Wager, T.D., Smith, E.E., 2003. Neuroimaging studies of working memory: a meta-analysis. Cogn. Affect. Behav. Neurosci. 3, 255–274.
Waltz, J.A., Frank, M.J., Robinson, B.M., Gold, J.M., 2007. Selective reinforcement learning deficits in schizophrenia support predictions from computational models of striatal-cortical dysfunction. Biol. Psychiatry 62, 756–764.
Waltz, J.A., Schweitzer, J.B., Gold, J.M., Kurup, P.K., Ross, T.J., Salmeron, B.J., Rose, E.J., McClure, S.M., Stein, E.A., 2009. Patients with schizophrenia have a reduced neural response to both unpredictable and predictable primary reinforcers. Neuropsychopharmacology 34, 1567–1577.
Wang, L., Mamah, D., Harms, M.P., Karnik, M., Price, J.L., Gado, M.H., Thompson, P.A., Barch, D.M., Miller, M.I., Csernansky, J.G., 2008. Progressive deformation of deep brain nuclei and hippocampal-amygdala formation in schizophrenia. Biol. Psychiatry 64, 1060–1068.
Weickert, T.W., Terrazas, A., Bigelow, L.B., Malley, J.D., Hyde, T., Egan, M.F., Weinberger, D.R., Goldberg, T.E., 2002. Habit and skill learning in schizophrenia: evidence of normal striatal processing with abnormal cortical input. Learn. Mem. 9, 430–442.
Weickert, T.W., Goldberg, T.E., Callicott, J.H., Chen, Q., Apud, J.A., Das, S., Zoltick, B.J., Egan, M.F., Meeter, M., Myers, C., Gluck, M.A., Weinberger, D.R., Mattay, V.S., 2009. Neural correlates of probabilistic category learning in patients with schizophrenia. J. Neurosci. 29, 1244–1254.
Woods, S.W., 2003. Chlorpromazine equivalent doses for the newer atypical antipsychotics. J. Clin. Psychiatry 64, 663–667.
Zysset, S., Wendt, C.S., Volz, K.G., Neumann, J., Huber, O., von Cramon, D.Y., 2006. The neural implementation of multi-attribute decision making: a parametric fMRI study with human subjects. Neuroimage 31, 1380–1388.