The role of serotonin in reward, punishment and behavioural inhibition in humans: Insights from studies with acute tryptophan depletion



Neuroscience and Biobehavioral Reviews xxx (2014) xxx–xxx


Neuroscience and Biobehavioral Reviews journal homepage: www.elsevier.com/locate/neubiorev

Review

Paul Faulkner a, J.F. William Deakin b,∗

a The Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, 760 Westwood Plaza, Los Angeles, CA 90024, USA
b Neuroscience and Psychiatry Unit, G.907 Stopford Building, Oxford Road, Manchester M13 9PT, UK

Article info

Article history: Received 11 February 2014; received in revised form 19 June 2014; accepted 28 July 2014; available online xxx

Keywords: 5-HT; Serotonin; Raphe nucleus; Tryptophan depletion; Reward; Punishment; Behavioural inhibition; Aversion; Learning

Abstract

Deakin and Graeff proposed that forebrain 5-hydroxytryptamine (5-HT) projections are activated by aversive events and mediate anticipatory coping responses including avoidance learning and suppression of the fight-flight escape/panic response. Other theories proposed 5-HT mediates aspects of behavioural inhibition or reward. Most of the evidence comes from rodent studies. We review 36 experimental studies in humans in which the technique of acute tryptophan depletion (ATD) was used to explicitly address the role of 5-HT in response inhibition, punishment and reward. ATD did not cause disinhibition of responding in the absence of rewards or punishments (9 studies). A major role for 5-HT in reward processing is unlikely but further tests are warranted by some ATD findings. Remarkably, ATD lessened the ability of punishments (losing points or notional money) to restrain behaviour without affecting reward processing in 7 studies. Two of these studies strongly indicate that ATD blocks 5-HT mediated aversively conditioned Pavlovian inhibition and this can explain a number of the behavioural effects of ATD. © 2014 Published by Elsevier Ltd.

Contents

1. Introduction
  1.1. Overview
  1.2. Panic disorder in the context of 5-HT and punishment
2. Methods
  2.1. Publication selection criteria
  2.2. Studies
3. Results and discussion
  3.1. Overview and controversies
  3.2. Action inhibition and reversal learning without reinforcement (Table 1)
    3.2.1. Stop-signal tasks (SSTs)
    3.2.2. Go/no-go tasks
    3.2.3. Reversal learning (non-reinforced)
    3.2.4. Summary and conclusions
  3.3. Punishment induced inhibition (instrumental and Pavlovian) (Table 2)
    3.3.1. Reinforced discrimination and reversal learning
    3.3.2. Reinforced go/no-go and other decision-making tasks
    3.3.3. Reinforced categorization task

∗ Corresponding author. Tel.: +44 161 275 7427; fax: +44 161 275 7429. E-mail addresses: [email protected] (P. Faulkner), [email protected] (J.F.W. Deakin). http://dx.doi.org/10.1016/j.neubiorev.2014.07.024 0149-7634/© 2014 Published by Elsevier Ltd.

Please cite this article in press as: Faulkner, P., Deakin, J.F.W., The role of serotonin in reward, punishment and behavioural inhibition in humans: Insights from studies with acute tryptophan depletion. Neurosci. Biobehav. Rev. (2014), http://dx.doi.org/10.1016/j.neubiorev.2014.07.024


    3.3.4. Pavlovian to instrumental transfer
    3.3.5. Aversive Pavlovian conditioning
    3.3.6. Summary and conclusions
  3.4. Inhibition and risk taking; probability and temporal discounting of reward (Table 3)
    3.4.1. Probability, risk and gambling
    3.4.2. Loss chasing
    3.4.3. Four-arm bandit task
    3.4.4. Cued-reinforcement reaction time task (CRRTT)
    3.4.5. Information sampling task
    3.4.6. Temporal discounting
    3.4.7. Summary and conclusions
4. Conclusions
  4.1. Lack of effect of ATD on action inhibition
  4.2. 5-HT and aversive Pavlovian inhibition
  4.3. 5-HT, inhibition and aversion
  4.4. Panic disorder in the context of 5-HT and punishment
  4.5. Outstanding issues
Appendix A. Supplementary data
References

1. Introduction

1.1. Overview

The behavioural functions of 5-HT have been a matter of speculation, experiment and controversy since 5-HT was identified in the brain over 60 years ago. Many experiments using drugs or lesions to reduce 5-HT function have reported disinhibitory effects on a wide variety of animal behaviours. This has led to 40 years of debate on the question of whether 5-HT systems have a general role in suppressing behaviour (Harvey et al., 1975) or have a more specific central role in processing aversive stimuli, as first suggested by Wise et al. (1970). Soubrié (1986) concluded that 5-HT neurones generally restrain behaviour in any competition between action and restraint, for example, in restraining responding for small rewards in order to obtain larger delayed rewards. Others have proposed that different 5-HT topographic systems and associated receptors orchestrate specific adaptive responses to aversive events and environments (Deakin, 1983, 2013; Deakin and Graeff, 1991; Lowry, 2002; Paul and Lowry, 2013). More recently a role of 5-HT in reward mechanisms has been proposed (see Roberts, 2011). These theories have had a predominantly animal behavioural perspective. However, the last decade has seen a number of studies with human participants that have employed cognitive neuroscience decision-making paradigms in order to isolate the effects of 5-HT manipulation on impulsivity and the processing of rewards and punishments. The implications of these studies for the behavioural functions of 5-HT are the subject of this review. We confine our attention to human studies in which acute tryptophan depletion (ATD) was used to decrease 5-HT functioning because this has been by far the most common manipulation. This ingenious dietary manipulation markedly decreases circulating concentrations of tryptophan, the precursor of 5-HT, which is presumed to impair 5-HT synthesis and release.
Compelling evidence that ATD decreases 5-HT release in humans or animals is lacking and the technique has been criticised on these grounds by van Donkelaar et al. (2011) and defended by Crockett et al. (2012a). We conclude that ATD effects on release are likely to be mild and that some 5-HT projections may be more susceptible than others to ATD (see Supplementary material). Therefore absence of an ATD effect is an uncertain basis on which to reject a function of 5-HT.

The present review focuses on the implications of ATD effects on performance on tasks relevant to theories of 5-HT and inhibition, reward and punishment. We have not reviewed studies in clinical populations because our focus is on healthy mechanisms and there are recent extensive reviews of the cognitive neuropsychopharmacology of depression that cover ATD effects (Roiser et al., 2012; Elliott et al., 2011). For the same reason we have not reviewed ATD effects on emotion processing.

Mendelsohn et al. (2009) comprehensively reviewed the effects of ATD on cognitive performance across a broad range of cognitive-perceptual domains. Their main conclusion was that ATD has a consistent effect in impairing declarative memory mainly on verbal list-learning tasks in the visual rather than auditory domain. Other forms of memory were not consistently affected. ATD had no consistent effects on executive or percepto-motor function that might affect studies of reward and punishment.

The last 10 years have seen attempts to formalise the role of 5-HT in behavioural inhibition and motivation using computational modelling of information processing during reward and punishment learning. For example, temporal difference reinforcement learning (TDRL) models provide well-established computational accounts of how learning proceeds in proportion to the unexpectedness of rewards (reward prediction error; RPE), which is signalled by dopamine neurones (Schultz et al., 1997). Several ‘dopamine opponency’ theories propose that 5-HT signals punishment prediction error (PPE) as an opponent to dopamine, and together they compute the trade-off between rewards and punishments that control activation and inhibition (Daw et al., 2002; Cools et al., 2011; Boureau and Dayan, 2011). According to such accounts 5-HT function is not specifically associated with inhibition or with punishment processing, but rather their combination. The suggestion that 5-HT governs the rate at which rewards devalue with delay (Doya, 2002) initiated another computational theme in understanding 5-HT functions.
We discuss computational and other theories of 5-HT function that have been co-evolving from 10 years of ATD studies and their implications for Deakin and Graeff’s (1991) anatomical theory of 5-HT and aversive coping.
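The prediction-error idea behind the TDRL accounts mentioned above can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the function names, the learning rate alpha, the discount factor gamma and the single cue followed by one outcome are all illustrative assumptions. A positive outcome yields a reward prediction error that shrinks as the cue comes to predict it; a negative outcome yields the mirror-image signal that opponency theories attribute to a punishment prediction error.

```python
def td_error(outcome, v_current, v_next, gamma=0.9):
    """Temporal-difference prediction error: delta = r + gamma * V(s') - V(s).

    gamma is a hypothetical discount factor; it sets how quickly the value
    of a delayed outcome falls off with delay.
    """
    return outcome + gamma * v_next - v_current


def simulate_cue_learning(outcome, n_trials=200, alpha=0.1, gamma=0.9):
    """Learn the value of one cue that is always followed by `outcome`.

    Each trial is cue -> outcome -> end, so the state after the cue is
    terminal and has value 0. Returns the final cue value and the list of
    prediction errors, one per trial.
    """
    v_cue = 0.0
    errors = []
    for _ in range(n_trials):
        delta = td_error(outcome, v_cue, 0.0, gamma)
        v_cue += alpha * delta  # learning is proportional to unexpectedness
        errors.append(delta)
    return v_cue, errors


if __name__ == "__main__":
    for label, outcome in (("reward", 1.0), ("punishment", -1.0)):
        value, errors = simulate_cue_learning(outcome)
        print(label, "first error:", errors[0], "final cue value:", round(value, 3))
```

In this toy setting the prediction error is largest on the first, fully unexpected trial and decays towards zero as the cue value approaches the outcome, which is the sense in which learning is maximal for unexpected events.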

1.2. Panic disorder in the context of 5-HT and punishment

Since the early behavioural pharmacology experiments of Graeff and others in animals, 5-HT has appeared to have opposite roles in different forms of anxiety (Graeff and Schoenfeld, 1970; Schütz et al., 1985). Deakin and Graeff (1991) and others (McNaughton and Corr, 2004; Deakin, 2013; Paul and Lowry, 2013) proposed a resolution in which different 5-HT systems mediate different adaptive responses (defences) depending on the imminence of the threat, as follows.


Proximal threat. Panic attacks originate in spontaneous activation of flight-fight mechanisms in the peri-aqueductal grey matter (PAG) that are evoked in nature by warnings of imminent threats to life such as pain, asphyxia and hypercapnia, and proximity to a predator. That proximal threat evokes PAG activation in humans was neatly demonstrated in a virtual reality functional magnetic resonance imaging (fMRI) experiment by Mobbs et al. (2007). Furthermore, Tuescher et al. (2011) showed that patients with panic disorder have aberrant PAG fMRI responses to threat.

Distal threat. Distal (future) innate or conditioned threats, including planned actions that risk punishment and fear of future panic attacks, activate three dorsal raphe nucleus (DRN) 5-HT projections to evade the threat through:

(i) projections to dopamine-basal ganglia approach-avoidance mechanisms that signal punishment prediction, behavioural inhibition and avoidance;
(ii) projections to amygdala that enhance cue-evoked negative valence – activation of outputs to hypothalamus and PAG preparatory autonomic mechanisms (anticipatory anxiety);
(iii) projections to PAG that suppress the behavioural fight-flight component until avoidance is impossible and escape essential (see McNaughton and Corr, 2004; Paul et al., 2014; Graeff and Del-Ben, 2008 for extensive reviews).

The human evidence from ATD studies discussed in the present paper is relevant to (i) and (ii) above but also contextually to (iii), the restraint of the PAG escape-panic system by DRN 5-HT punishment projections. Direct evidence for the latter in humans is limited to two studies in panic disorder patients. Miller et al. (2000) found that ATD enhanced panic responses to breathing 5% CO2 but lessened anxiety ratings in anticipation of the CO2. Conversely, fenfluramine-induced 5-HT release suppressed CO2 responses but increased anticipatory anxiety (Mortimore and Anderson, 2000). CO2 and ATD had minimal effects in the healthy volunteer groups. A plausible explanation is that the 5-HT defence is strong in patients since their major fear is having another panic attack. This sets up anticipatory anxiety (agoraphobia) and thus 5-HT restraint of the PAG-mediated panic response. Further direct tests of the theory in humans are lacking but, as noted in Section 1.1, a number of ATD studies allow us to address the more general question of whether 5-HT has a mediating role in the behavioural effects of punishment in humans.

2. Methods

2.1. Publication selection criteria

We searched PubMed (http://www.ncbi.nlm.nih.gov/pubmed), using the search terms ‘tryptophan’ or ‘tryptophan depletion’ along with the terms ‘decision-making’, ‘reward’, ‘punishment’, ‘appetitive’, ‘aversive’, ‘inhibition’, or ‘impulsivity’. Further, we examined the reference list of every article cited in this review, and also searched the reference lists on any websites dedicated to the first and last authors (where possible) in order to ensure that no relevant studies were missed. To be included in this review, studies had to (1) be original papers, written in English, and appearing in a peer-reviewed journal, (2) achieve successful depletion of participants’ tryptophan (as shown by plasma levels of tryptophan), and (3) include cognitive tasks examining the effect of ATD upon participants’ decision-making abilities. Studies that did not meet these criteria were excluded.


2.2. Studies

Thirty-four studies were included in this review. Twenty-five of these studies included healthy volunteers, 4 included psychiatric populations or participants with a family history of a psychiatric illness, and two (Roiser et al., 2006; Finger et al., 2007) included healthy volunteers both with and without a genetic polymorphism. Nearly all studies included males and females. Mean age for all studies was less than 28 years except for the Talbot et al. (2006) sample, which had a mean age of 34 years. Where ranges were given, the maximum age was less than 30 years in all but 3 studies, which had maxima of 35, 40 and 60 years. Most studies recorded mood but none reported any changes in mood after ATD. A summary of these studies and their results can be seen in Table 2. The composition of the ATD drink varied between studies, containing as little as 31.5 g of amino acids (Finger et al., 2007; Seymour et al., 2012) and as much as 104.5 g (Talbot et al., 2006). When examining the extent of tryptophan depletion, 17 studies examined the total tryptophan concentration in the blood only, 6 studies examined the ratio of large neutral amino acids to tryptophan (LNAA:TRP ratio) only, and 9 examined both. All studies in this review reported decreased plasma tryptophan after ATD compared to sham depletion, but the extent of decrease varied from as little as 59% (Crockett et al., 2012b) to 95% (Hindi Attar et al., 2012). Many of the studies also reported an increase in plasma tryptophan after sham depletion. The plasma changes in tryptophan concentration and LNAA:TRP ratio in each study can be observed in Table 1.

3. Results and discussion

3.1. Overview and controversies

The review is divided into 3 sections according to psychological mechanisms of inhibition.
First (Section 3.2), we consider studies of behavioural inhibition in which there are no formal rewards or punishments; second (Section 3.3), we review studies which investigate the role of 5-HT in inhibition of choice or speed of response associated with punishment; and third (Section 3.4), we review studies in which inhibition results from variation in the value of rewards, namely their magnitude, probability or delay. Many of the tasks have explicitly translational analogues in the animal literature, and their rationale and the effects of 5-HT and other drug manipulations have been well reviewed by Eagle and Baunez (2010).

Experimental studies of response inhibition in humans typically involve stopping an already initiated response (stop signal tasks; SSTs), withholding a prepotent response (go/no-go tasks, reversal learning) or delaying a response (temporal discounting). The prepotent response may be maintained by instruction to respond to a cue or learning to respond to it by reward. Inhibition to other cues is elicited by instruction or punishment (typically loss of winnings). In contrast to punishment learning, inhibition of responses that have already either been initiated (as in SSTs) or selected (as in go/no-go) has been termed ‘action inhibition’ (Eagle et al., 2008). Action inhibition tasks also tend to involve time pressure, and different forms of action inhibition have been shown to have distinct neuropharmacological profiles in rodents (Eagle and Baunez, 2010; McNaughton et al., 2013).

The first section (Section 3.2) reviews studies of the effects of ATD on action inhibition and reversal learning tasks that do not involve explicit rewards or punishments to determine whether 5-HT has a role in general response inhibition in the absence of punishment.

The influence of punishment on behaviour involves Pavlovian learning in which cues that are followed by punishment acquire aversive properties, motivate avoidance behaviour (e.g. choosing another option), and inhibit on-going behaviour. Punishments contingent on a response reduce the probability of the response through instrumental response-outcome learning. However, Pavlovian mechanisms are also likely to be involved since cues associated with the punishment will also inhibit responding and guide the organism to different stimuli and responses. In order to identify whether ATD has different effects in different forms of learned inhibition, a number of variations on go/no-go and discrimination reversal tasks have been designed. The results are described in the second section (Section 3.3).

Much recent computational theorising about the role of 5-HT in inhibitory control has followed from the idea that 5-HT acts in opposition to dopamine systems whose role in mediating the reinforcing (learning) and incentive (guiding and energising) effect of rewards has been elaborated to a considerable degree of sophistication (Daw et al., 2002; Cools et al., 2011; Boureau and Dayan, 2011). Just as dopamine firing encodes information about the occurrence and prediction of rewards (Schultz et al., 1997), so 5-HT neurones carry the same information about punishments. Learning is maximal when unexpected rewards and punishments (prediction errors) occur. Computational theories propose that 5-HT-mediated predictions of punishment are tightly linked specifically to behavioural inhibition, i.e. passive avoidance and general behavioural suppression; 5-HT is not involved in active avoidance behaviour, for example. This 5-HT-mediated aversive inhibition extends to inhibition of ‘processing of aversive information’, which has been seen as resilience in the face of adversity and protective against depression (Dayan and Huys, 2008; Cools et al., 2008a; Robinson et al., 2012). There is thus a paradox in these accounts that punishment prediction and resilience are mediated simultaneously by 5-HT (Cools et al., 2008b).
Anticipating the computational theories above, Deakin and Graeff (1991) argued that 5-HT (via projections of the dorsal raphe nucleus; DRN) mediates punishment prediction but that this enhances aversive or emotion processing rather than inhibiting it. Indeed, sustained activation of DRN projections is known to mediate the inhibitory state of learned helplessness induced by brief but inescapable foot shock in experimental animals (Maier and Watkins, 2005; Paul and Lowry, 2013). Deakin and Graeff (1991) reasoned that 5-HT projections distinct from those mediating acute aversive responses, namely projections from the median raphe nucleus (MRN), mediate adaptation to chronic adversity (resilience) by slowly disengaging the acute DRN response. It is possible this system may be more susceptible to ATD than some DRN projections (see Supplementary material for rationale). A number of studies in Section 3.3 address these competing theories and paradoxes and how they might be resolved.

Reduced 5-HT function has been implicated in clinical disorders of impulsivity and risk-taking over many years, for example in impulsive aggression (Linnoila et al., 1983; Rylands et al., 2013; Deakin, 2003). There is an extensive animal literature that is compatible with low 5-HT and impulsivity (e.g. Harrison et al., 1997a,b; Harrison et al., 1999) but is not entirely consistent, as reviewed by Miyazaki et al. (2012). A recent compelling observation is that mice with no detectable forebrain 5-HT consequent upon tryptophan hydroxylase 2 gene knockout show a remarkable phenotype of impulsive aggression and compulsive behaviours (Angoa-Perez et al., 2012). They lack fear responses rather than showing disinhibition of fear as the computational models might arguably predict. Lack of 5-HT-mediated aversion and consequent disinhibition of dopamine incentive systems could be a mechanism of such impulsive behaviour, as proposed by Deakin (2003).
Several cognitive/motivational mechanisms for impulsive responding might involve 5-HT (Evenden, 1999; Dalley and Roiser, 2012). They include failure of future thinking/planning (Winstanley et al., 2006), hasty decision-making based upon insufficient information (‘reflection impulsivity’, Kagan, 1966), the failure of more delayed rewards to motivate behaviour (rapid temporal discounting; Bizot et al., 1988), and disinhibition of dopamine reward processing (Deakin, 1983; Deakin and Graeff, 1991; Rocha et al., 1998; Winstanley et al., 2004). Section 3.4 reviews ATD studies on impulsivity.

3.2. Action inhibition and reversal learning without reinforcement (Table 1)

3.2.1. Stop-signal tasks (SSTs)

The tasks require participants to respond to stimuli (often images or symbols) as quickly as possible. On some trials, a tone after the go stimulus indicates that the initiated response must be halted. The delay between the go and stop stimuli is varied to find the delay that inhibits 50% of the responses. The delay is used to compute the stop signal reaction time (SSRT), that is, how close to its implementation a response can be inhibited. There is little evidence that ATD affects SSRTs. For example, neither Cools et al. (2005) nor Clark et al. (2005) found an effect of ATD on SSRTs or the proportion of successful inhibitions. Furthermore, ATD had no effect even in participants with the short allele of the 5-HTT polymorphism. Crean et al. (2002) reported that ATD increased SSRTs in those with a family history of alcoholism but decreased SSRTs in those without. It is possible that ATD does not impair 5-HT release sufficiently to affect SST performance. However, neurotoxin-induced 5-HT depletions in animals have not produced disinhibitory effects on animal versions of SSTs either (Eagle et al., 2008, 2009). In contrast, SSRTs are slowed after frontal lobe damage in both rodents and humans and are modulated by noradrenergic drugs (Eagle et al., 2008). This form of action ‘cancellation’ inhibition would therefore appear to involve fast neocortical mechanisms that are modulated by noradrenaline but not by 5-HT.

3.2.2. Go/no-go tasks

Go/no-go tasks involve responding or withholding a response to successively presented correct or incorrect stimuli, which may also be rewarded or punished.
Fast and incorrect responses (commission errors; CEs) reflect impulsivity. Six studies failed to find an overall effect of ATD in increasing CEs in healthy control participants (LeMarquand et al., 1998, 1999; Zepf et al., 2008; Rubia et al., 2005; Evers et al., 2006; Macoveanu et al., 2013a). However, one study did report increased CEs in participants with a strong familial risk of alcoholism after ATD (LeMarquand et al., 1999). In another, ATD showed opposite effects on CEs, increasing them in an aggressive subgroup of people with attention deficit hyperactivity disorder (ADHD) and decreasing them in the low aggressive subgroup. Overall, ATD had no effect compared to sham depletion in ADHD or in a control group. In contrast, neither aggressive nor non-aggressive adolescents showed an effect of ATD in a study by LeMarquand et al. (1998). Nevertheless, the results suggest that ATD may reveal vulnerable 5-HT systems in some populations at risk of impulse control disorders.

Three of the 5 studies above investigated the effect of ATD on neural processing during go vs no-go trials. As noted, Rubia et al. (2005) found no effect of ATD on CEs. However, after ATD, BOLD responses on no-go trials were decreased within the right prefrontal cortex and increased in the right middle temporal and left middle and inferior temporal gyri. Although only 9 subjects were studied, a similar effect of ATD was seen in the discrimination reversal study of Evers et al. (2005; see above). However, Evers et al. (2006), in a go/no-go study, found no effects of ATD on performance or on BOLD responses. Recently Macoveanu et al. (2013b), in a large within-subject (n = 24) study, reported that right inferior frontal no-go BOLD responses were augmented or reduced by ATD depending on individual levels of 5-HT2A receptor binding measured with altanserin PET. One inference is that the state of synaptic 5-HT function may not only determine the magnitude of ATD effects but also their

Please cite this article in press as: Faulkner, P., Deakin, J.F.W., The role of serotonin in reward, punishment and behavioural inhibition in humans: Insights from studies with acute tryptophan depletion. Neurosci. Biobehav. Rev. (2014), http://dx.doi.org/10.1016/j.neubiorev.2014.07.024


Table 1
Effects of acute tryptophan depletion on non-reinforced behavioural inhibition tasks.

| Study | Subjects, N, M:F | Task | Effect of ATD |
|---|---|---|---|
| Cools et al. (2005) | HVs; 23, 23:0 | Stop signal (reaction time) task | No effect |
| Clark et al. (2005) | HVs; 42, 29:13 | Stop signal (reaction time) task | No effect |
| Crean et al. (2002) | HVs; 40, 40:0; 50% FH alcoholism | Stop signal (reaction time) task | Slowed SSRT in FH+; speeded SSRTs in FH− |
| LeMarquand et al. (1998) | Aggressive and non-aggressive adolescents; 38, 38:0 | Go/no-go | No effect |
| LeMarquand et al. (1999) | HVs; 57, 57:0; 50% FH alcoholism | Go/no-go | Increased CEs in FH+; no effect in FH− |
| Rubia et al. (2005) | HVs; 9, 5:4 | Go/no-go | CEs no effect; decreased no-go BOLD signal in PFC and temporal lobe |
| Evers et al. (2006) | HVs; 13, 13:0 | Go/no-go | CEs no effect; no effect on BOLD response during inhibition |
| Macoveanu et al. (2013a) | HVs; 22, 14:8; within-subjects design; no sham depletion or placebo pill | Go/no-go fMRI; ATD, citalopram and no-drug sessions; PET 5-HT2A BP | CEs no effect; R inferior frontal BOLD increased in low 5-HT2A BP, decreased in high 5-HT2A BP |
| Walderhaug et al. (2002) | HVs; 24, 24:0 | Continuous performance task | Increased tendency to over-respond using letter stimuli but not shapes |
| Walderhaug et al. (2008) | HVs; 24, 24:0 | Continuous performance task | Increased false alarms if ATD administered before sham condition |
| Murphy et al. (2002) | HVs; 11, 0:11 | Reversal learning task | No effect on errors; increase or decrease in overall RTs depending on order of sham and ATD procedure |
| Rogers et al. (1999a,b) | HVs | Reversal (CANTAB) | Impaired reversal |
| Talbot et al. (2006) | HVs; 32, 16:16 | Reversal (CANTAB) | No effect |

Abbreviations: M, male; F, female; HVs, healthy volunteers; CEs, commission errors; FH+, family history positive; FH−, family history negative; ID/ED, intradimensional/extradimensional; CANTAB, Cambridge Automated Neuropsychological Test Battery; R, right; PET 5-HT2A BP, PET altanserin binding potential.

direction. Multimodal PET and pharmaco-fMRI cognitive challenge studies have the capability to reveal modulatory effects of 5-HT on brain processing of behavioural inhibition that may be relevant to abnormal impulsivity traits.

The continuous performance task (CPT; first designed by Rosvold and Delgado, 1956) resembles a typical go/no-go task but requires a go response only when repetitions occur in a sequence of cues. Lures, or foils, resembling repetitions tempt false alarms. These tasks measure both sustained attention and response inhibition. Walderhaug et al. (2002) reported that ATD caused a general tendency to disinhibition, but only when letters rather than shapes were used as stimuli. Walderhaug et al. (2008) found that ATD increased false alarms, but only if ATD was administered in the first test session. We review whether ATD has effects on inhibition when performance is rewarded or punished in go/no-go tasks after the following sections on non-reinforced and reinforced reversal learning.
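As a worked illustration of the stop-signal measure described in Section 3.2.1, the sketch below estimates the SSRT with the commonly used mean method: the mean go reaction time minus the stop-signal delay at which about 50% of responses are inhibited. The function name, the millisecond values and the choice of the mean method are assumptions made for illustration; the studies reviewed here may have used other estimators.

```python
from statistics import mean


def estimate_ssrt(go_rts_ms, ssd_at_50_percent_ms):
    """Mean-method SSRT (illustrative assumption, not a study-specific method).

    SSRT = mean go RT - stop-signal delay (SSD) at which roughly half of
    the initiated responses are successfully inhibited.
    """
    if not go_rts_ms:
        raise ValueError("need at least one go reaction time")
    return mean(go_rts_ms) - ssd_at_50_percent_ms


# Hypothetical data: go RTs and a tracked SSD, both in milliseconds.
go_rts = [420, 450, 480, 430, 470]
ssrt = estimate_ssrt(go_rts, ssd_at_50_percent_ms=230)
print(ssrt)
```

A longer SSRT under this estimate means that a response must be signalled earlier to be cancelled, which is the sense in which frontal lobe damage is said to slow SSRTs.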

3.2.3. Reversal learning (non-reinforced)

In reversal learning tasks participants learn which of a simultaneously presented pair of stimuli produces “correct” feedback or is rewarded, and which is incorrect. Incorrect choices may have no consequence, may produce “incorrect” feedback or may be punished by loss of winnings. The correct and incorrect stimuli are reversed from time to time. Where feedback is 100% accurate (i.e. the feedback is deterministic) the task is soon learned by a win-stay, lose-shift strategy. In probabilistic reversal learning, uncertainty and errors are introduced by misleading feedback, typically on 20% of trials. Staying in the face of loss produces perseverative errors and may reflect resistance to loss. Immediate shifting after loss may be taken to reflect sensitivity to punishment. Murphy et al. (2002) examined the effects of ATD on probabilistic reversal learning. There were no wins or losses, only positive or negative feedback (correct or incorrect). ATD had no effect on errors but increased or decreased RTs depending on which drink participants had first. However, the sample size was small (N = 11), and levels of tryptophan were depleted modestly compared to other studies (∼70% depletion). Rogers et al. (1999a,b) found that ATD impaired reversal learning on the deterministic reversal stages of the ID/ED attentional set-shifting task of the CANTAB (Sahakian and Owen, 1992), but this was not replicated by Talbot et al. (2006) in an apparently identical experiment.

3.2.4. Summary and conclusions

There is little evidence that ATD produces a general impairment of behavioural inhibition across a range of tasks, in agreement with the review of Mendelsohn et al. (2009).
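The win-stay, lose-shift strategy described in Section 3.2.3 can be sketched as a small simulation. This is a minimal illustration under stated assumptions rather than a model of any reviewed study: the two stimuli are coded 0 and 1, and the trial numbers for reversals and misleading feedback are hypothetical.

```python
def win_stay_lose_shift(previous_choice, was_correct):
    """Return the next choice (0 or 1): repeat after 'correct', switch after 'incorrect'."""
    return previous_choice if was_correct else 1 - previous_choice


def count_errors(reversal_trials, n_trials, misleading_trials=frozenset()):
    """Simulate a win-stay, lose-shift agent on a two-stimulus reversal task.

    reversal_trials: trial indices at which the correct stimulus swaps.
    misleading_trials: trial indices on which the feedback is inverted
    (empty for deterministic feedback).
    Returns the number of trials on which the incorrect stimulus was chosen.
    """
    correct_stimulus, choice, errors = 0, 0, 0
    for t in range(n_trials):
        if t in reversal_trials:
            correct_stimulus = 1 - correct_stimulus
        chose_correctly = choice == correct_stimulus
        errors += not chose_correctly
        # Feedback is truthful unless this trial is flagged as misleading.
        feedback = chose_correctly != (t in misleading_trials)
        choice = win_stay_lose_shift(choice, feedback)
    return errors


# Hypothetical schedules: two reversals, with and without misleading feedback.
deterministic = count_errors(reversal_trials={10, 20}, n_trials=30)
probabilistic = count_errors(reversal_trials={10, 20}, n_trials=30,
                             misleading_trials={3, 14, 25})
print(deterministic, probabilistic)
```

With deterministic feedback the agent makes one error per reversal. Each misleading trial adds a further error because the agent shifts away from the correct stimulus, which is why a strict win-stay, lose-shift rule is no longer optimal in the probabilistic version.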


3.3. Punishment induced inhibition (instrumental and Pavlovian) (Table 2)

3.3.1. Reinforced discrimination and reversal learning

ATD had no effect on errors in reversal tasks reinforced by winning and losing points in 2 studies (Evers et al., 2005; Finger et al., 2007). However, Evers et al. (2005) also measured RTs and reported that they were slower after ATD, as Murphy et al. (2002) reported in those receiving ATD first. Despite the lack of effect of ATD on performance, it tended to increase blood oxygen level dependent (BOLD) responses associated with errors in dorsomedial prefrontal cortex (DMPFC). The trend-level significance may reflect the small sample size (N = 12). Nevertheless, the results raise the interesting possibility that ATD increases the sensitivity of BOLD responses within the DMPFC to negative outcomes without affecting performance.

Cools et al. (2008a) tested the idea that 5-HT release acts as a punishment-predicting signal in an opponent manner to reward-predicting dopamine neurons (Deakin, 1983; Daw et al., 2002) in a reversal learning task. A small sample of 12 volunteers learned a discrimination reversal task between a pair of stimuli. However, rather than choosing the rewarded one, participants had to indicate whether the stimulus that was highlighted would be followed by a reward or punishment, which was always presented regardless of the participant’s prediction. This allowed separate assessment of the ability to predict rewarding and punishing outcomes rather than to select or avoid stimuli. The rewarded/non-rewarded stimuli reversed from time to time to produce prediction errors. After sham depletion, participants displayed a bias towards making more errors in predicting punishments than rewards, but only in non-reversal phases. However, this bias was abolished in the ATD condition such that participants’ ability to predict punishment improved to equal their performance in predicting rewards.

These results indicate that 5-HT has a valence-specific role in punishment processing but are at first sight incompatible with the idea that 5-HT signals punishment (Deakin and Graeff, 1991; Daw et al., 2002), in which case ATD should have exaggerated the poorer prediction of punishment. Cools et al. (2008a) proposed that ATD improved punishment prediction because punishment-induced peaks of 5-HT release stand out more against a reduced background or tonic level of 5-HT caused by ATD. Thus, learning of stimulus-punishment associations improved after ATD. Why baseline 5-HT release should be more sensitive to ATD than peaks is not obvious but is not implausible either. Impressively, Robinson et al. (2012) replicated these results in a much larger, between-subjects sample (N = 41). Both Cools et al. (2008a) and Robinson et al. (2012) propose that the bias towards punishment errors in the balanced drink condition reflects a role for 5-HT in inhibiting the processing of aversive stimuli, with the dual effect of reducing accuracy in the punishment phases of the task and enhancing resilience. According to Cools et al. (2011), when punishments occur, high 5-HT firing rates gradually increase background 5-HT levels, and this makes further punishment spikes of 5-HT less salient against the already increased background and thus less processed. In effect, the suggestion is that 5-HT systems contain an in-built mechanism of adaptation to aversion or resilience. In contrast, Deakin and Graeff (1991) reasoned that a separate 5-HT system, arising from the MRN, mediates tolerance or resilience to chronic aversion, distinct from the 5-HT neurones of the DRN that signal punishment prediction errors (PPEs). As noted in Section 1 and Supplementary material, MRN resilience projections may be especially sensitive to ATD.

3.3.2. Reinforced go/no-go and other decision-making tasks

Finger et al. (2007) investigated the effect of the 5-HT transporter (HTT) genotype, ATD and their combination on a reward and punishment go/no-go task. Responding to 6 ‘good’ stimuli (S+) won points while responding to 6 intermingled ‘bad’ stimuli (S−) deducted points. Omission errors (OEs; failure to respond to a rewarded stimulus) and commission errors (CEs; responses to a punished stimulus, passive avoidance errors) were recorded. ATD had no effect on passive avoidance, in contrast to the study of Geurts

Table 2
Effects of ATD on instrumental and Pavlovian inhibition in reinforced reversal learning, go/no-go and other tasks.

| Study | Subjects, N, ratio (M:F) | Task | Effect of ATD |
|---|---|---|---|
| Evers et al. (2005) | HVs; 12, 12:0 | Reversal learning | No effect on behaviour; increased BOLD response in DMPFC during reversal switch errors |
| Finger et al. (2007) | HVs; 35, 17:18 (only 26 genotyped) | Reversal learning | No effect in s/−, s/s HTT; more errors in all phases in l/l HTT, but few errors overall |
| Cools et al. (2008) | HVs; 12, 4:8 | Predicting outcome in reversal task | Reversed normal bias of more errors in predicting punishment than predicting rewards |
| Robinson et al. (2012) | HVs; 41, 0:41 | Predicting outcome in reversal task | Reversed normal bias of more errors in predicting punishment than predicting rewards |
| Finger et al. (2007) | HVs; 35, 17:18 | Go/no-go (S− and S+ for losses and gains) | No effect on CEs to S− (i.e. no effect on passive avoidance); increased OEs to S+ (i.e. diminished effect of reward) |
| Crockett et al. (2009) | HVs; 22, 8:14 | Go/no-go | Decreased punishment-induced slowing of RT; no effect on CEs |
| Crockett et al. (2012b) | HVs; 30, 13:17 | Reinforced categorization | Decreased punishment slowing of RT and decreased response inhibition |
| Geurts et al. (2013) | HVs; 45, 23:22 | Pavlovian-to-instrumental transfer | Abolished aversive PIT suppression; no effect on positive PIT; impaired passive avoidance |
| Hindi Attar et al. (2012) | HVs; 39, 39:0 | Pavlovian conditioning task; TDRL modelling | Decreased aversive prediction error signals in amygdala and OFC; decreased conditioned SCRs |

Abbreviations: M, male; F, female; HVs, healthy volunteers; CEs, commission errors; OEs, omission errors; S+, stimulus reward; S−, stimulus punishment; TDRL, temporal difference reinforcement learning; SCRs, skin conductance responses.


et al. (2013) discussed below. Unexpectedly, those homozygous for long (l/l) HTT alleles were less able to restrain responses to avoid punishment than those possessing short alleles (s/−), which confer risk for depression and neuroticism. However, a total genotyped sample of 26 is insufficient for these results to be other than preliminary. Indeed, in a recent study of 700 individuals (den Ouden et al., 2013), the l/l HTT genotype was associated with increased sensitivity to punishment in a discrimination reversal task. The more surprising result from Finger et al. (2007) was that ATD increased OEs (failure to respond to the S+) in this sample, one of a few results indicating that 5-HT may have a role in reward processing.

Crockett et al. (2009) designed a novel go/no-go task to isolate 5-HT’s role in three processes relevant to behavioural inhibition: punishment-induced inhibition (the general suppression of responding in aversive contexts), motor-response inhibition (the intentional inhibition of inappropriate motor responses), and sensitivity to aversive outcomes. Participants viewed checkerboards composed of blue and yellow squares, and made a ‘go’ response if their target colour was in the majority (the number of squares of each colour differed on each trial, in order to vary difficulty), or withheld a response if their target colour was in the minority. The task had 2 conditions: rewarding correct go and no-go responses, and punishing incorrect responses. In half the blocks of each condition, go responses attracted greater rewards or punishments than no-go responses, and in the other half, no-go responses received the greater reward or punishment. As expected, the possibility of punishment for incorrect go responses slowed go RTs, and this was magnified in the greater punished go condition. ATD abolished the general slowing of responses in the punishment condition but did not lessen the effect of reward/punishment magnitude on RTs or affect accuracy. After a punishing outcome, subsequent RTs were slowed, and this ‘emotional’ impact of punishment was also unaffected by ATD. This study suggested that 5-HT may mediate inhibition (slowing) associated with punishment but is not involved in behavioural sensitivity to punishment. Thus 5-HT is not critical solely for punishment processing or for behavioural inhibition alone, but for their conjunction in punishment-induced inhibition. The effect of ATD was remarkably specific in not affecting reward influences. There is the usual caveat that ATD is a mild intervention and a more potent method of depleting 5-HT might have had an effect on sensitivity to punishment or indeed reward. RTs may be more sensitive to subtle 5-HT changes than behavioural choice.

3.3.3. Reinforced categorization task

Following the results of Crockett et al. (2009), Crockett et al. (2012b) argued that punishment-induced inhibition reflects at least two processes: an instrumental process in which responses are inhibited because they become linked to their aversive consequences, and a Pavlovian process whereby concurrent cues, which also predict aversive outcomes, elicit conditioned suppression and reduce the vigour of choice responses. Participants viewed a checkerboard of blue and yellow squares and decided as quickly as possible which colour was in the majority. The authors compared: (i) RTs on yellow and blue response keys in a reward-only (RO) condition, in which correct responses on either key were rewarded with 10 points and incorrect responses gained 0 points, with (ii) RTs in a reward and punishment (R–P) condition, in which correct responses continued to be rewarded, but incorrect responses on one of the colour keys were ‘punished’ with a loss of 10 points. In short, incorrect responses on one colour were never punished whereas incorrect responses on the other colour were punished in the R–P condition.
The possibility of an upcoming punishment slowed responses on both keys, with slowing on the non-punishment key reflecting Pavlovian suppression evoked by the presence of the punishment key. Punishment-induced slowing on both keys was blocked by ATD. This is a striking replication of Crockett et al.’s (2009) result in the reinforced go/no-go paradigm. These results again suggest that 5-HT mediates punishment inhibition but suggest that the mechanism is Pavlovian. In contrast to their previous findings, passive avoidance of responding on the punishment key in the R–P block and post-punishment slowing were blocked by ATD. This would appear to allow the possibility that 5-HT has a role in reporting the occurrence of punishment and its aversiveness rather than in suppressing such processing as proposed in recent theoretical models (Cools et al., 2008a; Robinson et al., 2012; Boureau and Dayan, 2011). The results are not compatible with the idea that ATD enhances punishment prediction by increasing phasic relative to baseline 5-HT firing (Cools et al., 2008a); such an effect of ATD would have enhanced the suppressive effects of punishment. However, the results clearly are compatible with evidence discussed below that amygdala 5-HT facilitates aversive Pavlovian learning (Hindi Attar et al., 2012).

3.3.4. Pavlovian to instrumental transfer

Geurts et al. (2013) recently carried out a decisive test of the hypothesis that 5-HT mediates Pavlovian punishment suppression using a Pavlovian-instrumental transfer (PIT) task. This task contained 3 phases. In the first phase participants learned an instrumental approach/avoidance go/no-go task in which they learned to collect and reject good and bad mushrooms. In approach blocks they actively collected good mushrooms (go) or left bad ones (no-go). In avoidance blocks they actively discarded bad mushrooms (go) or retained good ones (no-go). Correct responses were probabilistically rewarded with 20 cents on 75% of correct trials. Pavlovian conditioning took place in the second phase, in which participants passively viewed pairings of five different fractal images with a large or small win or loss, or nothing. In the third, PIT phase participants repeated the instrumental mushroom task with trial cues presented against a background of one of the Pavlovian fractal images. ATD caused a general impairment in instrumental learning and a specific passive avoidance deficit, reported as an impaired ability to learn to inhibit collecting bad mushrooms. The key finding was that ATD reversed the inhibitory effect of aversive Pavlovian stimuli on instrumental behaviour and did not modify the effect of the appetitively conditioned Pavlovian stimulus. The authors argue that these results provide evidence for a selective role of 5-HT in tying aversive expectations to behavioural inhibition, which can be seen to both support and extend the findings of Crockett et al. (2009) and Crockett et al. (2012b). However, these results only occurred in those whose first experience of the task was in the sham depletion condition. No PIT or ATD effect was seen in those who had the ATD first. The authors reject the possibility that 5-HT attenuated aversive processing mainly on the grounds that the instrumental responses were successfully learned but, against this, learning was less efficient and there was a specific passive avoidance deficit in the ATD condition. They also point out that ATD reversed rather than abolished PIT, such that ATD increased go responses to aversive CSs compared to sham depletion, rather than simply preventing the suppression of go responses. However, there is no obvious explanation for this reversal effect.

3.3.5. Aversive Pavlovian conditioning

Buchel and colleagues (Yacubian et al., 2006) have used temporal difference reinforcement learning (TDRL) modelling (see page 5) of fMRI BOLD responses to identify which brain regions signal prediction errors.
They found that unexpected losses (negative reward prediction errors) were encoded in the amygdala and not


as the inverse of positive reward prediction errors (RPEs) in the ventral striatum. The authors reasoned that punishing events are equivalent to negative RPEs and, drawing upon the work of Deakin and Graeff (1991) and Daw et al. (2002), that this punishment signalling might be mediated by serotonin projections to the amygdala. This hypothesis was strikingly corroborated in a Pavlovian fear conditioning experiment in which painful heat stimuli applied to the skin followed conditioned stimuli (Hindi Attar et al., 2012). TDRL modelling found that punishment prediction errors (PPEs) were signalled in amygdala BOLD responses, and that ATD attenuated these responses. Furthermore, ATD lessened skin conductance responses to the aversive conditioned stimuli (CSs), suggesting that PPE signalling is positively associated with aversive emotion processing. The findings indicate that the 5-HT innervation of the amygdala carries information about upcoming punishments and is a basis of aversive PPE learning. This seems incompatible with the Cools et al. (2008a) and Robinson et al. (2012) formulations that ATD, by reducing baseline 5-HT levels, caused punishment-induced 5-HT spikes to stand out and thus enhanced the ability to predict punishments in their studies. Rather, the fact that ATD effects were seen during non-reversal runs of punishment and not after reversals seems more compatible with the idea that ATD interfered with a 5-HT resilience mechanism, as the authors also proposed.
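The notion of a prediction error that underlies the TDRL analyses above can be illustrated with a minimal one-step value update. This is a simplified sketch (a Rescorla-Wagner-style special case of temporal difference learning), not the model fitted by Hindi Attar et al. (2012); the learning rate of 0.5 and the coding of the aversive outcome as −1.0 are assumptions chosen for illustration.

```python
def update_value(value, outcome, learning_rate=0.5):
    """One learning step: return (prediction_error, new_value).

    prediction_error = outcome - value; the cue value then moves a
    fraction (learning_rate) of the way towards the outcome.
    """
    prediction_error = outcome - value
    return prediction_error, value + learning_rate * prediction_error


# A cue is repeatedly followed by an aversive outcome, coded here as -1.0.
value = 0.0
errors = []
for _ in range(4):
    delta, value = update_value(value, outcome=-1.0)
    errors.append(delta)

print(errors, value)
```

In this sketch the negative prediction error is largest when the punishment is unexpected and shrinks as the cue comes to predict it, while the cue itself acquires a negative value. An attenuated PPE signal would slow the acquisition of that negative cue value.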

3.3.6. Summary and conclusions

(a) 5-HT is asymmetrically involved in punishment inhibition rather than reward effects. Specifically, two studies report that a normal bias to greater errors in predicting punishments than rewards is prevented by ATD (Cools et al., 2008a; Robinson et al., 2012). Two studies of punishment-induced slowing of RTs suggested that ATD reverses a Pavlovian component of punishment inhibition (Crockett et al., 2009, 2012b). This hypothesis survived a rigorous explicit test by Geurts et al. (2013), which showed that ATD blocked aversive PIT and not appetitive PIT.

(b) Aversive Pavlovian inhibition is mediated by 5-HT (see Geurts et al. (2013) above).

(c) Aversive Pavlovian conditioning may involve 5-HT release in amygdala (see Hindi Attar et al., 2012 above) and enhanced punishment processing. Cools et al. (2008a, 2011) proposed the novel hypothesis that phasic and tonic 5-HT release respectively mediate punishment prediction and behavioural inhibition, the latter decreasing “vigour” of responding and decreasing aversive processing. That ATD might increase the salience of phasic PPE signalling by reducing tonic 5-HT release is not borne out by the reduction in PPE BOLD observed after ATD in the Hindi Attar et al. (2012) imaging study.

(d) 5-HT is involved in a normal bias to make greater errors in predicting aversive outcomes than rewarded outcomes. This is compatible with theories that one role of 5-HT in punishment processing is to mediate resilience to aversion (Cools et al., 2008a; Robinson et al., 2012; Deakin and Graeff, 1991). ATD blocked the bias but, given (c) above, it is unlikely that PPE was improved. ATD may have impacted MRN projections associated with resilience and antidepressant effects (Deakin and Graeff, 1991; see Supplementary material) to prevent resilient disengagement with aversive cues, thus improving the ability to predict punishment. However, Cools and colleagues (Geurts et al., 2013) reinterpreted this bias towards punishment prediction errors as an inhibition of predicting an upcoming punishment, since correct responses were always followed by a punishment. Thus, ATD-mediated reversal of this bias is another instance of a disinhibition of 5-HT-mediated responding to punishments.

(e) Inhibitory avoidance of punishment (passive avoidance) probably involves 5-HT. ATD impaired passive avoidance in studies by Crockett et al. (2012b) and Geurts et al. (2013).

3.4. Inhibition and risk taking; probability and temporal discounting of reward (Table 3)

3.4.1. Probability, risk and gambling

The Cambridge gambling task (CGT; e.g. Rogers et al., 1999a,b; Clark et al., 2011) was designed to assess the interaction between risk and decision-making. Participants view a row of 10 boxes on the computer screen, red ones on one side and blue on the other, in ratios varying from 5:5 to 9:1. The task is to guess which colour conceals a yellow token and decide how many points to gamble. Performance measures include the quality of decision-making (betting on the more numerous colour), the willingness to ‘risk’ some of their already accumulated points, and deliberation times. Rogers et al. (1999a,b) reported that ATD participants chose the rational higher probability gambles less often than those who underwent sham depletion and took longer to do so. There was no effect on the amount gambled. These remarkable results were not replicated by Talbot et al. (2006), who found opposite effects using the same procedures; ATD increased the number of high probability gambles chosen, with no effects on speed or amount gambled. It is an empty speculation that there may have been a systematic difference in the groups. Nevertheless, it may be relevant that the mean age of the Talbot et al. (2006) participants was 6 years greater than in Rogers et al. (1999a,b), and there is much evidence that the effects of ATD depend on individual characteristics, as will be evident from this review. One or two more studies would be very desirable.

Rogers et al. (2003) developed the ‘Choice × Risk’ paradigm to understand their previous finding (Rogers et al., 1999a,b) by separating the influence of magnitude from probability of gains and losses on choice. Participants chose between a ‘control’ gamble in which there was a 50/50 chance of winning or losing 10 points, and an ‘experimental’ gamble in which there was either a 25% or 75% chance of winning either 80 or 20 points, and the opposite chance of losing either 80 or 20 points. ATD reduced discrimination between magnitudes of wins (but not losses or probabilities) compared to those who underwent sham depletion. The results are not incompatible with their previous result with the Cambridge gambling task in that ATD reduced betting on more probable gains. These results suggest that ATD can affect instrumental choice by impairing reward processing. However, Faulkner et al., in recent unpublished data, found no effects of ATD on the Choice × Risk task in 27 healthy volunteers.

Anderson et al. (2003) measured the trade-off between probability and size of imaginary wins on choice between schematic pairs of roulette wheels. Wheel (A) provided a smaller but nearly certain win (ranging between £1 and £10,000); wheel (B) provided a win 2.5 times that of the former, but at a probability that systematically varied from trial to trial. For each reward size on roulette ‘A’, the probability on roulette ‘B’ was systematically altered from ∼5% chance of winning to 100%, and the average of the points at which the participant switched from roulette ‘A’ to ‘B’ was noted as their ‘indifference point’. From this a probability discounting factor can be calculated. ATD had no effect upon participants’ indifference points or discounting.

Macoveanu et al. (2013b) compared the effect of ATD with that of a single dose of citalopram on BOLD responses to wins and losses according to their antecedent riskiness (odds against). Risk taking was encouraged by greater wins. There was no effect of either treatment on risk taking. When losses occurred after safe bets, ATD decreased activation in amygdala whereas citalopram increased it. The reverse pattern was seen in dorso-medial prefrontal cortex


Table 3
Effect of ATD on inhibition influenced by magnitude, probability and delay of reward.

| Study | Subjects, N, ratio (M:F) | Task | Effect of ATD |
|---|---|---|---|
| Rogers et al. (1999b) | HVs; 31, 15:16 | Find the key in 10 red or blue boxes | Decreased choice of more probable gain |
| Talbot et al. (2006) | HVs; 32, 16:16 | Find the key in 10 red or blue boxes | Increased choice of more probable gain |
| Rogers et al. (2003) | HVs; 36, 18:18 | Choice × Risk | Decreased win sensitivity |
| Macoveanu et al. (2013b) | HVs; 22, 14:8; within-subjects design; no sham depletion or placebo pill | Find the ace card game; high wins encourage risk; ATD, citalopram and no-drug sessions; fMRI | No effect on risk taking; decreased amygdala BOLD response to loss after safe bets; increased DMPFC BOLD response to loss after safe bets; no effect on win responses |
| Anderson et al. (2003) | HVs; 28, 17:11 | Imaginary win size vs probability gamble | No effect on indifference point |
| Campbell-Meiklejohn et al. (2011) | HVs; 34, 16:18 | Loss chasing game | Decreased loss chasing |
| Seymour et al. (2012) | HVs; 30, 30:0 | Four-arm bandit; outcomes: money, finger-shock, both or neither; TDRL modelling | Increased perseverative responding in face of dwindling returns; reduced influence of reward on choice |
| Cools et al. (2005) | HVs; 23, 23:0 | Cued reinforcement reaction time | Blocked incentive speeding OR blocked punishment-induced slowing |
| Robinson et al. (2010) | HVs; 33, 15:18 | Cued reinforcement reaction time | Blocked or reinstated incentive speeding depending on mood status |
| Roiser et al. (2006) | HVs; 30, 17:13; s/s vs −/l HTT | Cued reinforcement reaction time | Decreased incentive speeding in s/s HTT, not in −/l |
| Worbe et al. (2014) | HVs; 44, 17:27 | 4 choice serial reaction time task | Increased premature responses (impulsive action) |
| Crockett et al. (2012a) | HVs; 21, 7:14 | Information sampling | More willing to pay points for information sampling before deciding |
| Crean et al. (2002) | HVs; 40, 40:0; 50% FH alcoholism | Temporal discounting | No effect |
| Schweighofer et al. (2008) | HVs; 20, 20:0 | Temporal discounting | Increased discounting |
| Tanaka et al. (2007) | HVs; 12, 12:0 | Temporal discounting | No effect but modified BOLD responses |

Abbreviations: M, male; F, female; HVs, healthy volunteers; TDRL, temporal difference reinforcement learning.

where ATD increased loss-induced BOLD responses. The treatments had no effect when losses followed risky bets. Losses after safe bets were unexpected, and these PPEs were attenuated in left and right amygdala after ATD and increased by citalopram. This result fits well with evidence that 5-HT mediates aversive PPEs in amygdala (Hindi Attar et al., 2012; Deakin, 2013). Differential modulation in medial prefrontal cortex may represent initiation of top-down control of primary aversive processing. ATD had no effect on win-related activity.
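Returning to the roulette procedure of Anderson et al. (2003) described above, the step from an indifference point to a probability discounting factor can be sketched as follows. The source does not state which discounting function was used, so the hyperbolic odds-against form V = A / (1 + h·θ), with θ = (1 − p)/p, is an assumption here, as are the function names and the treatment of the ‘nearly certain’ win as fully certain.

```python
def odds_against(p):
    """Odds against winning for a win probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return (1 - p) / p


def discount_factor(indifference_probability, ratio=2.5):
    """Solve  small = ratio * small / (1 + h * theta)  for h.

    theta is the odds against winning on wheel B at the indifference
    point; ratio is how much larger the wheel-B win is than the certain
    wheel-A win (2.5 in the task described in the text).
    """
    theta = odds_against(indifference_probability)
    if theta == 0:
        raise ValueError("indifference at p = 1 gives no discounting estimate")
    return (ratio - 1) / theta


# Hypothetical participants who switch to wheel B at p = 0.5 and p = 0.25.
print(discount_factor(0.5), discount_factor(0.25))
```

Under this assumed model, a participant who switches to the larger but riskier wheel only at high win probabilities has a larger h, meaning that uncertain wins are devalued more steeply.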

3.4.2. Loss chasing

Chasing losses can lead to serious debt in problem gamblers. To understand its neural substrates, Campbell-Meiklejohn et al. (2011) investigated the effect of ATD in an experimental loss chasing paradigm. Participants were given a virtual £20,000 to gamble and losses of up to £160 occurred. They could recover the same amount by gambling but at the risk of increasing their losses by the same amount again. They could quit and accept such losses or continue to chase the loss. ATD reduced decisions to chase and reduced the number of gambles in a chase. The authors point out that loss chasing can be regarded as aversively motivated escape behaviour.

The reduction in loss-chasing after ATD would be compatible with a role of 5-HT in aversive motivation (Deakin and Graeff, 1991), and could also be interpreted as analogous to learned helplessness or impaired resilience induced by ATD in the putative MRN resilience system (Deakin and Graeff, 1991). However, the findings of Campbell-Meiklejohn et al. (2011) are hard to reconcile with a role of 5-HT in inhibition associated with non-reward or punishment (Crockett et al., 2009; Dayan and Huys, 2008; Soubrié, 1986), which would predict disinhibition of loss chasing after ATD.
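The double-or-quit structure of the loss chasing paradigm can be illustrated with a short calculation. This is only a sketch under stated assumptions: the win probability, the function names and the cap on the number of gambles are hypothetical and are not taken from Campbell-Meiklejohn et al. (2011). It assumes each gamble either recovers the whole current loss or doubles it.

```python
def prob_recover(p_win, max_gambles):
    """Probability of recovering the loss within max_gambles chases.

    Assumption (illustrative only): each gamble is won independently
    with probability p_win, and a single win cancels the current loss.
    """
    return 1.0 - (1.0 - p_win) ** max_gambles


def expected_final_loss(initial_loss, p_win, max_gambles):
    """Expected loss for a player who chases until a win or until max_gambles.

    Assumption (illustrative only): every lost gamble doubles the current
    loss, so a player who never wins ends with initial_loss * 2**max_gambles.
    """
    p_never_win = (1.0 - p_win) ** max_gambles
    loss_if_never_win = initial_loss * 2 ** max_gambles
    return p_never_win * loss_if_never_win


# Hypothetical example: an 80-unit loss chased for up to three gambles.
chance = prob_recover(0.5, 3)
expected = expected_final_loss(80, 0.5, 3)
```

With a fair gamble (p_win = 0.5) the expected final loss equals the initial loss, so under these assumptions chasing changes the spread of outcomes rather than their average. This is one way to see why the decision to chase may be read as a motivational rather than a purely economic choice.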

3.4.3. Four-arm bandit task

Seymour et al. (2012) designed a simultaneous reward and punishment learning task in which a temporal difference reinforcement learning (TDRL) model could be fitted to the behaviour to quantify the extent to which the occurrence of rewards and punishments controlled instrumental choice. Participants could gamble at any time on any of four ‘bandits’ that could deliver a 20p reward, a punishment (electric shock), neither, or both. Importantly, these bandits were independent of each other. Whilst each bandit would often produce an outcome similar to that of the previous trial, the probability of outcomes slowly changed over time (e.g. from being rewarding to becoming punishing in later trials). Thus participants had to constantly re-learn reward values of the bandits. Participants who underwent ATD showed more perseverative responding on a bandit as it produced less favourable outcomes. TDRL modelling indicated that ATD lessened the effect of rewards in discouraging switching but had no significant effect on the influence of punishment on choice. Neural representations of reward outcome value and reward prediction error were detected in small regions of ventromedial prefrontal cortex and putamen, and these were attenuated after ATD. Finally, no significant BOLD correlates of prediction errors were observed.

The authors discuss a number of possible mechanisms for the perseverative tendency after ATD. It is reminiscent of the remarkable perseveration of responding to the no-longer rewarded cue in reversal learning after 5-HT denervation of orbito-frontal cortex (OFC) in marmosets (Roberts, 2011). However, the explanation remains obscure. Some neurones in DRN, which may or may not have been 5-HT containing, appear to signal reward value (Bromberg-Martin et al., 2010) and in the absence of such signals there may be little motivation to switch in the marmosets or in the ATD treated humans. The attenuation of the influence of reward on behaviour is very novel but lacking in theoretical explanation. One possibility is that participants develop a degree of learned helplessness since there is no sustained optimal strategy and shocks continue to be delivered. Perhaps MRN 5-HT projections can disconnect cues from no longer adaptive responses as suggested by Deakin (1983) and Deakin and Graeff (1991) and developed by Boureau and Dayan (2011). This could be exacerbated by ATD and result in a degree of anhedonia. However, according to Deakin and Graeff this would be mediated by excessive aversive DRN inhibition of dopamine reward which ATD should attenuate; perhaps the DRN 5-HT-dopamine system might be partially resistant to ATD (see Supplementary material). Seymour et al. (2012) is an important study; evidence for a role of 5-HT in reward from this and another gambling study (Rogers et al., 2003), albeit non-replicated, does suggest the need for replication and theoretical formulations that can be tested in new ways.
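To make the modelling idea in the four-arm bandit task concrete, the following is a minimal sketch of a one-step prediction-error update with separate reward and punishment values per bandit, combined by a softmax choice rule. It is not the model fitted by Seymour et al. (2012): the function names, the learning rate and the weights reward_weight and punish_weight are hypothetical, introduced here only to show how a reduced influence of reward on choice could be expressed as a single parameter.

```python
import math


def td_update(value, outcome, learning_rate):
    """One-step prediction-error update of a bandit's learned value.

    prediction_error is the difference between the observed outcome
    (1 if the event occurred, 0 if not) and the current estimate.
    """
    prediction_error = outcome - value
    return value + learning_rate * prediction_error


def choice_probabilities(reward_values, punish_values,
                         reward_weight, punish_weight):
    """Softmax over net values: weighted reward minus weighted punishment.

    reward_weight and punish_weight are hypothetical parameters that scale
    how strongly each kind of outcome controls choice.
    """
    net = [reward_weight * r - punish_weight * p
           for r, p in zip(reward_values, punish_values)]
    largest = max(net)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in net]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical learned values for four bandits; bandit 0 is currently the
# most rewarding, and all four are equally likely to deliver a shock.
reward_values = [0.8, 0.2, 0.2, 0.2]
punish_values = [0.1, 0.1, 0.1, 0.1]

strong_reward_control = choice_probabilities(reward_values, punish_values, 3.0, 3.0)
weak_reward_control = choice_probabilities(reward_values, punish_values, 1.0, 3.0)
```

In this sketch, lowering reward_weight while leaving punish_weight unchanged makes choice less sensitive to differences in learned reward value, which is one simple way to represent the selective attenuation of reward influence described above.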

3.4.4. Cued-reinforcement reaction time task (CRRT)

To test the idea that anhedonia in depression reflects impaired 5-HT function and might mediate the effect of reward in the Rogers et al. (2003) gambling task, Cools et al. (2005) investigated whether ATD would block speeding of responses when a high-probability reward is signalled in the cued-reinforcement reaction time task (CRRT). Participants were rewarded for speedily identifying the odd one out of three stimuli. Coloured highlighting of the cue indicated whether that reward was 10, 50 or 90% likely. Highlighting caused a graded speeding of response maximal at 90%. No graded speeding was seen after ATD. Although reported as an abolition of incentive speeding, ATD in fact speeded responding on the 10% and 50% trials rather than blocking speeding on the 90% probability trial. Indeed Geurts et al. (2013), in the light of their decisive demonstration of a role of 5-HT in Pavlovian inhibition, suggested that ATD-induced speeding in the CRRT might have reflected attenuation of an inhibitory aversive Pavlovian influence on RTs in the 10 and 50% trials when there was a greater chance of failure. Equally, fear of failure may have been reduced by ATD and caused the speeding in the low-reward/high-punishment conditions. In a study by Robinson et al. (2010) ATD had opposite effects on incentive speeding depending on whether the female participants had positive or negative mood induction. Roiser et al. (2006) also reported that ATD lessened incentive speeding but only in those homozygous for the short version of the 5-HT transporter gene. However, in the absence of knowledge about synaptic 5-HT function in this HTT genotype, the ATD effect is hard to interpret.

Very recently, Worbe et al. (2014) found that ATD caused a substantial increase in premature responding in the human 4 choice serial reaction time test (4CSRTT) in which fast responses were rewarded with £1. There was no evidence that ATD slowed responses or caused inaccuracy as might be expected if 5-HT has a role in incentive speeding. There were no punishments, except the failure to win £1 through premature responding. This is a translational version of the rat 5 choice task in which impulsive responding is encouraged by food reward. Harrison et al. (1997a,b) reported that neurotoxin lesions of dorsal raphe neurones in rats caused speeding and more accurate responses and that this was mediated by disinhibition of dopamine.

3.4.5. Information sampling task

The information sampling task (Clark et al., 2006) allows participants to improve their chance of making a winning guess by peeping at (‘sampling’) elements of the answer before choosing. Less sampling is a measure of reflection impulsivity defined as hasty decision-making based upon insufficient information (Kagan, 1966). Sampling is partly motivated by fear of making a losing guess (‘global loss’) and this can be gauged by how much potential reward participants are prepared to sacrifice (‘local loss’) by paying to sample. Crockett et al. (2012c) used ATD to determine whether 5-HT is involved in responses to local or global loss. Participants were presented with 25 grey squares, under which were blue and yellow squares revealed by touching the screen. Participants could sample as many squares as they wished before deciding which colour they felt predominated. Participants won or lost 100 points for a correct or incorrect response. The task had two conditions: a ‘free’ condition, in which information sampling incurred no cost, and a ‘costly’ condition, in which sampling cost 10 points per square against potential winnings. In the costly condition participants sampled fewer squares before deciding, and this suppressive effect of small costs (−10 points) was lessened in the ATD condition. ATD did not affect free sampling. These results suggest that 5-HT increases avoidance of immediate aversive outcomes (costs of 10 points) rather than the more distal prospects of losing because of an incorrect response (costs of 100 points). The authors relate greater sampling in the low 5-HT condition to ruminating about possible aversive outcomes in depression. This work gives some support to Crockett et al. (2009) by indicating that 5-HT plays a crucial role at the intersection of both punishment and behavioural inhibition, linking aversive predictions (small costs) and behavioural inhibition in a way that is adaptive to contextual demands.

3.4.6. Temporal discounting

The control over behaviour exerted by reward decays with delay, which is termed temporal discounting of reward value. High rates of temporal discounting have been implicated in impulse control disorders and depression. The rate of discounting may be determined by fitting hyperbolic discount valuation models as taken from Mazur and Vaughan (1987): V = A/(1 + kD), where V is value, A is the magnitude of the delayed reward, D is the length of delay and k is the free parameter. A higher k value indicates a higher rate of discounting, while a ‘flat’ discounting rate (k = 0) implies no discounting at all, that is, choosing the more delayed reward no matter how long the wait. Mobini et al. (2000) reported that neurotoxin lesions of ascending 5-HT pathways increased discounting. Model fitting revealed the effect was selective for k and did not affect reward valuation. Doya (2002) proposed that 5-HT systems determine discounting rate and in a human imaging study Tanaka et al.
(2004) showed that long delay rewards engage DRN and dorsal striatal regions associated with habit strength whereas short delays engage ventral striatal Pavlovian incentive learning.

There have been few ATD studies of temporal discounting in humans. Doya and colleagues have carried out 3 studies using a similar paradigm in which participants could choose and experience fixed increments of delay (1.5 s) to gain a larger reward over a smaller immediate reward. The delays were close to those used in animal studies of temporal discounting. Schweighofer et al. (2008) found that ATD increased the number of small reward choices (5 vs 20 yen) and increased discounting when compared to sham depletion and tryptophan loading. In an fMRI version of the task no effects of ATD on performance were seen in either Tanaka et al. (2007) or Demoto et al. (2012), possibly because the number of trials was limited by the imaging, the main focus of the studies. However, ATD effects on discounting correlated with neuroticism in the Tanaka et al. (2007) study. The main result of this study was that ATD modified parameterised fMRI responses such that, at low levels of 5-HT, ventral striatum activation related to predicting immediate rewards, whereas after tryptophan loading, dorsal striatal activations related to delayed rewards. The result is highly compatible with the model that DRN projections restrain ventral Pavlovian incentive motivation allowing waiting for later larger rewards over small immediate rewards (Doya, 2002; Tanaka et al., 2004).

Other studies have examined the effect of ATD upon temporal discounting by using imaginary delays. Crean et al. (2002) administered a computer questionnaire temporal discounting paradigm in which indifference points were established between immediate monetary reward and greater delayed amounts. This enabled calculation of individual discounting parameters under ATD or sham depletion. Although the delays were imaginary, participants knew they would receive one of the immediate or delayed options they had chosen after the study. There was no effect of ATD on discounting parameters in participants with or without familial risk of alcoholism.
Using a similar paradigm and the same hyperbolic discounting valuation model (both reported in Pine et al., 2009), Faulkner et al. (unpublished data) also found no effect of ATD on k in healthy volunteers and neither did Worbe et al. (2014). Finally, Tanaka et al. (2009) addressed how the behaviours that produce delayed rewards are learned, and whether serotonin has a role in the retrospective association of past actions with current outcomes. In brief they report that ATD impaired the ability to learn from delayed punishments but not the ability to learn from either delayed rewards or immediate punishments. They speculate that 5-HT projections from the median raphe may determine the time scale over which aversive outcomes are retrospectively associated with antecedent actions while dorsal raphe projections mediate already learned waiting behaviour.
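The hyperbolic valuation model given in Section 3.4.6, V = A/(1 + kD), and the use of indifference points to estimate k can be illustrated with a short sketch. The function names and the example delay are hypothetical and chosen only for illustration; they do not reproduce the fitting procedures of any of the studies reviewed. The 5 and 20 unit amounts echo the small and large rewards mentioned above.

```python
def discounted_value(amount, delay, k):
    """Hyperbolic discounted value V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def k_from_indifference(immediate_amount, delayed_amount, delay):
    """Discount rate k implied by a single indifference point.

    At indifference the immediate amount equals the discounted value of
    the delayed amount: immediate = delayed / (1 + k * delay).
    Solving for k gives k = (delayed / immediate - 1) / delay.
    """
    return (delayed_amount / immediate_amount - 1.0) / delay


# Hypothetical example: a participant is indifferent between 5 units now
# and 20 units after a delay of 3 time units.
k = k_from_indifference(5, 20, 3)
value_of_delayed = discounted_value(20, 3, k)
value_without_discounting = discounted_value(20, 3, 0.0)
```

Here the implied k is 1.0, at which the delayed reward is worth exactly the immediate one; with k = 0 the delayed reward keeps its full value of 20 regardless of delay. A larger k estimated under ATD than under sham depletion would correspond to the increased discounting reported by Schweighofer et al. (2008).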

3.4.7. Summary and conclusions

(a) 5-HT does not appear to have a major role in probability discounting to release risky choice behaviour on the basis of human ATD studies. This may reflect the lack of potency of ATD but there is little experimental evidence in animals to make this a strong expectation.

(b) 5-HT is likely to have a role in maintaining behavioural inhibition in the face of delayed reward. Only one study has reported evidence of accelerated temporal discounting after ATD (Schweighofer et al., 2008). However, a strong theoretical account backed by several imaging and animal studies suggests that DRN projections restrain Pavlovian incentive mechanisms, which otherwise attract behaviour to immediate rewards, and sustain inhibition for delayed larger rewards (Doya, 2002; Tanaka et al., 2004, 2007). This account is highly compatible with the disinhibitory effect of ATD in the 4CSRTT (Worbe et al., 2014) mentioned earlier, which paralleled the effect of DRN lesions in animals.

(c) Waiting to obtain a reward may involve similar 5-HT processes to passive avoidance. The evidence that 5-HT mediates punishment inhibition (Section 3.2) has led to suggestions that the same 5-HT systems are engaged not only in waiting for rewards but also in waiting (delaying responding) to avoid punishment, i.e. passive avoidance (Miyazaki et al., 2012). Similarly, Cools et al. (2011) suggest that tonic levels of 5-HT oppose dopamine to reduce vigour (punishment inhibition), reduce punishment salience (resilience), and promote waiting (less discounting).

(d) 5-HT may have a role in reward processing but there are no replicated ATD findings. Several studies (Finger et al., 2007; Rogers et al., 2003; Cools et al., 2005; Robinson et al., 2010; Roiser et al., 2006; Seymour et al., 2012) find evidence to suggest 5-HT may have a role in facilitating the control exerted over behaviour by reward. However, the findings are isolated (4-arm bandit; Seymour et al., 2012) or not replicated (Gambling; Rogers et al., 2003) or have a more likely explanation in terms of 5-HT mediated Pavlovian inhibition (CRRT; Cools et al., 2005; Robinson et al., 2010; Roiser et al., 2006). A theory for a role of 5-HT in reward has yet to emerge. The positive results suggest a mild loss of reward sensitivity. We speculate that ATD may induce mild anhedonia by reducing function particularly in MRN resilience/antidepressant projections while relatively sparing a degree of DRN restraint of dopamine incentive.

4. Conclusions

4.1. Lack of effect of ATD on action inhibition

It is remarkable that ATD seems only to modify behavioural inhibition when behaviour is motivated by punishments and rewards, and that the effects are selective for inhibition associated with punishment. The specificity is particularly noteworthy given that the effects of aversion that are affected by ATD are often induced by a loss of rewards presented at other times in the same experiment in which the effects of rewards are not affected by ATD. In the 12 studies in Table 1 in which cues signal non-reinforced action inhibition, ATD had few convincing disinhibitory effects. It seems likely that 5-HT does not mediate action-cancellation inhibition in the SST since potent 5-HT manipulations in animal SSTs also have no effect on this form of inhibition (Eagle et al., 2008). The lack of effect of ATD in the other non-reinforced tasks may mean that 5-HT is not tonically active in restraining behaviour where there is no strong pressure for restraint. However, since all behaviour is motivated it seems more likely that ATD is too mild to disturb low-level 5-HT influences.

4.2. 5-HT and aversive Pavlovian inhibition

In the 9 studies in Table 2, correct and incorrect responses were variably rewarded or punished. In 7 studies, ATD modulated punishment effects and 6 were essentially replications of 3 effects. First, 5-HT clearly has an important role in mediating aversively conditioned Pavlovian inhibition (Crockett et al., 2012b; Geurts et al., 2013; Hindi Attar et al., 2012). Aversive 5-HT Pavlovian inhibition is also a likely substrate for disinhibitory effects of ATD on the cued reinforcement reaction time task (CRRT). Second, 5-HT contributes to inhibition of punished instrumental choice (Crockett et al., 2012b; Geurts et al., 2013). Third, ATD reversed the normal bias to make more errors in predicting punishments than in predicting rewards (Cools et al., 2008a,b; Robinson et al., 2012). However, this effect is probably an instance of the second effect since the bias is probably due to punishment-induced suppression of responding (see Section 3.3).

4.3. 5-HT, inhibition and aversion

Recent computational theories suggest that 5-HT does not mediate all effects of punishment or all forms of inhibition but rather ‘ties’ the two together (Cools et al., 2011; Boureau and Dayan, 2011). They propose that punishment-linked behavioural inhibition (loss of “vigour”) is mediated by tonic 5-HT, which also inhibits central processing of aversive information and thus improves resilience. This was based partly on imaging studies that found ATD increased emotion processing (Cools et al., 2008b). However, the effects of 5-HT manipulations on emotion processing are inconsistent, as reviewed by Elliott et al. (2011). Preservation of instrumental avoidance after ATD was attributed to relative preservation of phasic 5-HT release signalling punishment (Crockett et al., 2009). However, this account was contradicted by subsequent findings of disinhibition of punished instrumental choice by ATD and most directly by the ability of ATD to block punishment prediction signalling in amygdala (Hindi Attar et al., 2012). Where effects of ATD have been found they are generally compatible with the idea that cues or responses that predict punishment can acquire the ability to release 5-HT, which mediates aversive Pavlovian and instrumental inhibition of behaviour.

4.4. Panic disorder in the context of 5-HT and punishment

Several findings discussed and summarised above are compatible with the punishment-related functions ascribed to 2 of the 3 main DRN 5-HT subsystems in Section 1.2. i-iii, recapitulated as follows.
(i) DRN 5-HT projections that modulate dopamine function have been implicated in inhibition and restraint of premature responding to successfully obtain rewards. ATD disinhibited responding when fast responses are rewarded (see Table 3) as in the CRRT and 4CSRTT, and when waiting for large rewards is required (delay discounting). The aversive motivation is the failure to obtain available rewards. 5-HT-DA interaction may also be the substrate for ATD interference with Pavlovian inhibition (Table 2).
(ii) 5-HT amygdala projections appear to signal punishment and mediate punishment prediction error learning of Pavlovian aversion (Hindi Attar et al., 2012).
(iii) DRN projections to PAG restrain panic/escape in anticipation of threat in animals. The human translation of this idea – that 5-HT restrains panic via a physiological defence mechanism – is encouraged by the fact that ATD experiments implicate 5-HT systems in the motivational effects of punishment. Thus the efficacy of serotonin selective reuptake inhibitors (SSRIs) in panic disorder may be mediated by increasing PAG 5-HT release, bolstering a physiological restraint mechanism as Deakin and Graeff (1991) proposed. However, the question then arises: why do SSRIs not exacerbate anticipatory 5-HT defences and promote inhibition, fear and avoidance? A brief answer is that this can happen early in treatment and that down-regulation of 5-HT2C receptors counteracts the morbidly overactive DRN projections. This controversial issue is discussed further in Deakin (2013).

4.5. Outstanding issues

One difficulty in interpreting the effects of ATD is that different 5-HT systems have different functions and may vary in their sensitivity to ATD and their engagement by particular experimental tasks. SSRIs are effective antidepressants and this implies that some aspects of 5-HT function are concerned with resilience to punishment and responsiveness to rewards (evidence of the latter is discussed in Section 3.4). Some 5-HT projections may enhance primary processing of punishment while others engage top-down modulatory effects which mediate adaptive (Cools et al., 2008b) or amplifying changes (e.g. Robinson et al., 2013). Deakin and Graeff (1991) proposed that MRN projections to 5-HT1A receptors mediate resilience by interfering with rehearsal of aversive events in hippocampus and by disengaging acute DRN responses to aversive events. We know from the subsequent work of Maier and Watkins (2005) that overactive DRN projections in learned helplessness are turned off by inhibitory projections from prelimbic cortex when shock becomes escapable. We suggest that MRN projections may bring this descending system into play, perhaps initiated by interactions between DRN and MRN in the brain stem described by Forster et al. (2006, 2008). The general point is that much more needs to be known about how different 5-HT subsystems modulate and engage top-down control of aversion. Human imaging studies and non-human primate behavioural studies are tackling the issue (e.g. Pezawas et al., 2005; Robinson et al., 2013; Roberts, 2011). The reciprocal effects of 5-HT-mediated punishment processing in amygdala and medial prefrontal cortex were strikingly visualised in Macoveanu et al. (2013b). Such studies are essential because of the limited development of rodent prefrontal cortex. More generally there is much to understand about the role of 5-HT in aversive processes and secondary adaptations, especially in the social domain, such as resilience, self-esteem and social perception. Further studies using ATD will have a place but it will be important to use drugs with specific 5-HT receptor affinities to isolate the role of different 5-HT subsystems. While much has been learned from the use of ATD to experimentally manipulate 5-HT function, a critical issue is to identify what social, environmental and internal stimuli activate different 5-HT projections. For example, some 5-HT systems concerned with resilience may be sensitive to social-affiliative rewards (Deakin, 1996). Answers may be possible from fMRI studies that can visualise raphe sub-nuclei and their dynamic connectivity with forebrain structures in humans (Salomon et al., 2011), and from advances in imaging neuronal function in experimental animals in vivo.

Appendix A.
Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.neubiorev.2014.07.024.

References

Anderson, I.M., Richell, R.A., Bradshaw, C.M., 2003. The effect of acute tryptophan depletion on probabilistic choice. J. Psychopharmacol. (Oxf.) 17, 3–7. Angoa-Perez, M., Kane, M.J., Briggs, D.I., Sykes, C.E., Shah, M.M., Francescutti, D.M., Rosenberg, D.R., Thomas, D.M., Kuhn, D.M., 2012. Genetic depletion of brain 5HT reveals a common molecular pathway mediating compulsivity and impulsivity. J. Neurochem. 121, 974–984. Bizot, J.C., Thiébot, M.H., Le Bihan, C., Soubrié, P., Simon, P., 1988. Effects of imipramine-like drugs and serotonin uptake blockers on delay of reward in rats. Possible implication in the behavioral mechanism of action of antidepressants. J. Pharmacol. Exp. Ther. 246, 1144–1151. Boureau, Y.-L., Dayan, P., 2011. Opponency revisited: competition and cooperation between dopamine and serotonin. Neuropsychopharmacology 36, 74–97. Bromberg-Martin, E.S., Hikosaka, O., Nakamura, K., 2010. Coding of task reward value in the dorsal raphe nucleus. J. Neurosci. 30, 6262–6272. Campbell-Meiklejohn, D., Wakeley, J., Herbert, V., Cook, J., Scollo, P., Ray, M.K., Selvaraj, S., Passingham, R.E., Cowen, P., Rogers, R.D., 2011. Serotonin and dopamine play complementary roles in gambling to recover losses. Neuropsychopharmacology 36, 402–410. Clark, L., Roiser, J.P., Cools, R., Rubinsztein, D.C., Sahakian, B.J., Robbins, T.W., 2005. Stop signal response inhibition is not modulated by tryptophan depletion or the serotonin transporter polymorphism in healthy volunteers: implications for the 5-HT theory of impulsivity. Psychopharmacology (Berl.) 182, 570–578. Clark, L., Robbins, T.W., Ersche, K.D., Sahakian, B.J., 2006. Reflection impulsivity in current and former substance users. Biol. Psychiatry 60, 515–522.
Clark, L., Dombrovski, A.Y., Siegle, G.J., Butters, M.A., Shollenberger, C.L., Sahakian, B.J., Szanto, K., 2011. Impairment in risk-sensitive decision-making in older suicide attempters with depression. Psychol. Aging 26, 321–330. Cools, R., Blackwell, A., Clark, L., Menzies, L., Cox, S., Robbins, T.W., 2005. Tryptophan depletion disrupts the motivational guidance of goal-directed behavior as a function of trait impulsivity. Neuropsychopharmacology 30, 1362–1373. Cools, R., Robinson, O.J., Sahakian, B., 2008a. Acute tryptophan depletion in healthy volunteers enhances punishment prediction but does not affect reward prediction. Neuropsychopharmacology 33, 2291–2299.


Cools, R., Roberts, A.C., Robbins, T.W., 2008b. Serotoninergic regulation of emotional and behavioural control processes. Trends Cogn. Sci. 12, 31–40. Cools, R., Nakamura, K., Daw, N.D., 2011. Serotonin and dopamine: unifying affective, activational, and decision functions. Neuropsychopharmacology 36, 98–113. Crean, J., Richards, J.B., de Wit, H., 2002. Effect of tryptophan depletion on impulsive behavior in men with or without a family history of alcoholism. Behav. Brain Res. 136, 349–357. Crockett, M.J., Clark, L., Robbins, T.W., 2009. Reconciling the role of serotonin in behavioral inhibition and aversion: acute tryptophan depletion abolishes punishment-induced inhibition in humans. J. Neurosci. 29, 11993–11999. Crockett, M.J., Clark, L., Roiser, J.P., Robinson, O.J., Cools, R., Chase, H.W., Ouden, H.D., Apergis-Schoute, A., Campbell-Meiklejohn, D., Seymour, B., Sahakian, B.J., Rogers, R.D., Robbins, T.W., 2012a. Converging evidence for central 5-HT effects in acute tryptophan depletion. Mol. Psychiatry 17, 121–123. Crockett, M.J., Clark, L., Apergis-Schoute, A.M., Morein-Zamir, S., Robbins, T.W., 2012b. Serotonin modulates the effects of Pavlovian aversive predictions on response vigor. Neuropsychopharmacology 37, 2244–2252. Crockett, M.J., Clark, L., Smillie, L.D., Robbins, T.W., 2012c. The effects of acute tryptophan depletion on costly information sampling: impulsivity or aversive processing? Psychopharmacology (Berl.) 219, 587–597. Dalley, J.W., Roiser, J.P., 2012. Dopamine, serotonin and impulsivity. Neuroscience 215, 42–58. Daw, N.D., Kakade, S., Dayan, P., 2002. Opponent interactions between serotonin and dopamine. Neural Netw. 15, 603–616. Dayan, P., Huys, Q.J.M., 2008. Serotonin, inhibition, and negative mood. PLoS Comput. Biol. 4, e4. Deakin, J.F.W., 1983. Roles of serotonergic systems in escape, avoidance and other behaviours. In: Cooper, S. (Ed.), Theories in Psychopharmacology. Academic Press, London/New York, pp. 179–204. Deakin, J.F., 1996. 
5-HT, antidepressant drugs and the psychosocial origins of depression. J. Psychopharmacol. 10, 31–38. Deakin, J.F., 2003. Depression and antisocial personality disorders: two contrasting disorders of 5HT function. J. Neural Transm. Suppl. 64, 79–93. Deakin, J.F., 2013. The origins of ‘5-HT and mechanisms of defence’ by Deakin and Graeff: a personal perspective. J. Psychopharmacol. 27, 1084–1089. Deakin, J.F., Graeff, F.G., 1991. 5-HT and mechanisms of defence. J. Psychopharmacol. 5, 305–315. Demoto, Y., Okada, G., Okamoto, Y., Kunisato, Y., Aoyama, S., Onoda, K., Munakata, A., Nomura, M., Tanaka, S.C., Schweighofer, N., Doya, K., Yamawaki, S., 2012. Neural and personality correlates of individual differences related to the effects of acute tryptophan depletion on future reward evaluation. Neuropsychobiology 65, 55–64. den Ouden, H.E.M., Daw, N.D., Fernandez, G., Elshout, J.A., Rijpkema, M., Hoogman, M., Franke, B., Cools, R., 2013. Dissociable effects of dopamine and serotonin on reversal learning. Neuron 80, 1090–1100. Doya, K., 2002. Metalearning and neuromodulation. Neural Netw. 15, 495–506. Eagle, D.M., Bari, A., Robbins, T.W., 2008. The neuropsychopharmacology of action inhibition: cross-species translation of the stop-signal and go/no-go tasks. Psychopharmacology (Berl.) 199, 439–456. Eagle, D.M., Lehmann, O., Theobald, D.E., Pena, Y., Zakaria, R., Ghosh, R., Dalley, J.W., Robbins, T.W., 2009. Serotonin depletion impairs waiting but not stop-signal reaction time in rats: implications for theories of the role of 5-HT in behavioral inhibition. Neuropsychopharmacol 34, 1311–1321. Eagle, D.M., Baunez, C., 2010. Is there an inhibitory-response-control system in the rat? Evidence from anatomical and pharmacological studies of behavioral inhibition. Neurosci. Biobehav. Rev. 34, 50–72. Elliott, R., Zahn, R., Deakin, J.F., Anderson, I.M., 2011. Affective cognition and its disruption in mood disorders. Neuropsychopharmacology 36, 153–182. Evenden, J.L., 1999. 
Varieties of impulsivity. Psychopharmacology (Berl.) 146, 348–361. Evers, E.A.T., Cools, R., Clark, L., van der Veen, F.M., Jolles, J., Sahakian, B.J., Robbins, T.W., 2005. Serotonergic modulation of prefrontal cortex during negative feedback in probabilistic reversal learning. Neuropsychopharmacology 30, 1138–1147. Evers, E.A.T., van der Veen, F.M., van Deursen, J.A., Schmitt, J.A.J., Deutz, N.E.P., Jolles, J., 2006. The effect of acute tryptophan depletion on the BOLD response during performance monitoring and response inhibition in healthy male volunteers. Psychopharmacology (Berl.) 187, 200–208. Finger, E.C., Marsh, A.A., Buzas, B., Kamel, N., Rhodes, R., Vythilingham, M., Pine, D.S., Goldman, D., Blair, J.R., 2007. The impact of tryptophan depletion and 5-HTTLPR genotype on passive avoidance and response reversal instrumental learning tasks. Neuropsychopharmacology 32, 206–215. Forster, G.L., Feng, N., Watt, M.J., Korzan, W.J., Mouw, N.J., Summers, C.H., Renner, K.J., 2006. Corticotropin-releasing factor in the dorsal raphe elicits temporally distinct serotonergic responses in the limbic system in relation to fear behaviour. Neuroscience 141, 1047–1055. Forster, G.L., Pringle, R.B., Mouw, N.J., Vuong, S.M., Watt, M.J., Burke, A.R., Lowry, C.A., Summers, C.H., Renner, K.J., 2008. Corticotropin-releasing factor in the dorsal raphe nucleus increases medial prefrontal cortical serotonin via type 2 receptors and median raphe nucleus activity. Eur. J. Neurosci. 28, 299–310. Geurts, D.E.M., Huys, Q.J.M., den Ouden, H.E.M., Cools, R., 2013. Serotonin and aversive Pavlovian control of instrumental behavior in humans. J. Neurosci. 33, 18932–18939. Graeff, F.G., Del-Ben, C.M., 2008. Neurobiology of panic disorder: from animal models to brain neuroimaging. Neurosci. Biobehav. Rev. 32, 1326–1335.


Graeff, F.G., Schoenfeld, R.I., 1970. Tryptaminergic mechanisms in punished and nonpunished behavior. J. Pharmacol. Exp. Ther. 173, 277–283.
Harrison, A.A., Everitt, B.J., Robbins, T.W., 1997a. Central 5-HT depletion enhances impulsive responding without affecting the accuracy of attentional performance: interactions with dopaminergic mechanisms. Psychopharmacology (Berl.) 133, 329–342.
Harrison, A.A., Everitt, B.J., Robbins, T.W., 1997b. Doubly dissociable effects of median- and dorsal-raphe lesions on the performance of the five-choice serial reaction time test of attention in rats. Behav. Brain Res. 89, 135–149.
Harrison, A.A., Everitt, B.J., Robbins, T.W., 1999. Central serotonin depletion impairs both the acquisition and performance of a symmetrically reinforced go/no-go conditional visual discrimination. Behav. Brain Res. 100, 99–112.
Harvey, J.A., Schlosberg, A.J., Yunger, L.M., 1975. Behavioral correlates of serotonin depletion. Fed. Proc. 34, 1796–1801.
Hindi Attar, C., Finckh, B., Büchel, C., 2012. The influence of serotonin on fear learning. PLoS ONE 7, e42397.
Kagan, J., 1966. Reflection-impulsivity: the generality and dynamics of conceptual tempo. J. Abnorm. Psychol. 71, 17–24.
LeMarquand, D.G., Pihl, R.O., Young, S.N., Tremblay, R.E., Séguin, J.R., Palmour, R.M., Benkelfat, C., 1998. Tryptophan depletion, executive functions, and disinhibition in aggressive, adolescent males. Neuropsychopharmacology 19, 333–341.
LeMarquand, D.G., Benkelfat, C., Pihl, R.O., Palmour, R.M., Young, S.N., 1999. Behavioral disinhibition induced by tryptophan depletion in nonalcoholic young men with multigenerational family histories of paternal alcoholism. Am. J. Psychiatry 156, 1771–1779.
Linnoila, M., Virkkunen, M., Scheinin, M., Nuutila, A., Rimon, R., Goodwin, F.K., 1983. Low cerebrospinal fluid 5-hydroxyindoleacetic acid concentration differentiates impulsive from nonimpulsive violent behaviour. Life Sci. 33, 2609–2614.
Lowry, C.A., 2002. Functional subsets of serotonergic neurones: implications for control of the hypothalamic-pituitary-adrenal axis. J. Neuroendocrinol. 14, 911–923.
Macoveanu, J., Hornboll, B., Elliott, R., Erritzoe, D., Paulson, O.B., Siebner, H., Knudsen, G.M., Rowe, J.B., 2013a. Serotonin 2A receptors, citalopram and tryptophan depletion: a multimodal imaging study of their interactions during response inhibition. Neuropsychopharmacology 38, 996–1005.
Macoveanu, J., Rowe, J.B., Hornboll, B., Elliott, R., Paulson, O.B., Knudsen, G.M., Siebner, H.R., 2013b. Playing it safe but losing anyway – serotonergic signalling of negative outcomes in dorsomedial prefrontal cortex in the context of risk-aversion. Eur. Neuropsychopharmacol. 23, 919–930.
Maier, S.F., Watkins, L.R., 2005. Stressor controllability and learned helplessness: the roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor. Neurosci. Biobehav. Rev. 29, 829–841.
Mazur, J.E., Vaughan Jr., W., 1987. Molar optimization versus delayed reinforcement as explanations of choice between fixed-ratio and progressive-ratio schedules. J. Exp. Anal. Behav. 48, 251–261.
McNaughton, N., Corr, P.J., 2004. A two-dimensional view of defensive systems: defensive distance and fear/anxiety. Neurosci. Biobehav. Rev. 28, 285–305.
McNaughton, N., Swart, C., Neo, P.S.H., Bates, V., Glue, P., 2013. Anti-anxiety drugs reduce conflict-specific “theta” - a possible human anxiety-specific biomarker. J. Affect. Disorders 148, 104–111.
Mendelsohn, D., Riedel, W.J., Sambeth, A., 2009. Effects of acute tryptophan depletion on memory, attention and executive functions: a systematic review. Neurosci. Biobehav. Rev. 33, 926–952.
Miyazaki, K., Miyazaki, K.W., Doya, K., 2012. The role of serotonin in the regulation of patience and impulsivity. Mol. Neurobiol. 45, 213–224.
Miller, H.E., Deakin, J.F., Anderson, I.M., 2000. Effect of acute tryptophan depletion on CO2-induced anxiety in patients with panic disorder and normal volunteers. Br. J. Psychiatry 176, 182–188.
Mobbs, D., Petrovic, P., Marchant, J.L., Hassabis, D., Weiskopf, N., Seymour, B., Dolan, R.J., Frith, C.D., 2007. When fear is near: threat imminence elicits prefrontal-periaqueductal gray shifts in humans. Science 317, 1079–1083.
Mortimore, C., Anderson, I.M., 2000. d-Fenfluramine in panic disorder: a dual role for 5-hydroxytryptamine. Psychopharmacology (Berl.) 149, 251–258.
Mobini, S., Chiang, T.J., Al-Ruwaitea, A.S., Ho, M.Y., Bradshaw, C.M., Szabadi, E., 2000. Effect of central 5-hydroxytryptamine depletion on inter-temporal choice: a quantitative analysis. Psychopharmacology (Berl.) 149, 313–318.
Murphy, F.C., Smith, K.A., Cowen, P.J., Robbins, T.W., Sahakian, B.J., 2002. The effects of tryptophan depletion on cognitive and affective processing in healthy volunteers. Psychopharmacology (Berl.) 163, 42–53.
Paul, E.D., Lowry, C.A., 2013. Functional topography of serotonergic systems supports the Deakin/Graeff hypothesis of anxiety and affective disorders. J. Psychopharmacol. 27, 1090–1106.
Paul, E.D., Johnson, P.L., Shekhar, A., Lowry, C.A., 2014. The Deakin/Graeff hypothesis: Focus on serotonergic inhibition of panic. Neurosci. Biobehav. Rev. [Epub ahead of print].
Pezawas, L., Meyer-Lindenberg, A., Drabant, E.M., Verchinski, B.A., Munoz, K.E., Kolachana, B.S., Egan, M.F., Mattay, V.S., Hariri, A.R., Weinberger, D.R., 2005. 5-HTTLPR polymorphism impacts human cingulate–amygdala interactions: a genetic susceptibility mechanism for depression. Nat. Neurosci. 8, 828–834.
Pine, A., Seymour, B., Roiser, J.P., Bossaerts, P., Friston, K.J., Curran, H.V., Dolan, R.J., 2009. Encoding of marginal utility across time in the human brain. J. Neurosci. 29, 9575–9581.
Roberts, A.C., 2011. The importance of serotonin for orbitofrontal function. Biol. Psychiatry 69, 1185–1191.



Robinson, O.J., Cools, R., Crockett, M.J., Sahakian, B.J., 2010. Mood state moderates the role of serotonin in cognitive biases. J. Psychopharmacol. 24, 573–583.
Robinson, O.J., Cools, R., Sahakian, B.J., 2012. Tryptophan depletion disinhibits punishment but not reward prediction: implications for resilience. Psychopharmacology (Berl.) 219, 599–605.
Robinson, O.J., Overstreet, C., Allen, P.S., Letkiewicz, A., Vytal, K., Pine, D.S., Grillon, C., 2013. The role of serotonin in the neurocircuitry of negative affective bias: serotonergic modulation of the dorsal medial prefrontal-amygdala ‘aversive amplification’ circuit. Neuroimage 78, 217–223.
Rocha, B.A., Scearce-Levie, K., Lucas, J.J., Hiroi, N., Castanon, N., Crabbe, J.C., Nestler, E.J., Hen, R., 1998. Increased vulnerability to cocaine in mice lacking the serotonin-1B receptor. Nature 393, 175–178.
Rogers, R.D., Blackshaw, A.J., Middleton, H.C., Matthews, K., Hawtin, K., Crowley, C., Hopwood, A., Wallace, C., Deakin, J.F., Sahakian, B.J., Robbins, T.W., 1999a. Tryptophan depletion impairs stimulus-reward learning while methylphenidate disrupts attentional control in healthy young adults: implications for the monoaminergic basis of impulsive behaviour. Psychopharmacology (Berl.) 146, 482–491.
Rogers, R.D., Everitt, B.J., Baldacchino, A., Blackshaw, A.J., Swainson, R., Wynne, K., Baker, N.B., Hunter, J., Carthy, T., Booker, E., London, M., Deakin, J.F., Sahakian, B.J., Robbins, T.W., 1999b. Dissociable deficits in the decision-making cognition of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology 20, 322–339.
Rogers, R.D., Tunbridge, E.M., Bhagwagar, Z., Drevets, W.C., Sahakian, B.J., Carter, C.S., 2003. Tryptophan depletion alters the decision-making of healthy volunteers through altered processing of reward cues. Neuropsychopharmacology 28, 153–162.
Roiser, J.P., Blackwell, A.D., Cools, R., Clark, L., Rubinsztein, D.C., Robbins, T.W., Sahakian, B.J., 2006. Serotonin transporter polymorphism mediates vulnerability to loss of incentive motivation following acute tryptophan depletion. Neuropsychopharmacology 31, 2264–2272.
Roiser, J.P., Elliott, R., Sahakian, B.J., 2012. Cognitive mechanisms of treatment in depression. Neuropsychopharmacology 37, 117–136.
Rosvold, H.E., Delgado, J.M., 1956. The effect on delayed-alternation test performance of stimulating or destroying electrically structures within the frontal lobes of the monkey’s brain. J. Comp. Physiol. Psychol. 49, 365–372.
Rubia, K., Lee, F., Cleare, A.J., Tunstall, N., Fu, C.H.Y., Brammer, M., McGuire, P., 2005. Tryptophan depletion reduces right inferior prefrontal activation during response inhibition in fast, event-related fMRI. Psychopharmacology (Berl.) 179, 791–803.
Rylands, A.J., Hinz, R., Jones, M., Holmes, S.E., Feldmann, M., Brown, G., McMahon, A.W., Talbot, P.S., 2013. Pre- and postsynaptic serotonergic differences in males with extreme levels of impulsive aggression without callous unemotional traits: a positron emission tomography study using (11)C-DASB and (11)C-MDL100907. Biol. Psychiatry 72, 1004–1011.
Sahakian, B.J., Owen, A.M., 1992. Computerized assessment in neuropsychiatry using CANTAB: discussion paper. J. R. Soc. Med. 85, 399–402.
Salomon, R.M., Cowan, R.L., Rogers, B.P., Dietrich, M.S., Bauernfeind, A.L., Kessler, R.M., Gore, J.C., 2011. Time series fMRI measures detect changes in pontine raphe following acute tryptophan depletion. Psychiatry Res. 191, 112–121.
Schütz, M.T., de Aguiar, J.C., Graeff, F.G., 1985. Anti-aversive role of serotonin in the dorsal periaqueductal grey matter. Psychopharmacology (Berl.) 85, 340–345.

Schultz, W., Dayan, P., Montague, P.R., 1997. A neural substrate of prediction and reward. Science 275, 1593–1599.
Schweighofer, N., Bertin, M., Shishida, K., Okamoto, Y., Tanaka, S.C., Yamawaki, S., Doya, K., 2008. Low-serotonin levels increase delayed reward discounting in humans. J. Neurosci. 28, 4528–4532.
Seymour, B., Daw, N.D., Roiser, J.P., Dayan, P., Dolan, R., 2012. Serotonin selectively modulates reward value in human decision-making. J. Neurosci. 32, 5833–5842.
Soubrié, P., 1986. Serotonergic neurons and behavior. J. Pharmacol. 17, 107–112.
Talbot, P.S., Watson, D.R., Barrett, S.L., Cooper, S.J., 2006. Rapid tryptophan depletion improves decision-making cognition in healthy humans without affecting reversal learning or set shifting. Neuropsychopharmacology 31, 1519–1525.
Tanaka, S.C., Doya, K., Okada, G., Ueda, K., Okamoto, Y., Yamawaki, S., 2004. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat. Neurosci. 7, 887–893.
Tanaka, S.C., Schweighofer, N., Asahi, S., Shishida, K., Okamoto, Y., Yamawaki, S., Doya, K., 2007. Serotonin differentially regulates short- and long-term prediction of rewards in the ventral and dorsal striatum. PLoS ONE 2, e1333.
Tanaka, S.C., Shishida, K., Schweighofer, N., Okamoto, Y., Yamawaki, S., Doya, K., 2009. Serotonin affects association of aversive outcomes to past actions. J. Neurosci. 29, 15669–15674.
Tuescher, O., Protopopescu, X., Pan, H., Cloitre, M., Butler, T., Goldstein, M., Root, J.C., Engelien, A., Furman, D., Silverman, M., Yang, Y., Gorman, J.M., LeDoux, J., Silbersweig, D., Stern, E., 2011. Differential activity of subgenual cingulate and brainstem in panic disorder and PTSD. J. Anxiety Disord. 25, 251–257.
van Donkelaar, E.L., Blokland, A., Ferrington, L., Kelly, P.A., Steinbusch, H.W., Prickaerts, J., 2011. Mechanism of acute tryptophan depletion: is it only serotonin? Mol. Psychiatry 16, 695–713.
Walderhaug, E., Lunde, H., Nordvik, J.E., Landrø, N.I., Refsum, H., Magnusson, A., 2002. Lowering of serotonin by rapid tryptophan depletion increases impulsiveness in normal individuals. Psychopharmacology (Berl.) 164, 385–391.
Walderhaug, E., Landrø, N.I., Magnusson, A., 2008. A synergic effect between lowered serotonin and novel situations on impulsivity measured by CPT. J. Clin. Exp. Neuropsychol. 30, 204–211.
Winstanley, C.A., Dalley, J.W., Theobald, D.E.H., Robbins, T.W., 2004. Fractionating impulsivity: contrasting effects of central 5-HT depletion on different measures of impulsive behavior. Neuropsychopharmacology 29, 1331–1343.
Winstanley, C.A., Eagle, D.M., Robbins, T.W., 2006. Behavioral models of impulsivity in relation to ADHD: translation between clinical and preclinical studies. Clin. Psychol. Rev. 26, 379–395.
Wise, C.D., Berger, B.D., Stein, L., 1970. Serotonin: a possible mediator of behavioural suppression induced by anxiety. Dis. Nerv. Syst. 31 (Suppl.), 34–37.
Worbe, Y., Savulich, G., Voon, V., Fernandez-Egea, E., Robbins, T.W., 2014. Serotonin depletion induces ‘waiting impulsivity’ on the human four-choice serial reaction time task: cross-species translational significance. Neuropsychopharmacology (Epub ahead of print).
Yacubian, J., Gläscher, J., Schroeder, K., Sommer, T., Braus, D.F., Büchel, C., 2006. Dissociable systems for gain- and loss-related value predictions and errors of prediction in the human brain. J. Neurosci. 26 (37), 9530–9537.
Zepf, F.D., Holtmann, M., Stadler, C., Demisch, L., Schmitt, M., Wöckel, L., Poustka, F., 2008. Diminished serotonergic functioning in hostile children with ADHD: tryptophan depletion increases behavioural inhibition. Pharmacopsychiatry 41, 60–65.
