Article
Cue-Evoked Dopamine Promotes Conditioned Responding during Learning
Highlights
- Inconsequential novel, but not familiar, stimuli activate VTA and SNc dopamine neurons
- Intrinsic stimulus value modulates dopamine responses to novel stimuli
- Dopamine activation during familiar CS accelerates conditioning
- Dopamine inhibition during novel CS decelerates conditioning
Morrens et al., 2020, Neuron 106, 1–12, April 8, 2020 © 2020 Elsevier Inc. https://doi.org/10.1016/j.neuron.2020.01.012
Authors
Joachim Morrens, Çağatay Aydin, Aliza Janse van Rensburg, José Esquivelzeta Rabell, Sebastian Haesler
Correspondence
[email protected]
In Brief
Morrens et al. show that inconsequential novel stimuli evoke responses in dopamine neurons in the VTA and SNc. Performing bidirectional optogenetic manipulation during conditioning, they then demonstrate that novel stimulus-evoked dopamine promotes the development of behavioral responses, indicating that Pavlovian conditioning is influenced by CS dopamine, in addition to US reward prediction errors.
Please cite this article in press as: Morrens et al., Cue-Evoked Dopamine Promotes Conditioned Responding during Learning, Neuron (2020), https://doi.org/10.1016/j.neuron.2020.01.012
Neuron
Article
Cue-Evoked Dopamine Promotes Conditioned Responding during Learning
Joachim Morrens,1,2,3,4,5 Çağatay Aydin,1,2,4,5 Aliza Janse van Rensburg,1,2,3,4 José Esquivelzeta Rabell,1,2,3,4 and Sebastian Haesler1,2,3,4,6,*
1VIB, 3001 Leuven, Belgium
2Imec, 3001 Leuven, Belgium
3KU Leuven, Department of Neuroscience, Research Group Neurophysiology, 3000 Leuven, Belgium
4Neuroelectronics Research Flanders, 3001 Leuven, Belgium
5These authors contributed equally
6Lead Contact
*Correspondence: [email protected]
https://doi.org/10.1016/j.neuron.2020.01.012
SUMMARY
Dopamine neurons mediate the association of conditioned stimuli (CS) with reward (unconditioned stimuli, US) by signaling the discrepancy between predicted and actual reward during the US. Some theoretical models suggest that learning is also influenced by the salience or associability of the CS. A hallmark of CS associability models is that they can explain latent inhibition, i.e., the observation that novel CS are more effectively learned than familiar CS. Novel CS are known to activate dopamine neurons, but whether those responses affect associative learning has not been investigated. Here, we used fiber photometry to characterize dopamine responses to inconsequential familiar and novel stimuli. Using bidirectional optogenetic modulation during conditioning, we then show that CS-evoked dopamine promotes conditioned responses. This suggests that Pavlovian conditioning is influenced by CS dopamine, in addition to US reward prediction errors. Accordingly, the absence of dopamine responses to familiar CS might explain their slower learning in latent inhibition.
INTRODUCTION
Dopamine neurons play a key role in associative learning. When sensory cues (conditioned stimuli [CS]) are associated with reward (unconditioned stimuli [US]), dopamine neurons respond to unexpected reward with a phasic increase in their firing rate. When predicted reward is omitted, the firing rate of dopamine neurons is reduced. Thus, dopamine neurons encode the discrepancy between predicted and actual reward, i.e., reward prediction errors (RPEs) (Schultz et al., 1997). RPEs play a central role in formal theories of reinforcement learning (Mackintosh, 1975; Pearce and Hall, 1980; Rescorla and Wagner, 1972; Sutton and Barto, 1998), and experimental manipulations of dopamine with optogenetics, specifically during the US period, have confirmed that dopamine RPE responses causally impact associative learning (Chang et al., 2016; Steinberg et al., 2013).
According to variants of RPE-centered frameworks, associative learning is also directly influenced by the CS through variations of its salience or associability (Mackintosh, 1975; Pearce and Hall, 1980). Associability refers to the potential of a stimulus to be associated with another stimulus through learning. The higher the associability of a stimulus, the faster it can be associated with an outcome. If its associability is zero, the stimulus cannot be associated with anything. Consistent with an involvement in CS associability, novel, physically intense, or otherwise salient sensory stimuli have long been recognized to evoke phasic dopamine responses that decrease with increasing experience (Kamiński et al., 2018; Lak et al., 2016; Ljungberg et al., 1992; McNamara et al., 2014; Menegas et al., 2017; Rebec et al., 1997). However, the contribution of dopamine CS responses to associative learning has not yet been addressed experimentally.
The concept of CS associability has also provided the cornerstone for models of latent inhibition, i.e., the common behavioral observation that conditioned responses to familiar cues are established much more slowly during associative learning than those to novel cues. Latent inhibition has been widely observed across different learning paradigms and mammalian species (Lubow, 1989; Lubow and Moore, 1959), suggesting that it reflects an adaptive learning strategy. According to CS associability frameworks, experiencing stimuli that are not followed by an event or consequence reduces their associability, resulting in latent inhibition. Pharmacological evidence has implicated the dopamine system in latent inhibition.
Dopamine agonists, such as amphetamine, attenuate latent inhibition in rodents (Solomon et al., 1981; Weiner et al., 1988), whereas dopamine antagonists, such as the antipsychotic drugs haloperidol and chlorpromazine, increase latent inhibition (Christison et al., 1988; Weiner et al., 1987). Consistent with the effects of antipsychotic drugs on latent inhibition, schizophrenia has been associated with reduced latent inhibition (Rascle et al., 2001). Despite this evidence for a role of dopamine in latent inhibition, the circuit mechanism by which the activity of dopamine neurons relates to the different learning rates of novel and familiar stimuli remains unknown.
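The notion of CS associability introduced above can be illustrated with a minimal sketch. It assumes a Rescorla-Wagner-style update, V ← V + α·β·(λ − V), in which α stands in for CS associability and (λ − V) is the reward prediction error at the US. The function name, parameter values, and response criterion are illustrative assumptions and are not taken from the study.

```python
def trials_to_criterion(alpha, beta=0.2, reward=1.0, criterion=0.8, max_trials=1000):
    """Number of CS-US pairings needed for associative strength V to reach
    `criterion`; returns None if V never gets there (e.g. alpha == 0)."""
    v = 0.0
    for trial in range(1, max_trials + 1):
        rpe = reward - v            # reward prediction error at the US
        v += alpha * beta * rpe     # update scaled by CS associability (assumed form)
        if v >= criterion:
            return trial
    return None


# Hypothetical associability values: high for a novel CS, reduced for a
# pre-exposed (familiar) CS, and zero for a CS that cannot be associated.
novel = trials_to_criterion(alpha=1.0)
familiar = trials_to_criterion(alpha=0.3)
never = trials_to_criterion(alpha=0.0)
print(novel, familiar, never)
```

Under these assumptions the familiar CS needs more pairings than the novel CS to reach the same criterion, which is the qualitative signature of latent inhibition; with zero associability the criterion is never reached.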
Figure 1. Dopamine Neurons Respond to Novel, but Not to Rare, Familiar Stimuli (A) Schematic illustration of the novelty exposure task. Top: odorants were delivered to head-restrained mice, and sniffing behavior was recorded using nasal thermography. Middle: a single trial included 2-s odorant presentation followed by an intertrial stimulus interval (ISI), drawn from an exponential distribution with (legend continued below)
Here, we used fiber photometry in head-restrained mice to measure dopamine neuron activity in the midbrain ventral tegmental area (VTA) and substantia nigra pars compacta (SNc) in response to inconsequential familiar and novel olfactory stimuli. In parallel, we measured the spontaneous orienting response to novel stimuli (Esquivelzeta Rabell et al., 2017; Mutlu et al., 2018) to obtain behavioral evidence for novelty perception. This non-associative paradigm allowed us to measure dopamine signals in the absence of learned CS value or generalized reward value. We then performed olfactory conditioning experiments to study the role of CS dopamine in associative learning. We varied stimuli along the familiarity dimension, i.e., presenting familiar and novel stimuli to experimentally modulate CS dopamine. Using bidirectional optogenetic manipulation of dopamine neurons selectively during the CS period, we specifically tested the hypothesis that CS dopamine influences the rate at which conditioned responses develop.
RESULTS
Dopamine Neurons Respond to Novel, but Not to Rare, Familiar Stimuli
To investigate the response of dopaminergic neurons to novel and familiar stimuli, we used a novelty exposure paradigm in awake, head-restrained mice (Figure 1A; n = 15). In this paradigm, animals experience stimuli with varying histories of prior exposure. We introduced stimuli that had never been presented before in the setup as “novel” and stimuli that the animals had already experienced on previous days as “familiar.” Specifically, mice were familiarized with eight odors for 5 days (phase 1). On the test day (phase 2A), we presented four novel (11% of all trials) and the previously familiarized odors (78% of all trials) while measuring exploratory sniffing, a well-established behavioral response to novel stimuli (Esquivelzeta Rabell et al., 2017; Mutlu et al., 2018). In phase 2A of the paradigm, we further introduced a rare familiar condition (11% of all trials) in which we presented previously familiarized stimuli at rare occurrence (Bunzeck and Düzel, 2006). Given that the surprise of a stimulus is inversely related to its probability of occurring (Barto et al., 2013), this manipulation increased surprise for a subset of familiar stimuli. The mean number of trials between two common familiar stimuli was 5 ± 0.2 trials. Between two rare familiar stimuli, it was 29 ± 0.8 trials, and for novel, it was 30.1 ± 0.9 trials. Hence, the probability of encountering the same stimulus again in the task was similar for novel and rare familiar stimuli. We evaluated the significance of behavioral responses across experimental conditions by building a simple regression model for every animal with the baseline-subtracted breathing frequency in the 3.5-s time window after odor onset as dependent variable and indicator variables for novelty, rareness, and familiarity as regressors. This resulted in t-statistic values for every animal, which were then globally evaluated to be different from zero using two-sided Wilcoxon signed rank tests. As expected, mice (n = 15) exhibited significantly increased sniffing to novel (tmean = 5.17; p < 0.001), but not familiar stimuli (tmean = 0.39; p = 0.19; Figure 1B). Mice slightly but significantly increased their respiration rate in response to rare familiar odors (tmean = 1.08; p = 0.002), confirming that they noticed the deviation from the standard scheme (Figure 1B). Throughout the novelty exposure paradigm, we measured the activity of dopaminergic neurons across the VTA and SNc using
mean of 5 s, cut off at <2 and >20 s. Bottom: behavioral training started with a first phase in which mice were familiarized with eight odorants (phase 1). In a second phase (phase 2A), four previously familiarized stimuli were presented in a large proportion of all trials (common familiar, 78% of all trials). Four familiar stimuli and four novel stimuli were presented in a smaller proportion of trials (rare familiar and novel, 11% of trials each). (B) Left: mean baseline-subtracted breathing frequency (± SEM) of mice (n = 15) before, during, and after the presentation of novel, common familiar, and rare familiar stimuli in phase 2A. To arrive at the mean, we first calculated the mean response for each of the three conditions per animal and then calculated the mean and SEM of those means. Gray shading indicates stimulus presentation. Right: t-statistic values for breathing behavior in every animal in response to novel, common familiar, and rare familiar stimuli (see STAR Methods for details of the regression model). (C) Fiber photometry was used to measure dopamine transients in DAT-Cre mice expressing GCaMP6s selectively in dopamine neurons. (D) Schematic illustration of unexpected water and air puff delivery (phase 2B). In the same behavioral session, but after the odor exposure paradigm (phase 2A), mice were given unexpected water, and they received unexpected air puffs to their nose. (E) Example single-trial, Z scored photometry signals from three animals (rows: animals #8, #7, and #14) in response to four presentations of novel (green), rare familiar (blue), and common familiar (gray) odorants in phase 2A and in response to four deliveries of unexpected water (black) and air puff (orange) in phase 2B. For novel, rare, and common familiar conditions, the color intensity from dark to light indicates the presentation order from first to fourth presentation. The chemical identity of each odorant is indicated. 
Experiments that used the same odorant chemical but in different animals are marked by rounded rectangles with identical dash pattern. Gray shading indicates the stimulus presentation time. The dashed line indicates water or air puff onset. The horizontal scale bar indicates time in seconds; the vertical scale bar indicates the Z scored signal amplitude. (F) Average Z scored dopamine response (± SEM) to novel odorants averaged across all experimental animals that showed significant activation by stimulus novelty according to a regression model with dopamine as a dependent variable and a novelty trial indicator variable as the only explanatory factor (n = 8; see STAR Methods for more details of the regression models used). Color intensity from dark to light green indicates the presentation order from first to fourth presentation. Gray shading indicates stimulus presentation. (G) Average Z scored dopamine response (± SEM) to rare familiar odorants averaged across all experimental animals (n = 15). Color intensity from dark to light blue indicates the presentation order from first to fourth presentation. Gray shading indicates stimulus presentation. (H) Average Z scored dopamine response (± SEM) to common familiar odorants averaged across all experimental animals (n = 15). Color intensity from dark to light gray indicates the presentation order from first to fourth presentation. Gray shading indicates stimulus presentation. (I) Average Z scored dopamine response to water and air puff averaged across all experimental animals (n = 15). The dashed line indicates water or air puff onset. (J) Average dopamine transients during the presentation of novel stimuli (left) and water (right) for all animals (n = 15 mice), normalized to each animal’s peak water response and sorted by the relative peak novelty response (highest, top; lowest, bottom). 
(K) T-statistic values for dopamine transients in every animal in response to novel, common familiar, and rare familiar stimuli (left) and reward and air puffs (right) (see STAR Methods for details of the regression model). *p < 0.05, **p < 0.01, ***p < 0.001.
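The two-stage statistic referenced in the legend (a per-animal regression t-statistic, then a two-sided Wilcoxon signed rank test across animals) can be sketched as follows. This is a simplified illustration and not the authors' code: it uses a single novelty indicator as the only regressor, an exact signed-rank test that assumes no zeros and no tied magnitudes, and simulated responses with a made-up effect size. The function names are hypothetical.

```python
import math
import random


def t_statistic(y, x):
    """OLS t-statistic for the slope of y regressed on one 0/1 indicator x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    return slope / se


def wilcoxon_signed_rank_p(values):
    """Exact two-sided signed-rank p-value (assumes no zeros, no tied |values|)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: abs(values[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if values[i] > 0)
    max_w = n * (n + 1) // 2
    counts = [1] + [0] * max_w            # counts[s]: sign patterns with W+ == s
    for rank in range(1, n + 1):
        for s in range(max_w, rank - 1, -1):
            counts[s] += counts[s - rank]
    total = 2 ** n
    p_low = sum(counts[: w_plus + 1]) / total
    p_high = sum(counts[w_plus:]) / total
    return min(1.0, 2 * min(p_low, p_high))


random.seed(0)
t_values = []
for _ in range(15):                        # 15 simulated animals
    novel = [1] * 11 + [0] * 89            # 11% novel trials, as in phase 2A
    resp = [2.0 * x + random.gauss(0, 1) for x in novel]   # made-up effect size
    t_values.append(t_statistic(resp, novel))
p_value = wilcoxon_signed_rank_p(t_values)
print(sum(t_values) / len(t_values), p_value)
```

Testing the per-animal t-statistics against zero treats each animal as one observation, so the group-level p-value does not depend on how many trials each animal contributed.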
Figure 2. Activation of Dopaminergic Neurons during Familiar CS Accelerates the Development of Conditioned Responding (A) Optogenetic strategy to selectively stimulate dopamine neurons during CS. DAT-Cre mice received bilateral injection with AAV5-EF1a-DIO-hChR2(H134R)-EYFP into the VTA/SNc. They were further implanted bilaterally with optic fibers prior to behavioral training. All experiments performed in the experimental group (ChR2VTA/SNc group, n = 7 mice) were replicated in a control group of animals, injected with AAV5-EF1a-DIO-EYFP (ControlVTA/SNc group, n = 6 mice). (B) To evaluate the effect of optogenetic stimulation of dopamine neurons on associative learning, we used a conditioning paradigm with six experimental conditions (conditioning task I, right). In an initial training phase (phase 1), we paired one odorant with reward (CS+), whereas four odorants were presented without any consequence (CS‒). In the subsequent second phase (phase 2), the odor that was rewarded during training remained rewarded (CS+), and two of the previously non-rewarded odors remained non-rewarded (CS‒). One novel odor, which was paired with reward, was introduced (CS+nov). Two previously non-rewarded odors became rewarded (CS+famstim and CS+fam), and during the presentation of one of those odors, we performed optogenetic stimulation (CS+famstim). Finally, we also performed optogenetic stimulation during the presentation of one of the non-rewarded odors (CS‒famstim). (C) Anticipatory licks for CS+ and CS‒ at the end of phase 1 (mean of last five trials). (D) Anticipatory licks for CS+nov, CS+famstim, CS+fam and CS‒famstim (mean of last five trials in phase 2). (E) Illustration of optogenetic stimulation parameters for CS+famstim and CS‒famstim trials. Optical stimulation (10 pulses at 20 Hz) was performed during odor presentation for 0.5 s, odor was delivered for 2 s, and reward was delivered with a delay of 2 s after odor offset.
(F) Anticipatory licks in an example animal in response to familiar (CS+fam, gray) and novel (CS+nov, green) cues. The continuous line represents the moving average (window length 10 trials) of anticipatory licks for each condition. The two colored triangles at the horizontal axis indicate the behavioral change points calculated in this example animal. Gray and white shading indicate different experimental sessions (one session per day). Trial number refers to all trials of the respective experimental condition. (G) Cumulative anticipatory lick trials for familiar (CS+fam, gray) and novel (CS+nov, green) conditions, averaged across all animals (mean ± SEM). (H) Cumulative incidence curves of the change points from all experimental animals conditioned with familiar (CS+fam, gray) and novel (CS+nov, green) stimuli. (I) Anticipatory licks in an example animal in response to familiar, optogenetically stimulated (CS+famstim, blue) and non-stimulated (CS+fam, gray) cues. The continuous line represents the moving average (window length 10 trials) of anticipatory licks. The two colored triangles at the horizontal axis indicate the behavioral change points in this example animal. Gray and white shading indicate different experimental sessions (one session per day). Optogenetic stimulation was performed only on the first day (blue squares). (J) Cumulative anticipatory lick trials for CS+famstim and CS+fam, averaged across all animals in the ChR2VTA/SNc group (mean ± SEM).
(legend continued below)
fiber photometry (Figure 1C). In brief, we selectively targeted expression of the fluorescent calcium indicator GCaMP6s to dopaminergic neurons using recombinant AAV (hEF1-LS1LGCaMP6) in mice expressing Cre recombinase from the promoter of the dopamine transporter gene (DAT-Cre) (Bäckman et al., 2006). Optical fibers were implanted in the midbrain regions VTA and SNc to measure calcium transients of dopamine ensembles (see STAR Methods). Theoretical modeling and experimental studies suggest that signals originating from up to around 120-μm distance from a 400-μm fiber tip (numerical aperture 0.39) will be captured with a detection probability of >80% (Mansy et al., 2019; Pisanello et al., 2019). The number of neurons contributing to the photometry signal further largely depends on where the fiber tip is positioned relative to the local distribution of cell bodies and axons. A very rough estimate based on the above considerations and the density of dopamine neurons in the VTA and SNc (Margolis et al., 2006) suggests that about 10 to at most 50 dopamine neurons contributed to the signals measured in our study. Figure 1E illustrates single-trial fluorescence signals captured in phase 2A of the novelty exposure task in response to the indicated chemicals when introduced as novel, as rare familiar, or as common familiar in three example animals. We found strong responses to novel stimuli in some (e.g., animals #7 and #8 and Figure 1J), but not in all animals (e.g., animal #14 and Figure 1J). The same novel odorant could evoke strong responses in some animals (e.g., ethyl valerate in animals #7 and #8), but not in others (ethyl valerate in animal #14). Given that our photometry measurements captured transients of a relatively small group of neurons, the variability of responses to novel stimuli across animals likely reflects the differential responsiveness of different dopamine neuron ensembles (Figure 1E).
Of note, animals that did not show any dopamine response to novel stimuli (Figures 1E and 1J) still expressed sniffing behavior in response to novel stimuli (Figure 1B). The amplitude of responses to novel stimuli decreased with repeated presentation in the unrewarded context of our novelty exposure task. When the same odorants were introduced not as novel but instead as rare or common familiar stimuli, they did not evoke responses in dopamine neurons (Figures 1E and S1G). To be able to relate novel odor-evoked dopamine responses to the well-described responses of dopamine neurons to water reward and air puffs (Cohen et al., 2012), we also measured dopamine transients during presentation of unexpected water and air puffs in the same session after the novelty exposure paradigm was completed (phase 2B). Water and air puffs were never paired with odorants (Figure 1C). We found robust, stereotypical responses to water in all animals (Figures 1E and 1J). Dopamine transients in response to air puffs were more variable. We observed both increasing and decreasing transients in response to air puff (Figure S1E), as has been reported previously (Cohen et al., 2012).
The population analysis confirmed the single-trial observations. We determined the significance of Ca2+ responses using regression models as described above, with dopamine as dependent variable and novelty, rareness, familiarity, and water and air puffs as explanatory variables. To allow comparing signal amplitudes across different conditions, we normalized dopamine transients measured during odor presentations and air puff to their respective water responses. The population average of dopamine transients showed significant activation by water (tmean = 8.09; p < 0.001; Figures 1I and 1K) and novel stimuli (tmean = 2.42; p < 0.001; Figures 1F and 1K). Dopamine transients to novel stimuli covered a wide range of response amplitudes, which appear to follow a unimodal distribution (Figures 1J and S1A). Responses to novel stimuli further decreased over the four repeated presentations in the unrewarded context of our novelty exposure task (Figure 1F). No significant dopamine responses to common familiar stimuli were observed (tmean = 0.16; p = 0.56; Figures 1H and 1K). We also did not observe significant responses to rare familiar stimuli (tmean = 0.09; p = 0.72; Figures 1G and 1K), although they evoked significant behavioral responses (Figure 1B). Finally, dopamine transients significantly decreased on average in response to air puffs (tmean = 1.53; p = 0.02; Figures 1I and 1K).
Dopamine Responses to Novel Stimuli Appear to Be Modulated by Intrinsic Stimulus Value
To test responses of dopamine neurons to novel stimuli, we used a large panel of structurally dissimilar odorants, each of which might have a different level of intrinsic value. It is well established that parameters that increase reward value, including reward size (Bayer and Glimcher, 2005; Tobler et al., 2005) and reward probability (Fiorillo et al., 2003), increase the magnitude of dopamine responses.
Dopamine responses to novel stimuli might thus be similarly modulated by the intrinsic value of odorant chemicals. To allow for testing this hypothesis, we measured the relative intrinsic value of 14 chemicals in our odorant panel in a separate group of DAT-Cre mice (n = 6) using an odor investigation assay (Figure S1I). In this assay, we quantified the time that freely moving animals spent investigating an odorant source. The implicit choice to investigate for longer or shorter duration can be considered a measure of intrinsic value (Kermen et al., 2016; Kobayakawa et al., 2007). Since odorant preferences are in large part innate, they are expected to be similar even in different groups of congenic mice. We found that the mean investigation time of odorant chemicals was positively correlated with the mean dopamine response magnitude to those odorants when they were presented as novel in the novelty exposure task (Spearman correlation coefficient r = 0.72; p = 0.005; Figure S1J) but uncorrelated when they were presented as familiar (Spearman correlation coefficient r = 0.03; p = 0.94). Consistent with this, a mixed model regression analysis (see STAR Methods for details) of odorants
(K) Cumulative incidence curves of change points of animals in the ChR2VTA/SNc group for familiar, optogenetically stimulated (CS+famstim, blue) and non-stimulated (CS+fam, gray) cues. (L) Cumulative incidence curves of change points of animals in the ControlVTA/SNc group for familiar, optogenetically stimulated (CS+famstim, blue) and non-stimulated (CS+fam, gray) cues. **p < 0.01, ***p < 0.001.
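The trial timing given in panel (E) of this legend (10 pulses at 20 Hz, a 2-s odor, and reward 2 s after odor offset) can be laid out on a single timeline as a quick consistency check. The sketch below assumes that the pulse train starts at odor onset, which the legend does not state explicitly; the function name and dictionary keys are illustrative.

```python
def trial_timeline(n_pulses=10, rate_hz=20.0, odor_s=2.0, reward_delay_s=2.0):
    """Event times in seconds relative to odor onset.

    Assumption: the light-pulse train begins at odor onset.
    """
    period = 1.0 / rate_hz                          # 50 ms between pulse onsets at 20 Hz
    pulse_onsets = [i * period for i in range(n_pulses)]
    return {
        "pulse_onsets": pulse_onsets,
        "train_duration": n_pulses * period,        # 10 pulses at 20 Hz span 0.5 s
        "odor_offset": odor_s,
        "reward_time": odor_s + reward_delay_s,     # reward follows odor offset by 2 s
    }


timeline = trial_timeline()
print(timeline["train_duration"], timeline["reward_time"])
```

With these values the stimulation train covers only the first quarter of the odor period and ends well before reward delivery, consistent with a manipulation restricted to the CS period.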
(n = 10), for which we had collected dopamine signals in both novel and familiar conditions, revealed significant effects of novelty (p < 0.001) and value (p = 0.039) when odors were novel. In line with the absence of dopamine responses to familiar stimuli (Figure 1K), there was no significant effect of value (p = 0.36) when odorants were familiar. These results suggest that dopamine responses to novel stimuli are modulated by intrinsic stimulus value, which may also help explain some of the variability observed in the responses to novel stimuli. Moreover, dopamine responses to novel stimuli were positively correlated with water responses (Spearman correlation coefficient r = 0.62; p = 0.016), suggesting that dopamine neurons are similarly modulated by stimulus familiarity and positive value.
Novel Stimuli Activate Dopamine Neurons across VTA and SNc
Previous work in rodents has indicated that responses to novel stimuli might not be distributed evenly across dopamine neurons in the VTA and SNc. Whereas dopaminergic projections from the lateral SNc to the tail of the striatum exhibit phasic responses to novel odorants, this was not observed for projections from the VTA to the ventral striatum (Menegas et al., 2017, 2018). On the other hand, the robust dopamine release in the nucleus accumbens, a major VTA projection target, upon entry of rats into a novel space supports the existence of novelty-responsive dopamine neurons in the VTA (Rebec et al., 1997). Analyzing the anatomical distribution of all responses in our dataset, we found responses to novel stimuli, like water responses, to be present throughout the VTA and SNc (Figure S1D). Air puff responses were similarly distributed across the dopaminergic midbrain (Figure S1F). Previous experiments have suggested that dopamine might directly evoke exploratory sniffing (Makanjuola et al., 1980; Molloy and Waddington, 1985).
To test the hypothesis that novelty-evoked dopamine might in turn induce rapid sniffing, we investigated the dynamic relationship between dopamine transients and respiration. Novelty-evoked dopamine transients preceded sniffing responses (latency dopamine 467 ± 103 ms, median ± SEM; latency sniffing 627 ± 123 ms, median ± SEM). Moreover, both responses concurrently habituated with repeated presentations of novel stimuli (Figures 1F and S1B) within each session. However, we did not find evidence for a trial-by-trial correlation between dopamine and sniffing behavior. We built a regression model in which sniffing responses were regressed on significantly novelty-modulated dopamine transients (n = 8) along with an indicator variable for novelty trials. Fitting this model indicated no remaining correlation between sniffing and dopamine after accounting for trials being novel (novelty: tmean = 6.51; p = 0.008; dopamine: tmean = 0.16; p = 0.46; Figure S1C). Hence, dopamine does not appear to directly cause sniffing, but both processes are driven by a third factor: the novelty of the perceived stimulus.
Activation of Dopaminergic Neurons during Familiar CS Accelerates the Development of Conditioned Responding
To investigate the functional role of CS dopamine responses in the context of associative learning, we optogenetically manipulated dopamine neurons in the VTA-SNc of mice performing an olfactory conditioning task (conditioning task I) in which novel
and previously familiarized odorant cues were paired with fluid rewards (ChR2VTA/SNc group: n = 7 mice; Figures 2A and 2B). In this task, mice display anticipatory licking behavior prior to reward delivery, indicating the successful association of cue and reward (Cohen et al., 2012). In a first phase (phase 1), we trained mice for 5 days to establish anticipatory licking for one rewarded cue (CS+) while familiarizing them with four non-rewarded stimuli (CS‒; Figure 2B). At the end of phase 1, behavioral lick responses were established for the CS+, but not for any of the non-rewarded stimuli (CS+, p = 0.008; CS‒, left to right, p = 0.98, p = 1, p = 1, p = 0.95; Figure 2C). After this initial training in phase 1, we started a second phase (phase 2) in which we maintained the contingency for one CS+ and one CS‒ but changed the contingency for the other stimuli (Figure 2B). We introduced a novel stimulus, which was also paired with reward (CS+nov). We further started rewarding one previously familiarized odorant (CS+fam). During the presentation of a second familiar odorant, we performed bilateral channelrhodopsin 2-mediated (ChR2) stimulation of dopamine neurons in the VTA-SNc (Figures 2A, 2E, and S2A). This familiar stimulus was also rewarded (CS+famstim). According to previous work using similar behavioral paradigms, responses of dopamine neurons to rewarded novel stimuli gradually decrease over relatively long time periods of about 20 to 30 trials (Lak et al., 2016; Menegas et al., 2017). To broadly match these timescales, we optogenetically stimulated dopamine neurons during the cue presentation throughout the first conditioning session, unless there were fewer than 20 trials per condition in the first session, in which case we also stimulated the next session (mean number of stimulated trials 26 ± 0.8). Of note, reward-predictive dopaminergic cue responses typically emerge much slower. 
In fact, they develop slower than anticipatory behavior itself (Cohen et al., 2012; Menegas et al., 2017; Morris et al., 2006). To evaluate whether dopamine stimulation alone caused licking behavior, we also stimulated dopamine neurons during the presentation of a familiar, non-rewarded stimulus (CS‒famstim). After phase 2, animals had eventually developed robust conditioned responses to all rewarded stimuli (CS+, p = 0.008; CS+fam, p = 0.008; CS+famstim, p = 0.008; CS+nov, p = 0.008; Figure 2D). Animals did not lick in response to the non-rewarded stimuli (CS‒, p = 0.59; CS‒famstim, p = 0.2; Figure 2D). Anticipatory licking was indistinguishable between the stimulated and the unstimulated familiar CS‒ (CS‒famstim, mean licks = 1.3 versus CS‒, mean licks = 1.3; p = 1), demonstrating that stimulation of dopamine neurons during the CS period did not cause licking on its own. Mice associated novel stimuli (CS+nov) more rapidly with reward than previously familiarized ones (CS+fam); i.e., they showed latent inhibition (Figures 2F, 2G, and S2C). To quantify this effect across animals, we determined behavioral change points for each animal; i.e., we identified the trial in which an animal consistently started anticipatory responding (Spiegelhalter et al., 1996). We then performed a multivariate survival analysis of the anticipatory licking change points (Vonta, 2009) to quantify the difference in the rate at which animals established behavioral CS responses (novelty coefficient b1 = +4.33; p < 0.001; Figure 2H). In a complementary analysis, we estimated learning rates for CS+nov and CS+fam using a simple reinforcement model
[Figure 3 graphic omitted. Labels recoverable from the extraction: DAT-Cre; AAV5-EF1a-DIO-hChR2(H134R)-EYFP in VTA/SNc; optic fiber (473 nm) in PFC; animal #2; change-point incidence versus trial #; ChR2 and EYFP groups.]
Figure 3. Stimulation of Dopaminergic Axon Terminals in the PFC during Familiar CS Is Sufficient to Promote Conditioning (A) Optogenetic strategy to selectively stimulate dopamine axon terminals in the PFC during the presentation of familiar CS. Experimental paradigm and optical stimulation protocol as described in Figures 2B and 2E. (B) Widespread expression of ChR2 in dopamine axon terminals in the PFC (maximal projection image of a 25-μm optical section). Thick gray lines indicate optic fiber implant positions. Inset: higher magnification of optical fiber implant lesion. (C) Anticipatory licks in an example animal in response to familiar, optogenetically stimulated (CS+famstim, blue) and non-stimulated (CS+fam, gray) cues. The continuous line represents the moving average (window length 10 trials) of anticipatory licks. The two colored triangles at the horizontal axis indicate the behavioral change points calculated in this example animal. White and gray shading indicate different experimental sessions (one session per day). (D) Cumulative incidence curves of change points from animals in the ChR2PFC group (n = 5 mice; left) and the controlPFC group (n = 5 mice; right), conditioned with familiar, optogenetically stimulated (CS+famstim, blue) and non-stimulated (CS+fam, gray) cues. *p < 0.05.
(Miller et al., 1995). Learning rates were significantly higher for novel than for familiar stimuli (p = 0.008; Figure S2E). To evaluate the effect of the stimulation of dopamine neurons during the presentation of familiar cues, we compared the development of anticipatory licking responses to stimulated (CS+famstim) and unstimulated cues (CS+fam) within the same animal. Anticipatory licking developed more rapidly for dopamine-stimulated than for unstimulated familiar cues (Figures 2I, 2J, and S2D). The survival analysis of anticipatory licking change points revealed that stimulation of dopamine neurons during familiar cue presentation significantly accelerated learning (optogenetic stimulation coefficient b2 = +2.16; p = 0.005; Figure 2K). Consistent with this, the reinforcement model revealed significantly higher learning rates for the dopamine-stimulated familiar cue compared to the non-stimulated familiar cue (CS+famstim versus CS+fam, p = 0.008; Figure S2F). We did not observe significant effects of optogenetic stimulation on the rate of conditioning in a control group of animals (controlVTA/SNc group: n = 6 mice; Figure S2B) undergoing the same experimental manipulations as the ChR2VTA/SNc group but expressing enhanced yellow fluorescent protein (EYFP) instead of ChR2 (optogenetic stimulation coefficient b2 = +0.485; p = 0.24; Figure 2L; learning rate p = 0.84; Figure S2H). EYFP control animals did, however, show latent inhibition (Figure S2G). Additional between-group tests confirmed significantly earlier change points (p = 0.01; Figure S2I) and higher learning rates (p < 0.001; Figure S2J) in the ChR2VTA/SNc group compared to the controlVTA/SNc group for photostimulated familiar CS+ (CS+famstim).
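The learning-rate comparison above can be illustrated with a minimal sketch, assuming a Rescorla-Wagner-type update of the kind reviewed by Miller et al. (1995). The learning rates, the all-rewarded 30-trial schedule, the criterion, and the function names are hypothetical choices for illustration; they are not the fitted values or the analysis code of this study.

```python
def rescorla_wagner(rewards, alpha, v0=0.0):
    """Return the associative strength V held before each trial.

    rewards: outcome per trial (1.0 = reward delivered, 0.0 = omitted).
    alpha:   learning rate in [0, 1]; a hypothetical value, not a fitted one.
    """
    v = v0
    values = []
    for r in rewards:
        values.append(v)
        v += alpha * (r - v)  # update by a fraction of the prediction error
    return values


def trials_to_criterion(values, criterion=0.5):
    """Index of the first trial whose V reaches the criterion, or None."""
    for trial, v in enumerate(values):
        if v >= criterion:
            return trial
    return None


rewards = [1.0] * 30  # every trial rewarded, for simplicity
fast = rescorla_wagner(rewards, alpha=0.30)  # stands in for a novel or stimulated cue
slow = rescorla_wagner(rewards, alpha=0.10)  # stands in for a familiar cue
print(trials_to_criterion(fast), trials_to_criterion(slow))
```

With a higher learning rate the simulated cue reaches the criterion in fewer trials, while both cues approach the same asymptote, which mirrors the observation that all rewarded cues were eventually learned.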
Stimulation of Dopaminergic Axon Terminals in the PFC during Familiar CS Is Sufficient to Promote Conditioning
Previous research in primates has identified the prefrontal cortex (PFC) as a dopaminergic projection region in which dopaminergic modulation of sensory information occurs (Jacob et al., 2013; Noudoost and Moore, 2011). Moreover, catecholaminergic depletion of prelimbic medial PFC in rats has been shown to enhance latent inhibition (Nelson et al., 2010). Therefore, we experimentally evaluated the specific contribution of CS-related prefrontal dopamine to associative learning. We performed the same experiment as described above for the ChR2VTA/SNc group. At the end of phase 1, animals had established anticipatory licking for CS+, but not CS‒ (data not shown). In phase 2, we optogenetically stimulated dopaminergic axon terminals in the PFC during the presentation of previously familiarized cues (ChR2PFC group: n = 5; Figures 3A, 3B, S3A, and S3B). Conditioned responding developed more rapidly to dopamine-stimulated cues than to non-stimulated cues (CS+famstim versus CS+fam, optogenetic stimulation coefficient b2 = +1.64; p = 0.03; Figures 3C and 3D, left; and Figure S3D). This finding is corroborated by the reinforcement model, which revealed significantly higher learning rates for dopamine-stimulated familiar cues compared to non-stimulated familiar cues (Figure S3E). As for VTA-SNc stimulation, dopamine PFC stimulation during the presentation of familiar, non-rewarded stimuli did not cause licking (data not shown). We did not observe significant effects of optogenetic stimulation on the rate of conditioning in a control group of animals undergoing the same experimental manipulations as the ChR2PFC group but expressing EYFP instead of ChR2 (controlPFC group: n = 5; Figure S3C; optogenetic stimulation coefficient b2 = 0.183; p = 1, Figure 3D, right; learning rate p = 0.78, Figure S3F). Between-group tests confirmed earlier change points (p = 0.008; Figure S3G) and higher learning rates (p = 0.004; Figure S3H) in the ChR2PFC group compared to the controlPFC group for photostimulated familiar CS+ (CS+famstim).
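The change-point and cumulative-incidence quantities used in these comparisons can be sketched as follows. The study determined change points following Spiegelhalter et al. (1996); the least-squares step fit below is a simpler stand-in chosen for illustration, and the lick counts, change-point values, and function names are all made up.

```python
def change_point(licks):
    """Least-squares single change point.

    Fits a two-level step function to the lick counts and returns the index
    of the first trial of the second segment.
    """
    best_k, best_cost = None, float("inf")
    for k in range(1, len(licks)):
        left, right = licks[:k], licks[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        cost = sum((x - mean_l) ** 2 for x in left)
        cost += sum((x - mean_r) ** 2 for x in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def cumulative_incidence(change_points, n_trials):
    """Fraction of animals whose change point lies at or before each trial."""
    n_animals = len(change_points)
    return [sum(cp <= t for cp in change_points) / n_animals
            for t in range(n_trials)]


licks = [0, 1, 0, 0, 1, 5, 6, 4, 7, 6]  # made-up anticipatory lick counts
cp = change_point(licks)
curve = cumulative_incidence([5, 8, 12], n_trials=15)  # made-up change points
print(cp, curve)
```

A curve that rises earlier for one condition than for another corresponds to earlier change points in that condition; the statistical comparison itself (the survival analysis) is not reproduced here.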
The Suppression of Novel Cue-Evoked Dopamine Decelerates the Development of Conditioned Responding
Next, we experimentally reduced dopamine CS responses to novel stimuli using an optogenetic inhibition strategy with
[Figure 4 graphic omitted. Labels recoverable from the extraction: DAT-Cre; AAV5-EF1a-DIO-eNpHR3.0-EYFP in VTA/SNc; optic fiber (593 nm); conditioning task II with phase 1 and phase 2; odor, laser, and reward timelines; time to odor-on (s); # anticipatory licks; cumulative lick trials; change point incidence versus trial #; animal #2; eNpHR3.0 and EYFP groups.]
Figure 4. The Suppression of Novel Cue-Evoked Dopamine Decelerates the Development of Conditioned Responding (A) Optogenetic strategy to selectively inhibit dopamine neurons during CS in DAT-Cre mice injected with AAV5-EF1a-DIO-eNpHR3.0-EYFP (top). Optical inhibition was performed during odor presentation (CS+novinh), fading out linearly over a period of 1 s until shortly before reward delivery to avoid rebound excitation (Mahn et al., 2016). All experiments performed in the experimental group (eNpHR3.0 group, n = 7) were replicated in a control group of animals injected with AAV5-EF1a-DIO-EYFP (controleNpHR3 group, n = 8). (B) To evaluate the effect of a reduction of dopamine CS responses on associative learning, we used an experimental paradigm consisting of five experimental conditions (conditioning task II, right). In an initial training phase (phase 1), we paired one odorant with reward in 70% of all trials (CS+), whereas one odorant was presented without any consequence (CS‒). In the subsequent second phase (phase 2), the contingency for CS+ and CS‒ remained unchanged. We further introduced four novel odorants. Two of these novel odorants were paired with reward (CS+nov and CS+novinh), whereas two were not paired with reward (CS‒nov). During the presentation of one rewarded novel odor (CS+novinh), we performed optogenetic inhibition. (C) Anticipatory licks for CS+ and CS‒ at the end of phase 1 (mean of last five trials). (D) Anticipatory licks for CS+nov, CS+novinh and CS‒nov (mean of last five trials of second day of phase 2). (E) Anticipatory licks in an example animal learning to associate a novel odorant with reward without (CS+nov, green) and with (CS+novinh, yellow) optogenetic inhibition. The continuous line represents the moving average (window length 10 trials) of anticipatory licks. The two colored triangles at the horizontal axis indicate the behavioral change points in this example animal. Gray and white shading indicate different experimental sessions (one session per day). Optical inhibition with yellow laser was only applied on the first day (yellow squares). Inset: lick raster plots (1st trial, top row; one row per trial) show the onset of robust responding is shifted by optogenetic inhibition of dopamine during CS. (F) Cumulative anticipatory lick trials for CS+nov and CS+novinh averaged across all animals in the eNpHR3.0 group (mean ± SEM). (G) Cumulative incidence curves of the change points from animals in the eNpHR3.0 group (left) and the controleNpHR3 group (right), conditioned with novel odorants without (CS+nov, green) or with optogenetic inhibition (CS+novinh, yellow). *p < 0.05, **p < 0.01, ***p < 0.001.
halorhodopsin (eNpHR3.0) to directly test their contribution to the effectiveness of associative learning (conditioning task II; Figures 4A and 4B). We first trained mice (eNpHR3.0 group: n = 7 mice) for one week with one rewarded (CS+) and one non-rewarded cue (CS‒; phase 1). This training led to behavioral lick responses for CS+, but not CS‒ (CS+, p = 0.008; CS‒, p = 0.89; Figure 4C). After this initial training in phase 1, we started a second phase (phase 2) in which we introduced two novel stimuli, which were paired with reward (CS+nov). During the presentation of one of the novel cues, we performed bilateral photoinhibition of dopamine neurons (Chang et al., 2016) to suppress novelty-evoked transients (CS+novinh; Figures 4B and S4A). Taking into account the habituation of dopamine responses to novel stimuli, we inhibited cue responses throughout the first conditioning session (mean number of inhibited trials, 16 ± 0.75).
To preclude a bias from inductive generalization (Gershman and Niv, 2015), we also introduced two novel stimuli that were not paired with reward (CS‒nov), thus ensuring that the overall reward probability for newly introduced stimuli was at chance level. The non-inhibited novel, rewarded cue (CS+nov) allowed for within-animal comparison of the effect of photoinhibition of dopamine neurons. At the end of phase 2, mice had learned all associations, as indicated by the presence of robust anticipatory licking to all rewarded cues (CS+, p = 0.008; CS+nov, p = 0.008; and CS+novinh, p = 0.02) and the absence of licking to non-rewarded stimuli (CS‒ and CS‒nov, p = 1; Figure 4D). However, learning dynamics differed between photoinhibited (CS+novinh) and non-inhibited novel cues (CS+nov). Anticipatory licking responses developed more slowly when dopamine neurons were
inhibited during the presentation of the novel sensory stimulus (Figures 4E and 4F). The analysis of anticipatory licking change points confirmed that inhibition of novelty-evoked dopamine activity significantly slowed down the development of conditioned responses (optogenetic inhibition coefficient b3 = 2.12; p = 0.003; Figure 4G). Consistent with this, the reinforcement model revealed significantly lower learning rates for photoinhibited than for non-inhibited novel cues (p = 0.004; Figure S4C). Importantly, we did not find evidence for a direct effect of CS photoinhibition on licking behavior. There was no significant difference between photoinhibited and non-inhibited trials in the numbers of licks (CS+novinh 7.3 versus CS+nov 5.6) and the latency to first lick (CS+novinh 2001 ms versus CS+nov 2297 ms). Rather, we observed a shift in the behavioral change points of responding (Figures 4E, inset, and 4G). Photostimulation had no effect in control animals (controleNpHR3 group: n = 8 mice; Figure S4B) undergoing the same experimental manipulations as the eNpHR3.0 group but expressing EYFP instead of eNpHR3.0 (controleNpHR3 group: optogenetic inhibition coefficient b3 = 0.225; p = 0.34; Figure 4H; learning rate p = 0.22; Figure S4D). Additional between-group tests confirmed significantly later change points (p = 0.003; Figure S4E) and lower learning rates (p = 0.03; Figure S4F) in the eNpHR3.0 group compared to the controleNpHR3 group for photoinhibited novel CS+ (CS+novinh). Taken together, these results suggest that inhibition of dopamine neurons during the presentation of novel CS decelerates the development of conditioned responses.
DISCUSSION
The effectiveness of associative learning has been proposed to be directly influenced by the CS through variations of its associability (Mackintosh, 1975; Pearce and Hall, 1980). From a practical perspective, stimulus associability can be considered as the salience of a stimulus.
Various types of CS salience have been shown to influence dopamine neuron activity, including reward value, physical intensity (Fiorillo et al., 2013), the uncertainty about reward outcome (Fiorillo et al., 2003), and novelty (Lak et al., 2016; Ljungberg et al., 1992). A hallmark of models attributing associative learning to changes in CS effectiveness is that they provide an explanation for the widely observed behavioral phenomenon of latent inhibition. Latent inhibition has also been associated with dopamine, but it remained unknown how the activity patterns of dopamine neurons relate to the different learning rates observed for familiar and novel stimuli. Here, we used fiber photometry to measure the response of dopamine neurons to stimuli of varying familiarity, the stimulus dimension involved in latent inhibition. We then experimentally evaluated the contribution of novel CS-evoked responses to the effectiveness of associative learning by performing a bidirectional optogenetic manipulation of dopamine neurons selectively during the CS period. Specifically, we tested the hypothesis that dopamine CS responses promote the development of conditioned responses. In our novelty exposure paradigm, novel stimuli evoked dopamine transients of different response magnitude. Dopamine responses to novel stimuli rapidly habituated concomitant with
sniffing responses in the non-associative context of our paradigm. Based on the presence and absence of responses to novel stimuli at projection sites in the ventral and dorsal striatum, respectively, it has been suggested previously that only specific subpopulations of dopamine neurons, defined by their anatomical position and projection targets, are activated by novel stimuli (Menegas et al., 2017). The unimodal distribution of dopamine activation across animals by novel stimuli (Figure S1A) and the absence of a clear pattern across VTA and SNc observed in this study (Figure S1D) do not support the notion of a distinct subpopulation responding to novel stimuli. Moreover, we found responses to novel stimuli in the VTA, which is not expected from previous work that did not find such responses in the ventral striatum, a major VTA projection target (Menegas et al., 2017). The response of dopamine neurons in the VTA might in fact reflect prefrontal projections, which are known to originate in the VTA in rodents (Lammel et al., 2008). What drives dopamine responses to novel stimuli? Neither rareness, which caused a small behavioral reaction (Figure 1B), nor the unpredictable time of arrival of familiar stimuli in the novelty exposure paradigm caused noticeable changes in dopamine activity. Since odorants were never paired with reward in the novelty exposure paradigm, they also did not acquire conditioned reward value or generalized reward value from other cues. We tested responses to four different novel odorants in each animal, and odorants were balanced across different animals, yet behavioral and dopamine responses were only observed when stimuli were novel and not when they were familiar (Figures 1E–1H and S1G). 
These observations suggest that sensory stimuli can activate dopamine neurons when they provoke bottom-up attention, arousal, and behavioral orienting, which is commonly, but not exclusively, observed when animals experience unfamiliar stimuli (Nour et al., 2018; Schultz, 2016). It is well established that dopamine neurons are activated by evidence for and suppressed by evidence against reward (Fiorillo, 2013; Schultz, 2016). Although novel stimuli could in principle also indicate evidence against reward, neither we nor others have observed suppression of dopamine by novel stimuli. This suggests that novel stimulus-evoked dopamine activity indicates potential reward availability. Moreover, dopamine ensembles that were particularly responsive to water were also particularly responsive to novel stimuli. Based on estimates of intrinsic value obtained in an odor investigation assay, we further found that average dopamine responses to novel stimuli tended to be higher the more appetitive odorants were. Collectively, these findings support the view that dopamine neurons treat novel stimuli as evidence for reward (i.e., as positive RPE), which is related to the idea that dopamine responses to novel stimuli represent a “novelty bonus” that attaches an inherent a priori positive reward value to novel stimuli to motivate exploration (Kakade and Dayan, 2002). During associative learning, novel or otherwise salient cues activate dopamine neurons already in the very first trial. The role of these CS responses in associative learning has been less well studied than dopamine US responses, which signal prediction errors that determine how well an animal learns to associate a CS with a US (Chang et al., 2016; Steinberg et al., 2013). By manipulating dopamine specifically during the CS period, we
demonstrated that dopamine CS responses promote the rate at which conditioned responses develop. How does CS dopamine promote the development of conditioned responding? In principle, there are two alternative hypotheses (which are not mutually exclusive). CS dopamine might either promote associative learning or promote reward-oriented approach behavior in a non-associative manner related to incentive salience (Berridge and Robinson, 1998; McClure et al., 2003). According to the latter hypothesis, dopamine lowers the threshold for the animal to take action in response to the CS. Our optogenetic dopamine manipulations could thus have affected the rate at which conditioned responses develop, without affecting learning. The experimental design of this study does not allow a clear differentiation between the two hypotheses, but we did not find evidence in favor of a direct influence of dopamine on licking behavior. CS dopamine stimulation in our experiments did not cause licking, per se, but selectively promoted responding only to rewarded cues. Similarly, CS dopamine photoinhibition does not seem to have prevented licking, as animals had the same number of licks in trials with and without photoinhibition. Therefore, we favor the interpretation that CS dopamine increases the associability of the CS to promote associative learning, consistent with theoretical models in which the CS plays a critical role in forming the association between the CS and the US (Mackintosh, 1975; Pearce and Hall, 1980). The function of dopamine in associative learning could thus involve a combination of CS and US responses, which drive CS and US associability, respectively. This idea is consistent with dopamine response properties recorded previously in primates and with theoretical models combining stimulus associability and error terms (Lak et al., 2016; Le Pelley, 2004; Schultz, 2016). 
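One way to make the combination of a CS associability term and a US-driven error term concrete is the classical Rescorla-Wagner update (Rescorla and Wagner, 1972) with a cue-specific salience parameter. It is offered here as an illustrative formalization only, not as the model fitted in this study, and the symbols are the conventional ones; they are unrelated to the survival-analysis coefficients b1 to b3 reported in the Results.

```latex
\Delta V_{\mathrm{CS}} = \alpha_{\mathrm{CS}} \, \beta_{\mathrm{US}} \left( \lambda - \sum_{i} V_{i} \right)
```

Here V_CS is the associative strength of the cue, alpha_CS its salience (associability), beta_US a rate parameter tied to the US, lambda the maximum associative strength the US supports, and the sum runs over all cues present on the trial. Under this reading, raising or lowering alpha_CS would change how quickly V_CS approaches lambda without changing the asymptote.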
Our experimental finding that dopamine CS responses promote conditioning also has important implications for latent inhibition. Latent inhibition experiments involve a pre-exposure phase, during which inconsequential stimuli are presented, and a conditioning phase, during which previously exposed and novel stimuli are associated with an outcome. Whereas pharmacological manipulation of dopamine during the pre-exposure has no effect on latent inhibition, disrupting dopamine during subsequent conditioning also disrupts latent inhibition (Weiner et al., 1984). Thus, the contribution of dopamine appears to occur during conditioning. We found that selectively increasing or decreasing CS dopamine essentially recapitulated the different learning rates observed for novel or pre-exposed stimuli, respectively. This suggests that it is via their cue-related activity that dopamine neurons contribute to latent inhibition. Accordingly, the absence of dopamine responses to familiar CS might explain their slower learning in latent inhibition. Further studies are needed to reveal the mechanism by which CS dopamine affects latent inhibition and associative learning.
STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• LEAD CONTACT AND MATERIAL AVAILABILITY
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Animals
• METHOD DETAILS
  ◦ Stereotactic Surgeries
  ◦ Fiber Photometry
  ◦ Novelty Exposure Paradigm and Water/Air Puff Paradigm
  ◦ Odor Investigation Assay
  ◦ Optogenetics
  ◦ Behavioral Paradigms for Optogenetic Manipulation of Dopamine during CS
  ◦ Histology
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Analysis Novelty Exposure Paradigm
  ◦ Analysis of Conditioning Experiments
• DATA AND CODE AVAILABILITY
SUPPLEMENTAL INFORMATION Supplemental Information can be found online at https://doi.org/10.1016/j. neuron.2020.01.012. ACKNOWLEDGMENTS We thank Pamela Martin Del Olmo and Martijn Broux (NERF) for excellent assistance with mouse behavioral experiments and histological analysis. This work was supported by a PhD scholarship (11ZA317N) from the Research Foundation–Flanders (FWO) to J.M. and a Career Development Award from the Human Frontier Science Program (CDA00029/2013-C) and a Marie-Curie Career Integration Grant from the European Union (DopaPredict) to S.H. AUTHOR CONTRIBUTIONS Conceptualization, J.M. and S.H.; Investigation, J.M., C.A., A.J.v.R., and J.E.R.; Formal Analysis, J.M., C.A., and S.H.; Writing – Original Draft, J.M. and S.H.; Writing – Review & Editing, J.M., C.A., and S.H.; Visualization, C.A. and S.H.; Supervision, S.H. DECLARATION OF INTERESTS The authors declare no competing interests. Received: November 19, 2018 Revised: October 28, 2019 Accepted: January 13, 2020 Published: February 5, 2020 REFERENCES €ckman, C.M., Malik, N., Zhang, Y., Shan, L., Grinberg, A., Hoffer, B.J., Ba Westphal, H., and Tomac, A.C. (2006). Characterization of a mouse strain expressing Cre recombinase from the 30 untranslated region of the dopamine transporter locus. Genesis 44, 383–390. Barto, A., Mirolli, M., and Baldassarre, G. (2013). Novelty or surprise? Front. Psychol. 4, 907. Bayer, H.M., and Glimcher, P.W. (2005). Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141. Berridge, K.C., and Robinson, T.E. (1998). What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res. Brain Res. Rev. 28, 309–369. €zel, E. (2006). Absolute coding of stimulus novelty in the Bunzeck, N., and Du human substantia nigra/VTA. Neuron 51, 369–379. Chang, C.Y., Esber, G.R., Marrero-Garcia, Y., Yau, H.J., Bonci, A., and Schoenbaum, G. (2016). Brief optogenetic inhibition of dopamine neurons
Please cite this article in press as: Morrens et al., Cue-Evoked Dopamine Promotes Conditioned Responding during Learning, Neuron (2020), https:// doi.org/10.1016/j.neuron.2020.01.012
mimics endogenous negative reward prediction errors. Nat. Neurosci. 19, 111–116. Christison, G.W., Atwater, G.E., Dunn, L.A., and Kilts, C.D. (1988). Haloperidol enhancement of latent inhibition: relation to therapeutic action? Biol. Psychiatry 23, 746–749. Cohen, J.Y., Haesler, S., Vong, L., Lowell, B.B., and Uchida, N. (2012). Neurontype-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88. Esquivelzeta Rabell, J., Mutlu, K., Noutel, J., Martin Del Olmo, P., and Haesler, S. (2017). Spontaneous Rapid Odor Source Localization Behavior Requires Interhemispheric Communication. Curr. Biol. 27, 1542–1548.e4. Fiorillo, C.D. (2013). Two dimensions of value: dopamine neurons represent reward but not aversiveness. Science 341, 546–549.
Mansy, M.M., Kim, H., and Oweiss, K.G. (2019). Spatial detection characteristics of a single photon fiber photometry system for imaging neural ensembles. In 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER) (IEEE), pp. 969–972. Margolis, E.B., Lock, H., Hjelmstad, G.O., and Fields, H.L. (2006). The ventral tegmental area revisited: is there an electrophysiological marker for dopaminergic neurons? J. Physiol. 577, 907–924. McClure, S.M., Daw, N.D., and Montague, P.R. (2003). A computational substrate for incentive salience. Trends Neurosci. 26, 423–428. McNamara, C.G., Tejero-Cantero, A´., Trouche, S., Campo-Urriza, N., and Dupret, D. (2014). Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nat. Neurosci. 17, 1658–1660.
Fiorillo, C.D., Tobler, P.N., and Schultz, W. (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902.
Menegas, W., Babayan, B.M., Uchida, N., and Watabe-Uchida, M. (2017). Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice. eLife 6, e21886.
Fiorillo, C.D., Song, M.R., and Yun, S.R. (2013). Multiphasic temporal dynamics in responses of midbrain dopamine neurons to appetitive and aversive stimuli. J. Neurosci. 33, 4710–4725.
Menegas, W., Akiti, K., Amo, R., Uchida, N., and Watabe-Uchida, M. (2018). Dopamine neurons projecting to the posterior striatum reinforce avoidance of threatening stimuli. Nat. Neurosci. 21, 1421–1430.
Gershman, S.J., and Niv, Y. (2015). Novelty and Inductive Generalization in Human Reinforcement Learning. Top. Cogn. Sci. 7, 391–415.
Miller, R.R., Barnet, R.C., and Grahame, N.J. (1995). Assessment of the Rescorla-Wagner model. Psychol. Bull. 117, 363–386.
Jacob, S.N., Ott, T., and Nieder, A. (2013). Dopamine regulates two classes of primate prefrontal neurons that represent sensory signals. J. Neurosci. 33, 13724–13734.
Molloy, A.G., and Waddington, J.L. (1985). Sniffing, rearing and locomotor responses to the D-1 dopamine agonist R-SK&F 38393 and to apomorphine: differential interactions with the selective D-1 and D-2 antagonists SCH 23390 and metoclopramide. Eur. J. Pharmacol. 108, 305–308.
Kakade, S., and Dayan, P. (2002). Dopamine: generalization and bonuses. Neural Netw. 15, 549–559. ski, J., Mamelak, A.N., Birch, K., Mosher, C.P., Tagliati, M., and Kamin Rutishauser, U. (2018). Novelty-Sensitive Dopaminergic Neurons in the Human Substantia Nigra Predict Success of Declarative Memory Formation. Curr. Biol. 28, 1333–1343.e4. Kermen, F., Midroit, M., Kuczewski, N., Forest, J., The´venet, M., Sacquet, J., Benetollo, C., Richard, M., Didier, A., and Mandairon, N. (2016). Topographical representation of odor hedonics in the olfactory bulb. Nat. Neurosci. 19, 876–878. Kobayakawa, K., Kobayakawa, R., Matsumoto, H., Oka, Y., Imai, T., Ikawa, M., Okabe, M., Ikeda, T., Itohara, S., Kikusui, T., et al. (2007). Innate versus learned odour processing in the mouse olfactory bulb. Nature 450, 503–508. Lak, A., Stauffer, W.R., and Schultz, W. (2016). Dopamine neurons learn relative chosen value from probabilistic rewards. eLife 5, e18044. €ckel, O., Jones, I., Liss, B., and Roeper, J. (2008). Lammel, S., Hetzel, A., Ha Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron 57, 760–773. Le Pelley, M.E. (2004). The role of associative history in models of associative learning: a selective review and a hybrid model. Q. J. Exp. Psychol. B 57, 193–243. Ljungberg, T., Apicella, P., and Schultz, W. (1992). Responses of monkey dopamine neurons during learning of behavioral reactions. J. Neurophysiol. 67, 145–163. Lubow, R.E. (1989). Latent Inhibition and Conditioned Attention Theory (Cambridge University Press). Lubow, R.E., and Moore, A.U. (1959). Latent inhibition: the effect of nonreinforced pre-exposure to the conditional stimulus. J. Comp. Physiol. Psychol. 52, 415–419. Mackintosh, N.J. (1975). A theory of attention: variations in the associability of stimuli with reinforcement. Psychol. Rev. 82, 276–298. Mahn, M., Prigge, M., Ron, S., Levy, R., and Yizhar, O. (2016). 
Biophysical constraints of optogenetic inhibition at presynaptic terminals. Nat. Neurosci. 19, 554–556. Makanjuola, R.O.A., Dow, R.C., and Ashcroft, G.W. (1980). Behavioural responses to stereotactically controlled injections of monoamine neurotransmitters into the accumbens and caudate-putamen nuclei. Psychopharmacology (Berl.) 71, 227–235.
Morris, G., Nevet, A., Arkadir, D., Vaadia, E., and Bergman, H. (2006). Midbrain dopamine neurons encode decisions for future action. Nat. Neurosci. 9, 1057–1063. Mutlu, K., Rabell, J.E., Martin Del Olmo, P., and Haesler, S. (2018). IR thermography-based monitoring of respiration phase without image segmentation. J. Neurosci. Methods 301, 1–8. Nelson, A.J.D., Thur, K.E., Marsden, C.A., and Cassaday, H.J. (2010). Catecholaminergic depletion within the prelimbic medial prefrontal cortex enhances latent inhibition. Neuroscience 170, 99–106. Noudoost, B., and Moore, T. (2011). Control of visual cortical signals by prefrontal dopamine. Nature 474, 372–375. Nour, M.M., Dahoun, T., Schwartenbeck, P., Adams, R.A., FitzGerald, T.H.B., Coello, C., Wall, M.B., Dolan, R.J., and Howes, O.D. (2018). Dopaminergic basis for signaling belief updates, but not surprise, and the link to paranoia. Proc. Natl. Acad. Sci. USA 115, E10167–E10176. Pearce, J.M., and Hall, G. (1980). A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol. Rev. 87, 532–552. Pisanello, M., Pisano, F., Hyun, M., Maglie, E., Balena, A., De Vittorio, M., Sabatini, B.L., and Pisanello, F. (2019). The Three-Dimensional Signal Collection Field for Fiber Photometry in Brain Tissue. Front. Neurosci. 13, 82. Rascle, C., Mazas, O., Vaiva, G., Tournant, M., Raybois, O., Goudemand, M., and Thomas, P. (2001). Clinical features of latent inhibition in schizophrenia. Schizophr. Res. 51, 149–161. Rebec, G.V., Christensen, J.R., Guerra, C., and Bardo, M.T. (1997). Regional and temporal differences in real-time dopamine efflux in the nucleus accumbens during free-choice novelty. Brain Res. 776, 61–67. Rescorla, R., and Wagner, A. (1972). A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In Classical Conditioning II: Current Research and Theory, A.H. Black and W.F. Prokasy, eds., pp. 64–99. 
Salinas-Herna´ndez, X.I., Vogel, P., Betz, S., Kalisch, R., Sigurdsson, T., and Duvarci, S. (2018). Dopamine neurons drive fear extinction learning by signaling the omission of expected aversive outcomes. Elife 7, https://doi. org/10.7554/eLife.38818. Schultz, W. (2016). Dopamine reward prediction-error signalling: a twocomponent response. Nat. Rev. Neurosci. 17, 183–195.
Schultz, W., Dayan, P., and Montague, P.R. (1997). A neural substrate of prediction and reward. Science 275, 1593–1599.
Solomon, P.R., Crider, A., Winkelman, J.W., Turi, A., Kamer, R.M., and Kaplan, L.J. (1981). Disrupted latent inhibition in the rat with chronic amphetamine or haloperidol-induced supersensitivity: relationship to schizophrenic attention disorder. Biol. Psychiatry 16, 519–537.
Spiegelhalter, D., Thomas, A., Best, N., and Gilks, W. (1996). BUGS Examples Volume 2, Version 0.5 (version ii). 2 (MRC Biostatistics Unit), pp. 0–75.
Steinberg, E.E., Keiflin, R., Boivin, J.R., Witten, I.B., Deisseroth, K., and Janak, P.H. (2013). A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973.
Sutton, R.S., and Barto, A.G. (1998). Reinforcement Learning (MIT Press).
Tobler, P.N., Fiorillo, C.D., and Schultz, W. (2005). Adaptive coding of reward value by dopamine neurons. Science 307, 1642–1645.
Vonta, F. (2009). The frailty model. J. Appl. Stat. 36, 927–928.
Weiner, I., Feldon, J., and Katz, Y. (1987). Facilitation of the expression but not the acquisition of latent inhibition by haloperidol in rats. Pharmacol. Biochem. Behav. 26, 241–246.
Weiner, I., Lubow, R.E., and Feldon, J. (1984). Abolition of the expression but not the acquisition of latent inhibition by chronic amphetamine in rats. Psychopharmacology (Berl.) 83, 194–199.
Weiner, I., Lubow, R.E., and Feldon, J. (1988). Disruption of latent inhibition by acute administration of low doses of amphetamine. Pharmacol. Biochem. Behav. 30, 871–878.
STAR★METHODS
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Tyrosine Hydroxylase antibody | Abcam | Cat#ab112; RRID: AB_297840
Alexa Fluor594 | Abcam | Cat#ab150080; RRID: AB_2650602
Bacterial and Virus Strains
hEF1-LS1L-GCaMP6 | Massachusetts Institute of Technology Viral Core Facility | N/A
AAV-EF1a-DIO-eNpHR3.0-EYFP | University of North Carolina Vector Core | N/A
AAV5-EF1a-DIO-hChR2(H134R)-EYFP | University of North Carolina Vector Core | N/A
AAV5-EF1a-DIO-EYFP | University of North Carolina Vector Core | N/A
Chemicals, Peptides, and Recombinant Proteins
Benzyl acetate | Sigma Aldrich | Cat#W213500
Trans-2,cis-6-Nonadienal | Sigma Aldrich | Cat#W337706
Isoamyl acetate | Sigma Aldrich | Cat#W205508
Heptanal | Sigma Aldrich | Cat#W254002
Thiophene | Sigma Aldrich | Cat#T31801
Phenylethyl alcohol | Sigma Aldrich | Cat#W285803
2,3-Dimethylpyrazine | Sigma Aldrich | Cat#W327107
Limonene | Sigma Aldrich | Cat#W504505
Carvone | Sigma Aldrich | Cat#435759
Acetic acid | Sigma Aldrich | Cat#W200611
Benzaldehyde | Sigma Aldrich | Cat#418099
Anisole | Sigma Aldrich | Cat#W209708
Ethyl valerate | Sigma Aldrich | Cat#W246204
Pentenoic acid | Sigma Aldrich | Cat#W284300
2,3,4-Trimethylpyrazine | Sigma Aldrich | Cat#W324418
Salicylaldehyde | Sigma Aldrich | Cat#W300403
α-Phellandrene | Sigma Aldrich | Cat#W285611
Paraffin oil | Sigma Aldrich | Cat#18512
Geraniol | Sigma Aldrich | Cat#W250708
Methyl methacrylate | Sigma Aldrich | Cat#M55909
Dimethyl sulfoxide | Sigma Aldrich | Cat#276855
Vectashield | Vector Laboratories | Cat#H-1000
Experimental Models: Organisms/Strains
DAT-Cre mouse (B6.SJL-Slc6a3tm1.1(cre)Bkmn/J) | Jackson Laboratory | Cat#006660
C57BL/6J male mice | KU Leuven | N/A
Software and Algorithms
MATLAB | Mathworks | N/A
SAS | SAS Institute | N/A
R | r-project (https://www.r-project.org/) | N/A
LEAD CONTACT AND MATERIAL AVAILABILITY
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Sebastian Haesler ([email protected]). This study did not generate new unique reagents.
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Animals
We used a total of 54 adult male mice heterozygous for Cre-recombinase under control of the DAT gene. Animals were housed on a 12 h dark/12 h light cycle (dark from 07:00 to 19:00). Before animals started behavioral tasks, they were habituated to head restraint for three days. All animals performed their task around the same time each day. All procedures were performed in accordance with the guidelines of the Federation of European Laboratory Animal Science Associations and approved by the KU Leuven Animal Ethics Committee.
METHOD DETAILS
Stereotactic Surgeries
At the time of surgery, mice were 3-6 months old. Surgeries were performed using a stereotactic apparatus with animals under ketamine/medetomidine hydrochloride anesthesia (0.75 mg/kg and 1 mg/kg respectively), administered intraperitoneally (IP). Body temperature was maintained at 37°C using a feedback-controlled heating pad (Harvard Apparatus) and the depth of anesthesia was checked throughout surgeries by monitoring tail pinch response, whisking, breathing rate, and eye reflexes. Analgesia was administered postoperatively (Meloxicam, 1 mg/kg, IP).
For fiber photometry experiments, animals underwent two consecutive surgeries: (1) viral injection into VTA and SNc, and (2) implantation of a head plate and a micro-drive with optic fiber. To express GCaMP6s specifically in dopaminergic neurons, we unilaterally injected a total of 400-500 nL of adeno-associated virus (AAV) containing a Cre-dependent vector driving GCaMP6s (hEF1-LS1L-GCaMP6) into the VTA and SNc. To target VTA, we injected virus at the following coordinates, relative to bregma: −3.1 AP, +0.5 ML, +4 and +4.5 DV. To target SNc, we injected virus at the following coordinates, relative to bregma: −3.1 AP, +0.8 ML, +4 and +4.5 DV. After a recovery period of at least one week, mice underwent a second surgery.
During the second surgery, a micro-drive coupled to an optic fiber (400 µm diameter, numerical aperture 0.39, Thorlabs) was implanted, allowing chronic, stable, and minimally disruptive access to deep brain regions. This micro-drive was stereotactically implanted into VTA/SNc after a head plate had been secured to the skull with dental acrylic. Using the screw of the micro-drive, we were able to lower the optic fiber along the dorsoventral axis in the brain.
For optogenetic manipulation experiments, animals also underwent two surgeries. In the first surgery, a total of 800 nL AAV was injected bilaterally in VTA/SNc. For optogenetic inhibition, we used AAV expressing Cre-dependent eNpHR3.0 (AAV5-EF1a-DIO-eNpHR3.0-EYFP), whereas for stimulation, we used AAV expressing Cre-dependent ChR2 (AAV5-EF1a-DIO-hChR2(H134R)-EYFP). Following a recovery period of at least one week, mice underwent a second surgery to implant a head plate and two optic fibers (200 µm diameter, Thorlabs). Optic fibers were implanted at a depth of 4.0 mm and secured to the skull with dental cement. Anteroposterior and lateromedial coordinates were the same as for photometry experiments. For the optogenetic stimulation experiments in prefrontal cortex, ChR2-AAV was bilaterally injected in VTA/SNc at the aforementioned coordinates, and optic fibers were implanted at the following coordinates, relative to bregma: +1.8 AP, +0.45 ML, +1.6 DV. To allow for recovery and sufficient gene expression, behavioral experiments started 2-3 weeks after the second surgery. All virus constructs were obtained from University of North Carolina Vector Core.
Fiber Photometry
The implanted optic fiber-coupled micro-drive interfaced with a flexible patch cord (Doric Lenses) on the skull surface, which simultaneously delivered excitation light (405 nm and 465 nm, continuous, 500 mA, Doric Lenses LED) and collected fluorescence emission. LEDs of the photometer were switched off between trials to minimize bleaching or tissue damage.
Emitted fluorescent light was spectrally separated from excitation light using a dichroic mirror and collected by two photomultiplier tubes (PMTs), one for the 430 nm background signal and one for the 525 nm GCaMP6s signal. We subtracted the background signal from the GCaMP6s signal to remove potential motion artifacts. Moreover, signals were detrended to correct for bleaching across sessions and smoothed using a lowess filter (300 ms smoothing window). Then, we fit a 4th degree polynomial to adjust for within-trial signal bleaching. To ensure the fit was not distorted by any stimulus-evoked responses, the 4 s time window after odor onset was replaced by linear interpolation for the fit. From the resulting GCaMP6s signal, we calculated a Z-score variable by comparing the average and variance of the signal in a 2 s period before each trial (F0) with the signal at any given point during the trial (F1). The calculation for each point in the trial was then z = (F1 − mean(F0)) / var(F0). PMT voltage output signals were collected at a sampling frequency of 1000 Hz by a National Instruments (NI) board, which was controlled by the same software (LabView) that was used to control water delivery. This ensured that GCaMP6s signals, task events (odor or water/air puff delivery), and behavioral variables could be readily aligned.
Novelty Exposure Paradigm and Water/Air Puff Paradigm
After surgery and recovery, animals were held on a mild water deprivation schedule, such that water could serve as a reward during experimental sessions (Cohen et al., 2012). Water drops contained 2 µl water. After each session, we provided ad libitum access to water for 10 min. Licks were recorded via a NI board connected to LabView software (National Instruments). Air puffs were delivered to the mouse's nose, which is presumably aversive for the animals. Odors were delivered for 2 s using a custom-made olfactometer
(Esquivelzeta Rabell et al., 2017). Each odor was dissolved in paraffin oil to achieve a concentration of 100 ppm, and one milliliter of diluted odor was placed inside each sampling bottle connected to constant air flow. To measure responses to novel, rare familiar, and familiar stimuli, we selected odorants from a list of chemicals, including thiophene, heptaldehyde, α-phellandrene, acetic acid, anisole, 2,3,5-trimethylpyrazine, isoamyl acetate, dimethylpyrazine, (s)-(-)-limonene, ethyl valerate, methyl methacrylate, benzaldehyde, phenethyl alcohol, (s)-(+)-carvone, dimethyl sulfoxide, salicylaldehyde, trans-2,cis-6-nonadienal, 4-pentenoic acid, benzyl acetate, and geraniol. Odorants were chosen pseudorandomly for the different experimental conditions, with the aim of balancing odorants such that the same chemicals could serve as 'novel' and as 'familiar' in different animals (Figure S1G). Prior to photometry measurements, animals were first familiarized with 8 odor stimuli for 5 consecutive days (Phase 1). Following each odor familiarization session, mice received 20 unexpected water drops and 20 unexpected air puffs at random intervals. On the day of the photometric measurement (Phase 2A), animals started with a short-term habituation block including all previously familiarized odors (3 presentations/odor). Following this short-term habituation block, four novel odors were introduced. Each novel odor was presented four times, whereas common familiar odors were presented 28 times each. Four previously familiarized stimuli were made rare by presenting them only four times in the session. All novel, common, and rare familiar stimuli were presented in random order. The interstimulus interval (ISI) was drawn from an exponential distribution with a mean of 5 s and cutoffs at 2 s (minimum) and 20 s (maximum). Throughout the paradigm, respiration was registered using a non-contact, all-optical approach with a thermal camera (Esquivelzeta Rabell et al., 2017; Mutlu et al., 2018).
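The interstimulus interval rule above can be illustrated with a short Python sketch. It assumes that out-of-range draws are rejected and redrawn; the text does not say whether values were redrawn or clipped, so this is only one plausible reading. The function name `sample_isi` and the random seed are hypothetical.

```python
import random


def sample_isi(mean_s=5.0, min_s=2.0, max_s=20.0, rng=random):
    """Draw one interstimulus interval (in seconds).

    Values come from an exponential distribution with the given mean.
    ASSUMPTION: draws outside [min_s, max_s] are discarded and redrawn.
    """
    while True:
        isi = rng.expovariate(1.0 / mean_s)  # expovariate takes the rate = 1 / mean
        if min_s <= isi <= max_s:
            return isi


# Example: 1000 intervals from a seeded generator (seed chosen arbitrarily).
rng = random.Random(0)
isis = [sample_isi(rng=rng) for _ in range(1000)]
```

Under this rejection scheme the empirical mean of the accepted intervals will generally differ from the nominal 5 s, because short intervals are removed far more often than long ones.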
To relate novelty-evoked responses to the known water/air puff responses of dopamine neurons in mice, we delivered 20 unexpected water drops and air puffs after completing novelty exposure (Phase 2B). The water spout for water delivery and the tube for air puff delivery were only added to the setup after the animal had completed the novelty exposure paradigm. To collect baseline data, we also introduced 27 and 20 blank trials (no odor, no reward, no air puffs) randomly in the novelty exposure paradigm and reward/air puff paradigm, respectively.
Odor Investigation Assay
The odor investigation experiment was performed during the day between 09:00 and 17:00 h in a laminar flow fume hood. The experiment was performed in transparent plastic mouse cages (395 × 346 × 213 mm) placed on a light table (400 × 400 × 100 mm) for background illumination. The investigation assay was conducted separately for each animal, one session per day. Odor investigation was measured for fourteen odorants (Figures S1I and S1J, Sigma Aldrich), diluted in 5 mL mineral oil in order to obtain a final vapor pressure of 1 Pa. Odorants were presented to animals using filter paper (Grade 413 VWR) soaked with odorant solution (1 ml), which was placed in a small disposable Petri dish (35 × 10 mm) in which small openings were drilled for air exchange. Each experimental session started with 2 consecutive habituation phases of 10 min each, during which animals were exposed to the same mineral oil also used for odorant dilutions. After that, animals were exposed for 3 min to an odorant, followed by a 5 min intertrial interval without odorant exposure. We performed five such trials per day. Each habituation phase and odorant trial was performed in a fresh cage. The cage in which animals were kept during the intertrial interval was reused.
To avoid cross-contamination of odorants between experimental animals, we transferred each mouse into a new housing cage right after the end of each experimental session. In between different animals tested on the same day, we cleaned the cages with 32 g/100 ml isopropyl-tridecyl-dimethyl-ammonium (Umonium 38 neutralis spray). The order of mice and odorants was randomized across the experiment. During all odorant trials, mice were recorded using two webcams (Logitech HD C525). One camera was positioned to record the whole cage from a bird's-eye view, the other was zoomed in on the Petri dish site. Video files were scored manually with the observer blinded to odorant and animal identity. To quantify odorant investigation time, we defined an "odorant zone" as the circular area around the Petri dish with a diameter of one body length of the animal. Investigation time was scored from the moment the animal had entered the zone with at least three paws until it left the zone (at least three paws out). We further counted the number of visits to the odorant zone. The mean investigation time per odorant was obtained by dividing the total investigation time by the number of visits.
Optogenetics
Implanted optical fibers were coupled to fiber-optic pieces connecting to a yellow laser (593 nm; Laserglow Technologies) for inhibition experiments and a blue laser (473 nm; Laserglow Technologies) for stimulation experiments. A beam splitter (Thorlabs) was used to deliver light bilaterally. In the CS stimulation experiment, we delivered trains of 10 pulses (5 ms pulse duration) at 20 Hz (i.e., one pulse every 50 ms) at the time of odor delivery (starting 250 ms after odor onset) during illumination trials (Figure 2E). In the CS inhibition experiment, we delivered continuous light starting 1 s before odor presentation until 350 ms before reward delivery (Figure 4A).
We started the illumination prior to stimulus onset, following the common practice in the field, to allow for sufficiently long hyperpolarization and thus ensure that dopamine neurons are effectively inhibited when sensory input would normally drive burst firing (Salinas-Hernández et al., 2018). During the last second, we gradually reduced the light intensity to avoid rebound excitation (Mahn et al., 2016). Laser intensity at the fiber tip was adjusted to a total power of 1-5 mW in stimulation experiments and 10-16 mW in inhibition experiments. To ensure animals did not see laser pulses, the end of the patch cord was covered with black plasticine.
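As an illustration of the stimulation timing described above (10 pulses of 5 ms at 20 Hz, starting 250 ms after odor onset), the following Python sketch lists pulse onset and offset times relative to odor onset. The function name `pulse_train_onsets` is hypothetical, and the sketch does not represent the actual laser control software.

```python
def pulse_train_onsets(n_pulses=10, rate_hz=20.0, start_ms=250.0):
    """Return pulse onset times (ms after odor onset) for a regular pulse train."""
    period_ms = 1000.0 / rate_hz  # 20 Hz corresponds to one pulse every 50 ms
    return [start_ms + i * period_ms for i in range(n_pulses)]


pulse_ms = 5.0                               # pulse duration stated in the text
onsets = pulse_train_onsets()                # onset of each of the 10 pulses
offsets = [t + pulse_ms for t in onsets]     # time at which each pulse ends
```

With these parameters the train spans less than half a second, so it ends well within the 2 s odor presentation.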
Behavioral Paradigms for Optogenetic Manipulation of Dopamine during CS
To study the role of dopamine CS responses in associative learning, we used two conditioning tasks (Conditioning task I & II). In all rewarded trials, we paired odorant stimuli (delivered for 2 s) with water reward (delivered 2 s after odor offset).
In the CS stimulation experiment (Conditioning task I, Figures 2 and 3), we evaluated how stimulating dopamine neurons during familiar CS affects the speed of CS-US association. In a first training phase (phase 1, 5 days), we paired one odorant with reward (CS+). We also presented four different odors, not paired with any consequence (CS‒). On day 6, we presented each CS+ and CS‒ of phase 1 again five times. This provided the behavioral endpoint of phase 1. Then we started a second phase (phase 2), which had six experimental conditions. All experimental conditions were block-randomized, such that each block included one trial of each experimental condition. We changed the contingency for two odorants which were introduced during phase 1. They became rewarded (CS+fam and CS+famstim). Two odorants introduced during phase 1 remained non-rewarded (CS‒ and CS‒famstim). The contingency for the rewarded odorant introduced during phase 1 also remained unchanged (CS+). Finally, we also introduced a novel odorant, which was paired with reward (CS+nov). During the presentation of CS+famstim and CS‒famstim, blue laser stimulation was performed as described above. Data were collected for 5 consecutive days. Laser stimulation was delivered for a minimum of 20 trials. Once 20 trials were completed in an animal, we continued to stimulate the remaining trials in that session, but all subsequent sessions were performed without stimulation. Sessions in phase 2 were terminated when animals stopped licking in response to CS+, which indicated they were sated.
The mean number of trials performed per session in the CS stimulation experiment was 15.8 ± 0.6 trials for each experimental condition.
In the CS inhibition experiment (Conditioning task II; Figure 4), we evaluated the effect of inhibiting novelty-evoked CS responses on the speed of CS-US association. To ensure animals did not learn associations too rapidly, we set the reward probability for rewarded stimuli to 70% in this task. In a first training phase (phase 1, 5 days), we paired one odor with reward (CS+) and presented one odor without any consequence (CS‒). Animals were trained between four and five days in phase 1 (mean 4.4 ± 0.2 days). After the initial training phase, we started a second phase (phase 2), which had five experimental conditions. Experimental conditions were block-randomized, such that each block included one trial of each experimental condition. In the first 5 blocks of phase 2, we maintained the contingencies of phase 1 (one odor CS+, one odor CS‒). Then we introduced four novel odorants. Two of these novel odorants were paired with reward (CS+nov and CS+novinh) and two were not paired with reward (CS‒nov). We chose to introduce two novel non-rewarded stimuli (CS‒nov) to ensure the chance of being rewarded was balanced for novel stimuli. CS+ and CS‒ from phase 1 were continued during phase 2. We ensured that both CS+nov and CS+novinh always led to the same outcome within each block, i.e., if CS+nov was rewarded in a given block, CS+novinh was also rewarded, and if CS+nov was not rewarded in a given block, CS+novinh was also not rewarded. During the presentation of CS+novinh, yellow laser illumination was delivered as described above. Data were collected for 2 consecutive days, with laser illumination being delivered only on the first day. Sessions in phase 2 were terminated when animals stopped licking in response to CS+, which indicated they were sated.
The mean number of trials performed per session in the CS inhibition experiment was 17.7 ± 0.7 trials for each CS odorant.
Histology
Upon completion of experiments, mice received an overdose of ketamine/medetomidine, were exsanguinated with saline, and were perfused with 4% paraformaldehyde. Brains were cut in 80 µm coronal sections on a vibratome. Virus expression in dopamine neurons was determined through EYFP fluorescence, and 4′,6-diamidino-2-phenylindole (DAPI, Vectashield) was applied to sections to visualize nuclei. Slides were examined to verify that the optic fiber tip was positioned in or above VTA/SNc dopamine neurons, or in or above prefrontal cortex, and in a virus-expressing region. Immunohistochemical labeling for tyrosine hydroxylase (TH), the rate-limiting enzyme in the biosynthesis of catecholamines, was performed using an antibody against mouse tyrosine hydroxylase (Abcam ab112, dilution 1:500). A fluorescently (Alexa Fluor594) labeled secondary antibody (Abcam ab150080, dilution 1:1000) was used to visualize the presence of TH in the tissue.
QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analyses were performed in MATLAB, R and SAS. When evaluating significance, a value of p ≤ 0.05 was considered significant. Unless stated otherwise, we used the one-sided Wilcoxon signed rank test for single samples as well as paired data and the one-sided Wilcoxon rank sum test for unpaired data to evaluate statistical significance.
Analysis Novelty Exposure Paradigm
Breathing frequency was obtained by counting the number of inhalations in a 500 ms backward-looking sliding window across each trial. While photometry measurements were z-score normalized (see above), breathing signals were baseline-subtracted (breathing frequency throughout the trial minus the per-trial average breathing frequency in a 2000 ms pre-odor window).
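The two per-trial normalizations mentioned above can be sketched in Python as follows. The photometry normalization follows the formula given under Fiber Photometry, z = (F1 − mean(F0)) / var(F0), which divides by the baseline variance as stated in the text (a conventional z-score would divide by the standard deviation). The function names are hypothetical, and the use of the sample variance (n − 1 denominator) is an assumption.

```python
from statistics import mean, variance


def zscore_trial(trace, n_baseline):
    """Normalize one photometry trial as described in the text.

    trace      : samples of one trial, baseline period first
    n_baseline : number of pre-trial samples (2 s at 1000 Hz would be 2000)
    ASSUMPTION: sample variance is used; the text only says "var(F0)".
    """
    f0 = trace[:n_baseline]
    mu, var = mean(f0), variance(f0)
    return [(f1 - mu) / var for f1 in trace]


def baseline_subtract(trace, n_baseline):
    """Subtract the mean of the pre-odor window from a breathing-frequency trace."""
    mu = mean(trace[:n_baseline])
    return [x - mu for x in trace]
```

Both functions treat each trial independently, so slow drifts across a session only enter through the per-trial baseline window.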
To investigate the influence of novelty (nov), rareness (rar) and familiarity (fam) on the breathing frequency, we ran the following regression on every animal's data:
\[ \mathrm{AUC}_{\mathrm{breathing}} = b_0 + b_1 D(\mathrm{nov} = 1) + b_2 D(\mathrm{rar} = 1) + b_3 D(\mathrm{fam} = 1) + \varepsilon \]
AUCbreathing corresponded to the area under the curve (AUC) of the continuous breathing signal in the 500-4000 ms window after odor onset. Prior to using it in the regression, we scaled AUCbreathing using the mean between the maximum and minimum value occurring in an animal, as well as the absolute difference between the maximum and minimum value occurring in an animal. Subtracting this mean from AUCbreathing and dividing by the range then transformed the AUC variable to the [-1,1] scale for every animal. With the responses during blank trials (see novelty exposure paradigm) serving as a reference, explanatory variables were trial indicator variables for novelty, rareness and familiarity. We then obtained the t-values corresponding to our b1, b2 and b3 parameters for every animal. To determine the latency of the breathing response to novelty, we built a separate model in which we regressed every ms of the breathing frequency signal on the novelty trial indicator variable as the only explanatory factor. Latency of the breathing response was defined as the exact ms at which the t-statistic of the b1 parameter reached a value of 2. As we compared the mean latency in breathing versus the mean latency in dopamine to novelty (see below), the reported latency in the text only includes ms values from animals in which we observed significant dopamine responses to novelty.
To determine the rank correlation between dopamine responses to novel stimuli and water, we determined the peak dopamine response in the 5 s time window after novel stimulus and water onset, respectively, for each novelty and water trial in each animal (n = 15). Then we averaged the peak dopamine response of all water trials and the first three presentations of all novel trials to obtain the mean response to water and novel stimuli in each animal. Statistical significance of the correlation was computed using the exact permutation distributions (two-tailed tests).
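The latency criterion described above (the first ms at which the t-statistic of b1 reaches 2) can be sketched in Python. With a single indicator variable, the t-statistic of b1 equals a pooled-variance two-sample t-statistic, which is what the code computes. The function names and the toy traces are hypothetical, and treating all non-novel trials as the reference group is an assumption.

```python
from math import copysign, inf, sqrt
from statistics import mean


def indicator_t(y_nov, y_ref):
    """t-statistic of b1 in y = b0 + b1 * D(nov = 1) + e at one time point."""
    n1, n0 = len(y_nov), len(y_ref)
    m1, m0 = mean(y_nov), mean(y_ref)
    rss = sum((v - m1) ** 2 for v in y_nov) + sum((v - m0) ** 2 for v in y_ref)
    se = sqrt(rss / (n1 + n0 - 2) * (1.0 / n1 + 1.0 / n0))
    if se == 0.0:  # degenerate case: no residual variance at this sample
        return 0.0 if m1 == m0 else copysign(inf, m1 - m0)
    return (m1 - m0) / se


def response_latency(nov_trials, ref_trials, threshold=2.0):
    """Index of the first sample where the t-statistic reaches the threshold."""
    for t in range(len(nov_trials[0])):
        t_stat = indicator_t([tr[t] for tr in nov_trials],
                             [tr[t] for tr in ref_trials])
        if t_stat >= threshold:
            return t
    return None  # threshold never reached


# Toy traces (one value per sample); the novel trials diverge at index 3.
nov = [[0.0, 0.1, -0.1, 5.0, 5.2], [0.1, -0.1, 0.0, 5.1, 4.9], [-0.1, 0.0, 0.1, 4.9, 5.0]]
ref = [[0.0, 0.1, -0.1, 0.0, 0.1], [0.1, -0.1, 0.0, 0.1, -0.1], [-0.1, 0.0, 0.1, -0.1, 0.0]]
latency = response_latency(nov, ref)
```

At a 1000 Hz sampling rate, the returned sample index corresponds directly to a latency in ms relative to the start of the trace.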
The rank correlation between the mean investigation time of odorant chemicals and the dopamine responses to those odorants when they were presented as novel in the novelty exposure task was determined in all animals which showed significant responses to novel stimuli (n = 8; Figure S1A). We determined the peak dopamine response in the 5 s time window after stimulus onset separately for each chemical that was presented as novel (n = 14). Then we averaged the peak dopamine response of the first three novel presentations of each odorant to obtain the mean response of each animal to that odorant when presented as novel. Responses to the same chemical from different animals were averaged. Statistical significance of the correlation was computed using the exact permutation distributions (two-tailed tests).
To relate dopamine activity to novelty (nov), rareness (rar), familiarity (fam), reward (rew), and air puffs (puf), we ran the following regression models on every animal's data:
\[ \mathrm{AUC}_{\mathrm{dopamine}} = b_0 + b_1 D(\mathrm{nov} = 1) + b_2 D(\mathrm{rar} = 1) + b_3 D(\mathrm{fam} = 1) + \varepsilon \]
and
\[ \mathrm{AUC}_{\mathrm{dopamine}} = b_0 + b_1 D(\mathrm{rew} = 1) + b_2 D(\mathrm{puf} = 1) + \varepsilon \]
The first model was applied to the data from the novelty exposure paradigm, while the second model was applied to the data from the free reward/air puff paradigm. For the novelty exposure paradigm, the dopamine-dependent variable was the AUC in a 500 ms window spanning the peak ms for novelty and peak ms for rareness (for that animal). The AUC value was scaled to the range of [-1,1] as discussed above. For the free reward/air puff paradigm, we defined a similar dependent variable based on the peak ms for reward and air puffs. With responses during blank trials (see novelty exposure paradigm) serving as a reference, explanatory variables were trial indicator variables for novelty, rareness, familiarity, reward and air puffs. As before, we obtained the t-values corresponding to the b parameters, one for each animal.
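The indicator-variable regressions above can be sketched in Python without a statistics package. When the only regressors are an intercept and mutually exclusive trial-type indicators, each coefficient equals the difference between that trial type's mean and the reference (blank) mean, and the residual variance is the pooled within-group variance; the sketch relies on this identity. The function name `dummy_regression_t` and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean


def dummy_regression_t(groups, reference="blank"):
    """Coefficients and t-values for y = b0 + sum_k b_k * D(k = 1) + e.

    groups : dict mapping trial type -> list of AUC values
    Returns a dict mapping each non-reference trial type to (b_k, t_k).
    """
    means = {name: mean(vals) for name, vals in groups.items()}
    n_total = sum(len(vals) for vals in groups.values())
    rss = sum((y - means[name]) ** 2 for name, vals in groups.items() for y in vals)
    s2 = rss / (n_total - len(groups))  # residual variance, df = N - number of parameters
    n_ref = len(groups[reference])
    result = {}
    for name, vals in groups.items():
        if name == reference:
            continue
        b = means[name] - means[reference]
        se = sqrt(s2 * (1.0 / len(vals) + 1.0 / n_ref))
        result[name] = (b, b / se)
    return result


# Hypothetical scaled AUC values for one animal.
example = {
    "blank": [0.0, 0.1, -0.1, 0.0],
    "nov": [0.8, 0.9, 1.0, 0.9],
    "rar": [0.2, 0.1, 0.3, 0.2],
    "fam": [0.0, 0.1, -0.1, 0.0],
}
t_values = dummy_regression_t(example)
```

Because t-values are unchanged by subtracting a constant from the AUC values or dividing them by a positive constant, the per-animal rescaling of the AUC does not alter them.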
We also constructed a simpler model in which we regressed the novelty trial indicator variable as the only explanatory factor on the dopamine AUC. Animals with a significant regression parameter of the novelty trial indicator variable in that model were considered to respond significantly to novelty. To determine the latency of the dopamine response to novelty, we regressed the novelty trial indicator variable as the only explanatory factor on every ms of the dopamine signal. The latency was then determined as the exact ms where the t-statistic of the b1 parameter reached a value of 2. To compare response amplitudes between animals, we normalized the photometry data using the peak reward response of each animal (Figure 1J). This was defined as the average z-score value in a 250 ms window surrounding the animal’s maximum z-score value in the reward condition. Similarly, peak novelty and peak air puff responses corresponded to the average z-score value in a 250 ms window surrounding the animal’s maximum z-score value in the novelty and air puff condition respectively. Peak novelty and peak air puff responses were then normalized by dividing them by the peak reward response of the same animal. We investigated the influence of stimulus novelty and intrinsic value on dopamine using a mixed model analysis, using only trials of the 10 chemicals for which we had collected dopamine signals for both novel and familiar conditions. Per animal and per odor, only the first three presentations were included in the analysis. Fixed effects in the model were novelty (trial dummy indicator; 0 if familiar, 1 if novel), intrinsic value (mean-subtracted odor inspection time), and the interaction between novelty and intrinsic value. We fitted the mixed model using restricted maximum likelihood estimation. 
Parametrizing the model with the familiar condition (novel = 0) as the reference case, the intrinsic value coefficient indicated whether intrinsic value modulates the dopamine signal when odors are familiar. To allow for a different relationship between intrinsic value and dopamine when odors were novel versus familiar, we introduced an interaction term between novelty and intrinsic value in the model. We also ran the model with the novel condition (novel = 1) as the reference case. In this parametrization, the coefficient of the intrinsic value term directly indicated the relationship between intrinsic value and dopamine when odors were novel. To account for between-animal variance, animal-specific random effects were added to the novelty and intrinsic value terms as well as to the intercept.
Analysis of Conditioning Experiments
Extracted licks were considered anticipatory if they fell in the 4 s window from odor onset to reward delivery. Such licks appear after perception of the reward-predicting CS+, yet before actual delivery of the reward US, thereby revealing that an animal has learnt the CS‒US association. Since the number of anticipatory licks varied considerably within sessions and across sessions, a binary lick trial variable was created for further analysis. Trials were classified as lick trials (coded as 1) or non-lick trials (coded as 0), with lick trials being defined as trials having more anticipatory licks than the average number of anticipatory licks in the non-rewarded conditions. Importantly, this binarized lick trial variable does not contain any information about factors such as expected value and satiety, yet still captures whether an animal has learnt a certain CS‒US association or not. Using the lick trial variables, we calculated cumulative lick trial variables on which we performed Bayesian change point analysis in SAS for the conditions of interest: novel rewarded, novel rewarded plus inhibition, familiar rewarded, and familiar rewarded plus stimulation. The Bayesian model and its associated priors are given below:
\[
\mathrm{cumlicktrials}(\mathrm{trialnum}) =
\begin{cases}
\mathrm{normal}\left(a + b_1(\mathrm{trialnum} - \mathrm{cp}),\ s^2\right) & \text{if } \mathrm{trialnum} < \mathrm{cp} \\
\mathrm{normal}\left(a + b_2(\mathrm{trialnum} - \mathrm{cp}),\ s^2\right) & \text{if } \mathrm{trialnum} \geq \mathrm{cp}
\end{cases}
\]
with
\[ \mathrm{cp} \sim \mathrm{uniform}(0, \text{total \# trials}), \qquad a, b_1, b_2 \sim \mathrm{normal}(0.5, 0.1), \qquad s^2 \sim \mathrm{uniform}(0, 0.1) \]
From this we obtained an MCMC sample of size 20000 with a burn-in period of size 1000. The model provided change points cp which correspond to the trial numbers from which point onward animals have learnt a certain CS‒US association, i.e., from which point onward animals consistently lick in anticipation of reward.
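The lick-trial binarization and the idea behind the change point model can be illustrated with a simplified Python sketch. Note that this is not the Bayesian MCMC analysis performed in SAS: as a stand-in, it fits the same piecewise-linear mean function by ordinary least squares over a grid of integer change points and ignores the priors. All function names and the example lick counts are hypothetical.

```python
def binarize_lick_trials(anticipatory_licks, nonrewarded_mean):
    """1 if a trial has more anticipatory licks than the non-rewarded average, else 0."""
    return [1 if n > nonrewarded_mean else 0 for n in anticipatory_licks]


def cumulative(xs):
    """Running sum of a sequence."""
    total, out = 0, []
    for x in xs:
        total += x
        out.append(total)
    return out


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x


def fit_change_point(cum):
    """Least-squares stand-in for the change point model (NOT the Bayesian fit).

    Mean function: a + b1 * (t - cp) for t < cp, a + b2 * (t - cp) for t >= cp,
    with trials numbered 1..n and cp restricted to the integers 2..n-1.
    Returns (cp, a, b1, b2) for the candidate with the smallest squared error.
    """
    n = len(cum)
    best = None
    for cp in range(2, n):
        X = [[1.0, float(min(t - cp, 0)), float(max(t - cp, 0))] for t in range(1, n + 1)]
        XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
        Xty = [sum(r[i] * y for r, y in zip(X, cum)) for i in range(3)]
        a, b1, b2 = solve3(XtX, Xty)
        sse = sum((y - (a + b1 * r[1] + b2 * r[2])) ** 2 for r, y in zip(X, cum))
        if best is None or sse < best[0]:
            best = (sse, cp, a, b1, b2)
    return best[1:]


# Hypothetical anticipatory lick counts for 12 trials of one condition.
licks = [0, 1, 0, 0, 1, 0, 4, 5, 3, 6, 4, 5]
lick_trials = binarize_lick_trials(licks, nonrewarded_mean=1.0)
cum_lick_trials = cumulative(lick_trials)
cp, a, b1, b2 = fit_change_point(cum_lick_trials)
```

In this sketch the fitted change point marks the kink in the cumulative curve, where the slope changes from near 0 (no consistent anticipatory licking) to near 1 (a lick trial on almost every trial). A Bayesian fit would additionally yield a posterior distribution over cp rather than a single value.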
If dopaminergic CS responses underlie faster associative learning, optogenetic stimulation of dopamine during CS (see CS stimulation experiments; Figures 2 and 3) would be expected to lead to lower change points. Further, in the CS inhibition experiment (Figure 4), an effect of optogenetic inhibition would result in later change points. Change points of the different conditions from all animals were plotted as cumulative incidence curves (being the complement of the Kaplan-Meier survival curves) using the survival package in R. To evaluate the effect of novelty (NOV), optogenetic stimulation (STIM) and optogenetic inhibition (INH), we then fitted multivariate frailty models, which are standard survival Cox regression models modeling the time-dependent hazard rate λ with additional random terms for each animal:
\[ \lambda_i(t) = \lambda_0(t) \cdot w_i \cdot \exp(b_1 \cdot \mathrm{NOV}) \]
\[ \lambda_i(t) = \lambda_0(t) \cdot w_i \cdot \exp(b_2 \cdot \mathrm{STIM}) \]
\[ \lambda_i(t) = \lambda_0(t) \cdot w_i \cdot \exp(b_3 \cdot \mathrm{INH}) \]
In those models, coefficients of interest were those of the effects of novelty (b1), optogenetic stimulation (b2) and optogenetic inhibition (b3) on the baseline hazard λ0(t) of undergoing a change point in the lick trial variable. The random effect variables wi were assumed to be gamma-distributed with a mean of zero.
We also modeled our data using a Rescorla-Wagner reinforcement learning model (Miller et al., 1995), separately for the three conditions of interest:
\[ E(Y) = E(Y-1) + \alpha_i \left[ R(Y) - E(Y-1) \right] \]
R(Y) was coded as 1 if a trial outcome was reward (or, as 1, from the first rewarded trial onward in the inhibition experiment) and E(Y-1) was coded as 1 if the lick trial variable was 1 in a given trial, and 0 otherwise. E(Y) was initialized at 0. We then computed, through numerical optimization, which α minimized the mismatch between the simulated E(Y-1) and the behavioral (lick trial) E(Y-1). This yielded three different learning rates αi of interest in the CS stimulation experiment. Latent inhibition would translate into a relatively higher learning rate αN for the novelty condition. Additionally, assuming a learning acceleration effect due to optogenetic stimulation, we would predict αFS > αF, meaning a higher learning rate with optogenetic stimulation than without optogenetic stimulation. In the CS inhibition experiment, an effect of optogenetic inhibition would result in a relatively lower learning rate for the condition with optogenetic inhibition versus the condition without optogenetic inhibition, i.e., αNI < αN.
To further confirm the effect of our optogenetic manipulations, we performed additional between-group tests. First, we calculated the percentage change in change point, and in learning rates for the two conditions of interest: CS+famstim and CS+fam in the
stimulation experiments and CS+novinh and CS+nov in the inhibition experiment. Then, we evaluated whether those measures differed between ChR2VTA/SNc and ControlVTA/SNc (Figures S2I and S2J), between ChR2PFC and ControlPFC (Figures S3G and S3H), and between eNpHR3.0 and ControleNpHR3.0 (Figures S4E and S4F). We also tested whether inhibition of dopamine affects motor licking behavior. To do this, we compared anticipatory licking in the CS+novinh and CS+nov trials of day 1 of the inhibition experiment, when the laser manipulation was performed. We compared the mean number of licks and the mean latency to first lick in the anticipatory licking phase (from odor onset to reward delivery) in all trials after the change point in CS+novinh.

DATA AND CODE AVAILABILITY
Datasets and code supporting the current study are available from the Lead Contact on request.
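For readers who want a concrete picture of the Rescorla-Wagner fit described under Analysis of Conditioning Experiments, the following is a minimal Python sketch and not the authors' code. It assumes that the "mismatch" is a sum of squared differences between the simulated expectation and the binary lick trial variable, and that the numerical optimization can be approximated by a grid search over the learning rate in [0, 1]. Neither choice is specified in the text, and all function names and example sequences are hypothetical.

```python
def simulate_rw(rewards, alpha, e0=0.0):
    """Simulate E(Y) = E(Y-1) + alpha * [R(Y) - E(Y-1)], starting from e0."""
    expectations = []
    e = e0
    for r in rewards:
        e = e + alpha * (r - e)
        expectations.append(e)
    return expectations


def mismatch(alpha, rewards, lick_trials):
    """Assumed loss: sum of squared differences between the simulated
    expectation and the behavioral (binary lick trial) variable."""
    simulated = simulate_rw(rewards, alpha)
    return sum((s - l) ** 2 for s, l in zip(simulated, lick_trials))


def fit_learning_rate(rewards, lick_trials, grid_size=1001):
    """Grid search over alpha in [0, 1]; a stand-in for the unspecified
    numerical optimization used in the study."""
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return min(candidates, key=lambda a: mismatch(a, rewards, lick_trials))


# Hypothetical example: every trial is rewarded; one animal starts licking
# on trial 3, the other on trial 7.
rewards = [1] * 10
early_learner = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
late_learner = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

alpha_early = fit_learning_rate(rewards, early_learner)
alpha_late = fit_learning_rate(rewards, late_learner)
```

Under these assumptions, the animal that starts licking earlier is assigned the higher learning rate, which is the direction of the comparisons between conditions described in the analysis.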