

Can the VTA Come Out to Play? Only When the mPFC’s Predictions Go Astray!

Alexandra Stolyarova1 and Andrew M. Wikenheiser1,2,*
1Department of Psychology, University of California, Los Angeles, Los Angeles, CA 90095, USA
2The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
*Correspondence: [email protected]
https://doi.org/10.1016/j.neuron.2020.01.036

Confidence in perceptual decisions scales neural responses to violations in reward expectation. In this issue of Neuron, Lak et al. (2020) show that the medial prefrontal cortex in mice computes a confidence-dependent expectation signal that influences how dopamine neurons convey reward prediction errors to guide learning.

For many years, perceptual and reward-based decision making have been studied in isolation. Elaborate tasks have been designed, elegant models formulated, and multiple brain regions investigated to understand how decisions are guided by sensory or reward information independently (Summerfield and Tsetsos, 2012). For many decisions, however, both perceptual and reward information are relevant. For instance, suppose you notice a new sign on a previously unmarked building on your way to work. Based on your fleeting glance, you think you may have espied a certain mythical sea creature that signifies your favorite coffee shop—but you can’t be sure. Will you turn around in search of a caffeinated beverage? Principled integration of previous reward learning (“How good is the coffee at that place?”) and sensory information (“What did I actually see on that sign?”) is necessary for good decision making in this situation.

The reward-learning side of this problem has, in recent years, been closely tied to the midbrain dopamine system (Dayan and Daw, 2008). Dopamine neurons signal reward prediction error (RPE), the difference between how much reward subjects expect and how much reward they actually receive. Computational models of learning leverage such RPE signals to learn which actions lead to good consequences.
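In standard temporal-difference form (textbook notation, not the paper’s own), the RPE on trial t and the value update it drives can be written as

\[
\delta_t = r_t - V_t, \qquad V_{t+1} = V_t + \alpha\,\delta_t,
\]

where r_t is the reward received, V_t is the reward expected, and α is a learning rate: better-than-expected outcomes (δ_t > 0) strengthen the action that produced them, and worse-than-expected outcomes weaken it.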

But sometimes the correct action depends on sensory information (trying to enter a coffee shop that is closed is unlikely to produce satisfying consequences), and sensory information may be noisy, ambiguous, or otherwise imperfect. In addition, unexpected changes in the environment may cause previously profitable actions to become more or less rewarding. How should learning proceed when the correct action to take depends on uncertain sensory information and fluctuating action-outcome contingencies?

In this issue of Neuron, Lak et al. (2020) address this question by testing how mice process uncertainty in both the sensory and reward domains. In their behavioral task, a visual stimulus was presented on a computer monitor to the left or right of a head-fixed mouse.


The mouse reported the location of the stimulus by turning a response wheel in the direction opposite of where the stimulus appeared, and correct responses resulted in reward delivery. Sensory uncertainty was manipulated by varying the contrast of the stimulus, allowing the researchers to program sequences of trials ranging from easy, high-contrast/high-confidence trials to difficult, low-contrast/low-confidence trials. To manipulate outcome uncertainty, mice were given a larger reward for correct decisions about stimuli presented on one side than for correct decisions about stimuli presented on the other side, and the side associated with the larger reward could change unpredictably across blocks of trials.

To earn as much reward as possible on this task, a good strategy would be to report the location of the stimulus accurately when the contrast is high and the correct answer is clear. But when the stimulus contrast—and, consequently, the likelihood of making the right decision—is low, it is best to perform the action currently associated with the larger reward. Mice generally followed this strategy. Manipulating sensory difficulty, however, also produced an unanticipated effect on behavior: perceptual uncertainty on one trial affected choices on the subsequent trial. When mice fortuitously chose correctly on a difficult, low-contrast trial, they were more likely to repeat the same action on the next trial, regardless of the stimulus location or its contrast. This suggests that a complex combination of current sensory information, previous reward learning, and previous decision confidence determined decisions on each trial.

Building on previous modeling work (Lak et al., 2017), the authors set out to uncover the precise mental algorithm that underlies behavior on their task and the roles of two brain regions—the medial prefrontal cortex (mPFC) and ventral tegmental area (VTA)—in the necessary computations. They devised a computational model that captured multiple steps of the decision-making process. First, the model interpreted the sensory evidence to compute the probability (confidence) that the visual stimulus appeared on each side. Second, these probabilities were multiplied by the stored values of each action, learned from past trials, to compute the expected values of the two choices. Third, the action with the highest expected value was selected. Finally, the discrepancy between received and expected reward (the RPE) drove learning of action values.
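The paper specifies this model quantitatively; as a rough illustration only, a minimal sketch of the four steps might look like the following, where the parameter values, the logistic confidence read-out, and all names (run_trial, q, alpha, sigma) are illustrative assumptions rather than details taken from Lak et al. (2020):

import numpy as np

rng = np.random.default_rng(1)

alpha = 0.2   # learning rate (illustrative value)
sigma = 1.0   # sensory noise level (illustrative value)
q = {"left": 0.5, "right": 0.5}   # stored action values, shaped by past RPEs

def run_trial(contrast, reward_sizes):
    """One trial of the sketch. contrast is signed (negative = left,
    positive = right); reward_sizes maps each side to the reward earned
    for a correct choice on that side."""
    side = "right" if contrast > 0 else "left"

    # Step 1: perception. A noisy observation of the signed contrast is
    # squashed into a decision confidence; the logistic here stands in
    # for the Bayesian posterior over stimulus side.
    x = contrast + sigma * rng.standard_normal()
    p_right = 1.0 / (1.0 + np.exp(-2.0 * x))
    confidence = {"left": 1.0 - p_right, "right": p_right}

    # Step 2: valuation. Confidence scales the stored action values, so
    # low-contrast trials yield low expected values for both options.
    expected = {a: confidence[a] * q[a] for a in q}

    # Step 3: choice. Select the action with the higher expected value.
    choice = max(expected, key=expected.get)

    # Step 4: learning. The RPE (received minus expected reward) updates
    # the value of the chosen action.
    reward = reward_sizes[side] if choice == side else 0.0
    rpe = reward - expected[choice]
    q[choice] += alpha * rpe
    return choice, rpe

Even this stripped-down version reproduces the behavioral signature described above: on a rewarded low-contrast trial, the low expected value inflates the RPE, and the resulting outsized update to q biases the next choice toward the same action.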

The novelty of this model is that action values come to reflect not only past choices and rewards but also the sensory evidence on the current trial and subjects’ confidence in their choices on previous trials. This latter effect falls out of the model because low perceptual confidence reduces the expected value of either action. Thus, when subjects encounter a difficult, low-contrast trial, the model computes a low expected value, and this dismal forecast makes the ensuing RPE all the larger if animals are rewarded, driving an outsized increase in the chosen action’s value that inappropriately biases choice on the next trial. The authors speculate that this model-free strategy, while inefficient in the present setting, where stimulus location was randomized across trials, might be advantageous in animals’ natural habitats, where stimuli tend to be correlated in time (Yu and Cohen, 2008).

The authors next set out to determine how their model might be implemented in the brain. Lak et al. (2020) recorded neural activity in the mPFC and used fiber photometry to measure responses in the VTA, the latter of which allowed them to record calcium fluctuations exclusively from dopamine-releasing neurons. The authors found that activity in both the mPFC and the VTA reflected the model-estimated expected value of the chosen option in the time between action initiation and reward delivery. Dopamine neurons also signaled RPEs after reward delivery or omission.

Lak and colleagues then asked whether these responses played a causal role in learning. Optogenetic inactivation of the prefrontal cortex during the reward anticipation period caused mice to become more sensitive to trial outcomes: they were more likely to repeat the same action on the subsequent trial if it produced a reward and were faster to learn about switches in the side associated with the larger reward. These findings suggest that the mPFC provides an expectation signal that is crucial for RPE computation. With the mPFC inactivated, subjects behaved as though rewards were more surprising, and RPEs were inflated, resulting in enhanced learning. However, suppressing the mPFC after outcome delivery had no effect on behavior. When the authors optogenetically excited or inhibited dopamine neurons in the VTA, a different pattern emerged: increasing the activity of dopamine neurons during outcome anticipation had little effect on learning, but manipulating dopamine neuron activity after reward delivery was sufficient to induce learning. Lak et al. (2020) then showed that the effects of the mPFC and dopamine manipulations can be reproduced in their model by changing, respectively, the expected values of choices and the RPEs.

These discoveries will undoubtedly inspire further investigations into the neural circuitry of reward-guided learning under perceptual uncertainty. Several immediate questions come to mind. First, subjective confidence reports do not always conform to the predictions of signal detection theory or Bayesian models (Peters et al., 2017). It would be interesting to combine the task reported here with established behavioral measures of subjective confidence (e.g., time wagering; Lak et al., 2014) to test whether RPEs are more sensitive to model-derived optimal confidence or to subjective confidence when the two do not align. Relatedly, human participants’ subjective confidence in their perceptions can be correlated across trials, independent of objective stimulus properties (Rahnev et al., 2015). This “confidence leak” persists even across sensory detection tasks involving different visual features. Identifying whether a similar pattern occurs in rodents, and testing whether the learning framework proposed by Lak et al. (2020) could accommodate a metacognitive continuity field of this nature, would be informative experimental and modeling endeavors.

At the neural circuit level, other subregions of the prefrontal cortex have been implicated in estimating and reporting perceptual confidence, including the orbitofrontal and anterior cingulate cortices in rats (Lak et al., 2014; Stolyarova et al., 2019). What are the specialties of these brain regions?

Do they carry out unique computations and participate at different stages of the decision-making and learning processes, or are their functions largely overlapping? A better understanding of the similarities and differences among these closely related frontal regions will help parse the unique contribution of each region to decisions and behavior.

REFERENCES

Dayan, P., and Daw, N.D. (2008). Decision theory, reinforcement learning, and the brain. Cogn. Affect. Behav. Neurosci. 8, 429–453.

Lak, A., Costa, G.M., Romberg, E., Koulakov, A.A., Mainen, Z.F., and Kepecs, A. (2014). Orbitofrontal cortex is required for optimal waiting based on decision confidence. Neuron 84, 190–201.

Lak, A., Nomoto, K., Keramati, M., Sakagami, M., and Kepecs, A. (2017). Midbrain Dopamine Neurons Signal Belief in Choice Accuracy during a Perceptual Decision. Curr. Biol. 27, 821–832.

Lak, A., Okun, M., Moss, M.M., Gurnani, H., Farrell, K., Wells, M.J., Reddy, C.B., Kepecs, A., Harris, K.D., and Carandini, M. (2020). Dopaminergic and Prefrontal Basis of Learning from Sensory Confidence and Reward Value. Neuron 105, this issue, 700–711.

Peters, M.A.K., Thesen, T., Ko, Y.D., Maniscalco, B., Carlson, C., Davidson, M., Doyle, W., Kuzniecky, R., Devinsky, O., Halgren, E., and Lau, H. (2017). Perceptual confidence neglects decision-incongruent evidence in the brain. Nat. Hum. Behav. 1, 0139. https://doi.org/10.1038/s41562-017-0139.

Rahnev, D., Koizumi, A., McCurdy, L.Y., D’Esposito, M., and Lau, H. (2015). Confidence leak in perceptual decision making. Psychol. Sci. 26, 1664–1680.

Stolyarova, A., Rakhshan, M., Hart, E.E., O’Dell, T.J., Peters, M.A.K., Lau, H., Soltani, A., and Izquierdo, A. (2019). Contributions of anterior cingulate cortex and basolateral amygdala to decision confidence and learning under uncertainty. Nat. Commun. 10, 4704.

Summerfield, C., and Tsetsos, K. (2012). Building Bridges between Perceptual and Economic Decision-Making: Neural and Computational Mechanisms. Front. Neurosci. 6, 70.

Yu, A.J., and Cohen, J.D. (2008). Sequential effects: Superstition or rational behavior? Adv. Neural Inf. Process. Syst. 21, 1873–1880.
