Article
Continuous and Discrete Neuron Types of the Adult Murine Striatum Highlights d
d
scRNA-seq and RNA in situ analysis of adult striatal projection neurons (SPNs) Discrete, step-like expression heterogeneity does not encode spatial information
Authors Geoffrey Stanley, Ozgun Gokce, € dhof, Robert C. Malenka, Thomas C. Su Stephen R. Quake
Correspondence
d
Continuous heterogeneity is almost always spatial and gradient-like
[email protected] (S.R.Q.),
[email protected] (T.C.S.)
d
Conserved continuous heterogeneity predicts spatial distribution of novel neuron types
In Brief
Stanley et al., 2020, Neuron 105, 1–12 February 19, 2020 ª 2019 Published by Elsevier Inc. https://doi.org/10.1016/j.neuron.2019.11.004
Stanley et al. show that discrete, step-like gene expression heterogeneity defines four major striatal neuron types but does not provide spatial information. Within each discrete type, they find multiple axes of continuous, gradient-like heterogeneity that directly encode anatomical location.
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Neuron
Article Continuous and Discrete Neuron Types of the Adult Murine Striatum € dhof,2,4,* and Stephen R. Quake5,6,8,10,* Geoffrey Stanley,1,9 Ozgun Gokce,2,7,9 Robert C. Malenka,3 Thomas C. Su 1Program
in Biophysics, Stanford University School of Medicine, Stanford, CA 94305, USA of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA 94305, USA 3Nancy Pritzker Laboratory, Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, USA 4Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA 5Department of Bioengineering, Stanford University, Stanford, CA 94305, USA 6Department of Applied Physics, Stanford University, Stanford, CA 94305, USA 7Institute for Stroke and Dementia Research, Klinikum der Universita €t Mu €t, Munich, Germany €nchen, Ludwig-Maximilians-Universita 8Chan Zuckerberg Biohub, San Francisco, CA 94158, USA 9These authors contributed equally 10Lead Contact *Correspondence:
[email protected] (S.R.Q.),
[email protected] (T.C.S.) https://doi.org/10.1016/j.neuron.2019.11.004 2Department
SUMMARY
The mammalian striatum is involved in many complex behaviors and yet is composed largely of a single neuron class: the spiny projection neuron (SPN). It is unclear to what extent the functional specialization of the striatum is due to the molecular specialization of SPN subtypes. We sought to define the molecular and anatomical diversity of adult SPNs using single-cell RNA sequencing (scRNA-seq) and quantitative RNA in situ hybridization (ISH). We computationally distinguished discrete versus continuous heterogeneity in scRNA-seq data and found that SPNs in the striatum can be classified into four major discrete types with no implied spatial relationship between them. Within these discrete types, we find continuous heterogeneity encoding spatial gradients of gene expression and defining anatomical location in a combinatorial mechanism. Our results suggest that neuronal circuitry has a substructure at far higher resolution than is typically interrogated, which is defined by the precise identity and location of a neuron.
INTRODUCTION Neuronal computation is typically thought of as the interactions of defined circuits between large groups of cells called neuron types, which are interrogated by using marker genes that are expressed only in a given type in a brain region. Single-cell transcriptomics has advanced these efforts by enabling the identification of markers for hundreds of candidate neuron types across many brain regions (Darmanis et al., 2015; Harris et al., 2018; Kalish et al., 2018; Lake et al., 2016; Mun˜oz-Manchado et al., 2018; Pandey et al., 2018; Saunders et al., 2018; Tepe et al.,
2018; Zeisel et al., 2015, 2018). Broad neuron classes generally form clear, non-overlapping clusters, often distinguishable by a single marker gene (Gokce et al., 2016; Harris et al., 2018; Tasic et al., 2016). Cellular heterogeneity has also often been observed to form gradients, some of which have been observed to be spatial in nature (Cembrowski et al., 2016; Durruthy-Durruthy et al., 2014; Gokce et al., 2016; Halpern et al., 2017; Harris et al., 2018; Tasic et al., 2016). It is not clear how best to analyze single-cell data to reveal these differences or whether this continuous heterogeneity should even be used to classify neurons (Tasic, 2018; Zeng and Sanes, 2017). We performed single-cell RNA-seq and in situ hybridization on striatal neurons to understand the organizational principles of neuronal identity in this region. To analyze these data, we developed a novel algorithm that distinguishes between continuous and discrete variation. We used this algorithm to show that discrete subtypes are either spatially intermingled or restricted to spatial compartments that are discretely separated from the rest of the cells. Ultimately, discrete identities do not imply a spatial relationship. However, each discrete type varies along multiple orthogonal axes of expression gradients that map to spatial gene expression gradients. Additionally, we find that the genes underlying these spatial gradients are highly conserved between discrete subtypes. Our results show that in order to computationally generate atlases of the brain, one must first cluster cells into discrete types, then determine when continuous heterogeneity exists, then characterize each cell type by its score along each continuous axis, and then map each continuous axis to its corresponding spatial gradient. We report a previously undescribed region of the striatum, which is composed of D2 spiny projection neurons (SPNs) that co-express Tac1, Penk, Htr7, and Th and that we call the ventromedial patch because it is enriched in patches of the ventromedial shell and because it is continuous in gene expression space with D2-patch SPNs. Finally, we show that a subtype, which we previously referred to as D1-Pcdh8, co-expresses Drd1a and the rare ‘‘short’’ isoform of the D2 receptor (Gokce et al., 2016). Because of this hybrid-like expression, we name it the D1H SPN.
Neuron 105, 1–12, February 19, 2020 ª 2019 Published by Elsevier Inc. 1
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
More broadly, our results have implications for neuronal circuits and computation. We find that previously mapped connectivity correlates well to expression gradients, suggesting that striatal neuronal circuits can be subdivided into very small units depending on gene expression. This contrasts with existing models of neuronal circuits, which are traditionally defined by large groups of cells with shared regulatory regions from single marker genes. We further show that regions such as the dorsomedial striatum are composed of neurons that are ‘‘intermediate’’ in their expression even though they have a distinct behavioral role, implying that neuron classes that are intermediate in gene expression cannot be assumed to be functionally intermediate. Our results suggest that developing combinatorial, spatially precise approaches to targeting neurons may substantially improve our understanding of neuronal computation. RESULTS We and others previously found that most cells are clearly classified as D1 or D2 SPNs (Gokce et al., 2016; Saunders et al., 2018; Zeisel et al., 2018). However, we also reported several novel subtypes that co-expressed Tac1 and Penk (D1-Pcdh8, D2-Htr7) as well as continuous gradients of expression within the major D1 and D2 subtypes. However, we were limited in our ability to clearly distinguish these distributions and determine the exact number of intermediates by our limited number of cells. We therefore prepared 1,343 single dissociated D1-tdTom+ or D2-GFP+ SPNs from five adult mouse brains using SmartSeq2 full-length amplification and analyzed 1,211 cells with at least 105 mapped reads and 1,000 detected genes (Figures S1A and S1B). Clustering of Major SPN Types We first compared standard computational pipelines for analyzing single-cell RNA sequencing (scRNA-seq) data (principal-component analysis [PCA], Louvain clustering, and t-distributed stochastic neighbor embedding [tSNE] projection) using classic PCA versus truncated PCA, a method we previously described that reduces the amount of technical variability in single-cell dimensionality reduction (Satija et al., 2015; Su et al., 2018). tSNE projection resulted in clear separation in tSNE space between D1 and D2 SPNs, consistent with our previous findings that D1 and D2 SPNs are discrete types with no overlap (Figures 1A and 1B; Gokce et al., 2016). We also identified two other major clusters: a cluster expressing Drd1a and Rreb1 that we identified as SPNs located in the islands of Calleja (ICj) and a subtype expressing Nxph4 and Pcdh8 that we had previously reported as D1-Pcdh8 (Figures 1A and 1B; Gokce et al., 2016). We then re-applied dimensionality reduction and clustering to each of the four major SPN types. Within D1 SPNs, we found nine subgroups of cells, and within D2 SPNs, we found seven subpopulations, nearly all of which corresponded to anatomical regions of the striatum (Figures 1C–1H; Figure S2). Subregions were often best classified by combinations of marker genes (e.g., we classified cells in the ventral patch by co-expression of the Sema5b patch marker and the Crym ventral striatum marker) (Figures 1C and 1D). The medial aspect of the shell region (mShell) was clearly labeled by Stard5
2 Neuron 105, 1–12, February 19, 2020
and Stra6, which were also limited to a single cluster of cells in D1, D2, and D1H SPNs. We observed one cluster of D1 SPNs with genes specific to the flat part of the olfactory tubercle (OT) and an adjacent cluster with genes specific to the OT ruffles (Figure 1F). We found a small number of cells expressing Serpinb2 and Npsr1 in D1 SPNs, which the in situ database showed to be highly specific to a novel subregion of the medial shell (Figure 1G; Figure S2E). We thereby created a map between the subclusters of the major types (Figure 1I) and the anatomical regions of the striatum (Figure 1H). Computational Analysis of Discreteness versus Continuity We previously observed that the major D1 and D2 types of the striatum appeared discretely separated from each other with few or no intermediates, whereas other heterogeneity appeared more continuous and graded (Gokce et al., 2016). We first visualized continuity using partition-based graph abstraction (PAGA), but this produced nonzero connectivity between D1 and D2 SPNs, which we previously found to be clearly discrete (Figure 2A; Gokce et al., 2016). The PAGA connectivity metric (PAGA c) does not provide a definitive cutoff for a discrete subtype, because the null hypothesis underlying PAGA is a random graph, and both discrete and continuous clusters are significantly different from a random graph (Wolf et al., 2019). We reasoned that continuous subtypes should have several characteristics: they should be connected by cells with intermediate gene expression values, there should be an uninterrupted density of cells along a pseudotime connecting the subtypes, and there should be an uninterrupted gradient of cells’ relative cluster connectivity (RCC) to either subtype (Figures 2C and 2D). We assessed these characteristics for subtype pairs between the major types (Figures 2A, 2E, and 2G) and subtype pairs within the major types of SPNs (Figures 2B, 2F, and 2H). To visualize gene expression between cluster pairs, we scored cells by the top ten differentially expressed genes and plotted this average along cells ranked by their diffusion pseudotime (DPT) values between clusters (Figures 2E and 2F). We also inspected the RCC and DPT distributions (Figures 2G and 2H). We found that key connections within the major types had discrete jumps in gene expression (Figures 2E and 2G). Connections within the major SPN types were connected by an uninterrupted continuum of cells (Figures 2F and 2H). We created a metric to directly measure the existence of a discrete gap between clusters, which we call gap c (STAR Methods). The number of cells in the ‘‘gap’’ region, ngap, is calculated as the number of cells in the DPT or RCC window with the lowest density. This is compared with the number of cells in the window near the gap with the most cells, n1 (Figures 2I and 2J). Gap c is the ratio ngap/n1, using the smaller of the c values given by either DPT or RCC. The gap c metric provided a quantitatively more definitive cutoff for discreteness than PAGA c: the ratio of c scores between the highest scoring discrete subtype and lowest scoring continuous subtype was 5.7 for gap c and only 2.7 for PAGA c (Figure 2K). This is potentially because it directly measures the discreteness along a computed transition path rather than an overall network interconnectivity. Although gap c provides a more definitive threshold, PAGA c and gap c are
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Drd1a
A
Nxph4
Tac1
Tac2 5
4 3
4
tSNE2
4
tSNE2
tSNE2
tSNE2
6 4
3
2 2
0
0
tSNE1
Drd2
tSNE1
Rreb1 6
8 6
6 tSNE2
tSNE2
tSNE2
0
tSNE1
8
10 6
4
1
0
Crtac1
Penk
D1 D1H D2 ICj
2
1
4
tSNE2
tSNE1
2
B
6
5
8 6
4
4 2
tSNE1
C
Cnr1
0
0
tSNE1
2
2
2
tSNE1
0
0
tSNE1
D
Crym
Cnr1
D1
Crym
Id4
Kremen1
Kremen1
Id4
D1
D2
D2
D1H
F E
Stard5
Nrp2
Nrp2
NAcc
Stra6
D1
Stard5
OT-flat
D1 OT-ruffle 1 mm
D2
Tnnt1
Tnnt1
ICj
D1
OT-flat
D1H OT ruffle 1 mm
G
I
H
D1
G
caudal
D1
rostral
D2
Shared Mat.dLat Mat.vMed mShell NAcc Pat.dLat Pat.vMed Medial
Lateral
Dorsal
Ventral
ICj
0.5 mm
D1 only OT-flat OT-ruffle
D2 only D2-Htr7 not shown on map
mShell-Serpinb2
Figure 1. Mapping SPN Heterogeneity and Conserved Spatial Patterns (A) Expression of markers of major SPN types overlaid on tSNE projections of SPNs n = 1,203 cells. (B) tSNE projection of SPNs colored by major type. n = 1,203 cells. (C) Genes defining dorsolateral and ventromedial compartments of the striatum represented in scRNA-seq data (left) and in situ stains from the Allen Mouse Brain Atlas. tSNE projections are performed separately on the SPN major types: D1 (n = 458 cells), D2 (n = 581 cells), and D1H (n = 91 cells). Scale bar 500 mm. (legend continued on next page)
Neuron 105, 1–12, February 19, 2020 3
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
correlated and broadly agree, indicating that the discreteness of neuron types in our data is a robust property (Figures 2K–2N). Furthermore, the number of discrete subtypes is robust to widely varying the clustering resolution (Figures 2M and 2N). Overall, our analysis supports four discrete subtypes of striatal SPNs: D1, D2, D1H, and ICj SPNs (Figure 2N). In Situ Analysis of the Discrete Cell Types of the Striatum We then sought to understand the spatial relationships of discrete neuron types and to better define the spatial distributions of newly discovered SPN types. The D1 and D2 cells were discretely separated in the scRNA-seq data (Figure 3A). We found that Tac1/Penk expression in situ was very strongly bimodal, with few intermediates (Figures 3B and 3C). We next found that expression of the discrete ICj type’s marker genes changed in a discrete step along a spatial axis from the OT to the ICj (Figures 3D–3G). This matches with previous cytoarchitectural observations that the ICj appeared discretely separated from the rest of the striatum, supporting the validity of our scRNA-seq and in situ analyses (Millhouse, 1987). Finally, we found that the discrete D1H-Nxph4 cells were intermingled with classical D1 and D2 cells in the dorsal striatum and clustered in ventrolateral structures, primarily the lateral stripe of the striatum (LSS) (Figures 3I–3K). The ventrolateral clusters appear only in more rostral slices that contain the nucleus accumbens (NAcc) (Figure 3I). There seems to be far fewer intermingled D1H-Nxph4 cells in the NAcc (Figure 3I). We found a discrete step change in D1H marker gene expression from the ventral striatum to the D1H cluster (Figures 3K and 3L). We observed the Tac2+ D1H subtype only in discrete ventromedial clusters and scattered throughout the medial shell region (Figures 3N–3P). D1H-Tac2 cells are generally not observed in the caudate putamen (CPu) or in more caudal sections (Figure 3N). The D1H subtype had been previously reported in the dorsal striatum, but the discrete ventral clusters were not reported, likely because of the gene Otof is not a unique marker for D1H cells in the ventral striatum (Figure S3C; Saunders et al., 2018). We found discrete step in the D1H-Tac2 markers from the ventral striatum to the D1H-containing ventral stripe (VS) (Figures 3O and 3P). The VSs have higher cell density and likely correspond
to previously observed regions of high cell density in the NAcc (Herkenham et al., 1984). Both the D1H-Nxph4 and D1H-Tac2 spatial clusters appeared to be composed exclusively of the D1H type (Figures 3K and 3O). The Nxph4-Tac2 subdivision in the D1H type correlates strongly with the dorsolateral-ventromedial markers Cnr1 and Crym, and the Nxph4+ D1H cells are more dorsal and lateral than the Tac2+ D1H cells, suggesting that the spatial markers validated in D1 and D2 SPNs are also conserved in D1H SPNs (Figure 3R). We therefore use Stard5 expression in D1H cells to classify a subset of them as D1H cells from the medial shell region (Figures 3Q and 3R). In conclusion, we show that discrete gene expression shapes in high-dimensional scRNA-seq space are associated with spatially intermingled neurons or homogeneous spatial clusters of neurons, and we show that the expression of spatial genes in known neuron types can be used to predict the spatial distribution of unknown neuron types. Gene Expression Gradients Underlie Continuous Subtypes of the Striatum We previously found that the heterogeneity within discrete subtypes mapped to spatial subdivisions of the striatum (Figures 1C–1I) and that this spatial heterogeneity was gradient-like in scRNA-seq (Figures 2B, 2F, 2H, and 4A). This led us to believe that continuous neuronal subtypes represented spatial gradients of expression. To test this hypothesis, we performed quantitative RNA in situ hybridization using the genes defining the continuous axes of variation within the D1 and D2 discrete subtypes (Figure 4; Figure S3). In situ quantification of the Cnr1 and Crym continuous markers revealed a gradient consisting of a 250-fold change in relative expression along the full 2-mmlong dorsolateral-ventromedial axis (Figures 4B–4D). The dorsomedial part of the striatum has moderate Crym expression and low Cnr1 expression (Figures 4C and 4E). Crym is highest and Cnr1 is lowest in the medial NAcc. We also observed several genes, such as Gfra1, to be more restricted to the NAcc, but we did not find any gene that clearly defined a hard boundary between the CPu and NAcc. The gradient varied along the rostral-caudal axis as well: the caudal CPu showed a weaker gradient and lower maximal expression (Figure 4F). The dorsolateral gradient peaks at a specific location between the rostral and caudal CPu, where Cxcl14 is expressed in the
(D) Genes defining patch and matrix compartments of the striatum represented in scRNA-seq data (left) and in situ stains from the Allen Mouse Brain Atlas. tSNE projections are performed separately on the SPN major types: D1 (n = 458 cells), D2 (n = 581 cells), and D1H (n = 91 cells). Scale bar 700 mm. (E) Genes defining the medial shell compartment of the striatum represented in scRNA-seq data (left) and in situ stains from the Allen Mouse Brain Atlas. tSNE projections are performed separately on the SPN major types: D1 (n = 458 cells), D2 (n = 581 cells), and D1H (n = 91 cells). Scale bar 700 mm (left) and 100 mm (right). (F) Genes defining the olfactory tubercle (OT) flat and ruffle compartments of the striatum represented in scRNA-seq data (left, D1 cells only) and in situ stains from the Allen Mouse Brain Atlas. n = 458 cells. Scale bar 1 mm (left) and 100 mm (right). (G) Genes defining a small subset of D1 SPNs and marking a previously undescribed subregion of the medial shell of the striatum (Allen Mouse Brain Atlas). Scale bar 900 mm. (H) Striatum color coded by spatial region defined by gene expression in our scRNA-seq data and Allen Mouse Brain Atlas in situ stains. The anatomical distribution of the ICj discrete type is shown in gray. The colors correspond to the colors of the subclusters in (I), and the map represents the predicted anatomical distributions of these clusters. Shared subtypes are anatomical regions and gene expression patterns that are conserved between D1 and D2 discrete types. D1 (D2)-only in the legend indicates subtypes that were only observed within the scRNA-seq data of the D1 (D2) discrete types. The anatomical distribution of the D2-Htr7 subtype is not shown. Scale bar 500 mm. (I) scRNA-seq data colored by predicted anatomical region of D1 and D2 SPNs. Most subclusters are conserved, but several were observed within only one of the D1 or D2 discrete types. Shared subtypes are anatomical regions and gene expression patterns that are conserved between D1 and D2 discrete types. D1 (D2)-only in the legend indicates subtypes that were only observed within the scRNA-seq data of the D1 (D2) discrete types.
4 Neuron 105, 1–12, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Figure 2. Computational Distinction between Discrete and Continuous Heterogeneity in scRNA-Seq Data (A) PAGA graph with connectivity threshold = 0. Clusters are computed used Leiden clustering with resolution = 1. Clusters are colored by the major subtype as defined previously. n = 1,203 cells. (B) PAGA graph with connectivity threshold = 0.2. Clusters are computed used Leiden clustering with resolution = 1. n = 1,203 cells. (C) Cartoon illustrating the manifolds in highdimensional space occupied by two discrete subtypes. Arrow represents the expected axis traversed by a pseudotime calculation. (D) Cartoon illustrating the manifolds in highdimensional space occupied by two continuous subtypes. Arrow represents the expected axis traversed by a pseudotime calculation. (E) Plot of the diffusion pseudotime rank (x axis) and top 10 gene scores computed between the two clusters with the corresponding letter in (A). Dots are means and error bars are 95% confidence intervals. Gene scores are normalized as described in STAR Methods. n = (i) 234, (ii) 205, (iii) 236, (iv) 185, (v) 182, and (vi) 200 cells. (F) Plot of the diffusion pseudotime rank (x axis) against the top 10 gene scores computed between the two clusters with the corresponding letter in (B). Dots are means and error bars are 95% confidence intervals. n = (i) 202, (ii) 253, (iii) 209, (iv) 183, (v) 255, and (vi) 177 cells. (G) Plot of the diffusion pseudotime (x axis) against the relative cluster connectivity (RCC) computed between the two clusters with the corresponding letter in (A). Each dot represents a single cell. n = (i) 234, (ii) 205, (iii) 236, (iv) 185, (v) 182, and (vi) 200 cells. (H) Plot of the diffusion pseudotime (x axis) against the relative cluster connectivity (RCC) computed between the two clusters with the corresponding letter in (B). Each dot represents a single cell. n = (i) 202, (ii) 253, (iii) 209, (iv) 183, (v) 255, and (vi) 177 cells. (I) Example diffusion pseudotime histogram of a pair of clusters with a low gap c score. n = 236 cells. (J) Example diffusion pseudotime histogram of a pair of clusters with a high gap c. n = 253 cells. (K) Histogram of continuity scores for the PAGA metric (upper) and gap c statistic (lower) for n = 13 pairs of clusters, 12 of which are shown in (E)–(H). Discrete and continuous labels are assigned as described in text (using E–H). (L) PAGA graph of all SPNs colored by major class at three different clustering resolution. PAGA connectivity threshold for displaying a connection is 0.05 (left), 0 (middle), and 0 (right). n = 1,203 cells. (M) PAGA graph of all SPNs with connection strength calculated as the gap c statistic between each cluster, colored by major class at three different Leiden clustering resolutions using scanpy sc.tl.leiden. Gap c threshold for displaying a connection is 0.05 (left), 0 (middle), and 0 (right). n = 1,203 cells.
extreme dorsolateral aspect, and the ventromedial gradient peaks in the NAcc (Figures 4G and 4H). We describe singlecell in situ proof that these genes change along a dorsolateral-ventromedial gradient at a single-cell level and are not the result of heterogeneous distributions of discrete cell types. Similarly, we found that the medial shell gene program was a
gradient from the NAcc to the dorsal tip of the medial shell (Figure 4A; Figures S3D–S3F). The medial shell subtype is not apparent in previous droplet microfluidics scRNA-seq studies of the striatum. We compared these gradients with previously anatomical divisions of the striatum defined by corticostriatal projections
Neuron 105, 1–12, February 19, 2020 5
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
A
H
B
C
I
D
E
N
G
J
K
M
F
O
Q
L
P
R
Figure 3. In Situ Analysis of Discrete Types of SPNs (A) Expression of Tac1 and Penk along the diffusion pseudotime rank of cells computed between a D1 cluster and a D2 cluster. Means and 95% confidence interval are shown. n = 236 cells. (B) In situ stain of Tac1 (red) and Penk (blue) RNAscope probes in the dorsal striatum. Scale bar is 50 mm. (C) Distribution of in situ Tac1 versus Penk expression in the CPu, measured by the ratio of the integrated intensity per cell of either stain in log space. n = 2,353 cells. (D) Expression of Synpr along the diffusion pseudotime rank of cells computed between a D1 cluster and a D2 cluster. Means and 95% confidence interval are shown. n = 185 cells. (E) In situ expression (RNAscope) of a signature corresponding to the ICj subtype. This signature is calculated as Cxcl14 + Nts Synpr, using integrated intensities per cell of each probe in log space. Scale bar is 500 mm. (F) Fluorescence image of Cxcl14 (red), Nts (green), and Synpr (blue) in situ probes used to calculate the signature in 2F. Scale bar is 50 mm. The change from D1-OT to ICj is step-like. (G) In situ quantification of Synpr expression, showing a discrete, step-like change from the OT region to the ICj region. n = 346 cells. The y axis is a normalized Z score of the integrated log intensity values of the Synpr probe. Means and SEMs are overlaid. (H) Expression of Nxph4 and Penk along the diffusion pseudotime rank of cells computed between a D1 cluster (blue) and the D1H cluster (red) in the scRNA-seq data in log cpm space. Means and 95% confidence interval are shown. n = 182 cells. (I) In situ staining (RNAscope) of Tac1, Penk, and Nxph4. Any cell that does not express all three genes is colored gray, and cells with detection of all three genes are plotted in green. Scale bar is 500 mm. (legend continued on next page)
6 Neuron 105, 1–12, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
(Hintiryan et al., 2016; Hunnicutt et al., 2016). We found that the dorsolateral-ventromedial gradients strongly correlated with the projecting cortical region and with regions such as the amygdala and that projection-defined anatomical regions varied strongly along the same dorsolateral-ventromedial axis as Cnr1 and Crym (Figure S4B). We found examples of corticostriatal projection patterns that seemed to level match: Crym- or Cnr-high cortical regions project to Crym- or Cnr-high striatal regions, and Crym-intermediate/Cnr1-intermediate cortical regions project to Crym-intermediate/Cnr1-intermediate striatal regions (Figure S4D). We also found that the medial shell region we defined with the single-cell data corresponded almost exactly to the region defined by projections from the subiculum (Figure S4A; Hunnicutt et al., 2016). The correlation of graded gene expression and graded corticostriatal connectivity suggests that gene expression and connectivity are related at a very high resolution. We then found that genes correlated to the patch-matrix markers Kremen1, Sema5b, and Id4 formed a gradient in both D1 and D2 cells that was orthogonal to the Cnr1-Crym gradient (Figures 1D, 1H, and 4I). In situ, the patches appear to lack discrete boundaries, but the two-dimensional (2D) spatial arrangement was too complex to easily project onto a single spatial axis (Figures 4J and 4K). Therefore, to determine whether the patch-matrix organization was discrete or not, we simulated discrete patches by randomizing gene expression within each patch and within the matrix (Figures 4K and 4L). We quantified the degree of spatial organization as the correlation between a cell and its second nearest neighbor (r2nn). We found that the patch matrix was much more spatially organized than a discrete-patch model would predict (p < 0.001) (Figure 4L). This indicated that patch-matrix identity is not binary and instead depends continuously on spatial location. We also observed continuous subtypes that predicted anatomical location over very small length scales. One of the D1-only continuous subtypes localized to the ‘‘ruffles,’’ which are small protrusions of the OT. The scRNA-seq data showed a sharp but continuous transition between Nts and Cxcl14, a marker enriched in the OT ruffles (Figures 1F, 1H, 4M, and 4N). We stained for these genes and found that their expression
changed continuously from the flat part of the OT to the Cxcl14+ OT ruffles over just 150 mm (Figures 4M–4P). The expression of Nts was graded along the transition from the NAcc to the OT (Figures S3G–S3I). We also identified an expression gradient within the OT ruffles where Dner increased with proximity to the pial surface in OT ruffles (Figures S3J and S3K). A Novel Spatial Compartment of the Striatum: D2H Ventromedial ‘‘Patches’’ As reported previously, a subtype of D2 SPNs (D2-Htr7) co-expresses Penk and Tac1 (Figure 5A; Gokce et al., 2016; Saunders et al., 2018). Our scRNA-seq analysis predicts that it is continuous with the D2-patch cells (Figure 5B). Because the D2-patch cells have a distinct spatial distribution, and we have shown that continuous subtypes are anatomically related, we hypothesized that the spatial distribution of the D2-Htr7 cells would also be patch-like. We found that Tac1+/Htr7+/ Th+ cells appeared more ventromedial than most of the patches marked by Kremen1 or Sema5b (Figures 4J and 5D). Htr7+/Tac1+ cells were 14 times more likely to be Th+ than Htr7-/Tac1+ cells, matching the scRNA-seq data (Figures 5A and 5E). To determine whether the Htr7+/Th+ cells were clustered into patches, we computed the Getis-Ord hotspot statistic G to measure the significance of spatial clustering (Cliff and Ord, 1970). We found patches of cells that were significantly enriched for Htr7+/Th+ cells primarily in the medial shell region, corresponding to the region we previously defined using Stard5 and Stra6 (Figures 1E and 5F). We therefore define a novel spatial compartment of the striatum composed of medial shell patches containing Tac1+/Htr7+/Th+ D2 SPNs that appear to be the medial shell extension of the D2 patch system (Figure 5G). Intriguingly, the Th+ cells did not express Ddc, suggesting that they do not metabolize dopamine but instead use tyrosine hydroxylase in a different pathway. Splicing and Identity Determination of a D1R/D2R Co-expressing Type Though D1 and D2 receptors are primarily non-overlapping in expression and function, there is known to be co-expression
(J) Fluorescence image of Tac1 (red), Nxph4 (green), and Penk (blue) in situ probes used to calculate the signature in Figure 2F. Scale bar is 20 mm. The D1HNxph4 subtype appears scattered throughout the CPu. (K) Fluorescence image of Tac1 (red), Nxph4 (green), and Penk (blue) in situ probes in the lateral NAcc and lateral stripe of the striatum (LSS). Arrow corresponds to the spatial axis showing a discrete change in expression from the NAcc to the LSS. Scale bar is 100 mm. (L) Single-cell quantification of the D1H-Nxph4 signature Nxph4 + Penk, in log intensity space, over the spatial axis defined by the arrow in Figure 2K. Cells are manually gated using the in situ image into LSS and NAcc cells. Curve is LOESS regression with 95% confidence interval (CI), computed separately for the LSS and NAcc cells. n = 235 cells. Only Tac1+ cells are used here. (M) Expression of Tac2 along the diffusion pseudotime rank of cells computed between a D1 cluster and the D1H cluster in the scRNA-seq data in log cpm space. Means and 95% confidence interval are shown. n = 182 cells. (N) In situ staining (RNAscope) of Tac1, Penk, and Tac2. Any cell that does not express all three genes is colored gray, and cells with detection of all three genes are plotted in green. Scale bar is 500 mm. (O) Fluorescence image of Tac1 (red), Tac2 (green), and Penk (blue) in situ probes in the NAcc and a ventral cluster. Scale bar is 50 mm. (P) Single-cell in situ quantification of the D1H-Tac2 signature Tac2 + Penk, in log intensity space, over the spatial axis defined by the arrow in Figure 2K. Cells are manually gated using the in situ image into VS and NAcc cells. Curve is LOESS regression with 95% CI, computed separately for the LSS and NAcc cells. n = 128 cells. Only Tac1+ cells are used here. (Q) Assignment of D1H cells into spatial domain projected onto tSNE space. n = 91 cells. (R) Expression spatial markers shared between D1 and D2 SPNs (Cnr1, Crym, Stard5) and D1H-unique markers (Nxph4, Tac2) overlaid on a tSNE projection of D1H cells. Red = high detection, yellow = zero detection in log cpm space. n = 91 cells.
Neuron 105, 1–12, February 19, 2020 7
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
A
B
C
D
E
I
J
G
K
F
H
L
M
N
O
P
Figure 4. Spatial Gene Expression Gradients Underlie the Continuous Subtypes of the Striatum (A) Force-based projection of D1 SPNs (top, n = 458 cells) and D2 SPN (bottom, n = 581 cells) anatomical assignments with inter-cluster connections defined by gap c statistic. Threshold = 0 for displaying a connection. (B) Expression of Cnr1, Crym, and Cxcl14 along the diffusion pseudotime rank of cells computed between a D1 dorsal cluster (red) and a D1 ventral cluster (blue). Means and 95% confidence interval are shown. n = 255 cells. (C) In situ quantification of Cnr1-Crym, using integrated probe intensity in log space. Scale bar is 500 mm. (D) Quantification of the Cnr1-Crym along a dorsolateral-ventromedial slice of the CPu, defined by the dashed black lines in Figure 3C. Mean ± SEM is shown. n = 2,330 cells. (E) Representative in situ fluorescence of Cnr1 (green) and Crym (red) RNAscope probes in the dorsolateral (right), intermediate dorsal (middle-right), dorsomedial (middle-left), and ventromedial (left) section of striatum shown. Scale bar for whole striatum image is 500 mm. Scale bar for magnified images is 20 mm. (F) Representative in situ fluorescence of Cnr1 (green) and Crym (red) RNAscope probes in the dorsolateral (right) and dorsomedial (left) striatum of the caudal slice shown. Scale bar for whole-striatum image is 500 mm. Scale bar for magnified images is 20 mm. (G) Expression of Cnr1 (left) and Cxcl14 (right) of selected RNA in situ stains from the Allen Mouse Brain Atlas. Scale bars are 1 mm for the zoomed-out image and 200 mm for the magnifications. (H) Cartoon of the Cxcl14-Cnr1-Crym gradient across the rostrocaudal axis of the striatum. Bright red indicates high Cnr1 expression, bright green indicates high Crym expression, and black indicates intermediate expression of both. Yellow indicates Cxcl14 expression. Cartoon is derived from observations in (B)–(G). Scale bar is 1 mm. (legend continued on next page)
8 Neuron 105, 1–12, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
and even heterodimerization of these two receptors (Perreault et al., 2011). We found that most D2 SPNs expressed the D1 receptor at scattered, low levels, and the D1 SPNs expressed almost no D2R (Figure 1A). The subtype with highest co-expression of Drd1a and Drd2 was the D1H-Nxph4 subtype (Figure 1A). Drd1a/Drd2 co-expression in this subtype was not observed in recent droplet microfluidics studies, possibly because of the low detection rate of those technologies (Saunders et al., 2018; Zeisel et al., 2018). The D2 receptor is expressed in two isoforms, short (D2S) and long (D2L) (Usiello et al., 2000; Figure 6A). We found that just 8% of isoform-specific reads aligned to D2S in D2 SPNs, consistent with previous literature (Figure 6B; Usiello et al., 2000). Expression was similar in the ventromedial-patch subtype D2-Htr7, which co-expresses Tac1 and Penk, but not Drd1a and Drd2 (Figures 1A and 6B). Intriguingly, we found that 34% of isoform-specific reads aligned to D2S in the D1HNxph4 subtype (the D1H neurons found scattered throughout the dorsal striatum) (Figure 6B). To understand the molecular pathways driving this phenomenon, we investigated transcription factors (TFs) and splicing factors (SFs) enriched in the D1H-Nxph4 subtype. We found three SFs enriched in D1H-Nxph4 (false discovery rate [FDR] < 0.1) which may be responsible for splicing out exon 6 (Figure 6C). We found 19 TFs enriched in the D1H-Nxph4 subtype (FDR < 0.1) (Figure 6D). To determine which TFs might control Drd2 expression in D1H-Nxph4 neurons, we also measured the correlation of each TF to Drd2 across all SPNs. Isl1 was the most highly anticorrelated to Drd2 throughout all SPNs and de-enriched in the D1H-Nxph4 subtype. Sp9 was the most highly correlated to Drd2 and enriched in the D1HNxph4 cell type. This suggests that Isl1 is an inhibitor and Sp9 is an activator of Drd2 expression in D1H-Nxph4 cells. Sp9 and Isl1 are required for proper formation of D2 and D1 neurons (Ehrman et al., 2013; Lu et al., 2014; Zhang et al., 2016). However, because Isl1 is not expressed in D1H-Nxph4 cells, it cannot be solely responsible for turning on Drd1a expression. Therefore we measured the correlation of each enriched TF to Drd1a and found that Foxp2 was by far the most positively correlated to Drd1a among TFs enriched in D1HNxph4 cells. Together, our results suggest Isl1 as a Drd2 repressor, Sp9 as a Drd2 activator, and Foxp2 as a Drd1a activator.
DISCUSSION The effort to describe neuron types is still lacking an ontology that can describe the full range of neuronal heterogeneity while providing structure and clarity. Our work addresses this challenge by providing a framework of neuronal identity and anatomy by revealing how discrete and continuous heterogeneity relates to anatomy. In the striatum, we showed that discrete types have no implied spatial relationships to one another and can coexist randomly intermingled throughout a region, as in the case of D1 or D2 neurons, or can exist in discrete regions composed of single subtypes, as in the ICj. Continuous heterogeneity encodes spatial gradients with few exceptions (gradients of immediate-early gene activation being the main potential exception we observed). Subtypes that are directly continuous are generally anatomically adjacent. Previous studies have reported similar subtypes that are anatomically adjacent but that appear discrete because of a small number of co-expressing neurons (Cembrowski et al., 2018). We instead propose that spatial gradients can occur over very small length scales, exemplified by the OT-ruffle transition, which occurs over just 150 mm. Additionally, continuous heterogeneity does not always imply an obvious linear gradient of expression, as demonstrated by the patch-matrix gradients. Our conclusion is that stable, continuous neuronal subtypes are generally related by an underlying anatomical axis, regardless of the details of the continuity; discrete subtypes are not. Our work provides a natural explanation and solution to a key challenge of how to define neuronal subtypes. The effort of assigning neurons to types has been hindered by the difficulty of finding unique marker genes for each subtype (Li et al., 2017; Moffitt et al., 2016). Previous papers have dealt with this by naming neuron types by the combination of genes that best define them (Saunders et al., 2018), although this can lack clarity and interpretability. We showed that both the genes encoding dorsoventral gradients and the genes encoding the patch-matrix gradients are shared between the D1 and D2 discrete subtypes. Because of this, there are no clear marker genes defining, for example, just the cells of the dorsal D1 patches. Instead, these cells are best defined by three genes: a D1 gene (Drd1a), a dorsal gene (Cnr1), and a patch gene (Kremen1). Therefore we conclude that most projection neurons in the striatum in fact
(I) Expression of Kremen1, Sema5b, and Id4 along the diffusion pseudotime rank of cells computed between a D1 matrix cluster (red) and a D1 patch cluster (blue). Means and 95% confidence interval are shown. n = 253 cells. (J) In situ quantification of the patch-matrix signature defined as Kremen1 + Sema5b Id4, using integrated probe fluorescence intensity in log space. Scale bar is 500 mm. (K) Quantification of the patch-matrix signature within two real, adjacent patches (left panel). White lines show manually defined boundaries of the patch, dividing the plot into three regions (patch.1, patch.2, and matrix). Right panel shows simulated discrete patches, where the expression of cells are scrambled within each of the three regions. Scale bar is 200 mm. (L) Monte Carlo simulation of the second nearest neighbor correlation (r) of 1,000 randomized patch and matrix cells (gray distribution) and the true r (red line). n = 873 cells. (M) Expression of Cxcl14 and Nts along the diffusion pseudotime rank of cells computed between the D1 NAcc cluster (blue) and a D1-OT cluster (blue). Means and 95% confidence interval are shown. n = 202 cells. (N) In situ quantification of the OT-ruffle signature defined as Nts Cxcl14 using integrated fluorescence intensity in log space. Scale bar is 500 mm. (O) Quantification of individual genes (upper plot) and Nts Cxcl14 (lower plot) along the spatial axis from a medial OT-ruffle to the flat OT to a lateral OT-ruffle. Means ± SEMs are shown. n = 792 cells. (P) Representative in situ fluorescence of Cxcl14 (red), Synpr (green), and Nts (blue) RNAscope probes in the flat OT (upper), intermediate (middle), and OT-ruffle (bottom). Scale bar is 20 mm.
Neuron 105, 1–12, February 19, 2020 9
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
A
B
C
D E
F
G
are best defined by three values: one discrete value indicating D1 or D2 identity, one continuous value indicating dorsoventral identity, and one continuous value indicating patch-matrix identity. Our work therefore describes both an interpretable ontology, and a fundamental biological reason for using it. A major question in striatal organization is how finely defined the boundaries defining function regions is. The striatum is thought to be divided into roughly three parts: the sensorimotor, limbic, and associative domains, roughly corresponding to the dorsolateral, dorsomedial, and ventral striatum (Hunnicutt et al., 2016). However, these lack clear cytoarchitectural boundaries, which has hindered consistent functional interrogation. Our results show that at the level of gene expression, there is unlikely to be any such hard boundary but rather a gradual change, in both the medial-lateral and dorsal-ventral dimensions. The dorsolateral-ventromedial gradients of expression align with the topology of striatal connectivity, which also changes in a graded manner along the medial-lateral and dorsal-ventral axes (Poulin et al., 2018; Voorn et al., 2004). This suggests that circuits are themselves not discrete units composed of large numbers of cells. Instead, circuits may be defined highly locally, by the combinatorial and gradient identities of their underlying neurons. Interestingly, even though the dorsomedial striatum has a
10 Neuron 105, 1–12, February 19, 2020
Figure 5. Defining a Striatal Medial Shell Region Populated by Patch-like D2H SPNs (A) scRNA-seq expression of matrix, patch, and D2H markers in D2 SPNs (tSNE dimensionality reduction). (B) Force-based projection of D2 SPN (bottom, n = 581 cells) anatomical assignments with intercluster connections defined by gap c statistic. Threshold = 0 for displaying a connection. (C) Expression of Htr7 and Th along the diffusion pseudotime rank of cells computed between the D2-Htr7 cluster (red) and D2-patch-ventromedial cluster (blue). Means and 95% confidence interval are shown. n = 100 cells. (D) In situ quantification of D2H markers Htr7, Th, and Tac1. Triple-positive cells are displayed as blue, and non-triple-positive cells are gray. Only Tac1+ cells are shown. Scale bar is 680 mm. Sections are increasingly caudal from left to right. (E) In situ quantification of the fraction of cells expressing Th, separated by whether they also express Htr7. p < 0.001; OR = 14 using Fisher’s exact test. n = 10,924 cells. (F) In situ stains in (D) with cells colored red if they have a significant (FDR < 0.05) Getis-Ord hotspot statistic for enrichment of triple-positive cells among their 30 nearest neighbors. Scale bar is 680 mm. n = 9,336 cells. (G) Cartoon demonstrating the inferred dorsolateral-ventromedial-medial shell gradient within patch D2s, where the D2-Htr7 cells represent the medial shell tip of that gradient. Scale bar in shown in D) and is 680 mm.
unique projection topology and behavioral role, it lacks a unique gene signature, instead being defined by intermediate levels of ventral genes like Crym and low levels of dorsolateral genes like Cnr1. We conclude that neurons that look purely ‘‘intermediate’’ between two types may have unique roles in behavior and that scRNA-seq-driven studies of neurons must take this into account. Traditionally circuits are explored by probing hundreds, often thousands, of neurons that are thought to be the same type and correlating their activity with a downstream readout. However, if neurons have a defined identity that goes beyond their type and is determined at high resolution by location and gene expression patterns, it is difficult to envision defined circuits between large numbers of cells. There must be a substructure to these circuits that depends on the precise identity of the neurons involved, and this substructure may be as important for neural computations as the overall connectivity of the neurons themselves. It is possible that a redefinition of the concept of neural circuits may lead to a better understanding of how the brain processes information by analyzing the brain’s computational networks not as bulk circuits but deconvolving synaptic connections into more precise point-to-point circuits that differ among neurons of the same type depending on their location and gene expression patterns.
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
A
B
B
d
C
Quantitative in situ mapping of gene expression In situ analysis of discrete SPN types B In situ analysis of continuous SPN heterogeneity B In situ analysis of D2-Htr7 continuous subtype DATA AND CODE AVAILABILITY B
SUPPLEMENTAL INFORMATION Supplemental Information can be found online at https://doi.org/10.1016/j. neuron.2019.11.004. ACKNOWLEDGMENTS
D
E
We would like to acknowledge Spyros Darmanis, Derek Croote, Mark Kowarsky, and Fabio Zanini for their efforts in building the lab’s Smart-Seq2 wet lab and computational pipelines. We would like to acknowledge Marco Mignardi and Jenny Eng for help with in situ imaging and analysis. This work was supported by grants from the National Institute on Drug Abuse (NIDA) (K99DA038112 to O.G.), the NIH (R37MH52804 to T.C.S.), the Brain and Behavior Research Foundation (to O.G.), and the Chan-Zuckerberg Biohub (to S.R.Q.). AUTHOR CONTRIBUTIONS
Figure 6. Splicing and Identity Determination of a D1R/D2R Co-expressing Type of SPN (A) Splicing of the murine D2 dopamine receptor. (B) Quantification of the fraction of reads aligning to the skipped exon over the exon5 : exon7 . total reads aligning to either junction, exon5 : exon7 + exon5 : exon6 (C) Average log fold change (FC) of top differentially expressed splice factors (with Gene Ontology annotation GO: 190641), comparing D1H-Nxph4 cells with all the other SPNs. The three genes significantly enriched in D1H-Nxph4 are displayed as text and in red (FDR < 0.1). n = 90 cells. (D) Transcription factors (TFs) that are candidates for controlling Drd2 expression. TFs shown are all those significantly enriched in D1H-Nxph4 cells versus all other SPNs (FDR < 0.1). Genes are colored by their correlation to Drd2 across all SPN subtypes. Gene font size is calculated as 0.5 + Drd2 correlation 3 sign(D1H-Nxph4 log FC). n = 1,203 cells. (E) Transcription factors (TFs) that are candidates for controlling Drd1a expression. TFs shown are all those significantly enriched in D1H-Nxph4 cells versus all other SPNs (FDR < 0.1). Genes are colored by their correlation to Drd1a across all SPN subtypes. Gene font size is calculated as 0.5 + Drd1a correlation 3 sign(D1H-Nxph4 log FC). n = 1,203 cells.
Conceptualization, Resources, and Writing – Review & Editing, G.S., S.R.Q., O.G., and T.C.S.; Methodology and Investigation, G.S. and O.G.; Formal Analysis, Writing – Original Draft, Software, Data Curation, and Visualization, G.S.; Supervision, S.R.Q. and T.C.S. DECLARATION OF INTERESTS The authors declare no competing interests. Received: April 29, 2019 Revised: August 22, 2019 Accepted: November 1, 2019 Published: December 5, 2019 REFERENCES Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169. Cembrowski, M.S., Bachman, J.L., Wang, L., Sugino, K., Shields, B.C., and Spruston, N. (2016). Spatial gene-expression gradients underlie prominent heterogeneity of CA1 pyramidal neurons. Neuron 89, 351–368.
STAR+METHODS
Cembrowski, M.S., Phillips, M.G., DiLisio, S.F., Shields, B.C., Winnubst, J., Chandrashekar, J., Bas, E., and Spruston, N. (2018). Dissociable structural and functional hippocampal outputs via distinct subiculum cell classes. Cell 173, 1280–1292.e18.
Detailed methods are provided in the online version of this paper and include the following:
Cliff, A.D., and Ord, K. (1970). Spatial autocorrelation: a review of existing and new measures with applications. Econ. Geogr. 46, 269.
d d d d d
KEY RESOURCES TABLE LEAD CONTACT AND MATERIALS AVAILABILITY EXPERIMENTAL MODEL AND SUBJECT DETAILS B Animals METHOD DETAILS B Generation of single-cell RNA-Seq data QUANTIFICATION AND STATISTICAL ANALYSIS B Processing and Quality Control of Single-cell RNAseq Data B Normalization B Clustering scRNA-Seq data B Determining discreteness of cell types
Darmanis, S., Sloan, S.A., Zhang, Y., Enge, M., Caneda, C., Shuer, L.M., Hayden Gephart, M.G., Barres, B.A., and Quake, S.R. (2015). A survey of human brain transcriptome diversity at the single cell level. Proc. Natl. Acad. Sci. U S A 112, 7285–7290. Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21. Durruthy-Durruthy, R., Gottlieb, A., Hartman, B.H., Waldhaus, J., Laske, R.D., Altman, R., and Heller, S. (2014). Reconstruction of the mouse otocyst and early neuroblast lineage at single-cell resolution. Cell 157, 964–978. Ehrman, L.A., Mu, X., Waclaw, R.R., Yoshida, Y., Vorhees, C.V., Klein, W.H., and Campbell, K. (2013). The LIM homeobox gene Isl1 is required for the correct development of the striatonigral pathway in the mouse. Proc. Natl. Acad. Sci. U S A 110, E4026–E4035.
Neuron 105, 1–12, February 19, 2020 11
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Gokce, O., Stanley, G.M., Treutlein, B., Neff, N.F., Camp, J.G., Malenka, R.C., €dhof, T.C., and Quake, S.R. (2016). Cellular Rothwell, P.E., Fuccillo, M.V., Su taxonomy of the mouse striatum as revealed by single-cell RNA-seq. Cell Rep. 16, 1126–1137.
Perreault, M.L., Hasbi, A., O’Dowd, B.F., and George, S.R. (2011). The dopamine d1-d2 receptor heteromer in striatal medium spiny neurons: evidence for a third distinct neuronal pathway in Basal Ganglia. Front. Neuroanat. 5, 31. Picelli, S., Faridani, O., Bjo¨rklund, A˚., Winberg, G., Sagasser, S., and
Halpern, K.B., Shenhav, R., Matcovitch-Natan, O., To´th, B., Lemze, D., Golan, M., Massasa, E.E., Baydatch, S., Landen, S., Moor, A.E., et al. (2017). Singlecell spatial reconstruction reveals global division of labour in the mammalian liver. Nature 542, 352–356.
Sandberg, R. (2014). Full-length RNA-seq from single cells using Smartseq2. Nature Protocols.
Harris, K.D., Hochgerner, H., Skene, N.G., Magno, L., Katona, L., Bengtsson Gonzales, C., Somogyi, P., Kessaris, N., Linnarsson, S., and Hjerling-Leffler, J. (2018). Classes and continua of hippocampal CA1 inhibitory neurons revealed by single-cell transcriptomics. PLoS Biol. 16, e2006387. Herkenham, M., Edley, S.M., and Stuart, J. (1984). Cell clusters in the nucleus accumbens of the rat, and the mosaic relationship of opiate receptors, acetylcholinesterase and subcortical afferent terminations. Neuroscience 11, 561–593. Hintiryan, H., Foster, N.N., Bowman, I., Bay, M., Song, M.Y., Gou, L., Yamashita, S., Bienkowski, M.S., Zingg, B., Zhu, M., et al. (2016). The mouse cortico-striatal projectome. Nat. Neurosci. 19, 1100–1114. Hunnicutt, B.J., Jongbloets, B.C., Birdsong, W.T., Gertz, K.J., Zhong, H., and Mao, T. (2016). A comprehensive excitatory input map of the striatum reveals novel functional organization. eLife 5, e19103. Kalish, B.T., Cheadle, L., Hrvatin, S., Nagy, M.A., Rivera, S., Crow, M., Gillis, J., Kirchner, R., and Greenberg, M.E. (2018). Single-cell transcriptomics of the developing lateral geniculate nucleus reveals insights into circuit assembly and refinement. Proc. Natl. Acad. Sci. U S A 115, E1051–E1060. Katz, Y., Wang, E.T., Silterra, J., Schwartz, S., Wong, B., Thorvaldsdo´ttir, H., Robinson, J.T., Mesirov, J.P., Airoldi, E.M., and Burge, C.B. (2015). Quantitative visualization of alternative exon expression from RNA-seq data. Bioinformatics 31, 2400–2402. Lake, B.B., Ai, R., Kaeser, G.E., Salathia, N.S., Yung, Y.C., Liu, R., Wildberg, A., Gao, D., Fung, H.-L., Chen, S., et al. (2016). Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain. Science 352, 1586–1590. Lamprecht, M.R., Sabatini, D.M., and Carpenter, A.E. (2007). CellProfiler: free, versatile software for automated biological image analysis. Biotechniques 42, 71–75. Li, H., Horns, F., Wu, B., Xie, Q., Li, J., Li, T., Luginbuhl, D.J., Quake, S.R., and Luo, L. (2017). Classifying Drosophila olfactory projection neuron subtypes by single-cell RNA sequencing. Cell 171, 1206–1220.e22. Lu, K.-M., Evans, S.M., Hirano, S., and Liu, F.-C. (2014). Dual role for Islet-1 in promoting striatonigral and repressing striatopallidal genetic programs to specify striatonigral cell identity. Proc. Natl. Acad. Sci. U S A 111, E168–E177. Millhouse, O.E. (1987). Granule cells of the olfactory tubercle and the question of the islands of Calleja. J. Comp. Neurol. 265, 1–24. Moffitt, J.R., Hao, J., Bambah-Mukku, D., Lu, T., Dulac, C., and Zhuang, X. (2016). High-performance multiplexed fluorescence in situ hybridization in culture and tissue with matrix imprinting and clearing. Proc. Natl. Acad. Sci. U S A 113, 14456–14461. Mun˜oz-Manchado, A.B., Bengtsson Gonzales, C., Zeisel, A., Munguba, H., Bekkouche, B., Skene, N.G., Lo¨nnerberg, P., Ryge, J., Harris, K.D., Linnarsson, S., and Hjerling-Leffler, J. (2018). Diversity of interneurons in the dorsal striatum revealed by single-cell RNA sequencing and PatchSeq. Cell Rep. 24, 2179–2190.e7. Pandey, S., Shekhar, K., Regev, A., and Schier, A.F. (2018). Comprehensive Identification and spatial mapping of habenular neuronal types using singlecell RNA-seq. Curr. Biol. 28, 1052–1065.e7.
12 Neuron 105, 1–12, February 19, 2020
Poulin, J.-F., Caronia, G., Hofer, C., Cui, Q., Helm, B., Ramakrishnan, C., Chan, C.S., Dombeck, D.A., Deisseroth, K., and Awatramani, R. (2018). Mapping projections of molecularly defined dopamine neuron subtypes using intersectional genetic approaches. Nat. Neurosci. 21, 1260–1271. Robinson, J.T., Thorvaldsdo´ttir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nat. Biotechnol. 29, 24–26. Satija, R., Farrell, J.A., Gennert, D., Schier, A.F., and Regev, A. (2015). Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502. Saunders, A., Macosko, E.Z., Wysoker, A., Goldman, M., Krienen, F.M., de Rivera, H., Bien, E., Baum, M., Bortolin, L., Wang, S., et al. (2018). Molecular diversity and specializations among the cells of the adult mouse brain. Cell 174, 1015–1030.e16. Su, T., Stanley, G., Sinha, R., D’Amato, G., Das, S., Rhee, S., Chang, A.H., Poduri, A., Raftrey, B., Dinh, T.T., et al. (2018). Single-cell analysis of early progenitor cells that build coronary arteries. Nature 559, 356–362. Tasic, B. (2018). Single cell transcriptomics in neuroscience: cell classification and beyond. Curr. Opin. Neurobiol. 50, 242–249. Tasic, B., Menon, V., Nguyen, T.N., Kim, T.K., Jarsky, T., Yao, Z., Levi, B., Gray, L.T., Sorensen, S.A., Dolbeare, T., et al. (2016). Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346. Tepe, B., Hill, M.C., Pekarek, B.T., Hunt, P.J., Martin, T.J., Martin, J.F., and Arenkiel, B.R. (2018). Single-cell RNA-seq of mouse olfactory bulb reveals cellular heterogeneity and activity-dependent molecular census of adultborn neurons. Cell Rep. 25, 2689–2703.e3. Usiello, A., Baik, J.-H., Rouge´-Pont, F., Picetti, R., Dierich, A., LeMeur, M., Piazza, P.V., and Borrelli, E. (2000). Distinct functions of the two isoforms of dopamine D2 receptors. Nature 408, 199–203. Voorn, P., Vanderschuren, L.J., Groenewegen, H.J., Robbins, T.W., and Pennartz, C.M. (2004). Putting a spin on the dorsal-ventral divide of the striatum. Trends Neurosci. 27, 468–474. Wolf, F.A., Angerer, P., and Theis, F.J. (2018). SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15. Wolf, F.A., Hamey, F.K., Plass, M., Solana, J., Dahlin, J.S., Go¨ttgens, B., Rajewsky, N., Simon, L., and Theis, F.J. (2019). PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59. Zeisel, A., Mun˜oz-Manchado, A.B., Codeluppi, S., Lo¨nnerberg, P., La Manno, G., Jure´us, A., Marques, S., Munguba, H., He, L., Betsholtz, C., et al. (2015). Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142. Zeisel, A., Hochgerner, H., Lo¨nnerberg, P., Johnsson, A., Memic, F., van der €ring, M., Braun, E., Borm, L.E., La Manno, G., et al. (2018). Zwan, J., Ha Molecular architecture of the mouse nervous system. Cell 174, 999–1014.e22. Zeng, H., and Sanes, J.R. (2017). Neuronal cell-type classification: challenges, opportunities and the path forward. Nat. Rev. Neurosci. 18, 530–546. Zhang, Q., Zhang, Y., Wang, C., Xu, Z., Liang, Q., An, L., Li, J., Liu, Z., You, Y., He, M., et al. (2016). The zinc finger transcription factor Sp9 is required for the development of striatopallidal projection neurons. Cell Rep. 16, 1431–1444.
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
STAR+METHODS KEY RESOURCES TABLE
REAGENT or RESOURCE
SOURCE
IDENTIFIER
Hibernate-A Medium
ThermoFisher
Cat#A1247501
Trypsin inhibitor from chicken egg white
Sigma
Cat#T9253-500MG
Percoll
Sigma
Cat#P1644
Chemicals, Peptides, and Recombinant Proteins
DAPI (4’,6-Diamidine-20 -phenylindole dihydrochloride)
Sigma
Cat#10236276001
Triton X-100
Sigma
Cat#T8787
Betaine
Sigma-Aldrich
Cat#61962
Recombinant RNase inhibitor
Clontech
Cat#2313A
dNTP
NEB
Cat#N0447
SMARTScribe Reverse Transcriptase
Clontech
Cat#639538
DTT (dithiothreitol)
ThermoFisher
Cat#R0861
Lambda Exonuclease
NEB
Cat#M0262L
RNase-away
ThermoFisher
Cat#10328011
Critical Commercial Assays RNAscope Fluorescent Multiplex Reagent Kit
ACDBio
Cat#320850
Papain Dissociation System
Worthington
Cat#LK003150
Nextera XT DNA Library Prep Kit
Illumina
Cat#FC-131-1024
KAPA Hifi HotStart ReadyMix
Kapa Biosystems
Cat#KK2602
Ampure XP beads
Ampure XP beads
Cat#A63880
BioAnalyzer High Sensitivity Chips
Agilent
Cat#5067-4626
Falcon Cell Strainers
Fisher
Cat#08-771-2
ERCC RNA Spike-In Mix
Invitrogen
Cat#4456740
Cryomold
VWR
Cat#15160-215
Mouse Brain Atlas
Allen Institute
http://mouse.brain-map.org/
Mouse Reference Genome GRCm38/mm10
UCSC Genome Browser
http://genome.ucsc.edu/cgi-bin/ hgGateway?db=mm10
Raw and analyzed data
This paper
GEO: GSE139412
Mouse: FVB.Cg-Tg(Drd1-tdTomato)5Calak/ Mmnc Mus musculus
Malenka Laboratory
RRID:MMRRC_030512-UNC
Mouse: Wildtype: C57BL/6J
Jackson Laboratory
JAX: #000664
Mouse: Tg(Drd2-EGFP)S118Gsat/Mmnc
Malenka Laboratory
RRID:MMRRC_000230-UNC
IDT
Picelli et al., 2014
IS_PCR primer (AAGCAGTGGTATCAACGCAGAGT)
IDT
Picelli et al., 2014
Oligo dT: (50 -AAGCAGTGGTATCAACGC AGAGTACT30VN-30 )
IDT
Picelli et al., 2014
RNAscope Mm-Tac1 probe
ACD Biosciences
Cat#410351
RNAscope Mm-Penk probe
ACD Biosciences
Cat#318761
RNAscope Mm-Tac2 probe
ACD Biosciences
Cat#446391
RNAscope Mm-Nxph4 probe
ACD Biosciences
Cat#489641
Deposited Data
Experimental Models: Organisms/Strains
Oligonucleotides Template Switch Oligo (AAGCAGTGGTATCAACG CAGAGTGAATrGrG+G)
(Continued on next page)
Neuron 105, 1–12.e1–e8, February 19, 2020 e1
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Continued REAGENT or RESOURCE
SOURCE
IDENTIFIER
RNAscope Mm-Cnr1 probe
ACD Biosciences
Cat#420721
RNAscope Mm-Crym probe
ACD Biosciences
Cat#466131
RNAscope Mm-Sema5b probe
ACD Biosciences
N/A
RNAscope Mm-Id4 probe
ACD Biosciences
Cat#447861
RNAscope Mm-Nts probe
ACD Biosciences
Cat#420441
RNAscope Mm-Synpr probe
ACD Biosciences
Cat#500961
RNAscope Mm-Dner probe
ACD Biosciences
Cat#413951
RNAscope Mm-Tnnt1 probe
ACD Biosciences
Cat#466911
RNAscope Mm-Cxcl14 probe
ACD Biosciences
Cat#459741
RNAscope Mm-Th probe
ACD Biosciences
Cat#317621
RNAscope Mm-Htr7 probe
ACD Biosciences
Cat#401321
RNAscope Mm-Kremen1 probe
ACD Biosciences
Cat#425771
Software and Algorithms Star
v2.4.0, https://github.com/alexdobin/STAR
HTSeq
v0.11.1, https://htseq.readthedocs.io/ en/release_0.11.1/
R
v3.5.1, http://www.r-project.org; RRID:SCR_001905
Python
v3.6.6, https://www.python.org
Scanpy
v1.3.7, https://icb-scanpy.readthedocshosted.com/en/stable/
CellProfiler
v2.2.1, https://cellprofiler.org/releases/
Seurat
v2.3.4, https://satijalab.org/seurat/
spdep
v1.1-3, https://cran.r-project.org/web/ packages/spdep/index.html
Fiji ImageJ
Open source https://fiji.sc/
N/A
Samtools
N/A
v1.3, http://samtools.sourceforge.net/; RRID:SCR_002105
Integrative Genomics Viewer (IGV)
N/A
http://software.broadinstitute.org/ software/igv; RRID:SCR_011793
Other FACS (Aria II)
BD Biosciences
N/A
NextSeq 500
Illumina
N/A
Leica CM 3050 S cryostat
Leica
N/A
HybEZ II Hybridization System for Manual Assays
ACDBio
N/A
LEAD CONTACT AND MATERIALS AVAILABILITY Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Stephen Quake (
[email protected]). EXPERIMENTAL MODEL AND SUBJECT DETAILS Animals Generation of D1-tdTom/D2-GFP, and D1-tomato BAC reporter mice was previously described (Gokce et al., 2016). All lines were maintained by backcrossing to C57BL/6J. All other experiments were conducted on 5-7 week old male C57/Bl6 mice from Jackson Labs. Mice were weaned at 21 days of age and housed in groups of 2-5 on a 12-hour light/dark cycle (lights on 0700-1900h), with free access to food. All procedures conformed to the National Institutes of Health Guidelines for the Care and Use of Laboratory Animals and were approved by the Stanford University Administrative Panel on Laboratory Animal Care.
e2 Neuron 105, 1–12.e1–e8, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
METHOD DETAILS Generation of single-cell RNA-Seq data Tissue Dissociation Parasagittal slices (350 mm) containing striatum were prepared using standard procedures. Briefly, mice were anesthetized with isoflurane and decapitated, brains were quickly removed and placed in ice-cold low-sodium/high-sucrose dissecting solution. Slices were cut by adhering the lateral surface of the brain to the stage of a Leica vibroslicer, and placed in ice-cold hibernate-A solution (GIBCO, Themo Fisher). The striatum was dissected from each slice, and the tissue was dissociated using papain-based dissociation. Briefly, striatal tissue was incubated with frequent agitation at 30 C for 20 min 2mg/ml papain solution in Hibernate-A, followed by washing with hibernate-A including 1mg/ml Ovomucoid protease inhibitor (Sigma). The striatal slices were triturated briefly using fire polished glass Pasteur pipettes until a single-cell suspension was obtained. The suspension was cleaned from small debris by centrifuging for 3 min at 450 g through a three-step density gradient of Percoll (Sigma), top layer containing small debris was discarded and remaining cells were pelleted for 5 min at 550 g. The pellet was resuspended in 0.1 mL of hibernate A with DAPI (4’,6-diamidino-2-phenylindole, 1:4000 dilution; Sigma) to label dead cells (Gokce et al., 2016). Striatal tissue was kept in ice-cold solution throughout the processing except during the 20 min dissociation with papain at 30 C. Purification of Striatal Cells by FACS The dissociated cell suspension from adult D1-tdTom/D2-GFP mice was passed through a 70-mm cell strainer (BD Biosciences). Viable cells (DAPI negative) were sorted by FACS (Aria II; BD Biosciences). For specific enrichment of genetically labeled cell types, fluorescent striatal cells were analyzed to determine optimal parameters and gating for FACS. Finally, test sorted cells were imaged on confocal microscope as described previously (Gokce et al., 2016). Cells were sorted into ice-cold hibernate-A and pelleted by centrifugation for 5 min at 300 g. Single cells were sorted into 96 well plates filled with 4 mL lysis buffer composed of 0.05% Triton X-100 (Sigma) and, ERCC spike-ins (1:24000000 dilution), 2.5 mM oligo-dT, 2.5 mM dNTP and 2U/ml of recombinant RNase inhibitor (Clontech) then spun down and frozen at 80 C. Plates were thawed and libraries prepared as described below. Smart-Seq2 library preparation 96-well plates with sorted single cells were incubated for 3min at 72 C and immediately placed on ice. To perform reverse transcription (RT) we added each well a mix of 0.59 mL H2O, 0.5 mL SMARTScribe Reverse Transcriptase (Clontech), 2 mL 5x First Strand buffer, 0.25 mL Recombinant RNase Inhibitor (Clontech), 2 mL Betaine (5M Sigma), 0.5 mL DTT (100mM) 0.06 mL MgCL2 (1M Sigma), 0.1 mL Template-switching oligos (TSO) (100 mM AAGCAGTGGTATCAACGCAGAGTACrGrG+G). Next reaction RT reaction mixes were incubated at 42 C for 90 min followed by 70 C for 5 min. Pre amplification of cDNA done by adding 12.5 mL KAPA HiFi Hotstart 2x (KAPA Biosystems), 2.138 mL H2O, 0.25 mL ISPCR primers (10 mM, 50 -AAGCAGTGGTATCAACGCAGAGT-3), 0.1125 mL Lambda Exonuclease under the following conditions: 37 C for 30 min, 95 C for 3 min, 21 cycles of (98 C for 20 s, 67 C for 15 s, 72 C for 4 min), final extension at 72 C for 5 min. Cells were then cleaned using 2 rounds of AMPure bead (Beckman-Coulter) cleanup at a 0.8:1 ratio of beads to PCR product. The cleaned cDNA was then tagmented using the Nextera XT DNA kit and following the included instructions, using 11 cycles of PCR to amplify the tagmented cDNA (Illumina). Tagmented cDNA was then pooled and analyzed with a Bioanalyzer 2100 High Sensitivity DNA kit (Agilent) to assure that > 80% of the product was in the 200-1000bp range. This process was repeated separately for each animal. Libaries were sequenced using 2x75 reads base pairs (bp) paired-end on Illumina HiSeq 2000 or NextSeq to a target depth of 1x106 reads/sample. Generating fresh-frozen brain slices Wild-type C57B/L6 background mice were sacrificed at 8-12 weeks and perfused with ice-cold saline. The brain was removed and placed into a cryomold (VWR cat. no. 15160-215) filled with OCT (Tissue-Tek , VWR). This was then placed into an isopentane bath that was pre-cooled in liquid nitrogen. Care was taken to avoid splashing of any isopentane onto the exposed OCT surface. When the OCT was almost completely frozen (small drop of unfrozen OCT at the exposed surface) it was transferred to an insulated container filled with powdered dry ice, and stored at 80C for up to 3 months. The night before slicing, the brain was placed in a 20C freezer and allowed to equilibrate to 20C. The day of slicing, it was placed in a pre-cooled cryostat (Leica CM 3050 S) with the settings object temperature (OT): 12 to 15C; chamber temperature (CT) 25C. All chambers and tools were cleaned with RNase-away (ThermoFisher cat. no. 10328011) and 70% ethanol (Sigma). A new blade was installed in the blade holder. The glass anti-roll plate was kept cold by placing dry ice wrapped in foil on top of it every 5 slices. Slices were cut at 12-14 mm thickness with a quick, even movement of the blade. Brains were sliced coronally. Slices were flattened with a brush on the cold cutting surface, and mounted onto a Superfrost glass slide by bringing the slide into contact with the slice and allowing it to melt onto the slide, slowly, to avoid bubble formation. Each slice was quickly placed under a dissection microscope (Nikon) under dark-field illumination, and a photo was taken for reference (especially noting edge integrity and location of any bubbles). Slices that contained both the CPu and NAcc were kept for hybridization. Slides were stored at 80C for up to 6 months. RNAScope in situ hybridization All tools were cleaned with RNase-away before staining. RNAScope probes (ACDBio) were hybridized following the manufacturer’s instructions for fresh-frozen tissue. Briefly, brain slices were fixed in 4% PFA for 15 minutes. Then they were dehydrated in increasing concentrations of ethanol for 5 minutes each (70%, 85%, 100%, 100%). Then they were incubated for 30 minutes with Protease IV (ACDBio). Then they were hybridized to the target probes for 2 hours in an RNAScope hybridization oven at 40C (ACDBio).
Neuron 105, 1–12.e1–e8, February 19, 2020 e3
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Then they were washed and incubated with the amplification reagents (Amp-1, Amp-2, Amp-3, and Amp-4) as per manufacturer instructions. They were then stained with DAPI (ACDBio) for 30 s and mounted with Prolong Gold antifade reagent (ThermoFisher). QUANTIFICATION AND STATISTICAL ANALYSIS Processing and Quality Control of Single-cell RNA-seq Data Raw reads were aligned using rnaSTAR to the mm10 genome with Tdtomato and Gfp transgenic mRNA sequences and ERCC synthetic RNA added(Dobin et al., 2013). Gene counts were determined using htseq-count with ‘‘strict’’ mode on (all ambiguous alignments discarded)(Anders et al., 2015) . Gene expression was quantified as the number of counts per gene divided by the total number of mm10 counts, respectively, times 106 to get expression in counts per million (cpm). Values of log(cpm + 1) were used for analysis. Low-quality cells, which we defined as having < 1000 genes detected and < 105 counts counted by HTSeq-count, were removed for downstream analysis. Normalization Cells were normalized by dividing by the total number of counts per cell, multiplying by 106, and log-transforming with a pseudocount of one to make log cpm using NormalizeData in Seurat. Genes were scaled for PCA using the ScaleData function. Clustering scRNA-Seq data Contaminant removal A first round of clustering was performed using default Seurat parameters for FindVariableGenes and RunPCA for dimensionality reduction on cells passing quality thresholds. A group of 4 cells with low PC9 scores were enriched for genes like Apoe, Slc1a3. Querying the single-cell RNA-Seq database mousebrain.org showed that these genes were highly specific to astrocytes, so these cells were determined to be likely contamination and removed from downstream analysis (Figure S1C). Performing Seurat and PCA on the remaining cells revealed 3 cells expressing Sp8, Nfib, and other genes (Figure S1D). Querying the single-cell database dropviz.org showed these to be neurogenic genes. These cells were removed from further analysis. No other PCs or cells were observed to be enriched for glial markers. Filtering PCs that do not represent stable cell identity PCs were filtered by overlaying the PC scores on the tSNE plots (Figure S1F), and overlaying the genes with top PC loadings on tSNE plots for each PC (Figures S1G and S1H). Several PCs had high loadings of immediate early genes (IEG) like Fos and Arc, and those were removed because they are known to be highly transient8 (Figures S1F, S1H, and S1I). Other PCs were dominated by very-highexpressed genes like Lars2, a long non-coding RNA, which had almost zero dropout and did not have expression patterns that correlated to any clear subtype separation (Figure S1G). Comparing dimensionality reduction techniques (classical versus truncated PCA) We previously described a modification to PCA, truncated PCA, that improves signal and reduces correlation to technical metrics (Su et al., 2018). We compared classical and truncated PCA on clustering of all SPNs for separation of D1 and D2 SPNs and batch effects (Figures S1E, S1J, and S1L). SPNs were clustered and visualized using the following Seurat-based pipeline to optimize PCA for cluster discreteness. PCA was run on variable genes chosen with FindVariableGenes default parameters. The JackStraw function was used to compute the significance of each PC; there was a large fall-off of significance after PC15, so the first 15 PCs were used for clustering and graph abstraction. Truncated PCs were computed by setting all gene loadings below the top N positive and top N negative loadings to 0; in Figure 1 N = 200. The top 15 default or truncated PCs were used as input in the shared nearest-neighbor (SNN) graph generation and tSNE visualization with RunTSNE, and Louvain clustering with FindClusters of the Seurat package version 2.3.4 and R version 3.5.1. Louvain clustering grouped cells into neighborhoods that maximize the modularity of cluster connectivity. The resolution of clustering was chosen to produce a manageable number of clusters that roughly aligned with the number of distinct groups visible by expression of marker genes. The first 15 PCs generated by Seurat were saved and used to calculate the partial abstract graph approximation (PAGA) algorithm with the scanpy package (Wolf et al., 2018) version 1.3.7 and Python version 3.6.6. The following sequence of commands using the PC scores as input: sc.pp.neighbors with n_neighbors = 5, sc.tl.draw.graph with default parameters, sc.tl.louvain with resolution = 0.5, sc.pl.paga with groups = ‘louvain’, and sc.pl.paga with threshold = 0.15. Visualization of scRNA-Seq gene expression with tSNE tSNE projections were used to reduce the many PC dimensions (up to 12) into 2 dimensions so that cell similarity and cluster relationships can be visualized alongside gene expression. Subclustering D1 and D2 SPN discrete types D1 and D2 SPNs were initially clustered using Seurat. Each discrete type was separately re-normalized and scaled with NormalizeData with scale.factor = 1e6 and ScaleData with default parameters. Highly variable genes filtered with FindVariableGenes with y.cutoff = 0.3. Any variable genes found to be expressed in > 95% of cells was removed before PCA. PCA was calculated with RunPCA with default parameters. PCs were truncated as described previously, with N = 60. As before, PCs not likely to correspond to stable identities or corresponding to one or two outlier cells were removed. We used Louvain clustering to group cells into neighborhoods that maximize the modularity of cluster connectivity. The resolution of clustering was chosen to produce a manageable number of clusters that roughly aligned with the number of distinct groups visible by expression of marker genes.
e4 Neuron 105, 1–12.e1–e8, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
Mapping subclusters to in situ expression atlases The subclusters of each discrete type were assigned based on the genes expressed in them. Cluster-defining genes were manually searched and representative images downloaded from the Allen Adult Mouse Brain Atlas (mouse.brain-map.org). Genes that were consistent between D1 and D2 SPNs and strongly-detected in the striatum were used to map subtypes (Figures 1C–1E; Figures S2A– S2E). Multiple genes per cluster or anatomical regions were referenced to ensure that the cluster assignments were valid. Determining discreteness of cell types Partial abstract graph approximation (PAGA) of major cell types To compute the PAGA c and layout of clusters, sc.pp.neighbors with default parameters was run using the truncated PC scores for the PCs that remained after filtering (see above) to compute the neighbor graph of cells. Then sc.tl.paga and sc.pl.paga was run with default parameters to compute and visualize the connectivity between clusters. Thresholds used in visualization are described in the figure legends. Identification of key pairs of cell types We defined a ‘discrete subtype’ as a completely-connected component of the connectivity graph, meaning a set of clusters that are connected to each other but not connected to any other clusters. The number of discrete subtypes increased with increasing PAGA threshold for calling clusters (Figures 2A and 2B). To determine which subtypes were discrete, we first had to identify the key connections that created new completely-connected components as the PAGA threshold was raised (Figures 2A and 2B). The connections shown in Figure 2A all had PAGA c < 0.25, and raising the PAGA threshold above that resulted in 4 completely-connected components. We therefore considered the connections i-vi in Figure 2A to be key for creating the first 4 discrete subtypes. We then identified pairs of subtypes whose connections created additional completely-connected components as the PAGA threshold was raised further. Importantly, these were the strongest connections that maintained the connectivity of a set of clusters. Clusters often have weak connections to distant clusters within the same connected component (e.g., the connection between 1 and 0 in Figure 2A), and whether these subtypes are continuous or not is not important to the overall number of discrete subtypes. Therefore to assess whether a set of clusters should be a single discrete, completely-connected subtype, or whether it is comprised of multiple discrete subtypes, it is important to examine the strongest links that bind it together. We visualized these links that held together the clusters within the 4 major subtypes (Figure 2B). Visualizing the discreteness or continuity of gene expression To examine the continuity of pairs of clusters, we first plotted gene expression against diffusion pseudotime (DPT) (Figures 2E and 2F). We used the rank of DPT because our goal was to assess only the discreteness of the gene expression change, and DPT was just to order cells along an axis likely to correspond to a continuum if one existed. Cells were scored by the top 10 differentially-expressed genes: the average of the scaled expression of the top 5 genes in the first cluster minus the the average of the scaled expression of the top 5 genes in the 2nd cluster. Top DEGs were calculated by calling sc.tl.rank_genes_groups with default parameters on just the pair of clusters being examined. Genes were scaled by binning cells into 20 bins along the DPT axis and averaging each gene within each bin, and then dividing each gene by the bin with the highest value. This way, outlier cells did not dominate the scaling. These scores were plotting at single-cell resolution and in 20 bins along the DPT rank order. Error bars represent 95% CI calculated using seaborn regplot. Visualizing discreteness or continuity using DPT and relative cluster connectivity (RCC) We also plotted cells by two metrics that should both provide evidence of a continuum (or lack thereof) between cell types. The first metric is the relative cluster connectivity (RCC). We calculated this using the neighbors of each cell defined by sc.pp.neighbors. This is a symmetric neighbor metric. For each cell in a pair of clusters, we calculated the number of its neighbors belonging to cluster 1 (nn1) and the number of neighbors belonging to cluster 2 (nn2). We calculated RCC = nn1 – nn2. We then calculated DPT between a pair of clusters with the following series of commands: 1. sc.pp.neighbors; 2. sc.tl.diffmap with n_comps = number of PCs passing filter (see Filtering PCs that do not represent stable cell identity), and otherwise default parameters. 3. sc.tl.dpt with n_dcs = n_comps used in sc.tl.diffmap and otherwise default parameters. Computation of gap c We created an alternative to PAGA connectivity that computes the sharpness of the gap in density along an axis representing a possible continuum between two clusters. This statistic is computed for both RCC and DPT. First, RCC/DPT is computed between two clusters. Then the boundary between the clusters is identified as the window of N/2 cells with the most even number of cells from either cluster: boundary = argminw˛W f 0w ; f 1w Where w is a N/2 -cell window, W is all the N/2 -cell windows along the DPT/RCC order,f iw . is the fraction of cells in cluster i in window w, and N is the number of cells in both clusters.
Neuron 105, 1–12.e1–e8, February 19, 2020 e5
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
The cells are binned along DPT/RCC (binwidth = 0.05 for DPT and range(RCC)/10 for RCC). n1 is the number of cells in the 3-bin window with the most cells and ngap is the number of cells in the 3-bin window with the fewest cells. Then the gap connectivity statistic is calculated as ! RCC nDPT gap ngap gap c = min DPT ; RCC ; n1 n1 is nx calculated on the DPT trajectory and nRCC is nx calculated on the RCC trajectory. Where DDPT x x Quantitative in situ mapping of gene expression Imaging Slides were imaged with a Zeiss AxioImager 2 using a 20X 0.8NA air objective (Zeiss). Multiple overlapping tile images were acquired. 10-12 z stacks with 0.6-1mm spacing were used to assure that every part of the tile was in focus. Exposure times were 250-700ms. ZEN software (Zeiss) was used to do maximum intensity projection and stitching of the tiles using DAPI as the reference channel. Image analysis Images were analyzed with custom CellProfiler scripts (Lamprecht et al., 2007). Nuclei were smoothed with an 8-pixel artifact diameter with the Smooth function. 60-pixel speckle enhancement was done to reduce background, and then 8-pixel feature suppression to reduce the chance of segmenting on nucleoli. Nuclei were segmented with IdentifyPrimaryObjects with a typical pixel diameter between 35-80 pixels. Cells were segmented by expanding the nuclei segmentations by 7 pixels’ radius with hole filling. Probe intensity per cell was then quantified with the following pipeline. Background and illumination variation was reduced on all probe channels with EnhanceSpeckles with a typical diameter of 30 pixels (roughly the size of a cell). Probe images were then manually thresholded to remove background. MeasureObjectIntensity was used to quantify probe intensity per segmented cell (Figures S3A and S3B). Further analysis was done in custom R scripts. Gene expression was quantified as the log of the integrated intensity, with a lower cutoff manually determined by comparing to images to determine the level at which 0-1 probes were found. In situ analysis of discrete SPN types In situ analysis of D1 and D2 discrete types A coronal section containing the CPu and NAcc was stained and quantified for DAPI, Tac1, and Penk as described above. Cells without detection of either Tac1 or Penk and cells outside the anatomical boundary of the striatum were removed. A 500 mm x 500 mm section was selected from the CPu and the ratio Tac1/Penk was calculated as Tac1 - Penk in log space (Figure 3C). In situ analysis of the ICj discrete type A coronal section containing the CPu and NAcc was stained and quantified for DAPI, Synpr, Cxcl14, and Nts as described above. Cells without detection of either Synpr or Cxcl14 and cells outside the anatomical boundary of the striatum were removed. A section of striatum containing the olfactory tubercle (OT) and an island of Calleja was manually isolated and rotated so the line separating the two was perpendicular to the x axis. The expression of Synpr was plotted against this axis, both single-cell expression levels and means ± 95% CI of 10-cell bins (Figure 3G). The cells of the ICj were manually annotated by drawing a border around the ICj with the R package gatepoints. In situ analysis of the D1H-Nxph4 discrete type A coronal section containing the CPu and NAcc and a coronal section further caudal containing only the CPu were stained and quantified for DAPI, Tac1, Nxph4, and Penk as described above. Cells without detection of either Tac1 or Penk and cells outside the anatomical boundary of the striatum were removed. All remaining cells were plotted in gray and cells expressing all three markers (Tac1+/Nxph4+/Penk+) were plotter over those in dark green. In the following analysis only Tac1+ cells (D1 and D1H) were used. A section of striatum containing the NAcc and D1H-containing lateral stripe of the striatum (LSS) was manually isolated and rotated so the line separating the two was perpendicular to the x axis. The expression of Nxph4 + Penk in log space was plotted against this axis and a LOESS regression was plotted using stat_smooth with default parameters (Figure 3L). The cells of the LSS were manually annotated by drawing a border around the LSS with the R package gatepoints. In situ analysis of the D1H-Tac2 discrete type A coronal section containing the CPu and NAcc and a coronal section further caudal containing only the CPu were stained and quantified for DAPI, Tac1, Tac2, and Penk as described above. Cells without detection of either Tac1 or Penk and cells outside the anatomical boundary of the striatum were removed. All remaining cells were plotted in gray and cells expressing all three markers (Tac1+/Tac2+/Penk+) were plotter over those in dark green. In the following analysis only Tac1+ cells (D1 and D1H) were used. A section of striatum containing the NAcc and D1H-containing ventral stripe (VS) was manually isolated and rotated so the line separating the two was perpendicular to the x axis. The expression of Tac2 + Penk in log space was plotted against this axis and a LOESS regression was plotted using stat_smooth with default parameters (Figure 3L). The cells of the VS were manually annotated by drawing a border around the VS with the R package gatepoints.
e6 Neuron 105, 1–12.e1–e8, February 19, 2020
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
In situ analysis of continuous SPN heterogeneity Measuring continuity of D1 and D2 subtypes The gap c statistic was applied to all pairs of subclusters of D1 and D2 SPNs. Plots are displayed with threshold = 0.01. In situ analysis of the dorsolateral-ventromedial gradients A coronal section containing the CPu and NAcc and a coronal section further caudal containing only the CPu were stained and quantified for DAPI, Tac1, Cnr1, and Crym as described above. The following analysis was done on the rostral section containing CPu and NAcc. Tac1 was used to define the anatomical boundary of the striatum and cells outside the anatomical boundary of the striatum were removed. The ratio Cnr1/Crym was computed as Cnr1 - Crym in log space and plotted. A 500 mm-wide stripe of the striatum was manually isolated and rotated to be perpendicular to the x axis. Cnr - Crym in log space was plotted against this axis using the means and SEM of 50-cell bins (Figure 4C). To compare rostral and caudal distributions of the gradient, the expression of Cnr1 and Crym from example regions was visually compared between the rostral and caudal sections (Figures 4D and 4E). To investigate the expression of Cxcl14, Allen Atlas in situ images from a male P56 animal were downloaded and representative caudal, intermediate, and rostral sections and magnifications of the dorsolateral region were plotted. The previous information was used to construct a visualization of the Cxcl14/Cnr1/Crym gradient, where the brightness of the red (green) color indicates the relative expression of Cnr1 (Crym) and the yellow color represents Cxcl14 expression. In situ analysis of patch-matrix gradients A coronal section containing the CPu and NAcc was stained and quantified for DAPI, Kremen1, Sema5b, and Id4 as described above. The following analysis was done on the rostral section containing CPu and NAcc. Cells outside the anatomical boundary of the striatum were removed. The ratio Kremen1*Sema5b/Id4, computed as Kremen1 + Sema5b - Id4 in log space, was used as the patch-matrix score. A section containing two patches was isolated. Boundaries around the patches were manually drawn and cells were assigned to patch1, patch2, or matrix. The patch-matrix score was randomized within patch1, patch2, and matrix groups. The correlation between each cell and its 2nd-nearest-neighbor was used as a metric of spatial correlation (the nearest neighbor was not used as occasionally nearest neighbors represented single cells segmented into two by the segmentation algorithm (Figure S3)). This correlation was computed over 1000 randomizations and plotted along with the true correlation. In situ analysis of patch-matrix gradients A coronal section containing the CPu and NAcc was stained and quantified for DAPI, Kremen1, Sema5b, and Id4 as described above. The following analysis was done on the rostral section containing CPu and NAcc. Cells outside the anatomical boundary of the striatum were removed. The ratio Kremen1*Sema5b/Id4, computed as Kremen1 + Sema5b - Id4 in log space, was used as the patch-matrix score. A section containing two patches was isolated. Boundaries around the patches were manually drawn and cells were assigned to patch1, patch2, or matrix. The patch-matrix score was randomized *within* patch1, patch2, and matrix groups. The correlation between each cell and its 2nd-nearest-neighbor was used as a metric of spatial correlation (the nearest neighbor was not used as occasionally nearest neighbors represented single cells segmented into two by the segmentation algorithm (Figure S3)). This correlation was computed over 1000 randomizations and plotted along with the true correlation. In situ analysis of olfactory tubercle (OT) gradients A coronal section containing the CPu and NAcc was stained and quantified for DAPI, Cxcl14, Nts, and Synpr as described above. Cells outside the anatomical boundary of the striatum were removed. The ratio Nts/Cxcl14, computed as Nts + Cxcl14 in log space, was used as the patch-matrix score. A section of striatum containing the OT flat and two OT ruffles was manually isolated and rotated so the OT was as close to parallel to the x axis as possible. Nts and Cxcl14 in log space was plotted against this axis and mean ± SEM of 70 mm was plotted (Figure 3L). In situ analysis of D2-Htr7 continuous subtype scRNA-Seq gradients of Htr7 and Th Gradients of Htr7 and Th were visualized in scRNA-Seq data by computing the DPT ranking of cells between the D2-Htr7 and D2patch-ventromedial subtypes. Single-cell values and means ± 95% CI of binned values were plotted along cells ordered by DPT rank. In situ analysis of D2-Htr7 patches Three coronal section along a rostral-caudal range, all containing the CPu and NAcc, were stained and quantified for DAPI, Tac1, Htr7, and Th as described above. Tac1 was used to define the anatomical boundary of the striatum and cells outside the anatomical boundary of the striatum were removed. Cells not expressing Tac1 were removed. Cells expressing all three markers (Tac1+/Tac2+/ Penk+) were plotted dark blue and non-triple-expressing cells were plotted in gray. A contingency table was made by the number of cells with/without Htr7 and with/without Th and a Fisher exact test was used to compute the p value of Th enrichment in Htr7+ cells. The spatial clustering cells was analyzed using the spdep package in R to compute the Getis-Ord G and the associated p value of enrichment of triple-expressors in the following way. 1. The 30 nearest neighbors of each cell was computed with knearneigh with k = 30. 2. The G statistic was computed with localG with default parameters with each region defined by the 30-nearest-neighbors computed above.
Neuron 105, 1–12.e1–e8, February 19, 2020 e7
Please cite this article in press as: Stanley et al., Continuous and Discrete Neuron Types of the Adult Murine Striatum, Neuron (2019), https://doi.org/ 10.1016/j.neuron.2019.11.004
3. The G statistic, which is a z-score, was transformed to p values by the calculation 2*pnorm(-abs(G)) in R. Multiple testing correction was performed with p.adjust using the false discovery rate. This information was used to create a cartoon depiction of the Htr7/Th gradient within the D2-patches. Quantification of Drd2 splicing Bam files from 30 randomly-selected cells each from the D1, D2, and D1H SPN types were visualized with the Integrated Genome Viewer (IGV) software version 2.4.10 after loading the mm10 genome(Robinson et al., 2011). Drd2 splicing was quantified using a Sashimi plot generated by IGV(Katz et al., 2015). The fraction of skipped reads was calculated as the ratio of reads aligning to the exon5-exon7 junction over the sum of the reads aligning to the exon5-exon6 and exon5-exon7 junction. DATA AND CODE AVAILABILITY The processed scRNA-Seq and in situ data is deposited in Figshare (https://figshare.com/projects/Continuous_and_discrete_ neuron_types_of_the_adult_murine_striatum/69080). The code used to analyze the project is available on GitLab (https://code. stanford.edu/gmstanle/atlas_striatum). Raw scRNA-Seq data is available on GEO (accession GEO: GSE139412). The raw RNAscope data is currently stored and publicly available on Google Drive (https://drive.google.com/drive/folders/0B07-_ hxlZ129Z213bTQ4WnBjMjg).
e8 Neuron 105, 1–12.e1–e8, February 19, 2020