Accepted Manuscript
Brain regions showing white matter loss in Huntington’s disease are enriched for synaptic and metabolic genes
Peter McColgan, Sarah Gregory, Kiran K. Seunarine, Adeel Razi, Marina Papoutsi, Eileanoir Johnson, Alexandra Durr, Raymund AC. Roos, Blair R. Leavitt, Peter Holmans, Rachael I. Scahill, Chris A. Clark, Geraint Rees, Sarah J. Tabrizi
PII: S0006-3223(17)32129-7
DOI: 10.1016/j.biopsych.2017.10.019
Reference: BPS 13364
To appear in: Biological Psychiatry
Received Date: 19 April 2017
Revised Date: 5 October 2017
Accepted Date: 7 October 2017
Please cite this article as: McColgan P., Gregory S., Seunarine K.K., Razi A., Papoutsi M., Johnson E., Durr A., Roos R.A., Leavitt B.R., Holmans P., Scahill R.I., Clark C.A., Rees G., Tabrizi S.J. and the Track-On HD Investigators, Brain regions showing white matter loss in Huntington’s disease are enriched for synaptic and metabolic genes, Biological Psychiatry (2017), doi: 10.1016/j.biopsych.2017.10.019.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Brain regions showing white matter loss in Huntington’s disease are enriched for synaptic and metabolic genes
Peter McColgan1, Sarah Gregory1, Kiran K Seunarine2, Adeel Razi3,4, Marina Papoutsi1, Eileanoir Johnson1, Alexandra Durr5, Raymund AC Roos6, Blair R Leavitt7, Peter Holmans8, Rachael I Scahill1, Chris A Clark2, Geraint Rees3*, Sarah J Tabrizi1,9* and the Track-On HD Investigators
*These authors contributed equally to this work
1Huntington’s Disease Centre, Department of Neurodegenerative Disease, UCL Institute of Neurology, London, WC1N 3BG, UK
2Developmental Imaging and Biophysics Section, UCL Institute of Child Health, London, WC1N 1EH, UK
3Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, London, WC1N 3BG, UK
4Department of Electronic Engineering, NED University of Engineering and Technology, Karachi, Pakistan
5APHP Department of Genetics, University Hospital Pitié-Salpêtrière, and ICM (Brain and Spine Institute) INSERM U1127, CNRS UMR7225, Sorbonne Universités – UPMC Paris VI UMR_S1127, Paris, France
6Department of Neurology, Leiden University Medical Centre, 2300RC Leiden, The Netherlands
7Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, 950 West 28th Avenue, Vancouver BC, V5Z 4H4 Canada
8MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, CF24 4HQ, UK
9National Hospital for Neurology and Neurosurgery, Queen Square, London, WC1N 3BG, UK
Short title: Transcription and connectivity loss in Huntington’s
Peter McColgan 1
Abstract word count: 231
Text word count: 3,997
Number of tables: 1
Number of figures: 5
Number of Supplemental Material files: 2 (1 PDF file, 1 ZIP file)
PDF file: contains Methods, Figures S1-S6, Tables S1-S2
ZIP file: contains supplemental Excel Files 1-9
Correspondence to:
Professor Sarah J. Tabrizi
UCL Huntington’s Disease Centre
Department of Neurodegenerative Disease
UCL Institute of Neurology and National Hospital for Neurology and Neurosurgery
Box 104
Queen Square
London WC1N 3BG
Email: [email protected]
Telephone: 0203 108 7474
Abstract
Background: The earliest white matter changes in Huntington’s disease are seen before disease onset in the premanifest stage around the striatum, within the corpus callosum and in posterior white matter tracts. While experimental evidence suggests these changes may be related to abnormal gene transcription, we lack an understanding of the biological processes driving this regional vulnerability.
Methods: Here, we investigate the relationship between regional transcription in the healthy brain, using the Allen Institute of Brain Science transcriptome atlas, and regional white matter connectivity loss at three time points over 24 months in premanifest Huntington’s disease relative to controls. The baseline cohort included 72 premanifest Huntington’s disease participants and 85 healthy controls.
Results: We show that loss of cortico-striatal, inter-hemispheric and intra-hemispheric white matter connections at baseline and over 24 months, in premanifest Huntington’s disease, is associated with gene expression profiles enriched for synaptic genes and metabolic genes. Cortico-striatal gene expression profiles are predominantly associated with motor, parietal and occipital regions, while inter-hemispheric expression profiles are associated with fronto-temporal regions. We also show that genes with known abnormal transcription in human Huntington’s disease and animal models are over-represented in synaptic gene expression profiles but not metabolic gene expression profiles.
Conclusions: These findings suggest a dual mechanism of white matter vulnerability in Huntington’s disease, where abnormal transcription of synaptic genes and metabolic disturbance not related to transcription may drive white matter loss.
Introduction
Huntington’s disease (HD) is a progressive fatal neurodegenerative disease caused by a CAG repeat expansion in the huntingtin gene on chromosome 4. Individuals with more than 39 CAG repeats are certain to develop HD, allowing investigation of the premanifest stage (preHD) many years before symptom onset (1). While the caudate and putamen show the earliest grey matter changes (2), white matter changes are seen around the striatum, within the corpus callosum and in the posterior white matter (WM) tracts (2-5). We have demonstrated a hierarchy of white matter vulnerability where cortico-striatal connections show greatest changes in preHD and controls followed by inter-hemispheric and intra-hemispheric connections (6).
Voxel-based morphometry (2, 7) suggests that grey and white matter abnormalities in the striatum occur in parallel in those furthest from disease onset, but more recent work (5) suggests that grey matter atrophy precedes white matter atrophy in the striatum. However, as this was a cross-sectional study it is not yet possible to define a typical time lag. Thus, patterns of WM loss in preHD are well established, but the underlying pathological processes are unclear.
Mutant huntingtin protein causes cellular dysfunction and ultimately neuronal cell death through several processes (8, 9) including downstream effects on synaptic signalling (10), cellular metabolism (11), mitochondrial dysfunction (12), immune activation (13) and alterations in transcription (14). Furthermore, transcription levels of genes involved in these processes are atypical in human HD and animal models (14, 15). Decreased expression of synaptic proteins in cortical pyramidal neurons of HD mouse models is linked to abnormal cortico-striatal connectivity (16), while changes in transcription levels of brain-derived neurotrophic factor (BDNF), another protein involved in synaptic transmission, are associated with changes in cortico-cortical connectivity (17). Excitotoxic striatal lesion models of HD are consistent with these findings. Reduced BDNF is seen in the rat striatum after quinolinic acid injection (18) and reduced BDNF and nerve growth factor are seen after 3-nitropropionic acid treatment (19).
Some genes show a direct association with WM integrity. Loss of peroxisome proliferator-activated receptor gamma co-activator 1α (PGC1α), involved in the transcriptional regulation of energy metabolism, results in striatal degeneration and corpus callosum WM abnormalities in HD mouse models (20). Reduced transcription levels of myelin-related genes are associated with WM abnormalities in HD mouse models (21).
Given the relationship between WM connectivity and gene transcription in HD, here we investigated how regional gene transcription profiles of the healthy human brain, obtained from the Allen Institute of Brain Science (AIBS) human transcriptome atlas (22), were associated with WM connectivity loss in preHD. Based on the association between synaptic and metabolic genes and WM loss in HD (20, 21), we hypothesized that WM connectivity loss in preHD would be associated with regional transcription profiles enriched for synaptic and metabolic genes.
Methods and Materials
Overview
To test our hypothesis, WM connectivity loss was determined using diffusion weighted imaging (DWI) from a longitudinal cohort of preHD and control participants. Brains were parcellated into 70 cortical and 2 sub-cortical (caudate and putamen) regions of interest (ROI) based on the Desikan Freesurfer atlas (23). The caudate and putamen were chosen as these regions show the greatest changes in preHD (2). Whole brain tractography was performed using these parcellations to construct WM brain networks. We have recently published a longitudinal analysis using this cohort (6).
For each set of connections associated with a cortical ROI, WM connectivity loss was defined as either cortico-striatal (connections between cortex and caudate/putamen), inter-hemispheric (cortico-cortical connections between hemispheres) or intra-hemispheric (cortico-cortical connections within the same hemisphere) (see Figure 1). WM connectivity and rate of change in WM connectivity over 24 months were normalised for preHD relative to controls for each participant. Connectivity measures were then transformed to give atrophy and rate of atrophy measures. The resulting atrophy score was used in the cross-sectional analysis, while the rate of atrophy score was used in the longitudinal analysis.
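The three connection classes defined above can be expressed as a simple rule. The following is a minimal sketch for illustration only, not the study's code: the label format (`lh.`/`rh.` prefixes), the region names and the function names `split_label` and `classify_connection` are all assumptions.

```python
# Illustrative sketch: label format and names are hypothetical, not from the study.
STRIATAL = {"caudate", "putamen"}  # the two sub-cortical ROIs named in the text


def split_label(label):
    """Split an assumed label such as 'lh.precentral' into (hemisphere, region)."""
    hemisphere, region = label.split(".", 1)
    return hemisphere, region


def classify_connection(roi_a, roi_b):
    """Return the connection class for a pair of ROI labels."""
    hemi_a, region_a = split_label(roi_a)
    hemi_b, region_b = split_label(roi_b)
    a_striatal = region_a in STRIATAL
    b_striatal = region_b in STRIATAL
    if a_striatal and b_striatal:
        # Striatum-to-striatum edges fall outside the three classes in the text.
        return "other"
    if a_striatal or b_striatal:
        # Cortex to caudate/putamen, in either hemisphere.
        return "cortico-striatal"
    if hemi_a != hemi_b:
        return "inter-hemispheric"
    return "intra-hemispheric"
```

For example, `classify_connection("lh.precentral", "rh.precentral")` yields the inter-hemispheric class, while any edge touching the caudate or putamen from a cortical region is treated as cortico-striatal.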
To compare regional WM loss in preHD with regional gene expression in the healthy brain, the 70 cortical ROIs (23) were matched to the closest AIBS ROI and gene expression data were averaged across RNA probes corresponding to the same gene. ROIs with gene expression values greater than two standard deviations above the mean or range were excluded; this resulted in the inclusion of 20,737 genes across 68 cortical ROIs.
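The probe-averaging step can be sketched as follows. This is an illustrative example under assumed inputs, not the pipeline used in the study: the input layout (one row per probe, with a gene symbol and one expression value per ROI) and the function name `average_probes_per_gene` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_probes_per_gene(probe_rows):
    """Average expression across probes that map to the same gene.

    probe_rows: iterable of (gene_symbol, [expression value per ROI]).
    Returns {gene_symbol: [mean expression per ROI]}.
    """
    grouped = defaultdict(list)
    for gene, values in probe_rows:
        grouped[gene].append(values)
    # zip(*rows) walks the probes of one gene ROI by ROI.
    return {
        gene: [mean(per_roi) for per_roi in zip(*rows)]
        for gene, rows in grouped.items()
    }
```

A gene measured by a single probe passes through unchanged; a gene measured by several probes receives one averaged profile across ROIs.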
Partial least squares (PLS) regression was used to investigate the relationship between regional gene expression and regional white matter loss. PLS is a multivariate technique used when the number of predictor variables (i.e. regional gene expression) is much larger than the number of observations (i.e. regional white matter loss). It has been used previously to investigate the relationship between gene expression and MRI-derived regional brain measures in healthy volunteers (24, 25). For our analysis the predictor variable comprised a gene × ROI matrix (20,737 × 68) and the response variable comprised a WM loss × ROI matrix: 68 × 4 for the cortico-striatal analysis (68 cortical ROIs × left and right caudate and putamen WM loss to each ROI region) and 68 × 1 for the inter- and intra-hemispheric analyses (68 cortical ROIs × inter/intra-hemispheric WM loss for each ROI). PLS identified components or patterns of regional gene expression having maximum covariance with regional white matter loss, such that the first few PLS components provide the greatest representation of the covariance. For each component individual genes are assigned weights based on their contribution to the variance explained (24).
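To make the role of the first PLS component concrete, the sketch below computes it for the single-response case (as in the 68 × 1 inter- and intra-hemispheric analyses). It is a simplified illustration, not the code adapted from Whitaker et al.: it assumes column-centred data without variance scaling, handles only one response variable, and the function names `center` and `first_pls_component` are hypothetical. With one response, the first weight vector is proportional to the covariance of each gene with WM loss.

```python
from math import sqrt


def center(column):
    """Subtract the mean from a list of numbers."""
    m = sum(column) / len(column)
    return [v - m for v in column]


def first_pls_component(X, y):
    """First PLS component for a single response (PLS1).

    X: list of rows, one per ROI, each a list of gene expression values.
    y: list of WM-loss values, one per ROI.
    Returns (gene_weights, roi_scores, r_squared).
    """
    n_genes = len(X[0])
    # Centre each gene (column) and the response across ROIs.
    cols = [center([row[j] for row in X]) for j in range(n_genes)]
    yc = center(y)
    # Gene weights: covariance with the response, scaled to unit length.
    cov = [sum(x * v for x, v in zip(col, yc)) for col in cols]
    norm = sqrt(sum(c * c for c in cov))
    weights = [c / norm for c in cov]
    # ROI scores: projection of each ROI's centred expression onto the weights.
    scores = [
        sum(w * col[i] for w, col in zip(weights, cols)) for i in range(len(y))
    ]
    # Share of response variance explained by regressing y on the scores.
    sty = sum(s * v for s, v in zip(scores, yc))
    r_squared = sty * sty / (sum(s * s for s in scores) * sum(v * v for v in yc))
    return weights, scores, r_squared
```

Genes can then be ranked by weight, for instance with `sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)`, which mirrors the ranking used as input to the enrichment analysis.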
This analysis provided a weight for each gene indicating its contribution to WM connectivity loss for each component or pattern. Using this information, genes were ranked according to their PLS weight. Gene enrichment analysis was then performed to identify the biological functions of genes with the highest weights using gene ontology (GO) terms (26). Here, the significance of a GO term was determined based on the rank of genes associated with that term.
Imaging Cohort
The cohort included preHD and control participants from the Track-On HD study (27), followed up at 3 time points over 24 months at four sites (London, Leiden, Paris and Vancouver). Baseline participants included 72 preHD and 85 controls. For the longitudinal analysis only preHD participants with diffusion data from all 3 time points were included (56 preHD, 65 controls; Supplemental Methods).
MRI Acquisition
T1 and diffusion weighted images were acquired on two different 3T MRI scanners (Philips Achieva at Leiden and Vancouver and Siemens TIM Trio at London and Paris). Diffusion-weighted images were acquired with 42 unique gradient directions (b = 1000 s/mm²; Supplemental Methods).
Diffusion Tractography
Whole brain probabilistic tractography was performed using MRtrix (28). The spherical deconvolution informed filtering of tractograms (SIFT2) algorithm (29) was used to reduce biases. To demonstrate our results were robust to varying methodologies, additional cross-sectional analyses used alternative connectome construction methodologies (see Supplemental Methods).
Mapping gene expression data to MRI space
Gene expression microarray data were obtained from the AIBS atlas (22). Maybrain software (https://github.com/rittman/maybrain) matched centroids of MRI regions to the closest AIBS region. For the cross-sectional analyses a leave-one-out approach and 3 out of 6 permutations of AIBS brain samples were also used to ensure results were robust to different combinations of AIBS subjects (see Supplemental Methods).
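The idea of matching each MRI region to its closest AIBS region can be illustrated with a nearest-centroid lookup. This sketch does not use or reproduce the Maybrain API; the function name `match_to_closest_region`, the dictionary layout and the assumption that both sets of centroids share one coordinate space are hypothetical.

```python
from math import dist


def match_to_closest_region(mri_centroids, aibs_centroids):
    """Match each MRI region to the nearest AIBS region by Euclidean distance.

    Both arguments: {region_name: (x, y, z)}, assumed to be in the same
    coordinate space. Returns {mri_region: closest_aibs_region}.
    """
    return {
        mri_name: min(
            aibs_centroids,
            key=lambda aibs_name: dist(mri_xyz, aibs_centroids[aibs_name]),
        )
        for mri_name, mri_xyz in mri_centroids.items()
    }
```

Because each MRI region is matched independently, two MRI regions can map to the same AIBS region, which is one reason the correspondence between atlases may not be exact.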
Statistical analysis
Partial least squares regression was used to investigate the association between the gene transcriptome of the healthy brain and WM connectivity loss in preHD. Code used to perform this analysis was adapted from Whitaker et al. (25). Random permutations of the gene predictor variable were also investigated to ensure results were not due to chance (see Supplemental Methods).
Gene ontology enrichment analysis
We used the gene ontology enrichment analysis and visualisation tool (GOrilla) (http://cbl-gorilla.cs.technion.ac.il) (26) to identify GO terms that were significantly enriched in the target gene list.
Overlap between gene profiles and Huntington’s disease related genes
To investigate similarities between gene profiles, we identified the significance of gene overlap between analyses using a hypergeometric distribution. Gene ontology enrichment analysis was also repeated with overlap genes removed to assess whether this affected the resulting GO terms. Overlap between genes in top gene ontology terms and HD genes was also investigated.
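A hypergeometric overlap test of this kind can be sketched as below. This is an illustration of the general technique rather than the study's implementation; the function name `hypergeom_overlap_p` and its parameter names are assumptions, and the sizes of the gene lists being compared are not stated in this section, so none are supplied here.

```python
from math import comb


def hypergeom_overlap_p(n_total, n_list_a, n_list_b, n_overlap):
    """Upper-tail hypergeometric p-value for the overlap of two gene lists.

    Probability of observing at least n_overlap shared genes when a list of
    n_list_b genes is drawn at random from n_total genes, of which n_list_a
    belong to the other list.
    """
    upper = min(n_list_a, n_list_b)
    total_draws = comb(n_total, n_list_b)
    favourable = sum(
        comb(n_list_a, k) * comb(n_total - n_list_a, n_list_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / total_draws
```

The function uses exact integer arithmetic before the final division, so it remains accurate for gene universes of the size used here (20,737 genes).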
Enrichment for Huntington’s disease related genes
We investigated whether genes showing abnormal transcription in human HD and animal models of HD were enriched more than expected by chance in the first PLS components of the cortico-striatal, inter-hemispheric and intra-hemispheric analyses. HD gene lists were obtained from Langfelder et al. (30). Additionally, we investigated whether HD related genes were more strongly enriched in these gene lists than other biologically plausible gene sets, chosen at random. Gene sets for human supragranular genes, oligodendrocytes and cell cycle metabolism were also investigated (see Supplemental Methods).
Results
Gene expression profiles of the healthy human brain explain the variance of regional white matter connectivity loss in preHD
For the majority of analyses the first PLS component accounted for a large percentage of the variance in regional WM loss. We therefore focused on this component. Gene expression data explained 66% of the variance of regional WM connectivity loss in the cortico-striatal cross-sectional analysis and 70% in the longitudinal analysis for the first component of the PLS, and 11% and 6% respectively for the second component. For the inter-hemispheric analysis, gene expression explained 67% of WM loss cross-sectionally and 17% longitudinally for the first component, and 9% and 60% respectively for the second component. For the intra-hemispheric analysis, gene expression explained 24% cross-sectionally and 65% longitudinally for the first component, and 47% and 11% respectively for the second component. See Supplemental File 1 for the first component PLS gene weights for these analyses.
For each analysis the first components of the PLS were explored. The second components were also explored if they accounted for a large proportion of the variance. Variances explained by the first component ranged from 45% to 69% for random permutations of the gene predictor matrix; however, gene and ROI weights were very different from the original analysis.
Expression profiles associated with cross-sectional variation in white matter connections in preHD relative to controls
Similar significant GO terms were seen for the cortico-striatal and inter-hemispheric analyses including modulation of chemical synaptic transmission, regulation of cell projection organisation and cell projection organisation. We refer to these as a synaptic profile. For the intra-hemispheric analysis the most significant GO terms included mRNA metabolic process, RNA processing and chromatin organisation (see Table 1 and Figure 2), which we refer to as a metabolic/chromatin profile. For the intra-hemispheric analysis the second component of the PLS was significantly associated with GO terms involved in myelination and lipid metabolism. See Supplemental File 2 for all significant GO terms for each analysis.
The leave-one-out analyses showed modulation of chemical synaptic signalling and cell projection organisation were the most significant GO terms for cortico-striatal and inter-hemispheric connections for nearly all permutations. For intra-hemispheric connections, the GO terms mRNA metabolic process, RNA processing and chromatin organisation were among the most significant for all permutations. Similarly, the addition of Gaussian noise also revealed consistent results (see Supplemental File 3). The 3 out of 6 permutation analyses revealed similar findings across many of the 8 permutations (see Supplemental File 4). The use of FA weighting and the thresholded scale 60 easy Lausanne atlas resulted in a change from synaptic to metabolic/chromatin profiles for the cortico-striatal and inter-hemispheric connections. For intra-hemispheric connections FA weighting revealed a consistent metabolic/chromatin profile. For the thresholded scale 60 Lausanne atlas intra-hemispheric connections showed a synaptic profile. There was no change in profiles across consensus thresholds of 75% and 50%. Cross-sectional analyses using random permutations of genes revealed very different GO terms at minimal levels of significance, suggesting our results are not due to chance (see Supplemental File 5).
Expression profiles associated with longitudinal change in white matter connections in preHD relative to controls
For both cortico-striatal and inter-hemispheric analyses longitudinal change in white matter was associated with GO terms involving metabolism or chromatin organisation (see Supplemental Material Table S1 and Figures S1 and S2). For intra-hemispheric analysis longitudinal change was associated with GO terms involved in mitochondrial function, metabolism and synaptic transmission (see Supplemental Material Table S1 and Figure S3). The second component of the PLS for the inter-hemispheric analysis was significantly associated with a range of GO terms including immune function, development and protein folding (see Supplemental File 2). In summary, these results suggest regional gene expression profiles associated with loss of WM connectivity in preHD are involved in synaptic, metabolic and chromatin related biological processes.
Overlap between synaptic and metabolic gene profiles and HD related genes
A significant overlap of 346 genes (p < 0.001) was found between the top genes in the cortico-striatal and intra-hemispheric analyses. These were then compared to the striatum genes showing transcriptional abnormalities in HD humans and animal models. This revealed 8 genes in common, encoding proteins involved in cell cycle (CEP135), axon development (NEK1) and G-protein coupling (ADORA2A) (see Supplemental File 6). Gene ontology enrichment analysis with overlap genes removed did not change the most significant GO terms. The gene ontology terms “modulation of chemical synaptic transmission” and “mRNA metabolic process” showed overlap of 7 genes. HD related genes showed overlap of 44 genes with the GO term modulation of chemical synaptic transmission and 7 genes with mRNA metabolic process. The overlaps were not greater than those expected by chance.
Dissociation of cortico-striatal, inter- and intra-hemispheric gene enrichment in the cortex
The next step in our analysis was to explore the spatial pattern of each gene expression profile in the brain. To determine what brain regions were enriched with each gene expression profile, we analysed PLS ROI weights from each analysis where higher weights related to greater gene profile enrichment (see Supplemental File 7 for ROI weights for each analysis). Cortical regions with the highest weights in the cortico-striatal analysis (cross-sectional) were predominantly in motor, parietal and occipital cortices. Conversely, cortical regions with the highest weights in the inter-hemispheric analysis (cross-sectional) were predominantly in frontal, temporal and insular cortices. Cortical regions with the highest weights in the intra-hemispheric analysis (cross-sectional) included frontal, temporal and occipital regions (see Supplemental Material Table S2 and Figure 3). Plotting cortico-striatal ROI weights against both inter-hemispheric and intra-hemispheric ROI weights revealed dissociation in terms of regions involved, where regions enriched in the cortico-striatal analysis were distinctly different from those enriched in the inter-hemispheric and intra-hemispheric analyses (see Figure 4). Cross-sectional analyses using random permutations of ROIs revealed a very different distribution of ROI weights, suggesting our results are not due to chance (see Supplemental Figure S5 and Supplemental File 7).
Enrichment of genes showing abnormal transcription in HD is seen in the cortico-striatal and inter-hemispheric gene expression profiles
Our next step was to assess whether genes that show abnormal transcription in HD, both in the cortex and striatum, might be associated with white matter loss. The cortico-striatal gene list was significantly enriched for abnormal HD genes in the striatum (p < 0.001) and in the cortex (p < 0.001). The inter-hemispheric gene list was significantly enriched for genes in the striatum (p < 0.001) but not the cortex. No significant enrichment was seen for the intra-hemispheric gene list (see Figure 5). To ensure the significant difference for the striatum gene list was not related to the size of the gene data set we repeated the analysis using the top 25 most significant genes based on Hodges q-value. Results were consistent with the 515 gene list, showing significant enrichment for HD genes in the striatum for the cortico-striatal (p = 0.019) and inter-hemispheric (p = 0.004) analyses (see Supplemental Material Figure S4). Enrichment compared against biologically plausible gene sets revealed similar results, for both 515 and 25 striatum gene lists, with significant enrichment for cortico-striatal (p < 0.001) and inter-hemispheric analyses (p < 0.001) but not for the intra-hemispheric analysis. This suggests that abnormal transcription in HD may be associated with cortico-striatal and inter-hemispheric WM connectivity loss.
To further investigate the relationship between changes in gene expression in HD relative to controls and cortico-striatal WM loss we performed correlations between the log2 fold change in the Hodges (31), Durrenberger (32) and Langfelder (30) studies for the 515 striatum gene set and the PLS weights from the cross-sectional cortico-striatal analysis. This revealed negative correlations between PLS weights and log2 fold change (Hodges, rho = -0.23, p = 1.1 × 10⁻⁷; Durrenberger, rho = -0.23, p = 8.4 × 10⁻⁸; Langfelder, rho = -0.19, p = 1.6 × 10⁻⁵) (see Supplemental Material Figure S6). This suggests that genes associated with cortico-striatal white matter loss in preHD are also those that show reduced levels of transcription in human HD and animal models.
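The correlation between gene weights and fold change can be sketched as a rank correlation. The text reports rho values but does not name the method in this section, so treating it as Spearman's rank correlation is an assumption; the function names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, and no p-value is computed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman_rho(pls_weights, log2_fold_changes):
    """Spearman rho: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(pls_weights), average_ranks(log2_fold_changes))
```

A negative rho under this scheme means that genes with higher PLS weights tend to have more negative log2 fold change, i.e. lower expression in HD relative to controls.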
Enrichment for other gene lists
We also investigated enrichment for genes associated with human supragranular cortex, oligodendrocytes and cell cycle metabolism. The cortico-striatal (CS) and inter-hemispheric (IH) gene lists were significantly enriched for human supragranular cortex genes (CS, p = 0.002; IH, p = 0.006) and oligodendrocyte genes (CS, p < 0.001; IH, p < 0.001) but not cell cycle metabolism genes. Conversely, the intra-hemispheric gene list was significantly enriched for cell cycle metabolism genes (p < 0.001) but not human supragranular or oligodendrocyte genes. This suggests a relationship between cortico-striatal white matter loss and abnormal transcription in oligodendrocytes. Additionally, abnormal transcription in cortical supragranular genes, which are implicated in long-range connectivity (33), may be linked to cortico-striatal and inter-hemispheric white matter connectivity loss.
Discussion
In this study, we find that regional variance in white matter loss in preHD is differentially associated with the pattern of expression of genes involved in synaptic, metabolic and chromatin related processes in the healthy human brain. Cortico-striatal and inter-hemispheric WM loss is associated with synaptic genes, whereas intra-hemispheric WM loss is associated with metabolic and chromatin-related genes. There is also a distinction between gene enrichment in cortical regions, where enrichment associated with cortico-striatal connections is seen in more posterior regions such as motor, occipital and parietal cortices, whereas enrichment associated with inter-hemispheric connections is seen in frontal, temporal and insular cortices. We reveal that genes showing abnormal transcription in HD humans and animal models are over-represented in the ranked gene list associated with cortico-striatal and inter-hemispheric WM loss but not intra-hemispheric WM connection loss.
We focus on synaptic, metabolic and chromatin related genes to simplify interpretation of our results. However, specific GO terms such as DNA metabolism may relate to DNA repair (34). DNA repair genes, such as MSH3, have been linked to CAG instability (35), age of onset (36) and disease progression (37). The GO term mRNA metabolism may relate to splicing of mRNA, which has also been implicated in HD pathogenesis. Aberrant splicing of the mutant huntingtin gene leads to the generation of the pathogenic exon 1 HTT protein (38). We note that further work would be needed to link these specific gene sets directly to white matter loss in HD.
Several studies have analysed gene expression profiles both in human HD and animal models. Gene expression measured in post-mortem brain samples from HD patients was most affected in the caudate, followed by the motor cortex, while no abnormalities were detected in the prefrontal association cortex (31). The GO term showing greatest significance for both the caudate and motor cortex was synaptic transmission. Furthermore, significance for the GO terms metabolism and glucose metabolism was seen in the cortex, but not the caudate. These findings agree with the associations between synaptic genes and cortico-striatal WM connection loss and metabolic genes and intra-hemispheric WM connection loss that we demonstrate here.
In our previous longitudinal study, WM loss was greatest in cortico-striatal and inter-hemispheric connections in preHD relative to controls. No group differences were seen in intra-hemispheric connections (6). The analysis presented here is based on regional atrophy of connection subtype. Therefore cortico-striatal and inter-hemispheric regional atrophy is likely to be greater than intra-hemispheric regional atrophy. Furthermore cortico-striatal and inter-hemispheric connections have greater topographical lengths than intra-hemispheric connections (6). Therefore these similarities between cortico-striatal and inter-hemispheric connections may account for the similarity between gene profiles.
Changes from synaptic to metabolic profiles in cross-sectional vs. longitudinal, streamline volume vs. FA weighting and Desikan vs. scale 60 easy Lausanne atlas were seen for cortico-striatal and inter-hemispheric connections. We investigated this further showing common genes highly ranked in both profiles. One explanation for this may be that atrophy scores cross-sectionally will be higher than longitudinal rate of atrophy scores. Similarly, atrophy scores in the Desikan 68-region atlas are likely to be larger than in the more finely parcellated easy Lausanne scale 60 (110-region) atlas. With respect to FA weighting, this metric is difficult to interpret in crossing fibre regions, which make up an estimated 60-90% of the human brain (39).
The gene ontology categories identified contain large numbers of genes. We therefore balance this data driven approach by investigating whether gene profiles associated with regional white matter loss in preHD are enriched for genes known to show abnormal transcription in both human HD and animal models. Similar GO terms such as synaptic transmission and chromatin modification have been associated with functional brain networks in healthy participants (24, 40). This likely represents the close relationship between the healthy brain network and the perturbation of that network in neurodegeneration (41).
We acknowledge the limitations of diffusion tractography. To address these we used both constrained spherical deconvolution (CSD), which deals more effectively with crossing fibres than the diffusion tensor or multi-tensor methods (28), and SIFT2, which has higher reproducibility and is more representative of the underlying biology of WM connections than conventional methods (42). CSD performs well at the acquisition protocol specifications used in this study (b = 1000) (43, 44). At b = 1000 a minimum number of 28 gradient directions is required (45). Therefore the angular coverage achieved using CSD at b = 1000 is more than sufficient with 42 directions.
M AN U
10
RI PT
5
The use of gene expression data from the healthy human brain to explain white matter loss in preHD is limited to the extent that transcription in preHD may differ from that seen in healthy brains. However, studies of post-mortem manifest HD brains show that transcription is most affected in the striatum, with limited abnormalities in the cortex (31). Indeed, the transcription of only 25 genes in the cortex is abnormal in both human and animal studies, compared to 515 in the striatum (30). Therefore, we mitigate the likely transcription abnormalities in preHD by using only cortical gene expression data from the AIBS transcriptome atlas (46).
We map the anatomical location of ROIs to corresponding regions in the AIBS atlas. However, the resolutions of these atlases differ, and we acknowledge that the correspondence may not be exact and may be a limitation of our methodology. There are other human brain transcriptome atlases, such as Braineac (47) and the Human Brain Transcriptome Project (48). However, these atlases offer low resolution compared to the AIBS atlas, as only a small number of cortical regions have been sampled, so the analysis carried out in this study could not be reproduced using Braineac or the Human Brain Transcriptome Project atlas.
The utility of using information from the healthy human brain to inform us about the patterns and mechanisms of neurodegeneration has been demonstrated many times in neuroimaging. Functional connectivity and white matter networks from healthy participants can predict atrophy in Alzheimer’s disease, corticobasal syndrome, fronto-temporal dementia and Parkinson’s disease (41, 49-51). More recently, transcriptome data from the healthy brains of the AIBS atlas have been used to investigate the association between the expression of schizophrenia risk genes and white matter disconnectivity (52). The regional expression of the tau gene MAPT from the AIBS atlas has also been linked to the selective vulnerability of highly connected brain regions in Parkinson’s disease and progressive supranuclear palsy (53).
Conclusion
We show that cortico-striatal and inter-hemispheric WM connection loss is associated with the expression of synaptic genes in preHD, while intra-hemispheric WM loss is associated with metabolic genes. Genes showing abnormal transcription in HD are associated with the synaptic but not metabolic gene profiles. These findings have important implications for linking the earliest WM changes in preHD to the underlying pathological processes that may drive them.
Acknowledgements
Track-On HD Investigators
A Coleman, J Decolongon, M Fan, T Petkau (University of British Columbia, Vancouver); C Jauffret, D Justo, S Lehericy, K Nigaud, R Valabrègue (ICM and APHP, Pitié-Salpêtrière University Hospital, Paris); A Schoonderbeek, E P ‘t Hart (Leiden University Medical Centre, Leiden); DJ Hensman Moss, R Ghosh, H Crawford, M Papoutsi, C Berna, D Mahaleskshmi (University College London, London); R Reilmann, N Weber (George Huntington Institute, Munster); I Labuschagne, J Stout (Monash University, Melbourne); B Landwehrmeyer, M Orth, I Mayer (University of Ulm, Ulm); H Johnson (University of Iowa); D Crawfurd (University of Manchester).
Financial disclosure
All authors report no biomedical financial interests or potential conflicts of interest.
Funding
This study was funded by the Wellcome Trust (GR, PMC) (091593/Z/10/Z, 515103) and supported by the National Institute for Health Research [NIHR] University College London Hospitals [UCLH] Biomedical Research Centre [BRC]. Track-On HD is funded by the CHDI foundation, a not-for-profit organisation dedicated to finding treatments for Huntington’s disease. We would like to thank Timothy Rittman for guidance on using Maybrain software and Dr Kirstie Whitaker and Dr Petra Vertes for making their code freely available and for guidance on its implementation.
References
1. Bates GP, Dorsey R, Gusella JF, Hayden MR, Kay C, Leavitt BR, et al. (2015): Huntington disease. Nat Rev Dis Primers. 1:15005.
2. Tabrizi SJ, Scahill RI, Durr A, Roos RA, Leavitt BR, Jones R, et al. (2011): Biological and clinical changes in premanifest and early stage Huntington's disease in the TRACK-HD study: the 12-month longitudinal analysis. Lancet Neurol. 10:31-42.
3. Dumas EM, van den Bogaard SJ, Ruber ME, Reilman RR, Stout JC, Craufurd D, et al. (2012): Early changes in white matter pathways of the sensorimotor cortex in premanifest Huntington's disease. Hum Brain Mapp. 33:203-212.
4. Di Paola M, Luders E, Cherubini A, Sanchez-Castaneda C, Thompson PM, Toga AW, et al. (2012): Multimodal MRI analysis of the corpus callosum reveals white matter differences in presymptomatic and early Huntington's disease. Cereb Cortex. 22:2858-2866.
5. Faria AV, Ratnanather JT, Tward DJ, Lee DS, van den Noort F, Wu D, et al. (2016): Linking white matter and deep gray matter alterations in premanifest Huntington disease. Neuroimage Clin. 11:450-460.
6. McColgan P, Seunarine KK, Gregory S, Razi A, Papoutsi M, Long JD, et al. (2017): Topological length of white matter connections predicts their rate of atrophy in premanifest Huntington's disease. JCI Insight. 2.
7. Tabrizi SJ, Langbehn DR, Leavitt BR, Roos RA, Durr A, Craufurd D, et al. (2009): Biological and clinical manifestations of Huntington's disease in the longitudinal TRACK-HD study: cross-sectional analysis of baseline data. Lancet Neurol. 8:791-801.
8. Ross CA, Tabrizi SJ (2011): Huntington's disease: from molecular pathogenesis to clinical treatment. Lancet Neurol. 10:83-98.
9. Saudou F, Humbert S (2016): The Biology of Huntingtin. Neuron. 89:910-926.
10. Plotkin JL, Surmeier DJ (2015): Corticostriatal synaptic adaptations in Huntington's disease. Curr Opin Neurobiol. 33C:53-62.
11. Lee JM, Ivanova EV, Seong IS, Cashorali T, Kohane I, Gusella JF, et al. (2007): Unbiased gene expression analysis implicates the huntingtin polyglutamine tract in extra-mitochondrial energy metabolism. PLoS Genet. 3:e135.
12. Tabrizi SJ, Workman J, Hart PE, Mangiarini L, Mahal A, Bates G, et al. (2000): Mitochondrial dysfunction and free radical damage in the Huntington R6/2 transgenic mouse. Ann Neurol. 47:80-86.
13. Andre R, Carty L, Tabrizi SJ (2016): Disruption of immune cell function by mutant huntingtin in Huntington's disease pathogenesis. Curr Opin Pharmacol. 26:33-38.
14. Seredenina T, Luthi-Carter R (2012): What have we learned from gene expression profiles in Huntington's disease? Neurobiol Dis. 45:83-98.
15. Miller JR, Lo KK, Andre R, Hensman Moss DJ, Trager U, Stone TC, et al. (2016): RNA-Seq of Huntington's disease patient myeloid cells reveals innate transcriptional dysregulation associated with proinflammatory pathway activation. Hum Mol Genet. 25:2893-2904.
16. Zucker B, Kama JA, Kuhn A, Thu D, Orlando LR, Dunah AW, et al. (2010): Decreased Lin7b expression in layer 5 pyramidal neurons may contribute to impaired corticostriatal connectivity in huntington disease. J Neuropathol Exp Neurol. 69:880-895.
17. Gambazzi L, Gokce O, Seredenina T, Katsyuba E, Runne H, Markram H, et al. (2010): Diminished activity-dependent brain-derived neurotrophic factor expression underlies cortical neuron microcircuit hypoconnectivity resulting from exposure to mutant huntingtin fragments. J Pharmacol Exp Ther. 335:13-22.
18. Fusco FR, Zuccato C, Tartari M, Martorana A, De March Z, Giampa C, et al. (2003): Co-localization of brain-derived neurotrophic factor (BDNF) and wild-type huntingtin in normal and quinolinic acid-lesioned rat brain. Eur J Neurosci. 18:1093-1102.
19. Espindola S, Vilches-Flores A, Hernandez-Echeagaray E (2012): 3-Nitropropionic acid modifies neurotrophin mRNA expression in the mouse striatum: 18S-rRNA is a reliable control gene for studies of the striatum. Neurosci Bull. 28:517-531.
20. Xiang Z, Valenza M, Cui L, Leoni V, Jeong HK, Brilli E, et al. (2011): Peroxisome-proliferator-activated receptor gamma coactivator 1 alpha contributes to dysmyelination in experimental models of Huntington's disease. J Neurosci. 31:9544-9553.
21. Teo RT, Hong X, Yu-Taeger L, Huang Y, Tan LJ, Xie Y, et al. (2016): Structural and molecular myelination deficits occur prior to neuronal loss in the YAC128 and BACHD models of Huntington disease. Hum Mol Genet. 25:2621-2632.
22. Hawrylycz M, Miller JA, Menon V, Feng D, Dolbeare T, Guillozet-Bongaarts AL, et al. (2015): Canonical genetic signatures of the adult human brain. Nat Neurosci. 18:1832-1844.
23. Desikan RS, Segonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, et al. (2006): An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 31:968-980.
24. Vertes PE, Rittman T, Whitaker KJ, Romero-Garcia R, Vasa F, Kitzbichler MG, et al. (2016): Gene transcription profiles associated with inter-modular hubs and connection distance in human functional magnetic resonance imaging networks. Philos Trans R Soc Lond B Biol Sci. 371.
25. Whitaker KJ, Vertes PE, Romero-Garcia R, Vasa F, Moutoussis M, Prabhu G, et al. (2016): Adolescence is associated with genomically patterned consolidation of the hubs of the human brain connectome. Proc Natl Acad Sci U S A. 113:9105-9110.
26. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z (2009): GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 10:48.
27. Kloppel S, Gregory S, Scheller E, Minkova L, Razi A, Durr A, et al. (2015): Compensation in Preclinical Huntington's Disease: Evidence From the Track-On HD Study. EBioMedicine. 2:1420-1429.
28. Tournier JD, Calamante F, Connelly A (2012): MRtrix: Diffusion tractography in crossing fiber regions. Imaging Systems and Technology. 22:53-56.
29. Smith RE, Tournier JD, Calamante F, Connelly A (2015): SIFT2: Enabling dense quantitative assessment of brain white matter connectivity using streamlines tractography. Neuroimage. 119:338-351.
30. Langfelder P, Cantle JP, Chatzopoulou D, Wang N, Gao F, Al-Ramahi I, et al. (2016): Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice. Nat Neurosci. 19:623-633.
31. Hodges A, Strand AD, Aragaki AK, Kuhn A, Sengstag T, Hughes G, et al. (2006): Regional and cellular gene expression changes in human Huntington's disease brain. Hum Mol Genet. 15:965-977.
32. Durrenberger PF, Fernando FS, Kashefi SN, Bonnert TP, Seilhean D, Nait-Oumesmar B, et al. (2015): Common mechanisms in neurodegeneration and neuroinflammation: a BrainNet Europe gene expression microarray study. J Neural Transm (Vienna). 122:1055-1068.
33. Krienen FM, Yeo BT, Ge T, Buckner RL, Sherwood CC (2016): Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain. Proc Natl Acad Sci U S A. 113:E469-478.
34. Moss DJH, Pardinas AF, Langbehn D, Lo K, Leavitt BR, Roos R, et al. (2017): Identification of genetic variants associated with Huntington's disease progression: a genome-wide association study. Lancet Neurol.
35. Gonitel R, Moffitt H, Sathasivam K, Woodman B, Detloff PJ, Faull RL, et al. (2008): DNA instability in postmitotic neurons. Proc Natl Acad Sci U S A. 105:3467-3472.
36. Genetic Modifiers of Huntington's Disease C (2015): Identification of Genetic Factors that Modify Clinical Onset of Huntington's Disease. Cell. 162:516-526.
37. Moss DJH, Pardinas AF, Langbehn D, Lo K, Leavitt BR, Roos R, et al. (2017): Identification of genetic variants associated with Huntington's disease progression: a genome-wide association study. Lancet Neurol. 16:701-711.
38. Sathasivam K, Neueder A, Gipson TA, Landles C, Benjamin AC, Bondulich MK, et al. (2013): Aberrant splicing of HTT generates the pathogenic exon 1 protein in Huntington disease. Proc Natl Acad Sci U S A. 110:2366-2370.
39. Jeurissen B, Leemans A, Tournier JD, Jones DK, Sijbers J (2013): Investigating the prevalence of complex fiber configurations in white matter tissue with diffusion magnetic resonance imaging. Hum Brain Mapp. 34:2747-2766.
40. Richiardi J, Altmann A, Milazzo AC, Chang C, Chakravarty MM, Banaschewski T, et al. (2015): BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks. Science. 348:1241-1244.
41. Zhou J, Gennatas ED, Kramer JH, Miller BL, Seeley WW (2012): Predicting regional neurodegeneration from the healthy brain functional connectome. Neuron. 73:1216-1227.
42. Smith RE, Tournier JD, Calamante F, Connelly A (2015): The effects of SIFT on the reproducibility and biological accuracy of the structural connectome. Neuroimage. 104:253-265.
43. Ramirez-Manzanares A, Cook PA, Hall M, Ashtari M, Gee JC (2011): Resolving axon fiber crossings at clinical b-values: an evaluation study. Med Phys. 38:5239-5253.
44. Wilkins B, Lee N, Gajawelli N, Law M, Lepore N (2015): Fiber estimation and tractography in diffusion MRI: development of simulated brain images and comparison of multi-fiber analysis methods at clinical b-values. Neuroimage. 109:341-356.
45. Tournier JD, Calamante F, Connelly A (2013): Determination of the appropriate b value and number of gradient directions for high-angular-resolution diffusion-weighted imaging. NMR Biomed. 26:1775-1786.
46. Hawrylycz MJ, Lein ES, Guillozet-Bongaarts AL, Shen EH, Ng L, Miller JA, et al. (2012): An anatomically comprehensive atlas of the adult human brain transcriptome. Nature. 489:391-399.
47. Ramasamy A, Trabzuni D, Guelfi S, Varghese V, Smith C, Walker R, et al. (2014): Genetic variability in the regulation of gene expression in ten regions of the human brain. Nat Neurosci. 17:1418-1428.
48. Kang HJ, Kawasawa YI, Cheng F, Zhu Y, Xu X, Li M, et al. (2011): Spatio-temporal transcriptome of the human brain. Nature. 478:483-489.
49. Mandelli ML, Vilaplana E, Brown JA, Hubbard HI, Binney RJ, Attygalle S, et al. (2016): Healthy brain connectivity predicts atrophy progression in non-fluent variant of primary progressive aphasia. Brain.
50. Seeley WW, Crawford RK, Zhou J, Miller BL, Greicius MD (2009): Neurodegenerative diseases target large-scale human brain networks. Neuron. 62:42-52.
51. Zeighami Y, Ulla M, Iturria-Medina Y, Dadar M, Zhang Y, Larcher KM, et al. (2015): Network structure of brain atrophy in de novo Parkinson's disease. Elife. 4.
52. Romme IA, de Reus MA, Ophoff RA, Kahn RS, van den Heuvel MP (2016): Connectome Disconnectivity and Cortical Gene Expression in Patients With Schizophrenia. Biol Psychiatry.
53. Rittman T, Rubinov M, Vertes PE, Patel AX, Ginestet CE, Ghosh BC, et al. (2016): Regional expression of the MAPT gene is associated with loss of hubs in brain networks and cognitive impairment in Parkinson disease and progressive supranuclear palsy. Neurobiol Aging. 48:153-160.
Figure legends and tables
Figure 1. Schematic illustrating sub-groups of regional white matter connectivity. (A) Cortico-striatal: connections between cortex and striatum (caudate and putamen) for each cortical region of interest (ROI). (B) Inter-hemispheric: connections to the opposite hemisphere for each cortical ROI. (C) Intra-hemispheric: connections within the same hemisphere for each cortical ROI. Light blue – left hemisphere, purple – right hemisphere, dark blue – caudate, yellow – putamen.
Figure 2. (A) Cortico-striatal cross-sectional analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are labelled for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar). (B) Inter-hemispheric cross-sectional analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are displayed for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar). (C) Intra-hemispheric cross-sectional analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are displayed for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar).
Figure 3. ROI weights for cross-sectional partial least squares regression analyses. (A) Cortico-striatal (B) Inter-hemispheric (C) Intra-hemispheric. Brain regions displayed on brain mesh. Size and colour of region indicate size of ROI weight (ranked from smallest to largest, 1-6). See colour map.
Figure 4. Dissociation of cortico-striatal and inter/intra-hemispheric gene enrichment in the cortex. (A) ROI weights for the first PLS component of the cross-sectional analysis for inter-hemispheric vs. cortico-striatal. (B) ROI weights for the first PLS component of the longitudinal analysis for inter-hemispheric vs. cortico-striatal. (C) ROI weights for the first PLS component of the cross-sectional analysis for intra-hemispheric vs. cortico-striatal. (D) ROI weights for the first PLS component of the longitudinal analysis for intra-hemispheric vs. cortico-striatal. Each red circle represents a cortical ROI.
Figure 5. Enrichment of genes showing abnormal transcription in Huntington’s disease in the first PLS components of cortico-striatal, inter-hemispheric and intra-hemispheric cross-sectional analyses. Red circle illustrates the mean weight (on the x-axis) for the gene list of interest in the first PLS component. The y-axis represents the number of permutations of random genes from the first PLS component. Gene lists over-expressed in the first PLS component have a mean greater than that of the random permutations (red circle to the right of the permutation distribution).
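The permutation comparison summarised in the Figure 5 legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the toy gene symbols, the number of permutations and the add-one p-value correction are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_enrichment(weights, gene_list, n_perm=10000, seed=0):
    """Compare the mean PLS weight of a gene list with random gene sets.

    weights   : dict mapping gene symbol -> weight in the first PLS component
    gene_list : genes of interest (genes without a weight are ignored)
    Returns (observed_mean, p_value). The p-value is the add-one corrected
    fraction of random sets whose mean weight is at least the observed mean.
    """
    rng = random.Random(seed)
    present = [g for g in gene_list if g in weights]
    if not present:
        raise ValueError("none of the listed genes have a PLS weight")
    observed = mean(weights[g] for g in present)
    all_weights = list(weights.values())
    exceed = 0
    for _ in range(n_perm):
        # Random gene set of the same size, drawn without replacement.
        sample = rng.sample(all_weights, len(present))
        if mean(sample) >= observed:
            exceed += 1
    # Add-one correction avoids reporting an impossible p-value of zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 200 genes, the last 20 carrying the largest weights.
toy_weights = {f"gene{i}": i / 200 for i in range(200)}
toy_list = [f"gene{i}" for i in range(180, 200)]
obs, p = permutation_enrichment(toy_weights, toy_list, n_perm=2000)
```

A gene list whose observed mean lies to the right of nearly all permuted means, as in the toy example, corresponds to the red circle sitting to the right of the permutation distribution.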
Table 1. Cortico-striatal, inter-hemispheric and intra-hemispheric cross-sectional analyses: Gene ontology (GO) terms for biological processes associated with top ranking genes from the first component of the partial least squares (PLS) analysis. The top 5 most significant GO terms are displayed for each analysis. Full tables can be found in Supplementary file 2. Redundant GO terms and those associated with greater than 1000 genes have been excluded. B – total number of genes associated with a specific GO term, n – number of genes in the target set, b – number of genes in the intersection. Enrichment (E) = (b/n) / (B/total number of genes). See (26) for further details.

PLS1 Cortico-striatal Cross-sectional
| GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GO:0050773 | regulation of dendrite development | 8.05E-07 | 3.03E-03 | 2.18 | 124 | 3150 | 48 |
| GO:0050804 | modulation of chemical synaptic transmission | 1.06E-06 | 3.19E-03 | 1.4 | 297 | 6419 | 151 |
| GO:0031344 | regulation of cell projection organization | 1.88E-06 | 4.06E-03 | 1.29 | 549 | 6375 | 255 |
| GO:0044057 | regulation of system process | 3.31E-06 | 4.98E-03 | 1.33 | 481 | 5795 | 209 |
| GO:0030030 | cell projection organization | 4.31E-06 | 5.41E-03 | 1.24 | 699 | 6498 | 319 |

PLS1 Inter-hemispheric Cross-sectional
| GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GO:0050804 | modulation of chemical synaptic transmission | 1.40E-14 | 3.51E-11 | 1.74 | 297 | 5246 | 153 |
| GO:0031344 | regulation of cell projection organization | 1.99E-13 | 2.73E-10 | 1.64 | 549 | 3924 | 199 |
| GO:0043623 | cellular protein complex assembly | 7.20E-13 | 8.34E-10 | 1.65 | 371 | 4892 | 169 |
| GO:0061024 | membrane organization | 1.51E-11 | 1.19E-08 | 1.45 | 820 | 4221 | 283 |
| GO:0030030 | cell projection organization | 6.38E-11 | 3.85E-08 | 1.47 | 699 | 4281 | 248 |

PLS1 Intra-hemispheric Cross-sectional
| GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GO:0016071 | mRNA metabolic process | 2.91E-33 | 1.12E-30 | 1.84 | 593 | 5085 | 313 |
| GO:0006396 | RNA processing | 4.39E-30 | 1.58E-27 | 1.65 | 806 | 5357 | 402 |
| GO:0006325 | chromatin organization | 1.11E-25 | 3.73E-23 | 1.79 | 657 | 4364 | 289 |
| GO:0006397 | mRNA processing | 4.33E-21 | 1.42E-18 | 1.78 | 402 | 5435 | 219 |
| GO:0019083 | viral transcription | 2.79E-20 | 8.77E-18 | 3.01 | 99 | 4044 | 68 |
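The enrichment definition in the Table 1 legend, E = (b/n) / (B/total number of genes), can be written as a short function. This is an illustrative sketch only. The total number of genes is not stated in the legend, so the value of 17,700 used below is a placeholder assumption, and the function name is hypothetical.

```python
def go_enrichment(b, n, B, total_genes):
    """Enrichment E = (b / n) / (B / total_genes).

    b           : genes in the intersection of the target set and the GO term
    n           : genes in the target set
    B           : genes associated with the GO term
    total_genes : total number of genes in the ranked list
    """
    if n <= 0 or B <= 0 or total_genes <= 0:
        raise ValueError("n, B and total_genes must be positive")
    return (b / n) / (B / total_genes)


# B, n and b as listed for 'viral transcription' (GO:0019083) in Table 1.
# ASSUMPTION: total_genes = 17700 is a placeholder; the legend does not give it.
example = go_enrichment(b=68, n=4044, B=99, total_genes=17700)
```

An enrichment above 1 means the GO term is more common in the target set than in the full gene list.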
McColgan et al.
Supplement
Brain Regions Showing White Matter Loss in Huntington’s Disease Are Enriched for Synaptic and Metabolic Genes
Supplemental Information
Supplemental Methods
Imaging Cohort
Track-On is an extension of the Track-HD (1) study, but with only preHD and control participants carried over (early HD participants from Track-HD were excluded). Informed consent was obtained from each participant, and the study protocol was approved by the local ethics committees. Of the participants included, 31 preHD and 29 controls had participated previously in Track-HD (1). The preHD participants required a disease burden score (DBS) > 250 (2), on the basis of their medical records at the time of assessment. Controls were selected from the spouses or partners of preHD individuals or were gene-negative siblings, to ensure consistency of environments. For this study, we excluded participants who had manifest disease at baseline, were left-handed or ambidextrous, or had poor-quality diffusion-weighted imaging (DWI) data, as defined by visual quality control. Therefore, only preHD participants who had not yet developed the motor manifestations of HD were included.
MRI Acquisition
Data were acquired on two different types of 3T MRI scanner (Philips Achieva at Leiden and Vancouver and Siemens TIM Trio at London and Paris), both using a 12-channel head coil. T1-weighted image volumes were acquired using a 3D MPRAGE acquisition sequence with the following imaging parameters: TR = 2200 ms (Siemens)/7.7 ms (Philips), TE = 2.2 ms (S)/3.5 ms (P), FA = 10° (S)/8° (P), FOV = 28 cm (S)/24 cm (P), matrix size 256 x 256 (S)/224 x 224 (P), 208 (S)/164 (P) sagittal slices to cover the entire brain with a slice thickness of 1.0 mm with no gap. Diffusion-weighted images were acquired with 42 unique gradient directions (b = 1000 sec/mm²). Eight images with no diffusion weighting (b = 0 sec/mm²) were acquired on the Siemens scanners and one on the Philips scanners. For the Siemens scanners, TE = 88 ms and TR = 13 s; for the Philips scanners, TE = 56 ms and TR = 11 s. Voxel size was 2 x 2 x 2 mm for the Siemens scanners and 1.96 x 1.96 x 2 mm for the Philips scanners. Seventy-five slices were collected for each diffusion-weighted and non-diffusion-weighted volume. Scanning time was approximately 12 minutes for T1-weighted and 10 minutes for diffusion-weighted acquisitions.
MRI Data Analysis
Structural MRI Data
Cortical and sub-cortical regions of interest (ROIs) were generated by segmenting a T1-weighted image using FreeSurfer (3). These included 70 cortical regions and 4 sub-cortical regions (caudate and putamen bilaterally). We chose to focus on the caudate and putamen sub-cortical structures based on observations from our cross-sectional structural connectivity study (4) and from the earlier Track-HD studies (5, 6), which show that the caudate and putamen are the sub-cortical structures most affected in preHD, both in terms of grey matter volume and white matter connections. While some studies have shown changes in the thalamus, globus pallidus and nucleus accumbens in preHD, these tend to occur in preHD participants closer to disease onset (7, 8). Furthermore, automatic segmentation of the globus pallidus, nucleus accumbens and amygdala is not sufficiently reliable (9).
We chose the Desikan FreeSurfer atlas as it is based on 40 subjects across a range of ages encompassing 4 groups: young adults, middle-aged adults, elderly adults and patients with Alzheimer’s disease. By including subjects with age-related and neurodegeneration-related atrophy, it better accounts for inter-subject variability (3), particularly in the case of our cohort, which contains adults across a range of ages and those with preHD. We have used this atlas extensively in HD, for both cross-sectional and longitudinal connectome analyses (4, 10-12). Atlases with large numbers of ROIs demonstrate less reproducibility (13). While the AAL atlas is commonly used in graph theory studies, it is derived from a single subject who was young and healthy and is therefore not suitable for the cohort investigated here (14).
Data Pre-processing
For the diffusion data the b=0 image was used to generate a brain mask using FSL’s brain extraction tool (15). Eddy current correction was used to align the diffusion-weighted volumes to the first b=0 image and the gradient directions were updated to reflect the changes to the image orientations. Finally, diffusion tensor metrics were calculated and constrained spherical deconvolution (CSD) was applied to the data as implemented in MRtrix (16). FreeSurfer Desikan atlas (3) ROIs were warped into diffusion space by mapping between the T1-weighted image and fractional anisotropy (FA) map using NiftyReg (17) and applying the resulting warp to each of the ROIs. A foreground mask was generated by combining FreeSurfer segmentations with the WM mask.
Diffusion Tensor Imaging Data
Diffusion Tractography
Whole brain probabilistic tractography was performed using the iFOD2 algorithm in MRtrix (16). Specifically, five million streamlines were randomly seeded throughout the WM, in all foreground voxels where FA>0.2. Streamlines were terminated when they either reached the cortical or subcortical grey-matter mask or exited the foreground mask. The spherical deconvolution informed filtering of tractograms (SIFT2) algorithm (18) was used to reduce biases. The resulting set of streamlines was used to construct the structural brain network. To demonstrate that our results were robust to varying methodologies, additional cross-sectional analyses were completed using the addition of Gaussian noise to connectomes, FA weighting of connections and the Easy Lausanne scale 60 atlas (110 ROIs) (13), with connectomes undergoing consensus-based thresholding at 75% and 50%. These values were chosen as they have been commonly used in structural connectomics (4, 19, 20).
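The consensus-based thresholding mentioned above can be illustrated with a short sketch. It assumes that a connection is retained when it has a non-zero weight in at least the stated fraction of participants; the text does not spell out this rule, and the function names and toy matrices are hypothetical.

```python
def consensus_mask(connectomes, threshold=0.75):
    """Boolean mask of connections present in at least `threshold` of subjects.

    connectomes : list of square matrices (lists of lists), one per subject
    ASSUMPTION  : 'present' means a weight greater than zero.
    """
    n_subj = len(connectomes)
    n_roi = len(connectomes[0])
    mask = [[False] * n_roi for _ in range(n_roi)]
    for i in range(n_roi):
        for j in range(n_roi):
            present = sum(1 for c in connectomes if c[i][j] > 0)
            mask[i][j] = present / n_subj >= threshold
    return mask


def apply_mask(connectome, mask):
    """Zero every connection that did not reach consensus."""
    return [[w if keep else 0.0 for w, keep in zip(row, mrow)]
            for row, mrow in zip(connectome, mask)]


def toy(e01, e02, e12):
    """Build a symmetric 3-ROI toy connectome from its three edge weights."""
    return [[0, e01, e02], [e01, 0, e12], [e02, e12, 0]]


subjects = [toy(1, 1, 1), toy(2, 3, 0), toy(1, 0, 0), toy(0, 0, 0)]
mask75 = consensus_mask(subjects, 0.75)  # keeps only the edge between ROI 0 and ROI 1
mask50 = consensus_mask(subjects, 0.50)  # additionally keeps the ROI 0 to ROI 2 edge
```

Lowering the threshold from 75% to 50% retains more connections, which is why running both gives a simple robustness check.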
Construction of Structural Connectivity Matrices
For structural connectivity matrices ROIs were defined as connected if a fibre originated in ROI 1 and terminated in ROI 2. Structural connections were weighted by streamline count and a cross-sectional area multiplier as implemented in SIFT2 (18). Probabilistic tractography as implemented in MRtrix3 creates a connectome composed of one upper triangle of a connectivity matrix. This is then copied to the lower triangle to generate a symmetric 74 x 74 matrix. As there is no consensus in the literature regarding the optimal graph thresholding strategy (21) and results can vary widely based on the chosen approach (22), SIFT2 was our preferred method of bias correction. Indeed, the creators of SIFT2 argue against the use of matrix thresholding as it introduces an arbitrary threshold value (23). SIFT2 was chosen in preference to SIFT as it requires much less processing time and retains the full connectome. SIFT2 utilises information from the FOD to determine a cross-sectional area for each streamline, thereby generating streamline volume estimates between regions (18).
Currently there is no consensus in the literature regarding volume normalisation in connectome studies. There is a suggestion that volume normalisation may overcompensate for volume-driven effects on streamline count (24). In keeping with this, in our previous study we analysed both volume-normalised and un-normalised connectomes and showed that volume normalisation results in biologically implausible findings, which are likely spurious (4). In a subsequent study using the same data set presented here we performed two complementary tractography approaches: connectomics and voxel connectivity profiles (VCPs) (10). Volume normalisation was performed in the VCP analyses as the tractography is performed at the voxel level. Results between the two approaches were consistent, suggesting that the limited amount of brain atrophy seen in preHD has a minimal effect on tractography. Previous work by our group has demonstrated low within-subject variability of diffusion metrics in manifest HD participants, suggesting that atrophy does not cause significant distortion of the diffusion signal (25). Thus the more limited atrophy seen in preHD is unlikely to introduce systematic differences in connectome construction.
Regional White Matter Atrophy
For each cortical brain region connection strength was defined as either the sum of cortico-striatal connection weights, the sum of connection weights from regions in the opposite hemisphere (inter-hemispheric) or the sum of connection weights from regions in the same hemisphere (intra-hemispheric). Rate of change in connection strength over 24 months was defined in the same way. PreHD values were normalised relative to controls using a Z-score. These were then transformed to give positive atrophy and rate of atrophy measures, where higher scores represent greater connection atrophy. The atrophy score was used in the cross-sectional analysis, while the rate of atrophy score was used in the longitudinal analysis.
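The three regional connection strengths defined above can be computed from a symmetric connectivity matrix as sketched below. This is a minimal illustration with hypothetical names and a toy 4-ROI matrix. It assumes that the inter- and intra-hemispheric sums cover cortico-cortical connections only, with striatal connections counted solely in the cortico-striatal sum; the text does not state this explicitly.

```python
def regional_strengths(matrix, hemisphere, is_striatal):
    """Sum connection weights per cortical ROI into three sub-groups.

    matrix      : symmetric square matrix of connection weights
    hemisphere  : list of 'L' or 'R', one label per ROI
    is_striatal : list of booleans, True for caudate and putamen ROIs
    Returns {roi_index: {'cortico_striatal', 'inter_hemispheric',
    'intra_hemispheric'}} for cortical ROIs only.
    """
    strengths = {}
    n = len(matrix)
    for i in range(n):
        if is_striatal[i]:
            continue  # strengths are defined for cortical ROIs only
        cs = inter = intra = 0.0
        for j in range(n):
            if j == i:
                continue
            w = matrix[i][j]
            if is_striatal[j]:
                cs += w
            elif hemisphere[j] == hemisphere[i]:
                intra += w  # ASSUMPTION: cortico-cortical connections only
            else:
                inter += w  # ASSUMPTION: cortico-cortical connections only
        strengths[i] = {
            "cortico_striatal": cs,
            "inter_hemispheric": inter,
            "intra_hemispheric": intra,
        }
    return strengths


# Toy network: ROIs 0 and 1 are left cortex, ROI 2 right cortex, ROI 3 left caudate.
matrix = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0],
]
strengths = regional_strengths(
    matrix, ["L", "L", "R", "L"], [False, False, False, True]
)
```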
Cross-sectional Analysis
For the cross-sectional analysis a Z-score was calculated as follows:

$$Z_i^k = \frac{C_i^k - \mu\left(C_i^h\right)}{\sigma\left(C_i^h\right)}$$

where i denotes the cortical region, k is preHD, h is healthy controls, C is connection strength, μ is mean and σ is standard deviation. This was then transformed to produce atrophy measures between -1 and 1, where positive measures represent greatest atrophy, using the following equation:

$$\tanh\left(-Z_i^k\right)$$
This resulted in a transformed Z-score for each cortical region for each preHD participant for cortico-striatal, inter-hemispheric and intra-hemispheric connections. An average was then calculated across the preHD group, resulting in a single transformed Z-score for each cortical region.
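The cross-sectional score described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name `atrophy_scores` and the nested-list data layout are hypothetical, the use of the sample standard deviation is an assumption, and the sign flip inside `tanh` is inferred from the statement that positive values represent greater atrophy.

```python
import math
from statistics import mean, stdev

def atrophy_scores(prehd, controls):
    """Group-averaged transformed Z-score per region.

    prehd, controls: one list per participant, each holding the regional
    connection strengths in the same region order (hypothetical layout).
    """
    n_regions = len(controls[0])
    scores = []
    for i in range(n_regions):
        ctrl = [c[i] for c in controls]
        mu, sigma = mean(ctrl), stdev(ctrl)  # sample SD is an assumption
        # Z-score relative to controls, sign-flipped so that weaker
        # connections give positive values, then bounded to (-1, 1).
        transformed = [math.tanh(-(p[i] - mu) / sigma) for p in prehd]
        scores.append(mean(transformed))
    return scores
```

A region in which every preHD participant has weaker connections than the control mean yields a score between 0 and 1.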
Longitudinal Analysis
For each preHD participant and for each connection a least squares line was fitted over the
regional connection strengths across time points and the rate of connection atrophy defined as the gradient of the least squares line. A Z-score was then calculated using the following
equation:

$$Z_i^k = \frac{R_i^k - \mu\left(R_i^h\right)}{\sigma\left(R_i^h\right)}$$

where R is the rate of change of connection strength. This was then transformed to produce rate of atrophy measures between -1 and 1, using the following equation:

$$\tanh\left(-Z_i^k\right)$$
This resulted in a transformed Z-score of rate of regional atrophy for cortico-striatal, inter-hemispheric and intra-hemispheric connections for each preHD participant. An average was then calculated across the preHD group, resulting in a single transformed Z-score for each cortical region.
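The longitudinal score can be sketched in the same way. The names `slope` and `rate_of_atrophy`, the data layout and the example time points are hypothetical; as above, the sample standard deviation and the sign flip (so that faster loss of connection strength gives a positive score) are assumptions.

```python
import math
from statistics import mean, stdev

def slope(times, values):
    """Gradient of the ordinary least-squares line through (times, values)."""
    t_bar, v_bar = mean(times), mean(values)
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den

def rate_of_atrophy(prehd_series, control_series, times):
    """Group-averaged transformed Z-score of the rate of change for one region.

    Each series holds one list per participant with the connection strength
    measured at each entry of `times`.
    """
    ctrl = [slope(times, s) for s in control_series]
    mu, sigma = mean(ctrl), stdev(ctrl)  # sample SD is an assumption
    transformed = [math.tanh(-(slope(times, s) - mu) / sigma)
                   for s in prehd_series]
    return mean(transformed)
```

For example, `rate_of_atrophy(prehd, controls, [0, 12, 24])` would use visits at 0, 12 and 24 months; the visit times here are illustrative only.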
Mapping Gene Expression Data to MRI Space
Gene expression microarray data was used from the Allen Human Brain Atlas (26). This atlas is based on data from 6 post-mortem human brains with no known neuropsychiatric or neuropathological history (H0351.2001, H0351.2002, H0351.1009, H0351.1012, H0351.1015, H0351.1016). Five donors were male and one was female, with a mean age of 42.5 years. Three were Caucasian, two were African-American and one was Hispanic. This data is freely available to download from AIBS (http://human.brain-map.org/static/download). Maybrain software (https://github.com/rittman/maybrain) was used to match centroids
of MRI regions to the closest AIBS region. The nearest gene expression profile to the ROI coordinates was used as the expression profile for that ROI. Therefore for each ROI only one tissue sample was used from the AIBS atlas. The sample coverage for the AIBS atlas varied
from 255-291 cortical samples for the 4 participants with data from one hemisphere. For the 2 participants with data from both hemispheres one had 412 samples and the other 528. Probes
were excluded that did not match to gene symbols in the AIBS data resulting in 20,737 genes included in the analysis. Expression data was then averaged across all samples from all donors. Data were also averaged across both hemispheres as two donors had data for both hemispheres, while four only had data for the left hemisphere. The maximum standard deviation across subjects for each gene probe in each brain region ranged from 0.1 to 4.6 (see Supplemental File 8). To account for this variability the mean and range of expression values for each brain region were calculated and regions excluded if they had values greater than
two standard deviations from either the mean or range. This resulted in the exclusion of two brain regions (right pars orbitalis and right rostral middle frontal), leaving a total of 68 cortical ROIs included in the analysis. Expression data were then normalised by calculating
the Z-score across the 68 FreeSurfer regions. Similar approaches as those outlined above have been used when matching AIBS data to MRI atlases in other studies (27-29). Genetic data from outlier regions is likely to be unreliable. While it is difficult to pinpoint the exact reason for outlier regions in these analyses, it may be that outlier regions represent sub-optimal matching between the AIBS and MRI atlases.
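The matching, exclusion and normalisation steps in this section can be illustrated with a short sketch. It does not reproduce Maybrain; all function names and data layouts are hypothetical. The outlier rule is one possible reading of the text (a region is flagged when its mean or its range of expression lies more than two standard deviations from the across-region average of that statistic), and the sample standard deviation is assumed throughout.

```python
import math
from statistics import mean, stdev

def nearest_sample(centroid, samples):
    """Return the id of the tissue sample closest to an ROI centroid.

    samples: dict mapping sample id -> (x, y, z) coordinates.
    """
    return min(samples, key=lambda s: math.dist(centroid, samples[s]))

def outlier_regions(expr, n_sd=2.0):
    """Flag regions whose mean or range of expression is an outlier.

    expr: dict mapping region -> list of expression values (one per gene).
    """
    flagged = set()
    for stat in (mean, lambda v: max(v) - min(v)):
        per_region = {r: stat(v) for r, v in expr.items()}
        values = list(per_region.values())
        mu, sd = mean(values), stdev(values)
        flagged |= {r for r, s in per_region.items() if abs(s - mu) > n_sd * sd}
    return flagged

def zscore_across_regions(values):
    """Z-score one gene's expression values across the retained regions."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]
```

Because each ROI takes the profile of a single nearest sample, neighbouring ROIs can share the same sample and therefore the same expression profile.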
To investigate how robust results were to different combinations of AIBS participants, we also performed cortico-striatal, inter-hemispheric and intra-hemispheric cross-sectional analyses using a leave-one-out approach. Average gene expression was calculated for 5 participants, leaving one participant out in turn. A leave-one-out approach has been used in a previous study investigating regional gene expression and functional connectivity using the Allen Institute for Brain Science human transcriptome atlas (28). We also repeated the cross-sectional analyses using permutations of 3 out of 6 AIBS brain samples, resulting in a total of 8 permutations.
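The donor resampling can be sketched as below. The function names and the dictionary layout are hypothetical. The text reports 8 subsets of 3 donors without stating how they were selected, so `donor_subsets` simply enumerates every possible subset, from which any selection could be drawn.

```python
from itertools import combinations
from statistics import mean

def donor_average(expr_by_donor, donors):
    """Average expression across the given donors, value by value.

    expr_by_donor: dict mapping donor id -> list of expression values,
    all in the same order.
    """
    return [mean(vals) for vals in zip(*(expr_by_donor[d] for d in donors))]

def leave_one_out(expr_by_donor):
    """Map each left-out donor to the average over the remaining donors."""
    donors = sorted(expr_by_donor)
    return {
        left_out: donor_average(expr_by_donor,
                                [d for d in donors if d != left_out])
        for left_out in donors
    }

def donor_subsets(donors, size=3):
    """All subsets of the given size; the study used 8 such subsets."""
    return list(combinations(sorted(donors), size))
```

Each averaged profile would then be passed through the same downstream analysis as the full six-donor average.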
Statistical Analysis
All statistical analysis was performed in MATLAB v8.3. Partial least squares regression was used to investigate the association between gene transcriptome of the healthy brain and WM connectivity loss in preHD both cross-sectionally and longitudinally. Code used to perform this analysis was adapted from Whitaker et al. (29). The original code is freely available (https://github.com/KirstieJane/NSPN_WhitakerVertes_PNAS2016). Partial least squares regression is a multivariate technique used to identify associations between response and predictor variables. In our case the predictor variable was
a 20,737 gene x 68 ROI matrix, as outlined above. For the cortico-striatal analysis the MRI data response variable was a 4 x 68 matrix of left and right caudate and putamen WM connectivity loss (preHD relative to controls) to 68 cortical ROIs. This was performed for both white matter atrophy (cross-sectional) and rate of white matter atrophy (longitudinal). For the inter-hemispheric analysis the MRI response variable was a vector of 1 x 68, representing WM inter-hemispheric connectivity loss for each cortical ROI. Similarly, for the intra-hemispheric analysis the MRI response variable was a vector of 1 x 68, representing WM intra-hemispheric connectivity loss for each cortical ROI. For a cortical region, inter-hemispheric connectivity was calculated as the sum of streamline volumes between that region and regions in the opposite hemisphere. Similarly, intra-hemispheric connectivity was defined as the sum of streamline volumes between that region and regions in the same hemisphere. Atrophy scores were then calculated using Z-scores and the tanh transform as described above.
As the greatest amount of variance was explained by the first PLS component, genes were ranked based on their contribution to this component. The error in estimating the weight of each gene was assessed by bootstrapping, and the ratio of the weight of each gene to its bootstrap standard deviation was used to rank the genes in descending order based on their contribution to the first component.
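The bootstrap ranking can be illustrated for the single-response case (the inter- or intra-hemispheric analysis). This sketch is not the MATLAB code adapted from Whitaker et al.: it uses the fact that, for a single centred response, the first PLS weight vector is proportional to X^T y, and it does not cover the four-response cortico-striatal case. The function names, the number of bootstrap samples and the choice to resample ROIs with replacement are assumptions.

```python
import math
import random
from statistics import mean, stdev

def pls1_weights(X, y):
    """First PLS weight vector (unit length) for a single response.

    X: list of observations (ROIs), each a list of predictor values (genes).
    y: one response value per observation.
    """
    n, p = len(X), len(X[0])
    col_means = [mean(row[j] for row in X) for j in range(p)]
    y_bar = mean(y)
    w = [sum((X[i][j] - col_means[j]) * (y[i] - y_bar) for i in range(n))
         for j in range(p)]
    # A degenerate resample (constant y) gives a zero vector; avoid dividing by 0.
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    return [v / norm for v in w]

def bootstrap_ranking(X, y, n_boot=200, seed=0):
    """Rank genes by weight divided by bootstrap standard deviation."""
    rng = random.Random(seed)
    w = pls1_weights(X, y)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(X)) for _ in X]  # resample ROIs with replacement
        boots.append(pls1_weights([X[i] for i in idx], [y[i] for i in idx]))
    ratio = [w[j] / stdev(b[j] for b in boots) for j in range(len(w))]
    order = sorted(range(len(w)), key=lambda j: ratio[j], reverse=True)
    return order, ratio
```

Genes at the top of `order` have large positive, stable weights; genes at the bottom have large negative, stable weights.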
Random permutations of the gene predictor variable were also investigated to ensure results were not due to chance. To do this the randperm function in MATLAB was used to randomly reorder the predictor variable both in terms of genes and ROIs. Cross-sectional analyses were then re-run using the resulting predictor variables.
Partial least squares regression (PLS) is well suited for high dimensional data as it combines principal components analysis (for dimension reduction) with linear regression. It is also well suited to the case where the number of predictor variables far exceeds the number
of observations, which is exactly the scenario we are dealing with, having 20,737 gene expression profiles (predictor variables) and 68 brain regions (observations). In comparison, other multivariate methods such as canonical variate analysis (CVA) or linear discriminant analysis (LDA) require around 4-8 times more observations than predictor variables. Boulesteix et al. (30) have previously shown the utility of this approach in high dimensional datasets, e.g. for tumor classification from transcriptome data, identification of relevant genes, survival analysis and modeling of gene networks and transcription factor activities. Several previous studies have used PLS for the large gene expression datasets from the Allen Institute for Brain Science (AIBS) mouse and human brain transcriptome atlases (28, 29, 31).
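The random reordering of the predictor variable described earlier in this section can be sketched as follows. This is a Python analogue of drawing one random ordering for genes and one for ROIs, not the MATLAB code itself; the function name `shuffle_predictor` and the list-of-lists layout are hypothetical.

```python
import random

def shuffle_predictor(matrix, seed=None):
    """Return a copy of a gene x ROI matrix with rows and columns reordered.

    One random ordering is drawn for the rows (genes) and one for the
    columns (ROIs); the input matrix is left unchanged.
    """
    rng = random.Random(seed)
    row_order = rng.sample(range(len(matrix)), len(matrix))
    col_order = rng.sample(range(len(matrix[0])), len(matrix[0]))
    return [[matrix[r][c] for c in col_order] for r in row_order]
```

Reordering the ROIs breaks the correspondence between each region's expression profile and its connectivity loss, which is what makes the re-run analysis a null comparison.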
Gene Ontology Enrichment Analysis
We used the gene ontology enrichment analysis and visualisation tool (GOrilla) (http://cbl-gorilla.cs.technion.ac.il) (32) to identify GO terms that were significantly enriched in the
target gene list, based on the first PLS component. GOrilla GO terms are updated weekly. The target gene list is defined by finding the optimal hypergeometric tail probability over all possible partitions induced by gene ranking (see (32) for further details). Significance of a
GO term is determined based on the rank of genes associated with that GO term and a false discovery rate (FDR) correction for multiple comparisons. This was performed for the first
PLS component for the cortico-striatal, inter-hemispheric and intra-hemispheric analysis both cross-sectionally and longitudinally. We also removed general GO terms by excluding those with greater than 1000 genes in their classification, in keeping with other studies in the literature (28, 29). This allowed us to focus on specific gene sets as opposed to GO terms encompassing thousands of genes covering a range of processes. The reduce and visualize gene ontology tool REViGO (33) (http://revigo.irb.hr) was then used to summarise significant GO terms by removing redundant terms.
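Two of the simpler steps above can be written out directly: the enrichment ratio reported alongside each GO term, defined in the Table S1 legend as (b/n) / (B/total number of genes), and the exclusion of GO terms annotated with more than 1000 genes. The function names and the dictionary layout are hypothetical, and the sketch does not implement GOrilla's hypergeometric tail statistic.

```python
def enrichment(b, n, B, N):
    """Enrichment E = (b / n) / (B / N).

    b: genes in the intersection, n: genes in the target set,
    B: genes associated with the GO term, N: total number of genes.
    """
    return (b / n) / (B / N)

def drop_general_terms(term_sizes, max_genes=1000):
    """Keep only GO terms with at most max_genes associated genes.

    term_sizes: dict mapping GO id -> number of associated genes (B).
    """
    return {go: size for go, size in term_sizes.items() if size <= max_genes}
```

An enrichment above 1 means the term's genes occur in the target set more often than their overall frequency would suggest.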
Overlap Between Gene Profiles and Huntington’s Disease Related Genes
To investigate similarities between gene profiles in each analysis we identified which genes overlap in the top ranked 7,000 genes (based on target gene lists from top GO terms) from the cross-sectional cortico-striatal analysis and the intra-hemispheric analysis. We also assessed whether this overlap was greater than expected by chance using a hypergeometric distribution, as implemented in https://github.com/brentp/bio-playground/blob/master/utils/list_overlap_p.py. Gene ontology enrichment analysis was also repeated with overlap genes removed to assess whether this affected the resulting GO terms.
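The hypergeometric overlap test can be written in a few lines. This is a sketch of the general technique and is not claimed to match the referenced script line for line; the function name and argument names are hypothetical. It gives the probability of observing at least the given overlap when one list is drawn at random from the full gene set.

```python
from math import comb

def overlap_p_value(total, size_a, size_b, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    total: number of genes in the background set,
    size_a, size_b: sizes of the two gene lists,
    overlap: number of genes shared by the two lists.
    """
    upper = min(size_a, size_b)
    tail = sum(comb(size_a, k) * comb(total - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    return tail / comb(total, size_b)
```

With two lists of 7,000 genes drawn from 20,737, the call would be `overlap_p_value(20737, 7000, 7000, observed_overlap)`, where `observed_overlap` is the shared gene count; exact integer arithmetic keeps the result stable, although very small probabilities may underflow to 0.0.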
To further assess the relationship between gene ontologies we investigated the overlap between genes in the top gene ontology terms across analyses: “modulation of chemical synaptic transmission” and “mRNA metabolic process”. Finally we investigated the overlap between top gene ontology terms and HD related genes. Gene lists for HD related genes for
both the striatum and cortex were obtained from (34).
Cortical Regional Enrichment
We used ROI weights from the PLS analysis to assess which cortical regions were enriched for genes in the first PLS component for the cortico-striatal, inter-hemispheric and intra-hemispheric analyses. ROI weights were plotted for each analysis using BrainNet Viewer (35).
Enrichment for Huntington’s Disease Related Genes
We also investigated whether genes showing abnormal transcription in human and animal models of HD were enriched more than expected by chance in the first PLS components of the cortico-striatal, inter-hemispheric and intra-hemispheric analyses. Gene lists were obtained from (34). These included 515 genes in the striatum and 25 in the cortex.
Gene lists for the striatum include the 6-month allelic series striatum data from Langfelder et al. (34) and the human caudate nucleus (CN) data sets reported by Durrenberger et al. (36) and Hodges et al. (37). Each striatal gene satisfies the following criteria: FDR<0.05 in the allelic series striatum, FDR<0.1 in each of the human data sets, and same sign of fold change across all 3 data sets. For the cortex the gene lists include the allelic series 6-month cortex, Brodmann area (BA) 4 and BA9 data reported by Hodges et al. (37), and prefrontal cortex (PFC) and visual cortex (VC) data from the Harvard Brain Tissue Resource Centre (38). Each cortical gene satisfies the following criteria: FDR<0.05 in the allelic series cortex, FDR<0.1 in at least 3 of the 4 human data sets, and same sign of fold change in the allelic series cortex and at least 3 of the 4 human data sets. Genes in the Langfelder lists not included in the AIBS gene set were excluded; this resulted in the exclusion of 28 striatum genes.
The mean PLS weight of each candidate gene set was compared against the mean PLS weight of 1000 random permutations of genes. A p-value was calculated based on the number of times in 1000 that the random gene list showed a higher mean rank than the candidate gene list. We also investigated whether HD related genes were more strongly enriched in these gene lists than other biologically plausible gene sets, chosen at random. In order to do this, gene sets from known gene ontologies were downloaded from the Molecular Signatures Database (MSigDB) (http://software.broadinstitute.org/gsea/msigdb/). A p-value was calculated based on the number of times that the MSigDB gene list showed a higher mean rank than the candidate gene list. This was performed for the 515 striatum HD genes and MSigDB gene lists truncated at 515 (306 lists in total). In order to investigate smaller alternative gene sets, the top 25 striatum HD genes were also compared with MSigDB gene lists truncated at 25 (3,633 lists in total).
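The permutation comparison can be sketched as below. The function name, the dictionary layout and the fixed random seed are hypothetical, and the sketch compares mean PLS weights directly; applying it to ranks instead would only require passing ranks in place of weights.

```python
import random
from statistics import mean

def gene_set_p_value(weights, gene_set, n_perm=1000, seed=0):
    """Fraction of random gene sets whose mean weight exceeds the observed mean.

    weights: dict mapping gene -> PLS weight (or rank),
    gene_set: candidate genes, all present in `weights`.
    Random sets have the same size as the candidate set.
    """
    rng = random.Random(seed)
    genes = list(weights)
    observed = mean(weights[g] for g in gene_set)
    higher = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(gene_set))
        if mean(weights[g] for g in sample) > observed:
            higher += 1
    return higher / n_perm
```

The comparison against MSigDB lists follows the same counting logic, with each downloaded list taking the place of a random draw.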
To further investigate the relationship between changes in gene expression in HD relative to controls and cortico-striatal WM loss, we performed correlations between the log2 fold change in the Hodges (37), Durrenberger (36) and Langfelder (34) studies for the 515 striatum gene set and the PLS weights from the cross-sectional cortico-striatal analysis.
Enrichment for Alternative Gene Sets
Enrichment of the PLS components of the cortico-striatal, inter-hemispheric and intra-hemispheric analyses was also tested for a range of other gene sets. We included a set of human supragranular genes (n = 19) as these have been implicated in long-range connectivity (39) and we have previously shown cortico-striatal connections to have the longest topological length of the white matter connection subtypes investigated here (10). Genes specific to oligodendrocytes (n = 94) (40) were also included to investigate whether white matter loss may be driven by axonal or myelination dysfunction. Finally, genes involved in cell cycle metabolism (n = 252) (http://www.bmrb.wisc.edu/data_library/Genes/Metabolic_Pathways/Cell_cycle.html) were included as mutant huntingtin has been shown to cause cell cycle abnormalities (41).
Supplemental Figures
Figure S1. Cortico-striatal longitudinal analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are labelled for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar).
Figure S2. Inter-hemispheric longitudinal analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are labelled for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar).
Figure S3. Intra-hemispheric longitudinal analysis semantic similarity scatter plot: Significant gene ontology (GO) terms for biological processes associated with the first component of the partial least squares (PLS) analysis are plotted in semantic space, where similar terms are clustered together. The top 5 most significant GO terms are labelled for each analysis. Redundant GO terms and those associated with greater than 1000 genes have been excluded. Markers are scaled based on the log10 q-value for the significance of each GO term. Large blue circles are highly significant, while red circles are less significant (see colour bar).
Figure S4. Enrichment of top 25 striatum genes showing abnormal transcription in Huntington’s disease (as defined by lowest Hodges q-value) in the first PLS components of cortico-striatal cross-sectional analyses. Red circle illustrates the mean weight (on the x-axis) for the gene list of interest in the first PLS component. The y-axis represents the number of permutations of random genes from the first PLS component. Gene lists over-expressed in the first PLS component have a mean greater than that of the random permutations (red circle to the right of the permutation distribution).
Figure S5. Random permutation ROI weights for cross-sectional partial least squares regression analyses. (a) Cortico-striatal (b) Inter-hemispheric (c) Intra-hemispheric. Brain regions displayed on brain mesh. Size and colour of region indicates size of ROI weight (ranked from smallest-largest, 1-6). See colour map.
Figure S6. Correlation between PLS1 cortico-striatal weights and log2 fold change in human HD (Hodges and Durrenberger) and animal HD model (Langfelder) studies. The red line represents a least squares regression line, rho = correlation coefficient, p = p-value and df = degrees of freedom.
Table S1. Cortico-striatal, inter-hemispheric and intra-hemispheric longitudinal analysis: Gene ontology (GO) terms for biological processes associated with top ranking genes from the first component of the partial least squares (PLS) analysis. The top 5 most significant GO terms are displayed for each analysis. Full tables can be found in supplementary file 2. Redundant GO terms and those associated with greater than 1000 genes have been excluded. B – total number of genes associated with a specific GO term, n – number of genes in target set, b – number of genes in the intersection. Enrichment (E) = (b/n) / (B/total number of genes). See (32) for further details.

PLS1 Cortico-striatal Longitudinal
GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b
GO:0016071 | mRNA metabolic process | 3.51E-33 | 1.35E-30 | 1.81 | 593 | 5324 | 322
GO:0006396 | RNA processing | 7.90E-29 | 2.77E-26 | 1.64 | 806 | 5324 | 397
GO:0019083 | viral transcription | 3.28E-28 | 1.12E-25 | 2.86 | 99 | 5251 | 84
GO:0006325 | chromatin organization | 9.85E-26 | 3.22E-23 | 1.77 | 657 | 4520 | 296
GO:0000184 | nuclear-transcribed mRNA catabolic process, nonsense-mediated decay | 2.16E-19 | 6.78E-17 | 2.53 | 102 | 5298 | 77

PLS1 Inter-hemispheric Longitudinal
GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b
GO:0016071 | mRNA metabolic process | 3.98E-16 | 1.67E-13 | 1.48 | 593 | 6539 | 323
GO:0006325 | chromatin organization | 4.09E-14 | 1.47E-11 | 1.4 | 657 | 6476 | 337
GO:0006396 | RNA processing | 7.78E-14 | 2.66E-11 | 1.36 | 806 | 6410 | 397
GO:0006397 | mRNA processing | 6.16E-12 | 2.02E-09 | 1.49 | 402 | 6323 | 213
GO:0016569 | covalent chromatin modification | 4.62E-09 | 1.36E-06 | 1.38 | 455 | 6476 | 230

PLS1 Intra-hemispheric Longitudinal
GO Term | Description | P-value | FDR q-value | Enrichment | B | n | b
GO:0022904 | respiratory electron transport chain | 2.29E-13 | 1.15E-09 | 2.71 | 92 | 3917 | 55
GO:0050804 | modulation of chemical synaptic transmission | 2.95E-11 | 6.35E-08 | 1.84 | 297 | 3658 | 113
GO:0006091 | generation of precursor metabolites and energy | 1.42E-10 | 2.38E-07 | 1.64 | 263 | 5414 | 132
GO:0009117 | nucleotide metabolic process | 4.97E-10 | 7.48E-07 | 1.88 | 418 | 2364 | 105
GO:0070271 | protein complex biogenesis | 9.42E-10 | 1.01E-06 | 2.46 | 81 | 4186 | 47
Table S2. ROI weights from first PLS components. BG – basal ganglia, IH – Inter-hemispheric, IA – Intra-hemispheric, cross – cross-sectional, long – longitudinal. Weights ordered for basal ganglia cross-sectional analysis, decreasing strongest to weakest.

Region | CS cross | IH cross | IA cross | CS long | IH long | IA long
R.inferiorparietal | 0.22034 | 0.028159 | -0.077541 | -0.11582 | -0.068989 | 0.15782
R.precentral | 0.20856 | 0.047965 | -0.088025 | -0.12867 | -0.059647 | 0.17176
R.superiorparietal | 0.20856 | 0.047965 | -0.088025 | -0.12867 | -0.059647 | 0.17176
L.cuneus | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.inferiorparietal | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.isthmuscingulate | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.lateraloccipital | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.paracentral | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.pericalcarine | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.posteriorcingulate | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.precuneus | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.superiorparietal | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
L.supramarginal | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.isthmuscingulate | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.paracentral | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.posteriorcingulate | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.precuneus | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.supramarginal | 0.15725 | 0.10562 | -0.12229 | -0.12238 | -0.10801 | 0.13927
R.caudalmiddlefrontal | 0.14542 | 0.10192 | -0.093706 | -0.090327 | -0.16404 | 0.11031
R.postcentral | 0.14542 | 0.10192 | -0.093706 | -0.090327 | -0.16404 | 0.11031
R.cuneus | 0.12658 | 0.097045 | -0.10093 | -0.1211 | -0.078746 | 0.10593
L.postcentral | 0.11425 | 0.099365 | -0.10889 | -0.12017 | -0.094212 | 0.1137
L.caudalmiddlefrontal | 0.054748 | 0.11976 | -0.11586 | -0.10097 | -0.12061 | 0.097573
L.transversetemporal | 0.030688 | 0.11459 | -0.10071 | -0.10298 | -0.055668 | 0.067514
R.caudalanteriorcingulate | 0.030688 | 0.11459 | -0.10071 | -0.10298 | -0.055668 | 0.067514
L.precentral | 0.0041809 | 0.1269 | -0.10137 | -0.089321 | -0.092781 | 0.062488
R.superiorfrontal | 0.0041809 | 0.1269 | -0.10137 | -0.089321 | -0.092781 | 0.062488
L.caudalanteriorcingulate | 0.00108 | 0.11641 | -0.10077 | -0.099388 | -0.054861 | 0.055352
L.parsopercularis | 0.00108 | 0.11641 | -0.10077 | -0.099388 | -0.054861 | 0.055352
L.superiortemporal | 0.00108 | 0.11641 | -0.10077 | -0.099388 | -0.054861 | 0.055352
L.insula | 0.00108 | 0.11641 | -0.10077 | -0.099388 | -0.054861 | 0.055352
L.parsorbitalis | -0.013671 | 0.13099 | -0.10188 | -0.083891 | -0.11266 | 0.051551
L.parstriangularis | -0.013671 | 0.13099 | -0.10188 | -0.083891 | -0.11266 | 0.051551
L.rostralmiddlefrontal | -0.026883 | 0.1453 | -0.118 | -0.095567 | -0.12 | 0.061948
L.superiorfrontal | -0.026883 | 0.1453 | -0.118 | -0.095567 | -0.12 | 0.061948
L.frontalpole | -0.026883 | 0.1453 | -0.118 | -0.095567 | -0.12 | 0.061948
R.frontalpole | -0.026883 | 0.1453 | -0.118 | -0.095567 | -0.12 | 0.061948
L.entorhinal | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
L.medialorbitofrontal | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
L.temporalpole | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
R.entorhinal | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
R.lateralorbitofrontal | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
R.medialorbitofrontal | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
R.temporalpole | -0.077019 | -0.16519 | 0.12322 | 0.10631 | 0.20821 | -0.10852
L.fusiform | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.inferiortemporal | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.lateralorbitofrontal | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.lingual | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.middletemporal | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.parahippocampal | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.rostralanteriorcingulate | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
L.hippocampus | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
R.hippocampus | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
R.parahippocampal | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
R.rostralanteriorcingulate | -0.11678 | -0.11347 | 0.12528 | 0.14275 | 0.07509 | -0.13103
R.fusiform | -0.12353 | -0.094844 | 0.14257 | 0.16319 | -0.0012139 | -0.15301
R.lateraloccipital | -0.12353 | -0.094844 | 0.14257 | 0.16319 | -0.0012139 | -0.15301
R.lingual | -0.12353 | -0.094844 | 0.14257 | 0.16319 | -0.0012139 | -0.15301
R.pericalcarine | -0.12353 | -0.094844 | 0.14257 | 0.16319 | -0.0012139 | -0.15301
R.transversetemporal | -0.12353 | -0.094844 | 0.14257 | 0.16319 | -0.0012139 | -0.15301
L.bankssts | -0.12417 | -0.11555 | 0.12622 | 0.12119 | 0.089642 | -0.11748
R.parstriangularis | -0.12699 | -0.1361 | 0.096136 | 0.094345 | 0.14245 | -0.058084
R.bankssts | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
R.inferiortemporal | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
R.middletemporal | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
R.parsopercularis | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
R.superiortemporal | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
R.insula | -0.13821 | -0.14829 | 0.15138 | 0.11965 | 0.19037 | -0.13649
Supplemental References Tabrizi SJ, Langbehn DR, Leavitt BR, Roos RA, Durr A, Craufurd D, et al. (2009): Biological and clinical manifestations of Huntington's disease in the longitudinal TRACK-HD study: cross-sectional analysis of baseline data. Lancet Neurol. 8:791-801.
2.
Penney JB, Jr., Vonsattel JP, MacDonald ME, Gusella JF, Myers RH (1997): CAG repeat number governs the development rate of pathology in Huntington's disease. Ann Neurol. 41:689-692.
3.
Desikan RS, Segonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, et al. (2006): An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 31:968-980.
4.
McColgan P, Seunarine KK, Razi A, Cole JH, Gregory S, Durr A, et al. (2015): Selective vulnerability of Rich Club brain regions is an organizational principle of structural connectivity loss in Huntington's disease. Brain. 138:3327-3344.
5.
Tabrizi SJ, Scahill RI, Durr A, Roos RA, Leavitt BR, Jones R, et al. (2011): Biological and clinical changes in premanifest and early stage Huntington's disease in the TRACKHD study: the 12-month longitudinal analysis. Lancet Neurol. 10:31-42.
6.
Tabrizi SJ, Reilmann R, Roos RA, Durr A, Leavitt B, Owen G, et al. (2012): Potential endpoints for clinical trials in premanifest and early Huntington's disease in the TRACKHD study: analysis of 24 month observational data. Lancet Neurol. 11:42-53.
7.
Faria AV, Ratnanather JT, Tward DJ, Lee DS, van den Noort F, Wu D, et al. (2016): Linking white matter and deep gray matter alterations in premanifest Huntington disease. Neuroimage Clin. 11:450-460.
8.
van den Bogaard SJ, Dumas EM, Acharya TP, Johnson H, Langbehn DR, Scahill RI, et al. (2011): Early atrophy of pallidum and accumbens nucleus in Huntington's disease. J Neurol. 258:412-420.
9.
Hibar DP, Stein JL, Renteria ME, Arias-Vasquez A, Desrivieres S, Jahanshad N, et al. (2015): Common genetic variants influence human subcortical brain structures. Nature. 520:224-229.
EP
TE D
M AN U
SC
RI PT
1.
10. McColgan P, Seunarine KK, Gregory S, Razi A, Papoutsi M, Long JD, et al. (2017): Topological length of white matter connections predicts their rate of atrophy in premanifest Huntington's disease. JCI Insight. 2.
AC C
11. McColgan P, Razi A, Gregory S, Seunarine KK, Durr A, R ACR, et al. (2017): Structural and functional brain network correlates of depressive symptoms in premanifest Huntington's disease. Hum Brain Mapp. 38:2819-2829. 12. McColgan P, Gregory S, Razi A, Seunarine KK, Gargouri F, Durr A, et al. (2017): White matter predicts functional connectivity in premanifest Huntington's disease. Ann Clin Transl Neurol. 4:106-118. 13. Cammoun L, Gigandet X, Meskaldji D, Thiran JP, Sporns O, Do KQ, et al. (2012): Mapping the human connectome at multiple scales with diffusion spectrum MRI. J Neurosci Methods. 203:386-397. 14. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, et al. (2002): Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage. 15:273-289.
23
ACCEPTED MANUSCRIPT McColgan et al.
Supplement
15. Smith SM (2002): Fast robust automated brain extraction. Hum Brain Mapp. 17:143-155. 16. Tournier JD, Calamante F, Connelly A (2012): MRtrix: Diffusion tractography in crossing fiber regions. Imaging Systems and Technology. 22:53-56.
RI PT
17. Modat M, Ridgway GR, Taylor ZA, Lehmann M, Barnes J, Hawkes DJ, et al. (2010): Fast free-form deformation using graphics processing units. Comput Methods Programs Biomed. 98:278-284. 18. Smith RE, Tournier JD, Calamante F, Connelly A (2015): SIFT2: Enabling dense quantitative assessment of brain white matter connectivity using streamlines tractography. Neuroimage. 119:338-351. 19. van den Heuvel MP, Sporns O (2011): Rich-club organization of the human connectome. J Neurosci. 31:15775-15786.
SC
20. van den Heuvel MP, Kahn RS, Goni J, Sporns O (2012): High-cost, high-capacity backbone for global brain communication. Proc Natl Acad Sci U S A. 109:11372-11377.
M AN U
21. Qi S, Meesters S, Nicolay K, Romeny BM, Ossenblok P (2015): The influence of construction methodology on structural brain network measures: A review. J Neurosci Methods. 253:170-182. 22. Garrison KA, Scheinost D, Finn ES, Shen X, Constable RT (2015): The (in)stability of functional brain network measures across thresholds. Neuroimage. 23. Yeh CH, Smith RE, Liang X, Calamante F, Connelly A (2016): Correction for diffusion MRI fibre tracking biases: The consequences for structural connectomic metrics. Neuroimage.
TE D
24. Zalesky A, Fornito A (2009): A DTI-derived measure of cortico-cortical connectivity. IEEE Trans Med Imaging. 28:1023-1036. 25. Cole JH, Farmer RE, Rees EM, Johnson HJ, Frost C, Scahill RI, et al. (2014): TestRetest Reliability of Diffusion Tensor Imaging in Huntington's Disease. PLoS Curr. 6. 26. Hawrylycz M, Miller JA, Menon V, Feng D, Dolbeare T, Guillozet-Bongaarts AL, et al. (2015): Canonical genetic signatures of the adult human brain. Nat Neurosci. 18:18321844.
AC C
EP
27. Rittman T, Rubinov M, Vertes PE, Patel AX, Ginestet CE, Ghosh BC, et al. (2016): Regional expression of the MAPT gene is associated with loss of hubs in brain networks and cognitive impairment in Parkinson disease and progressive supranuclear palsy. Neurobiol Aging. 48:153-160. 28. Vertes PE, Rittman T, Whitaker KJ, Romero-Garcia R, Vasa F, Kitzbichler MG, et al. (2016): Gene transcription profiles associated with inter-modular hubs and connection distance in human functional magnetic resonance imaging networks. Philos Trans R Soc Lond B Biol Sci. 371. 29. Whitaker KJ, Vertes PE, Romero-Garcia R, Vasa F, Moutoussis M, Prabhu G, et al. (2016): Adolescence is associated with genomically patterned consolidation of the hubs of the human brain connectome. Proc Natl Acad Sci U S A. 113:9105-9110. 30. Boulesteix AL, Strimmer K (2007): Partial least squares: a versatile tool for the analysis of high-dimensional genomic data. Brief Bioinform. 8:32-44.
31. Rubinov M, Ypma RJ, Watson C, Bullmore ET (2015): Wiring cost and topological participation of the mouse brain connectome. Proc Natl Acad Sci U S A. 112:10032-10037.
32. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z (2009): GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 10:48.
33. Supek F, Bosnjak M, Skunca N, Smuc T (2011): REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS One. 6:e21800.
34. Langfelder P, Cantle JP, Chatzopoulou D, Wang N, Gao F, Al-Ramahi I, et al. (2016): Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice. Nat Neurosci. 19:623-633.
35. Xia M, Wang J, He Y (2013): BrainNet Viewer: a network visualization tool for human brain connectomics. PLoS One. 8:e68910.
36. Durrenberger PF, Fernando FS, Kashefi SN, Bonnert TP, Seilhean D, Nait-Oumesmar B, et al. (2015): Common mechanisms in neurodegeneration and neuroinflammation: a BrainNet Europe gene expression microarray study. J Neural Transm (Vienna). 122:1055-1068.
37. Hodges A, Strand AD, Aragaki AK, Kuhn A, Sengstag T, Hughes G, et al. (2006): Regional and cellular gene expression changes in human Huntington's disease brain. Hum Mol Genet. 15:965-977.
38. Zhang B, Gaiteri C, Bodea LG, Wang Z, McElwee J, Podtelezhnikov AA, et al. (2013): Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer's disease. Cell. 153:707-720.
39. Krienen FM, Yeo BT, Ge T, Buckner RL, Sherwood CC (2016): Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain. Proc Natl Acad Sci U S A. 113:E469-478.
40. Cahoy JD, Emery B, Kaushal A, Foo LC, Zamanian JL, Christopherson KS, et al. (2008): A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J Neurosci. 28:264-278.
41. Molina-Calavita M, Barnat M, Elias S, Aparicio E, Piel M, Humbert S (2014): Mutant huntingtin affects cortical progenitor cell division and development of the mouse neocortex. J Neurosci. 34:10034-10040.